Applied and Numerical Harmonic Analysis
Stephan Dahlke, Filippo De Mari, Philipp Grohs, Demetrio Labate, Editors
Harmonic and Applied Analysis From Groups to Signals
Applied and Numerical Harmonic Analysis Series Editor John J. Benedetto University of Maryland College Park, MD, USA
Editorial Advisory Board Akram Aldroubi Vanderbilt University Nashville, TN, USA
Gitta Kutyniok Technische Universität Berlin Berlin, Germany
Douglas Cochran Arizona State University Phoenix, AZ, USA
Mauro Maggioni Duke University Durham, NC, USA
Hans G. Feichtinger University of Vienna Vienna, Austria
Zuowei Shen National University of Singapore Singapore, Singapore
Christopher Heil Georgia Institute of Technology Atlanta, GA, USA
Thomas Strohmer University of California Davis, CA, USA
Stéphane Jaffard University of Paris XII Paris, France
Yang Wang Michigan State University East Lansing, MI, USA
Jelena Kovačević Carnegie Mellon University Pittsburgh, PA, USA
More information about this series at http://www.springer.com/series/4968
Stephan Dahlke • Filippo De Mari • Philipp Grohs • Demetrio Labate Editors
Harmonic and Applied Analysis From Groups to Signals
Editors Stephan Dahlke Mathematics and Computer Sciences Philipps-Universität Marburg Marburg, Hessen, Germany
Filippo De Mari Department of Mathematics Università di Genova Genova, Italy
Philipp Grohs Applied Mathematics ETH Zürich Zürich, Switzerland
Demetrio Labate Department of Mathematics University of Houston Houston, TX, USA
ISSN 2296-5009 ISSN 2296-5017 (electronic) Applied and Numerical Harmonic Analysis ISBN 978-3-319-18862-1 ISBN 978-3-319-18863-8 (eBook) DOI 10.1007/978-3-319-18863-8 Library of Congress Control Number: 2015945944 Mathematics Subject Classification (2010): 22D10, 22E30, 42B35, 42C15, 42C40, 44A15, 46E35, 65T60 Springer Cham Heidelberg New York Dordrecht London © Springer International Publishing Switzerland 2015 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. Printed on acid-free paper Springer International Publishing AG Switzerland is part of Springer Science+Business Media (www. springer.com)
ANHA Series Preface
The Applied and Numerical Harmonic Analysis (ANHA) book series aims to provide the engineering, mathematical, and scientific communities with significant developments in harmonic analysis, ranging from abstract harmonic analysis to basic applications. The title of the series reflects the importance of applications and numerical implementation, but richness and relevance of applications and implementation depend fundamentally on the structure and depth of theoretical underpinnings. Thus, from our point of view, the interleaving of theory and applications and their creative symbiotic evolution is axiomatic.

Harmonic analysis is a wellspring of ideas and applicability that has flourished, developed, and deepened over time within many disciplines and by means of creative cross-fertilization with diverse areas. The intricate and fundamental relationship between harmonic analysis and fields such as signal processing, partial differential equations (PDEs), and image processing is reflected in our state-of-the-art ANHA series.

Our vision of modern harmonic analysis includes mathematical areas such as wavelet theory, Banach algebras, classical Fourier analysis, time-frequency analysis, and fractal geometry, as well as the diverse topics that impinge on them. For example, wavelet theory can be considered an appropriate tool to deal with some basic problems in digital signal processing, speech and image processing, geophysics, pattern recognition, biomedical engineering, and turbulence. These areas implement the latest technology from sampling methods on surfaces to fast algorithms and computer vision methods. The underlying mathematics of wavelet theory depends not only on classical Fourier analysis, but also on ideas from abstract harmonic analysis, including von Neumann algebras and the affine group. This leads to a study of the Heisenberg group and its relationship to Gabor systems, and of the metaplectic group for a meaningful interaction of signal decomposition methods.

The unifying influence of wavelet theory in the aforementioned topics illustrates the justification for providing a means for centralizing and disseminating information from the broader, but still focused, area of harmonic analysis. This will be a key role of ANHA. We intend to publish with the scope and interaction that such a host of issues demands.
Along with our commitment to publish mathematically significant works at the frontiers of harmonic analysis, we have a comparably strong commitment to publish major advances in the following applicable topics in which harmonic analysis plays a substantial role:

Antenna theory • Biomedical signal processing • Digital signal processing • Fast algorithms • Gabor theory and applications • Image processing • Numerical partial differential equations • Prediction theory • Radar applications • Sampling theory • Spectral estimation • Speech processing • Time-frequency and time-scale analysis • Wavelet theory

The above point of view for the ANHA book series is inspired by the history of Fourier analysis itself, whose tentacles reach into so many fields. In the last two centuries Fourier analysis has had a major impact on the development of mathematics, on the understanding of many engineering and scientific phenomena, and on the solution of some of the most important problems in mathematics and the sciences. Historically, Fourier series were developed in the analysis of some of the classical PDEs of mathematical physics; these series were used to solve such equations. In order to understand Fourier series and the kinds of solutions they could represent, some of the most basic notions of analysis were defined, e.g., the concept of "function." Since the coefficients of Fourier series are integrals, it is no surprise that Riemann integrals were conceived to deal with uniqueness properties of trigonometric series. Cantor's set theory was also developed because of such uniqueness questions.

A basic problem in Fourier analysis is to show how complicated phenomena, such as sound waves, can be described in terms of elementary harmonics. There are two aspects of this problem: first, to find, or even define properly, the harmonics or spectrum of a given phenomenon, e.g., the spectroscopy problem in optics; second, to determine which phenomena can be constructed from given classes of harmonics, as done, for example, by the mechanical synthesizers in tidal analysis.

Fourier analysis is also the natural setting for many other problems in engineering, mathematics, and the sciences. For example, Wiener's Tauberian theorem in Fourier analysis not only characterizes the behavior of the prime numbers, but also provides the proper notion of spectrum for phenomena such as white light; this latter process leads to the Fourier analysis associated with correlation functions in filtering and prediction problems, and these problems, in turn, deal naturally with Hardy spaces in the theory of complex variables.

Nowadays, some of the theory of PDEs has given way to the study of Fourier integral operators. Problems in antenna theory are studied in terms of unimodular trigonometric polynomials. Applications of Fourier analysis abound in signal processing, whether with the fast Fourier transform (FFT), or filter design, or the adaptive modeling inherent in time-frequency-scale methods such as wavelet theory. The coherent states of mathematical physics are translated and modulated Fourier
transforms, and these are used, in conjunction with the uncertainty principle, for dealing with signal reconstruction in communications theory. We are back to the raison d'être of the ANHA series!

University of Maryland
College Park
John J. Benedetto Series Editor
Acknowledgements
The idea of this book emerged during the Workshop on Applied Harmonic Analysis that was held in Genova in September 2013. We wish to thank Stefano Vigogna, who helped organize and run the workshop, and all the institutions that supported it: the German Academic Exchange Service (DAAD), the Istituto Nazionale dell'Alta Matematica (INdAM), the project "PRIN 2010–2011 geometria, topologia e analisi armonica", the Statistical Learning and Image Processing Genoa University Research Group (Slipguru), the Ph.D. program STIC and the Department of Mathematics (DIMA) of the University of Genova. Thanks to Tommaso Bruno and Sören Häuser for their careful proofreading of the second chapter. Last but not least, we would like to thank Sören Häuser for his skillful support in the many issues that were involved in harmonizing the different contributions, and in solving all the LaTeX problems.
Contents
1 From Group Representations to Signal Analysis
  Stephan Dahlke, Filippo De Mari, Philipp Grohs, and Demetrio Labate ........ 1
  References ........ 5

2 The Use of Representations in Applied Harmonic Analysis
  Filippo De Mari and Ernesto De Vito ........ 7
  2.1 Representations of Lie Groups ........ 7
      2.1.1 Locally compact groups ........ 8
      2.1.2 Lie Groups and Lie Algebras ........ 10
      2.1.3 Representation Theory ........ 30
      2.1.4 Reproducing Systems and Square Integrability ........ 36
      2.1.5 Unbounded Operators ........ 43
      2.1.6 Stone's Theorem and the Differential of a Representation ........ 46
  2.2 The Heisenberg Group and its Representations ........ 48
      2.2.1 The Group and its Lie Algebra ........ 48
      2.2.2 Automorphisms ........ 51
      2.2.3 The Schrödinger Representation ........ 53
      2.2.4 Time-frequency Analysis ........ 57
  2.3 The Metaplectic Representation ........ 61
      2.3.1 More on the Symplectic Group ........ 62
      2.3.2 Construction of the Metaplectic Representation ........ 64
      2.3.3 Restriction to Triangular Subgroups ........ 67
  References ........ 80

3 Shearlet Coorbit Theory
  Stephan Dahlke, Sören Häuser, Gabriele Steidl, and Gerd Teschke ........ 83
  3.1 Introduction ........ 83
  3.2 Coorbit Space Theory ........ 85
      3.2.1 Coorbit Spaces ........ 85
      3.2.2 Discretization ........ 95
      3.2.3 Examples ........ 104
      3.2.4 Proof of Lemma 3.16 and 3.18 ........ 108
  3.3 Multivariate Shearlet Transform ........ 110
      3.3.1 The Shearlet Group in R^d ........ 110
      3.3.2 The Continuous Shearlet Transform ........ 115
  3.4 Multivariate Shearlet Coorbit Theory ........ 119
      3.4.1 Shearlet Coorbit Spaces ........ 119
      3.4.2 Discretization ........ 127
  3.5 Structure of Shearlet Coorbit Spaces ........ 129
      3.5.1 Density ........ 129
      3.5.2 Embeddings ........ 130
      3.5.3 Traces of Shearlet Coorbit Spaces ........ 136
  3.6 Variation of a Theme: the Toeplitz Shearlet Transform ........ 145
  References ........ 146

4 Efficient Analysis and Detection of Edges Through Directional Multiscale Representations
  Kanghui Guo and Demetrio Labate ........ 149
  4.1 Introduction ........ 149
  4.2 The continuous wavelet transform and its generalizations ........ 153
      4.2.1 Continuous wavelets ........ 153
      4.2.2 Continuous shearlets ........ 157
      4.2.3 Continuous shearlets in the plane (n = 2) ........ 158
  4.3 Shearlet analysis of step edges. Case n = 2 ........ 163
      4.3.1 Proof of Theorem 4.4 ........ 167
  4.4 Shearlet analysis of edges in dimension n = 3 ........ 175
      4.4.1 3D Continuous Shearlet Transform ........ 175
      4.4.2 Characterization of 3D Boundaries ........ 177
      4.4.3 Identification of curve singularities on the piecewise smooth surface boundary of a solid ........ 180
  4.5 Other results and applications ........ 187
      4.5.1 Shearlet analysis of general edges ........ 187
      4.5.2 Numerical applications ........ 188
  Appendix ........ 189
  References ........ 195

5 Optimally Sparse Data Representations
  Philipp Grohs ........ 199
  5.1 Introduction ........ 199
      5.1.1 Notation ........ 202
  5.2 Signal Classes and Encoding ........ 202
  5.3 Upper Bounds on the Optimal Encoding Rate ........ 207
  5.4 Sparse Approximation in Dictionaries ........ 214
      5.4.1 Best N-term Approximation in Dictionaries ........ 215
      5.4.2 Best N-term Approximation with Polynomial Depth Search ........ 217
      5.4.3 Sparse Approximations in Frames ........ 220
  5.5 Wavelet Approximation of Piecewise Smooth Functions ........ 225
  5.6 Cartoon Approximation with Curvelet Tight Frames ........ 229
      5.6.1 Suboptimality of Wavelets for Cartoon Images ........ 229
      5.6.2 Curvelets ........ 233
      5.6.3 Shearlets ........ 240
  5.7 Further Examples ........ 244
  5.8 Appendix: Chernoff Bounds ........ 245
  References ........ 247

Index ........ 255
Contributors
Stephan Dahlke FB12 Mathematik und Informatik, Philipps-Universität Marburg, Lahnberge, Marburg, Germany
Filippo De Mari Dipartimento di Matematica, Università degli Studi di Genova, Genova, Italy
Ernesto De Vito Dipartimento di Matematica, Università degli Studi di Genova, Genova, Italy
Philipp Grohs Seminar for Applied Mathematics, ETH Zürich, Zurich, Switzerland
Kanghui Guo Missouri State University, Springfield, MO, USA
Sören Häuser Fachbereich für Mathematik, Technische Universität Kaiserslautern, Kaiserslautern, Germany
Demetrio Labate Department of Mathematics, University of Houston, Houston, TX, USA
Gabriele Steidl Fachbereich für Mathematik, Technische Universität Kaiserslautern, Kaiserslautern, Germany
Gerd Teschke Institute for Computational Mathematics in Science and Technology, Hochschule Neubrandenburg, University of Applied Sciences, Neubrandenburg, Germany
Chapter 1
From Group Representations to Signal Analysis Stephan Dahlke, Filippo De Mari, Philipp Grohs, and Demetrio Labate
Abstract In this chapter, we present the point of view that has inspired this book and we explain the perspective and scope of the four chapters that follow.
This book is concerned with signal analysis in its broadest sense. Usually, signals are modeled as functions in suitable spaces such as L2 , the space of square integrable functions, or Sobolev spaces. Signals might be given explicitly as, for example, in image analysis or implicitly, as solutions of operator equations. In either case, the problem of interest is to analyze and process these signals, that is, to extract their information and then to manipulate the signals for tasks such as compression, denoising, and enhancement. During the last decade, rapid advances in computing power and sensing technologies, and the exponential growth of the internet have enormously increased the availability of data, leading to what is sometimes described as the “data deluge” or “big data” problem. This situation created new opportunities and new challenges in the field of signal processing, since huge amounts of data have to be transmitted, stored, and analyzed with high efficiency. The challenges are due not only to the size of data but also to their complexity, since data acquired in many applications (think, for instance, of electronic surveillance and social media data) are often
heterogeneous and high-dimensional. Confronted and perhaps even daunted by these challenges, some scientists have already announced the "end of theory." They claim that a rigorous mathematical theory is no longer necessary as big data "offer a higher form of intelligence and knowledge that can generate insights that were previously impossible" [1]. The whole analysis process should hence be solely data-driven because it is claimed that "correlation is more important than causation." In other words, knowledge would be generated by combing through data with sufficient computational power in order to discover correlations by means of appropriate statistical tools. According to this point of view, the future progress of science would only depend on the increase of computing power and the clever implementation of well-established statistical algorithms.

We are guided by a very different perspective. We believe that there is nothing more applicable than a sound mathematical theory. We are convinced that rigorous mathematics can provide, and is already providing, the right instruments to address the many new challenges coming from advances in science and technology. Harmonic analysis, in particular, has been historically a discipline at the crossroads of mathematics, computer science, and engineering, and because of its versatile nature it can boost a closer mutual cooperation, where mathematical theory fosters technological advances and scientific discoveries, and, in turn, science and technology provide a continuous stimulus for the development of new mathematics. Starting with classical Fourier analysis and continuing with the theory of time-frequency analysis and wavelets, harmonic analysis has shown over many decades a remarkable "regenerative and centralizing power"¹. During the last decade, the emergence of the theory of sparse representations and compressed sensing and the introduction of innovative multiscale methods going beyond conventional wavelets prove that we are witnessing a clear manifestation of our point of view.

The aim of this book is to present in a comprehensive and consistent manner some of the most promising mathematical concepts emerged in applied harmonic analysis during the last decade. Let us briefly summarize these ideas while emphasizing their connection with signal analysis.

The first step in signal analysis is always signal transformation. Signals are modeled as elements of a function space and transformed via a mapping into a new function, defined on a suitable parameter space. This mapping usually makes it easier to recognize and extract the most relevant information of the signal. Very often the parameter space is itself highly structured, reflecting the symmetries inherently attached to the space of signals. Under favorable circumstances, the parameter space forms a group. Thus, the very high-powered tools from group theory apply, in particular the theory of unitary representations of Lie groups. In fact some of the most important transforms, such as the wavelet transform and the Gabor transform, are related to square-integrable group representations. The overall
¹ J. Benedetto, introduction to [2].
potential of representation theory, however, is far from having been fully exploited yet and, more importantly, its most basic constructs and techniques are not as widely known among applied mathematicians and people working in this area. Thus, in the chapter "The use of representation theory in applied harmonic analysis" we provide a crash course on Lie theory, survey its most commonly recognized applications in signal analysis already indicated above, namely the representation theory of the affine group and of the Heisenberg group as keys to wavelets and Gabor analysis, respectively, and finally examine in some detail the metaplectic representation of the symplectic group. Even though the latter is also widely appreciated and its role clearly acknowledged in this and other closely related fields, some of its more recently investigated facets are perhaps less known. In particular, we focus on the observation that its restriction to block-triangular subgroups provides a unified approach to many known reproducing formulae, such as those used in the theory of shearlets, as well as some other new intriguing constructions emerged in applied harmonic analysis and signal processing.

One of the most successful applications of unitary group representations is coorbit theory, initially developed by Feichtinger and Gröchenig in a series of papers in the late eighties. It is well known that the convergence order of any numerical approximation scheme is closely related to the regularity of the object one wants to approximate. Therefore, it is important to classify the "right" smoothness measures. This is one of the key issues addressed by coorbit space theory, in which a crucial role is played by the so-called voice transform. The main data with which the transform is built are a unitary representation of the parameter group G, acting on the Hilbert space H of the signals that one wants to analyze, and a fixed vector in H, to be thought of as the appropriate analogue of a standard wavelet. The voice transform of a signal is just the collection of its projections along the various "directions" in H that are obtained by acting with G on the analyzing vector, and thus maps signals in H into functions on G. The technical assumptions are tailored in such a way that the voice transform of any signal actually belongs to $L^2(G)$, and, most importantly, gives rise to a natural class of distributions, to which the (extended) voice transform may be applied. The idea is then to define canonical smoothness spaces, the coorbit spaces, by collecting those distributions whose voice transform has a prescribed decay rate, codified by the appropriate space of functions on G. In the wavelet setting, the coorbit spaces are the (homogeneous) Besov spaces, and for the Gabor transform one obtains the modulation spaces. In Section 3.2 we present a detailed introduction to coorbit space theory.

The next step in signal analysis is always discretization. When it comes to practical applications, of course only discrete data can be handled, so that some kinds of bases or more general frames for the underlying spaces are needed. The resulting building blocks, the frame elements, should therefore be adapted to the kind of information that one wants to extract. Typical examples are wavelet frames that can be used for time-scale analysis of signals, and Gabor frames that can be used for time-frequency analysis. One way to construct canonical building blocks is again coorbit theory. We refer again to Section 3.2.
One specific task in modern signal analysis is the detection of directional information. Classical transforms such as the wavelet and the Gabor transform, respectively, are suboptimal since they are essentially isotropic. Therefore, there has been a compelling need to design new transforms and building blocks that are particularly tuned to this problem. The main issue is the search for the correct mathematical tool to capture and follow directional changes, and this is achieved by using a most natural idea, the use of shearing transformations in combination with anisotropic (typically parabolic) dilations. Roughly speaking, this amounts to elongating shapes along one direction while keeping the others unchanged. Due perhaps to their intrinsic features, among other approaches such as curvelets, ridgelets, and contourlets, just to name a few, shearlets have gained more and more attention over the last few years and have proved to be very efficient. A discussion of some of the most significant aspects of shearlet theory covers a very important part of this book.

Among all the recently developed directional transforms, the shearlet transform stands out since it stems from a square-integrable representation of a specific group, the full shearlet group. This remarkable property paves the way for the application of all the powerful tools from group representation theory already mentioned above and, in particular, square-integrability calls for an understanding of the coorbit space theory that naturally arises. In the third chapter of this book, "Shearlet coorbit theory," the use of the continuous shearlet transform in the context of coorbit space theory is thoroughly investigated, and various embeddings and trace results are proved.

Once suitable building blocks for the extraction of directional information are constructed, a natural question is how we can benefit from it in terms of data compression. A thorough mathematical analysis of this question in a more general context is presented in the chapter "Optimally Sparse Data Representations" where we study the use of sparse frame representations for the optimal compression of different signal classes governed by different features (point- or curve-like discontinuities, textures, etc.). Several examples are considered, among them the optimal compression of piecewise smooth univariate signals with wavelets and the optimal compression of bivariate piecewise smooth functions with curved discontinuities (a common benchmark model for natural images known as 'cartoon images') by curvelets and shearlets.

So far, we have mainly discussed group representations, their associated function spaces and their atomic decompositions, with special attention to building blocks that are sensitive to directions. In a certain sense, such function spaces are defined so that membership in these spaces is a measure of some kind of global information. However, in many practical applications, it is equally important or even preferable to measure local geometric information, e.g., local smoothness or local geometric features of singularity structures.
The prominence of local properties, which was already critical in the chapter "Optimally sparse data representations" in the context of sparse approximations, is the central topic of the chapter "Efficient analysis and detection of edges through directional multiscale representations." The main goal of this chapter is to illustrate how local information can be retrieved by means of microlocal analysis based on the continuous shearlet transform.
Multiscale transforms have been frequently associated with the analysis of singularities of functions and distributions and the application of the wavelet transform to detect edges and other singularities goes back to the origins of the wavelet literature. However, while the conventional continuous wavelet transform is able to detect the location of singular points, due to its intrinsic isotropic nature it is unable to provide additional information about the geometry of the set of singularities. By combining multiresolution analysis and directional sensitivity, the continuous shearlet transform offers an ideal framework for microlocal analysis. In the chapter “Efficient analysis and detection of edges through directional multiscale representations,” we show that this approach can be used to derive a precise geometric characterization of singularities for a large class of functions or distributions through its asymptotic decay as the scale parameter tends to zero. For example, it can be used to determine the location and orientation of step edges, including possibly the detection of corner points.
References

1. Boyd, D., Crawford, K.: Critical questions for big data. Inf. Commun. Soc. 15(5), 662–679 (2012)
2. Heil, C., Walnut, D.: Fundamental Papers in Wavelet Theory. Princeton University Press, Princeton, NJ (2006)
Chapter 2
The Use of Representations in Applied Harmonic Analysis Filippo De Mari and Ernesto De Vito
Abstract The role of unitary group representations in applied mathematics is manifold and has been frequently pointed out and exploited. In this chapter, we first review the basic notions and constructs of Lie theory and then present the main features of some of the most useful unitary representations, such as the wavelet representation of the affine group, the Schrödinger representation of the Heisenberg group, and the metaplectic representation. The emphasis is on reproducing formulae. In the last section we discuss a promising class of unitary representations arising by restricting the metaplectic representation to triangular subgroups of the symplectic group. This class includes many known important examples, like the shearlet representation, and others that have not been looked at from the point of view of possible applications, like the so-called Schrödingerlets.
2.1 Representations of Lie Groups

Representation theory of groups is a vast subject. Many of the aspects of this theory that are of interest in Harmonic Analysis, and, in particular, of that body of ideas and techniques that might be collectively referred to as Applied Harmonic Analysis, can be studied within the class of locally compact and second countable topological groups, a family whose nickname in the group theory jargon is "lcsc." Most of the examples in which we are interested, however, belong to the smaller and nicer class of Lie groups, which feature an additional geometric-analytic nature that allows one, for instance, to speak about dimension or to perform handy calculations such as taking derivatives or solving differential equations. As is often the case, there is a trade-off between the beautiful generality and formal simplicity of lcsc groups, for which a smaller number of techniques is available, and the class of Lie groups, which is smaller, but enjoys many more desirable properties. At the level of practical examples, the theoretical obstacles fade out almost completely because one
is in the end dealing with matrix groups whose description in coordinates is often quite natural, and the computations that are at times hard to formalize in the general setup are simple extensions of those that the reader is familiar with in $\mathbb{R}^d$. What is gained is a general conceptual landscape that provides insight and makes full use of the symmetries that are involved in the problems at hand. Finally, we think that research in this area requires a wide box of mathematical tools, including those that are pertinent to Lie groups, because of their effectiveness, flexibility, and depth. For these reasons we shall work mostly within this family, even though we by no means use the full force of the representation theory of Lie groups.

On the reader's side, we take for granted some working knowledge of topology, calculus, linear algebra, and elementary differential geometry. From the latter, we essentially need the notion of smooth manifold, the basic constructs of tangent vectors and vector fields, and the use of tangent mappings, or differentials, which we actually very quickly review. Some of the results concerning Lie groups that are summarized in Section 2.1.2 below are used in the sections that follow, others are presented in order to achieve a better understanding of the main ideas. A full list of appropriate references is beyond the purpose of this contribution. It is nevertheless worthwhile mentioning a few important books that have been inspiring to us and whose general viewpoint is close to ours, such as those by Ali, Antoine, Gazeau [1], Daubechies [10], Folland [16], Führ [19], Gröchenig [21] and Kaniuth, Taylor [24]. Given the nature of this chapter, which is for a substantial part a survey of well-known results, some basic facts are presented as exercises, and are indeed workable with little effort.
2.1.1 Locally compact groups

We start with some fundamental definitions and results. For a detailed account on these matters, the reader may consult [17].

Definition 2.1. A topological group is a group $G$ endowed with a topology relative to which the group operations
$$(g,h) \mapsto gh, \qquad g \mapsto g^{-1}$$
are continuous as maps $G \times G \to G$ and $G \to G$, respectively. $G$ is locally compact if every point has a compact neighborhood.

We shall also assume our groups to be Hausdorff.

Definition 2.2. A Borel measure $\mu$ defined on the $\sigma$-algebra generated by the open sets of the topological space $X$ is called a Radon measure if:

i) it is finite on compact sets;
ii) it is outer regular on the Borel sets: for every Borel set $E$
$$\mu(E) = \inf\{\mu(U) : U \supseteq E,\ U \text{ open}\};$$
iii) it is inner regular on the open sets: for every open set $U$
$$\mu(U) = \sup\{\mu(K) : K \subseteq U,\ K \text{ compact}\}.$$

Definition 2.3. A left Haar measure on the topological group $G$ is a nonzero Radon measure $\mu$ such that $\mu(xE) = \mu(E)$ for every Borel set $E \subseteq G$ and every $x \in G$. Similarly for right Haar measures.

Of course, the prototype of Haar measure is the Lebesgue measure on the additive group $\mathbb{R}^d$, which is invariant under left (and right) translations.

Theorem 2.4 ([17] Theorem (2.10)). Every locally compact group $G$ has a left Haar measure $\mu$, which is essentially unique in the sense that if $\nu$ is any other left Haar measure, then there exists a positive constant $c$ such that $\nu = c\mu$.

If we fix a left Haar measure $\mu$ on $G$, then for any $x \in G$ the measure $\mu_x$ defined by $\mu_x(E) = \mu(Ex)$ is again a left Haar measure. Therefore there must exist a positive real number, denoted $\Delta(x)$, such that
$$\mu_x = \Delta(x)\,\mu.$$
The function $\Delta\colon G \to \mathbb{R}^+$ is called the modular function. In order to keep the notation clean, we will assume throughout that a left Haar measure has been fixed and we write $dx$ instead of $d\mu(x)$.

Proposition 2.5 ([17] Proposition (2.24)). Let $G$ be a locally compact group. The modular function $\Delta$ is a continuous homomorphism from $G$ into the multiplicative group $\mathbb{R}^+$. Furthermore, for every $f \in L^1(G)$ we have
$$\int_G f(xy)\, dx = \Delta(y)^{-1} \int_G f(x)\, dx.$$

In Section 2.1.2.8 below we give some examples in the context of Lie groups. A group for which the modular function is identically equal to one is called unimodular. Large classes of groups are unimodular, such as the Abelian, compact, nilpotent, semisimple, and reductive groups. Nevertheless, in Applied Harmonic Analysis non-unimodular groups play a prominent role. The most important example is the affine group "$ax+b$" that we define in the next section.
2.1.2 Lie Groups and Lie Algebras

We recall, without proofs, some basic facts about Lie groups and Lie algebras. For a concise and effective exposition, see [31]. Classical references with a wider scope are [26] and [30]. We shall often use the word "smooth" in place of "$C^\infty$".

Definition 2.6. A Lie group $G$ is a smooth manifold endowed with a group structure such that the group operations $(g,h) \mapsto gh$ and $g \mapsto g^{-1}$ are smooth.

Example 2.7. Clearly, $\mathbb{R}^d$ is an additive, Abelian Lie group. Similarly $\mathbb{C}^d$, identified with $\mathbb{R}^{2d}$ as manifolds. Any finite dimensional real or complex vector space can be given the structure of Lie group simply by choosing a basis and then identifying it with $\mathbb{R}^d$.

Example 2.8. The sphere $S^1 = \{e^{i\theta} : \theta \in [0, 2\pi)\}$ is an Abelian compact Lie group.

Example 2.9. The multiplicative group $GL(d,\mathbb{R})$ of invertible matrices is a Lie group. Indeed, it is an open submanifold of $\mathbb{R}^{d^2}$ with the global coordinates $x^{ij}$ that assign to a matrix its $ij$-th entry. If $y, z \in GL(d,\mathbb{R})$, then $x^{ij}(yz)$ and $x^{ij}(y^{-1})$ are rational functions of $\{x^{ij}(y), x^{ij}(z)\}$ and of $\{x^{ij}(y)\}$, respectively, with nonvanishing denominator. Hence they are smooth functions.

The closed subgroups of $GL(d,\mathbb{R})$ deserve great attention. They are automatically Lie groups, and in fact enjoy additional nice features from the topological point of view. This very important result is due to Cartan and is recalled below in Theorem 2.26. A remarkable closed subgroup of $GL(2d,\mathbb{R})$ is the symplectic group $Sp(d,\mathbb{R})$, which is of central importance in this chapter. It is defined by
$$Sp(d,\mathbb{R}) = \{g \in GL(2d,\mathbb{R}) : {}^t g J g = J\} \tag{2.1}$$
where ${}^t g$ is the transpose of $g$ and where $J$ is the canonical skew-symmetric matrix
$$J = \begin{bmatrix} 0 & I_d \\ -I_d & 0 \end{bmatrix} \tag{2.2}$$
that defines the standard symplectic form (see Section 2.3.1 for more details). Notice that for $d = 1$ we have $Sp(1,\mathbb{R}) = SL(2,\mathbb{R})$, the group of $2\times 2$ real matrices with determinant equal to one.

Example 2.10 (The affine group "$ax+b$"). There are several possible versions of this group. Let $G = \mathbb{R}^+ \times \mathbb{R}$ as a manifold. One can visualize it as the right half plane. The multiplication is obtained by thinking of the pair $(a,b)$, with $a > 0$ and $b \in \mathbb{R}$, as identifying the affine transformation of the real line given by $x \mapsto ax + b$, whence the name. The composition of maps
$$x \mapsto ax + b \mapsto a'(ax + b) + b' = (a'a)x + (a'b + b')$$
yields the product rule
$$(a',b')(a,b) = (a'a,\ a'b + b').$$
Evidently, both functions $a'a$ and $a'b + b'$ are smooth in the global coordinates on $G$, which is then a connected Lie group. When speaking of the "$ax+b$" group we refer to this group. A non-connected version arises by taking $a \in \mathbb{R}^* = \mathbb{R}\setminus\{0\}$ instead of $a > 0$. We shall refer to this as the full affine group. Yet another slightly different construction comes from thinking of the pair $(a,b)$ as identifying the transformation $x \mapsto a(x+b)$. This point of view yields both a connected and a non-connected Lie group.

Definition 2.11. A Lie algebra $\mathfrak{g}$ over $\mathbb{R}$ is a real vector space endowed with a bilinear operation $[\,\cdot\,,\cdot\,]\colon \mathfrak{g}\times\mathfrak{g} \to \mathfrak{g}$, called bracket, such that

i) $[X,Y] = -[Y,X]$ for every $X, Y \in \mathfrak{g}$,
ii) $[X,[Y,Z]] = [[X,Y],Z] + [Y,[X,Z]]$ for every $X, Y, Z \in \mathfrak{g}$.

Item ii), otherwise known as the Jacobi identity, should be thought of as an analogue of the derivative of a product. Indeed, if for any $X \in \mathfrak{g}$ we put
$$\mathrm{ad}\,X\colon \mathfrak{g} \to \mathfrak{g}, \qquad \mathrm{ad}\,X(Y) = [X,Y],$$
the Jacobi identity may be formulated:
$$\mathrm{ad}\,X([Y,Z]) = [\mathrm{ad}\,X(Y), Z] + [Y, \mathrm{ad}\,X(Z)],$$
which is reminiscent of the derivative of a product, with the bracket as product. This seemingly awkward notation comes from the fact that the map $X \mapsto \mathrm{ad}\,X$ defines the so-called adjoint representation (see Subsection 2.1.2.6 below for details).

Example 2.12. If $V$ is a vector space, the set $\mathrm{End}(V)$ of all linear maps of $V$ into itself is a Lie algebra under the commutator $[S,T] = ST - TS$ as bracket. With this structure understood, it is denoted by $\mathfrak{gl}(V)$.

It should be clear what is meant by Lie subalgebra of a Lie algebra $\mathfrak{g}$: it is a vector subspace $\mathfrak{h}$ which is closed under bracket, that is, such that if $A, B \in \mathfrak{h}$ then $[A,B] \in \mathfrak{h}$. A stronger notion is that of ideal: it is a Lie subalgebra $\mathfrak{h}$ of $\mathfrak{g}$ with the property that $[\mathfrak{h},\mathfrak{g}] \subseteq \mathfrak{h}$, which means that for every $A \in \mathfrak{h}$ and every $B \in \mathfrak{g}$ we have $[A,B] \in \mathfrak{h}$.
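As a quick numerical illustration of Example 2.10 and of the modular function of Section 2.1.1, the following Python sketch verifies the group law of the "$ax+b$" group and checks, by Monte Carlo integration, the identity of Proposition 2.5. It assumes the standard closed-form facts, stated here without derivation, that $d\mu(a,b) = a^{-2}\,da\,db$ is a left Haar measure and that $\Delta(a,b) = 1/a$; both the code and these formulas are an added illustration, not part of the chapter itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def mult(gp, hp):
    """Product (a', b')(a, b) = (a'a, a'b + b') of Example 2.10."""
    (a1, b1), (a2, b2) = gp, hp
    return (a1 * a2, a1 * b2 + b1)

# The group law is composition of affine maps x -> ax + b.
g, h, x = (2.0, -1.0), (0.5, 3.0), 1.7
a0, b0 = mult(g, h)
assert np.isclose(a0 * x + b0, g[0] * (h[0] * x + h[1]) + g[1])

# Monte Carlo check of Proposition 2.5 for this group, assuming the
# standard (not derived above) facts that
#   dmu(a, b) = a^(-2) da db   is a left Haar measure, and
#   Delta(a, b) = 1/a          is the modular function.
def f(a, b):                       # a rapidly decaying test function
    return np.exp(-np.log(a) ** 2 - b ** 2)

N = 400_000
a = np.exp(rng.uniform(-6.0, 6.0, N))      # sample a on a log scale
b = rng.uniform(-10.0, 10.0, N)
w = a ** (-2) * (12.0 * a) * 20.0 / N      # dmu divided by sampling density

y = (0.7, 0.4)                              # fixed right translation
lhs = np.sum(f(a * y[0], a * y[1] + b) * w) # integral of f(xy) dmu(x)
rhs = y[0] * np.sum(f(a, b) * w)            # Delta(y)^{-1} times integral of f
print(lhs, rhs)  # the two numbers agree up to Monte Carlo error
```

The same sampling trick works for any test function supported well inside the sampled box; the only group-specific ingredients are the product rule and the assumed Haar density.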
2.1.2.1 Tangent Vectors and Vector Fields
Tangent vectors can be defined in several equivalent ways. A natural way to think of a tangent vector at the point $p$ of the manifold $M$ is to introduce an equivalence relation among all the smooth curves $c$ defined in some open interval containing $0 \in \mathbb{R}$ with values in $M$ such that $c(0) = p$. We establish that $c_1 \sim c_2$ if for every smooth function $\varphi\colon U_p \to \mathbb{R}$ defined on an open neighborhood $U_p$ of $p$ in $M$ it holds
$$\frac{d}{dt}\Big|_{t=0} \varphi(c_1(t)) = \frac{d}{dt}\Big|_{t=0} \varphi(c_2(t)).$$
The equivalence class $\langle c\rangle_p$ of any of these curves is then a tangent vector to $M$ at $p$. This line of thought will be adopted in Section 2.2 when computing the generators of the Lie algebra of the Heisenberg group. The set of all tangent vectors at $p$ is a vector space, the tangent space of $M$ at $p$, denoted $T_p(M)$. Once a coordinate patch $(U_p, x_1, \dots, x_d)$ around $p \in M$ has been fixed, the equivalence class of the special curve $c_i$ defined by $t \mapsto (p_1, \dots, p_i + t, \dots, p_d)$ is identified with the tangent vector denoted by
$$\frac{\partial}{\partial x_i}\Big|_p$$
because it operates on any function $\varphi$ defined and smooth on $U_p$ by
$$\frac{\partial}{\partial x_i}\Big|_p \varphi = \frac{d}{dt}\Big|_{t=0} \varphi(c_i(t)),$$
the common value attained along all curves equivalent to $c_i$. These particular tangent vectors give rise to a basis of the vector space $T_p(M)$. A sensible expression for a tangent vector at $p$ is therefore
$$X_p = \sum_{i=1}^{d} a_i \frac{\partial}{\partial x_i}\Big|_p \in T_p(M), \tag{2.3}$$
with $a_1, \dots, a_d \in \mathbb{R}$. As implicitly said in the previous paragraph, a tangent vector acts on a function $\varphi$ defined locally around $p$ as a first order differential operator and produces a real number, the effect of the derivative $X_p\varphi$ defined in (2.3). Using equivalence classes of curves, this real number is
$$\langle c\rangle_p\, \varphi = \frac{d}{dt}\Big|_{t=0} \varphi(c(t)).$$
A vector field $X$ on the manifold $M$ is a smooth map that assigns to each point $p \in M$ a tangent vector at that point, that is, an element $X_p \in T_p(M)$. The simplest example of a locally defined vector field is
$$\frac{\partial}{\partial x_i}, \qquad p \mapsto \frac{\partial}{\partial x_i}\Big|_p.$$
Vector fields act on functions, in the sense that if $X$ is a vector field and $\varphi$ is a function, then $X\varphi$ is the function that at the point $p$ takes the value $X\varphi(p) = X_p(\varphi)$. The smoothness of $X$ is, by definition, the requirement that $X\varphi$ is smooth whenever $\varphi$ is such. Let $\mathfrak{X}(M)$ denote the set of all smooth vector fields on $M$. They can be interpreted as the derivations of $C^\infty(M)$, the algebra of smooth functions. This means that any $X \in \mathfrak{X}(M)$ acts on smooth functions linearly (that is, $X(\alpha\varphi + \beta\psi) = \alpha X(\varphi) + \beta X(\psi)$ for any choice of scalars $\alpha$ and $\beta$), producing new smooth functions, and that the derivative rule
$$X(\varphi\psi) = X(\varphi)\psi + \varphi X(\psi)$$
holds. As a consequence, in any local coordinate system $(U, x_1, \dots, x_d)$ a vector field $X \in \mathfrak{X}(M)$ can be expressed as
$$X = \sum_{i=1}^{d} \varphi_i \frac{\partial}{\partial x_i}$$
where $\varphi_1, \dots, \varphi_d \in C^\infty(M)$. The action of $X$ on a function $\psi \in C^\infty(M)$ is then the function $X\psi$ whose value at $p \in U \subseteq M$ is
$$X\psi(p) = \sum_{i=1}^{d} \varphi_i(p) \frac{\partial \psi}{\partial x_i}(p).$$
Thus $X$ is smooth if and only if the functions $\varphi_i$ are smooth.
Thus X is smooth if and only if the functions 'i are smooth.
2.1.2.2
Lie Algebras of Vector Fields
The set X.M/ enjoys a structural algebraic property, it is a C1 .M/-module. This means that the vector fields form an Abelian group under the natural (pointwise) sum, and they can be multiplied (pointwise) by smooth functions, respecting the rules that modules require, namely '.X C Y/ D 'X C 'Y;
.' C /X D 'X C X;
'. X/ D .' /X;
1X D X;
where X; Y 2 X.M/, '; 2 C1 .M/ and 1 is the function equal to one on M. More remarkably, X.M/ is a (typically infinite dimensional) Lie algebra under the commutator. This is a consequence of the fact that
14
F. De Mari and E. De Vito
ŒX; Y WD X ı Y Y ı X is in fact a first order differential operator because the second order terms vanish due to the equality of mixed partials. Therefore ŒX; Y 2 X.M/ and the Jacobi identity is readily established, together with bilinearity and skew-symmetry. As we shall see below, the main feature of Lie groups is that, thanks to the presence of (left) translations, the Lie algebra X.G/ always admits a very natural finite dimensional Lie subalgebra. Any smooth map FW M ! N between smooth manifolds gives rise to the tangent map F of the corresponding tangent spaces. Thus, for any p 2 M the tangent map at p is the linear map Fp W Tp .M/ ! TF.p/ .N/ carrying the tangent vector Xp 2 Tp .M/ to the tangent vector Fp Xp 2 TF.p/ .N/ whose action (as a derivation) on a function defined in a neighborhood of F.p/ is Fp Xp . / D Xp .
ı F/:
The tangent map is also known as the differential of F at p. When M and N are open subsets of Rd and Rn , respectively, the differential is expressed in the canonical bases by the n d Jacobian matrix. Let now G be a Lie group and denote by lg W G ! G the left translation by g 2 G, that is lg .h/ D gh. A vector field X 2 X.G/ is called left invariant on G if for every g; h 2 G it satisfies .lg /h Xh D Xgh : The set of all left invariant vector fields on G will be denoted L .G/. Proposition 2.13 ([31] Proposition (3.7)). Let G be a Lie group and denote by L .G/ the set of left invariant vector fields on G. Then: i) L .G/ is an R-vector space and the linear map ˛W L .G/ ! Te .G/ defined by ˛.X/ D Xe is a vector space isomorphism between L .G/ and the tangent space to G at the identity e 2 G. Consequently, dim L .G/ D dim Te .G/ D dim G. ii) The commutator ŒX; Y D X ı Y Y ı X of two left invariant vector fields is again a left invariant vector field. With this bracket, L .G/ becomes a Lie algebra, which will be called the Lie algebra of G. We shall now discuss in some detail an example that plays a crucial role in what follows and illustrates the concepts that we have just introduced. Perhaps the most important Lie group in which we are interested is GL.d; R/, the invertible d d matrices. Indeed, as we shall see, most of the groups at which we look in this chapter arise as closed subgroups of GL.d; R/. Therefore, GL.d; R/ serves as a large ambient group in which the action takes place. We shall now see that its Lie algebra is naturally identified with the Lie algebra of all d d matrices. The identification that we present provides the most basic insight when dealing with matrix Lie groups.
2 The Use of Representations
15
Let us denote by gl.d; R/ the Lie algebra whose underlying vector space is the set Md .R/ of square d d matrices with real entries and whose bracket is the commutator ŒA; B D AB BA. The reader may check that this is indeed a bracket, in the sense that it defines a bona fide Lie algebra structure on Md .R/. If we use 2 global coordinates on GL.d; R/ as an open subset of Rd (see Example 2.9 above), then a tangent vector to GL.d; R/ at g 2 GL.d; R/ may be written as
D
X
aij
ij
@ ˇˇ ˇ 2 Tg .GL.d; R// @xij g
for some d2 real numbers aij . If h 2 GL.d; R/, then the image of under the differential .lh /g is the tangent vector .lh /g D
X ij
bij
@ ˇˇ ˇ 2 Thg .GL.d; R// ; @xij hg
where again bij are d2 suitable real numbers. In order to compute the explicit value of these coordinates as functions of aij and of h, we use the fundamental identification of tangent vectors as first order differential operators discussed above and evaluate the vector field .lh /g on the coordinate function xij . Explicitly, X bij D .lh /g .xij / D xij ı lh D
hik xkj
D
X pq
apq
@ ˇˇ X hik xkj ˇ @xpq g k
!
k
! D
X
apj hip D .hA/ij ;
p
where A D .aij / is the d d real matrix associated with the components aij . Hence .lh /g D
X @ ˇˇ .hA/ij ij ˇ : @x hg ij
(2.4)
This formula suggests an expression for the left invariant vector fields on GL.d; R/. We mean the following: a left invariant vector field on GL.d; R/ is completely determined by its value at the identity, whose coordinates are encoded by a d d matrix, say A. We denote by X A such vector field. As the next proposition shows, the correspondence A $ X A is an instance of the isomorphism described in item i) of Proposition 2.13. The isomorphism is very nice and operative, in the sense that the bracket of the vector fields X A and X B is the vector field X ABBA , so that we can completely identify the left invariant vector fields on GL.d; R/ with the elements of the Lie algebra gl.d; R/ and compute directly with matrices instead of going through complicated expressions that involve partial derivatives.
16
F. De Mari and E. De Vito
Proposition 2.14. Given any matrix A 2 Mn .R/, the vector field X A on GL.d; R/ whose value at g 2 GL.d; R/ is XgA D
X
.gA/ij
ij
@ ˇˇ ˇ @xij g
is a left invariant vector field on GL.d; R/. Further, the map A 7! X A is a Lie algebra isomorphism between gl.d; R/ and the Lie algebra of GL.d; R/. Proof. The fact that X A is a left invariant vector field follows at once from (2.4). Take now A; B 2 gl.d; R/. We now find the component of ŒX A ; X B g by computing .X A ı X B X B ı X A /g .xij /. Since X @ ˇˇ X @ ˇˇ .X ı X /g .x / D .gA/pq pq ˇ .gB/mn mn ˇ .xij / @x g mn @x g pq A
B
!
ij
D
X @ ˇˇ .gA/pq pq ˇ .gB/ij @x g pq
X @ ˇˇ X .gA/pq pq ˇ . gik Bkj / @x g k pq X X .gA/pq . ıip ıqk Bkj / D
D
pq
k
X .gA/iq Bqj D q
D .gAB/ij ; we get .X A ı X B X B ı X A /g .xij / D .g.AB BA//ij and consequently ŒX A ; X B D X ŒA;B ; which is precisely what we wanted to show.
2.1.2.3
t u
Homomorphisms
Take two Lie groups, G and H. A map FW G ! H is a Lie group homomorphism if it is a group homomorphism (hence if F.e/ D e, the identities of G and H,
2 The Use of Representations
17
respectively, and if F.xy/ D F.x/F.y/ for every x; y 2 G) and also a smooth map of manifolds. We say that F is a Lie group isomorphism if it is a diffeomorphism, that is, a smooth bijection with smooth inverse. Example 2.15. A good example of homomorphism is the map F W U.2/ ! Sp.2; R/ defined by F.X C iY/ D
X Y : Y X
Here U.2/ stands for the 2 2 complex unitary matrices, those for which t gN g D I. Now, if g D X C iY with X; Y 2 M2 .R/, then t gN g D I is equivalent to t XX C t YY D I and t XY symmetric. But then F.X C iY/ satisfies t
t F.X C iY/JF.X C iY/ D
X t Y t Y tX
I
X Y Y X
I t XY C t YX t XX C t YY DJ D t YY t XX t YX t XY
and is therefore symplectic. This proves that F takes values in Sp.2; R/. It is easy to see that F.gh/ D F.g/F.h/ whereas F.I/ D I is obvious. As for smoothness, this is a somewhat tricky issue that needs not concern us now. Finally, U.2/ has dimension 4, while Sp.2; R/ has dimension 10, so F cannot possibly be an isomorphism. An isomorphism of G onto itself is called an automorphism. A natural class of automorphisms are the so-called inner automorphisms, namely those given by inner conjugation. If g 2 G, the inner conjugation defined by g is the map ig W G ! G;
ig .h/ D ghg1 :
In general, a Lie group possesses automorphisms that are not inner. A good example is given by the automorphisms of the Heisenberg group, that are listed below in Theorem 2.86. Observe that the set of linear automorphisms of a finite dimensional R-vector space V (hence a Lie group) has itself a natural structure of Lie group, because if a basis is selected, then the group of all linear invertible maps may be identified with the Lie group of invertible matrices. This group is denoted by Aut.V/. As we shall see in Section 2.1.3, a homomorphism W G ! Aut.V/ is what is called a finite dimensional representation of G. A Lie algebra homomorphism between two Lie algebras g and h is a linear map W g ! h for which .ŒX; Y/ D Œ .X/; .Y/. Further, if is a linear isomorphism, then it is called a Lie algebra isomorphism. If h D gl.W/ is the Lie algebra of all endomorphisms of a vector space W, a Lie algebra homomorphism W g ! h is called a representation of g on W. This is the case of the homomorphism X 7! ad X, which defines the adjoint representation of g, a representation of g on itself.
18
F. De Mari and E. De Vito
Let FW G ! H be a Lie group homomorphism. Its differential evaluated at the identity Fe W Te .G/ ! Te .H/ is a linear map. By the identifications Te .G/ ' L .G/ and Te .H/ ' L .H/, Fe induces a linear map L .G/ ! L .H/ denoted dF. More precisely, if X 2 L .G/, then dF.X/ is the unique left invariant vector field on G such that .dF.X//e D Fe Xe : The following result clarifies matters. Proposition 2.16 ([31] Theorem (3.14)). Let FW G ! H be a Lie group homomorphism. Then for every X 2 L .G/ it holds that Fg Xg D .dF.X//F.X/ and dF is a Lie algebra homomorphism. Take again a Lie group homomorphism iW H ! G and assume that i is injective and that also its differential is injective at every point (such a map is called an injective immersion). In this case the pair .i; H/ is called a Lie subgroup of G. It should be clear that whenever a Lie subgroup is given, then, upon taking the differential di of the corresponding immersion, one gets an immersion of Lie algebras. In other words, to any Lie subgroup there corresponds a Lie subalgebra. The question concerning a possible reverse correspondence is addressed by the following fundamental result. Theorem 2.17 ([31] Theorem (3.19) and corollaries, and Theorem (3.48)). Let G be a Lie group with Lie algebra g and take a Lie subalgebra h of g. Then there exists a connected Lie subgroup .i; H/ of G, unique up to isomorphisms, such that di.L .H// D h. Therefore there is a bijective correspondence between the connected Lie subgroups of a Lie group and the subalgebras of its Lie algebra. Under this bijection, normal subgroups correspond to ideals. The theory of covering groups is also very important, and relevant in the present context, but we content ourselves with the observation that every connected Lie group has a simply connected covering that admits the structure of Lie group for which the covering homomorphism is a Lie group homomorphism. Furthermore, a Lie group homomorphism FW G ! H is a covering map if and only if dF is a Lie algebra isomorphism. The following theorem is of central importance in the theory of Lie groups, it is the monodromy principle for Lie groups, namely the possibility of lifting homomorphisms from the Lie algebra to the Lie group. Theorem 2.18 ([31] Theorem (3.16) and Theorem (3.27)). Let G1 and G2 be two Lie groups with Lie algebras g1 and g2 , respectively, and let ˚W g1 ! g2 be a Lie algebra homomorphism. Then there cannot be more than one Lie group homomorphism FW G1 ! G2 such that dF D ˚. If G1 is simply connected, then such an F exists.
2.1.2.4 Exponential Mapping
We now review in some detail the definition of the fundamental map linking any Lie group with its Lie algebra, namely the exponential mapping $\exp\colon\mathfrak g\to G$. Let $\mathbb R$ be the additive Lie group of real numbers. Its Lie algebra is one-dimensional and is generated by the vector field $\frac{d}{dt}$. Take now a Lie group $G$ with Lie algebra $\mathfrak g$, and fix $X\in\mathfrak g$. The map
\[
\lambda\,\frac{d}{dt}\ \longmapsto\ \lambda X,\qquad \lambda\in\mathbb R,
\]
is a Lie algebra homomorphism from $\mathbb R$ into $\mathfrak g$. Since $\mathbb R$ is simply connected, by the monodromy principle there exists a unique homomorphism $\gamma_X\colon\mathbb R\to G$ such that
\[
(\gamma_X)_{*\tau}\Bigl(\tfrac{d}{dt}\Big|_{t=\tau}\Bigr)=X_{\gamma_X(\tau)},\qquad
(\gamma_X)_{*0}\Bigl(\tfrac{d}{dt}\Big|_{t=0}\Bigr)=X_e. \tag{2.5}
\]
Conversely, if $\gamma\colon\mathbb R\to G$ is a Lie group homomorphism, then $X=d\gamma(\frac{d}{dt})$ satisfies $\gamma=\gamma_X$. Hence, the correspondence $X\mapsto\gamma_X$ establishes a bijection between $\mathfrak g$ and the set of homomorphisms from $\mathbb R$ into $G$, with the property that $d\gamma_X(\frac{d}{dt})=X$ for every $X\in\mathfrak g$.

Fix now $\lambda\in\mathbb R$ and $X\in\mathfrak g$. Then, if $m_\lambda$ denotes the multiplication by $\lambda$ in $\mathbb R$, the map $\gamma(t)=\gamma_X(\lambda t)=\gamma_X\circ m_\lambda(t)$ is again a homomorphism from $\mathbb R$ into $G$ and since
\[
\gamma_{*0}\Bigl(\tfrac{d}{dt}\Big|_{t=0}\Bigr)=\lambda\,(\gamma_X)_{*0}\Bigl(\tfrac{d}{dt}\Big|_{t=0}\Bigr)=\lambda X_e,
\]
it follows that $\gamma=\gamma_{\lambda X}$, that is
\[
\gamma_X(\lambda t)=\gamma_{\lambda X}(t),\qquad t,\lambda\in\mathbb R,\ X\in\mathfrak g. \tag{2.6}
\]
We define
\[
\exp X=\gamma_X(1),\qquad X\in\mathfrak g.
\]
The map $\exp\colon\mathfrak g\to G$ is called the exponential mapping. From (2.6) it follows that
\[
\gamma_X(t)=\exp(tX),\qquad t\in\mathbb R,\ X\in\mathfrak g,\qquad \exp 0=e.
\]
It is easy to check that if $X\in\mathfrak g$ and $x\in G$ are fixed, the map $t\mapsto x\exp(tX)$ defines the integral curve relative to $X$ passing through $x$, namely the smooth curve whose differential carries the tangent vector in $\mathbb R$ to the value of the vector field at the image point. Hence, for every $C^\infty$-function $\varphi$ in a neighborhood of $x$ we have
\[
X_x(\varphi)=\frac{d}{dt}\Big|_{t=0}\varphi(x\exp tX). \tag{2.7}
\]
Immediate consequences of the fact that $\gamma_X$ is a homomorphism are the formulae
\[
\exp(t+s)X=\exp tX\,\exp sX,\qquad \exp(-tX)=(\exp tX)^{-1}. \tag{2.8}
\]
The following formulae require some harder work:
\[
\exp tX\,\exp tY=\exp\bigl\{t(X+Y)+\tfrac12 t^2[X,Y]+O(t^3)\bigr\} \tag{2.9}
\]
\[
\exp(-tX)\,\exp(-tY)\,\exp tX\,\exp tY=\exp\bigl\{t^2[X,Y]+O(t^3)\bigr\} \tag{2.10}
\]
\[
\exp tX\,\exp tY\,\exp(-tX)=\exp\bigl\{tY+t^2[X,Y]+O(t^3)\bigr\}.
\]
Formula (2.9) is the well-known Baker–Campbell–Hausdorff formula. The exponential map is in general neither injective nor surjective, but it is locally very nice:

Proposition 2.19 ([31] Theorem (3.31)). The exponential map is $C^\infty$ and its differential at zero is the identity map of $\mathfrak g$. Consequently, $\exp$ establishes a diffeomorphism of a neighborhood of $0\in\mathfrak g$ onto a neighborhood of $e\in G$.

One of the most fundamental properties of the exponential mapping is that it always intertwines the homomorphisms of Lie groups with the corresponding homomorphisms of the Lie algebras:

Theorem 2.20 ([31] Theorem (3.32)). Let $F\colon G\to H$ be a Lie group homomorphism with differential $dF\colon\mathfrak g\to\mathfrak h$. Then, for every $X\in\mathfrak g$,
\[
F(\exp X)=\exp(dF\,X).
\]
By means of the previous result it is easy to show the next one, which is of practical use because it allows one to calculate the Lie algebra of a subgroup of $G$ as a subalgebra of the Lie algebra of $G$.

Proposition 2.21 ([31] Proposition (3.33)). Let $H$ be a Lie subgroup of the Lie group $G$ and let $\mathfrak h\subset\mathfrak g$ be the corresponding Lie algebras. Fix $X\in\mathfrak g$. If $X\in\mathfrak h$, then $\exp tX\in H$ for every $t\in\mathbb R$. Conversely, if $\exp tX\in H$ for every $t\in\mathbb R$, then $X\in\mathfrak h$.

Refining the above result one obtains the next, which is useful when dealing with the classical matrix groups and algebras.

Proposition 2.22 ([31] Theorem (3.34)). Let $A$ be an abstract subgroup of the Lie group $G$ and let $\mathfrak a$ be a vector subspace of the Lie algebra $\mathfrak g$ of $G$. Let $U$ be a neighborhood of $0\in\mathfrak g$ diffeomorphic via $\exp$ to the neighborhood $V$ of $e\in G$. Suppose that
\[
\exp(U\cap\mathfrak a)=A\cap V.
\]
Then, endowed with the relative topology, $A$ is a Lie subgroup of $G$ and $\mathfrak a$ is its Lie algebra.

As mentioned several times, the set of all $n\times n$ real matrices endowed with the bracket $[A,B]=AB-BA$ is a Lie algebra and, as any finite dimensional vector space, a smooth manifold with coordinates given by any choice of a basis. As shown in Proposition 2.14, the Lie algebra of $GL(d,\mathbb R)$ is canonically identified with $\mathfrak{gl}(d,\mathbb R)$. The ordinary matrix exponentiation gives rise to a unique homomorphism from $\mathbb R$ into $GL(d,\mathbb R)$,
\[
t\mapsto e^{tA}=\sum_{k=0}^{+\infty}\frac{(tA)^k}{k!}, \tag{2.11}
\]
that satisfies the properties (2.5) which define the exponential mapping, so that $\exp A=e^A$. A classic application of Jordan normal forms yields
\[
\det e^A=e^{\operatorname{tr}A}, \tag{2.12}
\]
so that the exponential of any square matrix $A$ is an invertible matrix and $\exp$ maps indeed $\mathfrak{gl}(d,\mathbb R)$ to $GL(d,\mathbb R)$. We observe en passant that (2.12) implies that the exponential maps $\mathfrak{sl}(d,\mathbb R)$ to $SL(d,\mathbb R)$, that is, the Lie algebra of traceless matrices to the Lie group of matrices with determinant equal to one. For later use, we remark that (2.11) entails
\[
{}^t(e^A)=e^{{}^tA} \tag{2.13}
\]
and that for any invertible $B$
\[
Be^AB^{-1}=e^{BAB^{-1}}. \tag{2.14}
\]
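As a quick numerical illustration, the identities (2.12)–(2.14) can be checked on randomly chosen matrices; the following minimal sketch assumes that NumPy and SciPy are available and uses scipy.linalg.expm for the series (2.11).

# Illustrative check of (2.12)-(2.14); a sketch, not part of the exposition above.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))        # a generic B is invertible

# (2.12): det(e^A) = e^{tr A}
assert np.isclose(np.linalg.det(expm(A)), np.exp(np.trace(A)))

# (2.13): (e^A)^T = e^{A^T}
assert np.allclose(expm(A).T, expm(A.T))

# (2.14): B e^A B^{-1} = e^{B A B^{-1}}
Binv = np.linalg.inv(B)
assert np.allclose(B @ expm(A) @ Binv, expm(B @ A @ Binv))
print("identities (2.12)-(2.14) verified numerically")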
Example 2.23. We show that the Lie algebra of $Sp(d,\mathbb R)$ is
\[
\mathfrak{sp}(d,\mathbb R)=\bigl\{X\in\mathfrak{gl}(2d,\mathbb R):{}^tXJ+JX=0\bigr\}, \tag{2.15}
\]
where $J$ is the canonical skew-symmetric matrix defined in (2.2). If $X\in\mathfrak{sp}(d,\mathbb R)$, the relation ${}^tXJ=-JX$, together with $J^2=-I$ and (2.13) and (2.14), implies
\[
{}^t(e^X)Je^X=e^{{}^tX}Je^X=J\bigl(J^{-1}e^{{}^tX}J\bigr)e^X=Je^{J^{-1}\,{}^tX\,J}e^X=Je^{-X}e^X=J.
\]
Conversely, we show that if $Y=e^X\in V\cap Sp(d,\mathbb R)$, where $V$ is a suitable neighborhood of the identity, then ${}^tXJ+JX=0$. This can be done by observing that
\[
J={}^t(e^X)Je^X=e^{{}^tX}e^{JXJ^{-1}}J
\]
implies $e^{{}^tX}e^{JXJ^{-1}}=I$ and hence, by (2.8),
\[
e^{{}^tX}=e^{-JXJ^{-1}}.
\]
Take now a neighborhood $U$ of $0\in\mathfrak{gl}(2d,\mathbb R)$ diffeomorphic under $\exp$ to the neighborhood $V$ of $I\in GL(2d,\mathbb R)$. Assuming that $U$ is small enough, we may arrange that both ${}^tX$ and $-JXJ^{-1}$ belong to $U$, where the exponential is a diffeomorphism. Hence ${}^tX=-JXJ^{-1}$, which amounts to ${}^tXJ+JX=0$. Applying now Proposition 2.22 we obtain the claimed description of $\mathfrak{sp}(d,\mathbb R)$, which can be made even more explicit. Indeed, if
\[
X=\begin{bmatrix}A&B\\ C&D\end{bmatrix}\in\mathfrak{sp}(d,\mathbb R),
\]
then
\[
0={}^tXJ+JX=\begin{bmatrix}C-{}^tC&{}^tA+D\\ -({}^tD+A)&{}^tB-B\end{bmatrix}
\]
implies that in fact $X\in\mathfrak{sp}(d,\mathbb R)$ if and only if
\[
X=\begin{bmatrix}A&B\\ C&-{}^tA\end{bmatrix},\qquad B,C\in\mathrm{Sym}(d,\mathbb R). \tag{2.16}
\]
Example 2.24. Arguing as in the previous example, one can show that the Lie algebra of the special orthogonal group, namely the compact Lie group
\[
SO(d)=\{g\in GL(d,\mathbb R):{}^tg\,g=I,\ \det g=1\},
\]
is the Lie algebra of skew-symmetric matrices
\[
\mathfrak{so}(d)=\{X\in\mathfrak{gl}(d,\mathbb R):{}^tX+X=0\}.
\]
Example 2.25. Another basic fact: the Lie algebra of the unitary group $U(n)=\{g\in GL(n,\mathbb C):{}^t\bar g\,g=I\}$ is the Lie algebra of skew-hermitian matrices
\[
\mathfrak u(n)=\{X\in\mathfrak{gl}(n,\mathbb C):{}^t\bar X+X=0\}. \tag{2.17}
\]
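The description (2.15)–(2.16) can also be explored numerically. The sketch below assumes NumPy and SciPy and takes $J$ in the block form $\begin{bmatrix}0&I\\-I&0\end{bmatrix}$ (one common normalization of the matrix in (2.2), taken here as an assumption); it builds a random element of $\mathfrak{sp}(d,\mathbb R)$ of the shape (2.16) and checks that its exponential is a symplectic matrix.

# Illustrative check that exp maps sp(d,R) into Sp(d,R); a sketch under the
# stated choice of J, not a statement from the text.
import numpy as np
from scipy.linalg import expm

d = 2
J = np.block([[np.zeros((d, d)), np.eye(d)],
              [-np.eye(d), np.zeros((d, d))]])

rng = np.random.default_rng(1)
A = rng.standard_normal((d, d))
B = rng.standard_normal((d, d)); B = (B + B.T) / 2   # symmetric
C = rng.standard_normal((d, d)); C = (C + C.T) / 2   # symmetric
X = np.block([[A, B], [C, -A.T]])                    # the shape (2.16)

assert np.allclose(X.T @ J + J @ X, 0)               # X satisfies (2.15)
g = expm(X)
assert np.allclose(g.T @ J @ g, J)                   # exp X is symplectic
print("exp maps this element of sp(d,R) into Sp(d,R)")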
2.1.2.5 Closed Subgroups
As the examples at the end of the previous section indicate, many of the most interesting matrix Lie groups arise by imposing extra equations on GL.d; R/. Since 2 these equations are often of the form F.gij / D 0 with FW Rn ! R a polynomial or a rational function of the entries, hence continuous on some open set, their solutions cut out subgroups that are topologically closed. The closed subgroups are very special. The first major result is the following Theorem 2.26 (Cartan ([31] Theorem (3.42))). Let G be a Lie group and let H be a closed subgroup of G. Then H has a unique smooth (in fact analytic) structure that makes it a Lie subgroup of G. Theorem 2.27 ([31] Theorem (3.43)). Let G and H be two Lie groups with Lie algebra g and h, respectively, and let FW G ! H be a Lie group homomorphism. Assume that G is connected, then: i) ker.F/ is a closed Lie subgroup of G with Lie algebra ker.dF/; ii) F.G/ is a Lie subgroup of H with Lie algebra dF.g/ 2 h. An important example of closed Lie subgroup is the center ZG of G, namely ˚ ZG D g 2 G W gxg1 D x;
for all x 2 G :
The fact that ZG is indeed closed can either be shown directly (with sequences) or by using Corollary 2.30 below, which exhibits the center as the kernel of a very important homomorphism of G into the automorphisms of its Lie algebra, the adjoint representation. Observe that, besides being closed, the center ZG is always normal in G, just as any other kernel of any smooth homomorphism. Suppose now that H is any closed and normal subgroup of G, not necessarily its center. Then G=H has a natural group structure, and it is natural to ask whether it might be a Lie group. This is indeed the case: Theorem 2.28 ([31] Theorem (3.64)). If H is a closed and normal subgroup of G, then there exists a unique manifold structure on the quotient group G=H that turns it into a Lie group. Moreover, the natural projection pW G ! G=H is a smooth surjection. The projection map pW G ! G=H described in the above proposition is thus a smooth Lie group homomorphism whose kernel is exactly H. Therefore, the closed normal subgroups always do appear as kernels of smooth homomorphisms. Many examples of quotient Lie groups can be given. In Section 2.2 below we are mostly concerned with the Heisenberg group Hd , whose center is isomorphic to R. The quotient Hd =Z is (isomorphic to) the Abelian group R2d .
2.1.2.6 Adjoint Representation
The most important finite dimensional representation of a Lie group G is certainly the adjoint representation, which acts on its Lie algebra g. Given any real Lie algebra g, we shall denote by gl.g/ the Lie algebra of all endomorphisms of g with the commutator as bracket and by GL.g/ the group of all nonsingular endomorphisms of g as a vector space. Hence gl.g/ is the Lie algebra of GL.g/. The map X 7! ad X is a Lie algebra homomorphism whose image is a Lie subalgebra of gl.g/ denoted ad g. Let Int.g/ be the connected Lie subgroup of GL.g/ whose Lie algebra is ad g. The group Int.g/ is called the adjoint group of g. Schematically: gl.g/ [ ad g
! GL.g/ [ ! Int.g/
where the arrows stand for the correspondence group-algebra. Next, let Aut.g/ be the Lie subgroup of GL.g/ consisting of all the automorphisms of g (the invertible Lie algebra homomorphisms of g onto itself) and denote by @g its Lie algebra. We know that @g consists of all the endomorphisms D 2 gl.g/ such that exp tD 2 Aut.g/ for every t 2 R. From exp.tD/ŒX; Y D Œexp.tD/X; exp.tD/Y, taking the derivative at t D 0, it follows that DŒX; Y D ŒDX; Y C ŒX; DY: Any such operator is called a derivation of g. Conversely, if D is a derivation, then one shows by induction that Dk ŒX; Y D
X kŠ ŒDi X; Dj Y; iŠjŠ iCjDk
so that exp.tD/ŒX; Y D Œexp.tD/X; exp.tD/Y. It follows that @g consists of all the derivations of g. Finally, since ad X is a derivation of g for every X 2 g, we may refine the preceding diagram and get: gl.g/ [ @g [ ad g
! GL.g/ [ ! Aut.g/ [ ! Int.g/
It is easy to see that Int.g/ is a normal subgroup of Aut.g/ and ad g is an ideal in @g. Let G be a Lie group and take g 2 G. Denote by ig the inner conjugation in G, namely x 7! gxg1 and put Ad g WD dig W g ! g: Since ig is an isomorphism of G it follows that Ad g 2 Aut.g/. The map AdW G ! Aut.g/;
g 7! Ad g
is a homomorphism of Lie groups and is called the adjoint representation of G. Ad is indeed a finite dimensional representation of G on a real vector space. On the classical matrix groups, in particular on GL.d; R/ and hence on its closed subgroups, we have Ad g.X/ D gXg1 : Also, the universal intertwining property of exp given in Theorem 2.20 entails: exp.Ad g.X// D exp.dig .X// D ig .exp X/ D g.exp X/g1 ; which is a general version of (2.14). Theorem 2.29 ([26] Proposition (1.93)). Ad is a smooth map and d Ad D ad. In particular, for any X 2 g Ad.exp X/ D ead X : Corollary 2.30 ([23] Corollary (5.2)). The adjoint representation of G is a smooth surjective homomorphism of G onto Int.g/ whose kernel is the center ZG of G. Hence, G=ZG ' Int.g/. From the previous result and from Theorem 2.26, we see that the center of a connected Lie group is a closed Lie subgroup. It then follows again from Theorem 2.27 that the Lie algebra of the center of G is the center of the Lie algebra, namely ˚ zg D X 2 g W ŒX; Y D 0 for all Y 2 g : We deduce from this that a Lie group is Abelian (i.e., it coincides with its center) if and only if its Lie algebra is such.
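The relation $\operatorname{Ad}(\exp X)=e^{\operatorname{ad}X}$ of Theorem 2.29 can be tested directly for matrix groups, where $\operatorname{Ad}g(X)=gXg^{-1}$ and $\operatorname{ad}X$ is a linear map on matrices. A minimal numerical sketch, assuming NumPy and SciPy:

# Illustrative check of Ad(exp X) = e^{ad X} on gl(3,R); a sketch, with ad X
# realized as a 9x9 matrix acting on column-stacked matrices.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 3))
Y = rng.standard_normal((3, 3))

# Left-hand side: Ad(exp X)Y = (exp X) Y (exp X)^{-1}
lhs = expm(X) @ Y @ expm(-X)

# Right-hand side: e^{ad X}Y, using vec(AB) = (I kron A)vec(B),
# vec(BA) = (A^T kron I)vec(B) with column-major stacking.
adX = np.kron(np.eye(3), X) - np.kron(X.T, np.eye(3))
rhs = (expm(adX) @ Y.flatten(order="F")).reshape(3, 3, order="F")

assert np.allclose(lhs, rhs)
print("Ad(exp X) = e^{ad X} verified on a random sample")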
2.1.2.7 Semidirect Products
Suppose that G and H are two Lie groups and that we are given a group homomorphism W H ! Aut.G/;
h 7! h
such that the map .g; h/ 7! h .g/ is a smooth map of G H into H. Hence, for every h 2 H the map h is an invertible Lie group homomorphism of G onto itself, and hk D h ı k for every h; k 2 H. It is then possible to define the semidirect product of G and H. It is the group denoted G Ì H whose elements are those of G H and where the product is defined by .g1 ; h1 /.g2 ; h2 / D .g1 h1 .g2 /; h1 h2 /: It is immediate to check that this is a group law, and indeed smooth, so that G H is a Lie group. Inverses are given by .g; h/1 D .h1 .g1 /; h1 /: If we identify G and H with the subsets of G Ì H given by f.g; e/ W g 2 Gg and f.e; h/ W h 2 Hg, respectively, then both G and H are closed subgroups and G is a normal subgroup in G Ì H. Example 2.31. The most obvious example of semidirect product is the “ax C b” group. Evidently, H D RC , G D R and a .b/ D ab. One possible generalization in higher dimensions is the Euclidean motion group in Rd , where H D SO.d/ acts in the natural linear fashion on G D Rd . Example 2.32. Another example of semidirect product is the Poincaré group. Here H D SO.1; 3/ D fh 2 SL.4; R/ W t hI1;3 h D I1;3 g; where
\[
I_{1,3}=\begin{bmatrix}1&0&0&0\\ 0&-1&0&0\\ 0&0&-1&0\\ 0&0&0&-1\end{bmatrix}.
\]
As G, we take R4 . Again, h .x/ D hx for h 2 SO.1; 3/ and x 2 R4 . Other useful examples of semidirect products will be discussed in Section 2.2 and also in Section 2.3.
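For readers who like to experiment, the semidirect product law is easy to implement. The sketch below (an illustration assuming NumPy, not a construction taken from the text) encodes the Euclidean motion group of Example 2.31, with $H=SO(2)$ acting linearly on $G=\mathbb R^2$, and checks associativity together with the inverse formula $(g,h)^{-1}=(\varphi_{h^{-1}}(g^{-1}),h^{-1})$.

# A concrete sketch of the semidirect product law on SE(2) = R^2 x| SO(2).
import numpy as np

def rot(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def mult(g1, g2):
    # (x1, h1)(x2, h2) = (x1 + h1 x2, h1 h2), h acting linearly on x
    (x1, h1), (x2, h2) = g1, g2
    return (x1 + h1 @ x2, h1 @ h2)

def inv(g):
    x, h = g
    hinv = h.T                      # the inverse of a rotation is its transpose
    return (-(hinv @ x), hinv)      # (x, h)^{-1} = (-h^{-1} x, h^{-1})

a = (np.array([1.0, 2.0]), rot(0.3))
b = (np.array([-0.5, 1.0]), rot(1.1))
c = (np.array([0.7, 0.0]), rot(-0.4))

lhs, rhs = mult(mult(a, b), c), mult(a, mult(b, c))
assert np.allclose(lhs[0], rhs[0]) and np.allclose(lhs[1], rhs[1])   # associativity

e = (np.zeros(2), np.eye(2))
ai = mult(a, inv(a))
assert np.allclose(ai[0], e[0]) and np.allclose(ai[1], e[1])         # inverses
print("semidirect product law checked on SE(2)")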
Exercise 2.33. This exercise aims at showing what is the sensible notion of semidirect product of Lie algebras. Let $\mathfrak a$ and $\mathfrak b$ denote two Lie algebras and suppose that we are given a Lie algebra homomorphism $\theta\colon\mathfrak a\to\partial\mathfrak b$ into the derivations of $\mathfrak b$. Show that there exists a unique Lie algebra structure on the vector space $\mathfrak g=\mathfrak a\oplus\mathfrak b$ which preserves the Lie algebra structures of both $\mathfrak a$ and $\mathfrak b$, and such that $[A,B]=\theta(A)(B)$ for every $A\in\mathfrak a$ and every $B\in\mathfrak b$. Further, show that $\mathfrak a$ is a subalgebra and $\mathfrak b$ is an ideal in $\mathfrak g$.
2.1.2.8 Haar Measure on Lie Groups and Integration
For Lie groups, left Haar measures are very easy to construct. One takes any positive definite inner product on the tangent space at the identity $T_e(G)$ and carries it around with the differential of left translations, thereby obtaining a Riemannian structure. The corresponding volume form is a Haar measure. Furthermore, in every local coordinate system, it is given by a $C^\infty$ density times the Lebesgue measure. We do not appeal here to these facts, which go beyond our scopes, and simply quote a handy result.

Proposition 2.34 ([17], Proposition (2.21)). If $G$ is a Lie group whose underlying manifold is an open set in $\mathbb R^d$ and if the left translations are given by affine maps, that is $xy=A(x)y+b(x)$, where $A(x)$ is a linear transformation and $b(x)\in\mathbb R^d$, then, denoting by $dx$ the Lebesgue measure of $\mathbb R^d$, $|\det A(x)|^{-1}\,dx$ is a Haar measure on $G$.

Example 2.35. For example, in the group "$ax+b$" the left translations are
\[
l_{(a,b)}(\alpha,\beta)=\begin{bmatrix}a&0\\ 0&a\end{bmatrix}\begin{bmatrix}\alpha\\ \beta\end{bmatrix}+\begin{bmatrix}0\\ b\end{bmatrix},
\]
so that by Proposition 2.34 we have
\[
|\det A(a,b)|^{-1}\,da\,db=\frac{da\,db}{a^2}.
\]
As for the modular function, in any Lie group we have $\Delta(g)=|\det\operatorname{Ad}(g)|^{-1}$. It is possible to realize the "$ax+b$" group as a matrix group. The reader may check that the correspondence
\[
(a,b)\ \longleftrightarrow\ \begin{bmatrix}a&b\\ 0&1\end{bmatrix}
\]
establishes an isomorphism of "$ax+b$" with a closed Lie subgroup of $GL(2,\mathbb R)$, whose Lie algebra is easily seen to consist of the matrices
\[
\begin{bmatrix}A&B\\ 0&0\end{bmatrix}.
\]
The adjoint representation takes the form
\[
\begin{bmatrix}a&b\\ 0&1\end{bmatrix}\begin{bmatrix}A&B\\ 0&0\end{bmatrix}\begin{bmatrix}a^{-1}&-ba^{-1}\\ 0&1\end{bmatrix}=\begin{bmatrix}A&-bA+aB\\ 0&0\end{bmatrix}
\]
and it is thus the linear map
\[
\begin{bmatrix}A\\ B\end{bmatrix}\mapsto\begin{bmatrix}1&0\\ -b&a\end{bmatrix}\begin{bmatrix}A\\ B\end{bmatrix}.
\]
It follows that
\[
\Delta(a,b)=\Bigl|\det\begin{bmatrix}1&0\\ -b&a\end{bmatrix}^{-1}\Bigr|=a^{-1}. \tag{2.18}
\]
One can check directly with a change of variables that indeed
\[
\int_{\mathbb R^+\times\mathbb R}f\bigl((\alpha,\beta)(a,b)\bigr)\,\frac{da\,db}{a^2}=\int_{\mathbb R^+\times\mathbb R}f(a,b)\,\frac{da\,db}{a^2}.
\]
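This invariance can also be confirmed by brute-force quadrature. The following minimal sketch assumes NumPy; the bump $f$ and the truncated grid are ad hoc choices made only for the illustration.

# Numerical sanity check of left invariance of da db / a^2 on the "ax+b" group.
import numpy as np

alpha, beta = 1.5, 0.7                        # the fixed element (alpha, beta)
a = np.linspace(0.01, 12.0, 1500)
b = np.linspace(-12.0, 12.0, 1500)
A, B = np.meshgrid(a, b, indexing="ij")

f = lambda x, y: np.exp(-(np.log(x) ** 2 + y ** 2))   # decays fast enough near a = 0

def haar_integral(values):
    # iterated trapezoidal rule against da db / a^2
    return np.trapz(np.trapz(values / A ** 2, b, axis=1), a)

I1 = haar_integral(f(A, B))
I2 = haar_integral(f(alpha * A, alpha * B + beta))     # integrand f((alpha,beta)(a,b))
print(I1, I2)                                          # equal up to quadrature error
assert abs(I1 - I2) < 1e-3 * abs(I1)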
2.1.2.9 Convolutions
The convolution of integrable functions defined on $\mathbb R^d$ involves translations. It is thus suitable for interpretation on any group $G$ on which a reasonable notion of measure is given, for example Lie groups. Indeed, if $G$ is a Lie group with left Haar measure $dx$ and if $f,g\in L^1(G)$, then the convolution of $f$ and $g$ is the function defined by
\[
f*g(x)=\int_G f(y)g(y^{-1}x)\,dy. \tag{2.19}
\]
By the left invariance of $dx$, we have that
\[
\int_G\int_G|f(y)g(y^{-1}x)|\,dx\,dy=\int_G\int_G|f(y)g(x)|\,dx\,dy=\|f\|_1\|g\|_1,
\]
so that, by Fubini's theorem, $f*g\in L^1(G)$ and $\|f*g\|_1\le\|f\|_1\|g\|_1$. The convolution of $L^1$ functions can be expressed in several different ways:
\[
\begin{aligned}
f*g(x)&=\int_G f(y)g(y^{-1}x)\,dy\\
&=\int_G f(xy)g(y^{-1})\,dy\\
&=\int_G f(y^{-1})g(yx)\,\Delta(y^{-1})\,dy\\
&=\int_G f(xy^{-1})g(y)\,\Delta(y^{-1})\,dy.
\end{aligned} \tag{2.20}
\]
It must be noticed that, unless the group $G$ is itself commutative, in general the convolution is not commutative.

Exercise 2.36. Show that on the affine group "$ax+b$" the convolution is
\[
f*g(\alpha,\beta)=\int_{\mathbb R}\int_{\mathbb R^+}f(a,b)\,g\Bigl(\frac{\alpha}{a},\frac{\beta-b}{a}\Bigr)\,\frac{da\,db}{a^2}
\]
and verify that it is not commutative.

Exercise 2.37. Show that on any Lie group $G$ and for any $f,g\in L^1(G)$ one has
\[
L(x)(f*g)=(L(x)f)*g,\qquad R(x)(f*g)=f*(R(x)g),
\]
where $L$ and $R$ are the left and right translations of functions, respectively. They are defined by means of the left and right translations $l_x$ and $r_x$ on the group by the formulae
\[
L(x)f(y)=f(l_x^{-1}y)=f(x^{-1}y),\qquad R(x)f(y)=f(r_xy)=f(yx).
\]
The mapping properties of (left or right) convolution operators have been studied in much detail. We collect here some results that will be useful. For a detailed proof of many statements, see [5], whereas [17] contains the basic facts. If $f,g\in L^1_{\mathrm{loc}}(G)$, if the convolution $f*g$ defined by (2.19) exists, and if $|f|*|g|$ is in $L^1_{\mathrm{loc}}(G)$, then we say that $f$ and $g$ are convolvable. Below and in the remaining part of this chapter we write
\[
\tilde f(x)=f(x^{-1}).
\]
Theorem 2.38. Let $f,g$ be two measurable functions on the Lie group $G$.

i) If $f\in L^1(G)$ and $g\in L^p(G)$ with $1\le p\le\infty$, then the integrals in (2.20) converge for almost every $x\in G$, we have $f*g\in L^p(G)$ and
\[
\|f*g\|_p\le\|f\|_1\|g\|_p.
\]
Furthermore, if $p=\infty$, then $f*g$ is also continuous.

ii) If $f\in L^p(G)$, $g\in L^q(G)$ and $\tilde g\in L^q(G)$, where $1<p<+\infty$ and $1<q<+\infty$ satisfy $\frac1p+\frac1q=1+\frac1r$ with $r>1$, then $f$ and $g$ are convolvable and $f*g$ belongs to $L^r(G)$. Furthermore, if $\|\tilde g\|_q=\|g\|_q$, then
\[
\|f*g\|_r\le\|f\|_p\|g\|_q.
\]

iii) If $f\in L^p(G)$, $g\in L^q(G)$ and $\tilde g\in L^q(G)$, where $1<p<+\infty$ and $\frac1p+\frac1q=1$, then $f$ and $g$ are convolvable, $f*g$ belongs to $C_0(G)$ and
\[
\|f*g\|_\infty\le\|f\|_p\|\tilde g\|_q.
\]
2.1.3 Representation Theory Let H1 and H2 be two Hilbert spaces (the corresponding norms and scalar product are simply denoted by kk and h; i). Suppose that AW H1 ! H2 is linear and bounded, that is A 2 B.H1 ; H2 /. Recall that A is an isometry if kAuk D kuk for every u 2 H1 . Since kAuk2 D hAu; Aui D hA Au; ui and kuk2 D hu; ui, the polarization identity implies that A is an isometry if and only if A A D idH1 . Hence, isometries are injective, but they are not necessarily surjective. A bijective isometry is called a unitary map. If A is unitary, such is also A1 and in this case AA D idH2 . In particular if H1 D H2 D H , the set ˚ U .H / D A 2 B.H / W A is unitary forms a group. Evidently, U .H / B.H /, the space of bounded linear operators of H onto itself. Let now G be a Lie1 group. Definition 2.39. A unitary representation of G on the Hilbert space H is a group homomorphism W G ! U .H / continuous in the strong operator topology. This means:
¹ In all that follows, it would suffice to consider a locally compact Hausdorff topological group.
i) .gh/ D .g/.h/ for every g; h 2 G; ii) .g1 / D .g/1 D .g/ for every g 2 G; iii) g 7! .g/u is continuous from G to H , for every u 2 H . Observe that from the equality k.g/u .h/uk D k.h1 g/u uk it follows that it is enough to check 2.39 for g D e, the identity of G. Example 2.40. Let G D R be additive group and H D C. For every s 2 R we define the function s .t/ D eits and we identify the complex number eits with the multiplication operator on C defined by z 7! zeits . Clearly, s W R ! U .C/ is a unitary representation, because
s .x C y/ D s .x/ s .y/
s .x/ D s .x/1 D s .x/ t 7! eits z is continuous for every z 2 C: Example 2.41. Let G be any locally compact group and choose H D L2 .G/. Define LW G ! U .H /;
x 7! L.x/;
L.x/f .y/ D f .x1 y/:
(2.21)
It is easy to check that this is a unitary representation, the so-called left regular representation. Similarly, the right regular representation is defined by RW G ! U .H /;
x 7! R.x/;
R.x/f .y/ D .x/1=2 f .yx/:
The modular function is necessary in order that R.x/ is unitary. Example 2.42. Let G be the “ax C b” group and H D L2 .R/. Define 1 .a; b/f .x/ D p f a
xb ; a
a > 0; b 2 R;
(2.22)
the so-called wavelet representation. Notice that it is just the composition of the two very important and basic unitary maps, the translation and dilation operators Tb f .x/ D f .x b/ 1 x Da f .x/ D p f a a for indeed 1 Tb Da f .x/ D Tb .Da f /.x/ D Da f .x b/ D p f a
xb : a
Observe that
\[
T_bT_{b'}=T_{b+b'},\qquad D_aD_{a'}=D_{aa'}.
\]
It is important to notice that $T_bD_a\ne D_aT_b$. More precisely,
\[
D_aT_bf(x)=\frac{1}{\sqrt a}\,(T_bf)\Bigl(\frac xa\Bigr)=\frac{1}{\sqrt a}\,f\Bigl(\frac xa-b\Bigr)=\frac{1}{\sqrt a}\,f\Bigl(\frac{x-ab}{a}\Bigr)=T_{ab}D_af(x).
\]
In other words $D_aT_b=T_{ab}D_a$. It follows that
\[
(T_\beta D_\alpha)(T_bD_a)=T_\beta(D_\alpha T_b)D_a=T_\beta(T_{\alpha b}D_\alpha)D_a=(T_\beta T_{\alpha b})(D_\alpha D_a)=T_{\beta+\alpha b}D_{\alpha a},
\]
so that $\pi$ is a homomorphism:
\[
\pi(\alpha,\beta)\pi(a,b)=\pi(\alpha a,\beta+\alpha b)=\pi\bigl((\alpha,\beta)(a,b)\bigr).
\]
Finally, it is instructive to check the strong continuity, which is left as an exercise.

Example 2.43. A variant of (2.22) is the analogue that arises by considering the full affine group acting on $L^2(\mathbb R)$:
\[
\pi_{\mathrm{full}}(a,b)f(x)=\frac{1}{\sqrt{|a|}}\,f\Bigl(\frac{x-b}{a}\Bigr),\qquad a\in\mathbb R^*,\ b\in\mathbb R. \tag{2.23}
\]
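The operator identities above are easy to test numerically. The sketch below assumes NumPy; the sample function $f$ and the chosen parameters are arbitrary, and the script merely evaluates both sides of $D_aT_b=T_{ab}D_a$ and of the homomorphism property on a grid.

# Illustrative check of the commutation rule and of the homomorphism property.
import numpy as np

def T(b):
    return lambda f: (lambda x: f(x - b))

def D(a):
    return lambda f: (lambda x: f(x / a) / np.sqrt(a))

def pi(a, b):                         # wavelet representation pi(a,b) = T_b D_a, a > 0
    return lambda f: T(b)(D(a)(f))

f = lambda x: np.exp(-x ** 2) * np.cos(3 * x)
x = np.linspace(-5, 5, 1001)
a, b = 2.0, 0.7
alpha, beta = 0.5, -1.3

# D_a T_b = T_{ab} D_a
assert np.allclose(D(a)(T(b)(f))(x), T(a * b)(D(a)(f))(x))

# pi(alpha,beta) pi(a,b) = pi(alpha*a, beta + alpha*b)
lhs = pi(alpha, beta)(pi(a, b)(f))(x)
rhs = pi(alpha * a, beta + alpha * b)(f)(x)
assert np.allclose(lhs, rhs)
print("pi is a homomorphism of the 'ax+b' group on these samples")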
Definition 2.44. Let M be a closed subspace of the Hilbert space H . We say that M is an invariant subspace for the unitary representation if .g/M M for every g 2 G. We say that is irreducible if H does not contain closed invariant subspaces other than H and f0g. Exercise 2.45. Prove that if M is an invariant subspace for then such is also M ? and that D M ˚ M ? , where M is the restriction of to M and similarly for M ? . This means that for every g 2 G the linear map .g/ is the direct sum of the linear maps M and M ? , each acting on the appropriate space. Definition 2.46. Let be a unitary representation of G on H and take ; 2 H . The function G ! C defined by g 7! h ; .g/ i is called the coefficient of relative to . ; /. If D , it is called a diagonal coefficient. Notice that all coefficients are continuous functions and jh ; .g/ ij k kk k.
Proposition 2.47. The following two conditions for a representation are equivalent: i) is irreducible ii) if and are non zero vectors in H , then the coefficient h ; .g/ i is nonzero as a continuous map. Proof. Suppose that is irreducible and assume by contradiction that we can find two nonzero vectors ; 2 H for which h ; .g/ i D 0. The space M D clf.g/ W g 2 Gg is a closed invariant subspace, it is not the zero space because 2 M and it cannot be H because 2 M ? . This contradicts the hypothesis that is irreducible. Conversely, if ii) holds, take any closed invariant subspace M ¤ f0g and take a nonzero vector 2 M , so that .g/ 2 M for every g 2 G. If M ¤ H , then M ? ¤ f0g and therefore there exists a nonzero 2 M ? . But this entails that h ; .g/ i D 0 for every g 2 G, contrary to assumption. Hence M D H and is irreducible. t u Example 2.48. Using the previous proposition, we show that the wavelet representation (2.22) of the “ax C b” group is not irreducible, whereas the wavelet representation (2.23) of the full affine group is. The calculations that follow are very basic and important. We shall use the Fourier transform F , defined on L1 .Rd / \ L2 .Rd / by Z fO . / D F f . / D
Rd
f .x/e2ix dx :
(2.24)
A straightforward computation yields p 2ib
ae fO .a /; p F .full .a; b/f /. / D jaje2ib fO .a /; F ..a; b/f /. / D
a > 0; b 2 R
(2.25)
a 2 R ; b 2 R:
We start with and show that it is not irreducible. To this end, take two nonzero f ; g 2 L2 .R/. Then, by Plancherel Z
jh.a; b/f ; gij2 G
da db D a2
Z
da db a2 G ˇ ˇ2 Z Z ˇ p 2ib
ˇ da db ˇ D fO .a /Og. / d ˇˇ ˇ O ae a2 G R Z ˇ ˇ 1 ˇ.F !a /.b/ˇ2 db da ; D a G jhF ..a; b/f /; F gij2
where !a . / D fO .a /Og. /. Hence, again by Plancherel Z
da db D jh.a; b/f ; gij a2 G 2
D
Z
j!a . /j2 d
G
Z Z O R
C1 0
da ; a
2 da O jOg. /j2 d : jf .a /j a
(2.26)
Define now the following (Hardy) spaces: HC .R/ D ff 2 L2 .R/ W fO . / D 0 if < 0g H .R/ D ff 2 L2 .R/ W fO . / D 0 if > 0g:
(2.27)
Now, if we take f 2 HC .R/ and g 2 H .R/, then the support of gO is contained in the negative reals, so that we may suppose that, in the inner integral in (2.26), a < 0 for every a > 0. Hence a is outside the support of fO for every a > 0. It follows that the coefficient h.a; b/f ; gi vanishes and that is not irreducible. It is actually not hard to show that the Hardy spaces are both closed subspaces of L2 .R/ and, by (2.25), that they are both invariant under . This provides a direct alternative way to see that is not irreducible. The above computations, however, reveal a lot more than the simple fact that is not irreducible. First of all, if we restrict to HC .R/, that is, if both f ; g 2 HC .R/, then since their Fourier transforms are supported in the positive reals, for any fixed > 0 we may make the change of variable a 7! a= in the inner integral in (2.26) and obtain Z
jh.a; b/f ; gij2 G
da db D a2
Z
C1 0
jfO .a/j2
da a
Z R
jOg. /j2 d :
(2.28)
This proves that for f ; g 2 HC .R/ both nonzero we have h.a; b/f ; gi ¤ 0 as a continuous function, because neither fO nor gO can identically vanish. Hence the restriction of to HC .R/ is irreducible. The same holds true for H .R/. We leave it as an exercise to show that L2 .R/ D HC .R/ ˚ H .R/ and that D C ˚ . We observe en passant that if f 2 HC .R/ is such that Z
C1 0
jfO .a/j2
da D 1; a
(2.29)
then, by Plancherel and (2.28) we have kh.a; b/f ; gik D kgk and similarly for H .R/. Equation (2.29) is called a Calderón equation and a function f 2 HC .R/ that satisfies it is called a wavelet.
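As an illustration of (2.29), one can normalize an explicit profile supported on the positive half-line so that the Calderón integral equals one. In the sketch below (assuming NumPy and SciPy) the profile $\hat\psi(\xi)=c\,\xi e^{-\xi}$ for $\xi>0$ is an ad hoc choice made only for the example, not a wavelet singled out by the text.

# Normalizing a Hardy-space profile so that the Calderon integral (2.29) equals 1.
import numpy as np
from scipy.integrate import quad

psi_hat = lambda xi, c=1.0: c * xi * np.exp(-xi)            # supported on xi > 0

calderon, _ = quad(lambda xi: psi_hat(xi) ** 2 / xi, 0, np.inf)
print("Calderon integral before normalization:", calderon)  # equals 1/4 analytically

c = 1.0 / np.sqrt(calderon)                                  # rescale so that (2.29) holds
normalized, _ = quad(lambda xi: psi_hat(xi, c) ** 2 / xi, 0, np.inf)
assert np.isclose(normalized, 1.0)
print("after rescaling the Calderon integral equals", normalized)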
Let us now consider the full affine group. A calculation analogous to the previous one yields Z
jhfull .a; b/f ; gij2 Gfull
da db D a2
Z Z R
R
jfO .a /j2
da jOg. /j2 d
jaj
This time, for any nonzero , as a ranges in R the numbers a cover R and the change of variable a 7! a= in the inner integral gives Z Gfull
jhfull .a; b/f ; gij2
da db D a2
Z R
jfO .a/j2
da jaj
Z R
jOg. /j2 d ;
(2.30)
which cannot be zero if both f and g are not zero. This proves that full is an irreducible unitary representation of Gfull . The Calderón equation for the full affine group is thus Z R
jfO .a/j2
da D 1: jaj
(2.31)
Definition 2.49. Let be a representation of G on H . A vector u 2 H is called a cyclic vector for the representation if the closed linear span Mu of f.x/u W x 2 Gg coincides with H . Clearly, in general, Mu is a closed -invariant subspace of H . The representation is called cyclic if it has a cyclic vector. Definition 2.50. Let i W G ! U .Hi /, i D 1; 2 be two unitary representations of G. They are called unitarily equivalent if there exists a unitary operator UW H1 ! H2 such that 2 .g/ ı U D U ı 1 .g/;
for every g 2 G:
A bounded (not necessarily unitary) operator satisfying the above equation is called an intertwining operator between 1 and 2 . The set of all intertwining operators between 1 and 2 will be denoted I .1 ; 2 /. If 1 D 2 D , we write I .; / D I ./. Exercise 2.51. Let L be the left regular representation of R on L2 .R/, as defined in (2.21), namely L.x/f .y/ D f .y x/, and let R the right regular representation of R on L2 .R/, namely R.x/f .y/ D f .y C x/. Exhibit a unitary operator on L2 .R/ that intertwines R and L. Exercise 2.52. Prove that if 1 and 2 are equivalent, they have the same coefficients. Conversely, assume that 1 and 2 are irreducible and suppose that they have the same nonzero diagonal coefficients. Show that 1 and 2 are equivalent. [Hint: take 1 e 2 such that h 1 ; 1 .g/ 1 i D h P 2 ; 2 .g/ 2 i 6D 0 for all g 2 G k and define U on the elements of the form D jD1 ˛j 1 .xj / 1 via the formula Pk U D jD1 ˛j 2 .xj / 2 .]
Exercise 2.53. Let M H be a closed subspace and denote by P the orthogonal projection onto M . Prove that M is -invariant if and only if P 2 I ./. The next result is of crucial importance in representation theory. For a proof, see, for example, [16]. Lemma 2.54 (Schur’s lemma). i) A unitary representation of G is irreducible if and only if I ./ contains only scalar multiples of the identity. ii) Let 1 e 2 be two unitary irreducible representations of G. If they are equivalent, then I .1 ; 2 / has dimension one, otherwise I .1 ; 2 / D f0g. Corollary 2.55. Every irreducible representation of an Abelian group is one dimensional. Proof. Suppose that G is Abelian and take a representation of G. Then all the operators .x/ commute and hence are in I ./. If is irreducible, then .x/ is a constant multiple of the identity and every one-dimensional subspace of H is invariant. Therefore H must be one dimensional. t u Suppose that G is a Lie group with a unitary representation and that N is a closed and normal subgroup of G. Then, as we know from Theorem 2.28, the quotient G=N is a Lie group. If the kernel of contains N, that is, if .n/ is the identity operator for every n 2 N, then it is possible to project to a representation Q of the quotient G=N. Indeed, one puts .gN/ Q D .g/ and obtains a well-defined unitary representation of the Lie group G=N that acts on the same Hilbert space on which was defined. It is easy to see that if is irreducible, then such is . Q
2.1.4 Reproducing Systems and Square Integrability The general issue of square integrability has been the subject matter of many studies. The reader is referred to the books [1, 19] for further reading. We are interested in the properties of the coefficients of a given unitary representation of G. More precisely, we fix a vector 2 H and consider the socalled voice transform associated with it, namely V W H ! L1 .G/ \ C.G/ defined by V v.x/ D hv; .x/ i;
v2H:
Thus, for fixed , the voice transform maps elements in the Hilbert space H to functions on G that are bounded and continuous, as established formally in Proposition 2.56 below. For reasons that will become clear in what follows, the function K .x/ D V
.x/ D h ; .x/ i
is of particular relevance, and is called the kernel of the voice transform. By definition, it is the diagonal coefficient corresponding to of the representation. Proposition 2.56. Let be a unitary representation of the Lie group G on the Hilbert space H and let 2 H be a fixed vector. Then: i) the transforms V v are bounded continuous functions on G, for all v 2 H ; ii) the voice transform V satisfies V ı .x/ D L.x/ ı V for every x 2 G; iii) is a cyclic vector for if and only if V is an injective map of H into L1 .G/ \ C.G/; iv) the kernel K satisfies K D K .
z
Proof. i) The continuity of x 7! hv; .x/ i follows from the continuity of the representation (strong continuity implies weak continuity). Boundedness follows from jV .x/j D jhv; .x/ ij kvk k k: ii) This is just a direct computation that uses the fact that is unitary, hence .y/ D .y1 /. Indeed: V ..y/v/ .x/ D h.y/v; .x/ i
D hv; .y1 x/ i D V v.y1 x/ D L.y/V v .x/:
iii) Take v 2 H . Then V v./ D 0 ” hv; ./ i D 0 ” v 2 M ? ; where the first two equalities refer to the function on G which is identically zero. Now, is cyclic if and only if M ? D f0g and this is equivalent to the fact that the only zero transform V v is when v D 0, which is the injectivity of V . iv) This is immediate, since
z
K .x/ D h ; .x/ i D h.x/ ; i D h ; .x1 / i D K .x1 / D K .x/: t u
Definition 2.57. Let be a unitary representation of the Lie group G on the Hilbert space H . If there exists a vector 2 H , called admissible, for which the corresponding voice transform takes values in L2 .G/ and is a nontrivial multiple of an isometry, that is, if there exists C > 0 such that V W H ! L2 .G/;
kV vk D C kvk
for every v 2 H , then we say that the system .G; ; H ; / is reproducing, or, for short, that is an admissible vector for . Notice that if is anpadmissible vector, then such is every multiple of . By replacing with = C , the voice transform becomes an isometry. In this section we shall always assume this normalization. By the polarization identity, the isometry property kV vk D kvk is equivalent to hV v; V wi D hv; wi;
v; w 2 H ;
(2.32)
where the first inner product is in L2 .G/ and the second in H . We observe en passant that if is an admissible vector for , then it is a cyclic vector for . This is because if V is an isometry, then it is injective on H and iii) of Proposition 2.56 applies. In the literature, the notion of reproducing system is primarily studied when is irreducible. If this is the case, and if admits an admissible vector, then one says that is square integrable. Since in many important examples in analysis one has nonirreducible representations, we allow for this situation to happen. Many relevant properties concerning reproducing systems are expressed efficiently with the notion of weak integral. We do not develop this theory in full here, but simply record what is needed to us. Suppose that W G ! H is a continuous map and suppose further that for any v 2 H the integral Z h .x/; vi dx G
is absolutely convergent for every v 2R H . Then, as a consequence of the closed graph theorem, the mapping v 7! G h .x/; vi dx defines a continuous linear functional on H . We collect these two properties by saying that is scalarly continuously integrable. Then by the Riesz representation theorem there exists a unique element in H , denoted Z .x/ dx G
and called the weak integral of , for which Z
Z .x/ dx; vi D
h G
h .x/; vi dx; G
v2H:
Proposition 2.58. Suppose that $(G,\pi,\mathcal H,\psi)$ is a reproducing system. Then the reproducing formula
\[
v=\int_G\langle v,\pi(x)\psi\rangle\,\pi(x)\psi\,dx \tag{2.33}
\]
holds for $v\in\mathcal H$, where the right-hand side is interpreted as a weak integral. The adjoint of the voice transform is given as a weak integral by the formula
\[
V_\psi^*F=\int_G F(x)\,\pi(x)\psi\,dx,\qquad F\in L^2(G), \tag{2.34}
\]
and V V D idH . Proof. Since the voice transform V maps H into L2 .G/ by assumption, and since it satisfies the isometric property (2.32), for every w 2 H we have Z Z hv; .x/ ih.x/ ; wi dx D V v.x/V w.x/ dx (2.35) G
G
D hV v; V wi D hv; wi: This shows that the continuous mapping v W G ! H defined for fixed v 2 H by v .x/ D hv; .x/ i.x/
(2.36)
is scalarly continuously integrable, because w 7! hv; wi is well defined and continuous. Hence v is weakly integrable, and the weak integral of v must be equal to v because (2.35) entails Z v .x/ dx; wi D hv; wi
h G
for every w 2 H . This establishes (2.33). As for (2.34), we notice that for any F 2 L2 .G/ the continuous mapping ˚F W G ! H defined by ˚F .x/ D F.x/.x/ is scalarly continuously integrable because Z
Z hF.x/.x/u; wi dx D G
F.x/V w.x/ dx D hF; V wi: G
Formula V V D idH follows from (2.34) applied to F D V v and from (2.33), for VV v D
Z
Z V v.x/.x/
G
hv; .x/ i.x/
dx D
dx D v:
G
t u
Before we proceed further, some comments are in order. The first observation concerns the geometric interpretation of (2.33). The mapping v defined in (2.36) associates with x 2 G the projection of v along .x/ . The reproducing formula (2.33) then expresses the fact that we can recover any element v 2 H by gluing all its projections with an integral, so that in some sense the collection of all the vectors f.x/ W x 2 Gg, called the orbit of under G, consists of sufficiently many “directions.” Secondly, the weak integral (2.34) that defines the adjoint of the voice transform is at times referred to as the Fourier transform of F evaluated at , and is written V F D .F/ : A third comment concerns general properties of isometries. As already mentioned in the beginning of Section 2.1.3, a bounded linear operator AW H1 ! H2 between Hilbert spaces is an isometry if and only if A A D idH1 . Thus, the last statement in the previous theorem is in fact a simple consequence of the fact that V is an isometry. Furthermore, if A is an isometry, then AA is the projection onto the range of A, for AA is self-adjoint and idempotent. Example 2.59. The following is an important example of a reproducing system: the (irreducible) wavelet system that was discussed in Example 2.48. We consider the full affine group “ax C b”, where a is any nonzero real number and b 2 R. The representation full is defined in (2.23) and the representation space is H D L2 .R/. If 2 L2 .R/ satisfies the Calderón equation (2.31), then (2.30) becomes Z
\[
\int_{G_{\mathrm{full}}}\bigl|\langle f,\pi_{\mathrm{full}}(a,b)\psi\rangle\bigr|^2\,\frac{da\,db}{a^2}=\|f\|_2^2,
\]
which shows that the voice transform $V_\psi$, whose explicit form is given by
\[
V_\psi f(a,b)=\frac{1}{\sqrt{|a|}}\int_{\mathbb R}f(x)\,\overline{\psi\Bigl(\frac{x-b}{a}\Bigr)}\,dx,
\]
is indeed an isometry of $L^2(\mathbb R)$ into $L^2(G)$. Thus, the Calderón equation (2.31) selects the admissible vectors for the wavelet representation, namely the wavelets. The reproducing formula (2.33) is often written as
\[
f(x)=\int_{\mathbb R}\int_{\mathbb R}V_\psi f(a,b)\,T_bD_a\psi(x)\,\frac{da\,db}{a^2}.
\]
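A discretized version of the voice transform is straightforward to implement. The sketch below assumes NumPy; the window $\psi$ and the signal $f$ are ad hoc choices, and the script approximates $V_\psi f(a,b)$ by quadrature and checks, at one sample point, the covariance $V_\psi\circ\pi_{\mathrm{full}}(\alpha,\beta)=L(\alpha,\beta)\circ V_\psi$ of Proposition 2.56 ii).

# Discretized voice transform for the full affine group and its covariance property.
import numpy as np

x = np.linspace(-20, 20, 8001)
dx = x[1] - x[0]

psi = -x * np.exp(-x ** 2 / 2)                     # an L^2 window (Gaussian derivative)
f = (x - 1.0) * np.exp(-(x - 1.0) ** 2)            # a test signal

def pi_full(a, b, h):
    return np.interp((x - b) / a, x, h, left=0.0, right=0.0) / np.sqrt(abs(a))

def voice(h, a, b):
    return np.sum(h * pi_full(a, b, psi)) * dx      # real inner product on the grid

alpha, beta = 1.7, -0.4
a, b = 1.5, 0.8

lhs = voice(pi_full(alpha, beta, f), a, b)          # V_psi(pi(alpha,beta)f)(a,b)
rhs = voice(f, a / alpha, (b - beta) / alpha)       # V_psi f((alpha,beta)^{-1}(a,b))
print(lhs, rhs)                                     # agree up to discretization error
assert abs(lhs - rhs) <= 1e-3 + 1e-2 * abs(rhs)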
The reader is referred to [10, 21, 28] for further reading on this very wide topic. Exercise 2.60. Show that the wavelet representation of the connected affine group on L2 .R/ is reproducing, though not irreducible. Find the conditions that characterize the admissible vectors. The next result describes the projection operator as a convolution against Ku .
Proposition 2.61. Suppose that .G; ; H ; / is a reproducing system. Then the projection onto the range of the voice transform is given by V VF D F K ;
F 2 L2 .G/:
In particular, K is a convolution idempotent, that is K DK K :
z
2 L2 .G/ and K D K 2 L2 .G/, so that, Proof. First of all, notice that K D V by iii) of Theorem 2.38 (with p D q D 2), the convolution F K is well defined for every F 2 L2 .G/. Therefore, taking into account the various properties, we have V .V F/.x/ D hV F; .x/ i D hF; V ..x/ /i D hF; L.x/V
i
D hF; L.x/K i Z F.y/K .x1 y/ dy D G
Z D
F.y/K .y1 x/ dy
G
D F K .x/: The second statement is obvious, because K is of course in the range of V and hence coincides with its projection onto the range. t u We end this section with a classical result due to Duflo and Moore [15], later reviewed with a slightly different argument in [22]. The proof would require some results on unbounded operators that defy the scope of this chapter. Here we content ourselves with its statement2 and some comments. For the notion of self-adjoint (possibly unbounded) operator, the reader is referred to the next section. Theorem 2.62 ([15]). Let be a square integrable unitary representation of the Lie group G on the Hilbert space H . Then there exists a unique positive, densely defined self-adjoint operator A on H such that:
² We formulate the theorem for Lie groups, but it actually holds for locally compact groups.
i) If $\psi_1,\psi_2$ are in the domain $D(A)$ of $A$, then for all $f_1,f_2\in\mathcal H$
\[
\int_G\langle f_1,\pi(x)\psi_1\rangle\,\langle\pi(x)\psi_2,f_2\rangle\,dx
=\int_G V_{\psi_1}f_1(x)\,\overline{V_{\psi_2}f_2(x)}\,dx
=\langle A\psi_2,A\psi_1\rangle\,\langle f_1,f_2\rangle. \tag{2.37}
\]
In particular, the set of admissible vectors coincides with the subset of D.A/ consisting of those for which kA k D 1. ii) The operator A satisfies the semi-invariance relation .x/A.x/ D .x/1=2 A
(2.38)
for every x 2 G. The operator A is usually referred to as the Duflo-Moore operator of , whereas formulæ (2.37) are known as the orthogonality relations. Several comments are in order. First of all, the assumption that is irreducible (which is part of the notion of square integrability) is crucial for this theorem, which, as it stands, is in general false for nonirreducible albeit reproducing representations. Secondly, observe that the semi-invariance equation (2.38), which makes sense because the domain D.A/ is invariant under every .x/, shows that in the nonunimodular case the operator A cannot be bounded, for if it were bounded, then the norm of the operator on the left-hand side of (2.38) would be the same as the norm of A for every x 2 G, whereas on the right-hand side the norm would be multiplied by .x/1=2 , which is different from one if x ¤ e. If G is unimodular, then A is a multiple of the identity, and, by the previous argument, this happens only in the unimodular case. The Duflo-Moore operator is related to the so-called formal degree operator, usually denoted by K , by the equality K D A2 . The terminology stems from the theory of compact groups. If G is compact, then any irreducible representation is finite dimensional and, if d denotes the dimension of H , the Schur orthogonality relations (see, e.g., [17]) give kV f k22 D d1 k k2 kf k2 : Formally, this is (2.37) when f1 D f2 and
1
D
2,
provided that A D d1=2 I.
Example 2.63. As we know, the wavelet representation of the connected affine group, defined by (2.22), is square integrable on HC .R/. Let D.A/ D f
RC /g 2 HC .R/ W 1=2 O . / 2 L2 .b
and define A on D.A/ by Ac . / D 1=2 O . /:
Exercise 2.64. Show that A is not a bounded operator. is admissible if and only if 2 D.A/ and Z C1 d
kA k2 D kAc k2 D j O . /j2 D 1;
0
We know from (2.29) that
as in i) of Theorem 2.62. In fact, (2.28) is a polarized version of the orthogonality relations (2.37). As for the semi-invariance property, we observe that .a; b/ f .x/ D
1 b F ..a; b/ f /. / D p e2i a fO . =a/; a
p af .ax C b/;
so that, by (2.25) and the definition of A on the frequency domain, we have F ..a; b/A.a; b/ /. / D D
p
ae2ib F A.a; b/ .a /
p
ae2ib
F ..a; b/ / .a / p a
1 1 a
b D p e2ib p e2i a .a / O . / a a
1 1 D p O . / p : a
Since .a; b/ D 1=a by (2.18), this is F .A /. /.a; b/1=2 , as stated in ii) of Theorem 2.62.
2.1.5 Unbounded Operators In this section H is a fixed Hilbert space. We say that A is an operator on H if it is a linear map defined on a linear subspace D.A/ H , called its domain, with image R.A/ H , another linear subspace called its range. It is not assumed that A is bounded or continuous. Of course, if A is continuous, then it has a continuous extension on the closure of D.A/ and hence on H . In other words, in this case A is the restriction to D.A/ of some AQ 2 B.H /. The graph of A in H H will be denoted G .A/. Observe that a linear map B is an extension of A if and only if G .A/ G .B/, so that in this case we may write A B. An operator is called closed if such is its graph. The closed graph theorem asserts that A 2 B.H / if and only if D.A/ D H and A is a closed operator. Next we define the adjoint of A, denoted A . Its domain D.A / consists of all the vectors y 2 H for which the linear functional x 7! hAx; yi
(2.39)
is continuous on D.A/. If y 2 D.A /, then the Hahn–Banach theorem allows us to extend the functional in (2.39) to a continuous linear functional on H and hence there exists an element, denoted A y for which hAx; yi D hx; A yi;
x 2 D.A/:
(2.40)
Clearly, A y is uniquely determined by (2.40) if and only if D.A/ is dense in H . We shall then define A only for the densely defined operators A. Exercise 2.65. Show that if A is a densely defined operator, then A is an operator on H . Prove further that if A 2 B.H /, then the definition of A coincides with the usual one. In particular, D.A / D H and A 2 B.H /. Exercise 2.66. Let A, B, and C be operators on H . Prove the following relations: D.B C A/ ˚D D.B/ \ D.A/; D.BA/ D x 2 D.A/ W Ax 2 D.B/ ; .C C B/ C A D C C .B C A/; .CB/A D C.BA/; .C C B/A D CA C BA; AC C AB A.C C B/. Exercise 2.67. Let B, A, and BA be densely defined operators onH . Prove that then A B .BA/ . Furthermore, if B 2 B.H /, then A B D .BA/ . Definition 2.68. An operator on H is said to be symmetric if for every x 2 D.A/ and y 2 D.A/ hAx; yi D hx; Ayi:
(2.41)
The symmetric densely defined operators are those for which $A\subset A^*$. If $A=A^*$, then $A$ is called self-adjoint. Finally, we say that $A$ is skew-adjoint if $iA$ is self-adjoint. Observe that a bounded operator is symmetric if and only if it is self-adjoint. In general this is not true. Furthermore, if $D(A)$ is dense and $\langle Ax,y\rangle=\langle x,By\rangle$ for every $x\in D(A)$ and every $y\in D(B)$, then $B\subset A^*$.

Example 2.69. Let $\mathcal H=L^2([0,1])$ with the Lebesgue measure, and put
\[
D(A_1)=\{f\in A.C.[0,1]:f'\in L^2\},\quad
D(A_2)=D(A_1)\cap\{f:f(0)=f(1)\},\quad
D(A_3)=D(A_1)\cap\{f:f(0)=f(1)=0\},
\]
where $A.C.[0,1]$ is the space of absolutely continuous functions on $[0,1]$. Define next
\[
A_kf=if',\qquad f\in D(A_k),\ k=1,2,3.
\]
It is not hard to show that
\[
A_1^*=A_3,\qquad A_2^*=A_2,\qquad A_3^*=A_1.
\]
Since clearly $A_3\subset A_2\subset A_1$, it follows that $A_2$ is a self-adjoint extension of $A_3$, which is symmetric but not self-adjoint, and that the extension $A_1$ of $A_2$ is not symmetric.

We now discuss a phenomenon that is relevant in the representation theory of the Heisenberg group (see (2.52) below) and in Physics, where the corresponding identities are known as the Weyl relations and imply the Heisenberg uncertainty principle (see [18] for a thorough mathematical survey). Take again $\mathcal H=L^2([0,1])$ as above and consider $Df=f'$ on $D(A_2)$ and $Mf(t)=tf(t)$. It is immediate to check that $(DM-MD)f=f$, that is
\[
DM-MD=I, \tag{2.42}
\]
where $I$ is the identity on the domain of $D$. Thus, the identity appears as the commutator of two operators, only one of which is bounded ($\|Mf\|_2\le\|f\|_2$ because $t\in[0,1]$). One can legitimately ask if it is possible to realize an equality like (2.42) with two bounded operators. The answer is negative, not only in the Banach algebra $\mathcal B(\mathcal H)$ but in any other Banach algebra with unit. The very elegant proof of the proposition that follows is due to Wielandt.

Theorem 2.70. Let $\mathcal A$ be a Banach algebra with unit $e$. If $x,y\in\mathcal A$, then $xy-yx\ne e$.

Proof. Suppose that $xy-yx=e$ and let us make the inductive assumption
\[
x^ny-yx^n=nx^{n-1},
\]
which is true for $n=1$. Then
\[
x^{n+1}y-yx^{n+1}=x^n(xy-yx)+(x^ny-yx^n)x=x^ne+nx^{n-1}x=(n+1)x^n,
\]
so that the relation is true for all positive integers $n$. It then follows that
\[
n\|x^{n-1}\|=\|x^ny-yx^n\|\le2\|x^n\|\|y\|\le2\|x^{n-1}\|\|x\|\|y\|,
\]
that is $n\le2\|x\|\|y\|$ for every $n$. This is impossible. □
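In finite dimensions the same impossibility is visible at a glance from the trace, since $\operatorname{tr}(xy-yx)=0$ while $\operatorname{tr}(I)=n$; a two-line numerical reminder (assuming NumPy):

# The commutator of two matrices always has trace zero, so it can never equal I.
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal((4, 4))
y = rng.standard_normal((4, 4))

comm = x @ y - y @ x
print(np.trace(comm))               # zero up to rounding, while trace(I) = 4
assert abs(np.trace(comm)) < 1e-10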
2.1.6 Stone’s Theorem and the Differential of a Representation In this section we state Stone’s theorem, an infinite dimensional analogue of the fact that the Lie algebra of the unitary group is the skew-hermitian matrices. We then explain how to any unitary representation of a Lie group G there corresponds a representation of its Lie algebra by “skew-hermitian,” i.e., skew-adjoint, operators. Definition 2.71. A one parameter group of operators on H is a family fUt W t 2 Rg B.H / that satisfies i) U0 D I; ii) UtCs D Ut Us ; iii) limt!0 kUt x xk D 0 for every x 2 H . If is a unitary representation on H of the Lie group G with Lie algebra g, then for any fixed X 2 g the family fUt D .exp tX/ W t 2 Rg is a one parameter group of unitary operators on H . In analogy with the case H D C, in which every differentiable function such that f .s C t/ D f .s/f .t/ is of the form f .t/ D eat with a D f 0 .0/, it is possible to associate with any one parameter group of operators a “generating” operator, in general unbounded, as explained in the definition that follows. Definition 2.72. Let fUt W t 2 Rg be a one parameter group of operators on H and let D.A/ denote the subspace of H consisting of the vectors x 2 H for which lim
t!0
Ut x x DW Ax t
(2.43)
exists in the norm topology of H . The operator A (necessarily linear) defined on D.A/ by (2.43) is called the infinitesimal generator of the group. Below is the classical statement of the celebrated theorem by M.H. Stone. For a proof, see, for instance, [29]. Theorem 2.73 (Stone’s theorem). Let fUt g be a one parameter group of unitary operators on H . The infinitesimal generator A of fUt g is densely defined on H and is skew-adjoint. Conversely, if A is a densely defined operator on H and is skewadjoint, then there exits a unique one parameter group of unitary operators on H whose infinitesimal generator is A. Let’s go back to the case when H is the space on which the unitary representation of the Lie group G acts. By means of (2.43), we define the differential d on g as .exp tX/x x t!0 t
d.X/x D lim
(2.44)
which we sometime write as d.X/x D
d ˇˇ ˇ .exp tX/x: dt tD0
Our next objective is to show that d indeed defines a representation of g, that is, to make sure that the various operators d.X/ can be properly composed and satisfy d.ŒX; Y/ D d.X/ ı d.Y/ d.Y/ ı d.X/: We must therefore study the domains D.d.X//. Recall that a function f defined on an open set ˝ 2 Rd with values in H is differentiable at x0 2 ˝ if there is a linear map dfx0 W Rd ! H , necessarily unique, such that lim
x!x0
f .x/ f .x0 / dfx0 .x x0 / D 0; kx x0 k
where kk is any norm in Rd . The map dfx0 is then the differential of f at x0 . If f is differentiable at all points of ˝, then x 7! dfx is a map from ˝ into End.Rd ; H /. The latter, in turn, is a topological vector space in a canonical way. We then say that f is of class C1 if x 7! dfx is continuous, of class C2 if x 7! dfx is of class C1 , and so on. We say that f is of class C1 if it is of class Ck for all k. The notion of C1 map is clearly local and applies to the case of maps defined on a Lie group G with values in H . The following result establishes an analogue of formula (2.7) adapted to this setup. Proposition 2.74. Let G be a Lie group with Lie algebra g and take X 2 g. If FW G ! H is a C1 map with values in the Hilbert space H , then the map g 7! Xg F D lim
t!0
F.g exp tX/ F.g/ t
is also of class C1 . Proof. The composition .g; t/ 7! .g; exp tX/ 7! g exp tX 7! F.g exp tX/ is C1 . Hence, its partial derivative with respect to t at zero is C1 with respect to g. u t We can finally introduce the space on which the operators that arise from the differential of a unitary representation of a Lie group are naturally defined. Definition 2.75. Let be a unitary representation of the Lie group G on H . An element 2 H is called a C1 -vector for if g 7! .g/ is of class C1 on G. The space of C1 -vectors for will be denoted C1 ./. Theorem 2.76 ([25] Proposition 3.9). Let be a unitary representation of the Lie group G on the Hilbert space H . For any X 2 g, d.X/ sends C1 ./ into itself and
\[
d\pi([X,Y])\xi=d\pi(X)\,d\pi(Y)\xi-d\pi(Y)\,d\pi(X)\xi,\qquad \xi\in C^\infty(\pi).
\]
From Theorem 2.76 it is easy to derive the following facts. Proposition 2.77 ([25] Proposition 3.10 and Proposition 3.11). Let be a unitary representation of the Lie group G on the Hilbert space H and let g denote the Lia algebra of G. Then: i) for every X 2 g, the operator d.X/ is skew-adjoint on C1 ./; ii) for every g 2 G, .g/ sends C1 ./ into itself ; iii) for every g 2 G and every X 2 g the formula .g/ d.X/.g/1 D d.Ad gX/ holds. We conclude this section with a crucial result, which implies that the operators d.X/ are densely defined on H . Theorem 2.78 ([25] Theorem 3.15). Let be a unitary representation of the Lie group G on the Hilbert space H . The space C1 ./ is dense in H .
2.2 The Heisenberg Group and its Representations 2.2.1 The Group and its Lie Algebra We shall denote by Hd the Heisenberg group, namely the manifold Rd Rd R endowed with the product .q; p; t/.q0 ; p0 ; t0 / D .q C q0 ; p C p0 ; t C t0 12 .t qp0 t pq0 //:
(2.45)
Let !W R2d R2d ! R be the standard symplectic form given by (2.2), that is !.x; x0 / D t xJx0 ; Upon writing x D .q; p/ 2 R2d , we may formulate (2.45) in terms of the symplectic form, namely: 1 .x; t/.x0 ; t0 / D .x C x0 ; t C t0 !.x; x0 //: 2
(2.46)
It is clear from (2.45) and (2.46) that the product in Hd is given by functions that are C1 in the global R2dC1 coordinates .q; p; t/. Furthermore, one checks at once that .x; t/1 D .x; t/; another C1 formula. Hence Hd is a Lie group.
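The group axioms can be checked mechanically from (2.45). The sketch below (assuming NumPy) implements the product for $d=1$ with one common choice of sign for the symplectic term; that sign is an assumption made here, since it depends on the normalization of $\omega$ in (2.2).

# Checking associativity, inverses and noncommutativity of the Heisenberg product, d = 1.
import numpy as np

def mult(g1, g2):
    q1, p1, t1 = g1
    q2, p2, t2 = g2
    return (q1 + q2, p1 + p2, t1 + t2 - 0.5 * (q1 * p2 - p1 * q2))

def inv(g):
    q, p, t = g
    return (-q, -p, -t)

a, b, c = (1.0, 2.0, 0.3), (-0.5, 0.7, 1.1), (2.2, -1.0, 0.0)

assert np.allclose(mult(mult(a, b), c), mult(a, mult(b, c)))   # associativity
assert np.allclose(mult(a, inv(a)), (0.0, 0.0, 0.0))           # (x,t)^{-1} = (-x,-t)
assert not np.allclose(mult(a, b), mult(b, a))                 # the group is nonabelian
print("Heisenberg group law checked for d = 1")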
Observe that H1 can also be seen as a particular group of symplectic matrices, that is, a subgroup of Sp.2; R/, in the sense of (2.1). Indeed, if we put 2
3 1 0 0 0 6 p 1 0 07 7 g.q; p; t/ D 6 4 t q=2 1 p5 ; q=2 0 0 1
(2.47)
then it is easy to check that the very same product formula as in (2.46) holds true. Exercise 2.79. Prove that the center of Hd is Z D f.0; t/ W t 2 Rg. Show that the quotient group Hd =Z is isomorphic to the Abelian Lie group R2d . Exercise 2.80. Write explicitly the inner conjugation ig .h/ D ghg1 in Hd . Exercise 2.81. Prove that L1 D f.q; 0; t/ W q 2 Rd ; t 2 Rg
L2 D f.0; p; t/ W p 2 Rd ; t 2 Rg
are two Lie subgroups of Hd which are mutually isomorphic but not conjugate. Exercise 2.82. Check that the matrices in (2.47) are in Sp.2; R/ and that they satisfy the product law (2.46). Exercise 2.83. Extend the embedding (2.47) to arbitrary dimension. We next want to identify the Lie algebra hd of Hd in terms of left invariant vector fields. We first select the standard basis of the tangent space to the identity given by the canonical tangent vectors h.sej ; 0; 0/i0 D
@ ˇˇ ˇ ; @qj 0
where ej denotes the j-th unit coordinate vector in Rd and then we take its left translates. Similarly for the other coordinates pj and t. Hence, we fix .q; p; t/ 2 Hd and f 2 C1 .Hd / and consider the smooth curves s 7! .sej ; 0; 0/, with s 2 ."; "/ and then s 7! .q; p; t/.sej ; 0; 0/ passing through .q; p; t/, and compute d ˇˇ d ˇˇ 1 ˇ f ..q; p; t/.sej ; 0; 0// D ˇ f .q C sej ; p; t C spj / ds sD0 ds sD0 2 @f 1 @f D .q; p; t/ C pj .q; p; t/: @qj 2 @t
Hence, the equivalence class of the curve s 7! .sej ; 0; 0/ at .q; p; t/ is the tangent vector that corresponds to the differential operator @ ˇˇ 1 @ ˇˇ C pj ˇ : ˇ @qj .q;p;t/ 2 @t .q;p;t/ An analogous calculation with the curves s 7! .0; sej ; 0/ and s 7! .0; 0; s/ shows that a basis for hd is given by the vector fields fQ1 ; : : : ; Qd ; P1 ; : : : ; Pd ; Tg, where Qj D
@ 1 @ C pj ; @qj 2 @t
j D 1; : : : ; d
Pj D
@ 1 @ qj ; @pj 2 @t
j D 1; : : : ; d
TD
@ : @t
It is also straightforward to check that ŒQj ; Pk D ıjk T; ŒQj ; T D ŒPj ; T D 0;
j; k D 1; : : : ; d j D 1; : : : ; d;
that are the celebrated Heisenberg commutation relations. P Therefore, identifying R2dC1 with hd via the map .x1 ; : : : ; xd ; y1 ; : : : ; yd ; z/ 7! djD1 .xj Qj C yj Pj / C zT, we obtain the following Lie algebra structure on R2dC1 : Œ.X; z/; .X 0 ; z0 / D .0; !.X; X 0 //: From this commutator rule, one sees immediately that ŒA; ŒB; C D 0 for every choice of A, B, and C in hd . Similarly, any higher order bracket vanishes. This fact is expressed technically by saying that hd is a two-step nilpotent Lie algebra. The Baker–Campbell–Hausdorff formula (2.9) becomes, for A; B 2 hd 1 exp A exp B D exp.A C B C ŒA; B/ 2 and since for A D .X; z/ and B D .X 0 ; z0 / it holds 1 1 A C B C ŒA; B D .X C X 0 ; z C z0 !.X; X 0 //; 2 2 which coincides with the product (2.46), we infer that for every .X; z/ 2 R2dC1 exp.X; z/ D .X; z/:
This simply says that the exponential mapping is the identity map of R2dC1 , when we identify the latter with hd on the one hand and with Hd on the other hand. Exercise 2.84. Prove that the adjoint action of Hd on hd is given by Ad.y; s/.X; t/ D .X; t !.y; X//.
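The two-step nilpotency and the exact truncation of the Baker–Campbell–Hausdorff formula noted above can be verified in the standard $3\times3$ strictly upper-triangular model of $\mathfrak h_1$; this model is chosen here only for the illustration and is not the symplectic embedding (2.47) of the text. A minimal sketch assuming NumPy and SciPy:

# BCH truncates exactly on a two-step nilpotent Lie algebra:
# exp A exp B = exp(A + B + [A,B]/2).
import numpy as np
from scipy.linalg import expm

def h1(x, y, z):
    # x Q + y P + z T in the polarized 3x3 model
    return np.array([[0.0, x, z],
                     [0.0, 0.0, y],
                     [0.0, 0.0, 0.0]])

A, B = h1(1.0, -0.5, 0.3), h1(0.2, 2.0, -1.0)
bracket = A @ B - B @ A

# two-step nilpotency: all double brackets vanish
assert np.allclose(A @ bracket - bracket @ A, 0)
assert np.allclose(B @ bracket - bracket @ B, 0)

# truncated BCH formula (2.9) holds exactly here
assert np.allclose(expm(A) @ expm(B), expm(A + B + 0.5 * bracket))
print("BCH truncates exactly on the Heisenberg Lie algebra")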
2.2.2 Automorphisms When introducing the metaplectic representation of the symplectic group (see Section 2.3), it is crucial to understand well the automorphisms of the Heisenberg group. More generally, given a Lie group G with Lie algebra g it is often interesting (and many times difficult) to describe their automorphism groups. Recall that an automorphism of the Lie group G is an invertible group homomorphism. Unraveling the definition, this means a bijective smooth map FW G ! G preserving products, namely satisfying F.xy/ D F.x/F.y/ and F.e/ D e. The collection of all such maps is, in turn, a group under composition, denoted Aut.G/. Under favorable circumstances,3 and surely this is the case for Hd , the group Aut.G/ has itself a natural structure of Lie group, though we shall not insist on this. Notice that since we have identified Hd with R2dC1 , we are in fact looking at maps FW R2dC1 ! R2dC1 . Similarly, one may consider the automorphisms of g, namely the bijective linear maps ˚W g ! g satisfying ˚.ŒX; Y/ D Œ˚.X/; ˚.Y/. As we know, they form a group, denoted by Aut.g/. Since its elements are linear maps, Aut.g/ is in fact a (closed) subgroup of GL.d; R/ in a natural fashion, where d is the dimension of g. Hence Aut.g/ can be given the structure of a Lie group without problems. Also, the differential dF of any F 2 Aut.G/ is easily seen to belong to Aut.g/. This accounts for an immersion of Aut.G/ into Aut.g/ whenever G is connected. When looking at the Heisenberg group, things are even nicer than this, and the following results show exactly how. Very clear proofs of the statements that follow can be found in [16]. Proposition 2.85. Aut.Hd / D Aut.hd /. It is actually possible to give an explicit description of the automorphisms of Hd . It is immediate to see that each of the following families of maps consists of automorphisms. q q i) Symplectic maps. For any A 2 Sp.d; R/, . ; t/ 7! .A ; t/. p p ii) Inner automorphisms. iii) Homogeneous dilations. For any a 2 RC , ıa .x; t/ D .ax; a2 t/. Observe that for any a; a0 > 0 it holds ıa ıa0 D ıaa0 . iv) Inversion. .q; p; t/ 7! .p; q; t/.
3 As it was shown by Hochschild in 1951, it is enough that the group of components of G is finitely generated.
Theorem 2.86. Every automorphism of Hd can be written uniquely as ˛1 ˛2 ˛3 ˛4 , with ˛j as in items i), ii), iii), and iv) above. It is important to observe that the only automorphisms that leave the center fixed are those of the kind i) and ii). We shall denote by T the group generated by them. Exercise 2.87. Define the semidirect product R2d Ì Sp.d; R/ as R2d Sp.d; R/ endowed with the product .x; A/.x0 ; A0 / D .x C Ax0 ; AA0 /. Show that the map .x; A/ 7! .x;A/ , where .x;A/ .y; t/ D i.x;0/ .Ay; t/;
.x; 0/; .y; t/ 2 Hd
defines an isomorphism of R2d Ì Sp.d; R/ with T . Define further the semidirect product Hd Ì Sp.d; R/ via the formula ..x; t/I A/..x0 ; t0 /I A0 / D ..x; t/.Ax0 ; t0 /I AA0 /:
(2.48)
Calculate its center Z and prove that Hd Ì Sp.d; R/=Z is isomorphic as a group to R2d Ì Sp.d; R/. Exercise 2.88. Consider the symplectic matrix 2 1 0 60 d 'A D 6 40 0 0 b=2
0 0 1 0
3 0 2c7 7 2 Sp.2; R/ 0 5 a
where ad bc D 1, so that ab AD 2 Sp.1; R/ D SL.2; R/: cd Using the embedding (2.47), show that 'A g.q; p; t/'A1 D g.q0 ; p0 ; t/ where 0 q ab q D : cd p p0 Thus the automorphisms of type i) of H1 can be realized inside Sp.2; R/ as conjugations. The six dimensional subgroup of Sp.2; R/ generated by the matrices g.q; p; t/ and 'A is the so-called Jacobi group. Show that the Jacobi group is isomorphic to the semidirect product H1 Ì SL.2; R/ of Exercise 2.87 for d D 1, where the group law is given in (2.48). More on the Jacobi group is to be found in [4]. A classification of its subgroups up to conjugation is given in [14]. From the point of view of square integrability issues, the Jacobi group has been studied in [13].
Exercise 2.89. Show how to realize the automorphisms of type ii) inside Sp.2; R/ as conjugations. Exercise 2.90. Consider next the symplectic matrix 2 1 a 6 0 a D 6 4 0 0
0 1 0 0
0 0 a 0
3 0 07 7: 05 1
Prove that a g1 a D ıa .g/ for every g as in (2.47). Thus, also the automorphisms of type iii) of H1 can be realized inside Sp.2; R/ as conjugations. The four dimensional subgroup of Sp.2; R/ generated by the matrices g.q; p; t/ and a is isomorphic to the so-called extended Heisenberg group. The latter is the Heisenberg group extended by the group of homogeneous dilations, namely the semidirect product Hd Ì RC where the product is .h; a/.h0 ; a0 / D .hıa .h0 /; aa0 /: Exhibit the explicit isomorphism.
2.2.3 The Schrödinger Representation Let be a unitary and irreducible representation of Hd on the Hilbert space H . Then all the operators .0; 0; t/ are in I ./ because the elements of the form .0; 0; t/ are in the center of Hd . Schur’s Lemma implies that there exists a complex number of modulus one, denoted .t/, such that .0; 0; t/ D .t/IH :
(2.49)
Therefore the function t 7! .t/ is continuous from R into T and satisfies the equalities .s C t/ D .s/ .t/ and .0/ D 1. Thus there exists a unique 2 R such that
.t/ D eit :
(2.50)
Now, if D 0, then .0; 0; t/ D IH , so that Z ker and the representation projects onto a representation Q on the quotient Hd =Z ' R2d . The representation Q is still unitary and irreducible. But R2d is Abelian and hence, by Corollary 2.55, dim H D 1, namely H D C. Therefore, there exists a unique vector . ; / 2 R2d such that .x; Q y/ D ei. xC y/ IC
and hence acts on C as .x; y; t/z D ei. xC y/ z:
(2.51)
Consider next the case 6D 0. From (2.49) and (2.50), taking the differential, it follows d.T/ D iIH : In particular d.T/ is a bounded operator. But we also know that ŒQj ; Pj D T for every j D 1; : : : ; d, so that necessarily Œd.Qj /; d.Pj / D d.T/ D iIH :
(2.52)
By modifying Theorem 2.70 in the case of a multiple of the identity (the details are left as an exercise) we see that the operators dπ(Q_j) and dπ(P_j) cannot be both bounded and, in particular, the representation space H cannot be finite dimensional. In conclusion, every unitary and irreducible representation of H^d which is not trivial on the center is necessarily infinite dimensional. The next step consists in looking for the unitary and irreducible representations of H^d that are not trivial on the center. The most natural way to go about it is to compare the Heisenberg commutation relations with (2.42). We shall start from a very natural representation of the Lie algebra h^d and then use some heuristics, together with Stone's Theorem, in order to get the corresponding representation of the group. To this end, we introduce a useful Lie subalgebra of the Lie algebra of all differential operators with polynomial coefficients on R^d. Let PD(R^d) denote the vector space of all differential operators with polynomial coefficients on R^d. Under composition, it is an associative algebra generated by the 2d elements
$$D_j = \frac{\partial}{\partial x_j}, \qquad M_j = (2\pi i)\,x_j, \qquad j = 1,\dots,d.$$
As any other associative algebra, PD.Rd / becomes a Lie algebra if we define the bracket as the commutator, that is ŒA; B D A ı B B ı A. The generators Dj and Mj , together with the operator “2i-times the identity,” give rise to a finite dimensional Lie subalgebra of PD.Rd / which is isomorphic to hd . Indeed, the correspondence Dj 7! Qj , Mj 7! Pj and .2i/I 7! T establishes the isomorphism. This isomorphism is actually a faithful (i.e., injective) representation of the Heisenberg algebra and should be thought of as the differential of the representation we are looking for. In order to find the representation space, it will then be enough to find a Hilbert space of functions on Rd on which the operators we are dealing with are skew-adjoint. Finally, we shall exponentiate the representation of hd .
Let H D L2 .Rd / and D D S .Rd /, the Schwartz space of rapidly decreasing functions. Both the derivations Dj and the multiplications Mj are densely defined on H since D is a natural common domain for them, which is dense in H . The operators are formally skew-adjoint because for every '; 2 S .Rd / we have hMj '; i D h'; Mj i;
hDj '; i D h'; Dj i;
where we are using L2 inner products and, in the second, integration by parts. Observe that it is not necessary to specify the skew-adjoint extensions of Mj and of Dj . It will be sufficient to exhibit one parameter groups of unitary operators on L2 .Rd / whose infinitesimal generators extend our operators, for Stone’s Theorem guarantees that these groups are the appropriate ones. Consider then the following formal computation: exp.tDj /f .x/ D
1 X .tDj /n nD0
nŠ
f .x/ D
1 X .t/n @n f .x/ D f .x tej /: nŠ @xjn nD0
Similarly, exp.tMj /f .x/ D e2ixj f .x/: It is also quite clear that the operators .j/
Ut f .x/ D f .x tej /;
.j/
Vt f .x/ D e2itxj f .x/
give rise to one parameter groups of unitary operators on L2 .Rd /. Exercise 2.91. Prove that the Schwartz space S .Rd / is contained in the domain of .j/ .j/ the infinitesimal generators of fUt g and fVt g. From the previous computations, it is natural to set: .q; 0; 0/f .x/ D f .x q/ .0; p; 0/f .x/ D e2ipx f .x/
(2.53)
.0; 0; t/f .x/ D e2it f .x/: Finally, since for every .q; p; t/ 2 Hd we have 1 .q; p; t/ D .0; 0; t q p/.0; p; 0/.q; 0; 0/; 2 by composing the various (2.53) we obtain .q; p; t/f .x/ D e2it eiqp e2ipx f .x q/:
(2.54)
It is now elementary to check that (2.54) defines a representation of Hd , called the Schrödinger representation. While it is clear that the thus defined operators are unitary because they are compositions of unitary operators, irreducibility requires some extra work. Theorem 2.92. Formula (2.54) defines a unitary irreducible representation of Hd on L2 .Rd /. Proof. In order to show that is continuous in the strong operator topology it is enough to prove that if .q; p; t/ ! .0; 0; 0/, then .q; p; t/f ! f in L2 .Rd / for every f 2 L2 .Rd /. Now, Z Rd
j.q; p; t/f .x/ f .x/j2 dx
1=2
D
Z Z
Rd
Rd
C
je2itiqpC2ipx f .x q/ f .x/j2 dx je2itiqpC2ipx 1j2 jf .x/j2 dx
Z Rd
jf .x q/ f .x/j2 dx
12
12
12
and both summands tend to zero as .q; p; t/ ! .0; 0; 0/, the first by dominated convergence and the second by the continuity of translations in L2 .Rd /. Let’s show irreducibility. Take a closed subspace M L2 .Rd / and suppose that it is -invariant. If f 2 M , then the translate q f D .q; 0; 0/f in also in M . If P is the orthogonal projection onto M , then P commutes with translations. Hence, by the classical characterization of the bounded operators on L2 .Rd / that commute with translations (see, for example, Theorem 2.5.10 in [20]), there exists a multiplier m 2 L1 .Rd / such that F .Pf /. / D m. /F f . /:
(2.55)
Since P2 D P, then also m2 D m. Upon considering .0; p; 0/, one sees that P commutes also with multiplication by any of the characters ep .x/ D e2ipx . Therefore, from (2.55) we infer F ŒP.ep f /. / D m. /F Œep f . / D m. /F f . p/; F Œep Pf . / D F ŒPf . p/ D m. p/F f . p/; which yield m. / D m. p/;
for every p 2 Rd :
Hence m is constant and since m2 D m it must be either m D 0, that is M D f0g, or else m D 1, that is M D L2 .Rd /. t u
By means of formulae (2.54), we have built a unitary and irreducible representation of Hd for which .0; 0; t/ D e2it IH , that is, recalling (2.49) and (2.50), the representation corresponding to the case D 2. It is now easy to define a representation corresponding to the real number D 2h, with h 2 R n f0g. It is enough to put h .q; p; t/f .x/ D .hq; p; ht/f .x/ D e2iht eiqhp e2ipx f .x hq/:
(2.56)
Exercise 2.93. Check that h is a unitary and irreducible representation of Hd for every h 6D 0, and that if h 6D h0 , then the corresponding representations are inequivalent. The following celebrated theorem by M. Stone and J. von Neumann is of fundamental importance in Harmonic Analysis: it states that the representations that we have defined so far exhaust, up to equivalence, the class of unitary and irreducible representation of Hd . For a proof, see, for example, [16]. Theorem 2.94 (Stone–von Neumann). Every unitary and irreducible representation of Hd is unitarily equivalent to either one of the one-dimensional representations (2.51), or to a Schrödinger representation h defined in (2.56). It is worth observing that the determination of the constant h is very simple: just compute the representation on the central elements .0; 0; t/: by (2.56) this must be just e2iht I.
2.2.4 Time-frequency Analysis

In this section we indicate the role that the representation theory of the Heisenberg group plays in time-frequency analysis, in particular in the study of the so-called short-time Fourier transform (often abbreviated as STFT). The basic reference on this topic is the book [21]. We first introduce some ingredients of time-frequency analysis and then show the connections to the representation theory of the (reduced) Heisenberg group. The issue is that the STFT can be viewed in the framework of reproducing systems, as discussed in Section 2.1.4. In order to stress the link between the STFT and the representation theory of the Heisenberg group, we follow the convention of denoting by x or q the spatial variable of functions (or time for d = 1) and by ξ or p the frequency variable. Also, we indicate with a dot the scalar product in R^d.
2.2.4.1
Short Time Fourier Transform
The most basic operations in time-frequency analysis are the shifts in time and in frequency. For q; p 2 Rd and f W Rd ! C, we define the time shift by
$$T_q f(x) = f(x - q), \qquad q, x \in \mathbb{R}^d,$$
and the frequency shift by
$$M_p f(\xi) = e^{2\pi i\, p\cdot\xi} f(\xi), \qquad \xi, p \in \mathbb{R}^d.$$
It is a matter of simple computation to establish an adapted version of the canonical commutation relations, namely
$$T_q M_p = e^{-2\pi i\, q\cdot p}\, M_p T_q. \qquad (2.57)$$
It follows in particular that $T_q$ and $M_p$ commute if and only if $q\cdot p \in \mathbb{Z}$. Furthermore, using the well-known properties of the Fourier transform, we have the formulae
$$\widehat{T_q f} = M_{-q}\hat{f}, \qquad \widehat{M_p f} = T_p \hat{f}.$$
From these it follows what is to be regarded as the most important formula in time-frequency analysis, namely
$$\widehat{T_q M_p f} = M_{-q}\, T_p\, \hat{f},$$
whose proof is left as an exercise. The short-time Fourier transform is a mathematical device that is meant to capture the local contributions to the Fourier transform of a given function: one restricts the function to a small interval by a cut-off function (preferably some smooth window) and then takes the Fourier transform. By sliding the interval, one sees what are the various contributions from different regions of the time domain. Formally, we have:

Definition 2.95. Let $\psi \neq 0$ be a fixed window, that is, a function defined on $\mathbb{R}^d$. The short-time Fourier transform of the function $f\colon \mathbb{R}^d \to \mathbb{C}$ with respect to $\psi$ is defined by
$$S_\psi f(q,p) = \int_{\mathbb{R}^d} f(x)\,\overline{\psi(x-q)}\, e^{-2\pi i\, p\cdot x}\, dx, \qquad q, p \in \mathbb{R}^d,$$
whenever the integral makes sense. A most fundamental observation is that the STFT is a function on the so-called phase-space, namely the 2d-space R2d in which times and frequencies simultaneously lie. In other words, it is a function of time and frequency. The time variable labels the “center” of the window and the frequency labels the point at which the Fourier transform is evaluated. At this stage we do not insist much on the precise domains for or for f . The most basic properties of the STFT are given in the result that follows. The proof is easy.
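To make the definition concrete, here is a minimal numerical sketch (plain NumPy, d = 1; the Gaussian signal, the Gaussian window, and the sampling grids are illustrative choices of this sketch, not taken from the text). It evaluates S_ψf by a Riemann sum, locates the time-frequency content of a Gaussian packet, and also checks the commutation relation (2.57) on the sample grid.

```python
import numpy as np

N, T = 1024, 8.0
x = np.linspace(-T, T, N, endpoint=False)
dx = x[1] - x[0]

# Gaussian packet centered at time 1 with frequency 3
f = np.exp(-np.pi * (x - 1.0) ** 2) * np.exp(2j * np.pi * 3.0 * x)

def stft(q, p):
    # Riemann sum for S_psi f(q, p) = ∫ f(x) conj(psi(x - q)) e^{-2 pi i p x} dx,
    # with the L2-normalized Gaussian window psi(x) = 2^{1/4} e^{-pi x^2}
    win = 2 ** 0.25 * np.exp(-np.pi * (x - q) ** 2)
    return np.sum(f * win * np.exp(-2j * np.pi * p * x)) * dx

qs, ps = np.arange(-4, 4, 0.1), np.arange(-6, 6, 0.125)
S = np.array([[stft(q, p) for p in ps] for q in qs])
iq, ip = np.unravel_index(np.argmax(np.abs(S)), S.shape)
print(qs[iq], ps[ip])          # ~ 1.0, 3.0: |S_psi f| peaks where the packet lives in phase space

# commutation relation (2.57): T_q M_p = e^{-2 pi i q p} M_p T_q
q0, p0 = 0.75, 3.0             # q0 is an exact multiple of dx, so T_{q0} is a grid shift
Mp = np.exp(2j * np.pi * p0 * x)
shift = int(round(q0 / dx))
print(np.allclose(np.roll(Mp * f, shift),
                  np.exp(-2j * np.pi * q0 * p0) * Mp * np.roll(f, shift)))   # True
```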
Proposition 2.96. If $f, \psi \in L^2(\mathbb{R}^d)$, then $S_\psi f$ is uniformly continuous on $\mathbb{R}^{2d}$ and
$$S_\psi f(q,p) = \mathcal{F}\big(f\,\overline{T_q\psi}\big)(p) \qquad (2.58)$$
$$\phantom{S_\psi f(q,p)} = \langle f,\, M_p T_q\psi\rangle \qquad (2.59)$$
$$\phantom{S_\psi f(q,p)} = \langle \hat f,\, T_p M_{-q}\hat\psi\rangle \qquad (2.60)$$
$$\phantom{S_\psi f(q,p)} = e^{-2\pi i\, q\cdot p}\, S_{\hat\psi}\hat f(p,-q). \qquad (2.61)$$
Some comments. Formula (2.58) says what the STFT really is, the Fourier transform of a localized version of $f$. Formulae (2.59) and (2.60) indicate that it looks formally like the coefficient of some representation. Formula (2.61) exhibits an intriguing symmetry in time and frequency, together with a ninety-degree rotation, namely $(q,p) \mapsto (p,-q)$, in phase-space (given by the action of the matrix $J$). In the next three results we state some of the most remarkable features of the STFT.

Theorem 2.97 (Orthogonality relations). Given $f_1, f_2, \psi_1, \psi_2 \in L^2(\mathbb{R}^d)$, then for $j = 1, 2$ it holds that $S_{\psi_j} f_j \in L^2(\mathbb{R}^{2d})$ and
$$\langle S_{\psi_1} f_1,\, S_{\psi_2} f_2\rangle = \langle f_1, f_2\rangle\, \overline{\langle \psi_1, \psi_2\rangle}. \qquad (2.62)$$
It must be observed that the first inner product in (2.62) is in $L^2(\mathbb{R}^{2d})$ (phase space) whereas the ones appearing in the right-hand side are both in $L^2(\mathbb{R}^d)$, in the time domain or in the frequency domain, as one prefers, because of Parseval's equality.

Corollary 2.98. If $f, \psi \in L^2(\mathbb{R}^d)$, then $\|S_\psi f\| = \|f\|\,\|\psi\|$. In particular, if $\|\psi\| = 1$, then
$$\|S_\psi f\| = \|f\| \qquad \text{for all } f \in L^2(\mathbb{R}^d) \qquad (2.63)$$
and in this case the STFT is an isometry of $L^2(\mathbb{R}^d)$ into $L^2(\mathbb{R}^{2d})$.

Theorem 2.99 (Inversion of the STFT). Let $\gamma, \psi \in L^2(\mathbb{R}^d)$ be such that $\langle \gamma, \psi\rangle \neq 0$. Then, for every $f \in L^2(\mathbb{R}^d)$
$$f = \frac{1}{\langle \gamma, \psi\rangle}\, \iint_{\mathbb{R}^{2d}} S_\psi f(q,p)\, M_p T_q\,\gamma \;\, dq\, dp \qquad (2.64)$$
holds as a weak integral.
It must be pointed out that in many cases one chooses $\gamma = \psi$ and assumes further that $\|\psi\| = 1$, so that (2.64) simplifies to
$$f = \iint_{\mathbb{R}^{2d}} S_\psi f(q,p)\, M_p T_q\,\psi \;\, dq\, dp,$$
a formula that is sometimes referred to as the reproducing formula for the STFT.
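The orthogonality relation and the resulting isometry can be observed numerically. The following sketch (plain NumPy, d = 1; the Gaussian signal and window and the phase-space grid are illustrative assumptions) compares a Riemann-sum approximation of ‖S_ψf‖² over phase space with ‖f‖²‖ψ‖² for a unit-norm Gaussian window.

```python
import numpy as np

N, T = 1024, 8.0
x = np.linspace(-T, T, N, endpoint=False); dx = x[1] - x[0]
f = np.exp(-np.pi * (x - 1.0) ** 2) * np.exp(2j * np.pi * 3.0 * x)     # test signal

qs, ps = np.arange(-4, 4, 0.1), np.arange(-6, 6, 0.125)
dq, dp = 0.1, 0.125

win   = 2 ** 0.25 * np.exp(-np.pi * (x[None, :] - qs[:, None]) ** 2)   # psi(x - q), unit-norm Gaussian window
phase = np.exp(-2j * np.pi * ps[:, None] * x[None, :])                  # e^{-2 pi i p x}
S = ((f[None, :] * win) @ phase.T) * dx                                 # S[i, j] ~ S_psi f(qs[i], ps[j])

lhs = (np.abs(S) ** 2).sum() * dq * dp      # ~ ||S_psi f||^2 over phase space
rhs = (np.abs(f) ** 2).sum() * dx           # ~ ||f||^2 ||psi||^2, since ||psi|| = 1
print(lhs, rhs)                              # both ~ 0.7071, illustrating (2.62)/(2.63)
```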
2.2.4.2
Square Integrability and the Reduced Heisenberg Group
First of all, we show that .q; p; t/ D e2it eiqp Tq Mp :
(2.65)
Indeed, for any f 2 L2 .Rd / we have, by (2.57) and (2.54) e2it eiqp Tq Mp f .y/ D e2it eiqp e2iqp Mp Tq f .y/ D e2it eiqp e2ipy Tq f .y/ D e2it eiqp e2ipy f .y q/ D .q; p; t/f .y/: Secondly, for fixed ; f 2 L2 .Rd /, the coefficient hf ; .q; p; t/ i of the Schrödinger representation satisfies, according to (2.65), (2.57), and (2.59) hf ; .q; p; t/ i D e2it eiqp hf ; Tq Mp i D e2it eiqp S f .q; p/;
(2.66)
Therefore the STFT is nothing else but a multiple (with a complex number of modulus one) of the voice transform associated with the Schrödinger representation of Hd . One extra step needs to be made in order to recast the main properties of the STFT in the language of reproducing systems. Indeed, the STFT of a function is a function on the Heisenberg group that does not depend on the central variable, which leads to a serious integrability issue because the right-hand side of (2.66) is not in L2 .Hd / as the integral of its square modulus certainly diverges in the t variable. Things can be fixed by introducing the reduced Heisenberg group. We start by computing the kernel of the Schrödinger representation. Using formula (2.65), it is immediate that .q; p; t/f D f for every f 2 L2 .Rd / if and only if q D p D 0 and t 2 Z. Thus ker D f.0; 0; k/ W k 2 Zg ' Z:
The reduced Heisenberg group is nothing else but Hd = ker . To have a workable model, for this quotient group, one simply puts Hdr D R2d T, where T are the complex numbers of modulus one, and defines 0
.x; /.x0 ; 0 / D .x C x0 ; 0 ei!.x;x / /:
(2.67)
It is quite clear that the map .x; t/ 7! .x; e2it / is a surjective homomorphism of Hd onto Hdr whose kernel is exactly ker , so that we have a handy model for Hd = ker . Exercise 2.100. Show that if we parametrize the elements of Hdr by .x; e2i / where 2 Œ0; 1/, then a left Haar measure on Hdr is dx d . By the very construction of Hdr , the Schrödinger representation projects onto an irreducible representation WD Q of Hdr on L2 .Rd /. With slight abuse, it is called the Schrödinger representation of Hdr and its explicit formula is .q; p; / D eiqp Tq Mp ;
.q; p; / 2 Hdr :
(2.68)
Exercise 2.101. Derive a theorem that describes all the irreducible representations of Hdr starting from the Stone-von Neumann theorem. Finally, we compute the voice transform V of and find from (2.66) that N iqp S f .q; p/ V f .q; p; / D hf ; .q; p; / i D e Therefore, appealing to the orthogonality relations in the form (2.63), we get Z
1
kV f k D 0
Z R2d
ˇ2 ˇ 2i iqp ˇe e S f .x/ˇ dq dp d D kS f k D kf kk k:
This proves that any unit vector representation of Hdr .
2 L2 .Rd / is admissible for the Schrödinger
Exercise 2.102. Write the inversion formula of the STFT as the reproducing formula (2.33) associated with the square integrable representation of Hdr .
2.3 The Metaplectic Representation The metaplectic representation is a double-valued unitary representation of the symplectic group Sp.d; R/ on L2 .Rd /, or, more technically speaking, a unitary representation of the double cover of Sp.d; R/, otherwise known as the metaplectic
group Mp.d/. It has many other names in the literature, such as the oscillator representation, or the Segal-Shale-Weil representation, and it appears pervasively in mathematics. We point out right from the start that it is not irreducible: both the spaces Le2 .Rd / and Lo2 .Rd / of even and odd (square integrable) functions, respectively, are closed and invariant, and on each of them the metaplectic representation is irreducible. In very practical terms, the metaplectic representation assigns to each symplectic matrix a unitary operator on L2 .Rd /, which is well defined only up to a sign. This ambiguity, though mathematically important and not removable, plays a very mild role, if none at all, in many of the most important aspects in which it appears in Applied Harmonic Analysis. In particular, this ambiguity is irrelevant in the context of reproducing formulae, that is, when square integrability issues arise. As hinted in the previous paragraph, the very definition of the metaplectic representation is troublesome, and there are different ways of going about it. For a thorough presentation of this topic, the reader is referred to [16, 21, 27]. Here we content ourselves with a discussion of the main features rather than delving into the full machinery of proofs. The reason is because our interest resides in the restriction of the metaplectic representation to a particular class of triangular Lie subgroups of Sp.d; R/. This class, known as the class E , contains a rich subclass of reproducing groups, with interesting new examples, as well as well-known ones. This general theme of reproducing groups for the metaplectic representation started with [13, 14] and has then been investigated in a series of more recent papers [2, 3, 6–9, 11]. We begin by summarizing some useful properties of the symplectic group and of its Lie algebra. A direct proof of most of the statements may be found in [16]. For further reading, see, for example, [26].
2.3.1 More on the Symplectic Group As we have seen, the symplectic group is the set of invertible 2d 2d matrices preserving the standard symplectic (skew-symmetric) form !W R2d R2d ! R given by the matrix J defined in (2.2). Exercise 2.103. Show that an invertible matrix g satisfies !.gx; gy/ D !.x; y/ for every x; y 2 R2d if and only if t gJg D J. Recall that, under the usual identifications, the Lie algebra sp.d; R/ of Sp.d; R/ is explicitly described in (2.15) and (2.16). It is worth observing that from these it follows immediately that X 2 sp.d; R/ if and only if t X 2 sp.d; R/. In particular, t we obtain that both JX and XJ are symmetric. Indeed, .JX/ D t X t J D t XJ D JX t t by XJ C JX D 0. Similarly, one uses X 2 sp.d; R/ to show that XJ is symmetric. The symplectic group has many interesting subgroups. Among many others, the compact group K D Sp.d; R/ \ O.2d/;
which was implicitly introduced in the beginning of Section 2.1.2.3, where we also describe an explicit isomorphism of K with the unitary group U(d). We formalize this:

Proposition 2.104. The group $K = \mathrm{Sp}(d,\mathbb{R}) \cap O(2d)$ is a maximal compact subgroup of $\mathrm{Sp}(d,\mathbb{R})$ and it is isomorphic to the unitary group $U(d)$ under the natural map induced on linear maps by the identification $(x,y) \mapsto x + iy$ of $\mathbb{R}^{2d}$ with $\mathbb{C}^d$.

Other very important subgroups of $\mathrm{Sp}(d,\mathbb{R})$ are
$$D = \left\{\begin{pmatrix} h & 0 \\ 0 & {}^t h^{-1}\end{pmatrix} : \det h \neq 0\right\}, \qquad
S = \left\{\begin{pmatrix} I & 0 \\ \sigma & I\end{pmatrix} : \sigma \in \mathrm{Sym}(d,\mathbb{R})\right\}, \qquad
A = \left\{\begin{pmatrix} E & 0 \\ 0 & E^{-1}\end{pmatrix} : E = \mathrm{diag}(e^{\lambda_1},\dots,e^{\lambda_d}),\ \lambda_j \in \mathbb{R}\right\},$$
where evidently $\mathrm{Sym}(d,\mathbb{R})$ stands for the vector space of all symmetric $d\times d$ real matrices. Clearly, $D \simeq \mathrm{GL}(d,\mathbb{R})$. Using $D$, we can embed any (closed) subgroup of $\mathrm{GL}(d,\mathbb{R})$ into $\mathrm{Sp}(d,\mathbb{R})$ in a canonical fashion, for example the special linear group $\mathrm{SL}(d,\mathbb{R})$ or the special orthogonal group $\mathrm{SO}(d)$:
$$\mathrm{SL}(d,\mathbb{R}) \hookrightarrow \left\{\begin{pmatrix} h & 0 \\ 0 & {}^t h^{-1}\end{pmatrix} : h \in \mathrm{SL}(d,\mathbb{R})\right\} \subset \mathrm{Sp}(d,\mathbb{R}), \qquad
\mathrm{SO}(d) \hookrightarrow \left\{\begin{pmatrix} h & 0 \\ 0 & {}^t h^{-1}\end{pmatrix} : h \in \mathrm{SO}(d)\right\} \subset \mathrm{Sp}(d,\mathbb{R}).$$
Exercise 2.105. Show that D, A, and S are indeed closed subgroups of Sp.d; R/. Show that D normalizes S, that is, that dsd1 2 S whenever d 2 D and s 2 S. Describe explicitly the group SD D fsd W d 2 D; s 2 Sg and exhibit its semidirect product structure. This group is called the standard maximal parabolic subgroup of Sp.d; R/. Show also that ˚ S D ts W s 2 S is a subgroup of Sp.d; R/ and make the appropriate statements for SD. Prove that SD and SD are conjugate. Proposition 2.106. The symplectic group is generated by S [ D [ fJg and also by S [ D [ fJg.
The meaning of the above statements is that any element in Sp.d; R/ can be written as a finite product of elements all taken from S[D[fJg, or all taken from S[D[fJg. This fact is of practical relevance, as we shall see, when dealing with the metaplectic representation.
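As a concrete illustration of the symplectic condition from Exercise 2.103 and of the normalization property from Exercise 2.105, here is a small NumPy check; the matrix J below uses one standard sign convention for the symplectic form, which is an assumption of this sketch since (2.2) is not reproduced here.

```python
import numpy as np

d = 3
rng = np.random.default_rng(1)

# matrix of the standard symplectic form (one common sign convention)
J = np.block([[np.zeros((d, d)), np.eye(d)], [-np.eye(d), np.zeros((d, d))]])

h = rng.standard_normal((d, d)) + 3.0 * np.eye(d)             # a generic invertible matrix
sigma = rng.standard_normal((d, d)); sigma = sigma + sigma.T   # a symmetric matrix

D_el = np.block([[h, np.zeros((d, d))], [np.zeros((d, d)), np.linalg.inv(h).T]])
S_el = np.block([[np.eye(d), np.zeros((d, d))], [sigma, np.eye(d)]])

# Exercise 2.103: both block matrices satisfy g^T J g = J, hence lie in Sp(d, R)
print(np.allclose(D_el.T @ J @ D_el, J), np.allclose(S_el.T @ J @ S_el, J))

# Exercise 2.105: D normalizes S -- the conjugate is again unipotent with a symmetric lower-left block
conj = D_el @ S_el @ np.linalg.inv(D_el)
blk = conj[d:, :d]
print(np.allclose(conj[:d, :d], np.eye(d)), np.allclose(conj[:d, d:], 0.0), np.allclose(blk, blk.T))
```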
2.3.2 Construction of the Metaplectic Representation We now proceed to define the metaplectic representation abstractly. Later we indicate how one can go about clearing out all the unsettled points. Recall that the group T of automorphisms of the Heisenberg group that leave the center pointwise fixed is generated by the inner automorphisms and by the symplectic maps. More precisely, the maps ih with h 2 Hd and Tg with g 2 Sp.d; R/ given by i.x;y;z/ .q; p; t/ D .x; y; z/.q; p; t/.x; y; z/1 D .q; p; t C x p y q/ Tg .q; p; t/ D .g.q; p/; t/; (where g.q; p/ means the effect of the linear map g on the vector .q; p/ 2 R2d ) generate the group T . Any element T 2 T is actually a product T D ih Tg . If T 2 T , we can precompose the Schrödinger representation with T to obtain a new representation ı T of Hd on L2 .Rd /. Indeed, if h; k 2 Hd , then . ıT/.hk/ D .T.hk// D .T.h/T.k// D .T.h//.T.k// D . ıT/.h/. ıT/.k/ shows that it is indeed a representation and furthermore (2.54) shows that . ı T/.0; 0; t/ D .T.0; 0; t// D .0; 0; t/ D e2it I: By the Stone-von Neumann theorem, and ı T must be equivalent. Hence there is a unitary operator .T/ on L2 .Rd / that intertwines the two representations: ı T.h/ D .T/.h/.T/1 ;
h 2 Hd :
(2.69)
Moreover, by Schur’s Lemma, namely item ii) of Lemma 2.54, .T/ is determined up to a phase factor because the space I .; ı T/ of all intertwining operators is one dimensional and contains a unitary operator, hence all its multiples by complex numbers of modulus one. Now, if we write (2.69) for the product TS we find that .T/.S/ does the job, so there exists cT;S 2 T such that .TS/ D cT;S .T/.S/:
This means that defines a projective representation of T , that is, a homomorphism into the quotient of the group of unitary operators modulo its center fcI W c 2 S1 g. Now, if T D ih , with h 2 Hd , then since .hgh1 / D .h/.g/.h/1 ; we can certainly take .ih / D .h/. Thus we may restrict ourselves to the subgroup of T consisting of the automorphisms Tg with g 2 Sp.d; R/, a group manifestly isomorphic to Sp.d; R/ itself. We shall write for simplicity .g/ instead of .Tg /. From the above discussion it follows that the unitary operator .g/ is determined up to a phase factor by the relation .g.q; p/; 0/ D .g/.q; p; 0/.g/1 :
(2.70)
It may be shown (see below for further comments) that the phase factors can be chosen in one and only one way up to a sign, so that becomes a double-valued unitary representation of Sp.d; R/. In other words, it holds .gh/ D ˙.g/.h/;
g; h 2 Sp.d; R/:
With this choice, is called the metaplectic representation. We should either think of as a genuine unitary representation of the double cover Mp.d/ of Sp.d; R/ or as a homomorphism of Sp.d; R/ into the quotient of the group of unitary operators modulo its center f˙Ig. Given a single g 2 Sp.d; R/, we could also think of .g/ as a pair of unitary operators that differ from each other by 1, but we shall not do so. In explicit formulas, this ambiguity usually appears as the possible choice of sign of a square root. Next we compute .g/ for certain particular types of g, up to phase factors. h 0 with h 2 GL.d; R/. Then i) Take g 2 D, that is g D 0 t h1 .g.q; p/; 0/f .x/ D .hq; t h1 p; 0/f .x/ t 1 p/
D ei.hq/. h
2iph1 x
D eiqp e
t 1 p/x
e2i. h
f .x hq/
f .h.h1 x q//:
The unitary operator on L2 .Rd / given by Uf .x/ D jdet hj1=2 f .h1 x/ satisfies U.q; p; 0/U 1 f .x/ D jdet hj1=2 ..q; p; 0/U 1 f /.h1 x/ D jdet hj1=2 ..q; p; 0/jdet hj1=2 f ı h/.h1 x/ D .q; p; 0/.f ı h/.h1 x/ 1 x
D eiqp e2iph
.f ı h/.h1 x q/:
Hence U satisfies (2.70), so itmust coincide with .g/ up to a phase factor. I 0 ii) Take now g 2 S, that is, g D with 2 Sym.d; R/. Then I .g.q; p/; 0/f .x/ D .q; q C p; 0/f .x/ D eiq.qCp/ e2i.qCp/x f .x q/ D eiqp e2ipx eiqq e2iqx f .x q/: The unitary operator on L2 .Rd / given by Vf .x/ D eixx f .x/ satisfies V.q; p; 0/V 1 f .x/ D eixx .q; p; 0/V 1 f .x/ D eixx eiqp e2ipx .V 1 f /.x q/ D eixx eiqp e2ipx ei.xq/.xq/ f .x q/ D eiqp e2ipx eiqq e2iqx f .x q/: Hence V satisfies (2.70), so it must coincide with .g/ up to a phase factor. iii) Finally, take g D J. Then .J.q; p/; 0/f . / D .p; q; 0/f . / D eiqp e2iq f . p/: For any Schwartz function f we have F .q; p; 0/F 1 f . / D
Z Z
D
e2i x .q; p; 0/.F 1 f /.x/ dx e2i x eiqp e2ipx .F 1 f /.x q/ dx
iqp
Z
De
iqp
De
Z
e2i.p /x .F 1 f /.x q/ dx e2i.p /.qCy/ .F 1 f /.y/ dy
D eiqp e2i.p /q
Z
e2i.p /y .F 1 f /.y/ dy
D eiqp e2i q f . p/: Hence F satisfies (2.70), so it must coincide with .g/ up to a phase factor. In summary, up to phase factors, we have identified .g/ on a generating set of elements (thanks to Proposition 2.106), so in principle we know up to phase factors. Anticipating what can be made rigorous, we actually have
h 0 f .x/ D ˙jdet hj1=2 f .h1 x/ 0 t h1
I 0 f .x/ D ˙eixx f .x/ I .J/ f .x/ D id=2 F f .x/:
(2.71) (2.72) (2.73)
It must be pointed out that in formula (2.73) the sign of the square root accounts exactly for the ambiguity in sign. Exercise 2.107. Show that both the even and odd parts of L2 , namely Le2 .Rd / D ff 2 L2 .Rd / W f .x/ D f .x/g Lo2 .Rd / D ff 2 L2 .Rd / W f .x/ D f .x/g; are closed invariant subspaces on each of which the action of is irreducible.
2.3.2.1
An Outline of the Full Construction
The question remains: how to define in full detail? The several nontrivial technicalities will not be unraveled completely. The construction can be summarized in the following three basic steps. i) Define first the infinitesimal representation d of the Lie algebra sp.d; R/ by densely defined unbounded and essentially skew-adjoint operators on L2 .Rd /. This step is obtained by means of the so-called Weyl calculus, realizing first sp.d; R/ as a Lie algebra of polynomials. ii) “Integrate” the representation to a representation of the universal cover of Sp.d; R/. This step uses the notion of analytic vector for a representation. iii) Show that the representation actually factors to a representation of the double cover of Sp.d; R/.
2.3.3 Restriction to Triangular Subgroups We finally present the class of examples that has been looked at in the papers [2, 3, 9]. Definition 2.108. A Lie subgroup G of Sp.d; R/ belongs to the class E if it is of the form n h 0 o GD W h 2 H; 2 ˙ ; h t h1
where H is a connected Lie subgroup of GL.d; R/ and ˙ is a subspace of the space Sym.d; R/ of d d symmetric matrices. We further require that both ˙ and H are not trivial. In order for G to be a group it is necessary and sufficient that h Œ˙ D ˙ for all h 2 H, where 1
h Œ D t h h1 :
(2.74)
If G 2 E , both ˙ and H are naturally identified as Lie subgroups of G. Clearly, ˙ H D G, ˙ \ H D feg, ˙ is a normal subgroup of G and it is invariant under the action of H given by (2.74), so that G is the semi-direct product ˙ Ì H. Take G D ˙ Ì H in the class E . A left Haar measure and the modular function of G are dg D .h/1 d dh;
G .; h/ D .h/1 H .h/;
(2.75)
where d is a Haar measure of ˙, dh is a left Haar measure of H, H is the modular function of H, and is the positive character of H given by
.h/ D jdet h j;
(2.76)
where h stands for the linear map 7! h Œ . Exercise 2.109. Prove all the statements that have been made in the previous paragraph: G is a group if and only if h Œ˙ D ˙ for all h 2 H, and the formulae (2.75) do define Haar measure and modular function. There is a surprisingly large set of groups in the class E that are relevant in the study of reproducing systems. Example 2.110. For d D 1, that is, for Sp.1; R/ D SL.2; R/, it is completely obvious that there is exactly one group in the class E , that will be denoted E1 . This group is actually a very remarkable example, namely a copy of the “ax C b” group. The map '..a; b// D
p 1= a 0 p p b= a a
establishes an isomorphism between "ax + b" and the group E_1, as one checks immediately. Observe that, for the group E_1, (2.71) is meaningful for the 1×1 positive matrix h = 1/√a (no sign ambiguity) and if one takes the plus sign in (2.72), then one gets a perfectly well-defined unitary representation on E_1, and it is not necessary to appeal either to projective representations or to coverings. As we will see, this argument holds for all groups in the class E, so that it is legitimate to speak about the metaplectic representation for groups in E, which will again be denoted by μ.
We show next that the restriction of to E1 acting on the space of even L2 functions on the line is equivalent to the subrepresentation of the wavelet representation (2.22) of “ax C b” on the Hardy space H .R/ defined in (2.27). In other words, denoting by e the former and by the latter, we will show that e ' : A crucial remark in order to prove the sought for equivalence is that one should think of as acting on Fourier transforms, that is p 2 a;b fO . / D e b a1=4 fO . a /: O for the frequency domain, L2 .R O C / for the L2 -transforms For this reason, we write R O C and L2 .R/ O for the even ones. We define the operator that are supported in R e O ! L 2 .R O C / by
C W Le2 .R/ . C gO /. / D
p
2 Œ0;C1/ . /Og. /;
which is an obvious unitary isomorphism. Next we put ( 2
2
O C / ! L .R O C /; ˚W L .R
˚.Og/. / D
p .2 /1=4 gO . 2 /
0
0
< 0:
Clearly, ˚ is unitary as well. Finally, we denote by R the reflection Rf .x/ D f .x/. Since R commutes with F , it sends HC .R/ unitarily onto H .R/, and vice versa. We show next that the unitary map O ! H .R/; TW Le2 .R/
T D R ı F 1 ı ˚ ı C
O it holds intertwines e and , that is, for every fO 2 Le2 .R/ T.ea;b fO / D a;b .T fO /:
(2.77)
instead of Here we have written for short ea;b in place of e .'.a; b// and a;b .a; b/. Applying the definition, and reflection on both sides, (2.77) is equivalent to
R F 1 ˚ C fO F 1 ˚ C .ea;b fO / D Ra;b which, after Fourier transformation, is in turn equivalent to h i R F 1 ˚ C fO . / ˚ C .ea;b fO /. / D F Ra;b
a.e. 2 R:
(2.78)
Next we observe that Ra;b R sends each H˙ .R/ into itself and satisfies, for ' 2 L2 .R/, 1 Ra;b R'.x/ D p ' a
xCb a
D a;b '.x/:
Therefore, using this and (2.25), the right-hand side of (2.78) becomes h 1 i p F a;b F ˚ C fO . / D ae2ib .˚ C fO /.a / p p p D 2 ae2ib .2a /1=4 fO . 2a / for 0, and vanishes for < 0. The left-hand side of (2.78) is p
p 2 p p p p 2.2 /1=4 ea;b fO . 2 / D 2.2 /1=4 a1=4 eib. 2 / fO . a 2 /
for 0, and vanishes for < 0. This establishes (2.77). As already remarked in the example that we have just discussed, for the groups in E it is perfectly legitimate to speak about as a bona fide representation. The reason is that if one takes the plus sign in (2.72) then 1
.;h/ f .x/ D .det h/ 2 ei xx f .h1 x/
(2.79)
is indeed a unitary representation. We illustrate next a remarkable structural feature of the groups in E and present a general geometric result.
2.3.3.1
The Symbol
The restriction of the metaplectic representation to G 2 E is completely O d , the map characterized by a “symbol” ˚, as we now explain. Given4 2 R 1 7! 2 is a linear functional on ˙ and hence it defines a unique element ˚. / 2 ˙ , the dual of ˙, by the requirement that 1 ˚. /. / D
2 O d ! ˙ is called the symbol for all 2 ˙ . The corresponding function ˚W R associated with ˙ and has the invariance property (2.81) that we now explain.
The reason for the symbol instead of x rests in the fact that we should think of as acting in the frequency domain.
4
Observe first that the contragredient action of (2.74) is given, for 2 ˙ , 2 ˙, and h 2 H, by hŒ . / D ..h1 / Œ / D .t h h/;
(2.80)
O d by means of Next, notice that since H GL.d; R/, it acts naturally on R h: D h : O d and h 2 H The invariance property that we are after is that for all 2 R ˚.h: / D hŒ˚. /:
(2.81)
This is seen by observing that for all 2 ˙ we have, by (2.80) 1 ˚.h: /. / D t h h D ˚. /..h1 / Œ / D hŒ˚. /. /: 2 Therefore, for 2 ˙ and h 2 H, we may rewrite 1
.;h/ f . / D .det h/ 2 e2i˚. /./ f .h1 /;
(2.82)
which exhibits as completely determined by the symbol. Also, this proves that is a representation of the kind considered in [11], with a quadratic symbol.
2.3.3.2
Geometric Characterization
The next result gives some necessary and sufficient conditions for a group G ∈ E to be reproducing. In order to state it, we need to introduce some standard terminology. First, the set H[y] = {h[y] ∈ R^n : h ∈ H} is called the orbit of H in R^n, where we have identified Σ* with R^n and where evidently n = dim Σ. Secondly, the closed Lie subgroup of H defined by H_y = {h ∈ H : h[y] = y} is called the stability subgroup at y ∈ R^n. Thirdly, we recall that a subset A of a topological space X is locally closed if it is the intersection of an open and a closed set of X. Equivalently, if each point in A has an open neighborhood U ⊂ X such that A is closed in U (with the subspace topology), or, yet equivalently, if A is open in its closure (with the subspace topology). Finally, let DΦ(ξ) denote the n × d Jacobian matrix
$$D\Phi(\xi) = \frac{\partial(\varphi_1,\dots,\varphi_n)}{\partial(\xi_1,\dots,\xi_d)}$$
p and by J˚. / D det D˚. /t D˚. / the Jacobian determinant. With this notation, the critical points of ˚ are precisely the solutions of the equation J˚. / D 0. Theorem 2.111 ([3, 11]). Take G D ˙ Ì H 2 E and assume that the orbit HŒy is locally closed in Rn for every y 2 Rn . If G is a reproducing group, then i) G is non-unimodular; ii) dim ˙ d; O d , has iii) the set of critical points of ˚, which is an H-invariant closed subset of R zero Lebesgue measure. Furthermore, if dim ˙ D d, then O d / the stability subgroup Hy is compact. iv) for almost every y 2 ˚.R Conversely, if i), ii) iii), and iv) (without assuming dim ˙ D d) hold true, then G is reproducing. If we look at what happens when d D 2, we find that there is a wealth of reproducing systems arising from groups in E : there are in fact 16 families of inequivalent reproducing groups. For a full discussion, the reader is referred to [2, 3], where these groups are classified up to the appropriate notion of equivalence and where the admissibility conditions analogous to the Calderón equation (2.29) are derived. Here we content ourselves with two examples that underline why this class is interesting. Example 2.112 (Shearlets). The Heisenberg group can be realized inside Sp.2; R/ in a slightly different way from (2.47), which entails a direct realization of the standard product given in (2.46). By this we mean that the latter can be read off from the matrix product of the matrices (2.47), and conversely. There is, however, another standard realization of the Heisenberg group, known as the polarized version H1pol , that yields an embedding in the symplectic group as an element of the class E . The polarized Heisenberg group may be seen as the set of the 3 3 matrices 3 2 1p t gpol .q; p; t/ D 40 1 q5 ; 001 which obey the product rule gpol .q; p; t/gpol .q0 ; p0 ; t0 / D gpol .q C q0 ; p C p0 ; t C t0 C pq0 /: Exercise 2.113. Check that (2.83) is true and prove that 'W H1 ! H1pol ;
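This product rule is easy to confirm numerically; a minimal NumPy sketch with illustrative random entries follows.

```python
import numpy as np

def g_pol(q, p, t):
    # polarized Heisenberg element as an upper triangular 3x3 matrix
    return np.array([[1.0, p, t],
                     [0.0, 1.0, q],
                     [0.0, 0.0, 1.0]])

rng = np.random.default_rng(0)
q, p, t, q2, p2, t2 = rng.standard_normal(6)

lhs = g_pol(q, p, t) @ g_pol(q2, p2, t2)
rhs = g_pol(q + q2, p + p2, t + t2 + p * q2)
print(np.allclose(lhs, rhs))      # True: the product rule (2.83)
```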
'.g.q; p; t// D gpol .q; p; t C
qp / 2
(2.83)
is a group isomorphism. Define Hdpol and extend all the above to general d. Next we show how to see H1pol as a group in E . For this purpose, put ˙D
t q=2 W t; q 2 R ; q=2 0
HD
10 Wp2R : p1
Clearly h Œ˙ D ˙ so that we have a group in E . The elements e.q; p; t/ D
h 0 h t h1
2
3 1 0 0 0 6 p 1 0 07 7 D6 4t qp=2 q=2 1 p5 q=2 0 0 1
satisfy the product rule e.q; p; t/e.q0 ; p0 ; t0 / D e.q C q0 ; p C p0 ; t C t0 C pq0 /; as required to see that this is an embedding of the polarized group. Finally, we show that the extended (polarized) Heisenberg group (see Exercise 2.90) can also be seen as an element in the class E . This is achieved by extending H to p 1= a 0 p W a > 0; p 2 R : He D p= a 1 By doing so, one gets the matrices 2
p 0 1= a p 6 1 p= a e.a; q; p; t/ D 6 4t=pa qp=2pa q=2 p 0 q=2 a
3 0 0 0 07 7 p a p5 0 1
(2.84)
that satisfy the product rule e.a; q; p; t/e.a0 ; q0 ; p0 ; t/ D e.aa0 ; q C
p 0 p p aq ; p C ap0 ; t C at0 C apq0 /:
For brevity, we shall denote by H1e D ˙ ÌHe the group consisting of all the elements in the form (2.84). Evidently, H1e 2 E . Observe that e.a; 0; 0; 0/e.1; q; p; t/e.a; 0; 0; 0/1 D e.1;
p
aq;
p
ap; at/;
so that we are indeed extending the Heisenberg group via its homogeneous dilations. We now compute the various ingredients that are needed. First of all, we write
.t;q/
t q=2 D q=2 0
thereby identifying ˙ with R2 . The elements of ˙ are in turn identified with vectors in R2 in the sense that to y 2 R2 there corresponds the functional y for which y ..t;q/ / D y .t; q/: Secondly, ha;p Œ.t;q/ D .t ha;p /1 .t;q/ h1 a;p p p a p a p t q=2 D 0 1 0 1 q=2 0 p p at C apq q a=2 p : D 0 q a=2 Thus, h Œ.t;q/ is the element of ˙ associated with the vector: p t ap a t p WD 2 R2 : Ma;p q q 0 a
This means that the action ha;p is expressed in R2 by the matrix Ma;p . Therefore, the positive character defined in (2.76) is the determinant of Ma;p , namely
.ha;p / D a3=2 : By (2.75), we then have that H1e .e.a; q; p; t// D a3=2 H .ha;p / and since H .ha;p / D a1 , as it is easily checked computing, for example, the adjoint representation of H on its Lie algebra, we infer that G is not unimodular. 1 Next, since the contragredient action y 7! hŒy corresponds to t Ma;p , we have y1 1=a 0 y1 =a y1 p p : D D ha;p p=a 1= a y2 y2 py1 =a C y2 = a
(2.85)
We now compute the symbol ˚. Since 1 1 1 1 t q=2 1 h.t;q/ ; i D 1 2 D . 12 ; 1 2 / .t; q/;
2 q=2 0 2 2 2 2
Fig. 2.1 The five orbits of H.
the symbol of H1e is 1 1 ˚. 1 ; 2 / D . 12 ; 1 2 /: 2 2
(2.86)
The Jacobian is easily computed to be J˚. 1 ; 2 / D
1 2
q . 12 1 2 /. 12 C 1 2 /
and its zero set is f. 1 ; 2 / 2 R2 W 1 D 0g [ f. 1 ; 2 / 2 R2 W 1 D ˙ 2 g; a set of Lebesgue measure zero. The image of ˚ is the closed left half plane. It is readily seen that the action (2.85) has five orbits in R2 : the origin, the two half lines f.0; y2 / W y2 > 0g and f.0; y2 / W y2 < 0g and finally the two half spaces f.y1 ; y2 / W y1 < 0g and f.y1 ; y2 / W y1 > 0g. Each of these sets is locally closed because it is the intersection of a closed and an open set in the plane (Fig. 2.1). Formula (2.85) also allows us to compute, for every y 2 ˚.R2 / the stability group Hy , which is evidently the identity matrix, hence compact, because ha;p
y1 y D 1 ” a D 1; y2 y2
p D 0 ” ha;p D I2 :
Therefore, all the hypotheses of Theorem 2.111 are satisfied and we may conclude that the restriction5 of to H1e , which by (2.79) or (2.82) is just 2
.a; q; p; t/fO . 1 ; 2 / D a1=4 ei.t;q/. 1 ; 1 2 / fO
5
p
a 1 ; 2 p 1 ;
Once more, we think of the metaplectic representation as acting on the frequency side.
(2.87)
is reproducing. This fact had been established directly in [8] (that is, without the use of Theorem 2.111) and can also be found in [3]. In both papers, Calderón equations for the admissible vectors are also worked out. A very remarkable fact, one which is particularly relevant for this book, is stated in Theorem 2.114 below, which is contained in [12]. For the reader's convenience, we include here a direct argument. First of all, we introduce the connected shearlet group S⁺ that will be discussed in detail in the other chapters of this book, to which we refer for historical and bibliographical information. It is the set $\mathbb{R}^+ \times \mathbb{R} \times \mathbb{R}^2$ endowed with the group operation
$$(a,s,t)(a',s',t') = \big(aa',\, s + a^{1/2}s',\, t + S_sA_a t'\big), \qquad \text{where} \quad
S_s = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix}, \quad
A_a = \begin{pmatrix} a & 0 \\ 0 & \sqrt{a} \end{pmatrix}, \quad
S_sA_a = \begin{pmatrix} a & \sqrt{a}\,s \\ 0 & \sqrt{a} \end{pmatrix}.$$
We can thus write the group law more explicitly as
$$(a,s,t)(a',s',t') = \big(aa',\, s + a^{1/2}s',\, t_1 + a\,t_1' + \sqrt{a}\,s\,t_2',\, t_2 + \sqrt{a}\,t_2'\big).$$
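The matrix identity $S_sA_a\,S_{s'}A_{a'} = S_{s+\sqrt{a}\,s'}A_{aa'}$ that underlies this law can be confirmed directly; here is a short NumPy sketch with illustrative parameter values.

```python
import numpy as np

Ss = lambda s: np.array([[1.0, s], [0.0, 1.0]])            # shear
Aa = lambda a: np.array([[a, 0.0], [0.0, np.sqrt(a)]])     # parabolic dilation

a, s, a2, s2 = 2.0, 0.7, 0.5, -1.3
lhs = Ss(s) @ Aa(a) @ Ss(s2) @ Aa(a2)
rhs = Ss(s + np.sqrt(a) * s2) @ Aa(a * a2)
print(np.allclose(lhs, rhs))    # True: shears and dilations compose according to the group law
```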
The map W SC ! H1e ;
.a; s; t1 ; t2 / 7! e.a; t2 ; s; t1 /;
which amounts to renaming the variables according to p
! s;
t1
! t;
! q;
t2
(2.88)
establishes a Lie group isomorphism. The shearlet representation on the frequency O 2 / is defined by: domain, that is, on L2 .R O .a;s;t/ . 1 ; 2 / D a3=4 e2it O .a 1 ;
p
a.s 1 C 2 //:
Renaming the variables after (2.88), this becomes O .a;q;p;t/ . 1 ; 2 / D a3=4 e2i.t;q/. 1 ; 2 / O .a 1 ;
p
a.p 1 C 2 //:
(2.89)
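The composition property of the frequency-side shearlet action displayed above (in the (a, s, t) parametrization) can be tested numerically. In the sketch below the test function and the group elements are arbitrary choices of this sketch, and the check is insensitive to the overall sign convention used in the phase factor.

```python
import numpy as np

def rep(a, s, t, F):
    # frequency-side shearlet action:
    # (pi_(a,s,t) F)(xi1, xi2) = a^{3/4} e^{-2 pi i t.xi} F(a xi1, sqrt(a)(s xi1 + xi2))
    def G(xi1, xi2):
        phase = np.exp(-2j * np.pi * (t[0] * xi1 + t[1] * xi2))
        return a ** 0.75 * phase * F(a * xi1, np.sqrt(a) * (s * xi1 + xi2))
    return G

psi_hat = lambda u, v: np.exp(-(u - 1.0) ** 2 - u * v - v ** 2) * (1.0 + 1j * u)   # arbitrary test function

a1, s1, t1 = 2.0, 0.7, np.array([0.3, -1.1])
a2, s2, t2 = 0.5, -1.3, np.array([0.9, 0.4])

SsAa = np.array([[a1, np.sqrt(a1) * s1], [0.0, np.sqrt(a1)]])
prod = (a1 * a2, s1 + np.sqrt(a1) * s2, t1 + SsAa @ t2)       # group product (a1,s1,t1)(a2,s2,t2)

xi1, xi2 = np.meshgrid(np.linspace(-2.0, 2.0, 41), np.linspace(-2.0, 2.0, 41))
lhs = rep(a1, s1, t1, rep(a2, s2, t2, psi_hat))(xi1, xi2)     # pi_{g1}(pi_{g2} psi_hat)
rhs = rep(*prod, psi_hat)(xi1, xi2)                           # pi_{g1 g2} psi_hat
print(np.allclose(lhs, rhs))                                   # True: the action respects the group law
```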
Before we proceed, a remark on irreducibility and equivalence is in order. For simplicity, put O 2 W 1 > 0g; ˝C D f. 1 ; 2 / 2 R
O 2 W 1 < 0g: ˝ D f. 1 ; 2 / 2 R
It is clear that L2 .˝C / and L2 .˝ / are closed invariant spaces both for the shearlet representation and for the metaplectic representation (2.87). The subrepresentations ˙ of obtained by restriction to L2 .˝˙ / are mutually equivalent, because the reflection R fO . 1 ; 2 / D fO . 1 ; 2 / sends one space unitarily into the other, and intertwines the two subrepresentations, as it is evident from (2.87). Indeed, the p mapping . 1 ; 2 / 7! . a 1 ; 2 p 1 / commutes with sign change of . 1 ; 2 / and the phase factor exp.i.t; q/ . 12 ; 1 2 // is invariant under it. Further, it may be shown that C is in fact irreducible. We may thus write ' C ˚ C , in the sense that, when restricted to H1e , is equivalent to two copies of the same irreducible representation. The two restrictions of the shearlet representation to L2 .˝˙ /, however, are not mutually equivalent. This may be seen by observing that any intertwining operator T must commute with all the operators associated with the group elements .1; q; 0; t/, that is, with multiplication by e2i.t;q/. 1 ; 2 / . Using the multiplier theorem for L2 .R2 /, one sees that (after extension to the whole of L2 and Fourier transform) T must be given by convolution with a tempered distribution. This leads to a contradiction because it actually implies that T must be zero. Theorem 2.114 ([12]). The subrepresentation obtained by restricting the metaplectic representation of the extended Heisenberg group H1e to the right half frequency plane is equivalent to the subrepresentation obtained by restricting the shearlet representation of the connected shearlet group to the left half frequency plane. Proof. We describe explicitly the intertwining operator L that realizes the unitary equivalence between the restriction to L2 .˝ / of the shearlet representation with the restriction to L2 .˝C / of . The interesting fact is that L is essentially constructed by means of the symbol ˚W ˝C ! ˝ defined in (2.86), a diffeomorphism with Jacobian determinant 12 =2. Define L W L2 .˝ / ! L2 .˝C /;
1 O L '. / O D p '.˚. //: 2
For any 'O 2 L2 .˝ /, we have Z
Z
2
˝C
jL '. /j O d D Z
1 2 O j p '.˚. //j d
2 ˝C
D ˝C
2 j'.˚. //j O
12 d D 2
Z ˝
2 j'. /j O d ;
so that L is an isometry. It is not hard to show that L is in fact surjective. Using the definition (2.89) we have
1 L O .a;q;p;t/ . 1 ; 2 / D p O .a;q;p;t/ .˚. // 2 2 1 2
1 D p O .a;q;p;t/ 1 ; 2 2 2
1 2 D p a3=4 ei.t;q/. 1 ; 1 2 / O 2
12
12 p
1 2 a ; a.p C / ; 2 2 2
and using (2.87) we have .a; q; p; t/.L O /. 1 ; 2 /
p 2 D a1=4 ei.t;q/. 1 ; 1 2 / .L O / a 1 ; 2 p 1 p
a 12 1 p a 1 O 1=4 i.t;q/. 12 ; 1 2 / 2 Da e ; . a. 1 2 p 1 / : p 2 2 2
This establishes the intertwining property L O .a;q;p;t/ D .a; q; p; t/.L O / and the proof of Theorem 2.114. Example 2.115 (Schrödingerlets). This is an example of a 3-dimensional reproO 2 /. The group G consists of the matrices ducing systems in L2 .R a1=2 R 0 ; ta1=2 R a1=2 R
t 2 R; a > 0; R 2 SO.2/:
The rotations in SO.2/ are parametrized in the standard way, namely cos sin ; 2 Œ0; 2/: R' D sin cos Therefore G is in the class E , with ˙ D ftI2 W t 2 Rg and where H is the Abelian group RC SO.2/. En passant, here d D 2 and n D 1. The metaplectic representation restricted to G, thought in the frequency domain, is given by formula (2.79), namely 2 .t; a; /fO . / D a1=2 eitk k fO .a1=2 R /;
O 2 /: fO 2 L2 .R
The space domain version of this representation explains the reason of the name that is used for the admissible vectors relative to this group. Denote by Q the unitary representation obtained by conjugating with the Fourier transform, that is .g/f Q D F 1 ı .g/ ı F :
We interpret t 2 R as a time parameter and look at the evolution flow of a function in the space domain f 2 L1 .R2 / \ L2 .R2 /, given by Z Q 1; 0/f .x/ D .t; x/ 7! Q t f .x/ D .t;
RO2
2 fO . /eitk k e2ix d :
It is straightforward to verify that the flow Q t f .x/ satisfies the Schrödinger equation @ 4i C Q t f .x/ D 0; @t where is the spatial Laplacian D
@2 @2 C : @x12 @x22
For this reason the admissible vectors of G are called Schrödingerlets. In summary, the system of unitary operators attached to G is generated by rotations, by dilations, and by the evolution flow of the Schrödinger operator. We show next that restricted to G is indeed reproducing. First of all, writing ha; D a1=2 R , we have
1 ha; Œt D t h1 a; t ha; D taI2 D at :
Thus, the action ha; is expressed in R as multiplication by a, so that the contragredient action y 7! ha; Œy amounts to multiplication by a1 . It follows that the positive character defined in (2.76) is
.ha; / D a: By (2.75), taking into account that H, being Abelian, is unimodular we have that G .a; ; t// D a1 ; so that G itself is not unimodular. As for the symbol ˚W R2 ! R, since 1 1 t D k k2 t; 2 2 it is the mapping given by 1 ˚. / D k k2 : 2
Its image is the closed left half line .1; 0 and its Jacobian is J˚. 1 ; 2 / D k k; whose zero set is just the origin of R2 , obviously of Lebesgue measure zero. The action y 7! hŒy has three orbits in R: the origin and the two half lines .1; 0/ and .0; C1/, each of which is locally closed. For every y < 0, hence almost everywhere in ˚.R2 / D .1; 0, the stability group Hy is the identity matrix, hence compact. Therefore, all the hypotheses of Theorem 2.111 are satisfied and G is reproducing. The reader is referred to [3] for the Calderón equations for the admissible vectors.
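As a small complement, one can verify symbolically that each plane-wave mode of the evolution flow written above solves the Schrödinger equation; the sign of the quadratic phase below is the convention for which the equation as stated holds (an assumption of this sketch, since the signs are not fixed here), and the general statement follows by superposition under the integral.

```python
import sympy as sp

t, x1, x2, xi1, xi2 = sp.symbols('t x1 x2 xi1 xi2', real=True)

# one plane-wave mode of the flow
u = sp.exp(-sp.I * sp.pi * t * (xi1**2 + xi2**2) + 2 * sp.pi * sp.I * (x1 * xi1 + x2 * xi2))

schrodinger = 4 * sp.pi * sp.I * sp.diff(u, t) + sp.diff(u, x1, 2) + sp.diff(u, x2, 2)
print(sp.simplify(schrodinger))    # 0: (4 pi i d/dt + Laplacian) u = 0
```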
References 1. Ali, S.T., Antoine, J.-P., Gazeau, J.-P.: Coherent states, wavelets, and their generalizations. In: Theoretical and Mathematical Physics, 2nd edn. Springer, New York (2014) 2. Alberti, G.S., Balletti, L., De Mari, F., De Vito, E.: Reproducing subgroups of Sp.2; R/. Part I: Algebraic classification. J. Fourier Anal. Appl. 19(4), 651–682 (2013) 3. Alberti, G.S., De Mari, F., De Vito, E., Mantovani, L.: Reproducing subgroups of Sp.2; R/. Part II: admissible vectors. Monatsh. Math. 173(3), 261–307 (2014) 4. Berndt, R., Schmidt, R.: Elements of the representation theory of the Jacobi group. In: Modern Birkhäuser Classics. Birkhäuser/Springer Basel AG, Basel (1998) 5. Bourbaki, N.: Integration. I. Chapters 1–6. Elements of Mathematics (Berlin). Springer, Berlin (2004) 6. Cordero, E., De Mari, F., Nowak, K., Tabacco, A.: Analytic features of reproducing groups for the metaplectic representation. J. Fourier Anal. Appl. 12(2), 157–180 (2006) 7. Cordero, E., De Mari, F., Nowak, K., Tabacco, A.: Reproducing groups for the metaplectic representation. In: Pseudo-Differential Operators and Related Topics. Oper. Theory Adv. Appl., vol. 164, pp. 227–244. Birkhäuser, Basel (2006) 8. Cordero, E., De Mari, F., Nowak, K., Tabacco, A.: Dimensional upper bounds for admissible subgroups for the metaplectic representation. Math. Nachr. 283(7), 982–993 (2010) 9. Cordero, E., Tabacco, A.: Triangular subgroups of Sp.d; R/ and reproducing formulae. J. Funct. Anal. 264(9), 2034–2058 (2013) 10. Daubechies, I.: Ten lectures on wavelets. In: CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 61. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA (1992) 11. De Mari, F., De Vito, E.: Admissible vectors for mock metaplectic representations. Appl. Comput. Harmon. Anal. 34(2), 163–200 (2013) 12. De Mari, F., De Vito, E., Dahlke, S., Häuser, S., Steidl, G., Teschke, G.: Different faces of the shearlet group. J. Geom. Anal. (2015) available online DOI 10.1007/s12220-015-9605-7 13. De Mari, F., Nowak, K.: Analysis of the affine transformations of the time-frequency plane. Bull. Aust. Math. Soc. 63(2), 195–218 (2001) 14. De Mari, F., Nowak, K.: Canonical subgroups of H1 Ì SL.2; R/. Boll. U.M.I. Sez. B 5(8), 405–430 (2002) 15. Duflo, M., Moore, C.C.: On the regular representation of a nonunimodular locally compact group. J. Funct. Anal. 21(2), 209–243 (1976) 16. Folland, G.B.: Harmonic analysis in phase space. In: Annals of Mathematics Studies, vol. 122. Princeton University Press, Princeton, NJ (1989) 17. Folland, G.B.: A course in abstract harmonic analysis. In: Studies in Advanced Mathematics. CRC Press, Boca Raton, FL (1995)
18. Folland, G.B., Sitaram, A.: The uncertainty principle: a mathematical survey, J. Fourier Anal. Appl. 3(3), 207–238 (1997) 19. Führ, H.: Abstract harmonic analysis of continuous wavelet transforms vol. 1863 of Lecture Notes in Mathematics. Springer, Berlin (2005) 20. Grafakos, L.: Classical and Modern Fourier Aanalysis. Pearson Education, Upper Saddle River, NJ (2004) 21. Gröchenig, K.: Foundations of time-frequency analysis. In: Applied and Numerical Harmonic Analysis. Birkhäuser Boston, Boston, MA (2001) 22. Grossmann, A., Morlet, J., Paul, T.: Transforms associated to square integrable group representations. I. General results. J. Math. Phys. 26(10), 2473–2479 (1985) 23. Helgason, S.: Differential Geometry, Lie Groups and Symmetric Spaces. Academic Press, New York (1978) 24. Kaniuth, E., Taylor, K. F.: Induced representations of locally compact groups. In: Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge (2013) 25. Knapp, A.W.: Representation theory of semisimple groups. In: Princeton Mathematical Series, vol. 36. Princeton University Press, Princeton, NJ (1986). An overview based on examples 26. Knapp, A.W.: Lie groups beyond an introduction. In: Progress in Mathematics, vol. 140. Birkhäuser Boston, Boston, MA (1996) 27. Lion, G., Vergne, M.: The Weil representation, Maslov index and theta series. In: Progress in Mathematics, vol. 6. Birkhäuser, Boston, MA (1980) 28. Mallat, S.: A Wavelet Tour of Signal Processing, 3rd edn. Elsevier/Academic Press, Amsterdam (2009). The sparse way, with contributions from Gabriel Peyré 29. Rudin, W.: Functional Analysis, 2nd edn. In: International Series in Pure and Applied Mathematics. McGraw-Hill, New York (1991) 30. Varadarajan, V.S.: Lie groups, Lie algebras, and their representations. In: Graduate Texts in Mathematics, vol. 102. Springer, New York (1984). Reprint of the 1974 edition 31. Warner, F.W.: Foundations of differentiable manifolds and Lie groups. In: Graduate Texts in Mathematics, vol. 94. Springer, New York-Berlin (1983)
Chapter 3
Shearlet Coorbit Theory Stephan Dahlke, Sören Häuser, Gabriele Steidl, and Gerd Teschke
Abstract In this chapter, we will provide a comprehensive overview of shearlet coorbit theory. We will present an almost self-contained introduction into coorbit theory which is the basis for all our investigations. We also discuss the group theoretical background of the continuous shearlet transform, and we explain how the shearlet transform can be combined with coorbit theory. By proceeding this way we can establish new smoothness spaces, the shearlet coorbit spaces. The structure of these spaces will be discussed in detail. In particular, we derive density, embedding, and trace results for the shearlet coorbit spaces.
3.1 Introduction

This chapter is concerned with group theoretical aspects of the continuous shearlet transform. Shearlets are quite recently developed affine representation systems for the analysis of signals which are based on translations, shearings, and anisotropic (parabolic) dilations. They have in particular been designed for the analysis of directional information. The detection of directional information is currently a very hot topic, and standard (isotropic) systems such as wavelets are known to perform suboptimally. In recent studies, it has turned out that anisotropic representation systems like shearlets can provide approximation schemes with certain optimal
properties. In this chapter, we will not give a detailed description of the shearlet concept and the associated approximation and detection algorithms, respectively; for that, we refer to the monograph [29] and to Chapter 4 of this book. Instead, we will focus on an additional aspect of the shearlet approach, and this is group theory. Among all the recent directional representation systems such as curvelets [5], contourlets [14], ridgelets [4], etc. the shearlet system stands out since it is related to a unitary, square integrable representation of a certain group, the full shearlet group S. This implies that all the powerful tools that have been provided within the realm of square integrable group representations are immediately available for the continuous shearlet transform! Moreover, the relation to group theory provides us with a link to another very important concept, namely coorbit space theory which has been introduced by Feichtinger and Gröchenig in a series of papers [15–18]. Coorbit space theory can be used to design new smoothness spaces associated with a square integrable group representation, in the sense that the smoothness of a function is measured by the decay of the associated voice transform. Moreover, coorbit theory provides a systematic way to discretize these spaces, that is, to construct (Banach) frames for them. For classical transforms such as the wavelet or the Gabor transform, this approach yields the Besov and modulation spaces, respectively. In recent studies, it has been shown that the coorbit concept can also be applied to the shearlet setting, and this yields completely new regularity spaces, the shearlet coorbit spaces. The aim of this chapter is to present a detailed and consistent description of all these relationships. We will explain the group theoretical background, establish the shearlet coorbit spaces, and discuss their structural properties, i.e., we will show density, embedding and trace results, respectively. This chapter is organized as follows: In Section 3.2 we recall the basic concepts of coorbit space theory as far as it is needed for our purposes. In Subsection 3.2.1 we introduce the necessary building blocks to establish the existence of these spaces. Moreover, in Subsection 3.2.2 we describe how atomic decompositions and Banach frames for these spaces can be constructed. Then, in Section 3.3, we present the group theoretical background of the continuous shearlet transform. We introduce the full shearlet group, discuss its properties, and study the representation which is related to the continuous shearlet transform. Then, in Section 3.4, we combine the shearlet approach with coorbit theory. In Subsection 3.4.1, we show that all the integrability conditions that are needed within the coorbit space theory can be satisfied in the shearlet case, so that the shearlet coorbit spaces can be established. Then, in Subsection 3.4.2, we show how Banach frames for these new smoothness spaces can be constructed. Finally, in Section 3.5, we discuss the structural properties of the shearlet coorbit spaces. First, in Subsection 3.5.1, we show that a natural subset of the space of Schwartz functions is dense in all shearlet coorbit spaces. Then, in Subsection 3.5.2, we study the relationships of shearlet coorbit spaces to the well-known homogeneous Besov spaces. It turns out that for specific classes of weights the shearlet coorbit spaces can be embedded into (sums of) homogeneous Besov spaces. 
Moreover, in Subsection 3.5.3 we discuss trace results of shearlet coorbit spaces on R3 onto two-dimensional hyperplanes.
It follows that the trace of a shearlet coorbit space is either contained in (the sum of) lower-dimensional shearlet coorbit spaces or in (the sum of) homogeneous Besov spaces. The proofs are based on the concept of coorbit molecules introduced in [24].
3.2 Coorbit Space Theory In this section, we want to recall the basic facts concerning the coorbit theory as developed by Feichtinger and Gröchenig in a series of papers [15–18]. This theory is based on square integrable group representations and has the following important advantages: • The theory is universal in the following sense: Given a Hilbert space H , a square integrable representation of a group G and a non-empty set of so-called analyzing functions, the whole abstract machinery can be applied. • The approach provides us with natural families of smoothness spaces, the coorbit spaces. Roughly speaking they are defined as the collection of all elements in the Hilbert space H for which the voice transform associated with the group representation has a certain decay. In many cases, e.g., for the affine group and the Weyl-Heisenberg group, these coorbit spaces coincide with classical smoothness spaces such as Besov and modulation spaces, respectively. • The Feichtinger-Gröchenig theory does not only give rise to Hilbert frames in H , but also to frames in scales of the associated coorbit spaces. Moreover, not only Hilbert spaces, but also Banach spaces can be handled. • The discretization process that produces the frame does not take place in H (which might look ugly and complicated), but on the topological group at hand (which is usually a more handy object), and is transported to H by the group representation.
3.2.1 Coorbit Spaces

Let $G$ be a locally compact topological group with left Haar measure $dg$. The left and right translations of functions $F$ on $G$ by $g \in G$ are defined as
\[ L_g F(h) := F(g^{-1}h), \qquad R_g F(h) := F(hg), \]
respectively, see Exercise 2.37. Let $\pi$ be a square integrable representation of $G$ in a Hilbert space $\mathcal H$ with admissible vector $\psi$, see Section 2.1.4. In particular, $\pi$ is assumed to be irreducible. Let
\[ V_\psi : \mathcal H \to L^2(G, dg), \qquad f \mapsto \langle f, \pi(\cdot)\psi\rangle_{\mathcal H}, \]
denote the voice transform. By Proposition 2.56 ii) we have for all $g \in G$ the intertwining property
\[ L_g (V_\psi f) = V_\psi(\pi(g)f). \tag{3.1} \]
By the famous result of Duflo and Moore, cf. Theorem 2.62, the admissibility condition
\[ V_\psi \psi \in L^2(G) \tag{3.2} \]
implies that there exists a positive, densely defined self-adjoint operator $A$ on $\mathcal H$ such that
\[ \int_G \langle f_1, \pi(g)\psi_1\rangle\,\langle \pi(g)\psi_2, f_2\rangle\, dg = \int_G V_{\psi_1}(f_1)(g)\,\overline{V_{\psi_2}(f_2)(g)}\, dg = \langle A\psi_2, A\psi_1\rangle\,\langle f_1, f_2\rangle \tag{3.3} \]
for all $f_1, f_2 \in \mathcal H$ and $\psi_1, \psi_2 \in \operatorname{dom} A$. For the special choice $\psi_1 = \psi_2 = \psi$ with $\|A\psi\|_{\mathcal H} = 1$ and $f_1 = f_2 = f$ this implies the isometry
\[ \|V_\psi f\|_{L^2}^2 = \|A\psi\|_{\mathcal H}^2\, \|f\|_{\mathcal H}^2 = \|f\|_{\mathcal H}^2 . \]
Note that $\psi$ is admissible if it fulfills (3.2) and $\|A\psi\|_{\mathcal H} = 1$. Setting $\psi_1 = \psi_2 = \psi$ with $\|A\psi\|_{\mathcal H} = 1$ and $f_1 = f$, $f_2 = \pi(h)\psi$ we obtain the reproducing property
\[ V_\psi f * K_\psi = V_\psi f, \qquad K_\psi := V_\psi\psi = \langle \psi, \pi(g)\psi\rangle. \tag{3.4} \]
Recall that by Proposition 2.56 i) the function $K_\psi$ is continuous. A weight $w$ on $G$ is a continuous function $w: G \to \mathbb{R}_+$ satisfying $w(gh) \le w(g)w(h)$ for all $g, h \in G$. Further we assume throughout this chapter that
\[ w(g) = w(g^{-1}) \tag{3.5} \]
for all $g \in G$. This implies $w(g) \ge 1$ for all $g \in G$. Let $L^1_w(G) := \{F \text{ measurable} : Fw \in L^1(G)\}$ with norm $\|F\|_{L^1_w} := \int_G |F(g)| w(g)\, dg$.

Definition 3.1 (Analyzing vector). An admissible function $\psi \in \mathcal H$ is called an analyzing vector if it is contained in the set
\[ \mathcal A_w := \{\psi \in \mathcal H : V_\psi \psi = \langle \psi, \pi(\cdot)\psi\rangle \in L^1_w(G)\}. \tag{3.6} \]
For an analyzing vector $\psi \in \mathcal A_w$ we consider the space
\[ \mathcal H_{1,w} := \{f \in \mathcal H : V_\psi f = \langle f, \pi(\cdot)\psi\rangle \in L^1_w(G)\}, \]
with the norm $\|f\|_{\mathcal H_{1,w}} := \|V_\psi f\|_{L^1_w}$. In [15, Lemma 4.2] it was shown that $\mathcal A_w$ and $\mathcal H_{1,w}$ coincide as sets. In the realm of coorbit theory, the spaces $\mathcal H_{1,w}$ serve as substitutes for the (Schwartz) spaces of test functions which are used in the classical approaches to construct Besov, Sobolev, and Triebel-Lizorkin spaces. We refer, e.g., to [32] for details. The next proposition states the basic properties of $\mathcal H_{1,w}$.

Proposition 3.2.
i) The space $\mathcal H_{1,w}$ is a Banach space.
ii) The space $\mathcal H_{1,w}$ is independent of the choice of $\psi \in \mathcal A_w$.
iii) The spaces $\mathcal H$, $\mathcal H_{1,w}$, and the dual space $\mathcal H_{1,w}^{\sim}$ (the space of all continuous linear functionals on $\mathcal H_{1,w}$) form a Gelfand triple, i.e., $\mathcal H_{1,w} \hookrightarrow \mathcal H \hookrightarrow \mathcal H_{1,w}^{\sim}$, where $\hookrightarrow$ denotes a dense continuous embedding.
iv) The space $\mathcal H_{1,w}$ is $\pi$-invariant, i.e., $f \in \mathcal H_{1,w}$ implies $\pi(g)f \in \mathcal H_{1,w}$ for all $g \in G$.

Proof. i) By the Cauchy-Schwarz inequality and since $w(g) \ge 1$ for all $g \in G$ we observe for $f \in \mathcal H$ that
\[ \|f\|_{\mathcal H}^2 = \|V_\psi f\|_{L^2}^2 = \int_G |V_\psi f(g)|\,|V_\psi f(g)|\, dg = \int_G |\langle f, \pi(g)\psi\rangle|\,|V_\psi f(g)|\, dg \le \int_G \|f\|_{\mathcal H}\, \|\pi(g)\psi\|_{\mathcal H}\, |V_\psi f(g)|\, w(g)\, dg = \|f\|_{\mathcal H}\, \|\psi\|_{\mathcal H}\, \|V_\psi f\|_{L^1_w}. \]
Consequently we have
\[ \|f\|_{\mathcal H} \le \|\psi\|_{\mathcal H}\, \|V_\psi f\|_{L^1_w}. \tag{3.7} \]
Let $(f_i)_{i \in \mathbb{N}}$ be a Cauchy sequence in $\mathcal H_{1,w}$. Then, for all $\varepsilon > 0$ there exists an $N(\varepsilon)$ such that
\[ \|f_i - f_j\|_{\mathcal H_{1,w}} = \|V_\psi(f_i) - V_\psi(f_j)\|_{L^1_w} \le \varepsilon \quad \text{for all } i, j \ge N(\varepsilon), \]
i.e., $(V_\psi(f_i))_{i \in \mathbb{N}}$ is a Cauchy sequence in $L^1_w(G)$. Since $L^1_w$ is a Banach space, there exists $\varphi \in L^1_w$ such that
\[ \lim_{i \to \infty} \|V_\psi(f_i) - \varphi\|_{L^1_w} = 0. \tag{3.8} \]
On the other hand, (3.7) implies that .fi /i2N is also a Cauchy sequence in H , so that, since the voice transform is an isometry, there exists f 2 H such that lim kV .fi / V .f /kL2 D 0:
i!1
The convergence in L2 .G/ implies that there exists a subsequence .fik /k2N in H such that lim V .fik / D V .f /
a.e.
k!1
see, e.g., [19, Theorem 2.20]. The same subsequence converges by (3.8) in Lw1 to . Therefore there exists a subsequence of .fik /k2N which converges a.e. to . This implies that D V .f / and i) is shown. ii) Given two analyzing vectors 1 and 2 we have to show that V 1 .f / 2 Lw1 implies V 2 .f / 2 Lw1 . By (3.3) we obtain for f 2 H and 2 dom A that Z .V f V /.h/ D
hf ; .g/ ih ; .g1 h/ i dg
G
Z
hf ; .g/ ih.g/ ; .h/ i dg
D G
D hA ; A ihf ; .h/ i D hA ; A iV f .h/: Let 2 dom A be an analyzing vector such that hA ; A i i 6D 0, i D 1; 2. Then V 1 .f / V . / V 2 .
2/
D hA ; A
1 iV .f /
D hA ; A
1 ihA
V 2.
2 ; A iV
2/ 2
.f /
and using Young’s inequality we can conclude that the left-hand side is in Lw1 . Thus, V 2 .f / 2 Lw1 and we are done. iii) Eq. (3.7) implies that H1;w ,! H . Further, the embedding is dense since is admissible and therefore cyclic. Then, by duality, H is weak -continuously embedded in H1;w : We refer to [21] for details. iv) Let f 2 H1;w , then Z
Z jV ..h/f /.g/jw.g/ dg D
k.h/f kH1;w D
G
Z
jhf ; .h1 g/ ijw.g/ dg D
D Z
G
jhf ; .g/ ijw.h/w.g/ dg
G
jh.h/f ; .g/ ijw.g/ dg G
Z
jhf ; .g/ ijw.hg/ dg G
Z jhf ; .g/ ijw.g/ dg
w.h/ G
D w.h/kf kH1;w : Thus .h/f 2 H1;w which finishes the proof.
t u
Due to Proposition 3.2 iii), the inner product on $\mathcal H \times \mathcal H$ can be extended to a sesquilinear form on $\mathcal H_{1,w}^{\sim} \times \mathcal H_{1,w}$.

Definition 3.3 (Extended voice transform). For $\psi \in \mathcal A_w$, the extended voice transform $V_{e,\psi}$ is defined for $T \in \mathcal H_{1,w}^{\sim}$ by
\[ V_{e,\psi}(T)(g) := \langle T, \pi(g)\psi\rangle_{\mathcal H_{1,w}^{\sim} \times \mathcal H_{1,w}}. \]
The next proposition collects some properties of $V_{e,\psi}$. In particular, it turns out that the reproducing kernel property (3.4) carries over to the image of the extended voice transform.

Proposition 3.4. Let $K_\psi$ be defined by (3.4). Then the following relations hold true:
i) $V_{e,\psi} T * K_\psi = V_{e,\psi} T$ for $T \in \mathcal H_{1,w}^{\sim}$.
ii) $V_{e,\psi} : \mathcal H_{1,w}^{\sim} \to L^\infty_{1/w}(G)$ is a one-to-one map from $\mathcal H_{1,w}^{\sim}$ into $L^\infty_{1/w}(G)$.
iii) For every $F \in L^\infty_{1/w}(G)$ satisfying the relation $F * K_\psi = F$ there exists a unique element $T \in \mathcal H_{1,w}^{\sim}$ with $V_{e,\psi}(T) = F$.
Proof.
i) By the reproducing formula (2.33) we have Z .g/ D h.g/ ; .h/ i.h/ dh : G
Hence, for T 2
H1;w ,
H Ve; .T/.g/ D hT; .g/ iH1;w 1;w Z D hT; h.g/ ; .h/ i.h/ d.h/iH1;w H1;w
G
Z D Z
G
Z
G
D
H h.g/ ; .h/ ihT; .h/ iH1;w 1;w dh H h ; .h1 g/ ihT; .h/ iH1;w 1;w dh
K .h1 g/Ve; T.h/ dh
D G
D .Ve; T K /.g/ :
ii) We verify H jVe; T.g/j D jhT; .g/ iH1;w 1;w j kTkH1;w k.g/ kH1;w
and following the lines of the proof of Proposition 3.2 iv) further k kH jVe; T.g/j w.g/kTkH1;w 1;w :
1 . Since Hence Ve; T 2 L1=w
is cyclic, see Definition 2.49, the relation
Ve; .T/.g/ D h.g/ ; Ti D 0 for all g 2 G implies T D 0 so that Ve; is injective. 1 1 iii) We calculate the adjoint Ve; W L1=w .G/ ! H1;w . For F 2 L1=w .G/, f 2 H1;w we have by definition hVe; .F/; f iH1;w H1;w D hF; Ve; .f /i D
Z D
Z F.g/hf ; .g/ i dg G
F.g/.g/ dg; f G
H H1;w 1;w
and hence, Ve;
Z .F/ D G
F.g/.g/ dg 2 H1;w ;
1 which is well defined in the weak sense. Consequently, for F 2 L1=w , we have Ve; .Ve; .F//.g/ D hVe; .F/; .g/ iH1;w H1;w
D hF; Ve; ..g/ /i D F K .g/: If F K .g/ D F, then T WD Ve; .F/ fulfills Ve; .T/ D F. To show that T is unique we apply these formulas again and obtain Ve; .Ve; .Ve; .T/// D Ve; .T/ K D Ve; .T/: ! H1;w is the Since Ve; is one-to-one this shows that Ve; ı Ve; W H1;w Q Q identity operator. If F D Ve; .T/ for some T 2 H1;w , then T D Ve; .F/ D Q D TQ so that T is indeed unique. Ve; .Ve; .T// t u
Classical function spaces such as Besov- or Triebel–Lizorkin spaces are defined as subspaces of the set of tempered distributions satisfying additional properties.
In our setting, the role of the tempered distributions is played by the dual space $\mathcal H_{1,w}^{\sim}$, and generalized smoothness spaces are obtained by collecting all elements in $\mathcal H_{1,w}^{\sim}$ for which the extended voice transform has a certain decay. Given a weight $w$ on $G$, a continuous, positive function $m: G \to \mathbb{R}_+$ is called a $w$-moderate weight on $G$ if $m(ghk) \le w(g)\, m(h)\, w(k)$ for all $g, h, k \in G$. Observe that the moderateness of $m$ implies
\[ m(gh) \le C\, m(g)\, w(h) \qquad \text{and} \qquad m(gh) \le C\, w(g)\, m(h) \tag{3.9} \]
with $C := w(e)$. For $1 \le p \le \infty$ we consider the Banach spaces
\[ L^p_m(G) := \{F \text{ measurable} : Fm \in L^p(G)\} \]
with norms
\[ \|F\|_{L^p_m} := \Big(\int_G |F(g)\, m(g)|^p\, dg\Big)^{1/p}, \quad 1 \le p < \infty, \qquad \|F\|_{L^\infty_m} := \operatorname*{ess\,sup}_{g \in G} |F(g)\, m(g)|. \]

Definition 3.5 (Coorbit space). The coorbit spaces $\mathcal H_{p,m}$ associated with $L^p_m$ are defined as
\[ \mathcal H_{p,m} := \{T \in \mathcal H_{1,w}^{\sim} : V_{e,\psi}(T) \in L^p_m(G)\}. \tag{3.10} \]
We equip these spaces with the norm $\|T\|_{\mathcal H_{p,m}} := \|V_{e,\psi}(T)\|_{L^p_m}$.

Remark 3.6. Following the lines of the proof of Proposition 3.2 i) and ii), it can be checked that the coorbit spaces as defined in (3.10) are Banach spaces which are independent of the choice of the analyzing vector.

In many applications, e.g., for the discretization of the coorbit spaces $\mathcal H_{p,m}$ in Subsection 3.2.2, convolution products of functions in $L^p_m$ with functions in $L^1_w$ have to be evaluated. Here the following weighted Schur lemma provides a useful tool.

Lemma 3.7 (Weighted Schur Lemma). Let $(X, \mathcal A, \mu)$ be a $\sigma$-finite measure space. Let $K$ be an $\mathcal A \otimes \mathcal A$-measurable function on $X \times X$, and $w$ be a positive weight function on $X$. Suppose that $K$ satisfies the conditions
\[ \int_X |K(x,y)|\, \frac{w(x)}{w(y)}\, d\mu(x) \le C_K \ \ \text{for a.e. } y \in X \qquad \text{and} \qquad \int_X |K(x,y)|\, \frac{w(x)}{w(y)}\, d\mu(y) \le C_K \ \ \text{for a.e. } x \in X. \tag{3.11} \]
If $f \in L^p_w(X)$, $1 \le p \le \infty$, then the integral
\[ If(x) = \int_X K(x,y)\, f(y)\, d\mu(y) \]
converges absolutely for a.e. $x \in X$. The function $If$ is in $L^p_w(X)$ and fulfills
\[ \|If\|_{L^p_w} \le C_K\, \|f\|_{L^p_w}. \tag{3.12} \]
For a proof, we refer, e.g., to [10, Appendix A]. For $L^p_m$ and $L^1_w$ spaces the following generalized Young inequalities hold true. We write $x \lesssim y$ if $x \le Cy$ for some constant $C > 0$.

Proposition 3.8 (Generalized Young Inequality). Let $m$ be a $w$-moderate weight. Then it follows for $F \in L^p_m(G)$ and $H \in L^1_w$ or $H \in L^1_w \cap L^1_{w^{-1}}$, respectively, that
\[ \|H * F\|_{L^p_m} \lesssim \|F\|_{L^p_m}\, \|H\|_{L^1_w}, \tag{3.13} \]
\[ \|F * H\|_{L^p_m} \lesssim \|F\|_{L^p_m}\, \max\{\|H\|_{L^1_w},\, \|H\|_{L^1_{w^{-1}}}\}. \tag{3.14} \]
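The inequality (3.13) can be illustrated numerically in the simplest possible setting, namely $G = (\mathbb{Z}, +)$ with counting measure, the submultiplicative weight $w(n) = (1+|n|)^{\alpha}$, and $m = w$ (which is trivially $w$-moderate). The following sketch is a toy check with made-up sequences; it is not part of the original text, and on this commutative group (3.13) and (3.14) coincide.

```python
import numpy as np

# Toy check of the weighted Young inequality (3.13) on the group Z.
rng = np.random.default_rng(0)
alpha, p = 0.7, 2.0
n = np.arange(-64, 65)
w = (1.0 + np.abs(n)) ** alpha          # submultiplicative weight
m = w                                   # m = w is w-moderate with constant 1

F = rng.standard_normal(n.size) * np.exp(-np.abs(n) / 10.0)
H = rng.standard_normal(n.size) * np.exp(-np.abs(n) / 5.0)

conv = np.convolve(H, F)                # supported on indices -128..128
n_full = np.arange(-128, 129)
m_full = (1.0 + np.abs(n_full)) ** alpha

lhs = np.sum(np.abs(conv * m_full) ** p) ** (1 / p)                 # ||H*F||_{l^p_m}
rhs = np.sum(np.abs(H) * w) * np.sum(np.abs(F * m) ** p) ** (1 / p) # ||H||_{l^1_w} ||F||_{l^p_m}
print(lhs, rhs, lhs <= rhs)             # the inequality holds here with constant 1
```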
Proof. Let us start with (3.14). We apply Lemma 3.7 for the case K.g; h/ WD H.h1 g/ and w D m. Using (3.9) we conclude Z
m.g/ jH.h g/j dg D m.h/ G 1
Z jH.g/j G
Z
m.hg/ dg m.h/
jH.g/j
C G
Z DC
m.h/w.g/ dg m.h/
jH.g/jw.g/ dg G
D CkHkLw1 : On the other hand, we get Z G
jH.h1 g/j
m.g/ dh D m.h/
Z
jH..g1 h/1 /j
G
Z
jH.h1 /j
D G
Z
m.g/ dh m.gh/
jH.h1 /j
C G
Z DC
m.g/ dh m.h/
m.gh/w.h1 / dh m.gh/
jH.h/jw.h/.h1 / dh
G
D CkHkL1
w1
:
Thus the assumptions of Lemma 3.7 are fulfilled and (3.14) follows immediately. Relation (3.13) can be proved analogously. $\square$

The coorbit spaces are closely related to certain reproducing kernel subspaces of $L^p_m(G)$. Before we can prove this fundamental relation some preparations are necessary. First of all, we need the Wiener amalgam spaces that will be heavily used in the sequel. We define these spaces for general solid Banach function spaces. A solid Banach function space $Y$ is a Banach space of functions which is continuously embedded in $L^1_{\mathrm{loc}}$ and such that $|f(y)| \le |g(y)|$ a.e. for $g \in Y$ implies $f \in Y$ with $\|f\|_Y \le \|g\|_Y$. In particular the $L^p_m$ spaces are examples of solid Banach function spaces.

Definition 3.9 (Wiener-type spaces). Let $B$ and $Y$ be solid Banach function spaces on $G$. Choose $Q \subseteq G$ compact with non-void interior $\mathring{Q}$ and $e \in \mathring{Q}$. Then the right and left translated local maximal functions $M^R(F)$, $M^L(F)$ of $F$ are defined by
\[ M^R(F; B)(g) := \|(R_g \chi_Q)\, F\|_B, \qquad M^L(F; B)(g) := \|(L_g \chi_Q)\, F\|_B, \tag{3.15} \]
and the Wiener-type spaces by
\[ M^R(B, Y) := \{F : M^R(F; B) \in Y\}, \qquad M^L(B, Y) := \{F : M^L(F; B) \in Y\}, \tag{3.16} \]
endowed with the norms
\[ \|F\|_{M^R(B,Y)} := \|M^R(F; B)\|_Y, \qquad \|F\|_{M^L(B,Y)} := \|M^L(F; B)\|_Y. \]
Two properties of Wiener-type spaces which will be required in the proof of the correspondence principle in Theorem 3.12 will be provided in the next lemma and proposition. Lemma 3.10. Let m be a w-moderate weight. Then it holds p 1 Lm .G/ M L .L1 ; L1=w /: p Proof. By the Hölder inequality we have for all F 2 Lm .G/ and
Z
Z jF.g/j Q .g/ dg D
k Q FkL1 D G
jF.g/j Q .g/ G
Z jF.g/j m .g/ dg p
G
Cm;Q kFkLmp :
p
1=p Z
1 p
C
1 q
D 1 that
m.g/ dg m.g/
1 j Q .g/j q dg m .g/ G q
1=q
Then we can estimate the maximal function by M L .FI L1 /.g/ D k.Lg Q /FkL1 D k Q .Lg1 F/kL1 Cm;Q kLg1 FkLmp : By (3.9) and the symmetry of the weight we obtain Z p
jF.gh/jp m.h/p dh
kLg1 FkLp D m
Z
G
jF.h/jp m.g1 h/p dh
D G
1 p
Z p
. w.g /
G
jF.h/jp m.h/p dh D w.g/p kFkLp
m
so that M L .FI L1 /.g/ . Cm;Q kFkLmp w.g/: Finally this implies L 1 L 1 1 1 D kM .FI L /kL1 D ess sup M .FI L /.g/ w.g/ . Cm;Q kFkLmp : kFkM L .L1 ;L1=w / 1=w g2G
t u
and we are done.
Proposition 3.11. Let $Y$ be a solid, translation invariant Banach space and $w$ a weight function on $G$ fulfilling $Y * (L^1_w)^{\sim} \subseteq Y$, where $\|F\|_{(L^1_w)^{\sim}} := \int_G |F(g^{-1})|\, w(g)\, dg$. Then the following inclusion holds true:
\[ M^L(L^1, Y) * M^R(L^1, L^1_w) \subseteq Y. \tag{3.17} \]
The proof can be found in [17, Theorem 7.1]. For $1 \le p \le \infty$, we define the reproducing kernel subspaces of $L^p_m(G)$
\[ \mathcal M_{p,m} := \{F \in L^p_m(G) : F * K_\psi = F\}. \]
The following theorem states a fundamental correspondence principle between these spaces and the coorbit spaces $\mathcal H_{p,m}$.

Theorem 3.12 (Correspondence Principle). Let $m$ be a $w$-moderate weight and $\psi$ an analyzing vector with kernel $K_\psi \in M^R(L^1, L^1_w)$. Then the map $V_{e,\psi}$ induces an isomorphism $V_{e,\psi} : \mathcal H_{p,m} \to \mathcal M_{p,m}$.
Proof. 1. First we verify Ve; .Hp;m / Mp;m . By Proposition 3.4 i), we have Ve; .T/ K D Ve; .T/ in H1;w , and therefore Ve; .T/ K D Ve; .T/ also in Hp;m . p 2. For the converse take F 2 Mp;m , i.e., F 2 Lm .G/ satisfying F K D F. As 1 soon as we have shown that F 2 L1=w .G/ we can apply Proposition 3.4 iii), and 1 p find that T WD Ve; .F/ 2 H1;w satisfies Ve; .T/ D F 2 L1=w .G/ \ Lm .G/ and consequently T 2 Hp;m . We have to verify that F K D F implies 1 1 p F 2 L1=w .G/. By Lemma 3.10 any F 2 Lm .G/ belongs to M L .L1 ; L1=w /. 1 We want to apply Proposition 3.11 with Y WD L1=w . First we check that the assumption of the proposition is fulfilled. Indeed, for F 2 L1 and H 2 .L1 /z 1=w
we obtain w.g/
1
w
ˇZ ˇ ˇ ˇ 1 ˇ j.F G/.g/j D w.g/ ˇ F.h/H.h g/ dhˇˇ G ˇZ ˇ ˇ ˇ 1 ˇ 1 1 D w.g/ ˇ F.h/w.h/ w.h/H.h g/ dhˇˇ G Z ˇ ˇ 1 w.g/1 kFkL1=w w.h/ ˇH.h1 g/ˇ dh 1
G
1 D w.g/1 kFkL1=w 1 D w.g/1 kFkL1=w
Z 1 . kFkL1=w
G
Z
ˇ ˇ w.h/ ˇH..g1 h/1 /ˇ dh
G
Z
ˇ ˇ w.gh/ ˇH.h1 /ˇ dh
G
ˇ ˇ w.h/ ˇH.h1 /ˇ dh D kFkL1 kHk 1=w
By assumption on K and Proposition 3.11 we obtain finally F K 1 L1=w .G/.
:
1z .Lw /
D F 2 t u
Remark 3.13. Theorem 3.12 is one of the most important results in the context of coorbit theory. Indeed, if we want to study the properties of an abstract coorbit space, the correspondence principle enables one to go via the voice transform to the much more concrete reproducing kernel spaces Mp;m . Then, the properties of the reproducing kernel spaces can be transformed back to the coorbit spaces by applying Ve;1 .
3.2.2 Discretization In this section, we want to briefly explain how atomic decompositions and Banach frames for the coorbit spaces introduced in Section 3.2.1 can be constructed. Some preparations are necessary.
Definition 3.14 (Q-dense and relatively separated sets). Given the compact set $Q$ with non-void interior, a (countable) family $X = (g_i)_{i \in I}$ in $G$ is said to be $Q$-dense if $\bigcup_{i \in I} g_i Q = G$, separated if for some compact neighborhood $\tilde Q$ of $e$ we have $g_i \tilde Q \cap g_j \tilde Q = \emptyset$ for $i \ne j$, and relatively separated if $X$ is a finite union of separated sets. Moreover, let $\Phi = (\varphi_i, X, Q)$ denote a partition of unity subordinated to the $Q$-dense set $X$, i.e.,
\[ \operatorname{supp} \varphi_i \subseteq g_i Q, \qquad 0 \le \varphi_i \le 1 \ \text{for all } i \in I, \qquad \sum_{i \in I} \varphi_i \equiv 1. \tag{3.18} \]
We will also need the $U$-oscillation of a function $F$ with respect to a compact neighborhood $U$ of $e$, given by
\[ \operatorname{osc}_U(F)(g) := \sup_{u \in U} |F(ug) - F(g)|. \tag{3.19} \]
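For the additive group $G = (\mathbb{R}, +)$ the $U$-oscillation (3.19) of a sampled function can be approximated directly. The sketch below is an illustrative toy computation (not from the text); it shows that $\operatorname{osc}_U(F)$ shrinks as $U$ shrinks when $F$ is continuous, which is exactly the behaviour exploited later in Lemma 3.16.

```python
import numpy as np

# Discretized U-oscillation osc_U(F)(g) = sup_{|u|<=h} |F(u+g) - F(g)| on (R, +).
x = np.linspace(-5, 5, 2001)
dx = x[1] - x[0]
F = np.exp(-x**2)                       # a smooth, continuous test function

def osc(F, h, dx):
    k = int(round(h / dx))
    shifts = [np.roll(F, s) for s in range(-k, k + 1)]   # periodic shifts; fine away from the boundary
    return np.max(np.abs(np.stack(shifts) - F), axis=0)

for h in (0.5, 0.1, 0.02):
    # L^1-type norm (weight w = 1); it decreases as the neighborhood U = [-h, h] shrinks
    print(h, np.sum(osc(F, h, dx)) * dx)
```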
In this setting, it is possible to derive atomic decompositions and Banach frames. We will start in Subsection 3.2.2.1 with the atomic decompositions. Banach frames will be established in Subsection 3.2.2.2.
3.2.2.1 Atomic Decompositions
Theorem 3.15. Let $\psi$ be an analyzing vector whose kernel fulfills $K_\psi \in M^R(L^1, L^1_w) \cap M^R(L^1, L^1_{w^{-1}})$. Then, for any sufficiently small neighborhood $U$ of $e$ the following relations hold true:
i) For any $U$-dense and relatively separated set $X = (g_i)_{i \in I}$, the space $\mathcal H_{p,m}$ has the following atomic decomposition: If $T \in \mathcal H_{p,m}$, then
\[ T = \sum_{i \in I} c_i(T)\, \pi(g_i)\psi, \tag{3.20} \]
where the sequence of coefficients depends linearly on $T$ and satisfies
\[ \|(c_i(T))_{i \in I}\|_{\ell^p_m} \lesssim \|T\|_{\mathcal H_{p,m}}. \tag{3.21} \]
ii) Conversely, if $(d_i)_{i \in I} \in \ell^p_m$, then $T = \sum_{i \in I} d_i\, \pi(g_i)\psi$ is in $\mathcal H_{p,m}$ and
\[ \|T\|_{\mathcal H_{p,m}} \lesssim \|(d_i)_{i \in I}\|_{\ell^p_m}. \tag{3.22} \]
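The proof that follows inverts an operator $T_\Phi$ by a Neumann series once $\|\mathrm{Id} - T_\Phi\| < 1$. The same mechanism can be demonstrated in a finite-dimensional toy model: for a family of atoms that is "dense enough", the associated frame-type operator is a small perturbation of the identity and can be inverted by the geometric series. The sketch below is a hypothetical finite-dimensional stand-in, not the operator $T_\Phi$ of the proof.

```python
import numpy as np

# Finite-dimensional caricature of the Neumann-series argument:
# S = sum_i <., e_i> e_i for a nearly tight frame (e_i); if ||I - S|| < 1,
# then S^{-1} = sum_k (I - S)^k and every x is recovered from its coefficients <x, e_i>.
rng = np.random.default_rng(1)
d = 8
I = np.eye(d)
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
E = np.vstack([I, Q]) / np.sqrt(2.0)        # union of two orthonormal bases: an exactly tight frame
E = E + 0.02 * rng.standard_normal(E.shape) # small perturbation: frame operator only close to I

S = E.T @ E                                 # frame operator
R = I - S
print("||I - S|| =", np.linalg.norm(R, 2))  # well below 1 for this construction

Sinv, term = I.copy(), I.copy()
for _ in range(200):                        # Neumann series for S^{-1}
    term = term @ R
    Sinv = Sinv + term

x = rng.standard_normal(d)
coeffs = E @ x                              # analysis: frame coefficients
x_rec = Sinv @ (E.T @ coeffs)               # synthesis with the (approximate) dual frame
print(np.allclose(x, x_rec))
```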
Proof. Let us define the operator T˚ .F/ WD
X hF; i iLgi K ; i2I
(3.23)
where . i /i is a partition of unity. The first step is to show that the operator T˚ is invertible on the reproducing kernel space Mp;m . Then, using the correspondence principle we would obtain for any T 2 Hp;m , T D Ve;1 Ve; .T/ D Ve;1 .T˚ ı T˚1 /Ve; .T/ X D Ve;1 . hT˚1 Ve; .T/; i iLgi K / i2I
X D hT˚1 Ve; .T/; i iVe;1 Lgi K i2I
X D hT˚1 Ve; .T/; i iVe;1 Lgi V . / i2I
X D hT˚1 Ve; .T/; i iVe;1 V ..gi / // i2I
X D hT˚1 Ve; .T/; i i.gi / i2I
which would be a desired atomic decomposition (3.20). To finish the proof, it would remain to establish (3.21) and (3.22). Step 1: Invertibility of T˚ . We start with a pointwise estimate. Since Mp;m is a reproducing kernel space, we get for F 2 Mp;m , ˇZ ˇ ˇ ˇ XZ ˇ ˇ 1 1 j.F T˚ F/.g/j D ˇ F.h/K .h g/ dh i .h/F.h/K .gi g/ dhˇ ˇ G ˇ G i2I ˇZ ! ˇ ˇ ˇ X ˇ ˇ D ˇ F.h/ dhˇ : i .h/ K .h1 g/ K .g1 i g/ ˇ G ˇ i2I
Now, the partition of unity properties (3.18) yields Z X j.F T˚ F/.g/j jF.h/j i .h/ supjK .h1 g/ K .uh1 g/j dh Z
u2U
G i2I
jF.h/j G
X
i .h/ oscU .K /.h1 g/ dh
i2I
D .jFj oscU .K //.g/: Defining the operator TQ .F/ WD jFj oscU .K /;
(3.24)
it remains to show that kTQ .F/kLmp CU kFkLmp ;
where
lim CU D 0:
(3.25)
U!feg
Then, by choosing the neighborhood U sufficiently small, the invertibility of T˚ follows by a Neumann series argument. Using Proposition 3.8 and (3.14) we get kTQ .F/kLmp . kFkLmp maxfkoscU .K /kLw1 ; koscU .K /kL1
w1
g;
and (3.25) is established by applying the following lemma to the kernel K with respect to w and w1 : The proof of Lemma 3.16 will be presented in Subsection 3.2.4. Lemma 3.16. i) Let F 2 Lw1 \ M R .L1 ; Lw1 /. Then oscU .F/ 2 Lw1 , and if F is in addition continuous, then lim koscU .F/kLw1 D 0:
(3.26)
U!feg
z
1 1 R 1 ; Lw ii) If F 2 Lw1 \ M R .L1 ; Lw1 /, then oscU .F/ 2 Lw 1 . If F 2 M .L 1 /, then 1 oscU .F/ 2 Lw . If F is in addition continuous, then (3.26) holds true for osc with 1 respect to the Lw1 -norm and the Lw 1 norm, respectively.
z
z
Remark 3.17. The proof of i) does not require any symmetry condition on w so that we can replace w, e.g., by w1 . For ii) the symmetry condition (3.5) is really needed. It remains to show (3.21) and (3.22). p Step 2: Proof of (3.21). We have to check that for all F 2 Lm the inequality
k.hF; i i/i2I k`pm . kFkLmp
(3.27)
holds. Indeed, (3.27) implies by the correspondence principle and the boundedness of T˚1 for T 2 Hp;m k.hT˚1 Ve; T; i i/i2I k`pm . kT˚1 Ve; TkLmp ~T˚1 ~kVe; .T/kLmp D ~T 1 ~ kTkHp;m ; which is (3.21). To show (3.27), we will employ the following lemma that will once again be proved in Subsection 3.2.4. Lemma 3.18. Let 1 p 1. Then X k.ci /i2I k`pm k jci j gi U kLmp i2I
(3.28)
By Lemma 3.18, we obtain X k.hF; i i/i2I k`pm k.hjFj; i i/i2I k`pm . k hjFj; i i gi U kLmp : i2I
Observe that, since X is relatively separated, for a fixed g 2 G the set Ig WD fi 2 I W g 2 gi Ug is finite. Therefore we may write X X hjFj; i i gi U .g/ D hjFj; i i gi U .g/ hjFj; K.g; /i i2I
i2Ig
with K.g; h/ WD
X
gi U .h/ gi U .g/:
(3.29)
i2I
We want to use the weighted Schur Lemma 3.7 for K as defined in (3.29). Since m satisfies (3.9) we observe Z G
gi U .g/
m.g/ dg D m.gi /
Z G
U .g1 i g/
m.g/ dg D m.gi /
Z U
m.gi g/ dg C m.gi /
Z Q w.g/ dg C; U
so that Z Z X X m.g/ m.gi / m.g/ m.gi / dg D dg CQ ; K.g; h/
gi U .h/
gi U .g/
gi U .h/ m.h/ m.h/ G m.gi / m.h/ G i2I
i2I
which is uniformly bounded since X is relatively separated and the first condition in (3.11) holds. The second condition in (3.11) can be established in a similar way. This proves (3.27) and finishes Step 2. Step 3: Proof of (3.22). It is enough to show that X k di Lgi K kLmp . k.di /i2I k`pm :
(3.30)
i2I
We start with the following observation that can be found in [22, Lemma 4.6]: jK .h1 l/ K .g1 l/j oscU .K /.h1 l/
for all
l 2 G; if h 2 gU (3.31)
for all
l 2 G; if h 2 gi U: (3.32)
Eq. (3.31) obviously implies 1 1 jK .g1 i l/j oscU .K /.h l/ C jK .h l/j
Using (3.32), we obtain the following pointwise estimate ˇ ˇ ˇ ˇ Z ˇX ˇ ˇX ˇ ˇ ˇ ˇ ˇ 1 di Lgi K .l/ˇ . ˇ di K .gi l/ dhˇ ˇ ˇ ˇ ˇ ˇ g U i i2I i2I Z X jdi j jK .g1 i l/j dh gi U
i2I
X
Z jdi j
i2I
Z
oscU .K /.h1 l/ C jK .h1 l/j dh
gi U
.oscU .K /.h1 l/ C jK .h1 l/j/.
D G
D
X
!
X jdi j gi U .h// dh i2I
jdi j gi U .oscU .K / C jK j/.l/:
i2I
Now, Proposition 3.8, Lemma 3.16, and Lemma 3.18 imply X X p di K .g1 jdi j gi U kLmp . k.di /i2I k`pm ; k i /kLm . k i2I
i2I
and the proof of Step 3 is finished. t u
3.2.2.2 Banach Frames
In Subsection 3.2.2.1, we have seen that under very natural conditions the coorbit spaces $\mathcal H_{p,m}$ can be decomposed by means of atoms that are obtained by discretizing the underlying group representation with respect to a $U$-dense set $X$. Then, of course, the question arises how a given signal $T \in \mathcal H_{p,m}$ can be reconstructed from a sequence of discrete data. The answer is provided by the next theorem, which in particular implies the existence of a reconstruction operator $\mathcal R$ that gives us back any distribution $T \in \mathcal H_{p,m}$ from the sequence of its so-called moments $(\langle T, \pi(g_i)\psi\rangle_{\mathcal H_{1,w}^{\sim} \times \mathcal H_{1,w}})_{i \in I}$.

Theorem 3.19. Let $\psi$ be an analyzing vector whose kernel fulfills $K_\psi \in M^R(L^1, L^1_w) \cap M^R(L^1, L^1_{w^{-1}})$. Then, for any sufficiently small neighborhood $U$ of $e$ and every $U$-dense and relatively separated set $X$, the set $\{\pi(g_i)\psi : i \in I\}$ is a Banach frame for $\mathcal H_{p,m}$. This means that
i) $T \in \mathcal H_{p,m}$ if and only if $(\langle T, \pi(g_i)\psi\rangle_{\mathcal H_{1,w}^{\sim} \times \mathcal H_{1,w}})_{i \in I} \in \ell^p_m$; \hfill (3.33)
ii) $\|(\langle T, \pi(g_i)\psi\rangle_{\mathcal H_{1,w}^{\sim} \times \mathcal H_{1,w}})_{i \in I}\|_{\ell^p_m} \asymp \|T\|_{\mathcal H_{p,m}}$; \hfill (3.34)
iii) there exists a bounded, linear reconstruction operator $\mathcal R$ from $\ell^p_m$ to $\mathcal H_{p,m}$ such that
\[ \mathcal R\big((\langle T, \pi(g_i)\psi\rangle_{\mathcal H_{1,w}^{\sim} \times \mathcal H_{1,w}})_{i \in I}\big) = T. \tag{3.35} \]

Proof. We define the operator
\[ S_\Phi(F) := \sum_{i \in I} F(g_i)\, \varphi_i * K_\psi. \tag{3.36} \]
If we can show that S˚ is invertible on Mp;m , then the correspondence principle would imply for any T 2 Hp;m , T D Ve;1 Ve; .T/ D Ve;1 S˚1 .S˚ Ve; .T// ! X 1 1 D Ve; S˚ Ve; .T/.gi / i K i2I
D Ve;1 S˚1
X H hT; .xi / iH1;w 1;w i K
!
i2I DW R..hf ; .gi / iH1;w H1;w /i2I /;
so that iii) is proved. It would remain to show i) and ii). Step 1: Invertibility of S˚ . On the reproducing kernel spaces Mp;m we may write kF S˚ FkLmp D k.F
X
F.gi / i / K kLmp ;
i2I
so that by using again Proposition 3.8, we get kF S˚ FkLmp . kF
X
F.gi / i kLmp :
(3.37)
i2I
To further estimate the right-hand side in (3.37), we start again with a pointwise estimate. By the reproducing kernel property, we get for h 2 gi U, jF.h/ F.gi /j supjF.h/ F.hu1 /j u2U
Z
F.l/.K .l1 h/ K .l1 hu1 //jdl
D supj u2U
Z
G
jF.l/j supjK .l1 h/ K .l1 hu1 /jdl
G
u2U
D .jFj supjK ./ K .u1 /j/.h/: u2U
Observe that sup jK .h/ K .hu1 /j D sup jKL .h1 / KL .uh1 /j D sup jK .h1 / K .uh1 /j; u2U
u2U
u2U
so that
z
jF.h/ F.gi /j . jFj oscU .K / .h/: Therefore, X X F.gi / i .h/j . .jFj oscU .K //.h/ i .h/ D jFj oscU .K /.h/; jF.h/
z
i2I
z
i2I
and hence by Proposition 3.8 and (3.14), kF S˚ FkLmp . kF
X i2I
z ; kz osc .K
F.gi / i kLmp . kjFj oscU .K /kLmp
z
. kjFjkLmp maxfkoscU .K /kLw1
U
/kL1
w1
g:
The integrability conditions together with the additional requirement (3.5) imply that the proof of Lemma 3.16 as outlined in Subsection 3.2.4 also carries over to oscU .K/. Therefore, the invertibility of S˚ follows by taking the neighborhood U sufficiently small. This finishes Step 1. Step 2: Lower bound in (3.34). By using the boundedness of S˚1 , Proposition 3.8 and Lemma 3.18, we obtain
z
kTkHp;m D kVe; .T/kLmp D k.S˚1 ı S˚ /Ve; .T/kLmp X . kS˚ Ve; .T/kLmp D k Ve; .T/ i K i2I
X X .k Ve; .T/ i kLmp . k jVe; .T/j gi U kLmp i2I
i2I
. k.hT; .gi / i/i2I k ; p `m
kLmp
which is the lower bound in (3.34). Step 3: Upper bound in (3.34). By the correspondence principle, it is enough to show for F 2 Mp;m that kF.gi /k`pm . kFkLmp :
(3.38)
Defining O K.g; h/ WD
X jK .h1 gi /j gi U .g/;
(3.39)
i2I
we may write X
jF.gi /j gi U
XZ D j F.h/K .h1 gi / dhj gi U .g/
i2I
i2I
G
Z X
jK .h1 gi /j gi U .g/jF.h/j dh
G i2I
Z
O K.g; h/jF.h/j dh :
D G
As in the proof of step 1, we observe that (3.31) implies for g 2 gi U the relation
z
jKN .h1 g/ KN .h1 gi /j oscU .K /.h1 g/; i.e., O K.g; h/
X .oscU .K /.h1 g/ C jK j.h1 g/ gi U .g//
z
i2I
z
. oscU .K /.h1 g/ C jK j.h1 g/ and hence X
z
F.gi / gi U .g/ . jFj .oscU .K / C jK j/.g/:
i2I
Now (3.38) follows by another application of Lemma 3.18 and Proposition 3.8 X kF.gi /k`pm . k jF.gi /j gi U kLmp . kFkLmp : i2I
This finishes Step 3.
t u
3.2.3 Examples

There seems to be a rich variety of examples for the abstract theory presented above. We discuss now the results for two typical examples: the (reduced) Heisenberg group and the affine ($ax + b$) group. In these cases the spaces arising as coorbit spaces have been extensively studied (as Banach spaces of functions or distributions in their own right, without making use of group and representation theory). As we will see in Subsection 3.2.3.1, the coorbit spaces associated with the affine group are the well-known classical Besov spaces. Therefore, let us briefly recall the definitions and some basic properties of these spaces. Following [2] we start with the definition of inhomogeneous Besov spaces. The homogeneous case will be discussed afterwards. Let $\varphi \in \mathcal S(\mathbb{R}^d)$ be such that
\[ \operatorname{supp} \varphi = \{\xi : \tfrac12 \le |\xi| \le 2\}, \qquad \varphi(\xi) > 0 \ \text{for } \tfrac12 < |\xi| < 2, \qquad \sum_{j=-\infty}^{\infty} \varphi(2^{-j}\xi) = 1 \ \text{for } \xi \ne 0. \]
Using this we define for $j \in \mathbb{Z}$ the partition of unity $\varphi_j$ and $\Psi$ by
\[ \mathcal F\varphi_j(\xi) = \varphi(2^{-j}\xi), \qquad \mathcal F\Psi(\xi) = 1 - \sum_{j=1}^{\infty} \varphi(2^{-j}\xi). \]
Definition 3.20 (Inhomogeneous Besov space). Let $\sigma \in \mathbb{R}$ and $1 \le p, q \le \infty$. Then
\[ B^\sigma_{p,q} := \Big\{ f \in \mathcal S' : \|f\|_{B^\sigma_{p,q}} := \|\Psi * f\|_{L^p} + \Big( \sum_{j=1}^{\infty} 2^{j\sigma q} \|\varphi_j * f\|^q_{L^p} \Big)^{1/q} < \infty \Big\}. \]
For $\sigma > 0$, choose integers $m, N$ such that $m + N > \sigma$ and $0 \le N < \sigma$. Then, with $1 \le p \le \infty$ and $1 \le q \le \infty$,
\[ \|f\|_{B^\sigma_{p,q}} \asymp \|f\|_{L^p} + \sum_{k=1}^{d} \Big( \int_0^1 \Big[ t^{N-\sigma}\, \omega^m_p\Big(t, \frac{\partial^N f}{\partial x_k^N}\Big) \Big]^q \frac{dt}{t} \Big)^{1/q}. \]
For the homogeneous Besov spaces we consider symmetric sums.

Definition 3.22 (Homogeneous Besov space). Let $\sigma \in \mathbb{R}$ and $1 \le p, q \le \infty$. Then
\[ \dot B^\sigma_{p,q} := \Big\{ f \in \mathcal S' : \|f\|_{\dot B^\sigma_{p,q}} := \Big( \sum_{j \in \mathbb{Z}} 2^{j\sigma q} \|\varphi_j * f\|^q_{L^p} \Big)^{1/q} < \infty \Big\}. \]
For $\sigma > 0$ and integers $m, N$ such that $m + N > \sigma$ and $0 \le N < \sigma$ we have, similarly as before,
\[ \|f\|_{\dot B^\sigma_{p,q}} \asymp \sum_{k=1}^{d} \Big( \int_0^\infty \Big[ t^{N-\sigma}\, \omega^m_p\Big(t, \frac{\partial^N f}{\partial x_k^N}\Big) \Big]^q \frac{dt}{t} \Big)^{1/q}. \]
The relation between homogeneous and inhomogeneous Besov spaces is given in the next theorem.

Theorem 3.23 ([2, Theorem 6.3.2]). Suppose that $f \in \mathcal S'$ and $0 \notin \operatorname{supp} \hat f$. Then for $\sigma \in \mathbb{R}$ and $1 \le p, q \le \infty$,
\[ f \in \dot B^\sigma_{p,q} \iff f \in B^\sigma_{p,q}. \]
Moreover, for $\sigma > 0$,
\[ B^\sigma_{p,q} = L^p \cap \dot B^\sigma_{p,q}. \]
A characterization of Besov spaces can also be found in Theorem 3.43.
3.2.3.1 The Affine ($ax + b$) Group and Besov Spaces

In the first example we consider the full affine group $G$ from Example 2.10. It is given as the set $\mathbb{R}^* \times \mathbb{R}$ with group law $(a,b) \circ (a',b') = (aa', ab' + b)$
with left Haar measure $da\, db/a^2$ and modular function $\Delta(a,b) = |a|^{-1}$, see Example 2.35. We consider the representation
\[ \pi(a,b)\psi(x) = \frac{1}{\sqrt{|a|}}\, \psi\Big(\frac{x-b}{a}\Big) \]
introduced in Example 2.42. In order to establish coorbit spaces we have to verify that $\psi \in \mathcal A_w$ for some properly chosen symmetric weight. We discuss two cases. In the first case we assume that the analyzing wavelet has compact support in the Fourier domain, i.e. $\operatorname{supp} \hat\psi \subseteq [-a_1, -a_0] \cup [a_0, a_1]$.

Theorem 3.24. Let $0 < \alpha \le a_0 < a_1$. Let $\psi$ be a Schwartz function with $\operatorname{supp} \hat\psi \subseteq [-a_1, -a_0] \cup [a_0, a_1]$. Then we have that $\langle \psi, \pi(\cdot)\psi\rangle \in L^1_w(G)$.

Proof. The assertion follows directly by writing
\[ \|\langle \psi, \pi(\cdot)\psi\rangle\|_{L^1_w(G)} = \int_{\mathbb{R}^*} \big\| \hat\psi\, \overline{\hat\psi_{a,0}} \big\|_{\mathcal F^{-1}L^1}\, w(a)\, \frac{da}{a^2} \]
and realizing that $\hat\psi\, \overline{\hat\psi_{a,0}} \ne 0$ only for $a \in [-a_1/a_0, -a_0/a_1] \cup [a_0/a_1, a_1/a_0]$. See also the proof of Theorem 3.33 below. $\square$

However, when it comes to practical applications, it would be desirable to analyze the functions/signals of interest with compactly supported wavelets. In this case the integrability condition needs to be satisfied. Indeed, any compactly supported wavelet with sufficient smoothness and enough vanishing moments is, for a specific subclass of weights, e.g. $w(a) = |a|^{-\rho} + |a|^{\rho}$, contained in $\mathcal A_w$. Selecting now $m(a,b) = |a|^{s}$, $s \in \mathbb{R}$, we identify the membership of a function in the coorbit space $\mathcal H_{p,m}$ through
\[ f \in \mathcal H_{p,m} \iff \int_G |V_{e,\psi}(f)(a,b)|^p\, |a|^{sp}\, \frac{da\, db}{a^2} < \infty. \]
X j;m
cj;m .f /.˛ j ; ˛ j m/
;
where the sequence satisfies 0 @
X
11=p jcj;m . f /jp j˛jjsp A
Ckf kHp;m :
j;m
Conversely, if .dj;m .f // 2 `pm , then f D 0 kf kHp;m C0 @
P
X
j;m
dj;m . f /.˛ j ; ˛ j m/
2 Hp;m and
11=p jdj;m . f /jp j˛jjsp A
:
j;m
These expressions are equivalent to the membership of f in the homogeneous Besov s1=21=p space BP p;p , for further details see again [15] and the references therein.
3.2.3.2
The Reduced Heisenberg Group and Modulation Spaces
Recall the reduced Heisenberg group Hdr D Rd Rd T with group law (see (2.67)) 0
0
.q; p; / ıHdr .q0 ; p0 ; 0 / D .q C q0 ; p C p0 ; 0 ei. qp pq / / t
t
and the product measure dq dp dt as Haar measure. We are interested in the Schrödinger representation on H D L2 .Rd / given by .q; p; / .x/ D ei qp Tq Mp .x/ D ei qp e2i py .x q/ t
t
t
see also (2.68). The spaces Hp;m are well defined as long as the weight m is of the form m.q; p; t/ D m.p/, Q where m Q is any moderate weight on Rd , see [15]. For the s choice m.q; p; t/ D .1 C jpj/ the resulting spaces turn out to coincide with the s s so-called modulation spaces Mp;p .Rd /, see [15]. The spaces M2;2 .Rd / can be identified with the classical Bessel-potential spaces. Applying the general result on atomic decompositions we obtain representations of the following form (which are s called Gabor-type representations). Given s 2 R and some 0 ¤ 2 M1;1 .Rd / 2 ( .x/ D ex is one option) there exists ˛0 > 0 and ˇ0 > 0 such that for ˛ ˛0 and ˇ ˇ0 there exists C D C.˛; ˇ/ > 0 with the property: X s .Rd / ” f D an;k .ˇk; ˛n; 1/ ; f 2 Mp;p n;k
where the sequence of coefficients satisfies X jan;k jp .1 C jnj/sp n;k
!1=p Ckf kMp;p s .Rd / :
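The Gabor-type expansion above can be mimicked numerically: with a Gaussian window, the moments $\langle f, \pi(\beta k, \alpha n, 1)\psi\rangle$ are samples of the short-time Fourier transform on a lattice. The sketch below is a toy computation with made-up lattice parameters $\alpha, \beta$ (not those of the statement); it computes such coefficients and the weighted $\ell^p$ sum appearing in the norm equivalence.

```python
import numpy as np

# Samples of the short-time Fourier transform of f with a Gaussian window g
# on the lattice (beta*k, alpha*n), and the weighted norm ( sum |a_{n,k}|^p (1+|n|)^{sp} )^{1/p}.
x = np.linspace(-15, 15, 3001)
dx = x[1] - x[0]
f = np.exp(-x**2) * np.cos(6 * x)
g = np.exp(-x**2)                                        # window psi(x) = exp(-x^2)

alpha, beta, s, p = 0.5, 0.5, 1.0, 2.0
ks = np.arange(-20, 21)                                  # time shifts beta*k
ns = np.arange(-20, 21)                                  # frequency shifts alpha*n

coeffs = np.zeros((ns.size, ks.size), dtype=complex)
for i, n in enumerate(ns):
    for j, k in enumerate(ks):
        atom = np.exp(2j * np.pi * alpha * n * x) * np.roll(g, int(round(beta * k / dx)))
        coeffs[i, j] = np.sum(f * np.conj(atom)) * dx

weights = (1.0 + np.abs(ns)) ** s
norm = np.sum((np.abs(coeffs) ** p) * weights[:, None] ** p) ** (1 / p)
print("weighted coefficient norm:", norm)
```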
3.2.4 Proof of Lemma 3.16 and 3.18 Proof (of Lemma 3.16). i) The first step is to show that koscU .F/kLw1 < 1:
(3.40)
The following fact has been shown in [22]: oscU .F/.g/ M R .FI L1 /.g1 / C jF.g/j; so that, by applying the Lw1 -norm and taking into account that F 2 Lw1 \ M R .L1 ; Lw1 / we get Z
Z oscU .F/w.g/ dg G
1
Z
1
M .FI L /.g /w.g/ dg C R
G
jF.g/w.g/j dg < 1: G
Q G such Eq. (3.40) implies that for every " > 0 there exists a compact set Q that for all U U0 ; U0 sufficiently small, Z Q GnQ
oscU .F/.g/w.g/ dg
0 .1 C k!k2 /2˛
1 and m.a; s; t/ D m.a; s/ WD jajr . jaj C jaj C jsj/n for some r 2 R; n 0. Then the set of Schwartz functions forms a dense subset of the shearlet coorbit space S C p;m .
Proof. As in [9, Theorem 4.7] it can be shown that S0 is at least contained in S C p;m . It remains to show the density. To this end, we observe from Theorem 4.2 in [9] that certain band-limited Schwartz functions can be used as analyzing shearlets. Now let us recall that the atomic decomposition in (3.57) has to be understood as a limit of finite linear combinations with respect to the shearlet coorbit norm. However, every finite linear combination of Schwartz functions is again a Schwartz function, hence (3.57) implies that we have found for any T 2 S C p;m a sequence of Schwartz functions which converges to T. t u
3.5.2 Embeddings In the remaining parts of this chapter, we are mainly interested in weights m.a; s; t/ D m.a/ WD jajr ; r 0 and use the abbreviation S C p;r WD S C p;m : Note that m is w-moderate, e.g., with respect to the symmetric weight w.a; s; t/ D jaj C jaj , r . For simplicity, we will very often assume that we can use ˇ D D 1 in the U-dense, relatively separated set (3.56) and restrict ourselves 1 to positive dilations a > 0, i.e., " D 1. Moreover, for a WD ˛ j , s WD ˛ j.1 d / k and t WD S j.1 d1 / A˛j l we will use the abbreviation j;k;l WD .a; s; t/ : In other ˛ k words, we will mostly assume that T 2 S C p;r can be written as TD
X X X
1
c.j; k; l/.˛ j ; ˛ j.1 d / k; S
D
X X X
1
˛ j.1 d / k
j2Z k2Zd1 l2Zd
c.j; k; l/
j;k;l :
A˛j l/ (3.60)
j2Z k2Zd1 l2Zd
3.5.2.1
Embeddings into Coorbit Spaces
p coorbit spaces were In [15, Section 5.7] some embedding theorems for general Lm given. In particular, the authors mentioned that for a fixed weight m, these spaces are monotonically increasing with p. The following corollary is a special result in this direction.
Proposition 3.41. For $1 \le p_1 \le p_2 \le \infty$ the embedding $\mathcal{SC}_{p_1,r} \subseteq \mathcal{SC}_{p_2,r}$ holds true. Introducing the 'smoothness spaces' $\mathcal G^r_p := \mathcal{SC}_{p,\, r + d(\frac12 - \frac1p)}$, this implies the continuous embedding
\[ \mathcal G^{r_1}_{p_1} \subseteq \mathcal G^{r_2}_{p_2} \qquad \text{if} \qquad r_1 - \frac{d}{p_1} = r_2 - \frac{d}{p_2}. \]
X j2Z
˛ jrp2
X
jc" .j; k; l/jp2
p1
2
;
k;l "2f1;1g
where c" .j; k; l/ is the coefficient in the representation (3.57) with respect to (3.56) 1 belonging to the function ."˛ j ; ˛ j.1 d / k; S j.1 d1 / A˛j l/ . Since `p1 `p2 for ˛ k p1 p2 we get finally that kTkS C p2 ;r .
X
˛ jrp2
X j2Z
3.5.2.2
jc" .j; k; l/jp1
pp2 p12 1
k;l "2f1;1g
j2Z
.
X
˛ jrp1
X
jc" .j; k; l/jp1
p1
1
. kf kS C p1 ;r :
k;l "2f1;1g
t u
Embeddings into Besov spaces
To keep the technical difficulties at a reasonable level, we will restrict ourselves from now on mainly to function spaces on R3 . The higher-dimensional setting can be handled in a similar way. To derive reasonable embedding theorems as well as trace results in the next section, it is necessary to introduce the following subspaces of S C p;r . For an analyzing shearlet such that the assumptions of Theorem 3.38 are satisfied and for some 2 f0; 1g2 , the closed subspace of S C . / p;r of S C p;r is defined by n XXX S C . / c.j; k; l/ p;r WD T 2 S C p;r W T D
j;k;l ;
j2Z k2Z2 l2Z3
o 2j c.j; k; l/ D 0 for jki j> ˛ 3 if i D 1
(3.61)
As we shall see in the sequel, some of the resulting spaces S C . / p;r embed in scales of Besov spaces, and similar properties hold true for the trace theorems.
Remark 3.42. i) It is clear that the shearlet coorbit spaces as established in Section 3.4.1 are not Besov spaces since the underlying groups are completely different. The shearlet coorbit spaces are related with the shearlet group whereas the Besov spaces are related with the affine group. Without the restriction stated in (3.61) the shearlet coorbit spaces are too ‘big’ compared to the Besov spaces and no embedding can be shown, at least not with our methods. ii) Apart from the technical reason stated in i), the restriction on the shear part might look artificial. However, in a certain sense, the spaces in (3.61) resemble the well-known ‘cone-adapted shearlets’ that are very often used in practice. We refer, e.g., to [25]. The proof of our embedding result heavily uses the well-known atomic decompositions of Besov spaces. We start by the characterization of homogeneous Besov spaces Bp;q from [20], see also [26, 32]. For inhomogeneous Besov spaces, we refer to [31]. For ˛ > 1, D > 1 and K 2 N0 , a K times differentiable function on Rd is called a K-atom if the following two conditions are fulfilled: A1) supp DQj;l .Rd / for some l 2 Rd , where DQj;l .Rd / denotes the cube in Rd centered at ˛ j l with sides parallel to the coordinate axes and side length 2˛ j D. A2) jD .x/j ˛ jjj for j j K. Now the homogeneous Besov spaces can be characterized as follows. Theorem 3.43. Let D > 1 and K 2 N0 with K 1 C b c, > 0 be fixed. Let 1 p 1. Then f 2 BP p;q if and only if it can be represented as f .x/ D
XX
.j; l/ j;l .x/;
(3.62)
j2Z l2Zd
where the j;l are K-atoms with supp j;l DQj;l .Rd / and kf kBP p;q inf
X
d
˛ j. p /q
X
j.j; l/jp
qp 1q
l2Zd
j2Z
where the infimum is taken over all admissible representations (3.62). Now we can state and prove the following embedding result of certain subspaces of shearlet coorbit spaces into (sums of) homogeneous Besov spaces. 3 P 1 3 P 2 3 Theorem 3.44. The embedding S C .1;1/ p;r .R / Bp;p .R / C Bp;p .R /, holds true, where
1 C 2b1 c D 3r
9 21 C 2 p
and
7 2 5 2 b2 c D r C C : 3 3p 6
Proof. By (3.61) we know that T 2 S C .1;1/ p;r can be written as T.x/ D
X
X
X
X
c.j; k; l/
j;k;l .x/:
j2Z jk1 j˛ 2j=3 jk2 j˛ 2j=3 l2Z3
By Theorem 3.36, the analyzing function can be chosen compactly supported in ŒD; D3 for some D > 1. For our i , i D 1; 2 defined in the theorem, let Ki WD 1 C bi c, i D 1; 2, and K WD maxfK1 ; K2 g. We normalize such that its derivatives of order 0 j j K are not larger than 1. The first step is to split T 2 S C .1;1/ p;r into T D T1 C T2 as follows: X X X X c.j; k; l/ j;k;l .x1 ; x2 ; x3 / (3.63) T1 .x1 ; x2 ; x3 / WD j0 jk1 j˛ 2j=3 jk2 j˛ 2j=3 l2Z3
XX
T2 .x1 ; x2 ; x3 / WD
c.j; 0; l/
j;k;l .x1 ; x2 ; x3 /:
(3.64)
j 0, let D."; p/ be the ball in R2 of radius " and center p, and Dc ."; p/ D R2 n D."; p/. Using (4.13), we can write the shearlet transform of B as S H B.a; s; p/ D I1 .a; s; p/ C I2 .a; s; p/; where I1 .a; s; p/ D
1 2i
I2 .a; s; p/ D
1 2i
Z 2Z 1Z 0
0
.d/ O a;s;p .; /e2i ./E˛.t/ ./ nE.t/ dt d d;
Z 2Z 1Z 0
(4.14)
@S\D.";p/
.d/ O a;s;p .; /e2i ./E˛.t/ ./ nE.t/ dt d d:
@S\Dc .";p/
0
(4.15) The following Localization Lemma shows that I2 has rapid asymptotic decay, at fine scales. Lemma 4.7 (Localization Lemma). Let I2 .a; s; p/ be given by (4.15). For any positive integer N, there is a constant CN > 0 such that jI2 .a; s; p/j CN aˇN ; asymptotically as a ! 0, uniformly for all s 2 R. Proof. We will only examine the behavior of I2 .a; s; p/ for jsj 1, so that we use the horizontal shearlets only. The proof when jsj > 1 is similar. We have: 1 I2 .a; s; p/ D 2i D
1Cˇ a 2
2i
Z
Z @S\Dc .";p/
Z @S\Dc .";p/
2Z 1 0
0
Z
.h/ O a;s;p .; /e2i ./E˛.t/ ./ nE.t/ dt d d
2Z 1 0
0
O 1 .a cos / O 2 .aˇ1 .tan s//
e2i ./p d de2i ./E˛.t/ ./ nE.t/ dt ˇ1
a 2 D 2i
Z
Z @S\Dc .";p/
2Z 1 0
0
O 1 . cos / O 2 .aˇ1 .tan s//
e2i a ./.pE˛.t// ./ nE.t/ d d dt: By assumption, since integration occurs on the region Dc ."; p/, we have that kp ˛.t/k E " for all ˛.t/ E 2 @S \ Dc ."; p/. Hence, there is a constant Cp such that infx2@S\Dc .";p/ jp xj D Cp . Note that, due to the assumptions on the support on O 2 , the integration in the variable is restricted to the interval I D f W j tan sj T C a1ˇ g. Let I1 D f W j./ .p x/j pp g I , and I2 D I n I1 . Since the 2
vectors ./; 0 . / form an orthonormal basis in R2 , it follows that, on the set I2 , C we have j 0 . / .p x/j pp . Hence we can express the integrals I2 as a sum 2 of a term where 2 I1 and another term where 2 I2 , and integrate by parts as follows. On I1 , we integrate by parts with respect to the variable ; on I2 , we integrate by parts with respect to the variable . Doing this repeatedly, it yields that, for any positive integer N, jI2 .a; s; p/j CN aˇN , uniformly in s. This finishes the proof of the lemma. t u Let ˛E .t/ be the arclength parametrization of the boundary curve @S, with 0 t L and p 2 @S. Without loss of generality we may assume that L > 1 and p D .0; 0/ D ˛.1/. E If p is a regular point, we write the boundary curve near p as C D @S \ D."; .0; 0//, where C D f˛.t/ E W 1 " t 1 C "g: Rather than using the arclength representation of C , we can also write C D f.G.u/; u/; " u "g, where G.u/ is a smooth function. Since p D .0; 0/, we have G.0/ D 0. If p is a corner point of @S, we write the boundary curve near p as C D @S \ D."; .0; 0// D C [ C C ; where E W 1 " t 1g; C D f˛.t/
C C D f˛.t/ E W 1 t 1 C "g:
Similar to the regular point case, we can write C C D f.GC .u/; u/; 0 u "g and C D f.G .u/; u/; " u 0g, where GC .u/ and G .u/ are smooth functions on Œ0; " and Œ"; 0, respectively. We will need the following Lemma which is proven in [21]. Lemma 4.8. Let 2 2 L2 .R/ be such that k 2 k2 D 1, supp O 2 Œ1; 1, O 2 is even, nonnegative and decreasing on Œ0; 1. Then, for each > 0, we have that Z
1 0
O 2 .u/ sin.u2 / C cos.u2 / du > 0:
We can now proceed with the proof of Theorem 4.4. Proof of Theorem 4.4. As above, it will be sufficient to examine the case of the horizontal shearlets only; the case of vertical shearlets is similar. • Part (i) This follows directly from Lemma 4.7. • Part (ii) Assume that s D s0 does not correspond to any of the normal directions of @S at p D .0; 0/. We write s0 D tan 0 , where we assume that j0 j 4 . Otherwise, for the case 4 < j0 j 2 , one will use the vertical shearlets and the argument is very similar to the one presented below. Hence, we have that ˇ1
I1 .a; s0 ; 0/ D
a 2 2i
Z
1Z 2
0
O 1 . cos / O 2 .aˇ1 .tan tan 0 //K.a; ; / d d;
0
where Z K.a; ; / D
1C" 1"
e2i a ./E˛.t/ ./ nE.t/ dt:
Let b 2 C01 .R/ be a smooth bump function such that b.t/ D 1 for jt 1j and b.t/ D 0 for jt 1j > 3" . Hence we can write 4
" 4
I1 .a; s0 ; 0/ D I11 .a; s0 ; 0/ C I12 .a; s0 ; 0/; where ˇ1
a 2 I11 .a; s0 ; 0/ D 2i
Z 1Z 0
2 0
O 1 . cos / O 2 .aˇ1 .tan tan 0 //
K1 .a; ; / d d; ˇ1
a 2 I12 .a; s0 ; 0/ D 2i
Z 1Z 0
2 0
O 1 . cos / O 2 .aˇ1 .tan tan 0 //
K2 .a; ; / d d; and Z K1 .a; ; / D
1"
Z K2 .a; ; / D
1C"
1C"
1"
e2i a ./E˛.t/ ./ nE.t/ b.t/ dt
e2i a ./E˛.t/ ./ nE.t/ .1 b.t// dt:
From the definition of b.t/, we have 1 b.t/ D 0 for jt 1j 4" . Since the boundary curve f˛.t/; E 0 t Lg is simple and p D .0; 0/ D ˛.1/, E it follows that there exists a c0 > 0 such that k˛.t/k E c0 for all t with 4" jt 1j ".
" Replacing the set Dc ."; p/ by the set f˛.t/; E jt 1j "g, one can repeat the 4 argument of Lemma 4.7 for I12 .a; s0 ; 0/ to show that jI12 .a; s0 ; 0/j CN aN for any N > 0.
Recall that when a ! 0, we have ! 0 . Since s0 does not correspond to the normal direction at p, one can choose " sufficient small so that ./ ˛E 0 .t/ ¤ 0 for jt 1j " and for all small a (and hence for near 0 ). Also from the assumption on b, it follows that b.n/ .1 "/ D 0 and b.n/ .1 C "/ D 0 for all n 0. Writing
e2i a ./E˛.t/ D
0 a 2i a ./E ˛ .t/ e ; 2i ./ ˛E 0 .t/
it follows that Z 1C" 0 ./ nE.t/ a K1 .a; ; / D b.t/ dt e2i a ./E˛.t/ 2i 1" ./ ˛E 0 .t/ !
1C" ai E.t/ 2i a ./E ˛ .t/ ./ n D e b.t/ C K3 .a; ; / 2 ./ ˛E 0 .t/ 1" D
ai K3 .a; ; /; 2
where we used the fact that b.1 "/ D 0, b.1 C "/ D 0 and Z K3 .a; ; / D
1C"
2i a ./E ˛ .t/
e 1"
./ nE.t/ b.t/ ./ ˛E 0 .t/
0 dt:
Repeating the above argument for K3 .a; ; / and using induction, it follows that for all N > 0 there exists a CN > 0 such that jK1 .a; ; /j CN aN and, hence, that jI11 .a; s0 ; 0/j CN aN : • Part (iii) Without loss of generality, we may assume that p D .0; 0/ and s0 D 0 so that tan 0 D 0 (and hence G0 .0/ D 0). It follows that G.u/ D Au2 C O.u3 / near u D 0 with some constant A. Since G00 .0/ D 0 by assumption in this case, we have A D 0. Using polar coordinates, we can express I1 .a; 0; 0/, evaluated on s0 D 0, as I1 .a; 0; 0/ ˇ1
a 2 D 2i
Z 1Z 0
0
2
O 1 . cos / O 2 .a1ˇ tan /
. cos C sin O.u2 // du d d:
Z
"
"
3 /Csin
e2i a .cos O.u
u/
By Lemma 4.7, to complete the proof of this case it is sufficient to show that lim a
a!0C
1Cˇ 2 I1 .a; s0 ; 0/
¤ 0:
In the expression of I1 , the interval Œ0; 2 of the integral in can be broken into the subintervals Œ 2 ; 2 and Œ 2 ; 3 . On Œ 2 ; 3 , we let 0 D so that 2 2 0 2 Œ 2 ; 2 and sin D sin 0 , cos D cos 0 . Using this observation and the fact that O 1 is an odd function, it follows that I1 .a; 0; 0/ D I10 .a; 0; 0/ C I11 .a; 0; 0/; where 2i a
1ˇ 2 I10 .a; 0; 0/
Z 1Z
D cos
2
2
0
Z 1Z C cos
2
2
0
Z
O 1 . cos / O 2 .a1ˇ tan /
"
3 /Cu sin /
e2i a .cos O.u
du d d
"
O 1 . cos / O 2 .a1ˇ tan /
Z
"
3 /Cu sin /
e2i a .cos O.u
du d d;
"
and 1ˇ 2 I11 .a; 0; 0/
2i a
1Z
Z D sin
2
0
1Z
Z sin
2
O 1 . cos / O 2 .a1ˇ tan /
2
2
0
Z
"
3 /Cu sin /
e2i a .cos O.u
"
O 1 . cos / O 2 .a1ˇ tan /
Z
"
"
O.u2 /du d d
3 /Cu sin /
e2i a .cos O.u
O.u2 / du d d:
For 2 . 2 ; 2 /, let t D aˇ1 tan and u D aˇ v. We observe that a ! 0 implies ! 0. It is easy to see that lima!0 a1 .sin u/ D lima!0 a1 .cos a1ˇCˇ tv/ D tv and that, for 13 < ˇ < 1, we have lima!0 It follows that lim 2ia
a!0C
Z D
0
1
D lima!0 a3ˇ1 O.v 3 / D 0:
1Cˇ 2 I10 .a; 0; 0/
Z O 1 ./
D 2 O2 .0/
O.u3 / a
Z 0
1
Z
1
1 1 1
O 2 .t/e2ivt dtdvd C
O1 ./d > 0:
Z 0
1
Z O 1 ./
1
Z
1
1 1
O 2 .t/e2ivt dtdvd
Since there is a factor O.u2 / in the expression of I11 , the same calculation
1Cˇ 2 I11 .a; 0; 0/ D 0. Through these calculations, we have 1Cˇ lima!0C 2ia 2 I1 .a; s0 ; 0/ ¤ 0. This completes the proof of
yields that lima!0C 2ia hence shown that part (iii).
• Part (iv) We first consider the case of ˇ D 12 . Since k.p0 / 6D 0, by assumption we have that G.u/ D Au2 C O.u3 / with A 6D 0. Without loss of generality, we may assume A D 1 so that we can write G.u/ D u2 C O.u3 /. As in (iii), using polar coordinates, we can express I1 .a; 0; 0/, evaluated on s0 D 0, as I1 .a; 0; 0/ D I10 .a; 0; 0/ C I11 .a; 0; 0/; where 1
2i a 4 I10 .a; 0; 0/ Z Z 1Z 2 1 O 1 . cos / O 2 .a 4 tan / D cos 0
Z C cos
2
"
1Z
2
1
2
0
"
O 1 . cos / O 2 .a 2 tan /
2 CO.u3 //Cu sin /
e2i a .cos .u Z
"
"
du d d
2 CO.u3 //Cu sin /
e2i a .cos .u
du d d
and 1
2i a 4 I11 .a; 0; 0/ Z 1Z D sin 0
2
2
O 1 . cos / O 2 .a 12 tan /
.2u C O.u2 //du d d sin Z
"
2 CO.u3 //Cu sin /
e2i a .cos .u
"
Z
1Z 0
Z
2
2
"
2 CO.u3 //Cu sin /
e2i a .cos .u
"
O 1 . cos / O 2 .a 12 tan /
.2u C O.u2 // du d d:
As in part (iii), it is enough to show 3
lim a 4 I10 .a; 0; 0/ ¤ 0:
a!0C 1
1
For 2 . 2 ; 2 /, let t D a 2 tan and u D a 2 v. In the calculation below, we will use the formulas of Fresnel integrals Z
1
cos. x2 /dx D 2 1
Z
1
sin. x2 /dx D 1: 2 1
Hence we have: 3
lim 2ia 4 I10 .a; 0; 0/
a!0C
Z
1
D 0
1 ./ 1
Z
1
C 0
Z
1
D 0
1
C
Z
O 2 .t/ 1
1 1
1
1 ./ 1
0
Z
Z
1
1 ./
Z
1
1
1 ./ 1
Z
Z
Z
1
e2i.v
1
dv dt d
1
Z
O 2 .t/
1
e2i.v
1
1 2 e 2 it O 2 .t/
1
2 Cvt/
Z
1 2 e 2 it O 2 .t/
1
2 Cvt/
dv dt d 1
2
e2i.vC 2 t/ dv dt d
1
Z
1
1
2
e2i.vC 2 t/ dv dt d
1
Z 1 O 1 ./ Z 1 2 O 2 O t t cos. / .t/ dt C sin. / .t/dt d; p 2 2 2 2 1 1
1
D 0
The last expression is strictly positive by Lemma 4.8 and the properties of O 1 . This completes the proof of part (iv) for ˇ D 12 . It remains to prove part (iv) for 0 < ˇ < 12 . We will follow the same idea as the argument for ˇ D 12 and use the change of variables t D aˇ1 tan and u D 12 v (instead of u D aˇ v in the proof of part (iii)). Note that a ! 0 implies ! 0. We will also use the observation that 1 1 3 lim .u2 C O.u3 // D lim .av 2 C a 2 O.v 3 // D v 2 a!0 a a
a!0
and 1 1 1 1 lim .sin u/ D lim .cos a1ˇC 2 v/ D lim a 2 ˇ cos v/ D 0: a!0 a a!0 a a!0 With these observations, it now follows that ˇ
1 2
lim 2i a
a!0C
Z I10 .a; 0; 0/ D Z
1
Z
1 ./
Z
1
0 1
C
1
0
Z
1 ./
1
D 0
This completes the proof of part (iv).
1 1 1
1
Z O 2 .t/ Z O 2 .t/
1
2
e2iv dv dt d
1 1 1
2
e2iv dv dt d
O 1 ./ Z 1 O 2 .t/ dt > 0: p d 1
• Part (v)
We need only to consider
1 2
< ˇ < 1.
Here again we let t D aˇ1 tan , but let u D aˇ v as in the proof of part (iii) 1 (recall that we let u D a 2 v in the proof of part (iv)). In this case we have 1 1 lim .u2 C O.u3 // D lim .a2ˇ v 2 C a3ˇ O.v 3 // D 0 a!0 a a 1 1 lim .sin u/ D lim .cos a1ˇCˇ tv/ D t v: a!0 a a!0 a
a!0
Following the same argument as in the proof of part (iii), we have lim 2ia
a!0C
1Cˇ 2 I10 .a; 0; 0/
Z
1
D Z
0 1
C 0
1 ./
Z
1
c1 ./
D 2c2 .0/
Z
Z
1 0
1
1 1
Z Z
1
1 1 1
1
O 2 .t/e2ivt dt dv d O 2 .t/e2ivt dt dv d
c1 ./ d > 0:
4.4 Shearlet analysis of edges in dimension n D 3 The shearlet-based analysis of step edges we presented above extends to the 3-dimensional setting [22, 23]. The statements of the 3D results are formally very similar to their 2-dimensional counterparts. Also the proofs follow essentially the same ideas of the 2D case, except for more significant changes needed in the analysis of irregular boundary points. Indeed, if we consider functions of the form f D ˝ where ˝ R3 is a compact set with piecewise smooth boundary, then there are two main types of ‘edges’ to consider: the smooth (or regular) surface boundaries and the curvilinear singularities resulting from the intersection of different smooth (or regular) sections of such surfaces. We will show below that it is possible to design an appropriate modification of the continuous shearlet transform suited to these curvilinear singularities in R3 .
4.4.1 3D Continuous Shearlet Transform There is a natural way to extend to the 3D setting the 2-dimensional fine-scale continuous shearlet transform we constructed in Section 4.2.3. Similar to the 2-dimensional case, we can define separate 3D shearlet systems spanning proper subspaces of L2 .R3 /, which are obtained by restricting the shear variables to a finite subset only. Namely, we introduce the following pyramidal regions in R3 :
P1 D f. 1 ; 2 ; 3 / 2 R3 W j 1 j 2; j
21 j 1 and j
13 j 1g; P2 D f. 1 ; 2 ; 3 / 2 R3 W j 2 j 2; j
21 j > 1 and j
23 j 1g; P3 D f. 1 ; 2 ; 3 / 2 R3 W j 3 j 2; j
31 j > 1 and j
23 j > 1g: For D . 1 ; 2 ; 3 / 2 R3 , 1 ¤ 0, let
.d/
, d D 1; 2; 3 be defined by
O .1/ . / D O .1/ . 1 ; 2 ; 3 / D O 1 . 1 / O 2 . 2 /; O 2 . 3 /;
1
1 O .2/ . / D O .2/ . 1 ; 2 ; 3 / D O 1 . 2 / O 2 . 1 /; O 2 . 3 /;
2
2 O .3/ . / D O .3/ . 1 ; 2 ; 3 / D O 1 . 3 / O 2 . 2 /; O 2 . 1 /;
3
3 where 1 , 2 satisfy the same assumptions as in the 2D case. For d D 1; 2; 3, and ˇ D .ˇ1 ; ˇ2 /, with 0 < ˇ1 ; ˇ2 < 1, the 3D pyramid-based continuous shearlet systems for L2 .Pd /_ are the systems f where
.d/ a;s1 ;s2 ;t
.d/ a;s1 ;s2 ;t .x/
.1/ Mas 1 s2
W 0 a 14 ; 32 s1 32 ; 32 s2 32 ; t 2 R3 g; .d/
1
D j det Mas1 s2 j 2
.d/
a aˇ1 s1 aˇ2 s2
D
0 0
aˇ1
0 aˇ2
0
.3/ Mas 1 s2
D
.d/
..Mas1 s2 /1 .x t//; and
!
.2/ D ; Mas 1 s2
aˇ1 0 0 0 aˇ2 0 ˇ ˇ 1 2 a s1 a s2 a
In particular, in the Fourier domain, the shearlets .1/ O a;s . 1 ; 2 ; 3 / D a 1 ;s2 ;t
1Cˇ1 Cˇ2 2
aˇ1 0 0 aˇ1 s1 a aˇ2 s2 0 0 aˇ2
;
:
.1/ a;s1 ;s2 ;t
have the form:
O 1 .a 1 / O 2 .aˇ1 1 . 2 s1 // O 2 .aˇ2 1 . 3 s2 // e2i t ;
1
1
with similar expressions for the shearlets on the other pyramidal regions. This shows .d/ that, similar to the 2D case, the shearlets a;s1 ;s2 ;t are well-localized waveforms associated with various scales controlled by a, orientations controlled by the two shear variables s1 ; s2 and locations controlled by t. Fig. 4.3 shows the Fourier domain support of a representative element of a 3D continuous shearlet system. For f 2 L2 .R3 /, we define the 3D (fine-scale) pyramid-based continuous shearlet transform f ! S H f .a; s1 ; s2 ; t/, for a > 0, s1 ; s2 2 R, t 2 R3 by
Fig. 4.3 Fourier domain support of a representative element system, inside the pyramidal region P1 .
8 ˆ hf ; ˆ ˆ < S H f .a; s1 ; s2 ; t/ D hf ; ˆ ˆ ˆ :hf ;
.1/ a;s1 ;s2 ;t i .2/ i s a; s1 ; s2 ;t 1 1 .3/ i s a; s1 ; s1 ;t 2
2
.1/ a;s1 ;s2 ;t
of a 3D continuous shearlet
if js1 j; js2 j 1; if js1 j > 1; js2 j js1 j if js2 j > 1; js2 j > js1 j:
That is, depending on the values of the shearing variables, the 3D continuous shearlet transform only involves one specific pyramid-based shearlet system. As above, we are only interested in the continuous shearlet transform at “fine scales,” as a approaches 0, since this is what is needed for the analysis of the singularities of f .
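In the Fourier domain the pyramid-one shearlets are simple separable windows in the variables $(a\xi_1,\, a^{\beta_1-1}(\xi_2/\xi_1 - s_1),\, a^{\beta_2-1}(\xi_3/\xi_1 - s_2))$. The sketch below builds such a window on a grid using generic bump profiles for $\hat\psi_1$ and $\hat\psi_2$; these specific profiles are placeholders chosen for illustration, not the ones assumed in the chapter.

```python
import numpy as np

# Fourier-domain shearlet window on the pyramid P1:
# psi1_hat(a*xi1) * psi2_hat(a^(b1-1)*(xi2/xi1 - s1)) * psi2_hat(a^(b2-1)*(xi3/xi1 - s2)).
def psi1_hat(r):                     # rough band-pass profile supported away from 0
    r = np.abs(r)
    return np.where((r > 0.5) & (r < 2.0),
                    np.exp(-1.0 / np.maximum((r - 0.5) * (2.0 - r), 1e-12)), 0.0)

def psi2_hat(r):                     # rough bump supported in [-1, 1]
    return np.where(np.abs(r) < 1.0,
                    np.exp(-1.0 / np.maximum(1.0 - r**2, 1e-12)), 0.0)

a, s1, s2, b1, b2 = 0.05, 0.3, -0.2, 0.5, 0.5
xi = np.linspace(-40, 40, 81)
X1, X2, X3 = np.meshgrid(xi, xi, xi, indexing="ij")
X1 = np.where(X1 == 0, 1e-9, X1)     # avoid division by zero on the plane xi1 = 0

W = (psi1_hat(a * X1)
     * psi2_hat(a ** (b1 - 1) * (X2 / X1 - s1))
     * psi2_hat(a ** (b2 - 1) * (X3 / X1 - s2)))
print("number of nonzero grid points in the support:", int(np.count_nonzero(W)))
```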
4.4.2 Characterization of 3D Boundaries Similar to its 2D counterpart, the 3D continuous shearlet transform can be applied to analyze the geometry of the set of singularities of functions and distributions of 3 variables. In particular, we can show that it provides a geometric characterization of the boundary set of solid regions. To state the precise result, let f D ˝ , where ˝ is a subset of R3 whose boundary @˝ is a 2-dimensional manifold. We say that @˝ is piecewise smooth if: (i) @˝ is a C1 manifold except possibly for finitely many separating C3 curves on @˝; (ii) at each point on a separating curve, @˝ has exactly two outer normal vectors, which are not on the same line.
Let the outer normal vector of @˝ be nEp D ˙.cos 0 sin 0 ; sin 0 sin 0 ; cos 0 / for some 0 2 Œ0; 2; 0 2 Œ0; : We say that s D .s1 ; s2 / corresponds to the normal 1 1 direction nEp if s1 D a 2 tan 0 ; s2 D a 2 cot 0 sec 0 . Notice that this definition excludes, in particular, surfaces containing vertices, such as the vertex of a cone. The following theorem, which is a proved in [23], shows that the behavior of the 3D continuous shearlet transform is consistent with the one found in dimension n D 2. Namely, for f D ˝ , the continuous shearlet transform S H f .a; s1 ; s2 ; t/, has rapid asymptotic decay as a ! 0 for all locations t 2 R3 , except when t is on the boundary of ˝ and the orientation variables s1 ; s2 correspond to the normal direction of the boundary surface at t, or when t is on a separating curve and the orientation variables s1 ; s2 correspond to the normal direction of the boundary surface at t. Thus, as in the 2D case, the continuous shearlet transform provides a description of the geometry of @˝ through the asymptotic decay of S H B.a; s1 ; s2 ; t/, at fine scales. Here is the precise statement for the case ˇ1 D ˇ2 D 12 . Theorem 4.9. Let 1 2 be chosen as in Theorem 4.4, ˇ1 D ˇ2 D 12 and f D ˝ , where ˝ be a bounded region in R3 . Assume that the boundary surface @˝ is a piecewise smooth 2-dimensional manifold. Let j ; j D 1; 2; m be the separating curves of @˝. Then we have (i) If t … @˝, then lim aN S H f .a; s1 ; s2 ; t/ D 0;
a!0C
for all N > 0:
S (ii) If t 2 @˝ n m jD1 j and .s1 ; s2 / does not correspond to the normal direction of @˝ at t, then lim aN S H f .a; s1 ; s2 ; t/ D 0;
a!0C
for all N > 0:
S (iii) If t 2 @˝ n m jD1 j and s D .s1 ; s2 / corresponds to the normal direction of S @˝ at t or t 2 m jD1 j and s D .s1 ; s2 / corresponds to one of the two normal directions of @˝ at t, then lim a1 S H f .a; s1 ; s2 ; t/ ¤ 0:
a!0C
Hence, similar to the 2D case, the continuous shearlet transform decays rapidly away from the boundary and on the boundary for non-normal orientations. The decay rate is only O.a/ at the boundary for normal orientation. This behavior is illustrated in Fig. 4.4. On the separating curves of the surface i the information provided by the theorem is less sharp, in the sense that, for normal orientations, the decay rate is O.a/, but, for non-normal orientations, we can only say that the decay rate is faster but not necessarily rapid. In particular, it is shown in [23] that,
O(a 4 ) 3
Fig. 4.4 Asymptotic decay rate of the 3D continuous shearlet transform. The continuous shearlet transform S H B.a; s1 ; s2 ; t/, where B D ˝ , ˝ R3 and @˝ piecewise smooth boundary has rapid asymptotic decay, as a ! 0, away from @˝. When t 2 @˝ is a regular boundary point and the shear variables .s1 ; s2 / correspond to the normal direction at t, then S H B.a; s1 ; s2 ; t/ O.a/, as a ! 0; otherwise, if .s1 ; s2 / does not correspond to the normal direction at t, S H B has rapid asymptotic decay.
when ˇ1 D ˇ2 D 1=2, the decay rate is of order O.a3=2 / or faster. We will see in Section 4.4.3 that there is an alternative and more precise approach to analyze the curves i . We refer the reader to [23] for the complete proof of Theorem 4.9. In the following, we make a few basic observations about this proof. As in the 2D case, the starting point of the proof is the divergence theorem, which allows us to write the Fourier transform of f as fO . / D O ˝ . / D
1 2ij j2
Z @˝
e2i x nE.x/ d .x/;
where nE is the outer normal vector to @˝ at x. Next, using spherical coordinates, we have that, for js1 j; js2 j < 1, S H f .a; s1 ; s2 ; t/ D hf ;
.1/ a;s1 ;s2 ;t i
D I1 .a; s1 ; s2 ; t/ C I2 .a; s1 ; s2 ; t/;
where Z I1 .a; s1 ; s2 ; t/ D
0
Z I2 .a; s1 ; s2 ; t/ D
0
2 Z Z 1 0
0
2 Z Z 1 0
0
.1/ T1 .; ; / O a;s1 ;s2 ;t .; ; / 2 sin d d d
.1/ T2 .; ; / O a;s1 ;s2 ;t .; ; / 2 sin d d d
180
K. Guo and D. Labate
and 1 T1 .; ; / D 2i T2 .; ; / D
1 2i
Z P" .t/
e2i .; /x .; / nE.x/ d .x/
Z
@˝nP" .t/
e2i .; /x .; / nE.x/ d .x/;
where P" .t/ D @˝ \ D."; t/, and D."; t/ is the ball in R3 of radius " and center t. Notice that I2 is associated with the term T2 which is evaluated away from the location t of the continuous shearlet transform. Hence, a localization result similar to Lemma 4.7 shows that I2 is rapidly decreasing as a ! 0. The rest of the proof, when t is located on a regular point of @˝, can be derived using ideas similar to the proof of Theorem 4.4. The situation for t located on a separating curve requires a different approach cannot be carried out using the method from the proof of Theorem 4.6.
4.4.3 Identification of curve singularities on the piecewise smooth surface boundary of a solid In the 3-dimensional setting, there are other types of singularities of interest aside from the surface boundary of a solid region. In particular, if f D ˝ , where ˝ R3 q is a compact subset whose surface boundary S˝ D @˝ is the union S˝ D [iD1 Si and each Si is a smooth surface with boundary curve i , it may be useful to identify the location and orientation of the separating curves D 1 [ [ m . We can attempt to address this question by making use of the results of Section 4.4.2 to determine if a given t 2 R3 belongs to , and, if so, the value of the tangent vector 0 at t. Using Theorem 4.9, where we choose ˇ1 D ˇ2 D 1=2, we make the following observations. • If aN lima!0C S H f .a; s1 ; s2 ; t/ D 0, for all N > 0 and all but at most one pair .s1 ; s2 /, then t … . • If aNk lima!0C S H f .a; sk1 ; sk2 ; t/ ¤ 0 for some N0 ; N1 > 0 and distinct .s01 ; s02 / and .s11 ; s12 /, then t 2 . • If a1 lima!0C S H f .a; sk1 ; sk2 ; t/ ¤ 0 for distinct .s01 ; s02 / and .s11 ; s12 / then 0 jt (the tangent line evaluated at t) equals (up to a scalar) the vector cross product of the orientations corresponding to .s01 ; s02 / and .s11 ; s12 /. However, there are multiple drawbacks regarding the applicability of this method. First, suppose S H f .a; s1 ; s2 ; t/ is observed to decay relatively slowly as a ! 0C for all .s1 ; s2 / only in some localized directional region S. It may be difficult to determine whether there is a single peak of slow decay .s01 ; s02 / (indicating t … ) or two nearby peaks of slow decay .s01 ; s02 / and .s11 ; s12 / (indicating t 2 ). Second, suppose we are relatively certain that t 2 . To determine 0 jt , we need to determine the two values .s01 ; s02 / and .s11 ; s12 / such that
4 Detection of Edges
181
a1 lim S H f .a; sk1 ; sk2 ; t/ ¤ 0; a!0C
for k D 1; 2. However, .s01 ; s02 / and .s11 ; s12 / may be difficult to isolate since, by Theorem 4.9, there may be .s1 ; s2 / ¤ .s01 ; s02 /; .s11 ; s12 / with comparably slow decay rates: a3=2 lim S H f .a; s1 ; s2 ; t/ ¤ 0: a!0C
As an alternative approach to this problem, one can introduce a modified continuous shearlet transform that can precisely address the question posed in the first paragraph of this section, while avoiding all of the limitations suggested in the previous paragraph. This idea was recently introduced in [30], and we summarize the main result below. First, let us be more precise about the types of objects we will consider. We say that the surface boundary S˝ of ˝ is piecewise CK at t if there exists an open set U R3 with t 2 U and F; G 2 CK .U; R/ with F.t/ D G.t/ D 0 and frF.t/; rG.t/g linearly independent such that ˝ \ U D fx 2 U W F.x/ < 0gfx 2 U W G.x/ < 0g (in the a.e. sense), where the symbol can be either \ or [. In this case, we call O˝ .t/ D rF.t/ rG.t/ (where is the vector cross product) the orientation of S˝ at t. Note that O˝ .t/ is well defined (up to nonzero scalar multiplication) and equals the tangent vector at t to the curve defined by fx W F.x/ D G.x/ D 0g near t. Fix ˇ1 > ˇ3 > ˇ2 > 0 and write ˇ0 D .ˇ1 ˇ2 ˇ3 /=2. For a > 0 and s 2 R, we define the following matrices: 1 1 1 0 0 100 1s0 10s B21 .s/ D @ s 1 0A B12 .s/ D @0 1 0A B13 .s/ D @0 1 0A 001 001 001 1 1 1 0 0 0 100 100 100 B31 .s/ D @0 1 0A B32 .s/ D @0 1 0A B23 .s/ D @0 1 s A s01 0s1 001 1 1 1 0 ˇ3 0 ˇ1 0 ˇ1 a 0 0 a 0 0 a 0 0 A21 .a/ D @ 0 aˇ1 0 A A12 .a/ D @ 0 aˇ3 0 A A13 .a/ D @ 0 aˇ2 0 A 0 0 aˇ2 0 0 aˇ2 0 0 aˇ3 0
182
K. Guo and D. Labate
0
0 ˇ2 0 ˇ2 1 1 1 aˇ3 0 0 a 0 0 a 0 0 A31 .a/ D @ 0 aˇ2 0 A A32 .a/ D @ 0 aˇ3 0 A A23 .a/ D @ 0 aˇ1 0 A 0 0 aˇ1 0 0 aˇ1 0 0 aˇ3 1 1 1 0 0 0 001 100 100 21 D @1 0 0A 12 D @0 0 1A 13 D @0 1 0A
31
Let
010 1 0 001 D @0 1 0 A 100
32
010 1 0 010 D @0 0 1A 100
23
001 1 0 010 D @1 0 0A : 001
2 L2 .R3 / and, for t 2 R3 , define ij a;s;t
The collection f
ij a;s;t g
1 1 1 D j det A1 ij j .ij Aij Bij .x t//:
(4.16)
induces the 6 continuous transforms fS ij g, where S ij f .a; s; t/ D hf ;
ij a;s;t i:
We are now ready to define a new transform able to capture curvilinear singularities on the surface of a solid in 3D. Let V denote .R3 n f0g/= , where v w if v D cw for some c 2 R n f0g. Write K D f.2; 1/; .3; 1/; .1; 2/; .3; 2/; .1; 3/; .2; 3/g and, for .i; j/ 2 K , define ( Pij D
Œ1; 1;
if .i; j/ D .1; 3/; .2; 3/; .1; 2/
.1; 1/;
otherwise
:
If v 2 V, there exists a unique j D j.v/ 2 f1; 2; 3g such that vj ¤ 0 and vi =vj 2 Pij , for all i, with the quantities j and vi =vj well defined with respect to . If a > 0, v 2 V, and t 2 R3 , we define the .3; 1/-continuous shearlet transform S.3;1/ as S.3;1/ f .a; v; t/ D
Y
Sij f .a; vi =vj ; t/:
i2f1;2;3gnfjg
The “.3; 1/” indicates that the transform is designed to capture singularities along 1-dimensional structures in the 3-dimensional ambient space. Before stating the main theorem from [30], we will make some additional assumptions on given by (4.16).
4 Detection of Edges
183
Let 0 < " < M1 < 1 and 0 < M2 < M3 < 1 be such that M3 > M2
M1 "
ˇ2 =ˇ1
:
Choose 1 ; 2 2 C1 .R/ such that Z supp.1 / Œ"; M1 ;
1
0
1 .a/2
da D ˇ1 =2; a
2 is compactly supported in .0; 1/, and j2 . /j D 1;
for all M2 M3 :
For q D 1; 2, define ( qeven . /
D (
qodd . /
D
q . /;
if 0
q . /;
if < 0
;
q . /;
if 0
q . /;
if < 0
;
and q 2 L2 .R/ by O q D qeven C iqodd : Let 0 < M4 < 1 and choose such that O 3 is even, belongs to C1 .R; R/, and satisfies O 3 .0/ ¤ 0, supp. O 3 / ŒM4 ; M4 ;
and
k
3k
3
2 L2 .R/
D 1:
Finally, define 2 L2 .R3 / by O . / D O 1 . 1 / O 2 . 2 / O 3 . 3 = 1 /. Note, in particular, that is real-valued and that O belongs to C1 .R3 / and is compactly supported. With this choice of we can now state the following result,2 which shows that S.3;1/ precisely identifies both t and the tangent vector O˝ .t/ when S˝ is piecewise C1 at t. Theorem 4.10. Let f D ˝ and suppose that ˇ1 < 2ˇ2 . We have the following: • If t … S˝ , where S˝ denotes the closure of S˝ , or if S˝ is C1 at t, then lim aN S.3;1/ f .a; v; t/ D 0;
a!0C
for all N > 0 and all v 2 V. 2 Note that, in [30], the theorem is stated under more general assumptions on satisfying an appropriate admissibility condition. The special function we consider here is in fact one example of a function satisfying such admissibility condition.
184
K. Guo and D. Labate
• Let v 2 V and assume S˝ is piecewise C1 at p. If v O˝ .t/, then lim a2.ˇ1 Cˇ3 Cˇ0 / S.3;1/ f .a; v; t/ 2 C [ f1g n f0gI
a!0C
otherwise, lim aN S.3;1/ .˝/.a; v; t/ D 0;
a!0C
for all N > 0. We refer to [30] for a discussion of more technical aspects of the theorem, such as the assumptions on the analyzing functions, and for the proof of this result. In the following, we will briefly illustrate the application of Theorem 4.10 using a simple example. We set f D ˝ where ˝ \ U D fx 2 U W x1 < 0g \ fx 2 U W x2 < 0g; and U D .1; 1/3 . Write D fx 2 U W x1 D x2 D 0g; S1 D fx 2 U W x1 D 0; x2 < 0g;
and
S2 D fx 2 U W x1 < 0; x2 D 0g:
Then, it follows that • S˝ \ U D S1 [ [ S2 • S˝ is C1 at t, for all t 2 S1 [ S2 • S˝ is piecewise C1 at t, with O˝ .t/ D .0; 0; 1/ and j.O˝ .t// D 3, for all t 2 . Thus, Theorem 4.10 implies that • If t 2 U n , then lim aN S.3;1/ .˝/.a; v; t/ D 0;
a!0C
for all N > 0 and all v 2 V. • Assume t 2 . If v .0; 0; 1/, lim a2.ˇ1 Cˇ3 Cˇ0 / S.3;1/ f .a; v; t/ 2 C [ f1g n f0gI
a!0C
otherwise, lim aN S.3;1/ f .a; v; t/ D 0;
a!0C
for all N > 0.
(4.17)
4 Detection of Edges
185
x3
O(a2(β1 +β3 +β0 ))
O(aN) O(aN)
O(aN)
x2 x1 Fig. 4.5 Asymptotic decay rate of the (3,1)-continuous shearlet transform S.3;1/ of f D ˝ , where ˝ R3 has piecewise smooth boundary S D S1 [ S2 [ : : : . Away from the surface S and on S but away from , the decay is faster than O.aN /, for any N 2 N. When t 2 , for v corresponding to the tangent direction of at t, S.3;1/ f .a; v; t/ decays as O.a2.ˇ1 Cˇ3 Cˇ0 / /; when v does not correspond to the tangent direction of at t, the decay is faster than O.aN /, for any N 2 N.
Thus, the (3,1)-continuous shearlet transform characterizes both the location and orientation of the singularity curve through its asymptotic decay at fine scales (Fig. 4.5). On the other hand, one can verify that the ‘standard’ 3D continuous shearlet transform S H defined in Section 4.4.1 is not very efficient to capture the geometry of the singularity line . In fact we observe the following. • If t 2 U n . [ S1 [ S2 /, then lim aN S H f .a; v; t/ D 0;
a!0C
for all v 2 V and all K > 0.
186
K. Guo and D. Labate
• Assume p 2 S1 . If v D .1; 0; 0/ (i.e., the normal vector of S1 ), then lim a1 S H f .a; v; t/ ¤ 0I
a!0C
otherwise, lim aN S H f .a; v; t/ D 0;
a!0C
for all N > 0. • Assume t 2 S2 . If v D .0; 1; 0/ (i.e., the normal vector of S2 ), then lim a1 S.3;2/ .˝/.a; v; t/ ¤ 0I
a!0C
otherwise, lim aN S.3;2/ .˝/.a; v; t/ D 0;
a!0C
for all N > 0. • Assume t 2 . If v 2 f.1; 0; 0/; .0; 1; 0/g, then lim a1 S H f .a; v; t/ ¤ 0I
a!0C
otherwise, lim sup a3=2 S H f .a; v; t/ < 1:
(4.18)
a!0C
We thus see that S H f is able to detect the location of as all t such that the condition lim ˛ N S H f .a; v; t/ D 0; for all N > 0
a!0C
fails for at least two values of v. In this case, S H f can then detect the orientation, .0; 0; 1/, of as the vector cross product of the two unique v, .1; 0; 0/ and .0; 1; 0/, for which (4.18) fails. Comparing these results to those in the previous paragraph (particularly, (4.17) to (4.18)), we see that while S H f can detect the location of just as precisely as S.3;1/ , the latter is much better able to precisely identify the orientation of .
4 Detection of Edges
187
4.5 Other results and applications The shearlet analysis of singularities extends beyond the cases considered in the previous sections. In this section, we will briefly review the results available in the literature. We will also include a brief discussion of the numerical algorithms that were developed based on shearlet analysis of singularities.
4.5.1 Shearlet analysis of general edges In the engineering literature, it is common to consider several types of edges beside the step edges [45]. For example, ramp edges are associated with sharp linear transitions in images. As an idealized model of such edges, let us consider the twodimensional distribution x1 H.x1 ; x2 /, where H is the two-dimensional Heaviside step function defined in Section 4.1. The line x1 D 0 is the ramp edge that we wish to detect. By applying the continuous shearlet transform, a calculation very similar to Section 4.1 yields: S H .x1 H/.a; s; t/ D hx1 H; a;s;t i Z 1 O 1 ; 2 / O a;s;t . 1 ; 2 / d 1 d 2 D @1 H.
2i R2 Z 1 O 1 ; 2 / @1 O a;s;t . 1 ; 2 / d 1 d 2 D H.
2i R2 Z ı2 . 1 ; 2 / O D @1 a;s;t . 1 ; 2 / d 1 d 2 2i 1 R2 Z 1 D @1 O a;s;t . 1 ; 2 /j 2 D0 d 1 2i
1 R Z 1Cˇ 1 D a 2 O 2 .aˇ1 s/ @1 O 1 .a 1 / e2i 1 t1 d 1 R 2i 1 Z 1Cˇ 1 a@1 O 1 .a 1 / C 2it1 O 1 .a 1 / e2i 1 t1 d 1 : D a 2 O 2 .aˇ1 s/ R 2i 1 Similar to the observations from Section 4.1, under the assumption that O 1 2 Cc1 .R/ it follows that S H .x1 H/.a; s; t/ decays rapidly, asymptotically for a ! 0, for all .t1 ; t2 / when t1 ¤ 0, and for t1 D 0, s ¤ 0. On the other hand, if t1 D 0 and s D 0 we have: Z 3Cˇ 1 O 2 S H .x1 H/.a; s; t/ D a @1 O 1 .a 1 / d 1 : 2 .0/ R 2i 1
188
K. Guo and D. Labate
Provided that O 2 .0/ ¤ 0 and that the integral on the right-hand side of the equation 3Cˇ
above is nonzero, it follows that S H .x1 H/.a; s; t/ D O.a 2 /: If the ideal ramp edge is replaced by a polynomial-type edge x1m H.x1 ; x2 /, using the same argument above we can show that the continuous shearlet transform will exhibit slow asymptotic decay at the edge location t1 D 0, when s D 0, with 2mC1Cˇ
decay rate O.a 2 /: In other words, the example suggests that, also in the case of general edges, the continuous shearlet transform may be able to capture the geometry of edge curves through its asymptotic behavior at fine scales. It was recently shown [24] that it is possible to extend the shearlet-based analysis of edges to general functions of the form B D f S where f 2 C1 .R2 / and S R2 is a compact region with piecewise smooth boundary @S. The analysis of this problem is significantly more involved than the example illustrated above. This is due not only to the fact that, as in the analysis of step edges, there is no explicit expression for the Fourier transform of B D f S , but also to the fact that f and its partial derivatives up to an arbitrary order may vanish at @S. To deal with this situation, the method proposed in [24] uses a slightly modified shearlet transform. The detailed presentation of this result is beyond the scope of this chapter and we refer the interested reader to the paper. We also recall that other partial results about the analysis of ‘general’ edges using methods based on the shearlet transform can be found in [18, 26]. Finally, we recall that the microlocal analysis of edges and singularities plays a very significant role in the problem of geometric separation, aiming to break up functions or distributions into geometrically distinct components. It was observed by Donoho and Kutyniok that the ability to achieve such separation is frequently a consequence of the ability to discriminate singularities. As a mathematical idealization of a class of images, they consider distributions of the form f D P C T, where P is a collection of point-like singularities and T is a piece-wise smooth function containing curvilinear edges. By expanding f within a combined dictionary of wavelets and curvelets and enforcing sparsity via minimization of the expansion coefficients in the `1 -norm, they proved that the geometric components P and T can be separately recovered by wavelets and curvelets, respectively, asymptotically at fine scales [11]. This result relies on the microlocal properties of wavelets and curvelets. In a recent paper [25], the authors of this chapter proved an extension of the geometric separation result to the three-dimensional setting using the shearletbased methods presented in Section 4.4.
4.5.2 Numerical applications The mathematical results presented in this chapter about the geometric analysis of singularities in multivariate functions provide the theoretical justification for a number of numerical methods for edge analysis and detection, and for feature extraction in images. We briefly describe below some of these applications.
4 Detection of Edges
189
In particular, a shearlet-based algorithm for edge detection was introduced in [43, 44], based on the properties of the continuous shearlet transform and taking advantage of its ability to accurately detect the edge orientation. This algorithm was shown to be very competitive with respect to more conventional edge detectors and inspired the development of similar directional-sensitive methods. Such methods include an edge detector using anisotropic Gaussian kernels which generate shearlet-like filters [41] and a method using a sort of directional spline wavelets [27]. An extension of the original shearlet-based algorithm for edge detection to the three-dimensional setting was developed in [40]. The theoretical properties of the continuous shearlet transform also justify a number of numerical methods designed to extract features in images. In particular, the analysis of step edges in Sections 4.3 shows that corner points are associated with the presence of two directions where the continuous shearlet transform has ‘slow’ asymptotic decay. This property was exploited in [44] to derive an algorithm to detect corners and other landmarks in images. A refined version of this corner detection algorithm was recently proposed in [12]. More generally, one can derive discrete algorithms based on the continuous shearlet transform to extract geometrical features associated with edges in images. Examples of such applications can be found in [34, 39], where a shearlet-based geometric descriptor called Directional Ratio is introduced to reliably detect morphological properties in fluorescent images of neuronal cultures.
Appendix We recall some basic facts from Fourier analysis, including the Fourier transform of distributions. We refer the reader to [13, 14] for additional details.
The Fourier transform L1 .Rn / is the space of Lebesgue integrable function on Rn and L2 .Rn / the Hilbert space of square Lebesgue integrable function on Rn endowed with the inner product R hf ; gi D Rn f g. The Schwartz space S .Rn / consists of those functions in C1 .Rn / which, together with all their derivatives, vanish at infinity faster than any power of jxj. That is, S .Rn / D ff 2 C1 .Rn / W sup .1 C jxj/N j@˛ f .x/j < 1; for all N; ˛g; x2Rn
where N is any nonnegative integer and ˛ D .˛1 ; : : : ; ˛n / 2 Nn is any multi-index.
190
K. Guo and D. Labate
Definition 4.11. The Fourier transform is the operator F mapping a function f 2 L1 .Rn / into F f D fO defined by Z fO . / D
Rn
f .x/e2ix dx:
The inverse Fourier transform is the operator F 1 mapping a function g 2 L1 .Rn / into F 1 g D gL , where Z gL .x/ D gO .x/ D
g. /e2ix d :
Rn
It is a fact that g D .Og/_ for any function g 2 L1 .Rn / with gO 2 L1 .Rn /. The Fourier transform is a bijection of S onto itself and can be extended via an appropriate limit to a unitary map from L2 onto itself. Under this extension, the Fourier inversion theorem is valid and, for f ; g 2 L2 .Rn /, the Plancherel formula holds: hf ; gi D hfO ; gO i; and, in particular, kf k2 D kfO k2 : Among the most important properties of Fourier analysis we recall the following list of results (cf. [13]). Theorem 4.12. Let f ; g 2 L1 .Rn /. For y 2 Rn , let Ty f .x/ D f .x y/ and, for M 2 GLn .C/, let DM f .x/ D j det Mj1=2 f .M 1 x/. .Ty f /^ . / D e2i y fO . / and .DM f /^ . / D DN fO . /; where N D .M /1 . .f g/^ D fO gO : If x˛ f 2 L1 for j˛j k; then @˛ fO D ..2ix/˛ f /^ : If f 2 Ck , @˛ fO 2 L1 , for j˛j k, and @˛ fO 2 C0 , for j˛j k 1; then ˛O .@˛fO /^ . / D .2i / f . /: (v) F L1 .Rn / C0 .Rn /.
(i) (ii) (iii) (iv)
The following proposition is a simple application of the Fourier transform showing that regularity on Rn implies decay in the Fourier domain. Proposition 4.13. Suppose that 2 L2 .Rn / is such that O 2 Cc1 .R/, where R D supp O Rn . Then, for each k 2 N, there is a constant Ck > 0 such that, for any x 2 Rn , we have j .x/j Ck .1 C jxj2 /k :
4 Detection of Edges
191
Pn In particular, Ck D k m.R/ k O k1 C k4k O k1 , where 4 D iD1
@2 @ i2
is the
Fourier-domain Laplacian operator and m.R/ is the Lebesgue measure of R. Proof. From the definition of the Fourier transform, it follows that, for every x 2 Rn , j .x/j m.R/ k O k1 :
(4.19)
An integration by parts shows that Z
4 O . / e2ih ;xi d D .2/2 jxj2
.x/:
R
Thus, for every x 2 Rn , .2 jxj/2k j .x/j m.R/ k4k O k1 :
(4.20)
Using (4.19) and (4.20), we have 1 C .2 jxj/2k j .x/j m.R/ k O k1 C k4k O k1 :
(4.21)
Observe that, for each k 2 N, k .1 C jxj2 /k 1 C .2/2 jxj2 k 1 C .2 jxj/2k : Using this last inequality and (4.21), we have that for each x 2 Rn j .x/j k m.R/ .1 C jxj2 /k k O k1 C k4k O k1 :
t u
Under the same assumptions of Proposition 4.13, we can derive a similar estimate valid for M;t D Tt DM . Using a change of variables in the last step of the proof above, we have that for all k > 0 there is a Ck > 0 such that j
M;t .x/j
1
Ck j det Mj 2 .1 C jM 1 .x t/j2 /k :
Distributions and the Fourier transform of distributions The space D.Rn / of test functions is the space of all C1 functions whose support is compact. A sequence f j g in D.Rn / converges in D to if the supports of all j are contained in a fixed compact subset of Rn and if @˛ j ! @˛ uniformly for all multi-indices ˛.
192
K. Guo and D. Labate
Definition 4.14. A distribution is a continuous linear functional on D and the space of distributions is denoted by D 0 . We impose the weak topology on D 0 , that is, the topology of pointwise convergence on D. If F 2 D 0 .Rn / and 2 D.Rn /, we denote the value of F at by F. / or hF; i. The latter notation conflicts with the notation of inner product but its meaning will be clear by the context. Given two distribution F and G, we say that F D G if hF; i D hG; i, for all 2 D.Rn /. R 1 Example 4.15. Every f 2 Lloc .Rn / defines a distribution by 2 S .Rn / ! Rn f . Example 4.16. The Dirac’s impulse ı is defined by ı. / D hı; i D .0/, 2 S .Rn /. This is an example of a distribution which is not a function. Example 4.17. The distribution pv. 1x / is defined by hpv. 1x /; i D P:V:
Z R
.x/ dx D lim "!0 x
Z
"
"
.x/ dx; x
for 2 S .R/, where P.V. is the principal value of the integral. There is a general procedure for extending many linear operations from functions to distributions. • Differentiation. For any F 2 D 0 .Rn /, the derivatives @˛ F 2 D 0 .Rn / are given by h@˛ F; i D .1/j˛j hF; @˛ i: • Multiplication by a smooth function. Given g 2 C1 .Rn /, for any F 2 D 0 .Rn /, the product gF 2 D 0 .Rn / is given by hgF; i D hF; g i: • Convolution. Let g 2 C1 .Rn /. For any F 2 D 0 .Rn /, the convolution F g 2 D 0 .Rn / is given by hF g; i D hF; gQ i; where gQ .x/ D g.x/. For example, let H be the one-dimensional Heaviside function, defined by H.x/ D 0 if x < 0, H.x/ D 1 if x 0. A direct computation shows that, for any 2 S .R/, 0
0
hH ; i D hH; i D Hence H 0 D ı.
Z
1 0
0 .x/ dx D .0/ D hı; i:
4 Detection of Edges
193
The following class of distributions are useful to extend the Fourier transform beyond the realm of classical functions. Definition 4.18. A tempered distribution is a continuous linear functional on S and the space of tempered distributions is denoted by S 0 . We impose the weak topology on S 0 , that is, the topology of pointwise convergence on S . R 1 n N Example 4.19. Every f 2 Lloc .R R / such that Rn .1 C jxj/ jf .x/j dx < 1 defines a tempered distribution by ! f . Example 4.20. Any distribution with compact support is tempered. The Fourier transform extends to a continuous linear map from S 0 to itself by defining O O i D hF; i; hF;
F 2 S 0; 2 S :
This definition agrees with the classical definition when F 2 L1 \ L2 . Furthermore, it is easy to verify that the basic properties of the Fourier transform continue to hold. In particular, for F 2 S 0 .Rn /, we have the following formulas. (i) (ii) (iii) (iv)
O where N D .M /1 . .Ty F/^ D e2i y FO and .DM F/^ D DN F; .F g/^ D FO gO ; for g 2 S .Rn /. @˛ FO D ..2ix/˛ F/^ : O ^ D .2i /˛ F: O .@˛ F/
Similarly, the inverse Fourier transform is defined on S 0 by L L i D hF; i; hF;
F 2 S 0; 2 S ;
L ^ D .F/ O _: and, for all F 2 S 0 .Rn /, F D .F/ Example 4.21. For any 2 S .R/, Z O i D hı; i O D .0/ O hı; D
R
.x/ dx D h1; i:
Hence ıO D 1. It follows that, for any y 2 R, the Fourier transform of ıy D Ty ı is .k/ . / D .2i /k : ıOy . / D e2iy and, for any k 2 N, ıc c Example 4.22. We will show that sgn. / D i1 pv. 1 /, where sgn is the signum function, that is defined as sgn.x/ D 1 of x < 0, sgn.x/ D 0 of x 0:( In order to derive ex=n if x < 0 this result, we consider first the functions fn defined by fn .x/ D , ex=n if x > 0 where n 2 N. An application of Lebesgue Dominated Convergence theorem shows that fn converges to f D sgn as n ! 1 in the sense of tempered distributions, that is, hfn ; i ! hsgn; i as n ! 1, for all 2 S .R/: A direct computation (note that fn 2 L1 .R/) shows that
194
K. Guo and D. Labate
Z fOn . / D
1
ex=n e2ix dx
0
Z
0
ex=n e2ix dx D
1
1 n
1 C 2i
1 n
1 : 2i
Finally, we use that fact that if .Fn /; F 2 S 0 and hFn ; i ! hF; i for all 2 S ; O i for all 2 S : Since then hFO n ; i ! hF; Z . / 1 P:V: d ; lim hfOn ; i D n!1 i
c we conclude that sgn. / D
1 i
pv. 1 /.
Example 4.23. The one-dimensional Heaviside function H.x/ can be written as 1 O D 12 ı. / C 2i pv. 1 /: H.x/ D 12 C 12 sgn.x/. It follows that H. / Example 4.24. Let us consider the two-dimensional Heaviside function H1 .x1 ; x2 / D x1 >0 .x1 ; x2 /. Since it can be written as the tensor product H1 .x1 ; x2 / D O 1 . 1 ; 2 / D 1 ı. 1 /ı. 2 / C ı. 1 / pv. 1 /: H.x1 /1.x2 /, it follows that H 2 2i
1
Singular support and wavefront set The notion of singular support is introduced to describe the location where a distribution fails to be smooth. Since a distribution is not defined at a single point, this definition requires to deal with open sets containing the point of interest. For a distribution F, we say that x0 2 Rn is a regular point of F if there exists a function g 2 C1 .U/, where U Rn is an open neighborhood of x0 and g.x0 / D 1, such that gF 2 C1 .U/. The complement of the set of the regular points of F is called the singular support of F and is denoted by sing supp.F/. It is easy to see that the singular support of F is a closed set. For example, sing supp.ı/ D f0g. Also sing supp.pv. 1x // D f0g. Note that the condition gF 2 C1 .U/ is equivalent to .gF/^ being rapidly decreasing, i.e., for all N > 0 there exists a CN > 0 such that .gF/^ . /j CN .1 C j j/N : If a function or distribution fails to be smooth, we can look not only for the location of the singularity in space, but also for the orientation of the singularity. We shall say that a set 2 Rn n f0g is conic if 2 implies that 2 for all > 0. A conic neighborhood of a point is an open conic set containing it. For a distribution F, the point .x; / 2 Rn Rn n f0g is a regular directed point of F if there exists a function g 2 C1 .U/, where U Rn is an open neighborhood of x and g.x/ D 1, such that gF 2 C1 .U/ and, for al N > 0, there exists a CN > 0 such that j.gF/^ . /j CN .1 C j j/N ;
4 Detection of Edges
195
for all is a conic neighborhood containing the direction 0 . The complement in Rn Rn n f0g of the set of regular directed points of F is called the wavefront set of F and is denoted by WF.F/. Example 4.25. Let x D .x0 ; x00 / be a splitting of the coordinates and define the distribution F by Z hF; i D
.0; x00 / dx00 :
It is easy to see that sing supp.F/ D f.x0 ; x00 / W x0 D 0g. To compute the wavefront set, observe that, for 2 C1 .U/, where U is a neighborhood of a point x0 D .x00 ; x000 /, we have: Z h F; i D
.0; x00 / .0; x00 / dx00 :
Thus, . F/^ . / D O0 . 00 /, where 0 .x00 / D .0; x00 /. Since 0 is C1 and compactly supported, its Fourier transform has rapid decay as a function of 00 but is constant as a function of 0 . Hence we conclude that WF.F/ D f.0; x00 ; 0 ; 0/g. Acknowledgements The authors are partially supported by NSF grant DMS 1008900/1008907. DL is also partially supported by NSF grant DMS 1005799.
References 1. Bros, J., Iagolnitzer, D.: Support essentiel et structure analytique des distributions. Seminaire Goulaouic-Lions-Schwartz 19 (1975–76) 2. Calderòn, A.: Intermediate spaces and interpolation, the complex method. Stud. Math. 24(2), 113–190 (1964). URL http://eudml.org/doc/217085 3. Candès, E.J., Demanet, L.: The curvelet representation of wave propagators is optimally sparse. Commun. Pure Appl. Anal. 58(11), 1472–1528 (2005) 4. Candès, E.J., Donoho, D.L.: New tight frames of curvelets and optimal representations of objects with piecewise C2 singularities. Commun. Pure Appl. Anal. 57(2), 219–266 (2004) 5. Candès, E.J., Donoho, D.L.: Continuous curvelet transform: I. resolution of the wavefront set. Appl. Comp. Harm. Analysis 19, 162–197 (2005) 6. Córdoba, A., Fefferman, C.: Wave packets and fourier integral operators. Commun. Partial Differ. Equ. 3(11), 979–1005 (1978) 7. Dahlke, S., De Mari, F., De Vito, E., Häuser, S., Steidl, G., Teschke, G.: Different faces of the shearlet group. ArXiv (1404.4545) (2014). URL http://arxiv.org/abs/1404.4545 8. Dahlke, S., Kutyniok, G., Maass, P., Sagiv, C., Stark, H., Teschke, G.: The uncertainty principle associated with the continuous shearlet transform. IJWMIP 6(2), 157–181 (2008) 9. Dahlke, S., Steidl, G., Teschke, G.: The continuous shearlet transform in arbitrary space dimensions. J. Fourier Anal. Appl. 16(3), 340–364 (2010). DOI 10.1007/s00041-009-9107-8. URL http://dx.doi.org/10.1007/s00041-009-9107-8 10. Delort, J.: F.B.I. transformation: second microlocalization and semilinear caustics. In: Lecture Notes in Mathematics, vol. 1522. Springer, Berlin/Heidelberg (1992)
196
K. Guo and D. Labate
11. Donoho, D.L., Kutyniok, G.: Commun. Pure Appl. Math. 66(1), 1–47 (2013). DOI 10.1002/cpa.21418. URL http://dx.doi.org/10.1002/cpa.21418 12. Duval-Poo, M., Odone, F., De Vito, E.: Edges and corners with shearlets. preprint (2015) 13. Folland, G.B.: Real Analysis, 2nd edn. Pure and Applied Mathematics (New York). Wiley, New York (1999). Modern techniques and their applications. A Wiley-Interscience Publication 14. Friedlander, G., Joshi, M.: Introduction to the Theory of Distributions, 2nd edn. Cambridge University Press, New York (1998) 15. Gérard, P.: Moyennisation et regularité deux-microlocale. Ann. Scient. Ec. Norm. Sup., 4eme serie 23, 89–121 (1990) 16. Grohs, P.: Continuous shearlet frames and resolution of the wavefront set. Monatshefte für Mathematik 164(4), 393–426 (2011). DOI 10.1007/s00605-010-0264-2. URL http://dx.doi. org/10.1007/s00605-010-0264-2 17. Grossmann, A.: Wavelet transforms and edge detection. In: Albeverio, S., Blanchard, P., Hazewinkel, M., Streit, L. (eds.) Stochastic Processes in Physics and Engineering, Mathematics and Its Applications, vol. 42, pp. 149–157. Springer, Dordrecht (1988) 18. Guo, K., Houska, R., Labate, D.: Microlocal analysis of singularities from directional multiscale representations. In: Fasshauer, G.E., Schumaker, L.L. (eds.) Approximation Theory XIV: San Antonio 2013, Springer Proceedings in Mathematics and Statistics, vol. 83, pp. 173–196. Springer, New York (2014) 19. Guo, K., Labate, D.: Optimally sparse multidimensional representation using shearlets. SIAM J. Math. Anal. 39(1), 298–318 (2007) 20. Guo, K., Labate, D.: Representation of fourier integral operators using shearlets. J. Fourier Anal. Appl. 14, 327–371 (2008) 21. Guo, K., Labate, D.: Characterization and analysis of edges using the continuous shearlet transform. SIAM J. Imag. Sci. 2(3), 959–986 (2009) 22. Guo, K., Labate, D.: Analysis and detection of surface discontinuities using the 3d continuous shearlet transform. Appl. Comput. Harmon. Anal. 30, 231–242 (2010) 23. Guo, K., Labate, D.: Characterization of piecewise smooth surfaces using the 3d continuous shearlet transform. J. Fourier Anal. Appl. 18, 488–516 (2012) 24. Guo, K., Labate, D.: Characterization and analysis of edges in piecewise smooth functions. preprint (2015) 25. Guo, K., Labate, D.: Geometric separation of singularities using combined multiscale dictionaries. J. Fourier Anal. Appl. (in press) (2015) 26. Guo, K., Labate, D., Lim, W.: Edge analysis and identification using the continuous shearlet transform. Appl. Comput. Harmon. Anal. 27(1), 24–46 (2009) 27. Guo, W., Lai, M.: Box spline wavelet frames for image edge analysis. SIAM J. Imag. Sci. 6(3), 1553–1578 (2013). DOI 10.1137/120881348 28. Herz, C.S.: Fourier transforms related to convex sets. Ann. Math. 75, 81–92 (1962) 29. Holschneider, M.: Wavelets: An Analysis Tool. Oxford University Press, Oxford (1995) 30. Houska, R., Labate, D.: Detection of boundary curves on the piecewise smooth boundary surface of three dimensional solids. Appl. Comput. Harmon. Anal. (2014) 31. Jaffard, S.: Pointwise smoothness, two-microlocalization and wavelet coefficients. Publicacions Matemàtiques 35(1), 155–168 (1991). URL http://eudml.org/doc/41681 32. Jaffard, S., Meyer, Y.: Wavelet Methods for Pointwise Regularity and Local Oscillations of Functions, vol. 123. Mem. Am. Math. Soc. (1996) 33. Kutyniok, G., Labate, D.: Resolution of the wavefront set using continuous shearlets. Trans. Am. Math. Soc. 361(5), 2719–2754 (2009). DOI 10.1090/S0002-9947-08-04700-4 34. 
Labate, D., Laezza, F., Negi, P., Ozcan, B., Papadakis, M.: Efficient processing of fluorescence images using directional multiscale representations. Math. Model. Nat. Phenom. 9(5), 177–193 (2014) 35. Laugesen, R., Weaver, N., Weiss, G., Wilson, E.: A characterization of the higher dimensional groups associated with continuous wavelets. J. Geom. Anal. 12(1), 89–102 (2002). DOI 10.1007/BF02930862. URL http://dx.doi.org/10.1007/BF02930862
4 Detection of Edges
197
36. Marr, D.: Early processing of visual information. Phil. Trans. R. Soc. Lond. B 275, 483–524 (1976) 37. Marr, D., Hildreth, E.: Theory of edge detection. Proc. Roy. Soc. Lond. Ser. B Biol. Sci. 207(1167), 187–217 (1980). DOI 10.2307/35407. URL http://dx.doi.org/10.2307/35407 38. Meyer, Y.: Ondelettes et Opérateurs. Hermann, Paris (1990) 39. Ozcan, B., Labate, D., Jiménez, D., Papadakis, M.: Directional and non-directional representations for the characterization of neuronal morphology. In: Proc. SPIE. Wavelets XV (San Diego, CA, 2013), vol. 8858, pp. 885,803–885,803–11 (2013). DOI 10.1117/12.2024777. URL http://dx.doi.org/10.1117/12.2024777 40. Schug, D., Easley, G., O’Leary, D.: Three-dimensional shearlet edge analysis. In: Proc. SPIE, Independent Component Analyses, Wavelets, Neural Networks, Biosystems, and Nanoengineering IX, vol. 8058 (2011). DOI 10.1117/12.884194. URL http://dx.doi.org/10.1117/12. 884194 41. Shui, P., Zhang, W.: Noise-robust edge detector combining isotropic and anisotropic gaussian kernels. Pattern Recogn. 45(2), 806–820 (2012). DOI 10.1016/j.patcog.2011.07.020. URL http://dx.doi.org/10.1016/j.patcog.2011.07.020 42. Weiss, G., Wilson, E.: The mathematical theory of wavelets. In: Byrnes, J. (ed.) Twentieth Century Harmonic Analysis - A Celebration, NATO Science Series, vol. 33, pp. 329–366. Springer, Dordrecht (2001) 43. Yi, S., Labate, D., Easley, G., Krim, H.: Edge detection and processing using shearlets. In: Image Processing, 2008. ICIP 2008. 15th IEEE International Conference on, pp. 1148–1151 (2008). DOI 10.1109/ICIP.2008.4711963 44. Yi, S., Labate, D., Easley, G., Krim, H.: A shearlet approach to edge analysis and detection. IEEE Trans. Image Process. 18(5), 929–941 (2009) 45. Ziou, D., Tabbone, S.: Edge detection techniques - an overview. Int. J. Pattern Recogn. 8, 537–559 (1998)
Chapter 5
Optimally Sparse Data Representations Philipp Grohs
Abstract This chapter is dedicated to the question of how to efficiently encode a given class of signals. We introduce several mathematical techniques to construct optimal data representations for a number of signal types. Specifically we study the optimal approximation of functions governed by anisotropic singularities such as edges in images.
5.1 Introduction One of the fundamental questions in the field of digital signal processing is how to efficiently represent or encode a continuous signal by a bitstream of finite length. In an age of ever increasing amounts of data it is all the more relevant to find efficient representations for it, in order to store the data or to transmit it over a communication channel. Of course, different types of data require different representations. What works well for, say, sound data need not be the optimal choice for, say, medical image data. In this chapter we will look at the encoding problem from a mathematical point of view. To this end we start by introducing mathematical models for certain signal classes in Section 5.2 as relatively compact subsets of a Hilbert space. Being merely mathematical models, these ‘signal classes’ will never truly contain all the complexities of real-world signals but they may serve as a decent approximation to reality while being concrete enough to allow for a precise mathematical analysis. To give an example, in order to model natural images we will introduce the mathematical model of cartoon-images which are two-dimensional piecewise smooth functions with possible discontinuities along curves. These curved discontinuities model the edges of a natural image. Since for most images the main part of the information lies in its edges, the model of cartoon-images has proven to be a useful mathematical formalization of natural images, see Figure 5.1.
P. Grohs () Seminar for Applied Mathematics, ETH Zürich, Rämistrasse 101, 8092 Zurich, Switzerland e-mail:
[email protected] © Springer International Publishing Switzerland 2015 S. Dahlke et al. (eds.), Harmonic and Applied Analysis, Applied and Numerical Harmonic Analysis 68, DOI 10.1007/978-3-319-18863-8_5
199
200
P. Grohs
Fig. 5.1 Many parts of natural images can be modeled mathematically as piecewise smooth functions with curved discontinuities.
The main question that we ask in this chapter is the following. Suppose we have a given signal class C and a desired precision " > 0. What is the minimal number N of bits needed to encode any signal f 2 C up to precision "?
Of course this question makes no sense mathematically, as it stands. We need to cast it more into the mathematical language. For instance: What precisely do we mean by a ‘signal class’? What does ‘encoding’ mean? And what do we mean by ‘up to precision "’? In what follows we will present rigorous answers to these questions, together with some examples. Finally we will show a fundamental and perhaps surprising phenomenon: For many signal classes one can precisely quantify the optimal tradeoff between N, the number of bits, and ", the desired precision. To this end we will first, in Section 5.2 introduce basic notions of coding theory. After presenting several interesting mathematical models for different types of realworld signals we introduce, for a signal class C , its optimal encoding rate which
5 Optimally Sparse Data Representations
201
describes the optimal asymptotic trade-off between the precision " > 0 and the number of bits needed to achieve such a precision which can be theoretically achieved uniformly over all signals in C . In Section 5.3 we show the fundamental fact that for each signal class there exists a precise upper bound on the optimal encoding rate. Using the technique of hypercube embeddings introduced originally by Donoho in [13] we explicitly calculate such upper bounds for all signal classes introduced earlier in Section 5.2. These results are not new but the proofs are substantially simplified in comparison with [13] which relies on results from the field of rate-distortion theory [1]. In contrast our presentation is completely elementary and only requires some simple results from probability theory, such as Chernoff-type bounds. These are derived in Appendix 5.8. The following Section 5.4 is then concerned with actually constructing optimal encoders using sparse expansions in dictionaries. We will mostly study sparse expansions in frames and introduce some elements of nonlinear approximation theory [10]. In particular we will see how one can design efficient encoding schemes using appropriate frame expansions. The remaining sections contain two important examples of signal classes and associated frame expansions which generate optimal encoding schemes: In Section 5.5 we will show that wavelet frames (or bases) are optimally adapted for the sparse approximation of piecewise smooth univariate functions. In Section 5.6 we study the sparse approximation of cartoon-images. We first establish the fact that wavelets perform subobtimally for this task. Then we introduce curvelet tight frames which, by a landmark result by Candès and Donoho [3], achieve optimally sparse representations of cartoon-images, which makes them an extremely attractive representation system in image processing. After that we introduce shearlet frames [29] which are a related construction with the same desirable properties as curvelets and which have the advantage of being much simpler to implement on digital data. Finally we mention the notion of parabolic molecules [24] which comprises a general framework for the sparse approximation of cartoon-images and which encompasses both curvelets and shearlets as special cases. Section 5.7 closes with some further examples of signal classes and frame constructions for their optimally sparse representation. Finally, in the Appendix 5.8 we collect some basic facts in probability theory and give an elementary derivation of Chernoff’s bound as needed in Section 5.2. We have decided to present the general results on coding complexity from Sections 5.2, 5.3, and 5.4 in full detail and hope to provide the reader with a useful and comprehensive reference on that topic. In the remaining sections we will then switch to a more informal presentation style, as a more detailed treatment would be beyond of the scope of this work.
202
P. Grohs
5.1.1 Notation We comment on the notation which we shall use in the present work. As usual, we denote by Lp .Rd / the usual Lebesgue spaces with associated norm k kp . For a discrete set equipped with the counting measure we denote the corresponding Lebesgue space by `p ./ or `p if is known from the context. The associated norm will again be denoted k kp . We use the symbol h; i for the inner product on the Hilbert space L2 .Rd /. The Euclidean inner product on Rd is denoted by for ; 2 Rd . The Euclidean norm . /1=2 of a vector 2 Rd will be denoted 1 d O Rby j j. For a function f 2 L .R / we can define the Fourier transform f .!/ WD f .x/ exp.2ix !/dx. By density this definition can be extended to tempered d R distributions f . We shall also use the notation T to denote the one-dimensional torus which can be identified with the half-open interval Œ0; 2/. Sometimes we will use the notations bxc WD maxfl 2 Z W l xg, and hxi WD .1 C x2 /1=2 . The natural logarithm will be denoted log and the logarithm with basis 2 will be denoted log2 . We use the symbol A . B or A D O.B/ to indicate that A CB with a uniform constant C. Finally we shall often use the letter C for a generic constant whose precise value may change from line to line.
5.2 Signal Classes and Encoding Let us first agree on what we mean by a ‘signal class.’ Ideally we would like to consider such classes as, for instance, music or, say, fingerprint images, and so on. But clearly such fuzzy definitions do not allow for a rigorous analysis. We have to mathematicize the concept of signal classes in order to gain a deeper understanding. To this end we shall fix a separable Hilbert space H and define the notion of ‘signal class’ as follows. Definition 5.1. A relatively compact subset C H is called a signal class. We list a few examples for signal classes which are commonly used as benchmarks for certain real-world signals. It should be emphasized that these examples only constitute mathematical models and may or may not have anything to do with the real world. Nevertheless, they provide us with important insights of what we can expect in real life. In what follows we shall always have H D L2 .Rd / for some d 2 N, i.e. we study signal classes which are compact subsets of the space of square integrable functions. This makes sense in a signal processing context but if our signal class consists of solutions to PDEs, Sobolev spaces would be more appropriate [15]. Example 5.2 (Smooth Functions). Given a smooth bounded domain C Rd , define for k 2 N and K > 0 CKk .C/ WD fu 2 L2 .Rd / W u 2 Ck ; kukCk K and supp u Cg;
5 Optimally Sparse Data Representations
203
where we denote kukCk WD
k X lD0
sup jdl f .x/j x
and dl f the l-th total derivative of f . Example 5.3 (Piecewise Smooth Functions). Given a bounded interval I D .a; b/ R, define for k 2 N the set of piecewise smooth functions on I as k;pw
CK
.I/ WD ff1 Œ0;c/ C f2 Œc;1 where c 2 I and f1 ; f2 2 CKk .I/g;
where we denote by C the indicator function of a set C. Example 5.4 (Star Shaped Images, see [13]). We now take a first step in modeling multivariate signals with anisotropic and curved singularities. We start with binary images. To this end we take a C2 -function W T ! Œ0; 1/ defined on the torus T D Œ0; 2, where the boundary points are identified. Additionally, we assume that there exists 0 < 0 < 1 such that . / 0 for all 2 T. Then we define the subset B R2 by o n B D x 2 R2 W x D .r; / in polar coordinates with 2 T; r . / ;
(5.1)
such that the boundary D @B of B is a closed regular C2 -Jordan curve in Œ0; 12 parameterized by b. / D
. / cos. / ; . / sin. /
2 T:
(5.2)
Furthermore, we require that kkC2 K:
(5.3)
For K > 0 the set STAR2K is defined as the collection of all indicator functions B of subsets B Œ0; 12 , which are translates of sets of the form (5.1) with a boundary obeying (5.2) and (5.3). Example 5.5 (Cartoon Images, see [3]). Star shaped images are indicator functions of subsets of Œ0; 12 . A more realistic mathematical model for real-world images is the set of cartoon images which is defined as CARTK2 WD ff1 B C f2 W B 2 STAR2K and f1 ; f2 2 CK2 .Œ0; 12 /g: Figure 5.2 shows an illustration of cartoon images.
204
P. Grohs
Fig. 5.2 Left: A natural image is typically composed of smooth parts separated by edges and thus resembles a cartoon image as defined in Example 5.5. The main features are still visible. Right: True cartoon image.
Example 5.6 (Textures, see [9]). Textures are signals with highly oscillatory, repetitive structures. In [9] the following model has been proposed for textures: k WD fsin.Mf .x//g.x/ where f ; g 2 CKk .Œ0; 12 /g: TEXTK;M
It consists of warped, oscillatory patterns. Example 5.7 (Mutilated Functions, see [2]). The class of ‘mutilated functions’ has been introduced in [2] as all functions of the form k;pw
MUTILKk WD fg.u x/h.x/ W g 2 CK
.R/; h 2 CKk .Œ0; 1d /; u 2 Rd ; juj D 1g:
Functions in MUTILKk are generally smooth aside from possible discontinuities across hyperplanes orthogonal to the vector u. Mutilated functions arise, for instance, as solution to linear transport PDEs [25]. We reiterate that the signal classes that we have seen in the previous examples do not necessarily correspond to what real-world signals might look like. But still it k is reasonable to consider, for instance, the class TEXTK;M as a stylized model for 2 fingerprint images or seismic data, or the class CARTK as a model for images with little texture, see Figures 5.3 and 5.2. Having agreed on what we mean by signal classes in a mathematical sense we now define what we mean by encoding such data which formally means a mapping that maps each u 2 C to a bitstream of finite length. Our nomenclature is mostly based on conventions in the field of rate-distortion theory [1]. Definition 5.8. Let C be a signal class. An encoding/decoding pair .E; D/ consists of two mappings
5 Optimally Sparse Data Representations
205
Fig. 5.3 Left: A fingerprint image resembles a texture image as defined in Example 5.6. Right: True texture image.
E W C ! f0; 1gR ;
D W f0; 1gR ! H ;
where R 2 N denotes the runlength R.E; D/ of .E; D/. The distortion of .E; D/ is defined as ı.E; D/ WD sup ku D ı E.u/kH : u2C
Given an encoding/decoding pair one encodes a signal u 2 C simply by applying the mapping E (and thus ‘digitizing’ the continuous signal u into a bitstream) and the decoding works by applying the decoder, e.g. computing D.E.u// 2 H . The goal of rate-distortion theory is simply to develop encoding/decoding pairs with a runlength which is as short as possible while at the same time keeping the distortion small. In particular we want to understand the following quantity. Definition 5.9. Let C be a signal class. Then we denote its optimal encoding rate by s .C / WD supfs > 0 W There exists C > 0 such that for each R 2 N there exists .ER ; DR / with R.ER ; DR / D R and ı.ER ; DR / CRs g:
(5.4)
We remark that knowledge of s .C / exactly answers the question posed in the beginning of this section: Indeed, for any s < s .C / we can encode any signal u 2 C , up to precision " > 0 using, up to a fixed constant C, "1=s bits. Of course, the larger s , the better: suppose we want to encode with a guaranteed accuracy of four decimals, i.e. " D 106 . Then, if s .C / D 2, we would need about 1000 bits whereas we would need about 1012 bits if s .C / D 1=2.
206
P. Grohs
Summarizing, the quantity s .C / represents a fundamental measure of the complexity of the signal class and the theoretical limitations of encoding. It is closely related to the concept of Kolmogorov entropy [36]. Remark 5.10. We briefly discuss the connection between Kolmogorov entropy (aka Metric entropy) and optimal encoding. Suppose we have a metric space X with distance function d. For x 2 X and r > 0 denote by B.x; r/ the open ball around x with radius r in X . Suppose that C X is relatively compact. Then for any " > 0 there is a finite covering number N" .C / which is minimal among all N 2 N such that there exist .xi /NiD1 X satisfying C
N [
B.xi ; "/:
iD1
We say that .xi /NiD1 constitute an "-cover of C . The Kolmogorov "-Entropy is then defined as H" .C / WD log2 .N" .C //: Let us now see how this quantity is related to encoding. Conforming to the setting we have developed in this section we consider as our metric space the Hilbert space H and as C a signal class as defined in Definition 5.1. Suppose we are given a precision " > 0 and consider H" WD H" .C /. The definition of the Kolmogorov " "-Entropy simply states that we can find N" WD 2H" elements .ui /NiD1 H whose " "-neighborhoods cover C . Certainly we can encode the elements .ui /NiD1 using H" bits by simply encoding the corresponding indices in binary representation. We can now construct an encoder E W C ! f0; 1gH" which maps a given u 2 C to the binary representation of the index i of the element ui closest to u (in case there are several closest elements, simply take the one with the smallest index). The decoder D simply transforms a binary string 2 f0; 1gH" to the corresponding index i and then maps to ui . Clearly this gives an encoding/decoding pair with runlength H" and distortion bounded by ". Conversely, suppose we have an encoding/decoding pair .E; D/ with runlength H and distortion ". Then the system fD. / W 2 f0; 1gH g constitutes an "-cover of C and hence we get that H" .C / H. Summarizing these considerations we get that s .C / D supfs > 0 W sup "1=s H" .C / < 1g: ">0
(5.5) Þ
Actually computing s .C / might seem like a daunting task. Nevertheless, in the following pages we will show how s .C / can be computed for all the examples given above!
5 Optimally Sparse Data Representations
207
5.3 Upper Bounds on the Optimal Encoding Rate We first present a technique to derive upper bounds on s .C /, using the notion of hypercube embeddings, originally introduced by Donoho in [13]. Definition 5.11. i) A signal class C H is said to contain an embedded orthogonal hypercube of dimension m and sidelength ı if there exist f0 2 C and orthogonal functions i 2 H for i D 1; : : : ; m with k i kH ı such that the collection of hypercube vertices m n X h.mI f0 ; . i /i / D h D f0 C "i
i
o W "i 2 f0; 1g
iD1
is contained in C . p ii) A signal class C is said to contain a copy of `0 , p > 0, if there exists a sequence of orthogonal hypercubes H WD .hk /k2N , embedded in C , which have dimensions mk and side-lengths ık , such that ık ! 0 and for some constant C>0 1=p
ık CH mk
for all k 2 N:
(5.6)
A landmark result by Donoho [13] states that if a signal class C contains a copy of p `0 , then the best theoretically achievable encoding rate for any encoding/decoding ! Intuitively this holds because it is possible to precisely pair is bounded by 2p 2p understand the complexity of hypercubes. If they are embedded in C , then C must be at least as complex as the embedded hypercubes. The following result holds true. p
Theorem 5.12. Suppose that signal class C H contains a copy of `0 for p 2 .0; 2. Then s .C /
2p : 2p
(5.7)
The proof of Theorem 5.12 will make use of Lemma 5.14 below, which gives a minimal bound on the error made in encoding bitstreams of length m with only R bits where R=m < 1. It requires the notion of Hamming distance, defined for two bitstreams D . 1 ; : : : ; m /; D . 1 ; : : : ; m / 2 f0; 1gm via distH . ; / WD #fi 2 f1; : : : ; mg W i ¤ i g: We start with the following estimate on binomial coefficients. Lemma 5.13. Let d D "n for 1=2 > " > 0. Then we have
208
P. Grohs
2h."/n
! d1 X n > i iD0
with h."/ WD " log2 ."/ .1 "/ log2 .1 "/, the binary entropy of ", plotted in Figure 5.7. Proof. Suppose that .Xi /niD1 are iid binomially distributed with p D 1=2. Let X WD Pn X iD1 i the cumulative distribution. Observe that ! d1 1 X n D P.X < d/: 2n iD0 i Then Chernoff’s bound as derived in Theorem 5.53 yields the desired result.
t u
We proceed by stating and proving the following key lemma. Lemma 5.14. Suppose that ˛ WD R=m < 1 and let E W f0; 1gm ! f0; 1gR , D W f0; 1gR ! f0; 1gm be arbitrary mappings. Then there exist constants C; m0 > 0, only depending on the ratio ˛ (but not on E, D, and m), and an element 2 f0; 1gm with distH . ; D ı E. // Cm; provided that m m0 . Proof. For 2 f0; 1gR define the sets S. / WD E1 . / which make up a disjoint partitioning [
S. / D f0; 1gm
2f0;1gR
of f0; 1gm by at most 2R subsets. It follows by the pigeonhole principle that there exists at least one 2 f0; 1gR such that the set S./ satisfies #S./ 2mR D 2m.1˛/ :
(5.8)
Consider D./ 2 f0; 1gm . We will show that, for m sufficiently large, there exists at least one 2 S./ f0; 1gm such that for a constant C > 0 depending only on ˛ we get distH . ; D.// Cm:
5 Optimally Sparse Data Representations
209
Since D./ D D ı E. /, this would imply the statement. We start with the identity ! k X m #f 2 f0; 1g W distH . ; D.// kg D DW F.k; m/ l lD0 m
(5.9)
valid for all k 2 N. Suppose now that for some constants C; m0 > 0 we get an estimate of the form F.Cm; m/ < 2m.1˛/
for all m > m0 :
(5.10)
Then, by (5.8) and (5.9) this would imply the existence of a 2 S./ which satisfies distH . ; D ı E. // Cm and we would be done. Due to the monotonicity of h, Lemma 5.13 (with k D d 1; n D m and i D l) implies that F.k; m/ < 2m.1˛/ ;
(5.11)
whenever k < h1 .min.˛; 1 ˛// m 1 Choose an arbitrary ı 2 R which satisfies h1 .min.˛; 1 ˛// > ı > 0. Then, whenever m > m0 WD 1ı we have that 1 h .1 ˛/ ı m < h1 .1 ˛/ m 1 and thus, using (5.11) we get with C WD h1 .1 ˛/ ı
and
m0 D
1 ı
that F.Cm; m/ < 2m.1˛/
for all m > m0 ;
which is (5.10) and thus proves the desired claim.
t u
Remark 5.15. Lemma 5.14 essentially states the intuitive fact that if we code only using a fraction of the original wordlength then the worst-case error will be comparable to a fraction of the original wordlength. It has first been used in [11, 13] in the context of hypercube embeddings. The proof is only sketched there
210
P. Grohs
for ˛ < 1=2. The argument given in [13] makes use of rate-distortion theory [1] and is actually highly non-elementary. The above proof is much shorter and completely elementary. Þ We can finally proceed with the proof of Theorem 5.12. Proof (of Theorem 5.12). Suppose that C contains an embedded orthogonal hypercube h of dimension m and sidelength ı. Clearly, every element h 2 h can be uniquely identified with a binary sequence D .h/ 2 f0; 1gm via an obvious bijection . Since h 2 h C we can, for any R 2 N and s < s .C /, define an encoder ER W C ! f0; 1gR and a decoder DR W f0; 1gR ! H such that the error estimate sup ku DR ı ER .u/kH CRs
u2C
holds true with a constant C > 0 independent of R. Restricting this encoder to the hypercube h and identifying h with f0; 1gm we get another encoding-decoding pair EQ Rm W f0; 1gm ! f0; 1gR defined on 2 f0; 1gm via EQ Rm . / WD ER . 1 . // 2 f0; 1gR
(5.12)
and corresponding decoder which operates on 2 f0; 1gR via m Qm D R ./ WD ı ˘h ı DR ./ 2 f0; 1g :
(5.13)
Here, ˘h W H ! h denotes the orthogonal projection onto h. Since the mapping EQ Rm maps an arbitrary bitstream 2 f0; 1gm to a bitstream of length R we unavoidably lose information if R < m, as Lemma 5.14 tells us. In particular let us now fix R WD m=3. Then, Lemma 5.14 implies the existence of
0 2 f0; 1gm satisfying Qm Qm distH 0 ; D R ı ER . 0 / Cm
(5.14)
which, by orthogonality of h and the fact that h has sidelength ı implies that 2 1 2 . 0 / 1 ı D Qm Qm R ı ER . 0 / H Cmı :
(5.15)
Setting h0 WD 1 . 0 / 2 h C and observing the definitions (5.12) and (5.13), Equation (5.15) leads to the inequality h0 ˘h ı DR ı ER .h0 /2 Cmı 2 : H
(5.16)
5 Optimally Sparse Data Representations
211
Since kh0 DR ı ER .h0 /kH h0 ˘h ı DR ı ER .h0 /H ; the estimate (5.16) implies that kh0 DR ı ER .h0 /k2H Cmı 2 :
(5.17)
This lower bound stands in contrast to the upper bound kh0 DR ı ER .h0 /kH CRs ;
(5.18)
valid for arbitrary s < s .C / and a (possibly new) constant C > 0. Recall that we fixed R WD m=3, so (5.18) implies (again, with a possibly different constant C > 0) that kh0 DR ı ER .h0 /kH Cms
(5.19)
Note that all constants appearing in equations (5.14–5.21) are independent of m. In p particular suppose now that C contains a copy of `0 in the sense of Definition 5.11. Then we have a sequence of hypercubes H D .hk / of dimensions .mk / and sidelengths .ık / obeying, with a uniform (e.g., k-independent) constant CH > 0 the inequality 1=p
ık CH mk
:
(5.20)
Combining (5.20) with (5.17) therefore yields the existence of hk0 2 hk such that k h Dm =3 ı Em =3 .hk /2 C CH m12=p k k 0 0 H k
(5.21)
with yet another constant C > 0 independent of k. On the other hand, Equation (5.19) implies that k h Dm =3 ı Em =3 .hk /2 Cm2s k k 0 0 H k
(5.22)
for all s < s .C /. Balancing the upper bound (5.22) with the lower bound (5.21) finally yields the desired estimate (5.7). t u Remark 5.16. Hypercube embeddings provide an asymptotic measure of complexity for a given signal class, as demonstrated by the previous theorem. Note also that the quantity CH from (5.6) plays an important role: From Equation (5.21) it follows that CH gives a lower bound on the minimal implicit constant that appears in (5.4). More precisely, a family ..ER ; DR //R2N of encoding/decoding pairs with R.ER ; DR / D R always obeys the lower bound
212
P. Grohs .C /
ı.ER ; DR / & CH Rs
;
with the implicit constant independent of C and H .
Þ
The concept of hypercube embeddings might appear extremely complicated at first sight. However, the nice thing is that we can actually construct hypercube embeddings for several interesting signal classes. Let us now examine the complexity of the signal classes introduced above in Section 5.2. Indeed, one can show the following: Theorem 5.17. i) The signal class CKk .C/ as defined in Example 5.2 contains a 1=.k=dC1=2/ copy of `0 with CH & K 2=3 ii) The signal class STAR2K as defined in Example 5.4 contains a copy of `0 with CH & K 2=.kC1/ k iii) The signal class TEXTK;M contains a copy of `0 with CH & K M Proof. We start with i). For simplicity we assume that C D Œ0; 1. Let us fix a nonnegative function 2 C1 .R/ with supp Œ0; 1 and k kCk K. Let l 2 N. Then, for i 2 f0; 1; : : : ; l 1g we define hl as the hypercube generated by f0 D 0 and i D lk .l i/. The i ’s have disjoint supports and are therefore orthogonal in L2 .R/. So the set l1 n X "i hl WD h D
i
o W "i 2 f0; 1g
iD0
indeed forms an orthogonal hypercube in L2 .R/ of dimension l. We need to show that hl is contained in CKk .C/. Note that due to the disjoint supports of the i ’s we only need to show that i 2 CKk .C/ for all i 2 f0; : : : ; l 1g. To this end we note the simple fact that k i kCk k kCk which indeed implies that hl CKk .C/. How about the sidelength of hl ? To this end we compute k
2 i kL2
Dk
2 0 kL2
Dl
2k
Z
j .lt/j2 dt D l2k1 k k2L2 :
We have constructed a sequence .hl / of embedded hypercubes of dimensions ml D l and sidelengths ıl D lk1=2 k kL2 which implies the desired statement. Since k kL2 can be made, up to a fixed constant, as large as K we also get the desired bound on CH .
5 Optimally Sparse Data Representations
213
We move on to show ii). Let f0 WD r1=2 be the indicator function of the unit ball of radius, say, 1=2. We fix again as above and now define i .r/
D Bl f0 ;
where Bl D fr
i .r/
C 1=2g
and r is the radius in polar coordinates. By appropriately normalizing we can again show, exactly as above in the proof of i), that the l-dimensional hypercube (observe again that the i ’s are mutually orthogonal, due to the support properties of ) l1 n X hl WD h D f0 C "i
i
o W "i 2 f0; 1g
iD0
is contained in STAR2K . In order to find its sidelength we need to compute the L2 norms of the i ’s. Note that the function i .r/ is the indicator function of a set whose area is bounded by a constant times l3 . So we have that k i kL2 is bounded by a constant times l3=2 , so the sidelength of hl is of approximate size l3=2 . This shows the desired result. We turn to iii). Again we define as above and set, for fixed l 2 N, and i D .i1 ; i2 / 2 f0; : : : ; l 1g2 k .lx1 i1 / .lx2 l2 / : i1 ;i2 WD sin M l Again, by normalizing appropriately (k kCk . K) we get a sequence H D .hl /l2N of embedded hypercubes with the dimension of hl equal to l2 . In order to calculate the sidelength of hl we note that for lgM we can always bound ˇ ˇ ˇ ˇ ˇsin M lk .lx1 i1 / .lx2 l2 / ˇ & ˇM lk .lx1 i1 / .lx2 l2 /ˇ This estimate leads to the estimate ıl D k i kL2 & Mlk1 ; valid for lgM and which gives the desired result.
t u
Combining Theorem 5.17 with Theorem 5.12 we immediately get the following upper bounds on the best N-term approximation rate for the signal classes considered in Section 5.2. Corollary 5.18. We have the upper bounds i) s .CKk .C// k=d, k;pw ii) s .CK .I// k, iii) s .STAR2K / 1,
214
P. Grohs
iv) s .CARTK2 / 1, k v) s .TEXTK;M / k=2, vi) s .MUTILKk / k=d. The morale of these results is that there exists an absolute limit on how well we can do in terms of encoding the above signal classes. The next obvious question is: Are these bounds sharp?
5.4 Sparse Approximation in Dictionaries Now we would like to come back to our original problem, namely how to efficiently encode an arbitrary u 2 C . In the previous section we have seen that there are certain limitations as to how efficient an encoder can be. We want to find out if these limitations are sharp? Or, in other words, does equality hold in the estimates of Corollary 5.18? We will show that the answer is ‘yes.’ In order to achieve this we need to construct optimal encoding/decoding pairs. One of the most popular approaches for encoding signals is known as ‘sparse representations.’ The underlying idea is that often typical signals can be well approximated by a finite superposition of simple template signals ˚ WD .' /2 , where is a countable index set. The collection of template signals is then called a dictionary: Definition 5.19. Let be a countable set and C a signal class as above in Definition 5.1. Then any ˚ WD .' /2 H is called a dictionary. The idea of sparse coding exploits the fact that, provided the dictionary ˚ is carefully chosen, any u 2 C can be well approximated by a sparse linear combination of dictionary elements. If this is the case, one can encode any u 2 C simply by its coefficients in the sparse linear combination, together with their location. As a motivation, think of music signals. A music signal is formally given by a function that measures variations in air pressure as a function of time. Plotting such a function is neither informative nor efficient, see Figure 5.4, left. One can encode such data by representing music as a sparse superposition of elementary tones, as it is done in a musical score, see Figure 5.4, middle. It turns out that mathematically the dictionary that best mimicks such a representation is given by Gabor frames, see Figure 5.4, right. See [18] for a comprehensive introduction into Gabor frames. Do there exist mathematical equivalents of musical scores for other signal classes? We will see that this is indeed the case for the examples considered above. For further literature on the theory and practice of sparse data representations, we recommend the excellent monographs [10, 34].
5 Optimally Sparse Data Representations
215
Fig. 5.4 Left: Music signal represented as variations in air pressure. Middle: ‘sparse’ representation as a musical score. Right: representation in Gabor dictionary, yielding a time-frequency representation which mimicks a musical score.
5.4.1 Best N-term Approximation in Dictionaries Motivated by the considerations above, given a signal class C and a dictionary ˚ we can now try to quantify how well an arbitrary signal can be represented by a sparse linear combination of template signals. This is done in the following definition. Definition 5.20. Suppose that ˚ is a dictionary as above in Definition 5.19 and C a signal class as above in Definition 5.1. Define, for N 2 N the sets ˙N .˚/ WD
8 <X :
c ' 2 H W N with #N D N and .c /2N
2N
9 = R ; ;
of N-term linear combinations of ˚ and for s > 0the nonlinear approximation spaces n A s .˚/ WD u 2 H W There exists C > 0 such that for all N 2 N we have inf
v2˙N .˚/
o ku vkH CN s :
Any function uN 2 ˙N .˚/ which realizes the infimum in the definition above (meaning that ku uN kH D infv2˙N .˚/ ku vkH ) is called a best N-term approximation of u. Finally let ssparse .C ; ˚/ WD sup fs > 0 W C A s .˚/g : The set ˙N .˚/ consists of all N-term linear combinations of template signals and A s .˚/ describes all elements of H which can be approximated by N-term linear combinations of template signals, at an asymptotic error rate N s . Intuitively, the larger s, the better: suppose we are given a desired precision " > 0. Then any
216
P. Grohs
u 2 A s .˚/ can be encoded using O."1=s / coefficients .c / and indices 2 N . Furthermore, the larger s WD s .C ; ˚/ turns out to be, the better the dictionary ˚ is adapted to the signal class C , since in that case we can encode any u 2 C using O."1=s / coefficients .c / and indices 2 N . We will see soon that this intuition is actually misleading: the quantity s alone is in general unrelated to the existence of efficient encoders. Remark 5.21. The study of the spaces A s .˚/ plays an important role in the field of nonlinear approximation theory [10] and the adaptive numerical solution of partial differential equations [5]. For several dictionaries ˚, the spaces A s .˚/ are well studied. If ˚ is a wavelet basis, then A s .˚/ defines a certain Besov space [39]. If ˚ is a Wilson basis, then A s .˚/ defines a modulation space [16]. In both these cases, the resulting function spaces can be described as coorbit spaces. Þ We could now try to find a dictionary ˚ which, given a signal class C , maximizes the quantity s .C ; ˚/. Such a dictionary would then be optimal for the signal class C . Unfortunately, the answer to this problem turns out to carry little insight: Lemma 5.22. Let ˚ H be dense in H . Then for any signal class C H we have ssparse .C ; ˚/ D 1: Proof. We will show that A s .˚/ D H for all s > 0. Indeed, suppose that u 2 H . Clearly, since ˚ is, by assumption, dense in H , we get that inf
ku vkH D 0
inf
ku vkH D 0:
v2˙1 .˚/
which implies that v2˙N .˚/
t u
Lemma 5.22 reveals that, given a signal class C , seeking to optimize the quantity s .C ; ˚/ in order to find a suitable dictionary yields a degenerate problem. In particular, by Corollary 5.18 it follows necessarily that only having high N-term approximation rates does not necessarily lead to efficient encoders – sparsity alone is not enough!
By Lemma 5.22 any dense subset ˚ H would allow to encode any u 2 C up to precision " > 0 by just one index 2 and one coefficient c 2 R. But how to find this index? And how to encode it efficiently? Clearly, finding the index would amount to searching the full (infinite) index set , a task that is computationally infeasible. In particular it follows that the quantity ssparse .C ; ˚/ bears no relevance in terms of developing efficient encoders. We need to refine slightly the notion of ‘allowed dictionaries’ for the sparse representation of signals in C .
5 Optimally Sparse Data Representations
217
5.4.2 Best N-term Approximation with Polynomial Depth Search In Section 5.4.1 we have seen that the quantity ssparse .C ; ˚/ defined in Definition 5.20 actually carries no relevance in terms of encoding a given signal class: it can be made arbitrarily large by choosing appropriate dictionaries ˚, yielding computationally intractable sparse representations. On the other hand, we already know but Corollary 5.18 that for the signal classes we have seen so far there is an absolute limit on how well we can encode. The central problem that we have faced is that, in order to find a best N-term approximation for a given signal u, we potentially have to search through the full infinite index set . An idea which has been put forward by Donoho in [13] suggests to restrict the search for the optimal coefficient set N in the best N-term approximation to the first .N/ coefficients where is some polynomial. This approach is known as polynomial-depth search. To this end, fix a polynomial and identify the discrete index set with the natural numbers N. For convenience we introduce the notation ŒM WD f1; : : : ; Mg for M 2 N. Definition 5.23. Suppose that ˚ D .'i /i2N is a dictionary as above in Definition 5.19 indexed by the natural numbers, is a fixed univariate polynomial, and C a signal class as above in Definition 5.1. Define for N 2 N and s > 0, the sets ˙N .˚/ WD
8 <X :
c ' 2 H W N Œ.N/ with #N D N and .c /2N
2N
9 = R ; ;
and n As .˚/ WD u 2 H W There exists C > 0 such that for all N 2 N we have inf
v2˙N .˚/
o ku vkH CN s :
Any function uN 2 ˙N .˚/ which realizes the infimum in the definition above (meaning that ku uN kH D infv2˙N .˚/ ku vkH ) is called a best N-term approximation of u under polynomial depth search. Finally let ssparsepoly; .C ; ˚/ WD sup fs > 0 W C As .˚/g :
218
P. Grohs
Again, the quality of a given dictionary ˚ in terms of its sparse approximation properties for a signal class C can be measured in terms of the quantity ssparsepoly; .C ; ˚/ but, as we shall see in a moment, Definition 5.23 avoids the degeneracy exposed by Lemma 5.22 for the similar quantity ssparse .C ; ˚/. Essentially now the search for the index set N of the best N-term approximation of a given signal u 2 C can be reduced to the first .N/ indices which renders the problem of actually computing a best N-term approximation tractable. More precisely we have the following result. Theorem 5.24. Suppose that C is a signal class, ˚ a dictionary and a polynomial. Then, for any N 2 N and 0 < s < ssparsepoly; .C ; ˚/ there exists an W f0; 1gN ! H and encoding map EN˚; W C ! f0; 1gN and a decoding map D˚; N a constant C > 0 independent of N such that we have the error estimate s ı.EN˚; ; D˚; N / CN :
(5.23)
In other words, given a precision " > 0 we can encode an arbitrary signal u 2 C using at most a uniform constant times "1=s bits. Proof. By Definition 5.23 and relative compactness of C we know that for any s < ssparsepoly; .C ; ˚/ there exists a constant C > 0 such that for any u 2 C and N 2 N there is an index set N Œ.N/ with #N D N and coefficients .c /2N obeying the estimate X u c ' 2N
CN s :
H
We need to encode both the N coefficients .c /N and the N indices N as a sequence of bits. Encoding of index set. We start by encoding the index set N . It consists of N natural numbers, all bounded by .N/, where is a polynomial of degree, say, k. Clearly we can encode by O.N log.N// bits (this is actually the place where the polynomial depth search assumption comes in). Encoding of coefficients. It remains to encode the coefficients .c /2N R. This turns out to be slightly more subtle. (i) Consider the set ˚N WD .' /2N We apply Gram-Schmidt orthonormalization of ˚N (starting from the first to the last element) to obtain an orthonormal set ˚Q N WD .'Q /N which has the same linear span as ˚N . Observe that some elements 'Q may be zero. We denote the set f 2 N W 'Q D 0g by 0N . We can now uniquely define a sequence .Qc /2N by X 2N
cQ 'Q D
X 2N
c '
5 Optimally Sparse Data Representations
219
and cQ D 0 for We observe that X u c Q ' Q 2N
CN s
and
2 0N :
X 2N
H
cQ 2 2 sup kuk2H C 2C2 N 2s : u2C
(5.24)
Note that the second inequality above implies that the coefficient sequences .Qc /2N are uniformly bounded, independent of N and u. (ii) We now would like to encode the sequence .Qc /2N . We do this simply by rounding all these coefficients up to multiples of N .sC1=2/ and (due to the fact that the coefficients are uniformly bounded) wind up with a bit string of length O.N log.N//. All in all, combining (i) and (ii) above we arrive at an encoding procedure ER˚; W C ! f0; 1gR with R D O.N log.N//. Decoding procedure. The decoding map is now simple: simply undo the operations above. That is, starting from our bit string first reconstruct N (this can be done exactly) and the rounded approximations .Oc /2N to the coefficient sequence .Qc /2 form a P N . Using these coefficients and the index set N we can R reconstruction 2N cO 'Q in H . This defines the mapping D˚; W f0; 1g ! H. R Error estimate. It remains to study the approximation error of our encodingP decoding procedure. Suppose we have available a reconstruction O 'Q 2N c obtained from the above procedure. Recall that, by our assumption on the rounding there exists a constant C > 0, independent of N and u, with max2N jQc cO j CN .sC1=2/ . It follows that 2 X X 1=2.sC1=2/ c O ' Q c Q ' Q D CN s : CN 2N 2N H
Together with (5.24, left) this yields, with a (possibly different but independent of N and u) constant C > 0 the estimate ˚; ˚; u DR ı ER .u/ CN s : H
Since R D O.N log.N// and s < ssparsepoly; .C ; ˚/ arbitrary, this yields the desired result. t u By the previous theorem we conclude that a large quantity ssparsepoly; .C ; ˚/ automatically implies efficient encoding schemes for the signal class C . This stands in sharp contrast to the results of Section 5.4.1.
220
P. Grohs
We can now again ask the question if, given a signal class C there exists an optimal dictionary ˚ for its optimally sparse representation. To this end we define the following. Definition 5.25. Given a signal class C we define the optimally achievable best N-term approximation rate under polynomial depth search constraint by ssparsepoly .C / WD supfssparsepoly; .C ; ˚/ W is a polynomial, and ˚ D .'i /i2N a dictionaryg: A dictionary ˚ D .'i /i2N is called optimal for the sparse approximation of C if there exists a polynomial such that ssparsepoly .C / D ssparsepoly; .C ; ˚/: By Theorem 5.24 we immediately get the following lower bound for the quantity s .C /. Lemma 5.26. We have s .C / ssparsepoly .C /: So the main task for obtaining a complete understanding of the complexity of a signal class is to find optimal dictionaries in the sense of Definition 5.25.
5.4.3 Sparse Approximations in Frames Recall that we asked whether the upper bounds derived in Corollary 5.18 are in fact optimal. To go about this task for a specific signal class C we would need to find a dictionary ˚ D .' /2N and a polynomial which satisfies ssparsepoly; .C ; ˚/ D s .C /: Such a dictionary should also be computationally efficient in the sense that best N-term approximations should also be computable in a reasonable amount of time. Good candidates for such dictionaries are frames for H . For a comprehensive monograph on the theory of frames, we refer to [4]. Most of the material in this section can also be found in [10]. We briefly recall the definition of a frame for a Hilbert space H with inner product h; iH .
5 Optimally Sparse Data Representations
221
Definition 5.27. A dictionary ˚ D .' /2 H is called a frame for H if there exist constants 0 < A˚ B˚ such that X jhu; ' iH j2 B˚ kuk2H A˚ kuk2H 2
holds true for all u 2 H . A frame is called tight if A D B and Parseval if A D B D 1. A dual frame of ˚ is, by definition a complementary system ˚Q D .'Q /2 which satisfies, for all u 2 H that uD
X
X
hu; ' iH 'Q D
2
hu; 'Q iH ' :
2
The canonical dual frame of ˚, which always exists, is the unique dual frame ˚Q of ˚ which satisfies X X hu; 'Q i2H D min c2 : P .c /2 W
2
2 c ' Du
2
If ˚ is tight, its canonical dual is given by ˚Q D A1 ˚. In order to measure best N-term approximation rates in frames we will require the notion of weak `p -quasinorms which are defined for coefficients .c /2 R as ˚ k.c /2 kw`p WD min C > 0 W jci j Ci1=p for all i 2 N ; where .ci /i2N is a nonincreasing rearrangement of .c /2 . Weak `p -quasinorms are intimately related to conventional `p -quasinorms. More precisely we have the following lemma. Lemma 5.28. For any sequence .c /2 R and any p; ı > 0 there exists a constant C.p; ı/ > 0 such that C.p; ı/k.c /2 k`pCı k.c /2 kw`p k.c /2 k`p : Proof. Suppose first that .c /2 2 `p . Then we can consider a nonincreasing rearrangement .ci /i2N and observe that p k.c /2 k`p
D
X j2N
p
jcj j
i X
jcj jp ijci jp ;
jD1
where the last inequality follows from the fact that the sequence .jcj j/j2N is nonincreasing. We get the estimate jci j k.c /2 k`p i1=p ;
222
P. Grohs
which shows that k.c /2 kw`p k.c /2 k`p , thereby proving the second inequality above. Conversely, suppose that k.c /2 kw`p < 1 Then we can estimate, for each ı > 0 the norm X X pCı jci jpCı k.c /2 kw`p i1ı=p < 1; k.c /2 k`pCı D i2N
i2N
t u
which proves the first inequality above.
We also record for later reference the following useful characterization of weak `p quasinorms. Lemma 5.29. Denote for all " > 0 the index subsets ."/ WD f 2 W jc j > "g: Then we have the equivalence k.c /2 kw`p min C > 0 W sup "p #."/ C : ">0
Proof. We will only prove that kckw`p . min fC W #."/ C"p for all " > 0g
(5.25)
since this is all we need in the sequel. As an exercise, try to prove the converse direction. To prove (5.25), assume that #."/ C"p
for all " > 0:
Denote by .ci /i2N the nonincreasing rearrangement of c. With " WD jcN j we get by assumption that p
N # f 2 W jc j cN g CjcN j : It follows that C1 jcN jp N 1 ; or 1
jcN j C1=p N p t u
5 Optimally Sparse Data Representations
223
We can now state the main result concerning the sparse approximation with frames. It turns out that the best N-term approximation rate can be described precisely using weak `p norms on the coefficient sequence. Theorem 5.30. Suppose that C H is a signal class and ˚ a frame for H with Q dual frame ˚. i) Suppose that sup khu; ˚Q iH kw`p < 1:
(5.26)
u2C
Then C A 1=p1=2 .˚/: If in addition ˚ is a basis, then also the converse holds true. ii) Suppose that D N and in addition to (5.26) it holds that there exists ˛ > 0 and C > 0 such that sup
1 X
u2C iDn
jhu; 'Qi iH j2 Cn˛ :
(5.27)
Then there exists a polynomial such that C A1=p1=2 .˚/: Proof. We start with i) which is a standard result in nonlinear approximation theory (see, for instance, [10]). Suppose we have u 2 C . We construct an N-term approximation uN by uN WD
N X hu; 'Qi iH 'i ; iD1
where .hu; 'Qi iH /i2N denotes a nonincreasing rearrangement of the frame coefficient sequence .hu; 'Q iH //2 . We need to estimate the error ku uN kH in dependence on N. Since ˚ constitutes a frame this error can be bounded as follows: ku uN k2H C
X
jhu; 'Qi iH j2 :
i>N
Furthermore, due to the assumption (5.26) we have that jhu; 'Qi iH j Ci1=p :
224
P. Grohs
All in all we get an estimate ku uN k2H C
X
i2=p :
i>N
It remains to show that with s WD 1=p 1=2 we have X
i2=p N 2s :
(5.28)
i>N
For this we now suppose without loss of generality that N D 2l 1 for some l 2 N. Then we can write X X i2=p D Fm ; (5.29) i>N
ml
where Fm WD
2mC1 X1
i2=p :
iD2m
Since each summand in the definition of Fm is bounded from above by 22m=p and there are exactly 2m such summands, we now estimate Fm 2m.12=p/ : Putting this estimate into (5.29) yields X
i2=p
i>N
X
2m.12=p/ C2l.12=p/ CN 2s ;
ml
whenever 2=p 1 < 0. This proves (5.28) and thus i). Let us turn to ii). We will show that there exists a polynomial such that with N X
hu; 'Qi iH 'i Œ.N/ .i / 2 ˙N .˚/
iD1
we have the estimate ku uN kH CN s ; which implies the desired statement.
5 Optimally Sparse Data Representations
225
Arguments akin to the ones used above in the proof of i) show that for any M 2 N the cutoff vM WD
M X hu; 'Q iH ' D1
satisfies ku vM kH CM ˛=2 : It follows that with .t/ WD tb2s=˛cC1 we have ku v.N/ kH CN s : This, together with the fact that ku uN kH ku uN kH C ku v.N/ kH yields the desired approximation rate for uN .
t u
In [13] the condition (5.27) is termed weak tail-compactness. From Theorem 5.30 and Lemma 5.26 we have the following corollary. Corollary 5.31. Suppose that C is a signal class, ˚ is a dictionary and a polynomial. Suppose that (5.27) holds true for a constant ˛ > 0 and that (5.26) holds true for all p > 1=.s C 1=2/. Then s .C / ssparsepoly; .C ; ˚/ s : Corollary 5.31 tells us what we have to look for in designing optimal frame representations for a given signal class: The goal is to find a frame for H such that (5.26) holds true for all p > 1=.s .C / C 1=2/ and in addition (5.27) holds true for a constant ˛ > 0. Once we achieved this, by Corollary 5.31 and Theorem 5.24, we have a constructive way of designing optimal encoders. It turns out that for all the signal classes presented in Section 5.2 such optimal frame constructions exist. This is the topic of the remainder of the present chapter.
5.5 Wavelet Approximation of Piecewise Smooth Functions k;pw
We shall now consider in more detail the signal class C D CK .I/ of piecewise Ck functions on an interval I. We have seen in Corollary 5.18 that s .C / k. We will show that this bound is sharp by showing that for a wavelet frame ˚ L2 .I/ the bound ssparsepoly; .C ; ˚/ k holds true and utilizing Corollary 5.31. In order
226
P. Grohs
to establish the announced bound we will show that the coefficient sequence of an arbitrary u 2 C lies in a weak `p space with appropriately small p > 0. We begin by introducing wavelet systems. Definition 5.32. Let ; 2 L2 .R/ and > 0. Then the wavelet system generated by the scaling function and the wavelet with sampling constant > 0 is defined as ˚ WD .' /2 WD W. ; I / WD f k WD . k/ W k 2 Zg [ ˚ j=2 .2j k/ W j 2 N; k 2 Z : j;k WD 2 Many constructions of wavelet systems exist which form a (orthonormal) basis or a (tight) frame for L2 .R/. Describing them would be beyond the scope of this chapter but let us mention the groundbreaking work [7] by Daubechies which, for any L; M 0 constructs generators ; 2 CL , compactly supported, such that i) The system W. ; I 1/ constitutes an orthonormal basis for L2 .R/, and ii) the wavelet possesses M vanishing moments in the sense that Z R
xm .x/dx D 0 for all m 2 ŒM:
(5.30)
Much more information on wavelets can be found in [8]. Here we shall only focus on their sparsification properties for piecewise smooth univariate functions. The following result shows that such wavelet systems are optimally suited for the approximation of piecewise smooth functions. Proposition 5.33. Suppose that the system ˚ WD W. ; I / constitutes a frame for L2 .R/ where we assume that ; 2 Ck .R/ are compactly supported and possesses k 1 vanishing moments in the sense of Equation (5.30). Then, with C D k;pw CK .I/ (see Example 5.3), Equation (5.26) holds true for all p > 1=.k C 1=2/. 1 Proof. We will show that for any u 2 C we have for every p > kC1=2 that the p coefficient sequence .hu; j;i i/j;i lies in ` . By Lemma 5.28 this implies that the 1 coefficient sequence also lies in w`p for all values p > kC1=2 . For simplicity we let I D Œ0; 1 and D 1, the general case being no more difficult. We have assumed the vanishing moments condition
Z p.x/ .x/dx D 0
for all polynomials p of degree k 1.
(5.31)
Furthermore, by the compact support assumption, we may suppose that supp
j;i
2j Œi a; i C a
(5.32)
5 Optimally Sparse Data Representations
227
for some fixed a > 0. Now fix a scale j. For any bounded u we have (using the substitution y D 2j x i) the estimate Z Z f .x/2j=2 .2j x i/dx . 2j=2 : (5.33) u.x/ j;i .x/dx D 2j Œia;iCa
If we require more smoothness on u, we can get much better estimates: For u 2 Ck denote by pj;i .x/ the Taylor polynomial of degree k 1 of u around 2j i which gives an approximation sup x22j Œia;iCa
ju.x/ pj;i .x/j . 2kj :
Hence we get Z
Z u.x/
j;i .x/dx
D
2j Œia;iCa
pj;i .x/ C O.2kj / 2j=2 .2j x i/dx:
Using the vanishing-moment-property (5.31) and the substitution y D 2j x, we get that Z (5.34) u.x/ j;i .x/dx . 2.kC1=2/j for all u 2 Ck . Let us split the index set Z into K1 D fi 2 Z W supp
j;i
intersects the singularityg
and K2 D fi 2 Z W supp
j;i
intersects Œ0; 1 and not the singularityg:
By (5.32) we get that #K1 . 1
and
#K2 . 2j :
(5.35)
For i 2 K1 the wavelet coefficients cj;i WD hu; j;i i satisfy jcj;i j . 2j=2 by (5.33) and for i 2 K2 we have, by (5.34), that jcj;i j . 2j.kC1=2/ . Hence we get that X i
jcj;i jp .
X K1
2pj=2 C
X K1
2pj.kC1=2/ . 2pj=2 C 2j.pkCp=21/ ;
228
P. Grohs
by (5.35). Now suppose that p > and XX j
1 . kC1=2
Then WD min.p=2; p˛ C p=2 1/ > 0
jcj;i jp .
X
i
2 j < 1:
j
t u Remark 5.34. One can do a more refined analysis, using Lemma 5.29, which shows that actually the statement of Proposition 5.33 remains true in the boundary case p D 1=.k C 1=2/. Þ Having established the optimal sparsity of wavelet coefficient sequences of piecewise smooth functions we still need to verify (5.27). Note that this requires us to enumerate the elements of our wavelet frame W. ; I /. Note moreover that it suffices to only consider those elements of the wavelet frame with support intersecting the interval I. For a fixed scale j there are 2j such elements j;k . We order these index pairs .j; k/ for which the support of j;k intersects the interval I in lexicographic order to obtain an enumeration .'i /i2N of those wavelet elements. Proposition 5.35. With the same assumptions as in Proposition 5.33 and the above described enumeration .'i /i2N of the wavelet elements intersecting I, Equation (5.27) holds true for some ˛ > 0. Proof. Suppose for simplicity that n D 2r (the generalization to arbitrary n is a simple exercise). We have to give an exponential (in r) bound for the tail sum XX j>r
jhu;
j;i ij
2
:
i
Let us again fix j. We have seen in the proof of Proposition 5.33 that there are O.1/ coefficients of size O.2j=2 / and O.2j / coefficients of size O.2kj / and, using this information we can estimate X XX 2j C 2j 22kj . n1 : jhu; j;i ij2 . j>r
i
j>r
t u
This is the desired bound. Putting together these results we get the following
Corollary 5.36. Let ˚ be a wavelet frame with Ck compactly supported generators and k 1 vanishing moments. Let ˚Q be the canonical dual, then we have, with some polynomial , that s .CK
k;pw
.I// D ssparsepoly; .CK
k;pw
.I/; ˚Q / D k:
5 Optimally Sparse Data Representations
229
We have gained a complete understanding of the complexity of piecewise smooth functions! In particular we can encode arbitrary piecewise smooth functions up to error " using order "1=k bits but we can do no better asymptotically. In particular: Wavelets are perfect for the approximation of piecewise smooth functions defined on an interval.
We mention also that the optimal encoding scheme is very simple: To get a worstcase error of order N s compute the first .N/ wavelet coefficients and keep only the N largest of them. This method is called thresholding and it is the basis for many compression algorithms, among them JPEG2000. Remark 5.37. One might critique the method of thresholding as described above as a wasteful method of data processing. After all we have to measure .N/ wavelet coefficients, only to throw away almost all of them in the later compression stage. The currently thriving research field of compressed sensing seeks to overcome this problem by cleverly taking only order N samples of a signal (in this case random samples of its Fourier transform with respect to a judiciously chosen probability distribution) which already contain the information of the N largest wavelet coefficients. See [17] for a comprehensive introduction. Þ
5.6 Cartoon Approximation with Curvelet Tight Frames In Section 5.5 we have seen that for piecewise smooth functions defined on an interval, the lower bound derived earlier is in fact sharp. To this end we have shown that simple thresholding of wavelet coefficients provides us with asymptotically optimal encoding schemes. We do not want to ask the same question for the signal class CARTK2 of cartoon images as defined earlier in Section 5.2. Corollary 5.18 gives us a benchmark: we cannot expect N-term approximation rates higher than 1. In fact we will see that this bound is sharp, i.e. that s .CARTK2 / D 1. Similar to the analysis in Section 5.5 we will go about proving this by constructing a frame for L2 .R2 / in which cartoon images are optimally sparse.
5.6.1 Suboptimality of Wavelets for Cartoon Images In the last section we saw that wavelets are optimal for approximating piecewise smooth univariate functions. Does this nice property persist when we go from univariate to multivariate? First let us see how we can construct bivariate wavelet systems. Two-dimensional wavelet frames can be constructed from one-dimensional ones using a tensor-product construction. Let W. ; I / be a univariate wavelet system as defined in Definition 5.32. Then the associated bivariate tensor-product wavelet
230
P. Grohs
system is defined as ˚ W2D . ; I / WD .' /2 WD k2D W k D .k1 ; k2 / 2 Z2 [ n o .e/ 2 W j 2 N; k 2 Z ; e 2 f.1; 0/; .0; 1/; .1; 1/g ; j;k where k2D .x1 ; x2 / WD .x1 k1 / .x2 k2 /; .0;1/ j;k
WD 2j .2j x1 k1 /
.2j x2 k2 /;
.1;0/ j;k
WD 2j .2j x1 k1 / .2j x2 k2 /;
.1;1/ j;k
WD 2j .2j x1 k1 /
.2j x2 k2 /:
If W. ; I / constitutes a frame (resp. basis) for L2 .R/, then W2D . ; I / constitutes a frame (resp. basis) for L2 .R2 /. The following result, which we formulate for convenience only for wavelet ONB’s, holds true. Proposition 5.38. Suppose that ˚ WD W2D . ; I / is a bivariate tensor product wavelet orthonormal basis for L2 .R2 / and let C WD CARTK2 . Then ssparse .C ; ˚/ D
1 : 2
Proof (Sketch). Consider as a template cartoon image the indicator function of the unit ball: 1 x12 C x22 1=2 u W Œ1; 12 ! R; u.x/ D 0 else: In order to assess the approximation properties of bivariate wavelet bases, we need to consider the absolute values of the wavelet coefficients c D jhu; ' ij. We would like to find out for which p > 0 these coefficient sequences are in `p which, by the results of the previous section lets us immediately draw conclusions about the best possible approximation rate. More precisely we will (somewhat informally) show that the sequence .c / can never be in `p for any p < 1. By Lemma 5.28 this implies that .c / cannot be in any weak `p space for p < 1. By Theorem 5.30 (i) this implies that u…As for any s > 12 , which is the desired result. In order to estimate the `p norm of the coefficient sequence let us fix again a scale j > 0 and assume for simplicity that our wavelets are of compact support and .e/ have vanishing moments of order M 1. Then the wavelets j;k are supported in a quadrilateral of size 2j 2j . Hence, at least 2j wavelets intersect
5 Optimally Sparse Data Representations
231
Fig. 5.5 Left: Covering the singularity curve with isotropic elements of width 2j and length 2j requires 2j elements to cover it fully. Right: Only 2j=2 anisotropically scaled elements of length 2j=2 and width 2j suffice to cover the singularity curve.
the singularity curve of u (in our case the circle). A short calculation reveals that these coefficients are of size 2j . The coefficients for the wavelets which do not intersect the singularity curve vanish, due to the vanishing moment property. Hence, in each scale we have XX .e/ jhu; j;k ijp 2j 2pj D 2.1p/j : k2Z2
e
This implies that the wavelet coefficient sequence cannot lie in `p for any p < 1.
t u
By Proposition 5.38 we thus cannot expect an N-term approximation rate higher than N 1=2 with wavelets. This falls short of the expected rate N 1 by a wide margin. Can we improve this? What is the central problem of approximating an edge singularity by wavelets? One aspect is definitely that, due to the isotropic dilation operation x 7! 2j x, the supports of wavelets are isotropic boxes of diameter 2j . For any edge singularities, we need approximately 2j such boxes to cover the singularity, see Figure 5.5, left. Possibly another form of dilation could perform better. It turns out that (in some sense) the optimal choice to achieve anisotropy is to replace the dilation operation x 7! 2j x by .x1 ; x2 / 7! .2j x1 ; 2j=2 x2 /. Then, if we use this form of dilation, a function with support in I1 I2 R2 gets mapped to the function 23j=4 .A2j x/ which has support in 2j I1 2j=2 I2 , i.e. in an increasingly needle-like anisotropic rectangle (the pre-factor 23j=4 is added to preserve the L2 -norm of the dilation operation). We have put Aa WD
a 0 p : 0 a
This particular form of dilation is called parabolic scaling.
232
P. Grohs
Let us add translations which would yield a system 23j=4 .A2j x k/;
j 2 N; k 2 Z2
which consists of functions which increasingly look like little vertical needles, with j increasing. These needles are translated along the grid 2j Z 2j=2 Z. Intuitively, with this system we should be able to approximate functions with edge singularities in the vertical direction. In order to be able to approximate any curved singularity, we rotate these elements and get a system 22j=4 .A2j Rj;l x k/;
j 2 N; k 2 Z2 ; l 2 0; : : : ; Lj ;
where R WD
cos. / sin. / sin. / cos. /
denotes the rotation associated with angle 2 Œ0; 2/. at scale j. The most sensible way to pick those is to sample them uniformly on Œ0; 2. In summary we can put Lj 2j=2
and
j;l WD
l2 ; l D 0; : : : ; Lj 1: Lj
(5.36)
In summary, we have the following wishlist: • A system of the form 23j=4 .A2j Rj;l x k/;
j 2 N; k 2 Z2 ; l 2 0; : : : ; Lj
(5.37)
which • constitutes a tight frame for L2 .R2 /, • approximates functions with edge singularities well. We can even make the last point concerning the approximation of singularities a bit more precise: Due to parabolic scaling, for each scale we have order 2j=2 ‘large’ coefficients which cover the singularity, see Figure 5.5, right, meaning that their essential support intersects the singularity curve and its orientation is tangent to the singularity. Assume in addition, just for the sake of argument, that we can show that all other coefficients decay so fast as to be negligible. Then, reasoning as above for the wavelet-case, these 2j=2 ‘large’ coefficients are bounded by 23j=4 , since ˇ ˇ ˇhu; 23j=4 .A2j R x k/iˇ D 23j=4 jhu; ij kukL1 k kL1 : j;l It would then follow that for each " > 0, all coefficients of scale > 43 log2 ."1 / are automatically smaller than " in absolute value and for all smaller scales we have
5 Optimally Sparse Data Representations
233
about 2j=2 coefficients which might be larger than " in absolute value. In summary, we would get about 4 3
log2 ."1 /
X
2j=2 . "2=3
jD0
coefficients in ."/, or in other words, sup "2=3 #."/ < 1: ">0
By Lemma 5.29 this implies that the coefficient sequence would lie in w`2=3 which, by Theorem 5.30 would imply a best N-term approximation rate N 1 , which is optimal and in particular vastly better than the rate N 1=2 of the bivariate wavelet approximation. Therefore, we wish to construct a Parseval frame based on the ideas outlined above such that we obtain a best N-term approximation rate N 1 for functions with edge singularities. It turns out that these wishes cannot be satisfied in full, but with very little compromise, more precisely we shall see that one can actually construct tight frames of the form 23j=4
.j;l;k/
.A2j Rj;l x k/;
j 2 N; k 2 Z2 ; l 2 0; : : : ; Lj ;
(5.38)
with (mildly) varying generators .j;l;k/ 2 L2 .R2 /. These frames can then be shown to provide optimal best N-term approximation rates for cartoon images. The proof of this latter fact relies on the heuristics that we developed above. The technical complications turn out to be considerable which makes the precise argument too long to be included here.
5.6.2 Curvelets This section introduces curvelet tight frames as constructed in [3]. We shall indeed see that we can obtain tight frames for L2 .R2 / of the form (5.38). The construction is Fourier based, that is we will use the identification between a function and its Fourier transform, as given by Parseval’s identity: kf kL2 .R2 / D kfO kL2 .R2 / : The construction is simplified by treating the radial and angular components separately and therefore we will work in polar coordinates .j j; =j j/.
234
P. Grohs
Fig. 5.6 Left: Curvelet frequency tiling. Right: Shearlet frequency tiling.
The first part of the construction is to provide a family of window functions in Fourier space which partition the Fourier space into the so-called parabolic wedges of length 2j and width 2j=2 , see Figure 5.6, left. This decomposition is classically known as the second dyadic decomposition and has its origin in the study of Fourier integral operators [40]. As stated before it will be convenient to pick window functions which are given as a product of a function depending only on the radius j j and another function depending only on the angle =j j. For the construction of the radial functions W .j/ Q .0/ W RC ! Œ0; 1 and W Q W RC ! at each scale j 2 N0 we start with C1 -functions W Œ0; 1 with the following properties Q .0/ Œ0; 2; supp W
Q .0/ .r/ D 1 for all r 2 Œ0; 3 ; W 2
Q Œ 1 ; 2; supp W 2
Q W.r/ D 1 for all r 2 Œ 34 ; 32 :
Q j r/ on RC . Finally, we Q .j/ .r/ WD W.2 Then we define for j 2 N the functions W rescale for every j 2 N0 (to obtain an integer grid later) Q .j/ .8r/; W .j/ .r/ WD W
r 2 RC :
P Notice that it holds 2 W .j/ 1. Next, we define the angular functions V .j;l/ W S1 ! Œ0; 1 on the unit circle 1 2 ˚S R , where j 2 N specifies the scale and l the orientation, running through 0; : : : ; Lj 1 with Lj D 2bj=2c ;
j 2 N:
This time we start with a C1 -function V W R ! Œ0; 1, living on the whole of R, satisfying supp V Œ 34 ; 34
and
V.t/ D 1 for all t 2 Œ 2 ; 2 :
5 Optimally Sparse Data Representations
235
Since the interval Œ; can be identified via t 7! eit with S1 , we obtain C1 -functions VQ .j;0/ W S1 ! Œ0; 1 for every j 2 N by restricting the scaled functions V.2bj=2c / to Œ; . In order to end up with real-valued curvelets, we then symmetrize V .j;0/ . / WD VQ .j;0/ . / C VQ .j;0/ . / ; 2 S1 : bj=2c , and for l 2 At ˚ each scale j 2 N we define the characteristic angle j D 2 0; 1; : : : ; Lj 1 we let
Rj;l D
cos.j;l / sin.j;l / sin.j;l / cos.j;l /
(5.39)
be the rotation matrix by the angle j;l WDlj . By rotating V .j;0/ we finally get V .j;l/ W S1 ! Œ0; 1, V .j;l/ . / WD V .j;0/ .Rj;l /
for 2 S1 :
In order to secure the tightness of the frame we need to renormalize with the function . / WD W .0/ .j j/2 C
X
W .j/ .j j/2 V .j;l/
j;l
2 ; j j
which satisfies 1 . / 8 for all 2 R2 . Bringing the radial and angular components together, we obtain the functions .0/
W .j j/ X0 . / WD p ; . /
W .j/ .j j/V .j;l/ Xj;l . / D p . /
j j
for 2 R2 : (5.40)
It is obvious that X0 , Xj;l 2 C1 .R2 /. Moreover, these functions are real-valued, non-negative, compactly supported, and L1 -bounded by 1. Let e1 D .1; 0/ 2 R2 be the first unit vector and put n o Sj WD x 2 S1 W jhx; e1 ij cos.j =2/ : 2 Indeed, we have supp ˚ X0 W0 WD f 2 R W 8j j 2g and supp Xj;l Wj;l , where for j 2 N, l 2 0; : : : ; Lj 1 the sets
n Wj;l WD 2 R2 W 2j1 8j j 2jC1 ;
o
2 R1 j;l Sj j j
(5.41)
236
P. Grohs
are antipodal pairs of symmetric wedges as in Figure 5.6, left. After this preparation, we are ready to define the functions Fourier side by O 0 . / WD X0 . /
and
0
and
j;l
on the
O j;l . / WD Xj;l . / for 2 R2 :
Let us summarize what we have done so far. We have constructed a family of window functions O 0 , O j;l with • supp O 0 W0 and supp O j;l Wj;l , and • the Calderôn condition X j O j;l . /j2 D 1 j O 0 . /j2 C
for all 2 R2
j;l
holds true. With these facts it is simple to obtain the following characterization of the L2 .R2 /norm, where we denote the convolution of two functions f ; g 2 L1 .R2 / by Z f g.x/ WD f .y/g.x y/dy R2
and note that
b
f g.!/ D fO .!/ gO .!/; see, e.g., [37]. Lemma 5.39. For every f 2 L2 .R2 / it holds kf k22 D kf
2 0 k2
C
X
kf
2 j;l k2 :
j;l
Proof. By construction the system satisfies the discrete Calderôn condition, i.e., j O 0 . /j2 C
X
j O j;l . /j2 D 1
for all 2 R2 :
j;l
yields for all f 2 L2 .R2 / Z 2 kf k2 D jfO . /j2 d
Z D
R2
R2
j O 0 . /j2 jfO . /j2 d C
D kf
2 0 k2
C
X
kf
Z
X
R2 j;l
j O j;l . /j2 jfO . /j2 d
2 j;l k2 :
j;l
t u
5 Optimally Sparse Data Representations
237
The full frame of curvelets is obtained by taking translates of 0 and j;l in the spatial domain and L2 -normalizing afterwards. Accordingly, for j 2 N, l 2 f0; 1; : : : ; Lj 1g, and k 2 Z2 we define 0;k WD 0 . k/ and j;l;k
WD 2j3=4
j;l .
xj;l;k /
1 with xj;l;k D R1 j;l A2j k:
The corresponding set of curvelet indices D .j; l; k/ will henceforth be denoted by M. Further, we will use the following notation for this system C .W .0/ ; W; V/ D .
/2M D
n 0;k
o W k 2 Z2 [
n j;l;k
o W j 2 N; k 2 Z2 ; l 2 f0; 1; : : : ; Lj 1g :
Theorem 5.40. Let W .0/ , W, V be defined as above. The curvelet system C .W .0/ ; W; V/ constitutes a tight frame for L2 .R2 /. Proof. For j 2 N the wedges Wj;0 are contained in the rectangles j;0 , Wj;0
1 Œ2jC1 ; 2jC1 Œ42j=2 ; 42j=2 8 Œ2j1 ; 2j1 Œ2j=21 ; 2j=21 DW j;0 ;
(5.42)
and W0 is contained in the cube Œ2; 22 =.8/ Œ 12 ; 12 2 DW 0 . Since Wj;l D 1 R1 j;l Wj;0 we can put j;l WD Rj;l j;0 such that Wj;l j;l . For each fixed j 2 N the Fourier system .uj;k /k2Z2 given by uj;k . / D 2j3=4 exp 2ixj;0;k with xj;0;k D .2j k1 ; 2j=2 k2 / D A1 2j k constitutes an orthonormal basis for L2 .j;0 /. Observe that xj;l;k D R1 j;l xj;0;k . Therefore, uj;k .Rj;l / D 2j3=4 exp 2ixj;l;k ; and the system uj;k .Rj;l / k2Z2 is an orthonormal basis for L2 .j;l /. By Lemma 5.39 it holds X kf j;l k22 : kf k22 D kf 0 k22 C j;l
By the above observation we have for j 2 N kf
2 j;l k2
D kfO O j;l k22 D
Z j;l
jfO . / O j;l . /j2 d
238
P. Grohs
ˇ2 X ˇˇ Z ˇ ˇ fO . / O j;l . /uj;k .Rj;l / d ˇ
D
k2Z2
X
D
jhfO ; 2j.1C˛/=2 O j;l e2ixj;l;k ij2 D
k2Z2
X
jhf ;
j;l;k ij
2
:
k2Z2 2 0 k2
A similar argument for kf
t u
finishes the proof.
We have constructed a curvelet tight frame but the reader may wonder whether this strange frequency-based construction is actually of the form (5.38), which we anticipated in our informal discussion above. It turns out that it is. Indeed one can show the following. Lemma 5.41. The curvelet system C .W .0/ ; W; V/ can be written in the form (5.38) with functions .j;l;k/ which satisfy for any i 2 N supp
1 Œa; b Œc; c [ Œb; a Œc; c; .j;l;k/
j
.j;l;k/
.x/j Ci hxii
for all x 2 R2 ;
(5.43) (5.44)
and .j;l;k/
k
kCi .R2 / Di
(5.45)
with constants a; b; c; Ci ; Di > 0 which are independent of j; l; k. Proof. First we note that, by rotation invariance of the window functions Xj;l (more precisely, the window Xj;l is generated from the window Xj;0 by rotation by j;l ) we may restrict ourselves to the case j;l D 0. Therefore we need to show that the function j;l;k
WD 23s =2
j;0
1 A2j
satisfies the desired bounds. First note that
1D .j;l;k/
b
j;0
.A2j / :
1
Hence, by the support properties of j;0 , the function .j;l;k/ , together with all its derivatives has compact support in a rectangle away from the 1 -axis, which gives
1
(5.43). Since the functions .j;l;k/ are uniformly bounded on their support, also (5.44) follows immediately by noting that taking partial derivatives corresponds to multiplication by polynomials in Fourier space.
1
Lastly we show that, on its support, the function .j;l;k/ has bounded derivatives, withq a bound independent of j. But this follows from elementary arguments. Using rD
12 C 22 , ! D arctan . 2 = 1 /, we obtain that (up to a smooth reweighting)
5 Optimally Sparse Data Representations
239
1 . / D b .A .j;l;k/
j;0
2j /
D W ˛j . / V ˇj . / ;
where
q
2 : ˛j . / WD 2j 22j 12 C 2j 22 ; and ˇj . / WD 2j=2 arctan j=2 2 1 Now it is a straightforward computation to show that all derivatives of ˛j and ˇj are
1
bounded on the support of .j;l;k/ and uniformly in j. But this implies (5.45) using, again, the fact that differentiation in Fourier space corresponds to multiplication by polynomials in R2 . t u Now that we have seen that the curvelet frame construction C .W .0/ ; W; V/, while not being precisely of the form (5.37) but almost (by Lemma 5.41), the motivation given in Section 5.6.1 indicates that C .W .0/ ; W; V/ might be the right system for the approximation of cartoon images. This is indeed the case, as the following fundamental result (whose proof is too lengthy and complicated to be reproduced here) shows. Theorem 5.42 ([3]). Let ˚ D C .W .0/ ; W; V/ be the curvelet frame as constructed above. Let C D CART. Then (5.26) holds true for every p > 2=3. In particular, ssparse .C ; ˚/ 1: Remark 5.43. Actually the result of [3] is stronger than Theorem 5.42. Candès and Donoho showed the stronger statement that sup ku uN k2 . N 1 log.N/3=2 ;
u2C
where uN denotes the best N-term approximation constructed by keeping the first N largest curvelet coefficients of u. The same remark applies also to the corresponding results on shearlets below in Section 5.6.3. Þ Moreover, similar to the wavelet-case in Proposition 5.35 one can show the following. Proposition 5.44. With the same assumptions as in Theorem 5.42 and an appropriate enumeration .'i /i2N of the curvelet frame elements, Equation (5.27) holds true for some ˛ > 0. The proof is similar to the proof of Proposition 5.35, although some care must be taken due to the fact that our curvelet frame elements are not compactly supported. In summary we gained a complete understanding of the complexity of the signal class CARTK2 , as the following result shows.
240
P. Grohs
Corollary 5.45. Let ˚ D C .W .0/ ; W; V/ be the curvelet tight frame constructed above. Then we have, with some polynomial , that s .CARTK2 / D ssparsepoly; .CARTK2 ; ˚/ D 1: Proof. This result is a direct consequence of Theorem 5.42, Proposition 5.44, Corollary 5.18, and Theorem 5.30. t u
5.6.3 Shearlets The previous section has given us insight into the complexity of the signal class CARTK2 of bivariate cartoon images. We have seen that the curvelet frame ˚ D C .W .0/ ; W; V/ provides optimal N-term approximation rates for CARTK2 and that there is no way of encoding arbitrary cartoon images more efficiently than with curvelets. In reality, images are typically sampled on regular grids and any actual implementation must be able to handle such data efficiently. Here a problem arises: the operation of rotation on which the whole construction of curvelets is built cannot be sensibly defined on grid-valued data. One remedy is to replace the rotation operation by shearing which is also defined on discrete grids. This idea leads to the concept of shearlets [29]. For the definition of shearlets it is convenient to resort to the following notation. 1=2 We put Aa D diag.a; a1=2 / and AQ a D ; a/ for the scaling matrices, and diag.a
1 l and SQ l WD SlT . These shearing matrices denote the shearing matrices by Sl WD 01 now take the role of the rotation matrices in the curvelet case above. A shearlet system is defined as follows. Definition 5.46. For a sampling constant c 2 RC the cone-adapted shearlet system SH ; ; Q I c generated by ; ; Q 2 L2 .R2 / is defined by SH ; ; Q I c D ˚. I c/ [ . I c/ [ Q . Q I c/; where ˚. I c/ D f k D . k/ W k 2 cZ2 g; ˚ . I c/ D j;l;k D 2j3=4 .Sl A2j k/ W j 0; jlj d2j=2 e; k 2 cZ2 ; ˚ Q . Q I c; ˇ/ D Q j;l;k D 2j3=4 Q .SlT AQ 2j k/ W j 0; jlj d2j=2 e; k 2 cZ2 : Clearly, as stated before, replacing the rotation operation by shears makes the transition between the continuous and digital world much simpler: unlike rotation, shearing is also well defined as an operation on a discrete sampling grid. This fact
5 Optimally Sparse Data Representations
241
makes a faithful digital implementation of shearlets possible, see [32]. The question is whether these shearlet systems satisfy the same nice theoretical properties that curvelets do. The work [28] gives general sufficient conditions on the generators ; ; Q and the sampling constant c such that the associated shearlet system constitutes a frame for L2 .R2 /. We give two prominent examples. Example 5.47 (Bandlimited shearlets). We first present the classical cone-adapted shearlet construction of bandlimited shearlets presented in [26]. It is worth emphasizing that due to the shearing operator, the frequency domain needs to be split into four cones to ensure an almost uniform treatment of the different directions, which comes naturally for rotation as a means to change the orientation (compare Figure 5.6, right). First, let 1 ; 2 2 L2 .R/ be chosen such that 1 1 1 1 O supp 1 ; [ ; supp O 2 Œ1; 1 ; ; 2 16 16 2 X ˇˇ ˇˇ2 1 ˇ O 1 2j ! ˇ D 1 for j!j ; 8 j0 and ˇ ˇ2 ˇO ˇ ˇ 2 2bj=2c ! C l ˇ D 1
bj=2c 2X
for j!j 1:
lD2bj=2c
Then the classical mother shearlet
is defined by
O . / WD O 1 . 1 / O 2
2
1
:
For j; l 2 Z let now the parabolic scaling matrix Aj and the shearing matrix Sl be defined by Aj WD
2j 0 0 2j=2
and
Sl WD
1 l : 01
Further, for a domain ˝ R2 let us define the space n o L2 .˝/_ WD f 2 L2 .R2 / W supp fO ˝ : It was then shown in [26] that the system n ˙ 0 WD 23j=4
o Sl Aj k W j 0; l D 2bj=2c ; ; 2bj=2c ; k 2 Z2
242
P. Grohs
constitutes a tight frame for the Hilbert space L2 .C /_ on the frequency cone
1 j 2 j C WD W j 1 j ; 1 : 8 j 1 j By reversing the coordinate axes, also a Parseval frame ˙ 1 for L2 .C 0 /_ , where 1 j 1 j 1 ; C WD W j 2 j ; 8 j 2 j 0
can be constructed. Finally, we can consider a Parseval frame ˚ ˚ WD . k/ W k 2 Z2 2 _ for the Hilbert space L2 18 ; 18 . Combining those systems, we obtain the bandlimited shearlet frame ˙ WD ˙ 0 [ ˙ 1 [ ˚: Following [27], a slight modification of the bandlimited shearlet construction, namely by carefully glueing together boundary elements along the seamlines with angle =4, yields a Parseval frame with smooth and well-localized elements. See also [20] for a related construction. Example 5.48. Another popular shearlet construction is to choose the generators ; ; Q compactly supported. Note that, due to the bandlimitedness of the construction outlined in Example 5.47, the frame elements can never have compact support in space which is a limiting factor in certain applications. The same holds true for the construction of bandlimited curvelets in Section 5.6. In [28] constructions of compactly supported shearlet frames are given. Fast algorithms for the frame decomposition and reconstruction of a given signal are discussed in [33]. These constructions have the big advantage of being compactly supported. On the other hand, the frame property is only guaranteed if the sampling constant c is chosen sufficiently small. In addition the frames are not tight, which means that the structure of the dual frame is not well understood (see however [31] where the so-called dualizable shearlet frames are constructed). Another construction of compactly supported shearlet frames can be found in [6]. We have now seen two different constructions of shearlet frames. It turns out that both of them satisfy the same approximation properties as curvelets. Theorem 5.49. Suppose that ˚ is the bandlimited shearlet tight frame constructed in [27] and mentioned in Example 5.47 or a shearlet frame SH. ; ; Q I c/ as constructed in [28] and mentioned in Example 5.48 with compactly supported generators ; ; Q such that for all D . 1 ; 2 / 2 R2 , the shearlet satisfies
5 Optimally Sparse Data Representations
243
(i) jˇ O . /j ˇC1 min.1;j 1 j˛ / min.1; j 1 j / min.1; j 2 j / and ˇ @ O ˇ
2 , (ii) ˇ @ 2 . /ˇ jh. 1 /j 1 C 1
where ˛ > 5, 4, h 2 L1 .R/, and C1 is a constant, and suppose that the shearlet Q satisfies (i) and (ii) with the roles of 1 and 2 reversed. Let C D CARTK2 . Then (5.26) holds true for every p > 2=3. In particular, ssparse .C ; ˚Q / 1; where ˚Q denotes the canonical dual frame of ˚ (in the bandlimited case we have ˚Q D ˚. Proof. The case of ˚ being the bandlimited tight frame has been shown in [26], while the case of ˚ being a compactly supported shearlet frame has been shown in [30]. t u Remark 5.50. Regarding the development of encoding schemes with shearlets we would like to mention that explicit construction of an optimal encoding scheme using compactly supported shearlet frames has been given in [19]. There the encoding of the index set is done by exploiting a tree structure, rather than using the weak tail compactness property. Þ In summary we have now several systems which deliver optimal best N-term approximation rates for cartoon images: • bandlimited curvelets, cf. Section 5.6, which probably constitute the ‘cleanest’ construction (rotation yields the most natural operation involving directionality and all directions are treated completely uniformly) which is however hard to implement directly (rotations cannot be easily defined on digital data) and which is hampered by the fact that the analyzing elements are bandlimited and thus not well localized in space. • bandlimited shearlets, cf. Example 5.47 which yield a bandlimited tight frame similar to the curvelet construction but with rotation replaced by shears and thus making a faithful discretization possible. • compactly supported shearlets, cf. Example 5.48 which yield shearlet frame systems with compactly supported analyzing elements and hence superior localization in space. However, these systems do not form tight frames and the condition for frameability in [28] requires the sampling constant c to be very small (as opposed to c D 1 for the bandlimited construction). One may ask, how are all these different systems related to each other? This question has been explored in [24] where it is shown that they all belong to a larger class of systems, so-called parabolic molecules. In fact it is shown that all these systems can be seen as systems of the form (5.38), even in the shearlet case. Furthermore [24] shows that, provided that some natural time-frequency-localization properties hold for the generators .j;l;k/ in (5.38), then any two systems of parabolic molecules are
equivalent in terms of their approximation properties. This yields another, simpler proof of the optimal N-term approximation rates of shearlet systems: the rates follow immediately from the corresponding result for curvelets and the fact that all these systems belong to the family of parabolic molecules. Further details can be found in the survey [23]. In summary, we can say that parabolic molecules are perfect for the approximation of bivariate functions with $C^2$-curved discontinuities.
What about bivariate functions with $C^\alpha$-curved discontinuities, where $\alpha \ne 2$? For $\alpha < 2$ one can show that optimal rates can be achieved by so-called $\alpha$-molecules, where the parabolic dilation matrix $A_a$ is replaced by dilation matrices of the form $\mathrm{diag}(a, a^{1/\alpha})$; see [21, 22]. For $\alpha > 2$ one can prove optimal rates for so-called bandlet dictionaries [35], which, however, do not form a frame.
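The hard-thresholding procedure behind best N-term approximation is easy to experiment with numerically. The following minimal sketch (assuming NumPy and the PyWavelets package; the disc-indicator test image, the choice of the 'db4' wavelet, and the decomposition depth are illustrative choices, not prescribed by the text) keeps the N largest wavelet coefficients of a cartoon-like image and reports the resulting $L^2$-error. Replacing the wavelet coefficients by curvelet or shearlet frame coefficients would, by the results above, improve the error decay up to the optimal rate.

```python
# Sketch: best N-term (hard-thresholding) approximation of a cartoon-like image
# in a wavelet basis.  Assumes NumPy and PyWavelets; all parameter choices are
# illustrative.  For cartoon images the wavelet error decays roughly like N^(-1/2),
# while curvelet/shearlet frames achieve the rate N^(-1) up to log factors.
import numpy as np
import pywt

n = 256
x, y = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n))
f = (x**2 + y**2 < 0.4).astype(float)          # indicator of a disc: a C^2 edge

coeffs = pywt.wavedec2(f, 'db4', level=5)       # 2D wavelet decomposition
arr, slices = pywt.coeffs_to_array(coeffs)      # flatten coefficients into one array

def n_term_error(N):
    """L2-error after keeping only the N largest coefficients in modulus."""
    thresh = np.sort(np.abs(arr).ravel())[::-1][N - 1]
    arr_N = np.where(np.abs(arr) >= thresh, arr, 0.0)
    f_N = pywt.waverec2(pywt.array_to_coeffs(arr_N, slices, output_format='wavedec2'),
                        'db4')[:n, :n]           # crop possible boundary padding
    return np.linalg.norm(f - f_N)

for N in (100, 400, 1600, 6400):
    print(N, n_term_error(N))
```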
5.7 Further Examples
So far we have seen how to optimally approximate the signal classes $C_K^{k,\mathrm{pw}}(I)$ with wavelets and $CART_K^2$ with parabolic molecules. What about the other signal classes introduced in Section 5.2? It turns out that in those cases, too, optimal frame constructions exist. Below we give a brief summary.

• The signal class $TEXT_{K,M}^k$ can be optimally approximated by frames of wave atoms as introduced in [9]. These systems are therefore particularly attractive for the compression of fingerprint images or seismic data.
• The signal class $MUTIL_K^k$ can be optimally approximated by ridgelet frames (or bases), which is the topic of [2, 12]. Functions in $MUTIL_K^k$ arise as solutions to linear transport PDEs. Based on this motivation, the papers [14, 25] develop optimal adaptive ridgelet discretization schemes for the numerical solution of these equations.
We can thus sum up our main findings in the following theorem.

Theorem 5.51. All the upper bounds in Theorem 5.18 are sharp. In particular,
i) $s^*(C_K^k(C)) = k/d$, and optimal compression can be achieved with wavelets, parabolic molecules (e.g., curvelets and shearlets), and many more systems;
ii) $s^*(C_K^{k,\mathrm{pw}}(I)) = k$, and optimal compression can be achieved with wavelets;
iii) $s^*(STAR_K^2) = 1$, and optimal compression can be achieved with parabolic molecules (e.g., curvelets and shearlets);
iv) $s^*(CART_K^2) = 1$, and optimal compression can be achieved with parabolic molecules (e.g., curvelets and shearlets);
v) $s^*(TEXT_{K,M}^k) = k/2$, and optimal compression can be achieved with wave atoms;
vi) $s^*(MUTIL_K^k) = k/2$, and optimal compression can be achieved with ridgelets.
This means that for all signal classes considered in Section 5.2 we have obtained a complete picture of how to encode them optimally. It goes without saying that the tools developed in this chapter can be applied to a large number of additional signal classes.
5.8 Appendix: Chernoff Bounds

For the convenience of the reader we present here the version of Chernoff's bound which is needed for the conclusions of Lemma 5.13. Although the presentation is self-contained, we refer to [38] for more details. We start with random variables $X_1, \dots, X_n$ with values in $\{0,1\}$, which are iid (independent, identically distributed) with distribution $P(X_i = 0) = P(X_i = 1) = \tfrac12$. Chernoff's bound then controls the probability that the sum $X := \sum_{i=1}^n X_i$ is smaller than a certain quantity. Before we start with its derivation we state Markov's inequality.

Lemma 5.52 (Markov Inequality). Suppose that $(\Omega, \Sigma, \mu)$ is a measure space and $f : \Omega \to \mathbb{R}$ is measurable. Then for every $\varepsilon > 0$ we have
\[
\mu\bigl(\{x \in \Omega : |f(x)| \ge \varepsilon\}\bigr) \le \frac{1}{\varepsilon} \int_\Omega |f(x)|\, d\mu(x).
\]
Proof. Assume without loss of generality that $f$ is nonnegative and consider the function
\[
g(x) := \begin{cases} \varepsilon & \text{if } f(x) \ge \varepsilon, \\ 0 & \text{else.} \end{cases}
\]
Since $f(x) \ge g(x)$ pointwise, we have that
\[
\int_\Omega f(x)\, d\mu(x) \ge \int_\Omega g(x)\, d\mu(x) = \varepsilon\, \mu\bigl(\{x \in \Omega : f(x) \ge \varepsilon\}\bigr),
\]
which is the desired claim. □
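As a quick numerical sanity check of Lemma 5.52 (a sketch assuming NumPy; the exponential test distribution and the sample size are arbitrary illustrative choices), one can compare the empirical tail of a nonnegative random variable with the Markov bound:

```python
# Empirical check of Markov's inequality: P(f >= eps) <= E[f] / eps for f >= 0.
import numpy as np

rng = np.random.default_rng(0)
samples = rng.exponential(scale=1.0, size=10**6)   # nonnegative samples, mean ~ 1
for eps in (1.0, 2.0, 5.0):
    empirical_tail = np.mean(samples >= eps)
    markov_bound = samples.mean() / eps
    print(eps, empirical_tail, markov_bound)        # tail never exceeds the bound
```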
Now we are ready to formulate and prove the version of Chernoff's bound which we require.

Theorem 5.53. Suppose that $X_1, \dots, X_n : \Omega \to \{0,1\}$ (with $(\Omega, \Sigma, P)$ a probability space) are iid with $P(X_i = 0) = P(X_i = 1) = \tfrac12$. Then, for any $0 < q < 1/2$ we have
\[
P\Bigl(\frac1n \sum_{i=1}^n X_i \le q\Bigr) \le 2^{-n}\, 2^{n h(q)},
\]
with
\[
h(x) = -x \log_2(x) - (1-x)\log_2(1-x)
\]
the binary entropy, see Figure 5.7.

Fig. 5.7 The binary entropy $h(x) = -x \log_2(x) - (1-x)\log_2(1-x)$ on the interval $[0,1]$.
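A plot such as the one in Figure 5.7 can be reproduced in a few lines (a sketch assuming NumPy and Matplotlib):

```python
# Reproduce the graph of the binary entropy h(x) = -x log2(x) - (1-x) log2(1-x).
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(1e-6, 1 - 1e-6, 500)               # avoid the endpoints 0 and 1
h = -x * np.log2(x) - (1 - x) * np.log2(1 - x)
plt.plot(x, h)
plt.xlabel('x')
plt.ylabel('h(x)')
plt.title('Binary entropy')
plt.show()
```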
Proof. Consider the sum $X = \sum_{i=1}^n X_i$. Let $q \in \mathbb{R}$ and $t > 0$ be arbitrary. Then Markov's inequality (Lemma 5.52) gives us
\[
P(X \le q) = P\bigl(2^{-tX} \ge 2^{-tq}\bigr) \le 2^{tq} \int_\Omega 2^{-tX(\omega)}\, dP(\omega) = 2^{tq} \int_\Omega \prod_{i=1}^n 2^{-tX_i(\omega)}\, dP(\omega).
\]
Since the $X_i$'s are independent, we have
\[
\int_\Omega \prod_{i=1}^n 2^{-tX_i(\omega)}\, dP(\omega) = \prod_{i=1}^n \int_\Omega 2^{-tX_i(\omega)}\, dP(\omega) = \Bigl(\frac12 + \frac12\, 2^{-t}\Bigr)^n.
\]
Applying these estimates with $qn$ in place of $q$ and putting everything together, we get that
\[
P\Bigl(\frac1n X \le q\Bigr) \le 2^{-n} \inf_{t > 0} \bigl(2^{tq} + 2^{t(q-1)}\bigr)^n.
\]
The infimum of the function above is reached at $t^* = \log_2\bigl((1-q)/q\bigr)$, and we get
\[
P\Bigl(\frac1n X \le q\Bigr) \le 2^{-n} \bigl(2^{t^* q} + 2^{t^*(q-1)}\bigr)^n. \tag{5.46}
\]
Note that, due to the assumption that $q < 1/2$, the quantity $t^*$ is positive, as required. Now we compute
\[
2^{t^* q} + 2^{t^*(q-1)} = (1-q)^{-(1-q)}\, q^{-q} = 2^{-\log_2\bigl((1-q)^{1-q} q^{q}\bigr)} = 2^{h(q)}.
\]
Inserting this identity into (5.46) gives the final result. □
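The bound of Theorem 5.53 is also easy to verify numerically by comparing it with the exact lower tail of the Binomial(n, 1/2) distribution (a sketch using only the Python standard library; the values of n and q are arbitrary test choices):

```python
# Exact lower tail P(X/n <= q) for X ~ Binomial(n, 1/2) versus the Chernoff bound
# 2^(-n) * 2^(n h(q)) = 2^(-n (1 - h(q))) from Theorem 5.53.
from math import comb, log2

def h(q):
    """Binary entropy h(q) = -q log2(q) - (1-q) log2(1-q)."""
    return -q * log2(q) - (1 - q) * log2(1 - q)

def lower_tail(n, q):
    """P(sum(X_i) <= q*n) computed exactly for n fair coin flips."""
    return sum(comb(n, k) for k in range(int(q * n) + 1)) / 2 ** n

for n in (50, 100, 200):
    for q in (0.1, 0.25, 0.4):
        print(n, q, lower_tail(n, q), 2 ** (-n * (1 - h(q))))  # exact <= bound
```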
References

1. Berger, T.: Rate Distortion Theory: A Mathematical Basis for Data Compression (1971)
2. Candès, E.J.: Ridgelets and the representation of mutilated Sobolev functions. SIAM J. Math. Anal. 33(2), 347–368 (2001)
3. Candès, E.J., Donoho, D.L.: New tight frames of curvelets and optimal representations of objects with C^2 singularities. Commun. Pure Appl. Math. 56, 219–266 (2004)
4. Christensen, O.: An Introduction to Frames and Riesz Bases. Birkhäuser, Boston (2003)
5. Cohen, A., Dahmen, W., DeVore, R.: Adaptive wavelet methods for elliptic operator equations: convergence rates. Math. Comput. 70(233), 27–75 (2001)
6. Dahlke, S., Steidl, G., Teschke, G.: Shearlet coorbit spaces: compactly supported analyzing shearlets, traces and embeddings. J. Fourier Anal. Appl. 17(6), 1232–1255 (2011)
7. Daubechies, I.: Orthonormal bases of compactly supported wavelets. Commun. Pure Appl. Math. 41(7), 909–996 (1988)
8. Daubechies, I.: Ten Lectures on Wavelets, vol. 61. SIAM, Philadelphia (1992)
9. Demanet, L., Ying, L.: Wave atoms and sparsity of oscillatory patterns. Appl. Comput. Harmonic Anal. 23(3), 368–387 (2007)
10. DeVore, R.: Nonlinear approximation. Acta Numerica 7, 51–150 (1998)
11. Donoho, D.L.: Unconditional bases and bit-level compression. Appl. Comput. Harmonic Anal. 3(4), 388–392 (1996)
12. Donoho, D.L.: Orthonormal ridgelets and linear singularities. SIAM J. Math. Anal. 31(5), 1062–1099 (2000)
13. Donoho, D.L.: Sparse components of images and optimal atomic decomposition. Constr. Approx. 17, 353–382 (2001)
14. Etter, S., Grohs, P., Obermeier, A.: FFRT: a fast finite ridgelet transform for radiative transport. Multiscale Model. Simul. 13(1), 1–42 (2015)
15. Evans, L.C.: Partial Differential Equations. AMS, Providence (2010)
16. Feichtinger, H.G., Gröchenig, K., Walnut, D.: Wilson bases and modulation spaces. Mathematische Nachrichten 155(1), 7–17 (1992)
17. Foucart, S., Rauhut, H.: A Mathematical Introduction to Compressive Sensing. Springer, New York (2013)
18. Gröchenig, K.: Foundations of Time-Frequency Analysis. Springer, New York (2001)
19. Grohs, P.: Tree approximation with anisotropic decompositions. Appl. Comput. Harmonic Anal. 33(1), 44–57 (2012)
20. Grohs, P.: Bandlimited shearlet frames with nice duals. J. Comput. Appl. Math. 243, 139–151 (2013)
21. Grohs, P., Keiper, S., Kutyniok, G., Schäfer, M.: α-molecules. arXiv preprint arXiv:1407.4424 (2014)
22. Grohs, P., Keiper, S., Kutyniok, G., Schäfer, M.: Cartoon approximation with α-curvelets. arXiv preprint arXiv:1404.1043 (2014)
23. Grohs, P., Keiper, S., Kutyniok, G., Schäfer, M.: Parabolic molecules: curvelets, shearlets, and beyond. In: Approximation Theory XIV: San Antonio 2013, pp. 141–172. Springer, New York (2014)
24. Grohs, P., Kutyniok, G.: Parabolic molecules. Found. Comput. Math. 14(2), 299–337 (2014)
25. Grohs, P., Obermeier, A.: Optimal adaptive ridgelet schemes for linear transport equations. arXiv preprint arXiv:1409.1881 (2014)
26. Guo, K., Labate, D.: Optimally sparse multidimensional representation using shearlets. SIAM J. Math. Anal. 39, 298–318 (2007)
27. Guo, K., Labate, D.: The construction of smooth Parseval frames of shearlets. Math. Model. Nat. Phenom. 8(1), 82–105 (2013)
28. Kittipoom, P., Kutyniok, G., Lim, W.Q.: Construction of compactly supported shearlet frames. Constr. Approx. 35(1), 21–72 (2012)
29. Kutyniok, G., Labate, D. (eds.): Shearlets: Multiscale Analysis for Multivariate Data. Birkhäuser, Boston (2012)
30. Kutyniok, G., Lim, W.Q.: Compactly supported shearlets are optimally sparse. J. Approx. Theory 163(11), 1564–1589 (2011)
31. Kutyniok, G., Lim, W.Q.: Dualizable shearlet frames and sparse approximation. arXiv preprint arXiv:1411.2303 (2014)
32. Kutyniok, G., Shahram, M., Zhuang, X.: ShearLab: a rational design of a digital parabolic scaling algorithm. SIAM J. Imag. Sci. 5(4), 1291–1332 (2012)
33. Lim, W.Q.: The discrete shearlet transform: a new directional transform and compactly supported shearlet frames. IEEE Trans. Image Process. 19(5), 1166–1180 (2010)
34. Mallat, S.: A Wavelet Tour of Signal Processing. Academic Press, London (2006)
35. Mallat, S., Peyré, G.: A review of bandlet methods for geometrical image representation. Numer. Algorithms 44(3), 205–234 (2007)
36. Ott, E.: Chaos in Dynamical Systems. Cambridge University Press, Cambridge (2002)
37. Pinsky, M.A.: Introduction to Fourier Analysis and Wavelets. Brooks/Cole, Pacific Grove (2002)
38. Rényi, A.: Foundations of Probability. Courier Corporation, Mineola (2007)
39. Runst, T., Sickel, W.: Sobolev Spaces of Fractional Order, Nemytskij Operators, and Nonlinear Partial Differential Equations, vol. 3. Walter de Gruyter, Berlin (1996)
40. Stein, E.M., Murphy, T.S.: Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory Integrals, vol. 3. Princeton University Press, Princeton (1993)
Index
Symbols C∞-vector, 47 K-atom, 132 U-oscillation, 96
A admissibility condition, 86, 117, 155 shearlets, 159 admissible vector, 38 affine group, 85, 153, 154 analyzing vector, 85, 86 atomic decomposition, 96, 128 automorphism inner, 17
B Baker–Campbell–Hausdorff formula, 20, 50 Banach frame, 100, 128 Banach space, 85 bandlets, 244 bandlimited shearlet frames, 241 Besov space, 85, 104, 105, 107, 132, 216 best N-term approximation, 215 best N-term approximation under polynomial depth search, 217 binary entropy, 208 bracket, 11
C Calderón equation, 34 canonical commutation relations, 58
canonical dual frame, 221 cartoon images, 203 CCR, 58 center, 25 center , 23 Chernoff bound, 208 class E , 67 coefficient, 32 diagonal, 32 compactly supported shearlet frames, 242 compressed sensing, 229 cone-adapted shearlet system, 240 continuous shearlet transform, 158, 162 continuous wavelet transform, 151, 155 convolution, 28, 236 idempotent, 41 convolvable, 29 coorbit space, 85, 91 shearlet, 120 corner point, 164 correspondence principle, 94 covering number, 206 curvelets, 233 cyclic vector, 35
D derivation, 24 dictionary, 214 diffeomorphism, 17 difference operator, 105 differential, 14, 47 dilation, 31 dilation matrix, 110
distortion, 205 distribution, 191 Duflo-Moore operator, 42
E edge, 150 ramp edge, 187 step edge, 150, 163 encoding/decoding pair, 204 exponential mapping, 19
F FBI transform, 152 formal degree operator, 42 frame in a Hilbert space, 220 frequency shift, 58 full affine group, 105
G Gabor, 107 geometric separation, 188 group adjoint, 24 affine, 10 full, 11 automorphism, 51 connected shearlet, 76 Lie, 10 locally compact, 8 semidirect product, 26 symplectic, 10 topological, 8 unimodular, 9
H Haar measure shearlet group, 113 Toeplitz shearlet group, see shearlet group Hamming distance, 207 Heaviside step function, 150 Heisenberg group, 48 automorphisms, 51 extended, 53 reduced, 107 hypercube embedding, 207
I ideal, 11 infinitesimal generator, 46
integral curve, 19 intertwining operator, 35 invariant subspace, 32 irreducible, 32 isometry, 30
J Jacobi identity, 11
K kernel, 37 Kolmogorov entropy, 206
L left invariant, 14 Lemma Schur, 36 Lie algebra, 11 homomorphism, 17 isomorphism, 17 Lie subalgebra, 11 semidirect product, 27 two-step nilpotent, 50 Lie group, 10 homomorphism, 16 isomorphism, 17 automorphism, 17 Jacobi group, 52 localization lemma, 168 locally closed, 71
M Markov inequality, 245 measure Borel, 8 Haar, 9 metric entropy, 206 modular function, 9 shearlet group, 115 modulation space, 85, 107, 216 modulus of continuity, 104 monodromy principle, 18 mutilated functions, 204
N nonincreasing rearrangement, 221 nonlinear approximation space, 215 nonlinear approximation theory, 216
O operator adjoint, 43 closed, 43 densely defined, 43 one parameter group, 46 self-adjoint, 44 skew-adjoint, 44 symmetric, 44 unbounded, 43 optimal dictionary, 220 optimal encoding rate, 205 orbit, 40, 71 orthogonality relations, 42
P parabolic molecules, 243 parabolic scaling, 231 Parseval frame, 221 phase-space, 58 piecewise smooth functions, 203 polynomial-depth search, 217
R rate-distortion theory, 205, 210 reduced Heisenberg group, 61 representation (unitarily) equivalent, 35 adjoint, 11, 24, 25 coefficient, 32 cyclic, 35 irreducible, 32, 85 left regular, 31 Lie algebra, 17 Lie group, 17 metaplectic, 61, 64 right regular, 31 Schrödinger, 53, 107 shearlet, 76 shearlet group, 115 square integrable, 38, 85 Toeplitz shearlet group, see shearlet group unitary, 30 wavelet, 31 reproducing formula, 40 reproducing property, 86 reproducing system, 38 ridgelets, 244 runlength, 205
S scalar integrability, 38 Schrödinger representation, 56 Schrödingerlets, 78 second dyadic decomposition, 234 semi-invariance, 42 set Q-dense, 96 U-dense, 127 relatively separated, 96, 127 separated, 96 shear matrix, 110 Toeplitz, 145 shearlet admissible, 119 shearlet coorbit space, see coorbit space shearlet group, 157 connected Toeplitz, 145 full, 110, 111 Toeplitz, 145 shearlet transform continuous, 119 Toeplitz, 145 Short Time Fourier Transform, 57 signal class, 202 singular support, 194 smoothness space, 85 sparse linear combination, 215 stability subgroup, 71 star shaped images, 203 STFT, 57 subgroup Lie, 18
T tangent map, 14 tangent space, 12 tangent vector, 12 tensor-product wavelets, 229 texture images, 204 theorem Stone, 46 tight frame, 221 time shift, 57 translation, 31, 110
U unitarily equivalent, 35 unitary map, 30
V vanishing moments, 226 vector field, 12 voice transform, 36, 86 extended, 89 W wave atoms, 244
wavefront set, 152, 195 wavelet, 34, 40 wavelet system, 226 weak ℓ^p quasinorm, 221 weak integral, 38 weak tail-compactness, 225 weight, 86, 119 w-moderate, 91 Weyl-Heisenberg group, 85