Scaling, Fractals and Wavelets


Author: Patrice Abry | Paulo Gonçalves | Jacques Lévy Véhel





Edited by Patrice Abry, Paulo Gonçalves and Jacques Lévy Véhel

First published in France in 2 volumes in 2002 by Hermes Science/Lavoisier entitled: Lois d’échelle, fractales et ondelettes © LAVOISIER, 2002

First published in Great Britain and the United States in 2009 by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd, 27-37 St George’s Road, London SW19 4EU, UK

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

www.iste.co.uk

www.wiley.com

© ISTE Ltd, 2009

The rights of Patrice Abry, Paulo Gonçalves and Jacques Lévy Véhel to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Cataloging-in-Publication Data
Lois d’échelle, fractales et ondelettes. English
Scaling, fractals and wavelets / edited by Patrice Abry, Paulo Gonçalves, Jacques Lévy Véhel.
p. cm.
Includes bibliographical references.
ISBN 978-1-84821-072-1
1. Signal processing--Mathematics. 2. Fractals. 3. Wavelets (Mathematics)
I. Abry, Patrice. II. Gonçalves, Paulo. III. Lévy Véhel, Jacques, 1960- IV. Title.
TK5102.9.L65 2007
621.382'20151--dc22
2007025119

British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN: 978-1-84821-072-1

Printed and bound in Great Britain by CPI Antony Rowe Ltd, Chippenham, Wiltshire.

Table of Contents

Preface

Chapter 1. Fractal and Multifractal Analysis in Signal Processing
Jacques LÉVY VÉHEL and Claude TRICOT

1.1. Introduction
1.2. Dimensions of sets
1.2.1. Minkowski-Bouligand dimension
1.2.2. Packing dimension
1.2.3. Covering dimension
1.2.4. Methods for calculating dimensions
1.3. Hölder exponents
1.3.1. Hölder exponents related to a measure
1.3.2. Theorems on set dimensions
1.3.3. Hölder exponent related to a function
1.3.4. Signal dimension theorem
1.3.5. 2-microlocal analysis
1.3.6. An example: analysis of stock market price
1.4. Multifractal analysis
1.4.1. What is the purpose of multifractal analysis?
1.4.2. First ingredient: local regularity measures
1.4.3. Second ingredient: the size of point sets of the same regularity
1.4.4. Practical calculation of spectra
1.4.5. Refinements: analysis of the sequence of capacities, mutual analysis and multisingularity
1.4.6. The multifractal spectra of certain simple signals
1.4.7. Two applications
1.4.7.1. Image segmentation
1.4.7.2. Analysis of TCP traffic
1.5. Bibliography


Chapter 2. Scale Invariance and Wavelets
Patrick FLANDRIN, Paulo GONÇALVES and Patrice ABRY

2.1. Introduction
2.2. Models for scale invariance
2.2.1. Intuition
2.2.2. Self-similarity
2.2.3. Long-range dependence
2.2.4. Local regularity
2.2.5. Fractional Brownian motion: paradigm of scale invariance
2.2.6. Beyond the paradigm of scale invariance
2.3. Wavelet transform
2.3.1. Continuous wavelet transform
2.3.2. Discrete wavelet transform
2.4. Wavelet analysis of scale invariant processes
2.4.1. Self-similarity
2.4.2. Long-range dependence
2.4.3. Local regularity
2.4.4. Beyond second order
2.5. Implementation: analysis, detection and estimation
2.5.1. Estimation of the parameters of scale invariance
2.5.2. Emphasis on scaling laws and determination of the scaling range
2.5.3. Robustness of the wavelet approach
2.6. Conclusion
2.7. Bibliography

Chapter 3. Wavelet Methods for Multifractal Analysis of Functions
Stéphane JAFFARD

3.1. Introduction
3.2. General points regarding multifractal functions
3.2.1. Important definitions
3.2.2. Wavelets and pointwise regularity
3.2.3. Local oscillations
3.2.4. Complements
3.3. Random multifractal processes
3.3.1. Lévy processes
3.3.2. Burgers’ equation and Brownian motion
3.3.3. Random wavelet series
3.4. Multifractal formalisms
3.4.1. Besov spaces and lacunarity
3.4.2. Construction of formalisms
3.5. Bounds of the spectrum
3.5.1. Bounds according to the Besov domain
3.5.2. Bounds deduced from histograms
3.6. The grand-canonical multifractal formalism
3.7. Bibliography

Chapter 4. Multifractal Scaling: General Theory and Approach by Wavelets
Rudolf RIEDI

4.1. Introduction and summary
4.2. Singularity exponents
4.2.1. Hölder continuity
4.2.2. Scaling of wavelet coefficients
4.2.3. Other scaling exponents
4.3. Multifractal analysis
4.3.1. Dimension based spectra
4.3.2. Grain based spectra
4.3.3. Partition function and Legendre spectrum
4.3.4. Deterministic envelopes
4.4. Multifractal formalism
4.5. Binomial multifractals
4.5.1. Construction
4.5.2. Wavelet decomposition
4.5.3. Multifractal analysis of the binomial measure
4.5.4. Examples
4.5.5. Beyond dyadic structure
4.6. Wavelet based analysis
4.6.1. The binomial revisited with wavelets
4.6.2. Multifractal properties of the derivative
4.7. Self-similarity and LRD
4.8. Multifractal processes
4.8.1. Construction and simulation
4.8.2. Global analysis
4.8.3. Local analysis of warped FBM
4.8.4. LRD and estimation of warped FBM
4.9. Bibliography

Chapter 5. Self-similar Processes
Albert BENASSI and Jacques ISTAS

5.1. Introduction
5.1.1. Motivations
5.1.2. Scalings
5.1.2.1. Trees
5.1.2.2. Coding of R
5.1.2.3. Renormalizing Cantor set
5.1.2.4. Random renormalized Cantor set
5.1.3. Distributions of scale invariant masses
5.1.3.1. Distribution of masses associated with Poisson measures
5.1.3.2. Complete coding
5.1.4. Weierstrass functions
5.1.5. Renormalization of sums of random variables
5.1.6. A common structure for a stochastic (semi-)self-similar process
5.1.7. Identifying Weierstrass functions
5.1.7.1. Pseudo-correlation
5.2. The Gaussian case
5.2.1. Self-similar Gaussian processes with r-stationary increments
5.2.1.1. Notations
5.2.1.2. Definitions
5.2.1.3. Characterization
5.2.2. Elliptic processes
5.2.3. Hyperbolic processes
5.2.4. Parabolic processes
5.2.5. Wavelet decomposition
5.2.5.1. Gaussian elliptic processes
5.2.5.2. Gaussian hyperbolic process
5.2.6. Renormalization of sums of correlated random variable
5.2.7. Convergence towards fractional Brownian motion
5.2.7.1. Quadratic variations
5.2.7.2. Acceleration of convergence
5.2.7.3. Self-similarity and regularity of trajectories
5.3. Non-Gaussian case
5.3.1. Introduction
5.3.2. Symmetric α-stable processes
5.3.2.1. Stochastic measure
5.3.2.2. Ellipticity
5.3.3. Censov and Takenaka processes
5.3.4. Wavelet decomposition
5.3.5. Process subordinated to Brownian measure
5.4. Regularity and long-range dependence
5.4.1. Introduction
5.4.2. Two examples
5.4.2.1. A signal plus noise model
5.4.2.2. Filtered white noise
5.4.2.3. Long-range correlation
5.5. Bibliography

Chapter 6. Locally Self-similar Fields
Serge COHEN

6.1. Introduction
6.2. Recap of two representations of fractional Brownian motion
6.2.1. Reproducing kernel Hilbert space
6.2.2. Harmonizable representation
6.3. Two examples of locally self-similar fields
6.3.1. Definition of the local asymptotic self-similarity (LASS)
6.3.2. Filtered white noise (FWN)
6.3.3. Elliptic Gaussian random fields (EGRP)
6.4. Multifractional fields and trajectorial regularity
6.4.1. Two representations of the MBM
6.4.2. Study of the regularity of the trajectories of the MBM
6.4.3. Towards more irregularities: generalized multifractional Brownian motion (GMBM) and step fractional Brownian motion (SFBM)
6.4.3.1. Step fractional Brownian motion
6.4.3.2. Generalized multifractional Brownian motion
6.5. Estimate of regularity
6.5.1. General method: generalized quadratic variation
6.5.2. Application to the examples
6.5.2.1. Identification of filtered white noise
6.5.2.2. Identification of elliptic Gaussian random processes
6.5.2.3. Identification of MBM
6.5.2.4. Identification of SFBMs
6.6. Bibliography

Chapter 7. An Introduction to Fractional Calculus
Denis MATIGNON

7.1. Introduction
7.1.1. Motivations
7.1.1.1. Fields of application
7.1.1.2. Theories
7.1.2. Problems
7.1.3. Outline
7.2. Definitions
7.2.1. Fractional integration
7.2.2. Fractional derivatives within the framework of causal distributions
7.2.2.1. Motivation
7.2.2.2. Fundamental solutions
7.2.3. Mild fractional derivatives, in the Caputo sense
7.2.3.1. Motivation
7.2.3.2. Definition
7.2.3.3. Mittag-Leffler eigenfunctions
7.2.3.4. Fractional power series expansions of order α (α-FPSE)
7.3. Fractional differential equations
7.3.1. Example
7.3.1.1. Framework of causal distributions
7.3.1.2. Framework of fractional power series expansion of order one half
7.3.1.3. Notes
7.3.2. Framework of causal distributions
7.3.3. Framework of functions expandable into fractional power series (α-FPSE)
7.3.4. Asymptotic behavior of fundamental solutions
7.3.4.1. Asymptotic behavior at the origin
7.3.4.2. Asymptotic behavior at infinity
7.3.5. Controlled-and-observed linear dynamic systems of fractional order
7.4. Diffusive structure of fractional differential systems
7.4.1. Introduction to diffusive representations of pseudo-differential operators
7.4.2. General decomposition result
7.4.3. Connection with the concept of long memory
7.4.4. Particular case of fractional differential systems of commensurate orders
7.5. Example of a fractional partial differential equation
7.5.1. Physical problem considered
7.5.2. Spectral consequences
7.5.3. Time-domain consequences
7.5.3.1. Decomposition into wavetrains
7.5.3.2. Quasi-modal decomposition
7.5.3.3. Fractional modal decomposition
7.5.4. Free problem
7.6. Conclusion
7.7. Bibliography

Chapter 8. Fractional Synthesis, Fractional Filters
Liliane BEL, Georges OPPENHEIM, Luc ROBBIANO and Marie-Claude VIANO

8.1. Traditional and less traditional questions about fractionals
8.1.1. Notes on terminology
8.1.2. Short and long memory
8.1.3. From integer to non-integer powers: filter based sample path design
8.1.4. Local and global properties
8.2. Fractional filters
8.2.1. Desired general properties: association
8.2.2. Construction and approximation techniques
8.3. Discrete time fractional processes
8.3.1. Filters: impulse responses and corresponding processes
8.3.2. Mixing and memory properties
8.3.3. Parameter estimation
8.3.4. Simulated example
8.4. Continuous time fractional processes
8.4.1. A non-self-similar family: fractional processes designed from fractional filters
8.4.2. Sample path properties: local and global regularity, memory
8.5. Distribution processes
8.5.1. Motivation and generalization of distribution processes
8.5.2. The family of linear distribution processes
8.5.3. Fractional distribution processes
8.5.4. Mixing and memory properties
8.6. Bibliography

Chapter 9. Iterated Function Systems and Some Generalizations: Local Regularity Analysis and Multifractal Modeling of Signals
Khalid DAOUDI

9.1. Introduction
9.2. Definition of the Hölder exponent
9.3. Iterated function systems (IFS)
9.4. Generalization of iterated function systems
9.4.1. Semi-generalized iterated function systems
9.4.2. Generalized iterated function systems
9.5. Estimation of pointwise Hölder exponent by GIFS
9.5.1. Principles of the method
9.5.2. Algorithm
9.5.3. Application
9.6. Weak self-similar functions and multifractal formalism
9.7. Signal representation by WSA functions
9.8. Segmentation of signals by weak self-similar functions
9.9. Estimation of the multifractal spectrum
9.10. Experiments
9.11. Bibliography

Chapter 10. Iterated Function Systems and Applications in Image Processing
Franck DAVOINE and Jean-Marc CHASSERY

10.1. Introduction
10.2. Iterated transformation systems
10.2.1. Contracting transformations and iterated transformation systems
10.2.1.1. Lipschitzian transformation
10.2.1.2. Contracting transformation
10.2.1.3. Fixed point
10.2.1.4. Hausdorff distance
10.2.1.5. Contracting transformation on the space H(R^2)
10.2.1.6. Iterated transformation system
10.2.2. Attractor of an iterated transformation system
10.2.3. Collage theorem
10.2.4. Finally contracting transformation
10.2.5. Attractor and invariant measures
10.2.6. Inverse problem
10.3. Application to natural image processing: image coding
10.3.1. Introduction
10.3.2. Coding of natural images by fractals
10.3.2.1. Collage of a source block onto a destination block
10.3.2.2. Hierarchical partitioning
10.3.2.3. Coding of the collage operation on a destination block
10.3.2.4. Contraction control of the fractal transformation
10.3.3. Algebraic formulation of the fractal transformation
10.3.3.1. Formulation of the mass transformation
10.3.3.2. Contraction control of the fractal transformation
10.3.3.3. Fisher formulation
10.3.4. Experimentation on triangular partitions
10.3.5. Coding and decoding acceleration
10.3.5.1. Coding simplification suppressing the research for similarities
10.3.5.2. Decoding simplification by collage space orthogonalization
10.3.5.3. Coding acceleration: search for the nearest neighbor
10.3.6. Other optimization diagrams: hybrid methods
10.4. Bibliography

Chapter 11. Local Regularity and Multifractal Methods for Image and Signal Analysis
Pierrick LEGRAND

11.1. Introduction
11.2. Basic tools
11.2.1. Hölder regularity analysis
11.2.2. Reminders on multifractal analysis
11.2.2.1. Hausdorff multifractal spectrum
11.2.2.2. Large deviation multifractal spectrum
11.2.2.3. Legendre multifractal spectrum
11.3. Hölderian regularity estimation
11.3.1. Oscillations (OSC)
11.3.2. Wavelet coefficient regression (WCR)
11.3.3. Wavelet leaders regression (WL)
11.3.4. Limit inf and limit sup regressions
11.3.5. Numerical experiments
11.4. Denoising
11.4.1. Introduction
11.4.2. Minimax risk, optimal convergence rate and adaptivity
11.4.3. Wavelet based denoising
11.4.4. Non-linear wavelet coefficients pumping
11.4.4.1. Minimax properties
11.4.4.2. Regularity control
11.4.4.3. Numerical experiments
11.4.5. Denoising using exponent between scales
11.4.5.1. Introduction
11.4.5.2. Estimating the local regularity of a signal from noisy observations
11.4.5.3. Numerical experiments
11.4.6. Bayesian multifractal denoising
11.4.6.1. Introduction
11.4.6.2. The set of parameterized classes S(g, ψ)
11.4.6.3. Bayesian denoising in S(g, ψ)
11.4.6.4. Numerical experiments
11.4.6.5. Denoising of road profiles
11.5. Hölderian regularity based interpolation
11.5.1. Introduction
11.5.2. The method
11.5.3. Regularity and asymptotic properties
11.5.4. Numerical experiments
11.6. Biomedical signal analysis
11.7. Texture segmentation
11.8. Edge detection
11.8.1. Introduction
11.8.1.1. Edge detection
11.9. Change detection in image sequences using multifractal analysis
11.10. Image reconstruction
11.11. Bibliography

Chapter 12. Scale Invariance in Computer Network Traffic . . . . . . . . . 413 Darryl V EITCH 12.1. Teletraffic – a new natural phenomenon . . . . . . . 12.1.1. A phenomenon of scales . . . . . . . . . . . . . 12.1.2. An experimental science of “man-made atoms” 12.1.3. A random current . . . . . . . . . . . . . . . . . 12.1.4. Two fundamental approaches . . . . . . . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

413 413 415 416 417

14


12.2. From a wealth of scales arise scaling laws . . 12.2.1. First discoveries . . . . . . . . . . . . . . 12.2.2. Laws reign . . . . . . . . . . . . . . . . . 12.2.3. Beyond the revolution . . . . . . . . . . 12.3. Sources as the source of the laws . . . . . . . 12.3.1. The sum or its parts . . . . . . . . . . . . 12.3.2. The on/off paradigm . . . . . . . . . . . 12.3.3. Chemistry . . . . . . . . . . . . . . . . . 12.3.4. Mechanisms . . . . . . . . . . . . . . . . 12.4. New models, new behaviors . . . . . . . . . . 12.4.1. Character of a model . . . . . . . . . . . 12.4.2. The fractional Brownian motion family 12.4.3. Greedy sources . . . . . . . . . . . . . . 12.4.4. Never-ending calls . . . . . . . . . . . . 12.5. Perspectives . . . . . . . . . . . . . . . . . . . 12.6. Bibliography . . . . . . . . . . . . . . . . . . .


Chapter 13. Research of Scaling Law on Stock Market Variations
Christian WALTER
13.1. Introduction: fractals in finance
13.2. Presence of scales in the study of stock market variations
13.2.1. Modeling of stock market variations
13.2.1.1. Statistical apprehension of stock market fluctuations
13.2.1.2. Profit and stock market return operations in different scales
13.2.1.3. Traditional financial modeling: Brownian motion
13.2.2. Time scales in financial modeling
13.2.2.1. The existence of characteristic time
13.2.2.2. Implicit scaling invariances of traditional financial modeling
13.3. Modeling postulating independence on stock market returns
13.3.1. 1960–1970: from Pareto’s law to Lévy’s distributions
13.3.1.1. Leptokurtic problem and Mandelbrot’s first model
13.3.1.2. First emphasis of Lévy’s α-stable distributions in finance
13.3.2. 1970–1990: experimental difficulties of iid-α-stable model
13.3.2.1. Statistical problem of parameter estimation of stable laws
13.3.2.2. Non-normality and controversies on scaling invariance
13.3.2.3. Scaling anomalies of parameters under iid hypothesis
13.3.3. Unstable iid models in partial scaling invariance
13.3.3.1. Partial scaling invariances by regime switching models
13.3.3.2. Partial scaling invariances as compared with extremes
13.4. Research of dependency and memory of markets
13.4.1. Linear dependence: testing of H-correlative models on returns
13.4.1.1. Question of dependency of stock market returns
13.4.1.2. Problem of slow cycles and Mandelbrot’s second model


13.4.1.3. Introduction of fractional differentiation in econometrics
13.4.1.4. Experimental difficulties of H-correlative model on returns
13.4.2. Non-linear dependence: validating H-correlative model on volatilities
13.4.2.1. The 1980s: ARCH modeling and its limits
13.4.2.2. The 1990s: emphasis of long dependence on volatility
13.5. Towards a rediscovery of scaling laws in finance
13.6. Bibliography

Chapter 14. Scale Relativity, Non-differentiability and Fractal Space-time
Laurent NOTTALE


14.1. Introduction
14.2. Abandonment of the hypothesis of space-time differentiability
14.3. Towards a fractal space-time
14.3.1. Explicit dependence of coordinates on spatio-temporal resolutions
14.3.2. From continuity and non-differentiability to fractality
14.3.3. Description of non-differentiable process by differential equations
14.3.4. Differential dilation operator
14.4. Relativity and scale covariance
14.5. Scale differential equations
14.5.1. Constant fractal dimension: “Galilean” scale relativity
14.5.2. Breaking scale invariance: transition scales
14.5.3. Non-linear scale laws: second order equations, discrete scale invariance, log-periodic laws
14.5.4. Variable fractal dimension: Euler-Lagrange scale equations
14.5.5. Scale dynamics and scale force
14.5.5.1. Constant scale force
14.5.5.2. Scale harmonic oscillator
14.5.6. Special scale relativity – log-Lorentzian dilation laws, invariant scale limit under dilations
14.5.7. Generalized scale relativity and scale-motion coupling
14.5.7.1. A reminder about gauge invariance
14.5.7.2. Nature of gauge fields
14.5.7.3. Nature of the charges
14.5.7.4. Mass-charge relations
14.6. Quantum-like induced dynamics
14.6.1. Generalized Schrödinger equation
14.6.2. Application in gravitational structure formation
14.7. Conclusion
14.8. Bibliography
List of Authors
Index


Preface

It is a common scheme in many sciences to study systems or signals by looking for characteristic scales in time or space. These are then used as references for expressing all measured quantities. Physicists may for instance employ the size of a structure, while signal processors are often interested in correlation lengths: (blocks of) samples whose distance is several times the correlation lengths are considered statistically independent.

The concept of scale invariance may be considered to be the converse of this approach: it means that there is no characteristic scale in the system. In other words, all scales contribute to the observed phenomenon. This “non-property” is also loosely referred to as scaling law or scaling behavior. Note that we may reverse the perspective and consider scale invariance as the signature of a strong organization in the system. Indeed, it is well known in physics that invariance laws are associated with fundamental properties.

It is remarkable that phenomena where scaling laws have been observed cover a wide range of fields, both in natural and artificial systems. In the first category, these include for instance hydrology, in relation to the variability of water levels, hydrodynamics and the study of turbulence, statistical physics with the study of long-range interactions, electronics with the so-called 1/f noise in semiconductors, geophysics with the distribution of faults, biology, physiology and the variability of human body rhythms such as the heart rate. In the second category, we may mention geography with the distribution of population in cities or in continents, Internet traffic and financial markets.

From a signal processing perspective, the aim is then to study transfer mechanisms between scales (also called “cascades”) rather than to identify relevant scales. We are thus led to forget about scale-based models (such as Markov models), and to focus on models allowing us to study correspondences between many scales.
The central notion behind scaling laws is that of self-similarity. Loosely speaking, this means that each part is (statistically) the same as the whole object. In particular, information gathered from observing the data should be independent of the scale of observation.



There is considerable variety in observed self-similar behaviors. They may for instance appear through scaling laws in the Fourier domain, either at all frequencies or in a finite but large range of frequencies, or even in the limit of high or low frequencies. In many cases, studying second-order quantities such as spectra will prove insufficient for describing scaling laws. Higher-order moments are then necessary.

More generally, the fundamental model of self-similarity has to be adapted in many settings, and to be generalized in various directions, so that it becomes useful in real-world situations. These include self-similar stochastic processes, 1/f processes, long memory processes, multifractal and multifractional processes, locally self-similar processes and more. Multifractal analysis, in particular, has developed as a method allowing us to study complex objects which are not necessarily “fractal”, by describing the variations of local regularity. The recent change of paradigm consisting of using fractal methods rather than studying fractal objects is one of the reasons for the success of the domain in applications.

We are delighted to invite our reader for a promenade in the realm of scaling laws, its mathematical models and its real-world manifestations. The 14 chapters have all been written by experts. The first four chapters deal with the general mathematical tools allowing us to measure fractional dimensions, local regularity and scaling in its various disguises. Wavelets play a particular role for this purpose, and their role is emphasized. Chapters 5 and 6 describe advanced stochastic models relevant in our area. Chapter 7 deals with fractional calculus, and Chapter 8 explains how to synthesize certain fractal models. Chapter 9 gives a general introduction to IFS, a powerful tool for building and describing fractals and other complex objects, while Chapter 10, of applied nature, considers the application of IFS to image compression.
The four remaining chapters also deal with applications: various signal and image processing tasks are considered in Chapter 11. Chapter 12 deals with Internet traffic, and Chapter 13 with financial data analysis. Finally, Chapter 14 describes a fractal space-time in the frame of cosmology. It is a great pleasure for us to thank all the authors of this volume for the quality of their contribution. We believe they have succeeded in exposing advanced concepts with great pedagogy.

Chapter 1

Fractal and Multifractal Analysis in Signal Processing

1.1. Introduction

The aim of this chapter is to describe some of the fundamental concepts of fractal analysis in view of their application. We will thus present a simple introduction to the concepts of fractional dimension, regularity exponents and multifractal analysis, and show how they are used in signal and image processing. Since we are interested in applications, most theoretical results are given without proofs. These are available in the references mentioned where appropriate. In contrast, we will pay special attention to the practical aspects. In particular, almost all the notions explained below are implemented in the FracLab toolbox. This toolbox is freely available from the following site: http://complex.futurs.inria.fr/FracLab/, so that interested readers may perform hands-on experiments.

Before we start, we wish to emphasize the following point: recent successes of fractal analysis in signal and image processing do not generally stem from the fact that they are applied to fractal objects (in a more or less strict sense). Indeed, most real-world signals are neither self-similar nor display the characteristics usually associated with fractals (except for the irregularity at each scale). The relevance of fractal analysis instead results from the progress made in the development of fractal methods. Such methods have lately become more general and reliable, and they now allow us to describe precisely the singular structure of complex signals,

Chapter written by Jacques LÉVY VÉHEL and Claude TRICOT.



without any assumption of “fractality”: as a rule, performing a fractal analysis will be useful as soon as the considered signal is irregular and this irregularity contains meaningful information. There are numerous examples of such situations, ranging from image segmentation (where, for instance, contours are made of singular points; see section 1.4.7 and Chapter 11) to vocal synthesis [DAO 02] or financial analysis.

This chapter roughly follows the chronological order in which the various tools have been introduced. We first describe several notions of fractional dimensions. These provide a global characterization of a signal. We then introduce Hölder exponents, which supply local measures of irregularity. The last part of the chapter is devoted to multifractal analysis, a most refined tool that describes the local as well as the overall singular structure of signals. All the concepts presented here are more fully developed in [TRI 99, LEV 02].

1.2. Dimensions of sets

The concept of dimension applies to objects more general than signals. To simplify, we shall consider sets in a metric space, although the notion of dimension makes sense for more complex entities such as measures or classes of functions [KOL 61]. Several interesting notions of dimension exist. This might look like a drawback for the mathematical analysis of fractal sets. However, it is actually an advantage, since each dimension emphasizes a different aspect of an object. It is thus worthwhile to determine the specificity of each dimension. As a general rule, none of these tools outperforms the others. Let us first give a general definition of the notion of dimension.
DEFINITION 1.1.– We call dimension a mapping $d$ defined on the family of bounded sets of $\mathbb{R}^n$ and taking values in $\mathbb{R}^+ \cup \{-\infty\}$, such that:
1) $d(\emptyset) = -\infty$, $d(\{x\}) = 0$ for any point $x$;
2) $E_1 \subset E_2 \Rightarrow d(E_1) \le d(E_2)$ (monotonicity);
3) if $E$ has non-zero $n$-dimensional volume, then $d(E) = n$;
4) if $T$ is a diffeomorphism of $\mathbb{R}^n$ (such as, in particular, a similarity with non-zero ratio, or a non-singular affine mapping), then $d(T(E)) = d(E)$ (invariance).

Moreover, we will say that $d$ is stable if $d(E_1 \cup E_2) = \max\{d(E_1), d(E_2)\}$. It is said to be σ-stable if, for any countable collection of sets $(E_n)$:

$$d\Big(\bigcup_n E_n\Big) = \sup_n d(E_n)$$

σ-stable dimensions may be extended in a natural way to characterize unbounded sets of $\mathbb{R}^n$.



1.2.1. Minkowski-Bouligand dimension

The Minkowski-Bouligand dimension was invented by Bouligand [BOU 28], who named it the Cantor-Minkowski order. It is now commonly referred to as the box dimension. Let us cover a bounded set $E$ of $\mathbb{R}^n$ with cubes of side $\varepsilon$ and disjoint interiors. Let $N_\varepsilon(E)$ be the number of these cubes. When $E$ contains an infinite number of points (i.e. if it is a curve, a surface, etc.), $N_\varepsilon(E)$ tends to $+\infty$ when $\varepsilon$ tends to 0. The box dimension $\Delta$ characterizes the rate of this growth. Roughly speaking, $\Delta$ is the real number such that:

$$N_\varepsilon(E) \simeq \Big(\frac{1}{\varepsilon}\Big)^{\Delta},$$

assuming this number exists. More generally, we define, for all bounded $E$, the number:

$$\Delta(E) = \limsup_{\varepsilon \to 0} \frac{\log N_\varepsilon(E)}{|\log \varepsilon|} \tag{1.1}$$

A lower limit may also be used:

$$\delta(E) = \liminf_{\varepsilon \to 0} \frac{\log N_\varepsilon(E)}{|\log \varepsilon|} \tag{1.2}$$
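As a minimal sketch of definition (1.1), the following Python code counts the grid intervals of side $\varepsilon = 3^{-k}$ that meet a finite approximation of the triadic Cantor set, and evaluates the ratio $\log N_\varepsilon / |\log \varepsilon|$. The choice of test set, the restriction to scales $3^{-k}$ and the function names (`cantor_points`, `box_count`, `box_dimension_estimates`) are assumptions made for illustration; they are not taken from the FracLab toolbox. Points are stored as integers so that no rounding occurs.

```python
import math


def cantor_points(level):
    """Left endpoints of the 2**level intervals of the triadic Cantor construction.

    Each point is stored as an integer p, standing for the real number p / 3**level.
    """
    points = [0]
    for _ in range(level):
        # Each interval is cut in three; only the first and last thirds are kept.
        points = [3 * p for p in points] + [3 * p + 2 for p in points]
    return points


def box_count(points, level, k):
    """Number of grid intervals of side 3**-k containing at least one point."""
    cell = 3 ** (level - k)  # grid step expressed in units of 3**-level
    return len({p // cell for p in points})


def box_dimension_estimates(level):
    """Pairs (k, log N_eps / |log eps|) for eps = 3**-k, k = 1..level."""
    points = cantor_points(level)
    return [
        (k, math.log(box_count(points, level, k)) / (k * math.log(3)))
        for k in range(1, level + 1)
    ]
```

For this particular set, $N_\varepsilon = 2^k$ at every scale $3^{-k}$, so each ratio equals $\log 2 / \log 3$. For measured data the ratio generally varies with the scale, and only a limited range of resolutions is available.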

Note that some authors refer to the box dimension only when both indices coincide, that is, when the limit exists. Both indices $\Delta$ and $\delta$ are dimensions in the sense previously defined. However, $\Delta$ is stable, contrary to $\delta$, so that $\Delta$ is more commonly used. Let us mention an important property: if $\bar{E}$ denotes the closure of $E$ (the set of all limit points of sequences in $E$), then:

$$\Delta(\bar{E}) = \Delta(E)$$

This property shows that $\Delta$ is not sensitive to the topological type of $E$. It only characterizes the density of a set. For example, the (countable) set of the rational numbers of the interval $[0, 1]$ has dimension one, which is the dimension of the interval itself. Even discrete sequences may have non-zero dimension: let, for instance, $E$ be the set of numbers $n^{-\alpha}$ with $\alpha > 0$. Then $\Delta(E) = 1/(\alpha + 1)$.

Equivalent definitions

It is not mandatory to use cubes to calculate $\Delta$. The original definition of Bouligand is as follows:



– in $\mathbb{R}^n$, let us consider the Minkowski sausage:

$$E(\varepsilon) = \bigcup_{x \in E} B_\varepsilon(x)$$

which is the union of all the balls of radius $\varepsilon$ centered on $E$. Denote its volume by $\mathrm{Vol}_n(E(\varepsilon))$. This volume is approximately of the order of $N_\varepsilon(E)\,\varepsilon^n$. This allows us to give the equivalent definition:

$$\Delta(E) = \limsup_{\varepsilon \to 0} \Big( n - \frac{\log \mathrm{Vol}_n(E(\varepsilon))}{\log \varepsilon} \Big); \tag{1.3}$$

– we may also define $N'_\varepsilon(E)$, which is the smallest number of balls of radius $\varepsilon$ covering $E$; or $N''_\varepsilon(E)$, the largest number of disjoint balls of radius $\varepsilon$ centered on $E$. Replacing $N_\varepsilon(E)$ by any of these values in equation (1.1) still gives $\Delta(E)$.

Discrete values of ε

In these definitions, the variable $\varepsilon$ is continuous. The results remain the same if we use a discrete sequence such as $\varepsilon_n = 2^{-n}$. More generally we may replace $\varepsilon$ with any sequence $(\varepsilon_n)$ which does not converge too quickly towards 0. More precisely, we require that:

$$\lim_{n \to \infty} \frac{\log \varepsilon_n}{\log \varepsilon_{n+1}} = 1.$$

This remark is important, as it allows us to perform numerical estimations of $\Delta$. Let us now give some well-known examples of calculating dimensions.

EXAMPLE 1.1.– Let $(a_n)$ be a sequence of real numbers such that $0 < 2a_{n+1} < a_n < a_0 = 1$. Let $E_0 = [0, 1]$. We construct by induction a sequence of sets $(E_n)$ such that $E_n$ is made of $2^n$ closed disjoint intervals of length $a_n$, each containing exactly two intervals of $E_{n+1}$. The sets $E_n$ are nested, and the sequence $(E_n)$ converges to a compact set $E$ such that:

$$E = \bigcap_n E_n.$$

Let us consider a particular case. When all the interval extremities of $E_n$ are also interval extremities of $E_{n+1}$, $E$ is called a perfect symmetric set [KAH 63] or sometimes, more loosely, a Cantor set. Assume that the ratio $\log a_n / \log a_{n+1}$ tends to 1. According to the previous comment on discrete sequences, we obtain the following values:

$$\delta(E) = \liminf_{n \to \infty} \frac{n \log 2}{|\log a_n|}, \qquad \Delta(E) = \limsup_{n \to \infty} \frac{n \log 2}{|\log a_n|}.$$



However, these results are true for any sequence $(a_n)$. Even more specifically, consider the case where $a_n = a^n$, with $0 < a < \frac{1}{2}$. The ratios $a_n / a_{n+1}$ are then constant and dimensions take the common value $\log 2 / |\log a|$. This is the case of the self-similar set which satisfies the following relation:

$$E = f_1(E) \cup f_2(E)$$

with $f_1(x) = a x$ and $f_2(x) = a x + 1 - a$. This set is the attractor of the iterated function system $\{f_1, f_2\}$ (see Chapters 9 and 10). It is also called a perfect symmetric set with constant ratio.

EXAMPLE 1.2.– We construct a planar self-similar curve with extremities $A$ and $B$, $A \neq B$, as follows: take $N + 1$ distinct points $A_1 = A, A_2, \ldots, A_{N+1} = B$, such that $\mathrm{dist}(A_i, A_{i+1}) < \mathrm{dist}(A, B)$. For each $i = 1, \ldots, N$, define a similarity $f_i$ (that is, a composition of a homothety, an orthogonal transformation and a translation), such that $f_i(AB) = A_i A_{i+1}$. The ratio of $f_i$ is $a_i = \mathrm{dist}(A_i, A_{i+1}) / \mathrm{dist}(A, B)$. Starting from the segment $\Gamma_0 = AB$, define by induction the polygonal curves $\Gamma_n = \cup_i f_i(\Gamma_{n-1})$. This sequence $(\Gamma_n)$ converges to a curve $\Gamma$ which satisfies the following relation:

$$\Gamma = \bigcup_i f_i(\Gamma).$$

In other words, $\Gamma$ is the attractor of the IFS $\{f_1, \ldots, f_N\}$. When $\Gamma$ is simple, the dimensions $\delta$ and $\Delta$ assume a common value, which is also the similarity dimension, i.e. the unique solution $x$ of the equation

$$\sum_{i=1}^{N} a_i^x = 1.$$

In the particular case where all distances $\mathrm{dist}(A_i, A_{i+1})$ are the same, the ratios $a_i$ are equal to a value $a$ such that $N a > 1$ (necessary condition for the continuity of $\Gamma$) and $N a^2 < 1$ (necessary condition for the simplicity of $\Gamma$). Clearly, $\delta(\Gamma) = \Delta(\Gamma) = \log N / |\log a|$.
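When the ratios $a_i$ differ, the equation $\sum_i a_i^x = 1$ has no closed-form solution in general, but its left-hand side is strictly decreasing in $x$, so a bisection is enough. The sketch below is one possible way to do this; the function name `similarity_dimension` and the fixed number of bisection steps are illustrative choices, not part of the original text.

```python
def similarity_dimension(ratios, steps=100):
    """Solve sum(a ** x for a in ratios) == 1 for x by bisection.

    Every ratio must lie strictly between 0 and 1, which makes the left-hand
    side strictly decreasing in x: it equals len(ratios) at x = 0 and tends
    to 0 as x grows.
    """
    if not ratios or any(not (0.0 < a < 1.0) for a in ratios):
        raise ValueError("ratios must be a non-empty sequence of numbers in (0, 1)")

    def excess(x):
        return sum(a ** x for a in ratios) - 1.0

    lo, hi = 0.0, 1.0
    while excess(hi) > 0.0:  # enlarge the bracket until the root is inside
        hi *= 2.0
    for _ in range(steps):   # halve the bracket a fixed number of times
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With four similarities of common ratio 1/3, as for the von Koch curve, `similarity_dimension([1/3] * 4)` agrees with $\log 4 / \log 3$, which is the value $\log N / |\log a|$ given above.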

Figure 1.1. Von Koch curve, the attractor of a system of four similarities with common ratio 1/3


Function scales

The previous definitions all involve ratios of logarithms. This is an immediate consequence of the fact that a dimension is defined as an order of growth related to the scale of functions $\{t^\alpha, \alpha > 0\}$. In general, a scale of functions $F$ in the neighborhood of 0 is a family of functions which are all comparable in the Hardy sense, that is, for any $f$ and $g$ in $F$, the ratio $f(x)/g(x)$ tends to a limit (possibly $+\infty$ or $-\infty$) when $x$ tends to 0. Function scales are defined in a similar way in the neighborhood of $+\infty$. Scales other than $\{t^\alpha\}$ will yield other types of dimensions. A dimension must be considered as a Dedekind cut in a given scale of functions. The following expressions will make this clearer:

$$\Delta(E) = \inf\{\alpha \text{ such that } \varepsilon^\alpha N_\varepsilon(E) \to 0\} \tag{1.4}$$

$$\delta(E) = \sup\{\alpha \text{ such that } \varepsilon^\alpha N_\varepsilon(E) \to +\infty\} \tag{1.5}$$

These are equivalent to equations (1.1) and (1.2) (see [TRI 99]).

Complementary intervals on the line

In the particular case where the compact $E$ lies in an interval $J$ of the line, the complementary set of $E$ in $J$ is a union of disjoint open intervals, whose lengths will be denoted by $c_n$. Let $|E|$ be the Lebesgue measure of $E$ (which means, for an interval, its length). The dimension of $E$ may be written as:

$$\Delta(E) = \limsup_{\varepsilon \to 0} \Big( 1 - \frac{\log |E(\varepsilon)|}{\log \varepsilon} \Big)$$

If $|E| = 0$, the sum of the $c_n$ is equal to the length of $J$. The dimension is then equal to the convergence exponent of the series $\sum c_n$:

$$\Delta(E) = \inf\Big\{\alpha \text{ such that } \sum_n c_n^\alpha < +\infty\Big\} \tag{1.6}$$

Proof. This result may be obtained by calculating an approximation of the length of the Minkowski sausage $E(\varepsilon)$. Let us assume that the complementary intervals are ranked in decreasing lengths:

$$c_1 \ge c_2 \ge \cdots \ge c_n \ge \cdots$$

If $|E| = 0$ and if $c_n \ge \varepsilon > c_{n+1}$, then:

$$|E(\varepsilon)| \simeq n\varepsilon + \sum_{i > n} c_i$$

thus

$$\varepsilon^{\alpha - 1} |E(\varepsilon)| \simeq n\varepsilon^\alpha + \varepsilon^{\alpha - 1} \sum_{i > n} c_i.$$

It may be shown that both values

$$\inf\{\alpha \text{ such that } n\varepsilon^\alpha < +\infty\} \quad \text{and} \quad \inf\Big\{\alpha \text{ such that } \varepsilon^{\alpha - 1} \sum_{i > n} c_i < +\infty\Big\}$$

(where $n$ depends on $\varepsilon$ through $c_n \ge \varepsilon > c_{n+1}$, and the quantities are required to remain bounded as $\varepsilon \to 0$) are equal to the convergence exponent. It is therefore equal to $\Delta(E)$.

EXERCISE 1.1.– Verify formula (1.6) for the perfect symmetric sets of Example 1.1.

If $|E| \neq 0$, then the convergence exponent of $\sum c_n$ still makes sense. It characterizes a degree of proximity of the exterior with the set $E$. More precisely, we obtain

$$\inf\Big\{\alpha \text{ such that } \sum_n c_n^\alpha < +\infty\Big\} = \limsup_{\varepsilon \to 0} \Big( 1 - \frac{\log |E(\varepsilon) - E|}{\log \varepsilon} \Big) \tag{1.7}$$

where the set $E(\varepsilon) - E$ refers to the Minkowski sausage of $E$ deprived of the points of $E$.

How can we generalize the study of the complementary set in $\mathbb{R}^n$ with $n \ge 2$? The open intervals must be replaced with an appropriate paving. The results connecting the elements of this paving to the dimension depend both on the geometry of the tiles and on their respective positions. The topology of the complementary set must be investigated more deeply [TRI 87]. The index that generalizes (1.7) (replacing the 1 of the space dimension by $n$) is the fat fractal exponent, studied in [GRE 85, TRI 86b]. In the case of a zero area curve in $\mathbb{R}^2$, this also leads to the notion of lateral dimension. Note that the dimensions corresponding to each side of the curve are not necessarily equal [TRI 99].

1.2.2. Packing dimension

The packing dimension is, to some extent, a regularization of the box dimension [TRI 82]. Indeed, $\Delta$ is not σ-stable, but we may derive a σ-stable dimension from any index thanks to the operation described below.

PROPOSITION 1.1.– Let $\mathcal{B}$ be the family of all bounded sets of $\mathbb{R}^n$ and $\alpha : \mathcal{B} \to \mathbb{R}^+$. Then, the function $\hat{\alpha}$ defined for any subset $E$ of $\mathbb{R}^n$ as:

$$\hat{\alpha}(E) = \inf\Big\{\sup_i \alpha(E_i) \ \text{such that}\ E = \bigcup_i E_i,\ E_i \in \mathcal{B}\Big\}$$

is monotone and σ-stable.



Proof. Any subset $E$ of $\mathbb{R}^n$ is a union of bounded sets. If $E_1 \subset E_2$, then any covering of $E_1$ may be completed into a covering of $E_2$. This entails monotonicity. Now, let $\varepsilon > 0$ and let $(E_k)_{k \ge 1}$ be a sequence of sets whose union is $E$. For any $k$, there exists a decomposition $(E_{i,k})_i$ of $E_k$ such that $\sup_i \alpha(E_{i,k}) \le \hat{\alpha}(E_k) + \varepsilon 2^{-k}$. Since $E = \cup_{i,k} E_{i,k}$, we deduce that:

$$\hat{\alpha}(E) \le \sup_k \hat{\alpha}(E_k) + \varepsilon \sum_k 2^{-k} = \sup_k \hat{\alpha}(E_k) + \varepsilon$$

Thus, the inequality $\hat{\alpha}(E) \le \sup_k \hat{\alpha}(E_k)$ holds. The converse inequality stems from monotonicity.

The packing dimension is the result of this operation on $\Delta$. We set

$$\mathrm{Dim} = \hat{\Delta}$$

The term packing will be explained later. The new index Dim is indeed a dimension, and it is σ-stable. Therefore, contrary to $\Delta$, it vanishes for countable sets. The inequality:

$$\mathrm{Dim}(E) \le \Delta(E)$$

is true for any bounded set. This becomes an equality when $E$ presents a homogeneous structure in the following sense:

THEOREM 1.1.– Let $E$ be a compact set such that, for all open sets $U$ intersecting $E$, $\Delta(E \cap U) = \Delta(E)$. Then $\Delta(E) = \mathrm{Dim}(E)$.

Proof. Let $(E_i)$ be a decomposition of $E$. Since $E$ is compact, a Baire theorem entails that the $E_i$ are not all nowhere dense in $E$. Therefore, there exist an index $i_0$ and an open set $U$ intersecting $E$ such that $\overline{E_{i_0}} \cap U = \bar{E} \cap U$, which yields:

$$\Delta(E_{i_0}) = \Delta(\overline{E_{i_0}}) \ge \Delta(\overline{E_{i_0}} \cap U) = \Delta(\bar{E} \cap U) \ge \Delta(E \cap U) = \Delta(E)$$

As a result, $\Delta(E) \le \sup_i \Delta(E_i)$, and thus $\Delta(E) \le \mathrm{Dim}(E)$. The converse inequality is always true.

EXAMPLE 1.3.– All self-similar sets are of this type, including those presented above: Cantor sets and curves. For these sets, the packing dimension has the same value as $\Delta(E)$.



EXAMPLE 1.4.– Dense sets in $[0, 1]$, when they are not compact, do not necessarily have a packing dimension equal to 1. Let us consider, for any real $p$, $0 < p < 1$, the set $E_p$ of $p$-normal numbers, that is, those numbers whose frequency of zeros in their dyadic expansion is equal to $p$. Any dyadic interval of $[0, 1]$, however small it may be, contains points of $E_p$, so $E_p$ is dense in $[0, 1]$. As a consequence, $\Delta(E_p) = 1$. In contrast, the value of $\mathrm{Dim}(E_p)$ is:

$$\mathrm{Dim}(E_p) = -\frac{1}{\log 2} \big[ p \log p + (1 - p) \log(1 - p) \big].$$

This result will be derived in section 1.3.2.

1.2.3. Covering dimension

The covering dimension was introduced by Hausdorff [HAU 19]. Here we adopt the traditional approach through Hausdorff measures; a direct approach, using Vitali’s covering convergence exponent, may be used to calculate the dimension without using measures [TRI 99].

Covering measures

Originally, the covering measures were defined to generalize and, most of all, to precisely define the concepts of length, surface, volume, etc. They constitute an important tool in geometric measure theory. Firstly, let us consider a determining function $\phi : \mathbb{R}^+ \to \mathbb{R}^+$, which is increasing and continuous in the neighborhood of 0, and such that $\phi(0) = 0$. Let $E$ be a set in a metric space (that is, a space where a distance has been defined). For every $\varepsilon > 0$, we consider all the coverings of $E$ by bounded sets $U_i$ of diameter $\mathrm{diam}(U_i) \le \varepsilon$. Let

$$H^\phi_\varepsilon(E) = \inf\Big\{\sum_i \phi\big(\mathrm{diam}(U_i)\big) \ \text{such that}\ E \subset \bigcup_i U_i,\ \mathrm{diam}(U_i) \le \varepsilon\Big\}.$$

When $\varepsilon$ tends to 0, this quantity (possibly infinite) cannot decrease. The limit corresponds to the $\phi$-Hausdorff measure:

$$H^\phi(E) = \lim_{\varepsilon \to 0} H^\phi_\varepsilon(E)$$

In this definition, the covering sets $U_i$ can be taken in a more restricted family. If we suppose that the $U_i$ are open, or convex, the result remains unchanged. The main properties are those of any Borel measure:
– $E_1 \subset E_2 \Longrightarrow H^\phi(E_1) \le H^\phi(E_2)$;
– if $(E_i)$ is a countable collection of sets, then $H^\phi(\cup_i E_i) \le \sum_i H^\phi(E_i)$;
– if $E_1$ and $E_2$ are at non-zero distance from each other, any $\varepsilon$-covering of $E_1$ is disjoint from any $\varepsilon$-covering of $E_2$ when $\varepsilon$ is sufficiently small. Then $H^\phi(E_1 \cup E_2) = H^\phi(E_1) + H^\phi(E_2)$.

This implies that $H^\phi$ is a metric measure. The Borel sets are $H^\phi$-measurable and for any countable collection $(E_i)$ of disjoint Borel sets, $H^\phi(\cup_i E_i) = \sum_i H^\phi(E_i)$.

The scale of functions $t^\alpha$

In the case where $\phi(t) = t^\alpha$ with $\alpha > 0$, we use the simple notation $H^\phi = H^\alpha$. Consider the case $\alpha = 1$. For any curve $\Gamma$ the value $H^1(\Gamma)$ is equal to the length of $\Gamma$. Therefore $H^1$ is a generalization of the concept of length: it may be applied to any subset of the metric space. Now let $\alpha = 2$. For any plane surface $S$, the value of $H^2(S)$ is proportional to the area of $S$. For non-plane surfaces, $H^2$ provides an appropriate mathematical definition of area – using a triangulation of $S$ is not acceptable from a theoretical point of view. More generally, when $\alpha$ is an integer, $H^\alpha$ is proportional to the $\alpha$-dimensional volume. However, $\alpha$ can also take non-integer values, which makes it possible to define the dimension of any set. The use of the term dimension is justified by the following property: if $aE$ is the image of $E$ by a homothety of ratio $a$, then

$$H^\alpha(aE) = a^\alpha H^\alpha(E)$$

Measures estimated using boxes

If we want to restrict the class of sets from which coverings are taken even more, one option would be to cover $E$ with centered balls or dyadic boxes. In each case, the result is a measure $H^{*\alpha}$ which is generally not equal to $H^\alpha$; nevertheless, it is an equivalent measure in the sense that we can find two non-zero constants, $c_1$ and $c_2$, such that for any $E$:

$$c_1 H^\alpha(E) \le H^{*\alpha}(E) \le c_2 H^\alpha(E)$$

Clearly the $H^{*\alpha}$ measures give rise to the same dimension.



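Dimension

For every $E$, there exists a unique critical value $\alpha$ such that:
– $H^\alpha(E) > 0$ and $\beta < \alpha \Longrightarrow H^\beta(E) = +\infty$;
– $H^\alpha(E) < +\infty$ and $\beta > \alpha \Longrightarrow H^\beta(E) = 0$.

The dimension is defined as

$$\dim(E) = \inf\{\alpha \text{ such that } H^\alpha(E) = 0\} = \sup\{\alpha \text{ such that } H^\alpha(E) = +\infty\} \tag{1.8}$$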

NOTE 1.1.– This approach is not very different from the one which leads to the box dimension. Compare equation (1.8) with equations (1.4) and (1.5). Once again, it may be generalized by using other function scales than $t^\alpha$.

Properties

The properties of dim directly stem from those of $H^\alpha$ measures. It is a σ-stable dimension, like Dim. To compare all these dimensions, let us observe that $\delta$ can be defined in the same manner as the covering dimension, by using coverings made up of sets of equal diameter. This implies the inequality $\dim(E) \le \delta(E)$ for any $E$. The σ-stability property then implies the following:

$$\dim(E) \le \hat{\delta}(E) \le \hat{\Delta}(E) = \mathrm{Dim}(E) \le \Delta(E).$$

These inequalities may be strict. However, the equality $\dim(E) = \mathrm{Dim}(E)$, and even $\dim(E) = \Delta(E)$, occur in cases where $E$ is sufficiently regular. Examples include rectifiable curves and self-similar sets.

Packing measures

By considering packings of $E$, that is, families of disjoint sets at zero distance from $E$, and by switching inf and sup in the definitions, it is possible to define packing measures which are symmetric to the covering measures and whose critical index is precisely equal to Dim. This explains why Dim is called a packing dimension.

1.2.4. Methods for calculating dimensions

Since a dimension is an index of irregularity, it may be used for the classification of irregular sets and for characterizing phenomena showing erratic behaviors. Here we focus on signals. In practice, we may assume that the signal $\Gamma$ is given in axes $Oxy$ by discrete data $(x_i, f_i)$, with $x_i < x_{i+1}$: for example, a time series. Notice that other sets, such as geographical curves obtained by aerial photography, are not of this type, so that the analysis tools can be different.



The algorithms used for estimating a dimension rely on the theoretical formula which defines the dimension, with the usual limitation of a minimal scale, roughly equal to the distances $x_{i+1} - x_i$. Indeed, it is impossible to go further into the structure of the dataset without a reconstruction at very small scales, which amounts to adding arbitrary new data. The evaluation of a dimension whose theoretical definition requires the finest possible coverings is therefore difficult to justify. This is why we do not propose algorithms for the estimation of σ-stable dimensions such as the Hausdorff or packing dimensions.

Calculations may be performed to estimate $\Delta$ or related dimensions that are naturally adapted to signal analysis. They are usually carried out using logarithmic diagrams. A quantity $Q(f, \varepsilon)$, which characterizes the irregularity, is estimated for a certain number of values of the resolution $\varepsilon$ between two limit values $\varepsilon_{\min}$ and $\varepsilon_{\max}$. If $Q(f, \varepsilon)$ follows a power law of the type $c\,\varepsilon^{\Delta}$, then $\log Q(f, \varepsilon)$ is an affine function of $\log \varepsilon$, with slope $\Delta$. The idea is to seek functions $Q(f, \varepsilon)$ which provide appropriate logarithmic diagrams, i.e. which allow us to estimate the slopes precisely. Here are some examples.

Boxes and disks

After counting the number $N_\varepsilon$ of squares of a grid of side $\varepsilon$ that meet $\Gamma$, we draw the diagram $(|\log \varepsilon|, \log N_\varepsilon)$. Although it is very easy to program, the method presents an obvious disadvantage: the quantities $N_\varepsilon$ are integers, which makes the diagram chaotic for large values of $\varepsilon$. We could try a method using the Minkowski sausage of $\Gamma$:

$$\left(|\log \varepsilon|,\ \log\left(\frac{1}{\varepsilon^2}\,\mathrm{Area}\,\Gamma(\varepsilon)\right)\right)$$

However, this method is more difficult to program than the previous one and also lacks precision: the diagram shows, in general, a strong concavity, even for a curve as simple as a straight line segment. These methods are very popular, in spite of numerical difficulties.

Unfortunately there is a major drawback for signals. The coordinates $(x_i, f_i)$ are not, in general, of the same nature. If they refer, for example, to stock exchange data, $x_i$ is a time value and $f_i$ an exchange value. In this case, it makes no sense to give the units the same value on the axes $Ox$ and $Oy$. The covering of $\Gamma$ by squares or disks is therefore meaningless. It is preferable to use algorithms which provide a slope independent of changes of unit. For this purpose, the calculated quantity $Q(f, \varepsilon)$ should satisfy the following property: for any real $a$, there exists $c(a)$ such that, for any $\varepsilon$:

$$Q(af, \varepsilon) = c(a)\, Q(f, \varepsilon) \tag{1.9}$$

as is the case for the methods which are described below.
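As an illustration of the logarithmic diagram, here is a minimal Python sketch of box counting on sampled data. The function names `box_count` and `loglog_slope`, the test curve (the segment y = x) and the chosen grid sides are assumptions made for this example only; they do not come from the text.

```python
import numpy as np

def box_count(x, y, eps):
    """Number of squares of a grid of side eps containing at least one sample (x_i, y_i)."""
    cells = np.stack([np.floor(x / eps), np.floor(y / eps)], axis=1)
    return len(np.unique(cells, axis=0))

def loglog_slope(eps, q):
    """Least-squares slope of log q against |log eps| (eps < 1 is assumed)."""
    slope, _ = np.polyfit(np.abs(np.log(eps)), np.log(q), 1)
    return slope

# Hypothetical test curve: the straight segment y = x on [0, 1].
x = np.linspace(0.0, 1.0, 10001)
eps = 2.0 ** -np.arange(2, 7)                 # grid sides 1/4, 1/8, ..., 1/64
counts = np.array([box_count(x, x, e) for e in eps])
segment_dim = loglog_slope(eps, counts)       # slope of the diagram (|log eps|, log N_eps)
```

On this segment the counts are small integers of the form 2^k + 1, so the fitted slope falls slightly below the exact value 1. This is one concrete instance of the lack of precision at coarse resolutions mentioned above.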



Variation method

Here we anticipate section 1.3.4. The oscillation of a function $f$ on any set $F$ is defined as

$$\beta(f, F) = \sup\{f(t) - f(t') : t, t' \in F\}$$

The ε-variation on an interval $J$ is the arithmetic mean of the oscillation over intervals of length $2\varepsilon$:

$$\mathrm{var}_\varepsilon(f) = \frac{1}{|J|} \int_J \beta\big(f, [t - \varepsilon, t + \varepsilon] \cap J\big)\, dt$$

The variation method [DUB 89, TRI 88] consists of finding the slope of the diagram:

$$\left(|\log \varepsilon|,\ \log\left(\frac{1}{\varepsilon^2}\,\mathrm{var}_\varepsilon(f)\right)\right).$$

Since $\beta(af, [x - \varepsilon, x + \varepsilon]) = |a|\,\beta(f, [x - \varepsilon, x + \varepsilon])$ for all $x$ and $\varepsilon$, we obtain through integration $\mathrm{var}_\varepsilon(af) = |a|\,\mathrm{var}_\varepsilon(f)$, so equation (1.9) is satisfied. In this case we obtain diagrams which present an almost perfect alignment for a large class of known signals. Furthermore, this method is easy to program and fast to execute.

$L^p$ norms method

Let us define the $L^p$ norm of a function $f : D \subset \mathbb{R}^n \to \mathbb{R}$ by the relation:

$$L_p(f) = \left( \frac{1}{\mathrm{Vol}_n(D)} \int_D |f(x)|^p\, dx \right)^{1/p}.$$

It is a functional norm when $p \ge 1$. When $p \to +\infty$, the expression $L_p(f)$ tends to the norm:

$$L_\infty(f) = \sup_{x \in D} |f(x)|$$
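The variation method described above can be sketched as follows for a uniformly sampled signal. This is a minimal illustration under stated assumptions, not the implementation of [DUB 89, TRI 88]: the names `eps_variation` and `variation_dimension`, the choice of window half-widths and the test signal (a linear ramp) are all hypothetical.

```python
import numpy as np

def eps_variation(f, k):
    """Mean oscillation of the samples f over windows of half-width k samples."""
    n = len(f)
    osc = np.empty(n)
    for i in range(n):
        window = f[max(0, i - k): min(n, i + k + 1)]   # [t - eps, t + eps] intersected with J
        osc[i] = window.max() - window.min()
    return osc.mean()

def variation_dimension(f, dx, half_widths):
    """Slope of the diagram (|log eps|, log(var_eps(f) / eps^2)), with eps = k * dx < 1."""
    eps = dx * np.asarray(half_widths, dtype=float)
    var = np.array([eps_variation(f, k) for k in half_widths])
    slope, _ = np.polyfit(np.abs(np.log(eps)), np.log(var / eps ** 2), 1)
    return slope

# Hypothetical check on a smooth signal: a linear ramp should give a dimension close to 1.
x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]
half_widths = [2, 4, 8, 16, 32]
ramp_dim = variation_dimension(x, dx, half_widths)
scaled_dim = variation_dimension(5.0 * x, dx, half_widths)   # change of unit on the Oy axis
```

Rescaling the signal only shifts the diagram vertically, so `ramp_dim` and `scaled_dim` agree up to rounding, which is the invariance expressed by equation (1.9).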

Given a signal $f$ defined on $[a, b]$ and the values $x \in [a, b]$ and $\varepsilon > 0$, we apply these norms at any $x$ to the local difference $f(x) - f(x')$, where $|x - x'| \le \varepsilon$. Using the norm $L_\infty$, this gives $\sup_{x' \in [x - \varepsilon, x + \varepsilon]} |f(x) - f(x')|$. This quantity is equivalent to the ε-oscillation, since

$$\frac{1}{2}\,\beta\big(f, [x - \varepsilon, x + \varepsilon]\big) \le \sup_{x' \in [x - \varepsilon, x + \varepsilon]} |f(x) - f(x')| \le \beta\big(f, [x - \varepsilon, x + \varepsilon]\big)$$



It is therefore possible to replace the ε-variation of $f$ by the integral over $J$ of $\sup_{x' \in [x - \varepsilon, x + \varepsilon]} |f(x) - f(x')|$, without altering the theoretical result for $\Delta$. However, it is also possible to use $L^p$ norms. Indeed, the oscillation (or the local norm $L_\infty$) only takes into account the peaks of the function. In practice, these peaks may be measured with a significant error, or even destroyed during data acquisition (profiles of rough surfaces, for example). It is then preferable to use all the intermediate values and to replace the ε-variation with the quantity:

$$\int_J \left( \frac{1}{2\varepsilon} \int_{x - \varepsilon}^{x + \varepsilon} |f(x) - f(x')|^p\, dx' \right)^{1/p} dx$$

In this expression, large values of $p$ emphasize the effect of local peaks, whereas if $p = 1$, all the values of the function $f$ have equal importance. These integrals make it possible to rectify the corresponding logarithmic diagram and to calculate the slope with precision. We can also replace the above integral over $J$ by a norm $L_q$, with $q > 1$. If $q$ is large, this gives more weight to the more irregular parts of the signal. We can also change the integral over the window $[x - \varepsilon, x + \varepsilon]$ into a convolution product with a kernel of type $K(x'/\varepsilon)$, so that the results are even smoother.

However, it should be noted that, except for particular cases (Weierstrass functions, for example), we do not exactly calculate the dimension $\Delta$ with these methods, but rather an index smaller than $\Delta$ [TRI 99], which nevertheless remains relevant to the signal irregularity.

Let us develop an example of the index just referred to. Let $K$ be a kernel belonging to the Schwartz class, with integral equal to 1. Let $K_a(t) = \frac{1}{a} K\!\left(\frac{t}{a}\right)$ for $a > 0$. For a function $f$ defined on a compact set, let $f_a$ be the convolution of $f$ with $K_a$. Since $f_a$ is regular, the length $\Lambda_a$ of its graph is finite. We define the regularization dimension $\dim_R(f)$ of the graph of $f$ as:

$$\dim_R(f) = 1 + \lim_{a \to 0} \frac{\log \Lambda_a}{-\log a}. \tag{1.10}$$

This dimension measures the speed at which the length of less and less regularized versions of the graph of $f$ tends to infinity. It is easily proved that if $f$ is continuous, the inequality $\dim_R \le \Delta$ always holds. An interesting aspect of the regularization dimension is that it is well suited to estimation. Results obtained on usual signals (Weierstrass function, iterated function systems and fractional Brownian motion) are generally satisfactory, even for small samples (a few hundred points). Moreover, the simple analytical form of $\dim_R$ allows us to easily obtain an estimator for data corrupted by additive noise, which is particularly useful in signal processing (see [ROU 98] and the FracLab manual for more details).
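A rough numerical reading of equation (1.10) is sketched below. It is only an illustration under stated assumptions and not the FracLab estimator: a Gaussian kernel is used for $K$, the limit is replaced by a least-squares slope over a few scales, the signal is extended by reflection at the boundaries, and the names `smoothed_graph_length` and `regularization_dimension` are hypothetical.

```python
import numpy as np

def smoothed_graph_length(f, dx, a):
    """Length of the graph of f convolved with a Gaussian kernel of width a."""
    m = int(np.ceil(4 * a / dx))                  # kernel half-width in samples
    t = np.arange(-m, m + 1) * dx
    kernel = np.exp(-0.5 * (t / a) ** 2)
    kernel /= kernel.sum()                        # discrete analogue of "integral equal to 1"
    padded = np.pad(f, m, mode="reflect")         # assumption: reflect the signal at both ends
    fa = np.convolve(padded, kernel, mode="valid")
    return np.hypot(dx, np.diff(fa)).sum()

def regularization_dimension(f, dx, scales):
    """1 + slope of log(Lambda_a) against -log(a), a finite-scale stand-in for (1.10)."""
    lengths = np.array([smoothed_graph_length(f, dx, a) for a in scales])
    slope, _ = np.polyfit(-np.log(scales), np.log(lengths), 1)
    return 1.0 + slope

# Hypothetical test signals: a smooth cosine and a simulated random walk.
x = np.linspace(0.0, 4.0, 8001)
dx = x[1] - x[0]
scales = np.array([0.005, 0.01, 0.02, 0.04])
rng = np.random.default_rng(0)
walk = np.cumsum(rng.normal(0.0, np.sqrt(dx), size=x.size))
dim_smooth = regularization_dimension(np.cos(2 * np.pi * x), dx, scales)
dim_rough = regularization_dimension(walk, dx, scales)
```

For the smooth signal the length barely changes with $a$, so the estimate stays near 1, whereas the length of the smoothed random walk keeps growing as $a$ decreases and the estimate is clearly larger.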



1.3. Hölder exponents

1.3.1. Hölder exponents related to a measure

The dimensional analysis of a set is related to its local properties. To go further into this study, it is convenient to use a measure $\mu$ supported by the set. In many cases (self-similar sets, for example), $E$ is defined at the same time as $\mu$. If $E$ is a curve, constructing a measure on $E$ is called a parameterization. Without a parameterization it is impossible to analyze the curve. However, a given set can support very different measures. Particularly interesting ones are the well-balanced measures, in a sense we will explain.

Given a measure $\mu$ on $\mathbb{R}^n$, let us first define the Hölder exponent of $\mu$ over any measurable set $F$ by

$$\alpha_\mu(F) = \frac{\log \mu(F)}{\log \mathrm{diam}(F)}$$

By convention, 0/0 = 1 and 1/0 = +∞. Given a set $E$, we use this notion in a local manner, i.e. on arbitrarily small intervals or cubes intersecting $E$. A pointwise Hölder exponent is then defined using centered balls $B_\varepsilon(x)$ whose radius tends to 0:

$$\underline{\alpha}_\mu(x) = \liminf_{\varepsilon \to 0} \alpha_\mu\big(B_\varepsilon(x)\big)$$

The symmetric exponent can also be useful:

$$\overline{\alpha}_\mu(x) = \limsup_{\varepsilon \to 0} \alpha_\mu\big(B_\varepsilon(x)\big)$$

In addition, the geometric context sometimes induces a specific analysis framework. If a measure is defined by its value on the dyadic cubes, it will be easier to use the following Hölder exponents:

$$\alpha_\mu^{*}(x) = \limsup_{n \to +\infty} \alpha_\mu\big(u_n(x)\big), \qquad \alpha_{*\mu}(x) = \liminf_{n \to +\infty} \alpha_\mu\big(u_n(x)\big)$$

where $u_n(x)$ is the cube of the form $\prod_{i=1}^{N} [k_i 2^{-n}, (k_i + 1) 2^{-n})$ that contains $x$. Other covering nets can obviously be used, but dyadic cubes are well suited for calculations.

1.3.2. Theorems on set dimensions

The first theorem can be used as a basis for understanding the subsequent more technical results.



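THEOREM 1.2.– Let $\mu$ be a finite measure such that $\mu(E) > 0$. Assume that there exists a real $\alpha$ such that, for any $x \in E$:

$$\alpha_\mu^{*}(x) = \alpha_{*\mu}(x) = \alpha$$

Then $\alpha = \dim(E) = \mathrm{Dim}(E)$.

Proof. Let $\varepsilon > 0$. Let $E_n$ be the subset of $E$ consisting of all points $x$ such that, for $k \ge n$:

$$\alpha - \varepsilon \le \alpha_\mu\big(u_k(x)\big) \le \alpha + \varepsilon$$

If $\rho < 2^{-n}$, any ρ-covering $\{u_i\}$ of $E_n$ by dyadic cubes of rank $\ge n$ is such that:

$$|u_i|^{\alpha + \varepsilon} \le \mu(u_i) \le |u_i|^{\alpha - \varepsilon}$$

for all $i$. Therefore $\mu(E_n) \le \sum_i |u_i|^{\alpha - \varepsilon}$ and $\sum_i |u_i|^{\alpha + \varepsilon} \le \|\mu\|$.

First, we deduce that $H^{*(\alpha - \varepsilon)}(E_n) \ge \mu(E_n)$. If $n$ is large enough then $\mu(E_n) > 0$. This gives $\dim(E_n) \ge \alpha - \varepsilon$. Since $E_n \subset E$, then $\dim(E) \ge \alpha - \varepsilon$.

Secondly, by using the covering of $E_n$ formed by dyadic cubes of rank $k \ge n$, we obtain:

$$N_{2^{-k}}(E_n)\, 2^{-k(\alpha + \varepsilon)} \le \|\mu\|$$

Therefore $N_{2^{-k}}(E_n) \le \|\mu\|\, 2^{k(\alpha + \varepsilon)}$, which implies $\Delta(E_n) \le \alpha + \varepsilon$. As a consequence, $\mathrm{Dim}(E) \le \alpha + \varepsilon$ by σ-stability. By making $\varepsilon$ tend to 0, we obtain the desired result.

An analogous theorem can be stated with balls $B_\varepsilon(x)$ centered at $E$. We can develop the arguments of the preceding proof to obtain more general results as follows.

THEOREM 1.3.– Assume that $\mu$ is a finite measure such that $\mu(E) > 0$. Then:

$$\inf_{x \in E} \underline{\alpha}_\mu(x) \le \dim(E) \le \sup_{x \in E} \underline{\alpha}_\mu(x) \tag{1.11}$$

$$\inf_{x \in E} \overline{\alpha}_\mu(x) \le \mathrm{Dim}(E) \le \sup_{x \in E} \overline{\alpha}_\mu(x) \tag{1.12}$$

$$\inf_{x \in E} \overline{\alpha}_\mu(x) \le \Delta(E) \le \limsup_{\varepsilon \to 0}\, \sup_{x \in E} \alpha_\mu\big(B_\varepsilon(x)\big) \tag{1.13}$$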



Inequality (1.13) seems more complex than the others. Nevertheless, we can derive from it a simple result: if $0 < \mu(E) < +\infty$ and if $\alpha(B_\varepsilon(x))$ converges uniformly on $E$ to a number $\alpha$, then $\alpha = \Delta(E)$. The same results hold if, for example, we replace the network of balls centered at $E$ with that of dyadic cubes.

EXAMPLE 1.5.– A perfect symmetric set (see Example 1.1) is the support of a natural or canonical measure: each of the $2^n$ covering intervals of rank $n$ is associated with the weight $2^{-n}$. In the case where the set has constant ratio $a$, the intervals of rank $n$ have length $a^n$ and their Hölder exponent assumes the value $\log 2 / |\log a|$ uniformly. This grid of intervals allows the computation of dimensions. Indeed

$$\dim(E) = \mathrm{Dim}(E) = \Delta(E) = \frac{\log 2}{|\log a|}$$

By making the successive ratios $a_n$ vary, it is also possible to construct a set such that:

$$\dim(E) = \liminf_{n \to \infty} \frac{n \log 2}{|\log a_n|}, \qquad \mathrm{Dim}(E) = \Delta(E) = \limsup_{n \to \infty} \frac{n \log 2}{|\log a_n|}$$

EXAMPLE 1.6.– The set $E_p$ of $p$-normal numbers (see Example 1.4) supports a measure which makes it possible to estimate its dimension and which is known as the Besicovitch measure. It is defined on the dyadic intervals of $[0, 1]$. Set $\mu([0, \tfrac{1}{2}]) = p$ and $\mu([\tfrac{1}{2}, 1]) = 1 - p$, with $p \in (0, 1)$. The weights $p$ and $1 - p$ are then distributed similarly at the following stages: since each dyadic interval $u_n$ of rank $n$ is the union of the intervals $v_n$ (on the left) and $v'_n$ (on the right) of rank $n + 1$, we put

$$\mu(v_n) = p\, \mu(u_n), \qquad \mu(v'_n) = (1 - p)\, \mu(u_n)$$

It is easy to calculate the exact measure of each dyadic interval $u_n(x)$ containing the point $x$ by using the base 2 expansion of $x$. Denote:

– $N_0(x, n)$ = number of 0s in the expansion of $x$ between ranks 1 and $n$;
– $N_1(x, n)$ = number of 1s in the expansion of $x$ between ranks 1 and $n$.

Thus, $N_0(x, n) + N_1(x, n) = n$ and:

$$\mu\big(u_n(x)\big) = p^{N_0(x, n)} (1 - p)^{N_1(x, n)} \tag{1.14}$$



First, let us show that $E_p$ has full measure. The easiest way to proceed is to use the language of probability. Each point $x$ can be viewed as the result of a process which is a sequence of independent Bernoulli random variables $X_i$ taking the values 0 and 1 with probabilities $p$ and $1 - p$. The frequency $N_1(x, n)/n = \left(\sum_{i=1}^{n} X_i\right)/n$ has mean $1 - p$ and variance $p(1 - p)/n$. We may apply here the strong law of large numbers: with probability 1, $N_1(x, n)/n$ tends to $1 - p$ when $n \to +\infty$, and $N_0(x, n)/n$ tends to $p$. Coming back to the language of measure, this result tells us that the set of $x$ for which $N_0(x, n)/n$ tends to $p$ has measure 1. Such $x$ are in $E_p$. Thus $\mu(E_p) = 1$.

Secondly, to compute the dimension, we first need to determine the value of the Hölder exponent. Equation (1.14) implies the following result:

$$\alpha\big(u_n(x)\big) = \frac{1}{n} N_0(x, n)\, \frac{|\log p|}{\log 2} + \frac{1}{n} N_1(x, n)\, \frac{|\log(1 - p)|}{\log 2}$$

for any $x$ of $[0, 1]$. If $x \in E_p$, then $\alpha(u_n(x))$ tends to the value:

$$\alpha_\mu^{*}(x) = \alpha_{*\mu}(x) = p\, \frac{|\log p|}{\log 2} + (1 - p)\, \frac{|\log(1 - p)|}{\log 2} \tag{1.15}$$

Thus, this value is the same for $\dim(E_p)$ and $\mathrm{Dim}(E_p)$ according to equations (1.11) and (1.12). We observe that in equation (1.13), the left-hand side is also equal to this value. Moreover, $\sup_{x \in E_p} \alpha(u_n(x))$ is equal to the largest Hölder exponent of dyadic intervals of rank $n$. If, for example, $p < \tfrac{1}{2}$, this largest exponent is equal to $|\log p| / \log 2$, which is larger than 1. Therefore, the right-hand side of (1.13) is larger than 1. In fact, equation (1.13) gives no indication on the value of $\Delta(E_p)$. An argument of density yields $\Delta(E_p) = 1$ for any value of $p$.
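Equations (1.14) and (1.15) can be checked numerically with a few lines of Python. In this sketch the function names and the sample digit sequence are assumptions made for illustration; the digits are chosen so that the frequency of zeros is exactly $p$.

```python
import math

def besicovitch_measure(digits, p):
    """mu(u_n(x)) computed from the first n binary digits of x, as in equation (1.14)."""
    n0 = digits.count(0)
    n1 = digits.count(1)
    return p ** n0 * (1 - p) ** n1

def holder_exponent(digits, p):
    """alpha(u_n(x)) = log mu(u_n(x)) / log(2^-n), the dyadic interval having length 2^-n."""
    n = len(digits)
    return math.log(besicovitch_measure(digits, p)) / (-n * math.log(2))

def limit_exponent(p):
    """Right-hand side of equation (1.15)."""
    return (p * abs(math.log(p)) + (1 - p) * abs(math.log(1 - p))) / math.log(2)

p = 0.25
digits = [0, 1, 1, 1] * 50                 # hypothetical expansion with a fraction p of zeros
typical = holder_exponent(digits, p)       # agrees with limit_exponent(p)
largest = holder_exponent([0] * 20, p)     # all zeros: |log p| / log 2, larger than 1
```

The second call illustrates the remark on the right-hand side of (1.13): for $p < 1/2$ the interval reached by an all-zero expansion carries the largest exponent of its rank.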

1.3.3. Hölder exponent related to a function

The Hölder exponents of a function give much more information than those of measures. Firstly, let us generalize the notion of measure of a set $F$, by using the notion of oscillation of the function $f$ in $F$:

$$\beta(f, F) = \sup\{f(t) - f(t') : t, t' \in F\}$$

This allows us to define a Hölder exponent:

$$\alpha(F) = \frac{\log \beta(f, F)}{\log \mathrm{diam}(F)}$$



Given an interval $J$ and a function $f : J \to \mathbb{R}$, we may use this notion locally in arbitrarily small neighborhoods of $t \in J$. The pointwise Hölder exponent of $f$ at $t$ is obtained as

$$\alpha_p^f(t) = \liminf_{\varepsilon \to 0} \alpha\big([t - \varepsilon, t + \varepsilon] \cap J\big)$$

According to this definition, the exponent of an increasing function $f$ is the same as that of the measure $\mu$ defined on any $[c, d] \subset J$ by $\mu([c, d]) = f(d) - f(c)$. Indeed, $f(d) - f(c)$ is also the oscillation of $f$ in $[c, d]$. However, in general, $f$ is not monotonic and it is therefore necessary to carry out a more accurate analysis, as we will see below. As in the case of measures, we may also consider the “symmetric” exponent defined with an upper limit, and also the exponents obtained as lower and upper limits by using particular grids of intervals, like the dyadic intervals.

Oscillation, considered as a measurement of the local variability of a function, possesses many advantages. In particular, it is closely related to the box dimension. However, there are some drawbacks: it is not simple to use in a theoretical context, it is sometimes difficult to estimate with precision under experimental conditions and, finally, it is sensitive to various disturbances (discretization, noise, etc.). It is possible to replace the oscillation with other set functions $v(f, F)$ showing more robustness. However, most alternatives no longer verify the important triangle inequality: $v(f, F_1 \cup F_2) \le v(f, F_1) + v(f, F_2)$ for all $F_1, F_2 \subset J$ (see [TRI 99]). We can simplify the analysis by restricting the general $F$ to the class of intervals and by setting:

$$v(f, [a, b]) = |f(b) - f(a)|.$$

We may even consider only dyadic intervals, and take:

$$v\big(f, [k\, 2^{-n}, (k + 1)\, 2^{-n}]\big) = |c_{n,k}|$$

where $c_{n,k}$ is the wavelet coefficient of $f$ at scale $n$ and position $k$ (see also Chapters 2, 8 and 9). Let us now give an alternate and useful definition of the Hölder exponent.



DEFINITION 1.2.– Let $f : \mathbb{R} \to \mathbb{R}$ be a function, $s > 0$ and $x_0$ a real number. Then $f \in C^s(x_0)$ if and only if there exist a real number $\eta > 0$, a polynomial $P$ of degree $\le s$ and a constant $C$, such that

$$\forall x \in B(x_0, \eta), \quad |f(x) - P(x - x_0)| \le C\, |x - x_0|^s. \tag{1.16}$$

The pointwise Hölder exponent of $f$ at $x_0$, denoted $\alpha_p^f(x_0)$ or $\alpha_p(x_0)$, is given by $\sup\{s : f \in C^s(x_0)\}$.

If $0 < s < 1$, then the polynomial $P$ is simply the constant $f(x_0)$ and the increments of $f$ over $[x_0 - \varepsilon, x_0 + \varepsilon]$ are indeed compared to $\varepsilon$. Considering $P$ allows us to take into account the higher order singularities, i.e. those in the derivatives of $f$. We remove the “regular part” of $f$ to exhibit the singular behavior. Consider for example $f(x) = x + |x|^{3/2}$. Using $P$ allows us to find $\alpha_p = \tfrac{3}{2}$, whereas the simple increment would give $\alpha_p = 1$, a non-significant value in this case.

The pointwise exponent, whose definition is natural, has a geometric interpretation. To begin with, let us remove the signal's regular part, thus forming the difference $f(x) - P(x - x_0)$. Around $x_0$, the signal thus obtained is entirely contained in a hull of the form $C|x - x_0|^{\alpha_p - \varepsilon}$ for any $\varepsilon > 0$, and this hull is optimal, i.e. any smaller hull $C|x - x_0|^{\alpha_p + \varepsilon}$ fails to contain an infinite number of points of the signal. We observe that the smaller $\alpha_p$ is, the more irregular $f$ is in the neighborhood of $x_0$, and vice versa. In addition, $\alpha_p > 1$ implies that $f$ is differentiable at $x_0$, and a function which is discontinuous at $x_0$ is such that $\alpha_p = 0$.

In many applications, the regularity of a signal is as important as, or more important than, its amplitude. For example, the edges of an image do not vary if an affine transformation of the gray-levels is carried out. This will modify the amplitude, but not the exponents. The pointwise Hölder exponent will thus be one of the main ingredients for the fractal analysis of a signal or image. Generally speaking, measuring and studying the pointwise exponent is particularly useful in the processing of strongly irregular signals whose irregularity is expected to contain relevant information. Examples include biomedical signals and images (ECG, EEG, echography, scintigraphy), Internet traffic logs, radar images, etc.

The pointwise exponent, however, presents some drawbacks. For example, it is neither stable under integro-differentiation nor, more generally, under the action of pseudo-differential operators. This means that it is not possible to predict the exponent at $x$ of a primitive $F$ of $f$ knowing $\alpha_p^f(x)$. It is only guaranteed that $\alpha_p^F(x) \ge \alpha_p^f(x) + 1$. In the same way, the exponent of the analytical signal associated with $f$ is not necessarily the same as that of $f$. This is a problem in signal processing, since this type of operator is often used. The second disadvantage, related to the first, is that $\alpha_p(x)$ does not provide the whole information on the regularity of $f$ at $x$. A common example is that of the chirp $|x|^\gamma \sin(1/|x|^\delta)$, with $\gamma$ and $\delta$ positive. We verify that the pointwise exponent at 0 is equal to $\gamma$. In particular, it is independent of $\delta$: $\alpha_p(0)$ is not sensitive to “oscillating singularities”, i.e. to situations where the local frequency of the signal tends to infinity in the neighborhood of 0.

It is therefore necessary to introduce at least one more exponent to fully describe the local irregularity. During the last few years, several quantities have been proposed, including the chirp exponent, the oscillation exponent, etc. Here we will focus on the local Hölder exponent, which we now define (for more details on the properties of this exponent, see [LEV 98a, SEU 02]). Let us first recall that, given a function $f : \Omega \to \mathbb{R}$, where $\Omega \subset \mathbb{R}$ is an open set, we say that $f$ belongs to the global Hölder space $C_l^s(\Omega)$, with $0 < s < 1$, if there is a constant $C$ such that, for any $x, y$ in $\Omega$:

$$|f(x) - f(y)| \le C\, |x - y|^s \tag{1.17}$$

If $m < s < m + 1$ ($m \in \mathbb{N}$), then $f \in C_l^s(\Omega)$ means that there exists a constant $C$ such that, for any $x, y$ in $\Omega$:

$$|\partial^m f(x) - \partial^m f(y)| \le C\, |x - y|^{s - m} \tag{1.18}$$

Now let $\alpha_l(\Omega) = \sup\{s : f \in C_l^s(\Omega)\}$. Notice that if $\Omega' \subset \Omega$, then $\alpha_l(\Omega') \ge \alpha_l(\Omega)$. To define the local Hölder exponent, we will use the following lemma.

LEMMA 1.1.– Let $(O_i)_{i \in I}$ be a family of decreasing open sets (i.e. $O_i \subset O_j$ if $i > j$), such that:

$$\bigcap_i O_i = \{x_0\} \tag{1.19}$$

Let:

$$\alpha_l(x_0) = \sup_{i \in I} \{\alpha_l(O_i)\} \tag{1.20}$$

Then, $\alpha_l(x_0)$ does not depend on the choice of the family $(O_i)_{i \in I}$.

This result makes it possible to define the local exponent by using any family of intervals containing $x_0$.



DEFINITION 1.3.– Let $f$ be a function defined in a neighborhood of $x_0$. Let $\{I_n\}_{n \in \mathbb{N}}$ be a decreasing sequence of open intervals converging to $x_0$. By definition, the local Hölder exponent of $f$ at $x_0$, denoted $\alpha_l(x_0)$, is:

$$\alpha_l(x_0) = \sup_{n \in \mathbb{N}} \alpha_l(I_n) = \lim_{n \to +\infty} \alpha_l(I_n) \tag{1.21}$$

Let us briefly note that the local exponent is related to a notion of critical exponent of fractional derivation [KOL 01]. We may understand the difference between $\alpha_p$ and $\alpha_l$ as follows: let us suppose that there exists a single pair $(y, z)$ such that $\beta(f, B(x, \varepsilon)) = f(y) - f(z)$. Then $\alpha_p$ results from the comparison between $\beta(f, B(x, \varepsilon))$ and $\varepsilon$, whereas for $\alpha_l$, we compare $\beta(f, B(x, \varepsilon))$ to $|y - z|$. This is particularly clear in the case of the chirp, where the distance between the points $(y, z)$ realizing the oscillation tends to zero much faster than the size of the ball around 0. Accordingly, it is easy to demonstrate that $\alpha_l(0) = \gamma / (1 + \delta)$ for the chirp. The exponent $\alpha_l$ thus “sees” oscillations around 0: for fixed $\gamma$, the chirp is more irregular (in the sense of $\alpha_l$) when $\delta$ is larger.

The local exponent possesses an advantage over $\alpha_p$: it is stable under the action of pseudo-differential operators. However, like $\alpha_p$, $\alpha_l$ cannot by itself completely characterize the irregularity around a point. Moreover, $\alpha_l$ is, in a certain sense, less “precise” than $\alpha_p$. The results presented below support this assertion.

PROPOSITION 1.2.– For any continuous $f$ and for all $x$:

$$\alpha_l^f(x) \le \min\Big(\alpha_p^f(x),\ \liminf_{t \to x} \alpha_p^f(t)\Big)$$

The following two theorems describe the structure of the Hölder functions, i.e. the functions which associate with any $x$ the exponents of $f$ at $x$.

THEOREM 1.4.– Let $g : \mathbb{R} \to \mathbb{R}^+$ be a function. The two assertions below are equivalent:
– $g$ is the lower limit of a sequence of continuous functions;
– there exists a continuous function $f$ whose pointwise Hölder function $\alpha_p(x)$ satisfies $\alpha_p(x) = g(x)$ for all $x$.

THEOREM 1.5.– Let $g : \mathbb{R} \to \mathbb{R}^+$ be a function. The following two assertions are equivalent:
– $g$ is a lower semi-continuous (LSC) function;
– there exists a continuous function $f$ whose local Hölder function $\alpha_l(x)$ satisfies $\alpha_l(x) = g(x)$ for any $x$.



NOTE 1.2.– Let us recall that a function $f : D \subset \mathbb{R} \to \mathbb{R}$ is LSC if, for any $x \in D$ and for any sequence $(x_n)$ in $D$ tending to $x$:

$$\liminf_{n \to \infty} f(x_n) \ge f(x) \tag{1.22}$$

Figure 1.2 shows a generalization of the Weierstrass function defined on $[0, 1]$ for which $\alpha_p(x) = \alpha_l(x) = x$ for any $x$. This function is defined as $W_g(x) = \sum_{n=0}^{\infty} \omega^{-nx} \cos(\omega^n x)$, with $\omega > 1$.

Figure 1.2. Generalized Weierstrass function for which αp (x) = αl (x) = x

Since the class of lower limits of continuous functions is much larger than that of lower semi-continuous functions, we observe that $\alpha_p$ generally supplies more information than $\alpha_l$. For example, $\alpha_p$ can vary much “faster” than $\alpha_l$. In particular, it is possible to construct a continuous function whose pointwise Hölder function coincides with the indicator function of the set of rational numbers. It is everywhere discontinuous, and thus its local Hölder function is constantly equal to 0. The following results describe more precisely the relations between $\alpha_l$ and $\alpha_p$.

PROPOSITION 1.3.– Let $f : I \to \mathbb{R}$ be a continuous function, and assume that there exists $\gamma > 0$ such that $f \in C^\gamma(I)$. Then, there exists a subset $D$ of $I$ such that:
– $D$ is dense, uncountable and has Hausdorff dimension zero;
– for any $x \in D$, $\alpha_p(x) = \alpha_l(x)$.



Moreover, this result is optimal, i.e. there exists a function of global regularity $\gamma > 0$ such that $\alpha_p(x) \ne \alpha_l(x)$ for all $x$ outside a set of zero Hausdorff dimension.

THEOREM 1.6.– Let $0 < \gamma < 1$ and let $f : [0, 1] \to [\gamma, 1]$ be a lower limit of continuous functions. Let $g : [0, 1] \to [\gamma, 1]$ be a lower semi-continuous function. Assume that for all $t \in [0, 1]$, $f(t) \ge g(t)$. Then, there exists a continuous function $F : [0, 1] \to \mathbb{R}$ such that:
– for all $x$, $\alpha_l(x) = g(x)$;
– for all $x$ outside a set of zero Hausdorff dimension, $\alpha_p(x) = f(x)$.

This theorem shows that, when the “compatibility condition” $f(t) \ge g(t)$ is satisfied, we can simultaneously and independently prescribe the local and pointwise regularity of a function outside a “small” set. These two measures of irregularity are thus to some extent independent and provide complementary information.

1.3.4. Signal dimension theorem

Let us investigate the relationships between the dimension of a signal and its Hölder exponents. There is no general result concerning the Hausdorff dimension, apart from obvious upper bounds resulting from the inequalities $\dim(\Gamma) \le \mathrm{Dim}(\Gamma) \le \Delta(\Gamma)$. Here is a result for $\mathrm{Dim}(\Gamma)$ [TRI 86a].

THEOREM 1.7.– If $\Gamma$ is the graph of a continuous function $f$, then:

$$2 - \sup_{x \in J} \underline{\alpha}_\mu(x) \le \mathrm{Dim}(\Gamma) \le 2 - \inf_{x \in J} \underline{\alpha}_\mu(x). \tag{1.23}$$

The same inequalities are true if we use the grid of the dyadic intervals:

$$2 - \sup_{x \in J} \alpha_{*\mu}(x) \le \mathrm{Dim}(\Gamma) \le 2 - \inf_{x \in J} \alpha_{*\mu}(x). \tag{1.24}$$

We do not provide the proof of these results, which requires an evaluation of the packing measure of the graph. In the same context, we could show that if the local Hölder exponents $\alpha(u_n(x))$ tend uniformly to a real $\alpha$, then $\Delta(\Gamma) = 2 - \alpha$. However, a much more interesting equality, both simple and general, may be given for the Minkowski-Bouligand dimension of the graph.

THEOREM 1.8.– Let $f$ be a continuous function defined on an interval $J$, and non-constant on $J$. For any $\varepsilon > 0$, let us call ε-variation of $f$ on $J$ the arithmetic mean of the ε-oscillations:

$$\mathrm{var}_\varepsilon(f) = \frac{1}{|J|} \int_J \beta\big(f, [x - \varepsilon, x + \varepsilon] \cap J\big)\, dx$$

then:

$$\Delta(\Gamma) = \limsup_{\varepsilon \to 0} \left( 2 - \frac{\log \mathrm{var}_\varepsilon(f)}{\log \varepsilon} \right) \tag{1.25}$$

The assumption that $f$ is not constant is necessary, as otherwise the oscillations are all zero and $\mathrm{var}_\varepsilon(f) = 0$. In this case, the graph is a horizontal segment and the value of its dimension is 1.

Proof. A proof [TRI 99] using geometric arguments consists of estimating the area of the Minkowski ε-sausage $\Gamma(\varepsilon)$. We show that this area is equivalent to that of the union of the horizontal segments $\bigcup_{t \in J} [t - \varepsilon, t + \varepsilon] \times \{f(t)\}$ centered on the graph. The latter is proportional to the variation $\mathrm{var}_\varepsilon(f)$.

EXAMPLE 1.7.– The graph of a self-affine function defined on $J = [a, b]$ may be obtained as the attractor of an iterated function system, like the self-similar curves of Example 1.2 (see also Chapters 9 and 10). For this, it is sufficient to define:
– an integer $N \ge 2$;
– $N + 1$ points in the plane $A_1 = (x_1, y_1), \ldots, A_{N+1} = (x_{N+1}, y_{N+1})$, such that $x_1 = a < \cdots < x_i < \cdots < x_{N+1} = b$;
– $N$ affine triangular maps of the plane $T_1, \ldots, T_N$, such that, for each $T_i$, the image of the segment $A_1 A_{N+1}$ is the segment $A_i A_{i+1}$.

These may be written as:

$$T_i = \begin{pmatrix} \rho_i & 0 \\ h_i & \delta_i \end{pmatrix} + \begin{pmatrix} \xi_i \\ \eta_i \end{pmatrix}$$

where $0 < \rho_i = (x_{i+1} - x_i)/(b - a) < 1$ and $|\delta_i| < 1$. The five parameters of $T_i$ are related by the relations $T_i(A_1) = A_i$ and $T_i(A_{N+1}) = A_{i+1}$.

In the particular case where $\rho_i = 1/N$ for any $i$ and $\sum_i |\delta_i| \ge 1$, we may verify that if $\varepsilon = N^{-k}$, the quantity $\mathrm{var}_\varepsilon(f)$ is of the order of $\big((\sum_i |\delta_i|)/N\big)^k$. We then obtain:

$$\Delta(\Gamma) = 2 - \frac{\big|\log\big((\sum_i |\delta_i|)/N\big)\big|}{\log N} = 1 + \frac{\log \sum_i |\delta_i|}{\log N}$$

(see the classical example of Figure 1.3 where $N = 4$ and $\delta_i = \tfrac{1}{2}$ for any $i$). The Hölder exponent is calculated using 4-adic intervals. Its uniform value is $\tfrac{1}{2}$. Therefore $\Delta(\Gamma) = \mathrm{Dim}(\Gamma) = \tfrac{3}{2}$. Let us note that the Hausdorff dimension, strictly lower than $\tfrac{3}{2}$, is much more difficult to estimate [MCM 84].
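The closed formula of Example 1.7 is easy to evaluate. The short Python sketch below does so for the case $\rho_i = 1/N$; the function name and the validity checks are assumptions added for this illustration.

```python
import math

def self_affine_box_dimension(deltas):
    """Delta(Gamma) = 1 + log(sum_i |delta_i|) / log N, for rho_i = 1/N and sum_i |delta_i| >= 1."""
    n = len(deltas)
    total = sum(abs(d) for d in deltas)
    if n < 2 or any(abs(d) >= 1 for d in deltas) or total < 1:
        raise ValueError("need N >= 2, |delta_i| < 1 and sum of |delta_i| >= 1")
    return 1.0 + math.log(total) / math.log(n)

# Parameters quoted for Figure 1.3: N = 4 and delta_i = 1/2 for every i.
dim_fig_1_3 = self_affine_box_dimension([0.5, 0.5, 0.5, 0.5])
```

With these parameters the sum of the $|\delta_i|$ equals 2, so the formula gives $1 + \log 2 / \log 4 = 3/2$, the value stated in the example.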



EXAMPLE 1.8.– Lacunary series, such as the Weierstrass function, provide other examples of signals:

$$f(x) = \sum_{n=0}^{\infty} \omega^{-nH} \cos(\omega^n x + \varphi_n)$$

where $\omega > 1$ and $0 < H < 1$. The values of the “phases” $\varphi_n$ are arbitrary. We can directly prove [TRI 99] that, restricted to any bounded interval $J$, the box dimension of the graph $\Gamma$ is equal to $2 - H$. By homogeneity, $\mathrm{Dim}(\Gamma) = 2 - H$. This confirms the fact, although it is more difficult to prove (see [BOU 00]), that the ε-oscillations are uniformly of the order of $\varepsilon^H$. In other words, there exist two constants $c_1$ and $c_2 > 0$ such that, for any $t \in J$ and for any $\varepsilon$, with $0 < \varepsilon < 1$:

$$c_1\, \varepsilon^H \le \beta\big(f, [t - \varepsilon, t + \varepsilon]\big) \le c_2\, \varepsilon^H$$
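To experiment with such signals, the series of Example 1.8 can be synthesized by truncation. The following Python sketch is a minimal illustration: the function name `weierstrass`, the number of terms (12), the default parameters and the uniform random phases are assumptions, and truncating the series means the samples are only irregular down to a scale of the order of $\omega^{-12}$.

```python
import numpy as np

def weierstrass(x, omega=3.0, H=0.5, n_terms=12, phases=None):
    """Partial sum of sum_n omega^(-n H) cos(omega^n x + phi_n), for n = 0 .. n_terms - 1."""
    x = np.asarray(x, dtype=float)
    n = np.arange(n_terms)
    if phases is None:
        phases = np.zeros(n_terms)               # zero phases unless supplied
    terms = omega ** (-n * H) * np.cos(np.outer(x, omega ** n) + phases)
    return terms.sum(axis=1)

# Hypothetical usage: random phases, as for a randomized Weierstrass function.
rng = np.random.default_rng(0)
phases = rng.uniform(0.0, 2.0 * np.pi, 12)
x = np.linspace(0.0, 2.0 * np.pi, 1000)
samples = weierstrass(x, phases=phases)
```

Each term is bounded by $\omega^{-nH}$, so the samples never exceed the geometric sum of these amplitudes, and with zero phases that bound is attained at $x = 0$.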

Figure 1.3. Graph of a nowhere differentiable function, attractor of a system of four affine maps, such that $\rho_i = \tfrac{1}{4}$ and $\delta_i = \tfrac{1}{2}$. Dimensions $\Delta$ and Dim are equal to $\tfrac{3}{2}$. The Hausdorff dimension lies between 1 and $\tfrac{3}{2}$



Finding the value of the Hausdorff dimension of such graphs is still an open problem. In the case where the $\varphi_n$ are independent random variables with the same distribution, it is known that $\dim(\Gamma) = 2 - H$ with probability 1.

1.3.5. 2-microlocal analysis

In this section, we briefly present an analysis of local regularity which is much finer than the Hölder exponents described above. As was observed previously, neither $\alpha_p$ nor $\alpha_l$ allows us to completely describe the local irregularity of a function. To obtain an exhaustive description, we need to consider an infinite number of exponents. This is the purpose of 2-microlocal analysis. This powerful tool was defined in [BON 86], where it is introduced in the context of PDEs, within the framework of Littlewood-Paley analysis. A definition based on wavelets is developed in [JAF 96]. We present here a time-domain characterization [KOL 02].

DEFINITION 1.4.– A function $f : I \subset \mathbb{R} \to \mathbb{R}$ belongs to the 2-microlocal space $C^{s,s'}_{x_0}$ if there exist $0 < \delta < \tfrac{1}{4}$ and $C > 0$ such that, for all $(x, y)$ satisfying $|x - x_0| < \delta$ and $|y - x_0| < \delta$:

$$|f(x) - f(y)| \le C\, |x - y|^{s + s'} \big(|x - y| + |x - x_0|\big)^{-s'/2} \big(|x - y| + |y - x_0|\big)^{-s'/2} \tag{1.26}$$

Figure 1.4. Graph of the Weierstrass function for $\omega = 3$ and $H = \tfrac{1}{2}$. Phases $\varphi_n$ are independent and identically distributed random variables. Dimensions $\Delta$ and Dim are equal to $\tfrac{3}{2}$. The Hausdorff dimension is almost surely equal to $\tfrac{3}{2}$



This definition is valid only for $0 < s < 1$ and $0 < s + s' < 1$. The general case is slightly more complex and will not be dealt with here. 2-microlocal spaces, as opposed to Hölder spaces, require two exponents $(s, s')$ in their definition. While $\alpha_p$ is defined as the sup of the exponents $\alpha$ such that $f$ belongs to $C^{\alpha}_{x_0}$, we cannot proceed in the same way to define “2-microlocal exponents”. Instead, we define, in the abstract plane $(s, s')$, the 2-microlocal frontier of $f$ at $x_0$ as the curve $\big(s, \sup\{s' : f \in C^{s,s'}_{x_0}\}\big)$. It is not hard to show that this curve is well defined, concave and decreasing. Its intersection with the $s$ axis is exactly $\alpha_l$. Moreover, under the hypothesis that $f$ possesses a minimum global regularity, $\alpha_p$ is the intersection of the frontier with the line $s + s' = 0$. The 2-microlocal frontier thus allows us to re-interpret the two exponents within a unified framework. The main advantage of the frontier is that it completely describes the evolution of $\alpha_p$ under integro-differentiation of arbitrary order: indeed, an integro-differentiation of order $\varepsilon$ simply shifts the frontier by $\varepsilon$ along the $s$ axis. Thus, 2-microlocal analysis provides extremely rich information on the regularity of a function around a point.

To conclude this brief presentation, let us mention that algorithms exist which make it possible to numerically estimate the 2-microlocal frontier. They often allow us to calculate the values of $\alpha_p$ and $\alpha_l$ more precisely than a direct method (various estimation methods for the exponents and the 2-microlocal frontier are proposed in FracLab). Furthermore, it is possible to develop a 2-microlocal formalism [LEV 04a] which presents strong analogies with the multifractal formalism (see below).

1.3.6. An example: analysis of stock market price

As an illustration of some of the notions introduced above, we use this section to detail a simplified example of the fractal analysis of a signal based on Hölder exponents. Our purpose is not to develop a complete application (this would require a whole chapter) but instead to demonstrate how we calculate and use the information provided by local regularity analysis in a practical scenario. The signal is a stock market log. Financial analysis offers an interesting area of application for fractal methods (see Chapter 13 for a detailed study). We will consider the evolution of the Nikkei index between January 1, 1980 and November 5, 2000. This signal comprises 5,313 values and is presented in Figure 1.5. As with many stock market records, it is extremely irregular. We calculate the local Hölder exponent of the logarithm of this signal, which is the quantity on which financial analysts work. The exponent is calculated by a direct application of Definition 1.3: at each point $x$, we find, for increasing values of $\varepsilon$, one pair $(y, z)$ for which the signal oscillation in a ball of radius $\varepsilon$ centered at $x$ is attained. A bilogarithmic regression between the vector of the values found for the oscillation and the distances $|y - z|$ is then performed (see the FracLab manual for more details on the procedure). As Figure 1.6 shows, most local exponents lie between 0 and 1, with



Figure 1.5. The Nikkei index between 1st January 1980 and 5th November 2000

some peaks above 1 and up to 3. The exponent values in [0, 1], which imply that the signal is continuous but not differentiable, confirm the visual impression of great irregularity. We can go beyond this qualitative comment by observing that the periods where “something occurs” have a definite signature in terms of the exponents: they are characterized by a dramatic increase of αl followed by very small values, below 0.2. Let us examine some examples. The most remarkable point of the local regularity graph is its maximum at abscissa 2,018, with an amplitude of 3. The most singular points, i.e. those with the smallest exponent, are situated just after this maximum: the exponent is around 0.2 for the abscissae between 2,020 and 2,050, and of the order of 0.05 between points 2,075 and 2,100. These two values are distinctly below the exponent average, which is 0.4 (the variance being 0.036). Calculations show that less than 10% of the log points possess an exponent smaller than 0.2. This remarkable behavior corresponds to the crash of October 19, 1987 which occurs at abscissa 2,036, in the middle of the first zone of points with low regularity after the maximum: the most “irregular” days of the entire signal are thus, as expected, situated in the weeks which followed the crash. It is worthwhile noting that this fact is much more apparent on the regularity signal than on the original log, where only the crash appears clearly, with the subsequent period not displaying remarkable features. Let us now consider another area which contains many points of low regularity along with some isolated regular points (i.e. having αl > 1). 
It corresponds to the zone between abscissae 4,450 and 4,800: this period approximately corresponds to the Asian crisis that took place between January 1997 and June 1998 (analysts do not agree upon the exact dates of the beginning and the ending of this crisis: some of them date its beginning in mid-1997 or its end towards the end of 1999, or much later). On the graph of the log, we can observe that this period seems slightly more



Figure 1.6. Local Hölder function of the Nikkei index

irregular than others. In terms of exponents, we notice that it contains two maxima, with values greater than 1, both followed by low regularity points: this area comprises a high proportion of irregular points, since 12% of its points have an exponent lower than 0.15. This proportion is three times higher than that observed in the whole log. The analysis just performed has remained at an elementary level. However, it has allowed us to show that major events have repercussions on the evolution of the local Hölder exponent and that the graph of αl emphasizes features not easily visible on the original log.
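The estimation procedure used in this example can be sketched in a few lines. The code below is only an illustration under simplifying assumptions: it regresses the logarithm of the oscillation against the logarithm of the ball radius (rather than against the distances |y − z| as in FracLab), and the function names, the choice of radii and the test signal are hypothetical.

```python
import math

def oscillation(signal, i, r):
    """Oscillation (max - min) of `signal` over the samples within r steps of index i."""
    lo, hi = max(0, i - r), min(len(signal), i + r + 1)
    window = signal[lo:hi]
    return max(window) - min(window)

def local_exponent(signal, i, step, radii=(2, 4, 8, 16, 32, 64)):
    """Least-squares slope of log(oscillation) against log(radius) at sample i."""
    xs, ys = [], []
    for r in radii:
        osc = oscillation(signal, i, r)
        if osc > 0:                       # skip flat windows, where the log is undefined
            xs.append(math.log(r * step))
            ys.append(math.log(osc))
    if len(xs) < 2:
        return float("inf")               # locally constant: treated as infinitely regular
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

# Toy check on f(t) = |t - 1/2|^0.5 sampled on [0, 1]; its exponent at t = 1/2 is 0.5.
n = 2001
step = 1.0 / (n - 1)
cusp = [math.sqrt(abs(k - 1000) * step) for k in range(n)]
alpha_mid = local_exponent(cusp, 1000, step)
```

On real data such as a stock market log, the range of radii would have to be tuned to the sampling step, and the regression would be repeated at every sample to obtain a curve comparable to Figure 1.6.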

1.4. Multifractal analysis

1.4.1. What is the purpose of multifractal analysis?

In the previous section, it was observed that the Hölder functions provide precise information on the regularity at each point of a signal. In applications, this information is often useful as such, but there exist many cases where it is necessary to go further. Here are three examples that highlight this necessity. In image segmentation, we expect edges to correspond to low regularity points and hence to small exponents. However, the precise value of the Hölder exponents of contour points cannot be universal: non-linear transformations of an image, for instance, might preserve edges while modifying the exponent value. In this first situation, we see that the pointwise regularity does not provide all the relevant information and that it is necessary to supplement it with structural information. Let us now consider the issue of image texture classification. It is clear that classifying a pixel based on its



local exponent would not give satisfactory results. A more relevant approach would be to use the statistical distribution of exponents within zones. The same comment applies to the characterization of Internet traffic according to its degree of sporadicity. In this second situation, the Hölder function provides information that is too rich, and which we would like to condense in a certain sense. The last situation is when the exponents are too difficult to calculate: there exist, in particular, continuous signals, easy to synthesize, whose Hölder function is everywhere discontinuous. In this case, the pointwise regularity information is too complex to be used in its original form. In all these examples, we would like to use “higher level” information, which would be extracted from the Hölder function or would sum up, in some sense, its relevant features. Several ways of doing this exist. The idea that comes immediately to mind is simply to calculate histograms of exponents. This approach, however, is not appropriate, both for mathematical reasons that go beyond the scope of this chapter and because, in fractal analysis, we always try to deal with quantities that are scale-independent. The most relevant way to extract information from the Hölder function and to describe it globally is to perform a multifractal analysis. There are many variants and we will concentrate on two popular examples: the first is geometric and consists of calculating the dimension of the set of points possessing the same exponent. The second is statistical: we study the probability of finding, at a fixed resolution, a given exponent and how this probability evolves when the resolution tends to infinity. The following two sections are devoted to developing these notions.

1.4.2. First ingredient: local regularity measures

Before giving a detailed description of the local regularity variations of a signal, it is necessary to determine what method will be used to measure this regularity.
It was explained in section 1.3 that many characterizations are equally relevant and that the choice of one or the other is dictated by practical considerations and by the type of application. Likewise, we may base multifractal analysis on various measures of local regularity. However, there are a certain number of advantages in using the pointwise exponent, which is reasonably simple while leading to a rich enough analysis. Therefore, the following text will use this measure of regularity, which corresponds to the most common choice.

Grain exponents

For reasons which will be explained below, we need to define a new class of exponents called grain exponents. These are simply approximations, at each finite resolution, of the usual exponents. For simplicity, let us assume that our signal $X$ is defined on $[0, 1]$, and let $u$ denote an interval in $[0, 1]$. We first choose a descriptor $V_X(u)$ of the relevant information of $X$ in $u$: if $X$ is a measure $\mu$, we will most often take $V_\mu(u) = \mu(u)$. If $X$ is a function $f$, then $V_f(u)$ may for instance be



the absolute value of the increment of $f$ in $u$, that is $|f(u_{\max}) - f(u_{\min})|$ (where $u = [u_{\min}, u_{\max}]$). A more precise descriptor is obtained by considering the oscillation instead, and setting $V_f(u) = \beta(f, u)$. Finally, in cases where the intervals $u$ are dyadic, of the form $I_n^k = [(k-1)\,2^{-n}, k\,2^{-n}]$, a third possibility is to choose $V_f(I_n^k) = |c_n^k|$, where $c_n^k$ is the wavelet coefficient of $f$ at scale $n$ and position $k$ (note that, in this case, most quantities defined below will additionally depend on the wavelet). Once $V_X(u)$ has been defined, the grain exponent may be calculated as follows:

\[ \alpha(u) = \frac{\log V_X(u)}{\log |u|} \]
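To make the role of the descriptor concrete, here is a minimal sketch comparing the increment and the oscillation for a function given as a Python callable. The function names, the regular sampling used to approximate the oscillation, and the convention of returning an infinite exponent when the descriptor vanishes are assumptions made for this illustration.

```python
import math

def increment_descriptor(f, a, b):
    """V_f(u) = |f(u_max) - f(u_min)| for u = [a, b]."""
    return abs(f(b) - f(a))

def oscillation_descriptor(f, a, b, samples=1001):
    """Approximate oscillation of f on [a, b] from a regular sampling (an assumption)."""
    values = [f(a + (b - a) * i / (samples - 1)) for i in range(samples)]
    return max(values) - min(values)

def grain_exponent(v, length):
    """alpha(u) = log V_X(u) / log |u|, with alpha = +inf when V_X(u) = 0 (a convention)."""
    if v <= 0.0:
        return float("inf")
    return math.log(v) / math.log(length)

f = lambda t: math.sqrt(t)      # the increment of f on [0, h] is h ** 0.5
h = 2.0 ** -10
alpha_inc = grain_exponent(increment_descriptor(f, 0.0, h), h)
alpha_osc = grain_exponent(oscillation_descriptor(f, 0.0, h), h)

# A symmetric bump has zero increment over [0.25, 0.75] but a non-zero oscillation.
g = lambda t: (t - 0.5) ** 2
blind = increment_descriptor(g, 0.25, 0.75)
seen = oscillation_descriptor(g, 0.25, 0.75)
```

For the monotone function both descriptors give the same exponent, whereas the bump shows one way in which the oscillation can be "more precise": the increment misses a variation that cancels out between the two endpoints.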

Filtration

It is then necessary to define a sequence of partitions $(I_n^k)_{n \geq 0,\ k = 1 \ldots \nu_n}$ of $[0, 1]$. For each fixed $n$, the collection of the $\nu_n$ intervals $(I_n^k)_k$ constitutes a partition of $[0, 1]$ and we require that, when $n$ tends to infinity, $\max_k |I_n^k|$ tends to 0 (this implies, of course, that $\nu_n$ has to tend to infinity). A common (but not neutral) choice is to consider the dyadic intervals (and thus $\nu_n = 2^n$). The grain exponent $\alpha_n^k$ is then defined by:

\[ \alpha_n^k = \frac{\log V_X(I_n^k)}{\log |I_n^k|} \]

Intuitively, $\alpha_n^k$ does indeed measure the “singularity” of $X$ in the (small) interval $I_n^k$: the smaller $\alpha_n^k$, the greater the variation of X in $I_n^k$, and vice versa.
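As an illustration of the dyadic filtration, the sketch below computes grain exponents of a measure described by its masses on the cells of a fine dyadic grid. The data layout (a list of 2^N cell masses), the function names and the two test measures are assumptions chosen for the example.

```python
import math

def coarsen(masses, n):
    """Sum finest-scale cell masses into the 2**n dyadic intervals of level n."""
    block = len(masses) // (2 ** n)
    return [sum(masses[k * block:(k + 1) * block]) for k in range(2 ** n)]

def grain_exponents(masses, n):
    """alpha_n^k = log mu(I_n^k) / log |I_n^k| with |I_n^k| = 2**-n (requires n >= 1)."""
    log_len = -n * math.log(2.0)
    out = []
    for m in coarsen(masses, n):
        out.append(math.log(m) / log_len if m > 0 else float("inf"))
    return out

# Uniform mass on 2**8 cells: every grain exponent equals 1 at every level.
uniform = [2.0 ** -8] * 2 ** 8
alphas_uniform = grain_exponents(uniform, 4)

# A measure concentrated on the left half: the left interval gets the smaller exponent.
lopsided = [0.9 / 128] * 128 + [0.1 / 128] * 128
alphas_lopsided = grain_exponents(lopsided, 1)
```

The second example reflects the intuition stated above: the interval carrying most of the mass has the smaller grain exponent.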

Abstract function A

It is important to observe at this point that we can carry out a more general version of multifractal analysis by replacing the grain exponent $\alpha(u)$ with a function $A$ defined on the metric space of closed intervals of $[0, 1]$, and ranging in $\mathbb{R}^+ \cup \{+\infty\}$ [LEV 04b]. In this context, more general results can be obtained.

1.4.3. Second ingredient: the size of point sets of the same regularity

In the same way as the local regularity can be defined through different approaches, there exist many ways of extracting high-level information.

Geometric method

Conceptually, the simplest method consists of considering the sets $E_\alpha$ of those points of $[0, 1]$ possessing a given exponent $\alpha$ and then describing the size of $E_\alpha$ (to simplify notations and as the focus of this section is on the pointwise exponent,



we write $\alpha$ instead of $\alpha_p$). In many cases of practical and theoretical interest, these sets have zero Lebesgue measure for most values of $\alpha$. In addition, they are often dense in $[0, 1]$, with a box dimension equal to 1. To distinguish them, it is thus necessary to measure them either by their Hausdorff or packing dimension. Here, only the first of these dimensions will be considered. We set:

\[ f_h(\alpha) = \dim_H(E_\alpha) \]

where $E_\alpha = \{x : \alpha(x) = \alpha\}$. The function $f_h$ is called the Hausdorff multifractal spectrum of $X$. Since the dimension of the empty set is $-\infty$, we see that $f_h$ will take values in $\{-\infty\} \cup [0, n]$ for an $n$-dimensional signal. Even though a strict definition of a multifractal object does not exist, it seems reasonable to talk of multifractality when $f_h(\alpha)$ is positive or zero for several values of $\alpha$: indeed, this means that $X$ will display different singular behaviors on different subsets of $[0, 1]$. Often, we will require that $f_h$ be strictly positive on an interval in order to consider $X$ as truly multifractal. While the Hausdorff spectrum is simple to define, it requires an extremely delicate calculation in theoretical as well as numerical studies. The next subsection presents another multifractal spectrum, which is easier to calculate and which also provides an approximation of $f_h$ from above.

Statistical method

The second method used to globally describe the variations of regularity is to adopt a statistical approach (as opposed to $f_h$, which is a geometric spectrum): we first choose a value of $\alpha$, and count the number of intervals $I_n^k$, at a given resolution $n$, where $X$ possesses a grain exponent approximately equal to $\alpha$. We then let the resolution tend to infinity and observe how this number evolves. More precisely, the large deviation spectrum $f_g$ is defined as:

\[ f_g(\alpha) = \lim_{\varepsilon \to 0} \liminf_{n \to \infty} \frac{\log N_n^\varepsilon(\alpha)}{\log \nu_n} \]

where:

\[ N_n^\varepsilon(\alpha) = \#\{k : \alpha - \varepsilon \leq \alpha_n^k \leq \alpha + \varepsilon\} \]

We may heuristically understand this spectrum by letting $P_n$ denote the uniform probability law on $\{1, \ldots, \nu_n\}$, i.e. $P_n(k) = 1/\nu_n$ for $k = 1, \ldots, \nu_n$. Then, neglecting $\varepsilon$ and assuming that the lower limit is a limit:

\[ P_n(\alpha_n^k \simeq \alpha) \simeq \nu_n^{f_g(\alpha) - 1} \qquad (1.27) \]



In other words, if an interval $I_n^k$ at resolution $n$ is drawn randomly (for a sufficiently large value of $n$), then the probability of observing a singularity exponent approximately equal to $\alpha$ is proportional to $\nu_n^{f_g(\alpha) - 1}$. From the definition, it is clear that $f_g$, like $f_h$, ranges in $\{-\infty\} \cup [0, 1]$ (in one dimension). As a consequence, whenever $f_g(\alpha)$ is not equal to 1 (and thus is strictly smaller than 1), the probability of observing a grain exponent close to $\alpha$ decays exponentially fast to 0 when the resolution tends to infinity: only those $\alpha$ such that $f_g(\alpha) = 1$ will occur in the limit of infinite resolution. The study of this type of behavior and the determination of the associated exponential convergence speed (here, $f_g(\alpha) - 1$) are the topic of the branch of probability called “large deviation theory”, from which we derive the denomination of $f_g$.

Nature of the variable ε

Let us consider the role of $\varepsilon$: at each finite resolution, the number of intervals $I_n^k$ is finite. As a consequence, $N_n^0(\alpha)$ will be zero except for a finite number of values of $\alpha$. The variable $\varepsilon$ then represents a “tolerance” on the exponent value, which is made necessary by the fact that we work at finite resolutions. Once the limit on $n$ has been taken, this tolerance is no longer needed and we can let $\varepsilon$ tend to 0 (note that the limit in $\varepsilon$ always exists, as we are dealing with a monotonic function of $\varepsilon$). From a more general point of view, we may understand the difference between $f_h$ and $f_g$ as an inversion of the limits: in the case of $f_h$, we first let the resolution tend to infinity to calculate the exponents and then “count” how many points are characterized by a given exponent. In the case of $f_g$, we start by counting, at a given resolution, the number of intervals having a prescribed exponent, and then let the resolution tend to infinity. This second procedure renders the calculation of the spectrum easier, but in general it will obviously lead to a different result.
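The counting step at a fixed resolution can be written down directly. The sketch below evaluates the finite-resolution quantity log N_n^ε(α) / log ν_n for a given list of grain exponents; it does not attempt the double limit. The function names and the toy list of exponents are hypothetical.

```python
import math

def count_close(alphas, alpha, eps):
    """N_n^eps(alpha): number of grain exponents within eps of alpha."""
    return sum(1 for a in alphas if abs(a - alpha) <= eps)

def coarse_spectrum(alphas, alpha, eps):
    """log N_n^eps(alpha) / log nu_n at one resolution; -inf when no exponent is close."""
    count = count_close(alphas, alpha, eps)
    if count == 0:
        return float("-inf")
    return math.log(count) / math.log(len(alphas))

# Hypothetical exponents at level n = 4 (nu_n = 16): twelve "typical" ones, four rare ones.
alphas = [1.0] * 12 + [0.5] * 4
typical = coarse_spectrum(alphas, 1.0, 0.05)   # log 12 / log 16
rare = coarse_spectrum(alphas, 0.5, 0.05)      # log 4 / log 16 = 0.5
absent = coarse_spectrum(alphas, 2.0, 0.05)    # no interval within tolerance: -inf
```

With ε = 0 the count would vanish for all but finitely many values of α, which is the reason for the tolerance discussed above.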
In section 1.4.7, two examples are given that illustrate the difference between $f_h$ and $f_g$.

NOTE 1.3.– Historically, the large deviation spectrum was introduced as an easy means to calculate $f_h$ through arguments developed in section 1.4.4. It was then gradually realized that $f_g$ contains information sometimes more relevant than $f_h$, particularly in signal processing applications. For more details on this topic, see [LEV 98b], where the denominations “Hausdorff spectrum”, “large deviation spectrum” and “Legendre spectrum” were introduced. The next section tackles the problem of calculating the multifractal spectra.

1.4.4. Practical calculation of spectra

Let us begin with the Hausdorff spectrum. It is clear that the calculation of the exponents at each point, and then of all the associated dimensions of the Hausdorff spectrum, is an extremely difficult task. It is thus desirable to look for indirect methods to evaluate $f_h$. Two of them are described below.



Multifractal formalism

In some cases, $f_h$ may be obtained as the Legendre transform of a function that is easily calculated. When this is the case, we say that the (strong) multifractal formalism holds. Define:

\[ S_n(q) = \sum_{k=1}^{2^n} V_X(I_n^k)^q \]

and set:

\[ \tau_n(q) = -\frac{1}{n} \log_2 S_n(q) \]

\[ \tau(q) = \liminf_{n \to \infty} \tau_n(q) \]
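The quantities S_n(q) and τ_n(q) are straightforward to evaluate numerically. The sketch below does so for the interval masses of a binomial cascade, which is used here only as convenient test data (see section 1.4.6); the function names and the left-to-right ordering of the masses are assumptions of this illustration.

```python
import math

def binomial_masses(p, n):
    """Masses of the 2**n dyadic intervals of a binomial cascade (weight p on the left child)."""
    masses = [1.0]
    for _ in range(n):
        masses = [m * w for m in masses for w in (p, 1.0 - p)]
    return masses

def partition_sum(masses, q):
    """S_n(q) = sum over k of V_X(I_n^k)**q, skipping intervals of zero mass."""
    return sum(m ** q for m in masses if m > 0)

def tau_n(masses, q):
    """tau_n(q) = -(1/n) log2 S_n(q), where len(masses) == 2**n."""
    n = round(math.log2(len(masses)))
    return -math.log2(partition_sum(masses, q)) / n

masses = binomial_masses(0.3, 10)
tau_at_2 = tau_n(masses, 2.0)
```

For this particular measure τ_n(q) does not depend on n, so the lower limit defining τ(q) is trivial; for real signals the behavior of τ_n(q) as n grows has to be examined.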

To understand the link between $f_h$ and $\tau$ heuristically, let us evaluate $\tau_n$ by grouping together the terms that have the same order of magnitude in the definition of $S_n$. With this in view, fix a “Hölder exponent” $\alpha$ and consider those intervals for which, when $n$ is sufficiently large, $V_X(I_n^k) \sim |I_n^k|^\alpha$. Assume that this approximation is true uniformly in $k$. We may then roughly estimate the number of such intervals by $2^{n f_h(\alpha)}$. Then:

\[ S_n(q) = \sum_\alpha \sum_{k \,:\, V_X(I_n^k) \sim |I_n^k|^\alpha} 2^{-nq\alpha} = \sum_\alpha 2^{-n(q\alpha - f_h(\alpha))} \]

Factorizing $2^{-n \inf_\alpha (q\alpha - f_h(\alpha))}$, we obtain for $\tau_n$:

\[ \tau_n(q) = \inf_\alpha \big(q\alpha - f_h(\alpha)\big) - \frac{1}{n} \log_2 \sum_\alpha 2^{-n[q\alpha - f_h(\alpha) - \inf_\alpha (q\alpha - f_h(\alpha))]} \]

and, taking the “limit” when $n$ tends to infinity:

\[ \tau(q) = \inf_\alpha \big(q\alpha - f_h(\alpha)\big) =: f_h^*(q) \]

The transformation $f_h \to f_h^*$ is called the “Legendre transform” (for concave functions). Provided we can justify all the manipulations above, we thus expect that $\tau$ will be the Legendre transform of $f_h$. If, in addition, $f_h$ is concave, a well-known property of the Legendre transform guarantees that $f_h = \tau^*$. As announced above, $f_h$ is then obtained at once from $\tau$. The important point is that $\tau$ is itself much easier to evaluate than $f_h$: there is no need to calculate any Hausdorff dimension and furthermore the definition of $\tau$ involves only average quantities, and no evaluations of pointwise Hölder exponents. The only difficulty lies in the estimation of the limit. Therefore, the estimation of $\tau$ will be in general much easier and more robust than that of $f_h$. For this reason, even though the multifractal formalism does not hold in



general, it is interesting to consider a new spectrum, called the Legendre spectrum and defined as¹ $f_l = \tau^*$. The study of the validity domain of the equality $f_h = f_l$ is probably one of the most examined issues in multifractal analysis, for obvious theoretical and practical reasons. However, this is an extremely complex problem, which has only been partially answered to this day, even in “elementary” cases such as the one of multiplicative processes (see section 1.4.6). It is easy to find counter-examples to the equality $f_h = \tau^*$. For instance, since a Legendre transform is always concave, the formalism certainly fails as soon as $f_h$ is not concave. There is no reason to expect that the spectrum of real signals will have this property. In particular, it is not preserved by addition, so that it will not be very stable in practice.

Other dimensional spectra

Instead of using the Legendre transform approach, another path consists of defining spectra which are close in spirit to $f_h$, but are easier to calculate. Since the major difficulty arises from the estimation of the Hausdorff dimension, we may try to replace it with the box dimension, which is simpler to evaluate. Unfortunately, merely replacing the Hausdorff dimension by $\Delta$ in the definition of the spectrum leads to uninteresting results. Indeed, as mentioned above, the sets $E_\alpha$ have, in many cases of interest, a box dimension equal to 1 (see section 1.4.6 for examples). In this situation, the spectrum obtained by replacing $\dim$ by $\Delta$, being constant, will not supply any information. A more promising approach consists of defining dimension spectra as in [LEV 04b, TRI 99]. To do so, let us consider a general function of intervals $A$, as in section 1.4.2. For any $x$ in $[0, 1]$, set:

\[ \alpha_n(x) = A(u_n(x)) \]

where $u_n(x)$ is the dyadic interval of length $2^{-n}$ containing $x$ (take for instance the right interval if two such intervals exist). Then let:

\[ E_\alpha(\varepsilon, N) = \{x : n \geq N \Rightarrow |\alpha_n(x) - \alpha| \leq \varepsilon\} \]

$E_\alpha(\varepsilon, N)$ is an increasing function of $N$, and thus $\cup_N E_\alpha(\varepsilon, N) = \sup_N E_\alpha(\varepsilon, N)$. Let:

\[ E_\alpha(\varepsilon) = \sup_N E_\alpha(\varepsilon, N) = \{x : \exists N \text{ such that } n \geq N \Rightarrow |\alpha_n(x) - \alpha| \leq \varepsilon\} \]

1. The inequalities $f_h \leq f_g \leq f_l$ are always true (see below). Thus, it would be more accurate to say that the Legendre spectrum is an approximation (by excess) of the large deviation spectrum, rather than of the Hausdorff spectrum.



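Since the sets $E_\alpha(\varepsilon)$ decrease with $\varepsilon$, we may define:

\[ E_\alpha = \lim_{\varepsilon \to 0} E_\alpha(\varepsilon) = \{x : \alpha_n(x) \to \alpha \text{ when } n \to \infty\} \]

DEFINITION 1.5.– Let $d$ be any dimension. We define the following spectra:

\[ f_d(\alpha) = d(E_\alpha) = d\Big(\lim_{\varepsilon \to 0} \sup_N E_\alpha(\varepsilon, N)\Big) \qquad (1.28) \]

\[ f_d^{\lim}(\alpha) = \lim_{\varepsilon \to 0} d\big(E_\alpha(\varepsilon)\big) = \lim_{\varepsilon \to 0} d\Big(\sup_N E_\alpha(\varepsilon, N)\Big) \qquad (1.29) \]

\[ f_d^{\lim\sup}(\alpha) = \lim_{\varepsilon \to 0} \sup_N d\big(E_\alpha(\varepsilon, N)\big) \qquad (1.30) \]

When $d$ is the Hausdorff dimension, then $f_d$ is just the Hausdorff spectrum $f_h$. Let $D = \overline{\mathrm{Im}(A)}$ be the closure of the image of $A$. Then $D$ is the support of the spectra. Indeed, for any $\alpha \notin D$, $E_\alpha(\varepsilon, N) = \emptyset$ as soon as $\varepsilon$ is small enough, and thus $f_d(\alpha) = f_d^{\lim}(\alpha) = f_d^{\lim\sup}(\alpha) = -\infty$ (obviously this also applies to $f_g$). The following inequalities are easily proved:

\[ f_d \leq f_d^{\lim} \qquad (1.31) \]

\[ f_d^{\lim\sup} \leq f_d^{\lim} \qquad (1.32) \]

There is no relation in general between $f_d$ and $f_d^{\lim\sup}$. However, if $d$ is σ-stable:

\[ f_d(\alpha) \leq f_d^{\lim}(\alpha) = f_d^{\lim\sup}(\alpha) \qquad (1.33) \]

Besides, if $d_1$ and $d_2$ are two dimensions such that, for any $E \subset [0, 1]$:

\[ d_1(E) \leq d_2(E) \]

then:

\[ f_{d_1} \leq f_{d_2}, \qquad f_{d_1}^{\lim} \leq f_{d_2}^{\lim}, \qquad f_{d_1}^{\lim\sup} \leq f_{d_2}^{\lim\sup} \]

It is not hard to prove the following sequence of inequalities.

PROPOSITION 1.4.– For any $A$:

\[ f_h(\alpha) \leq f_h^{\lim\sup}(\alpha) = f_h^{\lim}(\alpha) \leq f_\Delta^{\lim\sup}(\alpha) \leq \min\big(f_\Delta^{\lim}(\alpha), f_g(\alpha)\big) \qquad (1.34) \]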



This is an improvement on the traditional result $f_h(\alpha) \leq f_g(\alpha)$. In particular, as soon as $f_h(\alpha) = f_g(\alpha)$, all the above spectra coincide, except perhaps $f_\Delta^{\lim}$. If, on the contrary, $f_h$ is smaller than $f_g$, then we may hope that $f_\Delta^{\lim\sup}(\alpha)$ will be a better approximation of $f_h$ than $f_g$. The important point here is that the calculation of $f_\Delta^{\lim\sup}(\alpha)$ only involves box dimensions and that it is of the same order of complexity as that of $f_g$. The spectrum $f_\Delta^{\lim\sup}(\alpha)$ is thus a good substitute when we want to numerically estimate $f_h$. For example, the practical calculation of $f_\Delta^{\lim\sup}(\alpha)$ on a multinomial measure (see section 1.4.6) yields good results. Since $d([0, 1]) = 1$, all the spectra have a maximum lower than or equal to 1. In certain cases, a more precise result concerning $f_d^{\lim\sup}$, $f_d^{\lim}$ and $f_g$ is available:

PROPOSITION 1.5.– Let $K$ be the set of $x$ in $[0, 1]$ such that the sequence $(\alpha_n(x))$ converges. Let us suppose that $|K| > 0$. Then, there exists $\alpha_0$ in $D$ such that:

\[ f_d^{\lim\sup}(\alpha_0) = f_d^{\lim}(\alpha_0) = f_g(\alpha_0) = 1 \qquad (1.35) \]

Let us note that such a constraint is typically not satisfied by $f_h$. This shows that $f_h$ contains, in general, more information. A more precise and more general result may be found in [LEV 04b]. The three spectra $f_d^{\lim\sup}$, $f_d^{\lim}$ and $f_g$ also obey a structural constraint, expressed by the following proposition.

PROPOSITION 1.6.– The functions $f_g$, $f_d^{\lim}$ and $f_d^{\lim\sup}$ are upper semi-continuous.

NOTE 1.4.– Recall that a function $f : D \subset \mathbb{R} \to \mathbb{R}$ is upper semi-continuous (USC) if, for any $x \in D$ and for any sequence $(x_n)$ of $D$ converging to $x$:

\[ \limsup_{n \to \infty} f(x_n) \leq f(x) \qquad (1.36) \]

Let us define the upper semi-continuous envelope of a function $f$ by:

\[ \tilde{f}(\alpha) = \lim_{\varepsilon \to 0} \sup\{f(\beta) : |\beta - \alpha| \leq \varepsilon\} \]

Then, the above results imply that:

\[ \tilde{f}_d(\alpha) \leq f_d^{\lim}(\alpha) \qquad (1.37) \]

Let us also mention the following result.

PROPOSITION 1.7.– Let $A$ and $B$ be two interval functions and $C = \max\{A, B\}$. Let $f_g(\alpha, A)$, $f_g(\alpha, B)$ and $f_g(\alpha, C)$ denote the corresponding spectra. Then, for any $\alpha$:

\[ f_g(\alpha, C) \leq \max\{f_g(\alpha, A), f_g(\alpha, B)\} \qquad (1.38) \]



PROPOSITION 1.8.– If $d$ is stable, then inequality (1.38) is true for $f_d$, $f_d^{\lim}$ and $f_d^{\lim\sup}$.

A significant result concerning the “inverse problem” for the spectra serves as a conclusion for this section.

PROPOSITION 1.9.– Let $f$ be a USC function ranging in $[0, 1] \cup \{-\infty\}$. Then, there exists an interval function $A$ whose $f_d^{\lim\sup}$ or $f_d^{\lim}$ spectrum is exactly $f$ as soon as $d$ is σ-stable or $d = \Delta$.

Let us note that $f_d$, when $d$ is σ-stable, is not necessarily USC. This shows once more that this spectrum is richer than the other ones (see [LEV 98b] for a study of the structural properties of $f_d$ with $d = h$).

Weak multifractal formalism

Let us now consider the numerical estimation of $f_g$. As in the case of $f_h$, two approaches exist: either we resort to a multifractal formalism with $f_g$ as the Legendre transform of a simple function, or we analyze in detail the definition of $f_g$ and deduce estimation methods from this. In the first case, the heuristic justification is the same as for $f_h$ and we expect that, under certain conditions, $f_g = \tau^*$. Since we avoid an inversion of limits as compared to the case of $f_h$, this formalism (sometimes called weak multifractal formalism, as opposed to the strong formalism which ensures the equality of $\tau^*$ and $f_h$) will be satisfied more often. However, the necessary condition that $f_g$ be concave, associated once again with the lack of stability, still limits its applicability. An important difference between the strong and weak formalisms is that, in the latter case, a precise and reasonably general criterion ensuring its validity is known. We are referring to a version of the Ellis theorem, one of the fundamental results in the theory of large deviations, which is recalled below in a simplified form.

THEOREM 1.9.– If $\tau(q) = \lim_{n \to \infty} \tau_n(q)$ exists as a limit (rather than a lower limit), and if it is differentiable, then $f_g = f_l$.

When $f_g$ is not concave, it cannot equal $f_l$, but the following result holds.

THEOREM 1.10.– If $f_g$ is equal to $-\infty$ outside a compact set, then $f_l$ is its concave hull, i.e.:

\[ f_l = (f_g)^{**} \]

This theorem makes it possible to measure precisely the information which is lost when $f_g$ is replaced by $f_l$. See [LEV 98b] for related results.
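The loss of information described by Theorem 1.10 can be visualized on a small discrete example. The sketch below applies a grid version of the Legendre transform twice to a hypothetical two-humped spectrum; the sample values, the grid of q values and the function name are all assumptions made for illustration, and a finite grid only approximates the true transform.

```python
def legendre(xs, fs, slopes):
    """Concave Legendre transform on a grid: g(s) = min over x of (s * x - f(x))."""
    return [min(s * x - f for x, f in zip(xs, fs)) for s in slopes]

alphas = [0.0, 0.5, 1.0, 1.5, 2.0]
f_g = [0.0, 0.8, 0.4, 0.8, 0.0]            # hypothetical non-concave spectrum (dip at alpha = 1)
qs = [k / 10 for k in range(-40, 41)]       # grid of q values on [-4, 4], includes q = 0

tau = legendre(alphas, f_g, qs)             # tau = (f_g)*
f_l = legendre(qs, tau, alphas)             # f_l = tau* = (f_g)**
```

In this toy case the double transform fills the dip at α = 1 (the value 0.4 is replaced by 0.8) and leaves the other samples unchanged, which is exactly the part of the spectrum that the Legendre approach cannot see.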



Continuous spectra

It is possible to prove that the previous relation is still valid in the more subtle case of continuous spectra [LEV 04b, TRI 99]. These continuous spectra constitute a generalization of $f_g$ that allows us to avoid choosing a partition. As already mentioned, this choice is not neutral and different partitions will, in general, lead to different spectra. To begin with, interval families are defined:

\[ R_\eta = \{u \text{ interval of } [0, 1] \text{ such that } |u| = \eta\} \]

\[ R_\eta(\alpha) = \{u : |u| = \eta,\ A(u) = \alpha\} \]

\[ R_\eta^\varepsilon(\alpha) = \{u : |u| = \eta,\ |A(u) - \alpha| \leq \varepsilon\} \]

DEFINITION 1.6 (continuous large deviation spectra).–

\[ f_g^c(\alpha) = \lim_{\varepsilon \to 0} \limsup_{\eta \to 0} \frac{\log\big(\frac{1}{\eta}\, |\cup R_\eta^\varepsilon(\alpha)|\big)}{\log(1/\eta)} \]

\[ \tilde{f}_g^c(\alpha) = \limsup_{\eta \to 0} \frac{\log\big(\frac{1}{\eta}\, |\cup R_\eta(\alpha)|\big)}{\log(1/\eta)} \]

Note that $f_g^c(\alpha)$ is defined similarly to $f_g$, except that all the intervals of a given length $\eta$ are considered where the variation of $X$ is of the order of $\eta^{\alpha \pm \varepsilon}$, rather than only dyadic intervals. Since the number of these intervals may be infinite, we replace $N_n^\varepsilon(\alpha)$ with a measure of the average length, i.e. $\frac{1}{\eta} |\cup R_\eta^\varepsilon(\alpha)|$. Within this continuous framework, $R_\eta(\alpha)$ will, in general, be non-empty for infinitely many values of $\alpha$ and not only for at most $2^n$ values, as is the case for $N_n^0(\alpha)$: it is thus possible to get rid of $\varepsilon$ and define the new spectrum $\tilde{f}_g^c$.

Legendre transform of continuous spectra

For any family $R$ of intervals, $\cup R$ denotes the union of all the intervals of $R$. A packing of $R$ is a sub-family composed of disjoint intervals.

DEFINITION 1.7 (Legendre continuous spectrum).– For any real $q$, let:

\[ H^q(R) = \sup\Big\{ \sum_{u \in R'} |u|^{qA(u)} : R' \text{ is a packing of } R \Big\} \]

and:

\[ J_\eta^q = \sup_\alpha H^q\big(R_\eta(\alpha)\big) \]

Set:

\[ \tau^c(q) = \liminf_{\eta \to 0} \frac{\log H^q(R_\eta)}{\log \eta} \]

\[ \tilde{\tau}^c(q) = \liminf_{\eta \to 0} \frac{\log J_\eta^q}{\log \eta} \]

and finally: $f_l^c = (\tau^c)^*$ and $\tilde{f}_l^c = (\tilde{\tau}^c)^*$. Here are the main properties of $f_g^c$, $\tilde{f}_g^c$, $f_l^c$ and $\tilde{f}_l^c$.

PROPOSITION 1.10.–
– $f_l^c$ and $\tilde{f}_l^c$ are concave functions;
– for any $\alpha$, $\tilde{f}_g^c(\alpha) \leq f_g^c(\alpha)$ and $f_g(\alpha) \leq f_g^c(\alpha)$;
– if $\mu$ is a multinomial measure (see section 1.4.6), then $f_g^c(\alpha) = \tilde{f}_g^c(\alpha) = f_l^c(\alpha) = \tilde{f}_l^c(\alpha) = f_g(\alpha)$.

THEOREM 1.11.– If $f_g^c$ (respectively $\tilde{f}_g^c$) is equal to $-\infty$ outside a compact interval, then, for any $\alpha$:

\[ f_l^c(\alpha) = (f_g^c)^{**}(\alpha) \]

\[ \tilde{f}_l^c(\alpha) = (\tilde{f}_g^c)^{**}(\alpha) \]

PROPOSITION 1.11.–
– $\tau^c$ and $\tilde{\tau}^c$ are increasing and concave functions;
– $\tau^c(0) = \tilde{\tau}^c(0) = -\Delta(\mathrm{supp}(X))$;
– if $X$ is a probability measure, then $\tau^c(1) = \tilde{\tau}^c(1) = 0$;
– $\tau^c(q) = \liminf_{n \to \infty} \log H^q(R_{\eta_n}) / \log \eta_n$, where $(\eta_n)$ is a sequence tending to zero such that $\log \eta_n / \log \eta_{n+1} \to 1$ when $n \to \infty$. The same is true for $\tilde{\tau}^c$.

The last property is important in numerical applications: it means that $\tau^c$ and $\tilde{\tau}^c$ may be estimated by using discrete sequences of the type $\eta_n = 2^{-n}$.

Kernel method

A second method to estimate $f_g$, which does not assume that the weak formalism is true (and thus in particular allows us to obtain non-concave spectra), is based on the following. Let $K$ denote the “rectangular kernel”, i.e. $K(x) = 1$ for $x \in [-1, 1]$ and $K(x) = 0$ elsewhere. Let $K_\varepsilon(t) = \frac{1}{\varepsilon} K\big(\frac{t}{\varepsilon}\big)$. Then, by definition, $N_n^\varepsilon(\alpha) = 2^{n+1} \varepsilon\, K_\varepsilon * \rho_n(\alpha)$, where the symbol $*$ represents convolution. It is not hard to check that replacing $K$ by any positive kernel with compact support whose



integral equals 1 in the definition of $N_n^\varepsilon(\alpha)$ will not change the value of $f_g$. A basic idea is then to use a more regular kernel than the rectangular one to improve the estimation. A more elaborate approach is to use ideas from density estimation to try and remove the double limit in the definition of $f_g$: this is performed by choosing $\varepsilon$ to be a function of $n$ in such a way that appropriate convergence properties are obtained [LEV 96b]. We may, for instance, show the following result:

PROPOSITION 1.12.– Assume that the studied signal is a finite sum of multinomial measures (see section 1.4.6). Let:

\[ f_n^\varepsilon(\alpha) = \frac{\log N_n^\varepsilon(\alpha)}{\log \nu_n} \]

Then, if $\varepsilon_n$ is a sequence such that $\lim_{n \to \infty} \frac{\varepsilon_n}{\log(\nu_n)} = c$, where $c$ is a positive constant:

\[ \lim_{n \to \infty} \sup_\alpha \big| f_n^{\varepsilon_n}(\alpha) - f_g(\alpha) \big| = 0 \]
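The idea of swapping the rectangular kernel for a smoother one can be sketched as follows. This is a simplified illustration, not the estimator of [LEV 96b]: the weighted count is taken as the sum of K((α − α_n^k)/ε) over the grain exponents, so that the rectangular kernel reproduces N_n^ε(α) exactly, and the tolerance ε is left as a free parameter rather than tied to n. The triangular kernel and all names are assumptions.

```python
import math

def rectangular(x):
    """Rectangular kernel: 1 on [-1, 1], 0 elsewhere."""
    return 1.0 if abs(x) <= 1.0 else 0.0

def triangular(x):
    """A more regular positive kernel with compact support [-1, 1] and integral 1."""
    return max(0.0, 1.0 - abs(x))

def kernel_spectrum(alphas, alpha, eps, kernel):
    """log(sum_k K((alpha - alpha_k) / eps)) / log(nu_n); -inf if the weighted count is zero."""
    weight = sum(kernel((alpha - a) / eps) for a in alphas)
    if weight <= 0.0:
        return float("-inf")
    return math.log(weight) / math.log(len(alphas))

# Hypothetical grain exponents at level n = 4 (nu_n = 16).
alphas = [1.0] * 12 + [0.5] * 4
hard = kernel_spectrum(alphas, 0.525, 0.05, rectangular)   # counts the four exponents at 0.5
soft = kernel_spectrum(alphas, 0.525, 0.05, triangular)    # same exponents, down-weighted
```

The smoother kernel gives less weight to exponents near the edge of the tolerance window, which tends to reduce the staircase effect of the hard count when α is scanned over a fine grid.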

Even without making such a strong assumption on the signal structure, it is still possible, in certain cases, to obtain convergence results, as above, with $\varepsilon = \varepsilon(n)$, using more sophisticated statistical techniques. To conclude our presentation of multifractal spectra, let us emphasize that no spectrum is “better” than the others in all respects. All the spectra give similar but different information. As was already observed for the local regularity measures, each one has advantages and drawbacks, and the choice has to be made in view of the considered application. If we are interested in the geometric aspects, a dimension spectrum should be favored. The large deviation spectrum will be used in statistical and most signal processing applications. When the number of data is small or if it appears that the estimations are not reliable, we will resort to the Legendre spectrum. To be able to compare the different information and also to assess the quality of the estimations, it is important to have general theoretical relations between the spectra at our disposal. It is remarkable that, as seen above, such relations exist under rather weak hypotheses.

1.4.5. Refinements: analysis of the sequence of capacities, mutual analysis and multisingularity

In this section, some refinements of multifractal analysis, useful in applications, will be briefly discussed. The first refinement stems from the following consideration. Assume we are interested in the analysis of road traffic and that the signal $X(k)$ at hand is the flow



per hour, i.e. the number of vehicles crossing a given small section of road during a fixed time interval (often of the order of a minute).² In this case, each point of the signal is not a pointwise value but corresponds to a space-time integral. We would thus be inclined to model such data by a measure and carry out multifractal analysis of this measure. However, it appears that, if we want to anticipate congestions, the relevant quantity to consider for a given time interval $T_n$ whose length $n$ is large as compared to one minute is not the sum of the individual flows $X(k)$ for $k$ in $T_n$, but the maximum of these flows. Hence, instead of one signal, we would rather consider a sequence of signals, $X_n$, each yielding the maximum flow at the time scale $n$: $X_n(j) = \max_{k \in T_n} X(k)$. At each scale $n$, the signal $X_n$ is a set function (i.e. it does not give pointwise values but averages). However, the $X_n$ are not measures, as they are not additive: the maximum over the union of two disjoint intervals $T^1$ and $T^2$ is not, in general, the sum of the maxima over $T^1$ and $T^2$. Nonetheless, each $X_n$ possesses some regularity properties which allow us to model it as a Choquet capacity. Thus, we are led to generalize multifractal analysis so as to no longer process one signal, which would be a pointwise function or a measure, but a sequence of signals which are Choquet capacities. A number of other examples exist, particularly in image analysis, where such a generalization is found necessary (see [LEV 96a]). We will not give here the definition of a Choquet capacity, but we stress that the generalization of multifractal analysis to sequences of Choquet capacities, at first seemingly abstract, is in fact very simple. Indeed, nothing in the definition of multifractal analysis implies that the same signal has to be considered at each resolution, nor that the signal must be a function or a measure. In particular, the relations between the spectra are preserved in this generalization [LEV 98b].
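The sequence of maximum-flow signals is easy to build, and a toy example makes the lack of additivity explicit. The vehicle counts and the function name below are hypothetical.

```python
def block_maxima(flows, n):
    """X_n(j) = max of X(k) over the j-th block T_n of n consecutive samples."""
    return [max(flows[j:j + n]) for j in range(0, len(flows) - n + 1, n)]

flows = [3, 7, 2, 5, 9, 1, 4, 4]         # hypothetical vehicle counts per minute
x2 = block_maxima(flows, 2)               # [7, 5, 9, 4]
x4 = block_maxima(flows, 4)               # [7, 9]

# The maximum over a block of four samples is the maximum, not the sum,
# of the maxima over its two halves: the set function is not additive.
additive = x4[0] == x2[0] + x2[1]         # False
max_stable = x4[0] == max(x2[0], x2[1])   # True
```

Each list `x2`, `x4`, and so on plays the role of one signal X_n in the sequence to which the generalized analysis is applied.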
A different generalization consists of noting that, in the usual formulation of multifractal analysis, the Lebesgue measure $L$ plays a particular role which may not be desirable. Let us for instance consider the definition of the grain exponent. The logarithm of the measure of the interval $I_n^k$ is compared to that of $|I_n^k|$, which is nothing but the Lebesgue measure of $I_n^k$. In the same way, when we define $f_h$, the Hausdorff dimension is calculated with respect to $L$. However, it is a classical fact that we may define an $s$-dimensional Hausdorff measure with respect to an arbitrary non-atomic³ measure $\mu$ by replacing the sum $\sum |U_j|^s$ with $\sum \mu(U_j)^s$. As a matter of fact, once a non-atomic reference measure $\mu$ has been chosen, we may rewrite all the definitions of multifractal analysis (Hölder exponent, grain exponent, spectra, etc.) by replacing $L$ with $\mu$. If this analysis is applied to a signal $X$ (which may be a function, a measure or a capacity), we obtain the multifractal analysis of $X$ with respect to $\mu$. This type of analysis is called mutual multifractal analysis. There are several benefits to this

2. A multifractal analysis of traffic in Paris is presented in [VOJ 94].

3. A measure is non-atomic if it attributes zero mass to all singletons. The reasons for which one has to restrict to such measures go beyond the scope of this chapter; see [LEV 98b] for more details.



generalization. Let us briefly illustrate some of them through two examples. Assume that the signal to be studied $X$ is equal to $Y + Z$, where $Y$ and $Z$ are two measures supported respectively on $[0, \frac{1}{2}]$ and on $[\frac{1}{2}, 1]$. Suppose that $f_{h,L}^Y(\alpha) \leq f_{h,L}^Z(\alpha)$ for all $\alpha$, where $f_{h,L}^T$ is the Hausdorff spectrum of $T$ with respect to $L$. Then, it is easy to see that $f_{h,L}^X(\alpha) = f_{h,L}^Z(\alpha)$ for any $\alpha$. The $Y$ component is not detected by the analysis. On the contrary, if we calculate the spectra with respect to the reference measure $Z$, then, in general, $f_{h,Z}^X(\alpha) \neq f_{h,Z}^Z(\alpha)$, since $f_{h,Z}^Z(\alpha)$ will be equal to 1 for $\alpha = 1$ and to $-\infty$ everywhere else. A change in the reference measure thus allows us to carry out a more accurate analysis, by shedding light on possibly hidden components. As a second example, consider an application in image analysis. It is possible to use mutual multifractal analysis to selectively detect certain types of changes in image sequences (for example, the appearance of manufactured objects in sequences of aerial images [CAN 96]). The idea consists of choosing the first image of the sequence as the reference measure and then analyzing each following image with respect to it. In the absence of any change, the mutual spectrum will be equal to 1 for $\alpha = 1$ and to $-\infty$ everywhere else. A change will lead to a spreading of this spectrum and it is possible to classify changes according to the corresponding values of the pair $(\alpha, f(\alpha))$ (see Chapter 11). Finally, a third refinement consists of carrying out a multifractal analysis “at a point”, using 2-microlocal analysis: the so-called 2-microlocal formalism allows us to define a function, corresponding to $f_g$, which completely describes the singular behavior of a signal in the vicinity of any point. Such an analysis provides in particular ways to modify pointwise regularity in a powerful manner.

1.4.6. The multifractal spectra of certain simple signals

The paradigmatic example of multifractal measures is the Besicovitch measure, often called, in this context, the “binomial measure”. For $x$ in $[0, 1]$, write:

\[ x = \sum_{i=1}^{\infty} x_i 2^{-i} \quad \text{where } x_i \in \{0, 1\} \]

and:

\[ \Phi_0^n(x) = \frac{1}{n} \sum_{i=1}^{n} (1 - x_i), \qquad \Phi_1^n(x) = \frac{1}{n} \sum_{i=1}^{n} x_i \]

According to (1.13):

\[ \alpha(u_n(x)) = -\Phi_0^n(x) \log_2 p - \Phi_1^n(x) \log_2 (1 - p) \]

Write $\liminf_{n \to \infty} \Phi_0^n(x) =: \Phi_0(x)$ and $\Phi_1(x) := 1 - \Phi_0(x)$. We obtain:

\[ \alpha_*(x) = -\Phi_0(x) \log_2 p - \Phi_1(x) \log_2 (1 - p) \]



The sets $E_\alpha$ are the sets of points of [0, 1] having a given proportion $\Phi_0(x)$ of 0s in their base-2 expansion. Calculations similar to those in Example 1.6 lead to:

\[ \dim E_\alpha = -\Phi_0 \log_2 \Phi_0 - \Phi_1 \log_2 \Phi_1 \]

The Hausdorff spectrum is thus given in parametric form by:

\[ \alpha(\Phi_0) = -\Phi_0 \log_2 p - \Phi_1 \log_2 (1 - p), \qquad f_h(\alpha) = -\Phi_0 \log_2 \Phi_0 - \Phi_1 \log_2 \Phi_1 \]

Note also that the set of points in (0, 1) for which $\Phi_0^n(x)$ does not converge has Hausdorff dimension equal to 1, although it has zero Lebesgue measure: indeed, the strong law of large numbers entails that, for $\mathcal{L}$-almost every x, $\Phi_0(x) = \Phi_1(x) = \tfrac{1}{2}$. An immediate consequence is that, for $\alpha_m := -\tfrac{1}{2} \log_2 \big(p(1 - p)\big)$, $f_h(\alpha_m) = 1$.

Now consider the grain exponents. We observe that $\mathcal{L}$-a.s., $\alpha_n^k \to \alpha_m$. The function $f_g(\alpha)$ will measure the speed at which $\Pr(|\alpha_n^k - \alpha_m| > \varepsilon)$ tends to 0 for $\varepsilon > 0$ when $n \to \infty$. For fixed n there are exactly $C_n^k$ intervals $I_n^j$ such that $\Phi_0^n(x) = k/n$ for $x \in I_n^j$. This makes it possible to evaluate $N_n^\varepsilon(\alpha)$, which is equal to $C_n^k$ for $\varepsilon$ sufficiently small and $\alpha$ close to $\alpha(\Phi_0 = k/n)$. Using Stirling’s formula to estimate $C_n^k$, we can then obtain $f_g(\alpha)$. However, these calculations are somewhat tedious, in particular due to the double limit, and it is much faster to evaluate $f_l(\alpha)$ and use the general results on the relationships between the three spectra. By definition:

\[ \tau(q) = \liminf_{n\to\infty} \frac{\log_2 \sum_{j=0}^{2^n - 1} \mu(I_n^j)^q}{-n} \]

Now, $\mu(I_n^j) = p^k (1 - p)^{n-k}$ exactly $C_n^k$ times, for $k = 0, \ldots, n$. As a consequence:

\[ \sum_{j=0}^{2^n - 1} \mu(I_n^j)^q = \sum_{k=0}^{n} C_n^k \, p^{kq} (1 - p)^{(n-k)q} = \big( p^q + (1 - p)^q \big)^n \]

and:

\[ \tau(q) = -\log_2 \big( p^q + (1 - p)^q \big) \]

A simple Legendre transform calculation then shows that $f_l(\alpha) = \tau^{*}(\alpha)$ is equal to $f_h(\alpha)$. Since $f_h \le f_g \le f_l$ is always true, it follows that the strong multifractal formalism holds in the case of the binomial measure: $f_h = f_g = f_l$.
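The computation above can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are illustrative, and it assumes the convention (consistent with the expression of $\alpha$ above) that the digit 0, i.e. the left half of each dyadic interval, receives the weight p. It builds the masses of the $2^n$ dyadic intervals, and evaluates one point of the Legendre spectrum at $\alpha = \tau'(q)$ so that it can be compared with the parametric Hausdorff spectrum at $\Phi_0 = p^q / (p^q + (1-p)^q)$.

```python
import numpy as np

def binomial_masses(p, n):
    """Masses of the 2**n dyadic intervals of the binomial measure at level n."""
    masses = np.array([1.0])
    for _ in range(n):
        # Each interval splits in two: the left child (digit 0) gets a
        # fraction p of the mass, the right child (digit 1) gets 1 - p.
        masses = np.column_stack((p * masses, (1.0 - p) * masses)).ravel()
    return masses

def tau(q, p):
    """Closed form tau(q) = -log2(p**q + (1 - p)**q)."""
    return -np.log2(p**q + (1.0 - p)**q)

def legendre_point(q, p):
    """Point (alpha, f_l(alpha)) with alpha = tau'(q), f_l = q*alpha - tau(q)."""
    s = p**q + (1.0 - p)**q
    alpha = -(p**q * np.log2(p) + (1.0 - p)**q * np.log2(1.0 - p)) / s
    return alpha, q * alpha - tau(q, p)

def parametric_point(phi0, p):
    """Point (alpha, f_h(alpha)) of the parametric Hausdorff spectrum."""
    phi1 = 1.0 - phi0
    alpha = -phi0 * np.log2(p) - phi1 * np.log2(1.0 - p)
    f_h = -phi0 * np.log2(phi0) - phi1 * np.log2(phi1)
    return alpha, f_h

if __name__ == "__main__":
    p, n, q = 0.3, 10, 2.5
    mu = binomial_masses(p, n)
    # Partition sum versus its closed form (p**q + (1 - p)**q)**n.
    print(np.sum(mu**q), (p**q + (1.0 - p)**q)**n)
    phi0 = p**q / (p**q + (1.0 - p)**q)
    print(legendre_point(q, p), parametric_point(phi0, p))
```

At q = 0 the sketch returns $\alpha = \alpha_m$ and a spectrum value of 1, in agreement with $f_h(\alpha_m) = 1$.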

Figure 1.7. Spectrum of the binomial measure

Figure 1.8. Spectrum of the sum of two binomial measures

Let us note, however, that simple operations will break the formalism: for instance, the sum of two Besicovitch measures on [0, 1], or the lumping of two such measures with disjoint supports, will no longer satisfy $f_h = f_g = f_l$ (see Figures 1.7 and 1.8).

There are many generalizations of the Besicovitch measure. We can replace base 2 by base $b > 2$, i.e. partition [0, 1] into b sub-intervals. We then speak of multinomial measures. We may also distribute the measure on $b' < b$ intervals, in which case the support of the measure will be a Cantor set. Another variation is to consider stochastic constructions by choosing the partitioning and/or p as random variables following certain probability laws. An example of such a stochastic Besicovitch measure, where, at each scale, p is chosen as an iid lognormal random variable, is represented in Figure 1.9. This type of measure can serve as a starting point for the modeling of certain Internet traffic.

Figure 1.9. A random binomial measure

Essentially the same analysis as above applies to the self-affine functions defined in Example 1.7. Similar parametric formulae are obtained for the spectra, which also coincide in this case. Let us mention as a final note that such measures and functions have a Hölder function which is everywhere discontinuous and has level lines which are everywhere dense. We have $\alpha_l(x) = \alpha_0$ for all x, where $\alpha_0 = \min_t \alpha_p(t)$.

Let us now consider the Weierstrass function:

\[ W(x) = \sum_{n=0}^{\infty} \omega^{-nH} \cos(\omega^n x) \]

As was already mentioned, $\alpha(t) = H$ for any t for W. As a consequence:

\[ f_h(H) = 1, \qquad f_h(\alpha) = -\infty \ \text{if } \alpha \ne H \]
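As an illustration of the role of H, here is a minimal numerical sketch under stated assumptions: the series is truncated at a hypothetical `n_terms`, the base `omega` is taken as an integer so that W is $2\pi$-periodic, and the root-mean-square increment over random positions is used as a convenient stand-in for the increment size. This does not compute a Hölder exponent; it only checks that increments at lag $\varepsilon$ scale roughly as $\varepsilon^H$.

```python
import numpy as np

def weierstrass(x, H, omega=2.0, n_terms=30):
    """Truncated Weierstrass sum: sum_{n=0}^{n_terms} omega**(-n*H) * cos(omega**n * x)."""
    x = np.asarray(x, dtype=float)
    n = np.arange(n_terms + 1)[:, None]          # column of term indices
    return np.sum(omega ** (-n * H) * np.cos(omega ** n * x[None, :]), axis=0)

def rms_increment_slope(H, omega=2.0, n_terms=30, n_points=20000, seed=0):
    """Log-log slope of the RMS increment of W versus the lag eps."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 2.0 * np.pi, n_points)  # random positions over one period
    w0 = weierstrass(x, H, omega, n_terms)
    lags = omega ** (-np.arange(4.0, 13.0))      # eps = omega**-4, ..., omega**-12
    rms = [np.sqrt(np.mean((weierstrass(x + eps, H, omega, n_terms) - w0) ** 2))
           for eps in lags]
    slope, _ = np.polyfit(np.log(lags), np.log(rms), 1)
    return slope

if __name__ == "__main__":
    for H in (0.3, 0.5):
        print(H, rms_increment_slope(H))
```

The lags are chosen as powers of `omega` so that the log-periodic modulation of the increments does not bias the fitted slope.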

The value of the large deviation spectrum depends on the choice of function $V_W(u)$. Taking $V_W(u) = \beta(W, u)$ leads to $f_g = f_h$. However, if $V_W(u)$ is defined to be the increment of W in u, then, for certain values of $\omega$:

\[ f_g(\alpha) = \begin{cases} -\infty & \text{if } \alpha < H \\ H + 1 - \alpha & \text{if } H \le \alpha \le H + 1 \\ -\infty & \text{if } \alpha > H + 1 \end{cases} \]

The heuristic explanation of the fact that $f_g$ is positive for some values of $\alpha$ larger than H is as follows: at each finite resolution $\varepsilon$, the increments of W in intervals of size $\varepsilon$ are at most of the order of $\varepsilon^H$, because the exponent is defined as a lim inf. However, there will also exist intervals for which the increments are smaller, yielding a larger observed grain exponent. The $f_g$ spectrum does in fact measure the speed at which the probability of finding one of these smoother increments tends to 0 when $\varepsilon$ tends to 0. To conclude, we note that, in both cases (oscillations and increments), $f_g$ is concave and coincides with $f_l$.

Similar results hold for fractional Brownian motion. Of course, since fractional Brownian motion is a stochastic process, its spectrum is a priori a random function. However, we can show that, with probability 1, $f_h$ and $f_g$ (defined either using oscillations or increments) are exactly the same as those given above for the Weierstrass function.

1.4.7. Two applications

To conclude this chapter, we briefly mention two applications of multifractal analysis to signal and image processing. Our intention is not to go into the details of these applications (they are developed in Chapters 11 and 12), but to indicate, in a simplified way, how the tools introduced above are put into practice.

1.4.7.1. Image segmentation

The issue of edge detection in images allows us to illustrate in a concrete way the relevance of multifractal spectra, as well as the difference between $f_h$ and $f_g$. We have observed above that edge points are irregular points, but that they cannot be characterized by a universal value of $\alpha_p$, since non-linear transformations of an image may leave the contours unchanged while modifying the exponents. To characterize edges, it is necessary to include higher level information.
This information may be obtained from the following obvious comment: (smooth) edges of an image form a set of lines which is of dimension one. Looking for contours thus means looking for sets of points which are characterized by specific values of $\alpha_p$ (local regularity criterion), and such that their associated dimension is 1 (global criterion). In other words, we will characterize edges as those points possessing an exponent which belongs to $f_h^{-1}(1)$. This provides a geometric characterization of edges: on the one hand, we rely on pointwise exponents, which measure the regularity at infinite resolution; on the other hand, we use $f_h$, which is a dimension spectrum.

However, it is also possible to follow a statistical approach: assume we consider a very simple image which contains only a black line, the “edge”, drawn on a white background. If we draw a point uniformly at random in the image at infinite resolution, the probability of hitting the line is zero. However, at any finite resolution, where the image is made of, say, $2^n \times 2^n$ pixels, the probability of hitting the edge is of the order $2^{-n}$, since the edge contains $2^n$ pixels. According to the definition of $f_g$ (recall in particular (1.27)), we see that, on the black line, $f_g(\alpha) = 1$. In this approach, we thus characterize an edge point as a point possessing a singularity $\alpha$ whose probability of occurrence decreases as $2^{-n}$ when the resolution tends to infinity. When the multifractal formalism holds, the geometric and statistical approaches yield the same result. For more details, see Chapter 11 and [LEV 96a].

1.4.7.2. Analysis of TCP traffic

Our second application deals with the modeling and analysis of TCP traffic. Here the situation is a little different from that of the previous application: contrary to typical images, TCP traffic possesses, under certain conditions, a multifractal structure. In the language of the discussion at the end of the introduction to this chapter, we here use fractal methods to study a fractal object. This will entail few changes as far as the analysis of the data is concerned. However, dealing with a multifractal signal brings up the question of the source of this multifractality, and thus of a model capable of explaining this phenomenon. This issue will not be tackled here. See Chapter 12 and [LEV 97, LEV 01, RIE 97, BLV 01].

What is the use of carrying out a multifractal analysis of TCP?
First of all, the range of values taken by the Hölder exponents provides important information on the small-scale behavior of traffic. The smaller $\alpha$ is, the more sporadic traffic will be, which means that variations on short time intervals will be significant. The spectrum also allows us to elucidate what is typical behavior, i.e. the value $\alpha_0$ such that $f_g(\alpha_0) = 1$: with high probability, the variation of traffic between two close time instants $(t_1, t_2)$ will be of the order $|t_2 - t_1|^{\alpha_0}$. While this typical behavior is important for the understanding and the management of the network, it is also useful to know which other variations may occur, and with what probabilities. This is exactly the information provided by $f_g$. Thus, the whole large deviation spectrum is useful in this application.

Let us note that, in contrast, the Hausdorff spectrum is probably less well suited here: first, because the relevant physical quantities are increments at different time scales, small but finite; there is no notion of regularity at infinite resolution, as is the case with images. Second, the relevant information is statistical in nature rather than geometric.

To conclude, let us mention that the large deviation spectrum of certain TCP traces, as estimated by the kernel method, displays a shape reminiscent of that of the sum of two binomial measures. This provides useful information on the fine structure of these traces. In particular, $f_g$ is not concave. This shows the advantage of having procedures available to estimate spectra without making the assumption that a multifractal formalism is satisfied.

1.5. Bibliography

[BLV 01] BARRAL J., LÉVY VÉHEL J., “Multifractal analysis of a class of additive processes with correlated nonstationary increments”, Electronic Journal of Probability, vol. 9, p. 508–543, 2001.
[BON 86] BONY J., “Second microlocalization and propagation of singularities for semilinear hyperbolic equations”, in Hyperbolic equations and related topics (Katata/Kyoto, 1984), Academic Press, Boston, Massachusetts, p. 11–49, 1986.
[BOU 28] BOULIGAND G., “Ensembles impropres et nombre dimensionnel”, Bull. Soc. Math., vol. 52, p. 320–334 and 361–376, 1928 (see also Les définitions modernes de la dimension, Hermann, 1936).
[BOU 00] BOUSCH T., HEURTEAUX Y., “Caloric measure on the domains bounded by Weierstrass-type graphs”, Ann. Acad. Sci. Fenn. Math., vol. 25, p. 501–522, 2000.
[CAN 96] CANUS C., LÉVY VÉHEL J., “Change detection in sequences of images by multifractal analysis”, in ICASSP’96 (Atlanta, Georgia), 1996.
[DAO 02] DAOUDI K., LÉVY VÉHEL J., “Signal representation and segmentation based on multifractal stationarity”, Signal Processing, vol. 82, no. 12, p. 2015–2024, 2002.
[DUB 89] DUBUC B., TRICOT C., ROQUES-CARMES C., ZUCKER S., “Evaluating the fractal dimension of profiles”, Physical Review A, vol. 39, p. 1500–1512, 1989.
[GRE 85] GREBOGI C., MCDONALD S., OTT E., YORKE J., “Exterior dimension of large fractals”, Physics Letters A, vol. 110, p. 1–4, 1985.
[HAR 16] HARDY G., “Weierstrass non-differential function”, American Mathematical Society Translations, vol. 17, p. 301–325, 1916.
[HAU 19] HAUSDORFF F., “Dimension und äusseres Mass”, Math. Ann., vol. 79, p. 157–179, 1919.
[JAF 96] JAFFARD S., MEYER Y., “Wavelet methods for pointwise regularity and local oscillations of functions”, Mem. Amer. Math. Soc., vol. 123, 1996.
[KAH 63] KAHANE J., SALEM R., Ensembles parfaits et séries trigonométriques, Hermann, 1963.
[KOL 61] KOLMOGOROV A., TIHOMIROV V., “Epsilon-entropy and epsilon-capacity of sets in functional spaces”, American Mathematical Society Translations, vol. 17, p. 277–364, 1961.
[KOL 01] KOLWANKAR K., LÉVY VÉHEL J., “Measuring functions smoothness with local fractional derivatives”, Frac. Calc. Appl. Anal., vol. 4, no. 3, p. 285–301, 2001.
[KOL 02] KOLWANKAR K., LÉVY VÉHEL J., “A time domain characterization of the fine local regularity of functions”, J. Fourier Anal. Appl., vol. 8, no. 4, p. 319–334, 2002.
[LEV 96a] LÉVY VÉHEL J., “Introduction to the multifractal analysis of images”, in FISHER Y. (Ed.), Fractal Image Encoding and Analysis, Springer-Verlag, 1996.
[LEV 96b] LÉVY VÉHEL J., “Numerical computation of the large deviation multifractal spectrum”, in CFIC (Rome, Italy), 1996.
[LEV 97] LÉVY VÉHEL J., RIEDI R., “Fractional Brownian motion and data traffic modeling: The other end of the spectrum”, in LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals in Engineering, Springer-Verlag, 1997.
[LEV 98a] LÉVY VÉHEL J., GUIHENEUF B., “2-Microlocal analysis and applications in signal processing”, in International Wavelets Conference (Tangier), 1998.
[LEV 98b] LÉVY VÉHEL J., VOJAK R., “Multifractal analysis of Choquet capacities”, Advances in Applied Mathematics, vol. 20, no. 1, p. 1–43, 1998.
[LEV 01] LÉVY VÉHEL J., SIKDAR B., “A multiplicative multifractal model for TCP traffic”, in ISCC’2001 (Tunisia), 2001.
[LEV 02] LÉVY VÉHEL J., “Multifractal processing of signals”, forthcoming.
[LEV 04a] LÉVY VÉHEL J., SEURET S., “The 2-microlocal formalism”, in Fractal Geometry and Applications: A Jubilee of Benoit Mandelbrot, Proc. Sympos. Pure Math., PSPUM, vol. 72, Part 2, p. 153–215, 2004.
[LEV 04b] LÉVY VÉHEL J., “On various multifractal spectra”, in BANDT C., MOSCO U., ZÄHLE M. (Eds.), Fractal Geometry and Stochastics III, Progress in Probability, Birkhäuser Verlag, vol. 57, p. 23–42, 2004.
[MCM 84] MCMULLEN C., “The Hausdorff dimension of general Sierpinski carpets”, Nagoya Mathematical Journal, vol. 96, p. 1–9, 1984.
[RIE 97] RIEDI R., LÉVY VÉHEL J., TCP traffic is multifractal: a numerical study, Technical Report RR-3129, INRIA, 1997.
[ROU 98] ROUEFF F., LÉVY VÉHEL J., “A regularization approach to fractional dimension estimation”, in Fractals’98 (Malta), 1998.
[SEU 02] SEURET S., LÉVY VÉHEL J., “The local Hölder function of a continuous function”, Appl. Comput. Harmon. Anal., vol. 13, no. 3, p. 263–276, 2002.
[TRI 82] TRICOT C., “Two definitions of fractal dimension”, Math. Proc. Camb. Phil. Soc., vol. 91, p. 57–74, 1982.
[TRI 86a] TRICOT C., “Dimensions de graphes”, Comptes rendus de l’Académie des sciences de Paris, vol. 303, p. 609–612, 1986.
[TRI 86b] TRICOT C., “The geometry of the complement of a fractal set”, Physics Letters A, vol. 114, p. 430–434, 1986.
[TRI 87] TRICOT C., “Dimensions aux bords d’un ouvert”, Ann. Sc. Math. Québec, vol. 11, no. 1, p. 205–235, 1987.
[TRI 88] TRICOT C., QUINIOU J., WEHBI D., ROQUES-CARMES C., DUBUC B., “Evaluation de la dimension fractale d’un graphe”, Rev. Phys. Appl., vol. 23, p. 111–124, 1988.
[TRI 99] TRICOT C., Courbes et dimension fractale, Springer-Verlag, 2nd edition, 1999.
[VOJ 94] VOJAK R., LÉVY VÉHEL J., DANECH-PAJOUH M., “Multifractal description of road traffic structure”, in Seventh IFAC/IFORS Symposium on Transportation Systems: Theory and Application of Advanced Technology (Tianjin, China), p. 942–947, 1994.

Chapter 2

Scale Invariance and Wavelets

2.1. Introduction

Processes presenting “power law” spectra (often regrouped under the restrictive, but generic, term of “1/f ” processes) appear in various domains: hydrology [BER 94], finance [MAN 97], telecommunications [LEL 94, PAR 00], turbulence [CAS 96, FRI 95], biology [TEI 00] and many more [WOR 96]. The characteristics of these processes are based upon concepts such as fractality, self-similarity or long-range dependence and, even though these different notions are not equivalent, they all possess a common characteristic: that of replacing the idea of a structure related to a preferred time scale with that of an invariant relationship between different scales.

The study of scale invariant processes presents several difficulties ranging from modeling to analysis and processing, for which few tools were available until recently [BER 94]. The effective possibility of appropriately manipulating these processes has recently been reinforced by the appearance of adequate multiresolution techniques: the tools which are referred to here have been developed for this purpose. These tools are explicitly based on the theoretical as well as algorithmic potentialities offered by wavelet transforms.

Chapter written by Patrick FLANDRIN, Paulo GONÇALVES and Patrice ABRY.



2.2. Models for scale invariance

2.2.1. Intuition

From a qualitative point of view, the idea behind a 1/f spectrum involves situations where, given a signal observed in the time domain, its empirical power spectrum density behaves as $S(f) = C|f|^{-\alpha}$ with $\alpha > 0$. From a practical viewpoint, it is the equivalent form:

\[ \log S(f) = \log C - \alpha \log |f| \]

which is the most significant, since the “1/f” character is translated by a straight line in doubly logarithmic coordinates.

Generally, when dealing with physical observations, referring to some 1/f spectral behavior is only meaningful with respect to a frequency analysis band. Therefore, introducing, on the half-line of positive frequencies, two (adjustable) frequencies $f_{bf}$ and $f_{hf}$ such that $0 < f_{bf} < f_{hf} < +\infty$ leads to a classification into three regimes, according to which of the three domains defined by this partition exhibits the 1/f behavior. Let us consider each of these cases:

1) $f_{bf} \le f \le f_{hf}$: this is the context of a bandpass domain where we simply observe an algebraic decrease of the spectrum density, without a predominant frequency;

2) $f_{hf} \le f < +\infty$: the 1/f character is dominant in the high frequency limit and highlights the local regularity of the sample paths, their variability and their fractal nature;

3) $0 < f \le f_{bf}$: the power law of the spectrum intervenes here in the limit of low frequencies, resulting in a divergence at the origin of the spectrum density. If $0 < \alpha < 1$, this divergence corresponds to an algebraic decrease of the correlation function, which is slow enough for the latter not to be summable; there is long-range dependence or long memory.

In fact, these three regimes represent three different properties; hence, they have no reason to exist at the same time.
However, they possess the common denominator of being linked to an idea of scale invariance according to which – within a scale range and up to some renormalization – the properties of the whole are the same as those of the parts (self-similarity). Indeed, a power law spectrum belongs to the class of homogeneous functions. Its form therefore remains invariant under scaling in the sense that, for any $c \in \mathbb{R}^+$:

\[ S(f) = C|f|^{-\alpha} \implies S(cf) = C|cf|^{-\alpha} = c^{-\alpha} S(f) \]
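The straight-line reading in doubly logarithmic coordinates translates directly into a least-squares fit. The sketch below is only illustrative: `fit_power_law` is a hypothetical helper, the spectrum is synthetic and noise-free, and a fit on real data would have to be restricted to the frequency band where the power law is actually observed.

```python
import numpy as np

def fit_power_law(freqs, spectrum):
    """Fit log S(f) = log C - alpha * log f by least squares; return (alpha, C)."""
    slope, intercept = np.polyfit(np.log(freqs), np.log(spectrum), 1)
    return -slope, np.exp(intercept)

if __name__ == "__main__":
    freqs = np.logspace(-2, 1, 50)          # positive frequencies only
    spectrum = 3.0 * freqs ** -0.8          # S(f) = C |f|^(-alpha), C = 3, alpha = 0.8
    alpha, C = fit_power_law(freqs, spectrum)
    print(alpha, C)
    # Homogeneity: rescaling the frequency axis only rescales the spectrum.
    c = 4.0
    print(np.allclose(C * (c * freqs) ** -alpha, c ** -alpha * spectrum))
```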



Given that, through Fourier transformation, a dilation or compression in the frequency domain is translated by a corresponding compression or dilation in the time domain, it is thus legitimate to expect “1/f” processes to be closely coupled with self-similar processes.

2.2.2. Self-similarity

To be more precise [BER 94, SAM 94] (see also Chapter 5), we introduce the following definition.

DEFINITION 2.1.– A process $X = \{X(t), t \in \mathbb{R}\}$ is said to be self-similar of index $H > 0$ if, for any $c \in \mathbb{R}^+$, it satisfies the following equality in a distributional sense:

\[ \{X(t), t \in \mathbb{R}\} \overset{\mathcal{L}}{=} \{c^H X(t/c), t \in \mathbb{R}\}, \quad \forall c > 0. \tag{2.1} \]

According to this definition, a self-similar process does not possess any characteristic scale, insofar as it remains (statistically) identical to itself after any scale change. If, from a theoretical point of view, self-similarity is likely to extend from the largest to the finest scales, the above-mentioned definition must, in general, go with a scale domain (i.e., with a variation domain of the factor c) for which the invariance has a meaning. For instance, the finite duration of an observation sets the maximum attainable scale, in the same way as the finite resolution of a sensor limits the finest scale.

It is noteworthy that, if a (second-order) process is self-similar, it is necessarily non-stationary. Indeed, assuming that, at some arbitrary time instant $t_1 := 1$, the condition $\operatorname{var} X(t_1) \ne 0$ holds, it stems from Definition 2.1 that $\operatorname{var} X(t) = \operatorname{var} X(t \times 1) = t^{2H} \operatorname{var} X(t_1)$ and, as a consequence, the variance of the process depends on time. This behavior applies to all finite moments of X:

\[ E|X(t)|^q = E|X(1)|^q \, |t|^{qH}. \tag{2.2} \]
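Relation (2.2) with q = 2 can be illustrated on ordinary Brownian motion, which is self-similar with H = 1/2. The following sketch is an illustration under assumptions that are not part of the text: the process is sampled at integer times as a cumulative sum of independent Gaussian steps, and the function name, number of paths and seed are arbitrary choices.

```python
import numpy as np

def variance_scaling_exponent(n_paths=2000, n_steps=1024, seed=0):
    """Estimate H from var X(t) ~ t**(2H) on simulated Brownian paths."""
    rng = np.random.default_rng(seed)
    steps = rng.standard_normal((n_paths, n_steps))
    paths = np.cumsum(steps, axis=1)            # column j holds X(j + 1)
    times = 2 ** np.arange(4, 11)               # t = 16, 32, ..., 1024
    variances = paths[:, times - 1].var(axis=0) # empirical var X(t) across paths
    slope, _ = np.polyfit(np.log(times), np.log(variances), 1)
    return slope / 2.0                          # slope estimates 2H

if __name__ == "__main__":
    print(variance_scaling_exponent())
```

The variance grows with t, which is the non-stationarity noted above; the fitted slope in log-log coordinates estimates 2H.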

Therefore, in a strict sense, an ordinary spectrum cannot be attached to a self-similar process. Nevertheless, there exists an interesting sub-class of the class of self-similar processes, which, in a sense, could be paralleled with that of stationary processes: it is that of processes with stationary increments defined as follows [BER 94, SAM 94].

DEFINITION 2.2.– A process $X = \{X(t), t \in \mathbb{R}\}$ is said to have stationary increments if and only if, for any $\theta \in \mathbb{R}$, the law of its increment process:

\[ X^{(\theta)} := \{X^{(\theta)}(t) := X(t + \theta) - X(t), \ t \in \mathbb{R}\} \]

does not depend on t.



Figure 2.1. Path of a self-similar process. When we simultaneously apply to the sample path of a self-similar process a dilation of the time axis by a factor c and a dilation of the amplitude axis by a factor $c^{-H}$, we obtain a new sample path that is (statistically) indistinguishable from the original

In this definition, the parameter $\theta$ plays the role of a time scale according to which we study the process X. Indeed, the self-similarity of the latter is translated on its increments by:

\[ \{X^{(\theta)}(t), t \in \mathbb{R}\} \overset{\mathcal{L}}{=} \{c^H X^{(\theta/c)}(t/c), t \in \mathbb{R}\}, \quad \forall c > 0. \tag{2.3} \]

It is evidently possible to extend this definition to higher order increments (“increments of increments”). Coupling self-similarity of index H and the stationary increments property implies that the parameter H remains in the range $0 < H < 1$. Moreover, the covariance function of a process (originally centered and zero at the origin) must be of the form:

\[ E\,X(t)X(s) = \frac{\sigma^2}{2} \left( |t|^{2H} + |s|^{2H} - |t - s|^{2H} \right) \tag{2.4} \]



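with the identification $\sigma^2 := E|X(1)|^2$. Indeed, if we adopt the convention according to which at time $t = 0$, $X(0) = 0$, it follows from the assumptions made that:

\[ \begin{aligned} E\,X(t)X(s) &= \frac{1}{2} \left\{ E\,X^2(t) + E\,X^2(s) - E\big(X(t) - X(s)\big)^2 \right\} \\ &= \frac{1}{2} \left\{ E\,X^2(t) + E\,X^2(s) - E\big(X(t - s) - X(0)\big)^2 \right\} \\ &= \frac{E|X(1)|^2}{2} \left( |t|^{2H} + |s|^{2H} - |t - s|^{2H} \right), \end{aligned} \]

which explains the structure of relation (2.4). Moreover, the correlation function of the increment processes $X^{(\theta)}$ reads:

\[ \begin{aligned} E\,X^{(\theta)}(t) X^{(\theta)}(t + \tau) &= \frac{\sigma^2}{2} \left( |\tau + \theta|^{2H} + |\tau - \theta|^{2H} - 2|\tau|^{2H} \right) \\ &= \frac{\sigma^2}{2} |\tau|^{2H} \left( \left|1 + \frac{\theta}{\tau}\right|^{2H} + \left|1 - \frac{\theta}{\tau}\right|^{2H} - 2 \right). \end{aligned} \tag{2.5} \]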

It is now possible to study in depth its asymptotic behaviors in both limits of large and small $\tau$. For instance, we show that in the limit $\tau \to +\infty$ (i.e., $\tau \gg \theta$), the autocorrelation function decreases asymptotically as $\tau^{2(H-1)}$:

\[ E\,X^{(\theta)}(t) X^{(\theta)}(t + \tau) \sim \frac{\sigma^2}{2} \, 2H(2H - 1) \, \theta^2 \tau^{2(H-1)}. \tag{2.6} \]
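Relations (2.4) to (2.6) are simple enough to be checked numerically. The sketch below (function names are illustrative, not from the text) recovers (2.5) from the covariance (2.4) by bilinearity and compares it with the asymptote (2.6) for $\tau \gg \theta$.

```python
def fbm_covariance(t, s, H, sigma2=1.0):
    """Covariance (2.4) of a self-similar process with stationary increments."""
    return 0.5 * sigma2 * (abs(t) ** (2 * H) + abs(s) ** (2 * H) - abs(t - s) ** (2 * H))

def increment_covariance(tau, theta, H, sigma2=1.0):
    """Covariance (2.5) of the increment process at lag tau."""
    return 0.5 * sigma2 * (abs(tau + theta) ** (2 * H)
                           + abs(tau - theta) ** (2 * H)
                           - 2 * abs(tau) ** (2 * H))

def increment_covariance_asymptote(tau, theta, H, sigma2=1.0):
    """Large-lag asymptote (2.6), valid for tau >> theta."""
    return 0.5 * sigma2 * 2 * H * (2 * H - 1) * theta ** 2 * tau ** (2 * (H - 1))

if __name__ == "__main__":
    H, theta, t, tau = 0.7, 0.5, 1.3, 2.0
    # (2.5) from (2.4): expand E[(X(t+theta)-X(t)) (X(t+tau+theta)-X(t+tau))].
    from_24 = (fbm_covariance(t + theta, t + tau + theta, H)
               - fbm_covariance(t + theta, t + tau, H)
               - fbm_covariance(t, t + tau + theta, H)
               + fbm_covariance(t, t + tau, H))
    print(from_24, increment_covariance(tau, theta, H))
    # Ratio of (2.5) to (2.6) at tau = 100 * theta.
    print(increment_covariance(50.0, theta, H) / increment_covariance_asymptote(50.0, theta, H))
```

For H > 1/2 the factor 2H(2H − 1) is positive, for H < 1/2 it is negative, and for H = 1/2 both (2.5) and (2.6) vanish at lags larger than θ.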

By Fourier duality, this behavior induces an algebraic spectral divergence with exponent $1 - 2H$ at the origin. Self-similar processes with stationary increments are hence closely related to long-range dependent processes. In the other limit, $\tau \to 0$ (i.e., $\tau \ll \theta$), we show that for $H > \tfrac{1}{2}$:

\[ E\,X^{(\theta)}(t) X^{(\theta)}(t + \tau) \sim \sigma^2 \theta^{2H} \left( 1 - \theta^{-2H} |\tau|^{2H} \right). \tag{2.7} \]

This behavior characterizes the local regularity of each sample path of the process X. The following sections explain the notions associated with each of these limits: long-range dependence on the one hand and local regularity on the other hand.

2.2.3. Long-range dependence

DEFINITION 2.3.– A second order stationary process $X = \{X(t), t \in \mathbb{R}\}$ is said to be “long-range dependent” (or to have “long memory”) if its correlation function $c_X(\tau) := E\,X(t)X(t + \tau)$ is such that [BER 94, SAM 94]:

\[ c_X(\tau) \sim c_r |\tau|^{-\beta}, \quad \tau \to +\infty \tag{2.8} \]



with $0 < \beta < 1$. In the same way, the power spectrum density:

\[ \Gamma_X(f) := \int_{-\infty}^{+\infty} c_X(\tau) \, e^{-i 2\pi f \tau} \, d\tau \]

of a long-range dependent process is such that:

\[ \Gamma_X(f) \sim c_f |f|^{-\gamma}, \quad f \to 0 \tag{2.9} \]

with $0 < \gamma = 1 - \beta < 1$ and $c_f = 2 (2\pi)^{-\gamma} \sin\big((1 - \gamma)\pi/2\big) \, \Gamma(\gamma) \, c_r$, where $\Gamma$ denotes the usual Gamma function.

Under its form (2.8), long-range dependence is related to the fact that, for large lags, the (algebraic) decrease of the correlation function is so slow that it does not enable its summability: hence, there is a long memory effect, in the sense that significant statistical relations are maintained between very distant samples. Obviously, this situation is in contrast with that of Markovian processes with short memory, which are characterized by an asymptotic exponential decay of the correlations. By definition, the existence of an exponential decrease involves a characteristic time scale, whereas this is no longer the case for an algebraic decrease: hence, it is a matter of scaling law behavior. By Fourier duality, long-range dependence implies that $\Gamma_X(0) = \infty$, in accordance with the power law divergence expressed by (2.9).

Finally, although long-range dependence is defined independently of self-similarity, relation (2.6) demonstrates that a strong bond exists between these two notions, since it indicates that the increment process of a self-similar process with stationary increments presents, if $H > \tfrac{1}{2}$, long-range dependence.

2.2.4. Local regularity

The main issue of this section, rather than the long-term behavior of the autocorrelation function, is its short-term behavior. Let X be a second order stationary random process whose autocorrelation function behaves near the origin as:

\[ E\,X(t)X(t + \tau) \sim \sigma^2 \left( 1 - C|\tau|^{2h} \right), \quad \tau \to 0, \quad 0 < h < 1. \tag{2.10} \]

Hence, it is easy to prove that this covariance structure near the origin is equivalent to an algebraic behavior of the increments variance in the limit of short increments:

\[ E|X(t + \tau) - X(t)|^2 \sim C|\tau|^{2h}, \quad \tau \to 0, \quad 0 < h < 1. \]

This relation provides information on the local regularity of each sample path of the process X. For Gaussian processes, for instance, it indicates that these sample paths are continuous of order $h' < h$. When $0 < h < 1$, this means that the trajectories of X are everywhere continuous but nowhere differentiable.
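The step from the covariance near the origin to the increments variance can be made explicit. The following is a short sketch, assuming a real-valued second order stationary process with $c_X(\tau) := E\,X(t)X(t + \tau)$ and $c_X(0) = \sigma^2$, and reading (2.10) as $c_X(0) - c_X(\tau) \sim \sigma^2 C |\tau|^{2h}$:

```latex
\begin{aligned}
E|X(t+\tau) - X(t)|^2
  &= E\,X(t+\tau)^2 + E\,X(t)^2 - 2\,E\,X(t)X(t+\tau) \\
  &= 2\,\big[\,c_X(0) - c_X(\tau)\,\big] \\
  &\sim 2\sigma^2 C\,|\tau|^{2h}, \qquad \tau \to 0 .
\end{aligned}
```

Under these assumptions the constant in the increments relation equals $2\sigma^2 C$ in terms of the constant of (2.10); the text absorbs this factor into a generic constant C.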



To describe this local regularity more precisely, one can use the notion of Hölder exponent, according to the following definition.

DEFINITION 2.4.– A signal X(t) is of Hölder regularity $h \ge 0$ in $t_0$ if there exists a local polynomial $P_{t_0}(t)$ of degree $n = \lfloor h \rfloor$ and a constant C such that:

\[ |X(t) - P_{t_0}(t)| \le C|t - t_0|^h. \tag{2.11} \]

In the case where $0 \le h < 1$, the regular part of X(t) is reduced to $P_{t_0}(t) = X(t_0)$, thus leading to the characterization of the Hölder regularity of X(t) in $t_0$ by the relation:

\[ |X(t_0 + \theta) - X(t_0)| \le C|\theta|^h. \tag{2.12} \]

The Hölder exponent heuristically supplies a measure of the roughness of the sample path of X: the closer it is to 1, the softer and more regular the path; the closer it is to 0, the rougher and the more variable the path. The asymptotic algebraic behavior of the increments variance thus highlights a Hölder regularity $h' < h$ of the sample paths of the process X. This correspondence between the asymptotic algebraic behavior of increments and the local regularity remains valid even if the process X is no longer stationary, but only has stationary increments. The processes whose sample paths possess a uniform and constant local regularity are said to be monofractal.

As far as self-similar processes with stationary increments are concerned, it is easy to observe that, on the one hand, starting from (2.12), the increments present an algebraic behavior for all the $\theta$ in general, hence in particular in the limit $\theta \to 0$:

\[ E|X(t + \theta) - X(t)|^2 = E|X(\theta) - \underbrace{X(0)}_{=0}|^2 = \sigma^2 |\theta|^{2H} \tag{2.13} \]

whereas, on the other hand, relation (2.13) indicates that the increment process presents an autocovariance as in (2.10). Self-similar processes with stationary increments, as well as their increment processes, thus present uniform local regularities (i.e., on average and everywhere) $h < H$.

2.2.5. Fractional Brownian motion: paradigm of scale invariance

The simplest and most commonly used model of a self-similar process is that of fractional Brownian motion (FBM) [MAN 68], which is characterized by its real exponent $0 < H < 1$, called the Hurst exponent.

DEFINITION 2.5.– FBM $B_H = \{B_H(t), t \in \mathbb{R}; B_H(0) = 0\}$ is the only zero-mean Gaussian process which is self-similar and possesses stationary increments.



The self-similarity and the stationary nature of the increments guarantee that the covariance function of FBM is of the form (2.4). As regards the Gaussian character, it demands that the probability law of FBM must be entirely determined by this covariance structure.

FBM can be considered as a generalization of ordinary Brownian motion. In the case of ordinary Brownian motion, we know that the increments possess the particularity of being decorrelated (and therefore independent because of Gaussianity). The generalization offered by FBM consists of introducing a possibility of correlation between the increments. In fact, we show that:

\[ E \big( B_H(t + \theta) - B_H(t) \big) \big( B_H(t) - B_H(t - \theta) \big) = \sigma^2 \left( 2^{2H-1} - 1 \right) |\theta|^{2H} \]

which confirms the decorrelation between the increments when $H = \tfrac{1}{2}$ (i.e. ordinary Brownian motion) but induces a positive correlation (persistence) or a negative correlation (antipersistence) depending on whether $H > \tfrac{1}{2}$ or $H < \tfrac{1}{2}$.

DEFINITION 2.6.– We call fractional Gaussian noise (FGN) the increments process $G_{H;\theta} := \{G_{H;\theta}(t), t \in \mathbb{R}\}$ defined by:

\[ G_{H;\theta}(t) := \frac{1}{\theta} B_H^{(\theta)}(t), \quad \theta > 0 \tag{2.14} \]

where $B_H$ is FBM. It is, by construction, a stationary process, everywhere continuous but nowhere differentiable, that can be considered as an extension of white Gaussian noise. Hence, we must be very careful when we decide to take the limit of definition (2.14) when $\theta \to 0$. Nevertheless, if we are interested in the behavior of FGN with “small” increments, we observe according to (2.6) and (2.14) that:

\[ c_{G_{H;\theta}}(\tau) := E\,G_{H;\theta}(t) G_{H;\theta}(t + \tau) \sim \frac{\sigma^2}{2} \, 2H(2H - 1) \, \tau^{2(H-1)}, \quad \tau \gg \theta. \]

On the one hand, this behavior highlights that FGN presents some long memory and, on the other hand, that the power spectrum density of FGN is proportional to $|f|^{-(2H-1)}$. It is therefore possible to prove on the basis of several arguments (integration/differentiation type) [FLA 92] that the FBM itself possesses an “average spectrum” of the form $\Gamma_{B_H}(f) \propto |f|^{-(2H+1)}$. Along with its role of spectral exponent, the parameter H also controls the Hölder regularity of the sample paths of FBM and FGN, which is $h < H$ in any point. To this regularity (or irregularity), a notion of fractality is naturally associated with the Gaussian processes, since the Hausdorff dimension of the sample paths is equal to $\dim_H \operatorname{graph}(B_H) = 2 - H$ (for a precise definition of the Hausdorff dimension, see Chapter 1).
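Since FGN is a stationary Gaussian process with a known autocovariance, sample paths can be synthesized exactly on a finite grid. The sketch below is an assumption-laden illustration, not a method described in the text: it takes $\theta = 1$ and $\sigma^2 = 1$, factors the covariance matrix obtained from (2.5) with a Cholesky decomposition (a simple but $O(n^3)$ approach), and cumulates FGN to obtain FBM at integer times. Function names and parameters are hypothetical.

```python
import numpy as np

def fgn_autocovariance(k, H, sigma2=1.0):
    """Autocovariance of unit-lag FGN (theta = 1) at integer lag k, from (2.5)."""
    k = np.abs(np.asarray(k, dtype=float))
    return 0.5 * sigma2 * ((k + 1) ** (2 * H) + np.abs(k - 1) ** (2 * H) - 2 * k ** (2 * H))

def simulate_fbm(n, H, n_paths=1, seed=0):
    """Return an (n_paths, n) array holding B_H(1), ..., B_H(n) for each path."""
    rng = np.random.default_rng(seed)
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    cov = fgn_autocovariance(lags, H)           # Toeplitz covariance matrix of FGN
    chol = np.linalg.cholesky(cov)              # cov = chol @ chol.T
    fgn = rng.standard_normal((n_paths, n)) @ chol.T
    return np.cumsum(fgn, axis=1)               # cumulated increments give FBM

if __name__ == "__main__":
    H = 0.7
    paths = simulate_fbm(256, H, n_paths=2000)
    times = 2 ** np.arange(4, 9)                # t = 16, ..., 256
    variances = paths[:, times - 1].var(axis=0)
    slope, _ = np.polyfit(np.log(times), np.log(variances), 1)
    print(slope / 2.0)                          # rough estimate of H
```

At lag 1 the autocovariance equals $2^{2H-1} - 1$, which is the correlation between adjacent increments given above for FBM.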



As a result, FBM presents the advantage (or the disadvantage) of being globally self-similar on the entire frequency axis, the single parameter H controlling, as required, one or other of the three regimes cited before: self-similarity, long memory and local regularity. In terms of modeling, FBM appears as a particularly interesting starting point (as white Gaussian noise can be in stationary contexts). This simplicity (FBM is the only Gaussian self-similar process with stationary increments, and it is entirely determined by the single parameter H) is of course not without drawbacks when it comes to applications, i.e. as soon as it becomes necessary to consider real data. Numerous variations on this theme can be considered; they are only mentioned here and are studied in detail in the other chapters of this volume. In all cases, it is a matter of replacing the single exponent H by a collection of exponents.

2.2.6. Beyond the paradigm of scale invariance

To begin with, we can consider modifying relation (2.13) by allowing the exponent to depend on time:

$$E\,|X(t+\theta) - X(t)|^2 \sim C(t)\,|\theta|^{2h(t)}, \qquad \theta \to 0.$$

When $0 < h(t) < 1$ is a sufficiently regular deterministic function, the process X is described as multifractional or, when it is Gaussian, as locally self-similar, i.e. locally around t, X(t) resembles a FBM of parameter H = h(t) (for more details, see Chapter 6). The local regularity is no longer a uniform or global quantity along the sample path; on the contrary, it varies in time according to h(t), which makes it possible to model time variations of the roughness. When h(t) is itself a strongly irregular function, possibly a random process in the sense that, with t fixed, h(t) depends on the observed realization of X, the process X is said to be multifractal. The fluctuations of regularity are no longer described by h(t), but by a multifractal spectrum D(h), which characterizes the Hausdorff dimension of the set of points t where h(t) = h (see Chapter 1 and Chapter 3). One of the major consequences of multifractality is that quantities usually called partition functions behave according to power laws in the small-scale limit:

$$\frac{1}{n}\sum_{k=1}^{n} |X^{(\tau)}(t+k\tau)|^q \simeq c_q\,|\tau|^{\zeta(q)}, \qquad |\tau| \to 0. \tag{2.15}$$

For processes with stationary increments, the time averages $(1/n)\sum_{k=1}^{n} |X^{(\tau)}(t+k\tau)|^q$ can be regarded as estimates of the ensemble averages $E\,|X^{(\tau)}(t)|^q$. Relation (2.15) thus recalls equation (2.2), which is a consequence of self-similarity. However, a fundamental difference exists: there is a priori no reason for the exponents ζ(q) to present a linear behavior qH. In other words, the description of scaling laws in data cannot be carried out with a single exponent but requires a whole collection of them. Measuring the exponents ζ(q) offers a way, through a Legendre transform, of estimating the multifractal spectrum. However, a detailed discussion of multifractal processes is beyond the scope of this chapter; see Chapter 1 and Chapter 3.

Multifractal processes provide a rich and natural extension of the self-similar model insofar as a single exponent is replaced by a set; nevertheless, they remain essentially tied to the existence of power law behaviors. In the analysis of experimental data, such behaviors might not be observed. In order to account for these situations, the infinitely divisible cascade model exploits an additional degree of freedom: we relax the constraint of a strict power law behavior for the moments, and replace it with a simple behavior that is separable in the variables q (order of the moment) and τ (analysis scale). The equations below make this behavior explicit:

$$\text{self-similar:} \quad E\,|X^{(\tau)}(t)|^q = c_q\,|\tau|^{qH} = c_q \exp(qH \log\tau); \tag{2.16}$$

$$\text{multifractal:} \quad E\,|X^{(\tau)}(t)|^q = c_q\,|\tau|^{\zeta(q)} = c_q \exp\big(\zeta(q) \log\tau\big); \tag{2.17}$$

$$\text{inf. divisib. casc.:} \quad E\,|X^{(\tau)}(t)|^q = c_q \exp\big(H(q)\,n(\tau)\big). \tag{2.18}$$

In this scenario, the function n(τ) is no longer fixed a priori to be log τ, just as the function H(q) is no longer a priori the linear function qH. The concept of an infinitely divisible cascade was initially introduced by Castaing in the context of turbulence [CAS 90, CAS 96]. The complete definition of this notion is beyond the scope of this chapter and can be found in [VEI 00]. It is nonetheless important to indicate that a quantity, called the propagator of the cascade, plays an important role here: it links the probability densities of the process increments at two different scales τ and τ′. Infinite divisibility formally expresses the absence of any preferred time scale and demands that this propagator consist of an elementary function $G_0$ convolved with itself a number of times that depends only on the scales τ and τ′, which gives the following functional form:

$$G_{\tau,\tau'}(\log\alpha) = [G_0(\log\alpha)]^{*(n(\tau)-n(\tau'))}.$$

A possible interpretation of this relation is to read the function $G_0$ as the elementary step, i.e. the building block of the cascade, whereas the quantity n(τ) − n(τ′) measures how many times this elementary step must be carried out to pass between the scales τ and τ′. The derivative of n with respect to log τ thus describes, in a sense, the size of the cascade. The term infinitely divisible cascade is ascribed to situations where the function n possesses the specific form n(τ) = log τ; otherwise, we only refer to a scaling law behavior. The infinitely divisible scale invariant cascades correspond to multiscaling or multifractality when the scaling law exists in the small-scale limit. The exponents ζ(q) associated with the multifractal spectrum are thus connected to the propagator of the cascade by ζ(q) = H(q). When the functions H and n simultaneously take the forms H(q) = qH and n(τ) = log τ, the infinitely divisible cascades reduce to self-similarity, which thus appears as a particular case. The propagator is then a Dirac function, $G_{\tau,\tau'}(\log\alpha) = \delta(\log\alpha - H\log(\tau/\tau'))$. The fundamental characteristic of the infinitely divisible cascades, namely the separation of the variables q and τ, induces the following relations, which are essential for the analysis [VEI 00]:

$$\log E\,|X^{(\tau)}|^q = H(q)\,n(\tau) + K_q; \tag{2.19}$$

$$\log E\,|X^{(\tau)}|^q = \frac{H(q)}{H(p)}\,\log E\,|X^{(\tau)}|^p + \kappa_{q,p}. \tag{2.20}$$
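Relation (2.20) follows from (2.19) by eliminating n(τ) between the orders q and p. The short derivation sketched here assumes $H(p) \neq 0$; the explicit expression obtained for $\kappa_{q,p}$ is a consequence of this elimination and is not stated in the original text.

```latex
\begin{align*}
\log E\,|X^{(\tau)}|^p = H(p)\,n(\tau) + K_p
  &\;\Longrightarrow\;
  n(\tau) = \frac{\log E\,|X^{(\tau)}|^p - K_p}{H(p)}, \\[4pt]
\log E\,|X^{(\tau)}|^q = H(q)\,n(\tau) + K_q
  &= \frac{H(q)}{H(p)}\,\log E\,|X^{(\tau)}|^p
     + \underbrace{K_q - \frac{H(q)}{H(p)}\,K_p}_{\kappa_{q,p}} .
\end{align*}
```

The unknown function n(τ) has disappeared from the last line, so the moment of order q can be plotted against the moment of order p without any prior knowledge of n.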

These equations indicate that the moments behave as power laws with respect to each other, a property which is exploited in the analysis. Further definitions, interpretations and applications of the infinitely divisible cascades can be found in [VEI 00].

2.3. Wavelet transform

2.3.1. Continuous wavelet transform

The continuous wavelet decomposition of a signal $X(t) \in L^2(\mathbb{R}; dt)$ is a linear transformation from $L^2(\mathbb{R}; dt)$ to $L^2(\mathbb{R}\times\mathbb{R}^{+*}; dt\,da/a^2)$, defined by [DAU 92, MAL 98]:

$$T_X(a,t) := \frac{1}{\sqrt{a}} \int X(u)\,\psi^*\!\left(\frac{u-t}{a}\right) du. \tag{2.21}$$

This is the inner product between the analyzed signal X and a set of analyzing waveforms obtained from a prototype wavelet (or mother wavelet) ψ by dilations with a scale factor $a \in \mathbb{R}^{+*}$ and shifts in time $t \in \mathbb{R}$. For the wavelet transform to be a joint time-frequency representation of the information contained in X, in other words for the coefficients $T_X(a,t)$ to account for X around a given instant and in a given frequency range, the mother wavelet must be a function well localized in both time and frequency. For the wavelet transform to be invertible, the mother wavelet must also satisfy a closure relation:

$$\iint \psi(u)\,\psi\!\left(u - \frac{t-t'}{a}\right) \frac{du\,da}{a^2} = \delta(t-t'),$$

which induces a condition called admissibility:

$$c_\psi = \int_{-\infty}^{\infty} |\Psi(f)|^2\,\frac{df}{|f|} < \infty.$$



Given this condition, it is possible to reconstruct the signal X by inverting the wavelet transform according to:

$$X(u) = c_\psi^{-1} \iint T_X(a,t)\,\psi\!\left(\frac{u-t}{a}\right) \frac{dt\,da}{a^2}.$$

From the admissibility constraint, it also follows that ψ must satisfy:

$$\int \psi(t)\,dt = 0.$$

Such a waveform ψ is therefore an oscillating function, localized on a short temporal support, hence the name wavelet. This oscillating behavior implies that the wavelet transform does not detect the DC component (average value) of the analyzed signal X. For certain mother wavelets, this property extends to higher orders:

$$\int t^k\,\psi(t)\,dt = 0, \qquad \forall\, 0 \le k < n_\psi,$$

which means that the analyzing wavelet is orthogonal to the polynomial components of X of degree strictly lower than its number of vanishing moments $n_\psi$. In other words, the wavelet coefficients obtained from a mother wavelet with $n_\psi$ vanishing moments are insensitive to the behaviors of the signal which are more regular, i.e. smoother, than a polynomial of degree strictly lower than $n_\psi$; on the other hand, they account for the information relative to behaviors that are more irregular than such polynomial trends.

2.3.2. Discrete wavelet transform

One of the fundamental characteristics of the continuous wavelet transform is its redundancy: the information contained in a signal, i.e. in a space of dimension 1, is represented, through the wavelet transform, in a space of dimension 2, the time-scale plane $(t,a) \in \mathbb{R}\times\mathbb{R}^{+*}$; neighboring coefficients thus share part of the same information. To reduce this redundancy, we define the discrete wavelet transform by the set of coefficients:

$$d_X(j,k) := 2^{j/2} \int X(u)\,\psi(2^j u - k)\,du \tag{2.22}$$

defined using a critical discrete¹ sampling of the time-scale plane, usually called the dyadic grid:

$$t \to 2^{-j}k, \qquad a \to 2^{-j}, \qquad (k,j) \in \mathbb{Z}\times\mathbb{Z},$$

thus ending up with the correspondence $d_X(j,k) = T_X(t = 2^{-j}k,\ a = 2^{-j})$.

1. In [DAU 92], a detailed study can be found of the frames or oblique bases which correspond to the sub-critical sampling of the time-scale plane.

In this case, the collection of dilated and shifted versions of the mother wavelet $\{\psi_{j,k}(t),\ j \in \mathbb{Z},\ k \in \mathbb{Z}\}$ may constitute a basis for $L^2(\mathbb{R})$. Here, to simplify, we restrict ourselves to orthonormal wavelet bases. However, a discrete wavelet transform does not necessarily, or a priori, imply the existence of an orthonormal basis. The rigorous definition of the discrete wavelet transform leading to (orthonormal) bases relies on multiresolution analysis. A multiresolution analysis consists of a collection of nested subspaces of $L^2(\mathbb{R})$:

$$\ldots \subset V_{j-1} \subset V_j \subset V_{j+1} \subset \ldots$$

Each $V_j$, $j \in \mathbb{Z}$, possesses its own orthonormal basis $\{2^{j/2}\phi(2^j \cdot - k),\ k \in \mathbb{Z}\}$ constructed, as for the wavelets, from a prototype scaling function² (or father wavelet) φ to which dyadic dilations and integer shifts are applied. The nested structure demands that the function φ satisfy a two-scale relation:

$$\phi(t/2) = \sqrt{2}\,\sum_n u_n\,\phi(t-n).$$

The projection of a signal $X \in L^2(\mathbb{R})$ on this basis thus supplies the approximation coefficients at scale j:

$$a_X(j,k) := 2^{j/2} \int X(u)\,\phi(2^j u - k)\,du.$$

To complete these approximations, it is necessary to project the signal X onto the complementary subspaces of $V_j$ in $V_{j+1}$; we therefore define $W_j$ by:

$$V_j \oplus W_j = V_{j+1}, \qquad \bigcap_{j\in\mathbb{Z}} W_j = \{0\}, \qquad V_j := \bigoplus_{j'=-\infty}^{j-1} W_{j'}.$$

For each fixed scale j, the wavelet family $\{2^{j/2}\psi(2^j \cdot - k),\ k \in \mathbb{Z}\}$ thus forms an (orthonormal) basis of the corresponding subspace:

$$W_j := \Big\{ X : X(t) = \sum_{k=-\infty}^{+\infty} d_X(j,k)\,2^{j/2}\,\psi(2^j t - k) \Big\}.$$

2. To define a multiresolution analysis, this function φ has to satisfy a certain number of constraints which are not detailed here [DAU 92].



There again, the nested structure imposes a two-scale relation on the wavelet:

$$\psi(t/2) = \sqrt{2}\,\sum_n v_n\,\phi(t-n).$$

The wavelet coefficients, or detail coefficients, therefore correspond to the projections of X on $W_j$. The signal X can thus be represented as a sum of approximations and details:

$$X(t) = \sum_k a_X(j,k)\,2^{j/2}\,\phi(2^j t - k) + \sum_{j'=j}^{+\infty} \sum_k d_X(j',k)\,2^{j'/2}\,\psi(2^{j'} t - k). \tag{2.23}$$

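[Figure: pyramidal filter bank laid out along the time and scale axes; at each stage the signal is split into details (high-pass filter + decimation) and an approximation (low-pass filter + decimation).]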

Figure 2.2. Fast pyramidal algorithm with filter bank structure for discrete wavelet decompositions. An approximation $a_X(0,k)$ (at scale 0) of the continuous-time signal X is initially calculated (only this stage involves a continuous-time evaluation from X). The “signal” represented in the figure consists of the sequence $a_X(0,k)$. In multiresolution analysis, this approximation $a_X(0,k)$ is decomposed into a series of details $d_X(-1,k)$ and a new, coarser approximation $a_X(-1,k)$. This procedure is then iterated on the sequence $a_X(-1,k)$. The impulse responses of the discrete-time filters depend on the generating sequences u and v which define the scaling function and the wavelet. In the case of orthonormal bases, they are exactly equal to them.
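The filter bank of Figure 2.2 can be sketched in a few lines for the Haar system. This is a minimal illustration, not the chapter's implementation: the Haar generators u = (1/√2, 1/√2) and v = (1/√2, −1/√2) are chosen for simplicity, the function names are hypothetical, and the input is assumed to be a sequence $a_X(0,k)$ whose length is divisible by 2 raised to the number of levels.

```python
import math


def haar_step(approx):
    """One stage of the pyramid: a_X(j+1, .) -> (a_X(j, .), d_X(j, .)).

    Low-pass / high-pass filtering and decimation by 2 are merged:
    a_X(j, k) = (a_X(j+1, 2k) + a_X(j+1, 2k+1)) / sqrt(2)
    d_X(j, k) = (a_X(j+1, 2k) - a_X(j+1, 2k+1)) / sqrt(2)
    """
    s = 1.0 / math.sqrt(2.0)
    coarse, details = [], []
    for even, odd in zip(approx[0::2], approx[1::2]):
        coarse.append(s * (even + odd))
        details.append(s * (even - odd))
    return coarse, details


def haar_pyramid(signal, levels):
    """Iterate haar_step; returns the coarsest approximation and the list
    of detail sequences at octaves -1, -2, ..., -levels."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    all_details = []
    for _ in range(levels):
        approx, details = haar_step(approx)
        all_details.append(details)
    return approx, all_details


if __name__ == "__main__":
    a0 = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]  # plays the role of a_X(0, k)
    coarse, details = haar_pyramid(a0, 3)
    print(coarse)
    print(details)
```

Each stage halves the number of samples it processes, so the total number of operations is proportional to N + N/2 + N/4 + ..., i.e. of order N. Since the Haar basis is orthonormal, the sum of squares of all output coefficients equals the sum of squares of the input sequence.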



Finally, thanks to the nested structure of the spaces specific to multiresolution analysis, there exist very fast algorithms with a pyramidal structure, which enable efficient calculation of the discrete decomposition coefficients. From the sequences u and v, described as the generators of the multiresolution analysis, we can prove that the approximation and detail coefficients at octave j can be calculated from the approximation coefficients at octave j + 1:

$$\begin{aligned}
a_X(j,k) &= \int X(t)\,2^{j/2}\,\phi(2^j t - k)\,dt \\
&= \int X(t)\,2^{j/2}\,\sqrt{2}\,\sum_n u_n\,\phi\big(2(2^j t - k) - n\big)\,dt \\
&= \sum_n u_n \int X(t)\,2^{(j+1)/2}\,\phi(2^{j+1} t - 2k - n)\,dt \\
&= \sum_n u_n\,a_X(j+1, 2k+n) \\
&= \sum_n u^\vee_n\,a_X(j+1, 2k-n) \\
&= \big(u^\vee_\cdot * a_X(j+1,\cdot)\big)(2k)
\end{aligned}$$

and, in an identical manner:

$$d_X(j,k) = \big(v^\vee_\cdot * a_X(j+1,\cdot)\big)(2k)$$

where ∗ denotes the discrete-time convolution operator, i.e. $(x_\cdot * y_\cdot)(k) = \sum_n x(n)\,y(k-n)$, and where $u^\vee_n = u_{-n}$ and $v^\vee_n = v_{-n}$. The two previous relations can be rewritten by using the decimation operator $\downarrow_2$ ($y = \downarrow_2 x$ means $y_k = x_{2k}$, i.e. every other sample of x is discarded):

$$a_X(j,k) = \big[\downarrow_2 \big(u^\vee_\cdot * a_X(j+1,\cdot)\big)\big](k); \qquad d_X(j,k) = \big[\downarrow_2 \big(v^\vee_\cdot * a_X(j+1,\cdot)\big)\big](k).$$

Thanks to this recursive structure, the computational cost of the discrete wavelet decomposition of a signal uniformly sampled on N points is O(N).

2.4. Wavelet analysis of scale invariant processes

The aim of this section is to study how the fundamental principles (scale changing operator) and essential properties (multiresolution structure, number of vanishing moments, localization) of wavelet decomposition can be exploited in order to characterize and easily measure the scale invariance phenomena that have been previously described.



Let us note that all the results mentioned below can be formulated in the same way for continuous wavelet decompositions. However, for the sake of simplicity and conciseness, we will only tackle the case of the discrete random fields of orthogonal wavelet coefficients arising from the decomposition of scale invariant processes.

2.4.1. Self-similarity

PROPOSITION 2.1.– The wavelet coefficients resulting from the decomposition of a self-similar process of index H satisfy the equality:

$$\big(d_X(j,0), \ldots, d_X(j,N_j-1)\big) \stackrel{\mathcal{L}}{=} 2^{-j(H+\frac{1}{2})}\,\big(d_X(0,0), \ldots, d_X(0,N_j-1)\big).$$

This result, initially demonstrated for the FBM [FLA 92] and then generalized to all self-similar processes [AVE 98], is based on the scale invariance principle stemming from the dilation/compression operator which defines the wavelet analysis. To outline the proof, it suffices to write down the main argument: when $X(2^j u) \stackrel{\mathcal{L}}{=} 2^{jH} X(u)$,

$$\begin{aligned}
d_X(j,k) &= \int X(u)\,\psi(2^j u - k)\,2^{j/2}\,du \\
&= \int 2^{-j/2}\,X(2^{-j}u)\,\psi(u-k)\,du \\
&\stackrel{\mathcal{L}}{=} 2^{-j(H+\frac{1}{2})} \int X(u)\,\psi(u-k)\,du \\
&= 2^{-j(H+\frac{1}{2})}\,d_X(0,k).
\end{aligned}$$

The principal consequence of self-similarity is that, when they exist, the q-th order moments of the wavelet coefficients satisfy the equality:

$$E\,|d_X(j,k)|^q = 2^{-jq(H+\frac{1}{2})}\,E\,|d_X(0,k)|^q.$$

PROPOSITION 2.2.– The wavelet coefficients resulting from the decomposition of a process with stationary increments are stationary at each scale $2^j$.

To understand the origin of this result, let us note that the sampled increment process $X^{(\theta=2^{-j})}[2^{-j}k] := X((k+1)2^{-j}) - X(k\,2^{-j})$ can be identified with a wavelet decomposition (2.22) according to:

$$X^{(2^{-j})}[2^{-j}k] = 2^j \int X(u)\,\big[\delta(2^j u - k - 1) - \delta(2^j u - k)\big]\,du = 2^{j/2}\,d_X(j,k),$$

with ψ(t) = δ(t − 1) − δ(t) as the analyzing wavelet (an elementary wavelet sometimes referred to as the poor man's wavelet). In fact, it is the oscillating structure of the wavelets, natural for admissible wavelets ($\int \psi(t)\,dt = 0$), which guarantees this stationarity in the case of processes with stationary increments. Heuristically, and underlining the main argument (the number of vanishing moments is at least 1), the proof reads (on the coefficients of the discrete decomposition and with j = 0 to simplify the writing):

$$\begin{aligned}
d_X(0,k+k_0) &= \int X(u)\,\psi(u-k-k_0)\,du \\
&= \int X(u+k_0)\,\psi(u-k)\,du \\
&= \int [X(u+k_0) - X(k_0)]\,\psi(u-k)\,du \\
&\stackrel{\mathcal{L}}{=} \int [X(u) - X(0)]\,\psi(u-k)\,du \\
&= \int X(u)\,\psi(u-k)\,du = d_X(0,k).
\end{aligned}$$

This proof highlights the role played, for stationarization, by the fact that ψ has zero mean (i.e. that its number of vanishing moments is at least 1). This result was obtained in the case of FBM, although directly from the form of the covariance, in [FLA 92, TEW 92], extended to stable processes independently by different authors [DEL 00, PES 99], and proved in a general context in [CAM 95, MAS 93].

When dealing with processes with stationary increments of order p, the simple admissibility condition of the wavelets is no longer sufficient. It is then necessary to choose an analyzing wavelet ψ possessing $n_\psi \ge p$ vanishing moments so that the coefficient series $d_X(j,k)$ obtained are stationary at each scale. The complete proof of this result is given in [AVE 98]. However, a good way to clarify the issue is to argue that the wavelet plays a role similar to that of a differentiation operator, insofar as the number of vanishing moments controls, by time-frequency duality, the behavior of the spectrum magnitude |Ψ(f)| in the vicinity of the zero frequency. Indeed, for a wavelet ψ possessing $n_\psi$ vanishing moments, we have $|\Psi(f)| \sim |f|^{n_\psi}$, f → 0, which, to a first approximation, we can identify with the differentiation operator of order $n_\psi$.
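The analogy between vanishing moments and differentiation can be illustrated on a discrete-time stand-in. The sketch below does not use an actual wavelet: it takes the finite-difference filter of order n (the poor man's wavelet corresponds to n = 1), checks that its first n discrete moments vanish, and shows that it annihilates polynomial trends of degree lower than n. The function names are hypothetical.

```python
from math import comb


def difference_filter(order):
    """Taps c_m of the order-n finite difference: y[k] = sum_m c_m * x[k + m]."""
    return [(-1) ** (order - m) * comb(order, m) for m in range(order + 1)]


def discrete_moment(taps, p):
    """Discrete analog of the moment integral: sum_m m^p * c_m."""
    return sum(c * m ** p for m, c in enumerate(taps))


def apply_filter(taps, x):
    """Slide the filter along x, keeping only fully overlapping positions."""
    n_out = len(x) - len(taps) + 1
    return [sum(c * x[k + m] for m, c in enumerate(taps)) for k in range(n_out)]


if __name__ == "__main__":
    taps = difference_filter(3)
    print([discrete_moment(taps, p) for p in range(4)])  # moments of order 0..3
    trend = [k * k - 5 * k + 2 for k in range(10)]        # polynomial of degree 2
    print(apply_filter(taps, trend))                      # order 3 removes it
    print(apply_filter(difference_filter(2), trend))      # order 2 leaves a constant
```

With three vanishing moments the quadratic trend disappears entirely, whereas with two it is reduced to a constant, in the same way as a second derivative would act.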
PROPOSITION 2.3.– The wavelet coefficients resulting from the decomposition of a process X which is zero-mean, self-similar of index H, of finite variance and with stationary increments (H-ASAS) possess, when they exist, moments of order q satisfying the following scaling law:

$$E\,|d_X(j,k)|^q = E\,|d_X(0,0)|^q\;2^{-jq(H+\frac{1}{2})}.$$

This last result stems directly from the combination of the two previous propositions. For processes with finite variance (i.e. whose moment of order 2 exists), such as Gaussian processes, the FBM for instance, this relation takes on the following specific form:

$$E\,|d_X(j,k)|^2 = E\,|d_X(0,0)|^2\;2^{-j(2H+1)}. \tag{2.24}$$

Given that the latter are second-order statistics, the particular form (2.4) of the covariance structure of an H-ASAS process makes it possible to deduce the asymptotic behavior of the dependence structure of the wavelet coefficients [FLA 92, TEW 92].

PROPOSITION 2.4.– The asymptotic covariance structure of the wavelet coefficients of a process X which is zero-mean, self-similar of index H, of finite variance and with stationary increments (H-ASAS) takes on the form:

$$E\,d_X(j,k)\,d_X(j',k') \approx |2^{-j}k - 2^{-j'}k'|^{2(H-n_\psi)}, \qquad |2^{-j}k - 2^{-j'}k'| \to \infty$$

which illustrates, on the one hand, that the larger the number of vanishing moments, the shorter the range of the correlation and, on the other hand, that if $n_\psi > H + \frac{1}{2}$, the long-range dependence which exists for the increment process when $H > \frac{1}{2}$ is transformed into a short-range dependence [ABR 95, FLA 92, TEW 92].

The results which have just been presented can be made more precise when we specify the probability distribution which underlies the self-similar process with stationary increments. The Gaussian case, illustrated by the FBM, has been widely studied [FLA 92, MAS 93]. Its wavelet coefficients are Gaussian at all scales. More recently, interest in the non-Gaussian case has led to developments for self-similar α-stable processes (or α-stable motions) [ABR 00a, DEL 00]. We can deduce from the wavelet decomposition of such processes that the series of their coefficients $d_X(j,k)$, in addition to the above-mentioned properties, is itself α-stable with the same index.

2.4.2. Long-range dependence

As specified in section 2.2.3, stationary processes with “long-range dependence” are characterized by a slow decrease of their correlation function, $c_X(\tau) \sim c_r\,|\tau|^{-\beta}$, 0 < β < 1. The strong statistical connections maintained even between distant samples X(t) and X(t + τ) make the study and analysis of such processes much more complex, by impairing, for example, the convergence of algorithms relying on empirical moment estimators. It will be shown hereafter that the wavelet decomposition of a process with long-range dependence makes it possible to circumvent this difficulty since, under certain conditions, the series of coefficients $d_X(j,k)$ exhibit short-term dependence. The covariance function of the wavelet coefficients possesses the following form:

$$\begin{aligned}
E\,d_X(j,k)\,d_X(j',k') &\sim 2^{\frac{j+j'}{2}}\,c_r \iint |\tau|^{-\beta}\,\psi(2^j u - k)\,\psi\big(2^{j'}(u-\tau) - k'\big)\,du\,d\tau \\
&= 2^{-\frac{j+j'}{2}}\,c_f \int \frac{\Psi(2^{-j}f)\,\Psi(2^{-j'}f)}{|f|^{\gamma}}\,e^{-i2\pi f(2^{-j}k - 2^{-j'}k')}\,df,
\end{aligned} \tag{2.25}$$

indicating that its asymptotic behavior, i.e. for large values of the lag $|2^{-j}k - 2^{-j'}k'|$, is governed by the behavior of its Fourier transform at the origin and hence by the relation:

$$\frac{|\Psi(2^{-j}f)\,\Psi(2^{-j'}f)|}{|f|^{\gamma}} \underset{f\to 0}{\sim} \frac{2^{-(j+j')n_\psi}\,|f|^{2n_\psi}}{|f|^{\gamma}} = \frac{2^{-(j+j')n_\psi}}{|f|^{\gamma-2n_\psi}}.$$

Thus, we can observe the effect of the number of vanishing moments $n_\psi$ of the wavelet, which may compensate for the divergence at the origin of the spectral density of the process. By choosing a wavelet such that $n_\psi \ge \gamma/2$, the long-range dependence of the process is no longer preserved in the coefficient sequences of the decomposition. Hence, the larger $n_\psi$ is, the faster the residual correlation decreases:

$$E\,d_X(j,k)\,d_X(j',k') \approx |2^{-j}k - 2^{-j'}k'|^{\gamma-2n_\psi-1}, \qquad |2^{-j}k - 2^{-j'}k'| \to \infty.$$

From equation (2.25) we can also prove that the variance of the wavelet coefficients follows a power law as a function of scale:

$$E\,|d_X(j,k)|^2 = 2^{-j(1-\beta)}\,c_f \int \frac{|\Psi(f)|^2}{|f|^{\gamma}}\,df = c_0\,2^{-j\gamma}. \tag{2.26}$$
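The step from (2.25) to (2.26) can be sketched as follows. The derivation sets j′ = j and k′ = k in the spectral form of (2.25) and changes variable in the integral; it assumes that the product ΨΨ then reduces to |Ψ|² (real-valued ψ with the appropriate conjugation) and that the exponents are linked by γ = 1 − β, which is what the two equalities of (2.26) imply.

```latex
\begin{align*}
E\,|d_X(j,k)|^2
  &= 2^{-j}\,c_f \int \frac{|\Psi(2^{-j}f)|^2}{|f|^{\gamma}}\,df
  && (j' = j,\; k' = k) \\
  &= 2^{-j}\,c_f \int \frac{|\Psi(\nu)|^2}{2^{j\gamma}\,|\nu|^{\gamma}}\;2^{j}\,d\nu
  && (f = 2^{j}\nu,\; df = 2^{j}\,d\nu) \\
  &= 2^{-j\gamma}\;\underbrace{c_f \int \frac{|\Psi(\nu)|^2}{|\nu|^{\gamma}}\,d\nu}_{c_0}
  \;=\; c_0\,2^{-j(1-\beta)} .
\end{align*}
```

The integral defining $c_0$ converges at the origin as soon as $2n_\psi > \gamma - 1$, since $|\Psi(\nu)| \sim |\nu|^{n_\psi}$ there, which holds for any admissible wavelet when 0 < γ < 1.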

This relation will be at the core of the estimation procedure for the parameter γ (see the following section). Finally, it is important to specify that, since wavelet decompositions can be inverted (see equation (2.23)), the non-stationarity of the studied processes does not disappear from the analysis (any more than the long-range dependence does); all the information is preserved but redistributed differently among the coefficients. Thus, long-range dependence and non-stationarity are carried by the approximation coefficients $a_X(j,k)$ of the decomposition, whereas self-similarity is observed across scales, through an algebraic progression of the moments of order q of the detail coefficients $d_X(j,k)$.



2.4.3. Local regularity

The local regularity properties of the sample paths of a process were introduced in section 2.2.4. Their “wavelet” counterparts are most often stated for the orthogonal discrete wavelet transform, although they can be extended, usually quite easily, to their continuous (surface) counterparts (see Chapter 3).

THEOREM 2.1.– Let X be a signal with Hölder regularity h ≥ 0 at $t_0$ and ψ a sufficiently regular wavelet ($n_\psi \ge h$). Then there exists a constant c > 0 such that for any $(j,k) \in \mathbb{Z}\times\mathbb{Z}$:

$$|d_X(j,k)| \le c\,2^{-(\frac{1}{2}+h)j}\,\big(1 + |2^j t_0 - k|^{h}\big).$$

Conversely, if for any $(j,k) \in \mathbb{Z}\times\mathbb{Z}$:

$$|d_X(j,k)| \le c\,2^{-(\frac{1}{2}+h)j}\,\big(1 + |2^j t_0 - k|^{h'}\big)$$

for some h′ < h, then X has Hölder regularity h at $t_0$.

The proof of the theorem was established independently by Jaffard [JAF 89] and by Holschneider and Tchamitchian [HOL 90]. In the light of this result, we note once again that it is the decrease of the wavelet coefficients across scales which characterizes the local regularity of the sample path of X. Furthermore, this result is not surprising, since the Hölder regularity of a function is a particular cause of 1/f spectral behavior at high frequencies. The second part of Theorem 2.1 also shows that knowledge of the coefficients located “vertically” above the singular point ($|2^j t_0 - k| = 0$) is not in itself sufficient to determine the local regularity of X at $t_0$. Strictly speaking, it would be necessary to consider the decomposition in its entirety, which implies that an isolated singularity can affect all the coefficients $d_X(j,k)$ inside a cone, called the influence cone. For a wavelet whose temporal support is finite, this cone is also limited at each scale. From the estimation point of view, the direct implication of Theorem 2.1 highlights the practical limits of (discrete) orthogonal wavelet transforms, because it is quite unlikely that the abscissa $t_0$ of the singularity coincides with a line of coefficients on the dyadic grid.

Hence, in practice, a continuous analysis scheme is more often preferred, for which we possess a less precise and incomplete version (direct implication only) of Theorem 2.1 (see the following proposition).

PROPOSITION 2.5.– If X is of Hölder regularity n < h < n + 1 at $t_0$, then for an analyzing wavelet ψ possessing $n_\psi \ge h$ vanishing moments, we have the following asymptotic behavior:

$$T_X(t_0,a) \sim O(a^{h+\frac{1}{2}}), \qquad a \to 0^+.$$



Proof. Consider the continuous wavelet transform (2.21) constructed with $n_\psi > n$:

$$T_X(t_0,a) = \sqrt{a} \int \psi(u)\,X(t_0+au)\,du = \sqrt{a} \int \psi(u)\,\Big[X(t_0+au) - \sum_{r=0}^{n} c_r\,a^r u^r\Big]\,du,$$

where the $c_r$ represent the Taylor expansion coefficients of X in the vicinity of $t_0$. The signal X is of regularity n < h < n + 1 at $t_0$ and ψ is a function localized in time. Thus, in the limit of infinitely fine resolutions ($a \to 0^+$):

$$\lim_{a\to 0^+} T_X(t_0,a) \le C\,\sqrt{a} \int \psi(u)\,|au|^{h}\,du \le C\,a^{h+\frac{1}{2}} \int |u|^{h}\,\psi(u)\,du = C_\psi\,a^{h+\frac{1}{2}}.$$
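Proposition 2.5 can be checked numerically on a toy signal. The sketch below is an illustration under stated assumptions, not the authors' procedure: the analyzing wavelet is the Mexican hat (two vanishing moments), the signal is the hypothetical cusp $X(t) = 3 + 2t + |t|^h$ with h = 0.6 (so n = 0 and $n_\psi = 2 > n$), the transform is written as in the proof, $T_X(t_0,a) = \sqrt{a}\int\psi(u)X(t_0+au)\,du$, and the integral is replaced by a Riemann sum. All function names are hypothetical.

```python
import math


def mexican_hat(v):
    """Second derivative of a Gaussian (up to sign and scale): 2 vanishing moments."""
    return (1.0 - v * v) * math.exp(-0.5 * v * v)


def cwt_at(signal, t0, a, half_width=10.0, n_half=200):
    """Riemann-sum approximation of sqrt(a) * integral of psi(u) * X(t0 + a*u) du."""
    dv = half_width / n_half
    total = 0.0
    for i in range(-n_half, n_half + 1):
        v = i * dv
        total += mexican_hat(v) * signal(t0 + a * v)
    return math.sqrt(a) * total * dv


def scaling_exponent(signal, t0, scales):
    """Least-squares slope of log|T_X(t0, a)| against log a."""
    xs = [math.log(a) for a in scales]
    ys = [math.log(abs(cwt_at(signal, t0, a))) for a in scales]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def make_cusp(h):
    """Constant + linear trend + cusp of Hoelder exponent h at t = 0."""
    return lambda t: 3.0 + 2.0 * t + abs(t) ** h


if __name__ == "__main__":
    scales = [2.0 ** (-m) for m in range(4, 11)]
    print(scaling_exponent(make_cusp(0.6), 0.0, scales))  # expected close to h + 1/2
```

The constant and the linear trend are removed by the two vanishing moments, so only the cusp contributes and the measured slope is governed by h + 1/2.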

It is important to underline that if the wavelet ψ is not sufficiently regular, it is the term of degree $n_\psi$ in the Taylor polynomial which dominates at finite scales, and it is thus the regularity of the wavelet which imposes the decrease of the coefficients across scales. However, one should not be misled by the interpretation of Proposition 2.5. It is only because we focus on the limiting case of infinite resolution that the influence of the singularity seems to be perfectly localized at $t = t_0$. In reality, it is shown in [MAL 92] that, in the case of non-oscillating singularities (see Chapter 3), it is necessary and sufficient to consider the lines of local maxima of the wavelet coefficients situated inside the influence cone, $\{T_X(a,t) : |t - t_0| < c\,a\}$, to be able to characterize the local regularity of the process. In addition, the practical use of this property is made more difficult by the necessarily finite resolution imposed by the sampling of the data, which does not permit scrutiny of the data below a minimum scale, noted by convention a = 1.

Furthermore, the different aspects of the study of the local regularity of a function constitute an important topic in other chapters of this work. This is true of Chapter 3, which tackles the characterization of functional regularity spaces, of Chapter 5 and Chapter 6, which present the case of multifractional processes and their sample path regularity, and finally of Chapter 1, which presents the multifractal spectra as statistical and geometric measures of the distribution of pointwise singularities of a process.

Finally, let us note that, as previously indicated in section 2.2.4, the increments of stochastic processes with stationary increments for which the Hölder exponent is constant throughout the sample paths satisfy the asymptotic relation:

$$E\,|X(t+\tau) - X(t)|^2 \sim C\,|\tau|^{2h}, \qquad |\tau| \to 0.$$

The latter can be rewritten identically on the wavelet coefficients, which can be either continuous or discrete:

$$E\,|T_X(a,t)|^2 \sim a^{2h+1}, \qquad a \to 0; \tag{2.27a}$$

$$E\,|d_X(j,k)|^2 \sim 2^{-j(2h+1)}, \qquad j \to +\infty. \tag{2.27b}$$

These relations should be compared with those obtained in the case of self-similarity (2.24) and long-range dependence (2.26), and will serve as a starting point in the construction of estimators (see the following section).

2.4.4. Beyond second order

In this chapter, the detailed presentation is limited to the wavelet analysis of scaling laws existing at second statistical order (self-similarity, long-range dependence, constant local regularity). Nevertheless, the study of scaling law models which involve all statistical orders (multifractal processes, infinitely divisible cascades) can be carried out by wavelet analysis in the same way and benefits from the same qualities and advantages. The wavelet analysis of multifractal processes is developed by S. Jaffard and R. Riedi in Chapter 3 and Chapter 4 respectively. The wavelet analysis of infinitely divisible cascades is detailed in [VEI 00].

2.5. Implementation: analysis, detection and estimation

This section is devoted to the implementation of wavelet analysis in the study of scale invariance phenomena, whether for detecting and highlighting them, or for estimating the parameters which describe them. The previous sections have outlined the power law behavior, across scales, of the variance of wavelet coefficients:

$$E\,|d_X(j,k)|^2 \simeq c\,C\,2^{j\alpha} \tag{2.28}$$

(see equation (2.24) for self-similarity, equation (2.26) for long-range dependence and equation (2.27) for the uniform local regularity of sample paths; these relations are summarized in Table 2.1). This relation crystallizes the potential of the “wavelet” tool for the analysis of scaling laws and will hence be the core aspect of this section.

| | α | c | C |
| Self-sim. with stat. incr. | 2H + 1 | $\sigma^2 = E\,|X(1)|^2$ | $\int |\Psi(f)|^2/|f|^{2H+1}\,df$ |
| Long-range dependence | γ | $c_f$ | $\int |\Psi(f)|^2/|f|^{\gamma}\,df$ |
| Uniform local regularity | 2h + 1 | – | – |

Table 2.1. Summary of scaling laws existing in the second statistical order

In a real situation, we must begin by validating the relevance of a model before estimating its parameters. In other words, we must first establish the existence of scale invariance phenomena and identify a scale range within which the above-mentioned relation supplies an appropriate description of the data, and then measure the exponent α. We begin with a simple situation, in which we suppose that there is an octave range $j_1 \le j \le j_2$, already identified, for which the fundamental relation is satisfied exactly:

$$E\,|d_X(j,k)|^2 = c_f\,C\,2^{j\alpha}, \qquad j_1 \le j \le j_2,$$

and we concentrate on the estimation of the parameter α.

2.5.1. Estimation of the parameters of scale invariance

To estimate the exponent α, a simple idea consists of measuring the slope of $\log_2 E\,|d_X(j,k)|^2$ against $\log_2 2^j = j$. In practice, of course, this implies that we have to estimate the quantity $E\,|d_X(j,k)|^2$ from a single observed realization of finite length. Given the properties of the wavelet coefficients (stationarity, weak statistical dependence) put forward in the previous section, we simply propose to estimate the ensemble average by the time average [ABR 95, ABR 98, FLA 92]:

$$\frac{1}{n_j} \sum_k d_X(j,k)^2$$

where $n_j$ designates the number of wavelet coefficients available at octave j.

DEFINITION 2.7.– Let us begin by recalling the main characteristics of linear regression. Let $y_j$ be random variables such that $E\,y_j = \alpha j + b$ and let us define $\sigma_j^2 = \operatorname{var} y_j$. The weighted linear regression estimator of α reads:

$$\hat{\alpha} = \sum_{j=j_1}^{j_2} y_j\,\frac{(S_0\,j - S_1)/a_j}{S_0 S_2 - S_1^2} \equiv \sum_{j=j_1}^{j_2} w_j\,y_j \tag{2.29}$$

with:

$$S_p = \sum_{j=j_1}^{j_2} j^p/a_j, \qquad p = 0, 1, 2$$

where the $a_j$ are arbitrary quantities, acting as weights associated with the $y_j$. With these definitions, the weights $w_j$ satisfy the usual constraints, i.e. $\sum_{j=j_1}^{j_2} j\,w_j = 1$ and $\sum_{j=j_1}^{j_2} w_j = 0$. We can also easily observe that the estimator is unbiased: $E\,\hat{\alpha} = \alpha$. Moreover, in the case of uncorrelated variables $y_j$, the variance of this estimator is written:

$$\operatorname{var} \hat{\alpha} = \sum_{j=j_1}^{j_2} w_j^2\,\sigma_j^2.$$
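The weighted regression just recalled can be sketched as follows. The code assumes that the weights take the form $w_j = (S_0\,j - S_1)/\big(a_j\,(S_0 S_2 - S_1^2)\big)$, which is the form compatible with the two constraints stated above for arbitrary $a_j$; the function names and the numerical values in the usage example are hypothetical.

```python
def regression_weights(octaves, a):
    """Weights w_j of the weighted linear regression over the given octaves.

    octaves: list of octave indices j1, ..., j2
    a: list of positive quantities a_j (same length), e.g. var(y_j)
    """
    s0 = sum(1.0 / aj for aj in a)
    s1 = sum(j / aj for j, aj in zip(octaves, a))
    s2 = sum(j * j / aj for j, aj in zip(octaves, a))
    det = s0 * s2 - s1 * s1
    return [(s0 * j - s1) / (aj * det) for j, aj in zip(octaves, a)]


def weighted_slope(octaves, y, a):
    """Slope estimate: alpha_hat = sum_j w_j * y_j."""
    weights = regression_weights(octaves, a)
    return sum(wj * yj for wj, yj in zip(weights, y))


def slope_variance(octaves, a, sigma2):
    """Variance of alpha_hat for uncorrelated y_j: sum_j w_j^2 * sigma_j^2."""
    weights = regression_weights(octaves, a)
    return sum(wj * wj * s for wj, s in zip(weights, sigma2))


if __name__ == "__main__":
    octaves = [3, 4, 5, 6, 7]
    a = [2.0 ** j for j in octaves]           # hypothetical variances of y_j
    y = [1.7 * j - 0.4 for j in octaves]      # noise-free line of slope 1.7
    print(weighted_slope(octaves, y, a))
    print(slope_variance(octaves, a, a))
```

Because the weights sum to zero and satisfy $\sum_j j\,w_j = 1$, any exact line $y_j = \alpha j + b$ is mapped to its slope α whatever the choice of the $a_j$; that choice only affects the variance of the estimate.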

The choice of the weights remains to be specified. We know that the variance of $\hat{\alpha}$ is minimal if the covariance structure of the $y_j$ is taken into account in the definition of the $a_j$. Once again, in the case of uncorrelated variables $y_j$, this leads us to choose $a_j \equiv \sigma_j^2$. In the case of scale invariance, we use the estimator defined above with the variables:

$$y_j = \log_2\Big(\frac{1}{n_j} \sum_k d_X(j,k)^2\Big) - g(j),$$

where the g(j) are correction terms aimed at taking into account the fact that E(log(·)) is not log(E(·)) and at ensuring that $E\,y_j = \alpha j + b$. Hence, this estimator simply consists of a weighted linear regression carried out in the diagram of $y_j$ against j, referred to as the log-scale diagram [VEI 99].

In order to implement this estimator easily, it is necessary to determine g(j) and $\sigma_j^2$ and to choose the $a_j$. To begin with, we assume that the $d_X(j,k)$ are Gaussian random variables, i.e. that they result from the wavelet decomposition of a process which is itself jointly Gaussian. Moreover, if we idealize the weak correlation property of the wavelet coefficients into exact independence, then we can calculate g(j) and $\sigma_j^2$ analytically:

$$g(j) = \frac{\Gamma'(n_j/2)}{\Gamma(n_j/2)\,\log 2} - \log_2(n_j/2) \sim \frac{-1}{n_j \log 2}, \qquad n_j \to \infty; \tag{2.30a}$$

$$\sigma_j^2 = \frac{\zeta(2, n_j/2)}{\log^2 2} \sim \frac{2}{n_j \log^2 2}, \qquad n_j \to \infty, \tag{2.30b}$$


where $\Gamma$ and $\Gamma'$ respectively designate the Gamma function and its derivative, and where $\zeta(2, z) = \sum_{n=0}^{\infty} 1/(z+n)^2$ defines a function called the generalized Riemann zeta function. Let us note that these analytical expressions, which depend on the known $n_j$ alone, can be easily evaluated in practice. The numerical simulations presented in depth in [VEI 99] indicate that, for Gaussian processes, this analytical calculation is an excellent approximation of reality, which justifies a posteriori the idealization of exact independence. Thus, for Gaussian processes, we obtain an estimator which is remarkably simple to implement, since the quantities $g(j)$ and $\sigma_j^2$ can be calculated analytically and do not need to be estimated from data, and which gives excellent statistical performance. From these analytical expressions, we obtain $E\,\hat\alpha = \alpha$, which indicates that the estimator is unbiased; this is also valid for observations of finite duration. With the choice $a_j = \sigma_j^2$, its variance reads:

$$\operatorname{Var}\hat\alpha = \sum_{j=j_1}^{j_2} \sigma_j^2 w_j^2 = \frac{S_0}{S_0 S_2 - S_1^2}$$
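The Gaussian-case expressions (2.30a) and (2.30b) can be evaluated numerically with very little machinery. The following is a minimal sketch, not the authors' Matlab implementation: it uses only the Python standard library, and the helper names (`digamma`, `hurwitz_zeta2`, `g_exact`, `sigma2_exact`) are hypothetical. The ratio $\Gamma'/\Gamma$ is the digamma function and $\zeta(2, z)$ is the trigamma function; both are computed here with the classical recurrence plus asymptotic series, which is an implementation choice made for this illustration.

```python
import math

LN2 = math.log(2.0)

def digamma(x):
    """Gamma'(x)/Gamma(x) for x > 0, via recurrence and an asymptotic series."""
    acc = 0.0
    while x < 6.0:                      # psi(x) = psi(x + 1) - 1/x
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return acc + math.log(x) - 0.5 / x - series

def hurwitz_zeta2(z):
    """zeta(2, z) = sum_{n >= 0} 1/(z + n)**2 for z > 0 (trigamma function)."""
    acc = 0.0
    while z < 6.0:                      # zeta(2, z) = 1/z**2 + zeta(2, z + 1)
        acc += 1.0 / (z * z)
        z += 1.0
    inv = 1.0 / z
    inv2 = inv * inv
    return acc + inv + 0.5 * inv2 + inv * inv2 * (1.0 / 6 - inv2 * (1.0 / 30 - inv2 / 42))

def g_exact(n_j):
    """Bias correction g(j) of (2.30a) for an octave holding n_j coefficients."""
    return digamma(n_j / 2.0) / LN2 - math.log2(n_j / 2.0)

def sigma2_exact(n_j):
    """Variance sigma_j^2 of (2.30b)."""
    return hurwitz_zeta2(n_j / 2.0) / LN2 ** 2

def g_asymptotic(n_j):
    return -1.0 / (n_j * LN2)

def sigma2_asymptotic(n_j):
    return 2.0 / (n_j * LN2 ** 2)

if __name__ == "__main__":
    # Compare exact and asymptotic forms as the number of coefficients grows.
    for n_j in (4, 16, 64, 256, 1024):
        print(n_j, g_exact(n_j), g_asymptotic(n_j),
              sigma2_exact(n_j), sigma2_asymptotic(n_j))
```

Running the script shows how quickly the asymptotic forms become adequate; at the coarsest octaves, where $n_j$ is small, the exact expressions differ noticeably from them.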

This variance attains the Cramér-Rao lower bound, which is calculated under the same hypotheses (Gaussian process and exact independence) [ABR 95, VEI 99, WOR 96]. In addition, if we add the form $n_j = 2^{-j} n$ (with $n$ as the number of coefficients in the initial process) induced by the construction of the multiresolution analysis, we obtain the following expression for the variance:

$$\operatorname{Var}\hat\alpha = \sum_{j=j_1}^{j_2} \sigma_j^2 w_j^2 \simeq \frac{1}{n}\cdot\frac{(1 - 2^{-J})}{2^{1-j_1}\,F}, \tag{2.31}$$

where $F = F(j_1, J) = \log^2 2\cdot\big(1 - (J^2/2 + 2)\,2^{-J} + 2^{-2J}\big)$ and where $J = j_2 - j_1 + 1$ denotes the number of octaves involved in the linear regression. This analytical result shows that the variance of the estimator decreases as $1/n$, in spite of the possible presence of long-range dependence in the analyzed process. It is noteworthy that, in practice, the relation $n_j = 2^{-j} n$ is not exactly satisfied because of boundary effects, which are systematically excluded a priori from the measures.

In the case of non-Gaussian processes, the implementation of the estimator is more subtle, since we cannot use analytical expressions for $g(j)$ and $\sigma_j^2$. Nevertheless, in the case of finite variance processes, the variables $(1/n_j)\sum_k d_X(j,k)^2$ are asymptotically Gaussian and we can show that correction terms can be introduced [ABR 00b] to the Gaussian case:

$$g(j) \sim -\frac{1 + C_4(j)/2}{n_j \log 2}; \qquad \sigma_j^2 \sim \frac{2}{\log^2 2}\cdot\frac{1 + C_4(j)/2}{n_j},$$



where $C_4$ denotes the normalized fourth-order cumulant of the wavelet coefficients at octave $j$:

$$C_4(j) = \frac{E\,d_X(j,k)^4 - 3\,\big(E\,d_X(j,k)^2\big)^2}{\big(E\,d_X(j,k)^2\big)^2}$$

The practical use of these relations requires the estimation of $C_4$, which can be difficult, as well as the guarantee that each octave contains a sufficient number of points for the above form, which results from an asymptotic expansion, to be valid. An approximate yet simple practical choice, regularly implemented, consists of using: $g(j) \equiv 0$; $\sigma_j^2 \equiv 2/(n_j \log^2 2)$; $a_j \equiv \sigma_j^2$. The numerical simulations proposed in [ABR 98, VEI 99] show that the performance of these choices is very satisfactory for the analysis of long-range dependence. Indeed, such a choice implies that the weight of $y_j$ in the linear regression is half that of $y_{j-1}$, which is a priori a realistic point of view for the study of long-range dependence (Gaussianization effect at large scales), but less obvious for the study of local regularity. This choice is all the more delicate as it affects the bias and the variance of the estimator at the same time; an alternative choice, $a_j$ constant, can also be considered.

In equation (2.28), from which the study of the scaling law behavior stems, the focus has, until now, been on the exponent $\alpha$ of the power law, since it defines the phenomenon. However, the measure of the multiplicative parameter $c_f$ can be fruitful for certain applications. This estimation is detailed in [VEI 99]. Finally, the case of self-similar processes with stationary increments of infinite variance (and/or mean value) will not be tackled here. It is developed in particular in [ABR 00a].

2.5.2. Emphasis on scaling laws and determination of the scaling range

The previous section relied on the hypothesis that the quantity $E\,d_X(j,k)^2$ behaves as a power law. This ideal situation is rarely observed for two reasons. 
On the one hand, real experimental data are likely to be only approximately described by the proposed models of scale invariance. On the other hand, certain models themselves induce only an approximate or asymptotic power law behavior, as is the case for long-range dependence or processes with fractal sample paths; in fact, only the self-similar model induces a strictly satisfied power law:

$$\text{LRD:}\quad E\,d_X(j,k)^2 \sim c_f C\,2^{j\alpha}, \quad j \to +\infty;$$

$$\text{Fractal:}\quad E\,d_X(j,k)^2 \sim c_f C\,2^{j\alpha}, \quad j \to -\infty;$$

$$H\text{-ss:}\quad E\,d_X(j,k)^2 = c_f C\,2^{j\alpha}, \quad \forall j.$$



Hence, in the implementation of the estimator described earlier, it is necessary to choose an octave range on which the measure is carried out, i.e., to select the octaves $j_1$ and $j_2$. Making this choice does not necessarily mean extracting the theoretical values of $j_1$ and $j_2$ from the data, since these are not always defined by the model, but rather optimizing the statistical performance of the estimator. Widening the octave range $[j_1, j_2]$ implies the use of a larger fraction of the available wavelet coefficients, resulting in a reduction of the estimation variance, as indicated by relation (2.31); conversely, it can also mean an increase in the estimation bias if we carry out the measure on a range where the behavior departs notably from a power law. The choice of the range is thus guided by the optimization of a bias-variance trade-off.

Take the example of long-range dependence: according to the model, we wish to choose $j_2 = +\infty$, i.e., in practice, $j_2$ as large as possible, the upper limit of the wavelet decomposition being fixed by the number of coefficients of the analyzed process and by the extent of boundary effects; as for $j_1$, it is not imposed by the model. Choosing a large $j_1$ makes it possible to work in a zone where the asymptotic behavior is satisfied, and thus means a small estimation bias but a large variance (this is essentially governed by the factor $2^{j_1}$, see equation (2.31), and thus doubles each time $j_1$ increases by 1). Qualitatively, we are led to tolerate a little bias (i.e., to widen the measure range towards the small octaves) to reduce the variance and thus minimize the mean square error (MSE = (bias)$^2$ + variance).

In the case of the local regularity measure, the situation is different. The model tends to impose a small $j_1$ (in practice, $j_1 = 1$) without fixing $j_2$. 
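The influence of $j_1$ and $j_2$ on the variance can be checked numerically. The sketch below is illustrative only and rests on two assumptions that are not restated in this section: the sums are taken as $S_m = \sum_j j^m / a_j$ and the regression weights as $w_j = (S_0 j - S_1)/\big(a_j (S_0 S_2 - S_1^2)\big)$, the usual weighted least-squares forms. It also uses the asymptotic variance $\sigma_j^2 = 2/(n_j \log^2 2)$ with $n_j = 2^{-j} n$ and $a_j = \sigma_j^2$. All function names are hypothetical.

```python
import math

LN2 = math.log(2.0)

def sigma2(n_j):
    """Asymptotic Gaussian-case variance of y_j, as in (2.30b)."""
    return 2.0 / (n_j * LN2 ** 2)

def regression_weights(octaves, a):
    """Weights w_j such that alpha_hat = sum_j w_j * y_j.

    Assumed definitions: S_m = sum_j j**m / a_j and
    w_j = (S_0 * j - S_1) / (a_j * (S_0 * S_2 - S_1**2)).
    """
    s0 = sum(1.0 / aj for aj in a)
    s1 = sum(j / aj for j, aj in zip(octaves, a))
    s2 = sum(j * j / aj for j, aj in zip(octaves, a))
    det = s0 * s2 - s1 * s1
    return [(s0 * j - s1) / (aj * det) for j, aj in zip(octaves, a)]

def estimate_alpha(octaves, y, a):
    """Weighted linear regression slope of y_j against j."""
    return sum(wj * yj for wj, yj in zip(regression_weights(octaves, a), y))

def variance_direct(n, j1, j2):
    """Var(alpha_hat) = sum_j sigma_j^2 w_j^2 with n_j = 2**-j * n and a_j = sigma_j^2."""
    octaves = list(range(j1, j2 + 1))
    var_y = [sigma2(n * 2.0 ** (-j)) for j in octaves]
    w = regression_weights(octaves, var_y)
    return sum(s * wj * wj for s, wj in zip(var_y, w))

def variance_closed_form(n, j1, j2):
    """Right-hand side of (2.31); requires at least two octaves."""
    J = j2 - j1 + 1
    F = LN2 ** 2 * (1.0 - (J * J / 2.0 + 2.0) * 2.0 ** (-J) + 2.0 ** (-2 * J))
    return (1.0 - 2.0 ** (-J)) / (n * 2.0 ** (1 - j1) * F)

if __name__ == "__main__":
    n = 2 ** 14
    for j1, j2 in ((1, 10), (2, 10), (3, 10), (1, 12)):
        print(j1, j2, variance_direct(n, j1, j2), variance_closed_form(n, j1, j2))
```

Under these assumptions the direct sum and the closed form agree, raising $j_1$ by one octave roughly doubles the variance, and extending $j_2$ changes it only marginally.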
As in the preceding case, increasing $j_2$ may introduce a bias, through a departure from the ideal behavior, but will only slightly reduce the variance: indeed, it is sufficient to study the form of the dependence of the variance of $\hat\alpha$ to note the reduced influence of $j_2$, in accordance with the fact that fewer and fewer wavelet coefficients exist at the coarsest octaves. In practice, we are led to choose the narrowest possible range, restricted to the lowest octaves.

To move towards more quantitative arguments, it is necessary to completely specify the models of the processes studied. We will keep on using the method which imposes the fewest possible a priori assumptions on the model and consider that it is sufficient to postulate that scale invariance phenomena are present. We resort to the quantity:

$$G(j_1, j_2) = \sum_{j=j_1}^{j_2} \frac{(y_j - \hat\alpha j - \hat b)^2}{\sigma_j^2} \tag{2.32}$$

which conveys a usual measure of the mean-square error between data and model. For Gaussian $y_j$, the variable $G(j_1, j_2)$ follows a chi-squared law $\chi^2_{J-2}$ with $J - 2$ degrees of freedom. The dependence in $j_1, j_2$ of the quantity:

$$Q(j_1, j_2) = 1 - \int_0^{G(j_1, j_2)} \chi^2_{J-2}(u)\,du$$

[Figure 2.3: two log-scale diagrams, each plotting $y_j$ against octave $j$ for $j = 1, \ldots, 10$; plots not reproduced.]

Figure 2.3. Examples of log-scale diagram. Right: second order log-scale diagram for a long-range dependent process, also possessing a highly pronounced correlation structure of short memory type (visible at the small octaves); in practice, it is an ARFIMA(0, d, 2) with d = 0.25 and a second order moving average $\Psi(B) = 1 + 2B + B^2$, implying $(\gamma, c_f) = (0.50, 6.38)$. The vertical error bars at each $j$ show 95% confidence intervals for $y_j$. A linear behavior is observed over the octaves $[j_1, j_2] = [4, 10]$, which excludes the small octaves (short range memory) but includes the larger ones. A weighted linear regression enables, in spite of the strong presence of short-term dependencies, a precise estimation of $\gamma$: $\hat\gamma = 0.53 \pm 0.07$, $\hat c_f = 6.0$ with $4.5 < \hat c_f < 7.8$. Left: second order log-scale diagram for a self-similar process (FBM) of parameter H = 0.8. The linear behavior spreads over all scales and allows a precise estimation of H: $\hat H = 0.79$
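The goodness-of-fit quantities $G(j_1, j_2)$ and $Q(j_1, j_2)$ are simple to compute once a candidate octave range is fixed. The sketch below is an illustration under stated assumptions, not the procedure of [VEI 99]: the chi-squared distribution function is obtained from the standard series for the regularized incomplete gamma function, the synthetic log-scale diagram is invented for the example (an exact line for $j \geq 4$ and an offset of 0.3 at the first three octaves, mimicking a short-memory departure), and all function names are hypothetical.

```python
import math

def chi2_cdf(x, dof):
    """P(X <= x) for a chi-squared variable with `dof` degrees of freedom.

    Series for the regularized lower incomplete gamma function P(dof/2, x/2).
    Intended for the small `dof` met here (number of octaves minus 2).
    """
    if x <= 0.0:
        return 0.0
    a, z = dof / 2.0, x / 2.0
    if z > 600.0:                       # far upper tail for small dof
        return 1.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-16 * abs(total) and n < 10000:
        n += 1
        term *= z / (a + n)
        total += term
    return min(1.0, total * math.exp(-z + a * math.log(z) - math.lgamma(a)))

def fit_line(octaves, y, var_y):
    """Weighted least squares fit of y_j = alpha * j + b with weights 1 / var_y[j]."""
    s0 = sum(1.0 / v for v in var_y)
    s1 = sum(j / v for j, v in zip(octaves, var_y))
    s2 = sum(j * j / v for j, v in zip(octaves, var_y))
    t0 = sum(yj / v for yj, v in zip(y, var_y))
    t1 = sum(j * yj / v for j, yj, v in zip(octaves, y, var_y))
    det = s0 * s2 - s1 * s1
    return (s0 * t1 - s1 * t0) / det, (s2 * t0 - s1 * t1) / det

def goodness_of_fit(octaves, y, var_y):
    """Return (G, Q) as in (2.32) and the definition of Q; needs at least 3 octaves."""
    alpha, b = fit_line(octaves, y, var_y)
    G = sum((yj - alpha * j - b) ** 2 / v for j, yj, v in zip(octaves, y, var_y))
    return G, 1.0 - chi2_cdf(G, len(octaves) - 2)

def synthetic_diagram(j1, j2, n=2 ** 16):
    """Hypothetical diagram: line 0.5 * j + 1 for j >= 4, shifted by 0.3 for j <= 3."""
    octaves = list(range(j1, j2 + 1))
    y = [0.5 * j + 1.0 + (0.3 if j <= 3 else 0.0) for j in octaves]
    var_y = [2.0 / (n * 2.0 ** (-j) * math.log(2.0) ** 2) for j in octaves]
    return octaves, y, var_y

if __name__ == "__main__":
    for j1 in (1, 2, 3, 4, 5):
        print(j1, goodness_of_fit(*synthetic_diagram(j1, 10)))
```

Scanning $j_1$ in this way shows $Q$ jumping from values near 0 to values near 1 once the small octaves that break the linear behavior are excluded.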

makes it possible to guide the choice of the analysis range. A value of Q close to 1 indicates the adequacy of the model, as opposed to Q close to 0. One proposed approach consists of examining break points in the behavior of Q as $j_1$ and $j_2$ vary.

2.5.3. Robustness of the wavelet approach

One of the great difficulties in the analysis of scale invariance phenomena is linked to the fact that their qualitative expressions are close to those induced by non-stationarities. For a long time, data modeling by scale invariance was rejected because it was considered, sometimes correctly, as an artefact due to non-stationarity. The difficulties are of two types: on the one hand, as just mentioned, identifying scale invariance when the data are, in fact, non-stationary; on the other hand, failing to detect scale invariance, or to estimate its parameters correctly, when non-stationary effects are superimposed. Wavelet analysis has made it possible to find solutions to these two problems.

For the first type of problem, it was proposed [VEI 01] to split the signal under analysis into L non-overlapping segments. For each segment, we carry out an estimation $\hat\alpha_l$ of the scale invariance parameter. Then we validate the relevance of using a scale invariance model by testing the similarity of the $\hat\alpha_l$ across blocks. Hence, this is not a stationarity test in the general sense, but more simply a


[Figure 2.4: left panels plot the signals against time (0 to 16,000 samples); right panels plot $y_j$ against octave $j$; plots not reproduced.]

Figure 2.4. Robustness with respect to superimposed trends. Left: fractional Gaussian noise with H = 0.80 (above) and with superimposed sinusoidal and linear trends (below). Right: log-scale diagrams of the signal corrupted by the trends, as computed with a Daubechies 2 wavelet (i.e., N = 2) (above) and a Daubechies 6 wavelet (i.e., N = 6) (below). We observe that increasing N cancels the effects of the superimposed trends and allows for a reliable estimation of H: $\hat H = 0.795$
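The cancellation of polynomial trends by vanishing moments can be reproduced at the level of a single filtering step. The sketch below is a minimal illustration, not the analysis used for Figure 2.4: it compares the Haar wavelet filter (one vanishing moment) with the four-tap Daubechies 2 filter (two vanishing moments, standard closed-form coefficients), applies one decimated high-pass step, and keeps interior positions only so that boundary effects play no role. The function name `detail_coefficients` and the test signals are hypothetical.

```python
import math

SQRT2, SQRT3 = math.sqrt(2.0), math.sqrt(3.0)

# High-pass (wavelet) filters.
HAAR_G = [1.0 / SQRT2, -1.0 / SQRT2]                          # 1 vanishing moment
_h = [c / (4.0 * SQRT2)
      for c in (1.0 + SQRT3, 3.0 + SQRT3, 3.0 - SQRT3, 1.0 - SQRT3)]
DB2_G = [_h[3], -_h[2], _h[1], -_h[0]]                        # 2 vanishing moments

def detail_coefficients(signal, g):
    """One decimated high-pass step: d_k = sum_m g[m] * signal[2k + m].

    Only positions where the filter fits entirely inside the signal are kept.
    """
    n, length = len(signal), len(g)
    return [sum(g[m] * signal[2 * k + m] for m in range(length))
            for k in range((n - length) // 2 + 1)]

if __name__ == "__main__":
    t = list(range(64))
    x = [math.sin(0.9 * u) + 0.3 * math.cos(2.3 * u) for u in t]   # stand-in for X
    linear = [0.01 * u + 3.0 for u in t]                           # degree-1 trend
    z = [xi + pi for xi, pi in zip(x, linear)]                     # Z = X + p

    for name, g in (("Haar", HAAR_G), ("Daubechies 2", DB2_G)):
        dx = detail_coefficients(x, g)
        dz = detail_coefficients(z, g)
        print(name, max(abs(a - b) for a, b in zip(dx, dz)))
```

With the Haar filter the linear trend leaves a constant offset in every coefficient, whereas with the Daubechies 2 filter the coefficients of Z and X coincide up to rounding, in line with the condition $P < n_\psi$; a quadratic trend would in turn require a wavelet with at least three vanishing moments.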

test aimed at detecting abnormally large fluctuations of the estimates $\hat\alpha_l$, which lead us to reject the presence of scale invariance. In practice, the properties of the wavelet coefficients (weak statistical dependence among coefficients) and the definition and theoretical study of the estimator $\hat\alpha$ presented earlier make it possible to conduct this test as the detection of a change of mean value among independent Gaussian variables of unknown but identical mean values and of possibly different but known variances [VEI 01].

For the second type of problem, the number of vanishing moments of the wavelet plays a fundamental role. By definition, the wavelet coefficients of a polynomial $p(t)$ of degree $P$ strictly smaller than the number of vanishing moments of the mother wavelet, $P < n_\psi$, are exactly zero. This means that if the observed signal Z is made up of a signal to analyze X on which a polynomial trend $p$ is superimposed, the wavelet analysis of the scale invariance phenomena that are likely to be present in X will be, given its linear nature, insensitive to the presence of $p$ as soon as



$n_\psi$ is sufficiently large. In practice, we do not necessarily know a priori the order of the corrupting polynomial, if any; we can thus simply carry out a series of wavelet analyses with increasing $n_\psi$. When the results no longer change with $n_\psi$, this is an indication that $n_\psi$ has exceeded $P$. It is noteworthy that this procedure is made fully practicable by the low computational cost of the discrete wavelet decomposition. Admittedly, in practice, the trends superimposed on the data are not polynomial. However, in the case where they possess a sufficiently regular behavior (e.g., quasi-sinusoidal oscillations) or a slightly irregular one (e.g., in $t^{-\beta}$, $\beta > 0$), the preceding argument remains valid: when we increase $n_\psi$, the magnitude of the wavelet coefficients of the trend decreases, whereas that of X remains identical; the effect of the trend thus becomes quasi-negligible [ABR 98].

The superposition of a deterministic trend on the process X can be interpreted as a non-stationarity of the mean; this situation can be made more complex by considering that the variance of the process itself evolves. Hence, we can write the observation as:

$$Z(t) = a(t) + b(t)\,X(t)$$

where $X(t)$ is a process presenting scale invariance in the form of one of the models referred to earlier (self-similarity, long-range dependence, etc.) and where $a(t)$ and $b(t)$ are sufficiently regular deterministic functions. It has been shown that varying the number of vanishing moments of the mother wavelet makes it possible to overcome the drift effects of $a$ and $b$ and to carry out reliable estimations of the scale invariance parameters associated only with X [ROU 99]. Finally, the analysis of the signal plus noise situation, which is usual in signal processing when we write an observation Z = Y + X (where X is the process with scale invariance to be studied and Y some additive random noise), has been considered in [WOR 96] by maximum likelihood approaches and will not be developed here.

2.6. Conclusion

In this chapter, we have focused on a qualitative description rather than a rigorous formalization of the concepts, models and analyses. The main concern was to offer the reader who is not specialized in this field, but is eager to apply certain principles of fractal analysis to real data, some entry points that are as much theoretical as practical. Especially in the first part, emphasis has been put on the relations between the different models used to describe scaling laws, their similarities and differences, so that this notion becomes accessible. Similarly, in the second part, the presentation of wavelet tools enabling the characterization and analysis of scaling laws has been structured around the different practical aspects essential for their implementation (selection of the scale range, estimation of the parameters, robustness and algorithm). All the analysis techniques described here, as well as some extensions and variations, have been implemented in Matlab, with toolboxes that are freely



accessible on the websites http://www.ens-lyon.fr/pabry and http://perso.ens-lyon.fr/paulo.goncalves.

2.7. Bibliography

[ABR 95] ABRY P., GONÇALVES P., FLANDRIN P., “Wavelets, spectrum estimation, and 1/f processes”, in ANTONIADIS A., OPPENHEIM G. (Eds.), Wavelets and Statistics, Springer-Verlag, Lecture Notes in Statistics 103, New York, p. 15–30, 1995.
[ABR 98] ABRY P., VEITCH D., “Wavelet analysis of long-range dependent traffic”, IEEE Trans. on Info. Theory, vol. 44, no. 1, p. 2–15, 1998.
[ABR 00a] ABRY P., PESQUET-POPESCU P., TAQQU M.S., “Wavelet based estimators for self similar α-stable processes”, in International Conference on Signal Processing: Sixteenth World Computer Congress (Beijing, China, 2000), August 2000.
[ABR 00b] ABRY P., TAQQU M.S., FLANDRIN P., VEITCH D., “Wavelets for the analysis, estimation, and synthesis of scaling data”, in PARK K., WILLINGER W. (Eds.), Self-similar Network Traffic and Performance Evaluation, John Wiley & Sons, p. 39–88, 2000.
[AVE 98] AVERKAMP R., HOUDRÉ C., “Some distributional properties of the continuous wavelet transform of random processes”, IEEE Trans. on Info. Theory, vol. 44, no. 3, p. 1111–1124, 1998.
[BER 94] BERAN J., Statistics for Long-memory Processes, Chapman and Hall, New York, 1994.
[CAM 95] CAMBANIS S., HOUDRÉ C., “On the continuous wavelet transform of second-order random processes”, IEEE Trans. on Info. Theory, vol. 41, no. 3, p. 628–642, 1995.
[CAS 90] CASTAING B., GAGNE Y., HOPFINGER E., “Velocity probability density functions of high Reynolds number turbulence”, Physica D, vol. 46, p. 177, 1990.
[CAS 96] CASTAING B., “The temperature of turbulent flows”, Journal de physique II France, vol. 6, p. 105–114, 1996.
[DAU 92] DAUBECHIES I., Ten Lectures on Wavelets, SIAM, 1992.
[DEL 00] DELBEKE L., ABRY P., “Stochastic integral representation and properties of the wavelet coefficients of linear fractional stable motion”, Stochastic Processes and their Applications, vol. 86, p. 177–182, 2000.
[FLA 92] FLANDRIN P., “Wavelet analysis and synthesis of fractional Brownian motion”, IEEE Trans. on Info. Theory, vol. IT-38, no. 2, p. 910–917, 1992.
[FRI 95] FRISCH U., Turbulence: The Legacy of A. Kolmogorov, Cambridge University Press, Cambridge, 1995.
[HOL 90] HOLSCHNEIDER M., TCHAMITCHIAN P., “Régularité locale de la fonction non différentiable de Riemann”, in LEMARIÉ P.G. (Ed.), Les ondelettes en 1989, Springer-Verlag, 1990.
[JAF 89] JAFFARD S., “Exposants de Hölder en des points donnés et coefficients d’ondelettes”, Comptes rendus de l’Académie des sciences de Paris, vol. 308, 1989.
[LEL 94] LELAND W.E., TAQQU M.S., WILLINGER W., WILSON D.V., “On the self-similar nature of Ethernet traffic (extended version)”, IEEE/ACM Trans. on Networking, vol. 2, p. 1–15, 1994.
[MAL 92] MALLAT S.G., HWANG W.L., “Singularity detection and processing with wavelets”, IEEE Trans. on Info. Theory, vol. 38, no. 2, p. 617–643, 1992.
[MAL 98] MALLAT S.G., A Wavelet Tour of Signal Processing, Academic Press, San Diego, California, 1998.
[MAN 68] MANDELBROT B.B., VAN NESS J.W., “Fractional Brownian motions, fractional noises, and applications”, SIAM Review, vol. 10, no. 4, p. 422–437, 1968.
[MAN 97] MANDELBROT B.B., Fractals and Scaling in Finance, Springer, New York, 1997.
[MAS 93] MASRY E., “The wavelet transform of stochastic processes with stationary increments and its application to fractional Brownian motion”, IEEE Trans. on Info. Theory, vol. 39, no. 1, p. 260–264, 1993.
[PAR 00] PARK K., WILLINGER W. (Eds.), Self-similar Network Traffic and Performance Evaluation, John Wiley & Sons (Interscience Division), 2000.
[PES 99] PESQUET-POPESCU B., “Statistical properties of the wavelet decomposition of certain non-Gaussian self-similar processes”, Signal Processing, vol. 75, no. 3, 1999.
[ROU 99] ROUGHAN M., VEITCH D., “Measuring long-range dependence under changing traffic conditions”, in IEEE INFOCOM’99 (Manhattan, New York), IEEE Computer Society Press, Los Alamitos, California, p. 1513–1521, March 1999.
[SAM 94] SAMORODNITSKY G., TAQQU M.S., Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance, Chapman and Hall, New York and London, 1994.
[TEI 00] TEICH M., LOWEN S., JOST B., VIBE-RHEYMER K., HENEGHAN C., “Heart rate variability: measures and models”, Nonlinear Biomedical Signal Processing, vol. II, Dynamic Analysis and Modeling (M. Akay, Ed.), Ch. 6, p. 159–213, IEEE Press, 2001.
[TEW 92] TEWFIK A.H., KIM M., “Correlation structure of the discrete wavelet coefficients of fractional Brownian motions”, IEEE Trans. on Info. Theory, vol. IT-38, no. 2, p. 904–909, 1992.
[VEI 99] VEITCH D., ABRY P., “A wavelet based joint estimator of the parameters of long-range dependence”, IEEE Transactions on Information Theory (special issue on “Multiscale statistical signal analysis and its applications”), vol. 45, no. 3, p. 878–897, 1999.
[VEI 00] VEITCH D., ABRY P., FLANDRIN P., CHAINAIS P., “Infinitely divisible cascade analysis of network traffic data”, in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (Istanbul, Turkey), June 2000.
[VEI 01] VEITCH D., ABRY P., “A statistical test for the constancy of scaling exponents”, IEEE Trans. on Sig. Proc., vol. 49, no. 10, p. 2325–2334, 2001.
[WOR 96] WORNELL G.W., Signal Processing with Fractals – A Wavelet-based Approach, Prentice-Hall, 1996.

Chapter 3

Wavelet Methods for Multifractal Analysis of Functions

Chapter written by Stéphane JAFFARD.

3.1. Introduction

A large number of signals are very irregular. In the most complex situations, the irregularity manifests itself in different forms and may change its form almost instantaneously. The most widely studied example in physics is the velocity signal of a turbulent flow. During the 1980s, precise records of the velocity of a turbulent flow were made in the ONERA wind tunnel at Modane (see Gagne et al. [GAG 87]). A thin wire, heated at one point, is placed in the flow of turbulent air; the rate at which the temperature decreases is directly proportional to the orthogonal component of the flow velocity at the heated point. We obtain incomplete information as the signal is 1D; however, it is more precise than the best numerical simulations currently being carried out. The study of this signal showed that it is not statistically homogeneous; its regularity varies considerably from one point to another [ARN 95b, FRI 95]. Such signals cannot be modeled by processes such as fractional Brownian motion, for example. The techniques of multifractal analysis were developed in order to model and analyze such behaviors.

Originally introduced for fully developed turbulence, these techniques began to be used, within a few years, in various scientific fields: traffic analysis (road and Internet traffic) [ABR 98, LEV 97, TAQ 97, WIL 96, WIL 00], modeling of economic signals [MAN 97], texture analysis [BIS 98], electrocardiograms [AMA 00], etc.

The mathematical theory of multifractal functions has expanded considerably: not only were numerous heuristic arguments used to numerically analyze the multifractal



signals studied and justified under certain assumptions, within a limited domain of validity, but, most importantly, these mathematical results had various consequences for applications; they led researchers to introduce new tools that made it possible to refine and enrich the techniques of multifractal analysis. Within mathematics, multifractal analysis has acquired a highly original position, since functions taken from very different domains prove to be multifractal:
– in probability, the sample paths of Lévy processes [JAF 99];
– in analytic number theory, trigonometric series linked to theta functions [JAF 96a];
– in analysis, geometric constructions such as “Peano functions” [JAF 97c, JAF NIC];
– in arithmetic, functions where Diophantine approximation properties play a role [JAF 97b];
– etc.

Multifractal analysis thus provides us with a vocabulary and a cluster of methods that help establish connections and find analogies between diverse fields of science and mathematics.

Almost immediately after their appearance, wavelet analysis techniques were applied to the multifractal analysis of signals by Arneodo and his team at the CRPP in Bordeaux [ARN 95b]. These techniques proved to be extremely powerful for the mathematical analysis of problems, as well as for the construction of robust numerical algorithms. In this chapter, we explain in detail certain fundamental results related to the wavelet methods of multifractal analysis. We also briefly describe the vast scientific panorama of recent times and conclude with a review of specialized articles. In order to simplify the presentation and the notations, all results will be stated in dimension 1. Most of the results extend easily to functions defined on $\mathbb{R}^d$ (see [JAF 04b, JAF 06]). The reader can find a more detailed presentation of the different fields of application of multifractal analysis in [ARN 95a, JAF 01a, MAN 98, LAS 08, WEN 07].

3.2. General points regarding multifractal functions

3.2.1. Important definitions

Multifractal functions help in modeling signals whose regularity varies from one point to another. Thus, the first problem is to mathematically define a function’s regularity at every point. What is “pointwise regularity”? It is a way of quantifying,



with the help of a positive real number $\alpha$, the fact that the graph of a function is more or less rough at a given point $x_0$ (the picture is not merely superficial; the concepts we introduce have, in fact, been used in roughness measurement [DUB 89, TRI 97]). Hölder regularity generalizes familiar concepts: the “minimum level” of regularity is continuity. A function $f$ is continuous at $x_0$ if $|f(x) - f(x_0)| \to 0$ when $x \to x_0$; continuity will correspond to a regularity index $\alpha = 0$. Similarly, $f$ is differentiable at $x_0$ if there is an affine function $P$ such that $|f(x) - P(x - x_0)| \to 0$ faster than $|x - x_0|$ when $x \to x_0$; differentiability will correspond to a regularity index $\alpha = 1$. The following definition is a direct generalization of these two cases.

DEFINITION 3.1.– Let $\alpha$ be a positive real number and $x_0 \in \mathbb{R}$; a function $f: \mathbb{R} \to \mathbb{R}$ is $C^\alpha(x_0)$ if there exists a polynomial $P$ of degree less than $\alpha$ such that:

$$|f(x) - P(x - x_0)| \le C\,|x - x_0|^\alpha \tag{3.1}$$
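Condition (3.1) can be probed numerically for exponents below 1, where $P$ reduces to the constant $f(x_0)$. The sketch below is only an illustration: the test function $f(x) = |x - 0.3|^{0.6}$, the grid of radii $2^{-j}$ and the function names are all assumptions made for the example. It measures the largest increment $|f(x) - f(x_0)|$ over shrinking neighborhoods of $x_0$ and fits the slope of its logarithm against the logarithm of the radius.

```python
import math

def oscillation(f, x0, r, samples=200):
    """max |f(x) - f(x0)| over a symmetric grid of points with |x - x0| <= r."""
    return max(abs(f(x0 + s * r * k / samples) - f(x0))
               for k in range(1, samples + 1) for s in (-1.0, 1.0))

def holder_slope(f, x0, j_min=6, j_max=20):
    """Least-squares slope of log2(oscillation) against log2(radius), radius = 2**-j.

    With P constant, the slope can only reveal regularity up to 1.
    """
    xs = [-float(j) for j in range(j_min, j_max + 1)]
    ys = [math.log2(oscillation(f, x0, 2.0 ** (-j))) for j in range(j_min, j_max + 1)]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

def f(x):
    """Hypothetical test function with a cusp of exponent 0.6 at x = 0.3."""
    return abs(x - 0.3) ** 0.6

if __name__ == "__main__":
    print("slope at the cusp x0 = 0.3:", holder_slope(f, 0.3))
    print("slope at the smooth point x0 = 0.7:", holder_slope(f, 0.7))
```

At the cusp the fitted slope recovers the exponent 0.6, while at a point where $f$ is smooth with non-zero derivative the increments scale linearly and the slope is close to 1; detecting higher regularity would require subtracting a polynomial $P$ of higher degree, as the definition prescribes.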

NOTE 3.1.– The polynomial $P$ is unique (if $Q$ was acceptable as well, by applying Definition 3.1 with $P$ and $Q$, we would have $|P(x) - Q(x)| \le C|x|^\alpha$ and, since $P - Q$ is of degree at most $[\alpha]$, we would have $P - Q = 0$). The constant in the polynomial $P(x - x_0)$ is always $f(x_0)$; similarly, the first degree term, if present, is always $(x - x_0) f'(x_0)$ (by the definition of the derivative). Also, if $f$ is $C^{[\alpha]}$ in the vicinity of $x_0$, the comment that we just made concerning the uniqueness of $P$ implies that the polynomial $P$ is Taylor’s expansion of $f$ at $x_0$ of order $[\alpha]$. However, equation (3.1) can hold for large values of $\alpha$ without $f$ being even twice differentiable at $x_0$ (in which case Taylor’s expansion stops at the first-order term); consider, for example, the “chirp” $x^n \sin(x^{-n})$ at 0 for large $n$ (which does not have a second derivative at 0 since its first derivative is not continuous). We see that $P$ gives a generalization of the notion of Taylor’s expansion (see also Chapter 4; see [GIN 00, MEY 98] for extensions of these concepts to more general contexts). We finally note that equation (3.1) implies that $f$ is bounded in the vicinity of $x_0$; therefore, we suppose that the functions we look at are locally bounded (see [MEY 98], where an exponent is introduced which makes it possible to define a weak notion of Hölder regularity for functions that are not a priori locally bounded, and likewise for distributions).

DEFINITION 3.2.– The Hölder exponent of $f$ at $x_0$ is:

$$h_f(x_0) = \sup\{\alpha : f \text{ is } C^\alpha(x_0)\}$$

The Hölder exponent is a function that is defined point by point and describes the local variations of the regularity of $f$. Certain functions have a constant Hölder exponent. Thus, the Weierstrass series:

$$W_{b,H}(x) = \sum_j b^{-Hj} \sin(b^j x)$$



has a Hölder exponent that is constant and equal to $H$; similarly, the sample paths of Brownian motion verify almost surely that $h_B(x) = 1/2$ for all $x$. More generally, a fractional Brownian motion of exponent $H$ has at every point a Hölder exponent equal to $H$. Such functions are irregular everywhere.

Our objective is to study functions whose Hölder exponent can jump from one point to another. In such a situation, the numerical calculation of the function $h_f(x_0)$ is completely unstable and of little significance. We rather try to extract less precise information: whether or not the function $h_f$ takes a certain given value $H$ and, if it does, what is the size of the set of points where $h_f$ takes this value? Here we are faced with a new problem: what is the “right” notion of “size” in this context? We will not be able to fully justify the answer to this question because it results from the study of numerous mathematical examples. Let us just keep in mind that the term “size” does not signify “Lebesgue measure” because, in general, there exists a Hölder exponent that is the “most probable” and that appears almost everywhere. The other exponents thus appear on sets of zero measure and the Lebesgue measure does not make it possible to differentiate them. Besides, the “right” notion of size cannot be the box dimension because these sets are usually dense. In fact, we expect them to be fractal. A traditional mathematical method to measure the size of such dense sets of zero measure is to calculate their Hausdorff dimension. Let us recall its definition.

DEFINITION 3.3.– Let $A$ be a subset of $\mathbb{R}$. For $\varepsilon > 0$, let us note:

$$M_\varepsilon^d = \inf_R \sum_i \varepsilon_i^d$$

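where $R$ signifies a generic covering of $A$ by intervals $]x_i, x_i + \varepsilon_i[$ of length $\varepsilon_i \le \varepsilon$. The infimum is thus taken over all such coverings. For all $d \in [0, 1]$, the Hausdorff $d$-dimensional measure of $A$ is:

$$\operatorname{mes}_d(A) = \lim_{\varepsilon \to 0} M_\varepsilon^d$$

This measure takes the value $+\infty$ or 0 except for, at most, one value of $d$, and the Hausdorff dimension of $A$ is:

$$\dim(A) = \sup\Big\{d : \lim_{\varepsilon \to 0} M_\varepsilon^d = +\infty\Big\} = \inf\Big\{d : \lim_{\varepsilon \to 0} M_\varepsilon^d = 0\Big\}$$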
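The jump of the $d$-dimensional measure from $+\infty$ to 0 can be seen on a classical example that is not taken from this chapter: the triadic Cantor set. The sketch below, with hypothetical names, evaluates $\sum_i \varepsilon_i^d$ for the natural covering by $2^n$ intervals of length $3^{-n}$. It should be read with one caveat: a particular covering only bounds $M_\varepsilon^d$ from above, so the computation shows that the dimension is at most $\log 2 / \log 3$; that this value is actually attained is a classical result which requires a separate lower-bound argument.

```python
import math

D_CANTOR = math.log(2.0) / math.log(3.0)   # about 0.6309

def covering_sum(level, d):
    """Sum of eps_i**d over the natural covering of the triadic Cantor set:
    2**level intervals, each of length 3**-level."""
    return (2.0 ** level) * (3.0 ** (-level * d))

if __name__ == "__main__":
    for d in (0.5, D_CANTOR, 0.8):
        print(d, [covering_sum(level, d) for level in (10, 20, 40, 60)])
```

For $d$ below $\log 2 / \log 3$ the sums grow without bound as the covering is refined, for $d$ above they tend to 0, and at the critical value they stay equal to 1, which is the behavior described in the definition.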

DEFINITION 3.4.– Let $f$ be a function and $H \ge 0$. If $H$ is a value taken by the function $h_f(x)$, let us note by $E_H$ the set of points $x$ where we have $h_f(x) = H$. The singularity spectrum (or Hölder spectrum) of the signal being studied is:

$$f_H(H) = \dim(E_H)$$

(we use the convention $f_H(H) = -\infty$ if $H$ is not a Hölder exponent of $f$).



The concept of multifractal function is not precisely defined (just like the concept of a fractal set). For us, a multifractal function is a function whose spectrum of singularities is “non-trivial”, i.e. not reduced to a point. In the examples, $f_H(H)$ takes positive values on an entire interval $[H_{\min}, H_{\max}]$. Its determination thus requires the study of an infinity of fractal sets $E_H$, hence the term “multifractal”.

3.2.2. Wavelets and pointwise regularity

For many reasons that we shall gradually discover, wavelet methods of analysis are a favorite tool for studying multifractal functions. The first reason is that we have a simple criterion that characterizes the value of the Hölder exponent by a decay condition on the wavelet coefficients of a given function. Let us begin by recalling certain points related to wavelet analysis methods. An orthonormal wavelet basis of $L^2(\mathbb{R})$ has a particularly simple algorithmic form: we start from a function $\psi$ (the “mother” wavelet) that is regular and well localized; the technical assumptions are:

$$\forall i = 0, \ldots, N, \quad \forall m \in \mathbb{N}, \qquad |\psi^{(i)}(x)| \le \frac{C(i, m)}{(1 + |x|)^m}$$

for a relatively large $N$. We can choose such functions $\psi$ so that, moreover, the translates and dilates of $\psi$:

$$\psi_{j,k}(x) = 2^{j/2}\,\psi(2^j x - k), \qquad j, k \in \mathbb{Z}$$

form an orthonormal basis of $L^2(\mathbb{R})$ (see [MEY 90]) (we will choose $N$ to be larger than the maximum regularity that we expect to find in the signal analyzed; we can also take $N = +\infty$, and the wavelet $\psi$ then belongs to the Schwartz class). We verify that the wavelet has a corresponding number of zero moments:

$$\int \psi(x)\,x^i\,dx = 0 \qquad \forall i = 0, \ldots, N$$

Thus, every function $f \in L^2(\mathbb{R})$ can be written as:

$$f(x) = \sum_{j \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} d_f(k, j)\,\psi(2^j x - k)$$

where $d_f(k, j)$ are the wavelet coefficients of $f$:

$$d_f(k, j) = 2^j \int f(t)\,\psi(2^j t - k)\,dt$$



We should note that we do not choose an $L^2$ normalization for the wavelet coefficients, but an $L^1$ normalization, which is better adapted to the study of Hölder regularity. Let us first study the characterization by wavelets of uniform regularity. Let us begin by defining it. A function $f$ belongs to $C^\alpha(\mathbb{R})$ if condition (3.1) holds for all $x_0$, with the possibility of choosing $C$ uniformly, i.e. independently of $x_0$. If we have $\alpha < 1$, taking into account that $P(x - x_0) = f(x_0)$, this condition can be rewritten as:

$$f \in C^\alpha(\mathbb{R}) \iff \forall x, y \in \mathbb{R}, \quad |f(x) - f(y)| \le C\,|x - y|^\alpha$$

This condition of uniform regularity is characterized by a condition of uniform decay of the wavelet coefficients of $f$ (see [MEY 90]).

PROPOSITION 3.1.– If $\alpha \in\, ]0, 1[$, we have the following characterization:

$$f \in C^\alpha(\mathbb{R}) \iff \exists C > 0 : \ \forall j, k \quad |d_f(k, j)| \le C\,2^{-\alpha j}$$
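The decay $2^{-\alpha j}$ of the $L^1$-normalized coefficients can be observed on a simple case. The sketch below is an illustration under explicit assumptions and not part of the original text: it takes $f(x) = |x|^\alpha$ with $\alpha = 0.5$, which satisfies the uniform Hölder condition on $\mathbb{R}$, and uses the Haar wavelet instead of a regular wavelet, so it only illustrates the direct implication (which needs zero integral and localization, not smoothness). The integrals are computed exactly from an antiderivative; all names are hypothetical.

```python
import math

ALPHA = 0.5

def antiderivative(x, alpha=ALPHA):
    """An antiderivative of f(x) = |x|**alpha."""
    return math.copysign(abs(x) ** (alpha + 1.0), x) / (alpha + 1.0)

def haar_coefficient(k, j, alpha=ALPHA):
    """d_f(k, j) = 2**j * integral of f(t) * psi(2**j * t - k) dt,
    with the Haar wavelet psi = +1 on [0, 1/2) and -1 on [1/2, 1)."""
    h = 2.0 ** (-j)
    left, mid, right = k * h, (k + 0.5) * h, (k + 1.0) * h
    first_half = antiderivative(mid, alpha) - antiderivative(left, alpha)
    second_half = antiderivative(right, alpha) - antiderivative(mid, alpha)
    return (first_half - second_half) / h

def sup_coefficient(j, alpha=ALPHA, k_range=64):
    """Largest |d_f(k, j)| over the translations k in [-k_range, k_range)."""
    return max(abs(haar_coefficient(k, j, alpha)) for k in range(-k_range, k_range))

if __name__ == "__main__":
    for j in range(2, 12):
        # The rescaled supremum stays constant, i.e. sup_k |d_f(k, j)| = C * 2**(-alpha*j).
        print(j, sup_coefficient(j), sup_coefficient(j) * 2.0 ** (ALPHA * j))
```

For this self-similar example the rescaled supremum is the same at every octave, equal to $(1 - 2^{-\alpha})/(\alpha + 1)$ and reached by the two coefficients adjacent to the singularity at 0; for a general function only the bound of the proposition is expected.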

Proof. Let us assume that $f \in C^\alpha(\mathbb{R})$; then, for all $j, k$, we have:

$$d_f(k, j) = 2^j \int f(x)\,\psi(2^j x - k)\,dx = 2^j \int \big(f(x) - f(k 2^{-j})\big)\,\psi(2^j x - k)\,dx$$

(because the wavelets have zero integral); thus:

$$|d_f(k, j)| \le C\,2^j \int |x - k 2^{-j}|^\alpha\,\frac{C}{(1 + |2^j x - k|)^2}\,dx \le C\,2^{-\alpha j}$$

(by the change of variable $t = 2^j x - k$). Let us now prove the converse. Let us assume that we have $|d_f(k, j)| \le C\,2^{-\alpha j}$. Let $j_0$ be defined by $2^{-j_0 - 1} \le |x - x_0| < 2^{-j_0}$ and let us note:

$$f_j(x) = \sum_k d_f(k, j)\,\psi(2^j x - k)$$

From the localization assumption on $\psi$, we deduce that:

$$|f_j(x)| \le C \sum_k \frac{2^{-\alpha j}}{(1 + |2^j x - k|)^2} \le C\,2^{-\alpha j}$$

109

and similarly, using the localization of ψ , we deduce that we have |fj (x)| C 2(1−α)j . We obtain: |f (x) − f (x0 )| |fj (x) − fj (x0 )| + |fj (x)| + |fj (x0 )| j>j0

jj0

j>j0

Using finite increments, the first term is bounded by: |x − x0 | sup |fj (t)| C|x − x0 | 2(1−α)j C|x − x0 |2(1−α)j0 C jj0

[x,x0 ]

jj0

(because we have α < 1). Coming back to the definition of j0 , we see that the first term is bounded by C|x − x0 |α . The second and the third terms are bounded by: 2−αj C 2−αj0 C|x − x0 |α j>j0

thus, the converse estimate holds.

The reader will easily be able to extend this result to the case where $\alpha > 1$ and $\alpha \notin \mathbb{N}$; see [MEY 90] for the case $\alpha \in \mathbb{N}$. If a function $f$ belongs to one of the spaces $C^\alpha$, for an $\alpha > 0$, we will say that $f$ is uniformly Hölderian. The following theorem is similar to Proposition 3.1, but gives a result of pointwise regularity.

THEOREM 3.1.– Let $\alpha \in\ ]0, 1[$. If $f$ is $C^\alpha(x_0)$, then we have:

$$|d_f(k, j)| \le C\, 2^{-\alpha j} \left( 1 + |2^j x_0 - k|^\alpha \right) \qquad (3.2)$$

Conversely, if the wavelet coefficients of $f$ verify (3.2) and if $f$ is uniformly Hölderian, then, for $|x - x_0| \le 1$, we obtain:

$$|f(x) - f(x_0)| \le C |x - x_0|^\alpha \log\left( \frac{2}{|x - x_0|} \right) \qquad (3.3)$$

($f$ is “nearly” $C^\alpha(x_0)$, up to a small logarithmic correction).

Proof. Let us assume that $f$ is $C^\alpha(x_0)$; then we have:

$$d_f(k, j) = 2^j \int f(x)\, \psi(2^j x - k)\, dx = 2^j \int \left( f(x) - f(x_0) \right) \psi(2^j x - k)\, dx$$

(because the wavelets have zero integral). Thus, $|d_f(k, j)|$ is bounded by:

$$2^j \int \frac{C |x - x_0|^\alpha}{(1 + |2^j x - k|)^2}\, dx \le 2 C\, 2^j \int \frac{|x - k 2^{-j}|^\alpha + |k 2^{-j} - x_0|^\alpha}{(1 + |2^j x - k|)^2}\, dx$$



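(because, for $a, b > 0$, we have $(a + b)^\alpha \le 2 a^\alpha + 2 b^\alpha$). By the change of variable $t = 2^j x - k$ once again, we obtain $|d_f(k, j)| \le C\, 2^{-\alpha j} (1 + |2^j x_0 - k|^\alpha)$.

Let us now prove the converse. Assuming that there exists an $\varepsilon > 0$ such that $f \in C^\varepsilon(\mathbb{R})$, let $j_0$ and $j_1$ be defined by:

$$2^{-j_0 - 1} \le |x - x_0| < 2^{-j_0} \qquad \text{and} \qquad j_1 = \frac{\alpha}{\varepsilon}\, j_0$$

From (3.2), we deduce that, for all $x$, we have:

$$|d_f(k, j)| \le C \left( 2^{-\alpha j} + |x_0 - k 2^{-j}|^\alpha \right) \le 2 C \left( 2^{-\alpha j} + |x - x_0|^\alpha + |x - k 2^{-j}|^\alpha \right)$$

and thus:

$$|f_j(x)| \le C \sum_k \frac{2^{-\alpha j} + |x - x_0|^\alpha + 2^{-\alpha j} |2^j x - k|^\alpha}{(1 + |2^j x - k|)^2}$$

We have:

$$\sum_k \frac{1}{(1 + |2^j x - k|)^2} \le C$$

and similarly:

$$\sum_k \frac{1}{(1 + |2^j x - k|)^{2 - \alpha}} \le C$$

because $\alpha < 1$. We get:

$$|f_j(x)| \le C \left( 2^{-\alpha j} + |x - x_0|^\alpha \right) \le C\, 2^{-\alpha j} \left( 1 + 2^{\alpha j} |x - x_0|^\alpha \right)$$

Using the localization of $\psi'$, we obtain, in the same manner:

$$|f_j'(x)| \le C\, 2^{(1 - \alpha) j} \left( 1 + 2^{\alpha j} |x - x_0|^\alpha \right)$$

In particular, if $j \le j_0$, we obtain $|f_j'(x)| \le C\, 2^{(1 - \alpha) j}$. We can write:

$$|f(x) - f(x_0)| \le \sum_{j \le j_0} |f_j(x) - f_j(x_0)| + \sum_{j > j_0} |f_j(x)| + \sum_{j > j_0} |f_j(x_0)| \qquad (3.4)$$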


The first term is bounded by:

$$C |x - x_0| \sum_{j \le j_0} \sup_{[x, x_0]} |f_j'(t)| \le C |x - x_0| \sum_{j \le j_0} 2^{(1 - \alpha) j} \le C |x - x_0|\, 2^{(1 - \alpha) j_0} \le C |x - x_0|^\alpha$$

As far as the second term is concerned, if $j > j_0$, the bound on $|f_j(x)|$ obtained above becomes $|f_j(x)| \le C |x - x_0|^\alpha$ and we thus have:

$$\sum_{j_0 < j < j_1} |f_j(x)| \le C \sum_{j_0 < j < j_1} |x - x_0|^\alpha \le C (j_1 - j_0) |x - x_0|^\alpha \le C |x - x_0|^\alpha \log\left( \frac{2}{|x - x_0|} \right)$$

because $j_1 - j_0 \le (\alpha / \varepsilon)\, j_0$. Moreover, since $f$ belongs to $C^\varepsilon(\mathbb{R})$, we have:

$$\sum_{j \ge j_1} |f_j(x)| \le C \sum_{j \ge j_1} 2^{-\varepsilon j} \le C\, 2^{-\varepsilon j_1} \le C |x - x_0|^\alpha$$

(because of the choice of $j_1$). As far as the third term is concerned, the same bound at $x = x_0$ becomes $|f_j(x_0)| \le C\, 2^{-\alpha j}$ and thus:

$$\sum_{j > j_0} |f_j(x_0)| \le C \sum_{j > j_0} 2^{-\alpha j} \le C |x - x_0|^\alpha$$

and the converse part of Theorem 3.1 is proved.

Here again, the reader will easily be able to extend this result to exponents $\alpha > 1$ (see [JAF 91]). As far as the second part of the theorem is concerned, we sometimes use the slight variation below.

PROPOSITION 3.2.– Let $\alpha \in\ ]0, 1[$. If the wavelet coefficients of $f$ verify:

$$|d_f(k, j)| \le C\, 2^{-\alpha j} \left( 1 + |2^j x_0 - k|^{\alpha'} \right) \qquad (3.5)$$

for an $\alpha' < \alpha$, then $f$ is $C^\alpha(x_0)$.

The proof of this proposition is very similar to that of the second part of the theorem and therefore we only outline it. We show this time that (3.5) implies that:

$$|f_j(x)| \le C\, 2^{-\alpha j} \left( 1 + 2^{\alpha' j} |x - x_0|^{\alpha'} \right)$$

and:

$$|f_j'(x)| \le C\, 2^{(1 - \alpha) j} \left( 1 + 2^{\alpha' j} |x - x_0|^{\alpha'} \right)$$

and we finish by using the estimate on $f_j'$ for $j \le j_0$ and on $f_j$ for $j \ge j_0$. This proposition also extends to the case where $\alpha > 1$ (see [JAF 91]).

NOTE 3.2.– We could get the impression that Proposition 3.2, contrary to Theorem 3.1, does not make any assumption of global regularity. This is not the case because, if (3.5) is verified and if $|x - x_0| \le 1$, then we can deduce that $|d_f(k, j)| \le C\, 2^{-(\alpha - \alpha') j}$, i.e. that $f \in C^{\alpha - \alpha'}$ uniformly close to $x_0$. We could nevertheless refer to [JAF 00c], where pointwise Hölder estimates are obtained from wavelet coefficients in the absence of any assumption of uniform regularity.

From the theorem, we can immediately deduce the following corollary, which characterizes the Hölder exponent by the local decay of the wavelet coefficients.

COROLLARY 3.1.– If $f$ is uniformly Hölder, the Hölder exponent of $f$ at every point $x_0$ is given by:

$$h_f(x_0) = \liminf_{j \to \infty}\ \inf_k\ \frac{\log |d_f(k, j)|}{\log\left( 2^{-j} + |k 2^{-j} - x_0| \right)} \qquad (3.6)$$

This corollary is used in all mathematical results where the singularity spectrum of a function $f$ is derived from its wavelet coefficients. On the other hand, it can be used to obtain the numerical value of $h_f(x)$ only if the function $h_f$ is constant, or if it varies little at the discretization scale of the signal.

3.2.3. Local oscillations

Using the Hölder exponent as a means to measure pointwise regularity makes it a powerful signal and image analysis tool (see [DAO 95, LEV 95]). However, there are several drawbacks in characterizing pointwise regularity by the Hölder exponent alone:
– inability to measure the oscillatory character of the local behavior of $f$ close to $x_0$;
– lack of stability under “traditional” operators, such as differential operators, pseudo-differential operators or the Hilbert transform $\mathcal{H}$ (in one dimension, the convolution operator with $1/x$, which maps a real signal to the associated analytic signal in signal analysis). For instance, we can construct locally bounded functions $f$ with $f \in C^\alpha(x_0)$ and $\mathcal{H}f \notin C^\beta(x_0)$ for any $\beta > 0$ (see [JAF 91]).

The 2-microlocal spaces $C^{s,s'}_{x_0}$ provide a substitute for the notion of pointwise regularity which does not have these drawbacks; moreover, they are defined even if $f$ is not locally bounded. We have already seen this type of condition in (3.5).

DEFINITION 3.5.– A distribution $f$ belongs to the 2-microlocal space $C^{s,s'}_{x_0}$ if its wavelet coefficients satisfy, for sufficiently small $|x_0 - k 2^{-j}|$:

$$\exists C > 0 : \quad |d_f(k, j)| \le C\, 2^{-j(s + s')} \left( 2^{-j} + |k 2^{-j} - x_0| \right)^{-s'} \qquad (3.7)$$

This condition does not depend on the chosen (sufficiently regular) wavelet basis (see [JAF 91]).

DEFINITION 3.6.– The 2-microlocal domain of $f$ at $x_0$, denoted by $E(f(x_0))$, is the set of couples $(s, s')$ such that $f \in C^{s,s'}_{x_0}$.

By interpolation between conditions (3.7), we find that the 2-microlocal domain is a convex set and, moreover, by using the trivial lower bound $2^{-j} + |k 2^{-j} - x_0| \ge 2^{-j}$, we can see that its boundary is a curve whose slope is everywhere larger than $-1$. Conversely, we can check that these conditions characterize the boundaries of the 2-microlocal domains at one point $x_0$ (see [GUI 98, MEY 98]). A more difficult and unresolved issue is to determine the compatibility conditions between the different 2-microlocal domains $E(f(x))$ of a function when the point $x$ varies.

The 2-microlocal domain provides very accurate information on the behavior of $f$ close to $x_0$; specifically, we can derive from it the Hölder exponent of primitives or of fractional derivatives of $f$ (see [ARN 97]). However, this complete information is superfluous in practice. In fact, we need to keep only a few parameters (and not a convex function at each point, namely the boundary of the 2-microlocal domain): at least the Hölder exponent, and often a second parameter $\beta$, which measures the oscillating character of $f$ close to $x_0$. Indeed, the same Hölder exponent at a given point can reflect very different behaviors, such as “cusps” $|x - x_0|^H$ or, on the contrary, highly oscillating functions, such as “chirps”:

$$g_{H,\beta}(x) = |x - x_0|^H \sin\left( \frac{1}{|x - x_0|^\beta} \right) \qquad (3.8)$$

for $\beta > 0$. In signal processing, the notion of chirp models functions whose instantaneous frequency increases rapidly at a given time (see [CHA 99]). The exponent $\beta$ measures the speed at which the instantaneous frequency of (3.8) diverges at $x_0$.
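To make the role of $\beta$ in (3.8) concrete, the sketch below evaluates a chirp and counts its sign changes on an interval approaching $x_0$. It is an illustration added here, not the authors' procedure: the instantaneous frequency is taken, as an assumption, to be the derivative of the phase $|x - x_0|^{-\beta}$ divided by $2\pi$, and the function names and numerical values are hypothetical.

```python
import numpy as np

def chirp(x, x0=0.0, H=1.0 / 3.0, beta=1.0):
    """g_{H,beta}(x) = |x - x0|**H * sin(|x - x0|**(-beta)), for x != x0."""
    r = np.abs(x - x0)
    return r**H * np.sin(r ** (-beta))

def instantaneous_frequency(x, x0=0.0, beta=1.0):
    """|d/dx of the phase |x - x0|**(-beta)| divided by 2 pi."""
    return beta * np.abs(x - x0) ** (-beta - 1.0) / (2.0 * np.pi)

def count_sign_changes(values):
    """Number of strict sign changes between consecutive samples."""
    signs = np.sign(values)
    return int(np.count_nonzero(signs[1:] * signs[:-1] < 0))

beta = 1.0
a, b = 0.01, 0.1                       # interval to the right of x0 = 0
x = np.linspace(a, b, 200001)
crossings = count_sign_changes(chirp(x, beta=beta))

# The zeros are the points where the phase is a multiple of pi.
phase_at_a, phase_at_b = a ** (-beta), b ** (-beta)
expected = int(np.floor(phase_at_a / np.pi) - np.ceil(phase_at_b / np.pi) + 1)
print(crossings, expected)
```

The count of zeros between $a$ and $b$ grows like $a^{-\beta}/\pi$ as $a \to 0$, which is one way of seeing that larger $\beta$ means faster divergence of the oscillations at $x_0$.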



An additional motivation for the analysis of chirps is that this type of behavior causes the first versions of the multifractal formalism to fail, as we will see in section 3.4.

We presently have two possible mathematical definitions of the exponent $\beta$ in a general framework. Of course, both give the same value for the functions (3.8). We will define them and discuss their respective advantages with regard to signal analysis.

Let $f$ be locally bounded and let us denote by $f^{(-n)}$ an $n$ times iterated primitive of $f$. As shown by a sequence of integrations by parts, a consequence of the oscillations of (3.8) close to $x_0$ is that $g_{H,\beta}^{(-n)}$ belongs to $C^{H + n(\beta + 1)}(x_0)$ (the increase of the Hölder exponent at $x_0$ is not 1 at each integration, as expected for an arbitrary function, but $\beta + 1$). This observation has led to the following definition given by Meyer [JAF 96b].

DEFINITION 3.7.– Let $H \ge 0$ and $\beta > 0$. A function $f \in L^\infty(\mathbb{R})$ is a chirp of type $(H, \beta)$ at $x_0$ if, for any $n \ge 0$, $f$ can be written as $f = g_n^{(n)}$, with $g_n \in C^{H + n(1 + \beta)}(x_0)$.

The chirp type can be derived from the 2-microlocal domain at $x_0$. In fact, if $f \in C^H(x_0)$, we also have $f \in C^{H,-H}_{x_0}$ and, in general, the condition $f = g_n^{(n)}$ with $g_n \in C^{H + n(1 + \beta)}(x_0)$ implies that $f \in C^{H + n\beta,\, -H - n(\beta + 1)}_{x_0}$. The other characterization of chirps, given below (see [JAF 96b]), shows that their definition appropriately reflects the oscillatory phenomenon present in the functions $g_{H,\beta}$.

PROPOSITION 3.3.– A function $f \in L^\infty$ is a chirp of type $(H, \beta)$ at $x_0$ if and only if there exist a function $r(x)$, $C^\infty$ close to $x_0$, and $\varepsilon > 0$ such that, if $0 < x - x_0 < \varepsilon$, then:

$$f(x) = r(x - x_0) + (x - x_0)^H\, g_+\left( (x - x_0)^{-\beta} \right)$$

and if $-\varepsilon < x - x_0 < 0$, then:

$$f(x) = r(x - x_0) + |x - x_0|^H\, g_-\left( |x - x_0|^{-\beta} \right)$$

where the functions $g_+$ and $g_-$ are indefinitely oscillating, i.e. they have bounded primitives of any order.

It is very easy to check that the interior of the set of points $(H, \beta)$ such that $f$ is a chirp of type $(H, \beta)$ at $x_0$ is always of the form $H < h_f(x_0)$, $\beta < \beta_f(x_0)$ [JAF 00a]. The positive number $\beta_f(x_0)$ is called the chirp exponent at $x_0$.



A highly oscillating local behavior such as (3.8) is remarkable and it was believed for a long time that it could be observed only at isolated points. This is why Meyer's result, proving that the Riemann series $\sum n^{-2} \sin(\pi n^2 x)$ has a dense set of chirps of type $(\frac{3}{2}, 1)$, was unexpected (see [JAF 96b, JAF 01a]). Since then, we know how to construct functions with chirps almost everywhere (see [JAF 00a]).

Definition 3.7 is not adapted to the analysis of signals. In fact, we will see that it is not stable under the addition of an arbitrary regular (but not $C^\infty$) function. We will therefore introduce another definition of the local exponent measuring the oscillation, which does not have this drawback. Let us consider the following example. Let $B(x)$ be a Brownian motion and:

$$C(x) = x^{1/3} \sin\left( \frac{1}{x} \right) + B(x) \qquad (3.9)$$

The Hölder exponent of $B(x)$ being $\frac{1}{2}$ everywhere, the largest singularity at 0 is the chirp $x^{1/3} \sin(\frac{1}{x})$ and we effectively observe this oscillating behavior in the graph of $C(x)$ zoomed around 0. However, after integration, the random term becomes dominant (in fact, an integration by parts shows that the primitive of the first term is $O(|x|^{7/3})$, whereas a primitive of Brownian motion has Hölder exponent $\frac{3}{2}$ everywhere); the oscillating behavior then disappears in the primitive. The oscillations of the graph of $C(x)$, which do not exist in the primitive, are not taken into account in Definition 3.7: we see that $C(x)$ is not a chirp at 0 (it is actually a chirp of exponents $(\frac{1}{3}, 0)$). Nevertheless, this strongly oscillating behavior should be taken into account in the chirp exponent: $C(x)$ “should” have exponents $(\frac{1}{3}, 1)$.

Let us now show how to define an oscillating exponent that takes the value $\beta$ for (3.8) and is not changed by adding a “regular noise” (it will take the value 1 for $C(x)$ and no longer 0). It is obvious, from the previous example, that the oscillating exponent should not be determined from primitives of $f$.

An “infinitesimal” fractional integration should be used, so that the order of importance of the terms in (3.9) is not disturbed. Let $h_t(x_0)$ be the Hölder exponent at $x_0$ of a fractional integral of order $t$ of the function $f$. To be more precise, if $f$ is a locally bounded function, let us denote by $h_t(x_0)$ the Hölder exponent at $x_0$ of:

$$I^t(f) = (\mathrm{Id} - \Delta)^{-t/2} (\varphi f) \qquad (3.10)$$

where $\varphi$ is a $C^\infty$ function with compact support such that $\varphi(x_0) = 1$ and $(\mathrm{Id} - \Delta)^{-t/2}$ is the convolution operator which, in the Fourier domain, is the multiplication by $(1 + |\xi|^2)^{-t/2}$ (see Chapter 7 for more details). In the example of the function $C(x)$, we find at $x_0 = 0$ that $h_t(x_0) = \frac{1}{3} + 2t$ after a fractional integration of sufficiently small order $t$, i.e. as long as $\frac{1}{3} + 2t < \frac{1}{2} + t$. In this example, the increase of the Hölder exponent at $x_0$ after a fractional integration of sufficiently small order $t$ is $2t$; we can therefore recover the oscillating exponent $\beta$ in this way. In general, the function $t \mapsto h_t(x_0)$ is concave, so that its right derivative exists at 0 (with the possible value $+\infty$). The following definition is derived from it.

DEFINITION 3.8.– Let $f : \mathbb{R}^d \to \mathbb{R}$ be a locally bounded function. The oscillating exponent of $f$ at $x_0$ is:

$$\beta = \left. \frac{\partial}{\partial t} h_t(x_0) \right|_{t = 0} - 1 \qquad (3.11)$$

This exponent belongs to $[0, +\infty]$. We note that if $h_t(x_0) = +\infty$, then $\beta$ is not defined. The following proposition, taken from [AUB 99], shows that this definition appropriately reflects the oscillating phenomenon present in (3.8).

PROPOSITION 3.4.– If $f$ is uniformly Hölderian, then, for all $H < h(x_0)$ and all $\beta < \beta(x_0)$, $f$ can be written as:

$$f(x) = |x - x_0|^H\, g\left( \frac{1}{|x - x_0|^\beta} \right) + r(x)$$

with $r(x) \in C^\alpha(x_0)$ for an $\alpha > h(x_0)$, and where $g$ is indefinitely oscillating.

3.2.4. Complements

We discuss here the known results relating to the construction of functions having a prescribed Hölder exponent (and possibly a prescribed oscillating exponent). This problem was first encountered in the context of speech simulation. A speech signal possesses a Hölder exponent which varies drastically (particularly in the case of consonants) and this led to the idea of efficiently storing such a signal by keeping only the information contained in the Hölder exponent. The following theorem characterizes the functions which are Hölder exponents (see [AYA 08]).

THEOREM 3.2.– A positive function $h(x)$ is the Hölder exponent of a bounded function $f$ if and only if $h$ can be written as the lim inf of a sequence of continuous functions.

When $h(x)$ has a minimal Hölder regularity, a natural construction is provided by the multifractional Brownian motion (see [BEN 97, PEL 96]). Contrary to the previous result, a couple of functions $(h(x), \beta(x))$ should verify very specific conditions to be a couple (Hölder exponent, chirp exponent): $\beta(x)$ should vanish on a dense set [GUI 98].

We also have a constructive result: if this condition is satisfied, we can prescribe this couple almost everywhere (see [JAF 05]), but unfortunately we do not know how to characterize the couples of the form (Hölder exponent, chirp exponent).



3.3. Random multifractal processes

We present two examples of random multifractal processes. The first is that of Lévy processes. Their importance derives, on the one hand, from the central place these processes have in probability theory and, on the other hand, from their increasing importance in physics and financial modeling, particularly in situations where Gaussian models are inadequate (see, for instance, [MAN 97, SCH 95] and Chapters 5 and 6). Our second example involves random wavelet series, that is, processes whose wavelet coefficients are independent and, at a given scale, have the same law (these laws can be set in an arbitrary manner at each scale). In addition to the intrinsic advantages of this model, it also makes it possible to introduce new concepts and it enriches the possible variants of the multifractal formalism.

3.3.1. Lévy processes

A Lévy process $X_t$ ($t \ge 0$) with values in $\mathbb{R}$ can be defined as a process with stationary independent increments: $X_{t+s} - X_t$ is independent of $(X_v)_{0 \le v \le t}$ and has the same law as $X_s$. The characteristic function of the Lévy process is written as $E(e^{i\lambda X_t}) = e^{-t\varphi(\lambda)}$, where:

$$\varphi(\lambda) = ia\lambda + C\lambda^2 + \int \left( 1 - e^{i\lambda x} + i\lambda x\, \mathbf{1}_{|x| \le 1} \right) \pi(dx)$$

[...] (and we can “forget” this first process, whose addition does not modify the spectrum) and the series $\sum_{j \ge 0} X_t^j$, where the $X_t^j$ are independent compensated compound Poisson processes with Lévy measure $\pi_j(dx) = \mathbf{1}_{2^{-j} \le |x| < \dots}$ [...] $> 0$ according to the initial condition $u_0(x) = u(x, 0)$; in fact, if $U$ is a primitive of $u$, then $U$ verifies:

$$\frac{\partial U}{\partial t} + \frac{1}{2} \left( \frac{\partial U}{\partial x} \right)^2 = \nu\, \frac{\partial^2 U}{\partial x^2}$$

Then, we carry out the Cole-Hopf transformation, which consists of setting $\varphi = e^{-U/2\nu}$; $\varphi$ then verifies the linear heat equation, which we solve explicitly, hence the expression of $U$. By passing to the limit $\nu \to 0$ in this expression, we obtain, using a standard technique (the Laplace method), the following result (see [EVA 98] for the details of the calculations).

Let us suppose, to simplify things, that the initial condition $u_0$ is zero on $]-\infty, 0[$ and that $u_0(s) + s \ge 0$ for $s$ large enough. To fix ideas, let us look at the solution at time $t = 1$. First, we consider, for each $x \ge 0$, the function of the variable $s \ge 0$:

$$F_x(s) = \int_0^s \left( u_0(r) + r - x \right) dr$$

and we denote by $a(x, 1)$ the largest point where the minimum is attained:

$$a(x, 1) = \max\{ s \ge 0 : F_x(s) \le F_x(s') \ \text{for all } s' \} \qquad (3.18)$$

The limit solution when $\nu \to 0$ at time $t = 1$ is then given by:

$$u(x, 1) = x - a(x, 1) \qquad (3.19)$$
Formula (3.18) shows that the random process $a(x, 1)$ obtained when the initial condition is a Brownian motion on $\mathbb{R}_+$ (and zero on $\mathbb{R}_-$) is a subordinator. We call a subordinator an increasing Lévy process $(\sigma_x, x \ge 0)$. A traditional example, which plays an important role later, is that of the first passage times of a Brownian motion with drift. More specifically, let us consider a real standard Brownian motion $(B_s, s \ge 0)$, let us write $X_s = B_s + s$ and let us introduce, for $x \ge 0$:

$$\tau_x = \inf\{ s \ge 0 : X_s > x \}$$

Because $\tau_x$ is a stopping time and since $X_{\tau_x} = x$, the strong Markov property applied to the Brownian motion implies that the process $X'_s = X_{s + \tau_x} - x$ is also a Brownian motion with drift, which is independent of the portion of the trajectory before $\tau_x$, $(X_r, 0 \le r \le \tau_x)$. For any $z \in [0, x]$, the first passage time $\tau_z$ clearly depends only on the trajectory before $\tau_x$ and hence $(X'_s, s \ge 0)$ is independent of $(\tau_z, 0 \le z \le x)$. The identification:

$$\tau_{x+y} - \tau_x = \inf\{ s \ge 0 : X_{s + \tau_x} > x + y \} = \inf\{ s \ge 0 : X'_s > y \}$$

then highlights the independence and the homogeneity of the increments of $\tau$, which is hence a subordinator.

Similar arguments apply to the increments of the function (3.18) for the inviscid Burgers' equation (3.17) with Brownian initial condition, i.e. when $u(x, 0) = B_x$ for $x \ge 0$. Indeed, we verify that $(a(x, 1) - a(0, 1), x \ge 0)$ and $(\tau_x, x \ge 0)$ have the same law (see [BER 98]).

The isolated example of Burgers' equation with Brownian initial data may lead us to hope that more general results are true, and in particular that large classes of non-linear partial differential equations generically develop multifractal solutions. However, at present, there are no proven results of this type (the reader can nevertheless consult [VER 94] concerning Burgers' equation in several space dimensions).



3.3.3. Random wavelet series

Since we are interested in the local properties of the functions, it is equivalent, and also easier, to work with periodic wavelets, which are obtained by periodization of a usual wavelet basis (see [MEY 90]) and are defined on the torus $\mathbb{T} = \mathbb{R}/\mathbb{Z}$. The periodic wavelets:

$$\psi_{j,k}(x) = 2^{j/2} \sum_{l \in \mathbb{Z}} \psi\left( 2^j (x - l) - k \right), \qquad j \in \mathbb{N}, \quad 0 \le k < 2^j \qquad (3.20)$$

form an orthonormal basis of $L^2(\mathbb{T})$ (by adding the constant function equal to 1, see [MEY 90]; we use the same notation as for the wavelets on $\mathbb{R}$, which will not lead to any confusion). We also assume that $\psi$ has enough regularity and vanishing moments. Any periodic function $f$ is hence written as follows:

$$f(x) = \sum_{j,k} d_f(k, j)\, 2^{-j/2}\, \psi_{j,k}(x) \qquad (3.21)$$

where the wavelet coefficients of $f$ are given by:

$$d_f(k, j) = 2^{j/2} \int_0^1 \psi_{j,k}(t)\, f(t)\, dt$$

We assume that all the coefficients are independent and have the same law at each scale. Let $\rho_j$ be the common probability measure of the $2^j$ random variables $X_{j,k} = -(\log_2 |d_f(k, j)|)/j$ (the signs of the wavelet coefficients do not have any influence on the Hölder regularity, which is why we make no assumption on them). Thus, the measure $\rho_j$ verifies:

$$P\left( |d_f(k, j)| \ge 2^{-aj} \right) = \rho_j\left( (-\infty, a] \right)$$

We will make the following assumption on $\rho_j$:

$$\exists \varepsilon > 0 : \quad \mathrm{supp}(\rho_j) \subset [\varepsilon, +\infty]$$

It means that the sample paths of the process are uniformly Hölder. We need to define the logarithmic density $\bar\rho(\alpha)$ of the coefficients, i.e.:

$$\bar\rho(\alpha) = \lim_{\varepsilon \to 0}\ \limsup_{j \to +\infty}\ \frac{\log_2\left( 2^j \rho_j([\alpha - \varepsilon, \alpha + \varepsilon]) \right)}{j}$$

The reader should note that this density is, in fact, a large deviation spectrum, but calculated from the wavelet coefficients (see Chapter 4, where general results concerning large deviation spectra are established). Then, we write:

$$\tilde\rho(\alpha) = \begin{cases} \bar\rho(\alpha) & \text{if } \bar\rho(\alpha) \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

and:

$$H_{\max} = \left( \sup_{\alpha > 0} \frac{\tilde\rho(\alpha)}{\alpha} \right)^{-1}$$

THEOREM 3.4.– Let $f$ be a random wavelet series verifying the uniform regularity assumption. The singularity spectrum of almost every sample path of $f$ is supported by $[\varepsilon, H_{\max}]$ and, within this interval:

$$f_H(H) = H \sup_{\alpha \in [0, H]} \frac{\tilde\rho(\alpha)}{\alpha} \qquad (3.22)$$
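Formula (3.22) is straightforward to evaluate once $\tilde\rho$ is known on a grid. The following sketch does so for a hypothetical triangular density $\tilde\rho(\alpha) = \max(0, 1 - 2|\alpha - 1|)$, chosen only for illustration; for this choice one finds $H_{\max} = 1$ and $f_H(H) = 2H - 1$ on $[1/2, 1]$. The function name `rws_spectrum` and the grid are assumptions.

```python
import numpy as np

def rws_spectrum(alphas, rho_tilde, H_values):
    """Evaluate H * sup_{0 < alpha <= H} rho_tilde(alpha) / alpha on a grid.

    `alphas` must be increasing and strictly positive.
    """
    ratio = rho_tilde / alphas
    running_sup = np.maximum.accumulate(ratio)       # sup over alphas[0..i]
    idx = np.searchsorted(alphas, H_values, side="right") - 1  # last alpha <= H
    safe_idx = np.clip(idx, 0, None)
    return np.where(idx >= 0, H_values * running_sup[safe_idx], 0.0)

alphas = np.linspace(0.0, 2.0, 20001)[1:]            # grid without alpha = 0
rho_tilde = np.clip(1.0 - 2.0 * np.abs(alphas - 1.0), 0.0, None)
H_max = 1.0 / np.max(rho_tilde / alphas)

H_values = np.array([0.6, 0.75, 0.9])
spectrum = rws_spectrum(alphas, rho_tilde, H_values)
print(H_max, spectrum)
```

Replacing the triangular density by one with several bumps gives a quick way to observe non-concave spectra produced by (3.22).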

The function $\tilde\rho(\alpha)$ being essentially arbitrary, we notice that the singularity spectrum of a random wavelet series is not necessarily concave.

3.4. Multifractal formalisms

Even if the singularity spectrum of numerous mathematical functions can be determined directly from Definition 3.4, for multifractal signals it is not practical to determine the regularity at each point. This is because the Hölder exponent can be discontinuous everywhere, and it is even less practical to calculate the infinity of corresponding Hausdorff dimensions. Frisch and Parisi have introduced a formula which allows us to deduce the singularity spectrum of a signal from quantities that are easily measurable. It is the first example of what we now call multifractal formalisms. Several variants have been proposed since then; we shall describe some of them and compare their respective performances. Additional results are found in Chapter 4.

The formula proposed by Frisch and Parisi is based on the knowledge of the Besov spaces to which the function belongs. This is why we begin with a few reminders regarding these spaces and their characterization by wavelets.

3.4.1. Besov spaces and lacunarity

One of the reasons for the success of wavelet decompositions in applications is that they often provide very lacunary representations of signals (few coefficients are numerically non-negligible). This lacunarity is often quantified by determining to which Besov spaces the function considered belongs. The characterization of Besov spaces by wavelet coefficients is as follows (see [MEY 90]): for all $s \in \mathbb{R}$ and $p > 0$,

$$f \in B^{s,p}(\mathbb{R}) \iff \left( \sum_{j,k} \left| d_f(k, j)\, 2^{(s - \frac{1}{p}) j} \right|^p \right)^{1/p} < +\infty \qquad (3.23)$$



When $p \ge 1$, the Besov spaces are very close to the Sobolev spaces; indeed, if $L^{p,s}$ is the space of the functions of $L^p$ whose fractional derivatives of order $s$ still belong to $L^p$, we have the following embeddings:

$$\forall \varepsilon > 0, \quad \forall p \ge 1, \qquad L^{p,s+\varepsilon} \hookrightarrow B^{s,p} \hookrightarrow L^{p,s-\varepsilon}$$

However, a decisive advantage over Sobolev spaces is that the Besov spaces are defined for any $p > 0$. It is precisely these spaces for $p$ close to 0 that enable us to measure the lacunarity of the wavelet representation of $f$. We illustrate this point with an example. Let us consider the function:

$$H(x) = \begin{cases} 1 & \text{if } |x| \le 1 \\ 0 & \text{otherwise} \end{cases}$$

and let us assume that the chosen wavelet has compact support, contained in the interval $[-A, A]$. Because $\psi$ has zero integral, for each $j$ there are fewer than $4A$ non-zero wavelet coefficients, so that the wavelet decomposition of $H$ is very lacunary. Because $H(x)$ is bounded, we have $|d_H(k, j)| \le C$ for any $j, k$. By using (3.23), $H(x)$ belongs to $B^{s,p}(\mathbb{R})$ as soon as $s < 1/p$.

Let us show, conversely, that such an assertion is a way to quantify the fact that the wavelet decomposition of a function is very lacunary. Let us suppose that $f$ is a bounded function satisfying:

$$\forall p > 0, \quad \forall s < \frac{1}{p}, \qquad f \in B^{s,p}(\mathbb{R})$$

Then, for any $D > 0$ and for any $\varepsilon > 0$, at each scale $j$ there are fewer than $C(\varepsilon, D)\, 2^{\varepsilon j}$ coefficients of size larger than $2^{-Dj}$. Indeed, if this was not the case, by taking $p = \varepsilon/(2D)$, we would obtain $\sum_k |d_f(k, j)|^p \to +\infty$ when $j \to +\infty$, which is a contradiction.

Here is another illustration of the relation between the lacunarity of the wavelet decomposition and Besov regularity. We assume that $f$ belongs to $\bigcap_{p > 0} B^{1/p,p}$. Coming back to (3.23), we observe that this condition means exactly that the sequence $d_f(k, j)$ belongs to $l^p$ for any $p > 0$. Let us then denote by $d_n$ the decreasing rearrangement of the sequence of the moduli $|d_f(k, j)|$ of the wavelet coefficients; the sequence $d_n$ also belongs to $l^p$ for any $p > 0$. Thus:

$$\forall p, \quad \exists C_p \quad \text{such that} \quad \sum_{n=1}^{\infty} d_n^p \le C_p$$

Because the sequence $d_n$ is decreasing:

$$\forall N, \qquad N d_N^p \le \sum_{n=1}^{N} d_n^p \le \sum_{n=1}^{\infty} d_n^p \le C_p$$

and thus $d_N \le (C_p)^{1/p} N^{-1/p}$. Since we can take $p$ arbitrarily close to 0, we observe that the decreasing rearrangement of the sequence $|d_f(k, j)|$ has fast decay, which is, once again, a way to express lacunarity (the converse is immediate: if the sequence $d_n$ has fast decay, it belongs to all $l^p$ and thus this is also the case for the sequence $d_f(k, j)$).

The Besov spaces for $p < 1$ are not locally convex, which partly explains the difficulties in using them. Before the introduction of wavelets, these spaces were characterized either by the order of approximation of $f$ by rational fractions whose numerator and denominator have a fixed degree, or by the order of approximation by splines “with free nodes” (which means that we are free to choose the points where the piecewise polynomials are connected) (see [DEV 98, JAF 01a]). However, these characterizations are difficult to handle and hence do not have any real numerical applications.

Characterization (3.23) shows that the knowledge of the Besov spaces to which $f$ belongs is clearly linked to the asymptotic behavior (when $j \to +\infty$) of the moments of the distribution of the wavelet coefficients of $f$; see (3.26). Generally, more information is available; indeed, these moments are deduced from the histogram of the coefficients at each scale $j$. This is why it is natural to wonder which information regarding the pointwise regularity of $f$ can be deduced from the knowledge of these histograms. We present a study of this problem below. We note that cascade-type models for the evolution through the scales of the distribution function of the wavelet coefficients have been proposed to model the velocity of turbulent flows [ARN 98].

To start with, let us point out a limitation of multifractal analysis: functions having the same histograms of wavelet coefficients at each scale can have completely different singularity spectra [JAF 97a].

In multifractal analysis, it is not only the histogram of the coefficients which is important, but also their positions. This is why no formula deducing the singularity spectrum from the knowledge of the histograms can be valid in general. However, we can hope that some formulae are “more valid than others”. Indeed, we have observed that if the coefficient values are independent random variables, there is an almost sure spectrum; we shall notice that the formula that yields this spectrum differs from the formulae proposed until now. Another approach consists of specifying the information on the function and considering functional spaces that take the positions of the large wavelet coefficients into consideration (see [JAF 05]).
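The link between lacunarity and the decay of the decreasing rearrangement, $d_N \le (C_p)^{1/p} N^{-1/p}$ with $C_p = \sum_n d_n^p$, can be checked numerically on any finite coefficient sequence. The sketch below does this for a hypothetical sequence (random signs and sizes, chosen only for illustration); the function names are not from the text.

```python
import numpy as np

def decreasing_rearrangement(coeffs):
    """Moduli of the coefficients sorted in decreasing order (the sequence d_n)."""
    return np.sort(np.abs(np.ravel(coeffs)))[::-1]

def rearrangement_bound(d, p):
    """Right-hand side (C_p / N)**(1/p) of the bound on d_N, for N = 1, 2, ..."""
    C_p = np.sum(d**p)
    N = np.arange(1, d.size + 1)
    return (C_p / N) ** (1.0 / p)

rng = np.random.default_rng(1)
# Hypothetical lacunary sequence: a few large entries, many tiny ones.
coeffs = rng.standard_normal(4096) * 2.0 ** (-rng.integers(0, 30, size=4096))
d = decreasing_rearrangement(coeffs)

bound_holds = {p: bool(np.all(d <= rearrangement_bound(d, p) * (1 + 1e-9)))
               for p in (1.0, 0.5, 0.25)}
print(bound_holds)
```

The smaller $p$ is, the faster the guaranteed decay $N^{-1/p}$, at the price of a larger constant $C_p$; this is the sense in which Besov regularity for $p$ close to 0 measures lacunarity.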



3.4.2. Construction of formalisms

The construction of a multifractal formalism can be based on two types of considerations:
– counting arguments: we consider the increments (or wavelet coefficients) having a certain size; we estimate their number and deduce their contribution to some “calculable” quantities;
– more mathematical arguments: we prove that a bound of the spectrum, in terms of the “calculable” quantities, is generally true and that this bound is “generically” an equality. The term “generically” is to be understood in the sense of Baire categories if the starting information is of functional type, or as “almost surely” if the starting information is probabilistic.

We begin by describing the first approach; we do not exactly reproduce the initial argument of Frisch and Parisi, but rather its “translation” in terms of wavelets, as found in [ARN 95b, JAF 97a]. This approach admits two variants, according to the information on the function that we have. Indeed, we can start from:
– the partition function $\tau(q)$, defined from the knowledge of the Besov spaces to which $f$ belongs:

$$\tau(q) = \sup\{ s : f \in B^{s/q,q} \} = 1 + \liminf_{j \to +\infty} \frac{\log\left( \sum_k |d_f(k, j)|^q \right)}{\log(2^{-j})}$$

– histograms of wavelet coefficients. For each $j$, let $N_j(\alpha) = \#\{ k : |d_f(k, j)| \ge 2^{-\alpha j} \}$. Thus, we have $E(N_j(\alpha)) = 2^j \rho_j([0, \alpha])$. If:

$$\rho(\alpha, \varepsilon) = \limsup_{j \to \infty} \frac{\log\left( N_j(\alpha + \varepsilon) - N_j(\alpha - \varepsilon) \right)}{\log(2^j)} \qquad (3.24)$$

then we define:

$$\rho(\alpha) = \inf_{\varepsilon > 0} \rho(\alpha, \varepsilon) \qquad (3.25)$$

(there are of the order of $2^{\rho(\alpha) j}$ coefficients of size $\sim 2^{-\alpha j}$).

It is important to note that the information provided by $\rho(\alpha)$ is richer than that provided by $\tau(q)$; indeed, $\tau(q)$ can be deduced from the histograms with:

$$\tau(q) = \liminf_{j \to +\infty} \left( -\frac{1}{j} \right) \log_2\left( 2^{-j} \int 2^{-\alpha q j}\, dN_j(\alpha) \right) \qquad (3.26)$$

because, by definition of $N_j$, we have $\sum_k |d_f(k, j)|^q = \int 2^{-\alpha q j}\, dN_j(\alpha)$. It is easy to deduce that:

$$\tau(q) = \inf_{\alpha \ge 0} \left( \alpha q - \rho(\alpha) + 1 \right) \qquad (3.27)$$
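Relation (3.27) and its inversion by a second Legendre transformation can be checked numerically on a grid. The sketch below is an illustration under assumptions: $\rho$ is a hypothetical concave parabola, then the same parabola with a localized dent, and the function names and grids are arbitrary choices. In the concave case the double transformation returns $\rho$; in the dented case it returns only a concave majorant.

```python
import numpy as np

def tau_from_rho(alphas, rho, qs):
    """tau(q) = inf over alpha of (alpha * q - rho(alpha) + 1), as in (3.27)."""
    return np.min(qs[:, None] * alphas[None, :] - rho[None, :] + 1.0, axis=1)

def legendre_back(qs, tau, H_values):
    """inf over q of (H * q - tau(q) + 1): concave envelope of rho on the grid."""
    return np.min(H_values[:, None] * qs[None, :] - tau[None, :] + 1.0, axis=1)

alphas = np.linspace(0.1, 1.5, 1401)
qs = np.linspace(-3.0, 3.0, 601)
H_values = np.linspace(0.3, 1.3, 101)

def parabola(a):
    # Hypothetical concave density with maximum 1 at alpha = 0.8.
    return 1.0 - 2.0 * (a - 0.8) ** 2

def dent(a):
    # Localized non-concave perturbation around alpha = 0.8.
    return 0.3 * np.exp(-((a - 0.8) / 0.05) ** 2)

recovered = legendre_back(qs, tau_from_rho(alphas, parabola(alphas), qs), H_values)
max_error = np.max(np.abs(recovered - parabola(H_values)))

rho_dented = parabola(alphas) - dent(alphas)
envelope = legendre_back(qs, tau_from_rho(alphas, rho_dented, qs), H_values)
gap_at_dent = envelope[50] - (parabola(H_values[50]) - dent(H_values[50]))
print(max_error, gap_at_dent)
```

The positive gap at the dent shows numerically why two densities with the same concave envelope cannot be distinguished from $\tau(q)$ alone.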

On the other hand, we cannot reconstruct $\rho(\alpha)$ from $\tau(q)$; indeed, it is clear from (3.27) that two functions $\rho(\alpha)$ and $\rho'(\alpha)$ having the same concave envelope lead to the same function $\tau(q)$. From $\tau(q)$, we can thus only obtain the concave envelope of $\rho(\alpha)$, by carrying out a Legendre transformation once again.

We will now describe the heuristic arguments on which the construction of these multifractal formalisms is based. We divide them into four steps, highlighting the implicit assumptions made in each of them.

STEP 1. The first assumption, common to both approaches, is that the Hölder exponent at each point $x_0$ is given by the order of magnitude of the wavelet coefficients of $f$ in a cone $|k 2^{-j} - x_0| \le C\, 2^{-j}$. With respect to (3.6), if they decay as $2^{-Hj}$, we then have $h_f(x_0) = H$. This assumption is verified if $f$ has only “cusp” type singularities ([MEY 98]), i.e. if the oscillating exponent is zero everywhere. If we start from the data of $\rho(H)$, we have, by assumption, $2^{\rho(H) j}$ wavelet coefficients of size $2^{-Hj}$. By using the supports of the corresponding wavelets to cover $E_H$, we expect to obtain $f_H(H) = \rho(H)$. Thus, we obtain a first form of multifractal formalism, the so-called “large deviation” formalism, which simply asserts that:

$$f_H(H) = \rho(H)$$

Let us briefly justify this name, as well as the name of large deviation spectrum that is sometimes given to the function $\rho$. The theory of large deviations deals with the calculation of probabilities which are so small that they can be correctly estimated only on logarithmic scales. The basic example is as follows: if the $X_i$ are independent standard (centered, unit variance) Gaussian variables, and if $\bar S_n = \frac{1}{n} \sum_{i=1}^{n} X_i$, then we obtain $\frac{1}{n} \log P(|\bar S_n| \ge \delta) \sim -\delta^2/2$.
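The Gaussian example can be verified exactly, since $\bar S_n$ is itself Gaussian with variance $1/n$, so that $P(|\bar S_n| \ge \delta) = \mathrm{erfc}(\delta \sqrt{n/2})$. The short sketch below, added for illustration with a hypothetical function name, shows how slowly the normalized logarithm approaches $-\delta^2/2$.

```python
import math

def log_tail_rate(n, delta):
    """(1/n) * log P(|mean of n standard Gaussians| >= delta), in closed form."""
    tail = math.erfc(delta * math.sqrt(n / 2.0))   # the mean is N(0, 1/n)
    return math.log(tail) / n

delta = 1.0
rates = {n: log_tail_rate(n, delta) for n in (10, 100, 1000)}
for n, rate in rates.items():
    print(n, round(rate, 4), "limit:", -delta**2 / 2)
```

Even at $n = 1000$ the rate still differs from the limit in the third decimal, a first hint of the finite size effects that affect numerical estimates of $\rho(\alpha)$.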
The analogy with (3.24) and (3.25) is striking since, in the common law of wavelet coefficients, the parts of very small probabilities, which we measure with the help of a logarithmic scale, are those that provide the relevant information ρ(α) for calculating the spectrum (see Chapter 4 for a more detailed study). The effective calculation of function ρ(α) is numerically delicate because its definition leads to a double limit, which generally results into problems said to be “of finite size”. In theory, we must go “completely” to the limit in j in (3.24) before taking the limit in in (3.25). Practically speaking, the two limits must effectively be taken “together”. The problem is then to know how to take j sufficiently large according to , which creates significant numerical stability problems. In any case, a



calculation of $\rho(\alpha)$ which is numerically reliable requires us to know the signal on a large number of scales, i.e., with excellent precision. This is why we often prefer to work from averages, such as $\sum_k |d_f(k,j)|^q$, i.e., ultimately, from the partition function, whose definition involves only one limit. From now on, this is the point of view that we shall adopt (however, let us note that the direct method introduced by Chhabra and Jensen in [CHH 89] calculates $\rho(\alpha)$ without going through a double limit or a Legendre transformation; a mathematical discussion of this method adapted to the wavelet framework can be found in [JAF 04a]).

STEP 2. We will estimate, for each $H$, the contribution of the Hölder singularities of exponent $H$ to:

$$\sum_k |d_f(k,j)|^q \qquad (3.28)$$

Each singularity of this type brings a contribution of $C\,2^{-Hqj}$, and about $2^{f_H(H)j}$ intervals of length $2^{-j}$ are needed to cover these singularities; the total contribution of the Hölder singularities of exponent $H$ to (3.28) is thus:

$$2^{f_H(H)j}\,2^{-Hqj} = 2^{-(Hq - f_H(H))j} \qquad (3.29)$$

This is a critical step of the reasoning; it contains an inversion of limits which implicitly assumes that all Hölder singularities have coefficients $\sim 2^{-Hj}$ simultaneously from a certain scale $J$ onwards, and that the Hausdorff dimension can be estimated as if it were a box dimension. It is notable that the multifractal formalism leads to the correct singularity spectrum in several situations where these two assumptions are not verified.

STEP 3. The third step is an argument of the “Laplace method” type. When $j \to +\infty$, we note that, among the terms (3.29), the one that brings the main contribution to (3.28) is the one for which the exponent $H$ attains the minimum of $Hq - f_H(H)$, from which comes the heuristic formula:

$$\tau(q) - 1 = \inf_H \big(Hq - f_H(H)\big)$$

STEP 4. If $f_H(H)$ is concave, then $-f_H(H)$ and $-\tau(q) + 1$ are conjugate convex functions and each of them can be deduced from the other by a Legendre transformation. Thus, if we define the Legendre spectrum by:

$$f_L(H) = \inf_q \big(Hq - \tau(q) + 1\big)$$

we deduce a first formulation of the multifractal formalism:

$$f_H(H) = f_L(H) = \inf_q \big(Hq - \tau(q) + 1\big) \qquad (3.30)$$
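Formula (3.30) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the partition function `tau_toy` is a hypothetical concave parabola, chosen because its Legendre transform is known in closed form, and the infimum over $q$ is approximated by a minimum over a finite grid. None of these names come from the text.

```python
H0 = 0.6      # hypothetical location of the maximum of the spectrum
CURV = 0.05   # hypothetical curvature parameter of the toy partition function


def tau_toy(q: float) -> float:
    """Toy concave partition function tau(q) = H0*q - CURV*q**2/2 (an assumption)."""
    return H0 * q - 0.5 * CURV * q * q


def legendre_spectrum(tau, h: float, q_grid) -> float:
    """Approximate f_L(H) = inf_q (H*q - tau(q) + 1) by a minimum over q_grid."""
    return min(h * q - tau(q) + 1.0 for q in q_grid)


# Grid of q values from -10 to 10 in steps of 0.01.
Q_GRID = [i / 100.0 for i in range(-1000, 1001)]

if __name__ == "__main__":
    for h in (0.4, 0.5, 0.6, 0.7, 0.8):
        # For this toy tau the exact value is 1 - (h - H0)**2 / (2 * CURV).
        print(h, legendre_spectrum(tau_toy, h, Q_GRID))
```

On a finite grid the result is always finite, whereas the true infimum may be $-\infty$ outside the support of the spectrum; the range of $q$ therefore has to be chosen with care in practice.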



The concavity assumption on $f_H(H)$ need not be verified. We can then:

– stop at step 3 and only state that $\tau(q)$ is the Legendre transform of $f_H(H)$. However, this weaker form of the multifractal formalism is of little interest because the quantity that we would like to calculate is $f_H(H)$ and the quantity that we know is generally $\tau(q)$;

– state that (3.30) provides, in fact, the concave envelope of the spectrum. This information is, however, particularly ambiguous when the function calculated in this manner contains straight line segments (this is often the case, see [JAF 97b, JAF 99]): do they correspond to actual values of the spectrum or only to its envelope in a region where it is not concave?

3.5. Bounds of the spectrum

We will obtain bounds of $f_H(H)$ that are valid in general. Once again, there are two points of view, depending on whether the information available concerns the function spaces to which $f$ belongs or the histograms of the wavelet coefficients. We can make an observation before any calculation: since $\rho(\alpha)$ contains more information than $\tau(q)$, we expect that the bounds obtained from $\rho(\alpha)$ will be better; we shall see that this is indeed the case.

3.5.1. Bounds according to the Besov domain

We start by bounding the singularity spectrum of functions belonging to a fixed Besov space.

PROPOSITION 3.5.– Let $s > 1/p$, $\alpha \in [s - \frac{1}{p}, s]$ and $d = 1 - p(s - \alpha)$. Then, for any function $f \in B^{s,p}$, the set of points $x_0$ where $f \notin C^{\,s - \frac{1-d}{p}}(x_0)$ has Hausdorff $d$-dimensional measure equal to zero.

Proof. Let $s > 0$ and $p > 0$. Let us note:

$$e_f(k,j) = d_f(k,j)\,2^{j(s - 1/p)}$$

With respect to (3.23), the condition $f \in B^{s,p}$ can then be rewritten:

$$\sum_{j,k} |e_f(k,j)|^p < \infty \qquad (3.31)$$

Let us denote by $I_{j,k}$ the interval centered at $k2^{-j}$ and of length $|e_f(k,j)|^{p/d}$. We can now rewrite (3.31):

$$\sum_{j,k} \big(\mathrm{diam}(I_{j,k})\big)^d < \infty$$



For all $J$, the intervals $(I_{j,k})_{j \ge J}$ form a covering of the set of points belonging to infinitely many intervals $I_{j,k}$, so that the Hausdorff $d$-dimensional measure of this set is zero. If a point $x$ belongs to at most a finite number of intervals $I_{j,k}$, there exists $J$ $(= J(x))$ such that:

$$\forall j \ge J,\ \forall k: \quad \left|\frac{k}{2^j} - x\right| \ge |e_f(k,j)|^{p/d}$$

and thus:

$$|d_f(k,j)| \le \left|\frac{k}{2^j} - x\right|^{d/p} 2^{-(s - \frac{1}{p})j} = 2^{-(s - \frac{1}{p} + \frac{d}{p})j}\,|2^j x - k|^{d/p}$$

Since we have $s - 1/p > 0$, Proposition 3.2 implies that $f \in C^{\,s - \frac{1}{p} + \frac{d}{p}}(x)$, from which we obtain Proposition 3.5, since:

$$s - \frac{1}{p} + \frac{d}{p} = \alpha.$$

NOTE 3.4.– The condition $s > 1/p$ is necessary because there are functions of $B^{1/p,p}$ that are nowhere locally bounded (see [JAF 00c]).

We will now deduce from Proposition 3.5 a bound of the spectrum $f_H(H)$. Let us assume that $\tau(q)$ is known. Let $q > 0$ be fixed. By definition of $\tau(q)$, for any $\varepsilon > 0$, $f$ belongs to $B^{(\tau(q) - \varepsilon)/q,\,q}$. We can then apply Proposition 3.5 for all $q$ such that:

$$\exists\, \varepsilon > 0: \quad \frac{\tau(q) - \varepsilon}{q} > \frac{1}{q}$$

which is equivalent to $\tau(q) > 1$. If $f$ is uniformly Hölder, $\tau(q)$ is increasing and continuous, and verifies:

$$\lim_{q \to 0} \tau(q) = 0 \quad \text{and} \quad \tau(q) \to +\infty \ \text{ when } \ q \to +\infty$$

There is a unique value $q_c$ such that $\tau(q_c) = 1$ and, for any $q > q_c$, we thus have:

$$f_H(H) \le 1 - q\left(\frac{\tau(q) - \varepsilon}{q} - H\right)$$

and thus, since $\varepsilon > 0$ is arbitrary:

$$f_H(H) \le qH - \tau(q) + 1$$

Since this result is true for any $q > q_c$, we have shown the following proposition.



PROPOSITION 3.6.– If $f$ is uniformly Hölder, the singularity spectrum of $f$ verifies the bound:

$$f_H(H) \le \inf_{q > q_c} \big(qH - \tau(q) + 1\big) \qquad (3.32)$$

The quasi-sure results that we now describe show that the bound (3.32) is optimal. Proposition 3.6 suggests that, in formula (3.30), the domain of $q$ over which the Legendre transform must be calculated is the interval $[q_c, +\infty)$. Let us assume that $\tau(q)$ is an admissible partition function, i.e., that it is the partition function of a uniformly Hölder function $f$ (this will be the case if $s(q) = q\,\tau(1/q)$ is concave and verifies $0 \le s'(q) \le 1$ and $s(0) > 0$; see [JAF 00b]). To say that $f$ has $\tau(q)$ as its partition function implies, by definition, that $f$ belongs to the space $V$ $(= V_{\tau(q)})$ defined by:

$$V = \bigcap_{\varepsilon > 0,\ q > 0} B^{(\tau(q) - \varepsilon)/q,\,q} \qquad (3.33)$$

The space $V$ is a Baire space, i.e. it has the following property: any countable intersection of dense open subsets is dense. In a Baire space, a property which holds at least on a countable intersection of dense open subsets is said to be quasi-sure. Hence, it is natural to wonder whether the multifractal formalism holds quasi-surely in $V$. The following theorem, taken from [JAF 00b], answers this question.

THEOREM 3.5.– Let $\tau(q)$ be admissible and $V$ the space defined by (3.33). The domain of definition of the singularity spectrum of quasi-all functions of $V$ is the interval $[s(0), 1/q_c]$ and, on this interval, we have:

$$f_H(H) = \inf_{q \ge q_c} \big(Hq - \tau(q) + 1\big) \qquad (3.34)$$

Formula (3.34) states that the singularity spectrum of quasi-all functions is made up of two parts: if $H < \tau'(q_c)$, the infimum in (3.34) is attained for $q > q_c$ and the spectrum can be calculated using the “usual” Legendre transformation of $\tau(q)$:

$$f_H(H) = \inf_{q > 0} \big(Hq - \tau(q) + 1\big)$$

If $\tau'(q_c) \le H \le 1/q_c$, the infimum in (3.34) is attained at $q = q_c$ and the spectrum is the straight line segment $f_H(H) = Hq_c$. The study of regularity properties that hold quasi-surely goes back to Banach (see [BAN 31]). Buczolich and Nagy have shown in [BUC 01] that quasi-all monotone functions are multifractal with spectrum $f_H(H) = H$ for $H \in [0, 1]$.
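The two regimes of formula (3.34) can be made visible on a toy example. The sketch below is an assumption-laden illustration: `tau_toy` is a hypothetical partition function of binomial type, chosen because it is increasing, vanishes at $q = 0$ and satisfies $\tau(1) = 1$; its admissibility in the sense of [JAF 00b] is not verified here. The critical value $q_c$ is located by bisection and the infimum over $q \ge q_c$ is approximated on a finite grid.

```python
import math

M0, M1 = 0.3, 0.7  # hypothetical weights defining the toy partition function


def tau_toy(q: float) -> float:
    """Toy increasing partition function with tau(0) = 0 and tau(1) = 1."""
    return 1.0 - math.log2(M0 ** q + M1 ** q)


def find_qc(tau, lo: float = 1e-6, hi: float = 50.0, tol: float = 1e-12) -> float:
    """Solve tau(q_c) = 1 by bisection, assuming tau is increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tau(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def restricted_spectrum(tau, h: float, q_c: float,
                        q_max: float = 40.0, steps: int = 40000) -> float:
    """Approximate inf over q >= q_c of (h*q - tau(q) + 1) on a uniform grid."""
    best = math.inf
    for i in range(steps + 1):
        q = q_c + (q_max - q_c) * i / steps
        best = min(best, h * q - tau(q) + 1.0)
    return best


if __name__ == "__main__":
    q_c = find_qc(tau_toy)
    for h in (0.55, 0.6, 0.8, 0.9, 0.95, 1.0):
        # Compare the restricted infimum with the straight line h * q_c.
        print(h, restricted_spectrum(tau_toy, h, q_c), h * q_c)
```

For exponents $H$ above $\tau'(q_c)$ the two printed values coincide, which is the linear part of the spectrum; below $\tau'(q_c)$ the restricted infimum lies strictly under the line and agrees with the usual Legendre transform.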



3.5.2. Bounds deduced from histograms

The following proposition provides the optimal bound of the Hölder spectrum which can be deduced in general from the histograms of wavelet coefficients [AUB 00].

PROPOSITION 3.7.– If we have $f \in C^{\varepsilon}(\mathbb{R})$ for some $\varepsilon > 0$, then:

$$f_H(H) \le H \sup_{\alpha \in [0,H]} \frac{\rho(\alpha)}{\alpha} \qquad (3.35)$$

This becomes an equality for random wavelet series. Indeed, they verify $\tilde{\rho}(\alpha) = \rho(\alpha)$, which shows that this bound is optimal. We can easily verify that it implies (3.32). However, (3.35) clearly provides a better bound if $\rho(\alpha)$ is not concave. Once again, we see that the histogram of the coefficients contains strictly more “useful” information than the partition function. We note that, although (3.35) is more precise than (3.32), the fact remains that (3.32) is optimal when the only information available is the partition function (as shown by the quasi-sure results). The optimal bounds (3.35) and (3.32) suggest variants of the multifractal formalism. We say that the quasi-sure multifractal formalism based on histograms is verified if (3.35) is saturated, i.e. if:

$$f_H(H) = H \sup_{\alpha \in [0,H]} \frac{\rho(\alpha)}{\alpha} \qquad (3.36)$$

and that the quasi-sure multifractal formalism based on the partition function is verified if (3.32) is saturated, i.e. if:

$$f_H(H) = \inf_{q \ge q_c} \big(qH - \tau(q) + 1\big) \qquad (3.37)$$

3.6. The grand-canonical multifractal formalism

The aim of the grand-canonical multifractal formalism is to calculate the spectrum of oscillating singularities $d(H, \beta)$ which, by definition, provides the Hausdorff dimension of the set of points where the Hölder exponent is $H$ and the oscillation exponent is $\beta$. This formalism is based on new function spaces. To define them, we will use the following more geometric notations: $\lambda$ and $\lambda'$ will designate the dyadic intervals $\lambda_{j,k} = k2^{-j} + [0, 2^{-j}]$ and $\lambda_{j',k'} = k'2^{-j'} + [0, 2^{-j'}]$ respectively, $C_\lambda$ will designate the coefficient $d_f(k,j)$ and $\psi_\lambda$ the wavelet $\psi(2^j x - k)$.

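DEFINITION 3.9.– Let $p > 0$ and $s, s' \in \mathbb{R}$. A function $f$ belongs to $O_p^{s,s'}(\mathbb{R})$ if its wavelet coefficients satisfy: [...]

[...] $> 0$. Then, we take $n$ arbitrarily large such that $N^n(a, \varepsilon) \ge 2^{n\gamma}$. For such $n$ we bound $S^n(q)$ by noting

$$\sum_{k=0}^{2^n - 1} 2^{-nq s_k^n} \ \ge \sum_{|s_k^n - a| < \varepsilon} 2^{-nq s_k^n} \ \ge\ N^n(a, \varepsilon)\, 2^{-n(qa + |q|\varepsilon)} \ \ge\ 2^{-n(qa - \gamma + |q|\varepsilon)} \qquad (4.32)$$

[...] $\varepsilon > 0$. Let $n_0$ be such that $E_\Omega[S^n(q)] \le 2^{-n(T(q) - \varepsilon)}$ for all $n \ge n_0$. Then,

$$E\Big[\limsup_{n \to \infty} 2^{n(T(q) - 2\varepsilon)} S^n(q, \omega)\Big] \le E\Big[\sum_{n \ge n_0} 2^{n(T(q) - 2\varepsilon)} S^n(q, \omega)\Big] \le \sum_{n \ge n_0} 2^{-n\varepsilon} < \infty.$$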

Thus, almost surely $\limsup_{n \to \infty} 2^{n(T(q) - 2\varepsilon)} S^n(q, \omega) < \infty$, and $\tau(q) \ge T(q) - 2\varepsilon$. Consequently, this estimate holds with probability one simultaneously for all $\varepsilon = 1/m$ ($m \in \mathbb{N}$) and some countable, dense set of $q$ values with $T(q) < \infty$. Since $\tau(q)$ and $T(q)$ are always concave due to Corollary 4.2 below (see section 4.4), they are continuous on open sets and the claim follows.

Along the same lines we may define the corresponding deterministic grain spectrum. By analogy, we replace probability over $\kappa_n = \{0, \dots, 2^n - 1\}$ in (4.21), i.e. $N^n(a, \varepsilon)$, by probability over $\Omega \times \kappa_n$, i.e.

$$\sum_{k=0}^{2^n - 1} P_\Omega\big[a - \varepsilon \le s_k^n < a + \varepsilon\big] = 2^n\, E_{\Omega \times \kappa_n}\big[\mathbf{1}_{[a - \varepsilon, a + \varepsilon)}(s_K^n)\big] = E_\Omega\big[N^n(a, \varepsilon)\big] \qquad (4.38)$$

and define

$$F(a) := \lim_{\varepsilon \downarrow 0}\ \limsup_{n \to \infty}\ \frac{1}{n} \log_2 E_\Omega\big[N^n(a, \varepsilon)\big] \qquad (4.39)$$

Replacing $N^n(a, \varepsilon)$ with (4.38) in the proof of Theorem 4.1 and taking expectations in (4.32) we find properties analogous to those of the pathwise spectra $\tau$ and $f$:

THEOREM 4.2.– For all $a$

$$F(a) \le T^*(a). \qquad (4.40)$$

Furthermore, under conditions on $T(q)$ analogous to those on $\tau(q)$ in Theorem 4.1

$$F(a) = T^*(a) = \underline{F}(a) := \lim_{\varepsilon \downarrow 0}\ \liminf_{n \to \infty}\ \frac{1}{n} \log_2 E_\Omega\big[N^n(a, \varepsilon)\big]. \qquad (4.41)$$

It follows from Lemma 4.4 that with probability one $\tau^*(a, \omega) \le T^*(a)$ for all $a$. Similarly, the deterministic grain spectrum $F(a)$ is an upper bound to its pathwise defined random counterpart $f(a, \omega)$, however, only pointwise. On the other hand, we have here almost sure equality under certain conditions.

NOTE 4.2 (Negative dimensions).– Being defined through counting, $f(a)$ is always positive, or $-\infty$. The envelopes $T^*$ and $F$, being defined through expectations of


counts and sums, may assume negative values. Consequently, the negative values of $T^*$ and $F$ are not very useful in the estimation of $f$; however, they do contain further information and can be “observed”. Negative $F(a)$ and $T^*(a)$ have been termed negative dimensions [MAN 90b]. They correspond to probabilities of observing a coarse Hölder exponent $a$ which decay faster than the number $2^n = \#\kappa_n$ of “samples” $s_k^n$ available in one realization grows. Oversampling the process, i.e. analyzing several independent realizations, will increase the number of samples, so that rarer $s_k^n$ may be observed. In loose terms, in $\exp(-n \ln(2) F(a))$ independent traces we have a fair chance of seeing at least one $s_k^n$ of size $a$. For this, it is essential not to average the spectra $f(a)$ of the various realizations but the numbers $N^n(a, \varepsilon)$. This way, negative “dimensions” $F(a)$ become visible.

4.4. Multifractal formalism

In the previous section, various multifractal spectra were introduced along with some simple relations between them. These can be summarized as follows:

COROLLARY 4.1 (Multifractal formalism).– For every $a$

$$\dim(K_a) \le \dim(E_a) \le f(a) \le \tau^*(a) \overset{\text{a.s.}}{\le} T^*(a) \qquad (4.42)$$

where the first relations hold pathwise and the last one (the two terms on both sides of the last inequality) with probability one. Similarly

$$\dim(K_a) \le \underline{f}(a) \le f(a) \overset{\text{a.s.}}{\le} F(a) \le T^*(a). \qquad (4.43)$$

The spectra on the left end have stronger implications on the local scaling structure while the ones on the right end are easier to estimate or calculate. This set of inequalities could fairly be called the “multifractal formalism”. However, in the mathematical community a slightly different terminology is already established, which goes “the multifractal formalism holds” and means that for a particular process (or one of its paths, according to context) $\dim(K_a)$ can be calculated using some adequate partition function (such as $\tau(q)$) and taking its Legendre transform. Consequently, when “the multifractal formalism holds” for a path or process, then we often find that equality holds between several or all spectra appearing in (4.42), depending on the context of the formalism that has been established. This property (“the multifractal formalism holds”) is a very strong one and suggests the presence of one single underlying multiplicative structure in $Y$. This intuition is supported by the fact that the multifractal formalism is known to “hold” up to now only for objects with strong rescaling properties where multiplication is involved, such as self-similar measures, products of processes and infinitely divisible



cascades (see [CAW 92, FAL 94, ARB 96, RIE 95a, PES 97, HOL 92], respectively [MAN 02, BAC 03, BAR 04, BAR 02, CHA 05] as well as references therein). A notable exception among processes without injected multiplicative structure are Lévy processes, the multifractal properties of which are well understood due to [JAF 99]. Though we pointed out some conditions for equality between $f$, $\tau^*$ and $T^*$, we must note that in general we may have strict inequality in some or all parts of (4.42). Such cases have been presented in [RIE 95a, RIE 98]. There is, however, one equality which holds under mild conditions and connects the two spectra in the center of (4.42).

THEOREM 4.3.– Consider a realization or path of $Y$. If the sequence $s_k^n$ is bounded, then

$$\tau(q) = f^*(q) \quad \text{for all } q \in \mathbb{R}. \qquad (4.44)$$

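Proof. Note that $\tau(q) \le f^*(q)$ from Lemma 4.3. Now, to estimate $\tau(q)$ from below, choose $\bar{a}$ larger than $|s_k^n|$ for all $n$ and $k$ and group the terms in $S^n(q)$ conveniently, i.e.

$$S^n(q) \le \sum_{i = -\lfloor \bar{a}/\varepsilon \rfloor}^{\lfloor \bar{a}/\varepsilon \rfloor}\ \sum_{(i-1)\varepsilon \le s_k^n < (i+1)\varepsilon} 2^{-nq s_k^n} \le \sum_{i = -\lfloor \bar{a}/\varepsilon \rfloor}^{\lfloor \bar{a}/\varepsilon \rfloor} N^n(i\varepsilon, \varepsilon)\, 2^{-n(qi\varepsilon - |q|\varepsilon)} \qquad (4.45)$$

Let $\eta > 0$. Then, for every $a \in [-\bar{a}, \bar{a}]$ there are $\varepsilon_0(a)$ and $n_0(a)$ such that $N^n(a, \varepsilon) \le 2^{n(f(a) + \eta)}$ for all $\varepsilon < \varepsilon_0(a)$ and all $n > n_0(a)$. We would like to have $\varepsilon_0$ and $n_0$ independent of $a$ for our uniform estimate. To this end note that $N^n(a', \varepsilon') \le N^n(a, \varepsilon)$ for all $a' \in [a - \varepsilon/2, a + \varepsilon/2]$ and all $\varepsilon' < \varepsilon/2$. By compactness we may choose a finite set of $a_j$ ($j = 1, \dots, m$) such that the collection $[a_j - \varepsilon_0(a_j)/2,\ a_j + \varepsilon_0(a_j)/2]$ covers $[-\bar{a}, \bar{a}]$. Set $\varepsilon_1 = (1/2) \min_{j=1,\dots,m} \varepsilon_0(a_j)$ and $n_1 = \max_{j=1,\dots,m} n_0(a_j)$. Then, for all $\varepsilon < \varepsilon_1$ and $n > n_1$, and for all $a \in [-\bar{a}, \bar{a}]$ we have $N^n(a, \varepsilon) \le 2^{n(f(a) + \eta)}$ and, thus,

$$S^n(q) \le \sum_{i = -\lfloor \bar{a}/\varepsilon \rfloor}^{\lfloor \bar{a}/\varepsilon \rfloor} 2^{-n(qi\varepsilon - f(i\varepsilon) - \eta - |q|\varepsilon)} \le \big(2\lfloor \bar{a}/\varepsilon \rfloor + 1\big) \cdot 2^{-n(f^*(q) - \eta - |q|\varepsilon)}. \qquad (4.46)$$

Letting $n \to \infty$ we find $\tau(q) \ge f^*(q) - \eta - |q|\varepsilon$ for all $\varepsilon < \varepsilon_1$. Now we let $\varepsilon \to 0$ and finally $\eta \to 0$ to find the desired inequality.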



Due to the properties of Legendre transforms³ it follows:

COROLLARY 4.2 (Properties of the partition function).– If the sequence $s_k^n$ is bounded, then the partition function $\tau(q)$ is concave and monotone. Consequently, $\tau(q)$ is continuous on $\mathbb{R}$, and differentiable at all but a countable number of exceptional points.

In order to efficiently invert Theorem 4.3 we need:

LEMMA 4.5 (Upper semi-continuity of $f$ and $F$).– Let $a_m$ converge to $a_*$. Then

$$f(a_*) \ge \limsup_{m \to \infty} f(a_m) \qquad (4.47)$$

and analogously for $F$.

Proof. For all $\varepsilon > 0$ we can find $m_0$ such that $a_* - \varepsilon < a_m - \varepsilon/2 < a_m + \varepsilon/2 < a_* + \varepsilon$ for all $m > m_0$. Then, $N^n(a_*, \varepsilon) \ge N^n(a_m, \varepsilon/2)$ and $E[N^n(a_*, \varepsilon)] \ge E[N^n(a_m, \varepsilon/2)]$. We find

$$\limsup_{n \to \infty} \frac{1}{n} \log_2 N^n(a_*, \varepsilon) \ge \limsup_{n \to \infty} \frac{1}{n} \log_2 N^n(a_m, \varepsilon/2) \ge f(a_m)$$

for any $m > m_0(\varepsilon)$ and similarly for $F$. Now, let first $m \to \infty$ and then $\varepsilon \to 0$.

COROLLARY 4.3 (Central multifractal formalism).– We always have

$$f(a) \le f^{**}(a) = \tau^*(a). \qquad (4.48)$$

Furthermore, denoting by $\tau'(q\pm)$ the right- (resp. left-)sided limits of derivatives we have

$$f(a) = \tau^*(a) = q\,\tau'(q\pm) - \tau(q\pm) \quad \text{at } a = \tau'(q\pm). \qquad (4.49)$$

Proof. The graph of $f^{**}$ is the concave hull of the graph of $f$, which implies (4.48). It is an easy task to derive (4.49) under assumptions suitable to make the tools of calculus available, such as continuous second derivatives. To prove it in general let us first assume that $\tau$ is differentiable at a fixed $q$. In particular, $\tau(q')$ is then finite for $q'$ close to $q$. Since $\tau(q) = f^*(q)$ there is a sequence $a_m$ such that $\tau(q) = \lim_m \big(q a_m - f(a_m)\big)$. Since $\tau(q') \le q'a - f(a)$ for all $q'$ and $a$ by (4.33), and since $\tau$ is differentiable

3. For a tutorial on the Legendre transform see [RIE 99, App. A].



at $q$ this sequence $a_m$ must converge to $a_* := \tau'(q)$. From the definition of $a_m$ we conclude that $f(a_m)$ converges to $q a_* - \tau(q)$. Applying Lemma 4.5 we find that $f(a_*) \ge q a_* - \tau(q)$. Recalling (4.33) implies the desired equality. Now, for an arbitrary $q$ the concave shape of $\tau$ implies that there is a sequence of numbers $q_m$ larger than $q$ at which $\tau$ is differentiable and which converges down to $q$. Consequently, $\tau'(q+) = \lim_m \tau'(q_m)$. Formula (4.49) being established at all $q_m$, Lemma 4.5 applies with $a_m = \tau'(q_m)$ and $a_* = \tau'(q+)$ to yield $f(\tau'(q+)) \ge q\,\tau'(q+) - \tau(q+)$. Again, (4.33) furnishes the opposite inequality. A similar argument applies to $\tau'(q-)$.

COROLLARY 4.4.– If $T(q)$ is finite for an open interval of $q$-values then $|s_k^n|$ is bounded for almost all paths, and

$$T(q) = F^*(q) \quad \text{for all } q. \qquad (4.50)$$

Moreover,

$$F(a) = T^*(a) = q\,T'(q\pm) - T(q\pm) \quad \text{at } a = T'(q\pm). \qquad (4.51)$$

Proof. Assume for a moment that $s_k^n$ is unbounded from above with positive probability. Then, the grouping (4.45) requires an additional term collecting the $s_k^n > a$. In fact, for any number $a$ we can find $n$ arbitrarily large such that $s_k^n > a$ for some $k$. This implies that for any negative $q$ we have $S^n(q) \ge 2^{-nqa}$ and $\tau(q) \le qa$. Letting $a \to \infty$ shows that $\tau(q) = -\infty$. By Lemma 4.4 we must have $T(q) = -\infty$, a contradiction. Similarly, we show that $s_k^n$ is bounded from below. The remaining claims can be established analogously to those for $\tau(q)$ by taking expectations in (4.45).

NOTE 4.3 (Estimation and unbounded moments).– In order to apply Corollary 4.4 in a real world situation, but also for the purpose of estimating $\tau(q)$, it is of great importance to possess a method to estimate the range of $q$-values for which the moments of a stationary process (such as the increments or the wavelet coefficients of $Y$) are finite. Such a procedure is proposed in [GON 05] (see also [RIE 04]).

4.5. Binomial multifractals

The binomial measure has a long-standing tradition in serving as the paradigm of multifractal scaling [MAN 74, KAH 76, MAN 90a, CAW 92, HOL 92, BEN 87, RIE 95a, RIE 97b]. We present it here with an eye on possible generalizations of use in modeling.

4.5.1. Construction

To be consistent in notation we denote the binomial measure by $\mu_b$ and its distribution function by $M_b(t) := \mu_b(]-\infty, t[)$. Note that $\mu_b$ is a measure or



(probability) distribution, i.e. not a function in the usual sense, while $M_b$ is a right-continuous and increasing function by definition. In order to define $\mu_b$ we again use the notation (4.4): for any fixed $t$ there is a unique sequence $k_1, k_2, \dots$ such that the dyadic intervals $I^n_{k_n} = [k_n 2^{-n}, (k_n + 1)2^{-n}[$ contain $t$ for all integer $n$. So, the $I^n_{k_n}$ form a decreasing sequence of half open intervals which shrink down to $\{t\}$. Moreover, $I^{n+1}_{2k_n}$ is the left subinterval of $I^n_{k_n}$ and $I^{n+1}_{2k_n + 1}$ the right subinterval (see Figure 4.1). Note that the first $n$ elements of such a sequence, i.e. $(k_1, k_2, \dots, k_n)$, are identical for all points $t \in I^n_{k_n}$. We call this a nested sequence; it is uniquely defined by the value of $k_n$. We set

$$\mu_b(I^n_{k_n}) = M_b((k_n + 1)2^{-n}) - M_b(k_n 2^{-n}) = M^n_{k_n} \cdot M^{n-1}_{k_{n-1}} \cdots M^1_{k_1} \cdot M^0_0. \qquad (4.52)$$

In other words, the mass lying in $I^n_{k_n}$ is redistributed among its two dyadic subintervals $I^{n+1}_{2k_n}$ and $I^{n+1}_{2k_n + 1}$ in the proportions $M^{n+1}_{2k_n}$ and $M^{n+1}_{2k_n + 1}$. For consistency we require $M^{n+1}_{2k_n} + M^{n+1}_{2k_n + 1} = 1$. Having defined the mass of dyadic intervals we obtain the mass of any interval $]-\infty, t[$ by writing it as a disjoint union of dyadic intervals $J^n$ and noting $M_b(t) = \mu_b(]-\infty, t[) = \sum_n \mu_b(J^n)$. Therefore, integrals (expectations) with respect to $\mu_b$ can be calculated as

$$\int g(t)\,\mu_b(dt) = \lim_{n \to \infty} \sum_{k=0}^{2^n - 1} g(k2^{-n})\,\mu_b(I^n_k) \qquad (4.53)$$

$$= \int g(t)\,dM_b(t) = \lim_{n \to \infty} \sum_{k=0}^{2^n - 1} g(k2^{-n})\,\big(M_b((k+1)2^{-n}) - M_b(k2^{-n})\big) \qquad (4.54)$$

Alternatively, the measure $\mu_b$ can be defined using its distribution function $M_b$. Indeed, as a distribution function, $M_b$ is monotone and continuous from the right. Since (4.52) defines $M_b$ at all dyadic points, it can be obtained at any other point as the right-sided limit. Note that $M_b$ is continuous at a given point $t$ unless $M^n_{k_n}(t) = 1$ for all $n$ large.

To generate randomness in $M_b$, we choose the various $M^n_k$ to be random variables. The above properties then hold pathwise. We will make the following assumptions on the multiplier distributions $M^n_k$:

i) Conservation of mass. Almost surely, for all $n$ and $k$, $M^n_k$ is positive and

$$M^{n+1}_{2k_n} + M^{n+1}_{2k_n + 1} = 1. \qquad (4.55)$$

Figure 4.2. Iterative construction of the binomial cascade

As we have seen, this guarantees that $M_b$ is well defined.

ii) Nested independence. All multipliers of a nested sequence are mutually independent. Analogously to (4.52) we have for any nested sequence

$$E_\Omega\big[M^n_{k_n} \cdots M^0_0\big] = E_\Omega\big[M^n_{k_n}\big] \cdots E_\Omega\big[M^0_0\big] \qquad (4.56)$$

and similarly for other moments. This will allow for simple calculations in what follows.

iii) Identical distributions. For all $n$ and $k$

$$M^n_k \overset{fd}{=} \begin{cases} M_0 & \text{if } k \text{ is even,} \\ M_1 & \text{if } k \text{ is odd.} \end{cases} \qquad (4.57)$$

A more general version of iii) was given in [RIE 99] to allow for more flexibility in model matching. The theory of cascades or, more properly, $T$-martingales⁴ [KAH 76, BEN 87, HOL 92, BAR 97], provides a wealth of possible generalizations. Most importantly, it allows us to soften the almost sure conservation condition i) to

i’) Conservation in the mean.

$$E_\Omega[M_0 + M_1] = 1. \qquad (4.58)$$

In this case, $M_b$ is well defined since (4.52) forms a martingale due to the nested independence (4.56). The main advantage of such an approach is that we can use unbounded multipliers $M_0$ and $M_1$ such as log-normal random variables. Then, the marginals of the increment process, i.e. $\mu_b(I^n_k)$, are exactly log-normal on all scales. For general binomials, always assuming ii), it can be argued that the marginals $\mu_b(I^n_k)$ are at least asymptotically log-normal by applying a central limit theorem to the logarithm of (4.52).

4. For any fixed $t$, the sequence (4.52) forms a martingale due to the nested independence (4.56).
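The construction (4.52) under assumptions i) to iii) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the choice of a symmetric beta law for the left multiplier, and the convention $M^0_0 = 1$ are assumptions made here. Drawing the left multiplier $M_0$ and setting $M_1 = 1 - M_0$ enforces conservation of mass i), and a symmetric law gives both multipliers the same distribution.

```python
import random


def binomial_cascade(depth: int, draw_left_multiplier, rng: random.Random) -> list:
    """Return the masses mu_b(I_k^n), k = 0..2**depth - 1, of a binomial cascade.

    Each interval passes a fraction M0 of its mass to its left half and 1 - M0
    to its right half, with an independent draw of M0 for every split.
    """
    masses = [1.0]  # total mass M_0^0 = 1 (an assumption of this sketch)
    for _ in range(depth):
        refined = []
        for mass in masses:
            m_left = draw_left_multiplier(rng)
            refined.append(mass * m_left)          # left subinterval
            refined.append(mass * (1.0 - m_left))  # right subinterval
        masses = refined
    return masses


def distribution_function(masses: list) -> list:
    """Cumulative sums: values of M_b at the right endpoints of the dyadic intervals."""
    total, values = 0.0, []
    for mass in masses:
        total += mass
        values.append(total)
    return values


if __name__ == "__main__":
    rng = random.Random(0)
    p = 1.66  # shape parameter of the symmetric beta law (illustrative value)
    masses = binomial_cascade(10, lambda r: r.betavariate(p, p), rng)
    print(len(masses), sum(masses), distribution_function(masses)[-1])
```

With conservation only in the mean, i’), the two multipliers would be drawn independently and the total mass at level $n$ would fluctuate from one level to the next.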



4.5.2. Wavelet decomposition

The scaling coefficients of $\mu_b$ using the Haar wavelet are simply

$$D_{n,k}(\mu_b) = \int \varphi^*_{n,k}(t)\,\mu_b(dt) = 2^{n/2} \int_{k2^{-n}}^{(k+1)2^{-n}} \mu_b(dt) = 2^{n/2}\,\mu_b(I^n_k) \qquad (4.59)$$

from (4.8) and (4.53). With (4.9) and (4.52) we derive the explicit expression for the Haar wavelet coefficients:

$$2^{-n/2}\,C_{n,k_n}(\mu_b) = \mu_b(I^{n+1}_{2k_n}) - \mu_b(I^{n+1}_{2k_n + 1}) = \big(M^{n+1}_{2k_n} - M^{n+1}_{2k_n + 1}\big) \prod_{i=0}^{n} M^i_{k_i}. \qquad (4.60)$$

Similar scaling properties hold when using arbitrary, compactly supported wavelets, provided the distributions of the multipliers are scale independent. This comes about from (4.52) and (4.53), which give the following rule for substituting $t' = 2^n t - k_n$:

$$2^{-n/2}\,C_{n,k_n}(\mu_b) = \int_{I^n_{k_n}} \psi(2^n t - k_n)\,\mu_b(dt) = M^n_{k_n} \cdots M^1_{k_1} \cdot \int_0^1 \psi(t')\,\mu_b^{(n,k_n)}(dt'). \qquad (4.61)$$

Here $\mu_b^{(n,k_n)}$ is a binomial measure constructed with the same method as $\mu_b$ itself, however, with multipliers taken from the subtree which has its root at the node $k_n$ of level $n$ of the original tree. More precisely, for any nested sequence $i_1, \dots, i_m$

$$\mu_b^{(n,k_n)}(I^m_{i_m}) = M^{n+1}_{2k_n + i_1} \cdot M^{n+2}_{4k_n + i_2} \cdots M^{n+m}_{2^m k_n + i_m}.$$

From nested independence (4.56) we infer that this measure $\mu_b^{(n,k_n)}$ is independent of the $M^i_{k_i}$ ($i = 1, \dots, n$). Furthermore, the identical distributions of the multipliers iii) imply that for arbitrary, compactly supported wavelets

$$\int_0^1 \psi(t)\,\mu_b^{(n,k_n)}(dt) \overset{d}{=} C_{0,0}(\mu_b) = \int_0^1 \psi(t)\,\mu_b(dt) \qquad (4.62)$$

where $\overset{d}{=}$ denotes equality in distribution. In particular, for the Haar wavelet we have

$$\int_0^1 \psi_{\mathrm{Haar}}(t)\,\mu_b^{(n,k_n)}(dt) = M^{n+1}_{2k_n} - M^{n+1}_{2k_n + 1} \overset{d}{=} M_0 - M_1 = C^{\mathrm{Haar}}_{0,0}(\mu_b) \qquad (4.63)$$



(the deterministic analog has also been observed in [BAC 93]). Finally, note that if $\psi$ is supported on $[0, 1]$, then $\psi(2^n(\cdot) - k)$ is supported on $I^n_k$. So, the tree of wavelet coefficients $C_{n,k}$ of $\mu_b$ possesses a structure similar to the tree of increments of $M_b$ (compare (4.52)).

With a little more effort we calculate the wavelet coefficients of $M_b$ itself, provided $\psi$ is admissible and supported on $[0, 1]$. Indeed,

$$M_b(t) - M_b(k_n 2^{-n}) = \mu_b([k_n 2^{-n}, t]) = M^n_{k_n} \cdots M^1_{k_1}\, M_b^{(n,k_n)}(2^n t - k_n), \qquad (4.64)$$

where $M_b^{(n,k_n)}(t') := \mu_b^{(n,k_n)}([0, t'])$. Using $\int \psi = 0$ and substituting $t' = 2^n t - k_n$ this yields

$$2^{-n/2}\,C_{n,k_n}(M_b) = \int_{I^n_{k_n}} \psi(2^n t - k_n)\,\big(M_b(t) - M_b(k_n 2^{-n})\big)\,dt = 2^{-n} \cdot M^n_{k_n} \cdots M^1_{k_1} \cdot \int_0^1 \psi(t')\,M_b^{(n,k_n)}(t')\,dt'. \qquad (4.65)$$

Again, we have

$$\int_0^1 \psi(t)\,M_b^{(n,k_n)}(t)\,dt \overset{d}{=} C_{0,0}(M_b) = \int_0^1 \psi(t)\,M_b(t)\,dt \qquad (4.66)$$
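The Haar relation (4.60) is easy to check on a deterministic cascade, where the left and right multipliers are constants. The sketch below is an illustration under that assumption; the function names and the value of the left multiplier are hypothetical. It computes the coefficients from the masses at level $n + 1$ and compares them with the product form on the right-hand side of (4.60).

```python
def deterministic_masses(depth: int, m_left: float) -> list:
    """Masses of the 2**depth dyadic intervals for constant multipliers (m_left, 1 - m_left)."""
    masses = [1.0]
    for _ in range(depth):
        # Each interval is split into a left part and a right part, in this order.
        masses = [mass * w for mass in masses for w in (m_left, 1.0 - m_left)]
    return masses


def haar_coefficients(fine_masses: list) -> list:
    """Haar coefficients C_{n,k} computed from the masses at level n + 1.

    Uses 2**(-n/2) * C_{n,k} = mu(I_{2k}^{n+1}) - mu(I_{2k+1}^{n+1}).
    """
    count = len(fine_masses) // 2      # number of intervals at level n, i.e. 2**n
    n = count.bit_length() - 1
    scale = 2.0 ** (n / 2.0)
    return [scale * (fine_masses[2 * k] - fine_masses[2 * k + 1]) for k in range(count)]


if __name__ == "__main__":
    m_left = 0.3
    n = 2
    coefficients = haar_coefficients(deterministic_masses(n + 1, m_left))
    coarse = deterministic_masses(n, m_left)
    for k, c in enumerate(coefficients):
        # Right-hand side of (4.60): (M_left - M_right) times the mass of I_k^n.
        print(k, c, 2.0 ** (n / 2.0) * (m_left - (1.0 - m_left)) * coarse[k])
```

The coefficients at level $n$ are thus the masses at level $n$ multiplied by one more difference of multipliers, which is the tree structure referred to in the text.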

LEMMA 4.6.– Let $\psi$ be a wavelet supported on $[0, 1]$. Let $M_b$ be a binomial with i)-iii). Then, $C_{n,k_n}(\mu_b)$ is given by (4.61), and if $\psi$ is admissible then $C_{n,k_n}(M_b)$ is given by (4.65). Furthermore, (4.62) and (4.66) hold.

It is obvious that the dyadic structure present in both the construction of the binomial measure and the wavelet transform is responsible for the simplicity of the calculation above. It is, however, standard by now to extend the procedure to more general multinomial cascades such as $M_c$, introduced in section 4.5.5 (see [ARB 96, RIE 95a]).

4.5.3. Multifractal analysis of the binomial measure

In the light of Lemma 4.6 it becomes clear that the singularity exponent $\alpha(t)$ is most easily accessible for $M_b$ while $w(t)$ is readily available for both $M_b$ and $\mu_b$. On the other hand, since $\alpha(t)$ is defined through increments, it is not well defined for $\mu_b$. Thus, it is natural to calculate the spectra of both $M_b$ and $\mu_b$ with appropriate singularity exponents, i.e. $f_{\alpha,M_b}$, $f_{w,M_b}$ and $f_{w,\mu_b}$.



Now, Lemma 4.6 indicates that the singularity structures of $\mu_b$ and $M_b$ are closely related. Indeed, $\mu_b$ is the distributional derivative of $M_b$ in the sense of (4.52) and (4.54). Since taking a derivative “should” simply reduce the scaling exponent by one, we would expect that their spectra are identical up to a shift in $a$ by $-1$. Indeed, this is true for increasing processes, such as $M_b$, as we will elaborate in section 4.6.2. However, it has to be pointed out that this rule cannot be correct for oscillating processes. This is effectively demonstrated by the example $t^a \cdot \sin(t^{-b})$ with $b > 0$. Though this example has the exponent $a$ at zero, its derivative behaves like $t^{a-b-1}$ there. This is caused by the strong oscillations, also called chirp, at zero. In order to deal with such situations the 2-microlocalization theory has to be employed [JAF 91].

Let us first dwell on the well known multifractal analysis of $M_b$ based on $\alpha^n_k$. Recall that $M_b((k_n + 1)2^{-n}) - M_b(k_n 2^{-n})$ is given by (4.52), and use the nested independence (4.56) and identical distributions (4.57) to obtain

$$E\big[S^n_{\alpha,M_b}(q)\big] = \sum_{k_n = 0}^{2^n - 1} E\Big[\big(M^n_{k_n}\big)^q \cdots \big(M^1_{k_1}\big)^q \big(M^0_0\big)^q\Big] = E\Big[\big(M^0_0\big)^q\Big] \cdot \sum_{i=0}^{n} \binom{n}{i} E[M_0^q]^i\, E[M_1^q]^{n-i} = E\Big[\big(M^0_0\big)^q\Big] \cdot \big(E[M_0^q] + E[M_1^q]\big)^n. \qquad (4.67)$$

From this, it follows immediately that

$$T_{\alpha,M_b}(q) = -\log_2 E\big[(M_0)^q + (M_1)^q\big]. \qquad (4.68)$$

Note that this value may be $-\infty$ for some $q$.

THEOREM 4.4.– Assume that i’), ii) and iii) hold. Assume furthermore that $M_0$ and $M_1$ have at least some finite moment of negative order. Then, with probability one

$$\dim(K_a) = f(a) = \tau^*(a) = T^*_{\alpha,M_b}(a) \qquad (4.69)$$

for all $a$ such that $T^*_{\alpha,M_b}(a) > 0$. Thereby, all the spectra are related to the singularity exponents $\alpha^n_k$ or $h^n_k$ of $M_b$.

NOTE 4.4 (Wavelet analysis).– In what follows we will show that we obtain the same spectra for $M_b$ when replacing $\alpha^n_k$ with $w^n_k$ for certain analyzing wavelets. We will also mention the changes which become necessary when studying distribution functions of measures with fractal support (see section 4.5.5).
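For deterministic multipliers the expectation in (4.67) disappears and the identity can be verified directly. The sketch below is an illustration under that assumption: the multipliers are the constants `m_left` and `1 - m_left`, the total mass is taken to be 1, and the partition sum is taken to be the sum of the $q$-th powers of the interval masses. The function names are hypothetical.

```python
import math


def deterministic_masses(depth: int, m_left: float) -> list:
    """Masses of the 2**depth dyadic intervals for constant multipliers (m_left, 1 - m_left)."""
    masses = [1.0]
    for _ in range(depth):
        masses = [mass * w for mass in masses for w in (m_left, 1.0 - m_left)]
    return masses


def partition_sum(masses: list, q: float) -> float:
    """S^n(q): sum over all dyadic intervals of the q-th power of their mass."""
    return sum(mass ** q for mass in masses)


def partition_exponent(q: float, m_left: float) -> float:
    """T(q) = -log2(M0**q + M1**q), formula (4.68) for constant multipliers."""
    return -math.log2(m_left ** q + (1.0 - m_left) ** q)


if __name__ == "__main__":
    depth, m_left = 8, 0.3
    masses = deterministic_masses(depth, m_left)
    for q in (-1.0, 0.0, 1.0, 2.0):
        # Empirical exponent -(1/n) log2 S^n(q) against the closed form T(q).
        empirical = -math.log2(partition_sum(masses, q)) / depth
        print(q, empirical, partition_exponent(q, m_left))
```

In this deterministic case the empirical exponent agrees with $T(q)$ at every depth, since the binomial sum in (4.67) is exact; for random multipliers the agreement holds only for the expectation, and pathwise in the limit under the conditions of Theorem 4.4.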



Proof. Inspecting [BAR 97] we find that $\dim(K_a) = T^*(a)$ for $\alpha^n_k$ under the given assumptions. Earlier results, such as [FAL 94, ARB 96], used more restrictive assumptions but are somewhat easier to read. Though weaker than [BAR 97], they are sufficient in some situations.

4.5.4. Examples

Example 1 (β binomial).– Consider multipliers $M_0$ and $M_1$ that follow a β distribution, which has the density $c_p\,t^{p-1}(1 - t)^{p-1}$ for $t \in [0, 1]$ and 0 elsewhere. Thus, $p > 0$ is a parameter and $c_p$ is a normalization constant. Note that the conservation of mass i) imposes a symmetric distribution since $M_0$ and $M_1$ are set to be equally distributed. The β distribution has finite moments of order $q > -p$ which can be expressed explicitly using the Γ-function. We obtain

$$\text{β-binomial:} \quad T_\alpha(q) = -1 - \log_2 \frac{\Gamma(p + q)\,\Gamma(2p)}{\Gamma(2p + q)\,\Gamma(p)} \qquad (q > -p), \qquad (4.70)$$

and $T(q) = -\infty$ for $q \le -p$. For a typical shape of these spectra, see Figure 4.3.

[Figure 4.3: two panels. Left: $T(q)$ plotted against $q$, with the points $(0, T(0)) = (0, -1)$ and $(1, T(1)) = (1, 0)$ marked. Right: the Legendre transform $T^*(a)$ plotted against $a$.]

Figure 4.3. The spectrum of a binomial measure with β distributed multipliers with $p = 1.66$. Trivially, $T(0) = -1$, whence the maximum of $T^*$ is 1. In addition, every positive increment process has $T(1) = 0$, whence $T^*$ touches the bisector. Finally, the LRD parameter is $H_{var} = (T(2) + 1)/2 = 0.85$ (see (4.90) below)
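The values quoted in the caption of Figure 4.3 follow from formula (4.70). The sketch below evaluates (4.70) through the logarithm of the Γ-function; the function name `beta_binomial_T` is illustrative, and the convention $T(q) = -\infty$ for $q \le -p$ is the one stated in the text.

```python
import math


def beta_binomial_T(q: float, p: float) -> float:
    """Partition function (4.70) of the binomial cascade with beta(p, p) multipliers."""
    if q <= -p:
        return -math.inf  # moments of order q <= -p are infinite
    log_ratio = (math.lgamma(p + q) + math.lgamma(2.0 * p)
                 - math.lgamma(2.0 * p + q) - math.lgamma(p))
    return -1.0 - log_ratio / math.log(2.0)


if __name__ == "__main__":
    p = 1.66
    print("T(0) =", beta_binomial_T(0.0, p))
    print("T(1) =", beta_binomial_T(1.0, p))
    print("H_var =", (beta_binomial_T(2.0, p) + 1.0) / 2.0)
```

Working with `math.lgamma` rather than `math.gamma` avoids overflow for large arguments; all arguments are positive as long as $q > -p$.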

An application of the β binomial for the modeling of data traffic on the Internet can be found in [RIE 99]. Example 2 (Uniform binomial).– As a special case of the β binomial we obtain uniform distributions for the multipliers when setting p = 1. Formula (4.70) simplifies



to $T_\alpha(q) = -1 + \log_2(1 + q)$ for $q > -1$. Applying the formula for the Legendre transform (4.51) yields the explicit expression

$$\text{uniform binomial:} \quad T^*_\alpha(a) = 1 - a + \log_2(e) + \log_2\!\left(\frac{a}{\log_2(e)}\right) \qquad (4.71)$$

for $a > 0$ and $T^*_\alpha(a) = -\infty$ for $a \le 0$.
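The closed form (4.71) can be compared with a direct numerical Legendre transform of $T_\alpha(q) = -1 + \log_2(1 + q)$. The sketch below is an illustration only: the function names are hypothetical, and the infimum over $q > -1$ is approximated by a minimum over a finite grid, which is adequate for values of $a$ whose minimizing $q$ lies well inside the grid.

```python
import math

LOG2_E = math.log2(math.e)


def uniform_T(q: float) -> float:
    """Partition function of the uniform binomial: -1 + log2(1 + q) for q > -1."""
    return -1.0 + math.log2(1.0 + q) if q > -1.0 else -math.inf


def uniform_T_star(a: float) -> float:
    """Closed form (4.71) of the Legendre transform."""
    if a <= 0.0:
        return -math.inf
    return 1.0 - a + LOG2_E + math.log2(a / LOG2_E)


def numeric_legendre(T, a: float, q_lo: float, q_hi: float, steps: int) -> float:
    """Approximate inf_q (q*a - T(q)) on a uniform grid of [q_lo, q_hi]."""
    best = math.inf
    for i in range(steps + 1):
        q = q_lo + (q_hi - q_lo) * i / steps
        best = min(best, q * a - T(q))
    return best


if __name__ == "__main__":
    for a in (0.5, 1.0, LOG2_E, 2.0):
        numeric = numeric_legendre(uniform_T, a, -0.999, 20.0, 21000)
        print(a, numeric, uniform_T_star(a))
```

The maximum value 1 of the spectrum is attained at $a = \log_2(e)$, which corresponds to $q = 0$.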

Example 3 (Log-normal binomial).– Another very interesting case is that of log-normal distributions for the multipliers $M_0$ and $M_1$. Note that we have to replace i) with i’) in this case since log-normal variables can be arbitrarily large, i.e. larger than 1. Recall that the log-normal binomial enjoys the advantage of having exactly log-normal marginals $\mu_b(I^n_k)$ since the product of independent log-normal variables is again a log-normal variable. Having mass conservation only in the mean, however, may cause problems in simulations since the sample mean of the process $\mu_b(I^n_k)$ ($k = 0, \dots, 2^n - 1$) is not $M^0_0$ as in case i), but depends on $n$. Indeed, the negative (virtual) $a$ appearing in the log-normal binomial spectrum reflect the possibility that the sample average may increase locally (see [MAN 90a]).

The calculation of its spectrum starts by observing that the exponential $M = e^G$ of a $N(m, \sigma^2)$ variable $G$, i.e. a Gaussian with mean $m$ and variance $\sigma^2$, has the $q$-th moment $E[M^q] = E[\exp(qG)] = \exp(qm + q^2\sigma^2/2)$. Assuming that $M_0$ and $M_1$ are equally distributed as $M$, their mean must be 1/2. Hence $m + \sigma^2/2 = -\ln(2)$, and

$$\text{log-normal binomial:} \quad T_\alpha(q) = (q - 1)\left(1 - \frac{\sigma^2}{2\ln(2)}\,q\right) \qquad (4.72)$$

for all $q \in \mathbb{R}$ such that $E[(M_b(1))^q]$ is finite. Note that the parabola in (4.72) has two zeros: 1 and $q_{crit} = 2\ln(2)/\sigma^2$. It follows from [KAH 76] that $E[(M_b(1))^q] < \infty$ exactly for $q < q_{crit}$. Since $T(q)$ is differentiable for $q < q_{crit}$ we may obtain its Legendre transform implicitly from (4.51) for $a = T'(q)$ with $q < q_{crit}$, i.e., for all $a > a_{crit} = T'(q_{crit}) = \sigma^2/(2\ln(2)) - 1$. Eliminating $q$ from (4.51) yields the explicit form

$$T^*_\alpha(a) = 1 - \frac{\ln(2)}{2\sigma^2}\left(a - 1 - \frac{\sigma^2}{2\ln(2)}\right)^2 \qquad (a \ge a_{crit}) \qquad (4.73)$$

For $a \le a_{crit}$ the Legendre transform yields $T^*(a) = a \cdot q_{crit}$. Thus, at $a_{crit}$ the spectrum $T^*$ crosses over from the parabola (4.73) to its tangent through the origin with slope $q_{crit}$ (the other tangent through the origin is the bisector).
It should be remembered that only the positive part of this spectrum can be estimated from one realization of Mb . The negative part corresponds to events so rare that they can only be observed in a large array of realizations (see Note 4.2).
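The crossover of the log-normal spectrum from the parabola (4.73) to its tangent can be checked numerically. The sketch below uses illustrative function names and treats $\sigma^2$ as a free parameter; it evaluates (4.72) and (4.73) and glues the linear part $a \cdot q_{crit}$ on for $a < a_{crit}$.

```python
import math

LN2 = math.log(2.0)

def T_lognormal(q, sigma2):
    """Envelope (4.72): (q - 1) * (1 - sigma^2 q / (2 ln 2))."""
    return (q - 1.0) * (1.0 - sigma2 * q / (2.0 * LN2))

def T_prime_lognormal(q, sigma2):
    """Derivative of (4.72) with respect to q."""
    s = sigma2 / (2.0 * LN2)
    return 1.0 + s - 2.0 * s * q

def q_crit(sigma2):
    """Second zero of the parabola (4.72)."""
    return 2.0 * LN2 / sigma2

def a_crit(sigma2):
    """a_crit = T'(q_crit) = sigma^2 / (2 ln 2) - 1."""
    return sigma2 / (2.0 * LN2) - 1.0

def T_star_lognormal(a, sigma2):
    """Spectrum: parabola (4.73) for a >= a_crit, tangent a * q_crit below."""
    if a >= a_crit(sigma2):
        return 1.0 - LN2 / (2.0 * sigma2) * (a - 1.0 - sigma2 / (2.0 * LN2)) ** 2
    return a * q_crit(sigma2)

if __name__ == "__main__":
    s2 = 0.5
    print("q_crit =", q_crit(s2), "a_crit =", a_crit(s2))
    for a in (-0.8, a_crit(s2), 0.5, 1.0, 1.5):
        print(a, T_star_lognormal(a, s2))
```

Both branches agree in value and in slope at $a_{crit}$, which is what makes the linear part a tangent rather than a kink.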



The log-normal framework also allows us to calculate $F(a)$ explicitly, demonstrating which rescaling properties of the marginal distributions of the increment processes of $M_b$ are captured in the multifractal spectra. Indeed, if all $\ln(M_k^n)$ are $N(m, \sigma^2)$ then $-\ln(2)\cdot\alpha_k^n$ is $N(m, \sigma^2/n)$. The mean value theorem of integration gives

$$P_\Omega[|\alpha_k^n - a| < \varepsilon] = \frac{1}{\sqrt{2\pi\sigma^2/n}} \int_{\ln(2)(-a-\varepsilon)}^{\ln(2)(-a+\varepsilon)} \exp\left(-\frac{(x-m)^2}{2\sigma^2/n}\right) dx = \frac{1}{\sqrt{2\pi\sigma^2/n}}\,\ln(2)\cdot 2\varepsilon\cdot \exp\left(-\frac{(-\ln(2)\,x_{a,n} - m)^2}{2\sigma^2/n}\right)$$

with $x_{a,n} \in [a-\varepsilon, a+\varepsilon]$ for all $n$. Keeping only the exponential term in $n$ and substituting $m = -\sigma^2/2 - \ln(2)$ we find

$$\frac{1}{n}\log_2\left(2^n P_\Omega[|\alpha_k^n - a| < \varepsilon]\right) \simeq 1 - \frac{\ln(2)}{2\sigma^2}\left(x_{a,n} - 1 - \frac{\sigma^2}{2\ln(2)}\right)^2. \tag{4.74}$$
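The approximation (4.74) can be checked against the exact Gaussian probability. The sketch below is illustrative only: the function names and the parameter values in the usage section are assumptions. It evaluates the probability through the complementary error function (using tail probabilities to avoid cancellation) and compares the rescaled logarithm with the parabola at the end point of $[a-\varepsilon, a+\varepsilon]$ closest to the maximum.

```python
import math

LN2 = math.log(2.0)

def upper_tail(z):
    """P[Z > z] for a standard normal Z, numerically stable for large z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def coarse_spectrum(a, eps, n, sigma2):
    """(1/n) log2(2^n P[|alpha - a| < eps]) when -ln(2)*alpha ~ N(m, sigma2/n)."""
    m = -sigma2 / 2.0 - LN2
    sd = math.sqrt(sigma2 / n)
    lo = (LN2 * (-a - eps) - m) / sd   # standardized lower integration limit
    hi = (LN2 * (-a + eps) - m) / sd   # standardized upper integration limit
    if lo >= 0.0:                      # both limits in the upper tail
        prob = upper_tail(lo) - upper_tail(hi)
    elif hi <= 0.0:                    # both limits in the lower tail
        prob = upper_tail(-hi) - upper_tail(-lo)
    else:                              # interval straddles the mean
        prob = 1.0 - upper_tail(hi) - upper_tail(-lo)
    return 1.0 + math.log2(prob) / n

def parabola(x, sigma2):
    """Right-hand side of (4.74), identical in form to (4.73)."""
    return 1.0 - LN2 / (2.0 * sigma2) * (x - 1.0 - sigma2 / (2.0 * LN2)) ** 2

if __name__ == "__main__":
    sigma2, a, eps = 0.5, 1.2, 0.01
    for n in (100, 500, 2000):
        print(n, coarse_spectrum(a, eps, n, sigma2), parabola(a + eps, sigma2))
```

The remaining gap shrinks roughly like $\log_2(n)/n$, which is the contribution of the prefactor that (4.74) discards.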

Comparing with (4.73) we see that $T^*(a) = F(a)$, as stated in Theorem 4.2. The above computation shows impressively how well adapted a multiplicative iteration with log-normal multipliers is to multifractal analysis (or vice versa): $F$ extracts, basically, the exponent of the Gaussian kernel. Since the multifractal formalism holds for $M_b$, these features can be measured or estimated using the re-normalized histogram, i.e. the grain based multifractal spectrum $f(a)$. This is a property which could be labeled with the term ergodicity. Note, however, that classical ergodic theory deals with observations along an orbit of increasing length, while $f(a)$ concerns a sequence of orbits.

4.5.5. Beyond dyadic structure

We elaborate here generalizations of the binomial cascade.

Statistically self-similar measures: a natural generalization of the random binomial, denoted here by $M_c$, is obtained by splitting intervals $J_k^n$ iteratively into $c$ subintervals $J^{n+1}_{ck}, \ldots, J^{n+1}_{ck+c-1}$ with length $|J^{n+1}_{ck+i}| = L^{n+1}_{ck+i}\,|J^n_k|$ and mass $\mu_c(J^{n+1}_{ck+i}) = M^{n+1}_{ck+i}\,\mu_c(J^n_k)$. In the simplest case, we will require mass conservation, i.e. $M^{n+1}_{ck} + \cdots + M^{n+1}_{ck+c-1} = 1$, but also $L^{n+1}_{ck} + \cdots + L^{n+1}_{ck+c-1} = 1$, which guarantees that $\mu_c$ lives everywhere. Assuming the analogous properties of ii) and iii) to hold for both the length- as well as the mass-multipliers we find that $T_{M_c}(q)$ is the unique solution of

$$E\left[(M_0)^q (L_0)^{-T(q)} + \cdots + (M_{c-1})^q (L_{c-1})^{-T(q)}\right] = 1. \tag{4.75}$$
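Equation (4.75) defines $T(q)$ only implicitly. As a hedged illustration, the sketch below solves it by bisection in the special case of deterministic multipliers, where the expectation is trivial; the function name `solve_T` and the example multipliers are hypothetical. Because every length multiplier is smaller than 1, the left-hand side is strictly increasing in $T$, so the root is unique.

```python
import math

def solve_T(q, masses, lengths, tol=1e-12):
    """Solve sum_i m_i^q * l_i^(-T) = 1 for T by bisection.

    Assumes deterministic, strictly positive mass multipliers m_i and
    length multipliers 0 < l_i < 1 (a special case of (4.75)).
    """
    def g(T):
        return sum(m ** q * l ** (-T) for m, l in zip(masses, lengths)) - 1.0

    lo, hi = -1.0, 1.0
    while g(lo) > 0.0:      # expand the bracket downwards if needed
        lo *= 2.0
    while g(hi) < 0.0:      # expand the bracket upwards if needed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    masses = (0.2, 0.3, 0.5)
    lengths = (0.25, 0.25, 0.5)
    for q in (-1.0, 0.0, 1.0, 2.0):
        print(q, solve_T(q, masses, lengths))
```

For equal lengths $1/2$ the solution reduces to $-\log_2(m_0^q + m_1^q)$, the familiar binomial formula; in general $T(0) = -1$ and $T(1) = 0$ follow from length and mass conservation.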



This formula for $T(q)$ can be derived rigorously by taking expectations where appropriate in the proof of [RIE 95a, Prop 14]. Doing so shows, moreover, that $T(q)$ assumes a limit in these examples.

Multifractal formalism: it is notable that the multifractal formalism “holds” for the class of statistically self-similar measures described above in Theorem 4.4 (see [ARB 96]).

However, if $L^{n+1}_{ck} + \cdots + L^{n+1}_{ck+c-1} = \lambda < 1$, e.g. choosing $L_k^n = 1/c'$ almost surely with $c' > c$, then the measure $\mu_c$ lives on a set of fractal dimension and its distribution function $M_c(t) = \mu_c([0, t))$ is constant almost everywhere. In this case, equality in the multifractal formalism will fail: indeed, unless the scaling exponents $s_k^n$ are modified to account for boundary effects caused by the fractal support, the partition function will be unbounded for negative $q$, e.g. $\tau_\alpha(q) = -\infty$ for $q < 0$ (see [RIE 95a]). As a consequence, $T_\alpha(q) = -\infty$ and (4.75) is no longer valid for $q < 0$. Interestingly, the fine spectrum $\dim(K_a)$ is still known, however, due to [ARB 96].

Stationary increments: an entirely different and novel way of introducing randomness in the geometry of multiplicative cascades, which leads to perfectly stationary increments, has been given recently in [MAN 02] and in [BAR 02, BAR 03, BAR 04, MUZ 02, BAC 03, CHA 02, CHA 05, RIE 07b, RIE 07a]. The description of these models is, unfortunately, beyond the scope of this work.

Binomial in the wavelet domain: in concluding this section we should mention that, with regard to (4.61), we may choose to directly model the wavelet coefficients of a process in a multiplicative fashion in order to obtain a desired multifractal structure. Some early steps in this direction have been taken in [ARN 98].

4.6. Wavelet based analysis

4.6.1. The binomial revisited with wavelets

The deterministic envelope is the simplest of the wavelet-based spectra of $\mu_b$ to calculate. Taking into account the normalization factors in (4.12) when using Lemma 4.6, the calculation of (4.67) carries over to give

$$E_\Omega\left[S^n_{w,\mu_b}(q)\right] = 2^{nq}\, E\left[|C_{0,0}|^q\right] \cdot \left(E_\Omega[M_0^q] + E_\Omega[M_1^q]\right)^n,$$

and similarly for $M_b$. Provided $E[|C_{0,0}|^q]$ is finite this immediately gives

$$T_{w,\mu_b}(q) + q = T_{w,M_b}(q) = T_{\alpha,M_b}(q), \tag{4.76}$$

$$T^*_{w,\mu_b}(a-1) = T^*_{w,M_b}(a) = T^*_{\alpha,M_b}(a). \tag{4.77}$$



Imposing additional assumptions on the distributions of the multipliers we may also control $w_k^n(\mu_b)$ themselves and not only their moments. To this end, we should be able to guarantee that the wavelet coefficients do not decay too fast (compare (4.10)), i.e. the random factor (4.62) which appears in (4.61) does not become too small. Indeed, it is sufficient to assume that there is some $\varepsilon > 0$ such that $|C_{0,0}(\mu_b)| \ge \varepsilon$ almost surely. Then for all $t$, $(1/n)\log\left(\left|\int \psi(t)\,\mu_b^{(n,k_n)}(dt)\right|\right) \to 0$, and with (4.61)

$$\underline{w}_{\mu_b}(t) = \liminf_{n\to\infty} -\frac{1}{n}\log_2\left(2^{n/2}|C_{n,k_n}|\right) = \underline{\alpha}_{M_b}(t) - 1, \tag{4.78}$$

and similarly $\overline{w}_{\mu_b}(t) = \overline{\alpha}_{M_b}(t) - 1$. Observe that this is precisely the relation we expect between the scaling exponents of a process and its (distributional) derivative, at least in nice cases, and that it is in agreement with (4.77). In summary (first observed for deterministic binomials in [BAC 93]):

COROLLARY 4.5.– Assume that $\mu_b$ is a random binomial measure satisfying i)-iii). Assume that the random variables $\left|\int \psi(t)\,\mu_b^{(n,k)}(dt)\right|$ resp. $\left|\int \psi(t)\,M_b^{(n,k)}(t)\,dt\right|$ are uniformly bounded away from 0. Then, the multifractal formalism “holds” for the wavelet based spectra of $\mu_b$, resp. $M_b$, i.e.

$$\dim(E_a^{w,\mu_b}) \overset{\text{a.s.}}{=} f_{w,\mu_b}(a) \overset{\text{a.s.}}{=} \tau^*_{w,\mu_b}(a) \overset{\text{a.s.}}{=} T^*_{w,\mu_b}(a), \tag{4.79}$$

respectively

$$\dim(E_a^{w,M_b}) \overset{\text{a.s.}}{=} f_{w,M_b}(a) \overset{\text{a.s.}}{=} \tau^*_{w,M_b}(a) \overset{\text{a.s.}}{=} T^*_{w,M_b}(a). \tag{4.80}$$

Requiring that $\left|\int \psi(t)\,\mu_b^{(n,k)}(dt)\right|$ resp. $\left|\int \psi(t)\,M_b^{(n,k)}(t)\,dt\right|$ should be bounded away from zero in order to ensure (4.78), though satisfied in some simple cases, seems too restrictive to be of practical use. A few comments are in order here.

First, this condition can be weakened to allow arbitrarily small values of these integrals, as long as all their negative moments exist. This can be shown by an argument using the Borel-Cantelli lemma.

Second, the condition may simplify in two ways. For iid multipliers we know that these integrals are equal in distribution to $C_{0,0}$, thus only $n = k = 0$ has to be checked. Further, for the Haar wavelet and symmetric multipliers, it becomes simply the condition that $M_0$ be uniformly bounded away from zero (see (4.60)), or at least that $E[|M_0 - 1/2|^q] < \infty$ for all negative $q$.

Third, if we drop iii) and allow the multiplier distributions to depend on scale (see [RIE 99]), then $\left|\int \psi(t)\,\mu_b^{(n,k)}(dt)\right|$ resp. $\left|\int \psi(t)\,M_b^{(n,k)}(t)\,dt\right|$ has to be bounded away from zero only for large $n$. In applications such as network traffic modeling we find that on fine scales $M^{n+1}_{2k} - M^{n+1}_{2k+1}$ is best modeled by discrete distributions on $[0, 1]$ with large variance, i.e. without mass around 1/2.



Fourth, another way out is to avoid small wavelet coefficients entirely in a multifractal analysis. More precisely, we would follow [BAC 93, JAF 97] and replace $C_{n,k_n}$ in the definition of $w^n_{k_n}$ (4.12) by the maximum over certain wavelet coefficients “close” to $t$. Of course, the multifractal formalism of section 4.4 still holds. [JAF 97] gives conditions under which the spectrum $\tau^*_{w,\mu_b}(a)$ based on this modified $w^n_{k_n}$ agrees with the “Hölder” spectrum $\dim(E_a)$ based on $h_k^n(M_b)$.

4.6.2. Multifractal properties of the derivative

Corollary 4.5 establishes for the binomial what intuition suggests in general, i.e. that the multifractal spectra of processes and their derivative should be related in a simple fashion, at least for certain classes of processes. As we will show, increasing processes have this property, at least for the wavelet based multifractal spectra. However, the order of Hölder regularity in the sense of the spaces $C_t^h$ (see Lemma 4.1) might decrease under differentiation by an amount different from 1. This is particularly true in the presence of highly oscillatory behavior such as “chirps”, as the example $t^a \sin(1/t^2)$ demonstrates. In order to assess the proper space $C_t^h$ a 2-microlocalization has to be employed. For good surveys see [JAF 95, JAF 91].

In order to establish a general result on derivatives we place ourselves in the framework whereby we care less for a representation of a process in terms of wavelet coefficients and are interested purely in an analysis of oscillatory behavior. Typical examples of an analyzing mother wavelet $\psi$ are the derivatives of the Gaussian kernel $\exp(-t^2/2)$ which were used to produce Figure 4.4.

The idea is to use integration by parts. For a continuous measure $\mu$ on $[0, 1]$ with distribution function $M(t) = \mu([0, t))$ and a continuously differentiable function $g$ this reads as

$$\begin{aligned}
\int g(t)\,\mu(dt) &= \lim_{n\to\infty} \sum_{k=0}^{2^n-1} g(k2^{-n})\left(M((k+1)2^{-n}) - M(k2^{-n})\right) \\
&= \lim_{n\to\infty} \sum_{k=0}^{2^n-1} M(k2^{-n})\left(g((k-1)2^{-n}) - g(k2^{-n})\right) + M(1)\,g(1-2^{-n}) - M(0)\,g(-2^{-n}) \\
&= M(1)g(1) - M(0)g(0) - \int M(t)\,g'(t)\,dt
\end{aligned} \tag{4.81}$$

where we alluded to (4.53) and regrouped terms. As a matter of fact, $M(0) = 0$ and $M(1) = 1$. A similar calculation can be performed for a more general, not necessarily increasing process $Y$, provided it has a derivative $Y'$, by replacing $\mu(dt)$ with $Y'(t)dt$.
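The regrouping of terms in (4.81) is a summation by parts and holds exactly at every finite resolution $n$, not only in the limit. The sketch below verifies this for a smooth stand-in; $M(t) = t^2$ and $g(t) = \cos(3t)$ are illustrative assumptions, not objects from the text.

```python
import math

def stieltjes_sum(g, M, n):
    """First sum in (4.81): sum_k g(k 2^-n) * (M((k+1) 2^-n) - M(k 2^-n))."""
    N = 2 ** n
    return sum(g(k / N) * (M((k + 1) / N) - M(k / N)) for k in range(N))

def summed_by_parts(g, M, n):
    """Second line of (4.81): regrouped sum plus the two boundary terms."""
    N = 2 ** n
    bulk = sum(M(k / N) * (g((k - 1) / N) - g(k / N)) for k in range(N))
    boundary = M(1.0) * g(1.0 - 1.0 / N) - M(0.0) * g(-1.0 / N)
    return bulk + boundary

def M(t):
    """Illustrative distribution function with M(0) = 0 and M(1) = 1."""
    return t * t

def g(t):
    """Illustrative smooth test function."""
    return math.cos(3.0 * t)

if __name__ == "__main__":
    # Exact value of the integral of g dM = integral of 2 t cos(3t) dt on [0, 1].
    exact = 2.0 * (math.sin(3.0) / 3.0 + math.cos(3.0) / 9.0 - 1.0 / 9.0)
    for n in (4, 8, 12):
        print(n, stieltjes_sum(g, M, n), summed_by_parts(g, M, n), exact)
```

Both sums coincide up to rounding at each $n$ and approach the integral at rate $2^{-n}$ for this smooth example.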



Figure 4.4. Demonstration of the multifractal behavior of a binomial measure $\mu_b$ (left) and its distribution function $M_b$ (right). On the top a numerical simulation, i.e. (4.52) on the left and $M_b(k2^{-n})$ on the right for $n = 20$. In the middle the moduli of a continuous wavelet transform [DAU 92] where the second Gaussian derivative was taken as the analyzing wavelet $\psi(t)$ for $\mu_b$, resp. the third derivative $\psi'$ for $M_b$. The dark lines indicate the “lines of maxima” [JAF 97, BAC 93], i.e. the locations where the modulus of $\int \psi(2^j t - s)\,\mu_b(dt)$ has a local maximum as a function of $s$ with $j$ fixed. On the bottom a multifractal analysis in three steps. First, a plot of $\log S^n_w(q)$ against $n$ tests for linear behavior for various $q$. Second, the partition function $\tau(q)$ is computed as the slopes of a least square linear fit of $\log S^n$. Finally, the Legendre transform $\tau^*(a)$ of $\tau(q)$ is calculated following (4.49). Indicated with dashes in the plots of $\tau(q)$ and $\tau^*(a)$ of $\mu_b$ are the corresponding functions for $M_b$, providing empirical evidence for (4.76), (4.77), and (4.83)

Now, setting $g(t) = 2^{n/2}\psi(2^n t - k)$ for a smooth analyzing wavelet $\psi$ we have $g'(t) = 2^{3n/2}\psi'(2^n t - k)$ and obtain

$$C_{n,k}(\psi, \mu) = 2^{n/2}\psi(2^n - k) - 2^n \cdot C_{n,k}(\psi', M). \tag{4.82}$$

Estimating $2^n - k_n = 2^n - \lfloor t2^n \rfloor \simeq (1-t)2^n$ and assuming exponential decay of $\psi(t)$ at infinity allows us to conclude

$$\underline{w}(t)_{\psi,\mu} = -1 + \underline{w}(t)_{\psi',M}, \tag{4.83}$$

and similarly for $\overline{w}(t)$.

COROLLARY 4.6.–

$$f_{\psi,\mu}(a) = f_{\psi',M}(a+1), \qquad \tau^*_{\psi,\mu}(a) = \tau^*_{\psi',M}(a+1). \tag{4.84}$$

This is impressively demonstrated in Figure 4.4. We should note that $\psi'$ has one more vanishing moment than $\psi$, which is easily seen by integrating by parts. Thus, it is natural to analyze the integral of a process, here the distribution function $M$ of the measure $\mu$, using $\psi'$ since the degree of the Taylor polynomials typically grows by 1 under integration.

NOTE 4.5 (Visibility of singularities and regularity of the wavelet).– It is notable that the Haar wavelet yields the full spectra of the binomial $M_b$ (and also of its distributional derivative $\mu_b$). This fact is in some discord with the folklore saying that a wavelet cannot detect degrees of regularity larger than its own. In other words, a signal will rarely be more regular than the basis elements it is composed of. To resolve the apparent paradox, recall the peculiar property of multiplicative measures, which is to have constant Taylor polynomials. So, such a measure will reveal its scaling structure to any analyzing wavelet with $\int \psi = 0$. No higher regularity, i.e. $\int t^k \psi(t)\,dt = 0$, is required. The correct reading of the literature is, indeed, that wavelets are only guaranteed to detect singularities smaller than their own regularity.

4.7. Self-similarity and LRD

The statistical self-similarity as expressed in (4.1) makes FBM, or rather its increment process, a paradigm of long range dependence (LRD). To be more explicit let $\delta$ denote a fixed lag and define fractional Gaussian noise (FGN) as

$$G(k) := B_H((k+1)\delta) - B_H(k\delta). \tag{4.85}$$

Possessing the LRD property means that the auto-correlation $r_G(k) := E_\Omega[G(n+k)G(n)]$ decays so slowly that $\sum_k r_G(k) = \infty$. The presence of such strong dependence bears an important consequence on the aggregated processes

$$G^{(m)}(k) := \frac{1}{m} \sum_{i=km}^{(k+1)m-1} G(i). \tag{4.86}$$

They have a much higher variance, and variability, than would be the case for a short range dependent process. Indeed, if $X$ is a process with iid values $X(k)$, then $X^{(m)}(k)$ has variance $(1/m^2)\operatorname{var}(X_0 + \cdots + X_{m-1}) = (1/m)\operatorname{var}(X)$. For $G$ we find, due to (4.1) and $B_H(0) = 0$,

$$\operatorname{var}(G^{(m)}(0)) = \operatorname{var}\left(\frac{1}{m} B_H(m\delta)\right) = \operatorname{var}\left(\frac{m^H}{m} B_H(\delta)\right) = m^{2H-2}\operatorname{var}(B_H(\delta)). \tag{4.87}$$

Indeed, for $H > 1/2$ this expression decays much slower than $1/m$. As is shown in [COX 84], $\operatorname{var}(X^{(m)}) \simeq m^{2H-2}$ is equivalent to $r_X(k) \simeq k^{2H-2}$ and so $G(k)$ is indeed LRD for $H > 1/2$.

Let us demonstrate with FGN how to relate LRD with multifractal analysis based only on the fact that it is a zero-mean process, not (4.1). To this end let



$\delta = 2^{-n}$ denote the finest resolution we will consider, and let 1 be the largest. For $m = 2^i$ ($0 \le i \le n$) the process $mG^{(m)}(k)$ becomes simply $B_H((k+1)m\delta) - B_H(km\delta) = B_H((k+1)2^{i-n}) - B_H(k2^{i-n})$. However, the second moment of this expression, which is also the variance, is exactly what determines $T_\alpha(2)$. More precisely, using stationarity of $G$ and substituting $m = 2^i$, we obtain

$$2^{-(n-i)T_\alpha(2)} \simeq E_\Omega\left[S^{n-i}_\alpha(2)\right] = E_\Omega\left[\sum_{k=0}^{2^{n-i}-1} |mG^{(m)}(k)|^2\right] = 2^{n-i}\, 2^{2i} \operatorname{var}\left(G^{(2^i)}\right). \tag{4.88}$$

This should be compared with the definition of the LRD parameter $H$ using

$$\operatorname{var}(G^{(m)}) \simeq m^{2H-2} \quad\text{or}\quad \operatorname{var}(G^{(2^i)}) \simeq 2^{i(2H-2)}. \tag{4.89}$$
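The scaling laws (4.87) and (4.89) can be reproduced without simulation from the autocovariance of FGN. The sketch below assumes the standard autocovariance $r_G(k) = \tfrac{1}{2}\operatorname{var}(B_H(\delta))\,(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H})$, which follows from self-similarity and stationary increments; the function names are illustrative. It computes the variance of the aggregated process exactly and then recovers $H$ from the slope of $\log_2 \operatorname{var}(G^{(2^i)})$ against $i$.

```python
import math

def fgn_autocov(k, H, var=1.0):
    """Autocovariance of FGN at lag k (standard formula, var = var(B_H(delta)))."""
    k = abs(k)
    return 0.5 * var * ((k + 1) ** (2 * H) - 2 * k ** (2 * H) + abs(k - 1) ** (2 * H))

def aggregated_variance(m, H, var=1.0):
    """var(G^(m)(0)) for the block mean (4.86), summed from the autocovariance."""
    total = m * fgn_autocov(0, H, var)
    for k in range(1, m):
        total += 2 * (m - k) * fgn_autocov(k, H, var)
    return total / m ** 2

def estimate_hurst(variances_by_level):
    """Least-squares slope of log2 var(G^(2^i)) against i, mapped to H via (4.89)."""
    levels = sorted(variances_by_level)
    ys = [math.log2(variances_by_level[i]) for i in levels]
    n = len(levels)
    mean_x = sum(levels) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(levels, ys))
             / sum((x - mean_x) ** 2 for x in levels))
    return 1.0 + slope / 2.0      # slope = 2H - 2

if __name__ == "__main__":
    H = 0.8
    variances = {i: aggregated_variance(2 ** i, H) for i in range(1, 7)}
    print(variances)
    print("estimated H:", estimate_hurst(variances))
```

With measured data the variances would be sample estimates and the fit restricted to a scaling region, so the recovered $H$ would only be approximate.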

At this point a conceptual difficulty arises. Multifractal analysis is formulated in the limit of small scales ($i \to -\infty$) while LRD is a property for large scales ($i \to \infty$). Thus, the two exponents $H$ and $T_\alpha(2)$ can in theory only be related when assuming that the scaling they represent is actually exact at all scales, and not only asymptotically. In any real world application, however, we will determine both $H$ and $T_\alpha(2)$ by finding a scaling region $\underline{i} \le i \le \overline{i}$ in which (4.88) and (4.89) hold up to satisfactory precision. Comparing the two scaling laws in $i$ yields $T_\alpha(2) + 1 - 2 = 2H - 2$, or

$$H = \frac{T_\alpha(2) + 1}{2}. \tag{4.90}$$

This formula expresses most pointedly how multifractal analysis goes beyond second order statistics: with $T(q)$ we capture the scaling of all moments. The relation (4.90), here derived for zero-mean processes, can be put on more solid grounds using wavelet estimators of the LRD parameter [ABR 95] which are more robust than the estimators through variance. The same formula (4.90) also reappears for certain multifractals (see (4.100)).

In this context it is worthwhile pointing forward to (4.96), from which we conclude that $T_{B_H}(q) = qH - 1$ if $q > -1$. The fact to note here is that FBM requires indeed only one parameter to capture its scaling while multifractal scaling, in principle, is described by an array of parameters $T(q)$.

4.8. Multifractal processes

The most prominent examples where we find coinciding, strictly concave multifractal spectra are the distribution functions of cascade measures [MAN 74, KAH 76, CAW 92, FAL 94, ARB 96, OLS 94, HOL 92, RIE 95a, RIE 97b, PES 97] for which $\dim(K_a)$ and $T^*(a)$ are equal and have the form of a $\cap$ (see Figure 4.3 and also 4.5(e)). These cascades are constructed through a multiplicative iteration scheme such as the binomial cascade, which is presented in detail earlier in this chapter with special emphasis on its wavelet decomposition. Having positive increments, this class of processes is, however, sometimes too restrictive. FBM, as noted, has the disadvantage of a poor multifractal structure and does not contribute to a larger pool of stochastic processes with multifractal characteristics.

It is also notable that the first “natural”, truly multifractal stochastic process to be identified was the Lévy motion [JAF 99]. This example is particularly appealing since scaling is not injected into the model by an iterative construction (this is what we mean by the term natural). However, its spectrum, though it shows a non-trivial range of scaling exponents $h(t)$, is degenerate in the sense that it is linear.

4.8.1. Construction and simulation

With the formalism presented here, the stage is set for constructing and studying new classes of truly multifractal processes. The idea, to speak in Mandelbrot’s own words, is inevitable after the fact. The ingredients are simple: a multifractal “time warp”, i.e. an increasing function or process $M(t)$ for which the multifractal formalism is known to hold, and a function or process $V$ with strong monofractal scaling properties such as fractional Brownian motion (FBM), a Weierstrass process or self-similar martingales such as Lévy motion. We then form the compound process

$$\mathcal{V}(t) := V(M(t)). \tag{4.91}$$

To fix the ideas, let us recall the method of midpoint displacement which can be used to define a simple Brownian motion $B_{1/2}$ which we will also call the Wiener motion (WM) for a clear distinction from FBM. This method constructs $B_{1/2}$ iteratively at dyadic points. Having constructed $B_{1/2}(k2^{-n})$ and $B_{1/2}((k+1)2^{-n})$ we define $B_{1/2}((2k+1)2^{-n-1})$ as $(B_{1/2}(k2^{-n}) + B_{1/2}((k+1)2^{-n}))/2 + X_{k,n}$. The offsets $X_{k,n}$ are independent zero-mean Gaussian variables with variance such as to satisfy (4.1) with $H = 1/2$, hence the name of the method. One way to obtain Wiener motion in multifractal time WM(MF) is then to keep the offset variables $X_{k,n}$ as they are but to apply them at the time instances $t_{k,n}$ defined by $t_{k,n} = M^{-1}(k2^{-n})$, i.e. $M(t_{k,n}) = k2^{-n}$:

$$\mathcal{B}_{1/2}(t_{2k+1,n+1}) := \frac{\mathcal{B}_{1/2}(t_{k,n}) + \mathcal{B}_{1/2}(t_{k+1,n})}{2} + X_{k,n}. \tag{4.92}$$

This amounts to a randomly located random displacement, the location being determined by $M$. Indeed, (4.91) is nothing but a time warp.

An alternative construction of “warped Wiener motion” WM(MF) which yields equally spaced sampling, as opposed to the samples $\mathcal{B}_{1/2}(t_{k,n})$ provided by (4.92), is desirable. To this end, note first that the increments of WM(MF) become independent Gaussians once the path of $M(t)$ is realized. To be more precise, fix $n$ and let

$$G(k) := \mathcal{B}((k+1)2^{-n}) - \mathcal{B}(k2^{-n}) = B_{1/2}(M((k+1)2^{-n})) - B_{1/2}(M(k2^{-n})). \tag{4.93}$$
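A minimal sketch of the equally spaced construction based on (4.93): first the increments of the warp $M$ are drawn, then independent zero-mean Gaussians with those increments as variances. The choice of a conservative binomial cascade with uniform multipliers ($M_1 = 1 - M_0$) is an assumption made for illustration, as are the function names.

```python
import math
import random
from itertools import accumulate

def binomial_cascade_increments(n, rng):
    """Masses M((k+1) 2^-n) - M(k 2^-n) of a random binomial cascade.

    Assumed multipliers: M0 uniform on [0, 1) and M1 = 1 - M0, drawn
    independently for each split, so mass is conserved exactly.
    """
    masses = [1.0]
    for _ in range(n):
        refined = []
        for mass in masses:
            m0 = rng.random()
            refined.append(mass * m0)
            refined.append(mass * (1.0 - m0))
        masses = refined
    return masses

def warped_wiener_increments(masses, rng):
    """Given the warp, G(k) is N(0, variance = k-th increment of M)."""
    return [rng.gauss(0.0, math.sqrt(w)) for w in masses]

def sample_path(increments):
    """Cumulative sums: the warped process at the dyadic points k 2^-n."""
    return [0.0] + list(accumulate(increments))

if __name__ == "__main__":
    rng = random.Random(0)
    masses = binomial_cascade_increments(10, rng)
    path = sample_path(warped_wiener_increments(masses, rng))
    print(len(path), path[-1])
```

Conditioned on the cascade the increments are independent, yet their variances inherit the strong dependence of the cascade masses.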

For a sample path of $G$ we start by producing first the random variables $M(k2^{-n})$. Once this is done, the $G(k)$ are simply independent zero-mean Gaussian variables with variance $|M((k+1)2^{-n}) - M(k2^{-n})|$. This procedure has been used in Figure 4.5.

4.8.2. Global analysis

To calculate the multifractal envelope $T(q)$ we need only to know that $V$ is an $H$-sssi process, i.e. that the increment $V(t+u) - V(t)$ is equal in distribution to $u^H V(1)$ (see (4.1)). Assuming independence between $V$ and $M$, a simple calculation reads as

$$\begin{aligned}
E_\Omega\left[\sum_{k=0}^{2^n-1} \left|\mathcal{V}((k+1)2^{-n}) - \mathcal{V}(k2^{-n})\right|^q\right]
&= \sum_{k=0}^{2^n-1} E\left[E\left[\left|V(M((k+1)2^{-n})) - V(M(k2^{-n}))\right|^q \,\Big|\, M(k2^{-n}), M((k+1)2^{-n})\right]\right] \\
&= \sum_{k=0}^{2^n-1} E\left[\left|M((k+1)2^{-n}) - M(k2^{-n})\right|^{qH}\right] E\left[|V(1)|^q\right].
\end{aligned} \tag{4.94}$$

With little more effort the increments $|\mathcal{V}((k+1)2^{-n}) - \mathcal{V}(k2^{-n})|$ can be replaced by suprema, i.e. by $2^{-nh_k^n}$, or even certain wavelet coefficients under appropriate assumptions (see [RIE 88]). It follows that

$$\text{Warped } H\text{-sssi:}\quad T_{\mathcal{V}}(q) = \begin{cases} T_M(qH) & \text{if } E_\Omega\left[\left|\sup_{0\le t\le 1} V(t)\right|^q\right] < \infty \\ -\infty & \text{otherwise.} \end{cases} \tag{4.95}$$

Simple $H$-sssi process: when choosing the deterministic warp time $M(t) = t$ we have $T_M(q) = q - 1$ since $S^n_M(q) = 2^n \cdot 2^{-nq}$ for all $n$. Also, $\mathcal{V} = V$. We obtain $T_M(qH) = qH - 1$ which has to be inserted into (4.95) to obtain

$$\text{Simple } H\text{-sssi:}\quad T_V(q) = \begin{cases} qH - 1 & \text{if } E_\Omega\left[\left|\sup_{0\le t\le 1} V(t)\right|^q\right] < \infty \\ -\infty & \text{otherwise.} \end{cases} \tag{4.96}$$

4.8.3. Local analysis of warped FBM

Let us now turn to the special case where $V$ is FBM. Then, we use the term FB(MF) to abbreviate fractional Brownian motion in multifractal time: $\mathcal{B}(t) = B_H(M(t))$. First, to obtain an idea of what to expect from the spectra of $\mathcal{B}$, let us note that the moments appearing in (4.95) are finite for all $q > -1$ (see [RIE 88, lem 7.4] for a detailed discussion). Applying the Legendre transform easily yields

$$T^*_{\mathcal{B}}(a) = \inf_q \left(qa - T_M(qH)\right) = T^*_M(a/H). \tag{4.97}$$
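The rescaling $T^*_{\mathcal{B}}(a) = T^*_M(a/H)$ in (4.97) can be checked numerically for a concrete warp. The sketch below takes the uniform binomial envelope $T_M(q) = -1 + \log_2(1+q)$ as an illustrative choice of $T_M$ and evaluates both Legendre transforms by ternary search, which applies because $q \mapsto qa - T(q)$ is convex whenever $T$ is concave. Function names and search bounds are assumptions.

```python
import math

def T_M(q):
    """Envelope of the warp: uniform binomial, valid for q > -1."""
    return -1.0 + math.log2(1.0 + q)

def T_B(q, H):
    """Envelope of the warped process according to (4.95): T_M(q H)."""
    return T_M(q * H)

def legendre(T, a, q_lo, q_hi, iters=200):
    """inf_q (q a - T(q)) by ternary search on the bracket [q_lo, q_hi].

    Only correct if the bracket contains the minimizer and the objective
    is convex there, which holds for concave T.
    """
    for _ in range(iters):
        q1 = q_lo + (q_hi - q_lo) / 3.0
        q2 = q_hi - (q_hi - q_lo) / 3.0
        if q1 * a - T(q1) < q2 * a - T(q2):
            q_hi = q2
        else:
            q_lo = q1
    q = 0.5 * (q_lo + q_hi)
    return q * a - T(q)

def T_star_M_closed(a):
    """Closed form (4.71) of the uniform binomial spectrum for a > 0."""
    log2e = math.log2(math.e)
    return 1.0 - a + log2e + math.log2(a / log2e)

if __name__ == "__main__":
    H = 0.7
    for a in (0.3, 0.5, 1.0):
        warped = legendre(lambda q: T_B(q, H), a, -1.0 / H + 1e-9, 1e3)
        rescaled = legendre(T_M, a / H, -1.0 + 1e-9, 1e3)
        print(a, warped, rescaled, T_star_M_closed(a / H))
```

The three printed values agree for each $a$, illustrating that warping merely stretches the horizontal axis of the spectrum by the factor $H$.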

[Plot panels (a)–(e) of Figure 4.5; axis labels recovered from the plot: time, lag, a.]

Figure 4.5. Left: simulation of Brownian motion in binomial time: (a) sampling of $M_b((k+1)2^{-n}) - M_b(k2^{-n})$ ($k = 0, \ldots, 2^n - 1$), indicating distortion of dyadic time intervals; (b) $M_b(k2^{-n})$: the time warp; (c) Brownian motion warped with (b): $\mathcal{B}(k2^{-n}) = B_{1/2}(M_b(k2^{-n}))$. Right: estimation of $\dim E_a^{\mathcal{B}}$ using $\tau^*_{w,\mathcal{B}}$: (d) empirical correlation of the Haar wavelet coefficients; (e) dot-dashed: $T^*_{M_b}$ (from theory), dashed: $T^*_{\mathcal{B}}(a) = T^*_{M_b}(a/H)$, solid: the estimator $\tau^*_{w,\mathcal{B}}$ obtained from (c). (Reproduced from [GON 99])

Second, towards the local analysis we recall the uniform and strict Hölder continuity of the paths of FBM (footnote 5), which reads roughly as

$$\sup_{|u|\le\delta} |\mathcal{B}(t+u) - \mathcal{B}(t)| = \sup_{|u|\le\delta} |B_H(M(t+u)) - B_H(M(t))| \simeq \sup_{|u|\le\delta} |M(t+u) - M(t)|^H.$$

5. For a precise statement see Adler [ADL 81] or [RIE 88, Theorem 7.4].



This is the key to concluding that $B_H$ simply squeezes the Hölder regularity exponents by a factor $H$. Thus, $h_{\mathcal{B}}(t) = H \cdot h_M(t)$, etc. and

$$K^M_{a/H} = K^{\mathcal{B}}_a,$$

and, consequently, analogous to (4.97), $d_{\mathcal{B}}(a) = d_M(a/H)$. Figure 4.5(d)-(e) displays an estimation of $d_{\mathcal{B}}(a)$ using wavelets which agrees very closely with the form $d_M(a/H)$ predicted by theory (for statistics on this estimator see [GON 99, GON 98]). In conclusion:

COROLLARY 4.7 (Fractional Brownian motion in multifractal time).– Let $B_H$ denote FBM of Hurst parameter $H$. Let $M(t)$ be of almost surely continuous paths and independent of $B_H$. Then, the multifractal warp formalism

$$\dim(K^{\mathcal{B}}_a) = f_{\mathcal{B}}(a) = \tau^*_{\mathcal{B}}(a) = T^*_{\mathcal{B}}(a) = T^*_M(a/H) \tag{4.98}$$

holds for $\mathcal{B}(t) = B_H(M(t))$ for any $a$ such that the multifractal formalism holds for $M$ at $a/H$, i.e., for which $\dim(K^M_{a/H}) = T^*_M(a/H)$.

This means that the local, or fine, multifractal structure of $\mathcal{B}$ captured in $\dim(K^{\mathcal{B}}_a)$ on the left can be estimated through grain based, simpler and numerically more robust spectra on the right side, such as $\tau^*_{\mathcal{B}}(a)$ (compare Figure 4.5(e)).

“Warp formula” (4.98) is appealing since it allows us to separate the LRD parameter of FBM and the multifractal spectrum of the time change $M$. Indeed, provided that $M$ is almost surely increasing, we have $T_M(1) = 0$ since $S^n_M(1) = M(1)$ for all $n$. Thus, $T_{\mathcal{B}}(1/H) = 0$ reveals the value of $H$. Alternatively, the tangent at $T^*_{\mathcal{B}}$ through the origin has slope $1/H$. Once $H$ is known, $T^*_M$ follows easily from $T^*_{\mathcal{B}}$.

Simple FBM: when choosing the deterministic warp time $M(t) = t$ we have $\mathcal{B} = B_H$ and $T_M(q) = q - 1$ since $S^n_M(q) = 2^n \cdot 2^{-nq}$ for all $n$. We conclude that

$$T_{B_H}(q) = qH - 1 \tag{4.99}$$

for all q > −1. This confirms (4.90) for FGN. With (4.98) it shows that all spectra of FBM consist of the one point (H, 1) only, making the monofractal character of this process most explicit.



4.8.4. LRD and estimation of warped FBM

Let $G(k) := \mathcal{B}((k+1)2^{-n}) - \mathcal{B}(k2^{-n})$ be FGN in multifractal time (see (4.93) for the case $H = 1/2$). Calculating auto-correlations explicitly shows that $G$ is second order stationary under mild conditions with

$$H_G = \frac{T_M(2H) + 1}{2}. \tag{4.100}$$
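For a concrete feel for (4.100), the sketch below evaluates $H_G$ for two warps: the deterministic warp with $T_M(q) = q - 1$ and the uniform binomial with $T_M(q) = -1 + \log_2(1+q)$. The second warp is only an illustrative choice and the function names are hypothetical.

```python
import math

def T_identity(q):
    """Deterministic warp M(t) = t: T_M(q) = q - 1."""
    return q - 1.0

def T_uniform_binomial(q):
    """Uniform binomial warp: T_M(q) = -1 + log2(1 + q), q > -1."""
    return -1.0 + math.log2(1.0 + q)

def hurst_of_warped_fgn(H, T_M):
    """LRD parameter (4.100) of FGN in multifractal time."""
    return (T_M(2.0 * H) + 1.0) / 2.0

if __name__ == "__main__":
    for H in (0.25, 0.5, 0.8):
        print(H,
              hurst_of_warped_fgn(H, T_identity),
              hurst_of_warped_fgn(H, T_uniform_binomial))
```

Without warping the parameter is unchanged; with the binomial warp $H_G$ is pulled towards 1/2 from either side and equals 1/2 exactly when $H = 1/2$.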

Let us discuss some special cases. For example, in a continuous, increasing warp time $M$, we have always $T_M(0) = -1$ and $T_M(1) = 0$. Exploiting the concave shape of $T_M$ we find that $H < H_G < 1/2$ for $0 < H < 1/2$, and $1/2 < H_G < H$ for $1/2 < H < 1$. Thus, multifractal warping cannot create LRD and it seems to weaken the dependence as measured through second order statistics.

Especially in the case of $H = 1/2$ (“white noise in multifractal time”) $G(k)$ becomes uncorrelated. This follows from (4.100). Notably, this is a different statement from the observation that the $G(k)$ are independent when conditioned on $M$ (see section 4.8.1). As a particular consequence, wavelet coefficients will decorrelate fast for the entire process $G$, not only when conditioning on $M$ (see Figure 4.5(d)). This is favorable for estimation purposes as it reduces the error variance.

Of greater importance, however, is the warning that the vanishing correlations should not lead us to assume the independence of $G(k)$. After all, $G$ becomes Gaussian only when conditioning on knowing $M$. A strong, higher order dependence in $G$ is hidden in the dependence of the increments of $M$ which determine the variance of $G(k)$ as in (4.93). Indeed, Figure 4.5(c) shows clear phases of monotonicity of $\mathcal{B}$ indicating positive dependence in its increments $G$, despite vanishing correlations. Mandelbrot calls this the “blind spot of spectral analysis”.

4.9. Bibliography

[ABR 95] ABRY P., GONÇALVES P., FLANDRIN P., “Wavelets, spectrum analysis and 1/f processes”, in ANTONIADIS A., OPPENHEIM G. (Eds.), Lecture Notes in Statistics: Wavelets and Statistics, vol. 103, p. 15–29, 1995.

[ABR 00] ABRY P., FLANDRIN P., TAQQU M., VEITCH D., “Wavelets for the analysis, estimation and synthesis of scaling data”, Self-similar Network Traffic and Performance Evaluation, John Wiley & Sons, 2000.

[ADL 81] ADLER R., The Geometry of Random Fields, John Wiley & Sons, New York, 1981.

[ARB 96] ARBEITER M., PATZSCHKE N., “Self-similar random multifractals”, Math. Nachr., vol. 181, p. 5–42, 1996.

[ARN 98] ARNEODO A., BACRY E., MUZY J., “Random cascades on wavelet dyadic trees”, Journal of Mathematical Physics, vol. 39, no. 8, p. 4142–4164, 1998.



[BAC 93] BACRY E., MUZY J., ARNEODO A., “Singularity spectrum of fractal signals from wavelet analysis: exact results”, J. Stat. Phys., vol. 70, p. 635–674, 1993.

[BAC 03] BACRY E., MUZY J., “Log-infinitely divisible multifractal processes”, Comm. in Math. Phys., vol. 236, p. 449–475, 2003.

[BAR 97] BARRAL J., Continuity, moments of negative order, and multifractal analysis of Mandelbrot’s multiplicative cascades, PhD thesis no. 4704, Paris-Sud University, 1997.

[BAR 02] BARRAL J., MANDELBROT B., “Multiplicative products of cylindrical pulses”, Probability Theory and Related Fields, vol. 124, p. 409–430, 2002.

[BAR 03] BARRAL J., “Poissonian products of random weights: Uniform convergence and related measures”, Rev. Mat. Iberoamericano, vol. 19, p. 1–44, 2003.

[BAR 04] BARRAL J., MANDELBROT B., “Random multiplicative multifractal measures, Part II”, Proc. Symp. Pures Math., AMS, Providence, RI, vol. 72, no. 2, p. 17–52, 2004.

[BEN 87] BEN NASR F., “Mandelbrot random measures associated with substitution”, C. R. Acad. Sc. Paris, vol. 304, no. 10, p. 255–258, 1987.

[BRO 92] BROWN G., MICHON G., PEYRIERE J., “On the multifractal analysis of measures”, J. Stat. Phys., vol. 66, p. 775–790, 1992.

[CAW 92] CAWLEY R., MAULDIN R.D., “Multifractal decompositions of Moran fractals”, Advances Math., vol. 92, p. 196–236, 1992.

[CHA 02] CHAINAIS P., RIEDI R., ABRY P., “Compound Poisson cascades”, Proc. Colloque “Autosimilarité et Applications” Clermont-Ferrand, France, May 2002, 2002.

[CHA 05] CHAINAIS P., RIEDI R., ABRY P., “On non-scale invariant infinitely divisible cascades”, IEEE Trans. Information Theory, vol. 51, no. 3, p. 1063–1083, 2005.

[COX 84] COX D., “Long-range dependence: a review”, Statistics: An Appraisal, p. 55–74, 1984.

[CUT 86] CUTLER C., “The Hausdorff dimension distribution of finite measures in Euclidean space”, Can. J. Math., vol. 38, p. 1459–1484, 1986.

[DAU 92] DAUBECHIES I., Ten Lectures on Wavelets, SIAM, New York, 1992.

[ELL 84] ELLIS R., “Large deviations for a general class of random vectors”, Ann. Prob., vol. 12, p. 1–12, 1984.

[EVE 95] EVERTSZ C.J.G., “Fractal geometry of financial time series”, Fractals, vol. 3, p. 609–616, 1995.

[FAL 94] FALCONER K.J., “The multifractal spectrum of statistically self-similar measures”, J. Theor. Prob., vol. 7, p. 681–702, 1994.

[FEL 98] FELDMANN A., GILBERT A.C., WILLINGER W., “Data networks as cascades: Investigating the multifractal nature of Internet WAN traffic”, Proc. ACM/Sigcomm 98, vol. 28, p. 42–55, 1998.

[FRI 85] FRISCH U., PARISI G., “Fully developed turbulence and intermittency”, Proc. Int. Summer School on Turbulence and Predictability in Geophysical Fluid Dynamics and Climate Dynamics, p. 84–88, 1985.



[GON 98] GONÇALVES P., RIEDI R., BARANIUK R., “Simple statistical analysis of wavelet-based multifractal spectrum estimation”, Proc. 32nd Asilomar Conf. on Signals, Systems and Computers, Pacific Grove, CA, Nov. 1998.

[GON 99] GONÇALVES P., RIEDI R., “Wavelet analysis of fractional Brownian motion in multifractal time”, Proceedings of the 17th Colloquium GRETSI, Vannes, France, September 1999.

[GON 05] GONÇALVES P., RIEDI R., “Diverging moments and parameter estimation”, J. Amer. Stat. Assoc., vol. 100, no. 472, p. 1382–1393, December 2005.

[GRA 83] GRASSBERGER P., PROCACCIA I., “Characterization of strange attractors”, Phys. Rev. Lett., vol. 50, p. 346–349, 1983.

[HAL 86] HALSEY T., JENSEN M., KADANOFF L., PROCACCIA I., SHRAIMAN B., “Fractal measures and their singularities: the characterization of strange sets”, Phys. Rev. A, vol. 33, p. 1141–1151, 1986.

[HEN 83] HENTSCHEL H., PROCACCIA I., “The infinite number of generalized dimensions of fractals and strange attractors”, Physica D, vol. 8, p. 435–444, 1983.

[HOL 92] HOLLEY R., WAYMIRE E., “Multifractal dimensions and scaling exponents for strongly bounded random cascades”, Ann. Appl. Prob., vol. 2, p. 819–845, 1992.

[JAF 91] JAFFARD S., “Pointwise smoothness, two-microlocalization and wavelet coefficients”, Publicacions Mathematiques, vol. 35, p. 155–168, 1991.

[JAF 95] JAFFARD S., “Local behavior of Riemann’s function”, Contemporary Mathematics, vol. 189, p. 287–307, 1995.

[JAF 97] JAFFARD S., “Multifractal formalism for functions, Part 1: Results valid for all functions”, SIAM J. of Math. Anal., vol. 28, p. 944–970, 1997.

[JAF 99] JAFFARD S., “The multifractal nature of Lévy processes”, Prob. Th. Rel. Fields, vol. 114, p. 207–227, 1999.

[KAH 76] KAHANE J.-P., PEYRIÈRE J., “Sur Certaines Martingales de Benoit Mandelbrot”, Adv. Math., vol. 22, p. 131–145, 1976.

[LV 98] LÉVY VÉHEL J., VOJAK R., “Multifractal analysis of Choquet capacities: preliminary results”, Adv. Appl. Math., vol. 20, p. 1–34, 1998.

[LEL 94] LELAND W., TAQQU M., WILLINGER W., WILSON D., “On the self-similar nature of Ethernet traffic (extended version)”, IEEE/ACM Trans. Networking, p. 1–15, 1994.

[MAN 68] MANDELBROT B.B., NESS J.W.V., “Fractional Brownian motion, fractional noises and applications”, SIAM Reviews, vol. 10, p. 422–437, 1968.

[MAN 74] MANDELBROT B.B., “Intermittent turbulence in self similar cascades: divergence of high moments and dimension of the carrier”, J. Fluid. Mech., vol. 62, p. 331, 1974.

[MAN 90a] MANDELBROT B.B., “Limit lognormal multifractal measures”, Physica A, vol. 163, p. 306–315, 1990.

[MAN 90b] MANDELBROT B.B., “Negative fractal dimensions and multifractals”, Physica A, vol. 163, p. 306–315, 1990.



[MAN 97] MANDELBROT B.B., Fractals and Scaling in Finance, Springer, New York, 1997.

[MAN 99] MANDELBROT B.B., “A multifractal walk down Wall Street”, Scientific American, vol. 280, no. 2, p. 70–73, February 1999.

[MAN 02] MANNERSALO P., NORROS I., RIEDI R., “Multifractal products of stochastic processes: construction and some basic properties”, Advances in Applied Probability, vol. 34, no. 4, p. 888–903, December 2002.

[MUZ 02] MUZY J., BACRY E., “Multifractal stationary random measures and multifractal random walks with log-infinitely divisible scaling laws”, Phys. Rev. E, vol. 66, 2002.

[NOR 94] NORROS I., “A storage model with self-similar input”, Queueing Systems, vol. 16, p. 387–396, 1994.

[OLS 94] OLSEN L., “Random geometrically graph directed self-similar multifractals”, Pitman Research Notes Math. Ser., vol. 307, 1994.

[PES 97] PESIN Y., WEISS H., “A multifractal analysis of equilibrium measures for conformal expanding maps and Moran-like geometric constructions”, J. Stat. Phys., vol. 86, p. 233–275, 1997.

[PEY 98] PEYRIÈRE J., An Introduction to Fractal Measures and Dimensions, Paris, 11th Edition, k 159, 1998, ISBN 2-87800-143-5.

[RIB 06] RIBEIRO V., RIEDI R., CROUSE M.S., BARANIUK R.G., “Multiscale queuing analysis of long-range-dependent network traffic”, IEEE Trans. Networking, vol. 14, no. 5, p. 1005–1018, October 2006.

[RIE 88] RIEDI R.H., “Multifractal processes”, in DOUKHAN P., OPPENHEIM G., TAQQU M.S. (Eds.), Long Range Dependence: Theory and Applications, p. 625–715, Birkhäuser 2002, ISBN: 0817641688.

[RIE 95a] RIEDI R.H., “An improved multifractal formalism and self-similar measures”, J. Math. Anal. Appl., vol. 189, p. 462–490, 1995.

[RIE 95b] RIEDI R.H., MANDELBROT B.B., “Multifractal formalism for infinite multinomial measures”, Adv. Appl. Math., vol. 16, p. 132–150, 1995.

[RIE 97a] RIEDI R.H., LÉVY VÉHEL J., “Multifractal properties of TCP traffic: a numerical study”, Technical Report No 3129, INRIA Rocquencourt, France, February, 1997, see also: LÉVY VÉHEL J., RIEDI R.H., “Fractional Brownian motion and data traffic modeling”, in Fractals in Engineering, p. 185–202, Springer, 1997.

[RIE 97b] RIEDI R.H., SCHEURING I., “Conditional and relative multifractal spectra”, Fractals. An Interdisciplinary Journal, vol. 5, no. 1, p. 153–168, 1997.

[RIE 98] RIEDI R.H., MANDELBROT B.B., “Exceptions to the multifractal formalism for discontinuous measures”, Math. Proc. Cambr. Phil. Soc., vol. 123, p. 133–157, 1998.

[RIE 99] RIEDI R.H., CROUSE M.S., RIBEIRO V., BARANIUK R.G., “A multifractal wavelet model with application to TCP network traffic”, IEEE Trans. Info. Theory, Special issue on multiscale statistical signal analysis and its applications, vol. 45, p. 992–1018, April 1999.



[RIE 00] RIEDI R.H., WILLINGER W., “Toward an improved understanding of network traffic dynamics”, in PARK K., WILLINGER W. (Eds.), Self-similar Network Traffic and Performance Evaluation, p. 507–530, Wiley, 2000.
[RIE 04] RIEDI R.H., GONÇALVES P., Diverging moments, characteristic regularity and wavelets, Rice University, Dept. of Statistics, Technical Report, vol. TR2004-04, August 2004.
[RIE 07a] RIEDI R.H., GERSHMAN D., “Infinitely divisible shot-noise: modeling fluctuations in networking and finance”, Proceedings ICNF 07, Tokyo, Japan, September 2007.
[RIE 07b] RIEDI R.H., GERSHMAN D., Infinitely divisible shot-noise, Report, Dept. of Statistics, Rice University, TR2007-07, August 2007.
[TEL 88] TEL T., “Fractals, multifractals and thermodynamics”, Z. Naturforsch. A, vol. 43, p. 1154–1174, 1988.
[TRI 82] TRICOT C., “Two definitions of fractal dimension”, Math. Proc. Cambr. Phil. Soc., vol. 91, p. 57–74, 1982.
[VET 95] VETTERLI M., KOVAČEVIĆ J., Wavelets and Subband Coding, Prentice-Hall, Englewood Cliffs, NJ, 1995.


Chapter 5

Self-similar Processes

Chapter written by Albert BENASSI and Jacques ISTAS.

5.1. Introduction

5.1.1. Motivations

Invariance properties constitute the basis of major laws in physics. For example, conservation of energy results from the invariance of these laws under time translations. Mandelbrot was the first to relate scale invariance to complex objects, and the outcome was coined “fractals”. Using the concept of scale invariance, different notions of fractal dimension can be discussed.

A particular class of complex objects presenting scale invariance is that of random media, on which we mainly focus here. Let us begin with the example of percolation (see, for example, [GRI 89]). On a regular network, some connections are randomly and abruptly removed. The resulting network is itself random and contains “cracks”, “bottlenecks”, “culs-de-sac”, etc. However, a regularity of a statistical nature is often seen. For example, let us think of a network of spins on $\mathbb{Z}^2$ at critical temperature (see [GUY 94]): an “island” of + signs will be found within a “lake” of − signs, which is itself an island, etc. At each scale, we statistically see “the same thing”.

Over this mathematical medium, physicists imagine the circulation of a fluid (or of particles) and are hence interested, for example, in the position $X_n$ of a particle after $n$ time steps or in statistical characteristics such as its average position $E X_n$. By symmetry, this average position is often zero. Therefore, we will study the corresponding second moment $E X_n^2$. Let us consider a case where, for large $n$:

$$E X_n^2 \sim \sigma^2 n^{2H}$$

with $\sigma^2$ the variance. When $H = \frac{1}{2}$, the random walk $X_n$ is of Brownian nature. It is said to be anomalous, and overdiffusive (respectively underdiffusive), when $H > \frac{1}{2}$ (respectively $H < \frac{1}{2}$).

Let us now consider the case where the dilated random walk $\left(\frac{X(\lambda n)}{\lambda^H}\right)$, with $n \ge 0$, is statistically indistinguishable from the initial walk $(X_n)$, $n \ge 0$:

$$\left(\frac{X(\lambda n)}{\lambda^H}\right)_{n \ge 0} \stackrel{\mathcal{L}}{=} (X_n)_{n \ge 0} \qquad (5.1)$$

The walk is then said to be self-similar when the equality in law (5.1) is valid for all $\lambda > 0$. A comprehensive survey of random walks on fractal media, oriented toward physicists, can be found in [HAV 87]. A more mathematically grounded framework can be found in [BAR 95].

Apart from the framework of random walks, the reasons for which a physical quantity possesses a power law invariance are generally very tricky to discover. Indications of a physical nature can be found in [HER 90], particularly in Duxburg’s contribution, which elaborates a scale renormalization theory for crack dynamics. In [DUB 97], a number of contributions in various fields such as financial markets, avalanches, metallurgy, etc. provide illustrations of scale invariance. In particular, [DUR 97] proposes an analysis of invariance phenomena in avalanches, which is both experimental and theoretical.

One major source of inspiration for the definition and study of the property of scale invariance is hydrodynamic turbulence and, more precisely, Kolmogorov’s work (see, for example, [FRI 97]) which, on the basis of Richardson’s work on energy cascades, established the famous $-\frac{5}{3}$ law in 1941, based on the modeling of energy transfers in turbulent flows. This theory provides a powerful means to define stochastic self-similar processes and to study their properties. The reader is referred to [FRI 97] and references therein.
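The scaling law $E X_n^2 \sim \sigma^2 n^{2H}$ introduced above for random walks can be illustrated numerically. The following is only a minimal sketch, assuming the simplest case of a walk with independent ±1 steps (for which $H = \frac{1}{2}$ is expected); the function names, the two horizons and the number of simulated walks are illustrative choices, not taken from the text.

```python
import math
import random


def mean_square_displacement(n_steps, n_walks, rng):
    """Monte Carlo estimate of E[X_n^2] for a simple +/-1 random walk."""
    total = 0.0
    for _ in range(n_walks):
        position = 0
        for _ in range(n_steps):
            position += rng.choice((-1, 1))
        total += position * position
    return total / n_walks


def estimate_exponent(n_small, n_large, n_walks, rng):
    """Estimate H from E[X_n^2] ~ sigma^2 n^(2H) observed at two horizons."""
    m_small = mean_square_displacement(n_small, n_walks, rng)
    m_large = mean_square_displacement(n_large, n_walks, rng)
    return 0.5 * math.log(m_large / m_small) / math.log(n_large / n_small)


rng = random.Random(0)  # fixed seed so that the run is reproducible
H_hat = estimate_exponent(100, 400, 2000, rng)
print(f"estimated H = {H_hat:.3f}")  # expected to be close to 1/2
```

An anomalous walk would show up in the same experiment as an estimated exponent significantly different from $\frac{1}{2}$.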
Scale invariance, or self-similarity, sometimes leads to a correlation property referred to as long-range dependence. Generally, for processes satisfying (5.1), the correlation of the increments decreases as a power law, a slow decline that indicates long-term persistence. Mandelbrot and van Ness [MAN 68] popularized fractional Brownian motion, historically introduced in [KOL 40], precisely to model long-range correlation. These processes have since enjoyed extraordinary success, and numerous extensions provide a wealth of generally identifiable models.

Let us briefly describe the article [WIL 98], in which the authors give a “physical” theory of fractional Brownian motion. A typical machine is either active or inactive; the durations of activity and inactivity are independent. An infinity of machines is then considered. At time $tT$, we consider all the active machines or, more exactly, the fluctuations around the average number of active machines at time $tT$. Then, we renormalize in $T$ to obtain the law of this phenomenon. The set of all durations of active machines at a given time is distributed in a rather similar way to that of the balls of the non-renormalized mass distribution model which we will study later. The reader is also directed to Chapter 12.

The goal of this chapter is to present a set of stochastic processes which have partial self-similarity and stationarity properties. Unfortunately, we cannot claim to present an exhaustive study; moreover, the available space prohibits it. We had to make choices. We preferred to challenge the reader by asking questions whose answers appeared surprising to us. The intention is to show that the concept of scale invariance remains, to a great extent, misunderstood. We thus hope that the preceding discussion, centered on physics, together with what we present below, opens paths which other researchers will perhaps follow.

Trees will be used here as the leading path to scale invariance. It seemed to us that such a simple geometric structure, with such great flexibility, is well-adapted to the study of self-similarity. In order to become familiar with trees and with invariance under dilation and translation, we begin our presentation with a study of purely geometric scaling, where trees and spaces are mixed. From this, we present random or non-random fractals with scale invariance. This leads us to the model of mass distribution, which provides a convenient means to generate a wealth of processes, stochastic or not, with scale and translation invariance.
It is remarkable that, through a suitable wavelet decomposition, all the self-similar stochastic processes with stationary increments relate to a “layer-type” model, except perhaps for Takenaka’s process. These models of scale invariance enable us to question the difference between two concepts of equal importance: long-range correlation and sample path regularity. From examples, we show that these two concepts are independent.

This chapter consists of four sections. The first is mostly an introduction. The second clarifies the Gaussian case, a quasi-solved problem. The third section turns to some non-Gaussian cases, mostly that of stable processes. The last section studies correlation and regularity from defective examples. Generally, certain technical difficulties, sometimes even major ones, are overlooked. Consequently, the results may give an impression of a lack of rigor, and returning to the original works is a necessity.



5.1.2. Scalings

5.1.2.1. Trees

To understand the geometric aspects of scaling, we will mainly use trees. Let us start by studying the interval $[0, 1[$. Let $q \ge 2$ be an integer and $A_q = \{0, 1, \dots, q-1\}$. Let us denote by $T_q$ (respectively $\widetilde{T}_q$) the set of unilateral (respectively bilateral) sequences $(a_1, a_2, \dots, a_n, \dots)$ (respectively $(\dots, a_{-n}, \dots, a_{-1}, a_0, a_1, \dots, a_n, \dots)$) where the $a_n$ take values in $A_q$. We will denote by $\mathbf{a}$ a sequence $(a_1, a_2, \dots, a_n, \dots)$ and by $\mathbf{a}_n$ the finite sub-sequence $(a_1, a_2, \dots, a_n)$. We can consider the set $T_q$ from various points of view:

– $T_q$ is the set of real numbers of $[0, 1[$ written in base $q$:

$$x = \sum_{k=1}^{+\infty} \frac{a_k(x)}{q^k}$$

Let us recall that this decomposition is unique except for a countable set of real numbers;

– $T_q$ is the $q$-ary tree. Each father has $q$ sons. This tree is provided with a root or ancestor, denoted by $\varnothing$, that has no antecedent and is associated with the empty sequence. The vertices of $T_q$ are the finite sequences $\mathbf{a}_n$, with $n \ge 1$, and the edges are the pairs $(\mathbf{a}_{n-1}, \mathbf{a}_n)$, with $n \ge 2$;

– $T_q$ is the set of $q$-adic cells $\{\Delta^n_k,\ n \ge 0,\ 0 \le k \le q^n - 1\}$, with $\Delta^n_k = [k/q^n, (k+1)/q^n[$. The lexicographic order makes it possible to associate each finite sequence $\mathbf{a}_n$ with exactly one of the $q^n$ cells $\Delta^n_k$. The (lexicographic) order number of this cell will be denoted by $k_n(\mathbf{a}_n)$.
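The three points of view above (base-$q$ expansion, branch of the tree, $q$-adic cell) can be made concrete with a few lines of code. This is a minimal sketch, assuming that cells are numbered from 0 so that the order number $k_n$ is the integer whose base-$q$ digits are $a_1, \dots, a_n$; the function names `qadic_digits`, `cell_index` and `cell` are hypothetical and chosen for illustration only.

```python
from fractions import Fraction


def qadic_digits(x, q, n):
    """First n digits a_1, ..., a_n of x in [0, 1) written in base q."""
    x = Fraction(x)
    digits = []
    for _ in range(n):
        x *= q
        digit = int(x)  # integer part, a value in {0, ..., q - 1}
        digits.append(digit)
        x -= digit
    return digits


def cell_index(digits, q):
    """Order number k_n: the finite sequence read as an integer in base q."""
    k = 0
    for a in digits:
        k = k * q + a
    return k


def cell(digits, q):
    """End points of the q-adic cell [k / q^n, (k + 1) / q^n) coded by the digits."""
    n = len(digits)
    k = cell_index(digits, q)
    return Fraction(k, q ** n), Fraction(k + 1, q ** n)


x = Fraction(5, 8)
a3 = qadic_digits(x, 2, 3)   # branch of the binary tree followed by x
left, right = cell(a3, 2)    # dyadic cell of level 3 containing x
print(a3, cell_index(a3, 2), left, right)
```

Exact rational arithmetic (`Fraction`) is used so that the digits are not affected by floating-point rounding.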

Figure 5.1. Coding from cell [0.5, 0.75]

Figure 5.2. Coding from cell [0.0, 0.5]

5.1.2.2. Coding of R

This section may seem complex, but is actually not difficult: it amounts to extending the previous coding to $\mathbb{R}$. Let $\mathbf{a}$ be an infinite branch of the tree $T_q$. For $n \ge 0$, let us dilate $\mathbb{R}$ by $q^n$, then let us perform a translation by $k_n(\mathbf{a}_n)$ so as to bring the cell $\Delta^n_{k_n(\mathbf{a}_n)}$, multiplied by $q^n$, into coincidence with $[0, 1[$. The tree $T_q(\mathbf{a}_n)$, of root $\mathbf{a}_n$, allows us to code all the $q$-adic cells included in $[-2^n, 2^n]$, with their position in $\mathbb{R}$:

$$\left\{\Delta^m_k,\ -m \le n < +\infty,\ -k_n(\mathbf{a}_n) \le k \le q^{m+n} - k_n(\mathbf{a}_n)\right\} \qquad (5.2)$$

When $n \to +\infty$, the tree $T_q(\mathbf{a}_n)$ keeps extending and tends towards the complete $q$-adic tree $\widetilde{T}_q$, which is provided with a particular bilateral path $\tilde{\mathbf{a}}$ and a root $\tilde{\varnothing}$; this is denoted by $(\widetilde{T}_q, \tilde{\mathbf{a}}, \tilde{\varnothing})$. This triplet leads us to code all the $q$-adic cells of $\mathbb{R}$. Another possibility is to base the analysis on the decomposition of arbitrary real numbers in base $q$, as previously done for the interval $[0, 1[$. This coding can be extended to $\mathbb{R}^d$. The reader is referred to [BEN 00] for more details.

5.1.2.3. Renormalizing Cantor set

Let $E$ be a subset of $[0, 1]$. With the set $E$, we can associate a sub-tree $T_{E,q}$ of $T_q$ in the following way: to any $x \in E$, we associate the branch $\mathbf{a}(x)$.

Let us now assume that $E$ is the triadic Cantor set. The natural choice is $q = 3$. Up to a countable set, $E$ is the set of real numbers of $[0, 1]$ that admit no 1 in their decomposition in base 3. Thus, the set $E$ is simply defined by a condition on the branches of $T_{E,3}$. As previously, by means of dilation and translation, we define, from the Cantor set $E$, a set $\widetilde{E}$ on $\mathbb{R}$: in other words, $\widetilde{E}$ is the set of real numbers that admit no 1 in their decomposition in base 3. This set $\widetilde{E}$ is, of course, invariant under dilation by a factor $3^p$, with $p \in \mathbb{Z}$. However, it is not invariant for other factors, as can be easily verified, for example with 2.
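The invariance of the extended Cantor set under dilation by powers of 3, and its failure for the factor 2, can be checked on an example. The sketch below is only illustrative: it inspects a finite number of base-3 digits of a rational number (enough for the small denominators used here) and ignores the countable set of points having two base-3 expansions; the helper names are hypothetical.

```python
from fractions import Fraction


def base3_digits(x, n_frac):
    """Base-3 digits of x >= 0: integer-part digits, then n_frac fractional digits."""
    x = Fraction(x)
    integer_part = int(x)
    frac = x - integer_part
    int_digits = []
    while integer_part > 0:
        int_digits.append(integer_part % 3)
        integer_part //= 3
    frac_digits = []
    for _ in range(n_frac):
        frac *= 3
        d = int(frac)
        frac_digits.append(d)
        frac -= d
    return int_digits[::-1], frac_digits


def has_no_digit_one(x, n_frac=60):
    """True if no digit 1 appears among the inspected base-3 digits of x."""
    int_digits, frac_digits = base3_digits(x, n_frac)
    return 1 not in int_digits and 1 not in frac_digits


x = Fraction(1, 4)               # 0.020202... in base 3
print(has_no_digit_one(x))       # x has no digit 1
print(has_no_digit_one(3 * x))   # dilation by 3: the digits are only shifted
print(has_no_digit_one(x / 3))   # dilation by 1/3: same remark
print(has_no_digit_one(2 * x))   # dilation by 2: 1/2 = 0.111... in base 3
```

Multiplying by 3 only shifts the base-3 digits, which is why membership is preserved, whereas multiplying by 2 changes the digits themselves.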



5.1.2.4. Random renormalized Cantor set

We build a uniform law on the set of binary sub-trees of the ternary tree $T_3$. Following the intuition behind the construction of the traditional Cantor set, let us define the random compact set $K(T)$ by:

$$K(T) \stackrel{\mathrm{def}}{=} \bigcap_{n \ge 0} \bigcup_{b \in T_n} \Delta(b) \qquad (5.3)$$

where $\Delta(b)$ is the unique triadic cell associated with the branch $b \in T_n$. Let us then denote by $K(T, \mathbf{a})$ the set $K(T)$ renormalized along the branch $\mathbf{a}$, as we did previously. We verify that the law of $K(T, \mathbf{a})$ is equal to the law of $3^p K(T, \mathbf{a})$, with $p \in \mathbb{Z}$. We will say that $K(T, \mathbf{a})$ is semi-self-similar, the prefix semi indicating that the renormalization factors under which the law of $K(T, \mathbf{a})$ remains invariant form a strict subset of $\mathbb{R}$, namely the multiplicative sub-group of the powers of 3.

The law of $K(T, \mathbf{a})$ is invariant under translation by integers. Combining translations by integers and multiplications by powers of 3, we find that, for any number $d$ with a finite expansion in base 3, the law of $K(T, \mathbf{a})$ is equal to the law of $K(T, \mathbf{a}) + d$: there is an invariance under translation. This will be referred to as a quasi-stationarity property, the term stationarity being used only when invariance is achieved for all translation parameters.

5.1.3. Distributions of scale invariant masses

Inspired by the preceding construction, we now propose that of a stationary scale invariant phenomenon. More precisely, we aim at building a random measure $M(dx)$ verifying the following properties of stationarity and semi-self-similarity associated with a multiplicative sub-group $G$ of $\mathbb{R}^+$:

$$M(dx - y) \stackrel{\mathcal{L}}{=} M(dx), \quad \forall y \in \mathbb{R}^d \quad \text{(stationarity)} \qquad (5.4a)$$

$$M(\lambda\, dx) \stackrel{\mathcal{L}}{=} \lambda^H M(dx), \quad \forall \lambda \in G \quad \text{($H$-semi-self-similarity)} \qquad (5.4b)$$

In sections 5.1.3.1 and 5.1.3.2 two brief examples are presented.

5.1.3.1. Distribution of masses associated with Poisson measures

Let $(P_n)_{n \ge 0}$ denote an infinite sequence of independent Poisson measures on $\mathbb{R}^d$, with intensity chosen as the Lebesgue measure and identical parameter. Let $(x^n_i, i \in I_n)$ be a realization of $P_n$ indexed by the set $I_n$. Let us denote by $B$ the ball of $\mathbb{R}^d$ with center 0 and radius 1. Let us define the measure $M_0$ by its density $m(x)$:

$$m(x) \stackrel{\mathrm{def}}{=} \sum_{n \ge 0} 2^{-nH} \sum_{i \in I_n} 1_B(2^n x - x^n_i). \qquad (5.5)$$

Hence, to the point $\frac{x^n_i}{2^n}$ we have allotted a mass proportional to $2^{-n(H+d)}$, the proportionality coefficient being equal to the volume of $B$. The contribution of these masses at scale $n$ is proportional to $2^{-nH}$ per volume unit.

5.1.3.2. Complete coding

If we define a Cantor set on $[0, 1[$ only, the resulting set is not stable under dilation by a factor 3. We must define this set on $\mathbb{R}$ to make it stable under dilation by any power of 3. In the same way, the measure $M_0$ defined above cannot be stable under dilation by a factor 2. However, the approach used for Cantor sets can be adopted. We outline it only briefly. For any $n \ge 0$, we define the measure $M_n$ by $M_n(dx) = 2^{-nH}(M_0(x + 2^{-n}) - M_0(x))(dx)$. As distributions, the sequence $M_n$ converges weakly towards $M$. We then verify that $M$ is semi-self-similar for the multiplicative sub-group of powers of 2. It is also stationary.

Let us note that our construction seems to ascribe a specific role to the number 2. This is not the case and we can replace 2 by any $b > 0$ in (5.5). We then obtain a semi-self-similar measure for the multiplicative sub-group of powers of $b$.

5.1.4. Weierstrass functions

With Weierstrass functions, we have a deterministic mass distribution model with properties analogous to (5.4). If $b > 1$ and $0 < H < 1$, for $x \in \mathbb{R}$, the Weierstrass functions $W_{b,H}$ are defined as (see [WEI 72]):

$$W_{b,H}(x) \stackrel{\mathrm{def}}{=} \sum_{n \in \mathbb{Z}} b^{-nH} \sin(b^n x). \qquad (5.6)$$

We can easily verify the semi-self-similarity property:

$$W_{b,H}(bx) = b^H W_{b,H}(x)$$

We should note that the preceding constructions, intended to extend Cantor sets and renormalized sums of Poisson measures to $\mathbb{R}$, have their counterpart for Weierstrass functions, by writing:

$$W^0_{b,H}(x) = \sum_{n \ge 0} b^{-nH} \sin(b^n x)$$

and noticing that we have $\lim_{p \to +\infty} b^{pH} W^0_{b,H}(b^{-p} x) = W_{b,H}(x)$.
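Both relations above can be checked numerically on truncated sums. The following sketch assumes the illustrative values $b = 2$, $H = 0.5$ and a truncation of the series at $|n| \le 60$; these choices, and the function names, are not part of the original text, and the equalities only hold up to the truncation error.

```python
import math


def weierstrass(x, b=2.0, H=0.5, n_max=60):
    """Truncated two-sided sum over -n_max <= n <= n_max of b^(-nH) sin(b^n x)."""
    return sum(b ** (-n * H) * math.sin(b ** n * x)
               for n in range(-n_max, n_max + 1))


def one_sided(x, b=2.0, H=0.5, n_max=60):
    """Truncated one-sided sum over 0 <= n <= n_max of b^(-nH) sin(b^n x)."""
    return sum(b ** (-n * H) * math.sin(b ** n * x)
               for n in range(0, n_max + 1))


b, H, x = 2.0, 0.5, 0.7

# Semi-self-similarity: W(b x) = b^H W(x), up to the truncation error.
lhs = weierstrass(b * x, b, H)
rhs = b ** H * weierstrass(x, b, H)
print(abs(lhs - rhs))

# Renormalization of the one-sided sum: b^(pH) W0(b^(-p) x) approaches W(x).
p = 30
renormalized = b ** (p * H) * one_sided(b ** (-p) * x, b, H)
print(abs(renormalized - weierstrass(x, b, H)))
```

The second difference decreases as $p$ grows, since renormalizing the one-sided sum simply shifts its index range towards negative scales.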

The question that naturally arises is: are there probabilistic models which are self-similar and stationary? The traditional answer is positive, provided the stationarity condition is replaced by a condition of stationarity of the increments. Therefore, we proceed with the introduction of self-similar stochastic processes whose increments are stationary.



5.1.5. Renormalization of sums of random variables

In this section, we present the results of Lamperti’s article [LAM 62] on self-similar processes obtained as renormalization limits of other stochastic processes. Let us first recall Lamperti’s definition of a “semi-stable”1 stochastic process.

DEFINITION 5.1.– A stochastic process $X(x)$, with $x \in \mathbb{R}$, is called semi-stable if, for any $a > 0$, there is a renormalization function $b(a) > 0$ such that:

$$\big(X(ax)\big)_{x \in \mathbb{R}} \stackrel{\mathcal{L}}{=} \big(b(a) X(x)\big)_{x \in \mathbb{R}}$$

When the function $b(a)$ is of the form $a^H$, the process $X$ is called self-similar. Lamperti’s fundamental result shows that the possible choices for the renormalization function $b(a)$ are actually limited.

THEOREM 5.1 ([LAM 62, Theorem 1, p. 63]).– Any stochastic semi-stable process is self-similar.

Let us note at once that this result is not in contradiction with the existence of locally self-similar2 processes (see [BEN 98]).

Let $X$ and $Y$ be two real stochastic processes indexed by $\mathbb{R}^d$. Hypothesis R states that there exists a function $f : \mathbb{R}^+ \to \mathbb{R}^+$ such that:

$$\lim_{a \to +\infty} \left(\frac{X(ax)}{f(a)}\right)_{x \in \mathbb{R}^d} \stackrel{\mathcal{L}}{=} \big(Y(x)\big)_{x \in \mathbb{R}^d}$$

Moreover, let us recall that a function $L$ is a slowly varying function if, for any $y > 0$, we have $\lim_{x \to +\infty} \frac{L(xy)}{L(x)} = 1$.

THEOREM 5.2 ([LAM 62, Theorem 2, p. 64]).– Let $X$ and $Y$ be two stochastic processes such that there is a function $f$ for which hypothesis R is satisfied. Then, $f$ necessarily has the following structure, with $H > 0$:

$$f(a) = a^H L(a)$$

where $L$ is a slowly varying function.

1. Not to be confused with the definition of a stable process.
2. These processes are presented in Chapter 6.



As illustrations of this theorem, we offer the two traditional examples:

– Brownian motion:

$$X(x) = \sum_{k=1}^{[x]} \xi_k, \qquad f(n) = \sqrt{n}$$

where the $\xi_k$ are independent Bernoulli variables on $\{-1, 1\}$. Brownian motion can be defined as the limit of $\frac{X(nx)}{f(n)}$ when $n \to +\infty$;

– Lévy’s symmetric α-stable motion:

$$X(x) = \sum_{k=1}^{[x]} \xi_k, \qquad f(n) = n^{\frac{1}{\alpha}}$$

where the $\xi_k$ are independent, identically distributed, stable, symmetric random variables (see Chapter 14 in [BRE 68]). Lévy’s symmetric α-stable motion can be defined as the limit of $\frac{X(nx)}{f(n)}$ when $n \to +\infty$.

Thus, stochastic self-similar processes appear as natural limits of renormalization procedures. Theorem 5.8 provides another example, which is neither Gaussian nor stable.

5.1.6. A common structure for a stochastic (semi-)self-similar process

We now wish to propose a unified structure for the known (semi-)self-similar processes. The basic ingredients are as follows:

– a self-similarity parameter $H > 0$;
– a “vaguelette” type basis (see [MEY 90]);
– a sequence of independent and identically distributed random variables.

Let $E_d = \{0, 1\}^d$ be the set of binary sequences of length $d$ and $E'_d = E_d - \{(1, 1, \dots, 1)\}$. Let us denote by $\Lambda_d$ the set $\{(n, k), n \in \mathbb{Z}, k \in \mathbb{Z}^d\}$. Let us write $\psi_\lambda(x) = 2^{\frac{nd}{2}} \psi^u(2^n x - k)$, with $x \in \mathbb{R}^d$ and $\lambda = (n, k, u)$, where $n$ is the scale parameter, $k$ the localization parameter and $u$ the orientation parameter: if $u = 0$, $\psi^u$ is the mother wavelet; if $u = 1$, $\psi^u$ is the father wavelet. This dilation and translation structure in fact uses an underlying structure of binary trees. The function $\psi$ is assumed to decrease rapidly at infinity, and to vanish and be Lipschitz at zero. Let us denote by $\xi_\lambda$, with $\lambda = (n, k, u)$, a sequence of random variables such that, for any $n, n', k, k'$:

$$\xi_{n',k',u} \stackrel{\mathcal{L}}{=} \xi_{n,k,u}$$


Let us then define the process $X$ by:

$$X(x) \stackrel{\mathrm{def}}{=} \sum_{\lambda} 2^{-nH} \psi_\lambda(x) \xi_\lambda \qquad (5.7)$$

The process is semi-self-similar for the multiplicative group of powers of 2. Again, the number 2 does not play a crucial role. Later on, we will see that fractional Brownian motions have this structure. We observe that Weierstrass functions fit into this framework by taking $\psi(x) = \sin(x) 1_{[0,2\pi]}(x)$. In fact:

$$W_{b,H}(x) = \sum_{n \in \mathbb{Z},\, k \in \mathbb{Z}} b^{-nH} \psi(b^n x - 2k\pi)$$

The question of parameter identifiability for a semi-self-similar process is natural. We will be dealing with it in the next section.

5.1.7. Identifying Weierstrass functions

5.1.7.1. Pseudo-correlation

Subject to existence, let us define the pseudo-correlation function of a deterministic function $f$ by (see [BAS 62]):

$$\gamma_{f(\cdot)}(\tau) \stackrel{\mathrm{def}}{=} \lim_{T \to +\infty} \frac{1}{2T} \int_{-T}^{T} f(x) f(x + \tau)\, dx$$

A function is said to be pseudo-random if $\lim_{\tau \to +\infty} \gamma_{f(\cdot)}(\tau) = 0$, and pseudo-stationary when $\gamma_{f(\cdot - r)} = \gamma_{f(\cdot)}$ for any $r$. It can be shown that Weierstrass functions are pseudo-random and pseudo-stationary.

The example of Weierstrass functions shows that a semi-self-similar phenomenon is not solely determined by the self-similarity parameter $H$. Is it possible to identify the parameters that generate these phenomena? We will see later that it generally is. To conclude this introduction, we focus on Weierstrass functions, whose identification requires general tools, although the proof is elementary. To this end, let us introduce the quadratic variations of a function $f$:

$$V_N(f) = \frac{1}{N} \sum_{k=0}^{N-1} \left( f\!\left(\frac{k+1}{N}\right) - f\!\left(\frac{k}{N}\right) \right)^2$$

Let us define:

$$R_N = \frac{1}{2} \log_2 \frac{V_{N/2}}{V_N}$$

By using the pseudo-random character of Weierstrass functions, we can show that $R_N$ measures $H$ when $N \to +\infty$.
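The identification of $H$ through quadratic variations can be tried out on a truncated Weierstrass function. This is a rough sketch under stated assumptions: the quadratic variation is taken as $V_N(f) = \frac{1}{N}\sum_{k=0}^{N-1} (f(\frac{k+1}{N}) - f(\frac{k}{N}))^2$ and the estimator as $R_N = \frac{1}{2}\log_2 (V_{N/2}/V_N)$; the values $b = 2$, $H = 0.5$, $N = 1024$ and the truncation level are illustrative choices only.

```python
import math


def weierstrass(x, b=2.0, H=0.5, n_max=40):
    """Truncated sum over -n_max <= n <= n_max of b^(-nH) sin(b^n x)."""
    return sum(b ** (-n * H) * math.sin(b ** n * x)
               for n in range(-n_max, n_max + 1))


def quadratic_variation(f, N):
    """V_N(f) = (1/N) * sum over k of (f((k+1)/N) - f(k/N))^2 on [0, 1]."""
    values = [f(k / N) for k in range(N + 1)]
    return sum((values[k + 1] - values[k]) ** 2 for k in range(N)) / N


def estimate_H(f, N):
    """R_N = (1/2) * log2(V_{N/2}(f) / V_N(f)), for an even integer N."""
    return 0.5 * math.log2(quadratic_variation(f, N // 2)
                           / quadratic_variation(f, N))


H_true = 0.5
R = estimate_H(lambda x: weierstrass(x, b=2.0, H=H_true), 1024)
print(f"R_N = {R:.3f} for H = {H_true}")
```

The estimator only uses the ratio of two quadratic variations, so any multiplicative constant in front of the function cancels out.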



5.2. The Gaussian case

As always, the Gaussian case is the best understood. The structure of stochastic, Gaussian, self-similar processes, also possessing stationary increments, is well-known. In his article [DOB 79], Dobrushin presents contemporary results, including his own works, in a definitive style and in the framework of generalized stochastic processes. Here, we give a “stochastic processes” version.

5.2.1. Self-similar Gaussian processes with r-stationary increments

To describe Dobrushin’s results, we need some notations.

5.2.1.1. Notations

Let $\mathbb{R}^d$, $d \ge 1$, be the usual Euclidean space, with $x = (x_1, \dots, x_d)$ and $|x|^2 = \sum_{k=1}^{d} x_k^2$. Let $xy = \sum_{k=1}^{d} x_k y_k$ denote the scalar product of vectors $x$ and $y$.

Let $k = (k_1, \dots, k_d) \in \mathbb{N}^d$ be a multi-index of length $|k| = k_1 + \dots + k_d$. We define $D^k = \left(\frac{\partial}{\partial x_1}\right)^{k_1} \circ \dots \circ \left(\frac{\partial}{\partial x_d}\right)^{k_d}$.

Let $f$ be a function from $\mathbb{R}^d$ to $\mathbb{R}$. Let $TF(f)$ denote its Fourier transform and $TF^{-1}(f)$ its inverse Fourier transform. When $f$ is a distribution, the same notations are used for the (inverse) Fourier transform. Let us recall that, for $k \in \mathbb{N}^d$:

$$TF(D^k f)(\xi) = i^{|k|} \xi^k\, TF(f)(\xi)$$

Let us denote by $S(\mathbb{R}^d)$ (or $S$ where there is no ambiguity) the Schwartz space of $C^\infty$ functions which decrease rapidly together with all their derivatives; $S'(\mathbb{R}^d)$ (or $S'$) then denotes the space of tempered distributions. Let $T \in S'$. The translation operator $\tau_k$ is defined by $\langle \tau_k T, \varphi \rangle \stackrel{\mathrm{def}}{=} \langle T, \tau_{-k} \varphi \rangle$ for $\varphi \in S$, with $\tau_{-k}\varphi(x) = \varphi(x + k)$.

Let $f$ be a function from $\mathbb{R}^d$ to $\mathbb{R}$ and $n$ an integer. Let $f^{\otimes n}$ denote the function on $(\mathbb{R}^d)^n$ defined by $f^{\otimes n}(x_1, \dots, x_n) \stackrel{\mathrm{def}}{=} (f(x_1), \dots, f(x_n))$. Finally, for $k \in \mathbb{R}^d$, let us introduce the translation operator $\tau_k f^{\otimes n} = (\tau_k f(x_1), \dots, \tau_k f(x_n))$.

5.2.1.2. Definitions

In this section, the same notations are used for distributions and functions.

DEFINITION 5.2.– Let $(X(x))$, with $x \in \mathbb{R}^d$, be a (possibly) generalized stochastic process:

– $X$ is said to be stationary if, for any integer $n$ and any $h \in \mathbb{R}^d$:

$$\tau_h X^{\otimes n} \stackrel{\mathcal{L}}{=} X^{\otimes n}$$


– let $r$ be a non-zero integer. $X$ is said to possess stationary r-increments if $D^k X$ is stationary for any $k$ such that $|k| = r$;

– let $r$ be a non-zero integer and $H > 0$. $X$ is said to be $(r, H)$ self-similar if there is a polynomial $P$ of degree $r$ such that $P(D)X$ is self-similar with parameter $H$.

NOTE 5.1.– $X$ is said to have stationary increments if it has stationary 1-increments. $X$ is said to be self-similar with parameter $H$ if it is $(1, H)$ self-similar.

5.2.1.3. Characterization

THEOREM 5.3 ([DOB 79, Theorem 3.2, p. 9]).– Let $X$ be a Gaussian $(r, H)$ self-similar process, with $H < r$, stationary r-increments and associated polynomial $P$. Then, there is a real-valued function $S$ on the unit sphere $\Sigma_d$ of $\mathbb{R}^d$ such that, for all functions $\varphi$ of $S$:

$$\langle X, P(D)\varphi \rangle = \int_{\mathbb{R}^d} \frac{P(i\xi)\, TF(\varphi)(\xi)}{|\xi|^{\frac{d}{2}+H}\, S\!\left(\frac{\xi}{|\xi|}\right)}\, TF(W)(d\xi)$$

Let us now introduce the pseudo-differential3 operator $L$, with symbol $\rho(\xi) = |\xi|^{\frac{d}{2}+H} S\!\left(\frac{\xi}{|\xi|}\right)$:

$$Lf(x) = \int_{\mathbb{R}^d} TF(f)(\xi)\, \rho(\xi)\, e^{ix\xi}\, d\xi \qquad (5.8)$$

Then, the following weak stochastic differential equation can be deduced (see [BEN 97]):

$$LX = W^\circ$$

where $W^\circ$ is a Gaussian white noise. Now, let us give some examples.

5.2.2. Elliptic processes

Let $L$ be the pseudo-differential operator defined in (5.8). $L$ is called elliptic if two constants $0 < a \le A < +\infty$ exist such that $a \le S \le A$ on the sphere $\Sigma_d$. By analogy, the corresponding process $X$ will be called elliptic [BEN 97]. Generally, $(r, H)$ self-similar Gaussian processes with stationary r-increments, with $0 < H < 1$, admit the following representation.

3. See [MEY 90] for traditional results on operators.



THEOREM 5.4 ([BEN 97]).– Let $X$ be a Gaussian $(r, H)$ self-similar process, with stationary r-increments, with $0 < H < 1$:

– $X$ admits the following harmonic representation:

$$X(x) = \int_{\mathbb{R}^d} \frac{e^{ix\xi} - \sum_{k=0}^{r-1} \frac{(ix\xi)^k}{k!}}{|\xi|^{\frac{d}{2}+H}\, S\!\left(\frac{\xi}{|\xi|}\right)}\, TF(W)(d\xi)$$

– $X$ is the unique solution of the following stochastic elliptic differential equation:

$$LX = W^\circ, \qquad Q(D)X(0) = 0 \ \text{ for any } Q \text{ such that } d^\circ Q < r$$

As a particular case, we can mention the harmonic representation [MAN 68] of fractional Brownian motion of parameter $H$, obtained with $r = 1$ and $S \equiv 1$:

$$B_H(t) = \int_{\mathbb{R}} \frac{e^{it\xi} - 1}{|\xi|^{\frac{1}{2}+H}}\, TF(W)(d\xi)$$

5.2.3. Hyperbolic processes

DEFINITION 5.3.– The operator $L$ defined in (5.8) is called hyperbolic if its symbol is of the form $\rho(\xi) = \prod_{i=1}^{d} |\xi_i|^{H_i + \frac{1}{2}}$.

THEOREM 5.5 (FRACTIONAL BROWNIAN SHEET [LEG 99]).– The fractional Brownian sheet, defined as:

$$X(x) = \int_{\mathbb{R}^d} \prod_{i=1}^{d} \frac{e^{i x_i \xi_i} - 1}{|\xi_i|^{H_i + \frac{1}{2}}}\, TF(W)(d\xi)$$

satisfies the following equality:

$$\big(X(\lambda_1 x_1, \dots, \lambda_d x_d)\big)_{x \in \mathbb{R}^d} \stackrel{\mathcal{L}}{=} \left(\prod_{i=1}^{d} \lambda_i^{H_i}\, X(x_1, \dots, x_d)\right)_{x \in \mathbb{R}^d}$$

COROLLARY 5.1.– When $\lambda_1 = \dots = \lambda_d$ and $H = H_1 + \dots + H_d$, the hyperbolic process $X$ obtained is self-similar with parameter $H$, with $H$ between 0 and $d$. In contrast with the elliptic case, $H > 1$ is hence allowed, though the fractional Brownian sheet is non-differentiable.
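The scaling equality of Theorem 5.5 can be recovered, at least formally, by a change of variables in each coordinate. The derivation below is only a sketch: it assumes $\lambda_j > 0$ and uses the scaling property of Gaussian white noise, $TF(W)(d\eta_1/\lambda_1, \dots, d\eta_d/\lambda_d) \stackrel{\mathcal{L}}{=} \prod_j \lambda_j^{-1/2}\, TF(W)(d\eta)$, which is not spelled out in the text. The product index is written $j$ to avoid confusion with the imaginary unit $i$.

```latex
\begin{align*}
X(\lambda_1 x_1,\dots,\lambda_d x_d)
  &= \int_{\mathbb{R}^d} \prod_{j=1}^{d}
     \frac{e^{i \lambda_j x_j \xi_j} - 1}{|\xi_j|^{H_j+\frac12}}\; TF(W)(d\xi) \\
  &= \int_{\mathbb{R}^d} \prod_{j=1}^{d} \lambda_j^{H_j+\frac12}\,
     \frac{e^{i x_j \eta_j} - 1}{|\eta_j|^{H_j+\frac12}}\;
     TF(W)\!\left(\frac{d\eta_1}{\lambda_1},\dots,\frac{d\eta_d}{\lambda_d}\right)
     \qquad (\eta_j = \lambda_j \xi_j) \\
  &\stackrel{\mathcal{L}}{=} \prod_{j=1}^{d} \lambda_j^{H_j}
     \int_{\mathbb{R}^d} \prod_{j=1}^{d}
     \frac{e^{i x_j \eta_j} - 1}{|\eta_j|^{H_j+\frac12}}\; TF(W)(d\eta)
   = \prod_{j=1}^{d} \lambda_j^{H_j}\; X(x_1,\dots,x_d).
\end{align*}
```

Taking $\lambda_1 = \dots = \lambda_d = \lambda$ gives the factor $\lambda^{H_1 + \dots + H_d}$, which is the self-similarity parameter stated in Corollary 5.1.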



5.2.4. Parabolic processes

Let $A$ be a pseudo-differential operator of dimension $n - 1$, i.e., its symbol is a function from $\mathbb{R}^{n-1}$ to $\mathbb{R}$. Let $L$ be the pseudo-differential operator of dimension $n$, whose symbol is a function from $\mathbb{R} \times \mathbb{R}^{n-1}$ to $\mathbb{R}$, defined by $L = \partial_t - A$. Let us consider the stochastic differential equation $LX = W^\circ$. By analogy with the classification of operators, $X$ is said to be parabolic. The most prominent example is the Ornstein-Uhlenbeck process:

$$OU(t, x) = \int_0^t \int_{\mathbb{R}^d} e^{-(t-s)A}\, W(ds, dy)$$

The operator $\partial_t$ is renormalized with a factor $\frac{1}{2}$; the operator $A$ can be renormalized with an arbitrary factor. Generally, a parabolic process is not self-similar.

5.2.5. Wavelet decomposition

In this section, we expand Gaussian self-similar processes on a wavelet basis, which hence also constitutes a basis for the reproducing kernel Hilbert space of the process4.

5.2.5.1. Gaussian elliptic processes

Let $\psi^u$, with $u \in E_d$, be a Lemarié-Meyer generating system [MEY 90]. Let $\psi_\lambda$, with $\lambda \in \Lambda_d$, be the generated orthonormal basis of wavelets. Let us assume that $X$ satisfies the hypotheses of Theorem 5.4, with the same notations. Let us define $\varphi^u$, with $u \in E_d$, by the harmonic representation:

$$\varphi^u(x) = \int_{\mathbb{R}^d} \frac{e^{ix\xi} - \sum_{k=0}^{r-1} \frac{(ix\xi)^k}{k!}}{|\xi|^{\frac{d}{2}+H}\, S\!\left(\frac{\xi}{|\xi|}\right)}\, TF(\psi^u)(\xi)\, d\xi$$

Let us then define the associated family of wavelets $\varphi_\lambda$, with $\lambda \in \Lambda_d$.

THEOREM 5.6 ([BEN 97]).– There is a sequence of normalized Gaussian random 2D variables $\eta_\lambda$ such that:

$$X(x) = \sum_{\lambda \in \Lambda_d} 2^{-j(r-1+H)}\, \eta_\lambda\, \varphi_\lambda(x)$$

If $X$ is self-similar in the usual sense, then this decomposition is a renormalized distribution of mass, as defined in section 5.1.3.

4. See [NEV 68] for reproducing kernel Hilbert spaces.



5.2.5.2. Gaussian hyperbolic process

THEOREM 5.7.– With the same notations as those of Theorem 5.6, we obtain the decomposition:

$$X(x) = \sum_{(\lambda_1, \dots, \lambda_d) \in (\Lambda_1)^d} 2^{-(n_1 H_1 + \dots + n_d H_d)}\, \varphi_{\lambda_1}(x_1) \times \dots \times \varphi_{\lambda_d}(x_d)\, \eta_{\lambda_1, \dots, \lambda_d}$$

where the sequence $\eta_{\lambda_1, \dots, \lambda_d}$ consists of normalized Gaussian random 2D variables. Hyperbolic processes enable us to model multiscale random structures with preferred directions.

5.2.6. Renormalization of sums of correlated random variable

Let $B_H$ denote fractional Brownian motion of parameter $H$. Let us consider the increments of size $h > 0$: $X_k = h^{-H}(B_H((k+1)h) - B_H(kh))$. The following properties are well-known:

– $X_0$ is a normalized Gaussian random variable;
– the sequence $(X_k)$ is stationary;
– a law of large numbers can be written as:

$$\lim_{n \to +\infty} \frac{1}{n+1} \sum_{k=0}^{n} X_k^2 = 1 \quad \text{(a.s.)} \qquad (5.9)$$

Difficulties only start when trying to estimate the speed of convergence in (5.9). Before presenting the results on these questions, let us give an outline of the Rosenblatt process and law. The Rosenblatt law is defined by its characteristic function, which can be found in [TAQ 75, p. 299]. We can define a stochastic process $(Z_D(t))_{t>0}$, called the Rosenblatt process, whose law at each time is a Rosenblatt law of parameter $D$. The characteristic function of $(Z_D(t_1), \dots, Z_D(t_k))$, with $k \ge 1$, can be found in [TAQ 75].

5.2.7. Convergence towards fractional Brownian motion

5.2.7.1. Quadratic variations

Many statistical estimation procedures are based on quadratic variations. It is hence useful to recall the behavior of the quadratic variations of fractional Brownian motion.

THEOREM 5.8.– Let:

$$X_{k,N} = N^H \left( B_H\!\left(\frac{k+1}{N}\right) - B_H\!\left(\frac{k}{N}\right) \right)$$

be the renormalized increments of a fractional Brownian motion of order $H$. The following results are proven in [GUY 89, TAQ 75]:

– when $0 < H < \frac{3}{4}$:

$$\sqrt{N} \sum_{k=0}^{[Nt]} (X_{k,N}^2 - 1)$$

converges in law, when $N \to +\infty$, towards $\sigma_H B_H(t)$, where $B_H$ is fractional Brownian motion of order $H$;

– for $\frac{3}{4} < H < 1$:

$$N^{2(1-H)} \sum_{k=0}^{[Nt]} (X_{k,N}^2 - 1)$$

converges in law, when $N \to +\infty$, towards a Rosenblatt process $Z_{(1-H)}(t)$.

Theorem 5.8 admits a generalization to functions $G$ of $L^2(\mathbb{R}, \mu)$, where $\mu$ is the Gaussian density $(2\pi)^{-1/2} \exp\!\left(-\frac{x^2}{2}\right)$. It is known (see, for example, [NEV 68]) that Hermite’s polynomials $H_k$, with $k \ge 0$, form an orthonormal basis for $L^2(\mathbb{R}, \mu)$. Let us expand $G$ on this basis: $G(x) = \sum_{k \ge 0} g_k H_k(x)$. If we have $g_0 = g_1 = 0$ and $g_2 \ne 0$, Theorem 5.8 remains valid, up to changes in the variances of the limit processes (see [GUY 89, TAQ 75]). Theorem 5.8 thus provides examples of non-Gaussian self-similar processes with stationary increments. Other examples are discussed later.

5.2.7.2. Acceleration of convergence

Instead of the standard increments $X_{k,N}$, let us now consider the second order increments:

$$Y_{k,N} = N^H \sigma \left( B_H\!\left(\frac{k+1}{N}\right) - 2 B_H\!\left(\frac{k}{N}\right) + B_H\!\left(\frac{k-1}{N}\right) \right)$$

where $\sigma$ is chosen such that $Y_{k,N}$ is of variance 1. Then, the following result can be obtained (cf. [IST 97]).

THEOREM 5.9.– The quantity:

$$\sqrt{N} \sum_{k=0}^{[Nt]} (Y_{k,N}^2 - 1)$$

converges in law, when $N \to +\infty$, towards a fractional Brownian motion $B_H$ of order $H$.

The frontier $H = \frac{3}{4}$ disappears, thanks to the introduction of the generalized variations $Y_{k,N}$.
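One way to see why second order increments remove the frontier $H = \frac{3}{4}$ is to compare the decay of correlations. The sketch below relies on two facts that are not stated in the text above and are assumed here: the standard covariance of fractional Brownian motion, $\frac{1}{2}(|s|^{2H} + |t|^{2H} - |t-s|^{2H})$, and the classical criterion that Gaussian fluctuations of such quadratic sums go together with square-summable correlations. The function names and the value $H = 0.9$ are illustrative.

```python
def first_order_correlation(k, H):
    """Correlation at lag k of the unit-step increments B(t+1) - B(t)."""
    k = abs(k)
    return 0.5 * ((k + 1) ** (2 * H) - 2 * k ** (2 * H)
                  + abs(k - 1) ** (2 * H))


def second_order_correlation(k, H):
    """Correlation at lag k of the increments B(t+1) - 2B(t) + B(t-1)."""
    def cov(m):
        m = abs(m)
        return 0.5 * (-(m + 2) ** (2 * H) + 4 * (m + 1) ** (2 * H)
                      - 6 * m ** (2 * H)
                      + 4 * abs(m - 1) ** (2 * H) - abs(m - 2) ** (2 * H))
    return cov(k) / cov(0)


def sum_of_squares(correlation, H, k_max):
    """Partial sum of squared correlations over lags 1, ..., k_max."""
    return sum(correlation(k, H) ** 2 for k in range(1, k_max + 1))


H = 0.9  # above the frontier 3/4
s1_a = sum_of_squares(first_order_correlation, H, 1000)
s1_b = sum_of_squares(first_order_correlation, H, 2000)
s2_a = sum_of_squares(second_order_correlation, H, 1000)
s2_b = sum_of_squares(second_order_correlation, H, 2000)
print("first order :", s1_a, s1_b)   # partial sums keep growing
print("second order:", s2_a, s2_b)   # partial sums have stabilized
```

First order correlations decay like $k^{2H-2}$, so their squares are summable only when $H < \frac{3}{4}$, whereas second order correlations decay like $k^{2H-4}$ and their squares are summable for every $0 < H < 1$.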



5.2.7.3. Self-similarity and regularity of trajectories

It is generally admitted in the literature, following the works by Mandelbrot, that self-similarity necessarily goes with sample path irregularity, for example, of Hölderian type. From a simple example, we now show that such an association does not hold in general. To construct such a process, let us start with an infinitely differentiable function $\varphi$, with compact support and such that there exists a neighborhood of 0 which is not included in the support of $\varphi$. Let us then define a stochastic process as:

$$X(t) = \int_{\mathbb{R}} \frac{\varphi(t\xi)}{|\xi|^{\frac{1}{2}+H}}\, TF(W)(d\xi) \qquad (5.10)$$

for which the following original result can be obtained.

THEOREM 5.10.– $X$, as defined in (5.10), is a zero-mean Gaussian process that possesses the following properties:

– $X$ is self-similar with parameter $H \in \mathbb{R}$;
– the trajectories of $X$, for $t \ne 0$, are infinitely differentiable.

Let us mention that $X$ does not have stationary increments. It is precisely this loss of increment stationarity that allows the regularity of trajectories.

5.3. Non-Gaussian case

5.3.1. Introduction

In this section, we study processes whose laws are either subordinate to the law of the Brownian measure (cf. Dobrushin [DOB 79]), or symmetric α-stable laws (hereafter SαS). Samorodnitsky and Taqqu’s book [SAM 94] is one of the most prominent reference tools for stable processes and is largely used here. Two classes of processes are studied: processes represented by moving averages and those defined by a harmonizable representation. These two classes are not equivalent (cf. Theorem 5.11 below). Censov’s process and a variant, i.e. Takenaka processes, are also analyzed. Let us mention that Takenaka processes do not belong to either of the two aforementioned classes. However, all these processes have a point in common: they are elliptic, i.e., they are the solutions of a stochastic elliptic equation of non-integer degree.
Ellipticity makes it possible to expand such processes on an appropriate wavelet basis, as in [BEN 97]. In the Gaussian case, this decomposition is of Karhunen-Loeve type. The question, which is still unanswered, is therefore: are the SαS self-similar processes with stationary increments elliptic? Finally, ellipticity enables us to construct a wide variety of self-similar processes with stationary increments subordinated to the Brownian measure.

5.3.2. Symmetric α-stable processes

5.3.2.1. Stochastic measure

Let $d$ be a non-zero integer and $\mu$ a measure on $\mathbb{R}^d \times E$. A stochastic measure $M_\alpha$ on $\mathbb{R}^d$ is SαS with control measure $\mu$ if, for any $p < \alpha$, there is a constant $c_{p,\alpha}$ such that, for any test function $\varphi$, we obtain (see [SAM 94, section 10.3.3]):

$$E|M_\alpha(\varphi)|^p = c_{p,\alpha} \left( \int_{\mathbb{R}^d} |\varphi(x)|^\alpha\, \mu(dx) \right)^{p/\alpha} \qquad (5.11)$$

With the previous notations, the measure $M_\alpha$ above possesses the following properties:
– stationarity. For any $x \in \mathbb{R}^d$: $\tau_x(M_\alpha) \overset{L}{=} M_\alpha$;
– unitarity. Let $U_x f(y) = e^{ixy} f(y)$. For any $x \in \mathbb{R}^d$: $U_x(M_\alpha) \overset{L}{=} M_\alpha$;
– homogeneity. For any $\lambda > 0$: $r_\lambda M_\alpha \overset{L}{=} \lambda^{d/\alpha} M_\alpha$;
– symmetry: $-M_\alpha \overset{L}{=} M_\alpha$.

NOTE 5.2.– Taken as distributions, let $TF(M_\alpha)$ be the Fourier transform of $M_\alpha$. If $M_\alpha$ is stationary, $TF(M_\alpha)$ is unitary. If we have $\alpha > 1$ and if $M_\alpha$ is $d/\alpha$-homogeneous, then $TF(M_\alpha)$ is $d/\alpha'$-homogeneous, with $\frac{1}{\alpha} + \frac{1}{\alpha'} = 1$.

5.3.2.2. Ellipticity

Let $Q$ and $S$ be two functions from the unit sphere $\Sigma_{d-1}$ of $\mathbb{R}^d$ to $\mathbb{R}^+$. Let $M_\alpha$ be a stochastic measure and $\mu$ be the Lebesgue measure on $\mathbb{R}^d$. Let us consider, when they exist, the following stochastic processes:

$$X(x) = \int_{\mathbb{R}^d} \left[ |x-y|^{H-\frac{d}{\alpha}}\, Q\!\left(\frac{x-y}{|x-y|}\right) - |y|^{H-\frac{d}{\alpha}}\, Q(y) \right] M_\alpha(dy)$$
$$Z(x) = \int_{\mathbb{R}^d} \frac{e^{ix\xi} - 1}{|\xi|^{\frac{d}{\alpha}+H}\, S\!\left(\frac{\xi}{|\xi|}\right)}\, M_\alpha(d\xi) \qquad (5.12)$$

THEOREM 5.11 ([SAM 94], p. 358).– Processes $X$ and $Z$ are defined whenever $0 < H < 1$ and $0 < \alpha < 2$. They then have stationary increments and are self-similar of order $H$. There exists no pair $(Q, S)$ such that $X$ and the real part of $Z$ have proportional laws.
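To make the SαS laws behind $M_\alpha$ more concrete, the following minimal sketch draws standard SαS random variables. It relies on the Chambers-Mallows-Stuck transformation, which is a classical sampling technique and not a construction taken from this chapter; the function names `sas_from_uniform_exp` and `sas_draw` are illustrative assumptions. The symmetry property $-M_\alpha \overset{L}{=} M_\alpha$ appears here as the oddness of the map in its first argument.

```python
import math
import random


def sas_from_uniform_exp(alpha, u, w):
    """Chambers-Mallows-Stuck map (assumed technique, not from the chapter).

    Maps u in (-pi/2, pi/2) and w > 0 to a value that is standard
    symmetric alpha-stable when u is uniform and w is Exp(1).
    """
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    if alpha == 1.0:
        # Cauchy case: the general formula reduces to tan(u).
        return math.tan(u)
    scale = math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
    tail = (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha)
    return scale * tail


def sas_draw(alpha, rng):
    """Draw one standard SaS variable using the generator rng."""
    u = rng.uniform(-math.pi / 2.0, math.pi / 2.0)
    w = rng.expovariate(1.0)
    return sas_from_uniform_exp(alpha, u, w)


if __name__ == "__main__":
    rng = random.Random(2009)
    print([round(sas_draw(1.5, rng), 3) for _ in range(5)])
```

For $\alpha = 2$ the map reduces to $2 \sin(u) \sqrt{w}$, a Gaussian variable of variance 2, which is consistent with the Gaussian case being the endpoint of the SαS family.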


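NOTE 5.3.– When $\alpha = 2$, the processes $X$ and $Z$ can have proportional laws for suitably chosen pairs $(Q, S)$, particularly for $Q = S \equiv 1$.

Let us define the operator $\tau_{d,\alpha}$ by:

$$TF\!\left( |x|^{H-\frac{d}{\alpha}}\, Q\!\left(\frac{x}{|x|}\right) \right)(\xi) = \frac{1}{|\xi|^{\frac{d}{\alpha'}+H}\; \tau_{d,\alpha}S\!\left(\frac{\xi}{|\xi|}\right)}$$

where the Fourier transform is taken in the sense of distributions. Then, the following Plancherel formula holds.

THEOREM 5.12 ([BEN 99]).– Let $0 < \alpha < 1$. There then exist two constants $c$ and $c'$ such that the processes defined in (5.12) admit the following harmonizable representation:

$$X(x) = c \int_{\mathbb{R}^d} \frac{e^{ix\xi} - 1}{|\xi|^{\frac{d}{\alpha'}+H}\; \tau_{d,\alpha}S\!\left(\frac{\xi}{|\xi|}\right)}\, TF(M_\alpha)(d\xi)$$
$$Z(x) = c' \int_{\mathbb{R}^d} \left[ |x-y|^{H-\frac{d}{\alpha'}}\, \tau_{d,\alpha}^{-1}S(x-y) - |y|^{H-\frac{d}{\alpha'}}\, \tau_{d,\alpha}^{-1}S(y) \right] TF(M_\alpha)(dy) \qquad (5.13)$$

Let $\Theta$ be a symmetric function from $\Sigma_{d-1}$ to $\mathbb{R}^+$. Let us consider the symbol:

$$\rho_{\beta,\Theta}(\xi) = |\xi|^{\frac{d}{\beta}+H}\, \Theta\!\left(\frac{\xi}{|\xi|}\right)$$

and the corresponding pseudo-differential operator:

$$L_{\beta,\Theta} f(x) = \int_{\mathbb{R}^d} \rho_{\beta,\Theta}(\xi)\, TF(f)(\xi)\, e^{ix\xi}\, d\xi \qquad (5.14)$$

THEOREM 5.13 ([BEN 99]).– Processes $X$ and $Z$ of Theorem 5.11 are the unique solutions of the following systems:

$$L_{\alpha',\Theta} X = M_\alpha \ \text{(a.s.)}, \qquad X(0) = 0$$

with $\Theta = \tau_{d,\alpha} S$ and:

$$L_{\alpha',S} Z = TF(M_\alpha) \ \text{(a.s.)}, \qquad Z(0) = 0$$

NOTE 5.4.– Generally, the sample paths of $X$ and $Z$ are not continuous, hence the definition of $X(0)$ and $Z(0)$ in average.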



5.3.3. Censov and Takenaka processes

Let us now consider two examples of self-similar processes with stationary increments, which are neither Gaussian nor stable. Let us consider the affine space $E_d$ of dimension $d$. Let $V_t$, $t \in \mathbb{R}^d$, denote the set of hyperplanes separating the origin 0 from $t$. Let $N_\alpha$ stand for a SαS measure, with control measure $d\sigma(s)\,d\rho$, where $d\sigma(s)$ is the surface measure of the unit sphere $\Sigma_{d-1}$. The Censov process is defined by:

$$C(t) = N_\alpha(V_t)$$

Let us now consider the set $B_t$ of spheres of $E_d$ that separate the points 0 and $t$. Each of these spheres is determined by its center $x$ and radius $r$. Let $M_\alpha^\beta$ be a SαS measure, with control measure $\mu(dx, dr) = r^{\beta-d-1}\, dx\, dr$. The Takenaka process with exponent $\beta$ is defined as:

$$T^\beta(t) = M_\alpha^\beta(B_t)$$

THEOREM 5.14 ([SAM 94]).– When $\alpha > 1$ and $\beta < 1$, processes $C$ and $T^\beta$ are well-defined. They are self-similar processes (of order $H = \frac{1}{\alpha}$ for $C(t)$ and $H = \frac{\beta}{\alpha}$ for $T^\beta(t)$), with stationary increments.

THEOREM 5.15 ([SAM 94]).– For $\alpha < 2$, when projected in any arbitrary direction, processes $C$ and $T^\beta$ have non-proportional laws.

5.3.4. Wavelet decomposition

Let $M$ be a stochastic measure verifying (5.11) and possessing the stationarity, unitarity, homogeneity and symmetry properties. Let $\psi_\lambda$ be a family of wavelets as defined in section 5.2.5. Let $1 < \beta \leq 2$. We denote by $\psi_\lambda^\beta$ the family $2^{j/\beta}\, \psi^u(2^j x - k)$. A simple observation leads to the following result.

LEMMA 5.1.– We have:

$$M(dx) = \sum_{\lambda \in \Lambda_d} \psi_\lambda^{\alpha}\, M(\psi_\lambda^{\alpha'}) \overset{\text{def}}{=} \sum_{\lambda \in \Lambda_d} \psi_\lambda^{\alpha}\, \xi_\lambda^{\alpha}$$

Moreover, let us define:

$$\varphi_\lambda^{\alpha}(x) = \int \frac{e^{ixy} - 1}{\rho(y)}\, TF(\psi_\lambda^{\alpha})(y)\, dy$$

where the function $\rho$ is the denominator of (5.12) or (5.13). From this, we deduce the following decomposition:

$$X(x) = \sum_{\lambda \in \Lambda_d} 2^{-jH}\, \varphi_\lambda^{\alpha}\, \xi_\lambda^{\alpha}$$

where the $(\xi_\lambda^{\alpha})$ verify the following stationarity properties. For any $j$:

$$\forall \ell: \ \xi^{\alpha}_{j+\ell,k,u} \overset{L}{=} \xi^{\alpha}_{j,k,u}, \qquad \forall r: \ \xi^{\alpha}_{j,k+r,u} \overset{L}{=} \xi^{\alpha}_{j,k,u} \qquad (5.15)$$

This property can be compared with that given in [CAM 95] (for second order processes) or in [AVE 98] (general case) for the wavelet coefficients of self-similar processes with stationary increments.

5.3.5. Process subordinated to Brownian measure

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $W_d$ the Brownian measure on $L^2(\mathbb{R}^d)$. The space $L^2(\Omega, \mathcal{F}, P)$ is characterized by its decomposition in chaos [NEV 68]. Let us briefly recall this theory. Let $\Sigma_n$ be the symmetric group of order $n$. For any function $f$ of $n$ variables, we define:

$$f^{\circ n}(x_1, \ldots, x_n) \overset{\text{def}}{=} \frac{1}{n!} \sum_{\sigma \in \Sigma_n} f(x_{\sigma(1)}, \ldots, x_{\sigma(n)})$$

Let us also define the symmetric stochastic measure of order $n$, i.e. $W_d^{(n)}(dx_1, \ldots, dx_n)$ on $L^2((\mathbb{R}^d)^n)$, by:

$$W_d^{(n)}(A_1 \times \cdots \times A_n) \overset{\text{def}}{=} W_d(A_1) \times \cdots \times W_d(A_n)$$

where any two $A_i$ are disjoint. In addition, it is imposed that the expectation of $W_d^{(n)}(f_1 \ldots f_n)$ is always zero. As an example, with $n = 2$, we obtain $W_d^{(2)}(fg) = W_d(f)\,W_d(g) - \langle f, g \rangle$. We also set:

$$W_d^{(n)}(f) \overset{\text{def}}{=} W_d^{(n)}(f^{\circ n})$$

The following properties are established:

$$E\left( W_d^{(n)}(f)\, W_d^{(m)}(g) \right) = \delta_{n,m}\, \langle f, g \rangle$$

For any $F \in L^2(\Omega, \mathcal{F}, P)$, there exists a sequence $f_n \in L^2((\mathbb{R}^d)^n)$, $n \geq 0$, with:

$$F = \sum_{n \geq 0} W_d^{(n)}(f_n)$$

and this decomposition is unique. Moreover, we have $EF = W_d^{(0)}(f_0)$.

THEOREM 5.16.– Let $0 < H < 1$:
– process $Y^n$ defined by:

$$Y^n(x_1, \ldots, x_n) \overset{\text{def}}{=} \int_{(\mathbb{R}^d)^n} \frac{e^{ix\cdot\xi} - 1}{|\xi|^{\frac{dn}{2}+H}}\, TF(W_d^n)(d\xi)$$

is self-similar (of order $H$), with stationary increments;
– process $X^n$ defined by:

$$X^n(x) \overset{\text{def}}{=} Y^n(x, \ldots, x)$$

is self-similar (of order $H$), with stationary increments;
– if $(a_n)$ is a square-summable sequence, process $X$ defined by:

$$X(x) \overset{\text{def}}{=} \sum_{n \geq 0} a_n X^n(x)$$

is self-similar (of order $H$), with stationary increments.

This theorem shows how difficult a general classification of self-similar processes with stationary increments can be. Let us note, moreover, that we considered the elliptic case only. We could, for example, also think of combinations of hyperbolic and elliptic cases. This difficulty is clearly indicated by Dobrushin in the issues raised in the comments of [DOB 79, Theorem 6.2, p. 24].

5.4. Regularity and long-range dependence

5.4.1. Introduction

As opposed to what the title of this section may suggest, we will address this question only by means of filtered white noise and for the Gaussian case. Despite its restricted character, this class of examples allows us to question the connection between the regularity of trajectories on the one hand and the long-range correlation⁵ on the other hand. To begin with, let us once again consider fractional Brownian motion $B_H$, with parameter $H$. The sample paths of $B_H$ are Hölderian with exponent $h$ (a.s.), for any $h < H$, but are not Hölderian with exponent $H$ (a.s.). In addition, we can verify that, for $\Delta > 0$, $k \in \mathbb{N}$:

$$E\left[ B_H(\Delta) \left( B_H\big((k+1)\Delta\big) - B_H(k\Delta) \right) \right] = c\, \frac{|\Delta|^{2H}}{|k|^{2(1-H)}}$$

The decrease, with respect to the lag $k$, of the correlation of the increments of $B_H$ is slow. It is often incorrectly admitted that the Hölderian character and the slow decrease of the correlation of the increments are tied together.

5. The analysis of the decorrelation and mixing properties of processes is already an old subject (see, for example, [DOU 94, IBR 78]).

5.4.2. Two examples

5.4.2.1. A signal plus noise model

Let $H$ and $K$ be such that $0 < K < H < 1$. Let $S_1$ and $S_2$ be two functions on the sphere $\Sigma_{d-1}$ with values in $[a, b]$, with $0 < a \leq b < +\infty$. Then, let us consider the process $X$ defined by:

$$X(x) \overset{\text{def}}{=} \int_{\mathbb{R}^d} \frac{e^{ix\xi} - 1}{|\xi|^{\frac{d}{2}+H}\, S_1\!\left(\frac{\xi}{|\xi|}\right)}\, TF(W_1)(d\xi) + \int_{\mathbb{R}^d} \frac{e^{ix\xi} - 1}{|\xi|^{\frac{d}{2}+K}\, S_2\!\left(\frac{\xi}{|\xi|}\right)}\, TF(W_2)(d\xi) \overset{\text{def}}{=} Y_H(x) + Z_K(x)$$

where it is assumed that $W_1$ and $W_2$ are two independent Wiener processes. The process $X$ can be viewed as a signal $Y_H$ corrupted by a noise $Z_K$. Indeed, $Z_K$ is more irregular than $Y_H$. It is shown in Chapter 6 that $X$ is locally self-similar, with parameter $K$. The local behavior is, indeed, dominated by $K$:

$$\lim_{\lambda \to 0} \left( \frac{X(\lambda x)}{\lambda^K} \right)_x \overset{L}{=} \big( Z_K(x) \big)_x$$

However, we can also verify:

$$\lim_{\lambda \to +\infty} \left( \frac{X(\lambda x)}{\lambda^H} \right)_x \overset{L}{=} \big( Y_H(x) \big)_x$$

The global behavior is hence dominated by $H$.

5.4.2.2. Filtered white noise

Now, let us consider the process:

$$T(x) \overset{\text{def}}{=} \int_{\mathbb{R}^d} \frac{e^{ix\xi} - 1}{|\xi|^{\frac{d}{2}+H}\, S_1\!\left(\frac{\xi}{|\xi|}\right) + |\xi|^{\frac{d}{2}+K}\, S_2\!\left(\frac{\xi}{|\xi|}\right)}\, TF(W)(d\xi)$$

It satisfies the following properties:

$$\lim_{\lambda \to 0} \left( \frac{T(\lambda x)}{\lambda^H} \right)_x \overset{L}{=} \big( Y_H(x) \big)_x$$
$$\lim_{\lambda \to +\infty} \left( \frac{T(\lambda x)}{\lambda^K} \right)_x \overset{L}{=} \big( Z_K(x) \big)_x$$

5.4.2.3. Long-range correlation

Previous results show that $X$ is Hölderian with exponent $k < K$ (a.s.) and is not Hölderian with exponent $K$, and also that $T$ is Hölderian with exponent $h < H$ (a.s.) and is not Hölderian with exponent $H$. The long-range correlation of $X$ is dominated by $H$:

$$\lim_{|h| \to \infty} \frac{E\left[ X(kh) \left( X\big((k+1)h\big) - X(kh) \right) \right]}{|h|^{2H}} = \frac{c}{|k|^{2(1-H)}}$$

while the long-range correlation of $T$ is dominated by $K$:

$$\lim_{|h| \to \infty} \frac{E\left[ T(kh) \left( T\big((k+1)h\big) - T(kh) \right) \right]}{|h|^{2K}} = \frac{c}{|k|^{2(1-K)}}$$

Thus, from these generic examples, we can see that long-range correlation and Hölderian regularity are two distinct concepts.

5.5. Bibliography

[AVE 98] AVERKAMP R., HOUDRÉ C., “Some distributional properties of the continuous wavelet transform of random processes”, IEEE Trans. on Info. Theory, vol. 44, no. 3, p. 1111–1124, 1998.
[BAR 95] BARLOW M., “Diffusion on fractals”, in Ecole d’été de Saint-Flour, Springer, 1995.
[BAS 62] BASS J., “Les fonctions pseudo-aléatoires”, in Mémorial des sciences mathématiques, fascicule 153, Gauthier-Villars, Paris, 1962.
[BEN 97] BENASSI A., JAFFARD S., ROUX D., “Elliptic Gaussian random processes”, Rev. Math. Iber., vol. 13, p. 19–90, 1997.
[BEN 98] BENASSI A., COHEN S., ISTAS J., “Identifying the multifractional function of a Gaussian process”, Stat. Proba. Let., vol. 39, p. 337–345, 1998.
[BEN 99] BENASSI A., ROUX D., “Elliptic self-similar stochastic processes”, in DEKKING M., LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals: Theory and Applications in Engineering, Springer, 1999.
[BEN 00] BENASSI A., COHEN S., DEGUY S., ISTAS J., “Self-similarity and intermittency”, in Wavelets and Time-frequency Signal Analysis (Cairo, Egypt), EPH, 2000.
[BRE 68] BREIMAN L., Probability, Addison-Wesley, 1968.
[CAM 95] CAMBANIS S., HOUDRÉ C., “On the continuous wavelet transform of second-order random processes”, IEEE Trans. on Info. Theory, vol. 41, no. 3, p. 628–642, 1995.
[DOB 79] DOBRUSHIN R.L., “Gaussian and their subordinated self-similar random fields”, Ann. Proba., vol. 7, no. 3, p. 1–28, 1979.
[DOU 94] DOUKHAN P., “Mixing: properties and examples”, in Lecture Notes in Statistics 85, Springer-Verlag, 1994.
[DUB 97] DUBRULLE B., GRANER F., SORNETTE D. (Eds.), Scale Invariance and Beyond: Proceedings of the CNRS School (Les Houches, France), EDP Sciences and Springer, 1997.
[DUR 97] DURAND J., Sables, poudres et grains, Eyrolles Sciences, Paris, 1997.
[FRI 97] FRISCH U., Turbulence, Cambridge University Press, 1997.
[GRI 89] GRIMMETT G., Percolation, Springer-Verlag, 1989.
[GUY 89] GUYON X., LEON J., “Convergence en loi des H-variations d’un processus Gaussien stationnaire”, Annales de l’Institut Henri Poincaré, vol. 25, p. 265–282, 1989.
[GUY 94] GUYON E., TROADEC J.P., Du sac de billes au tas de sable, Editions Odile Jacob, 1994.
[HAV 87] HAVLIN S., BEN-AVRAHAM D., “Diffusion in disordered media”, Advances in Physics, vol. 36, no. 6, p. 695–798, 1987.
[HER 90] HERRMANN H., ROUX S., Statistical Models for the Fracture of Disordered Media – Random Materials and Processes, North-Holland, 1990.
[IBR 78] IBRAGIMOV I., ROZANOV Y., “Gaussian random processes”, in Applications of Mathematics 9, Springer-Verlag, 1978.
[IST 97] ISTAS J., LANG G., “Quadratic variations and estimation of the Hölder index of a Gaussian process”, Annals of the Institute Henri Poincaré, vol. 33, p. 407–436, 1997.
[KOL 40] KOLMOGOROV A., “Wienersche Spiralen und einige andere interessante Kurven im Hilbertschen Raum”, Comptes rendus (Dokl.) de l’Académie des sciences de l’URSS, vol. 26, p. 115–118, 1940.
[LAM 62] LAMPERTI J., “Semi-stable stochastic processes”, Trans. Am. Math. Soc., vol. 104, p. 62–78, 1962.
[LEG 99] LÉGER S., PONTIER M., “Drap brownien fractionnaire”, Note aux Comptes rendus de l’Académie des sciences, S. I, vol. 329, p. 893–898, 1999.
[MAN 68] MANDELBROT B.B., VAN NESS J.W., “Fractional Brownian motions, fractional noises, and applications”, SIAM Review, vol. 10, no. 4, p. 422–437, 1968.
[MEY 90] MEYER Y., Ondelettes et opérateurs, Hermann, Paris, 1990.
[NEV 68] NEVEU J., Processus Gaussiens, Montreal University Press, 1968.
[SAM 94] SAMORODNITSKY G., TAQQU M.S., Stable Non-Gaussian Random Processes, Stochastic Models with Infinite Variance, Chapman and Hall, New York and London, 1994.
[TAQ 75] TAQQU M.S., “Weak convergence to fractional Brownian motion and the Rosenblatt process”, Z.W.G., vol. 31, p. 287–302, 1975.
[WEI 72] WEIERSTRASS K., “Ueber continuirliche Functionen eines reellen Arguments, die fuer keinen Werth des letzteren einen bestimmten differentialquotienten besitzen”, Koenigl. Akad. Know. Mathematical Works II, vol. 31, p. 71–74, 1872.
[WIL 98] WILLINGER W., PAXSON V., TAQQU M.S., “Self-similarity and heavy tails: structural modeling of network traffic”, in ADLER R.J., FELDMAN R.E., TAQQU M.S. (Eds.), A Practical Guide to Heavy Tails: Statistical Techniques and Applications, Springer-Verlag, p. 27–53, 1998.

Chapter 6

Locally Self-similar Fields

Chapter written by Serge COHEN.

6.1. Introduction

Engineers and mathematicians interested in applications have to use many different models to describe reality. The objective of this chapter is to explain the usefulness of locally self-similar fields. First, we will show how the traditional concept of self-similarity often proves too narrow to model certain phenomena. Then, given the diversity of existing locally self-similar models, we will present the panorama of relations that are found among them. Finally, we will familiarize the reader with the techniques used in this field.

In order to understand the genesis of locally self-similar fields, it is necessary to go back to their common ancestor: the fractional Brownian motion (FBM). Historically, the popularity of simple random models having properties of self-similarity can be traced back to [MAN 68]. In particular (if restricted to Gaussian processes), Mandelbrot and Van Ness show that there exists a single process with stationary increments that is self-similar of order $H$ (for $0 < H < 1$). This property implies that a change of scale on the index amounts to a scaling of the process value:

$$B_H(\varepsilon x) \overset{L}{=} \varepsilon^H B_H(x)$$

See Chapter 5 for more precise explanations. We will hereafter denote by $B_H$ the FBM of order $H$. One of the most interesting properties of fractional Brownian motion of order $H$ is the Hölderian regularity of order $H$, denoted by $C^H$ (up to logarithmic factors), of the trajectories. Indeed, FBM is a good candidate for modeling phenomena which, after statistical processing, are found to have $C^H$ trajectories and are supposed, for theoretical reasons, to be self-similar. The identification of $H$, starting from samples of the phenomenon necessarily taken in discrete time, is thus crucial. At the same time, it is necessary to remember that FBMs are processes with stationary increments, which simplifies the spectral study of the process but is too restrictive for certain applications. Indeed, in many fields (for instance, when we want to simulate textures in an image), it is expected, a priori, that the order $H$ depends on the point at which the process is observed. For example, if, using a random field, we want to model the aerial photograph of a forest, we would like to have a model where the parameter of self-similarity around the point $x$, denoted by $h(x)$, depends on the geological nature of the ground in the vicinity of $x$. However, a spatial modulation of the field law is generally incompatible with the property of self-similarity, which is a global property. Consequently, the problem consists of arriving at a concept of locally self-similar field that is flexible enough for the parameters which define the field law to vary with the position, yet simple enough to enable the identification of these parameters. Unfortunately, the simple approach, which consists of replacing $H$ by a function $h(x)$ in the formula giving the covariance of a FBM, is not satisfactory: generally, we can show that there does not exist any Gaussian field having this generalized covariance. We will thus have to introduce the mathematical tools which will make it possible to build models generalizing FBMs and also to identify the functional parameters of these models.

These theoretical recaps will be dealt with in section 6.2, where we will consider each time the relevance of the concept introduced for the example of fractional Brownian motion. In this context, we discuss traditional techniques for the study of Gaussian fields and also the tools of wavelet analysis. Using this theoretical framework, in section 6.3 we will formally define the property of local self-similarity and present two examples which form the basis of all the later models. Having established that these models are not sufficiently general for the applications, we will enter the multifractional world in section 6.4. In each preceding model, specific attention will be given to the regularity of the trajectories and, in section 6.5, we shall develop the statistical methods which make it possible to estimate this regularity. This is what we call model identifiability.

At this point, it is necessary to clarify that the term “fields” is used for families of random variables indexed by $\mathbb{R}^d$ for $d \geq 1$. In applications, the most interesting cases correspond to $d > 1$ (for example, $d = 2$ for images). However, certain statements, particularly those concerning identification, will relate only to processes (i.e., fields where we have $d = 1$).


6.2. Recap of two representations of fractional Brownian motion

We begin with the presentation of tools for the study of locally self-similar fields based on the concept of reproducing kernel Hilbert space. We derive from this a Karhunen-Loeve expansion of the FBM. We then find that there is a spectral representation of fractional Brownian motion, called harmonizable. It will be an occasion to recall some concepts of multiresolution analysis.

6.2.1. Reproducing kernel Hilbert space

The study of Gaussian fields is largely facilitated by a tool of analysis which is traditionally associated with these fields: the reproducing kernel Hilbert space. From a physical point of view, the reproducing kernel Hilbert space can be regarded as a space which describes the energy associated with a Gaussian field, in the sense of a spectral energy. Mathematically, on the other hand, a reproducing kernel Hilbert space is a Hilbert space of deterministic functions whose norm characterizes all the properties of the field. See [NEV 68] for a detailed study. Let us now recall its formal definition.

DEFINITION 6.1.– Let $(X_x)_{x \in \mathbb{R}^d}$ be a centered Gaussian field (i.e., $E(X_x) = 0$). We will call Gaussian space associated with $X$ the subspace of the square integrable random variables (noted by $L^2(\Omega, \mathcal{A}, P)$) made up of the linear combinations of the variables $X_x$ and their limits, that is:

$$H_X = \mathrm{adh}\left\{ Z, \text{ such that } \exists n \in \mathbb{N} \text{ and } \exists \lambda_i \text{ for } i = 1 \text{ to } n \text{ and } Z = \sum_{i=1}^{n} \lambda_i X_{x_i} \right\} \qquad (6.1)$$

where adh means that we take the closure of the set for the topology defined by $L^2(\Omega, \mathcal{A}, P)$. The space:

$$\mathcal{H}_X = \left\{ h_Z, \text{ such that } \exists Z \in H_X \ / \ h_Z(x) = E(Z X_x^*) \right\}$$

equipped with the Hermitian product:

$$\forall Z_1, Z_2 \in H_X, \qquad \langle h_{Z_1}, h_{Z_2} \rangle = E(Z_1 Z_2^*) \qquad (6.2)$$

is the reproducing kernel Hilbert space associated with the field $X$. It is verified, according to (6.2), that the map:

$$h : H_X \longrightarrow \mathcal{H}_X, \qquad Z \longmapsto h_Z \qquad (6.3)$$

is an isometry between the Gaussian space and the reproducing kernel Hilbert space, the Hermitian product on $\mathcal{H}_X$ being chosen ad hoc. In particular, this map is bijective, meaning that to every function $h \in \mathcal{H}_X$ there corresponds exactly one random variable $Z$ of $H_X$. Moreover, $\mathcal{H}_X$ contains the functions of $y$: $h_{X_x}(y) = R(x, y)$ resulting from the covariance of $X$: $R(x, y) = E(X_x X_y^*)$, and we can describe the reproducing kernel Hilbert space as the closure of the finite linear combinations of functions $R(x, \cdot)$ for the norm of $\mathcal{H}_X$. Lastly, the name reproducing kernel Hilbert space comes from the property verified by its scalar product:

$$\langle R(x, \cdot), R(y, \cdot) \rangle_{\mathcal{H}_X} = R(x, y) \qquad (6.4)$$
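The reproducing property (6.4) can be checked by hand on the simplest example, standard Brownian motion on $[0, 1]$, whose covariance is $R(s, t) = \min(s, t)$. The sketch below assumes the classical fact (not established in this chapter) that the associated reproducing kernel Hilbert space carries the scalar product $\langle f, g \rangle = \int_0^1 f'(u) g'(u)\, du$; the function name `rkhs_inner_product` and the midpoint quadrature are illustrative choices.

```python
def rkhs_inner_product(s, t, n=10000):
    """Approximate <R(s,.), R(t,.)> for R(s,u) = min(s,u) on [0, 1].

    Assumes the scalar product <f, g> = integral of f'(u) g'(u) du,
    evaluated with a midpoint rule on n cells.
    """
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * step
        # The derivative of u -> min(s, u) is the indicator of [0, s].
        d_rs = 1.0 if u < s else 0.0
        d_rt = 1.0 if u < t else 0.0
        total += d_rs * d_rt * step
    return total


if __name__ == "__main__":
    s, t = 0.3, 0.7
    print(rkhs_inner_product(s, t), min(s, t))
```

Up to quadrature error, the two printed numbers agree, which is exactly what (6.4) states for this covariance.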

However, the most important aspect of the reproducing kernel Hilbert space is the fact that the choice of an orthonormal basis of this space makes it possible to obtain a series representation of the field, which is often called the Karhunen-Loeve expansion.

THEOREM 6.1.– Any orthonormal basis $(e_n(x))_{n \in \mathbb{N}}$ of $\mathcal{H}_X$ is associated with an orthonormal basis of $H_X$, i.e. $(\eta_n)_{n \in \mathbb{N}}$, by the relation:

$$h_{\eta_n} = e_n$$

The random variables $(\eta_n)_{n \in \mathbb{N}}$ are independent centered Gaussian variables of variance 1 and the field can be represented by:

$$X_x(\omega) = \sum_{n=0}^{+\infty} \eta_n(\omega)\, e_n(x) \qquad (6.5)$$

where $\omega$ denotes an outcome of the probability space $\Omega$ and the convergence in (6.5) holds in $L^2(\Omega)$.

The preceding theorem is true for any field and for any orthonormal basis of $\mathcal{H}_X$. In fact, a martingale type argument shows that the convergence is almost sure, which is important, particularly when simulating the fields of interest here. Nevertheless, a judicious choice of the orthonormal basis of $\mathcal{H}_X$ is necessary for conveniently studying the regularity of the trajectories of these fields. We will illustrate these ideas on the fundamental example of FBM in the next section.

6.2.2. Harmonizable representation

By way of example, let us seek the reproducing kernel Hilbert space of the FBM: we will deduce from it a Karhunen-Loeve expansion which will form the basis for studying the trajectorial regularity of the generalizations of FBM. Let us begin with the definition of the FBM, starting from its covariance.

DEFINITION 6.2.– We will call FBM of order $H$ the real centered Gaussian field $B_H$ given by the covariance:

$$E\big( B_H(x) B_H(y) \big) = \frac{1}{2} \left\{ \|x\|^{2H} + \|y\|^{2H} - \|x - y\|^{2H} \right\} \qquad (6.6)$$

where $0 < H < 1$ and where $\|\cdot\|$ is the Euclidean norm on $\mathbb{R}^d$.

Let us begin with some elementary comments explaining this presentation. To simplify our study, we shall assume a field with real values. In addition, it is sometimes more telling to define the FBM by expressing the variance of the increments, which is:

$$E\big( B_H(x) - B_H(y) \big)^2 = \|x - y\|^{2H}$$

In fact, this property characterizes the FBM if we additionally impose $B_H(0) = 0$ a.s. If $H = \frac{1}{2}$ and $d = 1$, the FBM is a standard Brownian motion and the increments are then independent if they are taken on disjoint intervals; however, this case is exceptional, and the majority of methods used for the standard Brownian motion do not apply to the other values of $H$. On the other hand, we note that the FBM is a field with stationary increments whatever the value of $H$, which constitutes the starting point for representing its covariance. Indeed, it is traditional to represent fields with stationary increments through a spectral measure. Following the example of the stationary processes (see [YAG 87] for a general presentation), we obtain:

$$R(x, y) = \int_{\mathbb{R}^d} (e^{ix\cdot\xi} - 1)(e^{-iy\cdot\xi} - 1)\, \mu(d\xi) \qquad (6.7)$$

where $\mu$ is the spectral measure. In the case of FBM, we can guess the spectral measure from the formula:

$$\int_{\mathbb{R}^d} \frac{|e^{ix\cdot\xi} - 1|^2}{\|\xi\|^{d+2H}}\, \frac{d\xi}{(2\pi)^{d/2}} = C_H^2\, \|x\|^{2H} \qquad (6.8)$$

where $C_H$ is a strictly positive constant; the preceding formula gives:

$$R(x, y) = \frac{1}{C_H^2} \int_{\mathbb{R}^d} \frac{(e^{ix\cdot\xi} - 1)(e^{-iy\cdot\xi} - 1)}{\|\xi\|^{d+2H}}\, \frac{d\xi}{(2\pi)^{d/2}} = \langle k_x, k_y \rangle_{L^2(\mathbb{R}^d)} \qquad (6.9)$$

where we have:

$$\langle f, g \rangle_{L^2(\mathbb{R}^d)} = \int_{\mathbb{R}^d} f(\xi)\, g^*(\xi)\, \frac{d\xi}{(2\pi)^{d/2}}$$

The covariance can still be written by using Parseval's formula, which expresses that the Fourier transform is an isometry of $L^2$:

$$R(x, y) = \langle \hat{k}_x, \hat{k}_y \rangle_{L^2(\mathbb{R}^d)} \qquad (6.10)$$

where $\hat{f}$ is the Fourier transform of $f$. By using the Fourier inversion theorem:

$$k_x(\xi) = \frac{e^{ix\cdot\xi} - 1}{C_H\, \|\xi\|^{\frac{d}{2}+H}}$$

Equation (6.9) is an attempt to associate the covariance with a functional scalar product and thus with (6.4). That will enable us, according to [BEN 97b], to have a convenient description of the reproducing kernel Hilbert space of the FBM. To this end, let us define the isometry operator $J$ between $\mathcal{H}_{B_H}$ and $L^2(\mathbb{R}^d)$.

DEFINITION 6.3.– We define the linear operator $J$ from $L^2(\mathbb{R}^d)$ to $\mathcal{H}_{B_H}$ by assuming:

$$J(k_x) = R(x, \cdot)$$

For $\psi \in L^2(\mathbb{R}^d)$:

$$J(\psi)(y) = \int_{\mathbb{R}^d} \frac{e^{-iy\cdot\xi} - 1}{C_H\, \|\xi\|^{\frac{d}{2}+H}}\, (\hat{\psi})^*(\xi)\, \frac{d\xi}{(2\pi)^{d/2}} \qquad (6.11)$$

The reproducing kernel Hilbert space of the FBM can be written:

$$\mathcal{H}_{B_H} = \left\{ f, \ \exists \psi \in L^2(\mathbb{R}^d) \text{ such that } f = J(\psi) \right\}$$

Moreover, $J$ is an isometry:

$$\langle J(\psi_1), J(\psi_2) \rangle_{\mathcal{H}_{B_H}} = \langle \psi_1, \psi_2 \rangle_{L^2(\mathbb{R}^d)} \qquad (6.12)$$

The properties of $J$ contained in Definition 6.3 are proved on pages 24 and 25 of [BEN 97b]. This presentation of the reproducing kernel Hilbert space makes it possible to easily build an orthonormal basis of $\mathcal{H}_{B_H}$ such that the associated Karhunen-Loeve expansion converges almost surely. The first stage consists of choosing an orthonormal basis of $L^2(\mathbb{R}^d)$ which is adapted to our problem. For this, let us start from a multiresolution analysis of $L^2(\mathbb{R}^d)$ (see [MEY 90a]): it is known that there are functions $\psi^{(l)} \in L^2(\mathbb{R}^d)$, for $l$ belonging to $L = \{0, 1\}^d \setminus \{(0, \ldots, 0)\}$, such that their Fourier transforms $\hat{\psi}^{(l)}(\xi)$ are $C^\infty$ and vanish outside the domain $\frac{2\pi}{3} \leq \|\xi\| \leq \frac{8\pi}{3}$. We then set:

$$\psi^{(l)}_{j,k}(x) = 2^{\frac{dj}{2}}\, \psi^{(l)}(2^j x - k), \qquad j, k \in \mathbb{Z} \qquad (6.13)$$

Below, we will write $\lambda = (j, k, l)$, $\Lambda = \mathbb{Z}^2 \times L$ and $\psi^{(l)}_{j,k} = \psi_\lambda$. Conventionally, in a multiresolution analysis $(\psi_\lambda)_{\lambda \in \Lambda}$ is an orthonormal basis of $L^2(\mathbb{R}^d)$ and the function $\psi_\lambda$ is localized around $k 2^{-j}$, which we will identify, by abuse of language, with $\lambda$. Here $\psi_\lambda$ in particular shows a fast decrease:

$$|\psi_\lambda(x)| \leq \frac{C\, 2^{dj/2}}{1 + |2^j x - k|^K} \qquad \forall K \in \mathbb{N} \qquad (6.14)$$

From the basis $(\psi_\lambda)_{\lambda \in \Lambda}$, we will build an orthonormal basis of the reproducing kernel Hilbert space of the FBM by assuming:

$$\varphi_\lambda(x) = \int_{\mathbb{R}^d} \frac{e^{-ix\cdot\xi} - 1}{C_H\, \|\xi\|^{\frac{d}{2}+H}}\, (\hat{\psi}_\lambda)^*(\xi)\, \frac{d\xi}{(2\pi)^{d/2}} \qquad (6.15)$$

When $d = 1$, functions $\varphi_\lambda$ are “morally” the fractional integrals of functions $\psi_\lambda$, in the sense of Chapter 7. To be convinced of this, it is necessary to express the fractional integration operator of that chapter in terms of the Fourier transform of the function to which we apply it. It will be noted, however, that the correspondence is not exact. Nevertheless, the principal interest of functions $\varphi_\lambda$ is that they “inherit,” in a certain manner, the localization properties of functions $\psi_\lambda$: these are “wavelet” type functions in the terminology of [MEY 90b], which results in:

$$|\varphi_\lambda(x)| \leq C(K)\, 2^{-Hj} \left( \frac{1}{1 + |2^j x - k|^K} + \frac{1}{1 + |k|^K} \right) \qquad (6.16)$$
$$|\varphi_\lambda(x) - \varphi_\lambda(y)| \leq C(K)\, 2^{-Hj}\, \frac{2^j |x - y|}{1 + |2^j x - k|^K} \qquad \forall K \in \mathbb{N} \qquad (6.17)$$

where we suppose that we have $j \geq 0$ and $(j, l) \neq (0, 0)$. Consequently, the series expansion (6.5) of the FBM becomes:

$$B_H(x) = \sum_{\lambda \in \Lambda} \eta_\lambda\, \varphi_\lambda(x) \qquad (6.18)$$

where $(\eta_\lambda)_{\lambda \in \Lambda}$ is a sequence of independent centered Gaussian random variables of variance 1. Thanks to the localization (6.16), it is possible to say that, roughly, the behavior of the FBM $B_H$ in the vicinity of $x$ depends mainly on the random variables $\eta_\lambda$ for $\lambda$ close to $x$; in particular, the series of (6.18) converges almost surely.

In this chapter we will continue to use the representation known as harmonizable for the FBM of order $H$. In this representation, the FBM appears as a white noise to which a filter is applied, in the sense of signal theory. Although FBM can be defined by its harmonizable representation (see Chapter 7 of [SAM 94]), we will here deduce it from (6.18). From this point of view, it is necessary to assume:

$$\hat{W}^*(d\xi) = \sum_{\lambda \in \Lambda} \eta_\lambda\, \hat{\psi}_\lambda^*(\xi)\, \frac{d\xi}{(2\pi)^{d/2}} \qquad (6.19)$$

This is a Gaussian random measure: it integrates the deterministic functions $f$ of $L^2(\mathbb{R}^d)$, possibly with complex values, and provides a Gaussian random variable:

$$\int_{\mathbb{R}^d} f(\xi)\, \hat{W}^*(d\xi) = \sum_{\lambda \in \Lambda} \eta_\lambda \int_{\mathbb{R}^d} f(\xi)\, \hat{\psi}_\lambda^*(\xi)\, \frac{d\xi}{(2\pi)^{d/2}} \qquad (6.20)$$

The left-hand side of (6.20) must be understood as a notation for the Gaussian random variable defined by the series of the right-hand side, which converges in $L^2(\Omega)$. Since the variables $\eta_\lambda$ are independent, it is deduced that:

$$\left\| \int_{\mathbb{R}^d} f(\xi)\, \hat{W}^*(d\xi) \right\|^2_{L^2(\Omega)} = \|f\|^2_{L^2(\mathbb{R}^d)} \qquad (6.21)$$

On the basis of (6.18) and (6.20), we obtain:

$$B_H(x) = \int_{\mathbb{R}^d} \frac{e^{-ix\cdot\xi} - 1}{C_H\, \|\xi\|^{\frac{d}{2}+H}}\, \hat{W}^*(d\xi) \qquad (6.22)$$

The FBM is thus a white noise filtered through the filter:

$$g(x, \xi) = \frac{e^{-ix\cdot\xi} - 1}{C_H\, \|\xi\|^{\frac{d}{2}+H}} \qquad (6.23)$$

It should be noted that there are other filters leading to fields which have the same law as that defined in (6.22). In the next section, we will see that it is possible to define generalizations of FBM which do not have stationary increments, on the basis of the reproducing kernel Hilbert space of the FBM or of its harmonizable representation.
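Since the law of a centered Gaussian field is entirely determined by its covariance, the covariance (6.6) is already sufficient to draw exact samples of the FBM on a finite grid, without using the wavelet series (6.18) or the filter (6.23). The sketch below does this for $d = 1$ through a Cholesky factorization of the covariance matrix; this is a standard simulation method and not the construction used in this chapter. The function names `fbm_cov`, `cholesky` and `simulate_fbm` are illustrative assumptions, and the sampling times are assumed distinct and non-zero so that the covariance matrix is invertible.

```python
import math
import random


def fbm_cov(x, y, H):
    """Covariance (6.6) of the FBM of order H, for d = 1."""
    return 0.5 * (abs(x) ** (2 * H) + abs(y) ** (2 * H) - abs(x - y) ** (2 * H))


def cholesky(matrix):
    """Lower triangular L with L L^T = matrix (symmetric positive definite)."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def simulate_fbm(times, H, rng):
    """Exact Gaussian sample (B_H(t) for t in times); times distinct, non-zero."""
    cov = [[fbm_cov(s, t, H) for t in times] for s in times]
    low = cholesky(cov)
    noise = [rng.gauss(0.0, 1.0) for _ in times]
    return [sum(low[i][k] * noise[k] for k in range(i + 1))
            for i in range(len(times))]


if __name__ == "__main__":
    grid = [k / 8.0 for k in range(1, 9)]
    print(simulate_fbm(grid, 0.7, random.Random(0)))
```

The cost of the factorization grows as the cube of the number of sampling times, so this approach is only a reference method for small grids.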


6.3. Two examples of locally self-similar fields

6.3.1. Definition of the local asymptotic self-similarity (LASS)

In this section, we will precisely define the property of local self-similarity which we are seeking. Let us recall, to this end, the property of self-similarity verified by the FBM:

$$\forall \varepsilon \in \mathbb{R}^+, \quad \forall x \in \mathbb{R}^d, \qquad B_H(\varepsilon x) \overset{L}{=} \varepsilon^H B_H(x) \qquad (6.24)$$

where $\overset{L}{=}$ means that, for all $n \in \mathbb{N}$ and any choice of $(x_1, \ldots, x_n)$ in $(\mathbb{R}^d)^n$, the vector $(B_H(\varepsilon x_1), \ldots, B_H(\varepsilon x_n))$ has the same law as $\varepsilon^H (B_H(x_1), \ldots, B_H(x_n))$. As we saw in the introduction, it is not easy to localize this global property while preserving an identifiable model for which the trajectories of the process have locally, in the vicinity of a point, the desired Hölderian regularity. The asymptotic definition, presented initially in [BEN 97b], corresponds to these objectives.

DEFINITION 6.4.– Let $h: \mathbb{R}^d \to\ ]0, 1[$ be a function. A field $X$ will be called locally asymptotically self-similar (LASS) of multifractional function $h$ if:

$$\forall x \in \mathbb{R}^d, \qquad \lim_{\varepsilon \to 0^+} \left( \frac{X(x + \varepsilon u) - X(x)}{\varepsilon^{h(x)}} \right)_{u \in \mathbb{R}^d} \overset{L}{=} a(x)\, \big( B_H(u) \big)_{u \in \mathbb{R}^d} \qquad (6.25)$$

where $a$ is a strictly positive function and $B_H$ is a FBM of order $H = h(x)$. The topology with which we equip the trajectory space to define the convergence in law is that of uniform convergence on each compact.

This definition can be reformulated qualitatively by saying that a locally self-similar process admits at each point $x \in \mathbb{R}^d$, up to a normalization of the variance given by $a(x)$, a tangent fractional Brownian motion $B_H$. It is a satisfactory generalization of the self-similarity property. Indeed, it is easy to verify that a FBM is locally self-similar with a constant multifractional function equal to its order. In the case of FBM, we have:

$$X(x + \varepsilon u) - X(x) \overset{L}{=} X(\varepsilon u)$$

because of the stationarity of the increments, and it is noted, by applying (6.24), that for a FBM the term:

$$\frac{X(x + \varepsilon u) - X(x)}{\varepsilon^H}$$

is constant in law. This elementary verification explains the denominator of (6.25) as well as the localizing role of the asymptotic $\varepsilon \to 0^+$. The last advantage of Definition 6.4 is that it enables the construction of non-trivial examples of locally self-similar processes, as we will see in the following section.
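The elementary verification above can be reproduced numerically at the level of variances. The sketch below, written for $d = 1$, computes the variance of the normalized increment of a FBM directly from the covariance (6.6) and shows that it equals $|u|^{2H}$ whatever the scale. It only checks one-dimensional marginals, not the full equality in law of (6.25), and the function names `fbm_cov` and `normalized_increment_variance` are illustrative assumptions.

```python
def fbm_cov(x, y, H):
    """Covariance (6.6) of the FBM of order H, for d = 1."""
    return 0.5 * (abs(x) ** (2 * H) + abs(y) ** (2 * H) - abs(x - y) ** (2 * H))


def normalized_increment_variance(x, u, eps, H):
    """Variance of (B_H(x + eps*u) - B_H(x)) / eps**H computed from (6.6)."""
    a, b = x + eps * u, x
    var_increment = fbm_cov(a, a, H) + fbm_cov(b, b, H) - 2.0 * fbm_cov(a, b, H)
    return var_increment / eps ** (2 * H)


if __name__ == "__main__":
    H, x, u = 0.7, 2.0, 1.5
    for eps in (1.0, 0.1, 0.001):
        print(eps, normalized_increment_variance(x, u, eps, H), abs(u) ** (2 * H))
```

For every value of `eps` the two printed quantities coincide up to rounding, in agreement with the statement that the normalized increment of a FBM is constant in law.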

214


6.3.2. Filtered white noise (FWN)

In [BEN 98b], we propose to generalize the harmonizable representation (6.22) by calling filtered white noise (FWN) any process corresponding to a filter $g(x, \xi)$ of the form:

$$g(x, \xi) = \left( e^{-ix \cdot \xi} - 1 \right) \left[ \frac{a(x)}{\|\xi\|^{\frac{d}{2} + H_1}} + \frac{b(x)}{\|\xi\|^{\frac{d}{2} + H_2}} + R(x, \xi) \right] \tag{6.26}$$

for $0 < H_1 < H_2 < 1$. The bracketed term in (6.26) is an asymptotic expansion at high frequency, and the following definition gives the precise assumptions expressing that $R(x, \xi)$ is negligible compared with:

$$\frac{b(x)}{\|\xi\|^{\frac{d}{2} + H_2}}$$

when $\|\xi\| \to +\infty$.

DEFINITION 6.5.– We call filtered second-order white noise a process $X$ which admits the harmonizable representation:

$$\forall t \in \mathbb{R}, \quad X(t) = \int_{\mathbb{R}} \left( e^{-it\xi} - 1 \right) \left[ \frac{a(t)}{|\xi|^{\frac{1}{2} + H_1}} + \frac{b(t)}{|\xi|^{\frac{1}{2} + H_2}} + R(t, \xi) \right] \hat{W}^{*}(d\xi) \tag{6.27}$$

under the two following hypotheses.

HYPOTHESIS 6.1.– In the preceding definition, we have $0 < H_1 < H_2 < 1$ and $R(t, \xi) \in C^{1,2}([0, 1] \times \mathbb{R})$ is such that:

$$\left| \frac{\partial^{m+n}}{\partial t^m \, \partial \xi^n} R(t, \xi) \right| \le \frac{C}{|\xi|^{\frac{1}{2} + \eta + n}}$$

for $m = 0, 1$ and $n = 0, 1, 2$ with $\eta > H_2$. The symbol $C$ denotes a generic constant.

HYPOTHESIS 6.2.– In the preceding definition, we have $a, b \in C^1([0, 1])$ and, for every $t \in [0, 1]$, we have $a(t) b(t) \neq 0$.

Limiting ourselves to an expansion of order 2 is a convention adopted in order to simplify the presentation of the identification algorithms. For the same reason, we suppose that filtered white noises are processes indexed by $t \in \mathbb{R}$. To better understand the relationship between filtered white noises and FBM, it is enough to suppose that $R(t, \xi)$ is identically zero. We find $X_t = C_{H_1} a(t) B_{H_1}(t) + C_{H_2} b(t) B_{H_2}(t)$, for $B_{H_1}$, $B_{H_2}$ two fractional Brownian motions and $C_H$ the constant defined in (6.8). It should, however, be noted that, even in this simplified example, $B_{H_1}$ and $B_{H_2}$ are not independent and therefore



the law of $X$ is not trivially deduced from that of FBM. This last example illustrates an additional virtue of filtered white noises: their definition allows not only the functional parameters $a(t)$ and $b(t)$ to vary with position, but also the superposition of self-similar phenomena of orders $H_1$ and $H_2$. In addition, an interpretation of $H_2$ in terms of long-range dependence is given in Chapter 5. On the other hand, starting from formula (6.27), it is difficult to find a convenient expression for the reproducing kernel Hilbert space of a filtered white noise. Moreover, filtered white noises do not retain the global properties of FBMs, namely self-similarity and stationarity of the increments, which is an advantage when modeling certain phenomena. Since the filter of a filtered white noise is asymptotically equivalent at high frequency to that of a FBM, only the local properties remain. For example, filtered white noises satisfy a local self-similarity property of the following type.

PROPOSITION 6.1.– A filtered second-order white noise $X$ associated with a filter of form (6.26) is locally self-similar with constant multifractional function equal to $H_1$:

$$\lim_{\varepsilon \to 0^+} \left( \frac{X(t + \varepsilon u) - X(t)}{\varepsilon^{H_1}} \right)_{u \in \mathbb{R}} \overset{\mathcal{L}}{=} C_{H_1} a(t) \, \big( B_{H_1}(u) \big)_{u \in \mathbb{R}} \tag{6.28}$$
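To see heuristically why only $H_1$ survives in (6.28), one can work through the simplified case $R \equiv 0$, where $X_t = C_{H_1} a(t) B_{H_1}(t) + C_{H_2} b(t) B_{H_2}(t)$. The decomposition below is a sketch under that assumption, not the proof of [BEN 98a]; it only uses $a, b \in C^1$ and $H_1 < H_2 < 1$.

```latex
\begin{aligned}
\frac{X(t+\varepsilon u) - X(t)}{\varepsilon^{H_1}}
 ={}& C_{H_1}\, a(t)\, \frac{B_{H_1}(t+\varepsilon u) - B_{H_1}(t)}{\varepsilon^{H_1}}
   && \text{(same law as } C_{H_1} a(t) B_{H_1}(u)\text{)} \\
 &+ C_{H_1}\, \frac{a(t+\varepsilon u) - a(t)}{\varepsilon^{H_1}}\, B_{H_1}(t+\varepsilon u)
   && \text{(of order } \varepsilon^{1-H_1} \to 0\text{)} \\
 &+ \varepsilon^{H_2 - H_1}\, C_{H_2}\, b(t)\, \frac{B_{H_2}(t+\varepsilon u) - B_{H_2}(t)}{\varepsilon^{H_2}}
   && \text{(of order } \varepsilon^{H_2-H_1} \to 0\text{)} \\
 &+ C_{H_2}\, \frac{b(t+\varepsilon u) - b(t)}{\varepsilon^{H_1}}\, B_{H_2}(t+\varepsilon u)
   && \text{(of order } \varepsilon^{1-H_1} \to 0\text{)}
\end{aligned}
```

The first term is constant in law by the stationarity of the increments and the self-similarity of $B_{H_1}$; the three others vanish as $\varepsilon \to 0^+$, so the smoother component of order $H_2$ is invisible in the local limit.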

A proof of this result can be found in [BEN 98a], in the setting of multifractional processes.

6.3.3. Elliptic Gaussian random processes (EGRP)

Another way of generalizing FBM, which is the approach adopted in [BEN 97b], consists of starting from the reproducing kernel Hilbert space. Returning to Definition 6.3, we can already represent the reproducing kernel Hilbert space norm by means of the operator $J$ and the formula:

$$\langle J(\psi_1), J(\psi_2) \rangle_{H_{B_H}} = \langle \psi_1, \psi_2 \rangle_{L^2(\mathbb{R}^d)}$$

However, this formula can be presented differently. For all functions $f, g$ of the space $D_0$ of $C^\infty$ functions with compact support vanishing at 0:

$$\langle f, g \rangle_{H_{B_H}} = \langle A_H f, g \rangle_{L^2(\mathbb{R}^d)} \tag{6.29}$$

where $A_H$ is a pseudo-differential operator of symbol $C_H^2 \|\xi\|^{1+2H}$ ($C_H$ denotes the constant defined in (6.8)), i.e.:

$$A_H(f)(x) = \int_{\mathbb{R}^d} e^{ix \cdot \xi} \, C_H^2 \|\xi\|^{1+2H} \hat{f}(\xi) \, \frac{d\xi}{(2\pi)^{d/2}} \tag{6.30}$$



A proof of (6.29) is found in Lemma 1.1 of [BEN 97b]. This equation is also equivalent to:

$$A_H = J^{-2} \tag{6.31}$$

It should be noted that the symbol of the operator $A_H$, $\sigma(\xi) = C_H^2 \|\xi\|^{1+2H}$, is homogeneous in $\xi$ and does not depend on $x$, which corresponds respectively to the self-similarity and to the stationarity of the increments of the process. Consequently, it is natural to consider Gaussian processes associated with symbols $\sigma(x, \xi)$ which also depend on the position. The stationarity of the increments is then lost. However, if we impose that $\sigma(x, \xi)$ is elliptic of order $H$, i.e., controlled by $\|\xi\|^{1+2H}$ when $\|\xi\| \to +\infty$, in the precise sense that there exists $C > 0$ such that, for all $x, \xi \in \mathbb{R}^d$:

$$C (1 + \|\xi\|)^{2H+1} \le \sigma(x, \xi) \le \frac{1}{C} (1 + \|\xi\|)^{2H+1} \tag{6.32}$$

we then obtain processes called elliptic Gaussian random processes (EGRP), which locally preserve many properties of the FBM. In this chapter, we define the elliptic Gaussian random processes in a less general manner than in [BEN 97b].

DEFINITION 6.6.– Let $A_X$ be the pseudo-differential operator defined by:

$$\forall f \in D_0, \quad A_X(f)(t) = \int_{\mathbb{R}} e^{it\xi} \sigma(t, \xi) \hat{f}(\xi) \, \frac{d\xi}{(2\pi)^{1/2}} \tag{6.33}$$

with symbol $\sigma(t, \xi)$ satisfying, for $0 < H < 1$, the following hypothesis:

HYPOTHESIS 6.3 (H).– There exists $R > 0$ such that:

– for every $t \in \mathbb{R}$ and for $i = 0$ to 3:

$$\left| \frac{\partial^i \sigma(t, \xi)}{\partial \xi^i} \right| \le C_i (1 + |\xi|)^{2H+1-i} \quad \text{for } |\xi| > R$$

– > such that: $|\sigma(s, \xi) - \sigma(t, \xi)|$ […] $(1 + |\xi|)^{2\alpha+1+\dots} \, |s - t|^{\dots}$

– it is elliptic of order $H$ (see (6.32)).

We then call elliptic Gaussian random processes of order $H$ the Gaussian processes whose reproducing kernel Hilbert space is the closure of $D_0$ for the norm $(\langle A_X f, f \rangle_{L^2(\mathbb{R})})^{1/2}$, equipped with the Hermitian product:

$$\langle f, g \rangle_{H_X} = \langle A_X f, g \rangle_{L^2(\mathbb{R})}$$



Let us make some comments to clarify Definition 6.6. First, we restrict ourselves to one dimension, mainly for the same reasons of simplicity as for the filtered white noises. In addition, let us note that if $\sigma$ does not depend on $t$, then $A_X$ still satisfies:

$$A_X = J_X^{-2} \tag{6.34}$$

for the isometry:

$$J_X(\psi)(y) \overset{\text{def}}{=} \int_{\mathbb{R}} \frac{e^{-iy\xi} - 1}{\sqrt{\sigma(\xi)}} \, \hat{\psi}^{*}(\xi) \, \frac{d\xi}{(2\pi)^{1/2}} \quad \forall \psi \in L^2(\mathbb{R})$$

This amounts to saying that we have a harmonizable representation of $X$:

$$X(t) = \int_{\mathbb{R}} \frac{e^{-it\xi} - 1}{\sqrt{\sigma(\xi)}} \, \hat{W}^{*}(d\xi)$$

It is therefore enough to have an asymptotic expansion of $(e^{-it\xi} - 1)/\sqrt{\sigma(\xi)}$ of the type (6.26) for $X$ to be a filtered white noise. On the other hand, FBMs are not elliptic Gaussian random processes of the type defined earlier, since the lower inequality of ellipticity (6.32) is not satisfied. Moreover, if the symbol depends on $t$, relation (6.34) is no longer true and the elliptic Gaussian random processes are no longer filtered white noises. Let us reconsider Hypothesis 6.3. The first two points are necessary so that the symbol of $A_X$ behaves asymptotically at high frequency "as if" it did not depend on $t$. If we want to distinguish the two models roughly, we can consider that elliptic Gaussian random processes have more regular trajectories, whereas filtered white noises lend themselves better to identification.

Let us therefore return to the way of determining the local regularity of an elliptic Gaussian random process and very briefly summarize the reasoning of [BEN 97b]. The starting point for the study of the regularity of elliptic Gaussian random processes is, as for FBM, a Karhunen-Loève expansion in adapted bases. The selected orthonormal basis is built starting from the basis (6.13) of $L^2(\mathbb{R})$ by setting:

$$\phi_\lambda = (A_X)^{-\frac{1}{2}} (\psi_\lambda)$$

where the fractional power of $A_X$ is defined by means of a symbolic calculus on the operators. This leads to:

$$X = \sum_{\lambda} \eta_\lambda \phi_\lambda \quad \text{a.s. and in } L^2(\Omega) \tag{6.35}$$

The regularity of $X$ is the consequence of "wavelet" type estimates which relate to $\phi_\lambda$ and its first derivative and which resemble (6.16). A precise statement is



found in Theorem 1.1 of [BEN 97b]; the essential point is the decay in $2^{-Hj}$ of the numerator with respect to the scale factor $j$: it is indeed this exponent which governs the almost sure regularity of the process. Thanks to the classical techniques on random series (see Chapters 15 and 16 of [KAH 85]), we find that, "morally", the trajectories are Hölderian of exponent $H$. On page 34 of [BEN 97b], we find a great number of results describing very precisely the properties of the local and global moduli of continuity of elliptic Gaussian random processes; we only mention here, by way of example, a local law of the iterated logarithm.

THEOREM 6.2.– If $X$ is an elliptic Gaussian random process of order $H$, then, for all $t \in \mathbb{R}$, we have:

$$\limsup_{\varepsilon \to 0} \frac{|X(t + \varepsilon) - X(t)|}{|\varepsilon|^{H} \sqrt{\log \log(\frac{1}{\varepsilon})}} = C(t) \quad \text{(a.s.)} \tag{6.36}$$

with $0 < C(t) < +\infty$.

Thus, considering that "the trajectories are Hölderian of exponent $H$" amounts to forgetting the iterated logarithm factor. On the other hand, it should be noted that if we are interested only in the modulus of continuity of elliptic Gaussian random processes, without wanting to specify the limit $C(t)$, then metric entropy techniques (see [LIF 95]), valid for all Gaussian processes, are applicable. Lastly, elliptic Gaussian processes are locally self-similar, provided that their symbol satisfies a convergence property at high frequency.

PROPOSITION 6.2.– If an elliptic Gaussian random process $X$ of order $H$ is associated with a symbol satisfying:

$$\lim_{|\xi| \to +\infty} \frac{\sigma(t, \xi)}{|\xi|^{1+2H}} = a(t) \quad \forall t \in \mathbb{R}$$

then $X$ is locally self-similar with constant multifractional function equal to $H$:

$$\lim_{\varepsilon \to 0^+} \left( \frac{X(t + \varepsilon u) - X(t)}{\varepsilon^{H}} \right)_{u \in \mathbb{R}} \overset{\mathcal{L}}{=} a(t) \, \big( B_H(u) \big)_{u \in \mathbb{R}} \tag{6.37}$$

6.4. Multifractional fields and trajectorial regularity

The examples of the preceding section lead to the principal objection concerning both elliptic Gaussian processes and filtered white noises: the property of local self-similarity shows that, in spite of the modulations introduced by the symbol or filter, the multifractional function remains constant. The multifractional Brownian motion (MBM), introduced independently by [BEN 97b, PEL 96], is a model where a non-trivial multifractional function appears. It can be defined by its harmonizable representation.



DEFINITION 6.7.– Let $h: \mathbb{R}^d \to ]0, 1[$ be a measurable function. We call MBM of function $h$ any field admitting the harmonizable representation:

$$B_h(x) = \frac{1}{v(h(x))} \int_{\mathbb{R}^d} \frac{e^{-ix \cdot \xi} - 1}{\|\xi\|^{\frac{d}{2} + h(x)}} \, W(d\xi) \tag{6.38}$$

where $W(d\xi)$ is a Brownian measure and, for every $s \in ]0, 1[$:

$$v(s) = \left( \int_{\mathbb{R}^d} \frac{2 \left( 1 - \cos(\xi_1) \right)}{\|\xi\|^{d+2s}} \, \frac{d\xi}{(2\pi)^{d/2}} \right)^{1/2} \tag{6.39}$$

where $\xi_1$ is the first coordinate of $\xi$.

As in (6.19), we define a general Brownian measure starting from an orthonormal basis $(g_n(x))_{n \in \mathbb{N}}$ of $L^2(\mathbb{R}^d)$ and a sequence $(\eta_n)_{n \in \mathbb{N}}$ of centered independent Gaussian variables of variance 1, by setting:

$$\int_{\mathbb{R}^d} f(\xi) \, W(d\xi) = \sum_{n \in \mathbb{N}} \eta_n \int_{\mathbb{R}^d} f(\xi) g_n^{*}(\xi) \, \frac{d\xi}{(2\pi)^{d/2}} \tag{6.40}$$

for any function $f$ of $L^2(\mathbb{R}^d)$. In addition, the normalization function $v$ corresponds to the constant $C_H^2$ of (6.8) and it is immediately seen that, if the function $h$ is a constant equal to $H$, the MBM is a fractional Brownian motion of order $H$ (a unifractional Brownian motion). Before studying the properties of the MBM, we establish the link between its harmonizable representation and the moving average representation obtained by [PEL 96].

6.4.1. Two representations of the MBM

To summarize the link between the harmonizable representation and the moving average representation of the MBM, we can say that they are Fourier transforms of each other. To be more precise, let us start from Definition 6.7 for a process indexed by $\mathbb{R}$ (which is the case considered in [PEL 96]). Let us suppose that the Brownian measure of (6.38) is $\hat{W}^{*}(d\xi)$; we have a series expansion of the MBM:

$$B_h(t) = \frac{1}{v(h(t))} \sum_{\lambda \in \Lambda} \eta_\lambda \left\langle \frac{e^{-it \cdot} - 1}{|\cdot|^{\frac{1}{2} + h(t)}}, \hat{\psi}_\lambda \right\rangle_{L^2(\mathbb{R})} \tag{6.41}$$

However, Parseval's identity leads to:

$$\left\langle \frac{e^{-it \cdot} - 1}{|\cdot|^{\frac{1}{2} + h(t)}}, \hat{\psi}_\lambda \right\rangle_{L^2} = \left\langle \widehat{\frac{e^{it \cdot} - 1}{|\cdot|^{\frac{1}{2} + h(t)}}}, \psi_\lambda \right\rangle_{L^2} \tag{6.42}$$


The Fourier transform of $(e^{it \cdot} - 1)/|\cdot|^{\frac{1}{2} + h(t)}$ is calculated by noticing that the transform of a homogeneous distribution is also homogeneous:

$$\widehat{\frac{e^{it \cdot} - 1}{|\cdot|^{\frac{1}{2} + h(t)}}}(s) = C(h(t)) \left( |t - s|^{h(t) - \frac{1}{2}} - |s|^{h(t) - \frac{1}{2}} \right) \tag{6.43}$$

We deduce from it the following theorem, whose proof is found in [COH 99].

THEOREM 6.3.– Let the Brownian measure be $\hat{W}^{*}(d\xi)$ of (6.19). The MBM of harmonizable representation:

$$B_h(t) = \frac{1}{v(h(t))} \int_{\mathbb{R}} \frac{e^{-it\xi} - 1}{|\xi|^{\frac{1}{2} + h(t)}} \, \hat{W}^{*}(d\xi) \tag{6.44}$$

is almost surely equal, up to a deterministic multiplicative function, to the symmetric moving average:

$$\int_{-\infty}^{+\infty} \left[ |t - s|^{h(t) - \frac{1}{2}} - |s|^{h(t) - \frac{1}{2}} \right] W(ds) \tag{6.45}$$

where the Brownian measure is given by:

$$W(ds) = \sum_{\lambda \in \Lambda} \eta_\lambda \psi_\lambda(s) \, ds \tag{6.46}$$

This theorem calls for several comments. First of all, when $h(t) = \frac{1}{2}$:

$$\left[ |t - s|^{h(t) - \frac{1}{2}} - |s|^{h(t) - \frac{1}{2}} \right]$$

is not clearly defined, but the proof of the theorem shows that we must set:

$$|t - s|^{0} - |s|^{0} \overset{\text{def}}{=} \log \frac{1}{|t - s|} - \log \frac{1}{|s|}$$

in (6.45). Now that we know that there is essentially only one MBM, we can state the local self-similarity associated with its multifractional function. On this subject, let us recall Theorem 1.7 of [BEN 97b], which has its counterpart in Proposition 5 of [PEL 96].

PROPOSITION 6.3.– A MBM of function $h$ of Hölder class $C^r$, with $r > \sup_t h(t)$, is locally self-similar with multifractional function $h$.
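The kernel of the moving average (6.45), including the logarithmic convention adopted when $h(t) = \frac{1}{2}$, can be written down directly. The following is a minimal sketch; the function name `ma_kernel` and the choice to raise an error at the singular points are assumptions made for illustration, not part of the source.

```python
import math

def ma_kernel(t, s, h):
    """Kernel of the symmetric moving average (6.45) at time t, for the
    value h = h(t) of the multifractional function.

    For h == 1/2 the source's convention is used:
    |t-s|^0 - |s|^0 := log(1/|t-s|) - log(1/|s|).
    """
    if h <= 0.5 and (s == 0 or s == t):
        # The kernel is singular at s = 0 and s = t when h <= 1/2.
        raise ValueError("kernel is singular at s = 0 and s = t for h <= 1/2")
    if h == 0.5:
        return math.log(1.0 / abs(t - s)) - math.log(1.0 / abs(s))
    e = h - 0.5
    return abs(t - s) ** e - abs(s) ** e

if __name__ == "__main__":
    print(ma_kernel(3.0, 1.0, 0.5))   # log(1/2) - log(1)
    print(ma_kernel(3.0, 1.0, 0.8))   # 2**0.3 - 1
```

Note that $\big(|t-s|^{e} - |s|^{e}\big)/e$ tends to $\log|t-s| - \log|s|$ as $e \to 0$, which is the opposite of the convention above; the difference is absorbed by the deterministic multiplicative function of Theorem 6.3.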



6.4.2. Study of the regularity of the trajectories of the MBM

This section recalls the known results on the regularity of the trajectories of the MBM. To carry out this study, in both [BEN 97b] and [PEL 96], a regularity hypothesis is imposed on the multifractional function $h$ itself. In this section, we assume that the following hypothesis holds:

HYPOTHESIS 6.4.– Function $h$ is Hölderian of exponent $r$ (denoted $h \in C^r$) with:

$$r > \sup_{t \in \mathbb{R}} h(t)$$

This hypothesis, with its surprising formulation, was long thought to be an artifact of the technique of proof. In fact, we will see by outlining the proof of [BEN 97b] that we cannot do better for a MBM and that the obstruction comes from the "low frequencies". Let us begin with the random series representation of the MBM (6.41), which we present differently:

$$B_h(t) = \frac{1}{v(h(t))} \sum_{\lambda \in \Lambda} \eta_\lambda \chi_\lambda(t, h(t)) \tag{6.47}$$

where the function:

$$\chi_\lambda(x, y) = \int_{\mathbb{R}} \frac{e^{-ix\xi} - 1}{|\xi|^{\frac{1}{2} + y}} \, \hat{\psi}_\lambda^{*}(\xi) \, \frac{d\xi}{(2\pi)^{1/2}} \tag{6.48}$$

is analytic in its two variables (the fact that the support of the functions $\hat{\psi}_\lambda$ does not contain 0 is used here). Similarly, the normalization function $v$ is analytic and does not vanish on $]0, 1[$. It follows that, if we truncate the series (6.47) by keeping only a finite number of dyadics $\lambda$, then the resulting random function has the same regularity as the multifractional function $h$. In addition, the irregularity of the trajectories of the MBM is a consequence of high frequency phenomena (i.e., dependent on $\chi_\lambda(t, h(t))$ for $|\lambda| \to +\infty$). We can find in [BEN 97b] the high frequency estimates for the MBM, which are generalizations of (6.16). For every $K \in \mathbb{N}$:

$$|\chi_\lambda(t, h(t))| \le C(K) \, 2^{-h(t)j} \left( \frac{1}{1 + |2^j t - k|^K} + \frac{1}{1 + |k|^K} \right) \tag{6.49}$$

$$|\chi_\lambda(t, h(t)) - \chi_\lambda(s, h(s))| \le C(K) \, 2^{-h(s,t)j} \left( \frac{2^j |t - s| + j |h(t) - h(s)|}{1 + |2^j t - k|^K} + \frac{j |h(t) - h(s)|}{1 + |k|^K} \right) \tag{6.50}$$

with $h(s, t) = \min(h(s), h(t))$. We notice, in particular, the factor $2^{-h(t)j}$ which leads, for reasons identical to those of section 6.3.3, to the conclusion that the



MBM is, up to a logarithmic factor, almost surely Hölderian of exponent $h(t)$. If Hypothesis 6.4 is omitted, it is not difficult to see that "the Hölder exponent" of the MBM at $t$ is given by $\min(h(t), r)$, which amounts to saying that it is the more irregular of the high and low frequency parts of the MBM which imposes the overall regularity. Let us recall one of the results of [BEN 97b].

THEOREM 6.4.– If $X$ is a MBM whose multifractional function satisfies Hypothesis 6.4, then, for all $t \in \mathbb{R}$, we have:

$$\limsup_{\varepsilon \to 0} \frac{|X(t + \varepsilon) - X(t)|}{|\varepsilon|^{h(t)} \sqrt{\log \log(\frac{1}{\varepsilon})}} = C(t) \quad \text{a.s.} \tag{6.51}$$

with $0 < C(t) < +\infty$.

On the simulation of the MBM, see [AYA 00], which gives some indications on this question. We also note the existence of the FracLab toolbox in this field.

6.4.3. Towards more irregularities: generalized multifractional Brownian motion (GMBM) and step fractional Brownian motion (SFBM)

We saw in the preceding section that the MBM provides a model for locally self-similar processes with varying multifractional functions and pointwise exponents. However, Hypothesis 6.4, essential within the strict framework of MBM, is cumbersome for certain applications. Let us give two examples where we would wish for multifractional functions that are less regular than Hölderian.

First, in the rupture models which are important in image segmentation, we want the Hölder exponent to have discontinuities. Let us illustrate this problem with an example. Suppose that we have an aerial image on which we want to distinguish the boundary between a field and a forest. It is usual to model the texture of a forest by a FBM of Hölder exponent $H_1$. In the same way, for the portion of the field, we can think of a FBM of exponent $H_2$. The question arises as to how "to connect" the two processes on the border. One possibility is to consider a MBM in the sense of Definition 6.7, for which function $h$ takes the two values $H_1$ and $H_2$. However, this MBM does not correspond to the image. Indeed, it shows a discontinuity at the place where function $h$ jumps, whereas on an image only the regularity changes suddenly and, most often, the field remains continuous. To model this type of rupture, we recall the construction of the step fractional Brownian motion (SFBM) of [BEN 00].

In addition, the Hölder exponent of the MBM varies too slowly for applications to so-called developed turbulence (see [FRI 97] for an introduction to this subject). Indeed, the physics of turbulence teaches us that the data accessible from measurements are not the Hölder exponents of the studied quantities, but their multifractal spectrum.



The description of the multifractal spectrum is beyond the scope of this chapter (see Chapter 1, Chapter 3 and Chapter 4), but it is enough to know that for a function whose Hölder exponent is itself $C^r$, this spectrum is trivial. We can thus be convinced that the MBM is not a realistic model for developed turbulence. In order to obtain processes whose trajectories have Hölder exponents which vary abruptly, Ayache and Lévy Véhel have proposed a model called the generalized multifractional Brownian motion (GMBM) in [AYA 99]. We present their model in the second part of this section.

6.4.3.1. Step fractional Brownian motion

The multifractional functions associated with the SFBMs are very simple, which makes it possible to have a reasonable model for the identification of ruptures. We limit ourselves to step multifractional functions:

$$h(t) = \sum_{i=0}^{K} 1_{[a_i, a_{i+1}[}(t) \, H_i \tag{6.52}$$

with $a_0 = -\infty$ and $a_{K+1} = +\infty$, and where $(a_i)$ is an increasing sequence of real numbers. Starting from (6.47), we arrive at the following definition.

DEFINITION 6.8.– Let $\Lambda^+ = \{ \frac{k}{2^j}, \text{ for } k \in \mathbb{Z}, j \in \mathbb{N} \}$. By SFBM we mean the process of multifractional function $h$ defined by (6.52):

$$Q_h(t) = \sum_{\lambda \in \Lambda^+} \eta_\lambda \chi_\lambda(t, h(\lambda)) \tag{6.53}$$

where the functions $\chi_\lambda$ are defined by (6.48).

In the preceding definition, there are some differences compared with (6.47). Some of them are technical, like the suppression of the normalization $v(h(t))$ or the absence of negative frequencies. On the other hand, the SFBM has continuous trajectories whereas the MBM which corresponds to a piecewise constant multifractional function is discontinuous. This is due to the replacement of $\chi_\lambda(t, h(t))$ by $\chi_\lambda(t, h(\lambda))$. Indeed, the first function is discontinuous, like $h$, at the points $a_i$, and this jump disappears in $\chi_\lambda(t, h(\lambda))$. However, the fast decay of the functions $t \to \chi_\lambda(t, h(\lambda))$ when $|t - \lambda| \to +\infty$ causes the SFBMs to have local properties very close to those of the MBM away from the jump times of the multifractional function. The following theorem, which describes the regularity of the SFBM more precisely, can be found in [BEN 00].

THEOREM 6.5.– For any open interval $I$ of $\mathbb{R}$, we set:

$$H^{*}(I) = \inf\{ h(t), \text{ for } t \in I \}$$



If $Q_h$ is a SFBM of multifractional function $h$, then $Q_h$ is globally Hölderian of exponent $H$, for all $0 < H < H^{*}(I)$, on any compact interval $J \subset I$.

Thus, in terms of regularity, the SFBM is a satisfactory model. We will see, in section 6.5, that we can completely identify the multifractional function: times and amplitudes of the jumps.

6.4.3.2. Generalized multifractional Brownian motion

Let us now outline the work of [AYA 99]. The authors propose to circumvent the "low frequency" problems encountered within the definition of MBM by replacing the multifractional function $h$ with a sequence of regular functions $h_n$, whose limit, which will play the role of the multifractional function, can be very irregular. Let us first specify the technical conditions relating to the sequence $(h_n)_{n \in \mathbb{N}}$.

DEFINITION 6.9.– A function $h$ is said to be locally Hölderian of exponent $r$ and of constant $c > 0$ on $\mathbb{R}$ if, for all $t_1$ and $t_2$ such that $|t_1 - t_2| \le 1$, we have:

$$|h(t_1) - h(t_2)| \le c |t_1 - t_2|^r$$

Such a function will be called $(r, c)$ Hölderian.

We can consequently define the multifractional sequences which generalize the multifractional functions for the GMBM.

DEFINITION 6.10.– We call multifractional sequence a sequence $(h_n)_{n \in \mathbb{N}}$ of $(r, c_n)$ Hölderian functions with values in an interval $[a, b] \subset ]0, 1[$, and we call its lower limit a generalized multifractional function (GMF):

$$h(t) = \liminf_{n \to +\infty} h_n(t)$$

if $(h_n)_{n \in \mathbb{N}}$ satisfies the following properties:

– for all $\varepsilon > 0$ and all $t_0$, there exist $n_0(t_0, \varepsilon)$ and $h_0(t_0, \varepsilon) > 0$ such that, for all $n > n_0$ and $|h| < h_0$, we have $h_n(t_0 + h) > h(t_0) - \varepsilon$;

– for all $t$, we have $h(t) < r$ and $c_n = O(n)$.

In the preceding definition, it is essential that the generalized multifractional function is a limit when the index $n$ tends towards $+\infty$; we will see that this reflects the high frequency portion of the information contained in the multifractional sequence. In addition, the set of GMFs contains very irregular functions like, for example, for $0 < a < b < 1$:

$$t \longmapsto b + (a - b) 1_F(t)$$



where $F$ is a set of the Cantor type. A proof of this result, as well as a first discussion of the set of GMFs, is found in [AYA 99]. Lastly, a process can be associated with a multifractional sequence in the following manner.

DEFINITION 6.11.– We call GMBM associated with a multifractional sequence denoted by $(h) = (h_n)_{n \in \mathbb{N}}$ any process admitting the harmonizable representation:

$$Y_{(h)}(t) = \int \frac{e^{-it\xi} - 1}{|\xi|^{\frac{1}{2} + h_0(t)}} \, W(d\xi) \; \dots$$

[…] $\frac{1}{2}$, stochastic error is dominant over distortion. Otherwise, the convergence speed is imposed by distortion. The estimation of the second-order factors $(b, H_2)$ is more difficult because it requires building functionals that do not depend asymptotically on the first-order factors. An example of such a functional is given by:

$$V_{N/2} - 2^{2H_1 - 1} V_N$$



The factor $2^{2H_1 - 1}$ is needed to compensate exactly for the influence of the first-order parameters. On the other hand, it must be estimated, and for this we use the convergence:

$$\lim_{N \to +\infty} \frac{V_{N^2/2}}{V_{N^2}} = 2^{2H_1 - 1}$$

which is sufficiently rapid for the compensation to remain effective. An estimator of the parameter $H_2$ is thus obtained.

THEOREM 6.8.– Let:

$$W_N \overset{\text{def}}{=} V_{N/2} - \frac{V_{N^2/2}}{V_{N^2}} V_N \tag{6.63}$$

Then the estimator:

$$\hat{H}_2(N) = \frac{1}{2} - \frac{1}{2} \log_2 \frac{V_{N^2/2}}{V_{N^2}} + \log_2 \frac{W_{N/2}}{W_N} \tag{6.64}$$

converges a.s. towards $H_2$ when $N \to +\infty$.

6.5.2.2. Identification of elliptic Gaussian random processes

Although it is possible to directly identify the symbol of an elliptic Gaussian random process of the form:

$$\sigma(t, \xi) = a(t) |\xi|^{\frac{1}{2} + H_1} + b(t) |\xi|^{\frac{1}{2} + H_2} + p(\xi) \tag{6.65}$$

when $0 < H_2 < H_1 < 1$, for $a$ and $b$ two strictly positive $C^1$ functions, and for $p$ a $C^\infty$ function such that $p(\xi) = 1$ if $|\xi| \le 1$ and $p(\xi) = 0$ if $|\xi| > 2$ (see [BEN 94]), a comparison carried out in [BEN 97a] between filtered white noises and elliptic Gaussian random processes makes it possible to obtain the result more easily. Let us make several comments on the symbols which we identify. The symbols of form (6.65) satisfy Hypothesis 6.3 ($H_1$). In particular, function $p$ was introduced so that the elliptic inequality of order $H_1$ would be satisfied at $\xi = 0$. In fact, (6.65) should be understood as an expansion in fractional powers at high frequency ($|\xi| \to +\infty$) of a general symbol. The identification of the symbol of an elliptic Gaussian random process $X$ comes from the comparison of $X$ with the filtered white noise:

$$Y_t = \int_{\mathbb{R}} \frac{e^{-it\xi} - 1}{\sqrt{\sigma(t, \xi)}} \, \hat{W}^{*}(d\xi) \tag{6.66}$$

This explains why the order of the powers for an elliptic Gaussian random process is reversed compared with that of a filtered white noise. The identification results for elliptic Gaussian processes can be summarized by recalling the following theorem.



THEOREM 6.9.– If $X$ is an elliptic Gaussian random process whose symbol satisfies (6.65) and:

$$\sup\left( 0, \frac{3H_1 - 1}{2} \right) < H_2$$

then the estimators:

$$\tilde{H}_N = \frac{1}{2} \left( \log_2 \frac{V_{N/2}}{V_N} + 1 \right) \tag{6.67}$$

$$\tilde{J}_N(w) = \frac{N^{2\tilde{H}_N - 1} V_N(w)}{F(2\tilde{H}_N)} \tag{6.68}$$

for $w \in C^2[0, 1]$ with support included in $]0, 1[$ and:

$$(\hat{H}_2)_N = \frac{1}{2} + \frac{1}{2} \log_2 \frac{V_{N^2/2}}{V_{N^2}} - \log_2 \frac{W_{N/2}}{W_N} \tag{6.69}$$

where $W_N$ is defined by (6.63), converge almost surely when $N \to +\infty$ towards, respectively:

$$H_1, \quad J(w) = \int_0^1 \frac{w(t)}{a(t)} \, dt, \quad H_2$$

It should be noted that, for elliptic Gaussian random processes, an additional condition is needed for the identification of the parameter $H_2$, namely $(3H_1 - 1)/2 < H_2$. This hypothesis is not only technical; a similar hypothesis is found in [INO 76] for Markovian Gaussian fields of order $p$: only the monomials of highest degree of a polynomial symbol are identifiable. We can thus only hope, within our framework, to identify the principal part of the symbol $\sigma$.

6.5.2.3. Identification of MBM

To identify the multifractional function of a MBM, the generalized quadratic variations must be suitably localized. In this case, a pointwise estimator of $h$ is proposed in [BEN 98a]. We must, however, stress that we can prove the convergence of the estimators only for regular multifractional functions: in this section, we suppose that the following hypothesis holds.

HYPOTHESIS 6.5.– Function $h$ is of class $C^1$.

This hypothesis is, of course, more restrictive than Hypothesis 6.4. Let us specify the principles of the localization of the generalized quadratic variations. A natural method consists of using the weight function:

$$w = 1_{[t_0, t_1]} \quad \text{for } 0 < t_0 < t_1 < 1$$



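We thus obtain a variation localized in the interval $[t_0, t_1]$, which we denote:

$$V_N(t_0, t_1) = V_N(w) \tag{6.70}$$

$$V_N(t_0, t_1) = \sum_{\{p \in \mathbb{Z}, \; t_0 \le \frac{p}{N} \le t_1\}} \left[ X\left( \frac{p+2}{N} \right) - 2X\left( \frac{p+1}{N} \right) + X\left( \frac{p}{N} \right) \right]^2 \tag{6.71}$$

We can now define an estimator:

$$h_N(t_0, t_1) = \frac{1}{2} \left( \log_2 \frac{V_{N/2}(t_0, t_1)}{V_N(t_0, t_1)} + 1 \right)$$

which converges, when $N \to +\infty$, towards:

$$\inf\{ h(s), \; s \in ]t_0, t_1[ \} \quad \text{(a.s.)} \tag{6.72}$$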
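The localized estimator built from (6.71) is simple to compute from a regularly sampled trajectory. The following is a minimal sketch, assuming samples $X(p/N)$ for $p = 0, \dots, N$ with $N$ even, and obtaining $V_{N/2}$ by keeping every other sample; the function names and the Brownian test path are illustrative choices, not taken from the source.

```python
import math
import random

def quadratic_variation(samples, t0=0.0, t1=1.0):
    """Generalized quadratic variation (6.71): sum of squared second
    differences over the p with t0 <= p/N <= t1, where samples[p] = X(p/N)."""
    n = len(samples) - 1
    total = 0.0
    for p in range(n - 1):
        if t0 <= p / n <= t1:
            d = samples[p + 2] - 2.0 * samples[p + 1] + samples[p]
            total += d * d
    return total

def estimate_h(samples, t0=0.0, t1=1.0):
    """Estimator (1/2) * (log2(V_{N/2} / V_N) + 1) on [t0, t1].
    Assumes N = len(samples) - 1 is even."""
    v_n = quadratic_variation(samples, t0, t1)
    v_half = quadratic_variation(samples[::2], t0, t1)  # mesh 2/N
    return 0.5 * (math.log2(v_half / v_n) + 1.0)

def brownian_path(n, seed=0):
    """Standard Brownian motion sampled at p/n (an FBM of order 1/2),
    used here only as a test signal."""
    rng = random.Random(seed)
    step = 1.0 / math.sqrt(n)
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + step * rng.gauss(0.0, 1.0))
    return path

if __name__ == "__main__":
    x = brownian_path(2 ** 14, seed=1)
    print(estimate_h(x))              # expected to be close to 0.5
    print(estimate_h(x, 0.25, 0.75))  # localized version
```

For a smooth trajectory the second differences scale like $N^{-2}$ and the estimator saturates near 2, which is one way to see that second-order differences can only identify exponents in the range of interest $]0, 1[$ with room to spare.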

Indeed, it is the worst Hölder regularity which dominates in this estimate. We deduce from this intermediate stage that we must reduce the size of the observation interval $[t_0, t_1]$ as $N$ increases if we want to estimate $h(t)$. Let us set:

$$V_{N, \varepsilon}(t) \overset{\text{def}}{=} V_N(t - \varepsilon, t + \varepsilon) \tag{6.73}$$

$$\hat{h}_{\varepsilon, N}(t) \overset{\text{def}}{=} \frac{1}{2} \left( \log_2 \frac{V_{N/2, \varepsilon}(t)}{V_{N, \varepsilon}(t)} + 1 \right) \tag{6.74}$$

and let us apply the general principles of section 6.5 to $V_{N, \varepsilon}(t)$:

$$V_{N, \varepsilon}(t) \approx \frac{1}{N} \sum_{\frac{p}{N} \in [t - \varepsilon, t + \varepsilon]} C^2\left( \frac{p}{N}, \omega \right) N^{1 - 2h(\frac{p}{N})} \tag{6.75}$$

It is clear that the larger $\varepsilon$ is, the smaller the stochastic error, by a law of large numbers, since a great number of variables are summed; however, for a large $\varepsilon$, we introduce a significant distortion by having replaced:

$$N^{1 - 2h(t)} \quad \text{by} \quad N^{1 - 2h(\frac{p}{N})}$$

The choice of the speed of convergence of $\varepsilon$ towards 0 is made in the following theorem, taken from [BEN 98a].

THEOREM 6.10.– Let $X$ be a MBM of harmonizable representation:

$$B_h(t) = \frac{1}{v(h(t))} \int_{\mathbb{R}} \frac{e^{-it\xi} - 1}{|\xi|^{\frac{1}{2} + h(t)}} \, W(d\xi)$$



associated with a multifractional function $h$ satisfying Hypothesis 6.5. For $\varepsilon = N^{-\alpha}$ with $0 < \alpha < \frac{1}{2}$ and $N \to \infty$:

$$\hat{h}_{\varepsilon, N}(t) \longrightarrow h(t) \quad \text{(a.s.)}$$

For $\varepsilon = N^{-1/3}$:

$$E\left( \hat{h}_{\varepsilon, N}(t) - h(t) \right)^2 = O\left( \log^2(N) \, N^{-2/3} \right)$$

In the preceding statement, we used the normalization $1/v(h(t))$, but the result is unchanged if this factor is replaced by any $C^1$ function of $t$. In addition, the choice of $\varepsilon = N^{-1/3}$ makes the contributions of the stochastic error and of the distortion of the same order and thus, in a certain sense, asymptotically minimizes the upper bound obtained for the quadratic risk.

6.5.2.4. Identification of SFBMs

In the case of a SFBM, the multifractional function to estimate is a piecewise constant function $h$ given by (6.52), and we build an estimator of:

$$\Theta_0 = (a_1, \dots, a_K; H_0, \dots, H_K)$$

starting from the quadratic variation:

$$\tilde{V}_N(s, t) = \frac{1}{N} \sum_{p \, : \, s \le \frac{p}{n} < \cdots} \left[ Q_h\left( \frac{p+2}{n} \right) - 2 Q_h\left( \frac{p+1}{n} \right) + Q_h\left( \frac{p}{n} \right) \right]^2 \tag{6.76}$$

[…] $> 0$ on the right of $t$ and on the left of $t$:

$$D_N(A, t) = f_N(t, t + A) - f_N(t - A, t)$$

(6.78)

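Let us suppose that $\nu_0 = \min_{i=1,\dots,K-1} |a_{i+1} - a_i|$, the minimal distance between two jumps of $h$, is known, as well as a lower bound $\eta_0$ of the absolute value of the magnitude of the jumps $\delta_i = H_i - H_{i-1}$. By taking $A < \nu_0$, we obtain the convergence of $D_N(A, t)$ towards:

$$D_\infty(A, t) = \sum_{i \text{ such that } \delta_i > 0} \delta_i 1_{[a_i, a_i + A[}(t) + \sum_{i \text{ such that } \delta_i < 0} \delta_i 1_{[a_i - A, a_i[}(t) \tag{6.79}$$

[…] (i.e., $\delta_i > 0$), or on the left of the rupture time $a_i$ in the case of a negative jump (i.e., $\delta_i < 0$). Consequently, for any threshold $\eta \in ]\eta_0/2, \eta_0[$ and any window size $A < \nu_0$, we estimate the first time of positive jump of $D_\infty(A, \cdot)$ by the first time $l/N$ such that $D_N(A, l/N) \ge \eta$, then the second by the same method but at a distance of at least $A$ from the first. More formally, we set:

$$\hat{\tau}_1^{(N)} = \frac{1}{N} \min\{ l \in \mathbb{Z}, \; D_N(A, l/N) \ge \eta \}$$

where $\hat{\tau}_1^{(N)} = +\infty$ when $D_N(A, l/N) < \eta$, $\forall l \in \mathbb{Z}$.

If $\hat{\tau}_\ell^{(N)} < +\infty$:

$$\hat{\tau}_{\ell+1}^{(N)} = \frac{1}{N} \min\{ l \in \mathbb{Z}, \; l/N \ge \hat{\tau}_\ell^{(N)} + A \text{ and } D_N(A, l/N) \ge \eta \}$$

and

$$\hat{\varsigma}_1^{(N)} = \frac{1}{N} \max\{ l \in \mathbb{Z}, \; D_N(A, l/N) \le -\eta \}$$

where $\hat{\varsigma}_1^{(N)} = -\infty$ when $D_N(A, l/N) > -\eta$, $\forall l \in \mathbb{Z}$.

If $\hat{\varsigma}_m^{(N)} > -\infty$:

$$\hat{\varsigma}_{m+1}^{(N)} = \frac{1}{N} \max\{ l \in \mathbb{Z}, \; l/N \le \hat{\varsigma}_m^{(N)} - A \text{ and } D_N(A, l/N) \le -\eta \}$$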
By uniting the two preceding families and then sorting in ascending order, we (N ) (N ) obtain a family of estimators of the jump times of h: (ˆ a1 , . . . , a ˆκN ) and we estimate


235

(N ) (N ) ˆ (N ) = fN (ˆ the value of h at the jump moments by assuming H a1 −10A, a ˆ1 −5A) 0 (N ) (N ) (N ) ) ˆι ˆ κ(N and H = fN (ˆ aι + A/3, a ˆι+1 − A/3) for ι = 1, . . . , κN − 1 and H = N (N ) (N ) (N ) (N ) fN (ˆ a1 +5A, a ˆ1 +10A). To finish, we build the estimator ΘN = (ˆ a1 , . . . , a ˆκN ; ) ˆ (N ) , . . . , H ˆ κ(N H N ), whose consistency is established in [BEN 00]. 0

THEOREM 6.11.– Let $Q_h$ be a step fractional Brownian motion with multifractional function $h(\cdot)$ satisfying (6.52). If, moreover, $A < \nu_0$ and $\eta \in ]\eta_0/2, \eta_0[$, then we have $\lim_{N \to +\infty} \Theta_N = \Theta_0$ a.s. with $\Theta_0 = (a_1, \dots, a_K; H_0, \dots, H_K)$.

6.6. Bibliography

[AYA 99] AYACHE A., LÉVY VÉHEL J., "Generalised multifractional Brownian motion: definition and preliminary results", in DEKKING M., LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals: Theory and Applications in Engineering, Springer-Verlag, p. 17–32, 1999.
[AYA 00] AYACHE A., COHEN S., LÉVY VÉHEL J., "The covariance structure of multifractional Brownian motion, with application to long range dependence", in Proceedings of ICASSP (Istanbul, Turkey), 2000.
[AYA 04a] AYACHE A., BENASSI A., COHEN S., LÉVY VÉHEL J., "Regularity and identification of generalized multifractional Gaussian processes", in Séminaire de Probabilités XXXVIII – Lecture Notes in Mathematics, Springer-Verlag Heidelberg, vol. 1857, p. 290–312, 2004.
[AYA 04b] AYACHE A., LÉVY VÉHEL J., "On the identification of the pointwise Hölder exponent of the generalized multifractional Brownian motion", Stoch. Proc. Appl., vol. 111, p. 119–156, 2004.
[BEN 94] BENASSI A., COHEN S., JAFFARD S., "Identification de processus Gaussiens elliptiques", C. R. Acad. Sc. Paris, series, vol. 319, p. 877–880, 1994.
[BEN 97a] BENASSI A., COHEN S., ISTAS J., JAFFARD S., "Identification of elliptic Gaussian random processes", in LÉVY VÉHEL J., TRICOT C. (Eds.), Fractals and Engineering, Springer-Verlag, p. 115–123, 1997.
[BEN 97b] BENASSI A., JAFFARD S., ROUX D., "Gaussian processes and pseudodifferential elliptic operators", Revista Matemática Iberoamericana, vol. 13, no. 1, p. 19–89, 1997.
[BEN 98a] BENASSI A., COHEN S., ISTAS J., "Identifying the multifractional function of a Gaussian process", Statistics and Probability Letters, vol. 39, p. 337–345, 1998.
[BEN 98b] BENASSI A., COHEN S., ISTAS J., JAFFARD S., "Identification of filtered white noises", Stoch. Proc. Appl., vol. 75, p. 31–49, 1998.
[BEN 00] BENASSI A., BERTRAND P., COHEN S., ISTAS J., "Identification of the Hurst exponent of a step multifractional Brownian motion", Statistical Inference for Stochastic Processes, vol. 3, p. 101–110, 2000.
[BER 00] BERTRAND P., "A local method for estimating change points: the hat-function", Statistics, vol. 34, no. 3, p. 215–235, 2000.
[COH 99] COHEN S., "From self-similarity to local self-similarity: the estimation problem", in DEKKING M., LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals: Theory and Applications in Engineering, Springer-Verlag, p. 3–16, 1999.
[FRI 97] FRISCH U., Turbulence, Cambridge University Press, 1997.
[INO 76] INOUÉ K., "Equivalence of measures for some class of Gaussian random fields", J. Multivariate Anal., vol. 6, p. 295–308, 1976.
[IST 96] ISTAS J., "Estimating the singularity function of a Gaussian process with applications", Scand. J. Statist., vol. 23, no. 5, p. 581–596, 1996.
[KAH 85] KAHANE J.P., Some Random Series of Functions, Cambridge University Press, second edition, 1985.
[LEO 89] LEON J.R., ORTEGA J., "Weak convergence of different types of variation for biparametric Gaussian processes", in Colloquia Math. Soc. J. Bolyai no. 57, 1989.
[LIF 95] LIFSHITS M.A., Gaussian Random Functions, Kluwer Academic Publishers, 1995.
[MAN 68] MANDELBROT B.B., VAN NESS J.W., "Fractional Brownian motions, fractional noises, and applications", SIAM Review, vol. 10, p. 422–437, 1968.
[MEY 90a] MEYER Y., Ondelettes et opérateurs, Hermann, Paris, vol. 1, 1990.
[MEY 90b] MEYER Y., Ondelettes et opérateurs, Hermann, Paris, vol. 2, 1990.
[NEV 68] NEVEU J., Processus aléatoires Gaussiens, Montreal University Press, SMS, 1968.
[PEL 96] PELTIER R.F., LÉVY VÉHEL J., "Multifractional Brownian motion: definition and preliminary results", 1996 (available at http://www-syntim.inria.fr/fractales).
[SAM 94] SAMORODNITSKY G., TAQQU M.S., Stable Non-Gaussian Random Processes, Chapman & Hall, 1994.
[YAG 87] YAGLOM A.M., Correlation Theory of Stationary and Related Random Functions. Volume I: Basic Results, Springer, 1987.

Chapter 7

An Introduction to Fractional Calculus

7.1. Introduction

7.1.1. Motivations

We give some traditional examples of applications of fractional calculus and then briefly point out the theoretical references.

7.1.1.1. Fields of application

The modeling of certain physical phenomena, described as having long memory, can be carried out by introducing integro-differential terms with weakly singular kernels (i.e., locally integrable but not necessarily continuous, like $t^{\alpha-1}$ when $0 < \alpha < 1$) into the equations of the dynamics of materials. This is very frequent, for example, in linear viscoelasticity with long memory, where a fractional stress-strain dynamic relation can be proposed: see [BAG 86] for viscoelasticity; [KOE 84, KOE 86] for a somewhat more formal presentation; [BAG 91] for a rich and quite detailed example; [BAG 83a] for a modal analysis in the forced regime or [BAG 85] for a modal analysis in the transient regime; and finally [CAP 76] for a model which uses a partial differential equation with a fractional derivative in time. There are also applications to modeling in polymer chemistry [BAG 83b] and to the modeling of dynamics at the interface of fractal structures: see [LEM 90] for the applied physics aspect and [GIO 92] for the theoretical physics aspect.

Chapter written by Denis MATIGNON.



Moreover, fractional derivatives can appear naturally when a dynamic phenomenon is strongly conditioned by the geometry of the problem: a simple, very instructive example is presented in [TOR 84]. See in particular [CARP 97] for examples in continuum mechanics and [POD 99] for many applications in engineering sciences.

7.1.1.2. Theories

A detailed historical overview of the theory of fractional derivatives is given in [OLD 74]; moreover, this work is undoubtedly one of the first attempts to assemble scattered results. More recently, a theoretical synthesis was proposed in [MIL 93], where certain algebraic aspects of fractional differential equations of rational order are fully developed. In mathematics, the Russian work [SAM 87] is authoritative; it compiles a unique collection of definitions and theories. Pseudo-differential operators are treated in Chapter 7 of [TAY 96], and the first article on the concept of diffusive representation was, as far as we know, section 5 of [STA 94]. During the last 10 years, a number of themes have developed: see, in particular, the book [MAT 98c] and [MON 98], for a general theoretical framework and for several applications derived from it.

7.1.2. Problems

From a mathematical point of view, these integro-differential relations, or convolutions with locally integrable kernels ($L^1_{loc}$, i.e., absolutely integrable on any interval $[a, b]$), are not simple to treat. Analytically, the singular character of the kernel $t^{\alpha-1}$ (with $0 < \alpha < 1$) prevents the use of theorems based on the regularity of the kernel (in [DAUT 84a], for example, the kernels are always assumed to be continuous). Numerically, it is not simple to treat this singularity at the time origin (although this is a priori possible by carrying out an integration by parts, which artificially increases the order of differentiation of the unknown function while keeping a convolution with a more regular kernel).

In the theories mentioned in section 7.1.1.2, several problems appear. First, the definition of fractional derivatives poses problems for orders higher than 1 (in particular, fractional derivatives do not commute, which is extremely awkward and, in addition, the composition of fractional integration and differentiation of the same order does not necessarily give the identity); this leads, in practice, to rather careful calculations and to reinserting the formal solution a posteriori into the starting equation, to check the coherence of the result. Second, the question of initial conditions for fractional differential equations is not truly solved: we are obliged to define zero or infinite initial values. Lastly, the true analytical nature of solutions can be masked by closed-form solutions involving a great number of special functions, which facilitates neither the characterization of important analytical properties of these solutions nor their numerical simulation.

The focus of our work is the theory of fractional differential equations (FDE): first, we clarify various definitions by using the framework of causal distributions (i.e., generalized functions whose support is the positive real axis) and by interpreting results on functions expandable in fractional power series of order $\alpha$ ($\alpha$-FPSE); second, we clarify problems related to fractional differential equations by formulating solutions in a compact general form; and third, we establish a strong link with diffusive representations of pseudo-differential operators (DR of PDO), a concept which is nearly unavoidable when the orders of differentiation are arbitrary. Finally, we study the extension to several variables by treating a fractional partial differential equation (FPDE), which in fact constitutes a modal analysis of fractional order.

7.1.3. Outline

This chapter is composed of four distinct sections.

First, in section 7.2, we give definitions of the fundamental concepts necessary for the study and handling of the fractional formalism. We recall the definition of fractional integration in section 7.2.1. We show in section 7.2.2 that the inversion of this functional relation can be correctly defined within the framework of causal distributions and we examine the fundamental solutions directly connected to this operator. Lastly, we adopt a definition which is easier to handle, i.e., a “mild” fractional derivative, so as to be able to use fractional derivatives on regular causal functions.
We examine in section 7.2.3 the eigenfunctions of this new operator and show its structural relationship with a generalization of Taylor expansions for functions that are not differentiable at the time origin, such as the functions expandable in fractional power series.

Then, in section 7.3, we are interested in fractional differential equations. These are linear relations in a fractional derivative operator and its successive powers; it appears naturally that rational orders play an important role, since certain powers are directly related to the usual derivatives of integer order. We thus examine fractional differential equations in the context of causal distributions (section 7.3.2) and of functions expandable into fractional power series (section 7.3.3). We then tackle, in section 7.3.4, the asymptotic behavior of the fundamental solutions of these fractional differential equations, i.e., the divergence in modulus, the pseudo-periodicity or the convergence towards zero of the eigenfunctions of fractional derivatives (which play a role similar to that of the exponential function in the case of integer order). Finally, in section 7.3.5, we examine a class of controlled and observed linear dynamic systems of fractional order and address some typical issues of automatic control.

Then, in section 7.4, we consider fractional differential equations in one variable whose orders of differentiation are not commensurate: there are no simple algebraic tools at our disposal in the frequency domain and the work carried out in the case of commensurate orders no longer applies. We further examine the strong link which exists with diffusive representations of pseudo-differential operators. We give some simple ideas and elementary properties and then present a general decomposition result for the solutions of fractional differential equations into a localized (or integer order) part and a diffusive part.

Finally, in section 7.5, we show that the preceding theory in the time variable (which, in the commensurate case, relies on polynomials and rational fractions in the frequency domain) can be extended to several variables in the case of fractional partial differential equations (we then obtain more general meromorphic functions which are not rational fractions). To this end, we treat one example completely: the wave equation with viscothermal losses at the walls of acoustic pipes, i.e., a partial differential equation which involves a time derivative of order three halves.

Throughout this chapter, we will treat the half-order as an example, in order to illustrate our point.

This chapter has been inspired by several articles, particularly [AUD 00, MAT 95a]. This personal work is also the fruit of collaborations with various researchers including d’Andréa-Novel, Audounet, Dauphin, Heleschewitz and Montseny. More recently, new co-authors have helped enlarge the perspective of our work: let us mention Hélie, Haddar, Prieur and Zwart.

7.2. Definitions

7.2.1. Fractional integration

The primitive of an integrable function $f$ which vanishes at the initial time $t = 0$, iterated an integer number $n$ of times and denoted $I^n f$, is nothing other than the convolution of $f$ with the polynomial kernel $Y_n(t) = t_+^{n-1}/(n-1)!$:

$$I^n f(t) \stackrel{\text{def}}{=} \int_0^t d\tau_1 \int_0^{\tau_1} d\tau_2 \cdots \int_0^{\tau_{n-1}} f(\tau_n)\, d\tau_n = (Y_n \star f)(t).$$

By extension, we define [MIL 93] the primitive $I^\alpha f$ of any order $\alpha > 0$ by using Euler's $\Gamma$ function, which extends the factorial.



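DEFINITION 7.1.– The primitive of order $\alpha > 0$ of a causal, locally integrable function $f$ is given by:

$$I^\alpha f(t) \stackrel{\text{def}}{=} (Y_\alpha \star f)(t) \quad \text{where we have set} \quad Y_\alpha(t) = \frac{t_+^{\alpha-1}}{\Gamma(\alpha)}. \tag{7.1}$$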

PROPOSITION 7.1.– The property $Y_\alpha \star Y_\beta = Y_{\alpha+\beta}$ makes it possible to write the fundamental composition law: $I^\alpha \circ I^\beta = I^{\alpha+\beta}$ for $\alpha > 0$ and $\beta > 0$.

Proof. To establish the property, it is enough to check that the exponents coincide, the numerical coefficient coming from the properties of the function $\Gamma$:

$$(Y_\alpha \star Y_\beta)(t) \propto \int_0^t (t-\tau)^{\alpha-1}\,\tau^{\beta-1}\, d\tau = t^{\alpha+\beta-1} \int_0^1 (1-x)^{\alpha-1}\, x^{\beta-1}\, dx \propto Y_{\alpha+\beta}(t).$$

To establish the fundamental composition law, it is enough to use the fact that the convolution of functions is associative, hence:

$$I^{\alpha+\beta} f \stackrel{\text{def}}{=} Y_{\alpha+\beta} \star f = (Y_\alpha \star Y_\beta) \star f = Y_\alpha \star (Y_\beta \star f) = I^\alpha \{ I^\beta f \}.$$

PROPOSITION 7.2.– The Laplace transform of $Y_\alpha$ for $\alpha > 0$ is:

$$\mathcal{L}[Y_\alpha](s) = s^{-\alpha} \quad \text{for } \Re e(s) > 0 \tag{7.2}$$

i.e., with the right-half complex plane as region of convergence.

Proof. A direct calculation for $s > 0$ provides:

$$\mathcal{L}[Y_\alpha](s) \stackrel{\text{def}}{=} \int_0^{+\infty} \frac{t^{\alpha-1}}{\Gamma(\alpha)}\, e^{-st}\, dt = s^{-\alpha}\, \frac{1}{\Gamma(\alpha)} \int_0^{+\infty} x^{\alpha-1} e^{-x}\, dx = s^{-\alpha}$$

according to the definition of $\Gamma$; the result is extended to $\Re e(s) > 0$ by analyticity.

NOTE 7.1.– We see in particular that the delicate meaning to be given to a fractional power of the complex variable $s$ is perfectly defined: $s \mapsto s^\alpha$ denotes the analytic continuation of the power function on the positive reals. It is the principal branch of the multivalued function $s \mapsto s^\alpha$; it has Hermitian symmetry.
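The composition law of Proposition 7.1 lends itself to a quick numerical check. The following is only a minimal sketch, not part of the original text: the function names `Y` and `convolve_Y` are illustrative, the restriction to $\alpha, \beta \geq 1/2$ is an assumption made so that the integrand stays bounded, and the Beta integral of the proof is evaluated with a plain midpoint rule after the substitution $x = \sin^2\theta$.

```python
import math


def Y(alpha: float, t: float) -> float:
    """Causal kernel Y_alpha(t) = t_+^(alpha-1) / Gamma(alpha), for alpha > 0."""
    return t ** (alpha - 1) / math.gamma(alpha) if t > 0 else 0.0


def convolve_Y(alpha: float, beta: float, t: float, n: int = 2000) -> float:
    """Approximate (Y_alpha * Y_beta)(t) for t > 0 (assumes alpha, beta >= 1/2).

    With tau = t*x the convolution equals t^(alpha+beta-1) times the Beta
    integral of the proof; the substitution x = sin(theta)^2 turns it into
    2 * int_0^{pi/2} cos^(2 alpha - 1) * sin^(2 beta - 1) d theta,
    whose integrand is bounded, so a midpoint rule is sufficient.
    """
    h = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += math.cos(theta) ** (2 * alpha - 1) * math.sin(theta) ** (2 * beta - 1)
    beta_integral = 2.0 * h * total
    return t ** (alpha + beta - 1) * beta_integral / (math.gamma(alpha) * math.gamma(beta))
```

For instance, `convolve_Y(0.5, 0.5, t)` should reproduce $Y_1(t) = 1$ for any $t > 0$, which is the statement $I^{1/2} \circ I^{1/2} = I^1$ at the level of kernels.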



PROPOSITION 7.3.– For a causal function $f$ which has a Laplace transform in $\Re e(s) > a_f$, we have:

$$\mathcal{L}[I^\alpha f](s) = s^{-\alpha}\, \mathcal{L}[f](s) \quad \text{for } \Re e(s) > \max(0, a_f). \tag{7.3}$$

Proof. This follows from Proposition 7.2 and from the fact that the Laplace transform turns a convolution into a product of Laplace transforms. In particular, Proposition 7.1 can be proved very simply by noting that $s^{-\alpha-\beta} = s^{-\alpha} s^{-\beta}$ for $\Re e(s) > 0$.
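As an illustration of Definition 7.1 in the half-order case, the primitive $I^{1/2} f = Y_{1/2} \star f$ can be approximated numerically. The sketch below is not taken from the original text: the name `half_integral` and the quadrature choice are assumptions. The substitution $\tau = u^2$ removes the integrable singularity of $Y_{1/2}$ at $\tau = 0$, after which a midpoint rule suffices.

```python
import math


def half_integral(f, t: float, n: int = 2000) -> float:
    """Approximate I^{1/2} f(t) = int_0^t f(t - tau) / sqrt(pi * tau) d tau.

    With tau = u**2 the weakly singular kernel disappears:
    I^{1/2} f(t) = (2 / sqrt(pi)) * int_0^{sqrt(t)} f(t - u**2) du,
    which is evaluated here with a midpoint rule on n cells.
    """
    if t <= 0:
        return 0.0  # causal convention: nothing before the time origin
    h = math.sqrt(t) / n
    total = sum(f(t - ((i + 0.5) * h) ** 2) for i in range(n))
    return 2.0 / math.sqrt(math.pi) * h * total
```

Applied to the constant function 1 (i.e., to $Y_1$), the routine should return $Y_{3/2}(t) = 2\sqrt{t/\pi}$; applied to $f(t) = t$ (i.e., to $Y_2$), it should return $Y_{5/2}(t) = t^{3/2}/\Gamma(5/2)$, in agreement with $Y_{1/2} \star Y_\beta = Y_{\beta + 1/2}$.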

EXAMPLE 7.1.– For $f$ locally integrable, i.e. $f \in L^1_{loc}$, we thus obtain:

$$I^{1/2} f(t) = \int_0^t \frac{1}{\sqrt{\pi\tau}}\, f(t-\tau)\, d\tau.$$

7.2.2. Fractional derivatives within the framework of causal distributions

7.2.2.1. Motivation

The idea of fractional derivatives of causal functions (or signals) is to obtain an inverse formula to that of fractional integration defined by (7.1), i.e.:

$$f = I^\alpha (D^\alpha f).$$

This is a rather delicate problem; it can be solved by calling upon the theory of Volterra integral equations (see, for example, [KOE 84]). However, one of the major difficulties is the composition law, or law of exponents, in particular because fractional derivatives and integrals do not always commute, which poses delicate practical problems.

That is why we propose to carry out the inversion of (7.1) within the more general framework of causal distributions (i.e., the space $\mathcal{D}'_+$ of distributions whose support is the positive real axis of the time variable), referring to [SCH 65] in particular, even if it means returning later, in section 7.2.3, to an interpretation in terms of an internal operation in a particular class of functions.

7.2.2.1.1. Passage to the distributions

Following Definition 7.1, we naturally set $I^0 f \stackrel{\text{def}}{=} f$, which gives, according to (7.1), $f = Y_0 \star f$. It is clear that no locally integrable function $Y_0$ can be a solution of this convolution equation; on the other hand, the Dirac distribution is the neutral element of the convolution of distributions [SCH 65]. Hence, necessarily:

$$Y_0 \stackrel{\text{def}}{=} \delta \tag{7.4}$$

and the convolution in equation (7.1) is to be taken in the sense of distributions.



7.2.2.1.2. Framework of causal distributions

We could place ourselves within the general framework of distributions, but convolution (which is the basic functional relation for invariant linear systems) is not, in general, associative there. When the supports are bounded from below (which is usually the case when we are interested in causal signals), we obtain the property known as convolutive supports, which makes convolution associative.

Therefore, in $\mathcal{D}'_+$, which is a convolution algebra, the convolution product is associative and the convolutive inverse of a distribution, if it exists, is unique (see lesson 32 of [GAS 90]), which allows a direct use of the impulse response $h$ of a causal linear system. Indeed, let us consider the general convolution equation in the unknown $y \in \mathcal{D}'_+$:

$$P \star y = x \tag{7.5}$$

where $P \in \mathcal{D}'_+$ represents the system and $x \in \mathcal{D}'_+$ the known causal input; let $h$ be the impulse response of the system, defined by:

$$P \star h = \delta.$$

Then $y = h \star x$ is the solution of equation (7.5); indeed:

$$P \star y = P \star (h \star x) = (P \star h) \star x = \delta \star x = x$$

thanks to the associativity of the convolution of causal distributions. We thus follow [GUE 72] to define the fractional derivatives $D^\alpha$.

DEFINITION 7.2.– The derivative in the sense of causal distributions of $f \in \mathcal{D}'_+$ is:

$$D^\alpha f \stackrel{\text{def}}{=} Y_{-\alpha} \star f \quad \text{where we have} \quad Y_{-\alpha} \star Y_\alpha = \delta \tag{7.6}$$

i.e., $Y_{-\alpha}$ is the convolutive inverse of $Y_\alpha$ in $\mathcal{D}'_+$.

At this stage, the problem is thus to identify the causal distribution $Y_{-\alpha}$, which we know cannot be a function belonging to $L^1_{loc}$. Let us characterize it by its Laplace transform.

PROPOSITION 7.4.– The Laplace transform of $Y_{-\alpha}$ for $\alpha > 0$ is:

$$\mathcal{L}[Y_{-\alpha}](s) = s^{\alpha} \quad \text{for } \Re e(s) > 0 \tag{7.7}$$

i.e., with the right-half complex plane as region of convergence.



Proof. We use the fact that, within the framework of causal distributions, the Laplace transform of a convolution product is the product of the Laplace transforms; applying this to the definition (7.6) of $Y_{-\alpha}$, and taking into account Proposition 7.2 and $\mathcal{L}[\delta](s) = 1$, we obtain:

$$s^{-\alpha}\, \mathcal{L}[Y_{-\alpha}](s) = 1 \quad \forall s,\ \Re e(s) > 0$$

which proves, on the one hand, the existence and, on the other hand, the stated result.

NOTE 7.2.– We can read from the behavior at infinity of the Laplace transform that $Y_{-\alpha}$ is less regular the larger $\alpha$ is.

PROPOSITION 7.5.– The property $Y_{-\alpha} \star Y_{-\beta} = Y_{-\alpha-\beta}$ makes it possible to write the fundamental composition law: $D^\alpha \circ D^\beta = D^{\alpha+\beta}$ for $\alpha > 0$ and $\beta > 0$.

PROPOSITION 7.6.– For a causal distribution $f$ which has a Laplace transform in $\Re e(s) > a_f$, we have:

$$\mathcal{L}[D^\alpha f](s) = s^{\alpha}\, \mathcal{L}[f](s) \quad \text{for } \Re e(s) > \max(0, a_f). \tag{7.8}$$

Finally, we obtain the following fundamental result.

PROPOSITION 7.7.– For $\alpha$ and $\beta$ two real numbers, we have:
– the property $Y_\alpha \star Y_\beta = Y_{\alpha+\beta}$;
– the fundamental composition law $I^\alpha \circ I^\beta = I^{\alpha+\beta}$;
with the notation convention $I^\alpha = D^{-\alpha}$ when $\alpha < 0$.

EXAMPLE 7.2.– We seek to make the half-order derivative explicit: we first calculate the distribution $Y_{-1/2}$, then we calculate $D^{1/2}[f\, Y_1]$ where $f$ is a regular function. From the point of view of distributions, we can write $Y_{-1/2} = D^1 Y_{1/2}$, where $D^1$ is the derivative in the sense of distributions; that is, taking a test function $\varphi \in C_0^\infty$:

$$\begin{aligned}
\langle Y_{-1/2}, \varphi \rangle = \langle D^1 Y_{1/2}, \varphi \rangle = -\langle Y_{1/2}, \varphi' \rangle
&= -\frac{1}{\Gamma(1/2)} \int_0^\infty \frac{1}{\sqrt{t}}\, \varphi'(t)\, dt \\
&= -\frac{1}{\Gamma(1/2)} \lim_{\varepsilon \to 0} \left\{ \left[ \frac{1}{\sqrt{t}}\, \varphi(t) \right]_\varepsilon^\infty + \frac{1}{2} \int_\varepsilon^\infty \frac{1}{t^{3/2}}\, \varphi(t)\, dt \right\} \\
&= -\frac{1}{2\,\Gamma(1/2)} \lim_{\varepsilon \to 0} \left\{ \int_\varepsilon^\infty \frac{1}{t^{3/2}}\, \varphi(t)\, dt - \frac{2}{\sqrt{\varepsilon}}\, \varphi(\varepsilon) \right\} \\
&= \left\langle \frac{1}{\Gamma(-1/2)}\, \mathrm{pf}\big(t_+^{-3/2}\big), \varphi \right\rangle
\end{aligned}$$

where pf denotes the finite part of a divergent integral in the sense of Hadamard. We thus obtain the result, which is not very easy to handle in practice:

$$Y_{-1/2} = \frac{\mathrm{pf}\big(t_+^{-3/2}\big)}{\Gamma(-1/2)}.$$

Let us now calculate the half-order derivative of a causal distribution $f\, Y_1$, where $Y_1$ is the Heaviside distribution and $f \in C^1$. We have $D^1[f\, Y_1] = f'\, Y_1 + f(0)\,\delta$, hence, taking into account $D^{1/2} = I^{1/2} \circ D^1$:

$$\begin{aligned}
D^{1/2}[f\, Y_1] \stackrel{\text{def}}{=} Y_{-1/2} \star [f\, Y_1] &= Y_{1/2} \star D^1[f\, Y_1] = Y_{1/2} \star [f'\, Y_1] + f(0)\, Y_{1/2} \\
&= \int_0^t \frac{1}{\sqrt{\pi\tau}}\, f'(t-\tau)\, d\tau + f(0)\, \frac{1}{\sqrt{\pi t}}
\end{aligned}$$

where two terms appear: the first is a convolution of $L^1_{loc}$ functions and is a regular term, i.e., continuous at $t = 0^+$; the second is a function which diverges at $t = 0^+$, while remaining $L^1_{loc}$.

Moreover, the preceding formulation remains valid if we have $f \in C^0$ and $f' \in L^1_{loc}$: this is the case, for example, for $t \mapsto Y_{3/2}(t) \propto \sqrt{t}$, for which it is easy to check that $D^{1/2} Y_{3/2} = Y_1$, in other words the constant 1 for $t > 0$.

PROPOSITION 7.8.– In general, for $f \in C^0$ such that $f' \in L^1_{loc}$ and $0 < \alpha < 1$:

$$D^\alpha[f\, Y_1] = Y_{1-\alpha} \star [f'\, Y_1] + f(0)\, Y_{1-\alpha}.$$

7.2.2.2. Fundamental solutions

We have defined the operator $D^\alpha$ on the space $\mathcal{D}'_+$ of causal distributions. Let us now seek the fundamental solution$^1$ of the operator $D^\alpha - \lambda$.

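DEFINITION 7.3.– The quantity $\mathcal{E}_\alpha(\lambda, t)$ is the fundamental solution of $D^\alpha - \lambda$ for the complex value $\lambda$; it fulfills by definition:

$$D^\alpha \mathcal{E}_\alpha(\lambda, t) = \lambda\, \mathcal{E}_\alpha(\lambda, t) + \delta. \tag{7.9}$$

PROPOSITION 7.9.– The quantity $\mathcal{E}_\alpha(\lambda, t)$ is given by:

$$\mathcal{E}_\alpha(\lambda, t) = \mathcal{L}^{-1}\big[ (s^\alpha - \lambda)^{-1},\ \Re e(s) > a_\lambda \big] = \sum_{k=0}^{\infty} \lambda^k\, Y_{(1+k)\alpha}(t). \tag{7.10}$$

1. It is the extension of the property $D^1\big[e^{\lambda t}\, Y_1(t)\big] = \lambda\, e^{\lambda t}\, Y_1(t) + \delta$ in the case of integer order.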



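Proof. Let us take the Laplace transform of (7.9); it reads $(s^\alpha - \lambda)\, \mathcal{L}[\mathcal{E}_\alpha(\lambda, t)](s) = 1$ for $\Re e(s) > 0$, hence, for $\Re e(s) > a_\lambda$:

$$\begin{aligned}
\mathcal{L}\big[\mathcal{E}_\alpha(\lambda, t)\big](s) = (s^\alpha - \lambda)^{-1} &= s^{-\alpha}\, (1 - \lambda s^{-\alpha})^{-1} \\
&= s^{-\alpha} \sum_{k=0}^{+\infty} (\lambda s^{-\alpha})^k \quad \text{for } |s| > |\lambda|^{1/\alpha} \\
&= \sum_{k=0}^{+\infty} \lambda^k\, s^{-(1+k)\alpha}.
\end{aligned}$$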
By taking the inverse Laplace transform term by term (see [KOL 69]) and by using Proposition 7.2, we then obtain the stated result in the time domain (the series of functions (7.10) is normally convergent on every compact subset).

EXAMPLE 7.3.– Let us examine the particular cases of integer and half-integer orders. On the one hand, for $\alpha = 1$, we obtain the causal exponential $\mathcal{E}_1(\lambda, t) = e^{\lambda t}\, Y_1(t)$ as fundamental solution of the operator $D^1 - \lambda$ within the framework of causal distributions. On the other hand, for $\alpha = \frac{1}{2}$, we obtain:

$$\mathcal{E}_{1/2}(\lambda, t) = \sum_{k=0}^{+\infty} \lambda^k\, Y_{\frac{1+k}{2}} = \sum_{k=0}^{+\infty} \lambda^k\, \frac{t_+^{\frac{k-1}{2}}}{\Gamma\big(\frac{k+1}{2}\big)} = Y_{1/2} + \lambda \sum_{k=0}^{+\infty} \frac{(\lambda \sqrt{t})^k}{\Gamma\big(1 + \frac{k}{2}\big)}$$

which is the sum of an $L^1_{loc}$ function (i.e. $Y_{1/2}$) and of a power series in the variable $\lambda\sqrt{t}$, which is thus a continuous function.

7.2.3. Mild fractional derivatives, in the Caputo sense

7.2.3.1. Motivation

For $0 < \alpha < 1$, we saw, according to Proposition 7.8, that $D^\alpha f$ is not continuous at $t = 0^+$, even when $f \in C^1$. This might at first seem slightly paradoxical: we would prefer $D^\alpha f$ to lie, in a certain sense, between $f$ and $f'$. Moreover, we have just seen that the fundamental solutions $\mathcal{E}_\alpha(\lambda, t)$ are not continuous at the origin $t = 0^+$; from the analytical point of view, this is a priori likely to be awkward when initial values are given in a fractional differential equation.

For $\alpha > 1$, the analytical situation worsens, since the objects which are handled very rapidly become distributions which move away from regular functions; for example:

$$Y_{-n} = \delta^{(n)} \quad \text{for } n \text{ a natural integer.} \tag{7.11}$$

These considerations justify the description “mild” applied to the fractional derivatives $d^\alpha$, which we now define.

7.2.3.2. Definition

We are naturally led to extract from the preceding definitions the more regular, or milder, parts, following the example introduced in [BAG 91]. The definition we propose actually coincides with that given by Caputo in [CAP 76].

DEFINITION 7.4.– For a causal function $f$, continuous from the right at $t = 0$:

$$d^\alpha f \stackrel{\text{def}}{=} D^\alpha f - f(0^+)\, Y_{1-\alpha}. \tag{7.12}$$

In particular, if $f' \in L^1_{loc}$, then $d^\alpha f = Y_{1-\alpha} \star [f'\, Y_1]$.

For $0 < \alpha < 1$, to some extent, we extract from $f$ (continuous but non-differentiable) an intermediate degree of regularity (connected to the Hölder exponent of $f$ at 0). It appears that $d^\alpha f$ can be a continuous function, for example when $f \in C^0$ and $f' \in L^1_{loc}$, which was not the case for the fractional derivative in the sense of distributions $D^\alpha f$. Moreover, we can note that, in the case of integer order, the second derivative of a function $f$ is defined as the derivative of $f'$, i.e., exactly as the second iterate of the differentiation operator; in other words, $d^2 \stackrel{\text{def}}{=} (d^1)^{\circ 2}$. In view of these remarks, we propose the following definition.

DEFINITION 7.5.– For $0 < \alpha \leq 1$, we will say that $f$ is of class $C_\alpha^n$ if all the $n$ sequentially mild derivatives of order $\alpha$ of $f$ exist and are continuous, even at $t = 0$, i.e.: $(d^\alpha)^{\circ k} f \in C^0$ for $0 \leq k \leq n$.

The idea of sequentiality is introduced in a completely formal manner in Chapter 6 of [MIL 93], more as a curiosity than as something fundamentally coherent. Moreover, it is not the same $d^\alpha$ which is used there, but a definition which coincides with $D^\alpha$ for certain classes of functions, for $0 < \alpha < 1$. However, one of the inherent difficulties in the definition used is that the fundamental composition law is lost, whereas $D^{n\alpha} = (D^\alpha)^{\circ n}$ is obtained immediately according to Proposition 7.5.
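The relation $d^\alpha f = Y_{1-\alpha} \star [f'\, Y_1]$ of Definition 7.4 can be evaluated numerically when $f'$ is known. The following is a minimal sketch under stated assumptions, not a method from the original text: the names `mild_derivative` and `distribution_derivative` are illustrative, $0 < \alpha < 1$ and $t > 0$ are assumed, and the substitution $u = \tau^{1-\alpha}$ is a quadrature choice made here to absorb the weakly singular kernel.

```python
import math


def mild_derivative(f_prime, alpha: float, t: float, n: int = 4000) -> float:
    """Approximate d^alpha f(t) = (Y_{1-alpha} * f')(t) for 0 < alpha < 1.

    With u = tau**(1 - alpha), the kernel tau**(-alpha) / Gamma(1 - alpha)
    is absorbed into the measure:
    d^alpha f(t) = (1 / Gamma(2 - alpha)) * int_0^{t**(1-alpha)} f'(t - u**p) du,
    where p = 1 / (1 - alpha); the integral uses a midpoint rule on n cells.
    """
    if t <= 0:
        return 0.0
    p = 1.0 / (1.0 - alpha)
    h = t ** (1.0 - alpha) / n
    total = sum(f_prime(t - ((i + 0.5) * h) ** p) for i in range(n))
    return h * total / math.gamma(2.0 - alpha)


def distribution_derivative(f_prime, f0: float, alpha: float, t: float, n: int = 4000) -> float:
    """D^alpha [f Y_1](t) for t > 0: mild part plus the singular term f(0) * Y_{1-alpha}(t)."""
    return mild_derivative(f_prime, alpha, t, n) + f0 * t ** (-alpha) / math.gamma(1.0 - alpha)
```

For $f(t) = t^2 = 2\, Y_3(t)$, the first routine should return values close to $2\, t^{2-\alpha}/\Gamma(3-\alpha)$, which stays bounded as $t \to 0^+$; for the constant function 1, the second routine returns $Y_{1-\alpha}(t)$, which diverges at the origin, as discussed above.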



EXAMPLE 7.4.– For $\alpha = \frac{1}{2}$, let us apply $D^{1/2}$ successively to the causal function $f$ given by:

$$f = b_0 + b_1 \sqrt{t} + b_2\, t + b_3\, t^{3/2} + b_4\, t^2.$$

Let us reformulate this expansion on the basis of the $Y_{k/2}$; it becomes:

$$\begin{aligned}
f &= a_0 Y_1 + a_1 Y_{3/2} + a_2 Y_2 + a_3 Y_{5/2} + a_4 Y_3 \\
D^{1/2} f &= a_0 Y_{1/2} + \underbrace{a_1 Y_1 + a_2 Y_{3/2} + a_3 Y_2 + a_4 Y_{5/2}}_{d^{1/2} f} \\
D^{1} f &= a_0 Y_0 + \underbrace{a_1 Y_{1/2} + \underbrace{a_2 Y_1 + a_3 Y_{3/2} + a_4 Y_2}_{(d^{1/2})^2 f}}_{d^{1} f} \\
D^{3/2} f &= a_0 Y_{-1/2} + a_1 Y_0 + a_2 Y_{1/2} + \underbrace{a_3 Y_1 + a_4 Y_{3/2}}_{(d^{1/2})^3 f} \\
D^{2} f &= a_0 Y_{-1} + a_1 Y_{-1/2} + a_2 Y_0 + \underbrace{a_3 Y_{1/2} + \underbrace{a_4 Y_1}_{(d^{1/2})^4 f}}_{(d^{1})^2 f \ \text{iff}\ a_1 = 0}
\end{aligned}$$

The interest of the operator $d^{1/2}$ and of its successive powers (denoted from now on by $(d^{1/2})^k$ instead of $(d^{1/2})^{\circ k}$ to make the writing less cumbersome) is manifest here. Indeed, with the choice of $f$ considered, the $(d^{1/2})^k f$ are continuous functions at $t = 0^+$; according to Definition 7.5, we see that $f \in C_{1/2}^4$ and even $f \in C_{1/2}^\infty$, since we have $(d^{1/2})^5 f \equiv 0$! Moreover, we obtain the property $a_k = [(d^{1/2})^k f](t = 0^+)$ and thus the following formula:

$$f = \sum_{k=0}^{4} a_k\, Y_{1+\frac{k}{2}} \quad \text{with } a_k = [(d^{1/2})^k f](t = 0^+).$$

This is a kind of fractional Taylor expansion of the causal function $f$ in the vicinity of 0, which we will generalize in section 7.2.3.4.

7.2.3.3. Mittag-Leffler eigenfunctions

We have defined the operator $d^\alpha$ and noted that it acts as an internal operation on the class of $C_\alpha^\infty$ functions: we can naturally seek the eigenfunctions of this operator in this class of functions.


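DEFINITION 7.6.– For $0 < \alpha \leq 1$, $E_\alpha(\lambda, t)$ is the eigenfunction of $d^\alpha$ for the complex eigenvalue $\lambda$, initialized at 1; it fulfills, by definition:

$$\begin{cases} d^\alpha E_\alpha(\lambda, t) = \lambda\, E_\alpha(\lambda, t), \\ E_\alpha(\lambda, 0^+) = 1. \end{cases} \tag{7.13}$$

PROPOSITION 7.10.– The quantity $E_\alpha(\lambda, t)$ is given in terms of the fundamental solution $\mathcal{E}_\alpha(\lambda, t)$ of Definition 7.3 by:

$$E_\alpha(\lambda, t) = I^{1-\alpha}\, \mathcal{E}_\alpha(\lambda, t) = \mathcal{L}^{-1}\big[ s^{\alpha-1} (s^\alpha - \lambda)^{-1},\ \Re e(s) > a_\lambda \big] = \sum_{k=0}^{\infty} \lambda^k\, Y_{1+\alpha k}(t) = E_\alpha(\lambda t_+^\alpha) \tag{7.14}$$

where $E_\alpha(z)$ is the Mittag-Leffler monogenic function defined by the power series:

$$E_\alpha(z) \stackrel{\text{def}}{=} \sum_{k=0}^{+\infty} \frac{z^k}{\Gamma(1 + \alpha k)}. \tag{7.15}$$

Proof. It is enough to express (7.13) by using Definition 7.4 of $d^\alpha$, i.e.:

$$D^\alpha E_\alpha(\lambda, t) = \lambda\, E_\alpha(\lambda, t) + 1 \cdot Y_{1-\alpha}$$

hence, by Definition 7.3 of $\mathcal{E}_\alpha(\lambda, t)$ as the fundamental solution of the operator $(D^\alpha - \lambda)$, we obtain as the solution of the preceding equation with right-hand side the a priori causal distribution:

$$\begin{aligned}
E_\alpha(\lambda, t) &= \big( Y_{1-\alpha}(\cdot) \star \mathcal{E}_\alpha(\lambda, \cdot) \big)(t) \\
&= Y_{1-\alpha} \star \sum_{k=0}^{\infty} \lambda^k\, Y_{(1+k)\alpha}(t) \\
&= \sum_{k=0}^{\infty} \lambda^k\, Y_{1+\alpha k}(t) \\
&= \sum_{k=0}^{\infty} \lambda^k\, \frac{t_+^{\alpha k}}{\Gamma(1 + \alpha k)} \\
&= E_\alpha(z = \lambda t_+^\alpha).
\end{aligned}$$

The result sought among the causal distributions is thus, as stated, a continuous function directly connected to the Mittag-Leffler monogenic functions [MIT 04].

EXAMPLE 7.5.– Let us examine the particular cases of the integer and half-integer orders.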



On the one hand, for $\alpha = 1$, we obtain the causal exponential $E_1(\lambda, t) = e^{\lambda t}\, Y_1(t)$ as eigenfunction of the usual derivative $d^1$ (which actually belongs to the class of $C_1^\infty$ functions). On the other hand, for $\alpha = \frac{1}{2}$, we obtain:

$$E_{1/2}(\lambda, t) = \sum_{k=0}^{+\infty} \lambda^k\, Y_{1+\frac{k}{2}} = \sum_{k=0}^{+\infty} \frac{(\lambda\sqrt{t})^k}{\Gamma\big(1 + \frac{k}{2}\big)} = \exp(\lambda^2 t)\, \big[ 1 + \mathrm{erf}(\lambda\sqrt{t}) \big]$$

where erf is the error function (to be evaluated in the whole complex plane).

7.2.3.4. Fractional power series expansions of order α (α-FPSE)

The series expansion of the functions $E_\alpha(\lambda, t)$ highlighted in Proposition 7.10 naturally suggests the following definition.

DEFINITION 7.7.– For $0 < \alpha \leq 1$, the sequence $(a_k)_{k \geq 0}$ of complex numbers makes it possible to define the formal series:

$$f(t) = \sum_{k=0}^{\infty} a_k\, Y_{1+\alpha k}(t) \tag{7.16}$$

which takes on the analytical meaning of a fractional power series expansion of order $\alpha$ ($\alpha$-FPSE) as soon as, for example, the $|a_k|$ are bounded from above by a geometric sequence; the series of functions then converges uniformly on every compact subset of $[0, +\infty[$.

PROPOSITION 7.11.– Any function expandable in fractional power series of order $\alpha$ is of class $C_\alpha^\infty$; and it fulfills, in particular:

$$a_k = [(d^\alpha)^k f](0^+) \quad \text{for } k \geq 0. \tag{7.17}$$

Proof. The proof follows the calculation carried out in the example of section 7.2.3.2 for a series comprising a finite number of terms. There is no problem of commutation between the operator $d^\alpha$ and the infinite summation, since the series of functions of class $C_\alpha^\infty$ converges uniformly on every compact subset: in other words, term by term application of $d^\alpha$ is perfectly legitimate.
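Proposition 7.11 says that, on an $\alpha$-FPSE, the mild derivative $d^\alpha$ acts as a shift of the coefficient sequence, and the eigenfunction $E_\alpha(\lambda, t)$ is the particular case $a_k = \lambda^k$. The sketch below illustrates both points numerically; it is not part of the original text, the function names are illustrative, real $\lambda$ and $t \geq 0$ are assumed, and the series is simply truncated (120 terms by default, an arbitrary choice that keeps the arguments of `math.gamma` well below its overflow threshold).

```python
import math


def fpse_eval(coeffs, alpha: float, t: float) -> float:
    """Evaluate f(t) = sum_k a_k * t**(alpha*k) / Gamma(1 + alpha*k) for t >= 0."""
    return sum(a * t ** (alpha * k) / math.gamma(1.0 + alpha * k)
               for k, a in enumerate(coeffs))


def fpse_mild_derivative(coeffs):
    """d^alpha on an alpha-FPSE is a shift: (a_0, a_1, a_2, ...) -> (a_1, a_2, ...)."""
    return list(coeffs[1:])


def eigenfunction(lam: float, alpha: float, t: float, terms: int = 120) -> float:
    """Truncated series for E_alpha(lam, t), i.e. the alpha-FPSE with a_k = lam**k."""
    return fpse_eval([lam ** k for k in range(terms)], alpha, t)
```

For $\alpha = 1/2$ the truncated series can be compared with the closed form $\exp(\lambda^2 t)\,[1 + \mathrm{erf}(\lambda\sqrt{t})]$ for real $\lambda$, and evaluating successive shifted sequences at $t = 0$ returns the coefficients $a_k$, as in (7.17).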



NOTE 7.3.– Just as not every function of class $C^\infty$ is expandable in power series (PSE), not every function of class $C_\alpha^\infty$ is expandable in fractional power series of order $\alpha$. We will introduce later on the function:

$$\psi^1(t) = \mathcal{L}^{-1}\big[ e^{-\sqrt{s}},\ \Re e(s) > 0 \big] \propto t^{-3/2} \exp(-1/4t)\, Y_1(t),$$

which is of class $C_{1/2}^\infty$ but which is not expandable in fractional power series of half-order.

7.3. Fractional differential equations

In Chapter 5 of [MIL 93], examples of fractional differential equations are examined and we note, in particular, problems of initial value (0 or $+\infty$). In Chapter 6 of [MIL 93], the vectorial aspect is considered (we will start with that) and the idea of sequentiality is present, from a rather formal point of view about which we have the analytical reservations already stated.

In section 7.3, we begin by treating and analyzing an example, which justifies the resolution of fractional differential equations within two quite distinct frameworks: causal distributions in section 7.3.2 and functions with a fractional power series expansion of order $\alpha$ in section 7.3.3; finally, in section 7.3.4, we are concerned with the asymptotic behavior of the solutions of fractional differential equations, a question connected with the basic concept of stability.

7.3.1. Example

An integro-differential equation (in $y$, $Y_{1/2} \star y'$ and $y'$, for example, where $y$ of class $C^1$ is sought) can be written either with derivatives in the sense of distributions ($D^{1/2} y$, $D^1 y$), or with mild derivatives ($d^{1/2} y$, $(d^{1/2})^2 y$), which makes $d^1 y$ disappear. Let us make the passage explicit in a particular case, for the integro-differential equation with right-hand side:

$$y'(t) + c_1\, (Y_{1/2} \star y')(t) + c_2\, y(t) = x(t) \quad \text{for } t > 0, \tag{7.18}$$

$$\text{with } y(0) = a_0. \tag{7.19}$$

7.3.1.1. Framework of causal distributions

In $\mathcal{D}'_+$, (7.18)-(7.19) is written as a single equation which incorporates the initial condition, i.e.:

$$D^1[y\, Y_1] + c_1\, D^{1/2}[y\, Y_1] + c_2\, y\, Y_1 = x\, Y_1 + a_0 \{ Y_0 + c_1 Y_{1/2} \} \tag{7.20}$$

which can be vectorially formulated in the following manner:

$$D^{1/2} \begin{pmatrix} y \\ D^{1/2} y \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -c_2 & -c_1 \end{pmatrix} \begin{pmatrix} y \\ D^{1/2} y \end{pmatrix} + \begin{pmatrix} 0 \\ x\, Y_1 + a_0 \{ Y_0 + c_1 Y_{1/2} \} \end{pmatrix} \tag{7.21}$$



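and is thus solved simply by defining $\mathcal{E}_{1/2}(\Lambda, t)$ by the power series in the square matrix $\Lambda$ (exactly as for the matrix exponential):

$$\mathcal{E}_{1/2}(\Lambda, t) \stackrel{\text{def}}{=} \sum_{k=0}^{+\infty} \Lambda^k\, Y_{(k+1)/2} = Y_{1/2}\, I + \Lambda \sum_{k=0}^{+\infty} \Lambda^k\, \frac{\sqrt{t}^{\,k}}{\Gamma\big(1 + \frac{k}{2}\big)}$$

from which the solution of (7.21) in $\mathcal{D}'_+$, which we will develop in section 7.3.2, is obtained:

$$D^{1/2} y = \Lambda y + x_D \iff y(t) = \big( \mathcal{E}_{1/2}(\Lambda, \cdot) \star x_D \big)(t). \tag{7.22}$$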

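The notations are obvious; let us specify only that the index $D$ of $x_D$ means that the vector contains not only the right-hand side $x$, but also distributions related to the initial condition $a_0$.

7.3.1.2. Framework of fractional power series expansion of order one half

We now work with the mild derivative of order one half and seek a function $y$ of class $C_{1/2}^2$ which is also of class $C^1$; equation (7.20) is then written in an equivalent manner:

$$(d^{1/2})^2 y + c_1\, d^{1/2} y + c_2\, y = x \quad \text{for } t > 0, \tag{7.23}$$

$$\text{with } \begin{cases} y(0) = a_0 \\ \big[ d^{1/2} y \big](0) = 0 \end{cases} \tag{7.24}$$

which can be vectorially formulated in the following way:

$$d^{1/2} \begin{pmatrix} y \\ d^{1/2} y \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -c_2 & -c_1 \end{pmatrix} \begin{pmatrix} y \\ d^{1/2} y \end{pmatrix} + \begin{pmatrix} 0 \\ x \end{pmatrix} \quad \text{for } t > 0, \tag{7.25}$$

$$\text{with } \begin{pmatrix} y \\ d^{1/2} y \end{pmatrix}(0) = \begin{pmatrix} a_0 \\ 0 \end{pmatrix} \tag{7.26}$$

and is solved by defining $E_{1/2}(\Lambda, t)$ by the power series in the square matrix $\Lambda$:

$$E_{1/2}(\Lambda, t) \stackrel{\text{def}}{=} \sum_{k=0}^{+\infty} \Lambda^k\, Y_{1+\frac{k}{2}} = E_{1/2}(\Lambda \sqrt{t})$$

from which, with obvious notations, the solution of (7.25) and (7.26), which we will develop in section 7.3.3, is obtained, where $\mathcal{E}_{1/2}(\Lambda, t) = \sum_{k \geq 0} \Lambda^k\, Y_{(k+1)/2}$ denotes the matrix fundamental solution:

$$d^{1/2} y = \Lambda y + x \iff y(t) = E_{1/2}(\Lambda, t)\, y(0) + \big( \mathcal{E}_{1/2}(\Lambda, \cdot) \star x \big)(t). \tag{7.27}$$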
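The free response ($x = 0$) of the vector problem can be computed directly from the matrix power series $\sum_k \Lambda^k\, t^{k/2}/\Gamma(1 + k/2)$. The sketch below is only an illustration under stated assumptions, not part of the original text: it is restricted to $2 \times 2$ real matrices, truncates the series at an arbitrary 80 terms, and uses hypothetical helper names.

```python
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matrix_eigenfunction(lam, t: float, terms: int = 80):
    """Truncated series sum_k Lambda^k * t**(k/2) / Gamma(1 + k/2) for a 2x2 Lambda, t >= 0."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]  # Lambda^0 = identity
    for k in range(terms):
        weight = t ** (k / 2) / math.gamma(1.0 + k / 2)
        for i in range(2):
            for j in range(2):
                result[i][j] += weight * power[i][j]
        power = mat_mul(power, lam)
    return result


def free_response(c1: float, c2: float, a0: float, t: float) -> float:
    """First component of E_{1/2}(Lambda, t) y(0), with y(0) = (a0, 0) and x = 0."""
    lam = [[0.0, 1.0], [-c2, -c1]]
    return matrix_eigenfunction(lam, t)[0][0] * a0
```

A simple sanity check is the case $c_1 = 0$: the half-order term disappears, the problem reduces to $y' + c_2\, y = 0$ with $y(0) = a_0$, and `free_response(0.0, c2, a0, t)` should then agree with $a_0\, e^{-c_2 t}$.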



7.3.1.3. Notes Under initial vectorial condition (7.26) or under two initial scalar conditions (7.24), set the non-integer order initial condition to zero to ensure the C 1 regularity of the solution of the physical starting problem (7.18), which has only one physical initial condition given by (7.19). We obtain a response to the initial condition which is of ∞ , but of class C1k for k = 0, 1 only; it is the same for the impulse response h class C1/2 to the input x. However, in presenting a general theory of fractional differential equations, nothing prevents us from considering [d1/2 f ](0) = a1 as an independent parameter, which will make it possible to speak about response to the integer or non-integer initial conditions ai . Thus, the problem, in general, would be, instead of (7.23)-(7.24): 1/2 2 d y + c1 d1/2 y + c2 y = x for t > 0, ⎧ ⎨y(0) = a0 with ⎩!d1/2 y "(0) = a 1 or, instead of (7.20): D1 [yY1 ] + c1 D1/2 [yY1 ] + c2 yY1 = xY1 + a0 {Y0 + c1 Y1/2 } + a1 Y1/2 . NOTE 7.4.– In terms of application to physics, the rational case α = p1 is interesting; the relations between (d1/p )np y and (d1 )n y will indeed have to be clarified. However, it is rather the case of commensurate orders of derivation which is suitable for an algebraic treatment in general, which is a treatment in every respect analogous to that carried out for 12 . PROPOSITION 7.12.– Any scalar fractional differential equation of commensurate order with α of degree n can be brought back to a vectorial fractional differential equation of order α of degree 1 in dimension n. Proof. It is valid for a fractional differential equation in Dα as for a fractional differential equation in dα , since we have the crucial property of sequentiality. We have just seen it on an example of order 12 and degree 2; the proof, in general, is straightforward. 
We thus give results directly in vectorial form later on, i.e., by extracting the first component from the vector solution; in other words, the solution of the scalar problem.
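The reduction of Proposition 7.12 can be illustrated with a small numerical sketch. It assumes the usual companion-matrix construction with state vector $(y, d^\alpha y, \ldots, (d^\alpha)^{n-1} y)$; the function name `companion_matrix` and the ordering of the coefficient list are choices made for this illustration and do not come from the text.

```python
import numpy as np


def companion_matrix(coeffs_desc):
    """Companion matrix of sigma^n + c_{n-1} sigma^{n-1} + ... + c_0.

    coeffs_desc lists [c_{n-1}, ..., c_0]; this ordering is a choice made
    for this sketch, not a convention taken from the text.
    """
    c = np.asarray(coeffs_desc, dtype=float)
    n = len(c)
    lam = np.zeros((n, n))
    lam[:-1, 1:] = np.eye(n - 1)   # ones on the superdiagonal
    lam[-1, :] = -c[::-1]          # last row: -c_0, -c_1, ..., -c_{n-1}
    return lam


# Degree-2 illustration: sigma^2 + 3 sigma + 2 = (sigma + 1)(sigma + 2)
Lam = companion_matrix([3.0, 2.0])
eigenvalues = np.sort(np.linalg.eigvals(Lam).real)
```

The eigenvalues of the matrix are the roots of the characteristic polynomial, so the scalar problem of degree $n$ and the vector problem of degree 1 share the same spectral data.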



7.3.2. Framework of causal distributions

DEFINITION 7.8.– By definition, we have:
\[ E_\alpha(\Lambda, t) \overset{\text{def}}{=} \sum_{k=0}^{+\infty} \Lambda^k\, Y_{(1+k)\alpha}(t) \tag{7.28} \]

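The matrix $E_\alpha(\Lambda, t)$ is expressed as a power series in the matrix $\Lambda$ which, after reduction of the latter (eigenvalues $\lambda_i$ of multiplicity $m_i$), is made explicit on the basis of fundamental solutions and their successive convolutions $E_\alpha^{\star j}(\lambda_i, t)$, with $1 \le j \le m_i$. It is a first extension to the fractional case of the matrix exponential concept.

PROPOSITION 7.13.– For the $j$-fold convolution of fundamental solutions, we have:
\[ \begin{aligned} E_\alpha^{\star j}(\lambda, t) &= \mathcal{L}^{-1}\big[(s^\alpha - \lambda)^{-j},\ \Re e(s) > a_\lambda\big] \\ &= \frac{1}{(j-1)!} \Big(\frac{\partial}{\partial\lambda}\Big)^{j-1} \mathcal{L}^{-1}\big[(s^\alpha - \lambda)^{-1},\ \Re e(s) > a_\lambda\big] \\ &= \frac{1}{(j-1)!} \Big(\frac{\partial}{\partial\lambda}\Big)^{j-1} E_\alpha(\lambda, t) \\ &= \sum_{k=0}^{+\infty} C_{j-1+k}^{\,j-1}\, \lambda^k\, Y_{(j+k)\alpha}(t). \end{aligned} \]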

Proof. It is a formal calculation without much interest; let us note the use of the parametric derivative with respect to the complex parameter $\lambda$. In the integer case ($\alpha = 1$), it is written simply $E_1^{\star j}(\lambda, t) = Y_j(t)\, E_1(\lambda, t)$, which can be proved directly by using the following convolution property of causal functions $f$ and $g$:
\[ \big[f(\tau) e^{\lambda\tau} \star g(\tau) e^{\lambda\tau}\big](t) = \big[f(\tau) \star g(\tau)\big](t)\, e^{\lambda t}. \]

PROPOSITION 7.14.– We have:
\[ D^\alpha y = \Lambda y + x_D \iff y(t) = \big(E_\alpha(\Lambda, \cdot) \star x_D\big)(t) \tag{7.29} \]

where vector xD contains, on the one hand, distributions related to the initial conditions of vector y and, on the other hand, a regular function x or right-hand side.



Proof. It derives from the fact that $E_\alpha(\Lambda, t)$ is the fundamental solution of the matrix operator $(D^\alpha I - \Lambda)$; this can be achieved by Laplace transform, as for Proposition 7.9. The fundamental relation is established (see Definition 7.3):
\[ D^\alpha E_\alpha(\Lambda, t) = \Lambda\, E_\alpha(\Lambda, t) + I\,\delta \tag{7.30} \]
from which the announced result is obtained.

7.3.3. Framework of functions expandable into fractional power series (α-FPSE)

DEFINITION 7.9.– By definition, we have:
\[ \mathcal{E}_\alpha(\Lambda, t) \overset{\text{def}}{=} \sum_{k=0}^{+\infty} \Lambda^k\, Y_{1+k\alpha}(t). \tag{7.31} \]

The matrix $\mathcal{E}_\alpha(\Lambda, t)$ is expressed as a power series in the matrix $\Lambda$ which, in the case where $\Lambda$ is diagonalizable (eigenvalues $\lambda_i$), is made explicit on the basis of eigenfunctions $\mathcal{E}_\alpha(\lambda_i, t)$, with $1 \le i \le n$. It is the other extension to the fractional case of the matrix exponential concept.

PROPOSITION 7.15.– We have:
\[ \begin{cases} d^\alpha y = \Lambda y + x \\ y(0^+) = y_0 \end{cases} \iff y(t) = \mathcal{E}_\alpha(\Lambda, t)\, y_0 + \big(E_\alpha(\Lambda, \cdot) \star x\big)(t) \tag{7.32} \]

where, this time, vector $x$ is a continuous function (or input) which controls the fractional differential system.

Proof. By using Definition 7.4, the left-hand side of (7.32) becomes:
\[ D^\alpha y = \Lambda y + x + Y_{1-\alpha}\, y_0. \]
We then use Proposition 7.14, taking $x_D(t) = x(t) + Y_{1-\alpha}(t)\, y_0$, and noting that:
\[ \mathcal{E}_\alpha(\Lambda, t) = \big(Y_{1-\alpha}(\cdot) \star E_\alpha(\Lambda, \cdot)\big)(t). \tag{7.33} \]
We then find the right-hand side of (7.32) to be the solution.

By reinterpreting this result on a scalar fractional differential equation of degree $n$, it appears that $y_0$ is the vector of the $n$ first coefficients of the expansion in fractional power series of order $\alpha$ of the solution $y$; in other words, the vector of the fractional order initial conditions.
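In the scalar case, the series (7.31) reads $\sum_k \lambda^k t^{k\alpha} / \Gamma(1 + k\alpha)$, since $Y_{1+k\alpha}(t) = t^{k\alpha}/\Gamma(1+k\alpha)$ for $t > 0$. The following is a minimal sketch of a direct truncated summation; the function names and the truncation order are assumptions made for this illustration, and the approach is only reasonable for moderate values of $|\lambda| t^\alpha$.

```python
import math


def mittag_leffler(alpha, z, n_terms=150):
    """Truncated series sum_{k < n_terms} z**k / Gamma(1 + k*alpha).

    Intended for moderate |z| only; the truncation order is an arbitrary
    choice for this sketch.
    """
    total = 0.0
    for k in range(n_terms):
        # exp(-lgamma(.)) avoids the overflow of Gamma for large arguments
        total += z**k * math.exp(-math.lgamma(1.0 + k * alpha))
    return total


def eigenfunction(alpha, lam, t):
    """Scalar series of (7.31): sum_k lam**k * Y_{1 + k*alpha}(t), for t > 0."""
    return mittag_leffler(alpha, lam * t**alpha)
```

Two classical closed forms give a quick sanity check: for $\alpha = 1$ the series is the exponential, and for $\alpha = 1/2$ it equals $e^{z^2}\operatorname{erfc}(-z)$.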



To establish the link with physics, when $\alpha = \frac{1}{p}$, it is advisable to initialize to 0 the fractional order initial conditions and to give the values of traditional initial position and velocity to the integer terms (on the example in $y$, $Y_{1/2}\, y$, $y$, of order $\frac{1}{2}$ and degree 2, we took $y(0) = y_0$ and $d^{1/2} y(0) = 0$; thus, the response to the only initial conditions is a function of class $C^\infty_{1/2}$ which is $C^1$ without being $C^2$).

Within this framework, it is then possible to treat fractional differential equations in $d^\alpha$ in an entirely algebraic way, by introducing the characteristic polynomial in the variable $\sigma$:
\[ P(\sigma) = \sigma^n + c_{n-1}\sigma^{n-1} + \cdots + c_0 = \prod_{i=1}^{r} (\sigma - \lambda_i)^{m_i} \]

of the fractional differential equation with the right-hand side:
\[ (d^\alpha)^n y + c_{n-1} (d^\alpha)^{n-1} y + \cdots + c_0\, y = x. \]
The responses $h_k(t)$ to the various initial conditions $a_k = [(d^\alpha)^k y](0)$ for $0 \le k \le n-1$ and the impulse response $h(t)$ of the system are linear combinations of $E_\alpha^{\star j}(\lambda_i, t)$, with $1 \le j \le m_i$ and $1 \le i \le r$, which can be made explicit by the method of undetermined coefficients, or by algebraic means; in the generic case of distinct roots, for example, we obtain:
\[ h(t) \overset{\text{def}}{=} \mathcal{L}^{-1}\Big[\frac{1}{P(s^\alpha)}\Big] = \big(E_\alpha(\lambda_1, \cdot) \star \cdots \star E_\alpha(\lambda_n, \cdot)\big)(t) = \sum_{i=1}^{n} \frac{1}{P'(\lambda_i)}\, E_\alpha(\lambda_i, t). \]

EXAMPLE 7.6.– Let us take again the example stated in a general way in section 7.3. Let $\lambda_1, \lambda_2$ be the roots of $P(\sigma) = \sigma^2 + c_1\sigma + c_2$. The general solution of the system is given as:
\[ y(t) = (h \star x)(t) + a_1 h_1(t) + a_0 h_0(t) \]
with the impulse response:
\[ h(t) = \mathcal{L}^{-1}\Big[\frac{1}{P(s^{1/2})}\Big] = \sum_{i=1}^{2} \frac{1}{P'(\lambda_i)}\, E_{1/2}(\lambda_i, t) = \sum_{i=1}^{2} \frac{\lambda_i}{P'(\lambda_i)}\, \mathcal{E}_{1/2}(\lambda_i \sqrt{t}), \]
the response to the (half-integer) initial condition $a_1$:
\[ h_1(t) = I^{1/2} h = \sum_{i=1}^{2} \frac{1}{P'(\lambda_i)}\, \mathcal{E}_{1/2}(\lambda_i \sqrt{t}) \]
and the response to the (integer) initial condition $a_0$:
\[ h_0(t) = D^{1/2} h_1 + c_1 h_1 = h + c_1 h_1 = \sum_{i=1}^{2} \frac{\lambda_i + c_1}{P'(\lambda_i)}\, \mathcal{E}_{1/2}(\lambda_i \sqrt{t}). \]



NOTE 7.5.– When there is a double root $\lambda_1$, the preceding expressions use $E_{1/2}^{\star j}(\lambda_1, t)$ for $j = 1, 2$; they also give rise to algebraic simplifications which finally reveal $E_{1/2}(\lambda_1, t)$ and $\sqrt{t}\, E_{1/2}(\lambda_1, t)$.

7.3.4. Asymptotic behavior of fundamental solutions

7.3.4.1. Asymptotic behavior at the origin

We saw that $E_\alpha(\lambda, t)$ has an integrable singularity at the origin; the general result is as follows.

PROPOSITION 7.16.– When $t \to 0^+$:
\[ E_\alpha^{\star j}(\lambda, t) \sim Y_{j\alpha}(t) = \frac{t_+^{j\alpha - 1}}{\Gamma(j\alpha)} \in L^1_{loc}. \]

Proof. This follows from Proposition 7.13; the equivalent at $0^+$ is deduced from it immediately.

7.3.4.2. Asymptotic behavior at infinity

At the beginning of the 20th century, the mathematician Mittag-Leffler was interested in the functions $\mathcal{E}_\alpha(z)$ (for reasons unconnected with fractional calculus [MIT 04]): concerning the asymptotic behavior when $|z| \to +\infty$ with $\alpha < 2$, he established exponential divergence in the sector $|\arg z| < \alpha\frac{\pi}{2}$ of the complex plane and convergence towards 0 outside it; the nature of the convergence towards 0 and the asymptotic behavior on the boundary were not examined. Later, in the middle of the last century [BAT 54], the precise nature of the convergence towards 0 for $|\arg z| > \alpha\frac{\pi}{2}$ was established. We reuse similar results on the fundamental solutions and extend them, on the one hand, to the boundary $|\arg \lambda| = \alpha\frac{\pi}{2}$ and, on the other hand, to the successive convolutions $E_\alpha^{\star j}(\lambda, t)$.

PROPOSITION 7.17.– The asymptotic behavior (when $t \to +\infty$) of the fundamental solutions of $D^\alpha - \lambda$ and their convolutions, which structurally appear in the solutions of fractional differential equations of order $\alpha$ (as the basis of {polynomials in $t$} × {$\exp(\lambda t)$} does in the case of the integer order $\alpha = 1$), is given by the position of the eigenvalues $\lambda$ in the complex plane, which plays the role of the fractional spectral domain:
– for $|\arg(\lambda)| < \alpha\frac{\pi}{2}$, $E_\alpha^{\star j}(\lambda, t)$ diverges in an exponential way (more precisely {polynomial in $t^\alpha$} × {$\exp(\lambda^{1/\alpha} t)$});
– for $|\arg(\lambda)| = \alpha\frac{\pi}{2}$, $E_\alpha^{\star 1}(\lambda, t)$ is asymptotically oscillatory and $E_\alpha^{\star j}(\lambda, t)$ with $j \ge 2$ diverges in an oscillating polynomial way (in $t^\alpha$);



– for $|\arg(\lambda)| > \alpha\frac{\pi}{2}$, we obtain $E_\alpha^{\star j}(\lambda, t) \sim k_{j,\alpha}\, \lambda^{-1-j}\, t^{-1-\alpha}$.

In this latter case, we note that $E_\alpha^{\star j}(\lambda, t) \in L^1(]0, +\infty[)$, which is crucial for the impulse responses (the notion of a bounded input-bounded output (BIBO) system is related to the integrable character of the impulse response; in short, $L^1 \star L^\infty \subset L^\infty$).

Proof. See [MAT 96b, MAT 98a] for these tricky calculations of residues and asymptotic behavior of indefinite integrals depending on a parameter. Let us note that the analysis in the Laplace plane provides only one pole when $|\arg(\lambda)| < \alpha\pi$ and that the latter (if it exists) is accompanied by an integral term (or aperiodic multimode according to [OUS 83]) resulting from the cut on the negative real semi-axis imposed by the multiform character of $s \mapsto s^\alpha$: this is our first encounter with diffusive representation, which will be detailed further in section 7.4.1.

EXAMPLE 7.7.– In the half-integer case, we can illustrate the asymptotic behavior in the two panels of Figure 7.1: the eigenvalue $\lambda$ describes the plane in $\sigma$ (which is simply the “unfolded” Riemann surface of $\sqrt{s}$). That is translated in the Laplace plane either by a pole and a cut (or a “pole” in the first sheet of the Riemann surface), or by the cut alone (or a “pole” in the second sheet of the Riemann surface).

Figure 7.1. Half-integer case: (a) Laplace plane in s, axes Re(s) and Im(s); (b) plane in σ, axes Re(σ) and Im(σ); the regions of each plane are labeled stable or unstable
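The sector test of Proposition 7.17 lends itself to a short numerical sketch. The function names `regime` and `is_stable`, the labels returned, and the tolerance used to detect the boundary case are all assumptions made for this illustration; the second function simply applies the strict inequality to every eigenvalue of a matrix.

```python
import cmath
import math

import numpy as np


def regime(lam, alpha, tol=1e-12):
    """Classify a non-zero eigenvalue lam by the sector test of Proposition 7.17."""
    theta = abs(cmath.phase(lam))
    bound = alpha * math.pi / 2
    if abs(theta - bound) <= tol:
        return "oscillatory"          # boundary case |arg(lam)| = alpha*pi/2
    return "divergent" if theta < bound else "decaying"


def is_stable(A, alpha):
    """True when every eigenvalue of A satisfies |arg| > alpha*pi/2."""
    eig = np.linalg.eigvals(np.asarray(A, dtype=complex))
    return bool(np.all(np.abs(np.angle(eig)) > alpha * math.pi / 2))
```

With $\alpha = 1/2$ the boundary is the pair of half-lines $|\arg \lambda| = \pi/4$, so that $\lambda = 1$ falls in the divergent sector, $\lambda = 4(1+i)$ on the boundary, and $\lambda = 5i$ or $\lambda = -10$ in the decaying region.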

In Figures 7.2 to 7.8, we represent the eigenfunctions $\mathcal{E}_{1/2}(\lambda\sqrt{t})$ (whose integral term decreases like $t^{-1/2}$) in real and imaginary parts. We will note the asymptotically oscillatory character in Figure 7.4 and the absence of oscillatory term (or residue) in Figures 7.6 to 7.8; only the integral or diffusive part is present.

Figure 7.2. For $\lambda = 1$, $\mathcal{E}_{1/2}(\lambda\sqrt{t})$, plotted against $t$. Exponentially divergent real behavior

Figure 7.3. For $\lambda = 3(1 + 0.9i)$, (a): $\Re e[\mathcal{E}_{1/2}(\lambda\sqrt{t})]$, (b): $\Im m[\mathcal{E}_{1/2}(\lambda\sqrt{t})]$, plotted against $t$. Oscillatory exponentially divergent behavior

Figure 7.4. For $\lambda = 4(1 + i)$, (a): $\Re e[\mathcal{E}_{1/2}(\lambda\sqrt{t})]$, (b): $\Im m[\mathcal{E}_{1/2}(\lambda\sqrt{t})]$, plotted against $t$. Asymptotically oscillating behavior

Figure 7.5. For $\lambda = 4(0.8 + i)$, (a): $\Re e[\mathcal{E}_{1/2}(\lambda\sqrt{t})]$, (b): $\Im m[\mathcal{E}_{1/2}(\lambda\sqrt{t})]$, plotted against $t$. Behavior converging in two phases: exponentially oscillatory, then diffusive in $t^{-1/2}$

Figure 7.6. For $\lambda = 5i$, (a): $\Re e[\mathcal{E}_{1/2}(\lambda\sqrt{t})]$, (b): $\Im m[\mathcal{E}_{1/2}(\lambda\sqrt{t})]$, plotted against $t$. Diffusive behavior only in $t^{-1/2}$

Figure 7.7. For $\lambda = 4(-1 + i)$, (a): $\Re e[\mathcal{E}_{1/2}(\lambda\sqrt{t})]$, (b): $\Im m[\mathcal{E}_{1/2}(\lambda\sqrt{t})]$, plotted against $t$. Diffusive behavior in $t^{-1/2}$

Figure 7.8. For $\lambda = -10$, $\mathcal{E}_{1/2}(\lambda\sqrt{t})$, plotted against $t$. Pure diffusive behavior in $t^{-1/2}$

7.3.5. Controlled-and-observed linear dynamic systems of fractional order

Let us consider the following class of fractional linear dynamic systems:
\[ d^\alpha x = Ax + Bu, \qquad y = Cx + Du. \]
We can study them from the standpoint of asymptotic stability, controllability, observability, stabilization by state feedback, construction of an asymptotic observer and stabilization by an observer-based controller. The results which relate to the controlled-and-observed linear dynamic systems of integer order [DAN 94, SON 90] can be generalized to the fractional order. In particular, a system in this class will have the property of:
– stability if and only if $|\arg \operatorname{spec}(A)| > \alpha\frac{\pi}{2}$;
– stabilizability by state feedback if and only if there exists $K$ such that $|\arg \operatorname{spec}(A + BK)| > \alpha\frac{\pi}{2}$, which is fulfilled if the “uncontrollable” modes of $(A, B)$ in the traditional sense are stable in the α-sense and thus in particular if the pair $(A, B)$ is controllable in the traditional sense;
– construction of an asymptotic observer if and only if there exists $L$ such that $|\arg \operatorname{spec}(A + LC)| > \alpha\frac{\pi}{2}$, which is fulfilled if the “unobservable” modes of $(C, A)$ in the traditional sense are stable in the α-sense and thus in particular if the pair $(C, A)$ is observable in the traditional sense;
– stabilizability by observer-based controller if and only if it is stabilizable by state feedback and if we can build an asymptotic observer, which is specifically the case when the triplet $(C, A, B)$ is minimal in the traditional sense.

Further details can be found in [MAT 96c] for the concepts of observability and controllability and in [MAT 97] for the observer-based control.

NOTE 7.6.– We should, however, be aware that the application range of the preceding approach is rather limited because it relies heavily on the commensurate character of the derivation orders and therefore makes a distinction between rational orders and others, which is theoretically restrictive and completely impracticable for numerical simulation, for example.

7.4. Diffusive structure of fractional differential systems

We now approach the study of fractional differential systems of incommensurate orders, linear and with constant coefficients in time, i.e., the pseudo-differential input ($u$)-output ($y$) systems of the form:
\[ \sum_{k=0}^{K} a_k\, D^{\alpha_k} y(t) = \sum_{l=0}^{L} b_l\, D^{\beta_l} u(t) \]
corresponding, by Laplace transform, to the symbol:
\[ H(s) = \frac{\sum_{l=0}^{L} b_l\, s^{\beta_l}}{\sum_{k=0}^{K} a_k\, s^{\alpha_k}}. \tag{7.34} \]

NOTE 7.7.– Strictly speaking, the term fractional should be reserved for the systems of commensurate orders ($\beta_l = l\alpha_1$ and $\alpha_k = k\alpha_1$), whereas the term non-integer would be, in truth, more suitable; we conform here to the English-language usage (fractional calculus).

In section 7.4.2 we give a general structure result which shows to what extent the fractional differential systems are also diffusive pseudo-differential systems. In section 7.4.3, a characterization of the concept of long memory is given. Finally, in section 7.4.4, we recall the particular case of the fractional differential systems of commensurate orders, to which the general structure result naturally applies, but which allows, moreover, an explicit characterization of stability (in the sense of BIBO). However, first of all, in section 7.4.1, we recall some basic ideas on what diffusive representations of pseudo-differential operators are.



7.4.1. Introduction to diffusive representations of pseudo-differential operators

A first-order system, or autoregressive filter of order 1 (AR-1) in other contexts, is undoubtedly the simplest linear dynamic system imaginable which does not oscillate, but has a behavior of pure relaxation. A discrete superposition of such systems, for various time constants $\tau_k$, or in an equivalent way for various relaxation constants $\xi_k = \tau_k^{-1}$ and various weights $\mu_k$, gives a simple idea (without being simplistic²) of the diffusive pseudo-differential operators required to simulate the fractional differential equations.

When the superposition is discrete and finite, the resulting system is a system of integer order with poles (real negative, $s_k = -\xi_k$) and zeros; on the other hand, if the superposition is either discrete infinite, or continuous for all the relaxation constants $\xi > 0$ and with a weight function $\mu(\xi)$, we obtain a pseudo-differential system known to be of the diffusive type, the function $\mu$ being called the diffusive representation of the associated pseudo-differential operator. In the sense of systems theory, a realization of such a system will be:
\[ \partial_t \psi(t, \xi) = -\xi\, \psi(t, \xi) + u(t) \tag{7.35} \]
\[ y(t) = \int_0^{+\infty} \mu(\xi)\, \psi(t, \xi)\, d\xi \tag{7.36} \]
which is mathematically meaningful within a suitable functional framework (see e.g. [STA 94, MON 98, MAT 08] for technical details; the latter reference making the link with the class of well-posed linear systems). A simple calculation thus shows that the impulse response of the input $u$-output $y$ system is:
\[ h_\mu(t) = \int_0^{+\infty} \mu(\xi)\, e^{-\xi t}\, d\xi. \tag{7.37} \]
Its transfer function or its symbol is then, for $\Re e(s) > 0$:
\[ H_\mu(s) = \int_0^{+\infty} \frac{\mu(\xi)}{s + \xi}\, d\xi. \tag{7.38} \]

EXAMPLE 7.8.– A simple case of a diffusive pseudo-differential operator is that of the fractional integrator $I^\alpha$, whose diffusive representation is $\mu_\alpha(\xi) = \frac{\sin \alpha\pi}{\pi}\, \xi^{-\alpha}$ for $0 < \alpha < 1$.
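As a numerical illustration of (7.37) for the fractional integrator, one can discretize the integral over $\xi$ and compare the result with the kernel $Y_\alpha(t) = t^{\alpha-1}/\Gamma(\alpha)$. The sketch below is only one possible discretization: the logarithmic change of variable $\xi = e^u$, the trapezoidal rule, the integration bounds and the function names are all assumptions made for this example, not the scheme of the cited references. Each node then plays the role of one first-order system with relaxation constant $\xi_k$ and weight $\mu_k$.

```python
import math

import numpy as np


def diffusive_nodes(alpha, u_min=-60.0, u_max=10.0, n_nodes=4001):
    """Nodes xi_k = exp(u_k) and weights mu_k for mu_alpha(xi) = sin(alpha*pi)/pi * xi**(-alpha).

    Trapezoidal rule in the variable u = log(xi); bounds and node count are
    arbitrary choices for this sketch.
    """
    u = np.linspace(u_min, u_max, n_nodes)
    xi = np.exp(u)
    du = u[1] - u[0]
    trapezoid = np.full(n_nodes, du)
    trapezoid[0] = trapezoid[-1] = du / 2
    mu = math.sin(alpha * math.pi) / math.pi * xi ** (-alpha)
    return xi, mu * xi * trapezoid   # the factor xi is the Jacobian d(xi) = xi du


def impulse_response(t, xi, weights):
    """Discrete counterpart of (7.37): sum_k mu_k * exp(-xi_k * t)."""
    return float(np.sum(weights * np.exp(-xi * t)))


xi, weights = diffusive_nodes(0.5)
approx = impulse_response(1.0, xi, weights)
exact = 1.0 / math.gamma(0.5)        # Y_{1/2}(1) = 1 / Gamma(1/2)
```

The agreement degrades when $\alpha$ approaches 1 with these fixed bounds, because the integrand then decays slowly as $\xi \to 0$; the lower bound would have to be pushed further in that case.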

2. Indeed, it is, on the one hand, by completion of this family within a suitable topological framework that we can obtain the space of diffusive pseudo-differential operators and, on the other hand and eventually, these simple systems which are programmed numerically by procedures of standard numerical approximation; see e.g. [HÉL 06b].



We see that one of the advantages of diffusive representations is to transform non-local problems of hereditary nature, in time, into local problems, which specifically enables a standard and effective numerical approximation (see e.g. [HEL 00]). On the other hand, when the diffusive representation $\mu$ is positive, the suggested realization has the important property of dissipativity of the pseudo-differential operator (a natural energy functional is then given by $E_\psi(t) = \int_0^{+\infty} \mu(\xi)\, |\psi(t, \xi)|^2\, d\xi$), which is in this case of the positive type; this has important consequences, particularly for the study of the stability of coupled systems (see [MON 97] and also [MON 00] for non-linear systems, time-varying, with hysteresis, etc.).

Now, as far as stability is concerned, it is important to notice that some technicalities must be taken care of in an infinite-dimensional setting (namely, LaSalle's invariance principle does not apply when the pre-compactness of trajectories in the energy space has not been proved a priori: this is the reason why we have to analyze the spectrum of the infinitesimal generator of the semigroup of the augmented system and resort to the Arendt-Batty stability theorem, as has been done recently in [MAT 05]).

7.4.2. General decomposition result

By re-using the notation $Y_\alpha(t) = \frac{1}{\Gamma(\alpha)}\, t_+^{\alpha-1}$ and by limiting ourselves to the case of strictly proper systems ($\beta_L < \alpha_K$), the following significant result is obtained (see [MAT 98a, AUD 00]).

THEOREM 7.1 (DECOMPOSITION RESULT).– The impulse response $h$ of system (7.34) of symbol $H$ has the structure:
\[ h(t) = \sum_{i=1}^{r} \sum_{j=1}^{\nu_i} r_{ij}\, Y_j(t)\, e^{s_i t} + \int_0^{+\infty} \mu(\xi)\, e^{-\xi t}\, d\xi \tag{7.39} \]
where $s_i$ are complex poles in $\mathbb{C} \setminus \mathbb{R}^-$ and where $\mu$ is a distribution. Moreover, in the case of a density, the analytical form of $\mu$ is given by:
\[ \mu(\xi) = \frac{1}{\pi}\, \frac{\sum_{k=0}^{K} \sum_{l=0}^{L} a_k b_l \sin\big((\alpha_k - \beta_l)\pi\big)\, \xi^{\alpha_k + \beta_l}}{\sum_{k=0}^{K} a_k^2\, \xi^{2\alpha_k} + \cdots} \]

[…] $\frac{1}{2}$, the increment process has long memory; if $H \le \frac{1}{2}$, memory is short;
– on the other hand, $H$ also determines the sample path regularity: the larger $H$ is, the more regular the sample path.

For continuous time, there exists a fractional ARMA family whose definition is identical to that of discrete time. Its parametric richness enables us to disconnect



regularity properties from memory properties: the parameters that regulate trajectories and memory range are no longer the same. Continuous time fractional ARMAs are stationary, but are not self-similar.

8.2. Fractional filters

8.2.1. Desired general properties: association

ARMA filters, i.e. “rational” filters, are stable when associated in series, in parallel and even in feedback loop. It is one of their main properties. Fractional filters offer a less advantageous situation. Two fractional filters associated in series yield a fractional filter. However, if we move to parallel association, the resulting filter no longer belongs to the fractional filter family. In general, the sum of any two power law transfer functions (of the type $z^d$) is not a power law function, except if all the exponents $d_j$ are integers. In other words, as opposed to the family of ARMA filters, that of fractional ARMA filters is not stable under parallel association.

We wish to compensate for this drawback. We can extend the fractional filter family to a class of filters which remains stable under parallel or series associations. A way of achieving this consists of adding to fractional filters all the sums of these filters. In fact, a good way of extending the family is to ensure that the two following properties are satisfied. The first relates to the localization of the filter's singular points, which must be situated in a well-chosen area. The second property relates to the growth at infinity of the filter's transfer function in the complex plane, which must not be too fast. In the discrete case, if they exist, the singularities are required not to be too unstable and, if they are oscillating, $F$ should be regular in their vicinity. This family contains fractional filters and shares a common property with that of traditional ARMA filters: both are closed under serial and parallel associations.
In continuous time, $F$ must be holomorphic in a conical area containing the right half-plane and its growth at infinity must be approximately that of a power law function.

8.2.2. Construction and approximation techniques

The approximation of filter $F$ by polynomials or rational fractions of the complex variable $z$ is a traditional problem. The approximate filter $F_a$ is used to filter a white noise in order to create an MA type approximation of the initial process. As for polynomial approximations, authors resort to truncated series expansions of $F(z)$. The expansions are made on the basis of the $z^n$ or on the basis of Gegenbauer


polynomials [GRAY 89]. Expansion coefficients are easily calculated by recurrence. Because of this property, it is possible to calculate hundreds of thousands of terms (290,000 in [GRAY 89] to approximate a filter having two roots on the unit circle). Moreover, linear recurrences still exist for the coefficients of the impulse response of general fractional filters. They remain linear, but the coefficients are affine functions of time. In simple cases, a quadratic upper bound for the remainder of the truncated series is easy to determine. Nevertheless, the number of terms increases with the memory range. To construct a very long memory process, a moving average process of a gigantic order is necessary. However, the simplicity of this procedure largely accounts for its success.

In a traditional way, the analyticity of $F$ and a criterion of infinite norm are both used. Other approximations are studied to solve stochastic problems that involve the properties of $F$ on the imaginary axis in continuous time or on the unit circle in discrete time.

Rational approximations are rarer although more promising. The principle lies in the search for an ARMA filter which minimizes a certain criterion, in general of type $L^2$. In [WHI 86], Whitfield brings together several ideas of approximations by a rational fraction. Certain criteria include a weighting function which essentially ensures the approximation in a frequency band. The procedures are built on linear or non-linear least square algorithms, recursive or not. The integral is replaced by a sum, or approximated by a trapezoid method. Various authors carry out approximations with an interesting intuitive sense. Let us quote Oustaloup [OUS 91], who chooses $2n + 1$ zeros $z_j$ and $2n + 1$ real poles $p_j$, so that the ratios $\frac{p_j}{z_j}$ and $\frac{z_{j+1}}{p_j}$ do not depend on $j$. In [BON 92, CUR 86] approximations of Hankel matrices $H$ were studied, whose first line consists of the desired impulse response. Calculations are carried out satisfactorily for the processes with intermediate memory. We then have access to an upper bound $H_\infty$ for the approximation error.

Baratchart et al. [BARA 91] perfected a powerful algorithm, which we describe briefly. A linear system is considered, with constant coefficients, strictly causal, single input and single output. Let $(f_1, f_2, \ldots, f_m, \ldots)$ be its impulse response and $f$ defined in the complex plane by $f(z) = \sum_{m=1}^{+\infty} f_m z^{-m}$. We assume that $f$ belongs to the Hardy space $H_2^-$, i.e., it is square integrable on the unit circle. We then seek a rational fraction $r$ of maximum order $n$ (to be determined) in $H_2^-$ that minimizes the criterion $\|f - r\|_2^2$. The problem at hand is that of minimizing a functional $\Psi_n(q)$, where $q$ is a polynomial



of $P_n^1$, the space of the real polynomials of maximum degree $n$ whose roots are inside the unit disc and such that the coefficient of the highest degree is equal to one. It is shown that if $f$ is holomorphic in the vicinity of the unit circle, then $\Psi_n$ can be extended to a regular function on $\Delta_n$, the closure of $P_n^1$ in $\mathbb{R}^n$. The closure $\Delta_n$ is the set of polynomials of degree $n$ whose roots are in the closed unit disc. A polynomial of the boundary $\partial\Delta_n$ of $\Delta_n$ can then be factorized into a polynomial of degree $k$, every root of which is of modulus 1, and a polynomial $q_i$ interior to $\Delta_{n-k}$. Moreover, if $q_i$ is a critical point of $\Psi_{n-k}$ in $\Delta_{n-k}$, then $\nabla_n(q)$, the gradient of $\Psi_n$ at point $q$, is orthogonal to $\partial\Delta_n$ and points outwards. By supposing, moreover, that $\nabla_k$ is non-zero on $\partial\Delta_k$ and that the critical points of $\Psi_k$ in $\Delta_k$ are not degenerate for $1 \le k \le n$, the following method can give a local minimum:
1) we choose a point $q_0$ interior to $\Delta_n$ as initial condition and integrate the vector field $-\nabla_n$;
2) either a critical point is reached: if it is a local minimum, the procedure is completed; otherwise, since it is not degenerate, it is unstable for small disturbances and the procedure can continue;
3) or the boundary $\partial\Delta_n$ is reached at $q_b$: then $q_b$ is decomposed into $q_b = q_u q_i$ and we go back to stage 1) with $q_i$ and $\Delta_{n-k}$.

We end up reaching a minimum $q_m$ of $\Psi_m$ ($1 \le m \le n$) that gives, by a simple transformation, a local minimum of $\Psi_n$. This algorithm never meets the same point twice and convergence towards a local minimum is guaranteed. It was extended to the multivariable case and to time-varying coefficients (see [BARA 98]).

8.3. Discrete time fractional processes

8.3.1. Filters: impulse responses and corresponding processes

Let us consider the fractional filter (of variable $z$) parameterized by $\alpha_j$ and $d_j$:
\[ F(z) = \prod_{j=1}^{J} (1 - \alpha_j z)^{d_j}, \qquad |z| < a \tag{8.1} \]

where $a$ is defined below. The $\alpha_j$ are non-zero complex numbers. Let us note that $\alpha_j^{-1}$ is a singular point of $F$ if $d_j \notin \mathbb{N}$. The following notations will be used:
1) $E^*$ is the set of the singular points of $F$;
2) $a = \min\{|\alpha_j|^{-1},\ j \in E^*\}$;
3) $E^{**}$ is the set of the indices of the singular points whose modulus is equal to $a$;
4) $d = \min\{\Re e(d_j),\ j \in E^{**}\}$;



5) $E^{***}$ is the subset of $E^{**}$ corresponding to the singular points for which $\Re e(d_j) = d$.

In (8.1), each factor is selected to satisfy $(1 - \alpha_j z)^{d_j} = 1$ when $z = 0$. In the domain $|z| < a$, $F$ admits the series expansion:
\[ F(z) = 1 + \sum_{j=1}^{+\infty} a_j z^j \]

where the sequence $(a_j)_{j \ge 1}$ is the convolution product of the expansions of the $J$ factors in (8.1). When $n$ tends to infinity and if $F$ is not a polynomial, i.e., if $E^* \ne \emptyset$, then:
\[ a_n = \Big( \sum_{j \in E^{***}} C_0^{(j)}\, \alpha_j^n\, \frac{\Gamma(n - d_j)}{\Gamma(-d_j)\, n!} \Big) \big(1 + o(1)\big) \quad \text{when } n \to +\infty \tag{8.2} \]
where $C_0^{(j)} = \prod_{m \ne j} \big(1 - \frac{\alpha_m}{\alpha_j}\big)^{d_m}$.

From this we deduce that, if $F$ is not a polynomial, then:
1) $(a_j)$ belongs to $l^1(\mathbb{N})$ if and only if $a > 1$ or ($a = 1$ and $d > 0$);
2) $(a_j)$ belongs to $l^2(\mathbb{N})$ if and only if $a > 1$ or ($a = 1$ and $d > -\frac{1}{2}$).

Consequently, $\sum_1^\infty |a_j| = +\infty$ and $\sum_1^\infty |a_j|^2 < +\infty$ if and only if $a = 1$ and $d \in\, ]-\frac{1}{2}, 0]$.

Let us consider a transfer function of the form (8.1) with $\sum_1^\infty |a_j|^2 < +\infty$, meaning that either $F$ is a polynomial, or:
\[ a > 1 \quad \text{or} \quad \Big( a = 1 \text{ and } d > -\frac{1}{2} \Big) \tag{8.3} \]
Let us suppose now that if $(\alpha_k, d_k) \notin \mathbb{R}^2$, then the “conjugate” factor $(1 - \bar{\alpha}_k z)^{\bar{d}_k}$ also appears in the right-hand side of (8.1). Then, $(a_j)$ is a real sequence. If $(\varepsilon_n)$ is a white noise, the process defined by:
\[ X(n) = \varepsilon(n) + \sum_{j=1}^{\infty} a_j\, \varepsilon(n - j) \tag{8.4} \]

is a second order stationary process, zero-mean, linear and regular, with a spectral density proportional to $f(\lambda) = \big|\prod_{j=1}^{J} (1 - \alpha_j \exp(i\lambda))^{d_j}\big|^2$. Moreover, if $(\varepsilon(n))$ is an iid sequence, $(X(n))$ is strictly stationary and ergodic. This process admits an autoregressive expansion of infinite order:
\[ X(n) + \sum_{j=1}^{\infty} b_j\, X(n - j) = \varepsilon(n) \quad \text{with} \quad \sum_{j=1}^{\infty} b_j^2 < +\infty \]



Under the following conditions:
1) $|\alpha_j| \le 1$ for all $j \in \{1, \ldots, J\}$;
2) if $|\alpha_j| = 1$, then $\Re e(d_j) \in\, ]-\frac{1}{2}, \frac{1}{2}]$,
when $n$ tends towards infinity, we have:
1) $a_n = O(a^{-n})$ and $b_n = O(b^{-n})$ when $a > 1$;
2) $a_n = O(n^{-d-1})$ and $b_n = O(n^{\delta-1})$ when $a = 1$,
with $b = \min\{|\alpha_j|^{-1},\ d_j \in \mathbb{N}\}$ and $\delta = \max\{\Re e(d_j),\ |\alpha_j| = 1\}$.

The coefficients $(a_j)$ are the solution of the affine linear difference equation of order $J$:
\[ n\, a_n + \sum_{k=1}^{J} \big[(n - k)\, q_k - p_{k-1}\big]\, a_{n-k} = 0, \qquad n \ge 1 \tag{8.5} \]
with $a_j = 0$ if $j < 0$ and $a_0 = 1$. The $p_j$ and $q_j$ are respectively the coefficients of the two relatively prime polynomials $P$ and $Q$ defined by $Q(0) = 1$ and:
\[ \frac{P(z)}{Q(z)} = \frac{F'(z)}{F(z)} = \sum_{j=1}^{J} \frac{-\alpha_j d_j}{1 - \alpha_j z} \]
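The recurrence (8.5) follows from identifying the coefficients of $z^{n-1}$ in $Q(z)F'(z) = P(z)F(z)$, and it can be sketched numerically as below. This is an illustration under assumptions: the function name is hypothetical, $Q$ is taken as the full product $\prod_j (1 - \alpha_j z)$ and $P$ as the matching numerator (which coincides with the relatively prime pair of the text when the $\alpha_j$ are distinct, and still satisfies $QF' = PF$ otherwise), and complex arithmetic is used throughout.

```python
import numpy as np


def fractional_filter_coeffs(alphas, ds, n_terms):
    """Coefficients a_0..a_{n_terms} of prod_j (1 - alpha_j z)**d_j via recurrence (8.5)."""
    alphas = np.asarray(alphas, dtype=complex)
    ds = np.asarray(ds, dtype=complex)
    J = len(alphas)

    # Q(z) = prod_j (1 - alpha_j z), coefficients in ascending powers of z
    q = np.array([1.0 + 0j])
    for al in alphas:
        q = np.convolve(q, [1.0, -al])

    # P(z) = sum_j (-alpha_j d_j) prod_{m != j} (1 - alpha_m z), degree <= J - 1
    p = np.zeros(J, dtype=complex)
    for j in range(J):
        term = np.array([-alphas[j] * ds[j]])
        for m in range(J):
            if m != j:
                term = np.convolve(term, [1.0, -alphas[m]])
        p[:len(term)] += term

    a = np.zeros(n_terms + 1, dtype=complex)
    a[0] = 1.0
    for n in range(1, n_terms + 1):
        acc = 0j
        for k in range(1, min(n, J) + 1):      # a_{n-k} = 0 for n - k < 0
            acc += ((n - k) * q[k] - p[k - 1]) * a[n - k]
        a[n] = -acc / n
    return a


# Single factor (1 - z)**0.3: the recurrence reduces to a_n = a_{n-1} (n - 1 - d) / n
a_single = fractional_filter_coeffs([1.0], [0.3], 4)
```

For a single factor the sketch reproduces the binomial coefficients of $(1 - z)^d$; for several factors it can be compared with the convolution of the individual expansions, which is how the sequence $(a_j)$ is defined in the text.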

The coefficients $b_j$ are the solution of the same difference equation, replacing $P$ by $-P$. These equations are useful to calculate the coefficients $a_j$ and $b_j$, which enables us to obtain simulations or forecasts for the process $X(n)$.

8.3.2. Mixing and memory properties

We study two characterizations of the memory structure: the speed of covariance decrease and the mixing coefficients. The covariance of the process $X(n)$ is given by $\sigma(n) = \int_{-\pi}^{\pi} \exp(in\lambda)\, f(\lambda)\, d\lambda$. When $n$ tends to $+\infty$, if $F$ is a polynomial of degree $k$, then $\sigma(n) = 0$ for $n > k$, otherwise:
1) if $a > 1$, then $\sigma(n) = O(a^{-n})$;
2) if $a = 1$, then $\sigma(n) = \big(\sum_{j \in E^{***}} \gamma_j\, \alpha_j^n\, n^{-1-2d_j}\big)(1 + o(1))$.

Consequently, the covariance $(\sigma(n))$ is not absolutely summable, thus $X(n)$ has long memory, if and only if $a = 1$ and $d \in\, ]-\frac{1}{2}, 0]$, i.e., if a singularity is located on the unit circle.

Another approach is to study mixing coefficients. Mixing coefficients measure a certain form of dependence between sigma-algebras. More precisely, if $\mathcal{A}$ and $\mathcal{B}$ are



two sigma-algebras, the strong mixing coefficient $\alpha$ is defined by:
\[ \alpha(\mathcal{A}, \mathcal{B}) = \sup_{A \in \mathcal{A},\, B \in \mathcal{B}} \big| P(A \cap B) - P(A) P(B) \big| \]
the mixing coefficient of absolute regularity $\beta$ by:
\[ \beta(\mathcal{A}, \mathcal{B}) = \sup_{B \in \mathcal{B}} \big| P(B / \mathcal{A}) - P(B) \big| \]
and the mixing coefficient of maximum correlation $\rho$ by:
\[ \rho(\mathcal{A}, \mathcal{B}) = \sup_{X \in L^2(\mathcal{A}),\, Y \in L^2(\mathcal{B})} \big| \operatorname{corr}(X, Y) \big| \]
It is said that a process $X$ is mixing if the sigma-algebras generated by $\{X(t), t \in I\}$ and $\{X(t), t \in J\}$ have a mixing coefficient which tends towards 0 when $d(I, J)$ tends towards infinity. Here, we will use the mixing coefficient:
\[ r(n) = \sup\big\{ |\operatorname{cov}(u, v)|,\ u \in H_{-\infty}^{0},\ v \in H_{n}^{+\infty},\ \operatorname{Var}(u) = \operatorname{Var}(v) = 1 \big\} \]
where $H_r^s$ is the subspace of $L^2$ generated by $(X(j), j \in \{r, \ldots, s\})$. Then, $(X(n))$ is $r$-mixing if and only if $F$ is a polynomial or if $a > 1$ and, in the latter case, $r(n) = O(a^{-n})$. If, moreover, $(X(n))$ is Gaussian, then $(X(n))$ is strongly mixing if and only if $F$ is a polynomial or if $a > 1$ and, in this last case, $(X(n))$ is $\beta$-mixing with $\beta(n) = O(a^{-n})$. If $(\varepsilon(n))$ is an iid sequence with a probability density $p$ such that:
\[ \int |p(x) - p(x + h)|\, dx \le C_1 |h| \]
then, if $a > 1$, the process $(X(n))$ is $\beta$-mixing with $\beta(n) = O(a^{-2n/3})$.

According to the values of the parameters $a$ and $d$, there are thus cases where the process $X(n)$ is:
– with long memory and not mixing;
– with short memory and not mixing;
– with short memory and mixing.

8.3.3. Parameter estimation

In the context of long memory processes, there are two facets to estimation problems. The first is when long memory behaves like a parasitic phenomenon that tends to make traditional results obsolete. Examples of this kind are: regression parameter estimation with long memory noise, estimation of the marginal laws of a long memory process and change-point detection in a long memory process.

The other aspect relates to the estimation of the parameters quantifying the memory length, i.e., in fact, spectral density parameters. Within this last framework, three types of methods were largely studied, depending on whether the model was completely or partially parameterized:



1) methods related to maximum likelihood, which, by nature, aim at estimating all the parameters of the model;
2) the estimation of the memory exponent $d$ in $\varphi(\lambda) = f^*(\lambda)|\lambda|^{-2d}$, where $\varphi$ is the spectral density;
3) the estimation of the self-similarity parameter in fractional Brownian motion.

For fractional ARMA processes whose transfer function has the form:

$$F(z) = (1 - z)^d \prod_{j=1}^{J} (1 - \alpha_j z)^{d_j},$$

where the $d_j$ are integers and where $-\frac{1}{2} < d < \frac{1}{2}$ (referred to as ARFIMA), parametric methods give satisfactory results, provided that the order of the model is known a priori (see, for example, [GON 87]). When the order of the model is not known, semi-parametric methods, as in point 2), are fruitfully used. We can schematically describe these methods as follows. Let us use the flexible framework of models with spectral density:

$$\varphi(e^{i\lambda}) = |1 - e^{i\lambda}|^{-2d} g(\lambda),$$

where $g$ can be written $g(\lambda) = \exp\big(\sum_{k=0}^{\infty} \theta_k \cos k\lambda\big)$ and is a very regular function. The unique singularity of $\varphi$ is located at $\lambda = 0$. Then, the natural tool to estimate the spectral density is the periodogram:

$$I_n^X(x) = \frac{1}{2\pi} \Big| \sum_{t=1}^{n} X(t) e^{itx} \Big|^2$$

The log of the periodogram estimates $\log \varphi$:

$$\log \varphi(e^{i\lambda}) = -2d \log |1 - e^{i\lambda}| + \sum_{k=0}^{\infty} \theta_k \cos k\lambda$$
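The linear relation between the log-periodogram and $-2\log|1 - e^{i\lambda}|$ lends itself to a least squares fit. The following is a minimal sketch of that idea in the simplest case where only $d$ and $\theta_0$ are kept; it is not the penalized procedure of [MOU 00]. The function names, the use of the first $m$ Fourier frequencies $2\pi j/n$ and the $1/(2\pi n)$ normalization of the periodogram are assumptions made for illustration (a constant factor in the periodogram only shifts the intercept $\theta_0$, not the slope $d$).

```python
import cmath
import math


def periodogram(x, lam):
    """Periodogram of the sample x at frequency lam.

    Assumption: normalized by 1 / (2 * pi * n), with time index t = 1..n.
    """
    n = len(x)
    s = sum(xt * cmath.exp(1j * (t + 1) * lam) for t, xt in enumerate(x))
    return abs(s) ** 2 / (2.0 * math.pi * n)


def regress_d(freqs, log_I):
    """Ordinary least squares fit of log_I on u = -2 log|1 - e^{i lam}|.

    Returns (d_hat, theta0_hat): the slope estimates d and the intercept
    estimates theta_0 (all theta_k with k >= 1 are dropped in this sketch).
    """
    u = [-2.0 * math.log(abs(1.0 - cmath.exp(1j * lam))) for lam in freqs]
    u_bar = sum(u) / len(u)
    y_bar = sum(log_I) / len(log_I)
    sxy = sum((a - u_bar) * (b - y_bar) for a, b in zip(u, log_I))
    sxx = sum((a - u_bar) ** 2 for a in u)
    d_hat = sxy / sxx
    theta0_hat = y_bar - d_hat * u_bar
    return d_hat, theta0_hat


def estimate_d(x, m):
    """Estimate d from the sample x using the first m Fourier frequencies."""
    n = len(x)
    freqs = [2.0 * math.pi * j / n for j in range(1, m + 1)]
    log_I = [math.log(periodogram(x, lam)) for lam in freqs]
    return regress_d(freqs, log_I)
```

On an exact log-spectrum of the form $\theta_0 - 2d\log|1 - e^{i\lambda}|$, `regress_d` recovers $d$ and $\theta_0$ up to rounding; on real data, the quality of the estimate depends on the number of frequencies retained, which is the analog of the choice of $q$.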

After truncating the sum at order $q - 1$, the parameters $d, \theta_0, \theta_1, \ldots, \theta_{q-1}$ are estimated by a regression of $\log I(\lambda)$ against $(-2 \log |1 - e^{i\lambda}|, \cos k\lambda)$. The difficulty resides in two points:
– the choice of $q$;
– the study of the properties of the estimators.

In [MOU 00], recent procedures based on penalization techniques give automatic methods for the choice of $q$, which can be tuned to the function $g$. The asymptotic properties ensure convergence, for example towards a Gaussian law, of the estimator of $d$. Many estimators that rely on this principle have been proposed, some of them being compared in [BARD 01, MOU 01].

When the singularities of the transfer function are not located at $z = 1$, another model is used. Let us assume that the transfer function is written:

$$F(z) = (1 - e^{i\lambda_0} z)^d (1 - e^{-i\lambda_0} z)^d \prod_{j=1}^{J} (1 - \alpha_j z)^{d_j},$$

with $|\alpha_j| < 1$, $\lambda_0 \neq 0$. The spectral density takes the form $\varphi(\lambda) = |1 - e^{i(\lambda + \lambda_0)}|^{2d} g(\lambda)$ with a very regular $g$. If $\lambda_0$ is known, various authors [OPP 00] consider that $d$ can be estimated according to the ideas developed for $\alpha = 1$. When $\lambda_0$ is not known, it has to be estimated. The idea is to use the frequency at which the periodogram reaches its maximum. The procedures elaborated by Yajima [YAJ 96], Hidalgo [HID 99] and Giraitis et al. [GIR 01], although more sophisticated, provide convergence in probability and in law of the estimators towards the normal law.

8.3.4. Simulated example

Several methods have been proposed to simulate trajectories of fractional processes. Granger and Joyeux [GRAN 80] use an autoregressive approximation of order 100 obtained by truncating the AR(∞) representation, combined with an initialization procedure based on the Cholesky decomposition. Geweke and Porter-Hudak [GEW 85] or Hosking [HOS 81] elaborate on the autocovariance and use a Levinson-Durbin-Whittle algorithm to generate an autoregressive approximation. In both cases, quality is not quantified, although it can be. Gray et al. [GRAY 89] approximate $(X(n))$ by a long MA obtained by truncating the MA(∞) representation.

The second method seems inadequate in our case because there is no expression for the autocovariance function of fractional ARMA processes. These methods can easily be established thanks to differential equations (8.5). However, when $(X(n))$ is long-ranged, these methods require very long trajectories of white noise because of the slow decay of $a_n$; for example, Gray et al. [GRAY 89] use moving averages of order around 290,000.

Another idea consists of approximating $F$ by a rational fraction $B/A$ and simulating the ARMA process with representation $A(L)Y(n) = B(L)\varepsilon(n)$. This is the chosen approach for the example below. The algorithm used to calculate the polynomials $A$ and $B$ was developed by Baratchart et al. [BARA 91, BARA 98].
In principle, this algorithm, as with the theoretical results, is only valid when $F$ has no singular point on the unit disc. However, it provides satisfactory results, from the perspective detailed below. The studied filter $F$ reads:

$$F(z) = \big(z - \exp(2i\pi\, 0.231)\big)^{-0.2} \big(z - \exp(-2i\pi\, 0.231)\big)^{-0.2}$$



It is approximated by a rational fraction whose numerator and denominator are of degrees 8 and 9, respectively. Figure 8.1 shows the localization of the poles and the zeros of this rational fraction and the singularities of $F$.

Figure 8.1. Localization (polar plot) of the singularities of the filter $F(z) = (z - \exp(2i\pi\, 0.231))^{-0.2} (z - \exp(-2i\pi\, 0.231))^{-0.2}$ and of the poles (◦) and zeros (∗) of the rational fraction $B/A$ approximating $F$
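To get a feel for the filter of Figure 8.1, its modulus can be evaluated directly on the unit circle. The sketch below does only that; it does not compute the rational approximation $B/A$, whose algorithm [BARA 91, BARA 98] is not reproduced here. The function and constant names are hypothetical. Only the modulus is computed, so the choice of branch for the fractional power plays no role.

```python
import cmath
import math

THETA = 2.0 * math.pi * 0.231  # argument of the two singularities of F
EXPONENT = -0.2                # common fractional exponent of both factors


def modulus_db(omega):
    """Return 20 * log10 |F(e^{i omega})| for the filter of Figure 8.1.

    Undefined (division by zero) exactly at omega = +THETA or -THETA,
    where F is singular.
    """
    z = cmath.exp(1j * omega)
    dist_plus = abs(z - cmath.exp(1j * THETA))
    dist_minus = abs(z - cmath.exp(-1j * THETA))
    modulus = (dist_plus * dist_minus) ** EXPONENT
    return 20.0 * math.log10(modulus)
```

The modulus is even in `omega` and grows without bound as `omega` approaches `THETA`, which is the behavior a rational fraction can only reproduce approximately near the singularities.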

It is worth noting that the rational fraction $B/A$ is stable; however, its poles and zeros are almost superimposed on the two singularities of the function $F$. In Figure 8.2, the amplitude of $F$ on the unit circle (continuous line) is compared to that of the rational fraction (dotted line). The approximation is excellent away from a close vicinity around the singularities of $F$. Figure 8.3 shows a simulated sample path of the ARMA process obtained with the rational approximation of the transfer function. We compare the impulse response of the process, its empirical covariance and spectral density (continuous line) with the impulse response, theoretical covariance and spectral density (dotted line), calculated directly from the function $F$. The impulse response of the ARMA process satisfactorily matches the theoretical impulse response. The estimated spectral density for the ARMA process reasonably matches the theoretical spectral density. For small time lags, the autocorrelation calculated on the simulated process and the autocorrelation calculated as the inverse Fourier transform of the spectral density are very similar. This is no longer true for larger lags, which comes as no surprise because we have, on the one hand, a short memory process, and on the other, a long memory process.

Figure 8.2. Amplitude of $F(z)$ (continuous line) and of the rational fraction $\frac{B}{A}(z)$ (dotted line) approximating $F$ on the unit circle. Panels: transfer function (imaginary part versus real part); modulus of the transfer function (modulus in dB versus frequency in radians/second); Black diagram (modulus in dB versus phase in degrees); phase of the transfer function (phase in degrees versus frequency in radians/second)

Figure 8.3. Simulation of a sample path for a fractional ARMA process obtained from an approximating rational fraction. Comparison between simulated and theoretical impulse responses, spectral densities and autocorrelation functions. Panels: simulated trajectories; impulse response; spectral density; autocorrelation
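For comparison with the rational approximation used above, the truncated MA(∞) idea mentioned at the beginning of this section can be sketched in a few lines. This is not the method used for Figures 8.1 to 8.3. It assumes the simplest fractional noise $X = (1 - L)^{-d}\varepsilon$, whose moving average coefficients satisfy $\psi_0 = 1$ and $\psi_k = \psi_{k-1}(k - 1 + d)/k$; the function names, the truncation order and the seed are hypothetical choices.

```python
import random


def ma_coefficients(d, order):
    """Coefficients psi_0..psi_order of (1 - z)^(-d), computed recursively."""
    psi = [1.0]
    for k in range(1, order + 1):
        psi.append(psi[-1] * (k - 1 + d) / k)
    return psi


def simulate_fractional_noise(d, n, order, seed=0):
    """Simulate n samples of a truncated MA(order) approximation.

    Gaussian white noise of unit variance is assumed; `order` extra noise
    samples are drawn so that every output uses a full window of past noise.
    """
    rng = random.Random(seed)
    psi = ma_coefficients(d, order)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + order)]
    return [
        sum(psi[k] * eps[t + order - k] for k in range(order + 1))
        for t in range(n)
    ]
```

Because $\psi_k$ decays like $k^{d-1}$, the truncation order has to be very large when $d$ is close to $\frac{1}{2}$, which is the practical limitation pointed out in the text.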

8.4. Continuous time fractional processes

8.4.1. A non-self-similar family: fractional processes designed from fractional filters

FBMs were the first continuous time processes characterized by a fractional parameter to be introduced [MAN 68]. They are interesting because they generalize ordinary Brownian motion while maintaining its Gaussian nature and self-similarity. In return, however, they lose the independence of their increments. The key properties of these processes are governed by the unique parameter $H$. This simplicity



has its advantages and constraints. The design of continuous time processes, controlled by several parameters that make it possible to uncouple the local from the long-range memory properties, is the subject of the present section. However, these new processes are not self-similar.

Fractional ARMA processes are defined, as in the discrete case, by a fractional filter $F$, with complex parameters $a_k$ and $d_k$:

$$F(s) = \prod_{k=1}^{K} (s - a_k)^{d_k} \quad \text{for } \mathrm{Re}(s) > a \qquad (8.6)$$

The following notations are used:
1) $D = \sum_{k=1}^{K} d_k$;
2) $E^*$ is the set of the singular points of $F$;
3) $a = \max\{\mathrm{Re}(a_k), k \in E^*\}$;
4) $E^{**}$ is the set of the indices of the singular points whose real part is equal to $a$;
5) $d = \min\{\mathrm{Re}(d_k), k \in E^{**}\}$.

It is assumed that, if $(a_k, d_k) \notin \mathbb{R}^2$, then the factor $(s - \bar{a}_k)^{\bar{d}_k}$ is present in the right-hand side of (8.6). Under this hypothesis, $D$ is real. Let us moreover assume, in this section, that $D < 0$, since the study of the case $D > 0$ is the subject of the next section. Then, the set $E^*$ is not empty and the impulse response $f$, given by the inverse Laplace transform of $F$, is well-defined, real and locally integrable on $\mathbb{R}^+$:

$$f(t) = \frac{1}{2i\pi} \int_{c - i\infty}^{c + i\infty} \exp(st) F(s)\, ds \quad \text{for } t > 0,\ c \in\, ]a, +\infty[ \qquad (8.7)$$

No closed-form expression for $f$ is available except when $K = 1$. Its behavior in the vicinity of $0^+$ is described by:

$$f(t) \sim \frac{t^{-(1+D)}}{\Gamma(-D)}, \quad t \to 0^+ \qquad (8.8)$$

and, in the vicinity of $+\infty$, by:

$$f(t) \sim \sum_{k \in E^{**}} \lambda_k t^{-(1+d_k)} \exp(a_k t), \quad t \to +\infty \qquad (8.9)$$

where the $\lambda_k$ are non-zero complex numbers depending on the parameters $a_k$ and $d_k$.
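As an illustration of the case $K = 1$, where a closed form is available, the following worked computation is a sketch based on the standard Laplace transform pair for $t^{\nu - 1}e^{at}$; it is given here as a consistency check and is not taken from the original text.

```latex
% Single-factor filter (K = 1), with d < 0 so that D = d < 0:
F(s) = (s - a)^{d}, \qquad \mathrm{Re}(s) > \mathrm{Re}(a)

% Standard pair, valid for \nu > 0:
%   \int_0^{\infty} t^{\nu - 1} e^{a t} e^{-s t} \, dt = \Gamma(\nu) \, (s - a)^{-\nu}
% Applying it with \nu = -d gives the impulse response:
f(t) = \frac{t^{-(1+d)}}{\Gamma(-d)} \, e^{a t}, \qquad t > 0

% Consistency with the two asymptotic regimes:
%   t \to 0^{+}     : f(t) \sim t^{-(1+d)} / \Gamma(-d)      (agrees with (8.8), D = d)
%   t \to +\infty   : f(t) = \lambda \, t^{-(1+d)} e^{a t},
%                     \lambda = 1 / \Gamma(-d)               (agrees with (8.9))
```

In this single-factor case, the two asymptotic descriptions coincide with the exact expression, which is why no correction terms appear.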



As in the discrete case, if $D < 1 - K$, then $f$ is the solution of a linear differential equation of order $K$ whose coefficients are affine functions of $t$:

$$\sum_{j=0}^{K} (\nu_j + t\psi_j) f^{(j)}(t) = 0 \qquad (8.10)$$

where $\nu_j$ and $\psi_j$ are constants depending on the parameters $a_k$ and $d_k$. Then, the impulse responses enable us to define the processes $(X(t))$ as a stochastic integral of Brownian motion $W(s)$:

$$X(t) = \int_{-\infty}^{t} f(t - s)\, dW(s) \qquad (8.11)$$

when $f$ belongs to $L^2(\mathbb{R}^+)$ and is given by (8.8). Then, $(X(t))$ is a zero-mean, stationary Gaussian process with spectral density $g(\lambda) = |F(i\lambda)|^2$ and with covariance function $\sigma(t) = \int_{-\infty}^{+\infty} g(\lambda) \exp(i\lambda t)\, d\lambda$.

8.4.2. Sample path properties: local and global regularity, memory

Let us begin with the properties of the covariance function $\sigma$, from which the other properties are deduced. The memory properties of $(X(t))$ are given by the covariance behavior at infinity:
1) when $a < 0$, $\sigma(t) = o(t^{-n})$ when $t \to +\infty$ for all $n \in \mathbb{N}$;
2) when $a = 0$, $\sigma(t) \sim \sum_{j \in E^{**}} \gamma_j \exp(a_j t)\, t^{-2d_j - 1}$, $t \to +\infty$, where the $\gamma_j$ are constants depending on $F$;
3) $\sigma$ is non-integrable if and only if $a = 0$ and $d \in\, ]-\frac{1}{2}, 0[$. In this latter case, $(X(t))$ is a long memory process.

From the covariance behavior at infinity, we can also deduce that the process $(X(t))$ is strongly mixing if and only if $a < 0$. As in the discrete case, there are various memory-mixing scenarios.

The regularity of the sample path is determined by the covariance behavior at the origin:
1) if $D < -\frac{3}{2}$, then $\sigma(t) = \sigma(0) + \gamma_1 t^2 + o(t^{2+\varepsilon})$, where $\gamma_1$ is a non-zero constant and $\varepsilon$ a positive number;
2) if $D > -\frac{3}{2}$, then $\sigma(t) = \sigma(0) + \gamma_2 t^{-2D-1} + o(t^{-2D-1})$, where $\gamma_2$ is a non-zero constant;
3) if $D = -\frac{3}{2}$, then $\sigma(t) = \sigma(0) + \frac{t^2}{2} \log t\, (1 + o(1))$.

From this, we deduce that there is a process $(Y(t))$ equivalent to $(X(t))$, such that all the trajectories are Hölderian of exponent $\gamma$ for all $\gamma \in\, ]0, \min(1, -\frac{1}{2} - D)[$.
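The memory, mixing and regularity statements of this section can be gathered in a small helper, given here as a sketch only. The function name and the returned keys are hypothetical, and the check $D < -\frac{1}{2}$ is an assumption added so that the interval of Hölder exponents is non-empty.

```python
def describe_process(a, d, D):
    """Summarize memory, mixing and regularity from the parameters (a, d, D).

    a: largest real part of the singular points of F (a <= 0 is assumed),
    d: smallest real part of the exponents attached to that real part,
    D: sum of all exponents; D < -1/2 is assumed in this sketch.
    """
    if D >= -0.5:
        raise ValueError("this sketch assumes D < -1/2")
    return {
        # Long memory if and only if a = 0 and d lies in ]-1/2, 0[.
        "long_memory": a == 0 and -0.5 < d < 0,
        # Strongly mixing if and only if a < 0.
        "strongly_mixing": a < 0,
        # Sample paths are Hölderian for every exponent below this bound.
        "holder_bound": min(1.0, -0.5 - D),
    }
```

For instance, `describe_process(0.0, -0.25, -1.0)` describes a long memory, non-mixing process whose Hölder exponents range up to (but not including) 0.5.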



Moreover, if $D < -\frac{3}{2}$, these trajectories are of class $C^1$. The Hausdorff dimension of the sample paths of the process $Y(t)$ is then equal to 1 if $D < -\frac{3}{2}$ and to $\frac{5}{2} + D$ if $D \in [-\frac{3}{2}, -\frac{1}{2}[$.

In this studied family, there exist long memory processes $X(t)$ ($a = 0$, $d \in\, ]-\frac{1}{2}, 0[$), which can be as regular ($D < -\frac{3}{2}$) or irregular ($D$ close to $-\frac{1}{2}$) as desired. Conversely, there exist processes $X(t)$ with short memory ($a < 0$) having arbitrary regularity, in contrast to fractional Brownian motion. However, only the processes obtained from transfer functions $F(s) = s^d$ are self-similar.

8.5. Distribution processes

8.5.1. Motivation and generalization of distribution processes

To complement the survey of continuous time fractional ARMA processes, the natural idea is to study the consequences of relaxing the constraint $D < 0$. Then, the impulse response $f$ is no longer a simple function belonging to $L^2$, but can be defined as the inverse Laplace transform of $F$ in the space $\mathcal{D}'$ of distributions with support on $\mathbb{R}^+$. Expression (8.11) no longer makes sense and it is necessary to define the process $X$ differently.

Distribution processes were introduced independently by Ito [ITO 54], and Gelfand and Vilenkin [GEL 64]. They can be defined in the following way: a second order distribution process $X$ is a continuous linear map from the space of $C^\infty$ test functions with compact support in $\mathbb{R}$ to the space of random variables whose second order moment exists. We note:

$$\forall \varphi \in C_0^\infty(\mathbb{R}), \quad X(\varphi) = \langle X, \varphi \rangle \in L^2(\Omega)$$

and we have, for all compact $K$ of $\mathbb{R}$:

$$\exists C_K,\ \exists k,\ \forall \varphi \in C_0^\infty(K), \quad \|\langle X, \varphi \rangle\|_{L^2(\Omega)} \leq C_K \|\varphi\|_k$$

Derivation, time shift, convolution and Fourier transform (denoted $\mathcal{F}(X)$ or $\hat{X}$) are defined for distribution processes, as they are for distributions. Likewise, expectation, covariance and stationarity for distribution processes are defined as they are for continuous time processes. In particular (see [GEL 64]), if $X$ is a second order stationary distribution process, then there exists $\eta \in \mathbb{C}$ such that, for all $\varphi \in C_0^\infty(\mathbb{R})$, the expectation $m$ of $X$ is written $m(\varphi) = E(X(\varphi)) = \eta \int \varphi(t)\, dt$, and there exists a positive tempered measure $\mu$, called the spectral measure, such that the covariance $B$ of $X$ is written $B(\varphi, \psi) = E(X(\varphi)\overline{X(\psi)}) = \int \hat{\varphi}(\xi) \overline{\hat{\psi}(\xi)}\, d\mu(\xi) = \langle \sigma, \varphi * \bar{\psi} \rangle$. If $\mu$ admits a density $g$ with respect to the Lebesgue measure, $g$ is called the spectral density of $X$ and then $\sigma = \mathcal{F}^{-1}(g)$.

8.5.2. The family of linear distribution processes

Let us begin by putting forth the definition of this family. Let $f$ be a function of $L^2$ and $X(t)$ the Gaussian process defined by:

$$X(t) = \int f(t - s)\, dW(s)$$



where $W(s)$ is the ordinary Brownian motion. Let $\check{f}(t) = f(-t)$. Then, $X(t)$ is a distribution process if, for $\varphi \in C_0^\infty(\mathbb{R})$:

$$\langle X, \varphi \rangle = \int \Big( \int f(t - s)\, dW(s) \Big) \varphi(t)\, dt$$

It is easy to see that we can, in this case, permute the integrals and that, for all $\varphi \in C_0^\infty(\mathbb{R})$:

$$\int \Big( \int f(t - s)\, dW(s) \Big) \varphi(t)\, dt = \int \Big( \int f(t - s)\varphi(t)\, dt \Big) dW(s)$$

It is then natural to introduce the following process:

$$\langle X, \varphi \rangle = \int \check{f} * \varphi(s)\, dW(s) \qquad (8.12)$$

with $f \in \mathcal{D}'(\mathbb{R})$ such that $\check{f} * \varphi(s) \in L^2(\mathbb{R}^n)$ for all $\varphi \in C_0^\infty(\mathbb{R}^n)$. Let $H^{-\infty}(\mathbb{R}) = \cup_{s \in \mathbb{R}} H^s(\mathbb{R})$ be the Sobolev space, where $H^s(\mathbb{R}) = \{f \in \mathcal{S}'(\mathbb{R}),\ (1 + \xi^2)^{s/2} |\hat{f}| \in L^2(\mathbb{R})\}$, and $\mathcal{S}'$ is the space of tempered distributions. It is shown that (8.12) properly defines a distribution process when the distribution $f$ belongs to $H^{-\infty}(\mathbb{R})$. The process $X$, with impulse response $f$, is then a zero-mean stationary Gaussian distribution process, with spectral density $|\hat{f}(\xi)|^2$ and covariance $\sigma = \mathcal{F}^{-1}(|\hat{f}|^2)$. As for continuous time processes, the regularity of a process $X$, with impulse response $f \in H^{-\infty}(\mathbb{R}^n)$, is characterized by the regularity of $f$: if $f \in H^{-\infty}(\mathbb{R})$, the corresponding distribution process $X$ belongs to the Hölder space $C^s(\mathbb{R}, L^2(\Omega))$ if and only if $f$ belongs to the Besov space $B^s_{2,\infty}(\mathbb{R}^n)$ (see [TRI 92] for the definition of these spaces, or Chapter 3).

8.5.3. Fractional distribution processes

We can now define fractional ARMA processes for $D > 0$, i.e. when the inverse Laplace transform $f$ of $F$ does not belong to $L^2$. In fact, a more general framework is available: let us suppose that $F$ satisfies the two following assumptions:
1) $F$ is holomorphic in the domain $\mathcal{D} = \mathbb{C} \setminus \{z,\ \mathrm{Re}(z) \leq a \text{ and } |\mathrm{Im}(z)| \leq K|\mathrm{Re}(z)|\}$;
2) there exists $N > 0$ and there exists $C$ such that $|F(z)| \leq C(1 + |z|)^N$ in $\mathcal{D}$.

This includes the functions $F$ which verify (8.6). Under assumptions 1) and 2) above, $f = \mathcal{L}^{-1}(F)$ exists [SCH 66], since $\mathcal{L}$ is defined by:

$$\forall t > 0, \quad \mathcal{L}(f)(s) = F(s) = \langle f(t), e^{-st} \rangle \quad \text{for } s \in \mathbb{C},\ \mathrm{Re}(s) > a$$



The function $f$ belongs to the space $H^{-\infty}(\mathbb{R})$ if $a < 0$ or ($a = 0$ and $F(i\xi) \in L^2_{loc}(\mathbb{R})$), this condition being satisfied in the case of fractional filters as long as $d > -\frac{1}{2}$ if $a = 0$ and whatever the value of $D$. In fact, $f$ belongs to the Besov space $B^{-N-1/2}_{2,\infty}(\mathbb{R})$ and the associated process has regularity of order $-N - \frac{1}{2}$. Moreover, if $F$ verifies hypotheses 1) and 2) and if $a < 0$, then $f$ is an analytic function for $t > 0$. More precisely, there exists $K > 1$ such that $f$ has a holomorphic extension to $\{|\mathrm{Im}(t)| \leq \frac{1}{K} \mathrm{Re}(t)\}$.

If $F$ is a fractional filter, these impulse responses are simple. They are zero on $\mathbb{R}^-$ and, except on $\{0\}$, they are very regular functions. The only serious irregularity is at 0, as can be seen by examining the following formula. Let $\delta$ denote the Dirac mass at $t = 0$, $\delta^{(j)}$ its $j$th derivative in the sense of distributions and $\mathrm{pv}(t^\lambda)$ the principal value of $t^\lambda$. If $D \in \mathbb{N}$, the function $f$ reads:

$$f(t) = \delta^{(D)}(t) + \sum_{j=1}^{D} \gamma_j \delta^{(D-j)}(t) + \sum_{j=D+1}^{\infty} \gamma_j \frac{t^{-(D-j+1)}}{\Gamma(j - D)}$$

and if $D \in \mathbb{R} \setminus \mathbb{N}$:

$$f(t) = \frac{\mathrm{pv}(t^{-(D+1)})}{\Gamma(-D)} + \sum_{j=1}^{\infty} \gamma_j \frac{\mathrm{pv}(t^{-(D-j+1)})}{\Gamma(-D + j)}$$

The function $f$ is characterized by the same asymptotic behavior at infinity as in the continuous case and satisfies the same differential equation of order $K$ with affine coefficients in $t$. Also, the regularity index of the distribution process $X$ is $-D - \frac{1}{2}$.

8.5.4. Mixing and memory properties

For distribution processes, memory properties also derive from the summability of the covariance function. As for the continuous case, a fractional ARMA distribution process has long memory if and only if $a = 0$ and $d \in\, ]-\frac{1}{2}, 0[$.

Regarding mixing properties, it is necessary to redefine the various coefficients, extending the usual definitions [DOU 94] to distribution processes. To this end, we replace the concept of distance in time by that of time-lag between supports of test functions. Let $X$ be a stationary distribution process; let us denote by $H_{-\infty}^{0}$ and $H_{T}^{+\infty}$ the vector subspaces generated by $\{X(\varphi), \varphi \in C_0^\infty(]-\infty, 0])\}$ and $\{X(\psi), \psi \in C_0^\infty([T, +\infty[)\}$. The linear mixing coefficient of the process $X$ can then be defined by:

$$r_T = \sup\big\{ |\mathrm{corr}(Y, Z)|,\ Y \in H_{-\infty}^{0},\ Z \in H_{T}^{+\infty} \big\}$$

This coefficient coincides with the usual linear mixing coefficient when $X$ is a time process. The other mixing coefficients can be defined similarly. In particular, linear mixing and ρ-mixing coefficients are equal and satisfy the following bounding relation:

$$\alpha_T \leq \rho_T \leq 2\pi \alpha_T$$



This implies that for Gaussian distribution processes, linear mixing, ρ-mixing and α-mixing are equivalent notions. Let us now assume that $X$ is a linear distribution process, with transfer function $F$. A sufficient condition for the process $X$ to be ρ-mixing reads: if $F$ verifies assumptions 1) and 2), if $F$ is bounded from below for large $|z|$, i.e. there exist $C$, $A$ such that, for $|z| > A$, $C|z|^N \leq |F(z)|$, and moreover, if $a < 0$, then the ρ-mixing coefficient of the distribution process $X$ tends towards 0 when $T$ tends towards infinity. In this case, we obtain:

$$\rho_T = O(e^{bT}) \quad \text{for all } b \in\, ]a, 0[.$$

However, if $F$ has a singularity on the imaginary axis, i.e., if $F$ can be written as $F(z) = (z - i\alpha)^d G(z)$ with $d \in \mathbb{C} \setminus \mathbb{N}$, $\alpha \in \mathbb{R}$, $G$ continuous close to $i\alpha$ and $G(i\alpha) \neq 0$, then the distribution process is not ρ-mixing. These two conditions yield the following result for fractional ARMA processes: the fractional distribution process, with transfer function $F$, is ρ-mixing if and only if $a < 0$. Then, $\rho_T = O(e^{bT})$ for all $b \in\, ]a, 0[$. This result complements the result obtained for continuous time processes, by providing an explicit mixing rate.

Many authors [DOM 92, HAY 81, IBR 74, ROZ 63] studied the relation between the mixing properties and the spectral density of continuous time stationary processes. Their results either rely on more restrictive hypotheses than ours, but give better convergence speeds, or rely on hypotheses about functional spaces membership (difficult to verify in our case) but give necessary and sufficient conditions for the mixing coefficient to tend towards 0.

8.6. Bibliography

[BARA 91] BARATCHART L., CARDELLI M., OLIVI M., “Identification and rational L2 approximation: a gradient algorithm”, Automatica, vol. 27, no. 2, p. 413–418, 1991.
[BARA 98] BARATCHART L., GRIMM J., LEBLOND J., OLIVI M., SEYFERT F., WIELONSKY F., Identification d’un filtre hyperfréquences par approximation dans le domaine complexe, Technical Report 219, INRIA, 1998.
[BARD 01] BARDET J.M., LANG G., OPPENHEIM G., TAQQU M., PHILIPPE A., STOEV S., “Semi-parametric estimation of long-range dependence parameter: a survey”, in Long-range Dependence: Theory and Applications, Birkhäuser, Boston, Massachusetts, 2001.
[BER 94] BERAN J., Statistics for Long-memory Processes, Chapman & Hall, New York, 1994.



[BON 92] BONNET C., Réduction de systèmes linéaires discrets de dimension infinie: étude de filtres fractionnaires, RAIRO Automat-Prod. Inform. Ind., vol. 26, no. 5-6, p. 399–422, 1992.
[COX 84] COX D.R., “Long-range dependence: a review”, in DAVID H.A., DAVID H.T. (ed.), Proceedings of the Fiftieth Anniversary Conference Iowa State, Iowa State University Press, p. 55–74, 1984.
[CUR 86] CURTAIN R.F., GLOVER K., “Balanced realisation for infinite-dimensional systems”, in Operator Theory and Systems, Birkhäuser, Boston, Massachusetts, 1986.
[DOM 92] DOMINGUEZ M., “Mixing coefficient, generalized maximal correlation coefficients, and weakly positive measures”, J. Multivariate Anal., vol. 43, no. 1, p. 110–124, 1992.
[DOU 94] DOUKHAN P., Mixing Properties and Examples, Springer-Verlag, Lecture Notes in Statistics, 1994.
[GEL 64] GELFAND I.M., VILENKIN N.Y., Generalized Functions, vol. 4, Academic Press, New York, 1964.
[GEW 85] GEWEKE J., PORTER-HUDAK S., “The estimation and application of long time series models”, J. Time Series Anal., vol. 4, p. 221–238, 1985.
[GIR 01] GIRAITIS L., HIDALGO J., ROBINSON P.M., “Gaussian estimation of parametric spectral density with unknown pole”, Annals of Statistics, vol. 29, no. 4, p. 987–1023, 2001.
[GON 87] GONÇALVÈS E., “Une généralisation des processus ARMA”, Annales d’économie et de statistiques, vol. 5, p. 109–146, 1987.
[GRAN 80] GRANGER C.W.J., JOYEUX R., “An introduction to long-memory time series models and fractional differencing”, J. Time Ser. Anal., vol. 1, p. 15–29, 1980.
[GRAY 89] GRAY H.L., ZHANG N.F., WOODWARD W.A., “On generalized fractional processes”, J. Time Ser. Anal., vol. 10, no. 3, p. 233–256, 1989.
[HAY 81] HAYASHI E., “The spectral density of a strongly mixing stationary Gaussian process”, Pacific Journal of Mathematics, vol. 96, no. 2, p. 343–359, 1981.
[HID 99] HIDALGO J., “Estimation of the pole of long memory processes”, Mimeo, London School of Economics, 1999.
[HOS 81] HOSKING J.R.M., “Fractional differencing”, Biometrika, vol. 68, p. 165–176, 1981.
[IBR 74] IBRAGIMOV I., ROZANOV Y., Processus aléatoires Gaussiens, Mir, Moscou, 1974.
[ITO 54] ITO K., “Stationary random distributions”, Mem. Coll. Sci. Kyoto Univ. Series A, vol. 28, no. 3, p. 209–223, 1954.
[MAN 68] MANDELBROT B.B., VAN NESS J.W., “Fractional Brownian motions, fractional noises, and applications”, SIAM Review, vol. 10, no. 4, p. 422–437, 1968.
[MIL 93] MILLER K.S., ROSS B., An Introduction to Fractional Calculus and Fractional Differential Equations, John Wiley & Sons, 1993.
[MOU 00] MOULINES E., SOULIER P., “Confidence sets via empirical likelihood: broadband log-periodogram regression of time series with long-range dependence”, Annals of Statistics, vol. 27, no. 4, p. 1415–1439, 2000.



[MOU 01] MOULINES E., SOULIER P., “Semiparametric spectral estimation for fractional processes”, in Long-range Dependence: Theory and Applications, Birkhäuser, Boston, Massachusetts, 2001.
[OLD 74] OLDHAM K.B., SPANIER J., The Fractional Calculus, Academic Press, 1974.
[OPP 00] OPPENHEIM G., OULD HAYE M., VIANO M.C., “Long-memory with seasonal effects”, Statistical Inference for Stochastic Processes, vol. 3, p. 53–68, 2000.
[OUS 91] OUSTALOUP A., La commande Crone, commande robuste d’ordre non entier, Hermes, Paris, 1991.
[ROZ 63] ROZANOV Y., Stochastic Random Processes, Holdenday, 1963.
[SCH 66] SCHWARTZ L., Théorie des distributions, Hermann, Paris, 1966.
[TAQ 92] TAQQU M.S., “A bibliographical guide to self-similar processes and long-range dependence”, in Dependence in Probability and Statistics, Birkhäuser, 1992.
[TRI 92] TRIEBEL H., Theory of Function Spaces, Birkhäuser, 1992.
[WHI 86] WHITFIELD A.H., “Transfer function synthesis using frequency response data”, Int. J. Control, vol. 43, no. 5, p. 1413–1426, 1986.
[YAJ 96] YAJIMA Y., “Estimation of the frequency of unbounded spectral densities”, ASA Proc. Business and Economic Statistics, Section 4-7, Amer. Statist. Assoc., Alexandria, VA.


Chapter 9

Iterated Function Systems and Some Generalizations: Local Regularity Analysis and Multifractal Modeling of Signals

9.1. Introduction

There are many ways of carrying out the fractal analysis of a signal: evaluation and comparison of various measures and dimensions (for example, Hausdorff [FAL 90] or packing [TRIC 82], lacunarity [MAND 93], etc.). The objective of this chapter is to describe in detail two types of fractal characterizations:
– analysis of the pointwise Hölderian regularity;
– multifractal analysis and modeling.

The first characterization enables us to describe the irregularities of a function $f(t)$ by associating it with its Hölder function $\alpha_f(t)$ which gives, at each point $t$, the value of the Hölder exponent of $f$. The smaller $\alpha_f(t)$ is, the more irregular the function $f$ is at $t$. A negative exponent indicates a discontinuity, whereas if $\alpha_f(t)$ is strictly greater than 1, $f$ is differentiable at least once at $t$. The characterization of signals through their Hölderian regularity has been studied by many authors, both from a theoretical point of view, for instance in relation to wavelet decompositions [JAF 89, JAF 91, JAF 92, MEY 90a], and in signal processing applications [LEV 95, MAL 92] such as denoising, turbulence analysis [BAC 91] and image segmentation [LEV 96]. This approach is particularly relevant when the information

Chapter written by Khalid DAOUDI.



resides in the signal irregularity rather than, for example, in its amplitude or in its Fourier transform (this is notably the case for edge detection in image processing). The first part of this chapter is thus devoted to studying the properties of the Hölder function of signals. The question that naturally arises is the following: given a continuous function $f$ on $[0, 1]$, what is the most general form that can be taken by $\alpha_f$? By generalizing the notion of iterated function systems (IFS), we answer this question by characterizing the class of functions $\alpha_f$ and by giving an explicit method to construct a function whose Hölder function is prescribed in this class. This generalization enables us to define a new class of functions, that of generalized iterated function systems (GIFS). This will allow the development of a new approach to estimate the Hölder function of a given signal.

An interesting feature of the Hölder function is that it can be very simple while the signal is irregular. For example, although they are nowhere differentiable, the Weierstrass function [WEI 95] and fractional Brownian motion (FBM) [MAND 68] have a constant Hölder function. However, there are signals with very irregular appearance for which the Hölder function is even more irregular; e.g. continuous signals $f$ such that $\alpha_f$ is discontinuous everywhere. While the canonical example is that of IFS, it turns out to be more interesting to use another description for the signal: the multifractal spectrum. Instead of attributing to each $t$ the value of the Hölder function, all the points with the same exponent $\alpha$ are aggregated in a subset $E_\alpha$ and the irregularity is characterized in a global manner by calculating, for each value of $\alpha$, the Hausdorff dimension of the set $E_\alpha$. This yields a geometric estimation of the “size” of the subparts of the support of $f$ where a given singularity appears.

This type of analysis, first referenced in [MAND 72, MAND 74] and in the context of turbulence [FRI 95], has since been used often. It has been studied at a theoretical level (analysis of self-similar measures or functions in a deterministic [BAC 93, OLS 02, RIE 94, RIE 95] and random [ARB 02, FAL 94, GRA 87, MAND 89, MAU 86, OLS 94] context, extension to capacities, higher order spectra [VOJ 95]) and directly applied (study of DLA sequences [EVE 92, MAND 91], analysis of earthquake distribution [HIR 02], signal processing [LEV 95] and traffic analysis).

The second part of this chapter deals with multifractal analysis. Self-similar functions constitute the paradigm of “multifractal signals”, as most of the quantities of interest can be explicitly calculated. In particular, it has been demonstrated that the multifractal formalism, which connects the Hausdorff multifractal spectrum to the Legendre transform of a partition function, holds for self-similar measures and functions, and various extensions of the latter were considered [RIE 95]. The multifractal formalism enables us to reduce the calculation of the Hausdorff spectrum


(which is very complex, as it not only entails the determination of all Hölder exponents but also the deduction of an infinity of Hausdorff dimensions) to a simple calculation of a partition function limit, which offers less difficulty, both theoretically and numerically.

In practice, self-similar functions and their immediate extensions are, most of the time, too rigid to model real signals, such as, for instance, speech signals, in an appropriate way. In this chapter, we present a generalization, weak self-affine (WSA) functions, that offers a satisfactory trade-off between flexibility and complexity in modeling. Weak self-affine functions are essentially self-similar functions for which the renormalization parameters can differ from one scale to another, while verifying certain conditions which enable us to preserve the multiplicative structure. The Hausdorff multifractal spectrum of WSA functions is calculated and the validity of the multifractal formalism proven. We then explain how to use WSA functions to model and segment real signals. We also show how modeling through WSA functions can be used to estimate non-concave multifractal spectra.

This chapter is organized as follows. In the following section, we give the definition of the Hölder exponent. Then, we describe the concept of iterated function systems (IFS) and analyze the local regularity behavior of affine iterated function systems which generate the graphs of continuous functions. In section 9.4, which constitutes the core of this chapter, we propose some generalizations of IFS which enable us to solve the problem of characterizing Hölder functions. In section 9.5, we introduce a method to estimate the Hölder exponent, based on GIFS, and evaluate its performance from numerical simulations. In section 9.6, we address multifractal analysis and modeling. We introduce WSA functions and prove their multifractal formalism. In sections 9.7 and 9.8, we show how to represent and segment real signals by WSA functions. Section 9.9 is devoted to the estimation of the multifractal spectrum through the WSA approach. Finally, in section 9.10, we present some numerical experiments.

9.2. Definition of the Hölder exponent

The Hölder exponent is a parameter which quantifies the local (or pointwise) regularity of a function around a point.¹ Let us first define the pointwise Hölder space. Let $I$ be an interval of $\mathbb{R}$, $f$ a continuous function from $I$ to $\mathbb{R}$ and $\beta \in \mathbb{R}^{+*} \setminus \mathbb{N}$.

1. The Hölder exponent can also be defined for measures or functions of sets in general.



DEFINITION 9.1.– Let $t_0 \in I$. The function $f$ belongs to the pointwise Hölder space $C^\beta(t_0)$ if and only if there exists a polynomial $P$ of degree less than or equal to the integer part of $\beta$ and a positive constant $C$ such that for any $t$ in the neighborhood of $t_0$:

$$|f(t) - P(t - t_0)| \leq C |t - t_0|^\beta$$

Let us note that if $\beta \in \mathbb{N}^*$, the space $C^\beta$ has to be replaced by the Zygmund β-class [MEY 90a, MEY 90b].

DEFINITION 9.2.– A function $f$ is said to be of Hölder exponent $\beta$ at $t_0$ if and only if:
1) for any scalar $\gamma < \beta$:

$$\lim_{h \to 0} \frac{|f(t_0 + h) - P(h)|}{|h|^\gamma} = 0$$

2) if $\beta < +\infty$, for any scalar $\gamma > \beta$:

$$\limsup_{h \to 0} \frac{|f(t_0 + h) - P(h)|}{|h|^\gamma} = +\infty$$

where $P$ is a polynomial of degree less than or equal to the integer part of $\beta$.

If $\beta < +\infty$, this is equivalent to:

$$f \in \bigcap_{\epsilon > 0} C^{\beta - \epsilon}(t_0) \quad \text{but} \quad f \notin \bigcup_{\epsilon > 0} C^{\beta + \epsilon}(t_0)$$

This is also equivalent to:

$$\beta = \sup\{\theta > 0 : f \in C^\theta(t_0)\}$$

9.3. Iterated function systems (IFS)

Let $K$ be a complete metric space, with distance $d$. Given $m$ continuous functions $S_n$ from $K$ to $K$, we call the family $\{K, S_n : n = 1, 2, \ldots, m\}$ an iterated function system (IFS). Let $H$ be the set of all non-empty closed subsets of $K$. Then the set $H$ is a compact metric space for the Hausdorff distance $h$ [HUT 81] defined, for any $A$, $B$ in $H$, by:

$$h(A, B) = \max\Big( \sup_{x \in A} \inf_{y \in B} d(x, y),\ \sup_{x \in B} \inf_{y \in A} d(x, y) \Big)$$


Let us consider the operator W : H → H defined by: W (G) =

m +

Sn (G)

for all G ∈ H

n=1

We call any set $A \in H$ which is a fixed point of W an attractor of the IFS $\{K, S_n : n = 1, 2, \dots, m\}$, i.e., it satisfies:

\[ W(A) = A \]

An IFS always possesses at least one attractor. Indeed, given any set $G \in H$, the closure of the set of accumulation points of $\{W^{(m)}(G)\}_{m=1}^{\infty}$, with $W^{(m)}(G) = W(W^{(m-1)}(G))$, is a fixed point of W.

If all the functions $S_n$ are contractions, then the IFS is said to be hyperbolic. In this case, W is also a contraction for the Hausdorff metric; thus, it possesses a unique fixed point, which is the unique attractor of the IFS.

When the IFS is hyperbolic, the attractor can be obtained in the following manner [BAR 85a]: let $p = (p_1, \dots, p_m)$ be a probability vector with $p_n > 0$ and $\sum_n p_n = 1$. Starting from the fixed point $x_0$ of $S_1$, let us define the sequence $(x_i)$ by successively choosing $x_i \in \{S_1(x_{i-1}), \dots, S_m(x_{i-1})\}$, where $p_n$ is the probability of the event $x_i = S_n(x_{i-1})$. Then, the attractor is the closure of the trajectory $\{x_i\}_{i \in \mathbb{N}}$.

In this chapter, we focus on IFS which make it possible to generate graphs of continuous functions [BAR 85a]. Given a set of points $\{(x_n, y_n) \in [0; 1] \times [u; v],\ n = 0, 1, \dots, m\}$, with $(u, v) \in \mathbb{R}^2$, let us consider the IFS given by m contractions $S_n$ ($n = 1, \dots, m$) which are defined on $[0; 1] \times [u; v]$ by:

\[ S_n(x, y) = \big( L_n(x);\ F_n(x, y) \big) \]

where $L_n$ is a contraction which maps $[0; 1]$ onto $[x_{n-1}; x_n]$ and where $F_n : [0; 1] \times [u; v] \to [u; v]$ is a contraction with respect to the second variable, which satisfies:

\[ F_n(x_0, y_0) = y_{n-1}; \qquad F_n(x_m, y_m) = y_n \qquad (9.1) \]

Then, the attractor of this IFS is the graph of the continuous function which interpolates the points $(x_n, y_n)$. In general, this type of function is called a fractal interpolation function [BAR 85a]. The most studied class of IFS is that of affine iterated function systems, i.e., IFS for which $L_n$ and $F_n$ are affine functions. We study this class in what follows, and we also assume that the interpolation points are equally spaced. Then, $S_n$ ($0 \le n < m$) can be written in matrix form as:

\[ S_n \begin{pmatrix} t \\ x \end{pmatrix} = \begin{pmatrix} 1/m & 0 \\ a_n & c_n \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix} + \begin{pmatrix} n/m \\ b_n \end{pmatrix} \]
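The affine construction above can be sketched numerically. The code below is only an illustration under stated assumptions: the maps are indexed by n = 0, ..., m − 1 as in the matrix form, so that the endpoint conditions (9.1) read $S_n(0, y_0) = (n/m, y_n)$ and $S_n(1, y_m) = ((n+1)/m, y_{n+1})$; the helper names (`affine_fif_maps`, `apply_map`, `chaos_game`) and the uniform probability vector are hypothetical choices, not taken from the text.

```python
import random

def affine_fif_maps(y, c):
    """Return the pairs (a_n, b_n), n = 0..m-1, for ordinates y[0..m]
    and vertical scaling factors c[0..m-1] (abscissas are n/m)."""
    m = len(c)
    assert len(y) == m + 1
    maps = []
    for n in range(m):
        # S_n(0, y_0) = (n/m, y_n)  gives  c_n*y_0 + b_n = y_n
        b = y[n] - c[n] * y[0]
        # S_n(1, y_m) = ((n+1)/m, y_{n+1})  gives  a_n + c_n*y_m + b_n = y_{n+1}
        a = y[n + 1] - y[n] - c[n] * (y[m] - y[0])
        maps.append((a, b))
    return maps

def apply_map(n, t, x, m, maps, c):
    """Apply S_n to the point (t, x)."""
    a, b = maps[n]
    return t / m + n / m, a * t + c[n] * x + b

def chaos_game(y, c, n_points=10000, seed=0):
    """Random iteration: points accumulating on the graph of the
    fractal interpolation function (uniform probabilities assumed)."""
    rng = random.Random(seed)
    m = len(c)
    maps = affine_fif_maps(y, c)
    t, x = 0.0, y[0]          # (0, y_0) is the fixed point of S_0
    points = []
    for _ in range(n_points):
        n = rng.randrange(m)
        t, x = apply_map(n, t, x, m, maps, c)
        points.append((t, x))
    return points

y = [0.0, 1.0, 0.5]           # ordinates at t = 0, 1/2, 1
c = [0.6, -0.4]               # |c_n| < 1
maps = affine_fif_maps(y, c)
points = chaos_game(y, c, n_points=2000)
```

Once the $c_n$ are chosen, the two endpoint conditions leave no freedom for $a_n$ and $b_n$, which is what the two assignments inside `affine_fif_maps` express.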



Let f be the function whose graph is the attractor of the corresponding IFS. Let us note that once $c_n$ is fixed, $a_n$ and $b_n$ are uniquely determined by (9.1) so as to ensure the continuity of f. We are now going to calculate the Hölder function of f and see if we can control the local regularity with these affine iterated function systems.

PROPOSITION 9.1 ([DAO 98]).– Let $t \in [0; 1)$ and let $0.i_1 \dots i_k \dots$ be its base m decomposition (when t possesses two decompositions, we select the one with a finite number of digits). Then:

\[ \alpha_f(t) = \min\left( \liminf_{k \to +\infty} \frac{\log(c_{i_1} \cdots c_{i_k})}{\log(m^{-k})},\ \liminf_{k \to +\infty} \frac{\log(c_{j_1} \cdots c_{j_k})}{\log(m^{-k})},\ \liminf_{k \to +\infty} \frac{\log(c_{l_1} \cdots c_{l_k})}{\log(m^{-k})} \right) \]

where, for any integer k, if we note $t_k = m^{-k}[m^k t]$, the k-tuples $(j_1, \dots, j_k)$ and $(l_1, \dots, l_k)$ are given by:

\[ t_k^+ = t_k + m^{-k} = \sum_{p=1}^{k} j_p m^{-p}, \qquad t_k^- = t_k - m^{-k} = \sum_{p=1}^{k} l_p m^{-p} \]
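The three ratios of Proposition 9.1 can be evaluated at a finite depth k. The sketch below is only a finite-k proxy for the lower limits, under the assumptions that the factors $c_n$ are positive (otherwise their absolute values would be used) and that the neighbors $t_k^+$ and $t_k^-$ are skipped when they fall outside [0; 1); the names `digits` and `holder_estimate` are hypothetical.

```python
import math

def digits(n, k, m):
    """The k base-m digits (most significant first) of n * m**-k."""
    out = []
    for _ in range(k):
        out.append(n % m)
        n //= m
    return out[::-1]

def holder_estimate(t, c, k):
    """Minimum, at depth k, of log(c_{d_1}...c_{d_k}) / log(m**-k)
    over the digit strings of t_k, t_k^+ and t_k^-."""
    m = len(c)
    n = math.floor(t * m**k)          # t_k = n * m**-k
    candidates = [n]
    if n + 1 < m**k:
        candidates.append(n + 1)      # t_k^+ = t_k + m**-k
    if n >= 1:
        candidates.append(n - 1)      # t_k^- = t_k - m**-k
    ratios = []
    for q in candidates:
        log_prod = sum(math.log(c[d]) for d in digits(q, k, m))
        ratios.append(log_prod / (-k * math.log(m)))
    return min(ratios)

# With identical factors c_n = m**-h, every ratio equals h.
c = [3 ** -0.4] * 3
print(holder_estimate(0.37, c, 12))
```

Working with sums of logarithms rather than the product itself avoids underflow when k is large.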

COROLLARY 9.1 ([DAO 98]).– Let $t \in [0; 1)$. If, for every $i \in \{0, \dots, m - 1\}$, the proportion $\varphi_i(t)$ of i in the base m decomposition of t exists, then:

\[ \alpha_f(t) = -\sum_{i=0}^{m-1} \varphi_i(t) \log_m c_i \]
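As a small worked instance of Corollary 9.1, the following sketch evaluates the formula for given digit proportions. The function name and the numerical values are illustrative assumptions; the factors $c_i$ are taken positive.

```python
import math

def holder_from_frequencies(phi, c):
    """alpha_f(t) = - sum_i phi_i(t) * log_m(c_i), with m = len(c)."""
    m = len(c)
    return -sum(p * math.log(ci, m) for p, ci in zip(phi, c))

c = [0.6, 0.9]                        # illustrative factors, m = 2
uniform = [0.5, 0.5]                  # each digit occurs with proportion 1/m
print(holder_from_frequencies(uniform, c))
print(holder_from_frequencies([1.0, 0.0], c))   # only the digit 0 occurs
```

The exponent is a weighted average of the values $-\log_m c_i$, so it always lies between the smallest and the largest of them.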

This corollary clearly shows that we cannot control the regularity at each point using IFS. Indeed, the almost sure value of $\varphi_i(t)$ with respect to the Lebesgue measure is $1/m$; hence almost all points have the same Hölder exponent. However, we can easily construct a continuous function whose Hölder function is not constant almost everywhere. In the next section, we propose a generalization of IFS that offers more flexibility in the choice of the Hölder function.

9.4. Generalization of iterated function systems

The principal idea behind the generalization of IFS lies in the following question: what happens, in terms of regularity, if the contractions $S_i$ are allowed to vary at every iteration in the process of attractor generation? Raising this question requires that we first settle a preliminary issue: does an “attractor” exist in this case? Andersson [AND 92] studied this problem and found sufficient conditions for the existence and uniqueness of an attractor when the $S_i$ vary.



Formally, let us consider the collection of sets $(F^k)_{k \in \mathbb{N}^*}$ where each $F^k$ is a non-empty finite set of contractions $S_i^k$ in K for $i = 0, \dots, N_k - 1$, $N_k \ge 1$ being the cardinality of $F^k$, while $c_i^k$ denotes the contraction factor of $S_i^k$. For $n \in \mathbb{N}^*$, let $\Sigma^n_{N_i}$ be the set of sequences of length n, defined by:

\[ \Sigma^n_{N_i} = \big\{ \sigma = (\sigma_1, \dots, \sigma_n) : \sigma_i \in \{0, \dots, N_i - 1\},\ i \in \mathbb{N}^* \big\} \]

and let:

\[ \Sigma^\infty_{N_i} = \big\{ \sigma = (\sigma_1, \sigma_2, \dots) : \sigma_i \in \{0, \dots, N_i - 1\},\ i \in \mathbb{N}^* \big\} \]

For any k, let us consider the operator $W^k : H \to H$ defined by:

\[ W^k(A) = \bigcup_{n=1}^{N_k} S_n^k(A) \quad \text{for } A \in H \]

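Let us define the conditions:

\[ (c) \qquad \lim_{n \to \infty}\ \sup_{(\sigma_1, \dots, \sigma_n) \in \Sigma^n_{N_i}}\ \prod_{k=1}^{n} c^k_{\sigma_k} = 0 \]

\[ (c') \qquad \lim_{n \to \infty}\ \sup_{(\sigma_1, \sigma_2, \dots) \in \Sigma^\infty_{N_i}}\ \sum_{j=n}^{\infty} d\big(S^{j+1}_{\sigma_{j+1}} x, x\big) \prod_{k=1}^{j} c^k_{\sigma_k} = 0 \quad \forall x \in K \]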

Andersson proved the following result.

PROPOSITION 9.2 ([AND 92]).– If conditions (c) and (c') are satisfied, then there exists a unique compact set $A \subset K$ such that:

\[ \lim_{k \to \infty} W^k \circ \dots \circ W^1(G) = A \quad \text{for all } G \in H \]

A is called the attractor of the IFS $(K, \{F^k\}_{k \in \mathbb{N}^*})$.

9.4.1. Semi-generalized iterated function systems

We now consider the case where the $S_i^k$ are affine and where the $N_k$ are constant. Let $F^k$ be a set of affine contractions $S_i^k$ ($0 \le i < m$) whose matrix representation reads:

\[ S_i^k \begin{pmatrix} t \\ x \end{pmatrix} = \begin{pmatrix} 1/m & 0 \\ a_i^k & c_i^k \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix} + \begin{pmatrix} i/m \\ b_i^k \end{pmatrix} \]

Let us assume that conditions (c) and (c') are satisfied. Then, if $a_i^k$ and $b_i^k$ satisfy relations similar to (9.1), we can show, by using the same techniques as



in [BAR 85a], that the attractor of the semi-generalized IFS $(K, \{F^k\}_{k \in \mathbb{N}})$ is the graph of a continuous function f. As for standard IFS, let us now examine whether the expression of the Hölder function for semi-generalized IFS allows us to control the local regularity.

PROPOSITION 9.3 ([DAO 98]).– Let $t \in [0; 1)$. Then:

\[ \alpha_f(t) = \min\left( \liminf_{k \to +\infty} \frac{\log(c^1_{i_1} \cdots c^k_{i_k})}{\log(m^{-k})},\ \liminf_{k \to +\infty} \frac{\log(c^1_{j_1} \cdots c^k_{j_k})}{\log(m^{-k})},\ \liminf_{k \to +\infty} \frac{\log(c^1_{l_1} \cdots c^k_{l_k})}{\log(m^{-k})} \right) \]

where $i_p$, $j_p$ and $l_p$ are defined as in Proposition 9.1.

Although the Hölder function of semi-generalized IFS describes a broader class than that of standard IFS, it still remains very restrictive (as far as the problem at hand is concerned). Indeed, it is easy to observe that two scalars which differ only in a finite number of digits in their base m decomposition have the same Hölder exponent, whereas it is easy to construct a continuous function whose Hölder function does not satisfy this constraint. It thus remains impossible to control the local regularity at each point by using semi-generalized IFS.

9.4.2. Generalized iterated function systems

Let us now consider a more flexible extension than semi-generalized IFS, by allowing the number and support of the $S_i^k$ to vary through iterations. More precisely, let $F^k$ be the set of affine contractions $S_i^k$ ($0 \le i \le m^k - 1$), where each $S_i^k$ only operates on $\big[ [i/m]\, m^{-k+1};\ ([i/m] + 1)\, m^{-k+1} \big]$ and has values in $[i m^{-k};\ (i + 1) m^{-k}]$. Then, the matrix representation of $S_i^k$ becomes:

\[ S_i^k \begin{pmatrix} t \\ x \end{pmatrix} = \begin{pmatrix} 1/m & 0 \\ a_i^k & c_i^k \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix} + \begin{pmatrix} i/m^k \\ b_i^k \end{pmatrix} \]

We call $(K, (F^k))$ a GIFS. Given the $c_i^k$, the following construction yields an attractor which is the graph of a continuous function f that interpolates a set of given points $\{(i/m, y_i),\ i = 0, \dots, m\}$ (for simplicity, we consider the case m = 2, although the general case can be treated in a similar way). Consider the graph of a non-affine continuous function $\phi$ on [0; 1], and let us note:

\[ \phi(0) = u, \qquad \phi(1) = v \]

Then we choose $a_i^k$ and $b_i^k$ so that the following conditions hold. For i = 0, 1:

\[ S_i^1(0, u) = \Big( \frac{i}{m}, y_i \Big), \qquad S_i^1(1, v) = \Big( \frac{i + 1}{m}, y_{i+1} \Big) \]

\[ S_0^2(0, y_0) = (0, y_0), \qquad S_0^2(1/2, y_1) = S_1^2(0, y_0), \qquad S_1^2(1/2, y_1) = (1/2, y_1) \]

\[ S_2^2(1/2, y_1) = (1/2, y_1), \qquad S_2^2(1, y_2) = S_3^2(1/2, y_1), \qquad S_3^2(1, y_2) = (1, y_2) \]



For k > 2 and $i = 0, \dots, 2^k - 1$:

1) if i is even, then:

a) if $i < 2^{k-1}$:

\[ S_i^k \circ S^{k-1}_{[i/2]} \circ S^{k-2}_{[i/2^2]} \circ \dots \circ S^{2}_{[i/2^{k-2}]}(0, y_0) = S^{k-1}_{[i/2]} \circ S^{k-2}_{[i/2^2]} \circ \dots \circ S^{2}_{[i/2^{k-2}]}(0, y_0) \]

\[ S_i^k \circ S^{k-1}_{[i/2]} \circ S^{k-2}_{[i/2^2]} \circ \dots \circ S^{2}_{[i/2^{k-2}]}(1/2, y_1) = S_{i+1}^k \circ S^{k-1}_{[(i+1)/2]} \circ S^{k-2}_{[(i+1)/2^2]} \circ \dots \circ S^{2}_{[(i+1)/2^{k-2}]}(0, y_0) \]

b) if $i \ge 2^{k-1}$:

\[ S_i^k \circ S^{k-1}_{[i/2]} \circ S^{k-2}_{[i/2^2]} \circ \dots \circ S^{2}_{[i/2^{k-2}]}(1/2, y_1) = S^{k-1}_{[i/2]} \circ S^{k-2}_{[i/2^2]} \circ \dots \circ S^{2}_{[i/2^{k-2}]}(1/2, y_1) \]

\[ S_i^k \circ S^{k-1}_{[i/2]} \circ S^{k-2}_{[i/2^2]} \circ \dots \circ S^{2}_{[i/2^{k-2}]}(1, y_2) = S_{i+1}^k \circ S^{k-1}_{[(i+1)/2]} \circ S^{k-2}_{[(i+1)/2^2]} \circ \dots \circ S^{2}_{[(i+1)/2^{k-2}]}(1/2, y_1) \]

2) if i is odd, then:

a) if $i < 2^{k-1}$:

\[ S_i^k \circ S^{k-1}_{[i/2]} \circ S^{k-2}_{[i/2^2]} \circ \dots \circ S^{2}_{[i/2^{k-2}]}(1/2, y_1) = S^{k-1}_{[i/2]} \circ S^{k-2}_{[i/2^2]} \circ \dots \circ S^{2}_{[i/2^{k-2}]}(1/2, y_1) \]

b) if $i \ge 2^{k-1}$:

\[ S_i^k \circ S^{k-1}_{[i/2]} \circ S^{k-2}_{[i/2^2]} \circ \dots \circ S^{2}_{[i/2^{k-2}]}(1, y_2) = S^{k-1}_{[i/2]} \circ S^{k-2}_{[i/2^2]} \circ \dots \circ S^{2}_{[i/2^{k-2}]}(1, y_2) \]

This set of conditions, which we call continuity conditions, ensures that f is a continuous function that interpolates the points $(i/m, y_i)$. The Hölder function of f is given by the following proposition.

PROPOSITION 9.4.– Let us assume that conditions (c) and (c') are satisfied. Then, the attractor of the GIFS defined above is the graph of a continuous function f such that:

\[ f\Big(\frac{i}{m}\Big) = y_i \quad \forall i = 0, \dots, m \]

and:

\[ \alpha_f(t) = \min(\alpha_1, \alpha_2, \alpha_3) \]



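where:

\[ \begin{cases}
\alpha_1 = \liminf\limits_{k \to +\infty} \dfrac{\log\big( c^k_{m^{k-1} i_1 + m^{k-2} i_2 + \dots + m i_{k-1} + i_k} \cdots c^2_{m i_1 + i_2}\, c^1_{i_1} \big)}{\log(m^{-k})} \\[2ex]
\alpha_2 = \liminf\limits_{k \to +\infty} \dfrac{\log\big( c^k_{m^{k-1} j_1 + m^{k-2} j_2 + \dots + m j_{k-1} + j_k} \cdots c^2_{m j_1 + j_2}\, c^1_{j_1} \big)}{\log(m^{-k})} \\[2ex]
\alpha_3 = \liminf\limits_{k \to +\infty} \dfrac{\log\big( c^k_{m^{k-1} l_1 + m^{k-2} l_2 + \dots + m l_{k-1} + l_k} \cdots c^2_{m l_1 + l_2}\, c^1_{l_1} \big)}{\log(m^{-k})}
\end{cases} \qquad (9.2) \]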

and where $i_p$, $j_p$ and $l_p$ are defined as in Proposition 9.1.

NOTE 9.1.– Given m real numbers $u_1, \dots, u_m \in\ ]1/m; 1[$, let us define, for any $k \ge 1$ and for any $i \in \{0, \dots, m^k - 1\}$, $c_i^k$ as:

\[ c_i^k = u_{i + 1 - m[i/m]} \]

In this case, we recover the original construction of the usual IFS, considered in section 9.3.

We now show that GIFS allow us to solve the problem of characterizing Hölder functions. Indeed, we have the following main result.

THEOREM 9.1.– Let s be a function from [0; 1] into [0; 1]. The following conditions are equivalent:

1) s is the Hölder function of a continuous function f from [0; 1] into $\mathbb{R}$;

2) there is a sequence $(s_n)_{n \ge 1}$ of continuous functions such that:

\[ s(x) = \liminf_{n \to +\infty} s_n(x), \quad \forall x \in [0; 1] \]

The implication 1) ⇒ 2) is relatively easy and can be found in [DAO 98]. Hereafter, we present a constructive proof of the converse implication. To do so, let H denote the set of functions from [0; 1] into [0; 1] which are lower limits of continuous functions. We need the following lemma.

LEMMA 9.1 ([DAO 98]).– Let $s \in H$. There exists a sequence $\{R_n\}_{n \ge 1}$ of piecewise polynomials such that:

\[ \begin{cases}
s(t) = \liminf_{n \to +\infty} R_n(t) & \forall t \in [0; 1] \\[1ex]
\|R_n'^{+}\|_\infty \le n, \quad \|R_n'^{-}\|_\infty \le n & \forall n \ge 1 \\[1ex]
\|R_n\|_\infty \ge \dfrac{1}{\log n}
\end{cases} \qquad (9.3) \]

where $R_n'^{+}$ and $R_n'^{-}$ are the right and left derivatives of $R_n$, respectively.


Let $\{R_n\}_{n \ge 1}$ be the sequence given by (9.3) and M be the set of m-adic points of [0; 1]. Now let us consider the sequence $\{r_k\}_{k \ge 1}$ of functions from M into $\mathbb{R}$ defined, for any $t \in M$ such that $t = \sum_{p=1}^{k_0} i_p m^{-p}$, by:

\[ r_1(t) = R_1(i_1 m^{-1}) \]

\[ r_k(t) = k R_k(t) - (k - 1) R_{k-1}\Big( \sum_{p=1}^{k-1} i_p m^{-p} \Big) \quad \text{for } k = 2, \dots, k_0 \]

\[ r_k(t) = k R_k(t) - (k - 1) R_{k-1}(t) \quad \text{for } k > k_0 \]

Thanks to the continuity conditions, finding a GIFS whose attractor satisfies 1) amounts to determining the double sequence $(c_i^k)_{i,k}$. The latter is given by the following result.

PROPOSITION 9.5 ([DAO 98]).– Let $s \in H$ and $\{r_k\}_{k \ge 1}$ be the previously defined sequence. Then, the attractor of the GIFS whose contraction factors are given by:

\[ c_i^k = m^{-r_k(i m^{-k})} \]

is the graph of the continuous function f satisfying:

\[ \alpha_f(t) = s(t) \quad \forall t \in [0; 1] \]
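To see why these contraction factors produce the prescribed exponent, it may help to evaluate the ratio appearing in $\alpha_1$ of (9.2) numerically: along the m-adic path leading to t, the exponents $r_p$ telescope, so that $\log(c^1 \cdots c^k) / \log(m^{-k})$ reduces to $R_k(t_k)$. The sketch below checks this under a simplifying assumption that is not in the text: the sequence $R_k$ is replaced by s itself (which ignores the derivative bounds of Lemma 9.1). The function names are hypothetical.

```python
import math

def r(k, i, m, R):
    """r_k evaluated at the m-adic point i * m**-k (k digits)."""
    t = i / m**k
    if k == 1:
        return R(1, t)
    t_prev = (i // m) / m**(k - 1)    # point given by the first k-1 digits
    return k * R(k, t) - (k - 1) * R(k - 1, t_prev)

def contraction_factor(k, i, m, R):
    """c_i^k = m ** (-r_k(i * m**-k))."""
    return m ** (-r(k, i, m, R))

def exponent_along_path(t, k, m, R):
    """log(c^1_{i_1} ... c^k_{...}) / log(m**-k) along the path to t."""
    total = 0.0
    for p in range(1, k + 1):
        i_p = math.floor(t * m**p)    # index of the level-p interval containing t
        total += math.log(contraction_factor(p, i_p, m, R))
    return total / (-k * math.log(m))

def R(k, t):
    # Assumption for illustration: R_k = s for every k, with s(t) = t
    return t

print(exponent_along_path(0.3, 10, 2, R))
print(math.floor(0.3 * 2**10) / 2**10)    # R_10 at the truncation t_10
```

The two printed values agree up to rounding, which reflects the telescoping sum $\sum_{p \le k} r_p = k R_k(t_k)$.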

This result provides an explicit method, fast and easy to execute, that allows the construction of interpolating continuous functions whose Hölder function is prescribed in the class of lower limits of continuous functions (situated between the first and second Baire classes). Let us underline that there are two other constructive approaches to prescribe Hölder functions. One of them is based on a generalization of the Weierstrass function [DAO 98] and the other is based on the wavelet decomposition [JAF 95].

This section is concluded with some numerical simulations. Figures 9.1 and 9.2 show the attractors of GIFS with prescribed Hölder functions. Figure 9.1 (respectively 9.2) shows the graph obtained when $s(t) = t$ (respectively $s(t) = |\sin(5\pi t)|$). In both cases, the set of interpolation points is:

\[ \Big\{ (0, 0);\ \Big(\frac{1}{5}, 1\Big);\ \Big(\frac{2}{5}, 1\Big);\ \Big(\frac{3}{5}, 1\Big);\ \Big(\frac{4}{5}, 1\Big);\ (1, 0) \Big\} \]

9.5. Estimation of pointwise Hölder exponent by GIFS

In this section, we address the problem of the estimation of the Hölder exponent for a given discrete time signal. Our approach, based on GIFS, is to be compared


Figure 9.1. Attractor of a GIFS whose Hölder function is s(t) = t


Figure 9.2. Attractor of a GIFS whose Hölder function is s(t) = | sin(5πt)|

with two other methods which make it possible to obtain satisfactory estimations. The first method is based on the wavelet transform and is called the wavelet transform modulus maxima (WTMM) method (cf. Chapter 3 for a detailed description of this method). The second method [GON 92b] uses Wigner-Ville distributions.

9.5.1. Principles of the method

For the sake of simplicity, our study is limited to continuous functions on [0; 1]. The algorithm for calculating the Hölder exponent is based on Proposition 9.4. To apply this proposition to the calculation of the Hölder exponents of a real continuous signal f, we have to begin by calculating the coefficients $c_k^j$ of a GIFS whose attractor is f. This amounts to solving the “inverse problem” for GIFS, which is a generalization of the ordinary inverse problem for IFS. The latter problem was studied by many authors, either from a theoretical point of view [ABE 92, BAR 88, BAR 85a, BAR 85b, BAR 86, CEN 93, DUV 92, FOR 94, FOR 95, VRS 90] or in



physics applications [MANT 89, VRS 91b], image compression [BAR 93a, BAR 93b, CAB 92, FIS 93, JAC 93a, JAC 93b, VRS 91a] or signal processing [MAZ 92]. The inverse problem is defined as follows: “given a signal f, find a (generalized) iterated function system whose attractor best approximates f, in the sense of a fixed norm”. This problem is extremely complex in the case of IFS, yet becomes much easier in that of GIFS. In particular, it is shown in [DAO 98] that, for a function $f \in C^0([0; 1])$, we can find a GIFS whose attractor is as close to f as wished, in the $\|\cdot\|_\infty$ sense. This is obviously not the case for IFS, which have a finite number of functions. However, the price to pay for this simpler and better approximation is that we move from a finite to an infinite number of parameters in the modeling. A practical solution to the inverse problem for GIFS is detailed in the next section. For now, let us consider the particular case where:

\[ \begin{cases} m = 2 \\[1ex] L_i^k(t) = \dfrac{t}{m} + \dfrac{i}{m^k} \end{cases} \]

and let us write:

\[ s_k^j = f\big((k + \tfrac{1}{2})\, 2^{-j}\big) - \tfrac{1}{2}\Big[ f(k 2^{-j}) + f\big((k + 1)\, 2^{-j}\big) \Big] \quad \forall k \in \{0, \dots, 2^j - 1\} \]

and:

\[ c_k^j = \frac{s_k^j}{s^{j-1}_{[k/2]}} \quad \forall k = 0, \dots, 2^j - 1 \qquad (9.4) \]
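The coefficients of (9.4) are straightforward to compute from samples of f. The following sketch assumes that f is available as a callable on [0; 1] (for a discrete signal, a lookup at dyadic points would replace it) and that $s^{j-1}_{[k/2]}$ is non-zero; the function names are hypothetical.

```python
def s_coeff(f, j, k):
    """s_k^j: value at the midpoint of [k 2^-j, (k+1) 2^-j] minus the
    average of the values at the two endpoints."""
    h = 2.0 ** (-j)
    return f((k + 0.5) * h) - 0.5 * (f(k * h) + f((k + 1) * h))

def c_coeff(f, j, k):
    """c_k^j = s_k^j / s^{j-1}_{[k/2]}, as in (9.4); undefined when the
    parent coefficient vanishes (not handled in this sketch)."""
    return s_coeff(f, j, k) / s_coeff(f, j - 1, k // 2)

def parabola(x):
    # Simple test function with f(0) = f(1) = 0
    return x * (1.0 - x)

print(s_coeff(parabola, 3, 5))
print(c_coeff(parabola, 3, 5))
```

For this parabola, $s_k^j = 4^{-j}/4$ for every k, so each ratio $c_k^j$ equals 1/4. This is only an arithmetic check of the formulas: such a smooth function has coefficients below 1/2 at every scale.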

Then, since m and $L_i^k$ are fixed, the inverse problem is solved if the $c_k^j$ satisfy conditions (c) and (c'). Indeed, in this case, the attractor of the GIFS defined by the interpolation points $\{(0, 0);\ (\tfrac{1}{2}, f(\tfrac{1}{2}));\ (1, 0)\}$ and the coefficients $c_i^j$ is the graph of f. To calculate the Hölder function of f, we note:

\[ S = \Big\{ f \in C^0([0; 1]) : |c_k^j| \in \big[ \tfrac{1}{2} + \epsilon;\ 1 - \epsilon \big] \Big\} \]

where the $c_k^j$ are determined by (9.4) and where $\epsilon$ is a strictly positive fixed scalar (as small as we wish). If we have $f \in S$, we can apply Proposition 9.4, which yields the pointwise Hölder exponents of f.

NOTE 9.2.– If we have f(0) = f(1) = 0, then:

\[ f(x) = \sum_{j \ge 0}\ \sum_{0 \le k < \cdots} s_k^j\, \theta_{j,k}(x) \]

[...] 0 and b > 0 such that, for any $i \in \{0, \dots, d - 1\}$ and $j \ge 1$, we have:

\[ 0 < a \le u_i^j \le b < 1 \]



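Let us also suppose that:

\[ p(x_0, \dots, x_{d-1}) = \lim_{n} \frac{\operatorname{card}\big\{ j \in \{1, \dots, n\} : u_i^j \le x_i;\ \forall i = 0, \dots, d - 1 \big\}}{n} \]

exists for any $(x_0, \dots, x_{d-1}) \in [a; b]^d$. Finally, let us suppose that g is uniformly more regular than f. We denote by $d(\alpha)$ the (Hausdorff) multifractal spectrum of f, i.e., $d(\alpha) = \dim_H\{x : \alpha(x) = \alpha\}$, where $\alpha(x)$ is the Hölder exponent of f at point x. Then:

– $d(\alpha) = -\infty$ if $\alpha \notin [\alpha_{\min}; \alpha_{\max}]$, where:

\[ \begin{cases}
\alpha_{\min} = \lim\limits_{n} -\dfrac{\log_d(u_{d-1}^1) + \dots + \log_d(u_{d-1}^n)}{n} \\[2ex]
\alpha_{\max} = \lim\limits_{n} -\dfrac{\log_d(u_0^1) + \dots + \log_d(u_0^n)}{n}
\end{cases} \]

– if $\alpha \in [\alpha_{\min}; \alpha_{\max}]$, then:

\[ d(\alpha) = \inf_{q \in \mathbb{R}} \big( q\alpha - \tau(q) \big) \]

where:

\[ \tau(q) = \liminf_{n \to \infty} -\frac{1}{n} \sum_{j=1}^{n} \log_d\Big( (\lambda_0^j)^q + \dots + (\lambda_{d-1}^j)^q \Big) \]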
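The Legendre transform in the theorem above can be approximated numerically. The sketch below makes two assumptions that are not in the text: the coefficients do not depend on the scale ($\lambda_i^j = \lambda_i$ for all j, so that $\tau(q) = -\log_d \sum_i \lambda_i^q$), and the infimum over q is replaced by a minimum over a finite grid. The function names are hypothetical.

```python
import math

def tau(q, lam):
    """tau(q) = -log_d(sum_i lam_i**q) in the scale-independent case."""
    d = len(lam)
    return -math.log(sum(x ** q for x in lam), d)

def legendre_spectrum(alpha, lam, q_grid):
    """Grid approximation of inf_q (q * alpha - tau(q))."""
    return min(q * alpha - tau(q, lam) for q in q_grid)

q_grid = [i / 10 for i in range(-200, 201)]   # q in [-20, 20]

# Equal weights: every point has exponent 1, and d(1) = 1.
print(legendre_spectrum(1.0, [0.5, 0.5], q_grid))

# Unequal weights: the spectrum is a non-trivial concave curve.
lam = [0.6, 0.8]
for alpha in (0.4, 0.5, 0.6):
    print(alpha, legendre_spectrum(alpha, lam, q_grid))
```

Outside $[\alpha_{\min}; \alpha_{\max}]$, the quantity $q\alpha - \tau(q)$ is unbounded below, so the grid minimum keeps decreasing as the grid is widened; this is the numerical counterpart of $d(\alpha) = -\infty$.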

This theorem shows that the multifractal formalism is valid for WSA functions. Therefore, the Hausdorff, large deviation and Legendre multifractal spectra (see Chapter 1 or Chapter 4 for definitions) coincide, are concave and can be easily calculated.

9.7. Signal representation by WSA functions

The method we use to represent a given signal is based on its approximation by one or several WSA functions. In this section, we develop a practical technique to approximate, in the $L^2$ sense, a signal by a WSA function. In practice, only discrete data are available; thus, in what follows, we suppose that we have a signal $\{f(m),\ m = 0, \dots, 2^J - 1\}$. Our purpose is to find the parameters (d, g, (Si )i , (jk )k,j , (λji )i,j ) of the WSA function which provide the best $L^2$-approximation of f. In its general form, this problem is difficult to solve. However, it is possible to consider a simplified and less general form, for which a solution can be found by using a fast algorithm based on the wavelet decomposition of f. Let us describe this sub-optimal solution. Let φ



denote the scaling function, ψ the corresponding wavelet and $w_k^j$ the resulting wavelet coefficients of the signal f:

\[ f(x) = a_0 \varphi(x) + \sum_{j=0}^{J-1}\ \sum_{0 \le k < 2^j} w_k^j\, \psi(2^j x - k) \]

[...] 1, we have j=1 ij (k)2 n P = 1. n−1 k 0≤k 0. As a consequence, the $(c_k^j)$ are defined only for $j > j_0$. In practice, if we write the signal f as: f (x) =

0≤k [...]

\[ s = \sqrt{ \max_{l = 1:Q}\ \sum_{\alpha(n) = l} \big( \beta_2^{(n)} \big)^2 } \qquad (10.27) \]

where Q is the number of source blocks used. The sum above encompasses the scale coefficients $\beta_2^{(n)}$ associated with the set of destination blocks $r_n$ which depend on the source block $d_{\alpha(n)}$. If the collages result from averaging the pixels of the source blocks, s reads:

\[ s = \frac{B}{D} \sqrt{ \max_{l = 1:Q}\ \sum_{\alpha(n) = l} \big( \beta_2^{(n)} \big)^2 } \qquad (10.28) \]

Figure 10.8. Illustration of equations (10.27) and (10.28): for each source block $d_{\alpha(n)}$, we calculate the sum of the scale coefficients $\beta_2^{(n)}$ associated with the destination blocks $r_n$ which depend on $d_{\alpha(n)}$

Equation (10.28) shows that the Lipschitz factor of the operator T is reduced by a factor B/D when the pixels of the source blocks are averaged. It depends on the scale coefficients $\beta_2^{(n)}$ but also on the size difference of the compared blocks. Hence, the “spatial contraction” of the blocks influences, in this case, the contraction factor of T.



10.3.3.3. Fisher formulation

Fisher [FIS 95a] described the collage operation of a source block onto a destination block using a single formula:

\[ \omega_n \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} a_n & b_n & 0 \\ c_n & d_n & 0 \\ 0 & 0 & \beta_2^{(n)} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} + \begin{pmatrix} e_n \\ f_n \\ \beta_1^{(n)} \end{pmatrix} \qquad (10.29) \]

where (x, y) are the coordinates of a pixel belonging to the source block $d_{\alpha(n)}$ and z is its gray-level. The symbols $a_n$, $b_n$, $c_n$, $d_n$, $e_n$ and $f_n$ are the coefficients of the affine spatial transformation which maps the pixels of the source block $d_{\alpha(n)}$ into the destination block $r_n$. The quantities $\beta_1^{(n)}$ and $\beta_2^{(n)}$ are the transformation coefficients of the pixels' gray-level.

Whereas the mass transformation coefficients used by Jacquin are chosen from a predefined set of values, Fisher uses a uniform scalar quantizer. Jacobs et al. showed that for this type of quantizer, the quantization of the translation coefficients and of the scale coefficients, with seven and five bits, respectively, is optimal in terms of visual quality of the reconstructed images [JACO 92].

Optimal coefficients calculation of the mass transformation

For a destination block r, the approximation $\hat{r}$ is given by:

\[ \hat{r} = \beta_2 b_2 + \beta_1 b_1 \]

The calculation of the “mass” transformation coefficients is an optimization problem within a linear subspace X of the vector space $\mathbb{R}^{B^2}$. The purpose is to find, for a destination block r and for a decimated source block $b_2$, the optimal coefficients $\beta_1$ and $\beta_2$ minimizing the least square distance d between r and its collage $\hat{r}$.

THEOREM 10.2 (PROJECTION).– The optimal approximation, in the $L^2$ norm, of a vector r of $\mathbb{R}^{B^2}$ in the linear subspace X is the vector $\hat{r}$ of X for which the residual vector $r - \hat{r}$ is orthogonal to all vectors spanning the subspace X.

Let us consider the mean square error (MSE) between two vectors x and y in the space $\mathbb{R}^{B^2}$, defined by the following expression:

\[ \mathrm{Err}(x, y) = \sum_{j=1}^{B^2} (x_j - y_j)^2 \quad \forall x, y \in \mathbb{R}^{B^2} \]
Iterated Function Systems and Applications in Image Processing


Determining the two optimal coefficients $\beta_1$ and $\beta_2$ leading to the best approximation $\hat{r}$ of vector r in the basis $b_1$, $b_2$ amounts to canceling the two mean square errors:

\[ \mathrm{Err}(r - \beta_1 b_1 - \beta_2 b_2,\ b_1) = 0 \]

\[ \mathrm{Err}(r - \beta_1 b_1 - \beta_2 b_2,\ b_2) = 0 \]

The optimal coefficients $\beta_1$ and $\beta_2$ thus calculated are not constrained and the contraction of the collage operator is not guaranteed. Jacobs et al. show empirically that thresholding the magnitude of the $\beta_2$ coefficient to 1.5 ensures the final convergence of the fractal transformation. Hürtgen proposes a detailed study of the contraction control of the fractal transformation by considering particular cases based on square partitioning [HURT 93a, HURT 94] and on the spectral radius associated with the linear term L introduced by Lundheim (see equation (10.18)).

10.3.4. Experimentation on triangular partitions

It has been shown in section 10.3 that the fractal coding of a natural image consists of approximating each area of the image from other areas of the same image, by means of local transformations. The image is thus partitioned into N blocks $r_n$, forming a partition R. Each block is then put in correspondence with another block $d_n$ of the image, from which it is possible to approximate, by an elementary transformation $\omega_n$, the gray-level function of $r_n$. The operator W defining a fractal transformation of the image is composed of the N transformations $\omega_n$. It is finally contracting provided that a sufficient number of transformations $\omega_n$ are contracting.

Fractal transformation coding requires the coefficients of the N transformations $\omega_n$ to be stored. It is thus all the more efficient when the partition R contains fewer blocks. The first works of Jacquin, mentioned in section 10.3, showed the advantage of using square regular partitions to build the operator W. However, they did not reach large compression rates because the partition R entailed too large a number of blocks $r_n$.

This issue can be circumvented by calculating the fractal transformation on square or rectangular partitionings [FIS 95a] adapted to the image contents (quadtree, HV). We shall now present an approach that consists of calculating the fractal transformation of the image on a triangular partitioning, which is flexible and adapted to the contents of the image. Various algorithms [CHAS 93, DAVO 97] allow us to build such a triangulation.



Figure 10.9. Fractal transformation calculation. The left-hand partition (D) contains the source blocks and the right-hand partition (R) contains the destination blocks r

Let us assume that the triangulation R (adapted to the image contents) has N destination blocks $r_i$ and that the (regular) triangulation D contains Q source blocks (see Figure 10.9). The algorithm consists of associating each block $r_i$ with the block $d_j$ that minimizes the error d between the gray-level function of block $r_i$ and that of block $d_j$ transformed by ω. The mass transformation used to perform the collage of the source block $d_j$ onto the destination block $r_i$ is the same as the one proposed by Jacquin and Fisher. It uses only two coefficients: the shift coefficient $\beta_1^{(i)}$ and the scale coefficient $\beta_2^{(i)}$.

The decoding algorithm amounts to iterating the operator W, once the partitions R and D have been rebuilt, starting from an arbitrary image $f_0$. After k iterations of operator W, the gray-level $f_k(x_i, y_i)$ of a pixel in block r reads:

\[ f_k(x_i, y_i) = \beta_2\, f_{k-1}\big( v^{-1}(x_i, y_i) \big) + \beta_1 \quad \forall (x_i, y_i) \in r \qquad (10.30) \]

In practice, the result converges towards the reconstructed image, an attractor of the fractal transformation, after five to ten iterations (see Figure 10.10). The number of iterations depends primarily on the surface ratio between the blocks of partition D and those of partition R. As the number of iterations increases, the collage of a block $d_{\alpha(n)}$ (covering several blocks r) onto its corresponding block $r_n$ reduces the size of the details within the blocks of partition R.

10.3.5. Coding and decoding acceleration

10.3.5.1. Coding simplification by suppressing the search for similarities

Dudbridge proposed in 1995 [DUD 95b] a fast fractal compression method for images, based on a regular square partitioning. The speed of this compression algorithm is due to the fact that no search for a similar block is made. The


353

Figure 10.10. Decoding of Lena image 512 × 512. At iteration 15: Tc = 11.2 : 1, PSNR = $-10 \log_{10}(\mathrm{MSE}/255^2)$ = 32.29 dB

image is partitioned into a set of square fixed size blocks, and each block is coded individually by a fractal transformation. According to the author, the method gives less efficient results than, for example, Jacquin’s method. The reasons for this will be explained at the end of the section.



Coding

An image5 is coded using a set of contracting spatial transformations (IFS) $\{\omega_1, \dots, \omega_N\}$ defined on $\mathbb{R}^2$, associated with a contracting transformation G acting on the pixels' luminance. At resolution m, the square support of the IFS transformed image reads:

\[ A = \bigcup_{k=1}^{N} \omega_k(A) = \bigcup_{k_1=1}^{N} \cdots \bigcup_{k_m=1}^{N} \underbrace{\omega_{k_1} \circ \cdots \circ \omega_{k_m}(A)}_{A_{k_1 \dots k_m}} \qquad (10.31) \]

It is noteworthy that the spatial transformation is applied to A and not to a subpart of A, as is the case in the traditional coding approach defined by Jacquin. The quantity $p = A_{k_1 \dots k_m}$ denotes an “element” of the image support at resolution m, which may contain several pixels of the original image. At the maximum resolution, the size of element p is equal to that of an image pixel. The set $P_m = \{A_{k_1 \dots k_m};\ k_1, \dots, k_m = 1, \dots, N\}$ contains all the elements of the image at resolution m. In the following, we consider that the IFS is composed of N = 4 affine transformations defined by equations (10.10). Under these conditions, equation (10.31) is illustrated in Figure 10.11.

Figure 10.11. The square image A is divided into four square elements by four affine contracting transformations ωk1 (k1 = 1 . . . 4). In the center, Ak1 = ωk1 (A) corresponds to one of the four elements of the image at resolution 1. On the right, Ak1 k2 = ωk1 ◦ ωk2 (A) corresponds to one of the 16 elements of the image at resolution 2

5. Within this section, the term image stands for a square block resulting from the regular partitioning of the original image.


The transformation G abides by the following equation [DUD 95b]:

\[ Gf(p) = \iint_p (a_{k_1} x + b_{k_1} y + t_{k_1})\, dx\, dy + s_{k_1} v(p) \qquad (10.32) \]

in which the function $f : P_m \to \mathbb{R}$ gives the gray-level of element p and v(p) is the sum of the gray-levels of the elements included in the block $\omega_{k_1}^{-1}(p)$ (see Figure 10.12):

\[ v(p) = \sum_{i=1}^{N} f(A_{k_2 \dots k_m i}) \]

Figure 10.12. Example for $k_1 = 1$ (left upper quadrant), m = 4 and N = 4: $v(A_{1441}) = \sum_{i=1}^{4} f(A_{441i})$. In this particular case, the size of element $p = A_{1441}$ corresponds to that of an image pixel

In contrast to the mass transformation of Jacquin, which contains only one scale factor and one shift factor on the gray-levels, equation (10.32) contains two additional coefficients, related to the position (x, y) in the image of the element to be approximated. Dudbridge demonstrates that the transformation G is finally contracting at all resolutions m, with respect to the Euclidean distance, if $\big| \sum_{k_1=1}^{N} s_{k_1} \big|$ is smaller than 1.



The calculation of the coefficients $a_{k_1}$, $b_{k_1}$, $t_{k_1}$ and $s_{k_1}$ for all $k_1$ in the set [1 . . . N] is carried out so as to minimize the least squares error at resolution m between the original image f and its G-transform. This way, for the collage theorem to be satisfied, it suffices to minimize, for all $k_1$ in the set [1 . . . N], the following function:

\[ \sum_{p \in \omega_{k_1}(P_{m-1})} \left( \iint_p (a_{k_1} x + b_{k_1} y + t_{k_1})\, dx\, dy + s_{k_1} v(p) - f(p) \right)^2 \qquad (10.33) \]

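The N summations are performed on the sub-block $k_1$ of the original image, noted $\omega_{k_1}(P_{m-1})$ (recall that this is one of the four quadrants of the original image). The sum v(p) depends on the resolution m of approximation of the luminance of element p. The minimization of function (10.33) amounts to solving the following system of equations [DUD 95b], in which $\int_p x$, $\int_p y$ and $\int_p 1$ stand for the integrals over the element p and all sums run over p:

\[ \begin{pmatrix}
\sum_p \big(\int_p x\big)^2 & \sum_p \int_p x \int_p y & \sum_p \int_p x \int_p 1 & \sum_p v(p) \int_p x \\
\sum_p \int_p x \int_p y & \sum_p \big(\int_p y\big)^2 & \sum_p \int_p y \int_p 1 & \sum_p v(p) \int_p y \\
\sum_p \int_p x \int_p 1 & \sum_p \int_p y \int_p 1 & \sum_p \big(\int_p 1\big)^2 & \sum_p v(p) \int_p 1 \\
\sum_p v(p) \int_p x & \sum_p v(p) \int_p y & \sum_p v(p) \int_p 1 & \sum_p v(p)^2
\end{pmatrix}
\begin{pmatrix} a_{k_1} \\ b_{k_1} \\ t_{k_1} \\ s_{k_1} \end{pmatrix}
=
\begin{pmatrix}
\sum_p f(p) \int_p x \\
\sum_p f(p) \int_p y \\
\sum_p f(p) \int_p 1 \\
\sum_p f(p)\, v(p)
\end{pmatrix} \]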

An image (a square block of the original image partition) is then coded by a sequence of 4 × 4 real coefficients.

Decoding

The decoding algorithm allows for a fast and non-iterative reconstruction of the invariant function g associated with the operator G. It is only necessary to know the coefficients $a_{k_1}$, $b_{k_1}$, $t_{k_1}$ and $s_{k_1}$ associated with each of the N spatial transformations $\omega_{k_1}$.



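Dudbridge demonstrates that the gray-level sum, noted $g_{k_1}$, of the sub-elements included in the element $A_{k_1}$ can be decomposed as follows [DUD 95b, MON 95a]:

\[ g_{k_1} = a_{k_1} \iint_{A_{k_1}} x\, dx\, dy + b_{k_1} \iint_{A_{k_1}} y\, dx\, dy + t_{k_1} \iint_{A_{k_1}} 1\, dx\, dy + s_{k_1} \sum_{k=1}^{N} g_k \qquad (10.34) \]

and that, consequently, the sum of the gray-levels of the N elements $A_{k_1}$ reads:

\[ \sum_{k=1}^{N} g_k = \frac{\displaystyle \sum_{k=1}^{N} \Big( a_k \iint_{A_k} x + b_k \iint_{A_k} y + t_k \iint_{A_k} 1 \Big)}{\displaystyle 1 - \sum_{k=1}^{N} s_k} \qquad (10.35) \]

Similarly, $g_{k_1 k_2}$ stands for the gray-level sum of the sub-elements of $A_{k_1 k_2}$ and reads:

\[ g_{k_1 k_2} = a_{k_2} \iint_{A_{k_1 k_2}} x\, dx\, dy + b_{k_2} \iint_{A_{k_1 k_2}} y\, dx\, dy + t_{k_2} \iint_{A_{k_1 k_2}} dx\, dy + s_{k_2} \sum_{k=1}^{N} g_{k_2 k} \qquad (10.36) \]

with $\sum_{k=1}^{N} g_{k_2 k} = g_{k_2}$.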
The decoding procedure decomposes as follows:

– the sum $\sum_{k=1}^{N} g_k$ is directly calculated from the coefficients of G (formula (10.35)). The result is equal to the gray-level sum of the pixels in the original image;

– according to (10.34), $g_{k_1}$ ($k_1 = 1 \dots N$) is a function of the variables $a_{k_1}$, $b_{k_1}$, $t_{k_1}$, $s_{k_1}$ and of the value $\sum_{k=1}^{N} g_k$ previously calculated;

– according to (10.36), $g_{k_1 k_2}$ is a function of the variables $a_{k_2}$, $b_{k_2}$, $t_{k_2}$, $s_{k_2}$ and of the previously calculated $g_{k_2}$;

– etc.

This way, each element of the invariant function g at resolution m can be recursively calculated. The reconstruction procedure is not iterative, in contrast to most fractal decoding algorithms.

The method presented in this section approximates the luminance function f in each square block of an original image partition. Each block is approximated by an invariant function g, independently of the rest of the image. At a given resolution m, the approximation is performed in the least squares sense, using an IFS associated


Figure 10.13. Illustration of the Dudbridge algorithm for decoding an 8 × 8 image at resolution m = 3. The four values $g_{k_1 k_2}$ ($k_1 = 1 \dots 4$) at resolution 2 depend on the value $g_{k_2}$ at resolution 1 and on their respective position in the image

with a finally contracting transformation G in the luminance space. The expression of G (equation (10.32)) is comparable with that of the mass transformation suggested by Jacquin, since it also relies on a scale factor $s_{k_1}$ and on a shift factor $t_{k_1}$. It also contains two additional coefficients $a_{k_1}$ and $b_{k_1}$ which act on the coordinates of the approximated elements inside the block: a first order approximation, in that case. Equation (10.32) shares similarities with equation (10.26): the coefficients $a_{k_1}$ and $b_{k_1}$ define weights associated with two inclined planes, in the luminance space.

This method is not as efficient as the basic scheme of Jacquin because a block is approximated from itself and not from another block of the image. The transformation G must then be sufficiently complex to achieve a good block approximation. That is why Dudbridge added two more parameters to the mass transformation expression. However, storing these extra parameters results in a lower compression ratio. Nonetheless, the method has the advantage of being very fast. Moreover, the coding-decoding algorithm is evenly balanced regarding calculation time.

Dudbridge initially presented this coding technique in his thesis in 1992 [DUD 92]. Since then, the approach has been generalized to square blocks issued from a quadtree partitioning [DUD 95a]. The additional weights of the terms $x^2$, $y^2$, $x^3$ and $y^3$ in the expression of G were studied by Monro et al. [MON 93a, MON 93b, MON 94, WOO 94]. The authors also extended this approach to the compression of video sequences [WIL 94].

10.3.5.2. Decoding simplification by collage space orthogonalization

We now consider that vector $\hat{r}$, an approximation of the destination vector r, reads:

\[ \hat{r} = \beta_2 b_2 + \beta_1 b_1 \qquad (10.37) \]

The vector ˆ r belongs to the vector subspace spanned by b1 and b2 , where b1 is a constant basis vector (b1 = [1, 1, . . . , 1]T ) and b2 a vector extracted from the image to be coded. Each vector ˆ r, r, b1 and b2 is of dimension B 2 .



De-correlation of the mass transformation coefficients

Øien proposed in [ØIE 93] a solution designed to accelerate the decompression phase by orthogonalization of the basis vectors $b_1$ and $b_2$ that span the collage vector subspace. The second advantage of this approach is that we do not have to impose constraints on the scale coefficients of the fractal transformation.

In signal processing applications, such as data compression, the handled vectors are generally represented by a linear combination of orthogonal functions. The most usual example is that of the Fourier transform, where the functions are complex exponentials. We can resort to the Gram-Schmidt procedure to orthogonalize the decimated block $b_2$ with respect to the constant basis vector $b_1$ of the vector subspace. This procedure amounts to multiplying the vector $b_2$ by the orthogonalizing matrix $O = I - b_1 b_1^T$, where $I$ is the identity matrix of dimension $B^2 \times B^2$. The orthogonalized vector $\tilde{b}_2$ reads:

$$\tilde{b}_2 = O b_2$$

In practice, this amounts to removing the component of $b_2$ that belongs to the subspace spanned by vector $b_1$, and thus to forcing the mean value of block $b_2$ to zero. The new collage, now performed in the vector subspace spanned by the orthogonal vectors $b_1$ and $\tilde{b}_2$, becomes:

$$\hat{r}_o = \alpha_2 \tilde{b}_2 + \alpha_1 b_1$$

In this case, the coefficients $\alpha_i$ are independent of each other. Øien and Lepsøy show in [ØIE 94b] that it is not necessary to formally orthogonalize the vectors to calculate the coefficients $\alpha_1$ and $\alpha_2$. They are directly obtained from the expressions:

$$\alpha_1 = \langle r, b_1 \rangle = \frac{1}{B^2} \sum_{j=1}^{B^2} r_j \quad \text{and} \quad \alpha_2 = \frac{\langle r, b_2 \rangle - \alpha_1 \langle b_1, b_2 \rangle}{\|b_2\|^2 - \langle b_1, b_2 \rangle^2} \qquad (10.38)$$

in which $r_j$ and $b_j$ are the pixels inside blocks $r$ and $b_2$, respectively. The coefficients $\alpha_1$ and $\alpha_2$ are related to the initial coefficients $\beta_1$ and $\beta_2$ through the relations:

$$\beta_2 = \alpha_2 \quad \text{and} \quad \beta_1 = \alpha_1 - \langle b_1, b_2 \rangle \beta_2$$

Øien and Lepsøy show in [ØIE 94b] that, with the orthogonalization of vectors $b_1$ and $b_2$ prior to the coding phase, it becomes possible to guarantee the exact convergence of the decoder in a finite number of iterations.
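The expressions (10.38) and the relations between the $\alpha_i$ and $\beta_i$ can be checked numerically. The following Python sketch is only an illustration: it assumes that the inner product is normalized by $B^2$ (as suggested by the explicit formula for $\alpha_1$), and the function names and the random test blocks are hypothetical, not taken from [ØIE 94b].

```python
import numpy as np

def inner(x, y):
    """Normalized inner product <x, y> = (1/B^2) * sum_j x_j * y_j (assumed convention)."""
    return float(np.mean(x * y))

def collage_coefficients(r, b2):
    """Return (alpha1, alpha2, beta1, beta2) for flattened blocks r and b2 of length B^2."""
    b1 = np.ones_like(r)                   # constant basis vector
    m = inner(b1, b2)                      # mean value of the source block
    alpha1 = inner(r, b1)                  # mean value of the destination block
    alpha2 = (inner(r, b2) - alpha1 * m) / (inner(b2, b2) - m ** 2)
    beta2 = alpha2                         # relations between the two coefficient sets
    beta1 = alpha1 - m * beta2
    return alpha1, alpha2, beta1, beta2

rng = np.random.default_rng(0)
r = rng.uniform(0, 255, size=16)           # a 4 x 4 destination block, flattened
b2 = rng.uniform(0, 255, size=16)          # a decimated source block, flattened
alpha1, alpha2, beta1, beta2 = collage_coefficients(r, b2)

# Cross-check: a direct least-squares fit of r on (b1, b2) should give the same betas.
A = np.column_stack([np.ones_like(b2), b2])
beta_ls, *_ = np.linalg.lstsq(A, r, rcond=None)
```

Under this convention, `beta1` and `beta2` agree with the least-squares solution `beta_ls`, while `alpha1` is simply the mean of the destination block, which illustrates why the orthogonalized coefficients can be computed independently of each other.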



10.3.5.3. Coding acceleration: search for the nearest neighbor

In [SAU 95], Saupe proposes a fractal coding procedure of complexity $O(\log Q)$, where $Q$ is the number of source blocks, which is worth comparing with the usual schemes, whose complexity is $O(Q)$. The traditional procedure, using an affine transformation in the luminance space, searches, for each destination block $r \in \mathbb{R}^{B^2}$, for the source block $b_2 \in \mathbb{R}^{B^2}$, among the $Q$ available, that minimizes:

$$E(\hat{r}, r) = \min_{\beta_1, \beta_2} \| r - \beta_1 b_1 - \beta_2 b_2 \|^2$$

Calculation of the optimal coefficients $\beta_1$ and $\beta_2$ and of the error is costly in terms of calculation time.
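To make the cost of the traditional procedure concrete, the exhaustive search can be sketched as below. This is a minimal illustration, not the implementation used in the cited works: the function names, the random source blocks and the use of a generic least-squares solver are assumptions made for the example.

```python
import numpy as np

def collage_error(r, b2):
    """Minimum over (beta1, beta2) of ||r - beta1*b1 - beta2*b2||^2, with b1 constant."""
    A = np.column_stack([np.ones_like(b2), b2])
    beta, *_ = np.linalg.lstsq(A, r, rcond=None)
    residual = r - A @ beta
    return float(residual @ residual), beta

def exhaustive_search(r, sources):
    """Scan all Q source blocks (rows of `sources`): one least-squares fit per block."""
    best_index, best_error, best_beta = -1, np.inf, None
    for q, b2 in enumerate(sources):
        error, beta = collage_error(r, b2)
        if error < best_error:
            best_index, best_error, best_beta = q, error, beta
    return best_index, best_error, best_beta

rng = np.random.default_rng(1)
sources = rng.normal(size=(8, 16))      # Q = 8 hypothetical source blocks, B^2 = 16
r = 2.0 * sources[3] + 5.0              # destination block built from source block 3
index, error, beta = exhaustive_search(r, sources)
```

Each destination block triggers $Q$ least-squares fits, which is the $O(Q)$ cost mentioned above; in this toy case the search returns source block 3 with $\beta_1 = 5$ and $\beta_2 = 2$.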

Saupe considers the orthonormal basis of the vector subspace of $\mathbb{R}^{B^2}$ spanned by the normalized vectors $b_1$ ($b_1 = \frac{1}{B}(1, \ldots, 1)$) and $\phi(b_2)$. The symbol $\phi$ represents the projection operator making $\phi(b_2)$ orthonormal to $b_1$. It is shown, in this case [SAU 95], that the error $E(\hat{r}, r)$ is proportional to an increasing monotonic function of the distance $D$ given by:

$$D(\hat{r}, r) = \min\big( d(\phi(b_2), \phi(r)),\; d(-\phi(b_2), \phi(r)) \big)$$

Minimizing $E(\hat{r}, r)$ amounts to minimizing $D(\hat{r}, r)$ and thus to seeking the closest neighbor of $\phi(r)$ among the $2Q$ vectors $\pm\phi(b_2)$. Different fast nearest-neighbor search algorithms are proposed in the literature. In [FRI 77], the authors build a tree of dimension $B^2$ and define a search method of complexity $O(\log Q)$. The results presented in [SAU 95] show that it is possible to reduce the compression time by a factor of 1.3 to 11.5 without significantly degrading the quality of the reconstructed image. The gain depends of course on the nature of the image and on the number of source blocks considered.

10.3.6. Other optimization diagrams: hybrid methods

Seminal work by Jacquin, based on the use of a local iterated contracting functions system, launched a great deal of research into the approximation or coding of real 1D, 2D and 3D signals by fractals [FIS 95a, JACQ 93, SAU 94, WOH 99]. These mainly concern:
– the construction of an optimal partition to calculate the fractal transformation [DAVO 97, FIS 95b, FIS 95c, HURT 93c, NOV 93, REU 94, THO 95]. These partitions are composed of square, rectangular or polygonal blocks locally adapted to the texture of the images (see Figure 10.14);
– the acceleration of the coding algorithm (see [DAVO 96, DUD 92, LEP 93, TRU 00]);
– the use of constant vectors issued from a known dictionary, to approximate the destination blocks from the source blocks [GHA 93, VIN 93];
– the use of non-affine elementary functions $\omega_n$ that allow for coding the spatial redundancy of the images [LIN 94, POP 97];
– the acceleration of the decoding algorithm: iterative, non-iterative, hierarchical [BARA 95, CHAN 00, ØIE 93, ØIE 94a];
– the theoretical study of the fractal transformation [VRS 99] and of the convergence of the decoder [HURT 93a, HURT 94, LUN 92, MUK 00];
– the extension of the method to the coding of video sequences [BART 95, BEA 91, BOG 94, FIS 94, HURD 92, HURT 93b, LAZ 94, MON 95b, WIL 94];
– the use of fractals in hybrid coding-decoding schemes, either based on a discrete cosine transform [BART 94a, BART 94b, BART 95b] or on a wavelet transform of the image [BEL 98, DAVI 98, KRU 95, RIN 95, SIM 95, WALLE 96]. The fractal code is, in this case, calculated on another representation of the original image, which can be more favorable to the search for local similarities.

Figure 10.14. Illustration of different partitionings: quadtree (square), HV (rectangular) and Voronoï (polygonal), used for the search of local similarities in the image and for the calculation of the fractal code

Figure 10.15 compares the compression performances using fractal coding on square, rectangular and polygonal partitioning with those of a normalized JPEG compression [WALLA 91]. It highlights the fact that beyond a reasonable compression rate, compression by fractals outperforms the JPEG compression, at least in terms of visual quality of the images.

[Plot for Figure 10.15: reconstruction SNR (26 to 38) versus compression rate (10 to 100); curves: JPEG, Delaunay fractal, Voronoi fractal, HV fractal, Quadtree 1 fractal, Quadtree 2 fractal]

Figure 10.15. Signal to noise ratios versus the compression rate, calculated on the 512 × 512 Lena image. These ratios compare the quality of the reconstructed images, after fractal compression on blocks of variable geometries, and after a normalized JPEG compressor

10.4. Bibliography

[BARA 95] BAHARAV Z., MALAH D., KARNIN E., “Hierarchical interpretation of fractal image coding and its applications”, in FISHER Y. (Ed.), Fractal Image Compression: Theory and Application to Digital Images, Springer-Verlag, New York, p. 91–117, 1995.
[BARN 86] BARNSLEY M.F., ERVIN V., HARDIN D., LANCASTER J., “Solution of an inverse problem for fractals and other sets”, Proc. Natl. Acad. Sci. USA, vol. 83, p. 1975–1977, 1986.
[BARN 88] BARNSLEY M.F., Fractals Everywhere, Academic Press, New York, 1988.
[BARN 93] BARNSLEY M.F., HURD L.P., Fractal Image Compression, A.K. Peters, Wellesley, 1993.
[BART 94a] BARTHEL K.U., SCHÜTTEMEYER J., VOYÉ T., NOLL P., “A new image coding technique unifying fractal and transform coding”, in IEEE International Conference on Image Processing (Austin, Texas), p. 112–116, November 1994.
[BART 94b] BARTHEL K.U., VOYÉ T., “Adaptive fractal image coding in the frequency domain”, in Proceedings of International Workshop on Image Processing: Theory, Methodology, Systems, and Applications (Budapest, Hungary), June 1994.



[BART 95] BARTHEL K.U., VOYÉ T., “Three-dimensional fractal video coding”, in ICIP (Washington DC, USA), vol. 3, p. 260–263, 1995.
[BART 95b] BARTHEL K.U., “Entropy constrained fractal image coding”, Fractals, in NATO ASI on Fractal Image Coding, Trondheim, Norway, July 1995.
[BEA 91] BEAUMONT J.M., “Image data compression using fractal techniques”, BT Technol. J., vol. 9, no. 4, p. 93–109, 1991.
[BEL 98] BELLOULATA K., BASKURT A., BENOIT-CATIN H., PROST R., “Fractal coding of subbands with an oriented partition”, Signal Processing: Image Communication, vol. 12, 1998.
[BOG 94] BOGDAN A., “Multiscale (inter/intra frame) fractal video coding”, in Proceedings of the IEEE International Conference on Image Processing (ICIP’94, Austin, Texas), November 1994.
[CAB 92] CABRELLI C.A., FORTE B., MOLTER U.M., VRSCAY E.R., “Iterated fuzzy set systems: A new approach to the inverse problem for fractals and other sets”, Journal of Mathematical Analysis and Applications, vol. 171, no. 1, p. 79–100, 1992.
[CHAN 00] CHANG H.T., KUO C.J., “Iteration-free fractal image coding based on efficient domain pool design”, IEEE Transactions on Image Processing, vol. 9, no. 3, p. 329–339, 2000.
[CHAS 93] CHASSERY J.M., DAVOINE F., BERTIN E., “Compression fractale par partitionnements de Delaunay”, in Quatorzième colloque GRETSI (Juan-les-Pins, France), vol. 2, p. 819–822, 1993.
[DAVI 98] DAVIS G., “A wavelet-based analysis of fractal image compression”, IEEE Transactions on Image Processing, vol. 7, p. 141–154, 1998.
[DAVO 96] DAVOINE F., ANTONINI M., CHASSERY J.M., BARLAUD M., “Fractal image compression based on Delaunay triangulation and vector quantization”, IEEE Transactions on Image Processing: Special Issue on Vector Quantization, February 1996.
[DAVO 97] DAVOINE F., ROBERT G., CHASSERY J.M., “How to improve pixel-based fractal image coding with adaptive partitions”, in LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals in Engineering, Springer-Verlag, p. 292–307, 1997.
[DUD 92] DUDBRIDGE F., Image approximation by self-affine fractals, PhD Thesis, University of London, 1992.
[DUD 95a] DUDBRIDGE F., “Fast image coding by a hierarchical fractal construction”, University of California, San Diego, 1995.
[DUD 95b] DUDBRIDGE F., “Least-squares block coding by fractal functions”, in FISHER Y. (Ed.), Fractal Image Compression: Theory and Application to Digital Images, Springer-Verlag, New York, p. 229–241, 1995.
[ELT 87] ELTON J.H., “An ergodic theorem for iterated maps”, Ergodic Theory and Dynamical Systems, vol. 7, p. 481–488, 1987.
[FIS 91] FISHER Y., JACOBS E.W., BOSS R.D., Iterated transform image compression, Technical Report 1408, Naval Ocean Systems Center, San Diego, California, April 1991.



[FIS 94] FISHER Y., ROGOVIN D., SHEN T.P., “Fractal (self-VQ) encoding of video sequences”, in Proceedings of the SPIE: Visual Communications and Image Processing (Chicago, Illinois), September 1994.
[FIS 95a] FISHER Y. (Ed.), Fractal Image Compression: Theory and Application to Digital Images, Springer-Verlag, New York, 1995.
[FIS 95b] FISHER Y., “Fractal image compression with Quadtrees”, in FISHER Y. (Ed.), Fractal Image Compression: Theory and Application to Digital Images, Springer-Verlag, New York, p. 55–77, 1995.
[FIS 95c] FISHER Y., MENLOVE S., “Fractal encoding with HV partitions”, in FISHER Y. (Ed.), Fractal Image Compression: Theory and Application to Digital Images, Springer-Verlag, New York, p. 119–136, 1995.
[FRI 77] FRIEDMAN J.H., FINKEL J.L., “An algorithm for finding best matches in logarithmic expected time”, ACM Trans. Math. Software, vol. 3, no. 3, p. 209–226, 1977.
[GHA 93] GHARAVI-ALKHANSARI M., HUANG T.S., “A fractal-based image block-coding algorithm”, in Proceedings of ICASSP, p. 345–348, 1993.
[HURD 92] HURD L.P., GUSTAVUS M.A., BARNSLEY M.F., “Fractal video compression”, in Compcon Spring. Conference 37, p. 41–42, 1992.
[HURT 93a] HÜRTGEN B., “Contractivity of fractal transforms for image coding”, Electronics Letters, vol. 29, no. 20, p. 1749–1750, 1993.
[HURT 93b] HÜRTGEN B., BÜTTGEN P., “Fractal approach to low rate video coding”, in Proceedings of SPIE, vol. 2094, p. 120–131, 1993.
[HURT 93c] HÜRTGEN B., MÜLLER F., STILLER C., “Adaptive fractal coding of still pictures”, in Proceedings of the Picture Coding Symposium, 1993.
[HURT 94] HÜRTGEN B., HAIN T., “On the convergence of fractal transforms”, in Proceedings of ICASSP, p. 561–564, 1994.
[JACO 92] JACOBS E.W., FISHER Y., BOSS R.D., “Image compression: a study of the iterated transform method”, Signal Processing, vol. 29, p. 251–263, 1992.
[JACQ 92] JACQUIN A.E., “Image coding based on a fractal theory of iterated contractive image transformations”, IEEE Transactions on Image Processing, vol. 1, no. 1, p. 18–30, 1992.
[JACQ 93] JACQUIN A.E., “Fractal image coding: a review”, Proceedings of the IEEE, vol. 81, no. 10, p. 1451–1465, 1993.
[KRO 92] KROPATSCH W.G., NEUHAUSSER M.A., LEITGEB I.J., BISCHOF H., “Combining pyramidal and fractal image coding”, in Proceedings of the Eleventh ICPR (The Hague, Netherlands), vol. 3, p. 61–64, 1992.
[KRU 95] KRUPNIK H., MALAH D., KARNIN E., “Fractal representation of images via the discrete wavelet transform”, in IEEE Eighteenth Conference of EE in Israel (Tel Aviv), March 1995.
[LAZ 94] LAZAR M.S., BRUTON L.T., “Fractal block coding of digital video”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 4, no. 3, p. 297–308, 1994.



[LEP 93] LEPSØY S., Attractor image compression – Fast algorithms and comparisons to related techniques, PhD Thesis, Norwegian Institute of Technology, Trondheim, June 1993.
[LEV 90] LÉVY VÉHEL J., GAGALOWICZ A., Fractal approximation of 2-D object, Technical Report 1187, INRIA, Rocquencourt, France, 1990.
[LIN 94] LIN H., VENETSANOPOULOS A.N., “Incorporating nonlinear contractive functions into the fractal coding”, in Proceedings of the International Workshop on Intelligent Signal Processing and Communication Systems (Seoul, Korea), p. 169–172, October 1994.
[LUN 92] LUNDHEIM L., Fractal signal modelling for source coding, PhD Thesis, Norwegian Institute of Technology, Trondheim, September 1992.
[LUT 93] LUTTON E., LÉVY VÉHEL J., “Optimization of fractal functions using genetic algorithms”, in Fractal’93 (London, Great Britain), Springer, 1993.
[MAN 89] MANTICA G., SLOAN A., “Chaotic optimization and the construction of fractals: Solution of an inverse problem”, Complex Systems, vol. 3, p. 37–62, 1989.
[MON 93a] MONRO D.M., “Class of fractal transforms”, Electronics Letters, vol. 29, no. 4, p. 362–363, 1993.
[MON 93b] MONRO D.M., “Fractal transforms: Complexity versus fidelity”, in VERNAZZA G., VENETSANOPOULOS A.N., BRACCINI C. (Eds.), Image Processing: Theory and Applications, Elsevier Science Publishers, p. 45–48, 1993.
[MON 94] MONRO D.M., WOOLLEY S.J., “Fractal image compression without searching”, in Proceedings of ICASSP, vol. 5, p. 557–560, 1994.
[MON 95a] MONRO D.M., DUDBRIDGE F., “Rendering algorithms for deterministic fractals”, IEEE Computer Graphics and Applications, p. 32–41, January 1995.
[MON 95b] MONRO D.M., NICHOLLS J.A., “Low bit rate colour fractal video”, in ICIP (Washington DC, USA), vol. 3, p. 264–267, 1995.
[MUK 00] MUKHERJEE J., KUMAR P., GHOSH S.K., “A graph-theoretic approach for studying the convergence of fractal encoding algorithm”, IEEE Transactions on Image Processing, vol. 9, no. 3, p. 366–377, 2000.
[NOV 93] NOVAK M., Attractor coding of images, PhD Thesis, Department of Electrical Engineering, Linköping University, 1993.
[ØIE 93] ØIEN G.E., L2-optimal attractor image coding with fast decoder convergence, PhD Thesis, Norwegian Institute of Technology, Trondheim, April 1993.
[ØIE 94a] ØIEN G.E., BAHARAV Z., LEPSØY S., KARNIN E., MALAH D., “A new improved collage theorem with applications to multiresolution fractal image coding”, in International Conference on Acoustics, Speech, and Signal Processing, 1994.
[ØIE 94b] ØIEN G.E., LEPSØY S., “Fractal-based image coding with fast decoder convergence”, Signal Processing, vol. 40, p. 105–117, 1994.
[POP 97] POPESCU D.C., DIMCA A., YAN H., “A nonlinear model for fractal image coding”, IEEE Transactions on Image Processing, vol. 6, no. 3, 1997.
[RAM 86] RAMAMURTHI B., GERSHO A., “Classified vector quantization of images”, IEEE Transactions on Communications, vol. 34, no. 11, p. 1105–1115, 1986.



[REU 94] REUSENS E., “Partitioning complexity issue for iterated functions systems based image coding”, in Proceedings of the Seventh European Signal Processing Conference (Edinburgh, Scotland), vol. 1, p. 171–174, September 1994.
[RIN 94] RINALDO R., ZAKHOR A., “Inverse and approximation problem for two-dimensional fractal sets”, IEEE Transactions on Image Processing, vol. 3, no. 6, p. 802–820, 1994.
[RIN 95] RINALDO R., CALVAGNO G., “Image coding by block prediction of multiresolution subimages”, IEEE Transactions on Image Processing, p. 909–920, July 1995.
[SAU 94] SAUPE D., HAMZAOUI R., “A review of the fractal image compression literature”, Computer Graphics, vol. 28, no. 4, p. 268–276, 1994.
[SAU 95] SAUPE D., “Accelerating fractal image compression by multi-dimensional nearest neighbor search”, in STORER J.A., COHN M. (Eds.), Proceedings of the Data Compression Conference (DCC’95, Institute for Information Technology, Freiburg University), IEEE Computer Society Press, March 1995.
[SIM 95] SIMON B., “Explicit link between local fractal transform and multiresolution transform”, in ICIP (Washington DC, USA), vol. 1, p. 278–281, 1995.
[THO 95] THOMAS L., DERAVI F., “Region-based fractal image compression using heuristic search”, IEEE Transactions on Image Processing, vol. 4, no. 6, p. 832–838, 1995.
[TRI 93] TRICOT C., Courbes et dimension fractale, Springer-Verlag, 1993.
[TRU 00] TRUONG T.K., JENG J.H., REED I.S., LEE P.C., LI A.Q., “A fast encoding algorithm for fractal image compression using the DCT inner product”, IEEE Transactions on Image Processing, vol. 9, no. 4, p. 529–535, 2000.
[VIN 93] VINES G., Signal modelling with iterated function systems, PhD Thesis, Georgia Institute of Technology, May 1993.
[VRS 90] VRSCAY E.R., “Moment and collage methods for the inverse problem of fractal construction with iterated function systems”, in Fractal’90 conference, June 1990.
[VRS 99] VRSCAY E.R., SAUPE D., “Can one break the collage barrier in fractal image coding?”, in Fractals in Engineering, Springer-Verlag, 1999.
[WALLA 91] WALLACE G.K., “The JPEG still picture compression standard”, Communications of the ACM, vol. 34, no. 4, p. 30–44, 1991.
[WALLE 96] VAN DE WALLE A., “Merging fractal image compression and wavelet transform methods”, Fractals, 1996.
[WIL 94] WILSON D.L., NICHOLLS J.A., MONRO D.M., “Rate buffered fractal video”, in Proceedings of the ICASSP, vol. V, p. 505–508, 1994.
[WOH 99] WOHLBERG B., DE JAGER G., “A review of the fractal image coding literature”, IEEE Transactions on Image Processing, vol. 8, no. 12, p. 1716–1729, 1999.
[WOO 94] WOOLLEY S.J., MONRO D.M., “Rate-distortion performance of fractal transforms for image compression”, Fractals, vol. 2, no. 6, p. 395–398, 1994.

Chapter 11

Local Regularity and Multifractal Methods for Image and Signal Analysis

Chapter written by Pierrick LEGRAND.

11.1. Introduction

In this chapter, we shall review some of the important and recent applications of local regularity and multifractal analysis to signal/image processing. We do not aim at a complete coverage of the field, which would require a book of its own: (multi)fractal processing of signals and images is indeed now present in numerous applications. Rather, we will concentrate on a few topics, and try to explain in a very concrete manner how tools developed for the study of irregular functions may be applied to solve typical signal processing problems.

This chapter is organized as follows: in section 11.2, we briefly recall the notions that will be used in what is to come, namely regularity exponents and multifractal spectra. For more details on these, see Chapters 1 and 3. In section 11.3, we explain how to estimate regularity exponents on numerical data and compare various methods to do so (we do not tackle the problem of multifractal spectrum estimation, which will not be needed here; see Chapters 1 and 3 for more information on this).

Section 11.4 gives a detailed explanation of how to use fractal tools to perform signal and image denoising. We first recall a traditional wavelet-based denoising method, and explain why it is not adapted to processing irregular signals. We then present three different methods based on Hölder exponents and large deviation multifractal spectra that give good results on signals such as fractal functions, road profiles and SAR (radar) images.

Section 11.5 explains how Hölder exponents may be used to perform data interpolation: the idea is to refine the resolution in such a way that local regularity is preserved at each point. Again, this approach is well adapted to processing irregular signals and images, as we show in examples.

Section 11.6 gives an account of the remarkable applications of fractal tools to ECG analysis: links between the condition of the heart and some features of the multifractal spectrum of its ECG, relation between RR signals and their local regularity, etc.

In section 11.7, we briefly describe an application of multifractal analysis to texture classification, and describe an example of well logs. Section 11.8 is devoted to the presentation of an image segmentation method based on characterizing edges through multifractal analysis. The issue of change detection in a sequence of images is dealt with in section 11.9. As in the contour segmentation application, the idea is to characterize relevant changes through their signatures in the multifractal spectrum. As a final image processing application, we describe in section 11.10 a method for reconstructing an image from a specific subset of pixels selected through multifractal analysis.

To end this introduction, we should also mention that many of the methods described in this chapter are implemented in the free software toolbox FracLab [FracLab].

11.2. Basic tools

11.2.1. Hölder regularity analysis

This section focuses on the Hölder characterizations of regularity. To simplify notations, we assume that our signals are nowhere differentiable. Generalization to other signals simply requires the introduction of polynomials in the definitions (see Chapters 1 and 3).



DEFINITION 11.1 (Pointwise Hölder exponent).– Let $\alpha \in (0, 1)$ and $x_0 \in K \subset \mathbb{R}$. A function $f : K \to \mathbb{R}$ is in $C^{\alpha}_{x_0}$ if, for all $x$ in a neighborhood of $x_0$,

$$|f(x) - f(x_0)| \le c\,|x - x_0|^{\alpha} \qquad (11.1)$$

where $c$ is a constant. The pointwise Hölder exponent of $f$ at $x_0$, denoted $\alpha_p(x_0)$, is the supremum of the $\alpha$ for which equation (11.1) holds.

Let us now introduce the local Hölder exponent: let $\alpha \in (0, 1)$, $\Omega \subset \mathbb{R}$. We say that $f \in C_l^{\alpha}(\Omega)$ if:

$$\exists\, C : \forall x, y \in \Omega : \quad \frac{|f(x) - f(y)|}{|x - y|^{\alpha}} \le C$$

Let:

$$\alpha_l(f, x_0, \rho) = \sup \{\alpha : f \in C_l^{\alpha}(B(x_0, \rho))\}$$

and notice that $\alpha_l(f, x_0, \rho)$ is non-increasing as a function of $\rho$. We may thus set the following definition:

DEFINITION 11.2.– Let $f$ be a continuous function. The local Hölder exponent of $f$ at $x_0$ is the real number:

$$\alpha_l(x_0) = \alpha_l(f, x_0) = \lim_{\rho \to 0} \alpha_l(f, x_0, \rho)$$

11.2.2. Reminders on multifractal analysis

We briefly state in this section some basic facts about multifractal analysis. Multifractal analysis is concerned with the study of the regularity structure of functions or processes, both from a local and global perspective. More precisely, we start by measuring in some way the pointwise regularity, usually with some kind of Hölder exponent. The second step is to give a global description of this regularity. This can be done either in a geometric fashion using Hausdorff dimension, or in a statistical manner using a large deviation analysis.

Formally, let $X(t)$, $t \in I \subset \mathbb{R}$, be a deterministic function or a stochastic process on a probability space $(\Omega, \mathcal{F}, P)$. For ease of notation, we shall assume without loss of generality that $I = [0, 1]$. We define the following functions (these are random functions in general when $X$ is itself random).

11.2.2.1. Hausdorff multifractal spectrum

To simplify notations, set $\alpha(t) = \alpha_p(t)$. The Hausdorff spectrum describes the structure of the function $t \mapsto \alpha(t)$ by evaluating the size of its level sets. More precisely, let:

$$T_\alpha = \{t \in I,\ \alpha(t) = \alpha\}$$

The Hausdorff multifractal spectrum is the function:

$$f_h(\alpha) = \dim_H(T_\alpha)$$

where $\dim_H(T)$ denotes the Hausdorff dimension of the set $T$.

11.2.2.2. Large deviation multifractal spectrum

Let:

$$N_n^{\varepsilon}(\alpha) = \#\{k : \alpha - \varepsilon \le \alpha_n^k \le \alpha + \varepsilon\}$$

where $\alpha_n^k$ is the coarse-grained exponent corresponding to the dyadic interval $I_n^k = [k 2^{-n}, (k+1) 2^{-n}]$, i.e.:

$$\alpha_n^k = \frac{\log |Y_n^k|}{-\log n}$$

Here, $Y_n^k$ is a quantity that measures the variation of $X$ in the interval $I_n^k$. The choice $Y_n^k := X((k+1)2^{-n}) - X(k 2^{-n})$ leads to the simplest analytical calculations. Another possibility is to set $Y_n^k = \mathrm{osc}_X(I_n^k)$, i.e. the oscillation of $X$ inside $I_n^k$. A third choice is to take $Y_n^k$ to be the wavelet coefficient $x_{n,k}$ of $X$ at scale $n$ and location $k$ (note that, in this case, the spectrum will depend on the chosen wavelet). The large deviation spectrum $f_g(\alpha)$ is defined as follows:

$$f_g(\alpha) = \lim_{\varepsilon \to 0} \liminf_{n \to \infty} \frac{\log N_n^{\varepsilon}(\alpha)}{\log n}$$

Note that, whatever the choice of $Y_n^k$, $f_g$ always ranges in $\mathbb{R}^+ \cup \{-\infty\}$. The intuitive meaning of $f_g$ is as follows. For $n$ large enough, one has roughly:

$$P_n(\alpha_n^k \simeq \alpha) \simeq 2^{-n(1 - f_g(\alpha))}$$

where $P_n$ denotes the uniform distribution over $\{0, 1, \ldots, 2^n - 1\}$. Thus, for all $\alpha$ such that $f_g(\alpha) < 1$, $1 - f_g(\alpha)$ measures the exponential decay rate of the probability of finding an interval $I_n^k$ with coarse-grained exponent equal to $\alpha$, when $n$ tends to infinity.

When $X$ is a stochastic process, $f_g$ is in general a random function. In applications, it is convenient to consider in this case a “deterministic version” of $f_g$, defined as follows:

$$F_g(\alpha) = 1 + \lim_{\varepsilon \to 0} \limsup_{n \to \infty} \frac{\log \pi_n(\alpha)}{\log(n)}$$

where:

$$\pi_n(\alpha) := P \times P_n\left[\alpha_n^k \in (\alpha - \varepsilon, \alpha + \varepsilon)\right]$$

Unlike $f_g$, $F_g$ may assume non-trivial negative values.



11.2.2.3. Legendre multifractal spectrum

It is natural to interpret the spectrum $f_g$ as a rate function in a large deviation principle (LDP). Large deviation theorems provide conditions under which such rate functions may be calculated as the Legendre transform of a limiting moment generating function. When applicable, this procedure provides a more robust estimation of $f_g$ than a direct calculation. Define, for $q \in \mathbb{R}$:

$$S_n(q) = \sum_{k=0}^{n-1} |Y_n^k|^q$$

with the convention $0^q := 0$ for all $q \in \mathbb{R}$. Let:

$$\tau(q) = \liminf_{n \to \infty} \frac{\log S_n(q)}{-\log(n)}$$

The Legendre multifractal spectrum of $X$ is defined as ($\tau^*$ denotes the Legendre transform of $\tau$):

$$f_l(\alpha) := \tau^*(\alpha) = \inf_{q \in \mathbb{R}} (q\alpha - \tau(q))$$
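As an illustration of this definition, the sketch below computes $\tau(q)$ and its Legendre transform for a deterministic binomial cascade, a standard toy example that is not discussed in this section. It assumes that $Y_n^k$ is the mass of the $k$-th dyadic interval at depth $n$ and that the normalization is by the logarithm of the interval size, i.e. $\tau(q) = -\log_2 S_n(q) / n$ with $2^n$ intervals; the function names, the weight $p$ and the grid of $q$ values are hypothetical choices.

```python
import numpy as np

def binomial_cascade(p, n):
    """Masses of the 2^n dyadic intervals of a binomial cascade with weights p and 1 - p."""
    w = np.array([1.0])
    for _ in range(n):
        w = np.kron(w, np.array([p, 1.0 - p]))
    return w

def tau(Y, q, n):
    """tau(q) = -log2(S_n(q)) / n, with S_n(q) = sum_k |Y_k|^q and the convention 0^q := 0."""
    Y = np.abs(Y)
    Y = Y[Y > 0]                                   # zero terms contribute nothing
    S = np.sum(Y[None, :] ** q[:, None], axis=1)   # one partition sum per value of q
    return -np.log2(S) / n

def legendre_spectrum(q, tau_q, alphas):
    """f_l(alpha) = inf_q (q * alpha - tau(q)), with the infimum taken over the grid q."""
    return np.min(q[None, :] * alphas[:, None] - tau_q[None, :], axis=1)

p, n = 0.3, 10
Y = binomial_cascade(p, n)
q = np.linspace(-5.0, 5.0, 201)
tau_q = tau(Y, q, n)
alphas = np.linspace(0.6, 1.7, 56)
f_l = legendre_spectrum(q, tau_q, alphas)          # concave curve with maximum close to 1
```

For this cascade $\tau(q) = -\log_2(p^q + (1-p)^q)$ in closed form, so the numerical `tau_q` can be compared with the exact expression; the spectrum reaches its maximum value 1 at $\alpha_0 = -(\log_2 p + \log_2(1-p))/2$. Restricting the infimum to a finite grid of $q$ values is a simplification.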

Being defined through a Legendre transform, $f_l$ is a concave function. The two spectra $f_g$ and $f_l$ are related as follows. Define the sequence of random variables $Z_n := \log |Y_n^k|$, where the randomness is through a choice of $k$ uniformly in the set $\{0, \ldots, n-1\}$. Consider the corresponding moment generating functions:

$$c_n(q) := -\frac{\log E_n[\exp(q Z_n)]}{\log(n)}$$

where $E_n$ denotes expectation with respect to $P_n$. A version of the Gärtner-Ellis theorem ensures that if $\lim c_n(q)$ exists (in which case it equals $1 + \tau(q)$) and is differentiable, then $c^* = f_g - 1$. We then say that the weak multifractal formalism $f_g = f_l$ holds.

11.3. Hölderian regularity estimation

11.3.1. Oscillations (OSC)

The most natural way to estimate regularity is to use the “oscillation” method. This method is a direct application of the definition of the Hölder exponent (see [TRI 95] for more on this topic).

As seen above, a function $f(t)$ is Hölderian with exponent $\alpha \in (0, 1)$ at $t$ if there exists a constant $c$ such that, for all $t'$ in a neighborhood of $t$,

$$|f(t) - f(t')| \le c\,|t - t'|^{\alpha}$$

In terms of oscillations, this condition may be written as:

$$\exists c,\ \forall \tau: \quad \mathrm{osc}_\tau(t) \le c\,\tau^{\alpha}$$

where

$$\mathrm{osc}_\tau(t) = \sup_{|t - t'| \le \tau} f(t') - \inf_{|t - t'| \le \tau} f(t') = \sup_{t', t'' \in [t - \tau, t + \tau]} |f(t') - f(t'')|$$

The estimator is then simply defined as the slope in the least-squares linear regression of the logarithms of the oscillations versus the logarithms of the size $\tau$ of the balls used to calculate the oscillations.

11.3.2. Wavelet coefficient regression (WCR)

A method using wavelet coefficients is described in this section. It relies on a theorem by S. Jaffard, which shows how we can estimate the regularity at the point $t_0$ using the wavelet coefficients (provided the wavelets verify some regularity properties [JAF 04]).

THEOREM 11.1 (S. Jaffard).– Let $f$ be a uniformly Hölderian function and $\alpha$ the pointwise Hölder exponent of $f$ at $t_0$. Then there exists a constant $c$ such that the wavelet coefficients verify:

$$|c_{j,k}| \le c\, 2^{-j(\alpha + \frac{1}{2})} \left(1 + |2^j t_0 - k|\right)^{\alpha} \qquad \forall (j, k) \in \mathbb{Z}^2$$

Conversely, if for all $(j, k) \in \mathbb{Z}^2$ we have $|c_{j,k}| \le c\, 2^{-j(\alpha + \frac{1}{2})} (1 + |2^j t_0 - k|)^{\alpha'}$ for some $\alpha' < \alpha$, then the Hölder exponent of $f$ at $t_0$ is $\alpha$.

From this theorem, a traditional local regularity estimator is obtained if we consider only the indexes $j, k$ such that $|k - 2^j t_0|$ is bounded by a constant. This amounts to making the assumption that the local and pointwise Hölder exponents of $f$ at $t_0$ coincide [LEV 04b]. Under this hypothesis, an estimator is obtained through the regression slope $p$ of $\log_2 |c_{j,k}|$ versus $j$. More precisely, at each point $t_0$ of a signal decomposed on $n$ scales, we estimate the regularity by:

$$\tilde{\alpha}(n, t_0) = -\frac{1}{2} - K_n \sum_{j=1}^{n} s_j \log_2 |c_{j,k}| \qquad (11.2)$$

with $K_n = \frac{12}{n(n-1)(n+1)}$ and $s_j = j - \frac{n+1}{2}$. The $c_{j,k}$ are the wavelet coefficients “above” $t_0$, i.e. $k = \frac{t_0 + 1}{2^{n-j+1}}$.
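Formula (11.2) is the closed form of a least-squares slope: $K_n$ is the inverse of $\sum_j s_j^2$, so $K_n \sum_j s_j \log_2 |c_{j,k}|$ is the regression slope $p$ of $\log_2 |c_{j,k}|$ versus $j$. The Python sketch below illustrates this on synthetic values; the function name and the test coefficients are hypothetical, and no actual wavelet transform is computed.

```python
import numpy as np

def wcr_exponent(log2_abs_coeffs):
    """Evaluate (11.2) from y_j = log2|c_{j,k}|, j = 1..n (one coefficient per scale)."""
    y = np.asarray(log2_abs_coeffs, dtype=float)
    n = y.size                                  # number of scales, n >= 2
    j = np.arange(1, n + 1)
    s = j - (n + 1) / 2.0                       # centered scale indices s_j
    K = 12.0 / (n * (n - 1) * (n + 1))          # equals 1 / sum_j s_j^2
    return -0.5 - K * np.sum(s * y)

# Synthetic coefficients with |c_{j,k}| = 2^{-j (alpha + 1/2)} and alpha = 0.3
# (a made-up test case in which the bound of Theorem 11.1 is attained).
alpha = 0.3
n = 8
scales = np.arange(1, n + 1)
y = -scales * (alpha + 0.5)
estimate = wcr_exponent(y)

# The same value follows from an ordinary regression slope p: alpha = -1/2 - p.
slope = np.polyfit(scales, y, 1)[0]
```

On these exact power-law coefficients the estimate equals 0.3, and it coincides with $-\frac{1}{2} - p$ computed by a generic regression routine.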

11.3.3. Wavelet leaders regression (WL)

This method is very similar to the previous one, but often provides better results. For more information on wavelet leaders see [JAF 04].

A dyadic cube at scale $j$ is given by $\lambda = \left[\frac{k}{2^j}, \frac{k+1}{2^j}\right)$. In $d$ dimensions, this interval becomes a cube in $\mathbb{R}^d$.

DEFINITION 11.3.– The wavelet leaders are $d_\lambda = \sup_{\lambda' \subset \lambda} |c_{\lambda'}|$. $\lambda_j(t_0)$ is the dyadic cube of size $2^{-j}$ at scale $j$ containing the point $t_0$. Let $d_j(t_0) = \sup_{\lambda' \in \mathrm{adj}(\lambda_j(t_0))} |c_{\lambda'}|$, with $\mathrm{adj}(\lambda_j(t_0))$ the set of dyadic cubes adjacent to $\lambda_j(t_0)$.

PROPOSITION 11.1 (S. Jaffard).– If $f \in C^{\alpha}(t_0)$, then

$$\exists c > 0,\ \forall j \ge 0, \quad d_j(t_0) \le c\, 2^{-(\alpha + \frac{1}{2}) j} \qquad (11.3)$$

Conversely, if equation (11.3) holds, and if $f$ is uniformly Hölderian, then $f \in C^{\alpha}(t_0)$.

From this proposition, and with the simplification $\mathrm{adj}(\lambda_j(t_0)) = \lambda_j(t_0)$ only, the new estimator is determined. At each point $t_0$ of the signal $X$ decomposed on $n$ scales, the regularity is estimated by the following formula:

$$\tilde{\alpha}_{WL}(n, t_0) = -\frac{1}{2} - K_n \sum_{j=1}^{n} s_j \log_2 \max_{\lambda' \subset \lambda} (|x_{\lambda'}|)$$

11.3.4. Limit inf and limit sup regressions

The definition of the Hölder exponent makes use of a lower limit, which allows the exponent to exist without conditions. The three estimators presented above, however, calculate the exponent through a linear regression. Typically, they will not converge to the true exponent when the resolution tends to infinity if the expression defining the exponent does not converge, i.e. when the lower limit is not a plain limit. Indeed, if the upper and lower limits are different, the slope given by a linear regression has no relevance. However, as shown in [LEG 04b, LEV 04a], it is still possible to use regressions to obtain the exponent. The method is general, as it applies to the estimation of upper and lower limits through a modified regression scheme, which we now explain.

The use of these liminf and limsup regression methods is of great practical importance, as it allows us to estimate, on arbitrary signals, various fractal quantities such as dimensions, exponents, and multifractal spectra. Note that, even for fractal signals, the Hölder exponents are often obtained as genuine lower limits (i.e. the limit does not exist).

Let $(l_j)_{j \ge 1}$ be an arbitrary sequence of real numbers, and denote $u_j = l_j / j$. Let $a = \liminf_{j \to \infty} u_j$. In our framework, think for instance of $l_j$ as the logarithm of the sizes of balls used in the oscillation calculations. Define, for all $n \ge 1$:

$$V_n^0 = \{1, \dots, n\}, \qquad L_n^0 = \{l_1, \dots, l_n\}$$

Let $(a_n^0, b_n^0)$ be the parameters of the least squares linear regression of $L_n^0$ with respect to $V_n^0$, i.e. the real numbers that minimize $\sum_{j=1}^{n} (l_j - aj - b)^2$ over all couples $(a, b)$. We write:

$$(a_n^0, b_n^0) = \mathrm{Reg}(V_n^0, L_n^0)$$

Now let:

$$V_n^1 = \{j \in V_n^0,\ l_j \le a_n^0 j + b_n^0\}, \qquad L_n^1 = \{l_j,\ j \in V_n^1\}, \qquad (a_n^1, b_n^1) = \mathrm{Reg}(V_n^1, L_n^1)$$

and define recursively:

$$V_n^i = \{j \in V_n^{i-1},\ l_j \le a_n^{i-1} j + b_n^{i-1}\}, \qquad L_n^i = \{l_j,\ j \in V_n^i\}, \qquad (a_n^i, b_n^i) = \mathrm{Reg}(V_n^i, L_n^i)$$

for all $i = 2, \dots, N_n$, where $N_n$ is defined as the first index such that $\# V_n^{N_n + 1} < 2$.

The geometric interpretation of the sequence $(a_n^i, b_n^i)$ is simple. In the first step, we keep in $(V_n^1, L_n^1)$ those points that are "below" the regression line of $L_n^0$ with respect to $V_n^0$. We then calculate the regression line of $L_n^1$ with respect to $V_n^1$ to obtain $(a_n^1, b_n^1)$, and iterate the process until at most one point remains below the regression line. The slope of the liminf regression is then defined as $a_n^{N_n}$ (the method is similar for the limsup: just keep the points above the regression line). In many cases of interest, $a_n^{N_n}$ will tend to $a$ when $n$ tends to infinity.
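The iteration above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are ours, and the extra stopping test (stop when no point is discarded, which happens when the remaining points are collinear) is a safeguard that we add and that is not part of the definition in the text.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line through (xs, ys)."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def liminf_regression_slope(l, keep_below=True):
    """Slope of the liminf regression of l[0], ..., l[n-1] against j = 1, ..., n.

    With keep_below=False, the points above the line are kept instead,
    which gives the limsup variant. Requires at least two values.
    """
    indices = list(range(1, len(l) + 1))
    while True:
        values = [l[j - 1] for j in indices]
        slope, intercept = least_squares_line(indices, values)
        if keep_below:
            kept = [j for j in indices if l[j - 1] <= slope * j + intercept]
        else:
            kept = [j for j in indices if l[j - 1] >= slope * j + intercept]
        # Stop when fewer than two points would remain (definition of N_n),
        # or when nothing was discarded (our safeguard for collinear points).
        if len(kept) < 2 or len(kept) == len(indices):
            return slope
        indices = kept


# A sequence for which l_j / j oscillates between 1 and 2.
l = [j if j % 2 else 2 * j for j in range(1, 41)]
print(liminf_regression_slope(l))                    # close to 1
print(liminf_regression_slope(l, keep_below=False))  # close to 2
```

On this oscillating sequence an ordinary regression returns a slope between 1 and 2, which corresponds to neither limit, while the two modified regressions recover the lower and upper limits of $l_j / j$.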

11.3.5. Numerical experiments

In this section we compare the three methods WCR, WL and OSC on different kinds of signals. For more experiments, see [LEG 04b]. Figure 11.1 represents a generalized Weierstrass function with a regularity H(t) = t and the estimations of the Hölder function by regression of the wavelet coefficients (WCR), by wavelet leaders (WL) and by oscillation (OSC). This experiment shows that the best results are obtained by the WL method in this case.

Figure 11.1. (a) Generalized Weierstrass function (4096 points), regularity h(t) = t; (b) regularity estimation by WCR; (c) regularity estimation by WL; (d) regularity estimation by OSC

The second comparison deals with multifractional Brownian motion (MBM). MBM is an extension of fractional Brownian motion where the local regularity may be controlled. See Chapter 6 and [PEL 95, AYA 99, AYA 00b, AYA 00a]. For the experiment, ten MBM paths with a regularity evolving like a sine function are built. The three estimation methods are applied to each signal and the results are displayed in Figure 11.2 (mean and variance). We see that, in this case, the oscillation-based method provides the best results both in terms of bias and variance.

In conclusion, the methods described in this section generally provide decent estimates of the Hölderian regularity. Nevertheless, there is no "best" estimator among them: the estimation quality depends on the signal class.

Figure 11.2. Estimation of the regularity of a set of ten MBM with a regularity evolving like a sine function. The three estimation methods are tested. For each method, 10 Hölder functions are thus obtained. The empirical mean and the variance on these 10 functions are then calculated. Abscissa: time. Ordinates: mean estimated regularity (white), and error bars corresponding to two times the standard deviation on each side (gray). The theoretical regularity is displayed in black. (a) WCR method; (b) WL method; (c) OSC method

11.4. Denoising

11.4.1. Introduction

Signal/image denoising is an important task in many areas including biology, medicine, astronomy, geophysics, and many more. For such applications and others, it is important to denoise the observed data in such a way that the features of interest to the practitioner are preserved. The basic framework is as follows. We observe a signal (or an image) $Y$ which is a combination $F(X, B)$ of the signal of interest $X$ and a "noise" $B$. Making various assumptions on the noise, the structure of $X$ and the function $F$, we then try to derive a method to obtain an estimate $\hat{X}$ of the original image which is in some sense optimal. $F$ usually amounts to convolving $X$ with a low pass filter and adding noise. Assumptions on $X$ are almost always related to its regularity, e.g. $X$ is supposed to be piecewise $C^n$ for some $n \ge 1$. In this section, $B$ is assumed to be independent of $X$, white, Gaussian and centered.

11.4.2. Minimax risk, optimal convergence rate and adaptivity

A useful way to compare denoising methods is to analyze their convergence properties. In this section, we recall some basic facts (see [HAR 98] for more details).

DEFINITION 11.4.– The minimax risk in $L^p$ is given by

$$R_n(V, p) = \inf_{\hat{X}_n \in E} \sup_{X \in V} E\|\hat{X}_n - X\|_p^p$$

where $E$ is the set of measurable estimators and $V$ a ball in a functional space.

DEFINITION 11.5.– $r_n \asymp R_n(V, p)^{\frac{1}{p}}$ is called the optimal convergence rate or minimax convergence rate on the class $V$ for the risk $L^p$. We say that the estimator $\hat{X}_n$ of $X$ reaches the optimal convergence rate if $\sup_{X \in V} E\|\hat{X}_n - X\|_p^p \asymp R_n(V, p)$.

Typical function spaces that are considered in this framework are the so-called Besov spaces (for a complete description of Besov spaces see [PEE 76, POP 88]). Suppose for instance that $X$ belongs to a ball in $B^s_{r,q}$ and that the $L^p$ loss is used. Then, we can show that:

• If $r \ge p$ (homogeneous area), the optimal rate is $2^{-\frac{sn}{2s+1}}$.

• If $\frac{p}{2s+1} \le r \le p$ (intermediate area), the optimal rate is $2^{-\frac{sn}{2s+1}}$, and $2^{-\frac{(s - \frac{1}{r} + \frac{1}{p})\, n}{2(s - \frac{1}{r} + \frac{1}{p}) + 1}}$ for linear estimators.

• If $r \le \frac{p}{2s+1}$ (sparse area), the optimal rate is $(n 2^{-n})^{\frac{s - \frac{1}{r} + \frac{1}{p}}{2(s - \frac{1}{r} + \frac{1}{p}) + 1}}$ for non-linear estimators.

In the following sections, the $L^2$ loss is used, and as a consequence, there is no sparse zone. The corresponding optimal convergence rates are as follows.

| | Non-linear estimator | Linear estimator |
| $r \ge 2$ | $R_n(V, 2)^{\frac{1}{2}} = 2^{-\frac{sn}{2s+1}}$ | $R_n^{lin}(V, 2)^{\frac{1}{2}} = 2^{-\frac{sn}{2s+1}}$ |
| $r < 2$ | $R_n(V, 2)^{\frac{1}{2}} = 2^{-\frac{sn}{2s+1}}$ | $R_n^{lin}(V, 2)^{\frac{1}{2}} = 2^{-\frac{(s - \frac{1}{r} + \frac{1}{2})\, n}{2(s - \frac{1}{r} + \frac{1}{2}) + 1}}$ |

Table 11.1. Convergence rates
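To make the gap between linear and non-linear estimators in Table 11.1 concrete, the exponents can be evaluated numerically. The helper below is a small sketch of ours (the function name and interface are not from the text); it returns the exponent $e$ such that the rate reads $2^{-en}$ for the $L^2$ loss.

```python
def l2_rate_exponent(s, r, linear=False):
    """Exponent e of Table 11.1, the optimal rate being 2 ** (-e * n).

    s is the smoothness index and r the integrability index of the Besov
    ball; linear=True restricts the comparison to linear estimators.
    """
    if r >= 2 or not linear:
        return s / (2 * s + 1)
    # Linear estimators, r < 2: the smoothness s is replaced by s - 1/r + 1/2.
    s_lin = s - 1 / r + 1 / 2
    return s_lin / (2 * s_lin + 1)


# With s = 1 and r = 1, non-linear estimators reach the exponent 1/3,
# whereas linear estimators are limited to 1/4.
print(l2_rate_exponent(1, 1), l2_rate_exponent(1, 1, linear=True))
```

The smaller exponent in the linear case for $r < 2$ is one way of reading why non-linear procedures such as thresholding are preferred for spatially inhomogeneous signals.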

For some estimators, achieving the optimal rate of convergence requires prior information about the signal, such as its regularity. This constraint is a drawback in applications. In this context we try to develop adaptive estimators.

DEFINITION 11.6.– $X^*$ is an adaptive estimator for the loss $L^p$ and the set $\{F_\alpha, \alpha \in A\}$ if for all $\alpha \in A$, there exists a constant $c_\alpha > 0$ such that:

$$\sup_{X \in F_\alpha} E\|X^* - X\|_p^p \le c_\alpha R_n(\alpha, p)$$

For general results about adaptivity, see [LEP 90, LEP 91, LEP 92, BIR 97].

11.4.3. Wavelet based denoising

A popular set of denoising methods is based on decomposing the corrupted signal in a wavelet basis, processing the wavelet coefficients, and then going back to the time domain. In the case of additive white noise, this is justified by two fundamental facts: first, many real-world signals have a sparse structure in the wavelet domain, i.e. a few coefficients are significant, and most are small or zero. Second, for an orthonormal wavelet transform, all wavelet coefficients of a white noise are iid random variables. Denoising in the wavelet domain thus allows us to separate in an easy way "large", significant, coefficients, from "small" coefficients due mainly to noise.

Throughout this section, the wavelet coefficients of a signal $X$ are denoted by $x_{j,k}$ where $j$ is scale and $k$ is location. $X$ is the original signal, $Y$ the observed noisy signal and $\hat{X}$ an estimator of $X$. We assume that $Y = X + B$, where $B$ is a centered Gaussian white noise with variance $\sigma^2$, independent from the original signal $X$. Thus, we have $y_{j,k} = x_{j,k} + b_{j,k}$. Since the wavelet basis is supposed to be orthonormal, the $b_{j,k}$ are also Gaussian and iid.

The first and simplest methods for denoising based on the above principles are the so-called hard and soft thresholding [DEV 92, DON 94]. Since the time these methods were introduced, a huge number of improvements have been proposed, ranging from block thresholding [HAR 98] to Bayesian approaches [VID 99] and many more. We briefly recall the basics of hard thresholding, and show why a different method is needed for the processing of irregular signals.

DEFINITION 11.7.– Let $Y_n$ be a sample of $Y$ on $2^n$ points. The estimator of $X$ by hard thresholding is $\hat{X}^{HT}$, a signal with the following wavelet coefficients:

$$\{\hat{x}^{HT}_{j,k}\}_{j,k} = \{y_{j,k} \cdot 1_{|y_{j,k}| \ge \lambda_n}\}_{j,k}$$

where $\lambda_n$ is a given threshold.
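The three steps (decompose, threshold, reconstruct) may be illustrated as follows. This is a minimal sketch under our own assumptions: we use the orthonormal Haar wavelet because it is the simplest to write down, whereas the text only requires an orthonormal basis; the function names are ours, and the threshold is passed as a plain argument.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """Orthonormal Haar transform of a signal whose length is a power of two.

    Returns (approximation, details), where details[0] is the finest scale.
    """
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx[0], details


def haar_inverse(approximation, details):
    """Inverse of haar_forward."""
    approx = [approximation]
    for level in reversed(details):
        rebuilt = []
        for s, d in zip(approx, level):
            rebuilt.extend([(s + d) / SQRT2, (s - d) / SQRT2])
        approx = rebuilt
    return approx


def hard_threshold_denoise(noisy, lam):
    """Keep the detail coefficients with |y| >= lam and set the others to zero."""
    approximation, details = haar_forward(noisy)
    kept = [[d if abs(d) >= lam else 0.0 for d in level] for level in details]
    return haar_inverse(approximation, kept)


# A step signal with a small perturbation: only the large coefficient
# carrying the step survives the threshold.
noisy = [4.01, 3.99, 4.01, 3.99, 0.01, -0.01, 0.01, -0.01]
print(hard_threshold_denoise(noisy, 0.1))
```

In this toy case the signal is sparse in the Haar basis, so zeroing the small coefficients removes the perturbation without touching the step, which is the favorable situation described above.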


Traditional choices for $\lambda_n$ include the so-called universal, SURE and Bayesian thresholds [VID 99]. In this section, the universal threshold $\lambda_n = \sigma 2^{-\frac{n}{2}} \sqrt{2n}$ will be used throughout.

THEOREM 11.2 (Risk for hard thresholding (D. Donoho)).– Let $X \in B^s_{p,q}$ and $X_n$ its sampled version on $2^n$ points. Then:

$$R_{HT} := E[(X_n - \hat{X}^{HT})^2] \le C \cdot n \cdot 2^{-\frac{2sn}{2s+1}}$$

Thus, hard thresholding is near-minimax.

A limitation of hard thresholding, as well as of most wavelet-based methods, is that they are not well adapted to denoise highly textured or everywhere irregular signals, in particular (multi)fractal or multifractional signals, with potentially rapid variations in local regularity. It is particularly well-known that, when the original signal is itself irregular, most wavelet-based denoising methods will typically produce an oversmoothed signal and/or so-called "ringing" effects. Indeed, as recalled above, the basic idea behind wavelet thresholding is that many real-world signals have a sparse wavelet representation, with few large wavelet coefficients. Putting small coefficients to 0 in the noisy signal will then in general do no harm, since these are mainly due to noise. Everywhere irregular signals, on the other hand, have significant coefficients scattered all over the time-frequency plane. At high frequencies, these significant but relatively small coefficients in the signal crucially determine the local irregularity. Zeroing small coefficients will thus typically destroy the regularity information. As a consequence, it is no surprise that a specific method has to be designed for such signals. Figure 11.4 illustrates some of the drawbacks just mentioned.

A theoretical result on a particular class of signals also allows us to measure precisely the over-smoothing effect of hard thresholding. Define the set PART($\alpha$) as follows:

$$\mathrm{PART}(\alpha) := \left\{ X,\ \{x_{j,k}\}_{j,k} = \{\varepsilon_{j,k} \cdot 2^{-j(\alpha + \frac{1}{2})}\}_{j,k},\ \varepsilon_{j,k} \text{ iid in } \{-1, 1\} \right\} \qquad (11.4)$$

PROPOSITION 11.2.– Let $\tilde{\alpha}_{X^{HT}}(n, t)$ denote the regularity of the signal after hard thresholding, estimated by the WCR method. Then, for a signal $X \in \mathrm{PART}(\alpha)$, at each point $t$,

$$\lim_{n \to \infty} E[\tilde{\alpha}_{X^{HT}}(n, t)] = -\frac{1}{2} - \frac{6\alpha + 1}{2(2\alpha + 1)^2} + p\, \frac{8\alpha^3 + 12\alpha^2 + 12\alpha}{(2\alpha + 1)^3}$$

where $p$ is the number of vanishing moments of the wavelet.

This result means that the regularity of the denoised signal is essentially controlled by $p$. In particular, if we use a wavelet with an infinite number of vanishing moments, the estimated regularity of the hard thresholded signal will be equal to infinity.
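The mechanism behind this over-smoothing can be seen on the coefficient magnitudes alone. The sketch below is our own simplified illustration: it ignores the noise term and simply lists the scales at which a coefficient of a signal in PART($\alpha$), whose magnitude is $2^{-j(\alpha + 1/2)}$, is at least as large as the universal threshold. The function names are hypothetical.

```python
import math


def universal_threshold(n, sigma):
    """Universal threshold sigma * 2**(-n/2) * sqrt(2n) for 2**n samples."""
    return sigma * 2.0 ** (-n / 2) * math.sqrt(2 * n)


def surviving_scales(alpha, n, sigma):
    """Scales j in 1..n whose PART(alpha) magnitude reaches the threshold.

    Simplification (ours): the noise added to each coefficient is ignored.
    """
    lam = universal_threshold(n, sigma)
    return [j for j in range(1, n + 1) if 2.0 ** (-j * (alpha + 0.5)) >= lam]


# For alpha = 0.5, n = 10 and sigma = 1, only the two coarsest scales survive.
print(surviving_scales(0.5, 10, 1.0))
```

All finer scales are zeroed, although they carry exactly the decay $2^{-j(\alpha + 1/2)}$ that encodes the regularity $\alpha$; this is the information loss that the methods of the following sections try to avoid.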

The following sections describe three denoising methods that are well fitted to the processing of extremely irregular signals such as (multi)fractal ones. The first and second methods both make it possible to control the local Hölder regularity. They differ in the way this regularity is estimated. In the first method, this is done through linear regression over all available scales. In the second one, an "exponent between scales" is used. The third method allows a control of the multifractal spectrum.

11.4.4. Non-linear wavelet coefficients pumping

In this section, a refinement of hard thresholding is presented. It is called non-linear wavelet pumping (NLP) and is near-minimax, adaptive and allows us to control the regularity of the denoised signal through a parameter $\delta \in \mathbb{R}^+$. For a theoretical study, proofs and numerical experiments, see [LEG 04b].

DEFINITION 11.8.– Let $Y_n$ be a sample of $Y$ on $2^n$ points. The estimator of $X$ by the NLP method is given by $\hat{X}^{NLP}$, a signal with the following wavelet coefficients:

$$\{\hat{x}^{NLP}_{j,k}\}_{j,k} = \{\ldots\ 2^{-j\delta}\, y_{j,k} \cdot 1_{|y_{j,k}| \ldots}\}_{j,k}$$

[…] $\delta > \frac{s}{2s+1}$, then $R_{NLP} \le R_{HT} + O(R_{HT})$.

Thus, NLP is near-minimax. Additionally, if $\delta > \frac{1}{2}$, the estimator is adaptive.
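A possible reading of the pumping rule is sketched below. Since the definition is only partly legible here, the rule coded is an assumption on our part, suggested by the description of NLP as a refinement of hard thresholding: coefficients at or above the threshold are kept unchanged, and coefficients below it are multiplied by $2^{-j\delta}$ instead of being set to zero. The function name and the dictionary layout are ours; see [LEG 04b] for the actual definition.

```python
def nlp_coefficients(y, lam, delta):
    """Apply the assumed pumping rule to noisy wavelet coefficients.

    y maps each scale j to the list of coefficients y[j][k].
    Assumption (ours): |y| >= lam is kept, |y| < lam is scaled by 2**(-j*delta).
    """
    return {
        j: [c if abs(c) >= lam else c * 2.0 ** (-j * delta) for c in row]
        for j, row in y.items()
    }


# At scale j = 2 with delta = 1, a small coefficient is divided by 4.
print(nlp_coefficients({2: [0.5, 0.05]}, lam=0.1, delta=1.0))
```

Under this reading, $\delta = 0$ leaves the noisy signal untouched and $\delta \to \infty$ gives back hard thresholding, so that $\delta$ interpolates between the two.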

11.4.4.2. Regularity control

The advantage of NLP is that it allows a control over the local regularity through the parameter $\delta$.

PROPOSITION 11.3 (Increase of regularity).– Let $\alpha_Y(n, t)$ and $\alpha_{\hat{X}^{NLP}}(n, t)$ denote respectively the regularity of the noisy signal $Y$ and of the estimator $\hat{X}^{NLP}$ at the point $t$, estimated by WCR. Then at each point $t$:

$$\alpha_{\hat{X}^{NLP}}(n, t) = \alpha_Y(n, t) + K_n \delta \sum_{j=1}^{n} j s_j$$

[…] $c_n$, and where $K_n$ and $s_n$ are estimated from the noisy wavelet coefficients $(y_{j,k})$ at scales $j < c_n$. This means that, at small scales, we do not accept overly large coefficients, that is, coefficients which would not be compatible with the estimated Hölder regularity of the signals (statistically, there will always be such coefficients, since noise has no regularity given that its coefficients do not decrease with scale). On the other hand, "small" coefficients (those not exceeding $2^{K_n - j(s_n + 1/2)}$) are left unchanged. Note that both the estimated regularity $s_n$ and the critical scale $c_n$ depend on the considered point. Note also that this procedure may be seen as a location-dependent shrinkage of the coefficients.

We can prove the following property, which essentially says that the above method does a good job in recovering the regularity of the original signal, as measured by the exponent between two scales, provided that we are able to estimate with good accuracy its Hölder exponent at any given point $t$:

PROPOSITION 11.6.– Let $X$ belong to $C^{\varepsilon_0}(\mathbb{R})$ for some $\varepsilon_0 > 0$, and let $\alpha$ denote its Hölder exponent at point $t$. Let $(s_n)_n$ be a sequence of real numbers tending almost surely (resp. in probability) to $\alpha$. Let $c(n) = \frac{n}{1 + 2 s_n}$. Let $\hat{X}_n$ be defined as above. Then, for any function $h$ tending to infinity with $h(n) \le n$, $\alpha_g(h(n), n, \hat{X}_n)$ tends almost surely (resp. in probability) to $\alpha$.

For this method to be put to practical use, there thus remains to estimate the critical scale and Hölder exponent from the noisy observations. This is the topic of the next section.

11.4.5.2. Estimating the local regularity of a signal from noisy observations

The main result in [ECH 07] concerning the estimation of the critical scale is the following.

THEOREM 11.4.– Let $(x_i)_{i \in \mathbb{N}}$ denote the wavelet coefficients of $X \in C^{\varepsilon_0}(\mathbb{R})$ "above" a point $t$ where the local and pointwise Hölder exponents of $X$ coincide. Let $\beta = \liminf_{i \to \infty} \frac{-\log |x_i|}{i}$. Assume that there exists a decreasing sequence $(\varepsilon_n)$ such that $\varepsilon_n = o\left(\frac{1}{n}\right)$ when $n \to \infty$ and $\frac{-\log |x_i|}{i} \ge \beta - \varepsilon_i$, for all $i$. Let $(y_i)$ denote the noisy coefficients corresponding to the $x_i$. Let:

$$L_n(p) = \frac{1}{(n - p + 1)^2} \sum_{i=p}^{n} y_i^2,$$

and denote $p^* = p^*(n)$ an integer such that:

$$L_n(p^*) = \min_{p\,:\,1 \le p \le n - b \log(n)} L_n(p),$$

where $b > 1$ is a fixed number. Finally let $q(n) = \frac{n}{2(\beta - \frac{1}{n})}$. Then, almost surely:

$$\forall a > 1, \quad p^*(n) \le q(n) + a \log(n), \quad n \to \infty$$

In addition, if the sequence $(x_i)$ verifies the following condition: there exists a sequence of positive integers $(\theta_n)$ such that, for all $n$ large enough and all $\theta \ge \theta_n$:

$$\frac{1}{\theta} \sum_{i=q-\theta}^{q-1} x_i^2 > b \sigma_n^2\, \frac{1 - \delta_*^{\beta}}{(1 - \delta^{*}_{\beta})^2},$$

where $\delta_* \in (0, \frac{1}{2})$ and $\delta^* \in (\frac{1}{2}, \beta)$. Then, almost surely:

$$\forall a > 1, \quad p^*(n) \ge q(n) - \max(a \log(n), \theta_n), \quad n \to \infty$$

In other words, when the conditions of the theorem are met, any minimizer of $L$ is, within an error of $O(\log(n))$, approximately equal to the sought critical scale. This in turn allows us to estimate the Hölder exponent using the next corollary:

COROLLARY 11.1.– With the same notations and assumptions as in the theorem above, with the additional condition that $\theta_n$ is not larger than $b \log(n)$ for all sufficiently large $n$, define $\hat{\beta}(n) = \frac{n}{2 p^*(n)} + \frac{1}{n}$. Then the following inequality holds almost surely for all large enough $n$:

$$|\hat{\beta}(n) - \beta| \le 2 b \beta^2\, \frac{\log(n)}{n}.$$

The value $s_n = \hat{\beta}(n) + \frac{1}{2}$ is used in (11.5). $K_n$ is estimated as the offset in the linear least squares regression of the logarithm of the absolute value of the wavelet coefficients with respect to scale, at scales larger than $p^*(n)$.
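The estimation of the critical scale and of $\hat{\beta}(n)$ can be sketched as follows. The code is only illustrative: the function names are ours, the coefficients "above" $t$ are passed as a plain list $y_1, \dots, y_n$, the logarithm is taken to be the natural one (an assumption, since the base is not specified here), and the value $b = 1.5$ is an arbitrary choice satisfying $b > 1$.

```python
import math


def critical_scale(y, b=1.5):
    """Return p* minimizing L_n(p) over 1 <= p <= n - b*log(n).

    y[i-1] is the noisy coefficient y_i "above" the point, i = 1, ..., n.
    """
    n = len(y)
    p_max = math.floor(n - b * math.log(n))
    if p_max < 1:
        raise ValueError("not enough scales for the chosen value of b")
    best_p, best_value = 1, float("inf")
    for p in range(1, p_max + 1):
        value = sum(v * v for v in y[p - 1:]) / (n - p + 1) ** 2
        if value < best_value:
            best_p, best_value = p, value
    return best_p


def beta_hat(y, b=1.5):
    """Estimate of beta from Corollary 11.1: n / (2 p*(n)) + 1 / n."""
    n = len(y)
    return n / (2 * critical_scale(y, b)) + 1 / n


# Toy example: five large coefficients followed by fifteen small ones.
y = [1.0] * 5 + [0.01] * 15
print(critical_scale(y), beta_hat(y))
```

In the toy example the minimizer is the first scale at which the coefficients have dropped to the low level, which is the behavior expected of a critical scale separating signal-dominated from noise-dominated coefficients.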



Figure 11.6. Top: original Weierstrass function. Middle: noisy version. Bottom: signal obtained with the regularity preserving method

11.4.5.3. Numerical experiments

Figure 11.6 shows the original, noisy and denoised versions of a Weierstrass function.

11.4.6. Bayesian multifractal denoising

11.4.6.1. Introduction

In [LEG 04b, LEV 03], a denoising method is presented that assumes a minimal local regularity. This assumption translates into constraints on the multifractal spectrum of the signals. Such constraints are used in turn in a Bayesian framework to estimate the wavelet coefficients of the original signal from the noisy ones. An assumption is made that the original signal belongs to a certain set of parameterized classes S described below. Functions belonging to such classes have a minimal local regularity, but may have wildly varying pointwise Hölder exponent. Along with possible additional conditions, this yields a parametric form for the prior distribution of the wavelet coefficients of X. These coefficients are estimated using a traditional maximum a posteriori technique. As a consequence, the estimate is defined as the signal "closest" to the observation which has the desired multifractal spectrum (or a degenerate version of it, see below). Because the multifractal spectrum subsumes information about the pointwise Hölder regularity, this procedure is naturally adapted for signals which have sudden changes in regularity.


11.4.6.2. The set of parameterized classes S(g, ψ)

The denoising technique described below is based on the multifractal spectrum rather than the use of the sole Hölder exponent. This will in general allow for more robust estimates, since we use a higher level description subsuming information on the whole signal. For such an approach to be practical, however, we need to make the assumption that the considered signals belong to a given set of parameterized classes, as we will now describe.

Let $F$ be the set of lower semi-continuous functions from $\mathbb{R}^+$ to $\mathbb{R} \cup \{-\infty\}$. We consider classes of random functions $X(t)$, $t \in [0, 1]$, defined on $(\Omega, \mathcal{F}, P)$ by (11.6) below. Each class $S(g, \psi)$ is characterized by the functional parameter $g \in F$ and a wavelet $\psi$ such that the set $\{\psi_{j,k}\}_{j,k}$ forms a basis of $L^2$. Let:

– $K$ be a positive constant;

– $P^\varepsilon_j(\alpha, K) = P\left( \alpha - \varepsilon < \frac{\log_2(K |x_{j,k}|)}{-j} < \alpha + \varepsilon \right)$.

Define:

$$S(g, \psi) = \Big\{ X : \exists K > 0,\ j_0 \in \mathbb{Z} : \forall j > j_0,\ x_{j,k} \text{ and } x_{j,k'} \text{ are identically distributed for } (k, k') \in \{0, 1, \dots, 2^j - 1\} \text{ and } \frac{\log_2 P^\varepsilon_j(\alpha, K)}{j} = g(\alpha) + R_{n,\varepsilon}(\alpha) \Big\} \qquad (11.6)$$

where $R_{n,\varepsilon}(\alpha)$ is such that $\lim_{\varepsilon \to 0} \lim_{n \to \infty} R_{n,\varepsilon}(\alpha) = 0$ uniformly in $\alpha$.

The assumption that, for large enough $j$, the wavelet coefficients $(x_{j,k})_k$ at scale $j$ are identically distributed entails that […] (11.7)


The estimate for $K$ can be heuristically justified as follows; writing that $\frac{\log_2(K |x_{j,k}|)}{-j}$ is $\alpha$ with $\alpha > 0$ implies that $K |x_{j,k}| < 1$ for all couples $(j, k)$. $\hat{K}$ is chosen as the smallest normalizing factor that entails the latter inequality.

In the numerical experiments, we shall deal with the case where the noise is centered, Gaussian, with variance $\sigma^2$. The MAP estimate then reads:

$$\hat{x}_{j,k} = \operatorname{argmax}_{x > 0} \left[ j\, g\!\left( \frac{\log_2(\hat{K} x)}{-j} \right) - \frac{(y_{j,k} - x)^2}{2\sigma^2} \right] \times \operatorname{sgn}(y_{j,k})$$

While (11.7) gives an explicit formula for denoising $Y$, it is often of little practical use. Indeed, in most applications, we do not know the multifractal spectrum of $X$, so that without an evaluation of $g$, it is not possible to use (11.7) to obtain $\hat{x}_{j,k}$. In addition, we should recall that $F_g$ depends in general on the analyzing wavelet. We would thus need to know the spectrum shape for the specific wavelet in use. Furthermore, a major aim of this approach is to be able to extract the multifractal features of $X$ from the denoised signal $\hat{X}$. A strong justification of the multifractal Bayesian approach is to estimate $F_g^X$ as follows: a) denoise $Y$, b) evaluate numerically the spectrum $F_g^{\hat{X}}$ of $\hat{X}$, c) set $\hat{F}_g^X = F_g^{\hat{X}}$. Obviously, from this point of view, it does not make sense to require the prior knowledge of $F_g^X$ in the Bayesian approach.

Thus, a "degenerate" version of (11.7) is presented which uses a single real parameter as input instead of the whole spectrum. The heuristic is as follows; from a regularity point of view, important information contained in the spectrum is its support, i.e. the set of all possible "Hölder exponents". More precisely, let $\alpha_0 = \inf\{\alpha, F_g(\alpha) > -\infty\}$. While the spectra shapes obtained with different analyzing wavelets depend on the wavelet, their supports are always included in $[\alpha_0, \infty)$. The "flat" spectrum $1_{[\alpha_0, \infty)}$ thus contains intrinsic information. Furthermore, it only depends on the positive real $\alpha_0$. Rewriting (11.7) with a flat spectrum yields the following explicit simple expression for $\hat{x}_{j,k}$:

$$\hat{x}_{j,k} = \begin{cases} y_{j,k} & \text{if } K |y_{j,k}| < 2^{-j\alpha_0} \\ \operatorname{sgn}(y_{j,k})\, 2^{-j\alpha_0} & \text{otherwise} \end{cases} \qquad (11.8)$$
Although α0 is really prior information, it can be estimated from the noisy observations [LEG 04b]. In this respect, it is comparable to the threshold used in the traditional hard or soft wavelet thresholding scheme. Furthermore, in applications, it is useful to think of α0 rather as a tuning parameter, whereby increasing α0 yields a smoother estimate (since the original signal is assumed to have a larger minimal exponent). Note that (11.8) has a flavor reminiscent of the method described in section 11.4.5.
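The flat-spectrum rule (11.8) amounts to clipping each coefficient at a scale-dependent bound, as the following sketch shows. The function name is ours, the normalizing constant is passed as an argument with default value 1 purely for illustration, and the clipped value is taken exactly as written in (11.8).

```python
import math


def flat_spectrum_estimate(y, j, alpha0, K=1.0):
    """Denoised coefficient given by (11.8) for a noisy coefficient y at scale j."""
    bound = 2.0 ** (-j * alpha0)
    if K * abs(y) < bound:
        return y  # compatible with the minimal regularity alpha0: left unchanged
    return math.copysign(bound, y)  # too large for scale j: clipped, sign preserved


# At scale j = 2 with alpha0 = 1, the bound is 2**(-2) = 0.25.
print([flat_spectrum_estimate(v, 2, 1.0) for v in (0.5, -0.5, 0.01)])
```

Since the bound $2^{-j\alpha_0}$ decreases with $\alpha_0$, a larger $\alpha_0$ clips more coefficients, which is consistent with its use as a tuning parameter controlling smoothness.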



11.4.6.4. Numerical experiments

As a test signal for numerical experiments, we shall consider fractional Brownian motion (FBM). As is well-known, FBM is the zero-mean Gaussian process $X(t)$ with covariance function:

$$R(t, s) = E(X(t) X(s)) = \frac{\sigma^2}{2} \left( |t|^{2H} + |s|^{2H} - |t - s|^{2H} \right)$$

where $H$ is a real number in $(0, 1)$ and $\sigma$ is a real number. FBM reduces to Brownian motion when $H = 1/2$. In all other cases, it has stationary correlated increments. At all points, the local and pointwise Hölder exponents of FBM equal $H$ almost surely. The Hausdorff multifractal spectrum of FBM is degenerate, as we have $f_h(\alpha) = -\infty$ almost surely for $\alpha \ne H$, $f_h(H) = 1$. The large deviation spectrum, however, depends on the definition of $Y_n^k$: if we consider oscillations, then $f_g = f_h$. Taking increments, we get that, almost surely, for all $\alpha$:

$$f_g(\alpha) = f_l(\alpha) = \begin{cases} -\infty & \text{if } \alpha < H \\ H + 1 - \alpha & \text{if } H \le \alpha \le H + 1 \\ -\infty & \text{if } \alpha > H + 1 \end{cases}$$

Moreover, in both cases (oscillations and increments), $f_g(\alpha)$ is given by

$$\lim_{\varepsilon \to 0} \lim_{n \to \infty} \frac{\log N_n^\varepsilon(\alpha)}{\log n}$$

(i.e. the lim inf in $n$ is really a plain limit). Together with the stationarity property of the increments (or the wavelet coefficients), this entails that FBM belongs to a class $S(g, \psi)$. If we define the $Y_n^k$ to be wavelet coefficients, the spectrum will depend on the analyzing wavelet $\psi$. All spectra with upper envelope equal to the characteristic function of $[H, \infty)$ may be obtained with adequate choice of $\psi$. The result of the denoising procedure will thus in principle be wavelet-dependent. The influence of the wavelet is controlled through the prior choice, i.e. the multifractal spectrum among all admissible ones. In practice, few variations are observed if we use a Daubechies wavelet with length between 2 and 20, and a non-increasing spectrum supported on $[H, H + 1]$ with $f_g(H) = 1$.

A graphical comparison of results obtained through Bayesian multifractal denoising and traditional hard and soft thresholding is displayed in Figure 11.7. For each method, the parameters were manually set so as to obtain the best fit to the known original signal. By and large, the following conclusions may be drawn from these experiments. First, it is seen that, for irregular signals such as FBM, which belong to $S(g, \psi)$, the Bayesian method yields more satisfactory results

than traditional wavelet thresholding (we should however recall that hard and soft thresholding were not designed for stochastic signals). In particular, this method preserves a roughly correct regularity along the path, while wavelet thresholding yields a signal with both too smooth and too irregular regions. Second, it appears that using the degenerate information provided by the "flat" spectrum does not significantly decrease the denoising quality.

Figure 11.7. First line: FBM with H = 0.6 (left) and noisy version with Gaussian white noise (right). Second line: Denoised versions with a traditional wavelet thresholding; hard thresholding (left), soft thresholding (right). Third line: Bayesian denoising with increments' spectrum (left), Bayesian denoising with flat spectrum (right)

11.4.6.5. Denoising of road profiles

An important problem in road engineering is to understand the mechanisms of friction between rubber and the road. This is a difficult problem, since friction depends on many parameters: the type of rubber, the type of road, the speed, etc. Several authors have shown that most road profiles are fractal [RAD 94, HEI 97, GUG 98] on given ranges of scales. Such a property has obvious consequences on friction, some of which have been investigated for instance in [RAD 94, KLU 00]. The main idea is that, in the presence of fractal roads, all irregularity scales contribute to friction [DO 01].

In [LEG 04a], it is verified that road profiles finely sampled using tactile and laser sensors are indeed fractals. More precisely, it is shown that they have well-defined correlation exponents and regularization dimensions over a wide range of scales. However, various classes of profiles which have different friction coefficients are not discriminated by these global fractal parameters. This means that friction may have relatively low correlation with fractional dimensions or correlation exponents. In contrast, experiments show that the pointwise Hölder exponent allows us to separate road profiles which have different friction coefficients.

The laser acquisition system developed at LCPC (Laboratoire central des ponts et chaussées), based on an Imagine optics sensor, allows us to obtain (at the finest resolution) road profiles with a sampling step of 2.5 microns. These signals are very noisy. As a consequence, when they are used for computing a theoretical friction ([DO 01]), a very low correlation with the real friction (0.48) is obtained. Since local regularity is related to friction, it seems natural to use one of the regularity-based denoising methods presented above in this case. It appears that a Bayesian denoising is well fitted; after this denoising, the correlation between theoretical and real friction increases up to 0.86 (see Figure 11.8). For more on this topic, see [LEG 04b].

Figure 11.8. Theoretical friction versus measured friction. Each star represents a given class of road profiles. We calculate the friction for each profile in a class and take the average: (a) original profiles (correlation 0.4760); (b) denoised profiles (correlation 0.8565)


11.5. Hölderian regularity based interpolation

11.5.1. Introduction

A ubiquitous problem in signal and image processing is to obtain data sampled with the best possible resolution. At the acquisition step, the resolution is limited by various factors such as the physical properties of the sensors or the cost. It is therefore desirable to seek methods which would allow us to increase the resolution after acquisition. This is useful, for instance, in medical imaging or target recognition. At first sight, this might appear hopeless, since we cannot "invent" information which has not been recorded. Nevertheless, by making reasonable assumptions on the underlying signal (typically, a priori smoothness information), we may design various methods (note that we do not consider here the situation where several low resolution overlapping signals are available).

However, most techniques developed so far suffer from a number of problems. While the interpolated image is usually too smooth, it also occurs sometimes that on the contrary too many details are added, in particular in smooth regions. In addition, the creation of details is not well controlled, so that we can neither predict what the high resolution image will look like, nor describe the theoretical properties of the interpolation scheme. The main idea of [LEV 06] is to perform interpolation in such a way that smooth regions as well as irregular regions (i.e. sharp edges or textures) remain so after zooming. This can be interpreted as a constraint on the local regularity; the interpolation method should preserve local regularity.

11.5.2. The method

Let $X_n$ be the signal obtained by sampling the signal $X$ on $2^n$ points. The proposed method is strongly related to the estimator WCR described in section 11.3.2 and follows the steps below:

– estimate the regularity by the WCR method: the regression of the logarithm of wavelet coefficients vs scale is calculated above each point $t$;

– the wavelet coefficients at the scale $n + 1$ are obtained by the following formula:

$$\log_2 |\tilde{x}_{n+1,k}| = \frac{2}{n(n-1)} \sum_{j=1}^{n} \log_2 |x_{j,k}|\, (3j - n - 2)$$

with $x_{j,k}$, $j = 1 \dots n$ the wavelet coefficients "above" $t$. This means that the regression slope (i.e. the estimated regularity by the WCR method) remains the same after interpolation (see Figure 11.9 left, second row);

– perform an inverse wavelet transform.
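The formula of the second step is simply the value at abscissa $n + 1$ of the least squares line fitted to the points $(j, \log_2 |x_{j,k}|)$, $j = 1, \dots, n$. The short sketch below (the function name is ours) evaluates it and checks this interpretation on coefficients that decay exactly like $2^{-1.5 j}$.

```python
def extrapolated_log2_coefficient(log2_abs):
    """log2 of the new coefficient at scale n + 1.

    log2_abs[j-1] holds log2|x_{j,k}| for the n >= 2 coefficients "above" t.
    """
    n = len(log2_abs)
    weighted = sum(v * (3 * j - n - 2) for j, v in enumerate(log2_abs, start=1))
    return 2.0 / (n * (n - 1)) * weighted


# Coefficients lying exactly on the line -1.5 * j + 0.3, for j = 1, ..., 6.
logs = [-1.5 * j + 0.3 for j in range(1, 7)]
print(extrapolated_log2_coefficient(logs))  # value of the same line at j = 7
```

Since the added point lies on the regression line, refitting the line on the $n + 1$ scales returns the same slope, which is why the estimated regularity is left unchanged by the interpolation.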

With this method, the local estimated regularity of the signal/image will remain unchanged because the high frequency content is added in a manner coherent with lower scales. From an algorithmic point of view, we note that only one computation is needed, whatever the number of added scales.

11.5.3. Regularity and asymptotic properties

Let $\tilde{X}_{n+m}$ be the signal after $m$ interpolations, $\tilde{\alpha}(n, t)$ be the estimated regularity at $t$ given by (11.2) and $\log_2 \tilde{K}_{n,k}$ be the ordinate at zero of the WCR regression.

PROPOSITION 11.7.– If $X \in C^\alpha$ then, whatever the number $m$ of added scales:

$$\|X - \tilde{X}_{n+m}\|_2^2 \le \frac{c^2}{2^{2\alpha} - 1}\, 2^{-2\alpha n} + \frac{\hat{K}_n}{2^{2\hat{\alpha}_n} - 1}\, 2^{-2\hat{\alpha}_n n}$$

with $(\hat{K}_n, \hat{\alpha}_n)$ such that:

$$\hat{K}_n 2^{-2j\hat{\alpha}_n} = \max_{(\tilde{K}_{n,k},\, \tilde{\alpha}(n,t))} \left( \tilde{K}_{n,k}\, 2^{-2j\tilde{\alpha}(n,t)} \right)$$

PROPOSITION 11.8.– Assume that $X \in C^\alpha$ and that at each point the local and the pointwise Hölder exponents of $X$ coincide. Then, $\forall \varepsilon > 0$, $\exists N$:

$$n > N \Rightarrow \|X - \tilde{X}_{n+m}\|_2 = O(2^{-(n+m)(\alpha - \varepsilon)})$$

In addition, $\|X - \tilde{X}_{n+m}\|_{B^s_{p,q}} = O(2^{-(n+m)(\alpha - s - \varepsilon)})$ for all $s < \alpha - \varepsilon$.

See [LEG 04b] for more results.

11.5.4. Numerical experiments

We show a comparison between the regularity-based interpolation method and a traditional bicubic method on a scene containing a Japanese door (torii). Figure 11.9 displays the original 128×128 pixel image, and eight-times bicubic and regularity-based interpolations on a detail of the door image.

11.6. Biomedical signal analysis

Fractal analysis has long been applied with success in the biomedical field. A particularly interesting example is provided by the study of ECG. ECG and signals derived from them, such as RR intervals, are an important source of information in the detection of various pathologies, e.g. congestive heart failure and sleep apnea, among others. The fractality of such data has been reported in numerous works over the years. Several fractal parameters, such as the box dimension, have been found to correlate well with the heart condition in certain situations ([PET 99, TU 04]).



Figure 11.9. Left. First row: original door image, second row: Regression of the logarithm of the wavelet coefficients vs scales above the point t. The added wavelet coefficient is the one on the right. Right: 8 times bicubic (up) and regularity-based (bottom) interpolations on door image (detail)



More precise information on ECG is provided by multifractal analysis, because their local regularity varies wildly from point to point. In the specific case of RR intervals, several studies have shown that the multifractal spectrum correlates well with the heart condition ([IVA 99, MEY 03, GOL 02]). Roughly speaking, we observe two notable phenomena:
– On average, the local regularity of healthy RR is smaller than that in presence of, e.g., congestive heart failure. In other words, pathologies increase the local regularity.
– Healthy RR have much more variability in terms of local regularity; congestive heart failure reduces the range of observed regularities.

These results may be traced back to the fact that congestive heart failure is associated with profound abnormalities in both the sympathetic and parasympathetic control mechanisms that regulate beat-to-beat variability [GOL 02].

A precise view on the mechanisms leading to multifractality is important if we want to understand the purposes it serves and how it will be modified in response to external changes or in case of abnormal behavior. As of today, there is no satisfactory multifractal model for RR intervals ([AMA 99]). As a preliminary step toward this goal, we shall describe in this section a remarkable feature of the time-evolution of the local regularity.

Obviously, calculating the time evolution of the local regularity gives far more information than the sole multifractal spectrum. Indeed, the latter may be calculated from the former, while the reverse is not true. In addition, inspecting the variations of local regularity yields new insights which cannot be deduced from a multifractal spectrum, since all time-dependent information is lost on a spectrum. This is crucial for RR intervals, since, as we will see, the evolution of local regularity is strongly (negatively) correlated with the RR signals.
This fact prompts the development of new models that would account for the fact that, when the RR intervals are larger, the RR signal is more irregular, and vice versa. In that view, we shall briefly describe a new mathematical model that goes beyond the usual multifractional Brownian motion (MBM). Recall that the MBM is the following random process, that depends on the functional parameter H(t), where $H : [0, \infty) \to [a, b] \subset (0, 1)$ is a $C^1$ function:

$$W_{H(t)}(t) = \int_{-\infty}^{0} \left[(t-s)^{H(t)-1/2} - (-s)^{H(t)-1/2}\right] dW(s) + \int_{0}^{t} (t-s)^{H(t)-1/2}\, dW(s).$$

The main feature of MBM is that its Hölder exponent may be easily prescribed; at each point t0 , it is equal to H(t0 ) with a probability of one. Thus, MBM allows us to describe phenomena whose regularity evolves in time/space. For more details on MBM, see Chapter 6.
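To make the moving-average definition recalled above more concrete, here is a minimal numerical sketch that replaces the two stochastic integrals by Riemann sums over a fixed grid of Brownian increments. It is a crude approximation rather than an exact synthesis method: the lower limit of the first integral is truncated to a finite `lookback`, and the names `mbm_path`, `lookback` and the grid sizes are assumptions made for this example.

```python
import numpy as np

def mbm_path(H, n=256, T=1.0, lookback=4.0, seed=0):
    """Approximate an MBM path on (0, T] by a Riemann sum of the two integrals.

    H        : callable t -> exponent in (0, 1)
    lookback : the lower limit -infinity is truncated to -lookback (assumption)
    """
    rng = np.random.default_rng(seed)
    dt = T / n
    m = int(round(lookback / dt))                   # cells covering [-lookback, 0)
    s = (np.arange(-m, n) + 0.5) * dt               # midpoints of the noise cells
    dW = rng.normal(0.0, np.sqrt(dt), size=m + n)   # one Brownian increment per cell
    past = s < 0
    t = np.arange(1, n + 1) * dt
    x = np.empty(n)
    for i, ti in enumerate(t):
        e = H(ti) - 0.5                             # kernel exponent H(t) - 1/2
        kernel = np.zeros_like(s)
        # Integral over the past: (t - s)^(H-1/2) - (-s)^(H-1/2)
        kernel[past] = (ti - s[past]) ** e - (-s[past]) ** e
        # Integral over (0, t): (t - s)^(H-1/2)
        cur = (~past) & (s < ti)
        kernel[cur] = (ti - s[cur]) ** e
        x[i] = kernel @ dW
    return t, x

# The two regularity functions quoted for Figure 11.10.
t, x_lin = mbm_path(lambda u: 0.2 + 0.6 * u)
t, x_per = mbm_path(lambda u: 0.5 + 0.3 * np.sin(4 * np.pi * u))
```

With a constant H = 1/2 the first kernel vanishes and the second equals one, so the sketch reduces to a discretized Brownian motion, which gives a simple sanity check.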



Figure 11.10 shows two paths of MBM with a linear function H(t) = 0.2 + 0.6t and a periodic H(t) = 0.5 + 0.3 sin(4πt). We clearly see how regularity evolves over time.

Figure 11.10. MBM paths with linear and periodic H functions

Estimates of the H functions from the traces in Figure 11.10, obtained using the so-called generalized quadratic variations, are shown in Figure 11.11 (the theoretical regularity is in gray and the estimated regularity is in black).
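The idea behind quadratic-variation estimators can be sketched in a few lines. The example below is a simple two-scale variant, not necessarily the estimator used for Figure 11.11: it compares local sums of squared second-order increments taken at steps 1 and 2. For a process whose increments scale like δ^H, the ratio of the two sums behaves like 2^(2H), so half its base-2 logarithm estimates the local exponent. The function name `local_holder_gqv` and the window size are assumptions.

```python
import numpy as np

def local_holder_gqv(x, half_window=128):
    """Two-scale quadratic-variation estimate of the local Hölder exponent.

    Returns an array of length len(x) - 4. The input must not be locally
    constant, otherwise the ratio below is undefined.
    """
    x = np.asarray(x, dtype=float)
    d1 = (x[2:] - 2.0 * x[1:-1] + x[:-2]) ** 2    # squared second differences, step 1
    d2 = (x[4:] - 2.0 * x[2:-2] + x[:-4]) ** 2    # squared second differences, step 2
    d1 = d1[: len(d2)]                            # align the two sequences
    window = np.ones(2 * half_window + 1)
    v1 = np.convolve(d1, window, mode="same")     # local quadratic variation, step 1
    v2 = np.convolve(d2, window, mode="same")     # local quadratic variation, step 2
    return 0.5 * np.log2(v2 / v1)

# Sanity check on a discretized Brownian motion, whose exponent is 1/2.
rng = np.random.default_rng(0)
brownian = np.cumsum(rng.normal(size=20000))
h_est = local_holder_gqv(brownian, half_window=256)
```

A larger window lowers the variance of the estimate but blurs fast changes of regularity, which is the usual trade-off for local estimators of this kind.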

Figure 11.11. Estimation of the local regularity of MBM paths. Left: linear H function. Right: periodic H function

24-hour interbeat (RR) interval time series obtained from the PhysioNet database [PHY], along with their estimated local regularity (assuming that the processes may be modeled as MBM), are shown in Figure 11.12. These signals were derived from long-term ECG recordings of adults between the ages of 20 and 50 who have no known cardiac abnormalities; the recordings typically begin and end in the early morning (within an hour or two of the subject waking). They are composed of around 100,000 points. As we can see from Figure 11.12, there is a clear negative correlation between the value of the RR interval and its local regularity: when the black curve moves up, the gray tends to move down. In other words, slower heartbeats (higher RR values) are typically more irregular (smaller Hölder exponents) than faster ones.

Figure 11.12. RR interval time series (upper curves) and estimation of the local regularity (lower curves)

In order to account for this striking feature, the modeling based on MBM must be refined. Indeed, while MBM allows us to tune regularity at each time, it does so in an “exogenous” manner. This means that the values of H and of $W_H$ are independent. A better model for RR time series requires us to define a modified MBM where the regularity would be a function of $W_H$ at each time. Such a process is called a self-regulating multifractional process (SRMP). It is defined as follows. We give ourselves a deterministic, smooth, one-to-one function g ranging in (0, 1), and we seek a process X such that, at each t, $\alpha_X(t) = g(X(t))$ almost surely. It is not possible to write an explicit expression for such a process. Rather, a fixed point approach is used, which we now briefly describe (see [BAR 07] for more details). Start from an MBM $W_H$ with an arbitrary function H (for instance a constant). At the second step, set $H = g(W_H)$. Then iterate this process, i.e. calculate a new $W_H$ with this updated H function, and so on. We may prove that these iterations will almost surely converge to a well-defined SRMP X with the desired property, namely the regularity of X at any given time t is equal to g(X(t)).

For such a process, there is a functional relation between the amplitude and the regularity. However, this does not make precise control of the Hölder exponent possible. Let us explain this through an example. For definiteness, take g(t) = t for all t. Then, a given realization might result in a low value of X at, say, t = 0.5 and thus high irregularity at this point, while another realization might give a large X(0.5), resulting in a path that is smooth at 0.5. See Figure 11.13 for an example of this fact. In order to gain more control, the definition of an SRMP is modified as follows.
First define a “shape function” s, which is a deterministic smooth function. Then, at each step, calculate $W_H$, and set $H = g(W_H + ms)$, where m is a positive number. The function s thus serves two purposes. First, it allows us to tune the shape of X: when m is large, X and s will essentially have the same form. Second, because of the first property, it allows us to decide where the process will be irregular and where it will be smooth. Figure 11.14 displays an example of SRMP with controlled shapes.

Figure 11.13. Paths of SRMPs with g(Z) = Z

Figure 11.14. (a) SRMP with g(Z) = Z (black), and controlling shape function (gray); (b) the same SRMP (black) with estimated Hölder exponent (gray)
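The fixed-point construction of an SRMP can be sketched numerically as follows. This is only an illustration under several assumptions that are not stated in the text: the MBM is approximated by a truncated Riemann sum, the same noise realization is reused at every iteration, the number of iterations is fixed instead of being driven by a convergence test, and g is a hypothetical logistic map with values in (0.2, 0.8). The names `mbm_from_noise` and `srmp_sketch` are made up for the example. The function returns the last $W_H$ together with the exponent function H used to build it.

```python
import numpy as np

def mbm_from_noise(H, dW, dt, m_past):
    """Riemann-sum MBM for an array H (one exponent per time step) and fixed noise dW."""
    n = len(H)
    s = (np.arange(-m_past, n) + 0.5) * dt        # midpoints of the noise cells
    past = s < 0
    x = np.empty(n)
    for i in range(n):
        t = (i + 1) * dt
        e = H[i] - 0.5
        kernel = np.zeros_like(s)
        kernel[past] = (t - s[past]) ** e - (-s[past]) ** e
        cur = (~past) & (s < t)
        kernel[cur] = (t - s[cur]) ** e
        x[i] = kernel @ dW
    return x

def srmp_sketch(g, n=128, lookback=2.0, n_iter=10, shape=None, m=0.0, seed=0):
    """Iterate H <- g(W_H + m * s), starting from a constant H."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n
    m_past = int(round(lookback / dt))
    dW = rng.normal(0.0, np.sqrt(dt), size=m_past + n)   # noise kept fixed (assumption)
    s_fun = np.zeros(n) if shape is None else np.asarray(shape, dtype=float)
    H = np.full(n, 0.5)                                  # arbitrary starting exponent
    for _ in range(n_iter):
        x = mbm_from_noise(H, dW, dt, m_past)
        H = g(x + m * s_fun)                             # update the regularity
    return mbm_from_noise(H, dW, dt, m_past), H

def g_logistic(z):
    """Hypothetical smooth, one-to-one g with values in (0.2, 0.8)."""
    return 0.2 + 0.6 / (1.0 + np.exp(-z))

x, H = srmp_sketch(g_logistic)

# Same construction with a hypothetical bump as shape function.
grid = np.arange(1, 129) / 128.0
x_shaped, H_shaped = srmp_sketch(g_logistic, shape=np.sin(np.pi * grid), m=3.0)
```

In a more careful implementation one would monitor the change of H between iterations and stop when it falls below a tolerance.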



It is then possible to obtain a fine model for RR traces based on the following ingredients:
– an “s” function, that describes the overall shape of the trace, and in particular the nycthemeral cycle;
– a g function whose role is to ensure that the correct relation between the heart rate and its regularity is maintained at all times.

The shape s is estimated from the data in the following way: for each RRi time series, histograms of both the signal and its exponent are plotted, and modeled as a sum of two Gaussians, as represented in Figure 11.15.

Figure 11.15. Histogram of RRi time series, modeled as a sum of two Gaussians

From these signals the shape functions are inferred. They are based on splines and parameterized by:
– $D_n$, duration of the night: $D_n \in [6, 10]$;
– $D_m$, duration of the beginning of the measurement: $D_m \in [2, 4]$;
– $D_s$, duration of the sleeping phase: $D_s \in [0.5, 1.5]$;
– $D_a$, duration of the awakening phase: $D_a \in [0.5, 1.5]$;
– $RRi_d$, mean interbeat interval during the day: $RRi_d \in [0.6018, 0.7944]$;
– $RRi_n$, mean interbeat interval during the night: $RRi_n \in [0.7739, 1.0531]$.

Each parameter is randomly chosen in its interval, with uniform probability (see Figure 11.16 for a representation).



Figure 11.16. Shape function of RR intervals

The g function is estimated in the phase space. More precisely, for each trace, the value of H as a function of the RR interval is plotted. Representing all these graphs on a single plot, a histogram is obtained, as in Figure 11.17.

Figure 11.17. Histogram in the phase space
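As an illustration of how a relation α = g(RR) can be read off a phase-space histogram of this kind, the sketch below bins pairs (RR, α), takes the most populated α bin in each RR column as the ridge, and fits a straight line to the ridge by least squares. The function name `fit_ridge_line`, the bin count, the `min_count` threshold and the synthetic data are all assumptions made for the example; they do not reproduce the values behind Figure 11.17.

```python
import numpy as np

def fit_ridge_line(rr, alpha, bins=40, min_count=5):
    """Fit alpha = a * RR + b to the ridge of the 2D histogram of (RR, alpha)."""
    counts, rr_edges, al_edges = np.histogram2d(rr, alpha, bins=bins)
    rr_mid = 0.5 * (rr_edges[:-1] + rr_edges[1:])    # centers of the RR bins
    al_mid = 0.5 * (al_edges[:-1] + al_edges[1:])    # centers of the alpha bins
    ridge = al_mid[np.argmax(counts, axis=1)]        # most populated alpha per RR bin
    keep = counts.sum(axis=1) >= min_count           # skip nearly empty RR columns
    a, b = np.polyfit(rr_mid[keep], ridge[keep], 1)  # least-squares straight line
    return a, b

# Synthetic data with a known decreasing relation (hypothetical values).
rng = np.random.default_rng(0)
rr = rng.uniform(0.5, 1.2, size=20000)
alpha = -0.5 * rr + 0.9 + rng.normal(0.0, 0.02, size=20000)
a, b = fit_ridge_line(rr, alpha)
```

On these synthetic data the fitted slope is negative, in line with the negative correlation between RR values and local regularity described in the text.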

The ridge line of this histogram, seen as a surface in the (RR, α) plane, is then extracted (see Figure 11.17). This ridge line is roughly a straight line, which is fitted using least squares minimization in order to obtain an equation of the form α = g(RR) = aRR + b. The last step is to synthesize an SRMP with shape function s and regularity function g, as explained above. Paths obtained in this way are shown in Figure 11.18. Compare this with the graphs shown in Figure 11.12, displaying true RR traces.

11.7. Texture segmentation

We will now briefly explain how multifractal analysis may be used for texture segmentation. We present an application to 1D signals, namely well logs. For an application to images, see [MUL 94, SAU 99].



Figure 11.18. Two forged RR intervals based on SRMP (upper curves) and estimated regularity (lower curves)

The characterization of geological strata with the help of well logs can be used to interpret the sedimentary environment of an area of interest, such as a reservoir. Recent progress on electrofacies [SER 87] has made it possible to relate well logs to the sedimentary environment and to extrapolate information coming from the core to any vertical well span. Electrofacies predictions are based on multivariate correlations and cluster analysis for which the input data are conventional well logs (including sonic, density and gamma logs) as well as the information extracted from the core analysis.

The microresistivity log (ML) measures the local resistivity of the rock wall of the well. The measurement is obtained by passing an electric current through the rock, to a lateral depth of approximately 1 cm. The resistivity varies as a function of the local porosity and the connectivity of the pores (normally, the rock is a non-conductor and thus the current passes through the fluid contained in the pores). ML contain information not only on the inclination of geological strata but also on the texture of these strata.

To analyze the irregular variations of ML, we may calculate texture parameters locally and at different depths in the well. These can be used to obtain a segmentation of the well [SAU 97]. Let $r(x_i)$ denote the resistivity signal, where the $x_i$ are equidistant coordinates which measure the depth in the well. In order to emphasize the vertical variations of $r(x_i)$, a transformed signal $s_\omega(x_i)$ is first defined:

$$s_\omega(x_i) = |r(x_{i+1}) - r(x_i)|^{\omega}$$

where ω > 0. This transformation amplifies the small scales and eliminates any constant component of the signal $r(x_i)$.

Analysis of well logs from the Oseberg reservoir in the North Sea shows that, typically, a fractal behavior is observed for lengths in the range [2 cm, 20 cm]. It has been found that relevant textural indices are given by the information dimension D(1) and the curvature parameter $\alpha_c = 2|D'(1)|/(D(0) - D(1))$, where D(q) is defined by $D(q) = \tau(q)/(q - 1)$, with an obvious modification for q = 1. In particular, these indices allow us to separate the three strata present in these logs. For instance, D(1) is correlated with the degree of heterogeneity: a formation which is more heterogeneous translates into smaller D(1) [SAU 97].

11.8. Edge detection

11.8.1. Introduction

In the multifractal approach to edge detection, an image is modeled by a measure μ, or, more precisely, a Choquet capacity [LEV 98]. A Choquet capacity is roughly a measure which does not need to satisfy the additivity requirement. This distinction will not be essential for our discussion below, and the reader may safely assume that we are dealing with measures. See Chapter 1 for more precise information on this.

The basic assumptions underlying the multifractal approach to image segmentation are as follows:
• The relevant information for the analysis can be extracted from the Hölder regularity of μ.
• Three levels contribute to the perception of the image: the pointwise Hölder regularity of μ at each point, the variation of the Hölder regularity of μ in local neighborhoods and the global distribution of the regularity in the whole scene.
• The analysis should be translation and scale invariant.

Let us briefly compare the multifractal approach to traditional methods such as mathematical morphology (MM), and gradient based methods, or more generally image multiscale analysis (IMA):
• As in MM and IMA, translation and scale invariance principles are fulfilled.
• There is no so-called “local comparison principle” or “local knowledge principle”, i.e. the decision of classifying a point as an edge point is not based only on local information. On the contrary, it is considered useful to analyze information about whole parts of the image at each point.
• The most important difference between “traditional” and multifractal methods lies in the way they deal with regularity. While the former aim at obtaining smoother versions of the image (possibly at different scales) in order to remove irregularities, the latter try to extract information directly from the singularities. Edges, for instance, are not considered as points where large variations of the signal still exist after smoothing, but as points whose regularity is different from the “background” regularity in the raw data. Such an approach makes sense for “complex” images, in which the relevant structures are themselves irregular. Indeed, an implicit assumption of MM and IMA is that the useful information lies at boundaries between originally smooth regions, so that it is natural to filter the image. However, there are cases (e.g. in medical imaging, satellite or radar imaging) where the meaningful features are essentially singular.
• As in MM, and contrary to IMA, the multifractal approach does not assume that there is a universal scheme for image analysis. Rather, depending on what we are looking for, different measures μ may be used to describe the image.
• Both MM and IMA consider the relative values of the gray levels as the basic information. Here the Hölder regularity is considered instead. This again is justified in situations where the important information lies in the singularity structure of the image.

Throughout the rest of this section, we make the following assumption:

$$f := f_h = f_g$$

The simplest approach then consists of defining a measure (or, often, a sequence of Choquet capacities) on the image, calculating its multifractal spectrum, and classifying each point according to the corresponding value of (α, f(α)), both in a geometric and a probabilistic fashion. The value of α gives local information about the pointwise regularity: for a fixed capacity, an ideal step edge point in an image without noise is characterized by a given value. The value of f(α) yields global information: a point on a smooth contour belongs to a set $T_\alpha$ whose dimension is 1, a point contained in a homogeneous region has f(α) = 2, etc. The probabilistic interpretation of f(α) corresponds to the fact that a point in a homogeneous region is a frequent event, an edge point is a rare event, and, for instance, a corner an even rarer event (see Figures 11.19 and 11.20). Indeed, if too many “edge points” are detected, it is in general more appropriate to describe these points as belonging to a homogeneous (textured) zone.

Figure 11.19. Three edges, a texture

Figure 11.20. Three corners, a texture



In other words, the assumption that $f_g = f_h$ allows us to link the geometric and probabilistic interpretations of the spectrum. Points on a smooth contour have an α such that:
• $f_h(\alpha) = 1$ because a smooth contour fills the space as a line.
• $f_g(\alpha) = 1$ because a smooth contour has a given probability to appear.

In fact, we may define the point type (i.e. edge, corner, smooth region, etc.) through its associated f(α) value; a point x with f(α(x)) = 1 is called an edge point, a point x with f(α(x)) = 2 is called a smooth point, and, more generally, for t ∈ [0, 2], x is called a point of type t if f(α(x)) = t. A benefit of the multifractal approach is thus that it allows us to define not only edge points, but a continuum of various types of points.

An important issue lies in the choice of a relevant sequence of capacities for describing the scene. The problem of finding an optimal c in a general setting is unsolved. In practice, we often use the following. Assume the image is defined on [0, 1] × [0, 1]. Let P := ((Ikn )0≤k 1, then the exponent of Q(x) is α − 1. Therefore, in particular, if service durations do not have a variance, the average waiting time before receiving service is infinite! This recalls certain fluid queue results, and in fact there are strong connections between the two types of system, the main difference being that, for point arrival, the incoming mass is instantaneously rather than progressively deposited into the queue, therefore further aggravating its load.

A considerable number of results for such systems are now available. For a survey we can consult Boxma and Cohen [BOX 00]. Among the most important generalizations is the replacement of “M” by “GI”, indicating a renewal process whose inter-arrivals are distributed according to a variable A which is not exponential but arbitrary.
This law can also have a heavy tail, in which case, depending on the ratio of the exponents of A and B, different behaviors are possible, especially when the system is heavily loaded: λ ≈ C. Another major factor lies in the choice of service discipline of the queue. In [COH 73] the traditional choice is made: arrivals are serviced in the order of arrival. However, there is no shortage of alternatives which are commonly employed in switches. For example, with processor sharing, where the server divides its capacity equally over all the customers present, we recover a finite average waiting time even when B is without variance, essentially because no arrival is forced to wait behind earlier arrivals which may have very long service times.

12.5. Perspectives

Even if the fractal nature of teletraffic is now accepted, and a new understanding of its impact has been, to some extent, reached, the list of open questions remains long. In reality, we are in the early stages of studying this phenomenon, observing its evolution, and appreciating its implications.

As far as long-range dependence is concerned, one category of outstanding questions concerns the details of these effects on various aspects of performance. In some sense it is necessary to “redo everything” in the queueing literature and other fields, to take into account this invariance at large scales. Despite considerable progress, our knowledge falls far short of that necessary to design networks capable of minimizing the harmful effects with confidence, and of efficiently exploiting the beneficial properties of long memory.

A second category of questions that appears essential for the future is to understand the origin (or origins) of the apparent scaling invariance over small scales, i.e. multifractal behavior. Understanding these origins will be essential to predict whether this behavior will persist, not only in the sense of not disappearing, but also in the sense of its extension towards ever smaller scales, as they are progressively activated by advances in technology. Even if small scale invariance is influenced, or even entirely determined, by network design, the study of its impact on performance will remain relevant. Though it appears obvious that, like any variability, its presence will be negative overall, we have yet to evaluate the cost of any impact against the cost of the actions that may be required to suppress it.

The third category of questions concerns protocol dynamics in closed loop control, such as in TCP, which configures the global network as an immense distributed dynamic system, of which the generation of scale invariances may be only one of the important consequences. The richness of non-linear and non-local interactions in this system merits that this new field be studied in full depth. The next phase in the evolution of the fractal teletraffic phenomenon, as unpredictable as it is fascinating, could very well come from determinism rather than randomness.

12.6. Bibliography

[ABR 98] ABRY P., VEITCH D., “Wavelet analysis of long-range dependent traffic”, IEEE Transactions on Information Theory, vol. 44, no. 1, p. 2–15, 1998.
[ABR 00] ABRY P., TAQQU M.S., FLANDRIN P., VEITCH D., “Wavelets for the analysis, estimation, and synthesis of scaling data”, in PARK K., WILLINGER W. (Eds.), Self-similar Network Traffic and Performance Evaluation, John Wiley & Sons, 2000.
[BER 95] BERAN J., SHERMAN R., TAQQU M.S., WILLINGER W., “Variable-bit-rate video traffic and long range dependence”, IEEE Transactions on Communications, vol. 43, p. 1566–1579, 1995.
[BOX 97] BOXMA O.J., DUMAS V., Fluid queues with long-tailed activity period distributions, Technical Report PNA-R9705, CWI, Amsterdam, The Netherlands, April 1997.
[BOX 00] BOXMA O.J., COHEN J.W., “The single server queue: Heavy tails and heavy traffic”, in PARK K., WILLINGER W. (Eds.), Self-Similar Network Traffic and Performance Evaluation, John Wiley & Sons, 2000.
[BRI 96] BRICHET F., ROBERTS J., SIMONIAN A., VEITCH D., “Heavy traffic analysis of a storage model with long range dependent on/off sources”, Queueing Systems, vol. 23, p. 197–225, 1996.
[CHO 97] CHOUDHURY G.L., WHITT W., “Long-tail buffer-content distributions in broadband networks”, Performance Evaluation, vol. 30, p. 177–190, 1997.
[COH 73] COHEN J.W., “Some results on regular variation for the distributions in queueing and fluctuation theory”, Journal of Applied Probability, vol. 10, p. 343–353, 1973.
[COX 84] COX D.R., “Long-range dependence: A review”, in DAVID H.A., DAVID H.T. (Eds.), Statistics: An Appraisal, Iowa State University Press, Ames, USA, p. 55–74, 1984.
[CUN 95] CUNHA C., BESTAVROS A., CROVELLA M., Characteristics of WWW client-based traces, Technical Report, Boston University, Boston, Massachusetts, July 1995.
[DUF 94] DUFFY D.E., MCINTOSH A.A., ROSENSTEIN M., WILLINGER W., “Statistical analysis of CCSN/SS7 traffic data from working CCS subnetworks”, IEEE Journal on Selected Areas in Communications, vol. 12, no. 3, 1994.

Scale Invariance in Computer Network Traffic


[ERR 90] ERRAMILLI A., SINGH R.P., “Application of deterministic chaotic maps to characterize broadband traffic”, in Proceedings of the Seventh ITC Specialist Seminar (Livingston, New Jersey), 1990.
[ERR 93] ERRAMILLI A., WILLINGER W., “Fractal properties in packet traffic measurements”, in Proceedings of the ITC Specialist Seminar (Saint Petersburg, Russia), 1993.
[ERR 95] ERRAMILLI A., SINGH R.P., PRUTHI P., “An application of deterministic chaotic maps to model packet traffic”, Queueing Systems, vol. 20, p. 171–206, 1995.
[FAL 90] FALCONER K., Fractal Geometry: Mathematical Foundations and Applications, John Wiley & Sons, 1990.
[FEL 98] FELDMANN A., GILBERT A., WILLINGER W., “Data networks as cascades: Explaining the multifractal nature of internet WAN traffic”, in ACM/Sigcomm’98 (Vancouver, Canada), 1998.
[KUR 96] KURTZ T.G., “Limit theorems for workload input models”, in KELLY F.P., ZACHARY S., ZIEDINS I. (Eds.), Stochastic Networks: Theory and Applications, Clarendon Press, Oxford, p. 119–140, 1996.
[LEL 91] LELAND W.E., WILSON D.V., “High time-resolution measurement and analysis of LAN traffic: Implications for LAN interconnection”, in Proceedings of the IEEE Infocom’91 (Bal Harbour, Florida), p. 1360–1366, 1991.
[LEL 93] LELAND W.E., TAQQU M.S., WILLINGER W., WILSON D.V., “On the self-similar nature of Ethernet traffic”, Computer Communications Review, vol. 23, p. 183–193, 1993.
[LEL 94] LELAND W.E., TAQQU M.S., WILLINGER W., WILSON D.V., “On the self-similar nature of Ethernet traffic (extended version)”, IEEE/ACM Transactions on Networking, vol. 2, no. 1, p. 1–15, 1994.
[LEV 97] LÉVY VÉHEL J., RIEDI R.H., “Fractional Brownian motion and data traffic modeling: The other end of the spectrum”, in LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals in Engineering’97, Springer, 1997.
[MEI 91] MEIER-HELLSTERN K., WIRTH P.E., YAN Y.L., HOEFLIN D.A., “Traffic models for ISDN data users: Office automation application”, in Proceedings of the Thirteenth ITC (Copenhagen, Denmark), p. 167–172, 1991.
[NAR 98] NARAYAN O., “Exact asymptotic queue length distribution for fractional Brownian traffic”, Adv. Perf. Analysis, vol. 1, no. 39, 1998.
[NOR 94] NORROS I., “A storage model with self-similar input”, Queueing Systems, vol. 16, p. 387–396, 1994.
[PAW 88] PAWLITA P.F., “Two decades of data traffic measurements: A survey of published results, experiences, and applicability”, in Proceedings of the Twelfth International Teletraffic Congress (ITC 12, Turin, Italy), 1988.
[PAX 94a] PAXSON V., FLOYD S., “Wide-area traffic: The failure of Poisson modeling”, IEEE/ACM Transactions on Networking, vol. 3, no. 3, p. 226–244, 1994.
[PAX 94b] PAXSON V., FLOYD S., “Wide-area traffic: The failure of Poisson modeling”, in Proceedings of SIGCOMM’94, 1994.



[RAM 88] RAMASWAMI V., “Traffic performance modeling for packet communication – Whence, where, and whither?”, in Proceedings of the Third Australian Teletraffic Research Seminar, vol. 31, November 1988.
[RIE 95] RIEDI R.H., “An improved multifractal formalism and self-similar measures”, J. Math. Anal. Appl., vol. 189, p. 462–490, 1995.
[RIE 99] RIEDI R.H., CROUSE M.S., RIBEIRO V.J., BARANIUK R.G., “A multifractal wavelet model with application to network traffic”, IEEE Transactions on Information Theory (special issue on “Multiscale Statistical Signal Analysis and its Applications”), vol. 45, no. 3, p. 992–1018, 1999.
[ROU 99] ROUGHAN M., YATES J., VEITCH D., “The mystery of the missing scales: Pitfalls in the use of fractal renewal processes to simulate LRD processes”, in ASA-IMA Conference on Applications of Heavy Tailed Distributions in Economics, Engineering, and Statistics (American University, Washington, USA), June 1999.
[RYU 96] RYU B.K., LOWEN S.B., “Point process approaches to the modeling and analysis of self-similar traffic. Part I: Model construction”, in IEEE INFOCOM’96: The Conference on Computer Communications (San Francisco, California), IEEE Computer Society Press, Los Alamitos, California, vol. 3, p. 1468–1475, March 1996.
[SIM 99] SIMONIAN A., MASSOULIÉ L., “Large buffer asymptotics for the queue with FBM input”, Journal of Applied Probability, vol. 36, no. 3, 1999.
[STE 94] STEVENS W., TCP/IP Illustrated. Volume 1: The Protocols, Addison-Wesley, 1994.
[TAN 88] TANENBAUM A.S., Computer Networks, Prentice Hall, Second Edition, 1988.
[TAQ 95] TAQQU M.S., TEVEROVSKY V., WILLINGER W., “Estimators for long-range dependence: An empirical study”, Fractals, vol. 3, no. 4, p. 785–798, 1995.
[TAQ 97] TAQQU M.S., WILLINGER W., SHERMAN R., “Proof of a fundamental result in self-similar traffic modeling”, Computer Communication Review, vol. 27, p. 5–23, 1997.
[VEI 93] VEITCH D., “Novel models of broadband traffic”, in IEEE Globecom’93 (Houston, Texas), p. 1057, November 1993.
[VEI 99] VEITCH D., ABRY P., “A wavelet based joint estimator of the parameters of long-range dependence”, IEEE Transactions on Information Theory (special issue on “Multiscale Statistical Signal Analysis and its Applications”), vol. 45, no. 3, p. 878–897, 1999.
[WIL 95] WILLINGER W., TAQQU M.S., SHERMAN R., WILSON D.V., “Self-similarity through high-variability: Statistical analysis of the Ethernet LAN traffic at the source level”, in Proceedings of the ACM/SIGCOMM’95 conference, 1995 (available at the address: http://www.acm.org/sigcomm/sigcomm95/sigcpapers.html).
[WIL 96] WILLINGER W., TAQQU M.S., ERRAMILLI A., “A bibliographical guide to self-similar traffic and performance modeling for modern high-speed networks”, in KELLY F.P., ZACHARY S., ZIEDINS I. (Eds.), Stochastic Networks: Theory and Applications, Clarendon Press, Oxford, p. 339–366, 1996.

Chapter 13

Research of Scaling Law on Stock Market Variations

Chapter written by Christian WALTER.

13.1. Introduction: fractals in finance

Stock market graphs representing changes in the prices of securities over a period of time appear as irregular forms that seem to be reproduced and repeated at all scales of analysis: rising periods follow periods of decline. However, the rises are broken up by intermediate falling phases and the falls are interspersed with partial rises, and this goes on until the natural limit of the quotation scale is reached. This entanglement of repetitive patterns of rising and falling waves at all scales was discovered in the 1930s by Ralph Elliott, to whom the idea occurred while observing the ebb and flow of tides on the sands of a seashore. From this, he formulated a financial symbolization known as “stock market waves” or “Elliott’s waves”, which he broke up into huge tides, normal waves and wavelets, and also “tsunami”, from the name given in Japan to huge waves arising from earthquakes.

The theory called “Elliott’s waves” [ELL 38] presents a deterministic fractal description of the stock market based on self-similar geometric figures that are found at all scales of observation, and compiles a toolbox for the graphic analysis of stock market fluctuations used by certain market professionals: technical analysts. Elliott’s figures propose a calibration of rise and fall variations from a Pythagorean numerology based on the use of the golden ratio and the Fibonacci sequence. These predictions are strongly tinged with subjectivity, insofar as the detection and positioning of waves depend on the graphic analyst’s view of the market which he examines. For lack of an appropriate mathematical tool, this conceptualization of stock market variations remained confined, like alchemy before chemistry, to the pre-scientific field until the emergence of fractals.

The fractals of Benoît Mandelbrot, though developed in a radically different approach, fit in with this understanding of stock market variations and share with Elliott’s waves the aim of untangling the inextricable interlacing of stock market fluctuations at all scales. Using stock market language, do we find ourselves in a downward correction of a rising phase or in a falling period contradicted by a temporary rise? Fractals provide an adequate conceptualization allowing the intuitions of graphic analysts to be translated into rigorous mathematical representations.

However, the adventure of fractals in finance does not have a smooth history. Rather, it is an eventful progression of Mandelbrot’s assumptions through the evolution of finance theory over 40 years, from 1960 until today, which stirred up a vehement controversy on modeling with infinite variance or infinite memory. The connecting thread of Mandelbrot’s works, followed by others, including his contradictors, was the search for scaling laws in stock market fluctuations, irrespective of whether this search followed the direction of scaling invariance, i.e. the pure fractal approach to markets proposed by Mandelbrot, or that of a multiscaling analysis of markets, such as that corresponding to mixed processes or ARCH-type processes that emerged in the 1980s, or to regime-switching models in the 1990s. The starting point of this controversy was the existence of leptokurtic (or non-Gaussian) distributions on stock market variations.
This distributional anomaly with respect to the Brownian hypothesis of traditional financial modeling led Mandelbrot to propose, in 1962, replacing the Gaussian distribution with Paul Lévy’s infinite-variance α-stable distributions for modeling periodic returns. However, this new hypothesis very soon provoked a relatively fierce controversy over the existence of the variance, and other candidate processes appeared, all the more easily because the scaling invariance of α-stable laws, the cardinal property of Mandelbrot’s fractal hypothesis, was not validated experimentally, or only with difficulty. The attempt to solve the leptokurtic problem by keeping the iid hypothesis and proposing α-stable distributions did not remove all the anomalies, since a new type of anomaly, the scaling anomaly, appeared. Theoretical research therefore looked for other ways of modeling leptokurticity and turned to the second pivot of financial modeling: the hypothesis of independence of successive returns, which was equally challenged. The cause of the observed leptokurticity was then sought in different forms of dependence between returns (linear and then non-linear dependence). This was the second round of empirical investigations. After establishing the absence of short memory in returns, research turned towards the detection of long memory in returns.



This attempt did not succeed either. The focus then shifted to the volatility process: first with the formalization of short memory in volatilities, an approach that led to ARCH modeling, and then with the identification of long memory (or long-range dependence) in volatilities, a trend that led to the rediscovery of scaling laws in finance. Finally, the fractal hypothesis was validated on the process generating stock market volatilities. Today, the long memory of volatilities (i.e., a hyperbolic law for the correlations among volatilities) is a recognized fact of financial markets, and financial modeling seeks to reconcile the absence of memory in returns with the presence of long memory in volatilities.

After a first part, which briefly outlines the quantities studied in finance and traditional financial modeling, we review the research on scaling laws for the description of stock market variations¹. This review clearly brings out several distinct periods, and we therefore propose a chronology of this research, i.e. a periodization that illustrates the conceptual transfers to which finance has been subject for 40 years.
The chronology is as follows:
– during the first period, from 1960 to 1970, Mandelbrot’s proposals and the first promising discoveries of scaling invariance on the markets launched a debate in the academic community by introducing the iid-α-stable and H-correlative models;
– this debate developed during the period 1970-1990 and seemed to end with the experimental rejection of fractals in finance for stock market returns;
– however, in parallel with fractals, developments in time series econometrics on fractionally integrated processes, of ARFIMA type from the 1980s and then of FIGARCH type in the 1990s, led between 1990 and 2000 to a rediscovery of scaling laws in the process of stock market volatilities, using long memory concepts;
– finally, the measurement of time itself has become an object of research, with the recent development of models based on time deformation.

13.2. Presence of scales in the study of stock market variations

13.2.1. Modeling of stock market variations

13.2.1.1. Statistical apprehension of stock market fluctuations

When we want to describe statistically the behavior of a stock market between two dates 0 and T, with the aim of proposing a probabilistic model for it, two “natural”

1. Mathematical aspects of fractal modeling in general, developed in many works, are not dealt with here. Specific aspects of fractal modeling in finance and examples of application of iid-α-stable model are presented in detail in [LEV 02].



interpretations of the available data (price quotations) are possible. We may consider the prices quoted between 0 and T at a fixed observation frequency, which can be a day, a month or a quarter, but also an hour or five minutes. This divides the interval [0, T] into n periods of equal length τ = T/n, the duration τ defining a “characteristic time” of market observation. Alternatively, we may consider the price quoted at every transaction that took place between 0 and T, which means splitting the interval [0, T] into $0 = t_0 < t_1 < \dots < t_n = T$ and working in “time deformation”, or “transaction-time”, $t_j$ being the time of the j-th transaction.

The first approach appears the most immediate, but because of the discontinuous and irregular nature of stock market quotations, the price recorded on date t may not correspond to a real exchange on the market, i.e., to an equilibrium of supply and demand at the time of quoting: in that case, the economic significance of the statistical analysis could be very weak. Moreover, when the quoting frequency is higher than daily, the variations between the previous day’s closing price and the following day’s opening price are treated as intra-daily variations. Finally, quoting in physical time assumes that market activity is broadly uniform during the observation period, which is generally not the case. Hence the interest of the second approach, which does correspond to a succession of prices balancing supply and demand. These two approaches coexist in financial modeling, and this alternative raises the issue of the appropriate time for measuring stock market fluctuations, first put forward by Mandelbrot and Taylor [MAND 67b] and by Clark [CLA 73], who introduced the concept of “time-information”, where information is associated with the volume of transactions².
The first analysis (calendar time) is widely used, but the second line of research (time deformation) has begun to attract renewed interest. This interest both results from and reflects the change in the computing environment of stock markets, which has produced an increasing abundance of available price data: prices have been recorded daily since the 19th century, but during the 1980s they were recorded every minute and then, in the 1990s, at each transaction. Sample sizes thus grew by several powers of 10. Statistical tests carried out on markets during the 1960s used approximately $10^3$ data points. Those of the early 1990s processed nearly $10^5$. The most recent investigations examine nearly $10^7$.

2. We can observe that this approach is similar to that of Maurice Allais who introduced the concept of “psychological time” in economics (see for example [ALL 74]).



In calendar time, the basic modeling is as follows. Let S(t) be the price of asset S on date t. The variation of price between 0 and T is:

$$S(T) = S(0) + \sum_{k=1}^{n} \Delta S(k), \qquad n = \frac{T}{\tau} \tag{13.1}$$

The notation ΔS(k) represents the price variation of asset S between the dates t − τ and t, where t is expressed as a multiple of the basic time step τ (t = kτ):

$$\Delta S(t,\tau) = S(t) - S(t-\tau) = S(k\tau) - S\big((k-1)\tau\big) = \Delta S(k) \tag{13.2}$$

In transaction-time, prices are recorded at every transaction and the price variations between two successive transactions are considered. Let N(t) be the number of transactions³ made between dates 0 and t. The variation of price between 0 and T is in this case:

$$S(T) = S(0) + \sum_{j=1}^{N(T)} \Delta S(j) \tag{13.3}$$

The notation ΔS(j) represents the price variation of asset S between transactions j − 1 and j: ΔS(j) = S(j) − S(j − 1). The price process {S(j)} is therefore indexed by an intrinsic stock market time, or “transaction-time”, denoted θ(t): S(j) = S(θ(t)).

Finally, market professionals generally consider that the value of a quoted price (and thus the relevance of the measurement) is not the same depending on whether this price corresponds to a transaction of 500,000 securities or of 5 securities. The concepts of market “depth” and exchange “weight” therefore come into play. We measure the intensity of exchange, or “activity level” of the market, by the quantity of securities exchanged, or volume of transactions. Financial modeling has taken this element into account, and the role of transaction volume in the evolution of prices is analyzed⁴ today by introducing a volume process into the models.
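The two readings of the same price record can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list `trades`, the helper names and the convention of taking the last traded price at each calendar date are hypothetical choices, not part of the models cited above.

```python
from bisect import bisect_right

# Hypothetical record of transactions: (time, price), with increasing times.
trades = [(0.0, 100.0), (0.4, 100.5), (0.5, 100.2),
          (2.7, 99.8), (2.9, 100.1), (4.0, 100.4)]


def calendar_increments(trades, tau, horizon):
    """Increments Delta S(k) on a fixed grid of step tau, as in (13.1).

    Assumption: the price at a grid date is the last traded price at or
    before that date (one common convention among several).
    """
    times = [t for t, _ in trades]
    n = round(horizon / tau)
    grid_prices = []
    for k in range(n + 1):
        idx = bisect_right(times, k * tau) - 1  # last trade at or before k*tau
        grid_prices.append(trades[idx][1])
    return [b - a for a, b in zip(grid_prices, grid_prices[1:])]


def transaction_increments(trades):
    """Increments Delta S(j) between successive transactions, as in (13.3)."""
    prices = [p for _, p in trades]
    return [b - a for a, b in zip(prices, prices[1:])]


s0 = trades[0][1]
cal = calendar_increments(trades, tau=1.0, horizon=4.0)
trx = transaction_increments(trades)

# Both decompositions rebuild the same final price S(T).
print(s0 + sum(cal), s0 + sum(trx))
```

In this toy record no trade occurs between dates 1 and 2, so the corresponding calendar-time increment is zero even though no exchange took place: the grid price there is a stale quote, which is the weakness of physical-time sampling discussed above.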

3. The process of transaction arrivals N(t) was studied by Hasbrouck and Ho [HAS 87] and more recently by Ghysels et al. [GHY 97]. Evertsz [EVE 95b] showed that the distribution of waiting times between two quotations follows a Pareto power law whose exponent implies an infinite mathematical expectation.
4. See, for example, Lamoureux and Lastrapes [LAM 94] or Gouriéroux and Le Fol [GOU 97b], who give a synthesis of this question. Maillet and Michel [MAI 97] showed that the distribution of volumes follows a Pareto power law.



Let V(t) be the total volume of securities exchanged between 0 and t. The total volume of securities exchanged between 0 and T is:

$$V(T) = \sum_{j=1}^{N(T)} \upsilon(j) \tag{13.4}$$

The notation υ(j) represents the volume of securities exchanged during transaction j. The volume process {V(j)} is indexed by transaction-time. The price quoted on date T is therefore the result of three factors or processes acting simultaneously between 0 and T: the transaction process N(t), the process of price variations between two transactions ΔS(j) and the volume process υ(j).

13.2.1.2. Profit and stock market return operations in different scales

From basic data such as the prices quoted over the period [0, T], three quantities are of interest in finance:

– the profit realized on the security during the period [0, T], defined by:

$$G(T) = S(T) - S(0) \tag{13.5}$$

– the rate of return of the security over the period [0, T], defined by:

$$R(T) = \frac{S(T) - S(0)}{S(0)} = \frac{G(T)}{S(0)} \tag{13.6}$$

– the continuous rate of return of the security over the period [0, T], defined by:

$$r(T) = \ln\big(1 + R(T)\big) = \ln S(T) - \ln S(0) \tag{13.7}$$

We are interested in the evolution of these quantities over successive sub-periods [t − τ, t]. The periodic gain realized on the security during the sub-period [t − τ, t] is:

$$\Delta G(t,\tau) = G(t) - G(t-\tau) = S(t) - S(t-\tau) = \Delta S(t,\tau) \tag{13.8}$$

The periodic rate of return realized on the security during the sub-period [t − τ, t] is:

$$\Delta R(t,\tau) = \frac{S(t) - S(t-\tau)}{S(t-\tau)} = \frac{\Delta S(t,\tau)}{S(t-\tau)}$$

or:

$$1 + \Delta R(t,\tau) = \frac{S(t)}{S(t-\tau)} = \frac{1 + R(t)}{1 + R(t-\tau)} \tag{13.9}$$



The periodic continuous rate of return realized on the security during the sub-period [t − τ, t] is:

$$\Delta r(t,\tau) = \ln\big(1 + \Delta R(t,\tau)\big) = \ln S(t) - \ln S(t-\tau) = r(t) - r(t-\tau) \tag{13.10}$$

When data are sampled at high frequency, ΔR(t, τ) is “small” and we have ln(1 + ΔR(t, τ)) ≈ ΔR(t, τ). Expressions (13.9) and (13.10) are then very close, and periodic security returns are measured by one or the other. The temporal aggregation of returns is obtained by combining expressions (13.6) and (13.9); with t = kτ, we have:

$$1 + R(T) = \prod_{t=\tau}^{T} \big(1 + \Delta R(t,\tau)\big) = \prod_{k=1}^{n} \big(1 + \Delta R(k)\big) \tag{13.11}$$

In the same way, (13.7) and (13.10) lead to:

$$r(T) = \sum_{t=\tau}^{T} \Delta r(t,\tau) = \sum_{k=1}^{n} \Delta r(k) \tag{13.12}$$
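The aggregation rules (13.11) and (13.12) can be checked numerically. The short Python sketch below does so on a hypothetical price series; the values in `prices` are invented for the illustration and the variable names are our own.

```python
import math

# Hypothetical prices S(0), S(tau), ..., S(n*tau); values are illustrative only.
prices = [100.0, 101.0, 99.5, 102.0, 103.5]

# Periodic simple returns, from (13.9), and continuous returns, from (13.10).
simple = [b / a - 1.0 for a, b in zip(prices, prices[1:])]
continuous = [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]

# Total returns over [0, T], from (13.6) and (13.7).
R_T = prices[-1] / prices[0] - 1.0
r_T = math.log(prices[-1]) - math.log(prices[0])

# Aggregation: product of gross simple returns (13.11), sum of log returns (13.12).
R_from_product = math.prod(1.0 + x for x in simple) - 1.0
r_from_sum = sum(continuous)

print(R_T, R_from_product)
print(r_T, r_from_sum)
```

Simple returns compound multiplicatively whereas continuous returns add, which is why the continuous rate r is the natural quantity for the sums of increments studied in the rest of this chapter.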

13.2.1.3. Traditional financial modeling: Brownian motion

Modeling stock market variations leads us to assume that S(t) is a random variable: the sequence of values S(1), S(2), S(3), etc. is then regarded as the values, on certain dates t, of a continuous-time process. The analysis of stock market variations thus leads to the study of the stochastic process {S(t), t ≥ 0}, or of the associated processes {R(t), t ≥ 0} or {r(t), t ≥ 0}, and of their increments. The usual hypothesis of financial theory assumes that these random processes have independent and identically distributed (iid) increments of finite variance, which we abbreviate as “iid-Gaussian” modeling. The iid-Gaussian hypothesis has been the subject of much controversy for the last 50 years. The iid sub-hypothesis has been tested intensively in the literature⁵. Today, it is accepted that, for the purposes of the usual valuation and hedging models, this assumption is valid as a first approximation when τ is greater than 10 minutes (see, for example, [BOU 97]). It is convenient for passing from the distribution of returns at scale τ to that of returns at scale T = nτ because, in this case, if P(Δr(t, τ)) is the probability distribution of periodic

5. See [CAM 97] for a complete review of the different ways to statistically test this and also the results obtained.



returns Δr(t, τ), then:

$$P\big(\Delta r(t, n\tau)\big) = \Big[P\big(\Delta r(t,\tau)\big)\Big]^{\otimes n} \tag{13.13}$$

where ⊗ represents the convolution operator. From the probabilistic point of view, the advantage of this hypothesis is purely computational. From the economic point of view, the independence of returns means that the available and relevant information for the valuation of financial assets is correctly reflected in the quoted prices, which is the germ of the concept of informational market efficiency; stationarity means that the economic characteristics of the observed phenomenon do not change much over time. The existence of the variance limits the fluctuation of returns, which excludes stock market crashes and booms.

The first formal model of stock market variations was proposed in 1900 by Louis Bachelier⁶ [BAC 00] and based on profits (13.8), then modified in 1959 by Osborne for returns (13.9) and (13.10):

$$\frac{dS(t)}{S(t)} = dr(t) = \mu\, dt + \sigma\, dW(t) \tag{13.14}$$

with coefficients μ ∈ R and σ > 0, where W(t) is a standard Brownian motion⁷, i.e., E(W(1)) = 0 and E(W(1)²) = 1. The coefficient μ represents the expectation of the instantaneous return on the share purchased. The risk of a financial asset is generally measured by the coefficient σ of the Brownian motion, called “volatility” by market professionals: it is a measure of the potential dispersion of stock market returns. There are other risk measures, all based on this idea of conditional variability of returns over a given time (see [GOU 97b]). The solution of (13.14) is obtained by setting X(t) = ln S(t) and applying Itô’s formula to dX(t). We obtain:

$$S(t) = S(0) \exp\left(\left(\mu - \frac{\sigma^2}{2}\right) t + \sigma W(t)\right), \qquad t \in [0, T] \tag{13.15}$$

which is considered as the standard model of stock market variations.
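As an illustration, the closed form (13.15) can be sampled directly on a regular grid. The Python sketch below is a minimal simulation under stated assumptions: the function name `simulate_gbm`, the parameter values and the seed are hypothetical, and Brownian increments are drawn as independent Gaussians of variance τ.

```python
import math
import random
import statistics


def simulate_gbm(s0, mu, sigma, tau, n, rng):
    """Sample S(0), S(tau), ..., S(n*tau) from the closed form (13.15).

    Brownian increments over a step tau are drawn as independent
    Gaussians with mean 0 and standard deviation sqrt(tau).
    """
    path = [s0]
    w = 0.0  # running value of W(t)
    for k in range(1, n + 1):
        w += rng.gauss(0.0, math.sqrt(tau))
        t = k * tau
        path.append(s0 * math.exp((mu - 0.5 * sigma**2) * t + sigma * w))
    return path


rng = random.Random(0)  # fixed seed so that the run is reproducible
mu, sigma, tau, n = 0.05, 0.2, 1.0 / 250.0, 20_000  # illustrative values
path = simulate_gbm(100.0, mu, sigma, tau, n, rng)

log_returns = [math.log(b / a) for a, b in zip(path, path[1:])]
# Sample moments of the periodic continuous returns against their theoretical values.
print(statistics.fmean(log_returns), (mu - 0.5 * sigma**2) * tau)
print(statistics.stdev(log_returns), sigma * math.sqrt(tau))
```

Under this model the periodic continuous returns are iid Gaussian with mean (μ − σ²/2)τ and standard deviation σ√τ, which the two printed lines compare with their sample estimates.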

6. A biography of Louis Bachelier has been compiled by Courtault et al. [COU 00]. For a description of the financial aspects of Bachelier’s work and their impact on the finance industry, see [WAL 96]. For an understanding of Bachelier’s probabilistic work with reference to his epoch, see [TAQ 00].
7. Let us note that Bachelier did not know Brownian motion in the strict sense of its definition, because this definition would only be given in 1905 by Einstein and then in 1923 by Wiener. However, Bachelier assumes that the successive differences of the form ΔS(t, τ) are independent, Gaussian and of variance proportional to the time interval τ, which amounts to describing the Wiener process.



13.2.2. Time scales in financial modeling

13.2.2.1. The existence of characteristic time

If we choose to model in physical time, and thus with a fixed time step, the first question that arises is the choice of the time step τ, i.e., the resolution scale of market analysis: should we examine daily, weekly or monthly variations, etc.? Which observation scale is the most appropriate for capturing the statistical structure of stock market variations? A question of a financial nature thus appears: should the probability law governing stock market variations be the same at all scales? If each time scale is understood as representing an investment horizon for a given category of operators, there is apparently no particular reason for the variations corresponding to a short trading horizon and those corresponding to a portfolio manager’s long horizon to be modeled by the same probability law. Equation (13.12) shows that the return at scale T is the sum of the returns at scale τ, which are iid in the traditional case. Generally, when we add iid random variables, the resulting probability law differs from the initial probability law. A multiscale analysis therefore seems inevitable at first sight, if we do not wish to lose information on market behavior at each characteristic time scale of a given economic phenomenon.

The first analyses of market behavior used only one observation frequency, often monthly. In 1962, Mandelbrot became the first to introduce the idea of simultaneous analysis on several scales, in order to compare the distributions of periodic returns Δr(t, τ) across different scales τ. Mandelbrot sought to establish an invariance of periodic returns under a change of scale (i.e., a fractal structure of the market).
If P(Δr(t, τ)) is the probability distribution of periodic returns Δr(t, τ), relation (13.13) simplifies to:

$$\Big[P\big(\Delta r(t,\tau)\big)\Big]^{\otimes n} = n^{H}\, P\big(\Delta r(t,\tau)\big) \tag{13.16}$$

where H is a self-similarity exponent, which means that the process of returns {r(t), t ≥ 0} is self-similar with exponent H:

$$r(T) = r(n\tau) \overset{\mathcal{L}}{=} n^{H}\, r(\tau), \qquad T = n\tau \tag{13.17}$$

where $\overset{\mathcal{L}}{=}$ symbolizes equality in distribution.

In such a market model, an important consequence of the fractal hypothesis is the absence of any preferred observation scale, or characteristic discriminating time, for its statistical observation. It then becomes possible to estimate the probability law for a long horizon from the study of stock market fluctuations over a short horizon: the distribution of returns at the long-term horizon T is obtained from the distribution of returns at the short-term horizon τ by means of relation (13.17). In other words, by observing the market at any scale, we can access its fundamental behavioral structure: the probability law that characterizes stock market fluctuations is independent of the scale of these fluctuations.



13.2.2.2. Implicit scaling invariances of traditional financial modeling

However, traditional financial modeling already has fractal properties: Brownian motion is a self-similar process of exponent H = 1/2. In particular, its increment Δr over a time τ follows a scaling law such that:

$$\Delta r(t,\tau) \sim \tau^{1/2} \tag{13.18}$$
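As a worked example of relation (13.18), the following Python sketch rescales a volatility from one horizon to another. The function name `scale_volatility`, its `hurst` argument and the 4% monthly figure are assumptions made for the illustration; the general form with exponent H simply mirrors (13.17).

```python
import math


def scale_volatility(sigma_tau, n, hurst=0.5):
    """Volatility at horizon n*tau implied by sigma(n*tau) = n**H * sigma(tau).

    hurst=0.5 is the Brownian case of (13.18); other values are only
    meaningful if returns are assumed self-similar with that exponent.
    """
    return (n ** hurst) * sigma_tau


monthly_vol = 0.04  # hypothetical monthly volatility of 4%
annual_vol = scale_volatility(monthly_vol, 12)
print(round(annual_vol, 4))
```

With these numbers the annual figure is the monthly figure multiplied by √12 ≈ 3.46, so the risk grows much more slowly than the holding time itself.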

The distribution of the ratio $\Delta r/\tau^{1/2}$ is independent of time. In financial terms, the order of magnitude of the return on a security over a given time is proportional to the square root of this time. In the theory of finance, it is stated that returns and the associated risk are proportional to the time spent. Relation (13.18) gives this proportionality, irrespective of the time scale (duration) considered. Hence the law of returns is invariant under a change of scale: the law of security returns does not depend on how long the security is held. Thus, the theoretical risk of a financial asset grows as the square root (exponent 1/2) of the holding time of this asset. Market practitioners apply this fractal property all the time when they “annualize” volatility by means of formula (13.18). For example, the volatility over 12 months is equal to the volatility over one month multiplied by the square root of 12. This calculation of a long-term risk level from a short-term risk is also at the base of the banking industry’s prudential thinking on the control of risks in market operations (see [BAL 94, BAL 96, BAL 98, BAL 99, IOS 94]).

13.3. Modeling postulating independence on stock market returns

13.3.1. 1960-1970: from Pareto’s law to Lévy’s distributions

13.3.1.1. Leptokurtic problem and Mandelbrot’s first model

The scaling character of price fluctuations in stock markets was first established by Mandelbrot through the study of cotton price variations between 1880 and 1958. This is the first trace of an explicit comment on the existence of scaling phenomena in stock market variations. This existence was brought out by the study of distribution tails, which revealed the connection between the discovery of scaling laws and the appropriate treatment of large stock market variations.
From the first statistical studies of stock market fluctuations, it was established that the empirical distributions of successive returns contained too many points in the tails to be fitted by Gaussian densities: the empirical distributions obtained were all leptokurtic. This problem of large stock market variations remained unsolved and was temporarily abandoned by researchers for lack of appropriate means to model it. Moreover, the problem of abnormal distribution tails was an old one, dating back to Pareto, who had invented the law that bears his name precisely to account for the distribution



of revenues in an economy on a given date, and which is a power law. However, Pareto’s law did not seem to have the status of a limit law in probability and was not used in finance. Mandelbrot therefore tackled the problem of the large values of the empirical distribution functions of returns Δr(t, τ) = ln S(t) − ln S(t − τ), where S(t) is the closing price of cotton on date t, with two values for τ: a month and a day. By calculating the expressions Fr(Δr(t, τ) > u) for positive values and Fr(Δr(t, τ) < −u) for negative values, where Fr designates the cumulative frequency of the variations Δr(t, τ), a double fit to a Pareto law of exponent α is obtained:

$$\ln \mathrm{Fr}\big[\Delta r(t,\tau) > u\big] \approx -\alpha \ln u + \ln C(\tau) \tag{13.19a}$$

$$\ln \mathrm{Fr}\big[\Delta r(t,\tau) < -u\big] \approx -\alpha \ln u + \ln C(\tau) \tag{13.19b}$$

Noting that the fitted lines corresponding to the distributions for τ = a day and τ = a month are parallel, Mandelbrot deduced that the distribution laws of Δr for τ = a day and τ = a month differ only by a change of scale, and hence proposed a new model of price variation: keeping the iid assumptions, the stability of the phenomenon between a day and a month is interpreted as a trace of stability in the sense of Lévy. A random variable X is called stable in the sense of Lévy, or α-stable, if, for any pair $c_1, c_2 > 0$, there exist α ∈ ]0, 2] and d ∈ R such that:

$$c_1 X_1 + c_2 X_2 \equiv cX + d, \qquad c^{\alpha} = c_1^{\alpha} + c_2^{\alpha} \tag{13.20}$$

where ≡ symbolizes equality in distribution and where $X_1$ and $X_2$ are two independent copies of X. When d = 0, X is said to be strictly stable. The exponent α ∈ ]0, 2] is the characteristic exponent of stable laws. Mandelbrot’s inference comes from the following property of stable laws. If X follows a stable law of characteristic exponent α, then it can be shown that:

$$P(X \geq x) = 1 - F(x) = x^{-\alpha} \left[ \frac{A_1}{\pi\alpha} + \frac{A_2}{2\pi\alpha}\, x^{-\alpha} + O\big(x^{-2\alpha}\big) \right] \tag{13.21}$$

where $A_1$ and $A_2$ are quantities independent of α. In addition, by definition, a random variable follows Pareto’s law in its upper tail if:

$$P\big(X \geq x \mid x \geq x_0\big) = 1 - F(x) = x^{-\alpha}\, h(x) \tag{13.22}$$

where α is called Pareto’s index and where h(x) is a slowly varying function, i.e., lim h(tx)/h(x) = 1 when x → ∞ for any t > 0. When h(x) is a constant function, the law is said to be Pareto’s in the strict sense. By writing $h(x) = \big[A_1/(\pi\alpha) + (A_2/(2\pi\alpha))\,x^{-\alpha} + O(x^{-2\alpha})\big]$, relations (13.21) and (13.22) show that α-stable laws are asymptotically Paretian with tail



index α: this is why, in his 1962 communication, Mandelbrot concludes that the “Paretian character [. . .] is “predicted” or “confirmed” by stability”. The second important fact of this empirical evidence concerns the value of α, equal to 1.7. No moment of order higher than α exists. Thus, since α is lower than 2, the variance of Δr(t, τ) is infinite.

13.3.1.2. First emphasis of Lévy’s α-stable distributions in finance

Mandelbrot [MAND 63] developed and summarized the model of price variations proposed in 1962: “this was the first model that I have elaborated to describe the price variation practiced on certain stock exchanges of raw materials in a realistic way,” (see [MAND 97a], French edition, p.128). We can call this first model “iid-α-stable”, insofar as the iid hypotheses are kept and the characteristic exponent α of stable laws goes from 2 (Gauss) to a value α < 2. This model, which can generate unforeseen floods on stock markets, was named “Noah’s effect” by Mandelbrot, in reference to the biblical flood (see [MAND 73b]). Fama [FAM 65] and then Mandelbrot [MAND 67a] continued the initial investigations and validated the model. Finally, in 1968, the first tabulations of symmetric α-stable laws were carried out by Fama and Roll [FAM 68], which made it possible to construct the first estimators of the parameters of these distributions.

13.3.2. 1970–1990: experimental difficulties of iid-α-stable model

13.3.2.1. Statistical problem of parameter estimation of stable laws

As Fama indicated in 1965, this first evidence for Lévy’s distributions was fragile because the methods for estimating the characteristic exponent α were not very reliable: fitting distribution tails on a log-log graph is very sensitive to the subjective choice of the point from which the distribution tail is taken to begin.
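The log-log tail fit of (13.19a) can be sketched as follows in Python. This is a simplified illustration, not the procedure of the cited authors: the function `tail_exponent`, the ordinary least-squares fit, the tail fractions and the simulated strict Pareto sample (drawn with the standard library call `random.paretovariate`) are all assumptions of the example.

```python
import math
import random


def tail_exponent(sample, tail_fraction):
    """Least-squares slope of ln(empirical survival) against ln(u) in the upper tail.

    Only the largest tail_fraction of the observations is used; the
    estimate of alpha is minus the fitted slope, as in (13.19a).
    """
    xs = sorted(sample, reverse=True)
    n = len(xs)
    k = max(2, int(tail_fraction * n))
    log_u = [math.log(xs[i]) for i in range(k)]
    log_surv = [math.log((i + 1) / n) for i in range(k)]  # Fr(X >= xs[i]) = (i+1)/n
    mean_u = sum(log_u) / k
    mean_s = sum(log_surv) / k
    cov = sum((a - mean_u) * (b - mean_s) for a, b in zip(log_u, log_surv))
    var = sum((a - mean_u) ** 2 for a in log_u)
    return -cov / var


rng = random.Random(1)  # fixed seed so that the run is reproducible
alpha_true = 1.7
# Strict Pareto sample on [1, +inf): an idealized stand-in for the positive tail.
sample = [rng.paretovariate(alpha_true) for _ in range(20_000)]

for fraction in (0.5, 0.1, 0.01):
    print(fraction, round(tail_exponent(sample, fraction), 3))
```

On a strict Pareto sample the three cut-offs should give broadly similar values, with more noise as fewer points are kept. On real returns, where the power law holds only asymptotically, the choice of `tail_fraction` shifts the estimate, which is the subjectivity referred to above.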
Fama [FAM 65] had proposed two other estimators based on the property of invariance under addition, one using an interquantile range, the other the dilation law of the empirical variance. However, these two estimators were equally fragile: the former presumed the independence of increments and the latter was very sensitive to the choice of sample size. A new stage was reached in 1971: Fama and Roll, using properties of quantiles that were detected with the help of the previously made tabulations of symmetric stable distributions, proposed new methods for estimating the parameters α and γ of symmetric stable laws [FAM 71]. These first statistical tools made it possible to carry out the first tests of the iid-α-stable model in the 1970s. A second generation of estimators then appeared during the 1970s. Successively, Press [PRE 72], DuMouchel [DUM 73, DUM 75], Paulson et al. [PAU 75], Arad



[ARA 80], Koutrouvélis [KOU 80] and McCulloch [MCC 81] developed new methods of parameter estimation using the characteristic function of stable laws⁸. At the same time, generators of stable random variables were designed by Chambers et al. [CHA 76], whose algorithms improved the possibilities of simulation on financial markets. These new theoretical steps made it possible to improve the tests of the scale invariance hypothesis. However, DuMouchel [DUM 83] showed that it is possible, by means of the preceding methods, to separate Lévy-stable from Pareto-unstable distributions (i.e., with convergence towards a normal law). He shows that these methods are good when the “true” distribution is stable, but are biased when this is not the case, which leaves a doubt about the validity of scale invariance when this invariance is verified by means of these methods. In addition, the sample size can affect the results of estimations made with the Koutrouvélis method and, a fortiori, with older methods⁹. For example, Walter [WAL 99] verifies that α increases as the sample size decreases but remains nearly constant when tests are carried out on sub-samples of constant size. More generally, the difficulties of estimating the characteristic exponent make it very delicate to take a definitive position. Thus, we find the following remark in a recent manual: “we think that estimate methods of parameter α are not precise enough to infer a clear conclusion on the real nature of distributions from estimates made on various time scales” (see [EMB 97], p. 406). When it occurs, the rejection of the stability of α will therefore not appear as “conclusive” as Taylor affirms (see later on). In addition, other more recent studies, like those of Belkacem et al. [BEL 00], have shown partial validations of this invariance.

13.3.2.2. Non-normality and controversies on scaling invariance

In general, all the work undertaken on stock markets would confirm not only the non-normality of the distributions of returns at various scales, and the possibility of fitting Lévy distributions at each scale, but also the difficulty of validating the fractal hypothesis. Indeed, a scaling anomaly quickly appeared: a tendency for the value of α(τ) to increase systematically with τ. The differences between these works lie in the choice of the replacement process used to give

8. For a review of these methods, see the works of Akgiray and Lamoureux [AKG 89] or Walter [WAL 94], who arrive at the same conclusion on the best estimation method: that of Koutrouvélis [KOU 80].
9. See the work of Koutrouvélis [KOU 80] and Akgiray and Lamoureux [AKG 89] for illustrations of this sample size problem, which has been known since the first works of Mandelbrot and Fama.



an account of this failure, by means of non-fractal modeling, i.e., a multiscale analysis of the market. Here we present the main articles relating to this evidence. Teichmoeller [TEI 71], Officer [OFF 72], Fielitz and Smith [FIE 72], Praetz [PRA 72] and Barnea and Downes [BAR 73] all obtain values of α which increase on average from 1.6 at high frequency to 1.8 at low frequency. This increase led Hsu et al. [HSU 74], who also verified it, to consider that “in an economy where factors affecting price levels (technical developments, government policies, etc.) can undergo movements on a great scale, it seems unreasonable (our emphasis) to want to try to represent price variations by a single probability distribution” (see [HSU 74], p.1). Brenner [BRE 74], Blattberg and Gonedes [BLA 74] and Hagerman [HAG 78] continued the investigations and observed the same phenomenon. Hagerman concludes that “the symmetric stable model cannot reasonably (our emphasis) be regarded as a suitable description of stock market returns” (see [HAG 78], p. 1220). We can see a similarity between the arguments of Hagerman and of Hsu et al. [HSU 74], for whom it does not seem “reasonable” to retain a model with infinite variance. This argument was used by Bienaymé against Cauchy as early as 1853. Zajdenweber [ZAJ 76] verifies the fit to Lévy’s distribution but does not test scale invariance. Upton and Shannon [UPT 79] take up the question in a different way, seeking to estimate the degree of violation of normality as a function of the observation scale by using the Kolmogorov-Smirnov (KS) method and a calculation of the kurtosis K and skewness S coefficients. Scale invariance is not retained. A new study by Fielitz and Rozelle [FIE 83] confirms the scaling anomaly. Other investigations were carried out on foreign exchange markets. Wasserfallen and Zimmermann [WAS 85], Boothe and Glassman [BOO 87], Akgiray and Booth [AKG 88a], Tucker and Pond [TUC 88] and Hall et al.
[HAL 89] likewise observed the increase of α as the observation frequency decreases. At the end of the 1980s, the iid-α-stable model of stock market returns appeared to be rejected by all the research in this field. As we read in a 1986 survey of the analysis of stock market variations: “many researchers estimated that the hypothesis of infinite variance was not acceptable. Detailed studies on stock market variations rejected Lévy’s distributions in a conclusive way. [. . .] Ten years after his article in 1965, Fama himself preferred to use a normal distribution for monthly variations and thus to give up stable distributions for daily variations” (see [TAY 86], p. 46). However, we can observe that the theoretical scale invariance of Gaussian modeling (scaling law in the square root of time) is not validated by real markets in all cases either, and that the generalization by the iid-α-stable model represents a good compromise between modeling



power and statistical cost of estimation. We find such an argument, for example, in McCulloch [MCC 78], who stresses the small number of parameters required by stable laws, as compared with the five parameters needed by jump models such as those proposed by Merton [MER 76]. In other words, the question remains open, even if it is probable that the “true” process of returns is more complex than iid-α-stable modeling. Some works show that the values of α can change over time¹⁰ (stationarity problem of Δr), which leads us to raise the question of dependence between the increments of the price process and to look for other forms of scaling laws in financial series.

13.3.2.3. Scaling anomalies of parameters under iid hypothesis

The systematic increase of the characteristic exponent α(τ) of stable laws with τ constitutes what is called a “scaling anomaly”. Indeed, in iid-α-stable modeling, the following relation must be verified:

$$\alpha(T) = \alpha(n\tau) = \alpha(\tau), \qquad T = n\tau \tag{13.23}$$

The fact that this relation does not hold for all values of n shows that scale invariance is not total across all time scales, or that the iid hypothesis is not valid. More generally, a way of revealing the invariance of a probability law under a change of scale, and thus of testing the fractal hypothesis, is to examine whether its characteristic parameters have a scaling behavior, i.e., to seek a dilation (or contraction) law of the parameters according to time scale. This idea is the origin of an important trend in theoretical research in finance.

Let λ(τ) be a statistical parameter of the distribution of Δr(t, τ): λ(τ) is a function of τ, and searching for scaling laws on a market between 0 and T therefore amounts to estimating the parameter for each value of τ, then studying the scale relation, or function λ: τ → λ(τ). All the statistical parameters of a distribution are a priori usable in the search for scaling laws on distributions. The parameters most analyzed in the literature are either a scale parameter or the kurtosis coefficient K. In the Gaussian case, the scale parameter is the standard deviation and, in the case of iid increments, we must have the relation:

$$\sigma(T) = \sigma(n\tau) = n^{1/2}\,\sigma(\tau), \qquad T = n\tau \tag{13.24}$$

This scaling relation, already postulated on variance by Bachelier [BAC 00], was introduced into research during the 1980s, and is known under the name of “test of

10. See an example in [WAL 94].



variance ratio”.¹¹ Relation (13.24) shows that in the case of iid returns, we must have a proportionality $\sigma(\tau) \sim \tau^{1/2}$. Some works have highlighted a slight violation of this relation, bringing to light a proportionality of type $\sigma(\tau) \sim \tau^{H}$ with H > 0.5. For example, Mantegna [MANT 91] and Mantegna and Stanley [MANT 00] report values close to 0.53 or 0.57. In the case of non-Gaussian α-stable laws, the scale parameter, denoted γ, is tested and we must have the relation¹²:

$$\gamma(T) = \gamma(n\tau) = n^{1/\alpha}\,\gamma(\tau), \qquad T = n\tau \tag{13.25}$$
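Relations (13.24) and (13.25) suggest a simple empirical check: aggregate the returns over horizons nτ, measure a scale parameter at each horizon, and regress its logarithm on ln n. The following is only a minimal sketch of that idea, assuming NumPy, simulated iid Gaussian returns in place of market data, and hypothetical helper names (`aggregate_returns`, `scaling_exponent`) that do not come from the works cited.

```python
import numpy as np

def aggregate_returns(returns, n):
    """Sum non-overlapping blocks of n consecutive returns (horizon n * tau)."""
    m = len(returns) // n
    return returns[: m * n].reshape(m, n).sum(axis=1)

def scaling_exponent(returns, horizons):
    """Slope of ln sigma(n * tau) against ln n, i.e. H in sigma(tau) ~ tau^H."""
    sigmas = [aggregate_returns(returns, n).std(ddof=1) for n in horizons]
    slope, _intercept = np.polyfit(np.log(horizons), np.log(sigmas), 1)
    return slope

# Assumption: simulated iid Gaussian returns stand in for real data.
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=200_000)

H = scaling_exponent(returns, horizons=[1, 2, 4, 8, 16, 32])
print(f"estimated scaling exponent H = {H:.3f}")
```

For iid Gaussian input the estimate should lie close to 0.5; on real returns, a slope persistently above 0.5 would correspond to the slight violation reported in the text. The standard deviation is used here as the scale parameter, which is only meaningful when the variance is finite.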

An important parameter is Pearson’s coefficient, or kurtosis K, defined by $K_X = E[(X - E(X))^4]/E[(X - E(X))^2]^2 - 3$, as it makes it possible to reveal a departure of the observed distribution from normality. For a normal distribution, we have $K_X = 0$. In the case of iid-Gaussian returns, we must have:

$$K(T) = K(n\tau) = K(\tau)/n, \qquad T = n\tau \tag{13.26}$$
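The 1/n decrease in relation (13.26) can be illustrated numerically. The sketch below assumes NumPy and uses simulated iid Laplace returns (theoretical excess kurtosis 3), chosen only so that K(τ) is non-zero; the function names are hypothetical and not taken from the cited literature.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis: E[(X - E X)^4] / E[(X - E X)^2]^2 - 3."""
    centered = np.asarray(x, dtype=float) - np.mean(x)
    return np.mean(centered**4) / np.mean(centered**2) ** 2 - 3.0

def aggregate_returns(returns, n):
    """Sum non-overlapping blocks of n consecutive returns (horizon n * tau)."""
    m = len(returns) // n
    return returns[: m * n].reshape(m, n).sum(axis=1)

# Assumption: iid Laplace returns, whose theoretical excess kurtosis is 3.
rng = np.random.default_rng(1)
returns = rng.laplace(0.0, 0.01, size=1_000_000)

k1 = excess_kurtosis(returns)
k4 = excess_kurtosis(aggregate_returns(returns, 4))
for n in (2, 4, 8):
    kn = excess_kurtosis(aggregate_returns(returns, n))
    print(f"n={n}: K(n*tau)={kn:.3f}  K(tau)/n={k1 / n:.3f}")
```

With independent increments the two printed columns should roughly agree; a kurtosis that falls off more slowly than K(τ)/n on real data is the kind of anomaly discussed in this section.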

Yet, for example, Cont [CON 97] finds that the kurtosis coefficient K(τ) does not decrease as $1/n$ but rather as $1/n^{\alpha}$ with α ≈ 0.5, indicating the presence of a possible non-linear dependence between variations (see section 13.4).

Generally, the more we improve our knowledge of the scaling behaviors of various parameters, the easier it becomes to choose between the two alternatives, scale invariance or characteristic scales. The study of scaling behaviors of parameters thus helps in the modeling of stock market fluctuations.

The existence of a scaling anomaly on the parameter α during investigations carried out between 1970 and 1980, then on the parameter K during the following decade, led certain authors to try to modify Mandelbrot’s model by limiting scale invariance either to certain time scales, by introducing regime changes (cross-over), or to certain parts of the distributions, namely the extreme values. In both of these fractal metamorphoses, this led to the introduction of a multiscale market analysis.

13.3.3. Unstable iid models in partial scaling invariance

13.3.3.1. Partial scaling invariances by regime switching models

The question of regime changes, or partial scaling invariance on a given frequency band, had already been dealt with by Mandelbrot [MAND 63], who assumed the

11. For example, see Lo and MacKinlay’s work [LO 88], who gave a list of previous works on the calculation of the variance ratio.
12. This relation is verified by Walter [WAL 91, WAL 94, WAL 99] and Belkacem et al. [BEL 00].



existence of upper and lower limits (cut-offs) in the fractality of markets (see also [MAND 97a], p. 51 and pp. 64–66) and introduced the concept of scaling range. Akgiray and Booth [AKG 88b] used this idea to reinforce McCulloch’s argument [MCC 78] on the cost-advantage ratio of a scale-invariant model. Using stable distributions between two cut-offs is appropriate because it is less costly in parameter estimation than other models, which are perhaps finer (such as mixtures of normal laws or mixed diffusion-jump processes) but also more complex and therefore the source of a greater number of estimation errors. The issue to be solved is therefore the detection of the points where the change of regime occurs. Bouchaud and Potters [BOU 97] and Mantegna and Stanley [MANT 00] propose such a model, combining Lévy’s distributions with an exponential law beyond a given value.

13.3.3.2. Partial scaling invariances as compared with extremes

DuMouchel [DUM 83] suggests, without making an a priori hypothesis on overall scaling invariance, “letting distribution tails speak for themselves” (see [DUM 83], p. 1025). For this, he uses the generalized Pareto distribution introduced by Pickands [PIC 75], whose distribution function is:

$$F(x) = \begin{cases} 1 - (1 - kx/\sigma)^{1/k} & k \neq 0 \\ 1 - \exp(-x/\sigma) & k = 0 \end{cases} \tag{13.27}$$

where σ > 0 and k are the form parameters: the bigger k is, the thicker the distribution tail. In the case where the distribution is stable with characteristic exponent α < 2 (scaling invariance), we have α ≅ 1/k. We can observe that, while Pareto’s laws had been Mandelbrot’s initial step in his introduction of the concept of scaling invariance on stock market variations, DuMouchel operated in a manner similar to his predecessors and rediscovered Pareto’s law without the invariance sought by Mandelbrot.

Mittnik and Rachev [MIT 89] propose to replace scale invariance on summation of iid-α-stable variables by another invariance structure, invariance with respect to the minimum:

$$X(1) \stackrel{\mathcal{L}}{=} a_n \min_{1 \le i \le n} X(i) + b_n \tag{13.28}$$

in which stability under addition is replaced by stability under an extreme value, i.e. the minimum. Weibull’s distribution corresponds to this structure. This was the beginning of a research trend that would lead to the rediscovery in finance, during the 1990s, of the theory of extreme values,¹³ which depicts another form of invariance: invariance with respect to taking maxima and minima.
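To see why Weibull’s distribution fits relation (13.28), a short worked derivation may help. It is a sketch under an assumed parameterization (shape c > 0 and scale λ > 0, symbols introduced here for illustration and not used in the text) and for iid variables.

```latex
% Assumed survival function of a Weibull variable with shape c and scale \lambda:
\[ P(X > x) = \exp\!\left[-(x/\lambda)^{c}\right], \qquad x \ge 0. \]
% For iid copies X(1), ..., X(n), the minimum exceeds x only if every copy does:
\[ P\!\left(\min_{1 \le i \le n} X(i) > x\right)
   = \exp\!\left[-n\,(x/\lambda)^{c}\right]
   = P\!\left(X > n^{1/c}\,x\right). \]
% Hence (13.28) holds with a_n = n^{1/c} and b_n = 0:
\[ X(1) \stackrel{\mathcal{L}}{=} n^{1/c} \min_{1 \le i \le n} X(i). \]
```

Under these assumptions the rescaled minimum has the same law as a single variable, which plays the same role for minima as the factor $n^{1/\alpha}$ in (13.25) plays for sums.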

13. For the application of the theory of extreme values in finance, see [LON 96, LON 00].



13.4. Research of dependency and memory of markets

13.4.1. Linear dependence: testing of H-correlative models on returns

13.4.1.1. Question of dependency of stock market returns

The standard model of stock market variations assumed that returns Δr(t, τ) = ln S(t) − ln S(t − τ) were iid according to a normal law of variance σ²τ. The question of validating the independence hypothesis emerged very early in the empirical works dealing with the characterization of stock market fluctuations. Generally, dependency between two random variables X and Y is measured by the quantity $C_{f,g}(X, Y) = E[f(X)g(Y)] - E[f(X)]\,E[g(Y)]$ and we have the relation:

$$X \text{ and } Y \text{ independent} \iff C_{f,g}(X, Y) = 0 \quad \text{for all functions } f, g$$

The case f(x) = g(x) = x corresponds to the usual covariance. Other cases cover all possible (non-linear) correlations between the variables X and Y. Applied to stock market variations, this measure implies that the returns Δr(t, τ) are independent only if we have $C(h) = C_{f,f}(\Delta r(t), \Delta r(t+h)) = 0$ for any function f. Therefore, studying the independence of stock market variations leads to the analysis of the function:

$$C(h) = E\big[f(\Delta r(t))\, f(\Delta r(t+h))\big] - E\big[f(\Delta r(t))\big]\, E\big[f(\Delta r(t+h))\big] \tag{13.29}$$

The chronology of the studies follows the different choices made for the function f(·). The earliest works (1930-1970) on the verification of increment independence considered only f(x) = x. In this case, C(h) becomes the usual autocovariance function:

$$C(h) = \gamma(h) = E\big[\Delta r(t)\,\Delta r(t+h)\big] - E\big[\Delta r(t)\big]\, E\big[\Delta r(t+h)\big] \tag{13.30}$$

and the independence of increments corresponds to the vanishing of the linear correlation coefficient. Overall, the conclusions of the initial works established the absence of serial autocorrelation and contributed to the formation of the concept of informational efficiency of stock markets.¹⁴
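The role of the function f in (13.29) can be illustrated with a small numerical experiment. The sketch below assumes NumPy; the function name `generalized_autocov` and the toy series (Gaussian shocks whose volatility is redrawn at random every 50 observations) are hypothetical illustrations, not a model taken from the cited works.

```python
import numpy as np

def generalized_autocov(returns, h, f=lambda x: x):
    """Sample version of C(h) in (13.29) for a lag h >= 1 and a vectorized f."""
    r = np.asarray(returns, dtype=float)
    a, b = f(r[:-h]), f(r[h:])
    return np.mean(a * b) - np.mean(a) * np.mean(b)

# Hypothetical toy series: independent Gaussian shocks multiplied by a
# volatility level drawn at random (0.5 or 1.5) for each block of 50 points.
rng = np.random.default_rng(2)
block_vol = rng.choice([0.5, 1.5], size=2_000)
vol = np.repeat(block_vol, 50)
returns = vol * rng.standard_normal(vol.size)

c_linear = generalized_autocov(returns, 1)               # f(x) = x
c_abs = generalized_autocov(returns, 1, f=np.abs)        # f(x) = |x|
c_square = generalized_autocov(returns, 1, f=np.square)  # f(x) = x^2
print(c_linear, c_abs, c_square)
```

In this toy example the linear autocovariance is close to zero while the values for |x| and x² are clearly positive: the absence of linear correlation does not imply independence, which is why the choice of f matters.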

14. See, for example, [CAM 97, TAY 86] for a review of this form of independence and [WAL 96] for the historical formation of the efficiency concept from initial works.



13.4.1.2. Problem of slow cycles and Mandelbrot’s second model

However, by the end of the 1970s, certain results contrary to this relation came up in the study of return behaviors over a long-term horizon, which led to so-called “long memory” tests. Denoting by $\gamma(h) = E[\Delta r(t)\Delta r(t+h)] - E[\Delta r(t)]\,E[\Delta r(t+h)]$ the usual autocovariance function and by ρ(h) = γ(h)/γ(0) the associated autocorrelation function, the standard model of stock market variations implies that ρ(h) must decrease geometrically, i.e., $\rho(h) \le c\,r^{-h}$ with c > 0. However, it seemed that, in some cases, we obtain a hyperbolic decay $\rho(h) \sim c\,h^{2H-2}$ with c > 0 and 0 < H < 1, corresponding to a phenomenon of “long memory” or “long dependence.” This phenomenon of long memory was observed in the 1960s by Adelman [ADE 65] and Granger [GRA 66]; the latter described it as “the characteristic of fluctuating economic variables”. Moreover, this led Mandelbrot [MAND 65] to rediscover Hurst’s law [HUR 51] by introducing the concept of “self-similar process”, which later became [MAND 68] fractional Brownian motion (FBM), whose increments are self-similar with exponent H and have autocovariance function $\gamma(h) = \frac{1}{2}\big[|h+1|^{2H} - 2|h|^{2H} + |h-1|^{2H}\big]$. Hence, Mandelbrot’s model can be described as an “H-correlative” model. Mandelbrot called it the “Joseph effect”, with reference to the slow and aperiodic cycles evoked in biblical history concerning Joseph and the fluctuations in the Egyptian harvest [MAND 73a].

Summers [SUM 86], Fama and French [FAM 88], Poterba and Summers [POT 88] and DeBondt and Thaler [DEB 89] highlighted phenomena of “mean reversion” in successive returns, introducing the concept of long-term horizon on markets. Although divergent, the interpretations of these autocorrelation phenomena over a long horizon tended to question the usual independence hypothesis and to point to a form of long memory in stock market returns.

13.4.1.3. Introduction of fractional differentiation in econometrics

Since the 1970s, the econometric limits of ARMA(p, q) and ARIMA(p, d, q) stationary processes in the description of financial series had progressively led to a generalization of these models by introducing a non-integer differentiation degree 0 < d < 1/2 with the ARFIMA process, which found a great echo in finance in the 1980s. The fractional differentiation operator $\nabla^d = (1 - L)^d$, where ∇ is defined by ∇X(t) = X(t) − X(t − 1) = (1 − L)X(t) and:

$$\nabla^d = (1 - L)^d = \sum_{k=0}^{\infty} (-1)^k\, C_d^k\, L^k$$

where $C_d^k$ is the binomial coefficient, made it possible to obtain “long memory” on the studied economic series and met the demand for a new characterization of some of the properties observed in these series. Baillie [BAI 96] presents a complete synthesis



of the usage of these processes in the econometrics of finance. The ARFIMA and FBM trends came together and led to the search for long memory in returns.

13.4.1.4. Experimental difficulties of H-correlative model on returns

The tests carried out in the search for these anomalies used Hurst’s R/S statistic, improved by Mandelbrot [MAND 72]. This statistic helps in finding the value of the self-similarity exponent H insofar as the ratio R/S is asymptotically proportional to $n^H$: H ≈ ln(R/S)/ln n. Thus, between 1980 and 1990, several works revealed values of H greater than 0.5, indicating the presence of long memory on markets, which seemed to corroborate the observations concerning the “abnormal” behavior of returns over long periods. However, Lo [LO 91] showed that this statistic is also sensitive to short memory effects: in the case of an AR(1) process, the R/S result can be biased upward by 73%. Lo proposed a modified R/S statistic, obtained by adding weighted autocovariance terms to the denominator. The new values of H obtained in this way turned out to be close to 0.5. Thus, for example, Corazza et al. [COR 97], Batten et al. [BAT 99] and Howe et al. [HOW 99] verify that the traditional R/S analysis gives values of H greater than 0.5 but that Lo’s modified R/S statistic [LO 91] makes the values of H drop towards 0.5: “what is more astonishing in this result is not the absence of long memory but rather the radical change in judgment that we are led to implement when we use Lo’s modified statistics” (see [HOW 99], p. 149).

Further studies on independence consider, in the function C(h) defined in (13.29), the cases f(x) = x² and f(x) = |x|. Absolute variations of price and their squares represent a measurement of price “volatility”. It is on this form of dependence, i.e. dependence on volatility, that scaling laws in finance will appear.

13.4.2. Non-linear dependence: validating H-correlative model on volatilities

13.4.2.1. The 1980s: ARCH modeling and its limits

A common starting point of all the studies conducted in the 1990s is the observation of the limits of the iid-α-stable and H-correlative models applied to stock market returns. This observation led researchers to look for a form of dependence in volatility, by first introducing short memory on variances with ARCH¹⁵ modeling, a trend that produced a great number of models of this family, developing the initial logic of conditioning the variance in various directions (for a synthesis review see [BOL 92]). However, in 1997, we could read this comment on

15. Auto-regressive conditional heteroscedasticity: modeling introduced by Engle [ENG 82]. See an ARCH presentation in [GOU 97a, GUE 94].



the ARCH trend: “Yet, the recent inflation of basic model varieties and terminology GARCH, IGARCH, EGARCH, TARCH, QTARCH, SWARCH, ACD-ARCH reveals that this approach appears to have reached its limits, cannot adequately answer to some questions, or does not make it possible to reproduce some stylized facts” (see [GOU 97b], p. 8). These “stylized facts” particularly relate to the hyperbolic decay of the correlation of volatilities, i.e., long memory, or a scaling law on volatility.

13.4.2.2. The 1990s: emphasis of long dependence on volatility

Baillie [BAI 96] and Bollerslev et al. [BOL 00] present a review of the evidence for long memory in volatility. This scaling law on volatility makes it possible to understand the scaling anomalies observed on the kurtosis K. In fact, as Cont [CON 97] shows, if we assume that correlations of volatility follow a power law of type $g(k) \cong g(0)\,k^{-\alpha}$, then we obtain a scaling relation for the kurtosis K:

$$K(n\tau) = \frac{K(\tau)}{n} + \frac{6\,\big(K(\tau) + 2\big)}{(2 - \alpha)(1 - \alpha)\,n^{\alpha}}$$

which explains the phenomenon of abnormal decrease of kurtosis. Mandelbrot [MAND 71] showed the importance of taking the horizon into consideration in markets whose variations can be modeled by long dependence processes: in particular, the probability of huge losses decreases less rapidly than in an iid-Gaussian world. Financiers often say that “patience reduces risk”: what long dependence shows is that this decrease is much slower than it appears and that it is necessary to be very patient.

13.5. Towards a rediscovery of scaling laws in finance

After 40 years of financial modeling of stock market prices, we can observe that one of the new intellectual aspects of the 1990s in terms of describing stock market variations was a change in perspective on markets that appears in the research in finance. We can find a trace of this change in the emergence of new vocabulary.

Although, since Zajdenweber [ZAJ 76], all reference to fractals had disappeared from articles on finance (fractals developed in other research fields), Peters [PET 89], who estimated the value of Hurst’s exponent H on the S&P 500 index, and Walter [WAL 89, WAL 90] reintroduced Mandelbrot’s terminology and the concept of fractal structure of markets by considering the “Noah” and “Joseph” effects simultaneously in their implications for understanding the nature of stock market variations. It is especially with the long memory of volatilities that the concept of fractal structure reappeared, and Baillie [BAI 96] recalls the relation between Mandelbrot’s terminology and recent econometric studies. Richards [RIC 00] is directly interested in the fractal dimension of the market.



This is, in fact, the beginning of a progressive rediscovery of scaling laws and of a growing interest in these laws. However, following Peters’ works, we can draw attention to the confusion that may emerge among the professional financial community between the concept of fractals and that of chaos. Peters [PET 91] associated these two concepts in an approach that is more spontaneous than rigorous and consolidated them in his second work [PET 94], in which fractals and chaos are mistakenly unified in the title by presenting the application of chaos theory to investment policies, from a fractal description of stock market variations. Insofar as a great number of studies highlighted the non-applicability of approaches using chaos for the description of stock market variations¹⁶, this confusion, introduced by Peters, contributed (and perhaps still contributes) to clouding the understanding of the fractal hypothesis within the professional community, which often associates the concept of chaos with fractals.

Notwithstanding this conceptual hesitation, we can conclude that, faced with the success of fractal modeling of volatility and with recent attempts to apply Brownian motion on deformed time [DAC 93, EVE 95a, MUL 90, MUL 93], the financial modeling of stock market variations must make way in the coming years for a partial rediscovery of scaling invariances, no longer in the context of a unique fractal dimension (as in the case of the iid-α-stable and H-correlative models) but through the introduction of deformed time models, which make it possible to understand market time by replacing physical time with intrinsic stock market time. The most recent modeling explores this promising method (see, for example, [ANE 00, MAND 97b]).

13.6. Bibliography

[ADE 65] ADELMAN I., “Long cycles: facts or artefacts?”, American Economic Review, vol. 50, p. 444–463, 1965.
[AKG 88a] AKGIRAY V., BOOTH G., “Mixed diffusion-jump process modeling of exchange rate movements”, The Review of Economics and Statistics, p. 631–637, 1988.
[AKG 88b] AKGIRAY V., BOOTH G., “The stable-law model of stock returns”, Journal of Business and Economic Statistics, vol. 6, no. 1, p. 51–57, 1988.
[AKG 89] AKGIRAY V., LAMOUREUX C., “Estimation of stable-law parameters: a comparative study”, Journal of Business and Economic Statistics, vol. 7, no. 1, p. 85–93, 1989.
[ALL 74] ALLAIS M., “The psychological rate of interest”, Journal of Money, Credit, and Banking, vol. 3, p. 285–331, 1974.
[ANE 00] ANÉ T., GEMAN H., “Order flow, transaction clock, and normality of asset returns”, Journal of Finance, vol. 55, no. 4, 2000.

16. For a synthesis, see, for example, [MIG 98].



[ARA 80] ARAD R., “Parameter estimation for symmetric stable distributions”, International Economic Review, vol. 21, no. 1, p. 209–220, 1980.
[BAC 00] BACHELIER L., Théorie de la spéculation, PhD Thesis in Mathematical Sciences, Ecole normale supérieure, 1900.
[BAI 96] BAILLIE R., “Long memory processes and fractional integration in econometrics”, Journal of Econometrics, vol. 73, p. 5–59, 1996.
[BAL 94] BÂLE, Risk management guidelines for derivatives, Basle Committee on Banking Supervision, July 1994.
[BAL 96] BÂLE, Amendment to the capital accord to incorporate market risks, Basle Committee on Banking Supervision, January 1996.
[BAL 98] BÂLE, Framework for supervisory information about derivatives and trading activities, Joint report, Basle Committee on Banking Supervision and Technical Committee of the IOSCO, September 1998.
[BAL 99] BÂLE, Trading and derivatives disclosures of banks and securities firms, Joint report, Basle Committee on Banking Supervision and Technical Committee of the IOSCO, December 1999.
[BAR 73] BARNEA A., DOWNES D., “A reexamination of the empirical distribution of stock price changes”, Journal of the American Statistical Association, vol. 68, no. 342, p. 348–350, 1973.
[BAT 99] BATTEN J., ELLIS C., MELLOR R., “Scaling laws in variance as a measure of long-term dependence”, International Review of Financial Analysis, vol. 8, no. 2, p. 123–138, 1999.
[BEL 00] BELKACEM L., LÉVY VÉHEL J., WALTER C., “CAPM, risk, and portfolio selection in α-stable markets”, Fractals, vol. 8, no. 1, p. 99–115, 2000.
[BLA 74] BLATTBERG R., GONEDES N., “A comparison of the stable and Student distributions as statistical models for stock prices”, Journal of Business, vol. 47, p. 244–280, 1974.
[BOL 92] BOLLERSLEV T., CHOU R., KRONER K., “ARCH modeling in finance: A review of the theory and empirical evidence”, Journal of Econometrics, vol. 52, no. 1-2, p. 5–59, 1992.
[BOL 00] BOLLERSLEV T., CAI J., SONG F., “Intraday periodicity, long memory volatility, and macroeconomic announcements effects in the US treasury bond market”, Journal of Empirical Finance, vol. 7, p. 37–55, 2000.
[BOO 87] BOOTHE P., GLASSMAN D., “The statistical distribution of exchange rates: Empirical evidence and economic implications”, Journal of International Economics, vol. 22, p. 297–319, 1987.
[BOU 97] BOUCHAUD J.P., POTTERS M., Théorie des risques financiers, Collection Aléas, Saclay, 1997.
[BRE 74] BRENNER M., “On the stability of the distribution of the market component in stock price changes”, Journal of Financial and Quantitative Analysis, vol. 9, p. 945–961, 1974.



[CAM 97] CAMPBELL J., LO A., MACKINLAY A.C., The Econometrics of Financial Markets, Princeton University Press, 1997.
[CHA 76] CHAMBERS J., MALLOWS C., STUCK B., “A method for simulating stable random variables”, Journal of the American Statistical Association, vol. 71, no. 354, p. 340–344, 1976.
[CLA 73] CLARK P., “A subordinated stochastic process model with finite variance for speculative prices”, Econometrica, vol. 41, no. 1, p. 135–155, 1973.
[CON 97] CONT R., “Scaling properties of intraday price changes”, Science and Finance Working Paper, June 1997.
[COR 97] CORAZZA M., MALLIARIS A.G., NARDELLI C., “Searching for fractal structure in agricultural futures markets”, The Journal of Futures Markets, vol. 17, no. 4, p. 433–473, 1997.
[COU 00] COURTAULT J.M., KABANOV Y., BRU B., CRÉPEL P., LEBON I., LE MARCHAND A., “Louis Bachelier on the centenary of Théorie de la spéculation”, Mathematical Finance, vol. 10, no. 3, p. 341–353, 2000.
[DAC 93] DACOROGNA M., MÜLLER U., NAGLER R., OLSEN R., PICTET O., “A geographical model for the daily and weekly seasonal volatility in the foreign exchange market”, Journal of International Money and Finance, vol. 12, p. 413–438, 1993.
[DEB 89] DE BONDT W., THALER R., “Anomalies: A mean-reverting walk down Wall Street”, Journal of Economic Perspectives, vol. 3, no. 1, p. 189–202, 1989.
[DUM 73] DUMOUCHEL W., “Stable distributions in statistical inference: 1. Symmetric stable distributions compared to other long-tailed distributions”, Journal of the American Statistical Association, vol. 68, no. 342, p. 469–477, 1973.
[DUM 75] DUMOUCHEL W., “Stable distributions in statistical inference: 2. Information from stably distributed samples”, Journal of the American Statistical Association, vol. 70, no. 350, p. 386–393, 1975.
[DUM 83] DUMOUCHEL W., “Estimating the stable index in order to measure tail thickness: A critique”, The Annals of Statistics, vol. 11, no. 4, p. 1019–1031, 1983.
[ELL 38] ELLIOTT R., The Wave Principle, Collins, New York, 1938.
[EMB 97] EMBRECHTS P., KLÜPPELBERG C., MIKOSCH T., Modelling Extremal Events for Insurance and Finance, Springer, 1997.
[ENG 82] ENGLE R., “Autoregressive conditional heteroskedasticity with estimates of the variance in United Kingdom inflation”, Econometrica, vol. 50, p. 987–1008, 1982.
[EVE 95a] EVERTSZ C.G., “Fractal geometry of financial time series”, Fractals, vol. 3, no. 3, p. 609–616, 1995.
[EVE 95b] EVERTSZ C.G., “Self-similarity of high-frequency USD-DEM exchange rates”, in Proceedings of the First International Conference on High Frequency Data in Finance (Zurich, Switzerland), vol. 3, March 1995.
[FAM 65] FAMA E., “The behavior of Stock Market prices”, Journal of Business, vol. 38, no. 1, p. 34–105, 1965.



[FAM 68] FAMA E., ROLL R., “Some properties of symmetric stable distributions”, Journal of the American Statistical Association, vol. 63, p. 817–836, 1968.
[FAM 71] FAMA E., ROLL R., “Parameter estimates for symmetric stable distributions”, Journal of the American Statistical Association, vol. 66, no. 334, p. 331–336, 1971.
[FAM 88] FAMA E., FRENCH K., “Permanent and temporary components of stock prices”, Journal of Political Economy, vol. 96, no. 2, p. 246–273, 1988.
[FIE 72] FIELITZ B., SMITH E., “Asymmetric stable distributions of stock price changes”, Journal of the American Statistical Association, vol. 67, no. 340, p. 813–814, 1972.
[FIE 83] FIELITZ B., ROZELLE J., “Stable distributions and the mixture of distributions hypotheses for common stock returns”, Journal of the American Statistical Association, vol. 78, no. 381, p. 28–36, 1983.
[GHY 97] GHYSELS E., GOURIÉROUX C., JASIAK J., “Market time and asset price movements: Theory and estimation”, in HAND D., JARKA S. (Eds.), Statistics in Finance, Arnold, London, p. 307–322, 1997.
[GOU 97a] GOURIÉROUX C., ARCH Models and Financial Applications, Springer-Verlag, 1997.
[GOU 97b] GOURIÉROUX C., LE FOL G., “Volatilités et mesures du risque”, Journal de la Société de statistique de Paris, vol. 138, no. 4, p. 7–32, 1997.
[GRA 66] GRANGER C.W.J., “The typical spectral shape of an economic variable”, Econometrica, vol. 34, p. 150–161, 1966.
[GUE 94] GUEGAN D., “Séries chronologiques non linéaires à temps discret”, Economica, 1994.
[HAG 78] HAGERMAN R., “More evidence of the distribution of security returns”, Journal of Finance, vol. 33, p. 1213–1221, 1978.
[HAL 89] HALL J., BRORSEN B., IRWIN S., “The distribution of futures prices: a test of the stable paretian and mixture of normal hypotheses”, Journal of Financial and Quantitative Analysis, vol. 24, no. 1, p. 105–116, 1989.
[HAS 87] HASBROUCK J., HO T., “Order arrival, quote behavior, and the return generating process”, Journal of Finance, vol. 42, p. 1035–1048, 1987.
[HOW 99] HOWE J.S., MARTIN D., WOOD B., “Much ado about nothing: long-term memory in Pacific Rim equity markets”, International Review of Financial Analysis, vol. 8, no. 2, p. 139–151, 1999.
[HSU 74] HSU D.A., MILLER R., WICHERN D., “On the stable paretian behavior of stock-market prices”, Journal of the American Statistical Association, vol. 69, no. 345, p. 108–113, 1974.
[HUR 51] HURST H.E., “Long term storage capacity of reservoirs”, Transactions of the American Society of Civil Engineers, vol. 116, p. 770–799, 1951.
[IOS 94] IOSCO, Operational and financial risk management control mechanisms for over-the-counter derivatives activities of regulated securities firms, Technical Committee of the International Organization of Securities Commissions, July 1994.



[KOU 80] KOUTROUVÉLIS I., “Regression-type estimation of the parameters of stable laws”, Journal of the American Statistical Association, vol. 75, no. 372, p. 918–928, 1980.
[LAM 94] LAMOUREUX C., LASTRAPES W., “Endogenous trading volume and momentum in stock-return volatility”, Journal of Business and Economic Statistics, vol. 12, no. 2, p. 225–234, 1994.
[LEV 02] LÉVY VÉHEL J., WALTER C., Les marchés fractals, PUF, Paris, 2002.
[LO 88] LO A.W., MACKINLAY A., “Stock prices do not follow random walks: evidence from a simple specification test”, Review of Financial Studies, vol. 1, p. 41–66, 1988.
[LO 91] LO A.W., “Long-term memory in stock market prices”, Econometrica, vol. 59, no. 5, p. 1279–1313, 1991.
[LON 96] LONGIN F., “The asymptotic distribution of extreme stock market returns”, Journal of Business, vol. 69, no. 3, p. 383–408, 1996.
[LON 00] LONGIN F., “From value at risk to stress testing approach: the extreme value theory”, Journal of Banking and Finance, p. 1097–1130, 2000.
[MAI 97] MAILLET B., MICHEL T., “Mesures de temps, information et distribution des rendements intrajournaliers”, Journal de la Société de statistique de Paris, vol. 138, no. 4, p. 89–120, 1997.
[MAND 63] MANDELBROT B., “The variation of certain speculative prices”, Journal of Business, vol. 36, p. 394–419, 1963.
[MAND 65] MANDELBROT B., “Une classe de processus stochastiques homothétiques à soi ; application à la loi climatologique de H.E. Hurst”, Comptes rendus de l’Académie des sciences, vol. 260, p. 3274–3277, 1965.
[MAND 67a] MANDELBROT B., “The variation of some other speculative prices”, Journal of Business, vol. 40, p. 393–413, 1967.
[MAND 67b] MANDELBROT B., TAYLOR H., “On the distribution of stock price differences”, Operations Research, vol. 15, p. 1057–1062, 1967.
[MAND 68] MANDELBROT B., VAN NESS J.W., “Fractional Brownian motion, fractional noises, and applications”, SIAM Review, vol. 10, no. 4, p. 422–437, 1968.
[MAND 71] MANDELBROT B., “When can price be arbitraged efficiently? A limit to the validity of random walk and martingale models”, Review of Economics and Statistics, vol. 53, p. 225–236, 1971.
[MAND 72] MANDELBROT B., “Statistical methodology for non-periodic cycles: from the covariance to R/S analysis”, Annals of Economic and Social Measurement, vol. 1, p. 259–290, 1972.
[MAND 73a] MANDELBROT B., “Le problème de la réalité des cycles lents et le syndrome de Joseph”, Economie appliquée, vol. 26, p. 349–365, 1973.
[MAND 73b] MANDELBROT B., “Le syndrome de la variance infinie et ses rapports avec la discontinuité des prix”, Economie appliquée, vol. 26, p. 349–365, 1973.



[MAND 97a] MANDELBROT B., Fractals and Scaling in Finance, Springer, New York, 1997 (abridged French version: Fractales, Hasard et Finances, Flammarion, Paris).
[MAND 97b] MANDELBROT B., FISHER A., CALVET L., “A multifractal model of asset returns”, Cowles Foundation Discussion Paper, no. 1164, September 1997.
[MANT 91] MANTEGNA R., “Lévy walks and enhanced diffusion in Milan stock exchange”, Physica A, vol. 179, p. 232–242, 1991.
[MANT 00] MANTEGNA R., STANLEY E., An Introduction to Econophysics: Correlations and Complexity in Finance, Cambridge University Press, 2000.
[MCC 78] MCCULLOCH J.H., “Continuous time processes with stable increments”, Journal of Business, vol. 51, no. 4, p. 601–619, 1978.
[MCC 81] MCCULLOCH J.H., “Simple consistent estimators of the stable distributions”, in Proceedings of the Annual Meeting of the Econometric Society, 1981.
[MER 76] MERTON R., “Option pricing when underlying stock returns are discontinuous”, Journal of Financial Economics, vol. 3, p. 125–144, 1976.
[MIG 98] MIGNON V., “Marchés financiers et modélisation des rentabilités boursières”, Economica, 1998.
[MIT 89] MITTNIK S., RACHEV S., “Stable distributions for asset returns”, Applied Mathematics Letters, vol. 2, no. 3, p. 301–304, 1989.
[MUL 90] MÜLLER U., DACOROGNA M., MORGENEGG C., PICTET O., SCHWARZ M., OLSEN R., “Statistical study of foreign exchange rates: empirical evidence of a price change scaling law and intraday pattern”, Journal of Banking and Finance, vol. 14, p. 1189–1208, 1990.
[MUL 93] MÜLLER U., DACOROGNA M., DAVÉ R., PICTET O., OLSEN R., WARD J., Fractals and intrinsic time – A challenge to econometricians, Olsen and Associates Research Group, UAM 1993-08-16, 1993.
[OFF 72] OFFICER R., “The distribution of stock returns”, Journal of the American Statistical Association, vol. 67, no. 340, p. 807–812, 1972.
[PAU 75] PAULSON A., HOLCOMB E., LEITCH R., “The estimation of the parameters of the stable laws”, Biometrika, vol. 62, no. 1, p. 163–170, 1975.
[PET 89] PETERS E., “Fractal structure in the capital markets”, Financial Analysts Journal, p. 32–37, July-August 1989.
[PET 91] PETERS E., Chaos and Order in the Capital Markets: A New View of Cycles, Prices, and Market Volatility, John Wiley & Sons, New York, 1991.
[PET 94] PETERS E., Fractal Market Analysis: Applying Chaos Theory to Investment and Economics, John Wiley & Sons, New York, 1994.
[PIC 75] PICKANDS J., “Statistical inference using extreme order statistics”, Annals of Statistics, vol. 3, p. 119–131, 1975.
[POT 88] POTERBA J.M., SUMMERS L., “Mean reversion in stock prices: Evidence and implications”, Journal of Financial Economics, vol. 22, p. 27–59, 1988.

464


[PRA 72] P RAETZ P., “The distribution of share price changes”, Journal of Business, vol. 45, p. 49–55, 1972. [PRE 72] P RESS S.J., “Estimation in univariate and multivariate stable distributions”, Journal of the American Statistical Association, vol. 67, no. 340, p. 842–846, 1972. [RIC 00] R ICHARDS G., “The fractal structure of exchange rates: Measurement and forecasting”, Journal of International Financial Markets, Institutions, and Money, vol. 10, p. 163–180, 2000. [SUM 86] S UMMERS L., “Does the stock market rationally reflect fundamental values?”, Journal of Finance, vol. 41, no. 3, p. 591–601, 1986. [TAQ 00] TAQQU M., “Bachelier et son époque: une conversation avec Bernard Bru”, in Proceedings of the First World Congress of the Bachelier Finance Society (Paris, France), June 2000. [TAY 86] TAYLOR S., Modelling Financial Time Series, John Wiley & Sons, 1986. [TEI 71] T EICHMOELLER J., “A note on the distribution of stock price changes”, Journal of the American Statistical Association, vol. 66, no. 334, p. 282–284, 1971. [TUC 88] T UCKER A., P OND L., “The probability distribution of foreign exchange price changes: test of candidate processes”, Review of Economics and Statistics, p. 638–647, 1988. [UPT 79] U PTON D., S HANNON D., “The stable paretian distribution, subordinated stochastic processes, and asymptotic lognormality: an empirical investigation”, Journal of Finance, vol. 34, no. 4, p. 1031–1039, 1979. [WAL 89] WALTER C., “Les risques de marché et les distributions de Lévy”, Analyse financière, vol. 78, p. 40–50, 1989. [WAL 90] WALTER C., “Mise en évidence de distributions Lévy-stables et d’une structure fractale sur le marché de Paris”, in Actes du premier colloque international AFIR (Paris, France), vol. 3, p. 241–259, 1990. [WAL 91] WALTER C., “L’utilisation des lois Lévy-stables en finance: une solution possible au problème posé par les discontinuités des trajectoires boursières”, Bulletin de l’IAF, vol. 349-350, p. 
3–32 and 4–23, 1991 [WAL 94] WALTER C., Les structures du hasard en économie: efficience des marchés, lois stables et processus fractals, PhD Thesis, IEP Paris, 1994. [WAL 96] WALTER C., “Une histoire du concept d’efficience sur les marchés financiers”, Annales HSS, vol. 4, p. 873–905, July-August 1996. [WAL 99] WALTER C., “Lévy-stability-under-addition and fractal structure of markets: Implications for the investment management industry and emphasized examination of MATIF notional contract”, Mathematical and Computer Modelling, vol. 29, no. 10-12, p. 37–56, 1999. [WAS 85] WASSERFALLEN W., Z IMMERMANN H., “The behavior of intra-daily exchange rates”, Journal of Banking and Finance, vol. 9, p. 55–72, 1985. [ZAJ 76] Z AJDENWEBER D., “Hasard et prévision”, Economica, 1976.

Chapter 14

Scale Relativity, Non-differentiability and Fractal Space-time

14.1. Introduction

The theory of scale relativity [NOT 93] applies the principle of relativity to scale transformations (in particular, to transformations of spatio-temporal resolutions). In Einstein’s formulation [EIN 16], the principle of relativity requires that the laws of nature be valid in every coordinate system, whatever its state. Since Galileo, this principle has been applied to the states of position (origin and orientation) and of motion (velocity and acceleration) of the coordinate system, i.e. to states which can never be defined in an absolute way, but only in a relative way: the state of one reference system can be defined only with respect to another system. The same is true of scale: the scale of one system can be defined only with respect to another system, and so possesses the fundamental property of relativity, namely that only scale ratios have a meaning, never an absolute scale. In the new approach, we reinterpret the resolutions not only as a property of the measuring device and/or of the measured system, but more generally as an intrinsic property of space-time, characterizing the state of scale of the reference system in the same way as velocity characterizes its state of motion. The principle of scale relativity requires that the fundamental laws of nature apply whatever the state of scale of the coordinate system.

Chapter written by Laurent NOTTALE.



What is the motivation behind adding such a first principle to fundamental physics? It becomes imperative from the very moment we want to generalize the current description of space and time. The present description is usually restricted to differentiable manifolds (even though singularities are possible at certain particular points). A way to generalize current physics thus consists of trying to abandon the hypothesis of differentiability of the spatio-temporal coordinates. As we will see, the main consequence of such an abandonment is that space-time becomes fractal, i.e. it acquires an explicit scale dependence (more precisely, it becomes scale-divergent) in terms of the spatio-temporal resolutions.

14.2. Abandonment of the hypothesis of space-time differentiability

If we analyze the state of physics based on the principle of relativity up to Einstein, we note that it is the whole of classical physics, including the theory of gravitation via the general relativity of motion, which is based on this principle. Quantum physics, although compatible with Galilean relativity of motion, does not seem to rely on it in its foundations. We may ask whether a new generalization of relativity, which would include quantum effects (or at least some of them) as its consequences, remains possible. However, in order to generalize relativity, it is necessary to generalize the possible transformations between coordinate systems, as well as the definition of what the possible coordinate systems are and, finally, the concepts of space and space-time. Einstein’s general relativity is based on the hypothesis that space-time is Riemannian, i.e. describable by a manifold that is at least twice differentiable: in other words, we can define a continuum of spatio-temporal events, then velocities, which are their derivatives, and then accelerations by a further differentiation.
Within this framework, Einstein’s equations are the most general among the simplest equations that are covariant under twice-differentiable coordinate transformations. Just as the passage from special relativity to general relativity is made possible by abandoning a restrictive hypothesis (the flatness of space-time, given up in favor of curved space-time), a new opening is possible by abandoning the assumption of differentiability. The issue now is to describe a space-time continuum which is no longer necessarily differentiable everywhere or almost everywhere.

14.3. Towards a fractal space-time

The second stage of the construction consists of “recovering” a mathematical tool that seems to be lost in such a generalization. The essential tool of physics since Galileo, Leibniz and Newton is the differential equation. Does abandoning the assumption of differentiability of space-time, and therefore of the coordinate systems and of the transformations between these systems, amount to abandoning differential equations?



This crucial problem can be circumvented by introducing the concept of fractal geometry into space-time physics. By means of it, non-differentiability can be treated using differential equations.

14.3.1. Explicit dependence of coordinates on spatio-temporal resolutions

This possibility results from the following theorem [NOT 93, NOT 96a, NOT 97a], which is itself a consequence of a theorem of Lebesgue. It can be proved that a continuous and almost nowhere differentiable curve has a length that depends explicitly on the resolution at which it is considered and tends to infinity when the resolution interval tends to zero. In other words, such a curve is fractal in the general sense given to this term by Mandelbrot [MAN 75, MAN 82]. Applied to the coordinate system of a non-differentiable space-time, this theorem implies a fractal geometry for this space-time [ELN 95, NOT 84, ORD 83], as well as for the reference frame. Moreover, it is this very dependence on resolution which solves the problem posed. Indeed, let us consider the definition of the derivative applied, for example, to a coordinate (which defines the velocity):

$$ v(t) = \lim_{dt \to 0} \frac{x(t + dt) - x(t)}{dt} \tag{14.1} $$

Non-differentiability is the non-existence of this limit. Since the limit is, in any case, physically unattainable (an infinite energy would be required to reach it, according to the Heisenberg time-energy relation), $v$ is redefined as $v(t, dt)$, a function of time $t$ and of the differential element $dt$, which is identified with a resolution interval and regarded as a new variable. The issue is no longer the description of what occurs in the limit, but the behavior of this function during successive zooms on the interval $dt$.

14.3.2. From continuity and non-differentiability to fractality

It can be proved [BEN 00, NOT 93, NOT 96a] that the length $L$ of a continuous and nowhere (or almost nowhere) differentiable curve depends explicitly on the resolution $\varepsilon$ at which it is considered and, further, that $L(\varepsilon)$ is strictly increasing and tends to infinity when $\varepsilon \to 0$. In other words, this curve is fractal (we will use the word “fractal” in this general sense throughout this chapter). Let us consider a curve (chosen as the graph of a function $f(x)$ for the sake of simplicity) in the Euclidean plane, which is continuous but nowhere differentiable between two points $A_0\{x_0, f(x_0)\}$ and $A_\Omega\{x_\Omega, f(x_\Omega)\}$. Since $f$ is non-differentiable, there is a point $A_1$ of coordinates $\{x_1, f(x_1)\}$, with $x_0 < x_1 < x_\Omega$, such that $A_1$ is not on the segment $A_0 A_\Omega$. Thus, the total length becomes $L_1 = L(A_0 A_1) + L(A_1 A_\Omega) > L_0 = L(A_0 A_\Omega)$. We can now iterate the argument and find two coordinates $x_{01}$ and $x_{11}$, with $x_0 < x_{01} < x_1$ and $x_1 < x_{11} < x_\Omega$, such that $L_2 = L(A_0 A_{01}) + L(A_{01} A_1) + L(A_1 A_{11}) + L(A_{11} A_\Omega) > L_1 > L_0$. By iteration we finally construct successive



approximations of the function $f(x)$ studied, $f_0, f_1, \ldots, f_n$, whose lengths $L_0, L_1, \ldots, L_n$ increase monotonically when the resolution $\varepsilon \approx (x_\Omega - x_0) \times 2^{-n}$ tends to zero. In other words, continuity and non-differentiability imply a monotonic scale dependence of $f$ in terms of the resolution $\varepsilon$. However, the function $L(\varepsilon)$ could be increasing and yet converge when $\varepsilon \to 0$. This is not the case for such a continuous and non-differentiable curve: indeed, the second stage of the demonstration, which establishes the divergence of $L(\varepsilon)$, is a consequence of Lebesgue’s theorem (1903), which states that a curve of finite length is differentiable almost everywhere (see for example [TRI 93]). Consequently, a non-differentiable curve necessarily has infinite length. These two results, taken together, establish the above theorem on the scale divergence of non-differentiable continuous functions. A direct demonstration, using non-standard analysis, was given in [NOT 93], p. 82. This theorem can easily be generalized to curves, surfaces, volumes and, more generally, to spaces of any dimension.

Regarding the converse proposition, the question is whether a continuous function whose length is scale-divergent between any two points such that $\delta x = x_A - x_B$ is finite (i.e. everywhere or nearly everywhere scale-divergent) is non-differentiable. In order to prepare the answer, let us remark that the scale-dependent length $L(\delta x)$ can easily be related to the average value of the scale-dependent slope $v(\delta x)$. Indeed, we have $L(\delta x) = \langle \sqrt{1 + v^2(\delta x)} \rangle$. Since we consider curves such that $L(\delta x) \to \infty$ when $\delta x \to 0$, this means that $L(\delta x) \approx \langle v(\delta x) \rangle$ for small enough $\delta x$, so that $L(\delta x)$ and $v(\delta x)$ share the same kind of divergence when $\delta x \to 0$. On the basis of this simple result, the answer to the question of the non-differentiability of scale-divergent curves is as follows (correcting and updating here previously published results [NOT 08]):

1) Homogeneous divergence. Let us first consider the case when the slopes diverge in the same way at all points of the curve, which we call “homogeneous divergence”. In other words, we assume that, for any pair of points, the absolute values $v_1$ and $v_2$ of their scale-dependent slopes satisfy: $\exists K_1$ and $K_2$ finite such that, $\forall \delta x$, $K_1 < v_2(\delta x)/v_1(\delta x) < K_2$. Then the mode of divergence of the mean is the same as that of the slope at the various points, and it is also the mode of divergence of the length. In this case the converse theorem is true: for homogeneous divergence, the length of a continuous curve $f$ is such that $L$ infinite (i.e. $L = L(\delta x) \to \infty$ when $\delta x \to 0$) $\Leftrightarrow$ $f$ non-differentiable.

2) Inhomogeneous divergence. In this case there may exist curves such that only a subset of zero measure of their points have divergent slopes, in such a way that the



length is nevertheless infinite in the limit $\delta x \to 0$. Such a function may therefore be almost everywhere differentiable and, at the same time, be characterized by a genuine fractal law of scale dependence of its length, i.e. by a power-law divergence characterized by a fractal dimension $D_F$. The same reasoning may be applied to other types of divergence, such as logarithmic, exponential, etc. Therefore, when the divergence is inhomogeneous, an infinite curve may be either differentiable or non-differentiable, whatever its divergence mode: there is no converse theorem in this case.

Applied to physics, this result means that fractal behavior may result from the action of singularities (infinite in number, even though forming a subset of zero measure) in a space or space-time that nevertheless remains almost everywhere differentiable (such as, for example, the Riemannian manifolds of Einstein’s general relativity). This supports Mandelbrot’s view of the origin of fractals, which are known to be extremely frequent in many natural phenomena that nevertheless seem to be well described by standard differential equations: they could come from the existence of singularities in differentiable physics (see e.g. [MAN 82], Chapter 11). However, the viewpoint of scale relativity theory is more radical, since the main problem we aim to solve in its framework is not the (albeit very interesting) question of the origin of fractals, but the issue of the foundation of quantum theory and of gauge fields from geometric first principles. As we shall recall, a fractal space-time is not sufficient to reach this goal (specifically as concerns the emergence of complex numbers). We need to work in the framework of non-differentiable manifolds, which are indeed fractal (i.e. scale-divergent), as has been shown above.
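The scale divergence established in section 14.3.2 can be illustrated numerically. The sketch below is not taken from the chapter: it assumes a truncated Weierstrass sum as a stand-in for a continuous nowhere-differentiable function (the parameters `a`, `b` and `terms` are illustrative choices) and measures the length of the broken line through its samples at the dyadic resolutions ε = 2^-n used in the argument above. Because the series is truncated, the curve is smooth below a scale of about b^-terms, so only coarser resolutions are probed.

```python
import math

def weierstrass(x, a=2 ** -0.5, b=2, terms=24):
    """Truncated Weierstrass sum; a*b > 1, so the full series is nowhere differentiable."""
    return sum(a ** k * math.cos(b ** k * math.pi * x) for k in range(terms))

def polygonal_length(f, x0, x1, n):
    """Length of the broken line through f sampled at resolution (x1 - x0) / 2**n."""
    m = 2 ** n
    xs = [x0 + (x1 - x0) * j / m for j in range(m + 1)]
    ys = [f(x) for x in xs]
    return sum(math.hypot(xs[j + 1] - xs[j], ys[j + 1] - ys[j]) for j in range(m))

# Successive approximations f_1, ..., f_10 on [0, 1], each halving the resolution.
lengths = [polygonal_length(weierstrass, 0.0, 1.0, n) for n in range(1, 11)]
for n, length in enumerate(lengths, start=1):
    print(f"eps = 2^-{n:<2d}  L = {length:8.3f}")
```

Each refinement adds sample points to the previous broken line, so the measured length can only grow (triangle inequality); for this choice of function it keeps growing over the whole range of resolutions probed.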
Fractality is nevertheless not central in this context: it mainly appears as a derived (and very useful) geometric property of such continuous non-differentiable manifolds.

14.3.3. Description of non-differentiable processes by differential equations

This result is the key to describing non-differentiable processes in terms of differential equations. We introduce the resolutions explicitly in the expressions of the main physical quantities and, as a consequence, in the fundamental equations of physics. This means that a quantity $f$, usually expressed in terms of space-time variables $x$, i.e. $f = f(x)$, must now be described as also depending on the resolutions $\varepsilon$, i.e. $f = f(x, \varepsilon)$. In other words, rather than considering only the strictly non-differentiable mathematical object $f(x)$, we shall consider its various “approximations” obtained by smoothing or averaging it at various resolutions $\varepsilon$:

$$ f(x, \varepsilon) = \int_{-\infty}^{+\infty} \Phi(x, y, \varepsilon)\, f(x + y)\, dy \tag{14.2} $$



where $\Phi(x, y, \varepsilon)$ is a smoothing function centered on $x$, for example a Gaussian function of standard deviation $\varepsilon$. More generally, we can use wavelet transformations based on a filter that is not necessarily conservative. Such a point of view is particularly well adapted to applications in physics: any real measurement is always performed at finite resolution (see [NOT 93] for additional comments on this point). In this framework, $f(x)$ becomes the limit for $\varepsilon \to 0$ of the family of functions $f_\varepsilon(x)$, in other words of the function of two variables $f(x, \varepsilon)$. However, whereas $f(x, 0)$ is non-differentiable (in the sense of the non-existence of the limit of $df/dx$ when $\varepsilon$ tends to zero), $f(x, \varepsilon)$, which we call a “fractal function” (and which is, in fact, defined using a class of equivalence that takes into account the fact that $\varepsilon$ is a resolution, see [NOT 93]), is now differentiable for all $\varepsilon \neq 0$.

The problem of the physical description of the various processes in which such a function $f$ intervenes is now posed differently. In standard differentiable physics, it amounts to finding differential equations involving the derivatives of $f$ with respect to the space-time coordinates, i.e. $\partial f/\partial x$, $\partial^2 f/\partial x^2$, namely the derivatives which intervene in the laws of displacement and motion. The integro-differential method amounts to performing such a local description of elementary space-time displacements and of their effect on physical quantities, and then integrating in order to obtain the large-scale properties of the system under consideration. Such a method has often been called “reductionist”, and it was indeed well adapted to traditional problems where no new information appears at different scales.

The situation is completely different for systems characterized by a fractal geometry and/or non-differentiability. Such behaviors are found towards very small and very large scales but also, more generally, in chaotic and/or turbulent systems and probably in essentially all living systems.
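As a hedged illustration of the smoothing (14.2), the sketch below assumes a Gaussian kernel of standard deviation ε applied to a truncated Weierstrass sum (the constants `A`, `B_FREQ` and `TERMS` are hypothetical choices, not taken from the chapter). Smoothing cos(wx) with such a kernel simply multiplies it by exp(-(wε)²/2), so f(x, ε) can be evaluated term by term. The result is differentiable for every ε ≠ 0, while its typical slope grows without apparent bound as ε decreases.

```python
import math

# Illustrative parameters (assumptions): amplitude ratio, frequency ratio, number of terms.
A, B_FREQ, TERMS = 2 ** -0.5, 2, 40

def smoothed_f(x, eps):
    """f(x, eps): Gaussian smoothing (standard deviation eps) of a Weierstrass sum.

    Each term cos(w x) is damped by exp(-(w * eps)**2 / 2) under the smoothing.
    """
    total = 0.0
    for k in range(TERMS):
        w = B_FREQ ** k * math.pi
        total += A ** k * math.exp(-0.5 * (w * eps) ** 2) * math.cos(w * x)
    return total

def rms_slope(eps):
    """Root-mean-square of d f(x, eps) / dx over one period (x in [0, 2])."""
    total = 0.0
    for k in range(TERMS):
        w = B_FREQ ** k * math.pi
        total += 0.5 * (A ** k * w * math.exp(-0.5 * (w * eps) ** 2)) ** 2
    return math.sqrt(total)

for n in range(1, 9):
    eps = 2.0 ** -n
    print(f"eps = 2^-{n}  f(0.3, eps) = {smoothed_f(0.3, eps):+.4f}  rms slope = {rms_slope(eps):9.3f}")
```

The root-mean-square slope uses the orthogonality of the cosine terms over one period; it is finite at each resolution and increases as ε is reduced, which is the scale divergence of the slope discussed in section 14.3.2.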
For such systems, new, original information exists at different scales, and the project of reducing the behavior of a system at one scale (in general, a large scale) to its description at another scale (in general, the smallest possible scale, $\delta x \to 0$) seems to lose its meaning and to become hopeless. Our suggestion consists precisely of giving up such a hope and introducing a new frame of thought in which all scales co-exist simultaneously inside a unique scale-space and are connected together by scale differential equations acting in this scale-space.

Indeed, in non-differentiable physics, $\partial f(x)/\partial x = \partial f(x, 0)/\partial x$ no longer exists. However, the physics of the given process will be completely described provided we succeed in knowing $f(x, \varepsilon)$, which is differentiable (with respect to $x$ and $\varepsilon$) for all finite values of $\varepsilon$. Such a function of two variables (which, to be complete, is written more precisely as $f[x(\varepsilon), \varepsilon]$) can be the solution of differential equations involving $\partial f(x, \varepsilon)/\partial x$ but also $\partial f(x, \varepsilon)/\partial \ln \varepsilon$. More generally, with non-linear laws, the equations of physics take the form of second-order differential equations, which will



then contain, in addition to the previous first derivatives, operators like $\partial^2/\partial x^2$ (laws of motion) and $\partial^2/\partial(\ln \varepsilon)^2$ (laws of scale), but also $\partial^2/\partial x\,\partial \ln \varepsilon$, which corresponds to a coupling between motion and scale (see below).

What is the physical meaning of the derivative $\partial f(x, \varepsilon)/\partial \ln \varepsilon$? It is simply the variation of the physical quantity $f$ under an infinitesimal scale transformation, i.e. a dilation of the resolution. More precisely, let us consider the length of a non-differentiable curve $L(\varepsilon)$, which can represent more generally a fractal curvilinear coordinate $L(x, \varepsilon)$. Such a coordinate generalizes to a non-differentiable and fractal space-time the concept of curvilinear coordinates introduced for curved Riemannian space-time in Einstein’s general relativity [NOT 89].

14.3.4. Differential dilation operator

Let us apply an infinitesimal dilation $\varepsilon \to \varepsilon' = \varepsilon(1 + d\rho)$ to the resolution. We omit the dependence on $x$ to simplify the notation in what follows, since for the moment we are interested in pure scale laws. We obtain:

$$ L(\varepsilon') = L(\varepsilon + \varepsilon\, d\rho) = L(\varepsilon) + \frac{\partial L(\varepsilon)}{\partial \varepsilon}\, \varepsilon\, d\rho = (1 + \tilde{D}\, d\rho)\, L(\varepsilon) \tag{14.3} $$

where $\tilde{D}$ is by definition the dilation operator. Comparing the last two members of this equation thus yields:

$$ \tilde{D} = \varepsilon \frac{\partial}{\partial \varepsilon} = \frac{\partial}{\partial \ln \varepsilon} \tag{14.4} $$

This well-known form of the infinitesimal dilation operator, obtained by applying the Gell-Mann-Lévy method (see [AIT 82]), shows that the “natural” variable for resolution changes is $\ln \varepsilon$ and that the scale differential equations to be built will indeed involve expressions such as $\partial L(x, \varepsilon)/\partial \ln \varepsilon$. What form will these equations take? In fact, equations describing the scale dependence of physical quantities have already been introduced in physics: these are the renormalization group equations, particularly developed in the framework of Wilson’s “multiple-scale-of-length” approach [WIL 83]. In its simplest form, a “renormalization group”-like equation for a physical quantity $L$ can be interpreted as stating that the variation of $L$ under an infinitesimal scale transformation $d \ln \varepsilon$ depends only on $L$ itself; in other words, $L$ determines the whole physical behavior, including the behavior under scale transformations. This is written:

$$ \frac{\partial L(x, \varepsilon)}{\partial \ln \varepsilon} = \beta(L) \tag{14.5} $$
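To make equation (14.5) concrete, here is a minimal sketch that integrates it step by step in $\ln \varepsilon$ with a plain Euler scheme. The function name `integrate_scale_law`, the linear choice β(L) = bL and the numerical values are assumptions made for illustration only. The constant b is tuned so that a dilation of the resolution by a factor 3 multiplies the length by 4/3, which is what one construction step of the von Koch curve does.

```python
import math

def integrate_scale_law(beta, L_start, ln_eps_start, ln_eps_end, steps=20000):
    """Euler integration of dL/d(ln eps) = beta(L) from ln_eps_start to ln_eps_end."""
    h = (ln_eps_end - ln_eps_start) / steps
    L = L_start
    for _ in range(steps):
        L += h * beta(L)
    return L

# Illustrative assumption: linear beta(L) = b * L with b = -(ln 4 / ln 3 - 1).
b = -(math.log(4) / math.log(3) - 1.0)
n_iterations = 5

# Continuous flow from eps = 1 down to eps = 3**-n_iterations ...
L_final = integrate_scale_law(lambda L: b * L, 1.0, 0.0, -n_iterations * math.log(3))
# ... compared with n_iterations discrete construction steps of the von Koch curve.
L_discrete = (4.0 / 3.0) ** n_iterations
print(L_final, L_discrete)
```

The two numbers agree up to the discretization error of the Euler scheme: the continuous flow in $\ln \varepsilon$ reproduces what the discrete generator achieves through successive finite dilations.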

An equation such as (14.5) (and its generalizations), the behavior of which we will analyze in more detail later on, is the differential equivalent of the generators in the case of



fractal objects built by iterations (for example, the von Koch curve). However, instead of passing from one stage of the construction to another by means of discrete finite dilations (successive factors of 3 in the case of the von Koch curve), we pass from $\ln \varepsilon$ to $\ln \varepsilon + d \ln \varepsilon$. In other words, differential calculus performed in the scale-space allows us to describe a behavior that is non-differentiable (in the limit) by differential equations.

14.4. Relativity and scale covariance

We complete our current description, which is made in terms of space (positions), space-time or phase-space, with a scale-space. We now consider that resolutions characterize the state of scale of the coordinate system, just as velocities characterize its state of motion. The relative nature of temporal and spatial resolution intervals is a universal law of nature: only a ratio of length or time intervals can be defined, never their absolute value, as is reflected in the constant need to appeal to units. This allows us to set out the principle of scale relativity, according to which the fundamental laws of nature apply whatever the state of scale of the reference system. In this framework, we shall call scale covariance the invariance of the form of the equations of physics under transformations of spatio-temporal resolutions (let us note that this expression was introduced by other authors in a slightly different sense, as a generalization of scale invariance). Care is also needed because a multiple covariance must be implemented in such an attempt: it will be necessary to combine the covariance of motion and the new scale covariance, as well as a covariance under scale-motion coupling.
We shall thus develop different types of covariant derivatives, which should be clearly distinguished: one acting strictly on the scales; then a “quantum-covariant” derivative, which describes the effects induced on the dynamics by the internal scale structures (and which transforms classical mechanics into quantum mechanics); and finally a covariant derivative which is identified with that of gauge theories and which describes non-linear effects of scale-motion coupling.

14.5. Scale differential equations

We now pass on to the next stage and construct scale differential equations with a physical significance, then look at their solutions. For this we shall be guided by an analogy with the construction of the laws of motion and by the constraint that such equations must satisfy the principle of scale relativity. We shall first recover self-similar fractal behavior with constant dimension. Under a scale transformation, such a law possesses the mathematical structure of the Galileo group and thus satisfies, in a simple way, the relativity principle.



The analogy with motion can be pushed further. We know, on the one hand, that the Galileo group is only an approximation of the Lorentz group (corresponding to the limit $c \to \infty$) and, on the other hand, that both remain descriptions of inertial behavior, whereas it is with dynamics that the physics of motion acquires its full complexity. The same is true for scale laws. Fractals with constant dimension are for scales the counterpart of Galilean inertia for motion. We can then suggest generalizing the usual dilation and contraction laws in two ways:

1) one way is to introduce a Lorentz group of scale transformations [NOT 92]. In this framework, there appears a finite resolution scale, minimal or maximal, invariant under dilations, which replaces zero or infinity while maintaining their physical properties. We have suggested identifying these scales, respectively, with the Planck length and with the scale of the cosmological constant [NOT 92, NOT 93, NOT 96a]. This situation, however, still corresponds to a linear transformation of scale on the resolutions;

2) another way is to take into account non-linear transformations of scale, i.e. to move to a “scale dynamics” and, if possible, to a generalized scale relativity [NOT 97a].

We shall consider in what follows some examples of these kinds of generalized laws, after recovering the standard fractal (scale-invariant) behavior (and the breaking of this symmetry) as the solution of the simplest possible first-order scale differential equation.

14.5.1. Constant fractal dimension: “Galilean” scale relativity

Power laws, which are typical of self-similar fractal behavior, can be identified as the simplest of the laws sought. Let us consider the simplest possible scale equation, which is written as an eigenvalue equation for the dilation operator:

$$ \tilde{D} L = b L \tag{14.6} $$

Its solution is a standard divergent fractal law:

$$ L = L_0 (\lambda_0/\varepsilon)^{\delta} \tag{14.7} $$

where $\delta = -b = D - D_T$, $D$ being the fractal dimension, assumed to be constant, and $D_T$ the topological dimension. The variable $L$ can denote, for example, the length measured along a fractal curve (which will describe, in particular, a coordinate in a fractal reference system). Such a law corresponds, with regard to scales, to inertia from the point of view of motion. We can verify this easily by applying a resolution transformation to it. Under such a transformation $\varepsilon \to \varepsilon'$, we obtain:

$$ \ln(L'/\lambda) = \ln(L/\lambda) + \delta \ln(\varepsilon/\varepsilon'), \qquad \delta' = \delta \tag{14.8} $$



where we recognize the mathematical structure of the Galileo transformation group between inertial systems: the substitution (motion → scale) results in the correspondences $x \to \ln(L/\lambda)$, $t \to \delta$ and $v \to \ln(\varepsilon/\varepsilon')$. Let us note the manifestation of the relativity of resolutions from the mathematical point of view: $\varepsilon$ and $\varepsilon'$ intervene only through their ratio, while the reference scale $\lambda_0$ has disappeared from relation (14.8). In agreement with the preceding analysis of the status of resolutions in physics, the scale exponent $\delta$ plays for scales the role played by time with regard to motion, and the logarithm of the ratio of resolutions plays the role of velocity. The composition law of dilations, written in logarithmic form, confirms this identification with the Galileo group:

$$ \ln(\varepsilon''/\varepsilon) = \ln(\varepsilon''/\varepsilon') + \ln(\varepsilon'/\varepsilon) \tag{14.9} $$

formally identical to the Galilean composition of velocities, $w = u + v$.

14.5.2. Breaking scale invariance: transition scales

Expression (14.7) is scale invariant. This invariance is spontaneously broken by the existence of displacement and motion. Let us change the origin of the coordinate system. We obtain:

$$ L = L_0 (\lambda_0/\varepsilon)^{\delta} + L_1 = L_1 [1 + (\lambda_1/\varepsilon)^{\delta}] \tag{14.10} $$

where $\lambda_1 = \lambda_0 (L_0/L_1)^{1/\delta}$. Whereas the scale $\lambda_0$ remains arbitrary, the scale $\lambda_1$ (which remains relative in terms of position and motion relativity) marks a breaking of scale symmetry (in other words, a fractal to non-fractal transition in the space of scales). Indeed, it is easy to establish that, for $\varepsilon \gg \lambda_1$, we have $L \approx L_1$ and $L$ no longer depends on resolution, whereas for $\varepsilon \ll \lambda_1$, we recover the scale dependence given by (14.7), which is asymptotically scale invariant. Now, this behavior (equation (14.10)), which thus satisfies the double principle of relativity of motion and of scale, is precisely obtained as the solution of the simplest scale differential equation that can be written (a first-order equation, depending only on $L$ itself, this dependence being expandable in a Taylor series; the preceding case corresponds to the simplification $a = 0$):

$$ \frac{dL}{d \ln \varepsilon} = \beta(L) = a + bL + \cdots \tag{14.11} $$
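The relation between (14.10) and (14.11) can be checked numerically. The sketch below is only an illustration under stated assumptions: the values of `L1`, `lam1` and `delta` are hypothetical, the Taylor series is truncated at first order, and the identifications δ = −b and L1 = −a/b are used. It evaluates expression (14.10), differentiates it with respect to ln ε by central finite differences, and compares the result with a + bL.

```python
import math

def L_of_eps(eps, L1, lam1, delta):
    """Expression (14.10): L = L1 * (1 + (lam1 / eps)**delta)."""
    return L1 * (1.0 + (lam1 / eps) ** delta)

def dL_dlneps(eps, L1, lam1, delta, h=1e-5):
    """Central finite difference of L with respect to ln(eps)."""
    up = L_of_eps(eps * math.exp(h), L1, lam1, delta)
    down = L_of_eps(eps * math.exp(-h), L1, lam1, delta)
    return (up - down) / (2.0 * h)

# Hypothetical values chosen only for illustration.
L1, lam1, delta = 2.0, 1.0, 0.5
b = -delta        # delta = -b
a = -b * L1       # L1 = -a / b

for eps in (1e-4, 1e-2, 1.0, 1e2, 1e4):
    L = L_of_eps(eps, L1, lam1, delta)
    residual = dL_dlneps(eps, L1, lam1, delta) - (a + b * L)
    print(f"eps = {eps:8.0e}  L = {L:10.4f}  residual = {residual:+.2e}")
```

The residual stays at the level of the finite-difference error across eight decades of resolution, and the printed lengths show the two regimes: a power-law growth for ε well below `lam1` and a plateau close to `L1` for ε well above it.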

The solution of (14.11), truncated at first order, is indeed given by expression (14.10), with $\delta = -b$ and $L_1 = -a/b$, $\lambda_1$ being an integration constant. Let us note that, if we push the Taylor series further, we obtain a solution yielding several transition scales, in agreement with the behaviors observed for many



natural fractal objects [MAN 82]. In particular, going up to second order, we find fractal structures with both a lower and an upper cut-off. We can also obtain behaviors which are scale-dependent toward small and large scales, but which become scale-independent at intermediate scales.

14.5.3. Non-linear scale laws: second order equations, discrete scale invariance, log-periodic laws

Among the corrections to scale invariance (characterized by power laws), one is likely to play a potentially important role in many domains, not limited to physics: the log-periodic laws, which can be defined by the appearance of complex scale exponents or complex fractal dimensions. Sornette et al. (see [SOR 97, SOR 98] and references therein) have shown that such behavior provides a very satisfactory and possibly predictive model of some earthquakes and market crashes. Chaline et al. [CHA 99] used such scale laws to model the chronology of major jumps in the evolution of species, and Nottale et al. [NOT 01a] showed that they also apply to the chronology of the main economic crises since the Neolithic era (see [NOT 00c] for more details). More recently, Cash et al. [CAS 02] showed that these laws describe the chronology of the main steps of embryogenesis and child development. This may be a first step towards a description of the temporal evolution of “crises” (in the general sense of the word), which could prove to be very general, all the more so as recent works have validated these first results [SOR 01]. An intermittency model of this behavior was recently proposed [QUE 00].

Let us show how to obtain a log-periodic correction to power laws [NOT 97b] using scale covariance [NOT 89], i.e. conservation of the form of scale-dependent equations (see also [POC 97]). Let us consider a quantity $\Phi$ explicitly dependent on resolution, $\Phi(\varepsilon)$.
In the application under consideration, the scale variable is identified with a time interval $\varepsilon = T - T_c$, where $T_c$ is the date of the crisis. Let us assume that $\Phi$ satisfies a renormalization-group-like first-order differential equation:

$$ \frac{d\Phi}{d \ln \varepsilon} - D\Phi = 0 \tag{14.12} $$

whose solution is a power law, $\Phi(\varepsilon) \propto \varepsilon^{D}$.

In the quest to correct this law, we note that directly introducing a complex exponent is not enough, since it would lead to large log-periodic fluctuations rather than to a controllable correction to the power law. So let us assume that the cancellation of the difference in (14.12) is only approximate and that the right-hand side of this equation actually differs from zero:

$$ \frac{d\Phi}{d \ln \varepsilon} - D\Phi = \chi \tag{14.13} $$



We require that the new function χ be a solution of an equation that keeps the same form as the initial equation:

$$\frac{d\chi}{d\ln\varepsilon} - D'\chi = 0 \tag{14.14}$$

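Setting $D' = D + \delta$ and eliminating χ (by differentiating (14.13) with respect to ln ε and using (14.14)), we find that Φ is a solution of a general second-order equation:

$$\frac{d^2\Phi}{(d\ln\varepsilon)^2} - B\,\frac{d\Phi}{d\ln\varepsilon} + C\,\Phi = 0 \tag{14.15}$$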

where we have $B = 2D + \delta$ and $C = D(D + \delta)$. Its solution is written $\Phi(\varepsilon) = a\,\varepsilon^{D}(1 + b\,\varepsilon^{\delta})$, where b can now be arbitrarily small. Finally, the choice of an imaginary exponent $\delta = i\omega$ yields a solution whose real part includes a log-periodic correction:

$$\Phi(\varepsilon) = a\,\varepsilon^{D}\,[1 + b\cos(\omega\ln\varepsilon)] \tag{14.16}$$
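As an illustrative check (a sketch, not part of the original derivation), the following Python snippet verifies numerically that $a\,\varepsilon^{D}(1 + b\,\varepsilon^{\delta})$ satisfies the second-order equation with $B = 2D + \delta$ and $C = D(D + \delta)$, and that for $\delta = i\omega$ its real part is the log-periodic law. The function names and parameter values are arbitrary choices made for this example.

```python
import cmath
import math


def phi_complex(eps, a, D, b, omega):
    """Complex solution a * eps**D * (1 + b * eps**(i*omega))."""
    x = math.log(eps)
    return a * math.exp(D * x) * (1 + b * cmath.exp(1j * omega * x))


def phi_log_periodic(eps, a, D, b, omega):
    """Real log-periodic law a * eps**D * [1 + b * cos(omega * ln eps)]."""
    x = math.log(eps)
    return a * math.exp(D * x) * (1 + b * math.cos(omega * x))


def residual_second_order(eps, a, D, b, delta):
    """Left-hand side Phi'' - B Phi' + C Phi, derivatives taken in x = ln eps.

    The derivatives of a*exp(D x) + a*b*exp((D + delta) x) are written
    analytically, so the result should vanish up to rounding error.
    """
    x = math.log(eps)
    B = 2 * D + delta
    C = D * (D + delta)
    term1 = a * cmath.exp(D * x)
    term2 = a * b * cmath.exp((D + delta) * x)
    phi = term1 + term2
    dphi = D * term1 + (D + delta) * term2
    d2phi = D ** 2 * term1 + (D + delta) ** 2 * term2
    return d2phi - B * dphi + C * phi


if __name__ == "__main__":
    # Example values (assumed): D = 1.5, omega = 6, small amplitude b = 0.1
    print(abs(residual_second_order(0.3, 2.0, 1.5, 0.1, 6j)))
    print(phi_log_periodic(0.3, 2.0, 1.5, 0.1, 6.0))
```

The correction factor repeats whenever ε is multiplied by $e^{2\pi/\omega}$, which is the discrete scale invariance referred to in the section title.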

Log-periodic fluctuations were also obtained within the approach of scale relativity through a reinterpretation of gauge invariance and of the nature of electromagnetism which can be proposed in this framework (see below and [NOT 96a, NOT 06]).

14.5.4. Variable fractal dimension: Euler-Lagrange scale equations

Let us now consider the case of “scale dynamics”. As we have indicated earlier, the strictly scale-invariant behavior with constant fractal dimension corresponds to a free behavior from the point of view of scale physics. Thus, just as there are forces which cause departures from inertial motion, we also expect natural fractal systems to display distortions compared with self-similar behavior. By analogy, such distortions can, as a first step, be attributed to the effect of a “scale force” or even a “scale field”.

Before introducing this concept, let us recall how we should reverse the viewpoint as regards the meaning of scale variables, in comparison with the usual description of fractal objects. This reversal parallels, with respect to scales, the one that took place for motion laws in the transition from “Aristotelian” laws to Galilean laws. From the Aristotelian viewpoint, time is the measurement of motion: it is thus defined by taking space and velocity as primary concepts. In the same way, the fractal dimension is generally defined from the “measure” of the fractal object (for example, curve length, surface area, etc.) and from the resolution:

$$\text{“}t = x/v\text{”} \quad\longleftrightarrow\quad \delta = D - D_T = \frac{d\ln L}{d\ln(\lambda/\varepsilon)} \tag{14.17}$$



With Galileo, time becomes a primary variable and velocity is derived from a ratio of space over time, which are now considered on the same footing, in terms of a space-time (which remains, however, degenerate, since the speed limit c is implicitly infinite there). This involves the vectorial character of velocity and its local aspect (finally implemented by its definition as the derivative of the position with respect to time). The same reversal can be applied to scales. The scale dimension δ itself becomes a primary variable, treated on the same footing as space and time, and the resolutions are therefore defined as derivatives of the fractal coordinate with respect to δ (i.e. as a “scale-velocity”):

$$V = \ln\frac{\lambda}{\varepsilon} = \frac{d\ln L}{d\delta} \tag{14.18}$$

This new and fundamental meaning given to the scale exponent $\delta = D - D_T$, now treated as a variable, makes it necessary to give it a new name. Henceforth, we will call it the djinn (in preceding articles, we had proposed the word zoom, but this already applies more naturally to the scale transformations themselves, $\ln(\varepsilon'/\varepsilon)$). This will lead us to work in terms of a generalized 5D space, the “space-time-djinn”. In analogy with the vectorial character of velocity, the vectorial character of the zoom (i.e., of the scale transformations) is then apparent, because the four spatio-temporal resolutions can now be defined starting from the four coordinates of space-time and from the djinn:

$$v^i = \frac{dx^i}{dt} \quad\longleftrightarrow\quad \ln\frac{\lambda^\mu}{\varepsilon^\mu} = \frac{d\ln L^\mu}{d\delta} \tag{14.19}$$

Note however that, in more recent works, a new generalization of the physical nature of the resolutions is introduced, which attributes a tensorial nature to them, analogous to that of a variance-covariance error matrix [NOT 06, NOT 08].

One could object to this reversal of the meaning of the scale variables that, from the point of view of measurements, it is only through L and ε that we have access to the djinn δ, which is deduced from them. However, the same is true of the time variable, which, though a primary variable, is always measured indirectly (through changes of position or state in space).

A final advantage of this inversion will appear later on in the attempts to construct a generalized scale relativity. It allows the definition of a new concept, that of scale-acceleration $\Gamma^\mu = d^2\ln L^\mu/d\delta^2$, which is necessary for the passage to non-linear scale laws and to a scale “dynamics”.

The introduction of this concept makes it possible to further reinforce the identification of fractals of constant fractal dimension with “scale inertia”. Indeed, the free scale equation can be written (in one dimension to simplify the writing):

$$\Gamma = \frac{d^2\ln L}{d\delta^2} = 0 \tag{14.20}$$

It integrates as:

$$\frac{d\ln L}{d\delta} = \ln\frac{\lambda}{\varepsilon} = \text{constant} \tag{14.21}$$

The constancy of the resolution means here that it is independent of the djinn δ. The solution therefore takes the expected form $L = L_0(\lambda/\varepsilon)^{\delta}$.

More generally, we can then make the assumption that the scale laws can be constructed from a least action principle. A scale Lagrange function $\tilde{L}(\ln L, V, \delta)$, with $V = \ln(\lambda/\varepsilon)$, is introduced (the tilde distinguishes it from the length L), and then a scale action:

$$S = \int_{\delta_1}^{\delta_2} \tilde{L}(\ln L, V, \delta)\, d\delta \tag{14.22}$$

The principle of stationary action then leads to Euler-Lagrange scale equations:

$$\frac{d}{d\delta}\frac{\partial\tilde{L}}{\partial V} = \frac{\partial\tilde{L}}{\partial\ln L} \tag{14.23}$$

14.5.5. Scale dynamics and scale force

The simplest possible form of these equations corresponds to a vanishing right-hand side (absence of scale force), and to the case where the Lagrange function takes the Newtonian form $\tilde{L} \propto V^2$. We once again recover, in this other way, the “scale inertia” power law behavior. Indeed, the Lagrange equation becomes in this case:

$$\frac{dV}{d\delta} = 0 \quad\Rightarrow\quad V = \text{constant} \tag{14.24}$$

The constancy of $V = \ln(\lambda/\varepsilon)$ means here, as we have already noticed, that it is independent of δ. Equation (14.23) can therefore be integrated in the usual fractal form $L = L_0(\lambda/\varepsilon)^{\delta}$.

However, the principal advantage of this representation is that it makes it possible to pass to the following order, i.e., to non-linear scale dynamic behaviors. We consider that the resolution ε can now become a function of the djinn δ. Having identified the logarithm of the resolution with a “scale-velocity”, $V = \ln(\lambda/\varepsilon)$, naturally leads to defining a scale acceleration:

$$\Gamma = \frac{d^2\ln L}{d\delta^2} = \frac{d\ln(\lambda/\varepsilon)}{d\delta} \tag{14.25}$$

The introduction of a scale force then makes it possible to write a scale analog of Newton’s dynamic equation (which is simply the preceding Lagrange equation (14.23)):

$$F = \mu\,\Gamma = \mu\,\frac{d^2\ln L}{d\delta^2} \tag{14.26}$$

where μ is a “scale-mass” which measures how the system resists the scale force.

14.5.5.1. Constant scale force

Let us first consider the case of a constant scale-force. Continuing the analogy with motion laws, such a force derives from a “scale-potential” $\varphi = F\ln L$. We can write equation (14.26) in the form:

$$\frac{d^2\ln L}{d\delta^2} = G \tag{14.27}$$

where $G = F/\mu$ = constant. This is the scale equivalent of parabolic motion in constant gravity. Its solution is a parabolic behavior:

$$V = V_0 + G\,\delta, \qquad \ln L = \ln L_0 + V_0\,\delta + \frac{1}{2}\,G\,\delta^2 \tag{14.28}$$

The physical meaning of this result is not clear in this form. Indeed, from the experimental point of view, ln L and possibly δ are functions of $V = \ln(\lambda/\varepsilon)$. After redefinition of the integration constants, this solution is therefore expressed in the form:

$$\delta = \frac{1}{G}\ln\frac{\lambda}{\varepsilon}, \qquad \ln\frac{L}{L_0} = \frac{1}{2G}\ln^2\frac{\lambda}{\varepsilon} \tag{14.29}$$

Thus, the fractal dimension, usually constant, becomes a linear function of the log-resolution, and the logarithm of the length no longer varies linearly, but in a parabolic way.

This result is potentially applicable to many situations, in all the fields where fractal analysis prevails (physics, chemistry, biology, medicine, geography, etc.). Frequently, after careful examination of the scale dependence of a given quantity, the power law model is rejected because of the variation of the slope in the plane (ln L, ln ε). In such a case, the conclusion that the phenomenon considered is not fractal could be premature. It could, on the contrary, be a non-linear fractal behavior relevant to scale dynamics, in which case the identification and the study of the scale force responsible for the distortion would be most interesting.
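To make the constant-force solution concrete, here is a minimal Python sketch that evaluates the parabolic law for ln(L/L0) and checks that its local slope in the plane (ln L, ln(λ/ε)) equals the djinn δ = V/G. The function names, the finite-difference step and the sample values are assumptions made for this illustration only.

```python
def djinn(V, G):
    """Scale exponent delta = V / G, with V = ln(lambda / eps)."""
    return V / G


def log_length(V, G):
    """ln(L / L0) = V**2 / (2 G): parabolic in the log-resolution V."""
    return V * V / (2.0 * G)


def local_slope(V, G, h=1e-6):
    """Central-difference estimate of d ln L / d ln(lambda / eps)."""
    return (log_length(V + h, G) - log_length(V - h, G)) / (2.0 * h)


if __name__ == "__main__":
    G = 4.0  # constant scale "acceleration" F / mu (arbitrary example value)
    for V in (0.5, 1.0, 2.0, 4.0):
        # The measured slope grows linearly with V instead of staying constant.
        print(V, log_length(V, G), local_slope(V, G), djinn(V, G))
```

A slope that drifts linearly with ln(λ/ε) in measured data is therefore the signature to look for before rejecting a fractal description outright.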



14.5.5.2. Scale harmonic oscillator

Another interesting case of scale potential is that of the harmonic oscillator. In the case where it is “attractive”, the scale equation is written as:

$$(\ln L)'' + \alpha^2 \ln L = 0 \tag{14.30}$$

where the notation ″ indicates the second derivative with respect to the variable δ. Setting $\alpha = \ln(\lambda/\Lambda)$, the solution is written as:

$$\ln\frac{L}{L_0} = \left[1 - \frac{\ln^2(\lambda/\varepsilon)}{\ln^2(\lambda/\Lambda)}\right]^{1/2} \tag{14.31}$$

Thus, there is a minimal or maximal scale Λ for the system considered, whereas the slope d ln L/d ln ε (which can no longer be identified with the djinn δ in this non-linear situation) varies between zero and infinity in the range of resolutions allowed between λ and Λ.

More interesting still is the “repulsive” case, corresponding to a potential which we can write as $\varphi = -(\ln L/\delta_0)^2/2$. The solution is written as:

$$\ln\frac{L}{L_0} = \delta_0\sqrt{\ln^2\frac{\lambda}{\varepsilon} - \ln^2\frac{\lambda}{\Lambda}} \tag{14.32}$$

This solution is more general than that given in previous publications, where we had considered only the case $\ln(\lambda/\Lambda) = \delta_0^{-1}$. The interest of this solution is that it again yields, as asymptotic behavior at very large or very small scales ($\varepsilon \ll \lambda$ or $\varepsilon \gg \lambda$), the standard solution $L = L_0(\lambda/\varepsilon)^{\delta_0}$, of constant fractal dimension $D = 1 + \delta_0$. On the other hand, this behavior is subject to increasing distortions when the resolution approaches a maximum scale $\varepsilon_{\max} = \Lambda$, for which the slope (which we can identify with an effective fractal dimension minus the topological dimension) becomes infinite. In physics, we suggested that such a behavior could shed new light on quark confinement: indeed, within the framework of the reinterpretation of gauge symmetries as symmetries on the spatio-temporal resolutions (see below), the gauge group of quantum chromodynamics is SU(3), which is precisely the dynamic symmetry group of the harmonic oscillator.

Solutions of this type could also be of interest in the biological field, because we can interpret the existence of a maximum scale where the effective fractal dimension becomes infinite as that of a wall, which could provide models, for example, of cell walls. At scales smaller than this maximum scale (for small components which evolve inside the system considered), we tend either towards scale-independence (zero slope) in the first case, or towards “free” fractal behavior with constant slope in the second case, which is still in agreement with this interpretation.
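The two limiting behaviors of the repulsive solution described above (constant slope δ0 far from Λ, diverging slope near Λ) can be seen in a short Python sketch. The variable V stands for ln(λ/ε) and V_wall for ln(λ/Λ); these names, the analytic slope formula written out in `repulsive_slope`, and the sample values are illustrative assumptions rather than notation from the text.

```python
import math


def attractive(V, V_wall):
    """Attractive oscillator: ln(L/L0) = sqrt(1 - V**2 / V_wall**2)."""
    return math.sqrt(1.0 - (V / V_wall) ** 2)


def repulsive(V, V_wall, delta0):
    """Repulsive oscillator: ln(L/L0) = delta0 * sqrt(V**2 - V_wall**2).

    Only defined for |V| >= V_wall, i.e. on one side of the scale Lambda.
    """
    return delta0 * math.sqrt(V * V - V_wall * V_wall)


def repulsive_slope(V, V_wall, delta0):
    """d ln L / dV = delta0 * V / sqrt(V**2 - V_wall**2)."""
    return delta0 * V / math.sqrt(V * V - V_wall * V_wall)


if __name__ == "__main__":
    delta0, V_wall = 1.0, 1.0
    for V in (1.0001, 1.1, 2.0, 10.0, 1000.0):
        # Slope is huge close to the wall and tends to delta0 far from it.
        print(V, repulsive(V, V_wall, delta0), repulsive_slope(V, V_wall, delta0))
```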



14.5.6. Special scale relativity – log-Lorentzian dilation laws, invariant scale limit under dilations

It is with special scale relativity that the concept of “space-time-djinn” takes its full meaning. However, this has only been developed, until now, in two dimensions: one space-time dimension and one for the djinn. A complete treatment in five dimensions remains to be made.

The previous comment, according to which the standard fractal laws (with constant fractal dimension) have the structure of the Galileo group, immediately implies the possibility of generalizing these laws. Indeed, we have known since the work of Poincaré [POI 05] and Einstein [EIN 05] that, as regards motion, this group is a particular and degenerate case of the Lorentz group. Now, we can show [NOT 92, NOT 93] that, in two dimensions, assuming only that the sought transformation law is linear, internal and invariant under reflection (hypotheses deducible from the principle of special relativity alone), we find the Lorentz group as the only physically acceptable solution: namely, it corresponds to a Minkowskian metric. The other possible solution is the Euclidean metric, which correctly yields a relativity group (that of rotations in space), but is excluded in the space-time and space-djinn cases since it contradicts the experimental ordering found for velocities (the sum of two positive velocities yields a larger positive velocity) and for scale transformations (two successive dilations yield a larger dilation, not a contraction).

In what follows, let us denote by L the asymptotic part of the fractal coordinate. In order to take into account the fractal to non-fractal transition, it can be replaced in all equations by a difference of the type $L - L_0$. The new log-Lorentzian scale transformation is written, in terms of the dilation ratio ρ between the resolution scales, $\varepsilon \to \varepsilon'$ [NOT 92]:

$$\ln\frac{L'}{L_0} = \frac{\ln(L/L_0) + \delta\,\ln\rho}{\sqrt{1 - \ln^2\rho/\ln^2(\lambda/\Lambda)}} \tag{14.33}$$

$$\delta' = \frac{\delta + \ln\rho\,\ln(L/L_0)/\ln^2(\lambda/\Lambda)}{\sqrt{1 - \ln^2\rho/\ln^2(\lambda/\Lambda)}} \tag{14.34}$$

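The law of composition of dilations takes the form:

$$\ln\frac{\varepsilon'}{\lambda} = \frac{\ln(\varepsilon/\lambda) + \ln\rho}{1 + \dfrac{\ln\rho\,\ln(\varepsilon/\lambda)}{\ln^2(\lambda/\Lambda)}} \tag{14.35}$$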
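The composition law has the same algebraic form as the relativistic addition of velocities, with ln(λ/Λ) playing the role of the limiting value. The following Python sketch, in which the function name `compose` and the numerical values are assumptions chosen for illustration, checks three of its properties: the scale Λ is left unchanged by any dilation, the result never crosses Λ, and the usual additive (Galilean) law is recovered when ln(λ/Λ) is very large.

```python
def compose(log_eps, log_rho, C):
    """Log-Lorentzian composition of dilations.

    log_eps : ln(eps / lambda), the initial log-resolution
    log_rho : ln(rho), the logarithm of the applied dilation ratio
    C       : ln(lambda / Lambda), the limiting log-scale
    Returns ln(eps' / lambda).
    """
    return (log_eps + log_rho) / (1.0 + log_rho * log_eps / C ** 2)


if __name__ == "__main__":
    C = 10.0
    # Starting exactly at eps = Lambda (log_eps = -C), any dilation gives -C again.
    print(compose(-C, 3.0, C), compose(-C, -3.0, C))
    # Two strong contractions stay above the limit instead of adding up to -18.
    print(compose(-9.0, -9.0, C))
    # For a very large C the law reduces to ordinary addition of logarithms.
    print(compose(0.01, 0.02, 1e6))
```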

Let us specify that these laws are valid only at scales smaller than the transition scale λ (respectively, at scales larger than it when this law is applied to very large scales). As can be seen from these formulae, the scale Λ is a resolution scale that is invariant under dilations, unattainable (we would need an infinite dilation from any finite scale to reach it) and uncrossable. We proposed to identify it, towards very small scales, with the space and time Planck scale, $l_P = (\hbar G/c^3)^{1/2} = 1.616\,05(10) \times 10^{-35}$ m and $t_P = l_P/c$, which would then possess all the physical properties of the zero point while remaining finite. In the macroscopic case, it is identified with the cosmic length scale given by the inverse of the square root of the cosmological constant, $L_U = \Lambda^{-1/2}$ [NOT 93, NOT 96a, NOT 03]. We have theoretically predicted this scale to be $L_U = (2.7761 \pm 0.0004)$ Gpc [NOT 93], and the now observed value, $L_U(\text{obs}) = (2.72 \pm 0.10)$ Gpc, is in very good agreement with this prediction (see [NOT 08] for more details).

This type of “log-Lorentzian” law was also used by Dubrulle and Graner [DUB 96] in turbulence models, but with a different interpretation of the variables.

To what extent does this new dilation law change our view of space-time? At a certain level, it implies a complication because of the need to introduce the fifth dimension. Thus, the scale metric is written with two variables:

$$d\sigma^2 = d\delta^2 - \frac{(d\ln L)^2}{C_0^2}, \quad\text{with}\quad C_0 = \ln\frac{\lambda_0}{\Lambda} \tag{14.36}$$

The invariant dσ defines a “proper djinn”, which means that, although the effective fractal dimension, given by D = 1 + δ according to (14.34), has become variable, the fractal dimension remains constant in the proper reference system. However, we can also note that the fractal dimension now tends to infinity when the resolution interval tends to the Planck scale. While going to increasingly small resolutions, the fractal dimension will thus successively pass through the values 2, 3, 4, which would make it possible to cover a surface, then space, then space-time using a single coordinate.
It is thus possible to define a Minkowskian space-time-djinn requiring, in adequate fractal reference systems, only two dimensions on very small scales. Towards large resolutions, the fifth dimension of the space-time-djinn, of metric signature (+, −, −, −, −), varies less and less and becomes almost constant on scales currently accessible to accelerators (see [NOT 96a, Figure 4]). It finally vanishes beyond the Compton scale of the system under consideration, which is identified with the fractal to non-fractal transition in the rest frame. At this scale the temporal metric coefficient also changes sign, which generates the traditional Minkowskian space-time of metric signature (+, −, −, −).

14.5.7. Generalized scale relativity and scale-motion coupling

This is a vast field of study. We saw how we could introduce non-linear scale transformations and a scale dynamics. This approach is, however, only a first step towards a deeper “entirely geometric” level in which scale forces are but manifestations of the fractal and non-differentiable geometry. This level of description also implies taking into consideration resolutions which would in turn depend on the space and time variables.

The first aspect leads to the new concept of scale field, which corresponds to a distortion in scale space compared with the usual self-similar laws [NOT 97b]. It can also be represented in terms of a curved scale space. It is intended that this approach will be developed in more detail in future research.

The second aspect, of which we now point out some of the principal results, leads to a new interpretation of gauge invariance and thus of gauge fields themselves. This in turn implies the existence of general relations between mass scales and coupling constants (generalized charges) in particle physics [NOT 96a]. One of these relations makes it possible, as we will see, to predict theoretically the value of the electron mass (considered as primarily of electromagnetic origin, in this approach) as a function of its charge.

Lastly, to be complete, let us point out that even these two levels are only transitory stages from the perspective of the theory we intend to build. A more comprehensive version will deal with motion and scales on the same footing and thus see the principles of scale relativity and of motion relativity unified into a single principle. This will be done by working in a 5D space-time-djinn provided with a metric, in which all the transformations between reference systems are identified with rotations: in the planes (xy, yz, zx), they are ordinary rotations of 3D space; in the planes (xt, yt, zt), they are motion effects (which reduce to Lorentz boosts when the space-time-djinn is reduced to 4D space-time on macroscopic scales); finally, the four rotations in the planes (xδ, yδ, zδ, tδ) are identified with changes of space-time resolutions.

14.5.7.1. A reminder about gauge invariance

At the outset, let us recall briefly the nature of the problem set by gauge invariance in current physics. This problem already appears in traditional electromagnetic theory.
This theory, starting from experimental constraints, has led to the introduction of a four-vector potential, $A_\mu$, then of a tensorial field given by the derivative of the potential, $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. However, the Maxwell field equations (contrary to what occurs in Einstein’s general relativity for motion in a gravitational field) are not enough to characterize the motion of a charge in an electromagnetic field. It is necessary to add the expression for the Lorentz force, which is written in 4D form $f^\mu = (e/c)\,F^{\mu\nu} u_\nu$, where $u_\nu$ is the four-velocity. It is seen that only the fields intervene in this expression, and not the potentials. This implies that the motion will be unaffected by any transformation of the potentials which leaves the fields invariant. This is obviously the case if we add to the four-potential the gradient of any function of the coordinates: $A'_\mu = A_\mu + \partial_\mu\chi(x, y, z, t)$. This transformation is called, following Weyl, a gauge transformation, and the invariance law which results from it is gauge invariance.

What was apparently only a simple latitude left in the choice of the potentials takes on a deeper meaning within the framework of quantum mechanics. Indeed, gauge invariance in quantum electrodynamics becomes an invariance under the phase transformations of wave functions and is linked to current conservation through Noether’s theorem. It is known that this theorem connects fundamental symmetries to the appearance of conserved quantities, which are manifestations of these symmetries (thus the existence of energy results from the uniformity of time, that of momentum from the homogeneity of space, etc.). In the case of electrodynamics, it appears that the existence of the electric charge itself results from gauge symmetry. This fact is apparent in the writing of the Lagrangian which describes Dirac’s electron field coupled to an electromagnetic field. This Lagrangian is not invariant under the gauge transformation of the electromagnetic field, $A'_\mu = A_\mu + \partial_\mu\chi(x)$, but becomes invariant provided it is completed by a local gauge transformation on the phase of the electron wave function, $\psi \to e^{-ie\chi(x)}\,\psi$. This result can be interpreted by saying that the existence of the electromagnetic field (and its gauge symmetry) implies that of the electric charge.

However, although impressive (particularly through its capacity for generalization to non-Abelian gauge theories, which include the weak and strong fields and allow a description of electroweak fields), this progress in comprehending the nature of the electromagnetic field and of the charge remains incomplete, in our opinion. Indeed, the gauge transformation keeps an arbitrary nature. The essential point is that no explicit physical meaning is given to the function χ(x): however, this is the conjugate variable of the charge in the electron phase (just as energy is the conjugate of time and momentum that of space), so that it is from an understanding of its nature that an authentic comprehension of the nature of charge could arise. Moreover, the quantization of charge remains unexplained within the framework of the current theory. Here again, its conjugate variable holds the key to this problem.
The example of angular momentum is clear in this regard: its conjugate quantity is the angle, so that its conservation results from the isotropy of space. Moreover, the fact that angle variations cannot exceed 2π implies that the differences in angular momentum are quantized in units of ħ. In the same way, we can expect that the existence of a limitation on the variable χ(x), once its nature is elucidated, would imply charge quantization and lead to new quantitative results. As we will see, scale relativity does make it possible to make proposals in this direction.

14.5.7.2. Nature of gauge fields

Let us consider an electron or any other charged particle. In scale relativity, we identify “particles” with the geodesics of a non-differentiable space-time. These paths are characterized by internal (fractal) structures (beyond the Compton scale $\lambda_c = \hbar/mc$ of the particle in the rest frame). Now consider any one of these structures (which is defined only in a relative way), lying at a resolution $\varepsilon < \lambda_c$. In a displacement of the electron, the relativity of scales will imply the appearance of a field induced by this displacement.



To understand it, we can take as a model an aspect of the construction, from the general relativity of motion, of Einstein’s theory of gravitation. In this theory, gravitation is identified with the manifestation of the curvature of space-time, which results in a rotation of vectors of geometric origin. However, this general rotation of any vector during a translation can result simply from the generalized relativity of motion alone. Indeed, since space-time is relative, a vector $V^\mu$ subjected to a displacement $dx^\rho$ cannot remain identical to itself (the reverse would mean absolute space-time). It will thus undergo a rotation, which is written, using the Einstein summation convention on identical lower and upper indices, $\delta V^\mu = \Gamma^\mu_{\nu\rho}\,V^\nu dx^\rho$. The Christoffel symbols $\Gamma^\mu_{\nu\rho}$, which emerge naturally in this transformation, can then be calculated, in the course of this construction, in terms of derivatives of the metric potentials $g_{\mu\nu}$, which makes it possible to regard them as components of the gravitational field generalizing Newton’s gravitational force.

Similarly, in the case of the fractal structures of the electron, we expect that a structure, which was initially characterized by a certain scale, jumps to another scale after the electron displacement (if not, the scale space would be absolute, which would contradict the principle of scale relativity). A dilation field of resolution induced by the translations is then expected to appear, which is written:

$$e\,\frac{\delta\varepsilon}{\varepsilon} = -A_\mu\,\delta x^\mu \tag{14.37}$$

This effect can be described in terms of the introduction of a covariant derivative:

$$e\,D_\mu\ln(\lambda/\varepsilon) = e\,\partial_\mu\ln(\lambda/\varepsilon) + A_\mu \tag{14.38}$$

Now, this field of dilation must be defined irrespective of the initial scale from which we started, i.e., whatever the substructure considered. Therefore, starting from another scale ε′, related to ε by a dilation ρ (here we take into account, as a first step, only the Galilean scale relativity law in which the product of two dilations is the standard one), we get during the same translation of the electron:

$$e\,\frac{\delta\varepsilon'}{\varepsilon'} = -A'_\mu\,\delta x^\mu \tag{14.39}$$

The two expressions for the potential $A_\mu$ are then connected by the relation:

$$A'_\mu = A_\mu + e\,\partial_\mu\ln\rho \tag{14.40}$$

where $\ln\rho(x) = \ln(\varepsilon/\varepsilon')$ is the relative scale state (it depends only on the ratio between the resolutions ε and ε′), which now depends explicitly on the coordinates. In this regard, this approach already comes under the framework of general scale relativity and of non-linear scale transformations, since the “scale velocity” has been redefined as a first derivative with respect to the djinn, $\ln\rho = d\ln L/d\delta$, so that equation (14.40) involves a second-order derivative of the fractal coordinate, $d^2\ln L/dx^\mu d\delta$.



If we consider a translation along two different coordinates (or, in an equivalent way, a displacement on a closed loop), we may write a commutator relation:

$$e\,(\partial_\mu D_\nu - \partial_\nu D_\mu)\ln\rho = (\partial_\mu A_\nu - \partial_\nu A_\mu) \tag{14.41}$$

This relation defines a tensor field $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, which, unlike $A_\mu$, is independent of the initial scale from which we started. We recognize in $F_{\mu\nu}$ the analog of an electromagnetic field, in $A_\mu$ that of an electromagnetic potential, in e that of the electric charge and in equation (14.40) the property of gauge invariance which, in accordance with Weyl’s initial ideas and their development by Dirac [DIR 73], recovers its initial status of scale invariance. However, equation (14.40) represents progress compared with these early attempts and with the status of gauge invariance in today’s physics. Indeed, the gauge function χ(x, y, z, t) which intervenes in the standard formulation of gauge invariance, $A'_\mu = A_\mu + e\,\partial_\mu\chi$, and which has, up to now, been considered as arbitrary, is identified with the logarithm of internal resolutions, $\chi = \ln\rho(x, y, z, t)$.

Another advantage with respect to Weyl’s theory is that we are now allowed to define four different and independent dilations along the four space-time resolutions instead of only one global dilation. Therefore, we expect that the field above (which corresponds to a group U(1) of electromagnetic field type) is embedded into a larger field, in accordance with the electroweak theory and grand unification attempts. In the same way, we expect that the charge e is an element of a more complicated, “vectorial” charge. These early remarks have now developed into a full theory of non-Abelian gauge fields [NOT 06], in which the main tools and results of Yang-Mills theories can be recovered as a manifestation of fractal geometry. Moreover, this generalized approach makes it possible to suggest a new and more completely unified preliminary version of electroweak theory [NOT 00b], in which the Higgs boson mass can be predicted theoretically (we find $m_H = \sqrt{2}\,m_W = 113.73 \pm 0.06$ GeV, where $m_W$ is the W gauge boson mass).
Moreover, our interpretation of gauge invariance yields new insights about the nature of the electric charge and, when it is combined with the Lorentzian structure of dilations of special scale relativity, it makes it possible to obtain new relations between the charges and the masses of elementary particles [NOT 94, NOT 96a], as recalled in what follows.

14.5.7.3. Nature of the charges

In a gauge transformation $A'_\mu = A_\mu - \partial_\mu\chi$, the wave function of an electron of charge e becomes:

$$\psi' = \psi\,e^{ie\chi} \tag{14.42}$$



In this expression, the essential role played by the gauge function is clear. It is the conjugate variable of the electric charge, in the same way as position, time and angle are the conjugate variables of momentum, energy and angular momentum (respectively) in the expressions of the action and/or the quantum phase of a free particle, $\theta = i(px - Et + \sigma\varphi)/\hbar$. Our knowledge of what constitutes energy, momentum and angular momentum comes from our understanding of the nature of space, time, angles and their symmetries (translations and rotations), through Noether’s theorem. Conversely, the fact that we still do not really know what an electric charge is, despite all the development of gauge theories, comes, in our view, from the fact that the gauge function χ is considered devoid of physical meaning.

We have interpreted in the previous section the gauge transformation as a scale transformation of resolution, $\varepsilon \to \varepsilon'$, $\ln\rho = \ln(\varepsilon/\varepsilon')$. In such an interpretation, the specific property that characterizes a charged particle is the explicit scale dependence of its action, and therefore of its wave function, on resolution. The result is that the electron’s wave function is written:

$$\psi' = \psi\,\exp\!\left(i\,\frac{e^2}{\hbar c}\,\ln\rho\right) \tag{14.43}$$

Since, by definition (in the system of units where the permittivity of vacuum is 1):

$$e^2 = 4\pi\alpha\,\hbar c \tag{14.44}$$

equation (14.43) becomes:

$$\psi' = \psi\,e^{i4\pi\alpha\ln\rho} \tag{14.45}$$

Now considering the wave function of the electron as an explicitly dependent function on resolution ratios, we can write the scale differential equation of which ψ is a solution as:

$$-i\hbar\,\frac{\partial\psi}{\partial\!\left(\frac{e}{c}\ln\rho\right)} = e\,\psi \tag{14.46}$$

We recognize in $\tilde{D} = -i\hbar\,\partial/\partial\!\left(\frac{e}{c}\ln\rho\right)$ a dilation operator. Equation (14.46) can then be read as an eigenvalue equation:

$$\tilde{D}\psi = e\,\psi \tag{14.47}$$

In such a framework, the electric charge is understood as the conserved quantity that comes from the new scale symmetry, namely, the uniformity of the resolution variable ln ε.
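The reading of the charge as an eigenvalue of a dilation operator can be checked numerically. The sketch below works in units where ħ = c = 1 (an assumption made to keep the example short), so that the phase factor is exp(i e u) with u = e ln ρ, and applies −i d/du by a central finite difference. The function names and the step size are illustrative choices, and the fine-structure constant is entered as the approximate value 1/137.036.

```python
import cmath
import math

ALPHA = 1.0 / 137.036                        # fine-structure constant (approximate)
E_CHARGE = math.sqrt(4.0 * math.pi * ALPHA)  # e in units hbar = c = 1, permittivity 1


def psi(u):
    """Scale-dependent phase factor exp(i e u), with u = e * ln(rho)."""
    return cmath.exp(1j * E_CHARGE * u)


def dilation_operator(f, u, h=1e-5):
    """Apply -i d/du to f at u using a central finite difference."""
    return -1j * (f(u + h) - f(u - h)) / (2.0 * h)


if __name__ == "__main__":
    u = E_CHARGE * math.log(3.0)  # example: dilation ratio rho = 3
    eigenvalue = dilation_operator(psi, u) / psi(u)
    # The ratio should be real and close to E_CHARGE for any u.
    print(eigenvalue, E_CHARGE)
```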



14.5.7.4. Mass-charge relations

In the previous section, we have written the wave function of a charged particle in the form:

$$\psi' = \psi\,e^{i4\pi\alpha\ln\rho} \tag{14.48}$$

In the Galilean case such a relation leads to no new result, since ln ρ is unlimited. However, in the special scale-relativistic framework (see previous section), scale laws become Lorentzian below the Compton scale $\lambda_c$ of the particle, and ln ρ then becomes limited by the fundamental constant $C = \ln(\lambda_c/l_P)$, which characterizes the particle considered (where $l_P = (\hbar G/c^3)^{1/2}$ is the Planck length scale). This implies a quantization of the charge, which amounts to the relation $4\pi\alpha C = 2k\pi$, i.e.:

$$\alpha\,C = \frac{k}{2} \tag{14.49}$$

where k is an integer. This equation defines a general form for relations between masses and charges (coupling constants) of elementary particles.

For example, in the case of the electron, the ratio of its Compton length $\hbar/m_e c$ to the Planck length is equal to the ratio of the Planck mass ($m_P = (\hbar c/G)^{1/2}$) to the electron mass. Moreover, within the framework of the electroweak theory, it appears that the coupling constant of electrodynamics at low energy (i.e., the fine structure constant) results from a “running” electroweak coupling dependent on the energy scale. This running coupling is decreased by a factor 3/8 owing to the fact that the gauge bosons W and Z become massive and no longer contribute to the interaction at energies lower than their mass energy. We thus obtain a mass-charge relation for the electron which is written:

$$\frac{8}{3}\,\alpha\,\ln\frac{m_P}{m_e} = 1 \tag{14.50}$$
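The left-hand side of this relation is easy to evaluate. The Python sketch below does so with CODATA 2018 values of the constants; which values were used in the original work is not stated, so the choice of constants (and the function names) is an assumption of this example, and no threshold correction is applied.

```python
import math

# CODATA 2018 recommended values (SI units); an assumed choice for this sketch.
HBAR = 1.054571817e-34         # reduced Planck constant, J s
C_LIGHT = 299792458.0          # speed of light, m / s
G_NEWTON = 6.67430e-11         # gravitational constant, m^3 kg^-1 s^-2
M_ELECTRON = 9.1093837015e-31  # electron mass, kg
ALPHA = 7.2973525693e-3        # fine-structure constant


def planck_mass():
    """Planck mass m_P = sqrt(hbar * c / G), in kg."""
    return math.sqrt(HBAR * C_LIGHT / G_NEWTON)


def mass_charge_product():
    """Left-hand side (8/3) * alpha * ln(m_P / m_e) of the electron relation."""
    return (8.0 / 3.0) * ALPHA * math.log(planck_mass() / M_ELECTRON)


if __name__ == "__main__":
    print(planck_mass())          # roughly 2.18e-8 kg
    print(mass_charge_product())  # close to 1
```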

Such a theoretical relation between the mass and the charge of the electron is supported by the experimental data, which lead to a value of 1.0027 for this product, becoming 1.00014 when the threshold effects at the Compton transition are taken into account. Such a relation accounts for many other structures observed in particle physics and suggests solutions to the questions of the origin of the masses of certain particles, of the coupling values and of the hierarchy problem between electroweak and grand unification scales [NOT 96a, NOT 00a, NOT 00b].

14.6. Quantum-like induced dynamics

14.6.1. Generalized Schrödinger equation

In scale relativity, as we have seen, it is necessary to generalize the concept of space-time and to work within the framework of a fractal space-time. We consider coordinate systems (and paths, particularly the geodesics of fractal space) which are themselves fractal, i.e., which have internal structures at all scales. We concentrated, in the preceding sections, on possible descriptions of such structures, which relate to scale space. We will now briefly consider, to finish, their induced effects on displacements in ordinary space. The combination of these effects leads to the introduction of a description tool of the quantum mechanical type. In this framework, we give up the traditional description in terms of initial conditions and deterministic individual trajectories, in favor of a statistical description in terms of probability amplitudes.

Let us point out the essence of the method used within the framework of scale relativity to pass from a traditional dynamics to a quantum-like dynamics. The three minimal conditions which make it possible to transform the fundamental equation of dynamics into a Schrödinger equation are as follows:

1) there is an infinity of potential paths; this first condition is a natural outcome of non-differentiability and space fractality, if the paths can be identified with the geodesics of this space;

2) the paths are fractal curves (the dimension D = 2, which corresponds to a complete loss of information on elementary displacements, plays a special role here). In the case of a space and its geodesics, the fractal character of the space directly implies the fractality of its geodesics;

3) there is irreversibility at the infinitesimal level, i.e., non-invariance under the reflection of the time differential element dt → −dt. Again, this condition is an immediate consequence of the abandonment of the differentiability hypothesis.

Let us recall that one of the fundamental tools which enable us to manage non-differentiability consists of reinterpreting differential elements as variables.
Thus, the space coordinate becomes a fractal function X(t, dt) and its velocity, although becoming undefined at the limit dt → 0, is now also defined as a fractal function. The difference is that there are two definitions instead of one (which are transformed into one another by the reflection dt ↔ −dt), and thus the velocity concept becomes two-valued:

$$V_+(t, dt) = \frac{X(t + dt, dt) - X(t, dt)}{dt} \tag{14.51}$$

$$V_-(t, dt) = \frac{X(t, dt) - X(t - dt, dt)}{dt} \tag{14.52}$$

The first condition leads us to use a “fluid”-like description, where we no longer consider only the velocity of an individual path, but rather the mean velocity field v[x(t), t] of all the potential paths. The second condition brings us back to preceding works concerning scale laws satisfying the relativity principle. We saw that, in the simplest “scale-Galilean” case,
the coordinate (which is a solution of a scale differential equation) decomposes into a traditional, scale-independent, differentiable part and a fractal, non-differentiable part. We use this result here, after having differentiated the coordinate. This leads us to decompose the elementary displacements, dX = dx + dξ, into a scale-independent mean, dx = v dt, and a fluctuation dξ characterized by a law of fractal behavior, $d\xi \propto dt^{1/D_F}$, where $D_F$ is the fractal dimension of the path. The third condition implies, as we have seen, a two-valuedness of the velocity field. Defined by V = dX/dt = v + dξ/dt, it decomposes, in the case of both $V_+$ and $V_-$, into a non-fractal component v (thus differentiable in the ordinary sense) and a divergent fractal component dξ/dt of zero mean. We are thus led to introduce a 3D twin process:

$$dX^i_\pm = dx^i_\pm + d\xi^i_\pm \tag{14.53}$$

in which $dx^i_\pm = v^i_\pm\, dt$, $\langle d\xi^i_\pm \rangle = 0$ and:

$$\left\langle \frac{d\xi^i_\pm}{dt}\, \frac{d\xi^j_\pm}{dt} \right\rangle = \pm\, \delta^{ij} \left( \frac{2\mathcal{D}}{dt} \right)^{2 - 2/D_F} \tag{14.54}$$

(c = 1 is used here to simplify the writing; $\delta^{ij}$ is the Kronecker symbol). The symbol $\mathcal{D}$ is a fundamental scale parameter which characterizes the behavior of fractal trajectories (it is nothing other than a different notation for the fractal to non-fractal transition scale introduced previously). This parameter determines the essential transition which appears in such a process between the fractal behavior on small scales (where the fluctuations dominate) and the non-fractal behavior on large scales (where the mean traditional motion dominates).

A natural representation of the two-valuedness of variables due to irreversibility consists of using complex numbers (we can show that this choice is “covariant” in the sense that it preserves the form of the equations [CEL 04]). A complex time derivative operator is defined (which relates to the scale-independent differentiable parts):

$$\frac{\hat{d}}{dt} = \frac{1}{2} \left( \frac{d_+ + d_-}{dt} - i\, \frac{d_+ - d_-}{dt} \right) \tag{14.55}$$

Then we define an average complex velocity which results from the action of this operator on the position variable:

$$\mathcal{V}^i = \frac{\hat{d}}{dt}\, x^i = V^i - i\, U^i = \frac{v^i_+ + v^i_-}{2} - i\, \frac{v^i_+ - v^i_-}{2} \tag{14.56}$$
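To make the preceding definitions concrete, the following is a minimal one-dimensional numerical sketch of the twin process (14.53)-(14.54) and of the complex velocity (14.56). It is only an illustration under stated assumptions: the function names and numerical values are hypothetical, `D` stands for the scale parameter of (14.54), `DF` for the fractal dimension, and the fluctuation is drawn from a Gaussian law, whereas the text only fixes its mean and variance.

```python
import math
import random

def fluctuation_variance(dt, D, DF=2.0):
    """<dxi^2> from (14.54) for the (+) process: dt^2 * (2 D / dt)^(2 - 2/DF)."""
    return dt**2 * (2.0 * D / dt) ** (2.0 - 2.0 / DF)

def simulate_increments(n_steps, dt, v, D, seed=0):
    """Draw 1D increments dX = v dt + dxi for DF = 2.

    Gaussian dxi is an assumption of this sketch; the text only fixes
    the mean (zero) and the variance of the fluctuation.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(fluctuation_variance(dt, D, 2.0))
    return [v * dt + rng.gauss(0.0, sigma) for _ in range(n_steps)]

def complex_velocity(v_plus, v_minus):
    """Complex velocity V - i U of (14.56) from the two mean velocities."""
    V = (v_plus + v_minus) / 2.0   # real part: mean of forward and backward
    U = (v_plus - v_minus) / 2.0   # vanishes when v_plus == v_minus
    return complex(V, -U)

dt, D, v = 1e-2, 0.5, 1.0          # illustrative values only
incs = simulate_increments(20000, dt, v, D)
mean_inc = sum(incs) / len(incs)
var_inc = sum((d - mean_inc) ** 2 for d in incs) / len(incs)

print("estimated mean velocity:", mean_inc / dt)   # close to v, but noisy
print("estimated 2D:", var_inc / dt)               # close to 2 * D
print("complex velocity:", complex_velocity(1.5, 0.5))
```

With these values the typical fluctuation per step, about 0.1, is ten times larger than the mean displacement v dt = 0.01, which is the small-scale regime where the fluctuations dominate.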

Note that, in more recent works, we have constructed such an operator from the whole velocity field, including the non-differentiable part, and still obtained
the standard Schrödinger equation as the equation of motion [NOT 05, NOT 07], therefore allowing for the possible existence of fractal and non-differentiable wave functions in quantum mechanics [BER 96].

After having defined the laws of elementary displacements in such a fractal and locally irreversible process, we now need to analyze the effects of these displacements on other physical functions. Let us consider a differentiable function f(X(t), t). Its total derivative with respect to time is written:

$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \nabla f \cdot \frac{dX}{dt} + \frac{1}{2}\, \frac{\partial^2 f}{\partial X^i\, \partial X^j}\, \frac{dX^i\, dX^j}{dt} \tag{14.57}$$

We may now calculate the (+) and (−) derivatives of f. In this procedure, the mean value of dX/dt amounts to $d_\pm x/dt = v_\pm$, while $\langle dX^i\, dX^j \rangle$ reduces to $\langle d\xi^i_\pm\, d\xi^j_\pm \rangle$. Finally, in the particular case when the fractal dimension of the paths is $D_F = 2$, we have $\langle d\xi^2 \rangle = 2\mathcal{D}\, dt$, and the last term of equation (14.57) is transformed into a Laplacian. We obtain in this case:

$$\frac{d_\pm f}{dt} = \left( \frac{\partial}{\partial t} + v_\pm \cdot \nabla \pm \mathcal{D} \Delta \right) f \tag{14.58}$$
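The passage from (14.57) to (14.58) can be spelled out as follows. This is only a sketch: it inserts (14.54) with $D_F = 2$ into the second-order term, with $\mathcal{D}$ denoting the scale parameter of (14.54), and adds no assumption beyond those stated above.

```latex
% Second-order term of (14.57), averaged over the fluctuations, for D_F = 2.
% From (14.54): <dxi^i_pm dxi^j_pm> = pm 2 D delta^{ij} dt.
\begin{align*}
\frac{1}{2}\,\frac{\partial^2 f}{\partial X^i\,\partial X^j}\,
  \frac{\langle d\xi^i_\pm\, d\xi^j_\pm \rangle}{dt}
&= \frac{1}{2}\,\frac{\partial^2 f}{\partial X^i\,\partial X^j}\,
  \bigl( \pm 2\mathcal{D}\,\delta^{ij} \bigr) \\
&= \pm\,\mathcal{D}\,\delta^{ij}\,
  \frac{\partial^2 f}{\partial X^i\,\partial X^j}
 = \pm\,\mathcal{D}\,\Delta f .
\end{align*}
```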

Although we consider here only the fractal dimension $D_F = 2$, we recall that all the results obtained can be generalized to other values of the dimension [NOT 96a]. By combining these two derivatives, we obtain the complex derivation operator with respect to time:

$$\frac{\hat{d}}{dt} = \frac{\partial}{\partial t} + \mathcal{V} \cdot \nabla - i\, \mathcal{D} \Delta \tag{14.59}$$
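The combination leading to (14.59) can be made explicit. The lines below are a sketch that applies the definition (14.55) to the two derivatives (14.58) and uses the real and imaginary parts V and U of the complex velocity defined in (14.56); $\mathcal{V}$ denotes the complex velocity and $\mathcal{D}$ the scale parameter.

```latex
% Combine the (+) and (-) derivatives (14.58) through the operator (14.55).
\begin{align*}
\frac{1}{2}\,\frac{d_+ + d_-}{dt}
  &= \frac{\partial}{\partial t} + \frac{v_+ + v_-}{2}\cdot\nabla
   = \frac{\partial}{\partial t} + V\cdot\nabla , \\
\frac{1}{2}\,\frac{d_+ - d_-}{dt}
  &= \frac{v_+ - v_-}{2}\cdot\nabla + \mathcal{D}\,\Delta
   = U\cdot\nabla + \mathcal{D}\,\Delta , \\
\frac{\hat{d}}{dt}
  &= \frac{\partial}{\partial t} + (V - i\,U)\cdot\nabla - i\,\mathcal{D}\,\Delta
   = \frac{\partial}{\partial t} + \mathcal{V}\cdot\nabla - i\,\mathcal{D}\,\Delta .
\end{align*}
```

The Laplacian terms cancel in the sum and add up in the difference, which is why the diffusion-like term appears only in the imaginary part of the operator.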

It has two imaginary terms, $-i\,U \cdot \nabla$ and $-i\,\mathcal{D}\Delta$, in addition to the standard Eulerian total derivative operator, $d/dt = \partial/\partial t + V \cdot \nabla$. We can now rewrite the fundamental equation of dynamics using this derivative operator: it will then automatically take into account the new effects considered. It keeps the Newtonian form:

$$m\, \frac{\hat{d}^2}{dt^2}\, x = -\nabla \phi \tag{14.60}$$

where φ is an external potential. If the potential is either zero or a gravitational potential, this equation is nothing other than a geodesic equation. We have therefore implemented a generalized equivalence principle, thanks to which the motion (gravitational and quantum) remains locally described in an inertial form: indeed, as we will now see, this equation can be integrated in the form of a Schrödinger equation.


More generally, we can generalize Lagrangian mechanics with this new tool (see [NOT 93, NOT 96a, NOT 97a, NOT 07]). The complex character of the velocity $\mathcal{V}$ implies that the same is true of the Lagrange function and therefore of the action S. The wave function ψ is then introduced very simply as a re-expression of this complex action:

$$\psi = e^{i S / 2m\mathcal{D}} \tag{14.61}$$

It is related to the complex velocity in the following manner:

$$\mathcal{V} = -2i\, \mathcal{D}\, \nabla (\ln \psi) \tag{14.62}$$

We can now change the descriptive tool and write the Euler-Newton equation (14.60) in terms of this wave function:

$$2i\, m \mathcal{D}\, \frac{\hat{d}}{dt} (\nabla \ln \psi) = \nabla \phi \tag{14.63}$$

After some calculations, this equation can be integrated in the form of a Schrödinger equation [NOT 93]:

$$\mathcal{D}^2 \Delta \psi + i\, \mathcal{D}\, \frac{\partial}{\partial t} \psi - \frac{\phi}{2m}\, \psi = 0 \tag{14.64}$$
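The calculation alluded to above can be outlined as follows. This is a sketch under the usual assumptions, not a substitute for the detailed treatment in [NOT 93]: the complex velocity is a gradient, so that $(\nabla \ln\psi \cdot \nabla)\nabla \ln\psi = \tfrac{1}{2}\nabla(\nabla \ln\psi)^2$; the identity $\Delta\psi/\psi = \Delta \ln\psi + (\nabla \ln\psi)^2$ is used; and the integration constant is taken to be absorbed into the phase of ψ.

```latex
\begin{align*}
% Apply the operator (14.59), with the complex velocity (14.62), to grad(ln psi):
\frac{\hat{d}}{dt}(\nabla \ln\psi)
  &= \frac{\partial}{\partial t}\nabla \ln\psi
     - 2i\,\mathcal{D}\,(\nabla \ln\psi\cdot\nabla)\,\nabla \ln\psi
     - i\,\mathcal{D}\,\Delta(\nabla \ln\psi) \\
  &= \nabla\!\left[ \frac{\partial \ln\psi}{\partial t}
     - i\,\mathcal{D}\,\bigl( (\nabla \ln\psi)^2 + \Delta \ln\psi \bigr) \right]
   = \nabla\!\left[ \frac{\partial \ln\psi}{\partial t}
     - i\,\mathcal{D}\,\frac{\Delta\psi}{\psi} \right] , \\
% Insert into (14.63), integrate the gradient, and multiply by psi / 2m:
0 &= \nabla\!\left[ 2m\,\mathcal{D}^2\,\frac{\Delta\psi}{\psi}
     + 2i\,m\,\mathcal{D}\,\frac{\partial \ln\psi}{\partial t} - \phi \right]
  \;\Longrightarrow\;
  \mathcal{D}^2\,\Delta\psi + i\,\mathcal{D}\,\frac{\partial\psi}{\partial t}
     - \frac{\phi}{2m}\,\psi = 0 .
\end{align*}
```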

We recover the standard equation of quantum mechanics by selecting $\mathcal{D} = \hbar/2m$. By setting $\psi \psi^\dagger = \rho$, we find that the imaginary part of this equation is the continuity equation:

$$\frac{\partial \rho}{\partial t} + \mathrm{div}(\rho V) = 0 \tag{14.65}$$

which justifies the interpretation of ρ as a probability density [NOT 05, NOT 07].
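As a hedged numerical check of (14.64) and (14.62), the sketch below evaluates the one-dimensional form of the generalized Schrödinger equation on a free plane wave by finite differences. The function names, the step `h` and the numerical values are illustrative assumptions; `D = 0.5` corresponds to ħ/2m in units where ħ = m = 1, and the potential is set to zero.

```python
import cmath

def schrodinger_residual(psi, x, t, D, m=1.0, phi=lambda x: 0.0, h=1e-3):
    """Residual of D^2 psi_xx + i D psi_t - (phi / 2m) psi, by central differences."""
    psi_xx = (psi(x + h, t) - 2.0 * psi(x, t) + psi(x - h, t)) / h**2
    psi_t = (psi(x, t + h) - psi(x, t - h)) / (2.0 * h)
    return D**2 * psi_xx + 1j * D * psi_t - phi(x) / (2.0 * m) * psi(x, t)

def complex_velocity_from_psi(psi, x, t, D, h=1e-3):
    """Complex velocity -2 i D d/dx (ln psi), following (14.62)."""
    dlog = (psi(x + h, t) - psi(x - h, t)) / (2.0 * h * psi(x, t))
    return -2j * D * dlog

D = 0.5            # plays the role of hbar / 2m (illustrative units)
k = 2.0            # wave number of the test plane wave
omega = D * k**2   # free dispersion relation implied by (14.64) with phi = 0

def plane_wave(x, t):
    return cmath.exp(1j * (k * x - omega * t))

def wrong_wave(x, t):
    # Same wave number but a frequency that violates the dispersion relation.
    return cmath.exp(1j * (k * x - 2.0 * omega * t))

res_ok = abs(schrodinger_residual(plane_wave, 0.3, 0.7, D))
res_bad = abs(schrodinger_residual(wrong_wave, 0.3, 0.7, D))
velocity = complex_velocity_from_psi(plane_wave, 0.3, 0.7, D)

print(res_ok)     # small: the plane wave solves the equation
print(res_bad)    # of order one: the modified wave does not
print(velocity)   # nearly real, close to 2 * D * k
```

For a plane wave the complex velocity is real and equal to 2Dk, i.e. ħk/m with the standard choice of the parameter, so the imaginary part U vanishes for this state.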

14.6.2. Application in gravitational structure formation

Physics has long been confronted with the problem of the very inhomogeneous spatial distribution of matter in the universe. This distribution of spatial structures often shows a hierarchy of organization, whether in the microscopic domain (quarks in the nucleons, nucleons in the nucleus, nucleus and electrons in the atom, atoms in the molecule, etc.) or in the macroscopic domain (stars and their planetary systems, star groups and clusters gathering with the interstellar matter, itself fractal, in galaxies which form groups and clusters of galaxies, which belong to superclusters of galaxies, themselves subsets of the large-scale structures of the universe). What is striking in these two cases is that it is vacuum rather than matter which dominates, even on very large scales where we thought we would find a homogeneous distribution.


The theory of scale relativity was built, among other aims, to deal with questions of scale structuring. We take into account an explicit intervention of the observation scales (which amounts to working within the framework of a fractal geometry), or more generally of the scales which are characteristic of the phenomena under consideration, as well as relations between these scales, by the introduction of a resolution space. As we saw, such a description of structures over all the scales (or on a broad range) induces a new dynamics whose behavior becomes quantum rather than traditional. However, the conditions under which Newton’s equation is integrated in the form of a Schrödinger equation (which correspond to a complete loss of information on individual trajectories, from the viewpoint of angles, space and time) do not manifest themselves only on microscopic scales. Certain macroscopic systems, such as protoplanetary nebulae, which created our solar system, could satisfy these conditions and thus be described statistically by a Schrödinger-type equation (but with, of course, an interpretation different from that of standard quantum mechanics). Such a dynamics leads naturally to a “morphogenesis” [DAR 03] since it generates organized structures in a hierarchical way, dependent on the external conditions (forces and boundary conditions). A great example of the application of such an approach is in planetary system formation. It is fascinating that theoretical predictions about it could be made [NOT 93], then validated in our solar system, several years before the discovery [MAY 95, WOL 94] of the first extrasolar planets. 
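The planetary prediction discussed here takes the form $a_n/GM = (n/w_0)^2$ with $w_0 = 144$ km/s, together with $e = k/n$ for the eccentricities. As a rough sketch, it can be evaluated numerically as below. The function names are illustrative, and the values used for the solar gravitational parameter and the astronomical unit are standard constants supplied here, not taken from the text.

```python
import math

GM_SUN = 1.32712440018e20   # m^3 s^-2, solar gravitational parameter (standard value)
AU = 1.495978707e11         # m, astronomical unit (standard value)
W0 = 144.0e3                # m/s, the constant w0 = 144 km/s quoted in the text

def semi_major_axis_au(n, mass_msun=1.0):
    """Predicted peak a_n = G M (n / w0)^2, returned in astronomical units."""
    return GM_SUN * mass_msun * (n / W0) ** 2 / AU

def n_tilde(a_au, mass_msun=1.0):
    """Inverse relation: n = w0 * sqrt(a / (G M)) for a in AU and M in solar masses."""
    return W0 * math.sqrt(a_au * AU / (GM_SUN * mass_msun))

def eccentricity_peaks(n):
    """Predicted eccentricity peaks e = k / n for k = 0, ..., n - 1."""
    return [k / n for k in range(n)]

for n in range(1, 8):
    print(n, round(semi_major_axis_au(n), 3))   # a_n / M in AU per solar mass

print(round(n_tilde(1.0), 2))   # 1 AU around one solar mass
print(eccentricity_peaks(4))
```

For a = 1 AU around a star of one solar mass, the inverse relation gives a value close to 4.83, which is the numerical factor appearing in the variable 4.83 (a/M)^{1/2} of the caption of Figure 14.1.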
We theoretically predicted that the distribution of the semi-major axes of planetary orbits should show probability peaks at the values $a_n/GM = (n/w_0)^2$, where M is the mass of the star, $w_0 = 144$ km/s is a universal constant which characterizes the structuring of the inner solar system and which is observed (with its multiples and sub-multiples) from planetary scales to extragalactic scales, and n is an integer. The eccentricities are also expected to show probability peaks at the values e = k/n, where k is an integer ranging between 0 and n − 1. Since then, more than 250 exoplanets have been discovered, and the observed distributions of their semi-major axes (Figure 14.1) and eccentricities (Figure 14.2) show a highly significant statistical agreement with the theoretically expected probability distributions [NOT 96b, NOT 97c, NOT 00d, NOT 01b, DAR 03].

Figure 14.1. Histogram of the observed distribution of the variable $\tilde{n} = 4.83\,(a/M)^{1/2}$, where a is the semi-major axis and M the star mass (in solar system units, i.e., astronomical unit AU and solar mass $M_\odot$), for the recently discovered exoplanets and the planets of our inner solar system. We theoretically expect probability peaks for integer values of this variable. The probability of obtaining such a statistical agreement by chance is lower than $4 \times 10^{-5}$. [Axes: Number versus $4.83\,(a/M)^{1/2}$; the upper axis gives a/M in AU/$M_\odot$.]

Figure 14.2. Histogram of the distribution of $\tilde{k} = n \cdot e$, where n is the principal quantum number (which characterizes the semi-major axes) and e the eccentricity, for the exoplanets and the planets of the inner solar system. The theory predicts probability peaks for integer values of this variable. The probability of obtaining such an agreement by chance is lower than $10^{-4}$. The combined probability of obtaining the two distributions (semi-major axes and eccentricities) by chance is $3 \times 10^{-7}$, i.e., a level of statistical significance reaching 5σ. [Axes: Number versus n.e.]

14.7. Conclusion

The present contribution has mainly focused on the detailed principle and theoretical development of the “scale-relativistic” approach. However, we have not been able to touch on everything. For example, the construction of an equation of the Schrödinger type starting from the abandonment of differentiability, explicitly shown above in the case of Newton’s fundamental equation of motion, can be generalized to all cases where the equations of traditional physics can be put in the form of Euler-Lagrange equations. This was done explicitly for the equations of rotational
motion of a solid, for the equation of motion with a dissipation function, for the Euler and Navier-Stokes equations, and even for scalar field equations [NOT 97a, NOT 08]. Among the possible generalizations of the theory, we can also mention abandoning differentiability not only in the usual space (which leads, as we saw, to the introduction of a scale space governed by differential equations acting on scale variables, in particular on the spatio-temporal resolutions), but also in the space of scales itself. The whole of the previous construction can be applied again at this deeper level of description. This leads to the introduction of a “scale-quantum mechanics” [NOT 08]. In this framework, which is equivalent to a “third quantization”, fractal “objects” of a new type can be defined: rather than having structures at well-defined scales (the case of ordinary fractal objects), or having variable scale structures described by traditional laws (the case of the “scale-relativistic” fractals considered in this chapter), they are now characterized by a probability amplitude for scale ratios (“quantum” fractals).

With regard to the applications of this approach, we gave only two examples, concerning the electron mass and planetary systems. Let us recall, nevertheless, that it has been applied successfully to a large number of problems in physics and astrophysics which were unresolved with the usual methods, and that it has also allowed the theoretical prediction of structures and of new relations [NOT 96a, NOT 08]. Thus, the transformation of the fundamental equation of dynamics into a Schrödinger equation under very general conditions (loss of information on the individual paths and irreversibility) leads to a renewed understanding of the formation and evolution of gravitational structures.
This method, besides the semi-major axes and eccentricities of planets discovered around solar-type stars, briefly considered earlier, has also been applied successfully to the three planets observed around the pulsar PSR 1257+12 [NOT 96b], to the obliquities and inclinations of planets and satellites of the solar system [NOT 98a], to the satellites of giant planets [HER 98], and to double stars, double galaxies, the distribution of galaxies on very large scales and other gravitational structures [NOT 96a, NOT 98b, DAR 03].

14.8. Bibliography

[AIT 82] AITCHISON I., An Informal Introduction to Gauge Field Theories, Cambridge University Press, Cambridge, 1982.
[BEN 00] BEN ADDA F., CRESSON J., “Divergence d’échelle et différentiabilité”, Comptes rendus de l’Académie des sciences de Paris, série I, vol. 330, p. 261–264, 2000.
[BER 96] BERRY M.V., “Quantum fractals in boxes”, J. Phys. A: Math. Gen., vol. 29, p. 6617–6629, 1996.
[CAS 02] CASH R., CHALINE J., NOTTALE L., GROU P., “Human development and log-periodic laws”, C.R. Biologies, vol. 325, p. 585, 2002.
[CEL 04] CÉLÉRIER M.N., NOTTALE L., “Quantum-classical transition in scale relativity”, J. Phys. A: Math. Gen., vol. 37, p. 931, 2004.
[CHA 99] CHALINE J., NOTTALE L., GROU P., “Is the evolutionary tree a fractal structure?”, Comptes rendus de l’Académie des sciences de Paris, vol. 328, p. 717, 1999.
[DAR 03] DA ROCHA D., NOTTALE L., “Gravitational structure formation in scale-relativity”, Chaos, Solitons & Fractals, vol. 16, p. 565, 2003.
[DIR 73] DIRAC P.A.M., “Long range forces and broken symmetries”, Proc. Roy. Soc. Lond., vol. A333, p. 403–418, 1973.
[DUB 96] DUBRULLE B., GRANER F., “Possible statistics of scale invariant systems”, J. Phys. (Fr.), vol. 6, p. 797–816, 1996.
[EIN 05] EINSTEIN A., “Zur Elektrodynamik bewegter Körper”, Annalen der Physik, vol. 17, p. 891–921, 1905.
[EIN 16] EINSTEIN A., “Die Grundlage der allgemeinen Relativitätstheorie”, Annalen der Physik, vol. 49, p. 769–822, 1916.
[ELN 95] EL NASCHIE M.S., ROSSLER O.E., PRIGOGINE I. (Eds.), Quantum Mechanics, Diffusion, and Chaotic Fractals, Pergamon, Cambridge, p. 93, 1995.
[HER 98] HERMANN R., SCHUMACHER G., GUYARD R., “Scale relativity and quantization of the solar system. Orbit quantization of the planet’s satellites”, Astronomy and Astrophysics, vol. 335, p. 281, 1998.
[MAN 75] MANDELBROT B., Les objets fractals, Flammarion, Paris, 1975.
[MAN 82] MANDELBROT B., The Fractal Geometry of Nature, Freeman, San Francisco, 1982.
[MAY 95] MAYOR M., QUELOZ D., “A Jupiter-mass companion to a solar-type star”, Nature, vol. 378, p. 355–359, 1995.
[NOT 84] NOTTALE L., SCHNEIDER J., “Fractals and non-standard analysis”, Journal of Mathematical Physics, vol. 25, p. 1296, 1984.
[NOT 89] NOTTALE L., “Fractals and the quantum theory of space-time”, International Journal of Modern Physics, vol. A4, p. 5047, 1989.
[NOT 92] NOTTALE L., “The theory of scale relativity”, International Journal of Modern Physics, vol. A7, p. 4899, 1992.
[NOT 93] NOTTALE L., Fractal Space-Time and Microphysics: Towards a Theory of Scale Relativity, World Scientific, Singapore, 1993.
[NOT 94] NOTTALE L., “Scale relativity: first steps toward a field theory”, in DIAZ ALONSO J., LORENTE PARAMO M. (Eds.), Relativity in General, E.R.E.’93, Spanish relativity meetings, Editions Frontières, p. 121, 1994.
[NOT 96a] NOTTALE L., “Scale relativity and fractal space-time: application to quantum physics, cosmology and chaotic systems”, Chaos, Solitons, and Fractals, vol. 7, p. 877, 1996.
[NOT 96b] NOTTALE L., “Scale-relativity and quantization of extrasolar planetary systems”, Astronomy and Astrophysics Letters, vol. 315, p. L9, 1996.
[NOT 97a] NOTTALE L., “Scale relativity and quantization of the universe. I. Theoretical framework”, Astronomy and Astrophysics, vol. 327, p. 867, 1997.
[NOT 97b] NOTTALE L., “Scale relativity”, in DUBRULLE B., GRANER F., SORNETTE D. (Eds.), Scale Invariance and Beyond, Les Houches workshop, EDP Sciences and Springer, p. 249, 1997.
[NOT 97c] NOTTALE L., SCHUMACHER G., GAY J., “Scale relativity and quantization of the solar system”, Astronomy and Astrophysics, vol. 322, p. 1018, 1997.
[NOT 98a] NOTTALE L., “Scale relativity and quantization of planet obliquities”, Chaos, Solitons, and Fractals, vol. 9, p. 1035, 1998.
[NOT 98b] NOTTALE L., SCHUMACHER G., “Scale relativity, fractal space-time, and gravitational structures”, in NOVAK M.M. (Eds.), Fractals and Beyond: Complexities in the Sciences, World Scientific, p. 149, 1998.
[NOT 00a] NOTTALE L., “Scale relativity and non-differentiable fractal space-time”, in SIDHARTH B.G., ALTAISKY M. (Eds.), Frontiers of Fundamental Physics 4, Kluwer Academic and Plenum Publishers, International Symposia on Frontiers of Fundamental Physics, 2000.
[NOT 00b] NOTTALE L., “Scale relativity, fractal space-time, and morphogenesis of structures”, in DIEBNER H., DRUCKREY T., WEIBEL P. (Eds.), Sciences of the Interface, Genista, Tübingen, p. 38, 2000.
[NOT 00c] NOTTALE L., CHALINE J., GROU P., Les arbres de l’évolution, Hachette, Paris, 2000.
[NOT 00d] NOTTALE L., SCHUMACHER G., LEFÊVRE E.T., “Scale relativity and quantization of exoplanet orbital semi-major axes”, Astronomy and Astrophysics, vol. 361, p. 379, 2000.
[NOT 01a] NOTTALE L., CHALINE J., GROU P., “On the fractal structure of evolutionary trees”, in LOSA G. (Eds.), Fractals in Biology and Medicine, Birkhäuser Press, Mathematics and Biosciences in Interaction, 2001.
[NOT 01b] NOTTALE L., TRAN MINH N., “Theoretical prediction of orbits of planets and exoplanets”, Scientific News, Paris Observatory, 2002, http://www.obspm.fr/actual/nouvelle/nottale/nouv.fr.shtml.
[NOT 03] NOTTALE L., “Scale-relativistic cosmology”, Chaos, Solitons & Fractals, vol. 16, p. 539, 2003.
[NOT 05] NOTTALE L., “Origin of complex and quaternionic wavefunctions in quantum mechanics: the scale-relativistic view”, in ANGLÈS P. (Ed.), Proceedings of 7th International Colloquium on Clifford Algebra and their Applications, 19-29 May 2005, Toulouse, Birkhäuser.
[NOT 06] NOTTALE L., CÉLÉRIER M.N., LEHNER T., “Non-Abelian gauge field theories in scale relativity”, J. Math. Phys., vol. 47, p. 032303, 2006.
[NOT 07] NOTTALE L., CÉLÉRIER M.N., “Derivation of the postulates of quantum mechanics from the first principles of scale relativity”, J. Phys. A: Math. Theor., vol. 40, p. 14471, 2007.
[NOT 08] NOTTALE L., The Theory of Scale Relativity, 528 pp., 2008, forthcoming.
[ORD 83] ORD G.N., “Fractal space-time: a geometric analogue of relativistic quantum mechanics”, Journal of Physics A: Mathematical and General, vol. 16, p. 1869, 1983.
[POC 97] POCHEAU A., “From scale-invariance to scale covariance”, in DUBRULLE B., GRANER F., SORNETTE D. (Eds.), Scale Invariance and Beyond, Les Houches workshop, EDP Sciences and Springer, p. 209, 1997.
[POI 05] POINCARÉ H., “Sur la dynamique de l’electron”, Comptes Rendus de l’Académie des Sciences de Paris, vol. 140, p. 1504–1508, 1905.
[QUE 00] QUEIROS-CONDÉ D., “Principle of flux entropy conservation for species evolution / Principe de conservation du flux d’entropie pour l’évolution des espèces”, Comptes Rendus de l’Académie des Sciences de Paris, vol. 330, p. 445–449, 2000.
[SOR 97] SORNETTE D., “Discrete scale invariance”, in DUBRULLE B., GRANER F., SORNETTE D. (Eds.), Scale Invariance and Beyond, Les Houches workshop, EDP Sciences and Springer, p. 235, 1997.
[SOR 98] SORNETTE D., “Discrete scale invariance and complex dimensions”, Physics Reports, vol. 297, p. 239–270, 1998.
[SOR 01] SORNETTE D., JOHANSEN A., “Finite-time singularity in the dynamics of the world population and economic indices”, Physica, vol. A 294, p. 465–502, 2001.
[TRI 93] TRICOT C., Courbes et dimensions fractales, Springer-Verlag, Paris, 1993.
[WIL 83] WILSON K.G., “The Renormalization Group and critical phenomena”, American Journal of Physics, vol. 55, p. 583–600, 1983.
[WOL 94] WOLSZCZAN A., “Confirmation of Earth-mass planets orbiting the millisecond pulsar PSR B1257+12”, Science, vol. 264, p. 538, 1994.

List of Authors

Patrice ABRY, Laboratoire de physique de l’ENS, CNRS, Lyon, France
Liliane BEL, Laboratoire de mathématiques, Paris-Sud University, Orsay, France
Albert BENASSI, Department of Mathematics, Blaise Pascal University, Clermont-Ferrand, France
Jean-Marc CHASSERY, LIS, CNRS, Grenoble, France
Serge COHEN, LSP, Paul Sabatier University, Toulouse, France
Khalid DAOUDI, INRIA, Nancy, France
Franck DAVOINE, Heudiasyc, University of Technology of Compiègne, France
Patrick FLANDRIN, Laboratoire de physique de l’ENS, CNRS, Lyon, France
Paulo GONÇALVES, Laboratoire de l’Informatique du Parallélisme de l’ENS, INRIA, Lyon, France
Jacques ISTAS, Département IMSS, Pierre Mendès France University, Grenoble, France
Stéphane JAFFARD, Department of Mathematics, University of Paris XII, Créteil, France
Pierrick LEGRAND, IMB, University of Bordeaux 1, Talence, France
Jacques LÉVY VÉHEL, INRIA, Centre de recherche Saclay - Île-de-France, Orsay, France
Denis MATIGNON, LTCI, CNRS, Ecole nationale supérieure des télécommunications, Paris, France
Laurent NOTTALE, Observatoire de Paris-Meudon, CNRS, Meudon, France
Georges OPPENHEIM, Laboratoire de mathématiques, Paris-Sud University, Orsay, France
Rudolf RIEDI, Department of Telecommunications, University of Applied Sciences of Western Switzerland, Fribourg, Switzerland
Luc ROBBIANO, Laboratoire de mathématiques, Paris-Sud University, Orsay, France
Claude TRICOT, LMP, Blaise Pascal University, Clermont-Ferrand, France
Darryl VEITCH, Department of Electrical and Electronic Engineering, University of Melbourne, Victoria, Australia
Marie-Claude VIANO, Laboratoire de mathématiques, Paris-Sud University, Orsay, France
Christian WALTER, PricewaterhouseCoopers, Paris, University of Evry, France

