Series in Mathematical Analysis and Applications Edited by Ravi P. Agarwal and Donal O’Regan

VOLUME 9

NONLINEAR ANALYSIS

SERIES IN MATHEMATICAL ANALYSIS AND APPLICATIONS

Series in Mathematical Analysis and Applications (SIMAA) is edited by Ravi P. Agarwal, Florida Institute of Technology, USA and Donal O’Regan, National University of Ireland, Galway, Ireland. The series is aimed at reporting on new developments in mathematical analysis and applications of a high standard and/or current interest. Each volume in the series is devoted to a topic in analysis that has been applied, or is potentially applicable, to the solutions of scientific, engineering and social problems.

Volume 1 Method of Variation of Parameters for Dynamic Systems, V. Lakshmikantham and S.G. Deo
Volume 2 Integral and Integrodifferential Equations: Theory, Methods and Applications, Edited by Ravi P. Agarwal and Donal O’Regan
Volume 3 Theorems of Leray-Schauder Type and Applications, Donal O’Regan and Radu Precup
Volume 4 Set Valued Mappings with Applications in Nonlinear Analysis, Edited by Ravi P. Agarwal and Donal O’Regan
Volume 5 Oscillation Theory for Second Order Dynamic Equations, Ravi P. Agarwal, Said R. Grace, and Donal O’Regan
Volume 6 Theory of Fuzzy Differential Equations and Inclusions, V. Lakshmikantham and Ram N. Mohapatra
Volume 7 Monotone Flows and Rapid Convergence for Nonlinear Partial Differential Equations, V. Lakshmikantham, S. Koksal, and Raymond Bonnett
Volume 8 Nonsmooth Critical Point Theory and Nonlinear Boundary Value Problems, Leszek Gasiński and Nikolaos S. Papageorgiou
Volume 9 Nonlinear Analysis, Leszek Gasiński and Nikolaos S. Papageorgiou

Leszek Gasiński and Nikolaos S. Papageorgiou

Boca Raton London New York Singapore

Published in 2005 by Chapman & Hall/CRC Taylor & Francis Group 6000 Broken Sound Parkway NW, Suite 300 Boca Raton, FL 33487-2742 © 2005 by Taylor & Francis Group, LLC Chapman & Hall/CRC is an imprint of Taylor & Francis Group No claim to original U.S. Government works Printed in the United States of America on acid-free paper 10 9 8 7 6 5 4 3 2 1 International Standard Book Number-10: 1-58488-484-3 (Hardcover) International Standard Book Number-13: 978-1-58488-484-2 (Hardcover) Library of Congress Card Number 2005045529 This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use. No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers. For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC) 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged. Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Gasinski, Leszek. Nonlinear analysis / Leszek Gasinski, Nikolaos S. Papageorgiou. p. cm. -- (Series in mathematical analysis and applications ; v. 9) Includes bibliographical references and index. ISBN 1-58488-484-3 1. Nonlinear functional analysis. 2. Nonlinear operators. I. Papageorgiou, Nikolaos Socrates. II. Title. III. Series. QA321.5.G37 2005 515'.7--dc22 2005045529

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

Taylor & Francis Group is the Academic Division of T&F Informa plc.

To Prof. Zdzisław Denkowski

Contents

1 Hausdorff Measures and Capacity
  1.1 Measure Theoretical Background
  1.2 Covering Results
  1.3 Hausdorff Measure and Hausdorff Dimension
  1.4 Differentiation of Hausdorff Measures
  1.5 Lipschitz Functions
  1.6 Capacity
  1.7 Remarks

2 Lebesgue-Bochner and Sobolev Spaces
  2.1 Vector-Valued Functions
  2.2 Lebesgue-Bochner Spaces and Evolution Triples
  2.3 Compactness Results
  2.4 Sobolev Spaces
  2.5 Inequalities and Embedding Theorems
  2.6 Fine Properties of Functions and BV-Functions
  2.7 Remarks

3 Nonlinear Operators and Young Measures
  3.1 Compact and Fredholm Operators
  3.2 Operators of Monotone Type
  3.3 Accretive Operators and Semigroups of Operators
  3.4 The Nemytskii Operator and Integral Functions
  3.5 Young Measures
  3.6 Remarks

4 Smooth and Nonsmooth Analysis and Variational Principles
  4.1 Differential Calculus in Banach Spaces
  4.2 Convex Functions
  4.3 Haar Null Sets and Locally Lipschitz Functions
  4.4 Duality and Subdifferentials
  4.5 Integral Functionals and Subdifferentials
  4.6 Variational Principles
  4.7 Remarks

5 Critical Point Theory
  5.1 Deformation Results
  5.2 Minimax Theorems
  5.3 Structure of the Critical Set
  5.4 Multiple Critical Points
  5.5 Lusternik-Schnirelman Theory and Abstract Eigenvalue Problems
  5.6 Remarks

6 Eigenvalue Problems and Maximum Principles
  6.1 Linear Elliptic Operators
  6.2 The Partial p-Laplacian
  6.3 The Ordinary p-Laplacian
  6.4 Maximum Principles
  6.5 Comparison Principles
  6.6 Remarks

7 Fixed Point Theory
  7.1 Metric Fixed Point Theory
  7.2 Topological Fixed Point Theory
  7.3 Partial Order and Fixed Points
  7.4 Fixed Points of Multifunctions
  7.5 Remarks

Appendix
  A.1 Topology
  A.2 Measure Theory
  A.3 Functional Analysis
  A.4 Calculus and Nonlinear Analysis

List of Symbols

References

Preface

Linear functional analysis deals with infinite dimensional topological vector spaces (which mix in a fruitful way the linear (algebraic) structure with the topological one) and the linear operators acting between them. The effort was to extend standard results of linear analysis to an infinite dimensional context. The first half of the twentieth century is marked by intensive theoretical investigations in this area, which were also accompanied by detailed treatment of linear mathematical models. With the exception of a short period during the 1930’s (compact operators and Leray-Schauder degree), nonlinear operators were out of the emerging picture. However, mounting evidence from diverse other fields such as physics, engineering, economics, biology and others suggested that there should be an effort to extend the linear theory to various kinds of nonlinear operators. Systematic efforts in this direction started in the early 1960’s and mark the beginning of what is known today as “Nonlinear Analysis.” Since then several theories have been developed in this respect and today some of them are well established and approaching their limits, while others are still the object of intense research activity.

It is not a coincidence that simultaneously with the advent of nonlinear analysis, we have the appearance of nonsmooth analysis and of multivalued analysis, both of which were motivated by concrete needs in applied areas such as control theory, optimization, game theory and economics. Their development provided nonlinear analysis with new concepts, tools and theories that enriched the subject considerably. Today nonlinear analysis is a well established mathematical discipline, which is characterized by a remarkable mixture of analysis, topology and applications. It is exactly the fact that the subject combines these three items in a beautiful way that makes it attractive to mathematicians.

The notions and techniques of nonlinear analysis provide the appropriate tools to develop more realistic and accurate models describing various phenomena. This gives nonlinear analysis a rather interdisciplinary character. Today the more theoretically inclined nonmathematician (engineer, economist, biologist or chemist) needs a working knowledge of at least a part of nonlinear analysis in order to be able to conduct a complete qualitative analysis of his models. This supports a high demand for books on nonlinear analysis. Of course the subject is big (vast is maybe a more appropriate word) and no single book can cover all its theoretical and applied parts. In this volume, we have focused on those topics of nonlinear analysis which are pertinent to the theory of boundary value problems and their applications such as control theory and calculus of variations.

In Chapter 1 we deal with Hausdorff measures and capacities, which provide the means to estimate the “size” or “dimension” of “thin” or “highly irregular” sets. The recent development of fractal geometry and its uses in a variety of applied areas (such as Brownian motion of particles, turbulence in fluids, geographical coastlines and surfaces, etc.) renewed interest in Hausdorff measures, which for a long period were a topic of secondary importance within measure theory. In this chapter we also have our first encounter with Lipschitz and locally Lipschitz functionals, which will be examined again in Chapter 4. At this point we prove the celebrated “Rademacher’s theorem.”

Chapter 2 deals with certain classes of function spaces, which arise naturally in the study of boundary value problems. These are the Lebesgue-Bochner spaces (the suitable spaces for the analysis of evolution equations) and the Sobolev spaces (the suitable spaces for weak solutions of elliptic equations). We conduct a detailed study of these spaces with special emphasis on compactness and embedding results. Also, using the tool of Hausdorff measures and capacities, we investigate the fine properties of Sobolev functions and introduce and study functions of bounded variation, which are useful in theoretical mechanics.

In Chapter 3, we deal with certain large classes of nonlinear operators which arise often in applications. We examine compact operators for which we develop in parallel the corresponding linear theory, with one of the main results being the spectral theorem for compact self-adjoint operators on a Hilbert space. We also investigate nonlinear operators of monotone type, which have their roots in the calculus of variations and exhibit remarkable surjectivity properties. Monotone operators lead to accretive operators, the two families being identical in the context of Hilbert spaces. Accretive operators are closely connected with the generation theory of semigroups of operators. We also examine both linear and nonlinear semigroups. Semigroups are basic tools in the study of evolution equations. In addition, we examine the Nemytskii operator, which is a nonlinear operator encountered in almost all problems. Finally, in the last section of the chapter, we discuss Young measures, which provide the right framework to examine the limit behavior of the minimizing sequences of variational problems which do not have a solution. Young measures are used in optimal control and in the calculus of variations in connection with the so-called “relaxation method.”

Chapter 4 presents the calculus of smooth and of certain broad classes of nonsmooth functions. We start with the Gâteaux and Fréchet derivatives. We discuss the generic differentiability of continuous convex functions (Mazur’s theorem) and extend Rademacher’s theorem to locally Lipschitz functions between certain Banach spaces by using the notion of Haar-null sets. Then we pass to nondifferentiable functions and develop the duality properties and subdifferential theory of convex functions and the generalized subdifferential of locally Lipschitz functions. We also examine integral functionals and discuss the celebrated Ekeland variational principle, establishing its equivalence with some other geometric results of nonlinear analysis.

In Chapter 5 we present the critical point theory of $C^1$-functions defined on a Banach space. This theory is at the core of the variational methods used in the study of boundary value problems. We follow the deformation approach, which leads to minimax characterizations of the critical values. We also study the structure of the set of critical points and derive results on the existence of multiple critical points. Next we present the Lusternik-Schnirelman theory, which extends to nonlinear eigenvalue problems the corresponding linear theory of R. Courant.

Chapter 6 uses the abstract results of Chapter 5 as well as results from earlier chapters to develop the spectrum of linear elliptic differential operators, of the partial p-Laplacian (with Dirichlet and Neumann boundary conditions) and of the scalar and vector ordinary p-Laplacian (with Dirichlet, Neumann and periodic boundary conditions). We also present linear and nonlinear maximum principles and comparison results, which are useful tools in the study of boundary value problems.

Finally in Chapter 7 we have gathered some basic fixed point theorems. We present results from metric fixed point theory, from topological fixed point theory and fixed point results based on the partial order induced by a closed, convex pointed cone. We also indicate how many of these results can be extended to multifunctions (set-valued functions).

We have tried to make the volume self-contained. For this reason at the end of the book we have included a rather extended appendix for easy reference of the general results used in the book. Nevertheless, within the text, whenever we need to use some results not proved in the book, we also give exact references where the interested reader can find additional information.

Now that the project has reached its conclusion, we would like to thank the good people of CRC Press (especially Mrs. Jessica Vakili) for their help and kind cooperation during the preparation of this book. We would like to thank the two editors of this series, Prof. R.P. Agarwal and Prof. D. O’Regan, for supporting this effort.

Chapter 1 Hausdorff Measures and Capacity

During the golden era of measure theory (namely the first two decades of the 20th century), Carathéodory was the first to consider the notion of “length” for sets in $\mathbb{R}^N$. Later, in 1919, Hausdorff, motivated by the ideas of Carathéodory, introduced the measure and dimensional concepts that we shall discuss in this chapter. So in the modern language, the “length” of a set $A \subseteq \mathbb{R}^N$ will be its Hausdorff one-dimensional outer measure (denoted by $\mu^{(1)}$). Following the pioneering works of Carathéodory and Hausdorff, significant contributions to the subject were made by Besicovitch. In fact, in the first decade of development of the subject, the main advances were made by Besicovitch and his students, since geometric measure theory was not part of the mainstream measure theory. However, since the early 1970s, the subject has attracted a large number of researchers, due to its fundamental importance in the study of the so-called “Fractal Geometry.” Fractal sets arise in many applications, such as turbulence in fluids, geographical coastlines and surfaces, fluctuation of prices in stock exchanges, the Brownian motion of particles and others. Mandelbrot was the first to emphasize their use to model a variety of phenomena.

There have been many ways to estimate the “size” or “dimension” of small (thin) sets and of highly irregular sets and to generalize the idea that points, curves and surfaces have dimensions 0, 1 and 2 respectively. Hausdorff measure has the advantage of being a measure and together with the notion of Hausdorff dimension can provide a more delicate sense of the size of sets in $\mathbb{R}^N$ than Lebesgue measure provides. To illustrate this, consider in $\mathbb{R}^2$ the set
\[
A \stackrel{\mathrm{df}}{=} \Big\{ \Big( t, \sin\frac{1}{t} \Big) : t \in (0,1) \Big\}.
\]
Suppose we wish to measure the length of the curve $A$. A first approximation can be based on the Carathéodory outer measure, which defines:
\[
\lambda^1(A) \stackrel{\mathrm{df}}{=} \inf_{A \subseteq \bigcup_{n=1}^{\infty} C_n} \ \sum_{n=1}^{\infty} \delta(C_n),
\]
i.e., the infimum is taken over all countable covers of $A$ (by $\delta(A)$ we denote the diameter of the set $A$; see (1.1)). If we adopt this definition, we see that $\lambda^1(A) < +\infty$, while we know that the length of $A$ is infinite. The reason for this is that in the definition of $\lambda^1(A)$, the covers of $A$ are not forced to follow the geometry of $A$. For this reason the Hausdorff $s$-dimensional measures $\mu^{(s)}(A)$ are defined as limits of outer measures $\mu^{(s)}_{\delta}$ which follow the local geometry of $A$ (see Definition 1.3.5).

As another illustrative example, consider the unit square $S$ in $\mathbb{R}^2$ (i.e., the square of side length equal to 1) and define

\[
\lambda^1(S) \stackrel{\mathrm{df}}{=} \inf_{S \subseteq \bigcup_{n=1}^{\infty} C_n} \ \sum_{n=1}^{\infty} \delta(C_n),
\]

i.e., again the infimum is taken over all countable covers of $S$. We observe that we can do no better than cover $S$ itself. Indeed, if we cover $S$ with smaller squares of diameter less than or equal to $\frac{1}{n}$, then we see that we need at least $n^2$ squares to achieve the covering and so the approximation of $\lambda^1(S)$ obtained this way exceeds $n\sqrt{2}$. So the smaller the squares we use to cover, the bigger the estimate for $\lambda^1(S)$. Therefore, small squares are irrelevant in the calculation of $\lambda^1(S)$ and yet it is precisely them that should have an influence on the evaluation of $\lambda^1(S)$. We expect $\lambda^1(S) = +\infty$, since the diameter is a one-dimensional concept and it is used to measure a square in $\mathbb{R}^2$, which is a two-dimensional object. For this we need a definition which takes into account the local geometry of the set under consideration.

In this chapter, in Section 1.1 we recall some basic definitions and facts from measure theory, which will be needed in what follows. In Section 1.2, we discuss some “covering theorems.” Covering results play a central role in geometric measure theory. In Section 1.3 we introduce and study Hausdorff measures and the Hausdorff dimension of sets. Among other things we calculate the Hausdorff dimension of some classical irregular sets in $\mathbb{R}$ (Cantor-like sets). From these calculations, the reader will realize that the Hausdorff measure and the Hausdorff dimension of sets (even of simple ones) may be hard to calculate. For this reason sometimes other notions may be more suitable (such as capacity; see Section 1.6). In Section 1.4 we discuss the differentiation of Hausdorff measures and derive the Lebesgue-Besicovitch differentiation theorem. In Section 1.5, using the tools of Hausdorff measures, we study the geometry of Lipschitz continuous functions. Among other things, we obtain the “area and coarea formulas” and the associated “change of variables formulas.” Finally in Section 1.6, we present an alternative analytical notion measuring small sets in $\mathbb{R}^N$, namely the $p$-capacity. We derive some basic properties of the $p$-capacities and compare them to the Hausdorff measures.
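The two illustrative examples from the beginning of this chapter (the curve $A$ and the unit square $S$) can be checked numerically. The following is only a minimal sketch: the function names `oscillation_lower_bound` and `grid_cover_sum` are hypothetical and not part of the text. The first function computes the length of the polygon through successive extrema of $t \mapsto \sin\frac{1}{t}$, which is a lower bound for the length of $A$. The second computes the sum of diameters for the cover of $S$ by $n^2$ squares of side $\frac{1}{n}$.

```python
import math


def oscillation_lower_bound(k_max: int) -> float:
    """Length of the polygon through the first k_max + 1 extrema of t -> sin(1/t).

    The extrema lie at t_k = 2 / ((2k + 1) * pi), where sin(1/t_k) = (-1)**k.
    An inscribed polygon is never longer than the curve, so the returned value
    is a lower bound for the length of A. Each segment joins heights +1 and -1,
    hence has length at least 2, and the bound grows without limit in k_max.
    """
    points = [(2.0 / ((2 * k + 1) * math.pi), (-1.0) ** k) for k in range(k_max + 1)]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def grid_cover_sum(n: int) -> float:
    """Sum of diameters for the cover of the unit square by n**2 squares of side 1/n.

    Each small square has diameter sqrt(2) / n, so the sum equals n * sqrt(2).
    """
    return n * n * (math.sqrt(2.0) / n)


# By contrast, the single set C_1 = A is a cover of A, and A lies in the
# rectangle (0, 1) x [-1, 1], so lambda^1(A) <= diam(A) <= sqrt(5).
single_set_bound = math.sqrt(5.0)
```

The lower bound on the length of $A$ exceeds $2k$ for every $k$, while the single-set cover keeps $\lambda^1(A)$ at most $\sqrt{5}$. This is the discrepancy that the refined definition in Section 1.3 is meant to remove.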

1.1 Measure Theoretical Background

In this section we recall some basic definitions and facts from measure theory, which we shall need in the sequel. Let us start with the concept of outer measure, which, when restricted to a suitable σ-field of sets, leads to a measure.

DEFINITION 1.1.1 Let $X$ be a set. A map $\mu \colon 2^X \longrightarrow [0, +\infty]$ is said to be an outer measure, if
(a) $\mu(\emptyset) = 0$;
(b) $A \subseteq B \implies \mu(A) \le \mu(B)$ (monotonicity);
(c) for any sequence of sets $\{A_n\}_{n \ge 1} \subseteq 2^X$, we have
\[
\mu\Big( \bigcup_{n=1}^{\infty} A_n \Big) \le \sum_{n=1}^{\infty} \mu(A_n)
\]
(subadditivity).
For a given outer measure $\mu$ on $X$ and $A \in 2^X$, we define the restriction of $\mu$ on $A$, denoted by $\mu \lfloor A$, by
\[
(\mu \lfloor A)(B) \stackrel{\mathrm{df}}{=} \mu(A \cap B) \qquad \forall\, B \in 2^X.
\]
We say that $\mu$ is a finite outer measure if $\mu(X) < +\infty$ (i.e., $\mu$ has values in $\mathbb{R}_+$).

REMARK 1.1.2 Note that $\mu \lfloor A$ is an outer measure on $X$, while we define $\mu|_A$ to be the restriction of $\mu$ (as a function) on $2^A$, i.e., $\mu|_A \colon 2^A \longrightarrow [0, +\infty]$ is defined by
\[
\big( \mu|_A \big)(B) \stackrel{\mathrm{df}}{=} \mu(B) \qquad \forall\, B \in 2^A \subseteq 2^X.
\]

Outer measures are useful because they lead to measures when restricted to suitably defined σ-fields. These σ-fields can be quite large.

DEFINITION 1.1.3 Let $X$ be a set and $\mu$ an outer measure on $X$. A set $A \in 2^X$ is said to be $\mu$-measurable, if
\[
\mu(B) = \mu(A \cap B) + \mu(B \setminus A) \qquad \forall\, B \in 2^X,
\]
i.e., $A$ “decomposes” every set $B$ additively.


REMARK 1.1.4 Let $X$ be a set and $\mu$ an outer measure on $X$.
(a) By virtue of the subadditivity property of an outer measure, to show that $A \in 2^X$ is $\mu$-measurable, it is enough to check that
\[
\mu(B) \ge \mu(A \cap B) + \mu(B \setminus A) \qquad \forall\, B \in 2^X.
\]
(b) Clearly, if $A \in 2^X$ and $\mu(A) = 0$, then $A$ is $\mu$-measurable.
(c) If $A \in 2^X$, then any $\mu$-measurable set is also $\mu \lfloor A$-measurable.
(d) $A$ is $\mu$-measurable if and only if $A^c = X \setminus A$ is $\mu$-measurable.

It is straightforward to check the following result.

PROPOSITION 1.1.5 If $X$ is a set and $\mu$ is an outer measure on $X$, then the collection $\Sigma_\mu$ of all $\mu$-measurable sets is a σ-field and $\mu$ restricted on $\Sigma_\mu$ is a measure.

REMARK 1.1.6 While Definition 1.1.3 involves only additivity of $\mu$, the conclusion in Proposition 1.1.5 is about σ-additivity of $\mu$ on $\Sigma_\mu$. This reveals the power of Definition 1.1.3. Note that from Remark 1.1.4(b), it follows that the σ-field $\Sigma_\mu$ is $\mu$-complete.

DEFINITION 1.1.7 Let $X$ be a nonempty Hausdorff topological space and let $\mu$ be an outer measure on $X$.
(a) Let $\mathcal{T} \subseteq 2^X$ be a family of subsets of $X$. We say that $\mu$ is $\mathcal{T}$-regular, if
\[
\mu(A) = \inf_{\substack{B \in \mathcal{T} \\ A \subseteq B}} \mu(B) \qquad \forall\, A \in 2^X.
\]
If $\mathcal{T} = \Sigma_\mu$, then we simply say that $\mu$ is regular.
(b) We say that $\mu$ is a Borel measure, if $\mathcal{B}(X) \subseteq \Sigma_\mu$ with $\mathcal{B}(X)$ being the Borel σ-field of $X$.
(c) We say that $\mu$ is a Borel regular measure, if $\mu$ is a Borel measure which is $\mathcal{B}(X)$-regular.
(d) We say that $\mu$ is a Radon measure, if $\mu$ is a Borel regular measure and
\[
\mu(K) < +\infty \qquad \forall\, K \subseteq X,\ K \text{ compact}.
\]
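On a finite set, Definition 1.1.3, Proposition 1.1.5 and the notion of regularity from Definition 1.1.7(a) can be explored by brute force. The sketch below is an illustration under stated assumptions, not an example from the text: the three-point set `X`, the outer measure `mu` (equal to 0 on the empty set, 2 on `X` and 1 otherwise) and all function names are hypothetical choices. For a finite set, countable subadditivity reduces to subadditivity for pairs of sets, which is what the check uses.

```python
from itertools import combinations

# Hypothetical three-point ground set used only for illustration.
X = frozenset({1, 2, 3})


def subsets(s):
    """All subsets of the finite set s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def mu(a):
    """An outer measure on X: 0 on the empty set, 2 on X, 1 on every other set."""
    if not a:
        return 0
    return 2 if a == X else 1


def is_outer_measure():
    """Check conditions (a)-(c) of Definition 1.1.1 on the finite power set."""
    power = subsets(X)
    monotone = all(mu(a) <= mu(b) for a in power for b in power if a <= b)
    subadditive = all(mu(a | b) <= mu(a) + mu(b) for a in power for b in power)
    return mu(frozenset()) == 0 and monotone and subadditive


def is_measurable(a):
    """Caratheodory condition: a splits every test set b additively."""
    return all(mu(b) == mu(a & b) + mu(b - a) for b in subsets(X))


# The collection Sigma_mu of all mu-measurable sets.
sigma_mu = [a for a in subsets(X) if is_measurable(a)]


def is_regular():
    """Regularity: every set has a measurable superset of the same outer measure."""
    return all(any(a <= b and mu(a) == mu(b) for b in sigma_mu) for a in subsets(X))
```

For this particular `mu` only the empty set and `X` turn out to be measurable, so $\Sigma_\mu$ is the trivial σ-field and `mu` is not regular: a singleton has outer measure 1, while its only measurable superset `X` has measure 2.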


REMARK 1.1.8 Let $X$ be a Hausdorff topological space and let $\mu$ be an outer measure on $X$.
(a) Note that $\mu$ is regular if and only if for every $A \in 2^X$ there exists $B \in \Sigma_\mu$ with $A \subseteq B$, such that $\mu(A) = \mu(B)$.
(b) If $\mu$ is regular on $X$ and $\{A_n\}_{n \ge 1} \subseteq 2^X$ is increasing (i.e., $A_n \subseteq A_{n+1}$ for $n \ge 1$), then
\[
\mu\Big( \bigcup_{n=1}^{\infty} A_n \Big) = \sup_{n \ge 1} \mu(A_n).
\]

PROPOSITION 1.1.9 If $X$ is a Hausdorff topological space, $\mu$ is an outer measure on $X$ which is Borel regular and $A \in \Sigma_\mu$ with $\mu(A) < +\infty$, then $\mu \lfloor A$ is a Radon measure.

PROOF Let
\[
\mu_1 \stackrel{\mathrm{df}}{=} \mu \lfloor A.
\]
Evidently $\Sigma_\mu \subseteq \Sigma_{\mu_1}$ and so $\mu_1$ is a Borel measure. Also for every compact $K \subseteq X$, we have $\mu_1(K) < +\infty$. It remains to show that $\mu_1$ is Borel regular. To this end note that since $\mu$ is Borel regular, for the given set $A$, we can find $B \in \mathcal{B}(X)$, $A \subseteq B$, such that $\mu(A) = \mu(B) < +\infty$. Because $A \in \Sigma_\mu$, from Definition 1.1.3, we have
\[
\mu(B \setminus A) = \mu(B) - \mu(A) = 0.
\]
Since $A \in \Sigma_\mu$, for every $C \in 2^X$, we have
\[
(\mu \lfloor B)(C) = \mu(B \cap C) = \mu(B \cap C \cap A) + \mu\big( (B \cap C) \setminus A \big) \le \mu(C \cap A) + \mu(B \setminus A) = \mu(C \cap A) = (\mu \lfloor A)(C).
\]
As $A \subseteq B$, we infer that $\mu \lfloor B = \mu \lfloor A$. So without any loss of generality, we may assume that $A \in \mathcal{B}(X)$. Let $C \in 2^X$. Since $\mu$ is Borel regular, we can find $D \in \mathcal{B}(X)$, such that
\[
A \cap C \subseteq D \quad \text{and} \quad \mu(A \cap C) = \mu(D)
\]
(see Remark 1.1.8(a)). Let us take
\[
E \stackrel{\mathrm{df}}{=} D \cup (X \setminus A).
\]
Evidently $E \in \mathcal{B}(X)$ and $C \subseteq (A \cap C) \cup (X \setminus A) \subseteq E$. Moreover, since $E \cap A = D \cap A$, we have
\[
\mu_1(E) = \mu(E \cap A) = \mu(D \cap A) \le \mu(D) = \mu(A \cap C) = \mu_1(C),
\]
so $\mu_1 = \mu \lfloor A$ is Borel regular (see Remark 1.1.8(a)), hence Radon.

We conclude this section by recalling the following basic measure theoretic approximations.

PROPOSITION 1.1.10 If $X$ is a Hausdorff topological space and $\mu$ is an outer measure on $X$ which is Borel, then
(a) if $A \in \mathcal{B}(X)$, $\mu(A) < +\infty$ and $\varepsilon > 0$, then we can find an open set $U_\varepsilon \supseteq A$ and a closed set $C_\varepsilon \subseteq A$, such that $\mu(U_\varepsilon \setminus C_\varepsilon) < \varepsilon$, i.e.,
\[
\mu(A) = \inf_{\substack{U \text{ open} \\ A \subseteq U}} \mu(U) = \sup_{\substack{C \text{ closed} \\ C \subseteq A}} \mu(C).
\]
(b) if $\mu$ is Radon, then for every $A \in 2^X$, we have
\[
\mu(A) = \inf_{\substack{U \text{ open} \\ A \subseteq U}} \mu(U)
\]
and if $A \in \Sigma_\mu$, then
\[
\mu(A) = \sup_{\substack{K \text{ compact} \\ K \subseteq A}} \mu(K).
\]

REMARK 1.1.11 Note that in the first part of Proposition 1.1.10(b), the set $A$ need not be $\mu$-measurable.

1.2 Covering Results

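One of the main tools in geometric measure theory is the so-called Vitali covering theorem. For a given sufficiently large family of sets that cover a given set $A$, Vitali’s covering theorem allows us to select a countable subfamily consisting of pairwise disjoint sets with exactly the desired approximation properties. The basic principle embodied in the proof of Vitali’s covering theorem is illustrated in the next proposition. In what follows for any subset $A$ of a metric space $(X, d_X)$, we define
\[
\delta(A) \stackrel{\mathrm{df}}{=} \operatorname{diam}(A) \stackrel{\mathrm{df}}{=} \sup_{x, y \in A} d_X(x, y), \tag{1.1}
\]
the diameter of $A$ (by convention $\operatorname{diam} \emptyset = 0$).

PROPOSITION 1.2.1 If $\mathcal{T}$ is a collection of nondegenerate balls in $\mathbb{R}^N$ with
\[
\sup_{B \in \mathcal{T}} \delta(B) < +\infty,
\]
then we can find a finite or countable subfamily $\mathcal{F}$ of $\mathcal{T}$ consisting of disjoint balls, such that
\[
\bigcup_{B \in \mathcal{T}} B \subseteq \bigcup_{B \in \mathcal{F}} \widehat{B},
\]
with $\widehat{B}$ being the ball concentric with $B$, but with radius five times the radius of $B$.

PROOF Let
\[
d_0 \stackrel{\mathrm{df}}{=} \sup_{B \in \mathcal{T}} \delta(B), \qquad
\mathcal{T}_n \stackrel{\mathrm{df}}{=} \Big\{ B \in \mathcal{T} : \frac{d_0}{2^n} < \delta(B) \le \frac{d_0}{2^{n-1}} \Big\} \qquad \forall\, n \ge 1.
\]
Inductively, we generate subfamilies $\mathcal{F}_n \subseteq \mathcal{T}_n$ for $n \ge 1$. Namely, let $\mathcal{F}_1$ be any maximal disjoint collection of balls in $\mathcal{T}_1$. Suppose we have selected $\mathcal{F}_1, \ldots, \mathcal{F}_m$. We choose $\mathcal{F}_{m+1}$ to be any maximal disjoint subfamily of
\[
\Big\{ B \in \mathcal{T}_{m+1} : B \cap B' = \emptyset \ \text{for all } B' \in \bigcup_{k=1}^{m} \mathcal{F}_k \Big\}
\]
and finally set
\[
\mathcal{F} \stackrel{\mathrm{df}}{=} \bigcup_{m=1}^{\infty} \mathcal{F}_m.
\]
Evidently $\mathcal{F} \subseteq \mathcal{T}$ and consists of disjoint balls.

Claim. For each $B \in \mathcal{T}$, we can find $B' \in \mathcal{F}$, such that $B \cap B' \ne \emptyset$ and $\delta(B) \le 2\delta(B')$ (so also $B \subseteq \widehat{B'}$).

For some $m \ge 1$, we have $B \in \mathcal{T}_m$. By virtue of the maximality of $\mathcal{F}_m$, we can find $B' \in \bigcup_{k=1}^{m} \mathcal{F}_k$ with $B \cap B' \ne \emptyset$. We have that
\[
\frac{d_0}{2^m} \le \delta(B') \quad \text{and} \quad \delta(B) \le \frac{d_0}{2^{m-1}}.
\]
So $\delta(B) \le 2\delta(B')$ and this proves the claim. From the claim it follows at once that
\[
\bigcup_{B \in \mathcal{T}} B \subseteq \bigcup_{B' \in \mathcal{F}} \widehat{B'}.
\]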
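For a finite family of balls, the selection procedure in the proof of Proposition 1.2.1 can be carried out directly. The following is a minimal sketch under stated assumptions: closed balls are represented as pairs `(center, radius)` with the Euclidean distance, a maximal disjoint subfamily inside each dyadic class is built greedily in input order, and all function names are hypothetical. The small tolerance in the containment test only guards against floating-point rounding.

```python
import math


def dyadic_class(diameter, d0):
    """Return n >= 1 with d0 / 2**n < diameter <= d0 / 2**(n - 1)."""
    n = 1
    while diameter <= d0 / 2 ** n:
        n += 1
    return n


def disjoint(ball_1, ball_2):
    """Closed balls are disjoint when the centers are farther apart than r1 + r2."""
    (c1, r1), (c2, r2) = ball_1, ball_2
    return math.dist(c1, c2) > r1 + r2


def select_disjoint_subfamily(balls):
    """Greedy version of the construction of F = F_1, F_2, ... for a finite family.

    Classes are processed in order of decreasing diameter range. Within a class,
    a ball is kept when it misses every ball kept so far, which yields a maximal
    disjoint subfamily of the admissible balls of that class.
    """
    d0 = max(2 * r for _, r in balls)
    classes = {}
    for ball in balls:
        classes.setdefault(dyadic_class(2 * ball[1], d0), []).append(ball)
    chosen = []
    for n in sorted(classes):
        for ball in classes[n]:
            if all(disjoint(ball, other) for other in chosen):
                chosen.append(ball)
    return chosen


def covered_by_enlargement(ball, chosen, factor=5.0):
    """True if the ball lies in some chosen ball enlarged by the given factor."""
    c, r = ball
    return any(math.dist(c, c2) + r <= factor * r2 + 1e-12 for c2, r2 in chosen)
```

Applied to any finite list of nondegenerate balls, `select_disjoint_subfamily` returns pairwise disjoint balls, and by the claim in the proof every ball of the original list passes `covered_by_enlargement` with the factor five.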

DEFINITION 1.2.2 Let $A \subseteq \mathbb{R}^N$. A collection $\mathcal{T}$ of sets in $\mathbb{R}^N$ is said to be a Vitali cover of $A$, if $A \subseteq \bigcup_{B \in \mathcal{T}} B$ and for every $x \in A$ and every $\varepsilon > 0$, there exists $B \in \mathcal{T}$, such that $x \in B$ and $0 < \delta(B) < \varepsilon$.

REMARK 1.2.3 Note that from the second requirement of the above definition it follows that
\[
\inf_{B \in \mathcal{T}} \delta(B) = 0.
\]
So $\mathcal{T}$ is a Vitali cover of a set $A$, if every point $x \in A$ is contained in an arbitrarily small element of $\mathcal{T}$.

As a straightforward consequence of Proposition 1.2.1 we obtain the following proposition.

PROPOSITION 1.2.4 If $A \subseteq \mathbb{R}^N$, $\mathcal{T}$ is a Vitali cover of $A$ consisting of closed balls, such that
\[
\sup_{B \in \mathcal{T}} \delta(B) < +\infty,
\]
then there exists a countable family $\mathcal{F} = \{B_n\}_{n \ge 1}$ consisting of disjoint balls from $\mathcal{T}$, such that for each $m \ge 1$, we have
\[
A \subseteq \bigcup_{n=1}^{m} B_n \cup \bigcup_{n=m+1}^{\infty} \widehat{B}_n,
\]
where $\widehat{B}_n$ is the closed ball concentric with $B_n$ and radius five times the radius of $B_n$.

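PROOF Let $\mathcal{F} = \{B_n\}_{n \ge 1}$ be as in the proof of Proposition 1.2.1 and fix $m \ge 1$. If $A \subseteq \bigcup_{n=1}^{m} B_n$, then we are done. Otherwise let $x \in A \setminus \bigcup_{n=1}^{m} B_n$. Since $\bigcup_{n=1}^{m} B_n$ is closed and $\mathcal{T}$ is a Vitali cover of $A$ consisting of closed balls, we can find $B \in \mathcal{T}$, such that
\[
x \in B \quad \text{and} \quad B \cap B_n = \emptyset \qquad \forall\, n \in \{1, \ldots, m\}.
\]
But from the claim in the proof of Proposition 1.2.1, we can find $B' \in \mathcal{F}$, such that
\[
B \subseteq \widehat{B'} \quad \text{and} \quad B \cap B' \ne \emptyset
\]
(so $B' \in \{B_n\}_{n=m+1}^{\infty}$), hence $x \in \bigcup_{n=m+1}^{\infty} \widehat{B}_n$.

Now we are ready to state and prove Vitali’s covering theorem. In what follows by $\lambda^N$ we denote the $N$-dimensional Lebesgue outer measure.

THEOREM 1.2.5 (Vitali Covering Theorem) If $A \subseteq \mathbb{R}^N$ with $0 < \lambda^N(A) < +\infty$ and $\mathcal{T}$ is a Vitali cover of $A$ consisting of closed sets, then we can find a sequence $\{C_n\}_{n \ge 1}$ of elements in $\mathcal{T}$, such that $C_n \cap C_m = \emptyset$ for $n \ne m$ and
\[
\lambda^N\Big( A \setminus \bigcup_{n=1}^{\infty} C_n \Big) = 0.
\]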

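PROOF Without any loss of generality, we can assume that there exists an open set $U \subseteq \mathbb{R}^N$ with $\lambda^N(U) < +\infty$ and
\[
C \subseteq U \qquad \forall\, C \in \mathcal{T}.
\]
We construct the sequence $\{C_n\}_{n \ge 1}$ inductively. Let $C_1 \in \mathcal{T}$. Suppose that $C_1, \ldots, C_n$ are disjoint sets in $\mathcal{T}$. If $A \subseteq \bigcup_{k=1}^{n} C_k$, then we are finished. If not, setting
\[
V_n \stackrel{\mathrm{df}}{=} U \setminus \bigcup_{k=1}^{n} C_k,
\]
we introduce
\[
\mathcal{T}_n \stackrel{\mathrm{df}}{=} \big\{ C \in \mathcal{T} : C \subseteq V_n \big\} \quad \text{and} \quad \delta_n \stackrel{\mathrm{df}}{=} \sup_{C \in \mathcal{T}_n} \lambda^N(C).
\]
Because $A \setminus \bigcup_{k=1}^{n} C_k \ne \emptyset$ and $\mathcal{T}$ is a Vitali cover of $A$, we see that $\mathcal{T}_n \ne \emptyset$ and so $\delta_n > 0$. We select
\[
C_{n+1} \in \mathcal{T}_n \quad \text{with} \quad \frac{\delta_n}{2} < \lambda^N(C_{n+1}).
\]
We continue this process. Then either at some finite step $n \ge 1$ we shall have $A \subseteq \bigcup_{k=1}^{n} C_k$, in which case the proof of the theorem is complete, or otherwise we produce a sequence $\{C_n\}_{n \ge 1} \subseteq \mathcal{T}$ of disjoint sets. Then we have
\[
\sum_{n=1}^{\infty} \lambda^N(C_n) = \lambda^N\Big( \bigcup_{n=1}^{\infty} C_n \Big) \le \lambda^N(U) < +\infty. \tag{1.2}
\]
For each $n \ge 1$ let $B_n$ be a ball with center in $C_n$ and radius equal to $3\delta(C_n)$. We claim that
\[
A \setminus \bigcup_{k=1}^{n} C_k \subseteq \bigcup_{k=n+1}^{\infty} B_k \qquad \forall\, n \ge 1. \tag{1.3}
\]
Let $x \in A \setminus \bigcup_{k=1}^{n} C_k$. Since $\mathcal{T}$ is a Vitali cover of $A$, we can find a set $C_x \in \mathcal{T}_n$, such that
\[
x \in C_x \quad \text{and} \quad \lambda^N(C_x) > 0.
\]
We shall show that $C_x \cap C_k \ne \emptyset$ for some $k > n$. Indeed, if this is not the case, then $\lambda^N(C_x) \le \delta_k$ for all $k \ge 1$, which contradicts the fact that
\[
0 \le \lim_{k \to +\infty} \delta_k \le \lim_{k \to +\infty} 2\lambda^N(C_{k+1}) = 0
\]
(recall the choice of $C_{k+1}$ and see (1.2)). Let $m > n$ be the smallest integer, such that $C_x \cap C_m \ne \emptyset$. Since $C_x \in \mathcal{T}_{m-1}$, we have
\[
\lambda^N(C_x) \le \delta_{m-1} < 2\lambda^N(C_m)
\]
and recalling the choice of $B_m$, also $C_x \subseteq B_m$. So we have proved (1.3). Then for any $n \ge 1$, we have
\[
\lambda^N\Big( A \setminus \bigcup_{k=1}^{\infty} C_k \Big) \le \lambda^N\Big( A \setminus \bigcup_{k=1}^{n} C_k \Big) \le \sum_{k=n+1}^{\infty} \lambda^N(B_k). \tag{1.4}
\]
Recalling that $B_k$ is a ball of radius $3\delta(C_k)$ and combining (1.2) and (1.4), we conclude that
\[
\lambda^N\Big( A \setminus \bigcup_{k=1}^{\infty} C_k \Big) = 0.
\]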


Vitali’s covering theorem may be difficult to digest at first, and it is probably necessary to see it in action several times before appreciating it. For this reason we present four simple applications from the classical analysis of functions of one variable. We start with a definition which establishes the notation for the various limits of the difference quotient that we shall use in the sequel. These derivates are often more useful than the ordinary derivative, since they are defined at every point.

DEFINITION 1.2.6 For a given function $f : [a, b] \to \mathbb{R}$, the upper right and lower right derivates of f at $x \in [a, b)$ are defined by
\[ D^+ f(x) \stackrel{df}{=} \limsup_{h \to 0^+} \frac{f(x+h) - f(x)}{h} \quad \text{and} \quad D_+ f(x) \stackrel{df}{=} \liminf_{h \to 0^+} \frac{f(x+h) - f(x)}{h} \]
respectively. Similarly the upper left and lower left derivates of f at $x \in (a, b]$ are defined by
\[ D^- f(x) \stackrel{df}{=} \limsup_{h \to 0^-} \frac{f(x+h) - f(x)}{h} \quad \text{and} \quad D_- f(x) \stackrel{df}{=} \liminf_{h \to 0^-} \frac{f(x+h) - f(x)}{h} \]

respectively.

REMARK 1.2.7 Evidently, the derivates of a function at a point may be infinite. The function f is differentiable at $x \in (a, b)$ if
\[ -\infty < D_+ f(x) = D^+ f(x) = D_- f(x) = D^- f(x) < +\infty. \]
The function f is differentiable at $x = a$ or at $x = b$ if the appropriate two derivates are finite and equal. Also the one-sided derivatives exist at a point x if $D_+ f(x) = D^+ f(x)$ and $D_- f(x) = D^- f(x)$. The derivates are also called Dini derivates and clearly we always have
\[ D_+ f(x) \leq D^+ f(x) \quad \forall\, x \in [a, b) \]
and
\[ D_- f(x) \leq D^- f(x) \quad \forall\, x \in (a, b]. \]


In the literature, sometimes we find the notion of a derived number for a function f at x. So $\beta \in \mathbb{R}^*$ is a derived number for f at x if there is a sequence $\{h_n\}_{n \geq 1} \subseteq \mathbb{R}$ such that $h_n \to 0$, $h_n \neq 0$ for all $n \geq 1$ and
\[ \lim_{n \to +\infty} \frac{f(x + h_n) - f(x)}{h_n} = \beta. \]

A function f may have many derived numbers at a point x. Of course f is differentiable at x if and only if all derived numbers of f at x agree and are finite.

EXAMPLE 1.2.8 Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) \stackrel{df}{=} \begin{cases} x \sin\dfrac{1}{x} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
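For this function the difference quotient at 0 is $\sin(1/h)$, so derived numbers at 0 can be produced explicitly. The sketch below is only illustrative: the helper names `quotient` and `derived_number_sequence` are hypothetical, and the sequence $h_n = 1/(\arcsin\beta + 2\pi n)$ is one convenient choice among many.

```python
import math


def quotient(h):
    """Difference quotient (f(0 + h) - f(0)) / h for f(x) = x*sin(1/x), f(0) = 0."""
    return (h * math.sin(1.0 / h) - 0.0) / h


def derived_number_sequence(beta, count=5):
    """Positive h_n -> 0 with sin(1/h_n) = beta, for beta in [-1, 1]."""
    theta = math.asin(beta)
    return [1.0 / (theta + 2.0 * math.pi * n) for n in range(1, count + 1)]


# along each sequence the difference quotients stay (numerically) equal to beta
for beta in (-1.0, -0.3, 0.0, 0.5, 1.0):
    values = [quotient(h) for h in derived_number_sequence(beta)]
    print(beta, values)
```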

We can check that $D_- f(0) = -1 < D^+ f(0) = 1$ and every number in $[-1, 1]$ is a derived number for f at 0. The function f is not of bounded variation (see Definition A.2.15(a)).

LEMMA 1.2.9 If $f : [a, b] \to \mathbb{R}$ is nondecreasing, then all four derivates of f are finite almost everywhere on [a, b].

PROOF

Clearly all derivates are nonnegative. So it suffices to show that
\[ D^+ f(x) < +\infty \quad \text{and} \quad D^- f(x) < +\infty \quad \text{for a.a. } x \in [a, b]. \]
Let
\[ A \stackrel{df}{=} \{ x \in [a, b] : D^+ f(x) = +\infty \} \]
and suppose that $\lambda^*(A) = \beta > 0$, where $\lambda^*$ is the Lebesgue outer measure on R. Let $M > 0$ be such that
\[ f(b) - f(a) < \frac{M\beta}{2}. \]
For every $x \in A$, we can find a sequence $\{h_n^x\}_{n \geq 1}$ with $h_n^x \searrow 0$, $h_n^x \neq 0$ for all $n \geq 1$, such that
\[ M \leq \frac{f(x + h_n^x) - f(x)}{h_n^x}. \]
The collection $\{[x, x + h_n^x]\}_{x \in A,\, n \geq 1}$ is a Vitali cover of A. By virtue of Vitali’s covering theorem (see Theorem 1.2.5), we can find a family of disjoint intervals $\{[x_n, x_n + h_n]\}_{n=1}^{m}$ such that
\[ \sum_{n=1}^{m} h_n > \frac{\beta}{2}. \]
Therefore
\[ \sum_{n=1}^{m} \big( f(x_n + h_n) - f(x_n) \big) \geq M \sum_{n=1}^{m} h_n > \frac{M\beta}{2} > f(b) - f(a), \]
a contradiction, since f is nondecreasing and the intervals are disjoint. This proves that $\lambda^*(A) = 0$ and so $D^+ f(x) < +\infty$ for almost all $x \in [a, b]$. Analogously we can prove that $D^- f(x) < +\infty$ for almost all $x \in [a, b]$.

Using this lemma and Vitali’s covering theorem, we can now prove that a nondecreasing function is differentiable almost everywhere on [a, b].

THEOREM 1.2.10 If $f : [a, b] \to \mathbb{R}$ is nondecreasing, then f is differentiable almost everywhere on [a, b].

PROOF For f to be differentiable at x, we must have that all four derivates at x are finite and equal. By virtue of Lemma 1.2.9, it suffices to show that all four derivates are equal almost everywhere. Let
\[ A \stackrel{df}{=} \{ x \in (a, b) : D_+ f(x) < D^+ f(x) \}. \]


We show that A is Lebesgue-null. The proof for the other combinations of derivates is similar. Suppose that $\lambda^*(A) > 0$. We can find rational numbers r, s such that the set
\[ B \stackrel{df}{=} \{ x \in A : D_+ f(x) < r < s < D^+ f(x) \} \]
satisfies $\lambda^*(B) = \beta > 0$. Let $\varepsilon \in (0, \beta)$. From the regularity of the Lebesgue outer measure $\lambda^*$, we know that there exists an open set $U \subseteq (a, b)$ such that
\[ B \subseteq U \quad \text{and} \quad \lambda^1(U) - \varepsilon < \beta. \]
For each $x \in B$ and $n \geq 1$, we can find $h_n^x > 0$ with $h_n^x \searrow 0$, such that
\[ [x, x + h_n^x] \subseteq U \quad \text{and} \quad \frac{f(x + h_n^x) - f(x)}{h_n^x} < r. \]
The family $\{[x, x + h_n^x]\}_{x \in B,\, n \geq 1}$ is a Vitali cover of B. By virtue of Vitali’s covering theorem (see Theorem 1.2.5), for the given $\varepsilon > 0$, we can find a disjoint subfamily $\{[x_n, x_n + h_n]\}_{n=1}^{m}$ of the Vitali cover such that
\[ \lambda^*\Big(B \setminus \bigcup_{n=1}^{m} [x_n, x_n + h_n]\Big) < \varepsilon. \]
We have
\[ \sum_{n=1}^{m} \big( f(x_n + h_n) - f(x_n) \big) < r \sum_{n=1}^{m} h_n \leq r\lambda^1(U) < r(\beta + \varepsilon). \]
Let us set
\[ C \stackrel{df}{=} B \cap \Big( \bigcup_{n=1}^{m} [x_n, x_n + h_n] \Big). \]
We have that
\[ \beta - \varepsilon < \lambda^*(C). \qquad (1.5) \]


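For every $y \in C$ and $k \geq 1$, we can find $u_k^y \in \big(y, y + \frac{1}{k}\big)$ such that
\[ \frac{f(u_k^y) - f(y)}{u_k^y - y} > s \quad \text{and} \quad [y, u_k^y] \subseteq (x_n, x_n + h_n) \quad \text{for some } n \in \{1, \dots, m\}. \]
The family $\{[y, u_k^y]\}_{y \in C,\, k \geq 1}$ is a Vitali cover of C. Invoking Vitali’s covering theorem (see Theorem 1.2.5), we can find a disjoint subfamily $\{[y_k, u_k]\}_{k=1}^{l}$ such that
\[ \lambda^*(C) - \varepsilon < \sum_{k=1}^{l} (u_k - y_k). \]
Hence
\[ \sum_{k=1}^{l} \big( f(u_k) - f(y_k) \big) > s \sum_{k=1}^{l} (u_k - y_k) > s\big( \lambda^*(C) - \varepsilon \big) > s(\beta - 2\varepsilon). \qquad (1.6) \]
For each $1 \leq n \leq m$, let
\[ J_n \stackrel{df}{=} \{ k \in \{1, \dots, l\} : [y_k, u_k] \subseteq (x_n, x_n + h_n) \}. \]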

Since f is nondecreasing, using (1.6) and (1.5), we have s(β − 2ε)

0. Because f is absolutely continuous, we can find $\delta > 0$ such that, if $\{(r_n, s_n)\}_{n=1}^{m}$ is a finite family of disjoint subintervals of [a, b] with
\[ \sum_{n=1}^{m} (s_n - r_n) < \delta, \]
then we have
\[ \sum_{n=1}^{m} \big| f(s_n) - f(r_n) \big| < \varepsilon. \]

We introduce the family ¯ ¯ ½ ¾ ¯ f (y) − f (x) ¯ df ¯ 0, A ⊆ [a, b] and at each point x ∈ A there exists a derived number β (see Remark 1.2.7), such that β 0, we can find a bounded open set U ⊆ R, such that A⊆U

and

λ1 (U ) − ε < λ∗ (A).

If $x \in A$, then by hypothesis we can find a sequence $\{h_n\}_{n \geq 1} \subseteq \mathbb{R} \setminus \{0\}$ such that $h_n \to 0$,
\[ [x, x + h_n] \subseteq U \quad \forall\, n \geq 1 \]
(or $[x + h_n, x] \subseteq U$ in the event $h_n < 0$; but in the sequel for simplicity we shall write $[x, x + h_n]$ for both cases) and
\[ \frac{f(x + h_n) - f(x)}{h_n} < r \quad \forall\, n \geq 1. \qquad (1.7) \]
For all $n \geq 1$ and $x \in A$, let
\[ D_n(x) \stackrel{df}{=} [x, x + h_n], \qquad E_n(x) \stackrel{df}{=} [f(x), f(x + h_n)]. \]
Because f is strictly increasing, $E_n(x)$ is a nondegenerate, closed interval and
\[ f\big(D_n(x)\big) \subseteq E_n(x) \quad \forall\, n \geq 1,\ x \in A. \]
Since
\[ \lambda^1\big(D_n(x)\big) = |h_n| \quad \text{and} \quad \lambda^1\big(E_n(x)\big) = \big| f(x + h_n) - f(x) \big|, \]
from (1.7) we have
\[ \lambda^1\big(E_n(x)\big) < r\lambda^1\big(D_n(x)\big). \qquad (1.8) \]


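Passing to the limit as $n \to +\infty$, we have $|h_n| \to 0$ and so from (1.8), we obtain that
\[ \lim_{n \to +\infty} \lambda^1\big(E_n(x)\big) = 0. \]
Let
\[ T \stackrel{df}{=} \{E_n(x)\}_{x \in A,\, n \geq 1}. \]
Then T is a Vitali cover of the set f(A). So Vitali’s covering theorem (see Theorem 1.2.5) implies the existence of a disjoint sequence $\{E_{n_k}(x_k)\}_{k \geq 1} \subseteq T$ such that
\[ \lambda^1\Big( f(A) \setminus \bigcup_{k=1}^{\infty} E_{n_k}(x_k) \Big) = 0. \qquad (1.9) \]
Using (1.9) and (1.8), it follows that
\[ \lambda^*\big(f(A)\big) \leq \lambda^1\Big( \bigcup_{k=1}^{\infty} E_{n_k}(x_k) \Big) = \sum_{k=1}^{\infty} \lambda^1\big(E_{n_k}(x_k)\big) < r \sum_{k=1}^{\infty} \lambda^1\big(D_{n_k}(x_k)\big). \qquad (1.10) \]
Since f is strictly increasing, we see that $\{D_{n_k}(x_k)\}_{k \geq 1}$ are pairwise disjoint too. So we have
\[ \sum_{k=1}^{\infty} \lambda^1\big(D_{n_k}(x_k)\big) = \lambda^1\Big( \bigcup_{k=1}^{\infty} D_{n_k}(x_k) \Big) \leq \lambda^1(U) \leq \lambda^*(A) + \varepsilon. \qquad (1.11) \]
From (1.10) and (1.11), we infer that
\[ \lambda^*\big(f(A)\big) \leq r\big( \lambda^*(A) + \varepsilon \big). \]
Let $\varepsilon \searrow 0$ to conclude that
\[ \lambda^*\big(f(A)\big) \leq r\lambda^*(A). \qquad (1.12) \]
In a similar fashion, we can have the following comparison result.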


THEOREM 1.2.15 If $f : [a, b] \to \mathbb{R}$ is a strictly increasing function, $s > 0$, $A \subseteq [a, b]$ and at each point $x \in A$ there exists a derived number $\gamma$ such that $\gamma > s$, then
\[ \lambda^*\big(f(A)\big) \geq s\lambda^*(A). \]
The final application of Vitali’s covering theorem (see Theorem 1.2.5) is the following criterion for measurability of sets in R.

THEOREM 1.2.16 If F is any collection of intervals in R and
\[ A = \bigcup_{D \in F} D, \]
then A is Lebesgue measurable.

PROOF Let T be the collection of all intervals E such that $E \subseteq D$ for some $D \in F$. Evidently T is a Vitali cover of A and so by Vitali’s covering theorem (see Theorem 1.2.5), we can find a sequence $\{E_n\}_{n \geq 1}$ of disjoint elements in T such that
\[ \lambda^*\Big( A \setminus \bigcup_{n=1}^{\infty} E_n \Big) = 0. \]
Because each $E_n \subseteq A$, the set
\[ A = \bigcup_{n=1}^{\infty} E_n \cup \Big( A \setminus \bigcup_{n=1}^{\infty} E_n \Big) \]
is Lebesgue measurable.

REMARK 1.2.17 Theorem 1.2.16 can be used to show that the upper and lower derivates of an arbitrary function are measurable. In particular then the four derivates of a measurable function are measurable and so is the derivative of a measurable function. We will not go into that here.


When $\lambda^N$ is replaced by an arbitrary Radon measure $\mu$ on $\mathbb{R}^N$, there is no systematic way to control $\mu(\widehat{B})$ in terms of $\mu(B)$. So the proof of Vitali’s covering theorem (see Theorem 1.2.5), which uses the principle involved in Proposition 1.2.1, namely the use of suitable expansions of balls, does not work. We therefore need an analog of Proposition 1.2.1 which does not require enlarging the balls. This is provided by the so-called “Besicovitch covering theorem.”

THEOREM 1.2.18 (Besicovitch Covering Theorem) If F is any collection of closed balls in $\mathbb{R}^N$,
\[ \sup_{B \in F} \delta(B) < +\infty \]
and A is the set of centers of all balls $B \in F$, then there exist a positive integer $k = k(N) \geq 1$ and
\[ T_n \subseteq F \quad \forall\, n \in \{1, \dots, k\}, \]
such that each $T_n$ is a countable collection of disjoint balls in F and
\[ A \subseteq \bigcup_{n=1}^{k} \bigcup_{B \in T_n} B. \]
Using the above theorem, we can have the following counterpart of Vitali’s covering theorem (see Theorem 1.2.5).

THEOREM 1.2.19 If $\mu$ is a Borel measure on $\mathbb{R}^N$, T is a family of nondegenerate closed balls in $\mathbb{R}^N$, A is the set of centers of balls in T, $\mu(A) < +\infty$,
\[ \inf_{B_r(a) \in T} r = 0 \quad \forall\, a \in A \]
and $U \subseteq \mathbb{R}^N$ is an open set, then there exists a countable collection F of disjoint balls from T such that
\[ \bigcup_{B \in F} B \subseteq U \quad \text{and} \quad \mu\Big( (A \cap U) \setminus \bigcup_{B \in F} B \Big) = 0. \]


1.3 Hausdorff Measure and Hausdorff Dimension

Hausdorff measures were introduced as certain lower dimensional measures on $\mathbb{R}^N$ which allow us to measure “small” subsets of $\mathbb{R}^N$. The Hausdorff measure and the associated Hausdorff dimension of a set provide a more delicate sense of the size of a set in $\mathbb{R}^N$ than the Lebesgue measure provides. We start with the introduction of a special class of outer measures, known as metric outer measures.

DEFINITION 1.3.1 Let $(X, d_X)$ be a metric space ($d_X$ is the metric function).

(a) If $A, B \subseteq X$, then we say that A and B are separated sets if
\[ d_X(A, B) \stackrel{df}{=} \inf_{a \in A,\, b \in B} d_X(a, b) > 0. \]
(b) If $\mu$ is an outer measure on X, then we say that $\mu$ is a metric outer measure if
\[ \mu(A \cup B) = \mu(A) + \mu(B) \quad \forall\, A, B \subseteq X,\ A \text{ and } B \text{ separated}. \]

We show that if $\mu$ is a metric outer measure, then $B(X) \subseteq \Sigma(\mu)$, i.e., $\mu$ is Borel. To this end we need the following auxiliary result, known as Carathéodory’s lemma. In what follows (X, d) is a metric space.

LEMMA 1.3.2 (Carathéodory Lemma) If $\mu$ is a metric outer measure on X, $U \subseteq X$ is an open subset, $U \neq X$, $A \subseteq U$ and
\[ A_n \stackrel{df}{=} \Big\{ x \in A : d(x, U^c) \geq \frac{1}{n} \Big\} \quad \forall\, n \geq 1, \qquad (1.13) \]
then $\mu(A) = \lim_{n \to +\infty} \mu(A_n)$.

PROOF Note that the sequence $\{A_n\}_{n \geq 1}$ is an increasing sequence and so $\lim_{n \to +\infty} \mu(A_n)$ exists. Moreover, since $A_n \subseteq A$ for $n \geq 1$, we have
\[ \lim_{n \to +\infty} \mu(A_n) \leq \mu(A). \]
So we need to show that
\[ \mu(A) \leq \lim_{n \to +\infty} \mu(A_n). \qquad (1.14) \]


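Because U is open, we have
\[ d(x, U^c) > 0 \quad \forall\, x \in A \]
and so for each $x \in A$ we can find $n_0 \geq 1$ large enough so that $x \in A_{n_0}$. Therefore, we have
\[ A = \bigcup_{n=1}^{\infty} A_n. \]
For each $n \geq 1$, we introduce the set
\[ C_n \stackrel{df}{=} A_{n+1} \setminus A_n = \Big\{ x \in A : \frac{1}{n+1} \leq d(x, U^c) < \frac{1}{n} \Big\}. \]
We have
\[ A = A_{2n} \cup \bigcup_{k=2n}^{\infty} C_k = A_{2n} \cup \bigcup_{k=n}^{\infty} C_{2k} \cup \bigcup_{k=n}^{\infty} C_{2k+1} \]
and from the subadditivity of $\mu$, it follows that
\[ \mu(A) \leq \mu(A_{2n}) + \sum_{k=n}^{\infty} \mu(C_{2k}) + \sum_{k=n}^{\infty} \mu(C_{2k+1}). \qquad (1.15) \]
If both series are convergent, then their tails tend to zero as $n \to +\infty$ and we obtain (1.14). So suppose that this is not true and, say, we have
\[ \sum_{k=1}^{\infty} \mu(C_{2k}) = +\infty. \qquad (1.16) \]
Note that
\[ d\big(C_{2k}, C_{2k+2}\big) \geq \frac{1}{2k+1} - \frac{1}{2k+2} \quad \forall\, k \geq 1 \]
and so the sets $\{C_{2k}\}_{k \geq 1}$ are separated. Therefore, we have
\[ \mu\Big( \bigcup_{k=1}^{n-1} C_{2k} \Big) = \sum_{k=1}^{n-1} \mu(C_{2k}) \quad \forall\, n \geq 1. \qquad (1.17) \]
Note that
\[ \bigcup_{k=1}^{n-1} C_{2k} \subseteq A_{2n} \quad \forall\, n \geq 1 \]
and so
\[ \mu\Big( \bigcup_{k=1}^{n-1} C_{2k} \Big) \leq \mu(A_{2n}) \quad \forall\, n \geq 1. \qquad (1.18) \]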


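From (1.17) and (1.18), it follows that
\[ \sum_{k=1}^{n-1} \mu(C_{2k}) \leq \mu(A_{2n}). \]
Combining this with (1.16), we infer that
\[ \lim_{n \to +\infty} \mu(A_{2n}) = +\infty \]
and so
\[ \mu(A) \leq \lim_{n \to +\infty} \mu(A_{2n}), \]
as desired. The case $\sum_{k=1}^{\infty} \mu(C_{2k+1}) = +\infty$ is treated similarly.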

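THEOREM 1.3.3 If $\mu$ is an outer measure on X, then $B(X) \subseteq \Sigma(\mu)$ (i.e., $\mu$ is Borel) if and only if $\mu$ is a metric outer measure.

PROOF “$\Longrightarrow$”: Let $A_1, A_2 \subseteq X$ be separated sets and let us set
\[ \beta \stackrel{df}{=} d(A_1, A_2) > 0. \]
For every $x \in A_1$, we define
\[ U(x) \stackrel{df}{=} B_{\beta/2}(x) = \Big\{ y \in X : d(y, x) < \frac{\beta}{2} \Big\} \quad \text{and} \quad U \stackrel{df}{=} \bigcup_{x \in A_1} U(x). \]
Evidently U is open, $A_1 \subseteq U$ and $A_2 \cap U = \emptyset$. Since by hypothesis $U \in \Sigma(\mu)$, we have that
\[ \mu(A_1 \cup A_2) = \mu\big( (A_1 \cup A_2) \cap U \big) + \mu\big( (A_1 \cup A_2) \cap U^c \big). \qquad (1.19) \]
Because $A_1 \subseteq U$ and $A_2 \cap U = \emptyset$, from (1.19) it follows that
\[ \mu(A_1 \cup A_2) = \mu(A_1) + \mu(A_2), \]
i.e., $\mu$ is a metric outer measure.

“$\Longleftarrow$”: It suffices to show that $\Sigma(\mu)$ contains all closed sets. So let $C \subseteq X$ be closed and let us set $U \stackrel{df}{=} C^c$. Let $D \subseteq X$, $A \stackrel{df}{=} D \setminus C$ and let $\{A_n\}_{n \geq 1}$ be an increasing sequence of subsets of A as in Lemma 1.3.2. Then
\[ d(A_n, C) \geq \frac{1}{n} \quad \forall\, n \geq 1 \]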


and, from Lemma 1.3.2, we have
\[ \mu(D \setminus C) = \mu(A) = \lim_{n \to +\infty} \mu(A_n). \qquad (1.20) \]
Since by hypothesis $\mu$ is a metric outer measure and the sets $\{A_n\}_{n \geq 1}$ are separated from C, we have
\[ \mu(D) \geq \mu\big( (D \cap C) \cup A_n \big) = \mu(D \cap C) + \mu(A_n) \quad \forall\, n \geq 1. \]
Passing to the limit as $n \to +\infty$ and using (1.20), we obtain
\[ \mu(D) \geq \mu(D \cap C) + \mu(D \setminus C). \]
The reverse inequality is always true (subadditivity). So we obtain
\[ \mu(D) = \mu(D \cap C) + \mu(D \setminus C) \quad \forall\, D \subseteq X. \]
Thus $C \in \Sigma(\mu)$ and hence $B(X) \subseteq \Sigma(\mu)$.

To introduce the concept of Hausdorff measure, we shall need the following notion. Recall that by (X, d) we denote a metric space.

DEFINITION 1.3.4 A sequence $\{A_n\}_{n \geq 1}$ of subsets of X is a $\delta$-cover of a set C if
\[ C \subseteq \bigcup_{n=1}^{\infty} A_n \quad \text{and} \quad \delta(A_n) \leq \delta \quad \forall\, n \geq 1. \]
By $T_\delta(C)$ we denote the family of all $\delta$-covers of the set C.

Using this notion, we can introduce the Hausdorff s-dimensional measure, $s \geq 0$. As usual, for any $A \subseteq X$,
\[ \delta(A) \stackrel{df}{=} \operatorname{diam}(A) = \sup_{x, y \in A} d(x, y) \]
is the diameter of A (by convention $\operatorname{diam} \emptyset \stackrel{df}{=} 0$).

DEFINITION 1.3.5 For any $s \geq 0$, $0 < \delta \leq +\infty$ and $C \subseteq X$, we define
\[ \mu_\delta^{(s)}(C) \stackrel{df}{=} \inf_{\{A_n\}_{n \geq 1} \in T_\delta(C)} \sum_{n=1}^{\infty} \delta(A_n)^s \]
(as always we use the convention that $\inf \emptyset = +\infty$). The Hausdorff s-dimensional outer measure $\mu^{(s)}$ is defined by
\[ \mu^{(s)}(C) \stackrel{df}{=} \lim_{\delta \searrow 0} \mu_\delta^{(s)}(C) = \sup_{\delta > 0} \mu_\delta^{(s)}(C). \]


REMARK 1.3.6 It is easily seen that $\mu^{(s)}$ is an outer measure. Moreover, it is a metric outer measure. Indeed, if $\delta > 0$ is less than the positive distance of two separated sets A and C, then no set belonging to a cover in $T_\delta(A \cup C)$ can intersect both A and C and so it follows that
\[ \mu_\delta^{(s)}(A \cup C) = \mu_\delta^{(s)}(A) + \mu_\delta^{(s)}(C). \]
Letting $\delta \searrow 0$, we obtain the same equality for $\mu^{(s)}$. In addition, by Theorem 1.3.3, $\mu^{(s)}$ is Borel. The restriction of $\mu^{(s)}$ to $\Sigma\big(\mu^{(s)}\big)$ is called the Hausdorff s-dimensional measure. Sometimes it is convenient to consider $\delta$-covers consisting of open or alternatively closed sets. In these cases, although a different value of $\mu_\delta^{(s)}$ may be attained for $\delta > 0$, the limit $\mu^{(s)}$ as $\delta \searrow 0$ is the same (see Davies (1970)). However, the limit $\mu^{(s)}$ is different if we restrict ourselves to $\delta$-covers by balls (see Besicovitch (1928)). In this case the resulting Hausdorff measure is called the spherical Hausdorff measure. Finally, if $X = \mathbb{R}^N$, it is easy to see that $\mu^{(s)}$ remains the same if we consider $\delta$-covers consisting only of convex sets.

Next we show that for any set $C \subseteq X$, there is a critical value $s_0$ such that for $s > s_0$ the corresponding Hausdorff s-dimensional measure of C is zero, while for $s < s_0$ the Hausdorff s-dimensional measure of C is infinite.

THEOREM 1.3.7 If $A \subseteq \mathbb{R}^N$ and $0 \leq s < t < +\infty$, then
(a) if $\mu^{(s)}(A) < +\infty$, then $\mu^{(t)}(A) = 0$;
(b) if $\mu^{(t)}(A) > 0$, then $\mu^{(s)}(A) = +\infty$.

PROOF (a) Let $\mu^{(s)}(A) < +\infty$ and $t > s$. Let $\{A_n\}_{n \geq 1} \in T_{1/m}(A)$. Then for any $n \geq 1$, we have
\[ \frac{\delta(A_n)^t}{\delta(A_n)^s} = \delta(A_n)^{t-s} \leq \Big( \frac{1}{m} \Big)^{t-s}, \]
so
\[ \mu_{1/m}^{(t)}(A) \leq \sum_{n=1}^{\infty} \delta(A_n)^t \leq \Big( \frac{1}{m} \Big)^{t-s} \sum_{n=1}^{\infty} \delta(A_n)^s \]
and thus
\[ \mu_{1/m}^{(t)}(A) \leq \Big( \frac{1}{m} \Big)^{t-s} \mu_{1/m}^{(s)}(A). \]
Letting $m \to +\infty$, we obtain $\mu^{(t)}(A) = 0$.
(b) Let $\mu^{(t)}(A) > 0$ and $s < t$. Assuming that $\mu^{(s)}(A) < +\infty$, from (a), we get that $\mu^{(t)}(A) = 0$, a contradiction.
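The estimate in part (a) can be observed on a concrete cover. The sketch below (the function name `uniform_cover_sum` is hypothetical) evaluates $\sum_n \delta(A_n)^s$ for the cover of [0, 1] by m closed intervals of length 1/m. These sums are only upper bounds for $\mu_{1/m}^{(s)}([0,1])$, so the code illustrates the scaling factor $(1/m)^{t-s}$ and nothing more.

```python
def uniform_cover_sum(m, s):
    """Sum of diam**s over the cover of [0, 1] by m closed intervals of length 1/m."""
    return m * (1.0 / m) ** s


# the sum equals m**(1 - s): it tends to 0 for s > 1 and grows without bound for s < 1
for s in (0.5, 1.0, 1.5):
    print(s, [uniform_cover_sum(m, s) for m in (10, 100, 1000)])
```

For this particular cover the inequality of part (a) holds with equality: the sum for the exponent t is exactly $(1/m)^{t-s}$ times the sum for the exponent s.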


This theorem leads to the following definition.

DEFINITION 1.3.8 Let $C \subseteq X$. If there is no $s > 0$ such that $\mu^{(s)}(C) = +\infty$, then $\dim C \stackrel{df}{=} 0$. Otherwise, let
\[ \dim C \stackrel{df}{=} \sup \big\{ s > 0 : \mu^{(s)}(C) = +\infty \big\}. \]
Then $\dim C$ is called the Hausdorff dimension of C.

Consider the Cantor ternary set C. It is well known that C is a nonempty, bounded, nowhere dense, perfect set in R which has Lebesgue measure zero. So the Lebesgue measure can contribute no additional information concerning the size of C. On the other hand, as we shall see, the Hausdorff dimension provides a more delicate sense of size.

PROPOSITION 1.3.9 If $C \subseteq [0, 1]$ is the Cantor ternary set, then $\dim C = \dfrac{\ln 2}{\ln 3}$.

PROOF We start with two simple observations concerning the Hausdorff s-dimensional outer measure $\mu^{(s)}$ on R. First note that $\mu^{(s)}$ is translation invariant, namely
\[ \mu^{(s)}(A) = \mu^{(s)}(A + x) \quad \forall\, A \subseteq \mathbb{R},\ x \in \mathbb{R} \]
(here $A + x \stackrel{df}{=} \{a + x : a \in A\}$). Second, $\mu^{(s)}$ is s-positive homogeneous, i.e.,
\[ \mu^{(s)}(\vartheta A) = \vartheta^s \mu^{(s)}(A) \quad \forall\, \vartheta > 0. \]
In the construction of C we start by removing from [0, 1] the open middle third $\big(\frac{1}{3}, \frac{2}{3}\big)$. The resulting set consists of two closed intervals $\big[0, \frac{1}{3}\big]$ and $\big[\frac{2}{3}, 1\big]$. Let
\[ C^1 \stackrel{df}{=} C \cap \Big[0, \frac{1}{3}\Big] \quad \text{and} \quad C^2 \stackrel{df}{=} C \cap \Big[\frac{2}{3}, 1\Big]. \]
Evidently $C^1$ and $C^2$ are translates of a multiple (by $\frac{1}{3}$) of C. So we have
\[ \mu^{(s)}(C) = \mu^{(s)}(C^1 \cup C^2) = \mu^{(s)}(C^1) + \mu^{(s)}(C^2) = 2\mu^{(s)}(C^2) = 2\Big(\frac{1}{3}\Big)^s \mu^{(s)}(C) \qquad (1.21) \]
(see Remark 1.3.6 and the observations in the beginning of this proof). From (1.21), it follows that
\[ \mu^{(s)}(C) = 0 \quad \text{or} \quad \mu^{(s)}(C) = +\infty \quad \text{or} \quad 2\Big(\frac{1}{3}\Big)^s = 1. \]


From the last possibility, it follows that $s = \dfrac{\ln 2}{\ln 3}$. If we can show that $0 < \mu^{(s)}(C) < +\infty$, then $s = \dfrac{\ln 2}{\ln 3}$ is the Hausdorff dimension of C (see Theorem 1.3.7). First we show that $\mu^{(s)}(C) > 0$. Note that
\[ d(C^1, C^2) \geq \frac{1}{3}. \]
Let $\delta \leq \frac{1}{3}$. Then any collection $\{A_n\}_{n \geq 1} \in T_\delta(C)$ (which can be taken to consist of open intervals; see Remark 1.3.6) can be decomposed into two subcollections of intervals $\{A_{n,1}\}_{n \geq 1} \in T_\delta(C^1)$ and $\{A_{n,2}\}_{n \geq 1} \in T_\delta(C^2)$ such that
\[ \sum_{n=1}^{\infty} \delta(A_n)^s = \sum_{n=1}^{\infty} \delta(A_{n,1})^s + \sum_{n=1}^{\infty} \delta(A_{n,2})^s. \qquad (1.22) \]
In the right hand side of (1.22) suppose that the first sum is smaller than the second. Because $C^2$ is a translate of $C^1$, the same translation when applied to the intervals $\{A_{n,1}\}_{n \geq 1}$ gives a subcollection $\{A'_{n,1}\}_{n \geq 1} \in T_\delta(C^2)$. Also from $\{A_{n,1}\}_{n \geq 1}$ we can produce in a similar way a collection $\{A'_n\}_{n \geq 1}$ covering C such that
\[ \delta(A'_n) = 3\delta(A'_{n,1}) \quad \forall\, n \geq 1. \qquad (1.23) \]
Then, from (1.23) and the choice of s, we have
\[ \sum_{n=1}^{\infty} \delta(A_n)^s \geq \sum_{n=1}^{\infty} \delta(A_{n,1})^s + \sum_{n=1}^{\infty} \delta(A'_{n,1})^s = 2 \sum_{n=1}^{\infty} \delta(A'_{n,1})^s = 2 \sum_{n=1}^{\infty} \Big(\frac{1}{3}\Big)^s \delta(A'_n)^s = \sum_{n=1}^{\infty} \delta(A'_n)^s. \]
If any one of the intervals $\{A'_n\}_{n \geq 1}$ has length bigger than or equal to $\frac{1}{3}$, we have
\[ \sum_{n=1}^{\infty} \delta(A_n)^s \geq \Big(\frac{1}{3}\Big)^s = \frac{1}{2}. \]
Because C is compact, we can use only finite coverings and so $\min_{n \geq 1} \delta(A_n) > 0$. The intervals $\{A'_n\}_{n \geq 1}$ are multiples (by (1.23)) of a subfamily of the intervals $\{A_n\}_{n \geq 1}$, hence we have
\[ \min_{n \geq 1} \delta(A'_n) \geq 3 \min_{n \geq 1} \delta(A_n). \]


If every interval $A'_n$ has length (diameter) less than $\frac{1}{3}$, we can apply the same process to the cover $\{A'_n\}_{n \geq 1}$. After a finite number of such steps, we produce a cover $\{A''_n\}_{n \geq 1}$ such that
\[ \max_{n \geq 1} \delta(A''_n) \geq \frac{1}{3} \quad \text{and} \quad \sum_{n=1}^{\infty} \delta(A_n)^s \geq \sum_{n=1}^{\infty} \delta(A''_n)^s, \]
so
\[ \sum_{n=1}^{\infty} \delta(A_n)^s \geq \Big(\frac{1}{3}\Big)^s = \frac{1}{2} \]
and thus $0 < \mu^{(s)}(C)$.

Next we show that $\mu^{(s)}(C) < +\infty$. Let $\{A_n\}_{n \geq 1} \in T_\delta(C)$ consist of open intervals. From this family, as above, we obtain covers $\{A_{n,k}\}_{n \geq 1}$ of $C^k$ for $k \in \{1, 2\}$ such that
\[ \delta(A_{n,k}) \leq \frac{\delta}{3} \quad \forall\, n \geq 1. \]
Again from the choice of s, we have
\[ \delta(A_n)^s = \delta(A_{n,1})^s + \delta(A_{n,2})^s, \]
so
\[ \mu_\delta^{(s)}(C) \geq \mu_{\delta/3}^{(s)}(C). \]
Because $\mu_\delta^{(s)}$ is nonincreasing in $\delta > 0$, we infer that $\mu_\delta^{(s)}(C)$ is independent of $\delta > 0$. So we can take an open interval of length greater than 1 as an open cover of C and conclude that $\mu^{(s)}(C) \leq 1$. This proves that
\[ \dim C = \frac{\ln 2}{\ln 3}. \]
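The critical role of the exponent $s = \ln 2 / \ln 3$ can be seen numerically on the natural covers of C by the $2^n$ closed intervals of length $3^{-n}$ left after n steps of the construction. This is only an illustrative sketch (the names `cantor_intervals` and `cover_sum` are hypothetical), and the computed sums are upper bounds for $\mu_\delta^{(s)}(C)$ with $\delta = 3^{-n}$, not the measure itself.

```python
import math

S_CRIT = math.log(2) / math.log(3)


def cantor_intervals(level):
    """Closed intervals remaining after `level` steps of the ternary construction."""
    intervals = [(0.0, 1.0)]
    for _ in range(level):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3.0
            refined.append((a, a + third))   # left closed third
            refined.append((b - third, b))   # right closed third
        intervals = refined
    return intervals


def cover_sum(intervals, s):
    """Sum of diam**s over a cover by intervals."""
    return sum((b - a) ** s for a, b in intervals)


# the sum equals (2 / 3**s)**level: constant 1 at s = S_CRIT,
# decaying for larger s and growing for smaller s
for s in (0.5, S_CRIT, 0.8):
    print(s, [cover_sum(cantor_intervals(n), s) for n in (2, 4, 8)])
```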

One can show that for every $\xi \in [0, 1]$, there exists a set $A \subseteq \mathbb{R}$ such that $\dim A = \xi$. This can be done using Cantor-like sets. These are sets which share most of the properties of the Cantor ternary set, but need not be Lebesgue-null. We can construct a Cantor-like set as follows. We start with the interval [0, 1] and proceed inductively. We remove an open interval $B_{1,1}$ centered at $\frac{1}{2}$ with length less than 1. We are left with closed intervals $D_{1,1}$ and $D_{1,2}$, each with length less than $\frac{1}{2}$. At the n-th step of this process we are left with closed intervals $D_{n,1}, D_{n,2}, \dots, D_{n,2^n}$, each with length less than $\frac{1}{2^n}$. In the (n+1)-st step, from each closed interval $D_{n,k}$ we remove an open interval $E_{n+1,k}$ having the same center as $D_{n,k}$ and length less than the length of $D_{n,k}$. We set
\[ S_n \stackrel{df}{=} \bigcup_{k=1}^{2^n} D_{n,k} \quad \text{and} \quad S \stackrel{df}{=} \bigcap_{n=1}^{\infty} S_n. \]
The set S is a Cantor-like set. It is known (see Hewitt & Stromberg (1975, p. 71)) that S is nonempty, compact, nowhere dense and perfect (just as the Cantor ternary set). However, unlike the Cantor ternary set, S need not be Lebesgue-null. More precisely, consider a sequence $\{\vartheta_n\}_{n \geq 1}$ of positive numbers such that
\[ 1 > 2\vartheta_1 > 4\vartheta_2 > \dots > 2^n \vartheta_n > \dots . \]
Following the construction of S above, we remove from [0, 1] an open interval centered at $\frac{1}{2}$ and having length $1 - 2\vartheta_1$. The remaining closed intervals $D_{1,1}$ and $D_{1,2}$ each have length $\vartheta_1$. Then from each of the intervals $D_{1,1}$ and $D_{1,2}$ we remove concentric open intervals, each of length $\vartheta_1 - 2\vartheta_2$. We are left with closed intervals $D_{2,1}$, $D_{2,2}$, $D_{2,3}$ and $D_{2,4}$, each of length $\vartheta_2$. We continue this way. In the n-th step we are left with $2^n$ closed intervals, each with length $\vartheta_n$. Then we have
\[ \lambda^1(S) = \lim_{n \to +\infty} 2^n \vartheta_n \]
($\lambda^1$ being the Lebesgue measure on R). If $\vartheta_n = \frac{1}{3^n}$, then S = C is the Cantor ternary set. Although S is nowhere dense, we can have $\lambda^1(S)$ as close to 1 as we choose. Indeed, for a given $\xi \in (0, 1)$, let
\[ \vartheta_n \stackrel{df}{=} \frac{1}{2^n} \cdot \frac{n\xi + 1}{n + 1} \quad \forall\, n \geq 1. \]
Then we have $\lambda^1(S) = \xi$.

Suppose that in the construction of the Cantor-like set at each step the closed subintervals are divided in the same proportions as the original, namely
\[ \delta(D_{1,1}) = \delta(D_{1,2}) = \vartheta, \qquad \delta(D_{2,1}) = \delta(D_{2,2}) = \delta(D_{2,3}) = \delta(D_{2,4}) = \vartheta^2 \]
and in general
\[ \delta(D_{n,k}) = \vartheta^n \quad \forall\, k \in \{1, \dots, 2^n\}. \]
Then the resulting Cantor-like set is denoted by $S_\vartheta$. Arguing as in the proof of Proposition 1.3.9, we obtain the following Proposition.
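The choice $\vartheta_n = 2^{-n}(n\xi + 1)/(n + 1)$ made above for a Cantor-like set of measure $\xi$ can be checked numerically. The following is a small sketch with hypothetical helper names `theta` and `remaining_length`; it verifies the admissibility condition $1 > 2\vartheta_1 > 4\vartheta_2 > \dots$ for the first few steps and shows $2^n\vartheta_n$ approaching $\xi$.

```python
def theta(n, xi):
    """Interval length at step n: theta_n = 2**(-n) * (n*xi + 1) / (n + 1)."""
    return (n * xi + 1.0) / ((n + 1.0) * 2.0 ** n)


def remaining_length(n, xi):
    """Total length 2**n * theta_n of the 2**n closed intervals left at step n."""
    return 2.0 ** n * theta(n, xi)


xi = 0.75
# each removed open interval has positive length theta_n - 2*theta_{n+1}
removed = [theta(n, xi) - 2.0 * theta(n + 1, xi) for n in range(1, 51)]
lengths = [remaining_length(n, xi) for n in (1, 10, 100, 1000)]
print(lengths)  # decreases towards xi
```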


PROPOSITION 1.3.10 If $\vartheta \in \big(0, \frac{1}{2}\big)$, then $\dim S_\vartheta = -\dfrac{\ln 2}{\ln \vartheta}$.

REMARK 1.3.11 If $\vartheta = \frac{1}{3}$, then $S_\vartheta = C$ is the Cantor ternary set and Propositions 1.3.9 and 1.3.10 coincide.

COROLLARY 1.3.12 For each $\xi \in [0, 1]$, there exists $A \subseteq \mathbb{R}$ such that $\dim A = \xi$.

PROOF If $\xi = 0$, then we take A to be a singleton. If $0 < \xi < 1$, then take $\vartheta = \exp\big(-\frac{\ln 2}{\xi}\big) < \frac{1}{2}$ and use Proposition 1.3.10. If $\xi = 1$, let A = I = [0, 1]. Then we can easily check that
\[ \mu^{(s)}(A) = \begin{cases} +\infty & \text{if } 0 < s < 1, \\ 1 & \text{if } s = 1, \\ 0 & \text{if } s > 1. \end{cases} \]
Therefore $\dim A = 1$.

REMARK 1.3.13 An alternative way to define the Hausdorff dimension of a set $A \subseteq X$ is by
\[ \dim A \stackrel{df}{=} \inf \big\{ s > 0 : \mu^{(s)}(A) = 0 \big\}. \]
In general the Hausdorff dimension of a set may be any number in $[0, +\infty]$ and need not be an integer. Even if $\dim A$ is an integer and $k = \dim A > 0$, the set A need not be a “k-dimensional surface” in any sense (see Federer (1969)).

Next we turn our attention to the case $X = \mathbb{R}^N$. Let us begin by recalling the definition of the N-dimensional outer measure $\lambda^N$.

DEFINITION 1.3.14 (a) We say that $Q \subseteq \mathbb{R}^N$ is a closed N-cube if there exist $a_k < b_k$ for $k = 1, \dots, N$ such that $Q = \prod_{k=1}^{N} [a_k, b_k]$. We set
\[ |Q| \stackrel{df}{=} \prod_{k=1}^{N} (b_k - a_k). \]
(b) The Lebesgue N-dimensional outer measure $\lambda^N$, for all $A \subseteq \mathbb{R}^N$, is defined by
\[ \lambda^N(A) \stackrel{df}{=} \inf \Big\{ \sum_{k=1}^{\infty} |Q_k| : A \subseteq \bigcup_{k=1}^{\infty} Q_k,\ Q_k \text{ is a closed N-cube} \Big\}. \]


REMARK 1.3.15 Clearly the definitions of $\lambda^1$ and $\mu^{(1)}$ on R coincide. We shall show that for any $N \geq 1$ the outer measures $\lambda^N$ and $\mu^{(N)}$ are closely related. In fact they differ by a multiplicative constant. This is not easy to establish and requires some preparation, which culminates in the so-called “isodiametric inequality,” which says that the set of maximal volume for a given diameter is the ball.

LEMMA 1.3.16 If $f : \mathbb{R}^N \to [0, +\infty]$ is Lebesgue measurable, then the set
\[ H \stackrel{df}{=} \big\{ (x, \vartheta) \in \mathbb{R}^N \times \mathbb{R} : 0 \leq \vartheta \leq f(x) \big\} \]
is Lebesgue measurable in $\mathbb{R}^{N+1}$.

PROOF Let
\[ A \stackrel{df}{=} \big\{ x \in \mathbb{R}^N : f(x) = +\infty \big\}. \]
Then A is Lebesgue measurable. Let $g : A^c \times \mathbb{R}_+ \to \mathbb{R}$ be defined by
\[ g(x, \vartheta) \stackrel{df}{=} f(x) - \vartheta \quad \forall\, (x, \vartheta) \in A^c \times \mathbb{R}_+. \]
Evidently g is a Carathéodory function (i.e., it is Lebesgue measurable in x and continuous in $\vartheta$). Therefore g is Lebesgue measurable on $A^c \times \mathbb{R}_+$ and so
\[ H_0 \stackrel{df}{=} \big\{ (x, \vartheta) \in A^c \times \mathbb{R}_+ : \vartheta \leq f(x) \big\} \]
is Lebesgue measurable in $\mathbb{R}^{N+1}$. Finally note that $H = H_0 \cup (A \times \mathbb{R}_+)$.

In what follows, for $a, b \in \mathbb{R}^N$ with $\|a\|_{\mathbb{R}^N} = 1$, we introduce the following objects:
\[ L(a, b) \stackrel{df}{=} \{ b + ta : t \in \mathbb{R} \}, \]
the line passing through b in the direction of a, and
\[ P(a) \stackrel{df}{=} \big\{ x \in \mathbb{R}^N : (x, a)_{\mathbb{R}^N} = 0 \big\}, \]
the plane passing through the origin, perpendicular to a.


DEFINITION 1.3.17 Let $a \in \mathbb{R}^N$ with $\|a\|_{\mathbb{R}^N} = 1$ and $A \subseteq \mathbb{R}^N$. We define the Steiner symmetrization of A with respect to the plane P(a) to be the set
\[ S(a, A) \stackrel{df}{=} \bigcup_{\substack{b \in P(a) \\ A \cap L(a, b) \neq \emptyset}} \Big\{ b + ta : |t| \leq \frac{1}{2} \mu^{(1)}\big( A \cap L(a, b) \big) \Big\}. \]

REMARK 1.3.18 The above defined Steiner symmetrization with respect to an (N − 1)-dimensional subspace Y of $\mathbb{R}^N$ is the operation which associates to each $A \subseteq \mathbb{R}^N$ the set $V \subseteq \mathbb{R}^N$ such that for every line L perpendicular to Y either
• $L \cap A = \emptyset$ and $L \cap V = \emptyset$; or
• $L \cap A \neq \emptyset$ and $L \cap V$ is a closed segment centered in Y and $\mu^{(1)}(L \cap A) = \mu^{(1)}(L \cap V)$.
If A is compact, then V is compact too and $\lambda^N(A) = \lambda^N(V)$. Also if A is convex, then V is convex too.

The next Proposition summarizes the properties of the Steiner symmetrization.

PROPOSITION 1.3.19 Let $A \subseteq \mathbb{R}^N$ and $a \in \mathbb{R}^N$ with $\|a\|_{\mathbb{R}^N} = 1$.
(a) $\delta\big(S(a, A)\big) \leq \delta(A)$.
(b) If $A \subseteq \mathbb{R}^N$ is Lebesgue measurable, then so is S(a, A) and $\lambda^N\big(S(a, A)\big) = \lambda^N(A)$.

PROOF (a) Assume that $\delta(A) < +\infty$, or otherwise the result is trivial. Also we may assume that A is closed. For a given $\varepsilon > 0$, let $x, y \in S(a, A)$ be such that
\[ \delta\big(S(a, A)\big) - \varepsilon \leq \|x - y\|_{\mathbb{R}^N}. \]
Let
\[ b \stackrel{df}{=} x - (x, a)_{\mathbb{R}^N}\, a \quad \text{and} \quad c \stackrel{df}{=} y - (y, a)_{\mathbb{R}^N}\, a. \]


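Then $b, c \in P(a)$. Let us set
\[ r \stackrel{df}{=} \inf\{ t \in \mathbb{R} : b + ta \in A \}, \qquad s \stackrel{df}{=} \sup\{ t \in \mathbb{R} : b + ta \in A \}, \]
\[ u \stackrel{df}{=} \inf\{ t \in \mathbb{R} : c + ta \in A \}, \qquad v \stackrel{df}{=} \sup\{ t \in \mathbb{R} : c + ta \in A \}. \]
We may assume without any loss of generality that $v - r \geq s - u$. So
\[ v - r \geq \frac{1}{2}(v - r) + \frac{1}{2}(s - u) = \frac{1}{2}(s - r) + \frac{1}{2}(v - u) \geq \frac{1}{2}\mu^{(1)}\big( A \cap L(a, b) \big) + \frac{1}{2}\mu^{(1)}\big( A \cap L(a, c) \big). \]
Note that
\[ \big| (x, a)_{\mathbb{R}^N} \big| \leq \frac{1}{2}\mu^{(1)}\big( A \cap L(a, b) \big) \quad \text{and} \quad \big| (y, a)_{\mathbb{R}^N} \big| \leq \frac{1}{2}\mu^{(1)}\big( A \cap L(a, c) \big) \]
(recall that $x, y \in S(a, A)$). It follows that
\[ v - r \geq \big| (x, a)_{\mathbb{R}^N} \big| + \big| (y, a)_{\mathbb{R}^N} \big| \geq \big| (x - y, a)_{\mathbb{R}^N} \big|. \]
Hence we have
\[ \big( \delta(S(a, A)) - \varepsilon \big)^2 \leq \|x - y\|_{\mathbb{R}^N}^2 \leq \|b - c\|_{\mathbb{R}^N}^2 + \big| (x - y, a)_{\mathbb{R}^N} \big|^2 \leq \|b - c\|_{\mathbb{R}^N}^2 + (v - r)^2 = \big\| (b + ra) - (c + va) \big\|_{\mathbb{R}^N}^2 \leq \delta(A)^2 \]
(note that A is closed and so $b + ra,\ c + va \in A$). It follows that
\[ \delta\big(S(a, A)\big) - \varepsilon \leq \delta(A). \]
Let $\varepsilon \searrow 0$ to conclude that $\delta\big(S(a, A)\big) \leq \delta(A)$.

(b) Recall that the Lebesgue measure $\lambda^N$ is rotation invariant. So we may take
\[ a = e_N = (0, \dots, 0, 1)^{T}. \]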


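Then $P(a) = P(e_N) = \mathbb{R}^{N-1}$. Note that the function $f : \mathbb{R}^{N-1} \to [0, +\infty]$ defined by
\[ f(b) \stackrel{df}{=} \mu^{(1)}\big( A \cap L(a, b) \big) \quad \forall\, b \in \mathbb{R}^{N-1} \]
is measurable (Fubini’s theorem) and
\[ \lambda^N(A) = \int_{\mathbb{R}^{N-1}} f(b)\, d\lambda^{N-1}(b) \]
(since $\lambda^1 = \mu^{(1)}$; see Remark 1.3.15). So by virtue of Lemma 1.3.16, we have that
\[ S(a, A) = \Big\{ (b, \vartheta) \in \mathbb{R}^{N-1} \times \mathbb{R} : -\frac{f(b)}{2} \leq \vartheta \leq \frac{f(b)}{2} \Big\} \setminus \Big\{ (b, 0) \in \mathbb{R}^{N-1} \times \mathbb{R} : A \cap L(a, b) = \emptyset \Big\} \]
is Lebesgue measurable in $\mathbb{R}^N$ and, moreover,
\[ \lambda^N\big( S(a, A) \big) = \int_{\mathbb{R}^{N-1}} f(b)\, d\lambda^{N-1}(b) = \lambda^N(A). \]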
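Both properties of Proposition 1.3.19 have a simple discrete analogue that can be tested directly. The sketch below is an illustration only and rests on stated assumptions: the set is a finite set of integer grid points in the plane, the counting measure on each vertical column stands in for $\mu^{(1)}$, and the names `steiner_symmetrize` and `diameter` are hypothetical. Each column keeps its number of points, re-centred about the horizontal axis.

```python
import math
from itertools import combinations


def steiner_symmetrize(points):
    """Discrete Steiner symmetrization about the x-axis.

    Every column x keeps its number m of points; they are placed at the
    heights -(m-1)/2, ..., (m-1)/2, symmetric about y = 0.
    """
    columns = {}
    for x, y in points:
        columns.setdefault(x, set()).add(y)
    result = set()
    for x, ys in columns.items():
        m = len(ys)
        for i in range(m):
            result.add((x, i - (m - 1) / 2.0))
    return result


def diameter(points):
    """Largest Euclidean distance between two points of a finite set."""
    pts = list(points)
    return max((math.dist(p, q) for p, q in combinations(pts, 2)), default=0.0)


A = {(0, 0), (0, 3), (1, 5), (2, -4), (2, -1), (2, 7)}
S = steiner_symmetrize(A)
print(len(A), len(S), diameter(A), diameter(S))
```

The number of points plays the role of the measure in part (b), and the argument of part (a) carries over to this discrete setting, so the diameter does not increase.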

Now we are properly equipped to prove the so-called “isodiametric inequality,” which states that, if in $\mathbb{R}^N$ we consider the family of all sets with given diameter, the one with maximum Lebesgue N-dimensional outer measure (N-volume) is the ball.

THEOREM 1.3.20 (Isodiametric Inequality) For all $A \subseteq \mathbb{R}^N$, we have
\[ \lambda^N(A) \leq a(N) \Big( \frac{\delta(A)}{2} \Big)^N, \]
where $a(N) \stackrel{df}{=} \dfrac{\pi^{N/2}}{\big(\frac{N}{2}\big)!}$ is the volume of the unit ball in $\mathbb{R}^N$.

PROOF If $\delta(A) = +\infty$, then there is nothing to prove. So suppose that $\delta(A) < +\infty$.


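Let $\{e_k\}_{k=1}^{N}$ be the standard basis of $\mathbb{R}^N$. We introduce
\[ A_1 = S(e_1, A), \quad A_2 = S(e_2, A_1), \quad \dots, \quad A_N = S(e_N, A_{N-1}). \]
Let us set $A^* = A_N$.

Claim 1. $A^*$ is symmetric with respect to the origin.
By virtue of the definition of the Steiner symmetrization, we have that $A_1$ is symmetric with respect to the plane $P(e_1)$. Let $1 \leq k \leq N - 1$ and suppose that $A_k$ is symmetric with respect to $P(e_1), \dots, P(e_k)$. Again $A_{k+1}$ is symmetric with respect to $P(e_{k+1})$. Let us fix $1 \leq m \leq k$ and let $R_m : \mathbb{R}^N \to \mathbb{R}^N$ be the reflection with respect to $P(e_m)$. Let $b \in P(e_{k+1})$. Because $R_m(A_k) = A_k$, we have
\[ \mu^{(1)}\big( A_k \cap L(e_{k+1}, b) \big) = \mu^{(1)}\big( A_k \cap L(e_{k+1}, R_m(b)) \big), \]
so
\[ \{ t \in \mathbb{R} : b + te_{k+1} \in A_{k+1} \} = \{ t \in \mathbb{R} : R_m(b) + te_{k+1} \in A_{k+1} \} \]
and thus $R_m(A_{k+1}) = A_{k+1}$, i.e., $A_{k+1}$ is symmetric with respect to $P(e_m)$. It follows that $A^* = A_N$ is symmetric with respect to $P(e_1), \dots, P(e_N)$, hence it is symmetric with respect to the origin.

Claim 2. $\lambda^N(A^*) \leq \dfrac{\pi^{N/2}}{\big(\frac{N}{2}\big)!} \Big( \dfrac{\delta(A^*)}{2} \Big)^N$.
Let $x \in A^*$. Then because of Claim 1, we have $-x \in A^*$ and so $2\|x\|_{\mathbb{R}^N} \leq \delta(A^*)$. Hence
\[ A^* \subseteq \overline{B}_{\delta(A^*)/2}(0) = \Big\{ y \in \mathbb{R}^N : \|y\|_{\mathbb{R}^N} \leq \frac{\delta(A^*)}{2} \Big\} \]
and so
\[ \lambda^N(A^*) \leq \lambda^N\big( \overline{B}_{\delta(A^*)/2}(0) \big) = \frac{\pi^{N/2}}{\big(\frac{N}{2}\big)!} \Big( \frac{\delta(A^*)}{2} \Big)^N. \]
Using Claim 2, we can have the isodiametric inequality. Note that the closure $\overline{A} \subseteq \mathbb{R}^N$ is Lebesgue measurable and so by Proposition 1.3.19, we have
\[ \lambda^N\big( \overline{A}^{\,*} \big) = \lambda^N\big( \overline{A} \big) \quad \text{and} \quad \delta\big( \overline{A}^{\,*} \big) \leq \delta\big( \overline{A} \big). \]
Using Claim 2, it follows that
\[ \lambda^N(A) \leq \lambda^N\big( \overline{A} \big) = \lambda^N\big( \overline{A}^{\,*} \big) \leq \frac{\pi^{N/2}}{\big(\frac{N}{2}\big)!} \Big( \frac{\delta(\overline{A}^{\,*})}{2} \Big)^N \leq \frac{\pi^{N/2}}{\big(\frac{N}{2}\big)!} \Big( \frac{\delta(\overline{A})}{2} \Big)^N = \frac{\pi^{N/2}}{\big(\frac{N}{2}\big)!} \Big( \frac{\delta(A)}{2} \Big)^N. \]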
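As a sanity check of the isodiametric inequality in the plane, one can compare a few familiar shapes of the same diameter d with the bound $a(2)(d/2)^2 = \pi d^2/4$. The sketch below is illustrative only: the function name `isodiametric_bound` is hypothetical, $\big(\frac{N}{2}\big)!$ is evaluated as $\Gamma\big(\frac{N}{2} + 1\big)$, and the areas are the standard closed-form values for the listed shapes.

```python
import math


def isodiametric_bound(diameter, n=2):
    """a(N) * (diameter / 2)**N with a(N) = pi**(N/2) / Gamma(N/2 + 1)."""
    a_n = math.pi ** (n / 2.0) / math.gamma(n / 2.0 + 1.0)
    return a_n * (diameter / 2.0) ** n


d = 1.0
areas = {
    "square (diagonal d)": d ** 2 / 2.0,
    "equilateral triangle (side d)": math.sqrt(3.0) / 4.0 * d ** 2,
    "Reuleaux triangle (width d)": (math.pi - math.sqrt(3.0)) / 2.0 * d ** 2,
    "disc (diameter d)": math.pi / 4.0 * d ** 2,
}
bound = isodiametric_bound(d)
for name, area in areas.items():
    print(f"{name}: area {area:.4f} <= bound {bound:.4f}")
```

Only the disc attains the bound; the Reuleaux triangle, although of constant width d, stays strictly below it.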

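THEOREM 1.3.21 If $A \subseteq \mathbb{R}^N$, then $\lambda^N(A) = c_N \mu^{(N)}(A)$, with $c_N \stackrel{df}{=} \dfrac{\pi^{N/2}}{2^N \big(\frac{N}{2}\big)!}$.

PROOF For a given $\varepsilon > 0$, we can find a cover $\{C_n\}_{n \geq 1}$ of A consisting of closed, convex sets such that
\[ \sum_{n=1}^{\infty} \delta(C_n)^N \leq \mu^{(N)}(A) + \varepsilon. \]
By virtue of Theorem 1.3.20, we have
\[ \lambda^N(C_n) \leq c_N \delta(C_n)^N \quad \forall\, n \geq 1. \]
So
\[ \lambda^N(A) \leq \sum_{n=1}^{\infty} \lambda^N(C_n) \leq c_N \sum_{n=1}^{\infty} \delta(C_n)^N \leq c_N \mu^{(N)}(A) + c_N \varepsilon. \]
Let $\varepsilon \searrow 0$ to conclude that
\[ \lambda^N(A) \leq c_N \mu^{(N)}(A). \qquad (1.24) \]
To prove the opposite inequality, first we show that $\mu^{(N)}$ is absolutely continuous with respect to $\lambda^N$ (see Definition A.2.22). Note that for any N-cube Q, we have
\[ \lambda^N(Q) = |Q| \leq \Big( \frac{\delta(Q)}{\sqrt{N}} \Big)^N. \]
So for a given $\delta > 0$, we have
\[ \mu_\delta^{(N)}(A) \leq \inf \Big\{ \sum_{n=1}^{\infty} \delta(Q_n)^N : Q_n \text{ is an N-cube},\ A \subseteq \bigcup_{n=1}^{\infty} Q_n,\ \delta(Q_n) \leq \delta \Big\} \leq \big( \sqrt{N} \big)^N \lambda^N(A). \]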


Let δ & 0, to conclude that µ(N ) is absolutely continuous with respect to λN (see Definition A.2.22). Next for a given ε, δ > 0, we can find a cover {Qn }n>1 of A consisting of N -cubes, such that δ(Qn ) < δ and

∞ X

∀n>1

λN (Qn ) 6 λN (A) + ε.

(1.25)

n=1

We may suppose that N -cubes are open by expanding them slightly so that the above inequality remains valid. Invoking Vitali’s covering theorem (see Theorem 1.2.5), for every n > 1 we can find disjoint balls {Bn,k }k>1 contained in Qn , such that δ(Bn,k ) 6 δ

and

µ ¶ ∞ [ λ Qn \ Bn,k = 0. N

k=1

By virtue of the absolute continuity of µ(N ) with respect to λN , we have µ

(N )

µ ¶ ∞ [ Qn \ Bn,k = 0

and

(N ) µδ

µ ¶ ∞ [ Qn \ Bn,k = 0.

k=1

k=1

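Therefore, using (1.25), we have
$$\begin{aligned}
\mu^{(N)}_{\delta}(A) &\le \sum_{n=1}^{\infty} \mu^{(N)}_{\delta}(Q_n) \le \sum_{n=1}^{\infty}\sum_{k=1}^{\infty} \mu^{(N)}_{\delta}(B_{n,k}) + \sum_{n=1}^{\infty} \mu^{(N)}_{\delta}\Big(Q_n \setminus \bigcup_{k=1}^{\infty} B_{n,k}\Big) \\
&\le \sum_{n=1}^{\infty}\sum_{k=1}^{\infty} \delta(B_{n,k})^N = \sum_{n=1}^{\infty}\sum_{k=1}^{\infty} \frac{1}{c_N}\,\lambda^N(B_{n,k}) \\
&\le \frac{1}{c_N} \sum_{n=1}^{\infty} \lambda^N(Q_n) \le \frac{1}{c_N}\,\lambda^N(A) + \frac{\varepsilon}{c_N}.
\end{aligned}$$
Let $\varepsilon, \delta \searrow 0$, to conclude that
$$c_N\,\mu^{(N)}(A) \le \lambda^N(A). \tag{1.26}$$
From (1.24) and (1.26), we conclude that $\lambda^N = c_N\,\mu^{(N)}$.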
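As a quick numerical sanity check of the constant in Theorem 1.3.21, the following minimal sketch evaluates $c_N$, reading $(N/2)!$ as $\Gamma(N/2 + 1)$, and compares $c_N\,\delta(B)^N$ with the classical volume of a ball. The function names `c` and `ball_volume` are illustrative choices and do not come from the text.

```python
import math

def c(n):
    """Constant c_N = pi^(N/2) / (2^N * (N/2)!), with (N/2)! read as Gamma(N/2 + 1)."""
    return math.pi ** (n / 2) / (2 ** n * math.gamma(n / 2 + 1))

def ball_volume(n, r):
    """Lebesgue measure of a ball of radius r in R^N, written as c_N * diameter^N."""
    return c(n) * (2 * r) ** n

# c_1, c_2, c_3 should be close to 1, pi/4 and pi/6 respectively.
print(c(1), c(2), c(3))
# The unit ball in R^3 should have volume close to 4*pi/3.
print(ball_volume(3, 1.0))
```

In particular $c_1 = 1$, which is the identity $\lambda^1 = \mu^{(1)}$ on the real line.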


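REMARK 1.3.22 Some authors, in order to get rid of the multiplicative constant $c_N$, normalize the definition of the Hausdorff measures on $\mathbb{R}^N$. So if $C \subseteq \mathbb{R}^N$, $0 \le s < +\infty$, $0 < \delta \le +\infty$, they set
$$\mu^{(s)}_{\delta}(C) \overset{\mathrm{df}}{=} \inf\left\{ \sum_{n=1}^{\infty} a(s)\left(\frac{\delta(A_n)}{2}\right)^{s} : C \subseteq \bigcup_{n=1}^{\infty} A_n,\ \delta(A_n) \le \delta \right\},$$
where $a(s) \overset{\mathrm{df}}{=} \dfrac{\pi^{s/2}}{\Gamma\left(\frac{s}{2} + 1\right)}$. Here
$$\Gamma(s) \overset{\mathrm{df}}{=} \int_0^{+\infty} x^{s-1} e^{-x}\,dx$$
is the Euler gamma function. The Hausdorff $s$-dimensional outer measure $\mu^{(s)}$ is defined by
$$\mu^{(s)}(C) = \lim_{\delta \searrow 0} \mu^{(s)}_{\delta}(C) = \sup_{\delta > 0} \mu^{(s)}_{\delta}(C)$$
(cf., e.g., Evans & Gariepy (1992, p. 60)). Recall that
$$\lambda^N\big(B(x, r)\big) = a(N)\,r^N \quad \forall\, x \in \mathbb{R}^N.$$
In this case Theorem 1.3.21 says that $\lambda^N = \mu^{(N)}$. Note that $\mu^{(0)}$ is the counting measure.

Let us prove some further properties of the Hausdorff measures on $\mathbb{R}^N$.

PROPOSITION 1.3.23
Let $0 \le s < +\infty$. We have
(a) $\mu^{(s)}(A) = 0$ for all $A \subseteq \mathbb{R}^N$ and all $s > N$.
(b) $\mu^{(s)}(\xi A) = \xi^s \mu^{(s)}(A)$ for all $A \subseteq \mathbb{R}^N$ and all $\xi > 0$.
(c) $\mu^{(s)}\big(K(A)\big) = \mu^{(s)}(A)$ for all $A \subseteq \mathbb{R}^N$ and for any affine isometry $K : \mathbb{R}^N \to \mathbb{R}^N$.

PROOF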

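(a) Let $Q = (0, 1)^N$ and let $m \ge 1$ be an integer. For
$$k = (k_i)_{i=1}^{N} \in K \overset{\mathrm{df}}{=} \{0, \ldots, m-1\}^N,$$
we set
$$Q_k \overset{\mathrm{df}}{=} \prod_{i=1}^{N} \left[\frac{k_i}{m}, \frac{k_i + 1}{m}\right].$$
Note that
$$Q \subseteq \bigcup_{k \in K} Q_k \quad\text{and}\quad \delta(Q_k) = \frac{\sqrt{N}}{m}.$$
So we have
$$\mu^{(s)}_{\sqrt{N}/m}(Q) \le \sum_{k \in K} \delta(Q_k)^s = m^{N-s} \big(\sqrt{N}\big)^{s}.$$
Letting $m \to +\infty$, since $s > N$, we obtain $\mu^{(s)}(Q) = 0$, from which it follows that
$$\mu^{(s)}(\mathbb{R}^N) = 0.$$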
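The estimate in part (a) can be tabulated directly. The sketch below is only an illustration: it evaluates the covering sum $\sum_{k \in K} \delta(Q_k)^s = m^{N-s}(\sqrt{N})^s$ for growing $m$. The function name `cube_cover_sum` and the sample values $N = 2$, $s = 2.5$ are assumptions made for the example.

```python
import math

def cube_cover_sum(n, s, m):
    """Sum of diam(Q_k)^s over the m^n closed subcubes of side 1/m that cover the unit cube."""
    diameter = math.sqrt(n) / m          # every subcube has diameter sqrt(n)/m
    return (m ** n) * diameter ** s      # m^n identical terms

# For s > N the sums tend to 0 as the mesh is refined.
for m in (1, 10, 100, 1000):
    print(m, cube_cover_sum(2, 2.5, m))
```

For $s = N$ the same sum stays constant, equal to $(\sqrt{N})^N$, which is why the argument needs $s > N$.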

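(b) Note that for all $C \subseteq \mathbb{R}^N$, we have $\delta(\xi C) = \xi\,\delta(C)$. So the result follows at once from Definition 1.3.5.
(c) Note that for all $C \subseteq \mathbb{R}^N$, we have $\delta\big(K(C)\big) = \delta(C)$. Again the result follows from Definition 1.3.5.

The next proposition suggests a convenient way to check that $\mu^{(s)}$ vanishes on a set.

PROPOSITION 1.3.24
If $A \subseteq \mathbb{R}^N$, $0 < \delta \le +\infty$ and $0 \le s < +\infty$ are such that $\mu^{(s)}_{\delta}(A) = 0$, then $\mu^{(s)}(A) = 0$.

PROOF If $s = 0$, then $\mu^{(0)}_{\delta}(A) = 0$ implies that $A = \emptyset$ and so $\mu^{(0)}(A) = 0$. So suppose that $s > 0$. For a given $\varepsilon > 0$, we can find $\{C_n\}_{n \ge 1}$, such that
$$A \subseteq \bigcup_{n=1}^{\infty} C_n, \quad \delta(C_n) \le \delta \quad \forall\, n \ge 1 \quad\text{and}\quad \sum_{n=1}^{\infty} \delta(C_n)^s \le \varepsilon.$$
Evidently $\delta(C_n)^s \le \varepsilon$, that is $\delta(C_n) \le \varepsilon^{1/s}$, for all $n \ge 1$, and so
$$\mu^{(s)}_{\varepsilon^{1/s}}(A) \le \varepsilon.$$
Let $\varepsilon \searrow 0$, to conclude that $\mu^{(s)}(A) = 0$.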

Taking into account that for a Lipschitz continuous function $f$ with constant $c > 0$ and for every $A \subseteq \mathbb{R}^N$, we have
$$\delta\big(f(A)\big) \le c\,\delta(A),$$
we obtain the following result.

PROPOSITION 1.3.25
If $f : \mathbb{R}^N \to \mathbb{R}^M$ is a Lipschitz continuous function with Lipschitz constant $c > 0$ (see Definition 1.5.1), $A \subseteq \mathbb{R}^N$ and $0 \le s < +\infty$, then $\mu^{(s)}\big(f(A)\big) \le c^s \mu^{(s)}(A)$.

We conclude this section by returning to the notion of Hausdorff dimension (see Definition 1.3.8) and having a second look at this concept. The Hausdorff dimension has an intuitive appeal when familiar objects are under consideration. So for example $\dim \mathbb{R}^N = N$ (see Theorem 1.3.21). Suppose we want to determine the Hausdorff dimension of a curve $C \subseteq \mathbb{R}^3$. Our first guess will be that $\dim C = 1$. But recall that there are curves in $\mathbb{R}^3$ which fill the unit cube. Such a curve must have Hausdorff dimension 3. Therefore we must proceed with caution.

DEFINITION 1.3.26

Let $(X, d)$ be a metric space.
(a) By a curve in $X$ we mean the image $f\big([0, 1]\big)$ of a continuous function $f : [0, 1] \to X$.
(b) The length of a curve $C = f\big([0, 1]\big)$ is defined by
$$l(C) \overset{\mathrm{df}}{=} \sup \sum_{k=1}^{m} d\big(f(x_{k-1}), f(x_k)\big),$$
where the supremum is taken over all partitions $0 = x_0 < x_1 < \ldots < x_m = 1$ of $[0, 1]$.
(c) The curve $C$ is said to be rectifiable, if $l(C) < +\infty$.

REMARK 1.3.27 A curve $C$ is a continuum, i.e., a compact and connected set in $X$. In particular then a curve is a Borel set; hence it is also $\mu^{(s)}$-measurable. Moreover, if in Definition 1.3.26(a) $f$ is injective, then $f^{-1}$ exists and is continuous and so $C$ is the homeomorphic image of $[0, 1]$. Also in Definition 1.3.26(a), we can replace $[0, 1]$ by any closed bounded interval $[a, b]$. Some authors require $f$ to be injective.
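The supremum in Definition 1.3.26(b) can be approximated from below by polygonal sums. The following sketch uses uniform partitions and the upper unit half circle in $\mathbb{R}^2$, whose length is $\pi$. The helper names `polygonal_length` and `half_circle` are hypothetical, and uniform partitions are a simplifying assumption (the definition allows arbitrary ones).

```python
import math

def polygonal_length(f, m):
    """Sum of d(f(x_{k-1}), f(x_k)) over the uniform partition of [0, 1] into m pieces."""
    points = [f(k / m) for k in range(m + 1)]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def half_circle(t):
    """Injective parametrization of the upper unit half circle in R^2."""
    return (math.cos(math.pi * t), math.sin(math.pi * t))

# The sums increase under refinement and approach pi from below.
for m in (1, 2, 4, 64, 1024):
    print(m, polygonal_length(half_circle, m))
```

With a single piece the sum is just the distance between the endpoints, namely 2. Each dyadic refinement can only increase the sum, by the triangle inequality.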

PROPOSITION 1.3.28
If $(X, d)$ is a metric space, $f : [0, 1] \to X$ is a nonconstant curve with length $l$ and $C = f\big([0, 1]\big)$, then
(a) $0 < \mu^{(1)}(C) \le l$;
(b) if $f$ is injective, then $\mu^{(1)}(C) = l$.
Therefore, if $C$ is rectifiable (i.e., $l < +\infty$), then $\dim C = 1$.

PROOF

(a) First we show that $\mu^{(1)}(C) \le l$. Assume that $l < +\infty$ or otherwise there is nothing to prove. Let $\{A_k\}_{k=1}^{m}$ be a collection of closed subarcs of $C$, such that
$$C = \bigcup_{k=1}^{m} A_k, \quad \delta(A_k) \le \frac{1}{n} \quad\text{and}\quad \mu^{(1)}_{1/n}(C) \le \sum_{k=1}^{m} \delta(A_k). \tag{1.27}$$
Let us explicitly construct the subarcs $A_k$ for $k \in \{1, \ldots, m\}$. Note that $f$ is uniformly continuous and so we can find $\eta > 0$, such that
$$d\big(f(x), f(y)\big) < \frac{1}{n} \quad \forall\, x, y \in [0, 1],\ |x - y| < \eta.$$
Consider a partition $0 = x_0 < x_1 < \ldots < x_m = 1$ of $[0, 1]$, such that
$$|x_k - x_{k-1}| < \eta \quad \forall\, k \in \{1, \ldots, m\}.$$
Let
$$A_k \overset{\mathrm{df}}{=} f\big([x_{k-1}, x_k]\big) \quad \forall\, k \in \{1, \ldots, m\}.$$
Evidently the subarcs $\{A_k\}_{k=1}^{m}$ cover $C$ and
$$d\big(f(x_{k-1}), f(x_k)\big) \le \delta(A_k) < \frac{1}{n} \quad \forall\, k \in \{1, \ldots, m\}.$$

Note that every $A_k$ is compact and so we can find points $y_k, z_k \in [x_{k-1}, x_k]$, $y_k \le z_k$, such that
$$d\big(f(y_k), f(z_k)\big) = \delta(A_k).$$
We generate the finer partition
$$0 \le y_1 \le z_1 \le y_2 \le z_2 \le \ldots \le y_m \le z_m \le 1.$$
From (1.27), we have
$$\mu^{(1)}_{1/n}(C) \le \sum_{k=1}^{m} \delta(A_k) = \sum_{k=1}^{m} d\big(f(y_k), f(z_k)\big) \le l.$$

Passing to the limit as $n \to +\infty$, we obtain that $\mu^{(1)}(C) \le l$.

Next we show that $0 < \mu^{(1)}(C)$. To this end note that if $0 \le a < b \le 1$, then
$$d\big(f(a), f(b)\big) \le \mu^{(1)}\big(f([a, b])\big). \tag{1.28}$$
To see this let $h : E \overset{\mathrm{df}}{=} f([a, b]) \to \mathbb{R}$ be the function
$$h(u) \overset{\mathrm{df}}{=} d\big(u, f(a)\big).$$
Evidently $h$ is a Lipschitz continuous function with Lipschitz constant 1. Since $h(E)$ is connected and contains both $h(f(a)) = 0$ and $h(f(b))$, we have
$$J \overset{\mathrm{df}}{=} \big[0, h(f(b))\big] = \big[0, d\big(f(a), f(b)\big)\big] \subseteq h(E).$$
So, from Proposition 1.3.25, we have
$$d\big(f(a), f(b)\big) = \lambda^1(J) = \mu^{(1)}(J) \le \mu^{(1)}\big(h(E)\big) \le \mu^{(1)}(E).$$
This proves inequality (1.28). But from (1.28) and since for appropriately chosen $a, b$ we have $d\big(f(a), f(b)\big) > 0$ (recall that the curve is nonconstant), we conclude that $0 < \mu^{(1)}(C)$.

(b) Now suppose that $f$ is injective. Let $0 = x_0 < x_1 < \ldots < x_m = 1$ be a partition of $[0, 1]$. The sets $A_k \overset{\mathrm{df}}{=} f\big([x_{k-1}, x_k]\big)$ are Borel subsets of $X$ and, by injectivity, two of them meet in at most one point, which is a $\mu^{(1)}$-null set. Using inequality (1.28) on each subarc, we obtain
$$\sum_{k=1}^{m} d\big(f(x_{k-1}), f(x_k)\big) \le \sum_{k=1}^{m} \mu^{(1)}\big(f([x_{k-1}, x_k])\big) = \mu^{(1)}\Big(\bigcup_{k=1}^{m} f([x_{k-1}, x_k])\Big) = \mu^{(1)}\big(f([0, 1])\big) = \mu^{(1)}(C).$$
Since the partition of $[0, 1]$ was arbitrary, it follows that $l \le \mu^{(1)}(C)$. Combining this with (a), we obtain that $l = \mu^{(1)}(C)$.

1.4 Differentiation of Hausdorff Measures

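From general measure theory, we know that the differentiation theory of real functions can be extended to a theory of differentiation for measures, which has many similar features and interesting problems. For the Lebesgue measures $\lambda^N$, $N \ge 1$, one of the basic results of this theory is the so-called Lebesgue density theorem, which we recall here.

THEOREM 1.4.1 (Lebesgue Density Theorem)
If $A \subseteq \mathbb{R}^N$ is a Lebesgue measurable set, then
$$\lim_{r \searrow 0} \frac{\lambda^N\big(\overline{B}_r(x) \cap A\big)}{\lambda^N\big(\overline{B}_r(x)\big)} = \begin{cases} 1 & \text{for } \lambda^N\text{-a.a. } x \in A, \\ 0 & \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N \setminus A. \end{cases}$$

DEFINITION 1.4.2
Let $A \subseteq \mathbb{R}^N$ and $x \in \mathbb{R}^N$. We say that:
(a) $x$ is a point of density of $A$, if
$$\lim_{r \searrow 0} \frac{\lambda^N\big(\overline{B}_r(x) \cap A\big)}{\lambda^N\big(\overline{B}_r(x)\big)} = 1;$$
(b) $x$ is a point of dispersion of $A$, if
$$\lim_{r \searrow 0} \frac{\lambda^N\big(\overline{B}_r(x) \cap A\big)}{\lambda^N\big(\overline{B}_r(x)\big)} = 0.$$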
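For $N = 1$ and $A$ a finite union of intervals, the density ratio of Definition 1.4.2 can be computed exactly. The sketch below is a hedged illustration with $A = [0, 1]$. The function name `density_ratio` and the choice of sample points are assumptions made for the example.

```python
def density_ratio(intervals, x, r):
    """lambda^1(B_r(x) ∩ A) / lambda^1(B_r(x)) for A a finite union of disjoint intervals."""
    lo, hi = x - r, x + r
    # Length of the overlap of [lo, hi] with each interval (a, b), clipped at zero.
    covered = sum(max(0.0, min(b, hi) - max(a, lo)) for a, b in intervals)
    return covered / (2 * r)

A = [(0.0, 1.0)]
for r in (0.1, 0.01, 0.001):
    # interior point, boundary point, exterior point
    print(r, density_ratio(A, 0.5, r), density_ratio(A, 0.0, r), density_ratio(A, 2.0, r))
```

The interior point $0.5$ is a point of density, the exterior point $2$ is a point of dispersion, and at the boundary point $0$ the ratio stays at $1/2$. The boundary has measure zero, so this does not contradict Theorem 1.4.1.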

REMARK 1.4.3 According to Theorem 1.4.1, $\lambda^N$-almost every point of $A$ is a point of density of $A$ and $\lambda^N$-almost every point of $\mathbb{R}^N \setminus A$ is a point of dispersion of $A$. We can think of the points of density of a set $A$ as forming a kind of measure theoretic interior of $A$, while the points of dispersion of $A$ form a kind of measure theoretic exterior of $A$.

The purpose of this section is to establish analogs of Theorem 1.4.1 for lower dimensional Hausdorff measures. In what follows we work in $\mathbb{R}^N$ and $1 < s < N$.

THEOREM 1.4.4
If $A \subseteq \mathbb{R}^N$ is $\mu^{(s)}$-measurable and $\mu^{(s)}(A) < +\infty$, then
$$\lim_{r \searrow 0} \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} = 0 \quad \text{for } \mu^{(s)}\text{-a.a. } x \in \mathbb{R}^N \setminus A.$$

PROOF For every $t > 0$, let
$$C_t \overset{\mathrm{df}}{=} \left\{ x \in \mathbb{R}^N \setminus A : \limsup_{r \searrow 0} \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} > t \right\}.$$

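To finish the proof it is enough to show that
$$\mu^{(s)}(C_t) = 0 \quad \forall\, t > 0.$$
Fix $\varepsilon > 0$. We know that $\mu^{(s)} \lfloor A$ is a Radon measure (see Proposition 1.1.9). So we can find $K \subseteq A$ compact, such that $\mu^{(s)}(A \setminus K) \le \varepsilon$ (see Proposition 1.1.10(b)). Let
$$U \overset{\mathrm{df}}{=} \mathbb{R}^N \setminus K.$$
Then $U$ is open and $C_t \subseteq U$. For fixed $\delta > 0$, we consider the family of closed balls
$$\mathcal{T} \overset{\mathrm{df}}{=} \left\{ \overline{B}_r(x) : \overline{B}_r(x) \subseteq U,\ 0 < r < \delta,\ \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} > t \right\}.$$
Without any loss of generality we may assume that $\mathcal{T} \ne \emptyset$ or otherwise $C_t = \emptyset$ and so $\mu^{(s)}(C_t) = 0$. Invoking Proposition 1.2.1, we can find a sequence $\big\{\overline{B}_{r_n}(x_n)\big\}_{n \ge 1}$ of disjoint elements in $\mathcal{T}$, such that
$$C_t \subseteq \bigcup_{n=1}^{\infty} \overline{B}_{5 r_n}(x_n).$$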

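Then we have
$$\begin{aligned}
\mu^{(s)}_{10\delta}(C_t) &\le \sum_{n=1}^{\infty} (10 r_n)^s \le \frac{5^s}{t} \sum_{n=1}^{\infty} \mu^{(s)}\big(\overline{B}_{r_n}(x_n) \cap A\big) \\
&\le \frac{5^s}{t}\,\mu^{(s)}(U \cap A) = \frac{5^s}{t}\,\mu^{(s)}(A \setminus K) \le \frac{5^s \varepsilon}{t}.
\end{aligned}$$
Let $\delta \searrow 0$, to obtain
$$\mu^{(s)}(C_t) \le \frac{5^s \varepsilon}{t}.$$
Since $\varepsilon > 0$ was arbitrary, we conclude that $\mu^{(s)}(C_t) = 0$.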

To have a complete analog of Theorem 1.4.1, we need to check whether something can be said about the density of $A$ at its own points. To do this we will make use of Proposition 1.2.4.

THEOREM 1.4.5
If $A \subseteq \mathbb{R}^N$ is $\mu^{(s)}$-measurable and $\mu^{(s)}(A) < +\infty$, then
$$\frac{1}{2^s} \le \limsup_{r \searrow 0} \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} \le 1 \quad \text{for } \mu^{(s)}\text{-a.a. } x \in A.$$

PROOF First we show that
$$\limsup_{r \searrow 0} \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} \le 1 \quad \text{for } \mu^{(s)}\text{-a.a. } x \in A. \tag{1.29}$$

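To this end, for every $t > 1$, we introduce the set $C_t \subseteq A$ defined by
$$C_t \overset{\mathrm{df}}{=} \left\{ x \in A : \limsup_{r \searrow 0} \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} > t \right\}.$$
Fix $\varepsilon > 0$. Again $\mu^{(s)} \lfloor A$ is a Radon measure (see Proposition 1.1.9). We can find an open set $U \subseteq \mathbb{R}^N$, such that
$$C_t \subseteq U \quad\text{and}\quad \mu^{(s)}(U \cap A) - \varepsilon \le \mu^{(s)}(C_t) \tag{1.30}$$
(see Proposition 1.1.10(b)). We introduce the family $\mathcal{T}$ of closed balls defined by
$$\mathcal{T} \overset{\mathrm{df}}{=} \left\{ \overline{B}_r(x) : \overline{B}_r(x) \subseteq U,\ 0 < r < \delta,\ \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} > t \right\}.$$
By virtue of Proposition 1.2.4, we can find a sequence $\big\{\overline{B}_{r_n}(x_n)\big\}_{n \ge 1}$ of disjoint balls in $\mathcal{T}$, such that
$$C_t \subseteq \bigcup_{n=1}^{m} \overline{B}_{r_n}(x_n) \cup \bigcup_{n=m+1}^{\infty} \overline{B}_{5 r_n}(x_n) \quad \forall\, m \ge 1.$$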

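Then for $\delta > 0$, we have
$$\begin{aligned}
\mu^{(s)}_{10\delta}(C_t) &\le \sum_{n=1}^{m} (2 r_n)^s + \sum_{n=m+1}^{\infty} (10 r_n)^s \\
&\le \frac{1}{t} \sum_{n=1}^{m} \mu^{(s)}\big(\overline{B}_{r_n}(x_n) \cap A\big) + \frac{5^s}{t} \sum_{n=m+1}^{\infty} \mu^{(s)}\big(\overline{B}_{r_n}(x_n) \cap A\big) \\
&\le \frac{1}{t}\,\mu^{(s)}(U \cap A) + \frac{5^s}{t}\,\mu^{(s)}\Big(\bigcup_{n=m+1}^{\infty} \overline{B}_{r_n}(x_n) \cap A\Big) \quad \forall\, m \ge 1.
\end{aligned}$$
Using (1.30) and letting $m \to +\infty$, we obtain
$$\mu^{(s)}_{10\delta}(C_t) \le \frac{1}{t}\,\mu^{(s)}(U \cap A) \le \frac{1}{t}\big(\mu^{(s)}(C_t) + \varepsilon\big).$$
Letting $\delta \searrow 0$, we see that
$$\mu^{(s)}(C_t) \le \frac{1}{t}\big(\mu^{(s)}(C_t) + \varepsilon\big).$$
Since $\varepsilon > 0$ was arbitrary, we finally have that
$$\mu^{(s)}(C_t) \le \frac{1}{t}\,\mu^{(s)}(C_t),$$
i.e.,
$$\mu^{(s)}(C_t) = 0$$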

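(recall that $t > 1$ and that $\mu^{(s)}(C_t) \le \mu^{(s)}(A) < +\infty$). This proves (1.29). Next we show that
$$\frac{1}{2^s} \le \limsup_{r \searrow 0} \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} \quad \text{for } \mu^{(s)}\text{-a.a. } x \in A.$$
For given $\xi, \delta \in (0, 1)$, we introduce the set $A(\delta, \xi) \subseteq A$, defined by
$$A(\delta, \xi) \overset{\mathrm{df}}{=} \left\{ x \in A : \mu^{(s)}_{\delta}(C \cap A) \le \xi\,\delta(C)^s \text{ for all } C \subseteq \mathbb{R}^N \text{ with } \delta(C) \le \delta \text{ and } x \in C \right\}.$$
Let $\{C_n\}_{n \ge 1}$ be a $\delta$-cover of $A(\delta, \xi)$, such that
$$A(\delta, \xi) \subseteq \bigcup_{n=1}^{\infty} C_n, \quad \delta(C_n) \le \delta \quad\text{and}\quad C_n \cap A(\delta, \xi) \ne \emptyset \quad \forall\, n \ge 1.$$
So
$$\mu^{(s)}_{\delta}\big(A(\delta, \xi)\big) \le \sum_{n=1}^{\infty} \mu^{(s)}_{\delta}\big(C_n \cap A(\delta, \xi)\big) \le \sum_{n=1}^{\infty} \mu^{(s)}_{\delta}(C_n \cap A) \le \sum_{n=1}^{\infty} \xi\,\delta(C_n)^s$$
and from Definition 1.3.5, we see that
$$\mu^{(s)}_{\delta}\big(A(\delta, \xi)\big) \le \xi\,\mu^{(s)}_{\delta}\big(A(\delta, \xi)\big). \tag{1.31}$$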

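Since $0 < \xi < 1$ and
$$\mu^{(s)}_{\delta}\big(A(\delta, \xi)\big) < +\infty,$$
we have
$$\mu^{(s)}_{\delta}\big(A(\delta, \xi)\big) = 0.$$
In particular, from Proposition 1.3.24, we see that
$$\mu^{(s)}\big(A(\delta, 1 - \delta)\big) = 0. \tag{1.32}$$
Set
$$D_\infty \overset{\mathrm{df}}{=} \left\{ x \in A : \limsup_{r \searrow 0} \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} < \frac{1}{2^s} \right\}.$$
If $x \in D_\infty$, then we can find $\delta > 0$, such that
$$\frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} \le \frac{1 - \delta}{2^s} \quad \forall\, r \in (0, \delta]. \tag{1.33}$$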

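For any $C \subseteq \mathbb{R}^N$, with
$$x \in C \cap A \quad\text{and}\quad \delta(C) \le \delta,$$
from (1.33), we have
$$\mu^{(s)}_{\delta}(C \cap A) \le \mu^{(s)}(C \cap A) \le \mu^{(s)}\big(\overline{B}_{\delta(C)}(x) \cap A\big) \le (1 - \delta)\,\delta(C)^s.$$
So it follows that $x \in A(\delta, 1 - \delta)$. Therefore, we have
$$D_\infty \subseteq \bigcup_{n=1}^{\infty} A\Big(\frac{1}{n}, 1 - \frac{1}{n}\Big),$$
and, using also (1.32), we have $\mu^{(s)}(D_\infty) = 0$. Thus the lower bound holds for $\mu^{(s)}$-a.a. $x \in A$, which completes the proof.

For a given locally integrable function, we can estimate the Hausdorff measure of the set where the function is locally large. To do this we shall need the so-called Lebesgue differentiation theorem, also known as the Lebesgue-Besicovitch differentiation theorem.

THEOREM 1.4.6 (Lebesgue-Besicovitch Differentiation Theorem)
If $f \in L^1_{\mathrm{loc}}\big(\mathbb{R}^N; \mathbb{R}^M\big)$, then
$$\lim_{r \searrow 0} \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} \big\|f(y) - f(x)\big\|_{\mathbb{R}^M}\, d\lambda^N(y) = 0 \quad \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N.$$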

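PROOF Let $D = \{u_n\}_{n \ge 1}$ be a dense subset of $\mathbb{R}^M$. Then by the classical differentiation theorem of Lebesgue (see for example Cohn (1980, p. 190)), for every $n \ge 1$ we have
$$\lim_{r \searrow 0} \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} \big\|f(y) - u_n\big\|_{\mathbb{R}^M}\, d\lambda^N(y) = \big\|f(x) - u_n\big\|_{\mathbb{R}^M} \tag{1.34}$$
for $\lambda^N$-a.a. $x \in \mathbb{R}^N$. Suppose that $x \in \mathbb{R}^N$ is a point at which (1.34) is valid for all $n \ge 1$ (the exceptional set is a countable union of null sets, hence null). For a given $\varepsilon > 0$, we can choose $u_n$, such that
$$\big\|f(x) - u_n\big\|_{\mathbb{R}^M} < \varepsilon.$$
Then we have
$$\begin{aligned}
&\limsup_{r \searrow 0} \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} \big\|f(y) - f(x)\big\|_{\mathbb{R}^M}\, d\lambda^N(y) \\
&\quad \le \lim_{r \searrow 0} \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} \big\|f(y) - u_n\big\|_{\mathbb{R}^M}\, d\lambda^N(y) + \big\|u_n - f(x)\big\|_{\mathbb{R}^M} < 2\varepsilon.
\end{aligned}$$
Since $\varepsilon > 0$ was arbitrary, we conclude that
$$\lim_{r \searrow 0} \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} \big\|f(y) - f(x)\big\|_{\mathbb{R}^M}\, d\lambda^N(y) = 0.$$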

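COROLLARY 1.4.7
If $f \in L^{\infty}_{\mathrm{loc}}\big(\mathbb{R}^N; \mathbb{R}^M\big)$, then
$$\lim_{r \searrow 0} \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} f(y)\, d\lambda^N(y) = f(x) \quad \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N.$$

PROOF Note that
$$\lim_{r \searrow 0} \left\| \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} f(y)\, d\lambda^N(y) - f(x) \right\|_{\mathbb{R}^M} \le \lim_{r \searrow 0} \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} \big\|f(y) - f(x)\big\|_{\mathbb{R}^M}\, d\lambda^N(y).$$
So the corollary follows at once from Theorem 1.4.6.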
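To see what the averages in Theorem 1.4.6 do at a good point and at a bad one, the sketch below estimates the mean of $|f(y) - f(x)|$ over $[x - r, x + r]$ for a step function on $\mathbb{R}$. The midpoint rule, the sample count and the names `mean_oscillation` and `heaviside` are assumptions chosen for illustration only.

```python
def mean_oscillation(f, x, r, samples=10_000):
    """Midpoint-rule estimate of the average of |f(y) - f(x)| over [x - r, x + r]."""
    step = 2 * r / samples
    total = sum(abs(f(x - r + (i + 0.5) * step) - f(x)) for i in range(samples))
    return total / samples

def heaviside(y):
    """Step function: 0 on the negative half line, 1 elsewhere."""
    return 1.0 if y >= 0 else 0.0

# At x = 0.3 the averages vanish once the interval no longer reaches the jump.
print(mean_oscillation(heaviside, 0.3, 0.5), mean_oscillation(heaviside, 0.3, 0.1))
# At the jump x = 0 the averages stay near 1/2 for every radius.
print(mean_oscillation(heaviside, 0.0, 0.1), mean_oscillation(heaviside, 0.0, 0.001))
```

The jump point is the only point where the limit fails to be zero, and a single point is a $\lambda^1$-null set, in agreement with the theorem.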

REMARK 1.4.8 Theorem 1.4.6 and Corollary 1.4.7 remain valid if $\lambda^N$ is replaced by any Radon measure on $\mathbb{R}^N$. Also we may replace the ball $\overline{B}_r(x)$ by any other measurable sets $S_r(x)$ containing $x$ which shrink to the point $x \in \mathbb{R}^N$ as $r \searrow 0$. For example we can take $S_r(x)$ to be an $N$-cube with edges equal to $2r$. If $N = 1$, we may take for example the intervals
$$[x - h, x], \quad [x, x + h] \quad\text{or}\quad [x - h, x + h].$$
In Proposition 2.1.22 we shall see that the results are also valid for Banach space valued functions, i.e., $\mathbb{R}^M$ is replaced by a Banach space.

Now we are ready to estimate the Hausdorff measure of the set where a function $f \in L^1_{\mathrm{loc}}\big(\mathbb{R}^N; \mathbb{R}\big)$ is locally large.

THEOREM 1.4.9
If $f \in L^1_{\mathrm{loc}}\big(\mathbb{R}^N; \mathbb{R}\big)$, $0 \le s < N$ and
$$C_s \overset{\mathrm{df}}{=} \left\{ x \in \mathbb{R}^N : \limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(x)} \big|f(y)\big|\, d\lambda^N(y) > 0 \right\},$$
then $\mu^{(s)}(C_s) = 0$.

PROOF It is clear that without any loss of generality, we may assume that $f \in L^1(\mathbb{R}^N; \mathbb{R})$. By virtue of Corollary 1.4.7, we have that
$$\lim_{r \to 0} \frac{1}{r^s} \int_{\overline{B}_r(x)} \big|f(y)\big|\, d\lambda^N(y) = 0 \quad \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N$$

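(recall that $0 \le s < N$). So $\lambda^N(C_s) = 0$. Let $\varepsilon > 0$, $\delta > 0$ and $\xi > 0$ be given. Since $f \in L^1(\mathbb{R}^N; \mathbb{R})$, from the absolute continuity of the Lebesgue integral, we know that we can find $\vartheta > 0$, such that
$$\int_A \big|f(y)\big|\, d\lambda^N(y) < \xi \quad \forall\, A \subseteq \mathbb{R}^N,\ \lambda^N(A) < \vartheta.$$
We introduce the set $C_s^{\varepsilon} \subseteq C_s$ defined by
$$C_s^{\varepsilon} \overset{\mathrm{df}}{=} \left\{ x \in C_s : \limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(x)} \big|f(y)\big|\, d\lambda^N(y) > \varepsilon \right\}.$$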

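We have that $\lambda^N(C_s^{\varepsilon}) = 0$. So we can find an open set $U \subseteq \mathbb{R}^N$, such that $C_s^{\varepsilon} \subseteq U$ and $\lambda^N(U) < \vartheta$. Let us set
$$\mathcal{T} \overset{\mathrm{df}}{=} \left\{ \overline{B}_r(x) : x \in C_s^{\varepsilon},\ 0 < r < \delta,\ \overline{B}_r(x) \subseteq U \ \text{and}\ \int_{\overline{B}_r(x)} \big|f(y)\big|\, d\lambda^N(y) > \varepsilon r^s \right\}.$$
Invoking Proposition 1.2.1, we can find a sequence $\big\{\overline{B}_{r_n}(x_n)\big\}_{n \ge 1} \subseteq \mathcal{T}$ of disjoint balls, such that
$$C_s^{\varepsilon} \subseteq \bigcup_{n=1}^{\infty} \overline{B}_{5 r_n}(x_n).$$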

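From this it follows that
$$\mu^{(s)}_{10\delta}(C_s^{\varepsilon}) \le \sum_{n=1}^{\infty} (10 r_n)^s \le \frac{10^s}{\varepsilon} \sum_{n=1}^{\infty} \int_{\overline{B}_{r_n}(x_n)} \big|f(y)\big|\, d\lambda^N(y) \le \frac{10^s}{\varepsilon} \int_U \big|f(y)\big|\, d\lambda^N(y) \le \frac{10^s}{\varepsilon}\,\xi.$$
Let $\delta \searrow 0$ and then $\xi \searrow 0$, to conclude that $\mu^{(s)}(C_s^{\varepsilon}) = 0$. Since
$$C_s = \bigcup_{n=1}^{\infty} C_s^{1/n},$$
we conclude that $\mu^{(s)}(C_s) = 0$.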

1.5 Lipschitz Functions

In this section we derive some basic properties of Lipschitz continuous functions. A first such result was already established in Proposition 1.3.25.

DEFINITION 1.5.1
Let $C \subseteq \mathbb{R}^N$.
(a) A function $f : C \to \mathbb{R}^M$ is said to be Lipschitz continuous, if there exists a constant $c > 0$, such that
$$\big\|f(x) - f(y)\big\|_{\mathbb{R}^M} \le c\,\|x - y\|_{\mathbb{R}^N} \quad \forall\, x, y \in C.$$
(b) If $f : C \to \mathbb{R}^M$ is Lipschitz continuous, then the Lipschitz constant $\mathrm{Lip}(f) \ge 0$ of $f$ is defined by
$$\mathrm{Lip}(f) \overset{\mathrm{df}}{=} \sup_{\substack{x, y \in C \\ x \ne y}} \frac{\|f(x) - f(y)\|_{\mathbb{R}^M}}{\|x - y\|_{\mathbb{R}^N}}.$$
(c) If $U \subseteq \mathbb{R}^N$ is open, a function $f : U \to \mathbb{R}^M$ is said to be locally Lipschitz, if for every $x \in U$, we can find a neighbourhood $V \subseteq U$ of $x$, such that $f|_V$ is Lipschitz continuous.

THEOREM 1.5.2
If $f : \mathbb{R}^N \to \mathbb{R}^M$, $f = (f_i)_{i=1}^{M}$ and $A \subseteq \mathbb{R}^N$ with $\lambda^N(A) > 0$, then
(a) $\dim\big(\mathrm{Gr}(f|_A)\big) \ge N$, where $\mathrm{Gr}(f|_A)$ is the graph of $f$ over $A$, defined by
$$\mathrm{Gr}(f|_A) \overset{\mathrm{df}}{=} \big\{ (x, y) \in A \times \mathbb{R}^M : y = f(x) \big\};$$
(b) if $f$ is Lipschitz continuous, then $\dim\big(\mathrm{Gr}(f|_A)\big) = N$.

PROOF (a) Let $P : \mathbb{R}^{N+M} \to \mathbb{R}^N$ be the projection operator. The operator $P$ is Lipschitz continuous with $\mathrm{Lip}(P) = 1$. By virtue of Theorem 1.3.21 and Proposition 1.3.25, we have that
$$\frac{1}{c_N}\,\lambda^N(A) = \mu^{(N)}(A) = \mu^{(N)}\big(P\big(\mathrm{Gr}(f|_A)\big)\big) \le \mu^{(N)}\big(\mathrm{Gr}(f|_A)\big)$$
and so $\dim\big(\mathrm{Gr}(f|_A)\big) \ge N$ (see Definition 1.3.8). 0

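1, then we define $\widehat{f} = \big(\widehat{f}_i\big)_{i=1}^{M}$ ($f_i$ are the component functions of $f$). We have
$$\big\|\widehat{f}(x) - \widehat{f}(z)\big\|^2_{\mathbb{R}^M} = \sum_{i=1}^{M} \big|\widehat{f}_i(x) - \widehat{f}_i(z)\big|^2 \le M \big(\mathrm{Lip}(f)\big)^2 \|x - z\|^2_{\mathbb{R}^N},$$
so
$$\mathrm{Lip}\big(\widehat{f}\big) \le \sqrt{M}\,\mathrm{Lip}(f).$$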

REMARK 1.5.5 Let $f : A \to \mathbb{R}$ and let $\overline{f} : X \to \overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$ be defined by
$$\overline{f}(x) \overset{\mathrm{df}}{=} \begin{cases} f(x) & \text{if } x \in A, \\ +\infty & \text{if } x \in X \setminus A, \end{cases}$$
then as we shall see in Chapter 4, $\widehat{f} = \overline{f} \oplus \mathrm{Lip}(f)\,\|\cdot\|_X$, where $\oplus$ denotes the operation of infimal convolution (see Definition 4.4.6(b)). Since this operation preserves convexity, if $A \subseteq X$ is convex and $f : A \to \mathbb{R}$ is Lipschitz continuous and convex, then so is $\widehat{f} : X \to \mathbb{R}$. Also note that the extension $\widehat{f}$ obtained in Theorem 1.5.4 is maximal in the sense that if $g : X \to \mathbb{R}$ is any Lipschitz continuous function with $\mathrm{Lip}(g) \le \mathrm{Lip}(f)$, such that $g|_A = f$, then $g \le \widehat{f}$. Indeed note that
$$g(x) - f(y) \le \mathrm{Lip}(f)\,\|x - y\|_X \quad \forall\, x \in X,\ y \in A,$$
hence $g(x) \le \widehat{f}(x)$. A minimal such extension can be obtained by considering the function
$$\widetilde{f}(x) \overset{\mathrm{df}}{=} \sup_{a \in A} \big[ f(a) - \mathrm{Lip}(f)\,\|x - a\|_X \big].$$
This extension is known as the McShane extension of $f$ and was obtained by McShane (1934), who was the first to study the problem of extension of Lipschitz continuous functions. Finally we mention that Kirszbraun (1934) produced an extension $\widehat{f}$ of a Lipschitz continuous function $f : A \to \mathbb{R}^M$, such that $\mathrm{Lip}(\widehat{f}) = \mathrm{Lip}(f)$ (see also Federer (1969, p. 201)).
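The two extremal extensions of Remark 1.5.5 are easy to evaluate when $A$ is a finite subset of $\mathbb{R}$. The sketch below assumes that the maximal extension is given by the infimum formula $\inf_{a \in A}[f(a) + \mathrm{Lip}(f)\,|x - a|]$, which is what the maximality argument in the remark suggests. All function names and the sample data are hypothetical.

```python
def lipschitz_constant(points, values):
    """Lip(f) for f given on a finite set A of distinct real numbers (at least two points)."""
    return max(abs(values[i] - values[j]) / abs(points[i] - points[j])
               for i in range(len(points)) for j in range(i))

def upper_extension(points, values, x):
    """Maximal extension (assumed form): inf over a in A of f(a) + Lip(f) * |x - a|."""
    lip = lipschitz_constant(points, values)
    return min(v + lip * abs(x - a) for a, v in zip(points, values))

def lower_extension(points, values, x):
    """Minimal extension: sup over a in A of f(a) - Lip(f) * |x - a|."""
    lip = lipschitz_constant(points, values)
    return max(v - lip * abs(x - a) for a, v in zip(points, values))

A = [0.0, 1.0, 3.0]
f_on_A = [0.0, 2.0, 1.0]
# Both extensions agree with f on A and bracket every other admissible extension.
print(lipschitz_constant(A, f_on_A))
print(lower_extension(A, f_on_A, 2.0), upper_extension(A, f_on_A, 2.0))
```

Any extension $g$ with $\mathrm{Lip}(g) \le \mathrm{Lip}(f)$ must take a value between the two printed numbers at $x = 2$.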

One of the main theorems concerning Lipschitz continuous functions is Rademacher's theorem, which asserts that a Lipschitz continuous function $f : \mathbb{R}^N \to \mathbb{R}^M$ is differentiable almost everywhere. This is the starting point for extending the subdifferential theory beyond the family of convex functions (see Chapter 4). First let us recall the following basic definition from multivariable calculus.

DEFINITION 1.5.6
Let $U \subseteq \mathbb{R}^N$ be an open set. We say that a function $f : U \to \mathbb{R}^M$ is differentiable (or Fréchet differentiable) at $x \in U$, if there exists $L(x) \in \mathcal{L}(\mathbb{R}^N; \mathbb{R}^M)$, such that
$$\lim_{h \to 0} \frac{f(x + h) - f(x) - L(x)h}{\|h\|_{\mathbb{R}^N}} = 0.$$

REMARK 1.5.7 Evidently $L(x)$ is unique; it is usually denoted by $Df(x)$ or $f'(x)$ and is called the derivative of $f$ at $x$. From multivariable calculus, we know that if $M = 1$, then
$$Df(x)u = \sum_{k=1}^{N} \frac{\partial f}{\partial x_k}(x)\,u_k = \sum_{k=1}^{N} f'(x; e_k)\,u_k \quad \forall\, u \in \mathbb{R}^N,$$
where $f'(x; v)$ is the directional or Gâteaux derivative of $f$ at $x$ in the direction $v$, defined by
$$f'(x; v) \overset{\mathrm{df}}{=} \lim_{\lambda \to 0} \frac{f(x + \lambda v) - f(x)}{\lambda}$$
and $\{e_k\}_{k=1}^{N}$ is the canonical basis of $\mathbb{R}^N$.
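The identity $Df(x)u = \sum_k f'(x; e_k)\,u_k$ of Remark 1.5.7 can be checked numerically for a smooth function. The sketch below replaces the limit by a central difference quotient with a small fixed step, which is an approximation choice and not part of the text. The names `directional_derivative`, `gradient` and the sample function `f` are illustrative.

```python
def directional_derivative(f, x, v, step=1e-6):
    """Central difference estimate of f'(x; v) = lim (f(x + t v) - f(x)) / t as t -> 0."""
    forward = f([xi + step * vi for xi, vi in zip(x, v)])
    backward = f([xi - step * vi for xi, vi in zip(x, v)])
    return (forward - backward) / (2 * step)

def gradient(f, x):
    """Vector of directional derivatives along the canonical basis e_1, ..., e_N."""
    n = len(x)
    basis = [[1.0 if i == k else 0.0 for i in range(n)] for k in range(n)]
    return [directional_derivative(f, x, e) for e in basis]

def f(x):
    """Sample smooth function on R^2 with gradient (2 x_0 + 3 x_1, 3 x_0)."""
    return x[0] ** 2 + 3.0 * x[0] * x[1]

x, u = [1.0, 2.0], [0.6, 0.8]
grad = gradient(f, x)
print(grad)                                      # close to [8, 3]
print(directional_derivative(f, x, u),           # close to 7.2
      sum(g * uk for g, uk in zip(grad, u)))     # also close to 7.2
```

For a function that is merely Lipschitz, Rademacher's theorem guarantees this agreement only at almost every point.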

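THEOREM 1.5.8 (Rademacher Theorem)
If $U \subseteq \mathbb{R}^N$ is an open set and $f : U \to \mathbb{R}^M$ is a Lipschitz continuous function, then $f$ is differentiable at $\lambda^N$-almost all $x \in U$.

PROOF Clearly we may assume that $M = 1$. For any $u \in \mathbb{R}^N$ with $\|u\|_{\mathbb{R}^N} = 1$, we set
$$f'(x; u) \overset{\mathrm{df}}{=} \lim_{\lambda \to 0} \frac{f(x + \lambda u) - f(x)}{\lambda} \quad \forall\, x \in U,$$
provided this limit exists.

Claim 1. $f'(x; u)$ exists for $\lambda^N$-almost all $x \in U$.
Let
$$f'_+(x; u) \overset{\mathrm{df}}{=} \limsup_{\lambda \to 0} \frac{f(x + \lambda u) - f(x)}{\lambda} \quad \forall\, x \in U,$$
$$f'_-(x; u) \overset{\mathrm{df}}{=} \liminf_{\lambda \to 0} \frac{f(x + \lambda u) - f(x)}{\lambda} \quad \forall\, x \in U.$$
Evidently if
$$C_u \overset{\mathrm{df}}{=} \big\{ x \in U : f'(x; u) \text{ does not exist} \big\},$$
then
$$C_u = \big\{ x \in U : f'_-(x; u) < f'_+(x; u) \big\}.$$
Note that
$$f'_+(x; u) = \inf_{k \ge 1}\ \sup_{\substack{0 < |\lambda| < \frac{1}{k} \\ \lambda \in \mathbb{Q}}} \frac{f(x + \lambda u) - f(x)}{\lambda},$$
so $f'_+(\cdot\,; u)$ is a Borel measurable function. Similarly we show that $f'_-(\cdot\,; u)$ is a Borel measurable function. It follows that $C_u \in \mathcal{B}(U)$ (i.e., $C_u \subseteq U$ is a Borel measurable set). Next for every $x, u \in \mathbb{R}^N$ with $\|u\|_{\mathbb{R}^N} = 1$, let $\varphi : \mathbb{R} \to \mathbb{R}$ be defined by
$$\varphi(\lambda) \overset{\mathrm{df}}{=} f(x + \lambda u) \quad \forall\, \lambda \in \mathbb{R}.$$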

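The function $\varphi$ is Lipschitz continuous, hence absolutely continuous and so by the fundamental theorem of Lebesgue calculus (see Theorem A.2.20), it is differentiable at almost every $\lambda \in \mathbb{R}$. Therefore
$$\mu^{(1)}(C_u \cap L) = 0,$$
for every line $L$ parallel to the direction $u$. Hence by Fubini's theorem, we have $\lambda^N(C_u) = 0$ and this proves the claim.

From Claim 1 and Remark 1.5.7, we see that
$$\nabla f(x) = \left( \frac{\partial f}{\partial x_k}(x) \right)_{k=1}^{N} \quad \text{exists for } \lambda^N\text{-a.a. } x \in U.$$

Claim 2. $f'(x; u) = \big(u, \nabla f(x)\big)_{\mathbb{R}^N}$ for $\lambda^N$-almost all $x \in U$.
Let $\vartheta \in C_c^{\infty}(U)$. We have
$$\int_U \frac{f(x + \lambda u) - f(x)}{\lambda}\,\vartheta(x)\, d\lambda^N(x) = - \int_U f(x)\,\frac{\vartheta(x) - \vartheta(x - \lambda u)}{\lambda}\, d\lambda^N(x).$$
Let $\lambda = \frac{1}{k}$ for $k \ge 1$. Since $f$ is Lipschitz continuous, we have
$$\left| \frac{f\big(x + \frac{1}{k} u\big) - f(x)}{\frac{1}{k}} \right| \le \mathrm{Lip}(f)\,\|u\|_{\mathbb{R}^N} = \mathrm{Lip}(f).$$
Therefore when $k \to +\infty$, from the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that
$$\begin{aligned}
\int_U f'(x; u)\,\vartheta(x)\, d\lambda^N(x) &= - \int_U f(x)\,\vartheta'(x; u)\, d\lambda^N(x) \\
&= - \sum_{k=1}^{N} u_k \int_U f(x)\,\frac{\partial \vartheta}{\partial x_k}(x)\, d\lambda^N(x) = \sum_{k=1}^{N} u_k \int_U \frac{\partial f}{\partial x_k}(x)\,\vartheta(x)\, d\lambda^N(x) \\
&= \int_U \big(u, \nabla f(x)\big)_{\mathbb{R}^N}\,\vartheta(x)\, d\lambda^N(x).
\end{aligned}$$
Because $\vartheta \in C_c^{\infty}(U)$ is arbitrary, it follows that
$$f'(x; u) = \big(u, \nabla f(x)\big)_{\mathbb{R}^N} \quad \text{for } \lambda^N\text{-a.a. } x \in U.$$
This proves the second claim.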

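Let $\{u_n\}_{n \ge 1}$ be a dense subset of $\partial B_1(0)$. For $n \ge 1$, we define
$$E_n \overset{\mathrm{df}}{=} \Big\{ x \in U : f'(x; u_n) \text{ and } \nabla f(x) \text{ exist and } f'(x; u_n) = \big(u_n, \nabla f(x)\big)_{\mathbb{R}^N} \Big\}$$
and
$$E \overset{\mathrm{df}}{=} \bigcap_{n=1}^{\infty} E_n.$$
By virtue of Claim 2, we have that $\lambda^N(U \setminus E) = 0$.

Claim 3. $f$ is differentiable at every $x \in E$.
Let $x \in E$, $u \in \partial B_1(0)$, $\lambda \in \mathbb{R} \setminus \{0\}$ and set
$$\eta(x, u, \lambda) \overset{\mathrm{df}}{=} \frac{f(x + \lambda u) - f(x)}{\lambda} - \big(u, \nabla f(x)\big)_{\mathbb{R}^N}.$$
If $v \in \partial B_1(0)$, we have
$$\begin{aligned}
\big|\eta(x, u, \lambda) - \eta(x, v, \lambda)\big| &\le \left| \frac{f(x + \lambda u) - f(x + \lambda v)}{\lambda} \right| + \Big| \big(u - v, \nabla f(x)\big)_{\mathbb{R}^N} \Big| \\
&\le \mathrm{Lip}(f)\,\|u - v\|_{\mathbb{R}^N} + \big\|\nabla f(x)\big\|_{\mathbb{R}^N} \|u - v\|_{\mathbb{R}^N}.
\end{aligned}$$
Note that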

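$$\left| \frac{\partial f}{\partial x_k}(x) \right| \le 2\,\mathrm{Lip}(f) \quad \forall\, k \in \{1, \ldots, N\}$$
and so
$$\big\|\nabla f(x)\big\|_{\mathbb{R}^N} \le 2\sqrt{N}\,\mathrm{Lip}(f)$$
(see the proof of Theorem 1.5.4). Therefore, we have
$$\big|\eta(x, u, \lambda) - \eta(x, v, \lambda)\big| \le \big(1 + 2\sqrt{N}\big)\,\mathrm{Lip}(f)\,\|u - v\|_{\mathbb{R}^N}. \tag{1.38}$$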

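Let $\varepsilon > 0$ be given. We can choose $l \ge 1$ large enough so that
$$\forall\, v \in \partial B_1(0)\ \ \exists\, k \in \{1, \ldots, l\} : \ \|v - u_k\|_{\mathbb{R}^N} \le \frac{\varepsilon}{2\big(1 + 2\sqrt{N}\big)\,\mathrm{Lip}(f)}. \tag{1.39}$$
As $x \in E$, we have
$$\lim_{\lambda \to 0} \eta(x, u_k, \lambda) = 0 \quad \forall\, k \ge 1.$$
So we can find $\delta > 0$, such that
$$\big|\eta(x, u_k, \lambda)\big| < \frac{\varepsilon}{2} \quad \forall\, 0 < |\lambda| < \delta,\ k \in \{1, \ldots, l\}. \tag{1.40}$$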

Thus from (1.38), (1.39) and (1.40), for every $v \in \partial B_1(0)$, we can find $k \in \{1, \ldots, l\}$, such that for all $0 < |\lambda| < \delta$, we have
$$\big|\eta(x, v, \lambda)\big| \le \big|\eta(x, u_k, \lambda)\big| + \big|\eta(x, v, \lambda) - \eta(x, u_k, \lambda)\big| < \varepsilon. \tag{1.41}$$
We emphasize that $\delta > 0$ is independent of $v \in \partial B_1(0)$. Let $y \in U$, $y \ne x$ and let us set
$$v \overset{\mathrm{df}}{=} \frac{y - x}{\|y - x\|_{\mathbb{R}^N}} \in \partial B_1(0).$$
We have $y = x + \lambda v$, with $\lambda = \|y - x\|_{\mathbb{R}^N}$. Then
$$f(y) - f(x) - \big(\nabla f(x), y - x\big)_{\mathbb{R}^N} = f(x + \lambda v) - f(x) - \lambda \big(\nabla f(x), v\big)_{\mathbb{R}^N} = \lambda\,\eta(x, v, \lambda) = o(\lambda) = o\big(\|y - x\|_{\mathbb{R}^N}\big) \quad \text{as } y \to x.$$
Therefore $f$ is differentiable at $x \in E$ with $Df(x) = \nabla f(x)$. This proves the claim and the theorem.

COROLLARY 1.5.9
If $U \subseteq \mathbb{R}^N$ is an open set and $f : U \to \mathbb{R}^M$ is a locally Lipschitz function, then $f$ is differentiable at $\lambda^N$-almost all $x \in U$.

PROOF Again we may assume that $M = 1$. Note that since $f$ is locally Lipschitz, it is Lipschitz continuous when restricted to any compact set $K \subseteq U$. Indeed, if this is not true, then we can find a compact set $K \subseteq U$ and two sequences $\{x_n\}_{n \ge 1}, \{y_n\}_{n \ge 1} \subseteq K$, such that
$$n\,\|x_n - y_n\|_{\mathbb{R}^N} < \big|f(x_n) - f(y_n)\big| \quad \forall\, n \ge 1.$$
Note that
$$\|x_n - y_n\|_{\mathbb{R}^N} \le \frac{2}{n} \max_K |f| \quad \forall\, n \ge 1.$$
Since $K$ is compact, we can produce two subsequences $\{x_{n_k}\}_{k \ge 1}$ of $\{x_n\}_{n \ge 1}$ and $\{y_{n_k}\}_{k \ge 1}$ of $\{y_n\}_{n \ge 1}$, such that
$$x_{n_k} \to v \quad\text{and}\quad y_{n_k} \to v,$$
for some $v \in K$, which contradicts the fact that in a neighbourhood of $v$, the function $f$ is Lipschitz continuous.

Next let $\{U_n\}_{n \ge 1}$ be an increasing sequence of bounded open subsets of $U$, such that $U = \bigcup_{n=1}^{\infty} U_n$, for example let
$$U_n \overset{\mathrm{df}}{=} \Big\{ x \in U : \|x\|_{\mathbb{R}^N} < n,\ \frac{1}{n} < d(x, \partial U) \Big\} \quad \forall\, n \ge 1.$$
Since $\overline{U_n}$ is a compact subset of $U$, the function $f$ is Lipschitz continuous on $U_n$. Then by virtue of Theorem 1.5.8, $f$ is differentiable at $\lambda^N$-almost all $x \in U_n$, for $n \ge 1$. Therefore $f$ is differentiable $\lambda^N$-almost everywhere on $U$.

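COROLLARY 1.5.10
If $U \subseteq \mathbb{R}^N$ is an open set, $f : U \to \mathbb{R}^M$ is a locally Lipschitz function and
$$Z \overset{\mathrm{df}}{=} \big\{ x \in U : f(x) = 0 \big\},$$
then $Df(x) = 0$ for $\lambda^N$-almost all $x \in Z$.

PROOF As before we may assume that $M = 1$. Also suppose that $\lambda^N(Z) > 0$ or otherwise there is nothing to prove. Then by virtue of Theorems 1.4.1 and 1.5.8, $\lambda^N$-almost every $x \in Z$ is such that $Df(x)$ exists and
$$\lim_{r \searrow 0} \frac{\lambda^N\big(\overline{B}_r(x) \cap Z\big)}{\lambda^N\big(\overline{B}_r(x)\big)} = 1. \tag{1.42}$$
Fix such an $x$. Since $f(x) = 0$, we have
$$f(y) = \big(\nabla f(x), y - x\big)_{\mathbb{R}^N} + o\big(\|y - x\|_{\mathbb{R}^N}\big) \quad \text{as } y \to x. \tag{1.43}$$
Suppose that
$$v \overset{\mathrm{df}}{=} \nabla f(x) \ne 0.$$
We introduce the set
$$C \overset{\mathrm{df}}{=} \Big\{ u \in \partial B_1(0) : (v, u)_{\mathbb{R}^N} \ge \frac{\|v\|_{\mathbb{R}^N}}{2} \Big\}.$$
For a given $u \in C$ and $t > 0$, in (1.43), we set $y = x + tu$ and we have
$$f(x + tu) = t\,\big(\nabla f(x), u\big)_{\mathbb{R}^N} + o\big(t\,\|u\|_{\mathbb{R}^N}\big) \ge \frac{t\,\|v\|_{\mathbb{R}^N}}{2} + o(t),$$
so
$$\frac{f(x + tu)}{t} \ge \frac{\|v\|_{\mathbb{R}^N}}{2} + \frac{o(t)}{t}$$
and
$$f(x + tu) > 0 \quad \forall\, t \in (0, t^*),\ u \in C,$$
for some $t^* > 0$. Hence $Z$ does not meet the set $\{x + tu : 0 < t < t^*,\ u \in C\}$, which occupies a fixed positive fraction of every ball $\overline{B}_r(x)$ with $r < t^*$, a contradiction to (1.42).

COROLLARY 1.5.11
If $f_1, f_2 : \mathbb{R}^N \to \mathbb{R}^N$ are locally Lipschitz and
$$E \overset{\mathrm{df}}{=} \big\{ x \in \mathbb{R}^N : (f_2 \circ f_1)(x) = x \big\},$$
then $D(f_2 \circ f_1)(x) = Df_2\big(f_1(x)\big)\,Df_1(x) = \mathrm{id}_{\mathbb{R}^N}$ for $\lambda^N$-almost all $x \in E$.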

PROOF Let
$$\mathrm{dom}\, f_i \overset{\mathrm{df}}{=} \big\{ x \in \mathbb{R}^N : Df_i(x) \text{ exists} \big\}, \quad \text{for } i = 1, 2.$$
We set
$$C \overset{\mathrm{df}}{=} E \cap \mathrm{dom}\, f_1 \cap f_1^{-1}(\mathrm{dom}\, f_2).$$
If $x \in E \setminus f_1^{-1}(\mathrm{dom}\, f_2)$, then $f_1(x) \in \mathbb{R}^N \setminus \mathrm{dom}\, f_2$ and so
$$x = f_2\big(f_1(x)\big) = \big(f_2 \circ f_1\big)(x) \in f_2\big(\mathbb{R}^N \setminus \mathrm{dom}\, f_2\big).$$
It follows that
$$E \setminus C \subseteq \big(\mathbb{R}^N \setminus \mathrm{dom}\, f_1\big) \cup f_2\big(\mathbb{R}^N \setminus \mathrm{dom}\, f_2\big). \tag{1.44}$$
Invoking Theorem 1.5.8, from (1.44), we infer that $\lambda^N(E \setminus C) = 0$ (recall that a Lipschitz continuous function maps Lebesgue-null sets to Lebesgue-null sets). If $x \in C$, then from the definition of $C$, we see that $Df_1(x)$ and $Df_2\big(f_1(x)\big)$ exist and so
$$Df_2\big(f_1(x)\big)\,Df_1(x) = D(f_2 \circ f_1)(x)$$
(chain rule). Since $(f_2 \circ f_1)(x) - x = 0$ for all $x \in E$, from Corollary 1.5.10, we infer that
$$Df_2\big(f_1(x)\big)\,Df_1(x) = \mathrm{id}_{\mathbb{R}^N} \quad \text{for } \lambda^N\text{-a.a. } x \in E.$$

Continuing with our investigation of Lipschitz continuous maps $f : \mathbb{R}^N \to \mathbb{R}^M$, we aim at deriving change of variables formulas. We distinguish two cases. In the first case $N \le M$ and the change of variables formula is obtained via the so-called area formula, which asserts that the $N$-dimensional measure of $f(A)$ can be calculated by integrating a suitable Jacobian. In the second case $M \le N$ and the change of variables formula passes through the so-called coarea formula, which asserts that the integral of the $(N - M)$-dimensional measure of the level sets of $f$ is computed by integrating a suitable Jacobian. First we derive the area formula and for this we need some preparation. We start with a result from linear algebra, known as polar decomposition. It produces for a linear operator $L : \mathbb{R}^N \to \mathbb{R}^M$ an analog of the polar representation $z = r e^{i\vartheta}$ of a complex number. First some definitions.

DEFINITION 1.5.12
(a) An operator $U : \mathbb{R}^N \to \mathbb{R}^M$ is said to be orthogonal, if
$$\big(Ux, Uy\big)_{\mathbb{R}^M} = (x, y)_{\mathbb{R}^N} \quad \forall\, x, y \in \mathbb{R}^N.$$
(b) For a given operator $L : \mathbb{R}^N \to \mathbb{R}^M$, its adjoint $L^* : \mathbb{R}^M \to \mathbb{R}^N$ is defined by
$$\big(Lx, y\big)_{\mathbb{R}^M} = \big(x, L^* y\big)_{\mathbb{R}^N} \quad \forall\, x \in \mathbb{R}^N,\ y \in \mathbb{R}^M.$$
(c) An operator $L : \mathbb{R}^N \to \mathbb{R}^N$ is self-adjoint, if $L^* = L$.
(d) An operator $S : \mathbb{R}^k \to \mathbb{R}^k$ is said to be positive (we write $S \ge 0$), if it is self-adjoint (i.e., $S = S^*$) and
$$\big(Sx, x\big)_{\mathbb{R}^k} \ge 0 \quad \forall\, x \in \mathbb{R}^k.$$

REMARK 1.5.13 If $N = M$, then $U : \mathbb{R}^N \to \mathbb{R}^N$ is orthogonal if and only if $U^* = U^{-1}$ (hence $U$ is invertible). In general, if $U : \mathbb{R}^N \to \mathbb{R}^M$ is orthogonal, then $N \le M$ and $U^* \circ U = \mathrm{id}_{\mathbb{R}^N}$. Also $U : \mathbb{R}^N \to \mathbb{R}^M$ is orthogonal if and only if it is an isometry. Finally, if $S : \mathbb{R}^N \to \mathbb{R}^N$ is self-adjoint, then we can find an orthogonal operator $U : \mathbb{R}^N \to \mathbb{R}^N$ and a diagonal operator $D : \mathbb{R}^N \to \mathbb{R}^N$, such that $S = U \circ D \circ U^{-1}$. A positive operator $S : \mathbb{R}^N \to \mathbb{R}^N$ has a unique positive square root $T : \mathbb{R}^N \to \mathbb{R}^N$, i.e., $T^2 = S$.

THEOREM 1.5.14 (Polar Decomposition Theorem)
Let $L : \mathbb{R}^N \to \mathbb{R}^M$ be a linear operator.
(a) If $N \le M$, then there exist a positive operator $S : \mathbb{R}^N \to \mathbb{R}^N$ and an orthogonal operator $U : \mathbb{R}^N \to \mathbb{R}^M$, such that $L = U \circ S$.
(b) If $M \le N$, then there exist a positive operator $S : \mathbb{R}^M \to \mathbb{R}^M$ and an orthogonal operator $U : \mathbb{R}^M \to \mathbb{R}^N$, such that $L = S \circ U^*$.

PROOF (a) Since $L^{**} = L$, the operator $L^* \circ L : \mathbb{R}^N \to \mathbb{R}^N$ is clearly positive. So it admits a unique positive square root $S : \mathbb{R}^N \to \mathbb{R}^N$ (see Remark 1.5.13). For each $y = Sx \in R(S)$, we write $Uy = Lx$, motivated by the fact that eventually we must have $L = U \circ S$. First, we need to show that with this definition $U$ is unambiguously defined on $R(S)$, that is if $Sx_1 = Sx_2$, then $Lx_1 = Lx_2$. Note that for every $x \in \mathbb{R}^N$, we have
$$\|Sx\|^2_{\mathbb{R}^N} = \big(S^2 x, x\big)_{\mathbb{R}^N} = \big(L^* L x, x\big)_{\mathbb{R}^N} = \|Lx\|^2_{\mathbb{R}^M}.$$
So $Sx_1 = Sx_2$, which is equivalent to saying that $\big\|S(x_1 - x_2)\big\|_{\mathbb{R}^N} = 0$, implies that
$$\big\|L(x_1 - x_2)\big\|_{\mathbb{R}^M} = 0.$$
Therefore $U$ is well defined on $R(S)$ and its range equals $R(L)$. Note that $L$ and $S$ have the same kernel. So
$$\dim R(L) = \dim R(S) \quad\text{and}\quad \dim R(S)^{\perp} = N - \dim R(S) \le M - \dim R(L) = \dim R(L)^{\perp}.$$
Therefore there exists a linear isometry $U_0 : R(S)^{\perp} \to R(L)^{\perp}$ (an isometric isomorphism if $N = M$). We extend $U$ on $R(S)^{\perp}$ by setting it equal to $U_0$. Since $\mathbb{R}^N = R(S) \oplus R(S)^{\perp}$, every $y \in \mathbb{R}^N$ can be written in a unique way as
$$y = Sx + u, \quad \text{with } u \in R(S)^{\perp}.$$
We set $Uy = Lx + U_0 u$ and we have $U : \mathbb{R}^N \to \mathbb{R}^M$, which is linear and well defined. Also, exploiting the orthogonality of $R(S)$ and $R(S)^{\perp}$ and of $R(L)$ and $R(L)^{\perp}$, we have
$$\begin{aligned}
\big(Uy, Uy\big)_{\mathbb{R}^M} &= \big(Lx + U_0 u, Lx + U_0 u\big)_{\mathbb{R}^M} = \big(Lx, Lx\big)_{\mathbb{R}^M} + \big(U_0 u, U_0 u\big)_{\mathbb{R}^M} \\
&= \big(Sx, Sx\big)_{\mathbb{R}^N} + (u, u)_{\mathbb{R}^N} = (y, y)_{\mathbb{R}^N},
\end{aligned}$$
so $U$ is orthogonal and $U \circ S = L$.
(b) Follows if we apply (a) to the operator $L^* : \mathbb{R}^M \to \mathbb{R}^N$.

REMARK 1.5.15 In a polar decomposition $L = U \circ S$, the positive operator $S$ is unique. Indeed suppose that $U \circ S = U_1 \circ S_1$. Then by taking adjoints, we obtain $S \circ U^* = S_1 \circ U_1^*$ and so
$$S^2 = S \circ U^* \circ U \circ S = S_1 \circ U_1^* \circ U_1 \circ S_1 = S_1^2.$$
The positive operator $S^2 = S_1^2$ has a unique positive square root, hence $S = S_1$. Moreover, if $N = M$ and the operator $L$ is invertible, then in the polar decomposition $L = U \circ S$, the orthogonal operator $U$ is unique too. Indeed, since $L$ is invertible, so is $S$ (since $S = U^{-1} \circ L$). Then from $U \circ S = U_1 \circ S_1$ and since $S^{-1} = S_1^{-1}$, we have that
$$U = U_1 \circ S_1 \circ S^{-1} = U_1 \circ S_1 \circ S_1^{-1} = U_1.$$
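For an invertible $2 \times 2$ matrix the construction in the proof of Theorem 1.5.14 can be carried out by hand. The sketch below is an illustration under stated assumptions: it uses the closed form $\sqrt{G} = (G + \sqrt{\det G}\, I)/\sqrt{\operatorname{tr} G + 2\sqrt{\det G}}$ for a symmetric positive definite $2 \times 2$ matrix $G$ (a consequence of the Cayley-Hamilton theorem, not taken from the text), and the helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    """Transpose (adjoint) of a real 2x2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]

def polar_2x2(l):
    """Return (U, S) with L = U S, S positive and U orthogonal, for an invertible 2x2 L."""
    g = matmul(transpose(l), l)                              # G = L* L
    s = math.sqrt(g[0][0] * g[1][1] - g[0][1] * g[1][0])     # sqrt(det G) = |det L|
    t = math.sqrt(g[0][0] + g[1][1] + 2 * s)                 # trace of the square root
    root = [[(g[0][0] + s) / t, g[0][1] / t],                # S = sqrt(G), closed form
            [g[1][0] / t, (g[1][1] + s) / t]]
    det = root[0][0] * root[1][1] - root[0][1] * root[1][0]
    inverse = [[root[1][1] / det, -root[0][1] / det],
               [-root[1][0] / det, root[0][0] / det]]
    return matmul(l, inverse), root                          # U = L S^{-1}

L = [[3.0, 0.0], [4.0, 5.0]]
U, S = polar_2x2(L)
print(U)   # orthogonal factor
print(S)   # positive factor, with det S = |det L| = 15
```

Here $\det S = |\det L|$, which is the Jacobian of $L$ in the sense of the definition that follows.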

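We can use Theorem 1.5.14 to define the Jacobian of a Lipschitz continuous map $f : \mathbb{R}^N \to \mathbb{R}^M$.

DEFINITION 1.5.16
Let $L : \mathbb{R}^N \to \mathbb{R}^M$ be a linear operator.
(a) If $N \le M$ and $L = U \circ S$ is a polar decomposition of $L$ (see Theorem 1.5.14), then we define the Jacobian of $L$ to be
$$\mathrm{jac}\, L \overset{\mathrm{df}}{=} \big|\det S\big|.$$
(b) If $M \le N$ and $L = S \circ U^*$ is a polar decomposition of $L$ (see Theorem 1.5.14), then we define the Jacobian of $L$ to be
$$\mathrm{jac}\, L \overset{\mathrm{df}}{=} \big|\det S\big|.$$
(c) If $f : \mathbb{R}^N \to \mathbb{R}^M$ is Lipschitz continuous and
$$Df = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_N} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_M}{\partial x_1} & \cdots & \dfrac{\partial f_M}{\partial x_N} \end{pmatrix}$$
is the $M \times N$ gradient matrix, then the Jacobian of $f$ is defined by
$$Jf(x) \overset{\mathrm{df}}{=} \mathrm{jac}\, Df(x) \quad \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N.$$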

REMARK 1.5.17 Since in a polar decomposition the positive operator is uniquely defined (see Remark 1.5.15), the notions introduced in Definition 1.5.16 are well defined. If $L : \mathbb{R}^N \to \mathbb{R}^M$ is a linear operator, then we can easily check that if $N \le M$, we have $(\mathrm{jac}\, L)^2 = \det(L^* \circ L)$, while if $M \le N$, we have $(\mathrm{jac}\, L)^2 = \det(L \circ L^*)$. Another expression computing $(\mathrm{jac}\, L)^2$ is given by the so-called Binet-Cauchy formula. So let $N \le M$ and set
$$\Theta(N, M) \overset{\mathrm{df}}{=} \big\{ \theta : \{1, \ldots, N\} \to \{1, \ldots, M\} \text{ is increasing} \big\}.$$
For each $\theta \in \Theta(N, M)$, we define $P_\theta : \mathbb{R}^M \to \mathbb{R}^N$ by
$$P_\theta(x_1, \ldots, x_M) \overset{\mathrm{df}}{=} \big(x_{\theta(1)}, \ldots, x_{\theta(N)}\big).$$

1. Hausdorff Measures and Capacity

65

Clearly Pθ is the projection operator of RM on some N -dimensional subspace N V = span {eθ(k) }N −→ RM is a linear operator then a k=1 . Then, if L : R straightforward but cumbersome proof gives the Binet-Cauchy formula: jac L2 =

X

det(Pθ ◦ L)2 .

θ∈Θ(N,M )
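The Binet-Cauchy formula can be checked on small integer matrices. The sketch below assumes $L$ is given as a list of $M$ rows of length $N$ with $N \le M$; the function names are illustrative. It compares $\det(L^* \circ L)$ with the sum of squared $N \times N$ minors, one minor for each increasing $\theta$.

```python
from itertools import combinations

def det(A):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

def gram_det(L):
    """det(L^T L) for an M x N matrix L given as a list of rows."""
    N = len(L[0])
    G = [[sum(row[i] * row[j] for row in L) for j in range(N)] for i in range(N)]
    return det(G)

def binet_cauchy_sum(L):
    """Sum of det(P_theta L)^2 over all increasing theta, i.e. over all N-row submatrices."""
    M, N = len(L), len(L[0])
    return sum(det([L[r] for r in rows]) ** 2 for rows in combinations(range(M), N))

L = [[1, 2], [3, 4], [5, 6]]
print(gram_det(L), binet_cauchy_sum(L))  # 24 24
```

For this $3 \times 2$ example the three minors are $-2$, $-4$, $-2$, so both sides equal $24$.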

For details we refer to Evans & Gariepy (1992, p. 89).

LEMMA 1.5.18 If $L\colon \mathbb{R}^N \to \mathbb{R}^M$ is a linear operator, $N \le M$ and $A \subseteq \mathbb{R}^N$, then $\mu^{(N)}\big(L(A)\big) = \operatorname{jac} L \cdot \lambda^N(A)$.

PROOF Let $L = U \circ S$ be a polar decomposition of $L$ (see Theorem 1.5.14). We know that $\operatorname{jac} L = |\det S|$ (see Definition 1.5.16(a)). If $\operatorname{jac} L = 0$, then $\det S = 0$ and so $S$ is not surjective, i.e., $\dim R(S) \le N - 1$. It follows that $\mu^{(N)}\big(L(A)\big) = 0$ (see, e.g., Proposition 1.3.23). If $\operatorname{jac} L > 0$, then using the orthogonality of $U$ and the facts that $\mu^{(N)} = \lambda^N$ in $\mathbb{R}^N$, $L = U \circ S$ and $U^* \circ U = \mathrm{id}_{\mathbb{R}^N}$, we have
$$\frac{\mu^{(N)}(L(B_r(x)))}{\lambda^N(B_r(x))} = \frac{\lambda^N((U^* \circ L)(B_r(x)))}{\lambda^N(B_r(x))} = \frac{\lambda^N((U^* \circ U \circ S)(B_r(x)))}{\lambda^N(B_r(x))} = \frac{\lambda^N(S(B_r(x)))}{\lambda^N(B_r(x))} = \frac{\lambda^N(S(B_1(0)))}{a(N)} = |\det S| = \operatorname{jac} L, \tag{1.45}$$
where $a(N) \stackrel{df}{=} \dfrac{\pi^{N/2}}{\left(\frac{N}{2}\right)!}$ is the volume of the unit ball in $\mathbb{R}^N$. Finally let
$$\vartheta(A) \stackrel{df}{=} \mu^{(N)}\big(L(A)\big) \quad \forall\, A \subseteq \mathbb{R}^N.$$
Then $\vartheta$ is a Radon measure and $\vartheta \ll \lambda^N$. So the Radon-Nikodym derivative of $\vartheta$ with respect to $\lambda^N$ (see Theorem A.2.24 and Remark A.2.25) exists and is given by
$$\frac{d\vartheta}{d\lambda^N}(x) = \lim_{r \searrow 0} \frac{\vartheta(B_r(x))}{\lambda^N(B_r(x))} = \operatorname{jac} L$$
(see (1.45) and Widom (1969, p. 119)). From the Radon-Nikodym theorem (see Theorem A.2.24), we infer that for all Borel sets $A \subseteq \mathbb{R}^N$, we have
$$\mu^{(N)}\big(L(A)\big) = \operatorname{jac} L \cdot \lambda^N(A). \tag{1.46}$$
Because $\vartheta$ and $\lambda^N$ are both Radon measures, we conclude that (1.46) holds for all $A \subseteq \mathbb{R}^N$.


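LEMMA 1.5.19 If $f\colon \mathbb{R}^N \to \mathbb{R}^M$ is Lipschitz continuous, $N \le M$ and $A \subseteq \mathbb{R}^N$ is Lebesgue measurable, then

(a) $f(A)$ is a $\mu^{(N)}$-measurable set;

(b) the map $y \mapsto \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)$ is $\mu^{(N)}$-measurable on $\mathbb{R}^M$;

(c) $\displaystyle\int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y) \le \big(\operatorname{Lip}(f)\big)^N \lambda^N(A)$.

PROOF Clearly we can assume without any loss of generality that $A$ is bounded (if not, consider instead $A \cap B_r(0)$).

(a) From the regularity of the Lebesgue measure, we know that for every $i \ge 1$, we can find a compact set $K_i \subseteq A$, such that
$$\lambda^N(A \setminus K_i) \le \frac{1}{i} \quad \forall\, i \ge 1.$$
Because $f$ is Lipschitz continuous, $f(K_i) \subseteq \mathbb{R}^M$ is compact and so it is $\mu^{(N)}$-measurable. Then
$$f\Big(\bigcup_{i=1}^{\infty} K_i\Big) = \bigcup_{i=1}^{\infty} f(K_i) \quad \text{is a } \mu^{(N)}\text{-measurable set.}$$
Also, using Proposition 1.3.25, we have
$$\mu^{(N)}\Big(f(A) \setminus f\Big(\bigcup_{i=1}^{\infty} K_i\Big)\Big) \le \mu^{(N)}\Big(f\Big(A \setminus \bigcup_{i=1}^{\infty} K_i\Big)\Big) \le \operatorname{Lip}(f)^N \lambda^N\Big(A \setminus \bigcup_{i=1}^{\infty} K_i\Big) = 0,$$
so $f(A)$ is $\mu^{(N)}$-measurable.

(b) For $k \ge 1$, we introduce the following families of $N$-cubes
$$\mathcal{F}_k \stackrel{df}{=} \Big\{Q \subseteq \mathbb{R}^N : Q = \prod_{j=1}^{N} \Big(\frac{c_j}{k}, \frac{c_j + 1}{k}\Big],\ c_j \text{ are integers},\ j \in \{1,\dots,N\}\Big\}.$$
Let
$$h_k \stackrel{df}{=} \sum_{Q \in \mathcal{F}_k} \chi_{f(A \cap Q)}.$$
Then by part (a), $h_k$ is $\mu^{(N)}$-measurable and for every $y \in \mathbb{R}^M$, $h_k(y)$ is the number of cubes $Q \in \mathcal{F}_k$, such that $(A \cap Q) \cap f^{-1}(\{y\}) \ne \emptyset$. Hence for all $y \in \mathbb{R}^M$, we have
$$h_k(y) \nearrow \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big) \quad \text{as } k \to +\infty,$$
so the function $y \mapsto \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)$ is $\mu^{(N)}$-measurable.

(c) Using the monotone convergence theorem (see Theorem A.2.10), Proposition 1.3.25 and the fact that
$$\mathbb{R}^N = \bigcup_{Q \in \mathcal{F}_k} Q \quad \forall\, k \ge 1,$$
we have
$$\int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y) = \lim_{k \to +\infty} \int_{\mathbb{R}^M} h_k(y)\, d\mu^{(N)}(y) = \lim_{k \to +\infty} \sum_{Q \in \mathcal{F}_k} \mu^{(N)}\big(f(A \cap Q)\big) \le \limsup_{k \to +\infty} \sum_{Q \in \mathcal{F}_k} \big(\operatorname{Lip}(f)\big)^N \lambda^N(A \cap Q) = \big(\operatorname{Lip}(f)\big)^N \lambda^N(A).$$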

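LEMMA 1.5.20 If $f\colon \mathbb{R}^N \to \mathbb{R}^M$ is Lipschitz continuous, $t > 1$ and
$$C \stackrel{df}{=} \big\{x \in \mathbb{R}^N : Df(x) \text{ exists and } Jf(x) > 0\big\},$$
then there exists a sequence $\{E_i\}_{i \ge 1}$ of Borel subsets of $\mathbb{R}^N$, such that

(a) $C = \bigcup_{i=1}^{\infty} E_i$;

(b) $f|_{E_i}$ is injective for all $i \ge 1$;

(c) for each $i \ge 1$ there exists a self-adjoint isomorphism $L_i\colon \mathbb{R}^N \to \mathbb{R}^N$, such that
$$\operatorname{Lip}\big((f|_{E_i}) \circ L_i^{-1}\big) \le t, \qquad \operatorname{Lip}\big(L_i \circ (f|_{E_i})^{-1}\big) \le t$$
and
$$\frac{|\det L_i|}{t^N} \le Jf|_{E_i} \le t^N |\det L_i|.$$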


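PROOF Choose $\varepsilon > 0$ so that $\frac{1}{t} + \varepsilon < 1 < t - \varepsilon$. Let $E$ be a countable dense subset of $C$ and let $G$ be a countable dense subset of the space of self-adjoint isomorphisms of $\mathbb{R}^N$. For each $u \in E$, $L \in G$ and $k \ge 1$, we set $E(u, L, k)$ to be the set of all $x \in C \cap B_{1/k}(u)$, such that
$$\Big(\frac{1}{t} + \varepsilon\Big) \|Lh\|_{\mathbb{R}^N} \le \|Df(x)h\|_{\mathbb{R}^M} \le (t - \varepsilon) \|Lh\|_{\mathbb{R}^N} \quad \forall\, h \in \mathbb{R}^N \tag{1.47}$$
and
$$\|f(y) - f(x) - Df(x)(y - x)\|_{\mathbb{R}^M} \le \varepsilon \|L(y - x)\|_{\mathbb{R}^N} \quad \forall\, y \in B_{2/k}(x). \tag{1.48}$$
Since $x \mapsto Df(x)$ is a Borel function, we see that $E(u, L, k)$ is a Borel subset of $\mathbb{R}^N$. From (1.47) and (1.48), it follows that
$$\frac{1}{t}\|L(y - x)\|_{\mathbb{R}^N} \le \|f(y) - f(x)\|_{\mathbb{R}^M} \le t \|L(y - x)\|_{\mathbb{R}^N} \quad \forall\, x \in E(u, L, k),\ y \in B_{2/k}(x). \tag{1.49}$$

Claim 1. If $x \in E(u, L, k)$, then
$$\Big(\frac{1}{t} + \varepsilon\Big)^N |\det L| \le Jf(x) \le (t - \varepsilon)^N |\det L|.$$

Let $Df(x) = U \circ S$ be a polar decomposition (see Theorem 1.5.14(a)). According to Definition 1.5.16(c),
$$Jf(x) = \operatorname{jac} Df(x) = |\det S|.$$
From (1.47), we have
$$\Big(\frac{1}{t} + \varepsilon\Big) \|Lh\|_{\mathbb{R}^N} \le \|(U \circ S)h\|_{\mathbb{R}^M} = \|Sh\|_{\mathbb{R}^N} \le (t - \varepsilon) \|Lh\|_{\mathbb{R}^N} \quad \forall\, h \in \mathbb{R}^N.$$
Since $L \in G$ is invertible, we have
$$\Big(\frac{1}{t} + \varepsilon\Big) \|h\|_{\mathbb{R}^N} \le \|(S \circ L^{-1})h\|_{\mathbb{R}^N} \le (t - \varepsilon) \|h\|_{\mathbb{R}^N} \quad \forall\, h \in \mathbb{R}^N,$$
so
$$(S \circ L^{-1})\big(B_1(0)\big) \subseteq B_{t - \varepsilon}(0),$$
thus
$$|\det(S \circ L^{-1})|\, a(N) \le \lambda^N\big(B_{t - \varepsilon}(0)\big) = a(N)(t - \varepsilon)^N,$$
where $a(N) \stackrel{df}{=} \dfrac{\pi^{N/2}}{\left(\frac{N}{2}\right)!}$ is the volume of the unit ball in $\mathbb{R}^N$. Finally
$$|\det S| \le (t - \varepsilon)^N |\det L|.$$
Similarly, we prove the other inequality of the claim. So Claim 1 is proved.

Let $\{E_i\}_{i \ge 1}$ be an enumeration of the countable set
$$\big\{E(u, L, k) : u \in E,\ L \in G,\ k \ge 1\big\}.$$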

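(a) Let $x \in C$ and let $Df(x) = U \circ S$ (by Theorem 1.5.14(a)). Select $L \in G$, such that
$$\operatorname{Lip}(L \circ S^{-1}) = \|L \circ S^{-1}\|_{\mathcal{L}} \le \Big(\frac{1}{t} + \varepsilon\Big)^{-1} \quad \text{and} \quad \operatorname{Lip}(S \circ L^{-1}) = \|S \circ L^{-1}\|_{\mathcal{L}} \le t - \varepsilon.$$
Note that because $x \in C$, we have that $S$ is invertible. Also select $k \ge 1$ and $u \in E$, such that
$$\|x - u\|_{\mathbb{R}^N} < \frac{1}{k}$$
and
$$\|f(y) - f(x) - Df(x)(y - x)\|_{\mathbb{R}^M} \le \frac{\varepsilon}{\operatorname{Lip}(L^{-1})} \|y - x\|_{\mathbb{R}^N} \le \varepsilon \|L(y - x)\|_{\mathbb{R}^N} \quad \forall\, y \in B_{2/k}(x).$$
We infer that $x \in E(u, L, k)$ and so $x \in E_i$ for some $i \ge 1$. Because $x \in C$ was arbitrary, we have proved statement (a).

(b) Choose $E_i$ from the countable collection $\{E_i\}_{i \ge 1}$. We have $E_i = E(u, L, k)$ with some $u \in E$, $L \in G$ and $k \ge 1$. Let us set $L_i = L$. From (1.49), we have
$$\frac{1}{t}\|L_i(y - x)\|_{\mathbb{R}^N} \le \|f(y) - f(x)\|_{\mathbb{R}^M} \le t \|L_i(y - x)\|_{\mathbb{R}^N} \quad \forall\, y \in B_{2/k}(x).$$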


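Since $E_i \subseteq B_{1/k}(u) \subseteq B_{2/k}(x)$ for every $x \in E_i$, we have that
$$\frac{1}{t}\|L_i(y - x)\|_{\mathbb{R}^N} \le \|f(y) - f(x)\|_{\mathbb{R}^M} \le t \|L_i(y - x)\|_{\mathbb{R}^N} \quad \forall\, x, y \in E_i, \tag{1.50}$$
so $f|_{E_i}$ is injective.

(c) From (1.50), it follows that
$$\operatorname{Lip}\big((f|_{E_i}) \circ L_i^{-1}\big) \le t \quad \text{and} \quad \operatorname{Lip}\big(L_i \circ (f|_{E_i})^{-1}\big) \le t.$$
Moreover, from Claim 1 and letting $\varepsilon \searrow 0$, we obtain
$$\frac{|\det L_i|}{t^N} \le Jf|_{E_i} \le t^N |\det L_i|.$$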

Now we are ready for the area formula.

THEOREM 1.5.21 (Area Formula) If $f\colon \mathbb{R}^N \to \mathbb{R}^M$ is a Lipschitz continuous function, $N \le M$ and $A \subseteq \mathbb{R}^N$ is a Lebesgue measurable set, then
$$\int_A Jf(x)\, d\lambda^N(x) = \int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y).$$

PROOF By virtue of Theorem 1.5.8, without any loss of generality we may assume that $Df(x)$ (and so $Jf(x)$ too) exists for all $x \in A$. Also as before we may suppose that $\lambda^N(A) < +\infty$.

Case 1. $A \subseteq \{Jf > 0\}$.

In this case we may use Lemma 1.5.20 and produce a sequence $\{E_i\}_{i \ge 1}$ of Borel subsets of $\mathbb{R}^N$ which satisfy the postulates of Lemma 1.5.20. We may additionally assume that the sets $\{E_i\}_{i \ge 1}$ are disjoint. Let $\mathcal{F}_k$ be the following family of $N$-cubes
$$\mathcal{F}_k \stackrel{df}{=} \Big\{Q \subseteq \mathbb{R}^N : Q = \prod_{j=1}^{N} \Big(\frac{c_j}{k}, \frac{c_j + 1}{k}\Big],\ c_j \text{ are integers},\ j \in \{1,\dots,N\}\Big\}$$
(compare with the proof of Lemma 1.5.19(b)). We set
$$F_{i,n}^k \stackrel{df}{=} E_i \cap Q_n^k \cap A \quad \text{with } Q_n^k \in \mathcal{F}_k, \quad \forall\, i \ge 1,\ n \ge 1.$$
Evidently the sets $F_{i,n}^k$ are disjoint and
$$A = \bigcup_{i,n=1}^{\infty} F_{i,n}^k \quad \forall\, k \ge 1.$$
First we show that
$$\lim_{k \to +\infty} \sum_{i,n=1}^{\infty} \mu^{(N)}\big(f(F_{i,n}^k)\big) = \int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y). \tag{1.51}$$
To this end, we introduce
$$h_k \stackrel{df}{=} \sum_{i,n=1}^{\infty} \chi_{f(F_{i,n}^k)} \quad \forall\, k \ge 1.$$

Therefore $h_k(y)$ is the number of sets from the sequence $\{F_{i,n}^k\}_{i,n \ge 1}$ which intersect $f^{-1}(\{y\})$. Note that
$$h_k(y) \nearrow \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big) \quad \text{as } k \to +\infty.$$
Then (1.51) follows from the monotone convergence theorem (see Theorem A.2.10). Let $t > 1$. Because of Lemma 1.5.20 and Proposition 1.3.25, we have
$$\mu^{(N)}\big(f(F_{i,n}^k)\big) = \mu^{(N)}\big(\big((f|_{E_i}) \circ L_i^{-1}\big) \circ L_i(F_{i,n}^k)\big) \le t^N \lambda^N\big(L_i(F_{i,n}^k)\big) \tag{1.52}$$
and
$$\lambda^N\big(L_i(F_{i,n}^k)\big) = \mu^{(N)}\big(\big(L_i \circ (f|_{E_i})^{-1}\big) \circ f(F_{i,n}^k)\big) \le t^N \mu^{(N)}\big(f(F_{i,n}^k)\big). \tag{1.53}$$
So, using (1.52), (1.53) and Lemmas 1.5.18 and 1.5.20(c), it follows that
$$\frac{1}{t^{2N}} \mu^{(N)}\big(f(F_{i,n}^k)\big) \le \frac{1}{t^N} \lambda^N\big(L_i(F_{i,n}^k)\big) = \frac{1}{t^N} |\det L_i|\, \lambda^N(F_{i,n}^k) \le \int_{F_{i,n}^k} Jf(x)\, d\lambda^N(x) \le t^N |\det L_i|\, \lambda^N(F_{i,n}^k) = t^N \lambda^N\big(L_i(F_{i,n}^k)\big) \le t^{2N} \mu^{(N)}\big(f(F_{i,n}^k)\big).$$
We take the sum for the parameters $i, n \ge 1$. Recalling that the sets $F_{i,n}^k$ are disjoint and since $f|_{E_i}$ is injective (see Lemma 1.5.20(b)), we obtain
$$\frac{1}{t^{2N}} \sum_{i,n=1}^{\infty} \mu^{(N)}\big(f(F_{i,n}^k)\big) \le \int_A Jf(x)\, d\lambda^N(x) \le t^{2N} \sum_{i,n=1}^{\infty} \mu^{(N)}\big(f(F_{i,n}^k)\big).$$


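Let $k \to +\infty$ and use (1.51) to write that
$$\frac{1}{t^{2N}} \int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y) \le \int_A Jf(x)\, d\lambda^N(x) \le t^{2N} \int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y).$$
Since $t > 1$ was arbitrary, we let $t \searrow 1$ and obtain the "area formula" for the case when $A \subseteq \{Jf > 0\}$.

Case 2. $A \subseteq \{Jf = 0\}$.

Let $\varepsilon > 0$. We write $f = p \circ g$, where
$$g\colon \mathbb{R}^N \to \mathbb{R}^M \times \mathbb{R}^N \quad \text{and} \quad p\colon \mathbb{R}^M \times \mathbb{R}^N \to \mathbb{R}^M$$
are defined by
$$g(x) \stackrel{df}{=} \big(f(x), \varepsilon x\big) \quad \text{and} \quad p(y, z) \stackrel{df}{=} y \quad \forall\, x, z \in \mathbb{R}^N,\ y \in \mathbb{R}^M.$$
We show that there exists $\xi > 0$, such that
$$0 < Jg(x) \le \xi \varepsilon \quad \forall\, x \in A. \tag{1.54}$$
Note that
$$Dg(x) = \begin{pmatrix} Df(x) \\ \varepsilon I_N \end{pmatrix}_{(M+N) \times N},$$
where $I_N$ is the $N \times N$-identity matrix. Then by virtue of the Binet-Cauchy formula (see Remark 1.5.17), we have that
$$Jg(x)^2 = \text{"sum of squares of } (N \times N)\text{-subdeterminants of } Dg(x)\text{"} \ge \varepsilon^{2N} > 0.$$
Moreover, since $\|Df(x)\|_{\mathcal{L}} \le \operatorname{Lip}(f)$ for $\lambda^N$-a.a. $x \in \mathbb{R}^N$, once again from the Binet-Cauchy formula, we have that
$$Jg(x)^2 = Jf(x)^2 + \Big\{\text{sum of squares of terms each involving at least one } \varepsilon\Big\} \le \xi^2 \varepsilon^2,$$
for some $\xi > 0$ and all $x \in A$.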


This shows that (1.54) is true. Then, using Proposition 1.3.25, Case 1, (1.54) and the fact that $\|p\|_{\mathcal{L}} = 1$, we have
$$\mu^{(N)}\big(f(A)\big) \le \mu^{(N)}\big(g(A)\big) \le \int_{\mathbb{R}^{N+M}} \mu^{(0)}\big(A \cap g^{-1}(\{(y, z)\})\big)\, d\mu^{(N)}(y, z) = \int_A Jg(x)\, d\lambda^N(x) \le \xi \varepsilon \lambda^N(A).$$
Let $\varepsilon \searrow 0$ to conclude that
$$\mu^{(N)}\big(f(A)\big) = 0.$$
Note that
$$\operatorname{supp} \mu^{(0)}\big(A \cap f^{-1}(\{\cdot\})\big) \subseteq f(A).$$
Hence
$$\int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y) = \int_A Jf(x)\, d\lambda^N(x) = 0.$$
This proves Case 2. Finally for the general case, we write $A = A_0 \cup A_1$ with
$$A_0 \subseteq \{Jf = 0\} \quad \text{and} \quad A_1 \subseteq \{Jf > 0\}$$
and apply the result on each set separately.

REMARK 1.5.22 The function
$$y \mapsto \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)$$
on $\mathbb{R}^M$ is called the multiplicity function. Also note that from Theorem 1.5.21, we infer that $f^{-1}(\{y\})$ is at most countable for $\mu^{(N)}$-almost all $y \in \mathbb{R}^M$.

THEOREM 1.5.23 (Change of Variables Formula I) If $f\colon \mathbb{R}^N \to \mathbb{R}^M$ is a Lipschitz continuous function, $N \le M$ and $g \in L^1(\mathbb{R}^N)$, then
$$\int_{\mathbb{R}^N} g(x) Jf(x)\, d\lambda^N(x) = \int_{\mathbb{R}^M} \Big[\sum_{x \in f^{-1}(\{y\})} g(x)\Big]\, d\mu^{(N)}(y).$$


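PROOF First assume that $g \ge 0$. Let
$$A_1 \stackrel{df}{=} \big\{x \in \mathbb{R}^N : g(x) \ge 1\big\}$$
and inductively define
$$A_n \stackrel{df}{=} \Big\{x \in \mathbb{R}^N : g(x) \ge \frac{1}{n} + \sum_{i=1}^{n-1} \frac{1}{i} \chi_{A_i}(x)\Big\} \quad \forall\, n \ge 2.$$
Then
$$g \ge \sum_{n=1}^{\infty} \frac{1}{n} \chi_{A_n}.$$
If $g(x) = +\infty$, we see that $x \in A_n$ for all $n \ge 1$. On the other hand, if $0 \le g(x) < +\infty$, then $x \notin A_n$ for infinitely many $n \ge 1$. So for infinitely many $n$, we have
$$0 \le g(x) - \sum_{i=1}^{n-1} \frac{1}{i} \chi_{A_i}(x) \le \frac{1}{n}$$
and so we conclude that
$$g = \sum_{n=1}^{\infty} \frac{1}{n} \chi_{A_n}.$$
From the monotone convergence theorem (see Theorem A.2.10) and Theorem 1.5.21, we have
$$\begin{aligned}
\int_{\mathbb{R}^N} g(x) Jf(x)\, d\lambda^N(x) &= \sum_{n=1}^{\infty} \frac{1}{n} \int_{\mathbb{R}^N} \chi_{A_n}(x) Jf(x)\, d\lambda^N(x) \\
&= \sum_{n=1}^{\infty} \frac{1}{n} \int_{A_n} Jf(x)\, d\lambda^N(x) = \sum_{n=1}^{\infty} \frac{1}{n} \int_{\mathbb{R}^M} \mu^{(0)}\big(A_n \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y) \\
&= \int_{\mathbb{R}^M} \Big(\sum_{n=1}^{\infty} \frac{1}{n} \sum_{x \in f^{-1}(\{y\})} \chi_{A_n}(x)\Big)\, d\mu^{(N)}(y) \\
&= \int_{\mathbb{R}^M} \Big(\sum_{x \in f^{-1}(\{y\})} \sum_{n=1}^{\infty} \frac{1}{n} \chi_{A_n}(x)\Big)\, d\mu^{(N)}(y) \\
&= \int_{\mathbb{R}^M} \Big(\sum_{x \in f^{-1}(\{y\})} g(x)\Big)\, d\mu^{(N)}(y).
\end{aligned}$$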


In the general case let $g = g^+ - g^-$ and apply the first part of the proof on each component function $g^+ \ge 0$, $g^- \ge 0$.

EXAMPLE 1.5.24

(a) Let $N = 1$, $M \ge 1$. Suppose that $f\colon \mathbb{R} \to \mathbb{R}^M$ is Lipschitz continuous and injective. We have
$$f = (f_k)_{k=1}^M \quad \text{and} \quad Df = (f_k')_{k=1}^M, \quad \text{with } {}' = \frac{d}{dt}.$$
Let $-\infty < a < b < +\infty$ and $C = f\big([a, b]\big)$ (the curve defined by $f$). Using Theorem 1.5.21, we have
$$\mu^{(1)}(C) = \int_a^b |f'(t)|\, d\lambda^1(t),$$
the length of $C$.

(b) Let $N \ge 1$, $M = N + 1$. Suppose that $g\colon \mathbb{R}^N \to \mathbb{R}$ is Lipschitz continuous and let $f\colon \mathbb{R}^N \to \mathbb{R}^{N+1}$ be the Lipschitz continuous function defined by
$$f(x) \stackrel{df}{=} \big(x, g(x)\big) \quad \forall\, x \in \mathbb{R}^N.$$
We have
$$Df(x) = \begin{pmatrix} I_N \\ \nabla g(x) \end{pmatrix}_{(N+1) \times N} \quad \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N,$$
where $I_N$ is the $N \times N$-identity matrix. Therefore
$$(Jf)^2 = \Big\{\text{sum of squares of } N \times N\text{-subdeterminants}\Big\} = 1 + \|\nabla g\|_{\mathbb{R}^N}^2.$$
Then, if
$$G \stackrel{df}{=} \big\{(x, y) \in \mathbb{R}^N \times \mathbb{R} : y = g(x)\big\}$$
(the graph of $g$), from Theorem 1.5.21, we have
$$\mu^{(N)}(G) = \int_{\mathbb{R}^N} \sqrt{1 + \|\nabla g(x)\|_{\mathbb{R}^N}^2}\, d\lambda^N(x),$$
the surface area of $G$.

(c) Let $N \ge 1$, $M = N + 1$. Suppose that $f\colon \mathbb{R}^N \to \mathbb{R}^{N+1}$ is Lipschitz continuous and injective. Then
$$f = (f_k)_{k=1}^{N+1} \quad \text{and} \quad Df = \Big(\frac{\partial f_k}{\partial x_i}\Big)_{\substack{k = 1,\dots,N+1 \\ i = 1,\dots,N}}.$$
So
$$(Jf)^2 = \sum_{k=1}^{N+1} \Big[\frac{\partial(f_1,\dots,f_{k-1},f_{k+1},\dots,f_{N+1})}{\partial(x_1,\dots,x_N)}\Big]^2,$$
the sum of squares of $N \times N$-subdeterminants. Therefore, if $U \subseteq \mathbb{R}^N$ is any open set and $A = f(U) \subseteq \mathbb{R}^{N+1}$, then by Theorem 1.5.21, we have
$$\mu^{(N)}(A) = \int_U \Big(\sum_{k=1}^{N+1} \Big|\frac{\partial(f_1,\dots,f_{k-1},f_{k+1},\dots,f_{N+1})}{\partial(x_1,\dots,x_N)}\Big|^2\Big)^{\frac{1}{2}}\, d\lambda^N(x).$$
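The area formula can be illustrated numerically in the simplest cases. The sketch below is an illustration under stated assumptions, not part of the text: it uses a midpoint rule (helper `midpoint`), a helix as the injective curve of part (a), and the non-injective map $f(x) = x^2$ on $A = [-1, 1]$ with $N = M = 1$, whose multiplicity function equals $2$ on $(0, 1)$.

```python
import math

def midpoint(fn, a, b, n=20000):
    """Midpoint-rule approximation of the integral of fn over [a, b]."""
    h = (b - a) / n
    return h * sum(fn(a + (i + 0.5) * h) for i in range(n))

# Example (a): helix f(t) = (cos t, sin t, t) on [0, 2*pi]; |f'(t)| = sqrt(2).
def helix(t):
    return (math.cos(t), math.sin(t), t)

def speed(t):
    return math.sqrt(math.sin(t) ** 2 + math.cos(t) ** 2 + 1.0)

def polygonal_length(curve, a, b, n=20000):
    """Length of the inscribed polygon with n equal parameter steps."""
    pts = [curve(a + i * (b - a) / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

length_integral = midpoint(speed, 0.0, 2.0 * math.pi)
length_polygon = polygonal_length(helix, 0.0, 2.0 * math.pi)

# Multiplicity: f(x) = x^2 on A = [-1, 1], Jf(x) = |f'(x)| = 2|x|.
def multiplicity(y):
    """Number of points x in [-1, 1] with x^2 = y."""
    if y < 0.0 or y > 1.0:
        return 0
    return 1 if y == 0.0 else 2

lhs = midpoint(lambda x: 2.0 * abs(x), -1.0, 1.0)   # integral of Jf over A
rhs = midpoint(multiplicity, -0.5, 1.5)             # integral of the multiplicity
```

Both `lhs` and `rhs` approximate $2$, and both length computations approximate $2\sqrt{2}\,\pi$.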

In Theorem 1.5.21, we proved that if $f\colon \mathbb{R}^N \to \mathbb{R}^M$ is Lipschitz continuous, $N \le M$ and $A \subseteq \mathbb{R}^N$ is Lebesgue measurable, then the Jacobian integral
$$\int_A Jf(x)\, d\lambda^N(x)$$
equals the $N$-dimensional Hausdorff area of $f|_A$, given by
$$\int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y).$$
If $N \ge M$, then the Jacobian integral equals the "coarea" of $f|_A$, defined by
$$\int_{\mathbb{R}^M} \mu^{(N-M)}\big(A \cap f^{-1}(\{y\})\big)\, d\lambda^M(y).$$
This result is known as the coarea formula.

THEOREM 1.5.25 (Coarea Formula) If $f\colon \mathbb{R}^N \to \mathbb{R}^M$ is Lipschitz continuous, $M \le N$ and $A \subseteq \mathbb{R}^N$ is Lebesgue measurable, then
$$\int_A Jf(x)\, d\lambda^N(x) = \int_{\mathbb{R}^M} \mu^{(N-M)}\big(A \cap f^{-1}(\{y\})\big)\, d\lambda^M(y).$$


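As was the case with the area formula, the coarea formula leads to a change of variables formula.

THEOREM 1.5.26 (Change of Variables Formula II) If $f\colon \mathbb{R}^N \to \mathbb{R}^M$ is a Lipschitz continuous function, $N \ge M$ and $g \in L^1(\mathbb{R}^N)$, then
$$\int_{\mathbb{R}^N} g(x) Jf(x)\, d\lambda^N(x) = \int_{\mathbb{R}^M} \Big[\int_{f^{-1}(\{y\})} g(x)\, d\mu^{(N-M)}(x)\Big]\, d\lambda^M(y).$$

PROOF First assume that $g \ge 0$. As in the proof of Theorem 1.5.23, we can find Lebesgue measurable sets $\{A_n\}_{n \ge 1} \subseteq \mathbb{R}^N$, such that
$$g = \sum_{n=1}^{\infty} \frac{1}{n} \chi_{A_n}.$$
Invoking the monotone convergence theorem (see Theorem A.2.10) and Theorem 1.5.25, we have
$$\begin{aligned}
\int_{\mathbb{R}^N} g(x) Jf(x)\, d\lambda^N(x) &= \sum_{n=1}^{\infty} \frac{1}{n} \int_{A_n} Jf(x)\, d\lambda^N(x) \\
&= \sum_{n=1}^{\infty} \frac{1}{n} \int_{\mathbb{R}^M} \mu^{(N-M)}\big(A_n \cap f^{-1}(\{y\})\big)\, d\lambda^M(y) \\
&= \int_{\mathbb{R}^M} \Big[\sum_{n=1}^{\infty} \frac{1}{n} \mu^{(N-M)}\big(A_n \cap f^{-1}(\{y\})\big)\Big]\, d\lambda^M(y) \\
&= \int_{\mathbb{R}^M} \Big[\int_{f^{-1}(\{y\})} g(x)\, d\mu^{(N-M)}(x)\Big]\, d\lambda^M(y).
\end{aligned}$$
For the general case let $g = g^+ - g^-$ and apply the first part to each component function $g^+ \ge 0$ and $g^- \ge 0$.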

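EXAMPLE 1.5.27

(a) Let $N \ge 1$, $M = 1$. Suppose that $f\colon \mathbb{R}^N \to \mathbb{R}$ is defined by $f(x) = \|x\|_{\mathbb{R}^N}$ and $g \in L^1(\mathbb{R}^N)$. Then, we have
$$Df(x) = \frac{x}{\|x\|_{\mathbb{R}^N}} \quad \text{and} \quad Jf(x) = 1 \quad \forall\, x \in \mathbb{R}^N \setminus \{0\}.$$
From Theorem 1.5.26, we have
$$\int_{\mathbb{R}^N} g(x)\, d\lambda^N(x) = \int_0^{+\infty} \Big[\int_{\partial B_r(0)} g(x)\, d\mu^{(N-1)}(x)\Big]\, d\lambda^1(r) = \int_0^{+\infty} r^{N-1} \Big[\int_{\partial B_1(0)} g(rx)\, d\mu^{(N-1)}(x)\Big]\, d\lambda^1(r). \tag{1.55}$$
In particular if $g = \chi_{B_1(0)}$, from (1.55), it follows that
$$a(N) = \frac{1}{N} \mu^{(N-1)}\big(\partial B_1(0)\big),$$
where $a(N) \stackrel{df}{=} \dfrac{\pi^{N/2}}{\left(\frac{N}{2}\right)!}$ is the volume of the unit ball in $\mathbb{R}^N$.

(b) Let $N \ge 1$ and $M = 1$. Suppose that $f\colon \mathbb{R}^N \to \mathbb{R}$ is a Lipschitz continuous function. Then $Jf = \|Df\|_{\mathbb{R}^N}$ and so from Theorem 1.5.25, we have that
$$\int_{\mathbb{R}^N} \|Df(x)\|_{\mathbb{R}^N}\, d\lambda^N(x) = \int_{-\infty}^{+\infty} \mu^{(N-1)}\big(\{f = t\}\big)\, d\lambda^1(t).$$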
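The polar-coordinates identity (1.55) and the relation between $a(N)$ and the surface measure of the unit sphere can be checked numerically. The sketch below makes two assumptions that are not in the text: it interprets $\left(\frac{N}{2}\right)!$ as $\Gamma\left(\frac{N}{2} + 1\right)$, and it uses the standard closed form $2\pi^{N/2}/\Gamma(N/2)$ for $\mu^{(N-1)}(\partial B_1(0))$. The test function is the radial Gaussian $g(x) = e^{-\|x\|^2}$, whose integral over $\mathbb{R}^N$ is $\pi^{N/2}$.

```python
import math

def unit_ball_volume(N):
    """a(N) = pi^(N/2) / Gamma(N/2 + 1)."""
    return math.pi ** (N / 2) / math.gamma(N / 2 + 1)

def unit_sphere_area(N):
    """Surface measure of the unit sphere in R^N (standard formula, assumed here)."""
    return 2.0 * math.pi ** (N / 2) / math.gamma(N / 2)

def radial_integral(profile, N, r_max=12.0, n=20000):
    """Right-hand side of (1.55) for g(x) = profile(|x|), truncated to r <= r_max."""
    h = r_max / n
    total = sum(((i + 0.5) * h) ** (N - 1) * profile((i + 0.5) * h) for i in range(n))
    return unit_sphere_area(N) * h * total

def gaussian(r):
    return math.exp(-r * r)

for N in (2, 3):
    print(N, radial_integral(gaussian, N), math.pi ** (N / 2))
```

The printed pairs agree to roughly six decimal places, and `unit_sphere_area(N) / N` reproduces `unit_ball_volume(N)`.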

We conclude this section with some additional useful results involving the multiplicity function
$$y \mapsto \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)$$
of a Lipschitz continuous function $f$.


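PROPOSITION 1.5.28 If $X$, $Y$ are separable metric spaces, $\xi$ is an outer measure on $Y$, $f\colon X \to Y$ is a map such that for every Borel set $B \subseteq X$, the set $f(B)$ is $\xi$-measurable, $\vartheta\colon 2^X \to \overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$ is the outer measure on $X$, defined by
$$\vartheta(A) \stackrel{df}{=} \xi\big(f(A)\big) \quad \forall\, A \subseteq X$$
and $\widehat{\vartheta}$ is the Borel measure resulting from $\vartheta$ by the Carathéodory construction, then for every Borel set $B \subseteq X$, we have
$$\widehat{\vartheta}(B) = \int_Y \mu^{(0)}\big(B \cap f^{-1}(\{y\})\big)\, d\xi(y).$$

PROOF Let $\{\mathcal{B}_k\}_{k \ge 1}$ be a sequence of Borel partitions of $B$, such that every member of $\mathcal{B}_k$ is the union of some subcollection in $\mathcal{B}_{k+1}$ and
$$\sup_{A \in \mathcal{B}_k} \delta(A) \longrightarrow 0 \quad \text{as } k \to +\infty,$$
i.e., $\mathcal{B} = \bigcup_{k=1}^{\infty} \mathcal{B}_k$ is a Vitali cover of $B$. Note that if
$$h_k(y) \stackrel{df}{=} \sum_{A \in \mathcal{B}_k} \chi_{f(A)}(y) \quad \forall\, k \ge 1,\ y \in Y,$$
then
$$h_k(y) \nearrow \mu^{(0)}\big(B \cap f^{-1}(\{y\})\big) \quad \text{as } k \to +\infty.$$
So by the monotone convergence theorem (see Theorem A.2.10), we have that
$$\widehat{\vartheta}(B) = \lim_{k \to +\infty} \sum_{A \in \mathcal{B}_k} \xi\big(f(A)\big) = \lim_{k \to +\infty} \int_Y \sum_{A \in \mathcal{B}_k} \chi_{f(A)}(y)\, d\xi(y) = \int_Y \mu^{(0)}\big(B \cap f^{-1}(\{y\})\big)\, d\xi(y).$$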

REMARK 1.5.29 Recall that if $X$ is a separable metric space, $Y$ is a Hausdorff topological space, $f\colon X \to Y$ is a continuous map and $\xi$ is a Borel outer measure on $Y$, then for every Borel set $B \subseteq X$, the set $f(B)$ is $\xi$-measurable. This fact is essentially the starting point for the theory of Souslin sets (see Definition A.2.29(b)).


PROPOSITION 1.5.30 If $X$ is a Polish space (see Definition A.2.29(a)), $Y$ is a separable metric space, $f\colon X \to Y$ is a Lipschitz continuous function, $0 \le k < +\infty$ and $A \subseteq X$ is a Borel set, then
$$\int_Y \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(k)}(y) \le \big(\operatorname{Lip}(f)\big)^k \mu^{(k)}(A).$$

PROOF From Proposition 1.3.25, we know that
$$\vartheta(A) \stackrel{df}{=} \mu^{(k)}\big(f(A)\big) \le \big(\operatorname{Lip}(f)\big)^k \mu^{(k)}(A) \quad \forall\, A \subseteq X.$$
Then apply Proposition 1.5.28 on the outer measure $\vartheta$.

PROPOSITION 1.5.31 If $X$ is a separable metric space, then for every connected set $C \subseteq X$, we have $\delta(C) \le \mu^{(1)}(C)$.

PROOF Clearly we may assume that
$$\mu^{(1)}(C) < +\infty$$
or otherwise the inequality is obvious. Since $\mu^{(1)}$ is a Borel measure, we can find a Borel set $B \supseteq C$, such that $\mu^{(1)}(B) = \mu^{(1)}(C)$. Let $u, v \in C$ and let $f\colon X \to \mathbb{R}$ be defined by
$$f(x) \stackrel{df}{=} d_X(x, u) \quad \forall\, x \in X,$$
where $d_X$ is the metric in $X$. Since $f$ is Lipschitz continuous with $\operatorname{Lip}(f) = 1$ and $f(C)$ is an interval containing $f(u) = 0$ and $f(v) = d_X(v, u)$, from Proposition 1.5.30, we have that
$$\mu^{(1)}(C) = \mu^{(1)}(B) \ge \int_{\mathbb{R}} \mu^{(0)}\big(B \cap f^{-1}(\{y\})\big)\, d\mu^{(1)}(y) \ge \mu^{(1)}\big(f(B)\big) \ge d_X(v, u).$$
Since $u, v \in C$ were arbitrary, we conclude that $\delta(C) \le \mu^{(1)}(C)$.
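Proposition 1.5.31 says that a connected set cannot have diameter larger than its one-dimensional Hausdorff measure. A small sketch for polygonal arcs in the plane follows, assuming the arc does not retrace itself so that its $\mu^{(1)}$-measure is the sum of the segment lengths; the diameter of a polygonal arc is attained at vertices. The helper names are illustrative.

```python
import math
from itertools import combinations

def polyline_length(points):
    """Sum of the segment lengths of the polygonal arc through the given vertices."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def diameter(points):
    """Largest distance between two vertices (equals the diameter of the arc)."""
    return max(math.dist(p, q) for p, q in combinations(points, 2))

zigzag = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0), (3.0, 2.0), (4.0, 0.0)]
segment = [(0.0, 0.0), (3.0, 4.0)]

print(diameter(zigzag), polyline_length(zigzag))
print(diameter(segment), polyline_length(segment))
```

For the zigzag the diameter is $4$ while the length is $4\sqrt{5}$; for a straight segment the two quantities coincide, so the inequality cannot be improved.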

1.6 Capacity

The notion of capacity plays a crucial role in the study of local properties of Sobolev functions. In a sense it takes the place of measure and it is used to characterize the smallness of subsets in $\mathbb{R}^N$. For this reason, it is indispensable in the study of the continuity properties of Sobolev functions. We shall deal with these issues in Section 2.7. Moreover, the concept of capacity enters the study of obstacle problems. In this section we develop the theory of the so-called "$p$-capacity" (variational capacity). The development of the theory of the $p$-capacity requires knowledge of the definition of Sobolev spaces and some results from their theory. To make this section self-contained, we state here the necessary material from the theory of Sobolev spaces, but we postpone the proofs until Section 2.4, where we conduct a more systematic study of Sobolev spaces.

DEFINITION 1.6.1 Let $U \subseteq \mathbb{R}^N$ be a nonempty open set. By $z = (z_k)_{k=1}^N$, we denote a generic point of $U$.

(a) Suppose that $f \in L^1_{loc}(U)$. We say that $g_k \in L^1_{loc}(U)$ is the distributional (or weak) partial derivative of $f$ with respect to $z_k$ (with $k \in \{1,\dots,N\}$) in $U$, if
$$\int_U f \frac{\partial \varphi}{\partial z_k}\, dz = -\int_U g_k \varphi\, dz \quad \forall\, \varphi \in C_c^{\infty}(U)$$
(here $C_c^{\infty}(U)$ is the space of all $C^{\infty}(U)$-functions with compact supports, i.e., the space of test functions). We write
$$g_k = \frac{\partial f}{\partial z_k} = D_k f \quad \forall\, k \in \{1,\dots,N\}.$$
If all of the distributional (weak) partial derivatives $D_k f$ exist for $k = 1,\dots,N$, then $Df \stackrel{df}{=} (D_k f)_{k=1}^N$ is the distributional (weak) derivative of $f$.

(b) Let $p \in [1, +\infty]$. We define the Sobolev space $W^{1,p}(U)$, by
$$W^{1,p}(U) \stackrel{df}{=} \big\{f \in L^p(U) : Df \in L^p\big(U; \mathbb{R}^N\big)\big\}.$$
Also we define
$$W^{1,p}_{loc}(U) \stackrel{df}{=} \big\{f\colon U \to \mathbb{R} : f|_V \in W^{1,p}(V) \text{ for all } V \subset\subset U\big\},$$
where $V \subset\subset U$ means that $V$ is a bounded open subset of $U$ such that $\overline{V} \subseteq U$. The elements of $W^{1,p}_{loc}(U)$ are called Sobolev functions.

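(c) Let $p \in [1, N)$. We define the critical Sobolev exponent
$$p^* \stackrel{df}{=} \frac{Np}{N - p}$$
and the space
$$K^p \stackrel{df}{=} \big\{f \in L^{p^*}(\mathbb{R}^N) : f \ge 0,\ Df \in L^p\big(\mathbb{R}^N; \mathbb{R}^N\big)\big\}.$$

(d) Let $p \in [1, N)$ and $C \subseteq \mathbb{R}^N$. The $p$-capacity of $C$ is defined by
$$\operatorname{cap}_p(C) \stackrel{df}{=} \inf\big\{\|Df\|_p^p : f \in K^p,\ C \subseteq \operatorname{int}\{f \ge 1\}\big\}.$$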

REMARK 1.6.2

(a) Clearly if the distributional (weak) partial derivative $D_k f$ exists, it is uniquely defined modulo a Lebesgue-null set in $\mathbb{R}^N$.

(b) If $f \in W^{1,p}(U)$, we define
$$\|f\|_{1,p} \stackrel{df}{=} \big(\|f\|_p^p + \|Df\|_p^p\big)^{\frac{1}{p}} \quad \forall\, p \in [1, +\infty)$$
and
$$\|f\|_{1,\infty} \stackrel{df}{=} \|f\|_{\infty} + \|Df\|_{\infty}.$$
These are norms in $W^{1,p}(U)$ for $p \in [1, +\infty)$ and $W^{1,\infty}(U)$ respectively. Normed this way the Sobolev spaces are Banach spaces.

(c) Although $p < p^*$, we do not have $L^{p^*}(\mathbb{R}^N) \subseteq L^p(\mathbb{R}^N)$ and so we cannot say that $K^p \subseteq W^{1,p}(\mathbb{R}^N)$.

(d) If $K \subseteq \mathbb{R}^N$ is a compact set, then by using standard regularization (via mollification; see also Definition 2.4.10) of the characteristic function $\chi_K$, we can check that
$$\operatorname{cap}_p(K) = \inf\big\{\|Df\|_p^p : f \in C_c^{\infty}\big(\mathbb{R}^N\big),\ f \ge \chi_K\big\}.$$

(e) Evidently, if $C_1 \subseteq C_2$, then $\operatorname{cap}_p(C_1) \le \operatorname{cap}_p(C_2)$ (monotonicity).

(f) Because $p < p^*$, the elements of $K^p$ are Sobolev functions.


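As we already mentioned, for easy reference, we present four results from the theory of Sobolev spaces, which will be used in the sequel. The proofs of these results will be given in Section 2.4.

PROPOSITION 1.6.3 If $U \subseteq \mathbb{R}^N$ is open, $p \in [1, +\infty)$ and $f \in W^{1,p}(U)$, then we can find a sequence $\{f_n\}_{n \ge 1} \subseteq W^{1,p}(U) \cap C^{\infty}(U)$, such that
$$f_n \longrightarrow f \quad \text{in } W^{1,p}(U).$$

PROPOSITION 1.6.4 Let $U \subseteq \mathbb{R}^N$ be an open set and let $p \in [1, +\infty)$.

(a) If $f, g \in W^{1,p}(U)$, then
$$h_0 \stackrel{df}{=} \min\{f, g\} \in W^{1,p}(U), \qquad h_1 \stackrel{df}{=} \max\{f, g\} \in W^{1,p}(U)$$
and
$$Dh_0(x) = \begin{cases} Df(x) & \text{for } \lambda^N\text{-a.a. } x \in \{f \le g\}, \\ Dg(x) & \text{for } \lambda^N\text{-a.a. } x \in \{f \ge g\}, \end{cases} \qquad Dh_1(x) = \begin{cases} Dg(x) & \text{for } \lambda^N\text{-a.a. } x \in \{f \le g\}, \\ Df(x) & \text{for } \lambda^N\text{-a.a. } x \in \{f \ge g\}. \end{cases}$$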

In particular f + , f − , |f | ∈ W 1,p (U ). (b) If {fn }n>1 ⊆ W 1,p (U ) is a sequence, then df

h = sup fn ∈ W 1,p (U ), n>1

and

° ° df u = sup °Dfn °RN ∈ Lp (U )

° ° °Dfn (z)° N 6 u(z) R

n>1

for λN -a.a. z ∈ U.

PROPOSITION 1.6.5 If $U \subseteq \mathbb{R}^N$ is a bounded open set with a $C^1$-boundary and $p \in [1, +\infty)$, then there exists $E \in \mathcal{L}\big(W^{1,p}(U), W^{1,p}(\mathbb{R}^N)\big)$, such that $E(f)|_U = f$.

REMARK 1.6.6 The function $E(f)$ is called an extension of $f$ on $\mathbb{R}^N$.


Finally we mention two basic inequalities. The first is known as the "Sobolev inequality" (or "Sobolev-Nirenberg-Gagliardo inequality") and the second is known as the "Poincaré-Wirtinger inequality."

THEOREM 1.6.7 (Sobolev-Nirenberg-Gagliardo Inequality) If $p \in [1, N)$, then there exists $C = C(N, p) > 0$, such that
$$\|f\|_{p^*} \le C \|Df\|_p \quad \forall\, f \in W^{1,p}(\mathbb{R}^N).$$

THEOREM 1.6.8 (Poincaré-Wirtinger Inequality) If $U \subseteq \mathbb{R}^N$ is a bounded, connected and open set (i.e., a bounded domain in $\mathbb{R}^N$) with a $C^1$-boundary and $p \in [1, +\infty)$, then there exists $C_0 = C_0(N, p) > 0$, such that
$$\|f - \overline{f}\|_p \le C_0 \|Df\|_p \quad \forall\, f \in W^{1,p}(U),$$
with
$$\overline{f} = \frac{1}{\lambda^N(U)} \int_U f(z)\, dz.$$
If $p < N$, then
$$\|f - \overline{f}\|_{p^*} \le C_0 \|Df\|_p.$$

A Sobolev inequality is also valid for the elements in $K^p$.

PROPOSITION 1.6.9 If $p \in [1, N)$, then there exists $C = C(N, p) > 0$, such that
$$\|f\|_{p^*} \le C \|Df\|_p \quad \forall\, f \in K^p.$$

PROOF First we produce a sequence $\{\varphi_n\}_{n \ge 1} \subseteq C_c^{\infty}\big(\mathbb{R}^N\big)$, such that
$$0 \le \varphi_n \le 1 \quad \forall\, n \ge 1, \qquad \varphi_n(z) \nearrow 1 \quad \text{for a.a. } z \in \mathbb{R}^N, \qquad \varphi_n(z) = 1 \quad \forall\, \|z\|_{\mathbb{R}^N} < n$$
and
$$\sup_{n \ge 1} \|D\varphi_n\|_{\mathbb{R}^N} < +\infty.$$
To this end, let $\varphi \in C_c^{\infty}\big(B_2(0)\big)$, such that
$$0 \le \varphi \le 1 \quad \text{and} \quad \varphi|_{B_1(0)} = 1.$$

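Let us set
$$\varphi_n(z) \stackrel{df}{=} \varphi\Big(\frac{z}{n}\Big) \quad \forall\, z \in \mathbb{R}^N,\ n \ge 1.$$
This is the desired sequence. Note that
$$\varphi_n f \in W^{1,p}(\mathbb{R}^N)$$
(recall that $p < p^*$ and use the product rule). Invoking the Sobolev-Nirenberg-Gagliardo inequality (see Theorem 1.6.7), we can find $C > 0$, such that for all $n \ge 1$, we have
$$\|\varphi_n f\|_{p^*} \le C \|D(\varphi_n f)\|_p \le C \|Df\|_p + C \|f D\varphi_n\|_p;$$
thus by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
$$\|f\|_{p^*} \le C \|Df\|_p + C \liminf_{n \to +\infty} \|f D\varphi_n\|_p. \tag{1.56}$$
Using Hölder's inequality (see Theorem A.2.27) (as $\frac{p}{p^*} + \frac{1}{(p^*/p)'} = 1$), the fact that $\varphi_n|_{B_n(0)} \equiv 1$ and since $p \big(\frac{p^*}{p}\big)' = N$ and $\sup_{n \ge 1} \|D\varphi_n\|_{\mathbb{R}^N} < +\infty$ (so that $\int_{\mathbb{R}^N} \|D\varphi_n(z)\|^N_{\mathbb{R}^N}\, dz = \int_{\mathbb{R}^N} \|D\varphi(z)\|^N_{\mathbb{R}^N}\, dz$ for all $n \ge 1$, by a change of variables), for every $n \ge 1$, we have
$$\begin{aligned}
\|f D\varphi_n\|_p^p &= \int_{\mathbb{R}^N} |f(z) D\varphi_n(z)|^p\, dz = \int_{\{\|z\|_{\mathbb{R}^N} \ge n\}} |f(z) D\varphi_n(z)|^p\, dz \\
&\le \Big(\int_{\{\|z\|_{\mathbb{R}^N} \ge n\}} |f(z)|^{p^*}\, dz\Big)^{\frac{p}{p^*}} \Big(\int_{\mathbb{R}^N} \|D\varphi_n(z)\|_{\mathbb{R}^N}^{p (p^*/p)'}\, dz\Big)^{1 - \frac{p}{p^*}} \\
&\le C_1 \Big(\int_{\{\|z\|_{\mathbb{R}^N} \ge n\}} |f(z)|^{p^*}\, dz\Big)^{\frac{p}{p^*}},
\end{aligned}$$
for some $C_1 > 0$. Since $|f|^{p^*} \in L^1(\mathbb{R}^N)$, we have
$$\lim_{n \to +\infty} \|f D\varphi_n\|_p^p \le C_1 \lim_{n \to +\infty} \Big(\int_{\{\|z\|_{\mathbb{R}^N} \ge n\}} |f(z)|^{p^*}\, dz\Big)^{\frac{p}{p^*}} = 0.$$
Using this in (1.56), we conclude that
$$\|f\|_{p^*} \le C \|Df\|_p.$$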
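The Hölder step in this proof pairs the exponent $q = p^*/p$ with its conjugate $q'$. The short exact-arithmetic sketch below (function names are illustrative) confirms the two exponent identities that make the step work for any admissible pair $(N, p)$: $\frac{p}{p^*} + \frac{1}{q'} = 1$ and $p\,q' = N$.

```python
from fractions import Fraction

def sobolev_exponent(N, p):
    """Critical exponent p* = N p / (N - p) for 1 <= p < N."""
    N, p = Fraction(N), Fraction(p)
    if not (1 <= p < N):
        raise ValueError("need 1 <= p < N")
    return N * p / (N - p)

def conjugate(q):
    """Hoelder conjugate q' of q > 1, defined by 1/q + 1/q' = 1."""
    return q / (q - 1)

def holder_exponents_ok(N, p):
    """Check p/p* + 1/q' = 1 and p * q' = N, where q = p*/p."""
    p = Fraction(p)
    p_star = sobolev_exponent(N, p)
    q_conj = conjugate(p_star / p)
    return p / p_star + 1 / q_conj == 1 and p * q_conj == N

print(sobolev_exponent(3, 2), holder_exponents_ok(3, 2))  # 6 True
```

Because $p\,q' = N$, the second Hölder factor is the $L^N$-norm of $D\varphi_n$, which is unchanged under the rescaling $z \mapsto z/n$.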

We use this inequality to establish that the p-capacity capp is an outer measure on RN .


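THEOREM 1.6.10 If $p \in [1, N)$, then $\operatorname{cap}_p$ is an outer measure on $\mathbb{R}^N$.

PROOF Clearly $\operatorname{cap}_p(\emptyset) = 0$ and $\operatorname{cap}_p$ is monotone (see Remark 1.6.2(e)). So it remains to show that if $\{C_n\}_{n \ge 1}$ is a sequence of subsets of $\mathbb{R}^N$ and $C = \bigcup_{n=1}^{\infty} C_n$, then
$$\operatorname{cap}_p(C) \le \sum_{n=1}^{\infty} \operatorname{cap}_p(C_n)$$
(see Definition 1.1.1). We assume that
$$\sum_{n=1}^{\infty} \operatorname{cap}_p(C_n) < +\infty$$
or otherwise the inequality is obvious. According to Definition 1.6.1(d), for a given $\varepsilon > 0$, we can find $f_n \in K^p$, such that
$$C_n \subseteq \operatorname{int}\{f_n \ge 1\}$$
and
$$\|Df_n\|_p^p \le \operatorname{cap}_p(C_n) + \frac{\varepsilon}{2^n} \quad \forall\, n \ge 1. \tag{1.57}$$
Let $h = \sup_{n \ge 1} f_n$. Evidently $C \subseteq \operatorname{int}\{h \ge 1\}$. Also, using the monotone convergence theorem (see Theorem A.2.10), Proposition 1.6.9 and (1.57), we have
$$\begin{aligned}
\int_{\mathbb{R}^N} h(z)^{p^*}\, dz &= \int_{\mathbb{R}^N} \sup_{n \ge 1} f_n(z)^{p^*}\, dz \le \sum_{n=1}^{\infty} \int_{\mathbb{R}^N} f_n(z)^{p^*}\, dz \\
&\le C \sum_{n=1}^{\infty} \|Df_n\|_p^{p^*} \le C \sum_{n=1}^{\infty} \Big(\operatorname{cap}_p(C_n) + \frac{\varepsilon}{2^n}\Big)^{\frac{p^*}{p}} \\
&\le C_1 \Big[\sum_{n=1}^{\infty} \Big(\operatorname{cap}_p(C_n) + \frac{\varepsilon}{2^n}\Big)\Big]^{\frac{p^*}{p}} < +\infty,
\end{aligned}$$
for some $C_1 > 0$. So $h \in L^{p^*}(\mathbb{R}^N)$ and, as $p < p^*$, we have $h \in L^p_{loc}\big(\mathbb{R}^N\big)$. Also if $u = \sup_{n \ge 1} \|Df_n\|_{\mathbb{R}^N}$, from (1.57), we have
$$\int_{\mathbb{R}^N} |u(z)|^p\, dz \le \sum_{n=1}^{\infty} \int_{\mathbb{R}^N} \|Df_n(z)\|_{\mathbb{R}^N}^p\, dz < +\infty,$$
so $u \in L^p(\mathbb{R}^N)$.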

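By Proposition 1.6.4(b), we have that $h \in W^{1,p}_{loc}(\mathbb{R}^N)$ and $\|Dh(z)\|_{\mathbb{R}^N} \le u(z)$ for almost all $z \in \mathbb{R}^N$. Therefore $Dh \in L^p\big(\mathbb{R}^N; \mathbb{R}^N\big)$ and so $h \in K^p$. By virtue of Definition 1.6.1(d), the monotone convergence theorem (see Theorem A.2.10) and (1.57), we have
$$\operatorname{cap}_p(C) \le \int_{\mathbb{R}^N} \|Dh(z)\|_{\mathbb{R}^N}^p\, dz \le \int_{\mathbb{R}^N} u(z)^p\, dz \le \sum_{n=1}^{\infty} \int_{\mathbb{R}^N} \|Df_n(z)\|_{\mathbb{R}^N}^p\, dz \le \sum_{n=1}^{\infty} \operatorname{cap}_p(C_n) + \varepsilon.$$
Let $\varepsilon \searrow 0$, to conclude that
$$\operatorname{cap}_p(C) \le \sum_{n=1}^{\infty} \operatorname{cap}_p(C_n).$$

In the next theorem, we have collected the basic properties of the $p$-capacity $\operatorname{cap}_p$.

THEOREM 1.6.11 If $p \in [1, N)$ and $A, B \subseteq \mathbb{R}^N$, then

(a) $\operatorname{cap}_p(A) = \inf\big\{\operatorname{cap}_p(U) : A \subseteq U,\ U \text{ is open}\big\}$.

(b) $\operatorname{cap}_p(\xi A) = \xi^{N-p} \operatorname{cap}_p(A)$ for all $\xi > 0$.

(c) $\operatorname{cap}_p\big(L(A)\big) = \operatorname{cap}_p(A)$ for every affine isometry $L\colon \mathbb{R}^N \to \mathbb{R}^N$.

(d) $\operatorname{cap}_p(A) \le C \mu^{(N-p)}(A)$ for some $C = C(N, p) > 0$.

(e) $\lambda^N(A) \le C \big(\operatorname{cap}_p(A)\big)^{\frac{N}{N-p}}$ for some $C = C(N, p) > 0$.

(f) $\operatorname{cap}_p(A \cap B) + \operatorname{cap}_p(A \cup B) \le \operatorname{cap}_p(A) + \operatorname{cap}_p(B)$.

(g) if $\{A_n\}_{n \ge 1}$ is an increasing sequence (i.e., $A_n \subseteq A_{n+1}$ for all $n \ge 1$), then
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) = \operatorname{cap}_p\Big(\bigcup_{n=1}^{\infty} A_n\Big).$$

(h) if $\{A_n\}_{n \ge 1}$ is a decreasing sequence (i.e., $A_n \supseteq A_{n+1}$ for all $n \ge 1$) of compact sets in $\mathbb{R}^N$, then
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) = \operatorname{cap}_p\Big(\bigcap_{n=1}^{\infty} A_n\Big).$$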


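PROOF (a) From the monotonicity of the $p$-capacity, we have
$$\operatorname{cap}_p(A) \le \inf\big\{\operatorname{cap}_p(U) : A \subseteq U,\ U \text{ is open}\big\}. \tag{1.58}$$
For a given $\varepsilon > 0$, we can find $f \in K^p$, such that $A \subseteq \operatorname{int}\{f \ge 1\}$ and
$$\|Df\|_p^p \le \operatorname{cap}_p(A) + \varepsilon. \tag{1.59}$$
Let $U = \operatorname{int}\{f \ge 1\}$. Then from Definition 1.6.1(d) and (1.59), we have
$$\operatorname{cap}_p(U) \le \|Df\|_p^p \le \operatorname{cap}_p(A) + \varepsilon.$$
Since $U$ is open and $A \subseteq U$, letting $\varepsilon \searrow 0$ we obtain that the infimum in (1.58) is at most $\operatorname{cap}_p(A)$. Combining this with (1.58), we conclude that
$$\operatorname{cap}_p(A) = \inf\big\{\operatorname{cap}_p(U) : A \subseteq U,\ U \text{ is open}\big\}.$$

(b) Let $\varepsilon > 0$ be given. Then we can find $f \in K^p$, such that $A \subseteq \operatorname{int}\{f \ge 1\}$ and
$$\|Df\|_p^p \le \operatorname{cap}_p(A) + \varepsilon.$$
Let $\xi > 0$ and $h(z) \stackrel{df}{=} f\big(\frac{z}{\xi}\big)$. We have $h \in K^p$ and $\xi A \subseteq \operatorname{int}\{h \ge 1\}$. So
$$\operatorname{cap}_p(\xi A) \le \|Dh\|_p^p = \xi^{N-p} \|Df\|_p^p \le \xi^{N-p} \big(\operatorname{cap}_p(A) + \varepsilon\big).$$
Let $\varepsilon \searrow 0$ to obtain
$$\operatorname{cap}_p(\xi A) \le \xi^{N-p} \operatorname{cap}_p(A). \tag{1.60}$$
Using (1.60), we see that
$$\operatorname{cap}_p(A) = \operatorname{cap}_p\Big(\frac{1}{\xi}(\xi A)\Big) \le \frac{1}{\xi^{N-p}} \operatorname{cap}_p(\xi A),$$
so
$$\xi^{N-p} \operatorname{cap}_p(A) \le \operatorname{cap}_p(\xi A). \tag{1.61}$$
From (1.60) and (1.61), we conclude that
$$\operatorname{cap}_p(\xi A) = \xi^{N-p} \operatorname{cap}_p(A) \quad \forall\, \xi > 0.$$

(c) The proof is similar to that of (b).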

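(d) Let $\delta > 0$ and suppose that $A \subseteq \bigcup_{n=1}^{\infty} B_{r_n}(x_n)$, with $2 r_n < \delta$ for all $n \ge 1$. Since $\operatorname{cap}_p$ is an outer measure (see Theorem 1.6.10), using also (b) and (c) as $B_{r_n}(x_n) = x_n + r_n B_1(0)$, we have
$$\operatorname{cap}_p(A) \le \sum_{n=1}^{\infty} \operatorname{cap}_p\big(B_{r_n}(x_n)\big) = \operatorname{cap}_p\big(B_1(0)\big) \sum_{n=1}^{\infty} r_n^{N-p},$$
so
$$\operatorname{cap}_p(A) \le C \mu^{(N-p)}(A).$$

(e) Let $\varepsilon > 0$ and select $f \in K^p$, such that $A \subseteq \operatorname{int}\{f \ge 1\}$ and
$$\|Df\|_p^p \le \operatorname{cap}_p(A) + \varepsilon. \tag{1.62}$$
Using Proposition 1.6.9 and (1.62), we obtain
$$\lambda^N(A)^{\frac{1}{p^*}} \le \|f\|_{p^*} \le C \|Df\|_p \le C \big(\operatorname{cap}_p(A) + \varepsilon\big)^{\frac{1}{p}},$$
for some $C > 0$ and so, letting $\varepsilon \searrow 0$,
$$\lambda^N(A) \le C \operatorname{cap}_p(A)^{\frac{p^*}{p}} = C \operatorname{cap}_p(A)^{\frac{N}{N-p}},$$
for some $C > 0$.

(f) Let $\varepsilon > 0$ and select $f, g \in K^p$, such that
$$A \subseteq \operatorname{int}\{f \ge 1\}, \qquad B \subseteq \operatorname{int}\{g \ge 1\}$$
and
$$\|Df\|_p^p \le \operatorname{cap}_p(A) + \varepsilon, \qquad \|Dg\|_p^p \le \operatorname{cap}_p(B) + \varepsilon. \tag{1.63}$$
Let
$$h_0 \stackrel{df}{=} \min\{f, g\} \quad \text{and} \quad h_1 \stackrel{df}{=} \max\{f, g\}.$$
Using Proposition 1.6.4(a), we see that $h_0, h_1 \in K^p$. Also, we have
$$\|Dh_0(z)\|_{\mathbb{R}^N}^p + \|Dh_1(z)\|_{\mathbb{R}^N}^p = \|Df(z)\|_{\mathbb{R}^N}^p + \|Dg(z)\|_{\mathbb{R}^N}^p \quad \text{for } \lambda^N\text{-a.a. } z \in \mathbb{R}^N \tag{1.64}$$
and
$$A \cap B \subseteq \operatorname{int}\{h_0 \ge 1\}, \qquad A \cup B \subseteq \operatorname{int}\{h_1 \ge 1\}. \tag{1.65}$$
Therefore from (1.65), (1.64), (1.63) and since $h_0, h_1 \in K^p$, we obtain
$$\operatorname{cap}_p(A \cap B) + \operatorname{cap}_p(A \cup B) \le \|Dh_0\|_p^p + \|Dh_1\|_p^p = \|Df\|_p^p + \|Dg\|_p^p \le \operatorname{cap}_p(A) + \operatorname{cap}_p(B) + 2\varepsilon,$$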


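so $\operatorname{cap}_p(A \cap B) + \operatorname{cap}_p(A \cup B) \le \operatorname{cap}_p(A) + \operatorname{cap}_p(B)$.

(g) We do the proof for the case $p \in (1, N)$. For the case $p = 1$ we refer to Federer & Ziemer (1972). By virtue of the monotonicity property, we have
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) \le \operatorname{cap}_p\Big(\bigcup_{n=1}^{\infty} A_n\Big). \tag{1.66}$$
Suppose that
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) < +\infty,$$
as otherwise, by (1.66), there is nothing to prove. Let $\varepsilon > 0$ and for every $n \ge 1$ let us select $f_n \in K^p$, such that
$$A_n \subseteq \operatorname{int}\{f_n \ge 1\}$$
and
$$\|Df_n\|_p^p \le \operatorname{cap}_p(A_n) + \frac{\varepsilon}{2^n}. \tag{1.67}$$
Let us set $h_0 \stackrel{df}{=} 0$ and $h_k \stackrel{df}{=} \max_{1 \le n \le k} f_n$. We know that $\{h_k\}_{k \ge 0} \subseteq K^p$, $h_k = \max\{f_k, h_{k-1}\}$ and $A_{k-1} \subseteq \operatorname{int}\big\{\min\{f_k, h_{k-1}\} \ge 1\big\}$ (with $A_0 = \emptyset$). So, using (1.67), we have
$$\begin{aligned}
\|Dh_k\|_p^p + \operatorname{cap}_p(A_{k-1}) &\le \big\|D\big(\max\{f_k, h_{k-1}\}\big)\big\|_p^p + \big\|D\big(\min\{f_k, h_{k-1}\}\big)\big\|_p^p \\
&= \|Df_k\|_p^p + \|Dh_{k-1}\|_p^p \le \operatorname{cap}_p(A_k) + \frac{\varepsilon}{2^k} + \|Dh_{k-1}\|_p^p,
\end{aligned}$$
so
$$\|Dh_k\|_p^p - \|Dh_{k-1}\|_p^p \le \operatorname{cap}_p(A_k) - \operatorname{cap}_p(A_{k-1}) + \frac{\varepsilon}{2^k}.$$
Adding and recalling that $h_0 = 0$, we obtain
$$\|Dh_k\|_p^p \le \operatorname{cap}_p(A_k) + \varepsilon \quad \forall\, k \ge 1. \tag{1.68}$$
Let $u \stackrel{df}{=} \lim_{k \to +\infty} h_k$. Evidently
$$\bigcup_{k=1}^{\infty} A_k \subseteq \operatorname{int}\{u \ge 1\} \tag{1.69}$$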

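and so, by the monotone convergence theorem (see Theorem A.2.10), Proposition 1.6.9 and (1.68), we have
$$\|u\|_{p^*} = \lim_{k \to +\infty} \|h_k\|_{p^*} \le C \liminf_{k \to +\infty} \|Dh_k\|_p \le C \lim_{k \to +\infty} \big(\operatorname{cap}_p(A_k) + \varepsilon\big)^{\frac{1}{p}}. \tag{1.70}$$
So the sequence $\{Dh_k\}_{k \ge 1} \subseteq L^p\big(\mathbb{R}^N; \mathbb{R}^N\big)$ is bounded. Hence, at least for a subsequence, we may assume that
$$Dh_k \stackrel{w}{\longrightarrow} Du \quad \text{in } L^p\big(\mathbb{R}^N; \mathbb{R}^N\big)$$
(recall that we have assumed that $p > 1$). Then from (1.70) and since
$$\|Du\|_p \le \liminf_{k \to +\infty} \|Dh_k\|_p, \tag{1.71}$$
we infer that $u \in K^p$. Therefore, using (1.69), (1.71) and (1.68), we have
$$\operatorname{cap}_p\Big(\bigcup_{n=1}^{\infty} A_n\Big) \le \|Du\|_p^p \le \lim_{n \to +\infty} \operatorname{cap}_p(A_n) + \varepsilon.$$
Since $\varepsilon > 0$ was arbitrary, we obtain
$$\operatorname{cap}_p\Big(\bigcup_{n=1}^{\infty} A_n\Big) \le \lim_{n \to +\infty} \operatorname{cap}_p(A_n). \tag{1.72}$$
From (1.66) and (1.72), it follows that
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) = \operatorname{cap}_p\Big(\bigcup_{n=1}^{\infty} A_n\Big).$$

(h) Note that due to the monotonicity property, we have
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) \ge \operatorname{cap}_p\Big(\bigcap_{n=1}^{\infty} A_n\Big). \tag{1.73}$$
Let $U$ be an open set such that $\bigcap_{n=1}^{\infty} A_n \subseteq U$. The set $\bigcap_{n=1}^{\infty} A_n$ is compact and so for some $n_0 \ge 1$, we have that $A_{n_0} \subseteq U$, hence $A_n \subseteq U$ for all $n \ge n_0$. It follows that
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) \le \operatorname{cap}_p(U)$$
and so, using also (a), we have
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) \le \inf\Big\{\operatorname{cap}_p(U) : \bigcap_{n=1}^{\infty} A_n \subseteq U,\ U \text{ is open}\Big\} = \operatorname{cap}_p\Big(\bigcap_{n=1}^{\infty} A_n\Big). \tag{1.74}$$
From (1.73) and (1.74), we conclude that
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) = \operatorname{cap}_p\Big(\bigcap_{n=1}^{\infty} A_n\Big).$$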


REMARK 1.6.12 The monotonicity of $\mathrm{cap}_p$ together with properties (g) and (h) in Theorem 1.6.11 imply that the set-function $A \longmapsto \mathrm{cap}_p(A)$ is a “Choquet capacity” (see Definition A.2.37). Using Choquet’s capacitability theorem (see Theorem A.2.39 or cf. Choquet (1955)), we can say that for all Souslin (analytic) subsets $A$ of $\mathbb{R}^N$ (see Definition A.2.29(b) and Remark A.2.30), and so in particular for all Borel sets $A$ of $\mathbb{R}^N$, we have
\[
\mathrm{cap}_p(A) = \sup_{\substack{K \subseteq A \\ K \text{ is compact}}} \mathrm{cap}_p(K).
\]

In Theorem 1.6.11(d) we obtained a first relation between $p$-capacity and Hausdorff measures, both of which measure small sets in $\mathbb{R}^N$. We can improve this result as follows.

THEOREM 1.6.13 Let $p \in (1, N)$ and $A \subseteq \mathbb{R}^N$.
(a) If $\mu^{(N-p)}(A) < +\infty$, then $\mathrm{cap}_p(A) = 0$.
(b) If $\mathrm{cap}_p(A) = 0$, then $\mu^{(s)}(A) = 0$ for all $s > N - p$.

PROOF (a) Clearly we may assume that $A \subseteq \mathbb{R}^N$ is compact.

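Claim 1. We can find $C = C(N, p, A) > 0$, such that, if $V \subseteq \mathbb{R}^N$ is open with $A \subseteq V$, then we can find an open set $U \subseteq \mathbb{R}^N$ and $f \in K^p$, such that
\[
A \subseteq U \subseteq \{f = 1\}, \qquad \mathrm{supp}\, f \subseteq V \qquad \text{and} \qquad \|Df\|_p^p \le C.
\]
Let $V \subseteq \mathbb{R}^N$ be an open set, such that $A \subseteq V$. Let us set
\[
\delta \stackrel{df}{=} \frac{d(A, V^c)}{2} > 0.
\]
Because $A$ is compact and $\mu^{(N-p)}(A) < +\infty$, we can find $\{z_k\}_{k=1}^m \subseteq A$ and $\{r_k\}_{k=1}^m \subseteq \mathbb{R}_+ \setminus \{0\}$, such that
\[
2r_k < \delta, \qquad A \subseteq \bigcup_{k=1}^m B_{r_k}(z_k) \qquad \text{and} \qquad \sum_{k=1}^m r_k^{N-p} \le \mu^{(N-p)}(A) + 1. \tag{1.75}
\]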

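Let us set
\[
U \stackrel{df}{=} \bigcup_{k=1}^m B_{r_k}(z_k)
\]
and let $f_k \in K^p$ be defined by
\[
f_k(z) \stackrel{df}{=}
\begin{cases}
1 & \text{if } \|z - z_k\|_{\mathbb{R}^N} < r_k, \\
2 - \dfrac{\|z - z_k\|_{\mathbb{R}^N}}{r_k} & \text{if } r_k \le \|z - z_k\|_{\mathbb{R}^N} \le 2r_k, \\
0 & \text{if } 2r_k < \|z - z_k\|_{\mathbb{R}^N}.
\end{cases}
\]
Using Proposition 1.6.4(a), we see that
\[
\|Df_k\|_p^p \le C r_k^{N-p} \qquad \forall\, k \in \{1, \ldots, m\}. \tag{1.76}
\]
Let us set $f \stackrel{df}{=} \max_{1 \le k \le m} f_k$. Then $f \in K^p$, $U \subseteq \{f = 1\}$, $\mathrm{supp}\, f \subseteq V$ and from (1.76) and (1.75), we have
\[
\|Df\|_p^p \le \sum_{k=1}^m \|Df_k\|_p^p \le C \sum_{k=1}^m r_k^{N-p} \le C \big( \mu^{(N-p)}(A) + 1 \big),
\]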

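which proves Claim 1. We use the claim inductively and produce a sequence $\{U_n\}_{n\ge 1}$ of open sets in $\mathbb{R}^N$ and functions $\{f_n\}_{n\ge 1} \subseteq K^p$, such that
\[
A \subseteq U_{n+1} \subseteq U_n, \qquad \overline{U}_{n+1} \subseteq \mathrm{int}\,\{f_n = 1\} \qquad \forall\, n \ge 1
\]
and
\[
\mathrm{supp}\, f_n \subseteq U_n, \qquad \|Df_n\|_p^p \le C \qquad \forall\, n \ge 1.
\]
Let
\[
S_m \stackrel{df}{=} \sum_{n=1}^m \frac{1}{n} \qquad \text{and} \qquad h_m \stackrel{df}{=} \frac{1}{S_m} \sum_{n=1}^m \frac{1}{n} f_n.
\]
We have $h_m \in K^p$ and $h_m \ge 1$ on $U_{m+1}$. Also because $\mathrm{supp}\big( \|Df_n(\cdot)\|_{\mathbb{R}^N} \big) \subseteq U_n \setminus \overline{U}_{n+1}$ and $p > 1$, we see that
\[
\mathrm{cap}_p(A) \le \|Dh_m\|_p^p \le \frac{1}{S_m^p} \sum_{n=1}^m \frac{1}{n^p} \|Df_n\|_p^p \le \frac{C}{S_m^p} \sum_{n=1}^m \frac{1}{n^p} \longrightarrow 0 \qquad \text{as } m \to +\infty.
\]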
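The two quantitative steps in the proof of part (a) can be checked by hand. What follows is only a sketch under stated assumptions: $a(N)$ is taken to be the Lebesgue measure of the unit ball of $\mathbb{R}^N$ (as in the identity $\lambda^N(\overline{B}_r(z_0)) = a(N) r^N$ recalled later in the proof), and we use that $f_k$ is radial and piecewise linear, so that $\|Df_k(z)\|_{\mathbb{R}^N} = 1/r_k$ for almost all $z$ in the annulus $r_k < \|z - z_k\|_{\mathbb{R}^N} < 2r_k$ and $Df_k = 0$ almost everywhere outside it. The constant obtained for (1.76) is one admissible choice, not necessarily the one intended by Proposition 1.6.4(a).

```latex
% Gradient bound (1.76): integrate the constant r_k^{-p} over the annulus.
\[
\|Df_k\|_p^p
= \int_{\{r_k < \|z - z_k\| < 2r_k\}} r_k^{-p}\, dz
= r_k^{-p}\, a(N) \big( (2r_k)^N - r_k^N \big)
= a(N)\,(2^N - 1)\, r_k^{N-p}.
\]
% Final limit: S_m grows like a logarithm, while the p-series converges for p > 1.
\[
S_m = \sum_{n=1}^m \frac{1}{n} \ge \ln(m+1),
\qquad
\sum_{n=1}^m \frac{1}{n^p} \le 1 + \int_1^{+\infty} \frac{dx}{x^p} = \frac{p}{p-1},
\]
\[
\text{hence} \qquad
\frac{C}{S_m^p} \sum_{n=1}^m \frac{1}{n^p} \le \frac{C\,p}{(p-1)\,\ln^p(m+1)} \longrightarrow 0
\quad \text{as } m \to +\infty.
\]
```

This also shows where $p > 1$ enters: for $p = 1$ the quotient equals $C$ for every $m$ and does not tend to zero.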

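(b) Since $\mathrm{cap}_p(A) = 0$, for every $n \ge 1$ we can find $f_n \in K^p$, such that
\[
A \subseteq \mathrm{int}\,\{f_n \ge 1\} \qquad \text{and} \qquad \|Df_n\|_p^p \le \frac{1}{2^n}. \tag{1.77}
\]
Let us set $h \stackrel{df}{=} \sum_{n=1}^{\infty} f_n$. From (1.77), we have
\[
\|Dh\|_p \le \sum_{n=1}^{\infty} \|Df_n\|_p < +\infty. \tag{1.78}
\]
Using Proposition 1.6.9 and (1.78), we have that
\[
\|h\|_{p^*} \le \sum_{n=1}^{\infty} \|f_n\|_{p^*} \le \sum_{n=1}^{\infty} C \|Df_n\|_p < +\infty,
\]
so $h \in K^p$.

Observe that $A \subseteq \mathrm{int}\,\{h \ge m\}$ for all $m \ge 1$. Let $z_0 \in A$. For $r > 0$ small, we have $\overline{B}_r(z_0) \subseteq \mathrm{int}\,\{h \ge m\}$, hence
\[
h_{z_0,r} \stackrel{df}{=} \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} h(z)\, dz \ge m,
\]
which implies that
\[
h_{z_0,r} \longrightarrow +\infty \qquad \text{as } r \searrow 0. \tag{1.79}
\]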
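The finiteness claimed in (1.78) is a geometric series estimate. As a short sketch using only (1.77):

```latex
% From (1.77): \|Df_n\|_p \le 2^{-n/p}, and 0 < 2^{-1/p} < 1.
\[
\sum_{n=1}^{\infty} \|Df_n\|_p
\le \sum_{n=1}^{\infty} \big( 2^{-\frac{1}{p}} \big)^{n}
= \frac{2^{-\frac{1}{p}}}{1 - 2^{-\frac{1}{p}}}
= \frac{1}{2^{\frac{1}{p}} - 1} < +\infty.
\]
```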

Claim 2. For every $z_0 \in A$, we have
\[
\limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|_{\mathbb{R}^N}^p\, dz = +\infty,
\]
for any $s > N - p$.

To prove this claim, we proceed by contradiction. Let $z_0 \in A$ and suppose that
\[
\limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|_{\mathbb{R}^N}^p\, dz < +\infty.
\]
Then we can find $M < +\infty$, such that for all $r \in (0, 1]$, we have
\[
\frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|_{\mathbb{R}^N}^p\, dz \le M.
\]

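Invoking the Poincaré-Wirtinger inequality (see Theorem 1.6.8), we have
\[
\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} \big| h(z) - h_{z_0,r} \big|^p\, dz
\le \frac{C r^p}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} \|Dh(z)\|_{\mathbb{R}^N}^p\, dz
\le C_1 r^{\vartheta}, \tag{1.80}
\]
for some $C, C_1 > 0$, $\vartheta = s - (N - p)$ and for all $r \in (0, 1]$ (recall that $\lambda^N(\overline{B}_r(z_0)) = a(N) r^N$). Since
\[
\lambda^N\big( \overline{B}_r(z_0) \big) = 2^N \lambda^N\big( \overline{B}_{\frac{r}{2}}(z_0) \big)
\]
and using Jensen’s inequality (see Theorem A.2.26) and (1.80), we have that
\[
\big| h_{z_0,\frac{r}{2}} - h_{z_0,r} \big|
= \bigg| \frac{1}{\lambda^N(\overline{B}_{\frac{r}{2}}(z_0))} \int_{\overline{B}_{\frac{r}{2}}(z_0)} \big( h(z) - h_{z_0,r} \big)\, dz \bigg|
\le \frac{2^N}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} \big| h(z) - h_{z_0,r} \big|\, dz
\]
\[
\le 2^N \bigg( \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} \big| h(z) - h_{z_0,r} \big|^p\, dz \bigg)^{\frac{1}{p}}
\le C_2 r^{\frac{\vartheta}{p}}, \tag{1.81}
\]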

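for some $C_2 > 0$. Therefore, using (1.81), for $k > i$, we have
\[
\big| h_{z_0,\frac{1}{2^k}} - h_{z_0,\frac{1}{2^i}} \big|
\le \sum_{l=i+1}^{k} \big| h_{z_0,\frac{1}{2^l}} - h_{z_0,\frac{1}{2^{l-1}}} \big|
\le C_2 \sum_{l=i+1}^{k} \Big( \frac{1}{2^{l-1}} \Big)^{\frac{\vartheta}{p}},
\]
so $\big\{ h_{z_0,\frac{1}{2^n}} \big\}_{n\ge 1}$ is a Cauchy sequence and this contradicts the fact that $h_{z_0,\frac{1}{2^n}} \longrightarrow +\infty$ (see (1.79)). This proves Claim 2. Then we have
\[
A \subseteq \bigg\{ z_0 \in \mathbb{R}^N : \limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|_{\mathbb{R}^N}^p\, dz = +\infty \bigg\}
\subseteq \bigg\{ z_0 \in \mathbb{R}^N : \limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|_{\mathbb{R}^N}^p\, dz > 0 \bigg\} = C_s.
\]
But from Theorem 1.4.9, we have that $\mu^{(s)}(C_s) = 0$, hence $\mu^{(s)}(A) = 0$.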
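The Cauchy property used at the end of the proof of Claim 2 follows from a geometric tail estimate. As a sketch, write $q = 2^{-\vartheta/p}$ (a symbol introduced here only for this computation); since $\vartheta = s - (N - p) > 0$, we have $0 < q < 1$.

```latex
% Tail of a geometric series with ratio q = 2^{-\vartheta/p} \in (0,1).
\[
C_2 \sum_{l=i+1}^{k} \Big( \frac{1}{2^{l-1}} \Big)^{\frac{\vartheta}{p}}
= C_2 \sum_{l=i+1}^{k} q^{\,l-1}
\le C_2 \sum_{j=i}^{\infty} q^{\,j}
= \frac{C_2\, q^{\,i}}{1 - q}
\longrightarrow 0 \qquad \text{as } i \to +\infty,
\]
% uniformly in k > i, which is exactly the Cauchy condition for the dyadic averages.
```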

REMARK 1.6.14 If $p = 1$ and $A \subseteq \mathbb{R}^N$, it can be shown that $\mathrm{cap}_1(A) = 0$ if and only if $\mu^{(1)}(A) = 0$. The proof of this result, which uses functions of bounded variation and the isoperimetric inequality, can be found in Evans & Gariepy (1992, p. 193).

PROPOSITION 1.6.15 If $T \subseteq (0, 1)$ is such that $\lambda^1(T) > 0$, $p \in [N - 1, N)$, $A \subseteq \overline{B}_1(0) \subseteq \mathbb{R}^N$ and for each $r \in T$ there exists a unique $z_r \in \partial B_r(0)$, such that $z_r \in A$, then $\mathrm{cap}_p(A) > 0$.

PROOF

Let $f : \mathbb{R}^N \longrightarrow \mathbb{R}$ be defined by
\[
f(z) \stackrel{df}{=} \|z\|_{\mathbb{R}^N} = \Big( \sum_{n=1}^N z_n^2 \Big)^{\frac{1}{2}} \qquad \forall\, z = (z_n)_{n=1}^N \in \mathbb{R}^N.
\]
Evidently $f$ is Lipschitz continuous with $\mathrm{Lip}(f) = 1$. So by Proposition 1.3.25, we have
\[
\mu^{(1)}\big( f(A) \big) \le \mu^{(1)}(A).
\]
Note that $T = f(A)$. So by virtue of Theorem 1.3.21, we have that
\[
0 < \lambda^1(T) = \mu^{(1)}\big( f(A) \big) \le \mu^{(1)}(A).
\]
If for some $p \in [N - 1, N)$, we have that $\mathrm{cap}_p(A) = 0$, then from Theorem 1.6.13(b) (see also Remark 1.6.14), we have $\mu^{(1)}(A) = 0$ (note that $1 \ge N - p$), a contradiction. So
\[
\mathrm{cap}_p(A) > 0 \qquad \forall\, p \in [N - 1, N).
\]

The next result provides a kind of Chebyshev inequality in terms of $p$-capacities.

PROPOSITION 1.6.16 If $p \in [1, N)$, $f \in K^p$, $\varepsilon > 0$ and
\[
A \stackrel{df}{=} \bigg\{ z_0 \in \mathbb{R}^N : \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} f(z)\, dz > \varepsilon \ \text{ for some } r > 0 \bigg\},
\]
then there exists a constant $C = C(N, p)$, such that
\[
\mathrm{cap}_p(A) \le \frac{C}{\varepsilon^p} \|Df\|_p^p.
\]

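PROOF First, we show that the set $A \subseteq \mathbb{R}^N$ is open. Let $z_0 \in A$. Then for some $r > 0$ and $\xi > 0$, we have
\[
\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} f(z)\, dz = \varepsilon + \xi.
\]
Since $f \in L^1_{loc}(\mathbb{R}^N)$, exploiting the absolute continuity of the Lebesgue integral, we can find $\vartheta > 0$ small enough so that if $\lambda^N(B) < \vartheta$, then
\[
\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_B f(z)\, dz < \xi.
\]
Also let $\delta > 0$ be such that
\[
\lambda^N\big( \overline{B}_r(z) \,\triangle\, \overline{B}_r(z_0) \big) < \vartheta \qquad \forall\, \|z - z_0\|_{\mathbb{R}^N} < \delta,
\]
where $X \triangle Y \stackrel{df}{=} (X \setminus Y) \cup (Y \setminus X)$ is the symmetric difference of $X$ and $Y$. So, if $\|z - z_0\|_{\mathbb{R}^N} < \delta$, we have
\[
\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z)} f(y)\, dy
= \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} f(y)\, dy
+ \frac{1}{\lambda^N(\overline{B}_r(z_0))} \bigg[ \int_{\overline{B}_r(z)} f(y)\, dy - \int_{\overline{B}_r(z_0)} f(y)\, dy \bigg]
\]
\[
\ge \varepsilon + \xi - \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z) \triangle \overline{B}_r(z_0)} f(y)\, dy
> \varepsilon + \xi - \xi = \varepsilon,
\]
so $z \in A$ and we infer that $A$ is open. Next let $z_0 \in A$ and let $r > 0$ be such that
\[
\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} f(y)\, dy > \varepsilon.
\]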

Then, by Jensen’s inequality (see Theorem A.2.26), we have
\[
a(N) r^N \varepsilon < \int_{\overline{B}_r(z_0)} f(y)\, dy
\le \big( a(N) r^N \big)^{1 - \frac{1}{p^*}} \bigg( \int_{\overline{B}_r(z_0)} f(y)^{p^*}\, dy \bigg)^{\frac{1}{p^*}}
\le \big( a(N) r^N \big)^{1 - \frac{1}{p^*}} \|f\|_{p^*},
\]
so $r \le C_1$ for some $C_1 > 0$ independent of $z_0$.
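The bound on the radius can be made explicit by solving the last inequality for $r$. The following is a sketch only; the specific value of $C_1$ is our own rearrangement of the displayed estimate, with $a(N)$ as in $\lambda^N(\overline{B}_r(z_0)) = a(N) r^N$.

```latex
% Divide a(N) r^N \varepsilon < (a(N) r^N)^{1 - 1/p^*} \|f\|_{p^*} by (a(N) r^N)^{1 - 1/p^*} \varepsilon:
\[
\big( a(N) r^N \big)^{\frac{1}{p^*}} < \frac{\|f\|_{p^*}}{\varepsilon}
\quad \Longrightarrow \quad
r < a(N)^{-\frac{1}{N}} \Big( \frac{\|f\|_{p^*}}{\varepsilon} \Big)^{\frac{p^*}{N}} =: C_1 .
\]
```

The right-hand side depends on $N$, $p$, $\varepsilon$ and $\|f\|_{p^*}$ but not on $z_0$, which is what makes the radii uniformly bounded and the Besicovitch covering theorem applicable.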

Invoking the Besicovitch covering theorem (see Theorem 1.2.18), we can find a positive integer $k = k(N) \ge 1$ and countable collections $\{T_n\}_{n=1}^k$ of closed balls, such that
\[
A \subseteq \bigcup_{n=1}^k \bigcup_{B \in T_n} B
\qquad \text{and} \qquad
\frac{1}{\lambda^N(B)} \int_B f(y)\, dy > \varepsilon \qquad \forall\, B \in \bigcup_{n=1}^k T_n. \tag{1.82}
\]
Let $\big\{ B_i^{(n)} \big\}_{i\ge 1}$ be an enumeration of the elements in the countable collection $T_n$ for $n = 1, \ldots, k$. Using Proposition 1.6.4(a), we can check that
\[
\bigg( \frac{1}{\lambda^N(B_i^{(n)})} \int_{B_i^{(n)}} f(y)\, dy - f \bigg)^{+} \in W^{1,p}\big( B_i^{(n)} \big).
\]

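Then Poincaré’s inequality (see Theorem 1.6.8) implies that
\[
\bigg\| \bigg( \frac{1}{\lambda^N(B_i^{(n)})} \int_{B_i^{(n)}} f(y)\, dy - f \bigg)^{+} \bigg\|_{W^{1,p}(B_i^{(n)})}
\le C_2 \|Df\|_{L^p(B_i^{(n)};\mathbb{R}^N)},
\]
for some $C_2 > 0$. Invoking Proposition 1.6.5, we can find $g_i^{(n)} \in W^{1,p}(\mathbb{R}^N)$, such that
\[
g_i^{(n)} \ge 0, \qquad
g_i^{(n)}(z) = \bigg( \frac{1}{\lambda^N(B_i^{(n)})} \int_{B_i^{(n)}} f(y)\, dy - f \bigg)^{+}(z) \quad \text{for a.a. } z \in B_i^{(n)}
\]
and
\[
\big\| g_i^{(n)} \big\|_{W^{1,p}(\mathbb{R}^N)} \le C_3 \|Df\|_{L^p(B_i^{(n)};\mathbb{R}^N)}, \tag{1.83}
\]
for some $C_3 = C_3(N, p) > 0$. From (1.82), we have
\[
f + g_i^{(n)} \ge \frac{1}{\lambda^N(B_i^{(n)})} \int_{B_i^{(n)}} f(y)\, dy > \varepsilon \qquad \text{almost everywhere on } B_i^{(n)}.
\]
Also if
\[
g \stackrel{df}{=} \sup_{\substack{i \ge 1 \\ n = 1, \ldots, k}} g_i^{(n)},
\]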

we have $g \ge 0$. We claim that $g \in K^p$. To this end, using (1.83), note that
\[
\int_{\mathbb{R}^N} \sup_{\substack{i \ge 1 \\ n = 1, \ldots, k}} \big| g_i^{(n)}(y) \big|^p\, dy
\le \sum_{n=1}^k \sum_{i=1}^{\infty} \int_{\mathbb{R}^N} \big| g_i^{(n)}(y) \big|^p\, dy
\le C_3^p \sum_{n=1}^k \sum_{i=1}^{\infty} \int_{B_i^{(n)}} \|Df(y)\|_{\mathbb{R}^N}^p\, dy
\le k C_3^p \|Df\|_p^p,
\]
so $g \in L^p(\mathbb{R}^N)$. Also, we have
\[
\int_{\mathbb{R}^N} \sup_{\substack{i \ge 1 \\ n = 1, \ldots, k}} \big\| Dg_i^{(n)}(y) \big\|_{\mathbb{R}^N}^p\, dy
\le \sum_{n=1}^k \sum_{i=1}^{\infty} \big\| Dg_i^{(n)} \big\|_p^p
\le C_3^p \sum_{n=1}^k \sum_{i=1}^{\infty} \int_{B_i^{(n)}} \|Df(y)\|_{\mathbb{R}^N}^p\, dy
\le k C_3^p \|Df\|_p^p. \tag{1.84}
\]
From Proposition 1.6.4(b), we infer that
\[
\|Dg(y)\|_{\mathbb{R}^N} \le \sup_{\substack{i \ge 1 \\ n = 1, \ldots, k}} \big\| Dg_i^{(n)}(y) \big\|_{\mathbb{R}^N} \qquad \text{for a.a. } y \in \mathbb{R}^N,
\]
so by (1.84), we have that $Dg \in L^p(\mathbb{R}^N;\mathbb{R}^N)$ and thus $g \in K^p$. Because $f + g \ge \varepsilon$ almost everywhere on $A$ and $A$ is open, using also (1.84), it follows that
\[
\mathrm{cap}_p(A) \le \int_{\mathbb{R}^N} \Big\| \frac{1}{\varepsilon} D(f + g)(y) \Big\|_{\mathbb{R}^N}^p\, dy
\le \frac{C_4}{\varepsilon^p} \big( \|Df\|_p^p + \|Dg\|_p^p \big)
\le \frac{C_5}{\varepsilon^p} \|Df\|_p^p,
\]
for some $C_4, C_5 = C_5(N, p) > 0$. Setting $C = C_5$, we obtain the result.

DEFINITION 1.6.17 A function $f : \mathbb{R}^N \longrightarrow \mathbb{R}$ is said to be $p$-quasicontinuous, if for each $\varepsilon > 0$, we can find an open set $U \subseteq \mathbb{R}^N$, such that $\mathrm{cap}_p(U) < \varepsilon$ and $f|_{\mathbb{R}^N \setminus U}$ is continuous.

We have all the necessary tools to prove the following “differentiability” result for Sobolev functions. As we already said, a systematic study of Sobolev functions and of their differentiability properties will be conducted in Chapter 2. Here we state a result which says that, up to a set $A$ of $p$-capacity zero, a function $f \in W^{1,p}(\mathbb{R}^N)$ can be represented by a $p$-quasicontinuous function.

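THEOREM 1.6.18 If $p \in [1, N)$ and $f \in W^{1,p}(\mathbb{R}^N)$, then there exists a $p$-quasicontinuous function $f^* : \mathbb{R}^N \longrightarrow \mathbb{R}$, such that
(a) there exists a Borel set $A \subseteq \mathbb{R}^N$ with $\mathrm{cap}_p(A) = 0$, such that
\[
\lim_{r \to 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} f(y)\, dy = f^*(z) \qquad \forall\, z \in \mathbb{R}^N \setminus A;
\]
(b) for each $z \in \mathbb{R}^N \setminus A$, we have
\[
\lim_{r \to 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \big| f(y) - f^*(z) \big|^{p^*}\, dy = 0.
\]

PROOF (a) Let
\[
C \stackrel{df}{=} \bigg\{ z \in \mathbb{R}^N : \limsup_{r \to 0} \frac{1}{r^{N-p}} \int_{\overline{B}_r(z)} \|Df(y)\|_{\mathbb{R}^N}^p\, dy > 0 \bigg\}.
\]
From Theorem 1.4.9, we have that $\mu^{(N-p)}(C) = 0$ and this by Theorem 1.6.13(a) implies that $\mathrm{cap}_p(C) = 0$. Moreover, from the Poincaré-Wirtinger inequality (see Theorem 1.6.8), we see that
\[
\lim_{r \to 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \big| f(y) - f_{z,r} \big|^{p^*}\, dy = 0 \qquad \forall\, z \in \mathbb{R}^N \setminus C, \tag{1.85}
\]
where $f_{z,r}$ denotes the average of $f$ over $\overline{B}_r(z)$. By virtue of Proposition 1.6.3, we can find a sequence $\{f_n\}_{n\ge 1} \subseteq W^{1,p}(\mathbb{R}^N) \cap C^{\infty}(\mathbb{R}^N)$, such that
\[
\|Df - Df_n\|_p^p \le \frac{1}{2^{(p+1)n}} \qquad \forall\, n \ge 1. \tag{1.86}
\]
For $n \ge 1$, we introduce the sets
\[
E_n \stackrel{df}{=} \bigg\{ z \in \mathbb{R}^N : \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} |f(y) - f_n(y)|\, dy > \frac{1}{2^n} \ \text{ for some } r > 0 \bigg\}.
\]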

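From Proposition 1.6.16 and (1.86), we have that
\[
\mathrm{cap}_p(E_n) \le C\, 2^{pn} \|Df - Df_n\|_p^p \le \frac{C\, 2^{pn}}{2^{(p+1)n}},
\]
for some $C = C(N, p) > 0$, so
\[
\mathrm{cap}_p(E_n) \le \frac{C}{2^n}. \tag{1.87}
\]
Moreover, we have
\[
\big| f_{z,r} - f_n(z) \big| \le \frac{1}{\lambda^N(B_r(z))} \bigg\{ \int_{B_r(z)} \big| f(y) - f_{z,r} \big|\, dy + \int_{B_r(z)} |f(y) - f_n(y)|\, dy + \int_{B_r(z)} |f_n(y) - f_n(z)|\, dy \bigg\},
\]
so from (1.85), we have
\[
\limsup_{r \to 0} \big| f_{z,r} - f_n(z) \big| \le \frac{1}{2^n} \qquad \forall\, z \notin C \cup E_n. \tag{1.88}
\]
We set
\[
A_k \stackrel{df}{=} C \cup \Big( \bigcup_{n=k}^{\infty} E_n \Big) \qquad \forall\, k \ge 1.
\]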

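Evidently $A_k$ is Borel for $k \ge 1$ and we have that
\[
\mathrm{cap}_p(A_k) \le \mathrm{cap}_p(C) + \sum_{n=k}^{\infty} \mathrm{cap}_p(E_n) \le C \sum_{n=k}^{\infty} \frac{1}{2^n}. \tag{1.89}
\]
Then, if
\[
A \stackrel{df}{=} \bigcap_{k=1}^{\infty} A_k,
\]
from (1.89), we have that
\[
\mathrm{cap}_p(A) \le \lim_{k \to +\infty} \mathrm{cap}_p(A_k) = 0.
\]
Note that, if $z \in \mathbb{R}^N \setminus A_k$ and $n, m \ge k$, from (1.88), we have
\[
\big| f_n(z) - f_m(z) \big| \le \limsup_{r \to 0} \big| f_{z,r} - f_n(z) \big| + \limsup_{r \to 0} \big| f_{z,r} - f_m(z) \big| \le \frac{1}{2^n} + \frac{1}{2^m},
\]
so the sequence $\{f_n\}_{n\ge 1}$ converges uniformly on $\mathbb{R}^N \setminus A_k$ to some $h \in C(\mathbb{R}^N)$. Also we have
\[
\limsup_{r \to 0} \big| h(z) - f_{z,r} \big| \le |h(z) - f_n(z)| + \limsup_{r \to 0} \big| f_n(z) - f_{z,r} \big|,
\]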

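so, from (1.88), we have
\[
h(z) = \lim_{r \to 0} f_{z,r} = f^*(z) \qquad \forall\, z \in \mathbb{R}^N \setminus A_k,\ k \ge 1,
\]
hence
\[
f^*(z) = \lim_{r \to 0} f_{z,r} \qquad \forall\, z \in \mathbb{R}^N \setminus A.
\]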

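We need to show that $f^*$ is $p$-quasicontinuous. For this purpose let $\varepsilon > 0$ be given. We choose $k \ge 1$, such that $\mathrm{cap}_p(A_k) < \varepsilon$. Then (see Theorem 1.6.11(a)) we can find an open set $U \subseteq \mathbb{R}^N$, such that $A_k \subseteq U$ and $\mathrm{cap}_p(U) < \varepsilon$. Since the sequence $\{f_n\}_{n\ge 1}$ converges to $f^*$ uniformly on $\mathbb{R}^N \setminus U$, we infer that the function $f^*|_{\mathbb{R}^N \setminus U}$ is continuous, hence $f^*$ is $p$-quasicontinuous.

(b) Note that $C \subseteq A$ and so from (1.85), we see that for all $z \in \mathbb{R}^N \setminus A$, we have that
\[
\lim_{r \to 0} \bigg( \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} |f(y) - f^*(z)|^{p^*}\, dy \bigg)^{\frac{1}{p^*}}
\le \lim_{r \to 0} \big| f_{z,r} - f^*(z) \big| + \lim_{r \to 0} \bigg( \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \big| f(y) - f_{z,r} \big|^{p^*}\, dy \bigg)^{\frac{1}{p^*}} = 0.
\]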
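The capacity estimates (1.87) and (1.89) in the proof of part (a) rest on two elementary computations, sketched here with the same constant $C = C(N, p)$ and using $\mathrm{cap}_p(C) = 0$ for the exceptional set $C$:

```latex
% (1.87): Proposition 1.6.16 with \varepsilon = 2^{-n} gives the factor \varepsilon^{-p} = 2^{pn}.
\[
\mathrm{cap}_p(E_n) \le C\, 2^{pn} \cdot \frac{1}{2^{(p+1)n}} = \frac{C}{2^{n}} .
\]
% (1.89): the tail of the geometric series, which forces cap_p(A_k) -> 0.
\[
\mathrm{cap}_p(A_k) \le \sum_{n=k}^{\infty} \frac{C}{2^{n}} = \frac{C}{2^{k-1}} \longrightarrow 0
\qquad \text{as } k \to +\infty .
\]
```

This explains the exponent $(p+1)n$ chosen in (1.86): the extra factor $2^{-n}$ is what survives after the loss of $2^{pn}$ in the Chebyshev-type inequality.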

REMARK 1.6.19

By virtue of Theorem 1.6.13(b), we have that
\[
\mu^{(s)}(A) = 0 \qquad \forall\, s > N - p.
\]
Hence $\dim A \le N - p$. On the other hand, this last inequality does not necessarily imply that $\mu^{(N-p)}(A) < +\infty$, which would in turn give us that $\mathrm{cap}_p(A) = 0$ (see Theorem 1.6.13(a)). Therefore the conclusion $\mathrm{cap}_p(A) = 0$ in the statement of Theorem 1.6.18 is stronger than the dimensionality condition $\dim A \le N - p$.

1.7 Remarks

1.1: The necessary measure theoretic background is standard and can be found in any book on abstract measure theory. We mention the books of Ash (1972), Denkowski, Migórski & Papageorgiou (2003a, Chapter 2), Dudley (1989), Hewitt & Stromberg (1975) and Royden (1968). Outer measures were first introduced by Carathéodory (1914), who also gave the definition of a µ-measurable set (see Definition 1.1.3) and proved that Σ(µ) is a σ-field and the outer measure µ restricted on Σ(µ) is a measure. For a proof of Proposition 1.1.10 we refer to Denkowski, Migórski & Papageorgiou (2003a, p. 184–186) and Evans & Gariepy (1992, p. 6–9).

1.2: In some books a Vitali cover (see Definition 1.2.2) is called “fine cover” (see, e.g., Evans & Gariepy (1992)). The original version of Theorem 1.2.5 (Vitali covering theorem) is due to Vitali (1908), who employed closed cubes. The first to study the differentiability of monotone functions (or more generally of functions of bounded variation, in particular then of absolutely continuous functions) was Lebesgue (1904, 1910). Evans & Gariepy (1992, p. 30), Hardt (1979), Simon (1983) and Ziemer (1989) contain the proof of Theorem 1.2.18. For the proof of Theorem 1.2.19 see Evans & Gariepy (1992, p. 35). In general covering theorems are useful in harmonic analysis and in geometric measure theory. More about them can be found in de Guzman (1975).

1.3: Carathéodory (1914), working with outer measures, was the first to introduce “Hausdorff measures.” More precisely, he introduced “1-dimensional” (or “linear”) measures in $\mathbb{R}^N$ and also indicated that similarly one can define $k$-dimensional measures in $\mathbb{R}^N$ for any integer $k \ge 1$. Hausdorff (1919) realized that Carathéodory’s definition can be used also for noninteger $s > 0$. He then went on to show that Cantor’s ternary set has fractional dimension $s = \frac{\ln 2}{\ln 3}$.
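The value $\frac{\ln 2}{\ln 3}$ can be guessed from a simple covering count. The following is only a heuristic sketch (it produces the natural covers and the critical exponent, not a full proof of the lower bound): at stage $k$ of the standard construction, the ternary set is covered by $2^k$ closed intervals, each of length $3^{-k}$.

```latex
% Sum of the s-th powers of the diameters of the stage-k cover:
\[
\sum_{j=1}^{2^k} \big( 3^{-k} \big)^{s} = 2^{k}\, 3^{-ks} = \Big( \frac{2}{3^{s}} \Big)^{k}.
\]
% As k -> +infinity this tends to 0 if 3^s > 2, to +infinity if 3^s < 2,
% and stays equal to 1 exactly when
\[
3^{s} = 2 \quad \Longleftrightarrow \quad s = \frac{\ln 2}{\ln 3} \approx 0.6309 .
\]
```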
An extension of the theory can be achieved by replacing $\delta(A_n)$ in the definition of the Hausdorff measure, by $\xi(A_n)$, where $\xi : 2^X \longrightarrow \mathbb{R}_+$ is any premeasure, i.e., $\xi(\emptyset) = 0$ and if $U \subseteq V$ then $\xi(U) \le \xi(V)$ (monotonicity). Of special interest are premeasures resulting from Hausdorff functions $h$. Namely we consider a function $h : \mathbb{R}_+ \longrightarrow \mathbb{R}_+$ satisfying:
(a) $h(t) > 0$ for all $t > 0$,
(b) if $t \le s$, then $h(t) \le h(s)$,
(c) $h$ is right continuous at every $t \ge 0$.
Such a function is called a Hausdorff function. For such a function and a positive constant $\vartheta$, we define a premeasure $\xi$ on the metric space $X$, by
\[
\xi(A) \stackrel{df}{=}
\begin{cases}
\min\big\{ h\big( \delta(A) \big), h(\vartheta) \big\} & \text{if } A \ne \emptyset, \\
0 & \text{if } A = \emptyset.
\end{cases}
\]


Then $\xi$ is the premeasure defined by $h$ and the cut-off level $\vartheta$. For more details about this generalization, we refer to Davies (1970) and Davies & Samuels (1974). In the presentation of the isodiametric inequality (see Theorem 1.3.20) and of the fact that $\mu^{(N)}$ is a multiple of the Lebesgue measure $\lambda^N$, we follow Evans & Gariepy (1992, Chapter 3). We refer also to Falconer (1985, Section 1.6), Federer (1969, Section 2.10.33) and Hardt (1979). It would be a grave omission not to mention the fundamental contributions to the field of Hausdorff measures made by Besicovitch. We mention the works of Besicovitch (1945, 1946) related to Theorem 1.2.18 (the covering theorem bearing his name). A more complete list of the works of Besicovitch can be found in the book of Falconer (1985).

1.4: The intuitive meaning of Theorem 1.4.1 is that for $\lambda^N$-almost all $x \in A$, small balls centered at $x$ consist predominantly of points of $A$. Theorems 1.4.1 and 1.4.6 are due to Lebesgue (1910). They were generalized by Besicovitch (1945, 1946), who replaced the Lebesgue measure $\lambda^N$ by a Radon measure on $\mathbb{R}^N$. Another source of information for the differentiation of measures in $\mathbb{R}^N$ is the book of Widom (1969).

1.5: Theorem 1.5.4 was originally proved by McShane (1934), who produced the minimal Lipschitz extension of $f$. Theorem 1.5.8 was originally proved by Rademacher (1919). The proof that is given here is essentially due to Morrey (1966, Theorem 3.1.6). It can be found also in Evans & Gariepy (1992), Simon (1983) and Ziemer (1989). If we employ the notion of Haar-null set, we can also have an extension of Rademacher’s theorem to locally Lipschitz functions between Banach spaces.

DEFINITION 1.7.1 Let $(G, +)$ be an abelian Polish group and $d$ an invariant metric on $G$ compatible with the topology (therefore automatically complete). A universally measurable set $A \subseteq G$ is a Haar-null set, if there exists a probability measure $\mu$ on $G$ (not unique), such that $\chi_A \star \mu = 0$, where $\chi_A$ is the characteristic function of the set $A$ and
\[
\big( \chi_A \star \mu \big)(x) = \int_G \chi_A(x + y)\, d\mu(y).
\]

REMARK 1.7.2 The above definition is equivalent to the requirement that every translate of the set $A$ is a zero set for the measure $\mu$. The measure $\mu$ is usually called a test measure.

The next theorem is the extension of Theorem 1.5.8 to functions between Banach spaces. It will be proved in Section 4.3 (see Theorem 4.3.17).

THEOREM 1.7.3 If $X$ is a separable Banach space, $Y$ is a Banach space with the RNP (see Section 2.1) and $f : X \longrightarrow Y$ is locally Lipschitz, then there exists a universally measurable set $D \subseteq X$, such that $X \setminus D$ is Haar-null and $f|_D$ is differentiable in the sense of Gâteaux.


Our derivation of the area formula (see Theorem 1.5.21) is based on Evans & Gariepy (1992, Section 3.3) (see also Federer (1969, Section 3.2) and Hardt (1979)). Lipschitz continuous functions and their properties are discussed in Federer (1969, Section 3.3). If $N = M$, from the two change of variables results (see Theorem 1.5.23 and Theorem 1.5.26), we obtain the following change of variables formula.

THEOREM 1.7.4 If $U, V \subseteq \mathbb{R}^N$ are open sets, $f : U \longrightarrow V$ is a locally Lipschitz homeomorphism and $u \in L^1(V)$, then
\[
v = (u \circ f) \big| \det Jf \big| \in L^1(U)
\]
and
\[
\int_U u\big( f(x) \big) \big| \det Jf(x) \big|\, d\lambda^N(x) = \int_V u(y)\, d\lambda^N(y).
\]
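A one-dimensional instance may help to fix the roles of $U$, $V$, $f$ and $u$ in Theorem 1.7.4. The data below are our own illustrative choice, not taken from the text: $N = 1$, $U = (0, 1)$, $V = (0, 2)$, $f(x) = 2x$ (a Lipschitz homeomorphism of $U$ onto $V$ with $\det Jf \equiv 2$) and $u(y) = y$.

```latex
% Left-hand side: integrate (u \circ f)|det Jf| over U = (0,1).
\[
\int_0^1 u\big( f(x) \big) \big| \det Jf(x) \big|\, dx = \int_0^1 (2x) \cdot 2\, dx = 2 .
\]
% Right-hand side: integrate u over V = (0,2).
\[
\int_0^2 u(y)\, dy = \int_0^2 y\, dy = 2 .
\]
```

Both sides agree, and the integrand on the left is a function on $U$, which is why the integrability conclusion concerns $L^1(U)$.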

The proof of Theorem 1.5.25 can be found in Evans & Gariepy (1992, Section 3.4) or Federer (1969, Subsection 3.2.11). 1.6: Our treatment of capacity follows Evans & Gariepy (1992, Section 4.7) (see also Federer & Ziemer (1972)). There are other notions of capacity as for example Bessel capacity, Riesz capacity, etc., which are discussed in Stein (1970) and Ziemer (1989). The abstract theory of capacities in Banach spaces can be found in Fowler (1973). Moreover, for the use of capacities in the convergence of obstacles, we refer to Dal Maso (1985).

Chapter 2 Lebesgue-Bochner and Sobolev Spaces

The functional-analytic approach to the solution of (partial) differential equations requires knowledge of the properties of spaces of functions of one or several real variables. A large class of infinite dimensional dynamical systems (evolution systems) can be modelled as an abstract differential equation defined on a suitable Banach space or on a suitable manifold therein. The advantage of such an abstract formulation lies not only in its generality but also in the insight that can be gained about the many common unifying properties that tie together apparently diverse problems. It is clear that such a study relies on the knowledge of various spaces of vector valued functions (i.e., of Banach space valued functions). For this reason Section 2.1 deals with vector valued functions. We introduce the various notions of measurability for such functions and then based on them we define the different integrals corresponding to them. The emphasis is on the so-called Bochner integral, which generalizes in a very natural way the classical Lebesgue integral to vector valued functions. In Section 2.2 we continue with vector valued functions and introduce the so-called Lebesgue-Bochner spaces, which extend to vector valued functions the well known Lebesgue $L^p$-spaces. We also consider evolution triples and the function spaces associated with them. Evolution triples provide a suitable analytical framework for the study of a large class of linear and nonlinear evolution equations. In Section 2.3 we have compactness results for the spaces introduced and studied in the previous section. The compactness results refer to both the strong and the weak topologies on the spaces under consideration. Thus far we are dealing with function spaces arising in evolutionary problems. In Section 2.4 we study Sobolev spaces, which are the main tools in the analysis of both stationary and nonstationary equations.
Sobolev spaces play a central role in the modern theory of partial differential equations and they allow us to broaden significantly the notion of solution of a boundary value problem. They provide a natural functional analytical framework for the study of weak solutions of elliptic boundary value problems. No specific applications to problems in partial differential equations are discussed. Instead the section aims to serve as a concise introduction to the properties of


the Sobolev spaces (of both one and several variables). In Section 2.5 we present some fundamental inequalities associated with Sobolev functions, the celebrated embedding theorems for the Sobolev spaces and some of their consequences. The embedding theorems are arguably the most important results in this theory and the reason why Sobolev spaces are so effective in dealing with boundary value problems. Finally in Section 2.6 we establish some fine properties of Sobolev spaces and introduce functions of bounded variation (BV-functions). These are functions whose weak first partial derivatives are Radon measures and this is essentially the weakest measure theoretic sense in which a function can be differentiable. They are particularly useful in theoretical mechanics.

2.1 Vector-Valued Functions

In this section we deal with functions which take values in Banach spaces. For such functions we define the various notions of measurability and different integrals corresponding to them. The domain of a function is a finite measure space $(\Omega, \Sigma, \mu)$ and the range is a Banach space $X$. By $X^*$ we denote the topological dual of $X$ and by $\langle \cdot, \cdot \rangle_X$ the duality brackets for the pair $(X^*, X)$. By $\mathcal{B}(X)$ we denote the Borel σ-field of $X$.

DEFINITION 2.1.1 Let $f : \Omega \longrightarrow X$ be a function.

(a) Function $f$ is said to be a simple function, if it takes only a finite number of values, say $x_1, \ldots, x_N$ and
\[
C_k = f^{-1}\big( \{x_k\} \big) \in \Sigma \qquad \forall\, k \in \{1, \ldots, N\}.
\]
The formula $f = \sum_{k=1}^N x_k \chi_{C_k}$ is called the standard representation of $f$.

(b) Function $f$ is said to be strongly measurable (or Bochner measurable), if there exists a sequence $\{s_n : \Omega \longrightarrow X\}_{n\ge 1}$ of simple functions, such that $s_n(\omega) \longrightarrow f(\omega)$ for µ-a.a. $\omega \in \Omega$, where $\longrightarrow$ denotes the convergence in the norm topology of $X$, i.e.,
\[
\big\| f(\omega) - s_n(\omega) \big\|_X \longrightarrow 0 \qquad \text{for µ-a.a. } \omega \in \Omega.
\]

(c) Function $f$ is said to be weakly measurable, if for all $x^* \in X^*$, the function $\omega \longmapsto \langle x^*, f(\omega) \rangle_X$ is Σ-measurable.

(d) Function $f : \Omega \longrightarrow X^*$ is said to be weak$^*$-measurable, if for all $x \in X$, the function $\omega \longmapsto \langle f(\omega), x \rangle_X$ is Σ-measurable.

REMARK 2.1.2 Evidently strong measurability of a function $f : \Omega \longrightarrow X$ implies its weak measurability. Also strong measurability implies that for every $B \in \mathcal{B}(X)$, we have that $f^{-1}(B) \in \Sigma$ (i.e., $f$ is Borel measurable). Moreover, adapting the proof of the classical result, which asserts that a measurable $\mathbb{R}$-valued function is the µ-almost everywhere limit of a sequence of simple functions, we see that if $X$ is separable, then $f : \Omega \longrightarrow X$ is strongly measurable if and only if it is Borel measurable. In fact, by the next theorem, known as the Pettis measurability theorem, when $X$ is separable, the situation simplifies considerably.

THEOREM 2.1.3 A function $f : \Omega \longrightarrow X$ is strongly measurable if and only if it is weakly measurable and µ-almost separably valued (i.e., there exists a set $A \in \Sigma$ with $\mu(A) = 0$, such that $f(\Omega \setminus A)$ is separable in $X$).

PROOF “$\Longrightarrow$”: Let $f : \Omega \longrightarrow X$ be a strongly measurable function. As we already pointed out, $f$ is weakly measurable. Also since $f$ is strongly measurable, we can find a sequence $\{s_n\}_{n\ge 1}$ of $X$-valued simple functions and a µ-null set $A \in \Sigma$, such that
\[
s_n(\omega) \longrightarrow f(\omega) \quad \text{in } X, \text{ for all } \omega \in \Omega \setminus A.
\]
Let $\{y_n\}_{n\ge 1}$ be a sequence of all the values taken by the sequence $\{s_n\}_{n\ge 1}$ (clearly the set is countable). Let
\[
Y \stackrel{df}{=} \overline{\mathrm{span}}\, \{y_n\}_{n\ge 1}.
\]
Then $Y$ is a closed separable subspace of $X$. Moreover, $f(\Omega \setminus A) \subseteq Y$ and so $f$ is µ-almost separably valued.

“$\Longleftarrow$”: Let $f : \Omega \longrightarrow X$ be a weakly measurable and µ-almost separably valued function. Without any loss of generality, we may assume that $f$ is separably valued. Then replacing $X$ by $Y = \overline{\mathrm{span}}\, f(\Omega)$, which is separable, we see that we may assume that $X$ is separable. Let $\{x_n^*\}_{n\ge 1}$ be dense in $\partial \overline{B}_1^{X^*}(0)$, where
\[
\partial \overline{B}_1^{X^*}(0) = \big\{ x^* \in X^* : \|x^*\|_{X^*} = 1 \big\}.
\]
Then
\[
\big\| f(\omega) \big\|_X = \sup_{n \ge 1} \big| \langle x_n^*, f(\omega) \rangle_X \big|.
\]
But for each $n \ge 1$, the function $\omega \longmapsto \langle x_n^*, f(\omega) \rangle_X$ is Σ-measurable,

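hence the function $\omega \longmapsto \big\| f(\omega) \big\|_X$ is Σ-measurable. Let
\[
C_0 \stackrel{df}{=} \big\{ \omega \in \Omega : \big\| f(\omega) \big\|_X > 0 \big\}.
\]
We have that $C_0 \in \Sigma$ and for every $y \in X$, the function $\omega \longmapsto f(\omega) - y$ is $\Sigma \cap C_0$-measurable. Therefore, the function $\omega \longmapsto \big\| f(\omega) - y \big\|_X$ is $\Sigma \cap C_0$-measurable. Let $\{z_n\}_{n\ge 1}$ be dense in $f(\Omega)$. For a given $\varepsilon > 0$, we define
\[
D_n \stackrel{df}{=} \big\{ \omega \in C_0 : \| f(\omega) - z_n \|_X < \varepsilon \big\}.
\]
Evidently
\[
D_n \in \Sigma \cap C_0 \qquad \text{and} \qquad C_0 = \bigcup_{n=1}^{\infty} D_n.
\]
Let
\[
E_n \stackrel{df}{=} D_n \setminus \bigcup_{i=1}^{n-1} D_i.
\]
Then $\{E_n\}_{n\ge 1} \subseteq \Sigma \cap C_0$ is a sequence of disjoint sets and
\[
C_0 = \bigcup_{n=1}^{\infty} E_n.
\]
We define
\[
f_\varepsilon(\omega) \stackrel{df}{=}
\begin{cases}
z_n & \text{if } \omega \in E_n,\ n \ge 1, \\
0 & \text{if } \omega \in \Omega \setminus C_0.
\end{cases}
\]
Clearly $f_\varepsilon : \Omega \longrightarrow X$ is Σ-measurable, countably-valued (i.e., takes countably many values) and
\[
\| f(\omega) - f_\varepsilon(\omega) \|_X < \varepsilon \qquad \forall\, \omega \in \Omega.
\]
Taking $\varepsilon = \frac{1}{k}$, $k \ge 1$, we see that $f$ is the uniform limit of a sequence $\big\{ f_{\frac{1}{k}} \big\}_{k\ge 1}$ of countably-valued functions, hence $f$ is strongly measurable.

An interesting byproduct of the previous proof is the following result.

COROLLARY 2.1.4 A function $f : \Omega \longrightarrow X$ is strongly measurable if and only if it is the uniform limit almost everywhere of a sequence of countably-valued, Σ-measurable functions.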

By virtue of Theorem 2.1.3, we see that the measurability situation of $X$-valued functions simplifies considerably when $X$ is separable.

THEOREM 2.1.5 If $X$ is separable and $f : \Omega \longrightarrow X$, then the following three properties are equivalent:
(a) $f$ is strongly measurable;
(b) $f$ is Borel measurable;
(c) $f$ is weakly measurable.

REMARK 2.1.6 The usual facts regarding the stability of strongly measurable functions under sum, scalar multiplication and pointwise µ-almost everywhere limits hold. Also by just replacing absolute values by norms in the proof of the classical Egorov’s theorem (see Theorem A.2.12), we see that the result generalizes to $X$-valued functions. Finally for any Banach space $X$ and a strongly measurable function $f : \Omega \longrightarrow X$, the function $\omega \longmapsto \|f(\omega)\|_X$ is Σ-measurable. Indeed, if $\{s_n\}_{n\ge 1}$ is the sequence of $X$-valued simple functions, such that $s_n(\omega) \longrightarrow f(\omega)$ in $X$ for µ-a.a. $\omega \in \Omega$, then
\[
\Big| \big\| f(\omega) \big\|_X - \big\| s_n(\omega) \big\|_X \Big| \le \big\| f(\omega) - s_n(\omega) \big\|_X \longrightarrow 0 \qquad \text{for µ-a.a. } \omega \in \Omega
\]
and so the function $\omega \longmapsto \big\| f(\omega) \big\|_X$ is Σ-measurable.

EXAMPLE 2.1.7 It can be shown that weak measurability does not imply strong measurability. Because of Theorem 2.1.5, we look for functions with values in a nonseparable Banach space. So consider the nonseparable Hilbert space $X = l^2\big( [0, 1] \big)$ and let $\{e_t\}_{t \in [0,1]}$ be an orthonormal basis. The function $f : [0, 1] \longrightarrow l^2\big( [0, 1] \big)$ defined by $f(t) = e_t$ is weakly measurable, since for every $x^* \in l^2\big( [0, 1] \big)^* = l^2\big( [0, 1] \big)$, we have
\[
\big( x^*, f(t) \big)_X = \big( x^*, e_t \big)_X = 0 \qquad \text{for all but countably many } t \in [0, 1].
\]
On the other hand, if $A \subseteq [0, 1]$, then $f\big( [0, 1] \setminus A \big)$ is separable if and only if $[0, 1] \setminus A$ is countable and so we cannot have $\lambda^1(A) = 0$. Therefore by virtue of Corollary 2.1.4, $f$ is not strongly measurable.

Now we are ready to define the Bochner integral for strongly measurable functions.

DEFINITION 2.1.8 (a) Let
\[
s(\omega) \stackrel{df}{=} \sum_{k=1}^N x_k \chi_{C_k}(\omega), \qquad x_k \in X, \quad C_k \in \Sigma
\]
be an $X$-valued simple function. The Bochner integral of $s$ is defined by
\[
\int_\Omega s(\omega)\, d\mu(\omega) \stackrel{df}{=} \sum_{k=1}^N \mu(C_k) x_k.
\]
(b) A function $f : \Omega \longrightarrow X$ is said to be Bochner integrable, if there exists a sequence $\{s_n\}_{n\ge 1}$ of simple functions, such that
\[
\lim_{n \to +\infty} \int_\Omega \big\| f(\omega) - s_n(\omega) \big\|_X\, d\mu(\omega) = 0.
\]
If $A \in \Sigma$, we define the Bochner integral $\int_A f(\omega)\, d\mu(\omega)$ of $f$ on $A$, by
\[
\int_A f(\omega)\, d\mu(\omega) = \lim_{n \to +\infty} \int_\Omega \chi_A(\omega) s_n(\omega)\, d\mu(\omega). \tag{2.1}
\]
Instead of $\int_\Omega f(\omega)\, d\mu(\omega)$, we will often write $\int_\Omega f(\omega)\, d\mu$ or even $\int_\Omega f\, d\mu$, when no confusion is possible.

REMARK 2.1.9 It is easy to verify that in Definition 2.1.8(b), the limit in (2.1) exists and is independent of the sequence of simple functions $\{s_n\}_{n\ge 1}$ with the properties postulated there.

The next proposition gives a necessary and sufficient condition for the Bochner integrability of a function $f : \Omega \longrightarrow X$.

PROPOSITION 2.1.10 A strongly measurable function $f : \Omega \longrightarrow X$ is Bochner integrable if and only if the function $\omega \longmapsto \big\| f(\omega) \big\|_X$ is Lebesgue integrable (i.e., $\big\| f(\cdot) \big\|_X \in L^1(\Omega)$).

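PROOF “$\Longrightarrow$”: Let $\{s_n\}_{n\ge 1}$ be a sequence of $X$-valued simple functions, such that
\[
\int_\Omega \big\| f(\omega) - s_n(\omega) \big\|_X\, d\mu \longrightarrow 0.
\]
Then for any $n \ge 1$, we have
\[
\int_\Omega \big\| f(\omega) \big\|_X\, d\mu \le \int_\Omega \big\| f(\omega) - s_n(\omega) \big\|_X\, d\mu + \int_\Omega \big\| s_n(\omega) \big\|_X\, d\mu < +\infty,
\]
so $\|f(\cdot)\|_X \in L^1(\Omega)$.

“$\Longleftarrow$”: Since $f : \Omega \longrightarrow X$ is strongly measurable, we can find a sequence $\{s_n\}_{n\ge 1}$ of $X$-valued, simple functions, such that
\[
\lim_{n \to +\infty} \big\| f(\omega) - s_n(\omega) \big\|_X = 0 \qquad \forall\, \omega \in \Omega \setminus A,
\]
with $\mu(A) = 0$. Hence
\[
\lim_{n \to +\infty} \big\| s_n(\omega) \big\|_X = \big\| f(\omega) \big\|_X \qquad \forall\, \omega \in \Omega \setminus A.
\]
Let $h_n : \Omega \longrightarrow X$, $n \ge 1$, be defined by
\[
h_n(\omega) \stackrel{df}{=}
\begin{cases}
s_n(\omega) & \text{if } \|s_n(\omega)\|_X < 2 \|f(\omega)\|_X, \\
0 & \text{otherwise}.
\end{cases}
\]
Evidently for every $n \ge 1$, $h_n$ is an $X$-valued simple function. Also
\[
\lim_{n \to +\infty} \big\| f(\omega) - h_n(\omega) \big\|_X = 0 \qquad \forall\, \omega \in \Omega \setminus A
\]
and
\[
\big\| f(\omega) - h_n(\omega) \big\|_X \le 3 \big\| f(\omega) \big\|_X \qquad \forall\, \omega \in \Omega \setminus A.
\]
So by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
\[
\lim_{n \to +\infty} \int_\Omega \big\| f(\omega) - h_n(\omega) \big\|_X\, d\mu = 0,
\]
so $f$ is Bochner integrable (see Definition 2.1.8(b)).

COROLLARY 2.1.11 If $f : \Omega \longrightarrow X$ is a Bochner integrable function and $A \in \Sigma$, then
\[
\bigg\| \int_A f(\omega)\, d\mu \bigg\|_X \le \int_A \big\| f(\omega) \big\|_X\, d\mu.
\]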

PROOF It is clear that the corollary holds for any simple function $s : \Omega \longrightarrow X$. Then use Proposition 2.1.10.

It is a direct consequence of Definition 2.1.8(b) that the Bochner integral is a linear operator. Namely we have the following proposition.

PROPOSITION 2.1.12 If $f, g : \Omega \longrightarrow X$ are two Bochner integrable functions, $A \in \Sigma$ and $\xi \in \mathbb{R}$, then $f + \xi g$ is Bochner integrable too and
\[
\int_A \big( f + \xi g \big)(\omega)\, d\mu = \int_A f(\omega)\, d\mu + \xi \int_A g(\omega)\, d\mu.
\]
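To make Definition 2.1.8(a) and Corollary 2.1.11 concrete, here is a small worked example. The data are our own illustrative choice, not taken from the text: $\Omega = [0, 1]$ with the Lebesgue measure $\lambda^1$, $X = \mathbb{R}^2$ with the Euclidean norm, and the simple function $s = x_1 \chi_{[0, \frac{1}{3})} + x_2 \chi_{[\frac{1}{3}, 1]}$ with $x_1 = (1, 0)$ and $x_2 = (0, 2)$.

```latex
% Bochner integral of the simple function: weighted sum of its values.
\[
\int_\Omega s\, d\lambda^1
= \tfrac{1}{3}\,(1, 0) + \tfrac{2}{3}\,(0, 2)
= \big( \tfrac{1}{3}, \tfrac{4}{3} \big).
\]
% Norm inequality of Corollary 2.1.11 with A = \Omega.
\[
\bigg\| \int_\Omega s\, d\lambda^1 \bigg\|_{\mathbb{R}^2}
= \frac{\sqrt{17}}{3} \approx 1.374
\;\le\;
\int_\Omega \| s \|_{\mathbb{R}^2}\, d\lambda^1
= \tfrac{1}{3} \cdot 1 + \tfrac{2}{3} \cdot 2 = \tfrac{5}{3} \approx 1.667 .
\]
```

The inequality is strict here because the two values of $s$ point in different directions; equality would require them to be nonnegative multiples of one vector.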

The Lebesgue dominated convergence theorem (see Theorem A.2.2) applies also to Bochner integrable functions. PROPOSITION 2.1.13 If f : Ω −→ X is a strongly measurable function, fn : Ω −→ X, n > 1 are Bochner integrable, fn (ω) −→ f (ω)

for µ-a.a. ω ∈ Ω

and there exists h ∈ L1 (Ω)+ , such that ° ° °fn (ω)° 6 h(ω) for µ-a.a. ω ∈ Ω, and all n > 1, X then f is Bochner integrable and we have Z Z f (ω) dµ = lim fn (ω) dµ

∀ A ∈ Σ.

n→+∞

A

PROOF

A

Clearly ° ° °f (ω)° 6 h(ω) for µ-a.a. ω ∈ Ω X

and the function

° ° ω 7−→ °f (ω) − fn (ω)°X

is Σ-measurable for every n > 1. Since ° ° °f (ω) − fn (ω)° 6 2h(ω) X we have that

° ° °f (·) − fn (·)° ∈ L1 (Ω) X

for µ-a.a. ω ∈ Ω,

∀ n > 1.

2. Lebesgue-Bochner and Sobolev Spaces

115

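Thus by the Lebesgue dominated convergence theorem (see Theorem A.2.2) for $\mathbb{R}$-valued functions, we have
\[ \int_\Omega \|f(\omega) - f_n(\omega)\|_X \, d\mu \to 0. \tag{2.2} \]
By virtue of Definition 2.1.8(b), for each $n \geq 1$, we can find an $X$-valued step function $s_n$, such that
\[ \int_\Omega \|f_n(\omega) - s_n(\omega)\|_X \, d\mu < \frac{1}{n}. \]
We have
\[ \int_\Omega \|f(\omega) - s_n(\omega)\|_X \, d\mu \leq \int_\Omega \|f(\omega) - f_n(\omega)\|_X \, d\mu + \int_\Omega \|f_n(\omega) - s_n(\omega)\|_X \, d\mu \to 0 \quad \text{as } n \to +\infty, \]
so $f$ is Bochner integrable. Moreover, from Corollary 2.1.11 and (2.2), we have
\[ \left\| \int_A f(\omega)\, d\mu - \int_A s_n(\omega)\, d\mu \right\|_X \leq \int_A \|f(\omega) - s_n(\omega)\|_X \, d\mu \leq \int_\Omega \|f(\omega) - s_n(\omega)\|_X \, d\mu \to 0 \quad \text{as } n \to +\infty. \]
In the same way, Corollary 2.1.11 and (2.2) give
\[ \left\| \int_A f(\omega)\, d\mu - \int_A f_n(\omega)\, d\mu \right\|_X \leq \int_\Omega \|f(\omega) - f_n(\omega)\|_X \, d\mu \to 0 \quad \text{as } n \to +\infty, \]
which is the asserted convergence.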
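The domination hypothesis in Proposition 2.1.13 cannot simply be dropped. The following sketch illustrates this under assumptions that are not part of the text: $\Omega = [0,1]$ with the Lebesgue measure $\lambda^1$, a fixed vector $x_0 \in X \setminus \{0\}$, and the hypothetical sequence $f_n$ defined below.

```latex
\begin{align*}
f_n(\omega) &= n\, \chi_{(0,1/n)}(\omega)\, x_0 , \qquad n \geq 1, \\
f_n(\omega) &\to 0 \quad \text{for every } \omega \in [0,1], \\
\int_\Omega f_n(\omega)\, d\lambda^1 &= n \cdot \tfrac{1}{n}\, x_0 = x_0 \neq 0
   \quad \text{for all } n \geq 1, \\
\sup_{n \geq 1} \|f_n(\omega)\|_X &\geq \Big( \tfrac{1}{\omega} - 1 \Big) \|x_0\|_X
   \quad \text{for } \omega \in (0,1),
   \qquad \int_0^1 \Big( \tfrac{1}{\omega} - 1 \Big)\, d\omega = +\infty .
\end{align*}
```

So the integrals do not converge to the integral of the pointwise limit, and no $h \in L^1(\Omega)_+$ can dominate all the $f_n$.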

Also we have a version of Fatou's lemma (see Theorem A.2.1).

PROPOSITION 2.1.14 If $f_n : \Omega \to X$, $n \geq 1$, are Bochner integrable, $\left\{ \int_\Omega \|f_n\|_X \, d\mu \right\}_{n \geq 1}$ is bounded and $f_n(\omega) \xrightarrow{w} f(\omega)$ for a.a. $\omega \in \Omega$, then $f$ is Bochner integrable and
\[ \int_\Omega \|f(\omega)\|_X \, d\mu \leq \liminf_{n\to+\infty} \int_\Omega \|f_n(\omega)\|_X \, d\mu. \]


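PROOF Evidently $f$ is weakly measurable. Also by Theorem 2.1.3 for every $n \geq 1$, we can find $A_n \in \Sigma$ which is $\mu$-null and such that $f_n(\Omega \setminus A_n)$ is separable. Let $C \in \Sigma$ be the $\mu$-null set, such that for $\omega \in \Omega \setminus C$, we have
\[ f_n(\omega) \xrightarrow{w} f(\omega). \]
Let
\[ A \stackrel{df}{=} \Big( \bigcup_{n=1}^{\infty} A_n \Big) \cup C. \]
Then $A \in \Sigma$ and it is $\mu$-null. Let
\[ Y \stackrel{df}{=} \overline{\mathrm{span}} \bigcup_{n=1}^{\infty} f_n(\Omega \setminus A_n). \]
Evidently $Y$ is a separable Banach subspace of $X$ and $f(\Omega \setminus A) \subseteq Y$, so $f$ is strongly measurable (see Theorem 2.1.3). By virtue of the weak lower semicontinuity of the norm functional in a Banach space, we have
\[ \|f(\omega)\|_X \leq \liminf_{n\to+\infty} \|f_n(\omega)\|_X \quad \text{for $\mu$-a.a. } \omega \in \Omega. \]
Since $\|f_n(\cdot)\|_X$, $n \geq 1$, is Lebesgue integrable (see Proposition 2.1.10), by Fatou's lemma (see Theorem A.2.1), we have
\[ \int_\Omega \|f(\omega)\|_X \, d\mu \leq \liminf_{n\to+\infty} \int_\Omega \|f_n(\omega)\|_X \, d\mu. \]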

DEFINITION 2.1.15 A set function $m : \Sigma \to X$ is said to be a vector measure, if for all sequences $\{A_n\}_{n \geq 1} \subseteq \Sigma$ of pairwise disjoint sets, we have
\[ m\Big( \bigcup_{n=1}^{\infty} A_n \Big) = \sum_{n=1}^{\infty} m(A_n), \]
where the series converges in the norm topology of $X$.

The next proposition shows that the indefinite Bochner integral
\[ A \mapsto \int_A f \, d\mu \]
of a Bochner integrable function $f : \Omega \to X$ is a vector measure which is absolutely continuous with respect to $\mu$ (i.e., $m \ll \mu$).


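PROPOSITION 2.1.16 If $f : \Omega \to X$ is a Bochner integrable function, then the set function $m : \Sigma \to X$ defined by
\[ m(A) \stackrel{df}{=} \int_A f(\omega)\, d\mu \quad \forall\, A \in \Sigma \]
is a vector measure and $m \ll \mu$, i.e.,
\[ \lim_{\mu(A) \searrow 0} m(A) = 0. \]

PROOF Let $\{A_n\}_{n \geq 1} \subseteq \Sigma$ be a sequence of pairwise disjoint sets. Since
\[ \left\| \int_{A_n} f(\omega)\, d\mu \right\|_X \leq \int_{A_n} \|f(\omega)\|_X \, d\mu \quad \forall\, n \geq 1, \]
the series
\[ \sum_{n=1}^{\infty} \int_{A_n} f(\omega)\, d\mu \]
is dominated term-by-term by the convergent series of positive terms
\[ \sum_{n=1}^{\infty} \int_{A_n} \|f(\omega)\|_X \, d\mu \leq \int_\Omega \|f(\omega)\|_X \, d\mu < +\infty \]
(see Proposition 2.1.10). Therefore the series
\[ \sum_{n=1}^{\infty} \int_{A_n} f(\omega)\, d\mu \]
is absolutely convergent. Moreover, for all $k \geq 1$, we have
\[ \left\| \int_{\bigcup_{n=1}^{\infty} A_n} f(\omega)\, d\mu - \sum_{n=1}^{k} \int_{A_n} f(\omega)\, d\mu \right\|_X = \left\| \int_{\bigcup_{n=k+1}^{\infty} A_n} f(\omega)\, d\mu \right\|_X \leq \int_{\bigcup_{n=k+1}^{\infty} A_n} \|f(\omega)\|_X \, d\mu \to 0 \quad \text{as } k \to +\infty, \]
so
\[ m\Big( \bigcup_{n=1}^{\infty} A_n \Big) = \sum_{n=1}^{\infty} m(A_n), \]
i.e., $m$ is a vector measure. Since $\|f(\cdot)\|_X \in L^1(\Omega)$, from the absolute continuity of the Lebesgue integral, we have
\[ \lim_{\mu(A) \searrow 0} \int_A \|f(\omega)\|_X \, d\mu = 0. \]
From Corollary 2.1.11, we have
\[ \lim_{\mu(A) \searrow 0} \|m(A)\|_X = \lim_{\mu(A) \searrow 0} \left\| \int_A f(\omega)\, d\mu \right\|_X \leq \lim_{\mu(A) \searrow 0} \int_A \|f(\omega)\|_X \, d\mu = 0 \]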

and so $m \ll \mu$.

Thus far the theory of Bochner integration is a straightforward extension of the theory of Lebesgue integration, with the absolute values replaced by norms. The next theorem exhibits a strong property of the Bochner integral that has no counterpart in the theory of Lebesgue integration.

THEOREM 2.1.17 If $Y$ is another Banach space, $L : X \supseteq D \to Y$ is a closed linear operator and $f : \Omega \to X$ is a Bochner integrable function, such that $L(f(\cdot)) : \Omega \to Y$ is Bochner integrable too, then
\[ L\Big( \int_A f(\omega)\, d\mu \Big) = \int_A L\big( f(\omega) \big)\, d\mu \quad \forall\, A \in \Sigma. \]

PROOF Let
\[ C_0 \stackrel{df}{=} \big\{ \omega \in \Omega : \|f(\omega)\|_X > 0 \big\} \in \Sigma. \]
By Corollary 2.1.4, for a given $\varepsilon > 0$, we can find countably valued functions $h_\varepsilon : C_0 \to X$ and $g_\varepsilon : C_0 \to Y$, such that
\[ \sup_{\omega \in \Omega \setminus E} \|f(\omega) - h_\varepsilon(\omega)\|_X < \frac{\varepsilon}{2} \quad \text{and} \quad \sup_{\omega \in \Omega \setminus E} \big\| L\big(f(\omega)\big) - g_\varepsilon(\omega) \big\|_Y < \frac{\varepsilon}{2}, \]
with $E \in \Sigma$ being a $\mu$-null set. Let $\{B_n\}_{n \geq 1} \subseteq \Sigma \cap C_0$ be a common refinement of the subdivisions corresponding to $h_\varepsilon$ and $g_\varepsilon$ and let
\[ \omega_n \in B_n \quad \forall\, n \geq 1. \]


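We introduce the function $u_\varepsilon : \Omega \to X$, defined by
\[ u_\varepsilon(\omega) \stackrel{df}{=} \begin{cases} f(\omega_n) & \text{if } \omega \in B_n,\ n \geq 1, \\ 0 & \text{if } \omega \in \Omega \setminus C_0. \end{cases} \]
Then, we have
\[ \int_\Omega \|f(\omega) - u_\varepsilon(\omega)\|_X \, d\mu < \varepsilon \mu(\Omega) \tag{2.3} \]
and
\[ \int_\Omega \big\| L\big(f(\omega)\big) - L\big(u_\varepsilon(\omega)\big) \big\|_Y \, d\mu < \varepsilon \mu(\Omega). \tag{2.4} \]
Also for every $A \in \Sigma$, we have
\[ \int_A u_\varepsilon(\omega)\, d\mu = \sum_{n=1}^{\infty} f(\omega_n)\, \mu(B_n \cap A) = \lim_{N \to +\infty} \sum_{n=1}^{N} f(\omega_n)\, \mu(B_n \cap A) \tag{2.5} \]
and
\[ \int_A L\big(u_\varepsilon(\omega)\big)\, d\mu = \sum_{n=1}^{\infty} L\big(f(\omega_n)\big)\, \mu(B_n \cap A) = \lim_{N \to +\infty} \sum_{n=1}^{N} L\big(f(\omega_n)\big)\, \mu(B_n \cap A). \tag{2.6} \]
Since by hypothesis $L$ is a closed, linear operator, from (2.5) and (2.6), we have that
\[ \Big( \int_A u_\varepsilon \, d\mu,\ \int_A L(u_\varepsilon)\, d\mu \Big) \in \mathrm{Gr}\, L. \]
Consider a sequence $\varepsilon_n \searrow 0$. From (2.3) and (2.4), we have
\[ \int_A u_{\varepsilon_n} \, d\mu \to \int_A f \, d\mu \quad \text{and} \quad \int_A L(u_{\varepsilon_n})\, d\mu \to \int_A L(f)\, d\mu. \]
Since
\[ L\Big( \int_A u_{\varepsilon_n} \, d\mu \Big) = \int_A L(u_{\varepsilon_n})\, d\mu \quad \forall\, n \geq 1 \]
and $L$ is closed, it follows that
\[ L\Big( \int_A f(\omega)\, d\mu \Big) = \int_A L\big(f(\omega)\big)\, d\mu \quad \forall\, A \in \Sigma. \]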


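REMARK 2.1.18 If $f : \Omega \to X$ is a Bochner integrable function and $L \in \mathcal{L}(X; Y)$, then $L(f)$ is Bochner integrable, since
\[ \big\| L\big(f(\omega)\big) \big\|_Y \leq \|L\|_{\mathcal{L}} \, \|f(\omega)\|_X \quad \forall\, \omega \in \Omega. \]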
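The special case of Theorem 2.1.17 that is used repeatedly in the sequel is the one where $L$ is a continuous linear functional. A minimal sketch, assuming $Y = \mathbb{R}$ and $L = x^* \in X^*$ (so that Remark 2.1.18 supplies the integrability of $L(f)$):

```latex
\begin{align*}
\Big\langle x^*, \int_A f(\omega)\, d\mu \Big\rangle_X
  &= \int_A \big\langle x^*, f(\omega) \big\rangle_X \, d\mu
  \qquad \forall\, x^* \in X^*,\ A \in \Sigma, \\
\left| \int_A \big\langle x^*, f(\omega) \big\rangle_X \, d\mu \right|
  &\leq \|x^*\|_{X^*} \int_A \|f(\omega)\|_X \, d\mu .
\end{align*}
```

In words, the Bochner integral commutes with every element of the dual; this is the form in which the theorem enters the proofs of Corollary 2.1.19 and Proposition 2.1.21.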

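COROLLARY 2.1.19 If $f, g : \Omega \to X$ are two Bochner integrable functions and
\[ \int_A f(\omega)\, d\mu = \int_A g(\omega)\, d\mu \quad \forall\, A \in \Sigma, \]
then $f(\omega) = g(\omega)$ for $\mu$-almost all $\omega \in \Omega$.

PROOF We may assume that $g = 0$ and that $X$ is separable (see Theorem 2.1.3). Then the ball
\[ \overline{B}{}^*_1 = \big\{ x^* \in X^* : \|x^*\|_{X^*} \leq 1 \big\} \]
furnished with the relative weak$^*$-topology is compact, metrizable (see Alaoglu theorem; Theorem A.3.9 and Theorem A.3.13). Let $\{x_n^*\}_{n \geq 1}$ be a countable $w^*$-dense subset of $\overline{B}{}^*_1$. By Theorem 2.1.17, for every $n \geq 1$ and $A \in \Sigma$, we have
\[ \int_A \big\langle x_n^*, f(\omega) \big\rangle_X \, d\mu = \Big\langle x_n^*, \int_A f(\omega)\, d\mu \Big\rangle_X = 0, \]
so
\[ \big\langle x_n^*, f(\omega) \big\rangle_X = 0 \quad \text{for $\mu$-a.a. } \omega \in \Omega. \]
Since
\[ \|f(\omega)\|_X = \sup_{x^* \in \overline{B}{}^*_1} \big| \langle x^*, f(\omega) \rangle_X \big|, \]
we have
\[ \|f(\omega)\|_X = 0 \quad \text{for $\mu$-a.a. } \omega \in \Omega \]
and so
\[ f(\omega) = 0 \quad \text{for $\mu$-a.a. } \omega \in \Omega. \]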

A similar proof gives us the following result.


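COROLLARY 2.1.20 If $f, g : \Omega \to X$ are two strongly measurable functions and
\[ \big\langle x^*, f(\omega) \big\rangle_X = \big\langle x^*, g(\omega) \big\rangle_X \quad \text{for $\mu$-a.a. } \omega \in \Omega \text{ and all } x^* \in X^* \]
(the exceptional $\mu$-null set may depend on $x^* \in X^*$), then $f(\omega) = g(\omega)$ for $\mu$-a.a. $\omega \in \Omega$.

The next result can be viewed as a kind of mean value theorem for the Bochner integral.

PROPOSITION 2.1.21 If $f : \Omega \to X$ is a Bochner integrable function and $A \in \Sigma$ with $\mu(A) > 0$, then
\[ \frac{1}{\mu(A)} \int_A f(\omega)\, d\mu \in \overline{\mathrm{conv}}\, f(A). \]

PROOF We proceed by contradiction. Suppose that
\[ \frac{1}{\mu(A)} \int_A f(\omega)\, d\mu \notin \overline{\mathrm{conv}}\, f(A). \]
Then by the strong separation theorem for convex sets (see Theorem A.3.2), we can find $x^* \in X^* \setminus \{0\}$ and $\vartheta \in \mathbb{R}$, such that
\[ \Big\langle x^*, \frac{1}{\mu(A)} \int_A f(\omega)\, d\mu \Big\rangle_X < \vartheta \leq \big\langle x^*, f(\omega) \big\rangle_X \quad \forall\, \omega \in A, \]
so using Theorem 2.1.17, we have
\[ \frac{1}{\mu(A)} \int_A \big\langle x^*, f(\omega) \big\rangle_X \, d\mu < \vartheta \leq \big\langle x^*, f(\omega) \big\rangle_X \quad \forall\, \omega \in A. \]
Integrating this inequality over $A$, we obtain
\[ \int_A \big\langle x^*, f(\omega) \big\rangle_X \, d\mu < \vartheta \mu(A) \leq \int_A \big\langle x^*, f(\omega) \big\rangle_X \, d\mu, \]
a contradiction.

Also for Bochner integrable functions, the Lebesgue differentiation theorem holds (see Theorem 1.4.6). So we have the following result.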


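PROPOSITION 2.1.22 If $Z \subseteq \mathbb{R}^N$ is a bounded open set and $f : Z \to X$ is a Bochner integrable function, then
\[ \lim_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \|f(y) - f(x)\|_X \, d\lambda^N(y) = 0 \quad \text{for $\lambda^N$-a.a. } x \in Z, \]
where
\[ a(N) \stackrel{df}{=} \frac{\pi^{N/2}}{\left( \frac{N}{2} \right)!} \]
is the volume of the unit ball in $\mathbb{R}^N$ (with $\left(\frac{N}{2}\right)!$ understood as $\Gamma\left(\frac{N}{2} + 1\right)$).

PROOF Invoking Theorem 2.1.3, we may assume that $X$ is separable. Let $\{x_n\}_{n \geq 1}$ be a dense set in $X$. Then by Theorem 1.4.6, we have
\[ \lim_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \|f(y) - x_n\|_X \, d\lambda^N(y) = \|f(x) - x_n\|_X \quad \text{for $\lambda^N$-a.a. } x \in Z \text{ and all } n \geq 1. \tag{2.7} \]
Let $x \in Z$ be a point where (2.7) is valid. Then for a given $\varepsilon > 0$, we can select $x_n$, such that
\[ \|f(x) - x_n\|_X < \varepsilon. \]
We have
\[ \limsup_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \|f(y) - f(x)\|_X \, d\lambda^N(y) \leq \limsup_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \Big[ \|f(y) - x_n\|_X + \|x_n - f(x)\|_X \Big] d\lambda^N(y) < 2\varepsilon, \]
and since $\varepsilon > 0$ was arbitrary,
\[ \limsup_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \|f(y) - f(x)\|_X \, d\lambda^N(y) = 0 \quad \text{for $\lambda^N$-a.a. } x \in Z. \]

We conclude this section by introducing three weaker integrals for Banach space valued functions.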

DEFINITION 2.1.23 Let $f : \Omega \to X$ be a function.

(a) Suppose that $f : \Omega \to X$ is weakly measurable. We say that $f$ is Pettis integrable, if for each $A \in \Sigma$, there exists $x_A \in X$, such that
\[ \langle x^*, x_A \rangle_X = \int_A \big\langle x^*, f(\omega) \big\rangle_X \, d\mu \quad \forall\, x^* \in X^*. \]
Then we write
\[ x_A = (P)\text{-}\!\int_A f(\omega)\, d\mu. \]

(b) Suppose that $f : \Omega \to X$ is weakly measurable. We say that $f$ is Dunford integrable, if for each $A \in \Sigma$, there exists $x_A^{**} \in X^{**}$, such that
\[ \langle x_A^{**}, x^* \rangle_{X^*} = \int_A \big\langle x^*, f(\omega) \big\rangle_X \, d\mu \quad \forall\, x^* \in X^*. \]
Then we write
\[ x_A^{**} = (D)\text{-}\!\int_A f(\omega)\, d\mu. \]

(c) Suppose that $f : \Omega \to X^*$ is $w^*$-measurable. We say that $f$ is Gelfand integrable, if for each $A \in \Sigma$, there exists $x_A^* \in X^*$, such that
\[ \langle x_A^*, x \rangle_X = \int_A \big\langle f(\omega), x \big\rangle_X \, d\mu \quad \forall\, x \in X. \]
Then we write
\[ x_A^* = (G)\text{-}\!\int_A f(\omega)\, d\mu. \]
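To see that the Pettis integral of Definition 2.1.23(a) is genuinely weaker than the Bochner integral, one can look at the following sketch. It is a standard type of example and not taken from the text; the choices $\Omega = (0,1]$ with the Lebesgue measure $\lambda^1$, $X = c_0$ with unit vectors $e_n$ (so $X^* = l^1$), and the intervals $I_n = (2^{-n}, 2^{-n+1}]$ are all assumptions made for illustration.

```latex
\begin{align*}
f(\omega) &= \sum_{n=1}^{\infty} \frac{2^n}{n}\, \chi_{I_n}(\omega)\, e_n ,
  \qquad \lambda^1(I_n) = 2^{-n}, \\
\int_\Omega \|f(\omega)\|_{c_0} \, d\lambda^1
  &= \sum_{n=1}^{\infty} \frac{2^n}{n} \cdot 2^{-n}
   = \sum_{n=1}^{\infty} \frac{1}{n} = +\infty , \\
\int_A \big\langle x^*, f(\omega) \big\rangle_{c_0} \, d\lambda^1
  &= \sum_{n=1}^{\infty} a_n\, \frac{2^n \lambda^1(A \cap I_n)}{n}
  \qquad \forall\, x^* = (a_n)_{n \geq 1} \in l^1 , \\
x_A &= \Big( \frac{2^n \lambda^1(A \cap I_n)}{n} \Big)_{n \geq 1} \in c_0 ,
  \qquad 0 \leq \frac{2^n \lambda^1(A \cap I_n)}{n} \leq \frac{1}{n} .
\end{align*}
```

The function $f$ is countably valued, hence strongly measurable. The second line shows that it is not Bochner integrable (see Proposition 2.1.10), while the last two lines exhibit the element $x_A \in c_0$ required by Definition 2.1.23(a); the series in the third line converges absolutely because its terms are bounded by $|a_n|/n$.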

REMARK 2.1.24 Clearly we have that Bochner integrability implies Pettis integrability and Pettis integrability implies Dunford integrability. The reverse implications need not be true. Of course if $X$ is reflexive, then the Pettis and Dunford integrals coincide. Finally note that the Gelfand integral is actually the Pettis integral for $X^*$-valued functions.

For a Pettis integrable function $f : \Omega \to X$, we consider the set function
\[ A \mapsto (P)\text{-}\!\int_A f \, d\mu. \]
We want to know if this is a $\mu$-continuous vector measure, as was the case with the Bochner integral (see Proposition 2.1.16). To answer this we need some preparation.


DEFINITION 2.1.25 Let $\sum_{n=1}^{\infty} x_n$ be a series of elements of $X$.

(a) We say that the series $\sum_{n=1}^{\infty} x_n$ is unconditionally convergent to $x$, if for all permutations $\pi$ of $\mathbb{N}$, the series $\sum_{n=1}^{\infty} x_{\pi(n)}$ converges to $x$.

(b) We say that the series $\sum_{n=1}^{\infty} x_n$ is weakly subseries convergent, if for every strictly increasing sequence $\{n_k\}_{k \geq 1}$ of integers, the series $\sum_{k=1}^{\infty} x_{n_k}$ is weakly convergent.

REMARK 2.1.26 If $\sum_{n=1}^{\infty} x_n$ is absolutely convergent (i.e., the series $\sum_{n=1}^{\infty} \|x_n\|_X$ is convergent), then it is unconditionally convergent. Also unconditional convergence is equivalent to subseries convergence (in the norm topology of $X$) and implies convergence.

The next result is known as the Orlicz-Pettis theorem.

THEOREM 2.1.27 (Orlicz-Pettis Theorem) A formal series $\sum_{n=1}^{\infty} x_n$ in $X$ is unconditionally convergent if and only if it is weakly subseries convergent.

REMARK 2.1.28 An interesting consequence of the above theorem is that if $m : \Sigma \to X$ is a weakly countably additive set function, then it is a vector measure.

PROPOSITION 2.1.29 If $f : \Omega \to X$ is Pettis integrable, then the function
\[ \Sigma \ni A \mapsto m(A) = (P)\text{-}\!\int_A f \, d\mu \]
is a vector measure.


PROOF Let $\{A_n\}_{n \geq 1}$ be a sequence of pairwise disjoint sets in $\Sigma$. For every $x^* \in X^*$, we have
\[ \Big\langle x^*, (P)\text{-}\!\int_{\bigcup_{n=1}^{\infty} A_n} f(\omega)\, d\mu \Big\rangle_X = \int_{\bigcup_{n=1}^{\infty} A_n} \big\langle x^*, f(\omega) \big\rangle_X \, d\mu = \sum_{n=1}^{\infty} \int_{A_n} \big\langle x^*, f(\omega) \big\rangle_X \, d\mu = \sum_{n=1}^{\infty} \Big\langle x^*, (P)\text{-}\!\int_{A_n} f(\omega)\, d\mu \Big\rangle_X, \]
so the function
\[ \Sigma \ni A \mapsto m(A) = (P)\text{-}\!\int_A f \, d\mu \]
is weakly countably additive. Of course the same argument applies to any subsequence of $\{A_n\}_{n \geq 1}$. So we can invoke Theorem 2.1.27 and conclude that the function
\[ \Sigma \ni A \mapsto m(A) = (P)\text{-}\!\int_A f \, d\mu \]
is a vector measure.

REMARK 2.1.30 The result is not true for the Dunford integral, which is not even strongly additive (see Diestel & Uhl (1977, p. 53)).

The next result provides an easy test for checking the Gelfand integrability of $f : \Omega \to X^*$.

PROPOSITION 2.1.31 If $f : \Omega \to X^*$ has the following property
\[ \langle f(\cdot), x \rangle_X \in L^1(\Omega) \quad \forall\, x \in X, \]
then $f$ is Gelfand integrable.

PROOF Let $A \in \Sigma$ and let $L : X \to L^1(\Omega)$ be defined by
\[ L(x) \stackrel{df}{=} \langle f(\cdot), x \rangle_X \quad \forall\, x \in X. \]
We claim that the linear operator $L$ has a closed graph. To this end suppose that
\[ x_n \to x \ \text{ in } X \quad \text{and} \quad \langle f(\cdot), x_n \rangle_X \to g \ \text{ in } L^1(\Omega). \]


Then by passing to a suitable subsequence of $\{x_n\}_{n \geq 1}$ if necessary, we may assume that
\[ \langle f(\omega), x_n \rangle_X \to g(\omega) \quad \text{for $\mu$-a.a. } \omega \in \Omega. \]
Therefore
\[ g(\omega) = \langle f(\omega), x \rangle_X \quad \text{for $\mu$-a.a. } \omega \in \Omega, \]
hence $(x, g) \in \mathrm{Gr}\, L$, i.e., $L$ has closed graph. By the closed graph theorem (see Theorem A.3.7), $L$ is continuous and if $I_A : L^1(\Omega) \to \mathbb{R}$ is the integral operator, defined by
\[ I_A(g) \stackrel{df}{=} \int_A g(\omega)\, d\mu \quad \forall\, g \in L^1(\Omega), \]
we have that $I_A \circ L \in X^*$ and so there exists $x_A^* \in X^*$, such that
\[ \langle x_A^*, x \rangle_X = \int_A \langle f(\omega), x \rangle_X \, d\mu \quad \forall\, x \in X. \]
Therefore $f$ is Gelfand integrable.

REMARK 2.1.32 The same closed graph argument shows that if $f : \Omega \to X$ is such that
\[ \langle x^*, f(\cdot) \rangle_X \in L^1(\Omega) \quad \forall\, x^* \in X^*, \]
then $f$ is Dunford integrable. The situation with Pettis integrability is less satisfactory and more sophisticated criteria are needed to establish it (see Diestel & Uhl (1977, pp. 54–56)).

COROLLARY 2.1.33 If $f : \Omega \to X^*$ is a $w^*$-measurable function whose range is norm bounded in $X^*$, then $f$ is Gelfand integrable.

The same argument as in the proof of Proposition 2.1.21 gives the following mean value theorem for the Gelfand integral.

THEOREM 2.1.34 If $f : \Omega \to X^*$ is a Gelfand integrable function and $A \in \Sigma$ with $\mu(A) > 0$, then
\[ \frac{1}{\mu(A)}\, (G)\text{-}\!\int_A f \, d\mu \in \overline{\mathrm{conv}}^{\,w^*} f(A). \]

2.2 Lebesgue-Bochner Spaces and Evolution Triples

Using the Bochner integral introduced in the previous section, we can introduce generalizations of the classical Lebesgue spaces to Banach space valued functions. As in the previous section $(\Omega, \Sigma, \mu)$ is a finite measure space and $X$ is a Banach space. Additional hypotheses will be introduced as needed.

DEFINITION 2.2.1 Let $p \in [1, +\infty]$. By $L^p(\Omega; X)$ we denote the space of equivalence classes of strongly measurable functions $f : \Omega \to X$, such that $\|f(\cdot)\|_X \in L^p(\Omega)$. Also we introduce their respective norms by
\[ \|f\|_p \stackrel{df}{=} \Big( \int_\Omega \|f(\omega)\|_X^p \, d\mu \Big)^{\frac{1}{p}} \quad \text{if } p \in [1, +\infty) \]
and
\[ \|f\|_\infty \stackrel{df}{=} \operatorname*{esssup}_{\omega \in \Omega} \|f(\omega)\|_X. \]

REMARK 2.2.2 As with $\mathbb{R}$-valued functions, the equivalence relation used in the above definition is the following:
\[ f \sim g \quad \text{if and only if} \quad f(\omega) = g(\omega) \ \text{ for $\mu$-a.a. } \omega \in \Omega. \]

It is routine to check the following facts.

PROPOSITION 2.2.3
(a) $\big( L^p(\Omega; X), \|\cdot\|_p \big)$ is a Banach space for $p \in [1, +\infty]$.
(b) If $p \in [1, +\infty)$, $\Sigma$ is countably generated and $X$ is separable, then $L^p(\Omega; X)$ is separable.
(c) If $p \in (1, +\infty)$ and $X$ is reflexive, then $L^p(\Omega; X)$ is reflexive.
(d) If $X$ is a Hilbert space, then $L^2(\Omega; X)$ is a Hilbert space too with inner product
\[ (f, g)_2 = \int_\Omega \big( f(\omega), g(\omega) \big)_X \, d\mu. \]


REMARK 2.2.4 The $\sigma$-field $\Sigma$ is countably generated if there exists a countable subfamily $\mathcal{T}$, such that $\Sigma = \sigma(\mathcal{T})$. If $\Omega$ is an open or closed subset of $\mathbb{R}^N$, then the Borel $\sigma$-field $\mathcal{B}(\Omega)$ is countably generated. Also clearly if $p \in [1, +\infty)$ and $L^p(\Omega; X)$ is separable, then $X$ is separable. Additional conditions on $X$ usually translate to corresponding properties of the Lebesgue-Bochner space $L^p(\Omega; X)$. So if $p \in (1, +\infty)$, then $L^p(\Omega; X)$ is uniformly convex if and only if $X$ is uniformly convex (see Day (1955, 1973)). Moreover, as for the Lebesgue spaces, simple functions are dense in $L^p(\Omega; X)$ and if $Z \subseteq \mathbb{R}^N$ is a bounded open set then $C^\infty\big(\overline{Z}; X\big)$ is dense in $L^p(Z; X)$ for $p \in [1, +\infty)$.

PROPOSITION 2.2.5 If $Y$ is another Banach space, $X \subseteq Y$ and the embedding is continuous, $p, r \in [1, +\infty]$, $p \leq r$, then $L^r(\Omega; X) \subseteq L^p(\Omega; Y)$ and the embedding is continuous.

PROOF Let $f \in L^r(\Omega; X)$. Since the embedding $X \subseteq Y$ is continuous, using Hölder's inequality (see Theorem A.2.27; as $p \leq r$), we have
\[ \Big( \int_\Omega \|f(\omega)\|_Y^p \, d\mu \Big)^{\frac{1}{p}} \leq c_1 \Big( \int_\Omega \|f(\omega)\|_X^p \, d\mu \Big)^{\frac{1}{p}} \leq c_2 \Big( \int_\Omega \|f(\omega)\|_X^r \, d\mu \Big)^{\frac{1}{r}}, \]
for some $c_1, c_2 > 0$. So $L^r(\Omega; X) \subseteq L^p(\Omega; Y)$ and the embedding is continuous.

We want to identify the dual of $L^p(\Omega; X)$ for $p \in [1, +\infty)$. First a definition which is motivated by the fact that the proof of the classical Riesz representation theorem (see Theorem A.3.24) uses the Radon-Nikodym theorem (see Theorem A.2.24).

DEFINITION 2.2.6
(a) Let $m : \Sigma \to X$ be a vector measure (see Definition 2.1.15). We say that $m$ is of bounded variation, if $|m|(\Omega) < +\infty$, where
\[ |m|(A) = \sup_{T_A} \sum_{C \in T_A} \|m(C)\|_X \quad \forall\, A \in \Sigma, \]
with $T_A$ running through the set of all finite $\Sigma$-partitions of $A$. The quantity $|m| : \Sigma \to \mathbb{R}_+$ is called the variation of $m$ and is a measure.
(b) A Banach space $X$ is said to have the Radon-Nikodym property (RNP for short), if for every probability space $(\Omega, \Sigma, \mu)$ and every vector measure $m : \Sigma \to X$ of bounded variation such that $m \ll \mu$ (i.e., if $\mu(A) = 0$ then $m(A) = 0$), there exists $f \in L^1(\Omega; X)$, such that
\[ m(A) = \int_A f(\omega)\, d\mu \quad \forall\, A \in \Sigma. \]

REMARK 2.2.7 The RNP is not a property that every Banach space has. Suppose that
\[ X = c_0. \]
On $\big( [0,1], \mathcal{B}([0,1]), \lambda^1 \big)$ (here $\mathcal{B}([0,1])$ is the Borel $\sigma$-field of $[0,1]$ and $\lambda^1$ is the Lebesgue measure) consider the vector measure $m : \mathcal{B}([0,1]) \to c_0$, defined by
\[ m(A) = \Big\{ \int_A \cos nt \, dt \Big\}_{n \geq 1} \quad \forall\, A \in \mathcal{B}([0,1]). \]
The Riemann-Lebesgue Lemma guarantees that
\[ m(A) \in c_0 \quad \forall\, A \in \mathcal{B}([0,1]). \]
Also $m \ll \lambda^1$. However, $m$ cannot have a Radon-Nikodym derivative (see Theorem A.2.24 and Remark A.2.25) in $L^1\big([0,1]; c_0\big)$, since
\[ \{\cos nt\}_{n \geq 1} \notin c_0 \quad \text{for a.a. } t \in [0,1]. \]
Therefore $c_0$ lacks the RNP.

However, there are two large classes of Banach spaces which have the RNP.

PROPOSITION 2.2.8 If $X$ is reflexive or it is a separable dual space, then $X$ has the RNP.

Now we state the Riesz representation theorem for the Lebesgue-Bochner spaces.

THEOREM 2.2.9 (Riesz Representation Theorem for the Lebesgue-Bochner Spaces) If $p \in [1, +\infty)$ and $\frac{1}{p} + \frac{1}{p'} = 1$, then $\big( L^p(\Omega; X) \big)^* = L^{p'}\big(\Omega; X^*\big)$ if and only if $X^*$ has the RNP and the duality pairing is given by
\[ \langle g, f \rangle_{L^p(\Omega; X)} = \int_\Omega \big\langle g(\omega), f(\omega) \big\rangle_X \, d\mu \quad \forall\, f \in L^p(\Omega; X),\ g \in L^{p'}\big(\Omega; X^*\big). \]

What can be said if $X^*$ does not have the RNP (for example when $X = C\big([0,1]\big)$)? We can still have a representation theorem for $L^1(\Omega; X)$. First a definition.


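DEFINITION 2.2.10 By $L^\infty\big(\Omega; X^*_{w^*}\big)$ we denote the space of all $w^*$-measurable functions $g : \Omega \to X^*$, such that there exists $c > 0$ with
\[ \big| \langle g(\omega), x \rangle_X \big| \leq c\, \|x\|_X \quad \text{for $\mu$-a.a. } \omega \in \Omega \text{ and all } x \in X \tag{2.8} \]
(the exceptional $\mu$-null set may depend on $x$). Two functions $g, h$ are equivalent in $L^\infty\big(\Omega; X^*_{w^*}\big)$ (denoted by $g \approx h$) if
\[ \langle g(\omega), x \rangle_X = \langle h(\omega), x \rangle_X \quad \text{for $\mu$-a.a. } \omega \in \Omega \text{ and all } x \in X. \]
The infimum of all $c > 0$ for which the above inequality (2.8) is true is denoted by $\|g\|_{L^\infty(\Omega; X^*_{w^*})}$ and we have
\[ \big| \langle g(\omega), x \rangle_X \big| \leq \|g\|_{L^\infty(\Omega; X^*_{w^*})} \|x\|_X \quad \text{for $\mu$-a.a. } \omega \in \Omega. \]
We can easily check that $\|\cdot\|_{L^\infty(\Omega; X^*_{w^*})}$ is a norm.

REMARK 2.2.11 (a) The equivalence relation in $L^\infty\big(\Omega; X^*_{w^*}\big)$ does not coincide with the usual one in the $L^p$-space, since
\[ \langle g(\omega), x \rangle_X = 0 \quad \text{for $\mu$-a.a. } \omega \in \Omega \text{ and all } x \in X \]
does not necessarily imply that
\[ g(\omega) = 0 \quad \text{for $\mu$-a.a. } \omega \in \Omega. \]
Indeed let
\[ \Omega = [0,1] \quad \text{and} \quad X = l^2\big([0,1]\big) \]
(it is a nonseparable Hilbert space). Then
\[ L^\infty\big(\Omega; X^*_{w^*}\big) = L^\infty(\Omega; X_w) \]
and let $g(\omega) = \big( g_a(\omega) \big)_{a \in [0,1]}$ with
\[ g_a(\omega) = \begin{cases} 1 & \text{if } \omega = a, \\ 0 & \text{if } \omega \neq a. \end{cases} \]
Then $g \approx 0$ in $L^\infty(\Omega; X_w)$, but
\[ \|g(\omega)\|_X = 1 \quad \text{for a.a. } \omega \in [0,1]. \]
However, if $X$ is separable and $g \in L^\infty\big(\Omega; X^*_{w^*}\big)$, then the function
\[ \omega \mapsto \|g(\omega)\|_{X^*} \]
is measurable, essentially bounded and
\[ \|g\|_{L^\infty(\Omega; X^*_{w^*})} = \operatorname*{esssup}_{\omega \in \Omega} \|g(\omega)\|_{X^*}. \]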


(b) In general, we have $L^\infty\big(\Omega; X^*_{w^*}\big) \neq L^\infty(\Omega; X^*)$, even if $X$ is separable. To see this let $\Omega = [0,1]$ and $X = C\big([0,1]\big)$. We know that $X^* = M\big([0,1]\big)$, the space of finite Borel measures on $[0,1]$ equipped with the total variation norm. Let $g : \Omega \to X^*$ be defined by
\[ g(\omega) \stackrel{df}{=} \delta_\omega, \]
the Dirac measure at $\omega \in [0,1]$. Then $g \in L^\infty\big(\Omega; X^*_{w^*}\big)$, but it is not strongly measurable, nor equivalent to any strongly measurable function. To see this, note that due to the separability of $X$, $g \approx h$ if and only if $g(\omega) = h(\omega)$ for almost all $\omega \in \Omega$ (with $h$ being strongly measurable). Then $g$ is strongly measurable too and so by virtue of Corollary 2.1.4, there exists a countably-valued function $u$, such that
\[ \|g(\omega) - u(\omega)\|_{X^*} \ [\ldots] \]

$[\ldots]$ $\{x_n^*\}_{n \geq 1} \subseteq \overline{B}{}^*_1 = \big\{ y^* \in X^* : \|y^*\|_{X^*} \leq 1 \big\}$, such that
\[ \|x\|_X = \sup_{n \geq 1} \big| \langle x_n^*, x \rangle_X \big| \quad \forall\, x \in X. \]
Therefore for every $y^* \in X^*$, the function $\omega \mapsto \|g(\omega) - y^*\|_{X^*}$ is $\Sigma$-measurable and then from the proof of Theorem 2.1.3, we can infer that the function $\omega \mapsto g(\omega)$ is strongly measurable. Hence $L^\infty\big(\Omega; X^*_{w^*}\big) = L^\infty(\Omega; X^*)$ and Theorem 2.2.12 coincides with Theorem 2.2.9.


In complete analogy with the case of $\mathbb{R}$-valued functions, we introduce the notion of absolutely continuous $X$-valued function.

DEFINITION 2.2.14 A function $f : T = [0, b] \to X$ is said to be absolutely continuous, if for every $\varepsilon > 0$, we can find $\delta(\varepsilon) > 0$, such that for each sequence $\big\{ (a_n, b_n) \big\}_{n \geq 1}$ of pairwise disjoint intervals in $T$ with $\sum_{n=1}^{\infty} (b_n - a_n) < \delta$, we have
\[ \sum_{n=1}^{\infty} \|f(b_n) - f(a_n)\|_X < \varepsilon. \]
Also for a function $f : T = [0, b] \to X$ and a partition
\[ P : \ 0 = x_0 < \ldots < x_m = b \]
of $T$, we define
\[ V(f, P) \stackrel{df}{=} \sum_{k=1}^{m} \|f(x_k) - f(x_{k-1})\|_X. \]
The variation of $f$ on $T$ is defined by
\[ V(f)(b) \stackrel{df}{=} \sup \big\{ V(f, P) : P \text{ is a partition of } T \big\}. \]
When $V(f)(b)$ is finite, we say that $f$ is of bounded variation.

REMARK 2.2.15 Clearly the function $t \mapsto V(f)(t)$ is an increasing function and if $f : T = [0, b] \to X$ is absolutely continuous, then it is of bounded variation. The converse is not true.

It is well known that an $\mathbb{R}$-valued, absolutely continuous function is almost everywhere differentiable on $T$ and it is the indefinite integral of its derivative. The result is no longer true for $X$-valued functions in general.

EXAMPLE 2.2.16 Let $X = L^1[0,1]$ and consider the function $f : [0,1] \to X$, defined by
\[ f(t) \stackrel{df}{=} \chi_{[0,t]} \quad \forall\, t \in [0,1]. \]
It is easy to see that $f$ is absolutely continuous. However, $f$ is nowhere differentiable on $[0,1]$. Indeed, if $f$ is differentiable at $t = t_0 \in [0,1]$, then for every $g \in L^\infty[0,1] = \big( L^1[0,1] \big)^*$, the function
\[ t \mapsto \vartheta(t) \stackrel{df}{=} \big\langle g, f(t) \big\rangle_{L^1[0,1]} = \int_0^1 g(s) f(t)(s)\, ds = \int_0^t g(s)\, ds \]
is differentiable at $t = t_0$. Let
\[ g(s) \stackrel{df}{=} \begin{cases} 1 & \text{if } s \leq t_0, \\ -1 & \text{if } s > t_0. \end{cases} \]
We have
\[ \vartheta(t) = \begin{cases} t & \text{if } t \leq t_0, \\ 2t_0 - t & \text{if } t > t_0, \end{cases} \]
and $\vartheta$ clearly is not differentiable at $t = t_0$. Note that in this example $X = L^1[0,1]$ does not have the RNP.

THEOREM 2.2.17 If $X$ is reflexive and $f : T = [0, b] \to X$ is absolutely continuous, then $f$ is differentiable at almost all $t \in T$ and
\[ f(t) = f(0) + \int_0^t f'(s)\, ds \quad \forall\, t \in T. \]

PROOF Because of Theorem 2.1.3, we may assume that $X$ is also separable. Since $f$ is absolutely continuous, it is of bounded variation and the function $t \mapsto V(f)(t)$ is increasing on $T = [0, b]$ (see Definition 2.2.14 and Remark 2.2.15). For $0 \leq t \leq t + h \leq b$, we have
\[ \|f(t+h) - f(t)\|_X \leq V(f)(t+h) - V(f)(t), \]
so
\[ \frac{\|f(t+h) - f(t)\|_X}{h} \leq \frac{1}{h} \big( V(f)(t+h) - V(f)(t) \big) \quad \forall\, h > 0 \]
and
\[ \limsup_{h \to 0} \frac{\|f(t+h) - f(t)\|_X}{h} \leq \frac{d}{dt} V(f)(t) < +\infty \quad \text{for a.a. } t \in T. \tag{2.9} \]
Since $X$ is separable and reflexive, $X^*$ is separable too (see Remark A.3.14). Let $\{x_n^*\}_{n \geq 1}$ be a dense sequence in $X^*$. For every $n \geq 1$, the function $t \mapsto \langle x_n^*, f(t) \rangle_X$ is differentiable at every point of $T \setminus D_n$, with $\lambda^1(D_n) = 0$ (as before $\lambda^1$ denotes the Lebesgue measure on $T$). Also let
\[ D_0 \stackrel{df}{=} \Big\{ t \in T : \limsup_{h \to 0} \frac{\|f(t+h) - f(t)\|_X}{h} = +\infty \Big\} \]
and let us set
\[ D \stackrel{df}{=} \bigcup_{n=0}^{\infty} D_n. \]
From (2.9), we have that $\lambda^1(D) = 0$. Then for $\varepsilon > 0$ small enough and $t \in T \setminus D$, the family
\[ \Big\{ \frac{\|f(t+h) - f(t)\|_X}{h} : |h| \leq \varepsilon \Big\} \]
is bounded. Since for every $n \geq 1$ and every $t \in T \setminus D$, the limit
\[ \lim_{h \to 0} \Big\langle x_n^*, \frac{f(t+h) - f(t)}{h} \Big\rangle_X \]
exists, we infer that there exists $u(t) \in X$, such that for all $x^* \in X^*$ and all $t \in T \setminus D$, we have
\[ \lim_{h \to 0} \Big\langle x^*, \frac{f(t+h) - f(t)}{h} \Big\rangle_X = \big\langle x^*, u(t) \big\rangle_X, \]
so $f$ is weakly differentiable at every $t \in T \setminus D$. Let $f'$ be the weak derivative of $f$ (i.e., $f'(t) = u(t)$ for all $t \in T \setminus D$). Clearly $f'$ is weakly measurable and so by Theorem 2.1.3 it is also strongly measurable. Moreover, from the weak lower semicontinuity of the norm in a Banach space, we have
\[ \|f'(t)\|_X \leq \liminf_{h \to 0} \frac{\|f(t+h) - f(t)\|_X}{h} \quad \forall\, t \in T \setminus D. \tag{2.10} \]
Then from (2.10) and Fatou's lemma (see Theorem A.2.1), we have that
\[ \int_0^b \|f'(t)\|_X \, dt \leq V(f)(b), \]
i.e., $f' \in L^1(T; X)$. Also
\[ \big\langle x^*, f(t) - f(0) \big\rangle_X = \int_0^t \big\langle x^*, f'(s) \big\rangle_X \, ds \quad \forall\, x^* \in X^*,\ t \in T, \]
so from Theorem 2.1.17, we have
\[ f(t) - f(0) = \int_0^t f'(s)\, ds \quad \forall\, t \in T \tag{2.11} \]
and finally $f$ is almost everywhere strongly differentiable with $\frac{df}{dt} = f' \in L^1(T; X)$ and (2.11) holds.
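The failure of differentiability in Example 2.2.16 can also be seen directly on the difference quotients, without testing against a functional $g$. The computation below is only a sketch; the symbol $q_h$ for the right difference quotient at a point $t \in [0,1)$, with $0 < h \leq 1 - t$, is notation introduced here for illustration.

```latex
\begin{align*}
\|f(t+h) - f(t)\|_{L^1[0,1]} &= \big\| \chi_{(t,\,t+h]} \big\|_{L^1[0,1]} = h , \\
q_h &= \frac{f(t+h) - f(t)}{h} = \frac{1}{h}\, \chi_{(t,\,t+h]} ,
  \qquad \|q_h\|_{L^1[0,1]} = 1 , \\
\|q_h - q_{h/2}\|_{L^1[0,1]}
  &= \int_t^{t+h/2} \Big| \frac{1}{h} - \frac{2}{h} \Big| \, ds
   + \int_{t+h/2}^{t+h} \frac{1}{h} \, ds
   = \frac{1}{2} + \frac{1}{2} = 1 .
\end{align*}
```

The first line shows that $f$ is Lipschitz, hence absolutely continuous, while the last line shows that $\{q_h\}_{h>0}$ is not Cauchy in $L^1[0,1]$ as $h \searrow 0$, so the strong derivative cannot exist at $t$.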


REMARK 2.2.18 The result is more generally true if we assume that $X$ has the RNP. This follows from the fact that the RNP is passed to closed linear subspaces of $X$ and if $X$ is a separable Banach space with the RNP, then it has a separable dual (see Diestel & Uhl (1977, pp. 217–218)). So a careful reading of the previous proof reveals that it remains valid if instead we assume only that $X$ has the RNP.

The next result is an extension of the so-called "Lagrange lemma" and "DuBois-Reymond lemma" (see Denkowski, Migórski & Papageorgiou (2003b, p. 673)) to Banach space valued functions.

PROPOSITION 2.2.19 Let $f \in L^1(T; X)$ (with $T = [0, b]$).
(a) If
\[ \int_0^b f(t) \vartheta(t)\, dt = 0 \quad \forall\, \vartheta \in C_c^\infty\big((0, b)\big), \]
then $f = 0$.
(b) If
\[ \int_0^b f(t) \vartheta'(t)\, dt = 0 \quad \forall\, \vartheta \in C_c^\infty\big((0, b)\big), \]
then $f$ is constant.

PROOF (a) By virtue of Theorem 2.1.3, we may assume that $X$ is separable. Then $X^*_{w^*}$ (the dual space $X^*$ furnished with the $w^*$-topology) is $w^*$-separable (in fact $X^*_{w^*}$ is a Souslin space; see Definition A.2.29(b) and Remark A.2.30). Let $\{x_n^*\}_{n \geq 1}$ be $w^*$-dense in $X^*$. Then for all $n \geq 1$ and all $\vartheta \in C_c^\infty\big((0, b)\big)$, we have
\[ \int_0^b \vartheta(t) \big\langle x_n^*, f(t) \big\rangle_X \, dt = \Big\langle x_n^*, \int_0^b f(t) \vartheta(t)\, dt \Big\rangle_X = 0, \]
so by the Lagrange lemma, we have
\[ \big\langle x_n^*, f(t) \big\rangle_X = 0 \quad \text{for a.a. } t \in T \text{ and all } n \geq 1 \]
and since $\overline{\{x_n^*\}_{n \geq 1}}^{\,w^*} = X^*$, we obtain that
\[ f(t) = 0 \quad \text{for a.a. } t \in T. \]
(b) The proof is similar, using this time the DuBois-Reymond lemma.


The next proposition permits the identification of the space of $X$-valued absolutely continuous functions with a vector Sobolev space.

PROPOSITION 2.2.20 If $f, g \in L^1(T; X)$ (with $T = [0, b]$), then the following conditions are equivalent:
(a) $f(t) = v + \int_0^t g(s)\, ds$, $v \in X$, for almost all $t \in T$;
(b) $\int_0^b f(t) \vartheta'(t)\, dt = - \int_0^b g(t) \vartheta(t)\, dt$ for all $\vartheta \in C_c^\infty\big((0, b)\big)$;
(c) for every $x^* \in X^*$,
\[ \frac{d}{dt} \big\langle x^*, f(\cdot) \big\rangle_X = \big\langle x^*, g(\cdot) \big\rangle_X \]
in the distributional sense on $(0, b)$ (see Definition 1.6.1(a)).

PROOF "(a)$\Longrightarrow$(b),(c)": These implications follow from a simple integration by parts.

"(c)$\Longrightarrow$(b)": From the definition of distributional derivative (see Definition 1.6.1(a)), for all $\vartheta \in C_c^\infty\big((0, b)\big)$ and all $x^* \in X^*$, we have
\[ \int_0^b \big\langle x^*, f(t) \big\rangle_X \vartheta'(t)\, dt = - \int_0^b \frac{d}{dt} \big\langle x^*, f(t) \big\rangle_X \vartheta(t)\, dt = - \int_0^b \big\langle x^*, g(t) \big\rangle_X \vartheta(t)\, dt, \]
so
\[ \int_0^b \big\langle x^*, \vartheta'(t) f(t) + \vartheta(t) g(t) \big\rangle_X \, dt = \Big\langle x^*, \int_0^b \big( \vartheta'(t) f(t) + \vartheta(t) g(t) \big)\, dt \Big\rangle_X = 0 \quad \forall\, x^* \in X^* \]
and thus
\[ \int_0^b f(t) \vartheta'(t)\, dt = - \int_0^b g(t) \vartheta(t)\, dt \quad \forall\, \vartheta \in C_c^\infty\big((0, b)\big). \]


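"(b)$\Longrightarrow$(a)": Let
\[ \widehat{f}(t) \stackrel{df}{=} \int_0^t g(s)\, ds \quad \forall\, t \in T. \]
Evidently $\widehat{f}$ is absolutely continuous and
\[ \widehat{f}\,'(t) = g(t) \quad \text{for a.a. } t \in T. \]
Let
\[ h \stackrel{df}{=} f - \widehat{f}. \]
We have
\[ \int_0^b h(t) \vartheta'(t)\, dt = 0 \quad \forall\, \vartheta \in C_c^\infty\big((0, b)\big), \]
so, using Proposition 2.2.19(b), we have
\[ h(t) = v \in X \quad \text{for a.a. } t \in T \]
and finally
\[ f(t) = v + \int_0^t g(s)\, ds \quad \text{for a.a. } t \in T. \]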
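As a quick sanity check of the equivalence between (a) and (b) in Proposition 2.2.20, consider an affine function. The data $v, x_0 \in X$ and the choice $f(t) = v + t x_0$, $g(t) = x_0$ are assumptions made only for this illustration.

```latex
\begin{align*}
\int_0^b f(t)\, \vartheta'(t)\, dt
  &= v \int_0^b \vartheta'(t)\, dt + x_0 \int_0^b t\, \vartheta'(t)\, dt \\
  &= v \big( \vartheta(b) - \vartheta(0) \big)
     + x_0 \Big( \big[ t\, \vartheta(t) \big]_0^b - \int_0^b \vartheta(t)\, dt \Big) \\
  &= - x_0 \int_0^b \vartheta(t)\, dt
   = - \int_0^b g(t)\, \vartheta(t)\, dt
  \qquad \forall\, \vartheta \in C_c^\infty\big((0,b)\big) .
\end{align*}
```

The boundary terms vanish because $\vartheta$ has compact support in $(0, b)$; this is the same integration by parts that drives the implication "(a)$\Longrightarrow$(b)" in general.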

COROLLARY 2.2.21 If $f, g \in L^1(T; X)$ ($T = [0, b]$) and one of the equivalent statements (a), (b) or (c) in Proposition 2.2.20 holds, then $f$ is almost everywhere equal to an absolutely continuous function $f_1 : T \to X$.

Extending the notion of distributional (weak) derivative and the resulting Sobolev spaces (see Definition 1.6.1) to $X$-valued functions, we make the following definitions.

DEFINITION 2.2.22 (a) Let $f, g \in L^1(T; X)$ (with $T = [0, b]$). We say that $g$ is the distributional (weak) derivative of $f$, if
\[ \int_0^b f(t) \vartheta'(t)\, dt = - \int_0^b g(t) \vartheta(t)\, dt \quad \forall\, \vartheta \in C_c^\infty\big((0, b)\big). \]
We denote this derivative of $f$ by $Df$.


(b) Let $p \in [1, +\infty]$ and $T = [0, b]$. We define
\[ W^{1,p}\big((0, b); X\big) \stackrel{df}{=} \big\{ f \in L^p(T; X) : Df \in L^p(T; X) \big\}. \]
(c) Let $p \in [1, +\infty]$ and $T = [0, b]$. We define
\[ AC^{1,p}\big(T, X\big) \stackrel{df}{=} \big\{ f : T \to X : f \text{ is absolutely continuous, differentiable almost everywhere with derivative } f' \in L^p(T; X) \big\}. \]

REMARK 2.2.23 According to Theorem 2.2.17 (see also Remark 2.2.18), if $X$ is reflexive (or more generally if $X$ has the RNP), then $f \in AC^{1,p}\big(T, X\big)$ if and only if there exists a function $g \in L^p(T; X)$, such that
\[ f(t) = f(0) + \int_0^t g(s)\, ds \quad \forall\, t \in T. \]
Invoking Proposition 2.2.20, we see that the spaces $W^{1,p}\big((0, b); X\big)$ and $AC^{1,p}\big(T, X\big)$ (for $p \in [1, +\infty]$) can be identified.

THEOREM 2.2.24 If $p \in [1, +\infty]$ and $f \in L^p(T; X)$ (with $T = [0, b]$), then the following statements are equivalent:
(a) $f \in W^{1,p}\big((0, b); X\big)$;
(b) there exists $f_1 \in AC^{1,p}\big(T, X\big)$, such that $f(t) = f_1(t)$ for almost all $t \in T$.

REMARK 2.2.25 In Section 2.4, we shall see that this property distinguishes Sobolev functions of one variable (i.e., defined on $(0, b)$) from Sobolev functions of several variables (i.e., functions defined on an open set $Z \subseteq \mathbb{R}^N$ with $N > 1$).

PROPOSITION 2.2.26 If $X$ is reflexive, $p \in (1, +\infty)$ and $f \in L^p(T; X)$ (with $T = [0, b]$), then the following two conditions are equivalent:
(a) $f \in W^{1,p}\big((0, b); X\big)$ (or there exists $f_1 \in AC^{1,p}\big(T, X\big)$, such that $f(t) = f_1(t)$ for almost all $t \in T$);
(b) $\int_0^{b-h} \|f(t+h) - f(t)\|_X^p \, dt \leq c h^p$ for some $c > 0$ and all $h \in (0, b)$.

PROOF "(a)$\Longrightarrow$(b)": By Theorem 2.2.24, we have
\[ f(t+h) - f(t) = \int_t^{t+h} Df_1(s)\, ds \quad \forall\, t, t+h \in T = [0, b]. \]
By Jensen inequality (see Theorem A.2.26), we have
\[ \|f(t+h) - f(t)\|_X^p \leq h^{p-1} \int_t^{t+h} \|Df_1(s)\|_X^p \, ds, \]
so
\[ \int_0^{b-h} \|f(t+h) - f(t)\|_X^p \, dt \leq h^{p-1} \int_0^{b-h} \int_t^{t+h} \|Df_1(s)\|_X^p \, ds \, dt. \tag{2.12} \]
Note that
\[ \int_0^{b-h} \frac{1}{h} \int_t^{t+h} \|Df_1(s)\|_X^p \, ds \, dt \to \int_0^b \|Df_1(s)\|_X^p \, ds \quad \text{as } h \to 0 \]
(see Proposition 2.1.22). So from (2.12), we conclude that
\[ \int_0^{b-h} \|f(t+h) - f(t)\|_X^p \, dt \leq c h^p \quad \forall\, h \in (0, b), \]
for some constant $c > 0$.

"(b)$\Longrightarrow$(a)": For every $n \geq 1$, let
\[ g_n(t) \stackrel{df}{=} \chi_{[0, b - \frac{1}{n}]}(t)\, \frac{f\big(t + \frac{1}{n}\big) - f(t)}{\frac{1}{n}}. \]
By virtue of condition (b), the sequence $\{g_n\}_{n \geq 1} \subseteq L^p(T; X)$ is bounded. Since $p \in (1, +\infty)$ and $X$ is reflexive, the Lebesgue-Bochner space $L^p(T; X)$ is reflexive too (see Proposition 2.2.3(c)). So by the Eberlein-Smulian theorem (see Theorem A.3.8), we may assume that
\[ g_n \xrightarrow{w} g \quad \text{in } L^p(T; X), \]
for some $g \in L^p(T; X)$.


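For every $\vartheta^* \in C_c^\infty\big((0, b); X^*\big)$, we have
\begin{align*}
\int_0^b \big\langle \vartheta^*(t), g(t) \big\rangle_X \, dt
 &= \lim_{n \to +\infty} \int_0^b \big\langle \vartheta^*(t), g_n(t) \big\rangle_X \, dt \\
 &= \lim_{n \to +\infty} \int_0^{b - \frac{1}{n}} \Big\langle \vartheta^*(t), \frac{f\big(t + \frac{1}{n}\big) - f(t)}{\frac{1}{n}} \Big\rangle_X \, dt \tag{2.13} \\
 &= \lim_{n \to +\infty} \bigg[ \int_0^{b - \frac{1}{n}} \Big\langle \frac{\vartheta^*\big(t - \frac{1}{n}\big) - \vartheta^*(t)}{\frac{1}{n}}, f(t) \Big\rangle_X \, dt - n \int_{b - \frac{1}{n}}^{b} \big\langle \vartheta^*(t), f(t) \big\rangle_X \, dt \bigg].
\end{align*}
Because $\vartheta^* \in C_c^\infty\big((0, b); X^*\big)$, for $n \geq 1$ large enough, we have
\[ \int_{b - \frac{1}{n}}^{b} \big\langle \vartheta^*(t), f(t) \big\rangle_X \, dt = 0. \]
Also
\[ \int_0^{b - \frac{1}{n}} \Big\langle \frac{\vartheta^*\big(t - \frac{1}{n}\big) - \vartheta^*(t)}{\frac{1}{n}}, f(t) \Big\rangle_X \, dt \to - \int_0^b \big\langle {\vartheta^*}'(t), f(t) \big\rangle_X \, dt. \]
So from (2.13), we have
\[ \int_0^b \big\langle \vartheta^*(t), g(t) \big\rangle_X \, dt = - \int_0^b \big\langle {\vartheta^*}'(t), f(t) \big\rangle_X \, dt \quad \forall\, \vartheta^* \in C_c^\infty\big((0, b); X^*\big) \]
and finally
\[ Df = g \quad \text{in } L^p(T; X), \]
i.e.,
\[ f \in W^{1,p}\big((0, b); X\big). \]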
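In the implication "(a)$\Longrightarrow$(b)" of Proposition 2.2.26 the constant $c$ can be made explicit by interchanging the order of integration in (2.12). The following sketch assumes $p \in [1, +\infty)$ and uses the abbreviation $\varphi(s) = \|Df_1(s)\|_X^p$, which is introduced here only for illustration.

```latex
\begin{align*}
\int_0^{b-h} \int_t^{t+h} \varphi(s)\, ds\, dt
  &= \int_0^{b} \varphi(s)\, \lambda^1\big( [0, b-h] \cap [s-h, s] \big)\, ds
   \leq h \int_0^b \varphi(s)\, ds , \\
\int_0^{b-h} \|f(t+h) - f(t)\|_X^p \, dt
  &\leq h^{p-1} \cdot h \int_0^b \|Df_1(s)\|_X^p \, ds
   = h^p\, \|Df_1\|_{L^p(T;X)}^p .
\end{align*}
```

Under these assumptions $c = \|Df_1\|_{L^p(T;X)}^p$ is an admissible constant, so the bound in (b) is controlled by the $L^p$-norm of the weak derivative alone.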

To prove the next result concerning X-valued functions, we shall need the following general result about embeddings of Banach spaces, which will also be helpful in our discussion of evolution triples later in this section.


LEMMA 2.2.27
If $Y$ is another Banach space, such that $X \subseteq Y$, the embedding is continuous and $X$ is dense in $Y$, then
(a) the embedding $Y^* \subseteq X^*$ is continuous;
(b) if $X$ is reflexive, then $Y^*$ is dense in $X^*$.

PROOF (a) Since by hypothesis $X$ is embedded continuously in $Y$, there exists $c_1 > 0$, such that

\[ \|x\|_Y \le c_1 \|x\|_X \qquad \forall\, x \in X. \]

Let $y^* \in Y^*$. Then

\[ \bigl|\langle y^*, x\rangle_Y\bigr| \le \|y^*\|_{Y^*}\|x\|_Y \le c_1 \|y^*\|_{Y^*}\|x\|_X \qquad \forall\, x \in X. \tag{2.14} \]

Let $\hat y^* = y^*|_X$. Then from (2.14), we have $\hat y^* \in X^*$ and

\[ \|\hat y^*\|_{X^*} \le c_1 \|y^*\|_{Y^*}. \tag{2.15} \]

We show that $\hat y^* = 0$ implies that $y^* = 0$. Indeed, for all $x \in X$, we have

\[ 0 = \langle \hat y^*, x\rangle_X = \langle y^*, x\rangle_Y. \]

Because $X$ is dense in $Y$, it follows that $y^* = 0$. So the map $i^* : Y^* \longrightarrow X^*$, defined by

\[ i^*(y^*) \overset{\mathrm{df}}{=} \hat y^*, \]

is continuous and injective. Hence $y^*$ can be identified with $\hat y^*$ and so $Y^* \subseteq X^*$ with continuous injection (see (2.15)).

(b) Suppose that the assertion is not true. Then

\[ \overline{Y^*}^{\,\|\cdot\|_{X^*}} \ne X^* \]

and so by the Hahn-Banach theorem, we can find $u \in X^{**} = X$ (since $X$ is reflexive), $u \ne 0$, such that

\[ \langle x^*, u\rangle_X = 0 \qquad \forall\, x^* \in Y^*. \]

Since $u \in X \subseteq Y$ and $Y^*$ separates the points of $Y$, it follows that $u = 0$, a contradiction.
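The reflexivity hypothesis in part (b) cannot simply be dropped. The following worked example is not taken from the text; the sequence spaces $\ell^1$, $\ell^2$, $\ell^\infty$ and $c_0$ are chosen here only as an illustration of what can go wrong when $X$ is not reflexive.

```latex
% Illustrative choice (an assumption of this sketch): X = \ell^1, Y = \ell^2.
\begin{align*}
& X = \ell^1 \subseteq Y = \ell^2, \qquad \|x\|_{\ell^2} \le \|x\|_{\ell^1}
  \quad (\text{continuous embedding, } c_1 = 1), \\
& \text{finitely supported sequences are dense in } \ell^2
  \ \Longrightarrow\ X \text{ is dense in } Y, \\
& Y^* = \ell^2 \subseteq X^* = \ell^\infty, \qquad \|y\|_{\ell^\infty} \le \|y\|_{\ell^2}
  \quad (\text{part (a) holds}), \\
& \ell^2 \subseteq c_0, \quad c_0 \text{ closed in } \ell^\infty, \quad
  \|\mathbf{1} - y\|_{\ell^\infty} \ge \limsup_{k\to+\infty} |1 - y_k| = 1
  \quad \forall\, y \in c_0, \ \mathbf{1} = (1,1,1,\dots).
\end{align*}
% Hence the closure of Y^* in X^* is contained in c_0 and misses \mathbf{1}:
% Y^* is not dense in X^*, and X = \ell^1 is indeed not reflexive.
```

So all hypotheses of the lemma except reflexivity are satisfied, part (a) holds, and part (b) fails.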


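PROPOSITION 2.2.28
If $X$ is reflexive, $Y$ is another Banach space, $X \subseteq Y$, the embedding is continuous and $f \in L^\infty(T;X) \cap C\bigl(T;Y_w\bigr)$ ($T = [0,b]$ and $Y_w$ is the Banach space $Y$ equipped with the weak topology), then $f \in C\bigl(T;X_w\bigr)$ (where $X_w$ is the Banach space $X$ equipped with the weak topology).

PROOF By replacing $Y$ with $\overline{X}^{\,\|\cdot\|_Y}$ if necessary, we may assume that $X$ is dense in $Y$. So by virtue of Lemma 2.2.27(b), $Y^* \subseteq X^*$ and the embedding is continuous and dense. From Corollary 2.1.4, we know that there exists a sequence $\{f_n\}_{n\ge 1}$ of $X$-valued, countably valued functions on $T$, such that

\[ f_n \longrightarrow f \quad \text{uniformly on } T \text{ in } X. \]

We know that

\[ \|f_n(t)\|_X \le c_1 \|f\|_{L^\infty(T;X)} \qquad \forall\, t \in T,\ n \ge 1, \]

for some $c_1 > 0$ and

\[ \bigl\langle y^*, f_n(t)\bigr\rangle_X \longrightarrow \bigl\langle y^*, f(t)\bigr\rangle_X \qquad \forall\, y^* \in Y^*,\ t \in T. \]

It follows that

\[ \bigl|\bigl\langle y^*, f_n(t)\bigr\rangle_X\bigr| \le c_1 \|y^*\|_{X^*} \|f_n\|_{L^\infty(T;X)} \qquad \forall\, n \ge 1,\ t \in T, \]

thus

\[ \bigl|\bigl\langle y^*, f(t)\bigr\rangle_X\bigr| \le c_1 \|y^*\|_{X^*} \|f\|_{L^\infty(T;X)} \qquad \forall\, t \in T \]

and so $f(t) \in X$ and

\[ \|f(t)\|_X \le c_1 \|f\|_{L^\infty(T;X)} \qquad \forall\, t \in T. \tag{2.16} \]

Next let $x^* \in X^*$. We can find a sequence $\{y_m^*\}_{m\ge 1} \subseteq Y^*$, such that

\[ y_m^* \longrightarrow x^* \quad \text{in } X^*. \]

Also let $t_n \to t$ in $T$. Because $f \in C\bigl(T;Y_w\bigr)$, we have

\[ \bigl\langle y_m^*, f(t_n)\bigr\rangle_X = \bigl\langle y_m^*, f(t_n)\bigr\rangle_Y \longrightarrow \bigl\langle y_m^*, f(t)\bigr\rangle_Y \quad \text{as } n \to +\infty, \text{ for all } m \ge 1 \text{ and all } t \in T. \tag{2.17} \]

Also we have

\[ \bigl\langle y_m^*, f(t)\bigr\rangle_Y = \bigl\langle y_m^*, f(t)\bigr\rangle_X \longrightarrow \bigl\langle x^*, f(t)\bigr\rangle_X \quad \text{as } m \to +\infty, \text{ for all } t \in T. \tag{2.18} \]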


From (2.17) and (2.18), via the double limit lemma (see Proposition A.2.35), we deduce that there exists a sequence $\{m(n)\}_{n\ge 1}$ increasing (not necessarily strictly) to $+\infty$ such that

\[ \bigl\langle y_{m(n)}^*, f(t_n)\bigr\rangle_X \longrightarrow \bigl\langle x^*, f(t)\bigr\rangle_X. \tag{2.19} \]

From (2.16) and (2.19), we have

\begin{align*}
\bigl|\bigl\langle x^*, f(t_n)\bigr\rangle_X - \bigl\langle x^*, f(t)\bigr\rangle_X\bigr|
 &\le \bigl|\bigl\langle x^*, f(t_n)\bigr\rangle_X - \bigl\langle y_{m(n)}^*, f(t_n)\bigr\rangle_X\bigr|
   + \bigl|\bigl\langle y_{m(n)}^*, f(t_n)\bigr\rangle_X - \bigl\langle x^*, f(t)\bigr\rangle_X\bigr| \\
 &\le \bigl\|x^* - y_{m(n)}^*\bigr\|_{X^*} \bigl\|f(t_n)\bigr\|_X
   + \bigl|\bigl\langle y_{m(n)}^*, f(t_n)\bigr\rangle_X - \bigl\langle x^*, f(t)\bigr\rangle_X\bigr| \longrightarrow 0,
\end{align*}

so $f \in C\bigl(T;X_w\bigr)$.

The next lemma is crucial in obtaining compactness theorems for function spaces which arise in the study of evolution equations.

LEMMA 2.2.29
If $X, Y, Z$ are three Banach spaces, such that $X \subseteq Y \subseteq Z$ with the first embedding compact and the second continuous, then for every $\xi > 0$, we can find $c(\xi) > 0$, such that

\[ \|x\|_Y \le \xi \|x\|_X + c(\xi) \|x\|_Z \qquad \forall\, x \in X. \]

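PROOF Suppose the lemma is not true. Then we can find $\xi > 0$ and a sequence $\{x_n\}_{n\ge 1} \subseteq X$, such that

\[ \|x_n\|_Y > \xi \|x_n\|_X + n \|x_n\|_Z \qquad \forall\, n \ge 1. \]

Let $y_n \overset{\mathrm{df}}{=} \dfrac{x_n}{\|x_n\|_X}$ for all $n \ge 1$. We have

\[ \|y_n\|_Y > \xi + n \|y_n\|_Z \qquad \forall\, n \ge 1. \tag{2.20} \]

Since $\|y_n\|_X = 1$ for all $n \ge 1$ and the embedding $X \subseteq Y$ is compact (hence continuous, so the sequence $\{\|y_n\|_Y\}_{n\ge 1}$ is bounded), from (2.20), we have that

\[ \|y_n\|_Z \longrightarrow 0 \tag{2.21} \]

and also the sequence $\{y_n\}_{n\ge 1} \subseteq Y$ is relatively compact. Thus we can find a subsequence $\{y_{n_k}\}_{k\ge 1}$ of $\{y_n\}_{n\ge 1}$, such that

\[ y_{n_k} \longrightarrow u \quad \text{in } Y. \]

Since $Y$ is embedded continuously in $Z$, we have also that $y_{n_k} \longrightarrow u$ in $Z$. Because of (2.21), we have that $u = 0$. On the other hand from (2.20) in the limit as $k \to +\infty$, we have $\|u\|_Y \ge \xi > 0$, a contradiction. This proves the lemma.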
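The compactness of the embedding $X \subseteq Y$ is essential in Lemma 2.2.29. The sketch below is not part of the original text; the spaces (in particular the weighted space $Z$) are chosen here only for illustration. It takes the identity embedding of $\ell^2$ into itself, which is continuous but not compact, and shows that the inequality fails for every $\xi < 1$.

```latex
% Illustrative choice (an assumption of this sketch):
%   X = Y = \ell^2,  Z = \{ x : \sum_k x_k^2 / k^2 < \infty \} with the norm below.
\begin{align*}
& \|x\|_Z \overset{\mathrm{df}}{=} \Bigl( \sum_{k \ge 1} \frac{x_k^2}{k^2} \Bigr)^{1/2}
  \le \|x\|_{\ell^2}
  \qquad (\text{so } X \subseteq Y \subseteq Z \text{ continuously}), \\
& e_n = (0,\dots,0,\underbrace{1}_{n\text{-th}},0,\dots): \qquad
  \|e_n\|_X = \|e_n\|_Y = 1, \qquad \|e_n\|_Z = \frac1n, \\
& \|e_n\|_Y \le \xi \|e_n\|_X + c(\xi)\|e_n\|_Z
  \iff 1 \le \xi + \frac{c(\xi)}{n} \quad \forall\, n \ge 1
  \ \Longrightarrow\ 1 \le \xi \quad (n \to +\infty).
\end{align*}
```

So for $\xi \in (0,1)$ no constant $c(\xi)$ works, in agreement with the fact that $\{e_n\}_{n\ge 1}$ has no convergent subsequence in $Y$.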


Let $X, Y, Z$ be three Banach spaces, with $X, Y$ reflexive. Assume that $X \subseteq Y \subseteq Z$, with the embeddings being continuous. Moreover, we suppose that the first embedding is compact. Let $T = [0,b]$ and $1 < p, r$. We introduce the space

\[ W_{pr}(T) \overset{\mathrm{df}}{=} \bigl\{ u \in L^p(T;X) : u' = Du \in L^r(T;Z) \bigr\}. \]

Here $u' = Du$ denotes the derivative in the distributional sense in $Z$, i.e.,

\[ \int_0^b u(t)\vartheta'(t)\,dt = - \int_0^b u'(t)\vartheta(t)\,dt \quad \text{in } Z \qquad \forall\, \vartheta \in C_c^\infty\bigl((0,b)\bigr). \]

We furnish $W_{pr}(T)$ with the norm $\|u\|_{pr} = \|u\|_p + \|u'\|_r$. Clearly $W_{pr}(T)$ normed this way is a Banach space. Indeed, consider the isomorphism $\eta : W_{pr}(T) \longrightarrow L^p(T;X) \times L^r(T;Z)$, given by

\[ \eta(x) \overset{\mathrm{df}}{=} (x, x') \qquad \forall\, x \in W_{pr}(T) \]

and view $W_{pr}(T)$ as a closed subspace of $L^p(T;X) \times L^r(T;Z)$. Moreover, if $X$ and $Z$ are separable, then so is $W_{pr}(T)$ and finally if $X$ and $Z$ are reflexive, then $W_{pr}(T)$ is reflexive too. It is evident that $W_{pr}(T) \subseteq L^p(T;X) \subseteq L^p(T;Z) \subseteq L^s(T;Z)$, with $s = \min\{p,r\}$. Then

\[ W_{pr}(T) \subseteq W^{1,s}\bigl((0,b);Z\bigr) \]

and so every $u \in W_{pr}(T)$ viewed as a $Z$-valued function belongs to $AC^{1,s}(T;Z)$ (see Theorem 2.2.24). Therefore the derivative $u' = Du$ is actually a strong derivative in $Z$ almost everywhere, i.e., $u' = \dfrac{du}{dt}$. We note that

\[ W_{pr}(T) \subseteq L^p(T;X) \subseteq L^p(T;Y) \]

and clearly the embeddings are continuous. We can say more about the embedding $W_{pr}(T) \subseteq L^p(T;Y)$, provided we strengthen our conditions on the spaces $X$, $Y$ and $Z$.


THEOREM 2.2.30
If $X, Y, Z$ are Banach spaces, with $X, Z$ being reflexive, the embeddings $X \subseteq Y \subseteq Z$ being continuous and the embedding $X \subseteq Y$ being compact, then the embedding $W_{pr}(T) \subseteq L^p(T;Y)$ is compact.

PROOF Let $\{u_n\}_{n\ge 1} \subseteq W_{pr}(T)$ be a bounded sequence. We need to show that it has a subsequence which converges strongly in $L^p(T;Y)$. Note that $W_{pr}(T)$ is reflexive. Passing to a subsequence if necessary, we may assume that

\[ u_n \xrightarrow{\;w\;} u \quad \text{in } W_{pr}(T). \]

This means that

\[ u_n \xrightarrow{\;w\;} u \ \text{ in } L^p(T;X) \qquad \text{and} \qquad u_n' \xrightarrow{\;w\;} u' \ \text{ in } L^r(T;Z). \]

Recall that $W_{pr}(T) \subseteq C(T;Z)$.

Claim 1. The embedding $W_{pr}(T) \subseteq C(T;Z)$ is continuous.

To see this suppose that

\[ u_n \longrightarrow u \quad \text{in } W_{pr}(T). \tag{2.22} \]

Then

\[ u_n(t) \longrightarrow u(t) \quad \text{in } X \qquad \forall\, t \in T \setminus D, \]

with $\lambda^1(D) = 0$. Evidently

\[ u_n(t) \longrightarrow u(t) \quad \text{in } Z \qquad \forall\, t \in T \setminus D. \]

For $t \in T$ and $s \in T \setminus D$, from Proposition 2.2.20, we have

\[ \|u_n(t) - u(t)\|_Z \le \|u_n(s) - u(s)\|_Z + c_1 \|u_n' - u'\|_{L^r(T;Z)} \qquad \forall\, n \ge 1, \tag{2.23} \]

for some $c_1 > 0$. For every $n \ge 1$, we choose $t_n \in T$, such that

\[ \|u_n - u\|_{C(T;Z)} = \|u_n(t_n) - u(t_n)\|_Z. \]

So from (2.22) and (2.23), we have

\[ \|u_n - u\|_{C(T;Z)} \le \|u_n(s) - u(s)\|_Z + c_1 \|u_n' - u'\|_{L^r(T;Z)} \longrightarrow 0. \]

This proves Claim 1.

Let

\[ v_n \overset{\mathrm{df}}{=} u_n - u \qquad \forall\, n \ge 1. \]


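Then from Claim 1, it follows that we can find $c_2 > 0$, such that

\[ \|v_n\|_{C(T;Z)} = \|u_n - u\|_{C(T;Z)} \le c_2 \qquad \forall\, n \ge 1. \tag{2.24} \]

We claim that

\[ v_n(t) \longrightarrow 0 \quad \text{in } Z \qquad \forall\, t \in T. \]

We shall prove this for $t = 0$. The proof is similar for any other $t \in T$. We have

\[ v_n(0) = v_n(t) - \int_0^t v_n'(\tau)\,d\tau, \]

so, averaging over $t \in [0,s]$,

\[ v_n(0) = \frac1s \int_0^s v_n(t)\,dt - \frac1s \int_0^s \!\! \int_0^t v_n'(\tau)\,d\tau\,dt, \]

thus

\[ v_n(0) = \xi_n + \eta_n \qquad \forall\, n \ge 1, \]

with

\[ \xi_n = \frac1s \int_0^s v_n(t)\,dt \qquad \text{and} \qquad \eta_n = -\frac1s \int_0^s \!\! \int_0^t v_n'(\tau)\,d\tau\,dt \qquad \forall\, n \ge 1. \]

Note that

\[ \eta_n = -\frac1s \int_0^s (s-t)\,v_n'(t)\,dt. \]

For a given $\varepsilon > 0$, select $s \in T$ so that

\[ \|\eta_n\|_Z \le \int_0^s \|v_n'(t)\|_Z\,dt \le \frac{\varepsilon}{2} \qquad \forall\, n \ge 1 \]

(this is possible since, by Hölder's inequality, $\int_0^s \|v_n'(t)\|_Z\,dt \le s^{1/r'} \|v_n'\|_{L^r(T;Z)}$ with $\frac1r + \frac1{r'} = 1$, and the sequence $\{v_n'\}_{n\ge 1}$ is bounded in $L^r(T;Z)$). For this fixed $s \in T$, note that

\[ \xi_n \xrightarrow{\;w\;} 0 \quad \text{in } X \]

and so

\[ \xi_n \longrightarrow 0 \quad \text{in } Z \]

(since $X$ is embedded compactly in $Z$). So for $n \ge 1$ large enough, we have

\[ \|\xi_n\|_Z \le \frac{\varepsilon}{2}. \]

This means that

\[ v_n(t) \longrightarrow 0 \quad \text{in } Z \qquad \forall\, t \in T. \]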


Because of (2.24), we can apply Proposition 2.1.13 and infer that

\[ v_n \longrightarrow 0 \quad \text{in } L^p(T;Z). \]

By virtue of Lemma 2.2.29, for a given $\gamma > 0$ we can find $c(\gamma) > 0$, such that

\[ \|v_n\|_{L^p(T;Y)} \le \gamma \|v_n\|_{L^p(T;X)} + c(\gamma) \|v_n\|_{L^p(T;Z)}, \]

so

\[ \|v_n\|_{L^p(T;Y)} \le \gamma c_3 + c(\gamma) \|v_n\|_{L^p(T;Z)} \qquad \forall\, n \ge 1, \tag{2.25} \]

for some $c_3 > 0$. Since $\gamma > 0$ was arbitrary and $\|v_n\|_{L^p(T;Z)} \longrightarrow 0$, from (2.25) we infer that

\[ \limsup_{n\to+\infty} \|v_n\|_{L^p(T;Y)} \le 0, \]

i.e., $v_n \to 0$ in $L^p(T;Y)$.

Now we are about to introduce a notion that plays a central role in the study of evolution equations. The modern strategy in studying parabolic equations is to make use of many different function spaces. The concept of evolution triple, which we define next, provides an appropriate analytical framework to realize this strategy.

DEFINITION 2.2.31
A triple of spaces $(X, H, X^*)$ is said to be an evolution triple, if the following are true:
(a) $X$ is a separable, reflexive Banach space;
(b) $H$ is a separable Hilbert space;
(c) the embedding $X \subseteq H$ is continuous and dense.

REMARK 2.2.32
By virtue of Lemma 2.2.27(b), the embedding $H^* \subseteq X^*$ is continuous and dense. Since by the Riesz-Fréchet representation theorem (see, e.g., Denkowski, Migórski & Papageorgiou (2003a, p. 316)) we can assume that $H = H^*$, then we have that all embeddings $X \subseteq H \subseteq X^*$ are continuous and dense. For all $h \in H$ and all $x \in X$, we have $\langle h, x\rangle_X = (h, x)_H$, i.e., $\langle \cdot, \cdot\rangle_X|_{H \times X} = (\cdot, \cdot)_H$. Also for $x^* \in X^*$, $x \in X$, we have

\[ \langle x^*, x\rangle_X = \lim_{\substack{h \to x^* \text{ in } X^* \\ h \in H}} (h, x)_H \]

(since $H$ is dense in $X^*$). Therefore if $X$ is a Hilbert space too, we do not represent the elements of $X^*$ using the inner product of $X$ (the Riesz-Fréchet theorem), but using the inner product of $H$.
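In an evolution triple, an inequality of the type in Lemma 2.2.29 (with $Y = H$ and $Z = X^*$) can be written down with an explicit constant, using only the identity $\langle h, x\rangle_X = (h,x)_H$ of the remark above. The short derivation below is a sketch added for illustration; the notation $|\cdot|_H$ for the norm of $H$ and the constant $\frac{1}{4\xi}$ are choices made here, not statements from the text.

```latex
% For x \in X \subseteq H \subseteq X^* (x is viewed as an element of X^* in the first slot):
\begin{align*}
|x|_H^2 &= (x,x)_H = \langle x, x\rangle_X \le \|x\|_{X^*}\,\|x\|_X, \\
|x|_H &\le \bigl(\|x\|_X\,\|x\|_{X^*}\bigr)^{1/2}
       \le \xi\,\|x\|_X + \frac{1}{4\xi}\,\|x\|_{X^*}
       \qquad \forall\, \xi > 0,\ x \in X.
\end{align*}
% The last step is the elementary inequality \sqrt{ab} \le \xi a + b/(4\xi) for a, b \ge 0,
% which follows from 2\sqrt{\xi a \cdot b/(4\xi)} = \sqrt{ab}.
```

Note that this particular estimate does not use compactness of $X \subseteq H$; it relies on the Hilbert space structure of the pivot space $H$.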


EXAMPLE 2.2.33
If $Z \subseteq \mathbb{R}^N$ is a bounded open set with smooth boundary and $p \in [2,+\infty)$, then as we shall see in Section 2.4, the spaces

\[ X = W^{1,p}(Z), \qquad H = L^2(Z) \qquad \text{and} \qquad X^* = \bigl(W^{1,p}(Z)\bigr)^* \]

form an evolution triple.

For the evolution triple $(X, H, X^*)$, we can consider the reflexive Banach space

\[ W_{pp'}(T) \overset{\mathrm{df}}{=} \bigl\{ u \in L^p(T;X) : u' \in L^{p'}\bigl(T;X^*\bigr) \bigr\}, \]

with $\frac1p + \frac1{p'} = 1$, introduced earlier. In the next proposition we establish a regularity property for the elements of $W_{pp'}(T)$ and also derive an “integration by parts formula,” which is crucial in the treatment of evolution equations.

PROPOSITION 2.2.34 (Integration by Parts Formula)
If $(X, H, X^*)$ is an evolution triple and $1 < p, p' < +\infty$ with $\frac1p + \frac1{p'} = 1$, then
(a) $W_{pp'}(T) \subseteq C(T;H)$ and the embedding is continuous;
(b) for all $u, v \in W_{pp'}(T)$ and all $0 \le s \le t \le b$, we have

\[ \bigl(u(t), v(t)\bigr)_H - \bigl(u(s), v(s)\bigr)_H = \int_s^t \Bigl[ \bigl\langle u'(\tau), v(\tau)\bigr\rangle_X + \bigl\langle u(\tau), v'(\tau)\bigr\rangle_X \Bigr] d\tau. \]

PROOF (a) Note that by the generalized Weierstrass approximation theorem, the space of $X$-valued polynomials is dense in $W_{pp'}(T)$. In particular then the embedding $C^1(T;X) \subseteq W_{pp'}(T)$ is dense. Now let $u, v \in C^1(T;X)$. We have

\[ \frac{d}{dt}\bigl(u(t), v(t)\bigr)_H = \bigl(u'(t), v(t)\bigr)_H + \bigl(u(t), v'(t)\bigr)_H \qquad \forall\, t \in T. \]

Thus

\[ \bigl(u(t), v(t)\bigr)_H - \bigl(u(s), v(s)\bigr)_H = \int_s^t \Bigl[ \bigl(u'(\tau), v(\tau)\bigr)_H + \bigl(u(\tau), v'(\tau)\bigr)_H \Bigr] d\tau \qquad \forall\, 0 \le s \le t \le b, \]

so

\[ \bigl(u(t), v(t)\bigr)_H - \bigl(u(s), v(s)\bigr)_H = \int_s^t \Bigl[ \bigl\langle u'(\tau), v(\tau)\bigr\rangle_X + \bigl\langle u(\tau), v'(\tau)\bigr\rangle_X \Bigr] d\tau \qquad \forall\, 0 \le s \le t \le b. \tag{2.26} \]

Choose $\vartheta \in C^1(\mathbb{R})$, such that

\[ \vartheta(s) = 0, \qquad \vartheta(t) = 1 \qquad \text{and} \qquad |\vartheta| + |\vartheta'| \le 1 \quad \text{on } \mathbb{R}. \]

Let $v \overset{\mathrm{df}}{=} \vartheta u$. Then

\[ v' = \vartheta' u + \vartheta u' \]

and using Hölder's inequality (see Theorem A.2.27) from (2.26), we obtain

\[ |u(t)|^2 \le c_1 \|u\|_{pp'}^2 \qquad \forall\, t \in T, \]

for some $c_1 > 0$ and so

\[ \|u\|_{C(T;H)} \le \sqrt{c_1}\, \|u\|_{pp'} \qquad \forall\, u \in C^1(T;X). \tag{2.27} \]

Therefore the identity map

\[ i : \bigl(C^1(T;X), \|\cdot\|_{pp'}\bigr) \longrightarrow C(T;H) \]

is continuous. But as we said in the beginning of the proof the embedding $C^1(T;X) \subseteq W_{pp'}(T)$ is dense. So we can extend $i$ continuously on $W_{pp'}(T)$. Hence the embedding $W_{pp'}(T) \subseteq C(T;H)$ is continuous.

(b) The integration by parts formula follows from (2.26) and the density of the embedding $C^1(T;X) \subseteq W_{pp'}(T)$.

REMARK 2.2.35

Even if the embedding $X \subseteq H$ is compact, the embedding $W_{pp'}(T) \subseteq C(T;H)$ is not compact (see Migórski (1994)). In general, if $X$ and $Z$ are Banach spaces, the embedding $X \subseteq Z$ is continuous (a special case is if $X = Z$), $p, r \in [1,\infty]$ and

\[ W_{pr}(T) \overset{\mathrm{df}}{=} \bigl\{ u \in L^p(T;X) : u' \in L^r(T;Z) \bigr\}, \]

then $W_{pr}(T) \subseteq C(T;Z)$. This inclusion as well as that of Proposition 2.2.34(a) means that if $u \in W_{pp'}(T)$ (respectively $u \in W_{pr}(T)$), then there exists $u_1 \in C(T;H)$ (respectively $u_1 \in C(T;Z)$), such that

\[ u(t) = u_1(t) \qquad \text{for a.a. } t \in T. \]
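A frequently used special case of the integration by parts formula of Proposition 2.2.34(b) is obtained by taking $v = u$. The following short computation is a sketch added for illustration (with $u$ identified with its continuous representative in $C(T;H)$ and $|\cdot|_H$ denoting the norm of $H$); it is not a statement quoted from the text.

```latex
% Take v = u in Proposition 2.2.34(b); both pairings then coincide.
\begin{align*}
|u(t)|_H^2 - |u(s)|_H^2 &= 2 \int_s^t \bigl\langle u'(\tau), u(\tau)\bigr\rangle_X \, d\tau
   \qquad \forall\, 0 \le s \le t \le b,\ u \in W_{pp'}(T), \\
\frac{d}{dt}\,|u(t)|_H^2 &= 2\,\bigl\langle u'(t), u(t)\bigr\rangle_X
   \qquad \text{for a.a. } t \in T.
\end{align*}
% The integrand is in L^1(T) by Hoelder's inequality, since
% u' \in L^{p'}(T;X^*) and u \in L^p(T;X); hence t \mapsto |u(t)|_H^2 is absolutely continuous.
```

This is the form in which the formula typically enters energy estimates for evolution equations.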

2.3 Compactness Results

In this section we prove compactness and weak compactness results for subsets of $C(T;X)$ and $L^p(T;X)$ ($p \in [1,+\infty)$). Throughout this section $T = [0,b]$ ($b < +\infty$) and $X$ is a Banach space. Additional hypotheses will be introduced as needed. We start with the classical “Arzela-Ascoli theorem” which characterizes the compact subsets of $C(T;X)$. In its proof we shall need the following lemma.

LEMMA 2.3.1
If $K \subseteq X$ is a nonempty set and for every $\varepsilon > 0$ there exists a relatively compact set $K_\varepsilon \subseteq X$, such that for every $x \in K$ we can find $x_\varepsilon \in K_\varepsilon$, such that $\|x - x_\varepsilon\|_X < \varepsilon$, then $K$ is relatively compact.

PROOF Let $\varepsilon > 0$. Choose $K_{\frac{\varepsilon}{2}} \subseteq X$ to be the relatively compact subset postulated by the hypothesis of the lemma. We can find $\{x_\varepsilon^k\}_{k=1}^n \subseteq K_{\frac{\varepsilon}{2}}$, such that

\[ K_{\frac{\varepsilon}{2}} \subseteq \bigcup_{k=1}^n B_{\frac{\varepsilon}{2}}(x_\varepsilon^k). \]

By hypothesis for every $x \in K$, there exists $x_{\frac{\varepsilon}{2}} \in K_{\frac{\varepsilon}{2}}$, such that

\[ \bigl\|x - x_{\frac{\varepsilon}{2}}\bigr\|_X < \frac{\varepsilon}{2} \]

[…]

$\varepsilon > 0$ there exists $\delta(\varepsilon) > 0$, such that, if $t, s \in T$ and $|t - s| < \delta$, then

\[ \|u(t) - u(s)\|_X < \varepsilon \qquad \forall\, u \in K \]

(the equicontinuity is uniform in $t \in T$ since $T$ is compact).


PROOF “=⇒”: Property (a) follows from the fact that for every $t \in T$ the evaluation at $t$ map $e_t : C(T;X) \ni u \longmapsto u(t) \in X$ is continuous. To prove property (b) (the equicontinuity property), we proceed as follows. Let $\varepsilon > 0$. Because $K$ is relatively compact in $C(T;X)$, we can find $\{u_k\}_{k=1}^n \subseteq K$, such that

\[ K \subseteq \bigcup_{k=1}^n B_{\frac{\varepsilon}{3}}(u_k). \]

If $t \in T$, there is a $\delta = \delta(\varepsilon) > 0$, such that if $s \in T$ and $|t - s| < \delta$, then

\[ \|u_k(t) - u_k(s)\|_X < \frac{\varepsilon}{3} \qquad \forall\, k \in \{1,\dots,n\} \]

(recall that the functions $\{u_k\}_{k=1}^n$ are uniformly continuous on $T$ since $T$ is compact). Now let $s \in T$ with $|t - s| < \delta$ and $u \in K$. Choose $k_0 \in \{1,\dots,n\}$, such that $\|u - u_{k_0}\|_\infty < \frac{\varepsilon}{3}$. We have

\begin{align*}
\|u(t) - u(s)\|_X
 &\le \|u(t) - u_{k_0}(t)\|_X + \|u_{k_0}(t) - u_{k_0}(s)\|_X + \|u_{k_0}(s) - u(s)\|_X \\
 &\le \|u - u_{k_0}\|_\infty + \frac{\varepsilon}{3} + \|u_{k_0} - u\|_\infty < \varepsilon,
\end{align*}

so $K$ is equicontinuous.

“⇐=”: First note that $K(0)$ and $K(b)$ are both relatively compact. Indeed, for a given $\varepsilon > 0$, we can find $\delta = \delta(\varepsilon) > 0$, such that if $0 < s < \delta$, then

\[ \|u(s) - u(0)\|_X < \varepsilon \qquad \forall\, u \in K. \]

Since by hypothesis $K(s) \subseteq X$ is relatively compact, from Lemma 2.3.1, it follows that $K(0) \subseteq X$ is relatively compact. Similarly for $K(b) \subseteq X$. For every integer $N$, let $u_N : T \longrightarrow X$ be the function equal to $u \in K$ at the points $t_k = \frac{kb}{N}$, $k = 0,\dots,N$ and linear between these points. Then the set $K_N \overset{\mathrm{df}}{=} \{u_N : u \in K\}$ is isomorphic to

\[ \prod_{k=0}^{N} K\Bigl(\frac{kb}{N}\Bigr) \subseteq X^{N+1}, \]

which is relatively compact (Tychonoff's theorem). Therefore $K_N \subseteq C(T;X)$ is relatively compact. Also if $N > \frac{b}{\delta}$, then by property (b), we have $\|u - u_N\|_\infty < \varepsilon$. So by Lemma 2.3.1, we conclude that $K \subseteq C(T;X)$ is relatively compact.

We can have a “weak” variant of the Arzela-Ascoli theorem. First a definition.


DEFINITION 2.3.3
A set $K \subseteq C\bigl(T;X_w\bigr)$ is weakly equicontinuous, if for every $\varepsilon > 0$ and $x^* \in X^*$, we can find $\delta = \delta(\varepsilon, x^*) > 0$, such that if $t, s \in T$ and $|t - s| < \delta$, then

\[ \bigl|\bigl\langle x^*, u(t) - u(s)\bigr\rangle_X\bigr| < \varepsilon \qquad \forall\, u \in K. \]

Also we say that a sequence of functions $u_n : T \longrightarrow X$, $n \ge 1$, converges weakly uniformly to $u : T \longrightarrow X$, if for every $\varepsilon > 0$ and $x^* \in X^*$, we can find $n_0 = n_0(\varepsilon, x^*) \ge 1$, such that

\[ \bigl|\bigl\langle x^*, u_n(t) - u(t)\bigr\rangle_X\bigr| < \varepsilon \qquad \forall\, t \in T,\ n \ge n_0. \]

THEOREM 2.3.4
If $X^*$ is separable, $\{u_n\}_{n\ge 1} \subseteq C\bigl(T;X_w\bigr)$, for every $t \in T$, the set $\overline{\{u_n(t)\}_{n\ge 1}}^{\,w}$ is weakly compact in $X$ and the sequence $\{u_n\}_{n\ge 1}$ is weakly equicontinuous, then we can find $u \in C\bigl(T;X_w\bigr)$ and a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$ such that

\[ u_{n_k} \longrightarrow u \quad \text{weakly uniformly in } T. \]

PROOF Let $C^*$ be a countable dense subset of $X^*$. We introduce $D^* \overset{\mathrm{df}}{=} \operatorname{span}_{\mathbb{Q}} C^*$, the set of linear combinations with rational coefficients of the elements of $C^*$. Evidently $D^*$ is countable and dense in $X^*$. Using the classical Arzela-Ascoli theorem on $C(T)$ together with the Cantor diagonal process, we can find a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$, such that for every $x^* \in D^*$,

\[ \bigl\langle x^*, u_{n_k}(\cdot)\bigr\rangle_X \longrightarrow v(x^*)(\cdot) \quad \text{in } C(T) \text{ as } k \to +\infty. \]

Note that $x^* \longmapsto v(x^*)$ is a map defined on $D^*$ with values in $C(T)$. Moreover, we have

\[ \bigl|\bigl\langle x^* - y^*, u_n(t)\bigr\rangle_X\bigr| \le c_1 \|x^* - y^*\|_{X^*} \qquad \forall\, t \in T,\ x^*, y^* \in D^*, \]

for some $c_1 > 0$, so

\[ \bigl|v(x^*)(t) - v(y^*)(t)\bigr| \le c_1 \|x^* - y^*\|_{X^*} \qquad \forall\, t \in T,\ x^*, y^* \in D^* \]

and thus

\[ \bigl\|v(x^*) - v(y^*)\bigr\|_{C(T)} \le c_1 \|x^* - y^*\|_{X^*}. \]

Therefore the map $v : D^* \longrightarrow C(T)$ is uniformly continuous. Thus it can be extended to a unique continuous map $\hat v : \overline{D^*}^{\,\|\cdot\|_{X^*}} = X^* \longrightarrow C(T)$. Clearly $\hat v$ is continuous. This together with the fact that $K(t) \subseteq X$ is weakly compact imply that we can find $u : T \longrightarrow X$, such that

\[ \bigl\langle x^*, u(t)\bigr\rangle_X = \hat v(x^*)(t) \qquad \forall\, x^* \in X^*,\ t \in T. \]

We conclude that $u \in C\bigl(T;X_w\bigr)$ and $u_{n_k} \longrightarrow u$ weakly uniformly on $T$.


DEFINITION 2.3.5
A subset $K \subseteq L^p(T;X)$ ($p \in [1,+\infty)$) is said to be $p$-equiintegrable, if it is uniformly integrable (see Definition A.2.3) and

\[ \lim_{h \searrow 0} \int_0^{b-h} \|u(t+h) - u(t)\|_X^p\,dt = 0 \qquad \text{uniformly for all } u \in K. \]

In the next theorem we present a characterization of relatively compact sets of the Lebesgue-Bochner spaces $L^p(T;X)$ ($p \in [1,+\infty]$) and also obtain an alternative criterion for compactness in $C(T;X)$ in which the compactness condition of the Arzela-Ascoli theorem (see Theorem 2.3.2) is replaced by a similar one for integrals. In what follows

\[ \tau_h(u)(t) \overset{\mathrm{df}}{=} u(t+h) \qquad \forall\, h > 0. \]

So if $u$ is defined on $T$, this translated version $\tau_h(u)$ is defined on $[-h, b-h]$. Note that the definition of $p$-equiintegrability is equivalent to saying that

\[ \|\tau_h(u) - u\|_{L^p(T_h;X)} \longrightarrow 0 \quad \text{as } h \to 0, \text{ uniformly for all } u \in K, \]

with $T_h \overset{\mathrm{df}}{=} [0, b-h]$.

THEOREM 2.3.6
$K \subseteq L^p(T;X)$, $p \in [1,+\infty)$ (respectively $K \subseteq C(T;X)$) is relatively compact if and only if
(a) for all $t, s \in (0,b)$, $s < t$, we have that the set $\Bigl\{ \int_s^t u(\tau)\,d\tau : u \in K \Bigr\}$ is relatively compact in $X$; and
(b) $K$ is $p$-equiintegrable (respectively $\lim_{h \searrow 0} \|\tau_h(u) - u\|_{L^\infty(T_h;X)} = 0$ uniformly in $u \in K$).

PROOF Suppose that $K \subseteq L^p(T;X)$, $p \in [1,+\infty)$.

“=⇒”: Since the map $u \longmapsto \int_s^t u(\tau)\,d\tau$ is continuous from $L^p(T;X)$ into $X$, property (a) is satisfied. Let $\varepsilon > 0$. Due to the relative compactness of $K \subseteq L^p(T;X)$, we can find a sequence $\{u_k\}_{k=1}^n \subseteq L^p(T;X)$, such that for every $u \in K$, we can find $k \in \{1,\dots,n\}$, such that $\|u - u_k\|_p < \frac{\varepsilon}{3}$. Because the embedding $C(T;X) \subseteq L^p(T;X)$ is dense, we can assume that $u_k \in C(T;X)$. Then we can find $h_k > 0$, such that for $h \in (0,h_k)$, we have

\[ \|\tau_h(u_k) - u_k\|_{L^p(T_h;X)} < \frac{\varepsilon}{3}. \]


Let $\hat h = \min_{1 \le k \le n} h_k$. We have

\[ \tau_h(u) - u = \tau_h(u - u_k) - (u - u_k) + \bigl(\tau_h(u_k) - u_k\bigr) \]

and so

\[ \|\tau_h(u) - u\|_{L^p(T_h;X)} < \varepsilon \qquad \forall\, h \le \hat h,\ u \in K, \]

so

\[ \lim_{h \searrow 0} \|\tau_h(u) - u\|_{L^p(T_h;X)} = 0 \qquad \text{uniformly for } u \in K. \]

“⇐=”: Let $u \in K$ and $r > 0$. We set

\[ M_r(u)(t) \overset{\mathrm{df}}{=} \frac1r \int_t^{t+r} u(s)\,ds. \]

We have $M_r(u) \in C\bigl(T_r;X\bigr)$ with $T_r = [0, b-r]$. For every $t, s \in [0, b-r]$, $s \le t$, we have

\[ \bigl\|M_r(u)(t) - M_r(u)(s)\bigr\|_X = \Bigl\| \frac1r \int_s^{s+r} \bigl(\tau_{t-s}(u) - u\bigr)(\tau)\,d\tau \Bigr\|_X \le \frac1r \bigl\|\tau_{t-s}(u) - u\bigr\|_{L^1(T_{t-s};X)}, \]

so

\[ M_r K = \bigl\{ M_r u : u \in K \bigr\} \subseteq C\bigl(T_r;X\bigr) \]

is uniformly equicontinuous (see condition (b)). Also from condition (a), we see that for every $t \in (0, b-r)$, the set $\bigl(M_r K\bigr)(t) \subseteq X$ is relatively compact. So by Theorem 2.3.2, we have that $M_r K \subseteq C\bigl(T_r;X\bigr)$ is relatively compact. Note that

\[ M_r(u)(t) - u(t) = \frac1r \int_0^r \bigl(\tau_h(u) - u\bigr)(t)\,dh \qquad \forall\, t \in T_r, \]

so

\[ \bigl\|M_r(u) - u\bigr\|_{L^p(T_r;X)} \le \max_{h \in [0,r]} \bigl\|\tau_h(u) - u\bigr\|_{L^p(T_r;X)}. \]

But because of condition (b), for all $\hat b < b$, $K$ is the uniform limit of $M_r K$ in $L^p\bigl(\hat T;X\bigr)$ with $\hat T = \bigl[0, \hat b\bigr]$ as $r \to 0$, $r \le b - \hat b$. But since $M_r K$ is relatively compact in $C\bigl(\hat T;X\bigr)$ and the embedding $C\bigl(\hat T;X\bigr) \subseteq L^p\bigl(\hat T;X\bigr)$ is continuous, we see that $K$ is relatively compact in $L^p\bigl(\hat T;X\bigr)$. Conditions (a) and (b) remain valid if one changes the time direction. Namely if $\bar u(t) = u(b - t)$, then the set $\bar K = \{\bar u : u \in K\}$ still satisfies conditions (a) and (b). Then from the previous argument we have that $\bar K$ is relatively compact in $L^p\bigl(\hat T;X\bigr)$. It follows that $K$ is relatively compact in $L^p\bigl(\bigl[\hat b, b\bigr];X\bigr)$. Setting for example $\hat b = \frac{b}{2}$, we obtain the relative compactness of $K$ in $L^p(T;X)$.

The proof of the case when $K \subseteq C(T;X)$ is similar.

REMARK 2.3.7
The restriction $p < +\infty$ is necessary, because if $K = \{u\}$ with $u$ bounded but discontinuous from $T$ into $X$, then $K$ is compact in $L^\infty(T;X)$ but condition (b) is not satisfied.

COROLLARY 2.3.8
If $u \in L^p(T;X)$, $p \in [1,+\infty)$, then

\[ \|\tau_h(u) - u\|_{L^p(T_h;X)} \longrightarrow 0 \qquad \text{as } h \searrow 0. \]
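Both Remark 2.3.7 and Corollary 2.3.8 can be seen on a single step function. The computation below is an illustrative sketch, not part of the original text; the point $a \in (0,b)$ and the unit vector $x_0 \in X$ are chosen here for the example.

```latex
% Assumptions of this sketch: 0 < a < b, x_0 \in X with \|x_0\|_X = 1,
% u = \chi_{[0,a]}\, x_0, and 0 < h < \min\{a,\, b-a\}.
\begin{align*}
\|u(t+h) - u(t)\|_X &=
  \begin{cases}
    0 & \text{if } t \in [0, a-h] \quad (\text{both values equal } x_0), \\
    1 & \text{if } t \in (a-h, a] \quad (u(t) = x_0,\ u(t+h) = 0), \\
    0 & \text{if } t \in (a, b-h] \quad (\text{both values equal } 0),
  \end{cases} \\
\|\tau_h(u) - u\|_{L^p(T_h;X)}^p &= \int_0^{b-h} \|u(t+h) - u(t)\|_X^p \, dt = h \longrightarrow 0
  \qquad \text{as } h \searrow 0, \\
\|\tau_h(u) - u\|_{L^\infty(T_h;X)} &= 1 \qquad \text{for all such } h.
\end{align*}
```

So translations of $u$ converge in every $L^p(T_h;X)$ with $p < +\infty$, at the rate $h^{1/p}$, while in the supremum norm they do not converge at all.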

When $X = \mathbb{R}$, we have the so-called “Riesz-Kolmogorov theorem” for $p \in [1,+\infty)$.

COROLLARY 2.3.9 (Riesz-Kolmogorov Theorem)
$K \subseteq L^p(T)$, $p \in [1,+\infty)$ (respectively $K \subseteq C(T)$) is relatively compact if and only if
(a) there exist $t, s \in (0,b)$, $s < t$, such that the set

\[ \Bigl\{ \int_s^t u(\tau)\,d\tau : u \in K \Bigr\} \subseteq \mathbb{R} \]

is bounded; and
(b) $\displaystyle \int_0^{b-h} |u(t+h) - u(t)|^p\,dt \longrightarrow 0$ as $h \searrow 0$ uniformly for $u \in K$.

PROOF From (a) and (b) it follows that for all $t, s \in (0,b)$, $s < t$, we have that the set $\Bigl\{ \int_s^t u(\tau)\,d\tau : u \in K \Bigr\} \subseteq \mathbb{R}$ is bounded. So we can apply Theorem 2.3.6.

REMARK 2.3.10
Theorem 2.3.6 and Corollary 2.3.9 provide characterizations of compact sets in $C(T;X)$ and $C(T)$ respectively. Compared with the classical Arzela-Ascoli theorem (see Theorem 2.3.2), we see that condition (a) (the space criterion) is now a condition on integrals, while condition (b) (the time criterion) remains the same.


Next we shall characterize sets which are bounded in $L^p(T;X)$ and compact in $L^r(T;X)$ with $r < p$. Such results are known as “partial compactness” results, since the compactness is not achieved for the larger order $p$ for which the set is actually bounded. First we obtain two auxiliary results relating compactness with time-local compactness.

LEMMA 2.3.11
The set $K \subseteq L^p(T;X)$ (with $p \in [1,+\infty)$) is relatively compact if and only if
(a) $K \subseteq L^p_{\mathrm{loc}}(T;X)$ is relatively compact (i.e., for all $s, t \in (0,b)$, $s < t$, the set $K|_{[s,t]} \subseteq L^p\bigl([s,t];X\bigr)$ is relatively compact); and
(b) $\displaystyle \int_0^h \|u(t)\|_X^p\,dt + \int_{b-h}^b \|u(t)\|_X^p\,dt \longrightarrow 0$ as $h \to 0$ uniformly for $u \in K$.

PROOF “=⇒”: Condition (a) is automatically true. Let $\bar u$ be the extension by $0$ outside $T$ of $u$. Then the set

\[ \bar K = \bigl\{ \bar u : u \in K \bigr\} \]

is relatively compact in $L^p\bigl([-b, 2b];X\bigr)$. As

\[ \bigl\|\tau_h(\bar u) - \bar u\bigr\|_{L^p([-b,2b];X)}^p = \int_0^h \|u(t)\|_X^p\,dt + \int_0^{b-h} \|u(t+h) - u(t)\|_X^p\,dt + \int_{b-h}^b \|u(t)\|_X^p\,dt, \]

applying Theorem 2.3.6, we obtain

\[ \int_0^h \|u(t)\|_X^p\,dt + \int_{b-h}^b \|u(t)\|_X^p\,dt \longrightarrow 0 \qquad \text{as } h \searrow 0 \text{ uniformly for } u \in K. \]

“⇐=”: Let

\[ u_h \overset{\mathrm{df}}{=} \chi_{[h,b-h]}\,u \qquad \text{and} \qquad K_h \overset{\mathrm{df}}{=} \bigl\{ u_h : u \in K \bigr\}. \]

Condition (b) implies that for a given $\varepsilon > 0$, we can find $h > 0$ small enough, such that

\[ \|u_h - u\|_{L^p(T;X)} < \varepsilon \qquad \forall\, u \in K. \]

Since $K_h \subseteq L^p(T;X)$ is relatively compact (see condition (a)), from Lemma 2.3.1, we infer that $K \subseteq L^p(T;X)$ is relatively compact.


LEMMA 2.3.12
If $K \subseteq L^p(T;X)$ (with $p \in (1,+\infty]$) is bounded and $K \subseteq L^1_{\mathrm{loc}}(T;X)$ is relatively compact (i.e., for all $t, s \in (0,b)$, $s < t$, $K|_{[s,t]} \subseteq L^1\bigl([s,t];X\bigr)$ is relatively compact), then $K \subseteq L^r(T;X)$ is relatively compact for all $r \in (1,p)$.

PROOF For every $h \le b$ and every $u \in K$, from Hölder's inequality (see Theorem A.2.27), we have

\[ \int_0^h \|u(t)\|_X\,dt + \int_{b-h}^b \|u(t)\|_X\,dt \le 2 h^{\frac{1}{q'}} \|u\|_q, \]

so $K \subseteq L^1(T;X)$ is relatively compact (see Lemma 2.3.11). So for a given $\varepsilon > 0$, we can find $\{u_k\}_{k=1}^n \subseteq K$, such that for every $u \in K$, there exists $k \in \{1,\dots,n\}$, such that $\|u - u_k\|_1$

[…]

$\varepsilon > 0$, we can find $\{u_k\}_{k=1}^n \subseteq K$, such that for each $u \in K$ there exists $k \in \{1,\dots,n\}$, such that

\[ \|u - u_k\|_{L^p(T;Z)} < \varepsilon. \]

Invoking Lemma 2.2.29, for every $\xi > 0$, we can find $c = c(\xi) > 0$, such that

\[ \|u - u_k\|_{L^p(T;Y)} \le \xi \|u - u_k\|_{L^p(T;X)} + c \|u - u_k\|_{L^p(T;Z)} \le \xi \delta_X + c\,\varepsilon, \tag{2.28} \]

where $\delta_X \overset{\mathrm{df}}{=} \operatorname{diam}_{L^p(T;X)} K$. For a given $\varepsilon' > 0$, select $\xi = \dfrac{\varepsilon'}{2\delta_X}$ and $\varepsilon \overset{\mathrm{df}}{=} \dfrac{\varepsilon'}{2c}$. Then from (2.28), we have

\[ \|u - u_k\|_{L^p(T;Y)} \le \varepsilon', \]

so $K \subseteq L^p(T;Y)$ is relatively compact.

Based on the above lemma, we can have the following compactness result for an intermediate space.

THEOREM 2.3.19
If $Y, Z$ are Banach spaces, the embeddings $X \subseteq Y \subseteq Z$ are continuous, with the first embedding compact, $p \in [1,+\infty]$ and
(i) $K \subseteq L^p(T;X)$ is bounded,
(ii) $\|\tau_h(u) - u\|_{L^p(T_h;Z)} \longrightarrow 0$ as $h \searrow 0$ uniformly for $u \in K$,
then $K$ is relatively compact in $L^p(T;Y)$ if $p \in [1,+\infty)$ and in $C(T;Y)$ if $p = +\infty$.

PROOF Because of the compactness of the embedding $X \subseteq Y$ and of Theorem 2.3.6, the set $K \subseteq L^p(T;Z)$ is relatively compact. An application of Lemma 2.3.18 finishes the proof.

This result permits an extension of Theorem 2.2.30.


THEOREM 2.3.20
If $Y, Z$ are Banach spaces, the embeddings $X \subseteq Y \subseteq Z$ are continuous with the first embedding compact, then
(a) if $K \subseteq L^p(T;X)$ (with $p \in [1,+\infty)$) is bounded and the set $K' \overset{\mathrm{df}}{=} \{u' = Du : u \in K\} \subseteq L^1(T;Z)$ is bounded, we have that $K \subseteq L^p(T;Y)$ is relatively compact;
(b) if $K \subseteq L^\infty(T;X)$ is bounded and $K' \overset{\mathrm{df}}{=} \{u' = Du : u \in K\} \subseteq L^r(T;Z)$ (with $r > 1$) is bounded, we have that $K \subseteq C(T;Y)$ is relatively compact.

Now let us look at weakly compact subsets of $L^1(\Omega;X)$. To describe a large class of such sets in $L^1(\Omega;X)$, we shall need two results which for easy reference we state here without proofs. The first is the celebrated James theorem.

THEOREM 2.3.21 (James Theorem)
A nonempty, weakly closed and bounded subset of a Banach space $X$ is weakly compact if and only if every $x^* \in X^*$ attains its maximum on the set.

The second result is a remarkable consequence of the property of decomposability. If $(\Omega, \Sigma, \mu)$ is a finite measure space and $X$ is a Banach space, a set $K \subseteq L^1(\Omega;X)$ is said to be decomposable if and only if $\chi_A u_1 + \chi_{A^c} u_2 \in K$ for all $(u_1, u_2, A) \in K \times K \times \Sigma$.

PROPOSITION 2.3.22
If $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is a separable Banach space, $\varphi : \Omega \times X \longrightarrow \overline{\mathbb{R}} \overset{\mathrm{df}}{=} \mathbb{R} \cup \{+\infty\}$ is jointly measurable, $F : \Omega \longrightarrow 2^X \setminus \{\emptyset\}$ is graph measurable (i.e., $\operatorname{Gr} F \overset{\mathrm{df}}{=} \{(\omega, x) \in \Omega \times X : x \in F(\omega)\} \in \Sigma \times \mathcal{B}(X)$ with $\mathcal{B}(X)$ being the Borel $\sigma$-field of $X$),

\[ I_\varphi(u) = \int_\Omega \varphi\bigl(\omega, u(\omega)\bigr)\,d\mu \]

is defined (maybe $+\infty$ or $-\infty$) for all $u \in S_F^1$ with

\[ S_F^1 \overset{\mathrm{df}}{=} \bigl\{ u \in L^1(\Omega;X) : u(\omega) \in F(\omega) \text{ for } \mu\text{-a.a. } \omega \in \Omega \bigr\} \]

and there exists $u_0 \in S_F^1$, such that $I_\varphi(u_0) > -\infty$, then

\[ \sup_{u \in S_F^1} I_\varphi(u) = \int_\Omega \sup_{x \in F(\omega)} \varphi(\omega, x)\,d\mu. \]


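Using these results, we can identify a large class of weakly compact subsets of $L^1(\Omega;X)$.

THEOREM 2.3.23
If $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is a separable Banach space, $F : \Omega \longrightarrow 2^X \setminus \{\emptyset\}$ is graph measurable, for $\mu$-almost all $\omega \in \Omega$, $F(\omega)$ is weakly compact, convex and there exists $h \in L^1(\Omega)_+$, such that

\[ \sup_{x \in F(\omega)} \|x\|_X \le h(\omega) \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega, \]

then

\[ S_F^1 = \bigl\{ u \in L^1(\Omega;X) : u(\omega) \in F(\omega) \text{ for } \mu\text{-a.a. } \omega \in \Omega \bigr\} \]

is weakly compact and convex.

PROOF Convexity of $S_F^1$ is obvious. Moreover, because of the boundedness by $h \in L^1(\Omega)_+$, we have $S_F^1 \ne \emptyset$ (see Denkowski, Migórski & Papageorgiou (2003a, p. 432)). So we show that $S_F^1$ is weakly compact in $L^1(\Omega;X)$. According to Theorem 2.3.21, it suffices to show that every $u^* \in \bigl(L^1(\Omega;X)\bigr)^*$ attains its supremum on $S_F^1$. From Theorem 2.2.12, we know that

\[ \bigl(L^1(\Omega;X)\bigr)^* = L^\infty\bigl(\Omega; X^*_{w^*}\bigr) \]

and the duality pairing is given by

\[ \langle u^*, u\rangle_{L^1(\Omega;X)} = \int_\Omega \bigl\langle u^*(\omega), u(\omega)\bigr\rangle_X\,d\mu. \]

From Proposition 2.3.22, we have

\[ \sup_{u \in S_F^1} \langle u^*, u\rangle_{L^1(\Omega;X)} = \sup_{u \in S_F^1} \int_\Omega \bigl\langle u^*(\omega), u(\omega)\bigr\rangle_X\,d\mu = \int_\Omega \sup_{x \in F(\omega)} \bigl\langle u^*(\omega), x\bigr\rangle_X\,d\mu. \]

Let

\[ M(\omega) \overset{\mathrm{df}}{=} \Bigl\{ y \in F(\omega) : \bigl\langle u^*(\omega), y\bigr\rangle_X = \sup_{x \in F(\omega)} \bigl\langle u^*(\omega), x\bigr\rangle_X \Bigr\}. \]

Since $F(\omega)$ is $\mu$-almost everywhere $w$-compact, we see that

\[ M(\omega) \ne \emptyset \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]

By setting $F(\omega) = \{0\}$ on the exceptional $\mu$-null set, we can say that

\[ M(\omega) \ne \emptyset \qquad \forall\, \omega \in \Omega. \]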


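Also from Denkowski, Migórski & Papageorgiou (2003a, p. 433), we know that we can find a sequence of $\Sigma$-measurable functions $f_n : \Omega \longrightarrow X$, such that

\[ f_n(\omega) \in F(\omega) \qquad \forall\, \omega \in \Omega,\ n \ge 1 \]

and

\[ F(\omega) = \overline{\{f_n(\omega)\}_{n\ge 1}}^{\,\|\cdot\|_X} \qquad \forall\, \omega \in \Omega. \]

Hence

\[ \sup_{x \in F(\omega)} \bigl\langle u^*(\omega), x\bigr\rangle_X = \sup_{n \ge 1} \bigl\langle u^*(\omega), f_n(\omega)\bigr\rangle_X \]

and so

\[ \omega \longmapsto \sup_{x \in F(\omega)} \bigl\langle u^*(\omega), x\bigr\rangle_X = m^*(\omega) \quad \text{is } \Sigma\text{-measurable.} \]

Then

\begin{align*}
\operatorname{Gr} M &= \bigl\{ (\omega, x) \in \Omega \times X : x \in M(\omega) \bigr\} \\
 &= \bigl\{ (\omega, x) \in \Omega \times X : \bigl\langle u^*(\omega), x\bigr\rangle_X = m^*(\omega) \bigr\}.
\end{align*}

Since $u^*$ is $w^*$-measurable, it follows that $\operatorname{Gr} M \in \Sigma \times \mathcal{B}(X)$. So we can apply the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) and obtain a strongly measurable function $u_0 : \Omega \longrightarrow X$, such that $u_0(\omega) \in M(\omega)$ for $\mu$-a.a. $\omega \in \Omega$. Evidently $u_0 \in S_F^1$ and

\[ \langle u^*, u_0\rangle_{L^1(\Omega;X)} = \sup_{u \in S_F^1} \langle u^*, u\rangle_{L^1(\Omega;X)}. \]

Since $u^* \in L^\infty\bigl(\Omega; X^*_{w^*}\bigr) = \bigl(L^1(\Omega;X)\bigr)^*$ was arbitrary and clearly $S_F^1$ is weakly closed and bounded, from Theorem 2.3.21, we conclude that $S_F^1 \subseteq L^1(\Omega;X)$ is weakly compact.

A classical theorem of Dunford-Pettis isolates the relatively weakly compact subsets of $L^1(\Omega)$ as the bounded, uniformly integrable subsets (see Definition A.2.3). If $X$ is reflexive, the original proof for $L^1(\Omega)$ extends with only notational changes to the present vector valued setting. So we have

THEOREM 2.3.24 (Dunford-Pettis Theorem)
If $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is reflexive and $K \subseteq L^1(\Omega;X)$ is bounded, then $K$ is relatively weakly compact in $L^1(\Omega;X)$ if and only if it is uniformly integrable.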


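The relative weak compactness in $L^1(\Omega;X)$ is closely related with the so-called “biting convergence,” which is useful in the calculus of variations and in optimal control.

DEFINITION 2.3.25
A sequence $\{u_n\}_{n\ge 1} \subseteq L^1(\Omega;X)$ is said to converge to $u \in L^1(\Omega;X)$ in the biting sense, denoted by $u_n \xrightarrow{\;b\;} u$, if there exists an increasing sequence $\{C_m\}_{m\ge 1} \subseteq \Sigma$, such that

\[ \mu(C_m) \nearrow \mu(\Omega) \quad \text{as } m \to +\infty \]

and

\[ u_n \xrightarrow{\;w\;} u \quad \text{in } L^1(C_m;X) \qquad \forall\, m \ge 1. \]

The so-called “Chacon Biting Lemma” says that if $X$ is a reflexive Banach space, then every bounded sequence in $L^1(\Omega;X)$ has a subsequence converging in $L^1(\Omega;X)$ in the biting sense. The next result is a slightly stronger version of the original biting lemma.

THEOREM 2.3.26
If $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is a Banach space and $\{u_n\}_{n\ge 1} \subseteq L^1(\Omega;X)$ is bounded, then there exists a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$ and an increasing sequence $\{C_m\}_{m\ge 1} \subseteq \Sigma$, such that $\mu(C_m) \nearrow \mu(\Omega)$ as $m \to +\infty$ and $\bigl\{\chi_{C_k} u_{n_k}\bigr\}_{k\ge 1}$ is uniformly integrable.

PROOF Let $m \ge 1$ and define

\[ h_m(t) \overset{\mathrm{df}}{=} \sup_{n \ge m} \int_{\{\|u_n\|_X \ge t\}} \|u_n(\omega)\|_X\,d\mu \qquad \forall\, t \ge 0. \]

Note that $h_m : \mathbb{R}_+ \longrightarrow \mathbb{R}_+$ and it is decreasing. So $\lim_{t\to+\infty} h_m(t)$ exists for all $m \ge 1$. Let

\[ \xi \overset{\mathrm{df}}{=} \lim_{t\to+\infty} h_1(t). \]

Since for every $m \ge 1$, $\{u_1,\dots,u_{m-1}\}$ is uniformly integrable, it follows that

\[ \lim_{t\to+\infty} h_m(t) = \xi. \]

Let $\{t_i\}_{i\ge 1} \subseteq \mathbb{R}_+$ be a sequence increasing to $+\infty$, such that

\[ \xi \le h_1(t_i) \le \xi + \frac1i \qquad \forall\, i \ge 1. \]

Because

\[ \xi \le h_m(t) \qquad \forall\, m \ge 1,\ t \in \mathbb{R}_+, \]

we can find a strictly increasing sequence $\{l_i\}_{i\ge 1}$, such that

\[ \int_{\{\|u_{l_i}\|_X \ge t_i\}} \|u_{l_i}(\omega)\|_X\,d\mu \ge \xi - \frac1i. \]

Set $D_i \overset{\mathrm{df}}{=} \{\|u_{l_i}\|_X \ge t_i\}$. Then

\[ t_i\,\mu(D_i) \le \sup_{n \ge 1} \|u_n\|_1 \]

and so $\mu(D_i) \longrightarrow 0$ as $i \to +\infty$. Let $C_i \overset{\mathrm{df}}{=} \Omega \setminus D_i$. We claim that the sequence $\bigl\{\chi_{C_i} u_{l_i}\bigr\}_{i\ge 1}$ is uniformly integrable. To this end let

\[ h(t) \overset{\mathrm{df}}{=} \sup_{i \ge 1} \int_{C_i \cap \{\|u_{l_i}\|_X \ge t\}} \|u_{l_i}(\omega)\|_X\,d\mu. \]

We need to show that $h(t) \longrightarrow 0$ as $t \to +\infty$. We have

\begin{align*}
h(t_r) &= \sup_{i \ge r} \int_{\{t_r \le \|u_{l_i}\|_X < t_i\}} \|u_{l_i}(\omega)\|_X\,d\mu \\
 &= \sup_{i \ge r} \biggl[ \int_{\{\|u_{l_i}\|_X \ge t_r\}} \|u_{l_i}(\omega)\|_X\,d\mu - \int_{\{\|u_{l_i}\|_X \ge t_i\}} \|u_{l_i}(\omega)\|_X\,d\mu \biggr] \\
 &\le \sup_{i \ge r} \biggl[ \Bigl(\xi + \frac1r\Bigr) - \Bigl(\xi - \frac1i\Bigr) \biggr] \le \frac2r,
\end{align*}

so $h(t) \longrightarrow 0$ as $t \to +\infty$.

Finally replace $\{D_i\}_{i\ge 1}$ by a sequence decreasing to $\emptyset$ (or a $\mu$-null set). Since $\mu(D_i) \longrightarrow 0$, there exists a strictly increasing sequence $\{i_k\}_{k\ge 1}$, such that

\[ \mu\bigl(D_{i_k}\bigr) \le \frac{1}{2^k} \qquad \forall\, k \ge 1. \]

Set $A_k \overset{\mathrm{df}}{=} \bigcup_{j=k}^{\infty} D_{i_j}$ for $k \ge 1$. Then $\{A_k\}_{k\ge 1}$ decreases to a $\mu$-null set. The subsequence of the statement of the theorem is defined by setting $u_{n_k} = u_{l_i}$ with $i = i_k$. Also $C_k = \Omega \setminus A_k$ for $k \ge 1$. The uniform integrability of the sequence $\bigl\{\chi_{C_k} u_{n_k}\bigr\}_{k\ge 1}$ follows from the inclusion $\Omega \setminus A_k \subseteq \Omega \setminus D_{i_k}$.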


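EXAMPLE 2.3.27 In the previous theorem, it is necessary to pass to a subsequence. To see this let $\Omega = [0,1]$ and $\mu = \lambda^1$ (the Lebesgue measure on $\mathbb{R}$). If $n = 2^k + i$ with $k \ge 1$, $i \in \{0,\dots,2^k-1\}$, we set
\[
u_n(\omega) \stackrel{df}{=}
\begin{cases}
2^k & \text{if } \omega \in \left[\dfrac{i}{2^k}, \dfrac{i+1}{2^k}\right), \\[1ex]
0 & \text{otherwise.}
\end{cases}
\]
Then there is no increasing sequence $\{C_m\}_{m\ge 1}$ with $\Omega = \bigcup_{m=1}^{\infty} C_m$, such that $\{\chi_{C_m} u_n\}_{n\ge 1}$ is uniformly integrable. Indeed, if
\[
\lambda^1\big(\Omega \setminus C_m\big) \le \frac{1}{2},
\]
then for all $k \ge 1$, there exists $i \in \{0,\dots,2^k-1\}$, such that
\[
\lambda^1\left(\left[\frac{i}{2^k}, \frac{i+1}{2^k}\right) \cap C_m\right) \ge \frac{1}{2^{k+1}}.
\]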
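The failure of uniform integrability for the full sequence of Example 2.3.27 can be checked by a direct computation. The following Python sketch is only an illustration: the helper names `decompose`, `tail_integral` and `sup_tail` are ours and do not come from the text. It uses exact rational arithmetic and the fact that $u_n$ takes the single nonzero value $2^k$ on an interval of length $2^{-k}$.

```python
from fractions import Fraction

def decompose(n):
    """Write n = 2**k + i with 0 <= i < 2**k (n >= 2 gives k >= 1)."""
    k = n.bit_length() - 1
    return k, n - 2**k

def tail_integral(n, t):
    """Integral of u_n over the set {u_n >= t} with respect to Lebesgue measure."""
    k, _ = decompose(n)
    height = Fraction(2**k)        # value of u_n on its supporting interval
    length = Fraction(1, 2**k)     # length of the supporting interval
    return height * length if height >= t else Fraction(0)

def sup_tail(t, n_max):
    """sup over 2 <= n <= n_max of the tail integrals at level t."""
    return max(tail_integral(n, t) for n in range(2, n_max + 1))

if __name__ == "__main__":
    for t in (10, 100, 1000):
        print(t, sup_tail(t, 4096))
```

For every level $t$ the supremum of the tail integrals equals 1 as soon as the range of indices contains some $n$ with $2^k \ge t$, so the tails do not tend to 0 as $t \to +\infty$.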

COROLLARY 2.3.28 If $X$ is reflexive and $\{u_n\}_{n\ge 1} \subseteq L^1(\Omega;X)$ is bounded, then we can find a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$ and $u \in L^1(\Omega;X)$, such that
\[
u_{n_k} \xrightarrow{\ b\ } u \qquad \text{as } k \to +\infty.
\]

REMARK 2.3.29 As we shall see in Section 2.5, some of the ideas involved in the “Biting Lemma” are common in the “concentration compactness theorem” (see Theorem 2.5.30).

To have an analogous result for the spaces $L^p(\Omega;X)$, with $p \in (1,+\infty)$, we need the following result, which is useful in many situations since it provides information about the pointwise behaviour of a weakly convergent sequence in $L^p(\Omega;X)$ for $p \in [1,+\infty)$. First a definition-notation.

DEFINITION 2.3.30 Let $X$ be a Banach space and $\{A_n\}_{n\ge 1} \subseteq 2^X \setminus \{\emptyset\}$. We set
\[
w\text{-}\limsup_{n\to+\infty} A_n \stackrel{df}{=} \Big\{ x \in X : x = w\text{-}\lim_{k\to+\infty} x_{n_k},\ x_{n_k} \in A_{n_k},\ n_1 < n_2 < \dots \Big\}.
\]
Here $w$ stands for the weak topology on $X$.


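PROPOSITION 2.3.31 If $(\Omega,\Sigma,\mu)$ is a finite measure space, $X$ is a Banach space, $\{u_n\}_{n\ge 1} \subseteq L^p(\Omega;X)$ and $u \in L^p(\Omega;X)$ with $p \in [1,+\infty)$,
\[
u_n \xrightarrow{\ w\ } u \qquad \text{in } L^p(\Omega;X)
\]
and for $\mu$-almost all $\omega \in \Omega$, the sequence $\{u_n(\omega)\}_{n\ge 1}$ is relatively weakly compact, then
\[
u(\omega) \in \overline{\operatorname{conv}}\; w\text{-}\limsup_{n\to+\infty} \{u_n(\omega)\} \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega.
\]

Using this Proposition we can have the following result for bounded sequences in $L^p(\Omega;X)$ (with $p \in (1,+\infty)$).

THEOREM 2.3.32 If $(\Omega,\Sigma,\mu)$ is a finite measure space, $X$ is a reflexive Banach space, $\{u_n\}_{n\ge 1} \subseteq L^p(\Omega;X)$ (with $p \in (1,+\infty)$) is bounded and
\[
u_n(\omega) \xrightarrow{\ w\ } u(\omega) \quad \text{in } X \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega, \tag{2.29}
\]
then $u \in L^p(\Omega;X)$ and
\[
u_n \xrightarrow{\ w\ } u \qquad \text{in } L^p(\Omega;X).
\]

PROOF From Proposition 2.2.3(c), we know that $L^p(\Omega;X)$ is reflexive. So by the Eberlein-Smulian theorem (see Theorem A.3.8), we can find a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$, such that
\[
u_{n_k} \xrightarrow{\ w\ } \widehat{u} \qquad \text{in } L^p(\Omega;X).
\]
Using Proposition 2.3.31 and (2.29), we infer that $u = \widehat{u} \in L^p(\Omega;X)$. The same argument applies to any subsequence of $\{u_n\}_{n\ge 1}$. So every subsequence of $\{u_n\}_{n\ge 1}$ has a further subsequence weakly convergent in $L^p(\Omega;X)$ to $u$ and from this it follows that
\[
u_n \xrightarrow{\ w\ } u \qquad \text{in } L^p(\Omega;X).
\]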

Another notion related to the weak convergence in $L^1(T;X)$ ($T = [0,b]$) is given in the next definition.

DEFINITION 2.3.33 Let $T = [0,b]$ ($b < +\infty$) and $X$ a Banach space. The weak norm on $L^1(T;X)$ is defined by
\[
\|u\|_w \stackrel{df}{=} \max_{0\le s\le t\le b} \left\| \int_s^t u(\tau)\, d\tau \right\|_X \qquad \forall\, u \in L^1(T;X).
\]


REMARK 2.3.34 Equivalently we can define
\[
\|u\|_w = \max_{t\in T} \left\| \int_0^t u(\tau)\, d\tau \right\|_X \qquad \forall\, u \in L^1(T;X).
\]
Evidently $\|\cdot\|_w$ is a norm on $L^1(T;X)$ weaker than the usual norm
\[
\|u\|_1 = \int_0^b \|u(t)\|_X \, dt \qquad \forall\, u \in L^1(T;X).
\]
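The following short computation is added as a sketch of why the two expressions for the weak norm are equivalent as norms (they need not be equal) and why both are dominated by $\|\cdot\|_1$. It only uses the additivity of the Bochner integral and the triangle inequality.

```latex
\[
\left\| \int_s^t u(\tau)\,d\tau \right\|_X
= \left\| \int_0^t u(\tau)\,d\tau - \int_0^s u(\tau)\,d\tau \right\|_X
\le 2 \max_{r\in T} \left\| \int_0^r u(\tau)\,d\tau \right\|_X ,
\qquad 0 \le s \le t \le b,
\]
hence
\[
\max_{t\in T} \left\| \int_0^t u(\tau)\,d\tau \right\|_X
\le \max_{0\le s\le t\le b} \left\| \int_s^t u(\tau)\,d\tau \right\|_X
\le 2 \max_{t\in T} \left\| \int_0^t u(\tau)\,d\tau \right\|_X ,
\]
and, for all $0 \le s \le t \le b$,
\[
\left\| \int_s^t u(\tau)\,d\tau \right\|_X \le \int_0^b \|u(\tau)\|_X \, d\tau = \|u\|_1 .
\]
```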

We shall show that for a broad class of subsets of $L^1(T;X)$, the topology generated by the weak norm $\|\cdot\|_w$ and the weak $L^1(T;X)$-topology coincide. For this purpose we introduce the following property for subsets of $L^1(T;X)$.

DEFINITION 2.3.35 Let $T = [0,b]$ and $X$ a Banach space. We say that $K \subseteq L^1(T;X)$ has property $U$, if
(a) $K$ is uniformly integrable; and
(b) for every $\varepsilon > 0$, there exists a compact set $C_\varepsilon \subseteq X$, such that for every $u \in K$ there exists a Lebesgue measurable set $A_{\varepsilon,u} \subseteq T$, such that
\[
\lambda^1\big(T \setminus A_{\varepsilon,u}\big) < \varepsilon \quad \text{and} \quad u(t) \in C_\varepsilon \qquad \forall\, t \in A_{\varepsilon,u}
\]
(here $\lambda^1$ stands for the Lebesgue measure on $T$).

REMARK 2.3.36 Since the Lebesgue measure $\lambda^1$ is nonatomic (see Theorem A.2.5 and Remark A.2.6), the uniform integrability property implies that $K$ is bounded. Also if $K \subseteq L^1(T;X)$ has property $U$, then $K$ is relatively $w$-compact (see Bourgain (1979)).

THEOREM 2.3.37 If $T = [0,b]$, $X$ is a Banach space and $K \subseteq L^1(T;X)$ has property $U$, then the weak $L^1(T;X)$-topology and $\|\cdot\|_w$-norm topology on $K$ coincide. Moreover, $K$ is relatively $\|\cdot\|_w$-compact.

PROOF For every $n \ge 1$, let $C_{\frac{1}{n}} \subseteq X$ be the compact set postulated by Definition 2.3.35. The set
\[
C \stackrel{df}{=} \bigcup_{n=1}^{\infty} C_{\frac{1}{n}}
\]
is separable in $X$ and note that
\[
u(t) \in C \qquad \forall\, t \in T \setminus A_u,
\]


with $\lambda^1(A_u) = 0$. So by replacing $X$ by $\overline{\operatorname{span}}\, C$ if necessary, we may assume that $X$ is a separable Banach space. Then the dual unit ball
\[
\overline{B}_1^{\,*} = \big\{ x^* \in X^* : \|x^*\|_{X^*} \le 1 \big\}
\]
furnished with the relative $w^*$-topology is compact metrizable. Let $\{t_n\}_{n\ge 1} \subseteq T$ be a dense set and consider a $w^*$-dense set $\{x_n^*\}_{n\ge 1} \subseteq \overline{B}_1^{\,*}$. The family
\[
\big\{ \chi_{[t_m,t_k]}\, x_n^* : n, m, k \ge 1,\ t_m < t_k \big\}
\]
is countable and so it can be enumerated as $\{\varphi_i\}_{i\ge 1}$. We have
\[
\|u\|_w = \sup_{i\ge 1} \left| \int_0^b \big\langle \varphi_i(t), u(t) \big\rangle_X \, dt \right|
\]
(see Definition 2.3.33). Let $S : L^1(T;X) \longrightarrow l^\infty$ be the continuous, linear operator defined by
\[
S(u) \stackrel{df}{=} \left\{ \int_0^b \big\langle \varphi_i(t), u(t) \big\rangle_X \, dt \right\}_{i\ge 1}.
\]
Note that
\[
\|S(u)\|_{l^\infty} = \|u\|_w.
\]
We claim that $S(K)$ is relatively strongly compact in $l^\infty$. First suppose that there exists a norm compact set $D \subseteq X$, such that
\[
K \subseteq S_D^1 = \big\{ u \in L^1(T;X) : u(t) \in D \text{ for a.a. } t \in T \big\}.
\]
Let $\{e_i\}_{i\ge 1}$ be the standard basis in $l^1$ and $C(D)$ the space of continuous $\mathbb{R}$-valued functions on $D$. Let $\widehat{S} : l^1 \longrightarrow L^1\big(T;C(D)\big)$ be defined by
\[
\widehat{S}(e_i) \stackrel{df}{=} \varphi_i \qquad \forall\, i \ge 1
\]
and on all of $l^1$ by linearity and continuity. Using Theorem 2.3.6, we can see that $\{\varphi_i\}_{i\ge 1} \subseteq L^1\big(T;C(D)\big)$ is relatively norm-compact. Hence $\widehat{S}$ is a compact operator and then by Schauder’s theorem, the adjoint operator
\[
\widehat{S}^* : L^1\big(T;C(D)\big)^* = L^\infty\big(T;M(D)_{w^*}\big) \longrightarrow l^\infty
\]
is compact (here by $M(D)_{w^*}$ we denote the space of Radon measures furnished with the $w^*$-topology; recall that by the Riesz-Markov theorem, $C(D)^* = M(D)$;


see Theorem A.3.25). For every $g \in L^\infty\big(T;M(D)_{w^*}\big) = L^1\big(T;C(D)\big)^*$, we have
\[
\widehat{S}^*(g) = \left\{ \int_0^b \big\langle g(t), \varphi_i(t) \big\rangle_{C(D)} \, dt \right\}_{i\ge 1},
\]
where
\[
\big\langle g(t), \varphi_i(t) \big\rangle_{C(D)} = \int_D \varphi_i(t)(x)\, dg(t)(x),
\]
for the measure $g(t) \in M(D)$. If for every $t \in T$, $g(t)$ is the Dirac measure concentrated on $u(t) \in D \subseteq X$, then $g \in L^\infty\big(T;M(D)_{w^*}\big)$ and
\[
\widehat{S}^*(g) = \left\{ \int_0^b \big\langle \varphi_i(t), u(t) \big\rangle_X \, dt \right\}_{i\ge 1} = S(u).
\]
Therefore we see that the action of the operator $S$ on $K$ can be identified with the action of the operator $\widehat{S}^*$ and so $S(K) \subseteq l^\infty$ is relatively compact.

Now we pass to the general case and assume that $K$ has property $U$. For $\varepsilon > 0$ consider the set
\[
K_\varepsilon \stackrel{df}{=} \big\{ \chi_{A_{\varepsilon,u}} u : u \in K \big\},
\]
where $A_{\varepsilon,u} \subseteq T$ is the Lebesgue measurable set postulated by Definition 2.3.35. By virtue of the uniform integrability of $K$, for each $\delta > 0$, we can find $\varepsilon > 0$, such that
\[
\inf_{v\in K_\varepsilon} \|u - v\|_1 < \delta \qquad \forall\, u \in K.
\]
Note that $\|\cdot\|_w \le \|\cdot\|_1$. Therefore from the definition of the operator $S$, we have
\[
\inf_{y\in S(K_\varepsilon)} \|S(u) - y\|_{l^\infty} < \delta \qquad \forall\, u \in K.
\]
But the set $S(K_\varepsilon) \subseteq l^\infty$ is relatively norm-compact. Hence $S(K) \subseteq l^\infty$ is relatively norm-compact. Since $S \in \mathcal{L}\big(L^1(T;X); l^\infty\big)$, we also have that $S$ is weak-to-weak continuous. The weak and norm topologies coincide on $S(K)$. Thus $S|_K$ is weak-to-norm continuous. Recall that because $K$ has property $U$, it is relatively $w$-compact (see Remark 2.3.36). Since without any loss of generality we may assume that $K$ is convex, the linear map $S : K \longrightarrow S(K)$ is a weak-to-norm homeomorphism. The norm topology of $l^\infty$ on $S(K)$ is the norm topology of $\big(L^1(T;X), \|\cdot\|_w\big)$ on $K$. Therefore the set $K$ is relatively $\|\cdot\|_w$-compact and the proof of the theorem is finished.


Next we ask the question of when a weakly convergent sequence in $L^p$ is strongly convergent. If $p \in (1,+\infty)$ and $X$ is uniformly convex (hence reflexive; see Remark A.3.22), the Lebesgue-Bochner space $L^p(\Omega;X)$ is uniformly convex and so we have the Kadec-Klee property which says that if
\[
u_n \xrightarrow{\ w\ } u \ \text{ in } L^p(\Omega;X) \quad \text{and} \quad \|u_n\|_p \longrightarrow \|u\|_p,
\]
then
\[
u_n \longrightarrow u \qquad \text{in } L^p(\Omega;X).
\]
This is no longer true for $L^1(\Omega;X)$. The next proposition illustrates the difference between weak and strong convergence in $L^1(\Omega)$. A sequence $\{u_n\}_{n\ge 1} \subseteq L^1(\Omega)$ which converges weakly but not strongly oscillates violently around its weak limit.

PROPOSITION 2.3.38 If $(\Omega,\Sigma,\mu)$ is a finite measure space, $\{u_n\}_{n\ge 1} \subseteq L^1(\Omega)$, $u_n \xrightarrow{\ w\ } u$ in $L^1(\Omega)$ and
\[
u(\omega) \le \liminf_{n\to+\infty} u_n(\omega) \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega,
\]
then $u_n \longrightarrow u$ in $L^1(\Omega)$.

PROOF Without any loss of generality, we may assume that $u = 0$. From Theorem 2.3.24, we know that the sequence $\{u_n\}_{n\ge 1} \subseteq L^1(\Omega)$ is uniformly integrable. So given $\varepsilon > 0$ we can find $\delta = \delta(\varepsilon) > 0$, such that if $A \in \Sigma$, $\mu(A) < \delta$, then
\[
\int_A |u_n| \, d\mu < \varepsilon \qquad \forall\, n \ge 1.
\]
For every $N \ge 1$, let
\[
\Omega_N \stackrel{df}{=} \left\{ \omega \in \Omega : \inf_{n\ge N} u_n(\omega) \ge -\frac{\varepsilon}{\mu(\Omega)} \right\}.
\]
Because of our hypothesis and since we have assumed that $u = 0$, we can find $N \ge 1$ large so that $\mu\big(\Omega_N^c\big) < \delta$. Also since
\[
u_n \xrightarrow{\ w\ } u = 0 \qquad \text{in } L^1(\Omega),
\]
we can find $N_1 \ge N$, such that for all $n \ge N_1$, we have
\[
\left| \int_{\Omega_N} u_n \, d\mu \right| < \varepsilon.
\]


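So for all $n \ge N_1$, we have
\begin{align*}
\int_\Omega |u_n| \, d\mu &= \int_{\Omega_N} |u_n| \, d\mu + \int_{\Omega_N^c} |u_n| \, d\mu \\
&\le \left| \int_{\Omega_N} \left( u_n + \frac{\varepsilon}{\mu(\Omega)} \right) d\mu \right| + \int_{\Omega_N} \frac{\varepsilon}{\mu(\Omega)} \, d\mu + \int_{\Omega_N^c} |u_n| \, d\mu \le 4\varepsilon,
\end{align*}
so
\[
\int_\Omega |u_n| \, d\mu \longrightarrow 0 \qquad \text{as } n \to +\infty,
\]
i.e.,
\[
u_n \longrightarrow u = 0 \qquad \text{in } L^1(\Omega).
\]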
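To illustrate the remark that a weakly but not strongly convergent sequence oscillates around its weak limit, here is a small numerical sketch. The choice $u_n(x) = \sin(2\pi n x)$ on $\Omega = [0,1]$, the single test function $g(x) = x$ and the helper names are our assumptions for the illustration. The pairings with $g$ decay like $-1/(2\pi n)$, while the $L^1$-norms stay equal to $2/\pi$. Note that here $\liminf_n u_n = -1 < 0$ on a set of full measure, so the hypothesis of Proposition 2.3.38 fails, as it must.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

def u(n, x):
    """Oscillating sequence u_n(x) = sin(2*pi*n*x) on [0, 1]."""
    return math.sin(2.0 * math.pi * n * x)

def pairing(n, g):
    """Approximation of the integral of u_n * g over [0, 1]."""
    return midpoint(lambda x: u(n, x) * g(x), 0.0, 1.0)

def l1_norm(n):
    """Approximation of the L^1-norm of u_n on [0, 1]."""
    return midpoint(lambda x: abs(u(n, x)), 0.0, 1.0)

g = lambda x: x                      # one bounded test function
indices = (1, 10, 50)
pairings = [pairing(n, g) for n in indices]
norms = [l1_norm(n) for n in indices]

if __name__ == "__main__":
    for n, p, s in zip(indices, pairings, norms):
        print(n, p, s)
```

Testing against one bounded function does not prove weak convergence; the sketch only makes the contrast between the decaying pairings and the constant norms visible.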

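When we deal with $\mathbb{R}^N$-valued functions, an extremality condition replaces the inequality hypothesis in the previous proposition. The result is due to Visintin (1984), where the reader can find the proof.

PROPOSITION 2.3.39 If $(\Omega,\Sigma,\mu)$ is a finite measure space, $\{f_n\}_{n\ge 1} \subseteq L^1\big(\Omega;\mathbb{R}^N\big)$ is a sequence such that
\[
f_n \xrightarrow{\ w\ } f \qquad \text{in } L^1\big(\Omega;\mathbb{R}^N\big),
\]
for some $f \in L^1\big(\Omega;\mathbb{R}^N\big)$ and
\[
f(\omega) \in \operatorname{ext} \Big( \operatorname{conv} \limsup_{n\to+\infty} \{f_n(\omega)\} \Big) \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega,
\]
then
\[
f_n \longrightarrow f \qquad \text{in } L^1\big(\Omega;\mathbb{R}^N\big).
\]

We conclude this section with a brief look at the space of Radon measures, which appears in applications (such as optimal control, game theory, mathematical economics etc.) and also is useful in the study of Sobolev spaces (see Section 2.4). So let $Z$ be a locally compact, $\sigma$-compact metric space. We consider the following three spaces of continuous functions on $Z$:
\begin{align*}
C_c(Z) &\stackrel{df}{=} \big\{ u : Z \longrightarrow \mathbb{R} \text{ continuous with compact support} \big\}, \\
C_0(Z) &\stackrel{df}{=} \big\{ u : Z \longrightarrow \mathbb{R} \text{ continuous and vanishes at infinity, i.e., for all } \varepsilon > 0 \text{ there exists} \\
&\qquad\ \text{a compact set } K_\varepsilon \subseteq Z, \text{ such that } |u(z)| < \varepsilon \text{ for all } z \notin K_\varepsilon \big\}, \\
C_b(Z) &\stackrel{df}{=} \big\{ u : Z \longrightarrow \mathbb{R} \text{ continuous and bounded} \big\}.
\end{align*}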


Evidently we have the following inclusions:
\[
C_c(Z) \subseteq C_0(Z) \subseteq C_b(Z).
\]
If $Z$ is compact, then these three spaces coincide. If $Z$ is not compact, each inclusion is strict. We can define a norm on $C_b(Z)$ by setting
\[
\|u\|_{C_b(Z)} = \|u\|_\infty \stackrel{df}{=} \sup_{z\in Z} |u(z)|.
\]
By restriction, this norm also passes to the spaces $C_c(Z)$ and $C_0(Z)$.

PROPOSITION 2.3.40 The space $C_b(Z)$ equipped with the norm $\|\cdot\|_\infty$ is a Banach space. The space $C_0(Z)$ is a closed subspace of this Banach space (hence itself a Banach space). The space $C_c(Z)$ is $\|\cdot\|_\infty$-dense in $C_0(Z)$.

PROOF The first two statements are obvious. Only the third requires some work. Since $Z$ is a locally compact, $\sigma$-compact metric space, $Z = \bigcup_{n=1}^{\infty} C_n$, where $\{C_n\}_{n\ge 1}$ is a sequence of compact sets with $C_n \subseteq \operatorname{int} C_{n+1}$ for all $n \ge 1$. For each $n \ge 1$, let $\{\vartheta_n, \xi_n\}$ be a continuous partition of unity subordinate to the open cover $\{\operatorname{int} C_{n+1}, C_n^c\}$ of $Z$. We have $\vartheta_n + \xi_n = 1$ on $Z$ and so $\vartheta_n = 1$ on $C_n$ for $n \ge 1$. Let $u \in C_0(Z)$ and set $u_n \stackrel{df}{=} \vartheta_n u$. Then $u_n \in C_c(Z)$ and
\[
\|u - u_n\|_\infty = \|\xi_n u\|_\infty \qquad \forall\, n \ge 1.
\]
Since $\operatorname{supp} \xi_n \subseteq C_n^c$ and $u(z) \longrightarrow 0$ as $z$ tends to infinity (the point at infinity of the one point Alexandrov compactification of $Z$; see Theorem A.1.3 and Remark A.1.4), we conclude that
\[
\|u - u_n\|_\infty \longrightarrow 0 \qquad \text{as } n \to +\infty.
\]

Also by M (Z) we denote the space of all signed measures m : B(Z) −→ R (with B(Z) being the Borel σ-field of Z) that have bounded variation. Since Z is a metric space such measures are regular. The measures in M (Z) are known as Radon measures. The Riesz-Markov representation theorem says that M (Z) is the dual space of C0 (Z).


THEOREM 2.3.41 (Riesz-Markov Representation Theorem) If $Z$ is a locally compact, $\sigma$-compact metric space, then $C_0(Z)^* = M(Z)$ and the duality pairing is given by
\[
\langle \mu, u \rangle_{C_0(Z)} = \int_Z u(z)\, d\mu \qquad \forall\, u \in C_0(Z),\ \mu \in M(Z).
\]

Using the three spaces of continuous functions on $Z$ introduced earlier, we can define three different notions of convergence for sequences of Radon measures.

DEFINITION 2.3.42 Let $Z$ be a locally compact, $\sigma$-compact metric space and $\{\mu_n\}_{n\ge 1} \subseteq M(Z)$.
(a) We say that the sequence $\{\mu_n\}_{n\ge 1}$ converges vaguely to $\mu \in M(Z)$ if and only if
\[
\int_Z u(z)\, d\mu_n \longrightarrow \int_Z u\, d\mu \qquad \forall\, u \in C_c(Z).
\]
(b) We say that the sequence $\{\mu_n\}_{n\ge 1}$ converges weakly to $\mu \in M(Z)$ if and only if
\[
\int_Z u(z)\, d\mu_n \longrightarrow \int_Z u\, d\mu \qquad \forall\, u \in C_0(Z).
\]
We denote this convergence by $\mu_n \xrightarrow{\ w\ } \mu$.
(c) We say that the sequence $\{\mu_n\}_{n\ge 1}$ converges narrowly to $\mu \in M(Z)$ if and only if
\[
\int_Z u(z)\, d\mu_n \longrightarrow \int_Z u\, d\mu \qquad \forall\, u \in C_b(Z).
\]
We denote this convergence by $\mu_n \xrightarrow{\ n\ } \mu$.

REMARK 2.3.43 Evidently we have that
• the norm convergence in $M(Z)$ implies the narrow convergence in $M(Z)$;
• the narrow convergence in $M(Z)$ implies the weak convergence in $M(Z)$;
• the weak convergence in $M(Z)$ implies the vague convergence in $M(Z)$.
In functional analytic terms the weak convergence is actually the weak$^*$ convergence in the Banach space $M(Z)$ (see Theorem 2.3.41). The term weak convergence originates from probability theory. Also the term narrow convergence is the English translation of the term “convergence étroite” first used by Bourbaki (1969).
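The implications between the three notions of convergence cannot be reversed in general. A minimal sketch on $Z = \mathbb{R}$ with Dirac masses, where the test functions `u_c`, `u_0`, `u_b` and the helper `pair_dirac` are our own illustrative choices: the sequence $\delta_n$ converges to $0$ vaguely and weakly but not narrowly, and the (unbounded) sequence $n\,\delta_n$ converges to $0$ vaguely but not weakly.

```python
def pair_dirac(weight, point, u):
    """Pairing of the measure weight * delta_point with the function u."""
    return weight * u(point)

def u_c(x):
    """Continuous with compact support [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def u_0(x):
    """Continuous, vanishes at infinity, support is not compact."""
    return 1.0 / (1.0 + abs(x))

def u_b(x):
    """Continuous and bounded, does not vanish at infinity."""
    return 1.0

tests = {"u_c": u_c, "u_0": u_0, "u_b": u_b}
ns = [1, 10, 100, 1000]

# mu_n = delta_n: pairings tend to 0 for u_c and u_0, but stay 1 for u_b.
unit = {name: [pair_dirac(1.0, n, f) for n in ns] for name, f in tests.items()}

# mu_n = n * delta_n: pairings are 0 for u_c, but tend to 1 for u_0.
growing = {name: [pair_dirac(float(n), n, f) for n in ns] for name, f in tests.items()}

if __name__ == "__main__":
    print(unit)
    print(growing)
```

Only three test functions are evaluated, so the output illustrates the definitions rather than proving convergence in any of the three senses.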


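PROPOSITION 2.3.44 If $Z$ is a locally compact, $\sigma$-compact metric space, $\{u_n\}_{n\ge 1} \subseteq C_0(Z)$ is a sequence and $u \in C_0(Z)$, then $u_n \xrightarrow{\ w\ } u$ in $C_0(Z)$ if and only if
\[
\sup_{n\ge 1} \|u_n\|_\infty < \infty \quad \text{and} \quad u_n(z) \longrightarrow u(z) \qquad \forall\, z \in Z.
\]

PROOF “$\Longrightarrow$”: A weakly convergent sequence in a Banach space is bounded. So $\sup_{n\ge 1} \|u_n\|_\infty < +\infty$. Also if $\mu = \delta_z$ is the Dirac measure concentrated at $z \in Z$, then
\[
\langle \delta_z, u_n \rangle \longrightarrow \langle \delta_z, u \rangle.
\]
But
\[
\langle \delta_z, u_n \rangle = u_n(z) \quad \text{and} \quad \langle \delta_z, u \rangle = u(z).
\]
So
\[
u_n(z) \longrightarrow u(z) \qquad \forall\, z \in Z.
\]
“$\Longleftarrow$”: This is an immediate consequence of the Lebesgue dominated convergence theorem (see Theorem A.2.2).

PROPOSITION 2.3.45 If $Z$ is a locally compact, $\sigma$-compact metric space and $\{\mu_n\}_{n\ge 1} \subseteq M(Z)$ is a sequence, then
(a) if
\[
\mu_n \longrightarrow \mu \qquad \text{vaguely in } M(Z)
\]
and for every $\varepsilon > 0$ there exists a compact set $K_\varepsilon \subseteq Z$, such that
\[
|\mu_n|\big(K_\varepsilon^c\big) < \varepsilon \qquad \forall\, n \ge n_0,
\]
then
\[
\mu_n \longrightarrow \mu \qquad \text{narrowly in } M(Z);
\]
(b) if $\mu_n \ge 0$ for all $n \ge 1$,
\[
\mu_n \longrightarrow \mu \qquad \text{vaguely in } M(Z),
\]
and $\mu_n(Z) \longrightarrow \mu(Z)$, then
\[
\mu_n \longrightarrow \mu \qquad \text{narrowly in } M(Z).
\]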


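PROOF (a) Let $u \in C_b(Z)$ and $\varepsilon > 0$. Let $K_\varepsilon \subseteq Z$ be the compact set postulated by the hypotheses. We take $\xi_\varepsilon \in C_c(Z)$, such that $0 \le \xi_\varepsilon \le 1$ and $\xi_\varepsilon|_{K_\varepsilon} = 1$. Evidently $u = \xi_\varepsilon u + v$ with $\operatorname{supp} v \subseteq K_\varepsilon^c$. So we have
\[
\int_Z u \, d\mu_n = \int_Z \xi_\varepsilon u \, d\mu_n + \int_Z v \, d\mu_n.
\]
Since $\mu_n \longrightarrow \mu$ vaguely in $M(Z)$ and $\xi_\varepsilon u \in C_c(Z)$, we have
\[
\int_Z \xi_\varepsilon u \, d\mu_n \longrightarrow \int_Z \xi_\varepsilon u \, d\mu.
\]
Also
\[
\left| \int_Z v \, d\mu_n \right| = \left| \int_{K_\varepsilon^c} v \, d\mu_n \right| \le \|v\|_\infty \, |\mu_n|\big(K_\varepsilon^c\big) \le \varepsilon \|u\|_\infty.
\]
So we obtain
\[
\limsup_{n\to+\infty} \int_Z u \, d\mu_n \le \int_Z \xi_\varepsilon u \, d\mu + \varepsilon \|u\|_\infty
\]
and
\[
\liminf_{n\to+\infty} \int_Z u \, d\mu_n \ge \int_Z \xi_\varepsilon u \, d\mu - \varepsilon \|u\|_\infty. \tag{2.30}
\]
Since $\varepsilon > 0$ was arbitrary and $\xi_\varepsilon \longrightarrow 1$, we obtain that
\[
\int_Z u \, d\mu_n \longrightarrow \int_Z u \, d\mu,
\]
i.e.,
\[
\mu_n \longrightarrow \mu \qquad \text{narrowly in } M(Z).
\]
(b) Every measure $\mu_0 \in M(Z)$, $\mu_0 \ge 0$ is tight. So given $\varepsilon > 0$, we can find a compact set $K_\varepsilon \subseteq Z$, such that $\mu\big(K_\varepsilon^c\big) < \varepsilon$. Let $u \in C_c(Z)$ be such that
\[
\operatorname{supp} u \subseteq K_\varepsilon, \quad 0 \le u \le 1 \quad \text{and} \quad \|\mu\|_* - \varepsilon < \int_Z u \, d\mu.
\]
Let $n_0 \ge 1$ be such that
\[
\big| \mu_n(Z) - \mu(Z) \big| < \varepsilon \quad \text{and} \quad \left| \int_Z u \, d\mu_n - \int_Z u \, d\mu \right| < \varepsilon \qquad \forall\, n \ge n_0. \tag{2.31}
\]
Then for $n \ge n_0$, from (2.31), we have
\[
\mu_n\big(K_\varepsilon^c\big) \le \|\mu_n\|_* - \int_Z u \, d\mu_n \le \|\mu_n\|_* + \varepsilon - \int_Z u \, d\mu \le \|\mu_n\|_* - \|\mu\|_* + 2\varepsilon < 3\varepsilon.
\]
So, from part (a), we conclude that $\mu_n \longrightarrow \mu$ narrowly in $M(Z)$.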

We have a compactness result for the weak convergence of Radon measures.

THEOREM 2.3.46 If $Z$ is a locally compact, $\sigma$-compact metric space and $\{\mu_n\}_{n\ge 1} \subseteq M(Z)$ is bounded, then there is a subsequence $\{\mu_{n_k}\}_{k\ge 1}$ of $\{\mu_n\}_{n\ge 1}$ and $\mu \in M(Z)$, such that
\[
\mu_{n_k} \xrightarrow{\ w\ } \mu \qquad \text{as } k \to +\infty.
\]

PROOF Let $Z^*$ be the Alexandrov one-point compactification of $Z$ (see Theorem A.1.3 and Remark A.1.4). Then $Z^*$ is a compact metrizable space and so $C(Z^*)$ is a separable Banach space. Set
\[
E \stackrel{df}{=} \big\{ u \in C(Z^*) : u(\infty) = 0 \big\}.
\]
Then this is a closed subspace of $C(Z^*)$; thus $E$ is a separable Banach space too. For every $u \in E$, let $\widehat{u}$ denote the restriction of $u$ to $Z$. Consider the linear map $L : E \longrightarrow C_b(Z)$ defined by
\[
L(u) \stackrel{df}{=} \widehat{u} \qquad \forall\, u \in E.
\]
We claim that $L$ is an isometry of $E$ onto $C_0(Z)$. To this end, let $u \in E$. Since $u$ is continuous at $\infty$, for every $\varepsilon > 0$, there exists a compact set $K_\varepsilon$, such that
\[
\big| u(z) - u(\infty) \big| < \varepsilon \qquad \forall\, z \in K_\varepsilon^c.
\]
This means that $\widehat{u} \in C_0(Z)$. On the other hand let $v \in C_0(Z)$. Then $v$ can be extended to $Z^*$ by setting
\[
v_1(\infty) = 0
\]


and
\[
v_1(z) = v(z) \qquad \forall\, z \in Z.
\]
Since $v \in C_0(Z)$, we see that $v_1 \in C(Z^*)$ and so $v_1 \in E$. This isometry shows that $C_0(Z)$ is separable. Then bounded subsets of $M(Z) = C_0(Z)^*$ are relatively compact (by Alaoglu’s theorem; see Theorem A.3.9) and metrizable for the weak$^*$ topology. This proves the theorem.

REMARK 2.3.47 Using the compactification technique of the previous proof we can show that if
\[
G \stackrel{df}{=} \big\{ \mu \in M(Z^*) : \mu\big(\{\infty\}\big) = 0 \big\}
\]
and $S : G \longrightarrow M(Z)$ is defined by
\[
S(\mu) \stackrel{df}{=} \widehat{\mu} \quad \text{with} \quad \widehat{\mu}(A) = \mu(A) \qquad \forall\, A \in \mathcal{B}(Z) \subseteq \mathcal{B}(Z^*),
\]
then $S$ is an isometry of $G$ onto $M(Z)$.

When $Z$ is compact, Theorem 2.3.46 can be strengthened. In what follows by $M(Z)_+$ we denote the elements $\mu$ of $M(Z)$ for which we have $\mu \ge 0$ (i.e., they are measures).

THEOREM 2.3.48 If $Z$ is a compact metric space and $\{\mu_n\}_{n\ge 1} \subseteq M(Z)_+$ is such that
\[
\|\mu_n\| = r \qquad \forall\, n \ge 1,
\]
then there exists a subsequence $\{\mu_{n_k}\}_{k\ge 1}$ of $\{\mu_n\}_{n\ge 1}$ and $\mu \in M(Z)_+$ with $\|\mu\| = r$, such that
\[
\mu_{n_k} \xrightarrow{\ w\ } \mu \qquad \text{as } k \to +\infty
\]
(i.e., the set
\[
S_r^+ = \big\{ \mu \in M(Z)_+ : \|\mu\| = r \big\}
\]
is $w^*$-sequentially compact).

We conclude with a result for sequences of functions which converge simultaneously pointwise and weakly in $L^p(\Omega)$ ($p \in [1,+\infty)$). This result can be viewed as a refinement of Fatou’s Lemma.


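PROPOSITION 2.3.49 If $(\Omega,\Sigma,\mu)$ is a finite measure space, $\{u_n\}_{n\ge 1} \subseteq L^p(\Omega)$ ($p \in [1,+\infty)$),
\[
u_n \xrightarrow{\ w\ } u \qquad \text{in } L^p(\Omega)
\]
and
\[
u_n(\omega) \longrightarrow u(\omega) \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega,
\]
then
\[
\lim_{n\to+\infty} \Big( \|u_n\|_p^p - \|u_n - u\|_p^p \Big) = \|u\|_p^p.
\]

PROOF For a given $\varepsilon > 0$, we can find $c(\varepsilon) > 0$, such that for all $a, b \in \mathbb{R}$, we have
\[
\big|\, |a+b|^p - |a|^p \,\big| \le \varepsilon |a|^p + c(\varepsilon) |b|^p. \tag{2.32}
\]
We set
\[
h_n^\varepsilon \stackrel{df}{=} \Big( \big|\, |u_n|^p - |u_n - u|^p - |u|^p \,\big| - \varepsilon |u_n - u|^p \Big)^+.
\]
Evidently we have $h_n^\varepsilon(\omega) \longrightarrow 0$ for $\mu$-a.a. $\omega \in \Omega$ and from (2.32) (applied with $a = u_n - u$ and $b = u$),
\[
h_n^\varepsilon(\omega) \le \big(1 + c(\varepsilon)\big) |u(\omega)|^p.
\]
So from the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
\[
\int_\Omega h_n^\varepsilon \, d\mu \longrightarrow 0 \qquad \text{as } n \to +\infty. \tag{2.33}
\]
But note that
\[
\big|\, |u_n|^p - |u_n - u|^p - |u|^p \,\big| \le h_n^\varepsilon + \varepsilon |u_n - u|^p \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega.
\]
Hence, using (2.33), we obtain
\[
\limsup_{n\to+\infty} \int_\Omega \big|\, |u_n|^p - |u_n - u|^p - |u|^p \,\big| \, d\mu \le M \varepsilon,
\]
where
\[
M \stackrel{df}{=} \sup_{n\ge 1} \|u_n - u\|_p^p.
\]
Since $\varepsilon > 0$ was arbitrary, we have
\[
\lim_{n\to+\infty} \Big( \|u_n\|_p^p - \|u_n - u\|_p^p \Big) = \|u\|_p^p.
\]

REMARK 2.3.50 Note that if $p = 2$, then we do not need the $\mu$-almost everywhere pointwise convergence of the sequence $\{u_n\}_{n\ge 1}$ to $u$.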
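The claim for $p = 2$ can be seen from a one-line Hilbert space computation, sketched here for real-valued functions. It uses only the weak convergence $u_n \xrightarrow{\ w\ } u$ in $L^2(\Omega)$, tested against the fixed element $u$.

```latex
\[
\|u_n\|_2^2 - \|u_n - u\|_2^2
= \|u_n\|_2^2 - \Big( \|u_n\|_2^2 - 2\,(u_n,u)_{L^2(\Omega)} + \|u\|_2^2 \Big)
= 2\,(u_n,u)_{L^2(\Omega)} - \|u\|_2^2
\;\longrightarrow\; 2\,\|u\|_2^2 - \|u\|_2^2 = \|u\|_2^2 .
\]
```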

2.4 Sobolev Spaces

Already in Section 1.6, we introduced the Sobolev space $W^{1,p}(Z)$ (see Definition 1.6.1). Here we introduce Sobolev spaces of any order $m \ge 1$ and conduct a systematic study of them, proving among other things those results stated in Section 1.6 without a proof.

Let us start by fixing the notation. An element $\alpha = (\alpha_k)_{k=1}^N \in \mathbb{N}^N$ is said to be a multi-index. Associated to a multi-index $\alpha$, we have the following symbols:
\[
|\alpha| \stackrel{df}{=} \sum_{k=1}^{N} \alpha_k,
\]
the length of $\alpha$, and
\[
z^\alpha = z_1^{\alpha_1} \cdots z_N^{\alpha_N} \qquad \forall\, z = (z_k)_{k=1}^N \in \mathbb{R}^N.
\]
We say that two multi-indices $\alpha, \beta$ are related by $\alpha \le \beta$, if $\alpha_k \le \beta_k$ for all $k \in \{1,\dots,N\}$. Finally we set
\[
D_k \stackrel{df}{=} \frac{\partial}{\partial z_k} \qquad \forall\, k \in \{1,\dots,N\}
\]
and
\[
D^\alpha \stackrel{df}{=} D_1^{\alpha_1} \cdots D_N^{\alpha_N} = \frac{\partial^{|\alpha|}}{\partial z_1^{\alpha_1} \cdots \partial z_N^{\alpha_N}}.
\]

DEFINITION 2.4.1 Let $Z \subseteq \mathbb{R}^N$ be an open set. By $\mathcal{D}(Z)$ we denote the space $C_c^\infty(Z)$ (the space of $C^\infty(Z)$ functions with compact support) equipped with the following convergence notion: “the sequence $\{\vartheta_n\}_{n\ge 1} \subseteq C_c^\infty(Z)$ is said to converge to $0$, if there exists a fixed compact set $K \subseteq Z$, such that $\operatorname{supp} \vartheta_n \subseteq K$ for all $n \ge 1$ and $\{D^\alpha \vartheta_n\}_{n\ge 1}$ converges uniformly to $0$ for all $\alpha \in \mathbb{N}^N$.” The elements of the space $\mathcal{D}(Z)$ are called test functions. A linear functional $T : \mathcal{D}(Z) \longrightarrow \mathbb{R}$, such that
\[
\vartheta_n \longrightarrow 0 \ \text{ in } \mathcal{D}(Z) \quad \text{implies} \quad T(\vartheta_n) \longrightarrow 0,
\]
is called a distribution. The space of distributions is denoted by $\mathcal{D}(Z)^*$.


REMARK 2.4.2 The convergence notion introduced on $C_c^\infty(Z)$ is actually topological, i.e., corresponds to a topology on $C_c^\infty(Z)$. Therefore $\mathcal{D}(Z)^*$ is the dual of the space of test functions. Recall that $\mathcal{D}(Z)$ is dense in $L^p(Z)$ for all $p \in [1,+\infty)$. If $u \in L^1_{loc}(Z)$ and $T_u : \mathcal{D}(Z) \longrightarrow \mathbb{R}$ is defined by
\[
T_u(\vartheta) \stackrel{df}{=} \int_Z u \vartheta \, dz \qquad \forall\, \vartheta \in \mathcal{D}(Z),
\]
then $T_u \in \mathcal{D}(Z)^*$. Moreover, if $u, v \in L^1_{loc}(Z)$ and
\[
u = v \qquad \text{for a.a. } z \in Z,
\]
then $T_u = T_v$. In particular, if
\[
u(z) = 0 \qquad \text{for a.a. } z \in Z,
\]
it defines the zero distribution. In fact the converse is also true. If $T_u = 0$, then $u(z) = 0$ for almost all $z \in Z$, provided that $u \in L^1_{loc}(Z)$. Distributions resulting from locally integrable functions are usually called regular distributions. Another important distribution is the Dirac $\delta$-function; namely for $z \in Z$, we define
\[
\delta_z(\vartheta) \stackrel{df}{=} \vartheta(z) \qquad \forall\, \vartheta \in \mathcal{D}(Z).
\]
This distribution is not regular.

DEFINITION 2.4.3 For every distribution $T \in \mathcal{D}(Z)^*$ and every $\alpha \in \mathbb{N}^N$, the distribution $D^\alpha T$ is defined by
\[
D^\alpha T(\vartheta) \stackrel{df}{=} (-1)^{|\alpha|}\, T\big(D^\alpha \vartheta\big) \qquad \forall\, \vartheta \in \mathcal{D}(Z).
\]
Then $D^\alpha T$ is the derivative of order $\alpha$ of the distribution $T$. Given two functions $u, v \in L^1_{loc}(Z)$ and $\alpha \in \mathbb{N}^N$, we write $v = D^\alpha u$ to express the fact that $D^\alpha T_u = T_v$. This is equivalent to saying that
\[
\int_Z v \vartheta \, dz = (-1)^{|\alpha|} \int_Z u D^\alpha \vartheta \, dz \qquad \forall\, \vartheta \in \mathcal{D}(Z).
\]
The function $v = D^\alpha u$ is the derivative of order $\alpha$ in the sense of distributions of the function $u$.

REMARK 2.4.4 If $u \in C^{|\alpha|}(Z)$, then the distributional derivative $D^\alpha u$ coincides with the classical partial derivative $\dfrac{\partial^{|\alpha|} u}{\partial z_1^{\alpha_1} \cdots \partial z_N^{\alpha_N}}$.
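The defining identity of the distributional derivative can be tested numerically in one dimension. The sketch below is an illustration under our own assumptions: $Z = (-1,1)$, $u(z) = \max\{z, 0\}$, the candidate derivative $v$ equal to the Heaviside step, a single test function (the standard bump), and a midpoint quadrature. It compares $\int_Z v\vartheta\,dz$ with $-\int_Z u\vartheta'\,dz$.

```python
import math

def bump(z):
    """Test function: exp(-1/(1 - z^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - z * z)) if abs(z) < 1.0 else 0.0

def bump_prime(z):
    """Classical derivative of the bump function."""
    if abs(z) >= 1.0:
        return 0.0
    return bump(z) * (-2.0 * z / (1.0 - z * z) ** 2)

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

def u(z):
    """u(z) = max(z, 0): continuous, not differentiable at 0."""
    return max(z, 0.0)

def v(z):
    """Candidate distributional derivative of u (Heaviside step)."""
    return 1.0 if z > 0 else 0.0

# Both sides of: integral(v * theta) = (-1)^1 * integral(u * theta')
lhs = midpoint(lambda z: v(z) * bump(z), -1.0, 1.0)
rhs = -midpoint(lambda z: u(z) * bump_prime(z), -1.0, 1.0)

if __name__ == "__main__":
    print(lhs, rhs)
```

Agreement for one test function is of course only a consistency check; the definition requires the identity for every $\vartheta \in \mathcal{D}(Z)$.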


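Now we are ready to give the definition of Sobolev space.

DEFINITION 2.4.5 Let $Z \subseteq \mathbb{R}^N$ be an open set. The Sobolev space $W^{m,p}(Z)$ for $m \in \mathbb{N}_0$, $p \in [1,+\infty]$, is defined by
\[
W^{m,p}(Z) \stackrel{df}{=} \big\{ u \in L^p(Z) : D^\alpha u \in L^p(Z) \text{ for all } \alpha \in \mathbb{N}^N \text{ with } |\alpha| \le m \big\}.
\]
For every $u \in W^{m,p}(Z)$, we define
\[
\|u\|_{W^{m,p}(Z)} \stackrel{df}{=} \Big( \sum_{|\alpha|\le m} \|D^\alpha u\|_p^p \Big)^{\frac{1}{p}} \qquad \text{if } p \in [1,+\infty)
\]
and
\[
\|u\|_{W^{m,\infty}(Z)} \stackrel{df}{=} \sum_{|\alpha|\le m} \|D^\alpha u\|_\infty.
\]
Clearly this is a norm on $W^{m,p}(Z)$. Finally we set
\[
W_0^{m,p}(Z) \stackrel{df}{=} \overline{\mathcal{D}(Z)}^{\,\|\cdot\|_{W^{m,p}(Z)}},
\]
for $p \in [1,+\infty)$.

REMARK 2.4.6 Evidently
\[
u_n \longrightarrow u \qquad \text{in } W^{m,p}(Z)
\]
if and only if for all $\alpha \in \mathbb{N}^N$ with $|\alpha| \le m$, we have
\[
D^\alpha u_n \longrightarrow D^\alpha u \qquad \text{in } L^p(Z).
\]
Let
\[
r \stackrel{df}{=} \operatorname{card} \big\{ \alpha : \alpha \text{ is multi-index},\ |\alpha| \le m \big\}
\]
and consider the map $L : W^{m,p}(Z) \longrightarrow \big(L^p(Z)\big)^r$, defined by
\[
L(u) \stackrel{df}{=} \big( D^\alpha u \big)_{|\alpha|\le m} \qquad \forall\, u \in W^{m,p}(Z).
\]
It is easily seen that $L$ is an isometric isomorphism. Based on this observation, we can state the following result.

PROPOSITION 2.4.7 The spaces $\big( W^{m,p}(Z), \|\cdot\|_{W^{m,p}(Z)} \big)$ (with $p \in [1,+\infty]$, $m \in \mathbb{N}_0$) are Banach spaces, which are separable for $p \in [1,+\infty)$, reflexive and uniformly convex for $p \in (1,+\infty)$.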
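The number $r$ of multi-indices of length at most $m$, which fixes the number of factors in the product space $\big(L^p(Z)\big)^r$ of Remark 2.4.6, can be enumerated directly. The following Python sketch (function names are ours) lists the multi-indices and compares the count with the closed form $\binom{N+m}{m}$, a standard stars-and-bars identity that is not stated in the text.

```python
from itertools import product
from math import comb

def multi_indices(N, m):
    """All alpha in {0, 1, ...}^N with |alpha| = alpha_1 + ... + alpha_N <= m."""
    return [alpha for alpha in product(range(m + 1), repeat=N) if sum(alpha) <= m]

def count_multi_indices(N, m):
    """Closed form for the number r of multi-indices with |alpha| <= m."""
    return comb(N + m, m)

if __name__ == "__main__":
    # For m = 1 in dimension N = 3: u itself plus the three first derivatives.
    print(multi_indices(3, 1))
    print(len(multi_indices(3, 2)), count_multi_indices(3, 2))
```

For $m = 1$ this gives $r = N + 1$, corresponding to $u$ and its $N$ first order partial derivatives.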


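COROLLARY 2.4.8 For every $m \in \mathbb{N}_0$ and $p \in [1,+\infty]$, the space $W_0^{m,p}(Z)$ is a closed subspace of $W^{m,p}(Z)$.

PROOF We need to show that
\[
W_0^{m,p}(Z) \subseteq W^{m,p}(Z).
\]
So let $u \in W_0^{m,p}(Z)$ and let $\{\vartheta_n\}_{n\ge 1} \subseteq \mathcal{D}(Z)$ be such that $\vartheta_n \longrightarrow u$ in $W^{m,p}(Z)$. From Proposition 2.4.7, it follows that $u \in W^{m,p}(Z)$.

REMARK 2.4.9 The case $p = 2$ is important and we reserve a special notation for it. We set
\[
H^m(Z) \stackrel{df}{=} W^{m,2}(Z) \quad \text{and} \quad H_0^m(Z) \stackrel{df}{=} W_0^{m,2}(Z).
\]
For $u, v \in H^m(Z)$, we define
\[
(u,v)_{H^m(Z)} \stackrel{df}{=} \sum_{|\alpha|\le m} (D^\alpha u, D^\alpha v)_{L^2(Z)} = \sum_{|\alpha|\le m} \int_Z D^\alpha u \, D^\alpha v \, dz.
\]
Clearly $(\cdot,\cdot)_{H^m(Z)}$ defines an inner product on $H^m(Z)$ which generates the norm $\|\cdot\|_{W^{m,2}(Z)}$. From Proposition 2.4.7, it follows that $H^m(Z)$ and $H_0^m(Z)$ are Hilbert spaces.

From now on $m = 1$. So we examine the first Sobolev spaces $W^{1,p}(Z)$ and $W_0^{1,p}(Z)$, with $p \in [1,+\infty]$. Next we derive ways to approximate the elements of the Sobolev space $W^{1,p}(Z)$ by smooth functions. For this purpose we introduce certain regularizing sequences known as mollifiers.

DEFINITION 2.4.10 Let $\varphi \in C_c^\infty\big(\mathbb{R}^N\big)$, $\varphi \ge 0$ be such that
\[
\operatorname{supp} \varphi \subseteq \big\{ z \in \mathbb{R}^N : \|z\|_{\mathbb{R}^N} \le 1 \big\} \quad \text{and} \quad \int_{\mathbb{R}^N} \varphi(z) \, dz = 1.
\]
A possible choice is the function
\[
\varphi(z) \stackrel{df}{=}
\begin{cases}
c \exp\left( \dfrac{1}{\|z\|_{\mathbb{R}^N}^2 - 1} \right) & \text{if } \|z\|_{\mathbb{R}^N} < 1, \\[1ex]
0 & \text{if } \|z\|_{\mathbb{R}^N} \ge 1,
\end{cases}
\]
with $c > 0$ chosen in such a way so that
\[
\int_{\mathbb{R}^N} \varphi(z) \, dz = 1.
\]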


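If $\varepsilon > 0$, we define
\[
\varphi_\varepsilon(z) = \frac{1}{\varepsilon^N}\, \varphi\Big(\frac{z}{\varepsilon}\Big).
\]
Then $\varphi_\varepsilon \in C_c^\infty\big(\mathbb{R}^N\big)$, $\varphi_\varepsilon \ge 0$ is such that
\[
\operatorname{supp} \varphi_\varepsilon \subseteq \big\{ z \in \mathbb{R}^N : \|z\|_{\mathbb{R}^N} \le \varepsilon \big\} \quad \text{and} \quad \int_{\mathbb{R}^N} \varphi_\varepsilon(z) \, dz = 1.
\]
The function $\varphi_\varepsilon$ is called a mollifier and given $u \in L^1_{loc}(Z)$, the mollification (or regularization) of $u$ corresponding to $\{\varphi_\varepsilon\}_{\varepsilon>0}$ is given by
\[
u_\varepsilon(z) \stackrel{df}{=} \int_{\mathbb{R}^N} u(z-y) \varphi_\varepsilon(y) \, dy = \int_Z u(y) \varphi_\varepsilon(z-y) \, dy,
\]
where we have extended $u$ to all of $\mathbb{R}^N$ as zero (i.e., $u_\varepsilon = u * \varphi_\varepsilon$ with $*$ denoting the convolution operation).

REMARK 2.4.11 Note that
\[
\operatorname{supp} u_\varepsilon \subseteq \operatorname{supp} u + \big\{ z \in \mathbb{R}^N : \|z\|_{\mathbb{R}^N} \le \varepsilon \big\}.
\]

The next proposition summarizes the approximations achieved via mollification.

PROPOSITION 2.4.12 If $Z \subseteq \mathbb{R}^N$ is an open set, then
(a) for every $u \in L^1_{loc}(Z)$ and every $\varepsilon > 0$, $u_\varepsilon \in C^\infty\big(Z_{-\varepsilon}\big)$, where $Z_{-\varepsilon} \stackrel{df}{=} \big\{ z \in Z : d(z,\partial Z) > \varepsilon \big\}$;
(b) if $u \in C(Z)$, then
\[
u_\varepsilon \longrightarrow u \qquad \text{as } \varepsilon \searrow 0
\]
uniformly on compact subsets of $Z$;
(c) if $u \in L^p_{loc}(Z)$ for some $p \in [1,+\infty)$, then
\[
u_\varepsilon \longrightarrow u \qquad \text{in } L^p_{loc}(Z);
\]
(d) if $u \in W^{1,p}(Z)$, $p \in [1,+\infty]$, then
\[
D_i u_\varepsilon = \varphi_\varepsilon * D_i u \qquad \forall\, i \in \{1,\dots,N\};
\]
(e) if $u \in W^{1,p}(Z)$, $p \in [1,+\infty)$, then
\[
u_\varepsilon \longrightarrow u \qquad \text{in } W^{1,p}_{loc}(Z).
\]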
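A one-dimensional numerical sketch may help to visualise mollification and the uniform convergence in part (b). Everything below is an illustration under our own assumptions: $N = 1$, the standard bump function of Definition 2.4.10 with the normalising constant computed by a midpoint rule, the continuous function $u(z) = |z|$, and sample points in $[-1,1]$. None of the helper names come from the text.

```python
import math

def phi_unnormalized(x):
    """exp(1/(x^2 - 1)) for |x| < 1 and 0 otherwise (N = 1)."""
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def midpoint(f, a, b, n=4000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

# Normalising constant c, so that the integral of phi equals 1.
C = 1.0 / midpoint(phi_unnormalized, -1.0, 1.0)

def phi_eps(x, eps):
    """Mollifier phi_eps(x) = eps^(-1) * phi(x / eps) in dimension one."""
    return C * phi_unnormalized(x / eps) / eps

def mollify(u, z, eps, n=2000):
    """u_eps(z) = integral over [-eps, eps] of u(z - y) * phi_eps(y) dy."""
    return midpoint(lambda y: u(z - y) * phi_eps(y, eps), -eps, eps, n)

u = abs                                   # continuous, with a kink at 0
points = [k / 10.0 - 1.0 for k in range(21)]
epsilons = (0.4, 0.2, 0.1, 0.05)
errors = [max(abs(mollify(u, z, eps) - u(z)) for z in points) for eps in epsilons]

if __name__ == "__main__":
    for eps, err in zip(epsilons, errors):
        print(eps, err)
```

For this particular $u$ the largest error sits at the kink $z = 0$ and scales linearly with $\varepsilon$, which is consistent with, but far weaker than, the general statement of the proposition.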


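PROOF (a) Note that $u_\varepsilon$ is defined on $Z_{-\varepsilon}$ (see Remark 2.4.11). Let $z \in Z_{-\varepsilon}$, $i \in \{1,\dots,N\}$ and $e_1,\dots,e_N$ be the canonical basis of $\mathbb{R}^N$. For $|t|$ small enough, we have that
\[
z + t e_i \in Z_{-\varepsilon} \qquad \forall\, i \in \{1,\dots,N\}.
\]
So if $t_k \longrightarrow 0$, we can assume that
\[
z + t_k e_i \in Z_{-\varepsilon} \qquad \forall\, k \ge 1.
\]
We set
\[
h(z,y) = \frac{1}{\varepsilon^N}\, \varphi\Big( \frac{z-y}{\varepsilon} \Big) u(y)
\]
and
\[
f_k(z,y) = \frac{1}{t_k} \big( h(z + t_k e_i, y) - h(z,y) \big).
\]
We have
\[
\frac{1}{t_k} \big( u_\varepsilon(z + t_k e_i) - u_\varepsilon(z) \big) = \int_{Z_{-\varepsilon}} f_k(z,y) \, dy.
\]
Note that
\[
f_k(z,y) \longrightarrow \frac{\partial \varphi_\varepsilon}{\partial z_i}(z-y)\, u(y) \quad \text{as } k \to +\infty, \qquad \forall\, y \in Z_{-\varepsilon}.
\]
Moreover, by the mean value theorem, we have
\[
\big| f_k(z,y) \big| \le \frac{1}{\varepsilon^{N+1}} \|D\varphi\|_\infty |u| \in L^1\big(Z_{-\varepsilon}\big).
\]
So by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
\[
\frac{\partial u_\varepsilon}{\partial z_i}(z) = \lim_{k\to+\infty} \frac{1}{t_k} \big( u_\varepsilon(z + t_k e_i) - u_\varepsilon(z) \big) = \int_Z \frac{\partial \varphi_\varepsilon}{\partial z_i}(z-y)\, u(y) \, dy.
\]
In a similar way we show that the partial derivatives of $u_\varepsilon$ of all orders exist and are continuous on $Z_{-\varepsilon}$. Therefore
\[
u_\varepsilon \in C^\infty\big(Z_{-\varepsilon}\big).
\]
Note that we have
\[
\frac{\partial}{\partial z_i} \big( u * \varphi_\varepsilon \big) = u * \frac{\partial \varphi_\varepsilon}{\partial z_i} \qquad \forall\, i \in \{1,\dots,N\}.
\]
(b) Let $K$ be a compact subset of $Z$. Take $z \in K$ and set
\[
x \stackrel{df}{=} \frac{y - z}{\varepsilon}, \qquad y \in Z.
\]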


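Note that $\varphi(-x) = \varphi(x)$. Then we have
\begin{align*}
\big| u_\varepsilon(z) - u(z) \big| &\le \int_{B_\varepsilon(z)} \frac{1}{\varepsilon^N}\, \varphi\Big( \frac{z-y}{\varepsilon} \Big) \big| u(y) - u(z) \big| \, dy \\
&\le \int_{B_1(0)} \varphi(x) \big| u(z + \varepsilon x) - u(z) \big| \, dx \le \xi(\varepsilon) \int_{B_1(0)} \varphi(x) \, dx = \xi(\varepsilon), \tag{2.34}
\end{align*}
where
\[
\xi(\varepsilon) = \sup_{\substack{(y,v) \in K_\varepsilon \times K_\varepsilon \\ \|y - v\| \le \varepsilon}} \big| u(y) - u(v) \big|.
\]
Since $u$ is uniformly continuous on the compact subsets of $Z$, we have that
\[
\lim_{\varepsilon \searrow 0} \xi(\varepsilon) = 0.
\]
So from (2.34), we conclude that $u_\varepsilon \longrightarrow u$ as $\varepsilon \searrow 0$ uniformly on compact subsets of $Z$.

(c) Let $K$ be a compact subset of $Z$ and $0 < \varepsilon < d\big(K, \partial Z\big)$. Then for $z \in K$, we have
\[
\big| u_\varepsilon(z) \big|^p \le \int_{B_1(0)} \vartheta(y) \big| u(z + \varepsilon y) \big|^p \, dy. \tag{2.35}
\]
To see (2.35) note that it is clearly true if $p = 1$. So suppose that $p \in (1,+\infty)$. If $\frac{1}{p} + \frac{1}{p'} = 1$, then
\[
\vartheta(y)\, u(z + \varepsilon y) = \vartheta(y)^{\frac{1}{p'}}\, \vartheta(y)^{\frac{1}{p}}\, u(z + \varepsilon y).
\]
Invoking Hölder’s inequality (see Theorem A.2.27), we have
\[
\big| u_\varepsilon(z) \big| \le \Big( \int_{B_1(0)} \vartheta(y) \, dy \Big)^{\frac{1}{p'}} \Big( \int_{B_1(0)} \vartheta(y) \big| u(z + \varepsilon y) \big|^p \, dy \Big)^{\frac{1}{p}}. \tag{2.36}
\]
Since
\[
\int_{B_1(0)} \vartheta(y) \, dy = 1,
\]
from (2.36), we obtain (2.35). Invoking (2.35), we have
\[
\int_K \big| u_\varepsilon(z) \big|^p \, dz \le \int_K \Big( \int_{B_1(0)} \vartheta(y) \big| u(z + \varepsilon y) \big|^p \, dy \Big) dz;
\]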


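thus by Fubini’s theorem, we have
\[
\int_K \big| u_\varepsilon(z) \big|^p \, dz \le \int_{B_1(0)} \vartheta(y) \int_K \big| u(z + \varepsilon y) \big|^p \, dz \, dy \le \int_{B_1(0)} \vartheta(y) \int_{K_\varepsilon} \big| u(v) \big|^p \, dv \, dy,
\]
where
\[
K_\varepsilon \stackrel{df}{=} \big\{ v \in \mathbb{R}^N : d(v,K) \le \varepsilon \big\}.
\]
Since $\int_{B_1(0)} \vartheta(y) \, dy = 1$, we have
\[
\|u_\varepsilon\|_{L^p(K)}^p \le \int_{K_\varepsilon} \big| u(v) \big|^p \, dv, \tag{2.37}
\]
so $u_\varepsilon \in L^p(K)$. Let $V$ be a bounded open set, such that
\[
K_\varepsilon \subseteq V \subseteq \overline{V} \subseteq Z \qquad \forall\, \varepsilon > 0 \text{ small enough.}
\]
Let $\delta > 0$ be given and select $h \in C\big(\overline{V}\big)$, such that
\[
\|u - h\|_{L^p(V)} < \delta.
\]
Here we exploit the density of the embedding $C\big(\overline{V}\big) \subseteq L^p(V)$. Then, using (2.37) and part (b), for $\varepsilon > 0$ small enough, we have
\begin{align*}
\|u - u_\varepsilon\|_{L^p(K)} &\le \|u - h\|_{L^p(K)} + \|h - h_\varepsilon\|_{L^p(K)} + \|(h - u)_\varepsilon\|_{L^p(K)} \\
&\le \|u - h\|_{L^p(K)} + \|h - h_\varepsilon\|_{L^p(K)} + \|h - u\|_{L^p(K_\varepsilon)} \le 3\delta,
\end{align*}
so
\[
u_\varepsilon \longrightarrow u \qquad \text{in } L^p_{loc}(Z) \text{ as } \varepsilon \searrow 0.
\]
(d) Suppose that $u \in W^{1,p}(Z)$, $p \in (1,+\infty)$. From the proof of part (a) and integrating by parts, we know that
\begin{align*}
D_i u_\varepsilon(z) &= \int_Z \frac{\partial \varphi_\varepsilon}{\partial z_i}(z-y)\, u(y) \, dy = -\int_Z \frac{\partial \varphi_\varepsilon}{\partial y_i}(z-y)\, u(y) \, dy \\
&= \int_Z \varphi_\varepsilon(z-y)\, D_i u(y) \, dy = \big( \varphi_\varepsilon * D_i u \big)(z),
\end{align*}
for $z \in Z_{-\varepsilon}$, $\varepsilon > 0$ and $i \in \{1,\dots,N\}$.

(e) This follows from (c) and (d).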


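The next theorem shows that smooth functions are dense in $W^{1,p}(Z)$, $p \in [1,+\infty)$. So equivalently the space $W^{1,p}(Z)$ can be defined as the closure in the $\|\cdot\|_{W^{1,p}(Z)}$-norm of $C^\infty(Z) \cap W^{1,p}(Z)$. The result is known in the literature as the Meyers-Serrin theorem.

THEOREM 2.4.13 (Meyers-Serrin Theorem) If $p \in [1,+\infty)$, then the embedding $C^\infty(Z) \cap W^{1,p}(Z) \subseteq W^{1,p}(Z)$ is dense.

PROOF We define
\[
Z_{-n} \stackrel{df}{=} \Big\{ z \in Z : d\big(z, \partial Z\big) > \frac{1}{n},\ \|z\|_{\mathbb{R}^N} < n \Big\} \qquad \forall\, n \ge 1
\]
and
\[
Z_0 \stackrel{df}{=} \emptyset.
\]
Set
\[
U_n \stackrel{df}{=} Z_{-(n+1)} \setminus \overline{Z}_{-(n-1)} \qquad \forall\, n \ge 1.
\]
The collection $\{U_n\}_{n\ge 1}$ is an open cover of $Z$. So we can find a smooth partition of unity $\{\xi_n\}_{n\ge 1}$ subordinate to $\{U_n\}_{n\ge 1}$. Then
\[
\xi_n \in C_c^\infty(U_n), \quad 0 \le \xi_n \le 1 \qquad \forall\, n \ge 1
\]
and
\[
\sum_{n=1}^{\infty} \xi_n(z) = 1 \qquad \forall\, z \in Z.
\]
Let $u \in W^{1,p}(Z)$ and $\delta > 0$. We have
\[
\xi_n u \in W^{1,p}(Z) \quad \text{and} \quad \operatorname{supp}\big( \xi_n u \big) \subseteq U_n \qquad \forall\, n \ge 1.
\]
Thus by virtue of Proposition 2.4.12(e), there exist $\varepsilon_n > 0$, such that
\[
\operatorname{supp}\big( \vartheta_{\varepsilon_n} * (\xi_n u) \big) \subseteq U_n \quad \text{and} \quad \big\| \vartheta_{\varepsilon_n} * (\xi_n u) - \xi_n u \big\|_{W^{1,p}(Z)} < \frac{\delta}{2^n}. \tag{2.38}
\]
Define
\[
u_\delta \stackrel{df}{=} \sum_{n=1}^{\infty} \vartheta_{\varepsilon_n} * (\xi_n u).
\]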


In some neighbourhood of each point $z \in Z$, there are only finitely many nonzero terms in the sum. Hence $u_\delta \in C^\infty(Z)$. Next note that
\[
u = \sum_{n=1}^{\infty} \xi_n u.
\]
From (2.38), we have
\[
\|u_\delta - u\|_{W^{1,p}(Z)} \le \sum_{n=1}^{\infty} \Big( \int_Z \big| \vartheta_{\varepsilon_n} * (\xi_n u) - \xi_n u \big|^p \, dz \Big)^{\frac{1}{p}} + \sum_{n=1}^{\infty} \Big( \int_Z \big| \vartheta_{\varepsilon_n} * D(\xi_n u) - D(\xi_n u) \big|^p \, dz \Big)^{\frac{1}{p}} < \delta,
\]
so $u_\delta \in C^\infty(Z) \cap W^{1,p}(Z)$ and $u_\delta \longrightarrow u$ in $W^{1,p}(Z)$ as $\delta \searrow 0$.

REMARK 2.4.14 The result is true for all Sobolev spaces $W^{m,p}(Z)$, $m \ge 1$. We emphasize that in the above approximation result, we do not claim that the approximating functions belong to $C^\infty\big(\overline{Z}\big)$. To obtain this we need to strengthen the geometry of the boundary $\partial Z$ of $Z$.

DEFINITION 2.4.15 A bounded open set $Z \subseteq \mathbb{R}^N$ is said to be Lipschitz, if for each $z \in \partial Z$, there exists a neighbourhood $U$ of $z$, such that
\[
Z \cap U = \big\{ y = (y_k)_{k=1}^N \in \mathbb{R}^N : \eta(y_1,\dots,y_{N-1}) < y_N \big\} \cap U,
\]
where $\eta : \mathbb{R}^{N-1} \longrightarrow \mathbb{R}$ is a Lipschitz continuous function and $\{y_k\}_{k=1}^N$ is a system of Cartesian coordinates of $\mathbb{R}^N$.

REMARK 2.4.16 From this definition it follows that $\partial Z$ locally has a representation of the form $y_N = \eta(y_1,\dots,y_{N-1})$, i.e., near $z \in \partial Z$, the boundary $\partial Z$ is the graph of a Lipschitz continuous function. By Rademacher’s theorem (see Theorem 1.5.8), the outer unit normal $n(z)$ to the domain $Z$ exists for $\mu^{(N-1)}$-almost all $z \in \partial Z$. If $Z$ is a bounded polyhedron, then $Z$ is Lipschitz. Also if $Z$ is a $C^\infty$-submanifold with $C^\infty$ boundary $\partial Z$, then $Z$ is Lipschitz. Every Lipschitz open set $Z \subseteq \mathbb{R}^N$ is locally star-shaped.

For Lipschitz $Z \subseteq \mathbb{R}^N$ we can improve the conclusion of Theorem 2.4.13.


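THEOREM 2.4.17 If $Z \subseteq \mathbb{R}^N$ is a bounded open set, which is Lipschitz and $p \in [1,+\infty)$, then the embedding $C^\infty\big(\overline{Z}\big) \subseteq W^{1,p}(Z)$ is dense.

REMARK 2.4.18 Theorem 2.4.17 implies that for any bounded, open, Lipschitz set $Z \subseteq \mathbb{R}^N$ and any given $u \in W^{1,p}(Z)$ ($p \in [1,+\infty)$), there exists a sequence $\{u_n\}_{n\ge 1} \subseteq \mathcal{D}(\mathbb{R}^N)$, such that $u_n|_Z \longrightarrow u$ in $W^{1,p}(Z)$. In general for any open set $Z \subseteq \mathbb{R}^N$ and $u \in W^{1,p}(Z)$ ($p \in [1,+\infty)$), we can say that there exists a sequence $\{u_n\}_{n\ge 1} \subseteq \mathcal{D}(\mathbb{R}^N)$, such that
\begin{align*}
& u_n \xrightarrow{\ w\ } u \quad \text{in } L^p(Z), \\
& D_i u_n|_{Z'} \longrightarrow D_i u|_{Z'} \quad \text{in } L^p(Z'), \qquad \forall\, i \in \{1,\dots,N\},\ Z' \subset\subset Z
\end{align*}
(i.e., $Z'$ is a bounded open set with $\overline{Z'} \subseteq Z$). This result is known in the literature as Friedrichs’ theorem. Theorem 2.4.17 holds for all Sobolev spaces $W^{m,p}(Z)$, $m \ge 1$.

None of these approximation results (Theorems 2.4.13 and 2.4.17) is true for $p = +\infty$. Indeed consider the following examples.

EXAMPLE 2.4.19 (a) Let $Z = \mathbb{R}^N$. We know that
\[
\overline{C_c\big(\mathbb{R}^N\big)}^{\,\|\cdot\|_\infty} = C_0\big(\mathbb{R}^N\big).
\]
Thus while $u \equiv 1 \in W^{1,\infty}\big(\mathbb{R}^N\big)$, it cannot be approximated by functions in $C_c^\infty\big(\mathbb{R}^N\big)$.

(b) Let $Z = (-1,1)$ and consider the function
\[
u(z) \stackrel{df}{=}
\begin{cases}
0 & \text{if } z \le 0, \\
z & \text{if } z > 0.
\end{cases}
\]
Then $u$ is absolutely continuous. Its derivative in the sense of distributions is given by
\[
Du(z) \stackrel{df}{=}
\begin{cases}
0 & \text{if } z < 0, \\
1 & \text{if } z > 0.
\end{cases}
\]
Let $\vartheta \in C^\infty(Z)$ be such that $\|\vartheta' - u'\|_\infty < \varepsilon$. So if $z < 0$, then $|\vartheta'(z)| < \varepsilon$ and if $z > 0$, then $|\vartheta'(z) - 1| < \varepsilon$, hence $1 - \varepsilon < \vartheta'(z)$. By continuity, we obtain $\vartheta'(0) \le \varepsilon$ and $\vartheta'(0) \ge 1 - \varepsilon$. If $\varepsilon < \frac{1}{2}$, we reach a contradiction. This shows that $u$ cannot be approximated in $W^{1,\infty}(Z)$ by smooth functions.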

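The next Proposition provides a simple characterization of the elements in $W^{1,p}(Z)$ ($p \in (1, +\infty]$).

PROPOSITION 2.4.20 If $Z \subseteq \mathbb{R}^N$ is open and $u \in L^p(Z)$ with $p \in (1, +\infty]$, then the following statements are equivalent:
(a) $u \in W^{1,p}(Z)$;
(b) there exists a constant $c > 0$, such that
\[ \Big| \int_Z u \frac{\partial \vartheta}{\partial z_k}\, dz \Big| \leq c\, \|\vartheta\|_{p'} \qquad \forall\, \vartheta \in C_c^\infty(Z),\ k \in \{1, \ldots, N\}, \]
with $\frac{1}{p} + \frac{1}{p'} = 1$;
(c) there exists a constant $c > 0$, such that for all $Z' \subset\subset Z$ (i.e., $Z'$ is a bounded open set such that $\overline{Z'} \subseteq Z$), we have
\[ \|\tau_y(u) - u\|_{L^p(Z')} \leq c\, \|y\|_{\mathbb{R}^N} \qquad \forall\, y \in \mathbb{R}^N \ \text{with } \|y\|_{\mathbb{R}^N} < d_{\mathbb{R}^N}(Z', Z^c). \]
Moreover, in both (b) and (c) we can take $c = \|Du\|_p$.

PROOF “(a)$\Longrightarrow$(b)”: Obvious.

“(b)$\Longrightarrow$(a)”: Let $L_k : C_c^\infty(Z) \to \mathbb{R}$ be defined by
\[ L_k(\vartheta) \stackrel{df}{=} \int_Z u \frac{\partial \vartheta}{\partial z_k}\, dz \qquad \forall\, k \in \{1, \ldots, N\}. \]
Evidently $L_k$ is linear and $L^{p'}$-continuous. Since the embedding $C_c^\infty(Z) \subseteq L^{p'}(Z)$ is dense (note that $p' < +\infty$ because $p > 1$), we can extend $L_k$ continuously to all of $L^{p'}(Z)$. So $L_k \in L^{p'}(Z)^*$ and by the Riesz representation theorem (see Theorem A.3.24), we can find $h \in L^p(Z)$, such that
\[ L_k(v) = \int_Z h v\, dz \qquad \forall\, v \in L^{p'}(Z), \]
so
\[ \int_Z u \frac{\partial \vartheta}{\partial z_k}\, dz = \int_Z h \vartheta\, dz \qquad \forall\, \vartheta \in C_c^\infty(Z) \]
and $D_k u = -h$ for all $k \in \{1, \ldots, N\}$, hence $u \in W^{1,p}(Z)$.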

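“(a)$\Longrightarrow$(c)”: First suppose that $u \in C_c^\infty(\mathbb{R}^N)$. Let $y \in \mathbb{R}^N$ and set
\[ v(t) \stackrel{df}{=} u(z + ty) \qquad \forall\, t \in \mathbb{R}. \]
Then from the chain rule, we have $v'(t) = (Du(z + ty), y)_{\mathbb{R}^N}$. Integrating, we obtain
\[ u(z + y) - u(z) = v(1) - v(0) = \int_0^1 v'(t)\, dt = \int_0^1 (Du(z + ty), y)_{\mathbb{R}^N}\, dt \]
and so
\[ \|\tau_y(u) - u\|_{L^p(Z')}^p \leq \|y\|_{\mathbb{R}^N}^p \int_{Z'} \int_0^1 \|Du(z + ty)\|_{\mathbb{R}^N}^p\, dt\, dz = \|y\|_{\mathbb{R}^N}^p \int_0^1 \int_{Z'} \|Du(z + ty)\|_{\mathbb{R}^N}^p\, dz\, dt = \|y\|_{\mathbb{R}^N}^p \int_0^1 \int_{Z' + ty} \|Du(r)\|_{\mathbb{R}^N}^p\, dr\, dt. \]
If
\[ \|y\|_{\mathbb{R}^N} < d_{\mathbb{R}^N}(Z', Z^c), \]
we can find a bounded open set $Z''$, such that $\overline{Z''} \subseteq Z$ and
\[ Z' + ty \subseteq Z'' \qquad \forall\, t \in [0, 1]. \]
Therefore
\[ \|\tau_y(u) - u\|_{L^p(Z')}^p \leq \|y\|_{\mathbb{R}^N}^p \int_{Z''} \|Du\|_{\mathbb{R}^N}^p\, dz. \tag{2.39} \]
For the general case, suppose that $u \in W^{1,p}(Z)$, $p \in (1, +\infty)$. Then we can find a sequence $\{u_n\}_{n \geq 1} \subseteq C_c^\infty(\mathbb{R}^N)$, such that
\[ u_n \to u \ \text{in } L^p(Z), \qquad Du_n \to Du \ \text{in } L^p(Z'), \]
for any $Z' \subset\subset Z$ (see Remark 2.4.18). From (2.39), we have
\[ \|\tau_y(u_n) - u_n\|_{L^p(Z')}^p \leq \|y\|_{\mathbb{R}^N}^p \int_{Z''} \|Du_n\|_{\mathbb{R}^N}^p\, dz, \]
so, passing to the limit as $n \to +\infty$,
\[ \|\tau_y(u) - u\|_{L^p(Z')}^p \leq \|y\|_{\mathbb{R}^N}^p \int_{Z''} \|Du\|_{\mathbb{R}^N}^p\, dz. \tag{2.40} \]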

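If $p = +\infty$, we obtain (2.40) for $p < +\infty$ and then let $p \to +\infty$.

“(c)$\Longrightarrow$(b)”: Let $\vartheta \in C_c^\infty(Z)$ and consider an open set $Z'$, such that $\operatorname{supp} \vartheta \subseteq Z' \subset\subset Z$. Let $y \in \mathbb{R}^N$ with
\[ \|y\|_{\mathbb{R}^N} < d_{\mathbb{R}^N}(Z', Z^c). \]
By hypothesis we have
\[ \Big| \int_Z \big(\tau_y(u) - u\big) \vartheta\, dz \Big| \leq c\, \|y\|_{\mathbb{R}^N} \|\vartheta\|_{p'}. \]
Note that
\[ \int_Z \big(u(z + y) - u(z)\big) \vartheta(z)\, dz = \int_Z u(z) \big(\vartheta(z - y) - \vartheta(z)\big)\, dz, \]
so
\[ \Big| \int_Z u(z) \frac{\vartheta(z - y) - \vartheta(z)}{\|y\|_{\mathbb{R}^N}}\, dz \Big| \leq c\, \|\vartheta\|_{p'}. \]
Let $y = t e_k$, $t \in \mathbb{R}$, $k \in \{1, \ldots, N\}$. Passing to the limit as $t \to 0$, we obtain
\[ \Big| \int_Z u \frac{\partial \vartheta}{\partial z_k}\, dz \Big| \leq c\, \|\vartheta\|_{p'} \qquad \forall\, \vartheta \in C_c^\infty(Z),\ k \in \{1, \ldots, N\}. \]
Finally it is clear from the above proofs that in (b) and (c), the constant $c > 0$ can be taken as $c = \|Du\|_p$.

REMARK 2.4.21 If $p = 1$, then we have (a) $\Longrightarrow$ (b) $\Longrightarrow$ (c). From the implication (a) $\Longrightarrow$ (c), we can see that if $Z \subseteq \mathbb{R}^N$ is an open set and $u \in W^{1,\infty}(Z)$, then
\[ |u(z) - u(y)| \leq \|Du\|_\infty \|z - y\|_{\mathbb{R}^N} \qquad \forall\, z, y \in Z. \]
So $W^{1,\infty}(Z)$ is the space of Lipschitz continuous functions on $Z$. In particular
\[ W^{1,\infty}(Z) \subseteq C(\overline{Z}). \]
More generally, it is easy to show that if $u : Z \to \mathbb{R}$ is locally Lipschitz, then $u \in W^{1,p}_{loc}(Z)$ ($p \in [1, +\infty]$).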

PROPOSITION 2.4.22 If $Z \subseteq \mathbb{R}^N$ is an open set and $u, v \in W^{1,p}(Z) \cap L^\infty(Z)$ with $p \in [1, +\infty]$, then $uv \in W^{1,p}(Z)$ and $D(uv) = uDv + vDu$ (product rule).

PROOF If $p = +\infty$, then $u, v$ are Lipschitz continuous functions and so differentiable for almost all $z \in Z$ (see Theorem 1.5.8). Clearly $uv$ is Lipschitz continuous too, hence in $W^{1,\infty}(Z)$ and the product rule results from the usual product rule of differentiable functions. So suppose that $p \in [1, +\infty)$. Rescaling $u$ and $v$ if necessary, we may assume that
\[ |u(z)|,\ |v(z)| \leq 1 \qquad \text{for a.a. } z \in Z. \]
Invoking Theorem 2.4.13, we can find sequences $\{\hat{u}_n\}_{n \geq 1}, \{\hat{v}_n\}_{n \geq 1} \subseteq C^\infty(Z) \cap W^{1,p}(Z)$, such that
\[ \hat{u}_n \to u \ \text{in } W^{1,p}(Z), \quad \hat{u}_n(z) \to u(z) \ \text{for a.a. } z \in Z, \qquad \hat{v}_n \to v \ \text{in } W^{1,p}(Z), \quad \hat{v}_n(z) \to v(z) \ \text{for a.a. } z \in Z. \]
Let
\[ u_n \stackrel{df}{=} \max\big\{-1, \min\{\hat{u}_n, 1\}\big\}, \qquad v_n \stackrel{df}{=} \max\big\{-1, \min\{\hat{v}_n, 1\}\big\}. \]
Then $u_n v_n$ is locally Lipschitz. Moreover, we have
\[ D(u_n v_n) = u_n Dv_n + v_n Du_n \in L^p(Z; \mathbb{R}^N), \]
so
\[ u_n v_n \in W^{1,p}(Z) \qquad \forall\, n \geq 1. \]
Note that
\[ u_n \to u \ \text{in } W^{1,p}(Z), \quad u_n(z) \to u(z) \ \text{for a.a. } z \in Z, \qquad v_n \to v \ \text{in } W^{1,p}(Z), \quad v_n(z) \to v(z) \ \text{for a.a. } z \in Z. \]
We have
\[ \|u_n v_n - uv\|_p^p \leq 2^{p-1} \Big( \int_Z |v_n(z)|^p |u_n(z) - u(z)|^p\, dz + \int_Z |u(z)|^p |v_n(z) - v(z)|^p\, dz \Big) \to 0. \]

In addition
\[ \big\| u_n Dv_n + v_n Du_n - (uDv + vDu) \big\|_p^p \leq 2^{p-1} \Big( \|u_n Dv_n - uDv\|_p^p + \|v_n Du_n - vDu\|_p^p \Big) \]
\[ \leq 4^{p-1} \Big[ \int_Z |u_n(z)|^p \|Dv_n - Dv\|_{\mathbb{R}^N}^p\, dz + \int_Z \|Dv(z)\|_{\mathbb{R}^N}^p |u_n(z) - u(z)|^p\, dz + \int_Z |v_n(z)|^p \|Du_n - Du\|_{\mathbb{R}^N}^p\, dz + \int_Z \|Du(z)\|_{\mathbb{R}^N}^p |v_n(z) - v(z)|^p\, dz \Big] \to 0. \]
Therefore we conclude that $uv \in W^{1,p}(Z)$ and $D(uv) = uDv + vDu$.
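The two convergence estimates in the proof of Proposition 2.4.22 rest on an elementary splitting and a convexity inequality. The following lines are a sketch of that step under the normalization $|u|, |u_n|, |v_n| \leq 1$ used in the proof; the constant $2^{p-1}$ is supplied here for completeness and plays no role in the limit.

```latex
% Splitting of the products (valid pointwise a.e. on Z):
\begin{align*}
u_n v_n - u v       &= v_n (u_n - u) + u (v_n - v), \\
u_n Dv_n - u Dv     &= u_n (Dv_n - Dv) + (u_n - u)\, Dv .
\end{align*}
% Convexity of t \mapsto t^p on [0,\infty), p \ge 1:
\[
|a + b|^p \le 2^{p-1}\bigl(|a|^p + |b|^p\bigr) \qquad \forall\, a, b \in \mathbb{R}.
\]
% Terms with a bounded factor, using |v_n| \le 1 and |u| \le 1:
\[
\int_Z |v_n|^p |u_n - u|^p \, dz \le \|u_n - u\|_p^p \to 0,
\qquad
\int_Z |u|^p |v_n - v|^p \, dz \le \|v_n - v\|_p^p \to 0 .
\]
% Terms with a fixed gradient: dominated convergence, since
% |u_n - u|^p \|Dv\|^p \le 2^p \|Dv\|^p \in L^1(Z) and u_n \to u a.e.:
\[
\int_Z \|Dv(z)\|_{\mathbb{R}^N}^p \, |u_n(z) - u(z)|^p \, dz \to 0 .
\]
```

The same reasoning, with the roles of $u$ and $v$ exchanged, handles the terms involving $v_n Du_n - vDu$.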

In fact a careful reading of this proof reveals that the following result is also true.

PROPOSITION 2.4.23 If $Z \subseteq \mathbb{R}^N$ is an open set, $u \in W^{1,p}(Z)$, $p \in [1, +\infty]$ and $v \in W^{1,\infty}(Z)$, then $uv \in W^{1,p}(Z)$ and $D(uv) = uDv + vDu$.

Next we prove a chain rule for Sobolev functions.

PROPOSITION 2.4.24 (Chain Rule for Sobolev Functions) If $Z \subseteq \mathbb{R}^N$ is an open set, $\xi \in C^1(\mathbb{R})$, $\xi' \in L^\infty(\mathbb{R})$, $\xi(0) = 0$ and $u \in W^{1,p}(Z)$ with $p \in [1, +\infty]$, then $\xi \circ u \in W^{1,p}(Z)$ and $D(\xi \circ u) = \xi'(u) Du$. Moreover, if $Z$ is bounded, then we can drop the condition that $\xi(0) = 0$.

PROOF Let $p \in [1, +\infty)$ and $\vartheta \in C_c^\infty(Z)$ be such that $\operatorname{supp} \vartheta \subseteq V \subset\subset Z$. Using Proposition 2.4.12(e) (twice) and integration by parts, we have
\[ \int_Z \xi(u) \frac{\partial \vartheta}{\partial z_k}\, dz = \int_V \xi(u) \frac{\partial \vartheta}{\partial z_k}\, dz = \lim_{\varepsilon \searrow 0} \int_V \xi(u_\varepsilon) \frac{\partial \vartheta}{\partial z_k}\, dz = -\lim_{\varepsilon \searrow 0} \int_V \xi'(u_\varepsilon) \frac{\partial u_\varepsilon}{\partial z_k} \vartheta\, dz = -\int_V \xi'(u) (D_k u) \vartheta\, dz = -\int_Z \xi'(u) (D_k u) \vartheta\, dz, \]
so
\[ D_k(\xi \circ u) = \xi'(u) D_k u \qquad \forall\, k \in \{1, \ldots, N\}. \]
It is clear from the above argument (see second equality) that if $Z$ is bounded, then the condition $\xi(0) = 0$ can be dropped. If $p = +\infty$, then $\xi \circ u$ is a Lipschitz continuous function (see Remark 2.4.21) and the result follows from the classical chain rule.

In fact there is a stronger version of the previous proposition. It is due to Marcus & Mizel (1972), where the interested reader can find the proof.

PROPOSITION 2.4.25 If $Z \subseteq \mathbb{R}^N$ is an open set, $\xi : \mathbb{R} \to \mathbb{R}$ is Lipschitz continuous, $\xi(0) = 0$ and $u \in W^{1,p}(Z)$, $p \in [1, +\infty]$, then $\xi \circ u \in W^{1,p}(Z)$ and $D(\xi \circ u) = \xi^*(u) Du$ almost everywhere on $Z$ with $\xi^* : \mathbb{R} \to \mathbb{R}$ being any bounded Borel function such that
\[ \xi^*(t) = \xi'(t) \qquad \text{for a.a. } t \in \mathbb{R}. \]

REMARK 2.4.26 The function $\xi^*$ can always be taken to be bounded, by virtue of the following result due to Stampacchia (1966): “If $u \in W^{1,p}(Z)$ and $A \subseteq \mathbb{R}$ is a Lebesgue-null set, then $Du(z) = 0$ for almost all $z \in u^{-1}(A)$.” Moreover, note that the chain rule (see Proposition 2.4.25) is also valid for $W_0^{1,p}(Z)$ (see Corollary 2.4.8).

Recall that
\[ u^+ = \max\{u, 0\} \qquad \text{and} \qquad u^- = \max\{-u, 0\}. \]
We have
\[ u = u^+ - u^- \qquad \text{and} \qquad |u| = u^+ + u^-. \]
Using the general version of the chain rule (see Proposition 2.4.25), we obtain at once the following result.

PROPOSITION 2.4.27 If $Z \subseteq \mathbb{R}^N$ is an open set and $u \in W^{1,p}(Z)$, $p \in [1, +\infty]$, then $u^+, u^-, |u| \in W^{1,p}(Z)$ and we have
\[ Du^+ = \begin{cases} Du & \text{for a.a. } z \in \{u > 0\}, \\ 0 & \text{for a.a. } z \in \{u \leq 0\}, \end{cases} \qquad Du^- = \begin{cases} 0 & \text{for a.a. } z \in \{u \geq 0\}, \\ -Du & \text{for a.a. } z \in \{u < 0\}, \end{cases} \]
\[ D|u| = \begin{cases} Du & \text{for a.a. } z \in \{u > 0\}, \\ 0 & \text{for a.a. } z \in \{u = 0\}, \\ -Du & \text{for a.a. } z \in \{u < 0\}. \end{cases} \]

Using this proposition we can show that the Sobolev spaces $W^{1,p}(Z)$, $p \in [1, +\infty]$ have a lattice structure.

COROLLARY 2.4.28 If $Z \subseteq \mathbb{R}^N$ is an open set and $u, v \in W^{1,p}(Z)$, $p \in [1, +\infty]$, then
\[ h_0 = \min\{u, v\} \in W^{1,p}(Z), \qquad h_1 = \max\{u, v\} \in W^{1,p}(Z) \]
and we have
\[ Dh_0 = \begin{cases} Du & \text{for a.a. } z \in \{u \leq v\}, \\ Dv & \text{for a.a. } z \in \{u > v\}, \end{cases} \qquad Dh_1 = \begin{cases} Du & \text{for a.a. } z \in \{u \geq v\}, \\ Dv & \text{for a.a. } z \in \{u \leq v\}. \end{cases} \]

PROOF Note that
\[ h_1 = (u - v)^+ + v \qquad \text{and} \qquad h_0 = u - (u - v)^+. \]
Then the result follows at once from Proposition 2.4.27.

An immediate consequence of this Corollary is the following particular case of the result of Stampacchia mentioned in Remark 2.4.26.

COROLLARY 2.4.29 If $Z \subseteq \mathbb{R}^N$ is an open set and $u \in W^{1,p}(Z)$, $p \in [1, +\infty]$, then for every $\eta \in \mathbb{R}$, we have
\[ Du(z) = 0 \qquad \text{for a.a. } z \in \{u = \eta\}. \]
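As a one-dimensional illustration of Proposition 2.4.27 and Corollaries 2.4.28 and 2.4.29, the following sketch works out the formulas for a concrete pair of functions. The choice $Z = (-1, 1)$, $u(z) = z$, $v(z) = -z$ is an assumption made here for the example and does not come from the text.

```latex
% Setting (illustrative choice): Z = (-1,1), u(z) = z, v(z) = -z, so Du = 1, Dv = -1.
\[
h_1 = \max\{u, v\} = |z|, \qquad h_0 = \min\{u, v\} = -|z| .
\]
% Corollary 2.4.28 with \{u \ge v\} = [0,1) and \{u < v\} = (-1,0):
\[
Dh_1(z) =
\begin{cases}
Du(z) = 1   & \text{for a.a. } z \in [0, 1), \\
Dv(z) = -1  & \text{for a.a. } z \in (-1, 0),
\end{cases}
\qquad
Dh_0 = -Dh_1 .
\]
% This agrees with Proposition 2.4.27 applied to |u| = |z|:
% D|u| = 1 on \{z > 0\}, \; -1 on \{z < 0\}, and the set \{z = 0\} is Lebesgue-null.
% Corollary 2.4.29 for w = u^+ = \max\{z, 0\} and \eta = 0:
\[
\{w = 0\} = (-1, 0], \qquad Dw(z) = 0 \ \text{ for a.a. } z \in (-1, 0] .
\]
```

The overlap of the sets $\{u \geq v\}$ and $\{u \leq v\}$ in the formula for $Dh_1$ causes no ambiguity, because $Du = Dv$ almost everywhere on $\{u = v\}$ by Corollary 2.4.29 applied to $u - v$.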

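PROPOSITION 2.4.30 If $Z \subseteq \mathbb{R}^N$ is an open set, $\{u_n\}_{n \geq 1} \subseteq W^{1,p}(Z)$, $p \in (1, +\infty)$,
\[ h \stackrel{df}{=} \sup_{n \geq 1} \|Du_n\|_{\mathbb{R}^N} \in L^p(Z) \qquad \text{and} \qquad g \stackrel{df}{=} \sup_{n \geq 1} u_n, \]
then $g \in W^{1,p}(Z)$ and $\|Dg(z)\|_{\mathbb{R}^N} \leq h(z)$ for almost all $z \in Z$.

PROOF Let
\[ g_k \stackrel{df}{=} \max_{1 \leq n \leq k} u_n \qquad \forall\, k \geq 1. \]
From Corollary 2.4.28, we have that $g_k \in W^{1,p}(Z)$ and
\[ \|Dg_k(z)\|_{\mathbb{R}^N} \leq \max_{1 \leq n \leq k} \|Du_n(z)\|_{\mathbb{R}^N} \leq h(z) \qquad \text{for a.a. } z \in Z. \tag{2.41} \]
Evidently the sequence $\{g_k\}_{k \geq 1}$ is increasing and
\[ g_k(z) \to g(z) \qquad \text{for a.a. } z \in Z \ \text{as } k \to +\infty. \]
Also from (2.41), we see that the sequence $\{Dg_k\}_{k \geq 1} \subseteq L^p(Z; \mathbb{R}^N)$ is bounded. Then from the monotone convergence theorem (see Theorem A.2.10) and the Eberlein-Smulian theorem (see Theorem A.3.8), we have
\[ g_k \to g \ \text{in } L^p(Z), \qquad Dg_{k_m} \stackrel{w}{\to} w \ \text{in } L^p(Z; \mathbb{R}^N), \]
with $\{g_{k_m}\}_{m \geq 1}$ being a subsequence of $\{g_k\}_{k \geq 1}$. For every $\vartheta \in C_c^\infty(Z)$ and every $i \in \{1, \ldots, N\}$, from the definition of the distributional derivative, we have
\[ \int_Z (D_i g_k) \vartheta\, dz = -\int_Z g_k D_i \vartheta\, dz, \]
so
\[ \int_Z w_i \vartheta\, dz = -\int_Z g D_i \vartheta\, dz, \]
where $w = (w_i)_{i=1}^N \in L^p(Z; \mathbb{R}^N)$ and finally
\[ w_i = D_i g \qquad \forall\, i \in \{1, \ldots, N\}. \]
Therefore, we infer that for the whole sequence, we have
\[ Dg_k \stackrel{w}{\to} Dg \qquad \text{in } L^p(Z; \mathbb{R}^N). \]
So $g \in W^{1,p}(Z)$ and
\[ \|Dg(z)\|_{\mathbb{R}^N} \leq h(z) \qquad \text{for a.a. } z \in Z. \]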

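From the above proofs, it is clear that we have:

PROPOSITION 2.4.31 If $Z \subseteq \mathbb{R}^N$ is an open set and $\{u_n\}_{n \geq 1} \subseteq W^{1,p}(Z)$, $p \in [1, +\infty)$ is a sequence, such that
\[ u_n \stackrel{w}{\to} u \ \text{in } L^p(Z), \qquad Du_n \stackrel{w}{\to} w \ \text{in } L^p(Z; \mathbb{R}^N), \]
then $u \in W^{1,p}(Z)$ and $Du = w$.

PROPOSITION 2.4.32 If $Z \subseteq \mathbb{R}^N$ is an open set, $\{u_n\}_{n \geq 1}, \{v_n\}_{n \geq 1} \subseteq W^{1,p}(Z)$, $p \in [1, +\infty)$ and
\[ u_n \to u \ \text{in } W^{1,p}(Z), \qquad v_n \to v \ \text{in } W^{1,p}(Z), \]
then
\[ \min\{u_n, v_n\} \to \min\{u, v\} \ \text{in } W^{1,p}(Z), \qquad \max\{u_n, v_n\} \to \max\{u, v\} \ \text{in } W^{1,p}(Z). \]

PROOF Since $\max\{u, v\} = (u - v)^+ + v$ and $\min\{u, v\} = u - (u - v)^+$, it suffices to show that
\[ \text{if } u_n \to u \ \text{in } W^{1,p}(Z), \ \text{then } u_n^+ \to u^+ \ \text{in } W^{1,p}(Z). \]
First note that
\[ |u_n^+ - u^+| \leq |u_n - u| \]
and so it follows that
\[ u_n^+ \to u^+ \qquad \text{in } L^p(Z). \]
Next let $h = \chi_{(0, +\infty)}$. Using Proposition 2.4.27, we have
\[ \|Du_n^+ - Du^+\|_p^p = \int_Z \|h(u_n) Du_n - h(u) Du\|_{\mathbb{R}^N}^p\, dz \leq 2^{p-1} \Big( \|Du_n - Du\|_p^p + \int_Z \|Du(z)\|_{\mathbb{R}^N}^p |h(u_n) - h(u)|^p\, dz \Big) \to 0. \]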

Using this, we can conclude the validity of Proposition 2.4.27 and Corollary 2.4.28 for the spaces $W_0^{1,p}(Z)$, $p \in [1, +\infty)$. So $W_0^{1,p}(Z)$, $p \in [1, +\infty)$ has a lattice structure.

PROPOSITION 2.4.33 If $Z \subseteq \mathbb{R}^N$ is an open set and $u, v \in W_0^{1,p}(Z)$, $p \in [1, +\infty)$, then
\[ \max\{u, v\},\ \min\{u, v\} \in W_0^{1,p}(Z). \]
In particular
\[ u^+, u^-, |u| \in W_0^{1,p}(Z). \]

PROOF Again it suffices to show that $u^+ \in W_0^{1,p}(Z)$. Let $\{\vartheta_n\}_{n \geq 1} \subseteq C_c^\infty(Z)$ be such that $\vartheta_n \to u$ in $W^{1,p}(Z)$. From the proof of Proposition 2.4.12(c) and (d), it follows that for every $n \geq 1$ we can find a sequence $\{\psi_n^m\}_{m \geq 1} \subseteq C_c^\infty(Z)$, $\psi_n^m \geq 0$, such that
\[ \psi_n^m \to \vartheta_n^+ \qquad \text{in } W_0^{1,p}(Z) \ \text{as } m \to +\infty. \]
Since
\[ \vartheta_n^+ \to u^+ \qquad \text{in } W^{1,p}(Z) \]
(see Proposition 2.4.32), via the double limit lemma (see Proposition A.2.35), we can find a sequence $\{m(n)\}_{n \geq 1}$ increasing (not necessarily strictly) to $+\infty$, such that $\psi_n^{m(n)} \to u^+$ in $W^{1,p}(Z)$. Since $\psi_n^{m(n)} \in C_c^\infty(Z)$, we deduce that $u^+ \in W_0^{1,p}(Z)$.

In fact the previous result can be also obtained by having a chain rule for $W_0^{1,p}(Z)$, $p \in (1, +\infty)$ (see also Proposition 2.4.25). First an auxiliary result which is actually of independent interest.

PROPOSITION 2.4.34 If $Z \subseteq \mathbb{R}^N$ is an open set, $u \in W^{1,p}(Z)$, $p \in [1, +\infty)$ and $u$ vanishes outside a compact $K \subseteq Z$, then $u \in W_0^{1,p}(Z)$.

PROOF Let $Z'$ be a bounded open set in $\mathbb{R}^N$, such that $K \subseteq Z' \subset\subset Z$. Let $\varphi \in C_c^\infty(\mathbb{R}^N)$, such that
\[ \operatorname{supp} \varphi \subseteq Z' \qquad \text{and} \qquad \varphi|_K \equiv 1 \]
(i.e., $\varphi$ is what we usually call a cut off function). We have $\varphi u = u$. We can find $\{\vartheta_n\}_{n \geq 1} \subseteq C_c^\infty(\mathbb{R}^N)$, such that
\[ \vartheta_n \to u \ \text{in } L^p(Z) \qquad \text{and} \qquad D\vartheta_n \to Du \ \text{in } L^p(Z'; \mathbb{R}^N) \]
(see Remark 2.4.18). We have
\[ \varphi \vartheta_n \to \varphi u \qquad \text{in } W^{1,p}(Z) \]
and $\varphi \vartheta_n \in C_c^\infty(Z)$. Therefore $\varphi u = u \in W_0^{1,p}(Z)$.

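Using this result we can prove the chain rule for the Sobolev spaces $W_0^{1,p}(Z)$, $p \in (1, +\infty)$.

PROPOSITION 2.4.35 If $Z \subseteq \mathbb{R}^N$ is a bounded open set, $\xi : \mathbb{R} \to \mathbb{R}$ is Lipschitz continuous, $\xi(0) = 0$ and $u \in W_0^{1,p}(Z)$, $p \in (1, +\infty)$, then $\xi \circ u \in W_0^{1,p}(Z)$ and $D(\xi \circ u) = \xi'(u) Du$.

PROOF Let $\{\vartheta_n\}_{n \geq 1} \subseteq C_c^\infty(Z)$ be such that $\vartheta_n \to u$ in $W^{1,p}(Z)$. Set
\[ h_n \stackrel{df}{=} \xi \circ \vartheta_n \qquad \forall\, n \geq 1. \]
Evidently $h_n$ is a Lipschitz continuous function and since $\vartheta_n$ has compact support and $\xi(0) = 0$, so does $h_n$. Also
\[ \Big| \frac{\partial h_n}{\partial z_i} \Big| \leq \operatorname{Lip}(h_n) \qquad \forall\, i \in \{1, \ldots, N\},\ n \geq 1. \]
Because $Z \subseteq \mathbb{R}^N$ is bounded, we infer that
\[ \frac{\partial h_n}{\partial z_i} \in L^p(Z), \]
hence $h_n \in W^{1,p}(Z)$ and has compact support. So Proposition 2.4.34 implies that $h_n \in W_0^{1,p}(Z)$. Also we have
\[ \big| h_n(z) - \xi(u(z)) \big| = \big| \xi(\vartheta_n(z)) - \xi(u(z)) \big| \leq \operatorname{Lip}(\xi) |\vartheta_n(z) - u(z)| \qquad \text{for a.a. } z \in Z, \]
so
\[ h_n \to \xi \circ u \qquad \text{in } L^p(Z). \]
Moreover, if $\{e_k\}_{k=1}^N$ is the standard orthonormal basis of $\mathbb{R}^N$, we have
\[ \frac{|h_n(z + te_i) - h_n(z)|}{|t|} \leq \operatorname{Lip}(\xi) \frac{|\vartheta_n(z + te_i) - \vartheta_n(z)|}{|t|}, \]
so
\[ \limsup_{n \to +\infty} \Big\| \frac{\partial h_n}{\partial z_i} \Big\|_p \leq \operatorname{Lip}(\xi) \limsup_{n \to +\infty} \Big\| \frac{\partial \vartheta_n}{\partial z_i} \Big\|_p. \tag{2.42} \]
But
\[ \frac{\partial \vartheta_n}{\partial z_i} \to \frac{\partial u}{\partial z_i} \qquad \text{in } L^p(Z) \quad \forall\, i \in \{1, \ldots, N\}. \]
So from (2.42), we infer that the sequence $\big\{ \frac{\partial h_n}{\partial z_i} \big\}_{n \geq 1} \subseteq L^p(Z)$ is bounded. Since $p \in (1, +\infty)$, by passing to a subsequence if necessary, we may assume that
\[ \frac{\partial h_n}{\partial z_i} \stackrel{w}{\to} w_i \qquad \text{in } L^p(Z) \quad \forall\, i \in \{1, \ldots, N\}. \]
From Proposition 2.4.31, we have that
\[ w_i = \frac{\partial \xi(u)}{\partial z_i} \]
and so
\[ h_n \stackrel{w}{\to} \xi(u) \qquad \text{in } W^{1,p}(Z). \]
Because $h_n \in W_0^{1,p}(Z)$ and $W_0^{1,p}(Z)$ is a closed subspace of $W^{1,p}(Z)$ (hence weakly closed), we conclude that $\xi \circ u = \xi(u) \in W_0^{1,p}(Z)$. Finally note that
\[ Dh_n = \xi'(\vartheta_n) D\vartheta_n \]
and so in the limit we have
\[ D(\xi \circ u) = \xi'(u) Du. \]

REMARK 2.4.36 If $Z \subseteq \mathbb{R}^N$ is bounded, open, Lipschitz and $u \in W^{1,p}(Z)$, then in the above proof we can choose $\{\vartheta_n\}_{n \geq 1} \subseteq C^\infty(\overline{Z})$, such that $\vartheta_n \to u$ in $W^{1,p}(Z)$ (see Theorem 2.4.17). Then the same proof is valid and so we have a proof of Proposition 2.4.25 with the extra hypothesis that $Z \subseteq \mathbb{R}^N$ is bounded and Lipschitz.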

We can also have the product rule for the spaces $W_0^{1,p}(Z)$, $p \in [1, +\infty]$. The proof is the same as that of Proposition 2.4.22, using this time a sequence $\{\hat{u}_n\}_{n \geq 1} \subseteq C_c^\infty(Z)$.

PROPOSITION 2.4.37 If $Z \subseteq \mathbb{R}^N$ is an open set and
\[ u \in W_0^{1,p}(Z) \cap L^\infty(Z) \qquad \text{and} \qquad v \in W^{1,p}(Z) \cap L^\infty(Z), \]
$p \in [1, +\infty]$, then $uv \in W_0^{1,p}(Z)$ and $D(uv) = uDv + vDu$.

Continuing with the Sobolev spaces $W_0^{1,p}(Z)$, $p \in [1, +\infty)$, we have the following results.

PROPOSITION 2.4.38 If $Z \subseteq \mathbb{R}^N$ is an open set,
\[ u \in W_0^{1,p}(Z) \qquad \text{and} \qquad v \in W^{1,p}(Z), \]
$p \in [1, +\infty)$ and
\[ 0 \leq v(z) \leq u(z) \qquad \text{for a.a. } z \in Z, \]
then $v \in W_0^{1,p}(Z)$.

PROOF From the proof of Proposition 2.4.33, we know that there exists a sequence $\{\vartheta_n\}_{n \geq 1} \subseteq C_c^\infty(Z)$, such that
\[ \vartheta_n \geq 0 \quad \forall\, n \geq 1 \qquad \text{and} \qquad \vartheta_n \to u \ \text{in } W^{1,p}(Z). \]
Let
\[ h_n \stackrel{df}{=} \min\{v, \vartheta_n\} \qquad \forall\, n \geq 1. \]
Evidently $h_n$ has compact support and so by Proposition 2.4.34, $h_n \in W_0^{1,p}(Z)$. Moreover, from Proposition 2.4.32, we have that
\[ h_n \to \min\{v, u\} = v \qquad \text{in } W^{1,p}(Z). \]
So $v \in W_0^{1,p}(Z)$.

PROPOSITION 2.4.39 If $Z \subseteq \mathbb{R}^N$ is an open set, $u \in W_0^{1,p}(Z)$, $v \in W^{1,p}(Z)$, $p \in [1, +\infty)$ and
\[ |v(z)| \leq |u(z)| \qquad \text{for a.a. } z \in Z \setminus K, \]
where $K$ is a compact subset of $Z$, then $v \in W_0^{1,p}(Z)$.

PROOF Let $\varphi \in C_c^\infty(Z)$ be such that $0 \leq \varphi \leq 1$ and
\[ \varphi|_K = 1 \]
(a cut off function). We set
\[ \hat{u} \stackrel{df}{=} (1 - \varphi)|u| + \varphi v^+. \]
From Propositions 2.4.33 and 2.4.34, we have that $\hat{u} \in W_0^{1,p}(Z)$ and
\[ 0 \leq v^+ \leq \hat{u}. \]
So
\[ v^+ \in W_0^{1,p}(Z) \]
(see Proposition 2.4.38). Similarly we show that $v^- \in W_0^{1,p}(Z)$. Hence $v \in W_0^{1,p}(Z)$.

We can improve Proposition 2.4.34 and motivate the discussion of trace which follows.

PROPOSITION 2.4.40 If $Z \subseteq \mathbb{R}^N$ is a bounded open set, $u \in W^{1,p}(Z)$, $p \in [1, +\infty)$ and
\[ \lim_{z \to y} u(z) = 0 \qquad \forall\, y \in \partial Z, \]
then $u \in W_0^{1,p}(Z)$.

PROOF Since
\[ u = u^+ - u^-, \]
we may assume that $u \geq 0$. For $\varepsilon > 0$ let $u_\varepsilon = (u - \varepsilon)^+$. Then $u_\varepsilon \in W^{1,p}(Z)$ (see Proposition 2.4.27) and $u_\varepsilon$ has compact support. Therefore by Proposition 2.4.34, we have that $u_\varepsilon \in W_0^{1,p}(Z)$. Now note that
\[ u_\varepsilon \to u \ \text{in } W^{1,p}(Z) \qquad \text{as } \varepsilon \searrow 0. \]
Thus $u \in W_0^{1,p}(Z)$.
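The convergence $u_\varepsilon \to u$ in $W^{1,p}(Z)$ claimed at the end of this proof can be checked directly. The sketch below assumes the standard truncation $u_\varepsilon = (u - \varepsilon)^+$ for $u \geq 0$ (the formula for $u_\varepsilon$ is an assumption made here) and uses only Proposition 2.4.27, Corollary 2.4.29 and the dominated convergence theorem.

```latex
% Assumption: u \ge 0 and u_\varepsilon = (u - \varepsilon)^+ on the bounded set Z.
% Function values: 0 \le u - u_\varepsilon = \min\{u, \varepsilon\} \le u, hence
\[
\|u - u_\varepsilon\|_p^p = \int_Z \min\{u, \varepsilon\}^p \, dz \to 0
\quad \text{as } \varepsilon \searrow 0
\]
% (dominated convergence with majorant u^p \in L^1(Z)).
% Gradients: by Proposition 2.4.27 applied to u - \varepsilon,
\[
Du_\varepsilon = \chi_{\{u > \varepsilon\}} \, Du ,
\qquad
Du - Du_\varepsilon = \chi_{\{0 \le u \le \varepsilon\}} \, Du
                    = \chi_{\{0 < u \le \varepsilon\}} \, Du \quad \text{a.e. on } Z,
\]
% where the last equality uses Du = 0 a.e. on \{u = 0\} (Corollary 2.4.29). Therefore
\[
\|Du - Du_\varepsilon\|_p^p
= \int_{\{0 < u \le \varepsilon\}} \|Du\|_{\mathbb{R}^N}^p \, dz \to 0
\quad \text{as } \varepsilon \searrow 0,
\]
% because \chi_{\{0 < u \le \varepsilon\}} \to 0 pointwise and \|Du\|^p \in L^1(Z).
```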

So roughly speaking a function u ∈ W 1,p (Z) belongs to W01,p (Z), if u is vanishing on ∂Z. But it is not meaningful to talk of values of u on a set of measure zero. Hence we must be more careful on how we assign boundary values to Sobolev functions. Trace theory does exactly this, namely defines and studies the concept of boundary values for the Sobolev spaces W 1,p (Z), p ∈ [1, +∞). The trace of a Sobolev function is an extension of the restriction of a continuous function on ∂Z. We start with a simple lemma.

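LEMMA 2.4.41 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, then for all $p \in [1, +\infty)$ there exists $c > 0$, such that
\[ \int_{\partial Z} |u|^p\, d\mu^{(N-1)} \leq c\, \|u\|_{W^{1,p}(Z)}^p \qquad \forall\, u \in C^1(\overline{Z}). \]

PROOF Since by hypothesis the boundary $\partial Z$ is Lipschitz, for any $z = (z_k)_{k=1}^N \in \partial Z$ we can find $r > 0$ and a Lipschitz continuous function $\eta : \mathbb{R}^{N-1} \to \mathbb{R}$, such that (upon rotating and relabelling the coordinate axes if necessary), we have
\[ Z \cap C_r(z) = \big\{ y = (y_k)_{k=1}^N \in \mathbb{R}^N : \eta(y_1, \ldots, y_{N-1}) < y_N \big\} \cap C_r(z), \]
where
\[ C_r(z) \stackrel{df}{=} \big\{ y = (y_k)_{k=1}^N \in \mathbb{R}^N : |y_k - z_k| < r,\ k = 1, \ldots, N \big\}. \]
First assume that $u|_{Z \setminus C_r(z)} = 0$. If $\{e_k\}_{k=1}^N$ is the standard orthonormal basis of $\mathbb{R}^N$ and $n(\cdot)$ is the outward unit normal vector on $\partial Z$, then we have
\[ -(e_N, n)_{\mathbb{R}^N} \geq \big( 1 + \operatorname{Lip}(\eta)^2 \big)^{-\frac{1}{2}} > 0 \qquad \text{for } \mu^{(N-1)}\text{-a.a. } y \in \partial Z \cap C_r(z). \tag{2.43} \]
Let $\varepsilon > 0$ be given and set
\[ \xi_\varepsilon(t) \stackrel{df}{=} (t^2 + \varepsilon^2)^{\frac{1}{2}} - \varepsilon \qquad \forall\, t \in \mathbb{R}. \]
Using the Gauss-Green theorem (see Theorem A.4.1) and since $|\xi_\varepsilon'| \leq 1$ for all $t \in \mathbb{R}$, we have
\[ \int_{\partial Z} \xi_\varepsilon(u(y))\, d\mu^{(N-1)} = \int_{\partial Z \cap C_r(z)} \xi_\varepsilon(u(y))\, d\mu^{(N-1)} \leq c \int_{\partial Z \cap C_r(z)} \xi_\varepsilon(u(y)) \big( -e_N, n(y) \big)_{\mathbb{R}^N}\, d\mu^{(N-1)} \]
\[ = -c \int_Z \frac{\partial}{\partial y_N} \big( \xi_\varepsilon(u(y)) \big)\, dy \leq c \int_Z |\xi_\varepsilon'(u(y))|\, \|Du(y)\|_{\mathbb{R}^N}\, dy \leq c \int_Z \|Du(y)\|_{\mathbb{R}^N}\, dy, \]
with $c > 0$ independent of $u$ (see (2.43)). Note that $\xi_\varepsilon(u) \to |u|$ as $\varepsilon \searrow 0$, so in the limit we obtain
\[ \int_{\partial Z} |u|\, d\mu^{(N-1)} \leq c \int_Z \|Du(y)\|_{\mathbb{R}^N}\, dy. \tag{2.44} \]
Now we remove the extra hypothesis that $u|_{Z \setminus C_r(z)} = 0$. In the general case, we can cover $\partial Z$ by a finite number of such cubes $C_{r_i}(z_i) = C_i$ for $i = 1, \ldots, m$. Then we can find smooth functions $\{\xi_i\}_{i=0}^m$, such that
\[ 0 \leq \xi_i \leq 1, \quad \operatorname{supp} \xi_i \subseteq C_i \quad \forall\, i \in \{1, \ldots, m\}, \qquad 0 \leq \xi_0 \leq 1, \quad \operatorname{supp} \xi_0 \subseteq Z \]
and
\[ \sum_{i=0}^m \xi_i(z) = 1 \qquad \forall\, z \in \overline{Z}. \]
We set
\[ u_i \stackrel{df}{=} \xi_i u \qquad \forall\, i \in \{0, 1, \ldots, m\}. \]
Evidently each $u_i|_{Z \setminus C_i} = 0$ for $i \in \{1, \ldots, m\}$ and so using (2.44), we obtain
\[ \int_{\partial Z} |u|\, d\mu^{(N-1)} \leq c \int_Z \big( |u(y)| + \|Du(y)\|_{\mathbb{R}^N} \big)\, dy \qquad \forall\, u \in C^1(\overline{Z}). \tag{2.45} \]
If $p \in (1, +\infty)$, then we use (2.45) with $|u|^p$ replacing $|u|$. So finally
\[ \int_{\partial Z} |u(z)|^p\, d\mu^{(N-1)} \leq c\, \|u\|_{W^{1,p}(Z)}^p \qquad \forall\, p \in [1, +\infty). \]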
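Inequality (2.43) is stated without derivation. The following lines sketch where it comes from, assuming that $\eta$ is differentiable at the point $y' = (y_1, \ldots, y_{N-1})$ under consideration (true for almost all $y'$ by Rademacher's theorem); the notation $D\eta$ for the gradient of $\eta$ on $\mathbb{R}^{N-1}$ is introduced here for the computation.

```latex
% Near z, Z lies above the graph y_N = \eta(y'), so the outward unit normal
% points downward:
\[
n(y) = \frac{\bigl( D\eta(y'), \, -1 \bigr)}{\sqrt{1 + \|D\eta(y')\|_{\mathbb{R}^{N-1}}^2}} ,
\qquad y = \bigl( y', \eta(y') \bigr) \in \partial Z \cap C_r(z).
\]
% Taking the inner product with e_N = (0, \ldots, 0, 1):
\[
-(e_N, n(y))_{\mathbb{R}^N}
= \frac{1}{\sqrt{1 + \|D\eta(y')\|_{\mathbb{R}^{N-1}}^2}}
\ \ge\ \bigl( 1 + \operatorname{Lip}(\eta)^2 \bigr)^{-\frac{1}{2}} ,
\]
% since \|D\eta(y')\| \le \operatorname{Lip}(\eta) wherever \eta is differentiable.
```

This lower bound is what allows the boundary integral of $\xi_\varepsilon(u)$ to be dominated by the flux of $\xi_\varepsilon(u)\, e_N$ through $\partial Z \cap C_r(z)$, with $c = (1 + \operatorname{Lip}(\eta)^2)^{1/2}$.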

Using this Lemma, we can state and prove the trace theorem which gives meaning to the concept of boundary values for Sobolev spaces.

THEOREM 2.4.42 (Trace Theorem) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $p \in [1, +\infty)$, then there exists a unique continuous linear map
\[ \gamma_0 : W^{1,p}(Z) \to L^p\big(\partial Z, \mu^{(N-1)}\big), \]
such that
\[ \gamma_0(u) = u|_{\partial Z} \qquad \forall\, u \in C^1(\overline{Z}). \]

PROOF By virtue of Theorem 2.4.17, the embedding $C^1(\overline{Z}) \subseteq W^{1,p}(Z)$ is dense. From Lemma 2.4.41, we know that
\[ \big\| u|_{\partial Z} \big\|_{L^p(\partial Z, \mu^{(N-1)})}^p \leq c\, \|u\|_{W^{1,p}(Z)}^p \qquad \forall\, u \in C^1(\overline{Z}), \]
for some $c > 0$. So we can extend the restriction map $u \mapsto u|_{\partial Z}$ uniquely to a continuous linear map
\[ \gamma_0 : W^{1,p}(Z) \to L^p\big(\partial Z, \mu^{(N-1)}\big). \]

DEFINITION 2.4.43 For every $u \in W^{1,p}(Z)$, we say that $\gamma_0(u)$ is the trace of $u$ on $\partial Z$.

PROPOSITION 2.4.44 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $p \in [1, +\infty)$, then
\[ \int_Z D_i u(z)\, dz = \int_{\partial Z} \gamma_0(u) n_i\, d\mu^{(N-1)} \qquad \forall\, u \in W^{1,p}(Z),\ i \in \{1, \ldots, N\} \]
(as before $n = (n_i)_{i=1}^N$ is the outward unit normal on $\partial Z$).

PROOF Let $\{u_n\}_{n \geq 1} \subseteq C^1(\overline{Z})$ be such that
\[ \|u - u_n\|_{W^{1,p}(Z)} \to 0 \]
(see Theorem 2.4.17). From the divergence theorem of multivariable calculus (see Theorem A.4.1), we have
\[ \int_Z D_i u_n(z)\, dz = \int_{\partial Z} \gamma_0(u_n) n_i\, d\mu^{(N-1)} \qquad \forall\, n \geq 1. \tag{2.46} \]
From Theorem 2.4.42, we know that
\[ \gamma_0(u_n) \to \gamma_0(u) \qquad \text{in } L^p\big(\partial Z, \mu^{(N-1)}\big). \]
Also since $u_n \to u$ in $W^{1,p}(Z)$, we have $D_i u_n \to D_i u$ in $L^p(Z)$. So passing to the limit as $n \to +\infty$ in (2.46), we obtain
\[ \int_Z D_i u(z)\, dz = \int_{\partial Z} \gamma_0(u) n_i\, d\mu^{(N-1)}. \]

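This proposition leads to a Green's Formula for Sobolev functions. First an auxiliary result which provides still another version of the product rule.

LEMMA 2.4.45 If $Z \subseteq \mathbb{R}^N$ is an open set, $p \in (1, +\infty)$ and $\frac{1}{p} + \frac{1}{p'} = 1$, then for all $u \in W^{1,p}(Z)$, $v \in W^{1,p'}(Z)$, we have $uv \in W^{1,1}(Z)$ and
\[ D_i(uv) = u D_i v + v D_i u \qquad \forall\, i \in \{1, \ldots, N\}. \]

PROOF First assume that $u \in C^1(Z)$ and consider a sequence $\{v_n\}_{n \geq 1} \subseteq C^1(Z)$, such that
\[ v_n \to v \qquad \text{in } W^{1,p'}(Z') \quad \forall\, Z' \subset\subset Z \]
(see Remark 2.4.18). Let $\vartheta \in \mathcal{D}(Z)$ and consider a bounded, open set $Z' \subseteq \mathbb{R}^N$, such that $\operatorname{supp} \vartheta \subseteq Z' \subseteq \overline{Z'} \subseteq Z$. For every $i \in \{1, \ldots, N\}$, we have
\[ \int_Z u v_n D_i \vartheta\, dz = \int_{Z'} u v_n D_i \vartheta\, dz = -\int_{Z'} \big( u D_i v_n + v_n D_i u \big) \vartheta\, dz, \]
so
\[ \lim_{n \to +\infty} \int_{Z'} \big( u D_i v_n + v_n D_i u \big) \vartheta\, dz = \int_{Z'} \big( u D_i v + v D_i u \big) \vartheta\, dz. \]
Since
\[ v_n \to v \qquad \text{in } W^{1,p'}(Z'), \]
we have
\[ \lim_{n \to +\infty} \int_Z u v_n D_i \vartheta\, dz = \int_Z u v D_i \vartheta\, dz. \]
So in the limit as $n \to +\infty$, we obtain
\[ \int_Z u v D_i \vartheta\, dz = -\int_{Z'} \big( u D_i v + v D_i u \big) \vartheta\, dz \qquad \forall\, \vartheta \in \mathcal{D}(Z), \]
so
\[ D_i(uv) = u D_i v + v D_i u \in L^1(Z), \]
i.e., $uv \in W^{1,1}(Z)$. Now we remove the restriction that $u \in C^1(Z)$. If $u \in W^{1,p}(Z)$, we can find a sequence $\{u_n\}_{n \geq 1} \subseteq C^1(Z)$, such that $u_n \to u$ in $W^{1,p}(Z')$ for all open sets $Z' \subset\subset Z$. From the first part of the proof we know that
\[ D_i(u_n v) = u_n D_i v + v D_i u_n \qquad \forall\, n \geq 1. \]
Hence
\[ u_n v \to uv \qquad \text{in } L^1(Z) \]
and
\[ D_i(u_n v) \to u D_i v + v D_i u \qquad \text{in } L^1(Z). \]
Therefore $\{u_n v\}_{n \geq 1}$ is a Cauchy sequence in $W^{1,1}(Z)$ and so
\[ u_n v \to uv \qquad \text{in } W^{1,1}(Z) \]
and $D_i(uv) = u D_i v + v D_i u$.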

THEOREM 2.4.46 (Green Formula) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $p \in (1, +\infty)$ and $\frac{1}{p} + \frac{1}{p'} = 1$, then for all $u \in W^{1,p}(Z)$, $v \in W^{1,p'}(Z)$ and $i \in \{1, \ldots, N\}$, we have
\[ \int_Z u D_i v\, dz + \int_Z v D_i u\, dz = \int_{\partial Z} \gamma_0(uv) n_i\, d\mu^{(N-1)}. \]

PROOF From Lemma 2.4.45, we know that
\[ uv \in W^{1,1}(Z) \qquad \text{and} \qquad D_i(uv) = u D_i v + v D_i u. \]
An application of Proposition 2.4.44 (with $p = 1$) to $uv$ leads to Green's formula.
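As a sanity check of the Green formula, the following sketch evaluates both sides in the simplest one-dimensional situation. The choices $N = 1$, $Z = (0, 1)$, $u(z) = z$, $v(z) = z^2$ are assumptions made for this illustration; in one dimension $\partial Z = \{0, 1\}$, $\mu^{(0)}$ is the counting measure and the outward normal is $n(0) = -1$, $n(1) = +1$.

```latex
% Left-hand side, with Du = 1 and Dv = 2z:
\[
\int_0^1 u \, Dv \, dz + \int_0^1 v \, Du \, dz
= \int_0^1 2z^2 \, dz + \int_0^1 z^2 \, dz
= \frac{2}{3} + \frac{1}{3} = 1 .
\]
% Right-hand side, summing over the two boundary points:
\[
\int_{\partial Z} \gamma_0(uv) \, n \, d\mu^{(0)}
= (uv)(1) \cdot (+1) + (uv)(0) \cdot (-1)
= 1 - 0 = 1 .
\]
```

Both sides agree, as they must: for smooth functions the formula is just the fundamental theorem of calculus applied to the product $uv$.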

COROLLARY 2.4.47 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $p \in (1, +\infty)$ and $\frac{1}{p} + \frac{1}{p'} = 1$, then for all $u \in W^{1,p}(Z)$ and $h \in C^1(\mathbb{R}^N; \mathbb{R}^N)$, we have
\[ \int_Z u \operatorname{div} h\, dz + \int_Z \big( Du(z), h(z) \big)_{\mathbb{R}^N}\, dz = \int_{\partial Z} \gamma_0(u) \big( h, n \big)_{\mathbb{R}^N}\, d\mu^{(N-1)}. \]

Theorem 2.4.42 gives meaning to the quantity $u|_{\partial Z}$ for any $u \in W^{1,p}(Z)$, $p \in [1, +\infty)$. In fact we can do the same thing for $\frac{\partial^m u}{\partial n^m}$ for any $m \geq 1$. Here $\frac{\partial}{\partial n}$ denotes the outward normal derivative on $\partial Z$. Also we can give a more precise description of the range of the trace map. To do this we need to introduce Sobolev spaces of fractional order on manifolds.

DEFINITION 2.4.48 Let $M$ be a compact manifold in $\mathbb{R}^N$. For any $s \in (0, 1)$, $p \in [1, +\infty)$ and $u \in C^\infty(M)$, we define
\[ \|u\|_{W^{s,p}(M)} \stackrel{df}{=} \Big( \int_M |u(z)|^p\, dz + \iint_{M \times M} \frac{|u(z) - u(z')|^p}{\|z - z'\|_{\mathbb{R}^N}^{N-1+sp}}\, dz\, dz' \Big)^{\frac{1}{p}}. \]
This is a norm. The completion of $C^\infty(M)$ under this norm is denoted by $W^{s,p}(M)$. For any $s > 0$, we set $s = k + \eta$, with a positive integer $k$ and $\eta \in (0, 1)$ (if $s$ is not an integer). We define
\[ W^{s,p}(M) \stackrel{df}{=} \big\{ u \in W^{k,p}(M) : D^\alpha u \in W^{\eta,p}(M) \ \text{for all } |\alpha| = k \big\}. \]

REMARK 2.4.49 The definition makes sense also for any $Z \subseteq \mathbb{R}^N$ bounded and open. Also if $s = 0$, by convention $W^{0,p}(M) = L^p(M)$.

Now we can state the full version of the trace theorem.

THEOREM 2.4.50 (Trace Theorem) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $m \geq 1$ is a positive integer and $p \in [1, +\infty)$, then there exists a unique bounded, linear operator
\[ \gamma = (\gamma_k)_{k=0}^{m-1} : W^{m,p}(Z) \to L^p(\partial Z)^m, \]
such that
(a) if $u \in C^\infty(\overline{Z})$, then $\gamma_k(u) = \frac{\partial^k u}{\partial n^k}$ for $k = 0, 1, \ldots, m - 1$ (with $\frac{\partial^0 u}{\partial n^0} = u|_{\partial Z}$);
(b) $\operatorname{range} \gamma = \prod_{k=0}^{m-1} W^{m-k-\frac{1}{p}, p}(\partial Z)$;
(c) $\ker \gamma = W_0^{m,p}(Z)$.

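Using Theorem 2.4.46 and the continuity of the trace map $\gamma_1$, we obtain the following result.

THEOREM 2.4.51 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $u \in H^2(Z)$, $v \in H^1(Z)$, then
\[ \int_Z (\Delta u) v\, dz + \int_Z \big( Du, Dv \big)_{\mathbb{R}^N}\, dz = \int_{\partial Z} \frac{\partial u}{\partial n} v\, d\mu^{(N-1)}. \]

REMARK 2.4.52 The equality in the above theorem is sometimes called Second Green's Identity. We can have a nonlinear extension of this theorem (i.e., $p \neq 2$). For this purpose if $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $q \in (1, +\infty)$, we introduce the space:
\[ V^q(Z, \operatorname{div}) \stackrel{df}{=} \big\{ v \in L^q(Z; \mathbb{R}^N) : \operatorname{div} v \in L^q(Z) \big\}. \]
We furnish $V^q(Z, \operatorname{div})$ with the norm
\[ \|v\|_{V^q(Z, \operatorname{div})} \stackrel{df}{=} \Big[ \|v\|_{L^q(Z; \mathbb{R}^N)}^q + \|\operatorname{div} v\|_{L^q(Z)}^q \Big]^{\frac{1}{q}}. \]
It is easy to see that $V^q(Z, \operatorname{div})$ equipped with this norm is a separable, reflexive Banach space and the embedding $C^\infty(\overline{Z}; \mathbb{R}^N) \subseteq V^q(Z, \operatorname{div})$ is dense. The next theorem extends Theorem 2.4.46. For a proof of it we refer to Casas & Fernández (1989) and Kenmochi (1975).

THEOREM 2.4.53 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $p \in (1, +\infty)$ and $\frac{1}{p} + \frac{1}{p'} = 1$, then there exists a unique bounded, linear operator
\[ \gamma_n : V^{p'}(Z, \operatorname{div}) \to W^{-\frac{1}{p'}, p'}(\partial Z) = W^{\frac{1}{p'}, p}(\partial Z)^*, \]
such that
\[ \gamma_n(v) = (v, n)_{\mathbb{R}^N} \qquad \forall\, v \in C^\infty(\overline{Z}; \mathbb{R}^N) \]
and
\[ \int_Z u \operatorname{div} v\, dz + \int_Z \big( Du, v \big)_{\mathbb{R}^N}\, dz = \big\langle \gamma_n(v), \gamma_0(u) \big\rangle_{W^{\frac{1}{p'}, p}(\partial Z)} \qquad \forall\, v \in V^{p'}(Z, \operatorname{div}),\ u \in W^{1,p}(Z). \]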

If for $u \in W^{1,p}(Z)$, we set
\[ \Delta_p u \stackrel{df}{=} \operatorname{div} \big( \|Du\|_{\mathbb{R}^N}^{p-2} Du \big) \]
(the $p$-Laplacian), then from Theorem 2.4.53, we obtain the following nonlinear extension of Theorem 2.4.51.

THEOREM 2.4.54 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $p \in (1, +\infty)$, $\frac{1}{p} + \frac{1}{p'} = 1$,
\[ u \in W^{1,p}(Z) \qquad \text{and} \qquad \Delta_p u \in L^{p'}(Z), \]
then there exists a unique element of $W^{-\frac{1}{p'}, p'}(\partial Z)$, which by extension we denote by $\frac{\partial u}{\partial n_p}$, satisfying for all $v \in W^{1,p}(Z)$,
\[ \int_Z \big( \Delta_p u \big) v\, dz + \int_Z \|Du\|_{\mathbb{R}^N}^{p-2} \big( Du, Dv \big)_{\mathbb{R}^N}\, dz = \Big\langle \frac{\partial u}{\partial n_p}, \gamma_0(v) \Big\rangle_{W^{\frac{1}{p'}, p}(\partial Z)}. \]

If $u \in W_0^{1,p}(Z)$, then we can extend $u$ to $\hat{u} \in W^{1,p}(\mathbb{R}^N)$ by simply setting $u$ equal to zero on $\mathbb{R}^N \setminus Z$. It is not clear whether such an extension is possible for $u \in W^{1,p}(Z)$. The next theorem shows when this is possible. It is known as the extension theorem.

THEOREM 2.4.55 (Extension Theorem) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $\hat{Z} \supseteq \overline{Z}$ is open, then there exists an extension operator
\[ E : W^{1,p}(Z) \to W^{1,p}(\hat{Z}), \]
such that
\[ E(u)|_Z = u, \qquad \|E(u)\|_{L^p(\hat{Z})} \leq c\, \|u\|_{L^p(Z)} \qquad \forall\, u \in W^{1,p}(Z) \]
and
\[ \|E(u)\|_{W^{1,p}(\hat{Z})} \leq c\, \|u\|_{W^{1,p}(Z)} \qquad \forall\, u \in W^{1,p}(Z), \]
for some $c = c(Z, \hat{Z}) > 0$.

Next let us describe the dual of $W^{1,p}(Z)$ for an open set $Z \subseteq \mathbb{R}^N$ and $p \in [1, +\infty)$. By considering the map $L_1 : W^{1,p}(Z) \to L^p(Z)^{N+1}$, defined by
\[ L_1(u) \stackrel{df}{=} \big( u, Du \big) \qquad \forall\, u \in W^{1,p}(Z), \]
we see that $W^{1,p}(Z)$ is isometrically isomorphic to a subspace of $L^p(Z)^{N+1}$. So from the Riesz representation theorem (see Theorem A.3.24), we have

THEOREM 2.4.56 If $Z \subseteq \mathbb{R}^N$ is an open set, $p \in [1, +\infty)$, $\frac{1}{p} + \frac{1}{p'} = 1$ and $G \in W^{1,p}(Z)^*$, then
\[ G(u) = \int_Z h_0 u\, dz + \int_Z \big( h, Du \big)_{\mathbb{R}^N}\, dz \qquad \forall\, u \in W^{1,p}(Z), \]
for some $h_0 \in L^{p'}(Z)$ and $h \in L^{p'}(Z; \mathbb{R}^N)$.

The dual of $W^{1,p}(Z)$ is generally more than a space of distributions on $Z$. Clearly the restriction on $C_c^\infty(Z)$ of an element in $W^{1,p}(Z)^*$ belongs to $\mathcal{D}(Z)^*$. However, this restriction is not injective because $C_c^\infty(Z)$ is not dense in $W^{1,p}(Z)$. The problem is that the elements of $W^{1,p}(Z)$ can have nonzero boundary values (in the sense of trace). On the other hand $C_c^\infty(Z)$ is dense in $W_0^{1,p}(Z)$. So for this Sobolev space the restriction is injective and we can have a convenient description of $W_0^{1,p}(Z)^*$.

THEOREM 2.4.57 If $Z \subseteq \mathbb{R}^N$ is an open set and $p \in [1, +\infty)$, then
\[ W_0^{1,p}(Z)^* = \Big\{ G \in \mathcal{D}(Z)^* : G = -\sum_{k=1}^N D_k T_{g_k}, \ \text{for some } g = (g_k)_{k=1}^N \in L^{p'}(Z)^N \Big\}. \]
We set $W^{-1,p'}(Z) \stackrel{df}{=} W_0^{1,p}(Z)^*$, with $\frac{1}{p} + \frac{1}{p'} = 1$.

For Sobolev functions of one variable (i.e., $N = 1$), using Theorem 2.2.24, we have the following convenient characterization.

THEOREM 2.4.58 If $Z = T = [0, b]$ ($b < +\infty$) and $u \in W^{1,p}(T)$, $\frac{1}{p} + \frac{1}{p'} = 1$, then $u$ admits a representative which is absolutely continuous. Moreover, for $p = +\infty$, the representative is Lipschitz continuous.

REMARK 2.4.59 The result is also true if $T$ is unbounded. In this case the representative is locally absolutely continuous (see Definition A.2.15(b)). For $p = +\infty$, again the representative is Lipschitz continuous (see Remark 2.4.21).

2.5 Inequalities and Embedding Theorems

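The study of Sobolev spaces is useful because their elements possess special properties. Many of those properties are a consequence of the so-called embedding theorems. Among other things, the embedding theorems establish regularity properties for the Sobolev functions, in addition to the ones implied by their definition. Let us start with a negative observation: $H^1(Z)$ is in general not embedded in $L^\infty(Z)$.

EXAMPLE 2.5.1 To see this let
\[ Z \stackrel{df}{=} \Big\{ (x, y) \in \mathbb{R}^2 : x^2 + y^2 < \frac{1}{e^2} \Big\}, \]
$\eta \in \big( 0, \frac{1}{2} \big)$ and let $u(x, y) = |\ln r|^\eta$, where $r = \sqrt{x^2 + y^2}$ and $(x, y) \in Z$. We have
\[ \int_Z |u|^2\, dx\, dy = \int_0^{2\pi} \Big( \int_0^{\frac{1}{e}} |\ln r|^{2\eta} r\, dr \Big) d\vartheta \leq 2\pi \int_0^{\frac{1}{e}} |\ln r|^{2\eta} r\, dr < +\infty, \]
i.e., $u \in L^2(Z)$. Note that $\frac{\partial u}{\partial x}$ exists on $Z \setminus \{(0, 0)\}$ and we have
\[ \frac{\partial u}{\partial x} = -\eta |\ln r|^{\eta - 1} \frac{1}{r} \frac{x}{r} = -\eta |\ln r|^{\eta - 1} \frac{1}{r} \cos \vartheta, \]
so
\[ \int_Z \Big| \frac{\partial u}{\partial x} \Big|^2 dx\, dy \leq \int_0^{2\pi} \Big( \int_0^{\frac{1}{e}} \eta^2 |\ln r|^{2\eta - 2} \frac{1}{r}\, dr \Big) \cos^2 \vartheta\, d\vartheta \leq 2\pi \eta^2 \int_0^{\frac{1}{e}} |\ln r|^{2\eta - 2} \frac{1}{r}\, dr = 2\pi \eta^2 \Big[ -\frac{(-\ln r)^{2\eta - 1}}{2\eta - 1} \Big]_0^{\frac{1}{e}} = \frac{2\pi \eta^2}{1 - 2\eta} \]
and thus
\[ \frac{\partial u}{\partial x} \in L^2(Z) \qquad \text{and} \qquad \frac{\partial u}{\partial x} \in C^1\big( Z \setminus \{(0, 0)\} \big). \]
Let $\vartheta \in \mathcal{D}(Z)$. We have
\[ u(\cdot, y) \in C^1\Big( -\frac{1}{e}, \frac{1}{e} \Big) \qquad \forall\, y \in \Big( -\frac{1}{e}, \frac{1}{e} \Big) \setminus \{0\} \]
and so
\[ -\int_{-\frac{1}{e}}^{\frac{1}{e}} \frac{\partial u}{\partial x}(x, y) \vartheta(x, y)\, dx = \int_{-\frac{1}{e}}^{\frac{1}{e}} u(x, y) \frac{\partial \vartheta}{\partial x}(x, y)\, dx. \]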

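Integrating with respect to $y \in \big( -\frac{1}{e}, \frac{1}{e} \big) \setminus \{0\}$ and using Fubini's theorem, we obtain
\[ -\int_Z \frac{\partial u}{\partial x} \vartheta\, dx\, dy = \int_Z u \frac{\partial \vartheta}{\partial x}\, dx\, dy, \]
so
\[ D_x T_u = T_{\frac{\partial u}{\partial x}} \]
and similarly we show that
\[ D_y T_u = T_{\frac{\partial u}{\partial y}}. \]
This proves that $u \in H^1(Z)$. Since $u(x, y) \to +\infty$ as $(x, y) \to (0, 0)$, we have $u \notin L^\infty(Z)$. From this example we see that $H^1(Z) \not\subseteq L^\infty(Z)$.

Next we prove that if $f \in W^{1,p}(\mathbb{R}^N)$, then $f \in L^r(\mathbb{R}^N)$ for a certain range of $r \geq 1$ (including $r = p$).

DEFINITION 2.5.2 If $p \in [1, N)$, then $p^* \stackrel{df}{=} \frac{Np}{N - p}$ is called the critical Sobolev exponent corresponding to $p$.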
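A few numerical values may help to fix the size of the critical exponent. The sketch below only evaluates the formula $p^* = \frac{Np}{N - p}$ of Definition 2.5.2 for some sample pairs $(N, p)$ chosen here for illustration, and records an equivalent form of the definition.

```latex
% Equivalent form of the definition:
\[
\frac{1}{p^*} = \frac{N - p}{Np} = \frac{1}{p} - \frac{1}{N} ,
\qquad \text{so } p^* > p \text{ for every } p \in [1, N).
\]
% Sample values:
\[
N = 2,\ p = 1:\ \ p^* = 2; \qquad
N = 3,\ p = 1:\ \ p^* = \tfrac{3}{2}; \qquad
N = 3,\ p = 2:\ \ p^* = 6; \qquad
N = 4,\ p = 2:\ \ p^* = 4 .
\]
% Limiting behaviour for fixed N:
\[
p^* = \frac{Np}{N - p} \to +\infty \quad \text{as } p \nearrow N .
\]
```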

The next inequality, known as the “Sobolev-Nirenberg-Gagliardo inequality” (or simply as “Sobolev inequality”; see also Theorem 1.6.7), implies that the embedding $W^{1,p}(\mathbb{R}^N) \subseteq L^r(\mathbb{R}^N)$ is continuous for $r \in [p, p^*]$.

THEOREM 2.5.3 (Sobolev-Nirenberg-Gagliardo Inequality) If $p \in [1, N)$, then there exists $c > 0$, such that
\[ \|u\|_{p^*} \leq c\, \|Du\|_p \qquad \forall\, u \in W^{1,p}(\mathbb{R}^N). \]

PROOF By virtue of Theorem 2.4.13, we may assume that $u \in C_c^1(\mathbb{R}^N)$. For every $i \in \{1, \ldots, N\}$, we have
\[ u(x_1, \ldots, x_i, \ldots, x_N) = \int_{-\infty}^{x_i} \frac{\partial u}{\partial x_i}(x_1, \ldots, t_i, \ldots, x_N)\, dt_i, \]
thus
\[ |u(x)| \leq \int_{-\infty}^{+\infty} \big\| Du(x_1, \ldots, t_i, \ldots, x_N) \big\|_{\mathbb{R}^N}\, dt_i \qquad \forall\, i \in \{1, \ldots, N\} \]
and so
\[ |u(x)|^{\frac{N}{N-1}} \leq \prod_{i=1}^N \Big( \int_{-\infty}^{+\infty} \big\| Du(x_1, \ldots, t_i, \ldots, x_N) \big\|_{\mathbb{R}^N}\, dt_i \Big)^{\frac{1}{N-1}}. \]

We integrate with respect to x1 ∈ R. +∞ Z ∗ |u|1 dx1 −∞ +∞ +∞ N µ Z +∞ µZ ¶ N1−1 Z ¶ N1−1 Y ° ° ° ° ° ° ° ° 6 Du RN dt1 Du RN dti dx1 −∞ i=2

−∞

µ

+∞ Z

° ° °Du° N dt1 R

6

¶

1 N −1

µY N

−∞

+∞ Z +∞ Z

° ° °Du° N dx1 dti R

¶ N1−1 .

i=2−∞ −∞

−∞

Next we integrate with respect to x2 ∈ R. +∞ Z +∞ Z ∗ |u|1 dx1 dx2 −∞ −∞ +∞ Z +∞ +∞ Z +∞ µZ ¶ N1−1 µ Z ¶ N1−1 6 kDukRN dx1 dx2 kDukRN dt1 dt2 × −∞ −∞

−∞ −∞

+∞ Z +∞ Z +∞ N µ Z Y

×

¶ N1−1

kDukRN dx1 dx2 dti

i=3

.

−∞ −∞ −∞

We continue this way and we obtain Z |u|

1∗

+∞ +∞ ¶ N1−1 Z N µ Z Y N dx 6 ... kDukRN dx1 . . . dxN = kDuk1N −1 , i=1

RN

−∞

−∞

so kuk1∗ 6 kDuk1 .

(2.47)

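This proves the theorem for $p = 1$. Now suppose that $p \in (1, N)$. Set $h = |u|^{\eta}$ with $\eta > 1$ to be chosen in the course of the proof. Using (2.47) and Hölder's inequality (see Theorem A.2.27), we obtain
\[
\left( \int_{\mathbb{R}^N} |u|^{\frac{\eta N}{N-1}}\, dx \right)^{\frac{N-1}{N}}
\le \int_{\mathbb{R}^N} \big\|D|u|^{\eta}\big\|_{\mathbb{R}^N}\, dx
\le \eta \int_{\mathbb{R}^N} |u|^{\eta-1} \|Du\|_{\mathbb{R}^N}\, dx
\le \eta \left( \int_{\mathbb{R}^N} |u|^{(\eta-1)p'}\, dx \right)^{\frac{1}{p'}} \|Du\|_p,
\]
with $\frac{1}{p} + \frac{1}{p'} = 1$. Choose $\eta$ so that
\[
\frac{\eta N}{N-1} = (\eta - 1)\frac{p}{p-1}.
\]
Then
\[
\frac{p}{p-1} = \eta \left( \frac{p}{p-1} - \frac{N}{N-1} \right)
\quad \text{and} \quad
p = \eta\, \frac{N-p}{N-1},
\]
hence
\[
\eta = \frac{Np - p}{N - p} = p^*\, \frac{N-1}{N}.
\]
With this choice, $\frac{\eta N}{N-1} = (\eta-1)p' = p^*$. So we have
\[
\left( \int_{\mathbb{R}^N} |u|^{p^*}\, dx \right)^{\frac{N-1}{N}} \le \eta \left( \int_{\mathbb{R}^N} |u|^{p^*}\, dx \right)^{\frac{1}{p'}} \|Du\|_p,
\]
so
\[
\|u\|_{p^*} \le c\, \|Du\|_p,
\]
with $c = c(N, p) > 0$ (note that $\frac{N-1}{N} - \frac{1}{p'} = \frac{N-p}{Np} = \frac{1}{p^*} > 0$).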
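The following scaling computation is a standard heuristic, not part of the proof above, which suggests why $p^*$ is the only exponent $q$ for which an inequality $\|u\|_q \le c\,\|Du\|_p$ can hold on all of $\mathbb{R}^N$. The dilation $u_\lambda$ and the exponent $q$ are notation introduced here for illustration only.

```latex
% Fix u in C_c^1(R^N), u \neq 0, and set u_\lambda(x) = u(\lambda x), \lambda > 0.
\[
\|u_\lambda\|_q = \lambda^{-\frac{N}{q}} \|u\|_q,
\qquad
\|Du_\lambda\|_p = \lambda^{1-\frac{N}{p}} \|Du\|_p .
\]
% If \|u_\lambda\|_q \le c \|Du_\lambda\|_p for every \lambda > 0, then
\[
\|u\|_q \le c\, \lambda^{\,1-\frac{N}{p}+\frac{N}{q}}\, \|Du\|_p
\qquad \forall\, \lambda > 0 .
\]
% Letting \lambda \to 0 or \lambda \to +\infty forces u = 0 unless the exponent vanishes:
\[
1 - \frac{N}{p} + \frac{N}{q} = 0
\iff
\frac{1}{q} = \frac{1}{p} - \frac{1}{N}
\iff
q = \frac{Np}{N-p} = p^* .
\]
```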

The next inequality is known as “Poincaré's inequality” and is very useful in the study of Dirichlet elliptic equations.

THEOREM 2.5.4 (Poincaré's Inequality) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set and $p \in [1, +\infty)$, then there exists $c = c(Z, p) > 0$, such that
\[
\|u\|_p \le c\, \|Du\|_p \qquad \forall\, u \in W_0^{1,p}(Z).
\]

PROOF

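Since $Z \subseteq \mathbb{R}^N$ is bounded, we can find $\xi > 0$, such that $Z \subseteq (-\xi, \xi)^N$.

Let $\vartheta \in \mathcal{D}(Z)$ and extend it by zero on the whole “cube” $(-\xi, \xi)^N$. For every $z = (z_k)_{k=1}^N$, we have
\[
\vartheta(z) = \int_{-\xi}^{z_N} \frac{\partial \vartheta}{\partial z_N}(z_1, \dots, z_{N-1}, t)\, dt.
\]
By Hölder's inequality (see Theorem A.2.27), with $\frac{1}{p} + \frac{1}{p'} = 1$, we have
\[
|\vartheta(z_1, \dots, z_N)|^p \le (2\xi)^{\frac{p}{p'}} \int_{-\xi}^{\xi} \left| \frac{\partial \vartheta}{\partial z_N}(z_1, \dots, z_{N-1}, t) \right|^p dt,
\]
so
\[
\int_{(-\xi,\xi)^{N-1}} |\vartheta(z_1, \dots, z_N)|^p\, dz_1 \dots dz_{N-1}
\le (2\xi)^{\frac{p}{p'}} \int_{(-\xi,\xi)^{N-1}} \left( \int_{-\xi}^{\xi} \left| \frac{\partial \vartheta}{\partial z_N}(z_1, \dots, z_{N-1}, t) \right|^p dt \right) dz_1 \dots dz_{N-1},
\]
thus
\[
\int_Z |\vartheta(z)|^p\, dz \le (2\xi)^{\frac{p}{p'}+1} \int_Z \|D\vartheta\|^p_{\mathbb{R}^N}\, dz
\]
and since $\frac{p}{p'} + 1 = p$, we have
\[
\|\vartheta\|_p \le 2\xi\, \|D\vartheta\|_p.
\]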

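Since $\mathcal{D}(Z)$ is dense in $W_0^{1,p}(Z)$, we conclude that
\[
\|u\|_p \le c\, \|Du\|_p \qquad \forall\, u \in W_0^{1,p}(Z),
\]
for some $c = c(Z, p) > 0$.

REMARK 2.5.5 In fact the result is true if $Z \subseteq \mathbb{R}^N$ is unbounded but of finite width, namely it lies between two parallel hyperplanes (see Adams (1975, p. 158)). However, the result fails in truly unbounded domains $Z \subseteq \mathbb{R}^N$.

EXAMPLE 2.5.6 Let $Z = \mathbb{R}^N$ and $\vartheta \in \mathcal{D}(\mathbb{R}^N)$ be such that
\[
\vartheta|_{\overline{B}_1(0)} \equiv 1, \qquad \vartheta|_{B_2(0)^c} \equiv 0 \qquad \text{and} \qquad 0 \le \vartheta \le 1.
\]
Let $\vartheta_m(z) \stackrel{df}{=} \vartheta\big(\frac{z}{m}\big)$. Then if $N < p$, we have
\[
\|D\vartheta_m\|_p \longrightarrow 0 \quad \text{as } m \to +\infty,
\]
while
\[
\|\vartheta_m\|_p^p \ge \lambda^N\big(B_m(0)\big) \longrightarrow +\infty \quad \text{as } m \to +\infty.
\]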
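The two limits in Example 2.5.6 can be checked by a change of variables. The short computation below is a sketch added for illustration; it uses only the definition $\vartheta_m(z) = \vartheta(z/m)$ and the fact that $\vartheta_m \equiv 1$ on $B_m(0)$.

```latex
\[
D\vartheta_m(z) = \frac{1}{m}\,(D\vartheta)\!\left(\frac{z}{m}\right),
\qquad
\|D\vartheta_m\|_p^p
= m^{-p} \int_{\mathbb{R}^N} \Big\|(D\vartheta)\!\left(\tfrac{z}{m}\right)\Big\|_{\mathbb{R}^N}^p dz
= m^{N-p}\, \|D\vartheta\|_p^p \longrightarrow 0
\quad (N < p),
\]
\[
\|\vartheta_m\|_p^p \ge \int_{B_m(0)} 1\, dz = \lambda^N\big(B_m(0)\big) = m^N \lambda^N\big(B_1(0)\big) \longrightarrow +\infty .
\]
```

So no constant $c$ can satisfy $\|\vartheta_m\|_p \le c\,\|D\vartheta_m\|_p$ for all $m \ge 1$.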

An immediate useful consequence of Theorem 2.5.4, which is a basic tool in the study of Dirichlet elliptic problems, is the following result.


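COROLLARY 2.5.7 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set and $p \in [1, +\infty)$, then
\[
\|Du\|_p \stackrel{df}{=} \left( \int_Z \|Du(z)\|^p_{\mathbb{R}^N}\, dz \right)^{\frac{1}{p}} \qquad \forall\, u \in W_0^{1,p}(Z)
\]
is a norm on $W_0^{1,p}(Z)$ equivalent to the usual Sobolev norm $\|u\|_{W^{1,p}(Z)}$.

Let us use this opportunity to mention a few equivalent norms for the Sobolev spaces $W^{1,p}(Z)$.

PROPOSITION 2.5.8 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set and $p \in [1, +\infty)$, then the following three norms are equivalent to the original Sobolev norm $\|\cdot\|_{W^{1,p}(Z)}$:
\[
\|u\|_{(1)} \stackrel{df}{=} \left( \|Du\|_p^p + \left| \int_Z u\, dz \right|^p \right)^{\frac{1}{p}},
\]
\[
\|u\|_{(2)} \stackrel{df}{=} \left( \|Du\|_p^p + \left| \int_{\partial Z} u\, d\mu^{(N-1)} \right|^p \right)^{\frac{1}{p}},
\]
\[
\|u\|_{(3)} \stackrel{df}{=} \left( \|Du\|_p^p + \int_{\partial Z} |u|^p\, d\mu^{(N-1)} \right)^{\frac{1}{p}}.
\]

REMARK 2.5.9 If $N = 1$ and $Z = (0, b)$ ($b \in (0, +\infty)$), then
\[
\int_{\partial Z} u\, d\mu^{(N-1)} = u(0) + u(b).
\]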

Before passing to the so-called embedding theorems, let us mention one more inequality, known in the literature as “Morrey's inequality.” First a definition.

DEFINITION 2.5.10 Let $\eta \in (0, 1)$. A function $u\colon \mathbb{R}^N \longrightarrow \mathbb{R}$ is said to be Hölder continuous with exponent $\eta$, if
\[
\sup_{\substack{x, y \in \mathbb{R}^N \\ x \ne y}} \frac{|u(x) - u(y)|}{\|x - y\|^{\eta}_{\mathbb{R}^N}} < +\infty.
\]
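A simple function satisfying Definition 2.5.10 is the power of the Euclidean norm with the same exponent. The example below is an illustration added here, not taken from the text; it relies on the elementary inequality $a^{\eta} - b^{\eta} \le (a-b)^{\eta}$ for $a \ge b \ge 0$ and $\eta \in (0,1)$.

```latex
% Example: u(x) = \|x\|^{\eta} on R^N with \eta \in (0,1).
\[
\big|\, \|x\|^{\eta}_{\mathbb{R}^N} - \|y\|^{\eta}_{\mathbb{R}^N} \big|
\le \big|\, \|x\|_{\mathbb{R}^N} - \|y\|_{\mathbb{R}^N} \big|^{\eta}
\le \|x - y\|^{\eta}_{\mathbb{R}^N}
\qquad \forall\, x, y \in \mathbb{R}^N ,
\]
% so the supremum in the definition is at most 1 (and equals 1, taking y = 0).
```

This $u$ is not Lipschitz near the origin, so Hölder continuity with exponent $\eta < 1$ is strictly weaker than Lipschitz continuity.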

In the proof of Morrey’s inequality, we shall use the following lemma.


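LEMMA 2.5.11 For every $p \in [1, +\infty)$, there exists $c = c(N, p) > 0$, such that
\[
\int_{\overline{B}_r(x)} |u(y) - u(z)|^p\, dy \le c\, r^{N+p-1} \int_{\overline{B}_r(x)} \|Du(y)\|^p_{\mathbb{R}^N}\, \|y - z\|^{1-N}_{\mathbb{R}^N}\, dy,
\]
for all $r > 0$, $u \in C^1\big(\overline{B}_r(x)\big)$ and all $z \in \overline{B}_r(x)$.

PROOF If $y, z \in \overline{B}_r(x)$, then
\[
u(y) - u(z) = \int_0^1 \frac{d}{dt}\, u\big(z + t(y - z)\big)\, dt = \int_0^1 \big( Du(z + t(y - z)),\, y - z \big)_{\mathbb{R}^N}\, dt,
\]
thus
\[
|u(y) - u(z)|^p \le \|y - z\|^p_{\mathbb{R}^N} \int_0^1 \big\|Du\big(z + t(y - z)\big)\big\|^p_{\mathbb{R}^N}\, dt.
\]
So using Proposition 1.3.23(b) and (c), for $s > 0$, we have
\[
\begin{aligned}
\int_{\overline{B}_r(x) \cap \partial B_s(z)} |u(y) - u(z)|^p\, d\mu^{(N-1)}(y)
&\le s^p \int_0^1 \int_{\overline{B}_r(x) \cap \partial B_s(z)} \big\|Du\big(z + t(y - z)\big)\big\|^p_{\mathbb{R}^N}\, d\mu^{(N-1)}(y)\, dt \\
&\le s^p \int_0^1 \frac{1}{t^{N-1}} \int_{\overline{B}_r(x) \cap \partial B_{st}(z)} \|Du(w)\|^p_{\mathbb{R}^N}\, d\mu^{(N-1)}(w)\, dt \\
&= s^{N+p-1} \int_0^1 \int_{\overline{B}_r(x) \cap \partial B_{st}(z)} \|Du(w)\|^p_{\mathbb{R}^N}\, \|w - z\|^{1-N}_{\mathbb{R}^N}\, d\mu^{(N-1)}(w)\, dt \\
&= s^{N+p-2} \int_{\overline{B}_r(x) \cap B_s(z)} \|Du(w)\|^p_{\mathbb{R}^N}\, \|w - z\|^{1-N}_{\mathbb{R}^N}\, dw.
\end{aligned}
\]
Then from Example 1.5.27(a), integrating with respect to $s \in (0, 2r)$, we have
\[
\int_{\overline{B}_r(x)} |u(y) - u(z)|^p\, dy \le c\, r^{N+p-1} \int_{\overline{B}_r(x)} \|Du(y)\|^p_{\mathbb{R}^N}\, \|y - z\|^{1-N}_{\mathbb{R}^N}\, dy,
\]
with $c = c(N, p) > 0$.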

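THEOREM 2.5.12 (Morrey Inequality)
(a) For every $p \in (N, +\infty)$, there exists $c = c(N, p) > 0$, such that
\[
|u(y) - u(z)| \le c\, r \left( \frac{1}{\lambda^N(\overline{B}_r(x))} \int_{\overline{B}_r(x)} \|Du(w)\|^p_{\mathbb{R}^N}\, dw \right)^{\frac{1}{p}},
\]
for all $r > 0$, $u \in W^{1,p}\big(B_r(x)\big)$ and $\lambda^N$-almost all $y, z \in \overline{B}_r(x)$.
(b) If $p \in (N, +\infty)$ and $u \in W^{1,p}(\mathbb{R}^N)$, then the limit
\[
\lim_{r \searrow 0} u_{r,z} = u^*(z)
\]
exists for all $z \in \mathbb{R}^N$ and $u^*$ is Hölder continuous with exponent $1 - \frac{N}{p}$; recall that
\[
u_{r,z} = \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} u(x)\, d\lambda^N(x).
\]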

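PROOF (a) First suppose that $u \in C^1\big(\overline{B}_r(x)\big)$. Using Lemma 2.5.11 with $p = 1$, Hölder's inequality (see Theorem A.2.27) and recalling that $p' = \frac{p}{p-1}$, for all $y, z \in \overline{B}_r(x)$, we have
\[
\begin{aligned}
|u(y) - u(z)|
&\le \frac{1}{\lambda^N(\overline{B}_r(x))} \int_{\overline{B}_r(x)} \big( |u(y) - u(w)| + |u(w) - u(z)| \big)\, dw \\
&\le c \int_{\overline{B}_r(x)} \|Du(w)\|_{\mathbb{R}^N} \Big( \|y - w\|^{1-N}_{\mathbb{R}^N} + \|z - w\|^{1-N}_{\mathbb{R}^N} \Big)\, dw \\
&\le c \left( \int_{\overline{B}_r(x)} \Big( \|y - w\|^{1-N}_{\mathbb{R}^N} + \|z - w\|^{1-N}_{\mathbb{R}^N} \Big)^{p'} dw \right)^{\frac{1}{p'}} \left( \int_{\overline{B}_r(x)} \|Du\|^p_{\mathbb{R}^N}\, dw \right)^{\frac{1}{p}} \\
&\le c\, r^{(N - (N-1)p')\frac{1}{p'}} \left( \int_{\overline{B}_r(x)} \|Du(w)\|^p_{\mathbb{R}^N}\, dw \right)^{\frac{1}{p}} \\
&\le c\, r^{1 - \frac{N}{p}} \left( \int_{\overline{B}_r(x)} \|Du(w)\|^p_{\mathbb{R}^N}\, dw \right)^{\frac{1}{p}}.
\end{aligned}
\]
Invoking Theorem 2.4.13, we see that the same estimate holds for all $u \in W^{1,p}\big(B_r(x)\big)$ and for $\lambda^N$-almost all $y, z \in B_r(x)$.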

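(b) From part (a), for $\lambda^N$-almost all $y, z \in \overline{B}_r(x)$, with $r = \|y - z\|_{\mathbb{R}^N}$, we have
\[
|u(y) - u(z)| \le c\, \|y - z\|^{1 - \frac{N}{p}}_{\mathbb{R}^N} \left( \int_{\overline{B}_r(x)} \|Du(w)\|^p_{\mathbb{R}^N}\, dw \right)^{\frac{1}{p}}
\le c\, \|Du\|_{L^p(\mathbb{R}^N; \mathbb{R}^N)}\, \|y - z\|^{1 - \frac{N}{p}}_{\mathbb{R}^N},
\]
so $u$ is $\lambda^N$-almost everywhere equal to a Hölder continuous function $u^*$ with exponent $1 - \frac{N}{p}$. So
\[
\lim_{r \searrow 0} u_{r,z} = u^*(z) \qquad \forall\, z \in \mathbb{R}^N.
\]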

REMARK 2.5.13 If $p = +\infty$, then we know that the elements of $W^{1,\infty}(\mathbb{R}^N)$ are Lipschitz continuous functions (see Remark 2.4.21). From Theorem 2.5.12, it follows that, if $u \in W^{1,p}(\mathbb{R}^N)$, $N < p$, then
\[
\lim_{\|z\|_{\mathbb{R}^N} \to +\infty} u(z) = 0.
\]

Already from Theorem 2.5.3, we know that the embedding $W^{1,p}(\mathbb{R}^N) \subseteq L^r(\mathbb{R}^N)$ is continuous for all $r \in [p, p^*]$. Moreover, from Theorem 2.5.12, we know that if $p > N$, then the embedding $W^{1,p}(\mathbb{R}^N) \subseteq L^{\infty}(\mathbb{R}^N)$ is continuous. The next two theorems make these facts much more precise. The first theorem is known as the Sobolev embedding theorem, while the second is known as the Rellich-Kondrachov embedding theorem. First let us introduce a new kind of boundary regularity.

DEFINITION 2.5.14
(a) For given $z \in \mathbb{R}^N$, an open ball $B_1$ with center $z$ and an open ball $B_2$ not containing $z$, the set
\[
C_z \stackrel{df}{=} B_1 \cap \big\{ z + t(y - z) : y \in B_2,\ t > 0 \big\}
\]
is called a finite cone in $\mathbb{R}^N$.
(b) Let $Z \subseteq \mathbb{R}^N$ be an open set. We say that $Z$ has the cone property, if there exists a finite cone $C_0$ with vertex at the origin, such that for each $z \in Z$, there exists an orthogonal transformation $U_z \in \mathcal{L}\big(\mathbb{R}^N; \mathbb{R}^N\big)$, for which we have $z + U_z(C_0) \subseteq Z$.

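REMARK 2.5.15 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, then it has the cone property (see Adams (1975, p. 51)). Also it is clear that if $Z$ is $C^1$, then it has the cone property.

THEOREM 2.5.16 (Sobolev Embedding Theorem) If $Z \subseteq \mathbb{R}^N$ is an open set with the cone property, $p \in [1, +\infty)$, $k, m$ are integers, $k \ge 0$, $m \ge 1$, then
(a) if $mp < N$, then the embedding $W^{k+m,p}(Z) \subseteq W^{k,r}(Z)$ is continuous for all $r \in \big[p, \frac{Np}{N - mp}\big]$;
(b) if $mp = N$, then the embedding $W^{k+m,p}(Z) \subseteq W^{k,r}(Z)$ is continuous for all $r \in [p, +\infty)$;
(c) if $mp > N$, then the embedding $W^{k+m,p}(Z) \subseteq C_b^k(Z)$ is continuous with $C_b^k(Z)$ being the space of all functions $u \in C^k(Z)$, such that $D^{\alpha} u$ is bounded on $Z$ for all multiindices $\alpha \in \mathbb{N}^N$ with $|\alpha| \le k$.

When $Z \subseteq \mathbb{R}^N$ is bounded, the conclusions are stronger.

THEOREM 2.5.17 (Rellich-Kondrachov Embedding Theorem) If $Z \subseteq \mathbb{R}^N$ is an open, bounded set with the cone property, $p \in [1, +\infty)$, $k, m$ are integers, $k \ge 0$, $m \ge 1$, then
(a) if $mp < N$, then the embedding $W^{k+m,p}(Z) \subseteq W^{k,r}(Z)$ is compact for all $r \in \big[1, \frac{Np}{N - mp}\big)$;
(b) if $mp = N$, then the embedding $W^{k+m,p}(Z) \subseteq W^{k,r}(Z)$ is compact for all $r \in [1, +\infty)$;
(c) if $mp > N$, then the embedding $W^{k+m,p}(Z) \subseteq C^k\big(\overline{Z}\big)$ is compact, hence the embedding $W^{k+m,p}(Z) \subseteq W^{k,r}(Z)$ is compact for all $r \in [1, +\infty]$ and in particular if $k = 0$, we have that the embedding $W^{m,p}(Z) \subseteq L^r(Z)$ is compact for all $r \in [1, +\infty]$.

REMARK 2.5.18 Since $W_0^{1,p}(Z)$ is a closed subspace of $W^{1,p}(Z)$, we see that both embedding theorems (i.e., Theorems 2.5.16 and 2.5.17) are also valid for $W_0^{1,p}(Z)$. In fact in this case the cone property can be dropped.

Using the embedding theorems, we can prove a generalized form of Poincaré's inequality (see Theorem 2.5.4).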

THEOREM 2.5.19 (Generalized Poincaré Inequality) If $Z \subseteq \mathbb{R}^N$ is a bounded, open, connected set with the cone property, $p \in (1, +\infty)$ and $V$ is a closed linear subspace of $W^{1,p}(Z)$, such that the only constant function in $V$ is the zero function, then
\[
\|u\|_p \le c\, \|Du\|_p \qquad \forall\, u \in V,
\]
for some $c > 0$.

PROOF We proceed by contradiction. So suppose that the conclusion of the theorem is not true. We can find a sequence $\{u_n\}_{n \ge 1} \subseteq V$, such that
\[
\|u_n\|_p > n\, \|Du_n\|_p \qquad \forall\, n \ge 1.
\]
Let
\[
v_n \stackrel{df}{=} \frac{u_n}{\|u_n\|_p} \qquad \forall\, n \ge 1.
\]
Then
\[
\|v_n\|_p = 1 \qquad \text{and} \qquad \|Dv_n\|_p < \frac{1}{n} \qquad \forall\, n \ge 1.
\]
Hence the sequence $\{v_n\}_{n \ge 1} \subseteq W^{1,p}(Z)$ is bounded and so by passing to a subsequence if necessary, we may assume that
\[
v_n \stackrel{w}{\longrightarrow} v \quad \text{in } W^{1,p}(Z).
\]
Because of Theorem 2.5.17(a), we have that
\[
v_n \longrightarrow v \quad \text{in } L^p(Z).
\]
Hence $\|v\|_p = 1$. Also exploiting the weak lower semicontinuity of the norm in a Banach space, we have
\[
\|Dv\|_p \le \liminf_{n \to +\infty} \|Dv_n\|_p = 0,
\]
so, since $Z$ is connected and $V$ is weakly closed, $v \in V$ is constant, thus $v = 0$, which is a contradiction to the fact that $\|v\|_p = 1$.

COROLLARY 2.5.20 If $Z \subseteq \mathbb{R}^N$ is a bounded, open, connected set which is Lipschitz, $S_0 \subseteq \partial Z$ with $\mu^{(N-1)}(S_0) > 0$, $p \in (1, +\infty)$ and
\[
V \stackrel{df}{=} \big\{ u \in W^{1,p}(Z) : \gamma_0(u) = 0 \text{ on } S_0 \big\}
\]
($\gamma_0$ being the trace operator on $W^{1,p}(Z)$; see Theorem 2.4.42), then
\[
\|u\|_p \le c\, \|Du\|_p \qquad \forall\, u \in V,
\]
for some $c > 0$.

PROOF Since $\gamma_0$ is continuous and linear (see Theorem 2.4.42), $V$ is a closed, linear subspace of $W^{1,p}(Z)$. Suppose that the constant function $u \equiv c \in V$. We have $0 = \gamma_0(u) = \gamma_0(c) = c$ on $S_0$. This permits the application of Theorem 2.5.19.

This leads to another fundamental inequality, known as the “Poincaré-Wirtinger inequality.” It is an essential tool in the study of periodic ordinary differential equations and Neumann partial differential equations.

THEOREM 2.5.21 (Poincaré-Wirtinger Inequality) If $Z \subseteq \mathbb{R}^N$ is a bounded, open, connected set with the cone property and $p \in (1, +\infty)$, then
\[
\|u - \overline{u}\|_p \le c\, \|Du\|_p \qquad \forall\, u \in W^{1,p}(Z),
\]
for some $c > 0$, where $\overline{u} \stackrel{df}{=} \frac{1}{\lambda^N(Z)} \int_Z u(z)\, dz$.

PROOF Let
\[
V \stackrel{df}{=} \Big\{ u \in W^{1,p}(Z) : \int_Z u(z)\, dz = 0 \Big\}.
\]
Clearly $V$ is a closed, linear subspace of $W^{1,p}(Z)$. If the constant function $u \equiv c \in V$, then
\[
\int_Z u(z)\, dz = c\, \lambda^N(Z) = 0,
\]
hence $c = 0$. Also
\[
u - \overline{u} \in V \qquad \forall\, u \in W^{1,p}(Z).
\]
So an application of Theorem 2.5.19 finishes the proof.
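To see concretely what Theorem 2.5.21 asserts, and why the mean value must be subtracted, here is a one-dimensional example. The choices $N = 1$, $Z = (0,1)$, $p = 2$ and $u(z) = z$ are made here purely for illustration.

```latex
\[
\overline{u} = \int_0^1 z\, dz = \frac{1}{2},
\qquad
\|u - \overline{u}\|_2^2 = \int_0^1 \Big( z - \frac{1}{2} \Big)^2 dz = \frac{1}{12},
\qquad
\|Du\|_2 = 1 ,
\]
% so for this particular u the inequality reads
\[
\frac{1}{\sqrt{12}} \le c \cdot 1 .
\]
% Without the mean: for u \equiv 1 one has \|u\|_2 = 1 but \|Du\|_2 = 0,
% so \|u\|_p \le c \|Du\|_p cannot hold on all of W^{1,p}(Z).
```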


We already know that for the Sobolev functions of one variable (i.e., $N = 1$), the situation is better (see Theorem 2.4.58). In this case the embedding theorems take a sharper form.

THEOREM 2.5.22 Let $T \subseteq \mathbb{R}$ be an interval.
(a) If $T$ is open and $p \in [1, +\infty]$, then the embedding $W^{1,p}(T) \subseteq L^{\infty}(T)$ is continuous;
(b) If $T$ is open, bounded and $p \in (1, +\infty]$, then the embedding $W^{1,p}(T) \subseteq C\big(\overline{T}\big)$ is compact;
(c) If $T$ is open, bounded and $p \in [1, +\infty)$, then the embedding $W^{1,1}(T) \subseteq L^p(T)$ is compact.

PROOF (a) Using the extension theorem (see Theorem 2.4.55), we see that without any loss of generality, we may assume that $T = \mathbb{R}$. First let $u \in C_c^1(\mathbb{R})$ and for $p \in [1, +\infty)$ let
\[
\xi_p(r) \stackrel{df}{=} |r|^{p-1} r \qquad \forall\, r \in \mathbb{R}.
\]
We have that $\xi_p(u) \in C_c^1(\mathbb{R})$ and from the chain rule, we have
\[
\frac{d}{dt}\, \xi_p\big(u(t)\big) = \xi_p'\big(u(t)\big)\, \frac{d}{dt} u(t) = p\, |u(t)|^{p-1}\, \frac{d}{dt} u(t)
\]
(since $\xi_p'(r) = p|r|^{p-1}$). Hence for every $t \in \mathbb{R}$, we have
\[
\xi_p\big(u(t)\big) = \int_{-\infty}^{t} p\, |u(s)|^{p-1} u'(s)\, ds
\]
(since $u \in C_c^1(\mathbb{R})$), so by Hölder's inequality (see Theorem A.2.27), we have
\[
\big| \xi_p\big(u(t)\big) \big| = |u(t)|^p \le \int_{\mathbb{R}} p\, |u(s)|^{p-1} |u'(s)|\, ds \le p\, \|u\|_p^{p-1}\, \|u'\|_p.
\]
By Young's inequality (see Proposition A.4.5), we obtain
\[
\|u\|_{\infty} \le c\, \|u\|_{W^{1,p}(T)} \qquad \forall\, u \in C_c^1(\mathbb{R}), \tag{2.48}
\]
for some $c > 0$. Now let $u \in W^{1,p}(\mathbb{R})$, with $p \in [1, +\infty)$. Then we can find a sequence $\{u_n\}_{n \ge 1} \subseteq C_c^1(\mathbb{R})$, such that $u_n \longrightarrow u$ in $W^{1,p}(\mathbb{R})$.

From (2.48), we have
\[
\|u_n - u_m\|_{\infty} \le c\, \|u_n - u_m\|_{W^{1,p}(T)} \qquad \forall\, n, m \ge 1,
\]
hence the sequence $\{u_n\}_{n \ge 1} \subseteq L^{\infty}(\mathbb{R})$ is a Cauchy sequence. Therefore $u_n \longrightarrow u$ in $L^{\infty}(\mathbb{R})$ and we have proved the continuity of the embedding $W^{1,p}(\mathbb{R}) \subseteq L^{\infty}(\mathbb{R})$ for $p \in [1, +\infty)$. Of course the result is trivially true for $p = +\infty$.

(b) Let $\overline{B}_1(0)$ be the closed unit ball in $W^{1,p}(T)$, $p \in (1, +\infty]$. Let $u \in \overline{B}_1(0)$. We have
\[
|u(t) - u(s)| = \left| \int_s^t u'(\tau)\, d\tau \right| \le \|u'\|_p\, |t - s|^{\frac{1}{p'}} \le |t - s|^{\frac{1}{p'}} \qquad \forall\, t, s \in T,
\]
where $\frac{1}{p} + \frac{1}{p'} = 1$ (see Theorem 2.4.58). Then the Arzela-Ascoli theorem (see Theorem 2.3.2) implies that $\overline{B}_1(0)$ is relatively compact in $C\big(\overline{T}\big)$ and so we have proved the compactness of the embedding $W^{1,p}(T) \subseteq C\big(\overline{T}\big)$ for $p \in (1, +\infty]$ with $T \subseteq \mathbb{R}$ being a bounded, open interval.

(c) Let $\overline{B}_1(0)$ be the closed unit ball in $W^{1,1}(T)$. Let $E$ be an open subset of $T$, such that $\overline{E} \subseteq T$. Let $|h| < d_{\mathbb{R}}\big(E, T^c\big)$ and $u \in \overline{B}_1(0)$. From Remark 2.4.21, we know that
\[
\|\tau_h(u) - u\|_{L^1(E)} \le |h|\, \|u'\|_{L^1(T)} \le |h|,
\]
so
\[
\int_E |u(t + h) - u(t)|^p\, dt \le \big( 2\, \|u\|_{L^{\infty}(T)} \big)^{p-1} \int_E |u(t + h) - u(t)|\, dt \le c\, |h|,
\]
for some $c > 0$ with $p \in [1, +\infty)$. Thus
\[
\left( \int_E |u(t + h) - u(t)|^p\, dt \right)^{\frac{1}{p}} \le c^{\frac{1}{p}}\, |h|^{\frac{1}{p}}.
\]
Invoking Theorem 2.3.6, we infer that $\overline{B}_1(0)$ is relatively compact in $L^p(T)$, $p \in [1, +\infty)$. This proves the compactness of the embedding $W^{1,1}(T) \subseteq L^p(T)$ for $p \in [1, +\infty)$ with $T \subseteq \mathbb{R}$ being a bounded, open interval.
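The Young-inequality step in part (a) of the proof of Theorem 2.5.22 can be written out explicitly. The computation below is a sketch for $p \in (1, +\infty)$ and $u \in C_c^1(\mathbb{R})$; the constant obtained is one admissible choice and is not claimed to be the one the authors had in mind.

```latex
% Young: ab \le \frac{a^{p'}}{p'} + \frac{b^{p}}{p}, with a = \|u\|_p^{p-1}, b = \|u'\|_p,
% so that a^{p'} = \|u\|_p^{p} and 1/p' = (p-1)/p.
\[
|u(t)|^p \le p\, \|u\|_p^{p-1} \|u'\|_p
\le p \left( \frac{p-1}{p}\, \|u\|_p^{p} + \frac{1}{p}\, \|u'\|_p^{p} \right)
= (p-1)\, \|u\|_p^{p} + \|u'\|_p^{p} ,
\]
\[
\|u\|_{\infty} \le \max\{p-1,\, 1\}^{\frac{1}{p}} \big( \|u\|_p^{p} + \|u'\|_p^{p} \big)^{\frac{1}{p}}
\le \max\{p-1,\, 1\}^{\frac{1}{p}} \big( \|u\|_p + \|u'\|_p \big) .
\]
```

For $p = 1$ no Young step is needed, since the estimate already reads $|u(t)| \le \|u'\|_1$.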

REMARK 2.5.23 The embedding
\[
W^{1,1}(T) \subseteq C\big(\overline{T}\big)
\]
is continuous but never compact even if the open interval $T$ is bounded. If $T \subseteq \mathbb{R}$ is a bounded, open interval and $\{u_n\}_{n \ge 1} \subseteq W^{1,1}(T)$ is a bounded sequence, then we can extract a subsequence $\{u_{n_k}\}_{k \ge 1}$, such that $\{u_{n_k}(t)\}_{k \ge 1}$ converges for every $t \in T$ (see Denkowski, Migórski & Papageorgiou (2003a, p. 229)). Also if $T \subseteq \mathbb{R}$ is an unbounded, open interval and $p \in (1, +\infty]$, then the embedding $W^{1,p}(T) \subseteq L^{\infty}(T)$ is continuous, but not compact.

To the equivalent Sobolev norms mentioned in Proposition 2.5.8, we can add one more.

THEOREM 2.5.24
(a) If $T \subseteq \mathbb{R}$ is a bounded, open interval and $r \in [1, +\infty]$, then
\[
|||u|||_{W^{1,p}(T)} \stackrel{df}{=} \|u\|_r + \|u'\|_p
\]
is equivalent to the usual norm $\|\cdot\|_{W^{1,p}(T)}$ on $W^{1,p}(T)$;
(b) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set with the cone property and $p \in [1, +\infty)$, then
\[
|||u|||_{W^{1,p}(Z)} \stackrel{df}{=} \|u\|_r + \|Du\|_p
\]
is equivalent to the usual norm $\|\cdot\|_{W^{1,p}(Z)}$ on $W^{1,p}(Z)$ provided that $r \in [1, p^*]$ if $p < N$, $r \in [1, +\infty)$ if $p = N$ and $r \in [1, +\infty]$ if $p > N$.

Now let $Z \subseteq \mathbb{R}^N$ be a bounded, open set which is Lipschitz. Consider the Banach space $M(Z)$ of Radon measures on $Z$ with the total variation norm. Recall that $M(Z) = \big(C_0(Z)\big)^*$ (see Theorem 2.3.41). From Theorem 2.4.17 (see also Remark 2.5.18), we know that if $r > N$, then the embedding
\[
W_0^{1,r}(Z) \subseteq C_0\big(\overline{Z}\big) = \big\{ u \in C\big(\overline{Z}\big) : u|_{\partial Z} = 0 \big\}
\]
is continuous and dense. So by virtue of Lemma 2.2.27(a), we have that
\[
M(Z) \subseteq W^{-1,r'}(Z),
\]
with $\frac{1}{r} + \frac{1}{r'} = 1$ (see Theorem 2.4.57). This observation is crucial in proving the next compactness result.

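THEOREM 2.5.25 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set with the cone property and $\{\mu_n\}_{n \ge 1} \subseteq M(Z)$ is a bounded sequence, then the sequence $\{\mu_n\}_{n \ge 1}$ is relatively compact in $W^{-1,r}(Z)$ for every $r \in [1, 1^*)$, where $1^* \stackrel{df}{=} \frac{N}{N-1}$.

PROOF By virtue of Theorem 2.3.46, we can find a subsequence $\{\mu_{n_k}\}_{k \ge 1}$ of the sequence $\{\mu_n\}_{n \ge 1}$ and $\mu \in M(Z)$, such that
\[
\mu_{n_k} \stackrel{w}{\longrightarrow} \mu \quad \text{in } M(Z).
\]
Let $r' > 0$ be the conjugate exponent of $r \in [1, 1^*)$, i.e., $\frac{1}{r} + \frac{1}{r'} = 1$. Then $r' > N$ and so by Theorem 2.5.17(c), the embedding $W_0^{1,r'}(Z) \subseteq C_0\big(\overline{Z}\big)$ is compact. Let $\overline{B}_1(0)$ be the closed unit ball in $W_0^{1,r'}(Z)$. We see that $\overline{B}_1(0)$ is relatively compact in $C_0\big(\overline{Z}\big)$. So given $\varepsilon > 0$ we can find a finite sequence $\{u_i\}_{i=1}^{m(\varepsilon)}$, such that for every $u \in \overline{B}_1(0)$, we have
\[
\min_{1 \le i \le m(\varepsilon)} \|u - u_i\|_{C_0(\overline{Z})} < \varepsilon. \tag{2.49}
\]
So, if $u \in \overline{B}_1(0)$, for some $i \in \{1, \dots, m(\varepsilon)\}$, we have
\[
\begin{aligned}
\left| \int_Z u\, d\mu_{n_k} - \int_Z u\, d\mu \right|
&\le \left| \int_Z (u - u_i)\, d\mu_{n_k} \right| + \left| \int_Z u_i\, d\mu_{n_k} - \int_Z u_i\, d\mu \right| + \left| \int_Z (u_i - u)\, d\mu \right| \\
&\le 2\varepsilon \sup_{k \ge 1} |\mu_{n_k}|(Z) + \left| \int_Z u_i\, d\mu_{n_k} - \int_Z u_i\, d\mu \right|.
\end{aligned}
\]
Since $\mu_{n_k} \stackrel{w}{\longrightarrow} \mu$ as $k \to +\infty$, we have that
\[
\left| \int_Z u_i\, d\mu_{n_k} - \int_Z u_i\, d\mu \right| \longrightarrow 0 \quad \text{as } k \to +\infty.
\]
Therefore, we conclude that
\[
\lim_{k \to +\infty}\ \sup_{u \in \overline{B}_1(0)} \left| \int_Z u\, d\mu_{n_k} - \int_Z u\, d\mu \right| = 0,
\]
so
\[
\mu_{n_k} \longrightarrow \mu \quad \text{in } W^{-1,r}(Z).
\]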

Next we sharpen both the Rellich-Kondrachov theorem (see Theorem 2.5.17) and Egorov's theorem (see Theorem A.2.10). We show that a sequence bounded in $W^{1,r}(Z)$ ($r \in (1, N)$) has a subsequence converging uniformly outside a very small set. The set is not only small in the Lebesgue measure (as Egorov's theorem asserts; see Theorem A.2.10), but it is also small in $p$-capacity, for $p \in [1, r)$. First we introduce a notion, which is useful in the study of the pointwise properties of Sobolev functions.

DEFINITION 2.5.26 Suppose that $u \in L^1_{loc}\big(\mathbb{R}^N\big)$. Then
\[
u^*(z) \stackrel{df}{=}
\begin{cases}
\displaystyle \lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} u(y)\, dy & \text{if the limit exists,} \\[2mm]
0 & \text{otherwise}
\end{cases}
\]
is the precise representative of $u$.

REMARK 2.5.27 If $u, v \in L^1_{loc}\big(\mathbb{R}^N\big)$ and
\[
u(z) = v(z) \quad \text{for } \lambda^N\text{-almost all } z \in \mathbb{R}^N,
\]
then
\[
u^*(z) = v^*(z) \qquad \forall\, z \in \mathbb{R}^N.
\]
Moreover, in view of the Lebesgue differentiation theorem (see Theorem 1.4.6), the limit in the definition of $u^*$ (see Definition 2.5.26) exists for $\lambda^N$-almost all $z \in \mathbb{R}^N$.

In the next theorem, we identify each function in the Sobolev space $W^{1,r}(Z)$ with its precise representative.

THEOREM 2.5.28 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $r \in (1, N)$ and $\{u_n\}_{n \ge 1} \subseteq W^{1,r}(Z)$ is bounded, then there exist a subsequence $\{u_{n_k}\}_{k \ge 1}$ of $\{u_n\}_{n \ge 1}$ and $u \in W^{1,r}(Z)$, such that for every $p \in [1, r)$ and every $\delta > 0$, there exists a relatively closed set $A_{\delta} \subseteq Z$, such that
\[
\mathrm{cap}_p\big(Z \setminus A_{\delta}\big) \le \delta \qquad \text{and} \qquad u_{n_k} \longrightarrow u \quad \text{uniformly on } A_{\delta}.
\]

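PROOF We may assume that $u_n \in W_0^{1,r}(Z)$ for all $n \ge 1$. Indeed if this is not the case we choose a bounded, open set $U \supseteq \overline{Z} \supseteq Z$ and a cut-off function $\varphi \in C_c^{\infty}\big(\mathbb{R}^N\big)$, such that
\[
\varphi|_Z = 1 \qquad \text{and} \qquad \varphi|_{U^c} = 0.
\]
If $u \in W^{1,r}(Z)$ and $E(u) \in W^{1,r}(U)$ is the extension (see Theorem 2.4.55), then $\varphi E(u) \in W_0^{1,r}(U)$. Because by hypothesis the sequence $\{u_n\}_{n \ge 1} \subseteq W_0^{1,r}(Z)$ is bounded, by passing to a subsequence if necessary, we may assume that
\[
u_n \stackrel{w}{\longrightarrow} u \quad \text{in } W_0^{1,r}(Z), \qquad u_n \longrightarrow u \quad \text{in } L^r(Z),
\]
with $u \in W_0^{1,r}(Z)$ (see Theorem 2.5.17(a)). Fix $\delta, \varepsilon > 0$ and let
\[
C_n^{\varepsilon} \stackrel{df}{=} \big\{ z \in Z : |u_n(z) - u(z)| \ge \varepsilon \big\}
\qquad \text{and} \qquad
h_n^{\varepsilon} \stackrel{df}{=} \frac{2}{\varepsilon} \Big( |u_n - u| - \frac{\varepsilon}{2} \Big)^{+}.
\]
Note that $h_n^{\varepsilon} \in W_0^{1,r}(Z)$ (see Proposition 2.4.33) and $h_n^{\varepsilon} \ge 1$ on $C_n^{\varepsilon}$. From Definition 1.6.1(d), by Hölder's and Poincaré's inequalities (see Theorem A.2.27), for $p \in [1, r)$, we have
\[
\begin{aligned}
\mathrm{cap}_p\big(C_n^{\varepsilon}\big) \le \big\|Dh_n^{\varepsilon}\big\|_p^p
&\le \Big( \frac{2}{\varepsilon} \Big)^p \Big( \lambda^N\Big( \Big\{ |u_n - u| > \frac{\varepsilon}{2} \Big\} \Big) \Big)^{1 - \frac{p}{r}} \big( \|Du_n\|_r + \|Du\|_r \big)^{p} \\
&\le c(\varepsilon)\, \|u_n - u\|^{r-p}_{L^r(Z)}.
\end{aligned}
\]
We choose a subsequence $\{u_{n_k}\}_{k \ge 1}$ of $\{u_n\}_{n \ge 1}$, such that
\[
\sum_{k=1}^{\infty} \|u_{n_k} - u\|_r^{r-p} < +\infty.
\]
We set
\[
D_m^i \stackrel{df}{=} \bigcup_{k=m}^{\infty} C_{n_k}^{\frac{1}{i}}.
\]
Then since $\mathrm{cap}_p$ is an outer measure on $\mathbb{R}^N$ (see Theorem 1.6.10), we have
\[
\mathrm{cap}_p(D_m^i) \le \sum_{k=m}^{\infty} \mathrm{cap}_p\Big( C_{n_k}^{\frac{1}{i}} \Big) \le c\Big( \frac{1}{i} \Big) \sum_{k=m}^{\infty} \|u_{n_k} - u\|_r^{r-p} < \frac{\delta}{2^{i+1}},
\]
provided that $m = m(i) \ge 1$ is large enough.

From Theorem 1.6.11(a), we know that we can find an open set $V_m^i \supseteq D_m^i$ with
\[
\mathrm{cap}_p(V_m^i) < \frac{\delta}{2^i}.
\]
Set
\[
A_{\delta} \stackrel{df}{=} Z \setminus \bigcup_{i=1}^{\infty} V_{m(i)}^i.
\]
Then $A_{\delta} \subseteq Z$ is relatively closed,
\[
\mathrm{cap}_p(Z \setminus A_{\delta}) \le \sum_{i=1}^{\infty} \mathrm{cap}_p\big( V_{m(i)}^i \big) \le \delta
\]
and
\[
u_{n_k} \longrightarrow u \quad \text{uniformly on } A_{\delta}.
\]

REMARK 2.5.29 The result in general fails if $p = r$.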

Now we shall discuss how the continuous embedding
\[
W^{1,p}(Z) \subseteq L^{p^*}(Z)
\]
(with $Z \subseteq \mathbb{R}^N$ and $p \in [1, N)$) fails to be compact (see Theorem 2.5.17). It has to do with the so-called “concentration phenomena.” To get an idea of such situations, recall Proposition 2.3.38. There we saw that if
\[
u_n \stackrel{w}{\longrightarrow} u \quad \text{in } L^1(Z)
\]
and $u_n$ oscillates rapidly around its weak limit, then the sequence $\{u_n\}_{n \ge 1}$ cannot converge strongly in $L^1(Z)$. Moreover, if $p \in (1, +\infty)$,
\[
u_n \stackrel{w}{\longrightarrow} u \quad \text{in } L^p(Z)
\]
and we also know that
\[
u_n(z) \longrightarrow u(z) \quad \text{for a.a. } z \in Z,
\]
still we cannot in general deduce strong convergence in $L^p(Z)$. The problem is that the mass of $|u_n - u|^p$ may coalesce onto a set of zero Lebesgue measure. This is the problem of “concentration.” For this reason, in contrast to the case $p = 1$, for $p > 1$ the best constant in the Sobolev inequality (see Theorem 2.5.3) is never achieved when $Z \subseteq \mathbb{R}^N$, $Z \ne \mathbb{R}^N$ is an open set which is Lipschitz and in particular is never achieved on a bounded Lipschitz domain. For this reason our analysis will be for $Z = \mathbb{R}^N$. So let $N > 1$ and let $p \in [1, N)$ and consider
\[
D^{1,p}(\mathbb{R}^N) \stackrel{df}{=} \big\{ u \in L^{p^*}(\mathbb{R}^N) : Du \in L^p\big(\mathbb{R}^N; \mathbb{R}^N\big) \big\},
\]
(where $p^* \stackrel{df}{=} \frac{Np}{N-p}$) furnished with the norm
\[
\|u\|_{D^{1,p}(\mathbb{R}^N)} = \|Du\|_p.
\]
That this is a norm on $D^{1,p}(\mathbb{R}^N)$ follows from the Sobolev-Nirenberg-Gagliardo inequality (see Theorem 2.5.3; normed this way $D^{1,p}(\mathbb{R}^N)$ is a separable Banach space, Hilbert space if $p = 2$). The best constant $c > 0$ in that inequality is given by
\[
(c^{-1})^p = S = \inf_{\substack{u \in D^{1,p}(\mathbb{R}^N) \\ u \ne 0}} \frac{\|Du\|_p^p}{\|u\|_{p^*}^p} = \inf_{\substack{u \in D^{1,p}(\mathbb{R}^N) \\ \|u\|_{p^*} = 1}} \|Du\|_p^p.
\]
The question is whether this infimum is realized by an element in $D^{1,p}(\mathbb{R}^N)$. So consider a minimizing sequence $\{u_n\}_{n \ge 1} \subseteq D^{1,p}(\mathbb{R}^N)$, i.e.,
\[
\|Du_n\|_p^p \longrightarrow S, \quad \text{with } \|u_n\|_{p^*} = 1 \qquad \forall\, n \ge 1.
\]
By passing to a subsequence if necessary, we may assume that
\[
u_n \stackrel{w}{\longrightarrow} u \quad \text{in } D^{1,p}(\mathbb{R}^N)
\]
and so
\[
\|Du\|_p^p \le \liminf_{n \to +\infty} \|Du_n\|_p^p.
\]
This $u$ is a minimizer provided that $\|u\|_{p^*} = 1$. But since
\[
u_n \stackrel{w}{\longrightarrow} u \quad \text{in } L^{p^*}(\mathbb{R}^N),
\]
we only know that $\|u\|_{p^*} \le 1$. Note that if $v \in D^{1,p}(\mathbb{R}^N)$, $y \in \mathbb{R}^N$ and $\lambda > 0$, then the rescaled function
\[
v^{y,\lambda}(z) = \lambda^{\frac{N-p}{p}}\, v(\lambda z + y)
\]
satisfies
\[
\big\|Dv^{y,\lambda}\big\|_p = \|Dv\|_p \qquad \text{and} \qquad \big\|v^{y,\lambda}\big\|_{p^*} = \|v\|_{p^*}.
\]
So the problem is invariant under translations and dilations. In order to avoid noncompactness of the minimizing sequence (hence achieve $\|u\|_{p^*} = 1$) we need the following result, known as the Concentration-Compactness Lemma. In what follows we shall regard $L^1(\mathbb{R}^N)$ in a natural way as a subset of $M(\mathbb{R}^N)$, by associating to $u \in L^1(\mathbb{R}^N)$ the measure
\[
\mu(A) = \int_A u(z)\, dz,
\]
i.e., $d\mu = u\, dz$.
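The invariance of both norms under the rescaling $v \mapsto v^{y,\lambda}$ stated above follows from the change of variables $w = \lambda z + y$. The verification below is a short sketch added for the reader's convenience.

```latex
\[
Dv^{y,\lambda}(z) = \lambda^{\frac{N-p}{p}+1}\, (Dv)(\lambda z + y),
\qquad
\big\|Dv^{y,\lambda}\big\|_p^p
= \lambda^{N-p+p} \int_{\mathbb{R}^N} \|Dv(\lambda z + y)\|^p_{\mathbb{R}^N}\, dz
= \lambda^{N} \lambda^{-N}\, \|Dv\|_p^p ,
\]
\[
\big\|v^{y,\lambda}\big\|_{p^*}^{p^*}
= \lambda^{\frac{(N-p)p^*}{p}} \int_{\mathbb{R}^N} |v(\lambda z + y)|^{p^*}\, dz
= \lambda^{N} \lambda^{-N}\, \|v\|_{p^*}^{p^*},
\qquad \text{since } \frac{(N-p)\,p^*}{p} = N .
\]
```

In particular, a minimizing sequence can be translated or dilated without changing either norm, which is the source of the possible loss of compactness.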

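THEOREM 2.5.30 If $p \in [1, N)$,
\[
u_n \stackrel{w}{\longrightarrow} u \quad \text{in } D^{1,p}(\mathbb{R}^N),
\]
\[
\widehat{\mu}_n = \|Du_n\|^p_{\mathbb{R}^N} \stackrel{w}{\longrightarrow} \mu \qquad \text{and} \qquad \widehat{\nu}_n = |u_n|^{p^*} \stackrel{w}{\longrightarrow} \nu \quad \text{in } M(\mathbb{R}^N),
\]
with $\mu, \nu \in M(\mathbb{R}^N)$, $\mu, \nu \ge 0$, then
(a) there exists an at most countable index set $I$, a family $\{z_i\}_{i \in I}$ of distinct points in $\mathbb{R}^N$ and nonnegative numbers $\{\mu_i, \nu_i\}_{i \in I}$, such that
\[
\mu \ge \|Du\|^p_{\mathbb{R}^N} + \sum_{i \in I} \mu_i \delta_{z_i}
\qquad \text{and} \qquad
\nu = |u|^{p^*} + \sum_{i \in I} \nu_i \delta_{z_i};
\]
(b) for all $i \in I$,
\[
S\, \nu_i^{\frac{p}{p^*}} \le \mu_i
\]
and in particular
\[
\sum_{i \in I} \nu_i^{\frac{p}{p^*}} < +\infty.
\]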

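PROOF First suppose that $u = 0$. Let $\vartheta \in C_c^{\infty}\big(\mathbb{R}^N\big)$. We have
\[
\int_{\mathbb{R}^N} |\vartheta u_n|^{p^*}\, dz \le S^{-\frac{p^*}{p}} \left( \int_{\mathbb{R}^N} \|D(\vartheta u_n)\|^p_{\mathbb{R}^N}\, dz \right)^{\frac{p^*}{p}} \qquad \forall\, n \ge 1. \tag{2.50}
\]
Since
\[
|u_n|^{p^*} \stackrel{w}{\longrightarrow} \nu \quad \text{in } M(\mathbb{R}^N),
\]
we have that
\[
\int_{\mathbb{R}^N} |\vartheta u_n|^{p^*}\, dz \longrightarrow \int_{\mathbb{R}^N} |\vartheta|^{p^*}\, d\nu.
\]
Also, using the facts that $u_n \longrightarrow u$ in $L^p_{loc}\big(\mathbb{R}^N\big)$ (see Theorem 2.5.17(a)) and
\[
\|Du_n\|^p_{\mathbb{R}^N} \stackrel{w}{\longrightarrow} \mu \quad \text{in } M(\mathbb{R}^N), \tag{2.51}
\]
we have that
\[
\liminf_{n \to +\infty} \left( \int_{\mathbb{R}^N} \|D(\vartheta u_n)\|^p_{\mathbb{R}^N}\, dz \right)^{\frac{p^*}{p}}
= \liminf_{n \to +\infty} \left( \int_{\mathbb{R}^N} |\vartheta|^p\, \|Du_n\|^p_{\mathbb{R}^N}\, dz \right)^{\frac{p^*}{p}}
= \left( \int_{\mathbb{R}^N} |\vartheta|^p\, d\mu \right)^{\frac{p^*}{p}}. \tag{2.52}
\]
From (2.50), (2.51) and (2.52), we infer that in the limit as $n \to +\infty$, we have
\[
\left( \int_{\mathbb{R}^N} |\vartheta|^{p^*}\, d\nu \right)^{\frac{1}{p^*}} \le S^{-\frac{1}{p}} \left( \int_{\mathbb{R}^N} |\vartheta|^p\, d\mu \right)^{\frac{1}{p}} \qquad \forall\, \vartheta \in C_c^{\infty}\big(\mathbb{R}^N\big). \tag{2.53}
\]
From (2.53), it follows that for all compact sets $K \subseteq \mathbb{R}^N$, we have
\[
\nu(K)^{\frac{1}{p^*}} \le S^{-\frac{1}{p}} \mu(K)^{\frac{1}{p}}. \tag{2.54}
\]
Because the measures are Radon, from (2.54), we deduce that
\[
\nu(A)^{\frac{1}{p^*}} \le S^{-\frac{1}{p}} \mu(A)^{\frac{1}{p}} \quad \text{for all Borel sets } A \subseteq \mathbb{R}^N. \tag{2.55}
\]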

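From Saks Lemma (see Theorem A.2.13), we know that
\[
\mu = \mu_0 + \mu_a,
\]
with $\mu_0$ a nonatomic measure, $\mu_0 \ge 0$, $\mu_a$ purely atomic and
\[
\mu_a = \sum_{i \in I} \mu_i \delta_{z_i},
\]
where $I$ is a countable index set, $\{\mu_i\}_{i \in I} \subseteq \mathbb{R}_+ \setminus \{0\}$ and $\{z_i\}_{i \in I} \subseteq \mathbb{R}^N$. Because of (2.55), we see that $\nu \ll \mu$ and so from the Radon-Nikodym theorem (see Theorem A.2.24), we have that
\[
\nu(A) = \int_A \frac{d\nu}{d\mu}\, d\mu \quad \text{for all Borel sets } A \subseteq \mathbb{R}^N, \tag{2.56}
\]
where $\frac{d\nu}{d\mu} \in L^1(\mathbb{R}^N; \mu)$ is the Radon-Nikodym derivative (see Remark A.2.25). So
\[
\frac{d\nu}{d\mu}(z) = \lim_{r \searrow 0} \frac{\nu(\overline{B}_r(z))}{\mu(\overline{B}_r(z))} \quad \text{for } \mu\text{-a.a. } z \in \mathbb{R}^N. \tag{2.57}
\]
From (2.55), we see that
\[
\frac{\nu(\overline{B}_r(z))}{\mu(\overline{B}_r(z))} \le S^{-\frac{p^*}{p}}\, \mu\big(\overline{B}_r(z)\big)^{\frac{p}{N-p}}, \tag{2.58}
\]
provided $\mu\big(\overline{B}_r(z)\big) \ne 0$. From (2.57) and (2.58), it follows that
\[
\frac{d\nu}{d\mu}(z) = 0 \quad \text{for a.a. } z \in \operatorname{supp} \mu_0 = \mathbb{R}^N \setminus \{z_i\}_{i \in I}. \tag{2.59}
\]
Set
\[
\nu_i = \frac{d\nu}{d\mu}(z_i)\, \mu_i.
\]
From (2.56), (2.57), (2.58) and (2.59), we see that the theorem holds when $u = 0$.

Now suppose that $u \ne 0$. Set
\[
w_n \stackrel{df}{=} u_n - u.
\]
Then the previous calculations apply to $\{w_n\}_{n \ge 1}$. Moreover, by virtue of Proposition 2.3.49, we have
\[
\|Dw_n\|_p^p = \|Du_n\|_p^p - \|Du\|_p^p + \varepsilon_n \quad \text{with } \varepsilon_n \searrow 0,
\]
so
\[
\|Dw_n\|^p_{\mathbb{R}^N} \stackrel{w}{\longrightarrow} \mu - \|Du\|^p_{\mathbb{R}^N} = \widehat{\mu} \ge 0 \quad \text{in } M(\mathbb{R}^N).
\]
Similarly, we have
\[
|w_n|^{p^*} \longrightarrow \nu - |u|^{p^*} = \widehat{\nu} \quad \text{in } M(\mathbb{R}^N).
\]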

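So, we are back to the case $u = 0$, with $u_n$ replaced by $w_n$, $\mu$ replaced by $\widehat{\mu}$ and $\nu$ replaced by $\widehat{\nu}$.

COROLLARY 2.5.31 If $p \in [1, N)$,
\[
u_n \stackrel{w}{\longrightarrow} 0 \quad \text{in } D^{1,p}(\mathbb{R}^N),
\]
\[
\mu_n = \|Du_n\|^p_{\mathbb{R}^N} \stackrel{w}{\longrightarrow} \mu \qquad \text{and} \qquad \nu_n = |u_n|^{p^*} \stackrel{w}{\longrightarrow} \nu \quad \text{in } M(\mathbb{R}^N),
\]
with $\mu, \nu \ge 0$ and
\[
\mu(\mathbb{R}^N)^{\frac{1}{p}} \le S^{\frac{1}{p}}\, \nu(\mathbb{R}^N)^{\frac{1}{p^*}},
\]
then $\nu$ is a Dirac measure.

PROOF From (2.55) and the hypotheses, we see that
\[
\mu(\mathbb{R}^N)^{\frac{1}{p}} = S^{\frac{1}{p}}\, \nu(\mathbb{R}^N)^{\frac{1}{p^*}}.
\]
Also from (2.53) and Hölder's inequality (note that $\frac{1}{p} - \frac{1}{p^*} = \frac{1}{N}$), we have that
\[
\left( \int_{\mathbb{R}^N} |\vartheta|^{p^*}\, d\nu \right)^{\frac{1}{p^*}} \le S^{-\frac{1}{p}}\, \mu(\mathbb{R}^N)^{\frac{1}{N}} \left( \int_{\mathbb{R}^N} |\vartheta|^{p^*}\, d\mu \right)^{\frac{1}{p^*}} \qquad \forall\, \vartheta \in C_c^{\infty}\big(\mathbb{R}^N\big).
\]
Thus we infer that
\[
\nu = S^{-\frac{p^*}{p}}\, \mu(\mathbb{R}^N)^{\frac{p}{N-p}}\, \mu
\]
and (2.53) becomes
\[
\left( \int_{\mathbb{R}^N} |\vartheta|^{p^*}\, d\nu \right)^{\frac{1}{p^*}} \big( \nu(\mathbb{R}^N) \big)^{\frac{1}{N}} \le \left( \int_{\mathbb{R}^N} |\vartheta|^p\, d\nu \right)^{\frac{1}{p}},
\]
so
\[
\nu(A)^{\frac{1}{p^*}}\, \nu(\mathbb{R}^N)^{\frac{1}{N}} \le \nu(A)^{\frac{1}{p}} \quad \text{for all Borel sets } A \subseteq \mathbb{R}^N.
\]
But this is impossible if $\nu$ is not a Dirac measure.

We return to the problem of determining the best Sobolev constant, i.e.,
\[
S = \inf_{\substack{u \in D^{1,p}(\mathbb{R}^N) \\ \|u\|_{p^*} = 1}} \|Du\|_p^p. \tag{2.60}
\]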

We have the following existence result for problem (2.60). The result is due to Lions (1985a, 1985a) and its proof, which also uses Theorem 2.5.30, can be found there.

THEOREM 2.5.32 If $p \in (1, N)$ and $\{u_n\}_{n \ge 1} \subseteq D^{1,p}(\mathbb{R}^N)$ is a minimizing sequence for problem (2.60), then $\{u_n\}_{n \ge 1}$ up to translation and dilation is relatively compact in $D^{1,p}(\mathbb{R}^N)$, i.e., there exists a sequence $\{(z_n, \lambda_n)\}_{n \ge 1} \subseteq \mathbb{R}^N \times (\mathbb{R}_+ \setminus \{0\})$, such that the sequence
\[
u_n^{z_n, \lambda_n}(z) = \lambda_n^{\frac{N-p}{p}}\, u_n(\lambda_n z + z_n) \qquad \forall\, z \in \mathbb{R}^N
\]
is relatively compact in $D^{1,p}(\mathbb{R}^N)$.

REMARK 2.5.33 If $p = 2$, the function
\[
u(z) \stackrel{df}{=} \frac{[N(N-2)]^{\frac{N-2}{4}}}{\big( 1 + \|z\|^2_{\mathbb{R}^N} \big)^{\frac{N-2}{2}}}
\]
is a minimizer for problem (2.60) (see Aubin (1976) and Talenti (1976)). If $Z \subseteq \mathbb{R}^N$ is any open set (not necessarily equal to $\mathbb{R}^N$) and by $S(Z)$ we denote the value of problem (2.60) when $\mathbb{R}^N$ is replaced by $Z$, then $S(Z) = S$, but $S(Z)$ is never attained if $Z \ne \mathbb{R}^N$. Finally if $p = 1$, then the best constant for the embedding
\[
D^{1,1}(\mathbb{R}^N) \subseteq L^{\frac{N}{N-1}}(\mathbb{R}^N)
\]
is attained on the characteristic functions of balls, that is in $BV_{loc}(\mathbb{R}^N)$ (see Section 2.6).

Since we are in the business of determining sharp constants in inequalities, let us check to see what happens with the constant in the Poincaré-Wirtinger inequality (see Theorem 2.5.21) for Sobolev functions of one variable. Let $T = (0, b)$ ($b < +\infty$) and $p \in (1, +\infty)$. We introduce the space
\[
W^{1,p}_{per}\big(T; \mathbb{R}^N\big) \stackrel{df}{=} \big\{ u \in W^{1,p}\big(T; \mathbb{R}^N\big) : u(0) = u(b) \big\}.
\]
From Theorem 2.5.22(b), we have that $W^{1,p}_{per}\big(T; \mathbb{R}^N\big)$ is embedded continuously (in fact compactly) in $C\big(\overline{T}; \mathbb{R}^N\big)$. Therefore the evaluations at $t = b$ and $t = 0$ make sense.

PROPOSITION 2.5.34 If $u \in W^{1,p}_{per}\big(T; \mathbb{R}^N\big)$ ($p \in (1, +\infty)$) and $\int_0^b u(t)\, dt = 0$, then $\|u\|_{\infty} \le b^{\frac{1}{p'}}\, \|u'\|_p$.

PROOF Arguing on each component separately, we may assume without any loss of generality that $N = 1$. Then from the mean value theorem for integrals, we can find $\tau \in T = (0, b)$, such that
\[
u(\tau) = \frac{1}{b} \int_0^b u(s)\, ds = 0.
\]
By Hölder's inequality (see Theorem A.2.27), with $\frac{1}{p} + \frac{1}{p'} = 1$, we have
\[
|u(t)| = \left| \int_{\tau}^{t} u'(s)\, ds \right| \le \int_0^b |u'(s)|\, ds \le b^{\frac{1}{p'}}\, \|u'\|_p \qquad \forall\, t \in [0, b],
\]
so $\|u\|_{\infty} \le b^{\frac{1}{p'}}\, \|u'\|_p$.

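In the case of the Hilbert space $W^{1,2}_{per}\big(T; \mathbb{R}^N\big)$, we have the following sharp estimates.

PROPOSITION 2.5.35 If $u \in W^{1,2}_{per}\big(T; \mathbb{R}^N\big)$ and
\[
\int_0^b u(t)\, dt = 0,
\]
then
(a) $\|u\|_2^2 \le \dfrac{b^2}{4\pi^2}\, \|u'\|_2^2$;
(b) $\|u\|_{\infty}^2 \le \dfrac{b}{12}\, \|u'\|_2^2$.

PROOF Again we may assume that $N = 1$.

(a) We consider the Fourier expansion of $u$, i.e.,
\[
u(t) = \sum_{\substack{k = -\infty \\ k \ne 0}}^{+\infty} a_k \exp\Big( \frac{2 i \pi k t}{b} \Big).
\]
Parseval's equality implies that
\[
\|u'\|_2^2 = \sum_{\substack{k = -\infty \\ k \ne 0}}^{+\infty} b\, \frac{4\pi^2 k^2}{b^2}\, |a_k|^2 \ge \frac{4\pi^2}{b^2} \sum_{\substack{k = -\infty \\ k \ne 0}}^{+\infty} b\, |a_k|^2 = \frac{4\pi^2}{b^2}\, \|u\|_2^2.
\]

(b) Using the Cauchy-Schwarz-Bunyakowski inequality (see Proposition A.4.5 and Remark A.4.6), Parseval's equality and since
\[
\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6},
\]
for every $t \in [0, b]$, we have
\[
|u(t)|^2 \le \Bigg( \sum_{\substack{k = -\infty \\ k \ne 0}}^{+\infty} |a_k| \Bigg)^2
\le \Bigg( \sum_{\substack{k = -\infty \\ k \ne 0}}^{+\infty} \frac{b}{4\pi^2 k^2} \Bigg) \Bigg( \sum_{\substack{k = -\infty \\ k \ne 0}}^{+\infty} \frac{4\pi^2 k^2}{b}\, |a_k|^2 \Bigg) = \frac{b}{12}\, \|u'\|_2^2.
\]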
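That the constant in part (a) of Proposition 2.5.35 cannot be improved can be seen from a single Fourier mode. The test function $u(t) = \sin\big(\frac{2\pi t}{b}\big)$ below is chosen here for illustration; it is $b$-periodic with zero mean on $(0, b)$.

```latex
\[
\|u\|_2^2 = \int_0^b \sin^2\!\Big( \frac{2\pi t}{b} \Big)\, dt = \frac{b}{2},
\qquad
\|u'\|_2^2 = \frac{4\pi^2}{b^2} \int_0^b \cos^2\!\Big( \frac{2\pi t}{b} \Big)\, dt = \frac{2\pi^2}{b},
\]
\[
\frac{b^2}{4\pi^2}\, \|u'\|_2^2 = \frac{b^2}{4\pi^2} \cdot \frac{2\pi^2}{b} = \frac{b}{2} = \|u\|_2^2 ,
\]
% so equality holds in (a). For (b), the same u gives
\[
\|u\|_{\infty}^2 = 1 \le \frac{b}{12} \cdot \frac{2\pi^2}{b} = \frac{\pi^2}{6} .
\]
```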

2.6 Fine Properties of Functions and BV-Functions

In this section we establish some further differentiability properties of Sobolev functions and also introduce the space of functions of bounded variation ($BV$-functions) and establish some of their basic properties.

We start with a result on the $L^{p^*}$-differentiability of Sobolev functions.

PROPOSITION 2.6.1 If $u \in W^{1,p}_{loc}(\mathbb{R}^N)$ with $p \in [1, N)$, then for $\lambda^N$-almost all $z \in \mathbb{R}^N$, we have
\[
\left( \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \big| u(y) - u(z) - (Du(z), y - z)_{\mathbb{R}^N} \big|^{p^*}\, dy \right)^{\frac{1}{p^*}} = o(r) \quad \text{as } r \searrow 0.
\]

PROOF From Theorem 1.4.6, we know that for $\lambda^N$-almost all $z \in \mathbb{R}^N$, we have
\[
\lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} |u(y) - u(z)|^p\, dy = 0
\]
and
\[
\lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \|Du(y) - Du(z)\|^p_{\mathbb{R}^N}\, dy = 0.
\]
We fix such a point $z \in \mathbb{R}^N$ (known as a Lebesgue point for the functions $u$ and $Du$). Clearly exploiting the translation invariance of the Lebesgue measure $\lambda^N$, we can take $z = 0$. We choose $\vartheta \in C_c^1\big(\overline{B}_r(0)\big)$ with $\|\vartheta\|_{p'} \le 1$ (here $\frac{1}{p} + \frac{1}{p'} = 1$). Let $\varphi$ be a mollifier (see Definition 2.4.10) and for every $\varepsilon > 0$, set
\[
u_{\varepsilon} \stackrel{df}{=} \varphi_{\varepsilon} \star u.
\]
Choose $y \in \overline{B}_r(0)$ and let $h(t) = u_{\varepsilon}(ty)$. Then
\[
h(1) = h(0) + \int_0^1 h'(s)\, ds,
\]

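so
\[
u_{\varepsilon}(y) = u_{\varepsilon}(0) + \int_0^1 \big( Du_{\varepsilon}(sy), y \big)_{\mathbb{R}^N}\, ds
= u_{\varepsilon}(0) + \big( Du(0), y \big)_{\mathbb{R}^N} + \int_0^1 \big( Du_{\varepsilon}(sy) - Du(0), y \big)_{\mathbb{R}^N}\, ds. \tag{2.61}
\]
Using Fubini's theorem and a change of variables, we have
\[
\begin{aligned}
&\frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \vartheta(y) \big( u_{\varepsilon}(y) - u_{\varepsilon}(0) - (Du(0), y)_{\mathbb{R}^N} \big)\, dy \\
&\qquad = \int_0^1 \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \vartheta(y) \big( Du_{\varepsilon}(sy) - Du(0), y \big)_{\mathbb{R}^N}\, dy\, ds \\
&\qquad = \int_0^1 \frac{1}{s\, \lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \vartheta\Big( \frac{y}{s} \Big) \big( Du_{\varepsilon}(y) - Du(0), y \big)_{\mathbb{R}^N}\, dy\, ds.
\end{aligned}
\]
Letting $\varepsilon \searrow 0$, in the limit we obtain
\[
\begin{aligned}
&\frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \vartheta(y) \big( u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big)\, dy \\
&\qquad = \int_0^1 \frac{1}{s\, \lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \vartheta\Big( \frac{y}{s} \Big) \big( Du(y) - Du(0), y \big)_{\mathbb{R}^N}\, dy\, ds \\
&\qquad \le r \int_0^1 \left( \frac{1}{\lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \Big| \vartheta\Big( \frac{y}{s} \Big) \Big|^{p'} dy \right)^{\frac{1}{p'}} \left( \frac{1}{\lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \|Du(y) - Du(0)\|^p_{\mathbb{R}^N}\, dy \right)^{\frac{1}{p}} ds.
\end{aligned}
\]
Note that
\[
\frac{1}{\lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \Big| \vartheta\Big( \frac{y}{s} \Big) \Big|^{p'} dy = \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} |\vartheta(y)|^{p'}\, dy \le \frac{1}{a(N)\, r^N}
\]
(see Remark 1.3.22). So we obtain
\[
\frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \vartheta(y) \big( u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big)\, dy = o\Big( r^{1 - \frac{N}{p'}} \Big) \quad \text{as } r \searrow 0.
\]
Taking the supremum over all $\vartheta$, we obtain
\[
\frac{1}{r^N} \left( \int_{\overline{B}_r(0)} \big| u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big|^p\, dy \right)^{\frac{1}{p}} = o\Big( r^{1 - \frac{N}{p'}} \Big) \quad \text{as } r \searrow 0,
\]
so
\[
\left( \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \big| u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big|^p\, dy \right)^{\frac{1}{p}} = o(r) \quad \text{as } r \searrow 0. \tag{2.62}
\]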

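Set
\[
h(y) \stackrel{df}{=} u(y) - u(0) - \big( Du(0), y \big)_{\mathbb{R}^N},
\]
so $h \in W^{1,p}(B_r(0))$ and consider its extension $E(h) \in W^{1,p}(\mathbb{R}^N)$. We have
\[
\|E(h)\|_{W^{1,p}(\mathbb{R}^N)} \le c_1\, \|h\|_{W^{1,p}(B_r(0))}, \tag{2.63}
\]
for some $c_1 > 0$ (see Theorem 2.4.55). Then, via Sobolev's inequality (see Theorem 2.5.3) and (2.63), we have
\[
\begin{aligned}
\left( \int_{\overline{B}_r(0)} |h(y)|^{p^*}\, dy \right)^{\frac{1}{p^*}}
&\le \left( \int_{\mathbb{R}^N} |E(h)(y)|^{p^*}\, dy \right)^{\frac{1}{p^*}}
\le c_2 \left( \int_{\mathbb{R}^N} \|DE(h)(y)\|^p_{\mathbb{R}^N}\, dy \right)^{\frac{1}{p}} \\
&\le c_3 \left( \int_{\overline{B}_r(0)} \big( |h(y)|^p + \|Dh(y)\|^p_{\mathbb{R}^N} \big)\, dy \right)^{\frac{1}{p}}.
\end{aligned} \tag{2.64}
\]
Therefore using (2.62) and (2.64), we conclude that
\[
\begin{aligned}
&\left( \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \big| u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big|^{p^*}\, dy \right)^{\frac{1}{p^*}} \\
&\qquad \le c_4\, r \left( \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \|Du(y) - Du(0)\|^p_{\mathbb{R}^N}\, dy \right)^{\frac{1}{p}} \\
&\qquad\quad + c_4 \left( \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \big| u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big|^p\, dy \right)^{\frac{1}{p}} \\
&\qquad = o(r) \quad \text{as } r \searrow 0.
\end{aligned}
\]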

For differentiability $\lambda^N$-almost everywhere, as probably expected, we consider the case $p\in(N,+\infty]$.

PROPOSITION 2.6.2 If $u\in W^{1,p}_{loc}(\mathbb{R}^N)$ with $p\in(N,+\infty]$, then $u$ is differentiable $\lambda^N$-almost everywhere and the derivative equals the distributional derivative $\lambda^N$-almost everywhere.

PROOF Since $W^{1,\infty}_{loc}(\mathbb{R}^N)\subseteq W^{1,p}_{loc}(\mathbb{R}^N)$ for any $p<+\infty$, we may assume that $p\in(N,+\infty)$. For $\lambda^N$-almost all $z\in\mathbb{R}^N$, we have
\[
\lim_{r\searrow 0}\frac{1}{\lambda^N(\overline{B}_r(z))}\int_{\overline{B}_r(z)}\|Du(y)-Du(z)\|^p_{\mathbb{R}^N}\,dy=0. \tag{2.65}
\]

Choose $z\in\mathbb{R}^N$, such that (2.65) holds. Set
\[
h(y)\stackrel{df}{=}u(y)-u(z)-(Du(z),y-z)_{\mathbb{R}^N}\qquad\forall\ y\in\overline{B}_r(z).
\]
Using Morrey's inequality (see Theorem 2.5.12(a)), we have
\[
|h(y)-h(z)|\le c\,r\bigg(\frac{1}{\lambda^N(\overline{B}_r(z))}\int_{\overline{B}_r(z)}\|Dh(y)\|^p_{\mathbb{R}^N}\,dy\bigg)^{\frac{1}{p}},
\]
with $r=\|y-z\|_{\mathbb{R}^N}$. Since
\[
h(z)=0\quad\text{and}\quad Dh=Du-Du(z),
\]
using (2.65), we obtain
\[
\frac{|u(y)-u(z)-(Du(z),y-z)_{\mathbb{R}^N}|}{\|y-z\|_{\mathbb{R}^N}}
\le c\bigg(\frac{1}{\lambda^N(\overline{B}_r(z))}\int_{\overline{B}_r(z)}\|Dh(y)\|^p_{\mathbb{R}^N}\,dy\bigg)^{\frac{1}{p}}\longrightarrow 0\quad\text{as } y\to z,
\]
so $u$ is $\lambda^N$-almost everywhere differentiable and $\nabla u(z)=Du(z)$ for a.a. $z\in\mathbb{R}^N$.

Next we investigate the properties of a Sobolev function $u$ (or more exactly of its precise representation $u^*$ (see Definition 2.5.26)) along lines. In this direction we have the following result.

PROPOSITION 2.6.3
(a) If $u\in W^{1,p}_{loc}(\mathbb{R}^N)$, $p\in[1,+\infty)$, then for each $k\in\{1,\ldots,N\}$ the function
\[
u^*_k(z',t)=u^*(z_1,\ldots,z_{k-1},t,z_{k+1},\ldots,z_N)
\]
is locally absolutely continuous in $t$ for $\lambda^{N-1}$-almost all $z'=(z_i)_{i=1,\,i\ne k}^N\in\mathbb{R}^{N-1}$ (see Definition A.2.15(b)). Moreover $(u^*_k)'\in L^p_{loc}(\mathbb{R}^N)$.
(b) If $u\in L^p_{loc}(\mathbb{R}^N)$ and $u=h$ $\lambda^N$-almost everywhere where for each $k\in\{1,\ldots,N\}$, the function
\[
h_k(z',t)\stackrel{df}{=}h(z_1,\ldots,z_{k-1},t,z_{k+1},\ldots,z_N)
\]
is locally absolutely continuous in $t$ for $\lambda^{N-1}$-almost all $z'=(z_i)_{i=1,\,i\ne k}^N\in\mathbb{R}^{N-1}$ and $h'_k\in L^p_{loc}(\mathbb{R}^N)$, then $u\in W^{1,p}_{loc}(\mathbb{R}^N)$.

PROOF (a) Clearly we may assume that $k=N$. Set $u_\varepsilon=\varphi_\varepsilon\star u$ with $\{\varphi_\varepsilon\}_{\varepsilon>0}$ being a family of mollifiers. We know that
\[
u_\varepsilon\longrightarrow u\quad\text{in } W^{1,p}_{loc}(\mathbb{R}^N)
\]
(see Proposition 2.4.12(e)). For every $M>0$ and $\lambda^{N-1}$-almost all $z'=(z_i)_{i=1}^{N-1}\in\mathbb{R}^{N-1}$, from Fubini's theorem, we have
\[
\int_{-M}^{M}\bigg(|u_\varepsilon(z',t)-u(z',t)|^p+\Big|\frac{\partial u_\varepsilon}{\partial z_N}(z',t)-\frac{\partial u}{\partial z_N}(z',t)\Big|^p\bigg)dt\longrightarrow 0\quad\text{as } \varepsilon\searrow 0.
\]

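Let
\[
u_{N,\varepsilon}(t)\stackrel{df}{=}u_\varepsilon(z',t).
\]
Then
\[
u_{N,\varepsilon}\longrightarrow u_N\quad\text{in } W^{1,p}_{loc}(\mathbb{R})\quad\text{as } \varepsilon\searrow 0
\]
and so also locally uniformly to a locally absolutely continuous function $u_N$ with $u'_N(t)=D_Nu(z',t)$ for $\lambda^1$-a.a. $t\in\mathbb{R}$. Also from Theorems 1.6.18, 1.6.13(b) and Remark 1.6.14, we have that
\[
u_\varepsilon\longrightarrow u^*\quad\mu^{(N-1)}\text{-a.e.},
\]
so from Proposition 1.3.25, we have
\[
u_{N,\varepsilon}(t)\longrightarrow u^*(z',t)\quad\text{for } \lambda^{N-1}\text{-a.a. } z'\in\mathbb{R}^{N-1}\text{ and all } t\in\mathbb{R}.
\]
Thus
\[
u_N(t)=u^*(z',t)\quad\text{for } \lambda^{N-1}\text{-a.a. } z'\in\mathbb{R}^{N-1}\text{ and all } t\in\mathbb{R}.
\]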

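(b) For each $\vartheta\in C^1_c(\mathbb{R}^N)$, integrating by parts in $t$ (which is allowed because $h_k(z',\cdot)$ is locally absolutely continuous), we have
\[
\begin{aligned}
\int_{\mathbb{R}^N}u\,\frac{\partial\vartheta}{\partial z_k}\,dz
&=\int_{\mathbb{R}^N}h\,\frac{\partial\vartheta}{\partial z_k}\,dz
=\int_{\mathbb{R}^{N-1}}\bigg(\int_{-\infty}^{+\infty}h_k(z',t)\,\vartheta'(z',t)\,dt\bigg)dz'\\
&=-\int_{\mathbb{R}^{N-1}}\bigg(\int_{-\infty}^{+\infty}h'_k(z',t)\,\vartheta(z',t)\,dt\bigg)dz'
=-\int_{\mathbb{R}^N}h'_k\,\vartheta\,dz,
\end{aligned}
\]
so
\[
D_ku(z)=h'_k(z)\quad\text{for } \lambda^N\text{-a.a. } z\in\mathbb{R}^N\text{ and all } k\in\{1,\ldots,N\},
\]
thus $u\in W^{1,p}_{loc}(\mathbb{R}^N)$.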

Before discussing $BV$-functions, let us prove a result on the superposition operator defined on a Sobolev space. More precisely let $Z\subseteq\mathbb{R}^N$ be an open set and let $\xi:\mathbb{R}\longrightarrow\mathbb{R}$ be a Lipschitz continuous function. If $Z$ is unbounded, we also assume that $\xi(0)=0$. From Proposition 2.4.25, we know that if $u\in W^{1,p}(Z)$, then $\xi\circ u\in W^{1,p}(Z)$. So we can define the map $N_\xi:W^{1,p}(Z)\longrightarrow W^{1,p}(Z)$, by
\[
N_\xi(u)\stackrel{df}{=}\xi\circ u\qquad\forall\ u\in W^{1,p}(Z).
\]

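PROPOSITION 2.6.4 If $p\in(1,+\infty)$, then $N_\xi:W^{1,p}(Z)\longrightarrow W^{1,p}(Z)$ is continuous.

PROOF Suppose that
\[
u_n\longrightarrow u\quad\text{in } W^{1,p}(Z).
\]
Then
\[
\xi(u_n)\longrightarrow\xi(u)\quad\text{in } L^p(Z).
\]
Also from Proposition 2.4.25, we know that
\[
D(\xi\circ u_n)(z)=(\xi^*\circ u_n)(z)\,Du_n(z)\quad\text{for a.a. } z\in Z,
\]
with a bounded Borel measurable function $\xi^*:\mathbb{R}\longrightarrow\mathbb{R}$, such that
\[
\xi^*(t)=\xi'(t)\quad\text{for a.a. } t\in\mathbb{R}.
\]
So the sequence
\[
\{D(\xi\circ u_n)\}_{n\ge 1}\subseteq L^p(Z;\mathbb{R}^N)
\]
is bounded and it follows that
\[
D(\xi\circ u_n)\stackrel{w}{\longrightarrow}D(\xi\circ u)\quad\text{in } L^p(Z;\mathbb{R}^N).
\]
First suppose that $\xi^*=\chi_A$, with $A$ being a Borel set. Set
\[
\eta^*(t)\stackrel{df}{=}\xi^*(t)-\frac{1}{2}\quad\text{and}\quad\eta(t)\stackrel{df}{=}\xi(t)-\frac{t}{2},
\]
so that $\eta'=\eta^*$ almost everywhere and $|\eta^*|=\frac{1}{2}$. We have
\[
\int_Z\|D(\eta\circ u_n)\|^p_{\mathbb{R}^N}\,dz
=\int_Z\|(\eta^*\circ u_n)Du_n\|^p_{\mathbb{R}^N}\,dz
=\frac{1}{2^p}\int_Z\|Du_n\|^p_{\mathbb{R}^N}\,dz
\longrightarrow\frac{1}{2^p}\int_Z\|Du\|^p_{\mathbb{R}^N}\,dz
=\int_Z\|D(\eta\circ u)\|^p_{\mathbb{R}^N}\,dz.
\]
Since
\[
D(\eta\circ u_n)\stackrel{w}{\longrightarrow}D(\eta\circ u)\quad\text{in } L^p(Z;\mathbb{R}^N),\qquad
\|D(\eta\circ u_n)\|_p\longrightarrow\|D(\eta\circ u)\|_p,
\]
from the uniform convexity of $L^p(Z;\mathbb{R}^N)$ ($p\in(1,+\infty)$) and the Kadec-Klee property (see Remark A.3.22), we have that
\[
D(\xi\circ u_n)\longrightarrow D(\xi\circ u)\quad\text{in } L^p(Z;\mathbb{R}^N),
\]
whenever $\xi^*$ is a characteristic function of a Borel set.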

Clearly then the same is true for $\xi^*$ being a countably-valued Borel function. Now suppose that $\xi^*$ is an arbitrary bounded Borel function. For a given $\varepsilon>0$, we can find a countably-valued function $s^*$, such that
\[
\sup_{t\in\mathbb{R}}|\xi^*(t)-s^*(t)|\le\varepsilon
\]
(see Corollary 2.1.4). So using Proposition 2.4.25, we have
\[
\|D(\xi\circ u_n)-D(\xi\circ u)\|_p
\le\|(s^*\circ u_n)Du_n-(s^*\circ u)Du\|_p+\varepsilon\big(\|Du_n\|_p+\|Du\|_p\big),
\]
so
\[
\limsup_{n\to+\infty}\|D(\xi\circ u_n)-D(\xi\circ u)\|_p\le 2\varepsilon\|Du\|_p.
\]
Let $\varepsilon\searrow 0$, to obtain
\[
D(\xi\circ u_n)\longrightarrow D(\xi\circ u)\quad\text{in } L^p(Z;\mathbb{R}^N),
\]
hence $\xi\circ u_n\longrightarrow\xi\circ u$ in $W^{1,p}(Z)$.
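A worked instance may help fix ideas for Proposition 2.6.4. The sketch below assumes the particular choice $\xi(t)=t^+=\max\{t,0\}$, which is not made in the text; it is Lipschitz with $\xi(0)=0$, and its derivative is a characteristic function, so it falls in the first case treated in the proof.

```latex
% Assumption (illustration only): \xi(t) = t^{+} = \max\{t, 0\}.
\[
\xi^{*} = \chi_{(0,+\infty)}, \qquad
N_{\xi}(u) = u^{+}, \qquad
D(u^{+})(z) = \chi_{\{u > 0\}}(z)\, Du(z) \quad \text{for a.a. } z \in Z .
\]
% Continuity of N_\xi then reads:
\[
u_n \longrightarrow u \ \text{in } W^{1,p}(Z)
\ \Longrightarrow\
u_n^{+} \longrightarrow u^{+} \ \text{in } W^{1,p}(Z), \qquad p \in (1,+\infty).
\]
```

In other words, taking positive parts is a continuous operation on $W^{1,p}(Z)$, even though $t\mapsto t^+$ is not differentiable at the origin.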

REMARK 2.6.5 The result is also true for $p=1$, but the proof is more involved. We refer to Marcus & Mizel (1979), for details.

The weakest measure theoretic sense in which a function $w\in L^1(Z)$ can be differentiable is to require that its partial derivatives in the sense of distributions are Radon measures. Such functions are called functions of bounded variation. More precisely we make the following definition.

DEFINITION 2.6.6 Let $Z\subseteq\mathbb{R}^N$ be an open set. A function $u\in L^1(Z)$ is said to be of bounded variation, if and only if there exist bounded Borel signed measures $\mu_k:\mathcal{B}(Z)\longrightarrow\mathbb{R}$, for $k\in\{1,\ldots,N\}$, such that
\[
\int_Z u\,D_k\vartheta\,dz=-\int_Z\vartheta\,d\mu_k\qquad\forall\ \vartheta\in C^\infty_c(Z).
\]
The space of functions of bounded variation is denoted by $BV(Z)$.

The next Proposition clarifies the structure of the functions of bounded variation.

PROPOSITION 2.6.7 If $Z\subseteq\mathbb{R}^N$ is an open set, $u\in BV(Z)$ and for $h\in C_c(Z)$, $h\ge 0$, we set
\[
\|Du\|(h)\stackrel{df}{=}\sup\bigg\{\int_Z u\,\mathrm{div}\,\vartheta\,dz:\ \vartheta=(\vartheta_k)_{k=1}^N\in C^\infty_c(Z;\mathbb{R}^N),\ \|\vartheta(z)\|_{\mathbb{R}^N}\le h(z),\ z\in Z\bigg\},
\]
then $\|Du\|$ is a Radon measure.

PROOF According to the Riesz-Markov representation theorem (see Theorem 2.3.41), we need to show that $\|Du\|$ is a positive linear functional on $C_c(Z)$ which is continuous under monotone convergence, i.e., if $h_n\nearrow h$ in $C_c(Z)$, then $\|Du\|(h_n)\longrightarrow\|Du\|(h)$. To this end let $\mu=(\mu_k)_{k=1}^N=Du$. From Definition 2.6.6, we have that
\[
\int_Z u\,\mathrm{div}\,\vartheta\,dz=-\int_Z\vartheta\,d\mu\qquad\forall\ \vartheta\in C^\infty_c(Z;\mathbb{R}^N).
\]
Thus, we may write
\[
\|Du\|(h)=\sup\bigg\{\int_Z v\,d\mu:\ v=(v_k)_{k=1}^N\in C_c(Z;\mathbb{R}^N),\ \|v(z)\|_{\mathbb{R}^N}\le h(z)\ \text{for all } z\in Z\bigg\}.
\]
We show that $\|Du\|(\cdot)$ is additive. So let $h_1,h_2\in C_c(Z)$, $h_1,h_2\ge 0$ and suppose that $v\in C_c(Z;\mathbb{R}^N)$ is such that
\[
\|v(z)\|_{\mathbb{R}^N}\le h_1(z)+h_2(z)\qquad\forall\ z\in Z.
\]
Let $g=\min\{h_1,\|v\|\}$ and
\[
w(z)\stackrel{df}{=}
\begin{cases}
g(z)\,\dfrac{v(z)}{\|v(z)\|_{\mathbb{R}^N}} & \text{if } v(z)\ne 0,\\[2mm]
0 & \text{if } v(z)=0.
\end{cases}
\]
Clearly $w\in C_c(Z;\mathbb{R}^N)$ and
\[
\|v(z)-w(z)\|_{\mathbb{R}^N}=\|v(z)\|_{\mathbb{R}^N}-g(z)\le h_2(z)\qquad\forall\ z\in Z.
\]
Therefore, since
\[
\|w(z)\|_{\mathbb{R}^N}=g(z)\le h_1(z)\qquad\forall\ z\in Z,
\]

we have
\[
\int_Z v\,d\mu=\int_Z w\,d\mu+\int_Z(v-w)\,d\mu\le\|Du\|(h_1)+\|Du\|(h_2),
\]
so
\[
\|Du\|(h_1+h_2)\le\|Du\|(h_1)+\|Du\|(h_2).
\]
Since the opposite inequality is clearly true, we conclude that $\|Du\|(\cdot)$ is additive. Also it is clearly positively homogeneous. Thus it remains to show that if $h_n\nearrow h$ in $C_c(Z)_+$, then
\[
\|Du\|(h_n)\longrightarrow\|Du\|(h).
\]
Let $v\in C_c(Z;\mathbb{R}^N)$, such that
\[
\|v(z)\|_{\mathbb{R}^N}\le h(z)\qquad\forall\ z\in Z.
\]
Let $g_n\stackrel{df}{=}\min\{h_n,\|v\|\}$ and
\[
w_n(z)\stackrel{df}{=}
\begin{cases}
g_n(z)\,\dfrac{v(z)}{\|v(z)\|_{\mathbb{R}^N}} & \text{if } v(z)\ne 0,\\[2mm]
0 & \text{if } v(z)=0.
\end{cases}
\]
We have $w_n\in C_c(Z;\mathbb{R}^N)$,
\[
\|w_n(z)\|_{\mathbb{R}^N}=g_n(z)\le h_n(z)\qquad\forall\ z\in Z
\]
and $\|v-w_n\|=\|v\|-g_n\searrow 0$. Because $\|v-w_n\|=\|v\|-g_n\le 2\|v\|$, by virtue of the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that
\[
\int_Z v\,d\mu=\langle Du,v\rangle_{C_c(Z;\mathbb{R}^N)}=\lim_{n\to+\infty}\langle Du,w_n\rangle_{C_c(Z;\mathbb{R}^N)}\le\lim_{n\to+\infty}\|Du\|(g_n),
\]
so
\[
\|Du\|(h)\le\lim_{n\to+\infty}\|Du\|(g_n).
\]
Since
\[
g_n\le h_n\le h\qquad\forall\ n\ge 1,
\]
and $\|Du\|(\cdot)$ is monotone, we have that the opposite inequality also holds, hence
\[
\|Du\|(h)\ge\lim_{n\to+\infty}\|Du\|(h_n)\ge\lim_{n\to+\infty}\|Du\|(g_n),
\]
so
\[
\|Du\|(h)=\lim_{n\to+\infty}\|Du\|(h_n).
\]

COROLLARY 2.6.8 If $Z\subseteq\mathbb{R}^N$ is an open set and $u\in BV(Z)$, then there exists a Borel measurable function $\xi:Z\longrightarrow\mathbb{R}^N$, such that
\[
\|\xi(z)\|_{\mathbb{R}^N}=1\quad\mu=Du\text{-a.e.}
\]
and
\[
\int_Z u\,\mathrm{div}\,\vartheta\,dz=-\int_Z(\vartheta,\xi)_{\mathbb{R}^N}\,d\|Du\|\qquad\forall\ \vartheta\in C^1_c(Z;\mathbb{R}^N).
\]

REMARK 2.6.9 Evidently
\[
\xi=\frac{d(Du)}{d\|Du\|}
\]
(i.e., the Radon-Nikodym derivative of $\mu=Du$ with respect to $\|Du\|$, since $Du\ll\|Du\|$; see Theorem A.2.24 and Remark A.2.25). So, we have
\[
\int_Z u\,\mathrm{div}\,\vartheta\,dz=-\int_Z\vartheta\,dDu\qquad\forall\ \vartheta\in C^1_c(Z;\mathbb{R}^N).
\]
In the sequel for $u\in L^1_{loc}(Z)$, we say that $u\in BV_{loc}(Z)$ (i.e., has locally bounded variation in $Z$), if for every bounded open set $V\subseteq Z$ with $\overline{V}\subseteq Z$, we have that $u\in BV(V)$. Note that the total variation of $\|Du\|$ is given by
\[
\|Du\|(Z)=\sup\bigg\{\int_Z u\,\mathrm{div}\,\vartheta\,dz:\ \vartheta=(\vartheta_k)_{k=1}^N\in C^\infty_c(Z;\mathbb{R}^N),\ \|\vartheta(z)\|_{\mathbb{R}^N}\le 1\ \text{for all } z\in Z\bigg\}.
\]
The norm of $BV(Z)$ is given by
\[
\|u\|_{BV(Z)}=\|u\|_1+\|Du\|
\]
and makes $BV(Z)$ a Banach space. It is also well known that an absolutely continuous function $u:\mathbb{R}\longrightarrow\mathbb{R}$ with $u'\in L^1(\mathbb{R})$ is of bounded variation in $\mathbb{R}$. In particular then $W^{1,1}(\mathbb{R})\subseteq BV(\mathbb{R})$. Next we show that the same is true in higher dimensions (i.e., for $N>1$). First two examples to motivate what follows.

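EXAMPLE 2.6.10 (a) Let $Z\subseteq\mathbb{R}^N$ be an open set and $u\in W^{1,1}(Z)$, then $u\in BV(Z)$ (i.e., $W^{1,1}(Z)\subseteq BV(Z)$). To see this let $\vartheta\in C^1_c(Z;\mathbb{R}^N)$ with $\|\vartheta(z)\|_{\mathbb{R}^N}\le 1$ for all $z\in Z$. We have
\[
\int_Z u\,\mathrm{div}\,\vartheta\,dz=-\int_Z(Du,\vartheta)_{\mathbb{R}^N}\,dz,
\]
so
\[
\|Du\|=\int_Z\|Du(z)\|_{\mathbb{R}^N}\,dz
\]
and
\[
\xi(z)=
\begin{cases}
\dfrac{Du(z)}{\|Du(z)\|_{\mathbb{R}^N}} & \text{if } Du(z)\ne 0,\\[2mm]
0 & \text{if } Du(z)=0,
\end{cases}
\qquad\text{for } \lambda^N\text{-a.a. } z\in Z.
\]
Similarly we show that $W^{1,1}_{loc}(Z)\subseteq BV_{loc}(Z)$. In particular then $W^{1,p}_{loc}(Z)\subseteq BV_{loc}(Z)$ for all $p\in[1,+\infty)$ and if $Z\subseteq\mathbb{R}^N$ is bounded and open then $W^{1,p}(Z)\subseteq BV(Z)$ for all $p\ge 1$.

(b) Let $Z\subseteq\mathbb{R}^N$ be an open set, $U\subseteq\mathbb{R}^N$ another open set with $C^2$-boundary $\partial U$, such that
\[
\mu^{(N-1)}(\partial U\cap K)<+\infty\quad\text{for all compact sets } K\subseteq Z.
\]
Then from Proposition 2.4.44, for $\vartheta\in C^1_c(Z;\mathbb{R}^N)$, we have
\[
\int_U\mathrm{div}\,\vartheta\,dz=\int_{\partial U}(\vartheta,n)_{\mathbb{R}^N}\,d\mu^{(N-1)}
\]
(here $n$ denotes the outward unit normal along $\partial U$). Hence for any bounded, open set $V\subseteq Z$, with $\overline{V}\subseteq Z$ and for any $\vartheta\in C^1_c(V;\mathbb{R}^N)$ with $\|\vartheta(z)\|_{\mathbb{R}^N}\le 1$ for all $z\in V$, we have
\[
\int_U\mathrm{div}\,\vartheta\,dz=\int_{\partial U\cap V}(\vartheta,n)_{\mathbb{R}^N}\,d\mu^{(N-1)}\le\mu^{(N-1)}\big(\partial U\cap\overline{V}\big),
\]
so $\chi_U\in BV_{loc}(Z)$. Moreover,
\[
\|\partial\chi_U\|(Z)=\mu^{(N-1)}(\partial U\cap Z).
\]
Thus $\|\partial\chi_U\|(Z)$ measures the size of $\partial U$ in $Z$. Since $\chi_U$ is not in general in $W^{1,1}_{loc}(Z)$, we see that not every function of (locally) bounded variation is a Sobolev function.

Motivated by Example 2.6.10(b), we make the following definition.

DEFINITION 2.6.11 A Lebesgue measurable set $A\subseteq\mathbb{R}^N$ is said to have finite perimeter in an open set $Z\subseteq\mathbb{R}^N$, if $\chi_A\in BV(Z)$.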

REMARK 2.6.12 Some authors call sets of finite perimeter, Caccioppoli sets.
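To make Definition 2.6.11 concrete, here is a one-dimensional worked instance. It assumes $N=1$, $Z=\mathbb{R}$ and $A=(a,b)$ a bounded interval, choices made only for illustration and not taken from the text.

```latex
% Assumption (illustration only): N = 1, Z = \mathbb{R}, A = (a,b), -\infty < a < b < +\infty.
\[
\int_{\mathbb{R}} \chi_{A}\,\vartheta'\,dz = \vartheta(b) - \vartheta(a)
= -\int_{\mathbb{R}} \vartheta \, d(\delta_a - \delta_b)
\qquad \forall\ \vartheta \in C_c^{\infty}(\mathbb{R}),
\]
\[
D\chi_{A} = \delta_a - \delta_b, \qquad
\|D\chi_{A}\|(\mathbb{R}) = 2 .
\]
```

So the interval has finite perimeter, equal to the number of its boundary points, while $\chi_A\notin W^{1,1}_{loc}(\mathbb{R})$ because $D\chi_A$ is not absolutely continuous with respect to $\lambda^1$.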

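Next we shall establish some elementary properties of $BV$-functions. The first is the lower semicontinuity of the variational measure.

PROPOSITION 2.6.13 If $Z\subseteq\mathbb{R}^N$ is an open set and $\{u_n\}_{n\ge 1}\subseteq BV(Z)$ is such that
\[
u_n\longrightarrow u\quad\text{in } L^1_{loc}(Z),
\]
then for every open set $U\subseteq Z$, we have
\[
\|Du\|(U)\le\liminf_{n\to+\infty}\|Du_n\|(U).
\]

PROOF Let $\vartheta\in C^\infty_c(U;\mathbb{R}^N)$ be such that
\[
\|\vartheta(z)\|_{\mathbb{R}^N}\le 1\qquad\forall\ z\in U.
\]
We have
\[
\int_U u\,\mathrm{div}\,\vartheta\,dz=\lim_{n\to+\infty}\int_U u_n\,\mathrm{div}\,\vartheta\,dz\le\liminf_{n\to+\infty}\|Du_n\|(U);
\]
so from Remark 2.6.9, taking the supremum over all such $\vartheta$, we have
\[
\|Du\|(U)\le\liminf_{n\to+\infty}\|Du_n\|(U).
\]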
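The inequality in Proposition 2.6.13 can be strict. The following sketch illustrates this numerically under assumptions that are not in the text: $N=1$, $Z=(0,2\pi)$ and the hypothetical sequence $u_n(x)=\sin(nx)/n$, which tends to $0$ in $L^1(Z)$ while $\|Du_n\|(Z)=\int_0^{2\pi}|\cos(nx)|\,dx=4$ for every $n$. The helper names (`discrete_tv`, `l1_norm`, `oscillating_sequence`) are illustrative, and the grid sums only approximate the continuous quantities.

```python
import numpy as np


def discrete_tv(values):
    """Sum of |u(x_{i+1}) - u(x_i)|: a grid approximation of ||Du||(Z)."""
    return float(np.sum(np.abs(np.diff(values))))


def l1_norm(values, step):
    """Left Riemann sum approximating the L^1 norm on a uniform grid."""
    return float(np.sum(np.abs(values[:-1])) * step)


def oscillating_sequence(n, num_points=200001):
    """Return (L^1 norm, discrete total variation) of u_n(x) = sin(n x) / n on (0, 2*pi)."""
    x = np.linspace(0.0, 2.0 * np.pi, num_points)
    u_n = np.sin(n * x) / n
    return l1_norm(u_n, x[1] - x[0]), discrete_tv(u_n)


if __name__ == "__main__":
    # The L^1 norms shrink like 4/n, yet the total variations stay near 4,
    # whereas the limit u = 0 has total variation 0.
    for n in (1, 4, 16, 64):
        norm, tv = oscillating_sequence(n)
        print(f"n={n:3d}  ||u_n||_1 ~ {norm:.4f}  ||Du_n|| ~ {tv:.4f}")
```

Here $\|Du\|(Z)=0<4=\liminf_{n\to+\infty}\|Du_n\|(Z)$, so only lower semicontinuity, not continuity, can be expected under $L^1_{loc}$-convergence.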

REMARK 2.6.14 The above Proposition does not assert that $u\in BV(Z)$. This will be true if $u\in L^1(Z)$ and $\sup_{n\ge 1}\|Du_n\|(Z)<+\infty$. To see this let $\vartheta\in C^1_c(Z)$ and $k=1,\ldots,N$. We have
\[
\lim_{n\to+\infty}\int_Z\vartheta\,D_ku_n\,dz=-\lim_{n\to+\infty}\int_Z u_n\,D_k\vartheta\,dz=-\int_Z u\,D_k\vartheta\,dz,
\]
so
\[
\bigg|\int_Z u\,D_k\vartheta\,dz\bigg|\le\|\vartheta\|_\infty\liminf_{n\to+\infty}\|Du_n\|(Z)<+\infty.
\]
Because the embedding $C^1_c(Z)\subseteq C_c(Z)$ is dense, we have that
\[
D_ku(\vartheta)=-\int_Z u\,D_k\vartheta\,dz\qquad\forall\ k=1,\ldots,N
\]
is a bounded linear functional on $C_c(Z)$, hence a measure.

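In the next Proposition, we establish an upper semicontinuity property of the total variation measure.

PROPOSITION 2.6.15 If $Z\subseteq\mathbb{R}^N$ is an open set, $\{u_n\}_{n\ge 1}\subseteq BV(Z)$,
\[
u_n\longrightarrow u\quad\text{in } L^1_{loc}(Z)
\]
and
\[
\|Du\|(Z)=\lim_{n\to+\infty}\|Du_n\|(Z),
\]
then
\[
\limsup_{n\to+\infty}\|Du_n\|\big(\overline{U}\cap Z\big)\le\|Du\|\big(\overline{U}\cap Z\big)\quad\text{for all open sets } U\subseteq Z.
\]

PROOF The set $V=Z\setminus\overline{U}$ is open and so from Proposition 2.6.13, we have that
\[
\|Du\|(V)\le\liminf_{n\to+\infty}\|Du_n\|(V). \tag{2.66}
\]
Then we have
\[
\begin{aligned}
\|Du\|\big(\overline{U}\cap Z\big)+\|Du\|(V)&=\|Du\|(Z)=\lim_{n\to+\infty}\|Du_n\|(Z)\\
&\ge\limsup_{n\to+\infty}\|Du_n\|\big(\overline{U}\cap Z\big)+\liminf_{n\to+\infty}\|Du_n\|(V)\\
&\ge\limsup_{n\to+\infty}\|Du_n\|\big(\overline{U}\cap Z\big)+\|Du\|(V),
\end{aligned}
\]
so
\[
\limsup_{n\to+\infty}\|Du_n\|\big(\overline{U}\cap Z\big)\le\|Du\|\big(\overline{U}\cap Z\big).
\]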

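Combining Propositions 2.6.13 and 2.6.15, we have the following.

COROLLARY 2.6.16 If $Z\subseteq\mathbb{R}^N$ is an open set, $\{u_n\}_{n\ge 1}\subseteq BV(Z)$,
\[
u_n\longrightarrow u\quad\text{in } L^1_{loc}(Z),\qquad\|Du_n\|(Z)\longrightarrow\|Du\|(Z)
\]
and
\[
\|Du\|(\partial U)=0\quad\text{for all open sets } U\subseteq Z,
\]
then $\|Du_n\|(U)\longrightarrow\|Du\|(U)$.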

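The next theorem is the counterpart for the space $BV(Z)$ of the Meyers-Serrin theorem (see Theorem 2.4.13).

THEOREM 2.6.17 If $Z\subseteq\mathbb{R}^N$ is an open set and $u\in BV(Z)$, then we can find a sequence $\{u_n\}_{n\ge 1}\subseteq BV(Z)\cap C^\infty(Z)$, such that
\[
u_n\longrightarrow u\quad\text{in } L^1(Z)\quad\text{and}\quad\|Du_n\|(Z)\longrightarrow\|Du\|(Z).
\]

PROOF Let $\varepsilon>0$. For a given positive integer $m\ge 1$, we define the following open subsets of $Z$:
\[
Z_k\stackrel{df}{=}\Big\{z\in Z:\ d(z,\partial Z)>\frac{1}{k+m}\Big\}\cap B_{k+m}(0)\qquad\forall\ k\ge 1.
\]
Choose $m\ge 1$ large enough so that
\[
\|Du\|(Z\setminus Z_1)<\varepsilon. \tag{2.67}
\]
Setting $Z_0=\emptyset$, we introduce the following sequence of open subsets of $Z$:
\[
V_k\stackrel{df}{=}Z_{k+1}\setminus\overline{Z}_{k-1}\qquad\forall\ k\ge 1.
\]
Let $\{\xi_k\}_{k\ge 1}$ be a $C^\infty$-partition of unity subordinate to the open cover $\{V_k\}_{k\ge 1}$ of $Z$, i.e.,
\[
\xi_k\in C^\infty_c(V_k),\qquad 0\le\xi_k\le 1\qquad\text{and}\qquad\sum_{k=1}^\infty\xi_k=1\ \text{ on } Z.
\]
Let $\varphi$ be a mollifier and for each $k\ge 1$, choose $\varepsilon_k>0$, such that
\[
\begin{cases}
\mathrm{supp}\,\big(\varphi_{\varepsilon_k}\star(\xi_ku)\big)\subseteq V_k,\\[1mm]
\|\varphi_{\varepsilon_k}\star(\xi_ku)-\xi_ku\|_1<\dfrac{\varepsilon}{2^k},\\[2mm]
\|\varphi_{\varepsilon_k}\star(uD\xi_k)-uD\xi_k\|_1<\dfrac{\varepsilon}{2^k}.
\end{cases}
\tag{2.68}
\]
Let
\[
u_\varepsilon\stackrel{df}{=}\sum_{k=1}^\infty\varphi_{\varepsilon_k}\star(\xi_ku).
\]
Then $u_\varepsilon\in C^\infty(Z)$ and because $u=\sum_{k=1}^\infty\xi_ku$, from (2.68), we have
\[
\|u_\varepsilon-u\|_1<\varepsilon,
\]
so
\[
u_\varepsilon\longrightarrow u\quad\text{in } L^1(Z)\quad\text{as } \varepsilon\searrow 0. \tag{2.69}
\]
Invoking Proposition 2.6.13, we have
\[
\|Du\|(Z)\le\liminf_{\varepsilon\searrow 0}\|Du_\varepsilon\|(Z). \tag{2.70}
\]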

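Now let $\vartheta\in C^1_c(Z;\mathbb{R}^N)$ be such that
\[
\|\vartheta(z)\|_{\mathbb{R}^N}\le 1\qquad\forall\ z\in Z.
\]
We have
\[
\begin{aligned}
\int_Z u_\varepsilon\,\mathrm{div}\,\vartheta\,dz
&=\sum_{k=1}^\infty\int_Z\varphi_{\varepsilon_k}\star(\xi_ku)\,\mathrm{div}\,\vartheta\,dz
=\sum_{k=1}^\infty\int_Z\xi_ku\,\mathrm{div}\,(\varphi_{\varepsilon_k}\star\vartheta)\,dz\\
&=\sum_{k=1}^\infty\int_Z u\,\mathrm{div}\,\big(\xi_k(\varphi_{\varepsilon_k}\star\vartheta)\big)\,dz
-\sum_{k=1}^\infty\int_Z u\,\big(D\xi_k,\varphi_{\varepsilon_k}\star\vartheta\big)_{\mathbb{R}^N}\,dz\\
&=\sum_{k=1}^\infty\int_Z u\,\mathrm{div}\,\big(\xi_k(\varphi_{\varepsilon_k}\star\vartheta)\big)\,dz
-\sum_{k=1}^\infty\int_Z\big(\vartheta,\varphi_{\varepsilon_k}\star(uD\xi_k)-uD\xi_k\big)_{\mathbb{R}^N}\,dz\\
&=\eta_{1,\varepsilon}+\eta_{2,\varepsilon}.
\end{aligned}
\]
Note that
\[
\big\|\xi_k(\varphi_{\varepsilon_k}\star\vartheta)(z)\big\|_{\mathbb{R}^N}\le 1\qquad\forall\ z\in Z,\ k\ge 1.
\]
Also each $z\in Z$ belongs to at most three elements of the cover $\{V_k\}_{k\ge 1}$. So we have
\[
\begin{aligned}
|\eta_{1,\varepsilon}|
&=\bigg|\int_Z u\,\mathrm{div}\,\big(\xi_1(\varphi_{\varepsilon_1}\star\vartheta)\big)\,dz
+\sum_{k=2}^\infty\int_Z u\,\mathrm{div}\,\big(\xi_k(\varphi_{\varepsilon_k}\star\vartheta)\big)\,dz\bigg|\\
&\le\|Du\|(Z)+\sum_{k=2}^\infty\|Du\|(V_k)
\le\|Du\|(Z)+3\|Du\|(Z\setminus Z_1)
\le\|Du\|(Z)+3\varepsilon.
\end{aligned}
\tag{2.71}
\]
Also from (2.68), we have that
\[
|\eta_{2,\varepsilon}|<\varepsilon. \tag{2.72}
\]
From (2.71) and (2.72), it follows that
\[
\int_Z u_\varepsilon\,\mathrm{div}\,\vartheta\,dz\le\|Du\|(Z)+4\varepsilon,
\]
thus
\[
\|Du_\varepsilon\|(Z)\le\|Du\|(Z)+4\varepsilon
\]
and so
\[
\limsup_{\varepsilon\to 0}\|Du_\varepsilon\|(Z)\le\|Du\|(Z). \tag{2.73}
\]
From (2.70) and (2.73), we infer that
\[
\|Du_\varepsilon\|(Z)\longrightarrow\|Du\|(Z)\quad\text{as } \varepsilon\searrow 0.
\]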

This combined with (2.69) finishes the proof of the theorem.

REMARK 2.6.18 Note that in the previous "local" approximation result, we do not have that
\[
\|D(u_\varepsilon-u)\|(Z)\longrightarrow 0\quad\text{as } \varepsilon\searrow 0
\]
and so we cannot claim the density of $BV(Z)\cap C^\infty(Z)$ in $BV(Z)$.

COROLLARY 2.6.19 If $Z\subseteq\mathbb{R}^N$ is a bounded open set which is Lipschitz, then $\overline{B}_r$, for $r>0$, is compact in $L^1(Z)$, where
\[
\overline{B}_r=\{u\in BV(Z):\ \|u\|_{BV}\le r\}.
\]

PROOF Let $\{u_n\}_{n\ge 1}\subseteq\overline{B}_r$. By Theorem 2.6.17, we can find $\psi_n\in C^\infty(Z)$, such that
\[
\|u_n-\psi_n\|_1\le\frac{1}{n}\quad\text{and}\quad\|D\psi_n\|=\int_Z\|D\psi_n(z)\|\,dz\le 2.
\]
It follows that $\{\psi_n\}_{n\ge 1}\subseteq W^{1,1}(Z)$ is bounded. By virtue of Theorem 2.5.17, the sequence $\{\psi_n\}_{n\ge 1}\subseteq L^1(Z)$ is relatively compact. So we may assume that $\psi_n\longrightarrow u$ in $L^1(Z)$. From Remark 2.6.14, we have that $u\in BV(Z)$ and from Proposition 2.6.13, we have that $\|u\|_{BV}\le r$, i.e., $u\in\overline{B}_r$.

REMARK 2.6.20 According to Corollary 2.6.19, if $Z\subseteq\mathbb{R}^N$ is a bounded open set which is Lipschitz, then the embedding $BV(Z)\subseteq L^1(Z)$ is compact.

An interesting application of this compact embedding is the following result.

PROPOSITION 2.6.21 If $Z\subseteq\mathbb{R}^N$ is a bounded open set which is Lipschitz,
\[
T\stackrel{df}{=}\Big\{A\subseteq Z:\ A\text{ is Lebesgue measurable},\ \lambda^N(A)=\lambda^N(Z\setminus A)=\frac{1}{2}\lambda^N(Z)\Big\}
\]
and
\[
P(A,Z)=\|D\chi_A\|\qquad\forall\ A\in T
\]
(the perimeter with respect to $Z$ functional), then there exists $A^*\in T$, such that
\[
P(A^*,Z)=\inf_{A\in T}P(A,Z).
\]

PROOF Let
\[
S\stackrel{df}{=}\{\chi_A:\ A\in T\}\subseteq L^1(Z).
\]
We furnish $S$ with the relative $L^1(Z)$-topology. Since
\[
\|\chi_A\|_1\le\lambda^N(Z)\qquad\forall\ A\in T,
\]
we see that the functional $\xi:S\longrightarrow\mathbb{R}$ defined by
\[
\xi(\chi_A)\stackrel{df}{=}\|D\chi_A\|=P(A,Z)
\]
is coercive on $S$ for the $BV(Z)$-norm. Therefore the sub-level sets of $\xi$ are bounded in $BV(Z)$, thus relatively compact in $S\subseteq L^1(Z)$ (note that $S$ is closed in $L^1(Z)$ and see Remark 2.6.20). Also from Proposition 2.6.13, we know that $\xi$ is lower semicontinuous on $S$. This means that its sub-level sets are compact in $L^1(Z)$. So by the Weierstrass theorem, we can find $A^*\in T$, such that
\[
P(A^*,Z)=\inf_{A\in T}P(A,Z).
\]

We can relate the variation measure of $u$ and the perimeters of its superlevel sets. The result is actually a "Co-Area Formula" for $BV$-functions (see also Theorem 1.5.25).

THEOREM 2.6.22 If $Z\subseteq\mathbb{R}^N$ is an open set, $u\in L^1(Z)$ and for every $r\in\mathbb{R}$ let
\[
L_r\stackrel{df}{=}\{z\in Z:\ u(z)>r\},
\]
then
(a) $u\in BV(Z)$ implies that
\[
\|Du\|(Z)=\int_{-\infty}^{\infty}\|D\chi_{L_r}\|(Z)\,dr;
\]
(b) if for almost all $r\in\mathbb{R}$, $L_r$ has a finite perimeter, then $u\in BV(Z)$.
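The formula in part (a) can be checked numerically in one dimension. The sketch below is an illustration under assumptions not made in the text: $N=1$, $Z=(0,1)$ and the hypothetical test function $u(x)=\sin(2\pi x)$, for which $\|Du\|(Z)=\int_0^1|u'|\,dx=4$ and each superlevel set $L_r$, $r\in(-1,1)$, $r\ne 0$, has exactly two boundary points in $Z$. All function names are illustrative; the grid sums approximate the continuous quantities.

```python
import numpy as np


def discrete_tv(values):
    """Sum of |u(x_{i+1}) - u(x_i)|: a grid approximation of ||Du||(Z)."""
    return float(np.sum(np.abs(np.diff(values))))


def perimeter_of_superlevel_set(values, level):
    """Count the jumps of the indicator of {u > level} along the grid."""
    indicator = (values > level).astype(float)
    return float(np.sum(np.abs(np.diff(indicator))))


def coarea_integral(values, level_edges):
    """Midpoint rule in r for the integral of the perimeters of {u > r}."""
    midpoints = 0.5 * (level_edges[:-1] + level_edges[1:])
    widths = np.diff(level_edges)
    perimeters = np.array(
        [perimeter_of_superlevel_set(values, r) for r in midpoints]
    )
    return float(np.sum(perimeters * widths))


def sine_example(num_points=2001, num_levels=3000):
    """Return (total variation, co-area integral) for u(x) = sin(2*pi*x) on (0, 1)."""
    x = np.linspace(0.0, 1.0, num_points)
    u = np.sin(2.0 * np.pi * x)
    level_edges = np.linspace(-1.5, 1.5, num_levels + 1)
    return discrete_tv(u), coarea_integral(u, level_edges)


if __name__ == "__main__":
    tv, coarea = sine_example()
    print(f"total variation ~ {tv:.6f}, integral of perimeters ~ {coarea:.6f}")
```

Both quantities come out close to $4$: the variation of $u$ is recovered by integrating the perimeters of its superlevel sets over all levels $r$.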

2.7 Remarks

2.1: To have a good theory of integration, we need a reasonable notion of measurability of functions. In this direction the basic result is the Pettis measurability theorem (see Theorem 2.1.3), which was proved by Pettis (1938a). The main integral for vector valued functions, which has a rich enough structure to have significant applications, is the Bochner integral. The Bochner integral can be traced in the works of Bochner (1933) and Dunford (1935) and for this reason is also known as “Dunford’s first integral.” Most of the properties of the Bochner integral follow from the corresponding properties of the classical Lebesgue integral, by virtue of Proposition 2.1.10. So some analysts say that the Bochner integral is the Lebesgue integral with the absolute value replaced by norms. The Pettis integral has much fewer applications, which require knowledge and use of sophisticated measure theoretic results. The theory of Pettis integration started with the work of Pettis and attracted renewed attention after the paper of Edgar (1977). A detailed study of the Pettis integral with applications can be found in the monograph of Talagrand (1984a). On the subject of vector valued functions and their integration, the reader can consult the books of Diestel & Uhl (1977), Dunford & Schwartz (1958) and Hille & Phillips (1957). The proof of the Orlicz-Pettis theorem can be found in Diestel & Uhl (1977, p. 22). 2.2: A reference to Lebesgue-Bochner spaces can be found in every book dealing with infinite dimensional dynamical systems. They are a natural generalization of the classical Lebesgue spaces using the notion of Bochner integral. Vector measures were already considered by Pettis (1938b). However, the real expansion on the subject occurred in the late 60s and during the 70s, when there was a systematic study of the geometry of Banach spaces. That is when RNP spaces were introduced and studied in detail. 
That a reflexive Banach space has the RNP was established by Phillips (1940), while the fact that a separable dual Banach space is an RNP space is due to Dunford & Pettis (1940). The proof of Proposition 2.2.8 can be found in Diestel & Uhl (1977, pp. 79 and 82). Theorem 2.2.9 (the Riesz Representation theorem for the Lebesgue-Bochner spaces $L^p(\Omega;X)$, $p\in[1,+\infty)$) is essentially due to Bochner & Taylor (1938). Its proof can be found in Diestel & Uhl (1977, p. 97). Its extension (for $p=1$) mentioned in Theorem 2.2.12 (called the Dinculeanu-Foias theorem) is due to Dinculeanu & Foias (1961) and its proof, based on “lifting theory,” can be found in Ionescu-Tulcea & Ionescu-Tulcea (1969, p. 93). Absolute continuity of real valued functions (see Definition 2.2.14) was introduced by Vitali (1908), who established the fundamental fact that a real valued function on $[0,1]$ is absolutely continuous if and only if it is the integral of its derivative (the fundamental theorem of Lebesgue calculus). Theorem 2.2.17 is due to Komura (1967). Lemma 2.2.29


can be found in Lions (1969, p. 58) and shows how new inequalities can be derived from the properties of embedding operators. Theorem 2.2.30 is due to Aubin (1963) and plays a central role in the theory of evolution equations. Evolution triples (see Definition 2.2.31) are also known as “Gelfand triples,” because of their systematic use by Gelfand & Shilov (1977) (see also Wloka (1987)). Evolution triples and their properties and applications can be found in Denkowski, Migórski & Papageorgiou (2003a, 2003b), Hu & Papageorgiou (1997, 2000), Lions (1969), Showalter (1997) and Zeidler (1990a, 1990b). Finally we mention a result on the structure of $L^1(\Omega;X)$ due to Talagrand (1984b).

PROPOSITION 2.7.1 If $(\Omega,\Sigma,\mu)$ is a finite measure space and $X$ is a Banach space which is weakly sequentially complete, then $L^1(\Omega;X)$ is weakly sequentially complete too.

2.3: Theorem 2.3.2 is known as the “Arzela-Ascoli theorem” although some authors use only one of the two names. Working on $C([0,1])$, Arzela (1889) proved the necessity part, while Ascoli (1883–1884) proved the sufficiency part. A general formulation of this theorem can be found in Denkowski, Migórski & Papageorgiou (2003a, p. 73). The results on the compactness of various sets in $L^p(T;X)$ ($p\in[1,+\infty)$) and in $C(T;X)$ (variations of the Arzela-Ascoli theorem) can be found in Simon (1987). They are formulations and extensions of the classical criterion for strong compactness in $L^p(T)$ ($p\in[1,+\infty)$), due to Riesz (1933) and Kolmogorov (1931). James’ theorem (see Theorem 2.3.21), due to James (1964), is one of the deepest and most influential results of functional analysis. From it, it follows that a Banach space $X$ is reflexive if and only if every $x^*\in X^*$ attains its supremum on the unit ball of $X$. For a proof of James’ theorem see Holmes (1975, pp. 157–161). Theorem 2.3.21 can be found in Papageorgiou (1985) (see also Denkowski, Migórski & Papageorgiou (2003a, p.
462)), where a kind of converse of it can also be found. The proof of Proposition 2.3.22 can be found in Denkowski, Mig´orski & Papageorgiou (2003a, p. 458). Ionescu-Tulcea & Ionescu-Tulcea (1969) were the first to observe that the classical Dunford-Pettis theorem (see Dunford (1935)) can be extended to X-valued functions with X being a reflexive Banach space, after some straightforward modifications in the original proof (see Theorem 2.3.24). The proof of Proposition 2.3.31 can be found in Denkowski, Mig´orski & Papageorgiou (2003a, p. 484). The notion of biting convergence (see Definition 2.3.35) is due to Chacon (see Brooks & Chacon (1980) and Ball & Murat (1989)). In Brooks & Chacon (1980), we can find the original version of Theorem 2.3.26 (Biting Theorem). Property U (see Definition 2.3.33) is natural in the context of solution flows of a differential equation. Theorem 2.3.37 is due to Gutman (1985). Extensions of Proposition 2.3.39 to Banach space valued functions can be found in Rzezuchowski


(1989). The notation for the various spaces of continuous functions is not standard (see, e.g., Hewitt & Stromberg (1975, p. 86)). For a proof of Theorem 2.3.41 (Riesz-Markov representation theorem) we refer to Denkowski, Migórski & Papageorgiou (2003a, p. 322). Also the names for the various modes of convergence introduced in Definition 2.3.42 vary among authors. So we caution the reader to be careful. Since we are dealing with the space of measures, let us mention two striking results concerning them. Let $(\Omega,\Sigma)$ be a measurable space and let $ca(\Sigma)$ be the space of all signed measures on $\Sigma$ of bounded variation endowed with the total variation norm
\[
\|\mu\|_1\stackrel{df}{=}|\mu|(\Omega)\qquad\forall\ \mu\in ca(\Sigma).
\]
We can also introduce another norm given by
\[
\|\mu\|_\infty\stackrel{df}{=}\sup_{A\in\Sigma}|\mu(A)|\qquad\forall\ \mu\in ca(\Sigma).
\]
Then
\[
\|\mu\|_\infty\le\|\mu\|_1\le 4\|\mu\|_\infty\qquad\forall\ \mu\in ca(\Sigma)
\]
(i.e., the two norms are equivalent). The space $\big(ca(\Sigma),\|\cdot\|_1\big)$ is a Banach space. The first result is a remarkable improvement of the Uniform Boundedness Principle and is known as “Nikodym’s boundedness theorem.”

PROPOSITION 2.7.2 If $\{\mu_s\}_{s\in S}\subseteq ca(\Sigma)$ and
\[
\sup_{s\in S}|\mu_s(A)|<+\infty\qquad\forall\ A\in\Sigma,
\]
then
\[
\sup_{s\in S,\ A\in\Sigma}|\mu_s(A)|<+\infty.
\]

The second result is known as “Nikodym’s convergence theorem.”

PROPOSITION 2.7.3 If $\{\mu_n\}_{n\ge 1}\subseteq ca(\Sigma)$ and
\[
\lim_{n\to+\infty}\mu_n(A)=\mu(A)\ \text{ exists}\qquad\forall\ A\in\Sigma,
\]
then $\mu\in ca(\Sigma)$ and moreover, if $\mu_n\ll\lambda$ for all $n\ge 1$ with $\lambda\in ca(\Sigma)$, then $\mu\ll\lambda$.

Both results can be found in Diestel (1984, pp. 80 and 90) and Dunford & Schwartz (1958, pp. 309 and 321).


For a proof of Theorem 2.3.48 we refer to Denkowski, Migórski & Papageorgiou (2003a, p. 198) and Parthasarathy (1967, p. 45). Proposition 2.3.49 is due to Brézis & Lieb (1983). Finally we state a compactness result concerning vector measures. The result is known as “Lyapunov’s convexity theorem” and has important ramifications in Control Theory (see Hermes & LaSalle (1969)).

THEOREM 2.7.4 Let $(\Omega,\Sigma)$ be a measurable space.
(a) If $\mu_k:\Sigma\longrightarrow\mathbb{R}$, $k=1,\ldots,N$ are finite nonatomic measures, then
\[
R=\bigcup_{A\in\Sigma}\big(\mu_k(A)\big)_{k=1}^N
\]
is compact and convex in $\mathbb{R}^N$.
(b) If $X$ is a Banach space with the RNP and $m:\Sigma\longrightarrow X$ is a vector measure which is nonatomic and of bounded variation, then
\[
R=\overline{\bigcup_{A\in\Sigma}m(A)}^{\,\|\cdot\|}
\]
is strongly compact and convex.
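The role of the nonatomicity hypothesis in part (a) can be seen already for $N=1$. The two cases below are standard illustrations chosen here for contrast; neither appears in the text, and $\omega_0$ denotes an arbitrary fixed point of $\Omega$.

```latex
% Assumption (illustration only): N = 1; \omega_0 \in \Omega is a fixed point.
\[
\mu = \delta_{\omega_0}
\ \Longrightarrow\
R = \{\mu(A) : A \in \Sigma\} = \{0, 1\} \quad \text{(compact, but not convex)},
\]
\[
\mu = \lambda^{1} \ \text{on } \big([0,1], \mathcal{B}([0,1])\big)
\ \Longrightarrow\
R = [0,1] \quad \text{(compact and convex)} .
\]
```

A single atom already destroys convexity of the range, whereas for Lebesgue measure every intermediate value $t\in[0,1]$ is attained, for instance by $A=[0,t]$.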

2.4: Sobolev spaces were introduced by Sobolev (1963a, 1963b). Related spaces were also studied by Morrey (1940, 1966) and later by Deny & Lions (1953–1954). Today there are many well known books on the subject. We mention Adams (1975), Brézis (1983), Evans & Gariepy (1992), Kufner, John & Fučik (1977), Lions & Magenes (1972), Maz’ja (1985) and Ziemer (1989). We mention that for functions of several variables (i.e., $N>1$), when $p=2$, we use the notation $H^m(Z)$ (respectively $H^m_0(Z)$) for the Sobolev space $W^{m,2}(Z)$ (respectively $W^{m,2}_0(Z)$). However, for functions of one variable (i.e., $N=1$, hence $Z=T=(a,b)$), we keep the notation $W^{m,2}(T)$ (respectively $W^{m,2}_0(T)$). Theorem 2.4.13 is due to Meyers & Serrin (1964). The result is often called “local approximation theorem.” A discussion of the various geometric conditions imposed on the boundary $\partial Z$ can be found in Adams (1975, pp. 66–67). For a proof of the approximation result given in Theorem 2.4.17, we refer to Evans & Gariepy (1992, p. 127). To see that without further conditions on the domain $Z$, Theorem 2.4.17 is not true, consider the following example.

EXAMPLE 2.7.5 Let
\[
Z\stackrel{df}{=}\big\{(z_1,z_2)\in\mathbb{R}^2:\ 0<|z_1|<1,\ 0<z_2<1\big\}
\]
and
\[
u(z_1,z_2)\stackrel{df}{=}
\begin{cases}
1 & \text{if } z_1>0,\\
0 & \text{if } z_1<0.
\end{cases}
\]
Clearly $u\in W^{1,p}(Z)$ ($p\in[1,+\infty)$). However, given $\varepsilon>0$ sufficiently small, it is easy to see that we cannot find $\vartheta\in C^1(\overline{Z})$, such that $\|u-\vartheta\|_{W^{1,p}(Z)}<\varepsilon$. Note that this particular $Z$ lies on both sides of its boundary.


Another approximation result, useful in optimal control problems, is given below. First a definition.

DEFINITION 2.7.6 Let $Z$ be an open set. We say that $u:Z\longrightarrow\mathbb{R}$ is affine, if it is the restriction to $Z$ of an affine function over $\mathbb{R}^N$. We say that $u:Z\longrightarrow\mathbb{R}$ is piecewise affine, if it is continuous and there exists a partition of $Z$ into a Lebesgue-null set and a finite number of open sets on which $u$ is affine.

REMARK 2.7.7 If $u:Z\longrightarrow\mathbb{R}$ is affine, then
\[
Du=\text{constant}
\]
and the converse is true, if $Z$ is connected. We have
\[
u(z)=\big(Du(z),z\big)_{\mathbb{R}^N}+c,
\]
with $c\in\mathbb{R}$.

PROPOSITION 2.7.8 If $Z\subseteq\mathbb{R}^N$ is a bounded open set which is Lipschitz and $u\in W^{1,p}_0(Z)$ ($p\in(1,+\infty)$), then we can find a sequence $\{u_n\}_{n\ge 1}$ of piecewise affine functions over $Z$, null on $\partial Z$ (i.e., $\{u_n\}_{n\ge 1}\subseteq W^{1,p}_0(Z)$), such that
\[
u_n\longrightarrow u\quad\text{in } W^{1,p}_0(Z).
\]

For the case $p=+\infty$, there is the following approximation result.

PROPOSITION 2.7.9 If $Z\subseteq\mathbb{R}^N$ is a bounded open set which is Lipschitz and $u\in W^{1,\infty}(Z)$, then there exists a sequence $\{u_n,Z_n\}_{n\ge 1}$ where $u_n\in W^{1,\infty}(Z)$, $Z_n\subseteq Z$ are open,
\[
\begin{aligned}
&Z_n\subseteq Z_{n+1}\quad\forall\ n\ge 1,\qquad\lambda^N(Z\setminus Z_n)\longrightarrow 0,\\
&u_n|_{Z_n}\ \text{are piecewise affine},\\
&u_n(z)=u(z)\quad\forall\ z\in\partial Z,\ n\ge 1,\\
&u_n\longrightarrow u\quad\text{uniformly on } Z,\\
&Du_n(z)\longrightarrow Du(z)\quad\text{for a.a. } z\in Z
\end{aligned}
\]
and $\|Du_n\|_\infty\le\|Du\|_\infty+\varepsilon(n)$, with $\varepsilon(n)\longrightarrow 0$ as $n\to+\infty$.


REMARK 2.7.10 Recall that, if $u\in W^{1,\infty}(Z)$, then it is Lipschitz continuous on $Z$ and so it can be extended continuously to $\overline{Z}$ (i.e., $W^{1,\infty}(Z)\subseteq C(\overline{Z})$). So the boundary values of $u$ are well defined.

Both the previous approximation results can be found in Ekeland & Temam (1976, pp. 316–317). A detailed discussion of Sobolev spaces of fractional order and on manifolds can be found in Adams (1975) and Kufner, John & Fučik (1977). Theorem 2.4.54 can be found in Kenmochi (1975) and Casas & Fernández (1989). Finally we mention a Proposition useful in the interpretation of the variational formulation of various equations, such as the Navier-Stokes equation. The result is due to de Rham (1955).

PROPOSITION 2.7.11 If $Z\subseteq\mathbb{R}^N$ is an open set and $u=(u_k)_{k=1}^N\in\mathcal{D}(Z;\mathbb{R}^N)^*$, then a necessary and sufficient condition that $u=Dh$ for some $h\in\mathcal{D}(Z)^*$ is that
\[
\langle u,\vartheta\rangle=0\qquad\forall\ \vartheta\in V\stackrel{df}{=}\big\{\vartheta\in\mathcal{D}(Z;\mathbb{R}^N):\ \mathrm{div}\,\vartheta=0\big\}.
\]

REMARK 2.7.12 Note that the divergence operator div maps $W^{1,p}_0(Z;\mathbb{R}^N)$ onto the space
\[
V\stackrel{df}{=}\bigg\{h\in L^p(Z):\ \int_Z h(z)\,dz=0\bigg\}=L^p(Z)/\mathbb{R}
\]
(recall that $-\mathrm{div}$ is the adjoint of the gradient operator).

For the proof of the trace theorem (see Theorem 2.4.50), we refer to Adams (1975, p. 216) and Kufner, John & Fučik (1977, p. 337) and for the proof of the Extension Theorem (see Theorem 2.4.55), we refer to Brézis (1983, p. 158).

2.5: Theorem 2.5.3 is the classical “Sobolev inequality” (see Sobolev (1963a, 1963b)), which was also developed by Gagliardo (1958), Morrey (1940, 1966) and Nirenberg (1959). The proof given here is due to Nirenberg (1959). For the Poincaré inequality (see Theorem 2.5.4) and the Poincaré-Wirtinger inequality (see Theorem 2.5.21) we refer to Meyers (1978). For the proof of Proposition 2.5.8 we refer to Maz’ja (1985, p. 27). The Sobolev embedding theorem (see Theorem 2.5.16) originated in the work of Sobolev (1963a), with important refinements by Morrey (1940) and Gagliardo (1958). The Rellich-Kondrachov embedding theorem (see Theorem 2.5.17) originated in a paper by Rellich (1930) for $p=2$ and by Kondrachov (1945) for the general case. For the proofs of both Theorems 2.5.16 and 2.5.17 we refer to Brézis (1983, pp. 168–170). There are variations of this theorem with interesting applications, like the following one due to Frehse (1984).


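PROPOSITION 2.7.13 If $Z\subseteq\mathbb{R}^N$ is a bounded open set, $\{u_n\}_{n\ge 1}\subseteq W^{1,p}(Z)$ (with $p\in[1,+\infty)$) is a bounded sequence and
\[
\int_Z\|Du_n\|^{p-2}\big(Du_n,Dh\big)_{\mathbb{R}^N}\,dz\le M\|h\|_\infty\qquad\forall\ n\ge 1,\ h\in W^{1,p}(Z)\cap L^\infty(Z),
\]
for some $M>0$, then there exist $u\in W^{1,p}(Z)$ and a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$, such that
\[
u_{n_k}\longrightarrow u\quad\text{in } W^{1,r}(Z)\qquad\forall\ r<p.
\]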

For the proof of Theorem 2.5.24, we refer to Adams (1975, p. 79). Theorem 2.5.28 is another refinement of the Rellich-Kondrachov theorem (and simultaneously of the Egorov theorem; see Theorem A.2.10) and can be found in Evans (1990, p. 8). Theorem 2.5.30 (the Concentration-Compactness Lemma) is due to Lions (1985a, 1985b) and is important in the study of elliptic differential equations involving critical exponents. For additional results in this direction we refer to the work of Ben Naoum, Troestler & Willem (1996) and Bianchi, Chabrowski & Szulkin (1995) and the monographs of Evans (1990) and Willem (1996). Propositions 2.5.34 and 2.5.35 can be found in Mawhin & Willem (1989). 2.6: For the Lp -differentiability and λN -a.e. differentiability of Sobolev functions we refer to Bagby & Ziemer (1974), Liu (1977) and Resetnjak (1969) and the books of Evans & Gariepy (1992), Federer (1969), Simon (1983), Stein (1970) and Ziemer (1989). Proposition 2.6.3 is due to Marcus & Mizel (1972), where the interested reader can find additional results in this direction. Proposition 2.6.4 is due to Marcus & Mizel (1979), where the authors prove that the result is also valid for p = 1. Functions of bounded variation on R were introduced by Jordan (1881), who placed integration within the context of a “measurable” set. Lebesgue (1910) proved that a function of bounded variation on R is almost everywhere differentiable (for a proof which does not use measure theory – except sets of measure zero – we refer to Riesz & Nagy (1955, pp. 3–10)). Before the formal introduction of distributions, extensions of the notion of bounded variation to functions of many variables were suggested by Tonelli (1926) and Cesari (1936). It involved consideration of functions along the coordinate axes. Theorem 2.6.17 is due to Krickerberg (1957). 
The theory of sets of finite perimeter was introduced by Caccioppoli (1953) and De Giorgi (1954, 1955) (where one can find the Co-Area Formula for BV -functions; see Theorem 2.6.22). The proof of Theorem 2.6.22 can be also found in Evans & Gariepy (1992, p. 185) and Ziemer (1989, p. 231). Further contributions were made by Federer (1958), Fleming (1960) and Krickerberg (1957). More details on the space of BV -functions can be found in the books of Evans & Gariepy (1992), Giusti (1984) and Ziemer (1989).

Chapter 3 Nonlinear Operators and Young Measures

In this chapter we study certain nonlinear operators which arise in applications and we also discuss the so-called Young measures, which roughly speaking capture the limits of minimizing sequences in variational problems which do not have a solution. For some cases we also develop the corresponding linear theory in order to have a complete picture of the theory, see the similarities and differences of the two and appreciate the limitations of the nonlinear theory. In Section 3.1, we consider compact operators. Compactness was introduced as a first attempt to deal with infinite dimensional nonlinear operator equations. By its nature, compactness approximates infinite objects by finite ones. We see that in the context of compact operators (linear and nonlinear alike) this principle is in general true. We also discuss proper maps, the spectral theory of linear, compact, self-adjoint operators on a Hilbert space and Fredholm operators. A broader framework for the analysis of infinite dimensional problems is provided by monotone operators, which extend to an infinite dimensional context, the simple notion of an increasing real function. In Section 3.2 we examine monotone operators from a Banach space into its dual, with special emphasis on maximal monotone operators, which are a generalization of a continuous increasing real function. Maximal monotone operators have remarkable surjectivity properties. We point out that surjectivity results are important because they correspond to existence results for certain classes of nonlinear operator equations. At the end of the section we also discuss generalizations of the notion of monotonicity. These are the so-called operators of monotone type, the most important of which are the pseudomonotone operators. Monotone operators map a Banach space to its dual. If instead we want to consider nonlinear operators mapping a Banach space to itself, we need to consider accretive and m-accretive operators. 
Their importance comes from the fact that they are the generators of linear and nonlinear semigroups, which, roughly speaking, are an abstraction of the trajectories of a given differential equation. In Section 3.3 first we examine accretive operators and then we look at semigroups of operators generated by certain accretive operators. We present in detail both the linear and nonlinear theories. Undoubtedly the most common nonlinear operator is the so-called Nemytskii operator (or superposition)


operator. In Section 3.4 we examine this operator and we have a first look at integral functionals corresponding to normal integrands. In a variational problem, when the objective functional is not inf-compact, a solution does not exist. Nevertheless, the minimizing sequences (or appropriate subsequences of them) have a limit behaviour (usually more and more oscillating), which is captured by embedding the original functions to the space of Young measures (or parametrized probabilities). This embedding leads to a larger inf-compact problem which has a solution (relaxation). In Section 3.5 we discuss the theory of the Young measures and obtain additional lower semicontinuity results for integral functionals. Some of the topics of this chapter will be revisited in the course of the next chapter.

3.1 Compact and Fredholm Operators

The first efforts to solve nonlinear functional equations involved various aspects of compactness. For this reason compact operators were introduced. They constitute a class of maps to which we can generalize several of the results which are valid for maps between finite dimensional Banach spaces. Degree theory and fixed point theory, which provide important tools for the study of functional equations, depend on the notion of compact maps.

DEFINITION 3.1.1 Let X, Y be two Banach spaces and let D be a subset of X. We say that $f : D \to Y$ is compact, if it is continuous and for every bounded set $B \subseteq D$, the set $\overline{f(B)}$ is compact in Y. We denote the set of compact maps by $\mathcal{K}(D;Y)$. Also if $D = X$, we set
\[ \mathcal{L}_c(X;Y) \stackrel{df}{=} \mathcal{K}(X;Y) \cap \mathcal{L}(X;Y). \]

REMARK 3.1.2 Evidently $\mathcal{K}(D;Y)$ is a linear space which is closed under composition with continuous bounded maps. If $\dim Y < +\infty$, then every continuous bounded map $f : D \to Y$ is compact. In the sequel we shall see that the space $\mathcal{K}(D;Y)$ consists of precisely those maps which can be approximated by mappings with a finite dimensional range (see Theorem 3.1.10). Note that if $L : X \to Y$ is linear and maps bounded sets in X into relatively compact sets in Y, then $L \in \mathcal{L}_c(X;Y)$ (i.e., L is also continuous). Finally, if $L \in \mathcal{L}_c(X;Y)$, then L has a separable range.


Another notion involving compactness is given in the next definition.

DEFINITION 3.1.3 Let X, Y be two Banach spaces and let D be a subset of X. We say that $f : D \to Y$ is completely continuous, if for every sequence $\{x_n\}_{n \geq 1} \subseteq D$, such that
\[ x_n \xrightarrow{w} x \quad \text{in } X, \]
for some $x \in D$, we have that
\[ f(x_n) \to f(x) \quad \text{in } Y \]
(i.e., f is sequentially continuous from D with the relative weak topology of X into Y with the norm topology).

REMARK 3.1.4 A completely continuous linear operator $L : X \to Y$ is also known as a Dunford-Pettis operator and is of course continuous. In general the classes of compact maps and completely continuous maps are not comparable. However, for linear operators the situation is better. We can establish that complete continuity actually lies properly between compactness and boundedness.

PROPOSITION 3.1.5 If X, Y are two Banach spaces and $L \in \mathcal{L}_c(X;Y) = \mathcal{K}(X;Y) \cap \mathcal{L}(X;Y)$, then L is completely continuous.

PROOF If
\[ x_n \xrightarrow{w} x \quad \text{in } X, \]
then the sequence $\{x_n\}_{n \geq 1} \subseteq X$ is bounded. Because $L \in \mathcal{K}(X;Y)$, we have that $\overline{\{L(x_n)\}_{n \geq 1}}^{\|\cdot\|_Y}$ is compact in Y. Thus we can find a subsequence $\{x_{n_k}\}_{k \geq 1}$ of $\{x_n\}_{n \geq 1}$, such that
\[ L(x_{n_k}) \to y \quad \text{in } Y. \]
But because $L \in \mathcal{L}(X;Y)$, we also have
\[ L(x_n) \xrightarrow{w} L(x) \quad \text{in } Y. \]
Therefore $y = L(x)$. Since the same argument applies to every subsequence of $\{x_n\}_{n \geq 1}$, we conclude that
\[ L(x_n) \to L(x) \quad \text{in } Y, \]
i.e., L is completely continuous.
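The difference between weak and norm convergence that drives Proposition 3.1.5 can be sketched numerically. The following is only an illustration under stated assumptions: $l^2$ is replaced by a finite truncation of dimension `dim`, the test functional `y` is an arbitrary choice, and the compact operator is taken to be the diagonal map $x \mapsto \{x_n/2^n\}_{n \geq 1}$ (the operator of Example 3.1.25). The unit vectors $e_n$ pair to $0$ against a fixed element, yet keep norm $1$, while their images under the diagonal map tend to $0$ in norm.

```python
import math

def diag_op(x):
    """Apply the diagonal map x -> (x_n / 2**n), n = 1, 2, ..., to a finite vector."""
    return [xn / 2 ** (n + 1) for n, xn in enumerate(x)]

def unit_vector(n, dim):
    """n-th standard unit vector (1-based index) in R**dim."""
    return [1.0 if k == n - 1 else 0.0 for k in range(dim)]

def norm(x):
    """Euclidean norm, standing in for the l^2 norm on the truncation."""
    return math.sqrt(sum(t * t for t in x))

def pairing(y, x):
    """Duality pairing <y, x> on the truncation."""
    return sum(a * b for a, b in zip(y, x))

dim = 40                                   # truncation size (an assumption)
y = [1.0 / (k + 1) for k in range(dim)]    # a fixed element of l^2 (hypothetical choice)

# <y, e_n> = 1/n tends to 0, which is what weak convergence e_n -> 0 predicts.
pairings = [pairing(y, unit_vector(n, dim)) for n in range(1, dim + 1)]
# ||e_n|| = 1 for all n, so the identity does not turn e_n into a norm-null sequence.
norms_identity = [norm(unit_vector(n, dim)) for n in range(1, dim + 1)]
# ||L(e_n)|| = 2**(-n) tends to 0: the diagonal map sends e_n to a norm-null sequence.
norms_image = [norm(diag_op(unit_vector(n, dim))) for n in range(1, dim + 1)]
```

A finite truncation cannot prove anything about $l^2$ itself; the lists above only display the pattern that the proposition asserts.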


The converse of the above Proposition is not in general true.

EXAMPLE 3.1.6 Recall that the Banach space $l^1$ has the Schur property, namely if $x_n \xrightarrow{w} x$ in $l^1$, then $x_n \to x$ in $l^1$. Using this we see that the identity map $i : l^1 \to l^1$ is a completely continuous linear operator which is not compact.

However, if we strengthen the condition on the space X, the situation improves.

PROPOSITION 3.1.7 If X is a reflexive Banach space, Y is a Banach space, $D \subseteq X$ is a nonempty, closed set, and $f : D \to Y$ is completely continuous, then $f \in \mathcal{K}(D;Y)$.

PROOF Clearly f is continuous. Let $B \subseteq D$ be a bounded set. We need to show that $\overline{f(B)}$ is compact in Y. To this end let $\{y_n\}_{n \geq 1} \subseteq f(B)$. Then
\[ y_n = f(x_n) \quad \text{with } x_n \in B \quad \forall\, n \geq 1. \]
Since X is reflexive, by passing to a subsequence if necessary, we may assume that $x_n \xrightarrow{w} x$ in D. Then
\[ f(x_n) \to f(x) \quad \text{in } Y \]
and so $\overline{f(B)}$ is indeed compact in Y.

Combining Propositions 3.1.5 and 3.1.7, we have the following.

COROLLARY 3.1.8 If X is a reflexive Banach space, Y is a Banach space and $L \in \mathcal{L}(X;Y)$, then L is compact if and only if L is completely continuous.

REMARK 3.1.9 In both Proposition 3.1.7 and Corollary 3.1.8 the condition that X is reflexive cannot be relaxed (see Example 3.1.6).


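The next theorem gives a characterization of compact maps defined on a bounded set, which explains why compact maps are the suitable class to extend the properties of maps between finite dimensional Banach spaces.

THEOREM 3.1.10 If X, Y are two Banach spaces, $D \subseteq X$ is a bounded set and $f : D \to Y$, then the following are equivalent:
(a) $f \in \mathcal{K}(D;Y)$;
(b) given $\varepsilon > 0$ we can find a continuous, bounded map $f_\varepsilon : D \to Y$, such that
\[ \|f(x) - f_\varepsilon(x)\|_Y < \varepsilon \quad \forall\, x \in D, \]
$f_\varepsilon(D) \subseteq \mathrm{conv}\, f(D)$ and $\dim \mathrm{span}\, f_\varepsilon(D) < +\infty$.

PROOF “(a)=⇒(b)”: Since $f \in \mathcal{K}(D;Y)$, we have that the set $\overline{f(D)}$ is compact in Y. So given $\varepsilon > 0$, we can find $\{y_k\}_{k=1}^m \subseteq Y$, such that
\[ \overline{f(D)} \subseteq \bigcup_{k=1}^m B_\varepsilon(y_k). \]
Let
\[ a_k(y) \stackrel{df}{=} \max\{\varepsilon - \|y - y_k\|_Y,\, 0\} \quad \text{and} \quad \vartheta_k(y) \stackrel{df}{=} \frac{a_k(y)}{\sum_{i=1}^m a_i(y)} \quad \forall\, y \in \overline{f(D)}. \]
We define
\[ f_\varepsilon(x) \stackrel{df}{=} \sum_{k=1}^m \vartheta_k\big(f(x)\big)\, y_k \quad \forall\, x \in D. \]
Evidently the function $f_\varepsilon : D \to Y$ is continuous, $f_\varepsilon(D) \subseteq \mathrm{span}\,\{y_k\}_{k=1}^m$, the set $\overline{f_\varepsilon(D)}$ is compact and
\[ \|f(x) - f_\varepsilon(x)\|_Y = \frac{1}{\sum_{k=1}^m a_k(f(x))} \Big\| \sum_{k=1}^m a_k\big(f(x)\big)\big(y_k - f(x)\big) \Big\|_Y < \frac{\sum_{k=1}^m a_k\big(f(x)\big)\,\varepsilon}{\sum_{k=1}^m a_k\big(f(x)\big)} = \varepsilon \quad \forall\, x \in D. \]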
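The construction used in the “(a)=⇒(b)” step can be sketched in a toy setting. Everything concrete below is an assumption made for illustration: $D = [0,1]$, $Y = \mathbb{R}^2$, the map $f(x) = (\cos x, \sin x)$, the value $\varepsilon = 0.1$ and the choice of the centers $y_k$ as values of f on a grid (fine enough to form an $\varepsilon$-net of $f(D)$, because f is 1-Lipschitz). The code builds the weights $a_k$, the normalized weights $\vartheta_k$ and the map $f_\varepsilon$, and then measures the uniform error on a sample of D.

```python
import math

def f(x):
    """Toy continuous map D = [0, 1] -> R^2 (hypothetical example)."""
    return (math.cos(x), math.sin(x))

def dist(u, v):
    """Euclidean distance in R^2."""
    return math.hypot(u[0] - v[0], u[1] - v[1])

def make_approximation(f, centers, eps):
    """Return f_eps(x) = sum_k theta_k(f(x)) * y_k for the given eps-net centers."""
    def f_eps(x):
        y = f(x)
        # a_k(y) = max(eps - ||y - y_k||, 0)
        weights = [max(eps - dist(y, c), 0.0) for c in centers]
        total = sum(weights)  # positive, since the centers form an eps-net of f(D)
        return tuple(
            sum(w * c[i] for w, c in zip(weights, centers)) / total
            for i in range(2)
        )
    return f_eps

eps = 0.1
centers = [f(k / 10) for k in range(11)]   # grid spacing 0.1, f is 1-Lipschitz
f_eps = make_approximation(f, centers, eps)

samples = [j / 1000 for j in range(1001)]
max_error = max(dist(f(x), f_eps(x)) for x in samples)
```

Each value $f_\varepsilon(x)$ is a convex combination of those centers that lie within distance $\varepsilon$ of $f(x)$, which is why the sampled error stays below $\varepsilon$.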

“(b)=⇒(a)”: Let $\varepsilon_n \stackrel{df}{=} \frac{1}{n}$ and let $f_{\varepsilon_n} = f_n$ be the continuous, bounded map with finite dimensional range postulated by statement (b). Then f, being the uniform limit of the sequence $\{f_n\}_{n \geq 1}$ of continuous maps, is itself continuous. Also let $y = f(x)$ with $x \in D$. We have $\|y - y_n\|_Y$

1 being strongly convergent in Y , imply xn −→ x

in X,

then f is proper. PROOF First suppose that hypothesis (i) holds. Let C ⊆ Y be a compact set. We need to show that f −1 (C) is compact in X. Let {xn }n>1 ⊆ f −1 (C). Then f (xn ) = yn ∈ C ∀ n > 1. Because C ⊆ Y is compact, by passing to a suitable subsequence if necessary, we may assume that yn −→ y ∈ C in Y. The weak coercivity of f implies that the sequence {xn }n>1 ⊆ X is bounded. © ª So the sequence u(xn ) n>1 ⊆ Y is relatively compact and we may assume that u(xn ) −→ z in Y. Then g(xn ) = f (xn ) − u(xn ) −→ y − z

in Y.

Because g is proper, it follows that the sequence {xn }n>1 has a subsequence {xnk }k>1 , such that xnk −→ x in X. Therefore f (xnk ) −→ f (x) and so y = f (x), i.e., x ∈ f −1 (C), which proves the properness of f .


Next suppose that hypothesis (ii) holds. Again
\[ f(x_n) = y_n \to y \quad \text{in } Y \]
and due to the weak coercivity of f, the sequence $\{x_n\}_{n \geq 1} \subseteq X$ is bounded. Because X is reflexive, we may assume that
\[ x_n \xrightarrow{w} x \quad \text{in } X. \]
Then hypothesis (ii) implies that $x_n \to x$ in X and so
\[ y_n = f(x_n) \to f(x) \quad \text{in } Y, \]
hence $y = f(x)$. This proves the properness of f.

Compactness and properness are related as follows.

PROPOSITION 3.1.17 If X is a Banach space, $D \subseteq X$ is a closed, bounded set and $f \in \mathcal{K}(D;X)$, then $\mathrm{id}_X - f$ is proper ($\mathrm{id}_X$ is the identity operator on X).

PROOF Let $C \subseteq X$ be a compact set and let $\{x_n\}_{n \geq 1} \subseteq (\mathrm{id}_X - f)^{-1}(C)$. Then
\[ x_n - f(x_n) = c_n, \quad c_n \in C \quad \forall\, n \geq 1. \]
Since C is compact and $f \in \mathcal{K}(D;X)$, by passing to a suitable subsequence if necessary, we may assume that
\[ c_n \to c \in C \quad \text{and} \quad f(x_n) \to y \quad \text{in } X. \]
Then $x_n = c_n + f(x_n) \to c + y = x$ in X and so $f(x_n) \to f(x)$ in X. Thus $y = f(x)$ and we have $c = x - f(x)$, hence
\[ x \in (\mathrm{id}_X - f)^{-1}(C), \]
which shows that $(\mathrm{id}_X - f)^{-1}(C)$ is compact.


Next we have a closer look at the space of compact linear operators $\mathcal{L}_c(X;Y)$.

PROPOSITION 3.1.18 If X, Y are two Banach spaces, then $\mathcal{L}_c(X;Y)$ with the operator norm is a Banach space.

PROOF Clearly $\mathcal{L}_c(X;Y)$ is a linear subspace of $\mathcal{L}(X;Y)$ (see also Remark 3.1.2). Because $\mathcal{L}(X;Y)$ with the operator norm is a Banach space, it suffices to show that $\mathcal{L}_c(X;Y)$ is closed in $\mathcal{L}(X;Y)$. So let $\{L_n\}_{n \geq 1} \subseteq \mathcal{L}_c(X;Y)$ and suppose that
\[ \|L_n - L\|_{\mathcal{L}} \to 0. \tag{3.1} \]
We need to show that $L \in \mathcal{L}_c(X;Y)$. Because of (3.1), we have that
\[ \sup_{\|x\|_X \leq 1} \|L_n(x) - L(x)\|_Y \to 0. \]
So given $\varepsilon > 0$, we can find $n_0 = n_0(\varepsilon) \geq 1$, such that
\[ \|L_n(x) - L(x)\|_Y < \frac{\varepsilon}{2} \quad \forall\, n \geq n_0,\ \|x\|_X \leq 1. \]
If
\[ B_1^X \stackrel{df}{=} \{x \in X : \|x\|_X < 1\}, \]
then the set $L_{n_0}(B_1^X)$ is relatively compact. So we can find a finite set $F \subseteq Y$, such that
\[ L_{n_0}(B_1^X) \subseteq \bigcup_{y \in F} B_{\varepsilon/2}(y). \]
We claim that
\[ L(B_1^X) \subseteq \bigcup_{y \in F} B_\varepsilon(y). \]
Indeed for a given $x \in B_1^X$, we can find $y \in F$, such that
\[ \|L_{n_0}(x) - y\|_Y < \frac{\varepsilon}{2}. \]
Then
\[ \|L(x) - y\|_Y \leq \|L(x) - L_{n_0}(x)\|_Y + \|L_{n_0}(x) - y\|_Y < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
which shows that $L(B_1^X)$ is totally bounded, thus relatively compact.

REMARK 3.1.19 Since the composition of two operators, one of which is compact, is again a compact operator, we infer that $\mathcal{L}_c(X)$ is a closed two-sided ideal of the Banach algebra $\mathcal{L}(X)$. Moreover, it is clear that if $L \in \mathcal{L}_c(X)$ and $\dim X = +\infty$, then L has no inverse in $\mathcal{L}(X)$ (i.e., L is a singular operator).


The next characterization of the elements in the Banach space $\mathcal{L}_c(X;Y)$ is known as “Schauder’s theorem.” First a definition.

DEFINITION 3.1.20 If X, Y are two Banach spaces and $L \in \mathcal{L}(X;Y)$, then its adjoint $L^* : Y^* \to X^*$ is the linear operator given by
\[ L^*(y^*) \stackrel{df}{=} y^* \circ L \quad \forall\, y^* \in Y^*, \]
i.e.,
\[ \langle L^*(y^*), x \rangle_X = \langle y^*, L(x) \rangle_Y \quad \forall\, x \in X,\ y^* \in Y^*, \]
where by $\langle \cdot, \cdot \rangle_Z$ we denote the duality brackets for the pair $(Z, Z^*)$ for any Banach space Z.

REMARK 3.1.21 Clearly
\[ L^* \in \mathcal{L}(Y^*; X^*) \quad \text{and} \quad \|L\|_{\mathcal{L}} = \|L^*\|_{\mathcal{L}}. \]
So the map $L \mapsto L^*$ is an isometric isomorphism from $\mathcal{L}(X;Y)$ into $\mathcal{L}(Y^*;X^*)$. Moreover, $(L^{-1})^* = (L^*)^{-1}$ and $L^*(Y^*)$ is closed if and only if it is $w^*$-closed.

THEOREM 3.1.22 (Schauder Theorem) If X, Y are two Banach spaces and $L \in \mathcal{L}(X;Y)$, then $L \in \mathcal{L}_c(X;Y)$ if and only if $L^* \in \mathcal{L}_c(Y^*;X^*)$.

PROOF “=⇒”: We need to show that $L^*\big(B_1^{Y^*}\big)$ is relatively compact, where
\[ B_1^{Y^*} \stackrel{df}{=} \{y^* \in Y^* : \|y^*\|_{Y^*} < 1\}. \]
Let $\{y_n^*\}_{n \geq 1} \subseteq B_1^{Y^*}$ be a sequence. Consider the elements $y_n^*$ for $n \geq 1$, restricted on the set $\overline{L(B_1^X)}$, which is compact in Y. Clearly the sequence $\{y_n^*\}_{n \geq 1} \subseteq C\big(\overline{L(B_1^X)}\big)$ is bounded and equicontinuous (see Definition A.1.15). So by the Arzela-Ascoli theorem (see Theorem 2.3.2), the sequence $\{y_n^*\}_{n \geq 1} \subseteq C\big(\overline{L(B_1^X)}\big)$ is relatively compact. Hence we can find a subsequence $\{y_{n_k}^*\}_{k \geq 1}$ of $\{y_n^*\}_{n \geq 1}$, such that
\[ \sup_{x \in B_1^X} \big| \langle y_{n_k}^*, L(x) \rangle_Y - \langle y_{n_m}^*, L(x) \rangle_Y \big| \to 0 \quad \text{as } k, m \to +\infty. \]
So
\[ \lim_{k,m \to +\infty} \|L^*(y_{n_k}^*) - L^*(y_{n_m}^*)\|_{X^*} = \lim_{k,m \to +\infty} \sup_{x \in B_1^X} \big| \langle L^*(y_{n_k}^*) - L^*(y_{n_m}^*), x \rangle_X \big| = \lim_{k,m \to +\infty} \sup_{x \in B_1^X} \big| \langle y_{n_k}^* - y_{n_m}^*, L(x) \rangle_Y \big| = 0. \]
Therefore $\{L^*(y_{n_k}^*)\}_{k \geq 1} \subseteq X^*$ is a Cauchy sequence and so it is convergent. This implies that the set $\overline{L^*\big(B_1^{Y^*}\big)}$ is compact in $X^*$ and so $L^* \in \mathcal{L}_c(Y^*;X^*)$.

“⇐=”: Let $r : X \to X^{**}$ be the canonical embedding of X into $X^{**}$. We have
\[ L^{**} r = r L. \]
So identifying X with $r(X)$, we have that $L^{**}|_X = L$. From the first part of the proof and since by hypothesis $L^* \in \mathcal{L}_c(Y^*;X^*)$, we have that
\[ \overline{L^{**}\big(B_1^{X^{**}}\big)} \subseteq Y^{**} \quad \text{is compact}, \]
where
\[ B_1^{X^{**}} \stackrel{df}{=} \{x^{**} \in X^{**} : \|x^{**}\|_{X^{**}} < 1\}. \]
Since $\overline{L^{**}(B_1^X)}$ is a closed subset of $\overline{L^{**}\big(B_1^{X^{**}}\big)}$, it follows that
\[ \overline{L^{**}(B_1^X)} \subseteq Y^{**} \quad \text{is compact}. \]

But as we already established $L^{**}(B_1^X) = L(B_1^X)$. So $\overline{L(B_1^X)} \subseteq Y$ is compact and we conclude that $L \in \mathcal{L}_c(X;Y)$.

DEFINITION 3.1.23 Let X, Y be two Banach spaces and $L \in \mathcal{L}(X;Y)$. We say that L is a finite rank operator (or finite dimensional operator or degenerate operator), if $\dim L(X) < +\infty$. We denote the space of all finite dimensional operators from X into Y equipped with the norm inherited from $\mathcal{L}(X;Y)$, by $\mathcal{L}_f(X;Y)$. If $L \in \mathcal{L}_f(X;Y)$, then $\mathrm{rank}\, L \stackrel{df}{=} \dim L(X)$.

REMARK 3.1.24 Clearly $\mathcal{L}_f(X;Y) \subseteq \mathcal{L}_c(X;Y)$. The inclusion is in general strict as the next example illustrates.

EXAMPLE 3.1.25 Consider the operator $L \in \mathcal{L}(l^2)$, defined by
\[ L(x) \stackrel{df}{=} \Big\{ \frac{x_n}{2^n} \Big\}_{n \geq 1} \quad \forall\, x = \{x_n\}_{n \geq 1} \in l^2. \]
We claim that $L \in \mathcal{L}_c(l^2) \setminus \mathcal{L}_f(l^2)$. Clearly $L \notin \mathcal{L}_f(l^2)$. So let us show that $L \in \mathcal{L}_c(l^2)$. We need to show that $L\big(B_1^{l^2}\big) \subseteq l^2$ is relatively compact. For a given $\varepsilon > 0$, find $n_0 = n_0(\varepsilon) \geq 1$, such that
\[ \sum_{n = n_0 + 1}^{\infty} \frac{1}{2^n} \leq \varepsilon. \]
The set
\[ C \stackrel{df}{=} \Big\{ \Big( \frac{x_1}{2}, \frac{x_2}{2^2}, \ldots, \frac{x_{n_0}}{2^{n_0}}, 0, \ldots \Big) : |x_n| \leq 1 \Big\} \]
is compact in $l^2$ (view it as a subset of $\mathbb{R}^{n_0}$). So we can find a finite set $F \subseteq C$, such that
\[ C \subseteq \bigcup_{v \in F} B_\varepsilon(v). \]
Let $x \in B_1^{l^2}$. We have
\[ |x_n| \leq 1 \quad \forall\, n \geq 1 \]
and so there exists $v \in F$, such that
\[ \sum_{n=1}^{n_0} \Big| \frac{x_n}{2^n} - v_n \Big|^2 < \varepsilon^2. \]
Then we have
\[ \Big\| \Big\{ \frac{x_n}{2^n} \Big\}_{n \geq 1} - v \Big\|_{l^2}^2 = \sum_{n=1}^{n_0} \Big| \frac{x_n}{2^n} - v_n \Big|^2 + \sum_{n = n_0 + 1}^{\infty} \Big| \frac{x_n}{2^n} \Big|^2 < \varepsilon^2 + \varepsilon^2 = 2\varepsilon^2, \]
so $L\big(B_1^{l^2}\big)$ is relatively compact in $l^2$, hence $L \in \mathcal{L}_c(l^2)$.

Making use of a finite basis to describe the finite dimensional range of $L \in \mathcal{L}_f(X;Y)$, we can easily establish the following result.

PROPOSITION 3.1.26 If X, Y are two Banach spaces and $L \in \mathcal{L}(X;Y)$, then $L \in \mathcal{L}_f(X;Y)$ if and only if $L^* \in \mathcal{L}_f(Y^*;X^*)$. Moreover, $\mathrm{rank}\, L = \mathrm{rank}\, L^*$.

From Theorem 3.1.10 we know that every compact map can be uniformly approximated locally by maps with range in a finite dimensional space. Motivated by this fact, it is natural to ask whether
\[ \mathcal{L}_c(X;Y) = \overline{\mathcal{L}_f(X;Y)}^{\|\cdot\|_{\mathcal{L}}}. \]
In fact for a long time this was one of the major open problems in Banach space theory. But let us formulate the problem precisely. We start with a definition.

DEFINITION 3.1.27 A Banach space Y has the approximation property, if for every Banach space X, we have that
\[ \mathcal{L}_c(X;Y) = \overline{\mathcal{L}_f(X;Y)}^{\|\cdot\|_{\mathcal{L}}}. \]
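For the diagonal operator of Example 3.1.25 the approximation by finite rank operators can be made explicit. The sketch below rests on two assumptions that are not spelled out in the text: the finite rank operators are taken to be the truncations $L_n(x) = (x_1/2, \ldots, x_n/2^n, 0, \ldots)$, and the operator norm of a diagonal operator on $l^2$ is computed as the supremum of the absolute values of its diagonal entries (a standard fact). Only finitely many entries are stored, so `size` is an artificial cut-off.

```python
def diagonal_entries(size):
    """Diagonal entries 1/2, 1/4, ..., 1/2**size of the operator in Example 3.1.25."""
    return [2.0 ** -n for n in range(1, size + 1)]

def truncation_error(entries, rank):
    """Operator norm of L - L_rank for a diagonal L: the largest discarded entry."""
    tail = entries[rank:]
    return max(tail) if tail else 0.0

entries = diagonal_entries(50)   # finite cut-off (an assumption of this sketch)

# ||L - L_n|| = 2**-(n+1) for n = 1, ..., 10, which tends to 0 as n grows.
errors = [truncation_error(entries, n) for n in range(1, 11)]
```

So this particular compact operator is a norm limit of finite rank operators, in line with the question that motivates Definition 3.1.27.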


The famous open (until 1973) problem in Banach space theory is known as the “approximation problem” and asks whether every Banach space Y has the approximation property. It was settled in the negative by Enflo (1973), who found a separable, reflexive Banach space (necessarily infinite dimensional), which lacks the approximation property.

Let us also mention a few things about the spectrum of compact operators. The spaces in much of applied mathematics are actually real vector spaces, as was the case in our study so far. However, to work all the time in real spaces is mathematically inconvenient. Eigenvalue-eigenvector theory is such an instance. The theory is crippled if we insist on real vector spaces. For this reason in the next definition we consider a complex Banach space.

DEFINITION 3.1.28 Let X be a complex Banach space and $L \in \mathcal{L}(X)$. The resolvent set $\rho(L)$ of L is defined by
\[ \rho(L) \stackrel{df}{=} \{\lambda \in \mathbb{C} : (\lambda\,\mathrm{id}_X - L)^{-1} \text{ exists and belongs to } \mathcal{L}(X)\}. \]
The operator
\[ R(\lambda) \stackrel{df}{=} (\lambda\,\mathrm{id}_X - L)^{-1} \]
is called the resolvent of L at $\lambda$. The points of $\rho(L)$ are called regular values of L. The set
\[ \sigma(L) \stackrel{df}{=} \mathbb{C} \setminus \rho(L) \]
is called the spectrum of L. The point spectrum of L is the subset $\sigma_p(L)$ of $\sigma(L)$ defined by
\[ \sigma_p(L) \stackrel{df}{=} \{\lambda \in \sigma(L) : \ker(\lambda\,\mathrm{id}_X - L) \neq \{0\}\}. \]
The elements of $\sigma_p(L)$ are called eigenvalues of L and for each $\lambda \in \sigma_p(L)$ the closed subspace $\ker(\lambda\,\mathrm{id}_X - L)$ of X is the eigenspace corresponding to the eigenvalue $\lambda$, while the nonzero elements of $\ker(\lambda\,\mathrm{id}_X - L)$ are called eigenvectors of L.

REMARK 3.1.29 If $\dim X = +\infty$ and $L \in \mathcal{L}_c(X)$, then $0 \in \sigma(L)$. Indeed, if L has a bounded inverse, then we could define an equivalent norm
\[ |||x|||_X \stackrel{df}{=} \|L^{-1}(x)\|_X \quad \text{on } X, \]
whose closed unit ball is $L\big(\overline{B}_1^X\big)$, the image of the closed unit ball $\overline{B}_1^X$ of X. But the latter set is compact (since $L \in \mathcal{L}_c(X)$) and so $\dim X < +\infty$, a contradiction. Also, if $\dim X < +\infty$, then from linear algebra we know that the operator $\lambda\,\mathrm{id}_X - L$ is invertible (and automatically $(\lambda\,\mathrm{id}_X - L)^{-1}$ is continuous) if and only if $\lambda\,\mathrm{id}_X - L$ is injective. So in this case $\sigma(L) = \sigma_p(L)$. In general we can have that $\sigma_p(L) = \emptyset$ and $\sigma(L) \neq \emptyset$.


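EXAMPLE 3.1.30 Consider the Hilbert space $X = L^2[0,1]$ (over the complex scalars). Let $L \in \mathcal{L}\big(L^2[0,1]\big)$ be defined by
\[ L(x)(t) \stackrel{df}{=} t\,x(t) \quad \forall\, t \in [0,1]. \]
We claim that $\sigma_p(L) = \emptyset$. Indeed, if for some $\lambda \in \mathbb{C}$ we have
\[ \lambda x(t) = t\,x(t) \quad \text{for a.a. } t \in [0,1], \]
then $(\lambda - t)x(t) = 0$ and so
\[ x(t) = 0 \quad \text{for a.a. } t \in [0,1]. \]
On the other hand $[0,1] \subseteq \sigma(L)$. To this end let $\lambda \in [0,1]$ and take $\varepsilon > 0$, such that
\[ [\lambda, \lambda + \varepsilon] \subseteq [0,1] \quad \text{or} \quad [\lambda - \varepsilon, \lambda] \subseteq [0,1]. \]
To fix things, we assume that the first is true. We define
\[ x_\varepsilon(t) \stackrel{df}{=} \begin{cases} \frac{1}{\sqrt{\varepsilon}} & \text{if } t \in [\lambda, \lambda + \varepsilon], \\ 0 & \text{if } t \notin [\lambda, \lambda + \varepsilon]. \end{cases} \]
Then we have
\[ \int_0^1 x_\varepsilon(t)^2\, dt = \int_\lambda^{\lambda + \varepsilon} \frac{1}{\varepsilon}\, dt = 1 \]
and so we have $\|x_\varepsilon\|_2 = 1$. Also
\[ \big(\lambda\,\mathrm{id}_X - L\big)(x_\varepsilon)(t) = (\lambda - t)\,x_\varepsilon(t) \quad \forall\, t \in [0,1]. \]
Therefore
\[ \big\|(\lambda\,\mathrm{id}_X - L)(x_\varepsilon)\big\|_2^2 = \int_\lambda^{\lambda + \varepsilon} \frac{1}{\varepsilon} (\lambda - t)^2\, dt = \frac{\varepsilon^2}{3} \]
and so
\[ \big(\lambda\,\mathrm{id}_X - L\big)(x_\varepsilon) \to 0 \quad \text{in } L^2[0,1] \quad \text{as } \varepsilon \searrow 0. \]
If $\lambda\,\mathrm{id}_X - L$ has a bounded inverse, then
\[ x_\varepsilon = \big(\lambda\,\mathrm{id}_X - L\big)^{-1} (\lambda\,\mathrm{id}_X - L)(x_\varepsilon) \to 0 \quad \text{as } \varepsilon \searrow 0, \]
a contradiction to the fact that $\|x_\varepsilon\|_2 = 1$ for all $\varepsilon > 0$.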
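The value $\varepsilon^2/3$ obtained in Example 3.1.30 can be double-checked by numerical integration. This is only a sanity check under assumptions of our own choosing: the midpoint rule with a fixed number of steps, and the sample values $\lambda = 0.3$ and $\varepsilon \in \{0.4, 0.2, 0.1\}$ (all satisfying $[\lambda, \lambda + \varepsilon] \subseteq [0,1]$).

```python
def residual_norm_sq(lam, eps, steps=20000):
    """Midpoint-rule approximation of the integral of (lam - t)**2 / eps over [lam, lam + eps]."""
    h = eps / steps
    total = 0.0
    for i in range(steps):
        t = lam + (i + 0.5) * h          # midpoint of the i-th subinterval
        total += (lam - t) ** 2 / eps * h
    return total

lam = 0.3                                # hypothetical sample point of [0, 1]
# Pairs (numerical value, closed form eps**2 / 3) for a shrinking eps.
results = [(residual_norm_sq(lam, eps), eps ** 2 / 3) for eps in (0.4, 0.2, 0.1)]
```

The numerical values decrease together with $\varepsilon$, while $\|x_\varepsilon\|_2$ stays equal to 1; this is exactly the behaviour that rules out a bounded inverse of $\lambda\,\mathrm{id}_X - L$.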


Using the theory of analytic functions one can show the following proposition.

PROPOSITION 3.1.31 If X is a complex Banach space and $L \in \mathcal{L}(X)$, then $\sigma(L) \neq \emptyset$.

REMARK 3.1.32 The result is no longer valid if we consider a real Banach space.

PROPOSITION 3.1.33 If X is a complex Banach space and $\lambda \neq 0$ is an eigenvalue of $L \in \mathcal{L}_c(X)$, then $\dim (\lambda\,\mathrm{id}_X - L)^{-1}(0) < +\infty$ (i.e., the eigenspace corresponding to $\lambda$ is finite dimensional).

PROOF Set
\[ N_\lambda \stackrel{df}{=} (\lambda\,\mathrm{id}_X - L)^{-1}(0) \]
and let B be a bounded subset of $N_\lambda$. For each $x \in B$, we have $L(x) = \lambda x$. Since $L \in \mathcal{L}_c(X)$, we have that $\overline{L(B)} \subseteq X$ is compact. Hence $\overline{\lambda B} \subseteq X$ is compact. Since all bounded sets in $N_\lambda$ are relatively compact, it follows that $\dim N_\lambda < +\infty$.

PROPOSITION 3.1.34 If X is a complex Banach space, $L \in \mathcal{L}_c(X)$ and $\varepsilon > 0$, then L has only finitely many linearly independent eigenvectors corresponding to eigenvalues having absolute value larger than $\varepsilon$.

PROOF Suppose that the conclusion fails and let $\{x_n\}_{n \geq 1}$ be a sequence of linearly independent eigenvectors corresponding to eigenvalues $\lambda_k$ (i.e., $L(x_k) = \lambda_k x_k$) satisfying $|\lambda_k| > \varepsilon$. Set
\[ X_n \stackrel{df}{=} \mathrm{span}\,\{x_k\}_{k=1}^n \quad \forall\, n \geq 1. \]
Note that $L(X_n) = X_n$ and use the Riesz lemma (see Proposition A.3.15), to obtain
\[ y_n \in X_n \quad \text{with } \|y_n\| = 1, \quad \text{such that} \quad d\big(y_n, X_{n-1}\big) \geq \frac{1}{2}. \]
Let
\[ u_n \stackrel{df}{=} \frac{y_n}{\lambda_n}. \]
Then
\[ \|u_n\|_X < \frac{1}{\varepsilon} \quad \text{and} \quad L(u_n) \in X_n. \]
Also if $y_n = \sum_{k=1}^n a_k x_k$, we have
\[ L(u_n) - y_n = \sum_{k=1}^{n} \Big( \frac{\lambda_k}{\lambda_n} - 1 \Big) a_k x_k = \sum_{k=1}^{n-1} \Big( \frac{\lambda_k}{\lambda_n} - 1 \Big) a_k x_k \in X_{n-1}. \]
If $n > m$, then
\[ L(u_m) \in X_m \subseteq X_{n-1} \quad \text{and} \quad L(u_n) - y_n \in X_{n-1}. \]
So we have
\[ \|L(u_n) - L(u_m)\| \geq d\big(L(u_n), X_{n-1}\big) = d\big(L(u_n) + y_n - L(u_n), X_{n-1}\big) = d\big(y_n, X_{n-1}\big) \geq \frac{1}{2}, \]
so the sequence $\{L(u_n)\}_{n \geq 1}$ has no convergent subsequence, a contradiction to the fact that $L \in \mathcal{L}_c(X)$.

PROPOSITION 3.1.35 If X is a Banach space, $L \in \mathcal{L}_c(X)$ and $\lambda$ is a nonzero scalar, then $R(\lambda\,\mathrm{id}_X - L)$ is closed.

PROOF

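Without any loss of generality, we may assume that $\lambda = 1$. Set
\[ V \stackrel{df}{=} \mathrm{id}_X - L \quad \text{and} \quad N_1 \stackrel{df}{=} V^{-1}\big(\{0\}\big). \]
If $\lambda \notin \sigma_p(L)$, then $N_1 = \{0\}$. If $\lambda \in \sigma_p(L)$, then $\dim N_1 < +\infty$ (see Proposition 3.1.33). So in both cases we see that $\dim N_1 < +\infty$. Thus we can write that
\[ X = N_1 \oplus E \]
with E being a closed subspace of X. Let
\[ \widehat{V} \stackrel{df}{=} V|_E. \]
We have
\[ V(X) = V(E) = \widehat{V}(E) \quad \text{and} \quad \widehat{V}^{-1}\big(\{0\}\big) = V^{-1}\big(\{0\}\big) \cap E = \{0\}. \]
This shows that $\widehat{V}$ is injective from E into X. We claim that
\[ \inf_{x \in E,\ \|x\|_X = 1} \|\widehat{V}(x)\|_X > 0. \]
Suppose that the claim is not true. Then we can find a sequence $\{x_n\}_{n \geq 1} \subseteq E$ with $\|x_n\|_X = 1$, such that
\[ \|\widehat{V}(x_n)\|_X \searrow 0. \]
Since $L \in \mathcal{L}_c(X)$, by passing to a suitable subsequence if necessary, we may assume that $L(x_n) \to u$ in X. Then
\[ x_n = \big(\widehat{V} + L\big)(x_n) \to u \quad \text{in } X \]
and so $u \in E$ and $\|u\|_X = 1$. Moreover,
\[ \widehat{V}(x_n) \to \widehat{V}(u) \quad \text{in } X \]
and so
\[ \widehat{V}(u) = 0, \]
a contradiction to the fact that $\widehat{V} : E \to X$ is injective. So the claim is true and therefore there exists $c > 0$, such that
\[ \|\widehat{V}(x)\|_X \geq c\,\|x\|_X \quad \forall\, x \in E. \]
This implies that $\widehat{V}(E)$ is closed. Indeed, let
\[ u_n \in \widehat{V}(E) \quad \forall\, n \geq 1 \]
and assume that
\[ u_n \to u \quad \text{in } X. \]
Then
\[ u_n = \widehat{V}(x_n) \quad \text{with } x_n \in E \quad \forall\, n \geq 1. \]
We have
\[ \|x_n - x_m\|_X \leq \frac{1}{c} \|\widehat{V}(x_n - x_m)\|_X = \frac{1}{c} \|u_n - u_m\|_X \to 0 \quad \text{as } n, m \to +\infty. \]
So $x_n \to x$ in X for some $x \in E$ and $\widehat{V}(x_n) \to \widehat{V}(x) = u \in \widehat{V}(E)$ in X. Finally recall that $\widehat{V}(E) = V(E) = V(X)$.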


To produce a characterization of the spectrum of a compact operator, we shall need the following straightforward auxiliary result.

LEMMA 3.1.36 If X is a Banach space, $L \in \mathcal{L}_c(X)$ and $E = (\mathrm{id}_X - L)(X)$ is a proper subspace of X, then for every $\varepsilon > 0$ we can find $x_\varepsilon \in \overline{B}_1^X$ (the closed unit ball of X), such that
\[ d_X\big(L(x_\varepsilon), L(E)\big) > 1 - \varepsilon. \]

PROOF The subspace E is closed by Proposition 3.1.35. By virtue of the Riesz lemma (see Proposition A.3.15), we can find $x_\varepsilon \in X$ with $\|x_\varepsilon\|_X = 1$, such that
\[ d\big(x_\varepsilon, E\big) > 1 - \varepsilon. \]
Note that $(\mathrm{id}_X - L)(x_\varepsilon) \in E$ and $L(E) \subseteq E$. Therefore
\[ d_X\big(L(x_\varepsilon), L(E)\big) \geq d_X\big(x_\varepsilon - (\mathrm{id}_X - L)(x_\varepsilon), E\big) = d_X\big(x_\varepsilon, E\big) > 1 - \varepsilon. \]

Using this lemma we can have the following remarkable property of the spectrum of a compact operator.

THEOREM 3.1.37 If X is a complex Banach space, $L \in \mathcal{L}_c(X)$ and $\lambda \in \sigma(L) \setminus \{0\}$, then $\lambda \in \sigma_p(L)$.

PROOF Without any loss of generality, we may assume that $\lambda = 1$. Let
\[ V \stackrel{df}{=} \mathrm{id}_X - L \]
and suppose that
\[ V^{-1}\big(\{0\}\big) = \{0\} \]
(i.e., $\lambda = 1 \notin \sigma_p(L)$). We set
\[ E_n \stackrel{df}{=} V^n(X) \]
and note that
\[ E_n = V^n(X) = V^{n-1}\big(V(X)\big) \subseteq V^{n-1}(X) = E_{n-1} \quad \forall\, n \geq 1. \]
From Proposition 3.1.35, we know that
\[ E_n \text{ is closed} \quad \forall\, n \geq 1. \]
Suppose that
\[ E_{n+1} \neq E_n \quad \forall\, n \geq 1. \]
Then according to Lemma 3.1.36, we can find $x_n \in \overline{B}_1^{E_n}$ (the closed unit ball of $E_n$), such that
\[ d_X\big(L(x_n), L(E_{n+1})\big) \geq \frac{1}{2} \quad \forall\, n \geq 1, \]
so
\[ \|L(x_n) - L(x_m)\|_X \geq \frac{1}{2} \quad \forall\, n \neq m, \]
a contradiction to the compactness of L. So
\[ E_{n+1} = E_n \quad \text{for some } n \geq 1. \]
We shall show that $X = E_0 = E_1$. Suppose that this is not true, i.e., $E_0 \neq E_1$. Let $m \geq 1$ be the smallest positive integer, such that
\[ E_{m-1} \neq E_m = E_{m+1}. \]
We choose $y \in E_{m-1} \setminus E_m$. Then $V(y) \in E_m = E_{m+1}$. Hence we can find $z \in E_m$, such that
\[ V(y) = V(z) \quad \text{and} \quad y \neq z, \]
since $y \notin E_m = E_{m+1}$. Therefore $V(y - z) = 0$ and so
\[ y - z \in \ker V, \]
a contradiction to the hypothesis that $V^{-1}\big(\{0\}\big) = \{0\}$. So V is surjective and by Banach’s theorem (see Theorem A.3.6), we have that $V^{-1}$ exists and is bounded. Hence $\lambda = 1 \notin \sigma(L)$, a contradiction.


From Theorem 3.1.37, Proposition 3.1.34 and the well known fact from linear algebra, which says that eigenvectors corresponding to distinct eigenvalues are linearly independent, we obtain the following characterization of the spectrum of a compact operator.

THEOREM 3.1.38 If X is an infinite dimensional complex Banach space and $L \in \mathcal{L}_c(X)$, then
(a) $\sigma(L)$ is a countable compact set whose only possible limit point is 0;
(b) $\sigma(L) = \{0\} \cup \sigma_p(L)$;
(c) if $\lambda \in \sigma_p(L) \setminus \{0\}$, then the eigenspace of L corresponding to $\lambda$ is finite dimensional.

REMARK 3.1.39 The above theorem does not say that $\sigma(L)$ is the disjoint union of $\{0\}$ and $\sigma_p(L)$. For example if $X = l^2$ and $L \in \mathcal{L}_c(l^2)$ is given by
\[ L\big(\{x_n\}_{n \geq 1}\big) \stackrel{df}{=} (x_1, 0, 0, \ldots), \]
then $\lambda = 0$ is an eigenvalue of L and the associated eigenspace is infinite dimensional (it has codimension equal to 1). On the other hand the operator $L \in \mathcal{L}_c(l^2)$ in Example 3.1.25 is injective and so does not have 0 as an eigenvalue.

As we mentioned in the beginning of this section, compact operators generalize to infinite dimensions the properties of operators between finite dimensional spaces. One such property is that if $\dim X < +\infty$ and $L \in \mathcal{L}(X)$, then L is surjective if and only if L is injective. The result is no longer true if $\dim X = +\infty$. For example let $X = l^2$ and let
\[ L\big(\{x_n\}_{n \geq 1}\big) \stackrel{df}{=} (0, x_1, x_2, \ldots) \]
(the right shift operator), which is injective but not surjective. However, if $L \in \mathcal{L}_c(X)$, then as we show in the sequel, the result is true for $\mathrm{id}_X - L$. We start with a definition.

DEFINITION 3.1.40 Let X be a Banach space, $A \subseteq X$ and $C \subseteq X^*$. We introduce the sets $A^{\perp} \subseteq X^*$ (pronounced “A perp”) and ${}^{\perp}C \subseteq X$ (pronounced “perp C”), defined by
\[ A^{\perp} \stackrel{df}{=} \{x^* \in X^* : \langle x^*, a \rangle_X = 0 \text{ for all } a \in A\}, \]
\[ {}^{\perp}C \stackrel{df}{=} \{x \in X : \langle c^*, x \rangle_X = 0 \text{ for all } c^* \in C\}. \]


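REMARK 3.1.41 The sets $A^{\perp}$ and ${}^{\perp}C$ are closed subsets of $X^*$ and X respectively and
\[ {}^{\perp}\big(A^{\perp}\big) = \overline{\mathrm{span}}\, A \quad \text{and} \quad \big({}^{\perp}C\big)^{\perp} = \overline{\mathrm{span}}^{\,w^*} C. \]
Also if E is a closed subspace of X, then
\[ \big(X/E\big)^* = E^{\perp} \quad \text{and} \quad X^*/E^{\perp} = E^* \]
(see, e.g., Beauzamy (1982, pp. 41 and 43)).

LEMMA 3.1.42 If X, Y are two Banach spaces and $L \in \mathcal{L}(X;Y)$, then
\[ \ker L = {}^{\perp}\big(L^*(Y^*)\big) \quad \text{and} \quad \ker L^* = L(X)^{\perp}. \]

PROOF Recall that $Y^*$ is a separating family of functions on Y. So $x \in \ker L$ if and only if
\[ \langle L^*(y^*), x \rangle_X = \langle y^*, L(x) \rangle_Y = 0 \quad \forall\, y^* \in Y^*, \]
hence if and only if $x \in {}^{\perp}\big(L^*(Y^*)\big)$. In a similar fashion, we have that $y^* \in \ker L^*$ if and only if
\[ \langle y^*, L(x) \rangle_Y = \langle L^*(y^*), x \rangle_X = 0 \quad \forall\, x \in X, \]
so if and only if $y^* \in L(X)^{\perp}$.

LEMMA 3.1.43 If X is a Banach space, $L \in \mathcal{L}_c(X)$ and $\lambda$ is a nonzero scalar, then $R(\lambda\,\mathrm{id}_X - L) = X$ implies that $\ker(\lambda\,\mathrm{id}_X - L) = \{0\}$, i.e., if $\lambda\,\mathrm{id}_X - L$ is surjective, then it is injective.

PROOF Without any loss of generality we may assume that $\lambda = 1$. Recall that L commutes with $(\mathrm{id}_X - L)^n$ (consider the polynomial expansion of $(\mathrm{id}_X - L)^n$). So
\[ L\big(\ker(\mathrm{id}_X - L)^n\big) \subseteq \ker(\mathrm{id}_X - L)^n \quad \forall\, n \geq 1. \]
Suppose that although $R(\mathrm{id}_X - L) = X$, the operator $\mathrm{id}_X - L$ is not injective. Note that
\[ (\mathrm{id}_X - L)^n(X) = X \quad \forall\, n \geq 1 \]
and so $(\mathrm{id}_X - L)^{n+1}$ maps some elements of X to 0 that $(\mathrm{id}_X - L)^n$ does not. Hence
\[ \ker(\mathrm{id}_X - L)^n \subsetneq \ker(\mathrm{id}_X - L)^{n+1}. \]
Using the Riesz lemma (see Proposition A.3.15), we can find $x_n \in \ker(\mathrm{id}_X - L)^{n+1}$ with $\|x_n\|_X = 1$, such that
\[ \|x_n - y\|_X \geq \frac{1}{2} \quad \forall\, n \geq 1,\ y \in \ker(\mathrm{id}_X - L)^n. \]
If $n > m$, we have
\[ (\mathrm{id}_X - L)(x_n) + L(x_m) \in \ker(\mathrm{id}_X - L)^n \]
and so
\[ \|L(x_n) - L(x_m)\|_X = \big\|x_n - \big((\mathrm{id}_X - L)(x_n) + L(x_m)\big)\big\|_X \geq \frac{1}{2}, \]
a contradiction to the fact that $L \in \mathcal{L}_c(X)$.

PROPOSITION 3.1.44 If X is a Banach space, $L \in \mathcal{L}_c(X)$ and $\lambda$ is a nonzero scalar, then
\[ \dim \ker(\lambda\,\mathrm{id}_X - L) = \mathrm{codim}\, R(\lambda\,\mathrm{id}_X - L) \]
(recall that $\mathrm{codim}\, R(\lambda\,\mathrm{id}_X - L) = \dim \big(X/R(\lambda\,\mathrm{id}_X - L)\big)$).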

PROOF Without any loss of generality, we may assume that $\lambda = 1$. From Remark 3.1.41 and Lemma 3.1.42, we have that
\[ \big(X/R(\mathrm{id}_X - L)\big)^* = R(\mathrm{id}_X - L)^{\perp} = \ker\big(\mathrm{id}_X^* - L^*\big). \tag{3.2} \]
From Theorem 3.1.22, we know that $L^* \in \mathcal{L}_c(X^*)$ and so Proposition 3.1.33 implies that
\[ \dim \ker\big(\mathrm{id}_X^* - L^*\big) < +\infty. \]
A finite dimensional Banach space has the same dimension as its dual. So from (3.2), we have that
\[ \mathrm{codim}\, R(\mathrm{id}_X - L) = \dim \ker(\mathrm{id}_X^* - L^*). \tag{3.3} \]
Because $L^* \in \mathcal{L}_c(X^*)$, from Proposition 3.1.35, we have that $R(\mathrm{id}_X^* - L^*)$ is closed, hence $w^*$-closed too (see Remark 3.1.21). So from Remark 3.1.41 and Lemma 3.1.42, we have
\[ \ker(\mathrm{id}_X - L)^{\perp} = \Big({}^{\perp}R\big(\mathrm{id}_X^* - L^*\big)\Big)^{\perp} = \overline{R\big(\mathrm{id}_X^* - L^*\big)}^{\,w^*} = R\big(\mathrm{id}_X^* - L^*\big), \]
so
\[ X^*/R(\mathrm{id}_X^* - L^*) = X^*/\ker(\mathrm{id}_X - L)^{\perp} = \big[\ker(\mathrm{id}_X - L)\big]^*. \]
Using as before the fact that a finite dimensional Banach space has the same dimension as its dual, we obtain
\[ \mathrm{codim}\, R(\mathrm{id}_X^* - L^*) = \dim \ker(\mathrm{id}_X - L). \tag{3.4} \]
Suppose that
\[ \dim \ker(\mathrm{id}_X - L) > \mathrm{codim}\, R(\mathrm{id}_X - L). \]
Then we can find a closed subspace E of X, such that
\[ X = R(\mathrm{id}_X - L) \oplus E. \]
Let $P_E$ be the projection operator onto E. Then $\ker P_E = R(\mathrm{id}_X - L)$. We have that
\[ X/\ker P_E = X/R(\mathrm{id}_X - L) = E \]
and so
\[ \mathrm{codim}\, R(\mathrm{id}_X - L) = \dim E. \]
Therefore there is a bounded linear operator T which is not injective and maps $\ker(\mathrm{id}_X - L)$ onto E. Then
\[ T \in \mathcal{L}_c\big(\ker(\mathrm{id}_X - L); X\big). \]
Let F be a closed subspace of X, such that
\[ X = \ker(\mathrm{id}_X - L) \oplus F \]
and $P_0$ the projection operator onto $\ker(\mathrm{id}_X - L)$ and with kernel F. Set
\[ G \stackrel{df}{=} L + T P_0. \]
Evidently $G \in \mathcal{L}_c(X)$ and we have
\[ (\mathrm{id}_X - G)(X) = \big((\mathrm{id}_X - L) - T P_0\big)\big(\ker(\mathrm{id}_X - L)\big) + \big((\mathrm{id}_X - L) - T P_0\big)(F) = E + (\mathrm{id}_X - L)(F) = E + (\mathrm{id}_X - L)(X) = X, \]
so, from Lemma 3.1.43, we have that $\mathrm{id}_X - G$ is injective. But there is a nonzero $u \in \ker T \subseteq \ker(\mathrm{id}_X - L)$, such that
\[ (\mathrm{id}_X - G)(u) = (\mathrm{id}_X - L)(u) - T P_0(u) = 0, \]
a contradiction. So, it follows that
\[ \dim \ker(\mathrm{id}_X - L) \leq \mathrm{codim}\, R(\mathrm{id}_X - L). \tag{3.5} \]
Moreover, because $L^* \in \mathcal{L}_c(X^*)$, we also have
\[ \dim \ker(\mathrm{id}_X^* - L^*) \leq \mathrm{codim}\, R(\mathrm{id}_X^* - L^*). \tag{3.6} \]
From (3.3), (3.4), (3.5) and (3.6), we conclude that
\[ \dim \ker(\mathrm{id}_X - L) = \mathrm{codim}\, R(\mathrm{id}_X - L). \]

A byproduct of the above proof is the following result.

COROLLARY 3.1.45 If X is a Banach space, $L \in \mathcal{L}_c(X)$ and $\lambda$ is a nonzero scalar, then
\[ \dim \ker(\lambda\,\mathrm{id}_X - L) = \dim \ker(\lambda\,\mathrm{id}_X^* - L^*). \]

Clearly Proposition 3.1.44 permits the improvement of Lemma 3.1.43. This is done in the next theorem which summarizes all the above properties of a compact operator.

THEOREM 3.1.46 If X is a Banach space, $L \in \mathcal{L}_c(X)$ and $\lambda$ is a nonzero scalar, then
(a) $\ker(\lambda\,\mathrm{id}_X - L)$ is finite dimensional;
(b) $R(\lambda\,\mathrm{id}_X - L)$ is closed and $R(\lambda\,\mathrm{id}_X - L) = \ker(\lambda\,\mathrm{id}_X^* - L^*)^{\perp}$;
(c) $\ker(\lambda\,\mathrm{id}_X - L) = \{0\}$ if and only if $R(\lambda\,\mathrm{id}_X - L) = X$;
(d) $\dim \ker(\lambda\,\mathrm{id}_X - L) = \dim \ker(\lambda\,\mathrm{id}_X^* - L^*)$.

REMARK 3.1.47 Statement (c) expresses the fact that $\lambda\,\mathrm{id}_X - L$ is injective if and only if $\lambda\,\mathrm{id}_X - L$ is surjective, a well known property of linear operators between finite dimensional spaces.
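Statements (b) and (d) of Theorem 3.1.46 can be observed directly in the finite dimensional case, where every linear operator is compact and the adjoint is represented by the transpose (for real matrices). The sketch below is an illustration only: the matrix `L`, the value $\lambda = 1$ and the helper names `rank`, `lam_minus` and `solvable` are hypothetical choices, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix, by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def lam_minus(matrix, lam):
    """Matrix of lam * id - L."""
    n = len(matrix)
    return [[(lam if i == j else 0) - matrix[i][j] for j in range(n)] for i in range(n)]

def solvable(matrix, u):
    """True if matrix * x = u has a solution (rank of augmented matrix unchanged)."""
    augmented = [row + [ui] for row, ui in zip(matrix, u)]
    return rank(augmented) == rank(matrix)

L = [[1, 1, 0], [0, 1, 0], [0, 0, 2]]     # hypothetical 3x3 example
V = lam_minus(L, 1)                        # id - L, with lambda = 1

dim_ker = 3 - rank(V)                      # dim ker(id - L)
dim_ker_adjoint = 3 - rank(transpose(V))   # dim ker(id* - L*)
```

Here $\ker(\mathrm{id} - L^{T})$ is spanned by $(0,1,0)$, so statement (b) predicts that $(\mathrm{id} - L)x = u$ is solvable exactly when the second component of u vanishes; `solvable(V, [1, 0, 5])` and `solvable(V, [0, 1, 0])` can be used to check both cases.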


This leads us to the Fredholm alternative theorem, an important tool in the study of integral equations and boundary value problems.

THEOREM 3.1.48 (Fredholm Alternative Theorem) If $X$ is a Banach space, $L \in \mathcal{L}_c(X)$ and $\lambda$ is a nonzero scalar, then one and only one of the following two alternatives holds:
(a) for every $u \in X$, the equation $(\lambda\,\mathrm{id}_X - L)(x) = u$ has a unique solution $x \in X$; or
(b) the homogeneous equation $(\lambda\,\mathrm{id}_X - L)(x) = 0$ has $N$ linearly independent solutions with $N \ge 1$; in this case the nonhomogeneous equation $(\lambda\,\mathrm{id}_X - L)(x) = u$ has a solution if and only if $u$ verifies $N$ conditions of orthogonality, i.e., $u \in \ker(\lambda\,\mathrm{id}_{X^*} - L^*)^{\perp}$.

Next let us say a few words about the spectrum of a self-adjoint, compact operator on a Hilbert space.

DEFINITION 3.1.49 Let $H$ be a Hilbert space and $L \in \mathcal{L}(H)$. We say that $L$ is self-adjoint (or hermitian) if and only if $L^* = L$, i.e.,
\[ \big(L(x), y\big)_H = \big(x, L(y)\big)_H \quad \forall\, x, y \in H. \]

REMARK 3.1.50 If $H$ is a complex Hilbert space and $L \in \mathcal{L}(H)$ is a self-adjoint operator, then
\[ \big(L(x), x\big)_H = \big(x, L(x)\big)_H = \overline{\big(L(x), x\big)_H}, \]
hence $\big(L(x), x\big)_H \in \mathbb{R}$. Also one can check that
\[ \|L^n\|_{\mathcal{L}} = \|L\|_{\mathcal{L}}^n \quad \forall\, n \ge 1 \qquad \text{and} \qquad \|L\|_{\mathcal{L}} = \sup_{\|x\|_H \le 1} \big|\big(L(x), x\big)_H\big|. \]


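PROPOSITION 3.1.51 If $H$ is a Hilbert space and $L \in \mathcal{L}(H)$ is a self-adjoint operator, then all eigenvalues of $L$ are real and eigenvectors corresponding to different eigenvalues are orthogonal.

PROOF Let $\lambda$ be an eigenvalue with an eigenvector $x$. We have
\[ \big(L(x), x\big)_H = \big(\lambda x, x\big)_H, \]
so, from Remark 3.1.50, we have
\[ \lambda = \frac{\big(L(x), x\big)_H}{\|x\|_H^2} \in \mathbb{R}. \]
Also if $\mu$ is another eigenvalue with an eigenvector $y$, we have
\[ \big(L(x), y\big)_H = \lambda\,(x, y)_H \quad \text{and} \quad \big(L(y), x\big)_H = \mu\,(x, y)_H. \]
Since $L$ is self-adjoint, it follows that $(\lambda - \mu)(x, y)_H = 0$. Because $\lambda \ne \mu$, we conclude that $(x, y)_H = 0$.

PROPOSITION 3.1.52 If $H$ is a Hilbert space and $L \in \mathcal{L}(H)$ is a self-adjoint operator, then $\lambda \in \sigma(L)$ if and only if
\[ \inf_{\|x\|_H = 1} \big\|(\lambda\,\mathrm{id}_X - L)(x)\big\|_H = 0. \]

PROOF "$\Longleftarrow$": If $\lambda \in \varrho(L)$, then $(\lambda\,\mathrm{id}_X - L)^{-1} \in \mathcal{L}(H)$ and so for $x \in H$ with $\|x\|_H = 1$, we have
\[ 1 = \|x\|_H = \big\|(\lambda\,\mathrm{id}_X - L)^{-1}(\lambda\,\mathrm{id}_X - L)(x)\big\|_H \le \big\|(\lambda\,\mathrm{id}_X - L)^{-1}\big\|_{\mathcal{L}} \big\|(\lambda\,\mathrm{id}_X - L)(x)\big\|_H, \]
so
\[ \inf_{\|x\|_H = 1} \big\|(\lambda\,\mathrm{id}_X - L)(x)\big\|_H \ge \big\|(\lambda\,\mathrm{id}_X - L)^{-1}\big\|_{\mathcal{L}}^{-1} > 0. \]

"$\Longrightarrow$": We proceed by contradiction. So suppose that
\[ \inf_{\|x\|_H = 1} \big\|(\lambda\,\mathrm{id}_X - L)(x)\big\|_H = c > 0. \]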


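Then by positive homogeneity, we have
\[ \big\|(\lambda\,\mathrm{id}_X - L)(x)\big\|_H \ge c\,\|x\|_H \quad \forall\, x \in H. \]
Hence $\lambda\,\mathrm{id}_X - L$ is injective. If we show that $\lambda\,\mathrm{id}_X - L$ is also surjective, then by Banach's theorem (see Theorem A.3.6), we have that $(\lambda\,\mathrm{id}_X - L)^{-1} \in \mathcal{L}(H)$, a contradiction to the fact that $\lambda \in \sigma(L)$. We establish the surjectivity of $\lambda\,\mathrm{id}_X - L$ in two steps. First we show that $(\lambda\,\mathrm{id}_X - L)(H)$ is dense in $H$ and then that $(\lambda\,\mathrm{id}_X - L)(H)$ is closed in $H$.

Suppose that $(\lambda\,\mathrm{id}_X - L)(H)$ is not dense in $H$. Then we can find $u \in H \setminus \{0\}$, such that
\[ \big(u, (\lambda\,\mathrm{id}_X - L)(x)\big)_H = 0 \quad \forall\, x \in H. \]
Since $L$ is self-adjoint, we have that
\[ 0 = \big(u, (\lambda\,\mathrm{id}_X - L)(x)\big)_H = \big((\overline{\lambda}\,\mathrm{id}_X - L)(u), x\big)_H \quad \forall\, x \in H, \]
hence
\[ \big(\overline{\lambda}\,\mathrm{id}_X - L\big)(u) = 0. \]
This means that $\overline{\lambda} \in \sigma_p(L)$. But from Proposition 3.1.51, we know that $\sigma_p(L) \subseteq \mathbb{R}$. Hence $\lambda = \overline{\overline{\lambda}} = \overline{\lambda}$. Therefore
\[ \big(\lambda\,\mathrm{id}_X - L\big)(u) = 0, \quad u \ne 0, \]
a contradiction to the fact that
\[ \big\|(\lambda\,\mathrm{id}_X - L)(u)\big\|_H \ge c\,\|u\|_H > 0. \]
This proves that $(\lambda\,\mathrm{id}_X - L)(H)$ is dense in $H$.

Now we show that $(\lambda\,\mathrm{id}_X - L)(H)$ is closed in $H$. To this end let
\[ (\lambda\,\mathrm{id}_X - L)(x_n) \longrightarrow y \quad \text{in } H. \]
Then we have
\[ \big\|(\lambda\,\mathrm{id}_X - L)(x_n - x_m)\big\|_H \longrightarrow 0 \quad \text{as } n, m \to +\infty. \]
Since
\[ \big\|(\lambda\,\mathrm{id}_X - L)(x_n - x_m)\big\|_H \ge c\,\|x_n - x_m\|_H, \]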


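we have that
\[ \|x_n - x_m\|_H \longrightarrow 0 \quad \text{as } n, m \to +\infty. \]
Therefore $x_n \longrightarrow x \in H$ and so $(\lambda\,\mathrm{id}_X - L)(x_n) \longrightarrow (\lambda\,\mathrm{id}_X - L)(x)$, hence $y = (\lambda\,\mathrm{id}_X - L)(x)$, which proves the closedness of $(\lambda\,\mathrm{id}_X - L)(H)$. We conclude that
\[ \big(\lambda\,\mathrm{id}_X - L\big)(H) = H \]
and so by Banach's theorem, we have $(\lambda\,\mathrm{id}_X - L)^{-1} \in \mathcal{L}(H)$, a contradiction to the fact that $\lambda \in \sigma(L)$.

In Proposition 3.1.51, we saw that if $L \in \mathcal{L}(H)$ is a self-adjoint operator, then $\sigma_p(L) \subseteq \mathbb{R}$. Next we show that in fact the whole spectrum is real.

PROPOSITION 3.1.53 If $H$ is a Hilbert space and $L \in \mathcal{L}(H)$ is a self-adjoint operator, then $\sigma(L) \subseteq \mathbb{R}$.

PROOF Let $\lambda = a + ic$ with $c \ne 0$. We show that $\lambda \in \varrho(L)$. For every $x \in H$, we have
\[ \big((\lambda\,\mathrm{id}_X - L)(x), x\big)_H - \big(x, (\lambda\,\mathrm{id}_X - L)(x)\big)_H = \lambda\|x\|_H^2 - \big(L(x), x\big)_H - \overline{\lambda}\|x\|_H^2 + \big(x, L(x)\big)_H = (\lambda - \overline{\lambda})\|x\|_H^2 = 2ic\,\|x\|_H^2. \]
So for every $x \in H$, we have
\[ 2|c|\,\|x\|_H^2 = \Big|\big((\lambda\,\mathrm{id}_X - L)(x), x\big)_H - \big(x, (\lambda\,\mathrm{id}_X - L)(x)\big)_H\Big| \le \Big|\big((\lambda\,\mathrm{id}_X - L)(x), x\big)_H\Big| + \Big|\big(x, (\lambda\,\mathrm{id}_X - L)(x)\big)_H\Big| \le 2\big\|(\lambda\,\mathrm{id}_X - L)(x)\big\|_H \|x\|_H, \]
hence
\[ |c|\,\|x\|_H \le \big\|(\lambda\,\mathrm{id}_X - L)(x)\big\|_H. \]
Invoking Proposition 3.1.52, we infer that $\lambda \in \varrho(L)$. Therefore $\sigma(L) \subseteq \mathbb{R}$.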


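We can say more about the position of $\sigma(L)$ in the real line $\mathbb{R}$ when $L \in \mathcal{L}(H)$ is self-adjoint.

PROPOSITION 3.1.54 If $H$ is a Hilbert space and $L \in \mathcal{L}(H)$ is a self-adjoint operator, then $\sigma(L) \subseteq [m, M]$, where
\[ m \overset{df}{=} \inf_{\|x\|_H = 1} \big(L(x), x\big)_H \quad \text{and} \quad M \overset{df}{=} \sup_{\|x\|_H = 1} \big(L(x), x\big)_H. \]
Moreover, $m, M \in \sigma(L)$.

PROOF From Proposition 3.1.53, we know that $\sigma(L) \subseteq \mathbb{R}$. Let $r > 0$ and $\lambda \overset{df}{=} M + r$. Then for every $x \in H$ with $\|x\|_H = 1$, we have
\[ \big((\lambda\,\mathrm{id}_X - L)(x), x\big)_H = \lambda\|x\|_H^2 - \big(L(x), x\big)_H \ge \lambda\|x\|_H^2 - M\|x\|_H^2 = r\|x\|_H^2 = r, \]
so
\[ r \le \big\|(\lambda\,\mathrm{id}_X - L)(x)\big\|_H. \]
Invoking Proposition 3.1.52, we infer that $\lambda \in \varrho(L)$. Similarly if $\lambda = m - r$. So $\sigma(L) \subseteq [m, M]$.

Next we show that $M \in \sigma(L)$. Note that
\[ \sigma(L - \mu\,\mathrm{id}_X) = \sigma(L) - \mu \quad \forall\, \mu \in \mathbb{R}. \]
So by replacing $L$ with $L + \mu\,\mathrm{id}_X$ with $\mu > 0$ sufficiently large, we may assume that $0 \le m \le M$. Then $M = \|L\|_{\mathcal{L}}$ (see Remark 3.1.50). Let $\{x_n\}_{n \ge 1} \subseteq H$ be a sequence, such that
\[ \|x_n\|_H = 1 \quad \forall\, n \ge 1 \quad \text{and} \quad \big(L(x_n), x_n\big)_H \nearrow M = \|L\|_{\mathcal{L}}. \]
Then we have
\[ \big\|(M\,\mathrm{id}_X - L)(x_n)\big\|_H^2 = \big(M x_n - L(x_n), M x_n - L(x_n)\big)_H = M^2\|x_n\|_H^2 + \big\|L(x_n)\big\|_H^2 - 2M\big(L(x_n), x_n\big)_H \le M^2 + M^2 - 2M\big(L(x_n), x_n\big)_H, \]
hence
\[ \big\|(M\,\mathrm{id}_X - L)(x_n)\big\|_H \longrightarrow 0. \]
So Proposition 3.1.52 implies that $M \in \sigma(L)$. The proof that $m \in \sigma(L)$ is similar.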
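The bounds $m$ and $M$ of Proposition 3.1.54 can be observed numerically in $H = \mathbb{R}^2$. The sketch below samples the quadratic form $(L(x), x)_H$ on the unit circle for a symmetric matrix whose eigenvalues are $1$ and $3$; the matrix, the sampling grid and the helper names are assumptions chosen for illustration only.

```python
import math

# Symmetric (self-adjoint) matrix on R^2 with eigenvalues 1 and 3.
# The matrix and the grid size are illustrative assumptions.
L = [[2.0, 1.0],
     [1.0, 2.0]]

def quad_form(A, x):
    """Return (A x, x) for a 2x2 matrix A and a vector x."""
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))

# Sample unit vectors (cos t, sin t); here (L x, x) = 2 + sin(2t).
angles = [2.0 * math.pi * k / 3600 for k in range(3600)]
samples = [quad_form(L, (math.cos(t), math.sin(t))) for t in angles]

m_est = min(samples)   # approximates m = inf (L x, x) over unit vectors
M_est = max(samples)   # approximates M = sup (L x, x) over unit vectors
```

In this example the sampled extremes agree with the smallest and largest eigenvalues, which is consistent with $m, M \in \sigma(L)$ and $\sigma(L) \subseteq [m, M]$.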


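PROPOSITION 3.1.55 If $H$ is a Hilbert space and $L \in \mathcal{L}_c(H)$ is a self-adjoint operator, then $\sigma_p(L) \ne \emptyset$.

PROOF If $L = 0$, then $\lambda = 0$ is an eigenvalue of $L$. If $L \ne 0$, then by Proposition 3.1.54, at least one of $m$ or $M$ is a nonzero element of $\sigma(L)$. Invoking Theorem 3.1.38(b), we conclude that $\sigma_p(L) \ne \emptyset$.

PROPOSITION 3.1.56 If $H$ is a Hilbert space and $L \in \mathcal{L}_c(H)$ is a self-adjoint operator, then there is an orthonormal basis of $H$ consisting of eigenvectors of $L$.

PROOF For $\lambda \in \sigma_p(L)$ let
\[ E(\lambda) \overset{df}{=} (\lambda\,\mathrm{id}_X - L)^{-1}(0) \]
(the eigenspace corresponding to the eigenvalue $\lambda$). Let $B(\lambda)$ be an orthonormal basis for each finite dimensional eigenspace $E(\lambda)$. By virtue of Proposition 3.1.51, we have that
\[ \bigcup_{\lambda \in \sigma_p(L)} B(\lambda) \ \text{is an orthonormal set in } H. \]
Suppose that
\[ \overline{\operatorname{span}} \bigcup_{\lambda \in \sigma_p(L)} B(\lambda) \ne H. \]
Then set
\[ F \overset{df}{=} \Big[\, \overline{\operatorname{span}} \bigcup_{\lambda \in \sigma_p(L)} B(\lambda) \Big]^{\perp}. \]
Clearly $L(F) \subseteq F$. So $L|_F$ has an eigenvalue (see Proposition 3.1.55). Let $u \in F \setminus \{0\}$ be an eigenvector of $L|_F$. Evidently $u$ is an eigenvector of $L$ and so
\[ F \cap \overline{\operatorname{span}} \bigcup_{\lambda \in \sigma_p(L)} B(\lambda) \supsetneq \{0\}, \]
a contradiction. Therefore
\[ \overline{\operatorname{span}} \bigcup_{\lambda \in \sigma_p(L)} B(\lambda) = H. \]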


Now we can state the so-called spectral theorem for compact self-adjoint operators on a separable Hilbert space.

THEOREM 3.1.57 (Spectral Theorem) If $H$ is an infinite dimensional separable Hilbert space and $L \in \mathcal{L}_c(H)$ is a self-adjoint operator, then there exists an orthonormal basis $\{e_n\}_{n \ge 1}$ of $H$ formed by eigenvectors of $L$, such that
\[ L(x) = \sum_{n=1}^{\infty} \lambda_n (x, e_n)_H\, e_n \quad \forall\, x \in H, \]
with $\{\lambda_n\}_{n \ge 1}$ being the eigenvalues corresponding to $\{e_n\}_{n \ge 1}$.

PROOF From Proposition 3.1.56, we know that there exists an orthonormal basis of $H$ consisting of eigenvectors of $L$. This orthonormal basis is countable, because $H$ is separable. Denote it by $\{e_n\}_{n \ge 1}$. Since
\[ |\lambda_n| \le \sup_{\|x\|_H \le 1} \big|\big(L(x), x\big)_H\big| = \|L\|_{\mathcal{L}} \]
(see Remark 3.1.50), for every $x \in H$ and $m \ge k \ge 1$ we have
\[ \Big\| \sum_{n=k}^{m} \lambda_n (x, e_n)_H\, e_n \Big\|_H^2 = \sum_{n=k}^{m} \big|\lambda_n (x, e_n)_H\big|^2 \le \|L\|_{\mathcal{L}}^2 \sum_{n=k}^{m} \big|(x, e_n)_H\big|^2 \longrightarrow 0 \quad \text{as } k, m \to +\infty, \]
so $\sum_{n=1}^{\infty} \lambda_n (x, e_n)_H\, e_n$ is convergent in $H$.

Moreover, if $x \in H$ with $\|x\|_H \le 1$, then for every $m \ge 1$, we have
\[ \Big\| \sum_{n=1}^{m} \lambda_n (x, e_n)_H\, e_n \Big\|_H^2 = \sum_{n=1}^{m} |\lambda_n|^2 \big|(x, e_n)_H\big|^2 \le \|L\|_{\mathcal{L}}^2 \sum_{n=1}^{m} \big|(x, e_n)_H\big|^2 \le \|L\|_{\mathcal{L}}^2 \sum_{n=1}^{\infty} \big|(x, e_n)_H\big|^2 = \|L\|_{\mathcal{L}}^2 \|x\|_H^2. \tag{3.7} \]
Therefore, if we define
\[ T(x) \overset{df}{=} \sum_{n=1}^{\infty} \lambda_n (x, e_n)_H\, e_n, \]
from (3.7), we see that $T \in \mathcal{L}(H)$. Note that $L(e_n) = T(e_n)$ for all $n \ge 1$. So by linearity and continuity, we conclude that $L = T$.
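The expansion in Theorem 3.1.57 has an elementary finite-dimensional analogue that can be checked directly. The sketch below takes a symmetric matrix on $\mathbb{R}^2$ with a known orthonormal eigenbasis and compares $L(x)$ with the sum $\sum_n \lambda_n (x, e_n) e_n$; the matrix, the test vector and the helper names are assumptions made for this illustration (the theorem itself concerns infinite dimensional separable spaces).

```python
import math

# Symmetric matrix with eigenpairs (3, (1,1)/sqrt(2)) and (1, (1,-1)/sqrt(2)).
# All concrete values here are illustrative assumptions.
L = [[2.0, 1.0],
     [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(3.0, (s, s)), (1.0, (s, -s))]

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def apply_L(x):
    """Direct evaluation of L(x)."""
    return tuple(inner(row, x) for row in L)

def spectral_sum(x):
    """Evaluate sum_n lambda_n (x, e_n) e_n over the eigenbasis."""
    out = [0.0, 0.0]
    for lam, e in eigenpairs:
        coeff = lam * inner(x, e)
        out = [o + coeff * ei for o, ei in zip(out, e)]
    return tuple(out)

x = (0.3, -1.7)
direct = apply_L(x)
expanded = spectral_sum(x)
```

Up to rounding, `direct` and `expanded` coincide, which mirrors the final step of the proof where $L$ and $T$ agree on every basis vector.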


Before passing to Fredholm operators, let us mention two more results on compact maps.

PROPOSITION 3.1.58 If $X, Y$ are two Banach spaces, $U \subseteq X$ is an open set, $f \in K(U; Y)$ and it is Fréchet differentiable, then
\[ f'(x) \in \mathcal{L}_c(X; Y) \quad \forall\, x \in U. \]

PROOF Suppose that $f'(x)$ is not compact. Then we can find a sequence $\{u_n\}_{n \ge 1} \subseteq X$ with $\|u_n\|_X \le 1$ for all $n \ge 1$ and $\varepsilon > 0$, such that
\[ \big\|f'(x)u_n - f'(x)u_m\big\|_Y \ge \varepsilon \quad \forall\, n \ne m. \]
We have
\[ f(x + h) - f(x) = f'(x)h + o_x(h), \]
where
\[ \|o_x(h)\|_Y \le \frac{\varepsilon}{3}\,\|h\|_X \quad \forall\, h \in X,\ \|h\|_X \le \delta, \]
for some $\delta = \delta(\varepsilon, x) > 0$. Therefore
\[ \big\|f(x + \delta u_n) - f(x + \delta u_m)\big\|_Y \ge \delta\,\big\|f'(x)(u_n - u_m)\big\|_Y - \big\|o_x(\delta u_n)\big\|_Y - \big\|o_x(\delta u_m)\big\|_Y \ge \varepsilon\delta - \frac{2\varepsilon}{3}\,\delta = \frac{\varepsilon}{3}\,\delta, \]
a contradiction to the fact that $f \in K(U; Y)$.

The converse of the above result is also true, provided that the map $x \longmapsto f'(x)$ belongs to $K\big(U; \mathcal{L}(X; Y)\big)$. For details see Vaĭnberg (1973, pp. 47 and 51).

PROPOSITION 3.1.59 Let $X, Y$ be two Banach spaces.
(a) If $f : X \longrightarrow Y$ is a Fréchet differentiable operator, $f'(x) \in \mathcal{L}_c(X; Y)$ for every $x \in X$ and $x \longmapsto f'(x)$ belongs to $K\big(X; \mathcal{L}(X; Y)\big)$, then $f \in K(X; Y)$.
(b) If $f \in K(X; Y)$, $f$ is Fréchet differentiable, and $x \longmapsto f'(x)$ belongs to $K\big(X; \mathcal{L}(X; Y)\big)$, then $f$ is completely continuous.


The study of compact operators leads us to the following definition.

DEFINITION 3.1.60 Let $X, Y$ be two Banach spaces and $L \in \mathcal{L}(X; Y)$. We say that $L$ is a Fredholm operator, if
\[ \alpha(L) = \dim\ker L < +\infty \quad \text{and} \quad \beta(L) = \operatorname{codim} R(L) < +\infty. \]
The class of Fredholm operators is denoted by $\Phi(X; Y)$. The quantity $\alpha(L)$ is called the kernel index of $L$ and the quantity $\beta(L)$ is called the deficiency index of $L$. The index of $L$ is defined by
\[ \operatorname{ind}(L) \overset{df}{=} \alpha(L) - \beta(L). \]
If we have only that $\alpha(L) < +\infty$ and that $R(L)$ is closed, then we say that $L$ is a semi-Fredholm operator and the class of all semi-Fredholm operators is denoted by $\Phi_+(X; Y)$. If $X = Y$ we write $\Phi(X)$ and $\Phi_+(X)$.

REMARK 3.1.61 We have
\[ \Phi(X; Y) \subseteq \Phi_+(X; Y), \]
since as we show in the sequel, the condition that $\beta(L) < +\infty$ implies that $R(L)$ is closed.

LEMMA 3.1.62 If $X, Y$ are two Banach spaces, $L \in \mathcal{L}(X; Y)$ is injective and $L^{-1} : R(L) \longrightarrow X$ is bounded, then $R(L)$ is closed.

PROOF Let
\[ L(x_n) \longrightarrow y \quad \text{in } Y. \]
Since by hypothesis $L$ has a bounded inverse on $R(L)$, we must have that
\[ \big\|L(x_n) - L(x_m)\big\|_Y \ge c\,\|x_n - x_m\|_X \quad \forall\, n \ne m, \]
for some $c > 0$. Therefore $\{x_n\}_{n \ge 1} \subseteq X$ is a Cauchy sequence and we have that $x_n \longrightarrow x$, for some $x \in X$. Hence
\[ L(x_n) \longrightarrow L(x) \quad \text{in } Y \]
and so $y = L(x)$. Therefore $y \in R(L)$ and we conclude that $R(L)$ is closed in $Y$.


The next definition formalizes an idea which was used in earlier proofs, when we wanted to get rid of the nontrivial kernel of an operator $L \in \mathcal{L}(X; Y)$.

DEFINITION 3.1.63 Let $X, Y$ be two Banach spaces and $L \in \mathcal{L}_c(X; Y)$. The operator $\widehat{L}$ induced by $L$ is the operator from $X/\ker L$ into $Y$ defined by
\[ \widehat{L}\big([x]\big) \overset{df}{=} L(x) \quad \forall\, x \in X. \]

REMARK 3.1.64 Evidently $\widehat{L}$ is injective and
\[ R\big(\widehat{L}\big) = R(L). \]
In fact it is straightforward to show that $\widehat{L}$ is continuous and
\[ \|L\|_{\mathcal{L}} = \big\|\widehat{L}\big\|_{\mathcal{L}}. \]

PROPOSITION 3.1.65 If $X, Y$ are two Banach spaces and $L \in \mathcal{L}(X; Y)$, then $R(L)$ is closed if and only if there exists $c > 0$, such that
\[ \big\|L(x)\big\|_Y \ge c\, d_X(x, \ker L) \quad \forall\, x \in X. \]

PROOF Consider the operator $\widehat{L}$ induced by $L$ (see Definition 3.1.63). We know that $\widehat{L}$ is injective and
\[ R\big(\widehat{L}\big) = R(L) \]
(see Remark 3.1.64). By virtue of Lemma 3.1.62, $R(L) = R\big(\widehat{L}\big)$ is closed if and only if $\widehat{L}$ has a bounded inverse, which in turn is equivalent to saying that
\[ 0 < c = \inf_{x \notin \ker L} \frac{\big\|\widehat{L}([x])\big\|_Y}{\|[x]\|} = \inf_{x \notin \ker L} \frac{\|L(x)\|_Y}{d_X(x, \ker L)}. \]

We can define the quantity
\[ \gamma(L) \overset{df}{=} \inf_{x \notin \ker L} \frac{\|L(x)\|_Y}{d_X(x, \ker L)}. \]
This quantity is known as the minimum modulus of $L$. From the previous discussion, we have the following proposition.


PROPOSITION 3.1.66 If $X, Y$ are two Banach spaces and $L : X \longrightarrow Y$ is linear, then any two of the following three properties imply the other:
(a) $L \in \mathcal{L}(X; Y)$;
(b) $R(L)$ is closed in $Y$;
(c) $\gamma(L) > 0$.

PROPOSITION 3.1.67 If $X, Y$ are two Banach spaces, $L \in \mathcal{L}(X; Y)$ and $E$ is a closed subspace of $Y$, such that $R(L) \oplus E$ is closed in $Y$, then $R(L)$ is closed.

PROOF Let $L_0 \in \mathcal{L}(X \times E; Y)$ be defined by
\[ L_0(x, u) \overset{df}{=} L(x) + u \quad \forall\, (x, u) \in X \times E. \]
Since $R(L) \cap E = \{0\}$, we have that $\ker L_0 = \ker L \times \{0\}$. By hypothesis $R(L_0) = R(L) \oplus E$ is closed. So according to Proposition 3.1.66, we have that $\gamma(L_0) > 0$. Then for all $x \in X$, we have
\[ \big\|L(x)\big\|_Y = \big\|L_0(x, 0)\big\|_Y \ge \gamma(L_0)\, d\big((x, 0), \ker L_0\big) = \gamma(L_0)\, d_X(x, \ker L), \]
so $\gamma(L) \ge \gamma(L_0) > 0$. Invoking Proposition 3.1.66, we conclude that $R(L) \subseteq Y$ is closed.

COROLLARY 3.1.68 If $X, Y$ are two Banach spaces, $L \in \mathcal{L}(X; Y)$ and $\beta(L) < +\infty$, then $R(L)$ is closed.

PROOF We have
\[ Y = R(L) \oplus E \]
for some finite dimensional subspace $E$ of $Y$. Apply Proposition 3.1.67.


This corollary implies that every $L \in \Phi(X; Y)$ has closed range and so $\Phi(X; Y) \subseteq \Phi_+(X; Y)$ (see Remark 3.1.61). The propositions that follow summarize some of the basic properties of Fredholm operators.

PROPOSITION 3.1.69 If $X, Y$ are two Banach spaces and $L \in \Phi(X; Y)$, then
(a) if $\operatorname{ind} L = 0$ and $\ker L = \{0\}$, then for every $y \in Y$ the equation $L(x) = y$ has a unique solution and $L^{-1}$ exists and is bounded (i.e., $L^{-1} \in \mathcal{L}(Y; X)$);
(b) for given $y \in Y$, the equation $L(x) = y$ has a solution if and only if
\[ \langle y^*, y \rangle_Y = 0 \quad \forall\, y^* \in \ker L^*, \]
i.e., $y \in {}^{\perp}(\ker L^*)$;
(c) $L^* \in \Phi(Y^*; X^*)$ and $\alpha(L^*) = \beta(L)$, $\beta(L^*) = \alpha(L)$, $\operatorname{ind} L^* = -\operatorname{ind} L$.

PROOF (a) Since $\ker L = \{0\}$, $L$ is injective. Also because $\operatorname{ind} L = 0$, we have $\beta(L) = 0$ and so $R(L) = Y$. Invoking Banach's Theorem, we conclude that $L^{-1} \in \mathcal{L}(Y; X)$ and $L(x) = y$ has a unique solution.

(b) From Corollary 3.1.68, we know that $R(L)$ is closed. Hence
\[ R(L) = {}^{\perp}(\ker L^*). \]

(c) Since $R(L)$ is closed, we have
\[ \overline{\ker L^*}^{\,w^*} = R(L)^{\perp} \quad \text{and} \quad {}^{\perp}\big(R(L^*)\big) = \ker L. \]
So
\[ \alpha(L^*) = \beta(L) \quad \text{and} \quad \beta(L^*) = \alpha(L). \]
Because $L \in \Phi(X; Y)$, it follows that
\[ L^* \in \Phi(Y^*; X^*) \quad \text{and} \quad \operatorname{ind} L^* = -\operatorname{ind} L. \]

The next proposition gives a basic stability property of Fredholm operators.

PROPOSITION 3.1.70 If $X, Y$ are two Banach spaces, $L \in \Phi(X; Y)$ and $T \in \mathcal{L}_c(X; Y)$, then $G \overset{df}{=} L + T \in \Phi(X; Y)$ and $\operatorname{ind} G = \operatorname{ind} L$.
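In finite dimensions every linear operator is compact, so Proposition 3.1.70 forces the index of an operator from $\mathbb{R}^n$ into $\mathbb{R}^k$ to be the same for all matrices, namely $n - k$ by the rank-nullity theorem. The following sketch checks this on two matrices; the rank routine (Gaussian elimination over exact fractions), the function names and the sample matrices are assumptions introduced for illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via Gaussian elimination over fractions."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def index(rows):
    """ind(L) = alpha(L) - beta(L) for the operator x -> rows * x."""
    n_rows, n_cols = len(rows), len(rows[0])
    rk = rank(rows)
    alpha = n_cols - rk   # dim ker L
    beta = n_rows - rk    # codim R(L)
    return alpha - beta

L = [[1, 0, 0],
     [0, 1, 0]]           # maps R^3 onto R^2
T = [[-1, 0, 0],
     [0, -1, 0]]          # a perturbation (compact, since dim < infinity)
G = [[a + b for a, b in zip(rl, rt)] for rl, rt in zip(L, T)]  # G = L + T = 0
```

Here `L` has $\alpha = 1$, $\beta = 0$, while `G` has $\alpha = 3$, $\beta = 2$; the individual quantities change under the perturbation but the index does not.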


REMARK 3.1.71 The result is also true if instead of $T \in \mathcal{L}_c(X; Y)$ we assume that $T \in \mathcal{L}(X; Y)$ with $\|T\|_{\mathcal{L}} \le \delta$ for some $\delta = \delta(L) > 0$. Also note that in particular Proposition 3.1.70 implies that if $T \in \mathcal{L}_c(X; Y)$ then $\mathrm{id}_X - T \in \Phi(X; Y)$.

PROPOSITION 3.1.72 If $X$ is a Banach space and $L \in \mathcal{L}(X)$, then $L \in \Phi_+(X)$ if and only if for every closed and bounded set $B \subseteq X$, $L|_B$ is proper.

PROOF "$\Longrightarrow$": Let $\{x_n\}_{n \ge 1} \subseteq B$ be such that
\[ L(x_n) \longrightarrow u \quad \text{in } X. \]
We have that $X = \ker L \oplus E$, with a closed subspace $E \subseteq X$ (since $\dim\ker L < +\infty$). So $x_n = z_n + e_n$ with $z_n \in \ker L$, $e_n \in E$. We have
\[ L(x_n) = L(e_n) \longrightarrow u \quad \text{in } X. \]
The operator $L|_E = \widehat{L}$ (see Definition 3.1.63) is injective and so by Banach's Theorem,
\[ (L|_E)^{-1} \in \mathcal{L}\big(R(L); E\big). \]
Therefore, we have
\[ e_n \longrightarrow e \quad \text{in } X, \]
for some $e \in E$. The sequence $\{z_n\}_{n \ge 1} \subseteq \ker L$ is bounded. So exploiting the finite dimensionality of $\ker L$, we have that the sequence $\{z_n\}_{n \ge 1}$ is relatively compact in $X$. Therefore we conclude that the sequence $\{x_n\}_{n \ge 1}$ is relatively compact in $X$, which proves that $L|_B$ is proper.

"$\Longleftarrow$": The set
\[ \big\{ x \in X : x \in \ker L,\ \|x\|_X \le 1 \big\} \]
is compact by hypothesis. So it follows that $\ker L$ is finite dimensional. We can write $X = \ker L \oplus E$, with a closed subspace $E \subseteq X$. Then Proposition 3.1.67 implies that $R(L)$ is closed, hence $L \in \Phi_+(X)$.

3.2 Operators of Monotone Type

Operators of monotone type were introduced to provide an analytic framework broader than compact operators in order to study nonlinear functional equations. The systematic study of monotone operators started in the early 1960s and marks the advent of nonlinear functional analysis. Monotone operators are rooted in the theory of variational problems. Moreover, recalling that the Gâteaux derivative of a convex function is the prototypical example of a nonlinear monotone operator, it is no surprise that for a long period the theory of monotone operators and convex analysis developed in parallel and interacted heavily.

The mathematical framework of the analysis in this section is the following. Let $X$ be a reflexive Banach space and $X^*$ its topological dual. By $\langle \cdot, \cdot \rangle_X$ we denote the duality pairing for the spaces $X^*$ and $X$. Also
\[ A : X \supseteq D(A) \longrightarrow 2^{X^*} \]
is a generally multivalued operator. The domain $D(A)$ of $A$ is defined by
\[ D(A) \overset{df}{=} \big\{ x \in X : A(x) \ne \emptyset \big\} \]
and the graph $\operatorname{Gr} A$ of $A$ is defined by
\[ \operatorname{Gr} A \overset{df}{=} \big\{ (x, x^*) \in X \times X^* : x^* \in A(x) \big\}. \]
Also we can define $A^{-1} : X^* \supseteq D(A^{-1}) \longrightarrow 2^X$ by
\[ \operatorname{Gr} A^{-1} \overset{df}{=} \big\{ (x^*, x) \in X^* \times X : (x, x^*) \in \operatorname{Gr} A \big\}. \]
Note that $A^{-1}$ is always defined as a multivalued operator. Some of the results in this section are actually true in a more general setting. However, in order to have a uniform presentation, we have chosen to work in the above setting, which after all is what we encounter in most applications.

DEFINITION 3.2.1 Let $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ be a multivalued operator.

(a) We say that $A$ is monotone, if
\[ \langle x^* - y^*, x - y \rangle_X \ge 0, \]
for all $x, y \in D(A)$ and all $x^* \in A(x)$, $y^* \in A(y)$.

(b) We say that $A$ is strictly monotone, if it is monotone and
\[ \langle x^* - y^*, x - y \rangle_X > 0, \]
for all $x, y \in D(A)$, $x \ne y$ and all $x^* \in A(x)$, $y^* \in A(y)$.

(c) We say that $A$ is strongly monotone, if there exists $c > 0$, such that
\[ \langle x^* - y^*, x - y \rangle_X \ge c\,\|x - y\|_X^2, \]
for all $x, y \in D(A)$ and all $x^* \in A(x)$, $y^* \in A(y)$.

(d) We say that $A$ is uniformly monotone, if there exists a continuous function $c : \mathbb{R}_+ \longrightarrow \mathbb{R}_+$, which is strictly increasing, $c(0) = 0$, $c(r) \longrightarrow +\infty$ as $r \to +\infty$ and
\[ \langle x^* - y^*, x - y \rangle_X \ge c\big(\|x - y\|_X\big)\,\|x - y\|_X, \]
for all $x, y \in D(A)$ and all $x^* \in A(x)$, $y^* \in A(y)$.

(e) We say that $A$ is coercive, if $D(A)$ is bounded or $D(A)$ is unbounded and
\[ \frac{\inf\{\langle x^*, x \rangle_X : x^* \in A(x)\}}{\|x\|_X} \longrightarrow +\infty \quad \text{as } \|x\|_X \to +\infty,\ x \in D(A). \]
We say that $A$ is weakly coercive, if $D(A)$ is bounded or $D(A)$ is unbounded and
\[ \inf_{x^* \in A(x)} \|x^*\|_{X^*} \longrightarrow +\infty \quad \text{as } \|x\|_X \to +\infty,\ x \in D(A). \]
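For single-valued maps on $X = \mathbb{R}$ the inequalities in Definition 3.2.1 reduce to elementary statements about $(f(x) - f(y))(x - y)$, which can be probed on a finite grid. The sketch below is only a sampling check under stated assumptions (the grid, the two test functions and the helper `min_gap` are hypothetical choices); a finite sample can refute an inequality but cannot prove it.

```python
def min_gap(f, points, weight):
    """Smallest value of (f(x) - f(y)) * (x - y) - weight(|x - y|)
    over all pairs of distinct sample points."""
    return min((f(x) - f(y)) * (x - y) - weight(abs(x - y))
               for x in points for y in points if x != y)

def cube(x):
    return x ** 3          # strictly monotone, not strongly monotone

def affine(x):
    return 2 * x + 1       # strongly monotone with constant c = 2

# Dyadic grid on [-3, 3]; all arithmetic below is exact in floating point.
points = [k / 4 for k in range(-12, 13)]

strict_cube = min_gap(cube, points, lambda r: 0.0)        # test for (b)
strong_affine = min_gap(affine, points, lambda r: 2 * r * r)  # test for (c), c = 2
strong_cube = min_gap(cube, points, lambda r: r * r)      # test for (c), c = 1
```

On this grid `strict_cube` is positive and `strong_affine` is nonnegative, while `strong_cube` is negative: near the origin $x \mapsto x^3$ violates the strong monotonicity inequality with $c = 1$ although it is strictly monotone.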

REMARK 3.2.2 If $A$ is strongly monotone, then $A$ is uniformly monotone. If $A$ is uniformly monotone, then $A$ is strictly monotone. If $A$ is strictly monotone, then $A$ is monotone. If $A$ is uniformly monotone, then $A$ is coercive. If $A$ is coercive, then $A$ is weakly coercive. Sometimes it is convenient to identify $A$ with its graph. For this reason some authors speak of monotone sets in $X \times X^*$.

DEFINITION 3.2.3 A monotone map $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is said to be maximal monotone, if the inequality
\[ \langle x^* - y^*, x - y \rangle_X \ge 0 \quad \forall\, (x, x^*) \in \operatorname{Gr} A \]
implies that $(y, y^*) \in \operatorname{Gr} A$.

REMARK 3.2.4 The above definition implies that the graph of a maximal monotone map is not properly included in the graph of another monotone map (i.e., it is maximal with respect to inclusion).


EXAMPLE 3.2.5 An increasing continuous function $f : \mathbb{R} \longrightarrow \mathbb{R}$ is maximal monotone. However, an increasing discontinuous function $f : \mathbb{R} \longrightarrow \mathbb{R}$ is monotone but not maximal monotone, since it admits a monotone extension by filling in the jumps at the discontinuity points. This example underlines the necessity of multivalued operators in the study of maximal monotonicity.

The next result is an immediate consequence of Definition 3.2.1.

PROPOSITION 3.2.6 A map $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is maximal monotone if and only if $A^{-1} : X^* \supseteq D(A^{-1}) \longrightarrow 2^X$ is maximal monotone.

PROPOSITION 3.2.7 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is a maximal monotone map, then for every $x \in D(A)$, the set $A(x)$ is nonempty, convex and closed.

PROOF Since $x \in D(A)$, $A(x) \ne \emptyset$. Let $x^*, y^* \in A(x)$. Set
\[ u^*_\lambda \overset{df}{=} \lambda x^* + (1 - \lambda) y^* \quad \forall\, \lambda \in [0, 1]. \]
For all $(z, z^*) \in \operatorname{Gr} A$, we have
\[ \langle u^*_\lambda - z^*, x - z \rangle_X = \lambda \langle x^* - z^*, x - z \rangle_X + (1 - \lambda) \langle y^* - z^*, x - z \rangle_X \ge 0, \]
hence $u^*_\lambda \in A(x)$ (see Definition 3.2.3). Therefore $A(x)$ is convex. Also suppose that $\{x^*_n\}_{n \ge 1} \subseteq A(x)$ is a sequence, such that
\[ x^*_n \longrightarrow x^* \quad \text{in } X^*. \]
We have
\[ \langle x^*_n - z^*, x - z \rangle_X \ge 0 \quad \forall\, n \ge 1,\ (z, z^*) \in \operatorname{Gr} A. \]
In the limit as $n \to +\infty$, we have $\langle x^* - z^*, x - z \rangle_X \ge 0$, hence $(x, x^*) \in \operatorname{Gr} A$, i.e., $A(x)$ is closed in $X^*$.

A fundamental property of monotone maps is local boundedness.

DEFINITION 3.2.8 A monotone map $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is said to be locally bounded at $x \in D(A)$, if there exist $M > 0$ and $r > 0$, such that
\[ \|y^*\|_{X^*} \le M \quad \forall\, y \in D(A) \cap \overline{B}_r(x),\ y^* \in A(y). \]


DEFINITION 3.2.9 If $C \subseteq X$ is a nonempty set, a point $x \in C$ is an absorbing point of $C$, if the set $C - x$ is absorbing, i.e.,
\[ X = \bigcup_{\lambda > 0} \lambda (C - x). \]

REMARK 3.2.10 If $\operatorname{int} C \ne \emptyset$, then any $x \in \operatorname{int} C$ is an absorbing point of $C$. If $C = \partial B_1 \cup \{0\}$, then $0$ is an absorbing point although $\operatorname{int} C = \emptyset$.

PROPOSITION 3.2.11 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is monotone and $x \in D(A)$ is an absorbing point of $D(A)$, then $A$ is locally bounded at $x$.

PROOF Without any loss of generality we may assume that $x = 0$ and $0 \in A(0)$ (i.e., $(0, 0) \in \operatorname{Gr} A$). Indeed if this is not the case, we choose $x^* \in A(x)$ and consider the map
\[ A_1(y) \overset{df}{=} A(y + x) - x^*. \]
Evidently $A_1$ is still monotone, $(0, 0) \in \operatorname{Gr} A_1$ and $D(A_1) = D(A) - x$. So we can replace $A$ with $A_1$. Therefore we need to show that $A$ is locally bounded at $0$. For every $u \in X$, we define
\[ \varphi(u) \overset{df}{=} \sup \big\{ \langle y^*, u - y \rangle_X : y \in D(A),\ \|y\|_X \le 1,\ y^* \in A(y) \big\}. \]
Clearly $\varphi$ is the supremum of affine continuous functions, hence $\varphi$ is convex and lower semicontinuous and because $(0, 0) \in \operatorname{Gr} A$, we have $\varphi \ge 0$. The set
\[ C \overset{df}{=} \big\{ u \in X : \varphi(u) \le 1 \big\} \]
is closed and convex. We claim that $0 \in C$. Indeed because $(0, 0) \in \operatorname{Gr} A$, we have
\[ 0 \le \langle y^*, y \rangle_X \quad \forall\, (y, y^*) \in \operatorname{Gr} A \]
and so $\varphi(0) \le 0$. Let
\[ E \overset{df}{=} C \cap (-C). \]
This is a closed, convex and symmetric set. We claim that it is absorbing too. Let $u \in X$. Since by hypothesis $D(A)$ is absorbing, we can find $\lambda > 0$, such that $\lambda u \in D(A)$, i.e., $A(\lambda u) \ne \emptyset$. Choose $v^* \in A(\lambda u)$. If $(y, y^*) \in \operatorname{Gr} A$, from the monotonicity of $A$, we have
\[ \langle y^*, \lambda u - y \rangle_X \le \langle v^*, \lambda u - y \rangle_X, \]
so
\[ \varphi(\lambda u) \le \sup_{y \in D(A),\ \|y\|_X \le 1} \langle v^*, \lambda u - y \rangle_X \le \langle v^*, \lambda u \rangle_X + \|v^*\|_{X^*} < +\infty. \]
Choose $t \in (0, 1)$, such that $t\varphi(\lambda u) < 1$. Because $\varphi$ is convex, we have
\[ \varphi(t\lambda u) \le t\varphi(\lambda u) + (1 - t)\varphi(0) = t\varphi(\lambda u) < 1, \]
so $t\lambda u \in C$. This shows that $C$ is absorbing, hence $E$ is absorbing too. Thus $E$ is a neighbourhood of the origin and so we can find $\delta > 0$, such that
\[ \varphi(u) \le 1 \quad \forall\, \|u\|_X \le 2\delta. \]
This means that
\[ \langle y^*, u \rangle_X \le 1 + \langle y^*, y \rangle_X \quad \forall\, y \in D(A),\ \|y\|_X \le 1,\ y^* \in A(y),\ \|u\|_X \le 2\delta. \]
Therefore, if $y \in D(A) \cap \overline{B}_\delta$ and $y^* \in A(y)$, we have
\[ 2\delta\,\|y^*\|_{X^*} = \sup_{\|u\|_X \le 2\delta} \langle y^*, u \rangle_X \le 1 + \|y^*\|_{X^*} \|y\|_X \le 1 + \delta\,\|y^*\|_{X^*}, \]
so $\|y^*\|_{X^*} \le \frac{1}{\delta}$.

Using this result, we can determine the continuity properties of maximal monotone maps. First let us recall the following notion from multivalued analysis.

DEFINITION 3.2.12 Let $Y, Z$ be two Hausdorff topological spaces. A multifunction (set-valued map) $F : Y \longrightarrow 2^Z \setminus \{\emptyset\}$ is said to be upper semicontinuous, if for any closed set $C \subseteq Z$, the set
\[ F^-(C) \overset{df}{=} \big\{ y \in Y : F(y) \cap C \ne \emptyset \big\} \]
is closed.

REMARK 3.2.13 It is easy to check that the above definition is equivalent to saying that for any open set $U \subseteq Z$, the set
\[ F^+(U) \overset{df}{=} \big\{ y \in Y : F(y) \subseteq U \big\} \]
is open. Moreover, if for all $y \in Y$, the set $F(y) \subseteq Z$ is closed and $Z$ is regular, then upper semicontinuity of $F$ implies that $\operatorname{Gr} F \subseteq Y \times Z$ is closed (see Hu & Papageorgiou (1997, p. 41)). The converse is true if $F$ is locally compact, i.e., for every $y \in Y$, we can find a neighbourhood $U$ of $y$, such that $\overline{F(U)}$ is compact in $Z$. Finally note that if $F$ is single-valued, then the notion of upper semicontinuity coincides with that of continuity.


PROPOSITION 3.2.14 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is a maximal monotone map and $\operatorname{int} D(A) \ne \emptyset$, then $A|_{\operatorname{int} D(A)}$ is upper semicontinuous from $X$ with the norm topology into $X^*$ with the weak topology.

PROOF Let $C \subseteq X^*$ be a weakly closed set. We need to show that the set
\[ \big(A|_{\operatorname{int} D(A)}\big)^-(C) = \big\{ x \in \operatorname{int} D(A) : A(x) \cap C \ne \emptyset \big\} \]
is closed in $\operatorname{int} D(A)$. To this end let $\{x_n\}_{n \ge 1} \subseteq \big(A|_{\operatorname{int} D(A)}\big)^-(C)$ be a sequence, such that
\[ x_n \longrightarrow x \quad \text{in } X, \]
for some $x \in \operatorname{int} D(A)$. Let
\[ x^*_n \in A(x_n) \cap C \quad \forall\, n \ge 1. \]
Then Proposition 3.2.11 implies that the sequence $\{x^*_n\}_{n \ge 1} \subseteq X^*$ is bounded. By virtue of the reflexivity of $X^*$ and the Eberlein-Smulian Theorem (see Theorem A.3.8), we may assume that
\[ x^*_n \overset{w}{\longrightarrow} x^* \quad \text{in } X^*. \]
Clearly $x^* \in C$. Also, we have
\[ \langle x^*_n - y^*, x_n - y \rangle_X \ge 0 \quad \forall\, n \ge 1,\ (y, y^*) \in \operatorname{Gr} A, \]
so
\[ \lim_{n \to +\infty} \langle x^*_n - y^*, x_n - y \rangle_X = \langle x^* - y^*, x - y \rangle_X \ge 0. \]
Because $A$ is maximal monotone, we infer that $x^* \in A(x)$. Therefore $x \in \big(A|_{\operatorname{int} D(A)}\big)^-(C)$ and this proves the claimed upper semicontinuity of $A|_{\operatorname{int} D(A)}$.

A careful reading of the previous proof reveals that the following is also true.

PROPOSITION 3.2.15 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is a maximal monotone map, then $\operatorname{Gr} A \subseteq X \times X^*_w$ and $\operatorname{Gr} A \subseteq X_w \times X^*$ are closed sets (here $Z_w$ denotes the space $Z$ furnished with the weak topology).


DEFINITION 3.2.16 Let $Y, Z$ be two Banach spaces and let $V : Y \longrightarrow 2^Z \setminus \{\emptyset\}$ be a multifunction.
(a) We say that $V$ is demicontinuous, if it is upper semicontinuous from $Y$ with the norm topology into $Z$ with the weak topology.
(b) We say that $V$ is hemicontinuous, if for all $x, y \in Y$, the multivalued map $\lambda \longmapsto V\big(\lambda x + (1 - \lambda)y\big)$ is upper semicontinuous from $[0, 1]$ into $Z$ with the weak topology.
(c) We say that $V$ is bounded, if it maps bounded sets in $Y$ into bounded sets in $Z$.

REMARK 3.2.17 Evidently demicontinuity implies hemicontinuity. For monotone maps $A : X \longrightarrow 2^{X^*}$ with $D(A) = X$, the converse is also true.

PROPOSITION 3.2.18 If $A : X \longrightarrow 2^{X^*}$ is a monotone hemicontinuous map with $D(A) = X$, then $A$ is demicontinuous.

PROOF If $C \subseteq X^*$ is $w$-closed, we need to show that the set
\[ A^-(C) = \big\{ x \in X : A(x) \cap C \ne \emptyset \big\} \]
is norm closed in $X$. To this end let $\{x_n\}_{n \ge 1} \subseteq A^-(C)$ be a sequence, such that
\[ x_n \longrightarrow x \quad \text{in } X. \]
Let
\[ x^*_n \in A(x_n) \cap C \quad \forall\, n \ge 1. \]
Then Proposition 3.2.11 implies that the sequence $\{x^*_n\}_{n \ge 1} \subseteq X^*$ is bounded and so we may assume that
\[ x^*_n \overset{w}{\longrightarrow} x^* \quad \text{in } X^*. \]
Set
\[ y_\lambda \overset{df}{=} x + \lambda y, \]
and let
\[ y^*_\lambda \in A(y_\lambda) \quad \forall\, \lambda > 0,\ y \in X. \]
From the monotonicity of $A$, we have
\[ \langle x^*_n - y^*_\lambda, x_n - x \rangle_X - \lambda \langle x^*_n - y^*_\lambda, y \rangle_X \ge 0 \quad \forall\, n \ge 1, \]
so
\[ \langle x^*_n - y^*_\lambda, y \rangle_X \le \frac{1}{\lambda} \langle x^*_n - y^*_\lambda, x_n - x \rangle_X \quad \forall\, n \ge 1. \]
Passing to the limit as $n \to +\infty$, we obtain
\[ \langle x^* - y^*_\lambda, y \rangle_X \le 0. \]
Next let $\lambda \searrow 0$. Due to the hemicontinuity of $A$, we may say that
\[ y^*_\lambda \overset{w}{\longrightarrow} y^* \quad \text{in } X^*, \]
for some $y^* \in A(x)$. So we obtain that
\[ \langle x^* - y^*, y \rangle_X \le 0. \]
Because $y \in X$ was arbitrary, it follows that $x^* = y^* \in A(x)$. Therefore $x \in A^-(C)$ and we have proved the demicontinuity of $A$.

Next we give a sufficient condition for maximality of a monotone map.

PROPOSITION 3.2.19 If $A : X \longrightarrow 2^{X^*}$ is a monotone map with $D(A) = X$, which is hemicontinuous and for every $x \in X$, the set $A(x) \subseteq X^*$ is closed and convex, then $A$ is maximal monotone.

PROOF Suppose that $\widehat{A}$ is a monotone extension of $A$ and $x^*_0 \in \widehat{A}(x_0)$. We need to show that $x^*_0 \in A(x_0)$. If this is not true, then from the strong separation theorem (see Theorem A.3.2), we can find $u \in X \setminus \{0\}$, such that
\[ \langle x^*, u \rangle_X < \langle x^*_0, u \rangle_X \quad \forall\, x^* \in A(x_0). \tag{3.8} \]
Let $\lambda > 0$ and $x_\lambda = x_0 + \lambda u$. By virtue of the monotonicity of $\widehat{A}$, we have
\[ \lambda \langle x^*_\lambda - x^*_0, u \rangle_X \ge 0 \quad \forall\, x^*_\lambda \in A(x_\lambda), \]
so
\[ \langle x^*_\lambda - x^*_0, u \rangle_X \ge 0 \quad \forall\, x^*_\lambda \in A(x_\lambda). \tag{3.9} \]
Because of the hemicontinuity of $A$, we can say that
\[ x^*_\lambda \overset{w}{\longrightarrow} x^* \quad \text{in } X^* \quad \text{as } \lambda \searrow 0, \]
for some $x^* \in A(x_0)$. So from (3.9), we have that
\[ \langle x^* - x^*_0, u \rangle_X \ge 0, \]
which contradicts (3.8).


At this point let us give some standard examples of maximal monotone maps.

EXAMPLE 3.2.20 (a) Let $H$ be a Hilbert space and let $C \subseteq H$ be a closed, convex set. It is well known that for each $x \in H$, there exists a unique element in $C$, denoted by $\operatorname{proj}_C(x)$, such that
\[ \big\|x - \operatorname{proj}_C(x)\big\|_H = \inf_{c \in C} \|x - c\|_H \]
(best approximation of $x$ in $C$). The map $\operatorname{proj}_C : H \longrightarrow C$ is known as the metric projection on $C$. Then $\operatorname{proj}_C$ is a maximal monotone map. Indeed, recalling that
\[ \big(x - \operatorname{proj}_C(x), c - \operatorname{proj}_C(x)\big)_H \le 0 \quad \forall\, c \in C, \]
we can easily check that
\[ \big\|\operatorname{proj}_C(x) - \operatorname{proj}_C(y)\big\|_H^2 \le \big(x - y, \operatorname{proj}_C(x) - \operatorname{proj}_C(y)\big)_H \]
and so
\[ \big\|\operatorname{proj}_C(x) - \operatorname{proj}_C(y)\big\|_H \le \|x - y\|_H, \]
i.e., the map $x \longmapsto \operatorname{proj}_C(x)$ is nonexpansive and it is monotone. So by Proposition 3.2.19, the map $x \longmapsto \operatorname{proj}_C(x)$ is maximal monotone.

(b) If $H$ is a Hilbert space and $A : H \longrightarrow H$ is nonexpansive (i.e., Lipschitz continuous with Lipschitz constant equal to 1), then it is easy to check that $\mathrm{id}_H + A$ is maximal monotone.

(c) Let $X$ be a reflexive Banach space and $\varphi : X \longrightarrow \overline{\mathbb{R}} \overset{df}{=} \mathbb{R} \cup \{+\infty\}$ a proper (i.e., not identically $+\infty$), convex, lower semicontinuous function. The subdifferential of $\varphi$ is defined by
\[ \partial\varphi(x) \overset{df}{=} \big\{ x^* \in X^* : \langle x^*, y - x \rangle_X \le \varphi(y) - \varphi(x) \ \ \forall\, y \in X \big\}. \]
Then $\partial\varphi : X \longrightarrow 2^{X^*}$ is a maximal monotone map. We shall prove this and more in Section 4.3, where we conduct a detailed study of the convex subdifferential. For the moment, we keep in mind the maximality of the subdifferential map, in order to better understand the next example.

(d) Let $X$ be a reflexive Banach space and consider the map $\mathcal{F} : X \longrightarrow 2^{X^*}$ defined by
\[ \mathcal{F}(x) \overset{df}{=} \big\{ x^* \in X^* : \langle x^*, x \rangle_X = \|x\|_X^2 = \|x^*\|_{X^*}^2 \big\}. \]

According to the Hahn-Banach Theorem, we see that F(x) 6= ∅ for all x ∈ X. Moreover, we have that F(x) = ∂ϕ(x),

where ϕ(x) =

1 2 kxkX . 2

Indeed, if x∗ ∈ F(x), then 2

hx∗ , y − xiX 6 kx∗ kX ∗ kykX − kxkX 1¡ 2 2 ¢ 2 6 kxkX + kykX − kxkX = ϕ(y) − ϕ(x), 2 df

so x∗ ∈ ∂ϕ(x). Conversely, if x∗ ∈ ∂ϕ(x). Let ψ(x) = kxkX for all x ∈ X. By ϕ0 (x; ·) and ψ 0 (x; ·) we denote the directional derivatives at x ∈ X of the convex functions ϕ and ψ respectively, i.e., for all h ∈ X, we have ϕ(x + λh) − ϕ(x) λ ψ(x + λh) − ψ(x) df ψ 0 (x; h) = lim . λ&0 λ df

ϕ0 (x; h) = lim

λ&0

The limits exist since the difference quotients decrease as λ & 0, because of the convexity of the functions. Then we have 2

kx + λhkX kxkX − kxkX λ&0 λ 2 2 1 kx + λhkX − kxkX 6 lim = ϕ0 (x; h) λ&0 2 λ

ψ 0 (x; h) kxkX = lim

(3.10)

and 2

2

1 kx + λhkX − kxkX λ&0 2 λ · ¸ ¢ 1 kx + λhkX − kxkX ¡ = lim kx + λhkX + kxkX λ&0 2 λ = ψ 0 (x; h) kxkX . (3.11)

ϕ0 (x; h) = lim

From (3.10) and (3.11), we infer that ϕ0 (x; h) = ψ 0 (x; h) kxkX . Clearly from the definition of ∂ϕ(x), we see that x∗ ∈ ∂ϕ(x) if and only if hx∗ , hiX 6 ϕ0 (x; h) = ψ 0 (x; h) kxkX . So, we have ¿ ∗ À x ,h 6 ψ 0 (x; h) 6 ψ(x + h) − ψ(x) 6 khkX kxkX X

∀ h ∈ X,

3. Nonlinear Operators and Young Measures so

kx∗ kX ∗ 6 kxkX .

(3.12)

On the other hand since ¿ ∗ À x ,h 6 ψ 0 (x; h) kxkX X we have that

x∗ kxkX

¿

313

∀ h ∈ X,

∈ ∂ψ(x) and so

À x∗ ,y − x 6 ψ(y) − ψ(x) kxkX X

Let y = 0. We obtain

¿

Therefore

∀ y ∈ X.

À x∗ ,x > kxkX . kxkX X kx∗ kX ∗ > kxkX .

(3.13)

From (3.12) and (3.13), it follows that kx∗ kX ∗ = kxkX , hence x∗ ∈ F(x). Note that if X is a Hilbert space, then F is the canonical isomorphism between X and X ∗ . So if X = H is a pivot Hilbert space (i.e., H = H ∗ ), then F is an identity operator. REMARK 3.2.21 The duality map introduced in Example 3.2.20(a) actually can be defined on any Banach space (not necessarily reflexive) and is essentially dependent on the norm of the space. More precisely, if k·k1 and k·k2 are two equivalent norms on X and F1 and F2 the corresponding duality maps, then we need not have F1 = F2 . In fact in the proposition that follows, we show that the geometry of X and X ∗ is closely related to the properties of the duality map F. PROPOSITION 3.2.22 If X is a reflexive Banach space with X ∗ strictly convex, then the duality map F : X −→ X ∗ is single-valued, odd, demicontinuous, maximal monotone, coercive and bounded. PROOF

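Let $x_1^*, x_2^* \in \mathcal{F}(x)$. Then we have
\[
\langle x_k^*, x \rangle_X = \|x\|_X^2 = \|x_k^*\|_{X^*}^2 \qquad \text{for } k \in \{1,2\}.
\]
So, we have
\[
2\,\|x_1^*\|_{X^*} \|x\|_X \le \|x_1^*\|_{X^*}^2 + \|x_2^*\|_{X^*}^2 = \langle x_1^* + x_2^*, x \rangle_X \le \|x_1^* + x_2^*\|_{X^*} \|x\|_X,
\]
thus
\[
\|x_1^*\|_{X^*} \le \frac{1}{2}\,\|x_1^* + x_2^*\|_{X^*}
\]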


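and so $x_1^* = x_2^*$ due to the strict convexity of $X^*$. Clearly $\mathcal{F}(-x) = -\mathcal{F}(x)$, i.e., $\mathcal{F}$ is odd.

To show the demicontinuity of $\mathcal{F}$, suppose that $\{x_n\}_{n \ge 1} \subseteq X$ is a sequence, such that $x_n \to x$ in $X$, for some $x \in X$. Then
\[
\|\mathcal{F}(x_n)\|_{X^*} = \|x_n\|_X \to \|x\|_X
\]
and so the sequence $\{\mathcal{F}(x_n)\}_{n \ge 1} \subseteq X^*$ is bounded. Because of the reflexivity of $X^*$, we may assume that
\[
\mathcal{F}(x_n) \xrightarrow{w} x^* \qquad \text{in } X^*.
\]
We have
\[
\langle x^*, h \rangle_X = \lim_{n \to +\infty} \langle \mathcal{F}(x_n), h \rangle_X \le \lim_{n \to +\infty} \|x_n\|_X \|h\|_X = \|x\|_X \|h\|_X \qquad \forall\, h \in X
\tag{3.14}
\]
and
\[
\langle x^*, x \rangle_X = \lim_{n \to +\infty} \langle \mathcal{F}(x_n), x \rangle_X = \lim_{n \to +\infty} \|x_n\|_X^2 = \|x\|_X^2.
\tag{3.15}
\]
From (3.14), we have
\[
\|x^*\|_{X^*} \le \|x\|_X
\]
and from (3.15), we have $\|x^*\|_{X^*} \ge \|x\|_X$. Therefore
\[
\|x^*\|_{X^*} = \|x\|_X
\]
and so $x^* \in \mathcal{F}(x)$, which proves the demicontinuity of $\mathcal{F}$.

Maximal monotonicity follows from Examples 3.2.20(c) and (d). A more direct proof is the following. Let $x, y \in X$. Then we have
\[
\langle \mathcal{F}(x) - \mathcal{F}(y), x - y \rangle_X \ge \|x\|_X^2 + \|y\|_X^2 - 2\,\|x\|_X \|y\|_X = \bigl( \|x\|_X - \|y\|_X \bigr)^2 \ge 0,
\tag{3.16}
\]
so $\mathcal{F}$ is monotone. Invoking Proposition 3.2.19, we conclude that $\mathcal{F}$ is maximal monotone. Also we have
\[
\langle \mathcal{F}(x), x \rangle_X = \|x\|_X^2,
\]
i.e., $\mathcal{F}$ is coercive. Finally it is clear that $\mathcal{F}$ is bounded.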
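As a concrete illustration of Proposition 3.2.22 and Remark 3.2.21, the following Python sketch evaluates the duality map of the finite dimensional space $(\mathbb{R}^n, \|\cdot\|_p)$ with $1 < p < +\infty$, whose dual norm is $\|\cdot\|_q$ with $\frac1p + \frac1q = 1$. The closed formula $\mathcal{F}(x)_i = \|x\|_p^{2-p}\,|x_i|^{p-1}\operatorname{sign}(x_i)$ is a standard fact that is not stated in the text, and the helper names `p_norm`, `duality_map` and `pairing` are chosen here only for illustration.

```python
import math

def p_norm(x, p):
    """The l^p norm of a finite sequence x."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def duality_map(x, p):
    """Duality map of (R^n, ||.||_p) for 1 < p < infinity (assumed closed form)."""
    nx = p_norm(x, p)
    if nx == 0.0:
        return [0.0] * len(x)
    scale = nx ** (2.0 - p)
    return [scale * abs(t) ** (p - 1.0) * math.copysign(1.0, t) for t in x]

def pairing(xs, x):
    """Duality brackets <x*, x> on R^n."""
    return sum(a * b for a, b in zip(xs, x))

p = 3.0
q = p / (p - 1.0)          # conjugate exponent
x = [1.0, -2.0, 0.5]
y = [-0.3, 0.7, 2.0]
Fx, Fy = duality_map(x, p), duality_map(y, p)

# <F(x), x> = ||x||^2 and ||F(x)||_{X*} = ||x||_X
print(pairing(Fx, x), p_norm(x, p) ** 2)
print(p_norm(Fx, q), p_norm(x, p))

# Inequality (3.16): <F(x) - F(y), x - y> >= (||x|| - ||y||)^2
lhs = pairing([a - b for a, b in zip(Fx, Fy)], [a - b for a, b in zip(x, y)])
rhs = (p_norm(x, p) - p_norm(y, p)) ** 2
print(lhs, rhs)

# Remark 3.2.21: an equivalent norm (p = 2) gives a different duality map.
print(duality_map(x, 2.0), Fx)
```

For $p = 2$ the map reduces to the identity, in agreement with the pivot Hilbert space case, while for $p = 3$ it is a genuinely nonlinear map on the same underlying vector space.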


REMARK 3.2.23 As we shall see later in this section (see Corollary 3.2.31), the maximal monotonicity and coercivity of $\mathcal{F}$ imply that $\mathcal{F}$ is surjective. Also we have
\[
\varphi'(x;h) = \langle \mathcal{F}(x), h \rangle_X \qquad \forall\, h \in X,
\]
which means that $\varphi$ is Gâteaux differentiable at every $x \in X$ and $\varphi'(x) = \mathcal{F}(x)$. Moreover, from Example 3.2.20(d), we see that the map $x \mapsto \psi(x) = \|x\|_X$ is Gâteaux differentiable at every $x \ne 0$ and
\[
\psi'(x) = \frac{\mathcal{F}(x)}{\|x\|_X}.
\]
It is a result of Banach space theory that the reflexive Banach space $X$ is smooth (i.e., its norm is Gâteaux differentiable at every $x \ne 0$) if and only if $X^*$ is strictly convex. Similarly the reflexive Banach space $X$ is strictly convex if and only if $X^*$ is smooth (see Day (1973, p. 144)).

PROPOSITION 3.2.24 If $X$ is a reflexive Banach space and both $X$ and $X^*$ are strictly convex, then the duality map $\mathcal{F} : X \to X^*$ is strictly monotone and bijective and $\mathcal{F}^{-1}$ is the duality map of $X^*$.

PROOF

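Suppose that
\[
\langle \mathcal{F}(x) - \mathcal{F}(y), x - y \rangle_X = 0.
\]
Since $x - \frac{x+y}{2} = \frac{x+y}{2} - y = \frac{x-y}{2}$, from (3.16), we have
\[
0 = \left\langle \mathcal{F}(x) - \mathcal{F}\!\left( \frac{x+y}{2} \right), \frac{x-y}{2} \right\rangle_X + \left\langle \mathcal{F}\!\left( \frac{x+y}{2} \right) - \mathcal{F}(y), \frac{x-y}{2} \right\rangle_X
\ge \left( \|x\|_X - \left\| \frac{x+y}{2} \right\|_X \right)^2 + \left( \left\| \frac{x+y}{2} \right\|_X - \|y\|_X \right)^2,
\]
so
\[
\|x\|_X = \left\| \frac{x+y}{2} \right\|_X = \|y\|_X.
\]
Because $X$ is strictly convex, it follows that $x = y$. So $\mathcal{F}$ is strictly monotone. Hence it is injective (see also Proposition 3.2.22). Moreover, we know that $\mathcal{F}$ is surjective (see Remark 3.2.23). Therefore $\mathcal{F}$ is bijective. Finally, it is clear that $\mathcal{F}^{-1} : X^* \to X$ is the duality map of $X^*$.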


PROPOSITION 3.2.25 If $X$ is a reflexive Banach space and $X^*$ is a locally uniformly convex space (see Definition A.3.21), then the duality map $\mathcal{F} : X \to X^*$ is continuous.

PROOF Let $\{x_n\}_{n \ge 1} \subseteq X$ be a sequence, such that
\[
x_n \to x \qquad \text{in } X,
\]
for some $x \in X$. Then
\[
\|\mathcal{F}(x_n)\|_{X^*} \to \|\mathcal{F}(x)\|_{X^*}.
\]
Moreover, because $\mathcal{F}$ is demicontinuous (see Proposition 3.2.22), we have
\[
\mathcal{F}(x_n) \xrightarrow{w} \mathcal{F}(x) \qquad \text{in } X^*.
\]
Since $X^*$ is locally uniformly convex, it has the Kadec-Klee property (see Remark A.3.22) and so
\[
\mathcal{F}(x_n) \to \mathcal{F}(x) \qquad \text{in } X^*.
\]
Therefore $\mathcal{F}$ is continuous.

REMARK 3.2.26 Under the hypotheses of Proposition 3.2.25, the map $x \mapsto \psi(x) = \|x\|_X$ is Fréchet differentiable at every $x \ne 0$. Indeed, from Remark 3.2.23, we know that the map $x \mapsto \psi(x)$ is Gâteaux differentiable at every $x \ne 0$. Also Proposition 3.2.25 says that the map
\[
x \mapsto \psi'(x) = \frac{\mathcal{F}(x)}{\|x\|_X}
\]
is continuous on $X \setminus \{0\}$. So $\psi$ is Fréchet differentiable on $X \setminus \{0\}$.

Combining Propositions 3.2.24 and 3.2.25, we obtain the following proposition.

PROPOSITION 3.2.27 If $X$ is a reflexive Banach space and both $X$ and $X^*$ are locally uniformly convex (see Definition A.3.21), then the duality map $\mathcal{F} : X \to X^*$ is a homeomorphism.

PROPOSITION 3.2.28 If $X$ is a reflexive Banach space and $X^*$ is uniformly convex, then the duality map $\mathcal{F} : X \to X^*$ is uniformly continuous on bounded subsets of $X$.

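PROOF First we show that $\mathcal{F}$ is uniformly continuous on
\[
\partial B_1 \stackrel{df}{=} \bigl\{ x \in X : \|x\|_X = 1 \bigr\}.
\]
If this is not the case, then we can find $\varepsilon > 0$ and two sequences $\{x_n\}_{n \ge 1}, \{y_n\}_{n \ge 1} \subseteq \partial B_1$, such that
\[
\|x_n - y_n\|_X \to 0 \qquad \text{and} \qquad \|\mathcal{F}(x_n) - \mathcal{F}(y_n)\|_{X^*} \ge \varepsilon \qquad \forall\, n \ge 1.
\]
We have
\[
\|\mathcal{F}(x) + \mathcal{F}(y)\|_{X^*} \|x\|_X \ge \langle \mathcal{F}(x) + \mathcal{F}(y), x \rangle_X
= \langle \mathcal{F}(x), x \rangle_X + \langle \mathcal{F}(y), y \rangle_X + \langle \mathcal{F}(y), x - y \rangle_X
\ge \|x\|_X^2 + \|y\|_X^2 - \|y\|_X \|x - y\|_X \qquad \forall\, x, y \in X.
\]
Putting $x = x_n$, $y = y_n$ for $n \ge 1$, we obtain
\[
\frac{1}{2}\,\|\mathcal{F}(x_n) + \mathcal{F}(y_n)\|_{X^*} \ge 1 - \frac{1}{2}\,\|x_n - y_n\|_X,
\]
which contradicts the uniform convexity of $X^*$.

Recall that $\mathcal{F}(\lambda u) = \lambda \mathcal{F}(u)$ for all $\lambda > 0$, $u \in X$. For $x, y \in X \setminus \{0\}$, we have
\[
\|\mathcal{F}(x) - \mathcal{F}(y)\|_{X^*}
= \left\| \|x\|_X \mathcal{F}\!\left( \frac{x}{\|x\|_X} \right) - \|y\|_X \mathcal{F}\!\left( \frac{y}{\|y\|_X} \right) \right\|_{X^*}
\le \|x\|_X \left\| \mathcal{F}\!\left( \frac{x}{\|x\|_X} \right) - \mathcal{F}\!\left( \frac{y}{\|y\|_X} \right) \right\|_{X^*} + \|x - y\|_X \left\| \mathcal{F}\!\left( \frac{y}{\|y\|_X} \right) \right\|_{X^*}.
\]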

From the uniform continuity of $\mathcal{F}$ on $\partial B_1$, it follows that $\mathcal{F}$ is uniformly continuous on bounded sets located outside some neighbourhood of the origin. Since $\mathcal{F}$ is continuous at $x = 0$ and $\mathcal{F}(0) = 0$, we conclude that $\mathcal{F}$ is uniformly continuous on bounded sets.

Using the duality map, we can give a necessary and sufficient condition for the maximality of a monotone operator $A$.

THEOREM 3.2.29 If both $X$ and $X^*$ are strictly convex and $A : X \supseteq D(A) \to 2^{X^*}$ is a monotone map, then $A$ is maximal monotone if and only if $R(A + \lambda \mathcal{F}) = X^*$ for all $\lambda > 0$ (equivalently for some $\lambda > 0$).


Theorem 3.2.29 is also a surjectivity result. One of the reasons that maximal monotone operators are important in applications is their remarkable surjectivity properties. We start with a necessary and sufficient condition for surjectivity.

THEOREM 3.2.30 If $A : X \supseteq D(A) \to 2^{X^*}$ is a maximal monotone map, then $R(A) = X^*$ if and only if $A^{-1}$ is locally bounded.

PROOF "$\Longrightarrow$": Since $A$ is maximal monotone, so is $A^{-1}$. Because
\[
D(A^{-1}) = R(A) = X^*,
\]
from Proposition 3.2.11, we have that $A^{-1}$ is locally bounded.

"$\Longleftarrow$": To show that $R(A) = X^*$, it suffices to show that $R(A)$ is both closed and open in $X^*$. First we show that $R(A)$ is closed. To this end let $\{x_n^*\}_{n \ge 1} \subseteq R(A)$ and suppose that
\[
x_n^* \to x^* \qquad \text{in } X^*.
\]
We have $x_n^* \in A(x_n)$ and from the monotonicity of $A$, it follows that
\[
\langle x_n^* - y^*, x_n - y \rangle_X \ge 0 \qquad \forall\, n \ge 1,\ (y, y^*) \in \operatorname{Gr} A.
\tag{3.17}
\]
Since by hypothesis $A^{-1}$ is locally bounded, the sequence $\{x_n\}_{n \ge 1} \subseteq X$ is bounded and so by passing to a suitable subsequence if necessary, we may assume that $x_n \xrightarrow{w} x$ in $X$. Passing to the limit as $n \to +\infty$ in (3.17), we obtain
\[
\langle x^* - y^*, x - y \rangle_X \ge 0 \qquad \forall\, (y, y^*) \in \operatorname{Gr} A,
\]
so $x^* \in A(x)$ (since $A$ is maximal monotone). Therefore $x^* \in R(A)$ and so we have proved that $R(A)$ is closed.

Next we show that $R(A)$ is open in $X^*$. Let $x^* \in R(A)$. We have $x^* \in A(x)$. By considering if necessary
\[
\widehat{A}(y) = A(y + x)
\]


(maximal monotonicity is invariant under translation), we may assume that $x = 0$. Let $r > 0$ be such that $A^{-1}|_{B_r(x^*)}$ is bounded, where
\[
B_r(x^*) \stackrel{df}{=} \bigl\{ y^* \in X^* : \|y^* - x^*\|_{X^*} < r \bigr\}.
\]

By Troyanski's renorming theorem (see Theorem A.3.23), we may assume without any loss of generality, that both $X$ and $X^*$ are locally uniformly convex. Let $y^* \in B_{r/2}(x^*)$. Then by Theorem 3.2.29, for every $\lambda > 0$, the operator equation
\[
x_\lambda^* + \lambda \mathcal{F}(x_\lambda) = y^*, \qquad x_\lambda^* \in A(x_\lambda)
\tag{3.18}
\]
has a solution $x_\lambda$. Because $A$ is monotone and $x^* \in A(0)$, we have
\[
\langle y^* - \lambda \mathcal{F}(x_\lambda) - x^*, x_\lambda \rangle_X \ge 0 \qquad \forall\, \lambda > 0
\]
(recall that $x = 0$), so
\[
\|y^* - x^*\|_{X^*} \|x_\lambda\|_X \ge \lambda \|x_\lambda\|_X^2
\]
and thus
\[
\lambda \|x_\lambda\|_X \le \|y^* - x^*\|_{X^*} < \frac{r}{2} \qquad \forall\, \lambda > 0.
\]
From (3.18), we see that
\[
\|y^* - x_\lambda^*\|_{X^*} = \lambda \|\mathcal{F}(x_\lambda)\|_{X^*} = \lambda \|x_\lambda\|_X < \frac{r}{2} \qquad \forall\, \lambda > 0,
\tag{3.19}
\]
so
\[
\|x_\lambda^* - x^*\|_{X^*} < r \qquad \forall\, \lambda > 0.
\]
Recall that $A^{-1}|_{B_r(x^*)}$ is bounded. So we have that $\{x_\lambda\}_{\lambda > 0} \subseteq X$ is bounded. Using this in (3.19), we see that
\[
x_\lambda^* \to y^* \qquad \text{in } X^* \text{ as } \lambda \searrow 0.
\]
Since from the first part of the proof of this implication, we have that $R(A)$ is closed, it follows that $y^* \in R(A)$ and so $B_{r/2}(x^*) \subseteq R(A)$, which proves that $R(A)$ is open in $X^*$. Thus we conclude that $R(A) = X^*$.

COROLLARY 3.2.31 If $A : X \supseteq D(A) \to 2^{X^*}$ is maximal monotone and weakly coercive, then $A$ is surjective (i.e., $R(A) = X^*$).

PROOF By Troyanski's renorming theorem (see Theorem A.3.23), we may assume that both $X$ and $X^*$ are locally uniformly convex. The weak coercivity condition is equivalent to saying that $A^{-1}$ is locally bounded. So we can apply Theorem 3.2.30 and conclude that $R(A) = X^*$.


COROLLARY 3.2.32 If $A : X \to 2^{X^*}$ is monotone, hemicontinuous with $D(A) = X$ and weakly coercive, then $A$ is surjective (i.e., $R(A) = X^*$).

In a finite dimensional context we can drop the monotonicity hypothesis from the above result, provided we assume coercivity. Namely, we have the following result.

PROPOSITION 3.2.33 If $X$ is a finite dimensional Banach space and $F : X \to 2^{X^*}$ is an upper semicontinuous and coercive multifunction with nonempty, compact and convex values, then $F$ is surjective.

PROOF For every $y^* \in X^*$, the multifunction
\[
F_{y^*}(x) \stackrel{df}{=} F(x) - y^*
\]
satisfies the same hypotheses as $F$. So it suffices to show that $0 \in R(F)$. Suppose that $0 \notin R(F)$. Then by the strong separation theorem (see Theorem A.3.2), for every $x \in X$ we can find $u(x) \in X \setminus \{0\}$, such that
\[
0 < \inf_{x^* \in F(x)} \langle x^*, u(x) \rangle_X.
\]
Since by hypothesis $F$ is coercive, given $M > 0$ we can find $r = r(M) > 0$, such that
\[
\frac{\langle x^*, x \rangle_X}{\|x\|_X} \ge M \qquad \forall\, \|x\|_X \ge r,\ x^* \in F(x),
\]
so
\[
\langle x^*, x \rangle_X \ge M r \qquad \forall\, \|x\|_X = r,\ x^* \in F(x).
\]
For such $x \in X$, we can take $u(x) = x$. Now let $x \in X \setminus \{0\}$. We define
\[
U(x) \stackrel{df}{=} \Bigl\{ y \in X : \inf_{y^* \in F(y)} \langle y^*, x \rangle_X > 0 \Bigr\}.
\]
Because of our hypotheses on $F$, the map
\[
y \mapsto \inf_{y^* \in F(y)} \langle y^*, x \rangle_X
\]
is lower semicontinuous (see Hu & Papageorgiou (1997, p. 83)) and so the set $U(x)$ is open. From the above, we have that $\{U(x)\}_{x \in X \setminus \{0\}}$ is an open cover of $X$.


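We choose an open cover $\{V_k\}_{k=1}^m$ of $\overline{B}_r \stackrel{df}{=} \{ x \in X : \|x\|_X \le r \}$, such that for each $k \in \{1, \dots, m\}$, we can find $x_k \in X$, such that $V_k \subseteq U(x_k)$ and if $V_k \cap \partial B_r \ne \emptyset$, then $x_k \in V_k \cap \partial B_r$ and $\operatorname{diam} V_k$ [...] $1$, such that $\vartheta_k(x) > 0$ and for each $x^* \in F(x)$, we have $\langle x^*, x_k \rangle_X > 0$ (since $x \in V_k \subseteq U(x_k)$). So for each $x \in \overline{B}_r$ and any $x^* \in F(x)$, we have
\[
\langle x^*, f(x) \rangle_X = \sum_{k=1}^m \vartheta_k(x) \langle x^*, x_k \rangle_X > 0,
\]
hence
\[
f(x) \ne 0 \qquad \forall\, x \in \overline{B}_r.
\]
Therefore $d_B(f, B_r, 0) = 0$, where $d_B(f, B_r, 0)$ denotes the Brouwer degree of $f$ on $B_r$ with respect to $0$. On the other hand, if $x \in \partial B_r$, $f(x)$ is a convex combination of the points $\{x_k\}_{k=1}^m \subseteq \partial B_r$ and $\|x_k - x\|_X$ [...]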

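0, we define the resolvent operator of $A$ by
\[
J_\lambda \stackrel{df}{=} \bigl( \mathrm{id}_H + \lambda A \bigr)^{-1}
\]
and the Yosida approximation of $A$ by
\[
A_\lambda \stackrel{df}{=} \frac{1}{\lambda} \bigl( \mathrm{id}_H - J_\lambda \bigr).
\]

REMARK 3.2.37 By virtue of Theorem 3.2.29, we have that
\[
D(J_\lambda) = D(A_\lambda) = H \qquad \forall\, \lambda > 0.
\]
Moreover, it is easy to check that $J_\lambda$ is single-valued and then so is $A_\lambda$.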
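To make the two definitions concrete, here is a minimal Python sketch for the simplest multivalued example on $H = \mathbb{R}$, namely $A = \partial|\cdot|$ (so $A(x) = \{\operatorname{sign} x\}$ for $x \ne 0$ and $A(0) = [-1, 1]$). This choice of $A$ is ours and does not come from the text. Solving $x + \lambda A(x) \ni y$ by hand gives the soft-thresholding formula used in `resolvent`; the function names are illustrative only.

```python
def resolvent(y, lam):
    """J_lam(y) = (id + lam*A)^{-1}(y) for A = subdifferential of |.| on R."""
    if y > lam:
        return y - lam
    if y < -lam:
        return y + lam
    return 0.0

def yosida(y, lam):
    """Yosida approximation A_lam(y) = (y - J_lam(y)) / lam."""
    return (y - resolvent(y, lam)) / lam

def in_A(x, v, tol=1e-12):
    """Membership test v in A(x) for A = subdifferential of |.|."""
    if x > 0:
        return abs(v - 1.0) <= tol
    if x < 0:
        return abs(v + 1.0) <= tol
    return -1.0 - tol <= v <= 1.0 + tol

lam = 0.5
samples = [-2.0, -0.3, 0.0, 0.2, 1.5]
for y in samples:
    x = resolvent(y, lam)
    # Both maps are single-valued and defined on all of R, although A is not.
    print(y, x, yosida(y, lam), in_A(x, yosida(y, lam)))
```

In this example $A_\lambda(y)$ equals $y/\lambda$ truncated to $[-1, 1]$, a Lipschitz single-valued function that approximates the multivalued sign map.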


The next theorem summarizes the main properties of the resolvent and the Yosida approximation of a maximal monotone operator $A$.

THEOREM 3.2.38 If $H$ is a pivot Hilbert space and $A : H \supseteq D(A) \to 2^H$ is a maximal monotone map, then for every $\lambda > 0$, we have
(a) $J_\lambda$ is nonexpansive (i.e., Lipschitz continuous with Lipschitz constant 1);
(b) $A_\lambda(x) \in A\bigl( J_\lambda(x) \bigr)$ for every $x \in H$;
(c) $A_\lambda$ is monotone and Lipschitz continuous with Lipschitz constant $\frac{1}{\lambda}$;
(d) $\|A_\lambda(x)\|_H \le \|A^0(x)\|_H$ for every $x \in D(A)$, where $A^0(x) = \operatorname{proj}_{A(x)}(0)$ (recall that $A(x)$ is closed and convex; see Proposition 3.2.7 and Example 3.2.20(a));
(e) $\lim_{\lambda \searrow 0} A_\lambda(x) = A^0(x)$ for all $x \in D(A)$;
(f) $\overline{D(A)}$ is convex and $\lim_{\lambda \searrow 0} J_\lambda(x) = \operatorname{proj}_{\overline{D(A)}}(x)$ for every $x \in H$.

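PROOF (a) For $x, y \in H$, we have
\[
x - y \in J_\lambda(x) - J_\lambda(y) + \lambda \Bigl( A\bigl( J_\lambda(x) \bigr) - A\bigl( J_\lambda(y) \bigr) \Bigr).
\]
We take the inner product with $J_\lambda(x) - J_\lambda(y)$ and use the monotonicity of $A$. We have
\[
\|J_\lambda(x) - J_\lambda(y)\|_H^2 \le \langle x - y, J_\lambda(x) - J_\lambda(y) \rangle_H \le \|x - y\|_H \|J_\lambda(x) - J_\lambda(y)\|_H,
\]
so
\[
\|J_\lambda(x) - J_\lambda(y)\|_H \le \|x - y\|_H.
\]

(b) This follows from the equivalence
\[
(x, x^*) \in \operatorname{Gr} A_\lambda \iff (x - \lambda x^*, x^*) \in \operatorname{Gr} A.
\tag{3.20}
\]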

(c) Because $J_\lambda$ is nonexpansive (see (a)), it follows that $\mathrm{id}_H - J_\lambda$ is monotone (see Example 3.2.20(b)) and so $A_\lambda$ is monotone too. We have
\[
x - y = J_\lambda(x) - J_\lambda(y) + \lambda \bigl( A_\lambda(x) - A_\lambda(y) \bigr),
\]
and
\[
\langle x - y, A_\lambda(x) - A_\lambda(y) \rangle_H = \langle J_\lambda(x) - J_\lambda(y), A_\lambda(x) - A_\lambda(y) \rangle_H + \lambda \|A_\lambda(x) - A_\lambda(y)\|_H^2.
\]
From the monotonicity of $A$ and (b), we have
\[
\lambda \|A_\lambda(x) - A_\lambda(y)\|_H^2 \le \|x - y\|_H \|A_\lambda(x) - A_\lambda(y)\|_H,
\]
so
\[
\|A_\lambda(x) - A_\lambda(y)\|_H \le \frac{1}{\lambda} \|x - y\|_H.
\]
Invoking Proposition 3.2.19, we conclude that $A_\lambda$ is maximal monotone.

(d) From (b), we have that
\[
\langle A^0(x) - A_\lambda(x), x - J_\lambda(x) \rangle_H \ge 0 \qquad \forall\, x \in D(A),\ \lambda > 0,
\]
so
\[
\|A_\lambda(x)\|_H^2 \le \langle A^0(x), A_\lambda(x) \rangle_H \le \|A^0(x)\|_H \|A_\lambda(x)\|_H
\tag{3.21}
\]
and thus
\[
\|A_\lambda(x)\|_H \le \|A^0(x)\|_H.
\]

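(e) Using (3.20), we can easily verify that
\[
(A_\lambda)_\mu = A_{\lambda + \mu} \qquad \forall\, \lambda, \mu > 0.
\]
So from (d) and (3.21), we see that
\[
\|A_{\lambda+\mu}(x)\|_H \le \|A_\lambda(x)\|_H \qquad \forall\, x \in H,\ \lambda, \mu > 0
\tag{3.22}
\]
and
\[
\|A_{\lambda+\mu}(x)\|_H^2 \le \langle A_\lambda(x), A_{\lambda+\mu}(x) \rangle_H \qquad \forall\, x \in H,\ \lambda, \mu > 0.
\tag{3.23}
\]
From (3.22) and (3.23), it follows that
\[
\|A_{\lambda+\mu}(x) - A_\lambda(x)\|_H^2 \le \|A_\lambda(x)\|_H^2 - \|A_{\lambda+\mu}(x)\|_H^2 \qquad \forall\, x \in H,\ \lambda, \mu > 0.
\]
Therefore, since from (d), $\{\|A_\lambda(x)\|_H\}_{\lambda > 0}$ is bounded for $\lambda > 0$ small enough, we infer that $\{A_\lambda(x)\}_{\lambda > 0}$ is Cauchy and so
\[
A_\lambda(x) \to y \qquad \text{in } H \text{ as } \lambda \searrow 0.
\]
By definition, we have
\[
x - J_\lambda(x) = \lambda A_\lambda(x)
\]
and so
\[
J_\lambda(x) \to x \qquad \text{in } H \text{ as } \lambda \searrow 0.
\]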

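Using (b) and the maximal monotonicity of $A$, we have that $y \in A(x)$, hence $y = A^0(x)$.

(f) Let
\[
C \stackrel{df}{=} \overline{\operatorname{conv}}\, D(A)
\]
and $x \in H$. We have
\[
\langle A_\lambda(x) - u, J_\lambda(x) - z \rangle_H \ge 0 \qquad \forall\, (z, u) \in \operatorname{Gr} A,
\]
so
\[
\langle x - J_\lambda(x) - \lambda u, J_\lambda(x) - z \rangle_H \ge 0 \qquad \forall\, (z, u) \in \operatorname{Gr} A
\]
and thus
\[
\|J_\lambda(x)\|_H^2 \le \langle x - \lambda u, J_\lambda(x) - z \rangle_H + \langle J_\lambda(x), z \rangle_H \qquad \forall\, (z, u) \in \operatorname{Gr} A.
\tag{3.24}
\]
From (3.24), it follows that $\{J_\lambda(x)\}_{\lambda > 0} \subseteq H$ is bounded. We choose a sequence $\lambda_n \searrow 0$, such that
\[
J_{\lambda_n}(x) \xrightarrow{w} v \qquad \text{in } H.
\]
Then by passing to the limit in (3.24) (with $\lambda = \lambda_n$), we obtain
\[
\|v\|_H^2 \le \langle x, v - z \rangle_H + \langle v, z \rangle_H \qquad \forall\, z \in D(A),
\]
so
\[
\langle v - x, v - z \rangle_H \le 0 \qquad \forall\, z \in D(A)
\]
and so
\[
\langle x - v, z - v \rangle_H \le 0 \qquad \forall\, z \in C.
\tag{3.25}
\]
Since $v \in C$, from (3.25), it follows that $v = \operatorname{proj}_C(x)$. It remains to show that $C = \overline{D(A)}$. To this end, note that
\[
J_\lambda(x) \in D(A) \qquad \forall\, x \in H
\]
and
\[
J_\lambda(z) \to z \quad \text{as } \lambda \searrow 0 \qquad \forall\, z \in C.
\]
Therefore, it follows that $C \subseteq \overline{D(A)}$, hence $C = \overline{D(A)}$.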
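The statements of Theorem 3.2.38 can be checked numerically on a simple single-valued example. The sketch below takes $H = \mathbb{R}$ and $A(x) = x^3$ (our choice, not the text's; it is monotone and continuous on $\mathbb{R}$, so $A^0 = A$), computes $J_\lambda$ by bisection, and evaluates the quantities appearing in (a), (c), (d), (e) and in the identity $(A_\lambda)_\mu = A_{\lambda+\mu}$ used in the proof of (e). The helper names and the bracket $[-10^3, 10^3]$ are assumptions made for this illustration.

```python
def solve_increasing(g, target, lo=-1e3, hi=1e3, iters=80):
    """Bisection for g(z) = target; assumes g is continuous and strictly
    increasing, and that the root lies in [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def resolvent(op, lam):
    """J_lam = (id + lam*op)^{-1} for a single-valued monotone op on R."""
    return lambda y: solve_increasing(lambda z: z + lam * op(z), y)

def yosida(op, lam):
    """A_lam = (id - J_lam) / lam."""
    J = resolvent(op, lam)
    return lambda y: (y - J(y)) / lam

def A(x):
    # Illustrative operator: A(x) = x^3.
    return x ** 3

lam, mu = 0.5, 0.25
x, y = 1.2, -0.7
J_lam, A_lam = resolvent(A, lam), yosida(A, lam)

gap_a = abs(x - y) - abs(J_lam(x) - J_lam(y))        # (a): expected >= 0
gap_c = abs(x - y) / lam - abs(A_lam(x) - A_lam(y))  # (c): expected >= 0
gap_d = abs(A(x)) - abs(A_lam(x))                    # (d): expected >= 0
err_e = abs(yosida(A, 1e-6)(x) - A(x))               # (e): small for small lam
# Identity (A_lam)_mu = A_{lam+mu}: Yosida approximation of A_lam itself.
err_semigroup = abs(yosida(A_lam, mu)(x) - yosida(A, lam + mu)(x))
print(gap_a, gap_c, gap_d, err_e, err_semigroup)
```

The nested call `yosida(A_lam, mu)` treats $A_\lambda$ as a new maximal monotone operator, which is exactly how (3.22) and (3.23) are derived from (d) and (3.21).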


REMARK 3.2.39 From the proof of (e), it follows that if $x \notin D(A)$, then
\[
\|A_\lambda(x)\|_H \nearrow +\infty \qquad \text{as } \lambda \searrow 0.
\]
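The blow-up described in Remark 3.2.39 is easy to observe when $A$ is the normal cone operator of the interval $[0, 1] \subset \mathbb{R}$, an example chosen here for illustration and not taken from the text. For this operator $D(A) = [0, 1]$, the resolvent $J_\lambda$ is the metric projection onto $[0, 1]$ for every $\lambda > 0$ (a standard fact, since $\lambda N_C = N_C$ for a cone), and so $A_\lambda(x) = (x - \operatorname{proj}_{[0,1]}(x))/\lambda$. The function names below are hypothetical.

```python
def proj(x, a=0.0, b=1.0):
    """Metric projection of x onto the interval [a, b]."""
    return min(max(x, a), b)

def yosida_normal_cone(x, lam, a=0.0, b=1.0):
    """A_lam(x) for A = normal cone of [a, b]; here J_lam = proj for all lam > 0."""
    return (x - proj(x, a, b)) / lam

lams = [1.0, 0.1, 0.01, 0.001]
outside = [yosida_normal_cone(2.0, lam) for lam in lams]  # x = 2 is not in D(A)
inside = [yosida_normal_cone(0.5, lam) for lam in lams]   # x = 0.5 lies in D(A)
print(outside)  # grows without bound as lam decreases
print(inside)   # stays equal to A^0(0.5) = 0
```

For the point inside the domain the values agree with $A^0(x) = 0$, consistent with Theorem 3.2.38(d) and (e), while for the point outside they increase like $1/\lambda$.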

Perturbation results for maximal monotone operators play an important role in applications. In this direction we have the following result.

THEOREM 3.2.40 If $H$ is a pivot Hilbert space, $A : H \supseteq D(A) \to 2^H$ and $B : H \supseteq D(B) \to 2^H$ are two maximal monotone maps and $0 \in \operatorname{int} \bigl( D(A) - D(B) \bigr)$, then $A + B : H \supseteq D(A) \cap D(B) \to 2^H$ is maximal monotone too.

PROOF We start by showing that if $S : H \to H$ is a Lipschitz continuous map with Lipschitz constant $k_S > 0$, then the map $A + S : H \supseteq D(A) \to 2^H$ is maximal monotone. To this end, we choose $\mu > 0$, such that $\mu k_S < 1$. Then for a given $y \in H$, the inclusion
\[
x + \mu A(x) + \mu S(x) \ni y
\tag{3.26}
\]
is equivalent to
\[
x = \bigl( \mathrm{id}_H + \mu A \bigr)^{-1} \bigl( y - \mu S(x) \bigr) = J_\mu \bigl( y - \mu S(x) \bigr).
\tag{3.27}
\]
Note that the map
\[
z \mapsto E_\mu(z) = J_\mu \bigl( y - \mu S(z) \bigr)
\]
is Lipschitz continuous with Lipschitz constant $\mu k_S < 1$. So by the Banach fixed point theorem (see Theorem 7.1.2), equation (3.27) (hence inclusion (3.26) too) has a unique solution $x \in D(A)$. By virtue of Theorem 3.2.29, this means that $\mu(A + S)$ is maximal monotone (note that since $H$ is a pivot Hilbert space, $\mathcal{F} = \mathrm{id}_H$). So $A + S$ is maximal monotone too.

Using this general fact and Theorems 3.2.29 and 3.2.38(c), we see that for every $\lambda > 0$ we can find $x_\lambda \in D(A)$, such that
\[
x_\lambda + A(x_\lambda) + B_\lambda(x_\lambda) \ni y.
\tag{3.28}
\]

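We take the inner product with $x_\lambda - z$, for some $z \in D(A) \cap D(B)$. Exploiting the monotonicity of $A$ and $B_\lambda$, we obtain
\[
\|x_\lambda - z\|_H \le \|y - z - A^0(z) - B_\lambda(z)\|_H \qquad \forall\, \lambda > 0.
\]
Using also Theorem 3.2.38(d), we get
\[
\|x_\lambda\|_H \le 2\,\|z\|_H + \|y\|_H + \|A^0(z)\|_H + \|B^0(z)\|_H \qquad \forall\, \lambda > 0.
\tag{3.29}
\]
Because of our hypothesis concerning the domains $D(A)$ and $D(B)$, we can find $\varepsilon > 0$, such that
\[
\overline{B}_\varepsilon \stackrel{df}{=} \bigl\{ z \in H : \|z\|_H \le \varepsilon \bigr\} \subseteq D(B) - D(A).
\]
Let $z \in \overline{B}_\varepsilon$. Then
\[
z = b - a \qquad \text{with } b \in D(B) \text{ and } a \in D(A).
\]
Exploiting the monotonicity of $B_\lambda$, we have
\[
\langle B_\lambda(x_\lambda), b \rangle_H \le \langle B_\lambda(x_\lambda), x_\lambda \rangle_H - \langle B_\lambda(b), x_\lambda - b \rangle_H.
\]
From Theorem 3.2.38(d), we have
\[
\langle B_\lambda(x_\lambda), z \rangle_H \le \langle B_\lambda(x_\lambda), x_\lambda - a \rangle_H + \|B^0(b)\|_H \|x_\lambda - b\|_H,
\]
so
\[
\langle B_\lambda(x_\lambda), z \rangle_H \le \langle y - x_\lambda - u_\lambda, x_\lambda - a \rangle_H + \|B^0(b)\|_H \|x_\lambda - b\|_H,
\]
with $u_\lambda \in A(x_\lambda)$. As $A$ is monotone, we have
\[
\langle B_\lambda(x_\lambda), z \rangle_H \le \|x_\lambda - a\|_H \bigl( \|y - x_\lambda\|_H + \|A^0(a)\|_H \bigr) + \|B^0(b)\|_H \|x_\lambda - b\|_H.
\]
From (3.29), we have that $\{x_\lambda\}_{\lambda > 0}$ is bounded. Thus for every $z \in \overline{B}_\varepsilon$, we can find $c(z) > 0$, such that
\[
\langle B_\lambda(x_\lambda), z \rangle_H \le c(z).
\]
Invoking the uniform boundedness principle (see Theorem A.3.4), we have
\[
\sup_{\lambda > 0} \|B_\lambda(x_\lambda)\|_H < +\infty.
\tag{3.30}
\]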

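From (3.28), we have
\[
x_\lambda - x_\mu \in -A(x_\lambda) + A(x_\mu) - B_\lambda(x_\lambda) + B_\mu(x_\mu) \qquad \forall\, \lambda, \mu > 0,
\]
so from the monotonicity of $A$, we have
\[
\|x_\lambda - x_\mu\|_H^2 \le -\langle B_\lambda(x_\lambda) - B_\mu(x_\mu), x_\lambda - x_\mu \rangle_H \qquad \forall\, \lambda, \mu > 0.
\]
Invoking also Theorem 3.2.38(b), we obtain
\[
\|x_\lambda - x_\mu\|_H^2 \le -\langle B_\lambda(x_\lambda) - B_\mu(x_\mu), \lambda B_\lambda(x_\lambda) - \mu B_\mu(x_\mu) \rangle_H \qquad \forall\, \lambda, \mu > 0.
\]
Using (3.30), we infer that $\{x_\lambda\}_{\lambda > 0} \subseteq H$ is Cauchy and so
\[
x_\lambda \to x \qquad \text{as } \lambda \searrow 0.
\tag{3.31}
\]
Also we can say that
\[
B_\lambda(x_\lambda) \xrightarrow{w} v \qquad \text{in } H, \text{ as } \lambda \searrow 0.
\tag{3.32}
\]
Note that
\[
\|J_\lambda(x_\lambda) - x\|_H \le \lambda \|B_\lambda(x_\lambda)\|_H + \|x_\lambda - x\|_H,
\]
where $J_\lambda$ denotes the resolvent of $B$, so, using (3.30) and (3.31), we have
\[
J_\lambda(x_\lambda) \to x \qquad \text{in } H, \text{ as } \lambda \searrow 0.
\]
Since $B$ is maximal monotone and
\[
B_\lambda(x_\lambda) \in B\bigl( J_\lambda(x_\lambda) \bigr)
\]
(see Theorem 3.2.38(b)), in the limit as $\lambda \searrow 0$, we obtain $(x, v) \in \operatorname{Gr} B$ (see Proposition 3.2.15 and (3.32)). Passing to the limit as $\lambda \searrow 0$ in (3.28) and using the fact that $A$ is maximal monotone, we obtain that $x \in D(A)$ and
\[
x + A(x) + B(x) \ni y,
\]
so
\[
R\bigl( \mathrm{id}_H + A + B \bigr) = H
\]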

and finally $A + B$ is maximal monotone (see Theorem 3.2.29).

For operators from a reflexive Banach space into its dual, we have the following perturbation result.

THEOREM 3.2.41 If $X$ is a reflexive Banach space, $A : X \supseteq D(A) \to 2^{X^*}$ and $B : X \supseteq D(B) \to 2^{X^*}$ are two maximal monotone maps and $D(A) \cap \operatorname{int} D(B) \ne \emptyset$ (or $D(B) \cap \operatorname{int} D(A) \ne \emptyset$), then $A + B : X \supseteq D(A) \cap D(B) \to 2^{X^*}$ is maximal monotone.

REMARK 3.2.42 Since $\operatorname{int} D(A) - D(B) \subseteq \operatorname{int} \bigl( D(A) - D(B) \bigr)$, we see that the hypothesis of Theorem 3.2.40 is weaker than that of Theorem 3.2.41. Note that the condition $0 \in \operatorname{int} \bigl( D(A) - D(B) \bigr)$ may hold even if $\operatorname{int} D(A) = \operatorname{int} D(B) = \emptyset$.
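The first step in the proof of Theorem 3.2.40 solves the inclusion $x + \mu A(x) + \mu S(x) \ni y$ by rewriting it as the fixed point equation $x = J_\mu(y - \mu S(x))$ and applying the Banach fixed point theorem. The following Python sketch runs the corresponding Picard iteration on $H = \mathbb{R}$ under assumptions of our own: $A = \partial|\cdot|$ (whose resolvent is soft thresholding) and $S = \sin$, which is Lipschitz with $k_S = 1$. It illustrates only the contraction argument, not the maximal monotonicity conclusion; all names are illustrative.

```python
import math

def resolvent_abs(y, mu):
    """J_mu = (id + mu*A)^{-1} for A = subdifferential of |.| on R."""
    if y > mu:
        return y - mu
    if y < -mu:
        return y + mu
    return 0.0

def solve_inclusion(y, mu, S, iters=100):
    """Picard iteration x <- J_mu(y - mu*S(x)); a contraction when mu*k_S < 1."""
    x = 0.0
    for _ in range(iters):
        x = resolvent_abs(y - mu * S(x), mu)
    return x

S = math.sin   # hypothetical Lipschitz perturbation with k_S = 1
mu = 0.5       # mu * k_S = 0.5 < 1
y = 2.0
x = solve_inclusion(y, mu, S)

# Element of A(x) selected by the inclusion x + mu*A(x) + mu*S(x) contains y.
v = (y - x - mu * S(x)) / mu
print(x, v)
```

Here the computed $x$ is positive, so $A(x) = \{1\}$, and the recovered element $v$ can be compared with $1$ to confirm that the limit of the iteration solves the inclusion.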


Another useful perturbation result is given in the next theorem.

THEOREM 3.2.43 If $H$ is a pivot Hilbert space, $A : H \supseteq D(A) \to 2^H$ and $B : H \supseteq D(B) \to 2^H$ are maximal monotone maps, $D(A) \cap D(B) \ne \emptyset$ and
\[
0 \le \langle y, B_\lambda(x) \rangle_H \qquad \forall\, (x, y) \in \operatorname{Gr} A,\ \lambda > 0,
\]
then $A + B$ is maximal monotone.

PROOF Let $y \in H$ and consider the inclusion
\[
x + A(x) + B_\lambda(x) \ni y.
\tag{3.33}
\]
From the proof of Theorem 3.2.40, we know that (3.33) has a unique solution $x_\lambda \in D(A)$ and $\{x_\lambda\}_{\lambda > 0} \subseteq H$ is bounded. Take the inner product of (3.33) with $B_\lambda(x_\lambda)$ and use the hypothesis, to obtain that
\[
\sup_{\lambda > 0} \|B_\lambda(x_\lambda)\|_H < +\infty.
\]

Then the remainder of the proof goes as the proof of Theorem 3.2.40.

Next we introduce some generalizations of the concept of monotonicity, which are useful in the study of nonlinear partial differential equations. The mathematical setting remains the same. Namely, $X$ is a reflexive Banach space, $X^*$ is its topological dual and $A : X \to 2^{X^*}$ is an operator.

DEFINITION 3.2.44 The map $A : X \to 2^{X^*}$ is said to be pseudomonotone, if
(a) the set $A(x)$ is nonempty, convex and weakly compact for all $x \in X$;
(b) $A$ is upper semicontinuous from each finite dimensional subspace $V$ of $X$, into $X^*$ furnished with the weak topology;
(c) if $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ are sequences, such that $x_n^* \in A(x_n)$,
\[
x_n \xrightarrow{w} x \qquad \text{in } X,
\]
for some $x \in X$ and
\[
\limsup_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X \le 0,
\]
then for each $y \in X$, we can find $v^*(y) \in A(x)$, such that
\[
\langle v^*(y), x - y \rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - y \rangle_X.
\]


To be able to deal with problems in which the nonlinear operators are not everywhere defined and which are not continuous even in a mild sense, we introduce the following notion.

DEFINITION 3.2.45 A map $A : X \to 2^{X^*}$ is said to be generalized pseudomonotone, if for any sequences $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$, such that
\[
x_n \xrightarrow{w} x \qquad \text{in } X,
\]
for some $x \in X$ and
\[
x_n^* \xrightarrow{w} x^* \qquad \text{in } X^*,
\]
for some $x^* \in X^*$, with $x_n^* \in A(x_n)$ for $n \ge 1$ and
\[
\limsup_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X \le 0,
\]
we have that $x^* \in A(x)$ and
\[
\langle x_n^*, x_n \rangle_X \to \langle x^*, x \rangle_X.
\]

An immediate consequence of this definition is the following result.

PROPOSITION 3.2.46 A map $A : X \to 2^{X^*}$ is generalized pseudomonotone if and only if $A^{-1} : X^* \to 2^X$ is generalized pseudomonotone.

The class of generalized pseudomonotone maps contains the maximal monotone ones.

PROPOSITION 3.2.47 If $A : X \supseteq D(A) \to 2^{X^*}$ is maximal monotone, then $A$ is generalized pseudomonotone.

PROOF

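Let $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ be two sequences, such that
\[
x_n \xrightarrow{w} x \qquad \text{in } X,
\]
for some $x \in X$ and
\[
x_n^* \xrightarrow{w} x^* \qquad \text{in } X^*,
\]
for some $x^* \in X^*$, with $x_n^* \in A(x_n)$ for $n \ge 1$ and
\[
\limsup_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X \le 0.
\]
We need to show that $x^* \in A(x)$ and
\[
\langle x_n^*, x_n \rangle_X \to \langle x^*, x \rangle_X.
\]
Let $(u, u^*) \in \operatorname{Gr} A$. Then since $A$ is monotone, we have
\[
0 \le \langle x_n^* - u^*, x_n - u \rangle_X \qquad \forall\, n \ge 1.
\tag{3.34}
\]
Also we have
\[
\langle x_n^*, x_n \rangle_X = \langle x_n^* - u^*, x_n - u \rangle_X + \langle x_n^*, u \rangle_X + \langle u^*, x_n \rangle_X - \langle u^*, u \rangle_X.
\]
Note that
\[
\langle x_n^*, u \rangle_X + \langle u^*, x_n \rangle_X - \langle u^*, u \rangle_X \to \langle x^*, u \rangle_X + \langle u^*, x \rangle_X - \langle u^*, u \rangle_X,
\]
so from (3.34), we have
\[
\langle x^*, x \rangle_X \ge \limsup_{n \to +\infty} \langle x_n^*, x_n \rangle_X \ge \langle x^*, u \rangle_X + \langle u^*, x \rangle_X - \langle u^*, u \rangle_X
\]
and thus
\[
\langle x^* - u^*, x - u \rangle_X \ge 0.
\]
Since $(u, u^*) \in \operatorname{Gr} A$ was arbitrary and $A$ is maximal monotone, it follows that $x^* \in A(x)$. Therefore
\[
\langle x_n^* - x^*, x_n - x \rangle_X \ge 0 \qquad \forall\, n \ge 1,
\]
so
\[
\liminf_{n \to +\infty} \langle x_n^*, x_n \rangle_X \ge \langle x^*, x \rangle_X
\]
and thus
\[
\langle x_n^*, x_n \rangle_X \to \langle x^*, x \rangle_X,
\]
i.e., $A$ is generalized pseudomonotone.

In fact every pseudomonotone map is generalized pseudomonotone.

THEOREM 3.2.48 If $A : X \to 2^{X^*}$ is a pseudomonotone map, then $A$ is generalized pseudomonotone.

PROOF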

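Suppose that $\{(x_n, x_n^*)\}_{n \ge 1} \subseteq \operatorname{Gr} A$ is a sequence, such that
\[
(x_n, x_n^*) \xrightarrow{w \times w} (x, x^*) \qquad \text{in } X \times X^*,
\]
for some $(x, x^*) \in X \times X^*$ and
\[
\limsup_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X \le 0.
\]
By virtue of the pseudomonotonicity of $A$, for every $y \in X$, we can find $v^*(y) \in A(x)$, such that
\[
\langle v^*(y), x - y \rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - y \rangle_X.
\]
We may assume that
\[
\langle x_n^*, x_n \rangle_X \to \xi,
\]
for some $\xi \in \mathbb{R}$ and so
\[
\limsup_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X = \xi - \langle x^*, x \rangle_X \le 0.
\tag{3.35}
\]
Also, we have
\[
\xi - \langle x^*, y \rangle_X \ge \liminf_{n \to +\infty} \langle x_n^*, x_n - y \rangle_X \ge \langle v^*(y), x - y \rangle_X,
\]
so, from (3.35), we have
\[
\langle x^*, x - y \rangle_X \ge \langle v^*(y), x - y \rangle_X \qquad \forall\, y \in X.
\tag{3.36}
\]
We claim that $x^* \in A(x)$. If this is not the case, then since $A(x)$ is convex and $w$-compact (see Definition 3.2.44), we can find $u \in X$, such that
\[
\langle x^*, u \rangle_X < \inf_{z^* \in A(x)} \langle z^*, u \rangle_X.
\tag{3.37}
\]
Let $y = x - u$ in (3.36). Then
\[
\langle x^*, u \rangle_X \ge \langle v^*(y), u \rangle_X,
\tag{3.38}
\]
with $v^*(y) \in A(x)$. Comparing (3.37) and (3.38), we reach a contradiction. Therefore $x^* \in A(x)$. Finally, if $y = x \in X$, then
\[
\liminf_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X \ge \langle v^*(x), x - x \rangle_X = 0,
\]
so
\[
\liminf_{n \to +\infty} \langle x_n^*, x_n \rangle_X \ge \langle x^*, x \rangle_X
\]
and recalling the choice of the sequences $\{x_n\}_{n \ge 1}$ and $\{x_n^*\}_{n \ge 1}$, we get
\[
\langle x_n^*, x_n \rangle_X \to \langle x^*, x \rangle_X.
\]

There is a converse to this theorem.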


PROPOSITION 3.2.49 If $A : X \to 2^{X^*}$ is a bounded generalized pseudomonotone map and for every $x \in X$, the set $A(x)$ is nonempty, convex and weakly compact, then $A$ is pseudomonotone.

PROOF First we show that if $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ are two sequences, such that $x_n \xrightarrow{w} x$ in $X$, for some $x \in X$, $x_n^* \in A(x_n)$ for $n \ge 1$ and
\[
\limsup_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X \le 0,
\]
then for each $u \in X$, we can find $y^*(u) \in A(x)$, such that
\[
\langle y^*(u), x - u \rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - u \rangle_X.
\]
Suppose that this is not true. Then we can find $u \in X$, such that
\[
\liminf_{n \to +\infty} \langle x_n^*, x_n - u \rangle_X < \inf_{v^* \in A(x)} \langle v^*, x - u \rangle_X.
\]
By passing to a suitable subsequence, we may say that
\[
\lim_{n \to +\infty} \langle x_n^*, x_n - u \rangle_X < \inf_{v^* \in A(x)} \langle v^*, x - u \rangle_X.
\]
Since $A$ is bounded, we have that the sequence $\{x_n^*\}_{n \ge 1} \subseteq X^*$ is bounded. So by virtue of the Eberlein-Šmulian theorem (see Theorem A.3.8), we may assume that $x_n^* \xrightarrow{w} x^*$ in $X^*$. Because $A$ is generalized pseudomonotone, it follows that $x^* \in A(x)$ and
\[
\langle x_n^*, x_n \rangle_X \to \langle x^*, x \rangle_X
\]
(see Definition 3.2.45). Therefore
\[
\lim_{n \to +\infty} \langle x_n^*, x_n - u \rangle_X = \langle x^*, x - u \rangle_X < \inf_{v^* \in A(x)} \langle v^*, x - u \rangle_X,
\]
a contradiction, since $x^* \in A(x)$.

Next we show that $A$ is upper semicontinuous from $X$ into $X^*$ furnished with the weak topology. By virtue of the boundedness of $A$ and of Remark 3.2.13, it suffices to show that if $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ are two sequences, such that
\[
x_n \to x \qquad \text{in } X,
\]
for some $x \in X$,
\[
x_n^* \xrightarrow{w} x^* \qquad \text{in } X^*,
\]
for some $x^* \in X^*$ and $x_n^* \in A(x_n)$ for $n \ge 1$, then $x^* \in A(x)$. But this follows from the fact that $A$ is generalized pseudomonotone. Thus we have shown that $A$ is pseudomonotone (see Definition 3.2.44).


Combining this result with Propositions 3.2.11 and 3.2.47, we obtain the following corollary.

COROLLARY 3.2.50 If $A : X \supseteq D(A) \to 2^{X^*}$ is maximal monotone and $D(A) = X$, then $A$ is pseudomonotone.

The class of pseudomonotone maps is invariant under addition of operators.

PROPOSITION 3.2.51 If $A_1, A_2 : X \to 2^{X^*}$ are two pseudomonotone maps, then $A_1 + A_2$ is pseudomonotone too.

PROOF Evidently for each $x \in X$, the set
\[
(A_1 + A_2)(x) = A_1(x) + A_2(x)
\]
is nonempty, convex and weakly compact. Moreover, it is easy to see that the map $x \mapsto (A_1 + A_2)(x)$ is upper semicontinuous from every finite dimensional subspace of $X$ into $X^*$ equipped with the weak topology. Next we show that if $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ are two sequences, such that $x_n \xrightarrow{w} x$ in $X$, for some $x \in X$, $x_n^* \in (A_1 + A_2)(x_n)$ for $n \ge 1$ and
\[
\limsup_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X \le 0,
\]
then for every $u \in X$, we can find $y^*(u) \in (A_1 + A_2)(x)$, such that
\[
\langle y^*(u), x - u \rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - u \rangle_X.
\]
Let
\[
x_n^* = y_n^* + z_n^* \qquad \text{with } y_n^* \in A_1(x_n),\ z_n^* \in A_2(x_n) \qquad \forall\, n \ge 1.
\]
Then, we have
\[
\limsup_{n \to +\infty} \Bigl[ \langle y_n^*, x_n - x \rangle_X + \langle z_n^*, x_n - x \rangle_X \Bigr] \le 0.
\tag{3.39}
\]


We claim that (3.39) implies
\[
\limsup_{n \to +\infty} \langle y_n^*, x_n - x \rangle_X \le 0 \qquad \text{and} \qquad \limsup_{n \to +\infty} \langle z_n^*, x_n - x \rangle_X \le 0.
\tag{3.40}
\]
Suppose that (3.40) is not true. Then at least one of the two upper limits is strictly bigger than zero. To fix things, suppose that
\[
\limsup_{n \to +\infty} \langle y_n^*, x_n - x \rangle_X > 0.
\]
Then we can find $c > 0$ and a suitable subsequence (denoted with the same index), such that
\[
\lim_{n \to +\infty} \langle y_n^*, x_n - x \rangle_X \ge c > 0.
\]
Then because of (3.39), we have that
\[
\limsup_{n \to +\infty} \langle z_n^*, x_n - x \rangle_X \le -c < 0.
\tag{3.41}
\]
By virtue of the pseudomonotonicity of $A_2$, for every $u \in X$, we can find $y_2^*(u) \in A_2(x)$, such that
\[
\langle y_2^*(u), x - u \rangle_X \le \liminf_{n \to +\infty} \langle z_n^*, x_n - u \rangle_X.
\]
Let $u = x$. Then we have
\[
\liminf_{n \to +\infty} \langle z_n^*, x_n - x \rangle_X \ge 0.
\tag{3.42}
\]
Comparing (3.41) and (3.42), we reach a contradiction. This proves (3.40).

Since both $A_1$ and $A_2$ are pseudomonotone, given $u \in X$, we can find $y_1^*(u) \in A_1(x)$ and $y_2^*(u) \in A_2(x)$, such that
\[
\langle y_1^*(u), x - u \rangle_X \le \liminf_{n \to +\infty} \langle y_n^*, x_n - u \rangle_X
\]
and
\[
\langle y_2^*(u), x - u \rangle_X \le \liminf_{n \to +\infty} \langle z_n^*, x_n - u \rangle_X.
\]
Let
\[
y^*(u) \stackrel{df}{=} y_1^*(u) + y_2^*(u) \in (A_1 + A_2)(x).
\]
We have
\[
\langle y^*(u), x - u \rangle_X \le \liminf_{n \to +\infty} \langle y_n^*, x_n - u \rangle_X + \liminf_{n \to +\infty} \langle z_n^*, x_n - u \rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - u \rangle_X,
\]
which means that $A_1 + A_2$ is pseudomonotone.


As was the case with maximal monotone operators, pseudomonotone operators exhibit remarkable surjectivity properties.

THEOREM 3.2.52 If $A : X \to 2^{X^*}$ is pseudomonotone and coercive, then $R(A) = X^*$, i.e., $A$ is surjective.

PROOF Let $\mathcal{T}$ be the family of all finite dimensional subspaces of $X$, equipped with the partial order defined by inclusion. Let $V \in \mathcal{T}$ and let $i_V : V \to X$ denote the embedding operator. Then $i_V^* : X^* \to V^*$ is the corresponding projection operator onto $V^*$. Let
\[
A_V = i_V^* A\, i_V : V \to 2^{V^*}.
\]
Clearly $A_V$ has nonempty, convex and compact values and it is upper semicontinuous. Moreover, for every $x_V^* \in A_V(x)$, we have $x_V^* = i_V^* x^*$ for some $x^* \in A(x)$ and so
\[
\langle x_V^*, x \rangle_V = \langle i_V^* x^*, x \rangle_V = \langle x^*, i_V(x) \rangle_X,
\]
so $A_V$ is coercive too.

To prove the theorem, it suffices to show that $0 \in R(A)$. Because of Proposition 3.2.33, for every $V \in \mathcal{T}$, we can find $x_V \in V$, such that $0 \in A_V(x_V)$, hence $0 = i_V^* x_V^*$, for some $x_V^* \in A(x_V)$. By virtue of the coercivity of $A$, we have that $\{x_V\}_{V \in \mathcal{T}} \subseteq X$ is bounded. For $V \in \mathcal{T}$, let
\[
E_V \stackrel{df}{=} \bigcup_{\substack{V' \in \mathcal{T} \\ V' \supseteq V}} \{x_{V'}\}.
\]
Then
\[
E_V \subseteq \overline{B}_M
\]
for some $M > 0$ large enough. Because $X$ is reflexive (hence $\overline{B}_M$ is weakly compact), from the finite intersection property, we have that
\[
\bigcap_{V \in \mathcal{T}} \overline{E_V}^{\,w} \ne \emptyset.
\]
Let
\[
x_0 \in \bigcap_{V \in \mathcal{T}} \overline{E_V}^{\,w} \qquad \text{and} \qquad y \in X.
\]
Choose $V \in \mathcal{T}$, such that $\{x_0, y\} \subseteq V$. Let $\{x_{V_k}\}_{k \ge 1} \subseteq E_V$ be such that
\[
x_{V_k} \xrightarrow{w} x_0 \qquad \text{in } X \text{ as } k \to +\infty.
\]
Recall that $0 = i_{V_k}^* x_{V_k}^*$ with $x_{V_k}^* \in A(x_{V_k})$. So we have
\[
\langle x_{V_k}^*, x_{V_k} - x_0 \rangle_X = 0 \qquad \forall\, k \ge 1.
\]
Since $A$ is pseudomonotone, we can find $y^*(y) \in A(x_0)$, such that
\[
\langle y^*(y), x_0 - y \rangle_X \le \lim_{k \to +\infty} \langle x_{V_k}^*, x_{V_k} - y \rangle_X = 0 \qquad \forall\, y \in X.
\tag{3.43}
\]
Suppose that $0 \notin A(x_0)$. Then by the strong separation theorem (see Theorem A.3.2), we can find $y \in X$, such that
\[
0 < \inf_{v^* \in A(x_0)} \langle v^*, x_0 - y \rangle_X.
\tag{3.44}
\]
Comparing (3.43) and (3.44), we obtain a contradiction. This proves the surjectivity of $A$.

The following classes of operators are often useful in nonlinear operator equations.

DEFINITION 3.2.53

Comparing (3.43) and (3.44), we obtain a contradiction. This proves the surjectivity of A. The following classes of operators are often useful in nonlinear operator equations. DEFINITION 3.2.53

Let $A : X \supseteq D(A) \to 2^{X^*}$ and $B : X \supseteq D(B) \to 2^{X^*}$ be two maps.
(a) We say that $B$ is smooth, if $D(B) = X$ and $B$ is bounded, coercive and maximal monotone.
(b) We say that $A$ is regular, if it is generalized pseudomonotone and for every smooth operator $B : X \supseteq D(B) \to 2^{X^*}$, we have $R(A + B) = X^*$.


The next proposition gives an important example of a regular generalized pseudomonotone map.

PROPOSITION 3.2.54 If $A : X \supseteq D(A) \to 2^{X^*}$ is a pseudomonotone operator and there exists $c > 0$, such that
\[
\langle x^*, x \rangle_X \ge -c\,\|x\|_X \qquad \forall\, (x, x^*) \in \operatorname{Gr} A,
\]
then $A$ is regular (so also generalized pseudomonotone).

PROOF First note that $A$ is a generalized pseudomonotone operator (see Theorem 3.2.48). Also let $B : X \supseteq D(B) \to 2^{X^*}$ be an arbitrary smooth map. Then from Corollary 3.2.50, $B$ is pseudomonotone. Then Proposition 3.2.51 implies that $A + B$ is pseudomonotone. Moreover, $A + B$ is coercive. So Theorem 3.2.52 implies that $R(A + B) = X^*$, hence $A$ is regular.

We introduce two more classes of nonlinear operators of monotone type.

DEFINITION 3.2.55 Let $A : X \supseteq D(A) \to 2^{X^*}$ be a map.
(a) We say that $A$ is of type $(M)$, if for every $x \in X$, the set $A(x)$ is nonempty, convex, weakly compact, it is upper semicontinuous from every finite dimensional subspace $V$ of $X$ into $X^*$ furnished with the weak topology and if
\[
x_n \xrightarrow{w} x \quad \text{in } X, \qquad x_n^* \xrightarrow{w} x^* \quad \text{in } X^*
\]
with $(x_n, x_n^*) \in \operatorname{Gr} A$, then
\[
(x, x^*) \in \operatorname{Gr} A.
\]
(b) We say that $A$ is of type $(S)_+$, if $A$ is single valued with $D(A) = X$ and for every sequence $\{x_n\}_{n \ge 1} \subseteq X$, such that
\[
x_n \xrightarrow{w} x \qquad \text{in } X,
\]
for some $x \in X$ and
\[
\limsup_{n \to +\infty} \langle A(x_n), x_n - x \rangle_X \le 0,
\]
we have that
\[
x_n \to x \qquad \text{in } X.
\]


REMARK 3.2.56 The prototype for operators of type $(M)$ is a monotone, hemicontinuous map. Similarly the prototype for operators of type $(S)_+$ is a uniformly monotone operator defined everywhere. If $A$ is of type $(M)$ (respectively of type $(S)_+$) and $B : X \to X^*$ is completely continuous, then $A + B$ is of type $(M)$ (respectively of type $(S)_+$). In fact in the $(S)_+$ case, $B$ can be compact.

We close this section by briefly discussing two important examples of maximal monotone maps.

PROPOSITION 3.2.57 If $X$ is a separable, reflexive Banach space, $A : X \supseteq D(A) \to 2^{X^*}$ is maximal monotone with $0 \in A(0)$ and $\widehat{A} : L^p(T; X) \supseteq \widehat{D} \to 2^{L^{p'}(T; X^*)}$, where $T = [0, b]$, $p \in (1, +\infty)$, $\frac{1}{p} + \frac{1}{p'} = 1$, is defined by
\[
\widehat{A}(x) \stackrel{df}{=} \Bigl\{ h \in L^{p'}(T; X^*) : h(t) \in A\bigl( x(t) \bigr) \text{ for a.a. } t \in T \Bigr\} \qquad \forall\, x \in \widehat{D},
\]
where
\[
\widehat{D} \stackrel{df}{=} \Bigl\{ x \in L^p(T; X) : x(t) \in D(A) \text{ for a.a. } t \in T \text{ and there exists } h \in L^{p'}(T; X^*) \text{ such that } h(t) \in A\bigl( x(t) \bigr) \text{ for a.a. } t \in T \Bigr\},
\]
then $\widehat{A}$ is maximal monotone too.

PROOF By Troyanski's renorming theorem (see Theorem A.3.23), we may assume without any loss of generality that both $X$ and $X^*$ are locally uniformly convex (see Definition A.3.21). Let $\mathcal{F} : X \to X^*$ be the duality map of $X$. From Proposition 3.2.27, we have that $\mathcal{F}$ is a homeomorphism. Let $J_0 : L^p(T; X) \to 2^{L^{p'}(T; X^*)}$ be defined by
\[
J_0(x)(\cdot) \stackrel{df}{=} \bigl\| \mathcal{F}\bigl( x(\cdot) \bigr) \bigr\|_{X^*}^{p-2}\, \mathcal{F}\bigl( x(\cdot) \bigr).
\]
It is easy to see that $J_0$ is continuous, strictly monotone and so maximal monotone too (see Proposition 3.2.18). Clearly $\widehat{A}$ is a monotone map. We claim that
\[
R\bigl( \widehat{A} + J_0 \bigr) = L^{p'}(T; X^*).
\]


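To this end let $h \in L^{p'}(T; X^*)$ and consider the multifunction $S\colon T \to 2^X$, defined by
\[ S(t) \stackrel{df}{=} \big\{ x \in X : A(x) + \varphi(x) \ni h(t) \big\}, \]
where $\varphi\colon X \to X^*$ is the monotone, continuous (hence maximal monotone) map, defined by
\[ \varphi(x) \stackrel{df}{=} \|\mathcal{F}(x)\|_{X^*}^{p-2}\, \mathcal{F}(x) \quad \forall\, x \in X. \]
We know that $A + \varphi\colon X \supseteq D(A) \to 2^{X^*}$ is maximal monotone (see Theorem 3.2.41). Moreover, because $0 \in A(0)$, we have
\[ \langle x^* + \varphi(x), x \rangle_X \geq \langle \varphi(x), x \rangle_X = \|\mathcal{F}(x)\|_{X^*}^{p} = \|x\|_X^{p} \quad \forall\, (x, x^*) \in \operatorname{Gr} A, \]
hence $A + \varphi$ is coercive. Therefore $R(A + \varphi) = X^*$ (see Corollary 3.2.31). It follows that
\[ S(t) \neq \emptyset \quad \forall\, t \in T. \]
Note that
\[ \operatorname{Gr} S = \big\{ (t, x) \in T \times X : \big(x, h(t) - \varphi(x)\big) \in \operatorname{Gr} A \big\}. \]
Let $\vartheta\colon T \times X \to X \times X^*$ be the function, defined by
\[ \vartheta(t, x) \stackrel{df}{=} \big(x, h(t) - \varphi(x)\big). \]
Clearly $\vartheta$ is a Carathéodory function (i.e., $t \mapsto \vartheta(t, x)$ is measurable and $x \mapsto \vartheta(t, x)$ is continuous). Therefore $\vartheta$ is jointly measurable. Note that
\[ \operatorname{Gr} S = \vartheta^{-1}(\operatorname{Gr} A) \]
and from Proposition 3.2.15, we know that $\operatorname{Gr} A \subseteq X \times X_w^*$ is closed (here by $X_w^*$ we denote the space $X^*$ furnished with the weak topology). Hence
\[ \operatorname{Gr} A \in \mathcal{B}(X \times X_w^*) \]
(by $\mathcal{B}(Z)$ we denote the Borel $\sigma$-field of $Z$). Since $X_w^*$ is a Souslin space (see Definition A.2.29(b)), then
\[ \mathcal{B}(X \times X_w^*) = \mathcal{B}(X) \times \mathcal{B}(X_w^*) \]
(see Proposition A.2.34(b)). Moreover, $\mathcal{B}(X^*) = \mathcal{B}(X_w^*)$. Therefore
\[ \mathcal{B}(X) \times \mathcal{B}(X_w^*) = \mathcal{B}(X) \times \mathcal{B}(X^*) = \mathcal{B}(X \times X^*). \]
So we have that $\operatorname{Gr} A \in \mathcal{B}(X \times X^*)$, hence $\operatorname{Gr} S = \vartheta^{-1}(\operatorname{Gr} A)$ is measurable in $T \times X$, and we can apply the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) to obtain a measurable map $x\colon T \to X$, such that
\[ x(t) \in S(t) \quad \forall\, t \in T. \]
We have
\[ h(t) \in A\big(x(t)\big) + \varphi\big(x(t)\big) \quad \text{for a.a. } t \in T. \]
Taking duality brackets with $x(t)$, we obtain
\[ \|x(t)\|_X^{p} \leq \langle h(t), x(t) \rangle_X \quad \forall\, t \in T \]
(recall that $0 \in A(0)$) and so
\[ \|x(t)\|_X^{p-1} \leq \|h(t)\|_{X^*}, \]
from which it follows that $x \in L^p(T; X)$. This proves that
\[ R\big(\widehat{A} + J_0\big) = L^{p'}(T; X^*). \]
Using this surjectivity property we shall establish the maximal monotonicity of the monotone operator $\widehat{A}$. To this end suppose that for some $(y, v) \in L^p(T; X) \times L^{p'}(T; X^*)$, we have
\[ \langle u - v, x - y \rangle_{pp'} \geq 0 \quad \forall\, (x, u) \in \operatorname{Gr} \widehat{A}, \tag{3.45} \]
where by $\langle \cdot, \cdot \rangle_{pp'}$ we denote the duality brackets for the pair of spaces $\big(L^{p'}(T; X^*), L^p(T; X)\big)$, i.e.,
\[ \langle u, v \rangle_{pp'} \stackrel{df}{=} \int_0^b \langle u(t), v(t) \rangle_X \, dt \quad \forall\, (u, v) \in L^{p'}(T; X^*) \times L^p(T; X). \]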


Since $\widehat{A} + J_0$ is surjective, we can find $x_1 \in \widehat{D}$, such that
\[ u + J_0(x_1) = v + J_0(y), \]
for some $u \in \widehat{A}(x_1)$. Returning to (3.45) and setting $x = x_1$, we obtain
\[ \langle J_0(y) - J_0(x_1), x_1 - y \rangle_{pp'} \geq 0. \tag{3.46} \]
But $J_0$ is strictly monotone. So from (3.46), it follows that $x_1 = y$, hence
\[ y \in \widehat{D} \quad \text{and} \quad v = u \in \widehat{A}(x_1). \]
This proves the maximality of $\widehat{A}$.

The second important class of maximal monotone operators that we would like to discuss before closing this section are the linear ones. Note that for a linear operator $A\colon X \supseteq D(A) \to X^*$, monotonicity is equivalent to saying that
\[ \langle A(x), x \rangle_X \geq 0 \quad \forall\, x \in D(A). \]
For linear monotone operators we can characterize maximality using the adjoint operator or in terms of the density of its domain. This brings us to the "doorsteps" of the Hille-Yosida theorem and the theory of semigroups of operators, which form the subject of the next section.

THEOREM 3.2.58 If $A\colon X \supseteq D(A) \to X^*$ is a linear monotone operator, then the following statements are equivalent:
(a) $A$ is maximal monotone;
(b) $A^*$ is maximal monotone;
(c) $A$ and $A^*$ are both monotone and $A$ is closed and densely defined.
In particular if $X = H$ is a Hilbert space and $A\colon H \to H^*$ is linear maximal monotone, then $A$ is symmetric if and only if $A$ is selfadjoint. Maximal monotonicity is crucial here since it may happen that $A$ is monotone symmetric, but $A^*$ is not monotone.

REMARK 3.2.59 In Section 4.3 we shall return to the subject of monotone operators, by examining in more detail the subdifferential of a proper, convex and lower semicontinuous function (see Example 3.2.20(c)). This is a special kind of monotone map, known as cyclically monotone. As we shall see there, not every monotone map is of the subdifferential type.

3.3 Accretive Operators and Semigroups of Operators

In the previous section we studied operators (in general nonlinear) from a Banach space $X$ into its dual $X^*$. In this section we deal with operators from $X$ into $X$ which still exhibit a "monotonicity" property. These are the so-called accretive operators. Of course the two classes of monotone and accretive operators coincide when $X = H$ is a Hilbert space. Accretive operators are intimately connected to the theory of generation of semigroups, which are a basic tool in the study of evolution equations. So the second half of this section is devoted to the presentation of the basics of the theory of semigroups of operators (linear and nonlinear).

When we try to extend the notion of monotonicity to maps from a Banach space $X$ into itself, we immediately face the problem of finding a substitute for the duality brackets. There are two equivalent ways to do this. The first is to use the duality map, which essentially brings us back to the familiar setting of the dual pair $(X, X^*)$. The second approach replaces the duality brackets by a so-called semi-inner product, which is a kind of inner product for the Banach space. Let us start by giving the definition of accretivity based on the duality map and then proceed to introduce semi-inner products on a Banach space and show how they can be used.

DEFINITION 3.3.1 Let $X$ be a Banach space and let $A\colon X \supseteq D(A) \to 2^X$ be an operator.
(a) We say that $A$ is accretive, if for every $(x_1, u_1), (x_2, u_2) \in \operatorname{Gr} A$, there exists $x^* \in \mathcal{F}(x_1 - x_2)$, such that $\langle x^*, u_1 - u_2 \rangle_X \geq 0$.
(b) An accretive operator is said to be maximal accretive, if its graph is not properly included in the graph of another accretive operator.
(c) Finally an accretive operator is said to be m-accretive, if $R(\mathrm{id}_X + A) = X$.

REMARK 3.3.2 When $X = H = H^*$ is a Hilbert pivot space, then $\mathcal{F} = \mathrm{id}_X$ and so the notion of accretivity (respectively maximal accretivity) coincides with that of monotonicity (respectively maximal monotonicity). Moreover, in this case by virtue of Theorem 3.2.29, maximal accretivity and m-accretivity coincide. However, this is not in general true. We can find a maximal accretive operator which is not m-accretive (see Miyadera (1992, pp. 42–44)). If $-A$ is an accretive operator, then $A$ is called a dissipative operator. This terminology originates from mechanics, where dissipative forces are forces which do not increase the energy.


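The next lemma leads to an alternative definition of an accretive operator.

LEMMA 3.3.3 If $X$ is a Banach space and $x, y \in X$, then $\|x\|_X \leq \|x + \lambda y\|_X$ for all $\lambda > 0$ if and only if there exists $x^* \in \mathcal{F}(x)$, such that $\langle x^*, y \rangle_X \geq 0$.

PROOF "$\Longrightarrow$": Without any loss of generality we may assume that $x \neq 0$. Let $x_\lambda^* \in \mathcal{F}(x + \lambda y)$, $x_\lambda^* \neq 0$ for all $\lambda > 0$. Set
\[ v_\lambda^* \stackrel{df}{=} \frac{x_\lambda^*}{\|x_\lambda^*\|_{X^*}} \quad \forall\, \lambda > 0. \]
If $\lambda_n \searrow 0$, by Alaoglu's theorem (see Theorem A.3.9), we can find $v^* \in X^*$ with $\|v^*\|_{X^*} \leq 1$, such that
\[ \langle v^*, v \rangle_X = \lim_{n \to +\infty} \langle v_{\lambda_n}^*, v \rangle_X \quad \forall\, v \in X. \]
By hypothesis we have
\[ \|x\|_X \leq \|x + \lambda_n y\|_X = \langle v_{\lambda_n}^*, x + \lambda_n y \rangle_X \leq \|x\|_X + \lambda_n \langle v_{\lambda_n}^*, y \rangle_X, \]
so
\[ \langle v^*, y \rangle_X \geq 0. \tag{3.47} \]
Also, since
\[ \|x + \lambda_n y\|_X = \langle v_{\lambda_n}^*, x + \lambda_n y \rangle_X \to \langle v^*, x \rangle_X, \]
we have that $\|x\|_X \leq \langle v^*, x \rangle_X$, so
\[ \|x\|_X = \langle v^*, x \rangle_X. \]
It follows that $x^* = \|x\|_X v^* \in \mathcal{F}(x)$. We have
\[ \langle x^*, y \rangle_X = \|x\|_X \langle v^*, y \rangle_X \geq 0 \]
(see (3.47)).

"$\Longleftarrow$": From the definition of the duality map $\mathcal{F}$ (see Example 3.2.20(d)) and the hypothesis that $\langle x^*, y \rangle_X \geq 0$, we have
\[ \|x\|_X^2 = \langle x^*, x \rangle_X \leq \langle x^*, x + \lambda y \rangle_X \leq \|x^*\|_{X^*} \|x + \lambda y\|_X. \]
So, because $x^* \in \mathcal{F}(x)$ and thus $\|x\|_X = \|x^*\|_{X^*}$, we have
\[ \|x\|_X \leq \|x + \lambda y\|_X. \]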
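The equivalence in Lemma 3.3.3 can be checked numerically in a concrete space. The following is a minimal sketch under stated assumptions: the space is the two-dimensional $\ell^p$ with $p = 3$ (an illustrative choice, not taken from the text), where the duality map is single valued with the standard closed form $\mathcal{F}(x)_i = x_i |x_i|^{p-2} / \|x\|_p^{p-2}$. The helper names are hypothetical, and the norm inequality is tested only on a finite grid of values of $\lambda$.

```python
import math


def lp_norm(x, p):
    """l^p norm of a finite sequence x (assumes 1 < p < infinity)."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def duality_map(x, p):
    """Duality map on l^p: F(x)_i = sign(x_i) |x_i|^(p-1) / ||x||^(p-2)."""
    n = lp_norm(x, p)
    if n == 0.0:
        return [0.0] * len(x)
    return [math.copysign(abs(t) ** (p - 1), t) / n ** (p - 2) for t in x]


def pairing(x_star, y):
    """Duality brackets <x*, y> in finite dimensions."""
    return sum(a * b for a, b in zip(x_star, y))


def norm_never_decreases(x, y, p, lambdas, tol=1e-12):
    """Test ||x|| <= ||x + lam*y|| on a finite grid of lam > 0 (a spot check only)."""
    base = lp_norm(x, p)
    return all(
        base <= lp_norm([a + lam * b for a, b in zip(x, y)], p) + tol
        for lam in lambdas
    )


p = 3.0
x = [1.0, 2.0]
grid = [10.0 ** k for k in range(-6, 3)]
for y in ([1.0, 0.0], [-1.0, 0.0]):
    sign_ok = pairing(duality_map(x, p), y) >= 0.0
    # Both flags should agree, as Lemma 3.3.3 predicts.
    print(y, sign_ok, norm_never_decreases(x, y, p, grid))
```

For the first direction $y$ both flags are true, for the second both are false, so the sign of $\langle \mathcal{F}(x), y \rangle$ and the behaviour of $\lambda \mapsto \|x + \lambda y\|$ agree on these samples.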


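Using this lemma, we have the following alternative characterization of accretivity (known as Kato's criterion).

PROPOSITION 3.3.4 (Kato's Criterion) If $X$ is a Banach space and $A\colon X \supseteq D(A) \to 2^X$, then $A$ is accretive if and only if for all $\lambda > 0$ and any $(x_1, u_1), (x_2, u_2) \in \operatorname{Gr} A$, we have
\[ \|x_1 - x_2\|_X \leq \|x_1 - x_2 + \lambda(u_1 - u_2)\|_X. \]

Next we define the semi-inner products on $X$.

DEFINITION 3.3.5 Let $X$ be a Banach space and $x, y \in X$. We define the semi-inner products $(\cdot, \cdot)_\pm$ by the following:
\[ (y, x)_+ \stackrel{df}{=} \|x\|_X \lim_{\lambda \searrow 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda} = \|x\|_X \inf_{\lambda > 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda}, \]
\[ (y, x)_- \stackrel{df}{=} \|x\|_X \lim_{\lambda \nearrow 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda} = \|x\|_X \sup_{\lambda < 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda}. \]

[...] $0$; (c) $(z + y, x)_\pm \leq \|z\|_X \|x\|_X + (y, x)_\pm$; (d) $(\cdot, \cdot)_+\colon X \times X \to \mathbb{R}$ is upper semicontinuous; (e) $(\cdot, \cdot)_-\colon X \times X \to \mathbb{R}$ is lower semicontinuous.

PROOF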

(a) It follows easily from Proposition 3.3.7.

(b) Note that $\mathcal{F}(\mu x) = \mu \mathcal{F}(x)$ and so
\[ (\lambda y, \mu x)_+ = \max_{x^* \in \mathcal{F}(x)} \langle \mu x^*, \lambda y \rangle_X = \lambda\mu \max_{x^* \in \mathcal{F}(x)} \langle x^*, y \rangle_X = \lambda\mu\, (y, x)_+. \]
Similarly for $(\cdot, \cdot)_-$.

(c) It follows easily from Proposition 3.3.7.

(d) From Definition 3.3.5, we know that
\[ (y, x)_+ = \|x\|_X \inf_{\lambda > 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda}. \]
Note that for every fixed $\lambda > 0$ the function
\[ (y, x) \mapsto \|x\|_X \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda} \]
is continuous. Hence
\[ \|x\|_X \inf_{\lambda > 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda} = \inf_{\lambda > 0} \|x\|_X \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda} \]
is upper semicontinuous, being an infimum of continuous functions.

(e) Note that $(y, x)_- = -(-y, x)_+$ and use part (d).

The next theorem summarizes the different equivalent ways we can use to define accretive operators. It follows immediately from the previous discussion.

THEOREM 3.3.10 If $X$ is a Banach space and $A\colon X \supseteq D(A) \to 2^X$ is an operator, then the following two statements are equivalent:
(a) $A$ is an accretive operator;
(b) for every $(x_1, u_1), (x_2, u_2) \in \operatorname{Gr} A$, any one of the following three statements is true:
[A1] $\|x_1 - x_2\|_X \leq \|x_1 - x_2 + \lambda(u_1 - u_2)\|_X$ for all $\lambda > 0$;
[A2] $\psi'_+(x_1 - x_2; u_1 - u_2) \geq 0$;
[A3] $(u_1 - u_2, x_1 - x_2)_+ \geq 0$.

Motivated by Proposition 3.3.4, we make the following definition.

DEFINITION 3.3.11

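Let $X$ be a Banach space and let $A\colon X \supseteq D(A) \to 2^X$ be an accretive operator. For every $\lambda > 0$ we introduce
\[ J_\lambda \stackrel{df}{=} \big(\mathrm{id}_X + \lambda A\big)^{-1} \quad \text{and} \quad A_\lambda \stackrel{df}{=} \frac{1}{\lambda}\big(\mathrm{id}_X - J_\lambda\big), \]
both defined on $R\big(\mathrm{id}_X + \lambda A\big)$.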


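In the next proposition we have collected some elementary properties of the operators $J_\lambda$ and $A_\lambda$.

PROPOSITION 3.3.12 If $X$ is a Banach space and $A\colon X \supseteq D(A) \to 2^X$ is an accretive operator, then
(a) $J_\lambda$ is nonexpansive on $R(\mathrm{id}_X + \lambda A)$, i.e.,
\[ \|J_\lambda(x) - J_\lambda(y)\|_X \leq \|x - y\|_X \quad \forall\, x, y \in R(\mathrm{id}_X + \lambda A); \]
(b) $A_\lambda$ is accretive and Lipschitz continuous with constant $\frac{2}{\lambda}$ on $R(\mathrm{id}_X + \lambda A)$;
(c) $A_\lambda(x) \in A\big(J_\lambda(x)\big)$ for every $x \in R(\mathrm{id}_X + \lambda A)$;
(d) $\|A_\lambda(x)\|_X \leq |A(x)| \stackrel{df}{=} \inf_{u \in A(x)} \|u\|_X$ for every $x \in D(A) \cap R(\mathrm{id}_X + \lambda A)$;
(e) $\lim_{\lambda \searrow 0} J_\lambda(x) = x$ for every $x \in \overline{D(A)} \cap \Big( \bigcap_{\lambda > 0} R(\mathrm{id}_X + \lambda A) \Big)$.

PROOF (a) This follows at once from Proposition 3.3.4.

(b) Let
\[ y_k \stackrel{df}{=} \big(\mathrm{id}_X + \mu A_\lambda\big)(x_k) \quad \forall\, k \in \{1, 2\},\ \mu > 0. \]
Then
\[ y_k = \Big(\mathrm{id}_X + \frac{\mu}{\lambda}\big(\mathrm{id}_X - J_\lambda\big)\Big)(x_k) \quad \forall\, k \in \{1, 2\} \]
and so
\[ y_1 - y_2 + \frac{\mu}{\lambda}\big(J_\lambda(x_1) - J_\lambda(x_2)\big) = \Big(1 + \frac{\mu}{\lambda}\Big)(x_1 - x_2). \]
Hence
\[ \Big(1 + \frac{\mu}{\lambda}\Big)\|x_1 - x_2\|_X \leq \|y_1 - y_2\|_X + \frac{\mu}{\lambda}\|J_\lambda(x_1) - J_\lambda(x_2)\|_X \]
and from part (a), we have
\[ \|x_1 - x_2\|_X \leq \|y_1 - y_2\|_X, \]
which proves the accretivity of $A_\lambda$ (see Proposition 3.3.4). Also, from part (a), we have
\[ \|A_\lambda(x_1) - A_\lambda(x_2)\|_X = \frac{1}{\lambda}\big\|x_1 - x_2 - \big(J_\lambda(x_1) - J_\lambda(x_2)\big)\big\|_X \leq \frac{2}{\lambda}\|x_1 - x_2\|_X \]
and so $A_\lambda$ is a Lipschitz continuous operator with constant $\frac{2}{\lambda}$.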


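(c) We have
\[ A_\lambda(x) \in \frac{1}{\lambda}\Big( (\mathrm{id}_X + \lambda A)\big(J_\lambda(x)\big) - J_\lambda(x) \Big) = A\big(J_\lambda(x)\big) \quad \forall\, x \in R(\mathrm{id}_X + \lambda A). \]

(d) We have
\[ A_\lambda(x) = \frac{1}{\lambda}\Big( J_\lambda\big((\mathrm{id}_X + \lambda A)x\big) - J_\lambda(x) \Big) = \frac{1}{\lambda}\Big( J_\lambda(x + \lambda u) - J_\lambda(x) \Big) \quad \forall\, u \in A(x). \]
Hence, by part (a),
\[ \|A_\lambda(x)\|_X \leq \|u\|_X, \tag{3.48} \]
which implies
\[ \|A_\lambda(x)\|_X \leq |A(x)|. \]

(e) From part (d), we have
\[ \|J_\lambda(x) - x\|_X = \lambda\|A_\lambda(x)\|_X \leq \lambda|A(x)| \quad \forall\, x \in D(A) \cap R(\mathrm{id}_X + \lambda A). \]
Hence
\[ J_\lambda(x) \to x \quad \text{as } \lambda \searrow 0, \quad \forall\, x \in D(A) \cap R(\mathrm{id}_X + \lambda A) \]
and by uniform continuity this extends to all $x \in \overline{D(A)} \cap R(\mathrm{id}_X + \lambda A)$ (see part (a)).

REMARK 3.3.13 When $X = H$ is a Hilbert space, then Proposition 3.3.12 coincides with Theorem 3.2.38. Also note that if $\lambda, \mu > 0$ and $x \in D(J_\lambda) = R(\mathrm{id}_X + \lambda A)$, then
\[ \frac{\mu}{\lambda}x + \frac{\lambda - \mu}{\lambda}J_\lambda(x) \in D(J_\mu) = R(\mathrm{id}_X + \mu A) \]
and
\[ J_\lambda(x) = J_\mu\Big( \frac{\mu}{\lambda}x + \frac{\lambda - \mu}{\lambda}J_\lambda(x) \Big) \]
(this last equality is usually known as the Resolvent Identity). To see this let $x = y + \lambda v$ with $(y, v) \in \operatorname{Gr} A$. So $J_\lambda(x) = y$. Hence we have
\[ \frac{\mu}{\lambda}x + \frac{\lambda - \mu}{\lambda}J_\lambda(x) = \frac{\mu}{\lambda}(y + \lambda v) + \frac{\lambda - \mu}{\lambda}y = y + \mu v \in R(\mathrm{id}_X + \mu A) = D(J_\mu) \]
and
\[ J_\mu\Big( \frac{\mu}{\lambda}x + \frac{\lambda - \mu}{\lambda}J_\lambda(x) \Big) = J_\mu(y + \mu v) = y = J_\lambda(x). \]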
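The resolvent $J_\lambda$, the approximation $A_\lambda$ and the resolvent identity can be made concrete on a scalar example. The sketch below assumes $X = \mathbb{R}$ and takes $A$ to be the sign graph ($A(y) = \{\operatorname{sign} y\}$ for $y \neq 0$, $A(0) = [-1, 1]$), an illustrative multivalued m-accretive operator that is not discussed in the text; for it $J_\lambda$ is the soft-thresholding map. All function names are hypothetical.

```python
def resolvent(x, lam):
    """J_lam(x) = (id + lam*A)^(-1)(x) for the sign graph A on R (soft thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def yosida(x, lam):
    """A_lam(x) = (x - J_lam(x)) / lam, as in Definition 3.3.11."""
    return (x - resolvent(x, lam)) / lam


def in_sign_graph(y, v, tol=1e-12):
    """Check v in A(y), where A(y) = {sign(y)} for y != 0 and A(0) = [-1, 1]."""
    if y > 0:
        return abs(v - 1.0) <= tol
    if y < 0:
        return abs(v + 1.0) <= tol
    return abs(v) <= 1.0 + tol


def resolvent_identity_gap(x, lam, mu):
    """|J_lam(x) - J_mu((mu/lam) x + ((lam - mu)/lam) J_lam(x))|, zero if the identity holds."""
    j = resolvent(x, lam)
    return abs(j - resolvent((mu / lam) * x + ((lam - mu) / lam) * j, mu))


for x in (-5.0, 1.0, 3.0):
    # Columns: x, J_2(x), A_2(x), gap in the resolvent identity with mu = 0.5.
    print(x, resolvent(x, 2.0), yosida(x, 2.0), resolvent_identity_gap(x, 2.0, 0.5))
```

On these samples one can also confirm Proposition 3.3.12(c) by calling `in_sign_graph(resolvent(x, lam), yosida(x, lam))`.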


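Using the resolvent identity, for $\lambda \geq \mu > 0$, we have that
\begin{align*}
\lambda\|A_\lambda(x)\|_X = \|J_\lambda(x) - x\|_X
&\leq \|J_\lambda(x) - J_\mu(x)\|_X + \|J_\mu(x) - x\|_X \\
&= \Big\| J_\mu\Big( \frac{\mu}{\lambda}x + \frac{\lambda - \mu}{\lambda}J_\lambda(x) \Big) - J_\mu(x) \Big\|_X + \|J_\mu(x) - x\|_X \\
&\leq \Big\| \frac{\mu}{\lambda}x + \frac{\lambda - \mu}{\lambda}J_\lambda(x) - x \Big\|_X + \|J_\mu(x) - x\|_X \\
&= (\lambda - \mu)\|A_\lambda(x)\|_X + \mu\|A_\mu(x)\|_X,
\end{align*}
so
\[ \|A_\lambda(x)\|_X \leq \|A_\mu(x)\|_X \quad \forall\, \lambda \geq \mu \]
and thus $\big\{ \|A_\lambda(x)\|_X \big\}_{\lambda > 0}$ is increasing as $\lambda$ decreases to $0^+$.

PROPOSITION 3.3.14 If $X$ is a Banach space and $A\colon X \supseteq D(A) \to 2^X$ is an accretive operator, then $R(\mathrm{id}_X + \lambda A) = X$ for all $\lambda > 0$, if it holds for some $\lambda > 0$.

PROOF Suppose that
\[ R(\mathrm{id}_X + \lambda A) = X \]
for some $\lambda > 0$ and let $\mu > \frac{\lambda}{2}$. Then for $u \in X$, we have
\[ u \in \big(\mathrm{id}_X + \mu A\big)(x) \]
or equivalently
\[ \big(\mathrm{id}_X + \lambda A\big)(x) \ni \frac{\lambda}{\mu}u + \Big(1 - \frac{\lambda}{\mu}\Big)x, \]
which in turn is equivalent to saying that $K(x) = x$ for the contraction
\[ K(x) \stackrel{df}{=} J_\lambda\Big( \frac{\lambda}{\mu}u + \Big(1 - \frac{\lambda}{\mu}\Big)x \Big) \quad \forall\, x \in X. \]
By Banach's fixed point theorem (see Theorem 7.1.2) $K(x) = x$ has a unique solution and so
\[ R(\mathrm{id}_X + \mu A) = X \quad \forall\, \mu > \frac{\lambda}{2}. \]
Then by induction we conclude that $R(\mathrm{id}_X + \mu A) = X$ for all $\mu > 0$.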
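The fixed point argument in this proof is constructive: knowing $J_\lambda$ alone, one can solve $x + \mu A(x) \ni u$ for any $\mu > \lambda/2$ by iterating $K$. The following sketch illustrates this under assumptions that are not in the text: $X = \mathbb{R}$, $A(y) = y^3$ (a single valued accretive map chosen for illustration), and $J_\lambda$ evaluated by bisection. The function names are hypothetical.

```python
def resolvent_cubic(x, lam):
    """J_lam(x) for A(y) = y**3: the unique y with y + lam*y**3 = x, found by bisection."""
    lo, hi = -abs(x), abs(x)  # the solution satisfies |y| <= |x|
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + lam * mid ** 3 < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def solve_by_contraction(u, lam, mu, iterations=200):
    """Solve x + mu*A(x) = u by iterating K(x) = J_lam((lam/mu) u + (1 - lam/mu) x)."""
    if mu <= lam / 2:
        raise ValueError("K is a contraction only for mu > lam / 2")
    x = 0.0
    for _ in range(iterations):
        x = resolvent_cubic((lam / mu) * u + (1.0 - lam / mu) * x, lam)
    return x


u, lam = 2.0, 1.0
for mu in (0.6, 3.0):
    x = solve_by_contraction(u, lam, mu)
    # The residual of x + mu*x**3 = u should be close to zero.
    print(mu, x, x + mu * x ** 3 - u)
```

The contraction constant is $|1 - \lambda/\mu|$, so convergence slows down as $\mu$ approaches $\lambda/2$; this is why the proof reaches all $\mu > 0$ only by repeating the step.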


REMARK 3.3.15 Using Proposition 3.3.14, we can say that an accretive operator $A\colon X \supseteq D(A) \to 2^X$ is m-accretive if and only if
\[ R(\mathrm{id}_X + \lambda A) = X \quad \text{for some } \lambda > 0 \]
(equivalently for all $\lambda > 0$; see Definition 3.3.1).

PROPOSITION 3.3.16 If $X$ is a Banach space and $A\colon X \supseteq D(A) \to 2^X$ is an m-accretive operator, then
(a) $A$ is maximal accretive;
(b) $A$ is closed;
(c) if $x_\lambda \to x$ and $A_\lambda(x_\lambda) \to u$ in $X$ as $\lambda \searrow 0$, then $(x, u) \in \operatorname{Gr} A$.

PROOF (a) Let $(x_0, u_0) \in X \times X$ and suppose that
\[ \|x_0 - x\|_X \leq \|x_0 - x + \lambda(u_0 - u)\|_X \quad \forall\, \lambda > 0,\ (x, u) \in \operatorname{Gr} A. \tag{3.49} \]
We need to show that $(x_0, u_0) \in \operatorname{Gr} A$ (see Definition 3.3.1 and Proposition 3.3.4). Since $A$ is m-accretive, we have that $X = R(\mathrm{id}_X + A)$ and so we can find $(x, u) \in \operatorname{Gr} A$, such that $x + u = x_0 + u_0$. Using this in (3.49) with $\lambda = 1$, we obtain $x = x_0$, hence $(x_0, u_0) \in \operatorname{Gr} A$.

(b) Let $\{(x_n, u_n)\}_{n \geq 1} \subseteq \operatorname{Gr} A$ and assume that $(x_n, u_n) \to (x, u)$ in $X \times X$. We need to show that $(x, u) \in \operatorname{Gr} A$. By virtue of the accretivity of $A$, we have
\[ \|x_n - y\|_X \leq \|x_n - y + \lambda(u_n - v)\|_X \quad \forall\, n \geq 1,\ \lambda > 0,\ (y, v) \in \operatorname{Gr} A, \]
so
\[ \|x - y\|_X \leq \|x - y + \lambda(u - v)\|_X \quad \forall\, \lambda > 0,\ (y, v) \in \operatorname{Gr} A. \tag{3.50} \]


But from part (a), we know that $A$ is maximal accretive. So from (3.50), it follows that $(x, u) \in \operatorname{Gr} A$.

(c) From Proposition 3.3.12(a) and (e), we have that
\[ J_\lambda(x_\lambda) \to x \ \text{in } X, \quad \text{as } \lambda \searrow 0, \]
while from Proposition 3.3.12(c), we know that
\[ A_\lambda(x_\lambda) \in A\big(J_\lambda(x_\lambda)\big) \quad \forall\, \lambda > 0. \]
Then using part (b), we infer that $(x, u) \in \operatorname{Gr} A$.

We can improve conclusion (b) of Proposition 3.3.16, provided we strengthen the condition on the space $X$.

PROPOSITION 3.3.17 If $X$ is a reflexive Banach space with $X^*$ being locally uniformly convex and $A\colon X \supseteq D(A) \to 2^X$ is an m-accretive operator, then $\operatorname{Gr} A$ is sequentially closed in $X \times X_w$.

PROOF Suppose that $\{(x_n, u_n)\}_{n \geq 1} \subseteq \operatorname{Gr} A$ is a sequence, such that
\[ x_n \to x \quad \text{and} \quad u_n \xrightarrow{w} u \ \text{in } X. \]
We need to show that $(x, u) \in \operatorname{Gr} A$. Because $A$ is m-accretive, from Theorem 3.3.10(b)[A3], we have
\[ \langle \mathcal{F}(x_n - y), u_n - v \rangle_X \geq 0 \quad \forall\, n \geq 1,\ (y, v) \in \operatorname{Gr} A. \tag{3.51} \]
But from Proposition 3.2.25, we know that the duality map $\mathcal{F}\colon X \to X^*$ is continuous. So if we pass to the limit as $n \to +\infty$ in (3.51), we obtain
\[ \langle \mathcal{F}(x - y), u - v \rangle_X \geq 0 \quad \forall\, (y, v) \in \operatorname{Gr} A, \]
so
\[ (u - v, x - y)_+ \geq 0 \quad \forall\, (y, v) \in \operatorname{Gr} A \]
and thus, from Proposition 3.3.16(a), we conclude that $(x, u) \in \operatorname{Gr} A$.

Another useful result that can be proved by imposing extra conditions on the space X is the following one.


PROPOSITION 3.3.18 If $X$ is a Banach space with $X^*$ strictly convex and $A\colon X \supseteq D(A) \to 2^X$ is a maximal accretive operator, then the set $A(x) \subseteq X$ is convex and closed for any $x \in D(A)$.

PROOF Because $X^*$ is strictly convex, the duality map $\mathcal{F}\colon X \to X^*$ is single-valued. First we show that for $x \in D(A)$, the set $A(x)$ is convex. So let $u, v \in A(x)$ and set
\[ w = tu + (1 - t)v \quad \text{with } t \in [0, 1]. \]
For all $(y, h) \in \operatorname{Gr} A$, we have that
\[ \langle \mathcal{F}(x - y), w - h \rangle_X = t \langle \mathcal{F}(x - y), u - h \rangle_X + (1 - t) \langle \mathcal{F}(x - y), v - h \rangle_X \geq 0, \]
so from the maximality of $A$, we have that $(x, w) \in \operatorname{Gr} A$. Next we show that the set $A(x)$ is closed in $X$. To this end let $\{u_n\}_{n \geq 1} \subseteq A(x)$ be a sequence, such that
\[ u_n \to u \ \text{in } X. \]
We have
\[ \langle \mathcal{F}(x - y), u_n - v \rangle_X \geq 0 \quad \forall\, n \geq 1,\ (y, v) \in \operatorname{Gr} A, \]
so
\[ \langle \mathcal{F}(x - y), u - v \rangle_X \geq 0 \quad \forall\, (y, v) \in \operatorname{Gr} A \]
and so, from the maximality of $A$, we have that $(x, u) \in \operatorname{Gr} A$.

We continue with the properties of maximal accretive and m-accretive operators.

PROPOSITION 3.3.19 If $X$ is a uniformly convex Banach space and $A\colon X \supseteq D(A) \to 2^X$ is an m-accretive operator, then $\overline{D(A)}$ is convex.

PROOF Let
\[ D_0 \stackrel{df}{=} \Big\{ x \in \overline{\operatorname{conv}}\, D(A) : \lim_{\lambda \searrow 0} J_\lambda(x) = x \Big\}. \]
Evidently $\overline{D(A)} = D_0$ and $D_0$ is closed. So it suffices to show that $D_0$ is convex. For $x, y \in D_0$, we have
\[ \Big\| J_\lambda\Big(\frac{x + y}{2}\Big) - J_\lambda(x) \Big\|_X \leq \Big\| \frac{x - y}{2} \Big\|_X \quad \forall\, \lambda > 0 \tag{3.52} \]
and
\[ \Big\| J_\lambda\Big(\frac{x + y}{2}\Big) - J_\lambda(y) \Big\|_X \leq \Big\| \frac{x - y}{2} \Big\|_X \quad \forall\, \lambda > 0 \tag{3.53} \]
(see Proposition 3.3.12(a)). From (3.52) and (3.53), it follows that
\[ \Big\{ J_\lambda\Big(\frac{x + y}{2}\Big) \Big\}_{\lambda \in (0, 1)} \subseteq X \ \text{is bounded.} \]
Since $X$ is reflexive (being uniformly convex; see Remark A.3.22), we can find a sequence $\lambda_n \searrow 0$, such that
\[ J_{\lambda_n}\Big(\frac{x + y}{2}\Big) \xrightarrow{w} h \ \text{in } X. \]
So if we pass to the limit as $n \to +\infty$ in (3.52) and (3.53), we obtain
\[ \|h - x\|_X \leq \Big\| \frac{x - y}{2} \Big\|_X \quad \text{and} \quad \|h - y\|_X \leq \Big\| \frac{x - y}{2} \Big\|_X. \tag{3.54} \]
We have
\[ \|x - y\|_X \leq \|x - h\|_X + \|h - y\|_X \leq \|x - y\|_X. \tag{3.55} \]
From (3.54) and (3.55), it follows that
\[ \|x - h\|_X = \|y - h\|_X = \Big\| \frac{x - y}{2} \Big\|_X \]
and this by virtue of the uniform convexity of $X$ implies that $h = \frac{x + y}{2}$. Thus we have
\[ J_{\lambda_n}\Big(\frac{x + y}{2}\Big) \xrightarrow{w} \frac{x + y}{2} \ \text{in } X. \tag{3.56} \]
Moreover, we have
\[ \Big\| \frac{y - x}{2} \Big\|_X \leq \liminf_{n \to +\infty} \Big\| J_{\lambda_n}\Big(\frac{x + y}{2}\Big) - J_{\lambda_n}(x) \Big\|_X \leq \limsup_{n \to +\infty} \Big\| J_{\lambda_n}\Big(\frac{x + y}{2}\Big) - J_{\lambda_n}(x) \Big\|_X \leq \Big\| \frac{y - x}{2} \Big\|_X, \]
so
\[ \Big\| J_{\lambda_n}\Big(\frac{x + y}{2}\Big) - J_{\lambda_n}(x) \Big\|_X \to \Big\| \frac{y - x}{2} \Big\|_X. \tag{3.57} \]
Since $J_{\lambda_n}(x) \to x$ in $X$, from (3.56) and (3.57) and the Kadec-Klee property (see Remark A.3.22), we conclude that
\[ J_{\lambda_n}\Big(\frac{x + y}{2}\Big) \to \frac{x + y}{2} \]
and so $\frac{x + y}{2} \in D_0$, which proves the convexity of $D_0$, hence the convexity of $\overline{D(A)}$.


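Next we prove two perturbation results for m-accretive operators. To do this we shall need the following auxiliary result.

LEMMA 3.3.20 If $X$ is a Banach space with $X^*$ being uniformly convex (hence $X$ is reflexive) and $A\colon X \supseteq D(A) \to 2^X$ and $B\colon X \supseteq D(B) \to 2^X$ are two m-accretive operators with $D(A) \cap D(B) \neq \emptyset$, then for every $u \in X$ and every $\lambda > 0$, the operator inclusion
\[ x + A(x) + B_\lambda(x) \ni u \]
has a unique solution $x_\lambda \in D(A)$ and $\{x_\lambda\}_{\lambda > 0}$ is bounded. Moreover, if $\{B_\lambda(x_\lambda)\}_{\lambda \in (0, 1)}$ is bounded, then the operator inclusion
\[ x + A(x) + B(x) \ni u \]
has a unique solution $x \in D(A) \cap D(B)$ and
\[ x_\lambda \to x \ \text{in } X, \quad \text{as } \lambda \searrow 0. \]

PROOF The operator inclusion
\[ x + A(x) + B_\lambda(x) \ni u \]
is equivalent to
\[ x = \Big( \mathrm{id}_X + \frac{\lambda}{\lambda + 1} A \Big)^{-1} \Big( \frac{\lambda}{\lambda + 1} u + \frac{1}{\lambda + 1} (\mathrm{id}_X + \lambda B)^{-1}(x) \Big), \]
so
\[ x = K_\lambda(x), \tag{3.58} \]
with
\[ K_\lambda \stackrel{df}{=} J^A_{\frac{\lambda}{\lambda + 1}} \Big( \frac{\lambda}{\lambda + 1} u + \frac{1}{\lambda + 1} J^B_\lambda \Big) \quad \forall\, \lambda > 0. \]
Since the operators $J^A_{\frac{\lambda}{\lambda + 1}}$ and $J^B_\lambda$ are nonexpansive on $X$ (see Proposition 3.3.12(a)), we can check that
\[ \|K_\lambda(x) - K_\lambda(y)\|_X \leq \frac{1}{\lambda + 1}\|x - y\|_X \quad \forall\, x, y \in X,\ \lambda > 0. \]
Invoking Banach's fixed point theorem (see Theorem 7.1.2), we infer that (3.58) has a unique solution $x_\lambda \in D(A)$ for $\lambda > 0$. Let
\[ z \in D(A) \cap D(B) \quad \text{and} \quad u_\lambda \in z + A(z) + B_\lambda(z). \]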


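From Proposition 3.3.12(b), we know that $B_\lambda$ is accretive and since the sum of accretive operators is clearly accretive, we have that the operator $A + B_\lambda\colon X \supseteq D(A) \to 2^X$ is accretive. So
\[ \langle \mathcal{F}(x_\lambda - z), u - x_\lambda - (u_\lambda - z) \rangle_X \geq 0, \]
hence
\[ \|x_\lambda - z\|_X^2 \leq \langle \mathcal{F}(x_\lambda - z), u - u_\lambda \rangle_X \leq \|x_\lambda - z\|_X \|u - u_\lambda\|_X, \]
from which it follows that
\[ \|x_\lambda - z\|_X \leq \|u - u_\lambda\|_X. \tag{3.59} \]
Because
\[ \|B_\lambda(z)\|_X \leq |B(z)| \]
(see Proposition 3.3.12(d)), from (3.59), we infer that $\{x_\lambda\}_{\lambda > 0}$ is bounded.

Now suppose that $\{B_\lambda(x_\lambda)\}_{\lambda \in (0, 1)}$ is bounded. For $\lambda, \mu > 0$, we have
\[ u - x_\lambda - B_\lambda(x_\lambda) \in A(x_\lambda) \quad \text{and} \quad u - x_\mu - B_\mu(x_\mu) \in A(x_\mu). \]
Exploiting the accretivity of the operator $A$, we obtain
\[ \langle \mathcal{F}(x_\lambda - x_\mu), x_\mu - x_\lambda + B_\mu(x_\mu) - B_\lambda(x_\lambda) \rangle_X \geq 0, \]
so
\[ \|x_\lambda - x_\mu\|_X^2 \leq \langle \mathcal{F}(x_\lambda - x_\mu), B_\mu(x_\mu) - B_\lambda(x_\lambda) \rangle_X. \]
Because
\[ B_\lambda(x_\lambda) \in B\big(J_\lambda(x_\lambda)\big) \quad \text{and} \quad B_\mu(x_\mu) \in B\big(J_\mu(x_\mu)\big) \]
(see Proposition 3.3.12(c)) and $B$ is accretive, we have
\[ \big\langle \mathcal{F}\big(J_\mu(x_\mu) - J_\lambda(x_\lambda)\big), B_\mu(x_\mu) - B_\lambda(x_\lambda) \big\rangle_X \geq 0. \]
It follows that
\[ \|x_\lambda - x_\mu\|_X^2 \leq \big\langle \mathcal{F}(x_\lambda - x_\mu) + \mathcal{F}\big(J_\mu(x_\mu) - J_\lambda(x_\lambda)\big), B_\mu(x_\mu) - B_\lambda(x_\lambda) \big\rangle_X. \tag{3.60} \]
Since
\[ \lambda B_\lambda(x_\lambda) = x_\lambda - J_\lambda(x_\lambda) \]
and by hypothesis $\{B_\lambda(x_\lambda)\}_{\lambda \in (0, 1)}$ is bounded, we have that
\[ \|x_\lambda - J_\lambda(x_\lambda)\|_X \leq M_1 \lambda \quad \forall\, \lambda \in (0, 1), \]
for some $M_1 > 0$. Because $X^*$ is uniformly convex, we have that the duality map is uniformly continuous on bounded sets of $X$ (see Proposition 3.2.28). Therefore since the duality map is odd (see Proposition 3.2.22), we see that given $\varepsilon > 0$, for all $\lambda, \mu > 0$ small enough, we have
\[ \big\| \mathcal{F}(x_\lambda - x_\mu) + \mathcal{F}\big(J_\mu(x_\mu) - J_\lambda(x_\lambda)\big) \big\|_{X^*} = \big\| \mathcal{F}\big(J_\mu(x_\mu) - J_\lambda(x_\lambda)\big) - \mathcal{F}(x_\mu - x_\lambda) \big\|_{X^*} \leq \varepsilon. \]
So from (3.60), we have
\[ \|x_\lambda - x_\mu\|_X^2 \leq M_1 \varepsilon \quad \forall\, \lambda, \mu > 0 \ \text{small enough.} \]
Since $\varepsilon > 0$ was arbitrary, we conclude that
\[ x_\lambda \to x \ \text{in } X, \quad \text{as } \lambda \searrow 0. \]
Let $\lambda_n \searrow 0$ be such that
\[ B_{\lambda_n}(x_{\lambda_n}) \xrightarrow{w} z \ \text{in } X \]
(recall that $X$ is reflexive and by hypothesis $\{B_\lambda(x_\lambda)\}_{\lambda \in (0, 1)}$ is bounded). Set
\[ v_n \stackrel{df}{=} u - x_{\lambda_n} - B_{\lambda_n}(x_{\lambda_n}) \quad \forall\, n \geq 1. \]
Then
\[ v_n \xrightarrow{w} v = u - x - z \ \text{in } X \]
and $v_n \in A(x_{\lambda_n})$. Invoking Proposition 3.3.17, we have that $(x, v) \in \operatorname{Gr} A$. Also
\[ \big(J_{\lambda_n}(x_{\lambda_n}), B_{\lambda_n}(x_{\lambda_n})\big) \in \operatorname{Gr} B \]
and
\[ J_{\lambda_n}(x_{\lambda_n}) \to x \quad \text{and} \quad B_{\lambda_n}(x_{\lambda_n}) \xrightarrow{w} z \ \text{in } X. \]
So once again via Proposition 3.3.17, we have that $(x, z) \in \operatorname{Gr} B$. Thus finally
\[ x \in D(A) \cap D(B) \quad \text{and} \quad u = x + v + z \quad \text{with } v \in A(x) \ \text{and } z \in B(x). \]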

Using this lemma we can prove two perturbation theorems for m-accretive operators.


THEOREM 3.3.21 If $X$ is a Banach space with $X^*$ being uniformly convex (hence $X$ is reflexive) and $A\colon X \supseteq D(A) \to 2^X$, $B\colon X \supseteq D(B) \to 2^X$ are two m-accretive operators, such that
(i) $D(A) \cap D(B) \neq \emptyset$;
(ii) $\big\langle \mathcal{F}\big(B_\lambda(x)\big), u \big\rangle_X \geq 0$ for all $\lambda > 0$ and all $(x, u) \in \operatorname{Gr} A$,
then $A + B$ is m-accretive.

PROOF Let $u \in X$. By virtue of Lemma 3.3.20, we can find a unique $x_\lambda \in D(A)$, such that
\[ u \in x_\lambda + A(x_\lambda) + B_\lambda(x_\lambda) \quad \forall\, \lambda > 0. \]
We take the duality brackets with $\mathcal{F}\big(B_\lambda(x_\lambda)\big)$. Using (ii) and the fact that $\{x_\lambda\}_{\lambda > 0}$ is bounded (see Lemma 3.3.20), we obtain that $\{B_\lambda(x_\lambda)\}_{\lambda \in (0, 1)}$ is bounded. Then by virtue of Lemma 3.3.20, we have that
\[ x_\lambda \to x \ \text{in } X \quad \text{as } \lambda \searrow 0 \]
and $u \in x + A(x) + B(x)$, i.e., $R(\mathrm{id}_X + A + B) = X$, which means that $A + B$ is m-accretive.

THEOREM 3.3.22 If $X$ is a Banach space with $X^*$ being uniformly convex (hence $X$ is reflexive) and $A\colon X \supseteq D(A) \to 2^X$, $B\colon X \supseteq D(B) \to 2^X$ are two m-accretive operators, such that
(i) $D(A) \subseteq D(B)$;
(ii) for each $r > 0$, there are $c < 1$ and $d > 0$, such that
\[ |B(x)| \leq c|A(x)| + d \quad \forall\, x \in D(A),\ \|x\|_X \leq r, \]
then $A + B$ is m-accretive.


PROOF Let $u \in X$ and let $x_\lambda \in D(A)$ be the unique solution of the operator inclusion
\[ u \in x_\lambda + A(x_\lambda) + B_\lambda(x_\lambda). \]
Since $x_\lambda \in D(A) \subseteq D(B)$ (see (i)) and
\[ \|B_\lambda(x_\lambda)\|_X \leq |B(x_\lambda)| \quad \forall\, \lambda > 0 \]
(see Proposition 3.3.12(d)), from condition (ii) and since $\{x_\lambda\}_{\lambda > 0}$ is bounded (see Lemma 3.3.20), we obtain
\[ |A(x_\lambda)| \leq c|A(x_\lambda)| + d_0 \quad \forall\, \lambda > 0, \]
for some $d_0 > 0$, so $\{|A(x_\lambda)|\}_{\lambda > 0}$ is bounded (since $c < 1$). Using this fact in condition (ii), we infer that $\{B_\lambda(x_\lambda)\}_{\lambda \in (0, 1)}$ is bounded. Then invoking Lemma 3.3.20, we finish the proof as before.

REMARK 3.3.23 Condition (ii) of Theorem 3.3.22 can be replaced by the following local condition:
(ii)' for every $x_0 \in D(A)$, we can find a neighbourhood $U$ of $x_0$ and constants $c < 1$ and $d > 0$, such that
\[ |B(x)| \leq c|A(x)| + d \quad \forall\, x \in D(A) \cap U \]
(see Kato (1967)). In applications in general Theorem 3.3.21 is more convenient than Theorem 3.3.22.

We present an application of the perturbation results in the study of elliptic boundary value problems. To this end let $\Omega \subseteq \mathbb{R}^N$ be a bounded domain with a $C^2$-boundary $\partial\Omega$. We shall need the following existence, uniqueness and regularity result due to Agmon, Douglis & Nirenberg (1959).

THEOREM 3.3.24 If $\Omega \subseteq \mathbb{R}^N$ is as above, $p \in (1, +\infty)$ and $f \in L^p(\Omega)$, then there exists unique $x \in W^{2,p}(\Omega) \cap W_0^{1,p}(\Omega)$, such that
\[ -\Delta x(z) + x(z) = f(z) \quad \text{for a.a. } z \in \Omega, \qquad x|_{\partial\Omega} = 0. \]
Moreover, if $\partial\Omega$ is a $C^{m+1}$-manifold for some $m \geq 1$ and $f \in W^{m,p}(\Omega)$, then $x \in W^{m+2,p}(\Omega)$ and
\[ \|x\|_{W^{m+2,p}(\Omega)} \leq c\,\|f\|_{W^{m,p}(\Omega)} \quad \forall\, x \in W^{m+2,p}(\Omega), \]
for some $c = c(m, p, \Omega) > 0$.


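Let
\[ \xi\colon \mathbb{R} \supseteq D(\xi) \to 2^{\mathbb{R}} \]
be a maximal monotone map with $0 \in \xi(0)$. We consider the realization (lifting) of $\xi$ on $L^p(\Omega) \times L^p(\Omega)$ for $p \in (1, +\infty)$. So we define
\[ \widehat{\xi}\colon L^p(\Omega) \supseteq D(\widehat{\xi}) \to 2^{L^p(\Omega)} \]
by
\[ \widehat{\xi}(x) \stackrel{df}{=} \big\{ u \in L^p(\Omega) : u(z) \in \xi\big(x(z)\big) \ \text{for a.a. } z \in \Omega \big\} \quad \forall\, x \in D(\widehat{\xi}), \]
where
\[ D(\widehat{\xi}) \stackrel{df}{=} \big\{ x \in L^p(\Omega) : \text{there exists } u \in L^p(\Omega), \ \text{such that } u(z) \in \xi\big(x(z)\big) \ \text{for a.a. } z \in \Omega \big\}. \]
A simple measurable selection argument establishes that $\widehat{\xi}$ is m-accretive and we have
\[ \big[ (\mathrm{id}_X + \lambda\widehat{\xi})^{-1} x \big](z) = (1 + \lambda\xi)^{-1} x(z) \quad \text{for a.a. } z \in \Omega, \]
with $X = L^p(\Omega)$ and
\[ \widehat{\xi}_\lambda(x)(z) = \xi_\lambda\big(x(z)\big) \quad \text{for a.a. } z \in \Omega, \ \text{all } \lambda > 0 \ \text{and all } x \in L^p(\Omega). \]
We consider the operator $K\colon L^p(\Omega) \supseteq D(K) \to 2^{L^p(\Omega)}$, defined by
\[ K(x) = -\Delta x + \widehat{\xi}(x) \quad \forall\, x \in D(K), \tag{3.61} \]
where
\[ D(K) \stackrel{df}{=} W_0^{1,p}(\Omega) \cap W^{2,p}(\Omega) \cap D(\widehat{\xi}). \]

PROPOSITION 3.3.25 If $K\colon L^p(\Omega) \supseteq D(K) \to 2^{L^p(\Omega)}$ is the operator defined by (3.61), then $K$ is m-accretive.

PROOF It is easy to check that the duality map on $L^p(\Omega)$, $\mathcal{F}\colon L^p(\Omega) \to L^{p'}(\Omega)$ (with $\frac{1}{p} + \frac{1}{p'} = 1$), is defined by
\[ \mathcal{F}(x)(\cdot) \stackrel{df}{=} \frac{x(\cdot)\,|x(\cdot)|^{p-2}}{\|x\|_p^{p-2}}. \]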


Using this we can check that the operator
\[ -\Delta\colon L^p(\Omega) \supseteq W_0^{1,p}(\Omega) \cap W^{2,p}(\Omega) \to L^p(\Omega) \]
is accretive. Moreover, by virtue of Theorem 3.3.24, it follows that $-\Delta$ is m-accretive. We have
\[ \big\langle \mathcal{F}\big(\widehat{\xi}_\lambda(x)\big), -\Delta x \big\rangle_{L^p(\Omega)} = \int_\Omega -\Delta x(z)\, \xi_\lambda\big(x(z)\big) \big|\xi_\lambda\big(x(z)\big)\big|^{p-2} \, dz. \tag{3.62} \]
If
\[ \varphi_\lambda(r) \stackrel{df}{=} \xi_\lambda(r)\,|\xi_\lambda(r)|^{p-2}, \]
then $\varphi_\lambda$ is a Lipschitz continuous map and
\[ \varphi_\lambda'(r) = (p - 1)|\xi_\lambda(r)|^{p-2}\, \xi_\lambda'(r) \quad \text{for a.a. } r \in \mathbb{R}. \]
Also
\[ \varphi_\lambda\big(x(\cdot)\big) \in W_0^{1,p}(\Omega) \]
(see Proposition 2.4.25 and Remark 2.4.26) and
\[ D\varphi_\lambda\big(x(z)\big) = \varphi_\lambda'\big(x(z)\big)\, Dx(z) \quad \text{for a.a. } z \in \Omega. \]
Performing an integration by parts on the right hand side integral of (3.62) and recalling that $\xi_\lambda(0) = 0$ (since $0 \in \xi(0)$), we obtain
\[ \big\langle \mathcal{F}\big(\widehat{\xi}_\lambda(x)\big), -\Delta x \big\rangle_{L^p(\Omega)} = \int_\Omega \varphi_\lambda'\big(x(z)\big)\, \|Dx(z)\|_{\mathbb{R}^N}^2 \, dz \geq 0, \]
since $\varphi_\lambda'\big(x(z)\big) \geq 0$, because $\varphi_\lambda$ is monotone increasing. Applying Theorem 3.3.21, with data $A = -\Delta$, $B = \widehat{\xi}$ and $X = L^p(\Omega)$ (note that $X^* = L^{p'}(\Omega)$ is uniformly convex since $p' \in (1, +\infty)$), we obtain that $A + B = K$ is m-accretive.

The main reason for studying accretive operators is the fact that they are closely related with the generation of semigroups (linear and nonlinear). The theory of semigroups is a valuable tool in the study of partial differential equations, of Volterra integral equations and of control problems. In the rest of this section, we will see how m-accretive operators lead to semigroups of operators, which in turn describe the time-evolution of a dynamical system monitored by a differential equation in a Banach space (evolution equation). So we start our discussion with an existence result for evolution equations in which the input and Cauchy data are regular (smooth). First two useful auxiliary results.


LEMMA 3.3.26 If $X$ is a Banach space, $x\colon T = [0, b] \to X$ is weakly differentiable at $t \in T$, i.e.,
\[ \frac{d}{ds}\langle x^*, x(s) \rangle_X \Big|_{s = t} = \langle x^*, x'(t) \rangle_X \quad \forall\, x^* \in X^*, \]
and $s \mapsto \|x(s)\|_X$ is differentiable at $s = t$, then
\[ \|x(t)\|_X \Big[ \frac{d}{ds}\|x(s)\|_X \Big]_{s = t} = \langle x^*, x'(t) \rangle_X \quad \forall\, x^* \in \mathcal{F}\big(x(t)\big). \]

PROOF For every $x^* \in \mathcal{F}\big(x(t)\big)$ and $r > 0$, we have
\[ \langle x^*, x(t + r) - x(t) \rangle_X \leq \|x^*\|_{X^*} \big( \|x(t + r)\|_X - \|x(t)\|_X \big). \]
Since by hypothesis $x$ is weakly differentiable at $t$, dividing by $r > 0$ and letting $r \searrow 0$, we obtain
\[ \langle x^*, x'(t) \rangle_X \leq \|x(t)\|_X \Big[ \frac{d}{ds}\|x(s)\|_X \Big]_{s = t}. \tag{3.63} \]
On the other hand, since
\[ \langle x^*, x(t) - x(t - r) \rangle_X \geq \|x^*\|_{X^*} \big( \|x(t)\|_X - \|x(t - r)\|_X \big), \]
arguing as above, we obtain
\[ \|x(t)\|_X \Big[ \frac{d}{ds}\|x(s)\|_X \Big]_{s = t} \leq \langle x^*, x'(t) \rangle_X. \tag{3.64} \]
From (3.63) and (3.64), we conclude the desired equality.

The second auxiliary result is a Gronwall-type lemma which is used frequently in the study of evolution equations.

LEMMA 3.3.27 If $\varphi \in L^1(T)$, $\varphi(t) \geq 0$ for almost all $t \in T$, $\eta \in \mathbb{R}$, $u \in C(T)$ and
\[ \frac{1}{2}u(t)^2 \leq \frac{1}{2}\eta^2 + \int_0^t \varphi(s)u(s) \, ds \quad \forall\, t \in T, \]
then
\[ |u(t)| \leq |\eta| + \int_0^t \varphi(s) \, ds \quad \forall\, t \in T. \]

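PROOF Let
\[ \xi_\varepsilon(t) = \frac{1}{2}(\eta + \varepsilon)^2 + \int_0^t \varphi(s)u(s) \, ds, \]
with $\varepsilon > 0$ and $t \in T$. Then
\[ \xi_\varepsilon'(t) = \varphi(t)u(t) \quad \text{for a.a. } t \in T. \]
Moreover,
\[ \frac{1}{2}u(t)^2 \leq \xi_0(t) \leq \xi_\varepsilon(t) \quad \forall\, \varepsilon > 0,\ t \in T, \]
so
\[ \xi_\varepsilon'(t) \leq \varphi(t)\sqrt{2}\sqrt{\xi_\varepsilon(t)} \quad \forall\, \varepsilon > 0,\ t \in T. \]
Because $t \mapsto \xi_\varepsilon(t)$ is absolutely continuous with values in $\mathbb{R}_+$, we have
\[ \Big( \sqrt{\xi_\varepsilon(t)} \Big)' = \frac{1}{2\sqrt{\xi_\varepsilon(t)}}\, \xi_\varepsilon'(t) \quad \text{for a.a. } t \in T, \]
thus
\[ \Big( \sqrt{\xi_\varepsilon(t)} \Big)' \leq \frac{1}{\sqrt{2}}\varphi(t) \quad \text{for a.a. } t \in T \]
and so
\[ \sqrt{\xi_\varepsilon(t)} \leq \sqrt{\xi_\varepsilon(0)} + \frac{1}{\sqrt{2}} \int_0^t \varphi(s) \, ds \quad \forall\, t \in T. \]
Therefore, it follows that
\[ |u(t)| \leq \sqrt{2}\sqrt{\xi_\varepsilon(t)} \leq \sqrt{2\xi_\varepsilon(0)} + \int_0^t \varphi(s) \, ds = |\eta + \varepsilon| + \int_0^t \varphi(s) \, ds. \]
Let $\varepsilon \searrow 0$, to conclude that
\[ |u(t)| \leq |\eta| + \int_0^t \varphi(s) \, ds \quad \forall\, t \in T. \]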
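Lemma 3.3.27 can be sanity-checked on a grid. The sketch below uses assumed sample data that are not from the text: $\varphi(s) = 1 + s$, $\eta = 1$ and $u(t) = 1 + \frac{1}{2}\big(t + \frac{t^2}{2}\big)$ on $T = [0, 2]$, so that $u' = \frac{1}{2}\varphi$ and the hypothesis of the lemma holds. Integrals are approximated by the trapezoidal rule, and the function names are hypothetical.

```python
def cumulative_trapezoid(values, h):
    """Running trapezoidal integral of equally spaced samples, starting at 0."""
    out = [0.0]
    for a, b in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * h * (a + b))
    return out


def check_gronwall(phi, u, eta, b, n=1000, tol=1e-9):
    """Return (hypothesis_holds, conclusion_holds) for Lemma 3.3.27 on a grid of [0, b]."""
    h = b / n
    ts = [i * h for i in range(n + 1)]
    phi_vals = [phi(t) for t in ts]
    u_vals = [u(t) for t in ts]
    int_phi_u = cumulative_trapezoid([p * v for p, v in zip(phi_vals, u_vals)], h)
    int_phi = cumulative_trapezoid(phi_vals, h)
    # Hypothesis: u(t)^2 / 2 <= eta^2 / 2 + int_0^t phi(s) u(s) ds.
    hypothesis = all(
        0.5 * v * v <= 0.5 * eta * eta + i + tol for v, i in zip(u_vals, int_phi_u)
    )
    # Conclusion: |u(t)| <= |eta| + int_0^t phi(s) ds.
    conclusion = all(
        abs(v) <= abs(eta) + i + tol for v, i in zip(u_vals, int_phi)
    )
    return hypothesis, conclusion


print(check_gronwall(lambda s: 1.0 + s, lambda t: 1.0 + 0.5 * (t + 0.5 * t * t), 1.0, 2.0))
```

The choice $u' = \varphi$ (instead of $\frac{1}{2}\varphi$) with $u(0) = \eta > 0$ turns both inequalities into equalities, which shows that the bound in the lemma cannot be improved in general.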

Using these results, we can prove the first existence theorem for evolution equations driven by m-accretive operators.


THEOREM 3.3.28 If $X$ is a Banach space with $X^*$ being uniformly convex (hence $X$ is reflexive), $A\colon X \supseteq D(A) \to 2^X$ is an m-accretive operator, $x_0 \in D(A)$, $f \in W^{1,1}\big((0, b); X\big)$ and $\omega \in \mathbb{R}$, then we can find a unique $x \in W^{1,\infty}\big((0, b); X\big)$, such that
\[ \begin{cases} x'(t) + A\big(x(t)\big) \ni \omega x(t) + f(t) & \text{for a.a. } t \in T = [0, b], \\ x(0) = x_0. \end{cases} \]

PROOF First we show that the solution, if it exists, is unique. So suppose that $x_1, x_2 \in W^{1,\infty}\big((0, b); X\big)$ are two solutions of the evolution Cauchy problem. We have
\[ \big(x_1(t) - x_2(t)\big)' + A\big(x_1(t)\big) - A\big(x_2(t)\big) \ni \omega\big(x_1(t) - x_2(t)\big) \quad \text{for a.a. } t \in T. \]
Let
\[ \varphi(t) = \|x_1(t) - x_2(t)\|_X. \]
Since $x_1, x_2 \in W^{1,\infty}\big((0, b); X\big)$, they are Lipschitz continuous functions (see Theorem 2.2.24) and so they are differentiable almost everywhere on $T$ (see Theorem 2.2.17). Moreover, $\varphi$ is a Lipschitz continuous function too, thus differentiable almost everywhere on $T$. So we can use Lemma 3.3.26 and obtain
\[ \|x_1(t) - x_2(t)\|_X \frac{d}{dt}\|x_1(t) - x_2(t)\|_X = \big\langle \mathcal{F}\big(x_1(t) - x_2(t)\big), x_1'(t) - x_2'(t) \big\rangle_X \quad \text{for a.a. } t \in T. \]
Since $A$ is m-accretive, we also have
\[ \|x_1(t) - x_2(t)\|_X \frac{d}{dt}\|x_1(t) - x_2(t)\|_X \leq \omega\|x_1(t) - x_2(t)\|_X^2 \quad \text{for a.a. } t \in T, \]
so
\[ \frac{d}{dt}\|x_1(t) - x_2(t)\|_X \leq \omega\|x_1(t) - x_2(t)\|_X \quad \text{for a.a. } t \in T. \]
Because $x_1(0) - x_2(0) = 0$, by Gronwall's inequality (the differential form), we obtain
\[ \|x_1(t) - x_2(t)\|_X = 0 \quad \forall\, t \in T. \]
Therefore $x_1 = x_2$.

To establish the existence of a solution, first we consider the following approximate evolution equation:
\[ \begin{cases} x'(t) + A_\lambda\big(x(t)\big) = \omega x(t) + f(t) & \text{for a.a. } t \in T = [0, b], \\ x(0) = x_0, \end{cases} \tag{3.65} \]

3. Nonlinear Operators and Young Measures


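with $\lambda > 0$. Because $A_\lambda$ is a Lipschitz continuous operator (see Proposition 3.3.12(b)), problem (3.65) has a unique solution $x_\lambda \in C^1(T;X)$. Using Lemma 3.3.26, we see that for all $\lambda, \mu > 0$, we have
\[ \frac{1}{2}\frac{d}{dt}\|x_\lambda(t) - x_\mu(t)\|_X^2 + \big\langle \mathcal{F}\big(x_\lambda(t) - x_\mu(t)\big), A_\lambda\big(x_\lambda(t)\big) - A_\mu\big(x_\mu(t)\big) \big\rangle_X = \omega \|x_\lambda(t) - x_\mu(t)\|_X^2 \quad \forall\, t \in T, \]
so, since $x_\lambda(0) = x_\mu(0) = x_0$, by Gronwall's inequality (see Theorem A.4.7), we have
\[ \|x_\lambda(t) - x_\mu(t)\|_X^2 \le -2 \int_0^t e^{2\omega(t-s)} \big\langle \mathcal{F}\big(x_\lambda(s) - x_\mu(s)\big), A_\lambda\big(x_\lambda(s)\big) - A_\mu\big(x_\mu(s)\big) \big\rangle_X \, ds \quad \forall\, t \in T. \]
Exploiting the accretivity of $A$ and the fact that
\[ A_\alpha\big(x_\alpha(t)\big) \in A\big(J_\alpha\big(x_\alpha(t)\big)\big) \quad \forall\, t \in T \]
(see Proposition 3.3.12(c)), we can write that
\[ \|x_\lambda(t) - x_\mu(t)\|_X^2 \le -2 \int_0^t e^{2\omega(t-s)} \big\langle \mathcal{F}\big(x_\lambda(s) - x_\mu(s)\big) - \mathcal{F}\big(J_\lambda\big(x_\lambda(s)\big) - J_\mu\big(x_\mu(s)\big)\big), A_\lambda\big(x_\lambda(s)\big) - A_\mu\big(x_\mu(s)\big) \big\rangle_X \, ds \quad \forall\, t \in T, \]
so
\[ \|x_\lambda(t) - x_\mu(t)\|_X^2 \le 2 \int_0^t e^{2\omega(t-s)} \big\| \mathcal{F}\big(x_\lambda(s) - x_\mu(s)\big) - \mathcal{F}\big(J_\lambda\big(x_\lambda(s)\big) - J_\mu\big(x_\mu(s)\big)\big) \big\|_{X^*} \, \big\| A_\lambda\big(x_\lambda(s)\big) - A_\mu\big(x_\mu(s)\big) \big\|_X \, ds \quad \forall\, t \in T. \tag{3.66} \]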

We claim that
\[ \|x_\lambda'(t)\|_X \le |A(x_0)| + \omega \|x_0\|_X + \|f(0)\|_X + e^{2\omega b} \|f'\|_1. \tag{3.67} \]
Indeed, from (3.65), for every $r \in (0,b)$, we have
\[ \big(x_\lambda(t+r) - x_\lambda(t)\big)' + A_\lambda\big(x_\lambda(t+r)\big) - A_\lambda\big(x_\lambda(t)\big) = \omega\big(x_\lambda(t+r) - x_\lambda(t)\big) + f(t+r) - f(t) \quad \text{for a.a. } t \in T_r = [0, b - r]. \]


Exploiting the accretivity of $A_\lambda$ and Lemma 3.3.26, we obtain
\[ \frac{1}{2}\frac{d}{dt}\|x_\lambda(t+r) - x_\lambda(t)\|_X^2 \le \omega \|x_\lambda(t+r) - x_\lambda(t)\|_X^2 + \big\langle \mathcal{F}\big(x_\lambda(t+r) - x_\lambda(t)\big), f(t+r) - f(t) \big\rangle_X \quad \text{for a.a. } t \in T_r, \]
so
\[ \|x_\lambda(t+r) - x_\lambda(t)\|_X^2 \le \|x_\lambda(r) - x_\lambda(0)\|_X^2 + 2 \int_0^t e^{2\omega(t-s)} \|x_\lambda(s+r) - x_\lambda(s)\|_X \, \|f(s+r) - f(s)\|_X \, ds. \]
Thus by Lemma 3.3.27, we obtain
\[ \|x_\lambda(t+r) - x_\lambda(t)\|_X \le \|x_\lambda(r) - x_\lambda(0)\|_X + \int_0^t e^{2\omega(t-s)} \|f(s+r) - f(s)\|_X \, ds. \]

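Dividing by $r > 0$ and letting $r \searrow 0$, we obtain
\[ \|x_\lambda'(t)\|_X \le \|\omega x_0 + f(0) - A_\lambda(x_0)\|_X + \int_0^b e^{2\omega(t-s)} \|f'(s)\|_X \, ds \le |A(x_0)| + \omega \|x_0\|_X + \|f(0)\|_X + e^{2\omega b} \|f'\|_1. \]
This proves (3.67), from which we infer that there exists $M_1 > 0$, such that
\[ \|x_\lambda'(t)\|_X \le M_1 \quad \forall\, \lambda > 0,\ t \in T. \tag{3.68} \]
Since
\[ x_\lambda(t) = x_0 + \int_0^t x_\lambda'(s)\, ds \quad \forall\, \lambda > 0,\ t \in T, \]
it follows that there exists $M_2 > 0$, such that
\[ \|x_\lambda(t)\|_X \le M_2 \quad \forall\, \lambda > 0,\ t \in T. \tag{3.69} \]
Returning to (3.65) and using (3.68) and (3.69), we obtain $M_3 > 0$, such that
\[ \big\|A_\lambda\big(x_\lambda(t)\big)\big\|_X \le M_3 \quad \forall\, \lambda > 0,\ t \in T, \tag{3.70} \]
so
\[ \big\|x_\lambda(t) - J_\lambda\big(x_\lambda(t)\big)\big\|_X = \lambda \big\|A_\lambda\big(x_\lambda(t)\big)\big\|_X \le \lambda M_3 \quad \forall\, \lambda > 0,\ t \in T. \tag{3.71} \]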

From (3.71), it follows that
\[ \big\|x_\lambda(t) - J_\lambda\big(x_\lambda(t)\big)\big\|_X \to 0 \quad \text{as } \lambda \searrow 0, \text{ uniformly on } T. \]

Because $\mathcal{F}$ is uniformly continuous on bounded sets, from (3.66) and (3.70), we see that for a given $\varepsilon > 0$, for all $\lambda, \mu > 0$ small enough, we have
\[ \|x_\lambda(t) - x_\mu(t)\|_X^2 \le M_4 \varepsilon, \]
for some $M_4 > 0$, so $x_\lambda \to x$ in $C(T;X)$ as $\lambda \searrow 0$. Moreover, from (3.67) we infer that $x$ is a Lipschitz continuous function, i.e., $x \in W^{1,\infty}\big((0,b);X\big)$. We claim that this is the solution of the evolution equation. To this end let $(y,z) \in \operatorname{Gr} A$ and let us set
\[ y_\lambda := y + \lambda z \quad \forall\, \lambda > 0. \]
Hence $z = A_\lambda(y_\lambda)$. Using (3.65) and the accretivity of $A_\lambda$, we obtain
\[ \|x_\lambda(t) - y_\lambda\|_X^2 \le \|x_\lambda(t_0) - y_\lambda\|_X^2 + 2 \int_{t_0}^t \big\langle \mathcal{F}\big(x_\lambda(s) - y_\lambda\big), \omega x_\lambda(s) + f(s) - z \big\rangle_X \, ds \quad \forall\, t, t_0 \in T. \]

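Letting $\lambda \searrow 0$, we get
\[ \|x(t) - y\|_X^2 - \|x(t_0) - y\|_X^2 \le 2 \int_{t_0}^t \big\langle \mathcal{F}\big(x(s) - y\big), \omega x(s) + f(s) - z \big\rangle_X \, ds. \tag{3.72} \]
Note that for any $z, h \in X$, we have
\[ \big\langle \mathcal{F}(h), z - h \big\rangle_X \le \|h\|_X \|z\|_X - \|h\|_X^2 \le \frac{1}{2}\big(\|z\|_X^2 - \|h\|_X^2\big). \]
Using this in (3.72) (with $h = x(t_0) - y$ and $x(t) - y$ in place of $z$), we have
\[ \Big\langle \mathcal{F}\big(x(t_0) - y\big), \frac{x(t) - x(t_0)}{t - t_0} \Big\rangle_X \le \frac{1}{t - t_0} \int_{t_0}^t \big\langle \mathcal{F}\big(x(s) - y\big), \omega x(s) + f(s) - z \big\rangle_X \, ds. \tag{3.73} \]
Let $t_0 \in T$ be a point of differentiability of $x$. Passing to the limit as $t \to t_0$ in (3.73), we obtain
\[ \big\langle \mathcal{F}\big(x(t_0) - y\big), x'(t_0) \big\rangle_X = \big\langle \mathcal{F}\big(x(t_0) - y\big), \omega x(t_0) + f(t_0) - z \big\rangle_X, \]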


so
\[ \big\langle \mathcal{F}\big(x(t_0) - y\big), u_0 - z \big\rangle_X \ge 0, \tag{3.74} \]
with
\[ u_0 := -x'(t_0) + \omega x(t_0) + f(t_0) \]
(see (3.65)). Since $A$ is m-accretive, hence maximal accretive, and $(y,z) \in \operatorname{Gr} A$ was arbitrary, from (3.74), we conclude that
\[ -x'(t_0) + \omega x(t_0) + f(t_0) \in A\big(x(t_0)\big). \]
Because $x$ is almost everywhere differentiable (recall that $x \in W^{1,\infty}\big((0,b);X\big)$), we conclude that $x$ solves the evolution Cauchy problem.

Next we want to consider evolution equations with less regular data. This can be done with remarkable success using the theory of semigroups of operators. In what follows we present some basic aspects of this theory. We start with the linear theory and then pass to the nonlinear one.

Let us motivate the definition of a semigroup of bounded linear operators (linear semigroup). So let $X$ be a Banach space and $A \in \mathcal{L}(X)$. We consider the following Cauchy problem
\[ \begin{cases} x'(t) = Ax(t) & \forall\, t \ge 0, \\ x(0) = x_0 \in X. \end{cases} \tag{3.75} \]
It is easy to check that the function $x(t) := e^{tA} x_0$ for $t \ge 0$, $x \in C^1(\mathbb{R}_+;X)$, is the unique solution of (3.75). Let us mention the basic properties of this solution. First, for every fixed $t \ge 0$, the map $x_0 \mapsto x(t)$ is linear. Moreover, since
\[ \|x(t)\|_X \le e^{t\|A\|_{\mathcal{L}}} \|x_0\|_X, \]
it is also bounded. Second, as $t \searrow 0$, we have that $x(t) \to x_0$ in $X$ and $x(0) = x_0$. Third, by virtue of the uniqueness of the solution of (3.75), if we start with initial condition $x(t_0)$, $t_0 \ge 0$, and move for time $t \ge 0$, we must reach the state $x(t + t_0)$ (recall that $e^{(t+t_0)A} = e^{tA} e^{t_0 A}$). Generalizing these properties we obtain the notion of a $C_0$-semigroup of linear operators.

DEFINITION 3.3.29 Let $X$ be a Banach space and $\{S(t)\}_{t \ge 0} \subseteq \mathcal{L}(X)$. We call $S$ a $C_0$-semigroup on $X$ if the following conditions hold:
(a) $S(0) = \mathrm{id}_X$;
(b) $S(t+s) = S(t)S(s)$ for all $s, t \ge 0$;
(c) $\lim_{t \searrow 0} \|S(t)x - x\|_X = 0$ for all $x \in X$.
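As an illustration of conditions (a), (b) and (c), the following minimal sketch checks them numerically for $S(t) = e^{tA}$ on $X = \mathbb{R}^2$. The particular matrix `A` (a rotation generator), the truncated power series in `mat_exp` and all helper names are assumptions made for this example only; they are not taken from the text.

```python
import math

def mat_mul(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(a, t, terms=40):
    """Truncated power series sum_{k<terms} (t a)^k / k! for a 2x2 matrix a."""
    total = [[1.0, 0.0], [0.0, 1.0]]   # partial sum, starts at the identity
    term = [[1.0, 0.0], [0.0, 1.0]]    # current term (t a)^k / k!
    for k in range(1, terms):
        step = [[t * a[i][j] / k for j in range(2)] for i in range(2)]
        term = mat_mul(term, step)
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

def max_gap(p, q):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))

# Hypothetical bounded generator: exp(tA) is the rotation by angle t.
A = [[0.0, 1.0], [-1.0, 0.0]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def S(t):
    return mat_exp(A, t)

identity_gap = max_gap(S(0.0), IDENTITY)                        # condition (a)
semigroup_gap = max_gap(S(0.7 + 0.5), mat_mul(S(0.7), S(0.5)))  # condition (b)
continuity_gaps = [max_gap(S(t), IDENTITY) for t in (1e-1, 1e-2, 1e-3)]  # (c)
rotation_gap = abs(S(1.2)[0][0] - math.cos(1.2))                # sanity check
```

In finite dimensions every semigroup of this kind is even continuous in the operator norm; the distinction between strong and norm continuity only matters for unbounded generators.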


REMARK 3.3.30 Property (b) is the semigroup property, while property (c) implies that the function $t \mapsto S(t)$ is continuous from $\mathbb{R}_+$ into $\mathcal{L}(X)$ furnished with the strong operator topology. If $A \in \mathcal{L}(X)$, then $\{S(t) = e^{tA}\}_{t \ge 0}$ is a $C_0$-semigroup. Also if $X = C_b(\mathbb{R})$ (the space of bounded continuous functions $f : \mathbb{R} \to \mathbb{R}$ equipped with the supremum norm) and
\[ S(t)f(\cdot) = f(t + \cdot) \quad \forall\, f \in C_b(\mathbb{R}), \]
then $\{S(t)\}_{t \ge 0}$ is a $C_0$-semigroup.

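PROPOSITION 3.3.31 If $X$ is a Banach space and $\{S(t)\}_{t \ge 0}$ is a $C_0$-semigroup on $X$, then there exist $M \ge 1$ and $\omega \ge 0$, such that
\[ \|S(t)\|_{\mathcal{L}} \le M e^{\omega t} \quad \forall\, t \ge 0. \]

PROOF By virtue of property (c) in Definition 3.3.29, we can find $M \ge 1$ and $\delta > 0$, such that
\[ \|S(t)\|_{\mathcal{L}} \le M \quad \forall\, t \in [0, \delta]. \]
Let
\[ \omega := \frac{\ln M}{\delta} \ge 0. \]
For a given $t \ge 0$, we can find an integer $n \ge 0$ and $\vartheta \in [0, \delta)$, such that $t = n\delta + \vartheta$. Because of the semigroup property, we have $S(t) = S(\delta)^n S(\vartheta)$, so
\[ \|S(t)\|_{\mathcal{L}} \le \|S(\delta)\|_{\mathcal{L}}^n \|S(\vartheta)\|_{\mathcal{L}} \le M^n M \le M e^{\omega t}, \]
since
\[ \ln M^n = n \ln M = n\omega\delta \le \omega t. \]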


Using this bound, we can improve (c) in Definition 3.3.29.

COROLLARY 3.3.32 If $X$ is a Banach space and $\{S(t)\}_{t \ge 0}$ is a $C_0$-semigroup on $X$, then for all $x \in X$, the map $t \mapsto S(t)x$ is continuous from $\mathbb{R}_+$ into $X$.

PROOF Let $r > 0$. Then using the semigroup property and Proposition 3.3.31, we obtain
\[ \|S(t+r)x - S(t)x\|_X = \big\|S(t)\big(S(r)x - x\big)\big\|_X \le \|S(t)\|_{\mathcal{L}} \|S(r)x - x\|_X \le M e^{\omega t} \|S(r)x - x\|_X \to 0 \quad \text{as } r \searrow 0. \]
So the function $t \mapsto S(t)x$ is continuous on $\mathbb{R}_+$.

DEFINITION 3.3.33 Let $X$ be a Banach space and let $\{S(t)\}_{t \ge 0}$ be a $C_0$-semigroup on $X$. From Proposition 3.3.31, we know that
\[ \|S(t)\|_{\mathcal{L}} \le M e^{\omega t} \quad \forall\, t \ge 0 \]
for some $M \ge 1$ and $\omega \ge 0$. If $M = 1$ and $\omega = 0$, i.e.,
\[ \|S(t)\|_{\mathcal{L}} \le 1 \quad \forall\, t \ge 0, \]
then we say that we have a contraction semigroup.

The following notion is central in the theory of linear semigroups and is the starting point for determining those operators which generate contraction semigroups (see the Hille-Yosida Theorem 3.3.46).

DEFINITION 3.3.34 Let $X$ be a Banach space and let $\{S(t)\}_{t \ge 0}$ be a $C_0$-semigroup on $X$. We introduce the generator (or infinitesimal generator) of the semigroup $S$ as the linear operator $A : X \supseteq D(A) \to X$, defined by
\[ Ax := \lim_{t \searrow 0} \frac{S(t)x - x}{t} \quad \forall\, x \in D(A), \]
where
\[ D(A) := \Big\{ x \in X : \lim_{t \searrow 0} \frac{S(t)x - x}{t} \text{ exists} \Big\}. \]
In general the operator $A$ is not bounded.


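EXAMPLE 3.3.35 (a) Let $A \in \mathcal{L}(X)$ and $S(t) = e^{tA}$ for $t \ge 0$. This is a $C_0$-semigroup. Then for every $x \in X$, we have
\[ \frac{e^{tA}x - x}{t} = \sum_{k=1}^{\infty} \frac{t^{k-1}}{k!} A^k x = Ax + \sum_{k=2}^{\infty} \frac{t^{k-1}}{k!} A^k x \]
and
\[ \Big\| \sum_{k=2}^{\infty} \frac{t^{k-1}}{k!} A^k x \Big\|_X \le \sum_{k=2}^{\infty} \frac{t^{k-1}}{k!} \|A\|_{\mathcal{L}}^k \|x\|_X \le t \|A\|_{\mathcal{L}}^2 \|x\|_X \sum_{k=0}^{\infty} \frac{t^k}{k!} \|A\|_{\mathcal{L}}^k = t \|A\|_{\mathcal{L}}^2 \|x\|_X e^{t\|A\|_{\mathcal{L}}} \to 0 \quad \text{as } t \searrow 0. \]
Therefore the generator of $S$ is $A$.

(b) Let $X = C_b(\mathbb{R})$ and $S(t)f(\cdot) := f(t + \cdot)$ (see Remark 3.3.30). We have
\[ \lim_{t \searrow 0} \frac{(S(t)f - f)(s)}{t} = D^+ f(s) \quad \text{if it exists.} \]
So if $f \in D(A)$, then $D^+ f$ exists at all $s \in \mathbb{R}$ and it is bounded and uniformly continuous. Also we have
\[ \frac{f(s) - f(s-t)}{t} = \frac{f(s - t + t) - f(s-t)}{t} = D^+ f(s-t) + \frac{o(t)}{t} \to D^+ f(s) \quad \text{as } t \searrow 0 \]
(since $D^+ f$ is continuous). Therefore if $f \in D(A)$, then $D^+ f = D^- f$, i.e., $f'(t)$ exists at all $t \in \mathbb{R}$ and $f' \in C_b(\mathbb{R})$. So
\[ D(A) = \big\{ f \in C_b(\mathbb{R}) : f' \text{ exists everywhere and } f' \in C_b(\mathbb{R}) \big\} \]
and $Af = f'$ for all $f \in D(A)$.

More generally, if $X = H = L^2(0,b)$ and
\[ S(t)x(s) := \begin{cases} x(t+s) & \text{if } t + s \in (0,b), \\ 0 & \text{if } t + s \notin (0,b), \end{cases} \]
then the generator of $S$ is the operator $A : H \supseteq D(A) \to H$, defined by
\[ Ax(t) = \frac{d}{dt} x(t), \quad \text{with } D(A) = \big\{ x \in W^{1,2}(0,b) : x(b) = 0 \big\}. \]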
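The difference quotient in part (b) can be observed numerically. The sketch below, in which the test function $f = \sin$, the point `s0` and the helper names are assumptions chosen only for illustration, evaluates $\big((S(t)f - f)(s)\big)/t$ for the translation semigroup and compares it with $f'(s)$ as $t$ decreases.

```python
import math

def translate(f, t):
    """Left translation (S(t)f)(s) = f(t + s)."""
    return lambda s: f(t + s)

def difference_quotient(f, t, s):
    """((S(t)f - f)(s)) / t, the quotient whose limit defines (Af)(s)."""
    return (translate(f, t)(s) - f(s)) / t

f = math.sin   # hypothetical smooth bounded test function, so Af = cos
s0 = 0.3       # hypothetical evaluation point
errors = [abs(difference_quotient(f, t, s0) - math.cos(s0))
          for t in (1e-1, 1e-2, 1e-3)]
```

For this smooth $f$ the error is of order $t$; this is a pointwise check only and does not by itself establish convergence in the supremum norm.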


In the next proposition, we summarize the differential properties of $C_0$-semigroups.

PROPOSITION 3.3.36 If $X$ is a Banach space and $\{S(t)\}_{t \ge 0}$ is a $C_0$-semigroup on $X$ with generator $A$, then for all $x \in D(A)$ and all $t \ge 0$, we have
(a) $S(t)x \in D(A)$;
(b) $\frac{d}{dt} S(t)x = AS(t)x = S(t)Ax$ for $t \ge 0$;
(c) $S(t)x - x = \int_0^t S(r)Ax \, dr$;
(d) $D(A)$ is dense in $X$ and the operator $A$ is closed (i.e., $\operatorname{Gr} A \subseteq X \times X$ is closed).

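(a) For $r > 0$ we have
\[ \frac{S(t+r)x - S(t)x}{r} = S(t)\,\frac{S(r)x - x}{r} \to S(t)Ax \quad \text{as } r \searrow 0. \]
Hence $S(t)x \in D(A)$.

(b) From (a), we have
\[ \frac{d^+}{dt} S(t)x = S(t)Ax. \]
Also note that
\[ \frac{S(t+r)x - S(t)x}{r} = S(t)\,\frac{S(r)x - x}{r} = \Big( \frac{S(r) - \mathrm{id}_X}{r} \Big) S(t)x \to AS(t)x \quad \text{as } r \searrow 0, \]
so
\[ \frac{d^+}{dt} S(t)x = S(t)Ax = AS(t)x. \]
On the other hand, for $t > r > 0$, we have
\[ \frac{S(t)x - S(t-r)x}{r} = S(t-r)\,\frac{S(r)x - x}{r} \to S(t)Ax \quad \text{as } r \searrow 0, \]
so
\[ \frac{d^-}{dt} S(t)x = S(t)Ax. \]
Also since
\[ S(t-r)\,\frac{S(r)x - x}{r} = \Big( \frac{S(r) - \mathrm{id}_X}{r} \Big) S(t-r)x, \]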


we have
\[ \frac{d^-}{dt} S(t)x = S(t)Ax = AS(t)x. \]
So finally, we conclude that
\[ \frac{d}{dt} S(t)x = S(t)Ax = AS(t)x. \]

(c) By part (b), the function $t \mapsto S(t)x$ is continuously differentiable. So for all $x^* \in X^*$, we have
\[ \langle x^*, S(t)x - x \rangle_X = \int_0^t \frac{d}{dr} \langle x^*, S(r)x \rangle_X \, dr = \int_0^t \Big\langle x^*, \frac{d}{dr} S(r)x \Big\rangle_X dr = \Big\langle x^*, \int_0^t S(r)Ax \, dr \Big\rangle_X, \]
hence
\[ S(t)x - x = \int_0^t S(r)Ax \, dr \quad \forall\, t \ge 0. \]

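(d) For $t > r > 0$ and $x \in X$, we have
\[ \Big( \frac{S(r) - \mathrm{id}_X}{r} \Big) \int_0^t S(\tau)x \, d\tau = \frac{1}{r} \int_0^t \big( S(\tau + r)x - S(\tau)x \big) \, d\tau = \frac{1}{r} \Big[ \int_r^{t+r} S(\tau)x \, d\tau - \int_0^t S(\tau)x \, d\tau \Big] \]
\[ = \frac{1}{r} \Big[ \int_t^{t+r} S(\tau)x \, d\tau - \int_0^r S(\tau)x \, d\tau \Big] \to S(t)x - x \quad \text{as } r \searrow 0, \]
so
\[ \int_0^t S(\tau)x \, d\tau \in D(A). \]
But note that
\[ \lim_{t \searrow 0} \frac{1}{t} \int_0^t S(\tau)x \, d\tau = x \]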


and since $x \in X$ was arbitrary, we conclude that $D(A)$ is dense in $X$.

Next let $\{x_n\}_{n \ge 1} \subseteq D(A)$ and assume that
\[ x_n \to x \quad \text{and} \quad A(x_n) \to y \quad \text{in } X. \]
For every $r \ge 0$, we have
\[ \|S(r)Ax_n - S(r)y\|_X \le M e^{\omega r} \|Ax_n - y\|_X \]
(see Proposition 3.3.31) and thus
\[ S(\cdot)Ax_n \to S(\cdot)y \quad \text{in } X \text{ uniformly on } [0,t],\ t > 0. \]
From (c) we know that
\[ S(t)x_n - x_n = \int_0^t S(r)Ax_n \, dr. \]
Passing to the limit as $n \to +\infty$, we obtain
\[ S(t)x - x = \int_0^t S(r)y \, dr. \]
Hence
\[ \lim_{t \searrow 0} \frac{S(t)x - x}{t} = y, \]
which implies that $x \in D(A)$ and $y = Ax$, i.e., $A$ is closed.

REMARK 3.3.37 Using part (b) of Proposition 3.3.36 and induction, we can show that for all $n \ge 1$, all $x \in D(A^n)$ and all $t \ge 0$, we have
\[ \frac{d^n}{dt^n} S(t)x = S(t)A^n x = A^n S(t)x. \]
Moreover, it can be shown that the set $\bigcap_{n=1}^{\infty} D(A^n)$ is dense in $X$. For details we refer to Pazy (1983, p. 6). Also because of parts (b) and (d) of Proposition 3.3.36 and using Theorem 2.1.17, we can rewrite part (c) as follows:
(c)' for all $t \ge 0$ and all $x \in X$, we have
\[ S(t)x - x = A \int_0^t S(\tau)x \, d\tau. \]


COROLLARY 3.3.38 An operator $A : X \supseteq D(A) \to X$ can be the generator of at most one $C_0$-semigroup.

PROOF Suppose that $S_1$ and $S_2$ are two $C_0$-semigroups with generator $A$. Let $x \in D(A)$ and $t > 0$ and define the function
\[ u(s) := S_1(t-s)S_2(s)x \quad \forall\, s \in (0,t). \]
From Proposition 3.3.36(b), we know that $u \in C^1\big((0,t);X\big)$ and
\[ u'(s) = -AS_1(t-s)S_2(s)x + S_1(t-s)AS_2(s)x = -S_1(t-s)AS_2(s)x + S_1(t-s)AS_2(s)x = 0, \]
so
\[ S_1(t)x - S_2(t)x = u(0) - u(t) = -\int_0^t u'(s)\, ds = 0 \quad \forall\, x \in D(A). \tag{3.76} \]
Because $D(A)$ is dense in $X$ (see Proposition 3.3.36(d)), from (3.76), it follows that
\[ S_1(t) = S_2(t) \quad \forall\, t \ge 0. \]

If $\{S(t)\}_{t \ge 0}$ is a $C_0$-semigroup on a Banach space $X$, then $\{S(t)^*\}_{t \ge 0}$ still has the semigroup property but need not be a $C_0$-semigroup. In fact in general we can only show that for every $x^* \in X^*$, $t \mapsto S(t)^* x^*$ is weakly$^*$-continuous at $t = 0$, i.e.,
\[ w^*\text{-}\lim_{t \searrow 0} S(t)^* x^* = x^*. \]

So the map $S(t) \mapsto S(t)^*$ does not preserve the strong continuity at $t = 0$.

EXAMPLE 3.3.39 Let $X = C_0(\mathbb{R})$ (see Section 2.3) and let $\{S(t)\}_{t \ge 0}$ be the $C_0$-semigroup of left translations, i.e.,
\[ \big(S(t)f\big)(s) := f(t+s) \quad \forall\, t \ge 0,\ s \in \mathbb{R},\ f \in C_0(\mathbb{R}). \]
We know that $X^* = NBV(\mathbb{R})$, the space of all normalized functions of bounded variation with the total variation norm
\[ \|\vartheta\|_{TV(\mathbb{R})} = (\operatorname{Var} \vartheta)(\mathbb{R}) = \int_{\mathbb{R}} |d\vartheta(t)|. \]
By saying that $\vartheta$ is normalized, we mean that
\[ \vartheta(s) = \frac{\vartheta(s^+) + \vartheta(s^-)}{2} \quad \forall\, s \in \mathbb{R} \]
and
\[ \vartheta(-\infty) = \lim_{s \to -\infty} \vartheta(s) = 0 \quad \text{and} \quad \vartheta(+\infty) = \lim_{s \to +\infty} \vartheta(s) = 0 \]
(see also Theorem 2.3.41). For all $f \in C_0(\mathbb{R})$ and all $\vartheta \in NBV(\mathbb{R})$, we have
\[ \langle \vartheta, S(t)f \rangle = \int_{\mathbb{R}} f(t+s)\, d\vartheta(s) = \int_{\mathbb{R}} f(s)\, d_s \vartheta(s-t) = \langle S(t)^* \vartheta, f \rangle, \]
so
\[ \big(S(t)^* \vartheta\big)(s) = \vartheta(s - t), \]
i.e., $\{S(t)^*\}_{t \ge 0}$ is the right translation of $\vartheta$. By a theorem of Plessner (1929), we know that
\[ \Big[ \|\vartheta(\cdot - t) - \vartheta(\cdot)\|_{TV(\mathbb{R})} \to 0 \text{ as } t \searrow 0 \Big] \iff \Big[ \vartheta \in AC(\mathbb{R}) \Big]. \]
So the function $t \mapsto S(t)^* \vartheta$ is not in general strongly continuous at $t = 0$ (unless of course $\vartheta \in AC(\mathbb{R})$).

Note that in the previous example $X = C_0(\mathbb{R})$ is not a reflexive Banach space. This is not an accident.

PROPOSITION 3.3.40 If $H$ is a Hilbert space, $H^* = H$ (i.e., $H$ is a pivot space) and $\{S(t)\}_{t \ge 0}$ is a $C_0$-semigroup on $H$, then $\{S(t)^*\}_{t \ge 0}$ is a $C_0$-semigroup on $H$.

PROOF Clearly $S(\cdot)^*$ satisfies the semigroup property. Therefore we need to show that for all $h \in H$, the map $t \mapsto S(t)^* h$ is strongly continuous at $t = 0$. For every $x, h \in H$, the function
\[ s \mapsto \big(S(s)^* h, x\big)_H = \big(h, S(s)x\big)_H \]
is continuous on $\mathbb{R}_+$. Therefore for all $h \in H$, the function $s \mapsto S(s)^* h$ is weakly continuous. So it follows from the uniform boundedness principle (see Theorem A.3.4) and the semigroup property that the function $s \mapsto S(s)^* h$ is bounded on any compact interval of $\mathbb{R}_+$. Also it is weakly measurable. Moreover, if $\{s_n\}_{n \ge 1}$ is an enumeration of the rationals in $\mathbb{R}_+$ and we consider
\[ L := \operatorname{span}_{\mathbb{Q}} \big\{ S(s_n)^* h \big\}_{n \ge 1} \]

(i.e., finite linear combinations of $\{S(s_n)^* h\}_{n \ge 1}$ with rational coefficients), then $L$ is countable and $H_0 := \overline{\operatorname{span}}\, L$ is separable. Since the function $s \mapsto S(s)^* h$ is weakly continuous, we see that $\{S(s)^* h\}_{s \ge 0} \subseteq H_0$. Therefore we have proved that the function $s \mapsto S(s)^* h$ is weakly measurable and separably valued; hence by Theorem 2.1.3, it is strongly measurable. We infer that $S(\cdot)^* h \in L^1_{\mathrm{loc}}(\mathbb{R}_+; H)$. If $t > 0$ and $\eta \in (0,t)$, we have
\[ \|S(t+r)^* h - S(t)^* h\|_H = \Big\| \frac{1}{\eta} \int_0^\eta \big( S(t+r)^* h - S(t)^* h \big)\, d\tau \Big\|_H = \Big\| \frac{1}{\eta} \int_0^\eta S(\tau)^* \big( S(t+r-\tau)^* h - S(t-\tau)^* h \big)\, d\tau \Big\|_H \]
\[ \le M_1 \frac{1}{\eta} \int_0^\eta \|S(t+r-\tau)^* h - S(t-\tau)^* h\|_H \, d\tau, \tag{3.77} \]
for some $M_1 > 0$, so by Corollary 2.3.8, we have
\[ \lim_{r \searrow 0} \|S(t+r)^* h - S(t)^* h\|_H \le \lim_{r \searrow 0} \frac{M_1}{\eta} \int_0^\eta \|S(t+r-\tau)^* h - S(t-\tau)^* h\|_H \, d\tau = 0. \]
Finally let $t_n \searrow 0$ and
\[ C := \operatorname{conv} \big\{ S(t_n)^* h \big\}_{n \ge 1}. \]
Because
\[ S(t_n)^* h \xrightarrow{\ w\ } h \quad \text{in } H, \]
we have that $h \in \overline{C}$. So if
\[ E := \operatorname{span} \big\{ S(t_n)^* h \big\}_{n \ge 1}, \]
then from (3.77), it follows that
\[ \lim_{n \to +\infty} S(t_n)^* h = h \quad \forall\, h \in E. \]
Because
\[ \sup_{n \ge 1} \|S(t_n)^*\|_{\mathcal{L}} \le M_2, \]
for some $M_2 > 0$, we conclude that
\[ \lim_{n \to +\infty} S(t_n)^* h = h \quad \forall\, h \in H. \tag{3.78} \]


REMARK 3.3.41 In fact the result is true if $H$ is replaced by a reflexive Banach space. This is a consequence of the fact that if $\{S(t)\}_{t \ge 0}$ is a semigroup of linear operators, such that for all $x \in X$, the function $t \mapsto S(t)x$ is strongly measurable (this is the case if for all $x \in X$, the function $t \mapsto S(t)x$ is weakly continuous), then $S$ is a $C_0$-semigroup. For details we refer to Hille & Phillips (1957, pp. 305–306).

Of great importance in applications are theorems which give necessary and sufficient conditions for an operator $A$ to be the infinitesimal generator of a $C_0$-semigroup. The basic result in this direction is the celebrated Hille-Yosida theorem. To state and prove this fundamental result we need some preparation.

DEFINITION 3.3.42 Let $X$ be a Banach space and let $A : X \supseteq D(A) \to X$ be a closed, linear operator.
(a) The resolvent set $\varrho(A)$ of $A$ is defined by
\[ \varrho(A) := \big\{ \lambda \in \mathbb{R} : \lambda\,\mathrm{id}_X - A : X \supseteq D(A) \to X \text{ is bijective} \big\}. \]
(b) If $\lambda \in \varrho(A)$, then the resolvent operator $R_\lambda : X \to X$ is defined by
\[ R_\lambda x := (\lambda\,\mathrm{id}_X - A)^{-1} x \quad \forall\, x \in X. \]

REMARK 3.3.43 It is easy to check that $R_\lambda$ is closed. So by the closed graph theorem (see Theorem A.3.7), we have that $R_\lambda \in \mathcal{L}(X)$. Moreover, we have $AR_\lambda x = R_\lambda Ax$ for all $x \in D(A)$ (see the proof of Proposition 3.3.44(b)).

PROPOSITION 3.3.44 If $X$ is a Banach space and $A : X \supseteq D(A) \to X$ is a closed, linear operator, then
(a) for $\lambda, \mu \in \varrho(A)$, we have $R_\lambda - R_\mu = (\mu - \lambda) R_\lambda R_\mu$ (resolvent identity) and $R_\lambda R_\mu = R_\mu R_\lambda$;
(b) if $A$ is the generator of a $C_0$-semigroup $S$ and
\[ \|S(t)\|_{\mathcal{L}} \le M e^{\omega t} \quad \forall\, t \ge 0, \]
then $\lambda \in \varrho(A)$ for all $\lambda > \omega$ and
\[ R_\lambda x = \int_0^{+\infty} e^{-\lambda t} S(t)x \, dt \quad \forall\, x \in X. \]

PROOF (a) Let $x \in D(A)$. We have
\[ (\lambda\,\mathrm{id}_X - A) \big[ R_\lambda - R_\mu \big] (\mu\,\mathrm{id}_X - A)(x) = (\mu\,\mathrm{id}_X - A)x - (\lambda\,\mathrm{id}_X - A)x = (\mu - \lambda)x, \]
so
\[ R_\lambda - R_\mu = (\mu - \lambda) R_\lambda R_\mu. \tag{3.79} \]

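The commutation of $R_\lambda$ and $R_\mu$ follows by interchanging $\lambda$ and $\mu$ in (3.79).

(b) Let us set
\[ \widehat{R}_\lambda x := \int_0^{+\infty} e^{-\lambda t} S(t)x \, dt \quad \forall\, \lambda > \omega,\ x \in X. \]
This operator is well defined, since
\[ \|e^{-\lambda t} S(t)x\|_X \le M e^{(\omega - \lambda)t} \|x\|_X \]
and the function $t \mapsto e^{-\lambda t} S(t)x$ is continuous, thus strongly measurable. We have
\[ \|\widehat{R}_\lambda\|_{\mathcal{L}} \le M \int_0^{+\infty} e^{-(\lambda - \omega)t} \, dt \le \frac{M}{\lambda - \omega}, \]
i.e., $\widehat{R}_\lambda \in \mathcal{L}(X)$. We show that $\widehat{R}_\lambda x \in D(A)$ and
\[ (\lambda\,\mathrm{id}_X - A) \widehat{R}_\lambda x = x \quad \forall\, x \in X. \]
We have
\[ \frac{S(s) - \mathrm{id}_X}{s}\, \widehat{R}_\lambda x = \frac{1}{s} \int_0^{+\infty} e^{-\lambda t} \big( S(t+s) - S(t) \big)x \, dt = \frac{e^{\lambda s} - 1}{s} \int_s^{+\infty} e^{-\lambda t} S(t)x \, dt - \frac{1}{s} \int_0^s e^{-\lambda t} S(t)x \, dt. \]
Passing to the limit as $s \searrow 0$, we obtain
\[ A \widehat{R}_\lambda x = \lambda \widehat{R}_\lambda x - x \quad \forall\, x \in X. \tag{3.80} \]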


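Using Proposition 3.3.36(b) and Theorem 2.1.17, for all $x \in D(A)$, we have
\[ \widehat{R}_\lambda Ax = \int_0^{+\infty} e^{-\lambda t} S(t)Ax \, dt = \int_0^{+\infty} e^{-\lambda t} AS(t)x \, dt = A \int_0^{+\infty} e^{-\lambda t} S(t)x \, dt = A \widehat{R}_\lambda x. \tag{3.81} \]
From (3.80) and (3.81), it follows that
\[ \widehat{R}_\lambda (\lambda\,\mathrm{id}_X - A)x = x \quad \forall\, x \in D(A) \qquad \text{and} \qquad (\lambda\,\mathrm{id}_X - A) \widehat{R}_\lambda x = x \quad \forall\, x \in X, \]
so
\[ \widehat{R}_\lambda = R_\lambda \quad \text{and} \quad \lambda \in \varrho(A). \]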
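The Laplace transform representation of the resolvent can be checked by hand in the scalar case $X = \mathbb{R}$, $A = a$, where $S(t) = e^{at}$ and $R_\lambda = 1/(\lambda - a)$. The following sketch does this numerically; the values of `a` and `lam`, the truncation of the integral at $t = 40$ and the Simpson quadrature are assumptions made only for this illustration.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

a = -0.5    # hypothetical scalar generator: S(t) = exp(a t), so M = 1, omega = 0
lam = 2.0   # any lam > omega is admissible

def integrand(t):
    return math.exp(-lam * t) * math.exp(a * t)   # e^{-lam t} S(t) x with x = 1

# The tail beyond t = 40 is of size exp(-100) and is ignored here.
laplace = simpson(integrand, 0.0, 40.0, 8000)
resolvent = 1.0 / (lam - a)                       # (lam - a)^{-1}
```

In this scalar setting the bound $\|\widehat{R}_\lambda\|_{\mathcal{L}} \le M/(\lambda - \omega)$ reads $0.4 \le 0.5$.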

REMARK 3.3.45 Because of Proposition 3.3.44(b), we see that the resolvent operator is the Laplace transform of the $C_0$-semigroup generated by $A$. Hence the function $\lambda \mapsto R_\lambda$ is analytic on $\varrho(A)$.

Now we are ready for the theorem characterizing the generators of $C_0$-semigroups.

THEOREM 3.3.46 (Hille-Yosida Theorem) If $X$ is a Banach space and $A : X \supseteq D(A) \to X$ is a closed, densely defined, linear operator, then $A$ is the generator of a $C_0$-semigroup if and only if there exist $M \ge 1$ and $\omega \in \mathbb{R}$, such that
\[ \|R_\lambda^k\|_{\mathcal{L}} \le \frac{M}{(\lambda - \omega)^k} \quad \forall\, \lambda \in \varrho(A),\ \lambda > \omega,\ k \ge 1. \]
In this case we have
\[ \|S(t)\|_{\mathcal{L}} \le M e^{\omega t} \quad \forall\, t \ge 0. \]


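PROOF "$\Longrightarrow$": From Proposition 3.3.44, we know that if $\lambda > \omega$, then $\lambda \in \varrho(A)$ and we have
\[ R_\lambda x = \int_0^{+\infty} e^{-\lambda t} S(t)x \, dt \quad \forall\, x \in X, \]
so
\[ R_\lambda^{(k-1)} x = \frac{d^{k-1}}{d\lambda^{k-1}} R_\lambda x = \int_0^{+\infty} (-t)^{k-1} e^{-\lambda t} S(t)x \, dt \quad \forall\, k \ge 1 \]
and thus
\[ \big\|R_\lambda^{(k-1)}\big\|_{\mathcal{L}} \le M \int_0^{+\infty} t^{k-1} e^{-(\lambda - \omega)t} \, dt = M (k-1)! (\lambda - \omega)^{-k}. \tag{3.82} \]
But the function $\lambda \mapsto R_\lambda$ is analytic on $\varrho(A)$ (see Remark 3.3.45). Hence
\[ R_\lambda^{(k-1)} = (-1)^{k-1} (k-1)!\, R_\lambda^k. \tag{3.83} \]
From (3.82) and (3.83), it follows that
\[ \|R_\lambda^k\|_{\mathcal{L}} \le \frac{M}{(\lambda - \omega)^k}. \]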

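"$\Longleftarrow$": For $\lambda > \omega$, let us set
\[ A_\lambda := \lambda^2 R_\lambda - \lambda\,\mathrm{id}_X. \]
We have that $R_\lambda \in \mathcal{L}(X)$ and so we obtain the $C_0$-semigroup
\[ S_\lambda(t) = e^{A_\lambda t} = e^{-\lambda t} \sum_{n=0}^{\infty} \frac{(\lambda^2 t)^n}{n!} (\lambda\,\mathrm{id}_X - A)^{-n} \]
(see Remark 3.3.30). We shall show that as $\lambda \to +\infty$, $S_\lambda(t)$ converges in the strong operator topology to $S(t)$, $t \ge 0$, which is the desired semigroup. Note that
\[ x = R_\lambda (\lambda\,\mathrm{id}_X - A)x = \lambda R_\lambda x - R_\lambda Ax = \lambda R_\lambda x - AR_\lambda x \quad \forall\, x \in D(A) \]
(see Remark 3.3.43), so
\[ A_\lambda x = \lambda R_\lambda Ax = \lambda AR_\lambda x. \tag{3.84} \]
Also, we have
\[ \|\lambda R_\lambda x - x\|_X = \|R_\lambda Ax\|_X \le \frac{M}{\lambda - \omega} \|Ax\|_X \quad \forall\, x \in D(A), \]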


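so
\[ \lambda R_\lambda x \to x \quad \text{in } X \text{ as } \lambda \to +\infty, \quad \forall\, x \in D(A). \]
But $D(A)$ is dense in $X$. So for a given $x \in X$, we can find a sequence $\{x_m\}_{m \ge 1} \subseteq D(A)$, such that $x_m \to x$ in $X$. Then for $\lambda_n \to +\infty$ as $n \to +\infty$, we have
\[ \lambda_n R_{\lambda_n} x_m \to x_m \quad \text{in } X \text{ as } n \to +\infty. \]
By the double limit lemma (see Proposition A.2.35), we can find a sequence $\{m(n)\}_{n \ge 1}$ increasing (not necessarily strictly) to $+\infty$, such that
\[ \lambda_n R_{\lambda_n} x_{m(n)} \to x \quad \text{in } X \text{ as } n \to +\infty. \]
Then we have
\[ \|\lambda_n R_{\lambda_n} x - x\|_X \le \|\lambda_n R_{\lambda_n} x - \lambda_n R_{\lambda_n} x_{m(n)}\|_X + \|\lambda_n R_{\lambda_n} x_{m(n)} - x\|_X \le \frac{\lambda_n M}{\lambda_n - \omega} \|x - x_{m(n)}\|_X + \|\lambda_n R_{\lambda_n} x_{m(n)} - x\|_X \to 0, \]
so
\[ \lambda R_\lambda x \to x \quad \text{in } X \text{ as } \lambda \to +\infty \quad \forall\, x \in X. \]
Then because of (3.84), we have
\[ A_\lambda x \to Ax \quad \text{in } X \text{ as } \lambda \to +\infty \quad \forall\, x \in D(A). \]
For every $\lambda > \omega$ and $t \ge 0$, we have
\[ \|S_\lambda(t)\|_{\mathcal{L}} \le e^{-\lambda t} \sum_{n=0}^{\infty} \frac{(\lambda^2 t)^n}{n!} \frac{M}{(\lambda - \omega)^n} = M e^{\frac{\lambda\omega}{\lambda - \omega} t}. \]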

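Also from Proposition 3.3.44(a), we have that
\[ A_\lambda A_\mu = A_\mu A_\lambda \quad \forall\, \lambda, \mu > \omega \]
and so
\[ A_\lambda S_\mu(t) = S_\mu(t) A_\lambda \quad \forall\, \lambda, \mu > \omega,\ t \ge 0. \]
From Proposition 3.3.36(b), it follows that
\[ S_\lambda(t)x - S_\mu(t)x = \int_0^t \frac{d}{ds} \big( S_\mu(t-s) S_\lambda(s)x \big)\, ds = \int_0^t S_\mu(t-s)(A_\lambda - A_\mu) S_\lambda(s)x \, ds = \int_0^t S_\mu(t-s) S_\lambda(s)(A_\lambda - A_\mu)x \, ds \quad \forall\, x \in D(A), \]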


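so
\[ \|S_\lambda(t)x - S_\mu(t)x\|_X \le M^2 e^{\frac{\mu\omega}{\mu - \omega} t} \big\|(A_\lambda - A_\mu)x\big\|_X \int_0^t e^{-\frac{(\lambda - \mu)\omega^2}{(\mu - \omega)(\lambda - \omega)} s} \, ds. \]
Let $\lambda > \mu$. Then the exponential under the integral is at most $1$, so the integral is at most $t$ and we have
\[ \|S_\lambda(t)x - S_\mu(t)x\|_X \le M^2 t\, e^{\frac{\mu\omega}{\mu - \omega} t} \big\|(A_\lambda - A_\mu)x\big\|_X \to 0 \quad \text{as } \lambda, \mu \to +\infty, \]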

and thus $S_\lambda(t)x$ converges to some limit as $\lambda \to +\infty$ uniformly on compact intervals. Denote this limit by $S(t)x$, $x \in D(A)$. As before, exploiting the density of $D(A)$ in $X$, we have that
\[ S_\lambda(t)x \to S(t)x \quad \text{in } X \text{ as } \lambda \to +\infty \quad \forall\, x \in X \]
and the convergence is uniform on compact intervals in $\mathbb{R}_+$. This means that the function $t \mapsto S(t)x$ is continuous and since $S$ clearly satisfies the semigroup property and $S(0) = \mathrm{id}_X$, we conclude that $S$ is a $C_0$-semigroup.

It remains to show that $A$ is the generator of $S$. Let $\widehat{A}$ be the generator of $S$. From Proposition 3.3.36(c), we know that
\[ S_\lambda(t)x - x = \int_0^t S_\lambda(s) A_\lambda x \, ds \quad \forall\, t \ge 0,\ \lambda > \omega,\ x \in D(A). \tag{3.85} \]
Note that
\[ \|S_\lambda(s) A_\lambda x - S(s)Ax\|_X \le \|S_\lambda(s) A_\lambda x - S_\lambda(s)Ax\|_X + \|S_\lambda(s)Ax - S(s)Ax\|_X \le \|S_\lambda(s)\|_{\mathcal{L}} \|A_\lambda x - Ax\|_X + \|S_\lambda(s)Ax - S(s)Ax\|_X \]
for all $s \in [0,t]$, $\lambda > \omega$ and $x \in D(A)$, so
\[ \|S_\lambda(s) A_\lambda x - S(s)Ax\|_X \to 0 \quad \text{as } \lambda \to +\infty. \]
Thus if we pass to the limit as $\lambda \to +\infty$ in (3.85), we obtain
\[ S(t)x - x = \int_0^t S(s)Ax \, ds \quad \forall\, t \ge 0,\ x \in D(A), \]
so
\[ D(A) \subseteq D(\widehat{A}) \quad \text{and} \quad \widehat{A}x = Ax \quad \forall\, x \in D(A), \]


i.e., $\widehat{A}$ is an extension of $A$. Now if $\lambda > \omega$, then $\lambda \in \varrho(A) \cap \varrho(\widehat{A})$ and so
\[ (\lambda\,\mathrm{id}_X - \widehat{A})\big(D(A)\big) = (\lambda\,\mathrm{id}_X - A)\big(D(A)\big) = X. \]
Thus $(\lambda\,\mathrm{id}_X - \widehat{A})|_{D(A)}$ is bijective and so $D(A) = D(\widehat{A})$, i.e., $A = \widehat{A}$.

REMARK 3.3.47 The operator $A_\lambda \in \mathcal{L}(X)$ introduced in the above proof is known as the Yosida approximation of $A$. Note that if $A$ is also dissipative (see Remark 3.3.2), then $A_\lambda$ coincides with the notion introduced in Definition 3.3.11. However, there $A$ was not necessarily linear.

The following generation theorem for perturbed operators is useful in applications and is known as the Phillips theorem.

THEOREM 3.3.48 (Phillips Theorem) If $X$ is a Banach space, $A : X \supseteq D(A) \to X$ is the generator of a $C_0$-semigroup and $B \in \mathcal{L}(X)$, then $A + B : X \supseteq D(A) \to X$ is also the generator of a $C_0$-semigroup.

Another important generation result is the so-called Lumer-Phillips theorem.

THEOREM 3.3.49 (Lumer-Phillips Theorem) If $X$ is a Banach space and $A : X \supseteq D(A) \to X$ is a densely defined, linear, m-dissipative operator, then $A$ is the generator of a contraction semigroup.

PROOF By Lemma 3.3.3, we have that
\[ \|\lambda x - Ax\|_X \ge \lambda \|x\|_X \quad \forall\, \lambda > 0,\ x \in D(A). \]
Also
\[ R(\lambda\,\mathrm{id}_X - A) = X \quad \forall\, \lambda > 0, \]
due to the m-dissipativity of $A$ (see Definition 3.3.1 and Proposition 3.3.14). It follows that $(0, +\infty) \subseteq \varrho(A)$ and
\[ \|R_\lambda\|_{\mathcal{L}} \le \frac{1}{\lambda} \quad \forall\, \lambda > 0. \]
Then by Theorem 3.3.46, $A$ generates a contraction semigroup.


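Recall that if $X$ is a Banach space and $A \in \mathcal{L}(X)$, then $A$ is the generator of the semigroup
\[ S(t) := e^{tA} \quad \forall\, t \ge 0. \]
Moreover, from elementary analysis, we know that for the exponential function, we have
\[ e^{-at} = \lim_{n \to +\infty} \Big( 1 + \frac{at}{n} \Big)^{-n}. \]
In the next theorem, we show that even if $A$ is unbounded, the limit expression is valid for the semigroup generated by $A$. The result is known as the exponential formula. First a lemma.

LEMMA 3.3.50 If $X$ is a Banach space and $B \in \mathcal{L}(X)$ with $\|B\|_{\mathcal{L}} \le 1$, then
\[ \big\| e^{n(B - \mathrm{id}_X)} x - B^n x \big\|_X \le \sqrt{n}\, \|x - Bx\|_X \quad \forall\, n \ge 1,\ x \in X. \]

PROOF For $k \ge n$, we have
\[ \|B^k x - B^n x\|_X = \Big\| \sum_{m=n}^{k-1} \big( B^{m+1} x - B^m x \big) \Big\|_X \le \|x - Bx\|_X \sum_{m=n}^{k-1} \|B^m\|_{\mathcal{L}} \le |k - n| \, \|x - Bx\|_X, \]
and by symmetry the same bound holds for $k < n$. Then for any $t \ge 0$, we have
\[ \big\| e^{t(B - \mathrm{id}_X)} x - B^n x \big\|_X = \Big\| e^{-t} \sum_{k=0}^{\infty} \frac{t^k}{k!} \big( B^k x - B^n x \big) \Big\|_X \le e^{-t} \Big( \sum_{k=0}^{\infty} \frac{t^k}{k!} |k - n| \Big) \|x - Bx\|_X \]
\[ \le e^{-t} \Big( \sum_{k=0}^{\infty} \frac{t^k}{k!} \Big)^{\frac{1}{2}} \Big( \sum_{k=0}^{\infty} \frac{t^k}{k!} (k - n)^2 \Big)^{\frac{1}{2}} \|x - Bx\|_X = e^{-\frac{t}{2}} \big( t^2 - (2n - 1)t + n^2 \big)^{\frac{1}{2}} e^{\frac{t}{2}} \|x - Bx\|_X. \]
Let $t = n$. We conclude that
\[ \big\| e^{n(B - \mathrm{id}_X)} x - B^n x \big\|_X \le \sqrt{n}\, \|x - Bx\|_X. \]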
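The estimate of Lemma 3.3.50 and the second-moment identity used in its proof can be tested numerically in the simplest setting $X = \mathbb{C}$, where $B$ is multiplication by a complex number of modulus at most one. The value of `B`, the range of `n` and the truncation of the series are assumptions made only for this sketch.

```python
import cmath
import math

B = 0.6 + 0.5j   # hypothetical contraction on X = C, since |B| < 1

def lemma_gap(n):
    """| exp(n (B - 1)) x - B^n x | for x = 1."""
    return abs(cmath.exp(n * (B - 1)) - B ** n)

def lemma_bound(n):
    """sqrt(n) | x - B x | for x = 1."""
    return math.sqrt(n) * abs(1 - B)

checks = [lemma_gap(n) <= lemma_bound(n) for n in range(1, 51)]

def second_moment(t, n, terms=200):
    """e^{-t} * sum_k t^k / k! * (k - n)^2, with the series truncated."""
    weight = math.exp(-t)   # e^{-t} t^k / k! for k = 0
    total = 0.0
    for k in range(terms):
        total += weight * (k - n) ** 2
        weight *= t / (k + 1)
    return total

# The proof uses sum = t^2 - (2n - 1) t + n^2, which equals n when t = n.
moment_at_n = second_moment(5.0, 5)
```

The choice $t = n$ in the proof is the one that collapses the quadratic $t^2 - (2n-1)t + n^2$ to $n$, which is where the factor $\sqrt{n}$ comes from.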

Using this auxiliary result we can prove the exponential formula.


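THEOREM 3.3.51 If $X$ is a Banach space and $S$ is a contraction semigroup on $X$ with generator $A$, then
\[ S(t)x = \lim_{n \to +\infty} \Big( \mathrm{id}_X - \frac{t}{n} A \Big)^{-n} x = \lim_{n \to +\infty} \Big( \frac{n}{t} R_{\frac{n}{t}} \Big)^n x \quad \forall\, x \in X. \]

PROOF For every $n \ge 1$ and $t > 0$, we have
\[ \frac{n}{t} R_{\frac{n}{t}} = \frac{n}{t} \Big( \frac{n}{t}\,\mathrm{id}_X - A \Big)^{-1} = \Big( \mathrm{id}_X - \frac{t}{n} A \Big)^{-1}. \]
Also we have
\[ t A_{\frac{n}{t}} = t \Big( \frac{n^2}{t^2} R_{\frac{n}{t}} - \frac{n}{t}\,\mathrm{id}_X \Big) = n \Big( \frac{n}{t} R_{\frac{n}{t}} - \mathrm{id}_X \Big). \]
Note that
\[ \Big\| \frac{n}{t} R_{\frac{n}{t}} \Big\|_{\mathcal{L}} \le 1. \]
So we can apply Lemma 3.3.50 with $B = \frac{n}{t} R_{\frac{n}{t}}$. We obtain
\[ \Big\| \exp\Big( n \Big( \frac{n}{t} R_{\frac{n}{t}} - \mathrm{id}_X \Big) \Big) x - \Big( \frac{n}{t} R_{\frac{n}{t}} \Big)^n x \Big\|_X \le \sqrt{n}\, \Big\| \frac{n}{t} R_{\frac{n}{t}} x - x \Big\|_X. \]
From the proof of Theorem 3.3.46, we know that
\[ \Big\| \frac{n}{t} R_{\frac{n}{t}} x - x \Big\|_X \le \frac{t}{n} \|Ax\|_X \quad \forall\, x \in D(A). \]
Therefore it follows that
\[ \Big\| \exp\big( t A_{\frac{n}{t}} \big) x - \Big( \frac{n}{t} R_{\frac{n}{t}} \Big)^n x \Big\|_X \le \frac{t}{\sqrt{n}} \|Ax\|_X \quad \forall\, x \in D(A). \]
Again from the proof of Theorem 3.3.46, we know that
\[ S(t)x = \lim_{n \to +\infty} \exp\big( t A_{\frac{n}{t}} \big) x \quad \forall\, x \in X. \]
So we infer that for fixed $t > 0$,
\[ S(t)x = \lim_{n \to +\infty} \Big( \frac{n}{t} R_{\frac{n}{t}} \Big)^n x \quad \forall\, x \in D(A). \tag{3.86} \]
But
\[ \big\| e^{t A_{n/t}} \big\|_{\mathcal{L}} \le 1 \quad \text{and} \quad \Big\| \Big( \frac{n}{t} R_{\frac{n}{t}} \Big)^n \Big\|_{\mathcal{L}} \le 1 \]
and $D(A)$ is dense in $X$. So (3.86) is valid for all $x \in X$.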
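In the scalar case the exponential formula is the convergence of the implicit (backward) Euler scheme for $x' = ax$. The following sketch illustrates this for a dissipative scalar generator; the values of `a`, `t` and the step counts are assumptions chosen only for the example.

```python
import math

a = -2.0   # hypothetical scalar generator with a <= 0, so |exp(t a)| <= 1

def backward_euler(t, n):
    """(1 - (t/n) a)^{-n} applied to x = 1, i.e. ((n/t) R_{n/t})^n."""
    return (1.0 - (t / n) * a) ** (-n)

t = 1.0
errors = {n: abs(backward_euler(t, n) - math.exp(t * a))
          for n in (10, 100, 1000)}
```

Each factor $(1 - \frac{t}{n} a)^{-1}$ has modulus at most one here, which is the scalar counterpart of the bound $\|\frac{n}{t} R_{n/t}\|_{\mathcal{L}} \le 1$ used in the proof.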


Before passing to the nonlinear semigroup theory, let us see how we can use semigroups to extend the notion of a solution for an inhomogeneous evolution equation. So let $X$ be a Banach space, $A$ the generator of a $C_0$-semigroup $S$ and $T = [0,b]$. Let $f \in L^1(T;X)$ and consider the evolution equation
\[ \begin{cases} x'(t) = Ax(t) + f(t) & \forall\, t \in T = [0,b], \\ x(0) = x_0. \end{cases} \tag{3.87} \]

DEFINITION 3.3.52 (a) A function $x \in W^{1,1}\big((0,b);X\big)$ is a strong solution of (3.87), if $x(0) = x_0$ and it satisfies the equation almost everywhere (hence $x(t) \in D(A)$ for almost all $t \in T$).
(b) A function $x \in C(T;X)$ is a weak solution of (3.87), if for all $x^* \in D(A^*)$, the function $t \mapsto \langle x^*, x(t) \rangle_X$ is absolutely continuous and
\[ \langle x^*, x(t) \rangle_X = \langle x^*, x_0 \rangle_X + \int_0^t \langle A^* x^*, x(s) \rangle_X \, ds + \int_0^t \langle x^*, f(s) \rangle_X \, ds \quad \forall\, t \in T. \]
(c) A function $x \in C(T;X)$ is a mild solution of (3.87), if
\[ x(t) = S(t)x_0 + \int_0^t S(t-s) f(s)\, ds \quad \forall\, t \in T. \]

The following interesting result is due to Ball (1977), where the reader can find its proof.

THEOREM 3.3.53 If $X$ is a Banach space, $A$ is the generator of a $C_0$-semigroup and $f \in L^1(T;X)$, then $x \in C(T;X)$ is a mild solution of (3.87) if and only if it is a weak solution.

REMARK 3.3.54 In contrast to the strong solution, the mild solution makes sense without having that $x(t) \in D(A)$ for a.a. $t \in T$. Also we need not have $x_0 \in D(A)$ (nonregular initial condition). Moreover, it is easy to check that the mild solution map $(f, x_0) \mapsto x(\cdot; f, x_0)$ is Lipschitz continuous on $L^1(T;X) \times X$.
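The variation of constants formula defining a mild solution can be evaluated directly in the scalar case $X = \mathbb{R}$, $A = a$, $S(t) = e^{at}$. The sketch below compares a quadrature of the formula with the closed-form solution of $x' = -x + \cos t$; the data `a`, `x0`, the forcing term and the Simpson rule are assumptions made only for this illustration.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

a = -1.0    # hypothetical scalar generator, S(t) = exp(a t)
x0 = 2.0    # hypothetical initial condition
f = math.cos  # hypothetical forcing term

def semigroup(t):
    return math.exp(a * t)

def mild_solution(t, n=400):
    """x(t) = S(t) x0 + int_0^t S(t - s) f(s) ds, integral by quadrature."""
    convolution = simpson(lambda s: semigroup(t - s) * f(s), 0.0, t, n)
    return semigroup(t) * x0 + convolution

def closed_form(t):
    """Exact solution of x' = -x + cos t, x(0) = x0, valid for a = -1 only."""
    return math.exp(-t) * x0 + 0.5 * (math.cos(t) + math.sin(t) - math.exp(-t))

t_eval = 1.5
gap = abs(mild_solution(t_eval) - closed_form(t_eval))
```

Here the data are smooth, so the mild solution is also a strong solution; the point of the definition is that the same formula still makes sense when $f$ is merely integrable and $x_0 \notin D(A)$.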


Now we move to nonlinear semigroups.

DEFINITION 3.3.55 Let $X$ be a Banach space and let $C \subseteq X$ be a nonempty set. A family of maps
\[ S(t) : C \to C, \quad t \ge 0, \]
is said to be a semigroup of nonexpansive maps if
(a) $S(0) = \mathrm{id}_C$;
(b) $S(t+s) = S(t) \circ S(s)$ for all $t, s \ge 0$;
(c) $\|S(t)x - S(t)y\|_X \le \|x - y\|_X$ for all $t \ge 0$ and all $x, y \in C$;
(d) $S(t)x \to x$ in $X$ as $t \searrow 0$ for all $x \in C$.

REMARK 3.3.56 Evidently a semigroup $S$ of nonexpansive maps on $C$ can be extended uniquely to a semigroup of nonexpansive maps on $\overline{C}$ and so in Definition 3.3.55 we may assume without any loss of generality that $C \subseteq X$ is closed. If $C = X$ and $S(t) \in \mathcal{L}(X)$, then we recover Definition 3.3.33. Moreover, it is straightforward to check that
\[ \mathbb{R}_+ \times C \ni (t,x) \mapsto S(t)x \in C \]
is continuous.

We shall prove a basic generation theorem for nonlinear semigroups of nonexpansive maps. The result will be a nonlinear analog of Theorems 3.3.49 and 3.3.51. To do this we need some preparation. First we prove a combinatorial lemma.

LEMMA 3.3.57 If $n \ge m \ge 1$ are integers and $\alpha, \beta > 0$ are such that $\alpha + \beta = 1$, then
\[ \sum_{k=0}^{m} \binom{n}{k} \alpha^k \beta^{n-k} (m - k) \le \big[ (n\alpha - m)^2 + n\alpha\beta \big]^{\frac{1}{2}} \]
and
\[ \sum_{k=m}^{n} \binom{k-1}{m-1} \alpha^m \beta^{k-m} (n - k) \le \Big[ \frac{m\beta}{\alpha^2} + \Big( \frac{m\beta}{\alpha} + m - n \Big)^2 \Big]^{\frac{1}{2}}. \]

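PROOF Since $n \ge m$ and using the Cauchy-Schwarz inequality, we have
\[ \sum_{k=0}^{m} \binom{n}{k} \alpha^k \beta^{n-k} (m - k) \le \sum_{k=0}^{n} \binom{n}{k} \alpha^k \beta^{n-k} |m - k| \le \Big( \sum_{k=0}^{n} \binom{n}{k} \alpha^k \beta^{n-k} \Big)^{\frac{1}{2}} \Big( \sum_{k=0}^{n} \binom{n}{k} \alpha^k \beta^{n-k} (m - k)^2 \Big)^{\frac{1}{2}}. \tag{3.88} \]
From the binomial theorem, we know that
\[ \sum_{k=0}^{n} \binom{n}{k} \alpha^k \beta^{n-k} = (\alpha + \beta)^n, \tag{3.89} \]
\[ \sum_{k=0}^{n} k \binom{n}{k} \alpha^k \beta^{n-k} = \alpha n (\alpha + \beta)^{n-1} \tag{3.90} \]
and
\[ \sum_{k=0}^{n} k^2 \binom{n}{k} \alpha^k \beta^{n-k} = \alpha^2 n(n-1) (\alpha + \beta)^{n-2} + \alpha n (\alpha + \beta)^{n-1}. \tag{3.91} \]
Using (3.89), (3.90) and (3.91) in the right hand side of (3.88) and since $\alpha + \beta = 1$, we obtain
\[ \sum_{k=0}^{m} \binom{n}{k} \alpha^k \beta^{n-k} (m - k) \le \big[ (n\alpha - m)^2 + n\alpha\beta \big]^{\frac{1}{2}}. \]
Also using once more the Cauchy-Schwarz inequality, we have
\[ \sum_{k=m}^{n} \binom{k-1}{m-1} \alpha^m \beta^{k-m} (n - k) \le \sum_{k=m}^{\infty} \binom{k-1}{m-1} \alpha^m \beta^{k-m} |n - k| \tag{3.92} \]
\[ \le \Big( \sum_{k=m}^{\infty} \binom{k-1}{m-1} \alpha^m \beta^{k-m} \Big)^{\frac{1}{2}} \Big( \sum_{k=m}^{\infty} \binom{k-1}{m-1} \alpha^m \beta^{k-m} (n - k)^2 \Big)^{\frac{1}{2}}. \]
Recall that
\[ \sum_{k=m}^{\infty} \binom{k-1}{m-1} \beta^{k-m} = \frac{1}{(1 - \beta)^m} \quad \forall\, \beta \in (0,1). \tag{3.93} \]
Using (3.93) and the identities obtained by differentiating it with respect to $\beta$, in the right hand side of (3.92), we obtain
\[ \sum_{k=m}^{n} \binom{k-1}{m-1} \alpha^m \beta^{k-m} (n - k) \le \Big[ \frac{m\beta}{\alpha^2} + \Big( \frac{m\beta}{\alpha} + m - n \Big)^2 \Big]^{\frac{1}{2}}. \]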
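Both inequalities of Lemma 3.3.57 can be checked numerically over a small grid of parameters. In the sketch below the grid of values for `n`, `m` and `alpha`, the tolerance and the function names are assumptions made only for this check.

```python
import math

def lhs_first(n, m, alpha):
    """sum_{k=0}^{m} C(n,k) alpha^k beta^(n-k) (m - k)."""
    beta = 1.0 - alpha
    return sum(math.comb(n, k) * alpha**k * beta**(n - k) * (m - k)
               for k in range(m + 1))

def rhs_first(n, m, alpha):
    """[(n alpha - m)^2 + n alpha beta]^(1/2)."""
    beta = 1.0 - alpha
    return math.sqrt((n * alpha - m) ** 2 + n * alpha * beta)

def lhs_second(n, m, alpha):
    """sum_{k=m}^{n} C(k-1,m-1) alpha^m beta^(k-m) (n - k)."""
    beta = 1.0 - alpha
    return sum(math.comb(k - 1, m - 1) * alpha**m * beta**(k - m) * (n - k)
               for k in range(m, n + 1))

def rhs_second(n, m, alpha):
    """[m beta / alpha^2 + (m beta / alpha + m - n)^2]^(1/2)."""
    beta = 1.0 - alpha
    return math.sqrt(m * beta / alpha**2 + (m * beta / alpha + m - n) ** 2)

violations = []
for n in range(1, 13):
    for m in range(1, n + 1):
        for alpha in (0.1, 0.25, 0.5, 0.75, 0.9):
            if lhs_first(n, m, alpha) > rhs_first(n, m, alpha) + 1e-12:
                violations.append(("first", n, m, alpha))
            if lhs_second(n, m, alpha) > rhs_second(n, m, alpha) + 1e-12:
                violations.append(("second", n, m, alpha))
```

The right hand sides are the root mean square deviations of a binomial and of a negative binomial (Pascal) distribution from $m$ and $n$ respectively, which is another way to read the two Cauchy-Schwarz steps in the proof.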


Using this auxiliary result we can obtain some estimates for the family $\{J_\lambda^n\}_{n \ge 1, \lambda > 0}$.

LEMMA 3.3.58 If $X$ is a Banach space, $A : X \supseteq D(A) \to X$ is an m-accretive operator, $\lambda \ge \mu > 0$ and $n \ge m \ge 1$ are integers, then
(a) $\|J_\lambda^n(x) - x\|_X \le n \|J_\lambda(x) - x\|_X$ for all $x \in X$;
(b)
\[ \|J_\mu^n(x) - J_\lambda^m(x)\|_X \le \sum_{k=0}^{m} \alpha^k \beta^{n-k} \binom{n}{k} \big\|J_\lambda^{m-k}(x) - x\big\|_X + \sum_{k=m}^{n} \alpha^m \beta^{k-m} \binom{k-1}{m-1} \big\|J_\mu^{n-k}(x) - x\big\|_X, \]
where
\[ \alpha = \frac{\mu}{\lambda} \quad \text{and} \quad \beta = \frac{\lambda - \mu}{\lambda}. \]

PROOF (a) Using Proposition 3.3.12(a), we have
\[ \|J_\lambda^n x - x\|_X = \Big\| \sum_{k=0}^{n-1} \big( J_\lambda^{n-k}(x) - J_\lambda^{n-(k+1)}(x) \big) \Big\|_X \le \sum_{k=0}^{n-1} \|J_\lambda(x) - x\|_X = n \|J_\lambda(x) - x\|_X. \]

(b) For integers $1 \le i \le n$ and $1 \le k \le m$, we set
\[ a_{k,i} := \|J_\mu^i(x) - J_\lambda^k(x)\|_X. \]
Using the resolvent identity (see Remark 3.3.13), we obtain
\[ a_{k,i} = \Big\| J_\mu^i(x) - J_\mu\Big( \frac{\mu}{\lambda} J_\lambda^{k-1}(x) + \frac{\lambda - \mu}{\lambda} J_\lambda^k(x) \Big) \Big\|_X \le \Big\| J_\mu^{i-1}(x) - \frac{\mu}{\lambda} J_\lambda^{k-1}(x) - \frac{\lambda - \mu}{\lambda} J_\lambda^k(x) \Big\|_X \]
\[ \le \frac{\mu}{\lambda} \big\| J_\mu^{i-1}(x) - J_\lambda^{k-1}(x) \big\|_X + \frac{\lambda - \mu}{\lambda} \big\| J_\mu^{i-1}(x) - J_\lambda^k(x) \big\|_X = \alpha a_{k-1,i-1} + \beta a_{k,i-1}. \tag{3.94} \]
Inequalities (3.94) can be solved to estimate $a_{m,n}$ in terms of $a_{k,0}$ and $a_{0,i}$. This way we obtain the inequality in part (b) of the lemma.


Now we are ready for the generation theorem for nonlinear semigroups of nonexpansive maps.

THEOREM 3.3.59 If $X$ is a Banach space and $A : X \supseteq D(A) \to 2^X$ is an m-accretive operator, then
\[ S(t)x = \lim_{n \to +\infty} \Big( \mathrm{id}_X + \frac{t}{n} A \Big)^{-n} x \]
exists for each $x \in \overline{D(A)}$, uniformly in $t$ on compact intervals in $\mathbb{R}_+$. Moreover, $\{S(t)\}_{t \ge 0}$ is a semigroup of nonexpansive maps on $\overline{D(A)}$ and for each $x \in D(A)$ and $t > 0$, we have
\[ \|S(t)x - x\|_X \le t\,|A(x)| = t \inf_{u \in A(x)} \|u\|_X. \]

PROOF Let x ∈ D(A), λ > µ > 0 and n > m > 1 positive integers. Using Lemmata 3.3.57 and 3.3.58, we obtain ° n ° °Jµ (x) − Jλm (x)°

X

µ· 6

¸ 21 (nµ − λm)2 + nµ(λ − µ)

(3.95)

· ¸ 21 ¶ ¯ ¯ ¯A(x)¯. (3.96) + mλ(λ − µ) + (mλ − nµ)2 Taking µ =

t n

and λ =

t m

in (3.96), we obtain µ

° n ° °J t (x) − J m ° t (x) n

m

X

6 2t

1 1 − m n

¶ 21 |Ax|,

(3.97)

so lim J nt (x) = S(t)x

n→+∞

n

exists uniformly in t on compact intervals of R+ . Moreover, since J nt is nonexpansive on X, n

we see that ° ° °S(t)x − S(t)y ° 6 kx − yk X X

∀ t > 0, x, y ∈ D(A).

Therefore S(t)x =

lim J nt (x) exists for x ∈ D(A)

n→+∞

n

and S(t)(·) is nonexpansive on D(A).

(3.98)


Also, if in (3.96) we let $n=m$, $\mu=\frac{t}{n}$ and $\lambda=\frac{s}{n}$ with $0\le t\le s$ and then pass to the limit as $n\to+\infty$, we obtain
\[
\|S(t)x-S(s)x\|_X\le 2|t-s||Ax|\qquad\forall\ x\in D(A). \tag{3.99}
\]
From (3.99), it follows that the function $t\mapsto S(t)x$ is continuous for all $x\in D(A)$ and then, using the double limit lemma (see Proposition A.2.35) as in the proof of Theorem 3.3.46, we obtain that the function $t\mapsto S(t)x$ is continuous on $\mathbb{R}_+$ for all $x\in\overline{D(A)}$.

Finally we need to verify the semigroup property (see Definition 3.3.55(b)). From (3.97) and (3.98), we have
\[
S(t)^m x=\lim_{n\to+\infty}\big(J_{t/n}^n\big)^m(x)=\lim_{n\to+\infty}\big(J_{t/n}^m\big)^n(x).
\]
Therefore,
\[
\begin{aligned}
S(mt)x&=\lim_{n\to+\infty}J_{mt/n}^n(x)=\lim_{k\to+\infty}J_{mt/(mk)}^{mk}(x) \\
&=\lim_{k\to+\infty}\big(J_{t/k}^m\big)^k(x)=S(t)^m x.
\end{aligned}
\tag{3.100}
\]
Then if $i,k,r,s>0$ are integers, we have
\[
\begin{aligned}
S\Big(\frac{i}{k}+\frac{r}{s}\Big)x&=S\Big(\frac{is+rk}{ks}\Big)x=S\Big(\frac{1}{ks}\Big)^{is+rk}x \\
&=S\Big(\frac{1}{ks}\Big)^{is}S\Big(\frac{1}{ks}\Big)^{rk}x=S\Big(\frac{i}{k}\Big)S\Big(\frac{r}{s}\Big)x,
\end{aligned}
\]
so $S(t+\tau)x=S(t)\circ S(\tau)x$ for all rational $t,\tau>0$ and all $x\in\overline{D(A)}$. Exploiting the continuity in $t$ and the nonexpansiveness in $x$, we conclude that
\[
S(t+\tau)=S(t)\circ S(\tau)\qquad\forall\ t,\tau\ge 0.
\]
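To see the exponential formula of Theorem 3.3.59 at work, we may take a scalar example. The sketch below assumes $X=\mathbb{R}$ and the (illustrative, not from the text) m-accretive operator $A(y)=y^3$, for which the resolvent $J_\lambda(x)$ solves $y+\lambda y^3=x$ and the Cauchy problem $x'=-x^3$ has the closed-form solution $x(t)=x_0/\sqrt{1+2x_0^2t}$. The helper names are hypothetical.

```python
from math import sqrt

def resolvent(x, lam, iters=60):
    """J_lam(x) = (id + lam*A)^{-1}(x) for the assumed A(y) = y**3 (Newton's method)."""
    sign = 1.0 if x >= 0 else -1.0
    target = abs(x)
    y = target  # start to the right of the root; Newton then decreases monotonically
    for _ in range(iters):
        y -= (y + lam * y ** 3 - target) / (1.0 + 3.0 * lam * y ** 2)
    return sign * y

def exponential_formula(x, t, n):
    """n-fold iterate J_{t/n}^n(x), the approximant of S(t)x."""
    for _ in range(n):
        x = resolvent(x, t / n)
    return x

def exact_flow(x, t):
    """Closed-form solution of x' = -x**3 with x(0) = x."""
    return x / sqrt(1.0 + 2.0 * x * x * t)

x0, t = 1.0, 1.0
approx = exponential_formula(x0, t, 2000)
exact = exact_flow(x0, t)

# Estimate (3.97) with |A(x0)| = |x0**3|, for one illustrative pair m <= n
m, n = 10, 1000
gap = abs(exponential_formula(x0, t, n) - exponential_formula(x0, t, m))
bound = 2.0 * t * sqrt(1.0 / m - 1.0 / n) * abs(x0 ** 3)
```

In this example `approx` and `exact` should agree up to an error of order $1/n$, and `gap` should not exceed `bound`, in line with (3.97).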

In applications to evolution equations, it is very helpful to know if
\[
S(t)\colon C\to C,\qquad t>0,
\]
is a compact map (see Definition 3.1.1). So we make the following definition.

DEFINITION 3.3.60 Let $X$ be a Banach space, $C\subseteq X$ a nonempty, closed set and $S(t)\colon C\to C$, $t\ge 0$, a semigroup of nonexpansive maps. We say that $S$ is compact, if for all $t>0$, $S(t)$ is a compact map.

REMARK 3.3.61 Since $S(0)=\mathrm{id}_C$, $S(0)$ is not in general compact unless $C\subseteq X$ is compact or $X$ is finite dimensional.

Next we present two simple but typical examples of (linear) semigroups which are compact and noncompact respectively.

EXAMPLE 3.3.62

(a) Let $X=H=L^2(0,\pi)$ and let $A\colon H\supseteq D(A)\to H$ be defined by
\[
Ax\overset{\mathrm{df}}{=}-\frac{d^2x}{dt^2}\qquad\forall\ x\in W^{2,2}(0,\pi)\cap W_0^{1,2}(0,\pi).
\]
By integration by parts, we can check that $A$ is monotone (i.e., accretive). Also for every $f\in L^2(0,\pi)$ the boundary value problem
\[
\begin{cases}
-x''(t)+x(t)=f(t)&\text{for a.a. } t\in[0,\pi],\\
x(0)=x(\pi)=0,
\end{cases}
\]
has a unique solution $x\in D(A)$. Hence $A$ is maximal monotone (i.e., m-accretive). Note that $\overline{D(A)}=H$ and then because of Theorem 3.3.49, $-A$ generates a contraction semigroup $S$ on $H$. We know that
\[
\{\lambda_k=k^2\}_{k\ge 1}\ \text{is the spectrum of}\ \Big(-\frac{d^2}{dt^2},\ W_0^{1,2}(0,\pi)\Big)
\]
and the corresponding eigenfunctions
\[
\Big\{\sqrt{\tfrac{2}{\pi}}\sin kx\Big\}_{k\ge 1}
\]
form an orthonormal basis for $H$. Then using the sine Fourier expansion, we can easily verify that
\[
S(t)x(\tau)=\sum_{k=1}^{\infty}a_ke^{-k^2t}\sqrt{\tfrac{2}{\pi}}\sin k\tau\qquad\forall\ x\in H,\ t\ge 0,\ \tau\in[0,\pi],
\]
with $a_k$ being the $k$-th Fourier coefficient, defined by
\[
a_k\overset{\mathrm{df}}{=}\sqrt{\tfrac{2}{\pi}}\int_0^{\pi}x(s)\sin ks\,ds.
\]


If we set
\[
S_n(t)x(\tau)\overset{\mathrm{df}}{=}\sum_{k=1}^{n}a_ke^{-k^2t}\sqrt{\tfrac{2}{\pi}}\sin k\tau\qquad\forall\ x\in H,\ t\ge 0,\ \tau\in[0,\pi],
\]
then $S_n(t)\in L_f(H)$ (i.e., $S_n(t)$ is of finite rank; see Definition 3.1.23) and for all $t>0$,
\[
S_n(t)x\to S(t)x\quad\text{in } H,\ \text{uniformly on bounded subsets of } H.
\]
Therefore $S(t)$ is compact for $t>0$ (see Proposition 3.1.18).

(b) Let $X=H=L^2(0,2\pi)$ and let $A\colon X\supseteq D(A)\to H$ be defined by
\[
Ax\overset{\mathrm{df}}{=}\frac{dx}{dt}\qquad\forall\ x\in D(A),
\]
with
\[
D(A)\overset{\mathrm{df}}{=}\big\{x\in W^{2,2}(0,2\pi):\ x(0)=x(2\pi),\ x'(0)=x'(2\pi)\big\}.
\]
A simple integration by parts reveals that $A$ is monotone (i.e., accretive). Also for every $\lambda>0$ and every $h\in L^2(0,2\pi)$, the problem
\[
\begin{cases}
x(t)+\lambda\frac{dx}{dt}(t)=h(t)&\forall\ t\in[0,2\pi],\\
x(0)=x(2\pi),\\
x'(0)=x'(2\pi),
\end{cases}
\]
has a unique solution $x\in W^{2,2}(0,2\pi)$ and so $A$ is maximal monotone (i.e., $A$ is m-accretive). Hence by Theorem 3.3.49, $-A$ generates a contraction semigroup $S$ on $H$. This semigroup is defined by
\[
S(t)x(\tau)=
\begin{cases}
x(\tau+t)&\text{if } \tau+t\in[0,2\pi],\\
x(\tau+t-2\pi)&\text{if } \tau+t>2\pi,
\end{cases}
\]
for all $x\in H$, $\tau\in[0,2\pi]$ and $t\ge 0$. Then for every $t\ge 0$, $S(t)$ is an isometry on $H$ and so $S(t)$ is not compact for $t>0$.

REMARK 3.3.63 Roughly speaking, we can say that compact semigroups of nonexpansive maps are generated by m-accretive operators acting in a finite dimensional Banach space or by m-accretive operators arising in the study of parabolic problems. In contrast, hyperbolic problems (even very simple ones) generate noncompact semigroups. The compactness of a semigroup $S$ of nonexpansive maps is closely related to the compactness of the nonexpansive operators $J_\lambda=(\mathrm{id}_X+\lambda A)^{-1}$ (see Definition 3.3.11 and Proposition 3.3.12(a)). For this reason we need a result determining the relationship between $S(t)$ and $J_\lambda$, $t,\lambda>0$.
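The finite rank approximation in Example 3.3.62(a) can be illustrated numerically. The sketch below assumes the particular initial datum $x(s)=s(\pi-s)$ (chosen only for illustration, with $\|x\|_H^2=\pi^5/30$), computes its sine coefficients by a midpoint rule and compares the truncation error $\|S(t)x-S_n(t)x\|_H$ with the bound $e^{-(n+1)^2t}\|x\|_H$, which follows from the orthonormality of the eigenfunctions. All helper names are hypothetical.

```python
from math import sin, sqrt, pi, exp

def sine_coefficient(x, k, nodes=4000):
    """a_k = sqrt(2/pi) * int_0^pi x(s) sin(k s) ds, approximated by the midpoint rule."""
    h = pi / nodes
    total = sum(x((j + 0.5) * h) * sin(k * (j + 0.5) * h) for j in range(nodes))
    return sqrt(2.0 / pi) * total * h

def tail_norm(coeffs, n, t):
    """H-norm of S(t)x - S_n(t)x, truncated to the coefficients that were computed."""
    return sqrt(sum((a * exp(-k * k * t)) ** 2
                    for k, a in enumerate(coeffs, start=1) if k > n))

x = lambda s: s * (pi - s)            # illustrative initial datum (an assumption)
coeffs = [sine_coefficient(x, k) for k in range(1, 41)]
norm_x = sqrt(pi ** 5 / 30.0)         # ||x||_H for this particular x

t, n = 0.1, 3
error = tail_norm(coeffs, n, t)
bound = exp(-(n + 1) ** 2 * t) * norm_x
```

Since the bound depends on $x$ only through $\|x\|_H$, the convergence $S_n(t)\to S(t)$ is uniform on bounded subsets of $H$ for each fixed $t>0$, which is what the compactness argument uses.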


LEMMA 3.3.64 If $X$ is a Banach space, $C\subseteq X$ is a nonempty, closed set and
\[
S(t)\colon C\to C,\qquad t\ge 0,
\]
is a semigroup of nonexpansive maps, then
\[
\|S(t)x-x\|_X\le\frac{2}{t}\int_0^t\|S(\tau)x-x\|_X\,d\tau\qquad\forall\ x\in C,\ t>0.
\]

PROOF From Definition 3.3.55(b) and (c), we have
\begin{align}
&\Big\|S(t)x-\frac{1}{t}\int_0^tS(\tau)x\,d\tau\Big\|_X \tag{3.101}\\
&\quad=\Big\|\frac{1}{t}\int_0^t\big(S(t)x-S(\tau)x\big)\,d\tau\Big\|_X \tag{3.102}\\
&\quad\le\frac{1}{t}\int_0^t\|S(t-\tau)x-x\|_X\,d\tau \notag\\
&\quad=\frac{1}{t}\int_0^t\|S(\tau)x-x\|_X\,d\tau\qquad\forall\ x\in C,\ t>0. \tag{3.103}
\end{align}
Using this inequality, we obtain
\[
\begin{aligned}
\|S(t)x-x\|_X&\le\Big\|S(t)x-\frac{1}{t}\int_0^tS(\tau)x\,d\tau\Big\|_X+\Big\|\frac{1}{t}\int_0^t\big(S(\tau)x-x\big)\,d\tau\Big\|_X \\
&\le\frac{1}{t}\int_0^t\|S(\tau)x-x\|_X\,d\tau+\frac{1}{t}\int_0^t\|S(\tau)x-x\|_X\,d\tau \\
&=\frac{2}{t}\int_0^t\|S(\tau)x-x\|_X\,d\tau\qquad\forall\ x\in C,\ t>0.
\end{aligned}
\]

In order to derive the desired relations between S(t) and Jλ , with t, λ > 0, we need to return to nonlinear evolution equations and discuss their solvability when the data are nonregular.


So let $T=[0,b]$, let $X$ be a Banach space, $A\colon X\supseteq D(A)\to 2^X$ an m-accretive operator and $f\in L^1(T;X)$. We consider the following nonlinear evolution inclusion:
\[
\begin{cases}
x'(t)+A\big(x(t)\big)\ni f(t)&\forall\ t\in T,\\
x(0)=x_0.
\end{cases}
\tag{3.104}
\]
From Theorem 3.3.28, we know that if $X^*$ is uniformly convex (hence $X$ is reflexive), $x_0\in D(A)$ and $f\in W^{1,1}\big((0,b);X\big)$, then there exists a unique $x\in W^{1,\infty}\big((0,b);X\big)$ satisfying (3.104) for almost all $t\in T$. Such a solution is usually called a strong solution. However, if $x_0\in\overline{D(A)}\setminus D(A)$ or $f\in L^1(T;X)\setminus W^{1,1}\big((0,b);X\big)$ or $X$ is not reflexive, then there are examples showing that (3.104) need not have a strong solution (for details we refer to Crandall & Liggett (1971)). So if we want to develop a general theory concerning evolution equations of the form (3.104), we need to introduce a new, broader solution concept.

Suppose that $x$ is a strong solution. Then
\[
-x'(t)+f(t)\in A\big(x(t)\big)\qquad\text{for a.a. } t\in T.
\]
Because $A$ is accretive, using Theorem 3.3.10, we obtain
\[
0\le\big(-x'(t)+f(t)-v,\ x(t)-y\big)_+
\]
for a.a. $t\in T$ and all $(y,v)\in\operatorname{Gr}A$, so
\[
\big(x'(t),\ x(t)-y\big)_+\le\big(f(t)-v,\ x(t)-y\big)_+\qquad\text{for a.a. } t\in T.
\]
By virtue of Lemma 3.3.26, we have that
\[
\big(x'(t),\ x(t)-y\big)_+=\frac{1}{2}\frac{d}{dt}\|x(t)-y\|_X^2\qquad\text{for a.a. } t\in T.
\]
So we obtain
\[
\frac{1}{2}\frac{d}{dt}\|x(t)-y\|_X^2\le\big(f(t)-v,\ x(t)-y\big)_+\qquad\text{for a.a. } t\in T.
\]
Integrating both sides of this inequality over $[s,t]\subseteq[0,b]$, we obtain
\[
\|x(t)-y\|_X^2\le\|x(s)-y\|_X^2+2\int_s^t\big(f(\tau)-v,\ x(\tau)-y\big)_+\,d\tau
\]
for all $0\le s\le t\le b$ and all $(y,v)\in\operatorname{Gr}A$. This leads to the introduction of a new, more general solution notion for problem (3.104).


DEFINITION 3.3.65 Let $x_0\in X$ and $f\in L^1(T;X)$. A function $x\colon T\to X$ is said to be an integral solution of the Cauchy problem (3.104), if

(a) $x(0)=x_0$;

(b) $x\in C(T;X)$;

(c) for all $0\le s\le t\le b$ and all $(y,v)\in\operatorname{Gr}A$, we have
\[
\frac{1}{2}\|x(t)-y\|_X^2\le\frac{1}{2}\|x(s)-y\|_X^2+\int_s^t\big(f(\tau)-v,\ x(\tau)-y\big)_+\,d\tau.
\]

REMARK 3.3.66 The previous discussion shows that every strong solution is also an integral solution. Moreover, for every $x_0\in\overline{D(A)}$, the function
\[
x(t)\overset{\mathrm{df}}{=}S(t)x_0=\lim_{n\to+\infty}\Big(\mathrm{id}_X+\frac{t}{n}A\Big)^{-n}x_0
\]
is an integral solution of the autonomous Cauchy problem
\[
\begin{cases}
x'(t)+A\big(x(t)\big)\ni 0&\forall\ t\ge 0,\\
x(0)=x_0.
\end{cases}
\]
For details we refer to Barbu (1976, p. 124). More generally we have the following result due to Bénilan (1972) (see also Barbu (1976, p. 124) and Miyadera (1992, p. 160)).

THEOREM 3.3.67 If $X$ is a Banach space, $A\colon X\supseteq D(A)\to 2^X$ is an m-accretive operator, $x_0\in\overline{D(A)}$ and $f\in L^1(T;X)$, then problem (3.104) has a unique integral solution $x(\cdot;f)\in C(T;X)$. Moreover, if $f_1,f_2\in L^1(T;X)$ and
\[
x_1(\cdot)=x(\cdot;f_1),\qquad x_2(\cdot)=x(\cdot;f_2),
\]
we have
\[
\|x_1(t)-x_2(t)\|_X^2\le\|x_1(s)-x_2(s)\|_X^2+\int_s^t\big(f_1(\tau)-f_2(\tau),\ x_1(\tau)-x_2(\tau)\big)_+\,d\tau
\]
and
\[
\|x_1(t)-x_2(t)\|_X\le\|x_1(s)-x_2(s)\|_X+\int_s^t\|f_1(\tau)-f_2(\tau)\|_X\,d\tau\qquad\forall\ 0\le s\le t\le b.
\]


Now we have all the necessary tools to establish the relation between $S(t)$ and $J_\lambda$ for all $t,\lambda>0$.

PROPOSITION 3.3.68 If $X$ is a Banach space, $A\colon X\supseteq D(A)\to 2^X$ is an m-accretive operator and
\[
S(t)\colon\overline{D(A)}\to\overline{D(A)},\qquad t\ge 0,
\]
is the semigroup of nonexpansive maps generated by $A$, then for all $x_0\in\overline{D(A)}$ and all $t,\lambda>0$, we have

(a) $\|S(t)x_0-x_0\|_X\le\big(2+\frac{t}{\lambda}\big)\|J_\lambda(x_0)-x_0\|_X$;

(b) $\|J_\lambda(x_0)-x_0\|_X\le\dfrac{2}{t}\Big(1+\dfrac{\lambda}{t}\Big)\displaystyle\int_0^t\|S(\tau)x_0-x_0\|_X\,d\tau$.

PROOF (a) From Definition 3.3.55(c) and Theorem 3.3.59, for all $x_0\in\overline{D(A)}$, $(y,v)\in\operatorname{Gr}A$ and $t\ge 0$, we have
\[
\begin{aligned}
\|S(t)x_0-x_0\|_X&\le\|S(t)x_0-S(t)y\|_X+\|S(t)y-y\|_X+\|y-x_0\|_X \\
&\le 2\|x_0-y\|_X+\|S(t)y-y\|_X \\
&\le 2\|x_0-y\|_X+t\|v\|_X.
\end{aligned}
\tag{3.105}
\]
In (3.105), let us set $y=J_\lambda(x_0)$ and $v=A_\lambda(x_0)\in A\big(J_\lambda(x_0)\big)$ (see Proposition 3.3.12(c)). We obtain
\[
\begin{aligned}
\|S(t)x_0-x_0\|_X&\le 2\|x_0-J_\lambda(x_0)\|_X+t\|A_\lambda(x_0)\|_X \\
&\le 2\|x_0-J_\lambda(x_0)\|_X+\frac{t}{\lambda}\|x_0-J_\lambda(x_0)\|_X \\
&=\Big(2+\frac{t}{\lambda}\Big)\|x_0-J_\lambda(x_0)\|_X.
\end{aligned}
\]

(b) We know that $x(t)=S(t)x_0$ is the unique integral solution of the autonomous Cauchy problem
\[
\begin{cases}
x'(t)+A\big(x(t)\big)\ni 0&\forall\ t\in T,\\
x(0)=x_0,
\end{cases}
\]


and
\[
\frac{1}{2}\|S(t)x_0-y\|_X^2\le\frac{1}{2}\|x_0-y\|_X^2+\int_0^t\big(-v,\ S(\tau)x_0-y\big)_+\,d\tau\qquad\forall\ (y,v)\in\operatorname{Gr}A,\ t\ge 0
\]
(see Definition 3.3.65 and Remark 3.3.66). Using Definition 3.3.5 and Lemma 3.3.27, we obtain
\[
\|S(t)x_0-y\|_X\le\|x_0-y\|_X+\frac{1}{\lambda}\int_0^t\big(\|S(\tau)x_0-y-\lambda v\|_X-\|S(\tau)x_0-y\|_X\big)\,d\tau\qquad\forall\ \lambda>0.
\]
Let
\[
y=J_\lambda(x_0)\quad\text{and}\quad v=A_\lambda(x_0).
\]
We have
\[
\|S(t)x_0-J_\lambda(x_0)\|_X\le\|x_0-J_\lambda(x_0)\|_X+\frac{1}{\lambda}\int_0^t\big(\|S(\tau)x_0-x_0\|_X-\|S(\tau)x_0-J_\lambda(x_0)\|_X\big)\,d\tau. \tag{3.106}
\]
From the triangle inequality, we have
\[
-\|S(t)x_0-x_0\|_X\le\|S(t)x_0-J_\lambda(x_0)\|_X-\|J_\lambda(x_0)-x_0\|_X.
\]
Using this in (3.106), we obtain
\[
-\|S(t)x_0-x_0\|_X+\|J_\lambda(x_0)-x_0\|_X\le\|J_\lambda(x_0)-x_0\|_X+\frac{1}{\lambda}\int_0^t\big(2\|S(\tau)x_0-x_0\|_X-\|J_\lambda(x_0)-x_0\|_X\big)\,d\tau,
\]
so
\[
\|J_\lambda(x_0)-x_0\|_X\le\frac{\lambda}{t}\|S(t)x_0-x_0\|_X+\frac{2}{t}\int_0^t\|S(\tau)x_0-x_0\|_X\,d\tau
\]
and from Lemma 3.3.64, we have
\[
\|J_\lambda(x_0)-x_0\|_X\le\frac{2}{t}\Big(1+\frac{\lambda}{t}\Big)\int_0^t\|S(\tau)x_0-x_0\|_X\,d\tau.
\]
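The inequalities of Lemma 3.3.64 and Proposition 3.3.68 can be tested on a concrete scalar semigroup. The following sketch again assumes the illustrative operator $A(y)=y^3$ on $X=\mathbb{R}$ (not taken from the text), whose semigroup is $S(t)x=x/\sqrt{1+2x^2t}$ and whose resolvent solves $y+\lambda y^3=x$; integrals are approximated by a midpoint rule and a small tolerance absorbs rounding. The helper names are hypothetical.

```python
from math import sqrt

def resolvent(x, lam, iters=60):
    """J_lam(x) for the assumed A(y) = y**3 and x >= 0: solve y + lam*y**3 = x by Newton."""
    y = x
    for _ in range(iters):
        y -= (y + lam * y ** 3 - x) / (1.0 + 3.0 * lam * y ** 2)
    return y

def flow(x, t):
    """S(t)x for x' = -x**3 (closed form)."""
    return x / sqrt(1.0 + 2.0 * x * x * t)

def displacement_integral(x, t, nodes=2000):
    """int_0^t |S(tau)x - x| dtau, approximated by the midpoint rule."""
    h = t / nodes
    return h * sum(abs(flow(x, (j + 0.5) * h) - x) for j in range(nodes))

def check(x0, t, lam, tol=1e-9):
    """Return the truth values of Lemma 3.3.64 and Proposition 3.3.68(a), (b)."""
    d_res = abs(resolvent(x0, lam) - x0)
    d_flow = abs(flow(x0, t) - x0)
    integral = displacement_integral(x0, t)
    lemma = d_flow <= (2.0 / t) * integral + tol
    part_a = d_flow <= (2.0 + t / lam) * d_res + tol
    part_b = d_res <= (2.0 / t) * (1.0 + lam / t) * integral + tol
    return lemma, part_a, part_b

results = [check(x0, t, lam)
           for x0 in (0.5, 1.0, 2.0)
           for t in (0.1, 1.0, 5.0)
           for lam in (0.05, 0.5, 3.0)]
all_hold = all(all(r) for r in results)
```

For small $t$ both sides of the inequality in Lemma 3.3.64 behave like $t|A(x)|$, so that estimate is nearly sharp in this example.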


LEMMA 3.3.69 If $X$ is a Banach space, $C\subseteq X$ is a nonempty, closed set, $f_n\colon C\to X$ are compact maps for $n\ge 1$ and
\[
f_n(x)\to f(x)\quad\text{in } X,
\]
uniformly on bounded subsets of $C$, then $f\colon C\to X$ is compact.

PROOF Clearly $f\colon C\to X$ is continuous. Next let $B\subseteq C$ be a bounded set. Then for a given $\varepsilon>0$, we can find $n_0=n_0(\varepsilon,B)\ge 1$, such that
\[
\|f_n(x)-f(x)\|_X<\frac{\varepsilon}{2}\qquad\forall\ n\ge n_0,\ x\in B. \tag{3.107}
\]
For $n\ge n_0$, the set $f_n(B)$ is compact in $X$. So we can find $\{x_k\}_{k=1}^{N_0}$, where $N_0=N_0(n,\varepsilon)\ge 1$, such that
\[
f_n(B)\subseteq\bigcup_{k=1}^{N_0}B_{\varepsilon/2}(x_k). \tag{3.108}
\]
Let $x\in B$. From (3.108), we see that there exists $k\in\{1,\dots,N_0\}$, such that
\[
\|f_n(x)-x_k\|_X<\frac{\varepsilon}{2}. \tag{3.109}
\]
Therefore using (3.107) and (3.109), we have
\[
\|f(x)-x_k\|_X\le\|f(x)-f_n(x)\|_X+\|f_n(x)-x_k\|_X<\varepsilon,
\]
so $f(x)\in B_\varepsilon(x_k)$ and thus
\[
f(B)\subseteq\bigcup_{k=1}^{N_0}B_\varepsilon(x_k),
\]
i.e., $f(B)$ is totally bounded, thus relatively compact in $X$.

DEFINITION 3.3.70 Let $X$ be a Banach space, $C\subseteq X$ nonempty, closed and let
\[
S(t)\colon C\to C,\qquad t\ge 0,
\]
be a semigroup of nonexpansive maps. We say that $S$ is equicontinuous (respectively weakly equicontinuous) if for each bounded set $B\subseteq C$, the family of functions $\{S(\cdot)x\}_{x\in B}$ is equicontinuous (respectively weakly equicontinuous) at each $t>0$.


As expected, compactness and equicontinuity of nonlinear semigroups are closely related.

PROPOSITION 3.3.71 If $X$ is a Banach space, $C\subseteq X$ is a nonempty, closed set and
\[
S(t)\colon C\to C,\qquad t\ge 0,
\]
is a compact semigroup of nonexpansive maps, then $S$ is equicontinuous.

PROOF Let $B\subseteq C$ be a bounded set, let $t>0$ and $\varepsilon>0$, and choose $r\in(0,t)$. Because $S(t-r)B$ is compact, we can find $\{x_k\}_{k=1}^{N(\varepsilon)}\subseteq B$, such that
\[
S(t-r)B\subseteq\bigcup_{k=1}^{N(\varepsilon)}B_{\varepsilon/3}\big(S(t-r)x_k\big). \tag{3.110}
\]
For each $k\in\{1,\dots,N(\varepsilon)\}$, the map $t\mapsto S(t)x_k$ is continuous on $\mathbb{R}_+$ and so we can find $\delta=\delta(\varepsilon,t)\in(0,r)$, such that
\[
\|S(t+h)x_k-S(t)x_k\|_X\le\frac{\varepsilon}{3}\qquad\forall\ k\in\{1,\dots,N(\varepsilon)\},\ h\in[-\delta,\delta]. \tag{3.111}
\]
Then because of (3.110), for each $x\in B$, we can find $k\in\{1,\dots,N(\varepsilon)\}$, such that
\[
\|S(t-r)x-S(t-r)x_k\|_X\le\frac{\varepsilon}{3}. \tag{3.112}
\]
So using the semigroup property and (3.111) and (3.112), we obtain
\[
\begin{aligned}
\|S(t+h)x-S(t)x\|_X&\le\|S(t+h)x-S(t+h)x_k\|_X+\|S(t+h)x_k-S(t)x_k\|_X+\|S(t)x_k-S(t)x\|_X \\
&\le 2\|S(t-r)x-S(t-r)x_k\|_X+\|S(t+h)x_k-S(t)x_k\|_X\le\varepsilon
\end{aligned}
\]
for all $x\in B$ and all $h\in[-\delta,\delta]$. This proves that $\{S(\cdot)x\}_{x\in B}$ is equicontinuous at every $t>0$.

Now we are ready to present a characterization of compact nonlinear semigroups.


THEOREM 3.3.72 If $X$ is a Banach space, $A\colon X\supseteq D(A)\to 2^X$ is an m-accretive operator and $S(t)\colon\overline{D(A)}\to\overline{D(A)}$, $t\ge 0$, is the semigroup of nonexpansive maps generated by $A$ according to Theorem 3.3.59, then the following statements are equivalent:

(a) the semigroup $S$ is compact;

(b) for each $\lambda>0$, $J_\lambda$ is a compact map and the semigroup $S$ is equicontinuous.

PROOF "(a)$\Longrightarrow$(b)": The equicontinuity of $S$ follows from Proposition 3.3.71. So we need to show that for each $\lambda>0$, $J_\lambda$ is a compact map. From Theorem 3.3.59, we have
\[
\|S(t)\circ J_\lambda(x)-J_\lambda(x)\|_X\le t\|A_\lambda(x)\|_X=\frac{t}{\lambda}\|x-J_\lambda(x)\|_X\qquad\forall\ x\in X,\ \lambda>0 \tag{3.113}
\]
(recall that $A_\lambda(x)\in A\big(J_\lambda(x)\big)$; see Proposition 3.3.12(c)). Because $J_\lambda$ is nonexpansive, it maps bounded sets to bounded sets. So from (3.113), it follows that
\[
S(t)\circ J_\lambda\to J_\lambda\quad\text{as } t\to 0^+,
\]
uniformly on bounded sets of $X$. But $S(t)\circ J_\lambda$ is compact, since $S(t)$ is. Therefore Lemma 3.3.69 implies that $J_\lambda$ is compact.

"(b)$\Longrightarrow$(a)": Using Proposition 3.3.68(b), we have
\[
\big\|\big(J_\lambda\circ S(t)\big)(x)-S(t)x\big\|_X\le\frac{4}{\lambda}\int_0^\lambda\|S(t+\tau)x-S(t)x\|_X\,d\tau\qquad\forall\ \lambda,t>0,\ x\in\overline{D(A)}. \tag{3.114}
\]
Because $S$ is equicontinuous, for every bounded set $B\subseteq\overline{D(A)}$ and for each $t>0$, we can find $\omega\colon\mathbb{R}_+\to\mathbb{R}_+$, such that
\[
\lim_{r\searrow 0}\omega(r)=0
\]
and
\[
\|S(t+\tau)x-S(t)x\|_X\le\omega(\tau)\qquad\forall\ \tau\ge 0,\ x\in B. \tag{3.115}
\]
Using (3.115) in (3.114), we obtain
\[
\big\|\big(J_\lambda\circ S(t)\big)(x)-S(t)x\big\|_X\le 4\sup_{\tau\in[0,\lambda]}\omega(\tau)\qquad\forall\ x\in B,
\]


so
\[
J_\lambda\circ S(t)\to S(t)\quad\text{as } \lambda\searrow 0,
\]
uniformly on bounded subsets of $\overline{D(A)}$. Note that $J_\lambda\circ S(t)$ is compact (see the first part of (b)). So, by Lemma 3.3.69, $S(t)$ is compact for all $t>0$.

In the case of linear semigroups, the above theorem takes the following form.

COROLLARY 3.3.73 If $X$ is a Banach space, $A\colon X\supseteq D(A)\to X$ is a densely defined, linear, m-accretive operator and $S(t)\colon X\to X$, $t\ge 0$, is the contraction semigroup generated by $-A$ according to Theorem 3.3.49, then the following statements are equivalent:

(a) the semigroup $S$ is compact;

(b) for each $\lambda>0$, $J_\lambda$ is a compact operator and the map $t\mapsto S(t)$ is continuous from $\mathbb{R}_+$ into $L(X)$ with the operator norm topology.

REMARK 3.3.74 Using the resolvent identity (see Remark 3.3.13), we see that in Theorem 3.3.72 and in Corollary 3.3.73, the map $J_\lambda$ is compact for all $\lambda>0$ if and only if it is compact for some $\lambda>0$.

The next proposition gives an equivalent condition for the map $J_\lambda$ to be compact.

PROPOSITION 3.3.75 If $X$ is a Banach space and $A\colon X\supseteq D(A)\to 2^X$ is an m-accretive operator, then the following statements are equivalent:

(a) for each $\lambda>0$, $J_\lambda$ is a compact map;

(b) for every $m>0$, the level set
\[
L_m\overset{\mathrm{df}}{=}\big\{x\in D(A):\ \|x\|_X+|A(x)|\le m\big\}
\]
is relatively compact in $X$.

PROOF "(a)$\Longrightarrow$(b)": From Proposition 3.3.12(d), we have
\[
\|A_\lambda(x)\|_X\le|A(x)|\le m\qquad\forall\ x\in L_m,\ \lambda>0,
\]
so
\[
\|x-J_\lambda(x)\|_X\le m\lambda\qquad\forall\ x\in L_m,\ \lambda>0
\]
and thus
\[
J_\lambda\to\mathrm{id}_{L_m}\quad\text{as } \lambda\searrow 0,\ \text{uniformly on } L_m.
\]
From Lemma 3.3.69, it follows that $L_m$ (which is bounded) is relatively compact.

"(b)$\Longrightarrow$(a)": Let $B\subseteq X$ be bounded and $\lambda>0$. Since $J_\lambda$ is nonexpansive, $J_\lambda(B)$ is bounded. Because
\[
A_\lambda(x)\in A\big(J_\lambda(x)\big)\qquad\forall\ x\in X,
\]
we have
\[
\begin{aligned}
\|J_\lambda(x)\|_X+\big|A\big(J_\lambda(x)\big)\big|&\le\|J_\lambda(x)\|_X+\|A_\lambda(x)\|_X \\
&=\|J_\lambda(x)\|_X+\frac{1}{\lambda}\|x-J_\lambda(x)\|_X\qquad\forall\ x\in X.
\end{aligned}
\]
So there exists $m>0$ large enough, such that $J_\lambda(B)\subseteq L_m$, hence $J_\lambda(B)$ is relatively compact and so $J_\lambda$ is a compact map.

In Section 4.3, we will return to semigroups, when we examine the subdifferential of a convex function. Before concluding this section, we would like to make an interesting remark concerning accretive operators.

REMARK 3.3.76 In a Hilbert space maximal accretivity (i.e., maximal monotonicity) and m-accretivity coincide (see Theorem 3.2.29). In a general Banach space this is no longer true. For a counterexample we refer to Crandall & Liggett (1971) (see also Miyadera (1992, pp. 42–44)).

3.4 The Nemytskii Operator and Integral Functions

In this section we first examine the Nemytskii (or superposition) operator, an important nonlinear operator that arises in many applications. Then we pass to the study of nonlinear integral functionals, which leads naturally to the topic of the next section, the theory of Young measures.

Consider a set $\Omega$, which in most cases is a measure space or a metric space or both, and let $X$, $Y$ be two Hausdorff topological spaces, which in our analysis as well as in most applications are either Euclidean spaces or Banach spaces. Let $f\colon\Omega\times X\to Y$ and consider the nonlinear operator
\[
N_f(u)(z)\overset{\mathrm{df}}{=}f\big(z,u(z)\big)\qquad\forall\ z\in\Omega,
\]
which to each function $u\colon\Omega\to X$ assigns the $Y$-valued function $z\mapsto f\big(z,u(z)\big)$. This operator is known in the literature as the Nemytskii operator corresponding to the function $f$ (also known as the superposition operator of $f$, the composition operator of $f$, or the substitution operator of $f$).

Since in many applications the Nemytskii operator $N_f$ acts on a Lebesgue space $L^p$, it is important to know under what conditions $N_f$ maps $L^p$ into another Lebesgue space $L^r$. It turns out that this leads to a particular growth condition on $f$, namely $f(z,x)=O\big(|x|^{p/r}\big)$, which is both a necessary and a sufficient condition for $N_f$ to act between $L^p$ and $L^r$. This is the well known Krasnoselskii theorem, which we prove here in a more general form, namely when $N_f$ acts on Lebesgue-Bochner spaces. We start with a definition.

DEFINITION 3.4.1 Let $(\Omega,\Sigma)$ be a measurable space and let $X$, $Y$ be two Hausdorff spaces. A function $f\colon\Omega\times X\to Y$ is said to be a Carathéodory function, if

(a) for every $x\in X$, the function $z\mapsto f(z,x)$ is $\big(\Sigma,\mathcal{B}(Y)\big)$-measurable, with $\mathcal{B}(Y)$ being the Borel $\sigma$-field of $Y$;

(b) for every $z\in\Omega$, the function $x\mapsto f(z,x)$ is continuous.

REMARK 3.4.2 If $X$ is a separable metric space, $Y$ is a metric space and $f$ is a Carathéodory function, then the function $(z,x)\mapsto f(z,x)$ is $\Sigma\times\mathcal{B}(X)$-measurable, with $\mathcal{B}(X)$ being the Borel $\sigma$-field of $X$ (i.e., $f$ is jointly measurable). Therefore $f$ is sup-measurable (superpositionally measurable), meaning that for every measurable function $u\colon\Omega\to X$, the function $z\mapsto f\big(z,u(z)\big)$ is measurable, i.e., the Nemytskii operator $N_f$ maps measurable functions to measurable ones (for details see Denkowski, Migórski & Papageorgiou (2003a, pp. 189–190)).
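The definition of $N_f$ translates directly into a higher order function. The sketch below is only an illustration with $\Omega=[0,1]$, $X=Y=\mathbb{R}$ and Lebesgue measure; the names `nemytskii` and `lp_norm_power` are hypothetical, and the integrand $f(z,x)=|x|^{p/r}$ is chosen because it attains the growth rate $|x|^{p/r}$ exactly, so that $\|N_f(u)\|_{L^r}^r=\|u\|_{L^p}^p$.

```python
def nemytskii(f):
    """Return N_f, which maps a function u to the function z -> f(z, u(z))."""
    def N_f(u):
        return lambda z: f(z, u(z))
    return N_f

def lp_norm_power(u, p, a=0.0, b=1.0, nodes=2000):
    """int_a^b |u(z)|**p dz, approximated by the midpoint rule (Lebesgue measure on [a, b])."""
    h = (b - a) / nodes
    return h * sum(abs(u(a + (j + 0.5) * h)) ** p for j in range(nodes))

p, r = 2.0, 4.0
f = lambda z, x: abs(x) ** (p / r)    # growth exactly |x|^(p/r): the case a = 0, c = 1
N_f = nemytskii(f)

u = lambda z: z                        # an element of L^p(0, 1)
lhs = lp_norm_power(N_f(u), r)         # ||N_f(u)||_{L^r}^r
rhs = lp_norm_power(u, p)              # ||u||_{L^p}^p
```

For this choice both quantities approximate $\int_0^1 z^2\,dz=1/3$, which shows in the simplest case why the exponent $p/r$ is the natural one for a map from $L^p$ into $L^r$.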


In what follows, to avoid repeating the same hypotheses, we fix $(\Omega,\Sigma,\mu)$ to be a nonatomic, $\sigma$-finite, complete measure space (in applications usually $\Omega$ is a subset of $\mathbb{R}^N$, equipped with the Lebesgue measure) and $X$, $Y$ are two separable Banach spaces.

LEMMA 3.4.3 If $h\colon\Omega\times X\to\mathbb{R}_+$ is a Carathéodory function, such that $h(z,0)=0$ for all $z\in\Omega$ and
\[
\|N_h(u)\|_{L^r(\Omega)}^r\le c^r\qquad\forall\ u\in L^p(\Omega;X),
\]
for some $c>0$, then
\[
\mu(E_k)=0\qquad\forall\ k\ge 1,
\]
where
\[
E_k\overset{\mathrm{df}}{=}\Big\{z\in\Omega:\ \sup_{\|x\|_X\le k}h(z,x)=+\infty\Big\}\qquad\forall\ k\ge 1.
\]

PROOF Suppose that for some $k\ge 1$, we have $\mu(E_k)\ne 0$. Because the measure space is nonatomic and $\sigma$-finite, we can find $B_k\in\Sigma$, such that
\[
B_k\subseteq E_k\quad\text{and}\quad 0<\mu(B_k)<+\infty.
\]
For every $z\in B_k$, let
\[
S_k(z)\overset{\mathrm{df}}{=}\Big\{x\in X:\ \|x\|_X\le k,\ h(z,x)^r\ge\frac{2c^r}{\mu(B_k)}\Big\}.
\]
Evidently
\[
S_k(z)\ne\emptyset\qquad\forall\ z\in B_k
\]
and $\operatorname{Gr}S_k\in(\Sigma\cap B_k)\times\mathcal{B}(X)$, with $\mathcal{B}(X)$ being the Borel $\sigma$-field of $X$. We apply the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) and obtain a $\big(\Sigma,\mathcal{B}(X)\big)$-measurable map $u_k\colon B_k\to X$ such that
\[
u_k(z)\in S_k(z)\qquad\forall\ z\in B_k.
\]
We extend $u_k$ to all of $\Omega$ by setting $u_k(z)=0$ if $z\in\Omega\setminus B_k$. Since
\[
h(z,0)=0\qquad\forall\ z\in\Omega
\]
and $u_k\in L^p(\Omega;X)$, we have
\[
\int_\Omega h\big(z,u_k(z)\big)^r\,d\mu=\int_{B_k}h\big(z,u_k(z)\big)^r\,d\mu\ge 2c^r,
\]
a contradiction.


Using this lemma, we can prove the general version of Krasnoselskii's theorem for $N_f$.

THEOREM 3.4.4 If $f\colon\Omega\times X\to Y$ is a Carathéodory function, $p,r\in[1,+\infty)$ and $N_f$ maps $L^p(\Omega;X)$ into $L^r(\Omega;Y)$, then $N_f$ is continuous, bounded (i.e., maps bounded sets into bounded sets) and there exist $a\in L^r(\Omega)_+$ and $c>0$, such that
\[
\|f(z,x)\|_Y\le a(z)+c\|x\|_X^{p/r}\qquad\text{for } \mu\text{-a.a. } z\in\Omega.
\]

PROOF Let $\{u_n\}_{n\ge 1}\subseteq L^p(\Omega;X)$ be a sequence, such that
\[
u_n\to u\quad\text{in } L^p(\Omega;X),
\]
for some $u\in L^p(\Omega;X)$. Let $g\colon\Omega\times X\to\mathbb{R}$ be defined by
\[
g(z,x)\overset{\mathrm{df}}{=}\big\|f\big(z,x+u(z)\big)-f\big(z,u(z)\big)\big\|_Y^r.
\]
We pick a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$, such that
\[
\|u_{n_k}-u\|_{L^p(\Omega;X)}^p\le\frac{1}{2^k}\qquad\forall\ k\ge 1
\]
and $u_{n_k}(z)\to u(z)$ for $\mu$-a.a. $z\in\Omega$. Let
\[
v_k\overset{\mathrm{df}}{=}u_{n_k}-u\qquad\forall\ k\ge 1.
\]
We have
\[
v_k(z)\to 0\qquad\text{for } \mu\text{-a.a. } z\in\Omega
\]
and so
\[
g\big(z,v_k(z)\big)\to 0\qquad\text{for } \mu\text{-a.a. } z\in\Omega\ \text{as } k\to+\infty.
\]
Because
\[
g(z,x)\ge 0\qquad\forall\ (z,x)\in\Omega\times X
\]
and $v_k(z)\to 0$ for $\mu$-a.a. $z\in\Omega$, we can find $k(z)\in\mathbb{N}$, such that
\[
\xi(z)=\sup_{k\ge 1}g\big(z,v_k(z)\big)=g\big(z,v_{k(z)}(z)\big).
\]


Let
\[
\hat{v}(z)\overset{\mathrm{df}}{=}v_{k(z)}(z).
\]
Since $\xi$ is $\Sigma$-measurable, we see that the function $z\mapsto\hat{v}(z)$ is $\Sigma$-measurable. Moreover, we have
\[
\int_\Omega\|\hat{v}(z)\|_X^p\,d\mu\le\int_\Omega\sup_{k\ge 1}\|v_k(z)\|_X^p\,d\mu\le\sum_{k=1}^{\infty}\|v_k\|_{L^p(\Omega;X)}^p=\sum_{k=1}^{\infty}\|u_{n_k}-u\|_{L^p(\Omega;X)}^p<+\infty,
\]
so $\hat{v}\in L^p(\Omega;X)$. Then from the definition of $g$ and the hypothesis that $N_f$ maps $L^p(\Omega;X)$ into $L^r(\Omega;Y)$, we infer that
\[
g\big(\cdot,\hat{v}(\cdot)\big)\in L^1(\Omega)_+.
\]
Since
\[
g\big(z,v_k(z)\big)\le g\big(z,\hat{v}(z)\big)\qquad\forall\ z\in\Omega,\ k\ge 1
\]
and
\[
g\big(z,v_k(z)\big)\to 0\qquad\text{for } \mu\text{-a.a. } z\in\Omega,
\]
from the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that
\[
\int_\Omega g\big(z,v_k(z)\big)\,d\mu\to 0.
\]
Therefore
\[
N_f(u_{n_k})\to N_f(u)\quad\text{in } L^r(\Omega;Y).
\]
Since every subsequence of $\{N_f(u_n)\}_{n\ge 1}$ has a further subsequence converging in $L^r(\Omega;Y)$ to $N_f(u)$, we conclude that
\[
N_f(u_n)\to N_f(u)\quad\text{in } L^r(\Omega;Y)
\]
and so the map $N_f\colon L^p(\Omega;X)\to L^r(\Omega;Y)$ is continuous.

Next we prove the boundedness of $N_f$. For $u\in L^p(\Omega;X)$, let
\[
\hat{f}(z,x)\overset{\mathrm{df}}{=}f\big(z,x+u(z)\big)-f\big(z,u(z)\big).
\]


Evidently $\hat{f}$ is a Carathéodory function, $N_{\hat{f}}$ maps $L^p(\Omega;X)$ into $L^r(\Omega;Y)$ and in addition
\[
\hat{f}(z,0)=0\qquad\forall\ z\in\Omega.
\]
So without any loss of generality, we may assume that
\[
f(z,0)=0\qquad\forall\ z\in\Omega.
\]
Since $N_f$ is continuous at $0$, we can find $\varrho>0$, such that
\[
\|N_f(u)\|_{L^r(\Omega;Y)}\le 1\qquad\forall\ \|u\|_{L^p(\Omega;X)}\le\varrho.
\]
Then take an arbitrary $u\in L^p(\Omega;X)$ and let $n\ge 0$ be an integer, such that
\[
n\varrho^p\le\|u\|_{L^p(\Omega;X)}^p\le(n+1)\varrho^p.
\]
We write
\[
\Omega=\bigcup_{k=1}^{n+1}\Omega_k
\]
as a disjoint union, such that
\[
\|u\|_{L^p(\Omega_k;X)}^p\le\varrho^p\qquad\forall\ k\in\{1,\dots,n+1\}.
\]
Then we have
\[
\int_\Omega\big\|f\big(z,u(z)\big)\big\|_Y^r\,d\mu=\sum_{k=1}^{n+1}\int_{\Omega_k}\big\|f\big(z,u(z)\big)\big\|_Y^r\,d\mu\le n+1\le\Big(\frac{\|u\|_{L^p(\Omega;X)}}{\varrho}\Big)^p+1,
\]
which proves that $N_f$ is bounded.

Finally we prove the growth condition. Since $N_f$ is bounded, we can find $c>0$, such that
\[
\|N_f(u)\|_{L^r(\Omega;Y)}\le c\qquad\forall\ \|u\|_{L^p(\Omega;X)}\le 1. \tag{3.116}
\]
Let $h\colon\Omega\times X\to\mathbb{R}$ be defined by
\[
h(z,x)\overset{\mathrm{df}}{=}\Big[\|f(z,x)\|_Y-c\|x\|_X^{p/r}\Big]^+.
\]
Using the inequality which says that
\[
(\xi_1-\xi_2)^r\le\xi_1^r-\xi_2^r\qquad\forall\ \xi_1\ge\xi_2\ge 0,
\]


we have
\[
h(z,x)^r\le\|f(z,x)\|_Y^r-c^r\|x\|_X^p\qquad\text{when } h(z,x)>0. \tag{3.117}
\]
Let $u\in L^p(\Omega;X)$ and let
\[
C\overset{\mathrm{df}}{=}\big\{z\in\Omega:\ h\big(z,u(z)\big)>0\big\}.
\]
Then we can find an integer $n\ge 0$ and $\varepsilon\in[0,1)$, such that
\[
\int_C\|u(z)\|_X^p\,d\mu=n+\varepsilon.
\]
So we can write
\[
C=\bigcup_{k=1}^{n+1}C_k,
\]
a disjoint union, such that
\[
\int_{C_k}\|u(z)\|_X^p\,d\mu\le 1\qquad\forall\ k\in\{1,\dots,n+1\}.
\]
Then assuming as before, without any loss of generality, that
\[
f(z,0)=0\qquad\forall\ z\in\Omega
\]
and using (3.116), we obtain
\[
\int_C\big\|f\big(z,u(z)\big)\big\|_Y^r\,d\mu=\sum_{k=1}^{n+1}\int_{C_k}\big\|f\big(z,u(z)\big)\big\|_Y^r\,d\mu\le(n+1)c^r. \tag{3.118}
\]
Returning to (3.117) and using (3.118), we have
\[
\int_\Omega h\big(z,u(z)\big)^r\,d\mu\le(n+1)c^r-(n+\varepsilon)c^r\le c^r\qquad\forall\ u\in L^p(\Omega;X). \tag{3.119}
\]
So by virtue of Lemma 3.4.3, we have
\[
\mu\Big(\Big\{z\in\Omega:\ \sup_{\|x\|_X\le k}h(z,x)=+\infty\Big\}\Big)=0\qquad\forall\ k\ge 1. \tag{3.120}
\]
Since by hypothesis the measure space is $\sigma$-finite, we can find $\{D_k\}_{k\ge 1}\subseteq\Sigma$, such that
\[
\Omega=\bigcup_{k=1}^{\infty}D_k\quad\text{and}\quad\mu(D_k)<+\infty\qquad\forall\ k\ge 1.
\]


For $z\in D_k$, let
\[
V_k(z)\overset{\mathrm{df}}{=}\Big\{x\in X:\ \|x\|_X\le k,\ \sup_{\|y\|_X\le k}h(z,y)<+\infty,\ \sup_{\|y\|_X\le k}h(z,y)-\frac{1}{k}\le h(z,x)\Big\}.
\]
Because of (3.120),
\[
V_k(z)\ne\emptyset\qquad\text{for } \mu\text{-a.a. } z\in D_k.
\]
Also we have $\operatorname{Gr}V_k\in(\Sigma\cap D_k)\times\mathcal{B}(X)$. So the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) gives a $\big(\Sigma,\mathcal{B}(X)\big)$-measurable map $v_k\colon D_k\to X$, such that
\[
v_k(z)\in V_k(z)\qquad\forall\ z\in D_k.
\]
Extend $v_k$ to all of $\Omega$ by setting $v_k|_{\Omega\setminus D_k}=0$. Let
\[
a(z)\overset{\mathrm{df}}{=}\sup_{x\in X}h(z,x).
\]
Because $h$ is a Carathéodory function and $X$ is separable, $a$ is $\Sigma$-measurable. Also we have
\[
\sup_{\|x\|_X\le k}h(z,x)-\frac{1}{k}\le h\big(z,v_k(z)\big)\le a(z)\qquad\text{for } \mu\text{-a.a. } z\in\Omega,
\]
so
\[
h\big(z,v_k(z)\big)\to a(z)\qquad\text{for } \mu\text{-a.a. } z\in\Omega,\ \text{as } k\to+\infty.
\]
Note that $v_k\in L^p(\Omega;X)$ and so from (3.119), we have
\[
\int_\Omega h\big(z,v_k(z)\big)^r\,d\mu\le c^r\qquad\forall\ k\ge 1.
\]
As $h\ge 0$, we can apply Fatou's lemma (see Theorem A.2.1) and obtain
\[
\int_\Omega a(z)^r\,d\mu\le\liminf_{k\to+\infty}\int_\Omega h\big(z,v_k(z)\big)^r\,d\mu\le c^r
\]
and thus $a\in L^r(\Omega)$. Recalling the definition of $h(z,x)$, we conclude that
\[
\|f(z,x)\|_Y\le a(z)+c\|x\|_X^{p/r}\qquad\text{for } \mu\text{-a.a. } z\in\Omega.
\]


REMARK 3.4.5 By virtue of Theorem 3.4.4, the growth condition
\[
\|f(z,x)\|_Y\le a(z)+c\|x\|_X^{p/r}\qquad\text{for } \mu\text{-a.a. } z\in\Omega,
\]
with $a\in L^r(\Omega)$, $c>0$, is both a necessary and a sufficient condition for the continuity and boundedness of the Nemytskii operator $N_f\colon L^p(\Omega;X)\to L^r(\Omega;Y)$.

If in Theorem 3.4.4 we drop the hypothesis that $f(z,x)$ is a Carathéodory function and we only assume that the function $(z,x)\mapsto f(z,x)$ is $\big(\Sigma\times\mathcal{B}(X),\mathcal{B}(Y)\big)$-measurable and for all $z\in\Omega$, the function $x\mapsto f(z,x)$ is lower semicontinuous, then we no longer have the continuity of the Nemytskii operator $N_f$, even if it maps $L^p(\Omega;X)$ into $L^r(\Omega;Y)$. To see this, let $\Omega=[0,1]$ be equipped with the Lebesgue measure, let $X=Y=\mathbb{R}$ and consider the function
\[
f(x)\overset{\mathrm{df}}{=}
\begin{cases}
1&\text{if } x\ne 0,\\
0&\text{if } x=0.
\end{cases}
\]
Then $N_f$ maps $L^p(\Omega)$ to $L^r(\Omega)$ for every $r\in[1,+\infty)$. However, if we consider
\[
x_n(z)\overset{\mathrm{df}}{=}\frac{z}{n},
\]
then $x_n\to 0$ in $L^p(\Omega)$, but $N_f(x_n)$ does not converge in measure to zero.

Also if $r=+\infty$ and $N_f$ maps $L^p(\Omega;X)$ into $L^\infty(\Omega;Y)$ ($f$ is still a Carathéodory function), then $N_f$ is again bounded and there exists $M>0$, such that
\[
\|f(z,x)\|_Y\le M\qquad\text{for } \mu\text{-a.a. } z\in\Omega\ \text{and all } x\in X.
\]
The proof, which is similar to that of Theorem 3.4.4, is left to the reader. However, $N_f\colon L^p(\Omega;X)\to L^\infty(\Omega;Y)$ is not in general continuous, as the following example illustrates. Let $\Omega=[0,1]$ be equipped with the Lebesgue measure, let $X=Y=\mathbb{R}$ and consider
\[
f(x)\overset{\mathrm{df}}{=}
\begin{cases}
-1&\text{if } x<-1,\\
x&\text{if } -1\le x\le 1,\\
1&\text{if } 1<x.
\end{cases}
\]
If we take $u_n(z)=z^n$, then $u_n\to 0$ in $L^p[0,1]$, for all $p\in[1,+\infty)$. But $N_f(u_n)$ does not converge to zero in $L^\infty[0,1]$.
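The last example of Remark 3.4.5 is easy to reproduce numerically. The sketch below takes $p=2$ (an arbitrary choice), approximates $\|u_n\|_{L^p}$ for $u_n(z)=z^n$ by a midpoint rule, and evaluates $N_f(u_n)$ on a grid containing $z=1$, where the truncation $f$ leaves $u_n$ unchanged; the grid maximum is used as a stand-in for the $L^\infty$ norm, which is legitimate here because $N_f(u_n)$ is continuous. The helper names are hypothetical.

```python
def clamp(x):
    """The function f of the example: truncation of x to the interval [-1, 1]."""
    return max(-1.0, min(1.0, x))

def lp_norm(u, p, nodes=20000):
    """L^p norm on [0, 1] with Lebesgue measure, approximated by the midpoint rule."""
    h = 1.0 / nodes
    return (h * sum(abs(u((j + 0.5) * h)) ** p for j in range(nodes))) ** (1.0 / p)

def sup_on_grid(u, nodes=20000):
    """Maximum of |u| over a uniform grid on [0, 1] that contains both endpoints."""
    return max(abs(u(j / nodes)) for j in range(nodes + 1))

p = 2.0
lp_norms, sup_norms = [], []
for n in (1, 10, 100):
    u_n = lambda z, n=n: z ** n
    lp_norms.append(lp_norm(u_n, p))                        # tends to 0 as n grows
    sup_norms.append(sup_on_grid(lambda z: clamp(u_n(z))))  # stays equal to 1
```

The exact values are $\|u_n\|_{L^p}=(np+1)^{-1/p}$, while $\|N_f(u_n)\|_{L^\infty}=1$ for every $n$.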


PROPOSITION 3.4.6 If $f\colon\Omega\times\mathbb{R}^N\to\mathbb{R}^N$ is a Carathéodory function, for all $z\in\Omega$, $f(z,\cdot)$ is a monotone map and $N_f$ maps $L^p(\Omega;\mathbb{R}^N)$ into $L^{p'}(\Omega;\mathbb{R}^N)$, where $p\in[1,+\infty)$, $\frac{1}{p}+\frac{1}{p'}=1$, then $N_f\colon L^p(\Omega;\mathbb{R}^N)\to L^{p'}(\Omega;\mathbb{R}^N)$ is a maximal monotone operator.

PROOF If by $\langle\cdot,\cdot\rangle_{pp'}$ we denote the duality brackets for the pair of spaces $\big(L^{p'}(\Omega;\mathbb{R}^N),L^p(\Omega;\mathbb{R}^N)\big)$, then for all $u,v\in L^p(\Omega;\mathbb{R}^N)$, we have
\[
\big\langle N_f(u)-N_f(v),\ u-v\big\rangle_{pp'}=\int_\Omega\big(f\big(z,u(z)\big)-f\big(z,v(z)\big),\ u(z)-v(z)\big)_{\mathbb{R}^N}\,d\mu\ge 0
\]
due to the monotonicity of $f(z,\cdot)$. Hence $N_f$ is monotone. Moreover, by Theorem 3.4.4, $N_f$ is continuous. Therefore, Proposition 3.2.19 implies that $N_f$ is maximal monotone.

PROPOSITION 3.4.7 If $f\colon\Omega\times\mathbb{R}^N\to\mathbb{R}^N$ is a Carathéodory function, such that

(i) $f(z,\cdot)$ is a strictly monotone map for $\mu$-almost all $z\in\Omega$;

(ii) $\big(f(z,x),x\big)_{\mathbb{R}^N}\ge c_1\|x\|_{\mathbb{R}^N}^p-a_1(z)$ for $\mu$-almost all $z\in\Omega$ and all $x\in\mathbb{R}^N$, with $a_1\in L^1(\Omega)_+$, $c_1>0$;

(iii) $N_f$ maps $L^p(\Omega;\mathbb{R}^N)$ into $L^{p'}(\Omega;\mathbb{R}^N)$, with $p\in[1,+\infty)$, $\frac{1}{p}+\frac{1}{p'}=1$,

then $N_f\colon L^p(\Omega;\mathbb{R}^N)\to L^{p'}(\Omega;\mathbb{R}^N)$ is an operator of type $(S)_+$ (see Definition 3.2.55(b)).

PROOF Suppose that $\{u_n\}_{n\ge 1}\subseteq L^p(\Omega;\mathbb{R}^N)$ is a sequence, such that
\[
u_n\xrightarrow{\ w\ }u\quad\text{in } L^p(\Omega;\mathbb{R}^N)
\]
and
\[
\limsup_{n\to+\infty}\big\langle N_f(u_n)-N_f(u),\ u_n-u\big\rangle_{pp'}\le 0.
\]
We need to show that
\[
u_n\to u\quad\text{in } L^p(\Omega;\mathbb{R}^N).
\]
From the monotonicity of $N_f$ (see Proposition 3.4.6), we have that
\[
\big\langle N_f(u_n)-N_f(u),\ u_n-u\big\rangle_{pp'}\to 0.
\]


Note that
\[
\big\langle N_f(u_n)-N_f(u),\ u_n-u\big\rangle_{pp'}=\int_\Omega\big(f\big(z,u_n(z)\big)-f\big(z,u(z)\big),\ u_n(z)-u(z)\big)_{\mathbb{R}^N}\,d\mu.
\]
Because of the monotonicity of $f(z,\cdot)$, by passing to a subsequence, we may assume that
\[
\beta_n(z)\overset{\mathrm{df}}{=}\big(f\big(z,u_n(z)\big)-f\big(z,u(z)\big),\ u_n(z)-u(z)\big)_{\mathbb{R}^N}\to 0\qquad\text{for } \mu\text{-a.a. } z\in\Omega
\]
and
\[
|\beta_n(z)|\le k(z)\qquad\text{for } \mu\text{-a.a. } z\in\Omega\ \text{and all } n\ge 1,
\]
with $k\in L^1(\Omega)_+$. From Theorem 3.4.4, we know that for $\mu$-almost all $z\in\Omega$ and all $x\in\mathbb{R}^N$, we have
\[
\|f(z,x)\|_{\mathbb{R}^N}\le a(z)+c\|x\|_{\mathbb{R}^N}^{p-1},
\]
with $a\in L^{p'}(\Omega)_+$, $c>0$ (recall that $\frac{p}{p'}=p-1$). So for all $z\in\Omega\setminus D$, $\mu(D)=0$ and all $n\ge 1$, we have
\[
\begin{aligned}
k(z)\ge\beta_n(z)\ge{}&c_1\big(\|u_n(z)\|_{\mathbb{R}^N}^p+\|u(z)\|_{\mathbb{R}^N}^p\big)
-\|u_n(z)\|_{\mathbb{R}^N}\big(a(z)+c\|u(z)\|_{\mathbb{R}^N}^{p-1}\big) \\
&-\|u(z)\|_{\mathbb{R}^N}\big(a(z)+c\|u_n(z)\|_{\mathbb{R}^N}^{p-1}\big)-2a_1(z).
\end{aligned}
\tag{3.121}
\]
Using Young's inequality (see Proposition A.4.5) with $\varepsilon>0$, we have
\[
c\|u_n(z)\|_{\mathbb{R}^N}\|u(z)\|_{\mathbb{R}^N}^{p-1}\le\frac{\varepsilon}{p}\|u_n(z)\|_{\mathbb{R}^N}^p+\frac{c^{p'}}{\varepsilon p'}\|u(z)\|_{\mathbb{R}^N}^p \tag{3.122}
\]
and
\[
c\|u(z)\|_{\mathbb{R}^N}\|u_n(z)\|_{\mathbb{R}^N}^{p-1}\le\frac{c^p}{\varepsilon p}\|u(z)\|_{\mathbb{R}^N}^p+\frac{\varepsilon}{p'}\|u_n(z)\|_{\mathbb{R}^N}^p. \tag{3.123}
\]
Using (3.122) and (3.123) in (3.121), we obtain
\[
\begin{aligned}
c_1\|u_n(z)\|_{\mathbb{R}^N}^p\le{}&k(z)+c_1\|u(z)\|_{\mathbb{R}^N}^p+\frac{\varepsilon}{p}\|u_n(z)\|_{\mathbb{R}^N}^p+\frac{c^{p'}}{\varepsilon p'}\|u(z)\|_{\mathbb{R}^N}^p \\
&+a(z)\big(\|u_n(z)\|_{\mathbb{R}^N}+\|u(z)\|_{\mathbb{R}^N}\big) \\
&+\frac{c^p}{\varepsilon p}\|u(z)\|_{\mathbb{R}^N}^p+\frac{\varepsilon}{p'}\|u_n(z)\|_{\mathbb{R}^N}^p+2a_1(z).
\end{aligned}
\tag{3.124}
\]
Recall that $\frac{1}{p}+\frac{1}{p'}=1$ and choose $\varepsilon<c_1$. Then from (3.124), it follows that the sequence
\[
\big\{\|u_n(\cdot)\|_{\mathbb{R}^N}^p\big\}_{n\ge 1}\subseteq L^1(\Omega)_+
\]
is uniformly integrable. Also for all $z\in\Omega\setminus D$, $\mu(D)=0$, the sequence $\{u_n(z)\}_{n\ge 1}\subseteq\mathbb{R}^N$ is bounded. So by passing to a suitable subsequence (depending in general on $z\in\Omega\setminus D$), we may assume that $u_n(z)\to\hat{u}(z)$ in $\mathbb{R}^N$. Recall that $f(z,\cdot)$ is continuous and that $\beta_n(z)\to 0$, so in the limit we obtain
\[
\big(f\big(z,\hat{u}(z)\big)-f\big(z,u(z)\big),\ \hat{u}(z)-u(z)\big)_{\mathbb{R}^N}=0.
\]
Since by hypothesis $f(z,\cdot)$ is strictly monotone, we infer that
\[
\hat{u}(z)=u(z)\qquad\forall\ z\in\Omega\setminus D
\]
and so it follows that
\[
u_n(z)\to u(z)\quad\text{in } \mathbb{R}^N,\qquad\forall\ z\in\Omega\setminus D,\ \mu(D)=0. \tag{3.125}
\]
From (3.125), the uniform integrability of the sequence $\big\{\|u_n(\cdot)\|_{\mathbb{R}^N}^p\big\}_{n\ge 1}\subseteq L^1(\Omega)_+$ and Vitali's theorem (the extended dominated convergence theorem; see Theorem A.2.9), we obtain that
\[
\|u_n\|_{L^p(\Omega;\mathbb{R}^N)}\to\|u\|_{L^p(\Omega;\mathbb{R}^N)}.
\]
Since
\[
u_n\xrightarrow{\ w\ }u\quad\text{in } L^p(\Omega;\mathbb{R}^N)
\]
and the latter space is uniformly convex, from the Kadec-Klee property (see Remark A.3.22), we know that
\[
u_n\to u\quad\text{in } L^p(\Omega;\mathbb{R}^N),
\]
hence $N_f$ is of type $(S)_+$.

Next we pass to the study of integral functionals defined on Lebesgue-Bochner spaces. So if $(\Omega,\Sigma,\mu)$ is a nonatomic, complete, $\sigma$-finite measure space, $X$ a separable Banach space and $f\colon\Omega\times X\to\overline{\mathbb{R}}=\mathbb{R}\cup\{+\infty\}$ is a $\Sigma\times\mathcal{B}(X)$-measurable function (an integrand), we consider the integral functional
\[
I_f(u)\overset{\mathrm{df}}{=}\int_\Omega f\big(z,u(z)\big)\,d\mu\qquad\forall\ u\in L^p(\Omega;X),
\]
with $p\in[1,+\infty]$. We start with a definition which extends the notion of a Carathéodory function (see Definition 3.4.1).


DEFINITION 3.4.8 Let $(\Omega, \Sigma, \mu)$ be a complete σ-finite measure space and $X$ a separable metric space. We say that $f : \Omega \times X \longrightarrow \overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$ is a normal integrand, if
(a) $f$ is $\Sigma \times \mathcal{B}(X)$-measurable; and
(b) the function $x \longmapsto f(z, x)$ is lower semicontinuous for µ-almost all $z \in \Omega$.

We show that normal integrands which are bounded below can be realized as the upper envelope of a sequence of Carathéodory integrands.

PROPOSITION 3.4.9 If $(\Omega, \Sigma, \mu)$ is a complete σ-finite measure space, $X$ is a separable metric space with metric $d_X$, $f : \Omega \times X \longrightarrow \overline{\mathbb{R}}$ is a normal integrand and there exists a function $h : \Omega \longrightarrow \mathbb{R}$ (not necessarily measurable), such that
\[
h(z) \leq f(z, x) \qquad \text{for µ-a.a. } z \in \Omega \text{ and all } x \in X,
\]
then we can find a sequence of functions $f_n : \Omega \times X \longrightarrow \mathbb{R}$ for $n \geq 1$, such that for all $n \geq 1$, we have
(a) $h(z) \leq f_n(z, x) \leq n$ for µ-almost all $z \in \Omega$ and all $x \in X$;
(b) the function $z \longmapsto f_n(z, x)$ is measurable for all $x \in X$;
(c) $|f_n(z, x) - f_n(z, v)| \leq n\, d_X(x, v)$ for all $z \in \Omega$ and all $x, v \in X$;
(d) $f_n(z, x) \nearrow f(z, x)$ for µ-almost all $z \in \Omega$ and all $x \in X$.

PROOF

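For every $n \geq 1$, let
\[
\widehat{f}_n(z, x) \stackrel{df}{=} \inf_{y \in X} \big[ f(z, y) + n\, d_X(y, x) \big].
\]
Evidently, for all $n \geq 1$, µ-almost all $z \in \Omega$ and all $x \in X$, we have
\[
h(z) \leq \widehat{f}_1(z, x) \leq \ldots \leq \widehat{f}_n(z, x) \leq \widehat{f}_{n+1}(z, x) \leq \ldots \leq f(z, x).
\]
If we fix $n \geq 1$, $x \in X$ and $\lambda \in \mathbb{R}$, we have
\[
\big\{ z \in \Omega : \widehat{f}_n(z, x) < \lambda \big\} = \mathrm{proj}_\Omega \big\{ (z, y) \in \Omega \times X : f(z, y) + n\, d_X(y, x) < \lambda \big\}.
\]
By virtue of the joint measurability of $f$ (see Definition 3.4.8), we deduce that
\[
C_\lambda \stackrel{df}{=} \big\{ (z, y) \in \Omega \times X : f(z, y) + n\, d_X(y, x) < \lambda \big\} \in \Sigma \times \mathcal{B}(X).
\]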


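Hence from the Yankov-von Neumann-Aumann projection theorem (see Theorem A.2.32) and since by hypothesis $\Sigma$ is µ-complete, we have $\mathrm{proj}_\Omega C_\lambda \in \Sigma$ and so the function $z \longmapsto \widehat{f}_n(z, x)$ is measurable for all $x \in X$ and all $n \geq 1$. From the definition of $\widehat{f}_n$, we have
\[
\widehat{f}_n(z, x) \leq f(z, y) + n\, d_X(y, x) \qquad \forall\, (z, y) \in \Omega \times X,
\]
so, by the triangle inequality,
\[
\widehat{f}_n(z, x) \leq f(z, y) + n\, d_X(y, v) + n\, d_X(v, x) \qquad \forall\, v \in X
\]
and thus, taking the infimum over $y \in X$,
\[
\widehat{f}_n(z, x) - \widehat{f}_n(z, v) \leq n\, d_X(x, v).
\]
Interchanging the roles of $x$ and $v$ in the above argument, we conclude that
\[
\big| \widehat{f}_n(z, x) - \widehat{f}_n(z, v) \big| \leq n\, d_X(x, v) \qquad \forall\, z \in \Omega,\ x, v \in X.
\]
Therefore $\widehat{f}_n$ is $\Sigma \times \mathcal{B}(X)$-measurable (see Remark 3.4.2). Moreover, since for all $(z, x) \in \Omega \times X$, the sequence $\big\{ \widehat{f}_n(z, x) \big\}_{n \geq 1}$ is increasing, we have
\[
\lim_{n \to +\infty} \widehat{f}_n(z, x) \leq f(z, x) \qquad \forall\, (z, x) \in \Omega \times X. \tag{3.126}
\]
Let $D \subseteq \Omega$ be the µ-null set, such that the function $f(z, \cdot)$ is lower semicontinuous for all $z \in \Omega \setminus D$. Then for all $z \in \Omega \setminus D$, $x \in X$ and $\varepsilon > 0$, let $\{y_n\}_{n \geq 1} \subseteq X$ be a sequence, such that
\[
f(z, y_n) + n\, d_X(y_n, x) \leq \widehat{f}_n(z, x) + \varepsilon.
\]
As $n \to +\infty$, either $\widehat{f}_n(z, x) \nearrow +\infty$, in which case equality holds in (3.126), or otherwise we have $y_n \longrightarrow x$ in $X$. Then because of the lower semicontinuity of $f(z, \cdot)$, we have
\[
f(z, x) \leq \liminf_{n \to +\infty} f(z, y_n) \leq \lim_{n \to +\infty} \widehat{f}_n(z, x) + \varepsilon. \tag{3.127}
\]
Since $z \in \Omega \setminus D$, $x \in X$ and $\varepsilon > 0$ were arbitrary, from (3.126) and (3.127), we infer that $\widehat{f}_n(z, x) \nearrow f(z, x)$ for µ-a.a. $z \in \Omega$ and all $x \in X$. Finally set
\[
f_n(z, x) \stackrel{df}{=} \min\big\{ \widehat{f}_n(z, x),\ n \big\}.
\]
Then the sequence $\{f_n\}_{n \geq 1}$ is the desired sequence.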
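The construction in the proof of Proposition 3.4.9 can be tried out numerically. The following is only a minimal sketch under simplifying assumptions that are not in the text: there is no dependence on $z$, the space $X$ is replaced by a finite grid in $[-1, 1]$, and the infimum is taken over grid points only. The function names and the step function used as the integrand are hypothetical choices made for illustration.

```python
def lipschitz_regularization(f_vals, grid, n):
    """Discrete analogue of fhat_n(x) = inf_y [f(y) + n*|y - x|], with y ranging over the grid."""
    return [min(fy + n * abs(y - x) for fy, y in zip(f_vals, grid)) for x in grid]


def truncated_regularization(f_vals, grid, n):
    """f_n = min(fhat_n, n), the truncation used in the last step of the proof."""
    return [min(v, float(n)) for v in lipschitz_regularization(f_vals, grid, n)]


# Hypothetical example: a lower semicontinuous step function on [-1, 1]
# (value 0 for x <= 0, value 1 for x > 0).
grid = [i / 100 for i in range(-100, 101)]
f_vals = [0.0 if x <= 0 else 1.0 for x in grid]

levels = (1, 2, 4, 8, 16)
approx = {n: truncated_regularization(f_vals, grid, n) for n in levels}

# Values of f_n at x = 0.5 for increasing n; they increase towards f(0.5) = 1.
i_half = grid.index(0.5)
values_at_half = [approx[n][i_half] for n in levels]
print(values_at_half)
```

On the grid, each `approx[n]` stays below `f_vals`, is bounded above by `n`, changes by at most `n` times the grid spacing between neighbouring points, and increases with `n`, which mirrors properties (a), (c) and (d) of the proposition in this toy setting.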


Using this approximation result, we have another characterization of normal integrands. First let us recall the Scorza-Dragoni theorem, which is a parametrized version of Lusin's theorem (see Theorem A.2.11).

THEOREM 3.4.10 (Scorza-Dragoni Theorem) If $\Omega$, $X$ are two Polish spaces (see Definition A.2.29(a)), $Y$ is a separable metric space, µ is a tight Borel measure on $\Omega$ (see Remark 3.4.11) and $f : \Omega \times X \longrightarrow Y$ is a Carathéodory function, then for every $\varepsilon > 0$, we can find a compact set $\Omega_\varepsilon \subseteq \Omega$, with $\mu(\Omega \setminus \Omega_\varepsilon) < \varepsilon$, such that $f|_{\Omega_\varepsilon \times X}$ is continuous.

REMARK 3.4.11 Recall that µ is a tight Borel measure on $\Omega$, if µ is finite and for every $\varepsilon > 0$, we can find a compact subset $K_\varepsilon$ of $\Omega$, such that $\mu(\Omega \setminus K_\varepsilon) < \varepsilon$. On a Polish space every finite Borel measure is tight.

Combining Proposition 3.4.9 with Theorem 3.4.10, we obtain the following characterization of normal integrands.

PROPOSITION 3.4.12 If $\Omega$, $X$ are two Polish spaces, µ is a finite Borel measure on $\Omega$ and $f : \Omega \times X \longrightarrow \overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$, then $f$ is a normal integrand if and only if for every $\varepsilon > 0$ we can find a compact set $K_\varepsilon \subseteq \Omega$, such that $\mu(\Omega \setminus K_\varepsilon) < \varepsilon$ and $f|_{K_\varepsilon \times X}$ is lower semicontinuous.

Now we pass to the study of the integral functional
\[
I_f(u) \stackrel{df}{=} \int_\Omega f\big(z, u(z)\big)\, d\mu \qquad \forall\, u \in L^p(\Omega; X),
\]
with $p \in [1, +\infty]$. In our analysis we shall use the computational convention $+\infty - \infty = +\infty$, which is useful when dealing with integral functionals of $\overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$-valued functions.


THEOREM 3.4.13 If $(\Omega, \Sigma, \mu)$ is a nonatomic, σ-finite measure space, $X$ is a separable Banach space, $f : \Omega \times X \longrightarrow \overline{\mathbb{R}}$ is a normal integrand, the integral functional $I_f$ is not identically $+\infty$ and $p \in [1, +\infty)$, then the following properties are equivalent:
(a) $I_f$ is lower semicontinuous on $L^p(\Omega; X)$ and $I_f(u) > -\infty$ for all $u \in L^p(\Omega; X)$.
(b) $I_f : L^p(\Omega; X) \longrightarrow \overline{\mathbb{R}}$.
(c) There exist $\beta_1 \in \mathbb{R}$ and $\beta_2 > 0$, such that
\[
I_f(u) \geq \beta_1 - \beta_2 \|u\|^p_{L^p(\Omega; X)} \qquad \forall\, u \in L^p(\Omega; X).
\]
(d) There exist $a \in L^1(\Omega)$ and $c > 0$, such that
\[
f(z, x) \geq a(z) - c\, \|x\|^p_X \qquad \text{for µ-a.a. } z \in \Omega \text{ and all } x \in X.
\]

PROOF Clearly implications (d) ⟹ (c) ⟹ (b) and (a) ⟹ (b) hold.

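“(b) ⟹ (d)”: Let us set
\[
g \stackrel{df}{=} \min\{f, 0\}.
\]
We claim that
\[
\int_\Omega g\big(z, u(z)\big)\, d\mu > -\infty \qquad \forall\, u \in L^p(\Omega; X). \tag{3.128}
\]
Suppose that (3.128) does not hold. Then for some $u_0 \in L^p(\Omega; X)$, we have
\[
\int_\Omega g\big(z, u_0(z)\big)\, d\mu = -\infty.
\]
Let
\[
C \stackrel{df}{=} \big\{ z \in \Omega : f\big(z, u_0(z)\big) = g\big(z, u_0(z)\big) \big\} \in \Sigma.
\]
For any given $v \in L^p(\Omega; X)$, we define
\[
\widehat{u} \stackrel{df}{=} \chi_C u_0 + \chi_{C^c} v \in L^p(\Omega; X).
\]
Then we have
\[
-\infty < I_f(\widehat{u}) = \int_C g\big(z, u_0(z)\big)\, d\mu + \int_{C^c} f\big(z, v(z)\big)\, d\mu,
\]
so
\[
\int_{C^c} f\big(z, v(z)\big)\, d\mu = +\infty
\]
and thus $I_f \equiv +\infty$, a contradiction. From (3.128), it follows that $N_g$ (the Nemytskii operator corresponding to $g$) maps $L^p(\Omega; X)$ into $L^1(\Omega)$. Invoking Theorem 3.4.4, we obtain (d).

“(d) ⟹ (a)”: Let $\lambda \in \mathbb{R}$ and consider a sequence $\{u_n\}_{n \geq 1} \subseteq L^p(\Omega; X)$, such that $u_n \longrightarrow u$ in $L^p(\Omega; X)$, for some $u \in L^p(\Omega; X)$ and
\[
I_f(u_n) \leq \lambda \qquad \forall\, n \geq 1.
\]
By passing to a subsequence if necessary, we may also assume that
\[
u_n(z) \longrightarrow u(z) \ \text{ for µ-a.a. } z \in \Omega \quad \text{and} \quad \|u_n(z)\|_X \leq k(z) \ \text{ for µ-a.a. } z \in \Omega,\ n \geq 1,
\]
with $k \in L^p(\Omega)_+$. Then because of (d), the functions $z \longmapsto f\big(z, u_n(z)\big)$ are bounded below by the integrable function $a - c\, k^p$, so we can apply Fatou's lemma (see Theorem A.2.1) and, using the lower semicontinuity of $f(z, \cdot)$, obtain
\[
I_f(u) = \int_\Omega f\big(z, u(z)\big)\, d\mu \leq \int_\Omega \liminf_{n \to +\infty} f\big(z, u_n(z)\big)\, d\mu \leq \liminf_{n \to +\infty} \int_\Omega f\big(z, u_n(z)\big)\, d\mu \leq \lambda,
\]
so $I_f$ is lower semicontinuous on $L^p(\Omega; X)$. Moreover, it is clear that
\[
I_f(u) > -\infty \qquad \forall\, u \in L^p(\Omega; X).
\]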

COROLLARY 3.4.14 If $(\Omega, \Sigma, \mu)$ is a nonatomic, complete, σ-finite measure space, $X$ is a separable Banach space, $f : \Omega \times X \longrightarrow \mathbb{R}$ is a Carathéodory function, such that
\[
|f(z, x)| \leq a(z) + c\, \|x\|^p_X \qquad \text{for µ-a.a. } z \in \Omega \text{ and all } x \in X,
\]
with $a \in L^1(\Omega)_+$, $c > 0$ and $p \in [1, +\infty)$, then $I_f : L^p(\Omega; X) \longrightarrow \mathbb{R}$ is continuous.


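For $p = +\infty$, we can state the following continuity result.

PROPOSITION 3.4.15 If $(\Omega, \Sigma, \mu)$ is a nonatomic, complete, σ-finite measure space, $X$ is a separable Banach space, $f : \Omega \times X \longrightarrow \mathbb{R}$ is a Carathéodory function and for all $r > 0$ we can find $a_r \in L^1(\Omega)_+$, such that
\[
|f(z, x)| \leq a_r(z) \qquad \text{for µ-a.a. } z \in \Omega \text{ and all } \|x\|_X \leq r,
\]
then $I_f : L^\infty(\Omega; X) \longrightarrow \mathbb{R}$ is continuous.

PROOF Suppose that $\{u_n\}_{n \geq 1} \subseteq L^\infty(\Omega; X)$ is a sequence, such that
\[
u_n \longrightarrow u \ \text{ in } L^\infty(\Omega; X),
\]
for some $u \in L^\infty(\Omega; X)$. Let
\[
r \stackrel{df}{=} \sup_{n \geq 1} \|u_n\|_{L^\infty(\Omega; X)} < +\infty.
\]
Then
\[
\big| f\big(z, u_n(z)\big) \big| \leq a_r(z) \qquad \text{for µ-a.a. } z \in \Omega.
\]
Since
\[
f\big(z, u_n(z)\big) \longrightarrow f\big(z, u(z)\big) \qquad \text{for µ-a.a. } z \in \Omega,
\]
from the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that $I_f(u_n) \longrightarrow I_f(u)$.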

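PROPOSITION 3.4.16 If $(\Omega, \Sigma, \mu)$ is a nonatomic, complete, σ-finite measure space, $X$ is a separable Banach space, $f : \Omega \times X \longrightarrow \overline{\mathbb{R}}$ is a normal integrand, such that $f(z, \cdot)$ is convex for µ-almost all $z \in \Omega$, $u_0 \in L^\infty(\Omega; X)$ and $I_f(u_0) \in \mathbb{R}$, then the following conditions are equivalent:
(a) $I_f$ is continuous at $u_0$ with respect to the norm topology on $L^\infty(\Omega; X)$.
(b) There exist $\varepsilon > 0$ and $a \in L^1(\Omega)$, such that
\[
\sup_{\|x\|_X \leq \varepsilon} f\big(z, x + u_0(z)\big) \leq a(z) \qquad \text{for µ-a.a. } z \in \Omega.
\]

PROOF “(a) ⟹ (b)”: Suppose that the implication is not true. Then for any $\varepsilon > 0$, the function
\[
\xi_\varepsilon(z) \stackrel{df}{=} \sup_{\|x\|_X \leq \varepsilon} f\big(z, x + u_0(z)\big)
\]
is not integrable (note that since $f$ is a normal integrand, $\xi_\varepsilon$ is measurable). Clearly
\[
\int_\Omega \xi_\varepsilon(z)\, d\mu = +\infty \qquad \forall\, \varepsilon > 0.
\]
So for every $\varepsilon > 0$ and $N \geq 1$, we can find a measurable function $\xi_{\varepsilon, N} : \Omega \longrightarrow \mathbb{R}$, such that
\[
\int_\Omega \xi_{\varepsilon, N}(z)\, d\mu \geq N \quad \text{and} \quad \xi_{\varepsilon, N}(z) \leq \xi_\varepsilon(z) \qquad \forall\, z \in \Omega.
\]
Then for every $z \in \Omega$, we define
\[
S_{\varepsilon, N}(z) \stackrel{df}{=} \big\{ x \in X : \|x - u_0(z)\|_X \leq \varepsilon,\ f(z, x) \geq \xi_{\varepsilon, N}(z) \big\}.
\]
Since $(z, x) \longmapsto \|x - u_0(z)\|_X$ is a Carathéodory function (hence jointly measurable; see Remark 3.4.2) and $(z, x) \longmapsto f(z, x) - \xi_{\varepsilon, N}(z)$ is a $\Sigma \times \mathcal{B}(X)$-measurable function (since $f$ is a normal integrand), we infer that $\mathrm{Gr}\, S_{\varepsilon, N} \in \Sigma \times \mathcal{B}(X)$. Applying the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33), we obtain a measurable map $u_{\varepsilon, N} : \Omega \longrightarrow X$, such that
\[
u_{\varepsilon, N}(z) \in S_{\varepsilon, N}(z) \qquad \forall\, z \in \Omega.
\]
Clearly $u_{\varepsilon, N} \in L^\infty(\Omega; X)$, $\|u_{\varepsilon, N} - u_0\|_{L^\infty(\Omega; X)} \leq \varepsilon$ and
\[
I_f(u_{\varepsilon, N}) \geq \int_\Omega \xi_{\varepsilon, N}(z)\, d\mu \geq N.
\]
Since $\varepsilon > 0$ and $N \geq 1$ were arbitrary, it follows that the convex integral functional $I_f$ is unbounded from above in every $L^\infty(\Omega; X)$-neighbourhood of $u_0$ and so it cannot be continuous at $u_0$, a contradiction.

“(b) ⟹ (a)”: For every
\[
v \in B_\varepsilon \stackrel{df}{=} \big\{ v \in L^\infty(\Omega; X) : \|v\|_{L^\infty(\Omega; X)} \leq \varepsilon \big\},
\]
we have
\[
I_f(u_0 \pm v) = \int_\Omega f\big(z, u_0(z) \pm v(z)\big)\, d\mu \leq \int_\Omega a(z)\, d\mu = \eta < +\infty.
\]
Since $I_f$ is convex and bounded above in a neighbourhood of $u_0$, it is continuous at $u_0$.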


Thus far we have considered the norm topology on the Lebesgue-Bochner space. If we want to have weak lower semicontinuity, then, as we show, necessarily $f(z, \cdot)$ must be convex. For this purpose we need to use some tools from multivalued analysis, which for the convenience of the reader we recall here. Details can be found in Denkowski, Migórski & Papageorgiou (2003a, Chapter 4).

DEFINITION 3.4.17 Let $Y$ be a separable Banach space and let $G : \Omega \longrightarrow 2^Y \setminus \{\emptyset\}$ be a multivalued (set-valued) map. We say that $G$ is graph measurable, if $\mathrm{Gr}\, G \in \Sigma \times \mathcal{B}(Y)$, where
\[
\mathrm{Gr}\, G \stackrel{df}{=} \big\{ (z, y) \in \Omega \times Y : y \in G(z) \big\}.
\]
Also, for any $p \in [1, +\infty]$, we set
\[
S^p_G \stackrel{df}{=} \big\{ g \in L^p(\Omega; Y) : g(z) \in G(z) \text{ for µ-a.a. } z \in \Omega \big\}
\]
(the set of $L^p$-selections of the multifunction $G$).

The next result from multivalued analysis is the main tool in establishing the necessity of the convexity of $f(z, \cdot)$.

PROPOSITION 3.4.18 If $(\Omega, \Sigma, \mu)$ is a nonatomic, complete, σ-finite measure space, $Y$ is a separable Banach space and $G : \Omega \longrightarrow 2^Y \setminus \{\emptyset\}$ is graph measurable with $S^p_G \neq \emptyset$, $p \in [1, \infty)$, then
\[
\overline{S^p_G}^{\,w} = S^p_{\overline{\mathrm{conv}}\, G},
\]
where $w$ stands for the weak topology on $L^p(\Omega; Y)$.

REMARK 3.4.19 There is an analogous result for $p = +\infty$. More precisely, let $(\Omega, \Sigma, \mu)$ and $Y$ be as in Proposition 3.4.18. We denote by $Y^*_{w^*}$ the dual space of $Y$ furnished with the $w^*$-topology. From Theorem 2.2.12, we know that $L^\infty(\Omega; Y^*_{w^*}) = L^1(\Omega; Y)^*$, where $L^\infty(\Omega; Y^*_{w^*})$ is understood in the sense of Definition 2.2.10. Let $G : \Omega \longrightarrow 2^{Y^*} \setminus \{\emptyset\}$ be a multifunction, such that
\[
\mathrm{Gr}\, G \in \Sigma \times \mathcal{B}\big(Y^*_{w^*}\big) \quad \text{and} \quad S^\infty_G \neq \emptyset.
\]
Then
\[
\overline{S^\infty_G}^{\,w^*} = S^\infty_{\overline{\mathrm{conv}}^{\,w^*} G} \qquad \text{in } L^\infty(\Omega; Y^*_{w^*}).
\]


Using this general result about multifunctions, we can prove the following theorem about the integral functional $I_f$, according to which the weak lower semicontinuity of $I_f$ on $L^1(\Omega; X)$ implies the convexity of $f(z, \cdot)$ for all $z \in \Omega$.

THEOREM 3.4.20 If $(\Omega, \Sigma, \mu)$ is a nonatomic, complete, σ-finite measure space, $X$ is a separable Banach space, $f : \Omega \times X \longrightarrow \overline{\mathbb{R}}$ is a normal integrand, there exists $u_0 \in L^1(\Omega; X)$, such that $I_f(u_0) < +\infty$ and $I_f$ is weakly lower semicontinuous on $L^1(\Omega; X)$, then $f(z, \cdot)$ is convex for all $z \in \Omega$.

PROOF Without any loss of generality, we may assume that
\[
f\big(z, u_0(z)\big) = 0 \qquad \forall\, z \in \Omega
\]
(otherwise replace $f(z, x)$ by $\widehat{f}(z, x) = f(z, x) - f\big(z, u_0(z)\big)$). Consider the multifunction $E : \Omega \longrightarrow 2^{X \times \mathbb{R}}$, defined by
\[
E(z) \stackrel{df}{=} \mathrm{epi}\, f(z, \cdot) = \big\{ (x, \lambda) \in X \times \mathbb{R} : f(z, x) \leq \lambda \big\}
\]
(the epigraph of $f(z, \cdot)$). Since
\[
\big(u_0(z), 0\big) \in E(z) \qquad \forall\, z \in \Omega,
\]
we see that $E$ has nonempty values, which are also closed due to the lower semicontinuity of $f(z, \cdot)$. Moreover, $(u_0, 0) \in S^1_E$.

We claim that $S^1_E$ is weakly closed in $L^1(\Omega; X \times \mathbb{R})$. To this end let $\big\{ (u_\alpha, \lambda_\alpha) \big\}_{\alpha \in J}$ be a net in $S^1_E$ and assume that
\[
u_\alpha \xrightarrow{\ w\ } u \ \text{ in } L^1(\Omega; X) \quad \text{and} \quad \lambda_\alpha \xrightarrow{\ w\ } \lambda \ \text{ in } L^1(\Omega).
\]


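For every $C \in \Sigma$, we have that
\[
\chi_C u_\alpha \xrightarrow{\ w\ } \chi_C u \ \text{ in } L^1(\Omega; X)
\]
and so
\[
\chi_C u_\alpha + \chi_{C^c} u_0 \xrightarrow{\ w\ } \chi_C u + \chi_{C^c} u_0 \ \text{ in } L^1(\Omega; X).
\]
Also
\[
\chi_C \lambda_\alpha \xrightarrow{\ w\ } \chi_C \lambda \ \text{ in } L^1(\Omega). \tag{3.129}
\]
Note that, since $f\big(z, u_0(z)\big) = 0$ for all $z \in \Omega$,
\[
I_f(\chi_C u_\alpha + \chi_{C^c} u_0) = \int_C f\big(z, u_\alpha(z)\big)\, d\mu.
\]
Since by hypothesis $I_f$ is weakly lower semicontinuous on $L^1(\Omega; X)$, we have
\[
\liminf_\alpha I_f(\chi_C u_\alpha + \chi_{C^c} u_0) \geq I_f(\chi_C u + \chi_{C^c} u_0) = \int_C f\big(z, u(z)\big)\, d\mu,
\]
so
\[
\liminf_\alpha \int_C f\big(z, u_\alpha(z)\big)\, d\mu \geq \int_C f\big(z, u(z)\big)\, d\mu.
\]
Because $f\big(z, u_\alpha(z)\big) \leq \lambda_\alpha(z)$ for µ-a.a. $z \in \Omega$, we have
\[
\int_C f\big(z, u_\alpha(z)\big)\, d\mu \leq \int_C \lambda_\alpha(z)\, d\mu,
\]
so, from (3.129), we have
\[
\int_C f\big(z, u(z)\big)\, d\mu \leq \int_C \lambda(z)\, d\mu.
\]
The set $C \in \Sigma$ was arbitrary. Hence it follows that
\[
f\big(z, u(z)\big) \leq \lambda(z) \ \text{ for µ-a.a. } z \in \Omega \quad \text{and} \quad (u, \lambda) \in S^1_E,
\]
which proves that $S^1_E$ is weakly closed in $L^1(\Omega; X \times \mathbb{R})$.

Then by virtue of Proposition 3.4.18, we have that
\[
S^1_E = S^1_{\overline{\mathrm{conv}}\, E}.
\]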


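We claim that this implies that
\[
E(z) = \overline{\mathrm{conv}}\, E(z) \qquad \text{for µ-a.a. } z \in \Omega.
\]
We proceed by contradiction. Suppose that
\[
D(z) = \overline{\mathrm{conv}}\, E(z) \setminus E(z) \neq \emptyset \qquad \forall\, z \in C \in \Sigma,
\]
with $\mu(C) > 0$. Clearly $\mathrm{Gr}\, D \in (\Sigma \cap C) \times \mathcal{B}(X \times \mathbb{R})$ and so by the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) we can find two $(\Sigma \cap C)$-measurable functions
\[
\widehat{u} : C \longrightarrow X \quad \text{and} \quad \widehat{\lambda} : C \longrightarrow \mathbb{R},
\]
such that
\[
\big( \widehat{u}(z), \widehat{\lambda}(z) \big) \in D(z) \qquad \forall\, z \in C.
\]
Exploiting the σ-finiteness of the measure space, we can find $C' \subseteq C$ with
\[
C' \in \Sigma \quad \text{and} \quad 0 < \mu(C') < +\infty,
\]
such that
\[
\chi_{C'} \widehat{u} \in L^1(\Omega; X) \quad \text{and} \quad \chi_{C'} \widehat{\lambda} \in L^1(\Omega).
\]
Then let
\[
u \stackrel{df}{=} \chi_{C'} \widehat{u} + \chi_{(C')^c} u_0 \quad \text{and} \quad \lambda \stackrel{df}{=} \chi_{C'} \widehat{\lambda}.
\]
Evidently $(u, \lambda) \in S^1_{\overline{\mathrm{conv}}\, E} = S^1_E$ and
\[
\big( u(z), \lambda(z) \big) \in D(z) \qquad \forall\, z \in C',
\]
that is, $\big( u(z), \lambda(z) \big) \notin E(z)$ on a set of positive measure, a contradiction. Therefore $E(z) = \overline{\mathrm{conv}}\, E(z)$ for µ-a.a. $z \in \Omega$ and by redefining $E$ on a µ-null set, we may assume that
\[
E(z) = \overline{\mathrm{conv}}\, E(z) \qquad \forall\, z \in \Omega.
\]
This proves the convexity of $f(z, \cdot)$ for all $z \in \Omega$.

In the next section we use the theory of Young measures to prove very general lower semicontinuity results for integral functionals. Moreover, in Section 4.3, we will focus on convex integral functionals.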
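The necessity of convexity in Theorem 3.4.20 can be illustrated with a standard counterexample. The sketch below is not taken from the text: it assumes $\Omega = (0, 1)$ with Lebesgue measure, $X = \mathbb{R}$, the hypothetical non-convex integrand $f(x) = (x^2 - 1)^2$ and a square wave $u_n$ taking the values $\pm 1$. The weak limit of $u_n$ is $0$, which is checked here only against the single test function $\varphi(z) = z$, so this is a numerical indication rather than a proof.

```python
import math


def square_wave(n, z):
    """u_n(z) = +1 or -1, switching sign 2n times on (0, 1)."""
    return 1.0 if math.sin(2.0 * math.pi * n * z) >= 0.0 else -1.0


def double_well(x):
    """Non-convex integrand f(x) = (x^2 - 1)^2 (no dependence on z)."""
    return (x * x - 1.0) ** 2


def integrate(g, m=20000):
    """Midpoint rule on (0, 1) with m cells."""
    return sum(g((k + 0.5) / m) for k in range(m)) / m


n = 50
# I_f(u_n): the integrand vanishes because u_n only takes the values +1 and -1.
energy_along_sequence = integrate(lambda z: double_well(square_wave(n, z)))
# Pairing of u_n with the test function phi(z) = z; it is small for large n.
pairing_with_test_function = integrate(lambda z: square_wave(n, z) * z)
# I_f evaluated at the candidate weak limit u = 0.
energy_at_weak_limit = integrate(lambda z: double_well(0.0))

print(energy_along_sequence, pairing_with_test_function, energy_at_weak_limit)
```

Here $I_f(u_n) = 0$ for every $n$ while $I_f(0) = 1$, so $I_f(0) > \liminf_n I_f(u_n)$ and the functional fails to be weakly lower semicontinuous, consistent with the theorem since this $f$ is not convex.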

3.5 Young Measures

According to Proposition 2.3.38, a sequence $\{u_n\}_{n \geq 1} \subseteq L^1(\Omega)$, which converges weakly but not strongly in $L^1(\Omega)$, oscillates violently around its weak limit. However, in the limit all this information about the faster and faster oscillations is lost and only a mean value is recorded. Of course this is not satisfactory, because if for example on $\{u_n\}_{n \geq 1}$ we act with the Nemytskii operator $N_f$, we cannot say that
\[
N_f(u_n) \xrightarrow{\ w\ } N_f(u) \ \text{ in } L^1(\Omega),
\]
unless $f(z, \cdot)$ is affine. The idea is then to embed the sequence $\{u_n\}_{n \geq 1}$ into a larger space and consider the limit there. The appropriate space is that of probability-valued functions (parametrized measures). These are the Young measures.

In what follows let $\Omega$ and $E$ be locally compact, σ-compact metric spaces, $\Sigma$ a σ-field on $\Omega$ containing $\mathcal{B}(\Omega)$ (the Borel σ-field of $\Omega$) and $\mu \in M(\Omega)_+$ (see Section 2.3), which is nonatomic and such that $\Sigma$ is µ-complete. Also we set
\[
M^1_+(E) \stackrel{df}{=} \big\{ \lambda \in M(E)_+ : \lambda(E) = 1 \big\} \tag{3.130}
\]
(the probability measures on $E$) and
\[
SM^1_+(E) \stackrel{df}{=} \big\{ \lambda \in M(E)_+ : \lambda(E) \leq 1 \big\}
\]
(the subprobability measures on $E$).

DEFINITION 3.5.1 A transition probability (respectively transition subprobability) on $E$ is a function
\[
\lambda : \Omega \longrightarrow M^1_+(E) \quad \big(\text{respectively } \lambda : \Omega \longrightarrow SM^1_+(E)\big),
\]
such that for every $A \in \mathcal{B}(E)$, we have that the function $z \longmapsto \lambda(z)(A)$ is $\Sigma$-measurable. By $\widehat{R}(\Omega, E)$ (respectively $\widehat{SR}(\Omega, E)$) we denote the space of transition probabilities (respectively subprobabilities) on $\Omega$.

REMARK 3.5.2 On $M^1_+(E)$ we can consider the topology of narrow convergence (see Definition 2.3.42(c)). This is the relative topology on $M^1_+(E)$ induced by $w\big(M(E), C_b(E)\big)$. Recall that $M^1_+(E)$ furnished with this topology is a Polish space (see Definition A.2.29(a)). In the sequel by $M^1_+(E)_n$ we denote the space $M^1_+(E)$ equipped with the topology of narrow convergence.
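The loss of information described at the start of this section, and the way a probability measure recovers it, can be seen numerically. The sketch below rests on assumptions not made in the text: $\Omega = (0, 1)$ with Lebesgue measure, $E = \mathbb{R}$ and the hypothetical sequence $u_n(z) = \sin(2\pi n z)$. For this sequence the limiting distribution of values is the classical arcsine law on $[-1, 1]$, which is used here as the comparison target; all function names are illustrative.

```python
import math


def sample_oscillation(n, m=20000):
    """Values of u_n(z) = sin(2*pi*n*z) at the midpoints of m equal cells of (0, 1)."""
    return [math.sin(2.0 * math.pi * n * (k + 0.5) / m) for k in range(m)]


def empirical_cdf(values, t):
    """Fraction of sample points z with u_n(z) <= t."""
    return sum(1 for v in values if v <= t) / len(values)


def arcsine_cdf(t):
    """Distribution function of the arcsine law on [-1, 1]."""
    return 0.5 + math.asin(t) / math.pi


values = sample_oscillation(50)
mean_value = sum(values) / len(values)                  # approximates the weak limit u = 0
mean_square = sum(v * v for v in values) / len(values)  # approximates the weak limit of u_n^2
cdf_gap = abs(empirical_cdf(values, 0.5) - arcsine_cdf(0.5))

print(mean_value, mean_square, cdf_gap)
```

The mean of $u_n$ is close to $0$, but the mean of $u_n^2$ is close to $1/2$ rather than $0^2$, so composing with the non-affine function $x \longmapsto x^2$ does not commute with the weak limit. The distribution of values, in contrast, is stable and is what the parametrized measure records.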


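PROPOSITION 3.5.3 If $\lambda : \Omega \longrightarrow M^1_+(E)$ is a $\big(\Sigma, \mathcal{B}\big(M^1_+(E)_n\big)\big)$-measurable function, then $\lambda \in \widehat{R}(\Omega, E)$.

PROOF Let $U$ be an open set in $E$. The map $z \longmapsto \lambda(z)(U)$ is the composition of $\lambda : \Omega \longrightarrow M^1_+(E)$, which by hypothesis is $\big(\Sigma, \mathcal{B}\big(M^1_+(E)_n\big)\big)$-measurable, and of the map $\xi : M^1_+(E) \longrightarrow [0, 1]$, defined by $\xi(\nu) = \nu(U)$, which is lower semicontinuous (by virtue of the Portmanteau theorem; see Theorem A.2.36). Therefore the map $z \longmapsto \lambda(z)(U)$ is $\Sigma$-measurable. Since $\lambda(z) \in M^1_+(E)$ is regular for every $z \in \Omega$, we conclude that the map $z \longmapsto \lambda(z)(A)$ is $\Sigma$-measurable for all $A \in \mathcal{B}(E)$.

PROPOSITION 3.5.4 If $E$ is compact and $\lambda \in \widehat{R}(\Omega, E)$, then $\lambda$ is a $\big(\Sigma, \mathcal{B}\big(M^1_+(E)_n\big)\big)$-measurable function.

PROOF Recall that $C(E)^* = M(E)$. Then for every $g \in C(E)$, we have that
\[
\text{the map } z \longmapsto \big\langle \lambda(z), g \big\rangle_{C(E)} = \int_E g(x)\, \lambda(z)(dx) \ \text{ is } \Sigma\text{-measurable.}
\]
Indeed, suppose that $s : E \longrightarrow \mathbb{R}$ is a simple function. Then since $\lambda \in \widehat{R}(\Omega, E)$, we have that
\[
\text{the map } z \longmapsto \int_E s(x)\, \lambda(z)(dx) \ \text{ is } \Sigma\text{-measurable.}
\]
We can find a sequence $\{s_n\}_{n \geq 1}$ of simple functions on $E$, such that $s_n(x) \longrightarrow g(x)$ uniformly on $E$. Then
\[
\int_E s_n(x)\, \lambda(z)(dx) \longrightarrow \int_E g(x)\, \lambda(z)(dx),
\]
which proves the $\Sigma$-measurability of the map $z \longmapsto \big\langle \lambda(z), g \big\rangle_{C(E)}$. So the map $z \longmapsto \lambda(z)$ is weakly$^*$-measurable. Because $M^1_+(E)_n$ is a compact metrizable space, we conclude that $z \longmapsto \lambda(z)$ is $\big(\Sigma, \mathcal{B}\big(M^1_+(E)_n\big)\big)$-measurable.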


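Let us recall the notion of image measure.

DEFINITION 3.5.5 Let $(S, \mathcal{T})$ be a measurable space, $Y$ a Hausdorff topological space, $\xi : S \longrightarrow Y$ a $\big(\mathcal{T}, \mathcal{B}(Y)\big)$-measurable function and $\lambda : \mathcal{T} \longrightarrow \mathbb{R}_+ \cup \{+\infty\}$ a measure. The image of $\lambda$ under $\xi$ is the measure $\nu : \mathcal{B}(Y) \longrightarrow \mathbb{R}_+ \cup \{+\infty\}$, defined by
\[
\nu(A) \stackrel{df}{=} \lambda\big(\xi^{-1}(A)\big) \qquad \forall\, A \in \mathcal{B}(Y).
\]
We often denote $\nu$ by $\lambda \circ \xi^{-1}$.

REMARK 3.5.6 If $g : Y \longrightarrow \mathbb{R}$ is a ν-integrable map or a measurable and positive map, then
\[
\int_S g\big(\xi(s)\big)\, d\lambda(s) = \int_Y g(y)\, d\nu(y).
\]

Now we can give the definition of Young measure.

DEFINITION 3.5.7
(a) $\lambda \in M(\Omega \times E)_+$ is a Young measure with respect to µ, if
\[
\lambda(A \times E) = \mu(A) \qquad \forall\, A \in \Sigma.
\]
By $\mathcal{Y}(\Omega, E, \mu)$, we denote the space of Young measures with respect to µ.
(b) $\lambda \in M(\Omega \times E)_+$ is a Young submeasure with respect to µ, if
\[
\lambda(A \times E) \leq \mu(A) \qquad \forall\, A \in \Sigma.
\]
By $S\mathcal{Y}(\Omega, E, \mu)$, we denote the space of Young submeasures with respect to µ.
(c) If $u : \Omega \longrightarrow E$ is a measurable function, then the Young measure associated to $u$ is the element $\nu \in \mathcal{Y}(\Omega, E, \mu)$, defined by
\[
\int_{\Omega \times E} h(z, x)\, d\nu = \int_\Omega h\big(z, u(z)\big)\, d\mu \qquad \forall\, h \in C_0(\Omega \times E).
\]

REMARK 3.5.8 Let $\mathrm{proj}_\Omega : \Omega \times E \longrightarrow \Omega$ be the projection map defined by
\[
\mathrm{proj}_\Omega(z, x) \stackrel{df}{=} z \qquad \forall\, (z, x) \in \Omega \times E.
\]
If $\nu \in \mathcal{Y}(\Omega, E, \mu)$, then
\[
\mu = \nu \circ \mathrm{proj}_\Omega^{-1}
\]
(see Definition 3.5.5). Moreover, we have
\[
\nu = \mu \circ \eta^{-1} \qquad \text{for all } \mu, \nu \text{ as in Definition 3.5.7(c)},
\]
where $\eta : \Omega \longrightarrow \Omega \times E$ is defined by
\[
\eta(z) \stackrel{df}{=} \big(z, u(z)\big) \qquad \forall\, z \in \Omega.
\]
If $u_n : \Omega \longrightarrow E$ are measurable functions for $n \geq 1$, $u_n(z) \longrightarrow u(z)$ for µ-a.a. $z \in \Omega$ and $\nu_n$, $\nu$ are the Young measures associated to $u_n$ and $u$ respectively (see Definition 3.5.7(c)), then
\[
\nu_n \xrightarrow{\ w\ } \nu \ \text{ in } M^1_+(\Omega \times E)
\]
(see Definition 2.3.42(b)). To see this let $h \in C_0(\Omega \times E)$. Using Remark 3.5.6 and the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
\[
\lim_{n \to +\infty} \int_{\Omega \times E} h(z, x)\, d\nu_n = \lim_{n \to +\infty} \int_\Omega h\big(z, u_n(z)\big)\, d\mu = \int_\Omega h\big(z, u(z)\big)\, d\mu = \int_{\Omega \times E} h(z, x)\, d\nu.
\]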

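PROPOSITION 3.5.9 We have
\[
S\mathcal{Y}(\Omega, E, \mu) = \Big\{ \nu \in M(\Omega \times E)_+ : \nu(\Omega \times E) \leq \mu(\Omega) \text{ and for all } \beta \in C_0(\Omega)_+ \text{ and all } \xi \in C_0(\Omega \times E)_+,
\]
\[
\text{such that } \xi(z, x) \leq \beta(z) \text{ for all } (z, x) \in \Omega \times E, \text{ we have } \int_{\Omega \times E} \xi(z, x)\, d\nu \leq \int_\Omega \beta(z)\, d\mu \Big\}.
\]
Moreover, if $E$ is compact, then
\[
\mathcal{Y}(\Omega, E, \mu) = \Big\{ \nu \in S\mathcal{Y}(\Omega, E, \mu) : \text{for all } \beta \in C_0(\Omega)_+, \text{ we have } \int_\Omega \beta(z)\, d\mu = \int_{\Omega \times E} \beta(z)\, d\nu \Big\}.
\]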


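PROOF Let $K \subseteq \Omega$ and $C \subseteq E$ be nonempty, compact sets, and $\varepsilon > 0$. Then by virtue of Urysohn's lemma (see Theorem A.1.13), we can find
\[
\xi \in C_0(\Omega \times E)_+, \qquad \beta \in C_0(\Omega)_+, \qquad 0 \leq \xi, \beta \leq 1
\]
and a compact set
\[
K_1 \subseteq \Omega, \qquad K_1 \supseteq K, \qquad K_1 \neq K,
\]
such that
\[
\xi|_{K \times C} = 1, \qquad \beta|_K = 1, \qquad \big| \beta|_{K_1^c} \big| \leq \varepsilon, \qquad \mu(K_1 \setminus K) \leq \varepsilon
\]
and
\[
\xi(z, x) \leq \beta(z) \qquad \forall\, (z, x) \in \Omega \times E.
\]
Then we have
\[
\nu(K \times C) \leq \int_{\Omega \times E} \xi(z, x)\, d\nu \leq \int_\Omega \beta(z)\, d\mu \leq \mu(K) + \mu(K_1 \setminus K) + \varepsilon \mu(K_1^c) \leq \mu(K) + \varepsilon \big(1 + \mu(\Omega)\big).
\]
Let $\varepsilon \searrow 0$, to conclude that $\nu(K \times C) \leq \mu(K)$. If $A \in \Sigma$, then we can find compact sets
\[
K_n \subseteq A \quad \text{and} \quad C_n \subseteq E \qquad \forall\, n \geq 1,
\]
such that
\[
K_n \nearrow A \quad \text{and} \quad C_n \nearrow E.
\]
Since
\[
\nu(K_n \times C_n) \leq \mu(K_n) \qquad \forall\, n \geq 1,
\]
passing to the limit as $n \to +\infty$, we obtain $\nu(A \times E) \leq \mu(A)$. This proves the first equality. If $E$ is compact, simply note that if $\beta \in C_0(\Omega)$, then $\beta \in C_0(\Omega \times E)$. So the second equality follows at once.

Recall that
\[
M(\Omega \times E) = C_0(\Omega \times E)^*
\]
(see Theorem 2.3.41). So we can equip $\mathcal{Y}(\Omega, E, \mu)$ and $S\mathcal{Y}(\Omega, E, \mu)$ with the relative weak$^*$-topology. If we do this we can have useful topological properties for the space of Young measures and of Young submeasures.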


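THEOREM 3.5.10 $E$ is compact if and only if $\mathcal{Y}(\Omega, E, \mu)$ is compact.

PROOF “⟹”: First note that
\[
\mathcal{Y}(\Omega, E, \mu) \subseteq S\mathcal{Y}(\Omega, E, \mu) \subseteq \overline{B}^{\,*}_{\mu(\Omega)},
\]
where
\[
\overline{B}^{\,*}_{\mu(\Omega)} \stackrel{df}{=} \big\{ \lambda \in M(\Omega \times E) : |\lambda|(\Omega \times E) \leq \mu(\Omega) \big\}.
\]
We know that the predual space $C_0(\Omega \times E)$ is separable. So on bounded subsets of $M(\Omega \times E)$, the relative weak$^*$-topology is compact and metrizable. Therefore we have to show that $\mathcal{Y}(\Omega, E, \mu)$ is sequentially closed. Let $\{\nu_n\}_{n \geq 1} \subseteq \mathcal{Y}(\Omega, E, \mu)$ be a sequence, such that
\[
\nu_n \xrightarrow{\ w^*\ } \nu.
\]
If $\beta \in C_0(\Omega)$, then from Proposition 3.5.9, we have
\[
\int_\Omega \beta(z)\, d\mu = \int_{\Omega \times E} \beta(z)\, d\nu_n \qquad \forall\, n \geq 1.
\]
Since $\beta \in C_0(\Omega \times E)$, in the limit as $n \to +\infty$, we obtain
\[
\int_\Omega \beta(z)\, d\mu = \int_{\Omega \times E} \beta(z)\, d\nu
\]
and thus due to Proposition 3.5.9, we have $\nu \in \mathcal{Y}(\Omega, E, \mu)$. This proves the compactness of $\mathcal{Y}(\Omega, E, \mu)$.

“⟸”: Suppose that $E$ is not compact. Then we can find a sequence $\{x_n\}_{n \geq 1} \subseteq E$ with no convergent subsequence. For every $n \geq 1$, let $\nu_n \in \mathcal{Y}(\Omega, E, \mu)$ be associated to the constant function $x_n$ (see Definition 3.5.7(c)). Then for each $\xi \in C_0(\Omega \times E)$, we have
\[
\lim_{n \to +\infty} \int_{\Omega \times E} \xi(z, x)\, d\nu_n = \lim_{n \to +\infty} \int_\Omega \xi(z, x_n)\, d\mu = 0
\]
and so
\[
\nu_n \xrightarrow{\ w^*\ } 0 \quad \text{and} \quad 0 \notin \mathcal{Y}(\Omega, E, \mu),
\]
a contradiction.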
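The escape of mass used in the “⟸” part of the proof of Theorem 3.5.10 can be observed in a concrete case. The following sketch makes assumptions that are not in the text: $\Omega = (0, 1)$ with Lebesgue measure, $E = \mathbb{R}$ (locally compact but not compact), the constant functions $x_n = n$ and one hypothetical test function $\xi \in C_0(\Omega \times E)$. It evaluates the pairing $\int_\Omega \xi(z, x_n)\, d\mu$ of $\xi$ with the Young measure of $x_n$.

```python
import math


def xi(z, x):
    """A test function in C_0((0, 1) x R): it vanishes at the boundary of (0, 1) and as |x| grows."""
    return z * (1.0 - z) * math.exp(-x * x)


def pairing_with_constant(x_n, m=1000):
    """Integral of xi against the Young measure of the constant function z -> x_n (midpoint rule in z)."""
    return sum(xi((k + 0.5) / m, x_n) for k in range(m)) / m


# Pairings for x_n = 1, 2, ..., 7; they decay to zero.
pairings = [pairing_with_constant(float(n)) for n in range(1, 8)]

# Each nu_n still has total mass nu_n(Omega x E) = mu(Omega) = 1.
total_mass = 1.0

print(pairings)
```

The pairings tend to $0$ although every $\nu_n$ has total mass $\mu(\Omega) = 1$, so the weak$^*$ limit is the zero measure, which is a Young submeasure but not a Young measure.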


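THEOREM 3.5.11 $S\mathcal{Y}(\Omega, E, \mu)$ is compact and metrizable.

PROOF Recall that
\[
S\mathcal{Y}(\Omega, E, \mu) \subseteq \overline{B}^{\,*}_{\mu(\Omega)}
\]
(see the proof of Theorem 3.5.10) and $\overline{B}^{\,*}_{\mu(\Omega)}$ with the relative weak$^*$-topology is compact metrizable. So we need to show that $S\mathcal{Y}(\Omega, E, \mu)$ is sequentially weak$^*$-closed. To this end let $\{\nu_n\}_{n \geq 1} \subseteq S\mathcal{Y}(\Omega, E, \mu)$ be a sequence, such that
\[
\nu_n \xrightarrow{\ w^*\ } \nu \ \text{ in } M(\Omega \times E),
\]
for some $\nu \in M(\Omega \times E)_+$. Suppose that
\[
\xi \in C_0(\Omega \times E)_+, \qquad \beta \in C_0(\Omega)_+
\]
and
\[
\xi(z, x) \leq \beta(z) \qquad \forall\, (z, x) \in \Omega \times E.
\]
Then from Proposition 3.5.9, we have
\[
\int_{\Omega \times E} \xi(z, x)\, d\nu_n \leq \int_\Omega \beta(z)\, d\mu \qquad \forall\, n \geq 1,
\]
so
\[
\int_{\Omega \times E} \xi(z, x)\, d\nu \leq \int_\Omega \beta(z)\, d\mu
\]
and from Proposition 3.5.9, we have that $\nu \in S\mathcal{Y}(\Omega, E, \mu)$.

Let us see how Young measures are related to transition probabilities (see Definition 3.5.1). In what follows on $\widehat{SR}(\Omega, E)$ we consider the equivalence relation
\[
\lambda_1 \sim \lambda_2 \iff \big[ \lambda_1(z) = \lambda_2(z) \text{ for µ-a.a. } z \in \Omega \big].
\]
Then we set
\[
R(\Omega, E) \stackrel{df}{=} \widehat{R}(\Omega, E)/\!\sim \quad \text{and} \quad SR(\Omega, E) \stackrel{df}{=} \widehat{SR}(\Omega, E)/\!\sim.
\]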


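THEOREM 3.5.12 There is a bijection $\psi : R(\Omega, E) \longrightarrow \mathcal{Y}(\Omega, E, \mu)$ given by
\[
\psi(\lambda) \stackrel{df}{=} \nu \qquad \forall\, \lambda \in R(\Omega, E),
\]
where
\[
\nu(C) \stackrel{df}{=} \int_\Omega \int_E \chi_C(z, x)\, \lambda(z)(dx)\, d\mu \qquad \forall\, C \in \Sigma \times \mathcal{B}(E).
\]

PROOF For any ν-integrable or positive and $\Sigma \times \mathcal{B}(E)$-measurable function $h : \Omega \times E \longrightarrow \mathbb{R}$, we have
\[
\int_{\Omega \times E} h(z, x)\, d\nu = \int_\Omega \int_E h(z, x)\, \lambda(z)(dx)\, d\mu,
\]
where $\nu = \psi(\lambda)$. First we show that the map is injective. So let $\lambda_1, \lambda_2 \in R(\Omega, E)$ and suppose that $\psi(\lambda_1) = \psi(\lambda_2)$. Then for $A \in \Sigma$ and $\eta \in C_0(E)$, we set
\[
h(z, x) \stackrel{df}{=} \chi_A(z)\, \eta(x) \qquad \forall\, (z, x) \in \Omega \times E.
\]
We have
\[
\int_A \int_E \eta(x)\, \lambda_1(z)(dx)\, d\mu = \int_A \int_E \eta(x)\, \lambda_2(z)(dx)\, d\mu.
\]
Because $A \in \Sigma$ was arbitrary, we obtain
\[
\int_E \eta(x)\, \lambda_1(z)(dx) = \int_E \eta(x)\, \lambda_2(z)(dx) \qquad \text{for µ-a.a. } z \in \Omega \text{ and all } \eta \in C_0(E),
\]
so $\lambda_1(z) = \lambda_2(z)$ for µ-a.a. $z \in \Omega$. Therefore $\psi$ is indeed injective.

Next we show that $\psi$ is surjective. So let $\nu \in \mathcal{Y}(\Omega, E, \mu)$. Then for any $\varepsilon > 0$, we can find a sequence $\{D_k\}_{k \geq 1}$ of pairwise disjoint Borel subsets of $E$ with
\[
\mathrm{diam}\, D_k < \varepsilon \quad \text{and} \quad \nu\Big( \Omega \times \bigcap_{k=1}^{\infty} D_k^c \Big) = 0
\]
and also a sequence $\{U_m\}_{m \geq 1}$ of pairwise disjoint open sets with
\[
\mathrm{diam}\, U_m < \varepsilon, \qquad \mu(\partial U_m) = 0 \quad \text{and} \quad \mu\Big( \Omega \setminus \bigcup_{m=1}^{\infty} U_m \Big) = 0.
\]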


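For each $k \geq 1$, we pick $x_k \in D_k$ and then define
\[
\lambda^\varepsilon(z) \stackrel{df}{=} \sum_{k=1}^{\infty} \frac{\nu(U_m \times D_k)}{\mu(U_m)}\, \delta_{x_k} \qquad \forall\, z \in U_m,
\]
where $\delta_{x_k}$ is the Dirac measure concentrated on $x_k$ (the normalization by $\mu(U_m)$ makes $\lambda^\varepsilon(z)$ a probability measure, since $\nu(U_m \times E) = \mu(U_m)$). Then $\lambda^\varepsilon \in R(\Omega, E)$ and let
\[
\nu^\varepsilon = \psi(\lambda^\varepsilon).
\]
For all $C \in \Sigma \times \mathcal{B}(E)$, we have
\[
\nu\Big( \bigcup_{(m, k)\,:\; U_m \times D_k \subseteq C} U_m \times D_k \Big) \leq \nu^\varepsilon(C) \leq \nu\Big( \bigcup_{(m, k)\,:\; U_m \times D_k \supseteq C} U_m \times D_k \Big).
\]
Therefore for any open set $V \subseteq \Omega \times E$ with $\nu(\partial V) = 0$, we have
\[
\lim_{\varepsilon \searrow 0} \nu^\varepsilon(V) = \nu(V).
\]
Then by the regularity of $\nu$ and the Portmanteau theorem (see Theorem A.2.36), we have that
\[
\nu^\varepsilon \xrightarrow{\ w^*\ } \nu.
\]
On the other hand note that $\{\lambda^\varepsilon\}_{\varepsilon > 0}$ is bounded in $L^\infty(\Omega; M(E))$ and so by Alaoglu's theorem (see Theorem A.3.9), we may assume that
\[
\lambda^\varepsilon \xrightarrow{\ w^*\ } \lambda \ \text{ in } L^\infty(\Omega; M(E)).
\]
Then if $\xi \in C_0(\Omega \times E)$, we have $\xi \in L^1\big(\Omega; C_0(E)\big)$ and so
\[
\int_{\Omega \times E} \xi(z, x)\, d\nu = \lim_{\varepsilon \searrow 0} \int_{\Omega \times E} \xi(z, x)\, d\nu^\varepsilon = \lim_{\varepsilon \searrow 0} \int_\Omega \int_E \xi(z, x)\, \lambda^\varepsilon(z)(dx)\, d\mu = \int_\Omega \int_E \xi(z, x)\, \lambda(z)(dx)\, d\mu.
\]
Hence $\lambda \in R(\Omega, E)$ and $\nu = \psi(\lambda)$.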


REMARK 3.5.13 Similarly we establish that $\psi$ is a bijection from $SR(\Omega, E)$ onto $S\mathcal{Y}(\Omega, E, \mu)$. Moreover, if $\nu$ is the Young measure associated to a measurable function $u : \Omega \longrightarrow E$, then $\psi(\delta_u) = \nu$. Here $\delta_u$ is the Dirac transition probability associated to $u$, i.e.,
\[
\delta_{u(z)}(A) = \begin{cases} 1 & \text{if } u(z) \in A, \\ 0 & \text{if } u(z) \notin A, \end{cases} \qquad \forall\, A \in \mathcal{B}(E).
\]
Also the identification obtained in Theorem 3.5.12 implies that the weak$^*$-topology on $\mathcal{Y}(\Omega, E, \mu)$ (resulting from the pair of spaces $\big(M(\Omega \times E), C_0(\Omega \times E)\big)$) is equivalent to the weak$^*$-topology on $R(\Omega, E)$ (resulting from the pair $\big(L^\infty(\Omega; M(E)), L^1\big(\Omega; C_0(E)\big)\big)$). Finally we should mention that Theorem 3.5.12 is also known as the disintegration theorem.
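The correspondence $\psi$ between transition probabilities and Young measures can be made concrete on finite sets. The following is a toy model only: it replaces $\Omega$ and $E$ by finite sets (so µ has atoms, unlike the nonatomic measure assumed in this section), and all names and numerical values are hypothetical. It shows $\nu = \psi(\lambda)$ with marginal µ, the recovery of $\lambda$ from $\nu$ by disintegration, and the Young measure $\psi(\delta_u)$ of a function $u$.

```python
from fractions import Fraction as Fr


def psi(mu, lam):
    """nu = psi(lam): nu({(z, x)}) = mu({z}) * lam(z)({x}) on finite sets."""
    return {(z, x): mu[z] * p for z, row in lam.items() for x, p in row.items()}


def disintegrate(nu, mu):
    """Recover lam(z)({x}) = nu({(z, x)}) / mu({z}), assuming mu({z}) > 0 for every z."""
    lam = {}
    for (z, x), mass in nu.items():
        lam.setdefault(z, {})[x] = mass / mu[z]
    return lam


def dirac_transition(u, points):
    """delta_u: the transition probability z -> Dirac mass at u(z)."""
    return {z: {x: Fr(1 if x == u[z] else 0) for x in points} for z in u}


# Hypothetical data: a probability mu on three points and a transition probability lam.
mu = {"z1": Fr(1, 2), "z2": Fr(1, 3), "z3": Fr(1, 6)}
lam = {
    "z1": {"a": Fr(1, 4), "b": Fr(3, 4)},
    "z2": {"a": Fr(1, 1), "b": Fr(0, 1)},
    "z3": {"a": Fr(1, 2), "b": Fr(1, 2)},
}

nu = psi(mu, lam)
# Marginal of nu on Omega; it coincides with mu, i.e. nu(A x E) = mu(A).
marginal = {z: sum(m for (zz, _), m in nu.items() if zz == z) for z in mu}

# Young measure associated to the function u(z1) = a, u(z2) = b, u(z3) = a.
nu_u = psi(mu, dirac_transition({"z1": "a", "z2": "b", "z3": "a"}, ["a", "b"]))
```

Exact rational arithmetic is used so that the round trip `disintegrate(psi(mu, lam), mu)` returns `lam` without rounding error, which is the finite counterpart of the injectivity and surjectivity of $\psi$.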

The next approximation result is important in many applications.

THEOREM 3.5.14 If $F : \Omega \longrightarrow 2^E \setminus \{\emptyset\}$ is a multifunction, such that $\mathrm{Gr}\, F \in \Sigma \times \mathcal{B}(E)$, then
\[
R_F(\Omega, E) = \overline{\big\{ \delta_u : u : \Omega \longrightarrow E \text{ measurable},\ u(z) \in F(z) \text{ for µ-a.a. } z \in \Omega \big\}}^{\,w^*}, \tag{3.131}
\]
where
\[
R_F(\Omega, E) \stackrel{df}{=} \big\{ \lambda \in R(\Omega, E) : \lambda(z)\big(F(z)\big) = 1 \ \text{ for a.a. } z \in \Omega \big\}
\]
and the closure is taken with respect to the weak$^*$-topology on $L^\infty(\Omega; M(E))$.

PROOF In what follows by $M^1_+(E)_{w^*}$ (respectively $M^1_+(E)_n$) we denote the space $M^1_+(E)$ furnished with the relative weak$^*$-topology of $M(E)$ (respectively the narrow topology); see Definition 2.3.42 and Remark 2.3.43. We know that $M^1_+(E)_n$ is a Polish space and because the narrow topology is stronger than the weak$^*$-topology, we infer that $M^1_+(E)_{w^*}$ is a Souslin space (see Definition A.2.29(b)). So the Borel σ-fields of $M^1_+(E)_n$ and $M^1_+(E)_{w^*}$ are equal and we simply write $\mathcal{B}\big(M^1_+(E)\big)$. Now let
\[
S(z) \stackrel{df}{=} \big\{ \lambda \in M^1_+(E) : \lambda\big(F(z)\big) = 1 \big\}.
\]
We claim that
\[
\mathrm{Gr}\, S \in \Sigma \times \mathcal{B}\big(M^1_+(E)\big).
\]
So let $C \in \Sigma \times \mathcal{B}(E)$ and consider the map $\varphi_C : \Omega \times M^1_+(E) \longrightarrow \mathbb{R}$, defined by
\[
\varphi_C(z, \lambda) = \big(\delta_z \otimes \lambda\big)(C) \qquad \forall\, (z, \lambda) \in \Omega \times M^1_+(E).
\]


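Here by $\delta_z$ we denote the Dirac measure concentrated at $z \in \Omega$. Let $\mathcal{T}$ be the collection of all sets $C \in \Sigma \times \mathcal{B}(E)$, such that $\varphi_C$ is $\Sigma \times \mathcal{B}\big(M^1_+(E)\big)$-measurable. For $A \in \Sigma$ and $B \in \mathcal{B}(E)$, we have
\[
\varphi_{A \times B}(z, \lambda) = \chi_A(z)\, \lambda(B) = \chi_A(z)\, \vartheta_B(\lambda),
\]
where
\[
\vartheta_B(\lambda) = \lambda(B) \qquad \forall\, \lambda \in M^1_+(E).
\]
We show that $\vartheta_B$ is a Borel map on $M^1_+(E)$. Let $\mathcal{B}_1$ be all those sets $G \in \mathcal{B}(E)$, such that $\vartheta_G$ is $\mathcal{B}\big(M^1_+(E)\big)$-measurable. If $G$ is open, then from the Portmanteau theorem (see Theorem A.2.36), we know that the map
\[
\lambda \longmapsto \vartheta_G(\lambda)
\]
is lower semicontinuous on $M^1_+(E)_n$. If $U$ is an open set in $E$ and $K$ is a closed set in $E$, then
\[
\vartheta_{U \cap K} = \vartheta_U - \vartheta_{U \cap K^c},
\]
so the map $\lambda \longmapsto \vartheta_{U \cap K}(\lambda)$ is Borel. Hence if $\{U_k\}_{k=1}^N$ are open sets in $E$ and $\{K_k\}_{k=1}^N$ are closed sets in $E$, then
\[
\bigcup_{k=1}^{N} \big( U_k \cap K_k \big) \in \mathcal{B}_1
\]
and so $\mathcal{B}_1$ is a monotone class. From the monotone class theorem, it follows that $\mathcal{B}_1 = \mathcal{B}(E)$. So $\vartheta_G$ is $\mathcal{B}\big(M^1_+(E)\big)$-measurable for all $G \in \mathcal{B}(E)$. Therefore the map
\[
(z, \lambda) \longmapsto \varphi_{A \times B}(z, \lambda) = \chi_A(z)\, \vartheta_B(\lambda)
\]
is $\Sigma \times \mathcal{B}\big(M^1_+(E)\big)$-measurable for all $A \in \Sigma$ and all $B \in \mathcal{B}(E)$, hence $A \times B \in \mathcal{T}$. Clearly $\mathcal{T}$ is a monotone class and so it follows that $\mathcal{T} = \Sigma \times \mathcal{B}(E)$. We have
\[
\big\{ (z, \lambda) \in \Omega \times M^1_+(E) : (\delta_z \otimes \lambda)(\mathrm{Gr}\, F) = 1 \big\} = \varphi^{-1}_{\mathrm{Gr}\, F}\big(\{1\}\big) = \mathrm{Gr}\, S \in \Sigma \times \mathcal{B}\big(M^1_+(E)\big).
\]
Let $D$ be the subset of $M^1_+(E)$ consisting of the Dirac measures and introduce
\[
S_e(z) \stackrel{df}{=} \big\{ \lambda \in D : \lambda\big(F(z)\big) = 1 \big\}.
\]
Then
\[
\mathrm{ext}\, S(z) = S_e(z) \quad \text{and} \quad \mathrm{Gr}\, S_e \in \Sigma \times \mathcal{B}\big(M^1_+(E)\big).
\]
Using Remark 3.4.19, we have
\[
\overline{S^\infty_{S_e}}^{\,w^*} = S^\infty_{\overline{\mathrm{conv}}^{\,w^*} S_e} = S^\infty_S,
\]
so we obtain (3.131).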


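Exploiting the identification of $R(\Omega, E)$ with $\mathcal{Y}(\Omega, E, \mu)$ we obtain the following corollary of Theorem 3.5.14.

COROLLARY 3.5.15 The Young measures associated to measurable functions are dense in the space $\mathcal{Y}(\Omega, E, \mu)$ for the weak$^*$-topology on $M(\Omega \times E)$ (or equivalently by Theorem 3.5.12 for the weak$^*$-topology on $L^\infty(\Omega; M(E))$).

In the previous section we introduced the following classes of $\Sigma \times \mathcal{B}(E)$-measurable functions $f : \Omega \times E \longrightarrow \overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$ (called integrands): the Carathéodory integrands and the normal integrands. Next we introduce further specifications of these classes, which will be helpful in our analysis.

DEFINITION 3.5.16
(a)
\[
\mathcal{N}(\Omega, \Sigma, E) \stackrel{df}{=} \big\{ f : \Omega \times E \longrightarrow \overline{\mathbb{R}} : f \text{ is } \Sigma \times \mathcal{B}(E)\text{-measurable and } f(z, \cdot) \text{ is lower semicontinuous} \big\},
\]
i.e., $\mathcal{N}(\Omega, \Sigma, E)$ is the set of all normal integrands; see also Definition 3.4.8.
(b)
\[
\mathcal{N}_+(\Omega, \Sigma, E) \stackrel{df}{=} \big\{ f \in \mathcal{N}(\Omega, \Sigma, E) : f \geq 0 \big\},
\]
i.e., $\mathcal{N}_+(\Omega, \Sigma, E)$ is the set of positive normal integrands.
(c)
\[
\widehat{K}_b(\Omega, \Sigma, E) \stackrel{df}{=} \big\{ f \in \mathcal{N}(\Omega, \Sigma, E) : f(z, \cdot) \in C_b(E) \text{ for µ-a.a. } z \in \Omega \text{ and the map } z \longmapsto \|f(z, \cdot)\|_{L^\infty(E)} \text{ belongs in } L^1(\Omega) \big\},
\]
i.e., $\widehat{K}_b(\Omega, \Sigma, E)$ is the set of all $C_b$-Carathéodory integrands.
(d)
\[
\widehat{K}_0(\Omega, \Sigma, E) \stackrel{df}{=} \big\{ f \in \mathcal{N}(\Omega, \Sigma, E) : f(z, \cdot) \in C_0(E) \text{ for µ-a.a. } z \in \Omega \text{ and the map } z \longmapsto \|f(z, \cdot)\|_{L^\infty(E)} \text{ belongs in } L^\infty(\Omega) \big\},
\]
i.e., $\widehat{K}_0(\Omega, \Sigma, E)$ is the set of all $C_0$-Carathéodory integrands.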

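From Proposition 3.4.9, we obtain the following fact.

PROPOSITION 3.5.17 If $f \in \mathcal{N}_+(\Omega, \Sigma, E)$, then there exists a sequence $\{f_n\}_{n \geq 1} \subseteq \widehat{K}_0(\Omega, \Sigma, E)$, such that $f_n \nearrow f$.

PROPOSITION 3.5.18 If $f \in \mathcal{N}_+(\Omega, \Sigma, E)$, then the map
\[
\nu \longmapsto \int_{\Omega \times E} f\, d\nu
\]
is $w^*$-lower semicontinuous on $S\mathcal{Y}(\Omega, E, \mu)$.

PROOF By virtue of Proposition 3.5.17, we can find a sequence
\[
\{f_n\}_{n \geq 1} \subseteq \widehat{K}_0(\Omega, \Sigma, E),
\]
such that $f_n \nearrow f$. By the monotone convergence theorem (see Theorem A.2.10), we have
\[
\int_{\Omega \times E} f\, d\nu = \lim_{n \to +\infty} \int_{\Omega \times E} f_n\, d\nu.
\]
Since
\[
f_n \in L^1\big(\Omega; C_0(E)\big) \qquad \forall\, n \geq 1,
\]
the map
\[
\nu \longmapsto \int_{\Omega \times E} f_n\, d\nu
\]
is $w^*$-continuous on $S\mathcal{Y}(\Omega, E, \mu)$ (see Remark 3.5.13). Therefore the map
\[
\nu \longmapsto \int_{\Omega \times E} f\, d\nu
\]
is $w^*$-lower semicontinuous on $S\mathcal{Y}(\Omega, E, \mu)$, being the supremum of a sequence of $w^*$-continuous maps.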

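PROPOSITION 3.5.19 If $\{u_n : \Omega \longrightarrow E\}_{n \geq 1}$ is a sequence of $\Sigma$-measurable functions, $\{\nu_n\}_{n \geq 1}$ is a sequence of Young measures associated with $\{u_n\}_{n \geq 1}$ (i.e., $\nu_n = \delta_{u_n}$ for $n \geq 1$) and
\[
\nu_n \xrightarrow{\ w^*\ } \nu \ \text{ in } S\mathcal{Y}(\Omega, E, \mu),
\]
for some $\nu \in S\mathcal{Y}(\Omega, E, \mu)$, then for µ-almost all $z \in \Omega$, the subprobability measure $\nu(z)$ is supported by
\[
\limsup_{n \to +\infty} \{u_n(z)\} = \bigcap_{n=1}^{\infty} \overline{\{ u_k(z) : k \geq n \}}.
\]

PROOF We define
\[
F_k(z) \stackrel{df}{=} \overline{\{u_n(z)\}}_{n \geq k} \qquad \forall\, z \in \Omega,\ k \geq 1
\]
and
\[
f_k(z, x) = i_{F_k(z)}(x) = \begin{cases} 0 & \text{if } x \in F_k(z), \\ +\infty & \text{otherwise.} \end{cases}
\]
Evidently $f_k \in \mathcal{N}_+(\Omega, \Sigma, E)$ and by virtue of Proposition 3.5.18, we have
\[
0 \leq \int_{\Omega \times E} f_k(z, x)\, d\nu \leq \liminf_{n \to +\infty} \int_{\Omega \times E} f_k(z, x)\, d\nu_n = \liminf_{n \to +\infty} \int_\Omega f_k\big(z, u_n(z)\big)\, d\mu = 0,
\]
so
\[
\int_{\Omega \times E} f_k(z, x)\, d\nu = 0.
\]
From Remark 3.5.13, we have
\[
\int_\Omega \int_E f_k(z, x)\, \nu(z)(dx)\, d\mu = 0,
\]
so
\[
\int_E f_k(z, x)\, \nu(z)(dx) = 0 \qquad \text{for µ-a.a. } z \in \Omega
\]
and thus $\nu(z)$ is supported by $F_k(z)$ for µ-almost all $z \in \Omega$. Because $k \geq 1$ was arbitrary, we obtain the conclusion of the proposition.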


When we examined the space of measures $M(E) = C_0(E)^*$, in addition to the weak$^*$-topology, we considered also a finer topology called the narrow topology, namely the $w\big(M(E), C_b(E)\big)$-topology. Exploiting the identification of $\mathcal{Y}(\Omega, E, \mu)$ with $R(\Omega, E)$, we do the same thing for the space $\mathcal{Y}(\Omega, E, \mu)$ of Young measures. We introduce a finer topology, which will lead to some powerful compactness results.

DEFINITION 3.5.20 The narrow topology on $\mathcal{Y}(\Omega, E, \mu)$ is the weakest topology which makes continuous the linear functionals of the form
\[
\nu \longmapsto \int_{\Omega \times E} f\, d\nu \qquad \forall\, f \in \widehat{K}_b(\Omega, \Sigma, E).
\]
We say that the sequence $\{\nu_n\}_{n \geq 1}$ converges narrowly to $\nu$, if
\[
\lim_{n \to +\infty} \int_{\Omega \times E} f\, d\nu_n = \int_{\Omega \times E} f\, d\nu \qquad \forall\, f \in \widehat{K}_b(\Omega, \Sigma, E)
\]
and we write
\[
\nu_n \xrightarrow{\ n\ } \nu.
\]

REMARK 3.5.21 If $E$ is compact, then the narrow and weak$^*$-topologies coincide since $\widehat{K}_b(\Omega, \Sigma, E) = \widehat{K}_0(\Omega, \Sigma, E)$.

PROPOSITION 3.5.22 If $\{u_n : \Omega \longrightarrow E\}_{n \geq 1}$ is a sequence of $\Sigma$-measurable functions, $u : \Omega \longrightarrow E$ is a $\Sigma$-measurable function, $\{\nu_n\}_{n \geq 1}$ and $\nu$ are the Young measures associated to the functions $\{u_n\}_{n \geq 1}$ and $u$ respectively, then
\[
u_n \xrightarrow{\ \mu\ } u \iff \nu_n \xrightarrow{\ n\ } \nu.
\]

PROOF

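“$\Longrightarrow$”: Since
\[
u_n\xrightarrow{\mu}u,
\]
we can extract a subsequence $\{u_{n_k}\}_{k\ge 1}$, such that $u_{n_k}(z)\longrightarrow u(z)$ for $\mu$-a.a. $z\in\Omega$. Then for all $f\in\widehat{K}_b(\Omega,\Sigma,E)$, by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
\[
\lim_{k\to+\infty}\int_{\Omega} f\big(z,u_{n_k}(z)\big)\,d\mu=\int_{\Omega} f\big(z,u(z)\big)\,d\mu,
\]
so
\[
\lim_{k\to+\infty}\int_{\Omega\times E} f(z,x)\,d\nu_{n_k}=\int_{\Omega\times E} f(z,x)\,d\nu\qquad\forall\ f\in\widehat{K}_b(\Omega,\Sigma,E)
\]
and thus
\[
\nu_{n_k}\xrightarrow{n}\nu\quad\text{as } k\to+\infty,\ \text{in } Y(\Omega,E,\mu).
\]
The same argument applies to any subsequence of $\{u_n\}_{n\ge 1}$. Since every subsequence of $\{\nu_n\}_{n\ge 1}$ has a further subsequence converging narrowly to $\nu$, we conclude that
\[
\nu_n\xrightarrow{n}\nu\quad\text{in } Y(\Omega,E,\mu).
\]

“$\Longleftarrow$”: Let
\[
f(z,x)\stackrel{df}{=}\min\big\{1,\ d_E\big(x,u(z)\big)\big\},
\]
where $d_E$ denotes the metric on $E$. Clearly $f\in\widehat{K}_b(\Omega,\Sigma,E)$. So
\[
\lim_{n\to+\infty}\int_{\Omega\times E} f(z,x)\,d\nu_n=\int_{\Omega\times E} f(z,x)\,d\nu,
\]
hence
\[
\lim_{n\to+\infty}\int_{\Omega} f\big(z,u_n(z)\big)\,d\mu=\int_{\Omega} f\big(z,u(z)\big)\,d\mu=0. \tag{3.132}
\]
For a given $\varepsilon>0$ (without loss of generality $\varepsilon\le 1$), let
\[
M_{\varepsilon,n}\stackrel{df}{=}\big\{z\in\Omega:\ d_E\big(u_n(z),u(z)\big)\ge\varepsilon\big\}\qquad\forall\ n\ge 1.
\]
We have
\[
\varepsilon\,\mu(M_{\varepsilon,n})\le\int_{M_{\varepsilon,n}} f\big(z,u_n(z)\big)\,d\mu\le\int_{\Omega} f\big(z,u_n(z)\big)\,d\mu,
\]
so, from (3.132), we have
\[
\lim_{n\to+\infty}\mu(M_{\varepsilon,n})=0
\]
and thus
\[
u_n\xrightarrow{\mu}u.
\]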
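The estimate $\varepsilon\,\mu(M_{\varepsilon,n})\le\int_\Omega \min\{1,d_E(u_n(z),u(z))\}\,d\mu$ used in the converse implication can be checked numerically. The following is only a minimal sketch under assumptions that are not in the text: $\Omega=[0,1]$ with the Lebesgue measure approximated by a uniform midpoint grid, $E=\mathbb{R}$, $u\equiv 0$ and the illustrative sequence $u_n(z)=z^n$. All function names are hypothetical.

```python
import numpy as np

# Assumption: Omega = [0, 1] with Lebesgue measure, approximated by a midpoint grid.
GRID = (np.arange(100_000) + 0.5) / 100_000

def truncated_distance_integral(u_n, u, z=GRID):
    """Approximate the integral over [0, 1] of min{1, |u_n - u|}."""
    return float(np.mean(np.minimum(1.0, np.abs(u_n(z) - u(z)))))

def exceedance_measure(u_n, u, eps, z=GRID):
    """Approximate mu({z : |u_n(z) - u(z)| >= eps})."""
    return float(np.mean(np.abs(u_n(z) - u(z)) >= eps))

def limit_function(z):
    """Assumed limit u = 0."""
    return np.zeros_like(z)

EPS = 0.1  # any eps in (0, 1] is compatible with the truncated metric
results = []
for n in (1, 5, 25, 125):
    u_n = lambda z, n=n: z ** n  # illustrative sequence, tends to 0 in measure
    integral = truncated_distance_integral(u_n, limit_function)
    measure = exceedance_measure(u_n, limit_function, EPS)
    results.append((n, integral, measure))
    print(f"n={n:4d}  integral={integral:.5f}  mu(M_eps,n)={measure:.5f}")
```

In this sketch the integrals decrease with $n$, and the measures of the sets $M_{\varepsilon,n}$ are controlled by them exactly as in the proof.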


By the Alexandrov one-point compactification (see Theorem A.1.3 and Remark A.1.4), we can find a compact metric space $\widehat{E}$, such that $E$ is a dense subset of $\widehat{E}$.

PROPOSITION 3.5.23
If $f\in N_+(\Omega,\Sigma,E)$, then
(a) there exists a sequence $\{\widehat{f}_n\}_{n\ge 1}\subseteq\widehat{K}_0\big(\Omega,\Sigma,\widehat{E}\big)$, such that $\widehat{f}_n\nearrow f$ on $\Omega\times E$;
(b) the function $\nu\longmapsto\int_{\Omega\times E} f\,d\nu$ is narrowly lower semicontinuous.

PROOF (a) Let $d_{\widehat{E}}$ be the metric of $\widehat{E}$ which extends $d_E$. Then
\[
\widehat{f}_n(z,x)=\inf_{y\in E}\big[f(z,y)+n\,d_{\widehat{E}}(y,x)\big]\qquad\forall\ n\ge 1,\ (z,x)\in\Omega\times\widehat{E}
\]
is the desired sequence.
(b) Follows from Proposition 3.5.18 since the narrow topology is finer than the weak$^*$-topology.

PROPOSITION 3.5.24
The narrow topology on $Y(\Omega,\Sigma,E)$ is the weakest topology $\tau$, such that the map $\nu\longmapsto\int_{\Omega\times\widehat{E}}\widehat{f}\,d\nu$ is continuous for all $\widehat{f}\in\widehat{K}_b(\Omega,\Sigma,\widehat{E})$.

PROOF Note that $\widehat{K}_b(\Omega,\Sigma,\widehat{E})\subseteq\widehat{K}_b(\Omega,\Sigma,E)$. So a priori the $\tau$-topology is weaker than the narrow topology. Next let $f\in\widehat{K}_b(\Omega,\Sigma,E)$ and introduce
\[
\beta(z)\stackrel{df}{=}\big\|f(z,\cdot)\big\|_{L^\infty(E)}\quad\text{and}\quad g(z,x)\stackrel{df}{=}f(z,x)+\beta(z).
\]
Evidently $g\in N_+(\Omega,\Sigma,E)$ and so by Proposition 3.5.23, we have that the map $\nu\longmapsto\int_{\Omega\times E} g\,d\nu$ is $\tau$-lower semicontinuous. Since
\[
\int_{\Omega\times E} f\,d\nu=\int_{\Omega\times E} g\,d\nu-\int_{\Omega}\beta\,d\mu,
\]
we infer that the map $\nu\longmapsto\int_{\Omega\times E} f\,d\nu$ is $\tau$-lower semicontinuous. If we repeat the above argument with $f$ replaced by $-f$, we reach the desired conclusion.


PROPOSITION 3.5.25
If on $\widehat{K}_0(\Omega,\Sigma,E)$ we introduce the equivalence relation
\[
f_1\sim f_2\iff\mu\big(\{z\in\Omega:\ f_1(z,\cdot)\ne f_2(z,\cdot)\}\big)=0,
\]
then $K_0(\Omega,\Sigma,E)=\widehat{K}_0(\Omega,\Sigma,E)/\!\sim$ is in bijection with $L^1\big(\Omega;C_0(E)\big)$.

PROOF For each $f\in K_0(\Omega,\Sigma,E)$, let
\[
\psi(f)(z)\stackrel{df}{=}f(z,\cdot).
\]
Then $\psi(f)(z)\in C_0(E)$. Also the map $\Omega\ni z\longmapsto\psi(f)(z)\in C_0(E)$ is measurable. To see this let $\{x_n\}_{n\ge 1}$ be dense in $E$. Then for all $h\in C_0(E)$, we have
\[
\big\|\psi(f)(z)-h\big\|_{C_0(E)}=\sup_{n\ge 1}\big|f(z,x_n)-h(x_n)\big|,
\]
so the map
\[
z\longmapsto\big\|\psi(f)(z)-h\big\|_{C_0(E)}
\]
is $\Sigma$-measurable for all $h\in C_0(E)$. Since the space $C_0(E)$ is separable, it follows that the map $\Omega\ni z\longmapsto\psi(f)(z)\in C_0(E)$ is strongly measurable (see Corollary 2.1.4). Moreover,
\[
\int_{\Omega}\big\|\psi(f)(z)\big\|_{C_0(E)}\,d\mu<+\infty,
\]
i.e., $\psi(f)\in L^1\big(\Omega;C_0(E)\big)$. Therefore
\[
\psi\colon K_0(\Omega,\Sigma,E)\longrightarrow L^1\big(\Omega;C_0(E)\big)
\]
is injective. Moreover, if $g\in L^1\big(\Omega;C_0(E)\big)$, then $g=\psi(f)$ with
\[
f(z,x)\stackrel{df}{=}g(z)(x),
\]
so $\psi$ is bijective.

REMARK 3.5.26
The above proposition permits the identification of $K_0(\Omega,\Sigma,E)$ with $L^1\big(\Omega;C_0(E)\big)$, which is very convenient because of the identification of $Y(\Omega,E,\mu)$ with $R(\Omega,E)$.


We want to investigate the compact sets in $Y(\Omega,E,\mu)$. For this purpose it is useful to recall the basic results characterizing the compact sets of the space $M(E)$ furnished with the narrow topology, i.e., the $w\big(M(E),C_b(E)\big)$-topology. If $E$ is compact, this topology coincides with the $w^*$-topology.

THEOREM 3.5.27 (Prohorov Theorem)
If $Y$ is a Polish space (see Definition A.2.29(a)) and $C\subseteq M(Y)_+$ is a bounded set, then $C$ is relatively compact in the narrow topology if and only if $C$ is uniformly tight, i.e., for every $\varepsilon>0$, we can find a compact set $K_\varepsilon\subseteq Y$, such that
\[
\sup_{\lambda\in C}\lambda\big(K_\varepsilon^c\big)\le\varepsilon.
\]

We have the following characterization of uniformly tight sets in $M(Y)_+$.

PROPOSITION 3.5.28
If $Y$ is a Polish space and $C\subseteq M(Y)_+$ is nonempty, then $C$ is uniformly tight if and only if there exists $\psi\colon Y\longrightarrow\overline{\mathbb{R}}_+$, such that
(a) the set $\{y\in Y:\ \psi(y)\le t\}$ is compact for every $t\ge 0$ (i.e., $\psi$ is inf-compact); and
(b) $\displaystyle\sup_{\lambda\in C}\int_{Y}\psi(y)\,\lambda(dy)<+\infty$.

PROOF “$\Longrightarrow$”: Let $\{K_n\}_{n\ge 1}$ be an increasing sequence of compact sets of $Y$, such that
\[
\lambda(Y\setminus K_n)\le\frac{1}{2^n}\qquad\forall\ n\ge 1,\ \lambda\in C.
\]
Let us set
\[
\psi\stackrel{df}{=}\sum_{n=1}^{\infty}\chi_{Y\setminus K_n}.
\]
We have
\[
\int_{Y}\psi(y)\,\lambda(dy)=\sum_{n=1}^{\infty}\int_{Y}\chi_{Y\setminus K_n}(y)\,\lambda(dy)\le 1\qquad\forall\ \lambda\in C.
\]
Note that $\psi$ is $\mathbb{N}\cup\{+\infty\}$-valued. So
\[
\{\psi\le t\}=\{\psi\le[t]\}=K_{[t]+1},
\]
where $[t]$ denotes the integer part of $t\ge 1$. Therefore $\psi$ is inf-compact.


Motivated by Theorem 3.5.27, we introduce the following definition.

DEFINITION 3.5.29
A set $S\subseteq Y(\Omega,E,\mu)$ is said to be uniformly tight, if and only if for every $\varepsilon>0$, there exists a compact set $K_\varepsilon\subseteq E$, such that
\[
\sup_{\nu\in S}\nu\big(\Omega\times(E\setminus K_\varepsilon)\big)\le\varepsilon.
\]

REMARK 3.5.30
The uniform tightness of $S\subseteq Y(\Omega,E,\mu)$ is equivalent to saying that $C=\{\nu\circ\mathrm{proj}_E^{-1}\}_{\nu\in S}$ is uniformly tight in $M(E)_+$ (see Theorem 3.5.27).

Using the identification of $Y(\Omega,E,\mu)$ with $R(\Omega,E)$ and Proposition 3.5.28, we are led to the following characterization of uniformly tight sets in $Y(\Omega,E,\mu)$.

PROPOSITION 3.5.31
$S\subseteq Y(\Omega,E,\mu)$ is uniformly tight if and only if there exists an inf-compact function $\psi\colon E\longrightarrow\overline{\mathbb{R}}_+\stackrel{df}{=}\mathbb{R}_+\cup\{+\infty\}$, such that
\[
\sup_{\nu\in S}\int_{\Omega}\int_{E}\psi(x)\,\nu(z)(dx)\,d\mu<+\infty.
\]

In the literature there is another notion of uniform tightness for Young measures (or equivalently for transition probabilities).

DEFINITION 3.5.32
A set $S\subseteq Y(\Omega,E,\mu)$ is said to be uniformly B-tight, if there exists a $\Sigma\times B(E)$-measurable function $\varphi\colon\Omega\times E\longrightarrow\overline{\mathbb{R}}_+$, such that $\varphi(z,\cdot)$ is inf-compact for all $z\in\Omega$, i.e., the set $\{x\in E:\ \varphi(z,x)\le t\}$ is compact for all $t\ge 0$ (hence $\varphi$ is a normal integrand, i.e., $\varphi\in N_+(\Omega,\Sigma,E)$), and
\[
\sup_{\nu\in S}\int_{\Omega\times E}\varphi(z,x)\,d\nu=\sup_{\nu\in S}\int_{\Omega}\int_{E}\varphi(z,x)\,\nu(z)(dx)\,d\mu<+\infty.
\]

REMARK 3.5.33
Evidently uniform tightness implies uniform B-tightness. In fact since $E$ is a Polish space, the two notions are equivalent (see Valadier (1975, p. 165)).


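LEMMA 3.5.34
If $S\subseteq Y(\Omega,E,\mu)$ is uniformly tight, then so is $\overline{S}$ (the closure in the narrow topology; see Definition 3.5.20).

PROOF By virtue of Definition 3.5.29, for a given $\varepsilon>0$, we can find a compact set $K_\varepsilon\subseteq E$, such that
\[
\sup_{\nu\in S}\nu\big(\Omega\times(E\setminus K_\varepsilon)\big)\le\varepsilon.
\]
Let us set
\[
\psi(x)\stackrel{df}{=}\chi_{E\setminus K_\varepsilon}(x).
\]
Then $\psi\in N_+(\Omega,\Sigma,E)$ and if $\nu\in Y(\Omega,E,\mu)$, we have
\[
\int_{\Omega\times E}\psi(x)\,d\nu\le\varepsilon\iff\nu\big(\Omega\times(E\setminus K_\varepsilon)\big)\le\varepsilon.
\]
Therefore, if $\nu\in\overline{S}$, by Proposition 3.5.23(b), it follows that
\[
\nu\big(\Omega\times(E\setminus K_\varepsilon)\big)\le\varepsilon,
\]
hence $\overline{S}$ is uniformly tight too.

The next theorem is the extension of Theorem 3.5.27 to Young measures.

THEOREM 3.5.35
$S\subseteq Y(\Omega,E,\mu)$ is relatively compact for the narrow topology if and only if $S$ is uniformly tight.

PROOF “$\Longrightarrow$”: Let $\varphi\colon Y(\Omega,E,\mu)\longrightarrow M(E)_+$ be defined by
\[
\varphi(\nu)\stackrel{df}{=}\nu\circ\mathrm{proj}_E^{-1}\qquad\forall\ \nu\in Y(\Omega,E,\mu).
\]
If $h\in C_b(E)$, then $h\circ\mathrm{proj}_E\in\widehat{K}_b(\Omega,\Sigma,E)$ and so from Remark 3.5.6, we have
\[
\int_{E} h(x)\,d\big(\nu\circ\mathrm{proj}_E^{-1}\big)=\int_{\Omega\times E}\big(h\circ\mathrm{proj}_E\big)(z,x)\,d\nu
\]
and thus $\varphi$ is continuous for the narrow topology on $Y(\Omega,\Sigma,E)$ and on $M(E)_+$. It follows then that $\varphi(\overline{S})$ is compact in $M(E)_+$ and so from Theorem 3.5.27, we have that $\varphi(\overline{S})$ is uniformly tight in $M(E)_+$. This then implies the uniform tightness of $S$ (see Remark 3.5.30).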


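“$\Longleftarrow$”: Let $\{\nu_n\}_{n\ge 1}\subseteq S$ be a sequence and consider $\widehat{\nu}_n\in Y(\Omega,\widehat{E},\mu)$, defined by
\[
\widehat{\nu}_n(C)\stackrel{df}{=}\nu_n\big(C\cap(\Omega\times E)\big)\qquad\forall\ C\in\Sigma\times B(\widehat{E}),\ n\ge 1.
\]
Because the set $\widehat{E}$ is compact, the weak$^*$ and narrow topologies on $Y(\Omega,\widehat{E},\mu)$ coincide (see Remark 3.5.21). So because of Theorem 3.5.10, we may assume that $\widehat{\nu}_n\longrightarrow\widehat{\nu}$ in $Y(\Omega,\widehat{E},\mu)$, with the narrow topology. Since $S$ is uniformly tight, for every $m\ge 1$, we can find a compact set $K_m\subseteq E$, such that
\[
\nu_n\big(\Omega\times(E\setminus K_m)\big)\le\frac{1}{m}\qquad\forall\ n\ge 1,
\]
so
\[
\widehat{\nu}_n\big(\Omega\times(\widehat{E}\setminus K_m)\big)\le\frac{1}{m}\qquad\forall\ n\ge 1
\]
and thus
\[
\widehat{\nu}\big(\Omega\times(\widehat{E}\setminus K_m)\big)\le\liminf_{n\to+\infty}\widehat{\nu}_n\big(\Omega\times(\widehat{E}\setminus K_m)\big)\le\frac{1}{m}.
\]
So $\widehat{\nu}$ is supported by $\Omega\times\bigcup_{m=1}^{\infty}K_m\subseteq\Omega\times E$.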

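Since in the sequel we shall focus on Young measures associated to measurable functions, let us give some examples in this direction.

EXAMPLE 3.5.36
In the examples that follow $\Omega=[0,1]$, $\mu$ is the Lebesgue measure and $E=\mathbb{R}$.

(a) Consider the Rademacher functions
\[
u_n(z)\stackrel{df}{=}\operatorname{sgn}\big(\sin(2^n\pi z)\big)\qquad\forall\ n\ge 1,
\]
where
\[
\operatorname{sgn}z\stackrel{df}{=}
\begin{cases}
\dfrac{z}{|z|} & \text{if } z\ne 0,\\[1ex]
1 & \text{if } z=0.
\end{cases}
\]
So
\[
u_n(z)=
\begin{cases}
1 & \text{if } z\in\big[\tfrac{k}{2^n},\tfrac{k+1}{2^n}\big),\ k \text{ is even},\\
-1 & \text{otherwise.}
\end{cases}
\]
We have
\[
u_n\xrightarrow{w^*}0\quad\text{in } L^\infty[0,1].
\]
Let $\nu_n$ be the Young measure associated to $u_n$ for $n\ge 1$. We have that
\[
\nu_n\xrightarrow{w^*}\frac{1}{2}\big(\mu\otimes\delta_1\big)+\frac{1}{2}\big(\mu\otimes\delta_{-1}\big).
\]
So we see that $w^*$-convergence in $L^\infty(\Omega)$ does not imply the $w^*$-convergence in $M(\Omega\times E)$ of the associated Young measures. Note that the sequence $\{u_n\}_{n\ge 1}$ has no subsequences converging $\mu$-almost everywhere on $\Omega=[0,1]$ (note that $\|u_n-u_m\|_{L^1[0,1]}=1$ for $n\ne m$ and compare with Remark 3.5.8).

(b) Let $\{u_n\}_{n\ge 1}$ be the Rademacher functions introduced above. Then define
\[
\widehat{u}_n(z)\stackrel{df}{=}
\begin{cases}
u_n(z) & \text{if } n \text{ is even},\\
\tfrac{1}{2}u_n(z) & \text{if } n \text{ is odd.}
\end{cases}
\]
Clearly we still have
\[
\widehat{u}_n\xrightarrow{w^*}0\quad\text{in } L^\infty[0,1].
\]
However, the sequence $\{\widehat{\nu}_n\}_{n\ge 1}$ of associated Young measures is not convergent, because
\[
\widehat{\nu}_{2n}\xrightarrow{w^*}\frac{1}{2}\big(\mu\otimes\delta_1\big)+\frac{1}{2}\big(\mu\otimes\delta_{-1}\big)\quad\text{in } M(\Omega\times E)
\]
and
\[
\widehat{\nu}_{2n+1}\xrightarrow{w^*}\frac{1}{2}\big(\mu\otimes\delta_{\frac{1}{2}}\big)+\frac{1}{2}\big(\mu\otimes\delta_{-\frac{1}{2}}\big)\quad\text{in } M(\Omega\times E).
\]

(c) Let
\[
u_n(z)\stackrel{df}{=}\sin(nz)\qquad\forall\ z\in[0,1],\ n\ge 1.
\]
We know that
\[
u_n\xrightarrow{w^*}0\quad\text{in } L^\infty[0,1].
\]
On the other hand it can be shown (see Tartar (1979, p. 148)), that if $\{\nu_n\}_{n\ge 1}$ is the sequence of associated Young measures, then
\[
\nu_n\xrightarrow{w^*}\nu\quad\text{in } M(\Omega\times E),
\]
where
\[
\nu(A)=\int_{A}\frac{1}{\pi}\,\frac{1}{\sqrt{1-z^2}}\,dz\quad\text{for all measurable } A\subseteq[0,1].
\]

(d) Let
\[
u_n(z)\stackrel{df}{=}\vartheta_n\,\chi_{[0,\frac{1}{n}]}(z)\qquad\forall\ n\ge 1,
\]
with $\vartheta_n\in\mathbb{R}$ for $n\ge 1$. Whatever the sequence $\{\vartheta_n\}_{n\ge 1}\subseteq\mathbb{R}$, if $\{\nu_n\}_{n\ge 1}$ is the sequence of associated Young measures, we have
\[
\nu_n\xrightarrow{w^*}\mu\otimes\delta_0\quad\text{in } M(\Omega\times E).
\]

(e) Let
\[
u_n(z)=n\qquad\forall\ z\in[0,1],\ n\ge 1.
\]
Also let $\{\nu_n\}_{n\ge 1}$ be the sequence of associated Young measures. Clearly
\[
\nu_n\xrightarrow{w^*}0\quad\text{in } M(\Omega\times E).
\]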
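The behaviour described in parts (a) and (c) of the example can be observed numerically. The sketch below is an illustration under our own assumptions, not a construction from the text: integrals over $[0,1]$ are replaced by midpoint sums, the test integrand $f(z,x)=z\,e^{x}$ and the weight $\vartheta(z)=z$ are arbitrary choices, and for part (c) we read the limit as the arcsine law with density $1/(\pi\sqrt{1-x^2})$ on $(-1,1)$, whose second moment is $1/2$.

```python
import numpy as np

# Assumption: Omega = [0, 1] with Lebesgue measure, approximated by 2**16 midpoints.
N_GRID = 2 ** 16
Z = (np.arange(N_GRID) + 0.5) / N_GRID

def rademacher(n, z=Z):
    """u_n = 1 on [k/2^n, (k+1)/2^n) with k even, -1 otherwise (part (a))."""
    return np.where(np.floor(z * 2 ** n) % 2 == 0, 1.0, -1.0)

def integrate(values):
    """Midpoint-rule approximation of an integral over [0, 1]."""
    return float(np.mean(values))

n = 8
u = rademacher(n)

# Weak* behaviour: pairing with the weight theta(z) = z is close to 0.
weak_pairing = integrate(Z * u)

# Young-measure behaviour: f(z, x) = z * exp(x) evaluated along u_n.
young_value = integrate(Z * np.exp(u))
young_limit = 0.5 * np.cosh(1.0)                # integral of z * (e + 1/e) / 2
value_at_weak_limit = integrate(Z * np.exp(0.0 * Z))  # integral of f(z, 0) = z

# Part (c): the second moment of sin(n z) approaches 1/2.
second_moment = integrate(np.sin(1000 * Z) ** 2)

print(weak_pairing, young_value, young_limit, value_at_weak_limit, second_moment)
```

The point of the comparison is that the limit of $\int f(z,u_n(z))\,d\mu$ is given by the Young measure $\frac12(\mu\otimes\delta_1)+\frac12(\mu\otimes\delta_{-1})$ and differs from $\int f(z,0)\,d\mu$, the value at the weak$^*$ limit.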

REMARK 3.5.37
In Examples 3.5.36(a), (b) and (c), the functions $u_n$ have values in $[-1,1]$, so we can take $E=[-1,1]$ and then from Theorem 3.5.10, it follows that the sequence $\{\nu_n\}_{n\ge 1}$ is relatively $w^*$-compact in $M(\Omega\times E)$. In Examples 3.5.36(d) and (e), we cannot assume that $E$ is compact. Nevertheless the sequence $\{\nu_n\}_{n\ge 1}$ is uniformly tight, thus relatively compact for the narrow topology (hence for the weak$^*$-topology too; see Theorem 3.5.35).

As we already mentioned, in the remaining part of this section, we look at Young measures associated to measurable functions. So in what follows $u_n\colon\Omega\longrightarrow E$ are $\Sigma$-measurable functions and $\{\nu_n\}_{n\ge 1}$ is the sequence of corresponding Young measures. Following Definition 3.5.32, we introduce the following notion.

DEFINITION 3.5.38
The sequence $\{u_n\}_{n\ge 1}$ is uniformly tight, if the sequence $\{\nu_n\}_{n\ge 1}$ is uniformly tight (in the sense of Definition 3.5.32).

REMARK 3.5.39
Since $\nu_n(z)=\delta_{u_n(z)}$, according to the above definition, the sequence $\{u_n\}_{n\ge 1}$ is uniformly tight, if for every $\varepsilon>0$, we can find a compact set $K_\varepsilon\subseteq E$, such that
\[
\sup_{n\ge 1}\mu\big(\{z\in\Omega:\ u_n(z)\notin K_\varepsilon\}\big)\le\varepsilon.
\]
By Proposition 3.5.31, equivalently we can say that there exists an inf-compact function $\psi\colon E\longrightarrow\overline{\mathbb{R}}_+$, such that
\[
\sup_{n\ge 1}\int_{\Omega}\psi\big(u_n(z)\big)\,d\mu<+\infty.
\]
In particular then if $E=\mathbb{R}^N$ and we take $\psi(x)=\|x\|_{\mathbb{R}^N}$, then we see that every bounded sequence $\{u_n\}_{n\ge 1}\subseteq L^1\big(\Omega;\mathbb{R}^N\big)$ is uniformly tight.
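The last observation rests on the Chebyshev (Markov) inequality: if $\sup_n\|u_n\|_{L^1}\le M$, then $\mu(\{\|u_n\|>R\})\le M/R$, so the closed ball of radius $R=M/\varepsilon$ can serve as $K_\varepsilon$. The following sketch illustrates this under our own assumptions ($\Omega=[0,1]$ discretized by a midpoint grid, $N=1$, and the hypothetical sequence $u_n=n\,\chi_{[0,1/n]}$, which is bounded in $L^1$ but not uniformly integrable).

```python
import numpy as np

Z = (np.arange(200_000) + 0.5) / 200_000   # midpoint grid on Omega = [0, 1]

def u(n, z=Z):
    """Illustrative L^1-bounded sequence: u_n = n on [0, 1/n], 0 elsewhere."""
    return np.where(z <= 1.0 / n, float(n), 0.0)

def l1_norm(values):
    """Approximate the L^1(0, 1) norm by a midpoint sum."""
    return float(np.mean(np.abs(values)))

def mass_outside_ball(values, radius):
    """Approximate mu({z : |u_n(z)| > radius}); K is the closed ball of that radius."""
    return float(np.mean(np.abs(values) > radius))

indices = range(1, 201)
M = max(l1_norm(u(n)) for n in indices)    # sup_n of the L^1 norms (about 1)
worst = {R: max(mass_outside_ball(u(n), R) for n in indices)
         for R in (2.0, 5.0, 10.0, 50.0)}
for R, mass in worst.items():
    print(f"R={R:5.1f}  sup_n mu(|u_n| > R) = {mass:.4f}  <=  M/R = {M / R:.4f}")
```

The printed bound $M/R$ does not depend on $n$, which is exactly the uniformity required in the definition of tightness.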


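THEOREM 3.5.40
If
\[
\nu_n\xrightarrow{n}\nu
\]
and $f\in N(\Omega,\Sigma,E)$ is such that the sequence $\big\{f^-\big(\cdot,u_n(\cdot)\big)\big\}_{n\ge 1}$ is uniformly integrable in $L^1(\Omega)$, then
(a) if $\displaystyle\liminf_{n\to+\infty}\int_{\Omega} f\big(z,u_n(z)\big)\,d\mu<+\infty$, then $\displaystyle\int_{\Omega\times E} f^+(z,x)\,d\nu<+\infty$;
(b) $\displaystyle\int_{\Omega\times E} f(z,x)\,d\nu\le\liminf_{n\to+\infty}\int_{\Omega} f\big(z,u_n(z)\big)\,d\mu$.

PROOF (a) Fix $c\ge 0$ and let
\[
f_c\stackrel{df}{=}\max\{-c,f\}.
\]
From Proposition 3.5.23(b), we have
\[
\int_{\Omega\times E}\big(f_c(z,x)+c\big)\,d\nu\le\liminf_{n\to+\infty}\int_{\Omega}\big(f_c(z,u_n(z))+c\big)\,d\mu,
\]
so
\[
\int_{\Omega\times E} f_c(z,x)\,d\nu\le\liminf_{n\to+\infty}\int_{\Omega} f_c\big(z,u_n(z)\big)\,d\mu<+\infty. \tag{3.133}
\]
If we let $c=0$, then from (3.133) and the uniform integrability hypothesis, we conclude that the sequence $\big\{f^+\big(\cdot,u_n(\cdot)\big)\big\}_{n\ge 1}$ is bounded in $L^1(\Omega)$. This proves (a).

(b) Let
\[
A_{n,c}\stackrel{df}{=}\big\{z\in\Omega:\ f\big(z,u_n(z)\big)<-c\big\}.
\]
Then we have
\[
\int_{A_{n,c}} f\big(z,u_n(z)\big)\,d\mu\le 0\qquad\forall\ n\ge 1,
\]
so
\[
\int_{A_{n,c}} f^+\big(z,u_n(z)\big)\,d\mu-\int_{A_{n,c}} f^-\big(z,u_n(z)\big)\,d\mu\le 0\qquad\forall\ n\ge 1.
\]
Thus for a given $\varepsilon>0$, we can find $c>0$ large enough so that
\[
-\varepsilon\le\int_{A_{n,c}} f\big(z,u_n(z)\big)\,d\mu\le 0. \tag{3.134}
\]
Note that
\[
f_c\big(z,u_n(z)\big)=f\big(z,u_n(z)\big)\ \text{ on } \Omega\setminus A_{n,c}
\quad\text{and}\quad
f_c\big(z,u_n(z)\big)=-c\ \text{ on } A_{n,c}.
\]
Hence
\[
\begin{aligned}
\int_{\Omega} f\big(z,u_n(z)\big)\,d\mu
&=\int_{A_{n,c}} f\big(z,u_n(z)\big)\,d\mu+\int_{\Omega\setminus A_{n,c}} f\big(z,u_n(z)\big)\,d\mu\\
&=\int_{A_{n,c}} f\big(z,u_n(z)\big)\,d\mu+\int_{\Omega} f_c\big(z,u_n(z)\big)\,d\mu-\int_{A_{n,c}} f_c\big(z,u_n(z)\big)\,d\mu\\
&\ge\int_{A_{n,c}} f\big(z,u_n(z)\big)\,d\mu+\int_{\Omega} f_c\big(z,u_n(z)\big)\,d\mu
\ \ge\ \int_{\Omega} f_c\big(z,u_n(z)\big)\,d\mu-\varepsilon
\end{aligned}
\]
(see (3.134)). So using (3.133) and the fact that $f\le f_c$, we have
\[
\liminf_{n\to+\infty}\int_{\Omega} f\big(z,u_n(z)\big)\,d\mu
\ge\liminf_{n\to+\infty}\int_{\Omega} f_c\big(z,u_n(z)\big)\,d\mu-\varepsilon
\ge\int_{\Omega\times E} f_c(z,x)\,d\nu-\varepsilon
\ge\int_{\Omega\times E} f(z,x)\,d\nu-\varepsilon.
\]
Let $\varepsilon\searrow 0$ to finish the proof of the theorem.

COROLLARY 3.5.41
If $\nu_n\xrightarrow{n}\nu$, $f_0\colon\Omega\times E\longrightarrow\mathbb{R}$ is $\Sigma\times B(E)$-measurable, $f_0(z,\cdot)\in C(E)$ for all $z\in\Omega$ and $\big\{f_0\big(\cdot,u_n(\cdot)\big)\big\}_{n\ge 1}$ is a sequence of uniformly integrable functions, then $f_0$ is $\nu$-integrable and
\[
\int_{\Omega\times E} f_0(z,x)\,d\nu=\lim_{n\to+\infty}\int_{\Omega} f_0\big(z,u_n(z)\big)\,d\mu.
\]

PROOF Use Theorem 3.5.40 with $f=f_0$ and $f=-f_0$.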


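COROLLARY 3.5.42
If $\nu_n\xrightarrow{n}\nu$, $h\colon E\longrightarrow\mathbb{R}$ is a continuous function and $\big\{h\big(u_n(\cdot)\big)\big\}_{n\ge 1}$ is a sequence of uniformly integrable functions, then
(a) for $\mu$-almost all $z\in\Omega$, the function $h$ is $\nu(z)$-integrable and
\[
\int_{\Omega}\int_{E}\big|h(x)\big|\,\nu(z)(dx)\,d\mu<+\infty;
\]
(b) $h(u_n)\xrightarrow{w}\widehat{h}$ in $L^1(\Omega)$, where
\[
\widehat{h}(z)\stackrel{df}{=}\int_{E} h(x)\,\nu(z)(dx).
\]

PROOF (a) Follows from Corollary 3.5.41.

(b) Let $\vartheta\in L^\infty(\Omega)$ and let us set
\[
f_0(z,x)\stackrel{df}{=}\vartheta(z)h(x).
\]
We have
\[
\int_{A}\big|f_0\big(z,u_n(z)\big)\big|\,d\mu\le\|\vartheta\|_\infty\int_{A}\big|h\big(u_n(z)\big)\big|\,d\mu\qquad\forall\ A\in\Sigma,
\]
so $\big\{f_0\big(\cdot,u_n(\cdot)\big)\big\}_{n\ge 1}$ is a sequence of uniformly integrable functions. We can apply Corollary 3.5.41 and obtain
\[
\int_{\Omega\times E} f_0(z,x)\,d\nu=\lim_{n\to+\infty}\int_{\Omega} f_0\big(z,u_n(z)\big)\,d\mu=\lim_{n\to+\infty}\int_{\Omega}\vartheta(z)h\big(u_n(z)\big)\,d\mu. \tag{3.135}
\]
Because $\big\{h\big(u_n(\cdot)\big)\big\}_{n\ge 1}$ is a sequence of uniformly integrable functions, from the Dunford-Pettis theorem (see Theorem 2.3.24), we may assume that
\[
h(u_n)\xrightarrow{w}\widehat{h}\quad\text{in } L^1(\Omega).
\]
So from (3.135), we have
\[
\int_{\Omega\times E} f_0(z,x)\,d\nu=\int_{\Omega}\vartheta(z)\Big(\int_{E} h(x)\,\nu(z)(dx)\Big)d\mu=\int_{\Omega}\vartheta(z)\widehat{h}(z)\,d\mu.
\]
Let $\vartheta=\chi_A$, $A\in\Sigma$. Then we obtain
\[
\int_{A}\int_{E} h(x)\,\nu(z)(dx)\,d\mu=\int_{A}\widehat{h}(z)\,d\mu\qquad\forall\ A\in\Sigma,
\]
so
\[
\int_{E} h(x)\,\nu(z)(dx)=\widehat{h}(z)\quad\text{for $\mu$-a.a. } z\in\Omega.
\]
As every subsequence of $\big\{h\big(u_n(\cdot)\big)\big\}_{n\ge 1}$ has a further subsequence converging weakly in $L^1(\Omega)$ to $\widehat{h}$, we conclude that the original sequence $\big\{h\big(u_n(\cdot)\big)\big\}_{n\ge 1}$ converges weakly in $L^1(\Omega)$ to $\widehat{h}$.

REMARK 3.5.43
If $E$ is compact, then
\[
h(u_n)\xrightarrow{w^*}\widehat{h}\quad\text{in } L^\infty(\Omega).
\]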

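PROPOSITION 3.5.44
If $E=\mathbb{R}^N$ and $\{u_n\}_{n\ge 1}\subseteq L^\infty\big(\Omega;\mathbb{R}^N\big)$, $\{\nu_n\}_{n\ge 1}$ are sequences, such that
\[
u_n\xrightarrow{w^*}u\quad\text{in } L^\infty\big(\Omega;\mathbb{R}^N\big)
\]
and
\[
\nu_n\xrightarrow{w^*}\nu\quad\text{in } M(\Omega\times\mathbb{R}^N),
\]
then
\[
u(z)=\int_{\mathbb{R}^N} x\,\nu(z)(dx)\quad\text{for $\mu$-a.a. } z\in\Omega.
\]

PROOF Since the sequence $\{u_n\}_{n\ge 1}$ is bounded in $L^\infty\big(\Omega;\mathbb{R}^N\big)$, we may replace $E=\mathbb{R}^N$ by a compact subset of it. Then the narrow and weak$^*$ topologies coincide. Therefore from Corollary 3.5.42 (see also Remark 3.5.43) with $h(x)=x$, we obtain the result.

REMARK 3.5.45
For a given measurable function $u\colon\Omega\longrightarrow E$, we define the barycenter of $u$ to be the set
\[
\mathrm{Bar}(u)\stackrel{df}{=}\Big\{\lambda\in R(\Omega,E):\ u(z)=\int_{E} x\,\lambda(z)(dx)\ \text{for $\mu$-a.a. } z\in\Omega\Big\}.
\]
So the conclusion of Proposition 3.5.44 says that $\nu\in R\big(\Omega,\mathbb{R}^N\big)$ belongs to the barycenter of $u$.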
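The barycenter relation can be illustrated with a simple oscillating sequence. The sketch below is hedged: it assumes $\Omega=[0,1]$ discretized by a midpoint grid, $N=1$, and the hypothetical sequence $u_n(z)=1$ where the fractional part of $nz$ is below a fixed volume fraction `THETA` and $u_n(z)=-1$ elsewhere. For such periodic oscillations the associated Young measures converge to the homogeneous measure $\nu(z)=\theta\,\delta_1+(1-\theta)\,\delta_{-1}$, whose first moment is $2\theta-1$.

```python
import numpy as np

Z = (np.arange(400_000) + 0.5) / 400_000   # midpoint grid on Omega = [0, 1]
THETA = 0.25                                # hypothetical volume fraction of the value +1

def oscillating(n, z=Z):
    """u_n = +1 where frac(n z) < THETA and -1 elsewhere."""
    return np.where(np.mod(n * z, 1.0) < THETA, 1.0, -1.0)

def barycenter():
    """First moment of nu(z) = THETA * delta_1 + (1 - THETA) * delta_{-1}."""
    return THETA * 1.0 + (1.0 - THETA) * (-1.0)

weight = Z                                  # test function in L^1, here weight(z) = z
pairing = float(np.mean(weight * oscillating(200)))
expected = float(np.mean(weight * barycenter()))
print(pairing, expected)
```

The pairing of $u_n$ with the weight is close to the pairing of the constant function $2\theta-1$ with the same weight, in agreement with $u(z)=\int x\,\nu(z)(dx)$.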


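In Proposition 2.3.38 we produced a criterion for strong convergence in $L^1(\Omega)$. In the next proposition, using the tools provided by the theory of Young measures, we obtain a criterion for strong convergence in $L^p\big(\Omega;\mathbb{R}^N\big)$, $p\in[1,+\infty)$.

PROPOSITION 3.5.46
If $E=\mathbb{R}^N$, $\{u_n\}_{n\ge 1}\subseteq L^\infty\big(\Omega;\mathbb{R}^N\big)$ and $\{\nu_n\}_{n\ge 1}\subseteq M(\Omega\times E)$ are sequences, such that
\[
u_n\xrightarrow{w^*}u\quad\text{in } L^\infty\big(\Omega;\mathbb{R}^N\big)
\]
and
\[
\nu_n\xrightarrow{w^*}\nu\quad\text{in } M(\Omega\times E),
\]
then
\[
u_n\longrightarrow u\quad\text{in } L^p\big(\Omega;\mathbb{R}^N\big)
\]
for $p\in[1,+\infty)$ if and only if
\[
\nu(z)=\delta_{u(z)}\quad\text{for $\mu$-a.a. } z\in\Omega.
\]

PROOF “$\Longrightarrow$”: Let $h\in C_0\big(\mathbb{R}^N\big)$. Then
\[
h(u_n)\longrightarrow h(u)\quad\text{in } L^p(\Omega).
\]
On the other hand it is easy to see that $\big\{h\big(u_n(\cdot)\big)\big\}_{n\ge 1}$ is a sequence of uniformly integrable functions. So by virtue of Corollary 3.5.42(b), we have that
\[
h(u_n)\xrightarrow{w}\widehat{h}\quad\text{in } L^1(\Omega),
\]
with
\[
\widehat{h}(z)\stackrel{df}{=}\int_{\mathbb{R}^N} h(x)\,\nu(z)(dx).
\]
We have
\[
h\big(u(z)\big)=\big\langle h,\delta_{u(z)}\big\rangle_{C_0(\mathbb{R}^N)}
=\int_{\mathbb{R}^N} h(x)\,\nu(z)(dx)=\big\langle h,\nu(z)\big\rangle_{C_0(\mathbb{R}^N)}\quad\text{for $\mu$-a.a. } z\in\Omega,
\]
where by $\langle\cdot,\cdot\rangle_{C_0(\mathbb{R}^N)}$ we denote the duality brackets for the pair of spaces $\big(C_0\big(\mathbb{R}^N\big),M\big(\mathbb{R}^N\big)\big)$. Since $h\in C_0\big(\mathbb{R}^N\big)$ was arbitrary, we obtain
\[
\delta_{u(z)}=\nu(z)\quad\text{for $\mu$-a.a. } z\in\Omega.
\]

“$\Longleftarrow$”: It suffices to prove the implication for $p>1$ (recall that the embedding $L^p\big(\Omega;\mathbb{R}^N\big)\subseteq L^1\big(\Omega;\mathbb{R}^N\big)$ is continuous). If
\[
\nu(z)=\delta_{u(z)}\quad\text{for $\mu$-a.a. } z\in\Omega,
\]
then from Corollary 3.5.42(b) with $h(x)=\|x\|^p_{\mathbb{R}^N}$, we have
\[
\|u_n\|^p_{\mathbb{R}^N}\xrightarrow{w}\|u\|^p_{\mathbb{R}^N}\quad\text{in } L^1(\Omega)
\]
and so $\|u_n\|_p\longrightarrow\|u\|_p$. Note that
\[
u_n\xrightarrow{w}u\quad\text{in } L^p\big(\Omega;\mathbb{R}^N\big).
\]
So from the Kadec-Klee property (see Remark A.3.22), we have that
\[
u_n\longrightarrow u\quad\text{in } L^p\big(\Omega;\mathbb{R}^N\big).
\]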

We have another result in this direction. First a lemma.

LEMMA 3.5.47
If $\{u_n\}_{n\ge 1}\subseteq L^1\big(\Omega;\mathbb{R}^N\big)$ is a sequence of uniformly integrable functions, then we can find a subsequence $\{\nu_{n_k}\}_{k\ge 1}$ of $\{\nu_n\}_{n\ge 1}$, such that
(a) $\nu_{n_k}\xrightarrow{n}\nu$ in $Y(\Omega,\mathbb{R}^N,\mu)$;
(b) $\displaystyle\int_{\mathbb{R}^N}\|x\|_{\mathbb{R}^N}\,\nu(z)(dx)<+\infty$;
(c) $u_{n_k}\xrightarrow{w}u$ in $L^1\big(\Omega;\mathbb{R}^N\big)$, with $u(z)=\displaystyle\int_{\mathbb{R}^N} x\,\nu(z)(dx)$.

PROOF (a) From Proposition 3.5.31 with $\psi(x)=\|x\|_{\mathbb{R}^N}$, we see that the sequence $\{\nu_n\}_{n\ge 1}$ is uniformly tight. So by virtue of Theorem 3.5.35, we can find a subsequence $\{\nu_{n_k}\}_{k\ge 1}$ of $\{\nu_n\}_{n\ge 1}$, such that
\[
\nu_{n_k}\xrightarrow{n}\nu\quad\text{as } k\to+\infty\ \text{in } Y(\Omega,\mathbb{R}^N,\mu).
\]
(b) For the subsequence obtained in part (a), we see that we can apply Corollary 3.5.42(a) with $h(x)=\|x\|_{\mathbb{R}^N}$ and obtain the desired conclusion.
(c) For the subsequence obtained in part (a), we see that we can apply Corollary 3.5.42(b) with $h(x)=x_k$, $k\in\{1,\ldots,N\}$, for all $x=(x_1,\ldots,x_N)\in\mathbb{R}^N$ and obtain the desired conclusion.


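PROPOSITION 3.5.48
If $\{u_n\}_{n\ge 1}\subseteq L^1\big(\Omega;\mathbb{R}^N\big)$ is a sequence, such that
\[
u_n\xrightarrow{w}u\quad\text{in } L^1\big(\Omega;\mathbb{R}^N\big)
\]
and for every subsequence $\{\nu_{n_k}\}_{k\ge 1}$ of the sequence $\{\nu_n\}_{n\ge 1}$ for which we have
\[
\nu_{n_k}\xrightarrow{n}\nu\quad\text{as } k\to+\infty\ \text{in } Y(\Omega,\mathbb{R}^N,\mu),
\]
$\nu$ is the Young measure associated to a $\Sigma$-measurable function $w\colon\Omega\longrightarrow\mathbb{R}^N$, then $w=u$ and
\[
u_n\longrightarrow u\quad\text{in } L^1\big(\Omega;\mathbb{R}^N\big).
\]

PROOF We have
\[
\int_{\Omega}\big\|w(z)\big\|_{\mathbb{R}^N}\,d\mu
=\int_{\Omega}\Big\|\int_{\mathbb{R}^N} x\,\nu(z)(dx)\Big\|_{\mathbb{R}^N}d\mu
\le\liminf_{k\to+\infty}\int_{\Omega}\big\|u_{n_k}(z)\big\|_{\mathbb{R}^N}\,d\mu<+\infty,
\]
so $w\in L^1\big(\Omega;\mathbb{R}^N\big)$. If
\[
\nu_{n_k}\xrightarrow{n}\nu=\delta_{w(\cdot)},
\]
then by Lemma 3.5.47(c), we have
\[
u_{n_k}\xrightarrow{w}w\quad\text{in } L^1\big(\Omega;\mathbb{R}^N\big).
\]
Hence $w=u$. From Proposition 3.5.22, we have that
\[
u_{n_k}\xrightarrow{\mu}u
\]
and by passing to a further subsequence if necessary, we may assume that $u_{n_k}(z)\longrightarrow u(z)$ for $\mu$-a.a. $z\in\Omega$. Then from the extended dominated convergence theorem (Vitali's theorem; see Theorem A.2.9), we have that
\[
u_n\longrightarrow u\quad\text{in } L^1\big(\Omega;\mathbb{R}^N\big).
\]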

Now we can prove a lower semicontinuity result for integral functionals.


THEOREM 3.5.49
If $\{u_n\}_{n\ge 1}\subseteq L^1\big(\Omega;\mathbb{R}^N\big)$ is a sequence, such that
\[
u_n\xrightarrow{w}u\quad\text{in } L^1\big(\Omega;\mathbb{R}^N\big),
\]
$\{w_n\colon\Omega\longrightarrow E\}_{n\ge 1}$ is a sequence of $\Sigma$-measurable functions, such that
\[
w_n\xrightarrow{\mu}w,
\]
for some $\Sigma$-measurable function $w\colon\Omega\longrightarrow E$, $f\in N\big(\Omega,\Sigma,E\times\mathbb{R}^N\big)$ and
(i) $f\big(z,w(z),\cdot\big)$ is convex for $\mu$-almost all $z\in\Omega$;
(ii) $\big\{f^-\big(\cdot,w_n(\cdot),u_n(\cdot)\big)\big\}_{n\ge 1}$ is a sequence of uniformly integrable functions,
then
(a) if $\displaystyle\liminf_{n\to+\infty}\int_{\Omega} f\big(z,w_n(z),u_n(z)\big)\,d\mu<+\infty$, then $\displaystyle\int_{\Omega} f^+\big(z,w(z),u(z)\big)\,d\mu<+\infty$;
(b) $\displaystyle\int_{\Omega} f\big(z,w(z),u(z)\big)\,d\mu\le\liminf_{n\to+\infty}\int_{\Omega} f\big(z,w_n(z),u_n(z)\big)\,d\mu$.

PROOF By passing to a suitable subsequence, we may assume that the limit
\[
\lim_{n\to+\infty}\int_{\Omega} f\big(z,w_n(z),u_n(z)\big)\,d\mu
\]
exists, while by Proposition 3.5.22, we have
\[
\delta_{w_n}\xrightarrow{n}\delta_w\quad\text{in } Y(\Omega,\Sigma,E)
\]
and by Lemma 3.5.47, we have
\[
\delta_{u_n}\xrightarrow{n}\nu\quad\text{in } Y(\Omega,\Sigma,\mathbb{R}^N).
\]
It is easy to see that
\[
\delta_{w_n}\otimes\delta_{u_n}\xrightarrow{n}\delta_w\otimes\nu\quad\text{in } Y(\Omega,\Sigma,E\times\mathbb{R}^N).
\]
Invoking Theorem 3.5.40(a), we obtain the implication
\[
\liminf_{n\to+\infty}\int_{\Omega} f\big(z,w_n(z),u_n(z)\big)\,d\mu<+\infty
\ \Longrightarrow\
\int_{\Omega}\Big(\int_{\mathbb{R}^N} f^+\big(z,w(z),x\big)\,\nu(z)(dx)\Big)d\mu<+\infty. \tag{3.136}
\]
From Theorem 3.5.40(b), we have
\[
\int_{\Omega}\Big(\int_{\mathbb{R}^N} f\big(z,w(z),x\big)\,\nu(z)(dx)\Big)d\mu\le\liminf_{n\to+\infty}\int_{\Omega} f\big(z,w_n(z),u_n(z)\big)\,d\mu. \tag{3.137}
\]
Since by hypothesis $f\big(z,w(z),\cdot\big)$ is convex for $\mu$-almost all $z\in\Omega$, using Jensen's inequality (see Theorem A.2.26), we obtain
\[
\int_{\Omega} f\big(z,w(z),u(z)\big)\,d\mu=\int_{\Omega} f\Big(z,w(z),\int_{\mathbb{R}^N} x\,\nu(z)(dx)\Big)d\mu
\le\int_{\Omega}\Big(\int_{\mathbb{R}^N} f\big(z,w(z),x\big)\,\nu(z)(dx)\Big)d\mu. \tag{3.138}
\]
Then part (a) follows from (3.136) and (3.138), while part (b) follows from estimates (3.137) and (3.138).

There is a version of this result for integrands defined on Banach spaces.

THEOREM 3.5.50
If $(\Omega,\Sigma,\mu)$ is a finite measure space, $X$ is a separable Banach space, $Y$ is a separable reflexive Banach space and $f\colon\Omega\times X\times Y\longrightarrow\overline{\mathbb{R}}$ is a $\Sigma\times B(X)\times B(Y)$-measurable function, such that
(i) the function $(x,y)\longmapsto f(\omega,x,y)$ is lower semicontinuous for $\mu$-almost all $\omega\in\Omega$;
(ii) the function $y\longmapsto f(\omega,x,y)$ is convex for $\mu$-almost all $\omega\in\Omega$ and all $x\in X$;
(iii) there exist $\beta\in L^1(\Omega)$ and $c>0$, such that
\[
f(\omega,x,y)\ge\beta(\omega)-c\big(\|x\|_X+\|y\|_Y\big)\quad\text{for $\mu$-a.a. } \omega\in\Omega \text{ and all } (x,y)\in X\times Y,
\]
then the functional
\[
(u,v)\longmapsto I_f(u,v)\stackrel{df}{=}\int_{\Omega} f\big(\omega,u(\omega),v(\omega)\big)\,d\mu
\]
is sequentially lower semicontinuous from $L^1(\Omega;X)\times L^1(\Omega;Y)_w$ into $\overline{\mathbb{R}}$.
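The role of the convexity hypothesis in Theorems 3.5.49 and 3.5.50 can be seen on the Rademacher functions, which converge weakly to $0$ in $L^1[0,1]$. The following sketch is an illustration under our own assumptions ($\Omega=[0,1]$ discretized by a midpoint grid, integrands independent of $z$ and $w$, and the two hypothetical integrands $x^2$ and $(x^2-1)^2$): the convex integrand respects the inequality $I_f(u)\le\liminf I_f(u_n)$, while the nonconvex double-well integrand violates it.

```python
import numpy as np

Z = (np.arange(2 ** 16) + 0.5) / 2 ** 16    # midpoint grid on Omega = [0, 1]

def rademacher(n, z=Z):
    """Rademacher function: +1 on [k/2^n, (k+1)/2^n) for even k, -1 otherwise."""
    return np.where(np.floor(z * 2 ** n) % 2 == 0, 1.0, -1.0)

def integral_functional(integrand, u_values):
    """Approximate I_f(u), the integral over [0, 1] of f(u(z))."""
    return float(np.mean(integrand(u_values)))

convex = lambda x: x ** 2                    # convex in the last variable
double_well = lambda x: (x ** 2 - 1.0) ** 2  # not convex

weak_limit = np.zeros_like(Z)                # u_n -> 0 weakly in L^1
along_sequence = {
    name: [integral_functional(f, rademacher(n)) for n in range(1, 9)]
    for name, f in (("convex", convex), ("double_well", double_well))
}
at_limit = {
    "convex": integral_functional(convex, weak_limit),
    "double_well": integral_functional(double_well, weak_limit),
}
print(along_sequence, at_limit)
```

For the convex integrand the value at the weak limit is $0$ and the values along the sequence equal $1$; for the double-well integrand the value at the weak limit is $1$ while the values along the sequence equal $0$, so lower semicontinuity fails without convexity.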


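In Proposition 2.3.39, using an extremality condition, we obtained a result concerning strong convergence in $L^1\big(\Omega;\mathbb{R}^N\big)$. There the result was stated without proof. Now that we have at our disposal the tools of the theory of Young measures, we can give a proof of it.

PROPOSITION 3.5.51
If $\{u_n\}_{n\ge 1}\subseteq L^1\big(\Omega;\mathbb{R}^N\big)$ is a sequence, such that
\[
u_n\xrightarrow{w}u\quad\text{in } L^1\big(\Omega;\mathbb{R}^N\big),
\]
for some $u\in L^1\big(\Omega;\mathbb{R}^N\big)$ and
\[
u(z)\in\operatorname{ext}\Big(\operatorname{conv}\limsup_{n\to+\infty}\{u_n(z)\}\Big)\quad\text{for $\mu$-a.a. } z\in\Omega,
\]
then
\[
u_n\longrightarrow u\quad\text{in } L^1\big(\Omega;\mathbb{R}^N\big).
\]

PROOF Let
\[
S(z)\stackrel{df}{=}\operatorname{conv}\limsup_{n\to+\infty}\{u_n(z)\}\qquad\forall\ z\in\Omega.
\]
Since $u(z)\in\operatorname{ext}S(z)$ for $\mu$-a.a. $z\in\Omega$, we can find a sequence $\{C_n(z)\}_{n\ge 1}$ of closed, convex sets in $S(z)$, such that
\[
\bigcup_{n=1}^{\infty}C_n(z)=S(z)\setminus\{u(z)\}\quad\text{for $\mu$-a.a. } z\in\Omega.
\]
Let us fix $z\in\Omega\setminus N$ with $\mu(N)=0$. From Lemma 3.5.47, we have
\[
u(z)=\int_{\mathbb{R}^N} x\,\nu(z)(dx),
\]
with $\nu(z)$ being supported by $\limsup_{n\to+\infty}\{u_n(z)\}$. Suppose that
\[
\nu(z)\big(C_n(z)\big)>0.
\]
If $\nu(z)\big(C_n(z)\big)=1$, then $u(z)\in C_n(z)$, a contradiction. Therefore
\[
0<\nu(z)\big(C_n(z)\big)<1
\]
and we can define
\[
\lambda_1\stackrel{df}{=}\frac{\nu(z)|_{C_n(z)}}{\nu(z)\big(C_n(z)\big)}
\quad\text{and}\quad
\lambda_2\stackrel{df}{=}\frac{\nu(z)|_{\mathbb{R}^N\setminus C_n(z)}}{1-\nu(z)\big(C_n(z)\big)}.
\]
It follows that
\[
u(z)=\int_{\mathbb{R}^N} x\,\nu(z)(dx)
=\nu(z)\big(C_n(z)\big)\int_{\mathbb{R}^N} x\,\lambda_1(dx)+\Big(1-\nu(z)\big(C_n(z)\big)\Big)\int_{\mathbb{R}^N} x\,\lambda_2(dx),
\]
with
\[
u(z)\ne\int_{\mathbb{R}^N} x\,\lambda_1(dx),
\]
since
\[
\int_{\mathbb{R}^N} x\,\lambda_1(dx)\in C_n(z).
\]
So we have a contradiction to the hypothesis that $u(z)$ is extremal in $S(z)$. This implies that
\[
\nu(z)\big(C_n(z)\big)=0\qquad\forall\ n\ge 1,\ z\in\Omega\setminus N,
\]
hence
\[
\nu(z)\big(\{u(z)\}\big)=1,
\]
i.e.,
\[
\nu(z)=\delta_{u(z)}\quad\text{for $\mu$-a.a. } z\in\Omega.
\]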

Then the conclusion of the proposition follows from Proposition 3.5.46.

REMARK 3.5.52
The result is true if $(\Omega,\Sigma,\mu)$ is any finite measure space (see Proposition 2.3.39). However, since the theory in this section was developed for a locally compact, $\sigma$-compact metric space $\Omega$, we have kept this assumption.

We know that a bounded sequence $\{u_n\}_{n\ge 1}\subseteq L^1\big(\Omega;\mathbb{R}^N\big)$ is uniformly tight (see Proposition 3.5.31 with $\psi(x)=\|x\|_{\mathbb{R}^N}$) and so we can extract a subsequence $\{u_{n_k}\}_{k\ge 1}$, such that
\[
\delta_{u_{n_k}}\xrightarrow{n}\nu\quad\text{as } k\to+\infty\ \text{in } Y(\Omega,\Sigma,\mathbb{R}^N)
\]
(see Theorem 3.5.35). It is natural to ask what is the relation between the sequence $\{u_n\}_{n\ge 1}$ and the function
\[
u(z)=\int_{\mathbb{R}^N} x\,\nu(z)(dx).
\]
In this respect the Chacon biting lemma (see Theorem 2.3.26) is helpful. An equivalent reformulation of this result (for $X=\mathbb{R}^N$) is the following theorem.


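THEOREM 3.5.53
If $\{u_n\}_{n\ge 1}\subseteq L^1\big(\Omega;\mathbb{R}^N\big)$ is a bounded sequence, then we can extract a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$, such that for every $\delta>0$, there exists $A\in\Sigma$, with $\mu(A)<\delta$ and
\[
u_{n_k}\xrightarrow{w}u\quad\text{in } L^1\big(\Omega\setminus A;\mathbb{R}^N\big),\ \text{as } k\to+\infty.
\]

PROPOSITION 3.5.54
If $\{u_n\}_{n\ge 1}\subseteq L^1\big(\Omega;\mathbb{R}^N\big)$ is a bounded sequence, then we can extract a subsequence $\{u_{n_k}\}_{k\ge 1}$ and a decreasing sequence of sets $\{A_k\}_{k\ge 1}\subseteq\Sigma$, with $\mu(A_k)\searrow 0$, such that
\[
\delta_{u_{n_k}}\xrightarrow{n}\nu\quad\text{in } Y(\Omega,\Sigma,\mathbb{R}^N)
\]
and
\[
w_k\stackrel{df}{=}\chi_{\Omega\setminus A_k}u_{n_k}\xrightarrow{w}u\quad\text{in } L^1\big(\Omega;\mathbb{R}^N\big),
\]
with
\[
u(z)\stackrel{df}{=}\int_{\mathbb{R}^N} x\,\nu(z)(dx)\quad\text{for $\mu$-a.a. } z\in\Omega.
\]

PROOF By virtue of Theorem 2.3.26, we can find a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$ and a decreasing sequence of sets $\{A_k\}_{k\ge 1}\subseteq\Sigma$ with $\mu(A_k)\searrow 0$, such that $\{w_k=\chi_{\Omega\setminus A_k}u_{n_k}\}_{k\ge 1}$ is a sequence of uniformly integrable functions. Then by virtue of Lemma 3.5.47, we need to show that
\[
\delta_{w_k}\xrightarrow{n}\nu\quad\text{in } Y(\Omega,\Sigma,\mathbb{R}^N).
\]
To this end let $f\in\widehat{K}_b\big(\Omega,\Sigma,\mathbb{R}^N\big)$. If $\eta_k=\psi(\delta_{w_k})$ (see Theorem 3.5.12), then we have
\[
\Big|\int_{\Omega\times\mathbb{R}^N} f(z,x)\,d\eta_k-\int_{\Omega\times\mathbb{R}^N} f(z,x)\,d\nu_{n_k}\Big|
=\Big|\int_{A_k}\Big(f(z,0)-f\big(z,u_{n_k}(z)\big)\Big)d\mu\Big|
\le 2\,\|f\|_{L^\infty(\Omega\times\mathbb{R}^N)}\,\mu(A_k)\longrightarrow 0\quad\text{as } k\to+\infty,
\]
so $\eta_k\xrightarrow{n}\nu$.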

3.6 Remarks

3.1: Compact maps were the first class of operators used to study nonlinear equations in infinite dimensional spaces. Leray & Schauder (1934) used compact perturbations of the identity in order to extend the Brouwer degree to infinite dimensional spaces. However, for linear operators the notion was first introduced by Riesz (1918). Earlier Hilbert (1906) introduced the notion of completely continuous linear operator between Banach spaces. As a property for linear operators, complete continuity actually lies properly between compactness and boundedness. Moreover, when the domain space X is reflexive, then the two notions coincide (see Corollary 3.1.8). The basic approximation result for compact maps stated in Theorem 3.1.10 is due to Schauder (1930). Proper maps (see Definition 3.1.13) are discussed in Berger (1977). Theorem 3.1.22 is due to Schauder (1930). The proof of Proposition 3.1.31 can be found in Reed & Simon (1972, p. 191). Theorem 3.1.38 on the spectral properties of compact linear operators is due to Riesz (1918). Compact linear operators and their spectral properties are discussed in detail in Dunford & Schwartz (1958), Kato (1976) and Yosida (1978). The Fredholm alternative (see Theorem 3.1.48) was obtained in the context of linear integral equations by Fredholm (1903). The spectral theory of selfadjoint, compact, linear operators can be found in the books Akhiezer & Glazman (1961, 1963), Gohberg & Goldberg (1981), Halmos (1998) and Kato (1976). For further results on Fredholm operators we refer to Goldberg (1966), Kato (1976) and Schechter (1971). The proof of Proposition 3.1.70 can be found in Schechter (1971, p. 114).

3.2: Monotone operators are rooted in the calculus of variations and were introduced in the early sixties, in order to provide an analytical framework for the study of nonlinear operator equations broader than the one provided by compact operators. The first mention of monotone operators in a Hilbert space can be traced in the work of Golomb (1935) on nonlinear Hammerstein integral equations. However, the systematic development of the theory of monotone operators started with Kachurovski (1960), who established that the derivative of a convex function is a monotone map and also introduced the term “monotone operator.” Then Minty (1962) obtained the first existence result for nonlinear functional equations in Hilbert spaces under monotonicity assumptions. Even simple one dimensional examples reveal that a complete theory of maximal monotone maps requires the use of multivalued maps. Proposition 3.2.11 is due to Rockafellar (1969) who proved that a monotone map is locally bounded at every point in the interior of its domain. Here we have stated a slightly more general version of this result. Concerning Proposition 3.2.14, Kenderov (1974) proved that if $X$ is a separable, reflexive Banach space and $A\colon X\supseteq D(A)\longrightarrow 2^{X^*}$ is maximal monotone with $\operatorname{int}D(A)\ne\emptyset$, then there is a dense $G_\delta$ subset $D_0$ of $\operatorname{int}D(A)$, such that $A|_{D_0}$ is single valued and upper semicontinuous for the norm topologies on $X$ and $X^*$. For additional results in this direction, we refer to Phelps (1993). The duality map (see Example 3.2.20(d)) plays a basic role in the study of the geometry of Banach spaces and in the theory of evolution equations. It was first introduced by Beurling & Livingston (1962). Its properties are studied by Browder (1976), Cioranescu (1990) and Zeidler (1990b). Theorem 3.2.29 is due to Minty (1962) for Hilbert spaces and Rockafellar (1970c) for Banach spaces. Its proof can be found in Zeidler (1990b, p. 881). Theorem 3.2.30 is due to Browder (1968) and together with Corollary 3.2.31 explains why maximal monotone operators are a powerful tool in the study of nonlinear operator equations. Theorem 3.2.40 is due to Attouch (1981), while Theorem 3.2.41 is due to Rockafellar (1970c). The proof of Theorem 3.2.41 can be found in Zeidler (1990b, p. 888). The notion of pseudomonotonicity was introduced by Brézis (1968) (using nets) and Browder (1976) (using sequences). The basic works on pseudomonotonicity are those by Browder & Hess (1972) and Kenmochi (1974, 1975). Of course the most important result is Theorem 3.2.52, due to Browder & Hess (1972). Monotone operators and operators of monotone-type are discussed in the books of Brézis (1973), Barbu (1976), Deimling (1985), Hu & Papageorgiou (1997), Morosanu (1988), Pascali & Sburlan (1978) and Showalter (1997). The proof of Theorem 3.2.58 can be found in Hu & Papageorgiou (1997, pp. 311–312).

3.3: Accretive operators were introduced by Kato (1967, 1968), who gave the complete characterization in metric terms involved in Proposition 3.3.4. In the first part of the section, dealing with accretive operators, we have summarized the results of Brézis (1971), Brézis & Pazy (1970), Crandall & Pazy (1969, 1970), Kato (1967, 1968) and Kenmochi (1972, 1973).
Lemma 3.3.26 is due to Kato (1968, 1970), while the Gronwall-type inequality obtained in Lemma 3.3.27 can be found in Brézis (1973, p. 157). Theorem 3.3.28 can be found in Crandall & Pazy (1969). The linear semigroup theory started developing as soon as it was realized that the theory has immediate applications to partial differential equations, Markov processes and ergodic theory. It developed rapidly during the 1940s and 1950s thanks to the seminal contributions of Hille, Phillips and Yosida. The main result of this theory is of course Theorem 3.3.46 (the Hille-Yosida generation theorem), which for contraction semigroups (i.e., $M = 1$, $\omega = 0$) was proved independently by Hille (1942) and Yosida (1948), while the general case (proved in Theorem 3.3.46) is independently due to Feller (1953), Miyadera (1952) and Phillips (1953). The proof of the Phillips theorem (see Theorem 3.3.48) can be found in Hille & Phillips (1957, p. 389). Theorem 3.3.49 is due to Lumer & Phillips (1961), while an early Hilbert space version of it was proved by Phillips (1959). The exponential formula in Theorem 3.3.51 is due to Hille (1942). In fact Hille's proof of the generation theorem was based on it. Another representation formula can be found in Pazy (1983, p. 21).

THEOREM 3.6.1 If $A$ is the infinitesimal generator of a $C_0$-semigroup $\{S(t)\}_{t \ge 0}$ on $X$, then
\[ S(t)x = \lim_{\lambda \to +\infty} e^{tA_\lambda}x. \]

A complete list of representation formulae can be found in Hille & Phillips (1957, p. 354). Theorem 3.3.59 (the generation theorem for nonlinear semigroups) for Hilbert spaces was proved by Komura (1967), while the general case is due to Crandall & Liggett (1971). The notion of integral solution (see Definition 3.3.65) is due to Bénilan (1972). In the linear case, Proposition 3.3.71 is due to Lax (see Hille & Phillips (1957, p. 304)). For nonlinear semigroups, Proposition 3.3.71 and Theorem 3.3.72 were proved by Brézis (1974), while Corollary 3.3.73 is due to Pazy (1968). The theory of linear semigroups can be found in the books of Butzer & Berens (1967), Fattorini (1999), Goldstein (1985), Hille & Phillips (1957) and Pazy (1983), while the theory of nonlinear semigroups can be found in the books of Barbu (1976), Miyadera (1992), Pavel (1987) and Vrabie (1987).

3.4: Theorem 3.4.4 for functions defined on $\Omega \times \mathbb{R}$ with values in $\mathbb{R}$ was proved by Krasnoselskii (1964b, 1964a). The general case is due to Lucchetti & Patrone (1980). Moreover, in addition to Theorem 3.4.4, we can also show the continuity in measure of the operator $N_f$, already known when $X$ and $Y$ are Euclidean spaces.

PROPOSITION 3.6.2 If $\mu(\Omega) < +\infty$, $f : \Omega \times X \longrightarrow Y$ is a Carathéodory function, $\{x_n : \Omega \longrightarrow X\}_{n \ge 1}$ is a sequence of $\Sigma$-measurable functions and $x_n \stackrel{\mu}{\longrightarrow} x$, then
\[ N_f(x_n) \stackrel{\mu}{\longrightarrow} N_f(x). \]

The proof of the Scorza-Dragoni theorem (see Theorem 3.4.10) can be found in Denkowski, Migórski & Papageorgiou (2003a, p. 188). Normal integrands were introduced by Rockafellar (1968). The characterization of lower semicontinuity of the integral functional $I_f$ obtained in Theorem 3.4.13 for Euclidean spaces $X$ and $Y$ is due to Poljak (1969), while the general case can be found in Lucchetti & Patrone (1980). Proposition 3.4.16 is due to Ioffe & Levin (1972). The proof of Proposition 3.4.18 can be found in Denkowski, Migórski & Papageorgiou (2003a, p. 460). For $X = \mathbb{R}$, Theorem 3.4.20 can be found in Ekeland & Temam (1976), but their proof is different, being based on results from convex analysis.


Further results on Nemytskii operators and integral functionals can be found in the books of Appell & Zabrejko (1990), Buttazzo (1989), Cesari (1983), Ekeland & Temam (1976), Hu & Papageorgiou (1997), Ioffe & Tihomirov (1979) and Vaĭnberg (1973).

3.5: The theory of Young measures has its roots in the so-called "generalized curves" of Young (1942a, 1942b, 1969) for the study of variational problems which are not inf-compact and consequently do not have a solution. The needs of control theory (relaxation) and of the calculus of variations led to further development of the original ideas of Young. We refer to Berliocchi & Lasry (1973), Ekeland (1972), Ekeland & Temam (1976), Gamkrelidze (1978), Warga (1972). Recently there was a revival of the theory (motivated also by the needs of problems in theoretical mechanics), which can be traced in the works of Alibert & Bouchitté (1997), Balder (1984, 1997), Ball (1989), Ball & Murat (1989), Ball & Zhang (1990), Di Perna (1985), Di Perna & Majda (1987) and Tartar (1979). Our presentation here follows the survey paper of Valadier (1975). Applications of Young measures to control theory and mechanics can be found in the books of Denkowski, Migórski & Papageorgiou (2003a, 2003b), Gamkrelidze (1978), Hu & Papageorgiou (1997, 2000), Pedregal (1997), Roubíček (1997) and Warga (1972). For the proof of the Prohorov theorem (see Theorem 3.5.27) we refer to Parthasarathy (1967, p. 47). Theorem 3.5.49 was obtained by De Giorgi (1968–1969), with $f \ge 0$. A more general version similar to that of Theorem 3.5.49 was proved by Ioffe (1977a, 1977b). His proof, which is also reproduced by Buttazzo (1989, p. 46), does not use Young measures and instead it is based on the approximation of $f$ by certain affine functions. Two different proofs of Theorem 3.5.50 can be found in Balder (1987) and Hu & Papageorgiou (2000, p. 31).

Chapter 4 Smooth and Nonsmooth Analysis and Variational Principles

The purpose of this chapter is to outline the basic aspects of the smooth and nonsmooth calculus in Banach spaces. Special emphasis is placed on the nonsmooth theory, which started developing in the 1960s, in order to provide a uniform viewpoint for the treatment of large classes of nonlinear extremal problems. The resulting subdifferential theories also found many other applications and today are part of the so-called nonsmooth analysis, which is one of the most robust and interesting research areas of nonlinear functional analysis. In Section 4.1 we present the basics of the smooth calculus in Banach spaces. We limit ourselves to the discussion of the Gâteaux and Fréchet derivatives, which are the two most useful derivatives for vector valued functions. In Section 4.2 we consider convex functions defined on Banach spaces. We discuss their continuity and differentiability properties. It turns out that a purely algebraic condition (convexity) has remarkable and powerful topological and differentiability implications. The differentiability results also bring us in contact with the Banach space theory and in particular with the so-called Asplund spaces, which have the useful property that every separable subspace has a separable dual. We also show that every convex continuous function is locally Lipschitz. Locally Lipschitz functions between Banach spaces are the objects of investigation in Section 4.3. If the two Banach spaces are finite dimensional, then a locally Lipschitz function is differentiable almost everywhere (for the Lebesgue measure). Here we see how this can be generalized to the case where the two spaces are infinite dimensional. The main difficulty is to produce a suitable notion of negligible sets. This is done using the notion of Haar-null sets. We study them in detail and eventually prove an infinite dimensional version of the Rademacher theorem on the almost everywhere differentiability of locally Lipschitz functions.
In Section 4.4 we pass to the nonsmooth part of this chapter. We examine the duality and subdifferentiability properties of convex functions and the subdifferentiability properties of locally Lipschitz functions. At the end of the section, using the notion of bornology, we briefly consider some more subdifferentials of proper functions. In Section 4.5 we investigate integral functionals defined by convex or nonconvex normal integrands. We determine their duality and subdifferentiability


properties. Finally in Section 4.6 we present some variational principles and their applications. Prominent in our discussion is the so-called Ekeland variational principle, which we show to be equivalent to some other powerful results of nonlinear analysis. We also use it to prove some surjectivity results for nonlinear maps, which extend corresponding results from the linear operator theory. This chapter illustrates in a rather convincing manner how methods and results of nonlinear analysis cover a wide area from a theoretical starting point (Banach space theory) to an applied end (optimization theory).

4.1 Differential Calculus in Banach Spaces

In this section we develop the basics of the differential calculus in Banach spaces. The geometric character of the operation of differentiation becomes very apparent in this general setting and leads naturally to generalizations such as the subdifferentials of convex and of locally Lipschitz functions. Moreover, the needs of the infinite dimensional variational problems which dominate the present landscape of nonlinear analysis require a differential calculus in Banach spaces, along the lines of the one existing in $\mathbb{R}^N$. This section shows that such a theory is possible and the analogy with the finite dimensional calculus is indeed remarkable. In what follows $X$ and $Y$ are two Banach spaces. Additional hypotheses will be introduced as needed.

DEFINITION 4.1.1 A map $f : X \longrightarrow Y$ is said to be Gâteaux differentiable at $x \in X$, if and only if there exists $A(x) \in \mathcal{L}(X; Y)$, such that
\[ \lim_{\lambda \to 0} \frac{f(x + \lambda h) - f(x)}{\lambda} = A(x)h \qquad \forall\, h \in X. \]
The operator $A(x)$ is said to be the Gâteaux derivative of $f$ at $x$. It is usually denoted by $f'_G(x)$. We say that $f$ is Gâteaux differentiable, if it is Gâteaux differentiable at every $x \in X$.

REMARK 4.1.2 If we set
\[ \varphi(\lambda) \stackrel{\mathrm{df}}{=} f(x + \lambda h), \]
then
\[ f'_G(x)h = \frac{d}{d\lambda}\varphi(\lambda)\Big|_{\lambda = 0} \qquad \forall\, x, h \in X. \]
So the Gâteaux derivative is essentially a one dimensional concept, since it considers the difference quotients along rays. Clearly then the Gâteaux derivative $f'_G(x)$, if it exists, is unique.


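Let us see some examples that illustrate the notion of Gâteaux derivative.

EXAMPLE 4.1.3 (a) If $f = A \in \mathcal{L}(X; Y)$, then $f'_G(x) = A$ for all $x \in X$.

(b) Let $X = \mathbb{R}^N$, $Y = \mathbb{R}^M$ and $f = (f_1, \ldots, f_M) : \mathbb{R}^N \longrightarrow \mathbb{R}^M$. Suppose that $f$ is Gâteaux differentiable at $x$, let $A = (a_{kj})$ be the $M \times N$-matrix representing $f'_G(x)$ and let $h = e_j = (0, \ldots, 0, 1, 0, \ldots, 0)$ be the $j$-th coordinate vector. Then
\[ \lim_{\lambda \to 0} \left\| \frac{f(x + \lambda h) - f(x) - \lambda A h}{\lambda} \right\|_Y = 0, \]
so
\[ \lim_{\lambda \to 0} \left| \frac{f_k(x + \lambda e_j) - f_k(x) - \lambda a_{kj}}{\lambda} \right| = 0 \qquad \forall\, k \in \{1, \ldots, M\},\ j \in \{1, \ldots, N\} \]
and so
\[ \frac{\partial f_k}{\partial x_j}(x) = a_{kj} \qquad \forall\, k \in \{1, \ldots, M\},\ j \in \{1, \ldots, N\}. \]
Therefore $f'_G(x)$ has the matrix representation
\[ f'_G(x) = \left( \frac{\partial f_k}{\partial x_j}(x) \right)_{\substack{k \in \{1, \ldots, M\} \\ j \in \{1, \ldots, N\}}}. \]
This matrix is called the Jacobian matrix of $f$ at $x \in \mathbb{R}^N$. If $Y = \mathbb{R}$, then
\[ f'_G(x) = \left( \frac{\partial f}{\partial x_j}(x) \right)_{j=1}^{N}, \]
known as the gradient of $f$ at $x \in \mathbb{R}^N$.

(c) Let $X = Y = C\big([0, 1]\big)$ and consider the Hammerstein integral operator, defined by
\[ \hat{f}(x)(t) \stackrel{\mathrm{df}}{=} \int_0^1 k(t, s) f\big(s, x(s)\big)\, ds \qquad \forall\, t \in [0, 1], \]
where
\[ k \in C\big([0, 1] \times [0, 1]\big) \quad \text{and} \quad \frac{\partial f}{\partial x} \in C\big([0, 1] \times \mathbb{R}\big). \]
An easy calculation reveals that $\hat{f}$ is Gâteaux differentiable and
\[ \big( \hat{f}'_G(x)h \big)(t) = \int_0^1 k(t, s) \frac{\partial f}{\partial x}\big(s, x(s)\big) h(s)\, ds \qquad \forall\, h \in C\big([0, 1]\big). \]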
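The description of the Jacobian in part (b) of Example 4.1.3 can be tried out numerically. The following is only a minimal sketch: the sample map `f`, the step `lam` and all helper names are assumptions chosen for illustration and do not come from the text. Each column of the approximate Jacobian is the difference quotient of Definition 4.1.1 taken along a coordinate vector $e_j$.

```python
import math


def f(x):
    """Sample map f: R^2 -> R^2 (an illustrative assumption, not from the text)."""
    x1, x2 = x
    return [x1 * x1 * x2, math.sin(x1) + x2]


def gateaux_quotient(func, x, h, lam):
    """Difference quotient (f(x + lam*h) - f(x)) / lam along the ray x + lam*h."""
    shifted = [xi + lam * hi for xi, hi in zip(x, h)]
    return [(a - b) / lam for a, b in zip(func(shifted), func(x))]


def jacobian_fd(func, x, lam=1e-6):
    """Approximate Jacobian: column j is the quotient along the j-th coordinate vector."""
    n = len(x)
    cols = []
    for j in range(n):
        e_j = [1.0 if i == j else 0.0 for i in range(n)]
        cols.append(gateaux_quotient(func, x, e_j, lam))
    # Transpose the list of columns: entry [k][j] approximates d f_k / d x_j.
    return [[cols[j][k] for j in range(n)] for k in range(len(cols[0]))]


def jacobian_exact(x):
    """Hand-computed partial derivatives of the sample map above."""
    x1, x2 = x
    return [[2.0 * x1 * x2, x1 * x1], [math.cos(x1), 1.0]]
```

Comparing `jacobian_fd(f, [1.0, 2.0])` with `jacobian_exact([1.0, 2.0])` entry by entry gives agreement up to the discretization error of the one-sided quotient.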


A stronger differentiability notion is given in the next definition.

DEFINITION 4.1.4 A map $f : X \longrightarrow Y$ is said to be Fréchet differentiable at $x \in X$ if there exists $A(x) \in \mathcal{L}(X; Y)$, such that
\[ f(x + h) - f(x) = A(x)h + u(x, h), \]
where
\[ \frac{\|u(x, h)\|_Y}{\|h\|_X} \longrightarrow 0 \quad \text{as } \|h\|_X \to 0. \]
The operator $A(x)$ is said to be the Fréchet derivative of $f$ at $x \in X$ and it is usually denoted by $f'_F(x)$. We say that $f$ is Fréchet differentiable, if it is Fréchet differentiable at every $x \in X$.

REMARK 4.1.5 It is easy to see that the Fréchet derivative $f'_F(x)$, if it exists, is unique. It is clear that if $f$ is Fréchet differentiable at $x$, it is also Gâteaux differentiable at $x$. The converse is not true as the following example shows.

EXAMPLE 4.1.6 Let $X = \mathbb{R}^2$, $Y = \mathbb{R}$ and consider the function $f : \mathbb{R}^2 \longrightarrow \mathbb{R}$, defined by
\[ f(x) \stackrel{\mathrm{df}}{=} \begin{cases} \dfrac{x_1^3 x_2}{x_1^4 + x_2^2} & \text{if } x = (x_1, x_2) \neq (0, 0), \\[2mm] 0 & \text{if } x = (x_1, x_2) = (0, 0). \end{cases} \]
The function $f$ is Gâteaux differentiable at $x = 0$ and
\[ f'_G(0) = 0. \]
However, it is not Fréchet differentiable at $x = 0$, since on the curve $h_1^2 = h_2$, we have
\[ \frac{|f(h)|}{\|h\|_{\mathbb{R}^2}} = \frac{|h_1^3 h_2|}{h_1^4 + h_2^2} \cdot \frac{1}{\sqrt{h_1^2 + h_2^2}} = \frac{|h_1|}{2\sqrt{h_1^2 + h_2^2}}, \]
so
\[ \lim_{\|h\|_{\mathbb{R}^2} \to 0} \frac{|f(h)|}{\|h\|_{\mathbb{R}^2}} = \frac{1}{2} \neq 0. \]
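The two claims of Example 4.1.6 can be observed numerically. The sketch below is illustrative only; the helper names `ray_quotient` and `parabola_ratio` and the chosen step sizes are assumptions. It evaluates the difference quotients along rays through the origin (which become small) and the ratio $|f(h)| / \|h\|$ along the curve $h_2 = h_1^2$ (which stays near $1/2$).

```python
import math


def f(x1, x2):
    """The function of Example 4.1.6."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1 ** 3 * x2 / (x1 ** 4 + x2 ** 2)


def ray_quotient(h1, h2, lam):
    """Difference quotient (f(0 + lam*h) - f(0)) / lam along the ray spanned by h."""
    return f(lam * h1, lam * h2) / lam


def parabola_ratio(h1):
    """|f(h)| / ||h|| evaluated on the curve h2 = h1^2."""
    h2 = h1 * h1
    return abs(f(h1, h2)) / math.hypot(h1, h2)


if __name__ == "__main__":
    for direction in [(1.0, 1.0), (1.0, -2.0), (0.0, 1.0), (1.0, 0.0)]:
        print(direction, ray_quotient(direction[0], direction[1], 1e-4))
    for h1 in [1e-1, 1e-2, 1e-3]:
        print(h1, parabola_ratio(h1))
```

Along every fixed ray the quotient shrinks with the step, consistent with $f'_G(0) = 0$, while the ratio along the parabola does not tend to zero, which is what rules out Fréchet differentiability at the origin.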

The next proposition establishes the exact relation between Gˆateaux and Fr´echet derivatives.


PROPOSITION 4.1.7 If $f : X \longrightarrow Y$ is a Gâteaux differentiable function at all points of some neighbourhood of $x \in X$ and $f'_G(\cdot)$ is continuous at $x \in X$, then $f$ is also Fréchet differentiable at $x \in X$.

PROOF Let us set
\[ u(x, h) \stackrel{\mathrm{df}}{=} f(x + h) - f(x) - f'_G(x)h. \]
Then for every $y^* \in Y^*$, we have
\[ \langle y^*, u(x, h) \rangle_Y = \langle y^*, f(x + h) - f(x) \rangle_Y - \langle y^*, f'_G(x)h \rangle_Y. \]
By virtue of the mean value theorem, we can find $\lambda \in (0, 1)$ (depending on $y^*$), such that
\[ \langle y^*, u(x, h) \rangle_Y = \langle y^*, f'_G(x + \lambda h)h - f'_G(x)h \rangle_Y. \]
We can find $y^* \in Y^*$ with $\|y^*\|_{Y^*} = 1$, such that
\[ \|u(x, h)\|_Y = \big| \langle y^*, u(x, h) \rangle_Y \big|. \]
Then we have
\[ \|u(x, h)\|_Y \le \|f'_G(x + \lambda h) - f'_G(x)\|_{\mathcal{L}} \|h\|_X, \]
so
\[ \frac{\|u(x, h)\|_Y}{\|h\|_X} \le \|f'_G(x + \lambda h) - f'_G(x)\|_{\mathcal{L}}, \]
and so
\[ \frac{\|u(x, h)\|_Y}{\|h\|_X} \longrightarrow 0 \quad \text{as } \|h\|_X \to 0 \]
(since by hypothesis $f'_G(\cdot)$ is continuous at $x \in X$).

Before proceeding further, let us give some examples of Fréchet differentiable maps.

EXAMPLE 4.1.8 (a) Let $X = H$ be a Hilbert space. Let $A \in \mathcal{L}(H)$ and let $f : H \longrightarrow \mathbb{R}$ be defined by
\[ f(x) \stackrel{\mathrm{df}}{=} (Ax, x)_H \qquad \forall\, x \in H. \]
Then
\[ f(x + h) - f(x) - \big( Ax + A^* x, h \big)_H = o\big( \|h\|_H \big) \quad \text{as } h \to 0 \]
and so
\[ f'_F(x) = (A + A^*)x. \]
If $A = \mathrm{id}_H$, then $f(x) = \|x\|_H^2$ and we have that
\[ f'_F(x) = 2x \qquad \forall\, x \in H. \]

(b) Let $\Omega \subseteq \mathbb{R}^N$ be an open set and let $f : \Omega \times \mathbb{R} \longrightarrow \mathbb{R}$ be a Carathéodory function. Suppose that
\[ \big| f(z, x) \big| \le a(z) + c|x|^{p-1} \quad \text{for a.a. } z \in \Omega \text{ and all } x \in \mathbb{R}, \]
with $p \in [1, +\infty)$, $a \in L^{p'}(\Omega)_+$, $\frac{1}{p} + \frac{1}{p'} = 1$ and $c > 0$. Let $F$ be the potential function corresponding to $f$, i.e.,
\[ F(z, x) \stackrel{\mathrm{df}}{=} \int_0^x f(z, r)\, dr. \]
Using the mean value theorem, we can see that
\[ \big| F(z, x) \big| \le \hat{a}(z) + \hat{c}|x|^p \quad \text{for a.a. } z \in \Omega \text{ and all } x \in \mathbb{R}, \]
with $\hat{a} \in L^1(\Omega)_+$ and $\hat{c} > 0$. Then consider the continuous functional $\varphi : L^p(\Omega) \longrightarrow \mathbb{R}$, defined by
\[ \varphi(u) \stackrel{\mathrm{df}}{=} \int_\Omega F\big(z, u(z)\big)\, dz. \]
We claim that $\varphi$ is continuously Fréchet differentiable and
\[ \varphi'(u) = N_f(u) \qquad \forall\, u \in L^p(\Omega). \]
To this end let
\[ \xi(h) = \int_\Omega F\big(z, (u + h)(z)\big)\, dz - \int_\Omega F\big(z, u(z)\big)\, dz - \int_\Omega f\big(z, u(z)\big) h(z)\, dz. \]
Note that
\[ F\big(z, (u + h)(z)\big) - F\big(z, u(z)\big) = \int_0^1 \frac{d}{dt} F\big(z, (u + th)(z)\big)\, dt = \int_0^1 f\big(z, (u + th)(z)\big) h(z)\, dt. \]
Therefore, using Fubini's theorem and Hölder's inequality (see, e.g., Theorem A.2.27), we have
\begin{align*}
\big| \xi(h) \big| &\le \int_\Omega \int_0^1 \big| f\big(z, (u + th)(z)\big) - f\big(z, u(z)\big) \big| \big| h(z) \big|\, dt\, dz \\
&= \int_0^1 \int_\Omega \big| f\big(z, (u + th)(z)\big) - f\big(z, u(z)\big) \big| \big| h(z) \big|\, dz\, dt \\
&\le \|h\|_p \int_0^1 \big\| N_f(u + th) - N_f(u) \big\|_{p'}\, dt.
\end{align*}
Because $N_f$ is continuous (see Theorem 3.4.4), by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we can conclude that
\[ \frac{\xi(h)}{\|h\|_p} \longrightarrow 0 \quad \text{as } \|h\|_p \to 0. \]
This proves that
\[ \varphi'_F(u) = N_f(u) \]
and so $\varphi \in C^1\big(L^p(\Omega)\big)$. This example is important in the variational methods for the study of boundary value problems.
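Part (a) of Example 4.1.8 can be checked in the finite dimensional Hilbert space $\mathbb{R}^2$, where $A^*$ is the transpose. The following is a sketch under that assumption; the matrix `A` and the helper names are illustrative choices, not taken from the text. The remainder $f(x + h) - f(x) - (Ax + A^*x, h)$ equals $(Ah, h)$, so dividing by $\|h\|$ gives a quantity that scales linearly with $\|h\|$.

```python
import math

A = [[1.0, 2.0], [3.0, 4.0]]  # arbitrary non-symmetric matrix (illustrative choice)


def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def f(x):
    """Quadratic form f(x) = (Ax, x)."""
    return dot(matvec(A, x), x)


def derivative(x):
    """Candidate Frechet derivative, represented by the vector Ax + A^T x."""
    return [a + b for a, b in zip(matvec(A, x), matvec(transpose(A), x))]


def remainder(x, h):
    """f(x + h) - f(x) - (Ax + A^T x, h); algebraically this is (Ah, h)."""
    xh = [a + b for a, b in zip(x, h)]
    return f(xh) - f(x) - dot(derivative(x), h)


def ratio(x, h):
    """|remainder| / ||h||, which should tend to zero with ||h||."""
    return abs(remainder(x, h)) / math.sqrt(dot(h, h))
```

Shrinking `h` by a factor of ten shrinks `ratio(x, h)` by roughly the same factor, in line with the $o(\|h\|_H)$ estimate.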

PROPOSITION 4.1.9 If $f : X \longrightarrow Y$ is a function which is Fréchet differentiable at $x \in X$, then $f$ is continuous at $x \in X$.

PROOF Since $f$ is Fréchet differentiable at $x$, we can find $\delta > 0$, such that
\[ \big\| f(x + h) - f(x) - f'_F(x)h \big\|_Y \le \|h\|_X \qquad \forall\, \|h\|_X \le \delta, \]
so
\[ \big\| f(x + h) - f(x) \big\|_Y \le \big( 1 + \|f'_F(x)\|_{\mathcal{L}} \big) \|h\|_X \qquad \forall\, \|h\|_X \le \delta. \]
This proves the continuity of $f$ at $x \in X$.

REMARK 4.1.10 The above proposition is no longer true if Fréchet differentiability is replaced by Gâteaux differentiability. To see this consider the function $f : \mathbb{R}^2 \longrightarrow \mathbb{R}$, defined by
\[ f(x_1, x_2) \stackrel{\mathrm{df}}{=} \begin{cases} \dfrac{x_1^4 x_2}{x_1^6 + x_2^3} & \text{if } (x_1, x_2) \neq 0, \\[2mm] 0 & \text{if } (x_1, x_2) = 0, \end{cases} \qquad \forall\, x = (x_1, x_2) \in \mathbb{R}^2. \]
Then $f'_G(0, 0) = 0$ but $f$ is not continuous at the origin.


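In the case of Gâteaux differentiable maps we can conclude that they are continuous along rays.

PROPOSITION 4.1.11 If $f : X \longrightarrow Y$ is a function, which is Gâteaux differentiable at $x \in X$, then
\[ \lim_{\lambda \to 0} \big\| f(x + \lambda h) - f(x) \big\|_Y = 0 \qquad \forall\, h \in X. \]

PROOF Let
\[ \varphi(\lambda) \stackrel{\mathrm{df}}{=} f(x + \lambda h) \qquad \forall\, \lambda \in \mathbb{R}. \]
Then $\varphi$ is differentiable at $0$, hence continuous there. So $\varphi(\lambda) \longrightarrow \varphi(0)$ as $\lambda \to 0$, which implies that
\[ f(x + \lambda h) \longrightarrow f(x) \quad \text{in } Y, \text{ as } \lambda \to 0. \]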

We have a chain rule for these derivatives.

PROPOSITION 4.1.12 If $Z$ is a Banach space too, $f : X \longrightarrow Y$ is a function which is Gâteaux differentiable at $x \in X$ and $g : Y \longrightarrow Z$ is a function which is Fréchet differentiable at $f(x)$, then the function $k \stackrel{\mathrm{df}}{=} g \circ f : X \longrightarrow Z$ is Gâteaux differentiable at $x \in X$ and
\[ k'_G(x) = g'_F\big(f(x)\big) f'_G(x). \]
Moreover, if $f$ is Fréchet differentiable at $x \in X$, then $k$ is Fréchet differentiable at $x$.

PROOF For $\lambda \neq 0$, we have
\begin{align*}
&\frac{1}{|\lambda|} \big\| k(x + \lambda h) - k(x) - \lambda g'_F\big(f(x)\big) f'_G(x)h \big\|_Z \\
&\quad \le \frac{1}{|\lambda|} \big\| g\big(f(x + \lambda h)\big) - g\big(f(x)\big) - g'_F\big(f(x)\big)\big( f(x + \lambda h) - f(x) \big) \big\|_Z \\
&\qquad + \frac{1}{|\lambda|} \big\| g'_F\big(f(x)\big)\big( f(x + \lambda h) - f(x) - \lambda f'_G(x)h \big) \big\|_Z. \tag{4.1}
\end{align*}
Since $f$ is Gâteaux differentiable at $x \in X$, the second summand in the right hand side of (4.1) goes to zero as $\lambda \to 0$. Also suppose that $f(x + \lambda h) \neq f(x)$. Then since
\[ f(x + \lambda h) \longrightarrow f(x) \quad \text{in } Y \text{ as } \lambda \to 0 \]
(see Proposition 4.1.11) and because $g$ is Fréchet differentiable at $f(x)$, we have that the first summand in the right hand side of (4.1) goes to zero as $\lambda \to 0$. This proves that
\[ k'_G(x) = g'_F\big(f(x)\big) f'_G(x). \]
The proof is similar if $f$ is Fréchet differentiable at $x \in X$.

COROLLARY 4.1.13 If $f : X \longrightarrow Y$ is a function, which is Gâteaux differentiable at every point of the interval
\[ [x, x + h] \stackrel{\mathrm{df}}{=} \big\{ u \in X : u = \lambda x + (1 - \lambda)(x + h),\ \lambda \in [0, 1] \big\}, \]
then
\[ f(x + h) - f(x) = \int_0^1 f'_G(x + th)h\, dt. \]

In the next proposition we show that compactness of a map is passed to its Fréchet derivative.

PROPOSITION 4.1.14 If $f : X \longrightarrow Y$ is a function which is compact and Fréchet differentiable at $x \in X$, then $f'_F(x) \in \mathcal{L}_c(X; Y)$.

PROOF Suppose that the proposition is not true. Then we can find $\varepsilon > 0$ and $\{x_n\}_{n \ge 1} \subseteq X$ with $\|x_n\|_X \le 1$ for all $n \ge 1$, such that
\[ \big\| f'_F(x)x_n - f'_F(x)x_m \big\|_Y \ge 3\varepsilon \qquad \forall\, n \neq m. \]
Because $f$ is Fréchet differentiable at $x$, we have
\[ f(x + h) - f(x) = f'_F(x)h + u(x, h) \tag{4.2} \]
and we can find $\delta > 0$, such that
\[ \big\| u(x, h) \big\|_Y \le \varepsilon \|h\|_X \qquad \forall\, \|h\|_X \le \delta. \]
Therefore, from (4.2), we have
\begin{align*}
\big\| f(x + \delta x_n) - f(x + \delta x_m) \big\|_Y &\ge \delta \big\| f'_F(x)(x_n - x_m) \big\|_Y - \big\| u(x, \delta x_n) \big\|_Y - \big\| u(x, \delta x_m) \big\|_Y \\
&\ge 3\varepsilon\delta - \delta\varepsilon - \delta\varepsilon = \delta\varepsilon,
\end{align*}
a contradiction to the fact that $f$ is compact.


For the next proposition, we need to introduce the following definition.

DEFINITION 4.1.15 (a) A function $f : [a, b] \longrightarrow X$ is said to be right differentiable at $t \in [a, b)$, if the limit
\[ \lim_{h \to 0^+} \frac{1}{h} \big[ f(t + h) - f(t) \big] \]
exists. We denote this limit by $f'_+(t)$ and we call it the right derivative of $f$ at $t$. Evidently $f'_+(t) \in X$.

(b) Similarly a function $f : [a, b] \longrightarrow X$ is said to be left differentiable at $t \in (a, b]$, if the limit
\[ \lim_{h \to 0^-} \frac{1}{h} \big[ f(t + h) - f(t) \big] \]
exists. We denote this limit by $f'_-(t)$ and we call it the left derivative of $f$ at $t$. Evidently $f'_-(t) \in X$.

REMARK 4.1.16 A function $f : [a, b] \longrightarrow X$ is Fréchet differentiable at $t \in (a, b)$ if and only if $f'_-(t)$ and $f'_+(t)$ exist and $f'_-(t) = f'_+(t)$.

PROPOSITION 4.1.17 If $f : [a, b] \longrightarrow X$, $g : [a, b] \longrightarrow \mathbb{R}$ are continuous functions, $f'_+(t)$, $g'_+(t)$ exist at all $t \in (a, b)$ and
\[ \big\| f'_+(t) \big\|_X \le g'_+(t) \qquad \forall\, t \in (a, b), \]
then
\[ \big\| f(b) - f(a) \big\|_X \le g(b) - g(a). \]

PROOF Let $\varepsilon > 0$ be given and consider the set
\[ U \stackrel{\mathrm{df}}{=} \big\{ t \in [a, b] : \| f(t) - f(a) \|_X > g(t) - g(a) + \varepsilon(t - a) + \varepsilon \big\}. \]
Clearly $U$ is an open set. Suppose that $U$ is nonempty and let $c \stackrel{\mathrm{df}}{=} \inf U$. We can say the following:

(a) $c > a$: this follows from the continuity of $f$ and $g$;

(b) $c \notin U$: since $U$ is open;

(c) $c < b$: otherwise $U = \{b\}$, which is not open.


So we have that $a < c < b$. By hypothesis we have
\[ \big\| f'_+(c) \big\|_X \le g'_+(c). \]
Let $h > 0$ be such that if $t \in (c, c + h]$, we have
\[ \big\| f'_+(c) \big\|_X \ge \frac{\| f(t) - f(c) \|_X}{t - c} - \frac{\varepsilon}{2} \quad \text{and} \quad g'_+(c) \le \frac{g(t) - g(c)}{t - c} + \frac{\varepsilon}{2}. \]
It follows that
\[ \big\| f(t) - f(c) \big\|_X \le g(t) - g(c) + \varepsilon(t - c) \qquad \forall\, t \in [c, c + h]. \tag{4.3} \]
Also because $c \notin U$, we have
\[ \big\| f(c) - f(a) \big\|_X \le g(c) - g(a) + \varepsilon(c - a) + \varepsilon. \tag{4.4} \]
From (4.3) and (4.4), we obtain
\[ \big\| f(t) - f(a) \big\|_X \le g(t) - g(a) + \varepsilon(t - a) + \varepsilon \qquad \forall\, t \in [c, c + h]. \]
We infer that $\inf U \ge c + h$, a contradiction. So $U = \emptyset$ and we obtain
\[ \big\| f(t) - f(a) \big\|_X \le g(t) - g(a) + \varepsilon(t - a) + \varepsilon \qquad \forall\, t \in [a, b]. \]
Let $t = b$ and $\varepsilon \searrow 0$ to obtain the desired inequality.

REMARK 4.1.18 We have an analogous result if we replace the right derivatives by the left ones. Moreover, we can weaken the hypotheses of Proposition 4.1.17 and assume that there is a countable set $D \subseteq [a, b]$, such that $f'_+(t)$, $g'_+(t)$ exist for all $t \in [a, b] \setminus D$ and
\[ \big\| f'_+(t) \big\|_X \le g'_+(t) \qquad \forall\, t \in [a, b] \setminus D. \]

COROLLARY 4.1.19 If $f : [a, b] \longrightarrow X$ is continuous, right differentiable at every $t \in (a, b)$ and
\[ \big\| f'_+(t) \big\|_X \le k \qquad \forall\, t \in (a, b), \]
then
\[ \big\| f(t) - f(s) \big\|_X \le k|t - s| \qquad \forall\, t, s \in [a, b]. \]

COROLLARY 4.1.20 If $g : [a, b] \longrightarrow \mathbb{R}$ is a continuous function, which is right differentiable at every $t \in (a, b)$, then $g$ is increasing if and only if
\[ g'_+(t) \ge 0 \qquad \forall\, t \in (a, b). \]


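We have the following mean value theorem.

PROPOSITION 4.1.21 (Mean Value Theorem) If $f : X \longrightarrow \mathbb{R}$ is a Gâteaux differentiable function and $x, h \in X$, then we can find $\lambda_0 \in (0, 1)$, such that
\[ f(x + h) - f(x) = \big\langle f'_G(x + \lambda_0 h), h \big\rangle_X. \]

PROOF Let
\[ \varphi(\lambda) \stackrel{\mathrm{df}}{=} f(x + \lambda h). \]
Recall that
\[ \big\langle f'_G(x + \lambda_0 h), h \big\rangle_X = \varphi'(\lambda_0) \]
(see Remark 4.1.2). Using the mean value theorem for scalar functions, we can find $\lambda_0 \in (0, 1)$, such that $\varphi(1) - \varphi(0) = \varphi'(\lambda_0)$, so
\[ f(x + h) - f(x) = \big\langle f'_G(x + \lambda_0 h), h \big\rangle_X. \]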

In general for vector valued functions the mean value theorem fails as the next example illustrates.

EXAMPLE 4.1.22 Let $f : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$ be defined by
\[ f(x) \stackrel{\mathrm{df}}{=} (x_1^3, x_2^2) \qquad \forall\, x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \in \mathbb{R}^2. \]
We have
\[ f'(x) = \begin{bmatrix} 3x_1^2 & 0 \\ 0 & 2x_2 \end{bmatrix}. \]
If $x = \binom{0}{0}$ and $y = \binom{1}{1}$, then it is clear that there is no $z \in [x, y] = \big\{ \lambda x + (1 - \lambda)y : \lambda \in [0, 1] \big\}$, such that
\[ f(y) - f(x) = f'(z)(y - x). \]

For vector valued functions the mean value theorem takes an inequality form.

PROPOSITION 4.1.23 (Mean Value Theorem) If $f : X \longrightarrow Y$ is a Gâteaux differentiable function and $x, h \in X$, $y^* \in Y^*$, then we can find $\lambda_0 \in (0, 1)$, such that
\[ \big\langle y^*, f(x + h) - f(x) \big\rangle_Y = \big\langle y^*, f'_G(x + \lambda_0 h)h \big\rangle_Y \]
and
\[ \big\| f(x + h) - f(x) \big\|_Y \le \big\| f'_G(x + \lambda_0 h) \big\|_{\mathcal{L}} \|h\|_X. \]

PROOF Let
\[ g(x) \stackrel{\mathrm{df}}{=} \big\langle y^*, f(x) \big\rangle_Y. \]
Then
\[ \big\langle g'_G(x), h \big\rangle_X = \big\langle y^*, f'_G(x)h \big\rangle_Y. \]
From Proposition 4.1.21, we know that we can find $\lambda_0 \in (0, 1)$, such that
\[ g(x + h) - g(x) = \big\langle g'_G(x + \lambda_0 h), h \big\rangle_X, \]
hence
\[ \big\langle y^*, f(x + h) - f(x) \big\rangle_Y = \big\langle y^*, f'_G(x + \lambda_0 h)h \big\rangle_Y. \]
Since $y^* \in Y^*$ is arbitrary, we choose $\|y^*\|_{Y^*} = 1$, such that
\[ \big\langle y^*, f(x + h) - f(x) \big\rangle_Y = \big\| f(x + h) - f(x) \big\|_Y. \]
So we can find $\lambda_0 \in (0, 1)$, such that
\[ \big\| f(x + h) - f(x) \big\|_Y = \big\langle y^*, f'_G(x + \lambda_0 h)h \big\rangle_Y \le \big\| f'_G(x + \lambda_0 h) \big\|_{\mathcal{L}} \|h\|_X. \]
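Both the failure of the equality form in Example 4.1.22 and the inequality form of Proposition 4.1.23 can be checked numerically for $f(x) = (x_1^3, x_2^2)$. This is a sketch only: the grid size, the choice $y^* = (1, 1)/\sqrt{2}$ and all helper names are assumptions made for illustration. For this $y^*$ the scalar equation $3t^2 + 2t = 2$ determines $\lambda_0$.

```python
import math

X0 = (0.0, 0.0)
Y0 = (1.0, 1.0)
H = (Y0[0] - X0[0], Y0[1] - X0[1])


def f(x):
    return (x[0] ** 3, x[1] ** 2)


def jacobian_apply(z, v):
    """Apply f'(z) = diag(3*z1^2, 2*z2) to the vector v."""
    return (3.0 * z[0] ** 2 * v[0], 2.0 * z[1] * v[1])


def equality_gap(t):
    """Max-norm of f(y) - f(x) - f'(z)(y - x) at the point z = x + t*(y - x)."""
    z = (X0[0] + t * H[0], X0[1] + t * H[1])
    lhs = tuple(a - b for a, b in zip(f(Y0), f(X0)))
    rhs = jacobian_apply(z, H)
    return max(abs(l - r) for l, r in zip(lhs, rhs))


def min_gap(samples=1001):
    """Smallest gap over a uniform grid on the segment [x, y]."""
    return min(equality_gap(i / (samples - 1)) for i in range(samples))


def inequality_check():
    """Scalar equality and norm inequality for y* = (1, 1)/sqrt(2)."""
    lam0 = (-2.0 + math.sqrt(28.0)) / 6.0  # root of 3 t^2 + 2 t = 2 in (0, 1)
    z = (X0[0] + lam0 * H[0], X0[1] + lam0 * H[1])
    ystar = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
    diff = tuple(a - b for a, b in zip(f(Y0), f(X0)))
    scalar_lhs = sum(s * d for s, d in zip(ystar, diff))
    scalar_rhs = sum(s * d for s, d in zip(ystar, jacobian_apply(z, H)))
    # Euclidean operator norm of a diagonal matrix: largest absolute diagonal entry.
    op_norm = max(abs(3.0 * z[0] ** 2), abs(2.0 * z[1]))
    return lam0, scalar_lhs, scalar_rhs, op_norm * math.hypot(*H), math.hypot(*diff)
```

The gap in `min_gap()` stays bounded away from zero, so no point of the segment gives equality, while `inequality_check()` returns a $\lambda_0 \in (0, 1)$ for which the scalar identity holds and the norm bound is satisfied.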

COROLLARY 4.1.24 If $U \subseteq X$ is connected and open, $f : U \longrightarrow Y$ is a Gâteaux differentiable function and
\[ f'_G(x) = 0 \qquad \forall\, x \in U, \]
then $f$ is constant on $U$.

Next we state and prove two major results of differential calculus. These are the implicit function theorem and the inverse function theorem. The implicit function theorem deals with the following situation. Let a function $f(x, y)$ be given and suppose that $f(x_0, y_0) = c$. Can we find a function $x \longmapsto y = g(x)$, which at least locally satisfies
\[ f\big(x, g(x)\big) = c\,? \]
We want $g$ to be differentiable provided $f$ is differentiable. Moreover, in the neighbourhood, where
\[ f\big(x, g(x)\big) = c \]
is valid, $g(x)$ should be the unique solution. To better motivate this consider the following simple example.


EXAMPLE 4.1.25 Let $f : \mathbb{R}^2 \longrightarrow \mathbb{R}$ be defined by
\[ f(x, y) \stackrel{\mathrm{df}}{=} x^2 + y^2 - 1. \]
We consider the 0-level set of $f$, namely the set of those $x, y \in \mathbb{R}$ that satisfy $f(x, y) = 0$, which in our case is of course the unit circle. We look for a function $g(x)$, such that $f\big(x, g(x)\big) = 0$ for all $x$ in the domain of $g$. Evidently
\[ g(x) = \pm\sqrt{1 - x^2} \]
and so $g$ need not be unique unless we restrict its domain. Also near $x_0 = \pm 1$, $g$ could be either square root, so it is not uniquely determined. Note that at $x_0 = \pm 1$, $g$ is not differentiable and
\[ \frac{\partial f}{\partial y} = 0. \]
So to produce a unique differentiable function $g$, such that
\[ f\big(x, g(x)\big) = 0, \]
we need to look locally and impose some condition like
\[ \frac{\partial f}{\partial y} \neq 0. \]

The proof of the implicit function theorem uses the Banach fixed point theorem, which we state here in the form needed and postpone the proof of the general version until Section 7.1.

PROPOSITION 4.1.26 (Banach Fixed Point Theorem) If $V$ is a Banach space, $C$ is a closed subset of $V$ and $S : C \longrightarrow C$ satisfies
\[ \big\| S(v_1) - S(v_2) \big\|_V \le k \|v_1 - v_2\|_V \qquad \forall\, v_1, v_2 \in C, \]
for some $k \in [0, 1)$, then there exists unique $v \in C$, such that $v = S(v)$. Moreover, if we have a parametrized family $\big\{ S(x) \big\}_{x \in U}$ (with $U$ being an open subset of a Banach space $W$) satisfying the above contraction condition with $k \in [0, 1)$ independent of $x$, then the unique solution $v = v(x)$ of
\[ v = S(x)v \]
depends continuously on $x$.
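For the circle of Example 4.1.25, the parametrized contraction of Proposition 4.1.26 can be iterated directly. The sketch below assumes the base point $(x_0, y_0) = (0, 1)$ and uses the map $y \longmapsto y - f(x, y) / \frac{\partial f}{\partial y}(x_0, y_0)$; the function names, tolerance and iteration cap are illustrative assumptions. On $|y - 1| \le \frac{1}{2}$ this map has Lipschitz constant $|1 - y| \le \frac{1}{2}$ in $y$.

```python
import math


def f(x, y):
    """f(x, y) = x^2 + y^2 - 1, whose zero level set is the unit circle."""
    return x * x + y * y - 1.0


def S(x, y, y0=1.0):
    """Map y -> y - f(x, y) / L0 with L0 = (df/dy)(0, y0) = 2*y0."""
    return y - f(x, y) / (2.0 * y0)


def solve(x, y0=1.0, tol=1e-13, max_iter=200):
    """Fixed point iteration started at y0; returns the approximate fixed point."""
    y = y0
    for _ in range(max_iter):
        y_next = S(x, y, y0)
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    raise RuntimeError("fixed point iteration did not converge")


if __name__ == "__main__":
    for x in [0.0, 0.1, 0.3, 0.5]:
        print(x, solve(x), math.sqrt(1.0 - x * x))
```

For small $|x|$ the iteration recovers the upper branch $\sqrt{1 - x^2}$, and nearby parameters give nearby fixed points, which is the continuous dependence asserted in the proposition.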


Using this proposition we can prove the implicit function theorem. In what follows for a function $f(x, y)$ by $D_1 f(x, y)$ (respectively $D_2 f(x, y)$) we denote the partial derivative of $f$ with respect to $x$ (respectively $y$).

THEOREM 4.1.27 (Implicit Function Theorem) If $X, Y, Z$ are three Banach spaces, $U \subseteq X \times Y$ is an open set, $(x_0, y_0) \in U$, $f : U \longrightarrow Z$ is a continuously differentiable function, $f(x_0, y_0) = 0$ and $D_2 f(x_0, y_0) \in \mathcal{L}(Y; Z)$ is invertible with a continuous inverse, i.e., $D_2 f(x_0, y_0)$ is an isomorphism, then there exist neighbourhoods $U_1$ of $x_0$ and $U_2$ of $y_0$, such that $U_1 \times U_2 \subseteq U$ and a unique continuously differentiable function $g : U_1 \longrightarrow U_2$, such that
\[ f\big(x, g(x)\big) = 0 \qquad \forall\, x \in U_1 \]
and
\[ Dg(x) = -\Big( D_2 f\big(x, g(x)\big) \Big)^{-1} D_1 f\big(x, g(x)\big) \qquad \forall\, x \in U_1. \]

PROOF Let
\[ L_0 \stackrel{\mathrm{df}}{=} D_2 f(x_0, y_0) \in \mathcal{L}(Y; Z). \]
By hypothesis $L_0$ is an isomorphism. Then the equation $f(x, y) = 0$ can be equivalently rewritten as
\[ y = y - L_0^{-1} f(x, y). \tag{4.5} \]
The advantage of passing to (4.5) is that we can apply Proposition 4.1.26. Namely for every $x$, we look for a fixed point of $y \longmapsto y - L_0^{-1} f(x, y)$ and to do this we employ Proposition 4.1.26. Let us set
\[ h(x, y) \stackrel{\mathrm{df}}{=} y - L_0^{-1} f(x, y). \]
Since $L_0^{-1} \circ L_0 = \mathrm{id}_Y$, we have
\[ h(x, y_1) - h(x, y_2) = L_0^{-1} \Big[ L_0(y_1 - y_2) - \big( f(x, y_1) - f(x, y_2) \big) \Big]. \]
Because $f$ is $C^1$ at $(x_0, y_0)$ and $L_0$ is an isomorphism, we can find $\delta_1 > 0$ and $\vartheta > 0$, such that if $\|x - x_0\|_X \le \delta_1$, $\|y_1 - y_0\|_Y \le \vartheta$, $\|y_2 - y_0\|_Y \le \vartheta$, then
\[ \big\| h(x, y_1) - h(x, y_2) \big\|_Y \le \frac{1}{2} \|y_1 - y_2\|_Y. \tag{4.6} \]
Also because of the continuity of $h(\cdot, y_0)$, we can find $\delta_2 > 0$, such that if $\|x - x_0\|_X \le \delta_2$, then
\[ \big\| h(x, y_0) - h(x_0, y_0) \big\|_Y < \frac{\vartheta}{2}. \tag{4.7} \]


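Therefore, from (4.6) and (4.7), if $\delta = \min\{\delta_1, \delta_2\}$ and $\|x - x_0\|_X \le \delta$, $\|y - y_0\|_Y \le \vartheta$, we have
\begin{align*}
\big\| h(x, y) - y_0 \big\|_Y &= \big\| h(x, y) - h(x_0, y_0) \big\|_Y \\
&\le \big\| h(x, y) - h(x, y_0) \big\|_Y + \big\| h(x, y_0) - h(x_0, y_0) \big\|_Y \\
&\le \frac{1}{2} \|y - y_0\|_Y + \frac{\vartheta}{2} \le \vartheta. \tag{4.8}
\end{align*}
So we see that $h(x, \cdot)$ maps $\overline{B}_\vartheta(y_0) = \big\{ y \in Y : \|y - y_0\|_Y \le \vartheta \big\}$ into itself as well as $B_\vartheta(y_0) = \big\{ y \in Y : \|y - y_0\|_Y < \vartheta \big\}$ into itself (see (4.7) and (4.8)), for all $x \in \overline{B}_\delta(x_0) = \big\{ x \in X : \|x - x_0\|_X \le \delta \big\}$. We can apply Proposition 4.1.26 to the parametric family $\big\{ y \longmapsto h(x, y) \big\}_{x \in \overline{B}_\delta(x_0)}$. So for every $x \in \overline{B}_\delta(x_0)$, we can find unique $y = y(x) \in \overline{B}_\vartheta(y_0)$, such that $h(x, y) = y$, hence $f(x, y) = 0$ and the function $g(x) = y(x)$ is continuous. Let
\[ U_1 \stackrel{\mathrm{df}}{=} B_\delta(x_0) \quad \text{and} \quad U_2 \stackrel{\mathrm{df}}{=} B_\vartheta(y_0). \]
Evidently by choosing $\delta > 0$ and $\vartheta > 0$ small enough we can have that $U_1 \times U_2 \subseteq U$. We claim that the function $g : B_\delta(x_0) \longrightarrow Y$ is continuously differentiable. To this end let $(x_1, y_1) \in U_1 \times U_2$, $y_1 = g(x_1)$ (recall that $h(x, \cdot)$ maps $U_2$ into itself). Exploiting the differentiability of $f$ at $(x_1, y_1)$, we have
\[ f(x, y) = A(x - x_1) + B(y - y_1) + u(x, y) \qquad \forall\, (x, y) \in U, \]
with
\[ A \stackrel{\mathrm{df}}{=} D_1 f(x_1, y_1), \qquad B \stackrel{\mathrm{df}}{=} D_2 f(x_1, y_1) \]
and
\[ \lim_{(x, y) \to (x_1, y_1)} \frac{\|u(x, y)\|_Z}{\|(x - x_1, y - y_1)\|_{X \times Y}} = 0. \]
Recall that
\[ f\big(x, g(x)\big) = 0 \qquad \forall\, x \in U_1. \]
Hence
\[ g(x) = -B^{-1}A(x - x_1) + y_1 - B^{-1} u\big(x, g(x)\big). \tag{4.9} \]
We can find $r_1, r_2 > 0$, such that if $\|x - x_1\|_X \le r_1$, $\|y - y_1\|_Y \le r_2$, then
\[ \big\| u(x, y) \big\|_Z \le \frac{1}{2\|B^{-1}\|_{\mathcal{L}}} \big( \|x - x_1\|_X + \|y - y_1\|_Y \big), \]
so
\[ \big\| u\big(x, g(x)\big) \big\|_Z \le \frac{1}{2\|B^{-1}\|_{\mathcal{L}}} \big( \|x - x_1\|_X + \|g(x) - g(x_1)\|_Y \big). \tag{4.10} \]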


From (4.9) and (4.10), it follows that
\[ \big\| g(x) - g(x_1) \big\|_Y \le \big\| B^{-1}A \big\|_{\mathcal{L}} \|x - x_1\|_X + \frac{1}{2} \|x - x_1\|_X + \frac{1}{2} \big\| g(x) - g(x_1) \big\|_Y, \]
so
\[ \big\| g(x) - g(x_1) \big\|_Y \le \eta \|x - x_1\|_X, \tag{4.11} \]
with $\eta \stackrel{\mathrm{df}}{=} 2\big\| B^{-1}A \big\|_{\mathcal{L}} + 1$. Let
\[ v(x) \stackrel{\mathrm{df}}{=} -B^{-1} u\big(x, g(x)\big). \]
From (4.9), we have
\[ g(x) - g(x_1) = -B^{-1}A(x - x_1) + v(x) \tag{4.12} \]
and since
\[ \big\| v(x) \big\|_Y \le \big\| B^{-1} \big\|_{\mathcal{L}} \big\| u\big(x, g(x)\big) \big\|_Z \]
and $g$ is continuous, we have
\[ \lim_{x \to x_1} \frac{\|v(x)\|_Y}{\|x - x_1\|_X} = 0. \tag{4.13} \]
From (4.12) and (4.13), it follows that $g$ is Fréchet differentiable at $x_1 \in U_1$ and
\[ g'_F(x_1) = -B^{-1}A = -\big( D_2 f(x_1, y_1) \big)^{-1} D_1 f(x_1, y_1), \]
which means that $g$ is continuously differentiable.

DEFINITION 4.1.28 Let $Z$ be a Banach space and let $V \subseteq Z$ be a closed subspace of $Z$. We say that $V$ is complemented, if there is a closed subspace $W$ of $Z$, such that $Z = V \oplus W$ (i.e., $Z = V + W$ and $V \cap W = \{0\}$).

REMARK 4.1.29 The subspace $V \subseteq Z$ is complemented if and only if there exists a bounded linear projection of $Z$ onto $V$, i.e., there exists $P_V \in \mathcal{L}(Z)$, such that
\[ P_V|_V = \mathrm{id}_V \quad \text{and} \quad P_V(Z) = V. \]
The closed subspace $c_0$ of $l^\infty$ is not complemented. If every closed subspace of a Banach space $Z$ is complemented, then $Z$ is isomorphic to a Hilbert space. Every closed subspace of the Banach space $Z$, which is either finite dimensional or has finite codimension, is complemented. Finally in a Hilbert space every closed subspace is complemented (take the orthogonal complement).


An interesting consequence of Theorem 4.1.27 is the following corollary.

COROLLARY 4.1.30 If $U \subseteq X$ is open, $f : U \longrightarrow Y$ is a continuously differentiable function, $f'_F(x_0)$ is surjective and $\ker f'_F(x_0)$ is complemented, then $f(U)$ contains a neighbourhood of $f(x_0)$.

PROOF Let $V \stackrel{\mathrm{df}}{=} \ker f'_F(x_0)$. Then $X = V \oplus W$ (see Definition 4.1.28) and so for all $x \in X$ we have $x = v + w$, with $v \in V$ and $w \in W$. We write $f(x) = f(v, w)$. Evidently
\[ D_2 f(x_0) \in \mathcal{L}(W; Y) \]
is an isomorphism. So we can apply Theorem 4.1.27 and conclude that $f(U)$ contains the neighbourhood $U_2$ of $y_0 = f(x_0)$ postulated by Theorem 4.1.27.

In the next example we use Corollary 4.1.30 to prove an existence theorem for differential equations.

EXAMPLE 4.1.31 Let
\[ X = C^1\big([0, 1]\big) \quad \text{and} \quad Y = C\big([0, 1]\big). \]
Let $f : X \longrightarrow Y$ be defined by
\[ f(x) \stackrel{\mathrm{df}}{=} \frac{dx}{dt} + x^3. \]
It is easy to see that $f$ is a $C^1$-map and
\[ f'_F(0) = \frac{d}{dt} \in \mathcal{L}(X; Y). \]
From the fundamental theorem of calculus, we have that $f'_F(0)$ is surjective. Also $\ker f'_F(0)$ is the space of constant functions, hence it is complemented (see Remark 4.1.29). In fact the complement of $\ker f'_F(0)$ is given by
\[ W \stackrel{\mathrm{df}}{=} \left\{ x \in X : \int_0^1 x(t)\, dt = 0 \right\}. \]
So we can apply Corollary 4.1.30 and conclude the following:


"We can find $\varepsilon > 0$, such that if $y \in Y$ with $\|y\|_Y < \varepsilon$, then the differential equation
\[ \frac{dx(t)}{dt} + x(t)^3 = y(t) \qquad \forall\, t \in [0, 1] \]
has a solution $x \in X$."

Using the implicit function theorem (see Theorem 4.1.27), we can prove the inverse function theorem.

THEOREM 4.1.32 (Inverse Function Theorem) If $U \subseteq Y$ is an open set, $f : U \longrightarrow X$ is a continuously differentiable function, $y_0 \in U$ and $f'_F(y_0) \in \mathcal{L}(Y; X)$ is an isomorphism, then there exists a neighbourhood $U'$ of $y_0$, $U' \subseteq U$ and $V'$ a neighbourhood of $x_0 = f(y_0)$, such that $f : U' \longrightarrow V'$ is a diffeomorphism and
\[ (f^{-1})'_F(x_0) = f'_F(y_0)^{-1}. \]

PROOF Let
\[ h(x, y) \stackrel{\mathrm{df}}{=} f(y) - x. \]
Then
\[ D_2 h(x_0, y_0) = f'_F(y_0), \]
which by hypothesis is an isomorphism. So by virtue of Theorem 4.1.27 we can find a neighbourhood $V'$ of $x_0$ and a continuously differentiable map $g : V' \longrightarrow Y$, such that $g(V') \subseteq U_0$ for a neighbourhood $U_0$ of $y_0$,
\[ h\big(x, g(x)\big) = 0 \qquad \forall\, x \in V' \]
(i.e., $f\big(g(x)\big) = x$ for all $x \in V'$) and
\[ g(x_0) = y_0. \]
In the sequel we consider $f$ restricted to $g(V')$. Since
\[ f\big(g(x)\big) = x, \]
we see that $g$ is injective on $V'$, hence a bijection from $V'$ onto $g(V')$. In addition $g(V') = f^{-1}(V')$ is open because $f$ is continuous. So we set
\[ U' = g(V') \]
and we have that $f : U' \longrightarrow V'$ is a bijection. Finally since
\[ g'_F(x_0) = -\Big( D_2 h\big(x_0, g(x_0)\big) \Big)^{-1} D_1 h\big(x_0, g(x_0)\big), \]
we have
\[ f'_F(y_0) \circ g'_F(x_0) = \mathrm{id}_X, \]
hence
\[ g'_F(x_0) = (f^{-1})'_F(x_0) = f'_F(y_0)^{-1}. \]
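The derivative formula of Theorem 4.1.32 can be illustrated in one dimension. The sketch below assumes $X = Y = \mathbb{R}$ and the sample map $f(y) = y + y^3$ (an illustrative choice, not from the text), inverts $f$ by bisection and compares a central difference quotient of $f^{-1}$ at $x_0 = f(1) = 2$ with $f'(1)^{-1} = \frac{1}{4}$. The bracket, iteration count and step `eps` are assumptions.

```python
def f(y):
    """Sample strictly increasing map with f'(y) = 1 + 3*y^2 > 0."""
    return y + y ** 3


def f_prime(y):
    return 1.0 + 3.0 * y * y


def f_inverse(x, lo=-10.0, hi=10.0, iterations=200):
    """Invert f on [lo, hi] by bisection; valid because f is increasing there."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inverse_derivative_fd(x0, eps=1e-5):
    """Central difference approximation of the derivative of f^{-1} at x0."""
    return (f_inverse(x0 + eps) - f_inverse(x0 - eps)) / (2.0 * eps)


if __name__ == "__main__":
    y0 = 1.0
    x0 = f(y0)
    print(inverse_derivative_fd(x0), 1.0 / f_prime(y0))
```

The two printed numbers agree to several digits, matching $(f^{-1})'_F(x_0) = f'_F(y_0)^{-1}$ in this scalar case.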

In finite dimensions this theorem has the following useful consequences.

COROLLARY 4.1.33 If $V \subseteq \mathbb{R}^N$ is an open set, $x_0 \in V$, $f : V \to \mathbb{R}^M$ is a continuously differentiable function and $y_0 = f(x_0)$, then

(a) if $N \le M$ and $f'_F(x_0)$ is of maximal rank (i.e., of rank $N$), then we can find a neighbourhood $U'$ of $y_0$, $V'$ a neighbourhood of $x_0$ and a continuously differentiable function $g : U' \to \mathbb{R}^M$, such that

$$(g \circ f)(x) = i(x) \qquad \forall\, x \in V',$$

where $i : \mathbb{R}^N \to \mathbb{R}^M$ is the canonical injection, i.e., $i(x_1,\dots,x_N) = (x_1,\dots,x_N,0,\dots,0)$;

(b) if $N \ge M$ and $f'_F(x_0)$ is of maximal rank (i.e., of rank $M$), then we can find a neighbourhood $\widehat{V}$ of $x_0$ and a continuously differentiable map $\vartheta : \widehat{V} \to V$, with $\vartheta(x_0) = x_0$ and

$$(f \circ \vartheta)(x) = \operatorname{proj}_{\mathbb{R}^M}(x) \qquad \forall\, x \in \widehat{V},$$

where $\operatorname{proj}_{\mathbb{R}^M} : \mathbb{R}^N \to \mathbb{R}^M$ is the canonical projection, i.e., $\operatorname{proj}_{\mathbb{R}^M}(x_1,\dots,x_N) = (x_1,\dots,x_M)$.

PROOF (a) By hypothesis

$$\operatorname{rank}\Big( \frac{\partial f_i}{\partial x_j}(x_0) \Big)_{\substack{i \in \{1,\dots,M\} \\ j \in \{1,\dots,N\}}} = N.$$

By relabelling things if necessary, we may assume that

$$\det\Big( \frac{\partial f_i}{\partial x_j}(x_0) \Big)_{\substack{i \in \{1,\dots,N\} \\ j \in \{1,\dots,N\}}} \ne 0.$$

Let $\xi : V \times \mathbb{R}^{M-N} \to \mathbb{R}^M$ be defined by

$$\xi(x_1,\dots,x_M) \overset{\text{df}}{=} f(x_1,\dots,x_N) + (0,\dots,0,x_{N+1},\dots,x_M).$$

We have

$$\det\Big( \frac{\partial \xi_i}{\partial x_j}(x_0,0) \Big)_{\substack{i \in \{1,\dots,M\} \\ j \in \{1,\dots,M\}}} \ne 0.$$

So Theorem 4.1.27 implies that locally there exists an inverse $g$ of $\xi$, such that

$$i(x) = (g \circ \xi \circ i)(x) = (g \circ f)(x).$$

(b) Again we may assume without any loss of generality that

$$\det\Big( \frac{\partial f_i}{\partial x_j}(x_0) \Big)_{\substack{i \in \{1,\dots,M\} \\ j \in \{1,\dots,M\}}} \ne 0.$$

We define $\eta : V \to \mathbb{R}^N$, by

$$\eta(x_1,\dots,x_N) \overset{\text{df}}{=} \big( f_1(x),\dots,f_M(x),x_{M+1},\dots,x_N \big).$$

Hence

$$\det\Big( \frac{\partial \eta_i}{\partial x_j}(x_0) \Big)_{\substack{i \in \{1,\dots,N\} \\ j \in \{1,\dots,N\}}} = \det\Big( \frac{\partial f_i}{\partial x_j}(x_0) \Big)_{\substack{i \in \{1,\dots,M\} \\ j \in \{1,\dots,M\}}} \ne 0.$$

So by Theorem 4.1.32, we can find a neighbourhood $\widehat{V}$ of $x_0$ and a continuously differentiable map $\vartheta : \widehat{V} \to V$, such that $\vartheta(x_0) = x_0$ and $\vartheta = \eta^{-1}$. Then

$$\operatorname{proj}_{\mathbb{R}^M}(x) = \big( \operatorname{proj}_{\mathbb{R}^M} \circ\, \eta \circ \vartheta \big)(x) = \big( f \circ \vartheta \big)(x).$$
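To make part (b) concrete, here is a small worked instance with $N = 2$, $M = 1$. The function $f(x_1,x_2) = x_1 + x_2^2$ and the base point $x_0 = (0,0)$ are assumptions chosen for illustration only; the maps $\eta$ and $\vartheta$ are built exactly as in the proof.

```latex
% Illustrative example (assumed): N = 2, M = 1, f(x_1,x_2) = x_1 + x_2^2, x_0 = (0,0).
\begin{align*}
  \frac{\partial f}{\partial x_1}(x_0) &= 1 \neq 0, \\
  \eta(x_1,x_2) &= \big(f(x_1,x_2),\, x_2\big) = (x_1 + x_2^2,\; x_2), \\
  \vartheta(u_1,u_2) &= \eta^{-1}(u_1,u_2) = (u_1 - u_2^2,\; u_2),
     \qquad \vartheta(0,0) = (0,0), \\
  (f \circ \vartheta)(u_1,u_2) &= (u_1 - u_2^2) + u_2^2 = u_1
     = \operatorname{proj}_{\mathbb{R}}(u_1,u_2).
\end{align*}
```

In the new coordinates $u = \eta(x)$ the function $f$ is literally the projection onto the first coordinate.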

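REMARK 4.1.34 In Corollary 4.1.33, part (a) tells us that $f$ locally looks like the inclusion map: for the maps

$$f : V' \subseteq \mathbb{R}^N \to U' \subseteq \mathbb{R}^M, \qquad i : V' \to \mathbb{R}^M, \qquad g : U' \to \mathbb{R}^M,$$

we have $g \circ f = i$ on $V'$. On the other hand part (b) tells us that $f$ locally looks like the projection map: for the maps

$$\vartheta : \widehat{V} \subseteq \mathbb{R}^N \to \vartheta(\widehat{V}) \subseteq V \subseteq \mathbb{R}^N, \qquad f : \vartheta(\widehat{V}) \to \mathbb{R}^M, \qquad p = \operatorname{proj}_{\mathbb{R}^M} : \widehat{V} \to \mathbb{R}^M,$$

we have $f \circ \vartheta = p$ on $\widehat{V}$. Both triangles are commutative diagrams.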

4.2 Convex Functions

In this section we focus on convex functions and their differentiability properties. We show that the algebraic property of convexity has important topological consequences such as continuity and differentiability. The situation is especially pleasant in the context of separable Banach spaces. Convex functions play a certain role in modern variational analysis and the applications require that we consider extended real valued convex functions, that is functions with values in $\overline{\mathbb{R}} \overset{\text{df}}{=} \mathbb{R} \cup \{+\infty\}$.

DEFINITION 4.2.1 Let $X$ be a Hausdorff topological space and let $\varphi : X \to \overline{\mathbb{R}}$ be a function. The effective domain of $\varphi$ is the set

$$\operatorname{dom} \varphi \overset{\text{df}}{=} \big\{ x \in X : \varphi(x) < +\infty \big\}.$$

We say that $\varphi$ is proper, if $\operatorname{dom} \varphi \ne \emptyset$. The epigraph of $\varphi$ is the set

$$\operatorname{epi} \varphi \overset{\text{df}}{=} \big\{ (x,\lambda) \in X \times \mathbb{R} : \varphi(x) \le \lambda \big\}.$$

The function $\varphi$ is lower semicontinuous, if for every $\lambda \in \mathbb{R}$, the sublevel set

$$L^{\varphi}_{\lambda} \overset{\text{df}}{=} \big\{ x \in X : \varphi(x) \le \lambda \big\}$$

is closed. If $X$ is a Hausdorff linear topological space, we say that $\varphi$ is convex, if for all $x_1, x_2 \in \operatorname{dom} \varphi$ and all $\lambda \in [0,1]$, we have

$$\varphi\big( \lambda x_1 + (1-\lambda) x_2 \big) \le \lambda \varphi(x_1) + (1-\lambda) \varphi(x_2).$$

We say that $\varphi$ is strictly convex if the above inequality is strict when $x_1 \ne x_2$ and $\lambda \in (0,1)$. The cone of proper, convex and lower semicontinuous functions is denoted by $\Gamma_0(X)$.

REMARK 4.2.2 It is well known that $\varphi$ is lower semicontinuous if and only if $\operatorname{epi} \varphi \subseteq X \times \mathbb{R}$ is closed or equivalently if $\varphi(x) \le \liminf_{\alpha} \varphi(x_\alpha)$ for every net $(x_\alpha)$ converging to $x$. Also $\varphi$ is convex if and only if $\operatorname{epi} \varphi \subseteq X \times \mathbb{R}$ is a convex set. This means that certain properties of proper, convex and lower semicontinuous functions can be deduced from these (rather special) closed, convex sets in $X \times \mathbb{R}$. So one can argue that the study of proper, convex, lower semicontinuous functions is a special case of the study of closed, convex sets. On the other hand, if $C$ is a nonempty subset of $X$, we can introduce the indicator function of $C$, by

$$i_C(x) \overset{\text{df}}{=} \begin{cases} 0 & \text{if } x \in C, \\ +\infty & \text{otherwise.} \end{cases}$$

Then $i_C \in \Gamma_0(X)$ if and only if $C$ is closed and convex. This example shows that it is possible to deduce certain properties of a closed, convex set from the properties of its indicator function which belongs in $\Gamma_0(X)$. So one can argue that the study of closed, convex sets is a special case of the study of proper, convex, lower semicontinuous functions. Both points of view are legitimate and it is a matter of which approach is more convenient, the geometric or the analytical.

The next theorem summarizes the continuity properties of a proper, convex function.

THEOREM 4.2.3 If $X$ is a Hausdorff linear space and $\varphi : X \to \overline{\mathbb{R}}$ is a proper, convex function, then the following statements are equivalent:

(a) $\varphi$ is bounded from above on a neighbourhood of $x_0 \in X$;

(b) $\varphi$ is continuous at $x_0 \in X$;

(c) $\operatorname{int} \operatorname{epi} \varphi \ne \emptyset$;

(d) $\operatorname{int} \operatorname{dom} \varphi \ne \emptyset$ and $\varphi|_{\operatorname{int} \operatorname{dom} \varphi}$ is continuous.

Moreover, if the above statements hold, then

$$\operatorname{int} \operatorname{epi} \varphi = \big\{ (x,\lambda) \in X \times \mathbb{R} : x \in \operatorname{int} \operatorname{dom} \varphi,\ \varphi(x) < \lambda \big\}.$$

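PROOF “(a)$\Longrightarrow$(b)”: Let $U$ be a neighbourhood of $x_0$, such that $\varphi|_U$ is bounded from above, i.e., there exists $c > 0$, such that

$$\varphi(x) \le c \qquad \forall\, x \in U.$$

Replacing, if necessary, $U$ by $U - x_0$ and $\varphi(x)$ by $\varphi(x + x_0) - \varphi(x_0)$, we may assume without any loss of generality that $x_0 = 0$ and that $\varphi(0) = 0$. We will show that $\varphi$ is continuous at $x_0 = 0$. Let $\varepsilon \in (0, c]$ and let us define

$$V_\varepsilon \overset{\text{df}}{=} \Big( \frac{\varepsilon}{c} U \Big) \cap \Big( -\frac{\varepsilon}{c} U \Big).$$

Evidently $V_\varepsilon$ is a symmetric neighbourhood of the origin. We shall show that

$$\big| \varphi(x) \big| \le \varepsilon \qquad \forall\, x \in V_\varepsilon \tag{4.14}$$

(which implies the continuity of $\varphi$ at $x_0 = 0$). So let $x \in V_\varepsilon$. We have $\frac{c}{\varepsilon} x \in U$ and because $\varphi$ is convex, we have

$$\varphi(x) \le \frac{\varepsilon}{c}\, \varphi\Big( \frac{c}{\varepsilon} x \Big) + \Big( 1 - \frac{\varepsilon}{c} \Big) \varphi(0) \le \frac{\varepsilon}{c}\, c = \varepsilon.$$

Also $-\frac{c}{\varepsilon} x \in U$ and so

$$0 = \varphi(0) = \varphi\bigg( \frac{1}{1 + \frac{\varepsilon}{c}}\, x + \frac{\frac{\varepsilon}{c}}{1 + \frac{\varepsilon}{c}} \Big( -\frac{c}{\varepsilon} x \Big) \bigg) \le \frac{1}{1 + \frac{\varepsilon}{c}}\, \varphi(x) + \frac{\frac{\varepsilon}{c}}{1 + \frac{\varepsilon}{c}}\, \varphi\Big( -\frac{c}{\varepsilon} x \Big) \le \frac{1}{1 + \frac{\varepsilon}{c}}\, \varphi(x) + \frac{\varepsilon}{1 + \frac{\varepsilon}{c}},$$

hence $-\varepsilon \le \varphi(x)$. So finally we obtain (4.14) and this proves the continuity of $\varphi$ at the origin.

“(b)$\Longrightarrow$(a)”: Since $\varphi$ is continuous at $x_0$, then it is bounded on a neighbourhood of $x_0$.

“(a)$\Longrightarrow$(c)”: By hypothesis, there exists a neighbourhood $U$ of $x_0$, such that

$$\varphi(x) \le c \qquad \forall\, x \in U.$$

So $U \subseteq \operatorname{int} \operatorname{dom} \varphi$ and

$$\big\{ (x,\lambda) \in X \times \mathbb{R} : x \in U,\ c < \lambda \big\} \subseteq \operatorname{epi} \varphi,$$

which implies that $\operatorname{int} \operatorname{epi} \varphi \ne \emptyset$.

“(c)$\Longrightarrow$(a)”: Let $(x,\lambda) \in \operatorname{int} \operatorname{epi} \varphi$. We can find a neighbourhood $U$ of $x$ and $r > 0$, such that $U \times [\lambda - r, \lambda + r] \subseteq \operatorname{epi} \varphi$, hence $U \times \{\lambda\} \subseteq \operatorname{epi} \varphi$ and so

$$\varphi(u) \le \lambda \qquad \forall\, u \in U,$$

which means that $\varphi$ is bounded from above in a neighbourhood of $x$.

“(a)$\Longrightarrow$(d)”: As before we may assume that $x_0 = 0$. Let $U$ be the neighbourhood of $x_0 = 0$ postulated by part (a). Evidently $U \subseteq \operatorname{dom} \varphi$ and so $\operatorname{int} \operatorname{dom} \varphi \ne \emptyset$. Let $x \in \operatorname{int} \operatorname{dom} \varphi$. Note that $\operatorname{dom} \varphi$ is convex. So we can find $r > 1$, such that $\widehat{x} = r x \in \operatorname{dom} \varphi$. Let

$$V \overset{\text{df}}{=} x + \Big( 1 - \frac{1}{r} \Big) U,$$

which is a neighbourhood of $x$. Exploiting the convexity of $\varphi$, for all $u \in V$, we have

$$u = x + \Big( 1 - \frac{1}{r} \Big) z \qquad \text{with } z \in U$$

and

$$\varphi(u) = \varphi\Big( \frac{1}{r}\, \widehat{x} + \Big( 1 - \frac{1}{r} \Big) z \Big) \le \frac{1}{r}\, \varphi(\widehat{x}) + \Big( 1 - \frac{1}{r} \Big) \varphi(z) \le \frac{1}{r}\, \varphi(\widehat{x}) + \Big( 1 - \frac{1}{r} \Big) c = \widehat{c}.$$

So $\varphi$ is bounded from above in a neighbourhood of $x$, hence continuous at $x \in \operatorname{int} \operatorname{dom} \varphi$ (recall that (a)$\Longleftrightarrow$(b)).

“(d)$\Longrightarrow$(a)”: Obvious.

Finally let us show that

$$\operatorname{int} \operatorname{epi} \varphi = \big\{ (x,\lambda) \in X \times \mathbb{R} : x \in \operatorname{int} \operatorname{dom} \varphi,\ \varphi(x) < \lambda \big\}.$$

Let us denote the right hand side set by $W$. Clearly $\operatorname{int} \operatorname{epi} \varphi \subseteq W$. On the other hand let $x \in \operatorname{int} \operatorname{dom} \varphi$ and $\varphi(x) < \lambda$. Let $\widehat{\lambda} \in \big( \varphi(x), \lambda \big)$. Because $\varphi|_{\operatorname{int} \operatorname{dom} \varphi}$ is continuous, there exists a neighbourhood $U$ of $x$, such that $U \subseteq \operatorname{int} \operatorname{dom} \varphi$ and

$$\varphi(u) < \widehat{\lambda} \qquad \forall\, u \in U.$$

Hence $(x,\lambda) \in U \times \big( \widehat{\lambda}, +\infty \big) \subseteq \operatorname{int} \operatorname{epi} \varphi$, and so $W \subseteq \operatorname{int} \operatorname{epi} \varphi$.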
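The estimate in step “(a) implies (b)” and the formula for the interior of the epigraph can be traced on a simple instance. The choices $X = \mathbb{R}$, $\varphi(x) = x^2$, $U = (-1,1)$ and $c = 1$ are assumptions made only for this sketch.

```latex
% Illustrative example (assumed): X = \mathbb{R}, \varphi(x) = x^2, x_0 = 0, U = (-1,1), c = 1.
\begin{align*}
  V_\varepsilon &= \varepsilon U \cap (-\varepsilon U) = (-\varepsilon,\varepsilon),
     \qquad \varepsilon \in (0,1], \\
  \varphi(x) &\le \varepsilon\,\varphi\!\Big(\frac{x}{\varepsilon}\Big)
     + (1-\varepsilon)\,\varphi(0) = \frac{x^2}{\varepsilon} < \varepsilon
     \qquad \forall\, x \in V_\varepsilon, \\
  \operatorname{int}\operatorname{epi}\varphi
     &= \big\{ (x,\lambda) \in \mathbb{R}\times\mathbb{R} : x^2 < \lambda \big\}.
\end{align*}
```

Here the lower bound $-\varepsilon \le \varphi(x)$ is trivial because $\varphi \ge 0$; in general it needs the second convexity argument of the proof.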

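REMARK 4.2.4 From the above theorem it follows that

$$\operatorname{int} \operatorname{dom} \varphi = \big\{ x \in X : \text{there exists } \lambda \in \mathbb{R} \text{ such that } (x,\lambda) \in \operatorname{int} \operatorname{epi} \varphi \big\}.$$

Also a convex function $\varphi$ can be continuous at a boundary point $x$ of $\operatorname{dom} \varphi$ where $\varphi(x) = +\infty$. To see this consider the convex function $\varphi : \mathbb{R} \to \overline{\mathbb{R}}$, defined by

$$\varphi(x) \overset{\text{df}}{=} \begin{cases} \dfrac{1}{x} & \text{if } x \in (0,+\infty), \\ +\infty & \text{if } x \in (-\infty,0]. \end{cases}$$

Recall that in $\overline{\mathbb{R}}$ the neighbourhoods of $+\infty$ are all the sets $(\lambda, +\infty]$ with $\lambda \in \mathbb{R}$. If $C$ is a nonempty, closed, convex set in $X$, then the indicator function $i_C \in \Gamma_0(X)$ is continuous at $x$ if and only if $x \in \operatorname{int} C$. Therefore, if $\operatorname{int} C = \emptyset$, then $i_C$ is not continuous at any point of $C = \operatorname{dom} i_C$.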
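The continuity of this $\varphi$ at the boundary point $x = 0$ can be checked directly against the neighbourhoods $(\lambda, +\infty]$ of $+\infty$. The sketch treats $\lambda > 0$; for $\lambda \le 0$ there is nothing to check since $\varphi > 0$ everywhere. The choice $\delta = 1/\lambda$ is ours.

```latex
% Continuity at x = 0 of \varphi(x) = 1/x on (0,+\infty), \varphi = +\infty on (-\infty,0].
\begin{align*}
  &\text{Given } \lambda > 0, \text{ let } \delta = 1/\lambda. \text{ Then for } |x| < \delta: \\
  &\qquad x \le 0 \;\Longrightarrow\; \varphi(x) = +\infty \in (\lambda,+\infty], \\
  &\qquad 0 < x < \delta \;\Longrightarrow\; \varphi(x) = \frac{1}{x} > \frac{1}{\delta} = \lambda .
\end{align*}
```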


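If $X$ is finite dimensional, the situation is remarkably simple.

PROPOSITION 4.2.5 If $\varphi : X \to \overline{\mathbb{R}}$ is convex and $X$ is finite dimensional, then $\varphi$ is continuous on $\operatorname{int} \operatorname{dom} \varphi$.

PROOF Let $x \in \operatorname{int} \operatorname{dom} \varphi$. We can find $\{e_k\}_{k=0}^{N} \subseteq X$ ($N = \dim X$) and $r > 0$, such that

$$B_r(x) \subseteq \operatorname{conv} \{e_k\}_{k=0}^{N} \subseteq \operatorname{dom} \varphi.$$

So if $y \in B_r(x)$, we can find $\{\lambda_k\}_{k=0}^{N} \subseteq [0,1]$, such that

$$\sum_{k=0}^{N} \lambda_k = 1 \qquad \text{and} \qquad y = \sum_{k=0}^{N} \lambda_k e_k.$$

Then because $\varphi$ is convex, we have

$$\varphi(y) \le \sum_{k=0}^{N} \lambda_k \varphi(e_k) \le \Big( \sum_{k=0}^{N} \lambda_k \Big) \max_{k \in \{0,\dots,N\}} \varphi(e_k) = c < +\infty.$$

So $\varphi$ is bounded from above on $B_r(x)$ and by virtue of Theorem 4.2.3, $\varphi|_{\operatorname{int} \operatorname{dom} \varphi}$ is continuous.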
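In dimension one the simplex argument reduces to an interval. The sketch assumes $X = \mathbb{R}$, $x = 0$, and that $[-1,1] \subseteq \operatorname{dom}\varphi$, with the hypothetical vertices $e_0 = -1$, $e_1 = 1$ and $r = 1$.

```latex
% Illustrative example (assumed): X = \mathbb{R}, N = 1, x = 0, e_0 = -1, e_1 = 1, r = 1.
\begin{align*}
  y &= \lambda_0 e_0 + \lambda_1 e_1, \qquad
  \lambda_0 = \frac{1-y}{2}, \quad \lambda_1 = \frac{1+y}{2}, \qquad y \in (-1,1), \\
  \varphi(y) &\le \frac{1-y}{2}\,\varphi(-1) + \frac{1+y}{2}\,\varphi(1)
     \le \max\{\varphi(-1),\varphi(1)\} = c .
\end{align*}
```

So a convex function on the line is bounded above on $(-1,1)$ by its values at the two endpoints, which is exactly the bound that Theorem 4.2.3 turns into continuity.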

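To have an infinite dimensional analog of the above proposition, we need an extra condition on the function $\varphi$.

THEOREM 4.2.6 If $X$ is a Banach space and $\varphi : X \to \overline{\mathbb{R}}$ is convex and lower semicontinuous, then $\varphi|_{\operatorname{int} \operatorname{dom} \varphi}$ is continuous.

PROOF We have

$$\operatorname{dom} \varphi = \bigcup_{n=1}^{\infty} \big\{ \varphi \le n \big\}.$$

Let $x \in \operatorname{int} \operatorname{dom} \varphi$. Since $\varphi$ is lower semicontinuous, the sets $\{\varphi \le n\}$ are closed. So by the Baire category theorem (see Theorem A.1.10), we can find $n \ge 1$, such that

$$\operatorname{int} \big\{ \varphi < n \big\} \ne \emptyset \qquad \text{and} \qquad \varphi(x) < n.$$

Let $y \in \operatorname{int} \{\varphi < n\}$ and set

$$h(\lambda) \overset{\text{df}}{=} \varphi\big( x + \lambda (y - x) \big) \qquad \forall\, \lambda \in \mathbb{R}.$$

Since $x \in \operatorname{int} \operatorname{dom} \varphi$, we can find $r > 0$, such that

$$\overline{B}_{r \|y - x\|_X}(x) \subseteq \operatorname{dom} \varphi.$$

We have $[-r, r] \subseteq \operatorname{dom} h$, hence $0 \in \operatorname{int} \operatorname{dom} h$ and so $h$ is continuous at $0$ (see Proposition 4.2.5). Because $h(0) < n$, we can find $\vartheta > 0$, such that

$$h(\lambda) < n \qquad \forall\, \lambda \in [-\vartheta, 0].$$

Let

$$z \overset{\text{df}}{=} x - \vartheta (y - x).$$

We have $z \in \{\varphi < n\}$ and [...]

[...] $\delta > 0$ and $c > 0$, such that

$$\overline{B}_{2\delta}(x_0) \subseteq U \qquad \text{and} \qquad \big| \varphi(y) \big| \le c \quad \forall\, y \in \overline{B}_{2\delta}(x_0).$$

Let $x, y \in \overline{B}_{\delta}(x_0)$ with $x \ne y$. Let us set

$$r \overset{\text{df}}{=} \|x - y\|_X \qquad \text{and} \qquad z \overset{\text{df}}{=} y + \frac{\delta}{r} (y - x).$$

Then $z \in \overline{B}_{2\delta}(x_0)$. Also we have

$$y = \frac{r}{r + \delta}\, z + \frac{\delta}{r + \delta}\, x.$$

So from the convexity of $\varphi$, we obtain

$$\varphi(y) \le \frac{r}{r + \delta}\, \varphi(z) + \frac{\delta}{r + \delta}\, \varphi(x),$$

so

$$\varphi(y) - \varphi(x) \le \frac{r}{r + \delta} \big( \varphi(z) - \varphi(x) \big) \le \frac{r}{\delta}\, 2c = \frac{2c}{\delta}\, \|x - y\|_X.$$

Interchanging the roles of $x$ and $y$ in the above argument, we conclude that

$$\big| \varphi(y) - \varphi(x) \big| \le \frac{2c}{\delta}\, \|x - y\|_X \qquad \forall\, x, y \in \overline{B}_{\delta}(x_0).$$

“$\Longleftarrow$”: Obvious.

For convex, continuous functions, it is possible to characterize Gâteaux and Fréchet differentiability at $x \in X$ only in terms of $\varphi$, that is without using the linear functionals $\varphi'_F(x)$ and $\varphi'_G(x)$.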
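Returning briefly to the local Lipschitz estimate just proved, the following sketch first verifies the convex combination used for $y$ and then evaluates the resulting constant on a concrete case. The choices $X = \mathbb{R}$, $\varphi(x) = x^2$, $x_0 = 0$, $\delta = 1$ and $c = 4$ are assumptions made for illustration.

```latex
% Verification of the convex combination with z = y + (\delta/r)(y - x):
\begin{align*}
  \frac{r}{r+\delta}\,z + \frac{\delta}{r+\delta}\,x
   &= \frac{r\,y + \delta\,(y-x) + \delta\,x}{r+\delta} = y .
\end{align*}
% Illustrative example (assumed): \varphi(x) = x^2 on \mathbb{R}, x_0 = 0, \delta = 1.
\begin{align*}
  |\varphi(y)| &\le 4 = c \qquad \forall\, y \in \overline{B}_{2}(0) = [-2,2], \\
  |\varphi(y) - \varphi(x)| &\le \frac{2c}{\delta}\,|x-y| = 8\,|x-y|
     \qquad \forall\, x,y \in [-1,1], \\
  |\varphi(y) - \varphi(x)| &= |y+x|\,|y-x| \le 2\,|x-y|
     \qquad \text{(direct computation)} .
\end{align*}
```

The constant $2c/\delta$ is therefore valid but not sharp; the argument only uses the bound $c$ on the larger ball.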


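PROPOSITION 4.2.8 If $U \subseteq X$ is an open set and $\varphi : U \to \mathbb{R}$ is a convex, continuous function, then $\varphi$ is Gâteaux differentiable at $x \in U$ if and only if

$$\lim_{\lambda \downarrow 0} \frac{\varphi(x + \lambda h) + \varphi(x - \lambda h) - 2\varphi(x)}{\lambda} = 0 \qquad \forall\, h \in X. \tag{4.15}$$

PROOF “$\Longrightarrow$”: Since $\varphi$ is Gâteaux differentiable at $x$, we have

$$\lim_{\lambda \downarrow 0} \frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} = \varphi'_G(x) h$$

and

$$\lim_{\lambda \downarrow 0} \frac{\varphi(x - \lambda h) - \varphi(x)}{\lambda} = \varphi'_G(x)(-h).$$

From these limits, we obtain immediately (4.15).

“$\Longleftarrow$”: Let

$$\psi(\lambda) \overset{\text{df}}{=} \varphi(x + \lambda h) \qquad \forall\, \lambda \in \mathbb{R}.$$

Then $\psi$ is convex and by hypothesis we have

$$\psi'_+(0) = \psi'_-(0).$$

Therefore $\psi$ is differentiable at $\lambda = 0$ and so $\varphi$ is Gâteaux differentiable at $x$ (see Remark 4.1.2).

In the case of Fréchet differentiability at $x$, (4.15) holds uniformly with respect to $h$ when $\|h\|_X = 1$.

PROPOSITION 4.2.9 If $U \subseteq X$ is an open set and $\varphi : U \to \mathbb{R}$ is a convex, continuous function, then $\varphi$ is Fréchet differentiable at $x \in U$ if and only if for every $\varepsilon > 0$ there exists $\delta > 0$, such that

$$\varphi(x + \lambda h) + \varphi(x - \lambda h) - 2\varphi(x) < \lambda \varepsilon \qquad \forall\, \|h\|_X = 1,\ \lambda \in (0,\delta). \tag{4.16}$$

PROOF “$\Longrightarrow$”: Since $\varphi$ is Fréchet differentiable at $x$, for a given $\varepsilon > 0$, we can find $\delta > 0$, such that

$$\varphi(x + \lambda h) - \varphi(x) - \big\langle \varphi'_F(x), \lambda h \big\rangle_X < \frac{\varepsilon}{2}\, \|\lambda h\|_X \qquad \forall\, \|h\|_X = 1,\ \lambda \in (0,\delta). \tag{4.17}$$

Rewriting (4.17) with $h$ replaced by $-h$ and adding we obtain (4.16).

“$\Longleftarrow$”: From Proposition 4.2.8 we know that $\varphi$ is Gâteaux differentiable at $x$. The convexity of $\varphi$ implies that

$$\frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} - \big\langle \varphi'_G(x), h \big\rangle_X \ge 0 \qquad \forall\, \lambda > 0 \tag{4.18}$$

and

$$\frac{\varphi(x) - \varphi(x - \lambda h)}{\lambda} - \big\langle \varphi'_G(x), h \big\rangle_X \le 0 \qquad \forall\, \lambda > 0. \tag{4.19}$$

Therefore

$$\begin{aligned} \lambda \varepsilon &> \varphi(x + \lambda h) + \varphi(x - \lambda h) - 2\varphi(x) \\ &= \Big[ \varphi(x + \lambda h) - \varphi(x) - \lambda \big\langle \varphi'_G(x), h \big\rangle_X \Big] - \Big[ \varphi(x) - \varphi(x - \lambda h) - \lambda \big\langle \varphi'_G(x), h \big\rangle_X \Big] \qquad \forall\, \|h\|_X = 1,\ \lambda \in (0,\delta). \end{aligned} \tag{4.20}$$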

From (4.18) and (4.19), we see that the right hand side of (4.20) is the sum of two nonnegative quantities both of which have to be less than $\lambda \varepsilon$ for all $\|h\|_X = 1$ and all $\lambda \in (0,\delta)$. This means that $\varphi$ is Fréchet differentiable at $x$.

It is well known from elementary calculus that for a function $\varphi : U \to \mathbb{R}$ with $U \subseteq \mathbb{R}^N$ being an open set, existence of all partial derivatives at $x \in U$ does not imply Fréchet differentiability. However, if $\varphi$ is convex this is true.

PROPOSITION 4.2.10 If $U \subseteq \mathbb{R}^N$ is an open set, $\varphi : U \to \mathbb{R}$ is a convex function and all the partial derivatives of $\varphi$ at $x \in U$ exist, then $\varphi$ is Fréchet differentiable at $x \in U$.

PROOF The obvious candidate for Fréchet derivative of $\varphi$ at $x$ is the linear transformation determined by the partial derivatives, that is $A(x) \in \mathcal{L}\big( \mathbb{R}^N; \mathbb{R} \big) = \mathbb{R}^N$, defined by

$$\big( A(x), h \big)_{\mathbb{R}^N} \overset{\text{df}}{=} \sum_{k=1}^{N} \frac{\partial \varphi}{\partial x_k}(x)\, h_k \qquad \forall\, h = (h_1,\dots,h_N) \in \mathbb{R}^N.$$

Let $r > 0$ be such that $B_r(x) \subseteq U$. For each $h \in B_r(0)$, let

$$\psi(h) \overset{\text{df}}{=} \varphi(x + h) - \varphi(x) - \big( A(x), h \big)_{\mathbb{R}^N}.$$

Evidently $\psi$ is convex on $B_r(0)$. For each $k \in \{1,\dots,N\}$, let us define a function $\xi_k : B_r(0) \to \mathbb{R}$, by

$$\xi_k(h) \overset{\text{df}}{=} \begin{cases} \dfrac{\psi(h_k e_k)}{h_k} & \text{if } h_k \ne 0, \\ 0 & \text{if } h_k = 0 \end{cases} \qquad \forall\, h \in B_r(0),$$

where $\{e_k\}_{k=1}^{N}$ is the orthonormal basis of $\mathbb{R}^N$. We have

$$\xi_k(h) \longrightarrow 0 \qquad \text{as } \|h\|_{\mathbb{R}^N} \to 0.$$

For each $h = (h_1,\dots,h_N)$ with $\|h\|_{\mathbb{R}^N} < \frac{r}{N}$, because of the convexity of $\psi$, we have

$$\psi(h) = \psi\Big( \frac{1}{N} \sum_{k=1}^{N} N h_k e_k \Big) \le \frac{1}{N} \sum_{k=1}^{N} \psi\big( N h_k e_k \big) = \sum_{k=1}^{N} h_k\, \xi_k(N h) \le \|h\|_{\mathbb{R}^N} \sum_{k=1}^{N} \big| \xi_k(N h) \big|.$$

Since

$$0 = \psi\Big( \frac{1}{2} h + \frac{1}{2} (-h) \Big) \le \frac{1}{2} \psi(h) + \frac{1}{2} \psi(-h),$$

we have $-\psi(-h) \le \psi(h)$. Therefore, it follows that

$$-\|h\|_{\mathbb{R}^N} \sum_{k=1}^{N} \big| \xi_k(-N h) \big| \le \psi(h) \le \|h\|_{\mathbb{R}^N} \sum_{k=1}^{N} \big| \xi_k(N h) \big|,$$

so

$$\frac{\psi(h)}{\|h\|_{\mathbb{R}^N}} \longrightarrow 0 \qquad \text{as } h \to 0$$

and thus $\varphi$ is Fréchet differentiable at $x \in U$ and $\varphi'_F(x) = A(x)$.

The above proposition implies that for convex functions on a finite dimensional Banach space the situation is straightforward; namely Gâteaux and Fréchet differentiability are equivalent (compare with Proposition 4.1.7).

COROLLARY 4.2.11 If $X$ is a finite dimensional Banach space, $U \subseteq X$ is an open set and $\varphi : U \to \mathbb{R}$ is a convex function, then $\varphi$ is Gâteaux differentiable at $x \in U$ if and only if it is Fréchet differentiable at $x \in U$.
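A simple convex function on $\mathbb{R}^2$ shows both sides of these results. The choice $\varphi(x_1,x_2) = |x_1| + |x_2|$ and the two base points are assumptions made for this sketch; the symmetric quotient is the one from (4.15).

```latex
% Illustrative example (assumed): \varphi(x_1,x_2) = |x_1| + |x_2| on \mathbb{R}^2.
% At x = (1,1) both partial derivatives exist, and for \|h\| < 1:
\begin{align*}
  \varphi(x+h) &= (1 + h_1) + (1 + h_2), \qquad
  A(x) = (1,1), \\
  \psi(h) &= \varphi(x+h) - \varphi(x) - \big(A(x),h\big)_{\mathbb{R}^2} = 0 ,
\end{align*}
% so \varphi is Fr\'echet differentiable at (1,1).
% At x = (1,0), in the direction h = e_2 and for \lambda \in (0,1):
\begin{align*}
  \frac{\varphi(x+\lambda e_2) + \varphi(x-\lambda e_2) - 2\varphi(x)}{\lambda}
   &= \frac{(1+\lambda) + (1+\lambda) - 2}{\lambda} = 2 \neq 0 ,
\end{align*}
% so (4.15) fails and \varphi is not G\^ateaux differentiable at (1,0).
```

At $(1,0)$ the partial derivative in $x_2$ does not exist, which is consistent with Proposition 4.2.10.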


From elementary convex analysis we know that a convex function $\varphi : (a,b) \to \mathbb{R}$ is differentiable at all except at most a countable number of points of $(a,b)$. We would like to identify those Banach spaces $X$ where the convex continuous functions defined on open convex sets in $X$ have similar Gâteaux differentiability properties. The basic result in this direction is the so-called Mazur’s theorem, which implies the automatic generic (i.e., in a dense $G_\delta$-subset) Gâteaux differentiability of convex, continuous functions in separable Banach spaces.

THEOREM 4.2.12 (Mazur Theorem) If $X$ is a separable Banach space, $U \subseteq X$ is an open, convex set and $\varphi : U \to \mathbb{R}$ is a convex, continuous function, then $\varphi$ is Gâteaux differentiable on a dense $G_\delta$ subset of $U$.

PROOF For each $h \in X$ and $m \in \mathbb{N}$, let

$$S\Big( h, \frac{1}{m} \Big) \overset{\text{df}}{=} \bigg\{ x \in U : \text{there exists } \delta = \delta(x,m) > 0, \text{ such that } \sup_{\lambda \in (0,\delta)} \frac{\varphi(x + \lambda h) + \varphi(x - \lambda h) - 2\varphi(x)}{\lambda} < \frac{1}{m} \bigg\}.$$

Since by hypothesis $\varphi$ is continuous, for a given $m \ge 1$ and every $k \ge 1$, the set

$$T_k(h) \overset{\text{df}}{=} \bigg\{ x \in U : \frac{\varphi\big( x + \frac{1}{k} h \big) + \varphi\big( x - \frac{1}{k} h \big) - 2\varphi(x)}{\frac{1}{k}} < \frac{1}{m} \bigg\}$$

is open in $U$. It follows then that

$$S\Big( h, \frac{1}{m} \Big) = \bigcup_{k=1}^{\infty} T_k(h)$$

is open in $U$. We claim that it is also dense in $U$. Suppose that this is not the case. Then we can find $x_0 \in U$ and $r > 0$, such that

$$S\Big( h, \frac{1}{m} \Big) \cap B_r(x_0) = \emptyset.$$

If $\psi(\lambda) = \varphi(x_0 + \lambda h)$, it follows that the convex function $\psi$ is not differentiable on $(-r, r)$, a contradiction. So $S\big( h, \frac{1}{m} \big)$ is dense in $U$.

Because $X$ is separable, we can find a sequence $\{h_n\}_{n \ge 1}$ which is dense in

$$\partial B_1(0) = \big\{ h \in X : \|h\|_X = 1 \big\}.$$

By virtue of Proposition 4.2.8, $\varphi$ is Gâteaux differentiable at $x$ if the directional derivative exists in the directions $\{h_n\}_{n \ge 1}$. So $\varphi$ is Gâteaux differentiable on the set

$$\bigcap_{n,m=1}^{\infty} S\Big( h_n, \frac{1}{m} \Big),$$

which is dense in $U$ (Baire’s theorem) and of course $G_\delta$.
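The sets $S\big(h,\frac{1}{m}\big)$ can be computed by hand in the simplest case. The choices $X = U = \mathbb{R}$, $\varphi(x) = |x|$ and $h = 1$ are assumptions made for this sketch.

```latex
% Illustrative example (assumed): X = U = \mathbb{R}, \varphi(x) = |x|, h = 1.
\begin{align*}
  x \neq 0,\ 0 < \lambda < |x| :\quad
   & \frac{|x+\lambda| + |x-\lambda| - 2|x|}{\lambda} = 0 < \frac{1}{m}, \\
  x = 0,\ \lambda > 0 :\quad
   & \frac{|\lambda| + |-\lambda| - 0}{\lambda} = 2 \ge \frac{1}{m}, \\
  \text{hence}\quad
   & S\Big(1,\frac{1}{m}\Big) = \mathbb{R}\setminus\{0\} \quad \forall\, m \ge 1,
   \qquad \bigcap_{m=1}^{\infty} S\Big(1,\frac{1}{m}\Big) = \mathbb{R}\setminus\{0\} .
\end{align*}
```

The intersection is open and dense, and it is exactly the set of points where $|x|$ is differentiable.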


There exist nonseparable Banach spaces where the above theorem fails.

EXAMPLE 4.2.13 Let $X = l^\infty$ and for $x = \{x_n\}_{n \ge 1}$, let

$$\varphi(x) \overset{\text{df}}{=} \limsup_{n \to +\infty} |x_n|.$$

Then $\varphi$ is a seminorm (hence it is convex) and it is also continuous since $\varphi(x) \le \|x\|_\infty$. If $\varphi(x) = 0$, then $x_n \to 0$ and so taking $h = (1,1,\dots)$, we have

$$\frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} = \frac{|\lambda|}{\lambda},$$

which shows that $\varphi$ is not Gâteaux differentiable at any $x \in \varphi^{-1}\big( \{0\} \big)$. If $\varphi(x) > 0$, exploiting the positive homogeneity of $\varphi$, we may assume without any loss of generality that $\varphi(x) = 1$. Let $\{x_{n_k}\}_{k \ge 1}$ be a subsequence of $\{x_n\}_{n \ge 1}$, such that

$$|x_{n_k}| \longrightarrow 1 \qquad \text{as } k \to +\infty.$$

By passing to a further subsequence if necessary, we may assume that all the elements of $\{x_{n_k}\}_{k \ge 1}$ have the same sign. Moreover, since $\varphi(x) = \varphi(-x)$, we can say that $x_{n_k} > 0$ for all $k \ge 1$. Let

$$h_n \overset{\text{df}}{=} \begin{cases} 0 & \text{if either } n \ne n_k \text{ for all } k \ge 1, \text{ or } n = n_k \text{ with } k \text{ odd}, \\ 1 & \text{if } n = n_k \text{ with } k \text{ even.} \end{cases}$$

Let us set $h = \{h_n\}_{n \ge 1} \in l^\infty$. We have

$$\frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} = \begin{cases} 1 & \text{if } \lambda > 0, \\ 0 & \text{if } \lambda < 0. \end{cases}$$

So $\varphi$ is nowhere Gâteaux differentiable.

REMARK 4.2.14 The above example should not lead to the conclusion that Theorem 4.2.12 fails in every nonseparable Banach space. There are nonseparable Banach spaces in which Theorem 4.2.12 remains valid, for example the class of weakly compactly generated Banach spaces. A Banach space $X$ is weakly compactly generated, if there exists a weakly compact set $C$ (which we can always take to be convex), whose linear span is dense in $X$ (i.e., $X = \overline{\operatorname{span}}\, C$). Separable Banach spaces are weakly compactly generated. To see this let $\{x_n\}_{n \ge 1}$ be dense in $\partial B_1(0)$ and take

$$C \overset{\text{df}}{=} \Big\{ \frac{1}{n} x_n \Big\}_{n \ge 1} \cup \{0\}.$$

Note that $C$ is actually compact. Also reflexive Banach spaces are weakly compactly generated. In this case let $C \overset{\text{df}}{=} \overline{B}_1(0)$.

DEFINITION 4.2.15 A Banach space $X$ is said to be a weak Asplund space, if every convex, continuous function defined on an open, convex set $U \subseteq X$ is Gâteaux differentiable at each point of a dense $G_\delta$ subset of $U$.

REMARK 4.2.16 In the above definition the term weak has nothing to do with the weak topology. It is used because sometimes Gâteaux differentiability is called weak differentiability in contrast to the Fréchet differentiability, which is called strong differentiability. Theorem 4.2.12 says that every separable Banach space is a weak Asplund space.

What about Fréchet differentiability? In this direction we have the following result due to Asplund (1968) and Lindenstrauss (1963) independently.

THEOREM 4.2.17 If $X$ is a Banach space with separable dual, $U \subseteq X$ is an open, convex set and $\varphi : U \to \mathbb{R}$ is a convex, continuous function, then $\varphi$ is Fréchet differentiable on a dense $G_\delta$ subset of $U$.

Then in analogy to Definition 4.2.15, we make the following definition.

DEFINITION 4.2.18 A Banach space $X$ is said to be an Asplund space, if every convex, continuous function defined on an open, convex set $U \subseteq X$ is Fréchet differentiable on a dense $G_\delta$ subset of $U$.

REMARK 4.2.19 Theorem 4.2.17 implies that every Banach space $X$ with a separable dual $X^*$ is an Asplund space. Note that $X$ is separable too. More generally, we can say that a separable Banach space $X$ is an Asplund space if and only if it has a separable dual. If $X = C\big( [0,1] \big)$ (whose dual $M\big( [0,1] \big)$, the space of Radon measures, is not separable) and $\varphi : X \to \mathbb{R}_+$ is defined by

$$\varphi(x) \overset{\text{df}}{=} \|x\|_\infty,$$

then it can be shown that $\varphi$ is not Fréchet differentiable at any point.

4.3 Haar Null Sets and Locally Lipschitz Functions

From real analysis we know that every Lipschitz continuous function $f : \mathbb{R} \to \mathbb{R}$ is differentiable almost everywhere and is the integral of its derivative (a consequence of the fundamental theorem of the Lebesgue calculus). We saw that this theorem can be extended to vector valued functions $f : \mathbb{R} \to X$, when $X$ is a Banach space with the RNP (see Theorem 2.2.17 and Remark 2.2.18). We also proved another generalization of Lebesgue’s original result, which is due to Rademacher and which says that a locally Lipschitz function $f : \mathbb{R}^N \to \mathbb{R}^M$ is differentiable almost everywhere (see Theorem 1.5.8 and Corollary 1.5.9). The purpose of this section is to combine these two generalizations; that is we want to prove a Rademacher-type theorem for functions between Banach spaces.

The problem that we face when dealing with such a generalization is that we do not have a natural measure $\mu$ (such as the Lebesgue measure in $\mathbb{R}^N$), which produces a useful class of $\mu$-null sets. So we need to devise new ways to come up with negligible sets. Our experience from the real line suggests that the approach cannot be purely topological, for instance by choosing the sets of first Baire category. There are several distinct ways to define negligible sets. Our goal is not to present all of them. Instead we will focus on the so-called Haar-null sets, which historically produced the first generalization of Rademacher’s theorem. As the name suggests Haar-null sets are defined on topological groups. So let us start with a brief discussion of them.

DEFINITION 4.3.1 A topological group is a group $G$ endowed with a Hausdorff topology, which is compatible with the group structure; that is the two maps

$$G \times G \ni (x,y) \longmapsto xy \in G \qquad \text{and} \qquad G \ni x \longmapsto x^{-1} \in G$$

are continuous ($G \times G$ is furnished with the product topology). An isomorphism of a topological group $G$ onto a topological group $\widehat{G}$ is a group isomorphism of $G$ onto $\widehat{G}$ which is bicontinuous. We say that $G$ is an Abelian topological group, if $G$ is Abelian.

REMARK 4.3.2 The map $x \longmapsto x^{-1}$ is the inverse map, the map $x \longmapsto ax$ is left translation by $a$ and the map $x \longmapsto xb$ is right translation by $b$. All three maps are homeomorphisms of $G$ onto itself. The fact that translations are homeomorphisms implies that a topological group is topologically homogeneous. Namely for any $a, b \in G$, the map $x \longmapsto ba^{-1}x$ is a homeomorphism of $G$ which sends $a$ to $b$. Therefore the topological structure at $a$ is reflected at $b$. In particular then the topology is completely determined by the system of neighbourhoods of the neutral element $e$.

DEFINITION 4.3.3 A metric $d_G$ on a topological group $G$ is said to be left-invariant, if

$$d_G(ax, ay) = d_G(x,y) \qquad \forall\, a, x, y \in G$$

and right-invariant, if

$$d_G(xa, ya) = d_G(x,y) \qquad \forall\, a, x, y \in G.$$

The metric $d_G$ is invariant, if it is both left-invariant and right-invariant.

REMARK 4.3.4 A celebrated theorem of Birkhoff-Kakutani (see, e.g., Hewitt & Ross (1963, pp. 68–70)) says that a topological group $G$ is metrizable if and only if the neutral element $e$ has a countable fundamental system of neighbourhoods. A metrizable topological group admits a left-invariant (or right-invariant) compatible metric. However, a metrizable topological group need not admit an invariant metric. For Abelian groups clearly left and right invariance are equivalent.

We shall consider separable Abelian topological groups and separable Banach spaces or otherwise we face serious measurability problems. In addition the Abelian topological group will be Polish. So let $G$ be an Abelian Polish topological group. If $G$ is locally compact, then it is well known that there is a unique (up to scalar multiplication) translation invariant measure $\mu$ on $G$ called the Haar measure (see Dieudonné (1969, p. 244)). For $G$ which is not locally compact, no invariant measure exists. Nevertheless, it is still possible to define the notion of Haar-null sets. Since $G$ is Abelian, its operation will be denoted by “$+$.” Also all measures considered in the sequel will be Borel without any explicit mention.

DEFINITION 4.3.5 A Borel set $A \subseteq G$ is said to be Haar-null, if there is a probability measure $\mu$ on $G$, such that $\chi_A * \mu = 0$, i.e.,

$$\int_G \chi_A(x + y)\, \mu(dx) = 0 \qquad \forall\, y \in G$$

(the convolution of the characteristic function $\chi_A$ and the measure $\mu$).

REMARK 4.3.6 So according to this definition the Borel set $A$ is Haar-null if and only if there is a probability measure $\mu$, such that all translates of $A$ are $\mu$-null, i.e., $\mu(A + y) = 0$ for all $y \in G$. The measure $\mu$ is called test measure for $A$.

The next proposition shows that when $G$ is locally compact, then the notion of Haar-null set coincides with that of a set which is negligible with respect to the Haar measure on $G$.

PROPOSITION 4.3.7 If $G$ is a locally compact, Abelian, Polish topological group and $A \subseteq G$ is a Borel set, then the following two properties are equivalent:

(a) $A$ is Haar-null on $G$;

(b) $A$ is negligible for the Haar measure on $G$.

PROOF “(a)$\Longrightarrow$(b)”: Let $\mu$ be a test measure for $A$ and let $h$ be a Haar measure on $G$. We know that $h$ is $\sigma$-finite. Then by virtue of Fubini’s theorem, we have

$$\int_G \Big[ \int_G \chi_A(x + y)\, h(dx) \Big] \mu(dy) = \int_G \Big[ \int_G \chi_A(x + y)\, \mu(dy) \Big] h(dx) = 0.$$

So for $\mu$-almost all $y \in G$, we have

$$\int_G \chi_A(x + y)\, h(dx) = 0,$$

hence

$$h(A) = \int_G \chi_A(x)\, h(dx) = 0.$$

“(b)$\Longrightarrow$(a)”: Again let $h$ be a Haar measure. Let $f \in C_c(G)$, such that

$$\int_G f(x)\, h(dx) = 1.$$

Let us set

$$\mu(B) \overset{\text{df}}{=} \int_B f(x)\, h(dx) \qquad \forall\, B \in \mathcal{B}(G).$$

Evidently $\mu$ is a probability measure on $G$ and we have

$$\int_G \chi_A(x + y)\, \mu(dx) = \int_G \chi_A(x + y) f(x)\, h(dx) \le c \int_G \chi_A(x + y)\, h(dx) = c \int_G \chi_A(x)\, h(dx) = 0,$$

for some $c > 0$ (since $f \in C_c(G)$).
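To see Definition 4.3.5 at work in the simplest locally compact case, take $G = \mathbb{R}$ and $A = \mathbb{Q}$. The test measure chosen here, the restriction of the Lebesgue measure $\lambda^1$ to $[0,1]$, is an assumption made for this sketch; any probability measure with a density would do.

```latex
% Illustrative example (assumed): G = \mathbb{R}, A = \mathbb{Q},
% \mu(B) = \lambda^1\big(B \cap [0,1]\big) for Borel B \subseteq \mathbb{R}.
\begin{align*}
  \int_{\mathbb{R}} \chi_{\mathbb{Q}}(x+y)\,\mu(dx)
   &= \lambda^1\big( \{ x \in [0,1] : x + y \in \mathbb{Q} \} \big) \\
   &= \lambda^1\big( (\mathbb{Q} - y) \cap [0,1] \big) = 0
   \qquad \forall\, y \in \mathbb{R},
\end{align*}
% because \mathbb{Q} - y is countable for every y.
```

This agrees with Proposition 4.3.7, since $\mathbb{Q}$ is Lebesgue-null.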

REMARK 4.3.8 Since on $\mathbb{R}^N$, the Haar measures are multiples of the Lebesgue measure, it follows that the Haar-null sets are the Lebesgue-null sets.

PROPOSITION 4.3.9 If $G$ is an Abelian, Polish topological group and $\{A_n\}_{n \ge 1}$ is a sequence of Haar-null sets in $G$, then $\widehat{A} \overset{\text{df}}{=} \bigcup_{n=1}^{\infty} A_n$ is a Haar-null set in $G$.

PROOF Let $M^1_+(G)$ be the space of probability measures on $G$. Furnished with the narrow topology $w\big( M^1_+(G), C_b(G) \big)$, $M^1_+(G)$ becomes a Polish space (see, e.g., Denkowski, Migórski & Papageorgiou (2003a, p. 199)). Let $\varrho$ be a complete metric on $M^1_+(G)$ generating the above narrow topology. By the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that if

$$\mu_n \longrightarrow \mu \qquad \text{in } M^1_+(G),$$

then

$$\mu_n * \nu \longrightarrow \mu * \nu \qquad \text{in } M^1_+(G).$$

Note that if a probability measure $\mu$ vanishes on a set $A$ and all its translates, then the same is true for every translate of $\mu$ and every probability measure which is absolutely continuous with respect to $\mu$ (see Definition A.2.22). In particular this is the case for measures of the form

$$\mu_B(A) \overset{\text{df}}{=} \frac{\mu(A \cap B)}{\mu(B)},$$

with $B \in \mathcal{B}(G)$, such that $\mu(B) > 0$. So let $\mu_0$ be a translation of $\mu$, such that every neighbourhood of the neutral element $e$ of $G$ has strictly positive $\mu_0$-measure. Then for $r > 0$, we set

$$u_r(C) \overset{\text{df}}{=} \frac{\mu_0\big( C \cap B_r(e) \big)}{\mu_0\big( B_r(e) \big)},$$

where

$$B_r(e) \overset{\text{df}}{=} \big\{ x \in G : d_G(x,e) < r \big\},$$

with $d_G$ being a complete metric on $G$. Evidently

$$u_r \longrightarrow \delta_e \qquad \text{in } M^1_+(G), \text{ as } r \downarrow 0,$$

where $\delta_e$ is the Dirac measure with its mass at $e$. Therefore for every $\varepsilon > 0$, we can find $\nu \in M^1_+(G)$, such that $\chi_A * \nu = 0$ (i.e., $\nu$ is a test measure on $A$) and $\varrho(\nu, \delta_e) < \varepsilon$.

These observations imply that by induction we can generate a sequence $\{\mu_n\}_{n \ge 1} \subseteq M^1_+(G)$, such that

$$\chi_{A_n} * \mu_n = 0 \qquad \forall\, n \ge 1$$

(i.e., $\mu_n$ is a test measure on $A_n$) and

$$\varrho\big( u, u * \mu_n \big) < \frac{1}{2^n},$$

where $u$ is any convolution of different $\mu_k$’s with $1 \le k \le n - 1$. Since the metric $\varrho$ on $M^1_+(G)$ is complete, we see that the measure

$$\mu \overset{\text{df}}{=} \prod_{k=1}^{\infty} {*}\, \mu_k$$

is well defined. Because $\mu = \mu_n * \nu_n$ for all $n \ge 1$, where

$$\nu_n = \prod_{k \ne n} {*}\, \mu_k,$$

it follows that

$$\chi_{A_n} * \mu = 0 \qquad \forall\, n \ge 1,$$

hence

$$\chi_{\bigcup_{n=1}^{\infty} A_n} * \mu = 0.$$

This proves that $\bigcup_{n=1}^{\infty} A_n$ is a Haar-null set.

COROLLARY 4.3.10 If $G$ is an Abelian, Polish topological group and $A \subseteq G$ is Haar-null, then $A^c = G \setminus A$ is dense in $G$.

PROOF Let $\{x_n\}_{n \ge 1}$ be a sequence which is dense in $G$ and let $\mu$ be a test measure for $A$. It suffices to show that $\operatorname{int} A = \emptyset$. Suppose that $\operatorname{int} A \ne \emptyset$. Then we can find a neighbourhood $U$ of the neutral element $e$ of $G$ and $a \in A$, such that $a + U \subseteq A$. We have

$$\mu(U + b) = \mu(U + a) = \mu(U) = 0 \qquad \forall\, b \in G,$$

so

$$\mu(U + x_n) = 0 \qquad \forall\, n \ge 1.$$

Then because of Proposition 4.3.9, we have that

$$G = \bigcup_{n=1}^{\infty} (U + x_n) \text{ is Haar-null, with test measure } \mu,$$

a contradiction.

EXAMPLE 4.3.11 Let $X$ be a separable Banach space and let $A \subseteq X$ be a Borel set which intersects all the translates of a fixed line $L$ in sets whose one-dimensional Lebesgue measure is zero. For example $A$ can be a proper, Borel linear subspace of $X$. Then $A$ is a Haar-null set. Any probability measure on the line $L$ which is equivalent to the Lebesgue measure on $L$ can be used as a test measure of $A$. Sets like $A$ above are called directionally-null sets.

Recall that a set $A \subseteq \mathbb{R}$ can have positive measure, and, in fact, its complement can be Lebesgue-null, without $A$ including any interval of positive length (consider, e.g., the set of irrational numbers). However, if we take all differences between elements in $A$ (i.e., $A - A$), then this set contains a nontrivial interval around zero. This fact is used in the construction of a nonmeasurable set (see, e.g., Halmos (1974, p. 69)). The same property can be proved if $\mathbb{R}$ is replaced by an Abelian, Polish topological group $G$ and we consider a Borel set $A \subseteq G$ which is not Haar-null.

PROPOSITION 4.3.12 If $G$ is an Abelian, Polish topological group and $A \subseteq G$ is a Borel set which is not Haar-null, then $A - A$ is a neighbourhood of the neutral element $e$ of $G$.

PROOF Let

$$S(A) \overset{\text{df}}{=} \big\{ x \in G : (A + x) \cap A \text{ is not Haar-null} \big\}.$$

We claim that $S(A)$ is a neighbourhood of $e$. If we can show this then the proposition follows since $S(A) \subseteq A - A$. Suppose that the claim is not true. Then we can find a sequence $\{x_n\}_{n \ge 1} \subseteq G$, such that

$$d_G(x_n, e) < \frac{1}{2^n} \qquad \forall\, n \ge 1$$

($d_G$ being a complete metric on $G$) and

$$(A + x_n) \cap A \text{ is Haar-null} \qquad \forall\, n \ge 1.$$

Because of Proposition 4.3.9, the set

$$\bigcup_{n=1}^{\infty} \big[ (A + x_n) \cap A \big]$$

is Haar-null and so its complement

$$\widehat{A} \overset{\text{df}}{=} A \setminus \bigcup_{n=1}^{\infty} \big[ (A + x_n) \cap A \big]$$

is a Borel set which is not Haar-null (see Corollary 4.3.10). Note that

$$\big( \widehat{A} + x_n \big) \cap \widehat{A} = \emptyset \qquad \forall\, n \ge 1.$$

Let $C$ be the Cantor group (i.e., $C = \{0,1\}^{\mathbb{N}}$). The elements of $C$ are denoted by $\xi = \{\xi_n\}_{n \ge 1}$ with $\xi_n \in \{0,1\}$ for all $n \ge 1$. Let $\xi^{(n)} = \big\{ \xi^{(n)}_k \big\}_{k \ge 1}$ be the element of $C$, defined by

$$\xi^{(n)}_k \overset{\text{df}}{=} \begin{cases} 0 & \text{if } k \ne n, \\ 1 & \text{if } k = n. \end{cases}$$

Consider the map $\vartheta : C \to G$, defined by

$$\vartheta(\xi) \overset{\text{df}}{=} \sum_{n=1}^{\infty} \xi_n x_n.$$

Note that

$$\vartheta\big( \xi^{(n)} \big) = x_n$$

and because $d_G$ is complete and invariant, we have that $\vartheta$ is continuous. If we consider the Haar probability measure $h$ on $C$ and we consider the image measure $h \circ \vartheta^{-1} = \mu$ on $G$, then because $\widehat{A}$ is not Haar-null, $\mu$ does not vanish on some translate of $\widehat{A}$; that is we can find $y \in G$, such that

$$\vartheta^{-1}\big( \widehat{A} + y \big) \text{ has positive } h\text{-measure.}$$

So we must have that

$$\vartheta^{-1}\big( \widehat{A} + y \big) - \vartheta^{-1}\big( \widehat{A} + y \big) \text{ is a neighbourhood of } 0 \in C.$$

This means that for all $n \ge 1$ large enough,

$$\xi^{(n)} \in \vartheta^{-1}\big( \widehat{A} + y \big) - \vartheta^{-1}\big( \widehat{A} + y \big).$$

Thus there are $\sigma, \tau \in \vartheta^{-1}\big( \widehat{A} + y \big)$ which differ only in the $n$-th coordinate and $x_n \in \widehat{A} - \widehat{A}$ for all these $n$. But this contradicts the fact that

$$\big( \widehat{A} + x_n \big) \cap \widehat{A} = \emptyset \qquad \forall\, n \ge 1.$$
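On the real line the proposition, and the failure of its converse, can be seen on two familiar sets. Both examples are added here for illustration; the identity for the Cantor ternary set $K$ is a classical fact quoted without proof.

```latex
% Illustrative examples (assumed) in G = \mathbb{R}.
\begin{align*}
  A = [0,1] \ (\text{not Lebesgue-null}) :\quad
   & A - A = [-1,1], \ \text{a neighbourhood of } 0, \\
  K = \text{Cantor ternary set} \ (\text{Lebesgue-null}) :\quad
   & K - K = [-1,1] .
\end{align*}
```

So $A - A$ being a neighbourhood of the neutral element is necessary for a Borel set not to be Haar-null, but it is not sufficient.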

REMARK 4.3.13 In the above proof, we have shown something stronger. Namely that $S(A)$ is an open set containing $e$. In fact this result can be generalized as follows: “If $A, B$ are two Borel sets in $G$ and

$$S(A,B) \overset{\text{df}}{=} \big\{ x \in G : (A + x) \cap B \text{ is not Haar-null} \big\},$$

then $S(A,B)$ is an open (possibly empty) subset of $G$” (see Christensen (1974, p. 118)).

COROLLARY 4.3.14 (a) If $G$ is an Abelian, Polish topological group which is not locally compact, then every compact subset of $G$ is Haar-null.

(b) If $G = X$ is a nonreflexive, separable Banach space, then every weakly compact subset of $X$ is Haar-null.

Using the notion of the Haar-null sets we can extend Rademacher’s theorem to Lipschitz continuous functions between certain Banach spaces.

LEMMA 4.3.15 If $X$ and $Y$ are two Banach spaces, $U \subseteq X$ is an open set, $f : U \to Y$ is a Lipschitz continuous function, $G \subseteq X$ is a dense additive subgroup and for some $x_0 \in U$ and all $h \in G$,

$$\lim_{\lambda \to 0} \frac{f(x_0 + \lambda h) - f(x_0)}{\lambda} = f'(x_0; h)$$

exists and $f'(x_0; \cdot)$ is additive, then $f$ is Gâteaux differentiable at $x_0$.

PROOF Consider the following family of functions:

$$u_\lambda(h) \overset{\text{df}}{=} \frac{f(x_0 + \lambda h) - f(x_0)}{\lambda} \qquad \forall\, \lambda \ne 0,\ h \in X.$$

The family $\{u_\lambda\}_{\lambda \ne 0}$ is equicontinuous (since $f$ is Lipschitz continuous on $U$) and since $\lim_{\lambda \to 0} u_\lambda(h)$ exists for all $h \in G$, it also exists for all $h \in \overline{G} = X$. Moreover, from the additivity of $f'(x_0; \cdot)$ on $G$ follows the additivity on $\overline{G} = X$. In addition note that

$$\lim_{\lambda \to 0} u_\lambda(th) = t \lim_{\lambda \to 0} u_\lambda(h) \qquad \forall\, t \in \mathbb{R}.$$

So $f'(x_0; \cdot)$ is a linear operator which is bounded by the Lipschitz constant of $f$, i.e., $f'(x_0; \cdot) \in \mathcal{L}(X;Y)$. Therefore $f$ is Gâteaux differentiable at $x_0$.


PROPOSITION 4.3.16 If $U \subseteq \mathbb{R}^N$ is an open set, $Y$ is a Banach space with the RNP and $f : U \longrightarrow Y$ is a Lipschitz continuous function, then $f$ is Gâteaux differentiable almost everywhere on $U$ (on $U$ we consider the $N$-dimensional Lebesgue measure).

PROOF Without any loss of generality, we may assume that $U = \mathbb{R}^N$. Let $\{x_n\}_{n \geq 1} \subseteq \mathbb{R}^N$ be dense and let
\[
G \stackrel{df}{=} \operatorname{span}_{\mathbb{Q}} \{x_n\}_{n \geq 1}
\]
(i.e., the linear combinations of the $x_n$'s with rational coefficients). Clearly $G$ is a countable dense additive subgroup of $\mathbb{R}^N$. From Theorem 2.2.17, we know that the directional derivatives
\[
f'(x; h) = \lim_{\lambda \to 0} \frac{f(x + \lambda h) - f(x)}{\lambda}
\]
exist in all directions $h \in G$ for almost all $x \in \mathbb{R}^N$. Then in the light of Lemma 4.3.15, if we can show that these directional derivatives are additive on $G$, then we will have the desired almost everywhere Gâteaux differentiability. To this end let $\varphi \in C_c^1(\mathbb{R}^N)$ be such that
\[
\int_{\mathbb{R}^N} \varphi(x)\, dx = 1.
\]
For example a standard function with these properties is the function
\[
\varphi(x) = \begin{cases} c \exp\left( \dfrac{1}{\|x\|^2_{\mathbb{R}^N} - 1} \right) & \text{if } \|x\|_{\mathbb{R}^N} < 1, \\[2mm] 0 & \text{if } \|x\|_{\mathbb{R}^N} \geq 1, \end{cases}
\]
with $c \in \mathbb{R}$ chosen so that we have the normalization condition
\[
\int_{\mathbb{R}^N} \varphi(x)\, dx = 1.
\]
Let
\[
g(x) \stackrel{df}{=} (f \star \varphi)(x) = \int_{\mathbb{R}^N} f(y) \varphi(x - y)\, dy.
\]
We know that
\[
g \in C^1\big(\mathbb{R}^N; Y\big)
\]
and so we have that
\[
g'_G(x) h = \big(f \star \varphi'_G\big)(x)\, h
\]
is linear in $h \in \mathbb{R}^N$ for all $x \in \mathbb{R}^N$. Since
\[
\int_{\mathbb{R}^N} f(y) \varphi(x - y)\, dy = \int_{\mathbb{R}^N} f(x - y) \varphi(y)\, dy \qquad \forall\ x \in \mathbb{R}^N,
\]
using the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have for all $x \in \mathbb{R}^N$ and all $h \in G$
\[
g'_G(x) h = \lim_{\lambda \to 0} \frac{g(x + \lambda h) - g(x)}{\lambda} = \int_{\mathbb{R}^N} \varphi(y) \lim_{\lambda \to 0} \frac{f(x + \lambda h - y) - f(x - y)}{\lambda}\, dy
\]
and it follows that
\[
g'_G(x) h = \big(\varphi \star \xi_h\big)(x),
\]
where
\[
\xi_h(x) \stackrel{df}{=} \lim_{\lambda \to 0} \frac{f(x + \lambda h) - f(x)}{\lambda}
\]
is a bounded measurable function (it is the directional derivative of the Lipschitz function $f$, defined almost everywhere). Then we have
\[
\varphi \star \big(\xi_{h_1 + h_2} - \xi_{h_1} - \xi_{h_2}\big) = 0 \qquad \forall\ h_1, h_2 \in G.
\]
The same is true if $\varphi$ is replaced by $\varphi_m(x) = m^N \varphi(mx)$. Recall that for every bounded, measurable function $\widehat{g} : \mathbb{R}^N \longrightarrow Y$, we have that
\[
\big(\varphi_m \star \widehat{g}\big)(x) \longrightarrow \widehat{g}(x) \qquad \text{for a.a. } x \in \mathbb{R}^N
\]
(see Proposition 2.4.12(c); the result there is stated for $\mathbb{R}$-valued functions, but it can be extended to $Y$-valued functions by scalarization using elements of $Y^*$ and recalling that by virtue of Theorem 2.1.3, we may assume without any loss of generality that $Y$ is separable). So in the limit, we obtain that
\[
\xi_{h_1 + h_2}(x) = \xi_{h_1}(x) + \xi_{h_2}(x)
\]
for a.a. $x \in \mathbb{R}^N$ and all $h_1, h_2 \in G$. Because $G$ is countable, the exceptional Lebesgue-null set is independent of $h_1, h_2 \in G$.

Now we are ready for the infinite dimensional generalization of Rademacher's theorem (see Theorem 1.5.8 and Corollary 1.5.9).


THEOREM 4.3.17 If $X$ is a separable Banach space, $U \subseteq X$ is an open set, $Y$ is a Banach space with the RNP and $f : U \longrightarrow Y$ is a Lipschitz continuous function, then $f$ is a Gâteaux differentiable function on a set $D_f$ with $X \setminus D_f$ being Haar-null in $X$.

PROOF Without any loss of generality we may assume that $U = X$. First we show that the set $D_f$ of points of Gâteaux differentiability of $f$ is a Borel subset of $X$. Indeed let $\{x_n\}_{n \geq 1}$ be dense in $X$ and set
\[
G \stackrel{df}{=} \operatorname{span}_{\mathbb{Q}} \{x_n\}_{n \geq 1}.
\]
Then by virtue of Lemma 4.3.15, we have
\[
D_f = \left\{ x \in X : \left\| \frac{f(x + \lambda h) - f(x)}{\lambda} - \frac{f(x + rh) - f(x)}{r} \right\|_Y < \frac{1}{m},\ \lambda, r \in \mathbb{Q},\ |\lambda|, |r| \leq \frac{1}{n},\ h \in G,\ m, n \geq 1 \right\}.
\]
So we see that $D_f \subseteq X$ is a Borel set. Now let $\{y_n\}_{n \geq 1}$ be a sequence of linearly independent vectors in $X$, such that $\overline{\operatorname{span}}\, \{y_n\}_{n \geq 1} = X$. Let
\[
V_m \stackrel{df}{=} \operatorname{span}\, \{y_n\}_{n=1}^{m}.
\]
Then
\[
V_1 \subseteq V_2 \subseteq \ldots \subseteq V_m \subseteq \ldots \subseteq X \qquad \text{and} \qquad \overline{\bigcup_{m=1}^{\infty} V_m} = X.
\]
Using Proposition 4.3.16, we can find $D_n \subseteq V_n$, such that $f$ is Gâteaux differentiable on $D_n$ and $V_n \setminus D_n$ is Lebesgue-null. By virtue of Lemma 4.3.15,
\[
D_f = \bigcap_{n=1}^{\infty} D_n.
\]
So $X \setminus D_f$ is a Haar-null set in $X$.

COROLLARY 4.3.18 If $X$ is a separable Banach space, $Y$ is a Banach space with the RNP and $f : X \longrightarrow Y$ is a locally Lipschitz continuous function, then $f$ is a Gâteaux differentiable function on a set $D_f$ with $X \setminus D_f$ being Haar-null in $X$.

4.4 Duality and Subdifferentials

A basic theme that runs through the whole theory of convex analysis is that of "duality." Namely almost every mathematical notion is paired with another one, which is in some sense dual to it. So convex cones are associated to their polars (generalizing this way the pairing of tangent and normal spaces in differential geometry), closed convex sets are associated to their support functions (a pairing which permits an interchange between geometric and analytical reasoning), minimization problems in linear programming are associated to maximization problems, known as the dual problems, which provide valuable information about the solvability and the value of the original problem. In general duality permits us to establish close relations between otherwise disparate properties. The starting point of all these correspondences is a deep duality principle between certain pairs of convex functions (known as conjugate functions), which we study in the first half of this section.

The mathematical framework of our analysis is a Hausdorff locally convex vector space $X$ and its dual $X^*$ (i.e., the set of continuous, linear functionals on $X$). Additional hypotheses will be introduced as needed. We supply $X$ with the $w(X, X^*)$-topology and $X^*$ with the $w(X^*, X)$-topology. So $(X^*, X)$ is a dual pair (or dual system) and we denote by $\langle \cdot, \cdot \rangle_X$ their pairing.

DEFINITION 4.4.1 Let $\varphi : X \longrightarrow \mathbb{R}^* \stackrel{df}{=} \mathbb{R} \cup \{\pm\infty\}$. Then the Legendre-Fenchel transform (or the conjugate) of $\varphi$ is the function $\varphi^* : X^* \longrightarrow \mathbb{R}^*$, defined by
\[
\varphi^*(x^*) \stackrel{df}{=} \sup_{x \in X} \big[ \langle x^*, x \rangle_X - \varphi(x) \big].
\]
The function $(\varphi^*)^* = \varphi^{**} : X \longrightarrow \mathbb{R}^*$, defined by
\[
\varphi^{**}(x) \stackrel{df}{=} \sup_{x^* \in X^*} \big[ \langle x^*, x \rangle_X - \varphi^*(x^*) \big],
\]
is the second conjugate (or biconjugate) of $\varphi$.

REMARK 4.4.2 If $\varphi$ takes the value $-\infty$, then $\varphi^* \equiv +\infty$. Also if the effective domain $\operatorname{dom} \varphi$ is empty, then $\varphi^* \equiv -\infty$. For this reason of interest is the case where $\varphi : X \longrightarrow \overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$ and $\operatorname{dom} \varphi \neq \emptyset$ (i.e., $\varphi$ is a proper function; see Definition 4.2.1). In this case $\varphi^* : X^* \longrightarrow \overline{\mathbb{R}}$ and it is proper too. The significance of $\varphi^*$ is better understood using epigraphs (see Definition 4.2.1). So we have
\[
(x^*, \mu) \in \operatorname{epi} \varphi^* \iff \Big[ \langle x^*, x \rangle_X - \lambda \leq \mu \quad \forall\ (x, \lambda) \in \operatorname{epi} \varphi \Big].
\]


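If we write the last inequality as
\[
\langle x^*, x \rangle_X - \mu \leq \lambda \qquad \forall\ (x, \lambda) \in \operatorname{epi} \varphi,
\]
we see that
\[
(x^*, \mu) \in \operatorname{epi} \varphi^* \iff \Big[ \langle x^*, x \rangle_X - \mu \leq \varphi(x) \quad \forall\ x \in X \Big].
\]
So
\[
l_{(x^*, \mu)}(x) \stackrel{df}{=} \langle x^*, x \rangle_X - \mu
\]
is a continuous affine minorant of $\varphi$. Therefore $\varphi^*$ is proper if and only if $\varphi$ admits a continuous affine minorant. Moreover, $\varphi^*$ describes the family of all continuous affine minorants of $\varphi$. On the other hand also note that
\[
\varphi^*(x^*) \leq \mu \iff \Big[ l_{(x, \lambda)}(x^*) = \langle x^*, x \rangle_X - \lambda \leq \mu \quad \forall\ (x, \lambda) \in \operatorname{epi} \varphi \Big].
\]
So we see that $\varphi^*$ is the pointwise supremum of all continuous affine functions $\big\{ l_{(x, \lambda)} \big\}_{(x, \lambda) \in \operatorname{epi} \varphi}$. Therefore $\varphi^{**}$ is the pointwise supremum of all continuous affine functions majorized by $\varphi$. From these observations and recalling that the supremum of continuous affine functions on $X$ is convex and lower semicontinuous, we obtain the following result.

PROPOSITION 4.4.3 If $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper function, then $\varphi^* \in \Gamma_0(X^*)$.

Also directly from the definition of $\varphi^*$, we obtain the following two results.

PROPOSITION 4.4.4 (Young-Fenchel Inequality) If $\varphi : X \longrightarrow \mathbb{R}^*$ is a function, then
\[
\varphi(x) + \varphi^*(x^*) \geq \langle x^*, x \rangle_X \qquad \forall\ x \in X,\ x^* \in X^*.
\]

PROPOSITION 4.4.5 If $\varphi, \psi : X \longrightarrow \mathbb{R}^*$ and
\[
\varphi(x) \leq \psi(x) \qquad \forall\ x \in X,
\]
then
\[
\varphi^*(x^*) \geq \psi^*(x^*) \qquad \forall\ x^* \in X^*.
\]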
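To make Definition 4.4.1 and Proposition 4.4.4 concrete, the following minimal sketch approximates the conjugate on the real line by replacing the supremum over $X = \mathbb{R}$ with a maximum over a finite grid. The helper `conjugate`, the grids `xs` and `slopes`, and the test function $\varphi(x) = x^2/2$ (whose conjugate is $s^2/2$) are choices made for this illustration only; restricting to a bounded grid is an assumption that is harmless here because the maximizer $x = s$ stays inside the grid.

```python
import numpy as np

def conjugate(values, points, duals):
    """Discrete Legendre-Fenchel transform.

    For each d in duals, return the maximum over the grid `points`
    of d * x - values(x), a grid approximation of the supremum.
    """
    # rows index dual variables, columns index primal grid points
    return np.max(duals[:, None] * points[None, :] - values[None, :], axis=1)

xs = np.linspace(-5.0, 5.0, 2001)       # primal grid (illustrative choice)
slopes = np.linspace(-3.0, 3.0, 601)    # dual grid (illustrative choice)
phi = 0.5 * xs ** 2                     # phi(x) = x^2 / 2

phi_star = conjugate(phi, xs, slopes)

# Deviation from the known closed form phi*(s) = s^2 / 2
max_error = np.max(np.abs(phi_star - 0.5 * slopes ** 2))

# Young-Fenchel gap phi(x) + phi*(s) - s*x on all grid pairs (s, x)
gap = phi[None, :] + phi_star[:, None] - slopes[:, None] * xs[None, :]
```

On the grid the Young-Fenchel gap is nonnegative by construction, and it vanishes (up to rounding) exactly at the pairs where the maximum defining the discrete conjugate is attained.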


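DEFINITION 4.4.6
(a) Let $C \subseteq X$. The support function of the set $C$ is the function
\[
\sigma_C : X^* \longrightarrow \mathbb{R}^*,
\]
defined by
\[
\sigma_C(x^*) \stackrel{df}{=} \sup_{c \in C} \langle x^*, c \rangle_X
\]
(recall that $\sup \emptyset = -\infty$). If $C \neq \emptyset$, then $\sigma_C$ takes values in $\overline{\mathbb{R}}$.
(b) The infimal convolution of functions $\varphi, \psi : X \longrightarrow \overline{\mathbb{R}}$ is the function $\varphi \oplus \psi : X \longrightarrow \mathbb{R}^*$, defined by
\[
\big(\varphi \oplus \psi\big)(x) \stackrel{df}{=} \inf_{y \in X} \big( \varphi(y) + \psi(x - y) \big) = \inf_{z + y = x} \big( \varphi(z) + \psi(y) \big).
\]
We say that $\varphi \oplus \psi$ is exact at $x$, if
\[
\big(\varphi \oplus \psi\big)(x) = \min_{y \in X} \big( \varphi(y) + \psi(x - y) \big)
\]
(i.e., the infimum is attained). We say that $\varphi \oplus \psi$ is exact, if it is exact at every $x$.

REMARK 4.4.7 Evidently if $C \subseteq X$ is nonempty, then $\sigma_C \in \Gamma_0(X^*)$ and $\sigma_C(0) = 0$. In fact $\sigma_C$ is sublinear (i.e., subadditive and positively homogeneous). Moreover,
\[
\sigma_C = \big(i_C\big)^*,
\]
where $i_C$ is the indicator function of the set $C \subseteq X$, i.e.,
\[
i_C(x) \stackrel{df}{=} \begin{cases} 0 & \text{if } x \in C, \\ +\infty & \text{otherwise} \end{cases}
\]
(see Remark 4.2.2). If
\[
\operatorname{sepi} \varphi \stackrel{df}{=} \big\{ (x, \lambda) \in X \times \mathbb{R} : \varphi(x) < \lambda \big\}
\]
(the strict epigraph of $\varphi$), then it is easy to check that
\[
\operatorname{sepi} \big(\varphi \oplus \psi\big) = \operatorname{sepi} \varphi + \operatorname{sepi} \psi.
\]
Also since
\[
\big(\varphi \oplus \psi\big)(x) = \inf_{\substack{(z, \lambda) \in \operatorname{epi} \varphi \\ (y, \mu) \in \operatorname{epi} \psi \\ z + y = x}} (\lambda + \mu) = \inf_{(x, \lambda) \in (\operatorname{epi} \varphi + \operatorname{epi} \psi)} \lambda,
\]
we see that the infimal convolution of proper, convex functions $\varphi$ and $\psi$ is convex but not necessarily proper. For example, if $\varphi = i_C$ and $\psi = i_D$ and $C, D \subseteq X$ are two nonempty, convex, disjoint sets, then $i_C + i_D \equiv +\infty$. On the other hand, if $\varphi$ and $\psi$ are linear functionals which are not identical, then $\varphi \oplus \psi = -\infty$. In addition note that, if $X$ is a normed space and $C \subseteq X$ is a nonempty set, then for all $x \in X$, we have
\[
d_X(x, C) = \inf_{c \in C} \|x - c\|_X = \inf_{y \in X} \big( \|x - y\|_X + i_C(y) \big) = \big( \|\cdot\|_X \oplus i_C \big)(x).
\]
Hence for all nonempty, convex sets $C \subseteq X$, the distance function $d_X(\cdot, C)$ is convex. Moreover, it is easy to see that
\[
\big| d_X(x, C) - d_X(y, C) \big| \leq \|x - y\|_X,
\]
i.e., $d_X(\cdot, C)$ is nonexpansive. Finally for any index set $I$, we have
\[
\Big( \inf_{i \in I} \varphi_i \Big)^* = \sup_{i \in I} \varphi_i^*.
\]

PROPOSITION 4.4.8 If $\varphi, \psi : X \longrightarrow \overline{\mathbb{R}}$ are proper, convex functions, then
\[
\big(\varphi \oplus \psi\big)^* = \varphi^* + \psi^*.
\]

PROOF According to Definitions 4.4.1 and 4.4.6(b), we have
\begin{align*}
\big(\varphi \oplus \psi\big)^*(x^*) &= \sup_{x \in X} \Big( \langle x^*, x \rangle_X - \big(\varphi \oplus \psi\big)(x) \Big) \\
&= \sup_{x \in X} \Big( \langle x^*, x \rangle_X - \inf_{y \in X} \big( \varphi(y) + \psi(x - y) \big) \Big) \\
&= \sup_{y, z \in X} \big( \langle x^*, y \rangle_X - \varphi(y) + \langle x^*, z \rangle_X - \psi(z) \big) \\
&= \varphi^*(x^*) + \psi^*(x^*).
\end{align*}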


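PROPOSITION 4.4.9 If $X$ and $Y$ are two Hausdorff, locally convex spaces, $A \in \mathcal{L}(X; Y)$ is an isomorphism, $g : Y \longrightarrow \overline{\mathbb{R}}$ is a proper function and for $y_0 \in Y$, $x_0^* \in X^*$, $\xi_0 \in \mathbb{R}$ and $\lambda_0 > 0$, we set
\[
\varphi(x) \stackrel{df}{=} \lambda_0\, g(Ax + y_0) + \langle x_0^*, x \rangle_X + \xi_0 \qquad \forall\ x \in X,
\]
then
\[
\varphi^*(x^*) = \lambda_0\, g^*\Big( \frac{1}{\lambda_0} (A^{-1})^* (x^* - x_0^*) \Big) - \big\langle x^* - x_0^*, A^{-1} y_0 \big\rangle_X - \xi_0 \qquad \forall\ x^* \in X^*.
\]

PROOF We have
\begin{align*}
\varphi^*(x^*) &= \sup_{x \in X} \big\{ \langle x^*, x \rangle_X - \lambda_0\, g(Ax + y_0) - \langle x_0^*, x \rangle_X - \xi_0 \big\} \\
&= \lambda_0 \sup_{y \in Y} \Big\{ \Big\langle \frac{1}{\lambda_0} (A^{-1})^* (x^* - x_0^*), y \Big\rangle_Y - g(y) \Big\} - \big\langle x^* - x_0^*, A^{-1} y_0 \big\rangle_X - \xi_0 \\
&= \lambda_0\, g^*\Big( \frac{1}{\lambda_0} (A^{-1})^* (x^* - x_0^*) \Big) - \big\langle x^* - x_0^*, A^{-1} y_0 \big\rangle_X - \xi_0.
\end{align*}

COROLLARY 4.4.10 If $g : X \longrightarrow \overline{\mathbb{R}}$ is a proper function and $x_0 \in X$, $x_0^* \in X^*$, $\lambda_0 > 0$, $\vartheta_0 \in \mathbb{R}$, then
(a) for $\varphi(x) = g(x + x_0)$, $\varphi^*(x^*) = g^*(x^*) - \langle x^*, x_0 \rangle_X$;
(b) for $\varphi(x) = g(x) + \langle x_0^*, x \rangle_X$, $\varphi^*(x^*) = g^*(x^* - x_0^*)$;
(c) for $\varphi(x) = \lambda_0\, g(x)$, $\varphi^*(x^*) = \lambda_0\, g^*\big( \frac{1}{\lambda_0} x^* \big)$;
(d) for $\varphi(x) = \lambda_0\, g\big( \frac{1}{\lambda_0} x \big)$, $\varphi^*(x^*) = \lambda_0\, g^*(x^*)$;
(e) for $\varphi(x) = g(\lambda_0 x)$, $\varphi^*(x^*) = g^*\big( \frac{1}{\lambda_0} x^* \big)$;
(f) for $\varphi(x) = \lambda_0\, g\big( \frac{1}{\lambda_0} x + x_0 \big) + \vartheta_0$, $\varphi^*(x^*) = \lambda_0\, g^*(x^*) - \lambda_0 \langle x^*, x_0 \rangle_X - \vartheta_0$.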

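Let us give some examples of conjugate functions.

EXAMPLE 4.4.11
(a) Let $X$ be a normed space and
\[
C = \overline{B}_1 = \big\{ x \in X : \|x\|_X \leq 1 \big\}.
\]
Then
\[
\sigma_C(x^*) = i_C^*(x^*) = \sup_{c \in \overline{B}_1} \langle x^*, c \rangle_X = \|x^*\|_{X^*}.
\]

(b) If $K \subseteq X$ is a cone (i.e., $\lambda K \subseteq K$ for all $\lambda > 0$), then
\[
\sigma_K = i_{-K^*},
\]
where $K^*$ is the dual cone, i.e.,
\[
K^* \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, x \rangle_X \geq 0 \ \text{for all } x \in K \big\}.
\]
If $K$ is a linear subspace of $X$, then
\[
K^* = K^{\perp} \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, x \rangle_X = 0 \ \text{for all } x \in K \big\}.
\]

(c) Let $X$ be a normed space and $\varphi(x) = \|x\|_X$. Then we have
\[
\varphi^*(x^*) = \sup_{x \in X} \big[ \langle x^*, x \rangle_X - \|x\|_X \big].
\]
If $\|x^*\|_{X^*} \leq 1$, we have
\[
\langle x^*, x \rangle_X \leq \|x\|_X
\]
and so $\varphi^*(x^*) = 0$. On the other hand if $\|x^*\|_{X^*} > 1$, we can find $x \in X$, such that $\|x\|_X < \langle x^*, x \rangle_X$ and so
\[
\langle x^*, \lambda x \rangle_X - \|\lambda x\|_X = \lambda \big[ \langle x^*, x \rangle_X - \|x\|_X \big] > 0 \qquad \forall\ \lambda > 0.
\]
Therefore $\varphi^*(x^*) = +\infty$. So we conclude that
\[
\varphi^* = i_{\overline{B}_1^*},
\]
where $\overline{B}_1^* = \big\{ x^* \in X^* : \|x^*\|_{X^*} \leq 1 \big\}$.

(d) Let $X$ be a normed space, $C \subseteq X$ a nonempty set and
\[
\varphi(x) \stackrel{df}{=} d_X(x, C) \qquad \forall\ x \in X.
\]
Then from Remark 4.4.7, we know that
\[
\varphi = \|\cdot\|_X \oplus i_C.
\]
Then by virtue of Proposition 4.4.8, we have that
\[
\varphi^* = \|\cdot\|_X^* + i_C^* = i_{\overline{B}_1^*} + \sigma_C
\]
(see (c) and (a) above).

(e) If
\[
\varphi(x) \stackrel{df}{=} \langle x_0^*, x \rangle_X + \xi_0 \qquad \forall\ x \in X,
\]
with $x_0^* \in X^*$ and $\xi_0 \in \mathbb{R}$ (i.e., $\varphi$ is a continuous, affine function), then
\[
\varphi^*(x^*) = \sup_{x \in X} \big[ \langle x^* - x_0^*, x \rangle_X - \xi_0 \big] = \begin{cases} -\xi_0 & \text{if } x^* = x_0^*, \\ +\infty & \text{if } x^* \neq x_0^*. \end{cases}
\]

(f) If $\varphi : \mathbb{R}^N \longrightarrow \mathbb{R}$ is defined by
\[
\varphi(x) \stackrel{df}{=} \frac{1}{p} \|x\|_{\mathbb{R}^N}^p,
\]
with $p \in (1, +\infty)$, then
\[
\varphi^*(x^*) = \frac{1}{p'} \|x^*\|_{\mathbb{R}^N}^{p'},
\]
with $\frac{1}{p} + \frac{1}{p'} = 1$. Indeed, let
\[
g_{x^*}(x) \stackrel{df}{=} (x^*, x)_{\mathbb{R}^N} - \frac{1}{p} \|x\|_{\mathbb{R}^N}^p.
\]
Then $g_{x^*}$ is concave and
\[
g'_{x^*}(x) = x^* - \|x\|_{\mathbb{R}^N}^{p-2}\, x.
\]
We have that $g'_{x^*}(\widehat{x}) = 0$ at the unique point $\widehat{x}$, such that
\[
(x^*, \widehat{x})_{\mathbb{R}^N} = \|\widehat{x}\|_{\mathbb{R}^N}^p = \|x^*\|_{\mathbb{R}^N}^{\frac{p}{p-1}}.
\]
So if $p' = \frac{p}{p-1}$ and since
\[
\varphi^*(x^*) = \sup_{x \in \mathbb{R}^N} g_{x^*}(x),
\]
we have that
\[
\varphi^*(x^*) = \frac{1}{p'} \|x^*\|_{\mathbb{R}^N}^{p'}.
\]
More generally, let $X$ be a normed space and let $g : \mathbb{R} \longrightarrow \overline{\mathbb{R}}$ be an even, convex function. If
\[
\varphi(x) \stackrel{df}{=} g\big( \|x\|_X \big) \qquad \forall\ x \in X,
\]
then
\[
\varphi^*(x^*) = g^*\big( \|x^*\|_{X^*} \big) \qquad \forall\ x^* \in X^*.
\]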
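The computation in Example 4.4.11(f) can be checked numerically. The sketch below is only an illustration under assumed data: the exponent $p = 3$, the dimension $N = 4$, the random dual vector `x_star` and the helper names `g`, `grad_g` are all hypothetical choices. It uses the explicit critical point $\widehat{x} = \|x^*\|^{(2-p)/(p-1)} x^*$, which satisfies $\|\widehat{x}\|^{p-2}\, \widehat{x} = x^*$, and compares $g_{x^*}(\widehat{x})$ with the closed form $\frac{1}{p'} \|x^*\|^{p'}$.

```python
import numpy as np

p = 3.0
p_conj = p / (p - 1.0)            # conjugate exponent p'

rng = np.random.default_rng(0)
x_star = rng.normal(size=4)       # an arbitrary dual vector in R^4 (test data)

def g(x):
    """g_{x*}(x) = (x*, x) - (1/p) * ||x||^p (Euclidean norm)."""
    return x_star @ x - np.linalg.norm(x) ** p / p

def grad_g(x):
    """Gradient of g_{x*}: x* - ||x||^(p-2) * x."""
    return x_star - np.linalg.norm(x) ** (p - 2.0) * x

# Critical point of the concave function g_{x*}
x_hat = np.linalg.norm(x_star) ** ((2.0 - p) / (p - 1.0)) * x_star

grad_norm = np.linalg.norm(grad_g(x_hat))                    # should vanish
sup_value = g(x_hat)                                         # value of the supremum
closed_form = np.linalg.norm(x_star) ** p_conj / p_conj      # (1/p') ||x*||^p'

# Since g_{x*} is concave, no sampled point should exceed the critical value
samples = rng.normal(scale=3.0, size=(1000, 4))
best_sample = max(g(x) for x in samples)
```

The random sampling at the end is only a sanity check of concavity, not a proof that the critical point is the global maximizer.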


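PROPOSITION 4.4.12 If $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a convex and lower semicontinuous function, then $\varphi$ admits a continuous affine minorant, i.e.,
\[
\langle x_0^*, x \rangle_X - \xi_0 \leq \varphi(x) \qquad \forall\ x \in X,
\]
for some $(x_0^*, \xi_0) \in X^* \times \mathbb{R}$.

PROOF Clearly we may assume that $\varphi$ is proper, i.e., $\varphi \in \Gamma_0(X)$. Let $x_0 \in X$ and $\eta \in \mathbb{R}$ be such that $\eta < \varphi(x_0)$. Then $(x_0, \eta) \notin \operatorname{epi} \varphi$ and so by the strong separation theorem (see Theorem A.3.2), we can find $(x_0^*, \vartheta_0) \in X^* \times \mathbb{R}$, $(x_0^*, \vartheta_0) \neq (0, 0)$ and $\xi \in \mathbb{R}$, such that
\[
\langle x_0^*, x \rangle_X + \vartheta_0 \lambda < \xi < \langle x_0^*, x_0 \rangle_X + \vartheta_0 \eta \qquad \forall\ (x, \lambda) \in \operatorname{epi} \varphi.
\]
Let $(x, \lambda) = \big( x, \varphi(x) \big)$. We have
\[
\langle x_0^*, x \rangle_X + \vartheta_0 \varphi(x) < \xi < \langle x_0^*, x_0 \rangle_X + \vartheta_0 \eta, \tag{4.21}
\]
so $\vartheta_0 < 0$. Without any loss of generality, we may assume that $\vartheta_0 = -1$. Then from (4.21), we have
\[
\langle x_0^*, x \rangle_X - \varphi(x) < \xi_0 \qquad \forall\ x \in X,
\]
so
\[
\langle x_0^*, x \rangle_X - \xi_0 < \varphi(x) \qquad \forall\ x \in X.
\]

PROPOSITION 4.4.13 For any function $\varphi : X \longrightarrow \mathbb{R}^*$, we have $\varphi^{**} \leq \varphi$.

PROOF From the Young-Fenchel inequality (see Proposition 4.4.4), we have
\[
\varphi^{**}(x) = \sup_{x^* \in X^*} \big[ \langle x^*, x \rangle_X - \varphi^*(x^*) \big] \leq \varphi(x) \qquad \forall\ x \in X.
\]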


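The next theorem is very important and determines when we have equality in Proposition 4.4.13.

THEOREM 4.4.14 If $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a function, then $\varphi^{**} = \varphi$ if and only if $\varphi$ is convex and lower semicontinuous.

PROOF "$\Longrightarrow$": Follows from Remark 4.4.2.

"$\Longleftarrow$": If $\varphi \equiv +\infty$, then
\[
\varphi^* \equiv -\infty
\]
and so
\[
\varphi = \varphi^{**} \equiv +\infty.
\]
Therefore we may assume that $\varphi$ is proper. We know that we have $\varphi^{**} \leq \varphi$. So we need to show that the opposite inequality also holds. To this end let $x \in X$ and $\mu \in \mathbb{R}$ be such that $\mu < \varphi(x)$. Then $(x, \mu) \notin \operatorname{epi} \varphi$ and so we can apply the strong separation theorem (see Theorem A.3.2) and find $(x^*, \beta) \in X^* \times \mathbb{R}$, $(x^*, \beta) \neq (0, 0)$ and $\delta > 0$, such that
\[
\langle x^*, y \rangle_X + \beta \lambda \leq \langle x^*, x \rangle_X + \beta \mu - \delta \qquad \forall\ (y, \lambda) \in \operatorname{epi} \varphi.
\]
Since $\lambda$ can increase to $+\infty$, from this inequality it follows that $\beta \leq 0$. First suppose that $\beta < 0$. We have
\[
\langle x^*, y \rangle_X + \beta \varphi(y) < \langle x^*, x \rangle_X + \beta \mu \qquad \forall\ y \in X,
\]
from which it follows that
\[
(-\beta \varphi)^*(x^*) \leq \langle x^*, x \rangle_X + \beta \mu.
\]
Using Corollary 4.4.10(c), we obtain
\[
-\beta\, \varphi^*\Big( \frac{x^*}{-\beta} \Big) \leq \langle x^*, x \rangle_X + \beta \mu
\]
and thus
\[
\mu \leq \Big\langle -\frac{x^*}{\beta}, x \Big\rangle_X - \varphi^*\Big( -\frac{x^*}{\beta} \Big) \leq \varphi^{**}(x).
\]
Because $\mu < \varphi(x)$ was arbitrary, we infer that
\[
\varphi(x) \leq \varphi^{**}(x)
\]
as desired. Next assume that $\beta = 0$. We have
\[
\langle x^*, y \rangle_X \leq \langle x^*, x \rangle_X - \delta \qquad \forall\ y \in \operatorname{dom} \varphi
\]
and so, we see that
\[
x \notin \operatorname{dom} \varphi \qquad \text{and} \qquad \varphi(x) = +\infty.
\]
It is enough to show that also $\varphi^{**}(x) = +\infty$. Let $\eta \in \mathbb{R}$ be such that
\[
\langle x^*, y \rangle_X < \eta < \langle x^*, x \rangle_X \qquad \forall\ y \in \operatorname{dom} \varphi.
\]
As $\varphi$ is bounded below by an affine function (see Proposition 4.4.12), we have that there exist $y^* \in X^*$ and $\vartheta \in \mathbb{R}$, such that
\[
\langle y^*, y \rangle_X - \vartheta \leq \varphi(y) \qquad \forall\ y \in X.
\]
So for all $\gamma > 0$, we have
\[
\langle y^*, y \rangle_X - \vartheta + \gamma \big( \langle x^*, y \rangle_X - \eta \big) \leq \varphi(y) \qquad \forall\ y \in X.
\]
Then
\[
\langle y^* + \gamma x^*, y \rangle_X - \varphi(y) \leq \vartheta + \gamma \eta \qquad \forall\ y \in X
\]
and so
\[
\varphi^*(y^* + \gamma x^*) \leq \vartheta + \gamma \eta.
\]
Therefore
\[
\langle y^*, x \rangle_X - \vartheta + \gamma \big( \langle x^*, x \rangle_X - \eta \big) \leq \langle y^* + \gamma x^*, x \rangle_X - \varphi^*(y^* + \gamma x^*) \leq \varphi^{**}(x).
\]
Since $\eta < \langle x^*, x \rangle_X$ and $\gamma > 0$ was arbitrary, we see that the left hand side is arbitrarily large and so $\varphi^{**}(x) = +\infty$. Thus $\varphi(x) \leq \varphi^{**}(x)$.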
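Theorem 4.4.14 and Proposition 4.4.13 can be seen at work on a nonconvex function: the biconjugate stays below $\varphi$ and replaces it by the supremum of its affine minorants. The following sketch is a grid approximation under assumed data: the double-well function $\varphi(x) = \min\{(x-1)^2, (x+1)^2\}$, the grids `xs` and `slopes`, and the helper `conjugate` are choices made for this illustration. One can check by hand that for this $\varphi$ the biconjugate is $\max\{|x| - 1, 0\}^2$, which the code uses as a reference.

```python
import numpy as np

def conjugate(values, points, duals):
    """Discrete conjugate: for each d in duals, max over points of d*x - values(x)."""
    return np.max(duals[:, None] * points[None, :] - values[None, :], axis=1)

xs = np.linspace(-3.0, 3.0, 1201)       # primal grid, step 0.005
slopes = np.linspace(-4.0, 4.0, 801)    # dual grid, step 0.01

# Nonconvex double-well function with minima at x = -1 and x = 1
phi = np.minimum((xs - 1.0) ** 2, (xs + 1.0) ** 2)

phi_star = conjugate(phi, xs, slopes)       # phi*  on the dual grid
phi_bi = conjugate(phi_star, slopes, xs)    # phi** back on the primal grid

# Hand-computed convex envelope used as a reference
envelope = np.maximum(np.abs(xs) - 1.0, 0.0) ** 2
```

At $x = 0$ (index 600 of `xs`) one has $\varphi(0) = 1$ while the computed biconjugate is numerically zero, so equality $\varphi^{**} = \varphi$ fails exactly where convexity fails.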


COROLLARY 4.4.15 If $C \subseteq X$ is a nonempty, closed and convex set, then $x \in C$ if and only if
\[
\langle x^*, x \rangle_X \leq \sigma_C(x^*) \qquad \forall\ x^* \in X^*.
\]

In Proposition 4.4.8, we saw that addition is the dual operation to infimal convolution. The next proposition shows that under some additional conditions, the converse is also true.

PROPOSITION 4.4.16 If $\varphi, \psi : X \longrightarrow \overline{\mathbb{R}}$ are proper, convex functions and there exists a point $x \in \operatorname{dom} \varphi$, such that $\psi$ is continuous at $x$, then
\[
\big(\varphi + \psi\big)^* = \varphi^* \oplus \psi^*.
\]

Now we pass to the study of the subdifferential starting with convex subdifferentials. The convex subdifferential characterizes the local behaviour of convex functions, in a way which is analogous to that in which derivatives determine the local behaviour of smooth functions (see Section 4.1). In fact we can develop a subdifferential calculus which to a high degree parallels the differential calculus of smooth functions. The mathematical setting remains as before. Namely $X$ is a Hausdorff, locally convex vector space, $X^*$ is its topological dual. The spaces $X$ and $X^*$ are supplied with the $w(X, X^*)$ and $w(X^*, X)$ topologies respectively. Let $\varphi : X \longrightarrow \overline{\mathbb{R}}$ be a proper, convex function and $x \in \operatorname{dom} \varphi$, $h \in X$. The function
\[
u_x(\lambda) \stackrel{df}{=} \frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda}
\]
is increasing on $(0, +\infty)$. So we can make the following definition.

DEFINITION 4.4.17 Let $\varphi : X \longrightarrow \overline{\mathbb{R}}$ be a proper, convex function and $x_0 \in \operatorname{dom} \varphi$. The directional derivative of $\varphi$ at $x_0$ in the direction $h \in X$ is defined by
\[
\varphi'(x_0; h) \stackrel{df}{=} \inf_{\lambda > 0} \frac{\varphi(x_0 + \lambda h) - \varphi(x_0)}{\lambda} = \lim_{\lambda \searrow 0} \frac{\varphi(x_0 + \lambda h) - \varphi(x_0)}{\lambda}.
\]

REMARK 4.4.18 Note that $\varphi'(x_0; h) \in \mathbb{R}^*$ and it is easy to see that $\varphi'(x_0; \cdot)$ is sublinear. Moreover, if $X$ is a Banach space and $\varphi'(x_0; \cdot) \in X^*$, then $\varphi$ is Gâteaux differentiable at $x_0$ and $\varphi'(x_0; \cdot) = \varphi'_G(x_0)$.


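DEFINITION 4.4.19 Let $\varphi : X \longrightarrow \overline{\mathbb{R}}$ be a proper function and $x_0 \in \operatorname{dom} \varphi$. The subdifferential of $\varphi$ at $x_0$ is the subset $\partial\varphi(x_0)$ (possibly empty) of $X^*$, defined by
\[
\partial\varphi(x_0) \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, y - x_0 \rangle_X \leq \varphi(y) - \varphi(x_0) \ \text{for all } y \in X \big\}.
\]

REMARK 4.4.20 From this definition we see that
\[
x^* \in \partial\varphi(x_0) \qquad \text{if and only if} \qquad x_0 \in \operatorname{argmin} (\varphi - x^*),
\]
where
\[
\operatorname{argmin} \psi \stackrel{df}{=} \big\{ x \in X : \psi(x) = \inf_X \psi \big\}.
\]
The set $\partial\varphi(x)$ is always a closed and convex subset of $X^*$ and it can be empty (consider for example the subdifferential $\partial\varphi(x)$ when $x \notin \operatorname{dom} \varphi$). The domain of the subdifferential multifunction $\partial\varphi$ is the set
\[
D(\partial\varphi) = \big\{ x \in X : \partial\varphi(x) \neq \emptyset \big\}.
\]
The function $\varphi$ is said to be subdifferentiable at $x \in X$, if $x \in D(\partial\varphi)$. The elements of $\partial\varphi(x)$ are called subgradients of $\varphi$ at $x$. Using the epigraph of $\varphi$ we can better understand the geometric meaning of the subdifferential. So $\varphi$ is subdifferentiable at $x \in X$ and $x^* \in X^*$ is a subgradient of $\varphi$ at $x$ if and only if the graph of the continuous function
\[
y \longmapsto \langle x^*, y - x \rangle_X + \varphi(x)
\]
is a nonvertical supporting hyperplane to the set $\operatorname{epi} \varphi$ at $\big( x, \varphi(x) \big)$, that is the continuous affine function
\[
l(y) \stackrel{df}{=} \langle x^*, y - x \rangle_X + \varphi(x)
\]
is a minorant of $\varphi$ which is exact at $x$, i.e.,
\[
l \leq \varphi \qquad \text{and} \qquad l(x) = \varphi(x).
\]
Since $l(x) \leq \varphi^{**}(x) \leq \varphi(x)$ (see Remark 4.4.2 and Proposition 4.4.13), we infer that if $\partial\varphi(x) \neq \emptyset$, then $\varphi(x) = \varphi^{**}(x)$. Consequently, if $\varphi(x) = \varphi^{**}(x)$, then $\partial\varphi(x) = \partial\varphi^{**}(x)$.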


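PROPOSITION 4.4.21 If $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a function, then
\[
x^* \in \partial\varphi(x) \iff \varphi(x) + \varphi^*(x^*) = \langle x^*, x \rangle_X.
\]

PROOF "$\Longrightarrow$": From the definition of the subdifferential, we have
\[
\langle x^*, y \rangle_X - \varphi(y) \leq \langle x^*, x \rangle_X - \varphi(x) \qquad \forall\ y \in X,
\]
so
\[
\varphi^*(x^*) + \varphi(x) \leq \langle x^*, x \rangle_X.
\]
Since the opposite inequality is always true (see the Young-Fenchel inequality; Proposition 4.4.4), we conclude that
\[
\varphi(x) + \varphi^*(x^*) = \langle x^*, x \rangle_X.
\]
"$\Longleftarrow$": We have
\[
\langle x^*, x \rangle_X - \varphi(x) = \varphi^*(x^*) \geq \langle x^*, y \rangle_X - \varphi(y) \qquad \forall\ y \in X.
\]
Therefore
\[
\langle x^*, y - x \rangle_X \leq \varphi(y) - \varphi(x) \qquad \forall\ y \in X,
\]
hence $x^* \in \partial\varphi(x)$.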
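Proposition 4.4.21 gives a practical membership test for the subdifferential. As a small sketch on $X = \mathbb{R}$ with $\varphi(x) = |x|$, whose conjugate is the indicator of $[-1, 1]$ (Example 4.4.11(c)), the code below compares the Fenchel equality with a direct check of the subgradient inequality on finitely many sample points. The function names, the sample grid `ys` and the test pairs in `cases` are hypothetical choices for this illustration, and the sampled check is only a necessary condition, not a proof.

```python
import math

def phi(x):
    """phi(x) = |x| on the real line."""
    return abs(x)

def phi_star(s):
    """Conjugate of |.|: indicator function of the interval [-1, 1]."""
    return 0.0 if abs(s) <= 1.0 else math.inf

def in_subdiff_fenchel(s, x, tol=1e-12):
    """Test s in d phi(x) through the equality phi(x) + phi*(s) = s * x."""
    return abs(phi(x) + phi_star(s) - s * x) <= tol

def in_subdiff_definition(s, x, ys, tol=1e-12):
    """Test the inequality s * (y - x) <= phi(y) - phi(x) on the sample points ys."""
    return all(s * (y - x) <= phi(y) - phi(x) + tol for y in ys)

ys = [k / 10.0 for k in range(-50, 51)]     # sample points in [-5, 5]
cases = [(0.3, 0.0), (1.0, 2.0), (-1.0, 2.0), (1.5, 0.0), (-1.0, -0.5)]

# Each entry: (s, x, Fenchel test, sampled definition test)
results = [
    (s, x, in_subdiff_fenchel(s, x), in_subdiff_definition(s, x, ys))
    for s, x in cases
]
```

For these pairs both tests agree: the slopes $0.3$ at $x = 0$, $1$ at $x = 2$ and $-1$ at $x = -0.5$ are subgradients, while $-1$ at $x = 2$ and $1.5$ at $x = 0$ are not.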

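COROLLARY 4.4.22 If $\varphi : X \longrightarrow \overline{\mathbb{R}}$ and $x^* \in \partial\varphi(x)$, then $x \in \partial\varphi^*(x^*)$.

PROOF Since $x^* \in \partial\varphi(x)$, we have
\[
\varphi^*(x^*) + \varphi(x) = \langle x^*, x \rangle_X
\]
(see Proposition 4.4.21). Then since $\varphi^{**} \leq \varphi$ (see Proposition 4.4.13), we obtain
\[
\varphi^*(x^*) + \varphi^{**}(x) \leq \langle x^*, x \rangle_X.
\]
A new appeal to Proposition 4.4.21 gives that $x \in \partial\varphi^*(x^*)$.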


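COROLLARY 4.4.23 If $\varphi \in \Gamma_0(X)$, then
\[
x^* \in \partial\varphi(x) \iff x \in \partial\varphi^*(x^*).
\]

PROOF Since $\varphi \in \Gamma_0(X)$, we have
\[
\varphi = \varphi^{**}
\]
(see Theorem 4.4.14). So from Corollary 4.4.22 we conclude the desired equivalence.

Before continuing with the investigation of the subdifferentials in the context of convex functions, let us give some examples of subdifferentials.

EXAMPLE 4.4.24
(a) Let $\varphi : \mathbb{R} \longrightarrow \overline{\mathbb{R}}$ be a proper, convex function and $x \in \operatorname{int} \operatorname{dom} \varphi$. Then it is easily seen that
\[
\partial\varphi(x) = \big[ \varphi'_-(x), \varphi'_+(x) \big].
\]

(b) Let $X$ be a Banach space and $\varphi(x) \stackrel{df}{=} \|x\|_X$. If $x \neq 0$, then
\[
\partial\varphi(x) = \big\{ x^* \in X^* : \|x^*\|_{X^*} = 1,\ \langle x^*, x \rangle_X = \|x\|_X \big\}.
\]
Indeed, let $x^* \in X^*$ be such that
\[
\|x^*\|_{X^*} = 1 \qquad \text{and} \qquad \langle x^*, x \rangle_X = \|x\|_X.
\]
Then
\[
\langle x^*, y \rangle_X \leq \|y\|_X \qquad \forall\ y \in X
\]
and so
\[
\langle x^*, y - x \rangle_X \leq \|y\|_X - \|x\|_X,
\]
hence $x^* \in \partial\varphi(x)$. On the other hand, let $x^* \in \partial\varphi(x)$. Then
\[
-\|x\|_X \geq -\langle x^*, x \rangle_X
\]
and
\[
\|x\|_X = 2\|x\|_X - \|x\|_X \geq \langle x^*, x \rangle_X,
\]
from which we infer that
\[
\langle x^*, x \rangle_X = \|x\|_X.
\]
Also
\[
\langle x^*, \lambda y \rangle_X \leq \|x + \lambda y\|_X - \|x\|_X \qquad \forall\ y \in X,\ \lambda > 0,
\]
hence
\[
\langle x^*, y \rangle_X \leq \Big\| \frac{1}{\lambda} x + y \Big\|_X - \frac{1}{\lambda} \|x\|_X.
\]
Let $\lambda \to +\infty$, to obtain
\[
\langle x^*, y \rangle_X \leq \|y\|_X,
\]
from which it follows that
\[
\|x^*\|_{X^*} \leq 1.
\]
But since $\langle x^*, x \rangle_X = \|x\|_X$, we conclude that $\|x^*\|_{X^*} = 1$. If $x = 0$, then
\[
\partial\varphi(0) = \overline{B}_1^* = \big\{ x^* \in X^* : \|x^*\|_{X^*} \leq 1 \big\}.
\]
Indeed note that
\[
x^* \in \partial\varphi(0) \iff \Big[ \langle x^*, x \rangle_X \leq \|x\|_X \quad \forall\ x \in X \Big]
\]
and the last inequality is equivalent to saying that $\|x^*\|_{X^*} \leq 1$.

(c) Let $C$ be a closed, convex set in $X$ and $\varphi(x) \stackrel{df}{=} i_C(x)$. Then for $x \in C$,
\begin{align*}
\partial\varphi(x) = N_C(x) &\stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, c - x \rangle_X \leq 0 \ \text{for all } c \in C \big\} \\
&= \big\{ x^* \in X^* : \langle x^*, x \rangle_X = \sigma_C(x^*) \big\}.
\end{align*}
The set $\partial\varphi(x) = N_C(x)$ is a nonempty (because $0 \in \partial\varphi(x) = N_C(x)$), closed and convex cone in $X^*$, known as the normal cone to $C$ at $x$. It generalizes the notion of normal space (see Definition A.1.12(b)) in differential geometry. If $x \notin C$, then $\partial\varphi(x) = N_C(x) = \emptyset$. So $D(\partial\varphi) = C$ and
\[
\partial\varphi(x) = N_C(x) = \{0\} \qquad \forall\ x \in \operatorname{int} C.
\]
If $C = V$ is a linear subspace of $X$, then
\[
\partial\varphi(x) = N_V(x) = V^{\perp} = \big\{ x^* \in X^* : \langle x^*, v \rangle_X = 0 \ \text{for all } v \in V \big\} \qquad \forall\ x \in V.
\]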


For convex functions we have an easy criterion for subdifferentiability at $x \in X$.

PROPOSITION 4.4.25 If $X$ is a Banach space and $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a convex function which is continuous at $x \in X$, then $\partial\varphi(x) \neq \emptyset$ and $\partial\varphi(x)$ is $w^*$-compact and convex in $X^*$.

PROOF From Theorem 4.2.3, we know that
\[
\operatorname{int} \operatorname{epi} \varphi \neq \emptyset.
\]
Since $\big( x, \varphi(x) \big)$ belongs to the boundary of $\operatorname{epi} \varphi$, we can apply the weak separation theorem (see Theorem A.3.1) and find $(x^*, \eta) \in X^* \times \mathbb{R}$, with $(x^*, \eta) \neq (0, 0)$, such that
\[
\eta \big( \varphi(x) - \lambda \big) \leq \langle -x^*, x - y \rangle_X \qquad \forall\ (y, \lambda) \in \operatorname{epi} \varphi. \tag{4.22}
\]
Since for fixed $y \in \operatorname{dom} \varphi$, $\lambda$ can increase up to $+\infty$, from (4.22), we infer that $\eta \geq 0$. If $\eta = 0$, then
\[
\langle -x^*, x - y \rangle_X \geq 0 \qquad \forall\ y \in \operatorname{dom} \varphi.
\]
But $x \in \operatorname{int} \operatorname{dom} \varphi$ (see Theorem 4.2.3). So $x^* = 0$, a contradiction. So $\eta > 0$ and we take $\eta = 1$. Then from (4.22) with $\lambda = \varphi(y)$, we have
\[
\varphi(x) - \varphi(y) \leq \langle -x^*, x - y \rangle_X,
\]
so $-x^* \in \partial\varphi(x) \neq \emptyset$. From Theorems 4.2.3 and 4.2.7, we know that there exists $r > 0$, such that $\varphi|_{B_r(x)}$ is Lipschitz continuous. So for every $x^* \in \partial\varphi(x)$ we have
\[
\langle x^*, u \rangle_X \leq \varphi(x + u) - \varphi(x) \leq k \|u\|_X \qquad \forall\ u \in B_r(0),
\]
for some $k > 0$ and so $\|x^*\|_{X^*} \leq k$. By Alaoglu's theorem (see Theorem A.3.9) and since $\partial\varphi(x)$ is clearly $w^*$-closed, we conclude that it is $w^*$-compact and convex.

REMARK 4.4.26 The result is actually true in the more general context of dual pairs of locally convex spaces. However, since the material of Section 4.2 was developed in the context of Banach spaces and to avoid introducing additional functional analytic material, we have stated the result in Banach spaces.


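In fact for a continuous, convex function $\varphi$ we can describe the subdifferential completely.

PROPOSITION 4.4.27 If $X$ is a Banach space and $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a convex function which is continuous at $x \in X$, then
\[
\sigma_{\partial\varphi(x)}(h) = \varphi'(x; h) \qquad \forall\ h \in X.
\]

PROOF Let
\[
\psi(h) \stackrel{df}{=} \varphi'(x; h) \qquad \forall\ h \in X.
\]
Since $\varphi$ is continuous at $x \in X$, we have $\partial\varphi(x) \neq \emptyset$ (see Proposition 4.4.25). So we have
\[
\langle x^*, h \rangle_X \leq \psi(h) \leq \varphi(x + h) - \varphi(x) \qquad \forall\ h \in X,\ x^* \in \partial\varphi(x),
\]
so $\psi$ is finite everywhere, hence continuous on $X$. Also using Proposition 4.4.9, we see that the conjugate of the function
\[
\psi_\lambda(h) \stackrel{df}{=} \frac{1}{\lambda} \big[ \varphi(x + \lambda h) - \varphi(x) \big] \qquad \forall\ \lambda > 0
\]
is the function
\[
\psi_\lambda^*(x^*) = \frac{1}{\lambda} \big[ \varphi^*(x^*) + \varphi(x) - \langle x^*, x \rangle_X \big] \qquad \forall\ \lambda > 0.
\]
Since $\psi = \inf_{\lambda > 0} \psi_\lambda$, we have that
\[
\psi^* = \sup_{\lambda > 0} \psi_\lambda^*
\]
(see Remark 4.4.7). Therefore
\[
\psi^*(x^*) = \sup_{\lambda > 0} \frac{1}{\lambda} \big[ \varphi^*(x^*) + \varphi(x) - \langle x^*, x \rangle_X \big].
\]
The bracket is nonnegative by Proposition 4.4.4 and vanishes exactly when $x^* \in \partial\varphi(x)$ by Proposition 4.4.21. Hence we have
\[
\psi^*(x^*) = \begin{cases} 0 & \text{if } x^* \in \partial\varphi(x), \\ +\infty & \text{otherwise,} \end{cases}
\]
i.e., $\psi^* = i_{\partial\varphi(x)}$ and so $\psi^{**} = \psi = \sigma_{\partial\varphi(x)}$ (see Theorem 4.4.14).

REMARK 4.4.28 Again the result remains valid in the framework of dual pairs of locally convex spaces.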


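Next we show that for convex functions the case of Gâteaux differentiability is essentially the same as that of uniqueness of the subgradient.

PROPOSITION 4.4.29 Let $X$ be a Banach space and let $\varphi : X \longrightarrow \overline{\mathbb{R}}$ be a proper, convex function.
(a) If $\varphi$ is Gâteaux differentiable at $x$, then
\[
x \in D(\partial\varphi) \qquad \text{and} \qquad \partial\varphi(x) = \big\{ \varphi'_G(x) \big\}.
\]
(b) If $\varphi$ is continuous at $x$ and $\partial\varphi(x)$ is a singleton, then $\varphi$ is Gâteaux differentiable at $x$ and
\[
\partial\varphi(x) = \big\{ \varphi'_G(x) \big\}.
\]

PROOF (a) Due to the convexity and Gâteaux differentiability of $\varphi$ at $x$, we have
\[
\langle \varphi'_G(x), h \rangle_X \leq \frac{1}{\lambda} \big[ \varphi(x + \lambda h) - \varphi(x) \big] \leq \varphi(x + h) - \varphi(x) \qquad \forall\ \lambda \in (0, 1),\ h \in X,
\]
so $\varphi'_G(x) \in \partial\varphi(x)$. Let $x^* \in X^*$ be any element of $\partial\varphi(x)$. We have
\[
\langle x^*, h \rangle_X \leq \frac{1}{\lambda} \big[ \varphi(x + \lambda h) - \varphi(x) \big] \qquad \forall\ \lambda > 0,\ h \in X,
\]
so
\[
\langle x^*, h \rangle_X \leq \langle \varphi'_G(x), h \rangle_X \qquad \forall\ h \in X
\]
and thus $x^* = \varphi'_G(x)$, i.e., $\partial\varphi(x) = \big\{ \varphi'_G(x) \big\}$.

(b) Since $\varphi$ is convex, we have
\[
\varphi(x) + \lambda \varphi'(x; h) \leq \varphi(x + \lambda h) \qquad \forall\ \lambda \in \mathbb{R},\ h \in X.
\]
So the straight line
\[
L \stackrel{df}{=} \big\{ \big( x + \lambda h, \varphi(x) + \lambda \varphi'(x; h) \big) : \lambda \in \mathbb{R} \big\}
\]
does not intersect $\operatorname{int} \operatorname{epi} \varphi \neq \emptyset$ (see Theorem 4.2.3). Then by the weak separation theorem (see Theorem A.3.1), we can find a closed hyperplane $H$ containing the line $L$, such that
\[
H \cap \operatorname{int} \operatorname{epi} \varphi = \emptyset.
\]
The hyperplane $H$ is the graph of a continuous affine function $l$ on $X$, such that $l(x) = \varphi(x)$. Since by hypothesis $\partial\varphi(x) = \{x^*\}$, the slope of $l$ is $x^*$ and because $L \subseteq H$, we have
\[
\varphi'(x; h) = \langle x^*, h \rangle_X \qquad \forall\ h \in X.
\]
Thus $\varphi$ is Gâteaux differentiable at $x$ and
\[
\partial\varphi(x) = \big\{ \varphi'_G(x) \big\}.
\]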

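The next proposition explains the central role of the subdifferential in optimization theory. It is a direct consequence of Definition 4.4.19.

PROPOSITION 4.4.30 If $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper function, then $\varphi$ attains its minimum at $x \in \operatorname{dom} \varphi$ if and only if $0 \in \partial\varphi(x)$.

Next we will establish some basic rules of the subdifferential calculus. We start with two straightforward observations. Here $\varphi, \psi : X \longrightarrow \overline{\mathbb{R}}$ are proper functions. We have
\[
\partial(\lambda \varphi)(x) = \lambda\, \partial\varphi(x) \qquad \forall\ \lambda > 0,\ x \in X \tag{4.23}
\]
and
\[
\partial\varphi(x) + \partial\psi(x) \subseteq \partial(\varphi + \psi)(x) \qquad \forall\ x \in \operatorname{dom} \varphi \cap \operatorname{dom} \psi. \tag{4.24}
\]
The next proposition provides a simple situation where equality in (4.24) is realized.

PROPOSITION 4.4.31 If $\varphi, \psi : X \longrightarrow \overline{\mathbb{R}}$ are proper, convex functions and there exists $\widehat{x} \in \operatorname{dom} \varphi \cap \operatorname{dom} \psi$ where $\varphi$ is continuous, then
\[
\partial\varphi(x) + \partial\psi(x) = \partial(\varphi + \psi)(x) \qquad \forall\ x \in X.
\]

PROOF Because of (4.24), we need to show that
\[
\partial\varphi(x) + \partial\psi(x) \supseteq \partial(\varphi + \psi)(x) \qquad \forall\ x \in X. \tag{4.25}
\]
To this end let $x^* \in \partial(\varphi + \psi)(x)$. Then $x \in \operatorname{dom} \varphi \cap \operatorname{dom} \psi$ and
\[
\psi(x) - \psi(y) \leq \varphi(y) - \varphi(x) - \langle x^*, y - x \rangle_X = g(y) \qquad \forall\ y \in X.
\]
We introduce the following two sets
\[
C_1 \stackrel{df}{=} \operatorname{epi} g \qquad \text{and} \qquad C_2 \stackrel{df}{=} \big\{ (y, \mu) \in X \times \mathbb{R} : \mu \leq \psi(x) - \psi(y) \big\}.
\]
Both sets are convex and by virtue of Theorem 4.2.3, $\operatorname{int} C_1 \neq \emptyset$. Also $\operatorname{int} C_1 \cap C_2 = \emptyset$. Indeed,
\[
g(y) \leq \mu \leq \psi(x) - \psi(y) \qquad \forall\ (y, \mu) \in \operatorname{int} C_1 \cap C_2
\]
and so $g(y) = \mu$. Because $(y, \mu) \in \operatorname{int} C_1$, we have that $(y, \mu - \varepsilon) \in C_1$ for $\varepsilon > 0$ small and so $g(y) \leq \mu - \varepsilon$, a contradiction. Since $\operatorname{int} C_1 \cap C_2 = \emptyset$, we can apply the weak separation theorem (see Theorem A.3.1) and produce $(z^*, \eta) \in X^* \times \mathbb{R}$, $(z^*, \eta) \neq (0, 0)$, such that
\[
\langle z^*, z \rangle_X + \eta \lambda \leq \langle z^*, y \rangle_X + \eta \mu \qquad \forall\ (z, \lambda) \in C_1,\ (y, \mu) \in C_2 \tag{4.26}
\]
and the inequality is strict if $(z, \lambda) \in \operatorname{int} C_1$. Note that $(x, 0) \in C_2$. Then from (4.26) and since $\lambda$ can increase to $+\infty$, we obtain $\eta \leq 0$. If $\eta = 0$, then
\[
\langle z^*, z \rangle_X \leq \langle z^*, x \rangle_X \qquad \forall\ z \in \operatorname{dom} g.
\]
But since $g$ is continuous at $x$, $\operatorname{dom} g$ is a neighbourhood of $x$, hence $z^* = 0$, a contradiction to the fact that $(z^*, \eta) \neq (0, 0)$. So $\eta < 0$ and we may assume that $\eta = -1$. Then from (4.26), we have
\[
\langle z^*, z \rangle_X - g(z) \leq \langle z^*, x \rangle_X \leq \langle z^*, y \rangle_X - \big( \psi(x) - \psi(y) \big) \qquad \forall\ z \in \operatorname{dom} g,\ y \in \operatorname{dom} \psi.
\]
From the second inequality we have that $-z^* \in \partial\psi(x)$, while from the first we have that $x^* + z^* \in \partial\varphi(x)$. Then
\[
x^* = x^* + z^* + (-z^*) \in \partial\varphi(x) + \partial\psi(x)
\]
and we have proved (4.25).

REMARK 4.4.32 Of course the result is also true for any family $\{\varphi_i\}_{i=1}^{n}$ of proper, convex functions on $X$, such that there exists $x \in \bigcap_{i=1}^{n} \operatorname{dom} \varphi_i$, where all but one of the functions are continuous.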
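The sum rule and the optimality condition of Proposition 4.4.30 combine into a concrete recipe. As a sketch on $X = \mathbb{R}$ (the data $a$, $t$ and all function names below are hypothetical), consider minimizing $x \mapsto \frac{1}{2}(x - a)^2 + t|x|$ with $t > 0$. By Proposition 4.4.31 and (4.23), the condition $0 \in \partial(\cdot)(x)$ reads $0 \in (x - a) + t\, \partial|\cdot|(x)$, and using $\partial|\cdot|(0) = [-1, 1]$ this is solved by the well-known soft-thresholding formula $x = \operatorname{sign}(a) \max\{|a| - t, 0\}$. The code compares this closed form with a brute-force minimization over a fine grid.

```python
import numpy as np

def soft_threshold(a, t):
    """Solution of 0 in (x - a) + t * d|.|(x), i.e. sign(a) * max(|a| - t, 0)."""
    return np.sign(a) * max(abs(a) - t, 0.0)

def objective(x, a, t):
    """0.5 * (x - a)^2 + t * |x|, evaluated elementwise."""
    return 0.5 * (x - a) ** 2 + t * np.abs(x)

grid = np.linspace(-5.0, 5.0, 100001)    # step 1e-4 (illustrative resolution)

checks = []
for a, t in [(2.0, 0.5), (-1.2, 0.7), (0.3, 1.0)]:
    x_closed = soft_threshold(a, t)                      # from the subdifferential condition
    x_grid = grid[np.argmin(objective(grid, a, t))]      # brute-force grid minimizer
    checks.append((x_closed, x_grid))
```

In the third case $|a| \leq t$, so $0 \in -a + t[-1, 1]$ and the minimizer is $x = 0$; this is the situation where the nonsmooth term dominates and the subdifferential at the kink absorbs the gradient of the smooth part.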


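PROPOSITION 4.4.33 If $A \in \mathcal{L}(X; Y)$ and $\varphi : Y \longrightarrow \overline{\mathbb{R}}$ is a proper function, then
\[
A^* \partial\varphi(Ax) \subseteq \partial(\varphi \circ A)(x) \qquad \forall\ x \in X
\]
and equality holds if in addition $\varphi$ is convex and continuous at a point in the range of $A$.

PROOF The inclusion follows at once from the definitions. Let us prove that equality holds when $\varphi$ is convex and continuous at a point in the range of $A$. So let $x^* \in \partial(\varphi \circ A)(x)$. We have
\[
\langle x^*, z - x \rangle_X + (\varphi \circ A)(x) \leq (\varphi \circ A)(z) \qquad \forall\ z \in X. \tag{4.27}
\]
Let
\[
L \stackrel{df}{=} \big\{ \big( Az, \langle x^*, z - x \rangle_X + (\varphi \circ A)(x) \big) \in Y \times \mathbb{R} : z \in X \big\}.
\]
This is an affine subspace of $Y \times \mathbb{R}$ and because of (4.27), $L$ and $\operatorname{epi} \varphi$ have only boundary points in common, that is
\[
L \cap \operatorname{int} \operatorname{epi} \varphi = \emptyset
\]
(note that by Theorem 4.2.3, $\operatorname{int} \operatorname{epi} \varphi \neq \emptyset$). So we can apply the weak separation theorem (see Theorem A.3.1) and find a closed hyperplane $H$ containing $L$, such that
\[
H \cap \operatorname{int} \operatorname{epi} \varphi = \emptyset.
\]
The hyperplane $H$ is the graph of a continuous affine function
\[
l(y) \stackrel{df}{=} \langle y^*, y \rangle_Y + \mu \qquad \forall\ y \in Y,
\]
with $(y^*, \mu) \in Y^* \times \mathbb{R}$. Since $H \supseteq L$, we have
\[
\langle y^*, Az \rangle_Y + \mu = \langle x^*, z - x \rangle_X + (\varphi \circ A)(x) \qquad \forall\ z \in X,
\]
so taking $z = 0$, we have
\[
\mu = (\varphi \circ A)(x) - \langle x^*, x \rangle_X \qquad \text{and} \qquad \langle y^*, Az \rangle_Y = \langle x^*, z \rangle_X \qquad \forall\ z \in X.
\]
From the second equality, we infer that $x^* = A^* y^*$. Also since
\[
H \cap \operatorname{int} \operatorname{epi} \varphi = \emptyset,
\]
we have
\[
\langle y^*, y \rangle_Y + (\varphi \circ A)(x) - \langle A^* y^*, x \rangle_X \leq \varphi(y) \qquad \forall\ y \in Y,
\]
so
\[
\langle y^*, y - Ax \rangle_Y + (\varphi \circ A)(x) \leq \varphi(y) \qquad \forall\ y \in Y
\]
and thus $y^* \in \partial\varphi(Ax)$. We infer that
\[
\partial(\varphi \circ A)(x) \subseteq A^* \partial\varphi(Ax) \qquad \forall\ x \in X.
\]
Therefore equality must hold.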

Next we study the multifunction $\partial\varphi : X \to 2^{X^*}$. The first result explains the connection between subdifferentials and maximal monotone maps and it generalizes the elementary fact that if $\varphi : \mathbb{R} \to \mathbb{R}$ is a continuous, convex function, then $\varphi'$ is increasing.

THEOREM 4.4.34 If $X$ is a reflexive Banach space and $\varphi \in \Gamma_0(X)$, then $\partial\varphi : X \to 2^{X^*}$ is a maximal monotone map.

PROOF Using Troyanski's renorming theorem (see Theorem A.3.23), we may assume that both $X$ and $X^*$ are locally uniformly convex (see Definition A.3.21). Let $\mathcal{F} : X \to X^*$ be the duality map of $X$. By Proposition 3.2.27, $\mathcal{F}$ is a homeomorphism. Now, directly from the definition, we see that $\partial\varphi$ is monotone. So by virtue of Theorem 3.2.29, to prove the maximal monotonicity of $\partial\varphi$, it suffices to show that
$$R(\partial\varphi + \mathcal{F}) = X^*. \tag{4.28}$$
To this end let $x^* \in X^*$ and consider the function $\psi : X \to \overline{\mathbb{R}}$, defined by
$$\psi(x) \overset{df}{=} \tfrac{1}{2}\|x\|_X^2 + \varphi(x) - \langle x^*, x\rangle_X \quad \forall\, x \in X.$$
Evidently $\psi \in \Gamma_0(X)$ and
$$\psi(x) \to +\infty \quad \text{as } \|x\|_X \to +\infty$$
(recall that $\varphi$ is bounded below by a continuous affine function; see Proposition 4.4.12). So by the Weierstrass theorem, we can find $x_0 \in \operatorname{dom}\psi$, such that
$$\psi(x_0) = \inf_X \psi.$$
Then from Proposition 4.4.30, we have that $0 \in \partial\psi(x_0)$. Using Proposition 4.4.31 (see also Remark 4.4.32), we have
$$\partial\psi(x_0) = \mathcal{F}(x_0) + \partial\varphi(x_0) - x^*$$
(recall that $\partial\big(\tfrac{1}{2}\|\cdot\|_X^2\big)(x_0) = \mathcal{F}(x_0)$; see Example 3.2.20(d)). Hence
$$0 \in \partial\varphi(x_0) + \mathcal{F}(x_0) - x^*$$
and so
$$x^* \in \partial\varphi(x_0) + \mathcal{F}(x_0).$$
Because $x^* \in X^*$ was arbitrary, we conclude that (4.28) holds and thus $\partial\varphi$ is maximal monotone.

REMARK 4.4.35 The result is actually true for $X$ being any Banach space. For a proof of the result in this general case we refer to Rockafellar (1970b) (see also Phelps (1993, p. 59)).

Now we obtain some additional properties which characterize the subdifferentials within the class of maximal monotone maps.

DEFINITION 4.4.36 Let $X$ be a Banach space and $A : X \to 2^{X^*}$. We say that $A$ is n-cyclically monotone provided that
$$\sum_{k=0}^{n} \langle x_k^*, x_k - x_{k+1}\rangle_X \ge 0,$$
whenever $n \ge 1$ and $x_0, x_1, \dots, x_n \in X$, $x_{n+1} = x_0$ and
$$x_k^* \in A(x_k) \quad \forall\, k \in \{0, 1, \dots, n\}.$$
We say that $A$ is cyclically monotone, if it is n-cyclically monotone for every $n \ge 2$. The map $A$ is maximal cyclically monotone, if its graph is not properly included in the graph of a cyclically monotone map.
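The cyclic sum in Definition 4.4.36 can be evaluated directly for single-valued maps on $\mathbb{R}^2$. The following is a minimal sketch, not part of the original text: the helper names `cyclic_sum`, `identity` and `rotation` are illustrative assumptions. It compares the identity map (the gradient of $x \mapsto \tfrac{1}{2}\|x\|^2$) with the rotation by 90 degrees, which is monotone but gives a negative sum on a three-point cycle.

```python
from typing import Callable, Sequence, Tuple

Vec = Tuple[float, float]


def dot(u: Vec, v: Vec) -> float:
    return u[0] * v[0] + u[1] * v[1]


def cyclic_sum(A: Callable[[Vec], Vec], points: Sequence[Vec]) -> float:
    """Return sum_k <A(x_k), x_k - x_{k+1}> over the closed cycle x_0, ..., x_n, x_0."""
    total = 0.0
    m = len(points)
    for k in range(m):
        x_k = points[k]
        x_next = points[(k + 1) % m]  # x_{n+1} = x_0 closes the cycle
        diff = (x_k[0] - x_next[0], x_k[1] - x_next[1])
        total += dot(A(x_k), diff)
    return total


def identity(x: Vec) -> Vec:
    # gradient of the convex function x -> |x|^2 / 2
    return x


def rotation(x: Vec) -> Vec:
    # rotation by 90 degrees: monotone, since <Jx, x> = 0 for all x
    return (-x[1], x[0])


pair = [(1.0, 0.0), (0.0, 1.0)]
triple = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]

print(cyclic_sum(identity, triple))  # nonnegative
print(cyclic_sum(rotation, pair))    # zero: no violation on two-point cycles
print(cyclic_sum(rotation, triple))  # negative: cyclic monotonicity fails
```

Under these assumptions the rotation passes every two-point test yet fails on the three-point cycle, so monotonicity alone does not give cyclic monotonicity in dimension two.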


REMARK 4.4.37 Clearly a 2-cyclically monotone map is monotone. So every cyclically monotone map is monotone.

PROPOSITION 4.4.38 Every monotone map $f : \mathbb{R} \to 2^{\mathbb{R}}$ is cyclically monotone.

PROOF Let $x_0, x_1, \dots, x_n \in D(f)$ and
$$x_k^* \in f(x_k) \quad \forall\, k \in \{0, 1, \dots, n\}.$$
We may assume that $x_k \le x_{k+1}$ for all $k \in \{0, 1, \dots, n-1\}$. Then $x_k^* \le x_{k+1}^*$ for all $k \in \{0, 1, \dots, n-1\}$ and we have
$$\sum_{k=0}^{n} x_k^*(x_k - x_{k+1}) = \sum_{k=0}^{n-1} x_k^*(x_k - x_{k+1}) + x_n^*(x_n - x_0) = \sum_{k=0}^{n-1} (x_k^* - x_n^*)(x_k - x_{k+1}) \ge 0$$
(recall that $x_{n+1} = x_0$; the last equality uses $x_n - x_0 = -\sum_{k=0}^{n-1}(x_k - x_{k+1})$, and each summand is a product of two nonpositive factors).

Directly from the definition, we see that if $\varphi : X \to \overline{\mathbb{R}}$ is a proper, convex function, then $\partial\varphi$ is cyclically monotone. Moreover, if $\varphi \in \Gamma_0(X)$, then by virtue of Theorem 4.4.34 and Remark 4.4.35, we see that $\partial\varphi$ is maximal cyclically monotone. It turns out that subdifferentials are the only maximal cyclically monotone maps.

THEOREM 4.4.39 If $X$ is a Banach space, then a map $A : X \to 2^{X^*}$ is maximal cyclically monotone if and only if there exists $\varphi \in \Gamma_0(X)$, such that $A = \partial\varphi$.

PROOF

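"$\Longrightarrow$": Let us fix $(x_0, x_0^*) \in \operatorname{Gr} A$ and for every $x \in X$ we define
$$\varphi(x) \overset{df}{=} \sup\Big\{ \sum_{k=0}^{n-1} \langle x_k^*, x_{k+1} - x_k\rangle_X + \langle x_n^*, x - x_n\rangle_X : (x_k, x_k^*) \in \operatorname{Gr} A,\ k \in \{1, \dots, n\},\ n \ge 1 \Big\}.$$
Because $\varphi$ is the supremum of continuous affine functions, it follows that $\varphi$ is convex and lower semicontinuous. Moreover, since
$$\sum_{k=0}^{n} \langle x_k^*, x_{k+1} - x_k\rangle_X \le 0 \quad \text{whenever } x_{n+1} = x_0$$
(due to the cyclical monotonicity of $A$), we have $\varphi(x_0) \le 0$ and it follows that $\varphi$ is proper, that is $\varphi \in \Gamma_0(X)$. Let $(x, x^*) \in \operatorname{Gr} A$ and $y \in X$. Since in the definition of $\varphi$, $n \ge 1$ is arbitrary, we have
$$\varphi(y) \ge \sum_{k=0}^{n-1} \langle x_k^*, x_{k+1} - x_k\rangle_X + \langle x_n^*, x - x_n\rangle_X + \langle x^*, y - x\rangle_X$$
(i.e., we have added the point $(x, x^*) \in \operatorname{Gr} A$ in the definition of $\varphi$). Taking the supremum over all such chains, we obtain
$$\varphi(y) \ge \varphi(x) + \langle x^*, y - x\rangle_X \quad \forall\, y \in X,$$
so $x^* \in \partial\varphi(x)$. Since $(x, x^*) \in \operatorname{Gr} A$ was arbitrary, we infer that $\operatorname{Gr} A \subseteq \operatorname{Gr} \partial\varphi$. Due to the maximality of $A$ (recall that $\partial\varphi$ is cyclically monotone), we conclude that $\operatorname{Gr} A = \operatorname{Gr} \partial\varphi$, hence $A = \partial\varphi$.

"$\Longleftarrow$": See the remark before the statement of the theorem.

REMARK 4.4.40 In fact it can be shown that $\varphi \in \Gamma_0(X)$ is unique up to an additive constant; see Rockafellar (1970b).

COROLLARY 4.4.41 Any maximal monotone map $f : \mathbb{R} \to 2^{\mathbb{R}}$ has the form
$$f(x) = \big[g'_-(x),\ g'_+(x)\big],$$
with $g \in \Gamma_0(\mathbb{R})$.

We can use Theorem 4.4.39 to characterize self-adjoint positive operators in a Hilbert space.

PROPOSITION 4.4.42 If $H$ is a Hilbert space and $A : H \supseteq D(A) \to H$ is a linear maximal monotone operator, then $A$ is maximal cyclically monotone if and only if $A$ is self-adjoint.

PROOF "$\Longrightarrow$": By virtue of Theorem 4.4.39, we can find $\varphi \in \Gamma_0(H)$, such that $A = \partial\varphi$. Because $A(0) = 0$ and using Remark 4.4.40, we may assume that $\varphi(0) = 0$. For $x \in D(A)$, let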

$$g(t) \overset{df}{=} \varphi(tx) \quad \forall\, t \in [0,1].$$
From Proposition 4.4.33, we have that
$$\partial g(t) = \big(\partial\varphi(tx), x\big)_H.$$
Using the definition of subdifferential, we infer that
$$|g(t) - g(s)| \le \big(A(x), x\big)_H\, |t - s| \quad \forall\, t, s \in [0,1]$$
and so $g$ is differentiable for almost all $t \in [0,1]$ and
$$\frac{d}{dt} g(t) = t\, \big(A(x), x\big)_H.$$
Then we have
$$g(1) - g(0) = \int_0^1 g'(t)\, dt = \Big(\int_0^1 t\, dt\Big) \big(A(x), x\big)_H = \tfrac{1}{2}\big(A(x), x\big)_H,$$
so
$$\varphi(x) = \tfrac{1}{2}\big(A(x), x\big)_H \quad \forall\, x \in D(A).$$
Then via an easy calculation, we obtain that
$$\big(\partial\varphi(x), y\big)_H = \tfrac{1}{2}\Big[\big(A(x), y\big)_H + \big(x, A(y)\big)_H\Big] \quad \forall\, x, y \in D(A).$$
Since
$$\big(\partial\varphi(x), y\big)_H = \big(A(x), y\big)_H,$$
we obtain
$$\big(A(x), y\big)_H = \big(x, A(y)\big)_H \quad \forall\, x, y \in D(A)$$
and so $A \subseteq A^*$. But $A^*$ is monotone (see Theorem 3.2.58) and $A$ is maximal. Therefore $A = A^*$ and we conclude that $A$ is self-adjoint.

"$\Longleftarrow$": Since $A$ is self-adjoint, maximal monotone, there exists a square root $A^{1/2}$ of $A$ with the same properties (see Kato (1976, p. 281)). So $A^{1/2}$ is closed (see Theorem 3.2.58) and if we set
$$\varphi(x) \overset{df}{=} \begin{cases} \tfrac{1}{2}\big\|A^{1/2}x\big\|_H^2 & \text{if } x \in D(A^{1/2}), \\ +\infty & \text{otherwise} \end{cases} \quad \forall\, x \in H,$$
then $\varphi \in \Gamma_0(H)$. Since
$$\big(A(x), y\big)_H = \big(A^{1/2}(x), A^{1/2}(y)\big)_H \quad \forall\, (x, y) \in D(A)\times D(A^{1/2}),$$
we have
$$\big(A(x), y - x\big)_H \le \tfrac{1}{2}\big\|A^{1/2}(y)\big\|_H^2 - \tfrac{1}{2}\big\|A^{1/2}(x)\big\|_H^2 \quad \forall\, (x, y) \in D(A)\times D(A^{1/2}),$$
so $A(x) \in \partial\varphi(x)$, i.e., $A \subseteq \partial\varphi$. Because $A$ is maximal, we conclude that $A = \partial\varphi$.


Using Proposition 3.2.14, we obtain the following result.

PROPOSITION 4.4.43 If $X$ is a reflexive Banach space and $\varphi : X \to \mathbb{R}$ is a continuous, convex function, then $\partial\varphi : X \to 2^{X^*}\setminus\{\emptyset\}$ is upper semicontinuous from $X$ with the norm topology into $X^*$ with the weak topology.

REMARK 4.4.44 The result is true if $X$ is any Banach space. In this case $X^*$ is supplied with the $w^*$-topology. The proof remains the same.

DEFINITION 4.4.45 Let $Y$ and $Z$ be Hausdorff topological spaces and let $S : Y \supseteq D(S) \to 2^Z$ be a multifunction. A selection $f$ of $S$ is a single valued map $f : Y \to Z$, such that
$$f(y) \in S(y) \quad \forall\, y \in D(S).$$

In the next proposition, using selections of the subdifferential map, we characterize the Gâteaux and Fréchet differentiability of a convex function.

PROPOSITION 4.4.46 If $X$ is a Banach space, $U \subseteq X$ is a nonempty, open, convex set and $\varphi : U \to \mathbb{R}$ is a continuous, convex function, then $\varphi$ is Gâteaux (respectively Fréchet) differentiable at $x \in U$ if and only if there is a selection $f$ of the subdifferential map $\partial\varphi$ which is norm-to-weak$^*$ (respectively norm-to-norm) continuous at $x$.

An interesting consequence of Proposition 4.4.46 is that Fréchet differentiable, convex functions are necessarily $C^1$-functions.

COROLLARY 4.4.47 If $X$ is a Banach space, $U \subseteq X$ is a nonempty, open, convex set and $\varphi : U \to \mathbb{R}$ is a convex and Fréchet differentiable function, then the function $x \mapsto \varphi'_F(x)$ is norm-to-norm continuous from $U$ into $X^*$, i.e., $\varphi \in C^1(U)$.


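Another such result is given in the next proposition.

PROPOSITION 4.4.48 If $\varphi \in \Gamma_0(\mathbb{R}^N)$, $\varphi$ is strictly convex and
$$\frac{\varphi(x)}{\|x\|_{\mathbb{R}^N}} \to +\infty \quad \text{as } \|x\|_{\mathbb{R}^N} \to +\infty,$$
then $\varphi^* \in C^1(\mathbb{R}^N)$.

PROOF Without any loss of generality, we may assume that $0 \in \operatorname{dom}\varphi$ and $\varphi(0) = 0$. Fix $x^* \in \mathbb{R}^N$ and consider the function
$$\psi_{x^*}(x) \overset{df}{=} (x^*, x)_{\mathbb{R}^N} - \varphi(x).$$
Evidently $-\psi_{x^*} \in \Gamma_0(\mathbb{R}^N)$, it is strictly convex and
$$\psi_{x^*}(x) \to -\infty \quad \text{as } \|x\|_{\mathbb{R}^N} \to +\infty.$$
So by the Weierstrass theorem $\psi_{x^*}$ attains its maximum on $\mathbb{R}^N$ and the maximizer $x$ is unique. By Proposition 4.4.21 and Corollary 4.4.23, we have $\partial\varphi^*(x^*) = \{x\}$, i.e., $\partial\varphi^*$ is single-valued. The map $x^* \mapsto \partial\varphi^*(x^*)$ is closed. We show that it maps bounded sets to bounded sets, hence it is continuous. To this end let
$$\|x^*\|_{\mathbb{R}^N} \le r \quad \text{and} \quad x = \partial\varphi^*(x^*).$$
We have $x^* \in \partial\varphi(x)$ and so, for $x \ne 0$,
$$\frac{\varphi(x)}{\|x\|_{\mathbb{R}^N}} = \frac{\varphi(x) - \varphi(0)}{\|x\|_{\mathbb{R}^N}} \le \frac{(x^*, x)_{\mathbb{R}^N}}{\|x\|_{\mathbb{R}^N}} \le \|x^*\|_{\mathbb{R}^N} \le r;$$
thus, by the growth hypothesis on $\varphi$, the set $\{\partial\varphi^*(x^*) : \|x^*\|_{\mathbb{R}^N} \le r\}$ is bounded, i.e., $\partial\varphi^*$ is continuous. Finally let $x = \partial\varphi^*(x^*)$ and $x_{h^*} = \partial\varphi^*(x^* + h^*)$, for some $x^* \in \mathbb{R}^N$, $h^* \in \mathbb{R}^N\setminus\{0\}$. From the definition of the subdifferential, we have
$$0 \le \frac{\varphi^*(x^* + h^*) - \varphi^*(x^*) - (h^*, x)_{\mathbb{R}^N}}{\|h^*\|_{\mathbb{R}^N}} \le \frac{(h^*, x_{h^*} - x)_{\mathbb{R}^N}}{\|h^*\|_{\mathbb{R}^N}} \le \|x_{h^*} - x\|_{\mathbb{R}^N}.$$
From the continuity of $\partial\varphi^*$, we have
$$\|x_{h^*} - x\|_{\mathbb{R}^N} \to 0 \quad \text{as } h^* \to 0.$$
So $\varphi^*$ is differentiable at $x^* \in \mathbb{R}^N$ and the derivative is continuous, i.e., $\varphi^* \in C^1(\mathbb{R}^N)$.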
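As a one-dimensional illustration of Proposition 4.4.48, the sketch below takes the assumed example $\varphi(x) = x^4/4$ (strictly convex and superlinear), whose conjugate is $\varphi^*(y) = \tfrac{3}{4}|y|^{4/3}$ with continuous derivative $\operatorname{sign}(y)|y|^{1/3}$. The function names and the grid are illustrative choices; the supremum is only approximated by a maximum over a finite grid.

```python
import math


def phi(x: float) -> float:
    # strictly convex and superlinear: phi(x)/|x| -> +infinity
    return x ** 4 / 4.0


# grid on [-3, 3] with step 1e-3; wide enough for the slopes used below
GRID = [-3.0 + 1e-3 * i for i in range(6001)]


def conjugate(y: float) -> float:
    """Approximate phi*(y) = sup_x (y*x - phi(x)) by a maximum over GRID."""
    return max(y * x - phi(x) for x in GRID)


def conjugate_exact(y: float) -> float:
    # closed form for this phi: the maximizer is x = sign(y)|y|^(1/3)
    return 0.75 * abs(y) ** (4.0 / 3.0)


def conjugate_gradient(y: float) -> float:
    # (phi*)'(y) = sign(y)|y|^(1/3), continuous on R (also at y = 0)
    return math.copysign(abs(y) ** (1.0 / 3.0), y)


slopes = [-8.0 + i for i in range(17)]
max_err = max(abs(conjugate(y) - conjugate_exact(y)) for y in slopes)
print(max_err)
```

The derivative of $\varphi^*$ here is continuous but not Lipschitz near $0$, which matches the $C^1$ conclusion of the proposition without suggesting more regularity.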


Before passing to the nonconvex subdifferentials, let us mention a few things about the ε-subdifferential (or approximate subdifferential), which is a useful tool in convex analysis. Its definition results from an innocent looking perturbation of the original subdifferential (see Definition 4.4.19), which however leads to some remarkable properties that are different in nature from those of the "exact" subdifferential. The mathematical setting remains unchanged with $(X, X^*)$ being a dual system of Hausdorff locally convex spaces.

DEFINITION 4.4.49 Let $\varphi : X \to \overline{\mathbb{R}}$ be a proper function, $\varepsilon \ge 0$ and $x \in \operatorname{dom}\varphi$. The ε-subdifferential of $\varphi$ at $x$ is the set $\partial_\varepsilon\varphi(x)$ (possibly empty), defined by
$$\partial_\varepsilon\varphi(x) \overset{df}{=} \big\{ x^* \in X^* : \langle x^*, y - x\rangle_X - \varepsilon \le \varphi(y) - \varphi(x) \ \text{for all } y \in X \big\}.$$

REMARK 4.4.50 Equivalently we can say that $x^* \in \partial_\varepsilon\varphi(x)$ if and only if
$$\inf_X\, (\varphi - x^*) > -\infty \quad \text{and} \quad x \in \varepsilon\text{-}\operatorname{argmin}\, (\varphi - x^*),$$
with
$$\varepsilon\text{-}\operatorname{argmin}\, (\varphi - x^*) = \Big\{ y \in X : \varphi(y) - \langle x^*, y\rangle_X \le \inf_X\, (\varphi - x^*) + \varepsilon \Big\}.$$
Also $x^* \in \partial_\varepsilon\varphi(x)$ if and only if
$$\varphi(x) + \varphi^*(x^*) - \langle x^*, x\rangle_X \le \varepsilon.$$
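The two characterizations of $\partial_\varepsilon\varphi(x)$ can be compared numerically in a simple case. The sketch below assumes the example $\varphi(x) = x^2/2$ on $\mathbb{R}$, for which $\varphi^*(s) = s^2/2$ and the conjugate criterion reduces to $\tfrac{1}{2}(s - x)^2 \le \varepsilon$, that is $\partial_\varepsilon\varphi(x) = [x - \sqrt{2\varepsilon},\ x + \sqrt{2\varepsilon}]$. The function names are illustrative, and the test "for all $y$" is replaced by a finite grid, so it is only a heuristic check.

```python
import math


def phi(x: float) -> float:
    return 0.5 * x * x


def phi_conj(s: float) -> float:
    # the conjugate of x^2/2 is s^2/2
    return 0.5 * s * s


def in_eps_subdiff_by_conjugate(s: float, x: float, eps: float) -> bool:
    """Test phi(x) + phi*(s) - s*x <= eps."""
    return phi(x) + phi_conj(s) - s * x <= eps


def in_eps_subdiff_by_definition(s: float, x: float, eps: float) -> bool:
    """Test s*(y - x) - eps <= phi(y) - phi(x) for y on a grid in [-10, 10]."""
    grid = [-10.0 + 0.01 * i for i in range(2001)]
    return all(s * (y - x) - eps <= phi(y) - phi(x) for y in grid)


x, eps = 1.0, 0.5
radius = math.sqrt(2.0 * eps)  # here the eps-subdifferential is [x - radius, x + radius]
for s in (0.1, 1.9, 2.1, -0.1):
    print(s, in_eps_subdiff_by_conjugate(s, x, eps), in_eps_subdiff_by_definition(s, x, eps))
```

With $x = 1$ and $\varepsilon = 0.5$ the set is $[0, 2]$, although $\partial\varphi(1) = \{1\}$; the enlargement by $\sqrt{2\varepsilon}$ on each side shows how the perturbation changes the set even for a smooth function.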

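Geometrically the definition of $\partial_\varepsilon\varphi(x)$ says that the epigraph of the continuous affine function with slope $x^* \in \partial_\varepsilon\varphi(x)$ and passing through $\big(x, \varphi(x) - \varepsilon\big)$ contains the epigraph of $\varphi$. So for $\varepsilon > 0$, $\partial_\varepsilon\varphi(x)$ is a global notion (in contrast to $\partial\varphi(x)$ which is local), i.e., it may be sensitive to variations of $\varphi$ far away from $x$. When $\varepsilon = 0$, we recover Definition 4.4.19. The next proposition establishes the main difference between approximate and exact subdifferentials.

PROPOSITION 4.4.51 If $\varphi \in \Gamma_0(X)$, $\varepsilon > 0$ and $x \in \operatorname{dom}\varphi$, then $\partial_\varepsilon\varphi(x) \ne \emptyset$ and it is $w^*$-closed, convex.

PROOF From Theorem 4.4.14, we have
$$-\varphi(x) = \inf_{x^* \in X^*} \big[\varphi^*(x^*) - \langle x^*, x\rangle_X\big].$$
Let
$$\psi(x^*) \overset{df}{=} \varphi^*(x^*) - \langle x^*, x\rangle_X \quad \forall\, x^* \in X^*.$$
Let $x^* \in \varepsilon\text{-}\operatorname{argmin}\psi \ne \emptyset$. We have
$$\varphi^*(x^*) - \langle x^*, x\rangle_X \le \inf_{X^*} \psi + \varepsilon = -\varphi(x) + \varepsilon,$$
hence
$$\varphi(x) + \varphi^*(x^*) - \langle x^*, x\rangle_X \le \varepsilon,$$
which means that $x^* \in \partial_\varepsilon\varphi(x)$ (see Remark 4.4.50). So $\partial_\varepsilon\varphi(x) \ne \emptyset$ and clearly it is $w^*$-closed and convex.

In the study of the ε-subdifferentials (with $\varepsilon > 0$), the directional derivative (see Definition 4.4.17) is replaced by the following quantity.

DEFINITION 4.4.52 Let $\varphi : X \to \overline{\mathbb{R}}$ be a proper function, $\varepsilon > 0$ and $x \in \operatorname{dom}\varphi$. For every $h \in X$, we define
$$\varphi'_\varepsilon(x; h) \overset{df}{=} \inf_{\lambda > 0} \frac{\varphi(x + \lambda h) - \varphi(x) + \varepsilon}{\lambda}.$$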

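The next result is analogous to Proposition 4.4.27.

PROPOSITION 4.4.53 If $X$ is a Banach space, $\varphi \in \Gamma_0(X)$, $\varepsilon > 0$ and $x \in \operatorname{dom}\varphi$, then $\varphi'_\varepsilon(x; \cdot) = \sigma_{\partial_\varepsilon\varphi(x)}(\cdot)$.

Let us mention some basic calculus rules for the ε-subdifferential.

PROPOSITION 4.4.54 If $X$ and $Y$ are two Banach spaces, $A \in \mathcal{L}(X;Y)$, $\varphi \in \Gamma_0(Y)$, $\varepsilon > 0$ and $x \in A^{-1}(\operatorname{dom}\varphi)$, then
$$\partial_\varepsilon(\varphi\circ A)(x) = \overline{A^*\big(\partial_\varepsilon\varphi(A(x))\big)}^{\,w^*}.$$

PROOF Since $\varphi\circ A \in \Gamma_0(X)$, from Proposition 4.4.53, we have that
$$(\varphi\circ A)'_\varepsilon(x; h) = \sigma_{\partial_\varepsilon(\varphi\circ A)(x)}(h) \quad \forall\, h \in X.$$
We have
$$(\varphi\circ A)'_\varepsilon(x; h) = \inf_{\lambda > 0} \frac{\varphi(A(x) + \lambda A(h)) - \varphi(A(x)) + \varepsilon}{\lambda} = \varphi'_\varepsilon\big(A(x); A(h)\big).$$
On the other hand, using Proposition 4.4.53, for all $h \in X$, we have
$$\sigma_{A^*(\partial_\varepsilon\varphi(A(x)))}(h) = \sup_{y^* \in \partial_\varepsilon\varphi(A(x))} \langle A^*(y^*), h\rangle_X = \sup_{y^* \in \partial_\varepsilon\varphi(A(x))} \langle y^*, A(h)\rangle_Y = \varphi'_\varepsilon\big(A(x); A(h)\big).$$
So we conclude that
$$\sigma_{\partial_\varepsilon(\varphi\circ A)(x)}(h) = \sigma_{A^*(\partial_\varepsilon\varphi(A(x)))}(h) \quad \forall\, h \in X,$$
hence, since $\partial_\varepsilon(\varphi\circ A)(x)$ is $w^*$-closed and convex and $A^*\big(\partial_\varepsilon\varphi(A(x))\big)$ is convex, we obtain the conclusion of the proposition.

COROLLARY 4.4.55 If $X$ and $Y$ are two Banach spaces, $A \in \mathcal{L}(X;Y)$, $\varphi \in \Gamma_0(Y)$ and $x \in A^{-1}(\operatorname{dom}\varphi)$, then
$$\partial(\varphi\circ A)(x) = \bigcap_{\varepsilon > 0} \overline{A^*\big(\partial_\varepsilon\varphi(A(x))\big)}^{\,w^*}.$$
Moreover, if $X$ is reflexive, then we have
$$\partial(\varphi\circ A)(x) = \bigcap_{\varepsilon > 0} \overline{A^*\big(\partial_\varepsilon\varphi(A(x))\big)}.$$

PROOF Clearly
$$\partial(\varphi\circ A)(x) = \bigcap_{\varepsilon > 0} \partial_\varepsilon(\varphi\circ A)(x).$$
Applying Proposition 4.4.54, we obtain the first equality. For the second equality just note that if $X$ is reflexive, then
$$\overline{A^*\big(\partial_\varepsilon\varphi(A(x))\big)} = \overline{A^*\big(\partial_\varepsilon\varphi(A(x))\big)}^{\,w} = \overline{A^*\big(\partial_\varepsilon\varphi(A(x))\big)}^{\,w^*}.$$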

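COROLLARY 4.4.56 If $X$ is a Banach space, $\varphi, \psi \in \Gamma_0(X)$ and $x \in \operatorname{dom}\varphi \cap \operatorname{dom}\psi$, then
$$\partial(\varphi + \psi)(x) = \bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*}.$$
Moreover, if $X$ is reflexive, then
$$\partial(\varphi + \psi)(x) = \bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}.$$

PROOF Let
$$u(x, y) \overset{df}{=} \varphi(x) + \psi(y) \quad \forall\, (x, y) \in X\times X$$
and let $A \in \mathcal{L}(X; X\times X)$ be defined by
$$A(x) \overset{df}{=} (x, x) \quad \forall\, x \in X.$$
Then we see that
$$u\circ A = \varphi + \psi \quad \text{on } X.$$
We have
$$A^*(x^*, y^*) = x^* + y^* \quad \forall\, (x^*, y^*) \in X^*\times X^*$$
and
$$u^*(x^*, y^*) = \varphi^*(x^*) + \psi^*(y^*) \quad \forall\, (x^*, y^*) \in X^*\times X^*.$$
Let $(x, y) \in \operatorname{dom} u$, $\varepsilon > 0$ and $(x^*, y^*) \in \partial_\varepsilon u(x, y)$. We have
$$\varphi(x) + \psi(y) + \varphi^*(x^*) + \psi^*(y^*) - \langle x^*, x\rangle_X - \langle y^*, y\rangle_X \le \varepsilon.$$
So there exist $\varepsilon_1, \varepsilon_2 \ge 0$, such that $\varepsilon_1 + \varepsilon_2 = \varepsilon$ and
$$x^* \in \partial_{\varepsilon_1}\varphi(x) \subseteq \partial_\varepsilon\varphi(x) \quad \text{and} \quad y^* \in \partial_{\varepsilon_2}\psi(y) \subseteq \partial_\varepsilon\psi(y).$$
Therefore, we infer that
$$\partial_\varepsilon u(x, y) \subseteq \partial_\varepsilon\varphi(x)\times\partial_\varepsilon\psi(y).$$
Using Corollary 4.4.55, we obtain
$$\partial(\varphi + \psi)(x) = \bigcap_{\varepsilon > 0} \overline{A^*\big(\partial_\varepsilon u(x, x)\big)}^{\,w^*} \subseteq \bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*}. \tag{4.29}$$
On the other hand note that
$$\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x) \subseteq \partial_{2\varepsilon}(\varphi + \psi)(x)$$
and so
$$\overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*} \subseteq \partial_{2\varepsilon}(\varphi + \psi)(x).$$
From this inclusion it follows that
$$\bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*} \subseteq \bigcap_{\varepsilon > 0} \partial_{2\varepsilon}(\varphi + \psi)(x) = \partial(\varphi + \psi)(x). \tag{4.30}$$
From (4.29) and (4.30), we conclude that
$$\partial(\varphi + \psi)(x) = \bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*}.$$
Again if $X$ is reflexive, then the norm closure and the weak closure coincide (due to the convexity of the set).

Using the second characterization of the ε-subdifferential, we can easily prove the following rule. We leave the details to the reader.

PROPOSITION 4.4.57 If $\varphi, \psi \in \Gamma_0(X)$, $\varepsilon > 0$ and there exists $x_0 \in \operatorname{dom}\varphi \cap \operatorname{dom}\psi$, such that $\varphi$ is continuous at $x_0$, then
$$\partial_\varepsilon(\varphi + \psi)(x) = \bigcup_{\substack{\varepsilon_1, \varepsilon_2 \ge 0 \\ \varepsilon_1 + \varepsilon_2 = \varepsilon}} \big[\partial_{\varepsilon_1}\varphi(x) + \partial_{\varepsilon_2}\psi(x)\big] \quad \forall\, x \in \operatorname{dom}\varphi \cap \operatorname{dom}\psi.$$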

Now we pass to a brief discussion of some nonconvex subdifferentials. Historically the first subdifferential defined for nonconvex functions is that for locally Lipschitz functions. The starting point for the introduction of such a subdifferential was Theorem 4.2.7 (namely, that a continuous, convex function is locally Lipschitz) and, when the underlying space is finite dimensional, Theorem 1.5.8 and Corollary 1.5.9 (Rademacher's theorem). The mathematical framework is a Banach space $X$ with $X^*$ its topological dual. Let us start by recalling the definition of a locally Lipschitz function, which is central in what follows.

DEFINITION 4.4.58 A function $\varphi : X \to \mathbb{R}$ is locally Lipschitz, if every point $x \in X$ admits a neighbourhood $U \subseteq X$ and a constant $k_U$ (depending on $U$), such that
$$|\varphi(y) - \varphi(z)| \le k_U \|y - z\|_X \quad \forall\, y, z \in U.$$

A locally Lipschitz function need not have directional derivatives in the sense of Definition 4.4.17. However, exploiting the local Lipschitz structure, we can define a generalized directional derivative as follows.

DEFINITION 4.4.59 Let $\varphi : X \to \mathbb{R}$ be a locally Lipschitz function. Then the generalized directional derivative of $\varphi$ at $x \in X$ in the direction $h \in X$ is defined by
$$\varphi^0(x; h) \overset{df}{=} \limsup_{\substack{x' \to x \\ \lambda \searrow 0}} \frac{\varphi(x' + \lambda h) - \varphi(x')}{\lambda}.$$


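The utility of $\varphi^0$ follows from some useful properties that it exhibits.

PROPOSITION 4.4.60 If $\varphi : X \to \mathbb{R}$ is a locally Lipschitz function, then
(a) the function $h \mapsto \varphi^0(x; h)$ is sublinear and Lipschitz continuous for all $x \in X$;
(b) the function $(x, h) \mapsto \varphi^0(x; h)$ is upper semicontinuous on $X\times X$;
(c) $\varphi^0(x; -h) = (-\varphi)^0(x; h)$.

PROOF (a) Clearly $\varphi^0(x; \cdot)$ is positively homogeneous. Also let $h_1, h_2 \in X$. We have
$$\begin{aligned}
\varphi^0(x; h_1 + h_2) &= \limsup_{\substack{x' \to x \\ \lambda \searrow 0}} \frac{\varphi(x' + \lambda(h_1 + h_2)) - \varphi(x')}{\lambda} \\
&= \limsup_{\substack{x' \to x \\ \lambda \searrow 0}} \frac{\varphi(x' + \lambda h_1 + \lambda h_2) - \varphi(x' + \lambda h_2) + \varphi(x' + \lambda h_2) - \varphi(x')}{\lambda} \\
&\le \limsup_{\substack{x' \to x \\ \lambda \searrow 0}} \frac{\varphi(x' + \lambda h_1 + \lambda h_2) - \varphi(x' + \lambda h_2)}{\lambda} + \limsup_{\substack{x' \to x \\ \lambda \searrow 0}} \frac{\varphi(x' + \lambda h_2) - \varphi(x')}{\lambda} \\
&= \varphi^0(x; h_1) + \varphi^0(x; h_2).
\end{aligned}$$
So we have proved that $\varphi^0(x; \cdot)$ is sublinear. Exploiting the local Lipschitzness of $\varphi$, we see that for all $x' \in X$ near $x \in X$ and for all $\lambda > 0$ near zero, we have
$$\frac{\varphi(x' + \lambda h) - \varphi(x')}{\lambda} \le k\|h\|_X \quad \forall\, h \in X,$$
so
$$\varphi^0(x; h) \le k\|h\|_X \quad \forall\, h \in X$$
and due to the sublinearity of $\varphi^0(x; \cdot)$, we have
$$|\varphi^0(x; h)| \le k\|h\|_X \quad \forall\, h \in X,$$
so finally we deduce that $\varphi^0(x; \cdot)$ is Lipschitz continuous.

(b) Let
$$(x_n, h_n) \to (x, h) \quad \text{in } X\times X.$$
From the definition of $\varphi^0(x_n; h_n)$, we know that for every $n \ge 1$, we can find $v_n \in X$ and $\lambda_n \in (0, 1)$, such that
$$\|v_n\|_X + \lambda_n \le \frac{1}{n}$$
and
$$\varphi^0(x_n; h_n) \le \frac{\varphi(x_n + v_n + \lambda_n h_n) - \varphi(x_n + v_n)}{\lambda_n} + \frac{1}{n}.$$
By the local Lipschitz property, replacing $h_n$ by $h$ in the quotient changes it by at most $k\|h_n - h\|_X \to 0$, and $x_n + v_n \to x$, so
$$\limsup_{n \to +\infty} \varphi^0(x_n; h_n) \le \varphi^0(x; h),$$
i.e., the function $(x, h) \mapsto \varphi^0(x; h)$ is upper semicontinuous.

(c) By definition, we have
$$\varphi^0(x; -h) = \limsup_{\substack{x' \to x \\ \lambda \searrow 0}} \frac{\varphi(x' - \lambda h) - \varphi(x')}{\lambda} = \limsup_{\substack{y \to x \\ \lambda \searrow 0}} \frac{(-\varphi)(y + \lambda h) - (-\varphi)(y)}{\lambda} = (-\varphi)^0(x; h)$$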

(with $y = x' - \lambda h$).

These properties lead to the following definition.

DEFINITION 4.4.61 Let $\varphi : X \to \mathbb{R}$ be a locally Lipschitz function. The generalized subdifferential (or Clarke subdifferential) of $\varphi$ at $x$ is defined by
$$\partial\varphi(x) \overset{df}{=} \big\{ x^* \in X^* : \langle x^*, h\rangle_X \le \varphi^0(x; h) \ \text{for all } h \in X \big\}.$$
The elements of $\partial\varphi(x)$ are called generalized gradients.

PROPOSITION 4.4.62 If $\varphi : X \to \mathbb{R}$ is a locally Lipschitz function, then for every $x \in X$, the set $\partial\varphi(x) \subseteq X^*$ is nonempty, convex and $w^*$-compact, the multifunction $\partial\varphi : X \to 2^{X^*}\setminus\{\emptyset\}$ is upper semicontinuous from $X$ with the norm topology into $X^*$ with the $w^*$-topology (see Definition 3.2.12) and
$$\varphi^0(x; h) = \sigma_{\partial\varphi(x)}(h) \quad \forall\, (x, h) \in X\times X.$$

PROOF Since by Proposition 4.4.60(a), $\varphi^0(x; \cdot)$ is sublinear, the Hahn-Banach theorem implies that it has a continuous linear minorant. Therefore $\partial\varphi(x) \ne \emptyset$. Clearly the set is convex and by virtue of Proposition 4.4.60(a), it is also closed and bounded, hence $w^*$-compact (by Alaoglu's theorem; see Theorem A.3.9). To show the upper semicontinuity, let $C \subseteq X^*$ be a $w^*$-closed set and let
$$\{x_n\}_{n \ge 1} \subseteq (\partial\varphi)^-(C) = \big\{ x \in X : \partial\varphi(x) \cap C \ne \emptyset \big\}$$
be a sequence, such that
$$x_n \to x \quad \text{in } X.$$
Let us take
$$x_n^* \in \partial\varphi(x_n) \cap C \quad \forall\, n \ge 1.$$
Because of Proposition 4.4.60(a), the sequence $\{x_n^*\}_{n \ge 1}$ is bounded in $X^*$. So by Alaoglu's theorem (see Theorem A.3.9), we can find a subnet $\{x_\alpha^*\}_{\alpha \in J}$ of $\{x_n^*\}_{n \ge 1}$, such that
$$x_\alpha^* \xrightarrow{\ w^*\ } x^*.$$
We have
$$\langle x_\alpha^*, h\rangle_X \le \varphi^0(x_\alpha; h) \quad \forall\, h \in X.$$
Taking the limit with respect to $\alpha \in J$ and using Proposition 4.4.60(b), we obtain
$$\langle x^*, h\rangle_X \le \varphi^0(x; h) \quad \forall\, h \in X,$$
so $x^* \in \partial\varphi(x)$. Also $x^* \in C$, since $C \subseteq X^*$ is $w^*$-closed. Therefore $x^* \in \partial\varphi(x) \cap C$, hence $x \in (\partial\varphi)^-(C)$, which proves the upper semicontinuity of the multifunction. Finally, using once more the Hahn-Banach theorem, for every $h_0 \in X$, we can find $x_0^* \in \partial\varphi(x)$, such that
$$\langle x_0^*, h\rangle_X \le \varphi^0(x; h) \quad \forall\, h \in X$$
and
$$\langle x_0^*, h_0\rangle_X = \varphi^0(x; h_0).$$
Therefore $\varphi^0(x; \cdot) = \sigma_{\partial\varphi(x)}(\cdot)$.

PROPOSITION 4.4.63 Let $\varphi : X \to \mathbb{R}$ be a locally Lipschitz function.
(a) If $\varphi$ is Gâteaux differentiable at $x \in X$, then $\varphi'_G(x) \in \partial\varphi(x)$.
(b) If $\varphi \in C^1(X)$, then $\partial\varphi(x) = \{\varphi'_F(x)\}$ for all $x \in X$.
(c) If $\varphi$ is also convex, then the convex and generalized subdifferentials of $\varphi$ coincide.

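PROOF (a) From the definition of $\varphi^0(x; \cdot)$, we have
$$\langle \varphi'_G(x), h\rangle_X \le \varphi^0(x; h) \quad \forall\, h \in X.$$
So $\varphi'_G(x) \in \partial\varphi(x)$.

(b) If $\varphi \in C^1(X)$, then
$$\varphi^0(x; h) = \langle \varphi'_F(x), h\rangle_X \quad \forall\, h \in X,$$
hence $\partial\varphi(x) = \{\varphi'_F(x)\}$.

(c) From the definition of $\varphi^0(x; h)$, we have
$$\varphi^0(x; h) = \lim_{\varepsilon \searrow 0}\ \sup_{\|x' - x\|_X \le \varepsilon\delta}\ \sup_{0 < \lambda < \varepsilon} \frac{\varphi(x' + \lambda h) - \varphi(x')}{\lambda},$$
with $\delta > 0$ arbitrary. Because $\varphi$ is convex, the map
$$\lambda \mapsto \frac{\varphi(x' + \lambda h) - \varphi(x')}{\lambda}$$
is increasing on $(0, +\infty)$. So we obtain
$$\varphi^0(x; h) = \lim_{\varepsilon \searrow 0}\ \sup_{\|x' - x\|_X \le \varepsilon\delta} \frac{\varphi(x' + \varepsilon h) - \varphi(x')}{\varepsilon}.$$
From the local Lipschitz property of $\varphi$, we have
$$\left| \frac{\varphi(x' + \varepsilon h) - \varphi(x')}{\varepsilon} - \frac{\varphi(x + \varepsilon h) - \varphi(x)}{\varepsilon} \right| \le 2\delta k \quad \forall\, x' \in x + \varepsilon\delta\overline{B}_1,$$
with $k > 0$, so
$$\varphi^0(x; h) \le \lim_{\varepsilon \searrow 0} \frac{\varphi(x + \varepsilon h) - \varphi(x)}{\varepsilon} + 2\delta k.$$
Since $\delta > 0$ was arbitrary, we obtain
$$\varphi^0(x; h) \le \varphi'(x; h) \quad \forall\, h \in X.$$
Because the opposite inequality is always true, we conclude that $\varphi^0(x; \cdot) = \varphi'(x; \cdot)$. Then by virtue of Proposition 4.4.27 and Definition 4.4.61, we conclude that the two subdifferentials coincide.

EXAMPLE 4.4.64 If $\varphi$ is differentiable but not $C^1$, then $\partial\varphi(x)$ need not be a singleton. To see this consider the function $\varphi : \mathbb{R} \to \mathbb{R}$, defined by
$$\varphi(x) \overset{df}{=} \begin{cases} x^2 \sin\frac{1}{x} & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$
Then $\varphi$ is Lipschitz continuous on $[-1, 1]$, $\varphi'(0) = 0$ but $\varphi'$ is not continuous at $x = 0$. A straightforward calculation shows that $\varphi^0(0; h) = |h|$, hence $\partial\varphi(0) = [-1, 1]$ and so it is not a singleton.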


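When $X = \mathbb{R}^N$, we can use Corollary 1.5.9 (Rademacher's theorem) to give a description of the generalized subdifferential which is less abstract and formal than Definition 4.4.61, which is stated in terms of the generalized directional derivative. The new description is more geometric.

THEOREM 4.4.65 If $\varphi : \mathbb{R}^N \to \mathbb{R}$ is a locally Lipschitz function and $E$ is any Lebesgue-null set in $\mathbb{R}^N$, then
$$\partial\varphi(x) = \operatorname{conv}\Big\{ \lim_{n \to +\infty} \nabla\varphi(x_n) : x_n \to x,\ x_n \notin E \cup D_\varphi^c \Big\},$$
where $D_\varphi^c \subseteq \mathbb{R}^N$ is the Lebesgue-null set where $\varphi$ fails to be differentiable (due to Rademacher's theorem).

PROOF Since, by Proposition 4.4.63(a), we have
$$\nabla\varphi(x_n) \in \partial\varphi(x_n) \quad \forall\, n \ge 1$$
and $\partial\varphi$ is locally bounded (see Proposition 4.4.60(a)), we see that the sequence $\{\nabla\varphi(x_n)\}_{n \ge 1}$ has a convergent subsequence. Then Proposition 4.4.62 implies that the limit of the subsequence belongs to $\partial\varphi(x)$. Therefore we have
$$\operatorname{conv}\Big\{ \lim_{n \to +\infty} \nabla\varphi(x_n) : x_n \to x,\ x_n \notin E \cup D_\varphi^c \Big\} \subseteq \partial\varphi(x). \tag{4.31}$$
On the other hand let
$$\xi_h = \limsup_{\substack{y \to x \\ y \notin E \cup D_\varphi^c}} \big(\nabla\varphi(y), h\big)_{\mathbb{R}^N},$$
with $h \ne 0$. For a given $\varepsilon > 0$, we can find $\delta = \delta(\varepsilon) > 0$, such that
$$\big(\nabla\varphi(y), h\big)_{\mathbb{R}^N} \le \xi_h + \varepsilon \quad \forall\, y \in x + \delta\overline{B}_1(0),\ y \notin E \cup D_\varphi^c.$$
For $t \in \Big(0, \dfrac{\delta}{2\|h\|_{\mathbb{R}^N}}\Big)$, we have
$$\varphi(y + th) - \varphi(y) \le t(\xi_h + \varepsilon) \quad \text{for a.a. } y \in x + \tfrac{\delta}{2}\overline{B}_1(0),$$
so
$$\varphi^0(x; h) \le \xi_h + \varepsilon.$$
Therefore the support function of the set
$$\operatorname{conv}\Big\{ \lim_{n \to +\infty} \nabla\varphi(x_n) : x_n \to x,\ x_n \notin E \cup D_\varphi^c \Big\}$$
majorizes $\varphi^0(x; \cdot)$. This combined with (4.31) finishes the proof of the theorem.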


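COROLLARY 4.4.66 If $\varphi : \mathbb{R}^N \to \mathbb{R}$ is a locally Lipschitz function, then
$$\varphi^0(x; h) = \limsup_{\substack{x' \to x \\ x' \notin E \cup D_\varphi^c}} \big(\nabla\varphi(x'), h\big)_{\mathbb{R}^N}$$
(with $E$ and $D_\varphi^c$ as in Theorem 4.4.65).

Finally let us state a few basic calculus rules for the generalized subdifferential.

PROPOSITION 4.4.67 If $\varphi : X \to \mathbb{R}$ is a locally Lipschitz function and $\lambda \in \mathbb{R}$, then $\partial(\lambda\varphi) = \lambda\partial\varphi$.

PROOF Evidently the result is true if $\lambda \ge 0$, since
$$(\lambda\varphi)^0 = \lambda\varphi^0.$$
So we need to consider the case $\lambda < 0$. We may assume that $\lambda = -1$. Then $x^* \in \partial(-\varphi)(x)$ if and only if
$$\langle x^*, h\rangle_X \le (-\varphi)^0(x; h) \quad \forall\, h \in X.$$
By Proposition 4.4.60(c), we have
$$(-\varphi)^0(x; h) = \varphi^0(x; -h).$$
So $x^* \in \partial(-\varphi)(x)$ if and only if
$$\langle x^*, h\rangle_X \le \varphi^0(x; -h) \quad \forall\, h \in X,$$
that is, $-x^* \in \partial\varphi(x)$. Thus finally we have that
$$x^* \in \partial(-\varphi)(x) \quad \text{if and only if} \quad x^* \in -\partial\varphi(x).$$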

An interesting consequence of this proposition is the following extension of Fermat's rule for local extrema.

PROPOSITION 4.4.68 If $\varphi : X \to \mathbb{R}$ is a locally Lipschitz function and has a local maximum or minimum at $x \in X$, then $0 \in \partial\varphi(x)$.

PROOF Since $\partial(-\varphi) = -\partial\varphi$, it suffices to prove the proposition for the case of a local minimum at $x \in X$. Then clearly
$$\varphi^0(x; h) \ge 0 \quad \forall\, h \in X.$$
Hence $0 \in \partial\varphi(x)$.

PROPOSITION 4.4.69 If $\varphi_k : X \to \mathbb{R}$ for $k \in \{1, \dots, n\}$ are locally Lipschitz functions, then
$$\partial\Big(\sum_{k=1}^{n} \varphi_k\Big) \subseteq \sum_{k=1}^{n} \partial\varphi_k,$$
i.e., the generalized subdifferential is subadditive.

PROOF It suffices to prove the result for $n = 2$. The general case follows by induction. The support function of $\partial(\varphi_1 + \varphi_2)$ is $(\varphi_1 + \varphi_2)^0$ and the support function of $\partial\varphi_1 + \partial\varphi_2$ is $\varphi_1^0 + \varphi_2^0$. Also note that $\partial\varphi_1(x) + \partial\varphi_2(x)$ is convex and $w^*$-compact. Since
$$(\varphi_1 + \varphi_2)^0(x; \cdot) \le \varphi_1^0(x; \cdot) + \varphi_2^0(x; \cdot),$$
we conclude that
$$\partial(\varphi_1 + \varphi_2)(x) \subseteq \partial\varphi_1(x) + \partial\varphi_2(x) \quad \forall\, x \in X.$$
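The inclusion in Proposition 4.4.69 can be strict. A standard example on $\mathbb{R}$ is $\varphi_1(t) = |t|$ and $\varphi_2(t) = -|t|$: the sum is identically zero, so $\partial(\varphi_1 + \varphi_2)(0) = \{0\}$, while $\partial\varphi_1(0) + \partial\varphi_2(0) = [-2, 2]$. The sketch below estimates the generalized directional derivatives numerically; `clarke_dd` is a hypothetical helper that replaces the lim sup of Definition 4.4.59 by a maximum over a finite grid, so it gives a heuristic value rather than an exact computation.

```python
from typing import Callable


def clarke_dd(f: Callable[[float], float], x: float, h: float,
              radius: float = 1e-3, n: int = 200) -> float:
    """Crude estimate of the generalized directional derivative f^0(x; h) on R.

    The lim sup over x' -> x and lambda -> 0+ is replaced by a maximum of
    difference quotients over a finite grid, so this is only a heuristic.
    """
    best = -float("inf")
    for i in range(-n, n + 1):
        base = x + radius * i / n              # base point x' near x
        for j in range(1, n + 1):
            lam = radius * j / n               # step lambda in (0, radius]
            quotient = (f(base + lam * h) - f(base)) / lam
            best = max(best, quotient)
    return best


def phi1(t: float) -> float:
    return abs(t)


def phi2(t: float) -> float:
    return -abs(t)


def total(t: float) -> float:
    return phi1(t) + phi2(t)  # identically zero


d1 = clarke_dd(phi1, 0.0, 1.0)
d2 = clarke_dd(phi2, 0.0, 1.0)
d12 = clarke_dd(total, 0.0, 1.0)
print(d1, d2, d12)
```

In this example the estimates are close to $1$, $1$ and $0$, so $(\varphi_1 + \varphi_2)^0(0; 1) < \varphi_1^0(0; 1) + \varphi_2^0(0; 1)$, in line with the strict inclusion of the subdifferentials.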

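COROLLARY 4.4.70 If $\varphi_k : X \to \mathbb{R}$ for $k \in \{1, \dots, n\}$ are locally Lipschitz functions and all but one are $C^1$-functions, then
$$\partial\Big(\sum_{k=1}^{n} \varphi_k\Big) = \sum_{k=1}^{n} \partial\varphi_k.$$

COROLLARY 4.4.71 If $\varphi_k : X \to \mathbb{R}$ are locally Lipschitz functions and $\lambda_k \in \mathbb{R}$ for $k \in \{1, \dots, n\}$, then
$$\partial\Big(\sum_{k=1}^{n} \lambda_k\varphi_k\Big) \subseteq \sum_{k=1}^{n} \lambda_k\partial\varphi_k$$
and equality holds if all but one of the functions are $C^1$-functions.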


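The following mean value theorem is a useful tool in many applications.

THEOREM 4.4.72 (Mean Value Theorem) If $U \subseteq X$ is an open set, $x, y \in X$, $[x, y] \subseteq U$, with
$$[x, y] \overset{df}{=} \big\{ \lambda x + (1 - \lambda)y : \lambda \in [0, 1] \big\}$$
and $\varphi : U \to \mathbb{R}$ is locally Lipschitz, then there exist $u \in (x, y)$, with
$$(x, y) \overset{df}{=} \big\{ \lambda x + (1 - \lambda)y : \lambda \in (0, 1) \big\}$$
and $u^* \in \partial\varphi(u)$, such that
$$\varphi(y) - \varphi(x) = \langle u^*, y - x\rangle_X.$$

PROOF Let
$$x_\lambda \overset{df}{=} x + \lambda(y - x)$$
and consider the function $f : [0, 1] \to \mathbb{R}$, defined by
$$f(\lambda) \overset{df}{=} \varphi(x_\lambda) \quad \forall\, \lambda \in [0, 1].$$
Clearly $f$ is Lipschitz continuous on $[0, 1]$. We claim that
$$\partial f(\lambda) \subseteq \big\{ \langle u^*, y - x\rangle_X : u^* \in \partial\varphi(x_\lambda) \big\} \quad \forall\, \lambda \in (0, 1).$$
Since both sets are closed, convex, it suffices to show that
$$\sigma_{\partial f(\lambda)}(\pm 1) \le \max_{u^* \in \partial\varphi(x_\lambda)} \pm\langle u^*, y - x\rangle_X. \tag{4.32}$$
To this end, for $h = \pm 1$, we have
$$\begin{aligned}
f^0(\lambda; h) &= \limsup_{\substack{\lambda' \to \lambda \\ t \searrow 0}} \frac{f(\lambda' + th) - f(\lambda')}{t} \\
&= \limsup_{\substack{\lambda' \to \lambda \\ t \searrow 0}} \frac{\varphi(x + (\lambda' + th)(y - x)) - \varphi(x + \lambda'(y - x))}{t} \\
&\le \limsup_{\substack{z \to x_\lambda \\ t \searrow 0}} \frac{\varphi(z + th(y - x)) - \varphi(z)}{t} \\
&= \varphi^0\big(x_\lambda; h(y - x)\big) = \sup_{u^* \in \partial\varphi(x_\lambda)} \langle u^*, h(y - x)\rangle_X.
\end{aligned} \tag{4.33}$$
From (4.33), we obtain (4.32).

Now let
$$\xi(\lambda) \overset{df}{=} \varphi(x_\lambda) + \lambda\big(\varphi(x) - \varphi(y)\big) \quad \forall\, \lambda \in [0, 1].$$
We have $\xi(0) = \xi(1) = \varphi(x)$ and so we can find $\lambda \in (0, 1)$ at which $\xi$ attains a local extremum. Then by Proposition 4.4.68, we have that $0 \in \partial\xi(\lambda)$, which via Propositions 4.4.67 and 4.4.69 and (4.32) implies that
$$\varphi(y) - \varphi(x) \in \langle \partial\varphi(u), y - x\rangle_X,$$
with $u = x_\lambda$.

The generalized subdifferential has a remarkable calculus which makes it very useful in applications. We mention only two rules which arise in applications. For their proofs and additional results in this direction we refer to Clarke (1983).

PROPOSITION 4.4.73 (Chain Rule) If $X$ and $Y$ are two Banach spaces, $h \in C^1(X; Y)$ and $\varphi : Y \to \mathbb{R}$ is a locally Lipschitz function, then
$$\partial(\varphi\circ h)(x) \subseteq \partial\varphi\big(h(x)\big)\circ h'_F(x) \quad \forall\, x \in X.$$

REMARK 4.4.74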

Note that $h'_F(x) \in \mathcal{L}(X; Y)$. Using its adjoint
$$\big(h'_F(x)\big)^* \in \mathcal{L}(Y^*; X^*)$$
we can equivalently rewrite the above chain rule as
$$\partial(\varphi\circ h)(x) \subseteq \big(h'_F(x)\big)^*\, \partial\varphi\big(h(x)\big) \quad \forall\, x \in X.$$

A useful consequence of the above chain rule is the following result.

COROLLARY 4.4.75 If $X$ and $Y$ are two Banach spaces, $X$ is embedded continuously and densely in $Y$, $\varphi : Y \to \mathbb{R}$ is a locally Lipschitz function and $\widehat{\varphi} \overset{df}{=} \varphi|_X$, then
$$\partial\widehat{\varphi}(x) = \partial\varphi(x) \quad \forall\, x \in X,$$
which means that every element in $\partial\widehat{\varphi}(x)$ admits a unique extension to an element of $\partial\varphi(x)$.


PROPOSITION 4.4.76 (Nonsmooth Lagrange Rule) If $X$ is a Banach space, $\varphi, f : X \to \mathbb{R}$ are two locally Lipschitz functions and $x$ is a local solution of the problem
$$\inf_{f(x) \le 0} \varphi(x),$$
then there exist $\lambda_0, \lambda_1 \ge 0$, not both zero, such that
$$0 \in \lambda_0\partial\varphi(x) + \lambda_1\partial f(x) \quad \text{and} \quad 0 = \lambda_1 f(x).$$

REMARK 4.4.77 If
$$m(b) \overset{df}{=} \inf_{f(x) \le b} \varphi(x),$$
$m(0)$ is finite and
$$\liminf_{b \to 0} \frac{m(b) - m(0)}{|b|} > -\infty$$
(calmness condition), then $\lambda_0 > 0$ and so by normalization we can assume that $\lambda_0 = 1$.

Let us conclude this section by mentioning some more nonconvex subdifferentials. This can be done using the following parent notion.

DEFINITION 4.4.78 Let $X$ be a Banach space. A bornology is a collection $\mathcal{B}$ of bounded, symmetric (with respect to the origin) subsets of $X$, whose union is $X$.

REMARK 4.4.79 By taking the collection of all finite symmetric sets, we have the so-called Gâteaux bornology denoted by $\mathcal{B}_G$. Similarly if the collection consists of all bounded, symmetric sets, then we have the Fréchet bornology denoted by $\mathcal{B}_F$. Finally if we consider all symmetric, compact sets, then the resulting bornology is the so-called Hadamard bornology denoted by $\mathcal{B}_H$.

DEFINITION 4.4.80 Let $X$ be a Banach space and let $\mathcal{B}$ be a bornology in $X$.

(a) The norm of $X$ is said to be $\mathcal{B}$-smooth, if it is Gâteaux differentiable at every $x \in X\setminus\{0\}$ and the defining limit exists uniformly on members of $\mathcal{B}$.

(b) A function $\varphi : X \to \mathbb{R}$ is said to be $\mathcal{B}$-differentiable at $x \in X$ with $\mathcal{B}$-derivative $\varphi'_{\mathcal{B}}(x)$, if for every $C \in \mathcal{B}$, we have
$$\lim_{\lambda \to 0}\ \sup_{h \in C} \frac{\varphi(x + \lambda h) - \varphi(x) - \lambda\langle \varphi'_{\mathcal{B}}(x), h\rangle_X}{\lambda} = 0.$$

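(c) Let $\varphi : X \to \mathbb{R}^* = \mathbb{R} \cup \{\pm\infty\}$ be a lower semicontinuous function and suppose that $\varphi(x) \in \mathbb{R}$. We say that $\varphi$ is $\mathcal{B}$-subdifferentiable at $x$, if there exists $x^* \in X^*$, such that for every $\varepsilon > 0$ and every $C \in \mathcal{B}$, there exists $\delta > 0$ for which we have
$$\langle x^*, h\rangle_X \le \frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} + \varepsilon \quad \forall\, \lambda \in (0, \delta),\ h \in C.$$
The elements $x^* \in X^*$ are called $\mathcal{B}$-subderivatives of $\varphi$ at $x$ and the set of all $\mathcal{B}$-subderivatives is the $\mathcal{B}$-subdifferential of $\varphi$ at $x$ and it is denoted by $\partial_{\mathcal{B}}\varphi(x)$.

REMARK 4.4.81 If in Definition 4.4.80(b), $\mathcal{B} = \mathcal{B}_G$ (respectively $\mathcal{B} = \mathcal{B}_F$), then $\varphi'_{\mathcal{B}}(x) = \varphi'_G(x)$ (respectively $\varphi'_{\mathcal{B}}(x) = \varphi'_F(x)$). If in Definition 4.4.80(c), $\varphi$ is $\overline{\mathbb{R}}$-valued, proper, convex, then $\partial_{\mathcal{B}}\varphi(x) = \partial\varphi(x)$ (the convex subdifferential; see Definition 4.4.19). If in Definition 4.4.80(c), $\varphi$ is $\mathbb{R}$-valued, locally Lipschitz, then $\partial_{\mathcal{B}}\varphi(x) = \partial\varphi(x)$ (the generalized subdifferential; see Definition 4.4.61). Finally if in Definition 4.4.80(c), we reverse the inequality and replace $\varepsilon$ by $-\varepsilon$, we obtain the $\mathcal{B}$-superdifferential of $\varphi$ at $x$ denoted by $\partial^{\mathcal{B}}\varphi(x)$. Note that $\partial^{\mathcal{B}}(-\varphi)(x) = -\partial_{\mathcal{B}}\varphi(x)$. The elements of $\partial^{\mathcal{B}}\varphi(x)$ are called $\mathcal{B}$-superderivatives of $\varphi$ at $x$.

PROPOSITION 4.4.82 Let $X$ be a Banach space, $\mathcal{B}$ a bornology in $X$ and $\varphi : X \to \mathbb{R}^*$.

(a) If $\varphi$ is a lower semicontinuous function, $\varphi(x) \in \mathbb{R}$ and $\partial_{\mathcal{B}}\varphi(x)$ and $\partial^{\mathcal{B}}\varphi(x)$ are nonempty sets, then $\varphi$ is $\mathcal{B}$-differentiable at $x$ and $\partial_{\mathcal{B}}\varphi(x) = \{\varphi'_G(x)\}$.

(b) If $\varphi$ is a concave function which is continuous in a neighbourhood of $x$ and $\partial_{\mathcal{B}}\varphi(x) \ne \emptyset$, then $\varphi$ is $\mathcal{B}$-differentiable at $x$ and $\partial_{\mathcal{B}}\varphi(x) = \{\varphi'_{\mathcal{B}}(x)\}$.

(c) If the norm of $X$ is $\mathcal{B}$-smooth and
$$T \overset{df}{=} \Big\{ s : X \to \mathbb{R} : s(x) = \frac{1}{2}\sum_{n=1}^{\infty} \mu_n\|x - z_n\|_X^2,\ \text{where } \mu_n \ge 0,\ \sum_{n=1}^{\infty} \mu_n = 1,\ \text{and } z_n \to z \in X \Big\},$$
then the elements of $T$ are everywhere $\mathcal{B}$-differentiable.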


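PROOF (a) Let $x^* \in \partial_{\mathcal{B}}\varphi(x)$ and $y^* \in \partial^{\mathcal{B}}\varphi(x)$. Then for every $\varepsilon > 0$ and every $h \in X$, we have
$$\langle y^*, h\rangle_X - \langle x^*, h\rangle_X \le 2\varepsilon,$$
hence
$$x^* = y^*.$$
Denote the common value by $v^*$. Then clearly $v^* = \varphi'_{\mathcal{B}}(x) = \varphi'_G(x)$.

(b) The function $-\varphi$ is convex and continuous at $x \in X$. Then by Proposition 4.4.25 and Remark 4.4.81, we have
$$\partial(-\varphi)(x) = \partial_{\mathcal{B}}(-\varphi)(x) \ne \emptyset.$$
Since
$$\partial^{\mathcal{B}}\varphi(x) = -\partial_{\mathcal{B}}(-\varphi)(x),$$
we have that
$$\partial^{\mathcal{B}}\varphi(x) \ne \emptyset.$$
Also $\partial_{\mathcal{B}}\varphi(x) \ne \emptyset$ by hypothesis. So we can apply part (a) and obtain the claim.

(c) Let $x \in X$. The sequence $\{x - z_n\}_{n \ge 1}$ is bounded in $X$. By hypothesis the norm of $X$ is $\mathcal{B}_G$-smooth and so $X^*$ is strictly convex. Hence by Proposition 3.2.22, the duality map $\mathcal{F}$ is single valued. Then from the definition of $s \in T$, the series
$$\sum_{n=1}^{\infty} \mu_n\mathcal{F}(x - z_n)$$
is norm convergent in $X^*$ to some $x^* \in X^*$. It is straightforward to check that $x^* = s'_{\mathcal{B}}(x)$.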

Finally let us give some particular subdifferentials associated with bornologies.

4. Smooth and Nonsmooth Analysis and Variational Principles EXAMPLE 4.4.83

557

Let X be a Banach space and B a bornology on X.

(a) Let $\varphi : X \to \overline{\mathbb{R}}$ be a proper, lower semicontinuous function and $x \in \operatorname{dom}\varphi$. The viscosity $B$-subdifferential at $x$ is the set $\partial_B \varphi(x)$ of all $x^* \in X^*$, such that there is a $B$-differentiable function $f$, such that $\varphi - f$ attains a local minimum at $x$ and $x^* = f'_B(x)$.

(b) If in the above case we specialize the perturbation function $f$, we obtain the so-called proximal subdifferential $\partial_p \varphi(x)$ of $\varphi$ at $x$. Assuming that the norm of $X$ is Fréchet differentiable (off the origin), if

\[ f(y) \overset{\mathrm{df}}{=} \langle x^*, y - x \rangle_X - k \|y - x\|_X^2, \]

for some $k > 0$ in (a), then we obtain the proximal subdifferential $\partial_p \varphi(x)$ for a proper, lower semicontinuous function $\varphi : X \to \overline{\mathbb{R}}$. So $x^* \in \partial_p \varphi(x)$ if and only if for some $k > 0$ and all $y$ in a neighbourhood of $x$, we have

\[ \varphi(y) - \varphi(x) + k \|y - x\|_X^2 \ge \langle x^*, y - x \rangle_X. \]

This subdifferential is useful if $X$ is finite dimensional or if $X$ is a Hilbert space.

(c) Let $\varphi : X \to \overline{\mathbb{R}}$ be a proper, lower semicontinuous function and $x \in \operatorname{dom}\varphi$. The canonical $B$-subdifferential of $\varphi$ at $x$, denoted by $\partial^-_B \varphi(x)$, is the set of all $x^* \in X^*$, such that

\[ \liminf_{\lambda \to 0} \, \inf_{h \in C} \, \frac{1}{\lambda} \big[ \varphi(x + \lambda h) - \varphi(x) - \lambda \langle x^*, h \rangle_X \big] \ge 0 \qquad \forall\, C \in B. \]
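The proximal inequality in part (b) can be probed numerically in the simplest case $X = \mathbb{R}$. The following is only a sketch: the helper name `satisfies_proximal_inequality`, the grid, the radius and the tolerance are illustrative assumptions, and a grid test can refute membership in $\partial_p \varphi(x)$ but cannot prove it.

```python
def satisfies_proximal_inequality(phi, x, x_star, k, radius=0.5, samples=2001):
    """Grid test of phi(y) - phi(x) + k*(y - x)**2 >= x_star*(y - x) for |y - x| <= radius."""
    for i in range(samples):
        y = x - radius + 2.0 * radius * i / (samples - 1)
        lhs = phi(y) - phi(x) + k * (y - x) ** 2
        if lhs < x_star * (y - x) - 1e-12:  # small tolerance for rounding
            return False
    return True


# phi(y) = |y| at x = 0: the slope 0.5 passes, the slope 1.5 fails near the origin.
inside = satisfies_proximal_inequality(abs, 0.0, 0.5, k=1.0)
outside = satisfies_proximal_inequality(abs, 0.0, 1.5, k=1.0)

# phi(y) = -|y| at x = 0: the downward kink is not compensated by the
# quadratic term k*y**2 on this grid, even for a large k.
concave_kink = satisfies_proximal_inequality(lambda y: -abs(y), 0.0, 0.0, k=100.0)
```

For the convex function $|y|$ the test agrees with the convex subdifferential $[-1, 1]$ at the origin, while for $-|y|$ it fails for the sampled slope, in line with the remark below that the proximal subdifferential is the smallest of the subdifferentials considered.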

REMARK 4.4.84 Of all the subdifferentials the proximal is the smallest. The viscosity B-subdifferential is not greater than the corresponding canonical subdifferential. The Fréchet viscosity and canonical subdifferentials coincide, if there exists a Lipschitz continuous, Fréchet differentiable bump function (see Deville, Godefroy & Zizler (1993); recall that b : X −→ R is a bump function if it has a nonempty and bounded support). If B1 and B2 are two bornologies in X and B2 is finer than B1 (i.e., for every C1 ∈ B1 , we can find C2 ∈ B2 , such that C1 ⊆ C2 ), then any B2 -subdifferential is not greater than the corresponding B1 -subdifferential.

4.5 Integral Functionals and Subdifferentials

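Let $(\Omega, \Sigma, \mu)$ be a $\sigma$-finite measure space and let $X$ be a separable Banach space. In this section we describe the subdifferential theory of the integral functionals

\[ I_\varphi(u) = \int_\Omega \varphi\big(\omega, u(\omega)\big)\, d\mu, \]

where $\varphi : \Omega \times X \to \overline{\mathbb{R}} \overset{\mathrm{df}}{=} \mathbb{R} \cup \{+\infty\}$ is a normal integrand (see Definition 3.4.8) and $u : \Omega \to X$ belongs to some vector space of functions. We adopt the convention that $+\infty + (-\infty) = +\infty$ (i.e., we let $+\infty$ dominate over $-\infty$). Then for a normal integrand $\varphi : \Omega \times X \to \overline{\mathbb{R}}$, we define the integral functional $I_\varphi : L^1(\Omega; X) \to \mathbb{R}^*$, by

\[ I_\varphi(u) \overset{\mathrm{df}}{=} \begin{cases} \displaystyle\int_\Omega \varphi\big(\omega, u(\omega)\big)\, d\mu & \text{if } \displaystyle\int_\Omega \varphi\big(\omega, u(\omega)\big)^+ d\mu < +\infty, \\[1ex] +\infty & \text{if } \displaystyle\int_\Omega \varphi\big(\omega, u(\omega)\big)^+ d\mu = +\infty. \end{cases} \]

Similarly, if $\varphi^*$ is the conjugate of $\varphi(\omega, \cdot)$, we define the integral functional $I_{\varphi^*} : L^\infty(\Omega; X^*_{w^*}) \to \mathbb{R}^*$, by

\[ I_{\varphi^*}(v) \overset{\mathrm{df}}{=} \begin{cases} \displaystyle\int_\Omega \varphi^*\big(\omega, v(\omega)\big)\, d\mu & \text{if } \displaystyle\int_\Omega \varphi^*\big(\omega, v(\omega)\big)^+ d\mu < +\infty, \\[1ex] +\infty & \text{if } \displaystyle\int_\Omega \varphi^*\big(\omega, v(\omega)\big)^+ d\mu = +\infty. \end{cases} \]

Recall that

\[ \big(L^1(\Omega; X)\big)^* = L^\infty(\Omega; X^*_{w^*}) \]

(see Theorem 2.2.12; for the Banach space $L^\infty(\Omega; X^*_{w^*})$ see Definition 2.2.10; note that Theorem 2.2.12 was stated with $\mu$ being a finite measure, but the result is true for $\mu$ being $\sigma$-finite, see Ionescu-Tulcea & Ionescu-Tulcea (1969, p. 95)). The duality brackets for the pair $\big(L^1(\Omega; X), L^\infty(\Omega; X^*_{w^*})\big)$ are defined by

\[ \langle u, v \rangle_{L^1(\Omega;X)} \overset{\mathrm{df}}{=} \int_\Omega \langle v(\omega), u(\omega) \rangle_X \, d\mu \qquad \forall\, u \in L^1(\Omega; X),\ v \in L^\infty(\Omega; X^*_{w^*}). \]

We mention that the theory can be developed for integral functionals $I_\varphi$ defined on $L^p(\Omega; X)$ with $p \in (1, +\infty)$. In this case

\[ \big(L^p(\Omega; X)\big)^* = L^{p'}(\Omega; X^*_{w^*}) \]

with $\frac{1}{p} + \frac{1}{p'} = 1$ and if $X^*$ is separable, then

\[ \big(L^p(\Omega; X)\big)^* = L^{p'}(\Omega; X^*) \]

(see Theorem 2.2.9). First we need to know that the integral in the definition of $I_{\varphi^*}$ makes sense.

PROPOSITION 4.5.1 If $\varphi : \Omega \times X \to \overline{\mathbb{R}}$ is a normal integrand, then $\varphi^* : \Omega \times X^* \to \overline{\mathbb{R}}$ defined by

\[ \varphi^*(\omega, x^*) \overset{\mathrm{df}}{=} \sup_{x \in X} \big( \langle x^*, x \rangle_X - \varphi(\omega, x) \big) \]

is a convex normal integrand on $\Omega \times X^*_{w^*}$.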

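PROOF Consider the multifunction $E : \Omega \to 2^{X \times \mathbb{R}} \setminus \{\emptyset\}$, defined by

\[ E(\omega) \overset{\mathrm{df}}{=} \operatorname{epi} \varphi(\omega, \cdot) = \big\{ (x, \lambda) \in X \times \mathbb{R} : \varphi(\omega, x) \le \lambda \big\}. \]

Evidently $E$ has nonempty and closed values (by virtue of the normality of $\varphi(\omega, \cdot)$). Moreover, we have

\[ \operatorname{Gr} E = \big\{ (\omega, x, \lambda) \in \Omega \times X \times \mathbb{R} : \varphi(\omega, x) \le \lambda \big\} \in \Sigma \times \mathcal{B}(X) \times \mathcal{B}(\mathbb{R}) = \Sigma \times \mathcal{B}(X \times \mathbb{R}), \]

where $\mathcal{B}(Z)$ is the Borel $\sigma$-field of $Z$. So we can find two sequences $\{u_n : \Omega \to X\}_{n \ge 1}$ and $\{\lambda_n : \Omega \to \mathbb{R}\}_{n \ge 1}$ of $\Sigma$-measurable functions, such that

\[ E(\omega) = \overline{\big\{ \big(u_n(\omega), \lambda_n(\omega)\big) \big\}}_{n \ge 1} \qquad \forall\, \omega \in \Omega \]

(see Denkowski, Migórski & Papageorgiou (2003a, p. 433)). Note that

\[ \varphi^*(\omega, x^*) = \sup_{n \ge 1} \big[ \langle x^*, u_n(\omega) \rangle_X - \lambda_n(\omega) \big]. \]

Thus we conclude that the function $(\omega, x^*) \mapsto \varphi^*(\omega, x^*)$ is a convex normal integrand on $\Omega \times X^*_{w^*}$.

THEOREM 4.5.2
(a) If $I_\varphi : L^1(\Omega; X) \to \mathbb{R}^*$ is finite at $u_0 \in L^1(\Omega; X)$, then $(I_\varphi)^* = I_{\varphi^*}$.
(b) If $\varphi$ is a convex normal integrand (i.e., $\varphi(\omega, \cdot)$ is convex for $\mu$-almost all $\omega \in \Omega$) and $I_\varphi$ and $I_{\varphi^*}$ are finite at $u_0 \in L^1(\Omega; X)$ and $v_0 \in L^\infty(\Omega; X^*_{w^*})$ respectively, then $I_\varphi$ and $I_{\varphi^*}$ are proper, convex and lower semicontinuous functionals which are conjugate to each other.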
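The conjugacy asserted in Theorem 4.5.2 can be checked by hand on a toy model. The sketch below assumes a three-point measure space with weights `mu`, $X = \mathbb{R}$ and the quadratic integrand $\varphi(i, x) = \frac{1}{2} a_i x^2$, whose conjugate is $\varphi^*(i, y) = y^2 / (2 a_i)$; all names and numerical values are illustrative choices, not taken from the text.

```python
import random

mu = [0.5, 1.0, 2.0]   # assumed weights mu({i}) > 0 of the three atoms
a = [1.0, 2.0, 4.0]    # assumed curvatures: phi(i, x) = 0.5 * a[i] * x**2


def I_phi(u):
    """Integral functional of phi over the discrete measure space."""
    return sum(m * 0.5 * ai * ui ** 2 for m, ai, ui in zip(mu, a, u))


def I_phi_star(v):
    """Integral functional of the pointwise conjugate phi*(i, y) = y**2 / (2 * a[i])."""
    return sum(m * vi ** 2 / (2.0 * ai) for m, ai, vi in zip(mu, a, v))


def pairing(v, u):
    """Duality bracket <v, u> = sum_i mu_i * v_i * u_i."""
    return sum(m * vi * ui for m, vi, ui in zip(mu, v, u))


def fenchel_gap(v, u):
    """I_phi*(v) - (<v, u> - I_phi(u)); nonnegative, zero at the maximizing u."""
    return I_phi_star(v) - (pairing(v, u) - I_phi(u))


v = [1.0, -2.0, 3.0]
u_best = [vi / ai for vi, ai in zip(v, a)]  # pointwise maximizer of <v, u> - I_phi(u)
random.seed(0)
gaps = [fenchel_gap(v, [random.uniform(-10, 10) for _ in v]) for _ in range(1000)]
```

Here `fenchel_gap(v, u_best)` vanishes while every sampled gap is nonnegative, which is the statement $(I_\varphi)^*(v) = I_{\varphi^*}(v)$ in this special case: the supremum over $u$ is computed atom by atom.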

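PROOF (a) Evidently it suffices to show that for all $v \in L^\infty(\Omega; X^*_{w^*})$, we have

\[ \int_\Omega \varphi^*\big(\omega, v(\omega)\big)\, d\mu \le \sup_{u \in L^1(\Omega;X)} \Big( \langle v, u \rangle_{L^1(\Omega;X)} - I_\varphi(u) \Big) \tag{4.34} \]

(see Proposition 4.4.4). Let $\xi \in \mathbb{R}$ be such that $\xi < I_{\varphi^*}(v)$. We can obtain (4.34) if we show that there exists $u \in L^1(\Omega; X)$, such that

\[ \langle v, u \rangle_{L^1(\Omega;X)} - I_\varphi(u) > \xi. \]

Since by hypothesis $I_\varphi(u_0)$ is finite, we can find $\vartheta_0 \in L^1(\Omega)$, such that

\[ \langle v(\omega), u_0(\omega) \rangle_X - \varphi\big(\omega, u_0(\omega)\big) \ge \vartheta_0(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \tag{4.35} \]

We obtain

\[ \varphi^*\big(\omega, v(\omega)\big) \ge \vartheta_0(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]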

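We claim that there exists a function $\beta \in L^1(\Omega)$, such that

\[ \xi < \int_\Omega \beta(\omega)\, d\mu \quad \text{and} \quad \beta(\omega) < \varphi^*\big(\omega, v(\omega)\big) \ \text{for } \mu\text{-a.a. } \omega \in \Omega. \]

Let $g \in L^1(\Omega)$, $g(\omega) > 0$ for $\mu$-almost all $\omega \in \Omega$. If $I_{\varphi^*}(v)$ is finite, then let

\[ \beta(\omega) \overset{\mathrm{df}}{=} \varphi^*\big(\omega, v(\omega)\big) - \varepsilon g(\omega), \]

with $\varepsilon > 0$ sufficiently small (so that $\xi < \int_\Omega \beta(\omega)\, d\mu$). If $I_{\varphi^*}(v) = +\infty$, then we define

\[ f_n(\omega) \overset{\mathrm{df}}{=} \begin{cases} \min\big\{ n g(\omega), \tfrac{1}{2} \varphi^*\big(\omega, v(\omega)\big) \big\} & \text{if } \varphi^*\big(\omega, v(\omega)\big) > 0, \\[0.5ex] \varphi^*\big(\omega, v(\omega)\big) - g(\omega) & \text{if } \varphi^*\big(\omega, v(\omega)\big) \le 0. \end{cases} \]

Note that

\[ f_n(\omega) \longrightarrow \tfrac{1}{2} \varphi^*\big(\omega, v(\omega)\big) \qquad \forall\, \omega \in \big\{ \omega \in \Omega : \varphi^*\big(\omega, v(\omega)\big) > 0 \big\}. \]

Hence by the monotone convergence theorem (see Theorem A.2.10), we have that

\[ \lim_{n \to +\infty} \int_\Omega f_n(\omega)\, d\mu = +\infty \]

and so we can find $n_0 \ge 1$ large enough so that

\[ \xi < \int_\Omega f_{n_0}(\omega)\, d\mu. \]

Therefore if we take $\beta = f_{n_0}$, we have

\[ \beta(\omega) < \varphi^*\big(\omega, v(\omega)\big) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]

Consider the multifunction $S : \Omega \to 2^X$, defined by

\[ S(\omega) \overset{\mathrm{df}}{=} \big\{ x \in X : \langle v(\omega), x \rangle_X - \varphi(\omega, x) \ge \beta(\omega) \big\} \qquad \forall\, \omega \in \Omega. \]

Evidently $S$ has nonempty, closed values and

\[ \operatorname{Gr} S = \big\{ (\omega, x) \in \Omega \times X : x \in S(\omega) \big\} \in \Sigma \times \mathcal{B}(X). \]

By the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.32), we can find a $\Sigma$-measurable function $s : \Omega \to X$, such that

\[ s(\omega) \in S(\omega) \qquad \forall\, \omega \in \Omega. \]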

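Since $\mu$ is $\sigma$-finite, we can find $\Omega_0 \in \Sigma$ with $\mu(\Omega_0) < +\infty$, such that $s|_{\Omega_0}$ is bounded and

\[ \int_{\Omega_0} \beta(\omega)\, d\mu + \int_{\Omega \setminus \Omega_0} \vartheta_0(\omega)\, d\mu > \xi. \]

We set

\[ u(\omega) \overset{\mathrm{df}}{=} \begin{cases} s(\omega) & \text{if } \omega \in \Omega_0, \\ u_0(\omega) & \text{if } \omega \in \Omega \setminus \Omega_0. \end{cases} \]

Evidently $u \in L^1(\Omega; X)$ and we have

\[ \langle v(\omega), u(\omega) \rangle_X - \varphi\big(\omega, u(\omega)\big) \ge \beta(\omega) \qquad \forall\, \omega \in \Omega_0 \]

and

\[ \langle v(\omega), u(\omega) \rangle_X - \varphi\big(\omega, u(\omega)\big) \ge \vartheta_0(\omega) \qquad \forall\, \omega \in \Omega \setminus \Omega_0 \]

(see (4.35)). Therefore

\[ \int_\Omega \langle v(\omega), u(\omega) \rangle_X\, d\mu - \int_\Omega \varphi\big(\omega, u(\omega)\big)\, d\mu \ge \int_{\Omega_0} \beta(\omega)\, d\mu + \int_{\Omega \setminus \Omega_0} \vartheta_0(\omega)\, d\mu > \xi, \]

so $I_{\varphi^*}(v) = (I_\varphi)^*(v)$.

(b) Since

\[ \varphi = \varphi^{**} \]

(see Theorem 4.4.14), this follows at once from part (a).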

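Next we consider integral functionals defined on the Lebesgue-Bochner space $L^\infty(\Omega; X^*)$. So as before $(\Omega, \Sigma, \mu)$ is a $\sigma$-finite measure space, but now $X$ is a separable Banach space with a separable dual $X^*$. Recall that $X^*_{w^*}$ (i.e., the Banach space $X^*$ supplied with the $w^*$-topology) is a Souslin space (see Definition A.2.29(b)). It follows that $\mathcal{B}(X^*) = \mathcal{B}(X^*_{w^*})$ (see Denkowski, Migórski & Papageorgiou (2003a, p. 211)). Also let $\varphi : \Omega \times X^*_{w^*} \to \overline{\mathbb{R}}$ be a convex normal integrand. We consider the following integral functionals

\[ I_\varphi(u) \overset{\mathrm{df}}{=} \int_\Omega \varphi\big(\omega, u(\omega)\big)\, d\mu \qquad \forall\, u \in L^1(\Omega; X^*) = L^1(\Omega; X^*_{w^*}) \]

and

\[ I_{\varphi^*}(v) \overset{\mathrm{df}}{=} \int_\Omega \varphi^*\big(\omega, v(\omega)\big)\, d\mu \qquad \forall\, v \in L^\infty(\Omega; X). \]

From Theorem 4.5.2(b), we know that if $\operatorname{dom} I_\varphi \neq \emptyset$ and $\operatorname{dom} I_{\varphi^*} \neq \emptyset$, then the functionals $I_\varphi$ and $I_{\varphi^*}$ are conjugate to each other. So we have

\[ I_{\varphi^*}(v) = \sup_{u \in L^1(\Omega;X^*)} \big[ \langle v, u \rangle_{L^1(\Omega;X^*)} - I_\varphi(u) \big] \]

and

\[ I_\varphi(u) = \sup_{v \in L^\infty(\Omega;X)} \big[ \langle u, v \rangle_{L^1(\Omega;X^*)} - I_{\varphi^*}(v) \big]. \]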

What about $(I_{\varphi^*})^*$ defined on $\big(L^\infty(\Omega; X)\big)^*$? To get an expression for this conjugate, first we need to introduce the structure of $\big(L^\infty(\Omega; X)\big)^*$.

DEFINITION 4.5.3 Let $(\Omega, \Sigma, \mu)$ be a $\sigma$-finite measure space and let $Y$ be a Banach space.

(a) A functional $l \in \big(L^\infty(\Omega; Y)\big)^*$ is said to be absolutely continuous with respect to $\mu$, if

\[ l(v) = \int_\Omega \langle u(\omega), v(\omega) \rangle_Y\, d\mu \qquad \forall\, v \in L^\infty(\Omega; Y), \]

with $u \in L^1(\Omega; Y^*_{w^*})$. The function $u$ is said to be the density of $l$ with respect to $\mu$. We have

\[ \|l\| = \|u\|_{L^1(\Omega; Y^*_{w^*})} = \int_\Omega \|u(\omega)\|_{Y^*}\, d\mu. \]

So we can identify an absolutely continuous functional $l \in \big(L^\infty(\Omega; Y)\big)^*$ with its density with respect to $\mu$.

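(b) A functional $l \in \big(L^\infty(\Omega; Y)\big)^*$ is said to be singular with respect to $\mu$, if there exists a decreasing sequence $\{C_n\}_{n \ge 1} \subseteq \Sigma$, such that $\mu(C_n) \searrow 0$ and $l$ is supported by $C_n$ for $n \ge 1$, that is, if $v \in L^\infty(\Omega; Y)$ and $v$ vanishes on some $C_n$, then $l(v) = 0$.

REMARK 4.5.4 If $\mu$ is finite, then $l \in \big(L^\infty(\Omega; Y)\big)^*$ is singular if and only if for every $\varepsilon > 0$, we can find $A \in \Sigma$, such that $\mu(A) \le \varepsilon$ and $l$ is supported by $A$ (i.e., if $v \in L^\infty(\Omega; Y)$, $v|_A = 0$, then $l(v) = 0$). For a given $l \in \big(L^\infty(\Omega; Y)\big)^*$ and $A \in \Sigma$, we define $l_A \in \big(L^\infty(\Omega; Y)\big)^*$, by

\[ l_A(v) \overset{\mathrm{df}}{=} l(\chi_A v) \qquad \forall\, v \in L^\infty(\Omega; Y). \]

It is easy to see that if $A, B \in \Sigma$ and $A \cap B = \emptyset$, then

\[ \|l_{A \cup B}\|_{(L^\infty(\Omega;Y))^*} = \|l_A\|_{(L^\infty(\Omega;Y))^*} + \|l_B\|_{(L^\infty(\Omega;Y))^*}. \]

PROPOSITION 4.5.5 If $(\Omega, \Sigma, \mu)$ is a $\sigma$-finite measure space, $Y$ is a Banach space, $l \in \big(L^\infty(\Omega; Y)\big)^*$ and for every $\varepsilon > 0$, there exists $A \in \Sigma$, with

\[ \mu(A) \le \varepsilon \quad \text{and} \quad \|l\|_{(L^\infty(\Omega;Y))^*} - \varepsilon \le \|l_A\|_{(L^\infty(\Omega;Y))^*}, \]

then $l \in \big(L^\infty(\Omega; Y)\big)^*$ is singular with respect to $\mu$.

PROOF Let $\{A_n\}_{n \ge 1} \subseteq \Sigma$ be such that

\[ \mu(A_n) \le \frac{1}{2^n} \quad \text{and} \quad \|l\|_{(L^\infty(\Omega;Y))^*} - \frac{1}{2^n} \le \|l_{A_n}\|_{(L^\infty(\Omega;Y))^*} \qquad \forall\, n \ge 1. \]

Since $\|l\|_{(L^\infty(\Omega;Y))^*} = \|l_{A_n}\|_{(L^\infty(\Omega;Y))^*} + \|l_{A_n^c}\|_{(L^\infty(\Omega;Y))^*}$, we have

\[ \|l_{A_n^c}\|_{(L^\infty(\Omega;Y))^*} \le \frac{1}{2^n} \qquad \forall\, n \ge 1. \]

Let

\[ C_n \overset{\mathrm{df}}{=} \bigcup_{k=n+1}^{\infty} A_k \qquad \forall\, n \ge 1. \]

We have

\[ \mu(C_n) \le \frac{1}{2^n} \qquad \forall\, n \ge 1. \]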

We claim that

\[ l \in \big(L^\infty(\Omega; Y)\big)^* \text{ is supported by } C_n \qquad \forall\, n \ge 1. \]

To this end let $v \in L^\infty(\Omega; Y)$ with $v|_{C_n} = 0$. So for all $k \ge n + 1$, we have that

\[ v = \chi_{A_k^c} v \]

and so

\[ |l(v)| = |l_{A_k^c}(v)| \le \|l_{A_k^c}\|_{(L^\infty(\Omega;Y))^*} \|v\|_{L^\infty(\Omega;Y)} \le \frac{1}{2^k} \|v\|_{L^\infty(\Omega;Y)}. \]

Let $k \to +\infty$ to obtain that $l(v) = 0$ and so $l \in \big(L^\infty(\Omega; Y)\big)^*$ is singular.

PROPOSITION 4.5.6 If $(\Omega, \Sigma, \mu)$ is a finite measure space, $Y$ is a Banach space,

\[ L^\infty(\Omega) \otimes Y = \operatorname{span}\big( \{ g y : g \in L^\infty(\Omega),\ y \in Y \} \big), \]

$l \in \big(L^\infty(\Omega; Y)\big)^*$ and

\[ l(w) = 0 \qquad \forall\, w \in L^\infty(\Omega) \otimes Y, \]

then $l$ is singular.

PROOF Let $Z$ be the subspace of $L^\infty(\Omega; Y)$ consisting of the equivalence classes of countably valued functions from $\Omega$ into $Y$. From Corollary 2.1.4, we know that $Z$ is dense in $L^\infty(\Omega; Y)$. So for a given $\varepsilon > 0$, we can find $z \in Z$ with $\|z\|_{L^\infty(\Omega;Y)} \le 1$, such that

\[ \|l\|_{(L^\infty(\Omega;Y))^*} - \varepsilon \le l(z). \]

So there exist a sequence $\{A_m\}_{m \ge 1} \subseteq \Sigma$ of pairwise disjoint sets with $\Omega = \bigcup_{m=1}^{\infty} A_m$ and a sequence $\{x_m\}_{m \ge 1} \subseteq Y$, such that

\[ z(\omega) = x_m \qquad \forall\, \omega \in A_m. \]

Let $n \ge 1$ be large enough, such that

\[ \mu\Big( \bigcup_{m=n}^{\infty} A_m \Big) \le \varepsilon. \]

Since

\[ w(\cdot) = \sum_{m=1}^{n-1} \chi_{A_m}(\cdot)\, x_m \in L^\infty(\Omega) \otimes Y, \]

we have

\[ l(z) = l\Big( \chi_{\bigcup_{m=n}^{\infty} A_m}\, z \Big) \ge \|l\|_{(L^\infty(\Omega;Y))^*} - \varepsilon. \]

Applying Proposition 4.5.5, we obtain that $l \in \big(L^\infty(\Omega; Y)\big)^*$ is singular with respect to $\mu$.

Let us state the theorem which characterizes the dual space $\big(L^\infty(\Omega; Y)\big)^*$. The decomposition produced by this theorem is analogous to the Lebesgue decomposition of measures. For a proof of the theorem we refer to Levin (1974).

THEOREM 4.5.7 If $(\Omega, \Sigma, \mu)$ is a $\sigma$-finite measure space, $Y$ is a Banach space and $L_s$ is the space of singular continuous linear functionals on $L^\infty(\Omega; Y)$, then $\big(L^\infty(\Omega; Y)\big)^*$ is isometrically isomorphic to $L^1(\Omega; Y^*_{w^*}) \oplus L_s$ and

\[ \|l\|_{(L^\infty(\Omega;Y))^*} = \|u\|_{L^1(\Omega; Y^*_{w^*})} + \|l_s\|_{(L^\infty(\Omega;Y))^*}, \]

where $l \in \big(L^\infty(\Omega; Y)\big)^*$, $u \in L^1(\Omega; Y^*_{w^*})$ and $l_s \in L_s$.

Now that we have a complete description of the dual space $\big(L^\infty(\Omega; Y)\big)^*$, we can return to our initial problem, namely the formula for $(I_{\varphi^*})^*$, defined on $\big(L^\infty(\Omega; X)\big)^*$. Recall that the mathematical setting is the following: $(\Omega, \Sigma, \mu)$ is a $\sigma$-finite measure space, $X$ is a separable Banach space with a separable dual $X^*$ and $\varphi : \Omega \times X^*_{w^*} \to \overline{\mathbb{R}}$ is a convex normal integrand. As we already mentioned $\mathcal{B}(X^*) = \mathcal{B}(X^*_{w^*})$.

We want to derive a formula for $(I_{\varphi^*})^* : \big(L^\infty(\Omega; X)\big)^* \to \overline{\mathbb{R}}$, defined by

\[ (I_{\varphi^*})^*(l) \overset{\mathrm{df}}{=} \sup_{v \in L^\infty(\Omega;X)} \big[ l(v) - I_{\varphi^*}(v) \big] \qquad \forall\, l \in \big(L^\infty(\Omega; X)\big)^*. \]

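THEOREM 4.5.8 If the above hypotheses hold and $\operatorname{dom} I_\varphi$, $\operatorname{dom} I_{\varphi^*}$ are nonempty, then

\[ (I_{\varphi^*})^*(l) = I_\varphi(u) + \sigma_{\operatorname{dom} I_{\varphi^*}}(l_s), \]

where $l \in \big(L^\infty(\Omega; X)\big)^*$, $u \in L^1(\Omega; X^*)$ is the density of $l$ with respect to $\mu$ and $l_s \in L_s$ is the singular part of $l$ with respect to $\mu$ (so we have that $l = u + l_s$ and $\|l\|_{(L^\infty(\Omega;X))^*} = \|u\|_{L^1(\Omega;X^*)} + \|l_s\|_{(L^\infty(\Omega;X))^*}$; see Theorem 4.5.7).

PROOF By definition, we have

\[ \begin{aligned} (I_{\varphi^*})^*(l) &= \sup_{v \in \operatorname{dom} I_{\varphi^*}} \big[ \langle u, v \rangle_{L^\infty(\Omega;X)} + l_s(v) - I_{\varphi^*}(v) \big] \\ &\le \sup_{v \in \operatorname{dom} I_{\varphi^*}} \big[ \langle u, v \rangle_{L^\infty(\Omega;X)} - I_{\varphi^*}(v) \big] + \sup_{\widehat{v} \in \operatorname{dom} I_{\varphi^*}} l_s(\widehat{v}) \\ &= I_\varphi(u) + \sigma_{\operatorname{dom} I_{\varphi^*}}(l_s). \end{aligned} \tag{4.36} \]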

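Let $\{C_n\}_{n \ge 1} \subseteq \Sigma$ be a decreasing sequence, such that $\mu(C_n) \searrow 0$ and $l_s$ is supported by $C_n$ for every $n \ge 1$. Also let $v_0 \in \operatorname{dom} I_{\varphi^*}$, $\varepsilon > 0$ and $\xi \in \mathbb{R}$ be such that $\xi < I_\varphi(u)$. Note that

\[ \varphi\big(\omega, u(\omega)\big) \ge \langle u(\omega), v_0(\omega) \rangle_X - \varphi^*\big(\omega, v_0(\omega)\big) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \]

and so $I_\varphi(u) > -\infty$. Then for $n \ge 1$ large enough, we have

\[ \int_{C_n} \big[ \langle u(\omega), v_0(\omega) \rangle_X - \varphi^*\big(\omega, v_0(\omega)\big) \big]\, d\mu \ge -\varepsilon \tag{4.37} \]

and

\[ \int_{C_n^c} \varphi\big(\omega, u(\omega)\big)\, d\mu \ge \xi. \tag{4.38} \]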

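We have

\[ \begin{aligned} (I_{\varphi^*})^*(l) &= \sup_{v \in L^\infty(\Omega;X)} \Big[ l(v) - \int_{C_n} \varphi^*\big(\omega, v(\omega)\big)\, d\mu - \int_{C_n^c} \varphi^*\big(\omega, v(\omega)\big)\, d\mu \Big] \\ &= \sup_{v \in L^\infty(\Omega \setminus C_n;X)} \Big[ \langle u, v \rangle_{L^\infty(\Omega;X)} - \int_{C_n^c} \varphi^*\big(\omega, v(\omega)\big)\, d\mu \Big] \\ &\quad + \sup_{\widehat{v} \in L^\infty(C_n;X)} \Big[ \langle u, \widehat{v} \rangle_{L^\infty(\Omega;X)} + l_s(\widehat{v}) - \int_{C_n} \varphi^*\big(\omega, \widehat{v}(\omega)\big)\, d\mu \Big] \\ &= \int_{C_n^c} \varphi\big(\omega, u(\omega)\big)\, d\mu + \sup_{\widehat{v} \in L^\infty(C_n;X)} \Big[ l_s(\widehat{v}) + \int_{C_n} \big[ \langle u(\omega), \widehat{v}(\omega) \rangle_X - \varphi^*\big(\omega, \widehat{v}(\omega)\big) \big]\, d\mu \Big]. \end{aligned} \]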

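Let

\[ \widehat{v} \overset{\mathrm{df}}{=} \chi_{C_n} v_0. \]

Since

\[ l_s\big(\chi_{C_n^c} v_0\big) = 0, \]

we have

\[ l_s(v_0) = l_s\big(\chi_{C_n} v_0\big) \]

and so, using also (4.37) and (4.38), we obtain

\[ \begin{aligned} (I_{\varphi^*})^*(l) &\ge \int_{C_n^c} \varphi\big(\omega, u(\omega)\big)\, d\mu + l_s(v_0) + \int_{C_n} \big[ \langle u(\omega), v_0(\omega) \rangle_X - \varphi^*\big(\omega, v_0(\omega)\big) \big]\, d\mu \\ &\ge \xi + l_s(v_0) - \varepsilon. \end{aligned} \]

Since $\varepsilon > 0$ was arbitrary, we let $\varepsilon \searrow 0$ and have

\[ (I_{\varphi^*})^*(l) \ge \xi + l_s(v_0) \qquad \forall\, v_0 \in \operatorname{dom} I_{\varphi^*}. \]

Since $\xi < I_\varphi(u)$ was also arbitrary, taking the supremum over $v_0 \in \operatorname{dom} I_{\varphi^*}$ we obtain

\[ (I_{\varphi^*})^*(l) \ge I_\varphi(u) + \sigma_{\operatorname{dom} I_{\varphi^*}}(l_s). \tag{4.39} \]

From (4.36) and (4.39), we conclude that

\[ (I_{\varphi^*})^*(l) = I_\varphi(u) + \sigma_{\operatorname{dom} I_{\varphi^*}}(l_s). \]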

REMARK 4.5.9 If $\operatorname{dom} I_{\varphi^*} = L^\infty(\Omega; X)$, then $I_{\varphi^*}$ is continuous, in fact locally Lipschitz on $L^\infty(\Omega; X)$ (see Theorems 4.2.6 and 4.2.7). We will show that we can say more about the continuity of $I_{\varphi^*}$, when $\operatorname{dom} I_{\varphi^*} = L^\infty(\Omega; X)$ (see Proposition 4.5.14).

DEFINITION 4.5.10 Let $V$ and $W$ be two linear spaces.

(a) We say that $(V, W)$ form a dual pair (or dual system), if there exists a bilinear functional $b : V \times W \to \mathbb{R}$, written $b(v, w) = \langle v, w \rangle$, such that
(i) if $\langle v, w \rangle = 0$ for all $w \in W$, then $v = 0$;
(ii) if $\langle v, w \rangle = 0$ for all $v \in V$, then $w = 0$.

(b) For a given dual pair $(V, W)$ and a topology $\tau_V$ on $V$, we say that $\tau_V$ is compatible with the dual pair $(V, W)$, if $\tau_V$ is a locally convex vector topology and $(V_{\tau_V})^* = W$ (that is, if $v^* \in (V_{\tau_V})^*$, then $v^*(v) = \langle v, w \rangle$ for some $w \in W$ and all $v \in V$, and conversely). So we view $W$ as a subspace of the algebraic dual of $V$. Dually, a compatible topology $\tau_W$ on $W$ is one such that $(W_{\tau_W})^* = V$.

(c) The smallest topology on $V$ compatible with the dual pair $(V, W)$ is the weak topology denoted by $w(V, W)$. The largest topology on $V$ compatible with the dual pair is the Mackey topology denoted by $m(V, W)$.

REMARK 4.5.11 From the properties of the bilinear form $\langle \cdot, \cdot \rangle$, we see easily that a compatible topology is always Hausdorff. All compatible topologies have the same closed, convex sets and the same bounded sets (i.e., these properties are duality invariant). The Mackey topology $m(V, W)$ is the topology of uniform convergence on all balanced, convex, $w$-compact sets in $W$. Recall that $A \subseteq W$ is balanced if $\lambda A \subseteq A$ for all $|\lambda| \le 1$. If $V = X$ is a Banach space and $W = X^*$, then $w(V, W)$ is the usual weak topology on $X$ and $m(V, W)$ is the norm topology on $X$. On the other hand, if $V = X^*$ and $W = X$, then $w(V, W)$ is the weak$^*$-topology on $X^*$ and $m(V, W)$ is strictly smaller than the norm topology, unless $X$ is reflexive.

DEFINITION 4.5.12 Let $(V, W)$ be a dual pair, let $\tau_W$ be a compatible topology on $W$ and let $\varphi : W \to \mathbb{R}^*$ be a function. We say that $\varphi$ is $\tau_W$-inf-compact for the slope $v_0 \in V$, if for all $\lambda \in \mathbb{R}$, the set

\[ \big\{ w \in W : \varphi(w) - \langle v_0, w \rangle \le \lambda \big\} \]

is $\tau_W$-compact. If $v_0 = 0$, then we simply say that $\varphi$ is $\tau_W$-inf-compact.

The next proposition establishes an interesting connection between continuity of $\varphi$ and $\tau_W$-inf-compactness of $\varphi^*$.

PROPOSITION 4.5.13 If $(V, W)$ is a dual pair, $w = w(V, W)$ and $\varphi \in \Gamma_0(V_w)$, then $\varphi$ is finite and $m(V, W)$-continuous at $v_0 \in V$ if and only if $\varphi^*$ is $w(W, V)$-inf-compact for the slope $v_0 \in V$.

Using Theorem 4.5.8 and Proposition 4.5.13, we infer the following result for the integral functional $I_{\varphi^*}$.

PROPOSITION 4.5.14 If the hypotheses of Theorem 4.5.8 hold with $\operatorname{dom} I_{\varphi^*} = L^\infty(\Omega; X)$, then
(a) $I_{\varphi^*}$ is an $m\big(L^\infty(\Omega; X), L^1(\Omega; X^*)\big)$-continuous function on $L^\infty(\Omega; X)$;
(b) $I_\varphi$ is a $w\big(L^1(\Omega; X^*), L^\infty(\Omega; X)\big)$-inf-compact function for every slope in $L^\infty(\Omega; X)$;
(c) we have

\[ (I_{\varphi^*})^*(l) = \begin{cases} I_\varphi(u) & \text{if } l = u \in L^1(\Omega; X^*), \\ +\infty & \text{otherwise.} \end{cases} \]

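Another useful continuity result for the integral functional $I_\varphi$ is the following.

PROPOSITION 4.5.15 If $(\Omega, \Sigma, \mu)$ is a nonatomic finite measure space, $X$ is a separable Banach space, $\varphi : \Omega \times X \to \overline{\mathbb{R}}$ is a convex normal integrand, $p \in [1, +\infty)$ and $I_\varphi : L^p(\Omega; X) \to \overline{\mathbb{R}}$ is continuous at a point, then $I_\varphi$ is continuous everywhere.

PROOF Let $u_0 \in L^p(\Omega; X)$ be the point of continuity of $I_\varphi$. By considering if necessary the functional

\[ x \longmapsto \widehat{I}_\varphi(x) \overset{\mathrm{df}}{=} I_\varphi(u_0 + x) - I_\varphi(u_0), \]

we may assume without any loss of generality that $u_0 = 0$ and $I_\varphi(0) = 0$. So we can find $\delta > 0$, such that

\[ I_\varphi(u) \le 1 \qquad \forall\, \|u\|_{L^p(\Omega;X)} \le \delta. \]

Let $u \in L^p(\Omega; X)$ be arbitrary. Exploiting the nonatomicity of $\mu$ and the absolute continuity of the Lebesgue integral, we can find $\delta_1 > 0$ and pairwise disjoint sets $\{A_k\}_{k=1}^{N} \subseteq \Sigma$ whose union is $\Omega$, such that

\[ \mu(A_k) \le \delta_1 \quad \text{and} \quad \|\chi_{A_k} u\|_{L^p(\Omega;X)} \le \delta \qquad \forall\, k \in \{1, \ldots, N\}. \]

We have

\[ \int_{A_k} \varphi\big(\omega, u(\omega)\big)\, d\mu = \int_\Omega \varphi\big(\omega, \chi_{A_k}(\omega) u(\omega)\big)\, d\mu - \int_{A_k^c} \varphi(\omega, 0)\, d\mu \le 1 + \xi, \]

for some $\xi > 0$ independent of $k \in \{1, \ldots, N\}$. Therefore, we conclude that

\[ \sum_{k=1}^{N} \int_{A_k} \varphi\big(\omega, u(\omega)\big)\, d\mu = \int_\Omega \varphi\big(\omega, u(\omega)\big)\, d\mu = I_\varphi(u) < +\infty, \]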

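so $I_\varphi$ is continuous everywhere on $L^p(\Omega; X)$ (see Theorem 4.2.3).

Next we describe the subdifferential of the integral functional $I_\varphi$.

THEOREM 4.5.16 If $(\Omega, \Sigma, \mu)$ is a $\sigma$-finite measure space, $X$ is a separable Banach space, $\varphi : \Omega \times X \to \overline{\mathbb{R}}$ is a convex normal integrand and $I_\varphi : L^p(\Omega; X) \to \overline{\mathbb{R}}$, $p \in [1, +\infty)$ is finite for at least one $u_0 \in L^p(\Omega; X)$, then for every $u \in L^p(\Omega; X)$, we have that

\[ u^* \in \partial I_\varphi(u) \subseteq L^{p'}(\Omega; X^*_{w^*}) \]

(with $\frac{1}{p} + \frac{1}{p'} = 1$) if and only if

\[ u^*(\omega) \in \partial \varphi\big(\omega, u(\omega)\big) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]

PROOF According to Proposition 4.4.21, we have that $u^* \in \partial I_\varphi(u)$ if and only if

\[ I_\varphi(u) + (I_\varphi)^*(u^*) = \langle u^*, u \rangle_{L^p(\Omega;X)} \]

(see Remark 2.2.13). From Theorem 4.5.2(a), we know that

\[ (I_\varphi)^* = I_{\varphi^*}. \]

So we have that $u^* \in \partial I_\varphi(u)$ if and only if

\[ \int_\Omega \big[ \varphi\big(\omega, u(\omega)\big) + \varphi^*\big(\omega, u^*(\omega)\big) \big]\, d\mu = \int_\Omega \langle u^*(\omega), u(\omega) \rangle_X\, d\mu. \]

The result now follows from the fact that

\[ \varphi\big(\omega, u(\omega)\big) + \varphi^*\big(\omega, u^*(\omega)\big) \ge \langle u^*(\omega), u(\omega) \rangle_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]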

Before passing to the study of the subdifferentials of nonconvex integral functionals, let us prove a last result for convex subdifferentials. It concerns functionals defined on the space of continuous functions on a compact metric space $K$. So let $K$ be a compact metric space and consider the Banach space $C(K)$ (with the supremum norm). The Riesz-Markov representation theorem (see Theorem 2.3.41) says that $C(K)^* = M(K)$, where $M(K)$ is the Banach space of Radon measures, i.e., the space of all signed Borel measures which are of bounded variation with the norm given by the total variation, namely

\[ \|\mu\| = \sup \Big\{ \sum_{k=1}^{N} |\mu(A_k)| : \bigcup_{k=1}^{N} A_k \subseteq K,\ A_k \cap A_i = \emptyset \text{ for } k \neq i,\ N \ge 1 \Big\}. \]

In what follows by $\langle \cdot, \cdot \rangle_{C(K)}$ we denote the duality brackets for the pair $\big(C(K), M(K)\big)$, i.e.,

\[ \langle \mu, u \rangle_{C(K)} \overset{\mathrm{df}}{=} \int_K u(x)\, d\mu(x) \qquad \forall\, u \in C(K),\ \mu \in M(K). \]

We say that $\mu \in M(K)$ is positive, denoted by $\mu \ge 0$, if

\[ \langle \mu, u \rangle_{C(K)} \ge 0 \qquad \forall\, u \in C(K),\ u \ge 0. \]

If $e \in C(K)$ is the function, such that

\[ e(x) = 1 \quad \forall\, x \in K \]

and

\[ \langle \mu, e \rangle_{C(K)} = 1, \]

then we say that the Radon measure $\mu \in M(K)$ has total mass one. A Radon measure $\mu \in M(K)$ vanishes in an open set $U \subseteq K$, if

\[ \langle \mu, u \rangle_{C(K)} = 0 \qquad \forall\, u \in C(K),\ \operatorname{supp} u \subseteq U; \]

recall that

\[ \operatorname{supp} u \overset{\mathrm{df}}{=} \overline{\{ x \in K : u(x) \neq 0 \}}. \]

By using a partition of unity, we can show that if $\mu$ vanishes in a collection of open sets $U_r$, then $\mu$ also vanishes on $\bigcup_r U_r$. Hence it follows that there exists a largest open set $\widehat{U}$ where $\mu$ vanishes. Then the set $K \setminus \widehat{U}$ is called the support of $\mu$ and is denoted by $\operatorname{supp} \mu$.

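LEMMA 4.5.17 If $u \in C(K)$, $\mu \in M(K)$ and $u|_{\operatorname{supp} \mu} = 0$, then $\langle \mu, u \rangle_{C(K)} = 0$.

PROOF If $d_K$ is a metric on $K$, for each $\varepsilon > 0$, let

\[ (\operatorname{supp} \mu)_\varepsilon \overset{\mathrm{df}}{=} \big\{ x \in K : d_K(x, \operatorname{supp} \mu) < \varepsilon \big\}. \]

Using Urysohn's lemma (see Theorem A.1.13), for every $n \ge 1$, we can find $\vartheta_n \in C(K)$, such that

\[ \vartheta_n|_{(\operatorname{supp} \mu)_{1/n}} \equiv 0 \quad \text{and} \quad \vartheta_n|_{((\operatorname{supp} \mu)_{2/n})^c} \equiv 1. \]

Then $\vartheta_n u \to u$ in $C(K)$ and so $\langle \mu, \vartheta_n u \rangle_{C(K)} \to \langle \mu, u \rangle_{C(K)}$. Note that

\[ \operatorname{supp} \vartheta_n u \subseteq \widehat{U} = K \setminus \operatorname{supp} \mu \qquad \forall\, n \ge 1. \]

Hence $\langle \mu, \vartheta_n u \rangle_{C(K)} = 0$ and so we conclude that $\langle \mu, u \rangle_{C(K)} = 0$.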

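PROPOSITION 4.5.18 If $\xi : C(K) \to \mathbb{R}$ is defined by

\[ \xi(u) \overset{\mathrm{df}}{=} \max_{x \in K} u(x), \]

then $\xi$ is continuous, convex and for each $u \in C(K)$, we have that $\mu \in \partial \xi(u)$ if and only if

\[ \mu \ge 0, \quad \langle \mu, e \rangle_{C(K)} = 1 \quad \text{and} \quad \operatorname{supp} \mu \subseteq \big\{ x \in K : \xi(u) = u(x) \big\}. \]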
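When $K$ is a finite set, $C(K)$ is $\mathbb{R}^n$, a Radon measure is a vector of weights and the characterization above can be tested directly. The following sketch assumes this finite setting; the function names and the sample data are illustrative and not part of the text.

```python
import random


def xi(u):
    """xi(u) = max over the finite set K = {0, ..., n-1}."""
    return max(u)


def satisfies_characterization(mu, u, tol=1e-12):
    """mu >= 0, total mass one, and mu carried by the maximizers of u."""
    top = xi(u)
    positive = all(m >= -tol for m in mu)
    unit_mass = abs(sum(mu) - 1.0) <= tol
    on_argmax = all(abs(m) <= tol for m, ui in zip(mu, u) if ui < top)
    return positive and unit_mass and on_argmax


def subgradient_gap(mu, u, v):
    """xi(v) - xi(u) - <mu, v - u>; nonnegative for all v iff mu is a subgradient at u."""
    return xi(v) - xi(u) - sum(m * (vi - ui) for m, vi, ui in zip(mu, v, u))


u = [1.0, 3.0, 3.0, 0.0]          # maximum attained at indices 1 and 2
good = [0.0, 0.25, 0.75, 0.0]     # probability weights on the maximizers
bad = [0.5, 0.5, 0.0, 0.0]        # charges index 0, which is not a maximizer

random.seed(0)
trials = [[random.uniform(-5, 5) for _ in u] for _ in range(1000)]
worst_good = min(subgradient_gap(good, u, v) for v in trials)
worst_bad = subgradient_gap(bad, u, [2.0, 3.0, 3.0, 0.0])
```

The weights `good` pass the subgradient inequality on every sampled `v`, whereas `bad` violates it already for a perturbation of `u` at the charged non-maximizing index, in agreement with the support condition.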

PROOF The convexity of $\xi$ is clear. To establish the continuity of $\xi$, we argue as follows. Let $u, v \in C(K)$ and $x_0 \in K$ be such that

\[ \xi(u) = \max_{x \in K} u(x) = u(x_0). \]

We have

\[ \xi(u) - \xi(v) \le u(x_0) - v(x_0) \le \|u - v\|_\infty. \]

Reversing the roles of $u$ and $v$ in the above argument, we conclude that

\[ |\xi(u) - \xi(v)| \le \|u - v\|_\infty, \]

i.e., $\xi$ is Lipschitz continuous. Now let us prove the description of $\partial \xi$.

(a) Let $\mu \in \partial \xi(u)$. Then we have

\[ \langle \mu, v - u \rangle_{C(K)} \le \xi(v) - \xi(u) \qquad \forall\, v \in C(K). \tag{4.40} \]

Let $g \in C(K)$, $g \ge 0$ and let us set

\[ v \overset{\mathrm{df}}{=} u - g. \]

From (4.40), we have

\[ -\langle \mu, g \rangle_{C(K)} \le \max_{x \in K} (u - g)(x) - \max_{x \in K} u(x) \le 0, \]

so $\langle \mu, g \rangle_{C(K)} \ge 0$. Also let $c \in \mathbb{R}$ and let us set

\[ v \overset{\mathrm{df}}{=} u + c e. \]

From (4.40), we have

\[ c \langle \mu, e \rangle_{C(K)} \le \max_{x \in K} (u + c e)(x) - \max_{x \in K} u(x) = c \]

(recall that $e \equiv 1$). Since $c \in \mathbb{R}$ was arbitrary, we obtain $\langle \mu, e \rangle_{C(K)} = 1$. Next we show that

\[ \operatorname{supp} \mu \subseteq C \overset{\mathrm{df}}{=} \big\{ x \in K : \xi(u) = u(x) \big\}. \]

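It suffices to show that $\mu$ vanishes in any open set $U \subseteq K \setminus C$. To this end let $g \in C(K)$ be such that $\operatorname{supp} g \subseteq U$. Also let

\[ \eta \overset{\mathrm{df}}{=} \xi(u) - \max_{x \in \operatorname{supp} g} u(x) > 0. \]

We choose $\varepsilon > 0$ so that

\[ \pm \varepsilon g(x) < \eta \qquad \forall\, x \in K. \]

Then

\[ u(x) \pm \varepsilon g(x) < \xi(u) \qquad \forall\, x \in \operatorname{supp} g \]

and $\xi(u \pm \varepsilon g) = \xi(u)$. So if in (4.40), we set $v \overset{\mathrm{df}}{=} u \pm \varepsilon g$, we obtain $\pm \varepsilon \langle \mu, g \rangle_{C(K)} \le 0$, i.e., $\langle \mu, g \rangle_{C(K)} = 0$, so

\[ \operatorname{supp} \mu \subseteq \big\{ x \in K : \xi(u) = u(x) \big\}. \]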

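(b) Conversely, suppose that $\mu$ satisfies the three conditions of the proposition. Let $g = u - \xi(u) e$ and note that

\[ g(x) = 0 \qquad \forall\, x \in \operatorname{supp} \mu. \]

Using Lemma 4.5.17, we obtain $\langle \mu, g \rangle_{C(K)} = 0$ and so $\langle \mu, u \rangle_{C(K)} = \xi(u)$. Hence if $v \in C(K)$, we have

\[ \xi(v) - \xi(u) = \xi(v) - \langle \mu, u \rangle_{C(K)}. \tag{4.41} \]

Let

\[ g \overset{\mathrm{df}}{=} \xi(v) e - v. \]

Evidently $g \ge 0$ and so $\langle \mu, g \rangle_{C(K)} \ge 0$, hence (since $\langle \mu, e \rangle_{C(K)} = 1$) $\xi(v) \ge \langle \mu, v \rangle_{C(K)}$. Using this in (4.41), we obtain

\[ \xi(v) - \xi(u) \ge \langle \mu, v - u \rangle_{C(K)} \qquad \forall\, v \in C(K), \]

so $\mu \in \partial \xi(u)$.

Now we consider nonconvex locally Lipschitz integral functionals. The mathematical setting is the following: $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is a separable Banach space and $\varphi : \Omega \times X \to \mathbb{R}$ is a measurable function. We consider the integral functional $I_\varphi : L^p(\Omega; X) \to \mathbb{R}$, $p \in [1, +\infty)$, defined by

\[ I_\varphi(u) \overset{\mathrm{df}}{=} \int_\Omega \varphi\big(\omega, u(\omega)\big)\, d\mu \qquad \forall\, u \in L^p(\Omega; X). \]

Our goal is to describe $\partial I_\varphi(u)$ under one of the following two hypotheses:

H(ϕ)1 We have

\[ |\varphi(\omega, x) - \varphi(\omega, y)| \le k(\omega) \|x - y\|_X, \]

for $\mu$-almost all $\omega \in \Omega$ and all $x, y \in X$, with $k \in L^{p'}(\Omega)$, $\frac{1}{p} + \frac{1}{p'} = 1$.

H(ϕ)2 For $\mu$-almost all $\omega \in \Omega$, the function $\varphi(\omega, \cdot)$ is locally Lipschitz and

\[ \|x^*\|_{X^*} \le a(\omega) \big( 1 + \|x\|_X^{p-1} \big), \]

for $\mu$-almost all $\omega \in \Omega$, all $x \in X$ and all $x^* \in \partial \varphi(\omega, x)$, with $a \in L^\infty(\Omega)$.

THEOREM 4.5.19 If $\varphi : \Omega \times X \to \mathbb{R}$ is a measurable function and satisfies either hypothesis H(ϕ)1 or H(ϕ)2, then $I_\varphi$ is Lipschitz continuous on bounded sets of $L^p(\Omega; X)$ and if $u^* \in \partial I_\varphi(u)$, then

\[ u^*(\omega) \in \partial \varphi\big(\omega, u(\omega)\big) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]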

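PROOF Case 1. First suppose that hypothesis H(ϕ)1 holds.

Then for $u, v \in L^p(\Omega; X)$, we have

\[ \begin{aligned} |I_\varphi(u) - I_\varphi(v)| &\le \int_\Omega \big| \varphi\big(\omega, u(\omega)\big) - \varphi\big(\omega, v(\omega)\big) \big|\, d\mu \\ &\le \int_\Omega k(\omega) \|u(\omega) - v(\omega)\|_X\, d\mu \\ &\le \|k\|_{p'} \|u - v\|_{L^p(\Omega;X)}, \end{aligned} \]

so $I_\varphi$ is Lipschitz continuous (globally) on $L^p(\Omega; X)$.

Case 2. Next suppose that hypothesis H(ϕ)2 holds.

We show that $I_\varphi$ is Lipschitz continuous on bounded sets of $L^p(\Omega; X)$. Let $u, v \in L^p(\Omega; X)$ be such that

\[ \|u\|_{L^p(\Omega;X)} \le r \quad \text{and} \quad \|v\|_{L^p(\Omega;X)} \le r. \]

Using Theorem 4.4.72, we can find

\[ w(\omega) \in \big[ u(\omega), v(\omega) \big] \overset{\mathrm{df}}{=} \big\{ (1 - \lambda) u(\omega) + \lambda v(\omega) : \lambda \in [0, 1] \big\} \]

and

\[ w^*(\omega) \in \partial \varphi\big(\omega, w(\omega)\big), \]

such that

\[ \varphi\big(\omega, v(\omega)\big) - \varphi\big(\omega, u(\omega)\big) = \langle w^*(\omega), v(\omega) - u(\omega) \rangle_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \tag{4.42} \]

By virtue of hypothesis H(ϕ)2, we have that

\[ \|w^*(\omega)\|_{X^*} \le a(\omega) \big( 1 + \|w(\omega)\|_X^{p-1} \big) \le \widehat{a}(\omega) \big( 1 + \|u(\omega)\|_X^{p-1} + \|v(\omega)\|_X^{p-1} \big) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega, \]

with $\widehat{a} \in L^\infty(\Omega)$. Let

\[ \eta(\omega) \overset{\mathrm{df}}{=} \widehat{a}(\omega) \big( 1 + \|u(\omega)\|_X^{p-1} + \|v(\omega)\|_X^{p-1} \big). \tag{4.43} \]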

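Then

\[ \eta \in L^{p'}(\Omega)_+ \quad \text{and} \quad \|\eta\|_{p'} \le c, \]

where $c > 0$ depends on $\|\widehat{a}\|_\infty$ and on $r > 0$. From (4.42) and (4.43), it follows that

\[ |I_\varphi(v) - I_\varphi(u)| \le c\, \|v - u\|_{L^p(\Omega;X)}, \]

so $I_\varphi$ is Lipschitz continuous on bounded sets.

Now let

\[ u^* \in \partial I_\varphi(u) \subseteq L^{p'}(\Omega; X^*_{w^*}). \]

Then using Fatou's lemma (see Theorem A.2.1), we have

\[ \langle u^*, h \rangle_{L^p(\Omega;X)} \le (I_\varphi)^0(u; h) \le \int_\Omega \varphi^0\big(\omega, u(\omega); h(\omega)\big)\, d\mu \qquad \forall\, h \in L^p(\Omega; X), \]

so

\[ \int_\Omega \big[ \varphi^0\big(\omega, u(\omega); h(\omega)\big) - \langle u^*(\omega), h(\omega) \rangle_X \big]\, d\mu \ge 0 \qquad \forall\, h \in L^p(\Omega; X). \]

Let

\[ h \overset{\mathrm{df}}{=} \chi_A z, \]

with $A \in \Sigma$, $z \in X$. We obtain

\[ \int_A \big[ \varphi^0\big(\omega, u(\omega); z\big) - \langle u^*(\omega), z \rangle_X \big]\, d\mu \ge 0. \]

Since $A \in \Sigma$ is arbitrary, we infer that

\[ \langle u^*(\omega), z \rangle_X \le \varphi^0\big(\omega, u(\omega); z\big) \quad \text{for a.a. } \omega \in \Omega \tag{4.44} \]

and the exceptional $\mu$-null set is independent of $z \in X$ since $X$ is separable. Since $z \in X$ is arbitrary, we conclude that

\[ u^*(\omega) \in \partial \varphi\big(\omega, u(\omega)\big) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]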

REMARK 4.5.20 Note that under hypothesis H(ϕ)1 we in fact proved that Iϕ is Lipschitz continuous (globally).

4.6 Variational Principles

Suppose that $X$ is a Banach space, $C \subseteq X$ is a nonempty, noncompact set and $\varphi : X \to \overline{\mathbb{R}} \overset{\mathrm{df}}{=} \mathbb{R} \cup \{+\infty\}$ is a proper, lower semicontinuous function, which is bounded below. Then the problem

\[ \inf_{x \in C} \varphi(x) \]

need not have a solution. If $X = \mathbb{R}^N$, the situation can be remedied by considering a suitable small perturbation of $\varphi$. More specifically, for simplicity let $C = \mathbb{R}^N$,

\[ m = \inf_{x \in X} \varphi(x), \qquad \varepsilon > 0 \]

and take $x_0 \in \mathbb{R}^N$ to be such that $\varphi(x_0) \le m + \varepsilon$. Consider the function

\[ \varphi_\varepsilon(x) \overset{\mathrm{df}}{=} \varphi(x) + \varepsilon \|x - x_0\|_X. \]

Evidently $\varphi_\varepsilon : \mathbb{R}^N \to \overline{\mathbb{R}}$ is proper, lower semicontinuous and in addition $\varphi_\varepsilon$ is weakly coercive, i.e.,

\[ \varphi_\varepsilon(x) \longrightarrow +\infty \quad \text{as } \|x\|_{\mathbb{R}^N} \to +\infty. \]

So invoking the Weierstrass theorem, we infer that $\varphi_\varepsilon$ attains its infimum at a point $y \in \mathbb{R}^N$. Note that $\|y - x_0\|_{\mathbb{R}^N} \le 1$. Indeed, if $\|y - x_0\|_{\mathbb{R}^N} > 1$, we have

\[ \varphi_\varepsilon(y) = \varphi(y) + \varepsilon \|y - x_0\|_{\mathbb{R}^N} > \varphi(y) + \varepsilon \ge m + \varepsilon \ge \varphi(x_0) = \varphi_\varepsilon(x_0) \ge \varphi_\varepsilon(y), \]

a contradiction. Also

\[ \varphi_\varepsilon(y) \le \varphi_\varepsilon(x) \qquad \forall\, x \in \mathbb{R}^N, \]

hence

\[ \varphi(y) + \varepsilon \|y - x_0\|_{\mathbb{R}^N} \le \varphi(x) + \varepsilon \|x - x_0\|_{\mathbb{R}^N}, \]

so

\[ \varphi(y) \le \varphi(x) + \varepsilon \|x - y\|_{\mathbb{R}^N}. \]

So this argument shows that for a given $\varepsilon > 0$ and $x_0 \in \mathbb{R}^N$ satisfying

\[ \varphi(x_0) \le \inf_{x \in X} \varphi(x) + \varepsilon \]

(i.e., $x_0 \in \mathbb{R}^N$ is an $\varepsilon$-minimizer), we can find $y \in \mathbb{R}^N$, such that $\|y - x_0\|_{\mathbb{R}^N} \le 1$ and the function $x \mapsto \varphi(x) + \varepsilon \|x - y\|_{\mathbb{R}^N}$ attains its infimum at $y \in \mathbb{R}^N$.

The main analytical tool in this argument was the theorem of Weierstrass, which guarantees a minimizer for a proper, lower semicontinuous, bounded from below function with at least one bounded sublevel set (this is the case if for example the function is weakly coercive). Evidently this is an essentially finite dimensional situation. In an infinite dimensional space for this to work we need extra conditions, such as the reflexivity of $X$ and the weak lower semicontinuity of the function. In general the argument fails. Nevertheless, we can salvage the principle formulated above. Namely if $x_0 \in X$ is an $\varepsilon$-minimizer of $\varphi : X \to \overline{\mathbb{R}}$, which is proper, lower semicontinuous, bounded from below, then a small Lipschitz perturbation of $\varphi$ attains a strict minimum at a point $y \in X$, which is relatively close to $x_0$ (i.e., we can find a Lipschitz continuous function $h : X \to \mathbb{R}$ with a small Lipschitz constant, such that $\varphi + h$ attains a strict minimum at $y \in X$). In fact this principle can be formulated in any complete metric space. This is the essence of the so-called Ekeland variational principle and its extensions. This result turned out to be an essential tool in many different areas of nonlinear analysis.

THEOREM 4.6.1 If $(X, d_X)$ is a complete metric space, $\varphi : X \to \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function, $\varepsilon > 0$ and $x_0 \in X$ satisfies

\[ \varphi(x_0) \le \inf_{x \in X} \varphi(x) + \varepsilon, \]

then for a given $\lambda > 0$, we can find $y_\lambda \in X$, such that
(a) $\varphi(y_\lambda) \le \varphi(x_0)$;
(b) $d_X(y_\lambda, x_0) \le \lambda$;
(c) $\varphi(y_\lambda) < \varphi(x) + \frac{\varepsilon}{\lambda} d_X(x, y_\lambda)$ for all $x \neq y_\lambda$.
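The three conclusions of Theorem 4.6.1 can be observed on a finite metric space, which is trivially complete. The sketch below is an illustration only: the function `ekeland_point` and the sample data are assumptions, and the greedy descent along the order "$z$ precedes $y$ if $\varphi(z) \le \varphi(y) - \frac{\varepsilon}{\lambda} d(z, y)$" is a finite stand-in for the nested sets used in the proof, not a transcription of it.

```python
def ekeland_point(points, phi, dist, x0, eps, lam):
    """Descend from x0 until no other point z satisfies phi(z) <= phi(y) - (eps/lam)*dist(z, y)."""
    slope = eps / lam
    y = x0
    while True:
        # Points strictly below y in the partial order induced by phi and the metric.
        below = [z for z in points
                 if z != y and phi(z) <= phi(y) - slope * dist(z, y)]
        if not below:
            return y  # y is minimal, which is conclusion (c)
        y = min(below, key=phi)  # phi strictly decreases, so the loop terminates


# Sample data: a grid in [-2, 2], phi(x) = (x - 1)^2 with infimum 0.
points = [k / 4 for k in range(-8, 9)]
phi = lambda x: (x - 1.0) ** 2
dist = lambda p, q: abs(p - q)
x0, eps = 0.5, 0.25          # phi(x0) = 0.25 <= inf phi + eps

y_wide = ekeland_point(points, phi, dist, x0, eps, lam=1.0)
y_tight = ekeland_point(points, phi, dist, x0, eps, lam=0.25)
```

With $\lambda = 1$ the descent reaches the global minimizer, while with $\lambda = 0.25$ it stays at $x_0$: a small $\lambda$ forces $y_\lambda$ to remain near $x_0$ at the price of a weaker inequality in (c).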

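PROOF By replacing $\varphi$ with

\[ \widehat{\varphi}(x) \overset{\mathrm{df}}{=} \frac{1}{\varepsilon} \varphi(x) \]

and $d_X(\cdot, \cdot)$ with

\[ d_\lambda(\cdot, \cdot) \overset{\mathrm{df}}{=} \frac{1}{\lambda} d_X(\cdot, \cdot), \]

without any loss of generality, we may assume that $\varepsilon = \lambda = 1$. On $X$ we define a relation by

\[ \big[ x \le z \big] \ \overset{\mathrm{df}}{\Longleftrightarrow} \ \big[ \varphi(x) \le \varphi(z) - d_X(x, z) \big]. \]

Evidently $x \le x$ (i.e., the relation $\le$ is reflexive). Also, if $x \le z$, we have $\varphi(x) \le \varphi(z) - d_X(x, z)$ and if $z \le v$, we have $\varphi(z) \le \varphi(v) - d_X(z, v)$. Thus, by the triangle inequality, it follows that

\[ \varphi(x) \le \varphi(v) - \big[ d_X(x, z) + d_X(z, v) \big] \le \varphi(v) - d_X(x, v). \]

Therefore $x \le v$ (i.e., the relation $\le$ is transitive). Finally if $x \le z$ and $z \le x$, we obtain $d_X(x, z) = 0$, hence $x = z$ (i.e., the relation $\le$ is antisymmetric). So we conclude that the relation $\le$ is a partial order. Inductively we define a sequence $\{S_n\}_{n \ge 1}$ of subsets of $X$ as follows. Let $x_1 = x_0$ and

\[ S_1 \overset{\mathrm{df}}{=} \big\{ z \in X : z \le x_1 \big\}, \]

$x_2 \in S_1$ is such that

\[ \varphi(x_2) \le \inf_{x \in S_1} \varphi(x) + \frac{1}{2^2} \]

and for the induction step, let

\[ S_n \overset{\mathrm{df}}{=} \big\{ z \in X : z \le x_n \big\}, \]

$x_{n+1} \in S_n$ is such that

\[ \varphi(x_{n+1}) \le \inf_{x \in S_n} \varphi(x) + \frac{1}{2^{n+1}}. \]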

Since $x_{n+1} \le x_n$, we have that $S_{n+1} \subseteq S_n$ for $n \ge 1$ and by virtue of the lower semicontinuity of $\varphi$, we have that $S_n$ is closed. If $z \in S_{n+1}$, we have that $z \le x_{n+1} \le x_n$ and so

\[ d_X(z, x_{n+1}) \le \varphi(x_{n+1}) - \varphi(z) \le \inf_{x \in S_n} \varphi(x) + \frac{1}{2^{n+1}} - \varphi(z) \le \varphi(z) + \frac{1}{2^{n+1}} - \varphi(z) = \frac{1}{2^{n+1}}, \]

so

\[ \operatorname{diam} S_{n+1} \le \frac{1}{2^n} \qquad \forall\, n \ge 1, \]

i.e., $\operatorname{diam} S_n \to 0$. Because $(X, d_X)$ is complete, by Cantor's theorem (see Theorem A.1.11), we have that

\[ \bigcap_{n=1}^{\infty} S_n = \{y\}. \]

Since $y \in S_1$, we have $y \le x_1 = x_0$ and so

\[ \varphi(y) \le \varphi(x_0) - d_X(y, x_0) \le \varphi(x_0), \]

i.e., (a) holds. Also we have (recall $\varepsilon = \lambda = 1$)

\[ d_X(y, x_0) \le \varphi(x_0) - \varphi(y) \le \inf_{x \in X} \varphi(x) + 1 - \inf_{x \in X} \varphi(x) = 1, \]

i.e., (b) holds. Finally to prove (c), we need to show that $z \le y$ implies $z = y$. Indeed, if $z \le y$, then $z \le x_n$ for all $n \ge 1$, hence

\[ z \in \bigcap_{n=1}^{\infty} S_n, \]

which implies that $z = y$.

REMARK 4.6.2 Note that in the above theorem conclusions (b) and (c) are somehow complementary and the choice of $\lambda > 0$ allows us to strike a balance between them depending on the application we have in mind. If $\lambda > 0$ is large then (b) provides little information on the whereabouts of $y_\lambda$ while (c) tells us that $y_\lambda$ is close to being a global minimizer of $\varphi$. The opposite situation occurs when $\lambda > 0$ is small. Then (b) implies that $y_\lambda$ is close to $x_0$, but the inequality in (c) gives us little information. Two particular cases are of interest. The first corresponds to $\lambda = 1$, $\varepsilon > 0$ and the second to $\lambda = \sqrt{\varepsilon}$, $\varepsilon > 0$. In the first case we are not interested in conclusion (b) (i.e., we are not interested in how $y_\lambda$ is located with respect to $x_0$). In the second case we are interested in both (b) and (c). We state these two particular cases as corollaries.

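COROLLARY 4.6.3 If $(X, d_X)$ is a complete metric space and $\varphi : X \to \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function, then for every $\varepsilon > 0$, we can find $y_\varepsilon \in X$, such that
(a) $\varphi(y_\varepsilon) \le \inf_{x \in X} \varphi(x) + \varepsilon$;
(b) $\varphi(y_\varepsilon) < \varphi(x) + \varepsilon d_X(x, y_\varepsilon)$ for all $x \neq y_\varepsilon$.

COROLLARY 4.6.4 If $(X, d_X)$ is a complete metric space, $\varphi : X \to \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function, $\varepsilon > 0$ and $x_\varepsilon \in X$ satisfies

\[ \varphi(x_\varepsilon) \le \inf_{x \in X} \varphi(x) + \varepsilon, \]

then we can find $y_\varepsilon \in X$, such that
(a) $\varphi(y_\varepsilon) \le \varphi(x_\varepsilon)$;
(b) $d_X(y_\varepsilon, x_\varepsilon) \le \sqrt{\varepsilon}$;
(c) $\varphi(y_\varepsilon) < \varphi(x) + \sqrt{\varepsilon}\, d_X(x, y_\varepsilon)$ for all $x \neq y_\varepsilon$.

If we put more structure on the space $X$, we can strengthen the conclusion of Theorem 4.6.1.

THEOREM 4.6.5 If $X$ is a Banach space and $\varphi : X \to \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable, then for every $\varepsilon > 0$, we can find $x_\varepsilon \in X$, such that

\[ \varphi(x_\varepsilon) \le \inf_{x \in X} \varphi(x) + \varepsilon \quad \text{and} \quad \|\varphi'_G(x_\varepsilon)\|_{X^*} \le \varepsilon. \]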

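PROOF By virtue of Corollary 4.6.3, we can find $x_\varepsilon \in X$, such that
\[ \varphi(x_\varepsilon) \le \inf_{x \in X} \varphi(x) + \varepsilon \]
and
\[ \varphi(x_\varepsilon) \le \varphi(x) + \varepsilon\, \|x - x_\varepsilon\|_X \quad \forall\, x \in X. \]
Let $h \in X$ and $\lambda > 0$ be arbitrary. Let us set $x \stackrel{df}{=} x_\varepsilon + \lambda h$. We obtain
\[ \frac{\varphi(x_\varepsilon) - \varphi(x_\varepsilon + \lambda h)}{\lambda} \le \varepsilon\, \|h\|_X. \]
Passing to the limit as $\lambda \searrow 0$, we obtain
\[ -\big\langle \varphi'_G(x_\varepsilon), h \big\rangle_X \le \varepsilon\, \|h\|_X \quad \forall\, h \in X, \]
so (replacing $h$ by $-h$) $\big| \big\langle \varphi'_G(x_\varepsilon), h \big\rangle_X \big| \le \varepsilon\, \|h\|_X$ and thus $\big\| \varphi'_G(x_\varepsilon) \big\|_{X^*} \le \varepsilon$.

An immediate consequence of this theorem is the following corollary.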


COROLLARY 4.6.6 If $X$ is a Banach space and $\varphi\colon X \to \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable, then there exists a sequence $\{x_n\}_{n \ge 1} \subseteq X$, such that
\[ \varphi(x_n) \searrow \inf_{x \in X} \varphi(x) \quad \text{and} \quad \varphi'_G(x_n) \longrightarrow 0. \]
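The corollary can be checked by hand on a simple function. The following is only an illustrative sketch under our own choice of data ($X = \mathbb{R}$ and $\varphi(x) = e^x$, neither of which appears in the text): $\varphi$ is smooth and bounded from below, yet its infimum is not attained, so it has no critical point.

```latex
% Illustrative example; the choices X = \mathbb{R}, \varphi(x) = e^{x} are assumptions.
\[
  \inf_{x \in \mathbb{R}} e^{x} = 0, \qquad
  x_\varepsilon = \ln \varepsilon
  \;\Longrightarrow\;
  \varphi(x_\varepsilon) = \varepsilon \le \inf_{x \in \mathbb{R}} \varphi(x) + \varepsilon,
  \qquad
  |\varphi'(x_\varepsilon)| = \varepsilon .
\]
\[
  x_n = -n
  \;\Longrightarrow\;
  \varphi(x_n) = e^{-n} \searrow 0 = \inf_{x \in \mathbb{R}} \varphi(x),
  \qquad
  \varphi'(x_n) = e^{-n} \longrightarrow 0 .
\]
```

So the sequence $\{x_n\}_{n \ge 1}$ is minimizing and its derivatives tend to zero, although $\varphi$ has no minimizer and the sequence itself does not converge.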

REMARK 4.6.7 The above corollary asserts the existence of a minimizing sequence, whose elements satisfy the first order necessary conditions, up to any desired approximation.

COROLLARY 4.6.8 If $X$ is a Banach space and $\varphi\colon X \to \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable, then for each minimizing sequence $\{y_n\}_{n \ge 1}$ of $\varphi$ (i.e., $\varphi(y_n) \searrow \inf_{x \in X} \varphi(x)$), we can find another minimizing sequence $\{x_n\}_{n \ge 1}$ of $\varphi$, such that:
(a) $\varphi(x_n) \le \varphi(y_n)$;
(b) $\|x_n - y_n\|_X \longrightarrow 0$;
(c) $\big\| \varphi'_G(x_n) \big\|_{X^*} \longrightarrow 0$.

As we already said, the Ekeland variational principle is a very powerful tool of nonlinear analysis. Below we show how the well known Caristi fixed point theorem can be derived from the Ekeland variational principle. In fact we show that the two results are equivalent, in the sense that the Ekeland variational principle can also be derived from Caristi's fixed point theorem. First we state and prove Caristi's fixed point theorem.

THEOREM 4.6.9 (Caristi Fixed Point Theorem) If $(X, d_X)$ is a complete metric space, $\varphi\colon X \to \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function and $F\colon X \to 2^X \setminus \{\emptyset\}$ is a multifunction, such that
\[ \varphi(y) \le \varphi(x) - d_X(y, x) \quad \text{for some } y \in F(x) \text{ and all } x \in X, \tag{4.45} \]
then there exists $x_0 \in X$, such that $x_0 \in F(x_0)$.

PROOF By virtue of Corollary 4.6.3 with $\varepsilon = 1$, we can find $x_0 \in X$, such that
\[ \varphi(x_0) < \varphi(x) + d_X(x, x_0) \quad \forall\, x \ne x_0. \tag{4.46} \]
We claim that $x_0 \in F(x_0)$. Suppose that this is not true. Then for all $y \in F(x_0)$, we have that $y \ne x_0$. Let $y \in F(x_0)$ be as in (4.45). We have
\[ \varphi(y) \le \varphi(x_0) - d_X(y, x_0) \quad \text{and} \quad \varphi(x_0) < \varphi(y) + d_X(y, x_0) \]
(see (4.46)), a contradiction.
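As a sanity check of condition (4.45), here is a minimal worked instance. The data are our own assumptions and are not taken from the text: $X = \mathbb{R}$ with the usual metric, the single valued map $F(x) = \{x/2\}$ and $\varphi(x) = |x|$.

```latex
% Illustrative check of (4.45); X = \mathbb{R}, F(x) = \{x/2\}, \varphi(x) = |x| are assumptions.
\[
  y = \tfrac{x}{2} \in F(x), \qquad d_X(y, x) = \tfrac{|x|}{2}, \qquad
  \varphi(y) = \tfrac{|x|}{2} = |x| - \tfrac{|x|}{2} = \varphi(x) - d_X(y, x),
\]
% so (4.45) holds (with equality) for every x. The point x_0 = 0 satisfies (4.46):
\[
  \varphi(0) = 0 < |x| + |x| = \varphi(x) + d_X(x, 0) \quad \forall\, x \ne 0 ,
\]
% and indeed 0 \in F(0).
```

The fixed point produced by the theorem is, as expected, $x_0 = 0$.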


REMARK 4.6.10 We emphasize that no regularity conditions were imposed on the multifunction $F$ except for (4.45), which is a mild restriction. Suppose that $F$ has compact values and
\[ h\big(F(x), F(y)\big) \le k\, d_X(x, y) \quad \forall\, x, y \in X, \]
with $k \in (0, 1)$. Here $h(\cdot, \cdot)$ stands for the Hausdorff metric on the nonempty and closed subsets of $X$. Then we can apply Theorem 4.6.9 with
\[ \varphi(x) = \frac{1}{1 - k}\, d_X\big(x, F(x)\big). \]
Indeed let $y \in F(x)$ be such that $d_X\big(x, F(x)\big) = d_X(x, y)$. Such an element exists since $F(x)$ is compact. Then we have
\begin{align*}
(1 - k)\, d_X(x, y) &= d_X\big(x, F(x)\big) - k\, d_X(x, y) \\
&\le d_X\big(x, F(x)\big) - h\big(F(x), F(y)\big) \le d_X\big(x, F(x)\big) - d_X\big(y, F(y)\big),
\end{align*}
so $\varphi(y) \le \varphi(x) - d_X(x, y)$. So condition (4.45) is satisfied. The resulting fixed point theorem is a particular case of Nadler's fixed point theorem (see Theorem 7.4.3). Of course, if $F$ is single valued, we recover the well known Banach contraction principle (see Theorem 7.1.2). We should point out that Banach's fixed point theorem contains much more information.

PROPOSITION 4.6.11 The Caristi fixed point theorem (see Theorem 4.6.9) implies the Ekeland variational principle in the form of Corollary 4.6.3.

PROOF Let $\widehat{d}_X = \varepsilon\, d_X$. This is an equivalent metric on $X$. Proceeding by contradiction, suppose that there is no $x_\varepsilon \in X$ satisfying inequality (b) in Corollary 4.6.3. Then for every $x \in X$, we have
\[ F(x) = \big\{ y \in X : \varphi(x) \ge \varphi(y) + \widehat{d}_X(x, y),\ y \ne x \big\} \ne \emptyset. \]
The multifunction $F$ satisfies (4.45). So by Theorem 4.6.9, we can find $x_0 \in X$, such that $x_0 \in F(x_0)$. But this cannot happen, since from the definition of $F$, $x \notin F(x)$ for all $x \in X$.

There is another geometrical result of nonlinear analysis which is equivalent to some form of the Ekeland variational principle.

DEFINITION 4.6.12 Let $X$ be a normed space, $C \subseteq X$ a nonempty, convex set and $x \in X$. The drop associated with the pair $(x, C)$, denoted by $D(x, C)$, is the convex hull of $\{x\} \cup C$, i.e.,
\[ D(x, C) = \big\{ x + \lambda (c - x) : c \in C,\ \lambda \in [0, 1] \big\}. \]

REMARK 4.6.13 The set $D(x, C)$ is called a “drop,” a suitable name given its geometry.

The next result is known as the drop theorem.

THEOREM 4.6.14 (Drop Theorem) If $X$ is a normed space, $A \subseteq X$ is a complete set, $y \in X \setminus A$, $R = d_X(y, A)$ and $0 < r < R < \varrho$, then there exists $u \in A$, such that
\[ u \in \overline{B}_\varrho(y) \quad \text{and} \quad D\big(u, \overline{B}_r(y)\big) \cap A = \{u\}. \]

PROOF By translating things, if necessary, we may assume that $y = 0$. Let
\[ E \stackrel{df}{=} \overline{B}_\varrho \cap A. \]
This is a closed subset of $A$, hence it is a complete metric space (with the metric induced by the norm of $X$). We introduce the continuous function $\varphi\colon E \to \mathbb{R}_+$, defined by
\[ \varphi(x) \stackrel{df}{=} \frac{\varrho + r}{R - r}\, \|x\|_X. \]
We apply Corollary 4.6.3 with $\varepsilon = 1$ to obtain $u \in E$, such that
\[ \varphi(u) < \varphi(x) + \|u - x\|_X \quad \forall\, x \in E,\ x \ne u. \tag{4.47} \]
We need to show that
\[ D\big(u, \overline{B}_r(0)\big) \cap A = \{u\}. \]
Suppose that this is not true and let $v \in D\big(u, \overline{B}_r(0)\big) \cap A$, $v \ne u$. We have $v \in A$ and $v = (1 - \lambda) u + \lambda z$ for some $z \in \overline{B}_r(0)$ and some $\lambda \in [0, 1]$. Since $v \ne u$ and $r < R$, it follows that $\lambda \in (0, 1)$. We have
\[ \|v\|_X \le (1 - \lambda) \|u\|_X + \lambda \|z\|_X. \]
In particular $\|v\|_X \le (1 - \lambda) \varrho + \lambda r \le \varrho$, so $v \in E$. Because $u \in A$, we have $\|u\|_X \ge R$ and so it follows that
\[ \lambda (R - r) \le \lambda \big( \|u\|_X - \|z\|_X \big) \le \|u\|_X - \|v\|_X. \tag{4.48} \]
From (4.47) with $x = v$, we have
\[ \frac{\varrho + r}{R - r}\, \|u\|_X < \frac{\varrho + r}{R - r}\, \|v\|_X + \|v - u\|_X = \frac{\varrho + r}{R - r}\, \|v\|_X + \lambda \|u - z\|_X, \]
so
\[ \frac{\varrho + r}{R - r} \big( \|u\|_X - \|v\|_X \big) < \lambda \|u - z\|_X \]
and thus using also (4.48), we have
\[ \varrho + r < \|u - z\|_X. \]
But $\|u\|_X \le \varrho$ (recall that $y = 0$) and $z \in \overline{B}_r(0)$. Hence $\|u - z\|_X \le \varrho + r$, a contradiction.

This geometrical result is in fact equivalent to the Ekeland variational principle stated in the following form, which can be easily deduced from Corollary 4.6.3.

PROPOSITION 4.6.15 If $(X, d_X)$ is a complete metric space and $\varphi\colon X \to \overline{\mathbb{R}}$ is a proper, lower semicontinuous function which is bounded from below, then for any $\beta > 0$ and any $x_0 \in X$, there exists $y \in X$, such that:
(a) $\varphi(y) < \varphi(x) + \beta\, d_X(y, x)$ for all $x \ne y$;
(b) $\varphi(y) \le \varphi(x_0) - \beta\, d_X(y, x_0)$.

In this form the Ekeland variational principle is equivalent to the drop theorem. For the proof of this, see Penot (1986).

PROPOSITION 4.6.16 The drop theorem (see Theorem 4.6.14) is equivalent to the Ekeland variational principle in the form of Proposition 4.6.15.

We continue with the applications of the Ekeland variational principle.

PROPOSITION 4.6.17 If $X$ is a Banach space, $\varphi\colon X \to \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable and there exist $a, c > 0$, such that
\[ a \|x\|_X - c \le \varphi(x) \quad \forall\, x \in X, \tag{4.49} \]
then $\varphi'_G(X)$ is dense in $a \overline{B}_1^{X^*}$, where
\[ \overline{B}_1^{X^*} = \big\{ x^* \in X^* : \|x^*\|_{X^*} \le 1 \big\}. \]

PROOF Let $x^* \in a \overline{B}_1^{X^*}$ and consider the function
\[ \psi(x) \stackrel{df}{=} \varphi(x) - \langle x^*, x \rangle_X. \]
Evidently $\psi$ is lower semicontinuous, bounded from below (see (4.49)) and Gâteaux differentiable. Applying Theorem 4.6.5, we obtain $y_\varepsilon \in X$, such that
\[ \big\| \psi'_G(y_\varepsilon) \big\|_{X^*} \le \varepsilon. \]
But
\[ \psi'_G(y_\varepsilon) = \varphi'_G(y_\varepsilon) - x^*. \]
Hence
\[ \big\| \varphi'_G(y_\varepsilon) - x^* \big\|_{X^*} \le \varepsilon. \]
Since $x^* \in a \overline{B}_1^{X^*}$ was arbitrary, we conclude that $\varphi'_G(X)$ is dense in $a \overline{B}_1^{X^*}$.
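A one dimensional instance shows that "dense" cannot in general be improved to "equal" in Proposition 4.6.17. The example below is a sketch with our own choice of function ($X = \mathbb{R}$, $\varphi(x) = \sqrt{1 + x^2}$, $a = 1$); it is not taken from the text.

```latex
% Illustrative example; X = \mathbb{R}, \varphi(x) = \sqrt{1 + x^{2}}, a = 1 are assumptions.
\[
  \varphi(x) = \sqrt{1 + x^{2}} \ge |x| \ge |x| - c
  \quad \forall\, x \in \mathbb{R},\ c > 0,
  \qquad \text{so (4.49) holds with } a = 1 .
\]
\[
  \varphi'(x) = \frac{x}{\sqrt{1 + x^{2}}}, \qquad
  \varphi'(\mathbb{R}) = (-1, 1), \qquad
  \overline{\varphi'(\mathbb{R})} = [-1, 1] = a \overline{B}_1^{X^*} .
\]
```

The endpoints $\pm 1$ are approached by $\varphi'(x)$ as $x \to \pm\infty$ but are never attained.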

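COROLLARY 4.6.18 If $X$ is a Banach space, $\varphi\colon X \to \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable and there exists a continuous function $\vartheta\colon \mathbb{R}_+ \to \mathbb{R}$, such that
\[ \frac{\vartheta(s)}{s} \longrightarrow +\infty \quad \text{as } s \to +\infty \]
and
\[ \varphi(x) \ge \vartheta\big( \|x\|_X \big) \quad \forall\, x \in X, \]
then $\varphi'_G(X)$ is dense in $X^*$.

PROOF Let $a > 0$. We can find $s_a > 0$, such that
\[ \vartheta(s) \ge a s \quad \forall\, s \ge s_a. \]
Hence we have that
\[ \varphi(x) \ge a \|x\|_X \quad \forall\, x \in X,\ \|x\|_X \ge s_a. \]
On the other hand, if $\|x\|_X < s_a$, then $\varphi(x) \ge m_a$, where
\[ m_a \stackrel{df}{=} \min_{s \in [0, s_a]} \vartheta(s), \]
and so $\varphi(x) \ge -|m_a| \ge a \|x\|_X - a s_a - |m_a|$. Therefore, finally we have
\[ \varphi(x) \ge a \|x\|_X - c_a \quad \forall\, x \in X, \quad \text{with } c_a \stackrel{df}{=} a s_a + |m_a| > 0. \]
Apply Proposition 4.6.17 to obtain that $\varphi'_G(X)$ is dense in $a \overline{B}_1^{X^*}$. Since $a > 0$ was arbitrary, we conclude that $\varphi'_G(X)$ is dense in $X^*$.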


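Recall that if $X$ is a Banach space and $\varphi \in \Gamma_0(X)$, then $D(\partial\varphi) \subseteq \operatorname{dom} \varphi$ (recall that $D(\partial\varphi) = \big\{ x \in X : \partial\varphi(x) \ne \emptyset \big\}$ and $\operatorname{dom} \varphi = \big\{ x \in X : \varphi(x) < +\infty \big\}$). We would like to have a more precise relation between these two sets.

PROPOSITION 4.6.19 If $X$ is a Banach space, $\varphi \in \Gamma_0(X)$, $x_0 \in \operatorname{dom} \varphi$, $\varepsilon > 0$ and $x_0^* \in \partial_\varepsilon \varphi(x_0)$, then there exist $x \in X$ and $x^* \in \partial\varphi(x)$, such that
(a) $\|x - x_0\|_X \le \sqrt{\varepsilon}$;
(b) $\big| \varphi(x) - \varphi(x_0) \big| \le \varepsilon + \sqrt{\varepsilon}$;
(c) $\|x^* - x_0^*\|_{X^*} \le \sqrt{\varepsilon}\, \big( 1 + \|x_0^*\|_{X^*} \big)$.

PROOF Let
\[ \psi(x) \stackrel{df}{=} \varphi(x) - \langle x_0^*, x \rangle_X. \]
We have $\psi \in \Gamma_0(X)$ and
\[ \psi(x) \ge -\varphi^*(x_0^*) \ge \varphi(x_0) - \langle x_0^*, x_0 \rangle_X - \varepsilon \]
(see Remark 4.4.50). Moreover, from Remark 4.4.50, we have
\[ \psi(x_0) \le \inf_{x \in X} \psi(x) + \varepsilon. \]
On $X$ we use the norm
\[ |||x|||_X = \|x\|_X + \big| \langle x_0^*, x \rangle_X \big|, \]
which is equivalent to the original norm $\|\cdot\|_X$. Invoking Corollary 4.6.4, we obtain $x \in X$, such that $\psi(x) \le \psi(x_0)$ and
\[ |||x - x_0|||_X = \|x - x_0\|_X + \big| \langle x_0^*, x - x_0 \rangle_X \big| \le \sqrt{\varepsilon} \tag{4.50} \]
and
\[ \varphi(x) - \langle x_0^*, x \rangle_X \le \varphi(y) - \langle x_0^*, y \rangle_X + \sqrt{\varepsilon}\, |||x - y|||_X \quad \forall\, y \in X. \tag{4.51} \]
From (4.51) we obtain that
\[ \varphi(x) - \langle x_0^*, x \rangle_X = \inf_{y \in X} \psi_1(y), \]
where
\[ \psi_1(y) \stackrel{df}{=} \varphi(y) - \langle x_0^*, y \rangle_X + \sqrt{\varepsilon}\, |||x - y|||_X, \]
so, from Proposition 4.4.30, we have $0 \in \partial\psi_1(x)$. Invoking Proposition 4.4.31, we have
\[ \partial\psi_1(x) = \partial\varphi(x) - x_0^* + \sqrt{\varepsilon}\, \partial\big( |||\cdot|||_X \big)(0). \]
Note that
\[ \partial\big( |||\cdot|||_X \big)(0) = \big\{ u^* + \lambda x_0^* : u^* \in \overline{B}_1^{X^*},\ \lambda \in [-1, 1] \big\} \]
(see Example 4.4.24(b)). Therefore, we have
\[ 0 = x^* - x_0^* + \sqrt{\varepsilon}\, \big( u^* + \lambda x_0^* \big), \]
with $x^* \in \partial\varphi(x)$, $u^* \in \overline{B}_1^{X^*}$ and $\lambda \in [-1, 1]$, so
\[ \|x^* - x_0^*\|_{X^*} \le \sqrt{\varepsilon}\, \big( 1 + \|x_0^*\|_{X^*} \big), \]
which proves (c). Also from (4.50) and since
\[ \psi(x_0) \le \inf_{x \in X} \psi(x) + \varepsilon, \]
we have
\[ \big| \varphi(x) - \varphi(x_0) \big| \le \psi(x_0) - \psi(x) + \big| \langle x_0^*, x - x_0 \rangle_X \big| \le \varepsilon + \sqrt{\varepsilon}, \]
which proves (b). Finally again from (4.50), we have $\|x - x_0\|_X \le \sqrt{\varepsilon}$, which proves (a).

THEOREM 4.6.20 If $X$ is a Banach space and $\varphi \in \Gamma_0(X)$, then $D(\partial\varphi)$ is dense in $\operatorname{dom} \varphi$.

PROOF Let $x_0 \in \operatorname{dom} \varphi$, $n \ge 1$ and $x_{0n}^* \in \partial_{\frac{1}{n}} \varphi(x_0)$ (see Proposition 4.4.51). Invoking Proposition 4.6.19, we obtain $x_n \in D(\partial\varphi)$, such that
\[ \|x_n - x_0\|_X \le \frac{1}{\sqrt{n}} \quad \text{and} \quad \big| \varphi(x_n) - \varphi(x_0) \big| \le \frac{1}{n} + \frac{1}{\sqrt{n}}. \]
Hence $x_n \longrightarrow x_0$ in $X$ and, since $x_0 \in \operatorname{dom} \varphi$ was arbitrary, $D(\partial\varphi)$ is dense in $\operatorname{dom} \varphi$.

REMARK 4.6.21 Actually in the proof of Theorem 4.6.20, we obtained something stronger. Namely if $x_0 \in \operatorname{dom} \varphi$, we can find a sequence $\{x_n\}_{n \ge 1} \subseteq D(\partial\varphi)$, such that
\[ x_n \longrightarrow x_0 \ \text{in } X \quad \text{and} \quad \varphi(x_n) \longrightarrow \varphi(x_0) \ \text{in } \mathbb{R}. \]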
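The inclusion $D(\partial\varphi) \subseteq \operatorname{dom} \varphi$ can be strict, so density is the natural statement. A minimal sketch, with a function chosen by us for illustration (it does not appear in the text):

```latex
% Illustrative example; X = \mathbb{R} and the function \varphi below are assumptions.
\[
  \varphi(x) =
  \begin{cases}
    -\sqrt{x} & \text{if } x \ge 0, \\
    +\infty   & \text{if } x < 0,
  \end{cases}
  \qquad \varphi \in \Gamma_0(\mathbb{R}), \qquad \operatorname{dom} \varphi = [0, +\infty).
\]
% A subgradient s at 0 would have to satisfy -\sqrt{x} \ge s x for all x > 0,
% i.e. s \le -1/\sqrt{x} for all x > 0, which is impossible. Hence
\[
  \partial\varphi(0) = \emptyset, \qquad
  \partial\varphi(x) = \Big\{ -\tfrac{1}{2\sqrt{x}} \Big\} \ \ (x > 0), \qquad
  D(\partial\varphi) = (0, +\infty) \subsetneq \operatorname{dom} \varphi .
\]
% The approximating sequence of Remark 4.6.21 at x_0 = 0:
\[
  x_n = \tfrac{1}{n} \longrightarrow 0, \qquad
  \varphi(x_n) = -\tfrac{1}{\sqrt{n}} \longrightarrow 0 = \varphi(0) .
\]
```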


Recall the following theorem from the theory of bounded linear operators between Banach spaces.

THEOREM 4.6.22 If $X$ and $Y$ are two Banach spaces and $A \in \mathcal{L}(X; Y)$, then the following statements are equivalent:
(a) $A$ is surjective;
(b) there exists $c > 0$, such that
\[ \|y^*\|_{Y^*} \le c\, \|A^* y^*\|_{X^*} \quad \forall\, y^* \in Y^*; \]
(c) $N(A^*) = \{0\}$ and $R(A^*)$ is closed.

REMARK 4.6.23 According to this theorem, if $A$ is surjective, then $A^*$ is injective. If one of the spaces $X$ or $Y$ is finite dimensional, then the converse is also true.

We want to produce a nonlinear analog of Theorem 4.6.22.

THEOREM 4.6.24 If $X$ and $Y$ are two Banach spaces, $\varphi\colon X \to Y$ is a Gâteaux differentiable map, $\varphi(X)$ is closed in $Y$, $y \in Y$ and there exist $\varrho > 0$ and $k \in [0, 1)$, such that
\[ \varphi^{-1}\big( B_\varrho(y) \big) \ne \emptyset \tag{4.52} \]
and
\[ \inf_{z \in R(\varphi'_G(x))} \|y - \varphi(x) - z\|_Y \le k\, \|y - \varphi(x)\|_Y \quad \forall\, x \in \varphi^{-1}\big( B_\varrho(y) \big), \tag{4.53} \]
then $y \in \varphi(X)$.

PROOF Let $A = \varphi(X)$. By hypothesis $A \subseteq Y$ is closed. We proceed by contradiction. So suppose that $y \notin \varphi(X)$. Let
\[ R \stackrel{df}{=} d_Y(y, A) \]
and choose $\varrho, r > 0$, such that
\[ r < R < \varrho \quad \text{and} \quad k \varrho < r. \]
Note that if (4.52) and (4.53) hold for some $\varrho_0 > 0$, then they also hold for any $\varrho \in (R, \varrho_0)$. According to Theorem 4.6.14, we can find $u_0 \in B_\varrho(y)$, such that
\[ D(u_0, C) \cap A = \{u_0\}, \]
where
\[ C \stackrel{df}{=} \overline{B}_r(y). \]
Let $x_0 \in X$ be such that $\varphi(x_0) = u_0$. From (4.53) and recalling that $k \varrho < r$, we have
\[ \inf_{z \in R(\varphi'_G(x_0))} \big\| y - \varphi(x_0) - z \big\|_Y \le k\, \big\| y - \varphi(x_0) \big\|_Y < r. \]
So we can find $h \in X$, such that
\[ \big\| y - \varphi(x_0) - \varphi'_G(x_0) h \big\|_Y < r. \tag{4.54} \]
From this inequality, it follows that for $\lambda > 0$ small, we have
\[ \Big\| y - \varphi(x_0) - \frac{\varphi(x_0 + \lambda h) - \varphi(x_0)}{\lambda} \Big\|_Y < r. \]
Let
\[ v_\lambda \stackrel{df}{=} y - \varphi(x_0) - \frac{\varphi(x_0 + \lambda h) - \varphi(x_0)}{\lambda} \in Y. \]
Since $\|v_\lambda\|_Y < r$, we have $y - v_\lambda \in C$ and so, from Definition 4.6.12, we see that
\[ y - v_\lambda \in D(u_0, C), \]
where $C = \overline{B}_r(y)$, so
\[ (1 - \lambda) u_0 + \lambda (y - v_\lambda) \in D(u_0, C) \quad \forall\, \lambda \in (0, 1) \]
and thus, because $(1 - \lambda) u_0 + \lambda (y - v_\lambda) = \varphi(x_0 + \lambda h)$, we have
\[ \varphi(x_0 + \lambda h) \in D(u_0, C) \quad \forall\, \lambda > 0 \text{ small enough}. \]
Because $D(u_0, C) \cap A = \{u_0\}$, it follows that
\[ \varphi(x_0 + \lambda h) = u_0 \quad \forall\, \lambda > 0 \text{ small enough}, \]
so $\varphi'_G(x_0) h = 0$. Using this in (4.54), we obtain
\[ \big\| y - \varphi(x_0) \big\|_Y < r < R, \]
a contradiction.

REMARK 4.6.25 If in the above theorem conditions (4.52) and (4.53) hold for all $y \in Y$, then we conclude that $\varphi$ is surjective. Note that conditions (4.52) and (4.53) are in a sense complementary. Namely the larger $\varrho > 0$, the more difficult it is to verify (4.53).


COROLLARY 4.6.26 If $X$ and $Y$ are two Banach spaces, $\varphi\colon X \to Y$ is a Gâteaux differentiable map, $\varphi(X)$ is closed in $Y$ and
\[ N\big( (\varphi'_G(x))^* \big) = \{0\} \quad \forall\, x \in X, \]
then $\varphi$ is surjective.

PROOF Recall that
\[ \overline{R\big( \varphi'_G(x) \big)} = {}^{\perp} N\big( (\varphi'_G(x))^* \big) \]
(see, e.g., Denkowski, Migórski & Papageorgiou (2003a, p. 320)). Since by hypothesis $N\big( (\varphi'_G(x))^* \big) = \{0\}$, the range $R\big( \varphi'_G(x) \big)$ is dense in $Y$ and it follows that (4.53) is true with $k = 0$. So for a given $y \in Y$, let $\varrho > 0$ be such that
\[ d_Y\big( y, \varphi(X) \big) < \varrho \]
and let $k = 0$. Then we can apply Theorem 4.6.24 and conclude that $y \in \varphi(X)$. This proves that $\varphi$ is surjective.

REMARK 4.6.27 It is clear from the proof of the above corollary that the hypothesis
\[ N\big( (\varphi'_G(x))^* \big) = \{0\} \quad \forall\, x \in X \]
can be replaced by the hypothesis that
\[ R\big( \varphi'_G(x) \big) \text{ is dense in } Y \quad \forall\, x \in X. \]
Moreover, if $\varphi'_G(x) \in \Phi(X; Y)$ and $\operatorname{ind} \varphi'_G(x) = 0$ for all $x \in X$ (see Definition 3.1.60), then the hypothesis
\[ N\big( (\varphi'_G(x))^* \big) = \{0\} \quad \forall\, x \in X \]
can be replaced by the hypothesis that
\[ N\big( \varphi'_G(x) \big) = \{0\} \quad \forall\, x \in X. \]
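The closedness of $\varphi(X)$ in Corollary 4.6.26 cannot be dropped. The following one dimensional sketch uses a function of our own choosing ($X = Y = \mathbb{R}$, $\varphi = \arctan$), which is not from the text. In this setting $(\varphi'(x))^*$ is multiplication by the number $\varphi'(x)$, so the kernel condition simply says $\varphi'(x) \ne 0$.

```latex
% Illustrative example; X = Y = \mathbb{R}, \varphi(x) = \arctan x are assumptions.
\[
  \varphi(x) = \arctan x, \qquad
  \varphi'(x) = \frac{1}{1 + x^{2}} \ne 0
  \;\Longrightarrow\;
  N\big( (\varphi'(x))^* \big) = \{0\} \quad \forall\, x \in \mathbb{R},
\]
\[
  \text{but} \qquad
  \varphi(\mathbb{R}) = \Big( -\frac{\pi}{2}, \frac{\pi}{2} \Big)
  \ \text{is not closed in } \mathbb{R}, \ \text{and } \varphi \text{ is not surjective.}
\]
% By contrast, \varphi(x) = x^{3} + x has \varphi'(x) = 3x^{2} + 1 \ne 0 and
% \varphi(\mathbb{R}) = \mathbb{R} (closed), in agreement with the corollary.
```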

The discussion so far has illustrated the power of the Ekeland variational principle. The only difficulty that we encounter when using this principle is that the perturbation function ε kx − xε k is not differentiable at the origin. So it is natural to ask whether it is possible to formulate a similar variational principle but for a different class of perturbations, which would include functions differentiable at points of interest. This was achieved by Borwein & Preiss (1987), who obtained the following theorem.


THEOREM 4.6.28 If $X$ is a Banach space, $\varphi\colon X \to \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function, $\varepsilon > 0$ and $x_0 \in X$ is such that
\[ \varphi(x_0) \le \inf_{x \in X} \varphi(x) + \varepsilon, \]
then for any $\lambda > 0$ and any $p \in [1, +\infty)$, we can find $y_\lambda \in X$, a sequence $\{x_n\}_{n \ge 1} \subseteq X$, such that
\[ x_n \longrightarrow y_\lambda \ \text{in } X \]
and a sequence $\{t_n\}_{n \ge 1} \subseteq [0, 1]$ satisfying $\sum_{n=1}^{\infty} t_n = 1$, such that:
(a) $\varphi(y_\lambda) \le \varphi(x_0)$;
(b) $\|x_n - x_0\|_X \le \lambda$ for all $n \ge 1$;
(c) the function $\psi(x) = \varphi(x) + \dfrac{\varepsilon}{\lambda^p} \sum_{n=1}^{\infty} t_n \|x - x_n\|_X^p$ attains a strict minimum at $y_\lambda$.

The next step in the development of variational principles was made by Deville, Godefroy & Zizler (1993). Their starting point was the argument in the Borwein-Preiss variational principle. What matters in that argument is the function $\vartheta(x) \stackrel{df}{=} 1 - \|x\|_X$ and in particular its behaviour within the unit ball. That is, the behaviour of $\vartheta$ outside the domain where $\vartheta$ is nonnegative plays no role in the argument. So we may as well replace $\vartheta$ by the function
\[ \widehat{\vartheta}(x) \stackrel{df}{=} \vartheta^+(x) = \max\big\{ 0, 1 - \|x\|_X \big\}. \]
Note that $\widehat{\vartheta}$ is a continuous bump function (i.e., a continuous function on $X$ which has a nonempty and bounded support; see Remark 4.4.84). This observation is interesting because there are Banach spaces which admit smooth bump functions, but do not have an equivalent differentiable norm (see Fabian (1997)).

THEOREM 4.6.29 If $X$ is a Banach space which admits a Lipschitz continuous bump function that is Fréchet (respectively Gâteaux) differentiable, $\varphi\colon X \to \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function and $\varepsilon > 0$, then there exists a Lipschitz continuous function $g\colon X \to \mathbb{R}$ which is Fréchet (respectively Gâteaux) differentiable, such that
\[ \|g\|_\infty = \sup_{x \in X} \big| g(x) \big| \le \varepsilon, \qquad \|g'\|_\infty = \sup_{x \in X} \big\| g'(x) \big\|_{X^*} \le \varepsilon \]
and $\varphi - g$ attains its minimum on $X$.


PROOF We do the proof for the Fréchet differentiable case. The proof of the Gâteaux differentiable case is done similarly. Let $V$ be the linear space of all bounded functions $g\colon X \to \mathbb{R}$ which are Lipschitz continuous and Fréchet differentiable. Evidently
\[ \big\| g'(x) \big\|_{X^*} \le \operatorname{Lip}(g) \quad \forall\, g \in V,\ x \in X \]
(by $\operatorname{Lip}(g)$ we denote the Lipschitz constant of $g$). So the function $x \longmapsto g'(x)$ is bounded and by the mean value theorem (see Proposition 4.1.21), we have $\operatorname{Lip}(g) \le \|g'\|_\infty$. It is easy to see that $V$ supplied with the norm
\[ \|g\|_V = \|g\|_\infty + \|g'\|_\infty \]
becomes a Banach space. For every $n \ge 1$, let
\[ A_n \stackrel{df}{=} \Big\{ g \in V : \text{there exists } x_0 \in X, \text{ such that } (\varphi - g)(x_0) < \inf_{x \in X \setminus B_{\frac{1}{n}}(x_0)} (\varphi - g)(x) \Big\}, \]
where
\[ B_{\frac{1}{n}}(x_0) = \Big\{ x \in X : \|x - x_0\|_X < \frac{1}{n} \Big\}. \]
We claim that for every $n \ge 1$, the set $A_n$ is open and dense in $V$. To do this note that $\|\cdot\|_\infty \le \|\cdot\|_V$. So it follows that $A_n$ is open. To show the density of $A_n$, let $g \in V$ and $\varepsilon > 0$. We need to find $h \in V$ with $\|h\|_V < \varepsilon$ and $x_0 \in X$, such that
\[ (\varphi - g - h)(x_0) < \inf_{x \in X \setminus B_{\frac{1}{n}}(x_0)} (\varphi - g - h)(x). \]
By hypothesis, after translating and rescaling the given bump function, we can find $b \in V$, such that $b(0) > 0$, $\|b\|_V < \varepsilon$ and $b(x) = 0$ whenever $\|x\|_X \ge \frac{1}{n}$. The function $\varphi - g$ is bounded from below. So we can find $x_0 \in X$, such that
\[ (\varphi - g)(x_0) < \inf_{x \in X} (\varphi - g)(x) + b(0). \]
Let
\[ h(x) \stackrel{df}{=} b(x - x_0). \]
Evidently $h \in V$ with $\|h\|_V < \varepsilon$, and we have
\[ (\varphi - g - h)(x_0) = (\varphi - g)(x_0) - b(0) < \inf_{x \in X} (\varphi - g)(x). \]
Since $h|_{X \setminus B_{\frac{1}{n}}(x_0)} \equiv 0$, we have
\[ (\varphi - g - h)(x) = (\varphi - g)(x) \ge \inf_{x \in X} (\varphi - g)(x) \quad \forall\, x \in X \setminus B_{\frac{1}{n}}(x_0). \]
Hence $g + h \in A_n$ and this proves the density of $A_n$ in $V$. Then by the Baire category theorem (see Theorem A.1.10), we have that
\[ D = \bigcap_{n=1}^{\infty} A_n \subseteq V \ \text{is a dense } G_\delta \text{ set}. \]
Next we show that if $g \in D$, then $\varphi - g$ attains its minimum on $X$. From the definition of $A_n$, we can find $x_n \in X$, such that
\[ (\varphi - g)(x_n) < \inf_{x \in X \setminus B_{\frac{1}{n}}(x_n)} (\varphi - g)(x). \]
For $m \ge n$, we have $x_m \in B_{\frac{1}{n}}(x_n)$, or otherwise we would have
\[ (\varphi - g)(x_n) < (\varphi - g)(x_m) \quad \text{and} \quad \|x_n - x_m\|_X \ge \frac{1}{n} \ge \frac{1}{m}. \tag{4.55} \]
By virtue of the second inequality and the choice of $x_m$, we have
\[ (\varphi - g)(x_m) < (\varphi - g)(x_n), \]
which contradicts the first inequality in (4.55). Therefore we infer that $\{x_n\}_{n \ge 1} \subseteq X$ is a Cauchy sequence in $X$ and $x_n \longrightarrow \widehat{x}$ in $X$. We claim that $\widehat{x}$ is a minimizer of $\varphi - g$. Because $\varphi - g$ is lower semicontinuous, we have
\[ (\varphi - g)(\widehat{x}) \le \liminf_{n \to +\infty} (\varphi - g)(x_n) \le \liminf_{n \to +\infty} \Big( \inf_{x \in X \setminus B_{\frac{1}{n}}(x_n)} (\varphi - g)(x) \Big). \tag{4.56} \]
If $u \in X$, $u \ne \widehat{x}$, then
\[ \|x_n - u\|_X \ge \frac{1}{n} \quad \forall\, n \ge 1 \text{ large enough} \]
and so for $n \ge 1$ large enough, we have
\[ \inf_{x \in X \setminus B_{\frac{1}{n}}(x_n)} (\varphi - g)(x) \le (\varphi - g)(u). \]
Using this in (4.56), we conclude that
\[ (\varphi - g)(\widehat{x}) = \inf_{x \in X} (\varphi - g)(x). \]
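To make the bump function hypothesis of Theorem 4.6.29 concrete, here is a sketch in a Hilbert space $H$. The setting, the function $b$ and the scaled family $b_{n,c}$ are our own illustrative choices and not part of the text; the derivative is written via the Riesz identification $H^* = H$.

```latex
% Illustrative example; the Hilbert space H and the functions b, b_{n,c} are assumptions.
\[
  b(x) =
  \begin{cases}
    \big( 1 - \|x\|_H^{2} \big)^{2} & \text{if } \|x\|_H \le 1, \\
    0 & \text{if } \|x\|_H > 1,
  \end{cases}
  \qquad
  b'(x) =
  \begin{cases}
    -4 \big( 1 - \|x\|_H^{2} \big)\, x & \text{if } \|x\|_H \le 1, \\
    0 & \text{if } \|x\|_H > 1 .
  \end{cases}
\]
% With t = \|x\|_H \in [0,1], the map t \mapsto 4(1 - t^{2})t is maximal at t = 1/\sqrt{3}:
\[
  \|b\|_\infty = b(0) = 1, \qquad
  \|b'\|_\infty = \max_{t \in [0,1]} 4 (1 - t^{2})\, t = \frac{8}{3\sqrt{3}} .
\]
% Rescaling gives bumps supported in small balls with small norm in V:
\[
  b_{n,c}(x) = c\, b(n x), \quad c > 0
  \;\Longrightarrow\;
  b_{n,c}(x) = 0 \ \text{for } \|x\|_H \ge \tfrac{1}{n}, \qquad
  \|b_{n,c}\|_\infty + \|b_{n,c}'\|_\infty = c \Big( 1 + \frac{8 n}{3\sqrt{3}} \Big) .
\]
```

Choosing $c$ small enough makes the last quantity smaller than any prescribed $\varepsilon > 0$, which is the kind of bump used in the density argument of the proof above.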


REMARK 4.6.30 If the norm of $X$ is Fréchet differentiable away from the origin, then the function
\[ x \longmapsto \|x\|_X^2 \]
is a $C^1$-function on $X$. If $\xi\colon \mathbb{R}_+ \to \mathbb{R}$ is a $C^1$-function, such that
\[ \xi(0) = 1 \quad \text{and} \quad \xi(s) = 0 \quad \forall\, s \ge 1, \]
then
\[ b(x) \stackrel{df}{=} \xi\big( \|x\|_X^2 \big) \]
is a $C^1$-function on $X$, such that
\[ b(0) = 1 \quad \text{and} \quad b(x) = 0 \quad \forall\, x \notin B_1(0). \]
This is a $C^1$-bump function. Note that if $X^*$ is separable, then $X$ admits a $C^1$-bump function. Indeed in this case $X$ admits an equivalent Fréchet differentiable norm and so the $C^1$-bump function is constructed as indicated above. Every separable Banach space admits an equivalent Gâteaux differentiable norm and so it admits a Lipschitz continuous, Gâteaux differentiable bump function.

The following result is an interesting consequence of Theorem 4.6.29.

PROPOSITION 4.6.31 If the Banach space $X$ admits a Lipschitz continuous and Fréchet (respectively Gâteaux) differentiable bump function, then every continuous convex function defined on $X$ is Fréchet (respectively Gâteaux) differentiable on a dense subset of $X$. In particular, if the Banach space $X$ admits a Lipschitz continuous and Fréchet differentiable bump function, then $X$ is an Asplund space (see Definition 4.2.18).

PROOF Again we do the proof for the Fréchet differentiable case. The proof for the Gâteaux differentiable case is similar. Let $\varphi$ be a continuous concave function on $X$ (so that $-\varphi$ is an arbitrary continuous convex function) and let $b$ be a Lipschitz continuous and Fréchet differentiable function on $X$ with
\[ b(0) \ne 0 \quad \text{and} \quad b(x) = 0 \quad \forall\, x \notin B_1(0). \]
Let $x_0 \in X$ and choose $\delta > 0$, such that
\[ \varphi(x_0) - 1 < \varphi(x) \quad \forall\, x \in B_\delta(x_0). \]
Choose $m > \frac{1}{\delta}$. If $\|x - x_0\|_X \ge \delta$, we have
\[ \big\| m (x - x_0) \big\|_X > 1 \]
and so
\[ b\big( m (x - x_0) \big) = 0. \]
We define the function
\[ f(x) \stackrel{df}{=}
\begin{cases}
\dfrac{1}{b(m(x - x_0))^2} & \text{if } b\big( m (x - x_0) \big) \ne 0, \\
+\infty & \text{otherwise}.
\end{cases} \]
The function $\varphi + f\colon X \to \overline{\mathbb{R}}$ is proper, lower semicontinuous and bounded from below (by $\varphi(x_0) - 1$). So we can apply Theorem 4.6.29 and obtain a Lipschitz continuous and Fréchet differentiable function $g\colon X \to \mathbb{R}$, such that $\varphi + f - g$ attains its minimum at some point $y_0 \in B_\delta(x_0)$ (since $f \equiv +\infty$ outside $B_\delta(x_0)$). Let $U$ be a neighbourhood of $y_0$ on which the function
\[ x \longmapsto b\big( m (x - x_0) \big) \]
is nonzero. Then the function $f$ is Fréchet differentiable on $U$. We have
\[ \varphi(y_0) + f(y_0) - g(y_0) \le \varphi(y) + f(y) - g(y) \quad \forall\, y \in U, \]
so
\[ -\varphi(y) \le -\varphi(y_0) - f(y_0) + g(y_0) + f(y) - g(y) \quad \forall\, y \in U. \]
Let
\[ v(y) \stackrel{df}{=} -\varphi(y_0) - f(y_0) + g(y_0) + f(y) - g(y) \quad \forall\, y \in U. \]
We have
\[ -\varphi(y) \le v(y) \quad \forall\, y \in U \quad \text{and} \quad -\varphi(y_0) = v(y_0). \]
If $\|h\|_X$ is small, from the convexity of $-\varphi$, we have
\[ 0 \le (-\varphi)(y_0 + h) + (-\varphi)(y_0 - h) - 2(-\varphi)(y_0) \le v(y_0 + h) + v(y_0 - h) - 2 v(y_0). \tag{4.57} \]
Since $v$ is Fréchet differentiable, we have
\[ v(y_0 + h) + v(y_0 - h) - 2 v(y_0) = \big\langle v'_F(y_0), h \big\rangle_X - \big\langle v'_F(y_0), h \big\rangle_X + o\big( \|h\|_X \big). \tag{4.58} \]
Hence from (4.57) and (4.58), it follows that
\[ (-\varphi)(y_0 + h) + (-\varphi)(y_0 - h) - 2(-\varphi)(y_0) = o\big( \|h\|_X \big) \]


and this by virtue of Proposition 4.2.9 implies that $-\varphi$ is Fréchet differentiable at $y_0$. Since $x_0 \in X$ and $\delta > 0$ were arbitrary, we conclude that $-\varphi$ is Fréchet differentiable on a dense subset of $X$. Finally for the last part of the proposition, recall that the set of points of differentiability of $-\varphi$ is a $G_\delta$ set (see the proof of Theorem 4.2.12).

REMARK 4.6.32 It is not known whether every Asplund space admits a Fréchet differentiable bump function.

We conclude this section with a generalization of Theorem 4.6.1 which is useful when we study boundary value problems using variational methods. This generalization is due to Zhong (1997), where the interested reader can find its proof.

THEOREM 4.6.33 If $h\colon \mathbb{R}_+ \to \mathbb{R}_+$ is a continuous, nondecreasing function, such that
\[ \int_0^{+\infty} \frac{1}{1 + h(s)}\, ds = +\infty, \]
$(X, d_X)$ is a complete metric space, $x_0 \in X$ is fixed, $\varphi\colon X \to \overline{\mathbb{R}}$ is a proper, lower semicontinuous and bounded below function, $\varepsilon > 0$, $y \in X$ satisfies
\[ \varphi(y) \le \inf_{x \in X} \varphi(x) + \varepsilon \]
and $\lambda > 0$, then there exists $x_\lambda \in X$ such that
\[ \varphi(x_\lambda) \le \varphi(y), \qquad d_X(x_\lambda, x_0) \le r_0 + r \]
and
\[ \varphi(x_\lambda) \le \varphi(x) + \frac{\varepsilon}{\lambda \big( 1 + h(d_X(x_0, x_\lambda)) \big)}\, d_X(x_\lambda, x) \quad \forall\, x \in X, \]
where
\[ r_0 \stackrel{df}{=} d_X(x_0, y) \]
and $r > 0$ is such that
\[ \int_{r_0}^{r_0 + r} \frac{1}{1 + h(s)}\, ds \ge \lambda. \]

REMARK 4.6.34 If $h \equiv 0$ and $x_0 = y$, then Theorem 4.6.33 reduces to Theorem 4.6.1.
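The role of the radius $r$ in Theorem 4.6.33 becomes transparent when the integral is computed explicitly. The two computations below are a sketch: the first spells out the reduction of Remark 4.6.34, the second uses $h(s) = s$, which is our own illustrative choice and does not come from the text.

```latex
% Case h \equiv 0, x_0 = y (Remark 4.6.34): r_0 = 0 and the integral condition reads r \ge \lambda.
\[
  \int_{0}^{r} ds = r \ge \lambda
  \;\Longrightarrow\; \text{take } r = \lambda: \qquad
  d_X(x_\lambda, y) \le \lambda, \qquad
  \varphi(x_\lambda) \le \varphi(x) + \frac{\varepsilon}{\lambda}\, d_X(x_\lambda, x)
  \quad \forall\, x \in X .
\]
% Case h(s) = s (assumption): h is continuous, nondecreasing, and the integral diverges.
\[
  \int_{0}^{+\infty} \frac{ds}{1 + s} = +\infty, \qquad
  \int_{r_0}^{r_0 + r} \frac{ds}{1 + s} = \ln \frac{1 + r_0 + r}{1 + r_0} \ge \lambda
  \;\Longleftrightarrow\;
  r \ge (1 + r_0) \big( e^{\lambda} - 1 \big) .
\]
```

In the second case the admissible localization radius grows exponentially in $\lambda$, while the slope in the last inequality of the theorem is damped by the factor $1 + d_X(x_0, x_\lambda)$.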

4.7 Remarks

4.1: Gâteaux (1913) gave the definition of directional differentiability when $X$ is simply a linear space and $Y = \mathbb{R}$. Afterwards, Lévy (1920) imposed the requirement that $f'(x; \cdot)$ must be linear. The Fréchet derivative was introduced by Fréchet (1920). Various parts of the calculus in Banach spaces can be found in the books of Abraham & Marsden (1978), Cartan (1967), Denkowski, Migórski & Papageorgiou (2003a, 2003b), Dieudonné (1969), Ioffe & Tihomirov (1979), Vaĭnberg (1973), Zeidler (1985b) and in the survey papers of Averbukh & Smolyanov (1967, 1968) and Nashed (1971). We should also mention Lusternik's theorem, which is useful in variational analysis. For a proof of it see Zeidler (1985b, pp. 287–289). First a definition.

DEFINITION 4.7.1 Let $X$ be a locally convex space, $C \subseteq X$ and $x_0 \in C$.

(a) An admissible curve in $C$ through $x_0$ is a map $u\colon (-\varepsilon, \varepsilon) \to C$ for some $\varepsilon > 0$, such that
\[ u(t) \in C \quad \forall\, t \in (-\varepsilon, \varepsilon), \qquad u(0) = x_0 \quad \text{and} \quad u'(0) \text{ exists}. \]

(b) $h \in X$ is said to be a tangent vector to $C$ at $x_0$ if and only if there exists an admissible curve in $C$ through $x_0$, such that $u'(0) = h$, that is if there exist an $\varepsilon > 0$ and a map $(-\varepsilon, \varepsilon) \ni \lambda \longmapsto r(\lambda) \in X$, such that
\[ x_0 + \lambda h + r(\lambda) \in C \quad \forall\, \lambda \in (-\varepsilon, \varepsilon) \qquad \text{and} \qquad \frac{\|r(\lambda)\|_X}{\lambda} \longrightarrow 0 \quad \text{as } \lambda \to 0. \]

(c) The set of all vectors tangent to $C$ at $x_0$ is a closed cone, which is nonempty since the origin belongs to it. This cone is usually called the tangent cone to $C$ at $x_0$ and it is denoted by $T_C(x_0)$. If this cone is a subspace of $X$, then it is called the tangent space to $C$ at $x_0$.


THEOREM 4.7.2
(a) If $X$ and $Y$ are two Banach spaces, $U$ is a neighbourhood of $x_0 \in X$, $\varphi\colon U \to Y$ is a Fréchet differentiable function and
\[ R\big( \varphi'_F(x_0) \big) = Y \]
(i.e., $x_0 \in U$ is a regular point of $\varphi$), then the tangent space to the set
\[ C \stackrel{df}{=} \big\{ x \in U : \varphi(x) = \varphi(x_0) \big\} \]
coincides with the kernel of $\varphi'_F(x_0)$, i.e.,
\[ T_C(x_0) = N\big( \varphi'_F(x_0) \big). \]
(b) If $X$ and $Y$ are two Banach spaces, $U$ is a neighbourhood of $x_0 \in X$, $\varphi \in C^1(U; Y)$ and for all $x \in U$, such that $\varphi(x) = \varphi(x_0)$, we have
\[ R\big( \varphi'_F(x) \big) = Y \]
and $N\big( \varphi'_F(x) \big)$ is complemented in $X$ (see Definition 4.1.28), then the set
\[ C \stackrel{df}{=} \big\{ x \in U : \varphi(x) = \varphi(x_0) \big\} \]
is a $C^1$-manifold in $X$. Moreover, if $\varphi \in C^r(U; Y)$ with $r \ge 1$, then $C$ is a $C^r$-manifold in $X$.

4.2: Convex functions play a central role in many applications. There are several books dealing with the continuity and differentiability properties of convex functions on finite or infinite dimensional Banach spaces. We mention the books of Rockafellar (1970a), Webster (1994) (convex functions defined on $\mathbb{R}^N$) and Barbu & Precupanu (1986), Ekeland & Temam (1976), Giles (1982), Ioffe & Tihomirov (1979), Laurent (1972), Phelps (1993) and Roberts & Varberg (1973) (convex functions defined on Banach and locally convex spaces). Note that the books of Giles (1982) and Phelps (1993) approach convex functions from the point of view of functional analysis and place special emphasis on the relations with Banach space theory. Theorem 4.2.12 is a classical result of Mazur (1933). For convex, continuous functions which are everywhere Gâteaux differentiable, there are some stronger results on Fréchet differentiability. In particular, Deville, Godefroy, Hare & Zizler (1987) characterize the separable Banach spaces $X$ so that every continuous convex function


$\varphi\colon X \to \mathbb{R}$, which is everywhere Gâteaux differentiable, must be Fréchet differentiable on a dense set. It turns out that $X^*$ can be nonseparable, but $X$ cannot contain a subspace isomorphic to $l^1$.

4.3: Haar-null sets (see Definition 4.3.5) were introduced by Christensen (1972), who also obtained all the results up until Corollary 4.3.14. Theorem 4.3.17 is also due to Christensen (1974). In the paper of Hunt, Sauer & Yorke (1992, 1993) (see also their addendum), we find relations between dynamical systems and Haar-null sets. There are other ways to define negligible sets in an infinite dimensional Banach space (such as Gauss-null sets, Aronszajn-null sets and cube-null sets). A detailed discussion of them and their use in the study of the differentiability properties of Lipschitz continuous functions can be found in the book of Benyamini & Lindenstrauss (1997).

4.4: Duality is at the core of convex analysis. The Legendre-Fenchel transform (see Definition 4.4.1) was first used for convex functions on $\mathbb{R}$ by Mandelbrojt (1939). This motivated Fenchel (1951) to introduce an important and more general definition for convex functions in $\mathbb{R}^N$. The transform introduced by Fenchel is an extension of the Legendre transform (see Legendre (1786)). This is why the transform is called the Legendre-Fenchel transform. This notion was extended to dual pairs of locally convex spaces by Brondsted (1964), Moreau (1966–1967) and Rockafellar (1974). A special case of the inequality in Proposition 4.4.3(a) can be found in Young (1912) and for this reason the inequality is called the Young-Fenchel inequality. We should point out that some authors (see Ioffe & Tichomirov (1968, 1979)) prefer the name Young-Fenchel transform for what we call here the Legendre-Fenchel transform.
The finite-dimensional duality theory can be found in the books of Fenchel (1951), Rockafellar (1970a), while the infinite dimensional duality theory can be found in the books of Barbu & Precupanu (1986), Ekeland & Temam (1976), Ioffe & Tihomirov (1979), Laurent (1972). Theorem 4.4.14 is the main result in the duality theory for convex functions and sometimes it is called the Fenchel-Moreau theorem. First Fenchel (1951) observed that $\varphi \in \Gamma_0\big( \mathbb{R}^N \big)$ if and only if it is the supremum of all affine continuous functions majorized by $\varphi$. Soon thereafter Hörmander (1955) established the following result.

PROPOSITION 4.7.3 If $X$ is a locally convex space, then there is a bijective correspondence between nonempty, closed, convex sets and sublinear, $w(X^*, X)$-lower semicontinuous functions on $X^*$ with values in $\overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$, which maps $C$ into $\sigma_C$.

Theorem 4.4.14 in conjunction with Proposition 4.4.13 says that $\varphi^{**}$ is the biggest convex and lower semicontinuous function majorized by $\varphi$ (sometimes this is denoted by writing that $\varphi^{**} = \overline{\operatorname{conv}}\, \varphi$, which is a suggestive notation expressing the fact that $\operatorname{epi} \varphi^{**} = \overline{\operatorname{conv}}\, \operatorname{epi} \varphi$). This fact is important in control


theory in connection with the relaxation method. If the ambient space is $\mathbb{R}^N$, then using Carathéodory's theorem for convex sets in $\mathbb{R}^N$, we can have the following useful expression for $\varphi^{**}$ (see Ioffe & Tihomirov (1979, p. 189)).

PROPOSITION 4.7.4 If $\varphi\colon \mathbb{R}^N \to \overline{\mathbb{R}}$ is a proper, lower semicontinuous function and $\operatorname{dom} \varphi^{**} \subseteq \mathbb{R}^N$ is closed, then
\[ \varphi^{**}(x) = \inf \Big\{ \sum_{k=1}^{N+1} \lambda_k \varphi(x_k) : x_k \in \mathbb{R}^N,\ \lambda_k \ge 0,\ \sum_{k=1}^{N+1} \lambda_k = 1,\ \sum_{k=1}^{N+1} \lambda_k x_k = x \Big\}. \]

The operation of infimal convolution (see Definition 4.4.6(b)) was introduced by Moreau (1965, 1966–1967) and its duality properties were studied by Ioffe & Tichomirov (1968, 1979). The proof of Proposition 4.4.16 can be found in Ioffe & Tihomirov (1979, p. 178). Although affine continuous supports for convex functions were considered earlier, the first systematic study of the subdifferential multifunction started with the works of Moreau (1965, 1966–1967) and Rockafellar (1966, 1970b). Moreau (1965) limits himself in the framework of Hilbert spaces, while Rockafellar (1970b) passes to general Banach spaces. We should also mention the related work of Pshenichnyi (1971) on quasi-differentiable functions. One of the main results of the convex subdifferential theory is Theorem 4.4.34 (see also Remark 4.4.35). This was first proved by Rockafellar (1966), but it was found that his proof had a gap. This was remedied by Rockafellar (1970b), where we find also the proof of Proposition 4.4.31. The notion of cyclically monotone operators is due to Rockafellar (1966), who proved Theorem 4.4.39 (see also Rockafellar (1970b, Theorem B, p. 210). Proposition 4.4.42 is due to Br´ezis (1973). For the proof of Proposition 4.4.46, we refer to Phelps (1993, p. 19). The ε-subdifferential (see Definition 4.4.49) was investigated systematically by Hiriart-Urruty (1980, 1982), and Hiriart-Urruty & Phelps (1993). Convex subdifferentials found widespread applications in optimization, control theory and evolution equations, as seen in the books of Barbu (1976, 1994), Barbu & Precupanu (1986), Dontchev & Zolezzi (1993), Ekeland & Temam (1976), Hiriart-Urruty & Lemar´echal (1993), Hu & Papageorgiou (1997, 2000), Ioffe & Tihomirov (1979), Rockafellar (1970a, 1974), Rockafellar & Wets (1998) and Tiba (1990). The proof of Proposition 4.4.53 can be found in Rockafellar (1970a, pp. 219–220). The subdifferential theory for locally Lipschitz functionals is due to Clarke (1975, 1981, 1983). 
Only Theorem 4.4.72 is due to Lebourg (1975). Applications of the generalized subdifferential can be found in the books of Clarke (1983, 1989), Clarke, Ledyaev, Stern & Wolenski (1998) and in Naniewicz & Panagiotopoulos (1995) and Gasiński & Papageorgiou (2005) (which deal with hemivariational inequalities). Of the other subdifferentials, the viscosity


subdifferential was explicitly defined by Deville, Godefroy & Zizler (1993) and studied by Borwein & Zhu (1996). The proximal subdifferential is discussed in Clarke (1989), Clarke, Ledyaev, Stern & Wolenski (1998) and Rockafellar & Wets (1998) and the canonical subdifferential was introduced by Penot (1978).

4.5: Integral functionals determined by (convex) normal integrands were first studied by Rockafellar (1968). Several results for integrands defined on $\Omega \times \mathbb{R}^N$ can be found in Rockafellar (1976) with additional results and extensions in Rockafellar (1971a, 1971c, 1971b). The work was extended by Levin (1973, 1974, 1975, 1980), who removed some restrictive finite dimensionality or reflexivity hypotheses. Theorem 4.5.7, known as the Yosida-Hewitt decomposition theorem, was first proved by Yosida & Hewitt (1952) for $X = \mathbb{R}$ and µ being a finite measure. Another, more direct proof can be found in Dubovitskii & Miljutin (1968). Ioffe & Levin (1972) extended the result to a separable Banach space X and a finite measure µ. Their proof does not extend to µ being σ-finite. The general form (see Theorem 4.5.7) is due to Levin (1974). Theorem 4.5.2 is due to Rockafellar (1971a) (for a reflexive, separable Banach space X) and Levin (1975) (for a separable Banach space X). Similarly Theorem 4.5.8 is due to Rockafellar (1971b) (for $X = \mathbb{R}^N$), Rockafellar (1971a) (for a reflexive, separable Banach space X) and Levin (1975) (for a separable Banach space X). Proposition 4.5.13 is due to Moreau (1966–1967) and its proof can also be found in Laurent (1972, p. 348), while Theorem 4.5.16 is due to Rockafellar (1971a). Additional results on convex integral functionals can be found in Bismut (1973), Castaing & Valadier (1977), Papageorgiou (1986) and Valadier (1975). There is a continuous analog of the operation of infimal convolution (see Definition 4.4.6(b)).

DEFINITION 4.7.5 Let (Ω, Σ, µ) be a finite, complete, nonatomic measure space and let
\[ \varphi : \Omega \times \mathbb{R}^N \longrightarrow \mathbb{R} \]
be a $\Sigma \times \mathcal{B}(\mathbb{R}^N)$-measurable integrand. The inf-convolution integral of ϕ with respect to µ is the function
\[ \oint_\Omega \varphi_\omega \, d\mu : \mathbb{R}^N \longrightarrow \mathbb{R}^*, \]
defined by
\[ \Big( \oint_\Omega \varphi_\omega \, d\mu \Big)(x) \stackrel{df}{=} \inf \Big\{ \lambda \in \mathbb{R} : (x, \lambda) \in \int_\Omega \mathrm{epi}\, \varphi(\omega, \cdot) \, d\mu \Big\}. \]


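REMARK 4.7.6 In the above definition
\[ \int_\Omega \mathrm{epi}\, \varphi(\omega, \cdot) \, d\mu = \Big\{ \int_\Omega \big( u(\omega), \lambda(\omega) \big) \, d\mu : (u, \lambda) \in S^1_{\mathrm{epi}\, \varphi(\omega, \cdot)} \Big\}. \]
If for all $x \in L^1(\Omega; \mathbb{R}^N)$, $I_\varphi(x)$ exists (possibly infinite), then
\[ \Big( \oint_\Omega \varphi_\omega \, d\mu \Big)(x) = \inf \Big\{ I_\varphi(u) : u \in L^1(\Omega; \mathbb{R}^N), \ \int_\Omega u(\omega) \, d\mu = x \Big\}. \]
In this form this operation arises in mathematical economics (see Aumann & Shapley (1974)). The next result is due to Ioffe & Tichomirov (1968). It is the continuous analog of Proposition 4.4.8.

PROPOSITION 4.7.7 If $\mathrm{dom} \Big( \oint_\Omega \varphi_\omega \, d\mu \Big) \neq \emptyset$, then
\[ \Big( \oint_\Omega \varphi_\omega \, d\mu \Big)^* = \int_\Omega \varphi_\omega^* \, d\mu. \]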
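The finite counterpart of this duality is the standard identity $(f \,\square\, g)^* = f^* + g^*$ for the infimal convolution of two functions. The following is only a numerical sketch of that identity on a grid, not part of the book's argument; the functions $f(x) = x^2$ and $g(x) = |x|$, the grid, and all helper names are illustrative assumptions.

```python
def grid(half_width_steps=80, step=0.05):
    """Symmetric grid {k * step : |k| <= half_width_steps} (assumed discretization)."""
    return [k * step for k in range(-half_width_steps, half_width_steps + 1)]


def inf_convolution(f, g, points):
    """Discrete infimal convolution: x -> min over grid points u of f(u) + g(x - u)."""
    return lambda x: min(f(u) + g(x - u) for u in points)


def conjugate(f, points):
    """Discrete Legendre-Fenchel conjugate: y -> max over grid points x of x*y - f(x)."""
    return lambda y: max(x * y - f(x) for x in points)


def square(x):
    return x * x


points = grid()
# Left-hand side: conjugate of the infimal convolution of x^2 and |x|.
lhs = conjugate(inf_convolution(square, abs, points), points)
# Right-hand side: sum of the two conjugates.
conj_f = conjugate(square, points)
conj_g = conjugate(abs, points)


def rhs(y):
    return conj_f(y) + conj_g(y)
```

For slopes $|y| < 1$ the conjugate of $|x|$ vanishes and the conjugate of $x^2$ is $y^2/4$, so both sides should agree with $y^2/4$ up to rounding; outside that range the bounded grid no longer represents the true (infinite) conjugate.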

4.6: Theorem 4.6.1 is due to Ekeland (1974). A detailed discussion with various applications can be found in Ekeland (1979, 1989). Theorem 4.6.9 with F being single valued was proved by Caristi (1976) using a different proof based on transfinite induction (see also Caristi & Kirk (1975)). Theorem 4.6.14 is due to Daneš (1972), but his proof used a result of Krasnoselskii & Zabreiko (1984). The proof given here is due to Brondsted (1974). Relations between these and other geometric theorems of nonlinear analysis were proved by Brézis & Browder (1976), Daneš (1972) and Penot (1986). Proposition 4.6.19 and Theorem 4.6.20 are due to Brondsted & Rockafellar (1965). For the proof of Theorem 4.6.22, we refer to Denkowski, Migórski & Papageorgiou (2003a, p. 384). Theorem 4.6.24 and Corollary 4.6.26 are due to Browder (1971a, 1971b). Another nonlinear surjectivity result due to Bates & Ekeland (1980) is the following.

PROPOSITION 4.7.8 If X and Y are two Banach spaces, $\varphi : X \longrightarrow Y$ is continuous, Gâteaux differentiable,
\[ R\big( \varphi'_G(x) \big) = Y \qquad \forall\, x \in X \]
and there exists $k > 0$, such that for all $x \in X$ and all $y \in Y$, there exists $z \in \big( \varphi'_G(x) \big)^{-1}(y)$ satisfying $\|z\|_X \leq k \|y\|_Y$, then $\varphi(X) = Y$, i.e., ϕ is surjective.


Theorem 4.6.28 is due to Borwein & Preiss (1987) and it is known as the Borwein-Preiss smooth variational principle. Theorem 4.6.29 is due to Deville, Godefroy & Zizler (1993). More applications of the Ekeland variational principle can be found in Barbu (1994), Denkowski, Migórski & Papageorgiou (2003b), Fattorini (1999), Li & Yong (1995) and Willem (1996).

Chapter 5 Critical Point Theory

Variational methods are a valuable tool in the analysis of nonlinear problems. According to these methods, we are trying to find solutions of a given nonlinear equation, by looking for critical (stationary) points of a functional defined on the function space in which we want the solution of our problem to lie. The Euler-Lagrange equation satisfied by a critical point is the nonlinear equation that we are trying to solve. The functional, whose critical points we are trying to determine, in many cases is unbo


Leszek Gasiński, Nikolaos S. Papageorgiou

Boca Raton London New York Singapore

Published in 2005 by Chapman & Hall/CRC Taylor & Francis Group 6000 Broken Sound Parkway NW, Suite 300 Boca Raton, FL 33487-2742 © 2005 by Taylor & Francis Group, LLC Chapman & Hall/CRC is an imprint of Taylor & Francis Group No claim to original U.S. Government works Printed in the United States of America on acid-free paper 10 9 8 7 6 5 4 3 2 1 International Standard Book Number-10: 1-58488-484-3 (Hardcover) International Standard Book Number-13: 978-1-58488-484-2 (Hardcover) Library of Congress Card Number 2005045529 This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use. No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers. For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC) 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged. Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data Gasinski, Leszek. Nonlinear analysis / Leszek Gasinski, Nikolaos S. Papageorgiou. p. cm. -- (Series in mathematical analysis and applications ; v. 9) Includes bibliographical references and index. ISBN 1-58488-484-3 1. Nonlinear functional analysis. 2. Nonlinear operators. I. Papageorgiou, Nikolaos Socrates. II. Title. III. Series. QA321.5.G37 2005 515'.7--dc22

2005045529

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

Taylor & Francis Group is the Academic Division of T&F Informa plc.

To Prof. Zdzisław Denkowski

Contents

1 Hausdorff Measures and Capacity
  1.1 Measure Theoretical Background
  1.2 Covering Results
  1.3 Hausdorff Measure and Hausdorff Dimension
  1.4 Differentiation of Hausdorff Measures
  1.5 Lipschitz Functions
  1.6 Capacity
  1.7 Remarks

2 Lebesgue-Bochner and Sobolev Spaces
  2.1 Vector-Valued Functions
  2.2 Lebesgue-Bochner Spaces and Evolution Triples
  2.3 Compactness Results
  2.4 Sobolev Spaces
  2.5 Inequalities and Embedding Theorems
  2.6 Fine Properties of Functions and BV-Functions
  2.7 Remarks

3 Nonlinear Operators and Young Measures
  3.1 Compact and Fredholm Operators
  3.2 Operators of Monotone Type
  3.3 Accretive Operators and Semigroups of Operators
  3.4 The Nemytskii Operator and Integral Functions
  3.5 Young Measures
  3.6 Remarks

4 Smooth and Nonsmooth Analysis and Variational Principles
  4.1 Differential Calculus in Banach Spaces
  4.2 Convex Functions
  4.3 Haar Null Sets and Locally Lipschitz Functions
  4.4 Duality and Subdifferentials
  4.5 Integral Functionals and Subdifferentials
  4.6 Variational Principles
  4.7 Remarks

5 Critical Point Theory
  5.1 Deformation Results
  5.2 Minimax Theorems
  5.3 Structure of the Critical Set
  5.4 Multiple Critical Points
  5.5 Lusternik-Schnirelman Theory and Abstract Eigenvalue Problems
  5.6 Remarks

6 Eigenvalue Problems and Maximum Principles
  6.1 Linear Elliptic Operators
  6.2 The Partial p-Laplacian
  6.3 The Ordinary p-Laplacian
  6.4 Maximum Principles
  6.5 Comparison Principles
  6.6 Remarks

7 Fixed Point Theory
  7.1 Metric Fixed Point Theory
  7.2 Topological Fixed Point Theory
  7.3 Partial Order and Fixed Points
  7.4 Fixed Points of Multifunctions
  7.5 Remarks

Appendix
  A.1 Topology
  A.2 Measure Theory
  A.3 Functional Analysis
  A.4 Calculus and Nonlinear Analysis

List of Symbols

References

Preface

Linear functional analysis deals with infinite dimensional topological vector spaces (which mix in a fruitful way the linear (algebraic) structure with the topological one) and the linear operators acting between them. The effort was to extend standard results of linear analysis to an infinite dimensional context. The first half of the twentieth century is marked by intensive theoretical investigations in this area, which were also accompanied by detailed treatment of linear mathematical models. With the exception of a short period during the 1930’s (compact operators and Leray-Schauder degree), nonlinear operators were out of the emerging picture. However, mounting evidence from diverse fields such as physics, engineering, economics, biology and others suggested that there should be an effort to extend the linear theory to various kinds of nonlinear operators. Systematic efforts in this direction started in the early 1960’s and mark the beginning of what is known today as “Nonlinear Analysis.” Since then several theories have been developed in this respect and today some of them are well established and approaching their limits, while others are still the object of intense research activity. It is not a coincidence that simultaneously with the advent of nonlinear analysis, we have the appearance of nonsmooth analysis and of multivalued analysis, both of which were motivated by concrete needs in applied areas such as control theory, optimization, game theory and economics. Their development provided nonlinear analysis with new concepts, tools and theories that enriched the subject considerably. Today nonlinear analysis is a well established mathematical discipline, which is characterized by a remarkable mixture of analysis, topology and applications. It is exactly the fact that the subject combines in a beautiful way these three items that makes it attractive to mathematicians.
The notions and techniques of nonlinear analysis provide the appropriate tools to develop more realistic and accurate models describing various phenomena. This gives nonlinear analysis a rather interdisciplinary character. Today the more theoretically inclined nonmathematician (engineer, economist, biologist or chemist) needs a working knowledge of at least a part of nonlinear analysis in order to be able to conduct a complete qualitative analysis of his models. This supports a high demand for books on nonlinear analysis. Of course the subject is big (vast is maybe a more appropriate word) and no single book can cover all its theoretical and applied parts. In this volume, we have focused on those topics of nonlinear analysis which are pertinent to the theory of boundary value problems and their applications such as control theory and calculus of variations.

In Chapter 1 we deal with Hausdorff measures and capacities, which provide the means to estimate the “size” or “dimension” of “thin” or “highly irregular” sets. The recent development of fractal geometry and its uses in a variety of applied areas (such as Brownian motion of particles, turbulence in fluids, geographical coastlines and surfaces, etc.) renewed interest in Hausdorff measures, which for a long period were a topic of secondary importance within measure theory. In this chapter we also have our first encounter with Lipschitz and locally Lipschitz functionals which will be examined again in Chapter 4. At this point we prove the celebrated “Rademacher’s theorem.”

Chapter 2 deals with certain classes of function spaces, which arise naturally in the study of boundary value problems. These are the Lebesgue-Bochner spaces (the suitable spaces for the analysis of evolution equations) and the Sobolev spaces (the suitable spaces for weak solutions of elliptic equations). We conduct a detailed study of these spaces with special emphasis on compactness and embedding results. Also using the tool of Hausdorff measures and capacities, we investigate the fine properties of Sobolev functions and also introduce and study functions of bounded variation which are useful in theoretical mechanics.

In Chapter 3, we deal with certain large classes of nonlinear operators which arise often in applications. We examine compact operators for which we develop in parallel the corresponding linear theory, with one of the main results being the spectral theorem for compact self-adjoint operators on a Hilbert space. We also investigate nonlinear operators of monotone type which have their roots in the calculus of variations and exhibit remarkable surjectivity properties. Monotone operators lead to accretive operators, the two families being identical in the context of Hilbert spaces. Accretive operators are closely connected with the generation theory of semigroups of operators.
We also examine both linear and nonlinear semigroups. Semigroups are basic tools in the study of evolution equations. In addition, we examine the Nemytskii operator which is a nonlinear operator encountered in almost all problems. Finally, in the last section of the chapter, we discuss Young measures which provide the right framework to examine the limit behavior of the minimizing sequence of variational problems which do not have a solution. Young measures are used in optimal control and in the calculus of variations in connection with the so-called “relaxation method.”

Chapter 4 presents the calculus of smooth and of certain broad classes of nonsmooth functions. We start with the Gâteaux and Fréchet derivatives. We discuss the generic differentiability of continuous convex functions (Mazur’s theorem) and extend Rademacher’s theorem to locally Lipschitz functions between certain Banach spaces by using the notion of Haar-null sets. Then we pass to nondifferentiable functions and develop the duality properties and subdifferential theory of convex functions and the generalized subdifferential of locally Lipschitz functions. We also examine integral functionals and discuss the celebrated Ekeland variational principle establishing its equivalence with some other geometric results of nonlinear analysis.

In Chapter 5 we present the critical point theory of $C^1$-functions defined on a Banach space. This theory is at the core of the variational methods used in the study of boundary value problems. We follow the deformation approach which leads to minimax characterizations of the critical values. We also study the structure of the set of critical points and derive results on the existence of multiple critical points. Next we present the Lusternik-Schnirelman theory which extends to nonlinear eigenvalue problems the corresponding linear theory of R. Courant.

Chapter 6 uses the abstract results of Chapter 5 as well as results from earlier chapters to develop the spectrum of linear elliptic differential operators, of the partial p-Laplacian (with Dirichlet and Neumann boundary conditions) and of the scalar and vector ordinary p-Laplacian (with Dirichlet, Neumann and periodic boundary conditions). We also present linear and nonlinear maximum principles and comparison results, which are useful tools in the study of boundary value problems.

Finally in Chapter 7 we have gathered some basic fixed point theorems. We present results from metric fixed point theory, from topological fixed point theory and fixed point results based on the partial order induced by a closed, convex pointed cone. We also indicate how many of these results can be extended to multifunctions (set-valued functions).

We have tried to make the volume self-contained. For this reason at the end of the book we have included a rather extended appendix for easy reference of the general results used in the book. Nevertheless, within the text, whenever we need results not proved in the book, we also give exact references where the interested reader can find additional information. Now that the project has reached its conclusion, we would like to thank the good people of CRC Press (especially Mrs. Jessica Vakili) for their help and kind cooperation during the preparation of this book.
We would like to thank the two editors of this series, Prof. R.P. Agarwal and Prof. D.O’Regan, for supporting this effort.

Chapter 1 Hausdorff Measures and Capacity

During the golden era of measure theory (namely the first two decades of the 20th century), Carathéodory was the first to consider the notion of “length” for sets in $\mathbb{R}^N$. Later, in 1919, Hausdorff, motivated by the ideas of Carathéodory, introduced the measure and dimensional concepts that we shall discuss in this chapter. So in the modern language, the “length” of a set $A \subseteq \mathbb{R}^N$ will be its Hausdorff one-dimensional outer measure (denoted by $\mu^{(1)}$). Following the pioneering works of Carathéodory and Hausdorff, significant contributions to the subject were made by Besicovitch. In fact, in the first decade of its development, the main advances were made by Besicovitch and his students, since geometric measure theory was not part of the mainstream measure theory. However, since the early 70’s, the subject attracted a large number of researchers, due to its fundamental importance in the study of the so-called “Fractal Geometry.” Fractal sets arise in many applications, such as turbulence in fluids, geographical coastlines and surfaces, fluctuation of prices in stock exchanges, the Brownian motion of particles and others. Mandelbrot was the first to emphasize their use to model a variety of phenomena.

There have been many ways to estimate the “size” or “dimension” of small (thin) sets and of highly irregular sets and to generalize the idea that points, curves and surfaces have dimensions 0, 1 and 2 respectively. Hausdorff measure has the advantage of being a measure and together with the notion of Hausdorff dimension can provide a more delicate sense of the size of sets in $\mathbb{R}^N$ than Lebesgue measure provides. To illustrate this, consider in $\mathbb{R}^2$ the set
\[ A \stackrel{df}{=} \Big\{ \Big( t, \sin \frac{1}{t} \Big) : t \in (0, 1) \Big\}. \]
Suppose we wish to measure the length of the curve A. A first approximation can be based on the Carathéodory outer measure, which defines:
\[ \lambda^1(A) \stackrel{df}{=} \inf_{A \subseteq \bigcup_{n=1}^{\infty} C_n} \ \sum_{n=1}^{\infty} \delta(C_n), \]
i.e., the infimum is taken over all countable covers of A (by δ(A) we denote the diameter of the set A; see (1.1)). If we adopt this definition, we see that $\lambda^1(A) < +\infty$, while we know that the length of A is infinite. The reason for


this is that in the definition of $\lambda^1(A)$, the covers of A are not forced to follow the geometry of A. For this reason the Hausdorff s-dimensional measures $\mu^{(s)}(A)$ are defined as limits of outer measures $\mu^{(s)}_\delta$ which follow the local geometry of A (see Definition 1.3.5). As another illustrative example, consider the unit square S in $\mathbb{R}^2$ (i.e., square of side length equal to 1) and define
\[ \lambda^1(S) \stackrel{df}{=} \inf_{S \subseteq \bigcup_{n=1}^{\infty} C_n} \ \sum_{n=1}^{\infty} \delta(C_n), \]
i.e., again the infimum is taken over all countable covers of S. We observe that we can do no better than cover S itself. Indeed, if we cover S with smaller squares of diameter less than or equal to $\frac{1}{n}$, then we see that we need at least $n^2$ squares to achieve the covering and so the approximation of $\lambda^1(S)$ obtained this way exceeds $n\sqrt{2}$. So the smaller the squares we use to cover, the bigger the estimate for $\lambda^1(S)$. Therefore, small squares are irrelevant in the calculation of $\lambda^1(S)$ and yet it is precisely them that should have an influence on the evaluation of $\lambda^1(S)$. We expect $\lambda^1(S) = +\infty$, since the diameter is a one-dimensional concept and it is used to measure a square in $\mathbb{R}^2$, which is a two dimensional concept. For this we need a definition which takes into account the local geometry of the set under consideration.

In this chapter, in Section 1.1 we recall some basic definitions and facts from measure theory, which will be needed in what follows. In Section 1.2, we discuss some “covering theorems.” Covering results play a central role in geometric measure theory. In Section 1.3 we introduce and study Hausdorff measures and the Hausdorff dimension of sets. Among other things we calculate the Hausdorff dimension of some classical irregular sets in $\mathbb{R}$ (Cantor-like sets). From these calculations, the reader will realize that the Hausdorff measure and the Hausdorff dimension of sets (even of simple ones) may be hard to calculate. For this reason sometimes other notions may be more suitable (such as capacity; see Section 1.6). In Section 1.4 we discuss the differentiation of Hausdorff measures and derive the Lebesgue-Besicovitch differentiation theorem. In Section 1.5, using the tools of Hausdorff measures, we study the geometry of Lipschitz continuous functions. Among other things, we obtain the “area and coarea formulas” and the associated “change of variables formulas.” Finally in Section 1.6, we present an alternative analytical notion measuring small sets in $\mathbb{R}^N$, namely the p-capacity. We derive some basic properties of the p-capacities and compare them to the Hausdorff measures.
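The two introductory examples can be checked numerically. The sketch below is only an illustration under stated assumptions: it bounds the length of the curve $t \mapsto (t, \sin(1/t))$ from below by a polygon through its successive extrema, and it sums the diameters of a tiling of the unit square by $n^2$ squares of side $1/n$ (a particular cover chosen for convenience). All function names are hypothetical.

```python
import math


def extremum(k):
    """k-th point of the curve where sin(1/t) = +1 or -1, i.e. 1/t = pi/2 + k*pi."""
    t = 1.0 / (math.pi / 2 + k * math.pi)
    return t, math.sin(1.0 / t)


def inscribed_length(num_segments):
    """Length of the polygon joining the first num_segments + 1 extrema.

    Any inscribed polygon is a lower bound for the length of the curve,
    and each segment here has vertical extent about 2.
    """
    pts = [extremum(k) for k in range(num_segments + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))


def square_cover_sum(n):
    """Sum of diameters when the unit square is tiled by n*n squares of side 1/n."""
    return n * n * (math.sqrt(2) / n)
```

Since every segment of the polygon has length at least about 2, the lower bound grows without limit as more extrema are included, which is the sense in which the curve has infinite length. Likewise the tiling sum equals $n\sqrt{2}$, so finer tilings give larger sums.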

1.1 Measure Theoretical Background

In this section we recall some basic definitions and facts from measure theory, which we shall need in the sequel. Let us start with the concept of outer measure, which, when restricted to a suitable σ-field of sets, leads to a measure.

DEFINITION 1.1.1 Let X be a set. A map $\mu : 2^X \longrightarrow [0, +\infty]$ is said to be an outer measure, if

(a) $\mu(\emptyset) = 0$;

(b) $A \subseteq B \Longrightarrow \mu(A) \leq \mu(B)$ (monotonicity);

(c) for any sequence of sets $\{A_n\}_{n \geq 1} \subseteq 2^X$, we have
\[ \mu \Big( \bigcup_{n=1}^{\infty} A_n \Big) \leq \sum_{n=1}^{\infty} \mu(A_n) \]
(subadditivity).

For a given outer measure µ on X and $A \in 2^X$, we define the restriction of µ on A, denoted by $\mu \lfloor A$, by
\[ (\mu \lfloor A)(B) \stackrel{df}{=} \mu(A \cap B) \qquad \forall\, B \in 2^X. \]
We say that µ is a finite outer measure if $\mu(X) < +\infty$ (i.e., µ has values in $\mathbb{R}_+$).

REMARK 1.1.2 Note that $\mu \lfloor A$ is an outer measure on X, while we define $\mu|_A$ to be the restriction of µ (as a function) on $2^A$, i.e., $\mu|_A : 2^A \longrightarrow [0, +\infty]$ is defined by
\[ \big( \mu|_A \big)(B) \stackrel{df}{=} \mu(B) \qquad \forall\, B \in 2^A \subseteq 2^X. \]
Outer measures are useful because they lead to measures when restricted to suitably defined σ-fields. These σ-fields can be quite large.

DEFINITION 1.1.3 Let X be a set and µ an outer measure on X. A set $A \in 2^X$ is said to be µ-measurable, if
\[ \mu(B) = \mu(A \cap B) + \mu(B \setminus A) \qquad \forall\, B \in 2^X, \]
i.e., A “decomposes” every set B additively.
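On a finite set the measurability condition of Definition 1.1.3 can be tested by brute force over all subsets. The following sketch is an illustration only; the three-point set, the outer measure `mu_toy` and the helper names are hypothetical choices, not taken from the text.

```python
from itertools import combinations


def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def measurable_sets(X, mu):
    """Sets A with mu(B) == mu(A & B) + mu(B - A) for every subset B of X."""
    subsets = powerset(X)
    return [A for A in subsets
            if all(mu(B) == mu(A & B) + mu(B - A) for B in subsets)]


X = frozenset({0, 1, 2})


def mu_toy(A):
    """Hypothetical outer measure: 0 on the empty set, 2 on X, 1 otherwise."""
    if not A:
        return 0
    return 2 if A == X else 1


def counting(A):
    """Counting measure, viewed as an outer measure on all subsets."""
    return len(A)
```

For the counting measure every subset passes the test, while for `mu_toy` a singleton fails against a two-point set containing it (1 on one side, 1 + 1 on the other), so the family of measurable sets can be much smaller than $2^X$.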


REMARK 1.1.4 Let X be a set and µ an outer measure on X.

(a) By virtue of the subadditivity property of an outer measure, to show that $A \in 2^X$ is µ-measurable, it is enough to check that
\[ \mu(B) \geq \mu(A \cap B) + \mu(B \setminus A) \qquad \forall\, B \in 2^X. \]

(b) Clearly, if $A \in 2^X$ and $\mu(A) = 0$, then A is µ-measurable.

(c) If $A \in 2^X$, then any µ-measurable set is also $\mu \lfloor A$-measurable.

(d) A is µ-measurable if and only if $A^c = X \setminus A$ is µ-measurable.

It is straightforward to check the following result.

PROPOSITION 1.1.5 If X is a set and µ is an outer measure on X, then the collection $\Sigma_\mu$ of all µ-measurable sets is a σ-field and µ restricted on $\Sigma_\mu$ is a measure.

REMARK 1.1.6 While Definition 1.1.3 involves only additivity of µ, the conclusion in Proposition 1.1.5 is about σ-additivity of µ on $\Sigma_\mu$. This reveals the power of Definition 1.1.3. Note that from Remark 1.1.4(b), it follows that the σ-field $\Sigma_\mu$ is µ-complete.

DEFINITION 1.1.7 Let X be a nonempty Hausdorff topological space and let µ be an outer measure on X.

(a) Let $\mathcal{T}$ be a family of subsets of X (i.e., $\mathcal{T} \subseteq 2^X$). We say that µ is $\mathcal{T}$-regular, if
\[ \mu(A) = \inf_{\substack{B \in \mathcal{T} \\ A \subseteq B}} \mu(B) \qquad \forall\, A \in 2^X. \]
If $\mathcal{T} = \Sigma_\mu$, then we simply say that µ is regular.

(b) We say that µ is a Borel measure, if $\mathcal{B}(X) \subseteq \Sigma_\mu$ with $\mathcal{B}(X)$ being the Borel σ-field of X.

(c) We say that µ is a Borel regular measure, if µ is a Borel measure which is $\mathcal{B}(X)$-regular.

(d) We say that µ is a Radon measure, if µ is a Borel regular measure and
\[ \mu(K) < +\infty \qquad \forall\, K \subseteq X,\ K \text{ compact}. \]


REMARK 1.1.8 Let X be a Hausdorff topological space and let µ be an outer measure on X.

(a) Note that µ is regular if and only if
\[ \forall\, A \in 2^X \ \ \exists\, B \in \Sigma_\mu : \ \mu(A) = \mu(B). \]

(b) If µ is regular on X and $\{A_n\}_{n \geq 1} \subseteq 2^X$ is increasing (i.e., $A_n \subseteq A_{n+1}$ for $n \geq 1$), then
\[ \mu \Big( \bigcup_{n=1}^{\infty} A_n \Big) = \sup_{n \geq 1} \mu(A_n). \]

PROPOSITION 1.1.9 If X is a Hausdorff topological space, µ is an outer measure on X which is Borel regular and $A \in \Sigma_\mu$ with $\mu(A) < +\infty$, then $\mu \lfloor A$ is a Radon measure.

PROOF Let
\[ \mu_1 \stackrel{df}{=} \mu \lfloor A. \]
Evidently $\Sigma_\mu \subseteq \Sigma_{\mu_1}$ and so $\mu_1$ is a Borel measure. Also for every compact $K \subseteq X$, we have $\mu_1(K) < +\infty$. It remains to show that $\mu_1$ is Borel regular. To this end note that since µ is Borel regular, for a given $A \in 2^X$, we can find $B \in \mathcal{B}(X)$, $A \subseteq B$, such that $\mu(A) = \mu(B) < +\infty$. Because $A \in \Sigma_\mu$, from Definition 1.1.3, we have
\[ \mu(B \setminus A) = \mu(B) - \mu(A) = 0. \]
Since $A \in \Sigma_\mu$, for every $C \in 2^X$, we have
\[ (\mu \lfloor B)(C) = \mu(B \cap C) = \mu(B \cap C \cap A) + \mu\big( (B \cap C) \setminus A \big) \leq \mu(C \cap A) + \mu(B \setminus A) = \mu(C \cap A) = (\mu \lfloor A)(C). \]
As $A \subseteq B$, we infer that $\mu \lfloor B = \mu \lfloor A$. So without any loss of generality, we may assume that $A \in \mathcal{B}(X)$. Let $C \in 2^X$. Since µ is Borel regular, we can find $D \in \mathcal{B}(X)$, such that
\[ A \cap C \subseteq D \quad \text{and} \quad \mu(A \cap C) = \mu(D) \]


(see Remark 1.1.8(a)). Let us take
\[ E \stackrel{df}{=} D \cup (X \setminus A). \]
Evidently $E \in \mathcal{B}(X)$ and $C \subseteq (A \cap C) \cup (X \setminus A) \subseteq E$. Moreover, since $E \cap A = D \cap A$, we have
\[ \mu_1(E) = \mu(E \cap A) = \mu(D \cap A) \leq \mu(D) = \mu(A \cap C) = \mu_1(C), \]
so $\mu_1 = \mu \lfloor A$ is Borel regular (see Remark 1.1.8(a)), hence Radon.

We conclude this section, by recalling the following basic measure theoretic approximations.

PROPOSITION 1.1.10 If X is a Hausdorff topological space and µ is an outer measure on X which is Borel, then

(a) if $A \in \mathcal{B}(X)$, $\mu(A) < +\infty$ and $\varepsilon > 0$, then we can find an open set $U_\varepsilon \supseteq A$ and a closed set $C_\varepsilon \subseteq A$, such that $\mu(U_\varepsilon \setminus C_\varepsilon) < \varepsilon$, i.e.,
\[ \mu(A) = \inf_{\substack{U \text{ open} \\ A \subseteq U}} \mu(U) = \sup_{\substack{C \text{ closed} \\ C \subseteq A}} \mu(C). \]

(b) if µ is Radon, then for every $A \in 2^X$, we have
\[ \mu(A) = \inf_{\substack{U \text{ open} \\ A \subseteq U}} \mu(U) \]
and if $A \in \Sigma_\mu$, then
\[ \mu(A) = \sup_{\substack{K \text{ compact} \\ K \subseteq A}} \mu(K). \]

REMARK 1.1.11 Note that in the first part of Proposition 1.1.10(b), the set A need not be µ-measurable.

1.2 Covering Results

One of the main tools in geometric measure theory is the so-called Vitali covering theorem. For a given sufficiently large family of sets that cover a given set $A$, Vitali’s covering theorem allows us to select a countable subfamily consisting of pairwise disjoint sets with exactly the desired approximation properties. The basic principle embodied in the proof of Vitali’s covering theorem is illustrated in the next proposition. In what follows, for any subset $A$ of a metric space $(X, d_X)$, we define

$$\delta(A) \stackrel{df}{=} \operatorname{diam}(A) \stackrel{df}{=} \sup_{x,y \in A} d_X(x, y), \tag{1.1}$$

the diameter of $A$ (by convention $\operatorname{diam} \emptyset = 0$).

PROPOSITION 1.2.1 If $\mathcal{T}$ is a collection of nondegenerate balls in $\mathbb{R}^N$ with

$$\sup_{B \in \mathcal{T}} \delta(B) < +\infty,$$

then we can find a finite or countable subfamily $\mathcal{F}$ of $\mathcal{T}$ consisting of disjoint balls, such that

$$\bigcup_{B \in \mathcal{T}} B \subseteq \bigcup_{B \in \mathcal{F}} \widehat{B},$$

$\widehat{B}$ being the ball concentric with $B$, but with radius five times the radius of $B$.

PROOF

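Let

$$d_0 \stackrel{df}{=} \sup_{B \in \mathcal{T}} \delta(B), \qquad \mathcal{T}_n \stackrel{df}{=} \Big\{ B \in \mathcal{T} : \frac{d_0}{2^n} < \delta(B) \le \frac{d_0}{2^{n-1}} \Big\} \quad \forall\, n \ge 1.$$

Inductively, we generate subfamilies $\mathcal{F}_n \subseteq \mathcal{T}_n$ for $n \ge 1$. Namely, let $\mathcal{F}_1$ be any maximal disjoint collection of balls in $\mathcal{T}_1$. Suppose we have selected $\mathcal{F}_1, \ldots, \mathcal{F}_m$. We choose $\mathcal{F}_{m+1}$ to be any maximal disjoint subfamily of

$$\Big\{ B \in \mathcal{T}_{m+1} : B \cap B' = \emptyset \ \text{for all}\ B' \in \bigcup_{k=1}^{m} \mathcal{F}_k \Big\}$$

and finally set

$$\mathcal{F} \stackrel{df}{=} \bigcup_{m=1}^{\infty} \mathcal{F}_m.$$

Evidently $\mathcal{F} \subseteq \mathcal{T}$ and consists of disjoint balls.

Claim. For each $B \in \mathcal{T}$, we can find $B' \in \mathcal{F}$, such that $B \cap B' \ne \emptyset$ and $\delta(B) \le 2\delta(B')$ (so also $B \subseteq \widehat{B'}$).

For some $m \ge 1$, we have $B \in \mathcal{T}_m$. By virtue of the maximality of $\mathcal{F}_m$, we can find $B' \in \bigcup_{k=1}^{m} \mathcal{F}_k$ with $B \cap B' \ne \emptyset$. We have that

$$\frac{d_0}{2^m} \le \delta(B') \quad\text{and}\quad \delta(B) \le \frac{d_0}{2^{m-1}}.$$

So $\delta(B) \le 2\delta(B')$ and this proves the claim. From the claim it follows at once that

$$\bigcup_{B \in \mathcal{T}} B \subseteq \bigcup_{B' \in \mathcal{F}} \widehat{B'}.$$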
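For a finite family of balls the selection principle can be tried out directly. The following is only a minimal sketch under simplifying assumptions: instead of the dyadic classes $\mathcal{T}_n$ it scans the balls by decreasing radius (which plays the same role for finitely many balls), and the names `Ball`, `select_disjoint`, `enlarged` and `contains` are hypothetical helpers, not notation from the text.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Ball:
    center: tuple  # a point of R^N
    radius: float  # assumed > 0 (nondegenerate ball)


def intersects(b1, b2):
    """Closed balls meet iff the distance of the centers is at most r1 + r2."""
    return math.dist(b1.center, b2.center) <= b1.radius + b2.radius


def select_disjoint(balls):
    """Greedy stand-in for the families F_1, F_2, ...: scan by decreasing radius
    and keep a ball only if it misses every ball kept so far."""
    chosen = []
    for ball in sorted(balls, key=lambda b: b.radius, reverse=True):
        if all(not intersects(ball, c) for c in chosen):
            chosen.append(ball)
    return chosen


def enlarged(ball, factor=5.0):
    """The concentric ball with radius multiplied by `factor`."""
    return Ball(ball.center, factor * ball.radius)


def contains(big, small):
    """True if the closed ball `big` contains the closed ball `small`
    (a tiny tolerance absorbs floating-point rounding)."""
    return math.dist(big.center, small.center) + small.radius <= big.radius + 1e-12
```

Every discarded ball meets a selected ball of at least the same radius, so it lies in the enlargement of that ball; this is the finite analogue of the claim in the proof above.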

DEFINITION 1.2.2 Let $A \subseteq \mathbb{R}^N$. A collection $\mathcal{T}$ of sets in $\mathbb{R}^N$ is said to be a Vitali cover of $A$, if $A \subseteq \bigcup_{B \in \mathcal{T}} B$ and for every $x \in A$ and every $\varepsilon > 0$, there exists $B \in \mathcal{T}$, such that $x \in B$ and $0 < \delta(B) < \varepsilon$.

REMARK 1.2.3 Note that from the second requirement of the above definition it follows that

$$\inf_{B \in \mathcal{T}} \delta(B) = 0.$$

So $\mathcal{T}$ is a Vitali cover of a set $A$, if every point $x \in A$ is contained in an arbitrarily small element of $\mathcal{T}$. As a straightforward consequence of Proposition 1.2.1 we obtain the following proposition.

PROPOSITION 1.2.4 If $A \subseteq \mathbb{R}^N$, $\mathcal{T}$ is a Vitali cover of $A$ consisting of closed balls, such that

$$\sup_{B \in \mathcal{T}} \delta(B) < +\infty,$$

then there exists a countable family $\mathcal{F} = \{B_n\}_{n \ge 1}$ consisting of disjoint balls from $\mathcal{T}$, such that for each $m \ge 1$, we have

$$A \subseteq \bigcup_{n=1}^{m} B_n \cup \bigcup_{n=m+1}^{\infty} \widehat{B}_n,$$

where $\widehat{B}_n$ is the closed ball concentric with $B_n$ and radius five times the radius of $B_n$.

PROOF Let $\mathcal{F}$ be as in the proof of Proposition 1.2.1. Select $\{B_n\}_{n=1}^{m} \subseteq \mathcal{F}$. If $A \subseteq \bigcup_{n=1}^{m} B_n$, then we are done. Otherwise let $x \in A \setminus \bigcup_{n=1}^{m} B_n$. Since $\mathcal{T}$ is a Vitali cover of $A$ consisting of closed balls, we can find $B \in \mathcal{T}$, such that

$$x \in B \quad\text{and}\quad B \cap B_n = \emptyset \quad \forall\, n \in \{1, \ldots, m\}.$$

But from the claim in the proof of Proposition 1.2.1, we can find $B' \in \mathcal{F}$, such that

$$B \subseteq \widehat{B'} \quad\text{and}\quad B \cap B' \ne \emptyset$$

(so $B' \in \{B_n\}_{n=m+1}^{\infty}$).

Now we are ready to state and prove Vitali’s covering theorem. In what follows by $\lambda^N$ we denote the $N$-dimensional Lebesgue outer measure.

THEOREM 1.2.5 (Vitali Covering Theorem) If $A \subseteq \mathbb{R}^N$ with $0 < \lambda^N(A) < +\infty$ and $\mathcal{T}$ is a Vitali cover of $A$ consisting of closed sets, then we can find a sequence $\{C_n\}_{n \ge 1}$ of elements in $\mathcal{T}$, such that $C_n \cap C_m = \emptyset$ for $n \ne m$ and

$$\lambda^N\Big(A \setminus \bigcup_{n=1}^{\infty} C_n\Big) = 0.$$

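PROOF Without any loss of generality, we can assume that there exists an open set $U \subseteq \mathbb{R}^N$ with $\lambda^N(U) < +\infty$ and

$$C \subseteq U \quad \forall\, C \in \mathcal{T}.$$

We construct the sequence $\{C_n\}_{n \ge 1}$ inductively. Let $C_1 \in \mathcal{T}$. Suppose that $C_1, \ldots, C_n$ are disjoint sets in $\mathcal{T}$. If $A \subseteq \bigcup_{k=1}^{n} C_k$, then we are finished. If not, setting

$$V_n \stackrel{df}{=} U \setminus \bigcup_{k=1}^{n} C_k,$$

we introduce

$$\mathcal{T}_n \stackrel{df}{=} \big\{ C \in \mathcal{T} : C \subseteq V_n \big\} \quad\text{and}\quad \delta_n \stackrel{df}{=} \sup_{C \in \mathcal{T}_n} \lambda^N(C).$$

Because $A \setminus \bigcup_{k=1}^{n} C_k \ne \emptyset$ and $\mathcal{T}$ is a Vitali cover of $A$, we see that $\mathcal{T}_n \ne \emptyset$ and so $\delta_n > 0$. We select

$$C_{n+1} \in \mathcal{T}_n \quad\text{with}\quad \frac{\delta_n}{2} < \lambda^N(C_{n+1}).$$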


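We continue this process. Then either at some finite step $n \ge 1$ we shall have $A \subseteq \bigcup_{k=1}^{n} C_k$, in which case the proof of the theorem is complete, or otherwise we produce a sequence $\{C_n\}_{n \ge 1} \subseteq \mathcal{T}$ of disjoint sets. Then we have

$$\sum_{n=1}^{\infty} \lambda^N(C_n) = \lambda^N\Big(\bigcup_{n=1}^{\infty} C_n\Big) \le \lambda^N(U) < +\infty. \tag{1.2}$$

For each $n \ge 1$ let $B_n$ be a ball with center in $C_n$ and radius equal to $3\delta(C_n)$. We claim that

$$A \setminus \bigcup_{k=1}^{n} C_k \subseteq \bigcup_{k=n+1}^{\infty} B_k \quad \forall\, n \ge 1. \tag{1.3}$$

Let $x \in A \setminus \bigcup_{k=1}^{n} C_k$. Since $\mathcal{T}$ is a Vitali cover of $A$, we can find a set $C_x \in \mathcal{T}_n$, such that

$$x \in C_x \quad\text{and}\quad \lambda^N(C_x) > 0.$$

We shall show that $C_x \cap C_k \ne \emptyset$ for some $k > n$. Indeed, if this is not the case, then $\lambda^N(C_x) \le \delta_k$ for all $k \ge 1$, which contradicts the fact that

$$0 \le \lim_{k \to +\infty} \delta_k \le \lim_{k \to +\infty} 2\lambda^N(C_{k+1}) = 0$$

(recall the choice of $C_{k+1}$ and see (1.2)). Let $m > n$ be the smallest integer, such that $C_x \cap C_m \ne \emptyset$. Since $C_x \in \mathcal{T}_{m-1}$, we have

$$\lambda^N(C_x) \le \delta_{m-1} < 2\lambda^N(C_m)$$

and recalling the choice of $B_m$, also $C_x \subseteq B_m$. So we have proved (1.3). Then for any $n \ge 1$, we have

$$\lambda^N\Big(A \setminus \bigcup_{k=1}^{\infty} C_k\Big) \le \lambda^N\Big(A \setminus \bigcup_{k=1}^{n} C_k\Big) \le \sum_{k=n+1}^{\infty} \lambda^N(B_k). \tag{1.4}$$

Recalling that $B_k$ is a ball of radius $3\delta(C_k)$ and combining (1.2) and (1.4), we conclude that

$$\lambda^N\Big(A \setminus \bigcup_{k=1}^{\infty} C_k\Big) = 0.$$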


Vitali’s covering theorem may be difficult to digest at first and probably it is necessary to see it in action several times before appreciating it. For this reason we present four simple applications from classical analysis of functions of one variable. We start with a definition which establishes the notation for various limits of the difference quotient that we shall use in the sequel. These derivates are often more useful than the ordinary derivative, since they are defined at every point.

DEFINITION 1.2.6 For a given function $f : [a, b] \longrightarrow \mathbb{R}$, the upper right and lower right derivates of $f$ at $x \in [a, b)$ are defined by

$$D^+ f(x) \stackrel{df}{=} \limsup_{h \to 0^+} \frac{f(x + h) - f(x)}{h} \quad\text{and}\quad D_+ f(x) \stackrel{df}{=} \liminf_{h \to 0^+} \frac{f(x + h) - f(x)}{h}$$

respectively. Similarly the upper left and lower left derivates of $f$ at $x \in (a, b]$ are defined by

$$D^- f(x) \stackrel{df}{=} \limsup_{h \to 0^-} \frac{f(x + h) - f(x)}{h} \quad\text{and}\quad D_- f(x) \stackrel{df}{=} \liminf_{h \to 0^-} \frac{f(x + h) - f(x)}{h}$$

respectively.

REMARK 1.2.7 Evidently, the derivates of a function at a point may be infinite. The function $f$ is differentiable at $x \in (a, b)$, if

$$-\infty < D_+ f(x) = D^+ f(x) = D_- f(x) = D^- f(x) < +\infty.$$

The function $f$ is differentiable at $x = a$ or at $x = b$, if the appropriate two derivates are finite and equal. Also the one-sided derivatives exist at a point $x$, if $D_+ f(x) = D^+ f(x)$ and $D_- f(x) = D^- f(x)$. The derivates are also called Dini derivates and clearly we always have

$$D_+ f(x) \le D^+ f(x) \quad \forall\, x \in [a, b)$$

and

$$D_- f(x) \le D^- f(x) \quad \forall\, x \in (a, b].$$
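The four derivates can be explored numerically. The sketch below is a heuristic illustration only (sampling finitely many steps can suggest, but never prove, a lim sup or lim inf); the test function $x \sin(1/x)$, extended by $0$ at the origin, and the names `f` and `dini_estimates` are assumptions made for this example.

```python
import math


def f(x):
    """Hypothetical test function: x*sin(1/x) for x != 0, and 0 at x = 0."""
    return x * math.sin(1.0 / x) if x != 0 else 0.0


def dini_estimates(func, x, side=+1, h_max=1e-3, samples=20000):
    """Approximate (lower, upper) one-sided derivates of func at x.

    side=+1 uses steps h in (0, h_max]; side=-1 uses h in [-h_max, 0).
    Steps are taken as h = side / t with t on a uniform grid, which resolves
    fast oscillations of the difference quotient near 0.
    """
    t_min = 1.0 / h_max
    quotients = []
    for i in range(samples):
        h = side / (t_min + 0.001 * i)
        quotients.append((func(x + h) - func(x)) / h)
    return min(quotients), max(quotients)
```

For the oscillating test function at $x = 0$ the difference quotient equals $\sin(1/h)$, so the sampled minimum and maximum come out close to $-1$ and $1$ on both sides, while for a smooth function both estimates cluster around the ordinary derivative.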


In the literature, sometimes we find the notion of a derived number for a function $f$ at $x$. So $\beta \in \mathbb{R}^*$ is a derived number for $f$ at $x$, if there is a sequence $\{h_n\}_{n \ge 1} \subseteq \mathbb{R}$, such that

$$h_n \longrightarrow 0, \quad h_n \ne 0 \quad \forall\, n \ge 1 \quad\text{and}\quad \lim_{n \to +\infty} \frac{f(x + h_n) - f(x)}{h_n} = \beta.$$

A function $f$ may have many derived numbers at a point $x$. Of course $f$ is differentiable at $x$ if and only if all derived numbers of $f$ at $x$ agree and are finite.

EXAMPLE 1.2.8 Consider the function $f : \mathbb{R} \longrightarrow \mathbb{R}$ defined by

$$f(x) \stackrel{df}{=} \begin{cases} x \sin \dfrac{1}{x} & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$

We can check that $D_- f(0) = -1 < D^+ f(0) = 1$ and every number in $[-1, 1]$ is a derived number for $f$. The function $f$ is not of bounded variation (see Definition A.2.15(a)).

LEMMA 1.2.9 If $f : [a, b] \longrightarrow \mathbb{R}$ is nondecreasing, then all four derivates of $f$ are finite almost everywhere on $[a, b]$.

PROOF

Clearly all derivates are nonnegative. So it suffices to show that

$$D^+ f(x) < +\infty \quad\text{and}\quad D^- f(x) < +\infty \quad \text{for a.a. } x \in [a, b].$$

Let

$$A \stackrel{df}{=} \big\{ x \in [a, b] : D^+ f(x) = +\infty \big\}$$

and suppose that $\lambda^*(A) = \beta > 0$, where $\lambda^*$ is the Lebesgue outer measure on $\mathbb{R}$. Let $M > 0$ be such that $f(b) - f(a) < \frac{M\beta}{2}$. For each $x \in A$, we can find a sequence $\{h_n^x\}_{n \ge 1}$ with $h_n^x \searrow 0$, $h_n^x \ne 0$, such that

$$M \le \frac{f(x + h_n^x) - f(x)}{h_n^x} \quad \forall\, n \ge 1.$$

The collection $\big\{ [x, x + h_n^x] \big\}_{x \in A,\, n \ge 1}$ is a Vitali cover of $A$. By virtue of Vitali’s covering theorem (see Theorem 1.2.5), we can find a family of disjoint intervals $\big\{ [x_n, x_n + h_n] \big\}_{n=1}^{m}$, such that

$$\sum_{n=1}^{m} h_n > \frac{\beta}{2}.$$

Therefore

$$\sum_{n=1}^{m} \big( f(x_n + h_n) - f(x_n) \big) \ge M \sum_{n=1}^{m} h_n > \frac{M\beta}{2} > f(b) - f(a),$$

a contradiction. This proves that $\lambda^*(A) = 0$ and so $D^+ f(x) < +\infty$ for a.a. $x \in [a, b]$. Analogously we can prove that $D^- f(x) < +\infty$ for a.a. $x \in [a, b]$.

Using this lemma and Vitali’s covering theorem, we can now prove that a nondecreasing function is differentiable almost everywhere on $[a, b]$.

THEOREM 1.2.10 If $f : [a, b] \longrightarrow \mathbb{R}$ is nondecreasing, then $f$ is differentiable almost everywhere on $[a, b]$.

PROOF For $f$ to be differentiable at $x$, we must have that all four derivates at $x$ are finite and equal. By virtue of Lemma 1.2.9, it suffices to show that all four derivates are equal almost everywhere. Let

$$A \stackrel{df}{=} \big\{ x \in (a, b) : D_+ f(x) < D^+ f(x) \big\}.$$


We show that $A$ is Lebesgue-null. The proof for the other combinations of derivates is similar. Suppose that $\lambda^*(A) > 0$. We can find rational numbers $r, s$, such that the set

$$B \stackrel{df}{=} \big\{ x \in A : D_+ f(x) < r < s < D^+ f(x) \big\}$$

satisfies $\lambda^*(B) = \beta > 0$. Let $\varepsilon \in (0, \beta)$. From the regularity of the Lebesgue outer measure $\lambda^*$, we know that there exists an open set $U \subseteq (a, b)$, such that

$$B \subseteq U \quad\text{and}\quad \lambda^1(U) - \varepsilon < \beta.$$

For each $x \in B$ and $n \ge 1$, we can find $h_n^x > 0$ with $h_n^x \searrow 0$, such that

$$[x, x + h_n^x] \subseteq U \quad\text{and}\quad \frac{f(x + h_n^x) - f(x)}{h_n^x} < r.$$

The family $\big\{ [x, x + h_n^x] \big\}_{x \in B,\, n \ge 1}$ is a Vitali cover of $B$. By virtue of Vitali’s covering theorem (see Theorem 1.2.5), for a given $\varepsilon > 0$, we can find a disjoint subfamily $\big\{ [x_n, x_n + h_n] \big\}_{n=1}^{m}$ of the Vitali cover, such that

$$\lambda^*\Big(B \setminus \bigcup_{n=1}^{m} [x_n, x_n + h_n]\Big) < \varepsilon.$$

We have

$$\sum_{n=1}^{m} \big( f(x_n + h_n) - f(x_n) \big) < r \sum_{n=1}^{m} h_n \le r\lambda^1(U) < r(\beta + \varepsilon).$$

Let us set

$$C \stackrel{df}{=} B \cap \Big(\bigcup_{n=1}^{m} [x_n, x_n + h_n]\Big).$$

We have that

$$\beta - \varepsilon < \lambda^*(C). \tag{1.5}$$


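For every $y \in C$ and $k \ge 1$, we can find $u_k^y \in \big(y, y + \frac{1}{k}\big)$, such that

$$\frac{f(u_k^y) - f(y)}{u_k^y - y} > s \quad\text{and}\quad [y, u_k^y] \subseteq (x_n, x_n + h_n) \quad \text{for some } n \in \{1, \ldots, m\}.$$

The family $\big\{ [y, u_k^y] \big\}_{y \in C,\, k \ge 1}$ is a Vitali cover of $C$. Invoking Vitali’s covering theorem (see Theorem 1.2.5), we can find a disjoint subfamily $\big\{ [y_k, u_k] \big\}_{k=1}^{l}$, such that

$$\lambda^*(C) - \varepsilon < \sum_{k=1}^{l} (u_k - y_k).$$

Then we have

$$\sum_{k=1}^{l} \big( f(u_k) - f(y_k) \big) > s \sum_{k=1}^{l} (u_k - y_k) > s\big(\lambda^*(C) - \varepsilon\big) > s(\beta - 2\varepsilon). \tag{1.6}$$

For each $1 \le n \le m$, let

$$J_n \stackrel{df}{=} \big\{ k \in \{1, \ldots, l\} : [y_k, u_k] \subseteq (x_n, x_n + h_n) \big\}.$$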

Since f is nondecreasing, using (1.6) and (1.5), we have s(β − 2ε)

0. Because $f$ is absolutely continuous, we can find $\delta > 0$, such that, if $\big\{ (r_n, s_n) \big\}_{n=1}^{m}$ is a finite family of disjoint subintervals of $[a, b]$ with

$$\sum_{n=1}^{m} (s_n - r_n) < \delta,$$

then we have

$$\sum_{n=1}^{m} \big| f(s_n) - f(r_n) \big| < \varepsilon.$$

We introduce the family ¯ ¯ ½ ¾ ¯ f (y) − f (x) ¯ df ¯ 0, A ⊆ [a, b] and at each point x ∈ A there exists a derived number β (see Remark 1.2.7), such that β 0, we can find a bounded open set U ⊆ R, such that A⊆U

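and $\lambda^1(U) - \varepsilon < \lambda^*(A)$.

If $x \in A$, then by hypothesis we can find a sequence $\{h_n\}_{n \ge 1} \subseteq \mathbb{R} \setminus \{0\}$, such that

$$h_n \longrightarrow 0, \quad [x, x + h_n] \subseteq U \quad \forall\, n \ge 1$$

(or $[x + h_n, x] \subseteq U$ in the event $h_n < 0$; but in the sequel for simplicity we shall write $[x, x + h_n]$ for both cases) and

$$\frac{f(x + h_n) - f(x)}{h_n} < r \quad \forall\, n \ge 1. \tag{1.7}$$

For all $n \ge 1$ and $x \in A$, let

$$D_n(x) \stackrel{df}{=} [x, x + h_n], \qquad E_n(x) \stackrel{df}{=} \big[ f(x), f(x + h_n) \big].$$

Because $f$ is strictly increasing, $E_n(x)$ is a nondegenerate, closed interval and

$$f\big(D_n(x)\big) \subseteq E_n(x) \quad \forall\, n \ge 1,\ x \in A.$$

Since

$$\lambda^1\big(D_n(x)\big) = |h_n| \quad\text{and}\quad \lambda^1\big(E_n(x)\big) = \big| f(x + h_n) - f(x) \big|,$$

from (1.7), we have

$$\lambda^1\big(E_n(x)\big) < r\lambda^1\big(D_n(x)\big). \tag{1.8}$$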


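Passing to the limit as $n \to +\infty$, we have $|h_n| \longrightarrow 0$ and so from (1.8), we obtain that

$$\lim_{n \to +\infty} \lambda^1\big(E_n(x)\big) = 0.$$

Let

$$\mathcal{T} \stackrel{df}{=} \big\{ E_n(x) \big\}_{x \in A,\, n \ge 1}.$$

Then $\mathcal{T}$ is a Vitali cover of the set $f(A)$. So Vitali’s covering theorem (see Theorem 1.2.5) implies the existence of a disjoint sequence $\big\{ E_{n_k}(x_k) \big\}_{k \ge 1} \subseteq \mathcal{T}$, such that

$$\lambda^1\Big(f(A) \setminus \bigcup_{k=1}^{\infty} E_{n_k}(x_k)\Big) = 0. \tag{1.9}$$

Using (1.9) and (1.8), it follows that

$$\lambda^*\big(f(A)\big) \le \lambda^1\Big(\bigcup_{k=1}^{\infty} E_{n_k}(x_k)\Big) = \sum_{k=1}^{\infty} \lambda^1\big(E_{n_k}(x_k)\big) < r \sum_{k=1}^{\infty} \lambda^1\big(D_{n_k}(x_k)\big). \tag{1.10}$$

Since $f$ is strictly increasing, we see that $\big\{ D_{n_k}(x_k) \big\}_{k \ge 1}$ are pairwise disjoint too. So we have

$$\sum_{k=1}^{\infty} \lambda^1\big(D_{n_k}(x_k)\big) = \lambda^1\Big(\bigcup_{k=1}^{\infty} D_{n_k}(x_k)\Big) \le \lambda^1(U) \le \lambda^*(A) + \varepsilon. \tag{1.11}$$

From (1.10) and (1.11), we infer that

$$\lambda^*\big(f(A)\big) \le r\big(\lambda^*(A) + \varepsilon\big).$$

Let $\varepsilon \searrow 0$, to conclude that

$$\lambda^*\big(f(A)\big) \le r\lambda^*(A). \tag{1.12}$$

In a similar fashion, we can have the following comparison result.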


THEOREM 1.2.15 If $f : [a, b] \longrightarrow \mathbb{R}$ is a strictly increasing function, $s > 0$, $A \subseteq [a, b]$ and at each point $x \in A$ there exists a derived number $\gamma$, such that $\gamma > s$, then

$$\lambda^*\big(f(A)\big) \ge s\lambda^*(A).$$

The final application of Vitali’s covering theorem (see Theorem 1.2.5) is the following criterion for measurability of sets in $\mathbb{R}$.

THEOREM 1.2.16 If $\mathcal{F}$ is any collection of intervals in $\mathbb{R}$ and

$$A \stackrel{df}{=} \bigcup_{D \in \mathcal{F}} D,$$

then $A$ is Lebesgue measurable.

PROOF Let $\mathcal{T}$ be the collection of all intervals $E$, such that $E \subseteq D$ for some $D \in \mathcal{F}$. Evidently $\mathcal{T}$ is a Vitali cover of $A$ and so by Vitali’s covering theorem (see Theorem 1.2.5), we can find a sequence $\{E_n\}_{n \ge 1}$ of disjoint elements in $\mathcal{T}$, such that

$$\lambda^*\Big(A \setminus \bigcup_{n=1}^{\infty} E_n\Big) = 0.$$

Because each $E_n \subseteq A$, the set

$$A = \bigcup_{n=1}^{\infty} E_n \cup \Big(A \setminus \bigcup_{n=1}^{\infty} E_n\Big)$$

is Lebesgue measurable.

REMARK 1.2.17 Theorem 1.2.16 can be used to show that the upper and lower derivates of an arbitrary function are measurable. In particular then the four derivates of a measurable function are measurable and so is the derivative of a measurable function. We will not go into that here.


When $\lambda^N$ is replaced by an arbitrary Radon measure $\mu$ on $\mathbb{R}^N$, there is no systematic way to control $\mu(\widehat{B})$ in terms of $\mu(B)$. So the proof of Vitali’s covering theorem (see Theorem 1.2.5), which uses the principle involved in Proposition 1.2.1, namely the use of suitable expansions of balls, does not work. We therefore need an analog of Proposition 1.2.1 which does not require enlarging the balls. This is provided by the so-called “Besicovitch covering theorem.”

THEOREM 1.2.18 (Besicovitch Covering Theorem) If $\mathcal{F}$ is any collection of closed balls in $\mathbb{R}^N$,

$$\sup_{B \in \mathcal{F}} \delta(B) < +\infty$$

and $A$ is the set of centers of all balls $B \in \mathcal{F}$, then there exist a positive integer $k = k(N) \ge 1$ and

$$\mathcal{T}_n \subseteq \mathcal{F} \quad \forall\, n \in \{1, \ldots, k\},$$

such that each $\mathcal{T}_n$ is a countable collection of disjoint balls in $\mathcal{F}$ and

$$A \subseteq \bigcup_{n=1}^{k} \bigcup_{B \in \mathcal{T}_n} B.$$

Using the above theorem, we can have the following counterpart of Vitali’s covering theorem (see Theorem 1.2.5).

THEOREM 1.2.19 If $\mu$ is a Borel measure on $\mathbb{R}^N$, $\mathcal{T}$ is a family of nondegenerate closed balls in $\mathbb{R}^N$, $A$ is the set of centers of balls in $\mathcal{T}$, $\mu(A) < +\infty$,

$$\inf_{B_r(a) \in \mathcal{T}} r = 0 \quad \forall\, a \in A$$

and $U \subseteq \mathbb{R}^N$ is an open set, then there exists a countable collection of disjoint balls $\mathcal{F}$ from $\mathcal{T}$, such that

$$\bigcup_{B \in \mathcal{F}} B \subseteq U \quad\text{and}\quad \mu\Big((A \cap U) \setminus \bigcup_{B \in \mathcal{F}} B\Big) = 0.$$


1.3 Hausdorff Measure and Hausdorff Dimension

Hausdorff measures were introduced as certain lower dimensional measures on $\mathbb{R}^N$ which allow us to measure “small” subsets in $\mathbb{R}^N$. The Hausdorff measure and the associated Hausdorff dimension of the set provide a more delicate sense of the size of a set in $\mathbb{R}^N$ than the Lebesgue measure provides. We start with the introduction of a special class of outer measures, known as metric outer measures.

DEFINITION 1.3.1 Let $(X, d_X)$ be a metric space ($d_X$ is the metric function).

(a) If $A, B \subseteq X$, then we say that $A$ and $B$ are separated sets, if

$$d_X(A, B) \stackrel{df}{=} \inf_{\substack{a \in A \\ b \in B}} d_X(a, b) > 0.$$

(b) If $\mu$ is an outer measure on $X$, then we say that $\mu$ is a metric outer measure, if

$$\mu(A \cup B) = \mu(A) + \mu(B) \quad \forall\, A, B \subseteq X,\ A \text{ and } B \text{ separated}.$$

We show that if $\mu$ is a metric outer measure, then $\mathcal{B}(X) \subseteq \Sigma(\mu)$, i.e., $\mu$ is Borel. To this end we need the following auxiliary result, known as Carathéodory’s lemma. In what follows $(X, d)$ is a metric space.

LEMMA 1.3.2 (Carathéodory Lemma) If $\mu$ is a metric outer measure on $X$, $U \subseteq X$ is an open subset, $U \ne X$, $A \subseteq U$ and

$$A_n \stackrel{df}{=} \Big\{ x \in A : d(x, U^c) \ge \frac{1}{n} \Big\} \quad \forall\, n \ge 1, \tag{1.13}$$

then $\mu(A) = \lim_{n \to +\infty} \mu(A_n)$.

PROOF Note that the sequence $\{A_n\}_{n \ge 1}$ is an increasing sequence and so $\lim_{n \to +\infty} \mu(A_n)$ exists. Moreover, since $A_n \subseteq A$ for $n \ge 1$, we have

$$\lim_{n \to +\infty} \mu(A_n) \le \mu(A).$$

So we need to show that

$$\mu(A) \le \lim_{n \to +\infty} \mu(A_n). \tag{1.14}$$


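Because $U$ is open, we have

$$d(x, U^c) > 0 \quad \forall\, x \in A$$

and so for each $x \in A$ we can find $n_0 \ge 1$ large enough so that $x \in A_{n_0}$. Therefore, we have

$$A = \bigcup_{n=1}^{\infty} A_n.$$

For each $n \ge 1$, we introduce the set

$$C_n \stackrel{df}{=} A_{n+1} \setminus A_n = \Big\{ x \in A : \frac{1}{n+1} \le d(x, U^c) < \frac{1}{n} \Big\}.$$

We have

$$A = A_{2n} \cup \bigcup_{k=2n}^{\infty} C_k = A_{2n} \cup \bigcup_{k=n}^{\infty} C_{2k} \cup \bigcup_{k=n}^{\infty} C_{2k+1}$$

and from the subadditivity of $\mu$, it follows that

$$\mu(A) \le \mu(A_{2n}) + \sum_{k=n}^{\infty} \mu(C_{2k}) + \sum_{k=n}^{\infty} \mu(C_{2k+1}). \tag{1.15}$$

If both series are convergent, then we obtain (1.14). So suppose that this is not true and, say, we have

$$\sum_{k=1}^{\infty} \mu(C_{2k}) = +\infty. \tag{1.16}$$

Note that

$$d\big(C_{2k}, C_{2k+2}\big) \ge \frac{1}{2k+1} - \frac{1}{2k+2} \quad \forall\, k \ge 1$$

and so the sets $\{C_{2k}\}_{k \ge 1}$ are separated. Therefore, we have

$$\mu\Big(\bigcup_{k=1}^{n-1} C_{2k}\Big) = \sum_{k=1}^{n-1} \mu(C_{2k}) \quad \forall\, n \ge 1. \tag{1.17}$$

Note that

$$\bigcup_{k=1}^{n-1} C_{2k} \subseteq A_{2n} \quad \forall\, n \ge 1$$

and so

$$\mu\Big(\bigcup_{k=1}^{n-1} C_{2k}\Big) \le \mu(A_{2n}) \quad \forall\, n \ge 1. \tag{1.18}$$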


From (1.17) and (1.18), it follows that

$$\sum_{k=1}^{n-1} \mu(C_{2k}) \le \mu(A_{2n}).$$

Combining this with (1.16), we infer that

$$\lim_{n \to +\infty} \mu(A_{2n}) = +\infty$$

and so

$$\mu(A) \le \lim_{n \to +\infty} \mu(A_{2n}),$$

as desired. We argue similarly if $\sum_{k=1}^{\infty} \mu(C_{2k+1}) = +\infty$.

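THEOREM 1.3.3 If $\mu$ is an outer measure on $X$, then $\mathcal{B}(X) \subseteq \Sigma(\mu)$ (i.e., $\mu$ is Borel) if and only if $\mu$ is a metric outer measure.

PROOF “$\Longrightarrow$”: Let $A_1, A_2 \subseteq X$ be separated sets and let us set

$$\beta \stackrel{df}{=} d(A_1, A_2) > 0.$$

For every $x \in A_1$, we define

$$U(x) \stackrel{df}{=} B_{\frac{\beta}{2}}(x) = \Big\{ y \in X : d(y, x) < \frac{\beta}{2} \Big\} \quad\text{and}\quad U \stackrel{df}{=} \bigcup_{x \in A_1} U(x).$$

Evidently $U$ is open, $A_1 \subseteq U$ and $A_2 \cap U = \emptyset$. Since by hypothesis $U \in \Sigma(\mu)$, we have that

$$\mu(A_1 \cup A_2) = \mu\big((A_1 \cup A_2) \cap U\big) + \mu\big((A_1 \cup A_2) \cap U^c\big). \tag{1.19}$$

Because $A_1 \subseteq U$ and $A_2 \cap U = \emptyset$, from (1.19), it follows that

$$\mu(A_1 \cup A_2) = \mu(A_1) + \mu(A_2),$$

i.e., $\mu$ is a metric outer measure.

“$\Longleftarrow$”: It suffices to show that $\Sigma(\mu)$ contains all closed sets. So let $C \subseteq X$ be closed and let us set $U \stackrel{df}{=} C^c$. Let $D \subseteq X$, $A \stackrel{df}{=} D \setminus C$ and let $\{A_n\}_{n \ge 1}$ be an increasing sequence of subsets of $A$ as in Lemma 1.3.2. Then

$$d(A_n, C) \ge \frac{1}{n} \quad \forall\, n \ge 1$$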


and, from Lemma 1.3.2, we have

$$\mu(D \setminus C) = \mu(A) = \lim_{n \to +\infty} \mu(A_n). \tag{1.20}$$

Since by hypothesis $\mu$ is a metric outer measure and the sets $\{A_n\}_{n \ge 1}$ are separated from $C$, we have

$$\mu(D) \ge \mu\big((D \cap C) \cup A_n\big) = \mu(D \cap C) + \mu(A_n) \quad \forall\, n \ge 1.$$

Passing to the limit as $n \to +\infty$ and using (1.20), we obtain

$$\mu(D) \ge \mu(D \cap C) + \mu(D \setminus C).$$

The reverse inequality is always true (subadditivity). So we obtain

$$\mu(D) = \mu(D \cap C) + \mu(D \setminus C) \quad \forall\, D \subseteq X.$$

Thus $C \in \Sigma(\mu)$ and hence $\mathcal{B}(X) \subseteq \Sigma(\mu)$.

To introduce the concept of Hausdorff measure, we shall need the following notion. Recall that by $(X, d)$ we denote a metric space.

DEFINITION 1.3.4 A sequence $\{A_n\}_{n \ge 1}$ of subsets of $X$ is a $\delta$-cover of a set $C$, if

$$C \subseteq \bigcup_{n=1}^{\infty} A_n \quad\text{and}\quad \delta(A_n) \le \delta \quad \forall\, n \ge 1.$$

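By $\mathcal{T}_\delta(C)$ we denote the family of all $\delta$-covers of the set $C$. Using this notion, we can introduce the Hausdorff $s$-dimensional measure, $s \ge 0$. As usual, for any $A \subseteq X$,

$$\delta(A) \stackrel{df}{=} \operatorname{diam}(A) \stackrel{df}{=} \sup_{x,y \in A} d(x, y),$$

the diameter of $A$ (by convention $\operatorname{diam} \emptyset = 0$).

DEFINITION 1.3.5 For any $s \ge 0$, $0 < \delta \le +\infty$ and $C \subseteq X$, we define

$$\mu_\delta^{(s)}(C) \stackrel{df}{=} \inf_{\{A_n\}_{n \ge 1} \in \mathcal{T}_\delta(C)} \sum_{n=1}^{\infty} \delta(A_n)^s$$

(as always we use the convention that $\inf \emptyset = +\infty$). The Hausdorff $s$-dimensional outer measure $\mu^{(s)}$ is defined by

$$\mu^{(s)}(C) \stackrel{df}{=} \lim_{\delta \searrow 0} \mu_\delta^{(s)}(C) = \sup_{\delta > 0} \mu_\delta^{(s)}(C).$$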


REMARK 1.3.6 It is easily seen that $\mu^{(s)}$ is an outer measure. Moreover, it is a metric outer measure. Indeed, if $\delta > 0$ is less than the positive distance of two separated sets $A$ and $C$, then no set of a cover in $\mathcal{T}_\delta(A \cup C)$ can intersect both $A$ and $C$ and so it follows that

$$\mu_\delta^{(s)}(A \cup C) = \mu_\delta^{(s)}(A) + \mu_\delta^{(s)}(C).$$

Letting $\delta \searrow 0$, we obtain the same equality for $\mu^{(s)}$. In addition, by Theorem 1.3.3, $\mu^{(s)}$ is Borel. The restriction of $\mu^{(s)}$ to $\Sigma\big(\mu^{(s)}\big)$ is called the Hausdorff $s$-dimensional measure. Sometimes it is convenient to consider $\delta$-covers consisting of open or alternatively closed sets. In these cases, although a different value of $\mu_\delta^{(s)}$ may be attained for $\delta > 0$, the limit $\mu^{(s)}$ as $\delta \searrow 0$ is the same (see Davies (1970)). However, the limit $\mu^{(s)}$ is different, if we restrict ourselves to $\delta$-covers by balls (see Besicovitch (1928)). In this case the resulting Hausdorff measure is called the spherical Hausdorff measure. Finally, if $X = \mathbb{R}^N$, it is easy to see that $\mu^{(s)}$ remains the same if we consider $\delta$-covers consisting only of convex sets.

Next we show that for any set $C \subseteq X$, there is a critical value $s_0$, such that for $s > s_0$, the corresponding Hausdorff $s$-dimensional measure of $C$ is zero, while for $s < s_0$ the Hausdorff $s$-dimensional measure of $C$ is infinite.

THEOREM 1.3.7 If $A \subseteq \mathbb{R}^N$ and $0 \le s < t < +\infty$, then

(a) if $\mu^{(s)}(A) < +\infty$, then $\mu^{(t)}(A) = 0$;

(b) if $\mu^{(t)}(A) > 0$, then $\mu^{(s)}(A) = +\infty$.

PROOF (a) Let $\mu^{(s)}(A) < +\infty$ and $t > s$. Let $\{A_n\}_{n \ge 1} \in \mathcal{T}_{\frac{1}{m}}(A)$. Then for any $n \ge 1$, we have

$$\frac{\delta(A_n)^t}{\delta(A_n)^s} = \delta(A_n)^{t-s} \le \Big(\frac{1}{m}\Big)^{t-s},$$

so

$$\mu_{\frac{1}{m}}^{(t)}(A) \le \sum_{n=1}^{\infty} \delta(A_n)^t \le \Big(\frac{1}{m}\Big)^{t-s} \sum_{n=1}^{\infty} \delta(A_n)^s$$

and thus

$$\mu_{\frac{1}{m}}^{(t)}(A) \le \Big(\frac{1}{m}\Big)^{t-s} \mu_{\frac{1}{m}}^{(s)}(A).$$

Letting $m \to +\infty$, we obtain $\mu^{(t)}(A) = 0$.

(b) Let $\mu^{(t)}(A) > 0$ and $s < t$. Assuming that $\mu^{(s)}(A) < +\infty$, from (a), we get that $\mu^{(t)}(A) = 0$, a contradiction.
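The key estimate in part (a) is an inequality between two sums over one and the same cover, and it can be checked on any finite list of diameters. The following sketch assumes a cover is represented only by the list of its diameters; the function names `cover_sum` and `scaling_gap` are hypothetical.

```python
def cover_sum(diameters, s):
    """Sum of delta(A_n)^s over a finite cover given by its diameters."""
    return sum(d ** s for d in diameters)


def scaling_gap(diameters, delta, s, t):
    """Return delta^(t-s) * sum d^s - sum d^t.

    The value is nonnegative whenever 0 <= s < t and every diameter is at
    most delta, which is the estimate used in part (a) with delta = 1/m.
    """
    if not (0 <= s < t) or any(d > delta for d in diameters):
        raise ValueError("need 0 <= s < t and all diameters <= delta")
    return delta ** (t - s) * cover_sum(diameters, s) - cover_sum(diameters, t)
```

Since the factor `delta ** (t - s)` tends to zero with `delta`, a bounded $s$-sum forces the $t$-sum to vanish in the limit, which is the content of part (a).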


This theorem leads to the following definition.

DEFINITION 1.3.8 Let $C \subseteq X$. If there is no $s > 0$, such that $\mu^{(s)}(C) = +\infty$, then $\dim C \stackrel{df}{=} 0$. Otherwise, let

$$\dim C \stackrel{df}{=} \sup_{\substack{s > 0 \\ \mu^{(s)}(C) = +\infty}} s.$$

Then $\dim C$ is called the Hausdorff dimension of $C$.

Consider the Cantor ternary set $C$. It is well known that $C$ is a nonempty, bounded, nowhere dense, perfect set in $\mathbb{R}$ which has Lebesgue measure zero. So the Lebesgue measure can contribute no additional information concerning the size of $C$. On the other hand, as we shall see, the Hausdorff dimension provides a more delicate sense of size.

PROPOSITION 1.3.9 If $C \subseteq [0, 1]$ is the Cantor ternary set, then $\dim C = \frac{\ln 2}{\ln 3}$.

PROOF We start with two simple observations concerning the Hausdorff $s$-dimensional outer measure $\mu^{(s)}$ on $\mathbb{R}$. First note that $\mu^{(s)}$ is translation invariant, namely

$$\mu^{(s)}(A) = \mu^{(s)}(A + x) \quad \forall\, A \subseteq \mathbb{R},\ x \in \mathbb{R}$$

(here $A + x \stackrel{df}{=} \{ a + x : a \in A \}$). Second, $\mu^{(s)}$ is $s$-positive homogeneous, i.e.,

$$\mu^{(s)}(\vartheta A) = \vartheta^s \mu^{(s)}(A) \quad \forall\, \vartheta > 0.$$

In the construction of $C$ we start by removing from $[0, 1]$ the open middle third $\big(\frac{1}{3}, \frac{2}{3}\big)$. The resulting set consists of two closed intervals $\big[0, \frac{1}{3}\big]$ and $\big[\frac{2}{3}, 1\big]$. Let

$$C^1 \stackrel{df}{=} C \cap \Big[0, \frac{1}{3}\Big] \quad\text{and}\quad C^2 \stackrel{df}{=} C \cap \Big[\frac{2}{3}, 1\Big].$$

Evidently $C^1$ and $C^2$ are translates of a multiple (by $\frac{1}{3}$) of $C$. So we have

$$\mu^{(s)}(C) = \mu^{(s)}(C^1 \cup C^2) = \mu^{(s)}(C^1) + \mu^{(s)}(C^2) = 2\mu^{(s)}\big(C^2\big) = 2\Big(\frac{1}{3}\Big)^s \mu^{(s)}(C) \tag{1.21}$$

(see Remark 1.3.6 and the observations in the beginning of this proof). From (1.21), it follows that

$$\mu^{(s)}(C) = 0 \quad\text{or}\quad \mu^{(s)}(C) = +\infty \quad\text{or}\quad 2\Big(\frac{1}{3}\Big)^s = 1.$$


From the last possibility, it follows that $s = \frac{\ln 2}{\ln 3}$. If we can show that $0 < \mu^{(s)}(C) < +\infty$, then $s = \frac{\ln 2}{\ln 3}$ is the Hausdorff dimension of $C$ (see Theorem 1.3.7). First we show that $\mu^{(s)}(C) > 0$. Note that

$$d(C^1, C^2) \ge \frac{1}{3}.$$

Let $\delta \le \frac{1}{3}$. Then any collection $\{A_n\}_{n \ge 1} \in \mathcal{T}_\delta(C)$ (which can be taken to consist of open intervals; see Remark 1.3.6) can be decomposed into two subcollections of intervals $\{A_{n,1}\}_{n \ge 1} \in \mathcal{T}_\delta(C^1)$ and $\{A_{n,2}\}_{n \ge 1} \in \mathcal{T}_\delta(C^2)$, such that

$$\sum_{n=1}^{\infty} \delta(A_n)^s = \sum_{n=1}^{\infty} \delta(A_{n,1})^s + \sum_{n=1}^{\infty} \delta(A_{n,2})^s. \tag{1.22}$$

In the right hand side of (1.22) suppose that the first sum is smaller than the second. Because $C^2$ is a translate of $C^1$, the same translation when applied to the intervals $\{A_{n,1}\}_{n \ge 1}$ gives a subcollection $\{A'_{n,1}\}_{n \ge 1} \in \mathcal{T}_\delta(C^2)$. Also from $\{A_{n,1}\}_{n \ge 1}$ we can produce in a similar way a collection $\{A'_n\}_{n \ge 1}$ covering $C$, such that

$$\delta(A'_n) = 3\delta(A'_{n,1}) \quad \forall\, n \ge 1. \tag{1.23}$$

Then, from (1.23) and the choice of $s$, we have

$$\sum_{n=1}^{\infty} \delta(A_n)^s \ge \sum_{n=1}^{\infty} \delta(A_{n,1})^s + \sum_{n=1}^{\infty} \delta(A'_{n,1})^s = 2\sum_{n=1}^{\infty} \delta(A'_{n,1})^s = 2\sum_{n=1}^{\infty} \Big(\frac{1}{3}\Big)^s \delta(A'_n)^s = \sum_{n=1}^{\infty} \delta(A'_n)^s.$$

If any one of the intervals $\{A'_n\}_{n \ge 1}$ has length bigger than or equal to $\frac{1}{3}$, we have

$$\sum_{n=1}^{\infty} \delta(A_n)^s \ge \Big(\frac{1}{3}\Big)^s = \frac{1}{2}.$$

Because $C$ is compact, we can use only finite coverings and so

$$\min_{n \ge 1} \delta(A_n) > 0.$$

The intervals $\{A'_n\}_{n \ge 1}$ are multiples (by (1.23)) of a subfamily of the intervals $\{A_n\}_{n \ge 1}$, hence we have

$$3 \min_{n \ge 1} \delta(A'_n) \ge \min_{n \ge 1} \delta(A_n).$$


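If every interval $A'_n$ has length (diameter) less than $\frac{1}{3}$, we can apply the same process to the cover $\{A'_n\}_{n \ge 1}$. After a finite number of such steps, we produce a cover $\{A''_n\}_{n \ge 1}$, such that

$$\max_{n \ge 1} \delta(A''_n) \ge \frac{1}{3} \quad\text{and}\quad \sum_{n=1}^{\infty} \delta(A_n)^s \ge \sum_{n=1}^{\infty} \delta(A''_n)^s,$$

so

$$\sum_{n=1}^{\infty} \delta(A_n)^s \ge \Big(\frac{1}{3}\Big)^s = \frac{1}{2}$$

and thus $0 < \mu^{(s)}(C)$.

Next we show that $\mu^{(s)}(C) < +\infty$. Let $\{A_n\}_{n \ge 1} \in \mathcal{T}_\delta(C)$ consist of open intervals. From this family, as above, we obtain covers $\{A_{n,k}\}_{n \ge 1}$ of $C^k$ for $k \in \{1, 2\}$, such that

$$\delta(A_{n,k}) \le \frac{\delta}{3} \quad \forall\, n \ge 1.$$

Again from the choice of $s$, we have

$$\delta(A_n)^s = \delta(A_{n,1})^s + \delta(A_{n,2})^s,$$

so

$$\mu_\delta^{(s)}(C) \ge \mu_{\frac{\delta}{3}}^{(s)}(C).$$

Because $\mu_\delta^{(s)}$ is nonincreasing in $\delta > 0$ (a smaller $\delta$ admits fewer covers), we infer that $\mu_\delta^{(s)}(C)$ is independent of $\delta > 0$. So we can take an open interval of length greater than $1$ as an open cover of $C$ and conclude that $\mu^{(s)}(C) \le 1$. This proves that

$$\dim C = \frac{\ln 2}{\ln 3}.$$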
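The role of the exponent $\frac{\ln 2}{\ln 3}$ can be seen numerically on the natural covers of the Cantor ternary set, namely the $2^n$ closed intervals of length $3^{-n}$ left after $n$ steps of the construction. The sketch below is an illustration only: these sums are upper bounds for $\mu_\delta^{(s)}(C)$ with $\delta = 3^{-n}$, so they show the collapse to zero above the critical exponent but do not by themselves prove the lower bound. The names `S_CRIT` and `natural_cover_sum` are hypothetical.

```python
import math

S_CRIT = math.log(2) / math.log(3)  # the exponent ln 2 / ln 3 of the proposition


def natural_cover_sum(level, s):
    """Sum of delta(A_n)^s over the 2**level closed intervals of length
    3**(-level) that remain after `level` steps of the middle-third removal."""
    return (2 ** level) * (3.0 ** (-level)) ** s
```

At `s = S_CRIT` the sum stays equal to $1$ at every level, for larger `s` it tends to $0$, and for smaller `s` it grows without bound as the level increases.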

One can show that for every $\xi \in [0, 1]$, there exists a set $A \subseteq \mathbb{R}$, such that $\dim A = \xi$. This can be done using Cantor-like sets. These are sets which share most of the properties of the Cantor ternary set, but need not be Lebesgue-null. We can construct a Cantor-like set as follows. We start with the interval $[0, 1]$ and proceed inductively. We remove an open interval $B_{1,1}$ centered at $\frac{1}{2}$ with length less than $1$. We are left with closed intervals $D_{1,1}$ and $D_{1,2}$ each with length less than $\frac{1}{2}$. At the $n$-th step of this process we are left with closed intervals $D_{n,1}, D_{n,2}, \ldots, D_{n,2^n}$ each with length less than $\frac{1}{2^n}$. In the $(n+1)$-st step, from each closed interval $D_{n,k}$ we remove an open interval $E_{n+1,k}$ having the same center as $D_{n,k}$ and length less than the length of $D_{n,k}$. We set

$$S_n \stackrel{df}{=} \bigcup_{k=1}^{2^n} D_{n,k} \quad\text{and}\quad S \stackrel{df}{=} \bigcap_{n=1}^{\infty} S_n.$$

The set $S$ is a Cantor-like set. It is known (see Hewitt & Stromberg (1975, p. 71)) that $S$ is nonempty, compact, nowhere dense and perfect (just as the Cantor ternary set). However, unlike the Cantor ternary set, $S$ need not be Lebesgue-null. More precisely, consider a sequence $\{\vartheta_n\}_{n \ge 1}$ of positive numbers, such that

$$1 > 2\vartheta_1 > 4\vartheta_2 > \ldots > 2^n \vartheta_n > \ldots.$$

Following the construction of $S$ above, we remove from $[0, 1]$ an open interval centered at $\frac{1}{2}$ and having length $1 - 2\vartheta_1$. The remaining closed intervals $D_{1,1}$ and $D_{1,2}$ each have length $\vartheta_1$. Then from each of the intervals $D_{1,1}$ and $D_{1,2}$ we remove concentric open intervals each of length $\vartheta_1 - 2\vartheta_2$. We are left with closed intervals $D_{2,1}$, $D_{2,2}$, $D_{2,3}$ and $D_{2,4}$ each of length $\vartheta_2$. We continue this way. In the $n$-th step we are left with $2^n$ closed intervals each with length $\vartheta_n$. Then we have

$$\lambda^1(S) = \lim_{n \to +\infty} 2^n \vartheta_n$$

($\lambda^1$ being the Lebesgue measure on $\mathbb{R}$). If $\vartheta_n = \frac{1}{3^n}$, then $S = C$ is the Cantor ternary set. Although $S$ is nowhere dense, we can have $\lambda^1(S)$ as close to $1$ as we choose. Indeed, for a given $\xi \in (0, 1)$, let

$$\vartheta_n \stackrel{df}{=} \frac{1}{2^n} \cdot \frac{n\xi + 1}{n + 1} \quad \forall\, n \ge 1.$$

Then we have $\lambda^1(S) = \xi$.

Suppose that in the construction of the Cantor-like set at each step the closed subintervals are divided in the same proportions as the original, namely

$$\delta(D_{1,1}) = \delta(D_{1,2}) = \vartheta, \qquad \delta(D_{2,1}) = \delta(D_{2,2}) = \delta(D_{2,3}) = \delta(D_{2,4}) = \vartheta^2$$

and in general

$$\delta(D_{n,k}) = \vartheta^n \quad \forall\, k \in \{1, \ldots, 2^n\}.$$

Then the resulting Cantor-like set is denoted by $S_\vartheta$. Arguing as in the proof of Proposition 1.3.9, we obtain the following Proposition.


PROPOSITION 1.3.10 If $\vartheta \in \big(0, \frac{1}{2}\big)$, then $\dim S_\vartheta = \frac{\ln 2}{-\ln \vartheta}$.

REMARK 1.3.11 If $\vartheta = \frac{1}{3}$, then $S_\vartheta = C$ is the Cantor ternary set and Propositions 1.3.9 and 1.3.10 coincide.

COROLLARY 1.3.12 For each $\xi \in [0, 1]$, there exists $A \subseteq \mathbb{R}$, such that $\dim A = \xi$.

PROOF If $\xi = 0$, then we take $A$ to be a singleton. If $0 < \xi < 1$, then take $\vartheta = \exp\big(-\frac{\ln 2}{\xi}\big) < \frac{1}{2}$ and use Proposition 1.3.10. If $\xi = 1$, let $A = I = [0, 1]$. Then we can easily check that

$$\mu^{(s)}(A) = \begin{cases} +\infty & \text{if } 0 < s < 1, \\ 1 & \text{if } s = 1, \\ 0 & \text{if } s > 1. \end{cases}$$

Therefore $\dim A = 1$.

REMARK 1.3.13 An alternative way to define the Hausdorff dimension of a set $A \subseteq X$ is by

$$\dim A \stackrel{df}{=} \inf_{\substack{s > 0 \\ \mu^{(s)}(A) = 0}} s.$$

In general the Hausdorff dimension of a set may be any number in $[0, +\infty]$ and need not be an integer. Even if $\dim A$ is an integer and $k = \dim A > 0$, the set $A$ need not be a “$k$-dimensional surface” in any sense (see Federer (1969)).

Next we turn our attention to the case $X = \mathbb{R}^N$. Let us begin by recalling the definition of the $N$-dimensional outer measure $\lambda^N$.

DEFINITION 1.3.14 (a) We say that $Q \subseteq \mathbb{R}^N$ is a closed $N$-cube, if there exist $a_k < b_k$ for $k = 1, \ldots, N$, such that $Q = \prod_{k=1}^{N} [a_k, b_k]$. We set

$$|Q| \stackrel{df}{=} \prod_{k=1}^{N} (b_k - a_k).$$

(b) The Lebesgue $N$-dimensional outer measure $\lambda^N$, for all $A \subseteq \mathbb{R}^N$, is defined by

$$\lambda^N(A) \stackrel{df}{=} \inf \Big\{ \sum_{k=1}^{\infty} |Q_k| : A \subseteq \bigcup_{k=1}^{\infty} Q_k,\ Q_k \text{ is a closed } N\text{-cube} \Big\}.$$


REMARK 1.3.15 Clearly the definitions of $\lambda^1$ and $\mu^{(1)}$ on $\mathbb{R}$ coincide. We shall show that for any $N \ge 1$ the outer measures $\lambda^N$ and $\mu^{(N)}$ are closely related. In fact they differ by a multiplicative constant. This is not easy to establish and requires some preparation which culminates in the so-called “isodiametric inequality,” which says that the set of maximal volume for a given diameter is the sphere.

LEMMA 1.3.16 If $f : \mathbb{R}^N \longrightarrow [0, +\infty]$ is Lebesgue measurable, then the set

$$H \stackrel{df}{=} \big\{ (x, \vartheta) \in \mathbb{R}^N \times \mathbb{R} : 0 \le \vartheta \le f(x) \big\}$$

is Lebesgue measurable in $\mathbb{R}^{N+1}$.

PROOF Let

$$A \stackrel{df}{=} \big\{ x \in \mathbb{R}^N : f(x) = +\infty \big\}.$$

Then $A$ is Lebesgue measurable. Let $g : A^c \times \mathbb{R}_+ \longrightarrow \mathbb{R}_+$ be defined by

$$g(x, \vartheta) \stackrel{df}{=} f(x) - \vartheta \quad \forall\, (x, \vartheta) \in A^c \times \mathbb{R}_+.$$

Evidently $g$ is a Carathéodory function (i.e., it is Lebesgue measurable in $x \in \mathbb{R}^N$ and continuous in $\vartheta \in \mathbb{R}$). Therefore $g$ is Lebesgue measurable on $A^c \times \mathbb{R}_+$ and so

$$H_0 \stackrel{df}{=} \big\{ (x, \vartheta) \in A^c \times \mathbb{R}_+ : \vartheta \le f(x) \big\}$$

is Lebesgue measurable in $\mathbb{R}^{N+1}$. Finally note that $H = H_0 \cup (A \times \mathbb{R}_+)$.

In what follows for $a, b \in \mathbb{R}^N$, $\|a\|_{\mathbb{R}^N} = 1$, we introduce the following objects:

$$L(a, b) \stackrel{df}{=} \big\{ b + ta : t \in \mathbb{R} \big\}$$

- the line passing through $b$ in the direction of $a$ and

$$P(a) \stackrel{df}{=} \big\{ x \in \mathbb{R}^N : (x, a)_{\mathbb{R}^N} = 0 \big\}$$

- the plane passing through the origin, perpendicular to $a$.


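DEFINITION 1.3.17 Let $a \in \mathbb{R}^N$ with $\|a\|_{\mathbb{R}^N} = 1$ and $A \subseteq \mathbb{R}^N$. We define the Steiner symmetrization of $A$ with respect to the plane $P(a)$ to be the set

$$S(a, A) \stackrel{df}{=} \bigcup_{\substack{b \in P(a) \\ A \cap L(a, b) \ne \emptyset}} \Big\{ b + ta : |t| \le \frac{1}{2}\mu^{(1)}\big(A \cap L(a, b)\big) \Big\}.$$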
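To make the definition concrete, the following sketch works in $\mathbb{R}^2$ with $a = (0, 1)$ and with sets $A$ that consist of finitely many vertical segments. This is an assumption made only for illustration: each fiber $A \cap L(a, b)$ is given as a list of disjoint intervals, its $\mu^{(1)}$-measure is taken to be the total length, and the names `steiner_symmetrize`, `diameter` and `total_length` are hypothetical.

```python
import math
from itertools import combinations_with_replacement


def steiner_symmetrize(fibers):
    """fibers maps b (a point of P(a), here a real number) to the disjoint
    intervals (lo, hi) that make up A ∩ L(a, b).

    Returns the fibers of S(a, A): one interval centered at 0 with the same
    total length; lines that miss A contribute nothing.
    """
    result = {}
    for b, intervals in fibers.items():
        if not intervals:  # A ∩ L(a, b) is empty
            continue
        length = sum(hi - lo for lo, hi in intervals)
        result[b] = [(-length / 2.0, length / 2.0)]
    return result


def diameter(fibers):
    """Diameter of a finite union of vertical segments (attained at endpoints)."""
    points = [(b, t) for b, ivs in fibers.items() for iv in ivs for t in iv]
    pairs = combinations_with_replacement(points, 2)
    return max((math.dist(p, q) for p, q in pairs), default=0.0)


def total_length(fibers):
    """Sum of the lengths of all fibers."""
    return sum(hi - lo for ivs in fibers.values() for lo, hi in ivs)
```

On such a set one can compare `diameter` and `total_length` before and after symmetrization: the fiber lengths are preserved and the diameter does not grow.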

REMARK 1.3.18 The above defined Steiner symmetrization with respect to an $(N - 1)$-dimensional subspace $Y$ of $\mathbb{R}^N$ is the operation which associates to each $A \subseteq \mathbb{R}^N$, the set $V \subseteq \mathbb{R}^N$, such that for every line $L$ perpendicular to $Y$ either

• $L \cap A = \emptyset$ and $L \cap V = \emptyset$; or

• $L \cap A \ne \emptyset$ and $L \cap V$ is a closed segment centered in $Y$ and $\mu^{(1)}(L \cap A) = \mu^{(1)}(L \cap V)$.

If $A$ is compact, then $V$ is compact too and $\lambda^N(A) = \lambda^N(V)$. Also if $A$ is convex, then $V$ is convex too.

The next Proposition summarizes the properties of the Steiner symmetrization.

PROPOSITION 1.3.19 Let $A \subseteq \mathbb{R}^N$ and $a \in \mathbb{R}^N$.

(a) $\delta\big(S(a, A)\big) \le \delta(A)$.

(b) If $A \subseteq \mathbb{R}^N$ is Lebesgue measurable, then so is $S(a, A)$ and $\lambda^N\big(S(a, A)\big) = \lambda^N(A)$.

PROOF (a) Assume that $\delta(A) < +\infty$, or otherwise the result is trivial. Also we may assume that $A$ is closed. For a given $\varepsilon > 0$, let $x, y \in S(a, A)$ be such that

$$\delta\big(S(a, A)\big) - \varepsilon \le \|x - y\|_{\mathbb{R}^N}.$$

Let

$$b \stackrel{df}{=} x - (x, a)_{\mathbb{R}^N}\, a \quad\text{and}\quad c \stackrel{df}{=} y - (y, a)_{\mathbb{R}^N}\, a.$$


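Then $b, c \in P(a)$. Let us set

$$r \stackrel{df}{=} \inf \{ t \in \mathbb{R} : b + ta \in A \}, \qquad s \stackrel{df}{=} \sup \{ t \in \mathbb{R} : b + ta \in A \},$$

$$u \stackrel{df}{=} \inf \{ t \in \mathbb{R} : c + ta \in A \}, \qquad v \stackrel{df}{=} \sup \{ t \in \mathbb{R} : c + ta \in A \}.$$

We may assume without any loss of generality that $v - r \ge s - u$. So

$$v - r \ge \frac{1}{2}(v - r) + \frac{1}{2}(s - u) = \frac{1}{2}(s - r) + \frac{1}{2}(v - u) \ge \frac{1}{2}\mu^{(1)}\big(A \cap L(a, b)\big) + \frac{1}{2}\mu^{(1)}\big(A \cap L(a, c)\big).$$

Note that

$$\big| (x, a)_{\mathbb{R}^N} \big| \le \frac{1}{2}\mu^{(1)}\big(A \cap L(a, b)\big) \quad\text{and}\quad \big| (y, a)_{\mathbb{R}^N} \big| \le \frac{1}{2}\mu^{(1)}\big(A \cap L(a, c)\big)$$

(recall that $x, y \in S(a, A)$). It follows that

$$v - r \ge \big| (x, a)_{\mathbb{R}^N} \big| + \big| (y, a)_{\mathbb{R}^N} \big| \ge \big| (x - y, a)_{\mathbb{R}^N} \big|.$$

Hence we have

$$\big( \delta\big(S(a, A)\big) - \varepsilon \big)^2 \le \|x - y\|_{\mathbb{R}^N}^2 \le \|b - c\|_{\mathbb{R}^N}^2 + \big| (x - y, a)_{\mathbb{R}^N} \big|^2 \le \|b - c\|_{\mathbb{R}^N}^2 + (v - r)^2 = \big\| (b + ra) - (c + va) \big\|_{\mathbb{R}^N}^2 \le \delta(A)^2$$

(note that $A$ is closed and so $b + ra,\ c + va \in A$). It follows that

$$\delta\big(S(a, A)\big) - \varepsilon \le \delta(A).$$

Let $\varepsilon \searrow 0$, to conclude that $\delta\big(S(a, A)\big) \le \delta(A)$.

(b) Recall that the Lebesgue measure $\lambda^N$ is rotation invariant. So we may take

$$a = e_N = (0, \ldots, 0, 1)^{T}.$$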

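Then $P(a) = P(e_N) = \mathbb{R}^{N-1}$. Note that the function $f : \mathbb{R}^{N-1} \to \mathbb{R}$, defined by

$$f(b) \stackrel{df}{=} \mu^{(1)}\big(A \cap L(a, b)\big) \qquad \forall\, b \in \mathbb{R}^{N-1},$$

is measurable (Fubini's theorem) and

$$\lambda^N(A) = \int_{\mathbb{R}^{N-1}} f(b)\, d\lambda^{N-1}(b)$$

(since $\lambda^1 = \mu^{(1)}$; see Remark 1.3.15). So by virtue of Lemma 1.3.16, we have that

$$S(a, A) = \Big\{ (b, \vartheta) \in \mathbb{R}^{N-1} \times \mathbb{R} : -\frac{f(b)}{2} \le \vartheta \le \frac{f(b)}{2} \Big\} \setminus \Big\{ (b, 0) \in \mathbb{R}^{N-1} \times \mathbb{R} : A \cap L(a, b) = \emptyset \Big\}$$

is Lebesgue measurable in $\mathbb{R}^N$ and, moreover,

$$\lambda^N\big(S(a, A)\big) = \int_{\mathbb{R}^{N-1}} f(b)\, d\lambda^{N-1}(b) = \lambda^N(A).$$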
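The two properties in Proposition 1.3.19 can be observed on a toy example. The sketch below is only an illustration under simplifying assumptions that are not part of the text: the set lives in $\mathbb{R}^2$, it meets only finitely many vertical lines, and each slice is a finite union of closed intervals. The helper names (`total_length`, `steiner_symmetrize`, `diameter`) and the sample set `A` are hypothetical.

```python
import math

def total_length(intervals):
    """Sum of the lengths of disjoint closed intervals (lo, hi)."""
    return sum(hi - lo for lo, hi in intervals)

def steiner_symmetrize(columns):
    """Replace the slice over each x by one centered interval of the same length."""
    return {
        x: [(-total_length(iv) / 2.0, total_length(iv) / 2.0)]
        for x, iv in columns.items()
        if iv  # empty slices stay empty
    }

def diameter(columns):
    """Diameter of the union of vertical segments (attained at segment endpoints)."""
    pts = [(x, y) for x, iv in columns.items() for lo, hi in iv for y in (lo, hi)]
    return max(math.dist(p, q) for p in pts for q in pts)

# Hypothetical sample set: slices over x = 0, 1, 2 in the direction a = e_2.
A = {0.0: [(0.0, 1.0), (3.0, 4.0)], 1.0: [(2.0, 5.0)], 2.0: [(-1.0, 0.0)]}
S = steiner_symmetrize(A)

print("slice lengths preserved:",
      all(total_length(A[x]) == total_length(S[x]) for x in A))
print("diameter before:", diameter(A), "after:", diameter(S))
```

Each slice keeps its one-dimensional measure, which mirrors part (b), and the diameter of the symmetrized set does not exceed the original one, which mirrors part (a).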

Now we are properly equipped to prove the so-called "isodiametric inequality," which states that, among all sets in $\mathbb{R}^N$ with a given diameter, the ball has the maximum $N$-dimensional Lebesgue outer measure ($N$-volume).

THEOREM 1.3.20 (Isodiametric Inequality) For all $A \subseteq \mathbb{R}^N$, we have

$$\lambda^N(A) \le a(N) \left(\frac{\delta(A)}{2}\right)^N,$$

where $a(N) \stackrel{df}{=} \dfrac{\pi^{N/2}}{\left(\frac{N}{2}\right)!}$ is the volume of the unit ball in $\mathbb{R}^N$.

PROOF

If δ(A) = +∞, then there is nothing to prove. So suppose that δ(A) < +∞.


Let $\{e_k\}_{k=1}^N$ be the standard basis of $\mathbb{R}^N$. We introduce

$$A_1 = S(e_1, A), \quad A_2 = S(e_2, A_1), \quad \ldots, \quad A_N = S(e_N, A_{N-1}).$$

Let us set $A^* = A_N$.

Claim 1. $A^*$ is symmetric with respect to the origin.

By virtue of the definition of the Steiner symmetrization, $A_1$ is symmetric with respect to the plane $P(e_1)$. Let $1 \le k \le N - 1$ and suppose that $A_k$ is symmetric with respect to $P(e_1), \ldots, P(e_k)$. Again $A_{k+1}$ is symmetric with respect to $P(e_{k+1})$. Let us fix $1 \le m \le k$ and let $R_m : \mathbb{R}^N \to \mathbb{R}^N$ be the reflection with respect to $P(e_m)$. Let $b \in P(e_{k+1})$. Because $R_m(A_k) = A_k$, we have

$$\mu^{(1)}\big(A_k \cap L(e_{k+1}, b)\big) = \mu^{(1)}\big(A_k \cap L(e_{k+1}, R_m(b))\big),$$

so

$$\big\{ t \in \mathbb{R} : b + t e_{k+1} \in A_{k+1} \big\} = \big\{ t \in \mathbb{R} : R_m(b) + t e_{k+1} \in A_{k+1} \big\}$$

and thus $R_m(A_{k+1}) = A_{k+1}$, i.e., $A_{k+1}$ is symmetric with respect to $P(e_m)$. It follows that $A^* = A_N$ is symmetric with respect to $P(e_1), \ldots, P(e_N)$, hence it is symmetric with respect to the origin.

Claim 2. $\lambda^N(A^*) \le \dfrac{\pi^{N/2}}{\left(\frac{N}{2}\right)!} \left(\dfrac{\delta(A^*)}{2}\right)^N$.

Let $x \in A^*$. Then because of Claim 1, we have $-x \in A^*$ and so $2\|x\|_{\mathbb{R}^N} \le \delta(A^*)$. Hence

$$A^* \subseteq \overline{B}_{\delta(A^*)/2}(0) = \Big\{ y \in \mathbb{R}^N : \|y\|_{\mathbb{R}^N} \le \frac{\delta(A^*)}{2} \Big\}$$

and so

$$\lambda^N(A^*) \le \lambda^N\big(\overline{B}_{\delta(A^*)/2}(0)\big) = \frac{\pi^{N/2}}{\left(\frac{N}{2}\right)!} \left(\frac{\delta(A^*)}{2}\right)^N.$$

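Using Claim 2, we can now derive the isodiametric inequality. Note that the closure $\overline{A} \subseteq \mathbb{R}^N$ is Lebesgue measurable and so, applying the construction above to $\overline{A}$ and writing $(\overline{A})^*$ for the resulting set, by Proposition 1.3.19 we have

$$\lambda^N\big((\overline{A})^*\big) = \lambda^N\big(\overline{A}\big) \quad \text{and} \quad \delta\big((\overline{A})^*\big) \le \delta\big(\overline{A}\big).$$

Using Claim 2, it follows that

$$\lambda^N(A) \le \lambda^N\big(\overline{A}\big) = \lambda^N\big((\overline{A})^*\big) \le \frac{\pi^{N/2}}{\left(\frac{N}{2}\right)!} \left(\frac{\delta\big((\overline{A})^*\big)}{2}\right)^N \le \frac{\pi^{N/2}}{\left(\frac{N}{2}\right)!} \left(\frac{\delta\big(\overline{A}\big)}{2}\right)^N = \frac{\pi^{N/2}}{\left(\frac{N}{2}\right)!} \left(\frac{\delta(A)}{2}\right)^N.$$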
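A quick numerical check of the isodiametric inequality can be made on the unit cube, whose volume is $1$ and whose diameter is $\sqrt{N}$. The sketch below is illustrative only; it assumes that $(N/2)!$ is read as $\Gamma(N/2 + 1)$, and the function names `unit_ball_volume` and `isodiametric_bound` are hypothetical.

```python
import math

def unit_ball_volume(n):
    """a(N) = pi^(N/2) / Gamma(N/2 + 1), the volume of the unit ball in R^N."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def isodiametric_bound(diameter, n):
    """Right-hand side a(N) * (diameter / 2)^N of the isodiametric inequality."""
    return unit_ball_volume(n) * (diameter / 2) ** n

# Unit cube in R^N: volume 1, diameter sqrt(N).
for n in range(1, 7):
    bound = isodiametric_bound(math.sqrt(n), n)
    print(n, "cube volume 1.0 <= bound", bound, ":", 1.0 <= bound + 1e-12)
```

For $N = 1$ the cube is itself a ball (an interval), so the bound is attained; for larger $N$ the gap between the cube's volume and the bound grows.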

THEOREM 1.3.21 If $A \subseteq \mathbb{R}^N$, then $\lambda^N(A) = c_N\, \mu^{(N)}(A)$, with

$$c_N \stackrel{df}{=} \frac{\pi^{N/2}}{2^N \left(\frac{N}{2}\right)!}.$$

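PROOF For a given $\varepsilon > 0$, we can find a cover $\{C_n\}_{n \ge 1}$ of $A$ consisting of closed, convex sets, such that

$$\sum_{n=1}^{\infty} \delta(C_n)^N \le \mu^{(N)}(A) + \varepsilon.$$

By virtue of Theorem 1.3.20, we have

$$\lambda^N(C_n) \le c_N\, \delta(C_n)^N \qquad \forall\, n \ge 1.$$

So

$$\lambda^N(A) \le \sum_{n=1}^{\infty} \lambda^N(C_n) \le c_N \sum_{n=1}^{\infty} \delta(C_n)^N \le c_N\, \mu^{(N)}(A) + c_N\, \varepsilon.$$

Let $\varepsilon \searrow 0$ to conclude that

$$\lambda^N(A) \le c_N\, \mu^{(N)}(A). \tag{1.24}$$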

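To prove the opposite inequality, first we show that $\mu^{(N)}$ is absolutely continuous with respect to $\lambda^N$ (see Definition A.2.22). Note that for any $N$-cube $Q$, we have

$$\lambda^N(Q) = |Q| = \left(\frac{\delta(Q)}{\sqrt{N}}\right)^N.$$

So for a given $\delta > 0$, we have

$$\mu_\delta^{(N)}(A) \le \inf \Big\{ \sum_{n=1}^{\infty} \delta(Q_n)^N : Q_n \text{ is an } N\text{-cube},\ A \subseteq \bigcup_{n=1}^{\infty} Q_n,\ \delta(Q_n) \le \delta \Big\} \le \big(\sqrt{N}\big)^N \lambda^N(A).$$

Let $\delta \searrow 0$, to conclude that $\mu^{(N)}$ is absolutely continuous with respect to $\lambda^N$ (see Definition A.2.22). Next, for given $\varepsilon, \delta > 0$, we can find a cover $\{Q_n\}_{n \ge 1}$ of $A$ consisting of $N$-cubes, such that

$$\delta(Q_n) < \delta \quad \forall\, n \ge 1 \qquad \text{and} \qquad \sum_{n=1}^{\infty} \lambda^N(Q_n) \le \lambda^N(A) + \varepsilon. \tag{1.25}$$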

We may suppose that the $N$-cubes are open by expanding them slightly so that the above inequality remains valid. Invoking Vitali's covering theorem (see Theorem 1.2.5), for every $n \ge 1$ we can find disjoint balls $\{B_{n,k}\}_{k \ge 1}$ contained in $Q_n$, such that

$$\delta(B_{n,k}) \le \delta \qquad \text{and} \qquad \lambda^N\Big(Q_n \setminus \bigcup_{k=1}^{\infty} B_{n,k}\Big) = 0.$$

By virtue of the absolute continuity of $\mu^{(N)}$ with respect to $\lambda^N$, we have

$$\mu^{(N)}\Big(Q_n \setminus \bigcup_{k=1}^{\infty} B_{n,k}\Big) = 0 \qquad \text{and} \qquad \mu_\delta^{(N)}\Big(Q_n \setminus \bigcup_{k=1}^{\infty} B_{n,k}\Big) = 0.$$

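Therefore, using (1.25), we have

$$\begin{aligned}
\mu_\delta^{(N)}(A) &\le \sum_{n=1}^{\infty} \mu_\delta^{(N)}(Q_n) \le \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \mu_\delta^{(N)}(B_{n,k}) + \sum_{n=1}^{\infty} \mu_\delta^{(N)}\Big(Q_n \setminus \bigcup_{k=1}^{\infty} B_{n,k}\Big) \\
&\le \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \delta(B_{n,k})^N = \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \frac{1}{c_N} \lambda^N(B_{n,k}) \\
&\le \frac{1}{c_N} \sum_{n=1}^{\infty} \lambda^N(Q_n) \le \frac{1}{c_N} \lambda^N(A) + \frac{\varepsilon}{c_N}.
\end{aligned}$$

Let $\varepsilon, \delta \searrow 0$, to conclude that

$$c_N\, \mu^{(N)}(A) \le \lambda^N(A). \tag{1.26}$$

From (1.24) and (1.26), we conclude that $\lambda^N = c_N\, \mu^{(N)}$.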
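As a sanity check on the constant $c_N$ of Theorem 1.3.21, the following worked values may help. They read the factorial $(N/2)!$ as $\Gamma(N/2 + 1)$, an interpretation assumed here rather than stated in the theorem, and they use the standard values $\Gamma(3/2) = \sqrt{\pi}/2$ and $\Gamma(5/2) = 3\sqrt{\pi}/4$.

```latex
\begin{align*}
c_1 &= \frac{\pi^{1/2}}{2\,\Gamma(3/2)} = \frac{\sqrt{\pi}}{2 \cdot \sqrt{\pi}/2} = 1, \\
c_2 &= \frac{\pi}{2^2\,\Gamma(2)} = \frac{\pi}{4}, \\
c_3 &= \frac{\pi^{3/2}}{2^3\,\Gamma(5/2)} = \frac{\pi^{3/2}}{8 \cdot 3\sqrt{\pi}/4} = \frac{\pi}{6}.
\end{align*}
```

In each case $c_N \delta^N$ with $\delta = 2r$ gives the volume of a ball of radius $r$, namely $2r$, $\pi r^2$ and $\tfrac{4}{3}\pi r^3$, respectively, which is the identity used for the balls $B_{n,k}$ in the proof.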


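REMARK 1.3.22 Some authors, in order to get rid of the multiplicative constant $c_N$, normalize the definition of the Hausdorff measures on $\mathbb{R}^N$. So if $C \subseteq \mathbb{R}^N$, $0 \le s < +\infty$, $0 < \delta \le +\infty$, they set

$$\mu_\delta^{(s)}(C) \stackrel{df}{=} \inf \Big\{ \sum_{n=1}^{\infty} a(s) \Big(\frac{\delta(A_n)}{2}\Big)^s : C \subseteq \bigcup_{n=1}^{\infty} A_n,\ \delta(A_n) \le \delta \Big\},$$

where $a(s) \stackrel{df}{=} \dfrac{\pi^{s/2}}{\Gamma\left(\frac{s}{2} + 1\right)}$. Here

$$\Gamma(s) \stackrel{df}{=} \int_0^{+\infty} x^{s-1} e^{-x}\, dx$$

is the Euler gamma function. The Hausdorff $s$-dimensional outer measure $\mu^{(s)}$ is defined by

$$\mu^{(s)}(C) = \lim_{\delta \searrow 0} \mu_\delta^{(s)}(C) = \sup_{\delta > 0} \mu_\delta^{(s)}(C)$$

(cf., e.g., Evans & Gariepy (1992, p. 60)). Recall that

$$\lambda^N\big(B(x, r)\big) = a(N)\, r^N \qquad \forall\, x \in \mathbb{R}^N.$$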

In this case Theorem 1.3.21 says that $\lambda^N = \mu^{(N)}$. Note that $\mu^{(0)}$ is the counting measure. Let us prove some further properties of the Hausdorff measures on $\mathbb{R}^N$.

PROPOSITION 1.3.23 Let $0 \le s < +\infty$. We have
(a) $\mu^{(s)}(A) = 0$ for all $A \subseteq \mathbb{R}^N$ and all $s > N$.
(b) $\mu^{(s)}(\xi A) = \xi^s \mu^{(s)}(A)$ for all $A \subseteq \mathbb{R}^N$ and all $\xi > 0$.
(c) $\mu^{(s)}\big(K(A)\big) = \mu^{(s)}(A)$ for all $A \subseteq \mathbb{R}^N$ and for any affine isometry $K : \mathbb{R}^N \to \mathbb{R}^N$.

PROOF (a) Let $Q = (0, 1)^N$ and let $m \ge 1$ be an integer. For

$$k = (k_i)_{i=1}^N \in K \stackrel{df}{=} \{0, \ldots, m - 1\}^N,$$

we set

$$Q_k \stackrel{df}{=} \prod_{i=1}^{N} \Big[ \frac{k_i}{m}, \frac{k_i + 1}{m} \Big].$$

Note that

$$Q \subseteq \bigcup_{k \in K} Q_k \qquad \text{and} \qquad \delta(Q_k) = \frac{\sqrt{N}}{m}.$$

So we have

$$\mu^{(s)}_{\sqrt{N}/m}(Q) \le \sum_{k \in K} \delta(Q_k)^s = m^{N-s} \big(\sqrt{N}\big)^s.$$

Letting $m \to +\infty$, since $s > N$, we obtain $\mu^{(s)}(Q) = 0$, from which it follows that

$$\mu^{(s)}(\mathbb{R}^N) = 0.$$

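(b) Note that for all $C \subseteq \mathbb{R}^N$, we have $\delta(\xi C) = \xi\, \delta(C)$. So the result follows at once from Definition 1.3.5.

(c) Note that for all $C \subseteq \mathbb{R}^N$, we have $\delta\big(K(C)\big) = \delta(C)$. Again the result follows from Definition 1.3.5.

The next Proposition suggests a convenient way to check that $\mu^{(s)}$ vanishes on a set.

PROPOSITION 1.3.24 If $A \subseteq \mathbb{R}^N$, $0 < \delta \le +\infty$ and $0 \le s < +\infty$ are such that $\mu_\delta^{(s)}(A) = 0$, then $\mu^{(s)}(A) = 0$.

PROOF If $s = 0$, then $\mu_\delta^{(0)}(A) = 0$ implies that $A = \emptyset$ and so $\mu^{(0)}(A) = 0$. So suppose that $s > 0$. For a given $\varepsilon > 0$, we can find $\{C_n\}_{n \ge 1}$, such that

$$A \subseteq \bigcup_{n=1}^{\infty} C_n, \qquad \delta(C_n) \le \delta \qquad \text{and} \qquad \sum_{n=1}^{\infty} \delta(C_n)^s \le \varepsilon.$$

Evidently

$$\delta(C_n)^s \le \varepsilon \qquad \forall\, n \ge 1,$$

that is, $\delta(C_n) \le \varepsilon^{1/s}$ for all $n \ge 1$, and so $\mu^{(s)}_{\varepsilon^{1/s}}(A) \le \varepsilon$. Let $\varepsilon \searrow 0$, to conclude that $\mu^{(s)}(A) = 0$.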
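The covering estimate in the proof of Proposition 1.3.23(a) is easy to tabulate. The following sketch (the function name `cover_sum` is hypothetical) evaluates the sum of $\delta(Q_k)^s$ over the $m^N$ subcubes of side $1/m$, which equals $m^{N-s}(\sqrt{N})^s$; it only illustrates how the bound behaves as $m$ grows and is not a computation of a Hausdorff measure.

```python
import math

def cover_sum(N, s, m):
    """Sum of diam(Q_k)^s over the m^N closed subcubes of side 1/m of the unit cube."""
    return m ** N * (math.sqrt(N) / m) ** s

# s > N: the bound tends to 0 as the grid is refined.
for m in (1, 10, 100, 1000):
    print("N=2, s=2.5, m=%d:" % m, cover_sum(2, 2.5, m))

# s = N: the bound stays constant (equal to N^(N/2)), so the argument gives no decay.
print("N=2, s=2, m=7:", cover_sum(2, 2.0, 7))
```

The contrast between the two cases shows why the hypothesis $s > N$ is needed in part (a).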


Taking into account that for a Lipschitz continuous function $f$ with constant $c > 0$ and for every $A \subseteq \mathbb{R}^N$ we have

$$\delta\big(f(A)\big) \le c\, \delta(A),$$

we obtain the following result.

PROPOSITION 1.3.25 If $f : \mathbb{R}^N \to \mathbb{R}^M$ is a Lipschitz continuous function with Lipschitz constant $c > 0$ (see Definition 1.5.1), $A \subseteq \mathbb{R}^N$ and $0 \le s < +\infty$, then $\mu^{(s)}\big(f(A)\big) \le c^s \mu^{(s)}(A)$.

We conclude this section by returning to the notion of Hausdorff dimension (see Definition 1.3.8) and having a second look at this concept. The Hausdorff dimension has an intuitive appeal when familiar objects are under consideration. So for example $\dim \mathbb{R}^N = N$ (see Theorem 1.3.21). Suppose we want to determine the Hausdorff dimension of a curve $C \subseteq \mathbb{R}^3$. Our first guess will be that $\dim C = 1$. But recall that there are curves in $\mathbb{R}^3$ which fill the unit cube. Such a curve must have Hausdorff dimension 3. Therefore we must proceed with caution.

DEFINITION 1.3.26 Let $(X, d)$ be a metric space.
(a) By a curve in $X$ we mean the image $f\big([0, 1]\big)$ of a continuous function $f : [0, 1] \to X$.
(b) The length of a curve $C = f\big([0, 1]\big)$ is defined by

$$l(C) \stackrel{df}{=} \sup \sum_{k=1}^{m} d\big(f(x_{k-1}), f(x_k)\big),$$

where the supremum is taken over all partitions $0 = x_0 < x_1 < \ldots < x_m = 1$ of $[0, 1]$.
(c) The curve $C$ is said to be rectifiable, if $l(C) < +\infty$.

REMARK 1.3.27 A curve $C$ is a continuum, i.e., a compact and connected set in $X$. In particular a curve is a Borel set; hence it is also $\mu^{(s)}$-measurable. Moreover, if in Definition 1.3.26(a) $f$ is injective, then $f^{-1}$ exists and is continuous and so $C$ is the homeomorphic image of $[0, 1]$. Also in Definition 1.3.26(a), we can replace $[0, 1]$ by any closed bounded interval $[a, b]$. Some authors require $f$ to be injective.


PROPOSITION 1.3.28 If $(X, d)$ is a metric space, $f : [0, 1] \to X$ is a nonconstant curve with length $l$ and $C = f\big([0, 1]\big)$, then
(a) $0 < \mu^{(1)}(C) \le l$;
(b) if $f$ is injective, then $\mu^{(1)}(C) = l$.
Therefore, if $C$ is rectifiable (i.e., $l < +\infty$), then $\dim C = 1$.

PROOF (a) First we show that $\mu^{(1)}(C) \le l$. Assume that $l < +\infty$, or otherwise there is nothing to prove. Fix $n \ge 1$. Let $\{A_k\}_{k=1}^m$ be a collection of closed subarcs of $C$, such that

$$C = \bigcup_{k=1}^{m} A_k, \qquad \delta(A_k) \le \frac{1}{n} \qquad \text{and} \qquad \mu^{(1)}_{1/n}(C) \le \sum_{k=1}^{m} \delta(A_k). \tag{1.27}$$

Let us explicitly construct the subarcs $A_k$ for $k \in \{1, \ldots, m\}$. Note that $f$ is uniformly continuous and so we can find $\eta > 0$, such that

$$d\big(f(x), f(y)\big) < \frac{1}{n} \qquad \forall\, x, y \in [0, 1],\ |x - y| < \eta.$$

Consider a partition $0 = x_0 < x_1 < \ldots < x_m = 1$ of $[0, 1]$, such that

$$|x_k - x_{k-1}| < \eta \qquad \forall\, k \in \{1, \ldots, m\}.$$

Let

$$A_k \stackrel{df}{=} f\big([x_{k-1}, x_k]\big) \qquad \forall\, k \in \{1, \ldots, m\}.$$

Evidently the subarcs $\{A_k\}_{k=1}^m$ cover $C$ and

$$d\big(f(x_{k-1}), f(x_k)\big) \le \delta(A_k) < \frac{1}{n} \qquad \forall\, k \in \{1, \ldots, m\}.$$

Note that every $A_k$ is compact and so we can find points $y_k, z_k \in [x_{k-1}, x_k]$, $y_k \le z_k$, such that

$$d\big(f(y_k), f(z_k)\big) = \delta(A_k).$$

We generate the finer partition $0 \le y_1 \le z_1 \le y_2 \le z_2 \le \ldots \le y_m \le z_m \le 1$.

From (1.27), we have

$$\mu^{(1)}_{1/n}(C) \le \sum_{k=1}^{m} \delta(A_k) = \sum_{k=1}^{m} d\big(f(y_k), f(z_k)\big) \le l.$$

Passing to the limit as $n \to +\infty$, we obtain that $\mu^{(1)}(C) \le l$.

Next we show that $0 < \mu^{(1)}(C)$. To this end note that if $0 \le a < b \le 1$, then

$$d\big(f(a), f(b)\big) \le \mu^{(1)}\big(f([a, b])\big). \tag{1.28}$$

To see this let $h : E \stackrel{df}{=} f([a, b]) \to \mathbb{R}$ be the function

$$h(u) \stackrel{df}{=} d\big(u, f(a)\big).$$

Evidently $h$ is a Lipschitz continuous function with Lipschitz constant 1 and

$$J \stackrel{df}{=} \big[0, h(f(b))\big] = \big[0, d\big(f(a), f(b)\big)\big] \subseteq h(E),$$

since $h(E)$ is connected and contains both $0 = h(f(a))$ and $h(f(b))$. So, from Proposition 1.3.25, we have

$$d\big(f(a), f(b)\big) = \lambda^1(J) = \mu^{(1)}(J) \le \mu^{(1)}\big(h(E)\big) \le \mu^{(1)}(E).$$

This proves inequality (1.28). From (1.28), and since for appropriately chosen $a, b$ we have $d\big(f(a), f(b)\big) > 0$ (recall that the curve is nonconstant), we conclude that $0 < \mu^{(1)}(C)$.

(b) Now suppose that $f$ is injective. Let $0 = x_0 < x_1 < \ldots < x_m = 1$ be a partition of $[0, 1]$. The sets

$$A_k \stackrel{df}{=} f\big([x_{k-1}, x_k]\big)$$

are Borel subsets of $X$ which, by the injectivity of $f$, overlap only at the common endpoints $f(x_k)$ of consecutive subarcs, a $\mu^{(1)}$-null set. Using inequality (1.28) on each subarc, we obtain

$$\sum_{k=1}^{m} d\big(f(x_{k-1}), f(x_k)\big) \le \sum_{k=1}^{m} \mu^{(1)}\big(f([x_{k-1}, x_k])\big) = \mu^{(1)}\Big(\bigcup_{k=1}^{m} f([x_{k-1}, x_k])\Big) = \mu^{(1)}\big(f([0, 1])\big) = \mu^{(1)}(C).$$

Since the partition of $[0, 1]$ was arbitrary, it follows that $l \le \mu^{(1)}(C)$. Combining this with (a), we obtain that $l = \mu^{(1)}(C)$.
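The supremum over partitions in Definition 1.3.26(b) can be approximated from below by polygonal sums. The sketch below uses uniform partitions and a half circle as a hypothetical example curve; it assumes a curve in the Euclidean plane, whereas the definition allows any metric space. The function names are illustrative only.

```python
import math

def polygonal_length(f, m):
    """Sum of d(f(x_{k-1}), f(x_k)) over the uniform partition of [0, 1] with m pieces."""
    pts = [f(k / m) for k in range(m + 1)]
    return sum(math.dist(pts[k - 1], pts[k]) for k in range(1, m + 1))

def half_circle(t):
    """Injective curve from (1, 0) to (-1, 0) along the upper unit circle; its length is pi."""
    return (math.cos(math.pi * t), math.sin(math.pi * t))

lengths = [polygonal_length(half_circle, m) for m in (1, 2, 4, 8, 16, 1024)]
for m, value in zip((1, 2, 4, 8, 16, 1024), lengths):
    print(m, value)
```

Each polygonal sum is a lower bound for $l$, and since this curve is injective, Proposition 1.3.28(b) identifies the limit value $\pi$ with $\mu^{(1)}(C)$.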

1.4 Differentiation of Hausdorff Measures

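From general measure theory, we know that the differentiation theory of real functions can be extended to a theory of differentiation for measures, which has many similar features and interesting problems. For the Lebesgue measures $\lambda^N$, $N \ge 1$, one of the basic results of this theory is the so-called Lebesgue density theorem, which we recall here.

THEOREM 1.4.1 (Lebesgue Density Theorem) If $A \subseteq \mathbb{R}^N$ is a Lebesgue measurable set, then

$$\lim_{r \searrow 0} \frac{\lambda^N\big(\overline{B}_r(x) \cap A\big)}{\lambda^N\big(\overline{B}_r(x)\big)} = \begin{cases} 1 & \text{for } \lambda^N\text{-a.a. } x \in A, \\ 0 & \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N \setminus A. \end{cases}$$

DEFINITION 1.4.2 Let $A \subseteq \mathbb{R}^N$ and $x \in \mathbb{R}^N$. We say that:
(a) $x$ is a point of density of $A$, if

$$\lim_{r \searrow 0} \frac{\lambda^N\big(\overline{B}_r(x) \cap A\big)}{\lambda^N\big(\overline{B}_r(x)\big)} = 1;$$

(b) $x$ is a point of dispersion of $A$, if

$$\lim_{r \searrow 0} \frac{\lambda^N\big(\overline{B}_r(x) \cap A\big)}{\lambda^N\big(\overline{B}_r(x)\big)} = 0.$$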
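For a concrete feel for Definition 1.4.2, consider the hypothetical example $A = [0, 1] \subseteq \mathbb{R}$ (so $N = 1$ and $\overline{B}_r(x) = [x - r, x + r]$). The sketch below computes the density ratio exactly from interval overlaps; the function name `density_ratio` is an assumption made for illustration.

```python
def density_ratio(x, r, a=0.0, b=1.0):
    """lambda^1([x - r, x + r] ∩ [a, b]) / lambda^1([x - r, x + r])."""
    overlap = max(0.0, min(x + r, b) - max(x - r, a))
    return overlap / (2.0 * r)

for r in (0.1, 0.01, 0.001):
    print("r =", r,
          "| interior x=0.5:", density_ratio(0.5, r),
          "| boundary x=0:", density_ratio(0.0, r),
          "| exterior x=2:", density_ratio(2.0, r))
```

Interior points give ratios tending to 1 and exterior points give 0, while the boundary point $x = 0$ gives $1/2$ for every small $r$; it is neither a point of density nor a point of dispersion, which is consistent with the theorem because the boundary is $\lambda^1$-null.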

REMARK 1.4.3 According to Theorem 1.4.1, $\lambda^N$-almost every point of $A$ is a point of density of $A$ and $\lambda^N$-almost every point of $\mathbb{R}^N \setminus A$ is a point of dispersion of $A$. We can think of the points of density of a set $A$ as forming a kind of measure theoretic interior of $A$, while the points of dispersion of $A$ form a kind of measure theoretic exterior of $A$.

The purpose of this section is to establish analogs of Theorem 1.4.1 for lower dimensional Hausdorff measures. In what follows we work in $\mathbb{R}^N$ and $1 < s < N$.

THEOREM 1.4.4 If $A \subseteq \mathbb{R}^N$ is $\mu^{(s)}$-measurable and $\mu^{(s)}(A) < +\infty$, then

$$\lim_{r \searrow 0} \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} = 0 \qquad \text{for } \mu^{(s)}\text{-a.a. } x \in \mathbb{R}^N \setminus A.$$

PROOF For every $t > 0$, let

$$C_t \stackrel{df}{=} \Big\{ x \in \mathbb{R}^N \setminus A : \limsup_{r \searrow 0} \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} > t \Big\}.$$

To finish the proof it is enough to show that

$$\mu^{(s)}(C_t) = 0 \qquad \forall\, t > 0.$$

Fix $\varepsilon > 0$. We know that $\mu^{(s)} \lfloor A$ is a Radon measure (see Proposition 1.1.9). So we can find $K \subseteq A$ compact, such that $\mu^{(s)}(A \setminus K) \le \varepsilon$ (see Proposition 1.1.10(b)). Let

$$U \stackrel{df}{=} \mathbb{R}^N \setminus K.$$

Then $U$ is open and $C_t \subseteq U$. For fixed $\delta > 0$, we consider the family of closed balls

$$\mathcal{T} \stackrel{df}{=} \Big\{ \overline{B}_r(x) : \overline{B}_r(x) \subseteq U,\ 0 < r < \delta,\ \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} > t \Big\}.$$

Without any loss of generality we may assume that $\mathcal{T} \neq \emptyset$, or otherwise $C_t = \emptyset$ and so $\mu^{(s)}(C_t) = 0$. Invoking Proposition 1.2.1, we can find a sequence $\big\{\overline{B}_{r_n}(x_n)\big\}_{n \ge 1}$ of disjoint elements in $\mathcal{T}$, such that

$$C_t \subseteq \bigcup_{n=1}^{\infty} \overline{B}_{5r_n}(x_n).$$

Then we have

$$\mu^{(s)}_{10\delta}(C_t) \le \sum_{n=1}^{\infty} (10 r_n)^s \le \frac{5^s}{t} \sum_{n=1}^{\infty} \mu^{(s)}\big(\overline{B}_{r_n}(x_n) \cap A\big) \le \frac{5^s}{t} \mu^{(s)}(U \cap A) = \frac{5^s}{t} \mu^{(s)}(A \setminus K) \le \frac{5^s \varepsilon}{t}.$$

Let $\delta \searrow 0$, to obtain

$$\mu^{(s)}(C_t) \le \frac{5^s \varepsilon}{t}.$$

Since $\varepsilon > 0$ was arbitrary, we conclude that $\mu^{(s)}(C_t) = 0$.


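To have a complete analog of Theorem 1.4.1, we need to see whether something can be said about the density of $A$ at its own points. To do this we will make use of Proposition 1.2.4.

THEOREM 1.4.5 If $A \subseteq \mathbb{R}^N$ is $\mu^{(s)}$-measurable and $\mu^{(s)}(A) < +\infty$, then

$$\frac{1}{2^s} \le \limsup_{r \searrow 0} \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} \le 1 \qquad \text{for } \mu^{(s)}\text{-a.a. } x \in A.$$

PROOF First we show that

$$\limsup_{r \searrow 0} \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} \le 1 \qquad \text{for } \mu^{(s)}\text{-a.a. } x \in A. \tag{1.29}$$

To this end, for every $t > 1$, we introduce the set $C_t \subseteq A$ defined by

$$C_t \stackrel{df}{=} \Big\{ x \in A : \limsup_{r \searrow 0} \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} > t \Big\}.$$

Fix $\varepsilon > 0$. Again $\mu^{(s)} \lfloor A$ is a Radon measure (see Proposition 1.1.9). We can find an open set $U \subseteq \mathbb{R}^N$, such that

$$C_t \subseteq U \qquad \text{and} \qquad \mu^{(s)}(U \cap A) - \varepsilon \le \mu^{(s)}(C_t) \tag{1.30}$$

(see Proposition 1.1.10(b)). We introduce the family $\mathcal{T}$ of closed balls defined by

$$\mathcal{T} \stackrel{df}{=} \Big\{ \overline{B}_r(x) : \overline{B}_r(x) \subseteq U,\ 0 < r < \delta,\ \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} > t \Big\}.$$

By virtue of Proposition 1.2.4, we can find a sequence $\big\{\overline{B}_{r_n}(x_n)\big\}_{n \ge 1}$ of disjoint balls in $\mathcal{T}$, such that

$$C_t \subseteq \bigcup_{n=1}^{m} \overline{B}_{r_n}(x_n) \cup \bigcup_{n=m+1}^{\infty} \overline{B}_{5r_n}(x_n) \qquad \forall\, m \ge 1.$$

Then for $\delta > 0$, we have

$$\begin{aligned}
\mu^{(s)}_{10\delta}(C_t) &\le \sum_{n=1}^{m} (2r_n)^s + \sum_{n=m+1}^{\infty} (10 r_n)^s \\
&\le \frac{1}{t} \sum_{n=1}^{m} \mu^{(s)}\big(\overline{B}_{r_n}(x_n) \cap A\big) + \frac{5^s}{t} \sum_{n=m+1}^{\infty} \mu^{(s)}\big(\overline{B}_{r_n}(x_n) \cap A\big) \\
&\le \frac{1}{t} \mu^{(s)}(U \cap A) + \frac{5^s}{t} \mu^{(s)}\Big(\bigcup_{n=m+1}^{\infty} \overline{B}_{r_n}(x_n) \cap A\Big) \qquad \forall\, m \ge 1.
\end{aligned}$$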

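Using (1.30) and letting $m \to +\infty$, we obtain

$$\mu^{(s)}_{10\delta}(C_t) \le \frac{1}{t} \mu^{(s)}(U \cap A) \le \frac{1}{t} \big(\mu^{(s)}(C_t) + \varepsilon\big).$$

Letting $\delta \searrow 0$, we see that

$$\mu^{(s)}(C_t) \le \frac{1}{t} \big(\mu^{(s)}(C_t) + \varepsilon\big).$$

Since $\varepsilon > 0$ was arbitrary, we finally have that

$$\mu^{(s)}(C_t) \le \frac{1}{t} \mu^{(s)}(C_t), \qquad \text{i.e.,} \qquad \mu^{(s)}(C_t) = 0$$

(recall that $t > 1$). This proves (1.29). Next we show that

$$\frac{1}{2^s} \le \limsup_{r \searrow 0} \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} \qquad \text{for } \mu^{(s)}\text{-a.a. } x \in A.$$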

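For given $\xi, \delta \in (0, 1)$, we introduce the set $A(\delta, \xi) \subseteq A$, defined by

$$A(\delta, \xi) \stackrel{df}{=} \Big\{ x \in A : \mu_\delta^{(s)}(C \cap A) \le \xi\, \delta(C)^s \text{ for all } C \subseteq \mathbb{R}^N \text{ with } \delta(C) \le \delta \text{ and } x \in C \Big\}.$$

Let $\{C_n\}_{n \ge 1}$ be a $\delta$-cover of $A(\delta, \xi)$, such that

$$A(\delta, \xi) \subseteq \bigcup_{n=1}^{\infty} C_n, \qquad \delta(C_n) \le \delta \qquad \text{and} \qquad C_n \cap A(\delta, \xi) \neq \emptyset \qquad \forall\, n \ge 1.$$

So

$$\mu_\delta^{(s)}\big(A(\delta, \xi)\big) \le \sum_{n=1}^{\infty} \mu_\delta^{(s)}\big(C_n \cap A(\delta, \xi)\big) \le \sum_{n=1}^{\infty} \mu_\delta^{(s)}(C_n \cap A) \le \sum_{n=1}^{\infty} \xi\, \delta(C_n)^s$$

and from Definition 1.3.5, we see that

$$\mu_\delta^{(s)}\big(A(\delta, \xi)\big) \le \xi\, \mu_\delta^{(s)}\big(A(\delta, \xi)\big). \tag{1.31}$$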

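Since $0 < \xi < 1$ and $\mu_\delta^{(s)}\big(A(\delta, \xi)\big) < +\infty$, we have

$$\mu_\delta^{(s)}\big(A(\delta, \xi)\big) = 0.$$

In particular, from Proposition 1.3.24, we see that

$$\mu^{(s)}\big(A(\delta, 1 - \delta)\big) = 0. \tag{1.32}$$

Set

$$D_\infty \stackrel{df}{=} \Big\{ x \in A : \limsup_{r \searrow 0} \frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} < \frac{1}{2^s} \Big\}.$$

If $x \in D_\infty$, then we can find $\delta > 0$, such that

$$\frac{\mu^{(s)}\big(\overline{B}_r(x) \cap A\big)}{(2r)^s} \le \frac{1 - \delta}{2^s} \qquad \forall\, r \in (0, \delta]. \tag{1.33}$$

For any $C \subseteq \mathbb{R}^N$ with $x \in C \cap A$ and $\delta(C) \le \delta$, from (1.33), we have

$$\mu_\delta^{(s)}(C \cap A) \le \mu^{(s)}(C \cap A) \le \mu^{(s)}\big(\overline{B}_{\delta(C)}(x) \cap A\big) \le (1 - \delta)\, \delta(C)^s.$$

So it follows that $x \in A(\delta, 1 - \delta)$. Therefore, we have

$$D_\infty \subseteq \bigcup_{n=1}^{\infty} A\Big(\frac{1}{n}, 1 - \frac{1}{n}\Big),$$

and, using also (1.32), we have $\mu^{(s)}(D_\infty) = 0$. Thus we infer that the lower bound $2^{-s} \le \limsup_{r \searrow 0} \mu^{(s)}\big(\overline{B}_r(x) \cap A\big) / (2r)^s$ holds for $\mu^{(s)}$-a.a. $x \in A$.

For a given locally integrable function, we can estimate the Hausdorff measure of the set where the function is locally large. To do this we shall need the so-called Lebesgue differentiation theorem or Lebesgue-Besicovitch differentiation theorem.

THEOREM 1.4.6 (Lebesgue-Besicovitch Differentiation Theorem) If $f \in L^1_{loc}\big(\mathbb{R}^N; \mathbb{R}^M\big)$, then

$$\lim_{r \searrow 0} \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} \big\|f(y) - f(x)\big\|_{\mathbb{R}^M}\, d\lambda^N(y) = 0 \qquad \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N.$$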

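PROOF Let $D = \{u_n\}_{n \ge 1}$ be a dense subset of $\mathbb{R}^M$. Then by the classical differentiation theorem of Lebesgue (see for example Cohn (1980, p. 190)), for every $n \ge 1$ we have

$$\lim_{r \searrow 0} \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} \big\|f(y) - u_n\big\|_{\mathbb{R}^M}\, d\lambda^N(y) = \big\|f(x) - u_n\big\|_{\mathbb{R}^M} \tag{1.34}$$

for $\lambda^N$-a.a. $x \in \mathbb{R}^N$. Suppose that $x \in \mathbb{R}^N$ is a point for which (1.34) is valid for all $n \ge 1$. For a given $\varepsilon > 0$, we can choose $u_n$, such that

$$\big\|f(x) - u_n\big\|_{\mathbb{R}^M} < \varepsilon.$$

Then we have

$$\begin{aligned}
&\limsup_{r \searrow 0} \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} \big\|f(y) - f(x)\big\|_{\mathbb{R}^M}\, d\lambda^N(y) \\
&\qquad \le \lim_{r \searrow 0} \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} \big\|f(y) - u_n\big\|_{\mathbb{R}^M}\, d\lambda^N(y) + \big\|u_n - f(x)\big\|_{\mathbb{R}^M} < 2\varepsilon.
\end{aligned}$$

Since $\varepsilon > 0$ was arbitrary, we conclude that

$$\lim_{r \searrow 0} \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} \big\|f(y) - f(x)\big\|_{\mathbb{R}^M}\, d\lambda^N(y) = 0.$$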

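COROLLARY 1.4.7 If $f \in L^{\infty}_{loc}\big(\mathbb{R}^N; \mathbb{R}^M\big)$, then

$$\lim_{r \searrow 0} \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} f(y)\, d\lambda^N(y) = f(x) \qquad \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N.$$

PROOF Note that

$$\lim_{r \searrow 0} \bigg\| \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} f(y)\, d\lambda^N(y) - f(x) \bigg\|_{\mathbb{R}^M} \le \lim_{r \searrow 0} \frac{1}{\lambda^N\big(\overline{B}_r(x)\big)} \int_{\overline{B}_r(x)} \big\|f(y) - f(x)\big\|_{\mathbb{R}^M}\, d\lambda^N(y).$$

So the corollary follows at once from Theorem 1.4.6.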
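The averaging statement of Corollary 1.4.7 can be watched numerically in one dimension. The sketch below is a rough illustration under assumptions not made in the text: $N = M = 1$, the hypothetical test function $f(y) = y^2$, and a midpoint-rule quadrature in place of the Lebesgue integral. For this $f$ the exact average over $[x - r, x + r]$ is $x^2 + r^2/3$.

```python
def ball_average(f, x, r, n=10_000):
    """Midpoint-rule approximation of the mean of f over the interval [x - r, x + r]."""
    h = 2.0 * r / n
    total = sum(f(x - r + (i + 0.5) * h) for i in range(n))
    return total * h / (2.0 * r)

def square(y):
    return y * y

for r in (0.5, 0.1, 0.01, 0.001):
    print("r =", r, "average =", ball_average(square, 1.0, r))
```

As $r$ decreases the averages approach $f(1) = 1$, with the gap shrinking like $r^2/3$ for this particular function.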


REMARK 1.4.8 Theorem 1.4.6 and Corollary 1.4.7 remain valid if $\lambda^N$ is replaced by any Radon measure on $\mathbb{R}^N$. Also we may replace the ball $\overline{B}_r(x)$ by any other measurable sets $S_r(x)$ containing $x$ which shrink to the point $x \in \mathbb{R}^N$ as $r \searrow 0$. For example we can take $S_r(x)$ to be an $N$-cube with edges equal to $2r$. If $N = 1$, we may take for example the intervals

$$[x - h, x], \qquad [x, x + h] \qquad \text{or} \qquad [x - h, x + h].$$

In Proposition 2.1.22 we shall see that the results are also valid for Banach space valued functions, i.e., $\mathbb{R}^M$ is replaced by a Banach space.

Now we are ready to estimate the Hausdorff measure of the set where a function $f \in L^1_{loc}\big(\mathbb{R}^N; \mathbb{R}\big)$ is locally large.

THEOREM 1.4.9 If $f \in L^1_{loc}\big(\mathbb{R}^N; \mathbb{R}\big)$, $0 \le s < N$ and

$$C_s \stackrel{df}{=} \Big\{ x \in \mathbb{R}^N : \limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(x)} \big|f(y)\big|\, d\lambda^N(y) > 0 \Big\},$$

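then $\mu^{(s)}(C_s) = 0$.

PROOF It is clear that without any loss of generality, we may assume that $f \in L^1(\mathbb{R}^N; \mathbb{R})$. By virtue of Corollary 1.4.7, we have that

$$\lim_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(x)} \big|f(y)\big|\, d\lambda^N(y) = 0 \qquad \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N$$

(recall that $0 \le s < N$). So $\lambda^N(C_s) = 0$. Let $\varepsilon > 0$, $\delta > 0$ and $\xi > 0$ be given. Since $f \in L^1(\mathbb{R}^N; \mathbb{R})$, from the absolute continuity of the Lebesgue integral, we know that we can find $\vartheta > 0$, such that

$$\int_A \big|f(y)\big|\, d\lambda^N(y) < \xi \qquad \forall\, A \subseteq \mathbb{R}^N,\ \lambda^N(A) < \vartheta.$$

We introduce the set $C_s^\varepsilon \subseteq C_s$ defined by

$$C_s^\varepsilon \stackrel{df}{=} \Big\{ x \in C_s : \limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(x)} \big|f(y)\big|\, d\lambda^N(y) > \varepsilon \Big\}.$$

We have that $\lambda^N(C_s^\varepsilon) = 0$. So we can find an open set $U \subseteq \mathbb{R}^N$, such that $C_s^\varepsilon \subseteq U$ and $\lambda^N(U) < \vartheta$. Let us set

$$\mathcal{T} \stackrel{df}{=} \Big\{ \overline{B}_r(x) : x \in C_s^\varepsilon,\ 0 < r < \delta,\ \overline{B}_r(x) \subseteq U \text{ and } \int_{\overline{B}_r(x)} \big|f(y)\big|\, d\lambda^N(y) > \varepsilon r^s \Big\}.$$

Invoking Proposition 1.2.1, we can find a sequence $\big\{\overline{B}_{r_n}(x_n)\big\}_{n \ge 1} \subseteq \mathcal{T}$ of disjoint balls, such that

$$C_s^\varepsilon \subseteq \bigcup_{n=1}^{\infty} \overline{B}_{5r_n}(x_n).$$

From this it follows that

$$\mu^{(s)}_{10\delta}(C_s^\varepsilon) \le \sum_{n=1}^{\infty} (10 r_n)^s \le \frac{10^s}{\varepsilon} \sum_{n=1}^{\infty} \int_{\overline{B}_{r_n}(x_n)} \big|f(y)\big|\, d\lambda^N(y) \le \frac{10^s}{\varepsilon} \int_U \big|f(y)\big|\, d\lambda^N(y) \le \frac{10^s}{\varepsilon}\, \xi.$$

Let $\delta \searrow 0$ and then $\xi \searrow 0$, to conclude that $\mu^{(s)}(C_s^\varepsilon) = 0$. Since

$$C_s = \bigcup_{n=1}^{\infty} C_s^{1/n},$$

we conclude that $\mu^{(s)}(C_s) = 0$.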

1.5 Lipschitz Functions

In this section we derive some basic properties relating to the behaviour of Lipschitz continuous functions. A first such result was already established in Proposition 1.3.25.

DEFINITION 1.5.1 Let $C \subseteq \mathbb{R}^N$.
(a) A function $f : C \to \mathbb{R}^M$ is said to be Lipschitz continuous, if there exists a constant $c > 0$, such that

$$\big\|f(x) - f(y)\big\|_{\mathbb{R}^M} \le c\, \|x - y\|_{\mathbb{R}^N} \qquad \forall\, x, y \in C.$$

(b) If $f : C \to \mathbb{R}^M$ is Lipschitz continuous, then the Lipschitz constant $\mathrm{Lip}(f) \ge 0$ of $f$ is defined by

$$\mathrm{Lip}(f) \stackrel{df}{=} \sup_{\substack{x, y \in C \\ x \neq y}} \frac{\|f(x) - f(y)\|_{\mathbb{R}^M}}{\|x - y\|_{\mathbb{R}^N}}.$$

(c) If $U \subseteq \mathbb{R}^N$ is open, a function $f : U \to \mathbb{R}^M$ is said to be locally Lipschitz, if for every $x \in U$, we can find a neighbourhood $V \subseteq U$ of $x$, such that $f|_V$ is Lipschitz continuous.

THEOREM 1.5.2 If $f : \mathbb{R}^N \to \mathbb{R}^M$, $f = (f_i)_{i=1}^M$ and $A \subseteq \mathbb{R}^N$ with $\lambda^N(A) > 0$, then
(a) $\dim\big(\mathrm{Gr}(f|_A)\big) \ge N$, where $\mathrm{Gr}(f|_A)$ is the graph of $f$ over $A$, defined by

$$\mathrm{Gr}(f|_A) \stackrel{df}{=} \big\{ (x, y) \in A \times \mathbb{R}^M : y = f(x) \big\};$$

(b) if $f$ is Lipschitz continuous, then $\dim\big(\mathrm{Gr}(f|_A)\big) = N$.

PROOF (a) Let $P : \mathbb{R}^{N+M} \to \mathbb{R}^N$ be the projection operator. The operator $P$ is Lipschitz continuous with $\mathrm{Lip}(P) = 1$. By virtue of Theorem 1.3.21 and Proposition 1.3.25, we have that

$$\frac{1}{c_N} \lambda^N(A) = \mu^{(N)}(A) = \mu^{(N)}\big(P(\mathrm{Gr}(f|_A))\big) \le \mu^{(N)}\big(\mathrm{Gr}(f|_A)\big)$$

and so $\dim\big(\mathrm{Gr}(f|_A)\big) \ge N$ (see Definition 1.3.8).

0

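1, then we define $\hat{f} = (\hat{f}_i)_{i=1}^M$ ($f_i$ are the component functions of $f$). We have

$$\big\|\hat{f}(x) - \hat{f}(z)\big\|_{\mathbb{R}^M}^2 = \sum_{i=1}^{M} \big|\hat{f}_i(x) - \hat{f}_i(z)\big|^2 \le M \big(\mathrm{Lip}(f)\big)^2 \|x - z\|_{\mathbb{R}^N}^2,$$

so

$$\mathrm{Lip}\big(\hat{f}\big) \le \sqrt{M}\, \mathrm{Lip}(f).$$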

REMARK 1.5.5 Let $f : X \to \mathbb{R}$ and $\overline{f} : X \to \mathbb{R}$ be defined by

$$\overline{f}(x) \stackrel{df}{=} \begin{cases} f(x) & \text{if } x \in A, \\ 0 & \text{if } x \in \mathbb{R}^N \setminus A; \end{cases}$$

then, as we shall see in Chapter 4, $\hat{f} = \overline{f} \oplus \mathrm{Lip}(f)\, \|\cdot\|_X$, where $\oplus$ denotes the operation of infimal convolution (see Definition 4.4.6(b)). Since this operation preserves convexity, if $A \subseteq X$ is convex and $f : A \to \mathbb{R}$ is Lipschitz continuous and convex, then so is $\hat{f} : X \to \mathbb{R}$. Also note that the extension $\hat{f}$ obtained in Theorem 1.5.4 is maximal in the sense that if $g : X \to \mathbb{R}$ is any Lipschitz continuous function with $\mathrm{Lip}(g) \le \mathrm{Lip}(f)$, such that $g|_A = f$, then $g \le \hat{f}$. Indeed note that

$$g(x) - f(y) \le \mathrm{Lip}(f)\, \|x - y\|_X \qquad \forall\, x \in X,\ y \in A,$$

hence $g(x) \le \hat{f}(x)$. A minimal such extension can be obtained by considering the function

$$\tilde{f}(x) \stackrel{df}{=} \sup_{a \in A} \big[ f(a) - \mathrm{Lip}(f)\, \|x - a\|_X \big].$$

This extension is known as the McShane extension of $f$ and was obtained by McShane (1934), who was the first to study the problem of extension of Lipschitz continuous functions. Finally we mention that Kirszbraun (1934) produced an extension $\hat{f}$ of a Lipschitz continuous function $f : A \to \mathbb{R}^M$, such that $\mathrm{Lip}(\hat{f}) = \mathrm{Lip}(f)$ (see also Federer (1969, p. 201)).
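The maximal and minimal extensions discussed in Remark 1.5.5 can be computed directly when $A$ is a finite subset of the real line. The sketch below is an illustration under stated assumptions: the sup formula is the one displayed in the remark, while the inf formula $\hat{f}(x) = \inf_{a \in A} [f(a) + \mathrm{Lip}(f)\,|x - a|]$ is assumed here as the explicit form of the infimal convolution; the sample data and function names are hypothetical.

```python
def upper_extension(points, values, lip):
    """Assumed explicit form of the maximal extension: inf over a of f(a) + lip * |x - a|."""
    def f_hat(x):
        return min(v + lip * abs(x - a) for a, v in zip(points, values))
    return f_hat

def lower_extension(points, values, lip):
    """Minimal extension from the remark: sup over a of f(a) - lip * |x - a|."""
    def f_tilde(x):
        return max(v - lip * abs(x - a) for a, v in zip(points, values))
    return f_tilde

# Hypothetical data: f defined on A = {0, 1, 3}.
A = [0.0, 1.0, 3.0]
vals = [0.0, 1.0, 2.0]
lip = max(abs(vals[i] - vals[j]) / abs(A[i] - A[j])
          for i in range(len(A)) for j in range(i + 1, len(A)))

f_hat = upper_extension(A, vals, lip)
f_tilde = lower_extension(A, vals, lip)
for x in (-1.0, 0.5, 2.0, 4.0):
    print(x, f_tilde(x), f_hat(x))
```

Both functions agree with the data on $A$, and at every other point the value of any extension with the same Lipschitz constant is sandwiched between `f_tilde(x)` and `f_hat(x)`.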

One of the main theorems concerning Lipschitz continuous functions is the so-called Rademacher's theorem, which asserts that a Lipschitz continuous function $f : \mathbb{R}^N \to \mathbb{R}^M$ is differentiable almost everywhere. This is the starting point for extending the subdifferential theory beyond the family of convex functions (see Chapter 4). First let us recall the following basic definition from multivariable calculus.

DEFINITION 1.5.6 Let $U \subseteq \mathbb{R}^N$ be an open set. We say that a function $f : U \to \mathbb{R}^M$ is differentiable (or Fréchet differentiable) at $x \in U$, if there exists $L(x) \in \mathcal{L}(\mathbb{R}^N; \mathbb{R}^M)$, such that

$$\lim_{h \to 0} \frac{f(x + h) - f(x) - L(x)h}{\|h\|_{\mathbb{R}^N}} = 0.$$

REMARK 1.5.7 Evidently $L(x)$ is unique; it is usually denoted by $Df(x)$ or $f'(x)$ and is called the derivative of $f$ at $x$. From multivariable calculus, we know that if $M = 1$, then

$$Df(x)u = \sum_{k=1}^{N} \frac{\partial f}{\partial x_k}(x)\, u_k = \sum_{k=1}^{N} f'(x; e_k)\, u_k \qquad \forall\, u \in \mathbb{R}^N,$$

where $f'(x; v)$ is the directional or Gâteaux derivative of $f$ at $x$ in the direction $v$, defined by

$$f'(x; v) \stackrel{df}{=} \lim_{\lambda \to 0} \frac{f(x + \lambda v) - f(x)}{\lambda},$$

and $\{e_k\}_{k=1}^N$ is the canonical basis of $\mathbb{R}^N$.

THEOREM 1.5.8 (Rademacher Theorem) If $U \subseteq \mathbb{R}^N$ is an open set and $f : U \to \mathbb{R}^M$ is a Lipschitz continuous function, then $f$ is differentiable at $\lambda^N$-almost all $x \in U$.

PROOF Clearly we may assume that $M = 1$. For any $u \in \mathbb{R}^N$ with $\|u\|_{\mathbb{R}^N} = 1$, we set

$$f'(x; u) \stackrel{df}{=} \lim_{\lambda \to 0} \frac{f(x + \lambda u) - f(x)}{\lambda} \qquad \forall\, x \in U,$$

provided this limit exists.

Claim 1. $f'(x; u)$ exists for $\lambda^N$-almost all $x \in U$.

Let

$$f'_+(x; u) \stackrel{df}{=} \limsup_{\lambda \to 0} \frac{f(x + \lambda u) - f(x)}{\lambda} \qquad \forall\, x \in U,$$
$$f'_-(x; u) \stackrel{df}{=} \liminf_{\lambda \to 0} \frac{f(x + \lambda u) - f(x)}{\lambda} \qquad \forall\, x \in U.$$

Evidently if

$$C_u \stackrel{df}{=} \big\{ x \in U : f'(x; u) \text{ does not exist} \big\},$$

then

$$C_u = \big\{ x \in U : f'_-(x; u) < f'_+(x; u) \big\}.$$

Note that

$$f'_+(x; u) = \inf_{k \ge 1}\ \sup_{\substack{0 < |\lambda| < \frac{1}{k} \\ \lambda \in \mathbb{Q}}} \frac{f(x + \lambda u) - f(x)}{\lambda},$$

so $f'_+(\cdot; u)$ is a Borel measurable function. Similarly we show that $f'_-(\cdot; u)$ is a Borel measurable function. It follows that $C_u \in \mathcal{B}(U)$ (i.e., $C_u \subseteq U$ is a Borel measurable set). Next for every $x, u \in \mathbb{R}^N$ with $\|u\|_{\mathbb{R}^N} = 1$, let $\varphi : \mathbb{R} \to \mathbb{R}$ be defined by

$$\varphi(\lambda) \stackrel{df}{=} f(x + \lambda u) \qquad \forall\, \lambda \in \mathbb{R}.$$

The function $\varphi$ is Lipschitz continuous, hence absolutely continuous, and so by the fundamental theorem of Lebesgue calculus (see Theorem A.2.20), it is differentiable at almost every $\lambda \in \mathbb{R}$. Therefore

$$\mu^{(1)}(C_u \cap L) = 0$$

for every line $L$ parallel to the direction $u$. Hence by Fubini's theorem, we have $\lambda^N(C_u) = 0$ and this proves the claim.

From Claim 1 and Remark 1.5.7, we see that

$$\nabla f(x) = \Big( \frac{\partial f}{\partial x_k}(x) \Big)_{k=1}^{N} \text{ exists for } \lambda^N\text{-a.a. } x \in U.$$

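Claim 2. $f'(x; u) = \big(u, \nabla f(x)\big)_{\mathbb{R}^N}$ for $\lambda^N$-almost all $x \in U$.

Let $\vartheta \in C_c^\infty(U)$. We have

$$\int_U \frac{f(x + \lambda u) - f(x)}{\lambda}\, \vartheta(x)\, d\lambda^N(x) = -\int_U f(x)\, \frac{\vartheta(x) - \vartheta(x - \lambda u)}{\lambda}\, d\lambda^N(x).$$

Let $\lambda = \frac{1}{k}$ for $k \ge 1$. Since $f$ is Lipschitz continuous, we have

$$\bigg| \frac{f\big(x + \frac{1}{k} u\big) - f(x)}{\frac{1}{k}} \bigg| \le \mathrm{Lip}(f)\, \|u\|_{\mathbb{R}^N} = \mathrm{Lip}(f).$$

Therefore when $k \to +\infty$, from the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that

$$\begin{aligned}
\int_U f'(x; u)\, \vartheta(x)\, d\lambda^N(x) &= -\int_U f(x)\, \vartheta'(x; u)\, d\lambda^N(x) \\
&= -\sum_{k=1}^{N} u_k \int_U f(x)\, \frac{\partial \vartheta}{\partial x_k}(x)\, d\lambda^N(x) = \sum_{k=1}^{N} u_k \int_U \frac{\partial f}{\partial x_k}(x)\, \vartheta(x)\, d\lambda^N(x) \\
&= \int_U \big(u, \nabla f(x)\big)_{\mathbb{R}^N}\, \vartheta(x)\, d\lambda^N(x).
\end{aligned}$$

Because $\vartheta \in C_c^\infty(U)$ is arbitrary, it follows that

$$f'(x; u) = \big(u, \nabla f(x)\big)_{\mathbb{R}^N} \qquad \text{for } \lambda^N\text{-a.a. } x \in U.$$

This proves the second claim.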

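Let $\{u_n\}_{n \ge 1}$ be a dense subset of $\partial B_1(0)$. For $n \ge 1$, we define

$$E_n \stackrel{df}{=} \Big\{ x \in U : f'(x; u_n) \text{ and } \nabla f(x) \text{ exist and } f'(x; u_n) = \big(u_n, \nabla f(x)\big)_{\mathbb{R}^N} \Big\}$$

and

$$E \stackrel{df}{=} \bigcap_{n=1}^{\infty} E_n.$$

By virtue of Claim 2, we have that $\lambda^N(U \setminus E) = 0$.

Claim 3. $f$ is differentiable at every $x \in E$.

Let $x \in E$, $u \in \partial B_1(0)$, $\lambda \in \mathbb{R} \setminus \{0\}$ and set

$$\eta(x, u, \lambda) \stackrel{df}{=} \frac{f(x + \lambda u) - f(x)}{\lambda} - \big(u, \nabla f(x)\big)_{\mathbb{R}^N}.$$

If $v \in \partial B_1(0)$, we have

$$\begin{aligned}
\big|\eta(x, u, \lambda) - \eta(x, v, \lambda)\big| &\le \bigg| \frac{f(x + \lambda u) - f(x + \lambda v)}{\lambda} \bigg| + \Big| \big(u - v, \nabla f(x)\big)_{\mathbb{R}^N} \Big| \\
&\le \mathrm{Lip}(f)\, \|u - v\|_{\mathbb{R}^N} + \big\|\nabla f(x)\big\|_{\mathbb{R}^N} \|u - v\|_{\mathbb{R}^N}.
\end{aligned}$$

Note that

$$\mathrm{Lip}\Big(\frac{\partial f}{\partial x_k}\Big) \le 2\, \mathrm{Lip}(f) \qquad \forall\, k \in \{1, \ldots, N\}$$

and so

$$\mathrm{Lip}(\nabla f) \le 2\sqrt{N}\, \mathrm{Lip}(f)$$

(see the proof of Theorem 1.5.4). Therefore, we have

$$\big|\eta(x, u, \lambda) - \eta(x, v, \lambda)\big| \le \big(1 + 2\sqrt{N}\big)\, \mathrm{Lip}(f)\, \|u - v\|_{\mathbb{R}^N}. \tag{1.38}$$

Let $\varepsilon > 0$ be given. We can choose $l \ge 1$ large enough so that

$$\forall\, v \in \partial B_1(0)\ \ \exists\, k \in \{1, \ldots, l\} : \quad \|v - u_k\|_{\mathbb{R}^N} \le \frac{\varepsilon}{2\big(1 + 2\sqrt{N}\big)\, \mathrm{Lip}(f)}. \tag{1.39}$$

As $x \in E$, we have

$$\lim_{\lambda \to 0} \eta(x, u_k, \lambda) = 0.$$

So we can find $\delta > 0$, such that

$$\big|\eta(x, u_k, \lambda)\big| < \frac{\varepsilon}{2} \qquad \forall\, 0 < |\lambda| < \delta,\ k \in \{1, \ldots, l\}. \tag{1.40}$$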

Thus from (1.38), (1.39) and (1.40), for every $v \in \partial B_1(0)$, we can find $k \in \{1, \ldots, l\}$, such that for all $0 < |\lambda| < \delta$, we have

$$\big|\eta(x, v, \lambda)\big| \le \big|\eta(x, u_k, \lambda)\big| + \big|\eta(x, v, \lambda) - \eta(x, u_k, \lambda)\big| < \varepsilon. \tag{1.41}$$

We emphasize that $\delta > 0$ is independent of $v \in \partial B_1(0)$. Let $y \in U$, $y \neq x$ and let us set

$$v \stackrel{df}{=} \frac{y - x}{\|y - x\|_{\mathbb{R}^N}} \in \partial B_1(0).$$

We have $y = x + \lambda v$, with $\lambda = \|y - x\|_{\mathbb{R}^N}$. Then

$$f(y) - f(x) - \big(\nabla f(x), y - x\big)_{\mathbb{R}^N} = f(x + \lambda v) - f(x) - \lambda \big(\nabla f(x), v\big)_{\mathbb{R}^N} = \lambda\, \eta(x, v, \lambda) = o(\lambda) = o\big(\|y - x\|_{\mathbb{R}^N}\big) \quad \text{as } y \to x.$$

Therefore $f$ is differentiable at $x \in E$ with $Df(x) = \nabla f(x)$. This proves the claim and the theorem.

COROLLARY 1.5.9 If $U \subseteq \mathbb{R}^N$ is an open set and $f : U \to \mathbb{R}^M$ is a locally Lipschitz function, then $f$ is differentiable at $\lambda^N$-almost all $x \in U$.

PROOF Again we may assume that $M = 1$. Note that since $f$ is locally Lipschitz, it is Lipschitz continuous when restricted to any compact set $K \subseteq U$. Indeed, if this is not true, then we can find a compact set $K \subseteq U$ and two sequences $\{x_n\}_{n \ge 1}, \{y_n\}_{n \ge 1} \subseteq K$, such that

$$n\, \|x_n - y_n\|_{\mathbb{R}^N} < \big|f(x_n) - f(y_n)\big| \qquad \forall\, n \ge 1.$$

Note that

$$\|x_n - y_n\|_{\mathbb{R}^N} \le \frac{2}{n} \max_K |f| \qquad \forall\, n \ge 1.$$

Since $K$ is compact, we can produce two subsequences $\{x_{n_k}\}_{k \ge 1}$ of $\{x_n\}_{n \ge 1}$ and $\{y_{n_k}\}_{k \ge 1}$ of $\{y_n\}_{n \ge 1}$, such that

$$x_{n_k} \to v \qquad \text{and} \qquad y_{n_k} \to v,$$

for some $v \in K$, which contradicts the fact that in a neighbourhood of $v$ the function $f$ is Lipschitz continuous.

Next let $\{U_n\}_{n \ge 1}$ be an increasing sequence of bounded open subsets of $U$, such that $U = \bigcup_{n=1}^{\infty} U_n$; for example let

$$U_n \stackrel{df}{=} \Big\{ x \in U : \|x\|_{\mathbb{R}^N} < n,\ \frac{1}{n} < d(x, \partial U) \Big\} \qquad \forall\, n \ge 1.$$

Then by virtue of Theorem 1.5.8, $f$ is differentiable at $\lambda^N$-almost all $x \in U_n$, for $n \ge 1$. Therefore $f$ is differentiable $\lambda^N$-almost everywhere on $U$.
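The simplest example behind Rademacher's theorem is $f(x) = |x|$ on $\mathbb{R}$, which is Lipschitz continuous with constant 1 and fails to be differentiable only on the null set $\{0\}$. The sketch below, with a hypothetical helper `difference_quotient`, compares one-sided difference quotients at $0$ with a quotient at a nearby point; it is an illustration, not a numerical test of differentiability in general.

```python
def difference_quotient(f, x, h):
    """The quotient (f(x + h) - f(x)) / h from the definition of the derivative."""
    return (f(x + h) - f(x)) / h

# At x = 0 the right and left quotients of |x| stay at +1 and -1.
right = [difference_quotient(abs, 0.0, 10.0 ** -k) for k in range(1, 6)]
left = [difference_quotient(abs, 0.0, -(10.0 ** -k)) for k in range(1, 6)]
print("right quotients at 0:", right)
print("left quotients at 0:", left)

# Away from 0 the quotient settles near the derivative sign(x).
print("quotient at 0.3:", difference_quotient(abs, 0.3, 1e-6))
```

In the notation of the proof of Theorem 1.5.8, at $x = 0$ one has $f'_-(0; 1) = -1 < 1 = f'_+(0; 1)$, so $0$ belongs to the exceptional set $C_u$ for $u = 1$, while every other point is a point of differentiability.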


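COROLLARY 1.5.10 If $U \subseteq \mathbb{R}^N$ is an open set, $f : U \to \mathbb{R}^M$ is a locally Lipschitz function and

$$Z \stackrel{df}{=} \big\{ x \in U : f(x) = 0 \big\},$$

then $Df(x) = 0$ for $\lambda^N$-almost all $x \in Z$.

PROOF As before we may assume that $M = 1$. Also suppose that $\lambda^N(Z) > 0$, or otherwise there is nothing to prove. Then by virtue of Theorems 1.4.1 and 1.5.8, we can choose $x \in Z$, such that $Df(x)$ exists and

$$\lim_{r \searrow 0} \frac{\lambda^N\big(\overline{B}_r(x) \cap Z\big)}{\lambda^N\big(\overline{B}_r(x)\big)} = 1. \tag{1.42}$$

We have

$$f(y) = \big(\nabla f(x), y - x\big)_{\mathbb{R}^N} + o\big(\|y - x\|_{\mathbb{R}^N}\big) \quad \text{as } y \to x. \tag{1.43}$$

Suppose that

$$v \stackrel{df}{=} \nabla f(x) \neq 0.$$

We introduce the set

$$C \stackrel{df}{=} \Big\{ u \in \partial B_1(0) : (v, u)_{\mathbb{R}^N} \ge \frac{\|v\|_{\mathbb{R}^N}}{2} \Big\}.$$

For a given $u \in C$ and $t > 0$, in (1.43), we set $y = x + tu$ and we have

$$f(x + tu) = t \big(\nabla f(x), u\big)_{\mathbb{R}^N} + o\big(t\, \|u\|_{\mathbb{R}^N}\big) \ge \frac{t\, \|v\|_{\mathbb{R}^N}}{2} + o(t),$$

so

$$\frac{f(x + tu)}{t} \ge \frac{\|v\|_{\mathbb{R}^N}}{2} + \frac{o(t)}{t}$$

and

$$f(x + tu) > 0 \qquad \forall\, t \in (0, t^*),\ u \in C,$$

for some $t^* > 0$. Thus $Z$ misses the cone $\{x + tu : u \in C,\ 0 < t < t^*\}$, which occupies a fixed positive fraction of every small ball $\overline{B}_r(x)$, a contradiction to (1.42).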

for some t∗ > 0, a contradiction to (1.42). COROLLARY 1.5.11 If f1 , f2 : RN −→ RN are locally Lipschitz and df

E =

©

ª x ∈ RN : (f2 ◦ f1 )(x) = x ,

¡ ¢ then D(f2 ◦ f1 )(x) = Df2 f1 (x) Df1 (x) = idRN for λN -almost all x ∈ E.

1. Hausdorff Measures and Capacity PROOF

61

Let df

dom fi =

©

ª x ∈ RN : Dfi (x) exists ,

for i = 1, 2.

We set df

C = E ∩ dom f1 ∩ f1−1 (domf2 ). If x ∈ C \ f1−1 (dom f2 ), then f1 (x) ∈ RN \ dom f2 and so

¡ ¢ ¡ ¢ ¡ ¢ x = f2 f1 (x) = f2 ◦ f1 (x) ∈ f2 RN \ dom f2 .

It follows that E\C ⊆

¡ N ¢ ¡ ¢ R \ dom f1 ∪ f2 RN \ dom f2 .

(1.44)

Invoking Theorem 1.5.8, from (1.44), we infer that λN (C \ E) = 0 (recall that a Lipschitz continuous function maps Lebesgue-null sets to Lebesgue-null sets). If¢ x ∈ C, then from the definition of C, we see that ¡ Df1 (x) and Df2 f1 (x) exist and so ¡ ¢ Df2 f1 (x) Df1 (x) = D(f2 ◦ f1 )(x) (chain rule). Since (f2 ◦ f1 )(x) − x = 0, from Corollary 1.5.10, we infer that ¡ ¢ Df2 f1 (x) Df1 (x) = id for λN -a.a. x ∈ RN .

Continuing with our investigation of Lipschitz continuous maps $f : \mathbb{R}^N \to \mathbb{R}^M$, we aim at deriving change of variables formulas. We distinguish two cases. In the first case $N \le M$ and the change of variables formula is obtained via the so-called area formula, which asserts that the $N$-dimensional measure of $f(A)$ can be calculated by integrating a suitable Jacobian. In the second case $M \le N$ and the change of variables formula passes through the so-called coarea formula, which asserts that the integral of the $(N - M)$-dimensional measure of the level sets of $f$ is computed by integrating a suitable Jacobian. First we derive the area formula and for this we need some preparation. We start with a result from linear algebra, known as polar decomposition. It produces for a linear operator $L : \mathbb{R}^N \to \mathbb{R}^M$ an analog of the polar representation $z = re^{i\vartheta}$ of a complex number. First some definitions.

62

Nonlinear Analysis

DEFINITION 1.5.12 (a) An operator U : RN −→ RM is said to be orthogonal, if ¡ ¢ U x, U y RM = (x, y)RN ∀ x, y ∈ RN . (b) For a given operator L : RN −→ RM , its adjoint L∗ : RM −→ RN is defined by ¡ ¢ ¡ ¢ Lx, y RM = x, L∗ y RN ∀ x ∈ RN , y ∈ RM . (c) An operator L : RN −→ RN is self-adjoint, if L∗ = L. (d) An operator S : Rk −→ Rk is said to be positive (we write S > 0), if it is self-adjoint (i.e., S = S ∗ ) and ¡ ¢ Sx, x RN > 0 ∀ x ∈ RN . REMARK 1.5.13 If N = M , then U : RN −→ RN is orthogonal if and ∗ −1 only if U = U (hence U is invertible). In general, if U : RN −→ RM is orthogonal, then N 6 M and U ∗ ◦ U = idRN . Also U : RN −→ RM is orthogonal if and only if it is an isometry. Finally, if S : RN −→ RN is self-adjoint, then we can find an orthogonal operator U : RN −→ RN and a diagonalizable operator D : RN −→ RN , such that S = U ◦ D ◦ U −1 . A positive operator S : RN −→ RN has a unique positive square root T : RN −→ RN , i.e., T 2 = S. THEOREM 1.5.14 (Polar Decomposition Theorem) Let L : RN −→ RM be a linear operator. (a) If N 6 M , then there exist a positive operator S : RN −→ RN and an orthogonal operator U : RN −→ RM , such that L = U ◦ S. (b) If M 6 N , then there exists a positive operator S : RM −→ RM and an orthogonal operator U : RM −→ RN , such that L = S ◦ U ∗.

PROOF (a) Since $L^{**} = L$, the operator $L^* \circ L : \mathbb{R}^N \to \mathbb{R}^N$ is clearly positive. So it admits a unique positive square root $S : \mathbb{R}^N \to \mathbb{R}^N$ (see Remark 1.5.13). For each $y = Sx \in R(S)$, we write $Uy = Lx$, motivated by the fact that eventually we must have $L = U \circ S$. First, we need to show that with this definition $U$ is unambiguously defined on $R(S)$, that is, if $Sx_1 = Sx_2$, then $Lx_1 = Lx_2$. Note that
$$\|Sh\|_{\mathbb{R}^N}^2 = (S^2 h, h)_{\mathbb{R}^N} = (L^* L h, h)_{\mathbb{R}^N} = \|Lh\|_{\mathbb{R}^M}^2 \qquad \forall\, h \in \mathbb{R}^N.$$
So $Sx_1 = Sx_2$, which is equivalent to $\|S(x_1 - x_2)\|_{\mathbb{R}^N} = 0$, implies that $\|L(x_1 - x_2)\|_{\mathbb{R}^M} = 0$. Therefore $U$ is well defined on $R(S)$ and its range equals $R(L)$. Note that $L$ and $S$ have the same kernel. So
$$\dim R(L) = \dim R(S) \qquad \text{and} \qquad \dim R(S)^\perp = N - \dim R(S) \le M - \dim R(L) = \dim R(L)^\perp.$$
Therefore there exists a linear isometry $U_0 : R(S)^\perp \to R(L)^\perp$ (an isometric isomorphism when $N = M$). We extend $U$ on $R(S)^\perp$ by setting it equal to $U_0$. Since $\mathbb{R}^N = R(S) \oplus R(S)^\perp$, every $y \in \mathbb{R}^N$ can be written in a unique way as
$$y = Sx + u, \qquad \text{with } u \in R(S)^\perp.$$
We set $Uy = Lx + U_0 u$ and we have $U : \mathbb{R}^N \to \mathbb{R}^M$, which is linear and well defined. Also, exploiting the orthogonality of $R(S)$ and $R(S)^\perp$ (and of $R(L)$ and $R(L)^\perp$), we have
$$(Uy, Uy)_{\mathbb{R}^M} = (Lx + U_0 u, Lx + U_0 u)_{\mathbb{R}^M} = (Lx, Lx)_{\mathbb{R}^M} + (U_0 u, U_0 u)_{\mathbb{R}^M} = (Sx, Sx)_{\mathbb{R}^N} + (u, u)_{\mathbb{R}^N} = (y, y)_{\mathbb{R}^N},$$
so $U$ is orthogonal and $U \circ S = L$.
(b) Follows if we apply (a) to the operator $L^* : \mathbb{R}^M \to \mathbb{R}^N$.

REMARK 1.5.15 In a polar decomposition $L = U \circ S$, the positive operator $S$ is unique. Indeed suppose that $U \circ S = U_1 \circ S_1$. Then by taking adjoints, we obtain $S \circ U^* = S_1 \circ U_1^*$ and so
$$S^2 = S \circ U^* \circ U \circ S = S_1 \circ U_1^* \circ U_1 \circ S_1 = S_1^2.$$
The positive operator $S^2 = S_1^2$ has a unique positive square root, hence $S = S_1$. Moreover, if $N = M$ and the operator $L$ is invertible, then in the polar decomposition $L = U \circ S$, the orthogonal operator $U$ is unique too. Indeed, since $L$ is invertible, so is $S$ (since $S = U^{-1} \circ L$). Then from $U \circ S = U_1 \circ S_1$ and since $S^{-1} = S_1^{-1}$, we have that
$$U = U_1 \circ S_1 \circ S^{-1} = U_1 \circ S_1 \circ S_1^{-1} = U_1.$$
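As a numerical illustration of Theorem 1.5.14(a), the following Python sketch computes a polar decomposition $L = U \circ S$ for one injective $3 \times 2$ matrix. It is not taken from the text: the helper names and the sample matrix are hypothetical, and the square root of $L^* \circ L$ is obtained from the Cayley-Hamilton identity, a shortcut that is valid only for $2 \times 2$ positive definite matrices.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def sqrt_spd_2x2(A):
    """Positive square root of a symmetric positive definite 2x2 matrix.

    By Cayley-Hamilton, R = (A + s I) / t with s = sqrt(det A) and
    t = sqrt(tr A + 2 s) satisfies R R = A.
    """
    s = math.sqrt(A[0][0] * A[1][1] - A[0][1] * A[1][0])
    t = math.sqrt(A[0][0] + A[1][1] + 2.0 * s)
    return [[(A[0][0] + s) / t, A[0][1] / t],
            [A[1][0] / t, (A[1][1] + s) / t]]

def inverse_2x2(A):
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]

def polar_decomposition(L):
    """Return (U, S) with L = U S for an injective M x 2 matrix L."""
    S = sqrt_spd_2x2(matmul(transpose(L), L))   # S = (L* L)^(1/2)
    U = matmul(L, inverse_2x2(S))               # U = L S^(-1)
    return U, S

L = [[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]]        # sample map R^2 -> R^3
U, S = polar_decomposition(L)
UtU = matmul(transpose(U), U)                   # should be the 2x2 identity
US = matmul(U, S)                               # should reproduce L
```

Because the sample $L$ is injective, $S$ is invertible and $U = L \circ S^{-1}$ is determined on all of $\mathbb{R}^2$; the extension step with $U_0$ in the proof is needed only when $S$ has a nontrivial kernel.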


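We can use Theorem 1.5.14 to define the Jacobian of a Lipschitz continuous map $f : \mathbb{R}^N \to \mathbb{R}^M$.

DEFINITION 1.5.16 Let $L : \mathbb{R}^N \to \mathbb{R}^M$ be a linear operator.
(a) If $N \le M$ and $L = U \circ S$ is a polar decomposition of $L$ (see Theorem 1.5.14), then we define the Jacobian of $L$ to be
$$\operatorname{jac} L \stackrel{df}{=} |\det S|.$$
(b) If $M \le N$ and $L = S \circ U^*$ is a polar decomposition of $L$ (see Theorem 1.5.14), then we define the Jacobian of $L$ to be
$$\operatorname{jac} L \stackrel{df}{=} |\det S|.$$
(c) If $f : \mathbb{R}^N \to \mathbb{R}^M$ is Lipschitz continuous and
$$Df = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_N} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_M}{\partial x_1} & \cdots & \dfrac{\partial f_M}{\partial x_N} \end{pmatrix}$$
is the $M \times N$-gradient matrix, then the Jacobian of $f$ is defined by
$$Jf(x) \stackrel{df}{=} \operatorname{jac} Df(x) \qquad \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N.$$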

REMARK 1.5.17 Since in a polar decomposition the positive operator is uniquely defined (see Remark 1.5.15), the notions introduced in Definition 1.5.16 are well defined. If $L : \mathbb{R}^N \to \mathbb{R}^M$ is a linear operator, then we can easily check that if $N \le M$, we have $(\operatorname{jac} L)^2 = \det(L^* \circ L)$ (since $S^2 = L^* \circ L$), while if $M \le N$, we have $(\operatorname{jac} L)^2 = \det(L \circ L^*)$. Another expression computing $(\operatorname{jac} L)^2$ is given by the so-called Binet-Cauchy formula. So let $N \le M$ and set
$$\Theta(N, M) \stackrel{df}{=} \big\{\theta : \{1, \ldots, N\} \to \{1, \ldots, M\} \text{ is increasing}\big\}.$$
For each $\theta \in \Theta(N, M)$, we define $P_\theta : \mathbb{R}^M \to \mathbb{R}^N$ by
$$P_\theta(x_1, \ldots, x_M) \stackrel{df}{=} (x_{\theta(1)}, \ldots, x_{\theta(N)}).$$
Clearly $P_\theta$ is the projection operator of $\mathbb{R}^M$ on the $N$-dimensional subspace $V = \operatorname{span}\{e_{\theta(k)}\}_{k=1}^N$. Then, if $L : \mathbb{R}^N \to \mathbb{R}^M$ is a linear operator, a straightforward but cumbersome proof gives the Binet-Cauchy formula:
$$(\operatorname{jac} L)^2 = \sum_{\theta \in \Theta(N, M)} \det(P_\theta \circ L)^2.$$
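The Binet-Cauchy formula can be checked on a small example. The sketch below (an illustration under stated assumptions, not part of the text) uses the identity $(\operatorname{jac} L)^2 = \det(L^* \circ L)$, which holds because $S^2 = L^* \circ L$ in the polar decomposition, and compares it with the sum of the squared $N \times N$ subdeterminants. The function names and the integer sample matrix are hypothetical; integer entries keep the comparison exact.

```python
from itertools import combinations

def det(A):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(A) == 1:
        return A[0][0]
    total = 0
    for j, a in enumerate(A[0]):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def gram(L):
    """Matrix of L* o L for an M x N matrix L given as a list of rows."""
    n = len(L[0])
    return [[sum(row[i] * row[j] for row in L) for j in range(n)] for i in range(n)]

def binet_cauchy_sum(L):
    """Sum of det(P_theta o L)^2 over all increasing theta, i.e. over all row subsets of size N."""
    m, n = len(L), len(L[0])
    return sum(det([L[r] for r in rows]) ** 2 for rows in combinations(range(m), n))

L = [[1, 2], [0, 1], [1, 0], [3, -1]]   # sample map R^2 -> R^4
lhs = det(gram(L))                      # (jac L)^2 = det(L* o L)
rhs = binet_cauchy_sum(L)               # right-hand side of the Binet-Cauchy formula
```

Each increasing map $\theta$ corresponds to one choice of $N$ rows of the matrix of $L$, which is why `combinations` enumerates $\Theta(N, M)$.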

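For details we refer to Evans & Gariepy (1992, p. 89).

LEMMA 1.5.18 If $L : \mathbb{R}^N \to \mathbb{R}^M$ is a linear operator, $N \le M$ and $A \subseteq \mathbb{R}^N$, then
$$\mu^{(N)}\big(L(A)\big) = \operatorname{jac} L \cdot \lambda^N(A).$$

PROOF Let $L = U \circ S$ be a polar decomposition of $L$ (see Theorem 1.5.14). We know that $\operatorname{jac} L = |\det S|$ (see Definition 1.5.16(a)). If $\operatorname{jac} L = 0$, then $\det S = 0$ and so $S$ is not surjective, i.e., $\dim R(S) \le N - 1$. It follows that $\mu^{(N)}\big(L(A)\big) = 0$ (see, e.g., Proposition 1.3.23). If $\operatorname{jac} L > 0$, then using the orthogonality of $U$ and the facts that $\mu^{(N)} = \lambda^N$ in $\mathbb{R}^N$, $L = U \circ S$ and $U^* \circ U = \mathrm{id}_{\mathbb{R}^N}$, we have
$$\frac{\mu^{(N)}\big(L(\overline{B}_r(x))\big)}{\lambda^N\big(\overline{B}_r(x)\big)} = \frac{\lambda^N\big((U^* \circ L)(\overline{B}_r(x))\big)}{\lambda^N\big(\overline{B}_r(x)\big)} = \frac{\lambda^N\big((U^* \circ U \circ S)(\overline{B}_r(x))\big)}{\lambda^N\big(\overline{B}_r(x)\big)} = \frac{\lambda^N\big(S(\overline{B}_r(x))\big)}{\lambda^N\big(\overline{B}_r(x)\big)} = \frac{\lambda^N\big(S(\overline{B}_1(0))\big)}{a(N)} = |\det S| = \operatorname{jac} L, \tag{1.45}$$
where $a(N) \stackrel{df}{=} \dfrac{\pi^{N/2}}{\left(\frac{N}{2}\right)!}$ is the volume of the unit ball in $\mathbb{R}^N$. Finally let
$$\vartheta(A) \stackrel{df}{=} \mu^{(N)}\big(L(A)\big) \qquad \forall\, A \subseteq \mathbb{R}^N.$$
Then $\vartheta$ is a Radon measure and $\vartheta \ll \lambda^N$. So the Radon-Nikodym derivative of $\vartheta$ with respect to $\lambda^N$ (see Theorem A.2.24 and Remark A.2.25) exists and is given by
$$\frac{d\vartheta}{d\lambda^N}(x) = \lim_{r \searrow 0} \frac{\vartheta\big(\overline{B}_r(x)\big)}{\lambda^N\big(\overline{B}_r(x)\big)} = \operatorname{jac} L$$
(see (1.45) and Widom (1969, p. 119)). From the Radon-Nikodym theorem (see Theorem A.2.24), we infer that for all Borel sets $A \subseteq \mathbb{R}^N$, we have
$$\mu^{(N)}\big(L(A)\big) = \operatorname{jac} L \cdot \lambda^N(A). \tag{1.46}$$
Because $\vartheta$ and $\lambda^N$ are both Radon measures, we conclude that (1.46) holds for all $A \subseteq \mathbb{R}^N$.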


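LEMMA 1.5.19 If $f : \mathbb{R}^N \to \mathbb{R}^M$ is Lipschitz continuous, $N \le M$ and $A \subseteq \mathbb{R}^N$ is Lebesgue measurable, then
(a) $f(A)$ is a $\mu^{(N)}$-measurable set;
(b) the map $y \mapsto \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)$ is $\mu^{(N)}$-measurable on $\mathbb{R}^M$;
(c) $\displaystyle\int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y) \le \big(\operatorname{Lip}(f)\big)^N \lambda^N(A)$.

PROOF Clearly we can assume without any loss of generality that $A$ is bounded (if not, consider instead $A \cap B_r(0)$).
(a) From the regularity of the Lebesgue measure, we know that for every $i \ge 1$, we can find a compact set $K_i \subseteq A$, such that
$$\lambda^N(A \setminus K_i) \le \frac{1}{i} \qquad \forall\, i \ge 1.$$
Because $f$ is Lipschitz continuous, $f(K_i) \subseteq \mathbb{R}^M$ is compact and so it is $\mu^{(N)}$-measurable. Then
$$f\Big(\bigcup_{i=1}^\infty K_i\Big) = \bigcup_{i=1}^\infty f(K_i) \quad \text{is a } \mu^{(N)}\text{-measurable set.}$$
Also, using Proposition 1.3.25, we have
$$\mu^{(N)}\Big(f(A) \setminus f\Big(\bigcup_{i=1}^\infty K_i\Big)\Big) \le \mu^{(N)}\Big(f\Big(A \setminus \bigcup_{i=1}^\infty K_i\Big)\Big) \le \operatorname{Lip}(f)^N\, \lambda^N\Big(A \setminus \bigcup_{i=1}^\infty K_i\Big) = 0,$$
so $f(A)$ is $\mu^{(N)}$-measurable.
(b) For $k \ge 1$, we introduce the following families of $N$-cubes
$$\mathcal{F}_k \stackrel{df}{=} \Big\{Q \subseteq \mathbb{R}^N : Q = \prod_{j=1}^N \Big(\frac{c_j}{k}, \frac{c_j + 1}{k}\Big],\ c_j \text{ are integers},\ j \in \{1, \ldots, N\}\Big\}.$$
Let
$$h_k \stackrel{df}{=} \sum_{Q \in \mathcal{F}_k} \chi_{f(A \cap Q)}.$$
Then by part (a), $h_k$ is $\mu^{(N)}$-measurable and for every $y \in \mathbb{R}^M$, $h_k(y)$ is the number of cubes $Q \in \mathcal{F}_k$, such that
$$(A \cap Q) \cap f^{-1}(\{y\}) \ne \emptyset.$$
Hence for all $y \in \mathbb{R}^M$, we have
$$h_k(y) \nearrow \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big) \qquad \text{as } k \to +\infty,$$
so the function $y \mapsto \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)$ is $\mu^{(N)}$-measurable.
(c) Using the monotone convergence theorem (see Theorem A.2.10), Proposition 1.3.25 and the fact that
$$\mathbb{R}^N = \bigcup_{Q \in \mathcal{F}_k} Q \qquad \forall\, k \ge 1,$$
we have
$$\int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y) = \lim_{k \to +\infty} \int_{\mathbb{R}^M} h_k(y)\, d\mu^{(N)}(y) = \lim_{k \to +\infty} \sum_{Q \in \mathcal{F}_k} \mu^{(N)}\big(f(A \cap Q)\big) \le \limsup_{k \to +\infty} \sum_{Q \in \mathcal{F}_k} \big(\operatorname{Lip}(f)\big)^N \lambda^N(A \cap Q) = \big(\operatorname{Lip}(f)\big)^N \lambda^N(A).$$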

LEMMA 1.5.20 If $f : \mathbb{R}^N \to \mathbb{R}^M$ is Lipschitz continuous, $t > 1$ and
$$C \stackrel{df}{=} \big\{x \in \mathbb{R}^N : Df(x) \text{ exists and } Jf(x) > 0\big\},$$
then there exists a sequence $\{E_i\}_{i \ge 1}$ of Borel subsets of $\mathbb{R}^N$, such that
(a) $C = \bigcup_{i=1}^\infty E_i$;
(b) $f|_{E_i}$ is injective for all $i \ge 1$;
(c) for each $i \ge 1$ there exists a self-adjoint isomorphism $L_i : \mathbb{R}^N \to \mathbb{R}^N$, such that
$$\operatorname{Lip}\big((f|_{E_i}) \circ L_i^{-1}\big) \le t, \qquad \operatorname{Lip}\big(L_i \circ (f|_{E_i})^{-1}\big) \le t$$
and
$$\frac{|\det L_i|}{t^N} \le Jf|_{E_i} \le t^N |\det L_i|.$$

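PROOF Choose $\varepsilon > 0$ so that $\frac{1}{t} + \varepsilon < 1 < t - \varepsilon$. Let $E$ be a countable dense subset of $C$ and let $G$ be a countable dense subset of the space of self-adjoint isomorphisms of $\mathbb{R}^N$. For each $u \in E$, $L \in G$ and $k \ge 1$, we set $E(u, L, k)$ to be the set of all $x \in C \cap B_{1/k}(u)$, such that
$$\Big(\frac{1}{t} + \varepsilon\Big) \|Lh\|_{\mathbb{R}^N} \le \|Df(x)h\|_{\mathbb{R}^M} \le (t - \varepsilon) \|Lh\|_{\mathbb{R}^N} \qquad \forall\, h \in \mathbb{R}^N \tag{1.47}$$
and
$$\|f(y) - f(x) - Df(x)(y - x)\|_{\mathbb{R}^M} \le \varepsilon \|L(y - x)\|_{\mathbb{R}^N} \qquad \forall\, y \in B_{2/k}(x). \tag{1.48}$$
Since $x \mapsto Df(x)$ is a Borel function, we see that $E(u, L, k)$ is a Borel subset of $\mathbb{R}^N$. From (1.47) and (1.48), it follows that
$$\frac{1}{t} \|L(y - x)\|_{\mathbb{R}^N} \le \|f(y) - f(x)\|_{\mathbb{R}^M} \le t \|L(y - x)\|_{\mathbb{R}^N} \qquad \forall\, x \in E(u, L, k),\ y \in B_{2/k}(x). \tag{1.49}$$

Claim 1. If $x \in E(u, L, k)$, then
$$\Big(\frac{1}{t} + \varepsilon\Big)^N |\det L| \le Jf(x) \le (t - \varepsilon)^N |\det L|.$$
Let $Df(x) = U \circ S$ be a polar decomposition (see Theorem 1.5.14(a)). According to Definition 1.5.16(c),
$$Jf(x) = \operatorname{jac} Df(x) = |\det S|.$$
From (1.47), we have
$$\Big(\frac{1}{t} + \varepsilon\Big) \|Lh\|_{\mathbb{R}^N} \le \|(U \circ S)h\|_{\mathbb{R}^M} = \|Sh\|_{\mathbb{R}^N} \le (t - \varepsilon) \|Lh\|_{\mathbb{R}^N} \qquad \forall\, h \in \mathbb{R}^N.$$
Since $L \in G$ is an isomorphism, we have
$$\Big(\frac{1}{t} + \varepsilon\Big) \|h\|_{\mathbb{R}^N} \le \|(S \circ L^{-1})h\|_{\mathbb{R}^N} \le (t - \varepsilon) \|h\|_{\mathbb{R}^N} \qquad \forall\, h \in \mathbb{R}^N,$$
so
$$(S \circ L^{-1})\big(B_1(0)\big) \subseteq B_{t - \varepsilon}(0),$$
thus
$$\big|\det(S \circ L^{-1})\big|\, a(N) \le \lambda^N\big(B_{t - \varepsilon}(0)\big) = a(N)(t - \varepsilon)^N,$$
where $a(N) \stackrel{df}{=} \dfrac{\pi^{N/2}}{\left(\frac{N}{2}\right)!}$ is the volume of the unit ball in $\mathbb{R}^N$. Finally
$$|\det S| \le (t - \varepsilon)^N |\det L|.$$
Similarly, we prove the other inequality of the claim. So Claim 1 is proved.

Let $\{E_i\}_{i \ge 1}$ be an enumeration of the countable set
$$\big\{E(u, L, k) : u \in E,\ L \in G,\ k \ge 1\big\}.$$
(a) Let $x \in C$ and let $Df(x) = U \circ S$ (by Theorem 1.5.14(a)). Note that because $x \in C$, the operator $S$ is invertible. Select $L \in G$, such that
$$\operatorname{Lip}(L \circ S^{-1}) = \|L \circ S^{-1}\|_{\mathcal{L}} \le \Big(\frac{1}{t} + \varepsilon\Big)^{-1} \qquad \text{and} \qquad \operatorname{Lip}(S \circ L^{-1}) = \|S \circ L^{-1}\|_{\mathcal{L}} \le t - \varepsilon.$$
Also select $k \ge 1$ and $u \in E$, such that $\|x - u\|_{\mathbb{R}^N} < \frac{1}{k}$ and
$$\|f(y) - f(x) - Df(x)(y - x)\|_{\mathbb{R}^M} \le \frac{\varepsilon}{\operatorname{Lip}(L^{-1})} \|y - x\|_{\mathbb{R}^N} \le \varepsilon \|L(y - x)\|_{\mathbb{R}^N} \qquad \forall\, y \in B_{2/k}(x).$$
We infer that $x \in E(u, L, k)$ and so $x \in E_i$ for some $i \ge 1$. Because $x \in C$ was arbitrary, we have proved statement (a).
(b) Choose $E_i$ from the countable collection $\{E_i\}_{i \ge 1}$. We have $E_i = E(u, L, k)$ for some $u \in E$, $L \in G$ and $k \ge 1$. Let us set $L_i = L$. From (1.49), we have
$$\frac{1}{t} \|L_i(y - x)\|_{\mathbb{R}^N} \le \|f(y) - f(x)\|_{\mathbb{R}^M} \le t \|L_i(y - x)\|_{\mathbb{R}^N} \qquad \forall\, x \in E_i,\ y \in B_{2/k}(x).$$
Since $E_i \subseteq B_{1/k}(u) \subseteq B_{2/k}(x)$ for every $x \in E_i$, we have that
$$\frac{1}{t} \|L_i(y - x)\|_{\mathbb{R}^N} \le \|f(y) - f(x)\|_{\mathbb{R}^M} \le t \|L_i(y - x)\|_{\mathbb{R}^N} \qquad \forall\, x, y \in E_i, \tag{1.50}$$
so $f|_{E_i}$ is injective.
(c) From (1.50), it follows that
$$\operatorname{Lip}\big((f|_{E_i}) \circ L_i^{-1}\big) \le t \qquad \text{and} \qquad \operatorname{Lip}\big(L_i \circ (f|_{E_i})^{-1}\big) \le t.$$
Moreover, from Claim 1 and letting $\varepsilon \searrow 0$, we obtain
$$\frac{|\det L_i|}{t^N} \le Jf|_{E_i} \le t^N |\det L_i|.$$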

Now we are ready for the area formula.

THEOREM 1.5.21 (Area Formula) If $f : \mathbb{R}^N \to \mathbb{R}^M$ is a Lipschitz continuous function, $N \le M$ and $A \subseteq \mathbb{R}^N$ is a Lebesgue measurable set, then
$$\int_A Jf(x)\, d\lambda^N(x) = \int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y).$$
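To see the role of the multiplicity function in the area formula, consider the hypothetical one-dimensional example $N = M = 1$, $f(x) = \sin x$, $A = [0, 2\pi]$ (so that $Jf = |\cos x|$ and $\mu^{(1)} = \lambda^1$). The following Python sketch, which is an illustration and not part of the text, approximates both sides of the formula: the left-hand side by a midpoint rule and the right-hand side by counting preimages through sign changes of $f - y$ on a grid. The grid sizes are arbitrary choices.

```python
import math

def jacobian_integral(df, a, b, n=20000):
    """Midpoint rule for the integral of |f'(x)| over [a, b]."""
    h = (b - a) / n
    return h * sum(abs(df(a + (i + 0.5) * h)) for i in range(n))

def multiplicity(f, y, a, b, n=2000):
    """Approximate number of points x in [a, b] with f(x) = y,
    counted as sign changes of f - y between neighbouring grid points."""
    h = (b - a) / n
    count, prev = 0, f(a) - y
    for i in range(1, n + 1):
        cur = f(a + i * h) - y
        if prev * cur < 0.0:
            count += 1
        prev = cur
    return count

def multiplicity_integral(f, a, b, y_lo, y_hi, m=400):
    """Midpoint rule in y for the integral of the multiplicity function."""
    k = (y_hi - y_lo) / m
    return k * sum(multiplicity(f, y_lo + (j + 0.5) * k, a, b) for j in range(m))

a, b = 0.0, 2.0 * math.pi
left = jacobian_integral(math.cos, a, b)                    # integral of Jf over A
right = multiplicity_integral(math.sin, a, b, -1.0, 1.0)    # integral of the multiplicity
```

Both quantities should be close to 4: almost every $y \in (-1, 1)$ has exactly two preimages in $A$, so the image $f(A) = [-1, 1]$ of length 2 is counted twice.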

PROOF By virtue of Theorem 1.5.8, without any loss of generality we may assume that $Df(x)$ (and so $Jf(x)$ too) exists for all $x \in A$. Also as before we may suppose that $\lambda^N(A) < +\infty$.

Case 1. $A \subseteq \{Jf > 0\}$.
In this case we may use Lemma 1.5.20 and produce a sequence $\{E_i\}_{i \ge 1}$ of Borel subsets of $\mathbb{R}^N$ which satisfy the postulates of Lemma 1.5.20. We may additionally assume that the sets $\{E_i\}_{i \ge 1}$ are disjoint. Let $\mathcal{F}_k$ be the following family of $N$-cubes
$$\mathcal{F}_k \stackrel{df}{=} \Big\{Q \subseteq \mathbb{R}^N : Q = \prod_{j=1}^N \Big(\frac{c_j}{k}, \frac{c_j + 1}{k}\Big],\ c_j \text{ are integers},\ j \in \{1, \ldots, N\}\Big\}$$
(compare with the proof of Lemma 1.5.19(b)). We set
$$F_{i,n}^k \stackrel{df}{=} E_i \cap Q_n^k \cap A \qquad \text{with } Q_n^k \in \mathcal{F}_k, \quad \forall\, i \ge 1,\ n \ge 1,$$
where $\{Q_n^k\}_{n \ge 1}$ is an enumeration of $\mathcal{F}_k$. Evidently the sets $F_{i,n}^k$ are disjoint and
$$A = \bigcup_{i,n=1}^\infty F_{i,n}^k \qquad \forall\, k \ge 1.$$
First we show that
$$\lim_{k \to +\infty} \sum_{i,n=1}^\infty \mu^{(N)}\big(f(F_{i,n}^k)\big) = \int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y). \tag{1.51}$$
To this end, we introduce
$$h_k \stackrel{df}{=} \sum_{i,n=1}^\infty \chi_{f(F_{i,n}^k)} \qquad \forall\, k \ge 1.$$
Therefore $h_k(y)$ is the number of sets from the sequence $\{F_{i,n}^k\}_{i,n \ge 1}$ which intersect $f^{-1}(\{y\})$. Note that
$$h_k(y) \nearrow \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big) \qquad \text{as } k \to +\infty.$$
Then (1.51) follows from the monotone convergence theorem (see Theorem A.2.10). Let $t > 1$. Because of Lemma 1.5.20 and Proposition 1.3.25, we have
$$\mu^{(N)}\big(f(F_{i,n}^k)\big) = \mu^{(N)}\Big(\big((f|_{E_i}) \circ L_i^{-1}\big) \circ L_i(F_{i,n}^k)\Big) \le t^N \lambda^N\big(L_i(F_{i,n}^k)\big) \tag{1.52}$$
and
$$\lambda^N\big(L_i(F_{i,n}^k)\big) = \mu^{(N)}\Big(\big(L_i \circ (f|_{E_i})^{-1}\big) \circ f(F_{i,n}^k)\Big) \le t^N \mu^{(N)}\big(f(F_{i,n}^k)\big). \tag{1.53}$$
So, using (1.52), (1.53) and Lemmas 1.5.18 and 1.5.20(c), it follows that
$$\frac{1}{t^{2N}} \mu^{(N)}\big(f(F_{i,n}^k)\big) \le \frac{1}{t^N} \lambda^N\big(L_i(F_{i,n}^k)\big) = \frac{1}{t^N} |\det L_i|\, \lambda^N(F_{i,n}^k) \le \int_{F_{i,n}^k} Jf(x)\, d\lambda^N(x) \le t^N |\det L_i|\, \lambda^N(F_{i,n}^k) = t^N \lambda^N\big(L_i(F_{i,n}^k)\big) \le t^{2N} \mu^{(N)}\big(f(F_{i,n}^k)\big).$$
We take the sum over the parameters $i, n \ge 1$. Recalling that the sets $F_{i,n}^k$ are disjoint and since $f|_{E_i}$ is injective (see Lemma 1.5.20(b)), we obtain
$$\frac{1}{t^{2N}} \sum_{i,n=1}^\infty \mu^{(N)}\big(f(F_{i,n}^k)\big) \le \int_A Jf(x)\, d\lambda^N(x) \le t^{2N} \sum_{i,n=1}^\infty \mu^{(N)}\big(f(F_{i,n}^k)\big).$$
Let $k \to +\infty$ and use (1.51) to write that
$$\frac{1}{t^{2N}} \int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y) \le \int_A Jf(x)\, d\lambda^N(x) \le t^{2N} \int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y).$$

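Since $t > 1$ was arbitrary, we let $t \searrow 1$ and obtain the "area formula" for the case when $A \subseteq \{Jf > 0\}$.

Case 2. $A \subseteq \{Jf = 0\}$.
Let $\varepsilon \in (0, 1]$. We write $f = p \circ g$, where
$$g : \mathbb{R}^N \to \mathbb{R}^M \times \mathbb{R}^N \qquad \text{and} \qquad p : \mathbb{R}^M \times \mathbb{R}^N \to \mathbb{R}^M$$
are defined by
$$g(x) \stackrel{df}{=} \big(f(x), \varepsilon x\big) \qquad \text{and} \qquad p(y, z) \stackrel{df}{=} y \qquad \forall\, x, z \in \mathbb{R}^N,\ y \in \mathbb{R}^M.$$
We show that there exists $\xi > 0$, such that
$$0 < Jg(x) \le \xi \varepsilon \qquad \forall\, x \in A. \tag{1.54}$$
Note that
$$Dg(x) = \begin{pmatrix} Df(x) \\ \varepsilon I_N \end{pmatrix}_{(N + M) \times N},$$
where $I_N$ is the $N \times N$-identity matrix. Then by virtue of the Binet-Cauchy formula (see Remark 1.5.17), we have that
$$Jg(x)^2 = \text{"sum of squares of } (N \times N)\text{-subdeterminants of } Dg(x)\text{"} \ge \varepsilon^{2N} > 0.$$
Moreover, since $\|Df(x)\|_{\mathcal{L}} \le \operatorname{Lip}(f)$ for $\lambda^N$-a.a. $x \in \mathbb{R}^N$ and $Jf(x) = 0$ on $A$, once again from the Binet-Cauchy formula, we have that
$$Jg(x)^2 = Jf(x)^2 + \Big\{\text{sum of squares of terms each involving at least one } \varepsilon\Big\} \le \xi^2 \varepsilon^2,$$
for some $\xi > 0$ and all $x \in A$. This shows that (1.54) is true. Then, using Proposition 1.3.25, Case 1, (1.54) and the fact that $\|p\|_{\mathcal{L}} = 1$, we have
$$\mu^{(N)}\big(f(A)\big) \le \mu^{(N)}\big(g(A)\big) \le \int_{\mathbb{R}^{N + M}} \mu^{(0)}\big(A \cap g^{-1}(\{(y, z)\})\big)\, d\mu^{(N)}(y, z) = \int_A Jg(x)\, d\lambda^N(x) \le \xi \varepsilon \lambda^N(A).$$
Let $\varepsilon \searrow 0$ to conclude that $\mu^{(N)}\big(f(A)\big) = 0$. Note that
$$\operatorname{supp} \mu^{(0)}\big(A \cap f^{-1}(\{\cdot\})\big) \subseteq f(A).$$
Hence
$$\int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y) = \int_A Jf(x)\, d\lambda^N(x) = 0.$$
This proves Case 2.

Finally for the general case, we write $A = A_0 \cup A_1$ with
$$A_0 \subseteq \{Jf = 0\} \qquad \text{and} \qquad A_1 \subseteq \{Jf > 0\}$$
and apply the result on each set separately.

REMARK 1.5.22 The function
$$y \mapsto \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)$$
on $\mathbb{R}^M$ is called the multiplicity function. Also note that from Theorem 1.5.21, we infer that $f^{-1}(\{y\})$ is at most countable for $\mu^{(N)}$-almost all $y \in \mathbb{R}^M$.

THEOREM 1.5.23 (Change of Variables Formula I) If $f : \mathbb{R}^N \to \mathbb{R}^M$ is a Lipschitz continuous function, $N \le M$ and $g \in L^1(\mathbb{R}^N)$, then
$$\int_{\mathbb{R}^N} g(x) Jf(x)\, d\lambda^N(x) = \int_{\mathbb{R}^M} \Big[\sum_{x \in f^{-1}(\{y\})} g(x)\Big]\, d\mu^{(N)}(y).$$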

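PROOF First assume that $g \ge 0$. Let
$$A_1 \stackrel{df}{=} \big\{x \in \mathbb{R}^N : g(x) \ge 1\big\}$$
and inductively define
$$A_n \stackrel{df}{=} \Big\{x \in \mathbb{R}^N : g(x) \ge \frac{1}{n} + \sum_{i=1}^{n-1} \frac{1}{i} \chi_{A_i}(x)\Big\} \qquad \forall\, n \ge 2.$$
Then
$$g \ge \sum_{n=1}^\infty \frac{1}{n} \chi_{A_n}.$$
If $g(x) = +\infty$, we see that $x \in A_n$ for all $n \ge 1$ and both sides equal $+\infty$. On the other hand, if $0 < g(x) < +\infty$, then $x \notin A_n$ for infinitely many $n \ge 1$. So for infinitely many $n$, we have
$$0 \le g(x) - \sum_{i=1}^{n-1} \frac{1}{i} \chi_{A_i}(x) < \frac{1}{n}$$
and so we conclude that
$$g = \sum_{n=1}^\infty \frac{1}{n} \chi_{A_n}.$$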
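The inductive definition of the sets $A_n$ is a greedy algorithm: at a fixed point $x$, the term $\frac{1}{n}$ is accepted exactly when adding it does not overshoot $g(x)$. The following Python sketch (illustrative only; the function name and the sample values are hypothetical) runs this rule at a single point with exact rational arithmetic and returns the accepted indices together with the partial sum.

```python
from fractions import Fraction

def greedy_indices(g, n_max):
    """Indices n <= n_max with x in A_n, at a point x where g(x) = g >= 0,
    together with the partial sum of the accepted terms 1/n."""
    partial, indices = Fraction(0), []
    for n in range(1, n_max + 1):
        # x belongs to A_n iff g >= 1/n + sum of 1/i over accepted i < n
        if g >= Fraction(1, n) + partial:
            indices.append(n)
            partial += Fraction(1, n)
    return indices, partial

idx_a, sum_a = greedy_indices(Fraction(3, 4), 50)   # 3/4 = 1/2 + 1/4
idx_b, sum_b = greedy_indices(Fraction(5, 2), 50)   # 5/2 needs the first six terms and 1/20
```

For rational values the partial sums may reach $g(x)$ after finitely many steps, as in these two samples; in general the proof only guarantees that the remainder is below $\frac{1}{n}$ for infinitely many $n$, which is enough for convergence.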

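From the monotone convergence theorem (see Theorem A.2.10) and Theorem 1.5.21, we have
$$\begin{aligned}
\int_{\mathbb{R}^N} g(x) Jf(x)\, d\lambda^N(x) &= \sum_{n=1}^\infty \frac{1}{n} \int_{\mathbb{R}^N} \chi_{A_n}(x) Jf(x)\, d\lambda^N(x) \\
&= \sum_{n=1}^\infty \frac{1}{n} \int_{A_n} Jf(x)\, d\lambda^N(x) = \sum_{n=1}^\infty \frac{1}{n} \int_{\mathbb{R}^M} \mu^{(0)}\big(A_n \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y) \\
&= \int_{\mathbb{R}^M} \Big(\sum_{n=1}^\infty \frac{1}{n} \sum_{x \in f^{-1}(\{y\})} \chi_{A_n}(x)\Big)\, d\mu^{(N)}(y) \\
&= \int_{\mathbb{R}^M} \Big(\sum_{x \in f^{-1}(\{y\})} \sum_{n=1}^\infty \frac{1}{n} \chi_{A_n}(x)\Big)\, d\mu^{(N)}(y) \\
&= \int_{\mathbb{R}^M} \Big(\sum_{x \in f^{-1}(\{y\})} g(x)\Big)\, d\mu^{(N)}(y).
\end{aligned}$$
In the general case let $g = g^+ - g^-$ and apply the first part of the proof on each component function $g^+ \ge 0$, $g^- \ge 0$.

EXAMPLE 1.5.24
(a) Let $N = 1$, $M \ge 1$. Suppose that $f : \mathbb{R} \to \mathbb{R}^M$ is Lipschitz continuous and injective. We have
$$f = (f_k)_{k=1}^M \qquad \text{and} \qquad Df = (f_k')_{k=1}^M, \qquad \text{with } {}' = \frac{d}{dt}.$$
Let $-\infty < a < b < +\infty$ and $C = f\big([a, b]\big)$ (the curve defined by $f$). Using Theorem 1.5.21, we have
$$\mu^{(1)}(C) = \int_a^b |f'(t)|\, d\lambda^1(t),$$
the length of $C$.
(b) Let $N \ge 1$, $M = N + 1$. Suppose that $g : \mathbb{R}^N \to \mathbb{R}$ is Lipschitz continuous and let $f : \mathbb{R}^N \to \mathbb{R}^{N+1}$ be the Lipschitz continuous function defined by
$$f(x) \stackrel{df}{=} \big(x, g(x)\big) \qquad \forall\, x \in \mathbb{R}^N.$$
We have
$$Df(x) = \begin{pmatrix} I_N \\ \nabla g(x) \end{pmatrix}_{(N+1) \times N} \qquad \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N,$$
where $I_N$ is the $N \times N$-identity matrix. Therefore
$$(Jf)^2 = \Big\{\text{sum of squares of } N \times N\text{-subdeterminants}\Big\} = 1 + \|\nabla g\|_{\mathbb{R}^N}^2.$$
Then, if
$$G \stackrel{df}{=} \big\{(x, y) \in \mathbb{R}^N \times \mathbb{R} : y = g(x)\big\}$$
(the graph of $g$), from Theorem 1.5.21, we have
$$\mu^{(N)}(G) = \int_{\mathbb{R}^N} \sqrt{1 + \|\nabla g(x)\|_{\mathbb{R}^N}^2}\, d\lambda^N(x),$$
the surface area of $G$.
(c) Let $N \ge 1$, $M = N + 1$. Suppose that $f : \mathbb{R}^N \to \mathbb{R}^{N+1}$ is Lipschitz continuous and injective. Then
$$f = (f_k)_{k=1}^{N+1} \qquad \text{and} \qquad Df = \Big(\frac{\partial f_k}{\partial x_i}\Big)_{\substack{k = 1, \ldots, N+1 \\ i = 1, \ldots, N}}.$$
So
$$(Jf)^2 = \sum_{k=1}^{N+1} \Big[\frac{\partial(f_1, \ldots, f_{k-1}, f_{k+1}, \ldots, f_{N+1})}{\partial(x_1, \ldots, x_N)}\Big]^2,$$
the sum of squares of $N \times N$-subdeterminants. Therefore, if $U \subseteq \mathbb{R}^N$ is any open set and $A = f(U) \subseteq \mathbb{R}^{N+1}$, then by Theorem 1.5.21, we have
$$\mu^{(N)}(A) = \int_U \Big(\sum_{k=1}^{N+1} \Big|\frac{\partial(f_1, \ldots, f_{k-1}, f_{k+1}, \ldots, f_{N+1})}{\partial(x_1, \ldots, x_N)}\Big|^2\Big)^{\frac{1}{2}}\, d\lambda^N(x).$$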
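Parts (a) and (b) of Example 1.5.24 can be sanity-checked numerically. The Python sketch below is an illustration under hypothetical choices: for (b) it takes the linear function $g(x_1, x_2) = a x_1 + b x_2$ and evaluates $Jf$ through the Binet-Cauchy formula, and for (a) it takes the helix $f(t) = (\cos t, \sin t, t)$ on $[0, 2\pi]$ and compares the integral of $|f'(t)|$ with the length of a fine inscribed polygon.

```python
import math
from itertools import combinations

def det2(r0, r1):
    """Determinant of the 2x2 matrix with rows r0 and r1."""
    return r0[0] * r1[1] - r0[1] * r1[0]

def jacobian_2d(Df):
    """Jf for an M x 2 gradient matrix, via the sum of squared 2x2 subdeterminants."""
    return math.sqrt(sum(det2(Df[i], Df[j]) ** 2
                         for i, j in combinations(range(len(Df)), 2)))

# (b) graph of g(x1, x2) = a*x1 + b*x2: Df has the rows of I_2 followed by grad g = (a, b)
a, b = 2.0, -1.0
Jf_graph = jacobian_2d([[1.0, 0.0], [0.0, 1.0], [a, b]])

# (a) helix f(t) = (cos t, sin t, t) on [0, 2*pi], where |f'(t)| = sqrt(2)
def helix(t):
    return (math.cos(t), math.sin(t), t)

n = 10000
h = 2.0 * math.pi / n
integral_length = h * sum(
    math.sqrt(math.sin(t) ** 2 + math.cos(t) ** 2 + 1.0)
    for t in ((i + 0.5) * h for i in range(n)))
polygon_length = sum(math.dist(helix(i * h), helix((i + 1) * h)) for i in range(n))
```

The inscribed polygon can only underestimate the length of the curve, so `polygon_length` should approach `integral_length` from below as `n` grows.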

In Theorem 1.5.21, we proved that if $f : \mathbb{R}^N \to \mathbb{R}^M$ is Lipschitz continuous, $N \le M$ and $A \subseteq \mathbb{R}^N$ is Lebesgue measurable, then the Jacobian integral
$$\int_A Jf(x)\, d\lambda^N(x)$$
equals the $N$-dimensional Hausdorff area of $f|_A$, given by
$$\int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y).$$
If $N \ge M$, then the Jacobian integral equals the "coarea" of $f|_A$, defined by
$$\int_{\mathbb{R}^M} \mu^{(N - M)}\big(A \cap f^{-1}(\{y\})\big)\, d\lambda^M(y).$$
This result is known as the coarea formula.

THEOREM 1.5.25 (Coarea Formula) If $f : \mathbb{R}^N \to \mathbb{R}^M$ is Lipschitz continuous, $M \le N$ and $A \subseteq \mathbb{R}^N$ is Lebesgue measurable, then
$$\int_A Jf(x)\, d\lambda^N(x) = \int_{\mathbb{R}^M} \mu^{(N - M)}\big(A \cap f^{-1}(\{y\})\big)\, d\lambda^M(y).$$

As was the case with the area formula, the coarea formula leads to a change of variables formula.

THEOREM 1.5.26 (Change of Variables Formula II) If $f : \mathbb{R}^N \to \mathbb{R}^M$ is a Lipschitz continuous function, $N \ge M$ and $g \in L^1(\mathbb{R}^N)$, then
$$\int_{\mathbb{R}^N} g(x) Jf(x)\, d\lambda^N(x) = \int_{\mathbb{R}^M} \Big[\int_{f^{-1}(\{y\})} g(x)\, d\mu^{(N - M)}(x)\Big]\, d\lambda^M(y).$$

PROOF First assume that $g \ge 0$. As in the proof of Theorem 1.5.23, we can find Lebesgue measurable sets $\{A_n\}_{n \ge 1}$ in $\mathbb{R}^N$, such that
$$g = \sum_{n=1}^\infty \frac{1}{n} \chi_{A_n}.$$
Invoking the monotone convergence theorem (see Theorem A.2.10) and Theorem 1.5.25, we have
$$\begin{aligned}
\int_{\mathbb{R}^N} g(x) Jf(x)\, d\lambda^N(x) &= \sum_{n=1}^\infty \frac{1}{n} \int_{A_n} Jf(x)\, d\lambda^N(x) \\
&= \sum_{n=1}^\infty \frac{1}{n} \int_{\mathbb{R}^M} \mu^{(N - M)}\big(A_n \cap f^{-1}(\{y\})\big)\, d\lambda^M(y) \\
&= \int_{\mathbb{R}^M} \Big[\sum_{n=1}^\infty \frac{1}{n} \mu^{(N - M)}\big(A_n \cap f^{-1}(\{y\})\big)\Big]\, d\lambda^M(y) \\
&= \int_{\mathbb{R}^M} \Big[\int_{f^{-1}(\{y\})} g(x)\, d\mu^{(N - M)}(x)\Big]\, d\lambda^M(y).
\end{aligned}$$
For the general case let $g = g^+ - g^-$ and apply the first part to each component function $g^+ \ge 0$ and $g^- \ge 0$.

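EXAMPLE 1.5.27
(a) Let $N \ge 1$, $M = 1$. Suppose that $f : \mathbb{R}^N \to \mathbb{R}$ is defined by $f(x) = \|x\|_{\mathbb{R}^N}$ and $g \in L^1(\mathbb{R}^N)$. Then, we have
$$Df(x) = \frac{x}{\|x\|_{\mathbb{R}^N}} \qquad \text{and} \qquad Jf(x) = 1 \qquad \forall\, x \in \mathbb{R}^N \setminus \{0\}.$$
From Theorem 1.5.26, we have
$$\int_{\mathbb{R}^N} g(x)\, d\lambda^N(x) = \int_0^{+\infty} \Big[\int_{\partial B_r(0)} g(x)\, d\mu^{(N-1)}(x)\Big]\, d\lambda^1(r) = \int_0^{+\infty} r^{N-1} \Big[\int_{\partial B_1(0)} g(rx)\, d\mu^{(N-1)}(x)\Big]\, d\lambda^1(r). \tag{1.55}$$
In particular if $g = \chi_{B_1(0)}$, from (1.55), it follows that
$$a(N) = \frac{1}{N} \mu^{(N-1)}\big(\partial B_1(0)\big),$$
where $a(N) \stackrel{df}{=} \dfrac{\pi^{N/2}}{\left(\frac{N}{2}\right)!}$ is the volume of the unit ball in $\mathbb{R}^N$.
(b) Let $N \ge 1$ and $M = 1$. Suppose that $f : \mathbb{R}^N \to \mathbb{R}$ is a Lipschitz continuous function. Then $Jf = \|Df\|_{\mathbb{R}^N}$ and so from Theorem 1.5.25, we have that
$$\int_{\mathbb{R}^N} \|Df(x)\|_{\mathbb{R}^N}\, d\lambda^N(x) = \int_{-\infty}^{+\infty} \mu^{(N-1)}\big(\{f = t\}\big)\, d\lambda^1(t).$$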
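Formula (1.55) and the relation $a(N) = \frac{1}{N} \mu^{(N-1)}(\partial B_1(0))$ can be checked numerically. The Python sketch below is an illustration, not part of the text. It assumes two standard facts that the text does not state: the closed form $2\pi^{N/2}/\Gamma(N/2)$ for the area of the unit sphere, and the value $\pi^{N/2}$ of the integral of $e^{-\|x\|^2}$ over $\mathbb{R}^N$. The factorial $(N/2)!$ is read as $\Gamma(N/2 + 1)$; the truncation radius and grid size are arbitrary.

```python
import math

def ball_volume(N):
    """a(N) = pi^(N/2) / Gamma(N/2 + 1), the volume of the unit ball in R^N."""
    return math.pi ** (N / 2) / math.gamma(N / 2 + 1)

def sphere_area(N):
    """Area of the unit sphere in R^N, assumed closed form 2 pi^(N/2) / Gamma(N/2)."""
    return 2.0 * math.pi ** (N / 2) / math.gamma(N / 2)

def gaussian_integral_polar(N, r_max=12.0, n=20000):
    """Right-hand side of (1.55) for the radial function g(x) = exp(-|x|^2),
    with the radial integral truncated at r_max and computed by a midpoint rule."""
    h = r_max / n
    radial = h * sum(((i + 0.5) * h) ** (N - 1) * math.exp(-((i + 0.5) * h) ** 2)
                     for i in range(n))
    return sphere_area(N) * radial

ratios = [N * ball_volume(N) / sphere_area(N) for N in range(1, 8)]   # each should be 1
gauss_2 = gaussian_integral_polar(2)   # should be close to pi
gauss_3 = gaussian_integral_polar(3)   # should be close to pi ** 1.5
```

For a radial integrand the inner integral over $\partial B_1(0)$ in (1.55) is constant, so the $N$-dimensional integral reduces to the sphere area times a one-dimensional integral in $r$.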

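We conclude this section with some additional useful results involving the multiplicity function
$$y \mapsto \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)$$
of a Lipschitz continuous function $f$.

PROPOSITION 1.5.28 If $X, Y$ are separable metric spaces, $\xi$ is an outer measure on $Y$, $f : X \to Y$ is a map such that for every Borel set $B \subseteq X$, the set $f(B)$ is $\xi$-measurable, $\vartheta : 2^X \to \overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$ is the outer measure on $X$, defined by
$$\vartheta(A) \stackrel{df}{=} \xi\big(f(A)\big) \qquad \forall\, A \subseteq X$$
and $\widehat{\vartheta}$ is the Borel measure resulting from $\vartheta$ by the Carathéodory construction, then for every Borel set $B \subseteq X$, we have
$$\widehat{\vartheta}(B) = \int_Y \mu^{(0)}\big(B \cap f^{-1}(\{y\})\big)\, d\xi(y).$$

PROOF Let $\{\mathcal{B}_k\}_{k \ge 1}$ be a sequence of Borel partitions of $B$, such that every member of $\mathcal{B}_k$ is the union of some subcollection in $\mathcal{B}_{k+1}$ and
$$\sup_{A \in \mathcal{B}_k} \delta(A) \to 0 \qquad \text{as } k \to +\infty,$$
i.e., $\mathcal{B} = \bigcup_{k=1}^\infty \mathcal{B}_k$ is a Vitali cover of $B$. Note that if
$$h_k(y) \stackrel{df}{=} \sum_{A \in \mathcal{B}_k} \chi_{f(A)}(y) \qquad \forall\, k \ge 1,\ y \in Y,$$
then
$$h_k(y) \nearrow \mu^{(0)}\big(B \cap f^{-1}(\{y\})\big) \qquad \text{as } k \to +\infty.$$
So by the monotone convergence theorem (see Theorem A.2.10), we have that
$$\widehat{\vartheta}(B) = \lim_{k \to +\infty} \sum_{A \in \mathcal{B}_k} \xi\big(f(A)\big) = \lim_{k \to +\infty} \int_Y \sum_{A \in \mathcal{B}_k} \chi_{f(A)}(y)\, d\xi(y) = \int_Y \mu^{(0)}\big(B \cap f^{-1}(\{y\})\big)\, d\xi(y).$$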

REMARK 1.5.29 Recall that if $X$ is a separable metric space, $Y$ is a Hausdorff topological space, $f : X \to Y$ is a continuous map and $\xi$ is a Borel outer measure on $Y$, then for every Borel set $B \subseteq X$, the set $f(B)$ is $\xi$-measurable. This fact is essentially the starting point for the theory of Souslin sets (see Definition A.2.29(b)).

PROPOSITION 1.5.30 If $X$ is a Polish space (see Definition A.2.29(a)), $Y$ is a separable metric space, $f : X \to Y$ is a Lipschitz continuous function, $0 \le k < +\infty$ and $A \subseteq X$ is a Borel set, then
$$\int_Y \mu^{(0)}\big(A \cap f^{-1}(\{y\})\big)\, d\mu^{(k)}(y) \le \big(\operatorname{Lip}(f)\big)^k \mu^{(k)}(A).$$

PROOF From Proposition 1.3.25, we know that
$$\vartheta(A) \stackrel{df}{=} \mu^{(k)}\big(f(A)\big) \le \big(\operatorname{Lip}(f)\big)^k \mu^{(k)}(A) \qquad \forall\, A \subseteq X.$$
Then apply Proposition 1.5.28 on the outer measure $\vartheta$.

PROPOSITION 1.5.31 If $X$ is a separable metric space, then for every connected set $C \subseteq X$, we have
$$\delta(C) \le \mu^{(1)}(C).$$

PROOF Clearly we may assume that $\mu^{(1)}(C) < +\infty$, or otherwise the inequality is obvious. Since $\mu^{(1)}$ is a Borel measure, we can find a Borel set $B \supseteq C$, such that $\mu^{(1)}(B) = \mu^{(1)}(C)$. Let $u, v \in C$ and let $f : X \to \mathbb{R}$ be defined by
$$f(x) \stackrel{df}{=} d_X(x, u) \qquad \forall\, x \in X,$$
where $d_X$ is the metric in $X$. The function $f$ is Lipschitz continuous with $\operatorname{Lip}(f) = 1$ and, because $C$ is connected, $f(C)$ is an interval containing $f(u) = 0$ and $f(v) = d_X(v, u)$. So from Proposition 1.5.30, we have that
$$\mu^{(1)}(C) = \mu^{(1)}(B) \ge \int_{\mathbb{R}} \mu^{(0)}\big(B \cap f^{-1}(\{y\})\big)\, d\mu^{(1)}(y) \ge \mu^{(1)}\big(f(B)\big) \ge d_X(v, u).$$
Since $u, v \in C$ were arbitrary, we conclude that $\delta(C) \le \mu^{(1)}(C)$.

1.6 Capacity

The notion of capacity plays a crucial role in the study of local properties of Sobolev functions. In a sense it takes the place of measure and it is used to characterize the smallness of subsets in $\mathbb{R}^N$. For this reason, it is indispensable in the study of the continuity properties of Sobolev functions. We shall deal with these issues in Section 2.7. Moreover, the concept of capacity enters the study of obstacle problems. In this section we develop the theory of the so-called "$p$-capacity" (variational capacity). The development of the theory of the $p$-capacity requires knowledge of the definition of Sobolev spaces and some results from their theory. To make this section self-contained, we state here the necessary material from the theory of Sobolev spaces, but we postpone the proofs until Section 2.4, where we conduct a more systematic study of Sobolev spaces.

DEFINITION 1.6.1 Let $U \subseteq \mathbb{R}^N$ be a nonempty open set. By $z = (z_k)_{k=1}^N$, we denote a generic point of $U$.
(a) Suppose that $f \in L^1_{loc}(U)$. We say that $g_k \in L^1_{loc}(U)$ is the distributional (or weak) partial derivative of $f$ with respect to $z_k$ (with $k \in \{1, \ldots, N\}$) in $U$, if
$$\int_U f \frac{\partial \varphi}{\partial z_k}\, dz = -\int_U g_k \varphi\, dz \qquad \forall\, \varphi \in C_c^\infty(U)$$
(here $C_c^\infty(U)$ is the space of all $C^\infty(U)$-functions with compact supports, i.e., the space of test functions). We write
$$g_k = \frac{\partial f}{\partial z_k} = D_k f \qquad \forall\, k \in \{1, \ldots, N\}.$$
If all of the distributional (weak) partial derivatives $D_k f$ exist for $k = 1, \ldots, N$, then $Df \stackrel{df}{=} (D_k f)_{k=1}^N$ is the distributional (weak) derivative of $f$.
(b) Let $p \in [1, +\infty]$. We define the Sobolev space $W^{1,p}(U)$, by
$$W^{1,p}(U) \stackrel{df}{=} \big\{f \in L^p(U) : Df \in L^p(U; \mathbb{R}^N)\big\}.$$
Also we define
$$W^{1,p}_{loc}(U) \stackrel{df}{=} \big\{f : U \to \mathbb{R} : f|_V \in W^{1,p}(V) \text{ for all } V \subset\subset U\big\},$$
where $V \subset\subset U$ means that $V$ is a bounded open subset of $U$ such that $\overline{V} \subseteq U$. The elements of $W^{1,p}_{loc}(U)$ are called Sobolev functions.
(c) Let $p \in [1, N)$. We define the critical Sobolev exponent
$$p^* \stackrel{df}{=} \frac{Np}{N - p}$$
and the space
$$K^p \stackrel{df}{=} \big\{f \in L^{p^*}(\mathbb{R}^N) : f \ge 0,\ Df \in L^p(\mathbb{R}^N; \mathbb{R}^N)\big\}.$$
(d) Let $p \in [1, N)$ and $C \subseteq \mathbb{R}^N$. The $p$-capacity of $C$ is defined by
$$\operatorname{cap}_p(C) \stackrel{df}{=} \inf\big\{\|Df\|_p^p : f \in K^p,\ C \subseteq \operatorname{int}\{f \ge 1\}\big\}.$$

REMARK 1.6.2
(a) Clearly if the distributional (weak) partial derivative $D_k f$ exists, it is uniquely defined modulo a Lebesgue-null set in $\mathbb{R}^N$.
(b) If $f \in W^{1,p}(U)$, we define
$$\|f\|_{1,p} \stackrel{df}{=} \big(\|f\|_p^p + \|Df\|_p^p\big)^{\frac{1}{p}} \qquad \forall\, p \in [1, +\infty)$$
and
$$\|f\|_{1,\infty} \stackrel{df}{=} \|f\|_\infty + \|Df\|_\infty.$$
These are norms in $W^{1,p}(U)$ for $p \in [1, +\infty)$ and $W^{1,\infty}(U)$ respectively. Normed this way the Sobolev spaces are Banach spaces.
(c) Although $p < p^*$, we do not have $L^{p^*}(\mathbb{R}^N) \subseteq L^p(\mathbb{R}^N)$ and so we cannot say that $K^p \subseteq W^{1,p}(\mathbb{R}^N)$.
(d) If $K \subseteq \mathbb{R}^N$ is a compact set, then by using standard regularization (via mollification; see also Definition 2.4.10) of the characteristic function $\chi_K$, we can check that
$$\operatorname{cap}_p(K) = \inf\big\{\|Df\|_p^p : f \in C_c^\infty(\mathbb{R}^N),\ f \ge \chi_K\big\}.$$
(e) Evidently, if $C_1 \subseteq C_2$, then $\operatorname{cap}_p(C_1) \le \operatorname{cap}_p(C_2)$ (monotonicity).
(f) Because $p < p^*$, the elements of $K^p$ belong to $W^{1,p}_{loc}(\mathbb{R}^N)$, i.e., they are Sobolev functions.

As we already mentioned, for easy reference, we present four results from the theory of Sobolev spaces, which will be used in the sequel. The proofs of these results will be given in Section 2.4.

PROPOSITION 1.6.3 If $U \subseteq \mathbb{R}^N$ is open, $p \in [1, +\infty)$ and $f \in W^{1,p}(U)$, then we can find a sequence $\{f_n\}_{n \ge 1} \subseteq W^{1,p}(U) \cap C^\infty(U)$, such that
$$f_n \to f \qquad \text{in } W^{1,p}(U).$$

PROPOSITION 1.6.4 Let $U \subseteq \mathbb{R}^N$ be an open set and let $p \in [1, +\infty)$.
(a) If $f, g \in W^{1,p}(U)$, then
$$h_0 \stackrel{df}{=} \min\{f, g\} \in W^{1,p}(U), \qquad h_1 \stackrel{df}{=} \max\{f, g\} \in W^{1,p}(U)$$
and
$$Dh_0(x) = \begin{cases} Df(x) & \text{for } \lambda^N\text{-a.a. } x \in \{f \le g\}, \\ Dg(x) & \text{for } \lambda^N\text{-a.a. } x \in \{f \ge g\}, \end{cases} \qquad Dh_1(x) = \begin{cases} Dg(x) & \text{for } \lambda^N\text{-a.a. } x \in \{f \le g\}, \\ Df(x) & \text{for } \lambda^N\text{-a.a. } x \in \{f \ge g\}. \end{cases}$$

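In particular $f^+, f^-, |f| \in W^{1,p}(U)$.
(b) If $\{f_n\}_{n \ge 1} \subseteq W^{1,p}(U)$ is a sequence with
$$h \stackrel{df}{=} \sup_{n \ge 1} f_n \in W^{1,p}(U) \qquad \text{and} \qquad u \stackrel{df}{=} \sup_{n \ge 1} \|Df_n\|_{\mathbb{R}^N} \in L^p(U),$$
then
$$\|Dh(z)\|_{\mathbb{R}^N} \le u(z) \qquad \text{for } \lambda^N\text{-a.a. } z \in U.$$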

PROPOSITION 1.6.5 If $U \subseteq \mathbb{R}^N$ is a bounded open set with a $C^1$-boundary and $p \in [1, +\infty)$, then there exists $E \in \mathcal{L}\big(W^{1,p}(U), W^{1,p}(\mathbb{R}^N)\big)$, such that $E(f)|_U = f$ for all $f \in W^{1,p}(U)$.

REMARK 1.6.6 The function $E(f)$ is called an extension of $f$ on $\mathbb{R}^N$.

Finally we mention two basic inequalities. The first is known as the "Sobolev inequality" (or "Sobolev-Nirenberg-Gagliardo inequality") and the second is known as the "Poincaré-Wirtinger inequality."

THEOREM 1.6.7 (Sobolev-Nirenberg-Gagliardo Inequality) If $p \in [1, N)$, then there exists $C = C(N, p) > 0$, such that
$$\|f\|_{p^*} \le C \|Df\|_p \qquad \forall\, f \in W^{1,p}(\mathbb{R}^N).$$

THEOREM 1.6.8 (Poincaré-Wirtinger Inequality) If $U \subseteq \mathbb{R}^N$ is a bounded, connected and open set (i.e., a bounded domain in $\mathbb{R}^N$) with a $C^1$-boundary and $p \in [1, +\infty)$, then there exists $C_0 = C_0(N, p) > 0$, such that
$$\|f - \overline{f}\|_p \le C_0 \|Df\|_p \qquad \forall\, f \in W^{1,p}(U),$$
with
$$\overline{f} = \frac{1}{\lambda^N(U)} \int_U f(z)\, dz.$$
If $p < N$, then
$$\|f - \overline{f}\|_{p^*} \le C_0 \|Df\|_p.$$

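A Sobolev inequality is also valid for the elements in $K^p$.

PROPOSITION 1.6.9 If $p \in [1, N)$, then there exists $C = C(N, p) > 0$, such that
$$\|f\|_{p^*} \le C \|Df\|_p \qquad \forall\, f \in K^p.$$

PROOF First we produce a sequence $\{\varphi_n\}_{n \ge 1} \subseteq C_c^\infty(\mathbb{R}^N)$, such that
$$0 \le \varphi_n \le 1 \quad \forall\, n \ge 1, \qquad \varphi_n(z) \nearrow 1 \quad \text{for a.a. } z \in \mathbb{R}^N, \qquad \varphi_n(z) = 1 \quad \forall\, \|z\|_{\mathbb{R}^N} < n$$
and
$$\sup_{n \ge 1} \|D\varphi_n\|_{\mathbb{R}^N} < +\infty.$$
To this end, let $\varphi \in C_c^\infty\big(B_2(0)\big)$ be radially nonincreasing and such that
$$0 \le \varphi \le 1 \qquad \text{and} \qquad \varphi|_{B_1(0)} = 1.$$
Let us set
$$\varphi_n(z) = \varphi\Big(\frac{z}{n}\Big) \qquad \forall\, z \in \mathbb{R}^N,\ n \ge 1.$$
This is the desired sequence. Note that
$$\varphi_n f \in W^{1,p}(\mathbb{R}^N)$$
(recall that $p < p^*$ and use the product rule). Invoking the Sobolev-Nirenberg-Gagliardo inequality (see Theorem 1.6.7), we can find $C > 0$, such that for all $n \ge 1$, we have
$$\|\varphi_n f\|_{p^*} \le C \|D(\varphi_n f)\|_p \le C \|Df\|_p + C \|f D\varphi_n\|_p;$$
thus by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
$$\|f\|_{p^*} \le C \|Df\|_p + C \liminf_{n \to +\infty} \|f D\varphi_n\|_p. \tag{1.56}$$
Using Hölder's inequality (see Theorem A.2.27) (as $\frac{p}{p^*} + \frac{1}{(p^*/p)'} = 1$), the fact that $\varphi_n|_{B_n(0)} \equiv 1$ and since $p\big(\frac{p^*}{p}\big)' = N$ and $\sup_{n \ge 1} \int_{\mathbb{R}^N} \|D\varphi_n(z)\|_{\mathbb{R}^N}^N\, dz < +\infty$, for every $n \ge 1$, we have
$$\begin{aligned}
\|f D\varphi_n\|_p^p &= \int_{\mathbb{R}^N} |f(z) D\varphi_n(z)|^p\, dz = \int_{\{\|z\|_{\mathbb{R}^N} \ge n\}} |f(z) D\varphi_n(z)|^p\, dz \\
&\le \Big(\int_{\{\|z\|_{\mathbb{R}^N} \ge n\}} |f(z)|^{p^*}\, dz\Big)^{\frac{p}{p^*}} \Big(\int_{\mathbb{R}^N} \|D\varphi_n(z)\|_{\mathbb{R}^N}^{p (p^*/p)'}\, dz\Big)^{1 - \frac{p}{p^*}} \\
&\le C_1 \Big(\int_{\{\|z\|_{\mathbb{R}^N} \ge n\}} |f(z)|^{p^*}\, dz\Big)^{\frac{p}{p^*}},
\end{aligned}$$
for some $C_1 > 0$. Since $|f|^{p^*} \in L^1(\mathbb{R}^N)$, we have
$$\lim_{n \to +\infty} \|f D\varphi_n\|_p^p \le C_1 \lim_{n \to +\infty} \Big(\int_{\{\|z\|_{\mathbb{R}^N} \ge n\}} |f(z)|^{p^*}\, dz\Big)^{\frac{p}{p^*}} = 0.$$
Using this in (1.56), we conclude that $\|f\|_{p^*} \le C \|Df\|_p$.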
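Two bookkeeping facts carry the Hölder step in the proof above: the conjugate exponent of $p^*/p$ is $N/p$, so that $p\,(p^*/p)' = N$, and the integral of $\|D\varphi_n\|^N$ is bounded independently of $n$ because $\|D\varphi_n\| \le M/n$ on the annulus $\{n \le \|z\| \le 2n\}$, whose volume grows like $n^N$. The Python sketch below checks both with exact rational arithmetic. It is an illustration only: the sample values of $N$, $p$ and of the bound $M$ for $\|D\varphi\|$ are hypothetical, and the common factor $a(N)$ is divided out.

```python
from fractions import Fraction

def critical_exponent(N, p):
    """p* = N p / (N - p) for 1 <= p < N."""
    return Fraction(N) * p / (Fraction(N) - p)

def conjugate(q):
    """Hoelder conjugate q' of q > 1, defined by 1/q + 1/q' = 1."""
    return q / (q - 1)

def cutoff_gradient_bound(N, n, M):
    """(M/n)^N * ((2n)^N - n^N): an upper bound for the integral of |D phi_n|^N
    divided by a(N), when |D phi| <= M and D phi_n vanishes off the annulus n <= |z| <= 2n."""
    return (Fraction(M) / n) ** N * (Fraction(2 * n) ** N - Fraction(n) ** N)

N, p = 5, Fraction(3, 2)
p_star = critical_exponent(N, p)
q_conj = conjugate(p_star / p)                     # the exponent (p*/p)'
bounds = [cutoff_gradient_bound(N, n, 3) for n in (1, 10, 1000)]
```

The bound evaluates to $M^N (2^N - 1)$ for every $n$, which is the scale invariance that makes the constant $C_1$ independent of $n$.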

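We use this inequality to establish that the $p$-capacity $\operatorname{cap}_p$ is an outer measure on $\mathbb{R}^N$.

THEOREM 1.6.10 If $p \in [1, N)$, then $\operatorname{cap}_p$ is an outer measure on $\mathbb{R}^N$.

PROOF Clearly $\operatorname{cap}_p(\emptyset) = 0$ and $\operatorname{cap}_p$ is monotone (see Remark 1.6.2(e)). So it remains to show that if $\{C_n\}_{n \ge 1}$ is a sequence of subsets of $\mathbb{R}^N$ and $C = \bigcup_{n=1}^\infty C_n$, then
$$\operatorname{cap}_p(C) \le \sum_{n=1}^\infty \operatorname{cap}_p(C_n)$$
(see Definition 1.1.1). We assume that
$$\sum_{n=1}^\infty \operatorname{cap}_p(C_n) < +\infty,$$
or otherwise the inequality is obvious. According to Definition 1.6.1(d), for a given $\varepsilon > 0$, we can find $f_n \in K^p$, such that
$$C_n \subseteq \operatorname{int}\{f_n \ge 1\} \qquad \text{and} \qquad \|Df_n\|_p^p \le \operatorname{cap}_p(C_n) + \frac{\varepsilon}{2^n} \qquad \forall\, n \ge 1. \tag{1.57}$$
Let $h = \sup_{n \ge 1} f_n$. Evidently $C \subseteq \operatorname{int}\{h \ge 1\}$. Also, using the monotone convergence theorem (see Theorem A.2.10), Proposition 1.6.9 and (1.57), we have
$$\begin{aligned}
\int_{\mathbb{R}^N} h(z)^{p^*}\, dz &= \int_{\mathbb{R}^N} \sup_{n \ge 1} f_n(z)^{p^*}\, dz \le \sum_{n=1}^\infty \int_{\mathbb{R}^N} f_n(z)^{p^*}\, dz \\
&\le C \sum_{n=1}^\infty \|Df_n\|_p^{p^*} \le C \sum_{n=1}^\infty \Big(\operatorname{cap}_p(C_n) + \frac{\varepsilon}{2^n}\Big)^{\frac{p^*}{p}} \\
&\le C_1 \Big[\sum_{n=1}^\infty \Big(\operatorname{cap}_p(C_n) + \frac{\varepsilon}{2^n}\Big)\Big]^{\frac{p^*}{p}} < +\infty,
\end{aligned}$$
for some $C_1 > 0$. So $h \in L^{p^*}(\mathbb{R}^N)$ and, as $p < p^*$, we have $h \in L^p_{loc}(\mathbb{R}^N)$. Also if $u = \sup_{n \ge 1} \|Df_n\|_{\mathbb{R}^N}$, from (1.57), we have
$$\int_{\mathbb{R}^N} |u(z)|^p\, dz \le \sum_{n=1}^\infty \int_{\mathbb{R}^N} \|Df_n(z)\|_{\mathbb{R}^N}^p\, dz < +\infty,$$
so $u \in L^p(\mathbb{R}^N)$. By Proposition 1.6.4(b), we have that $h \in W^{1,p}_{loc}(\mathbb{R}^N)$ and $\|Dh(z)\|_{\mathbb{R}^N} \le u(z)$ for almost all $z \in \mathbb{R}^N$. Therefore $Dh \in L^p(\mathbb{R}^N; \mathbb{R}^N)$ and so $h \in K^p$. By virtue of Definition 1.6.1(d), the monotone convergence theorem (see Theorem A.2.10) and (1.57), we have
$$\operatorname{cap}_p(C) \le \int_{\mathbb{R}^N} \|Dh(z)\|_{\mathbb{R}^N}^p\, dz \le \int_{\mathbb{R}^N} u(z)^p\, dz \le \sum_{n=1}^\infty \int_{\mathbb{R}^N} \|Df_n(z)\|_{\mathbb{R}^N}^p\, dz \le \sum_{n=1}^\infty \operatorname{cap}_p(C_n) + \varepsilon.$$
Let $\varepsilon \searrow 0$, to conclude that
$$\operatorname{cap}_p(C) \le \sum_{n=1}^\infty \operatorname{cap}_p(C_n).$$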

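In the next theorem, we have collected the basic properties of the $p$-capacity $\operatorname{cap}_p$.

THEOREM 1.6.11 If $p \in [1, N)$ and $A, B \subseteq \mathbb{R}^N$, then
(a) $\operatorname{cap}_p(A) = \inf\big\{\operatorname{cap}_p(U) : A \subseteq U,\ U \text{ is open}\big\}$.
(b) $\operatorname{cap}_p(\xi A) = \xi^{N - p} \operatorname{cap}_p(A)$ for all $\xi > 0$.
(c) $\operatorname{cap}_p\big(L(A)\big) = \operatorname{cap}_p(A)$ for every affine isometry $L : \mathbb{R}^N \to \mathbb{R}^N$.
(d) $\operatorname{cap}_p(A) \le C \mu^{(N - p)}(A)$ for some $C = C(N, p) > 0$.
(e) $\lambda^N(A) \le C \big(\operatorname{cap}_p(A)\big)^{\frac{N}{N - p}}$ for some $C = C(N, p) > 0$.
(f) $\operatorname{cap}_p(A \cap B) + \operatorname{cap}_p(A \cup B) \le \operatorname{cap}_p(A) + \operatorname{cap}_p(B)$.
(g) if $\{A_n\}_{n \ge 1}$ is an increasing sequence (i.e., $A_n \subseteq A_{n+1}$ for all $n \ge 1$), then
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) = \operatorname{cap}_p\Big(\bigcup_{n=1}^\infty A_n\Big).$$
(h) if $\{A_n\}_{n \ge 1}$ is a decreasing sequence (i.e., $A_n \supseteq A_{n+1}$ for all $n \ge 1$) of compact sets in $\mathbb{R}^N$, then
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) = \operatorname{cap}_p\Big(\bigcap_{n=1}^\infty A_n\Big).$$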
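The scaling law in (b) rests on the identity $\|Dh\|_p^p = \xi^{N - p} \|Df\|_p^p$ for $h(z) = f(z/\xi)$. The Python sketch below checks this identity numerically for one radial competitor. It is an illustration under hypothetical choices (the profile $\varphi(r) = \min\{1, \max\{0, 2 - r\}\}$, the values of $N$, $p$, $\xi$ and the grid), and it uses the assumed closed form $2\pi^{N/2}/\Gamma(N/2)$ for the area of the unit sphere. It shows only how the energy of a single admissible function scales; the capacity itself is an infimum over all such functions.

```python
import math

def radial_energy(dphi, N, p, r_max, n=100000):
    """||Df||_p^p for f(z) = phi(|z|): sphere area times the integral of
    |phi'(r)|^p r^(N-1) over (0, r_max), computed by a midpoint rule."""
    surface = 2.0 * math.pi ** (N / 2) / math.gamma(N / 2)
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += abs(dphi(r)) ** p * r ** (N - 1)
    return surface * h * total

def dphi(r):
    """Derivative of the profile phi(r) = min(1, max(0, 2 - r))."""
    return -1.0 if 1.0 < r < 2.0 else 0.0

N, p, xi = 3, 2.0, 1.5
energy_f = radial_energy(dphi, N, p, r_max=4.0)
# h(z) = f(z / xi) has radial profile phi(r / xi), whose derivative is phi'(r / xi) / xi
energy_h = radial_energy(lambda r: dphi(r / xi) / xi, N, p, r_max=4.0)
ratio = energy_h / energy_f          # should be close to xi ** (N - p)
```

With these sample values the ratio should be close to $\xi^{N - p} = 1.5$, in line with the change of variables $z \mapsto z/\xi$ used in the proof of (b).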

88

Nonlinear Analysis

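PROOF (a) From the monotonicity of the p-capacity, we have
$$\operatorname{cap}_p(A) \le \inf \{ \operatorname{cap}_p(U) : A \subseteq U,\ U \text{ is open} \}. \tag{1.58}$$
For a given $\varepsilon > 0$, we can find $f \in K^p$, such that $A \subseteq \operatorname{int}\{f \ge 1\}$ and
$$\|Df\|_p^p \le \operatorname{cap}_p(A) + \varepsilon. \tag{1.59}$$
Let $U = \operatorname{int}\{f \ge 1\}$. Then $U$ is open, $A \subseteq U$ and from Definition 1.6.1(d) and (1.59), we have
$$\operatorname{cap}_p(U) \le \|Df\|_p^p \le \operatorname{cap}_p(A) + \varepsilon.$$
Let $\varepsilon \searrow 0$, to obtain that $\inf \{ \operatorname{cap}_p(U) : A \subseteq U,\ U \text{ is open} \} \le \operatorname{cap}_p(A)$. Combining this with (1.58), we conclude that
$$\operatorname{cap}_p(A) = \inf \{ \operatorname{cap}_p(U) : A \subseteq U,\ U \text{ is open} \}.$$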

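(b) Let $\varepsilon > 0$ be given. Then we can find $f \in K^p$, such that $A \subseteq \operatorname{int}\{f \ge 1\}$ and
$$\|Df\|_p^p \le \operatorname{cap}_p(A) + \varepsilon.$$
Let $\xi > 0$ and $h(z) \stackrel{df}{=} f\big(\tfrac{z}{\xi}\big)$. We have $h \in K^p$ and $\xi A \subseteq \operatorname{int}\{h \ge 1\}$. So
$$\operatorname{cap}_p(\xi A) \le \|Dh\|_p^p = \xi^{N-p} \|Df\|_p^p \le \xi^{N-p} \big( \operatorname{cap}_p(A) + \varepsilon \big).$$
Let $\varepsilon \searrow 0$ to obtain
$$\operatorname{cap}_p(\xi A) \le \xi^{N-p} \operatorname{cap}_p(A). \tag{1.60}$$
Using (1.60) with $\tfrac{1}{\xi}$ in place of $\xi$, we see that
$$\operatorname{cap}_p(A) = \operatorname{cap}_p\Big( \frac{1}{\xi} (\xi A) \Big) \le \frac{1}{\xi^{N-p}} \operatorname{cap}_p(\xi A),$$
so
$$\xi^{N-p} \operatorname{cap}_p(A) \le \operatorname{cap}_p(\xi A). \tag{1.61}$$
From (1.60) and (1.61), we conclude that
$$\operatorname{cap}_p(\xi A) = \xi^{N-p} \operatorname{cap}_p(A) \qquad \forall\ \xi > 0.$$
(c) The proof is similar to that of (b).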
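The scaling identity $\|Dh\|_p^p = \xi^{N-p}\|Df\|_p^p$ used in part (b) comes from a change of variables. The following is a short sketch, assuming only the chain rule for $h(z) = f(z/\xi)$ and the substitution $y = z/\xi$ (the variable $y$ is introduced here purely for illustration):

```latex
\begin{align*}
Dh(z) &= \frac{1}{\xi}\, Df\Big(\frac{z}{\xi}\Big), \\
\|Dh\|_p^p
  &= \int_{\mathbb{R}^N} \xi^{-p}\, \Big\| Df\Big(\frac{z}{\xi}\Big) \Big\|_{\mathbb{R}^N}^p \, dz
   = \xi^{-p} \int_{\mathbb{R}^N} \|Df(y)\|_{\mathbb{R}^N}^p \, \xi^{N} \, dy
   = \xi^{N-p}\, \|Df\|_p^p .
\end{align*}
```

The factor $\xi^{-p}$ comes from differentiating and the factor $\xi^{N}$ from the Jacobian of $z = \xi y$, which is why the exponent $N - p$ appears in Theorem 1.6.11(b).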

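(d) Let $\delta > 0$ and suppose that $A \subseteq \bigcup_{n=1}^{\infty} \overline{B}_{r_n}(x_n)$, with $2 r_n < \delta$ for all $n \ge 1$. Since $\operatorname{cap}_p$ is an outer measure (see Theorem 1.6.10), using also (b) and (c) as $\overline{B}_{r_n}(x_n) = x_n + r_n \overline{B}_1(0)$, we have
$$\operatorname{cap}_p(A) \le \sum_{n=1}^{\infty} \operatorname{cap}_p\big( \overline{B}_{r_n}(x_n) \big) = \operatorname{cap}_p\big( \overline{B}_1(0) \big) \sum_{n=1}^{\infty} r_n^{N-p},$$
so
$$\operatorname{cap}_p(A) \le C \mu^{(N-p)}(A).$$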

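(e) Let $\varepsilon > 0$ and select $f \in K^p$, such that $A \subseteq \operatorname{int}\{f \ge 1\}$ and
$$\|Df\|_p^p \le \operatorname{cap}_p(A) + \varepsilon. \tag{1.62}$$
Using Proposition 1.6.9 and (1.62), we obtain
$$\lambda^N(A)^{\frac{1}{p^*}} \le \|f\|_{p^*} \le C \|Df\|_p \le C \big( \operatorname{cap}_p(A) + \varepsilon \big)^{\frac{1}{p}},$$
for some $C > 0$ and so
$$\lambda^N(A) \le C \operatorname{cap}_p(A)^{\frac{p^*}{p}} = C \operatorname{cap}_p(A)^{\frac{N}{N-p}},$$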

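for some $C > 0$.
(f) Let $\varepsilon > 0$ and select $f, g \in K^p$, such that $A \subseteq \operatorname{int}\{f \ge 1\}$, $B \subseteq \operatorname{int}\{g \ge 1\}$ and
$$\|Df\|_p^p \le \operatorname{cap}_p(A) + \varepsilon, \qquad \|Dg\|_p^p \le \operatorname{cap}_p(B) + \varepsilon. \tag{1.63}$$
Let
$$h_0 \stackrel{df}{=} \min\{f, g\} \quad \text{and} \quad h_1 \stackrel{df}{=} \max\{f, g\}.$$
Using Proposition 1.6.4(a), we see that $h_0, h_1 \in K^p$. Also, we have
$$\|Dh_0(z)\|_{\mathbb{R}^N}^p + \|Dh_1(z)\|_{\mathbb{R}^N}^p = \|Df(z)\|_{\mathbb{R}^N}^p + \|Dg(z)\|_{\mathbb{R}^N}^p \quad \text{for } \lambda^N\text{-a.a. } z \in \mathbb{R}^N \tag{1.64}$$
and
$$A \cap B \subseteq \operatorname{int}\{h_0 \ge 1\}, \qquad A \cup B \subseteq \operatorname{int}\{h_1 \ge 1\}. \tag{1.65}$$
Therefore from (1.65), (1.64), (1.63) and since $h_0, h_1 \in K^p$, we obtain
$$\operatorname{cap}_p(A \cap B) + \operatorname{cap}_p(A \cup B) \le \|Dh_0\|_p^p + \|Dh_1\|_p^p = \|Df\|_p^p + \|Dg\|_p^p \le \operatorname{cap}_p(A) + \operatorname{cap}_p(B) + 2\varepsilon,$$
so, letting $\varepsilon \searrow 0$, $\operatorname{cap}_p(A \cap B) + \operatorname{cap}_p(A \cup B) \le \operatorname{cap}_p(A) + \operatorname{cap}_p(B)$.
(g) We do the proof for the case $p \in (1, N)$. For the case $p = 1$ we refer to Federer & Ziemer (1972). By virtue of the monotonicity property, we have
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) \le \operatorname{cap}_p\Big( \bigcup_{n=1}^{\infty} A_n \Big). \tag{1.66}$$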

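Suppose that
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) < +\infty,$$
as otherwise, by (1.66), there is nothing to prove. Let $\varepsilon > 0$ and for every $n \ge 1$ let us select $f_n \in K^p$, such that
$$A_n \subseteq \operatorname{int}\{f_n \ge 1\} \quad \text{and} \quad \|Df_n\|_p^p \le \operatorname{cap}_p(A_n) + \frac{\varepsilon}{2^n}. \tag{1.67}$$
Let us set $h_0 \stackrel{df}{=} 0$ and $h_k \stackrel{df}{=} \max_{1 \le n \le k} f_n$. We know that $\{h_k\}_{k \ge 0} \subseteq K^p$, $h_k = \max\{f_k, h_{k-1}\}$ and $A_{k-1} \subseteq \operatorname{int}\big\{ \min\{f_k, h_{k-1}\} \ge 1 \big\}$ (with $A_0 = \emptyset$). So, using (1.67), we have
$$\|Dh_k\|_p^p + \operatorname{cap}_p(A_{k-1}) \le \big\| D\big( \max\{f_k, h_{k-1}\} \big) \big\|_p^p + \big\| D\big( \min\{f_k, h_{k-1}\} \big) \big\|_p^p = \|Df_k\|_p^p + \|Dh_{k-1}\|_p^p \le \operatorname{cap}_p(A_k) + \frac{\varepsilon}{2^k} + \|Dh_{k-1}\|_p^p,$$
so
$$\|Dh_k\|_p^p - \|Dh_{k-1}\|_p^p \le \operatorname{cap}_p(A_k) - \operatorname{cap}_p(A_{k-1}) + \frac{\varepsilon}{2^k}.$$
Adding and recalling that $h_0 = 0$, we obtain
$$\|Dh_k\|_p^p \le \operatorname{cap}_p(A_k) + \varepsilon \qquad \forall\ k \ge 1. \tag{1.68}$$
Let $u \stackrel{df}{=} \lim_{k \to +\infty} h_k$. Evidently
$$\bigcup_{k=1}^{\infty} A_k \subseteq \operatorname{int}\{u \ge 1\} \tag{1.69}$$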

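and so, by the monotone convergence theorem (see Theorem A.2.10), Proposition 1.6.9 and (1.68), we have
$$\|u\|_{p^*} = \lim_{k \to +\infty} \|h_k\|_{p^*} \le C \liminf_{k \to +\infty} \|Dh_k\|_p \le C \Big( \lim_{k \to +\infty} \operatorname{cap}_p(A_k) + \varepsilon \Big)^{\frac{1}{p}}. \tag{1.70}$$
By (1.68), the sequence $\{Dh_k\}_{k \ge 1} \subseteq L^p(\mathbb{R}^N; \mathbb{R}^N)$ is bounded. Hence, passing to a subsequence if necessary, we may assume that
$$Dh_k \stackrel{w}{\longrightarrow} Du \quad \text{in } L^p(\mathbb{R}^N; \mathbb{R}^N)$$
(recall that we have assumed that $p > 1$). Then from (1.70) and since
$$\|Du\|_p \le \liminf_{k \to +\infty} \|Dh_k\|_p, \tag{1.71}$$
we infer that $u \in K^p$. Therefore, using (1.69), (1.71) and (1.70), we have $\operatorname{cap}_p\big( \bigcup_{n=1}^{\infty} A_n \big) \le \|Du\|_p^p \le \lim_{n \to +\infty} \operatorname{cap}_p(A_n) + \varepsilon$ and, since $\varepsilon > 0$ was arbitrary,
$$\operatorname{cap}_p\Big( \bigcup_{n=1}^{\infty} A_n \Big) \le \lim_{n \to +\infty} \operatorname{cap}_p(A_n). \tag{1.72}$$
From (1.66) and (1.72), it follows that
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) = \operatorname{cap}_p\Big( \bigcup_{n=1}^{\infty} A_n \Big).$$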

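(h) Note that due to the monotonicity property, we have
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) \ge \operatorname{cap}_p\Big( \bigcap_{n=1}^{\infty} A_n \Big). \tag{1.73}$$
Let $U$ be an open set such that $\bigcap_{n=1}^{\infty} A_n \subseteq U$. The set $\bigcap_{n=1}^{\infty} A_n$ is compact and so for some $n_0 \ge 1$, we have that $A_{n_0} \subseteq U$, hence $A_n \subseteq U$ for all $n \ge n_0$. It follows that
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) \le \operatorname{cap}_p(U)$$
and so, using also (a), we have
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) \le \inf \Big\{ \operatorname{cap}_p(U) : \bigcap_{n=1}^{\infty} A_n \subseteq U,\ U \text{ is open} \Big\} = \operatorname{cap}_p\Big( \bigcap_{n=1}^{\infty} A_n \Big). \tag{1.74}$$
From (1.73) and (1.74), we conclude that
$$\lim_{n \to +\infty} \operatorname{cap}_p(A_n) = \operatorname{cap}_p\Big( \bigcap_{n=1}^{\infty} A_n \Big).$$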

REMARK 1.6.12 The monotonicity of $\operatorname{cap}_p$ together with properties (g) and (h) in Theorem 1.6.11 imply that the set-function $A \mapsto \operatorname{cap}_p(A)$ is a “Choquet capacity” (see Definition A.2.37). Using Choquet’s capacitability theorem (see Theorem A.2.39 or cf. Choquet (1955)), we can say that for all Souslin (analytic) subsets $A$ (see Definition A.2.29(b) and Remark A.2.30) of $\mathbb{R}^N$ (in particular then for all Borel sets $A$ of $\mathbb{R}^N$), we have
$$\operatorname{cap}_p(A) = \sup \{ \operatorname{cap}_p(K) : K \subseteq A,\ K \text{ is compact} \}.$$

In Theorem 1.6.11(d) we obtained a first relation between p-capacity and Hausdorff measures, both of which measure small sets in $\mathbb{R}^N$. We can improve this result as follows.

THEOREM 1.6.13 Let $p \in (1, N)$ and $A \subseteq \mathbb{R}^N$.
(a) If $\mu^{(N-p)}(A) < +\infty$, then $\operatorname{cap}_p(A) = 0$.
(b) If $\operatorname{cap}_p(A) = 0$, then $\mu^{(s)}(A) = 0$ for all $s > N - p$.

PROOF (a) Clearly we may assume that $A \subseteq \mathbb{R}^N$ is compact.

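Claim 1. We can find $C = C(N, p, A) > 0$, such that, if $V \subseteq \mathbb{R}^N$ is open with $A \subseteq V$, then we can find an open set $U \subseteq \mathbb{R}^N$ and $f \in K^p$, such that
$$A \subseteq U \subseteq \{f = 1\}, \quad \operatorname{supp} f \subseteq V \quad \text{and} \quad \|Df\|_p^p \le C.$$
Let $V \subseteq \mathbb{R}^N$ be an open set, such that $A \subseteq V$. Let us set
$$\delta \stackrel{df}{=} \frac{d(A, V^c)}{2} > 0.$$
Because $A$ is compact and $\mu^{(N-p)}(A) < +\infty$, we can find $\{z_k\}_{k=1}^m \subseteq A$ and $\{r_k\}_{k=1}^m \subseteq \mathbb{R}_+ \setminus \{0\}$, such that
$$2 r_k < \delta, \qquad A \subseteq \bigcup_{k=1}^m B_{r_k}(z_k) \qquad \text{and} \qquad \sum_{k=1}^m r_k^{N-p} \le \mu^{(N-p)}(A) + 1. \tag{1.75}$$
Let us set
$$U \stackrel{df}{=} \bigcup_{k=1}^m B_{r_k}(z_k)$$
and let $f_k \in K^p$ be defined by
$$f_k(z) \stackrel{df}{=} \begin{cases} 1 & \text{if } \|z - z_k\|_{\mathbb{R}^N} < r_k, \\ 2 - \dfrac{\|z - z_k\|_{\mathbb{R}^N}}{r_k} & \text{if } r_k \le \|z - z_k\|_{\mathbb{R}^N} \le 2 r_k, \\ 0 & \text{if } 2 r_k < \|z - z_k\|_{\mathbb{R}^N}. \end{cases}$$
Using Proposition 1.6.4(a), we see that
$$\|Df_k\|_p^p \le C r_k^{N-p} \qquad \forall\ k \in \{1, \ldots, m\}. \tag{1.76}$$
Let us set $f \stackrel{df}{=} \max_{1 \le k \le m} f_k$. Then $f \in K^p$, $U \subseteq \{f = 1\}$, $\operatorname{supp} f \subseteq V$ and from (1.76) and (1.75), we have
$$\|Df\|_p^p \le \sum_{k=1}^m \|Df_k\|_p^p \le C \sum_{k=1}^m r_k^{N-p} \le C \big( \mu^{(N-p)}(A) + 1 \big),$$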

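which proves Claim 1. We use the claim inductively and produce a sequence $\{U_n\}_{n \ge 1}$ of open sets in $\mathbb{R}^N$ and functions $\{f_n\}_{n \ge 1} \subseteq K^p$, such that
$$A \subseteq U_{n+1} \subseteq U_n, \qquad \overline{U}_{n+1} \subseteq \operatorname{int}\{f_n = 1\} \qquad \forall\ n \ge 1$$
and
$$\operatorname{supp} f_n \subseteq U_n, \qquad \|Df_n\|_p^p \le C \qquad \forall\ n \ge 1.$$
Let
$$S_m \stackrel{df}{=} \sum_{n=1}^m \frac{1}{n} \qquad \text{and} \qquad h_m \stackrel{df}{=} \frac{1}{S_m} \sum_{n=1}^m \frac{1}{n} f_n.$$
We have $h_m \in K^p$ and $h_m \ge 1$ on $U_{m+1}$. Also because $\operatorname{supp} \big( \|Df_n(\cdot)\|_{\mathbb{R}^N} \big) \subseteq U_n \setminus \overline{U}_{n+1}$ and $p > 1$, we see that
$$\operatorname{cap}_p(A) \le \|Dh_m\|_p^p \le \frac{1}{S_m^p} \sum_{n=1}^m \frac{1}{n^p} \|Df_n\|_p^p \le \frac{C}{S_m^p} \sum_{n=1}^m \frac{1}{n^p} \longrightarrow 0 \qquad \text{as } m \to +\infty.$$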
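Two computations in part (a) can be made explicit. The sketch below assumes that $a(N)$ denotes the Lebesgue measure of the unit ball of $\mathbb{R}^N$ (as in the relation $\lambda^N(\overline{B}_r(z_0)) = a(N) r^N$ recalled in part (b)); it gives one admissible constant in (1.76) and a quantitative reason for the final limit.

```latex
\begin{align*}
\|Df_k(z)\|_{\mathbb{R}^N} &=
  \begin{cases}
    \dfrac{1}{r_k} & \text{if } r_k < \|z - z_k\|_{\mathbb{R}^N} < 2 r_k, \\[6pt]
    0 & \text{if } \|z - z_k\|_{\mathbb{R}^N} < r_k \text{ or } \|z - z_k\|_{\mathbb{R}^N} > 2 r_k,
  \end{cases} \\
\|Df_k\|_p^p &= r_k^{-p}\, \lambda^N\big( \{ r_k < \|z - z_k\|_{\mathbb{R}^N} < 2 r_k \} \big)
  = a(N)\,(2^N - 1)\, r_k^{N-p}, \\
S_m &= \sum_{n=1}^{m} \frac{1}{n} \ge \int_1^{m+1} \frac{dx}{x} = \ln(m+1), \\
\frac{C}{S_m^p} \sum_{n=1}^{m} \frac{1}{n^p}
  &\le \frac{C}{\ln^p(m+1)} \sum_{n=1}^{\infty} \frac{1}{n^p} \longrightarrow 0
  \quad \text{as } m \to +\infty .
\end{align*}
```

The last series converges exactly because $p > 1$, which is where the restriction $p \in (1, N)$ in Theorem 1.6.13 is used in this argument.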


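(b) Since $\operatorname{cap}_p(A) = 0$, for every $n \ge 1$ we can find $f_n \in K^p$, such that $A \subseteq \operatorname{int}\{f_n \ge 1\}$ and
$$\|Df_n\|_p^p \le \frac{1}{2^n}. \tag{1.77}$$
Let us set $h \stackrel{df}{=} \sum_{n=1}^{\infty} f_n$. From (1.77), we have
$$\|Dh\|_p \le \sum_{n=1}^{\infty} \|Df_n\|_p < +\infty. \tag{1.78}$$
Using Proposition 1.6.9 and (1.78), we have that
$$\|h\|_{p^*} \le \sum_{n=1}^{\infty} \|f_n\|_{p^*} \le \sum_{n=1}^{\infty} C \|Df_n\|_p < +\infty,$$
so $h \in K^p$.
Observe that $A \subseteq \operatorname{int}\{h \ge m\}$ for all $m \ge 1$. Let $z_0 \in A$. For $r > 0$ small, we have $\overline{B}_r(z_0) \subseteq \operatorname{int}\{h \ge m\}$, hence
$$h_{z_0, r} \stackrel{df}{=} \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} h(z)\,dz \ge m,$$
which implies that
$$h_{z_0, r} \longrightarrow +\infty \qquad \text{as } r \searrow 0. \tag{1.79}$$
Claim 2. For every $z_0 \in A$, we have
$$\limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|_{\mathbb{R}^N}^p\,dz = +\infty,$$
for any $s > N - p$.
To prove this claim, we proceed by contradiction. Let $z_0 \in A$ and suppose that
$$\limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|_{\mathbb{R}^N}^p\,dz < +\infty.$$
Then we can find $M < +\infty$, such that for all $r \in (0, 1]$, we have
$$\frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|_{\mathbb{R}^N}^p\,dz \le M.$$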

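Invoking the Poincaré-Wirtinger inequality (see Theorem 1.6.8), we have
$$\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} \big| h(z) - h_{z_0, r} \big|^p\,dz \le \frac{C r^p}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} \|Dh(z)\|_{\mathbb{R}^N}^p\,dz \le C_1 r^{\vartheta}, \tag{1.80}$$
for some $C, C_1 > 0$, $\vartheta = s - (N - p)$ and for all $r \in (0, 1]$ (recall that $\lambda^N\big( \overline{B}_r(z_0) \big) = a(N) r^N$). Since
$$\lambda^N\big( \overline{B}_r(z_0) \big) = 2^N \lambda^N\big( \overline{B}_{\frac{r}{2}}(z_0) \big)$$
and using Jensen’s inequality (see Theorem A.2.26) and (1.80), we have that
$$\big| h_{z_0, \frac{r}{2}} - h_{z_0, r} \big| = \bigg| \frac{1}{\lambda^N(\overline{B}_{\frac{r}{2}}(z_0))} \int_{\overline{B}_{\frac{r}{2}}(z_0)} \big( h(z) - h_{z_0, r} \big)\,dz \bigg| \le \frac{2^N}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} \big| h(z) - h_{z_0, r} \big|\,dz$$
$$\le 2^N \bigg( \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} \big| h(z) - h_{z_0, r} \big|^p\,dz \bigg)^{\frac{1}{p}} \le C_2 r^{\frac{\vartheta}{p}}, \tag{1.81}$$
for some $C_2 > 0$. Therefore, using (1.81), for $k > i$, we have
$$\big| h_{z_0, \frac{1}{2^k}} - h_{z_0, \frac{1}{2^i}} \big| \le \sum_{l=i+1}^{k} \big| h_{z_0, \frac{1}{2^l}} - h_{z_0, \frac{1}{2^{l-1}}} \big| \le C_2 \sum_{l=i+1}^{k} \Big( \frac{1}{2^{l-1}} \Big)^{\frac{\vartheta}{p}},$$
so $\big\{ h_{z_0, \frac{1}{2^n}} \big\}_{n \ge 1}$ is a Cauchy sequence and this contradicts the fact that $h_{z_0, \frac{1}{2^n}} \longrightarrow +\infty$ (see (1.79)). This proves Claim 2. Then we have
$$A \subseteq \bigg\{ z_0 \in \mathbb{R}^N : \limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|_{\mathbb{R}^N}^p\,dz = +\infty \bigg\} \subseteq \bigg\{ z_0 \in \mathbb{R}^N : \limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|_{\mathbb{R}^N}^p\,dz > 0 \bigg\} = C_s.$$
But from Theorem 1.4.9, we have that $\mu^{(s)}(C_s) = 0$, hence $\mu^{(s)}(A) = 0$.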
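The Cauchy property claimed in the proof of Claim 2 follows from a geometric series. As a brief sketch, writing $q = 2^{-\vartheta/p}$ (a shorthand introduced here only for illustration), the tail of the sum in the estimate for $k > i$ can be bounded as follows:

```latex
\begin{align*}
\sum_{l=i+1}^{k} \Big( \frac{1}{2^{\,l-1}} \Big)^{\vartheta/p}
  \le \sum_{l=i+1}^{\infty} q^{\,l-1}
  = \frac{q^{\,i}}{1 - q},
  \qquad q = 2^{-\vartheta/p} \in (0, 1),
\end{align*}
```

so the difference of the averages at scales $2^{-k}$ and $2^{-i}$ is at most $C_2\, q^{i} / (1 - q)$, which tends to $0$ as $i \to +\infty$. The condition $q < 1$ is equivalent to $\vartheta = s - (N - p) > 0$, which is the hypothesis $s > N - p$.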


REMARK 1.6.14 If $p = 1$ and $A \subseteq \mathbb{R}^N$, it can be shown that $\operatorname{cap}_1(A) = 0$ if and only if $\mu^{(N-1)}(A) = 0$. The proof of this result, which uses functions of bounded variation and the isoperimetric inequality, can be found in Evans & Gariepy (1992, p. 193).

PROPOSITION 1.6.15 If $T \subseteq (0, 1)$ is such that $\lambda^1(T) > 0$, $p \in [N - 1, N)$, $A \subseteq B_1(0) \subseteq \mathbb{R}^N$ and for each $r \in T$ there exists unique $z_r \in \partial B_r(0)$, such that $z_r \in A$, then $\operatorname{cap}_p(A) > 0$.

PROOF Let $f : \mathbb{R}^N \to \mathbb{R}$ be defined by
$$f(z) \stackrel{df}{=} \|z\|_{\mathbb{R}^N} = \Big( \sum_{n=1}^{N} z_n^2 \Big)^{\frac{1}{2}} \qquad \forall\ z = (z_n)_{n=1}^N \in \mathbb{R}^N.$$
Evidently $f$ is Lipschitz continuous with $\operatorname{Lip}(f) = 1$. So by Proposition 1.3.25, we have
$$\mu^{(1)}\big( f(A) \big) \le \mu^{(1)}(A).$$
Note that $T = f(A)$. So by virtue of Theorem 1.3.21, we have that
$$0 < \lambda^1(T) = \mu^{(1)}\big( f(A) \big) \le \mu^{(1)}(A).$$
If for some $p \in [N - 1, N)$, we have that $\operatorname{cap}_p(A) = 0$, then from Theorem 1.6.13(b) (see also Remark 1.6.14), we have $\mu^{(1)}(A) = 0$ (note that $1 > N - p$), a contradiction. So
$$\operatorname{cap}_p(A) > 0 \qquad \forall\ p \in [N - 1, N).$$

The next result provides a kind of Chebyshev inequality in terms of p-capacities.

PROPOSITION 1.6.16 If $p \in [1, N)$, $f \in K^p$, $\varepsilon > 0$ and
$$A \stackrel{df}{=} \bigg\{ z_0 \in \mathbb{R}^N : \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} f(z)\,dz > \varepsilon \text{ for some } r > 0 \bigg\},$$
then there exists a constant $C = C(N, p)$, such that
$$\operatorname{cap}_p(A) \le \frac{C}{\varepsilon^p} \|Df\|_p^p.$$

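PROOF First, we show that the set $A \subseteq \mathbb{R}^N$ is open. Let $z_0 \in A$. Then for some $r > 0$ and $\xi > 0$, we have
$$\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} f(z)\,dz = \varepsilon + \xi.$$
Since $f \in L^1_{loc}(\mathbb{R}^N)$, exploiting the absolute continuity of the Lebesgue integral, we can find $\vartheta > 0$ small enough so that if $\lambda^N(B) < \vartheta$, then
$$\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{B} f(z)\,dz < \xi.$$
Also let $\delta > 0$ be such that
$$\lambda^N\big( \overline{B}_r(z) \,\triangle\, \overline{B}_r(z_0) \big) < \vartheta \qquad \forall\ \|z - z_0\|_{\mathbb{R}^N} < \delta,$$
where $X \triangle Y \stackrel{df}{=} (X \setminus Y) \cup (Y \setminus X)$ is the symmetric difference of $X$ and $Y$. So, if $\|z - z_0\|_{\mathbb{R}^N} < \delta$, we have
$$\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z)} f(y)\,dy = \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} f(y)\,dy + \frac{1}{\lambda^N(\overline{B}_r(z_0))} \bigg[ \int_{\overline{B}_r(z)} f(y)\,dy - \int_{\overline{B}_r(z_0)} f(y)\,dy \bigg]$$
$$\ge \varepsilon + \xi - \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z) \triangle \overline{B}_r(z_0)} f(y)\,dy > \varepsilon + \xi - \xi = \varepsilon,$$
so $z \in A$ (recall that $\lambda^N(\overline{B}_r(z)) = \lambda^N(\overline{B}_r(z_0))$) and we infer that $A$ is open. Next let $z_0 \in A$ and let $r > 0$ be such that
$$\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} f(y)\,dy > \varepsilon.$$
Then, by Jensen’s inequality (see Theorem A.2.26), we have
$$a(N) r^N \varepsilon < \int_{\overline{B}_r(z_0)} f(y)\,dy \le \big( a(N) r^N \big)^{1 - \frac{1}{p^*}} \bigg( \int_{\overline{B}_r(z_0)} f(y)^{p^*}\,dy \bigg)^{\frac{1}{p^*}} \le \big( a(N) r^N \big)^{1 - \frac{1}{p^*}} \|f\|_{p^*},$$
so $r \le C_1$ for some $C_1 > 0$ independent of $z_0$.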
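The bound $r \le C_1$ can be read off directly from the last chain of inequalities. As a short sketch, dividing both ends by $\big( a(N) r^N \big)^{1 - 1/p^*} \varepsilon$ and solving for $r$ gives one explicit admissible value of $C_1$ (depending on $N$, $p$, $\varepsilon$ and $\|f\|_{p^*}$, but not on $z_0$):

```latex
\begin{align*}
\big( a(N)\, r^N \big)^{\frac{1}{p^*}} < \frac{\|f\|_{p^*}}{\varepsilon}
\quad \Longrightarrow \quad
r < a(N)^{-\frac{1}{N}} \Big( \frac{\|f\|_{p^*}}{\varepsilon} \Big)^{\frac{p^*}{N}} .
\end{align*}
```

A uniform bound on the radii is what allows a covering theorem for families of balls with bounded radii to be applied in the next step.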


Invoking the Besicovitch covering theorem (see Theorem 1.2.18), we can find a positive integer $k = k(N) \ge 1$ and countable collections $\{T_n\}_{n=1}^k$ of pairwise disjoint closed balls, such that
$$A \subseteq \bigcup_{n=1}^{k} \bigcup_{B \in T_n} B \qquad \text{and} \qquad \frac{1}{\lambda^N(B)} \int_B f(y)\,dy > \varepsilon \qquad \forall\ B \in \bigcup_{n=1}^{k} T_n. \tag{1.82}$$
Let $\big\{ B_i^{(n)} \big\}_{i \ge 1}$ be an enumeration of the elements in the countable collection $T_n$ for $n = 1, \ldots, k$. Using Proposition 1.6.4(a), we can check that
$$\bigg( \frac{1}{\lambda^N(B_i^{(n)})} \int_{B_i^{(n)}} f(y)\,dy - f \bigg)^{+} \in W^{1,p}\big( B_i^{(n)} \big).$$
Then Poincaré’s inequality (see Theorem 1.6.8) implies that
$$\bigg\| \bigg( \frac{1}{\lambda^N(B_i^{(n)})} \int_{B_i^{(n)}} f(y)\,dy - f \bigg)^{+} \bigg\|_{W^{1,p}(B_i^{(n)})} \le C_2 \|Df\|_{L^p(B_i^{(n)}; \mathbb{R}^N)},$$
for some $C_2 > 0$. Invoking Proposition 1.6.5, we can find $g_i^{(n)} \in W^{1,p}(\mathbb{R}^N)$, such that
$$g_i^{(n)} \ge 0, \qquad g_i^{(n)}(z) = \bigg( \frac{1}{\lambda^N(B_i^{(n)})} \int_{B_i^{(n)}} f(y)\,dy - f \bigg)^{+}(z) \quad \text{for a.a. } z \in B_i^{(n)}$$
and
$$\big\| g_i^{(n)} \big\|_{W^{1,p}(\mathbb{R}^N)} \le C_3 \|Df\|_{L^p(B_i^{(n)}; \mathbb{R}^N)}, \tag{1.83}$$
for some $C_3 = C_3(N, p) > 0$. From (1.82), we have
$$f + g_i^{(n)} \ge \frac{1}{\lambda^N(B_i^{(n)})} \int_{B_i^{(n)}} f(y)\,dy > \varepsilon \qquad \text{almost everywhere on } B_i^{(n)}.$$
Also if
$$g \stackrel{df}{=} \sup_{\substack{i \ge 1 \\ n = 1, \ldots, k}} g_i^{(n)},$$

we have $g \ge 0$. We claim that $g \in K^p$. To this end, using (1.83), note that
$$\int_{\mathbb{R}^N} \sup_{\substack{i \ge 1 \\ n = 1, \ldots, k}} \big| g_i^{(n)}(y) \big|^p\,dy \le \sum_{n=1}^{k} \sum_{i=1}^{\infty} \int_{\mathbb{R}^N} \big| g_i^{(n)}(y) \big|^p\,dy \le C_3^p \sum_{n=1}^{k} \sum_{i=1}^{\infty} \int_{B_i^{(n)}} \|Df(y)\|_{\mathbb{R}^N}^p\,dy \le k C_3^p \|Df\|_p^p,$$
so $g \in L^p(\mathbb{R}^N)$. Also, we have
$$\int_{\mathbb{R}^N} \sup_{\substack{i \ge 1 \\ n = 1, \ldots, k}} \big\| Dg_i^{(n)}(y) \big\|_{\mathbb{R}^N}^p\,dy \le \sum_{n=1}^{k} \sum_{i=1}^{\infty} \big\| Dg_i^{(n)} \big\|_p^p \le C_3^p \sum_{n=1}^{k} \sum_{i=1}^{\infty} \int_{B_i^{(n)}} \|Df(y)\|_{\mathbb{R}^N}^p\,dy \le k C_3^p \|Df\|_p^p. \tag{1.84}$$
From Proposition 1.6.4(b), we infer that
$$\|Dg(y)\|_{\mathbb{R}^N}^p \le \sup_{\substack{i \ge 1 \\ n = 1, \ldots, k}} \big\| Dg_i^{(n)}(y) \big\|_{\mathbb{R}^N}^p \qquad \text{for a.a. } y \in \mathbb{R}^N,$$
so by (1.84), we have that $Dg \in L^p(\mathbb{R}^N; \mathbb{R}^N)$ and thus $g \in K^p$. Because $f + g \ge \varepsilon$ almost everywhere on $A$ and $A$ is open, using also (1.84), it follows that
$$\operatorname{cap}_p(A) \le \int_{\mathbb{R}^N} \Big\| \frac{1}{\varepsilon} D(f + g)(y) \Big\|_{\mathbb{R}^N}^p\,dy \le \frac{C_4}{\varepsilon^p} \big( \|Df\|_p^p + \|Dg\|_p^p \big) \le \frac{C_5}{\varepsilon^p} \|Df\|_p^p,$$
for some $C_4, C_5 = C_5(N, p) > 0$. Setting $C = C_5$, we obtain the result.

DEFINITION 1.6.17 A function $f : \mathbb{R}^N \to \mathbb{R}$ is said to be p-quasicontinuous, if for each $\varepsilon > 0$, we can find an open set $U \subseteq \mathbb{R}^N$, such that $\operatorname{cap}_p(U) < \varepsilon$ and $f|_{\mathbb{R}^N \setminus U}$ is continuous.

We have all the necessary tools to prove the following “differentiability” result for Sobolev functions. As we already said, a systematic study of Sobolev functions and of their differentiability properties will be conducted in Chapter 2. Here we state a result which says that up to a set $A$ of p-capacity zero, a function $f \in W^{1,p}(\mathbb{R}^N)$ can be represented by a p-quasicontinuous function.

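THEOREM 1.6.18 If $p \in [1, N)$ and $f \in W^{1,p}(\mathbb{R}^N)$, then there exists a p-quasicontinuous function $f^* : \mathbb{R}^N \to \mathbb{R}$, such that
(a) there exists a Borel set $A \subseteq \mathbb{R}^N$ with $\operatorname{cap}_p(A) = 0$, such that
$$\lim_{r \to 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} f(y)\,dy = f^*(z) \qquad \forall\ z \in \mathbb{R}^N \setminus A;$$
(b) for each $z \in \mathbb{R}^N \setminus A$, we have
$$\lim_{r \to 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \big| f(y) - f^*(z) \big|^{p^*}\,dy = 0.$$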

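PROOF (a) Let
$$C \stackrel{df}{=} \bigg\{ z \in \mathbb{R}^N : \limsup_{r \to 0} \frac{1}{r^{N-p}} \int_{\overline{B}_r(z)} \|Df(y)\|_{\mathbb{R}^N}^p\,dy > 0 \bigg\}.$$
From Theorem 1.4.9, we have that $\mu^{(N-p)}(C) = 0$ and this by Theorem 1.6.13(a) implies that $\operatorname{cap}_p(C) = 0$. Moreover, from the Poincaré-Wirtinger inequality (see Theorem 1.6.8), we see that
$$\lim_{r \to 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \big| f(y) - f_{z,r} \big|^{p^*}\,dy = 0 \qquad \forall\ z \in \mathbb{R}^N \setminus C, \tag{1.85}$$
where $f_{z,r}$ denotes the average of $f$ over $\overline{B}_r(z)$. By virtue of Proposition 1.6.3, we can find a sequence $\{f_n\}_{n \ge 1} \subseteq W^{1,p}(\mathbb{R}^N) \cap C^{\infty}(\mathbb{R}^N)$, such that
$$\|Df - Df_n\|_p^p \le \frac{1}{2^{(p+1)n}} \qquad \forall\ n \ge 1. \tag{1.86}$$
For $n \ge 1$, we introduce the sets
$$E_n \stackrel{df}{=} \bigg\{ z \in \mathbb{R}^N : \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} |f(y) - f_n(y)|\,dy > \frac{1}{2^n} \text{ for some } r > 0 \bigg\}.$$
From Proposition 1.6.16 and (1.86), we have that
$$\frac{\operatorname{cap}_p(E_n)}{2^{pn}} \le C \|Df - Df_n\|_p^p \le \frac{C}{2^{(p+1)n}},$$
for some $C = C(N, p) > 0$, so
$$\operatorname{cap}_p(E_n) \le \frac{C}{2^n}. \tag{1.87}$$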

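Moreover, we have
$$\big| f_{z,r} - f_n(z) \big| \le \frac{1}{\lambda^N(\overline{B}_r(z))} \bigg\{ \int_{\overline{B}_r(z)} \big| f(y) - f_{z,r} \big|\,dy + \int_{\overline{B}_r(z)} |f(y) - f_n(y)|\,dy + \int_{\overline{B}_r(z)} |f_n(y) - f_n(z)|\,dy \bigg\},$$
so from (1.85), the definition of $E_n$ and the continuity of $f_n$, we have
$$\limsup_{r \to 0} \big| f_{z,r} - f_n(z) \big| \le \frac{1}{2^n} \qquad \forall\ z \in \mathbb{R}^N \setminus (C \cup E_n). \tag{1.88}$$
We set
$$A_k \stackrel{df}{=} C \cup \Big( \bigcup_{n=k}^{\infty} E_n \Big) \qquad \forall\ k \ge 1.$$
Evidently $A_k$ is Borel for $k \ge 1$ and we have that
$$\operatorname{cap}_p(A_k) \le \operatorname{cap}_p(C) + \sum_{n=k}^{\infty} \operatorname{cap}_p(E_n) \le C \sum_{n=k}^{\infty} \frac{1}{2^n}. \tag{1.89}$$
Then, if
$$A \stackrel{df}{=} \bigcap_{k=1}^{\infty} A_k,$$
from (1.89), we have that
$$\operatorname{cap}_p(A) \le \lim_{k \to +\infty} \operatorname{cap}_p(A_k) = 0.$$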
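The two numerical steps behind (1.87) and (1.89) are elementary, and the following sketch only spells out the arithmetic. It assumes that Proposition 1.6.16 is applied with $\varepsilon = 2^{-n}$ (so that $\varepsilon^{-p} = 2^{pn}$) and uses $\operatorname{cap}_p(C) = 0$:

```latex
\begin{align*}
\operatorname{cap}_p(E_n)
  &\le C\, 2^{pn}\, \|Df - Df_n\|_p^p
   \le C\, 2^{pn} \cdot 2^{-(p+1)n}
   = \frac{C}{2^{n}}, \\
\operatorname{cap}_p(A_k)
  &\le \sum_{n=k}^{\infty} \frac{C}{2^{n}}
   = \frac{C}{2^{\,k-1}} \longrightarrow 0
   \quad \text{as } k \to +\infty .
\end{align*}
```

This explains the choice of the exponent $(p+1)n$ in (1.86): the factor $2^{pn}$ coming from the Chebyshev-type estimate is absorbed and a summable bound $2^{-n}$ remains.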

Note that, if $z \in \mathbb{R}^N \setminus A_k$ and $n, m \ge k$, from (1.88), we have
$$\big| f_n(z) - f_m(z) \big| \le \limsup_{r \to 0} \big| f_{z,r} - f_n(z) \big| + \limsup_{r \to 0} \big| f_{z,r} - f_m(z) \big| \le \frac{1}{2^n} + \frac{1}{2^m},$$
so the sequence $\{f_n\}_{n \ge 1}$ converges uniformly on $\mathbb{R}^N \setminus A_k$ to some continuous function $h$. Also we have
$$\limsup_{r \to 0} \big| h(z) - f_{z,r} \big| \le |h(z) - f_n(z)| + \limsup_{r \to 0} \big| f_n(z) - f_{z,r} \big|,$$
so, from (1.88), we have
$$h(z) = \lim_{r \to 0} f_{z,r} = f^*(z) \qquad \forall\ z \in \mathbb{R}^N \setminus A_k,\ k \ge 1,$$
hence
$$f^*(z) = \lim_{r \to 0} f_{z,r} \qquad \forall\ z \in \mathbb{R}^N \setminus A.$$

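We need to show that $f^*$ is p-quasicontinuous. For this purpose let $\varepsilon > 0$ be given. We choose $k \ge 1$, such that $\operatorname{cap}_p(A_k) < \varepsilon$. By Theorem 1.6.11(a), we can find an open set $U \subseteq \mathbb{R}^N$, such that $A_k \subseteq U$ and $\operatorname{cap}_p(U) < \varepsilon$. Since the sequence $\{f_n\}_{n \ge 1}$ converges to $f^*$ uniformly on $\mathbb{R}^N \setminus U$, we infer that the function $f^*|_{\mathbb{R}^N \setminus U}$ is continuous, hence $f^*$ is p-quasicontinuous.
(b) Note that $C \subseteq A$ and so from (1.85), we see that for all $z \in \mathbb{R}^N \setminus A$, we have that
$$\lim_{r \to 0} \bigg( \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} |f(y) - f^*(z)|^{p^*}\,dy \bigg)^{\frac{1}{p^*}} \le \lim_{r \to 0} \big| f_{z,r} - f^*(z) \big| + \lim_{r \to 0} \bigg( \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \big| f(y) - f_{z,r} \big|^{p^*}\,dy \bigg)^{\frac{1}{p^*}} = 0.$$

REMARK 1.6.19 By virtue of Theorem 1.6.13(b), we have that
$$\mu^{(s)}(A) = 0 \qquad \forall\ s > N - p.$$
Hence $\dim A \le N - p$. On the other hand this last inequality does not necessarily imply that $\mu^{(N-p)}(A) < +\infty$, which in turn gives us that $\operatorname{cap}_p(A) = 0$ (see Theorem 1.6.13(a)). Therefore the conclusion $\operatorname{cap}_p(A) = 0$ in the statement of Theorem 1.6.18 is stronger than the dimensionality condition $\dim A \le N - p$.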

1.7 Remarks

1.1: The necessary measure theoretic background is standard and can be found in any book on abstract measure theory. We mention the books of Ash (1972), Denkowski, Migórski & Papageorgiou (2003a, Chapter 2), Dudley (1989), Hewitt & Stromberg (1975) and Royden (1968). Outer measures were first introduced by Carathéodory (1914), who also gave the definition of a µ-measurable set (see Definition 1.1.3) and proved that Σ(µ) is a σ-field and the outer measure µ restricted on Σ(µ) is a measure. For a proof of Proposition 1.1.10 we refer to Denkowski, Migórski & Papageorgiou (2003a, p. 184–186) and Evans & Gariepy (1992, p. 6–9).

1.2: In some books a Vitali cover (see Definition 1.2.2) is called “fine cover” (see, e.g., Evans & Gariepy (1992)). The original version of Theorem 1.2.5 (Vitali covering theorem) is due to Vitali (1908), who employed closed cubes. The first to study the differentiability of monotone functions (or more generally of functions of bounded variation, in particular then of absolutely continuous functions) was Lebesgue (1904, 1910). Evans & Gariepy (1992, p. 30), Hardt (1979), Simon (1983) and Ziemer (1989) contain the proof of Theorem 1.2.18. For the proof of Theorem 1.2.19 see Evans & Gariepy (1992, p. 35). In general covering theorems are useful in harmonic analysis and in geometric measure theory. More about them can be found in de Guzman (1975).

1.3: Carathéodory (1914), working with outer measures, was the first to introduce “Hausdorff measures.” More precisely, he introduced “1-dimensional” (or “linear”) measures in $\mathbb{R}^N$ and also indicated that similarly one can define k-dimensional measures in $\mathbb{R}^N$ for any integer $k \ge 1$. Hausdorff (1919) realized that Carathéodory’s definition can be used also for noninteger $s > 0$. He then went on to show that Cantor’s ternary set has fractional dimension $s = \frac{\ln 2}{\ln 3}$.
An extension of the theory can be achieved by replacing $\delta(A_n)$ in the definition of the Hausdorff measure, by $\xi(A_n)$, where $\xi : 2^X \to \mathbb{R}_+$ is any premeasure, i.e., $\xi(\emptyset) = 0$ and if $U \subseteq V$ then $\xi(U) \le \xi(V)$ (monotonicity). Of special interest are premeasures resulting from Hausdorff functions $h$. Namely we consider a function $h : \mathbb{R}_+ \to \mathbb{R}_+$ satisfying:
(a) $h(t) > 0$ for all $t > 0$,
(b) if $t \le s$, then $h(t) \le h(s)$,
(c) $h$ is right continuous at every $t > 0$.
Such a function is called a Hausdorff function. For such a function and a positive constant $\vartheta$, we define a premeasure $\xi$ on the metric space $X$, by
$$\xi(A) \stackrel{df}{=} \begin{cases} \min\big\{ h\big( \delta(A) \big), h(\vartheta) \big\} & \text{if } A \ne \emptyset, \\ 0 & \text{if } A = \emptyset. \end{cases}$$

Then $\xi$ is the premeasure defined by $h$ and the cut-off level $\vartheta$. For more details about this generalization, we refer to Davies (1970) and Davies & Samuels (1974). In the presentation of the isodiametric inequality (see Theorem 1.3.20) and of the fact that $\mu^{(N)}$ is a multiple of the Lebesgue measure $\lambda^N$, we follow Evans & Gariepy (1992, Chapter 3). We refer also to Falconer (1985, Section 1.6), Federer (1969, Section 2.10.33) and Hardt (1979). It would be a grave omission not to mention the fundamental contributions to the field of Hausdorff measures made by Besicovitch. We mention the works of Besicovitch (1945, 1946) related to Theorem 1.2.18 (the covering theorem bearing his name). A more complete list of the works of Besicovitch can be found in the book of Falconer (1985).

1.4: The intuitive meaning of Theorem 1.4.1 is that for $\lambda^N$-almost all $x \in A$, small balls centered at $x$ consist predominantly of points of $A$. Theorems 1.4.1 and 1.4.6 are due to Lebesgue (1910). They were generalized by Besicovitch (1945, 1946), who replaced the Lebesgue measure $\lambda^N$ by a Radon measure on $\mathbb{R}^N$. Another source of information for the differentiation of measures in $\mathbb{R}^N$ is the book of Widom (1969).

1.5: Theorem 1.5.4 was originally proved by McShane (1934), who produced the minimal Lipschitz extension of $f$. Theorem 1.5.8 was originally proved by Rademacher (1919). The proof that is given here is essentially due to Morrey (1966, Theorem 3.1.6). It can be found also in Evans & Gariepy (1992), Simon (1983) and Ziemer (1989). If we employ the notion of Haar-null set, we can also have an extension of Rademacher’s theorem to locally Lipschitz functions between Banach spaces.

DEFINITION 1.7.1 Let $(G, +)$ be an abelian Polish group and $d$ an invariant metric on $G$ compatible with the topology (therefore automatically complete). A universally measurable set $A \subseteq G$ is a Haar-null set, if there exists a probability measure $\mu$ on $G$ (not unique), such that $\chi_A \star \mu = 0$, where $\chi_A$ is the characteristic function of the set $A$ and
$$\big( \chi_A \star \mu \big)(x) = \int_G \chi_A(x + y)\,d\mu(y).$$

REMARK 1.7.2 The above definition is equivalent to the requirement that every translate of the set $A$ is a zero set for the measure $\mu$. The measure $\mu$ is usually called a test measure. The next theorem is the extension of Theorem 1.5.8 to functions between Banach spaces. It will be proved in Section 4.3 (see Theorem 4.3.17).

THEOREM 1.7.3 If $X$ is a separable Banach space, $Y$ is a Banach space with the RNP (see Section 2.1) and $f : X \to Y$ is locally Lipschitz, then there exists a universally measurable set $D \subseteq X$, such that $X \setminus D$ is Haar-null and $f|_D$ is differentiable in the sense of Gâteaux.

Our derivation of the area formula (see Theorem 1.5.21) is based on Evans & Gariepy (1992, Section 3.3) (see also Federer (1969, Section 3.2) and Hardt (1979)). Lipschitz continuous functions and their properties are discussed in Federer (1969, Section 3.3). If $N = M$, from the two change of variables results (see Theorem 1.5.23 and Theorem 1.5.26), we obtain the following change of variables formula.

THEOREM 1.7.4 If $U, V \subseteq \mathbb{R}^N$ are open sets, $f : U \to V$ is a locally Lipschitz homeomorphism and $u \in L^1(V)$, then
$$v = (u \circ f) \big| \det J_f \big| \in L^1(U)$$
and
$$\int_U u\big( f(x) \big) \big| \det J_f(x) \big|\,d\lambda^N(x) = \int_V u(y)\,d\lambda^N(y).$$

The proof of Theorem 1.5.25 can be found in Evans & Gariepy (1992, Section 3.4) or Federer (1969, Subsection 3.2.11). 1.6: Our treatment of capacity follows Evans & Gariepy (1992, Section 4.7) (see also Federer & Ziemer (1972)). There are other notions of capacity as for example Bessel capacity, Riesz capacity, etc., which are discussed in Stein (1970) and Ziemer (1989). The abstract theory of capacities in Banach spaces can be found in Fowler (1973). Moreover, for the use of capacities in the convergence of obstacles, we refer to Dal Maso (1985).

Chapter 2 Lebesgue-Bochner and Sobolev Spaces

The functional-analytic approach to the solution of (partial) differential equations requires knowledge of the properties of spaces of functions of one or several real variables. A large class of infinite dimensional dynamical systems (evolution systems) can be modelled as an abstract differential equation defined on a suitable Banach space or on a suitable manifold therein. The advantage of such an abstract formulation lies not only on its generality but also in the insight that can be gained about the many common unifying properties that tie together apparently diverse problems. It is clear that such a study relies on the knowledge of various spaces of vector valued functions (i.e., of Banach space valued functions). For this reason Section 2.1 deals with vector valued functions. We introduce the various notions of measurability for such functions and then based on them we define the different integrals corresponding to them. The emphasis is on the so-called Bochner integral, which generalizes in a very natural way the classical Lebesgue integral to vector valued functions. In Section 2.2 we continue with vector valued functions and introduce the so-called Lebesgue-Bochner spaces, which extend to vector valued functions the well known Lebesgue Lp -spaces. We also consider evolution triples and the function spaces associated with them. Evolution triples provide a suitable analytical framework for the study of a large class of linear and nonlinear evolution equations. In Section 2.3 we have compactness results for the spaces introduced and studied in the previous section. The compactness results refer to both the strong and the weak topologies on the spaces under consideration. Thus far we are dealing with function spaces arising in evolutionary problems. In Section 2.4 we study Sobolev spaces, which are the main tools in the analysis of both stationary and nonstationary equations. 
Sobolev spaces play a central role in the modern theory of partial differential equations and they allow us to broaden significantly the notion of solution of a boundary value problem. They provide a natural functional analytical framework for the study of weak solutions of elliptic boundary value problems. No specific applications to problems in partial differential equations are discussed. Instead the section aims to serve as a concise introduction to the properties of


the Sobolev spaces (of both one and several variables). In Section 2.5 we present some fundamental inequalities associated with Sobolev functions, the celebrated embedding theorems for the Sobolev spaces and some of their consequences. The embedding theorems are arguably the most important results in this theory and the reason why Sobolev spaces are so effective in dealing with boundary value problems. Finally in Section 2.6 we establish some fine properties of Sobolev spaces and introduce functions of bounded variation (BV-functions). These are functions whose weak first partial derivatives are Radon measures and this is essentially the weakest measure theoretic sense in which a function can be differentiable. They are particularly useful in theoretical mechanics.

2.1 Vector-Valued Functions

In this section we deal with functions which take values in Banach spaces. For such functions we define the various notions of measurability and different integrals corresponding to them. The domain of a function is a finite measure space $(\Omega, \Sigma, \mu)$ and the range is a Banach space $X$. By $X^*$ we denote the topological dual of $X$ and by $\langle \cdot, \cdot \rangle_X$ the duality brackets for the pair $(X^*, X)$. By $\mathcal{B}(X)$ we denote the Borel σ-field of $X$.

DEFINITION 2.1.1 Let $f : \Omega \to X$ be a function.
(a) Function $f$ is said to be a simple function, if it takes only a finite number of values, say $x_1, \ldots, x_N$ and
$$C_k = f^{-1}\big( \{x_k\} \big) \in \Sigma \qquad \forall\ k \in \{1, \ldots, N\}.$$
The formula $f = \sum_{k=1}^{N} x_k \chi_{C_k}$ is called the standard representation of $f$.
(b) Function $f$ is said to be strongly measurable (or Bochner measurable), if there exists a sequence $\{ s_n : \Omega \to X \}_{n \ge 1}$ of simple functions, such that
$$s_n(\omega) \longrightarrow f(\omega) \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega,$$
where $\longrightarrow$ denotes the convergence in the norm topology of $X$, i.e.,
$$\big\| f(\omega) - s_n(\omega) \big\|_X \longrightarrow 0 \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega.$$
(c) Function $f$ is said to be weakly measurable, if for all $x^* \in X^*$, the function $\omega \mapsto \langle x^*, f(\omega) \rangle_X$ is Σ-measurable.
(d) Function $f : \Omega \to X^*$ is said to be weak*-measurable, if for all $x \in X$, the function $\omega \mapsto \langle f(\omega), x \rangle_X$ is Σ-measurable.

REMARK 2.1.2 Evidently strong measurability of a function $f : \Omega \to X$ implies its weak measurability. Also strong measurability implies that for every $B \in \mathcal{B}(X)$, we have that $f^{-1}(B) \in \Sigma$ (i.e., $f$ is Borel measurable). Moreover, adapting the proof of the classical result, which asserts that a measurable $\mathbb{R}$-valued function is the µ-almost everywhere limit of a sequence of simple functions, we see that if $X$ is separable, then $f : \Omega \to X$ is strongly measurable if and only if it is Borel measurable. In fact, by the next theorem, known as the Pettis measurability theorem, the situation simplifies considerably when $X$ is separable.

THEOREM 2.1.3 A function $f : \Omega \to X$ is strongly measurable if and only if it is weakly measurable and µ-almost separably valued (i.e., there exists a set $A \in \Sigma$ with $\mu(A) = 0$, such that $f(\Omega \setminus A)$ is separable in $X$).

PROOF “$\Longrightarrow$”: Let $f : \Omega \to X$ be a strongly measurable function. As we already pointed out, $f$ is weakly measurable. Also since $f$ is strongly measurable, we can find a sequence $\{s_n\}_{n \ge 1}$ of $X$-valued simple functions and a µ-null set $A \in \Sigma$, such that
$$s_n(\omega) \longrightarrow f(\omega) \quad \text{in } X, \text{ for all } \omega \in \Omega \setminus A.$$
Let $\{y_n\}_{n \ge 1}$ be a sequence of all the values taken by the sequence $\{s_n\}_{n \ge 1}$ (clearly the set is countable). Let
$$Y \stackrel{df}{=} \overline{\operatorname{span}}\, \{y_n\}_{n \ge 1}.$$
Then $Y$ is a closed separable subspace of $X$. Moreover, $f(\Omega \setminus A) \subseteq Y$ and so $f$ is µ-almost separably valued.
“$\Longleftarrow$”: Let $f : \Omega \to X$ be a weakly measurable and µ-almost separably valued function. Without any loss of generality, we may assume that $f$ is separably valued. Then replacing $X$ by $Y = \overline{\operatorname{span}}\, f(\Omega)$, which is separable, we see that we may assume that $X$ is separable. Let $\{x_n^*\}_{n \ge 1}$ be dense in $\partial B_1^{X^*}(0)$, where
$$\partial B_1^{X^*}(0) = \big\{ x^* \in X^* : \|x^*\|_{X^*} = 1 \big\}.$$
Then
$$\big\| f(\omega) \big\|_X = \sup_{n \ge 1} \big| \langle x_n^*, f(\omega) \rangle_X \big|.$$
But for each $n \ge 1$, the function $\omega \mapsto \langle x_n^*, f(\omega) \rangle_X$ is Σ-measurable, hence the function $\omega \mapsto \|f(\omega)\|_X$ is Σ-measurable. Let
$$C_0 \stackrel{df}{=} \big\{ \omega \in \Omega : \|f(\omega)\|_X > 0 \big\}.$$
We have that $C_0 \in \Sigma$ and for every $y \in X$, the function $\omega \mapsto f(\omega) - y$ is $\Sigma \cap C_0$-measurable. Therefore, the function $\omega \mapsto \|f(\omega) - y\|_X$ is $\Sigma \cap C_0$-measurable. Let $\{z_n\}_{n \ge 1}$ be dense in $f(\Omega)$. For a given $\varepsilon > 0$, we define
$$D_n \stackrel{df}{=} \big\{ \omega \in C_0 : \|f(\omega) - z_n\|_X < \varepsilon \big\}.$$
Evidently
$$D_n \in \Sigma \cap C_0 \qquad \text{and} \qquad C_0 = \bigcup_{n=1}^{\infty} D_n.$$
Let
$$E_n \stackrel{df}{=} D_n \setminus \bigcup_{i=1}^{n-1} D_i.$$
Then $\{E_n\}_{n \ge 1} \subseteq \Sigma \cap C_0$ is a sequence of disjoint sets and
$$C_0 = \bigcup_{n=1}^{\infty} E_n.$$
We define
$$f_{\varepsilon}(\omega) \stackrel{df}{=} \begin{cases} z_n & \text{if } \omega \in E_n,\ n \ge 1, \\ 0 & \text{if } \omega \in \Omega \setminus C_0. \end{cases}$$
Clearly $f_{\varepsilon} : \Omega \to X$ is Σ-measurable, countably-valued (i.e., takes countably many values) and
$$\|f(\omega) - f_{\varepsilon}(\omega)\|_X < \varepsilon \qquad \forall\ \omega \in \Omega.$$
Taking $\varepsilon = \frac{1}{k}$, $k \ge 1$, we see that $f$ is the uniform limit of a sequence $\big\{ f_{\frac{1}{k}} \big\}_{k \ge 1}$ of countably-valued functions, hence $f$ is strongly measurable.

An interesting byproduct of the previous proof is the following result.

COROLLARY 2.1.4 A function $f : \Omega \to X$ is strongly measurable if and only if it is the uniform limit almost everywhere of a sequence of countably-valued, Σ-measurable functions.

By virtue of Theorem 2.1.3, we see that the measurability situation of $X$-valued functions simplifies considerably when $X$ is separable.

THEOREM 2.1.5 If $X$ is separable and $f : \Omega \to X$, then the following three properties are equivalent:
(a) $f$ is strongly measurable;
(b) $f$ is Borel measurable;
(c) $f$ is weakly measurable.

REMARK 2.1.6 The usual facts regarding the stability of strongly measurable functions under sum, scalar multiplication and pointwise µ-almost everywhere limits hold. Also by just replacing absolute values by norms in the proof of the classical Egorov’s theorem (see Theorem A.2.12), we see that the result generalizes to $X$-valued functions. Finally for any Banach space $X$ and a strongly measurable function $f : \Omega \to X$, the function $\omega \mapsto \|f(\omega)\|_X$ is Σ-measurable. Indeed, if $\{s_n\}_{n \ge 1}$ is the sequence of $X$-valued simple functions, such that $s_n(\omega) \longrightarrow f(\omega)$ in $X$ for µ-a.a. $\omega \in \Omega$, then
$$\Big| \big\| f(\omega) \big\|_X - \big\| s_n(\omega) \big\|_X \Big| \le \big\| f(\omega) - s_n(\omega) \big\|_X \longrightarrow 0 \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega$$
and so the function $\omega \mapsto \|f(\omega)\|_X$ is Σ-measurable.

EXAMPLE 2.1.7 It can be shown that weak measurability does not imply strong measurability. Because of Theorem 2.1.5, we look for functions with values in a nonseparable Banach space. So consider the nonseparable Hilbert space $X = l^2\big( [0, 1] \big)$ and let $\{e_t\}_{t \in [0,1]}$ be an orthonormal basis. The function $f : [0, 1] \to l^2\big( [0, 1] \big)$ defined by $f(t) = e_t$ is weakly measurable, since for every $x^* \in l^2\big( [0, 1] \big)^* = l^2\big( [0, 1] \big)$, we have
$$\big( x^*, f(t) \big)_X = \big( x^*, e_t \big)_X = 0 \qquad \text{for all but countably many } t \in [0, 1].$$
On the other hand, if $A \subseteq [0, 1]$, then $f\big( [0, 1] \setminus A \big)$ is separable if and only if $[0, 1] \setminus A$ is countable and so we cannot have $\lambda^1(A) = 0$. Therefore by virtue of Corollary 2.1.4, $f$ is not strongly measurable.


Now we are ready to define the Bochner integral for strongly measurable functions.

DEFINITION 2.1.8
(a) Let

\[ s(\omega) \stackrel{df}{=} \sum_{k=1}^{N} x_k \chi_{C_k}(\omega), \quad x_k \in X,\ C_k \in \Sigma \]

be an $X$-valued simple function. The Bochner integral of $s$ is defined by

\[ \int_\Omega s(\omega)\, d\mu(\omega) \stackrel{df}{=} \sum_{k=1}^{N} \mu(C_k)\, x_k. \]

(b) A function $f : \Omega \longrightarrow X$ is said to be Bochner integrable, if there exists a sequence $\{s_n\}_{n \geq 1}$ of simple functions, such that

\[ \lim_{n \to +\infty} \int_\Omega \|f(\omega) - s_n(\omega)\|_X \, d\mu(\omega) = 0. \]

If $A \in \Sigma$, we define the Bochner integral $\int_A f(\omega)\, d\mu(\omega)$ of $f$ on $A$, by

\[ \int_A f(\omega)\, d\mu(\omega) = \lim_{n \to +\infty} \int_\Omega \chi_A(\omega) s_n(\omega)\, d\mu(\omega). \tag{2.1} \]

Instead of $\int_\Omega f(\omega)\, d\mu(\omega)$, we will often write $\int_\Omega f(\omega)\, d\mu$ or even $\int_\Omega f\, d\mu$, when no confusion is possible.

REMARK 2.1.9 It is easy to verify that in Definition 2.1.8(b), the limit in (2.1) exists and is independent of the sequence of simple functions $\{s_n\}_{n \geq 1}$ with the properties postulated there.

The next proposition gives a necessary and sufficient condition for the Bochner integrability of a function $f : \Omega \longrightarrow X$.

PROPOSITION 2.1.10 A strongly measurable function $f : \Omega \longrightarrow X$ is Bochner integrable if and only if the function $\omega \longmapsto \|f(\omega)\|_X$ is Lebesgue integrable (i.e., $\|f(\cdot)\|_X \in L^1(\Omega)$).

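PROOF “$\Longrightarrow$”: Let $\{s_n\}_{n \geq 1}$ be a sequence of $X$-valued simple functions, such that

\[ \int_\Omega \|f(\omega) - s_n(\omega)\|_X \, d\mu \longrightarrow 0. \]

Then for all $n \geq 1$ large enough, we have

\[ \int_\Omega \|f(\omega)\|_X \, d\mu \leq \int_\Omega \|f(\omega) - s_n(\omega)\|_X \, d\mu + \int_\Omega \|s_n(\omega)\|_X \, d\mu < +\infty, \]

so $\|f(\cdot)\|_X \in L^1(\Omega)$.

“$\Longleftarrow$”: Since $f : \Omega \longrightarrow X$ is strongly measurable, we can find a sequence $\{s_n\}_{n \geq 1}$ of $X$-valued, simple functions, such that

\[ \lim_{n \to +\infty} \|f(\omega) - s_n(\omega)\|_X = 0 \quad \forall\ \omega \in \Omega \setminus A, \]

with $\mu(A) = 0$. Hence

\[ \lim_{n \to +\infty} \|s_n(\omega)\|_X = \|f(\omega)\|_X \quad \forall\ \omega \in \Omega \setminus A. \]

For every $n \geq 1$, let $h_n : \Omega \longrightarrow X$ be defined by

\[ h_n(\omega) \stackrel{df}{=} \begin{cases} s_n(\omega) & \text{if } \|s_n(\omega)\|_X < 2\|f(\omega)\|_X, \\ 0 & \text{otherwise.} \end{cases} \]

Evidently for every $n \geq 1$, $h_n$ is an $X$-valued simple function. Also

\[ \lim_{n \to +\infty} \|f(\omega) - h_n(\omega)\|_X = 0 \quad \forall\ \omega \in \Omega \setminus A \]

and

\[ \|f(\omega) - h_n(\omega)\|_X \leq 3\|f(\omega)\|_X \quad \forall\ \omega \in \Omega \setminus A. \]

So by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have

\[ \lim_{n \to +\infty} \int_\Omega \|f(\omega) - h_n(\omega)\|_X \, d\mu = 0, \]

so $f$ is Bochner integrable (see Definition 2.1.8(b)).

COROLLARY 2.1.11 If $f : \Omega \longrightarrow X$ is a Bochner integrable function and $A \in \Sigma$, then

\[ \Big\| \int_A f(\omega)\, d\mu \Big\|_X \leq \int_A \|f(\omega)\|_X \, d\mu. \]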


PROOF It is clear that the corollary holds for any simple function $s : \Omega \longrightarrow X$. Then use Proposition 2.1.10.

It is a direct consequence of Definition 2.1.8(b) that the Bochner integral is a linear operator. Namely we have the following proposition.

PROPOSITION 2.1.12 If $f, g : \Omega \longrightarrow X$ are two Bochner integrable functions, $A \in \Sigma$ and $\xi \in \mathbb{R}$, then $f + \xi g$ is Bochner integrable too and

\[ \int_A \big(f + \xi g\big)(\omega)\, d\mu = \int_A f(\omega)\, d\mu + \xi \int_A g(\omega)\, d\mu. \]
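For simple functions, Definition 2.1.8(a), the norm estimate of Corollary 2.1.11 and the linearity stated in Proposition 2.1.12 can be verified by direct computation. The Python sketch below is an illustration only: the space $X = \mathbb{R}^2$ with the Euclidean norm, the three disjoint sets of measure $0.5$, $0.25$, $0.25$, the values of $f$ and $g$ and the scalar $\xi = 3$ are all hypothetical choices, and the helper name `integrate_simple` is not taken from the text.

```python
import math

def integrate_simple(pieces):
    """Bochner integral of s = sum_k x_k * chi_{C_k}: returns sum_k mu(C_k) * x_k.
    `pieces` is a list of (mu(C_k), x_k) with x_k a vector in R^d, C_k disjoint."""
    dim = len(pieces[0][1])
    return [sum(m * x[i] for m, x in pieces) for i in range(dim)]

def norm(x):
    """Euclidean norm on R^d, playing the role of the norm of X."""
    return math.sqrt(sum(c * c for c in x))

# Disjoint sets C_1, C_2, C_3 with measures 0.5, 0.25, 0.25 (so mu(Omega) = 1).
measures = [0.5, 0.25, 0.25]
f_vals = [(1.0, 0.0), (0.0, 2.0), (-1.0, 1.0)]   # values of f on C_1, C_2, C_3
g_vals = [(0.0, 1.0), (4.0, 0.0), (2.0, -2.0)]   # values of g on C_1, C_2, C_3
xi = 3.0

int_f = integrate_simple(list(zip(measures, f_vals)))
int_g = integrate_simple(list(zip(measures, g_vals)))

# Values of f + xi * g on each C_k, then its integral.
combo = [tuple(a + xi * b for a, b in zip(fv, gv)) for fv, gv in zip(f_vals, g_vals)]
int_combo = integrate_simple(list(zip(measures, combo)))

# Integral of the real-valued function w -> ||f(w)||.
int_norm_f = sum(m * norm(x) for m, x in zip(measures, f_vals))

print(int_f, int_g, int_combo)
print(norm(int_f) <= int_norm_f)
```

All measures and values are dyadic, so the floating-point sums are exact and the linearity identity can be compared with `==`.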

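The Lebesgue dominated convergence theorem (see Theorem A.2.2) applies also to Bochner integrable functions.

PROPOSITION 2.1.13 If $f : \Omega \longrightarrow X$ is a strongly measurable function, $f_n : \Omega \longrightarrow X$, $n \geq 1$, are Bochner integrable,

\[ f_n(\omega) \longrightarrow f(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \]

and there exists $h \in L^1(\Omega)_+$, such that

\[ \|f_n(\omega)\|_X \leq h(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } n \geq 1, \]

then $f$ is Bochner integrable and we have

\[ \int_A f(\omega)\, d\mu = \lim_{n \to +\infty} \int_A f_n(\omega)\, d\mu \quad \forall\ A \in \Sigma. \]

PROOF Clearly

\[ \|f(\omega)\|_X \leq h(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \]

and the function

\[ \omega \longmapsto \|f(\omega) - f_n(\omega)\|_X \]

is $\Sigma$-measurable for every $n \geq 1$. Since

\[ \|f(\omega) - f_n(\omega)\|_X \leq 2h(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega, \]

we have that

\[ \|f(\cdot) - f_n(\cdot)\|_X \in L^1(\Omega) \quad \forall\ n \geq 1. \]

Thus by the Lebesgue dominated convergence theorem (see Theorem A.2.2) for $\mathbb{R}$-valued functions, we have

\[ \int_\Omega \|f(\omega) - f_n(\omega)\|_X \, d\mu \longrightarrow 0. \tag{2.2} \]

By virtue of Definition 2.1.8(b), for each $n \geq 1$, we can find an $X$-valued simple function $s_n$, such that

\[ \int_\Omega \|f_n(\omega) - s_n(\omega)\|_X \, d\mu < \frac{1}{n}. \]

We have

\[ \int_\Omega \|f(\omega) - s_n(\omega)\|_X \, d\mu \leq \int_\Omega \|f(\omega) - f_n(\omega)\|_X \, d\mu + \int_\Omega \|f_n(\omega) - s_n(\omega)\|_X \, d\mu \longrightarrow 0 \quad \text{as } n \to +\infty, \]

so $f$ is Bochner integrable. Moreover, from Corollary 2.1.11 and (2.2), we have

\[ \Big\| \int_A f(\omega)\, d\mu - \int_A f_n(\omega)\, d\mu \Big\|_X \leq \int_A \|f(\omega) - f_n(\omega)\|_X \, d\mu \leq \int_\Omega \|f(\omega) - f_n(\omega)\|_X \, d\mu \longrightarrow 0 \quad \text{as } n \to +\infty. \]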

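Also we have a version of Fatou's lemma (see Theorem A.2.1).

PROPOSITION 2.1.14 If $f_n : \Omega \longrightarrow X$, $n \geq 1$, are Bochner integrable, $\big\{ \int_\Omega \|f_n\|_X \, d\mu \big\}_{n \geq 1}$ is bounded and $f_n(\omega) \stackrel{w}{\longrightarrow} f(\omega)$ for a.a. $\omega \in \Omega$, then $f$ is Bochner integrable and

\[ \int_\Omega \|f(\omega)\|_X \, d\mu \leq \liminf_{n \to +\infty} \int_\Omega \|f_n(\omega)\|_X \, d\mu. \]

PROOF Evidently $f$ is weakly measurable. Also by Theorem 2.1.3 for every $n \geq 1$, we can find $A_n \in \Sigma$ which is $\mu$-null and such that $f_n(\Omega \setminus A_n)$ is separable. Let $C \in \Sigma$ be the $\mu$-null set, such that for $\omega \in \Omega \setminus C$, we have

\[ f_n(\omega) \stackrel{w}{\longrightarrow} f(\omega). \]

Let

\[ A \stackrel{df}{=} \Big( \bigcup_{n=1}^{\infty} A_n \Big) \cup C. \]

Then $A \in \Sigma$ and it is $\mu$-null. Let

\[ Y \stackrel{df}{=} \overline{\mathrm{span}} \bigcup_{n=1}^{\infty} f_n\big(\Omega \setminus A_n\big). \]

Evidently $Y$ is a separable Banach subspace of $X$ and $f(\Omega \setminus A) \subseteq Y$, so by Theorem 2.1.3, $f$ is strongly measurable. By virtue of the weak lower semicontinuity of the norm functional in a Banach space, we have

\[ \|f(\omega)\|_X \leq \liminf_{n \to +\infty} \|f_n(\omega)\|_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]

Since $\|f_n(\cdot)\|_X$, $n \geq 1$, is Lebesgue integrable (see Proposition 2.1.10), by Fatou's lemma (see Theorem A.2.1), we have

\[ \int_\Omega \|f(\omega)\|_X \, d\mu \leq \liminf_{n \to +\infty} \int_\Omega \|f_n(\omega)\|_X \, d\mu. \]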

DEFINITION 2.1.15 A set function $m : \Sigma \longrightarrow X$ is said to be a vector measure, if for all sequences $\{A_n\}_{n \geq 1} \subseteq \Sigma$ of pairwise disjoint sets, we have

\[ m\Big( \bigcup_{n=1}^{\infty} A_n \Big) = \sum_{n=1}^{\infty} m(A_n), \]

where the series converges in the norm topology of $X$.

The next proposition shows that the indefinite Bochner integral

\[ A \longmapsto \int_A f\, d\mu \]

of a Bochner integrable function $f : \Omega \longrightarrow X$ is a vector measure which is absolutely continuous with respect to $\mu$ (i.e., $m \prec\prec \mu$).


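PROPOSITION 2.1.16 If $f : \Omega \longrightarrow X$ is a Bochner integrable function, then the set function $m : \Sigma \longrightarrow X$ defined by

\[ m(A) \stackrel{df}{=} \int_A f(\omega)\, d\mu \quad \forall\ A \in \Sigma \]

is a vector measure and $m \prec\prec \mu$, i.e.,

\[ \lim_{\mu(A) \searrow 0} m(A) = 0. \]

PROOF Let $\{A_n\}_{n \geq 1} \subseteq \Sigma$ be a sequence of pairwise disjoint sets. Since

\[ \Big\| \int_{A_n} f(\omega)\, d\mu \Big\|_X \leq \int_{A_n} \|f(\omega)\|_X \, d\mu \quad \forall\ n \geq 1, \]

the series

\[ \sum_{n=1}^{\infty} \int_{A_n} f(\omega)\, d\mu \]

is dominated term-by-term by the convergent series of positive terms

\[ \sum_{n=1}^{\infty} \int_{A_n} \|f(\omega)\|_X \, d\mu \leq \int_\Omega \|f(\omega)\|_X \, d\mu < +\infty \]

(see Proposition 2.1.10). Therefore the series

\[ \sum_{n=1}^{\infty} \int_{A_n} f(\omega)\, d\mu \]

is absolutely convergent. Moreover, for all $k \geq 1$, we have

\[ \Big\| \int_{\bigcup_{n=1}^{\infty} A_n} f(\omega)\, d\mu - \sum_{n=1}^{k} \int_{A_n} f(\omega)\, d\mu \Big\|_X = \Big\| \int_{\bigcup_{n=k+1}^{\infty} A_n} f(\omega)\, d\mu \Big\|_X \leq \int_{\bigcup_{n=k+1}^{\infty} A_n} \|f(\omega)\|_X \, d\mu \longrightarrow 0 \quad \text{as } k \to +\infty, \]

so

\[ m\Big( \bigcup_{n=1}^{\infty} A_n \Big) = \sum_{n=1}^{\infty} m(A_n), \]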


i.e., $m$ is a vector measure. Since $\|f(\cdot)\|_X \in L^1(\Omega)$, from the absolute continuity of the Lebesgue integral, we have

\[ \lim_{\mu(A) \searrow 0} \int_A \|f(\omega)\|_X \, d\mu = 0. \]

From Corollary 2.1.11, we have

\[ \lim_{\mu(A) \searrow 0} \|m(A)\|_X = \lim_{\mu(A) \searrow 0} \Big\| \int_A f(\omega)\, d\mu \Big\|_X \leq \lim_{\mu(A) \searrow 0} \int_A \|f(\omega)\|_X \, d\mu = 0 \]

and so $m \prec\prec \mu$.

Thus far the theory of Bochner integration is a straightforward extension of the theory of Lebesgue integration, with the absolute values replaced by norms. The next theorem exhibits a strong property of the Bochner integral that has no counterpart in the theory of Lebesgue integration.

THEOREM 2.1.17 If $Y$ is another Banach space, $L : X \supseteq D \longrightarrow Y$ is a closed linear operator and $f : \Omega \longrightarrow X$ is a Bochner integrable function, such that $L\big(f(\cdot)\big) : \Omega \longrightarrow Y$ is Bochner integrable too, then

\[ L\Big( \int_A f(\omega)\, d\mu \Big) = \int_A L\big(f(\omega)\big)\, d\mu \quad \forall\ A \in \Sigma. \]

PROOF

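Let

\[ C_0 \stackrel{df}{=} \big\{ \omega \in \Omega : \|f(\omega)\|_X > 0 \big\} \in \Sigma. \]

By Corollary 2.1.4, for a given $\varepsilon > 0$, we can find countably valued functions $h_\varepsilon : C_0 \longrightarrow X$ and $g_\varepsilon : C_0 \longrightarrow Y$, such that

\[ \sup_{\omega \in \Omega \setminus E} \|f(\omega) - h_\varepsilon(\omega)\|_X < \frac{\varepsilon}{2} \]

and

\[ \sup_{\omega \in \Omega \setminus E} \big\| L\big(f(\omega)\big) - g_\varepsilon(\omega) \big\|_Y < \frac{\varepsilon}{2}, \]

with $E \in \Sigma$ being a $\mu$-null set. Let $\{B_n\}_{n \geq 1} \subseteq \Sigma \cap C_0$ be a common refinement of the subdivisions corresponding to $h_\varepsilon$ and $g_\varepsilon$ and let $\omega_n \in B_n$ for all $n \geq 1$.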


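We introduce the function $u_\varepsilon : \Omega \longrightarrow X$, defined by

\[ u_\varepsilon(\omega) \stackrel{df}{=} \begin{cases} f(\omega_n) & \text{if } \omega \in B_n,\ n \geq 1, \\ 0 & \text{if } \omega \in \Omega \setminus C_0. \end{cases} \]

Then, we have

\[ \int_\Omega \|f(\omega) - u_\varepsilon(\omega)\|_X \, d\mu < \varepsilon \mu(\Omega) \tag{2.3} \]

and

\[ \int_\Omega \big\| L\big(f(\omega)\big) - L\big(u_\varepsilon(\omega)\big) \big\|_Y \, d\mu < \varepsilon \mu(\Omega). \tag{2.4} \]

Also for every $A \in \Sigma$, we have

\[ \int_A u_\varepsilon(\omega)\, d\mu = \sum_{n=1}^{\infty} f(\omega_n)\, \mu(B_n \cap A) = \lim_{N \to +\infty} \sum_{n=1}^{N} f(\omega_n)\, \mu(B_n \cap A) \tag{2.5} \]

and

\[ \int_A L\big(u_\varepsilon(\omega)\big)\, d\mu = \sum_{n=1}^{\infty} L\big(f(\omega_n)\big)\, \mu(B_n \cap A) = \lim_{N \to +\infty} \sum_{n=1}^{N} L\big(f(\omega_n)\big)\, \mu(B_n \cap A). \tag{2.6} \]

Since by hypothesis $L$ is a closed, linear operator, from (2.5) and (2.6), we have that

\[ \Big( \int_A u_\varepsilon \, d\mu,\ \int_A L(u_\varepsilon)\, d\mu \Big) \in \mathrm{Gr}\, L. \]

Consider a sequence $\varepsilon_n \searrow 0$. From (2.3) and (2.4), we have

\[ \int_A u_{\varepsilon_n} \, d\mu \longrightarrow \int_A f\, d\mu \quad \text{and} \quad \int_A L(u_{\varepsilon_n})\, d\mu \longrightarrow \int_A L(f)\, d\mu. \]

Since

\[ L\Big( \int_A u_{\varepsilon_n} \, d\mu \Big) = \int_A L(u_{\varepsilon_n})\, d\mu \quad \forall\ n \geq 1 \]

and $L$ is closed, it follows that

\[ L\Big( \int_A f(\omega)\, d\mu \Big) = \int_A L\big(f(\omega)\big)\, d\mu \quad \forall\ A \in \Sigma. \]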


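REMARK 2.1.18 If $f : \Omega \longrightarrow X$ is a Bochner integrable function and $L \in \mathcal{L}(X; Y)$, then $L(f)$ is Bochner integrable, since

\[ \big\| L\big(f(\omega)\big) \big\|_Y \leq \|L\|_{\mathcal{L}} \, \|f(\omega)\|_X \quad \forall\ \omega \in \Omega. \]

COROLLARY 2.1.19 If $f, g : \Omega \longrightarrow X$ are two Bochner integrable functions and

\[ \int_A f(\omega)\, d\mu = \int_A g(\omega)\, d\mu \quad \forall\ A \in \Sigma, \]

then $f(\omega) = g(\omega)$ for $\mu$-almost all $\omega \in \Omega$.

PROOF We may assume that $g = 0$ and that $X$ is separable (see Theorem 2.1.3). Then the ball

\[ \overline{B}_1^* = \big\{ x^* \in X^* : \|x^*\|_{X^*} \leq 1 \big\} \]

furnished with the relative weak$^*$-topology is compact, metrizable (see Alaoglu theorem; Theorem A.3.9 and Theorem A.3.13). Let $\{x_n^*\}_{n \geq 1}$ be a countable $w^*$-dense subset of $\overline{B}_1^*$. By Theorem 2.1.17, for every $n \geq 1$ and $A \in \Sigma$, we have

\[ \int_A \big\langle x_n^*, f(\omega) \big\rangle_X \, d\mu = \Big\langle x_n^*, \int_A f(\omega)\, d\mu \Big\rangle_X = 0, \]

so

\[ \big\langle x_n^*, f(\omega) \big\rangle_X = 0 \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]

Since

\[ \|f(\omega)\|_X = \sup_{x^* \in \overline{B}_1^*} \big| \langle x^*, f(\omega) \rangle_X \big|, \]

we have

\[ \|f(\omega)\|_X = 0 \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \]

and so

\[ f(\omega) = 0 \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]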

A similar proof gives us the following result.


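COROLLARY 2.1.20 If $f, g : \Omega \longrightarrow X$ are two strongly measurable functions and

\[ \big\langle x^*, f(\omega) \big\rangle_X = \big\langle x^*, g(\omega) \big\rangle_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } x^* \in X^* \]

(the exceptional $\mu$-null set may depend on $x^* \in X^*$), then $f(\omega) = g(\omega)$ for $\mu$-a.a. $\omega \in \Omega$.

The next result can be viewed as a kind of mean value theorem for the Bochner integral.

PROPOSITION 2.1.21 If $f : \Omega \longrightarrow X$ is a Bochner integrable function and $A \in \Sigma$ with $\mu(A) > 0$, then

\[ \frac{1}{\mu(A)} \int_A f(\omega)\, d\mu \in \overline{\mathrm{conv}}\, f(A). \]

PROOF We proceed by contradiction. Suppose that

\[ \frac{1}{\mu(A)} \int_A f(\omega)\, d\mu \notin \overline{\mathrm{conv}}\, f(A). \]

Then by the strong separation theorem for convex sets (see Theorem A.3.2), we can find $x^* \in X^* \setminus \{0\}$ and $\vartheta \in \mathbb{R}$, such that

\[ \Big\langle x^*, \frac{1}{\mu(A)} \int_A f(\omega)\, d\mu \Big\rangle_X < \vartheta \leq \big\langle x^*, f(\omega) \big\rangle_X \quad \forall\ \omega \in A, \]

so using Theorem 2.1.17, we have

\[ \frac{1}{\mu(A)} \int_A \big\langle x^*, f(\omega) \big\rangle_X \, d\mu < \vartheta \leq \big\langle x^*, f(\omega) \big\rangle_X \quad \forall\ \omega \in A. \]

Integrating this inequality over $A$, we obtain

\[ \int_A \big\langle x^*, f(\omega) \big\rangle_X \, d\mu < \vartheta \mu(A) \leq \int_A \big\langle x^*, f(\omega) \big\rangle_X \, d\mu, \]

a contradiction.

Also for Bochner integrable functions, the Lebesgue differentiation theorem holds (see Theorem 1.4.6). So we have the following result.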


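PROPOSITION 2.1.22 If $Z \subseteq \mathbb{R}^N$ is a bounded open set and $f : Z \longrightarrow X$ is a Bochner integrable function, then

\[ \lim_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \|f(y) - f(x)\|_X \, d\lambda^N(y) = 0 \quad \text{for } \lambda^N\text{-a.a. } x \in Z, \]

where

\[ a(N) \stackrel{df}{=} \frac{\pi^{N/2}}{\left(\frac{N}{2}\right)!} \]

is the volume of the unit ball in $\mathbb{R}^N$ (with the convention $\left(\frac{N}{2}\right)! = \Gamma\left(\frac{N}{2} + 1\right)$).

PROOF Invoking Theorem 2.1.3, we may assume that $X$ is separable. Let $\{x_n\}_{n \geq 1}$ be a dense set in $X$. Then by Theorem 1.4.6, we have

\[ \lim_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \|f(y) - x_n\|_X \, d\lambda^N(y) = \|f(x) - x_n\|_X \quad \text{for } \lambda^N\text{-a.a. } x \in Z \text{ and all } n \geq 1. \tag{2.7} \]

Let $x \in Z$ be a point where (2.7) is valid. Then for a given $\varepsilon > 0$, we can select $x_n$, such that

\[ \|f(x) - x_n\|_X < \varepsilon. \]

We have

\[ \limsup_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \|f(y) - f(x)\|_X \, d\lambda^N(y) \leq \limsup_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \Big[ \|f(y) - x_n\|_X + \|x_n - f(x)\|_X \Big] d\lambda^N(y) < 2\varepsilon, \]

so

\[ \limsup_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \|f(y) - f(x)\|_X \, d\lambda^N(y) = 0 \quad \text{for } \lambda^N\text{-a.a. } x \in Z. \]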
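The statement of Proposition 2.1.22 can be observed numerically in a simple case. The sketch below is an illustration under assumptions that are not in the text: $N = 1$, $Z = (-1, 1)$, $X = \mathbb{R}^2$ with the Euclidean norm, the hypothetical function $f(t) = (\cos t, \sin t)$, the point $x = 0.3$, and a midpoint rule in place of the Lebesgue integral. It also evaluates $a(N) = \pi^{N/2} / \Gamma(N/2 + 1)$ for small $N$.

```python
import math

def unit_ball_volume(n):
    """a(N) = pi^(N/2) / Gamma(N/2 + 1): Lebesgue measure of the unit ball in R^N."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def mean_oscillation(f, x, r, samples=2000):
    """Midpoint-rule approximation (case N = 1) of
    (1 / (a(1) * r)) * integral over [x - r, x + r] of ||f(y) - f(x)|| dy."""
    fx = f(x)
    step = 2 * r / samples
    total = 0.0
    for i in range(samples):
        y = x - r + (i + 0.5) * step
        total += math.dist(f(y), fx) * step
    return total / (unit_ball_volume(1) * r)

f = lambda t: (math.cos(t), math.sin(t))   # hypothetical R^2-valued function
radii = [0.4, 0.2, 0.1, 0.05]              # all balls stay inside Z = (-1, 1)
averages = [mean_oscillation(f, 0.3, r) for r in radii]

for r, avg in zip(radii, averages):
    print(r, avg)
```

For this smooth $f$ the averages shrink roughly like $r/2$; the proposition asserts convergence to $0$ at almost every point for every Bochner integrable $f$, with no rate in general.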

We conclude this section by introducing three weaker integrals for Banach space valued functions.

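DEFINITION 2.1.23 Let $f : \Omega \longrightarrow X$ be a function.

(a) Suppose that $f : \Omega \longrightarrow X$ is weakly measurable. We say that $f$ is Pettis integrable, if for each $A \in \Sigma$, there exists $x_A \in X$, such that

\[ \big\langle x^*, x_A \big\rangle_X = \int_A \big\langle x^*, f(\omega) \big\rangle_X \, d\mu \quad \forall\ x^* \in X^*. \]

Then we write

\[ x_A = (P)\text{-}\int_A f(\omega)\, d\mu. \]

(b) Suppose that $f : \Omega \longrightarrow X$ is weakly measurable. We say that $f$ is Dunford integrable, if for each $A \in \Sigma$, there exists $x_A^{**} \in X^{**}$, such that

\[ \big\langle x_A^{**}, x^* \big\rangle_{X^*} = \int_A \big\langle x^*, f(\omega) \big\rangle_X \, d\mu \quad \forall\ x^* \in X^*. \]

Then we write

\[ x_A^{**} = (D)\text{-}\int_A f(\omega)\, d\mu. \]

(c) Suppose that $f : \Omega \longrightarrow X^*$ is $w^*$-measurable. We say that $f$ is Gelfand integrable, if for each $A \in \Sigma$, there exists $x_A^* \in X^*$, such that

\[ \big\langle x_A^*, x \big\rangle_X = \int_A \big\langle f(\omega), x \big\rangle_X \, d\mu \quad \forall\ x \in X. \]

Then we write

\[ x_A^* = (G)\text{-}\int_A f(\omega)\, d\mu. \]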

REMARK 2.1.24 Clearly we have that Bochner integrability implies Pettis integrability and Pettis integrability implies Dunford integrability. The reverse implications need not be true. Of course if $X$ is reflexive, then the Pettis and Dunford integrals coincide. Finally note that the Gelfand integral is actually the Pettis integral for $X^*$-valued functions.

For a Pettis integrable function $f : \Omega \longrightarrow X$, we consider the set function $A \longmapsto (P)\text{-}\int_A f\, d\mu$. We want to know if this is a $\mu$-continuous vector measure, as was the case with the Bochner integral (see Proposition 2.1.16). To answer this we need some preparation.


DEFINITION 2.1.25 Let $\sum_{n=1}^{\infty} x_n$ be a series of elements of $X$.

(a) We say that the series $\sum_{n=1}^{\infty} x_n$ is unconditionally convergent to $x$, if for all permutations $\pi$ of $\mathbb{N}$, the series $\sum_{n=1}^{\infty} x_{\pi(n)}$ converges to $x$.

(b) We say that the series $\sum_{n=1}^{\infty} x_n$ is weakly subseries convergent, if for every strictly increasing sequence $\{n_k\}_{k \geq 1}$ of integers, the series $\sum_{k=1}^{\infty} x_{n_k}$ is weakly convergent.

REMARK 2.1.26 If $\sum_{n=1}^{\infty} x_n$ is absolutely convergent (i.e., the series $\sum_{n=1}^{\infty} \|x_n\|_X$ is convergent), then it is unconditionally convergent. Also unconditional convergence is equivalent to the subseries convergence (in the norm topology of $X$) and implies convergence.

The next result is known as the Orlicz-Pettis theorem.

THEOREM 2.1.27 (Orlicz-Pettis Theorem) A formal series $\sum_{n=1}^{\infty} x_n$ in $X$ is unconditionally convergent if and only if it is weakly subseries convergent.

REMARK 2.1.28 An interesting consequence of the above theorem is that if $m : \Sigma \longrightarrow X$ is a weakly countably additive set function, then it is a vector measure.

PROPOSITION 2.1.29 If $f : \Omega \longrightarrow X$ is Pettis integrable, then the function

\[ \Sigma \ni A \longmapsto m(A) = (P)\text{-}\int_A f\, d\mu \]

is a vector measure.


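PROOF Let $\{A_n\}_{n \geq 1}$ be a sequence of pairwise disjoint sets in $\Sigma$. For every $x^* \in X^*$, we have

\[ \Big\langle x^*, (P)\text{-}\int_{\bigcup_{n=1}^{\infty} A_n} f(\omega)\, d\mu \Big\rangle_X = \int_{\bigcup_{n=1}^{\infty} A_n} \big\langle x^*, f(\omega) \big\rangle_X \, d\mu = \sum_{n=1}^{\infty} \int_{A_n} \big\langle x^*, f(\omega) \big\rangle_X \, d\mu = \sum_{n=1}^{\infty} \Big\langle x^*, (P)\text{-}\int_{A_n} f(\omega)\, d\mu \Big\rangle_X, \]

so the function

\[ \Sigma \ni A \longmapsto m(A) = (P)\text{-}\int_A f\, d\mu \]

is weakly countably additive. Of course the same argument applies to any subsequence of $\{A_n\}_{n \geq 1}$. So we can invoke Theorem 2.1.27 and conclude that the function

\[ \Sigma \ni A \longmapsto m(A) = (P)\text{-}\int_A f\, d\mu \]

is a vector measure.

REMARK 2.1.30 The result is not true for the Dunford integral, which is not even strongly additive (see Diestel & Uhl (1977, p. 53)).

The next result provides an easy test for checking the Gelfand integrability of $f : \Omega \longrightarrow X^*$.

PROPOSITION 2.1.31 If $f : \Omega \longrightarrow X^*$ has the following property

\[ \big\langle f(\cdot), x \big\rangle_X \in L^1(\Omega) \quad \forall\ x \in X, \]

then $f$ is Gelfand integrable.

PROOF Let $A \in \Sigma$ and let $L : X \longrightarrow L^1(\Omega)$ be defined by

\[ L(x) \stackrel{df}{=} \big\langle f(\cdot), x \big\rangle_X \quad \forall\ x \in X. \]

We claim that the linear operator $L$ has a closed graph. To this end suppose that

\[ x_n \longrightarrow x \ \text{in } X \quad \text{and} \quad \big\langle f(\cdot), x_n \big\rangle_X \longrightarrow g \ \text{in } L^1(\Omega). \]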


Then by passing to a suitable subsequence of $\{x_n\}_{n \geq 1}$ if necessary, we may assume that

\[ \big\langle f(\omega), x_n \big\rangle_X \longrightarrow g(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]

Therefore

\[ g(\omega) = \big\langle f(\omega), x \big\rangle_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega, \]

hence $(x, g) \in \mathrm{Gr}\, L$, i.e., $L$ has closed graph. By the closed graph theorem (see Theorem A.3.7), $L$ is continuous and if $I_A : L^1(\Omega) \longrightarrow \mathbb{R}$ is the integral operator, defined by

\[ I_A(g) \stackrel{df}{=} \int_A g(\omega)\, d\mu \quad \forall\ g \in L^1(\Omega), \]

we have that $I_A \circ L \in X^*$ and so there exists $x_A^* \in X^*$, such that

\[ \big\langle x_A^*, x \big\rangle_X = \int_A \big\langle f(\omega), x \big\rangle_X \, d\mu \quad \forall\ x \in X. \]

Therefore $f$ is Gelfand integrable.

REMARK 2.1.32 The same closed graph argument shows that if $f : \Omega \longrightarrow X$ is such that

\[ \big\langle x^*, f(\cdot) \big\rangle_X \in L^1(\Omega) \quad \forall\ x^* \in X^*, \]

then $f$ is Dunford integrable. The situation with the Pettis integrability is less satisfactory and more sophisticated criteria are needed to establish it (see Diestel & Uhl (1977, pp. 54–56)).

COROLLARY 2.1.33 If $f : \Omega \longrightarrow X^*$ is a $w^*$-measurable function and has range which is norm bounded in $X^*$, then $f$ is Gelfand integrable.

The same argument as in the proof of Proposition 2.1.21 gives the following mean value theorem for the Gelfand integral.

THEOREM 2.1.34 If $f : \Omega \longrightarrow X^*$ is a Gelfand integrable function and $A \in \Sigma$ with $\mu(A) > 0$, then

\[ \frac{1}{\mu(A)}\, (G)\text{-}\int_A f\, d\mu \in \overline{\mathrm{conv}}^{\,w^*} f(A). \]

2.2 Lebesgue-Bochner Spaces and Evolution Triples

Using the Bochner integral introduced in the previous section, we can introduce generalizations of the classical Lebesgue spaces to Banach space valued functions. As in the previous section $(\Omega, \Sigma, \mu)$ is a finite measure space and $X$ is a Banach space. Additional hypotheses will be introduced as needed.

DEFINITION 2.2.1 Let $p \in [1, +\infty]$. By $L^p(\Omega; X)$ we denote the space of equivalence classes of strongly measurable functions $f : \Omega \longrightarrow X$, such that $\|f(\cdot)\|_X \in L^p(\Omega)$. Also we introduce their respective norms by

\[ \|f\|_p \stackrel{df}{=} \Big( \int_\Omega \|f(\omega)\|_X^p \, d\mu \Big)^{\frac{1}{p}} \quad \text{if } p \in [1, +\infty) \]

and

\[ \|f\|_\infty \stackrel{df}{=} \operatorname*{esssup}_{\omega \in \Omega} \|f(\omega)\|_X. \]

REMARK 2.2.2 As with $\mathbb{R}$-valued functions, the equivalence relation used in the above definition is the following:

\[ f \sim g \quad \text{if and only if} \quad f(\omega) = g(\omega) \ \text{for } \mu\text{-a.a. } \omega \in \Omega. \]

It is routine to check the following facts.

PROPOSITION 2.2.3
(a) $\big(L^p(\Omega; X), \|\cdot\|_p\big)$ is a Banach space for $p \in [1, +\infty]$.
(b) If $p \in [1, +\infty)$, $\Sigma$ is countably generated and $X$ is separable, then $L^p(\Omega; X)$ is separable.
(c) If $p \in (1, +\infty)$ and $X$ is reflexive, then $L^p(\Omega; X)$ is reflexive.
(d) If $X$ is a Hilbert space, then $L^2(\Omega; X)$ is a Hilbert space too with inner product

\[ (f, g)_2 = \int_\Omega \big(f(\omega), g(\omega)\big)_X \, d\mu. \]


REMARK 2.2.4 The $\sigma$-field $\Sigma$ is countably generated if there exists a countable subfamily $\mathcal{T}$, such that $\Sigma = \sigma(\mathcal{T})$. If $\Omega$ is an open or closed subset of $\mathbb{R}^N$, then the Borel $\sigma$-field $\mathcal{B}(\Omega)$ is countably generated. Also clearly if $p \in [1, +\infty)$ and $L^p(\Omega; X)$ is separable, then $X$ is separable. Additional conditions on $X$ usually translate to corresponding properties of the Lebesgue-Bochner space $L^p(\Omega; X)$. So if $p \in (1, +\infty)$, then $L^p(\Omega; X)$ is uniformly convex if and only if $X$ is uniformly convex (see Day (1955, 1973)). Moreover, as for the Lebesgue spaces, simple functions are dense in $L^p(\Omega; X)$ and if $Z \subseteq \mathbb{R}^N$ is a bounded open set then $C^\infty\big(\overline{Z}; X\big)$ is dense in $L^p(Z; X)$ for $p \in [1, +\infty)$.

PROPOSITION 2.2.5 If $Y$ is another Banach space, $X \subseteq Y$ and the embedding is continuous, $p, r \in [1, +\infty]$, $p \leq r$, then $L^r(\Omega; X) \subseteq L^p(\Omega; Y)$ and the embedding is continuous.

PROOF Let $f \in L^r(\Omega; X)$. Since the embedding $X \subseteq Y$ is continuous, using Hölder's inequality (see Theorem A.2.27; as $p \leq r$), we have

\[ \Big( \int_\Omega \|f(\omega)\|_Y^p \, d\mu \Big)^{\frac{1}{p}} \leq c_1 \Big( \int_\Omega \|f(\omega)\|_X^p \, d\mu \Big)^{\frac{1}{p}} \leq c_2 \Big( \int_\Omega \|f(\omega)\|_X^r \, d\mu \Big)^{\frac{1}{r}}, \]

for some $c_1, c_2 > 0$. So $L^r(\Omega; X) \subseteq L^p(\Omega; Y)$ and the embedding is continuous.

We want to identify the dual of $L^p(\Omega; X)$ for $p \in [1, +\infty)$. First a definition which is motivated by the fact that the proof of the classical Riesz representation theorem (see Theorem A.3.24) uses the Radon-Nikodym theorem (see Theorem A.2.24).

DEFINITION 2.2.6
(a) Let $m : \Sigma \longrightarrow X$ be a vector measure (see Definition 2.1.15). We say that $m$ is of bounded variation, if $|m|(\Omega) < +\infty$, where

\[ |m|(A) = \sup_{T_A} \sum_{C \in T_A} \|m(C)\|_X \quad \forall\ A \in \Sigma, \]

with $T_A$ running through the set of all finite $\Sigma$-partitions of $A$. The quantity $|m| : \Sigma \longrightarrow \mathbb{R}_+$ is called the variation of $m$ and is a measure.

(b) A Banach space $X$ is said to have the Radon-Nikodym property (RNP for short), if for every probability space $(\Omega, \Sigma, \mu)$ and every vector measure $m : \Sigma \longrightarrow X$ of bounded variation such that $m \prec\prec \mu$ (i.e., if $\mu(A) = 0$ then $m(A) = 0$), there exists $f \in L^1(\Omega; X)$, such that

\[ m(A) = \int_A f(\omega)\, d\mu \quad \forall\ A \in \Sigma. \]

REMARK 2.2.7 The RNP is not a property that every Banach space has. Suppose that $X = c_0$. On $\big([0,1], \mathcal{B}([0,1]), \lambda^1\big)$ (here $\mathcal{B}([0,1])$ is the Borel $\sigma$-field of $[0,1]$ and $\lambda^1$ is the Lebesgue measure) consider the vector measure $m : \mathcal{B}([0,1]) \longrightarrow c_0$, defined by

\[ m(A) = \Big\{ \int_A \cos nt \, dt \Big\}_{n \geq 1} \quad \forall\ A \in \mathcal{B}([0,1]). \]

The Riemann-Lebesgue Lemma guarantees that

\[ m(A) \in c_0 \quad \forall\ A \in \mathcal{B}([0,1]). \]

Also $m \prec\prec \lambda^1$. However, $m$ cannot have a Radon-Nikodym derivative (see Theorem A.2.24 and Remark A.2.25) in $L^1\big([0,1]; c_0\big)$, since

\[ \{\cos nt\}_{n \geq 1} \notin c_0 \quad \text{for a.a. } t \in [0,1]. \]

Therefore $c_0$ lacks the RNP.

However, there are two large classes of Banach spaces which have the RNP.

PROPOSITION 2.2.8 If $X$ is reflexive or it is a separable dual space, then $X$ has the RNP.

Now we state the Riesz representation theorem for the Lebesgue-Bochner spaces.

THEOREM 2.2.9 (Riesz Representation Theorem for the Lebesgue-Bochner Spaces) If $p \in [1, +\infty)$ and $\frac{1}{p} + \frac{1}{p'} = 1$, then $\big(L^p(\Omega; X)\big)^* = L^{p'}\big(\Omega; X^*\big)$ if and only if $X^*$ has the RNP and the duality pairing is given by

\[ \big\langle g, f \big\rangle_{L^p(\Omega; X)} = \int_\Omega \big\langle g(\omega), f(\omega) \big\rangle_X \, d\mu \quad \forall\ f \in L^p(\Omega; X),\ g \in L^{p'}\big(\Omega; X^*\big). \]

What can be said if $X^*$ does not have the RNP (for example when $X = C([0,1])$)? We can still have a representation theorem for $L^1(\Omega; X)$. First a definition.


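DEFINITION 2.2.10 By $L^\infty\big(\Omega; X^*_{w^*}\big)$ we denote the space of all $w^*$-measurable functions $g : \Omega \longrightarrow X^*$, such that there exists $c > 0$ with

\[ \big| \langle g(\omega), x \rangle_X \big| \leq c \|x\|_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } x \in X \tag{2.8} \]

(the exceptional $\mu$-null set may depend on $x$). Two functions $g, h$ are equivalent in $L^\infty\big(\Omega; X^*_{w^*}\big)$ (denoted by $g \approx h$) if

\[ \big\langle g(\omega), x \big\rangle_X = \big\langle h(\omega), x \big\rangle_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } x \in X. \]

The infimum of all $c > 0$ for which the above inequality (2.8) is true is denoted by $\|g\|_{L^\infty(\Omega; X^*_{w^*})}$ and we have

\[ \big| \langle g(\omega), x \rangle_X \big| \leq \|g\|_{L^\infty(\Omega; X^*_{w^*})} \|x\|_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]

We can easily check that $\|\cdot\|_{L^\infty(\Omega; X^*_{w^*})}$ is a norm.

REMARK 2.2.11 (a) The equivalence relation in $L^\infty\big(\Omega; X^*_{w^*}\big)$ does not coincide with the usual one in the $L^p$-space, since

\[ \big\langle g(\omega), x \big\rangle_X = 0 \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } x \in X \]

does not necessarily imply that

\[ g(\omega) = 0 \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]

Indeed let $\Omega = [0,1]$ and $X = l^2([0,1])$ (it is a nonseparable Hilbert space). Then

\[ L^\infty\big(\Omega; X^*_{w^*}\big) = L^\infty(\Omega; X_w) \]

and let $g(\omega) = \big(g_a(\omega)\big)_{a \in [0,1]}$ with

\[ g_a(\omega) = \begin{cases} 1 & \text{if } \omega = a, \\ 0 & \text{if } \omega \neq a. \end{cases} \]

Then $g \approx 0$ in $L^\infty(\Omega; X_w)$, but

\[ \|g(\omega)\|_X = 1 \quad \text{for a.a. } \omega \in [0,1]. \]

However, if $X$ is separable and $g \in L^\infty\big(\Omega; X^*_{w^*}\big)$, then the function

\[ \omega \longmapsto \|g(\omega)\|_{X^*} \]

is measurable, essentially bounded and

\[ \|g\|_{L^\infty(\Omega; X^*_{w^*})} = \operatorname*{esssup}_{\omega \in \Omega} \|g(\omega)\|_{X^*}. \]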


(b) In general, we have $L^\infty\big(\Omega; X^*_{w^*}\big) \neq L^\infty(\Omega; X^*)$, even if $X$ is separable. To see this let $\Omega = [0,1]$ and $X = C([0,1])$. We know that $X^* = M([0,1])$, the space of finite Borel measures on $[0,1]$ equipped with the total variation norm. Let $g : \Omega \longrightarrow X^*$ be defined by

\[ g(\omega) \stackrel{df}{=} \delta_\omega, \]

the Dirac measure at $\omega \in [0,1]$. Then $g \in L^\infty\big(\Omega; X^*_{w^*}\big)$, but it is not strongly measurable, nor equivalent to any strongly measurable function. To see this, note that due to the separability of $X$, $g \approx h$ if and only if $g(\omega) = h(\omega)$ for almost all $\omega \in \Omega$ (with $h$ being strongly measurable). Then $g$ is strongly measurable too and so by virtue of Corollary 2.1.4, there exists a countably-valued function $u$, such that

$\|g(\omega) - u(\omega)\|_{X^*}$ […] $\{x_n^*\}_{n \geq 1} \subseteq \overline{B}_1^* = \big\{ y^* \in X^* : \|y^*\|_{X^*} \leq 1 \big\}$, such that

\[ \|x\|_X = \sup_{n \geq 1} \big| \langle x_n^*, x \rangle_X \big| \quad \forall\ x \in X. \]

Therefore for every $y^* \in X^*$, the function $\omega \longmapsto \|g(\omega) - y^*\|_{X^*}$ is $\Sigma$-measurable and then from the proof of Theorem 2.1.3, we can infer that the function $\omega \longmapsto g(\omega)$ is strongly measurable. Hence $L^\infty\big(\Omega; X^*_{w^*}\big) = L^\infty(\Omega; X^*)$ and Theorem 2.2.12 coincides with Theorem 2.2.9.


In complete analogy with the case of $\mathbb{R}$-valued functions, we introduce the notion of absolutely continuous $X$-valued function.

DEFINITION 2.2.14 A function $f : T = [0, b] \longrightarrow X$ is said to be absolutely continuous, if for every $\varepsilon > 0$, we can find $\delta(\varepsilon) > 0$, such that for each sequence $\big\{(a_n, b_n)\big\}_{n \geq 1}$ of pairwise disjoint intervals in $T$ with $\sum_{n=1}^{\infty} (b_n - a_n) < \delta$, we have

\[ \sum_{n=1}^{\infty} \|f(b_n) - f(a_n)\|_X < \varepsilon. \]

Also for a function $f : T = [0, b] \longrightarrow X$ and a partition $P : 0 = x_0 < \ldots < x_m = b$ of $T$, we define

\[ V(f, P) \stackrel{df}{=} \sum_{k=1}^{m} \|f(x_k) - f(x_{k-1})\|_X. \]

The variation of $f$ on $T$ is defined by

\[ V(f)(b) \stackrel{df}{=} \sup \big\{ V(f, P) : P \text{ is a partition of } T \big\}. \]

When $V(f)(b)$ is finite, we say that $f$ is of bounded variation.

REMARK 2.2.15 Clearly the function $t \longmapsto V(f)(t)$ is an increasing function and if $f : T = [0, b] \longrightarrow X$ is absolutely continuous, then it is of bounded variation. The converse is not true. It is well known that an $\mathbb{R}$-valued, absolutely continuous function is almost everywhere differentiable on $T$ and it is the indefinite integral of its derivative. The result is no longer true for $X$-valued functions in general.

EXAMPLE 2.2.16 Let $X = L^1[0,1]$ and consider the function $f : [0,1] \longrightarrow X$, defined by

\[ f(t) \stackrel{df}{=} \chi_{[0,t]} \quad \forall\ t \in [0,1]. \]

It is easy to see that $f$ is absolutely continuous. However, $f$ is nowhere differentiable on $[0,1]$. Indeed, if $f$ is differentiable at $t = t_0 \in [0,1]$, then for every $g \in L^\infty[0,1] = \big(L^1[0,1]\big)^*$, the function

\[ t \longmapsto \vartheta(t) \stackrel{df}{=} \big\langle g, f(t) \big\rangle_{L^1[0,1]} = \int_0^1 g(s) f(t)(s)\, ds = \int_0^t g(s)\, ds \]

is differentiable at $t = t_0$. Let

\[ g(s) \stackrel{df}{=} \begin{cases} 1 & \text{if } s \leq t_0, \\ -1 & \text{if } s > t_0. \end{cases} \]

We have

\[ \vartheta(t) = \begin{cases} t & \text{if } t \leq t_0, \\ 2t_0 - t & \text{if } t > t_0, \end{cases} \]

and $\vartheta$ clearly is not differentiable at $t = t_0$. Note that in this example $X = L^1[0,1]$ does not have the RNP.

THEOREM 2.2.17 If $X$ is reflexive and $f : T = [0, b] \longrightarrow X$ is absolutely continuous, then $f$ is differentiable at almost all $t \in T$ and

\[ f(t) = f(0) + \int_0^t f'(s)\, ds \quad \forall\ t \in T. \]

PROOF Because of Theorem 2.1.3, we may assume that $X$ is also separable. Since $f$ is absolutely continuous, it is of bounded variation and the function $t \longmapsto V(f)(t)$ is increasing on $T = [0, b]$ (see Definition 2.2.14 and Remark 2.2.15). For $0 \leq t \leq t + h \leq b$, we have

\[ \|f(t+h) - f(t)\|_X \leq V(f)(t+h) - V(f)(t), \]

so

\[ \frac{\|f(t+h) - f(t)\|_X}{h} \leq \frac{1}{h} \big( V(f)(t+h) - V(f)(t) \big) \quad \forall\ h > 0 \]

and

\[ \limsup_{h \to 0} \frac{\|f(t+h) - f(t)\|_X}{h} \leq \frac{d}{dt} V(f)(t) < +\infty \quad \text{for a.a. } t \in T. \tag{2.9} \]

Since $X$ is separable, reflexive, $X^*$ is separable too (see Remark A.3.14). Let $\{x_n^*\}_{n \geq 1}$ be a dense sequence in $X^*$. For every $n \geq 1$, the function $t \longmapsto \langle x_n^*, f(t) \rangle_X$ is differentiable at every point of $T \setminus D_n$, with $\lambda^1(D_n) = 0$ (as before $\lambda^1$ denotes the Lebesgue measure on $T$). Also let

\[ D_0 \stackrel{df}{=} \Big\{ t \in T : \limsup_{h \to 0} \frac{\|f(t+h) - f(t)\|_X}{h} = +\infty \Big\} \]

and let us set

\[ D \stackrel{df}{=} \bigcup_{n=0}^{\infty} D_n. \]

From (2.9), we have that $\lambda^1(D) = 0$. Then for $\varepsilon > 0$ small enough and $t \in T \setminus D$, the family

\[ \Big\{ \frac{\|f(t+h) - f(t)\|_X}{h} : |h| \leq \varepsilon \Big\} \]

is bounded. Since for every $n \geq 1$ and every $t \in T \setminus D$, the limit

\[ \lim_{h \to 0} \Big\langle x_n^*, \frac{f(t+h) - f(t)}{h} \Big\rangle_X \]

exists, we infer that there exists $u(t) \in X$, such that for all $x^* \in X^*$ and all $t \in T \setminus D$, we have

\[ \lim_{h \to 0} \Big\langle x^*, \frac{f(t+h) - f(t)}{h} \Big\rangle_X = \big\langle x^*, u(t) \big\rangle_X, \]

so $f$ is weakly differentiable at every $t \in T \setminus D$. Let $f'$ be the weak derivative of $f$ (i.e., $f'(t) = u(t)$ for all $t \in T \setminus D$). Clearly $f'$ is weakly measurable and so by Theorem 2.1.3 it is also strongly measurable. Moreover, from the weak lower semicontinuity of the norm in a Banach space, we have

\[ \|f'(t)\|_X \leq \liminf_{h \to 0} \frac{\|f(t+h) - f(t)\|_X}{h} \quad \forall\ t \in T \setminus D. \tag{2.10} \]

Then from (2.10) and Fatou's lemma (see Theorem A.2.1), we have that

\[ \int_0^b \|f'(t)\|_X \, dt \leq V(f)(b), \]

i.e., $f' \in L^1(T; X)$. Also

\[ \big\langle x^*, f(t) - f(0) \big\rangle_X = \int_0^t \big\langle x^*, f'(s) \big\rangle_X \, ds \quad \forall\ x^* \in X^*,\ t \in T, \]

so from Theorem 2.1.17, we have

\[ f(t) - f(0) = \int_0^t f'(s)\, ds \quad \forall\ t \in T \tag{2.11} \]

and finally $f$ is almost everywhere strongly differentiable with

\[ \frac{df}{dt} = f' \in L^1(T; X) \]

and (2.11) holds.
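The contrast between Example 2.2.16 and Theorem 2.2.17 can also be seen by a direct computation in $L^1[0,1]$. The sketch below is a complementary check and not the argument used in the text: for $f(t) = \chi_{[0,t]}$ it computes, in exact rational arithmetic, the norms $\|f(t_0 + h) - f(t_0)\|_{L^1}$ and the $L^1$-distance between the difference quotients at scales $h$ and $h/2$. The point $t_0 = 1/3$, the step sizes and the helper names `quotient` and `l1_distance` are hypothetical choices.

```python
from fractions import Fraction

def quotient(t0, h):
    """(f(t0 + h) - f(t0)) / h for f(t) = chi_[0, t]: the step function
    equal to 1/h on (t0, t0 + h] and 0 elsewhere, stored as (left, right, height)."""
    return (t0, t0 + h, 1 / h)

def l1_distance(step_a, step_b):
    """L^1[0, 1] distance between two one-step functions (left, right, height)."""
    points = sorted({step_a[0], step_a[1], step_b[0], step_b[1]})
    total = Fraction(0)
    for left, right in zip(points, points[1:]):
        mid = (left + right) / 2          # both functions are constant on (left, right)
        va = step_a[2] if step_a[0] < mid <= step_a[1] else 0
        vb = step_b[2] if step_b[0] < mid <= step_b[1] else 0
        total += abs(va - vb) * (right - left)
    return total

t0 = Fraction(1, 3)
hs = [Fraction(1, 2 ** k) for k in range(2, 8)]

# ||f(t0 + h) - f(t0)||_{L^1} for each h.
increments = [l1_distance((0, t0 + h, Fraction(1)), (0, t0, Fraction(1))) for h in hs]

# L^1 distance between the difference quotients at scales h and h / 2.
gaps = [l1_distance(quotient(t0, h), quotient(t0, h / 2)) for h in hs]

print(increments)
print(gaps)
```

The increments equal $h$, which is consistent with absolute continuity, while the difference quotients at scales $h$ and $h/2$ stay at distance $1$ from each other, so they cannot converge in $L^1[0,1]$ as $h \searrow 0$.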


REMARK 2.2.18 The result is more generally true if we assume that $X$ has the RNP. This follows from the fact the RNP is passed to closed linear subspaces of $X$ and if $X$ is a separable Banach space with the RNP, then it has the separable dual (see Diestel & Uhl (1977, pp. 217–218)). So a careful reading of the previous proof reveals that it remains valid if instead we assume only that $X$ has the RNP.

The next result is an extension of the so-called “Lagrange lemma” and “DuBois-Reymond lemma” (see Denkowski, Migórski & Papageorgiou (2003b, p. 673)) to Banach space valued functions.

PROPOSITION 2.2.19 Let $f \in L^1(T; X)$ (with $T = [0, b]$).
(a) If

\[ \int_0^b f(t) \vartheta(t)\, dt = 0 \quad \forall\ \vartheta \in C_c^\infty\big((0, b)\big), \]

then $f = 0$.
(b) If

\[ \int_0^b f(t) \vartheta'(t)\, dt = 0 \quad \forall\ \vartheta \in C_c^\infty\big((0, b)\big), \]

then $f$ is constant.

PROOF (a) By virtue of Theorem 2.1.3, we may assume that $X$ is separable. Then $X^*_{w^*}$ (the dual space $X^*$ furnished with the $w^*$-topology) is $w^*$-separable (in fact $X^*_{w^*}$ is a Souslin space; see Definition A.2.29(b) and Remark A.2.30). Let $\{x_n^*\}_{n \geq 1}$ be $w^*$-dense in $X^*$. Then for all $n \geq 1$ and all $\vartheta \in C_c^\infty\big((0, b)\big)$, we have

\[ \int_0^b \vartheta(t) \big\langle x_n^*, f(t) \big\rangle_X \, dt = \Big\langle x_n^*, \int_0^b f(t) \vartheta(t)\, dt \Big\rangle_X = 0, \]

so by the Lagrange lemma, we have

\[ \big\langle x_n^*, f(t) \big\rangle_X = 0 \quad \text{for a.a. } t \in T \text{ and all } n \geq 1 \]

and since $\overline{\{x_n^*\}_{n \geq 1}}^{\,w^*} = X^*$, we obtain that

\[ f(t) = 0 \quad \text{for a.a. } t \in T. \]

(b) The proof is similar, using this time the DuBois-Reymond lemma.


The next proposition permits the identification of the space of $X$-valued absolutely continuous functions with a vector Sobolev space.

PROPOSITION 2.2.20 If $f, g \in L^1(T; X)$ (with $T = [0, b]$), then the following conditions are equivalent:

(a) $f(t) = v + \int_0^t g(s)\, ds$, $v \in X$, for almost all $t \in T$;

(b) $\int_0^b f(t) \vartheta'(t)\, dt = - \int_0^b g(t) \vartheta(t)\, dt$ for all $\vartheta \in C_c^\infty\big((0, b)\big)$;

(c) for every $x^* \in X^*$,

\[ \frac{d}{dt} \big\langle x^*, f(\cdot) \big\rangle_X = \big\langle x^*, g(\cdot) \big\rangle_X \]

in the distributional sense on $(0, b)$ (see Definition 1.6.1(a)).

PROOF “(a)$\Longrightarrow$(b),(c)”: These implications follow from a simple integration by parts.

“(c)$\Longrightarrow$(b)”: From the definition of distributional derivative (see Definition 1.6.1(a)), for all $\vartheta \in C_c^\infty\big((0, b)\big)$ and all $x^* \in X^*$, we have

\[ \int_0^b \big\langle x^*, f(t) \big\rangle_X \vartheta'(t)\, dt = - \int_0^b \frac{d}{dt} \big\langle x^*, f(t) \big\rangle_X \vartheta(t)\, dt = - \int_0^b \big\langle x^*, g(t) \big\rangle_X \vartheta(t)\, dt, \]

so

\[ \int_0^b \big\langle x^*, \vartheta'(t) f(t) + \vartheta(t) g(t) \big\rangle_X \, dt = \Big\langle x^*, \int_0^b \big( \vartheta'(t) f(t) + \vartheta(t) g(t) \big)\, dt \Big\rangle_X = 0 \quad \forall\ x^* \in X^* \]

and thus

\[ \int_0^b f(t) \vartheta'(t)\, dt = - \int_0^b g(t) \vartheta(t)\, dt \quad \forall\ \vartheta \in C_c^\infty\big((0, b)\big). \]

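“(b)$\Longrightarrow$(a)”: Let

\[ \widehat{f}(t) \stackrel{df}{=} \int_0^t g(s)\, ds \quad \forall\ t \in T. \]

Evidently $\widehat{f}$ is absolutely continuous and

\[ \widehat{f}\,'(t) = g(t) \quad \text{for a.a. } t \in T. \]

Let

\[ h \stackrel{df}{=} f - \widehat{f}. \]

We have

\[ \int_0^b h(t) \vartheta'(t)\, dt = 0 \quad \forall\ \vartheta \in C_c^\infty\big((0, b)\big), \]

so, using Proposition 2.2.19(b), we have

\[ h(t) = v \in X \quad \text{for a.a. } t \in T \]

and finally

\[ f(t) = v + \int_0^t g(s)\, ds \quad \text{for a.a. } t \in T. \]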
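The implication “(a)$\Longrightarrow$(b)” of Proposition 2.2.20 can be checked numerically in a finite dimensional case. The sketch below is only an illustration under assumptions not made in the text: $X = \mathbb{R}^2$, $b = 1$, the hypothetical data $g(s) = (\cos s, 1)$ and $v = (2, -1)$, a standard bump function $\vartheta$ supported in $[0.1, 0.9] \subset (0, 1)$, and a midpoint rule in place of the Bochner integral.

```python
import math

B = 1.0                       # T = [0, B]
A_SUPP, C_SUPP = 0.1, 0.9     # theta vanishes outside (A_SUPP, C_SUPP), inside (0, B)
V = (2.0, -1.0)               # hypothetical constant v in X = R^2

def g(s):
    return (math.cos(s), 1.0)

def f(t):
    # f(t) = v + integral from 0 to t of g(s) ds, in closed form
    return (V[0] + math.sin(t), V[1] + t)

def theta(t):
    """Smooth bump exp(-1 / ((t - a)(c - t))) on (a, c), zero elsewhere."""
    if not A_SUPP < t < C_SUPP:
        return 0.0
    return math.exp(-1.0 / ((t - A_SUPP) * (C_SUPP - t)))

def theta_prime(t):
    """Derivative of theta: theta(t) * u'(t) / u(t)^2 with u(t) = (t - a)(c - t)."""
    if not A_SUPP < t < C_SUPP:
        return 0.0
    u = (t - A_SUPP) * (C_SUPP - t)
    return theta(t) * (A_SUPP + C_SUPP - 2.0 * t) / (u * u)

def integrate(vector_fn, n=4000):
    """Midpoint rule on [0, B] for an R^2-valued integrand."""
    step = B / n
    acc = [0.0, 0.0]
    for i in range(n):
        val = vector_fn((i + 0.5) * step)
        acc[0] += val[0] * step
        acc[1] += val[1] * step
    return acc

lhs = integrate(lambda t: tuple(c * theta_prime(t) for c in f(t)))  # integral of f * theta'
rhs = integrate(lambda t: tuple(c * theta(t) for c in g(t)))        # integral of g * theta
residual = max(abs(l + r) for l, r in zip(lhs, rhs))
print(lhs, rhs, residual)
```

Up to quadrature and rounding error, the two integrals are negatives of each other in each component, which is the identity in condition (b).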

COROLLARY 2.2.21 If $f, g \in L^1(T; X)$ ($T = [0, b]$) and one of the equivalent statements (a), (b) or (c) in Proposition 2.2.20 holds, then $f$ is almost everywhere equal to an absolutely continuous function $f_1 : T \longrightarrow X$.

Extending the notion of distributional (weak) derivative and the resulting Sobolev spaces (see Definition 1.6.1) to $X$-valued functions, we make the following definitions.

DEFINITION 2.2.22
(a) Let $f, g \in L^1(T; X)$ (with $T = [0, b]$). We say that $g$ is the distributional (weak) derivative of $f$, if

\[ \int_0^b f(t) \vartheta'(t)\, dt = - \int_0^b g(t) \vartheta(t)\, dt \quad \forall\ \vartheta \in C_c^\infty\big((0, b)\big). \]

We denote this derivative of $f$ by $Df$.


(b) Let $p \in [1,+\infty]$ and $T = [0,b]$. We define
\[
W^{1,p}\big((0,b);X\big) \stackrel{df}{=} \big\{ f \in L^p(T;X) : Df \in L^p(T;X) \big\}.
\]

(c) Let $p \in [1,+\infty]$ and $T = [0,b]$. We define
\[
AC^{1,p}\big(T,X\big) \stackrel{df}{=} \big\{ f : T \longrightarrow X : f \text{ is absolutely continuous, differentiable almost everywhere with derivative } f' \in L^p(T;X) \big\}.
\]

REMARK 2.2.23
According to Theorem 2.2.17 (see also Remark 2.2.18), if $X$ is reflexive (or more generally if $X$ has RNP), then $f \in AC^{1,p}\big(T,X\big)$ if and only if there exists a function $g \in L^p(T;X)$, such that
\[
f(t) = f(0) + \int_0^t g(s)\,ds \qquad \forall\, t \in T.
\]
Invoking Proposition 2.2.20, we see that the spaces $W^{1,p}\big((0,b);X\big)$ and $AC^{1,p}\big(T,X\big)$ (for $p \in [1,+\infty]$) can be identified.

THEOREM 2.2.24
If $p \in [1,+\infty]$ and $f \in L^p(T;X)$ (with $T = [0,b]$), then the following statements are equivalent:

(a) $f \in W^{1,p}(T;X)$;

(b) there exists $f_1 \in AC^{1,p}\big(T,X\big)$, such that $f(t) = f_1(t)$ for almost all $t \in T$.

REMARK 2.2.25
In Section 2.4, we shall see that this property distinguishes Sobolev functions of one variable (i.e., defined on $(0,b)$) from Sobolev functions of several variables (i.e., functions defined on an open set $Z \subseteq \mathbb{R}^N$ with $N > 1$).

PROPOSITION 2.2.26
If $X$ is reflexive, $p \in (1,+\infty)$ and $f \in L^p(T;X)$ (with $T = [0,b]$), then the following two conditions are equivalent:

(a) $f \in W^{1,p}\big((0,b);X\big)$ (or there exists $f_1 \in AC^{1,p}\big(T,X\big)$, such that $f(t) = f_1(t)$ for almost all $t \in T$);

(b) $\displaystyle\int_0^{b-h} \big\|f(t+h) - f(t)\big\|_X^p\,dt \le c\,h^p$ for some $c > 0$ and all $h \in (0,b)$.

PROOF “(a)$\Longrightarrow$(b)”: By Theorem 2.2.24, we have
\[
f(t+h) - f(t) = \int_t^{t+h} Df_1(s)\,ds \qquad \forall\, t,\ t+h \in T = [0,b].
\]
By Jensen's inequality (see Theorem A.2.26), we have
\[
\big\|f(t+h) - f(t)\big\|_X^p \le h^{p-1}\int_t^{t+h} \big\|Df_1(s)\big\|_X^p\,ds,
\]
so
\[
\int_0^{b-h} \big\|f(t+h) - f(t)\big\|_X^p\,dt
\le h^{p-1}\int_0^{b-h}\!\!\int_t^{t+h} \big\|Df_1(s)\big\|_X^p\,ds\,dt. \tag{2.12}
\]
Note that
\[
\int_0^{b-h} \frac{1}{h}\int_t^{t+h} \big\|Df_1(s)\big\|_X^p\,ds\,dt
\longrightarrow \int_0^b \big\|Df_1(s)\big\|_X^p\,ds \qquad \text{as } h \to 0
\]
(see Proposition 2.1.22). So from (2.12), we conclude that
\[
\int_0^{b-h} \big\|f(t+h) - f(t)\big\|_X^p\,dt \le c\,h^p \qquad \forall\, h \in (0,b),
\]
for some constant $c > 0$.

“(b)$\Longrightarrow$(a)”: For every $n \ge 1$, let
\[
g_n(t) \stackrel{df}{=} \chi_{[0,\,b-\frac{1}{n}]}(t)\,\frac{f\big(t+\frac{1}{n}\big) - f(t)}{\frac{1}{n}}.
\]
By virtue of condition (b), the sequence $\{g_n\}_{n\ge 1} \subseteq L^p(T;X)$ is bounded. Since $p \in (1,+\infty)$ and $X$ is reflexive, the Lebesgue-Bochner space $L^p(T;X)$ is reflexive too (see Proposition 2.2.3(c)). So by the Eberlein-Smulian theorem (see Theorem A.3.8), passing to a subsequence if necessary, we may assume that
\[
g_n \stackrel{w}{\longrightarrow} g \qquad \text{in } L^p(T;X),
\]
for some $g \in L^p(T;X)$.


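For every $\vartheta^* \in C_c^\infty\big((0,b);X^*\big)$, we have
\begin{align*}
\int_0^b \big\langle \vartheta^*(t), g(t)\big\rangle_X\,dt
&= \lim_{n\to+\infty} \int_0^b \big\langle \vartheta^*(t), g_n(t)\big\rangle_X\,dt \\
&= \lim_{n\to+\infty} \int_0^{b-\frac{1}{n}} \Big\langle \vartheta^*(t), \frac{f\big(t+\frac{1}{n}\big) - f(t)}{\frac{1}{n}}\Big\rangle_X\,dt \\
&= \lim_{n\to+\infty} \bigg[\int_0^{b-\frac{1}{n}} \Big\langle \frac{\vartheta^*\big(t-\frac{1}{n}\big) - \vartheta^*(t)}{\frac{1}{n}}, f(t)\Big\rangle_X\,dt
 - n\int_{b-\frac{1}{n}}^{b} \big\langle \vartheta^*(t), f(t)\big\rangle_X\,dt\bigg]. \tag{2.13}
\end{align*}
Because $\vartheta^* \in C_c^\infty\big((0,b);X^*\big)$, for $n \ge 1$ large enough, we have
\[
\int_{b-\frac{1}{n}}^{b} \big\langle \vartheta^*(t), f(t)\big\rangle_X\,dt = 0.
\]
Also
\[
\int_0^{b-\frac{1}{n}} \Big\langle \frac{\vartheta^*\big(t-\frac{1}{n}\big) - \vartheta^*(t)}{\frac{1}{n}}, f(t)\Big\rangle_X\,dt
\longrightarrow -\int_0^b \big\langle \vartheta^{*\prime}(t), f(t)\big\rangle_X\,dt.
\]
So from (2.13), we have
\[
\int_0^b \big\langle \vartheta^*(t), g(t)\big\rangle_X\,dt = -\int_0^b \big\langle \vartheta^{*\prime}(t), f(t)\big\rangle_X\,dt
\qquad \forall\, \vartheta^* \in C_c^\infty\big((0,b);X^*\big)
\]
and finally
\[
Df = g \qquad \text{in } L^p(T;X),
\]
i.e.,
\[
f \in W^{1,p}\big((0,b);X\big).
\]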
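The constant $c$ in the implication “(a)$\Longrightarrow$(b)” can also be made explicit without the limit argument. The following is a sketch under the assumption that Fubini's theorem is applied to the nonnegative function $\varphi(s) = \|Df_1(s)\|_X^p$; the symbol $\varphi$ is introduced here only for this computation, and $\lambda^1$ denotes the Lebesgue measure on $T$.

```latex
\begin{align*}
\int_0^{b-h}\!\!\int_t^{t+h}\varphi(s)\,ds\,dt
 &= \int_0^{b}\varphi(s)\,
    \lambda^1\big(\{t\in[0,b-h] : s-h\le t\le s\}\big)\,ds
  \le h\int_0^b \varphi(s)\,ds ,\\
\int_0^{b-h}\big\|f(t+h)-f(t)\big\|_X^p\,dt
 &\le h^{p-1}\cdot h\int_0^b\big\|Df_1(s)\big\|_X^p\,ds
  = h^p\,\big\|Df_1\big\|_{L^p(T;X)}^p .
\end{align*}
```

So any $c > 0$ with $c \ge \|Df_1\|_{L^p(T;X)}^p$ works in condition (b).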

To prove the next result concerning X-valued functions, we shall need the following general result about embeddings of Banach spaces, which will also be helpful in our discussion of evolution triples later in this section.


LEMMA 2.2.27
If $Y$ is another Banach space, such that $X \subseteq Y$, the embedding is continuous and $X$ is dense in $Y$, then

(a) the embedding $Y^* \subseteq X^*$ is continuous;

(b) if $X$ is reflexive, then $Y^*$ is dense in $X^*$.

PROOF (a) Since by hypothesis $X$ is embedded continuously in $Y$, there exists $c_1 > 0$, such that
\[
\|x\|_Y \le c_1\|x\|_X \qquad \forall\, x \in X.
\]
Let $y^* \in Y^*$. Then
\[
\big|\langle y^*, x\rangle_Y\big| \le \|y^*\|_{Y^*}\|x\|_Y \le c_1\|y^*\|_{Y^*}\|x\|_X \qquad \forall\, x \in X. \tag{2.14}
\]
Let $\widehat{y}^* = y^*|_X$. Then from (2.14), we have $\widehat{y}^* \in X^*$ and
\[
\|\widehat{y}^*\|_{X^*} \le c_1\|y^*\|_{Y^*}. \tag{2.15}
\]
We show that $\widehat{y}^* = 0$ implies that $y^* = 0$. Indeed for all $x \in X$, we have
\[
0 = \langle \widehat{y}^*, x\rangle_X = \langle y^*, x\rangle_Y.
\]
Because $X$ is dense in $Y$, it follows that $y^* = 0$. So the map $i^* : Y^* \longrightarrow X^*$, defined by
\[
i^*(y^*) \stackrel{df}{=} \widehat{y}^*,
\]
is continuous and injective. Hence $y^*$ can be identified with $\widehat{y}^*$ and so $Y^* \subseteq X^*$ with continuous injection (see (2.15)).

(b) Suppose that the assertion is not true. Then
\[
\overline{Y^*}^{\,\|\cdot\|_{X^*}} \ne X^*
\]
and so by the Hahn-Banach theorem, we can find $u \in X^{**} = X$ (since $X$ is reflexive), $u \ne 0$, such that
\[
\langle x^*, u\rangle_X = 0 \qquad \forall\, x^* \in Y^*.
\]
Since $u \in X \subseteq Y$ and $Y^*$ separates the points of $Y$, it follows that $u = 0$, a contradiction.


PROPOSITION 2.2.28
If $X$ is reflexive, $Y$ is another Banach space, $X \subseteq Y$, the embedding is continuous and $f \in L^\infty(T;X) \cap C\big(T;Y_w\big)$ ($T = [0,b]$ and $Y_w$ is the Banach space $Y$ equipped with the weak topology), then $f \in C\big(T;X_w\big)$ (where $X_w$ is the Banach space $X$ equipped with the weak topology).

PROOF By replacing $Y$ with $\overline{X}^{\,\|\cdot\|_Y}$ if necessary, we may assume that $X$ is dense in $Y$. So by virtue of Lemma 2.2.27(b), $Y^* \subseteq X^*$ and the embedding is continuous and dense. From Corollary 2.1.4, we know that there exists a sequence $\{f_n\}_{n\ge 1}$ of $X$-valued, countably valued functions on $T$, such that
\[
f_n \longrightarrow f \qquad \text{uniformly on } T \text{ in } X.
\]
We know that
\[
\big\|f_n(t)\big\|_X \le c_1\|f\|_{L^\infty(T;X)} \qquad \forall\, t \in T,\ n \ge 1,
\]
for some $c_1 > 0$ and
\[
\langle y^*, f_n(t)\rangle_X \longrightarrow \langle y^*, f(t)\rangle_X \qquad \forall\, y^* \in Y^*,\ t \in T.
\]
It follows that
\[
\big|\langle y^*, f_n(t)\rangle_X\big| \le c_1\|y^*\|_{X^*}\|f_n\|_{L^\infty(T;X)} \qquad \forall\, n \ge 1,\ t \in T,
\]
thus
\[
\big|\langle y^*, f(t)\rangle_X\big| \le c_1\|y^*\|_{X^*}\|f\|_{L^\infty(T;X)} \qquad \forall\, t \in T
\]
and so
\[
f(t) \in X \quad \text{and} \quad \big\|f(t)\big\|_X \le c_1\|f\|_{L^\infty(T;X)} \qquad \forall\, t \in T. \tag{2.16}
\]
Next let $x^* \in X^*$. We can find a sequence $\{y_m^*\}_{m\ge 1} \subseteq Y^*$, such that
\[
y_m^* \longrightarrow x^* \qquad \text{in } X^*.
\]
Also let $t_n \to t$ in $T$. Because $f \in C\big(T;Y_w\big)$, we have
\[
\langle y_m^*, f(t_n)\rangle_X = \langle y_m^*, f(t_n)\rangle_Y \longrightarrow \langle y_m^*, f(t)\rangle_Y
\quad \text{as } n \to +\infty, \text{ for all } m \ge 1 \text{ and all } t \in T. \tag{2.17}
\]
Also we have
\[
\langle y_m^*, f(t)\rangle_Y = \langle y_m^*, f(t)\rangle_X \longrightarrow \langle x^*, f(t)\rangle_X
\quad \text{as } m \to +\infty, \text{ for all } t \in T. \tag{2.18}
\]


From (2.17) and (2.18), via the double limit lemma (see Proposition A.2.35), we deduce that there exists a sequence $\{m(n)\}_{n\ge 1}$ increasing (not necessarily strictly) to $+\infty$ such that
\[
\langle y_{m(n)}^*, f(t_n)\rangle_X \longrightarrow \langle x^*, f(t)\rangle_X. \tag{2.19}
\]
From (2.16) and (2.19), we have
\begin{align*}
\big|\langle x^*, f(t_n)\rangle_X - \langle x^*, f(t)\rangle_X\big|
&\le \big|\langle x^*, f(t_n)\rangle_X - \langle y_{m(n)}^*, f(t_n)\rangle_X\big|
   + \big|\langle y_{m(n)}^*, f(t_n)\rangle_X - \langle x^*, f(t)\rangle_X\big| \\
&\le \big\|x^* - y_{m(n)}^*\big\|_{X^*}\big\|f(t_n)\big\|_X
   + \big|\langle y_{m(n)}^*, f(t_n)\rangle_X - \langle x^*, f(t)\rangle_X\big| \longrightarrow 0,
\end{align*}
so $f \in C\big(T;X_w\big)$.

The next lemma is crucial in obtaining compactness theorems for function spaces which arise in the study of evolution equations.

LEMMA 2.2.29
If $X, Y, Z$ are three Banach spaces, such that $X \subseteq Y \subseteq Z$ with the first embedding compact and the second continuous, then for every $\xi > 0$, we can find $c(\xi) > 0$, such that
\[
\|x\|_Y \le \xi\|x\|_X + c(\xi)\|x\|_Z \qquad \forall\, x \in X.
\]

PROOF Suppose the lemma is not true. Then we can find $\xi > 0$ and a sequence $\{x_n\}_{n\ge 1} \subseteq X$, such that
\[
\|x_n\|_Y > \xi\|x_n\|_X + n\|x_n\|_Z \qquad \forall\, n \ge 1.
\]
Let $y_n \stackrel{df}{=} \dfrac{x_n}{\|x_n\|_X}$ for all $n \ge 1$. We have
\[
\|y_n\|_Y > \xi + n\|y_n\|_Z \qquad \forall\, n \ge 1. \tag{2.20}
\]
Since $\|y_n\|_X = 1$ for all $n \ge 1$ and the embedding $X \subseteq Y$ is compact, from (2.20), we have that
\[
\|y_n\|_Z \longrightarrow 0 \tag{2.21}
\]
and also the sequence $\{y_n\}_{n\ge 1} \subseteq Y$ is relatively compact. Thus we can find a subsequence $\{y_{n_k}\}_{k\ge 1}$ of $\{y_n\}_{n\ge 1}$, such that
\[
y_{n_k} \longrightarrow u \qquad \text{in } Y.
\]
Since $Y$ is embedded continuously in $Z$, we have also that $y_{n_k} \longrightarrow u$ in $Z$. Because of (2.21), we have that $u = 0$. On the other hand from (2.20) in the limit as $k \to +\infty$, we have $\|u\|_Y \ge \xi > 0$, a contradiction. This proves the lemma.
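The step from (2.20) to (2.21) in the proof of Lemma 2.2.29 can be spelled out as follows. The constant $c_0$ is not named in the text; it is introduced here as an assumption-free shorthand for the norm of the embedding $X \subseteq Y$, which is finite because a compact embedding is in particular continuous.

```latex
\[
n\,\|y_n\|_Z \;<\; \|y_n\|_Y - \xi \;\le\; c_0\,\|y_n\|_X - \xi \;=\; c_0 - \xi ,
\qquad\text{hence}\qquad
\|y_n\|_Z \;<\; \frac{c_0-\xi}{n} \;\longrightarrow\; 0
\quad (n\to+\infty).
\]
```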


Let $X, Y, Z$ be three Banach spaces, with $X, Y$ reflexive. Assume that $X \subseteq Y \subseteq Z$, with the embeddings being continuous. Moreover, we suppose that the first embedding is compact. Let $T = [0,b]$ and $1 < p, r$. We introduce the space
\[
W_{pr}(T) \stackrel{df}{=} \big\{ u \in L^p(T;X) : u' = Du \in L^r(T;Z) \big\}.
\]
Here $u' = Du$ denotes the derivative in the distributional sense in $Z$, i.e.,
\[
\int_0^b u(t)\vartheta'(t)\,dt = -\int_0^b u'(t)\vartheta(t)\,dt \quad \text{in } Z
\qquad \forall\, \vartheta \in C_c^\infty\big((0,b)\big).
\]
We furnish $W_{pr}(T)$ with the norm
\[
\|u\|_{pr} = \|u\|_p + \|u'\|_r.
\]
Clearly $W_{pr}(T)$ normed this way is a Banach space. Indeed, consider the isomorphism $\eta : W_{pr}(T) \longrightarrow L^p(T;X) \times L^r(T;Z)$, given by
\[
\eta(x) \stackrel{df}{=} (x, x') \qquad \forall\, x \in W_{pr}(T)
\]
and view $W_{pr}(T)$ as a closed subspace of $L^p(T;X) \times L^r(T;Z)$. Moreover, if $X$ and $Z$ are separable, then so is $W_{pr}(T)$ and finally if $X$ and $Z$ are reflexive, then $W_{pr}(T)$ is reflexive too. It is evident that
\[
W_{pr}(T) \subseteq L^p(T;X) \subseteq L^p(T;Z) \subseteq L^s(T;Z),
\]
with $s = \min\{p, r\}$. Then
\[
W_{pr}(T) \subseteq W^{1,s}\big((0,b);Z\big)
\]
and so every $u \in W_{pr}(T)$ viewed as a $Z$-valued function belongs in $AC^{1,s}\big(T,Z\big)$ (see Theorem 2.2.24). Therefore the derivative $u' = Du$ is actually a strong derivative in $Z$ almost everywhere, i.e.,
\[
u' = \frac{du}{dt}.
\]
We note that
\[
W_{pr}(T) \subseteq L^p(T;X) \subseteq L^p(T;Y)
\]
and clearly the embeddings are continuous. We can say more about the embedding $W_{pr}(T) \subseteq L^p(T;Y)$, provided we strengthen our conditions on the spaces $X$, $Y$ and $Z$.


THEOREM 2.2.30
If $X, Y, Z$ are Banach spaces, with $X, Z$ being reflexive, the embeddings $X \subseteq Y \subseteq Z$ being continuous and the embedding $X \subseteq Y$ being compact, then the embedding $W_{pr}(T) \subseteq L^p(T;Y)$ is compact.

PROOF Let $\{u_n\}_{n\ge 1} \subseteq W_{pr}(T)$ be a bounded sequence. We need to show that it has a subsequence which converges strongly in $L^p(T;Y)$. Note that $W_{pr}(T)$ is reflexive. Passing to a subsequence if necessary, we may assume that
\[
u_n \stackrel{w}{\longrightarrow} u \qquad \text{in } W_{pr}(T).
\]
This means that
\[
u_n \stackrel{w}{\longrightarrow} u \ \text{ in } L^p(T;X)
\qquad \text{and} \qquad
u_n' \stackrel{w}{\longrightarrow} u' \ \text{ in } L^r(T;Z).
\]
Recall that $W_{pr}(T) \subseteq C(T;Z)$.

Claim 1. The embedding $W_{pr}(T) \subseteq C(T;Z)$ is continuous.

To see this suppose that
\[
u_n \longrightarrow u \qquad \text{in } W_{pr}(T). \tag{2.22}
\]
Then
\[
u_n(t) \longrightarrow u(t) \quad \text{in } X \qquad \forall\, t \in T \setminus D,
\]
with $\lambda^1(D) = 0$. Evidently
\[
u_n(t) \longrightarrow u(t) \quad \text{in } Z \qquad \forall\, t \in T \setminus D.
\]
For $t \in T$ and $s \in T \setminus D$, from Proposition 2.2.20, we have
\[
\big\|u_n(t) - u(t)\big\|_Z \le \big\|u_n(s) - u(s)\big\|_Z + c_1\big\|u_n' - u'\big\|_{L^r(T;Z)} \qquad \forall\, n \ge 1, \tag{2.23}
\]
for some $c_1 > 0$. For every $n \ge 1$, we choose $t_n \in T$, such that
\[
\big\|u_n - u\big\|_{C(T;Z)} = \big\|u_n(t_n) - u(t_n)\big\|_Z.
\]
So from (2.22) and (2.23), we have
\[
\big\|u_n - u\big\|_{C(T;Z)} \le \big\|u_n(s) - u(s)\big\|_Z + c_1\big\|u_n' - u'\big\|_{L^r(T;Z)} \longrightarrow 0.
\]
This proves Claim 1.

Let
\[
v_n \stackrel{df}{=} u_n - u \qquad \forall\, n \ge 1.
\]


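Then from Claim 1, it follows that we can find $c_2 > 0$, such that
\[
\|v_n\|_{C(T;Z)} = \|u_n - u\|_{C(T;Z)} \le c_2 \qquad \forall\, n \ge 1. \tag{2.24}
\]
We claim that
\[
v_n(t) \longrightarrow 0 \quad \text{in } Z \qquad \forall\, t \in T.
\]
We shall prove this for $t = 0$. The proof is similar for any other $t \in T$. We have
\[
v_n(0) = v_n(t) - \int_0^t v_n'(\tau)\,d\tau,
\]
so
\[
v_n(0) = \frac{1}{s}\int_0^s v_n(t)\,dt - \frac{1}{s}\int_0^s\!\!\int_0^t v_n'(\tau)\,d\tau\,dt,
\]
thus
\[
v_n(0) = \xi_n + \eta_n \qquad \forall\, n \ge 1,
\]
with
\[
\xi_n = \frac{1}{s}\int_0^s v_n(t)\,dt
\qquad \text{and} \qquad
\eta_n = -\frac{1}{s}\int_0^s\!\!\int_0^t v_n'(\tau)\,d\tau\,dt \qquad \forall\, n \ge 1.
\]
Note that
\[
\eta_n = -\frac{1}{s}\int_0^s (s-t)\,v_n'(t)\,dt.
\]
For a given $\varepsilon > 0$, select $s \in T$ so that
\[
\|\eta_n\|_Z \le \int_0^s \big\|v_n'(t)\big\|_Z\,dt \le \frac{\varepsilon}{2} \qquad \forall\, n \ge 1.
\]
For this fixed $s \in T$, note that
\[
\xi_n \stackrel{w}{\longrightarrow} 0 \quad \text{in } X
\]
and so
\[
\xi_n \longrightarrow 0 \quad \text{in } Z
\]
(since $X$ is embedded compactly in $Z$). So for $n \ge 1$ large enough, we have
\[
\|\xi_n\|_Z \le \frac{\varepsilon}{2}.
\]
This means that
\[
v_n(t) \longrightarrow 0 \quad \text{in } Z \qquad \forall\, t \in T.
\]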
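The two facts used for $\eta_n$ in this argument, the rewriting of the double integral and the choice of $s$ independent of $n$, can be sketched as follows. The symbols $r'$ and $M$ are introduced here as assumptions for the computation: $r'$ is the conjugate exponent of $r$ (so $\frac{1}{r} + \frac{1}{r'} = 1$, finite because $r > 1$), and $M = \sup_{n\ge 1}\|v_n'\|_{L^r(T;Z)}$, which is finite because $\{u_n\}_{n\ge 1}$ is bounded in $W_{pr}(T)$.

```latex
\begin{align*}
\int_0^s\!\!\int_0^t v_n'(\tau)\,d\tau\,dt
 &= \int_0^s\Big(\int_\tau^s dt\Big)\,v_n'(\tau)\,d\tau
  = \int_0^s (s-\tau)\,v_n'(\tau)\,d\tau ,\\
\|\eta_n\|_Z
 &\le \frac{1}{s}\int_0^s (s-t)\,\big\|v_n'(t)\big\|_Z\,dt
  \le \int_0^s \big\|v_n'(t)\big\|_Z\,dt
  \le s^{1/r'}\,\big\|v_n'\big\|_{L^r(T;Z)}
  \le s^{1/r'} M .
\end{align*}
```

The first line is Fubini's theorem on the triangle $\{0 \le \tau \le t \le s\}$, and the last chain uses $0 \le s - t \le s$ followed by Hölder's inequality. Since $s^{1/r'} M \to 0$ as $s \searrow 0$, a single $s$ gives $\|\eta_n\|_Z \le \frac{\varepsilon}{2}$ for all $n \ge 1$.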


Because of (2.24), we can apply Proposition 2.1.13 and infer that
\[
v_n \longrightarrow 0 \qquad \text{in } L^p(T;Z).
\]
By virtue of Lemma 2.2.29, for a given $\gamma > 0$ we can find $c(\gamma) > 0$, such that
\[
\|v_n\|_{L^p(T;Y)} \le \gamma\|v_n\|_{L^p(T;X)} + c(\gamma)\|v_n\|_{L^p(T;Z)},
\]
so
\[
\|v_n\|_{L^p(T;Y)} \le \gamma c_3 + c(\gamma)\|v_n\|_{L^p(T;Z)} \qquad \forall\, n \ge 1, \tag{2.25}
\]
for some $c_3 > 0$. Since $\gamma > 0$ was arbitrary and $\|v_n\|_{L^p(T;Z)} \longrightarrow 0$, from (2.25) we infer that
\[
\limsup_{n\to+\infty} \|v_n\|_{L^p(T;Y)} \le 0,
\]
i.e., $v_n \to 0$ in $L^p(T;Y)$.

Now we are about to introduce a notion that plays a central role in the study of evolution equations. The modern strategy in studying parabolic equations is to make use of many different function spaces. The concept of evolution triple, which we define next, provides an appropriate analytical framework to realize this strategy.

DEFINITION 2.2.31
A triple of spaces $(X, H, X^*)$ is said to be an evolution triple, if the following are true:

(a) $X$ is a separable, reflexive Banach space;

(b) $H$ is a separable Hilbert space;

(c) the embedding $X \subseteq H$ is continuous and dense.

REMARK 2.2.32
By virtue of Lemma 2.2.27(b), the embedding $H^* \subseteq X^*$ is continuous and dense. Since by the Riesz-Fréchet representation theorem (see, e.g., Denkowski, Migórski & Papageorgiou (2003a, p. 316)) we can assume that $H = H^*$, then we have that all embeddings $X \subseteq H \subseteq X^*$ are continuous and dense. For all $h \in H$ and all $x \in X$, we have
\[
\langle h, x\rangle_X = (h, x)_H,
\]
i.e., $\langle\cdot,\cdot\rangle_X\big|_{H\times X} = (\cdot,\cdot)_H$. Also for $x^* \in X^*$, $x \in X$, we have
\[
\langle x^*, x\rangle_X = \lim_{\substack{h \,\stackrel{\|\cdot\|_{X^*}}{\longrightarrow}\, x^* \\ h \in H}} (h, x)_H
\]
(since $H$ is dense in $X^*$). Therefore if $X$ is a Hilbert space too, we do not represent the elements of $X^*$ using the inner product of $X$ (the Riesz-Fréchet theorem), but using the inner product of $H$.


EXAMPLE 2.2.33
If $Z \subseteq \mathbb{R}^N$ is a bounded open set with smooth boundary and $p \in [2,+\infty)$, then as we shall see in Section 2.4, the spaces
\[
X = W^{1,p}(Z), \qquad H = L^2(Z) \qquad \text{and} \qquad X^* = \big(W^{1,p}(Z)\big)^*
\]
form an evolution triple.

For the evolution triple $(X, H, X^*)$, we can consider the reflexive Banach space
\[
W_{pp'}(T) \stackrel{df}{=} \big\{ u \in L^p(T;X) : u' \in L^{p'}\big(T;X^*\big) \big\},
\]
with $\frac{1}{p} + \frac{1}{p'} = 1$, introduced earlier. In the next proposition we establish a regularity property for the elements of $W_{pp'}(T)$ and also derive an “integration by parts formula,” which is crucial in the treatment of evolution equations.

PROPOSITION 2.2.34 (Integration by Parts Formula)
If $(X, H, X^*)$ is an evolution triple and $1 < p, p' < +\infty$ with $\frac{1}{p} + \frac{1}{p'} = 1$, then

(a) $W_{pp'}(T) \subseteq C(T;H)$ and the embedding is continuous;

(b) for all $u, v \in W_{pp'}(T)$ and all $0 \le s \le t \le b$, we have
\[
\big(u(t), v(t)\big)_H - \big(u(s), v(s)\big)_H
= \int_s^t \Big[\big\langle u'(\tau), v(\tau)\big\rangle_X + \big\langle u(\tau), v'(\tau)\big\rangle_X\Big]\,d\tau.
\]

PROOF (a) Note that by the generalized Weierstrass approximation theorem, the space of $X$-valued polynomials is dense in $W_{pp'}(T)$. In particular then the embedding $C^1(T;X) \subseteq W_{pp'}(T)$ is dense. Now let $u, v \in C^1(T;X)$. We have
\[
\frac{d}{dt}\big(u(t), v(t)\big)_H = \big(u'(t), v(t)\big)_H + \big(u(t), v'(t)\big)_H \qquad \forall\, t \in T.
\]
Thus
\[
\big(u(t), v(t)\big)_H - \big(u(s), v(s)\big)_H
= \int_s^t \Big[\big(u'(\tau), v(\tau)\big)_H + \big(u(\tau), v'(\tau)\big)_H\Big]\,d\tau
\qquad \forall\, 0 \le s \le t \le b,
\]
so
\[
\big(u(t), v(t)\big)_H - \big(u(s), v(s)\big)_H
= \int_s^t \Big[\big\langle u'(\tau), v(\tau)\big\rangle_X + \big\langle u(\tau), v'(\tau)\big\rangle_X\Big]\,d\tau
\qquad \forall\, 0 \le s \le t \le b. \tag{2.26}
\]
Choose $\vartheta \in C^1(\mathbb{R})$, such that
\[
\vartheta(s) = 0, \qquad \vartheta(t) = 1 \qquad \text{and} \qquad |\vartheta| + |\vartheta'| \le 1 \ \text{ on } \mathbb{R}.
\]
Let
\[
v \stackrel{df}{=} \vartheta u.
\]
Then
\[
v' = \vartheta' u + \vartheta u'
\]
and using Hölder's inequality (see Theorem A.2.27) from (2.26), we obtain
\[
\big|u(t)\big|^2 \le c_1\|u\|_{pp'}^2 \qquad \forall\, t \in T,
\]
for some $c_1 > 0$ and so
\[
\|u\|_{C(T;H)} \le \sqrt{c_1}\,\|u\|_{pp'} \qquad \forall\, u \in C^1(T;X). \tag{2.27}
\]
Therefore the identity map
\[
i : \big(C^1(T;X), \|\cdot\|_{pp'}\big) \longrightarrow C(T;H)
\]
is continuous. But as we said in the beginning of the proof the embedding $C^1(T;X) \subseteq W_{pp'}(T)$ is dense. So we can extend $i$ continuously on $W_{pp'}(T)$. Hence the embedding $W_{pp'}(T) \subseteq C(T;H)$ is continuous.

(b) The integration by parts formula follows from (2.26) and the density of the embedding $C^1(T;X) \subseteq W_{pp'}(T)$.

REMARK 2.2.35
Even if the embedding $X \subseteq H$ is compact, the embedding $W_{pp'}(T) \subseteq C(T;H)$ is not compact (see Migórski (1994)). In general, if $X$ and $Z$ are Banach spaces, the embedding $X \subseteq Z$ is continuous (a special case is if $X = Z$), $p, r \in [1,\infty]$ and
\[
W_{pr}(T) \stackrel{df}{=} \big\{ u \in L^p(T;X) : u' \in L^r(T;Z) \big\},
\]
then $W_{pr}(T) \subseteq C(T;Z)$. This inclusion as well as that of Proposition 2.2.34(a) means that if $u \in W_{pp'}(T)$ (respectively $u \in W_{pr}(T)$), then there exists $u_1 \in C(T;H)$ (respectively $u_1 \in C(T;Z)$), such that
\[
u(t) = u_1(t) \qquad \text{for a.a. } t \in T.
\]

2.3 Compactness Results

In this section we prove compactness and weak compactness results for subsets of $C(T;X)$ and $L^p(T;X)$ ($p \in [1,+\infty)$). Throughout this section $T = [0,b]$ ($b < +\infty$) and $X$ is a Banach space. Additional hypotheses will be introduced as needed. We start with the classical “Arzela-Ascoli theorem” which characterizes the compact subsets of $C(T;X)$. In its proof we shall need the following lemma.

LEMMA 2.3.1
If $K \subseteq X$ is a nonempty set and for every $\varepsilon > 0$ there exists a relatively compact set $K_\varepsilon \subseteq X$, such that for every $x \in K$ we can find $x_\varepsilon \in K_\varepsilon$, such that $\|x - x_\varepsilon\|_X < \varepsilon$, then $K$ is relatively compact.

PROOF Let $\varepsilon > 0$. Choose $K_{\frac{\varepsilon}{2}} \subseteq X$ to be the relatively compact subset postulated by the hypothesis of the lemma. We can find $\{x_k^\varepsilon\}_{k=1}^n \subseteq K_{\frac{\varepsilon}{2}}$, such that
\[
K_{\frac{\varepsilon}{2}} \subseteq \bigcup_{k=1}^n B_{\frac{\varepsilon}{2}}\big(x_k^\varepsilon\big).
\]

By hypothesis for every x ∈ K, there exists x 2ε ⊆ K 2ε , such that ° ° °x − x ε ° 2

X

0 there exists δ(ε) > 0, such that, if t, s ∈ T and |t − s| < δ, then ° ° °u(t) − u(s)° < ε ∀u∈K X (the equicontinuity is uniform in t ∈ T since T is compact).


PROOF “$\Longrightarrow$”: Property (a) follows from the fact that for every $t \in T$ the evaluation at $t$ map $e_t : C(T;X) \ni u \longmapsto u(t) \in X$ is continuous. To prove property (b) (the equicontinuity property), we proceed as follows. Let $\varepsilon > 0$. Because $K$ is relatively compact in $C(T;X)$, we can find $\{u_k\}_{k=1}^n \subseteq K$, such that
\[
K \subseteq \bigcup_{k=1}^n B_{\frac{\varepsilon}{3}}(u_k).
\]
If $t \in T$, there is a $\delta = \delta(\varepsilon) > 0$, such that if $s \in T$ and $|t - s| < \delta$, then
\[
\big\|u_k(t) - u_k(s)\big\|_X < \frac{\varepsilon}{3} \qquad \forall\, k \in \{1,\ldots,n\}
\]
(recall that the functions $\{u_k\}_{k=1}^n$ are uniformly continuous on $T$ since $T$ is compact). Now let $s \in T$ with $|t - s| < \delta$ and $u \in K$. Choose $k_0 \in \{1,\ldots,n\}$, such that $\|u - u_{k_0}\|_\infty < \frac{\varepsilon}{3}$. We have
\begin{align*}
\big\|u(t) - u(s)\big\|_X
&\le \big\|u(t) - u_{k_0}(t)\big\|_X + \big\|u_{k_0}(t) - u_{k_0}(s)\big\|_X + \big\|u_{k_0}(s) - u(s)\big\|_X \\
&\le \|u - u_{k_0}\|_\infty + \frac{\varepsilon}{3} + \|u_{k_0} - u\|_\infty < \varepsilon,
\end{align*}
so $K$ is equicontinuous.

“$\Longleftarrow$”: First note that $K(0)$ and $K(b)$ are both relatively compact. Indeed, for a given $\varepsilon > 0$, we can find $\delta = \delta(\varepsilon) > 0$, such that if $0 < s < \delta$, then
\[
\big\|u(s) - u(0)\big\|_X < \varepsilon \qquad \forall\, u \in K.
\]
Since by hypothesis $K(s) \subseteq X$ is relatively compact, from Lemma 2.3.1, it follows that $K(0) \subseteq X$ is relatively compact. Similarly for $K(b) \subseteq X$. For every integer $N$, let $u_N : T \longrightarrow X$ be the function equal to $u \in K$ at the points $t_k = \frac{kb}{N}$, $k = 0,\ldots,N$ and linear between these points. Then the set $K_N \stackrel{df}{=} \{u_N : u \in K\}$ is isomorphic to
\[
\prod_{k=0}^N K\Big(\frac{kb}{N}\Big) \subseteq X^{N+1},
\]
which is relatively compact (Tychonoff's theorem). Therefore $K_N \subseteq C(T;X)$ is relatively compact. Also if $N > \frac{b}{\delta}$, then by property (b), we have
\[
\|u - u_N\|_\infty < \varepsilon.
\]
So by Lemma 2.3.1, we conclude that $K \subseteq C(T;X)$ is relatively compact.

We can have a “weak” variant of the Arzela-Ascoli theorem. First a definition.
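The estimate $\|u - u_N\|_\infty < \varepsilon$ for the piecewise linear interpolant can be checked as follows. This is a sketch in which $\lambda \in [0,1]$ is an auxiliary parameter introduced here: for $t \in [t_k, t_{k+1}]$ write $t = (1-\lambda)t_k + \lambda t_{k+1}$, and let $\delta = \delta(\varepsilon)$ be the constant from the equicontinuity property (b).

```latex
\begin{align*}
u_N(t) &= (1-\lambda)\,u(t_k) + \lambda\,u(t_{k+1}), \\
\big\|u(t)-u_N(t)\big\|_X
 &\le (1-\lambda)\,\big\|u(t)-u(t_k)\big\|_X
    + \lambda\,\big\|u(t)-u(t_{k+1})\big\|_X
  < (1-\lambda)\,\varepsilon + \lambda\,\varepsilon = \varepsilon .
\end{align*}
```

Both norms on the right are smaller than $\varepsilon$ because $|t - t_k|$ and $|t - t_{k+1}|$ are at most $\frac{b}{N} < \delta$. Since $t \longmapsto \|u(t) - u_N(t)\|_X$ is continuous on the compact set $T$, its maximum is attained, so the strict inequality passes to the supremum norm.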


DEFINITION 2.3.3
A set $K \subseteq C\big(T;X_w\big)$ is weakly equicontinuous, if for every $\varepsilon > 0$ and $x^* \in X^*$, we can find $\delta = \delta(\varepsilon, x^*) > 0$, such that if $t, s \in T$ and $|t - s| < \delta$, then
\[
\big|\langle x^*, u(t) - u(s)\rangle_X\big| < \varepsilon \qquad \forall\, u \in K.
\]
Also we say that a sequence of functions $u_n : T \longrightarrow X$, $n \ge 1$ converges weakly uniformly to $u : T \longrightarrow X$, if for every $\varepsilon > 0$ and $x^* \in X^*$, we can find $n_0 = n_0(\varepsilon, x^*) \ge 1$, such that
\[
\big|\langle x^*, u_n(t) - u(t)\rangle_X\big| < \varepsilon \qquad \forall\, t \in T,\ n \ge n_0.
\]

THEOREM 2.3.4
If $X^*$ is separable, $\{u_n\}_{n\ge 1} \subseteq C\big(T;X_w\big)$, for every $t \in T$, the set $\overline{\{u_n(t)\}_{n\ge 1}}^{\,w}$ is weakly compact in $X$ and the sequence $\{u_n\}_{n\ge 1}$ is weakly equicontinuous, then we can find $u \in C\big(T;X_w\big)$ and a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$ such that
\[
u_{n_k} \longrightarrow u \qquad \text{weakly uniformly in } T.
\]

PROOF Let $C^*$ be a countable dense subset of $X^*$. We introduce
\[
D^* \stackrel{df}{=} \operatorname{span}_{\mathbb{Q}} C^*,
\]
the set of linear combinations with rational coefficients of the elements of $C^*$. Evidently $D^*$ is countable and dense in $X^*$. Using the classical Arzela-Ascoli theorem on $C(T)$ together with the Cantor diagonal process, we can find a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$, such that for every $x^* \in D^*$
\[
\langle x^*, u_{n_k}(\cdot)\rangle_X \longrightarrow v(x^*)(\cdot) \qquad \text{in } C(T) \text{ as } k \to +\infty.
\]
Note that $x^* \longmapsto v(x^*)$ is a map defined on $D^*$ with values in $C(T)$. Moreover, we have
\[
\big|\langle x^* - y^*, u_n(t)\rangle_X\big| \le c_1\|x^* - y^*\|_{X^*} \qquad \forall\, t \in T,\ x^*, y^* \in D^*,
\]
for some $c_1 > 0$, so
\[
\big|v(x^*)(t) - v(y^*)(t)\big| \le c_1\|x^* - y^*\|_{X^*} \qquad \forall\, t \in T,\ x^*, y^* \in D^*
\]
and thus
\[
\big\|v(x^*) - v(y^*)\big\|_{C(T)} \le c_1\|x^* - y^*\|_{X^*}.
\]
Therefore the map $v : D^* \longrightarrow C(T)$ is uniformly continuous. Thus it can be extended to a unique continuous map $\widehat{v} : \overline{D^*}^{\,\|\cdot\|_{X^*}} = X^* \longrightarrow C(T)$. Clearly $\widehat{v}$ is continuous. This together with the fact that $K(t) \subseteq X$ is weakly compact imply that we can find $u : T \longrightarrow X$, such that
\[
\langle x^*, u(t)\rangle_X = \widehat{v}(x^*)(t) \qquad \forall\, x^* \in X^*,\ t \in T.
\]
We conclude that $u \in C\big(T;X_w\big)$ and $u_{n_k} \longrightarrow u$ weakly uniformly on $T$.


DEFINITION 2.3.5
A subset $K \subseteq L^p(T;X)$ ($p \in [1,+\infty)$) is said to be $p$-equiintegrable, if it is uniformly integrable (see Definition A.2.3) and
\[
\lim_{h\searrow 0} \int_0^{b-h} \big\|u(t+h) - u(t)\big\|_X^p\,dt = 0 \qquad \text{uniformly for all } u \in K.
\]

In the next theorem we present a characterization of relatively compact sets of the Lebesgue-Bochner spaces $L^p(T;X)$ ($p \in [1,+\infty]$) and also obtain an alternative criterion for compactness in $C(T;X)$ in which the compactness condition of the Arzela-Ascoli theorem (see Theorem 2.3.2) is replaced by a similar one for integrals. In what follows
\[
\tau_h(u)(t) \stackrel{df}{=} u(t+h) \qquad \forall\, h > 0.
\]
So if $u$ is defined on $T$, this translated version $\tau_h(u)$ is defined on $[-h, b-h]$. Note that the definition of $p$-equiintegrability is equivalent to saying that
\[
\big\|\tau_h(u) - u\big\|_{L^p(T_h;X)} \longrightarrow 0 \quad \text{as } h \to 0, \text{ uniformly for all } u \in K,
\]
with $T_h \stackrel{df}{=} [0, b-h]$.

THEOREM 2.3.6
$K \subseteq L^p(T;X)$, $p \in [1,+\infty)$ (respectively $K \subseteq C(T;X)$) is relatively compact if and only if

(a) for all $t, s \in (0,b)$, $s < t$, we have that the set $\Big\{\int_s^t u(\tau)\,d\tau : u \in K\Big\}$ is relatively compact in $X$; and

(b) $K$ is $p$-equiintegrable (respectively $\lim\limits_{h\searrow 0} \|\tau_h(u) - u\|_{L^\infty(T_h;X)} = 0$ uniformly in $u \in K$).

PROOF Suppose that $K \subseteq L^p(T;X)$, $p \in [1,+\infty)$.

“$\Longrightarrow$”: Since the map $u \longmapsto \int_s^t u(\tau)\,d\tau$ is continuous from $L^p(T;X)$ into $X$, property (a) is satisfied. Let $\varepsilon > 0$. Due to the relative compactness of $K \subseteq L^p(T;X)$, we can find a finite set $\{u_k\}_{k=1}^n \subseteq L^p(T;X)$, such that for every $u \in K$, we can find $k \in \{1,\ldots,n\}$, such that $\|u - u_k\|_p < \frac{\varepsilon}{3}$. Because the embedding $C(T;X) \subseteq L^p(T;X)$ is dense, we can assume that $u_k \in C(T;X)$. Then we can find $h_k > 0$, such that for $h \in (0, h_k)$, we have
\[
\big\|\tau_h(u_k) - u_k\big\|_{L^p(T_h;X)} < \frac{\varepsilon}{3}.
\]


Let $\widehat{h} = \min\limits_{1\le k\le n} h_k$. We have
\[
\tau_h(u) - u = \big(\tau_h(u - u_k) - (u - u_k)\big) + \tau_h(u_k) - u_k
\]
and so
\[
\big\|\tau_h(u) - u\big\|_{L^p(T_h;X)} < \varepsilon \qquad \forall\, h \le \widehat{h},\ u \in K,
\]
so
\[
\lim_{h\searrow 0} \big\|\tau_h(u) - u\big\|_{L^p(T_h;X)} = 0 \qquad \text{uniformly for } u \in K.
\]

“$\Longleftarrow$”: Let $u \in K$ and $r > 0$. We set
\[
M_r(u)(t) \stackrel{df}{=} \frac{1}{r}\int_t^{t+r} u(s)\,ds.
\]
We have $M_r(u) \in C\big(T_r;X\big)$ with $T_r = [0, b-r]$. For every $t, s \in [0, b-r]$, $s \le t$, we have
\[
\big\|M_r(u)(t) - M_r(u)(s)\big\|_X
= \Big\|\frac{1}{r}\int_s^{s+r} \big(\tau_{t-s}(u) - u\big)(\tau)\,d\tau\Big\|_X
\le \frac{1}{r}\big\|\tau_{t-s}(u) - u\big\|_{L^1(T_{t-s};X)},
\]
so
\[
M_r K = \big\{ M_r u : u \in K \big\} \subseteq C\big(T_r;X\big)
\]
is uniformly equicontinuous (see condition (b)). Also from condition (a), we see that for every $t \in (0, b-r)$, the set $\big(M_r K\big)(t) \subseteq X$ is relatively compact. So by Theorem 2.3.2, we have that $M_r K \subseteq C\big(T_r;X\big)$ is relatively compact. Note that
\[
M_r(u)(t) - u(t) = \frac{1}{r}\int_0^r \big(\tau_h(u) - u\big)(t)\,dh \qquad \forall\, t \in T_r,
\]
so
\[
\big\|M_r(u) - u\big\|_{L^p(T_r;X)} \le \max_{h\in[0,r]} \big\|\tau_h(u) - u\big\|_{L^p(T_r;X)}.
\]
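The passage from the pointwise identity for $M_r(u)(t) - u(t)$ to the $L^p$-estimate can be sketched as follows, assuming Minkowski's integral inequality for Bochner integrals (this tool is not named in the text). The first line is the substitution $s = t + h$ in the definition of $M_r(u)$.

```latex
\begin{align*}
M_r(u)(t)-u(t)
 &= \frac{1}{r}\int_t^{t+r}\big(u(s)-u(t)\big)\,ds
  = \frac{1}{r}\int_0^r\big(u(t+h)-u(t)\big)\,dh , \\
\big\|M_r(u)-u\big\|_{L^p(T_r;X)}
 &\le \frac{1}{r}\int_0^r\big\|\tau_h(u)-u\big\|_{L^p(T_r;X)}\,dh
  \le \max_{h\in[0,r]}\big\|\tau_h(u)-u\big\|_{L^p(T_r;X)} .
\end{align*}
```

Here $t \in T_r$ and $h \in [0,r]$ guarantee $t + h \in T$, so every term is well defined; the last step bounds an average by the maximum.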

But because of condition (b), for all $\widehat{b} < b$, $K$ is the uniform limit of $M_r K$ in $L^p\big(\widehat{T};X\big)$ with $\widehat{T} = \big[0,\widehat{b}\,\big]$ as $r \to 0$, $r \le b - \widehat{b}$. But since $M_r K$ is relatively compact in $C\big(\widehat{T};X\big)$ and the embedding $C\big(\widehat{T};X\big) \subseteq L^p\big(\widehat{T};X\big)$ is continuous, we see that $K$ is relatively compact in $L^p\big(\widehat{T};X\big)$. Conditions (a) and (b) remain valid if one changes the time direction. Namely if $\overline{u}(t) = u(b - t)$, then the set $\overline{K} = \big\{\overline{u} : u \in K\big\}$ still satisfies


conditions (a) and (b). Then from the previous argument we have that $\overline{K}$ is relatively compact in $L^p\big(\widehat{T};X\big)$. It follows that $K$ is relatively compact in $L^p\big(\big[\,\widehat{b}, b\big];X\big)$. Setting for example $\widehat{b} = \frac{b}{2}$, we obtain the relative compactness of $K$ in $L^p(T;X)$. The proof of the case when $K \subseteq C(T;X)$ is similar.

REMARK 2.3.7
The restriction $p < +\infty$ is necessary, because if $K = \{u\}$ with $u$ bounded but discontinuous from $T$ into $X$, then $K$ is compact in $L^\infty(T;X)$ but condition (b) is not satisfied.

COROLLARY 2.3.8
If $u \in L^p(T;X)$, $p \in [1,+\infty)$, then
\[
\big\|\tau_h(u) - u\big\|_{L^p(T_h;X)} \longrightarrow 0 \qquad \text{as } h \searrow 0.
\]

When $X = \mathbb{R}$, we have the so-called “Riesz-Kolmogorov theorem” for $p \in [1,+\infty)$.

COROLLARY 2.3.9 (Riesz-Kolmogorov Theorem)
$K \subseteq L^p(T)$, $p \in [1,+\infty)$ (respectively $K \subseteq C(T)$) is relatively compact if and only if

(a) there exist $t, s \in (0,b)$, $s < t$, such that the set
\[
\Big\{\int_s^t u(\tau)\,d\tau : u \in K\Big\} \subseteq X
\]
is bounded; and

(b) $\displaystyle\int_0^{b-h} \big|u(t+h) - u(t)\big|^p\,dt \longrightarrow 0$ as $h \searrow 0$ uniformly for $u \in K$.

PROOF From (a) and (b) it follows that for all $t, s \in (0,b)$, $s < t$, we have that the set $\Big\{\int_s^t u(\tau)\,d\tau : u \in K\Big\} \subseteq X$ is bounded. So we can apply Theorem 2.3.6.

REMARK 2.3.10
Theorem 2.3.6 and Corollary 2.3.9 provide characterizations of compact sets in $C(T;X)$ and $C(T)$ respectively. Compared with the classical Arzela-Ascoli theorem (see Theorem 2.3.2), we see that condition (a) (the space criterion) is now a condition on integrals, while condition (b) (the time criterion) remains the same.


Next we shall characterize sets which are bounded in $L^p(T;X)$ and compact in $L^r(T;X)$ with $r < p$. Such results are known as “partial compactness” results, since the compactness is not achieved for the larger order $p$ for which the set is actually bounded. First we obtain two auxiliary results relating compactness with time-local compactness.

LEMMA 2.3.11
The set $K \subseteq L^p(T;X)$ (with $p \in [1,+\infty)$) is relatively compact if and only if

(a) $K \subseteq L^p_{loc}(T;X)$ is relatively compact (i.e., for all $s, t \in (0,b)$, $s < t$, the set $K|_{[s,t]} \subseteq L^p\big([s,t];X\big)$ is relatively compact); and

(b) $\displaystyle\int_0^h \big\|u(t)\big\|_X^p\,dt + \int_{b-h}^b \big\|u(t)\big\|_X^p\,dt \longrightarrow 0$ as $h \to 0$ uniformly for $u \in K$.

PROOF “$\Longrightarrow$”: Condition (a) is automatically true. Let $\overline{u}$ be the extension by $0$ outside $T$ of $u$. Then the set
\[
\overline{K} = \big\{\overline{u} : u \in K\big\}
\]
is relatively compact in $L^p\big([-b, 2b];X\big)$. As
\[
\big\|\tau_h(\overline{u}) - \overline{u}\big\|_{L^p([-b,2b];X)}^p
= \int_0^h \big\|u(t)\big\|_X^p\,dt + \int_0^{b-h} \big\|u(t+h) - u(t)\big\|_X^p\,dt + \int_{b-h}^b \big\|u(t)\big\|_X^p\,dt,
\]
applying Theorem 2.3.6, we obtain
\[
\int_0^h \big\|u(t)\big\|_X^p\,dt + \int_{b-h}^b \big\|u(t)\big\|_X^p\,dt \longrightarrow 0 \quad \text{as } h \searrow 0
\qquad \text{uniformly for } u \in K.
\]

“$\Longleftarrow$”: Let
\[
u_h \stackrel{df}{=} \chi_{[h,b-h]}\,u \qquad \text{and} \qquad K_h \stackrel{df}{=} \big\{u_h : u \in K\big\}.
\]
Condition (b) implies that for a given $\varepsilon > 0$, we can find $h > 0$ small enough, such that
\[
\|u_h - u\|_{L^p(T;X)} < \varepsilon \qquad \forall\, u \in K.
\]
Since $K_h \subseteq L^p(T;X)$ is relatively compact (see condition (a)), from Lemma 2.3.1, we infer that $K \subseteq L^p(T;X)$ is relatively compact.
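The link between condition (b) and the truncation error in the implication “$\Longleftarrow$” of Lemma 2.3.11 can be written out explicitly. The computation below is a sketch valid for $0 < h < \frac{b}{2}$, so that the two end intervals do not overlap.

```latex
\[
u - u_h = \chi_{[0,h)\cup(b-h,b]}\,u
\qquad\Longrightarrow\qquad
\|u_h-u\|_{L^p(T;X)}^p
 = \int_0^h\big\|u(t)\big\|_X^p\,dt + \int_{b-h}^b\big\|u(t)\big\|_X^p\,dt .
\]
```

By condition (b) the right-hand side is smaller than $\varepsilon^p$ for all $u \in K$ once $h$ is small enough. The set $K_h$ is relatively compact in $L^p(T;X)$ because extension by zero from $[h, b-h]$ to $T$ is an isometry of $L^p\big([h,b-h];X\big)$ into $L^p(T;X)$, and $K|_{[h,b-h]}$ is relatively compact by condition (a).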


LEMMA 2.3.12
If $K \subseteq L^p(T;X)$ (with $p \in (1,+\infty]$) is bounded and $K \subseteq L^1_{loc}(T;X)$ is relatively compact (i.e., for all $t, s \in (0,b)$, $s < t$, $K|_{[s,t]} \subseteq L^1\big([s,t];X\big)$ is relatively compact), then $K \subseteq L^r(T;X)$ is relatively compact for all $r \in (1,p)$.

PROOF For every $h \le b$ and every $u \in K$, from Hölder's inequality (see Theorem A.2.27), we have
\[
\int_0^h \big\|u(t)\big\|_X\,dt + \int_{b-h}^b \big\|u(t)\big\|_X\,dt \le 2h^{\frac{1}{q'}}\|u\|_q,
\]

so K ⊆ L1 (T ; X) is relatively compact (see Lemma 2.3.11). So for a given ©ε > 0, weª can find {uk }nk=1 ⊆ K, such that for every u ∈ K, there exists k ∈ 1, . . . , n , such that 1

ku − uk k1

0, we can find $\{u_k\}_{k=1}^n \subseteq K$, such that for each $u \in K$ there exists $k \in \{1,\ldots,n\}$, such that
\[
\|u - u_k\|_{L^p(T;Z)} < \varepsilon.
\]
Invoking Lemma 2.2.29, for every $\xi > 0$, we can find $c = c(\xi) > 0$, such that
\[
\|u - u_k\|_{L^p(T;Y)} \le \xi\|u - u_k\|_{L^p(T;X)} + c\|u - u_k\|_{L^p(T;Z)} \le \xi\delta_X + c\varepsilon, \tag{2.28}
\]
where $\delta_X \stackrel{df}{=} \operatorname{diam}_{L^p(T;X)} K$. For a given $\varepsilon' > 0$, select
\[
\xi = \frac{\varepsilon'}{2\delta_X} \qquad \text{and} \qquad \varepsilon \stackrel{df}{=} \frac{\varepsilon'}{2c}.
\]
Then from (2.28), we have
\[
\|u - u_k\|_{L^p(T;Y)} \le \varepsilon',
\]
so $K \subseteq L^p(T;Y)$ is relatively compact.

Based on the above lemma, we can have the following compactness result for an intermediate space.

THEOREM 2.3.19
If $Y, Z$ are Banach spaces, the embeddings $X \subseteq Y \subseteq Z$ are continuous, with the first embedding compact, $p \in [1,+\infty]$ and

(i) $K \subseteq L^p(T;X)$ is bounded,

(ii) $\|\tau_h(u) - u\|_{L^p(T_h;Z)} \longrightarrow 0$ as $h \searrow 0$ uniformly for $u \in K$,

then $K$ is relatively compact in $L^p(T;Y)$ if $p \in [1,+\infty)$ and in $C(T;Y)$ if $p = +\infty$.

PROOF Because of the compactness of the embedding $X \subseteq Y$ (hence of $X \subseteq Z$) and of Theorem 2.3.6, the set $K \subseteq L^p(T;Z)$ is relatively compact. An application of Lemma 2.3.18 finishes the proof.

This result permits an extension of Theorem 2.2.30.


THEOREM 2.3.20
If $Y, Z$ are Banach spaces, the embeddings $X \subseteq Y \subseteq Z$ are continuous with the first embedding compact, then

(a) if $K \subseteq L^p(T;X)$ (with $p \in [1,+\infty)$) is bounded and the set $K' \stackrel{df}{=} \big\{u' = Du : u \in K\big\} \subseteq L^1(T;Z)$ is bounded, we have that $K \subseteq L^p(T;Y)$ is relatively compact;

(b) if $K \subseteq L^\infty(T;X)$ is bounded and $K' \stackrel{df}{=} \big\{u' = Du : u \in K\big\} \subseteq L^r(T;Z)$ (with $r > 1$) is bounded, we have that $K \subseteq C(T;Y)$ is relatively compact.

Now let us look at weakly compact subsets of $L^1(\Omega;X)$. To describe a large class of such sets in $L^1(\Omega;X)$, we shall need two results which for easy reference we state here without proofs. The first is the celebrated James theorem.

THEOREM 2.3.21 (James Theorem)
A nonempty, weakly closed and bounded subset of a Banach space $X$ is weakly compact if and only if every $x^* \in X^*$ attains its maximum on the set.

The second result is a remarkable consequence of the property of decomposability. If $(\Omega, \Sigma, \mu)$ is a finite measure space and $X$ is a Banach space, a set $K \subseteq L^1(\Omega;X)$ is said to be decomposable if and only if
\[
\chi_A u_1 + \chi_{A^c} u_2 \in K \qquad \text{for all } (u_1, u_2, A) \in K \times K \times \Sigma.
\]

PROPOSITION 2.3.22
If $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is a separable Banach space, $\varphi : \Omega \times X \longrightarrow \overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$ is jointly measurable, $F : \Omega \longrightarrow 2^X \setminus \{\emptyset\}$ is graph measurable (i.e., $\operatorname{Gr} F \stackrel{df}{=} \big\{(\omega, x) \in \Omega \times X : x \in F(\omega)\big\} \in \Sigma \times \mathcal{B}(X)$ with $\mathcal{B}(X)$ being the Borel $\sigma$-field of $X$),
\[
I_\varphi(u) = \int_\Omega \varphi\big(\omega, u(\omega)\big)\,d\mu
\]
is defined (maybe $+\infty$ or $-\infty$) for all $u \in S_F^1$ with
\[
S_F^1 \stackrel{df}{=} \big\{u \in L^1(\Omega;X) : u(\omega) \in F(\omega) \text{ for } \mu\text{-a.a. } \omega \in \Omega\big\}
\]
and there exists $u_0 \in S_F^1$, such that $I_\varphi(u_0) > -\infty$, then
\[
\sup_{u\in S_F^1} I_\varphi(u) = \int_\Omega \sup_{x\in F(\omega)} \varphi(\omega, x)\,d\mu.
\]


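Using these results, we can identify a large class of weakly compact subsets of $L^1(\Omega;X)$.

THEOREM 2.3.23
If $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is a separable Banach space, $F : \Omega \longrightarrow 2^X \setminus \{\emptyset\}$ is graph measurable, for $\mu$-almost all $\omega \in \Omega$, $F(\omega)$ is weakly compact, convex and there exists $h \in L^1(\Omega)_+$, such that
\[
\sup_{x\in F(\omega)} \|x\|_X \le h(\omega) \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega,
\]
then
\[
S_F^1 = \big\{u \in L^1(\Omega;X) : u(\omega) \in F(\omega) \text{ for } \mu\text{-a.a. } \omega \in \Omega\big\}
\]
is weakly compact and convex.

PROOF Convexity of $S_F^1$ is obvious. Moreover, because of the boundedness by $h \in L^1(\Omega)_+$, we have $S_F^1 \ne \emptyset$ (see Denkowski, Migórski & Papageorgiou (2003a, p. 432)). So we show that $S_F^1$ is weakly compact in $L^1(\Omega;X)$. According to Theorem 2.3.21, it suffices to show that every $u^* \in \big(L^1(\Omega;X)\big)^*$ attains its supremum on $S_F^1$. From Theorem 2.2.12, we know that
\[
\big(L^1(\Omega;X)\big)^* = L^\infty\big(\Omega; X^*_{w^*}\big)
\]
and the duality pairing is given by
\[
\langle u^*, u\rangle_{L^1(\Omega;X)} = \int_\Omega \big\langle u^*(\omega), u(\omega)\big\rangle_X\,d\mu.
\]
From Proposition 2.3.22, we have
\[
\sup_{u\in S_F^1} \langle u^*, u\rangle_{L^1(\Omega;X)}
= \sup_{u\in S_F^1} \int_\Omega \big\langle u^*(\omega), u(\omega)\big\rangle_X\,d\mu
= \int_\Omega \sup_{x\in F(\omega)} \big\langle u^*(\omega), x\big\rangle_X\,d\mu.
\]
Let
\[
M(\omega) \stackrel{df}{=} \Big\{ y \in F(\omega) : \big\langle u^*(\omega), y\big\rangle_X = \sup_{x\in F(\omega)} \big\langle u^*(\omega), x\big\rangle_X \Big\}.
\]
Since $F(\omega)$ is $\mu$-almost everywhere $w$-compact, we see that
\[
M(\omega) \ne \emptyset \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega.
\]
By setting $F(\omega) = \{0\}$ on the exceptional $\mu$-null set, we can say that
\[
M(\omega) \ne \emptyset \qquad \forall\, \omega \in \Omega.
\]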


Also from Denkowski, Mig´orski & Papageorgiou (2003a, p. 433), we know that we can find a sequence of Σ-measurable functions fn : Ω −→ X, such that fn (ω) ∈ F (ω)

∀ ω ∈ Ω, n > 1

and F (ω) = {fn (ω)}n>1 Hence

sup

u∗ (ω), x

x∈F (ω)

® X

k·kX

∀ ω ∈ Ω.

® = sup u∗ (ω), fn (ω) X n>1

and so ω 7−→

sup x∈F (ω)

∗ ® u (ω), x X = m∗ (ω)

is Σ-measurable.

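Then
\[ \mathrm{Gr}\, M = \bigl\{ (\omega,x) \in \Omega \times X : x \in M(\omega) \bigr\} = \bigl\{ (\omega,x) \in \Omega \times X : \langle u^*(\omega), x \rangle_X = m^*(\omega) \bigr\}. \]
Since $u^*$ is $w^*$-measurable, it follows that $\mathrm{Gr}\, M \in \Sigma \times \mathcal{B}(X)$. So we can apply the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) and obtain a strongly measurable function $u_0 : \Omega \to X$, such that $u_0(\omega) \in M(\omega)$ for $\mu$-a.a. $\omega \in \Omega$. Evidently
\[ u_0 \in S_F^1 \quad \text{and} \quad \langle u^*, u_0 \rangle_{L^1(\Omega;X)} = \sup_{u \in S_F^1} \langle u^*, u \rangle_{L^1(\Omega;X)}. \]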

Since $u^* \in L^\infty\bigl(\Omega; X^*_{w^*}\bigr) = \bigl(L^1(\Omega;X)\bigr)^*$ was arbitrary and clearly $S_F^1$ is weakly closed and bounded, from Theorem 2.3.21, we conclude that $S_F^1 \subseteq L^1(\Omega;X)$ is weakly compact.

A classical theorem of Dunford-Pettis isolates the relatively weakly compact subsets of $L^1(\Omega)$ as the bounded, uniformly integrable subsets (see Definition A.2.3). If $X$ is reflexive, the original proof for $L^1(\Omega)$ extends with only notational changes to the present vector valued setting. So we have

THEOREM 2.3.24 (Dunford-Pettis Theorem) If $(\Omega,\Sigma,\mu)$ is a finite measure space, $X$ is reflexive and $K \subseteq L^1(\Omega;X)$ is bounded, then $K$ is relatively weakly compact in $L^1(\Omega;X)$ if and only if it is uniformly integrable.


The relative weak compactness in $L^1(\Omega;X)$ is closely related to the so-called “biting convergence,” which is useful in the calculus of variations and in optimal control.

DEFINITION 2.3.25 A sequence $\{u_n\}_{n \ge 1} \subseteq L^1(\Omega;X)$ is said to converge to $u \in L^1(\Omega;X)$ in the biting sense, denoted by $u_n \stackrel{b}{\longrightarrow} u$, if there exists an increasing sequence $\{C_m\}_{m \ge 1} \subseteq \Sigma$, such that $\mu(C_m) \nearrow \mu(\Omega)$ as $m \to +\infty$ and
\[ u_n \stackrel{w}{\longrightarrow} u \quad \text{in } L^1(C_m;X) \quad \forall\, m \ge 1. \]

The so-called “Chacon Biting Lemma” says that if $X$ is a reflexive Banach space, then every bounded sequence in $L^1(\Omega;X)$ has a subsequence converging in $L^1(\Omega;X)$ in the biting sense. The next result is a slightly stronger version of the original biting lemma.

THEOREM 2.3.26 If $(\Omega,\Sigma,\mu)$ is a finite measure space, $X$ is a Banach space and $\{u_n\}_{n \ge 1} \subseteq L^1(\Omega;X)$ is bounded, then there exists a subsequence $\{u_{n_k}\}_{k \ge 1}$ of $\{u_n\}_{n \ge 1}$ and an increasing sequence $\{C_m\}_{m \ge 1} \subseteq \Sigma$, such that $\mu(C_m) \nearrow \mu(\Omega)$ as $m \to +\infty$ and $\bigl\{ \chi_{C_k} u_{n_k} \bigr\}_{k \ge 1}$ is uniformly integrable.

PROOF Let $m \ge 1$ and define
\[ h_m(t) \stackrel{df}{=} \sup_{n \ge m} \int_{\{\|u_n\|_X > t\}} \|u_n(\omega)\|_X \, d\mu \quad \forall\, t \ge 0. \]

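Note that $h_m : \mathbb{R}_+ \to \mathbb{R}_+$ and it is decreasing. So $\lim_{t \to +\infty} h_m(t)$ exists for all $m \ge 1$. Let
\[ \xi \stackrel{df}{=} \lim_{t \to +\infty} h_1(t). \]
Since for every $m \ge 1$, $\{u_1, \ldots, u_{m-1}\}$ is uniformly integrable, it follows that
\[ \lim_{t \to +\infty} h_m(t) = \xi. \]
Let $\{t_i\}_{i \ge 1} \subseteq \mathbb{R}_+$ be a sequence increasing to $+\infty$, such that
\[ \xi \le h_1(t_i) \le \xi + \frac{1}{i} \quad \forall\, i \ge 1. \]
Because
\[ \xi \le h_m(t) \quad \forall\, m \ge 1,\ t \in \mathbb{R}_+, \]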


we can find a strictly increasing sequence $\{l_i\}_{i \ge 1}$, such that
\[ \int_{\{\|u_{l_i}\|_X > t_i\}} \|u_{l_i}(\omega)\|_X \, d\mu \ge \xi - \frac{1}{i}. \]

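Set $D_i \stackrel{df}{=} \{\|u_{l_i}\|_X > t_i\}$. Then
\[ t_i\, \mu(D_i) \le \sup_{n \ge 1} \|u_n\|_1 \]
and so $\mu(D_i) \to 0$ as $i \to +\infty$. Let $C_i \stackrel{df}{=} \Omega \setminus D_i$. We claim that the sequence $\bigl\{ \chi_{C_i} u_{l_i} \bigr\}_{i \ge 1}$ is uniformly integrable. To this end let
\[ h(t) \stackrel{df}{=} \sup_{i \ge 1} \int_{C_i \cap \{\|u_{l_i}\|_X > t\}} \|u_{l_i}(\omega)\|_X \, d\mu. \]
We need to show that $h(t) \to 0$ as $t \to +\infty$. Since $C_i \cap \{\|u_{l_i}\|_X > t_r\} = \emptyset$ whenever $t_i \le t_r$, we have
\begin{align*}
h(t_r) &= \sup_{i \ge r} \int_{\{t_r < \|u_{l_i}\|_X \le t_i\}} \|u_{l_i}(\omega)\|_X \, d\mu \\
&= \sup_{i \ge r} \biggl[ \int_{\{\|u_{l_i}\|_X > t_r\}} \|u_{l_i}(\omega)\|_X \, d\mu - \int_{\{\|u_{l_i}\|_X > t_i\}} \|u_{l_i}(\omega)\|_X \, d\mu \biggr] \\
&\le \sup_{i \ge r} \biggl[ \Bigl( \xi + \frac{1}{r} \Bigr) - \Bigl( \xi - \frac{1}{i} \Bigr) \biggr] \le \frac{2}{r},
\end{align*}
so $h(t) \to 0$ as $t \to +\infty$.

Finally replace $\{D_i\}_{i \ge 1}$ by a sequence decreasing to $\emptyset$ (or a $\mu$-null set). To do so, note that there exists a strictly increasing sequence $\{i_k\}_{k \ge 1}$, such that
\[ \mu\bigl(D_{i_k}\bigr) \le \frac{1}{2^k} \quad \forall\, k \ge 1. \]
Set $A_k \stackrel{df}{=} \bigcup_{j=k}^{\infty} D_{i_j}$ for $k \ge 1$. Then $\{A_k\}_{k \ge 1}$ decreases to a $\mu$-null set. The subsequence of the statement of the theorem is defined by setting $u_{n_k} = u_{l_i}$ with $i = i_k$. Also $C_k = \Omega \setminus A_k$ for $k \ge 1$. The uniform integrability of the sequence $\bigl\{ \chi_{C_k} u_{n_k} \bigr\}_{k \ge 1}$ follows from the inclusion $\Omega \setminus A_k \subseteq \Omega \setminus D_{i_k}$.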


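EXAMPLE 2.3.27 In the previous theorem, it is necessary to pass to a subsequence. To see this let $\Omega = [0,1]$ and $\mu = \lambda^1$ (the Lebesgue measure on $\mathbb{R}$). If $n = 2^k + i$ with $k \ge 1$, $i \in \{0, \ldots, 2^k - 1\}$, we set
\[ u_n(\omega) \stackrel{df}{=} \begin{cases} 2^k & \text{if } \omega \in \bigl[ \tfrac{i}{2^k}, \tfrac{i+1}{2^k} \bigr), \\ 0 & \text{otherwise.} \end{cases} \]
Then there is no increasing sequence $\{C_m\}_{m \ge 1}$ with $\Omega = \bigcup_{m=1}^{\infty} C_m$, such that $\bigl\{ \chi_{C_m} u_n \bigr\}_{n \ge 1}$ is uniformly integrable. Indeed, if
\[ \lambda^1\bigl( \Omega \setminus C_m \bigr) \le \frac{1}{2}, \]
then for all $k \ge 1$, there exists $i \in \{0, \ldots, 2^k - 1\}$, such that
\[ \lambda^1\Bigl( \bigl[ \tfrac{i}{2^k}, \tfrac{i+1}{2^k} \bigr) \cap C_m \Bigr) \ge \frac{1}{2^{k+1}}. \]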
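The last inequality of the example can be turned into an explicit lower bound on the tails. The following is only a sketch of that remaining step; the shorthand $I_{k,i}$ for the dyadic interval is introduced here for convenience and is not notation from the text.

```latex
% I_{k,i} := [ i/2^k , (i+1)/2^k )  (shorthand introduced for this sketch)
\begin{align*}
\sum_{i=0}^{2^k-1} \lambda^1\bigl(I_{k,i}\cap C_m\bigr)
  &= \lambda^1\bigl([0,1)\cap C_m\bigr) \ \ge\ \tfrac12
  \ \Longrightarrow\
  \lambda^1\bigl(I_{k,i}\cap C_m\bigr) \ge \frac{1}{2^{k+1}}
  \ \text{ for some } i, \\
\int_{C_m\cap\{u_n>t\}} u_n \, d\lambda^1
  &= 2^k\,\lambda^1\bigl(I_{k,i}\cap C_m\bigr) \ \ge\ \frac12
  \qquad \text{for } n = 2^k+i,\ \ 0 \le t < 2^k .
\end{align*}
```

Since $k$ is arbitrary, the tail integrals over $C_m$ do not tend to $0$ uniformly in $n$ as $t \to +\infty$. Because the sets $C_m$ increase, the same bound holds with $C_n$ in place of $C_m$ for all $n \ge m$, which rules out uniform integrability of the full sequence.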

COROLLARY 2.3.28 If $X$ is reflexive and $\{u_n\}_{n \ge 1} \subseteq L^1(\Omega;X)$ is bounded, then we can find a subsequence $\{u_{n_k}\}_{k \ge 1}$ of $\{u_n\}_{n \ge 1}$ and $u \in L^1(\Omega;X)$, such that
\[ u_{n_k} \stackrel{b}{\longrightarrow} u \quad \text{as } k \to +\infty. \]

REMARK 2.3.29 As we shall see in Section 2.5, some of the ideas involved in the “Biting Lemma” are common in the “concentration compactness theorem” (see Theorem 2.5.30).

To have an analogous result for the spaces $L^p(T;X)$, with $p \in (1,+\infty)$, we need the following result, which is useful in many situations since it provides information about the pointwise behaviour of a weakly convergent sequence in $L^p(T;X)$ for $p \in [1,+\infty)$. First a definition-notation.

DEFINITION 2.3.30 If $X$ is a Banach space and $\{A_n\}_{n \ge 1} \subseteq 2^X \setminus \{\emptyset\}$, we set
\[ w\text{-}\limsup_{n \to +\infty} A_n \stackrel{df}{=} \Bigl\{ x \in X : x = w\text{-}\lim_{k \to +\infty} x_{n_k},\ x_{n_k} \in A_{n_k},\ n_1 < n_2 < \ldots \Bigr\}. \]
Here $w$ stands for the weak topology on $X$.


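PROPOSITION 2.3.31 If $(\Omega,\Sigma,\mu)$ is a finite measure space, $X$ is a Banach space, $\{u_n\}_{n \ge 1} \subseteq L^p(\Omega;X)$ and $u \in L^p(\Omega;X)$ with $p \in [1,+\infty)$,
\[ u_n \stackrel{w}{\longrightarrow} u \quad \text{in } L^p(\Omega;X) \]
and for $\mu$-almost all $\omega \in \Omega$, the sequence $\{u_n(\omega)\}_{n \ge 1}$ is relatively weakly compact, then
\[ u(\omega) \in \overline{\mathrm{conv}}\ w\text{-}\limsup_{n \to +\infty} \bigl\{ u_n(\omega) \bigr\} \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]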

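Using this proposition we obtain the following result for bounded sequences in $L^p(\Omega;X)$ (with $p \in (1,+\infty)$).

THEOREM 2.3.32 If $(\Omega,\Sigma,\mu)$ is a finite measure space, $X$ is a reflexive Banach space, $\{u_n\}_{n \ge 1} \subseteq L^p(\Omega;X)$ (with $p \in (1,+\infty)$) is bounded and
\[ u_n(\omega) \stackrel{w}{\longrightarrow} u(\omega) \quad \text{in } X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega, \tag{2.29} \]
then $u \in L^p(\Omega;X)$ and
\[ u_n \stackrel{w}{\longrightarrow} u \quad \text{in } L^p(\Omega;X). \]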

PROOF From Proposition 2.2.3(c), we know that $L^p(\Omega;X)$ is reflexive. So by the Eberlein-Smulian theorem (see Theorem A.3.8), we can find a subsequence $\{u_{n_k}\}_{k \ge 1}$ of $\{u_n\}_{n \ge 1}$, such that
\[ u_{n_k} \stackrel{w}{\longrightarrow} \widehat{u} \quad \text{in } L^p(\Omega;X). \]
Using Proposition 2.3.31 and (2.29), we infer that $u = \widehat{u} \in L^p(\Omega;X)$. So every subsequence of $\{u_n\}_{n \ge 1}$ has a further subsequence weakly convergent in $L^p(\Omega;X)$ to $u$ and from this it follows that
\[ u_n \stackrel{w}{\longrightarrow} u \quad \text{in } L^p(\Omega;X). \]

Another notion related to the weak convergence in $L^1(T;X)$ ($T = [0,b]$) is given in the next definition.

DEFINITION 2.3.33 Let $T = [0,b]$ ($b < +\infty$) and $X$ a Banach space. The weak norm on $L^1(T;X)$ is defined by
\[ \|u\|_w \stackrel{df}{=} \max_{0 \le s \le t \le b} \Bigl\| \int_s^t u(\tau)\, d\tau \Bigr\|_X \quad \forall\, u \in L^1(T;X). \]


REMARK 2.3.34 Equivalently we can define
\[ \|u\|_w = \max_{t \in T} \Bigl\| \int_0^t u(\tau)\, d\tau \Bigr\|_X \quad \forall\, u \in L^1(T;X). \]
Evidently $\|\cdot\|_w$ is a norm on $L^1(T;X)$ weaker than the usual norm
\[ \|u\|_1 = \int_0^b \|u(t)\|_X \, dt \quad \forall\, u \in L^1(T;X). \]
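The word “equivalently” in Remark 2.3.34 can be read as equivalence of norms. A short sketch of the two-sided estimate, together with the comparison with $\|\cdot\|_1$, is the following; here $\|u\|_w$ denotes the norm of Definition 2.3.33.

```latex
\begin{align*}
\max_{t\in T}\Bigl\|\int_0^t u(\tau)\,d\tau\Bigr\|_X
  \ \le\ \|u\|_w
  &= \max_{0\le s\le t\le b}
     \Bigl\|\int_0^t u(\tau)\,d\tau-\int_0^s u(\tau)\,d\tau\Bigr\|_X
  \ \le\ 2\max_{t\in T}\Bigl\|\int_0^t u(\tau)\,d\tau\Bigr\|_X, \\
\|u\|_w
  &\le \max_{0\le s\le t\le b}\int_s^t \|u(\tau)\|_X\,d\tau
  \ =\ \|u\|_1 .
\end{align*}
```

The first line takes $s = 0$ for the lower bound and uses the triangle inequality for the upper bound, so the two expressions generate the same topology.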

We shall show that for a broad class of subsets of $L^1(T;X)$, the topology generated by the weak norm $\|\cdot\|_w$ and the weak $L^1(T;X)$-topology coincide. For this purpose we introduce the following property for subsets of $L^1(T;X)$.

DEFINITION 2.3.35 Let $T = [0,b]$ and $X$ a Banach space. We say that $K \subseteq L^1(T;X)$ has property $U$, if
(a) $K$ is uniformly integrable; and
(b) for every $\varepsilon > 0$, there exists a compact set $C_\varepsilon \subseteq X$, such that for every $u \in K$ there exists a Lebesgue measurable set $A_{\varepsilon,u} \subseteq T$, such that
\[ \lambda^1\bigl( T \setminus A_{\varepsilon,u} \bigr) < \varepsilon \quad \text{and} \quad u(t) \in C_\varepsilon \quad \forall\, t \in A_{\varepsilon,u} \]
(here $\lambda^1$ stands for the Lebesgue measure on $T$).

REMARK 2.3.36 Since the Lebesgue measure $\lambda^1$ is nonatomic (see Theorem A.2.5 and Remark A.2.6), the uniform integrability property implies that $K$ is bounded. Also if $K \subseteq L^1(T;X)$ has property $U$, then $K$ is relatively $w$-compact (see Bourgain (1979)).

THEOREM 2.3.37 If $T = [0,b]$, $X$ is a Banach space and $K \subseteq L^1(T;X)$ has property $U$, then the weak $L^1(T;X)$-topology and the $\|\cdot\|_w$-norm topology on $K$ coincide. Moreover, $K$ is relatively $\|\cdot\|_w$-compact.

PROOF For every $n \ge 1$, let $C_{1/n} \subseteq X$ be the compact set postulated by Definition 2.3.35. The set
\[ C \stackrel{df}{=} \bigcup_{n=1}^{\infty} C_{1/n} \]
is separable in $X$ and note that
\[ u(t) \in C \quad \forall\, t \in T \setminus A_u, \]


with $\lambda^1(A_u) = 0$. So by replacing $X$ by $\overline{\mathrm{span}}\, C$ if necessary, we may assume that $X$ is a separable Banach space. Then the dual unit ball
\[ \overline{B}{}^*_1 = \bigl\{ x^* \in X^* : \|x^*\|_{X^*} \le 1 \bigr\} \]
furnished with the relative $w^*$-topology is compact metrizable. Let $\{t_n\}_{n \ge 1} \subseteq T$ be a dense set and consider a $w^*$-dense set $\{x^*_n\}_{n \ge 1} \subseteq \overline{B}{}^*_1$. The family
\[ \bigl\{ \chi_{[t_m,t_k]}\, x^*_n : n, m, k \ge 1,\ t_m < t_k \bigr\} \]
is countable and so it can be enumerated as $\{\varphi_i\}_{i \ge 1}$. We have
\[ \|u\|_w = \sup_{i \ge 1} \Bigl| \int_0^b \langle \varphi_i(t), u(t) \rangle_X \, dt \Bigr| \]
(see Definition 2.3.33). Let $S : L^1(T;X) \to l^\infty$ be the continuous, linear operator defined by
\[ S(u) \stackrel{df}{=} \Bigl\{ \int_0^b \langle \varphi_i(t), u(t) \rangle_X \, dt \Bigr\}_{i \ge 1}. \]
Note that
\[ \|S(u)\|_{l^\infty} = \|u\|_w. \]
We claim that $S(K)$ is relatively strongly compact in $l^\infty$. First suppose that there exists a norm compact set $D \subseteq X$, such that
\[ K \subseteq S_D^1 = \bigl\{ u \in L^1(T;X) : u(t) \in D \ \text{for a.a. } t \in T \bigr\}. \]
Let $\{e_i\}_{i \ge 1}$ be the standard basis in $l^1$ and $C(D)$ the space of continuous $\mathbb{R}$-valued functions on $D$. Let $\widehat{S} : l^1 \to L^1\bigl(T; C(D)\bigr)$ be defined by
\[ \widehat{S}(e_i) \stackrel{df}{=} \varphi_i \quad \forall\, i \ge 1 \]
and on all of $l^1$ by linearity and continuity. Using Theorem 2.3.6, we can see that $\{\varphi_i\}_{i \ge 1} \subseteq L^1\bigl(T; C(D)\bigr)$ is relatively norm-compact. Hence $\widehat{S}$ is a compact operator and then by Schauder’s theorem, the adjoint operator
\[ \widehat{S}^* : L^1\bigl(T; C(D)\bigr)^* = L^\infty\bigl(T; M(D)_{w^*}\bigr) \to l^\infty \]
is compact (here by $M(D)_{w^*}$ we denote the space of Radon measures furnished with the $w^*$-topology; recall that by the Riesz-Markov theorem, $C(D)^* = M(D)$;


see Theorem A.3.25). For every $g \in L^\infty\bigl(T; M(D)_{w^*}\bigr) = L^1\bigl(T; C(D)\bigr)^*$, we have
\[ \widehat{S}^*(g) = \Bigl\{ \int_0^b \langle g(t), \varphi_i(t) \rangle_{C(D)} \, dt \Bigr\}_{i \ge 1}, \]
where
\[ \langle g(t), \varphi_i(t) \rangle_{C(D)} = \int_D \varphi_i(t)(x) \, dg(t)(x), \]
for the measure $g(t) \in M(D)$. If for every $t \in T$, $g(t)$ is the Dirac measure concentrated on $u(t) \in D \subseteq X$, then
\[ g \in L^\infty\bigl(T; M(D)_{w^*}\bigr) \quad \text{and} \quad \widehat{S}^*(g) = \Bigl\{ \int_0^b \langle \varphi_i(t), u(t) \rangle_X \, dt \Bigr\}_{i \ge 1} = S(u). \]
Therefore we see that the action of the operator $S$ on $K$ can be identified with the action of the operator $\widehat{S}^*$ and so $S(K) \subseteq l^\infty$ is relatively compact.

Now we pass to the general case and assume that $K$ has property $U$. For $\varepsilon > 0$ consider the set
\[ K_\varepsilon \stackrel{df}{=} \bigl\{ \chi_{A_{\varepsilon,u}}\, u : u \in K \bigr\}, \]
where $A_{\varepsilon,u} \subseteq T$ is the Lebesgue measurable set postulated by Definition 2.3.35. By virtue of the uniform integrability of $K$, for each $\delta > 0$, we can find $\varepsilon > 0$, such that
\[ \inf_{v \in K_\varepsilon} \|u - v\|_1 < \delta \quad \forall\, u \in K. \]
Note that $\|\cdot\|_w \le \|\cdot\|_1$. Therefore from the definition of the operator $S$, we have
\[ \inf_{y \in S(K_\varepsilon)} \|S(u) - y\|_{l^\infty} < \delta. \]
But the set $S(K_\varepsilon) \subseteq l^\infty$ is relatively norm-compact. Hence $S(K) \subseteq l^\infty$ is relatively norm-compact. Since $S \in \mathcal{L}\bigl(L^1(T;X); l^\infty\bigr)$, we also have that $S$ is weak-to-weak continuous. The weak and norm topologies coincide on $S(K)$. Thus $S|_K$ is weak-to-norm continuous. Recall that because $K$ has property $U$, it is relatively $w$-compact (see Remark 2.3.36). Since without any loss of generality we may assume that $K$ is convex, the linear map $S : K \to S(K)$ is a weak-to-norm homeomorphism. The norm topology of $l^\infty$ on $S(K)$ is the norm topology of $\bigl(L^1(T;X), \|\cdot\|_w\bigr)$ on $K$. Therefore the set $K$ is relatively $\|\cdot\|_w$-compact and the proof of the theorem is finished.


Next we ask when a weakly convergent sequence in $L^p$ is strongly convergent. If $p \in (1,+\infty)$ and $X$ is uniformly convex (hence reflexive; see Remark A.3.22), the Lebesgue-Bochner space $L^p(\Omega;X)$ is uniformly convex and so we have the Kadec-Klee property which says that if
\[ u_n \stackrel{w}{\longrightarrow} u \ \text{ in } L^p(\Omega;X) \quad \text{and} \quad \|u_n\|_p \to \|u\|_p, \]
then
\[ u_n \to u \quad \text{in } L^p(\Omega;X). \]
This is no longer true for $L^1(\Omega;X)$. The next proposition illustrates the difference between weak and strong convergence in $L^1(\Omega)$. A sequence $\{u_n\}_{n \ge 1} \subseteq L^1(\Omega)$ which converges weakly but not strongly oscillates violently around its weak limit.

PROPOSITION 2.3.38 If $(\Omega,\Sigma,\mu)$ is a finite measure space, $\{u_n\}_{n \ge 1} \subseteq L^1(\Omega)$, $u_n \stackrel{w}{\longrightarrow} u$ in $L^1(\Omega)$ and
\[ u(\omega) \le \liminf_{n \to +\infty} u_n(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega, \]
then $u_n \to u$ in $L^1(\Omega)$.

PROOF Without any loss of generality, we may assume that $u = 0$. From Theorem 2.3.24, we know that the sequence $\{u_n\}_{n \ge 1} \subseteq L^1(\Omega)$ is uniformly integrable. So given $\varepsilon > 0$ we can find $\delta = \delta(\varepsilon) > 0$, such that if $A \in \Sigma$, $\mu(A) < \delta$, then
\[ \int_A |u_n| \, d\mu < \varepsilon \quad \forall\, n \ge 1. \]
For every $N \ge 1$, let
\[ \Omega_N \stackrel{df}{=} \Bigl\{ \omega \in \Omega : \inf_{n \ge N} u_n(\omega) \ge -\frac{\varepsilon}{\mu(\Omega)} \Bigr\}. \]
Because of our hypothesis and since we have assumed that $u = 0$, we can find $N \ge 1$ large so that $\mu\bigl(\Omega_N^c\bigr) < \delta$. Also since
\[ u_n \stackrel{w}{\longrightarrow} u = 0 \quad \text{in } L^1(\Omega), \]
we can find $N_1 \ge N$, such that for all $n \ge N_1$, we have
\[ \Bigl| \int_{\Omega_N} u_n \, d\mu \Bigr| < \varepsilon. \]


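So for all $n \ge N_1$, we have
\begin{align*}
\int_\Omega |u_n| \, d\mu &= \int_{\Omega_N} |u_n| \, d\mu + \int_{\Omega_N^c} |u_n| \, d\mu \\
&\le \int_{\Omega_N} \Bigl| u_n + \frac{\varepsilon}{\mu(\Omega)} \Bigr| \, d\mu + \int_{\Omega_N} \frac{\varepsilon}{\mu(\Omega)} \, d\mu + \int_{\Omega_N^c} |u_n| \, d\mu \le 4\varepsilon,
\end{align*}
so
\[ \int_\Omega |u_n| \, d\mu \to 0 \quad \text{as } n \to +\infty, \]
i.e.,
\[ u_n \to u = 0 \quad \text{in } L^1(\Omega). \]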
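The bound $4\varepsilon$ at the end of the proof of Proposition 2.3.38 comes from estimating the three integrals separately. The following sketch records one way to do this for $n \ge N_1$, using only the choices of $\delta$, $N$ and $N_1$ made in the proof; the key observation is that $u_n + \varepsilon/\mu(\Omega) \ge 0$ on $\Omega_N$ when $n \ge N$, so the absolute value can be dropped there.

```latex
\begin{align*}
\int_{\Omega_N}\Bigl|u_n+\frac{\varepsilon}{\mu(\Omega)}\Bigr|\,d\mu
  &= \int_{\Omega_N}u_n\,d\mu+\varepsilon\,\frac{\mu(\Omega_N)}{\mu(\Omega)}
  \ \le\ \Bigl|\int_{\Omega_N}u_n\,d\mu\Bigr|+\varepsilon \ <\ 2\varepsilon, \\
\int_{\Omega_N}\frac{\varepsilon}{\mu(\Omega)}\,d\mu
  &= \varepsilon\,\frac{\mu(\Omega_N)}{\mu(\Omega)} \ \le\ \varepsilon, \\
\int_{\Omega_N^c}|u_n|\,d\mu
  &< \varepsilon \qquad \text{since } \mu(\Omega_N^c)<\delta .
\end{align*}
```

Adding the three estimates gives $\int_\Omega |u_n| \, d\mu < 4\varepsilon$ for all $n \ge N_1$.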

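When we deal with $\mathbb{R}^N$-valued functions, an extremality condition replaces the inequality hypothesis in the previous proposition. The result is due to Visintin (1984), where the reader can find the proof.

PROPOSITION 2.3.39 If $(\Omega,\Sigma,\mu)$ is a finite measure space, $\{f_n\}_{n \ge 1} \subseteq L^1\bigl(\Omega; \mathbb{R}^N\bigr)$ is a sequence such that
\[ f_n \stackrel{w}{\longrightarrow} f \quad \text{in } L^1\bigl(\Omega; \mathbb{R}^N\bigr), \]
for some $f \in L^1\bigl(\Omega; \mathbb{R}^N\bigr)$ and
\[ f(\omega) \in \mathrm{ext}\Bigl( \mathrm{conv} \limsup_{n \to +\infty} \{f_n(\omega)\} \Bigr) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega, \]
then
\[ f_n \to f \quad \text{in } L^1\bigl(\Omega; \mathbb{R}^N\bigr). \]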

We conclude this section with a brief look at the space of Radon measures, which appears in applications (such as optimal control, game theory, mathematical economics etc.) and is also useful in the study of Sobolev spaces (see Section 2.4). So let $Z$ be a locally compact, $\sigma$-compact metric space. We consider the following three spaces of continuous functions on $Z$:
\begin{align*}
C_c(Z) &\stackrel{df}{=} \bigl\{ u : Z \to \mathbb{R} \text{ continuous with compact support} \bigr\}, \\
C_0(Z) &\stackrel{df}{=} \bigl\{ u : Z \to \mathbb{R} \text{ continuous and vanishes at infinity, i.e., for all } \varepsilon > 0 \text{ there exists} \\
&\qquad \text{a compact set } K_\varepsilon \subseteq Z, \text{ such that } |u(z)| < \varepsilon \text{ for all } z \notin K_\varepsilon \bigr\}, \\
C_b(Z) &\stackrel{df}{=} \bigl\{ u : Z \to \mathbb{R} \text{ continuous and bounded} \bigr\}.
\end{align*}


Evidently we have the following inclusions:
\[ C_c(Z) \subseteq C_0(Z) \subseteq C_b(Z). \]
If $Z$ is compact, then these three spaces coincide. If $Z$ is not compact, each inclusion is strict. We can define a norm on $C_b(Z)$ by setting
\[ \|u\|_{C_b(Z)} = \|u\|_\infty \stackrel{df}{=} \sup_{z \in Z} |u(z)|. \]
By restriction, this norm also passes to the spaces $C_c(Z)$ and $C_0(Z)$.

PROPOSITION 2.3.40 The space $C_b(Z)$ equipped with the norm $\|\cdot\|_\infty$ is a Banach space. The space $C_0(Z)$ is a closed subspace of this Banach space (hence itself a Banach space). The space $C_c(Z)$ is $\|\cdot\|_\infty$-dense in $C_0(Z)$.

PROOF The first two statements are obvious. Only the third requires some work. Since $Z$ is a locally compact, $\sigma$-compact metric space, $Z = \bigcup_{n=1}^{\infty} C_n$, where $\{C_n\}_{n \ge 1}$ is a sequence of compact sets with $C_n \subseteq \mathrm{int}\, C_{n+1}$ for all $n \ge 1$. Let $\{\vartheta_n\}_{n \ge 1}$ and $\{\xi_n\}_{n \ge 1}$ be continuous partitions of unity subordinate to the open covers $\{\mathrm{int}\, C_n\}_{n \ge 1}$ and $\{C_n^c\}_{n \ge 1}$ respectively. We have $\vartheta_n + \xi_n = 1$ on $Z$ and so $\vartheta_n = 1$ on $C_n$ for $n \ge 1$. Let $u \in C_0(Z)$ and set $u_n \stackrel{df}{=} \vartheta_n u$. Then $u_n \in C_c(Z)$ and
\[ \|u - u_n\|_\infty = \|\xi_n u\|_\infty \quad \forall\, n \ge 1. \]
Since $\mathrm{supp}\, \xi_n \subseteq C_n^c$ and $u(z) \to 0$ as $z$ tends to infinity (the point at infinity of the one-point Alexandrov compactification of $Z$; see Theorem A.1.3 and Remark A.1.4), we conclude that
\[ \|u - u_n\|_\infty \to 0 \quad \text{as } n \to +\infty. \]

Also by M (Z) we denote the space of all signed measures m : B(Z) −→ R (with B(Z) being the Borel σ-field of Z) that have bounded variation. Since Z is a metric space such measures are regular. The measures in M (Z) are known as Radon measures. The Riesz-Markov representation theorem says that M (Z) is the dual space of C0 (Z).


THEOREM 2.3.41 (Riesz-Markov Representation Theorem) If $Z$ is a locally compact, $\sigma$-compact metric space, then $C_0(Z)^* = M(Z)$ and the duality pairing is given by
\[ \langle \mu, u \rangle_{C_0(Z)} = \int_Z u(z) \, d\mu \quad \forall\, u \in C_0(Z),\ \mu \in M(Z). \]

Using the three spaces of continuous functions on $Z$ introduced earlier, we can define three different notions of convergence for sequences of Radon measures.

DEFINITION 2.3.42 Let $Z$ be a locally compact, $\sigma$-compact metric space and $\{\mu_n\}_{n \ge 1} \subseteq M(Z)$.
(a) We say that the sequence $\{\mu_n\}_{n \ge 1}$ converges vaguely to $\mu \in M(Z)$ if and only if
\[ \int_Z u(z) \, d\mu_n \to \int_Z u \, d\mu \quad \forall\, u \in C_c(Z). \]
(b) We say that the sequence $\{\mu_n\}_{n \ge 1}$ converges weakly to $\mu \in M(Z)$ if and only if
\[ \int_Z u(z) \, d\mu_n \to \int_Z u \, d\mu \quad \forall\, u \in C_0(Z). \]
We denote this convergence by $\mu_n \stackrel{w}{\longrightarrow} \mu$.
(c) We say that the sequence $\{\mu_n\}_{n \ge 1}$ converges narrowly to $\mu \in M(Z)$ if and only if
\[ \int_Z u(z) \, d\mu_n \to \int_Z u \, d\mu \quad \forall\, u \in C_b(Z). \]
We denote this convergence by $\mu_n \stackrel{n}{\longrightarrow} \mu$.

REMARK 2.3.43

Evidently we have that
• the norm convergence in $M(Z)$ implies the narrow convergence in $M(Z)$;
• the narrow convergence in $M(Z)$ implies the weak convergence in $M(Z)$;
• the weak convergence in $M(Z)$ implies the vague convergence in $M(Z)$.
In functional analytic terms the weak convergence is actually the weak$^*$ convergence in the Banach space $M(Z)$ (see Theorem 2.3.41). The term weak convergence originates from probability theory. Also the term narrow convergence is the English translation of the term “convergence étroite” first used by Bourbaki (1969).


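PROPOSITION 2.3.44 If $Z$ is a locally compact, $\sigma$-compact metric space, $\{u_n\}_{n \ge 1} \subseteq C_0(Z)$ is a sequence and $u \in C_0(Z)$, then $u_n \stackrel{w}{\longrightarrow} u$ in $C_0(Z)$ if and only if
\[ \sup_{n \ge 1} \|u_n\|_\infty < \infty \quad \text{and} \quad u_n(z) \to u(z) \quad \forall\, z \in Z. \]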

PROOF “$\Longrightarrow$”: A weakly convergent sequence in a Banach space is bounded. So $\sup_{n \ge 1} \|u_n\|_\infty < +\infty$. Also if $\mu = \delta_z$ is the Dirac measure concentrated at $z \in Z$, then
\[ \langle \delta_z, u_n \rangle \to \langle \delta_z, u \rangle. \]
But $\langle \delta_z, u_n \rangle = u_n(z)$ and $\langle \delta_z, u \rangle = u(z)$. So
\[ u_n(z) \to u(z) \quad \forall\, z \in Z. \]
“$\Longleftarrow$”: This is an immediate consequence of the Lebesgue dominated convergence theorem (see Theorem A.2.2).

PROPOSITION 2.3.45 If $Z$ is a locally compact, $\sigma$-compact metric space and $\{\mu_n\}_{n \ge 1} \subseteq M(Z)$ is a sequence, then
(a) if $\mu_n \to \mu$ vaguely in $M(Z)$ and for every $\varepsilon > 0$ there exists a compact set $K_\varepsilon \subseteq Z$, such that
\[ |\mu_n|\bigl(K_\varepsilon^c\bigr) < \varepsilon \quad \forall\, n \ge n_0, \]
then $\mu_n \to \mu$ narrowly in $M(Z)$;
(b) if $\mu_n \ge 0$ for all $n \ge 1$, $\mu_n \to \mu$ vaguely in $M(Z)$, and $\mu_n(Z) \to \mu(Z)$, then $\mu_n \to \mu$ narrowly in $M(Z)$.


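PROOF (a) Let $u \in C_b(Z)$ and $\varepsilon > 0$. Let $K_\varepsilon \subseteq Z$ be the compact set postulated by the hypotheses. We take $\xi_\varepsilon \in C_c(Z)$, such that $\xi_\varepsilon|_{K_\varepsilon} = 1$. Evidently $u = \xi_\varepsilon u + v$ with $\mathrm{supp}\, v \subseteq K_\varepsilon^c$. So we have
\[ \int_Z u \, d\mu_n = \int_Z \xi_\varepsilon u \, d\mu_n + \int_Z v \, d\mu_n. \]
Since $\mu_n \to \mu$ vaguely in $M(Z)$ and $\xi_\varepsilon u \in C_c(Z)$, we have
\[ \int_Z \xi_\varepsilon u \, d\mu_n \to \int_Z \xi_\varepsilon u \, d\mu. \]
Also
\[ \Bigl| \int_Z v \, d\mu_n \Bigr| = \Bigl| \int_{K_\varepsilon^c} v \, d\mu_n \Bigr| \le \|v\|_\infty\, |\mu_n|\bigl(K_\varepsilon^c\bigr) \le \varepsilon \|u\|_\infty. \]
So we obtain
\[ \limsup_{n \to +\infty} \int_Z u \, d\mu_n \le \int_Z \xi_\varepsilon u \, d\mu + \varepsilon \|u\|_\infty \tag{2.30} \]
and
\[ \liminf_{n \to +\infty} \int_Z u \, d\mu_n \ge \int_Z \xi_\varepsilon u \, d\mu - \varepsilon \|u\|_\infty. \]
Since $\varepsilon > 0$ was arbitrary and $\xi_\varepsilon \to 1$, we obtain that
\[ \int_Z u \, d\mu_n \to \int_Z u \, d\mu, \]
i.e., $\mu_n \to \mu$ narrowly in $M(Z)$.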

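(b) Every measure $\mu_0 \in M(Z)$, $\mu_0 \ge 0$, is tight. Applying this to the limit measure $\mu$, given $\varepsilon > 0$, we can find a compact set $K_\varepsilon \subseteq Z$, such that $\mu\bigl(K_\varepsilon^c\bigr) < \varepsilon$. Let $u \in C_c(Z)$ be such that
\[ \mathrm{supp}\, u \subseteq K_\varepsilon, \quad 0 \le u \le 1 \quad \text{and} \quad \|\mu\|_* - \varepsilon < \int_Z u \, d\mu. \]
Let $n_0 \ge 1$ be such that
\[ \bigl| \mu_n(Z) - \mu(Z) \bigr| < \varepsilon \quad \text{and} \quad \Bigl| \int_Z u \, d\mu_n - \int_Z u \, d\mu \Bigr| < \varepsilon \quad \forall\, n \ge n_0. \tag{2.31} \]
Then for $n \ge n_0$, from (2.31), we have
\[ \mu_n\bigl(K_\varepsilon^c\bigr) \le \|\mu_n\|_* - \int_Z u \, d\mu_n \le \|\mu_n\|_* + \varepsilon - \int_Z u \, d\mu \le \|\mu_n\|_* - \|\mu\|_* + 2\varepsilon < 3\varepsilon. \]
So, from part (a), we conclude that $\mu_n \to \mu$ narrowly in $M(Z)$.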

We have a compactness result for the weak convergence of Radon measures.

THEOREM 2.3.46 If $Z$ is a locally compact, $\sigma$-compact metric space and $\{\mu_n\}_{n \ge 1} \subseteq M(Z)$ is bounded, then there is a subsequence $\{\mu_{n_k}\}_{k \ge 1}$ of $\{\mu_n\}_{n \ge 1}$ and $\mu \in M(Z)$, such that
\[ \mu_{n_k} \stackrel{w}{\longrightarrow} \mu \quad \text{as } k \to +\infty. \]

PROOF Let $Z^*$ be the Alexandrov one-point compactification of $Z$ (see Theorem A.1.3 and Remark A.1.4). Then $Z^*$ is a compact metrizable space and so $C(Z^*)$ is a separable Banach space. Set
\[ E \stackrel{df}{=} \bigl\{ u \in C(Z^*) : u(\infty) = 0 \bigr\}. \]
Then this is a closed subspace of $C(Z^*)$; thus $E$ is a separable Banach space too. For every $u \in E$, let $\widehat{u}$ denote the restriction of $u$ to $Z$. Consider the linear map $L : E \to C_b(Z)$ defined by
\[ L(u) \stackrel{df}{=} \widehat{u} \quad \forall\, u \in E. \]
We claim that $L$ is an isometry of $E$ onto $C_0(Z)$. To this end, let $u \in E$. Since $u$ is continuous at $\infty$, for every $\varepsilon > 0$ there exists a compact set $K_\varepsilon \subseteq Z$, such that
\[ \bigl| u(z) - u(\infty) \bigr| < \varepsilon \quad \forall\, z \in K_\varepsilon^c. \]
This means that $\widehat{u} \in C_0(Z)$. On the other hand let $v \in C_0(Z)$. Then $v$ can be extended to $Z^*$ by setting $v_1(\infty) = 0$


and
\[ v_1(z) = v(z) \quad \forall\, z \in Z. \]
Since $v \in C_0(Z)$, we see that $v_1 \in C(Z^*)$ and so $v_1 \in E$. This isometry shows that $C_0(Z)$ is separable. Then the weak$^*$ topology on bounded subsets of $M(Z) = C_0(Z)^*$ is compact (by Alaoglu’s theorem; see Theorem A.3.9) and metrizable. This proves the theorem.

REMARK 2.3.47 Using the compactification technique of the previous proof we can show that if
\[ G \stackrel{df}{=} \bigl\{ \mu \in M(Z^*) : \mu\bigl(\{\infty\}\bigr) = 0 \bigr\} \]
and $S : G \to M(Z)$ is defined by
\[ S(\mu) \stackrel{df}{=} \widehat{\mu} \quad \forall\, \mu \in G, \quad \text{with } \widehat{\mu}(A) = \mu(A) \quad \forall\, A \in \mathcal{B}(Z) \subseteq \mathcal{B}(Z^*), \]
then $S$ is an isometry of $G$ onto $M(Z)$.

When $Z$ is compact, then Theorem 2.3.46 can be strengthened. In what follows by $M(Z)_+$ we denote the elements $\mu$ of $M(Z)$ for which we have $\mu \ge 0$ (i.e., they are measures).

THEOREM 2.3.48 If $Z$ is a compact metric space and $\{\mu_n\}_{n \ge 1} \subseteq M(Z)_+$ is such that
\[ \|\mu_n\| = r \quad \forall\, n \ge 1, \]
then there exists a subsequence $\{\mu_{n_k}\}_{k \ge 1}$ of $\{\mu_n\}_{n \ge 1}$ and $\mu \in M(Z)_+$ with $\|\mu\| = r$, such that
\[ \mu_{n_k} \stackrel{w}{\longrightarrow} \mu \quad \text{as } k \to +\infty \]
(i.e., the set $S_r^+ = \bigl\{ \mu \in M(Z)_+ : \|\mu\| = r \bigr\}$ is $w^*$-sequentially compact).

We conclude with a result for sequences of functions which converge simultaneously pointwise and weakly in $L^p(\Omega)$ ($p \in [1,+\infty)$). This result can be viewed as a refinement of Fatou’s Lemma.


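PROPOSITION 2.3.49 If $(\Omega,\Sigma,\mu)$ is a finite measure space, $\{u_n\}_{n \ge 1} \subseteq L^p(\Omega)$ ($p \in [1,+\infty)$),
\[ u_n \stackrel{w}{\longrightarrow} u \quad \text{in } L^p(\Omega) \]
and
\[ u_n(\omega) \to u(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega, \]
then
\[ \lim_{n \to +\infty} \bigl( \|u_n\|_p^p - \|u_n - u\|_p^p \bigr) = \|u\|_p^p. \]

PROOF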

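For a given $\varepsilon > 0$, we can find $c(\varepsilon) > 0$, such that for all $a, b \in \mathbb{R}$, we have
\[ \bigl| |a + b|^p - |a|^p \bigr| \le \varepsilon |a|^p + c(\varepsilon) |b|^p. \tag{2.32} \]
We set
\[ h_n^\varepsilon \stackrel{df}{=} \Bigl( \bigl| |u_n|^p - |u_n - u|^p - |u|^p \bigr| - \varepsilon |u_n - u|^p \Bigr)^+. \]
Evidently we have $h_n^\varepsilon(\omega) \to 0$ for $\mu$-a.a. $\omega \in \Omega$ and from (2.32),
\[ h_n^\varepsilon(\omega) \le \bigl( 1 + c(\varepsilon) \bigr) |u(\omega)|^p. \]
So from the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
\[ \int_\Omega h_n^\varepsilon \, d\mu \to 0 \quad \text{as } n \to +\infty. \tag{2.33} \]
But note that
\[ \bigl| |u_n|^p - |u_n - u|^p - |u|^p \bigr| \le h_n^\varepsilon + \varepsilon |u_n - u|^p \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]
Hence, using (2.33), we obtain
\[ \limsup_{n \to +\infty} \int_\Omega \bigl| |u_n|^p - |u_n - u|^p - |u|^p \bigr| \, d\mu \le M \varepsilon, \]
where
\[ M \stackrel{df}{=} \sup_{n \ge 1} \|u_n - u\|_p^p. \]
Since $\varepsilon > 0$ was arbitrary, we have
\[ \lim_{n \to +\infty} \bigl( \|u_n\|_p^p - \|u_n - u\|_p^p \bigr) = \|u\|_p^p. \]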
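The domination of $h_n^\varepsilon$ used in the proof of Proposition 2.3.49 can be checked directly from (2.32). The following is a sketch of that step, with the particular choice $a = u_n - u$, $b = u$ (so that $a + b = u_n$), followed by the triangle inequality.

```latex
\begin{align*}
\bigl|\,|u_n|^p-|u_n-u|^p\,\bigr|
  &\le \varepsilon\,|u_n-u|^p + c(\varepsilon)\,|u|^p
  && \text{by (2.32) with } a=u_n-u,\ b=u, \\
\bigl|\,|u_n|^p-|u_n-u|^p-|u|^p\,\bigr| - \varepsilon\,|u_n-u|^p
  &\le \bigl(1+c(\varepsilon)\bigr)|u|^p
  && \text{by the triangle inequality,} \\
h_n^{\varepsilon}
  &\le \bigl(1+c(\varepsilon)\bigr)|u|^p \in L^1(\Omega) .
\end{align*}
```

The constant $M$ is finite because a weakly convergent sequence in $L^p(\Omega)$ is bounded.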

REMARK 2.3.50 Note that if p = 2, then we do not need the µ-almost everywhere pointwise convergence of the sequence {un }n>1 to u.
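The claim of Remark 2.3.50 follows from the Hilbert space structure of $L^2(\Omega)$. As a brief sketch, writing $(\cdot,\cdot)_{L^2(\Omega)}$ for the inner product and expanding the square gives a formula in which only weak convergence is needed:

```latex
\begin{align*}
\|u_n\|_2^2-\|u_n-u\|_2^2
  &= 2\,(u_n,u)_{L^2(\Omega)}-\|u\|_2^2 \\
  &\longrightarrow\ 2\,(u,u)_{L^2(\Omega)}-\|u\|_2^2 \;=\; \|u\|_2^2
  \qquad \text{as } n\to+\infty ,
\end{align*}
```

since $u_n \stackrel{w}{\longrightarrow} u$ in $L^2(\Omega)$ implies $(u_n,u)_{L^2(\Omega)} \to (u,u)_{L^2(\Omega)}$.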

2.4 Sobolev Spaces

Already in Section 1.6, we introduced the Sobolev space $W^{1,p}(Z)$ (see Definition 1.6.1). Here we introduce Sobolev spaces of any order $m \ge 1$ and conduct a systematic study of them, proving among other things those results stated in Section 1.6 without a proof.

Let us start by fixing the notation. An element $\alpha = (\alpha_k)_{k=1}^N \in \mathbb{N}^N$ is said to be a multi-index. Associated to a multi-index $\alpha$, we have the following symbols:
\[ |\alpha| \stackrel{df}{=} \sum_{k=1}^{N} \alpha_k, \]
the length of $\alpha$, and
\[ z^\alpha = z_1^{\alpha_1} \cdots z_N^{\alpha_N} \quad \forall\, z = (z_k)_{k=1}^N \in \mathbb{R}^N. \]
We say that two multi-indices $\alpha, \beta$ are related by $\alpha \le \beta$, if $\alpha_k \le \beta_k$ for all $k \in \{1, \ldots, N\}$. Finally we set
\[ D_k \stackrel{df}{=} \frac{\partial}{\partial z_k} \quad \forall\, k \in \{1, \ldots, N\} \]
and
\[ D^\alpha \stackrel{df}{=} D_1^{\alpha_1} \cdots D_N^{\alpha_N} = \frac{\partial^{|\alpha|}}{\partial z_1^{\alpha_1} \cdots \partial z_N^{\alpha_N}}. \]
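To make the multi-index notation concrete, here is a small worked instance. The choices $N = 2$, $\alpha = (2,1)$, $\beta = (1,1)$ and the test polynomial $z_1^3 z_2^2$ are illustrative assumptions, not taken from the text.

```latex
\begin{align*}
N=2,\quad \alpha=(2,1):\qquad
  |\alpha| &= 2+1 = 3, \qquad z^{\alpha} = z_1^{2}\,z_2, \qquad
  D^{\alpha} = \frac{\partial^{3}}{\partial z_1^{2}\,\partial z_2}, \\
D^{\alpha}\bigl(z_1^{3}z_2^{2}\bigr)
  &= \frac{\partial^{2}}{\partial z_1^{2}}\bigl(z_1^{3}\bigr)\cdot
     \frac{\partial}{\partial z_2}\bigl(z_2^{2}\bigr)
   = 6z_1\cdot 2z_2 = 12\,z_1z_2, \\
\beta=(1,1)\ &\le\ \alpha \qquad \text{since } 1\le 2 \text{ and } 1\le 1 .
\end{align*}
```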

DEFINITION 2.4.1 Let $Z \subseteq \mathbb{R}^N$ be an open set. By $\mathcal{D}(Z)$ we denote the space $C_c^\infty(Z)$ (the space of $C^\infty(Z)$ functions with compact support) equipped with the following convergence notion: “the sequence $\{\vartheta_n\}_{n \ge 1} \subseteq C_c^\infty(Z)$ is said to converge to $0$, if there exists a fixed compact set $K \subseteq Z$, such that $\mathrm{supp}\, \vartheta_n \subseteq K$ for all $n \ge 1$ and $\{D^\alpha \vartheta_n\}_{n \ge 1}$ converges uniformly to $0$ for all $\alpha \in \mathbb{N}^N$.” The elements of the space $\mathcal{D}(Z)$ are called test functions. A linear functional $T : \mathcal{D}(Z) \to \mathbb{R}$, such that
\[ \vartheta_n \to 0 \ \text{ in } \mathcal{D}(Z) \quad \text{implies} \quad T(\vartheta_n) \to 0, \]
is called a distribution. The space of distributions is denoted by $\mathcal{D}(Z)^*$.


REMARK 2.4.2 The convergence notion introduced on $C_c^\infty(Z)$ is actually topological, i.e., corresponds to a topology on $C_c^\infty(Z)$. Therefore $\mathcal{D}(Z)^*$ is the dual of the space of test functions. Recall that $\mathcal{D}(Z)$ is dense in $L^p(Z)$ for all $p \in [1,+\infty)$. If $u \in L^1_{loc}(Z)$ and $T_u : \mathcal{D}(Z) \to \mathbb{R}$ is defined by
\[ T_u(\vartheta) \stackrel{df}{=} \int_Z u \vartheta \, dz \quad \forall\, \vartheta \in \mathcal{D}(Z), \]
then $T_u \in \mathcal{D}(Z)^*$. Moreover, if $u, v \in L^1_{loc}(Z)$ and
\[ u = v \quad \text{for a.a. } z \in Z, \]
then $T_u = T_v$. In particular, if
\[ u(z) = 0 \quad \text{for a.a. } z \in Z, \]
it defines the zero distribution. In fact the converse is also true. If $T_u = 0$, then $u(z) = 0$ for almost all $z \in Z$, provided that $u \in L^1_{loc}(Z)$. Distributions resulting from locally integrable functions are usually called regular distributions. Another important distribution is the Dirac $\delta$-function; namely for $z \in Z$, we define
\[ \delta_z(\vartheta) \stackrel{df}{=} \vartheta(z) \quad \forall\, \vartheta \in \mathcal{D}(Z). \]
This distribution is not regular.

DEFINITION 2.4.3 For every distribution $T \in \mathcal{D}(Z)^*$ and every $\alpha \in \mathbb{N}^N$, the distribution $D^\alpha T$ is defined by
\[ D^\alpha T(\vartheta) \stackrel{df}{=} (-1)^{|\alpha|}\, T\bigl( D^\alpha \vartheta \bigr) \quad \forall\, \vartheta \in \mathcal{D}(Z). \]
Then $D^\alpha T$ is the derivative of order $\alpha$ of the distribution $T$. Given two functions $u, v \in L^1_{loc}(Z)$ and $\alpha \in \mathbb{N}^N$, we write $v = D^\alpha u$ to express the fact that $D^\alpha T_u = T_v$. This is equivalent to saying that
\[ \int_Z v \vartheta \, dz = (-1)^{|\alpha|} \int_Z u D^\alpha \vartheta \, dz \quad \forall\, \vartheta \in \mathcal{D}(Z). \]
The function $v = D^\alpha u$ is the derivative of order $\alpha$ in the sense of distributions of the function $u$.

REMARK 2.4.4 If $u \in C^{|\alpha|}(Z)$, then the distributional derivative $D^\alpha u$ coincides with the classical partial derivative $\dfrac{\partial^{|\alpha|} u}{\partial z_1^{\alpha_1} \cdots \partial z_N^{\alpha_N}}$.
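A standard illustration of a distributional derivative that is not classical may help here. The choice $N = 1$, $Z = (-1,1)$ and $u(z) = |z|$ is an assumption made for this sketch; it verifies the defining identity $\int_Z v \vartheta \, dz = -\int_Z u \vartheta' \, dz$ with $v = \mathrm{sgn}$, using integration by parts on each half-interval and $\vartheta(\pm 1) = 0$.

```latex
\begin{align*}
-\int_{-1}^{1}|z|\,\vartheta'(z)\,dz
  &= \int_{-1}^{0} z\,\vartheta'(z)\,dz-\int_{0}^{1} z\,\vartheta'(z)\,dz \\
  &= \Bigl[z\,\vartheta(z)\Bigr]_{-1}^{0}-\int_{-1}^{0}\vartheta(z)\,dz
     -\Bigl[z\,\vartheta(z)\Bigr]_{0}^{1}+\int_{0}^{1}\vartheta(z)\,dz \\
  &= \int_{-1}^{1}\operatorname{sgn}(z)\,\vartheta(z)\,dz
  \qquad \forall\,\vartheta\in C_c^{\infty}\bigl((-1,1)\bigr).
\end{align*}
```

Hence $Du = \mathrm{sgn}$ in the sense of distributions, although $u$ is not differentiable at $z = 0$.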


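Now we are ready to give the definition of Sobolev space.

DEFINITION 2.4.5 Let $Z \subseteq \mathbb{R}^N$ be an open set. The Sobolev space $W^{m,p}(Z)$ for $m \in \mathbb{N}_0$, $p \in [1,+\infty]$, is defined by
\[ W^{m,p}(Z) \stackrel{df}{=} \bigl\{ u \in L^p(Z) : D^\alpha u \in L^p(Z) \text{ for all } \alpha \in \mathbb{N}^N \text{ with } |\alpha| \le m \bigr\}. \]
For every $u \in W^{m,p}(Z)$, we define
\[ \|u\|_{W^{m,p}(Z)} \stackrel{df}{=} \Bigl( \sum_{|\alpha| \le m} \|D^\alpha u\|_p^p \Bigr)^{\frac{1}{p}} \quad \text{if } p \in [1,+\infty) \]
and
\[ \|u\|_{W^{m,\infty}(Z)} \stackrel{df}{=} \sum_{|\alpha| \le m} \|D^\alpha u\|_\infty. \]
Clearly this is a norm on $W^{m,p}(Z)$. Finally we set
\[ W_0^{m,p}(Z) \stackrel{df}{=} \overline{\mathcal{D}(Z)}^{\,\|\cdot\|_{W^{m,p}(Z)}}, \]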

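for $p \in [1,+\infty)$.

REMARK 2.4.6 Evidently
\[ u_n \to u \quad \text{in } W^{m,p}(Z) \]
if and only if for all $\alpha \in \mathbb{N}^N$ with $|\alpha| \le m$, we have
\[ D^\alpha u_n \to D^\alpha u \quad \text{in } L^p(Z). \]
Let
\[ r \stackrel{df}{=} \mathrm{card}\, \bigl\{ \alpha : \alpha \text{ is a multi-index},\ |\alpha| \le m \bigr\} \]
and consider the map $L : W^{m,p}(Z) \to \bigl( L^p(Z) \bigr)^r$, defined by
\[ L(u) \stackrel{df}{=} \bigl( D^\alpha u \bigr)_{|\alpha| \le m} \quad \forall\, u \in W^{m,p}(Z). \]
It is easily seen that $L$ is an isometric isomorphism. Based on this observation, we can state the following result.

PROPOSITION 2.4.7 The spaces $\bigl( W^{m,p}(Z), \|\cdot\|_{W^{m,p}(Z)} \bigr)$ (with $p \in [1,+\infty]$, $m \in \mathbb{N}_0$) are Banach spaces, which are separable for $p \in [1,+\infty)$, and reflexive and uniformly convex for $p \in (1,+\infty)$.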


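COROLLARY 2.4.8 For every $m \in \mathbb{N}_0$ and $p \in [1,+\infty]$, the space $W_0^{m,p}(Z)$ is a closed subspace of $W^{m,p}(Z)$.

PROOF We need to show that
\[ W_0^{m,p}(Z) \subseteq W^{m,p}(Z). \]
So let $u \in W_0^{m,p}(Z)$ and let $\{\vartheta_n\}_{n \ge 1} \subseteq \mathcal{D}(Z)$ be such that $\vartheta_n \to u$ in $W^{m,p}(Z)$. From Proposition 2.4.7, it follows that $u \in W^{m,p}(Z)$.

REMARK 2.4.9 The case $p = 2$ is important and we reserve a special notation for it. We set
\[ H^m(Z) \stackrel{df}{=} W^{m,2}(Z) \quad \text{and} \quad H_0^m(Z) \stackrel{df}{=} W_0^{m,2}(Z). \]
For $u, v \in H^m(Z)$, we define
\[ (u,v)_{H^m(Z)} \stackrel{df}{=} \sum_{|\alpha| \le m} (D^\alpha u, D^\alpha v)_{L^2(Z)} = \sum_{|\alpha| \le m} \int_Z D^\alpha u \, D^\alpha v \, dz. \]
Clearly $(\cdot,\cdot)_{H^m(Z)}$ defines an inner product on $H^m(Z)$ which generates the norm $\|\cdot\|_{W^{m,2}(Z)}$. From Proposition 2.4.7, it follows that $H^m(Z)$ and $H_0^m(Z)$ are Hilbert spaces.

From now on $m = 1$. So we examine the first order Sobolev spaces $W^{1,p}(Z)$ and $W_0^{1,p}(Z)$, with $p \in [1,+\infty]$. Next we derive ways to approximate the elements of the Sobolev space $W^{1,p}(Z)$ by smooth functions. For this purpose we introduce certain regularizing sequences known as mollifiers.

DEFINITION 2.4.10 Let $\varphi \in C_c^\infty\bigl(\mathbb{R}^N\bigr)$, $\varphi \ge 0$ be such that
\[ \mathrm{supp}\, \varphi \subseteq \bigl\{ z \in \mathbb{R}^N : \|z\|_{\mathbb{R}^N} \le 1 \bigr\} \quad \text{and} \quad \int_{\mathbb{R}^N} \varphi(z) \, dz = 1. \]
A possible choice is the function
\[ \varphi(z) \stackrel{df}{=} \begin{cases} c \exp\Bigl( \dfrac{1}{\|z\|_{\mathbb{R}^N}^2 - 1} \Bigr) & \text{if } \|z\|_{\mathbb{R}^N} < 1, \\ 0 & \text{if } \|z\|_{\mathbb{R}^N} \ge 1, \end{cases} \]
with $c > 0$ chosen in such a way so that
\[ \int_{\mathbb{R}^N} \varphi(z) \, dz = 1. \]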


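If $\varepsilon > 0$, we define
\[ \varphi_\varepsilon(z) = \frac{1}{\varepsilon^N}\, \varphi\Bigl( \frac{z}{\varepsilon} \Bigr). \]
Then $\varphi_\varepsilon \in C_c^\infty\bigl(\mathbb{R}^N\bigr)$, $\varphi_\varepsilon \ge 0$ is such that
\[ \mathrm{supp}\, \varphi_\varepsilon \subseteq \bigl\{ z \in \mathbb{R}^N : \|z\|_{\mathbb{R}^N} \le \varepsilon \bigr\} \quad \text{and} \quad \int_{\mathbb{R}^N} \varphi_\varepsilon(z) \, dz = 1. \]
The function $\varphi_\varepsilon$ is called a mollifier and given $u \in L^1_{loc}(Z)$, the mollification (or regularization) of $u$ corresponding to $\{\varphi_\varepsilon\}_{\varepsilon > 0}$ is given by
\[ u_\varepsilon(z) \stackrel{df}{=} \int_{\mathbb{R}^N} u(z - y) \varphi_\varepsilon(y) \, dy = \int_Z u(y) \varphi_\varepsilon(z - y) \, dy, \]
where we have extended $u$ to all of $\mathbb{R}^N$ as zero (i.e., $u_\varepsilon = u \star \varphi_\varepsilon$ with $\star$ denoting the convolution operation).

REMARK 2.4.11 Note that
\[ \mathrm{supp}\, u_\varepsilon \subseteq \mathrm{supp}\, u + \bigl\{ z \in \mathbb{R}^N : \|z\|_{\mathbb{R}^N} \le \varepsilon \bigr\}. \]

The next proposition summarizes the approximations achieved via mollification.

PROPOSITION 2.4.12 If $Z \subseteq \mathbb{R}^N$ is an open set, then
(a) for every $u \in L^1_{loc}(Z)$ and every $\varepsilon > 0$, $u_\varepsilon \in C^\infty\bigl(Z_{-\varepsilon}\bigr)$, where $Z_{-\varepsilon} \stackrel{df}{=} \bigl\{ z \in Z : d_Z(z, \partial Z) > \varepsilon \bigr\}$;
(b) if $u \in C(Z)$, then $u_\varepsilon \to u$ as $\varepsilon \searrow 0$ uniformly on compact subsets of $Z$;
(c) if $u \in L^p_{loc}(Z)$ for some $p \in [1,+\infty)$, then $u_\varepsilon \to u$ in $L^p_{loc}(Z)$;
(d) if $u \in W^{1,p}(Z)$, $p \in [1,+\infty]$, then $D_i u_\varepsilon = \varphi_\varepsilon \star D_i u$ for all $i \in \{1, \ldots, N\}$;
(e) if $u \in W^{1,p}(Z)$, $p \in [1,+\infty)$, then $u_\varepsilon \to u$ in $W^{1,p}(Z)$.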
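The two properties of the mollifier $\varphi_\varepsilon(z) = \varepsilon^{-N} \varphi(z/\varepsilon)$ stated before Proposition 2.4.12 follow from a change of variables. As a quick sketch (the substitution $z = \varepsilon y$, $dz = \varepsilon^N dy$ is the only ingredient):

```latex
\begin{align*}
\int_{\mathbb{R}^N}\varphi_\varepsilon(z)\,dz
  &= \int_{\mathbb{R}^N}\frac{1}{\varepsilon^{N}}\,
     \varphi\Bigl(\frac{z}{\varepsilon}\Bigr)\,dz
   = \int_{\mathbb{R}^N}\frac{1}{\varepsilon^{N}}\,\varphi(y)\,
     \varepsilon^{N}\,dy
   = \int_{\mathbb{R}^N}\varphi(y)\,dy = 1, \\
\varphi_\varepsilon(z)\neq 0
  &\ \Longrightarrow\ \Bigl\|\frac{z}{\varepsilon}\Bigr\|_{\mathbb{R}^N}\le 1
  \ \Longleftrightarrow\ \|z\|_{\mathbb{R}^N}\le\varepsilon .
\end{align*}
```

So the family $\{\varphi_\varepsilon\}_{\varepsilon > 0}$ keeps total mass $1$ while its support shrinks to the origin, which is what drives the convergence statements (b), (c) and (e).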


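PROOF (a) Note that $u_\varepsilon$ is defined on $Z_{-\varepsilon}$ (see Remark 2.4.11). Let $z \in Z_{-\varepsilon}$, $i \in \{1, \ldots, N\}$ and let $e_1, \ldots, e_N$ be the canonical basis of $\mathbb{R}^N$. For $|t|$ small enough, we have that
\[ z + t e_i \in Z_{-\varepsilon} \quad \forall\, i \in \{1, \ldots, N\}. \]
So if $t_k \to 0$, we can assume that
\[ z + t_k e_i \in Z_{-\varepsilon} \quad \forall\, k \ge 1. \]
We set
\[ h(z,y) = \frac{1}{\varepsilon^N}\, \varphi\Bigl( \frac{z - y}{\varepsilon} \Bigr) u(y) \]
and
\[ f_k(z,y) = \frac{1}{t_k} \bigl( h(z + t_k e_i, y) - h(z,y) \bigr). \]
We have
\[ \frac{1}{t_k} \bigl( u_\varepsilon(z + t_k e_i) - u_\varepsilon(z) \bigr) = \int_Z f_k(z,y) \, dy. \]
Note that
\[ f_k(z,y) \to \frac{\partial \varphi_\varepsilon}{\partial z_i}(z - y)\, u(y) \quad \text{as } k \to +\infty, \quad \forall\, y \in Z. \]
Moreover, fix $\rho$ with $\varepsilon < \rho < d_Z(z, \partial Z)$ and set $B = \overline{B}_\rho(z) \subseteq Z$. For $|t_k| \le \rho - \varepsilon$ the function $f_k(z,\cdot)$ vanishes outside $B$, and by the mean value theorem, we have
\[ \bigl| f_k(z,y) \bigr| \le \frac{1}{\varepsilon^{N+1}}\, \|D\varphi\|_\infty\, |u(y)|\, \chi_B(y), \quad \text{with } |u| \chi_B \in L^1(Z). \]
So by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
\[ \frac{\partial u_\varepsilon}{\partial z_i}(z) = \lim_{k \to +\infty} \frac{1}{t_k} \bigl( u_\varepsilon(z + t_k e_i) - u_\varepsilon(z) \bigr) = \int_Z \frac{\partial \varphi_\varepsilon}{\partial z_i}(z - y)\, u(y) \, dy. \]
In a similar way we show that the partial derivatives of $u_\varepsilon$ of all orders exist and are continuous on $Z_{-\varepsilon}$. Therefore
\[ u_\varepsilon \in C^\infty\bigl( Z_{-\varepsilon} \bigr). \]
Note that we have
\[ \frac{\partial}{\partial z_i} \bigl( u \star \varphi_\varepsilon \bigr) = u \star \frac{\partial \varphi_\varepsilon}{\partial z_i} \quad \forall\, i \in \{1, \ldots, N\}. \]
(b) Let $K$ be a compact subset of $Z$. Take $z \in K$ and set
\[ x \stackrel{df}{=} \frac{y - z}{\varepsilon}, \quad y \in Z. \]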


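Note that $\varphi(-x) = \varphi(x)$. Then we have
\begin{align*}
\bigl| u_\varepsilon(z) - u(z) \bigr| &\le \int_{B_\varepsilon(z)} \frac{1}{\varepsilon^N}\, \varphi\Bigl( \frac{z - y}{\varepsilon} \Bigr) \bigl| u(y) - u(z) \bigr| \, dy \\
&\le \int_{B_1(0)} \varphi(x) \bigl| u(z + \varepsilon x) - u(z) \bigr| \, dx \le \xi(\varepsilon) \int_{B_1(0)} \varphi(x) \, dx = \xi(\varepsilon), \tag{2.34}
\end{align*}
where
\[ \xi(\varepsilon) = \sup \bigl\{ |u(y) - u(v)| : (y,v) \in K_\varepsilon \times K_\varepsilon,\ \|y - v\| \le \varepsilon \bigr\}. \]
Since $u$ is uniformly continuous on the compact subsets of $Z$, we have that
\[ \lim_{\varepsilon \searrow 0} \xi(\varepsilon) = 0. \]
So from (2.34), we conclude that $u_\varepsilon \to u$ as $\varepsilon \searrow 0$ uniformly on compact subsets of $Z$.

(c) Let $K$ be a compact subset of $Z$ and $0 < \varepsilon < d\bigl(K, \partial Z\bigr)$. Then for $z \in K$, we have
\[ \bigl| u_\varepsilon(z) \bigr|^p \le \int_{B_1(0)} \vartheta(y) \bigl| u(z + \varepsilon y) \bigr|^p \, dy. \tag{2.35} \]
To see (2.35) note that it is clearly true if $p = 1$. So suppose that $p \in (1,+\infty)$. If $\frac{1}{p} + \frac{1}{p'} = 1$, then
\[ \vartheta(y)\, u(z + \varepsilon y) = \vartheta(y)^{\frac{1}{p'}}\, \vartheta(y)^{\frac{1}{p}}\, u(z + \varepsilon y). \]
Invoking Hölder’s inequality (see Theorem A.2.27), we have
\[ \bigl| u_\varepsilon(z) \bigr| \le \Bigl( \int_{B_1(0)} \vartheta(y) \, dy \Bigr)^{\frac{1}{p'}} \Bigl( \int_{B_1(0)} \vartheta(y) |u(z + \varepsilon y)|^p \, dy \Bigr)^{\frac{1}{p}}. \tag{2.36} \]
Since
\[ \int_{B_1(0)} \vartheta(y) \, dy = 1, \]
from (2.36), we obtain (2.35). Invoking (2.35), we have
\[ \int_K \bigl| u_\varepsilon(z) \bigr|^p \, dz \le \int_K \Bigl( \int_{B_1(0)} \vartheta(y) \bigl| u(z + \varepsilon y) \bigr|^p \, dy \Bigr) dz; \]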


thus by Fubini’s theorem, we Z Z Z ¯ ¯ ¯uε (z)¯p dz 6 ϑ(y) K

B1 (0)

where

df

Z

Z

¯ ¯ ¯u(z + εy)¯p dz dy 6

K

Kε =

Since

have

ϑ(y) B1 (0)

©

Z

¯ ¯ ¯u(v)¯p dv dy,

Kε

ª v ∈ RN : d(v, K) 6 ε .

ϑ(y) dy = 1, we have B1 (0)

Z kuε kLp (K) 6

¯ ¯ ¯u(v)¯p dv,

(2.37)

Kε p

so uε ∈ L (K). Let V be a bounded open set, such that Kε ⊆ V ⊆ V ⊆ Z

∀ ε > 0 small enough. ¡ ¢ Let δ > 0 be given and select h ∈ C V , such that ku − hkLp (V ) < δ. Here we exploit the density of the embedding ¡ ¢ C V ⊆ Lp (V ). Then, using (2.37), for ε > 0 small enough, we have ku − uε kLp (K) 6 ku − hkLp (K) + kh − hε kLp (K) + k(h − v)ε kLp (K) 6 ku − hkLp (K) + kh − hε kLp (K) + kh − vkLp (Kε ) 6 3δ, so uε −→ u

in Lploc (Z) as ε & 0.

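(d) Suppose that $u \in W^{1,p}(Z)$, $p \in (1,+\infty)$. From the proof of part (a) and integrating by parts, we know that
\[
\begin{aligned}
D_i u_\varepsilon(z) &= \int_Z \frac{\partial\varphi_\varepsilon}{\partial z_i}(z-y)\,u(y)\,dy = -\int_Z \frac{\partial}{\partial y_i}\big[\varphi_\varepsilon(z-y)\big]\,u(y)\,dy \\
&= \int_Z \varphi_\varepsilon(z-y)\,D_i u(y)\,dy = \big(\varphi_\varepsilon \star D_i u\big)(z),
\end{aligned}
\]
for $z \in Z_{-\varepsilon}$, $\varepsilon > 0$ and $i \in \{1,\dots,N\}$.

(e) This follows from (c) and (d).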
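To make the rate hidden in part (b) concrete, here is a small worked bound. It assumes, purely for illustration, that $u$ is Lipschitz near $K$ with a constant $L$; neither $L$ nor this extra hypothesis is part of the proposition.

```latex
% Assumption (not in the text): |u(y) - u(v)| <= L ||y - v|| for y, v in K_eps.
\[
\xi(\varepsilon)
  = \sup_{\substack{(y,v)\in K_\varepsilon\times K_\varepsilon\\ \|y-v\|\le\varepsilon}} |u(y)-u(v)|
  \le L\,\varepsilon ,
\qquad\text{hence, by (2.34),}\qquad
\sup_{z\in K}|u_\varepsilon(z)-u(z)| \le L\,\varepsilon .
\]
% Example: u(z) = |z| on Z = (-1,1) has L = 1, so |u_eps - u| <= eps on K.
```

So for Lipschitz data the uniform convergence on compact sets is at least linear in $\varepsilon$; for merely continuous $u$ only $\xi(\varepsilon) \to 0$ is available.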


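The next theorem shows that smooth functions are dense in $W^{1,p}(Z)$, $p \in [1,+\infty)$. So equivalently the space $W^{1,p}(Z)$ can be defined as the closure in the $\|\cdot\|_{W^{1,p}(Z)}$-norm of $C^\infty(Z)\cap W^{1,p}(Z)$. The result is known in the literature as the Meyers-Serrin theorem.

THEOREM 2.4.13 (Meyers-Serrin Theorem) If $p \in [1,+\infty)$, then the embedding $C^\infty(Z)\cap W^{1,p}(Z) \subseteq W^{1,p}(Z)$ is dense.

PROOF We define
\[ Z_{-n} \overset{\mathrm{df}}{=} \Big\{ z \in Z : d(z,\partial Z) > \frac{1}{n},\ \|z\|_{\mathbb{R}^N} < n \Big\} \quad \forall\, n \ge 1 \qquad\text{and}\qquad Z_{-0} \overset{\mathrm{df}}{=} \emptyset. \]
Set
\[ U_n \overset{\mathrm{df}}{=} Z_{-(n+1)} \setminus \overline{Z}_{-(n-1)} \qquad \forall\, n \ge 1. \]
The collection $\{U_n\}_{n\ge1}$ is an open cover of $Z$. So we can find a smooth partition of unity $\{\xi_n\}_{n\ge1}$ subordinate to $\{U_n\}_{n\ge1}$. Then
\[ \xi_n \in C_c^\infty(U_n), \quad 0 \le \xi_n \le 1 \quad \forall\, n \ge 1 \qquad\text{and}\qquad \sum_{n=1}^{\infty}\xi_n(z) = 1 \quad \forall\, z \in Z. \]
Let $u \in W^{1,p}(Z)$ and $\delta > 0$. We have
\[ \xi_n u \in W^{1,p}(Z) \quad\text{and}\quad \operatorname{supp}(\xi_n u) \subseteq U_n \qquad \forall\, n \ge 1. \]
Thus by virtue of Proposition 2.4.12(e), there exist $\varepsilon_n > 0$, such that
\[ \operatorname{supp}\big(\vartheta_{\varepsilon_n}\star(\xi_n u)\big) \subseteq U_n \quad\text{and}\quad \big\|\vartheta_{\varepsilon_n}\star(\xi_n u) - \xi_n u\big\|_{W^{1,p}(Z)} < \frac{\delta}{2^n}. \]
Define
\[ u_\delta \overset{\mathrm{df}}{=} \sum_{n=1}^{\infty}\vartheta_{\varepsilon_n}\star(\xi_n u). \tag{2.38} \]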

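In some neighbourhood of each point $z \in Z$, there are only finitely many nonzero terms in the sum. Hence $u_\delta \in C^\infty(Z)$. Next note that
\[ u = \sum_{n=1}^{\infty}\xi_n u. \]
From (2.38), we have
\[
\|u_\delta - u\|_{W^{1,p}(Z)} \le \sum_{n=1}^{\infty}\Big(\int_Z \big|\vartheta_{\varepsilon_n}\star(\xi_n u) - \xi_n u\big|^p\,dz\Big)^{\frac{1}{p}}
+ \sum_{n=1}^{\infty}\Big(\int_Z \big\|\vartheta_{\varepsilon_n}\star D(\xi_n u) - D(\xi_n u)\big\|^p_{\mathbb{R}^N}\,dz\Big)^{\frac{1}{p}} < \delta,
\]
so $u_\delta \in C^\infty(Z)\cap W^{1,p}(Z)$ and $u_\delta \to u$ in $W^{1,p}(Z)$ as $\delta \downarrow 0$.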
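The covering used in this proof can be written out explicitly in a simple case. The following sketch takes $Z = (0,1)\subseteq\mathbb{R}$ (a choice made here only for illustration) and uses the convention that the set with index zero is empty.

```latex
% Illustration with Z = (0,1); d(z, bd Z) = min{z, 1-z}.
\[
Z_{-1} = Z_{-2} = \emptyset, \qquad
Z_{-n} = \Big(\tfrac{1}{n},\, 1-\tfrac{1}{n}\Big) \quad (n \ge 3),
\]
\[
U_1 = \emptyset, \quad
U_2 = \Big(\tfrac13,\tfrac23\Big), \quad
U_3 = \Big(\tfrac14,\tfrac34\Big), \quad
U_n = \Big(\tfrac{1}{n+1},\tfrac{1}{n-1}\Big) \cup \Big(1-\tfrac{1}{n-1},\, 1-\tfrac{1}{n+1}\Big) \quad (n \ge 4).
\]
% A point z with d = min{z, 1-z} lies in U_n (n >= 4) iff n-1 < 1/d < n+1,
% so every z in (0,1) belongs to at least one and at most two of the U_n.
```

This makes the local finiteness visible: near any fixed $z$ only a couple of the terms $\vartheta_{\varepsilon_n}\star(\xi_n u)$ are nonzero, which is why the sum defining $u_\delta$ is smooth.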

REMARK 2.4.14 The result is true for all Sobolev spaces $W^{m,p}(Z)$, $m \ge 1$. We emphasize that in the above approximation result, we do not claim that the approximating functions belong to $C^\infty(\overline{Z})$. To obtain this we need to strengthen the geometry of the boundary $\partial Z$ of $Z$.

DEFINITION 2.4.15 A bounded open set $Z \subseteq \mathbb{R}^N$ is said to be Lipschitz, if for each $z \in \partial Z$, there exists a neighbourhood $U$ of $z$, such that
\[ Z \cap U = \big\{ y = (y_k)_{k=1}^N \in \mathbb{R}^N : \eta(y_1,\dots,y_{N-1}) < y_N \big\} \cap U, \]
where $\eta : \mathbb{R}^{N-1} \to \mathbb{R}$ is a Lipschitz continuous function and $\{y_k\}_{k=1}^N$ is a system of Cartesian coordinates of $\mathbb{R}^N$.

REMARK 2.4.16 From this definition it follows that $\partial Z$ locally has a representation of the form $y_N = \eta(y_1,\dots,y_{N-1})$, i.e., near $z \in \partial Z$, the boundary $\partial Z$ is the graph of a Lipschitz continuous function. By Rademacher's theorem (see Theorem 1.5.8), the outer unit normal $n(z)$ to the domain $Z$ exists for $\mu^{(N-1)}$-almost all $z \in \partial Z$. If $Z$ is a bounded polyhedron, then $Z$ is Lipschitz. Also if $Z$ is a $C^\infty$-submanifold with $C^\infty$ boundary $\partial Z$, then $Z$ is Lipschitz. Every Lipschitz open set $Z \subseteq \mathbb{R}^N$ is locally star-shaped.

For Lipschitz $Z \subseteq \mathbb{R}^N$ we can improve the conclusion of Theorem 2.4.13.


THEOREM 2.4.17 If $Z \subseteq \mathbb{R}^N$ is a bounded open set, which is Lipschitz and $p \in [1,+\infty)$, then the embedding $C^\infty(\overline{Z}) \subseteq W^{1,p}(Z)$ is dense.

REMARK 2.4.18 Theorem 2.4.17 implies that for any bounded, open, Lipschitz set $Z \subseteq \mathbb{R}^N$ and any given $u \in W^{1,p}(Z)$ ($p \in [1,+\infty)$), there exists a sequence $\{u_n\}_{n\ge1} \subseteq D(\mathbb{R}^N)$, such that $u_n|_Z \to u$ in $W^{1,p}(Z)$. In general for any open set $Z \subseteq \mathbb{R}^N$ and $u \in W^{1,p}(Z)$ ($p \in [1,+\infty)$), we can say that there exists a sequence $\{u_n\}_{n\ge1} \subseteq D(\mathbb{R}^N)$, such that
\[ u_n \to u \ \text{ in } L^p(Z), \qquad D_i u_n|_{Z'} \to D_i u|_{Z'} \ \text{ in } L^p(Z') \qquad \forall\, i \in \{1,\dots,N\},\ Z' \subset\subset Z \]
(i.e., $Z'$ is a bounded open set with $\overline{Z'} \subseteq Z$). This result is known in the literature as Friedrichs' theorem. Theorem 2.4.17 holds for all Sobolev spaces $W^{m,p}(Z)$, $m \ge 1$. None of these approximation results (Theorems 2.4.13 and 2.4.17) is true for $p = +\infty$. Indeed consider the following examples.

EXAMPLE 2.4.19
(a) Let $Z = \mathbb{R}^N$. We know that
\[ \overline{C_c(\mathbb{R}^N)}^{\,\|\cdot\|_\infty} = C_0(\mathbb{R}^N). \]
Thus while $u \equiv 1 \in W^{1,\infty}(\mathbb{R}^N)$, it cannot be approximated by functions in $C_c^\infty(\mathbb{R}^N)$.

(b) Let $Z = (-1,1)$ and consider the function
\[ u(z) \overset{\mathrm{df}}{=} \begin{cases} 0 & \text{if } z \le 0, \\ z & \text{if } z > 0. \end{cases} \]
Then $u$ is absolutely continuous. Its derivative in the sense of distributions is given by
\[ Du(z) \overset{\mathrm{df}}{=} \begin{cases} 0 & \text{if } z < 0, \\ 1 & \text{if } z > 0. \end{cases} \]
Let $\vartheta \in C^\infty(Z)$ be such that $\|\vartheta' - u'\|_\infty < \varepsilon$. So if $z < 0$, then $|\vartheta'(z)| < \varepsilon$ and if $z > 0$, then $|\vartheta'(z) - 1| < \varepsilon$, hence $1-\varepsilon < \vartheta'(z)$. By continuity, we obtain $\vartheta'(0) \le \varepsilon$ and $\vartheta'(0) \ge 1-\varepsilon$. If $\varepsilon < \frac{1}{2}$, we reach a contradiction. This shows that $u$ cannot be approximated in $W^{1,\infty}(Z)$ by smooth functions.


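The next proposition gives a simple characterization of the elements in $W^{1,p}(Z)$ ($p \in (1,+\infty]$).

PROPOSITION 2.4.20 If $Z \subseteq \mathbb{R}^N$ is open and $u \in L^p(Z)$ with $p \in (1,+\infty]$, then the following statements are equivalent:
(a) $u \in W^{1,p}(Z)$;
(b) there exists a constant $c > 0$, such that
\[ \Big|\int_Z u\,\frac{\partial\vartheta}{\partial z_k}\,dz\Big| \le c\,\|\vartheta\|_{p'} \qquad \forall\, \vartheta \in C_c^\infty(Z),\ k \in \{1,\dots,N\}, \]
with $\frac{1}{p}+\frac{1}{p'} = 1$;
(c) there exists a constant $c > 0$, such that for all $Z' \subset\subset Z$ (i.e., $Z'$ is a bounded open set such that $\overline{Z'} \subseteq Z$), we have
\[ \|\tau_y(u) - u\|_{L^p(Z')} \le c\,\|y\|_{\mathbb{R}^N} \qquad \forall\, y \in \mathbb{R}^N \text{ with } \|y\|_{\mathbb{R}^N} < d_{\mathbb{R}^N}(Z', Z^c). \]
Moreover, in both (b) and (c) we can take $c = \|Du\|_p$.

PROOF "(a)$\Longrightarrow$(b)": Obvious.

"(b)$\Longrightarrow$(a)": Let $L_k : C_c^\infty(Z) \to \mathbb{R}$ be defined by
\[ L_k(\vartheta) \overset{\mathrm{df}}{=} \int_Z u\,\frac{\partial\vartheta}{\partial z_k}\,dz \qquad \forall\, k \in \{1,\dots,N\}. \]
Evidently $L_k$ is linear and $L^{p'}$-continuous. Since the embedding $C_c^\infty(Z) \subseteq L^{p'}(Z)$ is dense, we can extend $L_k$ continuously on all of $L^{p'}(Z)$. So $L_k \in L^{p'}(Z)^*$ and by the Riesz representation theorem (see Theorem A.3.24), we can find $h \in L^p(Z)$, such that
\[ L_k(v) = \int_Z h v\,dz \qquad \forall\, v \in L^{p'}(Z), \]
so
\[ \int_Z u\,\frac{\partial\vartheta}{\partial z_k}\,dz = \int_Z h\vartheta\,dz \qquad \forall\, \vartheta \in C_c^\infty(Z),\ k \in \{1,\dots,N\}, \]
and, by the definition of the distributional derivative, $D_k u = -h$, hence $u \in W^{1,p}(Z)$.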


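"(a)$\Longrightarrow$(c)": First suppose that $u \in C_c^\infty(Z)$. Let $y \in \mathbb{R}^N$ and set
\[ v(t) \overset{\mathrm{df}}{=} u(z+ty) \qquad \forall\, t \in \mathbb{R}. \]
Then from the chain rule, we have $v'(t) = (Du(z+ty), y)_{\mathbb{R}^N}$. Integrating, we obtain
\[ u(z+y) - u(z) = v(1) - v(0) = \int_0^1 v'(t)\,dt = \int_0^1 (Du(z+ty), y)_{\mathbb{R}^N}\,dt \]
and so
\[
\begin{aligned}
\|\tau_y(u) - u\|^p_{L^p(Z')} &\le \|y\|^p_{\mathbb{R}^N}\int_{Z'}\int_0^1 \|Du(z+ty)\|^p_{\mathbb{R}^N}\,dt\,dz \\
&= \|y\|^p_{\mathbb{R}^N}\int_0^1\int_{Z'} \|Du(z+ty)\|^p_{\mathbb{R}^N}\,dz\,dt \\
&= \|y\|^p_{\mathbb{R}^N}\int_0^1\int_{Z'+ty} \|Du(r)\|^p_{\mathbb{R}^N}\,dr\,dt.
\end{aligned}
\]
If
\[ \|y\|_{\mathbb{R}^N} < d_{\mathbb{R}^N}(Z', Z^c), \]
we can find a bounded open set $Z''$, such that
\[ \overline{Z''} \subseteq Z \quad\text{and}\quad Z' + ty \subseteq Z'' \qquad \forall\, t \in [0,1]. \]
Therefore
\[ \|\tau_y(u) - u\|^p_{L^p(Z')} \le \|y\|^p_{\mathbb{R}^N}\int_{Z''}\|Du\|^p_{\mathbb{R}^N}\,dz. \tag{2.39} \]
For the general case, suppose that $u \in W^{1,p}(Z)$, $p \in (1,+\infty)$. Then we can find a sequence $\{u_n\}_{n\ge1} \subseteq C_c^\infty(\mathbb{R}^N)$, such that
\[ u_n \to u \ \text{ in } L^p(Z), \qquad Du_n \to Du \ \text{ in } L^p(Z''), \]
since $Z'' \subset\subset Z$ (see Remark 2.4.18). From (2.39), we have
\[ \|\tau_y(u_n) - u_n\|^p_{L^p(Z')} \le \|y\|^p_{\mathbb{R}^N}\int_{Z''}\|Du_n\|^p_{\mathbb{R}^N}\,dz, \]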


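so, passing to the limit as $n \to +\infty$,
\[ \|\tau_y(u) - u\|^p_{L^p(Z')} \le \|y\|^p_{\mathbb{R}^N}\int_{Z''}\|Du\|^p_{\mathbb{R}^N}\,dz. \tag{2.40} \]
If $p = +\infty$, we obtain (2.40) for $p < +\infty$ and then let $p \to +\infty$.

"(c)$\Longrightarrow$(b)": Let $\vartheta \in C_c^\infty(Z)$ and consider an open set $Z'$, such that $\operatorname{supp}\vartheta \subseteq Z' \subset\subset Z$. Let $y \in \mathbb{R}^N$ with
\[ \|y\|_{\mathbb{R}^N} < d(Z', Z^c). \]
By hypothesis we have
\[ \Big|\int_Z \big(\tau_y(u) - u\big)\vartheta\,dz\Big| \le c\,\|y\|_{\mathbb{R}^N}\,\|\vartheta\|_{p'}. \]
Note that
\[ \int_Z \big(u(z+y) - u(z)\big)\vartheta(z)\,dz = \int_Z u(z)\big(\vartheta(z-y) - \vartheta(z)\big)\,dz, \]
so
\[ \Big|\int_Z u(z)\,\frac{\vartheta(z-y) - \vartheta(z)}{\|y\|_{\mathbb{R}^N}}\,dz\Big| \le c\,\|\vartheta\|_{p'}. \]
Let $y = t e_k$, $t \in \mathbb{R}$, $k \in \{1,\dots,N\}$. Passing to the limit as $t \to 0$, we obtain
\[ \Big|\int_Z u\,\frac{\partial\vartheta}{\partial z_k}\,dz\Big| \le c\,\|\vartheta\|_{p'} \qquad \forall\, \vartheta \in C_c^\infty(Z),\ k \in \{1,\dots,N\}. \]
Finally it is clear from the above proofs that in (b) and (c), the constant $c > 0$ can be taken as $c = \|Du\|_p$.

REMARK 2.4.21 If $p = 1$, then we have (a) $\Longrightarrow$ (b) $\Longrightarrow$ (c). From the implication (a) $\Longrightarrow$ (c), we can see that if $Z \subseteq \mathbb{R}^N$ is an open set and $u \in W^{1,\infty}(Z)$, then
\[ |u(z) - u(y)| \le \|Du\|_\infty\,\|z-y\|_{\mathbb{R}^N} \qquad \forall\, z, y \in Z. \]
So $W^{1,\infty}(Z)$ is the space of Lipschitz continuous functions on $Z$. In particular $W^{1,\infty}(Z) \subseteq C(\overline{Z})$. More generally, it is easy to show that if $u : Z \to \mathbb{R}$ is locally Lipschitz, then $u \in W^{1,p}_{loc}(Z)$ ($p \in [1,+\infty]$).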
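As a sanity check of statement (c) of Proposition 2.4.20 with the constant $c = \|Du\|_p$, consider the following one-dimensional computation. The choices $Z = (-1,1)$ and $u(z) = |z|$ are made here only for illustration.

```latex
% Illustration: Z = (-1,1), u(z) = |z|, so Du = sign(z) and ||Du||_p^p = 2.
% Take Z' compactly contained in Z and |y| < d(Z', Z^c).
\[
\|\tau_y(u)-u\|_{L^p(Z')}^p
  = \int_{Z'} \big|\,|z+y|-|z|\,\big|^p\,dz
  \le |y|^p\,|Z'|
  \le 2\,|y|^p
  = |y|^p\,\|Du\|_p^p ,
\]
% using the reverse triangle inequality | |z+y| - |z| | <= |y| and |Z'| <= |Z| = 2.
\[
\text{so}\qquad \|\tau_y(u)-u\|_{L^p(Z')} \le \|Du\|_p\,|y| .
\]
```

For $p = +\infty$ the same inequality reads $\big||z+y| - |z|\big| \le |y|$, which is the Lipschitz estimate of Remark 2.4.21 with $\|Du\|_\infty = 1$.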


PROPOSITION 2.4.22 If $Z \subseteq \mathbb{R}^N$ is an open set and $u, v \in W^{1,p}(Z)\cap L^\infty(Z)$ with $p \in [1,+\infty]$, then $uv \in W^{1,p}(Z)$ and $D(uv) = uDv + vDu$ (product rule).

PROOF If $p = +\infty$, then $u, v$ are Lipschitz continuous functions and so differentiable for almost all $z \in Z$ (see Theorem 1.5.8). Clearly $uv$ is Lipschitz continuous too, hence in $W^{1,\infty}(Z)$ and the product rule results from the usual product rule of differentiable functions. So suppose that $p \in [1,+\infty)$. We assume that
\[ |u(z)|,\ |v(z)| \le 1 \qquad \text{for a.a. } z \in Z. \]
Invoking Theorem 2.4.13, we can find sequences $\{\widehat{u}_n\}_{n\ge1}, \{\widehat{v}_n\}_{n\ge1} \subseteq C^\infty(Z)\cap W^{1,p}(Z)$, such that
\[
\begin{aligned}
\widehat{u}_n &\to u \ \text{ in } W^{1,p}(Z), & \widehat{u}_n(z) &\to u(z) \ \text{ for a.a. } z \in Z, \\
\widehat{v}_n &\to v \ \text{ in } W^{1,p}(Z), & \widehat{v}_n(z) &\to v(z) \ \text{ for a.a. } z \in Z.
\end{aligned}
\]
Let
\[ u_n \overset{\mathrm{df}}{=} \max\big\{-1, \min\{\widehat{u}_n, 1\}\big\}, \qquad v_n \overset{\mathrm{df}}{=} \max\big\{-1, \min\{\widehat{v}_n, 1\}\big\}. \]
Then $u_n v_n$ is locally Lipschitz. Moreover, we have
\[ D(u_n v_n) = u_n Dv_n + v_n Du_n \in L^p(Z), \]
so $u_n v_n \in W^{1,p}(Z)$ for all $n \ge 1$. Note that
\[
\begin{aligned}
u_n &\to u \ \text{ in } W^{1,p}(Z), & u_n(z) &\to u(z) \ \text{ for a.a. } z \in Z, \\
v_n &\to v \ \text{ in } W^{1,p}(Z), & v_n(z) &\to v(z) \ \text{ for a.a. } z \in Z.
\end{aligned}
\]
We have
\[ \|u_n v_n - uv\|_p^p \le \int_Z |v_n(z)|^p\,|u_n(z) - u(z)|^p\,dz + \int_Z |u(z)|^p\,|v_n(z) - v(z)|^p\,dz \to 0. \]


In addition
\[
\begin{aligned}
\big\|u_n Dv_n + v_n Du_n - (uDv + vDu)\big\|_p^p
&\le \|u_n Dv_n - uDv\|_p^p + \|v_n Du_n - vDu\|_p^p \\
&\le \int_Z |u_n(z)|^p\,\|Dv_n - Dv\|^p_{\mathbb{R}^N}\,dz + \int_Z \|Dv(z)\|^p_{\mathbb{R}^N}\,|u_n(z) - u(z)|^p\,dz \\
&\quad + \int_Z |v_n(z)|^p\,\|Du_n - Du\|^p_{\mathbb{R}^N}\,dz + \int_Z \|Du(z)\|^p_{\mathbb{R}^N}\,|v_n(z) - v(z)|^p\,dz \to 0.
\end{aligned}
\]
Therefore we conclude that $uv \in W^{1,p}(Z)$ and $D(uv) = uDv + vDu$.
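The boundedness hypothesis in the product rule is doing real work. The following computation sketches a standard kind of counterexample; the specific choices ($N = 3$, $p = 2$, $Z$ the open unit ball, $u = v = r^{-1/3}$ with $r = \|z\|$) are assumptions made here for illustration, and the computation uses the fact that for such radial functions the weak gradient agrees with the classical one away from the origin.

```latex
% Illustration: Z = B_1(0) in R^3, p = 2, u = v = r^{-1/3} (unbounded near 0).
\[
\int_Z |u|^2\,dz = 4\pi\int_0^1 r^{-2/3}\,r^2\,dr < +\infty, \qquad
\int_Z \|Du\|^2\,dz = 4\pi\int_0^1 \tfrac19\,r^{-8/3}\,r^2\,dr
   = \tfrac{4\pi}{9}\int_0^1 r^{-2/3}\,dr = \tfrac{4\pi}{3},
\]
% so u belongs to W^{1,2}(Z). For the product uv = r^{-2/3}:
\[
\int_Z \|D(uv)\|^2\,dz = 4\pi\int_0^1 \tfrac49\,r^{-10/3}\,r^2\,dr
   = \tfrac{16\pi}{9}\int_0^1 r^{-4/3}\,dr = +\infty .
\]
```

So $u, v \in W^{1,2}(Z)$ while $uv \notin W^{1,2}(Z)$; this does not contradict the proposition, because $u \notin L^\infty(Z)$.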

In fact a careful reading of this proof reveals that the following result is also true.

PROPOSITION 2.4.23 If $Z \subseteq \mathbb{R}^N$ is an open set, $u \in W^{1,p}(Z)$, $p \in [1,+\infty]$ and $v \in W^{1,\infty}(Z)$, then $uv \in W^{1,p}(Z)$ and $D(uv) = uDv + vDu$.

Next we prove a chain rule for Sobolev functions.

PROPOSITION 2.4.24 (Chain Rule for Sobolev Functions) If $Z \subseteq \mathbb{R}^N$ is an open set, $\xi \in C^1(\mathbb{R})$, $\xi' \in L^\infty(\mathbb{R})$, $\xi(0) = 0$ and $u \in W^{1,p}(Z)$ with $p \in [1,+\infty]$, then $\xi\circ u \in W^{1,p}(Z)$ and $D(\xi\circ u) = \xi'(u)Du$. Moreover, if $Z$ is bounded, then we can drop the condition that $\xi(0) = 0$.

PROOF Let $p \in [1,+\infty)$ and $\vartheta \in C_c^\infty(Z)$ be such that $\operatorname{supp}\vartheta \subseteq V \subset\subset Z$.


Using Proposition 2.4.12(e) (twice) and integration by parts, we have
\[
\begin{aligned}
\int_Z \xi(u)\,\frac{\partial\vartheta}{\partial z_k}\,dz &= \int_V \xi(u)\,\frac{\partial\vartheta}{\partial z_k}\,dz
= \lim_{\varepsilon\downarrow0}\int_V \xi(u_\varepsilon)\,\frac{\partial\vartheta}{\partial z_k}\,dz
= -\lim_{\varepsilon\downarrow0}\int_V \xi'(u_\varepsilon)\,\frac{\partial u_\varepsilon}{\partial z_k}\,\vartheta\,dz \\
&= -\int_V \xi'(u)(D_k u)\vartheta\,dz = -\int_Z \xi'(u)(D_k u)\vartheta\,dz,
\end{aligned}
\]
so
\[ D_k(\xi\circ u) = \xi'(u)D_k u \qquad \forall\, k \in \{1,\dots,N\}. \]
It is clear from the above argument (see second equality) that if $Z$ is bounded, then the condition $\xi(0) = 0$ can be dropped. If $p = +\infty$, then $\xi\circ u$ is a Lipschitz continuous function (see Remark 2.4.21) and the result follows from the classical chain rule.

In fact there is a stronger version of the previous proposition. It is due to Marcus & Mizel (1972), where the interested reader can find the proof.

PROPOSITION 2.4.25 If $Z \subseteq \mathbb{R}^N$ is an open set, $\xi : \mathbb{R} \to \mathbb{R}$ is Lipschitz continuous, $\xi(0) = 0$ and $u \in W^{1,p}(Z)$, $p \in [1,+\infty]$, then $\xi\circ u \in W^{1,p}(Z)$ and $D(\xi\circ u) = \xi^*(u)Du$ almost everywhere on $Z$ with $\xi^* : \mathbb{R} \to \mathbb{R}$ being any bounded Borel function such that
\[ \xi^*(t) = \xi'(t) \qquad \text{for a.a. } t \in \mathbb{R}. \]

REMARK 2.4.26 The function $\xi^*$ can always be taken to be bounded, by virtue of the following result due to Stampacchia (1966): "If $u \in W^{1,p}(Z)$ and $A \subseteq \mathbb{R}$ is a Lebesgue-null set, then $Du(z) = 0$ for almost all $z \in u^{-1}(A)$." Moreover, note that the chain rule (see Proposition 2.4.25) is also valid for $W_0^{1,p}(Z)$ (see Corollary 2.4.8).

Recall that
\[ u^+ = \max\{u, 0\} \qquad\text{and}\qquad u^- = \max\{-u, 0\}. \]
We have
\[ u = u^+ - u^- \qquad\text{and}\qquad |u| = u^+ + u^-. \]

Using the general version of the chain rule (see Proposition 2.4.25), we obtain at once the following result.


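PROPOSITION 2.4.27 If $Z \subseteq \mathbb{R}^N$ is an open set and $u \in W^{1,p}(Z)$, $p \in [1,+\infty]$, then $u^+, u^-, |u| \in W^{1,p}(Z)$ and we have
\[
Du^+ = \begin{cases} Du & \text{for a.a. } z \in \{u > 0\}, \\ 0 & \text{for a.a. } z \in \{u \le 0\}, \end{cases}
\qquad
Du^- = \begin{cases} 0 & \text{for a.a. } z \in \{u \ge 0\}, \\ -Du & \text{for a.a. } z \in \{u < 0\}, \end{cases}
\]
\[
D|u| = \begin{cases} Du & \text{for a.a. } z \in \{u > 0\}, \\ 0 & \text{for a.a. } z \in \{u = 0\}, \\ -Du & \text{for a.a. } z \in \{u < 0\}. \end{cases}
\]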
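The way these formulas come out of the general chain rule can be sketched as follows. The particular Borel representatives $\xi^*$ chosen below are one admissible choice, made here for illustration, and the convention $u^- = \max\{-u, 0\}$ (under which $u = u^+ - u^-$) is assumed.

```latex
% Sketch: apply Proposition 2.4.25 with explicit Lipschitz functions xi.
\[
\xi(t) = \max\{t,0\}:\quad \xi(0)=0,\ \ \xi^* = \chi_{(0,+\infty)}
\ \Longrightarrow\ Du^+ = \chi_{\{u>0\}}\,Du ,
\]
\[
\xi(t) = \max\{-t,0\}:\quad \xi(0)=0,\ \ \xi^* = -\chi_{(-\infty,0)}
\ \Longrightarrow\ Du^- = -\chi_{\{u<0\}}\,Du ,
\]
\[
|u| = u^+ + u^- \ \Longrightarrow\ D|u| = \big(\chi_{\{u>0\}} - \chi_{\{u<0\}}\big)\,Du .
\]
% The value assigned to xi^* at t = 0 is irrelevant, since Du = 0 a.e. on {u = 0}
% by the result of Stampacchia quoted in Remark 2.4.26.
```

Both choices of $\xi$ are Lipschitz with constant one, so in particular $\|Du^\pm\|_p \le \|Du\|_p$.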

Using this proposition we can show that the Sobolev spaces $W^{1,p}(Z)$, $p \in [1,+\infty]$ have a lattice structure.

COROLLARY 2.4.28 If $Z \subseteq \mathbb{R}^N$ is an open set and $u, v \in W^{1,p}(Z)$, $p \in [1,+\infty]$, then
\[ h_0 = \min\{u, v\} \in W^{1,p}(Z), \qquad h_1 = \max\{u, v\} \in W^{1,p}(Z) \]
and we have
\[
Dh_0 = \begin{cases} Du & \text{for a.a. } z \in \{u \le v\}, \\ Dv & \text{for a.a. } z \in \{u > v\}, \end{cases}
\qquad
Dh_1 = \begin{cases} Du & \text{for a.a. } z \in \{u > v\}, \\ Dv & \text{for a.a. } z \in \{u \le v\}. \end{cases}
\]

PROOF Note that
\[ h_1 = (u-v)^+ + v \qquad\text{and}\qquad h_0 = u - (u-v)^+. \]
Then the result follows at once from Proposition 2.4.27.

An immediate consequence of this Corollary is the following particular case of the result of Stampacchia mentioned in Remark 2.4.26.

COROLLARY 2.4.29 If $Z \subseteq \mathbb{R}^N$ is an open set and $u \in W^{1,p}(Z)$, $p \in [1,+\infty]$, then for every $\eta \in \mathbb{R}$, we have
\[ Du(z) = 0 \qquad \text{for a.a. } z \in \{u = \eta\}. \]


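PROPOSITION 2.4.30 If $Z \subseteq \mathbb{R}^N$ is an open set, $\{u_n\}_{n\ge1} \subseteq W^{1,p}(Z)$, $p \in (1,+\infty)$,
\[ h \overset{\mathrm{df}}{=} \sup_{n\ge1}\|Du_n\|_{\mathbb{R}^N} \in L^p(Z) \qquad\text{and}\qquad g \overset{\mathrm{df}}{=} \sup_{n\ge1} u_n, \]
then $g \in W^{1,p}(Z)$ and $\|Dg(z)\|_{\mathbb{R}^N} \le h(z)$ for almost all $z \in Z$.

PROOF Let
\[ g_k \overset{\mathrm{df}}{=} \max_{1\le n\le k} u_n \qquad \forall\, k \ge 1. \]
From Corollary 2.4.28, we have that $g_k \in W^{1,p}(Z)$ and
\[ \|Dg_k(z)\|_{\mathbb{R}^N} \le \max_{1\le n\le k}\|Du_n(z)\|_{\mathbb{R}^N} \le h(z) \qquad \text{for a.a. } z \in Z. \tag{2.41} \]
Evidently the sequence $\{g_k\}_{k\ge1}$ is increasing and
\[ g_k(z) \to g(z) \quad \text{for a.a. } z \in Z \text{ as } k \to +\infty. \]
Also from (2.41), we see that the sequence $\{Dg_k\}_{k\ge1} \subseteq L^p(Z;\mathbb{R}^N)$ is bounded. Then from the monotone convergence theorem (see Theorem A.2.10) and the Eberlein-Smulian theorem (see Theorem A.3.8), we have
\[ g_k \to g \ \text{ in } L^p(Z), \qquad Dg_{k_m} \xrightarrow{\,w\,} w \ \text{ in } L^p(Z;\mathbb{R}^N), \]
with $\{g_{k_m}\}_{m\ge1}$ being a subsequence of $\{g_k\}_{k\ge1}$. For every $\vartheta \in C_c^\infty(Z)$ and every $i \in \{1,\dots,N\}$, from the definition of the distributional derivative, we have
\[ \int_Z (D_i g_k)\vartheta\,dz = -\int_Z g_k D_i\vartheta\,dz, \]
so
\[ \int_Z w_i\vartheta\,dz = -\int_Z g\,D_i\vartheta\,dz, \]
where $w = (w_i)_{i=1}^N \in L^p(Z;\mathbb{R}^N)$ and finally
\[ w_i = D_i g \qquad \forall\, i \in \{1,\dots,N\}. \]
Therefore, we infer that for the whole sequence, we have
\[ Dg_k \xrightarrow{\,w\,} Dg \ \text{ in } L^p(Z;\mathbb{R}^N). \]
So $g \in W^{1,p}(Z)$ and
\[ \|Dg(z)\|_{\mathbb{R}^N} \le h(z) \qquad \text{for a.a. } z \in Z. \]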


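From the above proofs, it is clear that we have:

PROPOSITION 2.4.31 If $Z \subseteq \mathbb{R}^N$ is an open set and $\{u_n\}_{n\ge1} \subseteq W^{1,p}(Z)$, $p \in [1,+\infty)$ is a sequence, such that
\[ u_n \xrightarrow{\,w\,} u \ \text{ in } L^p(Z), \qquad Du_n \xrightarrow{\,w\,} w \ \text{ in } L^p(Z;\mathbb{R}^N), \]
then $u \in W^{1,p}(Z)$ and $Du = w$.

PROPOSITION 2.4.32 If $Z \subseteq \mathbb{R}^N$ is an open set, $\{u_n\}_{n\ge1}, \{v_n\}_{n\ge1} \subseteq W^{1,p}(Z)$, $p \in [1,+\infty)$ and
\[ u_n \to u \ \text{ in } W^{1,p}(Z), \qquad v_n \to v \ \text{ in } W^{1,p}(Z), \]
then
\[ \min\{u_n, v_n\} \to \min\{u, v\} \ \text{ in } W^{1,p}(Z), \qquad \max\{u_n, v_n\} \to \max\{u, v\} \ \text{ in } W^{1,p}(Z). \]

PROOF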

It suffices to show that
\[ \text{if } u_n \to u \text{ in } W^{1,p}(Z), \text{ then } u_n^+ \to u^+ \text{ in } W^{1,p}(Z). \]
First note that
\[ |u_n^+ - u^+| \le |u_n - u| \]
and so it follows that
\[ u_n^+ \to u^+ \quad \text{in } L^p(Z). \]
Next let $h = \chi_{(0,+\infty)}$. Using Proposition 2.4.27, we have
\[
\begin{aligned}
\|Du_n^+ - Du^+\|_p^p &= \int_Z \|h(u_n)Du_n - h(u)Du\|^p_{\mathbb{R}^N}\,dz \\
&\le \|Du_n - Du\|_p^p + \int_Z \|Du(z)\|^p_{\mathbb{R}^N}\,|h(u_n) - h(u)|^p\,dz \to 0.
\end{aligned}
\]


Using this, we can conclude the validity of Proposition 2.4.27 and Corollary 2.4.28 for the spaces $W_0^{1,p}(Z)$, $p \in [1,+\infty)$. So $W_0^{1,p}(Z)$, $p \in [1,+\infty)$ has a lattice structure.

PROPOSITION 2.4.33 If $Z \subseteq \mathbb{R}^N$ is an open set and $u, v \in W_0^{1,p}(Z)$, $p \in [1,+\infty)$, then
\[ \max\{u, v\},\ \min\{u, v\} \in W_0^{1,p}(Z). \]
In particular
\[ u^+,\ u^-,\ |u| \in W_0^{1,p}(Z). \]

PROOF Again it suffices to show that $u^+ \in W_0^{1,p}(Z)$. Let $\{\vartheta_n\}_{n\ge1} \subseteq C_c^\infty(Z)$ be such that $\vartheta_n \to u$ in $W^{1,p}(Z)$. From the proof of Proposition 2.4.12(c) and (d), it follows that we can find a sequence $\{\psi_{nm}\}_{m\ge1} \subseteq C_c^\infty(Z)$, $\psi_{nm} \ge 0$, such that
\[ \psi_{nm} \to \vartheta_n^+ \quad \text{in } W_0^{1,p}(Z) \text{ as } m \to +\infty \qquad \forall\, n \ge 1. \]
Since
\[ \vartheta_n^+ \to u^+ \quad \text{in } W^{1,p}(Z) \]
(see Proposition 2.4.32), via the double limit lemma (see Proposition A.2.35), we can find a sequence $\{m(n)\}_{n\ge1}$ increasing (not necessarily strictly) to $+\infty$, such that $\psi_{n\,m(n)} \to u^+$ in $W^{1,p}(Z)$. Since $\psi_{n\,m(n)} \in C_c^\infty(Z)$, we deduce that $u^+ \in W_0^{1,p}(Z)$.

In fact the previous result can be also obtained by having a chain rule for $W_0^{1,p}(Z)$, $p \in (1,+\infty)$ (see also Proposition 2.4.25). First an auxiliary result which is actually of independent interest.

PROPOSITION 2.4.34 If $Z \subseteq \mathbb{R}^N$ is an open set, $u \in W^{1,p}(Z)$, $p \in [1,+\infty)$ and $u$ vanishes outside a compact $K \subseteq Z$, then $u \in W_0^{1,p}(Z)$.

PROOF Let $Z'$ be a bounded open set in $\mathbb{R}^N$, such that $K \subseteq Z' \subset\subset Z$. Let $\varphi \in C_c^\infty(\mathbb{R}^N)$, such that
\[ \varphi|_K \equiv 1 \]


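(i.e., $\varphi$ is what we usually call a cut off function). We have $\varphi u = u$. We can find $\{\vartheta_n\}_{n\ge1} \subseteq C_c^\infty(Z)$, such that
\[ \vartheta_n \to u \ \text{ in } L^p(Z) \qquad\text{and}\qquad D\vartheta_n \to Du \ \text{ in } L^p(Z';\mathbb{R}^N) \]
(see Remark 2.4.18). We have
\[ \varphi\vartheta_n \to \varphi u \quad \text{in } W^{1,p}(Z) \]
and $\varphi\vartheta_n \in C_c^\infty(Z)$. Therefore $\varphi u = u \in W_0^{1,p}(Z)$.

Using this result we can prove the chain rule for the Sobolev spaces $W_0^{1,p}(Z)$, $p \in (1,+\infty)$.

PROPOSITION 2.4.35 If $Z \subseteq \mathbb{R}^N$ is a bounded open set, $\xi : \mathbb{R} \to \mathbb{R}$ is Lipschitz continuous, $\xi(0) = 0$ and $u \in W_0^{1,p}(Z)$, $p \in (1,+\infty)$, then $\xi\circ u \in W_0^{1,p}(Z)$ and $D(\xi\circ u) = \xi'(u)Du$.

PROOF Let $\{\vartheta_n\}_{n\ge1} \subseteq C_c^\infty(Z)$ be such that $\vartheta_n \to u$ in $W^{1,p}(Z)$. Set
\[ h_n \overset{\mathrm{df}}{=} \xi\circ\vartheta_n \qquad \forall\, n \ge 1. \]
Evidently $h_n$ is a Lipschitz continuous function and since $\vartheta_n$ has compact support, so does $h_n$. Also
\[ \Big|\frac{\partial h_n}{\partial z_i}\Big| \le \operatorname{Lip}(h_n) \qquad \forall\, i \in \{1,\dots,N\},\ n \ge 1. \]
Because $Z \subseteq \mathbb{R}^N$ is bounded, we infer that
\[ \frac{\partial h_n}{\partial z_i} \in L^p(Z), \]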


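hence $h_n \in W^{1,p}(Z)$ and has compact support. So Proposition 2.4.34 implies that $h_n \in W_0^{1,p}(Z)$. Also we have
\[ \big|h_n(z) - \xi\big(u(z)\big)\big| = \big|\xi\big(\vartheta_n(z)\big) - \xi\big(u(z)\big)\big| \le \operatorname{Lip}(\xi)\,|\vartheta_n(z) - u(z)| \qquad \text{for a.a. } z \in Z, \]
so
\[ h_n \to \xi\circ u \quad \text{in } L^p(Z). \]
Moreover, if $\{e_k\}_{k=1}^N$ is the standard orthonormal basis of $\mathbb{R}^N$, we have
\[ \frac{|h_n(z+te_i) - h_n(z)|}{|t|} \le \operatorname{Lip}(\xi)\,\frac{|\vartheta_n(z+te_i) - \vartheta_n(z)|}{|t|}, \]
so
\[ \limsup_{n\to+\infty}\Big\|\frac{\partial h_n}{\partial z_i}\Big\|_p \le \operatorname{Lip}(\xi)\,\limsup_{n\to+\infty}\Big\|\frac{\partial\vartheta_n}{\partial z_i}\Big\|_p. \tag{2.42} \]
But
\[ \frac{\partial\vartheta_n}{\partial z_i} \to \frac{\partial u}{\partial z_i} \quad \text{in } L^p(Z) \qquad \forall\, i \in \{1,\dots,N\}. \]
So from (2.42), we infer that the sequence $\big\{\frac{\partial h_n}{\partial z_i}\big\}_{n\ge1} \subseteq L^p(Z)$ is bounded. Since $p \in (1,+\infty)$, by passing to a subsequence if necessary, we may assume that
\[ \frac{\partial h_n}{\partial z_i} \xrightarrow{\,w\,} w_i \quad \text{in } L^p(Z) \qquad \forall\, i \in \{1,\dots,N\}. \]
From Proposition 2.4.31, we have that
\[ w_i = \frac{\partial\xi(u)}{\partial z_i} \]
and so
\[ h_n \xrightarrow{\,w\,} \xi(u) \quad \text{in } W^{1,p}(Z). \]
Because $h_n \in W_0^{1,p}(Z)$ and $W_0^{1,p}(Z)$, being a closed subspace of $W^{1,p}(Z)$, is weakly closed, we conclude that $\xi\circ u = \xi(u) \in W_0^{1,p}(Z)$. Finally note that
\[ Dh_n = \xi'(\vartheta_n)D\vartheta_n \]
and so in the limit we have $D(\xi\circ u) = \xi'(u)Du$.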

REMARK 2.4.36 If Z ⊆ RN is bounded, open, Lipschitz, then in the above proof we can choose {ϑn }n>1 ⊆ Cc∞ (Z), such that ϑn −→ u in W 1,p (Z). Then the same proof is valid and so we have a proof of Proposition 2.4.25 with the extra hypothesis that Z ⊆ RN is bounded and Lipschitz.


We can also have the product rule for the spaces $W_0^{1,p}(Z)$, $p \in [1,+\infty]$. The proof is the same as that of Proposition 2.4.22, using this time a sequence $\{\widehat{u}_n\}_{n\ge1} \subseteq C_c^\infty(Z)$.

PROPOSITION 2.4.37 If $Z \subseteq \mathbb{R}^N$ is an open set and
\[ u \in W_0^{1,p}(Z)\cap L^\infty(Z) \qquad\text{and}\qquad v \in W^{1,p}(Z)\cap L^\infty(Z), \]
$p \in [1,+\infty]$, then $uv \in W_0^{1,p}(Z)$ and $D(uv) = uDv + vDu$.

Continuing with the Sobolev spaces $W_0^{1,p}(Z)$, $p \in [1,+\infty)$, we have the following results.

PROPOSITION 2.4.38 If $Z \subseteq \mathbb{R}^N$ is an open set,
\[ u \in W_0^{1,p}(Z) \qquad\text{and}\qquad v \in W^{1,p}(Z), \]
$p \in [1,+\infty]$ and
\[ 0 \le v(z) \le u(z) \qquad \text{for a.a. } z \in Z, \]
then $v \in W_0^{1,p}(Z)$.

PROOF From the proof of Proposition 2.4.33, we know that there exists a sequence $\{\vartheta_n\}_{n\ge1} \subseteq C_c^\infty(Z)$, such that
\[ \vartheta_n \ge 0 \quad \forall\, n \ge 1 \qquad\text{and}\qquad \vartheta_n \to u \ \text{ in } W^{1,p}(Z). \]
Let
\[ h_n \overset{\mathrm{df}}{=} \min\{v, \vartheta_n\} \qquad \forall\, n \ge 1. \]
Evidently $h_n$ has compact support and so by Proposition 2.4.34, $h_n \in W_0^{1,p}(Z)$. Moreover, from Proposition 2.4.32, we have that
\[ h_n \to \min\{v, u\} = v \quad \text{in } W^{1,p}(Z). \]
So $v \in W_0^{1,p}(Z)$.


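PROPOSITION 2.4.39 If $Z \subseteq \mathbb{R}^N$ is an open set, $u \in W_0^{1,p}(Z)$, $v \in W^{1,p}(Z)$, $p \in [1,+\infty)$ and
\[ |v(z)| \le |u(z)| \qquad \text{for a.a. } z \in Z \setminus K, \]
where $K$ is a compact subset of $Z$, then $v \in W_0^{1,p}(Z)$.

PROOF Let $\varphi \in C_c^\infty(Z)$ be such that
\[ 0 \le \varphi \le 1 \qquad\text{and}\qquad \varphi|_K = 1 \]
(a cut off function). We set
\[ \widehat{u} \overset{\mathrm{df}}{=} (1-\varphi)|u| + \varphi v^+. \]
From Propositions 2.4.33 and 2.4.34, we have that
\[ \widehat{u} \in W_0^{1,p}(Z) \qquad\text{and}\qquad 0 \le v^+ \le \widehat{u}. \]
So
\[ v^+ \in W_0^{1,p}(Z) \]
(see Proposition 2.4.38). Similarly we show that $v^- \in W_0^{1,p}(Z)$. Hence $v \in W_0^{1,p}(Z)$.

We can improve Proposition 2.4.34 and motivate the discussion of trace which follows.

PROPOSITION 2.4.40 If $Z \subseteq \mathbb{R}^N$ is a bounded open set, $u \in W^{1,p}(Z)$, $p \in [1,+\infty)$ and
\[ \lim_{z\to y} u(z) = 0 \qquad \forall\, y \in \partial Z, \]
then $u \in W_0^{1,p}(Z)$.

PROOF Since
\[ u = u^+ - u^-, \]
we may assume that $u \ge 0$. For $\varepsilon > 0$ let $u_\varepsilon \overset{\mathrm{df}}{=} (u-\varepsilon)^+$. Then $u_\varepsilon \in W^{1,p}(Z)$ (see Proposition 2.4.27) and $u_\varepsilon$ has compact support. Therefore by Proposition 2.4.34, we have that $u_\varepsilon \in W_0^{1,p}(Z)$. Now note that
\[ u_\varepsilon \to u \quad \text{in } W^{1,p}(Z) \text{ as } \varepsilon \downarrow 0. \]
Thus $u \in W_0^{1,p}(Z)$.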
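The truncation argument in this proof can be followed explicitly on a simple function. The sketch below takes $Z = (0,1)$ and $u(z) = z(1-z)$, and assumes that the truncation is $u_\varepsilon = (u-\varepsilon)^+$, the usual choice for this kind of argument; these choices are illustrative assumptions.

```latex
% Illustration: Z = (0,1), u(z) = z(1-z) >= 0, u -> 0 at both endpoints.
% Assumed truncation: u_eps = (u - eps)^+, with 0 < eps < 1/4.
\[
\operatorname{supp} u_\varepsilon = \{u \ge \varepsilon\}
  = \Big[\tfrac{1-\sqrt{1-4\varepsilon}}{2},\ \tfrac{1+\sqrt{1-4\varepsilon}}{2}\Big] \subseteq (0,1),
\qquad
\|u-u_\varepsilon\|_\infty \le \varepsilon ,
\]
\[
\|D(u-u_\varepsilon)\|_p^p = \int_{\{u\le\varepsilon\}} |1-2z|^p\,dz
  \le \big|\{u\le\varepsilon\}\big| = 1-\sqrt{1-4\varepsilon}\ \longrightarrow\ 0
  \quad\text{as } \varepsilon\downarrow 0 .
\]
```

Each $u_\varepsilon$ has compact support in $Z$, so Proposition 2.4.34 applies to it, and the two estimates together give $u_\varepsilon \to u$ in $W^{1,p}(0,1)$.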


So roughly speaking a function u ∈ W 1,p (Z) belongs to W01,p (Z), if u is vanishing on ∂Z. But it is not meaningful to talk of values of u on a set of measure zero. Hence we must be more careful on how we assign boundary values to Sobolev functions. Trace theory does exactly this, namely defines and studies the concept of boundary values for the Sobolev spaces W 1,p (Z), p ∈ [1, +∞). The trace of a Sobolev function is an extension of the restriction of a continuous function on ∂Z. We start with a simple lemma.

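LEMMA 2.4.41 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, then for all $p \in [1,+\infty)$ there exists $c > 0$, such that
\[ \int_{\partial Z}|u|^p\,d\mu^{(N-1)} \le c\,\|u\|^p_{W^{1,p}(Z)} \qquad \forall\, u \in C^1(\overline{Z}). \]

PROOF Since by hypothesis the boundary $\partial Z$ is Lipschitz, for any $z = (z_k)_{k=1}^N \in \partial Z$ we can find $r > 0$ and a Lipschitz continuous function $\eta : \mathbb{R}^{N-1} \to \mathbb{R}$, such that (upon rotating and relabelling the coordinate axes if necessary), we have
\[ Z \cap C_r(z) = \big\{ y = (y_k)_{k=1}^N \in \mathbb{R}^N : \eta(y_1,\dots,y_{N-1}) < y_N \big\} \cap C_r(z), \]
where
\[ C_r(z) \overset{\mathrm{df}}{=} \big\{ y = (y_k)_{k=1}^N \in \mathbb{R}^N : |y_k - z_k| < r,\ k = 1,\dots,N \big\}. \]
First assume that $u|_{Z\setminus C_r(z)} = 0$. If $\{e_k\}_{k=1}^N$ is the standard orthonormal basis of $\mathbb{R}^N$ and $n(\cdot)$ is the outward unit normal vector on $\partial Z$, then we have
\[ -(e_N, n(y))_{\mathbb{R}^N} \ge \big(1 + \operatorname{Lip}(\eta)^2\big)^{-\frac{1}{2}} > 0 \qquad \text{for } \mu^{(N-1)}\text{-a.a. } y \in \partial Z \cap C_r(z). \tag{2.43} \]
Let $\varepsilon > 0$ be given and set
\[ \xi_\varepsilon(t) \overset{\mathrm{df}}{=} (t^2 + \varepsilon^2)^{\frac{1}{2}} - \varepsilon \qquad \forall\, t \in \mathbb{R}. \]
Using the Gauss-Green theorem (see Theorem A.4.1) and since $|\xi_\varepsilon'(t)| \le 1$ for all $t \in \mathbb{R}$, we have
\[
\begin{aligned}
\int_{\partial Z}\xi_\varepsilon\big(u(y)\big)\,d\mu^{(N-1)} &= \int_{\partial Z\cap C_r(z)}\xi_\varepsilon\big(u(y)\big)\,d\mu^{(N-1)} \\
&\le c\int_{\partial Z\cap C_r(z)}\xi_\varepsilon\big(u(y)\big)\,(-e_N, n(y))_{\mathbb{R}^N}\,d\mu^{(N-1)} \\
&= -c\int_Z \frac{\partial}{\partial y_N}\Big(\xi_\varepsilon\big(u(y)\big)\Big)\,dy \\
&\le c\int_Z \big|\xi_\varepsilon'\big(u(y)\big)\big|\,\|Du(y)\|_{\mathbb{R}^N}\,dy \le c\int_Z \|Du(y)\|_{\mathbb{R}^N}\,dy,
\end{aligned}
\]
with $c > 0$ independent of $u$ (see (2.43)). Note that $\xi_\varepsilon(u) \to |u|$ as $\varepsilon \downarrow 0$, so in the limit we obtain
\[ \int_{\partial Z}|u|\,d\mu^{(N-1)} \le c\int_Z \|Du(z)\|_{\mathbb{R}^N}\,dz. \tag{2.44} \]
Now we remove the extra hypothesis that $u|_{Z\setminus C_r(z)} = 0$. In the general case, we can cover $\partial Z$ by a finite number of such cubes $C_{r_i}(z_i) = C_i$ for $i = 1,\dots,m$. Then we can find smooth functions $\{\xi_i\}_{i=0}^m$, such that
\[ 0 \le \xi_i \le 1, \quad \operatorname{supp}\xi_i \subseteq C_i \quad \forall\, i \in \{1,\dots,m\}, \qquad 0 \le \xi_0 \le 1, \quad \operatorname{supp}\xi_0 \subseteq Z \]
and
\[ \sum_{i=0}^m \xi_i(z) = 1 \qquad \forall\, z \in \overline{Z}. \]
We set
\[ u_i \overset{\mathrm{df}}{=} \xi_i u \qquad \forall\, i \in \{0,1,\dots,m\}. \]
Evidently $u_i|_{Z\setminus C_i} = 0$ for each $i \in \{1,\dots,m\}$, while $u_0$ vanishes on $\partial Z$, and so using (2.44), we obtain
\[ \int_{\partial Z}|u|\,d\mu^{(N-1)} \le c\int_Z \big(|u(z)| + \|Du(z)\|_{\mathbb{R}^N}\big)\,dz \qquad \forall\, u \in C^1(\overline{Z}). \tag{2.45} \]
If $p \in (1,+\infty)$, then we use (2.45) with $|u|^p$ replacing $|u|$. So finally
\[ \int_{\partial Z}|u(z)|^p\,d\mu^{(N-1)} \le c\,\|u\|^p_{W^{1,p}(Z)} \qquad \forall\, p \in [1,+\infty). \]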
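In one space dimension the estimate (2.45) can be obtained by hand, which may help to see where the two terms on the right come from. The sketch below takes $Z = (0,1)$, so that $\partial Z = \{0,1\}$ and $\mu^{(0)}$ is interpreted as counting measure; this reading of the zero-dimensional case is an assumption made for the illustration.

```latex
% Illustration: Z = (0,1), u in C^1([0,1]).
% Fundamental theorem of calculus: u(0) = u(t) - \int_0^t u'(s) ds for every t in [0,1].
\[
|u(0)| \le |u(t)| + \int_0^1 |u'(s)|\,ds
\ \ \Longrightarrow\ \
|u(0)| \le \int_0^1 |u(t)|\,dt + \int_0^1 |u'(s)|\,ds
\quad (\text{integrating in } t \in (0,1)),
\]
% and the same argument with u(1) = u(t) + \int_t^1 u'(s) ds gives the bound for |u(1)|. Hence
\[
\int_{\partial Z} |u|\,d\mu^{(0)} = |u(0)| + |u(1)|
  \le 2\int_0^1 \big(|u(z)| + |u'(z)|\big)\,dz .
\]
```

This is (2.45) with $c = 2$. Note that neither term on the right can be dropped: constants show that $\|u'\|_1$ alone is not enough, and functions concentrating near an endpoint show that $\|u\|_1$ alone is not enough.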


Using this Lemma, we can state and prove the trace theorem which gives meaning to the concept of boundary values for Sobolev spaces.

THEOREM 2.4.42 (Trace Theorem) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $p \in [1,+\infty)$, then there exists a unique continuous linear map
\[ \gamma_0 : W^{1,p}(Z) \to L^p\big(\partial Z, \mu^{(N-1)}\big), \]
such that
\[ \gamma_0(u) = u|_{\partial Z} \qquad \forall\, u \in C^1(\overline{Z}). \]

PROOF By virtue of Theorem 2.4.17, the embedding $C^1(\overline{Z}) \subseteq W^{1,p}(Z)$ is dense. From Lemma 2.4.41, we know that
\[ \big\|u|_{\partial Z}\big\|^p_{L^p(\partial Z,\mu^{(N-1)})} \le c\,\|u\|^p_{W^{1,p}(Z)} \qquad \forall\, u \in C^1(\overline{Z}), \]
for some $c > 0$. So the restriction map $u \mapsto u|_{\partial Z}$ extends uniquely to a continuous linear map
\[ \gamma_0 : W^{1,p}(Z) \to L^p\big(\partial Z, \mu^{(N-1)}\big). \]

DEFINITION 2.4.43 For every $u \in W^{1,p}(Z)$, we say that $\gamma_0(u)$ is the trace of $u$ on $\partial Z$.

PROPOSITION 2.4.44 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $p \in [1,+\infty)$, then
\[ \int_Z D_i u(z)\,dz = \int_{\partial Z}\gamma_0(u)\,n_i\,d\mu^{(N-1)} \qquad \forall\, u \in W^{1,p}(Z),\ i \in \{1,\dots,N\} \]
(as before $n = (n_i)_{i=1}^N$ is the outward unit normal on $\partial Z$).

PROOF Let $\{u_n\}_{n\ge1} \subseteq C^1(\overline{Z})$ be such that
\[ \|u - u_n\|_{W^{1,p}(Z)} \to 0 \]
(see Theorem 2.4.17). From the divergence theorem of multivariable calculus (see Theorem A.4.1), we have
\[ \int_Z D_i u_n(z)\,dz = \int_{\partial Z}\gamma_0(u_n)\,n_i\,d\mu^{(N-1)} \qquad \forall\, n \ge 1. \tag{2.46} \]
From Theorem 2.4.42, we know that
\[ \gamma_0(u_n) \to \gamma_0(u) \quad \text{in } L^p\big(\partial Z, \mu^{(N-1)}\big). \]
Also since $u_n \to u$ in $W^{1,p}(Z)$, we have
\[ D_i u_n \to D_i u \quad \text{in } L^p(Z). \]
So passing to the limit as $n \to +\infty$ in (2.46), we obtain
\[ \int_Z D_i u(z)\,dz = \int_{\partial Z}\gamma_0(u)\,n_i\,d\mu^{(N-1)}. \]

This proposition leads to a Green's Formula for Sobolev functions. First an auxiliary result which provides still another version of the product rule.

LEMMA 2.4.45 If $Z \subseteq \mathbb{R}^N$ is an open set, $p \in (1,+\infty)$ and $\frac{1}{p}+\frac{1}{p'} = 1$, then for all $u \in W^{1,p}(Z)$, $v \in W^{1,p'}(Z)$, we have $uv \in W^{1,1}(Z)$ and
\[ D_i(uv) = uD_i v + vD_i u \qquad \forall\, i \in \{1,\dots,N\}. \]

PROOF First assume that $u \in C^1(Z)$ and consider a sequence $\{v_n\}_{n\ge1} \subseteq C^1(Z)$, such that
\[ v_n \to v \quad \text{in } W^{1,p'}(Z') \qquad \forall\, Z' \subset\subset Z \]
(see Remark 2.4.18). Let $\vartheta \in D(Z)$ and consider a bounded, open set $Z' \subseteq \mathbb{R}^N$, such that
\[ \operatorname{supp}\vartheta \subseteq Z' \subseteq \overline{Z'} \subseteq Z. \]
For every $i \in \{1,\dots,N\}$, we have
\[ \int_Z u v_n D_i\vartheta\,dz = \int_{Z'} u v_n D_i\vartheta\,dz = -\int_{Z'}\big(uD_i v_n + v_n D_i u\big)\vartheta\,dz, \]
so
\[ \lim_{n\to+\infty}\int_{Z'}\big(uD_i v_n + v_n D_i u\big)\vartheta\,dz = \int_{Z'}\big(uD_i v + vD_i u\big)\vartheta\,dz. \]
Since
\[ v_n \to v \quad \text{in } W^{1,p'}(Z'), \]
we have
\[ \lim_{n\to+\infty}\int_Z u v_n D_i\vartheta\,dz = \int_Z uvD_i\vartheta\,dz. \]
So in the limit as $n \to +\infty$, we obtain
\[ \int_Z uvD_i\vartheta\,dz = -\int_{Z'}\big(uD_i v + vD_i u\big)\vartheta\,dz \qquad \forall\, \vartheta \in D(Z), \]
so
\[ D_i(uv) = uD_i v + vD_i u \in L^1(Z), \]
i.e., $uv \in W^{1,1}(Z)$. Now we remove the restriction that $u \in C^1(Z)$. If $u \in W^{1,p}(Z)$, we can find a sequence $\{u_n\}_{n\ge1} \subseteq C^1(Z)$, such that $u_n \to u$ in $W^{1,p}(Z')$ for all open sets $Z' \subset\subset Z$. From the first part of the proof we know that
\[ D_i(u_n v) = u_n D_i v + vD_i u_n \qquad \forall\, n \ge 1. \]
Hence
\[ u_n v \to uv \ \text{ in } L^1(Z) \qquad\text{and}\qquad D_i(u_n v) \to uD_i v + vD_i u \ \text{ in } L^1(Z). \]
Therefore $\{u_n v\}_{n\ge1}$ is a Cauchy sequence in $W^{1,1}(Z)$ and so
\[ u_n v \to uv \ \text{ in } W^{1,1}(Z) \qquad\text{and}\qquad D_i(uv) = uD_i v + vD_i u. \]

THEOREM 2.4.46 (Green Formula) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $p \in (1,+\infty)$, $\frac{1}{p}+\frac{1}{p'} = 1$, then for all $u \in W^{1,p}(Z)$, $v \in W^{1,p'}(Z)$ and $i \in \{1,\dots,N\}$, we have
\[ \int_Z uD_i v\,dz + \int_Z vD_i u\,dz = \int_{\partial Z}\gamma_0(uv)\,n_i\,d\mu^{(N-1)}. \]

PROOF From Lemma 2.4.45, we know that
\[ uv \in W^{1,1}(Z) \qquad\text{and}\qquad D_i(uv) = uD_i v + vD_i u. \]
An application of Proposition 2.4.44 leads to Green's formula.
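For $N = 1$ the Green formula is just integration by parts, and it can be checked on a concrete pair of functions. The choices $Z = (0,1)$, $u(z) = z$, $v(z) = z^2$ below are illustrative; on $\partial Z = \{0,1\}$ the outward normal is taken to be $n(0) = -1$, $n(1) = +1$ and $\mu^{(0)}$ counting measure.

```latex
% Illustration: Z = (0,1), u(z) = z, v(z) = z^2, n(0) = -1, n(1) = +1.
\[
\int_0^1 u\,v'\,dz + \int_0^1 v\,u'\,dz
  = \int_0^1 2z^2\,dz + \int_0^1 z^2\,dz
  = \tfrac23 + \tfrac13 = 1 ,
\]
\[
\int_{\partial Z} \gamma_0(uv)\,n\,d\mu^{(0)}
  = (uv)(1)\cdot(+1) + (uv)(0)\cdot(-1) = 1 - 0 = 1 .
\]
```

Both sides agree, as the theorem predicts; for smooth functions the trace $\gamma_0(uv)$ is simply the pointwise boundary value of the product.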


COROLLARY 2.4.47 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $p \in (1,+\infty)$ and $\frac{1}{p}+\frac{1}{p'} = 1$, then for all $u \in W^{1,p}(Z)$ and $h \in C^1(\mathbb{R}^N;\mathbb{R}^N)$, we have
\[ \int_Z u\,\operatorname{div}h\,dz + \int_Z \big(Du(z), h(z)\big)_{\mathbb{R}^N}\,dz = \int_{\partial Z}\gamma_0(u)\,(h, n)_{\mathbb{R}^N}\,d\mu^{(N-1)}. \]

Theorem 2.4.42 gives meaning to the quantity $u|_{\partial Z}$ for any $u \in W^{1,p}(Z)$, $p \in [1,+\infty)$. In fact we can do the same thing for $\frac{\partial^m}{\partial n^m}$ for any $m \ge 1$. Here $\frac{\partial}{\partial n}$ denotes the outward normal derivative on $\partial Z$. Also we can give a more precise description of the range of the trace map. To do this we need to introduce Sobolev spaces of fractional order on manifolds.

DEFINITION 2.4.48 Let $M$ be a compact manifold in $\mathbb{R}^N$. For any $s \in (0,1)$, $p \in [1,+\infty)$ and $u \in C^\infty(M)$, we define
\[ \|u\|_{W^{s,p}(M)} \overset{\mathrm{df}}{=} \bigg[\int_M |u(z)|^p\,dz + \int_{M\times M}\frac{|u(z) - u(z')|^p}{\|z-z'\|^{N-1+sp}_{\mathbb{R}^N}}\,dz\,dz'\bigg]^{\frac{1}{p}}. \]
This is a norm. The completion of $C^\infty(M)$ under this norm is denoted by $W^{s,p}(M)$. For any $s > 0$, we set $s = k + \eta$, with a positive integer $k$ and $\eta \in (0,1)$ (if $s$ is not an integer). We define
\[ W^{s,p}(M) \overset{\mathrm{df}}{=} \big\{ u \in W^{k,p}(M) : D^\alpha u \in W^{\eta,p}(M) \text{ for all } |\alpha| = k \big\}. \]

REMARK 2.4.49 The definition makes sense also for any $Z \subseteq \mathbb{R}^N$ bounded and open. Also if $s = 0$, by convention $W^{0,p}(M) = L^p(M)$.

Now we can state the full version of the trace theorem.

THEOREM 2.4.50 (Trace Theorem) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $m \ge 1$ is a positive integer and $p \in [1,+\infty)$, then there exists a unique bounded, linear operator
\[ \gamma = (\gamma_k)_{k=0}^{m-1} : W^{m,p}(Z) \to L^p(\partial Z)^m, \]
such that
(a) if $u \in C^\infty(\overline{Z})$, then $\gamma_k(u) = \dfrac{\partial^k u}{\partial n^k}$ for $k = 0,\dots,m-1$;
(b) $\operatorname{range}\gamma = \displaystyle\prod_{k=0}^{m-1} W^{m-k-\frac{1}{p},\,p}(\partial Z)$;
(c) $\ker\gamma = W_0^{m,p}(Z)$.


Using Theorem 2.4.46 and the continuity of the trace map $\gamma_1$, we obtain the following result.

THEOREM 2.4.51 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $u \in H^2(Z)$, $v \in H^1(Z)$, then
\[ \int_Z (\Delta u)v\,dz + \int_Z (Du, Dv)_{\mathbb{R}^N}\,dz = \int_{\partial Z}\frac{\partial u}{\partial n}\,v\,d\mu^{(N-1)}. \]

REMARK 2.4.52 The equality in the above theorem is sometimes called Second Green's Identity.

We can have a nonlinear extension of this theorem (i.e., $p \ne 2$). For this purpose if $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $q \in (1,+\infty)$, we introduce the space:
\[ V^q(Z, \operatorname{div}) \overset{\mathrm{df}}{=} \big\{ v \in L^q(Z;\mathbb{R}^N) : \operatorname{div}v \in L^q(Z) \big\}. \]
We furnish $V^q(Z, \operatorname{div})$ with the norm
\[ \|v\|_{V^q(Z,\operatorname{div})} \overset{\mathrm{df}}{=} \Big[\|v\|^q_{L^q(Z;\mathbb{R}^N)} + \|\operatorname{div}v\|^q_{L^q(Z)}\Big]^{\frac{1}{q}}. \]
It is easy to see that $V^q(Z, \operatorname{div})$ equipped with this norm is a separable, reflexive Banach space and the embedding $C^\infty(\overline{Z};\mathbb{R}^N) \subseteq V^q(Z, \operatorname{div})$ is dense. The next theorem extends Theorem 2.4.46. For a proof of it we refer to Casas & Fernández (1989) and Kenmochi (1975).

THEOREM 2.4.53 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $p \in (1,+\infty)$ and $\frac{1}{p}+\frac{1}{p'} = 1$, then there exists a unique bounded, linear operator
\[ \gamma_n : V^{p'}(Z, \operatorname{div}) \to W^{-\frac{1}{p'},p'}(\partial Z) = W^{\frac{1}{p'},p}(\partial Z)^*, \]
such that
\[ \gamma_n(v) = (v, n)_{\mathbb{R}^N} \qquad \forall\, v \in C^\infty(\overline{Z};\mathbb{R}^N) \]
and
\[ \int_Z u\,\operatorname{div}v\,dz + \int_Z (Du, v)_{\mathbb{R}^N}\,dz = \big\langle \gamma_n(v), \gamma_0(u)\big\rangle_{W^{\frac{1}{p'},p}(\partial Z)} \qquad \forall\, v \in V^{p'}(Z, \operatorname{div}),\ u \in W^{1,p}(Z). \]


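If for $u \in W^{1,p}(Z)$, we set
\[ \Delta_p u \overset{\mathrm{df}}{=} \operatorname{div}\big(\|Du\|^{p-2}_{\mathbb{R}^N}Du\big) \]
(the $p$-Laplacian), then from Theorem 2.4.53, we obtain the following nonlinear extension of Theorem 2.4.51.

THEOREM 2.4.54 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $p \in (1,+\infty)$, $\frac{1}{p}+\frac{1}{p'} = 1$,
\[ u \in W^{1,p}(Z) \qquad\text{and}\qquad \Delta_p u \in L^{p'}(Z), \]
then there exists a unique element of $W^{-\frac{1}{p'},p'}(\partial Z)$, which by extension we denote by $\frac{\partial u}{\partial n_p}$, satisfying for all $v \in W^{1,p}(Z)$,
\[ \int_Z (\Delta_p u)\,v\,dz + \int_Z \|Du\|^{p-2}_{\mathbb{R}^N}(Du, Dv)_{\mathbb{R}^N}\,dz = \Big\langle \frac{\partial u}{\partial n_p}, \gamma_0(v)\Big\rangle_{W^{\frac{1}{p'},p}(\partial Z)}. \]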
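To get a feeling for the operator $\Delta_p$, it may help to write it out in the simplest situations. The computation below is a sketch for smooth $u$ in one space dimension with $u' \neq 0$; these regularity assumptions are made here for illustration and are not required by the theorem.

```latex
% Sketch: the p-Laplacian Delta_p u = div( ||Du||^{p-2} Du ) in special cases.
% Case p = 2: the weight ||Du||^{p-2} equals 1, so
\[
\Delta_2 u = \operatorname{div}(Du) = \Delta u .
\]
% Case N = 1, u smooth with u' \neq 0: since d/dw ( |w|^{p-2} w ) = (p-1)|w|^{p-2},
\[
\Delta_p u = \big(|u'|^{p-2}u'\big)' = (p-1)\,|u'|^{p-2}\,u'' .
\]
% Example: u(z) = z^2 on (1,2) has u' = 2z > 0 and u'' = 2, so
\[
\Delta_p u(z) = 2\,(p-1)\,(2z)^{p-2} .
\]
```

For $p = 2$ the identity in Theorem 2.4.54 therefore reduces to the one in Theorem 2.4.51, with $\frac{\partial u}{\partial n_2}$ playing the role of the usual normal derivative.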

If $u \in W_0^{1,p}(Z)$, then we can extend $u$ to $\widehat{u} \in W^{1,p}(\mathbb{R}^N)$ by simply setting $u$ equal to zero on $\mathbb{R}^N \setminus Z$. It is not clear whether this extension is possible for $u \in W^{1,p}(Z)$. The next theorem shows when this is possible. It is known as the extension theorem.

THEOREM 2.4.55 (Extension Theorem) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $\widehat{Z} \supseteq \overline{Z}$ is open, then there exists an extension operator
\[ E : W^{1,p}(Z) \to W^{1,p}(\widehat{Z}), \]
such that
\[ E(u)|_Z = u, \qquad \|E(u)\|_{L^p(\widehat{Z})} \le c\,\|u\|_{L^p(Z)} \qquad \forall\, u \in W^{1,p}(Z) \]
and
\[ \|E(u)\|_{W^{1,p}(\widehat{Z})} \le c\,\|u\|_{W^{1,p}(Z)} \qquad \forall\, u \in W^{1,p}(Z), \]
for some $c = c(Z, \widehat{Z}) > 0$.

Next let us define the dual of $W^{1,p}(Z)$ for an open set $Z \subseteq \mathbb{R}^N$ and $p \in [1,+\infty)$. By considering the map $L_1 : W^{1,p}(Z) \to L^p(Z)^{N+1}$, defined by
\[ L_1(u) \overset{\mathrm{df}}{=} (u, Du) \qquad \forall\, u \in W^{1,p}(Z), \]
we see that $W^{1,p}(Z)$ is isometrically isomorphic to a subspace of $L^p(Z)^{N+1}$. So from the Riesz representation theorem (see Theorem A.3.24), we have


THEOREM 2.4.56 If $Z \subseteq \mathbb{R}^N$ is an open set, $p \in [1,+\infty)$, $\frac{1}{p}+\frac{1}{p'} = 1$ and $G \in W^{1,p}(Z)^*$, then
\[ G(u) = \int_Z (h, Du)_{\mathbb{R}^N}\,dz \qquad \forall\, u \in W^{1,p}(Z), \]
for some $h \in L^{p'}(Z;\mathbb{R}^N)$.

The dual of $W^{1,p}(Z)$ is generally more than a space of distributions on $Z$. Clearly the restriction on $C_c^\infty(Z)$ of an element in $W^{1,p}(Z)^*$ belongs to $D(Z)^*$. However, this restriction is not injective because $C_c^\infty(Z)$ is not dense in $W^{1,p}(Z)$. The problem is that the elements of $W^{1,p}(Z)$ can have nonzero boundary values (in the sense of trace). On the other hand $C_c^\infty(Z)$ is dense in $W_0^{1,p}(Z)$. So for this Sobolev space the restriction is injective and we can have a convenient description of $W_0^{1,p}(Z)^*$.

THEOREM 2.4.57 If $Z \subseteq \mathbb{R}^N$ is an open set and $p \in [1,+\infty)$, then
\[ W_0^{1,p}(Z)^* \overset{\mathrm{df}}{=} \Big\{ G \in D(Z)^* : G = -\sum_{k=1}^N D_k T_{g_k}, \text{ for some } g = (g_k)_{k=1}^N \in L^{p'}(Z)^N \Big\}. \]
We set $W^{-1,p'}(Z) \overset{\mathrm{df}}{=} W_0^{1,p}(Z)^*$, with $\frac{1}{p}+\frac{1}{p'} = 1$.

For Sobolev functions of one variable (i.e., N = 1), using Theorem 2.2.24, we have the following convenient characterization. THEOREM 2.4.58 If Z = T = [0, b] (b < +∞) and u ∈ W 1,p (T ), p1 + p10 = 1, then u admits a representative which is absolutely continuous. Moreover, for p = +∞, the representative is Lipschitz continuous. REMARK 2.4.59 The result is also true if T is unbounded. In this case the representative is locally absolutely continuous (see Definition A.2.15(b)). For p = +∞, again the representative is Lipschitz continuous (see Remark 2.4.21).

2.5 Inequalities and Embedding Theorems

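The study of Sobolev spaces is useful because their elements possess special properties. Many of those properties are a consequence of the so-called embedding theorems. Among other things, the embedding theorems establish regularity properties for the Sobolev functions, in addition to the ones implied by their definition. Let us start with a negative observation.

EXAMPLE 2.5.1 $H^1(Z)$ is in general not embedded in $L^\infty(Z)$. To see this let
\[ Z \overset{df}{=} \Big\{ (x,y) \in \mathbb{R}^2 : x^2 + y^2 < \frac{1}{e^2} \Big\}, \]
$\eta \in \big(0, \frac{1}{2}\big)$ and let $u(x,y) = |\ln r|^{\eta}$, where $r = \sqrt{x^2+y^2}$ and $(x,y) \in Z$. We have
\[ \int_Z |u|^2\, dx\, dy = \int_0^{2\pi} \Big( \int_0^{\frac{1}{e}} |\ln r|^{2\eta}\, r\, dr \Big) d\vartheta \le 2\pi \int_0^{\frac{1}{e}} |\ln r|^{2\eta}\, r\, dr < +\infty, \]
i.e., $u \in L^2(Z)$. Note that $\frac{\partial u}{\partial x}$ exists on $Z \setminus \{(0,0)\}$ and we have
\[ \frac{\partial u}{\partial x} = -\eta |\ln r|^{\eta-1}\, \frac{1}{r}\, \frac{x}{r} = -\eta |\ln r|^{\eta-1}\, \frac{1}{r} \cos\vartheta, \]
so
\[ \int_Z \Big| \frac{\partial u}{\partial x} \Big|^2 dx\, dy \le \int_0^{2\pi} \Big( \int_0^{\frac{1}{e}} \eta^2 |\ln r|^{2\eta-2}\, \frac{1}{r}\, dr \Big) \cos^2\vartheta\, d\vartheta \le 2\pi\eta^2 \int_0^{\frac{1}{e}} |\ln r|^{2\eta-2}\, \frac{1}{r}\, dr \le 2\pi\eta^2 \Big[ -\frac{(-\ln r)^{2\eta-1}}{2\eta-1} \Big]_0^{\frac{1}{e}} \le \frac{2\pi\eta^2}{1-2\eta} \]
and thus
\[ \frac{\partial u}{\partial x} \in L^2(Z) \quad \text{and} \quad \frac{\partial u}{\partial x} \in C^1\big(Z \setminus \{(0,0)\}\big). \]
Let $\vartheta \in \mathcal{D}(Z)$. We have
\[ u(\cdot, y) \in C^1\Big({-\frac{1}{e}}, \frac{1}{e}\Big) \quad \forall\, y \in \Big({-\frac{1}{e}}, \frac{1}{e}\Big) \setminus \{0\} \]
and so, integrating by parts in $x$,
\[ -\int_{-\frac{1}{e}}^{\frac{1}{e}} \frac{\partial u}{\partial x}(x,y)\, \vartheta(x,y)\, dx = \int_{-\frac{1}{e}}^{\frac{1}{e}} u(x,y)\, \frac{\partial \vartheta}{\partial x}(x,y)\, dx. \]
Integrating with respect to $y \in \big({-\frac{1}{e}}, \frac{1}{e}\big) \setminus \{0\}$ and using Fubini's theorem, we obtain
\[ -\int_Z \frac{\partial u}{\partial x}\, \vartheta\, dx\, dy = \int_Z u\, \frac{\partial \vartheta}{\partial x}\, dx\, dy, \]
so $D_x T_u = T_{\frac{\partial u}{\partial x}}$ and similarly we show that $D_y T_u = T_{\frac{\partial u}{\partial y}}$. This proves that $u \in H^1(Z)$. Since $u(x,y) \to +\infty$ as $r \searrow 0$, $u$ is not essentially bounded, and from this example we see that $H^1(Z) \not\subseteq L^\infty(Z)$.

Next we prove that if $f \in W^{1,p}(\mathbb{R}^N)$, then $f \in L^r(\mathbb{R}^N)$ for a certain range of $r \ge 1$ (including $r = p$).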

DEFINITION 2.5.2 If $p \in [1,N)$, then $p^* \overset{df}{=} \frac{Np}{N-p}$ is called the critical Sobolev exponent corresponding to $p$.

The next inequality, known as the “Sobolev-Nirenberg-Gagliardo inequality” (or simply as “Sobolev inequality”; see also Theorem 1.6.7), implies that the embedding $W^{1,p}(\mathbb{R}^N) \subseteq L^r(\mathbb{R}^N)$ is continuous for $r \in [p, p^*]$ (the intermediate exponents follow by interpolation between $L^p$ and $L^{p^*}$).

THEOREM 2.5.3 (Sobolev-Nirenberg-Gagliardo Inequality) If $p \in [1,N)$, then there exists $c > 0$, such that
\[ \|u\|_{p^*} \le c\, \|Du\|_p \quad \forall\, u \in W^{1,p}(\mathbb{R}^N). \]

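PROOF By virtue of Theorem 2.4.13, we may assume that $u \in C_c^1(\mathbb{R}^N)$. For every $i \in \{1,\dots,N\}$, we have
\[ u(x_1,\dots,x_i,\dots,x_N) = \int_{-\infty}^{x_i} \frac{\partial u}{\partial x_i}(x_1,\dots,t_i,\dots,x_N)\, dt_i, \]
thus
\[ |u(x)| \le \int_{-\infty}^{+\infty} \big\|Du(x_1,\dots,t_i,\dots,x_N)\big\|_{\mathbb{R}^N}\, dt_i \quad \forall\, i \in \{1,\dots,N\} \]
and so
\[ |u(x)|^{\frac{N}{N-1}} \le \prod_{i=1}^{N} \Big( \int_{-\infty}^{+\infty} \big\|Du(x_1,\dots,t_i,\dots,x_N)\big\|_{\mathbb{R}^N}\, dt_i \Big)^{\frac{1}{N-1}}. \]
We integrate with respect to $x_1 \in \mathbb{R}$. Since the factor with $i = 1$ does not depend on $x_1$, the generalized Hölder inequality gives
\[ \int_{-\infty}^{+\infty} |u|^{1^*}\, dx_1 \le \Big( \int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\, dt_1 \Big)^{\frac{1}{N-1}} \int_{-\infty}^{+\infty} \prod_{i=2}^{N} \Big( \int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\, dt_i \Big)^{\frac{1}{N-1}} dx_1 \]
\[ \le \Big( \int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\, dt_1 \Big)^{\frac{1}{N-1}} \Big( \prod_{i=2}^{N} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\, dx_1\, dt_i \Big)^{\frac{1}{N-1}}. \]
Next we integrate with respect to $x_2 \in \mathbb{R}$.
\[ \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} |u|^{1^*}\, dx_1\, dx_2 \le \Big( \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\, dx_1\, dt_2 \Big)^{\frac{1}{N-1}} \Big( \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\, dt_1\, dx_2 \Big)^{\frac{1}{N-1}} \times \prod_{i=3}^{N} \Big( \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\, dx_1\, dx_2\, dt_i \Big)^{\frac{1}{N-1}}. \]
We continue this way and we obtain
\[ \int_{\mathbb{R}^N} |u|^{1^*}\, dx \le \prod_{i=1}^{N} \Big( \int_{-\infty}^{+\infty} \dots \int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\, dx_1 \dots dx_N \Big)^{\frac{1}{N-1}} = \|Du\|_1^{\frac{N}{N-1}}, \]
so
\[ \|u\|_{1^*} \le \|Du\|_1. \qquad (2.47) \]
This proves the theorem for $p = 1$. Now suppose that $p \in (1,N)$. Set $h = |u|^{\eta}$ with $\eta > 1$ to be chosen in the course of the proof. Using (2.47) for $h$ and Hölder's inequality (see Theorem A.2.27), we obtain
\[ \Big( \int_{\mathbb{R}^N} |u|^{\frac{\eta N}{N-1}}\, dx \Big)^{\frac{N-1}{N}} \le \int_{\mathbb{R}^N} \big\|D|u|^{\eta}\big\|_{\mathbb{R}^N}\, dx \le \eta \int_{\mathbb{R}^N} |u|^{\eta-1} \|Du\|_{\mathbb{R}^N}\, dx \le \eta \Big( \int_{\mathbb{R}^N} |u|^{(\eta-1)p'}\, dx \Big)^{\frac{1}{p'}} \|Du\|_p, \]
with $\frac{1}{p}+\frac{1}{p'}=1$. Choose $\eta$ so that
\[ \frac{\eta N}{N-1} = (\eta-1)\frac{p}{p-1}. \]
Then
\[ \frac{p}{p-1} = \eta \Big( \frac{p}{p-1} - \frac{N}{N-1} \Big) \quad \text{and} \quad p = \eta\, \frac{N-p}{N-1}, \]
hence
\[ \eta = \frac{Np-p}{N-p} = \frac{N-1}{N}\, p^*. \]
So we have
\[ \Big( \int_{\mathbb{R}^N} |u|^{p^*}\, dx \Big)^{\frac{N-1}{N}} \le \eta \Big( \int_{\mathbb{R}^N} |u|^{p^*}\, dx \Big)^{\frac{1}{p'}} \|Du\|_p, \]
so
\[ \|u\|_{p^*} \le c\, \|Du\|_p, \]
with $c = c(N,p) > 0$ (note that $\frac{N-1}{N} - \frac{1}{p'} = \frac{N-p}{Np} = \frac{1}{p^*} > 0$).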
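The exponent bookkeeping in the proof above is easy to get wrong by hand. The following Python sketch is not part of the original text; the helper names `critical_exponent`, `eta` and `check_exponents` are illustrative. Under the assumption $1 < p < N$, it checks with exact rational arithmetic that the choice $\eta = \frac{N-1}{N}p^*$ turns both exponents $\frac{\eta N}{N-1}$ and $(\eta-1)p'$ into $p^*$, and that the leftover power is $\frac{1}{p^*}$.

```python
from fractions import Fraction


def critical_exponent(N, p):
    """Critical Sobolev exponent p* = Np / (N - p), defined for 1 <= p < N."""
    N, p = Fraction(N), Fraction(p)
    if not (1 <= p < N):
        raise ValueError("p* is only defined for 1 <= p < N")
    return N * p / (N - p)


def eta(N, p):
    """Exponent eta = (N - 1) p* / N used for h = |u|^eta in the proof."""
    N = Fraction(N)
    return (N - 1) / N * critical_exponent(N, p)


def check_exponents(N, p):
    """Return True if the three exponent identities hold exactly (needs 1 < p < N)."""
    N, p = Fraction(N), Fraction(p)
    p_star = critical_exponent(N, p)
    p_conj = p / (p - 1)  # conjugate exponent p'
    e = eta(N, p)
    return (
        e * N / (N - 1) == p_star                    # exponent on the left becomes p*
        and (e - 1) * p_conj == p_star               # exponent after Hoelder becomes p*
        and (N - 1) / N - 1 / p_conj == 1 / p_star   # leftover power is 1/p*
    )


if __name__ == "__main__":
    for N, p in [(3, 2), (4, 2), (5, Fraction(3, 2)), (10, 7)]:
        print(N, p, critical_exponent(N, p), eta(N, p), check_exponents(N, p))
```

For instance, with $N = 3$ and $p = 2$ one gets $p^* = 6$ and $\eta = 4$, so the proof applies (2.47) to $h = |u|^4$.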

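The next inequality is known as “Poincaré's inequality” and is very useful in the study of Dirichlet elliptic equations.

THEOREM 2.5.4 (Poincaré's Inequality) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set and $p \in [1,+\infty)$, then there exists $c = c(Z,p) > 0$, such that
\[ \|u\|_p \le c\, \|Du\|_p \quad \forall\, u \in W_0^{1,p}(Z). \]

PROOF Since $Z \subseteq \mathbb{R}^N$ is bounded, we can find $\xi > 0$, such that
\[ Z \subseteq (-\xi,\xi)^N. \]
Let $\vartheta \in \mathcal{D}(Z)$ and extend it by zero on the whole “cube” $(-\xi,\xi)^N$. For every $z = (z_k)_{k=1}^N \in (-\xi,\xi)^N$, we have
\[ \vartheta(z) = \int_{-\xi}^{z_N} \frac{\partial \vartheta}{\partial z_N}(z_1,\dots,z_{N-1},t)\, dt. \]
By Hölder's inequality (see Theorem A.2.27), with $\frac{1}{p}+\frac{1}{p'}=1$, we have
\[ \big|\vartheta(z_1,\dots,z_N)\big|^p \le (2\xi)^{\frac{p}{p'}} \int_{-\xi}^{\xi} \Big| \frac{\partial \vartheta}{\partial z_N}(z_1,\dots,z_{N-1},t) \Big|^p dt, \]
so
\[ \int_{(-\xi,\xi)^{N-1}} \big|\vartheta(z_1,\dots,z_N)\big|^p\, dz_1 \dots dz_{N-1} \le (2\xi)^{\frac{p}{p'}} \int_{(-\xi,\xi)^{N-1}} \Big( \int_{-\xi}^{\xi} \Big| \frac{\partial \vartheta}{\partial z_N}(z_1,\dots,z_{N-1},t) \Big|^p dt \Big) dz_1 \dots dz_{N-1}, \]
thus, integrating also with respect to $z_N \in (-\xi,\xi)$,
\[ \int_Z \big|\vartheta(z)\big|^p\, dz \le (2\xi)^{\frac{p}{p'}+1} \int_Z \|D\vartheta\|_{\mathbb{R}^N}^p\, dz \]
and since $\frac{p}{p'}+1 = p$, we have
\[ \|\vartheta\|_p \le 2\xi\, \|D\vartheta\|_p. \]
Since $\mathcal{D}(Z)$ is dense in $W_0^{1,p}(Z)$, we conclude that
\[ \|u\|_p \le c\, \|Du\|_p \quad \forall\, u \in W_0^{1,p}(Z), \]
for some $c = c(Z,p) > 0$.

REMARK 2.5.5 In fact the result is true if $Z \subseteq \mathbb{R}^N$ is unbounded but of finite width, namely it lies between two parallel hyperplanes (see Adams (1975, p. 158)). However, the result fails in truly unbounded domains $Z \subseteq \mathbb{R}^N$.

EXAMPLE 2.5.6 Let $Z = \mathbb{R}^N$ and $\vartheta \in \mathcal{D}(\mathbb{R}^N)$ be such that
\[ \vartheta|_{B_1(0)} \equiv 1, \quad \vartheta|_{B_2(0)^c} \equiv 0 \quad \text{and} \quad 0 \le \vartheta \le 1. \]
Let $\vartheta_m(z) \overset{df}{=} \vartheta\big(\frac{z}{m}\big)$. Then $\|D\vartheta_m\|_p^p = m^{N-p} \|D\vartheta\|_p^p$, so if $N < p$, we have
\[ \|D\vartheta_m\|_p \longrightarrow 0 \quad \text{as } m \to +\infty, \]
while
\[ \|\vartheta_m\|_p^p \ge \lambda^N\big(B_m(0)\big) \longrightarrow +\infty \quad \text{as } m \to +\infty. \]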
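The scaling behind this counterexample can be checked numerically in the simplest case $N = 1$, $p = 2$. The sketch below is illustrative and not from the original text. It replaces the smooth cut-off by a piecewise-linear one (an assumption made only to keep the code short; such a function is Lipschitz with compact support rather than an element of $\mathcal{D}(\mathbb{R})$) and approximates the integrals with a midpoint rule. The function names are hypothetical.

```python
def cutoff(z):
    """Piecewise-linear stand-in for the cut-off: 1 on [-1, 1], 0 outside (-2, 2)."""
    return max(0.0, min(1.0, 2.0 - abs(z)))


def cutoff_derivative(z):
    """Almost-everywhere derivative of `cutoff`."""
    if 1.0 < abs(z) < 2.0:
        return -1.0 if z > 0 else 1.0
    return 0.0


def integral_of_pth_power(f, a, b, p, n=20_000):
    """Midpoint-rule approximation of the integral of |f|^p over (a, b)."""
    h = (b - a) / n
    return h * sum(abs(f(a + (i + 0.5) * h)) ** p for i in range(n))


def scaled_norms(m, p):
    """Return (||theta_m||_p^p, ||theta_m'||_p^p) for theta_m(z) = cutoff(z / m), N = 1."""
    value = integral_of_pth_power(lambda z: cutoff(z / m), -2.0 * m, 2.0 * m, p)
    grad = integral_of_pth_power(
        lambda z: cutoff_derivative(z / m) / m, -2.0 * m, 2.0 * m, p
    )
    return value, grad


if __name__ == "__main__":
    for m in (1, 10, 100):
        value, grad = scaled_norms(m, p=2)
        # The function norm grows like m while the gradient norm decays like m^(1 - p).
        print(m, value, grad)
```

In this one-dimensional setting the gradient term equals $2m^{1-p}$ and the function term is at least $2m = \lambda^1(B_m(0))$, so no constant $c$ can satisfy $\|\vartheta_m\|_p \le c\,\|\vartheta_m'\|_p$ for all $m$.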

An immediate useful consequence of Theorem 2.5.4, which is a basic tool in the study of Dirichlet elliptic problems, is the following result.


COROLLARY 2.5.7 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set and $p \in [1,+\infty)$, then
\[ \|Du\|_p \overset{df}{=} \Big( \int_Z \|Du(z)\|_{\mathbb{R}^N}^p\, dz \Big)^{\frac{1}{p}} \quad \forall\, u \in W_0^{1,p}(Z) \]
is a norm on $W_0^{1,p}(Z)$ equivalent to the usual Sobolev norm $\|u\|_{W^{1,p}(Z)}$.

Let us use this opportunity to mention a few equivalent norms for the Sobolev spaces $W^{1,p}(Z)$.

PROPOSITION 2.5.8 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set and $p \in [1,+\infty)$, then the following three norms are equivalent to the original Sobolev norm $\|\cdot\|_{W^{1,p}(Z)}$:
\[ \|u\|_{(1)} \overset{df}{=} \Big( \|Du\|_p^p + \Big| \int_Z u\, dz \Big|^p \Big)^{\frac{1}{p}}, \]
\[ \|u\|_{(2)} \overset{df}{=} \Big( \|Du\|_p^p + \Big| \int_{\partial Z} u\, d\mu^{(N-1)} \Big|^p \Big)^{\frac{1}{p}}, \]
\[ \|u\|_{(3)} \overset{df}{=} \Big( \|Du\|_p^p + \int_{\partial Z} |u|^p\, d\mu^{(N-1)} \Big)^{\frac{1}{p}}. \]

REMARK 2.5.9 If $N = 1$ and $Z = (0,b)$ ($b \in (0,+\infty)$), then
\[ \int_{\partial Z} u\, d\mu^{(N-1)} = u(0) + u(b). \]

Before passing to the so-called embedding theorems, let us mention one more inequality, known in the literature as “Morrey's inequality.” First a definition.

DEFINITION 2.5.10 Let $\eta \in (0,1)$. A function $u : \mathbb{R}^N \longrightarrow \mathbb{R}$ is said to be Hölder continuous with exponent $\eta$, if
\[ \sup_{\substack{x, y \in \mathbb{R}^N \\ x \ne y}} \frac{|u(x) - u(y)|}{\|x-y\|_{\mathbb{R}^N}^{\eta}} < +\infty. \]

In the proof of Morrey’s inequality, we shall use the following lemma.


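LEMMA 2.5.11 For every $p \in [1,+\infty)$, there exists $c = c(N,p) > 0$, such that
\[ \int_{\overline{B}_r(x)} |u(y) - u(z)|^p\, dy \le c\, r^{N+p-1} \int_{\overline{B}_r(x)} \|Du(y)\|_{\mathbb{R}^N}^p\, \|y - z\|_{\mathbb{R}^N}^{1-N}\, dy, \]
for all $r > 0$, $u \in C^1\big(\overline{B}_r(x)\big)$ and all $z \in \overline{B}_r(x)$.

PROOF If $y, z \in \overline{B}_r(x)$, then
\[ u(y) - u(z) = \int_0^1 \frac{d}{dt} u\big(z + t(y-z)\big)\, dt = \int_0^1 \big( Du(z + t(y-z)), y - z \big)_{\mathbb{R}^N}\, dt, \]
thus
\[ |u(y) - u(z)|^p \le \|y - z\|_{\mathbb{R}^N}^p \int_0^1 \big\|Du\big(z + t(y-z)\big)\big\|_{\mathbb{R}^N}^p\, dt. \]
So using Proposition 1.3.23(b) and (c), for $s > 0$, we have
\[ \int_{\overline{B}_r(x) \cap \partial B_s(z)} |u(y) - u(z)|^p\, d\mu^{(N-1)}(y) \le s^p \int_0^1 \int_{\overline{B}_r(x) \cap \partial B_s(z)} \big\|Du\big(z + t(y-z)\big)\big\|_{\mathbb{R}^N}^p\, d\mu^{(N-1)}(y)\, dt \]
\[ \le s^p \int_0^1 \frac{1}{t^{N-1}} \int_{\overline{B}_r(x) \cap \partial B_{st}(z)} \|Du(w)\|_{\mathbb{R}^N}^p\, d\mu^{(N-1)}(w)\, dt \]
\[ = s^{N+p-1} \int_0^1 \int_{\overline{B}_r(x) \cap \partial B_{st}(z)} \|Du(w)\|_{\mathbb{R}^N}^p\, \|w - z\|_{\mathbb{R}^N}^{1-N}\, d\mu^{(N-1)}(w)\, dt \]
\[ = s^{N+p-2} \int_{\overline{B}_r(x) \cap B_s(z)} \|Du(w)\|_{\mathbb{R}^N}^p\, \|w - z\|_{\mathbb{R}^N}^{1-N}\, dw. \]
Then from Example 1.5.27(a), integrating with respect to $s \in (0, 2r)$, we have
\[ \int_{\overline{B}_r(x)} |u(y) - u(z)|^p\, dy \le c\, r^{N+p-1} \int_{\overline{B}_r(x)} \|Du(y)\|_{\mathbb{R}^N}^p\, \|y - z\|_{\mathbb{R}^N}^{1-N}\, dy, \]
with $c = c(N,p) > 0$.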


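THEOREM 2.5.12 (Morrey Inequality) (a) For every $p \in (N,+\infty)$, there exists $c = c(N,p) > 0$, such that
\[ |u(y) - u(z)| \le c\, r \Big( \frac{1}{\lambda^N(\overline{B}_r(x))} \int_{\overline{B}_r(x)} \|Du(w)\|_{\mathbb{R}^N}^p\, dw \Big)^{\frac{1}{p}}, \]
for all $r > 0$, $u \in W^{1,p}\big(B_r(x)\big)$ and $\lambda^N$-almost all $y, z \in \overline{B}_r(x)$.
(b) If $p \in (N,+\infty)$ and $u \in W^{1,p}(\mathbb{R}^N)$, then the limit
\[ \lim_{r \searrow 0} u_{r,z} = u^*(z) \]
exists for all $z \in \mathbb{R}^N$ and $u^*$ is Hölder continuous with exponent $1 - \frac{N}{p}$; recall that
\[ u_{r,z} = \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} u(x)\, d\lambda^N(x). \]

PROOF (a) First suppose that $u \in C^1\big(\overline{B}_r(x)\big)$. Using Lemma 2.5.11 with $p = 1$, Hölder's inequality (see Theorem A.2.27) and recalling that $p' = \frac{p}{p-1}$, for all $y, z \in \overline{B}_r(x)$, we have
\[ |u(y) - u(z)| \le \frac{1}{\lambda^N(\overline{B}_r(x))} \int_{\overline{B}_r(x)} \big( |u(y) - u(w)| + |u(w) - u(z)| \big)\, dw \]
\[ \le c \int_{\overline{B}_r(x)} \|Du(w)\|_{\mathbb{R}^N} \big( \|y-w\|_{\mathbb{R}^N}^{1-N} + \|z-w\|_{\mathbb{R}^N}^{1-N} \big)\, dw \]
\[ \le c \Big( \int_{\overline{B}_r(x)} \big( \|y-w\|_{\mathbb{R}^N}^{1-N} + \|z-w\|_{\mathbb{R}^N}^{1-N} \big)^{p'}\, dw \Big)^{\frac{1}{p'}} \Big( \int_{\overline{B}_r(x)} \|Du\|_{\mathbb{R}^N}^p\, dw \Big)^{\frac{1}{p}} \]
\[ \le c\, r^{(N-(N-1)p')\frac{1}{p'}} \Big( \int_{\overline{B}_r(x)} \|Du(w)\|_{\mathbb{R}^N}^p\, dw \Big)^{\frac{1}{p}} = c\, r^{1-\frac{N}{p}} \Big( \int_{\overline{B}_r(x)} \|Du(w)\|_{\mathbb{R}^N}^p\, dw \Big)^{\frac{1}{p}}. \]
Since $\lambda^N(\overline{B}_r(x))$ is a constant multiple of $r^N$, this is the claimed estimate. Invoking Theorem 2.4.13, we see that the same estimate holds for all $u \in W^{1,p}\big(B_r(x)\big)$ and for $\lambda^N$-almost all $y, z \in B_r(x)$.

(b) From part (a), applied with $x = y$ and $r = \|y - z\|_{\mathbb{R}^N}$, for $\lambda^N$-almost all $y, z \in \mathbb{R}^N$, we have
\[ |u(y) - u(z)| \le c\, \|y - z\|_{\mathbb{R}^N}^{1-\frac{N}{p}} \Big( \int_{\overline{B}_r(x)} \|Du(w)\|_{\mathbb{R}^N}^p\, dw \Big)^{\frac{1}{p}} \le c\, \|Du\|_{L^p(\mathbb{R}^N;\mathbb{R}^N)}\, \|y - z\|_{\mathbb{R}^N}^{1-\frac{N}{p}}, \]
so $u$ is $\lambda^N$-almost everywhere equal to a Hölder continuous function $u^*$ with exponent $1 - \frac{N}{p}$. So
\[ \lim_{r \searrow 0} u_{r,z} = u^*(z) \quad \forall\, z \in \mathbb{R}^N. \]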

REMARK 2.5.13 If $p = +\infty$, then we know that the elements of $W^{1,\infty}(\mathbb{R}^N)$ are Lipschitz continuous functions (see Remark 2.4.21). From Theorem 2.5.12, it follows that, if $u \in W^{1,p}(\mathbb{R}^N)$, $N < p$, then
\[ \lim_{\|z\|_{\mathbb{R}^N} \to +\infty} u(z) = 0. \]

Already from Theorem 2.5.3, we know that the embedding $W^{1,p}(\mathbb{R}^N) \subseteq L^r(\mathbb{R}^N)$ is continuous for all $r \in [p, p^*]$. Moreover, from Theorem 2.5.12, we know that if $p > N$, then the embedding $W^{1,p}(\mathbb{R}^N) \subseteq L^\infty(\mathbb{R}^N)$ is continuous. The next two theorems make these facts much more precise. The first theorem is known as the Sobolev embedding theorem, while the second is known as the Rellich-Kondrachov embedding theorem. First let us introduce a new kind of boundary regularity.

DEFINITION 2.5.14 (a) For given $z \in \mathbb{R}^N$, an open ball $B_1$ with center $z$ and an open ball $B_2$ not containing $z$, the set
\[ C_z \overset{df}{=} \big\{ z + t(y - z) : y \in B_2,\ t > 0 \big\} \cap B_1 \]
is called a finite cone in $\mathbb{R}^N$.
(b) Let $Z \subseteq \mathbb{R}^N$ be an open set. We say that $Z$ has the cone property, if there exists a finite cone $C_0$, such that for each $z \in Z$, there exists an orthogonal transformation $U_z \in \mathcal{L}(\mathbb{R}^N;\mathbb{R}^N)$, for which we have $U_z(C_0) \subseteq Z$.


REMARK 2.5.15 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, then it has the cone property (see Adams (1975, p. 51)). Also it is clear that if $Z$ is $C^1$, then it has the cone property.

THEOREM 2.5.16 (Sobolev Embedding Theorem) If $Z \subseteq \mathbb{R}^N$ is an open set with the cone property, $p \in [1,+\infty)$, $k, m$ are integers, $k \ge 0$, $m \ge 1$, then (a) if $mp < N$, … ; if $mp > N$, then the embedding $W^{k+m,p}(Z) \subseteq C_b^k(Z)$ is continuous with $C_b^k(Z)$ being the space of all functions $u \in C^k(Z)$, such that $D^\alpha u$ is bounded on $Z$ for all multiindices $\alpha \in \mathbb{N}_0^N$ with $|\alpha| \le k$.

When $Z \subseteq \mathbb{R}^N$ is bounded, the conclusions are stronger.

THEOREM 2.5.17 (Rellich-Kondrachov Embedding Theorem) If $Z \subseteq \mathbb{R}^N$ is an open, bounded set with the cone property, $p \in [1,+\infty)$, $k, m$ are integers, $k \ge 0$, $m \ge 1$, then (a) if $mp < N$, … ; if $mp > N$, then the embedding $W^{k+m,p}(Z) \subseteq C^k(\overline{Z})$ is compact for all $r \in [1,+\infty]$ and in particular if $k = 0$, we have that the embedding $W^{m,p}(Z) \subseteq L^r(Z)$ is compact for all $r \in [1,+\infty]$.

REMARK 2.5.18 Since $W_0^{1,p}(Z)$ is a closed subspace of $W^{1,p}(Z)$, we see that both embedding theorems (i.e., Theorems 2.5.16 and 2.5.17) are also valid for $W_0^{1,p}(Z)$. In fact in this case the cone property can be dropped.

Using the embedding theorems, we can prove a generalized form of Poincaré's inequality (see Theorem 2.5.4).


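THEOREM 2.5.19 (Generalized Poincaré Inequality) If $Z \subseteq \mathbb{R}^N$ is a bounded, open, connected set with the cone property, $p \in (1,+\infty)$ and $V$ is a closed linear subspace of $W^{1,p}(Z)$, such that the only constant function in $V$ is the zero function, then
\[ \|u\|_p \le c\, \|Du\|_p \quad \forall\, u \in V, \]
for some $c > 0$.

PROOF We proceed by contradiction. So suppose that the conclusion of the theorem is not true. We can find a sequence $\{u_n\}_{n \ge 1} \subseteq V$, such that
\[ \|u_n\|_p > n\, \|Du_n\|_p \quad \forall\, n \ge 1. \]
Let
\[ v_n \overset{df}{=} \frac{u_n}{\|u_n\|_p} \quad \forall\, n \ge 1. \]
Then
\[ \|v_n\|_p = 1 \quad \text{and} \quad \|Dv_n\|_p < \frac{1}{n} \quad \forall\, n \ge 1. \]
So the sequence $\{v_n\}_{n \ge 1} \subseteq V \subseteq W^{1,p}(Z)$ is bounded and so by passing to a subsequence if necessary, we may assume that
\[ v_n \xrightarrow{w} v \quad \text{in } W^{1,p}(Z). \]
Because of Theorem 2.5.17(a), we have that
\[ v_n \longrightarrow v \quad \text{in } L^p(Z). \]
Hence $\|v\|_p = 1$. Also exploiting the weak lower semicontinuity of the norm in a Banach space, we have
\[ \|Dv\|_p \le \liminf_{n \to +\infty} \|Dv_n\|_p = 0, \]
so $v$ is constant (recall that $Z$ is connected) and $v \in V$ (since $V$, being closed and convex, is weakly closed), thus $v = 0$, which is a contradiction to the fact that $\|v\|_p = 1$.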


COROLLARY 2.5.20 If $Z \subseteq \mathbb{R}^N$ is a bounded, open, connected set which is Lipschitz, $S_0 \subseteq \partial Z$ with $\mu^{(N-1)}(S_0) > 0$, $p \in (1,+\infty)$ and
\[ V \overset{df}{=} \big\{ u \in W^{1,p}(Z) : \gamma_0(u) = 0 \text{ on } S_0 \big\} \]
($\gamma_0$ being the trace operator on $W^{1,p}(Z)$; see Theorem 2.4.42), then
\[ \|u\|_p \le c\, \|Du\|_p \quad \forall\, u \in V, \]
for some $c > 0$.

PROOF Since $\gamma_0$ is continuous and linear (see Theorem 2.4.42), $V$ is a closed, linear subspace of $W^{1,p}(Z)$. Suppose that the constant function $u \equiv c$ belongs to $V$. Then on $S_0$ we have $0 = \gamma_0(u) = \gamma_0(c) = c$. This permits the application of Theorem 2.5.19.

This leads to another fundamental inequality, known as the “Poincaré-Wirtinger inequality.” It is an essential tool in the study of periodic ordinary differential equations and Neumann partial differential equations.

THEOREM 2.5.21 (Poincaré-Wirtinger Inequality) If $Z \subseteq \mathbb{R}^N$ is a bounded, open, connected set with the cone property and $p \in (1,+\infty)$, then
\[ \|u - \overline{u}\|_p \le c\, \|Du\|_p \quad \forall\, u \in W^{1,p}(Z), \]
for some $c > 0$, where $\overline{u} \overset{df}{=} \frac{1}{\lambda^N(Z)} \int_Z u(z)\, dz$.

PROOF Let
\[ V \overset{df}{=} \Big\{ u \in W^{1,p}(Z) : \int_Z u(z)\, dz = 0 \Big\}. \]
Clearly $V$ is a closed, linear subspace of $W^{1,p}(Z)$. If the constant function $u \equiv c$ belongs to $V$, then
\[ \int_Z u(z)\, dz = c\, \lambda^N(Z) = 0, \]
hence $c = 0$. Also
\[ u - \overline{u} \in V \quad \forall\, u \in W^{1,p}(Z). \]
So an application of Theorem 2.5.19 to $u - \overline{u}$ (note that $D(u - \overline{u}) = Du$) finishes the proof.


We already know that for the Sobolev functions of one variable (i.e., $N = 1$), the situation is better (see Theorem 2.4.58). In this case the embedding theorems take a sharper form.

THEOREM 2.5.22 Let $T \subseteq \mathbb{R}$ be an interval.
(a) If $T$ is open and $p \in [1,+\infty]$, then the embedding $W^{1,p}(T) \subseteq L^\infty(T)$ is continuous;
(b) If $T$ is open, bounded and $p \in (1,+\infty]$, then the embedding $W^{1,p}(T) \subseteq C(\overline{T})$ is compact;
(c) If $T$ is open, bounded and $p \in [1,+\infty)$, then the embedding $W^{1,1}(T) \subseteq L^p(T)$ is compact.

PROOF (a) Using the extension theorem (see Theorem 2.4.55), we see that without any loss of generality, we may assume that $T = \mathbb{R}$. First let $u \in C_c^1(\mathbb{R})$ and for $p \in [1,+\infty)$ let
\[ \xi_p(r) \overset{df}{=} |r|^{p-1} r \quad \forall\, r \in \mathbb{R}. \]
We have that $\xi_p(u) \in C_c^1(\mathbb{R})$ and from the chain rule, we have
\[ \frac{d}{dt} \xi_p\big(u(t)\big) = \xi_p'\big(u(t)\big) \frac{d}{dt} u(t) = p\, |u(t)|^{p-1} \frac{d}{dt} u(t) \]
(since $\xi_p'(r) = p|r|^{p-1}$). Hence for every $t \in \mathbb{R}$, we have
\[ \xi_p\big(u(t)\big) = \int_{-\infty}^{t} p\, |u(s)|^{p-1} u'(s)\, ds \]
(since $u \in C_c^1(\mathbb{R})$), so by Hölder's inequality (see Theorem A.2.27), we have
\[ \big|\xi_p\big(u(t)\big)\big| = |u(t)|^p \le \int_{\mathbb{R}} p\, |u(s)|^{p-1} |u'(s)|\, ds \le p\, \|u\|_p^{p-1} \|u'\|_p. \]
By Young's inequality (see Proposition A.4.5), we obtain
\[ \|u\|_\infty \le c\, \|u\|_{W^{1,p}(\mathbb{R})} \quad \forall\, u \in C_c^1(\mathbb{R}), \qquad (2.48) \]
for some $c > 0$. Now let $u \in W^{1,p}(\mathbb{R})$, with $p \in [1,+\infty)$. Then we can find a sequence $\{u_n\}_{n \ge 1} \subseteq C_c^1(\mathbb{R})$, such that
\[ u_n \longrightarrow u \quad \text{in } W^{1,p}(\mathbb{R}). \]
From (2.48), we have
\[ \|u_n - u_m\|_\infty \le c\, \|u_n - u_m\|_{W^{1,p}(\mathbb{R})} \quad \forall\, n, m \ge 1, \]
hence the sequence $\{u_n\}_{n \ge 1} \subseteq L^\infty(\mathbb{R})$ is a Cauchy sequence. Therefore $u_n \longrightarrow u$ in $L^\infty(\mathbb{R})$ and, passing to the limit in (2.48), we have proved the continuity of the embedding $W^{1,p}(\mathbb{R}) \subseteq L^\infty(\mathbb{R})$ for $p \in [1,+\infty)$. Of course the result is trivially true for $p = +\infty$.

(b) Let $\overline{B}_1(0)$ be the closed unit ball in $W^{1,p}(T)$, $p \in (1,+\infty]$. Let $u \in \overline{B}_1(0)$. We have
\[ |u(t) - u(s)| = \Big| \int_s^t u'(\tau)\, d\tau \Big| \le \|u'\|_p\, |t - s|^{\frac{1}{p'}} \le |t - s|^{\frac{1}{p'}} \quad \forall\, t, s \in T, \]
where $\frac{1}{p}+\frac{1}{p'}=1$ (see Theorem 2.4.58). Then the Arzelà-Ascoli theorem (see Theorem 2.3.2) implies that $\overline{B}_1(0)$ is relatively compact in $C(\overline{T})$ and so we have proved the compactness of the embedding $W^{1,p}(T) \subseteq C(\overline{T})$ for $p \in (1,+\infty]$ with $T \subseteq \mathbb{R}$ being a bounded, open interval.

(c) Let $\overline{B}_1(0)$ be the closed unit ball in $W^{1,1}(T)$. Let $E$ be an open subset of $T$, such that $\overline{E} \subseteq T$. Let $|h| < d_{\mathbb{R}}(E, T^c)$ and $u \in \overline{B}_1(0)$. From Remark 2.4.21, we know that
\[ \|\tau_h(u) - u\|_{L^1(E)} \le |h|\, \|u'\|_{L^1(T)} \le |h|, \]
so
\[ \int_E |u(t+h) - u(t)|^p\, dt \le \big( 2\|u\|_{L^\infty(T)} \big)^{p-1} \int_E |u(t+h) - u(t)|\, dt \le c\, |h|, \]
for some $c > 0$ with $p \in [1,+\infty)$. Thus
\[ \Big( \int_E |u(t+h) - u(t)|^p\, dt \Big)^{\frac{1}{p}} \le c^{\frac{1}{p}}\, |h|^{\frac{1}{p}}. \]
Invoking Theorem 2.3.6, we infer that $\overline{B}_1(0)$ is relatively compact in $L^p(T)$, $p \in [1,+\infty)$. This proves the compactness of the embedding $W^{1,1}(T) \subseteq L^p(T)$ for $p \in [1,+\infty)$ with $T \subseteq \mathbb{R}$ being a bounded, open interval.
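The equicontinuity estimate used in part (b), $|u(t) - u(s)| \le \|u'\|_p\, |t - s|^{1/p'}$, can be illustrated numerically. The sketch below is not from the original text: the test function $u(t) = \sin 3t + t^2$ on $T = (0,1)$, the exponent $p = 3$ and the helper names are all assumptions chosen for illustration, and the norm and the supremum are only approximated on uniform grids.

```python
import math


def lp_norm(f, a, b, p, n=20_000):
    """Midpoint-rule approximation of the L^p(a, b) norm of f."""
    h = (b - a) / n
    return (h * sum(abs(f(a + (i + 0.5) * h)) ** p for i in range(n))) ** (1.0 / p)


def holder_quotient_sup(u, a, b, alpha, n=200):
    """Largest value of |u(t) - u(s)| / |t - s|^alpha over pairs of grid points."""
    pts = [a + (b - a) * i / n for i in range(n + 1)]
    best = 0.0
    for i, t in enumerate(pts):
        for s in pts[:i]:  # s < t, so t - s > 0
            best = max(best, abs(u(t) - u(s)) / (t - s) ** alpha)
    return best


def u(t):
    """Illustrative smooth test function on (0, 1)."""
    return math.sin(3.0 * t) + t * t


def du(t):
    """Derivative of the test function."""
    return 3.0 * math.cos(3.0 * t) + 2.0 * t


if __name__ == "__main__":
    p = 3.0
    alpha = 1.0 - 1.0 / p  # Hoelder exponent 1/p'
    print("sup of Hoelder quotient:", holder_quotient_sup(u, 0.0, 1.0, alpha))
    print("norm of u' in L^p      :", lp_norm(du, 0.0, 1.0, p))
```

The first printed number should not exceed the second, in line with the estimate. A uniform bound of this kind over the whole unit ball is what the Arzelà-Ascoli argument needs.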

REMARK 2.5.23 The embedding
\[ W^{1,1}(T) \subseteq C(\overline{T}) \]
is continuous but never compact even if the open interval $T$ is bounded. If $T \subseteq \mathbb{R}$ is a bounded, open interval and $\{u_n\}_{n \ge 1} \subseteq W^{1,1}(T)$ is a bounded sequence, then we can extract a subsequence $\{u_{n_k}\}_{k \ge 1}$, such that $\{u_{n_k}(t)\}_{k \ge 1}$ converges for every $t \in T$ (see Denkowski, Migórski & Papageorgiou (2003a, p. 229)). Also if $T \subseteq \mathbb{R}$ is an unbounded, open interval and $p \in (1,+\infty]$, then the embedding $W^{1,p}(T) \subseteq L^\infty(T)$ is continuous, but not compact.

To the equivalent Sobolev norms mentioned in Proposition 2.5.8, we can add one more.

THEOREM 2.5.24 (a) If $T \subseteq \mathbb{R}$ is a bounded, open interval and $r \in [1,+\infty]$, then
\[ |||u|||_{W^{1,p}(T)} \overset{df}{=} \|u\|_r + \|u'\|_p \]
is equivalent to the usual norm $\|\cdot\|_{W^{1,p}(T)}$ on $W^{1,p}(T)$;
(b) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set with the cone property and $p \in [1,+\infty)$, then
\[ |||u|||_{W^{1,p}(Z)} \overset{df}{=} \|u\|_r + \|Du\|_p \]
is equivalent to the usual norm $\|\cdot\|_{W^{1,p}(Z)}$ on $W^{1,p}(Z)$ provided that $r \in [1,p^*]$ if $p < N$, $r \in [1,+\infty)$ if $p = N$ and $r \in [1,+\infty]$ if $p > N$.

Now let $Z \subseteq \mathbb{R}^N$ be a bounded, open set which is Lipschitz. Consider the Banach space $M(Z)$ of Radon measures on $Z$ with the total variation norm. Recall that $M(Z) = \big(C_0(Z)\big)^*$ (see Theorem 2.3.41). From Theorem 2.4.17 (see also Remark 2.5.18), we know that if $r > N$, then the embedding
\[ W_0^{1,r}(Z) \subseteq C_0(\overline{Z}) = \big\{ u \in C(\overline{Z}) : u|_{\partial Z} = 0 \big\} \]
is continuous and dense. So by virtue of Lemma 2.2.27(a), we have that
\[ M(Z) \subseteq W^{-1,r'}(Z), \]
with $\frac{1}{r}+\frac{1}{r'}=1$ (see Theorem 2.4.57). This observation is crucial in proving the next compactness result.


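THEOREM 2.5.25 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set with the cone property and $\{\mu_n\}_{n \ge 1} \subseteq M(Z)$ is a bounded sequence, then the sequence $\{\mu_n\}_{n \ge 1}$ is relatively compact in $W^{-1,r}(Z)$ for every $r \in [1, 1^*)$, where $1^* \overset{df}{=} \frac{N}{N-1}$.

PROOF By virtue of Theorem 2.3.46, we can find a subsequence $\{\mu_{n_k}\}_{k \ge 1}$ of the sequence $\{\mu_n\}_{n \ge 1}$ and $\mu \in M(Z)$, such that
\[ \mu_{n_k} \xrightarrow{w} \mu \quad \text{in } M(Z). \]
Let $r'$ be the conjugate exponent of $r \in [1, 1^*)$, i.e., $\frac{1}{r}+\frac{1}{r'}=1$. Then $r' > N$ and so by Theorem 2.5.17(c), the embedding $W_0^{1,r'}(Z) \subseteq C_0(\overline{Z})$ is compact. Let $\overline{B}_1(0)$ be the closed unit ball in $W_0^{1,r'}(Z)$. We see that $\overline{B}_1(0)$ is relatively compact in $C_0(\overline{Z})$. So given $\varepsilon > 0$ we can find a finite sequence $\{u_i\}_{i=1}^{m(\varepsilon)}$, such that for every $u \in \overline{B}_1(0)$, we have
\[ \min_{1 \le i \le m(\varepsilon)} \|u - u_i\|_{C_0(\overline{Z})} < \varepsilon. \qquad (2.49) \]
So, if $u \in \overline{B}_1(0)$, for some $i \in \{1,\dots,m(\varepsilon)\}$, we have
\[ \Big| \int_Z u\, d\mu_{n_k} - \int_Z u\, d\mu \Big| \le \Big| \int_Z (u - u_i)\, d\mu_{n_k} \Big| + \Big| \int_Z u_i\, d\mu_{n_k} - \int_Z u_i\, d\mu \Big| + \Big| \int_Z (u_i - u)\, d\mu \Big| \]
\[ \le 2\varepsilon \sup_{k \ge 1} |\mu_{n_k}|(Z) + \Big| \int_Z u_i\, d\mu_{n_k} - \int_Z u_i\, d\mu \Big|. \]
Since
\[ \mu_{n_k} \xrightarrow{w} \mu \quad \text{as } k \to +\infty, \]
we have that
\[ \Big| \int_Z u_i\, d\mu_{n_k} - \int_Z u_i\, d\mu \Big| \longrightarrow 0 \quad \text{as } k \to +\infty. \]
Therefore, we conclude that
\[ \lim_{k \to +\infty} \sup_{u \in \overline{B}_1(0)} \Big| \int_Z u\, d\mu_{n_k} - \int_Z u\, d\mu \Big| = 0, \]
so
\[ \mu_{n_k} \longrightarrow \mu \quad \text{in } W^{-1,r}(Z). \]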


Next we sharpen both the Rellich-Kondrachov theorem (see Theorem 2.5.17) and Egorov's theorem (see Theorem A.2.10). We show that a sequence bounded in $W^{1,r}(Z)$ ($r \in (1,N)$) has a subsequence converging uniformly outside a very small set. The set is not only small in the Lebesgue measure (as Egorov's theorem postulates; see Theorem A.2.10), but it is also small in $p$-capacity, for $p \in [1,r)$. First we introduce a notion, which is useful in the study of the pointwise properties of Sobolev functions.

DEFINITION 2.5.26 Suppose that $u \in L^1_{loc}(\mathbb{R}^N)$. Then
\[ u^*(z) \overset{df}{=} \begin{cases} \displaystyle \lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} u(y)\, dy & \text{if the limit exists,} \\ 0 & \text{otherwise} \end{cases} \]
is the precise representative of $u$.

REMARK 2.5.27 If $u, v \in L^1_{loc}(\mathbb{R}^N)$ and
\[ u(z) = v(z) \quad \text{for } \lambda^N\text{-almost all } z \in \mathbb{R}^N, \]
then
\[ u^*(z) = v^*(z) \quad \forall\, z \in \mathbb{R}^N. \]
Moreover, in view of the Lebesgue differentiation theorem (see Theorem 1.4.6), the limit in the definition of $u^*$ (see Definition 2.5.26) exists for $\lambda^N$-almost all $z \in \mathbb{R}^N$.

In the next theorem, we identify each function in the Sobolev space $W^{1,r}(Z)$ with its precise representative.

THEOREM 2.5.28 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $r \in (1,N)$ and $\{u_n\}_{n \ge 1} \subseteq W^{1,r}(Z)$ is bounded, then there exist a subsequence $\{u_{n_k}\}_{k \ge 1}$ of $\{u_n\}_{n \ge 1}$ and $u \in W^{1,r}(Z)$, such that for every $p \in [1,r)$ and every $\delta > 0$, there exists a relatively closed set $A_\delta \subseteq Z$, such that
\[ \operatorname{cap}_p\big(Z \setminus A_\delta\big) \le \delta \quad \text{and} \quad u_{n_k} \longrightarrow u \ \text{uniformly on } A_\delta. \]

PROOF We may assume that $u_n \in W_0^{1,r}(Z)$ for all $n \ge 1$. Indeed if this is not the case we choose a bounded, open set $U \supseteq \overline{Z}$ and a cut-off function $\varphi \in C_c^\infty(\mathbb{R}^N)$, such that
\[ \varphi|_Z = 1 \quad \text{and} \quad \varphi|_{U^c} = 0. \]
If $u \in W^{1,r}(Z)$ and $E(u) \in W^{1,r}(U)$ is the extension (see Theorem 2.4.55), then $\varphi E(u) \in W_0^{1,r}(U)$. Because by hypothesis the sequence $\{u_n\}_{n \ge 1} \subseteq W_0^{1,r}(Z)$ is bounded, by passing to a subsequence if necessary, we may assume that
\[ u_n \xrightarrow{w} u \ \text{in } W_0^{1,r}(Z), \qquad u_n \longrightarrow u \ \text{in } L^r(Z), \]
with $u \in W_0^{1,r}(Z)$ (see Theorem 2.5.17(a)). Fix $\delta, \varepsilon > 0$ and let
\[ C_n^\varepsilon \overset{df}{=} \big\{ z \in Z : |u_n(z) - u(z)| \ge \varepsilon \big\} \quad \text{and} \quad h_n^\varepsilon \overset{df}{=} \frac{2}{\varepsilon} \Big( |u_n - u| - \frac{\varepsilon}{2} \Big)^+. \]
Note that
\[ h_n^\varepsilon \in W_0^{1,r}(Z) \]
(see Proposition 2.4.33) and $h_n^\varepsilon \ge 1$ on $C_n^\varepsilon$. From Definition 1.6.1(d), by Hölder's and Poincaré's inequalities (see Theorem A.2.27), for $p \in [1,r)$, we have
\[ \operatorname{cap}_p\big(C_n^\varepsilon\big) \le \big\|Dh_n^\varepsilon\big\|_p^p \le \Big( \frac{2}{\varepsilon} \Big)^p \lambda^N\Big( \Big\{ |u_n - u| > \frac{\varepsilon}{2} \Big\} \Big)^{1-\frac{p}{r}} \big( \|Du_n\|_r + \|Du\|_r \big)^p \le c(\varepsilon)\, \|u_n - u\|_{L^r(Z)}^{r-p}. \]
We choose a subsequence $\{u_{n_k}\}_{k \ge 1}$ of $\{u_n\}_{n \ge 1}$, such that
\[ \sum_{k=1}^{\infty} \|u_{n_k} - u\|_r^{r-p} < +\infty. \]
We set
\[ D_m^i \overset{df}{=} \bigcup_{k=m}^{\infty} C_{n_k}^{\frac{1}{i}}. \]
Then since $\operatorname{cap}_p$ is an outer measure on $\mathbb{R}^N$ (see Theorem 1.6.10), we have
\[ \operatorname{cap}_p(D_m^i) \le \sum_{k=m}^{\infty} \operatorname{cap}_p\big( C_{n_k}^{\frac{1}{i}} \big) \le c\Big(\frac{1}{i}\Big) \sum_{k=m}^{\infty} \|u_{n_k} - u\|_r^{r-p} < \frac{\delta}{2^{i+1}}, \]
provided that $m = m(i) \ge 1$ is large enough. From Theorem 1.6.11(a), we know that we can find an open set $V_m^i \supseteq D_m^i$ with
\[ \operatorname{cap}_p(V_m^i) < \frac{\delta}{2^i}. \]
Set
\[ A_\delta \overset{df}{=} Z \setminus \bigcup_{i=1}^{\infty} V_{m(i)}^i. \]
Then $A_\delta \subseteq Z$ is relatively closed,
\[ \operatorname{cap}_p(Z \setminus A_\delta) \le \sum_{i=1}^{\infty} \operatorname{cap}_p\big( V_{m(i)}^i \big) \le \delta \]
and
\[ u_{n_k} \longrightarrow u \quad \text{uniformly on } A_\delta. \]

REMARK 2.5.29 The result in general fails if $p = r$.

Now we shall discuss how the continuous embedding
\[ W^{1,p}(Z) \subseteq L^{p^*}(Z) \]
(with $Z \subseteq \mathbb{R}^N$ and $p \in [1,N)$) fails to be compact (see Theorem 2.5.17). It has to do with the so-called “concentration phenomena.” To start having an idea about such situations, recall Proposition 2.3.38. There we saw that if
\[ u_n \xrightarrow{w} u \quad \text{in } L^1(Z) \]
and oscillates rapidly around its weak limit, then the sequence $\{u_n\}_{n \ge 1}$ cannot converge strongly in $L^1(Z)$. Moreover, if $p \in (1,+\infty)$,
\[ u_n \xrightarrow{w} u \quad \text{in } L^p(Z) \]
and we also know that
\[ u_n(z) \longrightarrow u(z) \quad \text{for a.a. } z \in Z; \]
still we cannot in general deduce strong convergence in $L^p(Z)$. The problem is that the mass of $|u_n - u|^p$ may coalesce onto a set of zero Lebesgue measure. This is the problem of “concentration.” For this reason, in contrast to the case $p = 1$, for $p > 1$ the best constant in the Sobolev inequality (see Theorem 2.5.3) is never achieved when $Z \subseteq \mathbb{R}^N$, $Z \ne \mathbb{R}^N$ is an open set which is Lipschitz and in particular is never achieved on a bounded Lipschitz domain. For this reason our analysis will be for $Z = \mathbb{R}^N$. So let $N > 1$ and let $p \in [1,N)$ and consider
\[ D^{1,p}(\mathbb{R}^N) \overset{df}{=} \big\{ u \in L^{p^*}(\mathbb{R}^N) : Du \in L^p\big(\mathbb{R}^N;\mathbb{R}^N\big) \big\}, \]
(where $p^* \overset{df}{=} \frac{Np}{N-p}$) furnished with the norm
\[ \|u\|_{D^{1,p}(\mathbb{R}^N)} = \|Du\|_p. \]
That this is a norm on $D^{1,p}(\mathbb{R}^N)$ follows from the Sobolev-Nirenberg-Gagliardo inequality (see Theorem 2.5.3; normed this way $D^{1,p}(\mathbb{R}^N)$ is a separable Banach space, Hilbert space if $p = 2$). The best constant $c > 0$ in that inequality is given by
\[ (c^{-1})^p = S = \inf_{\substack{u \in D^{1,p}(\mathbb{R}^N) \\ u \ne 0}} \frac{\|Du\|_p^p}{\|u\|_{p^*}^p} = \inf_{\substack{u \in D^{1,p}(\mathbb{R}^N) \\ \|u\|_{p^*} = 1}} \|Du\|_p^p. \]
The question is whether this infimum is realized by an element in $D^{1,p}(\mathbb{R}^N)$. So consider a minimizing sequence $\{u_n\}_{n \ge 1} \subseteq D^{1,p}(\mathbb{R}^N)$, i.e.,
\[ \|Du_n\|_p^p \longrightarrow S, \quad \text{with } \|u_n\|_{p^*} = 1 \quad \forall\, n \ge 1. \]
By passing to a subsequence if necessary, we may assume that
\[ u_n \xrightarrow{w} u \quad \text{in } D^{1,p}(\mathbb{R}^N) \]
and so
\[ \|Du\|_p^p \le \liminf_{n \to +\infty} \|Du_n\|_p^p. \]
This $u$ is a minimizer provided that $\|u\|_{p^*} = 1$. But since
\[ u_n \xrightarrow{w} u \quad \text{in } L^{p^*}(\mathbb{R}^N), \]
we only know that $\|u\|_{p^*} \le 1$. Note that if $v \in D^{1,p}(\mathbb{R}^N)$, $y \in \mathbb{R}^N$ and $\lambda > 0$, then the rescaled function
\[ v^{y,\lambda}(z) = \lambda^{\frac{N-p}{p}}\, v(\lambda z + y) \]
satisfies
\[ \big\|Dv^{y,\lambda}\big\|_p = \|Dv\|_p \quad \text{and} \quad \big\|v^{y,\lambda}\big\|_{p^*} = \|v\|_{p^*}. \]
So the problem is invariant under translations and dilations. In order to avoid noncompactness of the minimizing sequence (hence achieve $\|u\|_{p^*} = 1$) we need the following result, known as the Concentration-Compactness Lemma. In what follows we shall regard $L^1(\mathbb{R}^N)$ in a natural way as a subset of $M(\mathbb{R}^N)$, by associating to $u \in L^1(\mathbb{R}^N)$ the measure
\[ \mu(A) = \int_A u(z)\, dz, \]
i.e., $d\mu = u\, dz$.
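The dilation invariance noted above comes down to two exponents of $\lambda$ cancelling. For a rescaling $z \mapsto \lambda^{a} v(\lambda z + y)$, the chain rule and the substitution $w = \lambda z + y$ give $\|Dv^{y,\lambda}\|_p^p = \lambda^{(a+1)p - N} \|Dv\|_p^p$ and $\|v^{y,\lambda}\|_{p^*}^{p^*} = \lambda^{a p^* - N} \|v\|_{p^*}^{p^*}$. The short Python sketch below is illustrative and not from the original text; the function name `dilation_exponents` is hypothetical. It checks with exact rational arithmetic that both powers vanish for $a = \frac{N-p}{p}$.

```python
from fractions import Fraction


def dilation_exponents(N, p):
    """Powers of lambda picked up by ||D v^{y,lambda}||_p^p and ||v^{y,lambda}||_{p*}^{p*}.

    Assumes 1 <= p < N and the amplitude exponent a = (N - p) / p.
    """
    N, p = Fraction(N), Fraction(p)
    if not (1 <= p < N):
        raise ValueError("requires 1 <= p < N")
    a = (N - p) / p                # amplitude exponent in the rescaled function
    p_star = N * p / (N - p)       # critical Sobolev exponent
    grad_power = (a + 1) * p - N   # chain rule gives lambda^(a+1); substitution gives lambda^(-N)
    value_power = a * p_star - N   # amplitude gives lambda^(a p*); substitution gives lambda^(-N)
    return grad_power, value_power


if __name__ == "__main__":
    for N, p in [(3, 2), (4, 1), (5, Fraction(5, 2))]:
        print(N, p, dilation_exponents(N, p))
```

Both returned powers are zero for every admissible pair, which is why a minimizing sequence can lose compactness through dilations as well as translations.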


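THEOREM 2.5.30 If $p \in [1,N)$,
\[ u_n \xrightarrow{w} u \quad \text{in } D^{1,p}(\mathbb{R}^N), \]
\[ \widehat{\mu}_n = \|Du_n\|_{\mathbb{R}^N}^p \xrightarrow{w} \mu \quad \text{and} \quad \widehat{\nu}_n = |u_n|^{p^*} \xrightarrow{w} \nu \quad \text{in } M(\mathbb{R}^N), \]
with $\mu, \nu \in M(\mathbb{R}^N)$, $\mu, \nu \ge 0$, then
(a) there exists an at most countable index set $I$, a family $\{z_i\}_{i \in I}$ of distinct points in $\mathbb{R}^N$ and nonnegative numbers $\{\mu_i, \nu_i\}_{i \in I}$, such that
\[ \mu \ge \|Du\|_{\mathbb{R}^N}^p + \sum_{i \in I} \mu_i \delta_{z_i} \quad \text{and} \quad \nu = |u|^{p^*} + \sum_{i \in I} \nu_i \delta_{z_i}; \]
(b) for all $i \in I$,
\[ S\, \nu_i^{\frac{p}{p^*}} \le \mu_i \]
and in particular
\[ \sum_{i \in I} \nu_i^{\frac{p}{p^*}} < +\infty. \]

PROOF First suppose that $u = 0$. Let $\vartheta \in C_c^\infty(\mathbb{R}^N)$. We have
\[ \int_{\mathbb{R}^N} |\vartheta u_n|^{p^*}\, dz \le S^{-\frac{p^*}{p}} \Big( \int_{\mathbb{R}^N} \big\|D(\vartheta u_n)\big\|_{\mathbb{R}^N}^p\, dz \Big)^{\frac{p^*}{p}} \quad \forall\, n \ge 1. \qquad (2.50) \]
Since
\[ |u_n|^{p^*} \xrightarrow{w} \nu \quad \text{in } M(\mathbb{R}^N), \]
we have that
\[ \int_{\mathbb{R}^N} |\vartheta u_n|^{p^*}\, dz \longrightarrow \int_{\mathbb{R}^N} |\vartheta|^{p^*}\, d\nu. \qquad (2.51) \]
Also, using the facts that
\[ u_n \longrightarrow u \ \text{in } L^p_{loc}(\mathbb{R}^N) \]
(see Theorem 2.5.17(a)) and
\[ \|Du_n\|_{\mathbb{R}^N}^p \xrightarrow{w} \mu \quad \text{in } M(\mathbb{R}^N), \]
we have that
\[ \liminf_{n \to +\infty} \Big( \int_{\mathbb{R}^N} \big\|D(\vartheta u_n)\big\|_{\mathbb{R}^N}^p\, dz \Big)^{\frac{p^*}{p}} = \liminf_{n \to +\infty} \Big( \int_{\mathbb{R}^N} |\vartheta|^p \|Du_n\|_{\mathbb{R}^N}^p\, dz \Big)^{\frac{p^*}{p}} = \Big( \int_{\mathbb{R}^N} |\vartheta|^p\, d\mu \Big)^{\frac{p^*}{p}}. \qquad (2.52) \]
From (2.50), (2.51) and (2.52), we infer that in the limit as $n \to +\infty$, we have
\[ \Big( \int_{\mathbb{R}^N} |\vartheta|^{p^*}\, d\nu \Big)^{\frac{1}{p^*}} \le S^{-\frac{1}{p}} \Big( \int_{\mathbb{R}^N} |\vartheta|^p\, d\mu \Big)^{\frac{1}{p}} \quad \forall\, \vartheta \in C_c^\infty(\mathbb{R}^N). \qquad (2.53) \]
From (2.53), it follows that for all compact sets $K \subseteq \mathbb{R}^N$, we have
\[ \nu(K)^{\frac{1}{p^*}} \le S^{-\frac{1}{p}}\, \mu(K)^{\frac{1}{p}}. \qquad (2.54) \]
Because the measures are Radon, from (2.54), we deduce that
\[ \nu(A)^{\frac{1}{p^*}} \le S^{-\frac{1}{p}}\, \mu(A)^{\frac{1}{p}} \quad \text{for all Borel sets } A \subseteq \mathbb{R}^N. \qquad (2.55) \]
From Saks Lemma (see Theorem A.2.13), we know that
\[ \mu = \mu_0 + \mu_a, \]
with $\mu_0$ nonatomic measure, $\mu_0 \ge 0$, $\mu_a$ purely atomic and
\[ \mu_a = \sum_{i \in I} \mu_i \delta_{z_i}, \]
where $I$ is a countable index set, $\{\mu_i\}_{i \in I} \subseteq \mathbb{R}_+ \setminus \{0\}$ and $\{z_i\}_{i \in I} \subseteq \mathbb{R}^N$. Because of (2.55), we see that $\nu \ll \mu$ and so from the Radon-Nikodym theorem (see Theorem A.2.24), we have that
\[ \nu(A) = \int_A \frac{d\nu}{d\mu}\, d\mu \quad \text{for all Borel sets } A \subseteq \mathbb{R}^N, \qquad (2.56) \]
where
\[ \frac{d\nu}{d\mu} \in L^1(\mathbb{R}^N;\mu) \]
is the Radon-Nikodym derivative (see Remark A.2.25). So
\[ \frac{d\nu}{d\mu}(z) = \lim_{r \searrow 0} \frac{\nu(\overline{B}_r(z))}{\mu(\overline{B}_r(z))} \quad \text{for } \mu\text{-a.a. } z \in \mathbb{R}^N. \qquad (2.57) \]
From (2.55), we see that
\[ \frac{\nu(\overline{B}_r(z))}{\mu(\overline{B}_r(z))} \le S^{-\frac{p^*}{p}}\, \mu\big(\overline{B}_r(z)\big)^{\frac{p}{N-p}}, \qquad (2.58) \]
provided $\mu\big(\overline{B}_r(z)\big) \ne 0$. From (2.57) and (2.58), it follows that
\[ \frac{d\nu}{d\mu}(z) = 0 \quad \text{for a.a. } z \in \operatorname{supp} \mu_0 = \mathbb{R}^N \setminus \{z_i\}_{i \in I}. \qquad (2.59) \]
Set
\[ \nu_i = \frac{d\nu}{d\mu}(z_i)\, \mu_i. \]
From (2.56), (2.57), (2.58) and (2.59), we see that the theorem holds when $u = 0$. Now suppose that $u \ne 0$. Set
\[ w_n \overset{df}{=} u_n - u. \]
Then the previous calculations apply to $\{w_n\}_{n \ge 1}$. Moreover, by virtue of Proposition 2.3.49, we have
\[ \|Dw_n\|_p^p = \|Du_n\|_p^p - \|Du\|_p^p + \varepsilon_n \quad \text{with } \varepsilon_n \searrow 0, \]
so
\[ \|Dw_n\|_{\mathbb{R}^N}^p \xrightarrow{w} \mu - \|Du\|_{\mathbb{R}^N}^p = \widehat{\mu} \ge 0 \quad \text{in } M(\mathbb{R}^N). \]
Similarly, we have
\[ |w_n|^{p^*} \xrightarrow{w} \nu - |u|^{p^*} = \widehat{\nu} \quad \text{in } M(\mathbb{R}^N). \]
So, we are back to the case $u = 0$, with $u_n$ replaced by $w_n$, $\mu$ replaced by $\widehat{\mu}$ and $\nu$ replaced by $\widehat{\nu}$.

COROLLARY 2.5.31 If $p \in [1,N)$,
\[ u_n \xrightarrow{w} 0 \quad \text{in } D^{1,p}(\mathbb{R}^N), \]
\[ \mu_n = \|Du_n\|_{\mathbb{R}^N}^p \xrightarrow{w} \mu \quad \text{and} \quad \nu_n = |u_n|^{p^*} \xrightarrow{w} \nu \quad \text{in } M(\mathbb{R}^N), \]
with $\mu, \nu \ge 0$ and
\[ \mu(\mathbb{R}^N)^{\frac{1}{p}} \le S^{\frac{1}{p}}\, \nu(\mathbb{R}^N)^{\frac{1}{p^*}}, \]
then $\nu$ is a Dirac measure.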


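PROOF From (2.55) and the hypotheses, we see that
\[ \mu(\mathbb{R}^N)^{\frac{1}{p}} = S^{\frac{1}{p}}\, \nu(\mathbb{R}^N)^{\frac{1}{p^*}}. \]
Also from (2.53) and Hölder's inequality, we have that
\[ \Big( \int_{\mathbb{R}^N} |\vartheta|^{p^*}\, d\nu \Big)^{\frac{1}{p^*}} \le S^{-\frac{1}{p}}\, \mu(\mathbb{R}^N)^{\frac{1}{N}} \Big( \int_{\mathbb{R}^N} |\vartheta|^{p^*}\, d\mu \Big)^{\frac{1}{p^*}} \quad \forall\, \vartheta \in C_c^\infty(\mathbb{R}^N). \]
Thus we infer that
\[ \nu = S^{-\frac{p^*}{p}}\, \mu(\mathbb{R}^N)^{\frac{p}{N-p}}\, \mu \]
and (2.53) becomes
\[ \Big( \int_{\mathbb{R}^N} |\vartheta|^{p^*}\, d\nu \Big)^{\frac{1}{p^*}} \big( \nu(\mathbb{R}^N) \big)^{\frac{1}{N}} \le \Big( \int_{\mathbb{R}^N} |\vartheta|^p\, d\nu \Big)^{\frac{1}{p}}, \]
so
\[ \nu(A)^{\frac{1}{p^*}}\, \nu(\mathbb{R}^N)^{\frac{1}{N}} \le \nu(A)^{\frac{1}{p}} \quad \text{for all Borel sets } A \subseteq \mathbb{R}^N. \]
But this is impossible if $\nu$ is not a Dirac measure.

We return to the problem of determining the best Sobolev constant, i.e.,
\[ S = \inf_{\substack{u \in D^{1,p}(\mathbb{R}^N) \\ \|u\|_{p^*} = 1}} \|Du\|_p^p. \qquad (2.60) \]
We have the following existence result for problem (2.60). The result is due to Lions (1985a, 1985a) and its proof, which also uses Theorem 2.5.30, can be found there.

THEOREM 2.5.32 If $p \in (1,N)$ and $\{u_n\}_{n \ge 1} \subseteq D^{1,p}(\mathbb{R}^N)$ is a minimizing sequence for problem (2.60), then $\{u_n\}_{n \ge 1}$ up to translation and dilation is relatively compact in $D^{1,p}(\mathbb{R}^N)$, i.e., there exists a sequence $\{(z_n, \lambda_n)\}_{n \ge 1} \subseteq \mathbb{R}^N \times (\mathbb{R}_+ \setminus \{0\})$, such that the sequence
\[ u_n^{z_n,\lambda_n}(z) = \lambda_n^{\frac{N-p}{p}}\, u_n(\lambda_n z + z_n) \quad \forall\, z \in \mathbb{R}^N \]
is relatively compact in $D^{1,p}(\mathbb{R}^N)$.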

REMARK 2.5.33 If $p = 2$, the function
\[ u(z) \overset{df}{=} \frac{[N(N-2)]^{\frac{N-2}{4}}}{\big(1 + \|z\|_{\mathbb{R}^N}^2\big)^{\frac{N-2}{2}}} \]
is a minimizer for problem (2.60) (see Aubin (1976) and Talenti (1976)). If $Z \subseteq \mathbb{R}^N$ is any open set (not necessarily equal to $\mathbb{R}^N$) and by $S(Z)$ we denote the value of problem (2.60) when $\mathbb{R}^N$ is replaced by $Z$, then $S(Z) = S$, but $S(Z)$ is never attained if $Z \ne \mathbb{R}^N$. Finally if $p = 1$, then the best constant for the embedding
\[ D^{1,1}(\mathbb{R}^N) \subseteq L^{\frac{N}{N-1}}(\mathbb{R}^N) \]
is attained on the characteristic functions of balls, that is in $BV_{loc}(\mathbb{R}^N)$ (see Section 2.6).

Since we are in the business of determining sharp constants in inequalities, let us check to see what happens with the constant in the Poincaré-Wirtinger inequality (see Theorem 2.5.21) for Sobolev functions of one variable. Let $T = (0,b)$ ($b < +\infty$) and $p \in (1,+\infty)$. We introduce the space
\[ W_{per}^{1,p}\big(T;\mathbb{R}^N\big) \overset{df}{=} \big\{ u \in W^{1,p}\big(T;\mathbb{R}^N\big) : u(0) = u(b) \big\}. \]
From Theorem 2.5.22(b), we have that $W_{per}^{1,p}\big(T;\mathbb{R}^N\big)$ is embedded continuously (in fact compactly) in $C\big(\overline{T};\mathbb{R}^N\big)$. Therefore the evaluations at $t = b$ and $t = 0$ make sense.

PROPOSITION 2.5.34 If $u \in W_{per}^{1,p}\big(T;\mathbb{R}^N\big)$ ($p \in (1,+\infty)$) and $\int_0^b u(t)\, dt = 0$, then
\[ \|u\|_\infty \le b^{\frac{1}{p'}}\, \|u'\|_p. \]

PROOF Arguing on each component separately, we may assume without any loss of generality that $N = 1$. Then from the mean value theorem for integrals, we can find $\tau \in T = (0,b)$, such that
\[ u(\tau) = \frac{1}{b} \int_0^b u(s)\, ds = 0. \]
By Hölder's inequality (see Theorem A.2.27), with $\frac{1}{p}+\frac{1}{p'}=1$, we have
\[ |u(t)| = \Big| \int_\tau^t u'(s)\, ds \Big| \le \int_0^b |u'(s)|\, ds \le b^{\frac{1}{p'}}\, \|u'\|_p \quad \forall\, t \in [0,b], \]
so
\[ \|u\|_\infty \le b^{\frac{1}{p'}}\, \|u'\|_p. \]

238

Nonlinear Analysis

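In the case of the Hilbert space $W^{1,2}_{per}(T; \mathbb{R}^N)$, we have the following sharp estimates.

PROPOSITION 2.5.35 If $u \in W^{1,2}_{per}(T; \mathbb{R}^N)$ and $\int_0^b u(t)\, dt = 0$, then

(a) $\|u\|_2^2 \le \dfrac{b^2}{4\pi^2}\, \|u'\|_2^2$;

(b) $\|u\|_\infty^2 \le \dfrac{b}{12}\, \|u'\|_2^2$.

PROOF Again we may assume that $N = 1$.

(a) We consider the Fourier expansion of $u$, i.e.,
\[
u(t) = \sum_{\substack{k=-\infty \\ k \ne 0}}^{+\infty} a_k \exp\Big( \frac{2 i \pi k t}{b} \Big).
\]
The term with $k = 0$ is absent because $u$ has mean value zero. Parseval's equality implies that
\[
\|u'\|_2^2 = \sum_{\substack{k=-\infty \\ k \ne 0}}^{+\infty} b\, \frac{4\pi^2 k^2}{b^2}\, |a_k|^2 \ \ge\ \frac{4\pi^2}{b^2} \sum_{\substack{k=-\infty \\ k \ne 0}}^{+\infty} b\, |a_k|^2 = \frac{4\pi^2}{b^2}\, \|u\|_2^2 .
\]

(b) Using the Cauchy-Schwarz-Bunyakowski inequality (see Proposition A.4.5 and Remark A.4.6), Parseval's equality and the fact that
\[
\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6},
\]
for every $t \in [0, b]$, we have
\[
|u(t)|^2 \le \Big( \sum_{\substack{k=-\infty \\ k \ne 0}}^{+\infty} |a_k| \Big)^2 \le \Big( \sum_{\substack{k=-\infty \\ k \ne 0}}^{+\infty} \frac{b}{4\pi^2 k^2} \Big) \Big( \sum_{\substack{k=-\infty \\ k \ne 0}}^{+\infty} \frac{4\pi^2 k^2}{b}\, |a_k|^2 \Big) = \frac{b}{12}\, \|u'\|_2^2 .
\]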
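The two estimates of Proposition 2.5.35 can be sanity-checked numerically on zero-mean trigonometric polynomials. The following is only an illustrative sketch: the helper names `norms` and `check`, the grid size and the tolerance are assumptions made for this example and are not part of the text. The sup norm is sampled on a grid, so the check of (b) is slightly weaker than the statement itself.

```python
import math

def norms(coeffs, b, n=4096):
    """Return (||u||_2^2, ||u||_inf^2, ||u'||_2^2) on (0, b) for the zero-mean
    b-periodic trigonometric polynomial
        u(t) = sum over (k, A, B) of A*cos(2*pi*k*t/b) + B*sin(2*pi*k*t/b), k >= 1.
    Integrals use the rectangle rule on a uniform grid; the sup is taken over
    the grid points only, so it may slightly underestimate the true sup norm."""
    h = b / n
    l2 = dl2 = sup = 0.0
    for j in range(n):
        t = j * h
        u = du = 0.0
        for k, A, B in coeffs:
            w = 2.0 * math.pi * k / b
            u += A * math.cos(w * t) + B * math.sin(w * t)
            du += w * (B * math.cos(w * t) - A * math.sin(w * t))
        l2 += u * u * h
        dl2 += du * du * h
        sup = max(sup, abs(u))
    return l2, sup * sup, dl2

def check(coeffs, b, tol=1e-9):
    """Test inequalities (a) and (b) of Proposition 2.5.35 up to a tolerance."""
    l2, sup2, dl2 = norms(coeffs, b)
    ok_a = l2 <= b * b / (4.0 * math.pi ** 2) * dl2 + tol
    ok_b = sup2 <= b / 12.0 * dl2 + tol
    return ok_a, ok_b
```

For the single mode $u(t) = \sin(2\pi t / b)$, estimate (a) holds with equality, which is consistent with the constant $b^2/(4\pi^2)$ being sharp.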

2.6 Fine Properties of Functions and BV-Functions

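In this section we establish some further differentiability properties of Sobolev functions and also introduce the space of functions of bounded variation ($BV$-functions) and establish some of their basic properties. We start with a result on the $L^{p^*}$-differentiability of Sobolev functions.

PROPOSITION 2.6.1 If $u \in W^{1,p}_{loc}(\mathbb{R}^N)$ with $p \in [1, N)$, then for $\lambda^N$-almost all $z \in \mathbb{R}^N$, we have
\[
\Big( \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \big| u(y) - u(z) - (Du(z), y - z)_{\mathbb{R}^N} \big|^{p^*} dy \Big)^{\frac{1}{p^*}} = o(r) \qquad \text{as } r \searrow 0.
\]

PROOF From Theorem 1.4.6, we know that for $\lambda^N$-almost all $z \in \mathbb{R}^N$, we have
\[
\lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} |u(y) - u(z)|^p\, dy = 0
\]
and
\[
\lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \|Du(y) - Du(z)\|_{\mathbb{R}^N}^p\, dy = 0.
\]
We fix such a point $z \in \mathbb{R}^N$ (known as a Lebesgue point for the functions $u$ and $Du$). Clearly, exploiting the translation invariance of the Lebesgue measure $\lambda^N$, we can take $z = 0$. We choose $\vartheta \in C_c^1(\overline{B}_r(0))$ with $\|\vartheta\|_{p'} \le 1$ (here $\frac{1}{p} + \frac{1}{p'} = 1$). Let $\varphi$ be a mollifier (see Definition 2.4.10) and for every $\varepsilon > 0$, set
\[
u_\varepsilon \stackrel{df}{=} \varphi_\varepsilon * u.
\]
Choose $y \in \overline{B}_r(0)$ and let $h(t) = u_\varepsilon(ty)$. Then
\[
h(1) = h(0) + \int_0^1 h'(s)\, ds,
\]
so
\[
\begin{aligned}
u_\varepsilon(y) &= u_\varepsilon(0) + \int_0^1 \big( Du_\varepsilon(sy), y \big)_{\mathbb{R}^N} ds \\
&= u_\varepsilon(0) + \big( Du(0), y \big)_{\mathbb{R}^N} + \int_0^1 \big( Du_\varepsilon(sy) - Du(0), y \big)_{\mathbb{R}^N} ds.
\end{aligned}
\tag{2.61}
\]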

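Using Fubini's theorem and a change of variables, we have
\[
\begin{aligned}
&\frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \vartheta(y) \big( u_\varepsilon(y) - u_\varepsilon(0) - (Du(0), y)_{\mathbb{R}^N} \big)\, dy \\
&\quad = \int_0^1 \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \vartheta(y) \big( Du_\varepsilon(sy) - Du(0), y \big)_{\mathbb{R}^N} dy\, ds \\
&\quad = \int_0^1 \frac{1}{s\, \lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \vartheta\Big( \frac{y}{s} \Big) \big( Du_\varepsilon(y) - Du(0), y \big)_{\mathbb{R}^N} dy\, ds.
\end{aligned}
\]
Letting $\varepsilon \searrow 0$, in the limit we obtain
\[
\begin{aligned}
&\frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \vartheta(y) \big( u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big)\, dy \\
&\quad = \int_0^1 \frac{1}{s\, \lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \vartheta\Big( \frac{y}{s} \Big) \big( Du(y) - Du(0), y \big)_{\mathbb{R}^N} dy\, ds \\
&\quad \le r \int_0^1 \Big( \frac{1}{\lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \Big| \vartheta\Big( \frac{y}{s} \Big) \Big|^{p'} dy \Big)^{\frac{1}{p'}} \Big( \frac{1}{\lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \|Du(y) - Du(0)\|_{\mathbb{R}^N}^p\, dy \Big)^{\frac{1}{p}} ds.
\end{aligned}
\]
Here the last step uses Hölder's inequality together with $\|y\|_{\mathbb{R}^N} \le rs$ on $\overline{B}_{rs}(0)$.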

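Note that
\[
\frac{1}{\lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \Big| \vartheta\Big( \frac{y}{s} \Big) \Big|^{p'} dy = \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} |\vartheta(y)|^{p'}\, dy \le \frac{1}{a(N)\, r^N}
\]
(see Remark 1.3.22). So we obtain
\[
\frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \vartheta(y) \big( u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big)\, dy = o\big( r^{1 - \frac{N}{p'}} \big) \qquad \text{as } r \searrow 0.
\]
Taking the supremum over all $\vartheta$, we obtain
\[
\frac{1}{r^N} \Big( \int_{\overline{B}_r(0)} \big| u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big|^p\, dy \Big)^{\frac{1}{p}} = o\big( r^{1 - \frac{N}{p'}} \big) \qquad \text{as } r \searrow 0,
\]
so
\[
\Big( \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \big| u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big|^p\, dy \Big)^{\frac{1}{p}} = o(r) \qquad \text{as } r \searrow 0.
\tag{2.62}
\]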

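Set
\[
h(y) \stackrel{df}{=} u(y) - u(0) - \big( Du(0), y \big)_{\mathbb{R}^N},
\]
so $h \in W^{1,p}(B_r(0))$, and consider its extension $E(h) \in W^{1,p}(\mathbb{R}^N)$. We have
\[
\|E(h)\|_{W^{1,p}(\mathbb{R}^N)} \le c_1 \|h\|_{W^{1,p}(B_r(0))},
\tag{2.63}
\]
for some $c_1 > 0$ (see Theorem 2.4.55). Then, via Sobolev's inequality (see Theorem 2.5.3) and (2.63), we have
\[
\begin{aligned}
\Big( \int_{\overline{B}_r(0)} |h(y)|^{p^*}\, dy \Big)^{\frac{1}{p^*}}
&\le \Big( \int_{\mathbb{R}^N} |E(h)(y)|^{p^*}\, dy \Big)^{\frac{1}{p^*}} \\
&\le c_2 \Big( \int_{\mathbb{R}^N} \|DE(h)(y)\|_{\mathbb{R}^N}^p\, dy \Big)^{\frac{1}{p}} \\
&\le c_3 \Big( \int_{\overline{B}_r(0)} \big( |h(y)|^p + \|Dh(y)\|_{\mathbb{R}^N}^p \big)\, dy \Big)^{\frac{1}{p}} .
\end{aligned}
\tag{2.64}
\]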

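Therefore using (2.62) and (2.64), we conclude that
\[
\begin{aligned}
&\Big( \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \big| u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big|^{p^*} dy \Big)^{\frac{1}{p^*}} \\
&\quad \le c_4\, r \Big( \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \|Du(y) - Du(0)\|_{\mathbb{R}^N}^p\, dy \Big)^{\frac{1}{p}} \\
&\qquad + c_4 \Big( \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \big| u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big|^p\, dy \Big)^{\frac{1}{p}} \\
&\quad = o(r) \qquad \text{as } r \searrow 0.
\end{aligned}
\]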

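For differentiability $\lambda^N$-almost everywhere, as probably expected, we consider the case $p \in (N, +\infty]$.

PROPOSITION 2.6.2 If $u \in W^{1,p}_{loc}(\mathbb{R}^N)$ with $p \in (N, +\infty]$, then $u$ is differentiable $\lambda^N$-almost everywhere and the derivative equals the distributional derivative $\lambda^N$-almost everywhere.

PROOF Since $W^{1,\infty}_{loc}(\mathbb{R}^N) \subseteq W^{1,p}_{loc}(\mathbb{R}^N)$ for any $p < +\infty$, we may assume that $p \in (N, +\infty)$. For $\lambda^N$-almost all $z \in \mathbb{R}^N$, we have
\[
\lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \|Du(y) - Du(z)\|_{\mathbb{R}^N}^p\, dy = 0.
\tag{2.65}
\]
Choose $z \in \mathbb{R}^N$, such that (2.65) holds. Set
\[
h(y) \stackrel{df}{=} u(y) - u(z) - \big( Du(z), y - z \big)_{\mathbb{R}^N} \qquad \forall\ y \in \overline{B}_r(z).
\]
Using Morrey's inequality (see Theorem 2.5.12(a)), we have
\[
|h(y) - h(z)| \le c\, r \Big( \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \|Dh(y)\|_{\mathbb{R}^N}^p\, dy \Big)^{\frac{1}{p}},
\]
with $r = \|y - z\|_{\mathbb{R}^N}$. Since
\[
h(z) = 0 \qquad \text{and} \qquad Dh = Du - Du(z),
\]
using (2.65), we obtain
\[
\frac{\big| u(y) - u(z) - (Du(z), y - z)_{\mathbb{R}^N} \big|}{\|y - z\|_{\mathbb{R}^N}} \le c \Big( \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \|Dh(y)\|_{\mathbb{R}^N}^p\, dy \Big)^{\frac{1}{p}} \longrightarrow 0 \qquad \text{as } y \to z,
\]
so $u$ is $\lambda^N$-almost everywhere differentiable and $\nabla u(z) = Du(z)$ for a.a. $z \in \mathbb{R}^N$.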

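Next we investigate the properties of a Sobolev function $u$ (or more exactly of its precise representation $u^*$ (see Definition 2.5.26)) along lines. In this direction we have the following result.

PROPOSITION 2.6.3
(a) If $u \in W^{1,p}_{loc}(\mathbb{R}^N)$, $p \in [1, +\infty)$, then for each $k \in \{1, \ldots, N\}$ the function
\[
u_k^*(z', t) = u^*\big( z_1, \ldots, z_{k-1}, t, z_{k+1}, \ldots, z_N \big)
\]
is locally absolutely continuous in $t$ for $\lambda^{N-1}$-almost all $z' = (z_i)_{i=1, i \ne k}^N \in \mathbb{R}^{N-1}$ (see Definition A.2.15(b)). Moreover $(u_k^*)' \in L^p_{loc}(\mathbb{R}^N)$.
(b) If $u \in L^p_{loc}(\mathbb{R}^N)$ and $u = h$ $\lambda^N$-almost everywhere, where for each $k \in \{1, \ldots, N\}$, the function
\[
h_k(z', t) \stackrel{df}{=} h\big( z_1, \ldots, z_{k-1}, t, z_{k+1}, \ldots, z_N \big)
\]
is locally absolutely continuous in $t$ for $\lambda^{N-1}$-almost all $z' = (z_i)_{i=1, i \ne k}^N \in \mathbb{R}^{N-1}$ and $h_k' \in L^p_{loc}(\mathbb{R}^N)$, then $u \in W^{1,p}_{loc}(\mathbb{R}^N)$.

PROOF (a) Clearly we may assume that $k = N$. Set $u_\varepsilon = \varphi_\varepsilon * u$ with $\{\varphi_\varepsilon\}_{\varepsilon > 0}$ being a family of mollifiers. We know that
\[
u_\varepsilon \longrightarrow u \quad \text{in } W^{1,p}_{loc}(\mathbb{R}^N)
\]
(see Proposition 2.4.12(e)). For every $M > 0$ and $\lambda^{N-1}$-almost all $z' = (z_i)_{i=1}^{N-1} \in \mathbb{R}^{N-1}$, from Fubini's theorem, we have
\[
\int_{-M}^{M} \Big( |u_\varepsilon(z', t) - u(z', t)|^p + \Big| \frac{\partial u_\varepsilon}{\partial z_N}(z', t) - \frac{\partial u}{\partial z_N}(z', t) \Big|^p \Big)\, dt \longrightarrow 0 \qquad \text{as } \varepsilon \searrow 0.
\]
Let
\[
u_{N,\varepsilon}(t) \stackrel{df}{=} u_\varepsilon(z', t).
\]
Then
\[
u_{N,\varepsilon} \longrightarrow u_N \quad \text{in } W^{1,p}_{loc}(\mathbb{R}) \qquad \text{as } \varepsilon \searrow 0
\]
and so also locally uniformly to a locally absolutely continuous function $u_N$ with $u_N'(t) = D_N u(z', t)$ for $\lambda^1$-a.a. $t \in \mathbb{R}$. Also from Theorems 1.6.18, 1.6.13(b) and Remark 1.6.14, we have that
\[
u_\varepsilon \longrightarrow u^* \qquad \mu^{(N-1)}\text{-a.e.},
\]
so from Proposition 1.3.25, we have $u_{N,\varepsilon}(t) \longrightarrow u^*(z', t)$ for $\lambda^{N-1}$-a.a. $z' \in \mathbb{R}^{N-1}$ and all $t \in \mathbb{R}$. Thus
\[
u_N(t) = u^*(z', t) \qquad \text{for } \lambda^{N-1}\text{-a.a. } z' \in \mathbb{R}^{N-1} \text{ and all } t \in \mathbb{R}.
\]

(b) For each $\vartheta \in C_c^1(\mathbb{R}^N)$, we have
\[
\begin{aligned}
\int_{\mathbb{R}^N} u\, \frac{\partial \vartheta}{\partial z_k}\, dz = \int_{\mathbb{R}^N} h\, \frac{\partial \vartheta}{\partial z_k}\, dz
&= \int_{\mathbb{R}^{N-1}} \Big( \int_{-\infty}^{+\infty} h_k(z', t)\, \vartheta'(z', t)\, dt \Big)\, dz' \\
&= - \int_{\mathbb{R}^{N-1}} \Big( \int_{-\infty}^{+\infty} h_k'(z', t)\, \vartheta(z', t)\, dt \Big)\, dz' = - \int_{\mathbb{R}^N} h_k'\, \vartheta\, dz,
\end{aligned}
\]
where the prime denotes differentiation with respect to $t$. So
\[
D_k u(z) = h_k'(z) \qquad \text{for } \lambda^N\text{-a.a. } z \in \mathbb{R}^N \text{ and all } k \in \{1, \ldots, N\},
\]
thus $u \in W^{1,p}_{loc}(\mathbb{R}^N)$.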

Before we start discussing $BV$-functions, let us prove a result on the superposition operator defined on a Sobolev space. More precisely, let $Z \subseteq \mathbb{R}^N$ be an open set and let $\xi : \mathbb{R} \longrightarrow \mathbb{R}$ be a Lipschitz continuous function. If $Z$ is unbounded, we also assume that $\xi(0) = 0$. From Proposition 2.4.25, we know that if $u \in W^{1,p}(Z)$, then $\xi \circ u \in W^{1,p}(Z)$. So we can define the map $N_\xi : W^{1,p}(Z) \longrightarrow W^{1,p}(Z)$, by
\[
N_\xi(u) \stackrel{df}{=} \xi \circ u \qquad \forall\ u \in W^{1,p}(Z).
\]

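PROPOSITION 2.6.4 If $p \in (1, +\infty)$, then $N_\xi : W^{1,p}(Z) \longrightarrow W^{1,p}(Z)$ is continuous.

PROOF Suppose that $u_n \longrightarrow u$ in $W^{1,p}(Z)$. Then
\[
\xi(u_n) \longrightarrow \xi(u) \quad \text{in } L^p(Z).
\]
Also from Proposition 2.4.25, we know that
\[
D(\xi \circ u_n)(z) = (\xi^* \circ u_n)(z)\, Du_n(z) \qquad \text{for a.a. } z \in Z,
\]
with a bounded Borel measurable function $\xi^* : \mathbb{R} \longrightarrow \mathbb{R}$, such that
\[
\xi^*(t) = \xi'(t) \qquad \text{for a.a. } t \in \mathbb{R}.
\]
So the sequence $\{D(\xi \circ u_n)\}_{n \ge 1} \subseteq L^p(Z; \mathbb{R}^N)$ is bounded and it follows that
\[
D(\xi \circ u_n) \stackrel{w}{\longrightarrow} D(\xi \circ u) \quad \text{in } L^p(Z; \mathbb{R}^N).
\]
First suppose that $\xi^* = \chi_A$, with $A$ being a Borel set. Set
\[
\eta^*(t) \stackrel{df}{=} \xi^*(t) - \frac{1}{2} \qquad \text{and} \qquad \eta(t) \stackrel{df}{=} \xi(t) - \frac{t}{2},
\]
so that $|\eta^*| = \frac{1}{2}$ everywhere. We have
\[
\int_Z \|D(\eta \circ u_n)\|_{\mathbb{R}^N}^p\, dz = \int_Z \|(\eta^* \circ u_n) Du_n\|_{\mathbb{R}^N}^p\, dz = \frac{1}{2^p} \int_Z \|Du_n\|_{\mathbb{R}^N}^p\, dz \longrightarrow \frac{1}{2^p} \int_Z \|Du\|_{\mathbb{R}^N}^p\, dz = \int_Z \|D(\eta \circ u)\|_{\mathbb{R}^N}^p\, dz.
\]
Since
\[
D(\eta \circ u_n) \stackrel{w}{\longrightarrow} D(\eta \circ u) \quad \text{in } L^p(Z; \mathbb{R}^N) \qquad \text{and} \qquad \|D(\eta \circ u_n)\|_p \longrightarrow \|D(\eta \circ u)\|_p,
\]
from the uniform convexity of $L^p(Z; \mathbb{R}^N)$ ($p \in (1, +\infty)$) and the Kadec-Klee property (see Remark A.3.22), we have that $D(\eta \circ u_n) \longrightarrow D(\eta \circ u)$ in $L^p(Z; \mathbb{R}^N)$. Because $Du_n \longrightarrow Du$ in $L^p(Z; \mathbb{R}^N)$, it follows that
\[
D(\xi \circ u_n) \longrightarrow D(\xi \circ u) \quad \text{in } L^p(Z; \mathbb{R}^N),
\]
whenever $\xi^*$ is a characteristic function of a Borel set.

Clearly then the same is true for $\xi^*$ being a countably-valued Borel function. Now suppose that $\xi^*$ is an arbitrary bounded Borel function. For a given $\varepsilon > 0$, we can find a countably-valued function $s^*$, such that
\[
\sup_{t \in \mathbb{R}} |\xi^*(t) - s^*(t)| \le \varepsilon
\]
(see Corollary 2.1.4). So using Proposition 2.4.25, we have
\[
\|D(\xi \circ u_n) - D(\xi \circ u)\|_p \le \|(s^* \circ u_n) Du_n - (s^* \circ u) Du\|_p + \varepsilon \big( \|Du_n\|_p + \|Du\|_p \big),
\]
so
\[
\limsup_{n \to +\infty} \|D(\xi \circ u_n) - D(\xi \circ u)\|_p \le 2 \varepsilon \|Du\|_p.
\]
Let $\varepsilon \searrow 0$, to obtain
\[
D(\xi \circ u_n) \longrightarrow D(\xi \circ u) \quad \text{in } L^p(Z; \mathbb{R}^N),
\]
hence $\xi \circ u_n \longrightarrow \xi \circ u$ in $W^{1,p}(Z)$.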

REMARK 2.6.5 The result is also true for $p = 1$, but the proof is more involved. We refer to Marcus & Mizel (1979), for details.

The weakest measure theoretic sense in which a function $w \in L^1(Z)$ can be differentiable is to require that its partial derivatives in the sense of distributions are Radon measures. Such functions are called functions of bounded variation. More precisely we make the following definition.

DEFINITION 2.6.6 Let $Z \subseteq \mathbb{R}^N$ be an open set. A function $u \in L^1(Z)$ is said to be of bounded variation, if and only if there exist bounded Borel signed measures $\mu_k : \mathcal{B}(Z) \longrightarrow \mathbb{R}$, for $k \in \{1, \ldots, N\}$, such that
\[
\int_Z u\, D_k \vartheta\, dz = - \int_Z \vartheta\, d\mu_k \qquad \forall\ \vartheta \in C_c^\infty(Z).
\]
The space of functions of bounded variation is denoted by $BV(Z)$.

The next Proposition clarifies the structure of the functions of bounded variation.

PROPOSITION 2.6.7 If $Z \subseteq \mathbb{R}^N$ is an open set, $u \in BV(Z)$ and for $h \in C_c(Z)$, $h \ge 0$, we set
\[
\|Du\|(h) \stackrel{df}{=} \sup \Big\{ \int_Z u\, \mathrm{div}\, \vartheta\, dz : \vartheta = (\vartheta_k)_{k=1}^N \in C_c^\infty(Z; \mathbb{R}^N),\ \|\vartheta(z)\|_{\mathbb{R}^N} \le h(z),\ z \in Z \Big\},
\]
then $\|Du\|$ is a Radon measure.

PROOF According to the Riesz-Markov representation theorem (see Theorem 2.3.41), we need to show that $\|Du\|$ is a positive linear functional on $C_c(Z)$ which is continuous under monotone convergence, i.e., if $h_n \nearrow h$ in $C_c(Z)$, then $\|Du\|(h_n) \longrightarrow \|Du\|(h)$. To this end let $\mu = (\mu_k)_{k=1}^N = Du$. From Definition 2.6.6, we have that
\[
\int_Z u\, \mathrm{div}\, \vartheta\, dz = - \int_Z \vartheta\, d\mu \qquad \forall\ \vartheta \in C_c^\infty(Z; \mathbb{R}^N).
\]
Thus, we may write
\[
\|Du\|(h) = \sup \Big\{ \int_Z v\, d\mu : v = (v_k)_{k=1}^N \in C_c(Z; \mathbb{R}^N),\ \|v(z)\|_{\mathbb{R}^N} \le h(z) \text{ for all } z \in Z \Big\}.
\]
We show that $\|Du\|(\cdot)$ is additive. So let $h_1, h_2 \in C_c(Z)$, $h_1, h_2 \ge 0$ and suppose that $v \in C_c(Z; \mathbb{R}^N)$ is such that
\[
\|v(z)\|_{\mathbb{R}^N} \le h_1(z) + h_2(z) \qquad \forall\ z \in Z.
\]
Let $g = \min\{h_1, \|v\|\}$ and
\[
w(z) \stackrel{df}{=} \begin{cases} g(z)\, \dfrac{v(z)}{\|v(z)\|_{\mathbb{R}^N}} & \text{if } v(z) \ne 0, \\[1ex] 0 & \text{if } v(z) = 0. \end{cases}
\]
Clearly $w \in C_c(Z; \mathbb{R}^N)$ and
\[
\|v(z) - w(z)\|_{\mathbb{R}^N} = \|v(z)\|_{\mathbb{R}^N} - g(z) \le h_2(z) \qquad \forall\ z \in Z.
\]
Therefore, since
\[
\|w(z)\|_{\mathbb{R}^N} = g(z) \le h_1(z) \qquad \forall\ z \in Z,
\]
we have
\[
\int_Z v\, d\mu = \int_Z w\, d\mu + \int_Z (v - w)\, d\mu \le \|Du\|(h_1) + \|Du\|(h_2),
\]
so
\[
\|Du\|(h_1 + h_2) \le \|Du\|(h_1) + \|Du\|(h_2).
\]
Since the opposite inequality is clearly true, we conclude that $\|Du\|(\cdot)$ is additive. Also it is clearly positively homogeneous. Thus it remains to show that if $h_n \nearrow h$ in $C_c(Z)_+$, then
\[
\|Du\|(h_n) \longrightarrow \|Du\|(h).
\]
Let $v \in C_c(Z; \mathbb{R}^N)$, such that
\[
\|v(z)\|_{\mathbb{R}^N} \le h(z) \qquad \forall\ z \in Z.
\]
Let $g_n \stackrel{df}{=} \min\{h_n, \|v\|\}$ and
\[
w_n(z) \stackrel{df}{=} \begin{cases} g_n(z)\, \dfrac{v(z)}{\|v(z)\|_{\mathbb{R}^N}} & \text{if } v(z) \ne 0, \\[1ex] 0 & \text{if } v(z) = 0. \end{cases}
\]
We have $w_n \in C_c(Z; \mathbb{R}^N)$,
\[
\|w_n(z)\|_{\mathbb{R}^N} = g_n(z) \le h_n(z) \qquad \forall\ z \in Z
\]
and $\|v - w_n\| = \|v\| - g_n \searrow 0$. Because $\|v - w_n\| = \|v\| - g_n \le 2\|v\|$, by virtue of the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that
\[
\int_Z v\, d\mu = \langle Du, v \rangle_{C_c(Z; \mathbb{R}^N)} = \lim_{n \to +\infty} \langle Du, w_n \rangle_{C_c(Z; \mathbb{R}^N)} \le \lim_{n \to +\infty} \|Du\|(g_n),
\]
so
\[
\|Du\|(h) \le \lim_{n \to +\infty} \|Du\|(g_n).
\]
Since $g_n \le h_n \le h$ for all $n \ge 1$ and $\|Du\|(\cdot)$ is monotone, the opposite inequality also holds, hence
\[
\|Du\|(h) \ge \lim_{n \to +\infty} \|Du\|(h_n) \ge \lim_{n \to +\infty} \|Du\|(g_n),
\]
so
\[
\|Du\|(h) = \lim_{n \to +\infty} \|Du\|(h_n).
\]

COROLLARY 2.6.8 If $Z \subseteq \mathbb{R}^N$ is an open set and $u \in BV(Z)$, then there exists a Borel measurable function $\xi : Z \longrightarrow \mathbb{R}^N$, such that
\[
\|\xi(z)\|_{\mathbb{R}^N} = 1 \qquad \mu = Du\text{-a.e.}
\]
and
\[
\int_Z u\, \mathrm{div}\, \vartheta\, dz = - \int_Z (\vartheta, \xi)_{\mathbb{R}^N}\, d\|Du\| \qquad \forall\ \vartheta \in C_c^1(Z; \mathbb{R}^N).
\]

REMARK 2.6.9 Evidently
\[
\xi = \frac{d(Du)}{d\|Du\|}
\]
(i.e., the Radon-Nikodym derivative of $\mu = Du$ with respect to $\|Du\|$, since $Du \ll \|Du\|$; see Theorem A.2.24 and Remark A.2.25). So, we have
\[
\int_Z u\, \mathrm{div}\, \vartheta\, dz = - \int_Z \vartheta\, dDu \qquad \forall\ \vartheta \in C_c^1(Z; \mathbb{R}^N).
\]
In the sequel for $u \in L^1_{loc}(Z)$, we say that $u \in BV_{loc}(Z)$ (i.e., has locally bounded variation in $Z$), if for every bounded open set $V \subseteq Z$ with $\overline{V} \subseteq Z$, we have that $u \in BV(V)$. Note that the total variation of $\|Du\|$ is given by
\[
\|Du\|(Z) = \sup \Big\{ \int_Z u\, \mathrm{div}\, \vartheta\, dz : \vartheta = (\vartheta_k)_{k=1}^N \in C_c^\infty(Z; \mathbb{R}^N),\ \|\vartheta(z)\|_{\mathbb{R}^N} \le 1 \text{ for all } z \in Z \Big\}.
\]
The norm of $BV(Z)$ is given by
\[
\|u\|_{BV(Z)} = \|u\|_1 + \|Du\|(Z)
\]
and makes $BV(Z)$ a Banach space. It is also well known that an absolutely continuous function $u : \mathbb{R} \longrightarrow \mathbb{R}$ with $u' \in L^1(\mathbb{R})$ is of bounded variation in $\mathbb{R}$. In particular then $W^{1,1}(\mathbb{R}) \subseteq BV(\mathbb{R})$. Next we show that the same is true in higher dimensions (i.e., for $N > 1$). First two examples to motivate what follows.

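EXAMPLE 2.6.10 (a) Let $Z \subseteq \mathbb{R}^N$ be an open set and $u \in W^{1,1}(Z)$. Then $u \in BV(Z)$ (i.e., $W^{1,1}(Z) \subseteq BV(Z)$). To see this let $\vartheta \in C_c^1(Z; \mathbb{R}^N)$ with $\|\vartheta(z)\|_{\mathbb{R}^N} \le 1$ for all $z \in Z$. We have
\[
\int_Z u\, \mathrm{div}\, \vartheta\, dz = - \int_Z \big( Du, \vartheta \big)_{\mathbb{R}^N}\, dz,
\]
so
\[
\|Du\|(Z) = \int_Z \|Du(z)\|_{\mathbb{R}^N}\, dz
\]
and
\[
\xi(z) = \begin{cases} \dfrac{Du(z)}{\|Du(z)\|_{\mathbb{R}^N}} & \text{if } Du(z) \ne 0, \\[1ex] 0 & \text{if } Du(z) = 0 \end{cases} \qquad \text{for } \lambda^N\text{-a.a. } z \in Z.
\]
Similarly we show that $W^{1,1}_{loc}(Z) \subseteq BV_{loc}(Z)$. In particular then $W^{1,p}_{loc}(Z) \subseteq BV_{loc}(Z)$ for all $p \in [1, +\infty)$ and if $Z \subseteq \mathbb{R}^N$ is bounded and open then $W^{1,p}(Z) \subseteq BV(Z)$ for all $p \ge 1$.

(b) Let $Z \subseteq \mathbb{R}^N$ be an open set, $U \subseteq \mathbb{R}^N$ another open set with $C^2$-boundary $\partial U$, such that
\[
\mu^{(N-1)}\big( \partial U \cap K \big) < +\infty \qquad \text{for all compact sets } K \subseteq Z.
\]
Then from Proposition 2.4.44, for $\vartheta \in C_c^1(Z; \mathbb{R}^N)$, we have
\[
\int_U \mathrm{div}\, \vartheta\, dz = \int_{\partial U} (\vartheta, n)_{\mathbb{R}^N}\, d\mu^{(N-1)}
\]
(here $n$ denotes the outward unit normal along $\partial U$). Hence for any bounded, open set $V \subseteq Z$, with $\overline{V} \subseteq Z$ and for any $\vartheta \in C_c^1(V; \mathbb{R}^N)$ with $\|\vartheta(z)\|_{\mathbb{R}^N} \le 1$ for all $z \in V$, we have
\[
\int_U \mathrm{div}\, \vartheta\, dz = \int_{\partial U \cap V} (\vartheta, n)_{\mathbb{R}^N}\, d\mu^{(N-1)} \le \mu^{(N-1)}\big( \partial U \cap V \big),
\]
so $\chi_U \in BV_{loc}(Z)$. Moreover,
\[
\|\partial \chi_U\|(Z) = \mu^{(N-1)}\big( \partial U \cap Z \big).
\]
Thus $\|\partial \chi_U\|(Z)$ measures the size of $\partial U$ in $Z$. Since $\chi_U$ is not in general in $W^{1,1}_{loc}(Z)$, we see that not every function of (locally) bounded variation is a Sobolev function.

Motivated by Example 2.6.10(b), we make the following definition.

DEFINITION 2.6.11 A Lebesgue measurable set $A \subseteq \mathbb{R}^N$ is said to have finite perimeter in an open set $Z \subseteq \mathbb{R}^N$, if $\chi_A \in BV(Z)$.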

REMARK 2.6.12 Some authors call sets of finite perimeter Caccioppoli sets.
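The two cases of Example 2.6.10 can be illustrated in dimension $N = 1$, where $\mu^{(0)}$ is the counting measure. The sketch below uses a discrete total variation on a uniform grid of $Z = (0, 1)$ as a stand-in for $\|Du\|(Z)$; the function name `total_variation`, the grid and the two sample functions are assumptions chosen for illustration only.

```python
def total_variation(values):
    """Discrete total variation: sum of |v[i+1] - v[i]| over consecutive samples."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

n = 1000
grid = [i / n for i in range(n + 1)]  # uniform grid on Z = (0, 1)

# Smooth case, as in part (a): u(t) = t^2 lies in W^{1,1}(Z) and the
# variation equals the integral of |u'(t)| = 2t over (0, 1), which is 1.
smooth = [t * t for t in grid]

# Jump case, as in part (b): the characteristic function of U = (0.25, 0.75)
# is not in W^{1,1}(Z), yet its variation is finite and counts the two
# boundary points of U inside Z.
indicator = [1.0 if 0.25 < t < 0.75 else 0.0 for t in grid]

tv_smooth = total_variation(smooth)
tv_indicator = total_variation(indicator)
```

In this one-dimensional setting the value `tv_indicator` plays the role of the perimeter of $U$ in $Z$ from Definition 2.6.11.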

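Next we shall establish some elementary properties of $BV$-functions. The first is the lower semicontinuity of the variational measure.

PROPOSITION 2.6.13 If $Z \subseteq \mathbb{R}^N$ is an open set and $\{u_n\}_{n \ge 1} \subseteq BV(Z)$ is such that
\[
u_n \longrightarrow u \quad \text{in } L^1_{loc}(Z),
\]
then for every open set $U \subseteq Z$, we have
\[
\|Du\|(U) \le \liminf_{n \to +\infty} \|Du_n\|(U).
\]

PROOF Let $\vartheta \in C_c^\infty(U; \mathbb{R}^N)$ be such that
\[
\|\vartheta(z)\|_{\mathbb{R}^N} \le 1 \qquad \forall\ z \in U.
\]
We have
\[
\int_U u\, \mathrm{div}\, \vartheta\, dz = \lim_{n \to +\infty} \int_U u_n\, \mathrm{div}\, \vartheta\, dz \le \liminf_{n \to +\infty} \|Du_n\|(U);
\]
so from Remark 2.6.9, taking the supremum over all such $\vartheta$, we have
\[
\|Du\|(U) \le \liminf_{n \to +\infty} \|Du_n\|(U).
\]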
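The inequality in Proposition 2.6.13 can be strict. A standard one-dimensional illustration (not taken from the text) is $u_n(t) = \sin(nt)/n$ on $Z = (0, 2\pi)$: here $u_n \longrightarrow 0$ in $L^1(Z)$, while $\|Du_n\|(Z) = \int_0^{2\pi} |\cos(nt)|\, dt = 4$ for every $n$. The sketch below checks this numerically; the helper name `tv_and_l1` and the grid size are assumptions of the example.

```python
import math

def tv_and_l1(n, m=20000):
    """For u_n(t) = sin(n t)/n on Z = (0, 2*pi), return a discrete total
    variation and a Riemann-sum L^1 norm computed on m equal subintervals."""
    h = 2.0 * math.pi / m
    vals = [math.sin(n * j * h) / n for j in range(m + 1)]
    tv = sum(abs(b - a) for a, b in zip(vals, vals[1:]))
    l1 = sum(abs(v) for v in vals[:-1]) * h
    return tv, l1

# The variation stays near 4 while the L^1 norm decays like 4/n.
results = {n: tv_and_l1(n) for n in (1, 2, 4, 8)}
```

So the limit $u = 0$ has $\|Du\|(Z) = 0$, strictly below $\liminf_{n \to +\infty} \|Du_n\|(Z) = 4$.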

REMARK 2.6.14 The above Proposition does not assert that $u \in BV(Z)$. This will be true if $u \in L^1(Z)$ and $\sup_{n \ge 1} \|Du_n\|(Z) < +\infty$. To see this let $\vartheta \in C_c^1(Z)$ and $k = 1, \ldots, N$. We have
\[
\lim_{n \to +\infty} \int_Z \vartheta\, D_k u_n\, dz = - \lim_{n \to +\infty} \int_Z u_n\, D_k \vartheta\, dz = - \int_Z u\, D_k \vartheta\, dz,
\]
so
\[
\Big| \int_Z u\, D_k \vartheta\, dz \Big| \le \|\vartheta\|_\infty \liminf_{n \to +\infty} \|Du_n\|(Z) < +\infty.
\]
Because the embedding $C_c^1(Z) \subseteq C_c(Z)$ is dense, we have that
\[
D_k u(\vartheta) = - \int_Z u\, D_k \vartheta\, dz \qquad \forall\ k = 1, \ldots, N
\]
is a bounded linear functional on $C_c(Z)$, hence a measure.


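In the next Proposition, we establish an upper semicontinuity property of the total variation measure.

PROPOSITION 2.6.15 If $Z \subseteq \mathbb{R}^N$ is an open set, $\{u_n\}_{n \ge 1} \subseteq BV(Z)$,
\[
u_n \longrightarrow u \quad \text{in } L^1_{loc}(Z) \qquad \text{and} \qquad \|Du\|(Z) = \lim_{n \to +\infty} \|Du_n\|(Z),
\]
then
\[
\limsup_{n \to +\infty} \|Du_n\|\big( \overline{U} \cap Z \big) \le \|Du\|\big( \overline{U} \cap Z \big) \qquad \text{for all open sets } U \subseteq Z.
\]

PROOF The set $V = Z \setminus \overline{U}$ is open and so from Proposition 2.6.13, we have that
\[
\|Du\|(V) \le \liminf_{n \to +\infty} \|Du_n\|(V).
\tag{2.66}
\]
Then we have
\[
\begin{aligned}
\|Du\|\big( \overline{U} \cap Z \big) + \|Du\|(V) = \|Du\|(Z) &= \lim_{n \to +\infty} \|Du_n\|(Z) \\
&\ge \limsup_{n \to +\infty} \|Du_n\|\big( \overline{U} \cap Z \big) + \liminf_{n \to +\infty} \|Du_n\|(V) \\
&\ge \limsup_{n \to +\infty} \|Du_n\|\big( \overline{U} \cap Z \big) + \|Du\|(V),
\end{aligned}
\]
so
\[
\limsup_{n \to +\infty} \|Du_n\|\big( \overline{U} \cap Z \big) \le \|Du\|\big( \overline{U} \cap Z \big).
\]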

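Combining Propositions 2.6.13 and 2.6.15, we have the following.

COROLLARY 2.6.16 If $Z \subseteq \mathbb{R}^N$ is an open set, $\{u_n\}_{n \ge 1} \subseteq BV(Z)$,
\[
u_n \longrightarrow u \quad \text{in } L^1_{loc}(Z), \qquad \|Du_n\|(Z) \longrightarrow \|Du\|(Z)
\]
and
\[
\|Du\|\big( \partial U \big) = 0 \qquad \text{for all open sets } U \subseteq Z,
\]
then $\|Du_n\|(U) \longrightarrow \|Du\|(U)$.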

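The next theorem is the counterpart for the space $BV(Z)$ of the Meyers-Serrin theorem (see Theorem 2.4.13).

THEOREM 2.6.17 If $Z \subseteq \mathbb{R}^N$ is an open set and $u \in BV(Z)$, then we can find a sequence $\{u_n\}_{n \ge 1} \subseteq BV(Z) \cap C^\infty(Z)$, such that
\[
u_n \longrightarrow u \quad \text{in } L^1(Z) \qquad \text{and} \qquad \|Du_n\|(Z) \longrightarrow \|Du\|(Z).
\]

PROOF Let $\varepsilon > 0$. For a given positive integer $m \ge 1$, we define the following open subsets of $Z$:
\[
Z_k \stackrel{df}{=} \Big\{ z \in Z : d(z, \partial Z) > \frac{1}{k + m} \Big\} \cap B_{k+m}(0) \qquad \forall\ k \ge 1.
\]
Choose $m \ge 1$ large enough so that
\[
\|Du\|(Z \setminus Z_1) < \varepsilon.
\tag{2.67}
\]
Setting $Z_0 = \emptyset$, we introduce the following sequence of open subsets of $Z$:
\[
V_k \stackrel{df}{=} Z_{k+1} \setminus \overline{Z}_{k-1} \qquad \forall\ k \ge 1.
\]
Let $\{\xi_k\}_{k \ge 1}$ be a $C^\infty$-partition of unity subordinate to the open cover $\{V_k\}_{k \ge 1}$ of $Z$, i.e.,
\[
\xi_k \in C_c^\infty(V_k), \qquad 0 \le \xi_k \le 1 \qquad \text{and} \qquad \sum_{k=1}^{\infty} \xi_k = 1 \quad \text{on } Z.
\]
Let $\varphi$ be a mollifier and for each $k \ge 1$, choose $\varepsilon_k > 0$, such that
\[
\begin{cases}
\mathrm{supp}\, \big( \varphi_{\varepsilon_k} * (\xi_k u) \big) \subseteq V_k, \\[0.5ex]
\|\varphi_{\varepsilon_k} * (\xi_k u) - \xi_k u\|_1 < \dfrac{\varepsilon}{2^k}, \\[1ex]
\|\varphi_{\varepsilon_k} * (u D\xi_k) - u D\xi_k\|_1 < \dfrac{\varepsilon}{2^k}.
\end{cases}
\tag{2.68}
\]
Let
\[
u_\varepsilon \stackrel{df}{=} \sum_{k=1}^{\infty} \varphi_{\varepsilon_k} * (\xi_k u).
\]
Then $u_\varepsilon \in C^\infty(Z)$ and because $u = \sum_{k=1}^{\infty} \xi_k u$, from (2.68), we have
\[
\|u_\varepsilon - u\|_1 < \varepsilon,
\]
so
\[
u_\varepsilon \longrightarrow u \quad \text{in } L^1(Z) \qquad \text{as } \varepsilon \searrow 0.
\tag{2.69}
\]
Invoking Proposition 2.6.13, we have
\[
\|Du\|(Z) \le \liminf_{\varepsilon \searrow 0} \|Du_\varepsilon\|(Z).
\tag{2.70}
\]
Now let $\vartheta \in C_c^1(Z; \mathbb{R}^N)$ be such that
\[
\|\vartheta(z)\|_{\mathbb{R}^N} \le 1 \qquad \forall\ z \in Z.
\]
We have
\[
\begin{aligned}
\int_Z u_\varepsilon\, \mathrm{div}\, \vartheta\, dz
&= \sum_{k=1}^{\infty} \int_Z \varphi_{\varepsilon_k} * (\xi_k u)\, \mathrm{div}\, \vartheta\, dz \\
&= \sum_{k=1}^{\infty} \int_Z \xi_k u\, \mathrm{div}\, \big( \varphi_{\varepsilon_k} * \vartheta \big)\, dz \\
&= \sum_{k=1}^{\infty} \int_Z u\, \mathrm{div}\, \big( \xi_k (\varphi_{\varepsilon_k} * \vartheta) \big)\, dz - \sum_{k=1}^{\infty} \int_Z u\, \big( D\xi_k, \varphi_{\varepsilon_k} * \vartheta \big)_{\mathbb{R}^N}\, dz \\
&= \sum_{k=1}^{\infty} \int_Z u\, \mathrm{div}\, \big( \xi_k (\varphi_{\varepsilon_k} * \vartheta) \big)\, dz - \sum_{k=1}^{\infty} \int_Z \big( \vartheta, \varphi_{\varepsilon_k} * (u D\xi_k) - u D\xi_k \big)_{\mathbb{R}^N}\, dz \\
&= \eta_{1,\varepsilon} + \eta_{2,\varepsilon}.
\end{aligned}
\]
In the fourth equality we have used that $\sum_{k=1}^{\infty} D\xi_k = 0$ on $Z$. Note that
\[
\big\| \xi_k \big( \varphi_{\varepsilon_k} * \vartheta \big)(z) \big\|_{\mathbb{R}^N} \le 1 \qquad \forall\ z \in Z,\ k \ge 1.
\]
Also each $z \in Z$ belongs to at most three elements of the cover $\{V_k\}_{k \ge 1}$. So we have
\[
\begin{aligned}
|\eta_{1,\varepsilon}| &= \Big| \int_Z u\, \mathrm{div}\, \big( \xi_1 (\varphi_{\varepsilon_1} * \vartheta) \big)\, dz + \sum_{k=2}^{\infty} \int_Z u\, \mathrm{div}\, \big( \xi_k (\varphi_{\varepsilon_k} * \vartheta) \big)\, dz \Big| \\
&\le \|Du\|(Z) + \sum_{k=2}^{\infty} \|Du\|(V_k) \\
&\le \|Du\|(Z) + 3 \|Du\|(Z \setminus Z_1) \le \|Du\|(Z) + 3\varepsilon.
\end{aligned}
\tag{2.71}
\]
Also from (2.68), we have that
\[
|\eta_{2,\varepsilon}| < \varepsilon.
\tag{2.72}
\]
From (2.71) and (2.72), it follows that
\[
\int_Z u_\varepsilon\, \mathrm{div}\, \vartheta\, dz \le \|Du\|(Z) + 4\varepsilon,
\]
thus
\[
\|Du_\varepsilon\|(Z) \le \|Du\|(Z) + 4\varepsilon
\]
and so
\[
\limsup_{\varepsilon \to 0} \|Du_\varepsilon\|(Z) \le \|Du\|(Z).
\tag{2.73}
\]
From (2.70) and (2.73), we infer that
\[
\|Du_\varepsilon\|(Z) \longrightarrow \|Du\|(Z) \qquad \text{as } \varepsilon \searrow 0.
\]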

This combined with (2.69) finishes the proof of the theorem.

REMARK 2.6.18 Note that in the previous "local" approximation result, we do not have that
\[
\|D(u_\varepsilon - u)\|(Z) \longrightarrow 0 \qquad \text{as } \varepsilon \searrow 0
\]
and so we cannot claim the density of $BV(Z) \cap C^\infty(Z)$ in $BV(Z)$.

COROLLARY 2.6.19 If $Z \subseteq \mathbb{R}^N$ is a bounded open set which is Lipschitz, then $\overline{B}_r$, for $r > 0$, is compact in $L^1(Z)$, where
\[
\overline{B}_r = \{ u \in BV(Z) : \|u\|_{BV} \le r \}.
\]

PROOF Let $\{u_n\}_{n \ge 1} \subseteq \overline{B}_r$. By Theorem 2.6.17, we can find $\psi_n \in C^\infty(Z)$, such that
\[
\|u_n - \psi_n\|_1 \le \frac{1}{n} \qquad \text{and} \qquad \|D\psi_n\| = \int_Z \|D\psi_n(z)\|\, dz \le 2.
\]
It follows that $\{\psi_n\}_{n \ge 1} \subseteq W^{1,1}(Z)$ is bounded. By virtue of Theorem 2.5.17, the sequence $\{\psi_n\}_{n \ge 1} \subseteq L^1(Z)$ is relatively compact. So we may assume that $\psi_n \longrightarrow u$ in $L^1(Z)$. From Remark 2.6.14, we have that $u \in BV(Z)$ and from Proposition 2.6.13, we have that $\|u\|_{BV} \le r$, i.e., $u \in \overline{B}_r$.

REMARK 2.6.20 According to Corollary 2.6.19, if $Z \subseteq \mathbb{R}^N$ is a bounded open set which is Lipschitz, then the embedding $BV(Z) \subseteq L^1(Z)$ is compact.

An interesting application of this compact embedding is the following result.
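The phenomenon described in Remark 2.6.18 is already visible for a step function in one variable. In the sketch below, $u$ is the characteristic function of $(0, 1)$ viewed on $Z = (-1, 1)$, and a continuous piecewise-linear ramp of width `eps` is used as a simple stand-in for the smooth approximants of Theorem 2.6.17 (the theorem itself uses mollification; the ramp, the helper names and the grid are assumptions made only for this illustration).

```python
def total_variation(values):
    """Discrete total variation of a list of samples."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

def step(t):
    """u = characteristic function of (0, 1), viewed on Z = (-1, 1)."""
    return 1.0 if t > 0 else 0.0

def ramp(t, eps):
    """Piecewise-linear approximation of the step with transition width eps."""
    return min(1.0, max(0.0, t / eps + 0.5))

def compare(eps, m=20000):
    """Return (L^1 error, variation of u_eps, variation of u_eps - u) on a grid."""
    h = 2.0 / m
    grid = [-1.0 + j * h for j in range(m + 1)]
    u = [step(t) for t in grid]
    ue = [ramp(t, eps) for t in grid]
    diff = [a - b for a, b in zip(ue, u)]
    l1_err = sum(abs(d) for d in diff) * h
    return l1_err, total_variation(ue), total_variation(diff)
```

For `eps` equal to 0.1 or 0.05 the first output is close to `eps / 4` and the second is close to 1, matching the two convergences asserted in Theorem 2.6.17, while the third stays close to 2, so the approximation does not converge in the $BV(Z)$-norm.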

PROPOSITION 2.6.21 If $Z \subseteq \mathbb{R}^N$ is a bounded open set which is Lipschitz,
\[
\mathcal{T} \stackrel{df}{=} \Big\{ A \subseteq Z : A \text{ is Lebesgue measurable},\ \lambda^N(A) = \lambda^N(Z \setminus A) = \frac{1}{2} \lambda^N(Z) \Big\}
\]
and
\[
P(A, Z) = \|D\chi_A\| \qquad \forall\ A \in \mathcal{T}
\]
(the perimeter with respect to $Z$ functional), then there exists $A^* \in \mathcal{T}$, such that
\[
P(A^*, Z) = \inf_{A \in \mathcal{T}} P(A, Z).
\]

PROOF Let
\[
S \stackrel{df}{=} \{ \chi_A : A \in \mathcal{T} \} \subseteq L^1(Z).
\]
We furnish $S$ with the relative $L^1(Z)$-topology. Since
\[
\|\chi_A\|_1 \le \lambda^N(Z) \qquad \forall\ A \in \mathcal{T},
\]
we see that the functional $\xi : S \longrightarrow \mathbb{R}$ defined by
\[
\xi(\chi_A) \stackrel{df}{=} \|D\chi_A\| = P(A, Z)
\]
is coercive on $S$ for the $BV(Z)$-norm. Therefore the sub-level sets of $\xi$ are bounded in $BV(Z)$, thus relatively compact in $S \subseteq L^1(Z)$ (note that $S$ is closed in $L^1(Z)$ and see Remark 2.6.20). Also from Proposition 2.6.13, we know that $\xi$ is lower semicontinuous on $S$. This means that its sub-level sets are compact in $L^1(Z)$. So by the Weierstrass theorem, we can find $A^* \in \mathcal{T}$, such that
\[
P(A^*, Z) = \inf_{A \in \mathcal{T}} P(A, Z).
\]

We can relate the variation measure of $u$ and the perimeters of its superlevel sets. The result is actually a "Co-Area Formula" for $BV$-functions (see also Theorem 1.5.25).

THEOREM 2.6.22 If $Z \subseteq \mathbb{R}^N$ is an open set, $u \in L^1(Z)$ and for every $r \in \mathbb{R}$ we let
\[
L_r \stackrel{df}{=} \{ z \in Z : u(z) > r \},
\]
then
(a) $u \in BV(Z)$ implies that
\[
\|Du\|(Z) = \int_{-\infty}^{\infty} \|D\chi_{L_r}\|(Z)\, dr;
\]
(b) if for almost all $r \in \mathbb{R}$, $L_r$ has a finite perimeter, then $u \in BV(Z)$.
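A discrete analogue of the formula in Theorem 2.6.22(a) can be checked for $N = 1$, where the perimeter of a superlevel set reduces to a count of its boundary points. The sketch below works with samples of a function on a grid; the function names and the sample function are assumptions for illustration. Because the perimeter count is piecewise constant in $r$ between consecutive sample values, the integral over $r$ is evaluated exactly, one gap at a time.

```python
import math

def total_variation(values):
    """Discrete total variation of a list of samples."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

def perimeter_of_superlevel(values, r):
    """Number of changes of the indicator of {u > r} along the samples:
    a discrete stand-in for ||D chi_{L_r}||(Z) when N = 1."""
    ind = [1 if v > r else 0 for v in values]
    return sum(abs(b - a) for a, b in zip(ind, ind[1:]))

def coarea_integral(values):
    """Integrate the perimeter count over all levels r, using that the count
    is constant between two consecutive distinct sample values."""
    levels = sorted(set(values))
    total = 0.0
    for lo, hi in zip(levels, levels[1:]):
        total += perimeter_of_superlevel(values, 0.5 * (lo + hi)) * (hi - lo)
    return total

m = 400
samples = [math.sin(2 * math.pi * j / m) + 0.3 * math.cos(6 * math.pi * j / m)
           for j in range(m + 1)]
```

On these samples `coarea_integral(samples)` and `total_variation(samples)` agree up to rounding, which is the discrete counterpart of integrating the perimeters of the sets $L_r$ over $r$.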

2.7 Remarks

2.1: To have a good theory of integration, we need a reasonable notion of measurability of functions. In this direction the basic result is the Pettis measurability theorem (see Theorem 2.1.3), which was proved by Pettis (1938a). The main integral for vector valued functions, which has a rich enough structure to have significant applications, is the Bochner integral. The Bochner integral can be traced in the works of Bochner (1933) and Dunford (1935) and for this reason is also known as "Dunford's first integral." Most of the properties of the Bochner integral follow from the corresponding properties of the classical Lebesgue integral, by virtue of Proposition 2.1.10. So some analysts say that the Bochner integral is the Lebesgue integral with the absolute value replaced by norms. The Pettis integral has far fewer applications, which require knowledge and use of sophisticated measure theoretic results. The theory of Pettis integration started with the work of Pettis and attracted renewed attention after the paper of Edgar (1977). A detailed study of the Pettis integral with applications can be found in the monograph of Talagrand (1984a). On the subject of vector valued functions and their integration, the reader can consult the books of Diestel & Uhl (1977), Dunford & Schwartz (1958) and Hille & Phillips (1957). The proof of the Orlicz-Pettis theorem can be found in Diestel & Uhl (1977, p. 22).

2.2: A reference to Lebesgue-Bochner spaces can be found in every book dealing with infinite dimensional dynamical systems. They are a natural generalization of the classical Lebesgue spaces using the notion of Bochner integral. Vector measures were already considered by Pettis (1938b). However, the real expansion on the subject occurred in the late 60s and during the 70s, when there was a systematic study of the geometry of Banach spaces. That is when RNP spaces were introduced and studied in detail. That a reflexive Banach space has the RNP was established by Phillips (1940), while the fact that a separable dual Banach space is an RNP space is due to Dunford & Pettis (1940). The proof of Proposition 2.2.8 can be found in Diestel & Uhl (1977, pp. 79 and 82). Theorem 2.2.9 (the Riesz Representation theorem for the Lebesgue-Bochner spaces $L^p(\Omega; X)$, $p \in [1, +\infty)$) is essentially due to Bochner & Taylor (1938). Its proof can be found in Diestel & Uhl (1977, p. 97). Its extension (for $p = 1$) mentioned in Theorem 2.2.12 (called Dinculeanu-Foias theorem) is due to Dinculeanu & Foias (1961) and its proof, based on "lifting theory," can be found in Ionescu-Tulcea & Ionescu-Tulcea (1969, p. 93). Absolute continuity of real valued functions (see Definition 2.2.14) was introduced by Vitali (1908), who established the fundamental fact that a real valued function on $[0, 1]$ is absolutely continuous if and only if it is the integral of its derivative (the fundamental theorem of Lebesgue calculus). Theorem 2.2.17 is due to Komura (1967). Lemma 2.2.29


can be found in Lions (1969, p. 58) and shows how new inequalities can be derived from the properties of embedding operators. Theorem 2.2.30 is due to Aubin (1963) and plays a central role in the theory of evolution equations. Evolution triples (see Definition 2.2.31) are also known as "Gelfand triples," because of their systematic use by Gelfand & Shilov (1977) (see also Wloka (1987)). Evolution triples and their properties and applications can be found in Denkowski, Migórski & Papageorgiou (2003a, 2003b), Hu & Papageorgiou (1997, 2000), Lions (1969), Showalter (1997) and Zeidler (1990a, 1990b). Finally we mention a result on the structure of $L^1(\Omega; X)$ due to Talagrand (1984b).

PROPOSITION 2.7.1 If $(\Omega, \Sigma, \mu)$ is a finite measure space and $X$ is a Banach space which is weakly sequentially complete, then $L^1(\Omega; X)$ is weakly sequentially complete too.

2.3: Theorem 2.3.2 is known as the "Arzela-Ascoli theorem" although some authors use only one of the two names. Working on $C([0, 1])$, Arzela (1889) proved the necessity part, while Ascoli (1883–1884) proved the sufficiency part. A general formulation of this theorem can be found in Denkowski, Migórski & Papageorgiou (2003a, p. 73). The results on the compactness of various sets in $L^p(T; X)$ ($p \in [1, +\infty)$) and in $C(T; X)$ (variations of the Arzela-Ascoli theorem) can be found in Simon (1987). They are formulations and extensions of the classical criterion for strong compactness in $L^p(T)$ ($p \in [1, +\infty)$), due to Riesz (1933) and Kolmogorov (1931). James' theorem (see Theorem 2.3.21), due to James (1964), is one of the deepest and most influential results of functional analysis. From it, it follows that a Banach space $X$ is reflexive if and only if every $x^* \in X^*$ attains its supremum on the unit ball of $X$. For a proof of James' theorem see Holmes (1975, pp. 157–161). Theorem 2.3.21 can be found in Papageorgiou (1985) (see also Denkowski, Migórski & Papageorgiou (2003a, p. 462)), where a kind of converse of it can also be found. The proof of Proposition 2.3.22 can be found in Denkowski, Migórski & Papageorgiou (2003a, p. 458). Ionescu-Tulcea & Ionescu-Tulcea (1969) were the first to observe that the classical Dunford-Pettis theorem (see Dunford (1935)) can be extended to $X$-valued functions with $X$ being a reflexive Banach space, after some straightforward modifications in the original proof (see Theorem 2.3.24). The proof of Proposition 2.3.31 can be found in Denkowski, Migórski & Papageorgiou (2003a, p. 484). The notion of biting convergence (see Definition 2.3.35) is due to Chacon (see Brooks & Chacon (1980) and Ball & Murat (1989)). In Brooks & Chacon (1980), we can find the original version of Theorem 2.3.26 (Biting Theorem). Property U (see Definition 2.3.33) is natural in the context of solution flows of a differential equation. Theorem 2.3.37 is due to Gutman (1985). Extensions of Proposition 2.3.39 to Banach space valued functions can be found in Rzezuchowski

(1989). The notation for the various spaces of continuous functions is not standard (see, e.g., Hewitt & Stromberg (1975, p. 86)). For a proof of Theorem 2.3.41 (Riesz-Markov representation theorem) we refer to Denkowski, Migórski & Papageorgiou (2003a, p. 322). Also the names for the various modes of convergence introduced in Definition 2.3.42 vary among authors. So we caution the reader to be careful. Since we are dealing with the space of measures, let us mention two striking results concerning them. Let $(\Omega, \Sigma)$ be a measurable space and let $ca(\Sigma)$ be the space of all signed measures on $\Sigma$ of bounded variation endowed with the total variation norm
\[
\|\mu\|_1 \stackrel{df}{=} |\mu|(\Omega) \qquad \forall\ \mu \in ca(\Sigma).
\]
We can also introduce another norm given by
\[
\|\mu\|_\infty \stackrel{df}{=} \sup_{A \in \Sigma} |\mu(A)| \qquad \forall\ \mu \in ca(\Sigma).
\]
Then
\[
\|\mu\|_\infty \le \|\mu\|_1 \le 4 \|\mu\|_\infty \qquad \forall\ \mu \in ca(\Sigma)
\]
(i.e., the two norms are equivalent). The space $\big( ca(\Sigma), \|\cdot\|_1 \big)$ is a Banach space. The first result is a remarkable improvement of the Uniform Boundedness Principle and is known as "Nikodym's boundedness theorem."

PROPOSITION 2.7.2 If $\{\mu_s\}_{s \in S} \subseteq ca(\Sigma)$ and
\[
\sup_{s \in S} |\mu_s(A)| < +\infty \qquad \forall\ A \in \Sigma,
\]
then
\[
\sup_{\substack{s \in S \\ A \in \Sigma}} |\mu_s(A)| < +\infty.
\]

The second result is known as "Nikodym's convergence theorem."

PROPOSITION 2.7.3 If $\{\mu_n\}_{n \ge 1} \subseteq ca(\Sigma)$ and
\[
\lim_{n \to +\infty} \mu_n(A) = \mu(A) \text{ exists} \qquad \forall\ A \in \Sigma,
\]
then $\mu \in ca(\Sigma)$ and moreover, if $\mu_n \ll \lambda$ for all $n \ge 1$ with $\lambda \in ca(\Sigma)$, then $\mu \ll \lambda$.

Both results can be found in Diestel (1984, pp. 80 and 90) and Dunford & Schwartz (1958, pp. 309 and 321).
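The equivalence of the two norms on $ca(\Sigma)$ stated above can be illustrated on a finite set $\Omega$ with $\Sigma$ the family of all subsets, where a signed measure is just a list of point masses. The following brute-force sketch is only an illustration; the function names `norm_1` and `norm_inf` and the sample weights are assumptions of the example.

```python
from itertools import combinations

def norm_1(weights):
    """Total variation norm |mu|(Omega) of the measure with point masses `weights`."""
    return sum(abs(w) for w in weights)

def norm_inf(weights):
    """sup over all subsets A of |mu(A)|, by brute force (small Omega only)."""
    n = len(weights)
    best = 0.0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            best = max(best, abs(sum(weights[i] for i in subset)))
    return best

weights = [1.5, -2.0, 0.25, -0.75, 3.0]
a = norm_inf(weights)  # attained on the set carrying the positive masses
b = norm_1(weights)
```

For these weights the supremum is attained on the set of points with positive mass, and the chain $\|\mu\|_\infty \le \|\mu\|_1 \le 4 \|\mu\|_\infty$ holds with room to spare.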

For a proof of Theorem 2.3.48 we refer to Denkowski, Migórski & Papageorgiou (2003a, p. 198) and Parthasarathy (1967, p. 45). Proposition 2.3.49 is due to Brézis & Lieb (1983). Finally we state a compactness result concerning vector measures. The result is known as "Lyapunov's convexity theorem" and has important ramifications in Control Theory (see Hermes & LaSalle (1969)).

THEOREM 2.7.4 Let $(\Omega, \Sigma)$ be a measurable space.
(a) If $\mu_k : \Sigma \longrightarrow \mathbb{R}$, $k = 1, \ldots, N$ are finite nonatomic measures, then
\[
R = \bigcup_{A \in \Sigma} \big( \mu_k(A) \big)_{k=1}^N
\]
is compact and convex in $\mathbb{R}^N$.
(b) If $X$ is a Banach space with the RNP and $m : \Sigma \longrightarrow X$ is a vector measure which is nonatomic and of bounded variation, then
\[
R = \overline{\bigcup_{A \in \Sigma} m(A)}^{\,\|\cdot\|}
\]
is strongly compact and convex.

2.4: Sobolev spaces were introduced by Sobolev (1963a, 1963b). Related spaces were also studied by Morrey (1940, 1966) and later by Deny & Lions (1953–1954). Today there are many well known books on the subject. We mention Adams (1975), Brézis (1983), Evans & Gariepy (1992), Kufner, John & Fučik (1977), Lions & Magenes (1972), Maz'ja (1985) and Ziemer (1989). We mention that for functions of several variables (i.e., $N > 1$), when $p = 2$, we use the notation $H^m(Z)$ (respectively $H_0^m(Z)$) for the Sobolev space $W^{m,2}(Z)$ (respectively $W_0^{m,2}(Z)$). However, for functions of one variable (i.e., $N = 1$, hence $Z = T = (a, b)$), we keep the notation $W^{m,2}(T)$ (respectively $W_0^{m,2}(T)$). Theorem 2.4.13 is due to Meyers & Serrin (1964). The result is often called "local approximation theorem." A discussion of the various geometric conditions imposed on the boundary $\partial Z$ can be found in Adams (1975, pp. 66–67). For a proof of the approximation result given in Theorem 2.4.17, we refer to Evans & Gariepy (1992, p. 127). To see that without further conditions on the domain $Z$, Theorem 2.4.17 is not true, consider the following example.

EXAMPLE 2.7.5 Let
\[
Z \stackrel{df}{=} \big\{ (z_1, z_2) \in \mathbb{R}^2 : 0 < |z_1| < 1,\ 0 < z_2 < 1 \big\}
\]
and
\[
u(z_1, z_2) \stackrel{df}{=} \begin{cases} 1 & \text{if } z_1 > 0, \\ 0 & \text{if } z_1 < 0. \end{cases}
\]
Clearly $u \in W^{1,p}(Z)$ ($p \in [1, +\infty)$). However, given $\varepsilon > 0$ sufficiently small, it is easy to see that we cannot find $\vartheta \in C^1(\overline{Z})$, such that $\|u - \vartheta\|_{W^{1,p}(Z)} < \varepsilon$. Note that this particular $Z$ lies on both sides of its boundary.


Another approximation result, useful in optimal control problems, is given below. First a definition.

DEFINITION 2.7.6 Let $Z$ be an open set. We say that $u : Z \longrightarrow \mathbb{R}$ is affine, if it is the restriction to $Z$ of an affine function over $\mathbb{R}^N$. We say that $u : Z \longrightarrow \mathbb{R}$ is piecewise affine, if it is continuous and there exists a partition of $Z$ into a Lebesgue-null set and a finite number of open sets on which $u$ is affine.

REMARK 2.7.7 If $u : Z \longrightarrow \mathbb{R}$ is affine, then
\[ Du = \text{constant} \]
and the converse is true, if $Z$ is connected. We have
\[ u(z) = \big( Du, z \big)_{\mathbb{R}^N} + c, \quad \text{with } c \in \mathbb{R}. \]

PROPOSITION 2.7.8 If $Z \subseteq \mathbb{R}^N$ is a bounded open set which is Lipschitz and $u \in W_0^{1,p}(Z)$ ($p \in (1, +\infty)$), then we can find a sequence $\{u_n\}_{n \ge 1}$ of piecewise affine functions over $Z$, null on $\partial Z$ (i.e., $\{u_n\}_{n \ge 1} \subseteq W_0^{1,p}(Z)$), such that
\[ u_n \longrightarrow u \quad \text{in } W_0^{1,p}(Z). \]

For the case $p = +\infty$, there is the following approximation result.

PROPOSITION 2.7.9 If $Z \subseteq \mathbb{R}^N$ is a bounded open set which is Lipschitz and $u \in W^{1,\infty}(Z)$, then there exists a sequence $\{u_n, Z_n\}_{n \ge 1}$ where $u_n \in W^{1,\infty}(Z)$, $Z_n \subseteq Z$ are open, $Z_n \subseteq Z_{n+1}$ for all $n \ge 1$, $\lambda^N(Z \setminus Z_n) \longrightarrow 0$, $u_n|_{Z_n}$ are piecewise affine,
\[ u_n(z) = u(z) \quad \forall\, z \in \partial Z,\ n \ge 1, \]
\[ u_n \longrightarrow u \quad \text{uniformly on } Z, \]
\[ Du_n(z) \longrightarrow Du(z) \quad \text{for a.a. } z \in Z \]
and $\|Du_n\|_\infty \le \|Du\|_\infty + \varepsilon(n)$, with $\varepsilon(n) \longrightarrow 0$ as $n \to +\infty$.


REMARK 2.7.10 Recall that, if $u \in W^{1,\infty}(Z)$, then it is Lipschitz continuous on $Z$ and so it can be extended continuously to $\overline{Z}$ (i.e., $W^{1,\infty}(Z) \subseteq C(\overline{Z})$). So the boundary values of $u$ are well defined.

Both the previous approximation results can be found in Ekeland & Temam (1976, pp. 316–317). A detailed discussion of Sobolev spaces of fractional order and on manifolds can be found in Adams (1975) and Kufner, John & Fučík (1977). Theorem 2.4.54 can be found in Kenmochi (1975) and Casas & Fernández (1989). Finally we mention a proposition useful in the interpretation of the variational formulation of various equations, such as the Navier-Stokes equation. The result is due to de Rham (1955).

PROPOSITION 2.7.11 If $Z \subseteq \mathbb{R}^N$ is an open set and $u = (u_k)_{k=1}^N \in D(Z; \mathbb{R}^N)^*$, then a necessary and sufficient condition that $u = Dh$ for some $h \in D(Z)^*$ is that
\[ \langle u, \vartheta \rangle = 0 \quad \forall\, \vartheta \in V \stackrel{df}{=} \big\{ \vartheta \in D(Z; \mathbb{R}^N) : \operatorname{div} \vartheta = 0 \big\}. \]

REMARK 2.7.12 Note that the divergence operator $\operatorname{div}$ maps $W_0^{1,p}(Z; \mathbb{R}^N)$ onto the space
\[ V \stackrel{df}{=} \Big\{ h \in L^p(Z) : \int_Z h(z)\, dz = 0 \Big\} = L^p(Z)/\mathbb{R} \]
(recall that $-\operatorname{div}$ is the adjoint of the gradient operator).

For the proof of the trace theorem (see Theorem 2.4.50), we refer to Adams (1975, p. 216) and Kufner, John & Fučík (1977, p. 337) and for the proof of the Extension Theorem (see Theorem 2.4.55), we refer to Brézis (1983, p. 158).

2.5: Theorem 2.5.3 is the classical “Sobolev inequality” (see Sobolev (1963a, 1963b)), which was also developed by Gagliardo (1958), Morrey (1940, 1966) and Nirenberg (1959). The proof given here is due to Nirenberg (1959). For the Poincaré inequality (see Theorem 2.5.4) and the Poincaré-Wirtinger inequality (see Theorem 2.5.21) we refer to Meyers (1978). For the proof of Proposition 2.5.8 we refer to Maz’ja (1985, p. 27). The Sobolev embedding theorem (see Theorem 2.5.16) originated in the work of Sobolev (1963a), with important refinements by Morrey (1940) and Gagliardo (1958). The Rellich-Kondrachov embedding theorem (see Theorem 2.5.17) originated in a paper by Rellich (1930) for $p = 2$ and by Kondrachov (1945) for the general case. For the proofs of both Theorems 2.5.16 and 2.5.17 we refer to Brézis (1983, pp. 168–170). There are variations of this theorem with interesting applications, like the following one due to Frehse (1984).


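PROPOSITION 2.7.13 If $Z \subseteq \mathbb{R}^N$ is a bounded open set, $\{u_n\}_{n \ge 1} \subseteq W^{1,p}(Z)$ (with $p \in [1, +\infty)$) is a bounded sequence and
\[ \int_Z \|Du_n\|^{p-2} \big( Du_n, Dh \big)_{\mathbb{R}^N}\, dz \le M \|h\|_\infty \quad \forall\, n \ge 1,\ h \in W^{1,p}(Z) \cap L^\infty(Z), \]
for some $M > 0$, then there exist $u \in W^{1,p}(Z)$ and a subsequence $\{u_{n_k}\}_{k \ge 1}$ of $\{u_n\}_{n \ge 1}$, such that
\[ u_{n_k} \longrightarrow u \quad \text{in } W^{1,r}(Z) \quad \forall\, r < p. \]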

For the proof of Theorem 2.5.24, we refer to Adams (1975, p. 79). Theorem 2.5.28 is another refinement of the Rellich-Kondrachov theorem (and simultaneously of the Egorov theorem; see Theorem A.2.10) and can be found in Evans (1990, p. 8). Theorem 2.5.30 (the Concentration-Compactness Lemma) is due to Lions (1985a, 1985b) and is important in the study of elliptic differential equations involving critical exponents. For additional results in this direction we refer to the work of Ben Naoum, Troestler & Willem (1996) and Bianchi, Chabrowski & Szulkin (1995) and the monographs of Evans (1990) and Willem (1996). Propositions 2.5.34 and 2.5.35 can be found in Mawhin & Willem (1989). 2.6: For the Lp -differentiability and λN -a.e. differentiability of Sobolev functions we refer to Bagby & Ziemer (1974), Liu (1977) and Resetnjak (1969) and the books of Evans & Gariepy (1992), Federer (1969), Simon (1983), Stein (1970) and Ziemer (1989). Proposition 2.6.3 is due to Marcus & Mizel (1972), where the interested reader can find additional results in this direction. Proposition 2.6.4 is due to Marcus & Mizel (1979), where the authors prove that the result is also valid for p = 1. Functions of bounded variation on R were introduced by Jordan (1881), who placed integration within the context of a “measurable” set. Lebesgue (1910) proved that a function of bounded variation on R is almost everywhere differentiable (for a proof which does not use measure theory – except sets of measure zero – we refer to Riesz & Nagy (1955, pp. 3–10)). Before the formal introduction of distributions, extensions of the notion of bounded variation to functions of many variables were suggested by Tonelli (1926) and Cesari (1936). It involved consideration of functions along the coordinate axes. Theorem 2.6.17 is due to Krickerberg (1957). 
The theory of sets of finite perimeter was introduced by Caccioppoli (1953) and De Giorgi (1954, 1955) (where one can find the Co-Area Formula for BV -functions; see Theorem 2.6.22). The proof of Theorem 2.6.22 can be also found in Evans & Gariepy (1992, p. 185) and Ziemer (1989, p. 231). Further contributions were made by Federer (1958), Fleming (1960) and Krickerberg (1957). More details on the space of BV -functions can be found in the books of Evans & Gariepy (1992), Giusti (1984) and Ziemer (1989).

Chapter 3 Nonlinear Operators and Young Measures

In this chapter we study certain nonlinear operators which arise in applications and we also discuss the so-called Young measures, which roughly speaking capture the limits of minimizing sequences in variational problems which do not have a solution. For some cases we also develop the corresponding linear theory in order to have a complete picture of the theory, see the similarities and differences of the two and appreciate the limitations of the nonlinear theory. In Section 3.1, we consider compact operators. Compactness was introduced as a first attempt to deal with infinite dimensional nonlinear operator equations. By its nature, compactness approximates infinite objects by finite ones. We see that in the context of compact operators (linear and nonlinear alike) this principle is in general true. We also discuss proper maps, the spectral theory of linear, compact, self-adjoint operators on a Hilbert space and Fredholm operators. A broader framework for the analysis of infinite dimensional problems is provided by monotone operators, which extend to an infinite dimensional context, the simple notion of an increasing real function. In Section 3.2 we examine monotone operators from a Banach space into its dual, with special emphasis on maximal monotone operators, which are a generalization of a continuous increasing real function. Maximal monotone operators have remarkable surjectivity properties. We point out that surjectivity results are important because they correspond to existence results for certain classes of nonlinear operator equations. At the end of the section we also discuss generalizations of the notion of monotonicity. These are the so-called operators of monotone type, the most important of which are the pseudomonotone operators. Monotone operators map a Banach space to its dual. If instead we want to consider nonlinear operators mapping a Banach space to itself, we need to consider accretive and m-accretive operators. 
Their importance comes from the fact that they are the generators of linear and nonlinear semigroups, which, roughly speaking, are an abstraction of the trajectories of a given differential equation. In Section 3.3 first we examine accretive operators and then we look at semigroups of operators generated by certain accretive operators. We present in detail both the linear and nonlinear theories. Undoubtedly the most common nonlinear operator is the so-called Nemytskii (or superposition) operator. In Section 3.4 we examine this operator and we have a first look at integral functionals corresponding to normal integrands. In a variational problem, when the objective functional is not inf-compact, a solution need not exist. Nevertheless, the minimizing sequences (or appropriate subsequences of them) have a limit behaviour (usually more and more oscillating), which is captured by embedding the original functions into the space of Young measures (or parametrized probabilities). This embedding leads to a larger inf-compact problem which has a solution (relaxation). In Section 3.5 we discuss the theory of the Young measures and obtain additional lower semicontinuity results for integral functionals. Some of the topics of this chapter will be revisited in the course of the next chapter.

3.1 Compact and Fredholm Operators

The first efforts to solve nonlinear functional equations involved various aspects of compactness. For this reason compact operators were introduced. They constitute a class of maps to which we can generalize several of the results which are valid for maps between finite dimensional Banach spaces. Degree theory and fixed point theory, which provide important tools for the study of functional equations, depend on the notion of compact maps.

DEFINITION 3.1.1 Let $X$, $Y$ be two Banach spaces and let $D$ be a subset of $X$. We say that $f : D \longrightarrow Y$ is compact, if it is continuous and for every bounded set $B \subseteq D$, the set $\overline{f(B)}$ is compact in $Y$ (i.e., $f(B)$ is relatively compact). We denote the set of compact maps by $K(D; Y)$. Also if $D = X$, we set
\[ \mathcal{L}_c(X; Y) \stackrel{df}{=} K(X; Y) \cap \mathcal{L}(X; Y). \]

REMARK 3.1.2 Evidently $K(D; Y)$ is a linear space which is closed under composition with continuous bounded maps. If $\dim Y < +\infty$, then every continuous bounded map $f : D \longrightarrow Y$ is compact. In the sequel we shall see that the space $K(D; Y)$ consists of precisely those maps which can be approximated by mappings with a finite dimensional range (see Theorem 3.1.10). Note that if $L : X \longrightarrow Y$ is linear and maps bounded sets in $X$ into relatively compact sets in $Y$, then $L \in \mathcal{L}_c(X; Y)$ (i.e., $L$ is also continuous). Finally, if $L \in \mathcal{L}_c(X; Y)$, then $L$ has a separable range.


Another notion involving compactness is given in the next definition.

DEFINITION 3.1.3 Let $X$, $Y$ be two Banach spaces and let $D$ be a subset of $X$. We say that $f : D \longrightarrow Y$ is completely continuous, if for every sequence $\{x_n\}_{n \ge 1} \subseteq D$, such that
\[ x_n \xrightarrow{w} x \quad \text{in } X, \]
for some $x \in D$, we have that
\[ f(x_n) \longrightarrow f(x) \quad \text{in } Y \]
(i.e., $f$ is sequentially continuous from $D$ with the relative weak topology of $X$ into $Y$ with the norm topology).

REMARK 3.1.4 A completely continuous linear operator $L : X \longrightarrow Y$ is also known as a Dunford-Pettis operator and is of course continuous. In general the classes of compact maps and completely continuous maps are not comparable. However, for linear operators the situation is better. We can establish that complete continuity actually lies properly between compactness and boundedness.

PROPOSITION 3.1.5 If $X$, $Y$ are two Banach spaces and $L \in \mathcal{L}_c(X; Y) = K(X; Y) \cap \mathcal{L}(X; Y)$, then $L$ is completely continuous.

PROOF

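If
\[ x_n \xrightarrow{w} x \quad \text{in } X, \]
then the sequence $\{x_n\}_{n \ge 1} \subseteq X$ is bounded. Because $L \in K(X; Y)$, we have that $\overline{\{L(x_n)\}_{n \ge 1}}^{\|\cdot\|_Y}$ is compact in $Y$. Thus we can find a subsequence $\{x_{n_k}\}_{k \ge 1}$ of $\{x_n\}_{n \ge 1}$, such that
\[ L(x_{n_k}) \longrightarrow y \quad \text{in } Y. \]
But because $L \in \mathcal{L}(X; Y)$, we also have
\[ L(x_n) \xrightarrow{w} L(x) \quad \text{in } Y. \]
Therefore $y = L(x)$. Since the same argument applies to every subsequence of $\{x_n\}_{n \ge 1}$, we conclude that
\[ L(x_n) \longrightarrow L(x) \quad \text{in } Y, \]
i.e., $L$ is completely continuous.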


The converse of the above Proposition is not in general true.

EXAMPLE 3.1.6 Recall that the Banach space $l^1$ has the Schur property, namely if
\[ x_n \xrightarrow{w} x \quad \text{in } l^1, \]
then $x_n \longrightarrow x$ in $l^1$. Using this we see that the identity map $i : l^1 \longrightarrow l^1$ is a completely continuous linear operator which is not compact.

However, if we strengthen the condition on the space $X$, the situation improves.

PROPOSITION 3.1.7 If $X$ is a reflexive Banach space, $Y$ is a Banach space, $D \subseteq X$ is a nonempty, closed set, and $f : D \longrightarrow Y$ is completely continuous, then $f \in K(D; Y)$.

PROOF Clearly $f$ is continuous. Let $B \subseteq D$ be a bounded set. We need to show that $\overline{f(B)}$ is compact in $Y$. To this end let $\{y_n\}_{n \ge 1} \subseteq f(B)$. Then
\[ y_n = f(x_n) \quad \text{with } x_n \in B \quad \forall\, n \ge 1. \]
Since $X$ is reflexive, by passing to a subsequence if necessary, we may assume that
\[ x_n \xrightarrow{w} x \quad \text{in } D. \]
Then
\[ f(x_n) \longrightarrow f(x) \quad \text{in } Y \]
and so $\overline{f(B)}$ is indeed compact in $Y$.

Combining Propositions 3.1.5 and 3.1.7, we have the following.

COROLLARY 3.1.8 If $X$ is a reflexive Banach space, $Y$ is a Banach space and $L \in \mathcal{L}(X; Y)$, then $L$ is compact if and only if $L$ is completely continuous.

REMARK 3.1.9 In both Proposition 3.1.7 and Corollary 3.1.8 the condition that $X$ is reflexive cannot be relaxed (see Example 3.1.6).
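As a small worked illustration of Corollary 3.1.8 (our own sketch, not taken from the text), consider the identity on the reflexive space $l^2$. The symbol $e_n$ for the $n$-th unit vector is notation assumed only for this example.

```latex
% Sketch: the identity on l^2 is not completely continuous.
% e_n = (0,...,0,1,0,...) with the 1 in position n (assumed notation).
\[
  \langle y, e_n \rangle_{l^2} = y_n \longrightarrow 0
  \quad \forall\, y = \{y_n\}_{n \ge 1} \in l^2,
  \qquad \text{hence} \qquad e_n \xrightarrow{w} 0 \ \text{ in } l^2 ,
\]
\[
  \text{while} \qquad \| e_n - 0 \|_{l^2} = 1 \quad \forall\, n \ge 1 .
\]
% In l^1 the same vectors do not converge weakly to 0: pairing with
% (1,1,1,...) in l^\infty = (l^1)^* gives the constant value 1.
```

So on $l^2$ the identity is neither completely continuous nor compact, in agreement with the corollary, whereas on $l^1$ (Example 3.1.6) the two notions separate.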


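The next theorem gives a characterization of compact maps defined on a bounded set, which explains why compact maps are the suitable class to extend the properties of maps between finite dimensional Banach spaces.

THEOREM 3.1.10 If $X$, $Y$ are two Banach spaces, $D \subseteq X$ is a bounded set and $f : D \longrightarrow Y$, then the following are equivalent:
(a) $f \in K(D; Y)$;
(b) given $\varepsilon > 0$ we can find a continuous, bounded map $f_\varepsilon : D \longrightarrow Y$, such that
\[ \|f(x) - f_\varepsilon(x)\|_Y < \varepsilon \quad \forall\, x \in D, \]
$f_\varepsilon(D) \subseteq \operatorname{conv} f(D)$ and $\dim \operatorname{span} f_\varepsilon(D) < +\infty$.

PROOF “(a)$\Longrightarrow$(b)”: Since $f \in K(D; Y)$, we have that the set $\overline{f(D)}$ is compact in $Y$. So given $\varepsilon > 0$, we can find $\{y_k\}_{k=1}^m \subseteq f(D)$, such that
\[ f(D) \subseteq \bigcup_{k=1}^m B_\varepsilon(y_k). \]
Let
\[ a_k(y) \stackrel{df}{=} \max\big\{ \varepsilon - \|y - y_k\|_Y,\ 0 \big\} \]
and
\[ \vartheta_k(y) \stackrel{df}{=} \frac{a_k(y)}{\sum_{k=1}^m a_k(y)} \quad \forall\, y \in f(D). \]
We define
\[ f_\varepsilon(x) \stackrel{df}{=} \sum_{k=1}^m \vartheta_k\big( f(x) \big)\, y_k \quad \forall\, x \in D. \]
Evidently the function $f_\varepsilon : D \longrightarrow Y$ is continuous, $f_\varepsilon(D) \subseteq \operatorname{span} \{y_k\}_{k=1}^m$, the set $\overline{f_\varepsilon(D)}$ is compact and
\[ \|f(x) - f_\varepsilon(x)\|_Y = \frac{1}{\sum_{k=1}^m a_k(f(x))} \Big\| \sum_{k=1}^m a_k\big( f(x) \big) \big( y_k - f(x) \big) \Big\|_Y < \frac{\sum_{k=1}^m a_k\big( f(x) \big)}{\sum_{k=1}^m a_k\big( f(x) \big)}\, \varepsilon = \varepsilon \quad \forall\, x \in D. \]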
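The final estimate can be unpacked as follows. This is a hedged restatement of the step in the proof, using only the functions $a_k$, $\vartheta_k$ and the points $y_k$ defined above; nothing new is assumed beyond the covering of $f(D)$ by the balls $B_\varepsilon(y_k)$.

```latex
% The \vartheta_k form a partition of unity on f(D):
\[
  \vartheta_k(y) \ge 0, \qquad \sum_{k=1}^m \vartheta_k(y) = 1, \qquad
  \vartheta_k(y) > 0 \iff \| y - y_k \|_Y < \varepsilon
  \qquad \forall\, y \in f(D).
\]
% Hence f(x) - f_\varepsilon(x) is a convex combination of the differences f(x) - y_k:
\[
  f(x) - f_\varepsilon(x) = \sum_{k=1}^m \vartheta_k\big(f(x)\big)\,\big(f(x) - y_k\big),
\]
% and only indices k with \|f(x) - y_k\|_Y < \varepsilon contribute:
\[
  \| f(x) - f_\varepsilon(x) \|_Y
  \le \sum_{k \,:\, \vartheta_k(f(x)) > 0} \vartheta_k\big(f(x)\big)\, \| f(x) - y_k \|_Y
  < \varepsilon \qquad \forall\, x \in D .
\]
```

The denominator $\sum_{k} a_k(f(x))$ is strictly positive precisely because every point of $f(D)$ lies in at least one ball $B_\varepsilon(y_k)$.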

“(b)$\Longrightarrow$(a)”: Let $\varepsilon_n \stackrel{df}{=} \frac{1}{n}$ and let $f_{\varepsilon_n} = f_n$ be the continuous, bounded map with finite dimensional range postulated by statement (b). Then $f$, being the uniform limit of the sequence $\{f_n\}_{n \ge 1}$ of continuous maps, is itself continuous. Also let $y = f(x)$ with $x \in D$. We have $\|y - y_n\|_Y$

1 being strongly convergent in Y , imply xn −→ x

in X,

then $f$ is proper.

PROOF First suppose that hypothesis (i) holds. Let $C \subseteq Y$ be a compact set. We need to show that $f^{-1}(C)$ is compact in $X$. Let $\{x_n\}_{n \ge 1} \subseteq f^{-1}(C)$. Then $f(x_n) = y_n \in C$ for all $n \ge 1$. Because $C \subseteq Y$ is compact, by passing to a suitable subsequence if necessary, we may assume that
\[ y_n \longrightarrow y \in C \quad \text{in } Y. \]
The weak coercivity of $f$ implies that the sequence $\{x_n\}_{n \ge 1} \subseteq X$ is bounded. So the sequence $\{u(x_n)\}_{n \ge 1} \subseteq Y$ is relatively compact and we may assume that
\[ u(x_n) \longrightarrow z \quad \text{in } Y. \]
Then
\[ g(x_n) = f(x_n) - u(x_n) \longrightarrow y - z \quad \text{in } Y. \]
Because $g$ is proper, it follows that the sequence $\{x_n\}_{n \ge 1}$ has a subsequence $\{x_{n_k}\}_{k \ge 1}$, such that
\[ x_{n_k} \longrightarrow x \quad \text{in } X. \]
Therefore $f(x_{n_k}) \longrightarrow f(x)$ and so $y = f(x)$, i.e., $x \in f^{-1}(C)$, which proves the properness of $f$.


Next suppose that hypothesis (ii) holds. Again
\[ f(x_n) = y_n \longrightarrow y \quad \text{in } Y \]
and due to the weak coercivity of $f$, the sequence $\{x_n\}_{n \ge 1} \subseteq X$ is bounded. Because $X$ is reflexive, we may assume that
\[ x_n \xrightarrow{w} x \quad \text{in } X. \]
Then hypothesis (ii) implies that $x_n \longrightarrow x$ in $X$ and so
\[ y_n = f(x_n) \longrightarrow f(x) \quad \text{in } Y, \]
hence $y = f(x)$. This proves the properness of $f$.

Compactness and properness are related as follows.

PROPOSITION 3.1.17 If $X$ is a Banach space, $D \subseteq X$ is a closed, bounded set and $f \in K(D; X)$, then $\mathrm{id}_X - f$ is proper ($\mathrm{id}_X$ is the identity operator on $X$).

PROOF Let $C \subseteq X$ be a compact set and let $\{x_n\}_{n \ge 1} \subseteq (\mathrm{id}_X - f)^{-1}(C)$. Then
\[ x_n - f(x_n) = c_n, \quad c_n \in C \quad \forall\, n \ge 1. \]
Since $C$ is compact and $f \in K(D; X)$, by passing to a suitable subsequence if necessary, we may assume that
\[ c_n \longrightarrow c \in C \quad \text{and} \quad f(x_n) \longrightarrow y \quad \text{in } X. \]
Then $x_n = c_n + f(x_n) \longrightarrow c + y = x$ in $X$ and so $f(x_n) \longrightarrow f(x)$ in $X$. Thus $y = f(x)$ and we have $c = x - f(x)$, hence
\[ x \in (\mathrm{id}_X - f)^{-1}(C), \]
which shows that $(\mathrm{id}_X - f)^{-1}(C)$ is compact.


Next we have a closer look at the space of compact linear operators $\mathcal{L}_c(X; Y)$.

PROPOSITION 3.1.18 If $X$, $Y$ are two Banach spaces, then $\mathcal{L}_c(X; Y)$ with the operator norm is a Banach space.

PROOF Clearly $\mathcal{L}_c(X; Y)$ is a linear subspace of $\mathcal{L}(X; Y)$ (see also Remark 3.1.2). Because $\mathcal{L}(X; Y)$ with the operator norm is a Banach space, it suffices to show that $\mathcal{L}_c(X; Y)$ is closed in $\mathcal{L}(X; Y)$. So let $\{L_n\}_{n \ge 1} \subseteq \mathcal{L}_c(X; Y)$ and suppose that
\[ \|L_n - L\|_{\mathcal{L}} \longrightarrow 0. \tag{3.1} \]
We need to show that $L \in \mathcal{L}_c(X; Y)$. Because of (3.1), we have that
\[ \sup_{\|x\|_X \le 1} \|L_n(x) - L(x)\|_Y \longrightarrow 0. \]
So given $\varepsilon > 0$, we can find $n_0 = n_0(\varepsilon) \ge 1$, such that
\[ \|L_n(x) - L(x)\|_Y < \frac{\varepsilon}{2} \quad \forall\, n \ge n_0,\ \|x\|_X \le 1. \]
If
\[ B_1^X \stackrel{df}{=} \big\{ x \in X : \|x\|_X < 1 \big\}, \]
then the set $L_{n_0}\big( B_1^X \big)$ is relatively compact. So we can find a finite set $F \subseteq Y$, such that
\[ L_{n_0}\big( B_1^X \big) \subseteq \bigcup_{y \in F} B_{\frac{\varepsilon}{2}}(y). \]
We claim that
\[ L\big( B_1^X \big) \subseteq \bigcup_{y \in F} B_\varepsilon(y). \]
Indeed for a given $x \in B_1^X$, we can find $y \in F$, such that
\[ \|L_{n_0}(x) - y\|_Y < \frac{\varepsilon}{2}. \]
Then
\[ \|L(x) - y\|_Y \le \|L(x) - L_{n_0}(x)\|_Y + \|L_{n_0}(x) - y\|_Y < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
which shows that $L\big( B_1^X \big)$ is totally bounded, thus relatively compact.

REMARK 3.1.19 Since the composition of two bounded linear operators, one of which is compact, is again a compact operator, we infer that $\mathcal{L}_c(X)$ is a closed two-sided ideal of the Banach algebra $\mathcal{L}(X)$. Moreover, it is clear that if $L \in \mathcal{L}_c(X)$ and $\dim X = +\infty$, then $L^{-1} \in \mathcal{L}(X)$ does not exist (i.e., $L$ is a singular operator).


The next characterization of the elements in the Banach space $\mathcal{L}_c(X; Y)$ is known as “Schauder’s theorem.” First a definition.

DEFINITION 3.1.20 If $X$, $Y$ are two Banach spaces and $L \in \mathcal{L}(X; Y)$, then its adjoint $L^* : Y^* \longrightarrow X^*$ is the linear operator given by
\[ L^*(y^*) \stackrel{df}{=} y^* \circ L \quad \forall\, y^* \in Y^*, \]
i.e.,
\[ \big\langle L^*(y^*), x \big\rangle_X = \big\langle y^*, L(x) \big\rangle_Y \quad \forall\, x \in X,\ y^* \in Y^*, \]
where by $\langle \cdot, \cdot \rangle_Z$ we denote the duality brackets for the pair $(Z, Z^*)$ for any Banach space $Z$.

REMARK 3.1.21 Clearly
\[ L^* \in \mathcal{L}(Y^*; X^*) \quad \text{and} \quad \|L\|_{\mathcal{L}} = \|L^*\|_{\mathcal{L}}. \]

So the map $L \longmapsto L^*$ is an isometric isomorphism from $\mathcal{L}(X; Y)$ into $\mathcal{L}(Y^*; X^*)$. Moreover, $(L^{-1})^* = (L^*)^{-1}$ and $L^*(Y^*)$ is closed if and only if it is $w^*$-closed.

THEOREM 3.1.22 (Schauder Theorem) If $X$, $Y$ are two Banach spaces and $L \in \mathcal{L}(X; Y)$, then $L \in \mathcal{L}_c(X; Y)$ if and only if $L^* \in \mathcal{L}_c(Y^*; X^*)$.

PROOF “$\Longrightarrow$”: We need to show that $L^*\big( B_1^{Y^*} \big)$ is relatively compact, where
\[ B_1^{Y^*} \stackrel{df}{=} \big\{ y^* \in Y^* : \|y^*\|_{Y^*} < 1 \big\}. \]
Let $\{y_n^*\}_{n \ge 1} \subseteq B_1^{Y^*}$ be a sequence. Consider the elements $y_n^*$ for $n \ge 1$, restricted on the set $\overline{L\big( B_1^X \big)}$, which is compact in $Y$. Clearly the sequence $\{y_n^*\}_{n \ge 1} \subseteq C\big( \overline{L( B_1^X )} \big)$ is bounded and equicontinuous (see Definition A.1.15). So by the Arzela-Ascoli theorem (see Theorem 2.3.2), the sequence $\{y_n^*\}_{n \ge 1} \subseteq C\big( \overline{L( B_1^X )} \big)$ is relatively compact. Hence we can find a subsequence $\{y_{n_k}^*\}_{k \ge 1}$ of $\{y_n^*\}_{n \ge 1}$, such that
\[ \sup_{x \in B_1^X} \big| \big\langle y_{n_k}^*, L(x) \big\rangle_Y - \big\langle y_{n_m}^*, L(x) \big\rangle_Y \big| \longrightarrow 0 \quad \text{as } k, m \to +\infty. \]
So
\[ \lim_{k,m \to +\infty} \big\| L^*(y_{n_k}^*) - L^*(y_{n_m}^*) \big\|_{X^*} = \lim_{k,m \to +\infty} \sup_{x \in B_1^X} \big| \big\langle L^*(y_{n_k}^*) - L^*(y_{n_m}^*), x \big\rangle_X \big| = \lim_{k,m \to +\infty} \sup_{x \in B_1^X} \big| \big\langle y_{n_k}^* - y_{n_m}^*, L(x) \big\rangle_Y \big| = 0. \]

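Therefore $\big\{ L^*(y_{n_k}^*) \big\}_{k \ge 1} \subseteq X^*$ is a Cauchy sequence and so it is convergent. This implies that the set $\overline{L^*\big( B_1^{Y^*} \big)}$ is compact in $X^*$ and so $L^* \in \mathcal{L}_c(Y^*; X^*)$.

“$\Longleftarrow$”: Let $r : X \longrightarrow X^{**}$ be the canonical embedding of $X$ into $X^{**}$. We have
\[ L^{**} r = r L \]
(on the right hand side $r$ stands for the canonical embedding of $Y$ into $Y^{**}$). So identifying $X$ with $r(X)$, we have that $L^{**}|_X = L$. From the first part of the proof and since by hypothesis $L^* \in \mathcal{L}_c(Y^*; X^*)$, we have that
\[ \overline{L^{**}\big( B_1^{X^{**}} \big)} \subseteq Y^{**} \quad \text{is compact}, \]
where
\[ B_1^{X^{**}} \stackrel{df}{=} \big\{ x^{**} \in X^{**} : \|x^{**}\|_{X^{**}} < 1 \big\}. \]
Since $\overline{L^{**}\big( B_1^X \big)}$ is a closed subset of $\overline{L^{**}\big( B_1^{X^{**}} \big)}$, it follows that
\[ \overline{L^{**}\big( B_1^X \big)} \subseteq Y^{**} \quad \text{is compact}. \]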

But as we already established $L^{**}\big( B_1^X \big) = L\big( B_1^X \big)$. So $\overline{L\big( B_1^X \big)} \subseteq Y$ is compact and we conclude that $L \in \mathcal{L}_c(X; Y)$.

DEFINITION 3.1.23 Let $X$, $Y$ be two Banach spaces and $L \in \mathcal{L}(X; Y)$. We say that $L$ is a finite rank operator (or finite dimensional operator or degenerate operator), if $\dim L(X) < +\infty$. We denote the space of all finite dimensional operators from $X$ into $Y$, equipped with the norm inherited from $\mathcal{L}(X; Y)$, by $\mathcal{L}_f(X; Y)$. If $L \in \mathcal{L}_f(X; Y)$, then $\operatorname{rank} L \stackrel{df}{=} \dim L(X)$.

REMARK 3.1.24 Clearly $\mathcal{L}_f(X; Y) \subseteq \mathcal{L}_c(X; Y)$. The inclusion is in general strict as the next example illustrates.

EXAMPLE 3.1.25 Consider the operator $L \in \mathcal{L}(l^2)$, defined by
\[ L(x) \stackrel{df}{=} \Big\{ \frac{x_n}{2^n} \Big\}_{n \ge 1} \quad \forall\, x = \{x_n\}_{n \ge 1} \in l^2. \]
We claim that $L \in \mathcal{L}_c(l^2) \setminus \mathcal{L}_f(l^2)$. Clearly $L \notin \mathcal{L}_f(l^2)$. So let us show that $L \in \mathcal{L}_c(l^2)$. We need to show that $L\big( B_1^{l^2} \big) \subseteq l^2$ is relatively compact. For a given $\varepsilon > 0$, find $n_0 = n_0(\varepsilon) \ge 1$, such that
\[ \sum_{n = n_0 + 1}^{\infty} \frac{1}{2^n} \le \varepsilon. \]

The set
\[ C \stackrel{df}{=} \Big\{ \Big( \frac{x_1}{2}, \frac{x_2}{2^2}, \ldots, \frac{x_{n_0}}{2^{n_0}}, 0, \ldots \Big) : |x_n| \le 1,\ n = 1, \ldots, n_0 \Big\} \]
is compact in $l^2$ (view it as a subset of $\mathbb{R}^{n_0}$). So we can find a finite set $F \subseteq C$, such that
\[ C \subseteq \bigcup_{v \in F} B_\varepsilon(v). \]
Let $x \in \overline{B}_1^{\,l^2}$. We have
\[ |x_n| \le 1 \quad \forall\, n \ge 1 \]
and so there exists $v \in F$, such that
\[ \sum_{n=1}^{n_0} \Big| \frac{x_n}{2^n} - v_n \Big|^2 < \varepsilon^2. \]
Then we have
\[ \Big\| \Big\{ \frac{x_n}{2^n} \Big\}_{n \ge 1} - v \Big\|_{l^2}^2 = \sum_{n=1}^{n_0} \Big| \frac{x_n}{2^n} - v_n \Big|^2 + \sum_{n = n_0 + 1}^{\infty} \Big| \frac{x_n}{2^n} \Big|^2 < \varepsilon^2 + \varepsilon^2 = 2\varepsilon^2, \]
so $L\big( B_1^{l^2} \big)$ is relatively compact in $l^2$, hence $L \in \mathcal{L}_c(l^2)$.

Making use of a finite basis to describe the finite dimensional range of $L \in \mathcal{L}_f(X; Y)$, we can easily establish the following result.

PROPOSITION 3.1.26 If $X$, $Y$ are two Banach spaces and $L \in \mathcal{L}(X; Y)$, then $L \in \mathcal{L}_f(X; Y)$ if and only if $L^* \in \mathcal{L}_f(Y^*; X^*)$. Moreover, $\operatorname{rank} L = \operatorname{rank} L^*$.

From Theorem 3.1.10 we know that every compact map can be uniformly approximated locally by maps with range in a finite dimensional space. Motivated by this fact, it is natural to ask whether
\[ \mathcal{L}_c(X; Y) = \overline{\mathcal{L}_f(X; Y)}^{\|\cdot\|_{\mathcal{L}}}. \]
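For the diagonal operator of Example 3.1.25 the answer is visibly positive. The following sketch (our own illustration) uses truncations $L_N$, a name introduced here only for this purpose, and $e_n$ for the $n$-th unit vector of $l^2$.

```latex
% Assumed notation: L_N keeps the first N coordinates of L(x) and sets the rest to 0.
\[
  L_N(x) \stackrel{df}{=} \Big( \frac{x_1}{2}, \frac{x_2}{2^2}, \ldots, \frac{x_N}{2^N}, 0, 0, \ldots \Big),
  \qquad \operatorname{rank} L_N = N .
\]
% Tail estimate for the difference:
\[
  \| (L - L_N)(x) \|_{l^2}^2 = \sum_{n = N+1}^{\infty} \frac{|x_n|^2}{4^n}
  \le \frac{1}{4^{N+1}}\, \| x \|_{l^2}^2 ,
\]
% with equality for x = e_{N+1}, so
\[
  \| L - L_N \|_{\mathcal{L}} = \frac{1}{2^{N+1}} \longrightarrow 0
  \quad \text{as } N \to +\infty .
\]
```

Thus $L$ is an operator norm limit of finite rank operators, which gives a second proof that $L \in \mathcal{L}_c(l^2)$ via Proposition 3.1.18.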

In fact for a long time this was one of the major open problems in Banach space theory. But let us formulate the problem precisely. We start with a definition.

DEFINITION 3.1.27 A Banach space $Y$ has the approximation property, if for every Banach space $X$, we have that
\[ \mathcal{L}_c(X; Y) = \overline{\mathcal{L}_f(X; Y)}^{\|\cdot\|_{\mathcal{L}}}. \]

The famous open (until 1973) problem in Banach space theory is known as the “approximation problem” and asks whether every Banach space $Y$ has the approximation property. It was settled in the negative by Enflo (1973), who found a separable, reflexive Banach space (necessarily infinite dimensional), which lacks the approximation property.

Let us also mention a few things about the spectrum of compact operators. The spaces in much of applied mathematics are actually real vector spaces, as was the case in our study so far. However, to work all the time in real spaces is mathematically inconvenient. Eigenvalue-eigenvector theory is such an instance. The theory is crippled if we insist on real vector spaces. For this reason in the next definition we consider a complex Banach space.

DEFINITION 3.1.28 Let $X$ be a complex Banach space and $L \in \mathcal{L}(X)$. The resolvent set $\varrho(L)$ of $L$ is defined by
\[ \varrho(L) \stackrel{df}{=} \big\{ \lambda \in \mathbb{C} : (\lambda\, \mathrm{id}_X - L)^{-1} \text{ exists and belongs to } \mathcal{L}(X) \big\}. \]
The operator
\[ R(\lambda) \stackrel{df}{=} (\lambda\, \mathrm{id}_X - L)^{-1} \]
is called the resolvent of $L$ at $\lambda$. The points of $\varrho(L)$ are called regular values of $L$. The set
\[ \sigma(L) \stackrel{df}{=} \mathbb{C} \setminus \varrho(L) \]
is called the spectrum of $L$. The point spectrum of $L$ is the subset $\sigma_p(L)$ of $\sigma(L)$ defined by
\[ \sigma_p(L) \stackrel{df}{=} \big\{ \lambda \in \sigma(L) : \ker (\lambda\, \mathrm{id}_X - L) \neq \{0\} \big\}. \]
The elements of $\sigma_p(L)$ are called eigenvalues of $L$ and for each $\lambda \in \sigma_p(L)$ the closed subspace $\ker (\lambda\, \mathrm{id}_X - L)$ of $X$ is the eigenspace corresponding to the eigenvalue $\lambda$, while the nonzero elements of $\ker (\lambda\, \mathrm{id}_X - L)$ are called eigenvectors of $L$.

REMARK 3.1.29 If $\dim X = +\infty$ and $L \in \mathcal{L}_c(X)$, then $0 \in \sigma(L)$. Indeed, if $L$ has a bounded inverse, then we could define an equivalent norm
\[ |||x|||_X \stackrel{df}{=} \big\| L^{-1}(x) \big\|_X \quad \text{on } X, \]
whose closed unit ball is $L\big( \overline{B}_1^X \big)$. But the latter set is relatively compact (since $L \in \mathcal{L}_c(X)$) and so $\dim X < +\infty$, a contradiction. Also, if $\dim X < +\infty$, then from linear algebra we know that the operator $\lambda\, \mathrm{id}_X - L$ is invertible (and automatically $(\lambda\, \mathrm{id}_X - L)^{-1}$ is continuous) if and only if $\lambda\, \mathrm{id}_X - L$ is injective. So in this case $\sigma(L) = \sigma_p(L)$. In general we can have that $\sigma_p(L) = \emptyset$ and $\sigma(L) \neq \emptyset$.


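EXAMPLE 3.1.30 Consider the Hilbert space $L^2\big( [0, 1] \big)$ (over the complex scalars). Let $L \in \mathcal{L}\big( L^2( [0, 1] ) \big)$ be defined by
\[ L(x)(t) \stackrel{df}{=} t\, x(t) \quad \forall\, t \in [0, 1]. \]
We claim that $\sigma_p(L) = \emptyset$. Indeed, if for some $\lambda \in \mathbb{C}$ we have
\[ \lambda x(t) = t\, x(t) \quad \text{for a.a. } t \in [0, 1], \]
then $(\lambda - t) x(t) = 0$ and so
\[ x(t) = 0 \quad \text{for a.a. } t \in [0, 1]. \]
On the other hand $[0, 1] \subseteq \sigma(L)$. To this end let $\lambda \in [0, 1]$ and take $\varepsilon > 0$, such that
\[ [\lambda, \lambda + \varepsilon] \subseteq [0, 1] \quad \text{or} \quad [\lambda - \varepsilon, \lambda] \subseteq [0, 1]. \]
To fix things, we assume that the first is true. We define
\[ x_\varepsilon(t) \stackrel{df}{=} \begin{cases} \frac{1}{\sqrt{\varepsilon}} & \text{if } t \in [\lambda, \lambda + \varepsilon], \\ 0 & \text{if } t \notin [\lambda, \lambda + \varepsilon]. \end{cases} \]
Then we have
\[ \int_0^1 x_\varepsilon(t)^2\, dt = \int_\lambda^{\lambda + \varepsilon} \frac{1}{\varepsilon}\, dt = 1 \]
and so we have $\|x_\varepsilon\|_2 = 1$. Also
\[ \big( \lambda\, \mathrm{id}_X - L \big)(x_\varepsilon)(t) = (\lambda - t) x_\varepsilon(t) \quad \forall\, t \in [0, 1]. \]
Therefore
\[ \big\| (\lambda\, \mathrm{id}_X - L)(x_\varepsilon) \big\|_2^2 = \int_\lambda^{\lambda + \varepsilon} \frac{1}{\varepsilon} (\lambda - t)^2\, dt = \frac{\varepsilon^2}{3} \]
and so
\[ \big( \lambda\, \mathrm{id}_X - L \big)(x_\varepsilon) \longrightarrow 0 \quad \text{in } L^2\big( [0, 1] \big) \text{ as } \varepsilon \searrow 0. \]
If $\lambda\, \mathrm{id}_X - L$ has a bounded inverse, then
\[ x_\varepsilon = \big( \lambda\, \mathrm{id}_X - L \big)^{-1} (\lambda\, \mathrm{id}_X - L)(x_\varepsilon) \longrightarrow 0 \quad \text{as } \varepsilon \searrow 0, \]
a contradiction to the fact that $\|x_\varepsilon\|_2 = 1$ for all $\varepsilon > 0$.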


Using the theory of analytic functions one can show the following proposition.

PROPOSITION 3.1.31 If $X$ is a complex Banach space and $L \in \mathcal{L}(X)$, then $\sigma(L) \neq \emptyset$.

REMARK 3.1.32 The result is no longer valid if we consider a real Banach space.
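A standard finite dimensional example (added here as an illustration, not taken from the text) shows what goes wrong over the reals: the rotation by a right angle in the plane.

```latex
% Rotation by 90 degrees on R^2, written as L(x_1, x_2) = (-x_2, x_1).
\[
  L = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
  \qquad
  \det\big( \lambda\, \mathrm{id} - L \big)
  = \det \begin{pmatrix} \lambda & 1 \\ -1 & \lambda \end{pmatrix}
  = \lambda^2 + 1 .
\]
% Over R: \lambda^2 + 1 > 0 for every real \lambda, so \lambda id - L is always invertible.
% Over C: the roots of \lambda^2 + 1 are the eigenvalues.
\[
  \{ \lambda \in \mathbb{R} : \lambda\, \mathrm{id} - L \text{ is not invertible} \} = \emptyset,
  \qquad
  \sigma(L) = \{ i, -i \} \ \text{ on } \mathbb{C}^2 .
\]
```

So the same matrix has empty spectrum when regarded as an operator on the real space $\mathbb{R}^2$ and a nonempty one on $\mathbb{C}^2$, in line with Proposition 3.1.31.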

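PROPOSITION 3.1.33 If $X$ is a complex Banach space and $\lambda \neq 0$ is an eigenvalue of $L \in \mathcal{L}_c(X)$, then $\dim (\lambda\, \mathrm{id}_X - L)^{-1}(0) < +\infty$ (i.e., the eigenspace corresponding to $\lambda$ is finite dimensional).

PROOF Set
\[ N_\lambda \stackrel{df}{=} (\lambda\, \mathrm{id}_X - L)^{-1}(0) \]
and let $B$ be a bounded subset of $N_\lambda$. For each $x \in B$, we have $L(x) = \lambda x$. Since $L \in \mathcal{L}_c(X)$, we have that $\overline{L(B)} \subseteq X$ is compact. Hence $\overline{\lambda B} \subseteq X$ is compact and, because $\lambda \neq 0$, so is $\overline{B}$. Since all bounded sets in $N_\lambda$ are relatively compact, it follows that $\dim N_\lambda < +\infty$.

PROPOSITION 3.1.34 If $X$ is a complex Banach space, $L \in \mathcal{L}_c(X)$ and $\varepsilon > 0$, then $L$ has only finitely many linearly independent eigenvectors corresponding to eigenvalues having absolute value larger than $\varepsilon$.

PROOF Suppose that the result is not true and let $\{x_n\}_{n \ge 1}$ be a sequence of linearly independent eigenvectors corresponding to eigenvalues $\lambda_n$ satisfying $|\lambda_n| > \varepsilon$. Set
\[ X_n \stackrel{df}{=} \operatorname{span} \{x_k\}_{k=1}^n \quad \forall\, n \ge 1. \]
Note that $L(X_n) = X_n$ and use the Riesz lemma (see Proposition A.3.15), to obtain
\[ y_n \in X_n \quad \text{with } \|y_n\| = 1, \]
such that
\[ d\big( y_n, X_{n-1} \big) \ge \frac{1}{2}. \]
Let
\[ u_n \stackrel{df}{=} \frac{y_n}{\lambda_n}. \]
Then
\[ \|u_n\|_X < \frac{1}{\varepsilon} \quad \text{and} \quad L(u_n) \in X_n. \]
Also if $y_n = \sum_{k=1}^n a_k x_k$, we have
\[ L(u_n) - y_n = \sum_{k=1}^{n} \Big( \frac{\lambda_k}{\lambda_n} - 1 \Big) a_k x_k = \sum_{k=1}^{n-1} \Big( \frac{\lambda_k}{\lambda_n} - 1 \Big) a_k x_k \in X_{n-1}. \]
If $n > m$, then
\[ L(u_m) \in X_m \subseteq X_{n-1} \quad \text{and} \quad L(u_n) - y_n \in X_{n-1}. \]
So we have
\[ \big\| L(u_n) - L(u_m) \big\| \ge d\big( L(u_n), X_{n-1} \big) = d\big( L(u_n) + y_n - L(u_n), X_{n-1} \big) = d\big( y_n, X_{n-1} \big) \ge \frac{1}{2}, \]
so the sequence $\big\{ L(u_n) \big\}_{n \ge 1}$ has no convergent subsequence, a contradiction to the fact that $L \in \mathcal{L}_c(X)$.

PROPOSITION 3.1.35 If $X$ is a Banach space, $L \in \mathcal{L}_c(X)$ and $\lambda$ is a nonzero scalar, then $R(\lambda\, \mathrm{id}_X - L)$ is closed.

PROOF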

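Without any loss of generality, we may assume that $\lambda = 1$. Set
\[ V \stackrel{df}{=} \mathrm{id}_X - L \quad \text{and} \quad N_1 \stackrel{df}{=} V^{-1}\big( \{0\} \big). \]
If $\lambda \notin \sigma_p(L)$, then $N_1 = \{0\}$. If $\lambda \in \sigma_p(L)$, then $\dim N_1 < +\infty$ (see Proposition 3.1.33). So in both cases we see that $\dim N_1 < +\infty$. Thus we can write that
\[ X = N_1 \oplus E, \]
with $E$ being a closed subspace of $X$. Let
\[ \widehat{V} \stackrel{df}{=} V|_E. \]
We have
\[ V(X) = V(E) = \widehat{V}(E) \quad \text{and} \quad \widehat{V}^{-1}\big( \{0\} \big) = V^{-1}\big( \{0\} \big) \cap E = \{0\}. \]
This shows that $\widehat{V}$ is injective from $E$ into $X$. We claim that
\[ \inf_{\substack{x \in E \\ \|x\|_X = 1}} \big\| \widehat{V}(x) \big\|_X > 0. \]
Suppose that the claim is not true. Then we can find a sequence $\{x_n\}_{n \ge 1} \subseteq E$ with $\|x_n\|_X = 1$, such that
\[ \big\| \widehat{V}(x_n) \big\|_X \searrow 0. \]
Since $L \in \mathcal{L}_c(X)$, by passing to a suitable subsequence if necessary, we may assume that
\[ L(x_n) \longrightarrow u \quad \text{in } X. \]
Then
\[ x_n = \big( \widehat{V} + L \big)(x_n) \longrightarrow u \quad \text{in } X \]
and so $u \in E$ and $\|u\|_X = 1$. Moreover,
\[ \widehat{V}(x_n) \longrightarrow \widehat{V}(u) \quad \text{in } X \]
and so
\[ \widehat{V}(u) = 0, \]
a contradiction to the fact that $\widehat{V} : E \longrightarrow X$ is injective. So the claim is true and therefore there exists $c > 0$, such that
\[ \big\| \widehat{V}(x) \big\|_X \ge c\, \|x\|_X \quad \forall\, x \in E. \]
This implies that $\widehat{V}(E)$ is closed. Indeed, let
\[ u_n \in \widehat{V}(E) \quad \forall\, n \ge 1 \]
and assume that
\[ u_n \longrightarrow u \quad \text{in } X. \]
Then
\[ u_n = \widehat{V}(x_n) \quad \text{with } x_n \in E \quad \forall\, n \ge 1. \]
We have
\[ \|x_n - x_m\|_X \le \frac{1}{c} \big\| \widehat{V}(x_n - x_m) \big\|_X = \frac{1}{c} \|u_n - u_m\|_X \longrightarrow 0 \quad \text{as } n, m \to +\infty. \]
So $x_n \longrightarrow x$ in $X$ for some $x \in E$ and $\widehat{V}(x_n) \longrightarrow \widehat{V}(x) = u \in \widehat{V}(E)$ in $X$. Finally recall that $\widehat{V}(E) = V(E)$.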


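To produce a characterization of the spectrum of a compact operator, we shall need the following straightforward auxiliary result.

LEMMA 3.1.36 If $X$ is a Banach space, $L \in \mathcal{L}_c(X)$ and $E = (\mathrm{id}_X - L)(X)$ is a proper subspace of $X$, then for every $\varepsilon > 0$ we can find $x_\varepsilon \in \overline{B}_1^X$, such that
\[ d_X\big( L(x_\varepsilon), L(E) \big) \ge 1 - \varepsilon. \]

PROOF By virtue of the Riesz lemma (see Proposition A.3.15; recall that $E$ is closed by Proposition 3.1.35), we can find $x_\varepsilon \in X$ with $\|x_\varepsilon\|_X = 1$, such that
\[ d\big( x_\varepsilon, E \big) \ge 1 - \varepsilon. \]
Note that $(\mathrm{id}_X - L)(x_\varepsilon) \in E$ and $L(E) \subseteq E$. Therefore
\[ d_X\big( L(x_\varepsilon), L(E) \big) \ge d_X\big( x_\varepsilon - (\mathrm{id}_X - L)(x_\varepsilon), E \big) = d_X\big( x_\varepsilon, E \big) \ge 1 - \varepsilon. \]

Using this lemma we can have the following remarkable property of the spectrum of a compact operator.

THEOREM 3.1.37 If $X$ is a complex Banach space, $L \in \mathcal{L}_c(X)$ and $\lambda \in \sigma(L) \setminus \{0\}$, then $\lambda \in \sigma_p(L)$.

PROOF Without any loss of generality, we may assume that $\lambda = 1$. Let
\[ V \stackrel{df}{=} \mathrm{id}_X - L \]
and suppose that
\[ V^{-1}\big( \{0\} \big) = \{0\} \]
(i.e., $\lambda = 1 \notin \sigma_p(L)$). We set
\[ E_n \stackrel{df}{=} V^n(X) \]
and note that
\[ E_n = V^n(X) = V^{n-1}\big( V(X) \big) \subseteq V^{n-1}(X) = E_{n-1} \quad \forall\, n \ge 1. \]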

From Proposition 3.1.35, we know that
\[ E_n \text{ is closed} \quad \forall\, n \ge 1. \]
Suppose that
\[ E_{n+1} \neq E_n \quad \forall\, n \ge 1. \]
Then according to Lemma 3.1.36, we can find $x_n \in \overline{B}_1^{E_n}$, such that
\[ d_X\big( L(x_n), L(E_{n+1}) \big) \ge \frac{1}{2} \quad \forall\, n \ge 1, \]
so
\[ \big\| L(x_n) - L(x_m) \big\|_X \ge \frac{1}{2} \quad \forall\, n \neq m, \]
a contradiction to the compactness of $L$. So
\[ E_{n+1} = E_n, \]
for some $n \ge 1$. We shall show that $X = E_0 = E_1$. Suppose that this is not true, i.e., $E_0 \neq E_1$. Let $m \ge 1$ be the smallest positive integer, such that
\[ E_{m-1} \neq E_m = E_{m+1}. \]
We choose $y \in E_{m-1} \setminus E_m$. Then $V(y) \in E_m = E_{m+1}$. Hence we can find $z \in E_m$, such that
\[ V(y) = V(z) \quad \text{and} \quad y \neq z, \]
since $y \notin E_m$ while $z \in E_m$. Therefore
\[ V(y - z) = 0 \quad \text{and so} \quad y - z \in \ker V, \]
a contradiction to the hypothesis that $V^{-1}\big( \{0\} \big) = \{0\}$. So $V$ is surjective and by Banach’s theorem (see Theorem A.3.6), we have that $V^{-1}$ exists and is bounded. Hence $\lambda = 1 \notin \sigma(L)$, a contradiction.
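Theorem 3.1.37 can be checked by hand on the diagonal operator $L$ of Example 3.1.25. The computation below is our own sketch; $e_n$ denotes the $n$-th unit vector of $l^2$, notation assumed only here.

```latex
% Eigenvalues of the diagonal operator L(x) = {x_n / 2^n}:
\[
  L(e_n) = \frac{1}{2^n}\, e_n \quad \forall\, n \ge 1,
  \qquad \text{so} \qquad
  \Big\{ \frac{1}{2^n} : n \ge 1 \Big\} \subseteq \sigma_p(L) .
\]
% For \lambda outside {0} \cup {2^{-n}} the inverse is again diagonal and bounded,
% because \inf_n |\lambda - 2^{-n}| > 0:
\[
  (\lambda\, \mathrm{id} - L)^{-1}(x) = \Big\{ \frac{x_n}{\lambda - 2^{-n}} \Big\}_{n \ge 1},
  \qquad
  \big\| (\lambda\, \mathrm{id} - L)^{-1} \big\|_{\mathcal{L}}
  = \sup_{n \ge 1} \frac{1}{|\lambda - 2^{-n}|} < +\infty .
\]
% Conclusion:
\[
  \sigma_p(L) = \Big\{ \frac{1}{2^n} : n \ge 1 \Big\},
  \qquad
  \sigma(L) = \{0\} \cup \sigma_p(L),
  \qquad
  \ker\Big( \frac{1}{2^n}\, \mathrm{id} - L \Big) = \operatorname{span}\{ e_n \} .
\]
```

Every nonzero point of the spectrum is an eigenvalue with a one dimensional eigenspace, the eigenvalues accumulate only at $0$, and $0 \in \sigma(L) \setminus \sigma_p(L)$ because $L$ is injective.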


From Theorem 3.1.37, Proposition 3.1.34 and the well known fact from linear algebra, which says that eigenvectors corresponding to distinct eigenvalues are linearly independent, we obtain the following characterization of the spectrum of a compact operator.

THEOREM 3.1.38 If $X$ is an infinite dimensional complex Banach space and $L \in \mathcal{L}_c(X)$, then
(a) $\sigma(L)$ is a countable compact set whose only possible limit point is $0$;
(b) $\sigma(L) = \{0\} \cup \sigma_p(L)$;
(c) if $\lambda \in \sigma_p(L) \setminus \{0\}$, then the eigenspace of $L$ corresponding to $\lambda$ is finite dimensional.

REMARK 3.1.39 The above theorem does not say that $\sigma(L)$ is the disjoint union of $\{0\}$ and $\sigma_p(L)$. For example if $X = l^2$ and $L \in \mathcal{L}_c(l^2)$ is given by
\[ L\big( \{x_n\}_{n \ge 1} \big) \stackrel{df}{=} \big( x_1, 0, 0, \ldots \big), \]
then $\lambda = 0$ is an eigenvalue of $L$ and the associated eigenspace is infinite dimensional (it has codimension equal to 1). On the other hand the operator $L \in \mathcal{L}_c(l^2)$ in Example 3.1.25 is injective and so does not have $0$ as an eigenvalue.

As we mentioned in the beginning of this section, compact operators generalize to infinite dimensions the properties of operators between finite dimensional spaces. One such property is that if $\dim X < +\infty$ and $L \in \mathcal{L}(X)$, then $L$ is surjective if and only if $L$ is injective. The result is no longer true if $\dim X = +\infty$. For example let $X = l^2$ and let
\[ L\big( \{x_n\}_{n \ge 1} \big) \stackrel{df}{=} \big( 0, x_1, x_2, \ldots \big) \]
(the right shift operator), which is injective but not surjective. However, if $L \in \mathcal{L}_c(X)$, then as we show in the sequel, the result is true for $\mathrm{id}_X - L$. We start with a definition.

DEFINITION 3.1.40 Let $X$ be a Banach space, $A \subseteq X$ and $C \subseteq X^*$. We introduce the sets $A^\perp \subseteq X^*$ (pronounced “A perp”) and ${}^\perp C \subseteq X$ (pronounced “perp C”), defined by
\[ A^\perp \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, a \rangle_X = 0 \text{ for all } a \in A \big\}, \]
\[ {}^\perp C \stackrel{df}{=} \big\{ x \in X : \langle c^*, x \rangle_X = 0 \text{ for all } c^* \in C \big\}. \]


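REMARK 3.1.41 The sets $A^{\perp}$ and ${}^{\perp}C$ are closed subsets of $X^*$ and $X$ respectively and
$${}^{\perp}\big(A^{\perp}\big) = \overline{\mathrm{span}}\, A \quad \text{and} \quad \big({}^{\perp}C\big)^{\perp} = \overline{\mathrm{span}}^{\,w^*} C.$$
Also if $E$ is a closed subspace of $X$, then
$$\big(X/E\big)^* = E^{\perp} \quad \text{and} \quad X^*/E^{\perp} = E^*$$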

(see, e.g., Beauzamy (1982, pp. 41 and 43)).

LEMMA 3.1.42 If $X, Y$ are two Banach spaces and $L \in \mathcal{L}(X; Y)$, then
$$\ker L = {}^{\perp}\big(L^*(Y^*)\big) \quad \text{and} \quad \ker L^* = L(X)^{\perp}.$$

PROOF Recall that $Y^*$ is a separating family of functions on $Y$. So $x \in \ker L$ if and only if
$$\langle L^*(y^*), x \rangle_X = \langle y^*, L(x) \rangle_Y = 0 \quad \forall\, y^* \in Y^*,$$
that is, if and only if $x \in {}^{\perp}\big(L^*(Y^*)\big)$. In a similar fashion, we have that $y^* \in \ker L^*$ if and only if
$$\langle y^*, L(x) \rangle_Y = \langle L^*(y^*), x \rangle_X = 0 \quad \forall\, x \in X,$$
that is, if and only if $y^* \in L(X)^{\perp}$.

LEMMA 3.1.43 If $X$ is a Banach space, $L \in \mathcal{L}_c(X)$ and $\lambda$ is a nonzero scalar, then $R(\lambda\mathrm{id}_X - L) = X$ implies that $\ker(\lambda\mathrm{id}_X - L) = \{0\}$, i.e., if $\lambda\mathrm{id}_X - L$ is surjective, then it is injective.

PROOF Without any loss of generality we may assume that $\lambda = 1$. Recall that $L$ commutes with $(\mathrm{id}_X - L)^n$ (consider the polynomial expansion of $(\mathrm{id}_X - L)^n$). So
$$L\big(\ker(\mathrm{id}_X - L)^n\big) \subseteq \ker(\mathrm{id}_X - L)^n \quad \forall\, n \geq 1.$$
Suppose that although $(\mathrm{id}_X - L)(X) = X$, the operator $\mathrm{id}_X - L$ is not injective. Note that
$$(\mathrm{id}_X - L)^n(X) = X \quad \forall\, n \geq 1$$


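and so $(\mathrm{id}_X - L)^{n+1}$ maps some elements of $X$ to $0$ that $(\mathrm{id}_X - L)^n$ does not. Hence
$$\ker(\mathrm{id}_X - L)^n \subsetneq \ker(\mathrm{id}_X - L)^{n+1}.$$
Using Riesz lemma (see Proposition A.3.15), we can find $x_n \in \ker(\mathrm{id}_X - L)^{n+1}$ with $\|x_n\|_X = 1$, such that
$$\|x_n - y\|_X \geq \tfrac{1}{2} \quad \forall\, n \geq 1,\ y \in \ker(\mathrm{id}_X - L)^n.$$
If $n > m$, we have
$$(\mathrm{id}_X - L)(x_n) + L(x_m) \in \ker(\mathrm{id}_X - L)^n$$
and so
$$\|L(x_n) - L(x_m)\|_X = \big\|x_n - \big((\mathrm{id}_X - L)(x_n) + L(x_m)\big)\big\|_X \geq \tfrac{1}{2},$$
a contradiction to the fact that $L \in \mathcal{L}_c(X)$.

PROPOSITION 3.1.44 If $X$ is a Banach space, $L \in \mathcal{L}_c(X)$ and $\lambda$ is a nonzero scalar, then
$$\dim \ker(\lambda\mathrm{id}_X - L) = \mathrm{codim}\, R(\lambda\mathrm{id}_X - L)$$
(recall that $\mathrm{codim}\, R(\lambda\mathrm{id}_X - L) = \dim\big(X/R(\lambda\mathrm{id}_X - L)\big)$).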

PROOF Without any loss of generality, we may assume that $\lambda = 1$. From Remark 3.1.41 and Lemma 3.1.42, we have that
$$\big(X/R(\mathrm{id}_X - L)\big)^* = R(\mathrm{id}_X - L)^{\perp} = \ker\big(\mathrm{id}_X^* - L^*\big). \tag{3.2}$$
From Theorem 3.1.22, we know that $L^* \in \mathcal{L}_c(X^*)$ and so Proposition 3.1.33 implies that
$$\dim \ker\big(\mathrm{id}_X^* - L^*\big) < +\infty.$$
A finite dimensional Banach space has the same dimension as its dual. So from (3.2), we have that
$$\mathrm{codim}\, R(\mathrm{id}_X - L) = \dim \ker(\mathrm{id}_X^* - L^*). \tag{3.3}$$
Because $L^* \in \mathcal{L}(X^*)$, from Proposition 3.1.35, we have that $R(\mathrm{id}_X^* - L^*)$ is closed, hence $w^*$-closed too (see Remark 3.1.21). So from Remark 3.1.41 and Lemma 3.1.42, we have
$$\ker(\mathrm{id}_X - L)^{\perp} = \big({}^{\perp}R(\mathrm{id}_X^* - L^*)\big)^{\perp} = \overline{R\big(\mathrm{id}_X^* - L^*\big)}^{\,w^*} = R\big(\mathrm{id}_X^* - L^*\big),$$
so
$$X^*/R(\mathrm{id}_X^* - L^*) = X^*/\ker(\mathrm{id}_X - L)^{\perp} = \big[\ker(\mathrm{id}_X - L)\big]^*.$$
Using as before the fact that a finite dimensional Banach space has the same dimension as its dual, we obtain
$$\mathrm{codim}\, R(\mathrm{id}_X^* - L^*) = \dim \ker(\mathrm{id}_X - L). \tag{3.4}$$

Suppose that $\dim \ker(\mathrm{id}_X - L) > \mathrm{codim}\, R(\mathrm{id}_X - L)$. Then we can find a closed subspace $E$ of $X$, such that
$$X = R(\mathrm{id}_X - L) \oplus E.$$
Let $P_E$ be the projection operator onto $E$. Then $\ker P_E = R(\mathrm{id}_X - L)$. We have that
$$X/\ker P_E = X/R(\mathrm{id}_X - L) = E$$
and so
$$\mathrm{codim}\, R(\mathrm{id}_X - L) = \dim E.$$
Therefore there is a bounded linear operator $T$ which is not injective and maps $\ker(\mathrm{id}_X - L)$ onto $E$. Then
$$T \in \mathcal{L}_c\big(\ker(\mathrm{id}_X - L); X\big).$$
Let $F$ be a closed subspace of $X$, such that
$$X = \ker(\mathrm{id}_X - L) \oplus F$$
and $P_0$ the projection operator onto $\ker(\mathrm{id}_X - L)$ with kernel $F$. Set
$$G \stackrel{df}{=} L + T P_0.$$
Evidently $G \in \mathcal{L}_c(X)$ and we have
$$(\mathrm{id}_X - G)(X) = \big((\mathrm{id}_X - L) - T P_0\big)\big(\ker(\mathrm{id}_X - L)\big) + \big((\mathrm{id}_X - L) - T P_0\big)(F)$$
$$= E + (\mathrm{id}_X - L)(F) = E + (\mathrm{id}_X - L)(X) = X,$$


so, from Lemma 3.1.43, we have that $\mathrm{id}_X - G$ is injective. But there is a nonzero $u \in \ker T \subseteq \ker(\mathrm{id}_X - L)$, such that
$$(\mathrm{id}_X - G)(u) = (\mathrm{id}_X - L)(u) - T P_0(u) = 0,$$
a contradiction. So, it follows that
$$\dim \ker(\mathrm{id}_X - L) \leq \mathrm{codim}\, R(\mathrm{id}_X - L). \tag{3.5}$$
Moreover, because $L^* \in \mathcal{L}_c(X^*)$, we also have
$$\dim \ker(\mathrm{id}_X^* - L^*) \leq \mathrm{codim}\, R(\mathrm{id}_X^* - L^*). \tag{3.6}$$
From (3.3), (3.4), (3.5) and (3.6), we conclude that
$$\dim \ker(\mathrm{id}_X - L) = \mathrm{codim}\, R(\mathrm{id}_X - L).$$

A byproduct of the above proof is the following result.

COROLLARY 3.1.45 If $X$ is a Banach space, $L \in \mathcal{L}_c(X)$ and $\lambda$ is a nonzero scalar, then
$$\dim \ker(\lambda\mathrm{id}_X - L) = \dim \ker(\lambda\mathrm{id}_X^* - L^*).$$

Clearly Proposition 3.1.44 permits the improvement of Lemma 3.1.43. This is done in the next theorem which summarizes all the above properties of a compact operator.

THEOREM 3.1.46 If $X$ is a Banach space, $L \in \mathcal{L}_c(X)$ and $\lambda$ is a nonzero scalar, then
(a) $\ker(\lambda\mathrm{id}_X - L)$ is finite dimensional;
(b) $R(\lambda\mathrm{id}_X - L)$ is closed and $R(\lambda\mathrm{id}_X - L) = \ker(\lambda\mathrm{id}_X^* - L^*)^{\perp}$;
(c) $\ker(\lambda\mathrm{id}_X - L) = \{0\}$ if and only if $R(\lambda\mathrm{id}_X - L) = X$;
(d) $\dim \ker(\lambda\mathrm{id}_X - L) = \dim \ker(\lambda\mathrm{id}_X^* - L^*)$.

REMARK 3.1.47 Statement (c) expresses the fact that $\lambda\mathrm{id}_X - L$ is injective if and only if $\lambda\mathrm{id}_X - L$ is surjective, a well known property of linear operators between finite dimensional spaces.
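The finite dimensional case that Remark 3.1.47 alludes to can be checked by hand or by machine. The following is only a sketch under stated assumptions: the matrix `L` and the helper names `rank`, `transpose`, `shifted` and `kernel_dims` are illustrative choices that do not come from the text, and exact rational arithmetic is used so that the ranks are not affected by rounding. For several scalars $\lambda$ it computes $\dim\ker(\lambda I - L)$ and $\dim\ker(\lambda I - L^{T})$, which should agree as in statement (d).

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a rectangular matrix via Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def shifted(matrix, lam):
    """Return lam * I - matrix for a square matrix."""
    n = len(matrix)
    return [[(lam if i == j else 0) - matrix[i][j] for j in range(n)]
            for i in range(n)]


# Hypothetical example operator on a 3-dimensional space (not from the text).
L = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 5]]


def kernel_dims(lam):
    """(dim ker(lam I - L), dim ker(lam I - L^T)) by rank-nullity."""
    A = shifted(L, lam)
    n = len(A)
    return n - rank(A), n - rank(transpose(A))


for lam in (1, 2, 5):
    print(lam, kernel_dims(lam))
```

In this example the two kernel dimensions coincide for every tested $\lambda$, and $\lambda I - L$ is injective exactly when it has full rank, i.e., when it is surjective.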


This leads us to the Fredholm alternative theorem, an important tool in the study of integral equations and boundary value problems.

THEOREM 3.1.48 (Fredholm Alternative Theorem) If $X$ is a Banach space, $L \in \mathcal{L}_c(X)$ and $\lambda$ is a nonzero scalar, then one and only one of the following two alternatives holds:
(a) for every $u \in X$, the equation $(\lambda\mathrm{id}_X - L)(x) = u$ has a unique solution $x \in X$; or
(b) the homogeneous equation $(\lambda\mathrm{id}_X - L)(x) = 0$ has $N$ linearly independent solutions with $N \geq 1$; in this case the nonhomogeneous equation $(\lambda\mathrm{id}_X - L)(x) = u$ has a solution if and only if $u$ verifies $N$ conditions of orthogonality, i.e., $u \in \ker(\lambda\mathrm{id}_X^* - L^*)^{\perp}$.

Next let us say a few words about the spectrum of a self-adjoint, compact operator on a Hilbert space.

DEFINITION 3.1.49 Let $H$ be a Hilbert space and $L \in \mathcal{L}(H)$. We say that $L$ is self-adjoint (or hermitian) if and only if $L^* = L$, i.e.,
$$\big(L(x), y\big)_H = \big(x, L(y)\big)_H \quad \forall\, x, y \in H.$$

REMARK 3.1.50 If $H$ is a complex Hilbert space and $L \in \mathcal{L}(H)$ is a self-adjoint operator, then
$$\big(L(x), x\big)_H = \big(x, L(x)\big)_H = \overline{\big(L(x), x\big)_H},$$
hence
$$\big(L(x), x\big)_H \in \mathbb{R}.$$
Also one can check that
$$\|L^n\|_{\mathcal{L}} = \|L\|_{\mathcal{L}}^n \quad \forall\, n \geq 1$$
and
$$\|L\|_{\mathcal{L}} = \sup_{\|x\|_H \leq 1} \big|\big(L(x), x\big)_H\big|.$$


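PROPOSITION 3.1.51 If $H$ is a Hilbert space and $L \in \mathcal{L}(H)$ is a self-adjoint operator, then all eigenvalues of $L$ are real and eigenvectors corresponding to different eigenvalues are orthogonal.

PROOF Let $\lambda$ be an eigenvalue with an eigenvector $x$. We have
$$\big(L(x), x\big)_H = \big(\lambda x, x\big)_H,$$
so, from Remark 3.1.50, we have
$$\lambda = \frac{\big(L(x), x\big)_H}{\|x\|_H^2} \in \mathbb{R}.$$
Also if $\mu$ is another eigenvalue with an eigenvector $y$, then, since $\mu$ is real, we have
$$\big(L(x), y\big)_H = \lambda\, (x, y)_H \quad \text{and} \quad \big(x, L(y)\big)_H = \mu\, (x, y)_H.$$
Since $L$ is self-adjoint, it follows that
$$(\lambda - \mu)\, (x, y)_H = 0.$$
Because $\lambda \neq \mu$, we conclude that $(x, y)_H = 0$.

PROPOSITION 3.1.52 If $H$ is a Hilbert space and $L \in \mathcal{L}(H)$ is a self-adjoint operator, then $\lambda \in \sigma(L)$ if and only if
$$\inf_{\|x\|_H = 1} \big\|(\lambda\mathrm{id}_X - L)(x)\big\|_H = 0.$$

PROOF "$\Longleftarrow$": If $\lambda \in \varrho(L)$, then
$$(\lambda\mathrm{id}_X - L)^{-1} \in \mathcal{L}(H)$$
and so for $x \in H$ with $\|x\|_H = 1$, we have
$$1 = \|x\|_H = \big\|(\lambda\mathrm{id}_X - L)^{-1}(\lambda\mathrm{id}_X - L)(x)\big\|_H \leq \big\|(\lambda\mathrm{id}_X - L)^{-1}\big\|_{\mathcal{L}}\, \big\|(\lambda\mathrm{id}_X - L)(x)\big\|_H,$$
so
$$\inf_{\|x\|_H = 1} \big\|(\lambda\mathrm{id}_X - L)(x)\big\|_H \geq \big\|(\lambda\mathrm{id}_X - L)^{-1}\big\|_{\mathcal{L}}^{-1} > 0.$$
"$\Longrightarrow$": We proceed by contradiction. So suppose that
$$\inf_{\|x\|_H = 1} \big\|(\lambda\mathrm{id}_X - L)(x)\big\|_H = c > 0.$$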


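Then by positive homogeneity, we have
$$\big\|(\lambda\mathrm{id}_X - L)(x)\big\|_H \geq c\, \|x\|_H \quad \forall\, x \in H.$$
Hence $\lambda\mathrm{id}_X - L$ is injective. If we show that $\lambda\mathrm{id}_X - L$ is also surjective, then by Banach's theorem (see Theorem A.3.6), we have that $(\lambda\mathrm{id}_X - L)^{-1} \in \mathcal{L}(H)$, a contradiction to the fact that $\lambda \in \sigma(L)$. We establish the surjectivity of $\lambda\mathrm{id}_X - L$ in two steps. First we show that $(\lambda\mathrm{id}_X - L)(H)$ is dense in $H$ and then that $(\lambda\mathrm{id}_X - L)(H)$ is closed in $H$.

Suppose that $(\lambda\mathrm{id}_X - L)(H)$ is not dense in $H$. Then we can find $u \in H \setminus \{0\}$, such that
$$\big(u, (\lambda\mathrm{id}_X - L)(x)\big)_H = 0 \quad \forall\, x \in H.$$
Since $L$ is self-adjoint, we have that
$$0 = \big(u, (\lambda\mathrm{id}_X - L)(x)\big)_H = \big((\overline{\lambda}\,\mathrm{id}_X - L)(u), x\big)_H \quad \forall\, x \in H,$$
hence
$$\big(\overline{\lambda}\,\mathrm{id}_X - L\big)(u) = 0.$$
This means that $\overline{\lambda} \in \sigma_p(L)$. But from Proposition 3.1.51, we know that $\sigma_p(L) \subseteq \mathbb{R}$. Hence $\overline{\lambda} = \lambda$. Therefore
$$\big(\lambda\mathrm{id}_X - L\big)(u) = 0, \quad u \neq 0,$$
a contradiction to the fact that
$$\big\|(\lambda\mathrm{id}_X - L)(u)\big\|_H \geq c\, \|u\|_H > 0.$$
This proves that $(\lambda\mathrm{id}_X - L)(H)$ is dense in $H$.

Now we show that $(\lambda\mathrm{id}_X - L)(H)$ is closed in $H$. To this end let
$$(\lambda\mathrm{id}_X - L)(x_n) \longrightarrow y \quad \text{in } H.$$
Then we have
$$\big\|(\lambda\mathrm{id}_X - L)(x_n - x_m)\big\|_H \longrightarrow 0 \quad \text{as } n, m \to +\infty.$$
Since
$$\big\|(\lambda\mathrm{id}_X - L)(x_n - x_m)\big\|_H \geq c\, \|x_n - x_m\|_H,$$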


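we have that
$$\|x_n - x_m\|_H \longrightarrow 0 \quad \text{as } n, m \to +\infty.$$
Therefore $x_n \longrightarrow x \in H$ and so
$$(\lambda\mathrm{id}_X - L)(x_n) \longrightarrow (\lambda\mathrm{id}_X - L)(x),$$
hence $y = (\lambda\mathrm{id}_X - L)(x)$, which proves the closedness of $(\lambda\mathrm{id}_X - L)(H)$. We conclude that
$$\big(\lambda\mathrm{id}_X - L\big)(H) = H$$
and so by Banach's theorem, we have $(\lambda\mathrm{id}_X - L)^{-1} \in \mathcal{L}(H)$, a contradiction to the fact that $\lambda \in \sigma(L)$.

In Proposition 3.1.51, we saw that if $L \in \mathcal{L}(H)$ is a self-adjoint operator, then $\sigma_p(L) \subseteq \mathbb{R}$. Next we show that in fact the whole spectrum is real.

PROPOSITION 3.1.53 If $H$ is a Hilbert space and $L \in \mathcal{L}(H)$ is a self-adjoint operator, then $\sigma(L) \subseteq \mathbb{R}$.

PROOF Let $\lambda = a + ic$ with $c \neq 0$. We show that $\lambda \in \varrho(L)$. For every $x \in H$, we have
$$\big((\lambda\mathrm{id}_X - L)(x), x\big)_H - \big(x, (\lambda\mathrm{id}_X - L)(x)\big)_H$$
$$= \lambda \|x\|_H^2 - \big(L(x), x\big)_H - \overline{\lambda}\, \|x\|_H^2 + \big(x, L(x)\big)_H$$
$$= \big(\lambda - \overline{\lambda}\big) \|x\|_H^2 = 2ic\, \|x\|_H^2.$$
So for every $x \in H$, we have
$$2|c|\, \|x\|_H^2 = \Big|\big((\lambda\mathrm{id}_X - L)(x), x\big)_H - \big(x, (\lambda\mathrm{id}_X - L)(x)\big)_H\Big|$$
$$\leq \Big|\big((\lambda\mathrm{id}_X - L)(x), x\big)_H\Big| + \Big|\big(x, (\lambda\mathrm{id}_X - L)(x)\big)_H\Big| \leq 2\, \big\|(\lambda\mathrm{id}_X - L)(x)\big\|_H\, \|x\|_H,$$
hence
$$|c|\, \|x\|_H \leq \big\|(\lambda\mathrm{id}_X - L)(x)\big\|_H.$$
Invoking Proposition 3.1.52, we infer that $\lambda \in \varrho(L)$. Therefore $\sigma(L) \subseteq \mathbb{R}$.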
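The key estimate $|c|\,\|x\|_H \leq \|(\lambda\mathrm{id}_X - L)(x)\|_H$ from the proof above can be observed numerically on $\mathbb{C}^2$. The sketch below is an illustration only: the symmetric matrix `L`, the value of `lam` and the sampling scheme are assumptions made for the example, and sampling unit vectors gives evidence rather than a proof.

```python
import cmath
import math

# Hypothetical real symmetric matrix, hence self-adjoint on C^2 (not from the text).
L = [[2.0, 1.0],
     [1.0, 2.0]]


def apply_shifted(lam, x):
    """Return (lam * I - L) x for a complex 2-vector x."""
    return [lam * x[i] - sum(L[i][j] * x[j] for j in range(2)) for i in range(2)]


def norm(x):
    return math.sqrt(sum(abs(v) ** 2 for v in x))


def smallest_sampled_norm(lam, samples=200):
    """Smallest value of ||(lam I - L) x|| over a sample of unit vectors x."""
    best = float("inf")
    for k in range(samples):
        t = math.pi * k / samples
        for m in range(8):
            phase = cmath.exp(1j * math.pi * m / 4)
            x = [math.cos(t), phase * math.sin(t)]  # unit vector in C^2
            best = min(best, norm(apply_shifted(lam, x)))
    return best


lam = 3.0 + 0.25j  # lam = a + ic with c = 0.25
ratio = smallest_sampled_norm(lam)
print(ratio, abs(lam.imag))
```

In this example the bound is attained, because $3$ is an eigenvalue of the chosen matrix and the sample contains the corresponding unit eigenvector $(1, 1)/\sqrt{2}$.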


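We can say more about the position of $\sigma(L)$ in the real line $\mathbb{R}$ when $L \in \mathcal{L}(H)$ is self-adjoint.

PROPOSITION 3.1.54 If $H$ is a Hilbert space and $L \in \mathcal{L}(H)$ is a self-adjoint operator, then $\sigma(L) \subseteq [m, M]$, where
$$m \stackrel{df}{=} \inf_{\|x\|_H = 1} \big(L(x), x\big)_H \quad \text{and} \quad M \stackrel{df}{=} \sup_{\|x\|_H = 1} \big(L(x), x\big)_H.$$
Moreover, $m, M \in \sigma(L)$.

PROOF From Proposition 3.1.53, we know that $\sigma(L) \subseteq \mathbb{R}$. Let $r > 0$ and
$$\lambda \stackrel{df}{=} M + r.$$
Then for every $x \in H$ with $\|x\|_H = 1$, we have
$$\big((\lambda\mathrm{id}_X - L)(x), x\big)_H = \lambda \|x\|_H^2 - \big(L(x), x\big)_H \geq \lambda \|x\|_H^2 - M \|x\|_H^2 = r\, \|x\|_H^2 = r,$$
so
$$r \leq \big\|(\lambda\mathrm{id}_X - L)(x)\big\|_H.$$
Invoking Proposition 3.1.52, we infer that $\lambda \in \varrho(L)$. Similarly if $\lambda = m - r$. So $\sigma(L) \subseteq [m, M]$.

Next we show that $M \in \sigma(L)$. Note that
$$\sigma(L - \mu\,\mathrm{id}_X) = \sigma(L) - \mu \quad \forall\, \mu \in \mathbb{R}.$$
So by replacing $L$ with $L + \mu\,\mathrm{id}_X$ with $\mu > 0$ sufficiently large, we may assume that $0 \leq m \leq M$. Then $M = \|L\|_{\mathcal{L}}$ (see Remark 3.1.50). Let $\{x_n\}_{n \geq 1} \subseteq H$ be a sequence, such that
$$\|x_n\|_H = 1 \quad \forall\, n \geq 1 \quad \text{and} \quad \big(L(x_n), x_n\big)_H \nearrow M = \|L\|_{\mathcal{L}}.$$
Then we have
$$\big\|(M\mathrm{id}_X - L)(x_n)\big\|_H^2 = \big(M x_n - L(x_n), M x_n - L(x_n)\big)_H$$
$$= M^2 \|x_n\|_H^2 + \big\|L(x_n)\big\|_H^2 - 2M \big(L(x_n), x_n\big)_H \leq M^2 + M^2 - 2M \big(L(x_n), x_n\big)_H,$$
hence
$$\big\|(M\mathrm{id}_X - L)(x_n)\big\|_H \longrightarrow 0.$$
So Proposition 3.1.52 implies that $M \in \sigma(L)$. The proof that $m \in \sigma(L)$ is similar.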
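Proposition 3.1.54 can be illustrated on $\mathbb{R}^2$, where the quadratic form $(L(x), x)_H$ restricted to the unit circle can simply be sampled. This is a sketch under assumptions: the matrix `L` and the helper names are hypothetical, and for this particular matrix the quadratic form equals $2 + \sin 2t$ at $x = (\cos t, \sin t)$, so one expects $m = 1$ and $M = 3$.

```python
import math

# Hypothetical symmetric matrix (not from the text).
L = [[2.0, 1.0],
     [1.0, 2.0]]


def rayleigh(t):
    """Value of (L x, x) at the unit vector x = (cos t, sin t)."""
    x = (math.cos(t), math.sin(t))
    Lx = (L[0][0] * x[0] + L[0][1] * x[1],
          L[1][0] * x[0] + L[1][1] * x[1])
    return Lx[0] * x[0] + Lx[1] * x[1]


# Sample the unit circle; half a turn suffices because x and -x give the same value.
samples = [rayleigh(math.pi * k / 1000) for k in range(1000)]
m, M = min(samples), max(samples)


def symmetric_eigenvalues(a, b, d):
    """Eigenvalues of [[a, b], [b, d]] from the characteristic polynomial."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius


low, high = symmetric_eigenvalues(2.0, 1.0, 2.0)
print((m, M), (low, high))
```

In finite dimensions the spectrum consists of eigenvalues only, so the sampled extremes of the quadratic form coincide (up to rounding) with the smallest and largest eigenvalues, in agreement with $\sigma(L) \subseteq [m, M]$ and $m, M \in \sigma(L)$.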


PROPOSITION 3.1.55 If $H$ is a Hilbert space and $L \in \mathcal{L}_c(H)$ is a self-adjoint operator, then $\sigma_p(L) \neq \emptyset$.

PROOF If $L = 0$, then $\lambda = 0$ is an eigenvalue of $L$. If $L \neq 0$, then by Proposition 3.1.54, at least one of $m$ or $M$ is a nonzero element of $\sigma(L)$. Invoking Theorem 3.1.38(b), we conclude that $\sigma_p(L) \neq \emptyset$.

PROPOSITION 3.1.56 If $H$ is a Hilbert space and $L \in \mathcal{L}_c(H)$ is a self-adjoint operator, then there is an orthonormal basis of $H$ consisting of eigenvectors of $L$.

PROOF For $\lambda \in \sigma_p(L)$ let
$$E(\lambda) \stackrel{df}{=} (\lambda\mathrm{id}_X - L)^{-1}(0)$$
(the eigenspace corresponding to the eigenvalue $\lambda$). Let $B(\lambda)$ be an orthonormal basis for each finite dimensional eigenspace $E(\lambda)$. By virtue of Proposition 3.1.51, we have that
$$\bigcup_{\lambda \in \sigma_p(L)} B(\lambda) \ \text{is an orthonormal set in } H.$$
Suppose that
$$\overline{\mathrm{span}} \bigcup_{\lambda \in \sigma_p(L)} B(\lambda) \neq H.$$
Then set
$$F \stackrel{df}{=} \Big[\, \overline{\mathrm{span}} \bigcup_{\lambda \in \sigma_p(L)} B(\lambda) \Big]^{\perp}.$$
Clearly $L(F) \subseteq F$. So $L|_F$ has an eigenvalue (see Proposition 3.1.55). Let $u \in F \setminus \{0\}$ be an eigenvector of $L|_F$. Evidently $u$ is an eigenvector of $L$ and so
$$F \cap \overline{\mathrm{span}} \bigcup_{\lambda \in \sigma_p(L)} B(\lambda) \supsetneq \{0\},$$
a contradiction. Therefore
$$\overline{\mathrm{span}} \bigcup_{\lambda \in \sigma_p(L)} B(\lambda) = H.$$


Now we can state the so-called spectral theorem for compact self-adjoint operators on a separable Hilbert space.

THEOREM 3.1.57 (Spectral Theorem) If $H$ is an infinite dimensional separable Hilbert space and $L \in \mathcal{L}_c(H)$ is a self-adjoint operator, then there exists an orthonormal basis $\{e_n\}_{n \geq 1}$ of $H$ formed by eigenvectors of $L$, such that
$$L(x) = \sum_{n=1}^{\infty} \lambda_n (x, e_n)_H\, e_n \quad \forall\, x \in H,$$
with $\{\lambda_n\}_{n \geq 1}$ being the eigenvalues corresponding to $\{e_n\}_{n \geq 1}$.

PROOF From Proposition 3.1.56, we know that there exists an orthonormal basis of $H$ consisting of eigenvectors of $L$. This orthonormal basis is countable, because $H$ is separable. Denote it by $\{e_n\}_{n \geq 1}$. For every $x \in H$ and $m > k \geq 1$ and since
$$|\lambda_n| \leq \sup_{\|x\|_H \leq 1} \big|\big(L(x), x\big)_H\big| = \|L\|_{\mathcal{L}}$$
(see Remark 3.1.50), we have
$$\Big\| \sum_{n=k}^{m} \lambda_n (x, e_n)_H\, e_n \Big\|_H^2 = \sum_{n=k}^{m} \big|\lambda_n (x, e_n)_H\big|^2 \leq \|L\|_{\mathcal{L}}^2 \sum_{n=k}^{m} \big|(x, e_n)_H\big|^2 \longrightarrow 0 \quad \text{as } k, m \to +\infty,$$
so $\sum_{n=1}^{\infty} \lambda_n (x, e_n)_H\, e_n$ is convergent in $H$.

Moreover, if $x \in H$ with $\|x\|_H \leq 1$, then for every $m \geq 1$, we have
$$\Big\| \sum_{n=1}^{m} \lambda_n (x, e_n)_H\, e_n \Big\|_H^2 = \sum_{n=1}^{m} |\lambda_n|^2 \big|(x, e_n)_H\big|^2 \leq \|L\|_{\mathcal{L}}^2 \sum_{n=1}^{m} \big|(x, e_n)_H\big|^2$$
$$\leq \|L\|_{\mathcal{L}}^2 \sum_{n=1}^{\infty} \big|(x, e_n)_H\big|^2 = \|L\|_{\mathcal{L}}^2\, \|x\|_H^2. \tag{3.7}$$
Therefore, if we define
$$T(x) \stackrel{df}{=} \sum_{n=1}^{\infty} \lambda_n (x, e_n)_H\, e_n,$$
from (3.7), we see that $T \in \mathcal{L}(H)$. Note that $L(e_n) = T(e_n)$ for all $n \geq 1$. So by linearity and continuity, we conclude that $L = T$.
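The expansion in Theorem 3.1.57 has an elementary finite dimensional analogue that can be verified directly. The sketch below is only an illustration under assumptions: the matrix `L`, its eigenpairs and the test vector `x` are chosen for the example and are not taken from the text. It compares $L(x)$ with the sum $\sum_n \lambda_n (x, e_n) e_n$ built from an orthonormal eigenbasis.

```python
import math

# Hypothetical symmetric matrix with eigenvalues 1 and 3 (not from the text).
L = [[2.0, 1.0],
     [1.0, 2.0]]

s = 1 / math.sqrt(2)
# Orthonormal eigenvectors: L(s, -s) = 1 * (s, -s) and L(s, s) = 3 * (s, s).
eigenpairs = [(1.0, (s, -s)), (3.0, (s, s))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def apply(matrix, x):
    return tuple(dot(row, x) for row in matrix)


def spectral_apply(x):
    """Evaluate sum_n lambda_n (x, e_n) e_n over the eigenpairs."""
    out = [0.0, 0.0]
    for lam, e in eigenpairs:
        coeff = lam * dot(x, e)
        out = [o + coeff * ei for o, ei in zip(out, e)]
    return tuple(out)


x = (0.3, -1.7)
direct = apply(L, x)
expanded = spectral_apply(x)
print(direct, expanded)
```

Both computations should agree up to rounding, which is the finite dimensional content of the identity $L = T$ established in the proof.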


Before passing to Fredholm operators, let us mention two more results on compact maps.

PROPOSITION 3.1.58 If $X, Y$ are two Banach spaces, $U \subseteq X$ is an open set, $f \in \mathcal{K}(U; Y)$ and it is Fréchet differentiable, then
$$f'(x) \in \mathcal{L}_c(X; Y) \quad \forall\, x \in U.$$

PROOF Suppose that $f'(x)$ is not compact. Then we can find a sequence $\{u_n\}_{n \geq 1} \subseteq X$ with
$$\|u_n\|_X \leq 1 \quad \forall\, n \geq 1$$
and $\varepsilon > 0$, such that
$$\big\|f'(x) u_n - f'(x) u_m\big\|_Y \geq \varepsilon \quad \forall\, n \neq m.$$
We have
$$f(x + h) - f(x) = f'(x) h + o_x(h),$$
where
$$\|o_x(h)\|_Y \leq \frac{\varepsilon}{3}\, \|h\|_X \quad \forall\, h \in X,\ \|h\|_X \leq \delta,$$
for some $\delta = \delta(\varepsilon, x) > 0$. Therefore
$$\big\|f(x + \delta u_n) - f(x + \delta u_m)\big\|_Y \geq \delta\, \big\|f'(x)(u_n - u_m)\big\|_Y - \big\|o_x(\delta u_n)\big\|_Y - \big\|o_x(\delta u_m)\big\|_Y \geq \varepsilon\delta - \frac{2\varepsilon}{3}\,\delta = \frac{\varepsilon}{3}\,\delta,$$
a contradiction to the fact that $f \in \mathcal{K}(U; Y)$.

The converse of the above result is also true, provided that the map
$$x \longmapsto f'(x)$$
belongs in $\mathcal{K}(U; \mathcal{L}(X; Y))$. For details see Vaĭnberg (1973, pp. 47 and 51).

PROPOSITION 3.1.59 Let $X, Y$ be two Banach spaces.
(a) If $f : X \longrightarrow Y$ is a Fréchet differentiable operator, $f'(x) \in \mathcal{L}_c(X; Y)$ for every $x \in X$ and $x \longmapsto f'(x)$ belongs in $\mathcal{K}\big(X; \mathcal{L}(X; Y)\big)$, then $f \in \mathcal{K}(X; Y)$.
(b) If $f \in \mathcal{K}(X; Y)$, $f$ is Fréchet differentiable, and $x \longmapsto f'(x)$ belongs in $\mathcal{K}\big(X; \mathcal{L}(X; Y)\big)$, then $f$ is completely continuous.


The study of compact operators leads us to the following definition.

DEFINITION 3.1.60 Let $X, Y$ be two Banach spaces and $L \in \mathcal{L}(X; Y)$. We say that $L$ is a Fredholm operator, if
$$\alpha(L) = \dim \ker L < +\infty \quad \text{and} \quad \beta(L) = \mathrm{codim}\, R(L) < +\infty.$$
The class of Fredholm operators is denoted by $\Phi(X; Y)$. The quantity $\alpha(L)$ is called the kernel index of $L$ and the quantity $\beta(L)$ is called the deficiency index of $L$. The index of $L$ is defined by
$$\mathrm{ind}\,(L) \stackrel{df}{=} \alpha(L) - \beta(L).$$
If we have only that $\alpha(L) < +\infty$ and that $R(L)$ is closed, then we say that $L$ is a semi-Fredholm operator and the class of all semi-Fredholm operators is denoted by $\Phi_+(X; Y)$. If $X = Y$ we write $\Phi(X)$ and $\Phi_+(X)$.

REMARK 3.1.61 We have
$$\Phi(X; Y) \subseteq \Phi_+(X; Y),$$
since as we show in the sequel, the condition that $\beta(L) < +\infty$ implies that $R(L)$ is closed.

LEMMA 3.1.62 If $X, Y$ are two Banach spaces, $L \in \mathcal{L}(X; Y)$ is injective and $L^{-1} : R(L) \longrightarrow X$ is bounded, then $R(L)$ is closed.

PROOF Let
$$L(x_n) \longrightarrow y \quad \text{in } Y.$$
Since by hypothesis $L$ has a bounded inverse on $R(L)$, we must have that
$$\big\|L(x_n) - L(x_m)\big\|_Y \geq c\, \|x_n - x_m\|_X \quad \forall\, n \neq m,$$
for some $c > 0$. Therefore $\{x_n\}_{n \geq 1} \subseteq X$ is a Cauchy sequence and we have that $x_n \longrightarrow x$, for some $x \in X$. Hence
$$L(x_n) \longrightarrow L(x) \quad \text{in } Y$$
and so $y = L(x)$. Therefore $y \in R(L)$ and we conclude that $R(L)$ is closed in $Y$.


The next definition formalizes an idea which was used in earlier proofs, when we wanted to get rid of the nontrivial kernel of an operator $L \in \mathcal{L}(X; Y)$.

DEFINITION 3.1.63 Let $X, Y$ be two Banach spaces and $L \in \mathcal{L}_c(X; Y)$. The operator $\widehat{L}$ induced by $L$ is the operator from $X/\ker L$ into $Y$ defined by
$$\widehat{L}\big([x]\big) \stackrel{df}{=} L(x) \quad \forall\, x \in X.$$

REMARK 3.1.64 Evidently $\widehat{L}$ is injective and
$$R\big(\widehat{L}\big) = R(L).$$
In fact it is straightforward to show that $\widehat{L}$ is continuous and
$$\|L\|_{\mathcal{L}} = \big\|\widehat{L}\big\|_{\mathcal{L}}.$$

PROPOSITION 3.1.65 If $X, Y$ are two Banach spaces and $L \in \mathcal{L}(X; Y)$, then $R(L)$ is closed if and only if there exists $c > 0$, such that
$$\big\|L(x)\big\|_Y \geq c\, d_X(x, \ker L) \quad \forall\, x \in X.$$

PROOF Consider the operator $\widehat{L}$ induced by $L$ (see Definition 3.1.63). We know that $\widehat{L}$ is injective and
$$R\big(\widehat{L}\big) = R(L)$$
(see Remark 3.1.64). By virtue of Lemma 3.1.62, $R(L) = R\big(\widehat{L}\big)$ is closed if and only if $\widehat{L}$ has a bounded inverse, which in turn is equivalent to saying that
$$0 < c = \inf_{x \notin \ker L} \frac{\big\|\widehat{L}([x])\big\|_Y}{\big\|[x]\big\|} = \inf_{x \notin \ker L} \frac{\|L(x)\|_Y}{d_X(x, \ker L)}.$$

We can define the quantity
$$\gamma(L) \stackrel{df}{=} \inf_{x \notin \ker L} \frac{\|L(x)\|_Y}{d_X(x, \ker L)}.$$
This quantity is known as the minimum modulus of $L$. From the previous discussion, we have the following proposition.


PROPOSITION 3.1.66 If $X, Y$ are two Banach spaces and $L : X \longrightarrow Y$ is linear, then any two of the following three properties imply the other:
(a) $L \in \mathcal{L}(X; Y)$;
(b) $R(L)$ is closed in $Y$;
(c) $\gamma(L) > 0$.

PROPOSITION 3.1.67 If $X, Y$ are two Banach spaces, $L \in \mathcal{L}(X; Y)$ and $E$ is a closed subspace of $Y$, such that $R(L) \oplus E$ is closed in $Y$, then $R(L)$ is closed.

PROOF Let $L_0 \in \mathcal{L}(X \times E; Y)$ be defined by
$$L_0(x, u) \stackrel{df}{=} L(x) + u \quad \forall\, (x, u) \in X \times E.$$
Since $R(L) \cap E = \{0\}$, we have that $\ker L_0 = \ker L \times \{0\}$. By hypothesis $R(L_0) = R(L) \oplus E$ is closed. So according to Proposition 3.1.66, we have that $\gamma(L_0) > 0$. Then for all $x \in X$, we have
$$\big\|L(x)\big\|_Y = \big\|L_0(x, 0)\big\|_Y \geq \gamma(L_0)\, d\big((x, 0), \ker L_0\big) = \gamma(L_0)\, d_X(x, \ker L),$$
so $\gamma(L) \geq \gamma(L_0) > 0$. Invoking Proposition 3.1.66, we conclude that $R(L) \subseteq Y$ is closed.

COROLLARY 3.1.68 If $X, Y$ are two Banach spaces, $L \in \mathcal{L}(X; Y)$ and $\beta(L) < +\infty$, then $R(L)$ is closed.

PROOF We have
$$Y = R(L) \oplus E$$
for some finite dimensional subspace $E$ of $Y$. Apply Proposition 3.1.67.


This corollary implies that every $L \in \Phi(X; Y)$ has closed range and so $\Phi(X; Y) \subseteq \Phi_+(X; Y)$ (see Remark 3.1.61). The propositions that follow summarize some of the basic properties of Fredholm operators.

PROPOSITION 3.1.69 If $X, Y$ are two Banach spaces and $L \in \Phi(X; Y)$, then
(a) if $\mathrm{ind}\, L = 0$ and $\ker L = \{0\}$, then for every $y \in Y$ the equation $L(x) = y$ has a unique solution and $L^{-1}$ exists and is bounded (i.e., $L^{-1} \in \mathcal{L}(Y; X)$);
(b) for given $y \in Y$, the equation $L(x) = y$ has a solution if and only if
$$\langle y^*, y \rangle_Y = 0 \quad \forall\, y^* \in \ker L^*,$$
i.e., $y \in {}^{\perp}(\ker L^*)$;
(c) $L^* \in \Phi(Y^*; X^*)$ and $\alpha(L^*) = \beta(L)$, $\beta(L^*) = \alpha(L)$, $\mathrm{ind}\, L^* = -\mathrm{ind}\, L$.

PROOF (a) Since $\ker L = \{0\}$, $L$ is injective. Also because $\mathrm{ind}\, L = 0$, we have $\beta(L) = 0$ and so $R(L) = Y$. Invoking Banach's Theorem, we conclude that $L^{-1} \in \mathcal{L}(Y; X)$ and $L(x) = y$ has a unique solution.
(b) From Corollary 3.1.68, we know that $R(L)$ is closed. Hence
$$R(L) = {}^{\perp}(\ker L^*).$$
(c) Since $R(L)$ is closed, we have
$$\ker L^* = R(L)^{\perp} \quad \text{and} \quad {}^{\perp}\Big(\overline{R(L^*)}^{\,w^*}\Big) = \ker L.$$
So
$$\alpha(L^*) = \beta(L) \quad \text{and} \quad \beta(L^*) = \alpha(L).$$
Because $L \in \Phi(X; Y)$, it follows that
$$L^* \in \Phi(Y^*; X^*) \quad \text{and} \quad \mathrm{ind}\, L^* = -\mathrm{ind}\, L.$$
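Between finite dimensional spaces every linear operator is Fredholm, and the quantities in Definition 3.1.60 and Proposition 3.1.69(c) reduce to rank computations. The following sketch is an illustration under assumptions: the matrix `A` and the helper names `rank`, `transpose` and `fredholm_data` are hypothetical and not taken from the text, and the adjoint is represented by the transpose. A matrix with `n_rows` rows and `n_cols` columns is treated as a map from $\mathbb{R}^{n\_cols}$ into $\mathbb{R}^{n\_rows}$.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a rectangular matrix via Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def fredholm_data(matrix):
    """Return (alpha, beta, index) of the linear map x -> matrix @ x."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    r = rank(matrix)
    alpha = n_cols - r   # dim ker, by rank-nullity
    beta = n_rows - r    # codim of the range
    return alpha, beta, alpha - beta


# Hypothetical rank-one map from R^3 into R^2 (not from the text).
A = [[1, 2, 3],
     [2, 4, 6]]

print(fredholm_data(A))             # data of A
print(fredholm_data(transpose(A)))  # data of the adjoint
```

For this example the kernel and deficiency indices of the transpose are those of `A` swapped, and the index changes sign, matching statement (c). Note also that in finite dimensions the index equals the difference of the dimensions of the domain and the target, independently of the operator.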

The next proposition gives a basic stability property of Fredholm operators.

PROPOSITION 3.1.70 If $X, Y$ are two Banach spaces, $L \in \Phi(X; Y)$ and $T \in \mathcal{L}_c(X; Y)$, then $G \stackrel{df}{=} L + T \in \Phi(X; Y)$ and $\mathrm{ind}\, G = \mathrm{ind}\, L$.


REMARK 3.1.71 The result is also true if instead of $T \in \mathcal{L}_c(X; Y)$ we assume that $T \in \mathcal{L}(X; Y)$ with $\|T\|_{\mathcal{L}} \leq \delta$ for some $\delta = \delta(L) > 0$. Also note that in particular Proposition 3.1.70 implies that if $T \in \mathcal{L}_c(X; Y)$ then $\mathrm{id}_X - T \in \Phi(X; Y)$.

PROPOSITION 3.1.72 If $X$ is a Banach space and $L \in \mathcal{L}(X)$, then $L \in \Phi_+(X)$ if and only if for every closed and bounded set $B \subseteq X$, $L|_B$ is proper.

PROOF "$\Longrightarrow$": Let $\{x_n\}_{n \geq 1} \subseteq B$ be such that
$$L(x_n) \longrightarrow u \quad \text{in } X.$$
We have that $X = \ker L \oplus E$, with a closed subspace $E \subseteq X$ (since $\dim \ker L < +\infty$). So $x_n = z_n + e_n$ with $z_n \in \ker L$, $e_n \in E$. We have
$$L(x_n) = L(e_n) \longrightarrow u \quad \text{in } X.$$
The operator $L|_E = \widehat{L}$ (see Definition 3.1.63) is injective and so by Banach's Theorem,
$$L^{-1} \in \mathcal{L}\big(R(L); E\big).$$
Therefore, we have
$$e_n \longrightarrow e \quad \text{in } X,$$
for some $e \in E$. The sequence $\{z_n\}_{n \geq 1} \subseteq \ker L$ is bounded. So exploiting the finite dimensionality of $\ker L$, we have that the sequence $\{z_n\}_{n \geq 1}$ is relatively compact in $X$. Therefore we conclude that the sequence $\{x_n\}_{n \geq 1}$ is relatively compact in $X$, which proves that $L|_B$ is proper.

"$\Longleftarrow$": The set
$$\big\{ x \in X : x \in \ker L,\ \|x\|_X \leq 1 \big\}$$
is compact by hypothesis. So it follows that $\ker L$ is finite dimensional. We can write $X = \ker L \oplus E$, with a closed subspace $E \subseteq X$. Then Proposition 3.1.67 implies that $R(L)$ is closed, hence $L \in \Phi_+(X)$.

3.2 Operators of Monotone Type

Operators of monotone type were introduced to provide an analytic framework broader than compact operators in order to study nonlinear functional equations. The systematic study of monotone operators started in the early 1960s and marks the advent of nonlinear functional analysis. Monotone operators are rooted in the theory of variational problems. Moreover, recalling that the Gâteaux derivative of a convex function is the prototypical example of a nonlinear monotone operator, it is no surprise that for a long period the theory of monotone operators and convex analysis developed in parallel and interacted heavily.

The mathematical framework of the analysis in this section is the following. Let $X$ be a reflexive Banach space and $X^*$ its topological dual. By $\langle \cdot, \cdot \rangle_X$ we denote the duality pairing for the spaces $X^*$ and $X$. Also
$$A : X \supseteq D(A) \longrightarrow 2^{X^*}$$
is a generally multivalued operator. The domain $D(A)$ of $A$ is defined by
$$D(A) \stackrel{df}{=} \big\{ x \in X : A(x) \neq \emptyset \big\}$$
and the graph $\mathrm{Gr}\, A$ of $A$ is defined by
$$\mathrm{Gr}\, A \stackrel{df}{=} \big\{ (x, x^*) \in X \times X^* : x^* \in A(x) \big\}.$$
Also we can define $A^{-1} : X^* \supseteq D(A^{-1}) \longrightarrow 2^X$ by
$$\mathrm{Gr}\, A^{-1} \stackrel{df}{=} \big\{ (x^*, x) \in X^* \times X : (x, x^*) \in \mathrm{Gr}\, A \big\}.$$
Note that $A^{-1}$ is always defined as a multivalued operator. Some of the results in this section are actually true in a more general setting. However, in order to have a uniform presentation, we have chosen to work in the above setting, which after all is what we encounter in most applications.

DEFINITION 3.2.1 Let $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ be a multivalued operator.
(a) We say that $A$ is monotone, if
$$\langle x^* - y^*, x - y \rangle_X \geq 0,$$
for all $x, y \in D(A)$ and all $x^* \in A(x)$, $y^* \in A(y)$.


(b) We say that $A$ is strictly monotone, if it is monotone and
$$\langle x^* - y^*, x - y \rangle_X > 0,$$
for all $x, y \in D(A)$, $x \neq y$ and all $x^* \in A(x)$, $y^* \in A(y)$.
(c) We say that $A$ is strongly monotone, if there exists $c > 0$, such that
$$\langle x^* - y^*, x - y \rangle_X \geq c\, \|x - y\|_X^2,$$
for all $x, y \in D(A)$ and all $x^* \in A(x)$, $y^* \in A(y)$.
(d) We say that $A$ is uniformly monotone, if there exists a continuous function $c : \mathbb{R}_+ \longrightarrow \mathbb{R}_+$, which is strictly increasing, $c(0) = 0$, $c(r) \longrightarrow +\infty$ as $r \to +\infty$ and
$$\langle x^* - y^*, x - y \rangle_X \geq c\big(\|x - y\|_X\big)\, \|x - y\|_X,$$
for all $x, y \in D(A)$ and all $x^* \in A(x)$, $y^* \in A(y)$.
(e) We say that $A$ is coercive, if $D(A)$ is bounded or $D(A)$ is unbounded and
$$\frac{\inf\{\langle x^*, x \rangle_X : x^* \in A(x)\}}{\|x\|_X} \longrightarrow +\infty \quad \text{as } \|x\|_X \to +\infty,\ x \in D(A).$$
We say that $A$ is weakly coercive, if $D(A)$ is bounded or $D(A)$ is unbounded and
$$\inf_{x^* \in A(x)} \|x^*\|_{X^*} \longrightarrow +\infty \quad \text{as } \|x\|_X \to +\infty,\ x \in D(A).$$

REMARK 3.2.2 If $A$ is strongly monotone, then $A$ is uniformly monotone. If $A$ is uniformly monotone, then $A$ is strictly monotone. If $A$ is strictly monotone, then $A$ is monotone. If $A$ is uniformly monotone, then $A$ is coercive. If $A$ is coercive, then $A$ is weakly coercive. Sometimes it is convenient to identify $A$ with its graph. For this reason some authors speak of monotone sets in $X \times X^*$.
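For single-valued maps on $X = \mathbb{R}$ the pairing is ordinary multiplication, and the notions of Definition 3.2.1 can be probed on a grid. The sketch below is an illustration under assumptions: the functions `cubic` and `affine_plus_sine`, the grid and the helper `monotonicity_ratio` are hypothetical choices, and a sampled minimum is numerical evidence only, not a proof. The quantity computed is the smallest value of $(f(x) - f(y))(x - y)/|x - y|^2$ over distinct sample points, which is bounded below by $c > 0$ for a strongly monotone map.

```python
import itertools
import math


def monotonicity_ratio(f, points):
    """Smallest value of (f(x) - f(y)) * (x - y) / |x - y|**2 over distinct sample pairs."""
    return min(
        (f(x) - f(y)) * (x - y) / (x - y) ** 2
        for x, y in itertools.combinations(points, 2)
    )


def cubic(x):
    # Strictly monotone on R, but the ratio degenerates near the origin.
    return x ** 3


def affine_plus_sine(x):
    # Derivative 2 + cos(x) >= 1, so this map is strongly monotone with c = 1.
    return 2 * x + math.sin(x)


points = [k / 100 for k in range(-300, 301)]
cubic_ratio = monotonicity_ratio(cubic, points)
sine_ratio = monotonicity_ratio(affine_plus_sine, points)
print(cubic_ratio, sine_ratio)
```

For the cubic the sampled ratio is positive but shrinks with the grid spacing, which is consistent with strict monotonicity without strong monotonicity. For the second map the ratio stays above $1$ on every grid, by the mean value theorem.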

DEFINITION 3.2.3 A monotone map $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is said to be maximal monotone, if the inequality
$$\langle x^* - y^*, x - y \rangle_X \geq 0 \quad \forall\, (x, x^*) \in \mathrm{Gr}\, A$$
implies that $(y, y^*) \in \mathrm{Gr}\, A$.

REMARK 3.2.4 The above definition implies that the graph of a maximal monotone map is not properly included in the graph of another monotone map (i.e., it is maximal with respect to inclusion).


EXAMPLE 3.2.5 An increasing continuous function $f : \mathbb{R} \longrightarrow \mathbb{R}$ is maximal monotone. However, an increasing discontinuous function $f : \mathbb{R} \longrightarrow \mathbb{R}$ is monotone but not maximal monotone, since it admits a monotone extension by filling in the jumps at the discontinuity points. This example underlines the necessity of multivalued operators in the study of maximal monotonicity.

The next result is an immediate consequence of Definition 3.2.1.

PROPOSITION 3.2.6 A map $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is maximal monotone if and only if $A^{-1} : X^* \supseteq D(A^{-1}) \longrightarrow 2^X$ is maximal monotone.

PROPOSITION 3.2.7 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is a maximal monotone map, then for every $x \in D(A)$, the set $A(x)$ is nonempty, convex and closed.

PROOF Since $x \in D(A)$, $A(x) \neq \emptyset$. Let $x^*, y^* \in A(x)$. Set
$$u_\lambda^* \stackrel{df}{=} \lambda x^* + (1 - \lambda) y^* \quad \forall\, \lambda \in [0, 1].$$
For all $(z, z^*) \in \mathrm{Gr}\, A$, we have
$$\langle u_\lambda^* - z^*, x - z \rangle_X = \lambda \langle x^* - z^*, x - z \rangle_X + (1 - \lambda) \langle y^* - z^*, x - z \rangle_X \geq 0,$$
hence $u_\lambda^* \in A(x)$ (see Definition 3.2.3). Therefore $A(x)$ is convex. Also suppose that $\{x_n^*\}_{n \geq 1} \subseteq A(x)$ is a sequence, such that
$$x_n^* \longrightarrow x^* \quad \text{in } X^*.$$
We have
$$\langle x_n^* - z^*, x - z \rangle_X \geq 0 \quad \forall\, n \geq 1,\ (z, z^*) \in \mathrm{Gr}\, A.$$
In the limit as $n \to +\infty$, we have
$$\langle x^* - z^*, x - z \rangle_X \geq 0,$$
hence $(x, x^*) \in \mathrm{Gr}\, A$, i.e., $A(x)$ is closed in $X^*$.

A fundamental property of monotone maps is local boundedness.

DEFINITION 3.2.8 A monotone map $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is said to be locally bounded at $x \in D(A)$, if there exist $M > 0$ and $r > 0$, such that
$$\|y^*\|_{X^*} \leq M \quad \forall\, y \in D(A) \cap \overline{B}_r(x),\ y^* \in A(y).$$


DEFINITION 3.2.9 If $C \subseteq X$ is a nonempty set, a point $x \in C$ is an absorbing point of $C$, if the set $C - x$ is absorbing, i.e.,
$$X = \bigcup_{\lambda > 0} \lambda (C - x).$$

REMARK 3.2.10 If $\mathrm{int}\, C \neq \emptyset$, then any $x \in \mathrm{int}\, C$ is an absorbing point of $C$. If $C = \partial B_1 \cup \{0\}$, then $0$ is an absorbing point although $\mathrm{int}\, C = \emptyset$.

PROPOSITION 3.2.11 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is monotone and $x \in D(A)$ is an absorbing point of $D(A)$, then $A$ is locally bounded at $x$.

PROOF Without any loss of generality we may assume that $x = 0$ and $0 \in A(0)$ (i.e., $(0, 0) \in \mathrm{Gr}\, A$). Indeed if this is not the case, we choose $x^* \in A(x)$ and consider the map
$$A_1(y) \stackrel{df}{=} A(y + x) - x^*.$$
Evidently $A_1$ is still monotone, $(0, 0) \in \mathrm{Gr}\, A_1$ and $D(A_1) = D(A) - x$. So we can replace $A$ with $A_1$. Therefore we need to show that $A$ is locally bounded at $0$. For every $u \in X$, we define
$$\varphi(u) \stackrel{df}{=} \sup \big\{ \langle y^*, u - y \rangle_X : y \in D(A),\ \|y\|_X \leq 1,\ y^* \in A(y) \big\}.$$
Clearly $\varphi$ is the supremum of affine continuous functions, hence $\varphi$ is convex and lower semicontinuous and because $(0, 0) \in \mathrm{Gr}\, A$, we have $\varphi \geq 0$. The set
$$C \stackrel{df}{=} \big\{ u \in X : \varphi(u) \leq 1 \big\}$$
is closed and convex. We claim that $0 \in C$. Indeed because $(0, 0) \in \mathrm{Gr}\, A$, we have
$$0 \leq \langle y^*, y \rangle_X \quad \forall\, (y, y^*) \in \mathrm{Gr}\, A$$
and so $\varphi(0) \leq 0$. Let
$$E \stackrel{df}{=} C \cap (-C).$$
This is a closed, convex and symmetric set. We claim that it is absorbing too. Let $u \in X$. Since by hypothesis $D(A)$ is absorbing, we can find $\lambda > 0$, such that $\lambda u \in D(A)$, i.e., $A(\lambda u) \neq \emptyset$. Choose $v^* \in A(\lambda u)$. If $(y, y^*) \in \mathrm{Gr}\, A$, from the monotonicity of $A$, we have
$$\langle y^*, \lambda u - y \rangle_X \leq \langle v^*, \lambda u - y \rangle_X,$$


so
$$\varphi(\lambda u) \leq \sup_{y \in D(A),\ \|y\|_X \leq 1} \langle v^*, \lambda u - y \rangle_X \leq \langle v^*, \lambda u \rangle_X + \|v^*\|_{X^*} < +\infty.$$
Choose $t \in (0, 1)$, such that $t\varphi(\lambda u) < 1$. Because $\varphi$ is convex, we have
$$\varphi(t\lambda u) \leq t\varphi(\lambda u) + (1 - t)\varphi(0) = t\varphi(\lambda u) < 1,$$
so $t\lambda u \in C$. This shows that $C$ is absorbing, hence $E$ is absorbing too. Thus $E$ is a neighbourhood of the origin and so we can find $\delta > 0$, such that
$$\varphi(u) \leq 1 \quad \forall\, \|u\|_X \leq 2\delta.$$
This means that
$$\langle y^*, u \rangle_X \leq 1 + \langle y^*, y \rangle_X \quad \forall\, y \in D(A),\ \|y\|_X \leq 1,\ y^* \in A(y),\ \|u\|_X \leq 2\delta.$$
Therefore, if $y \in D(A) \cap \overline{B}_\delta$ and $y^* \in A(y)$, we have
$$2\delta\, \|y^*\|_{X^*} = \sup_{\|u\|_X \leq 2\delta} \langle y^*, u \rangle_X \leq 1 + \|y^*\|_{X^*} \|y\|_X \leq 1 + \delta\, \|y^*\|_{X^*},$$
so $\|y^*\|_{X^*} \leq \frac{1}{\delta}$.

Using this result, we can determine the continuity properties of maximal monotone maps. First let us recall the following notion from multivalued analysis.

DEFINITION 3.2.12 Let $Y, Z$ be two Hausdorff topological spaces. A multifunction (set-valued map) $F : Y \longrightarrow 2^Z \setminus \{\emptyset\}$ is said to be upper semicontinuous, if for any closed set $C \subseteq Z$, the set
$$F^-(C) \stackrel{df}{=} \big\{ y \in Y : F(y) \cap C \neq \emptyset \big\}$$
is closed.

REMARK 3.2.13 It is easy to check that the above definition is equivalent to saying that for any open set $U \subseteq Z$, the set
$$F^+(U) \stackrel{df}{=} \big\{ y \in Y : F(y) \subseteq U \big\}$$
is open. Moreover, if for all $y \in Y$, the set $F(y) \subseteq Z$ is closed and $Z$ is regular, then upper semicontinuity of $F$ implies that $\mathrm{Gr}\, F \subseteq Y \times Z$ is closed (see Hu & Papageorgiou (1997, p. 41)). The converse is true if $F$ is locally compact, i.e., for every $y \in Y$, we can find a neighbourhood $U$ of $y$, such that $\overline{F(U)}$ is compact in $Z$. Finally note that if $F$ is single-valued, then the notion of upper semicontinuity coincides with that of continuity.


PROPOSITION 3.2.14 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is a maximal monotone map and $\mathrm{int}\, D(A) \neq \emptyset$, then $A|_{\mathrm{int}\, D(A)}$ is upper semicontinuous from $X$ with the norm topology into $X^*$ with the weak topology.

PROOF Let $C \subseteq X^*$ be a weakly closed set. We need to show that the set
$$\big(A|_{\mathrm{int}\, D(A)}\big)^-(C) = \big\{ x \in \mathrm{int}\, D(A) : A(x) \cap C \neq \emptyset \big\}$$
is closed in $\mathrm{int}\, D(A)$. To this end let
$$\{x_n\}_{n \geq 1} \subseteq \big(A|_{\mathrm{int}\, D(A)}\big)^-(C)$$
be a sequence, such that
$$x_n \longrightarrow x \quad \text{in } X,$$
for some $x \in \mathrm{int}\, D(A)$. Let
$$x_n^* \in A(x_n) \cap C \quad \forall\, n \geq 1.$$
Then Proposition 3.2.11 implies that the sequence $\{x_n^*\}_{n \geq 1} \subseteq X^*$ is bounded. By virtue of the reflexivity of $X^*$ and the Eberlein-Smulian Theorem (see Theorem A.3.8), we may assume that
$$x_n^* \stackrel{w}{\longrightarrow} x^* \quad \text{in } X^*.$$
Clearly $x^* \in C$. Also, we have
$$\langle x_n^* - y^*, x_n - y \rangle_X \geq 0 \quad \forall\, n \geq 1,\ (y, y^*) \in \mathrm{Gr}\, A,$$
so
$$\lim_{n \to +\infty} \langle x_n^* - y^*, x_n - y \rangle_X = \langle x^* - y^*, x - y \rangle_X \geq 0.$$
Because $A$ is maximal monotone, we infer that $x^* \in A(x)$. Therefore $x \in \big(A|_{\mathrm{int}\, D(A)}\big)^-(C)$ and this proves the claimed upper semicontinuity of $A|_{\mathrm{int}\, D(A)}$.

A careful reading of the previous proof reveals that the following is also true.

PROPOSITION 3.2.15 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is a maximal monotone map, then $\mathrm{Gr}\, A \subseteq X \times X_w^*$ and $\mathrm{Gr}\, A \subseteq X_w \times X^*$ are closed sets (here $Z_w$ denotes the space $Z$ furnished with the weak topology).


DEFINITION 3.2.16 Let $Y, Z$ be two Banach spaces and let $V : Y \longrightarrow 2^Z \setminus \{\emptyset\}$ be a multifunction.
(a) We say that $V$ is demicontinuous, if it is upper semicontinuous from $Y$ with the norm topology into $Z$ with the weak topology.
(b) We say that $V$ is hemicontinuous, if for all $x, y \in Y$, the multivalued map $\lambda \longmapsto V\big(\lambda x + (1 - \lambda) y\big)$ is upper semicontinuous from $[0, 1]$ into $Z$ with the weak topology.
(c) We say that $V$ is bounded, if it maps bounded sets in $Y$ into bounded sets in $Z$.

REMARK 3.2.17 Evidently demicontinuity implies hemicontinuity. For monotone maps $A : X \longrightarrow 2^{X^*}$ with $D(A) = X$, the converse is also true.

PROPOSITION 3.2.18 If $A : X \longrightarrow 2^{X^*}$ is a monotone hemicontinuous map with $D(A) = X$, then $A$ is demicontinuous.

PROOF If $C \subseteq X^*$ is $w$-closed, we need to show that the set
$$A^-(C) = \big\{ x \in X : A(x) \cap C \neq \emptyset \big\}$$
is norm closed in $X$. To this end let $\{x_n\}_{n \geq 1} \subseteq A^-(C)$ be a sequence, such that
$$x_n \longrightarrow x \quad \text{in } X.$$
Let
$$x_n^* \in A(x_n) \cap C \quad \forall\, n \geq 1.$$
Then Proposition 3.2.11 implies that the sequence $\{x_n^*\}_{n \geq 1} \subseteq X^*$ is bounded and so we may assume that
$$x_n^* \stackrel{w}{\longrightarrow} x^* \quad \text{in } X^*.$$
Set
$$y_\lambda \stackrel{df}{=} x + \lambda y,$$
and let
$$y_\lambda^* \in A(y_\lambda) \quad \forall\, \lambda > 0,\ y \in X.$$
From the monotonicity of $A$, we have
$$\langle x_n^* - y_\lambda^*, x_n - x \rangle_X - \lambda \langle x_n^* - y_\lambda^*, y \rangle_X \geq 0 \quad \forall\, n \geq 1,$$


so

® 1 ∗ xn − yλ∗ , xn − x X λ Passing to the limit as n → +∞, we obtain hx∗n − yλ∗ , yiX 6

∀ n > 1.

hx∗ − yλ∗ , yiX 6 0. Next let λ & 0. Due to the hemicontinuity of A, we may say that w

yλ∗ −→ y ∗

in X ∗ ,

for some $y^* \in A(x)$. So we obtain that
$$\langle x^* - y^*, y\rangle_X \le 0.$$
Because $y \in X$ was arbitrary, it follows that $x^* = y^* \in A(x)$. Therefore $x \in A^-(C)$ and we have proved the demicontinuity of $A$.

Next we give a sufficient condition for maximality of a monotone map.

PROPOSITION 3.2.19 If $A : X \to 2^{X^*}$ is a monotone map with $D(A) = X$, which is hemicontinuous and for every $x \in X$, the set $A(x) \subseteq X^*$ is closed and convex, then $A$ is maximal monotone.

PROOF Suppose that $\widehat{A}$ is a monotone extension of $A$ and $x_0^* \in \widehat{A}(x_0)$. We need to show that $x_0^* \in A(x_0)$. If this is not true, then from the strong separation theorem (see Theorem A.3.2), we can find $u \in X \setminus \{0\}$, such that
$$\langle x^*, u\rangle_X < \langle x_0^*, u\rangle_X \quad \forall\, x^* \in A(x_0). \tag{3.8}$$

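Let $\lambda > 0$ and $x_\lambda = x_0 + \lambda u$. By virtue of the monotonicity of $\widehat{A}$, we have
$$\lambda \langle x_\lambda^* - x_0^*, u\rangle_X \ge 0 \quad \forall\, x_\lambda^* \in A(x_\lambda),$$
so
$$\langle x_\lambda^* - x_0^*, u\rangle_X \ge 0 \quad \forall\, x_\lambda^* \in A(x_\lambda). \tag{3.9}$$
Because of the hemicontinuity of $A$, we can say that
$$x_\lambda^* \xrightarrow{w} x^* \quad \text{in } X^* \text{ as } \lambda \searrow 0,$$
for some $x^* \in A(x_0)$. So from (3.9), we have that $\langle x^* - x_0^*, u\rangle_X \ge 0$, which contradicts (3.8).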

3. Nonlinear Operators and Young Measures


At this point let us give some standard examples of maximal monotone maps.

EXAMPLE 3.2.20
(a) Let $H$ be a Hilbert space and let $C \subseteq H$ be a closed, convex set. It is well known that for each $x \in H$, there exists a unique element in $C$, denoted by $\mathrm{proj}_C(x)$, such that
$$\|x - \mathrm{proj}_C(x)\|_X = \inf_{c \in C} \|x - c\|_X$$
(best approximation of $x$ in $C$). The map $\mathrm{proj}_C : H \to C$ is known as the metric projection on $C$. Then $\mathrm{proj}_C$ is a maximal monotone map. Indeed, recalling that
$$\langle x - \mathrm{proj}_C(x), c - \mathrm{proj}_C(x)\rangle_X \le 0 \quad \forall\, c \in C,$$
we can easily check that
$$\|\mathrm{proj}_C(x) - \mathrm{proj}_C(y)\|_X^2 \le \langle x - y, \mathrm{proj}_C(x) - \mathrm{proj}_C(y)\rangle_X$$
and so
$$\|\mathrm{proj}_C(x) - \mathrm{proj}_C(y)\|_X \le \|x - y\|_X,$$
i.e., the map $x \mapsto \mathrm{proj}_C(x)$ is nonexpansive and monotone. So by Proposition 3.2.19, the map $x \mapsto \mathrm{proj}_C(x)$ is maximal monotone.

(b) If $H$ is a Hilbert space and $A : H \to H$ is nonexpansive (i.e., Lipschitz continuous with Lipschitz constant equal to 1), then it is easy to check that $\mathrm{id}_X + A$ is maximal monotone.

(c) Let $X$ be a reflexive Banach space and $\varphi : X \to \overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$ a proper (i.e., not identically $+\infty$), convex, lower semicontinuous function. The subdifferential of $\varphi$ is defined by
$$\partial\varphi(x) \stackrel{df}{=} \big\{x^* \in X^* : \langle x^*, y - x\rangle_X \le \varphi(y) - \varphi(x) \ \ \forall\, y \in X\big\}.$$
Then $\partial\varphi : X \to 2^{X^*}$ is a maximal monotone map. We shall prove this and more in Section 4.3, where we conduct a detailed study of the convex subdifferential. For the moment, we keep in mind the maximality of the subdifferential map, in order to better understand the next example.

(d) Let $X$ be a reflexive Banach space and consider the map $\mathcal{F} : X \to 2^{X^*}$ defined by
$$\mathcal{F}(x) \stackrel{df}{=} \big\{x^* \in X^* : \langle x^*, x\rangle_X = \|x\|_X^2 = \|x^*\|_{X^*}^2\big\}.$$


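According to the Hahn-Banach Theorem, we see that $\mathcal{F}(x) \ne \emptyset$ for all $x \in X$. Moreover, we have that
$$\mathcal{F}(x) = \partial\varphi(x), \quad \text{where } \varphi(x) = \frac{1}{2}\|x\|_X^2.$$
Indeed, if $x^* \in \mathcal{F}(x)$, then
$$\langle x^*, y - x\rangle_X \le \|x^*\|_{X^*}\|y\|_X - \|x\|_X^2 \le \frac{1}{2}\big(\|x\|_X^2 + \|y\|_X^2\big) - \|x\|_X^2 = \varphi(y) - \varphi(x),$$
so $x^* \in \partial\varphi(x)$. Conversely, let $x^* \in \partial\varphi(x)$ and let $\psi(x) \stackrel{df}{=} \|x\|_X$ for all $x \in X$. By $\varphi'(x;\cdot)$ and $\psi'(x;\cdot)$ we denote the directional derivatives at $x \in X$ of the convex functions $\varphi$ and $\psi$ respectively, i.e., for all $h \in X$, we have
$$\varphi'(x;h) \stackrel{df}{=} \lim_{\lambda \searrow 0} \frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda}, \qquad \psi'(x;h) \stackrel{df}{=} \lim_{\lambda \searrow 0} \frac{\psi(x + \lambda h) - \psi(x)}{\lambda}.$$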

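The limits exist since the difference quotients decrease as $\lambda \searrow 0$, because of the convexity of the functions. Then we have
$$\psi'(x;h)\|x\|_X = \lim_{\lambda \searrow 0} \frac{\|x + \lambda h\|_X \|x\|_X - \|x\|_X^2}{\lambda} \le \lim_{\lambda \searrow 0} \frac{1}{2}\,\frac{\|x + \lambda h\|_X^2 - \|x\|_X^2}{\lambda} = \varphi'(x;h) \tag{3.10}$$
and
$$\varphi'(x;h) = \lim_{\lambda \searrow 0} \frac{1}{2}\,\frac{\|x + \lambda h\|_X^2 - \|x\|_X^2}{\lambda} = \lim_{\lambda \searrow 0} \frac{1}{2}\left[\frac{\|x + \lambda h\|_X - \|x\|_X}{\lambda}\big(\|x + \lambda h\|_X + \|x\|_X\big)\right] = \psi'(x;h)\|x\|_X. \tag{3.11}$$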

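From (3.10) and (3.11), we infer that
$$\varphi'(x;h) = \psi'(x;h)\|x\|_X.$$
Clearly from the definition of $\partial\varphi(x)$, we see that $x^* \in \partial\varphi(x)$ if and only if
$$\langle x^*, h\rangle_X \le \varphi'(x;h) = \psi'(x;h)\|x\|_X \quad \forall\, h \in X.$$
So, we have
$$\left\langle \frac{x^*}{\|x\|_X}, h\right\rangle_X \le \psi'(x;h) \le \psi(x + h) - \psi(x) \le \|h\|_X \quad \forall\, h \in X,$$
so
$$\|x^*\|_{X^*} \le \|x\|_X. \tag{3.12}$$
On the other hand since
$$\left\langle \frac{x^*}{\|x\|_X}, h\right\rangle_X \le \psi'(x;h) \quad \forall\, h \in X,$$
we have that $\frac{x^*}{\|x\|_X} \in \partial\psi(x)$ and so
$$\left\langle \frac{x^*}{\|x\|_X}, y - x\right\rangle_X \le \psi(y) - \psi(x) \quad \forall\, y \in X.$$
Let $y = 0$. We obtain
$$\left\langle \frac{x^*}{\|x\|_X}, x\right\rangle_X \ge \|x\|_X.$$
Therefore
$$\|x^*\|_{X^*} \ge \|x\|_X. \tag{3.13}$$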

From (3.12) and (3.13), it follows that $\|x^*\|_{X^*} = \|x\|_X$, hence $x^* \in \mathcal{F}(x)$. Note that if $X$ is a Hilbert space, then $\mathcal{F}$ is the canonical isomorphism between $X$ and $X^*$. So if $X = H$ is a pivot Hilbert space (i.e., $H = H^*$), then $\mathcal{F}$ is the identity operator.

REMARK 3.2.21 The duality map introduced in Example 3.2.20(d) can actually be defined on any Banach space (not necessarily reflexive) and depends essentially on the norm of the space. More precisely, if $\|\cdot\|_1$ and $\|\cdot\|_2$ are two equivalent norms on $X$ and $\mathcal{F}_1$ and $\mathcal{F}_2$ are the corresponding duality maps, then we need not have $\mathcal{F}_1 = \mathcal{F}_2$. In fact in the proposition that follows, we show that the geometry of $X$ and $X^*$ is closely related to the properties of the duality map $\mathcal{F}$.

PROPOSITION 3.2.22 If $X$ is a reflexive Banach space with $X^*$ strictly convex, then the duality map $\mathcal{F} : X \to X^*$ is single-valued, odd, demicontinuous, maximal monotone, coercive and bounded.

PROOF

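Let $x_1^*, x_2^* \in \mathcal{F}(x)$. Then we have
$$\langle x_k^*, x\rangle_X = \|x\|_X^2 = \|x_k^*\|_{X^*}^2 \quad \text{for } k \in \{1, 2\}.$$
So, we have
$$2\|x_1^*\|_{X^*}\|x\|_X \le \|x_1^*\|_{X^*}^2 + \|x_2^*\|_{X^*}^2 = \langle x_1^* + x_2^*, x\rangle_X \le \|x_1^* + x_2^*\|_{X^*}\|x\|_X,$$
thus
$$\|x_1^*\|_{X^*} \le \frac{1}{2}\|x_1^* + x_2^*\|_{X^*}$$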


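and so $x_1^* = x_2^*$ due to the strict convexity of $X^*$. Clearly $\mathcal{F}(-x) = -\mathcal{F}(x)$, i.e., $\mathcal{F}$ is odd. To show the demicontinuity of $\mathcal{F}$, suppose that $\{x_n\}_{n \ge 1} \subseteq X$ is a sequence, such that $x_n \to x$ in $X$, for some $x \in X$. Then
$$\|\mathcal{F}(x_n)\|_{X^*} = \|x_n\|_X \to \|x\|_X$$
and so the sequence $\{\mathcal{F}(x_n)\}_{n \ge 1} \subseteq X^*$ is bounded. Because of the reflexivity of $X^*$, we may assume that
$$\mathcal{F}(x_n) \xrightarrow{w} x^* \quad \text{in } X^*.$$
We have
$$\langle x^*, h\rangle_X = \lim_{n \to +\infty} \langle \mathcal{F}(x_n), h\rangle_X \le \lim_{n \to +\infty} \|x_n\|_X \|h\|_X = \|x\|_X \|h\|_X \quad \forall\, h \in X \tag{3.14}$$
and
$$\langle x^*, x\rangle_X = \lim_{n \to +\infty} \langle \mathcal{F}(x_n), x\rangle_X = \lim_{n \to +\infty} \|x_n\|_X^2 = \|x\|_X^2. \tag{3.15}$$
From (3.14), we have
$$\|x^*\|_{X^*} \le \|x\|_X$$
and from (3.15), we have $\|x^*\|_{X^*} \ge \|x\|_X$. Therefore
$$\|x^*\|_{X^*} = \|x\|_X$$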

and so $x^* \in \mathcal{F}(x)$, which proves the demicontinuity of $\mathcal{F}$. Maximal monotonicity follows from Examples 3.2.20(c) and (d). A more direct proof is the following. Let $x, y \in X$. Then we have
$$\langle \mathcal{F}(x) - \mathcal{F}(y), x - y\rangle_X \ge \|x\|_X^2 + \|y\|_X^2 - 2\|x\|_X\|y\|_X = \big(\|x\|_X - \|y\|_X\big)^2 \ge 0, \tag{3.16}$$
so $\mathcal{F}$ is monotone. Invoking Proposition 3.2.19, we conclude that $\mathcal{F}$ is maximal monotone. Also we have
$$\langle \mathcal{F}(x), x\rangle_X = \|x\|_X^2,$$
i.e., $\mathcal{F}$ is coercive. Finally it is clear that $\mathcal{F}$ is bounded.


REMARK 3.2.23 As we shall see later in this section (see Corollary 3.2.31), the maximal monotonicity and coercivity of $\mathcal{F}$ imply that $\mathcal{F}$ is surjective. Also we have
$$\varphi'(x;h) = \langle \mathcal{F}(x), h\rangle_X \quad \forall\, h \in X,$$
which means that $\varphi$ is Gâteaux differentiable at every $x \in X$ and $\varphi'(x) = \mathcal{F}(x)$. Moreover, from Example 3.2.20(d), we see that the map $x \mapsto \psi(x) = \|x\|_X$ is Gâteaux differentiable at every $x \ne 0$ and
$$\psi'(x) = \frac{\mathcal{F}(x)}{\|x\|_X}.$$
It is a result of Banach space theory that the reflexive Banach space $X$ is smooth (i.e., its norm is Gâteaux differentiable at every $x \ne 0$) if and only if $X^*$ is strictly convex. Similarly the reflexive Banach space $X$ is strictly convex if and only if $X^*$ is smooth (see Day (1973, p. 144)).

PROPOSITION 3.2.24 If $X$ is a reflexive Banach space and both $X$ and $X^*$ are strictly convex, then the duality map $\mathcal{F} : X \to X^*$ is strictly monotone and bijective and $\mathcal{F}^{-1}$ is the duality map of $X^*$.

PROOF

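Suppose that
$$\langle \mathcal{F}(x) - \mathcal{F}(y), x - y\rangle_X = 0.$$
From (3.16), we have
$$0 = \left\langle \mathcal{F}(x) - \mathcal{F}\!\left(\frac{x+y}{2}\right), \frac{x-y}{2}\right\rangle_X + \left\langle \mathcal{F}\!\left(\frac{x+y}{2}\right) - \mathcal{F}(y), \frac{x-y}{2}\right\rangle_X \ge \left(\|x\|_X - \left\|\frac{x+y}{2}\right\|_X\right)^2 + \left(\left\|\frac{x+y}{2}\right\|_X - \|y\|_X\right)^2,$$
so
$$\|x\|_X = \left\|\frac{x+y}{2}\right\|_X = \|y\|_X.$$
Because $X$ is strictly convex, it follows that $x = y$. So $\mathcal{F}$ is strictly monotone. Hence it is injective (see also Proposition 3.2.22). Moreover, we know that $\mathcal{F}$ is surjective (see Remark 3.2.23). Therefore $\mathcal{F}$ is bijective. Finally, it is clear that $\mathcal{F}^{-1} : X^* \to X$ is the duality map of $X^*$.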
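The properties in Propositions 3.2.22 and 3.2.24 can be checked numerically in a simple special case. The sketch below assumes $X = \mathbb{R}^n$ with the $\ell^p$-norm, $1 < p < \infty$, so that $X^*$ is $\mathbb{R}^n$ with the $\ell^q$-norm, $\frac{1}{p} + \frac{1}{q} = 1$. In that setting the duality map has the standard closed form $\mathcal{F}(x)_i = \|x\|_p^{2-p}\,|x_i|^{p-1}\,\mathrm{sign}(x_i)$. The function names `p_norm`, `pairing` and `duality_map` are chosen here for illustration only and do not come from the text.

```python
import math


def p_norm(x, p):
    """l^p norm of a vector x given as a list of floats."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def pairing(x_star, x):
    """Duality brackets <x*, x> on R^n."""
    return sum(a * b for a, b in zip(x_star, x))


def duality_map(x, p):
    """Duality map of (R^n, l^p), 1 < p < infinity (illustrative helper).

    F(x)_i = ||x||_p^(2-p) * |x_i|^(p-1) * sign(x_i), and F(0) = 0.
    """
    nx = p_norm(x, p)
    if nx == 0.0:
        return [0.0] * len(x)
    return [nx ** (2 - p) * abs(t) ** (p - 1) * math.copysign(1.0, t) for t in x]


if __name__ == "__main__":
    p, q = 3.0, 1.5  # conjugate exponents: 1/3 + 1/1.5 = 1
    x = [1.0, -2.0, 0.5]
    fx = duality_map(x, p)
    print("<F(x), x>     =", pairing(fx, x))
    print("||x||_p^2     =", p_norm(x, p) ** 2)
    print("||F(x)||_q^2  =", p_norm(fx, q) ** 2)
    # Applying the duality map of the dual space should recover x.
    print("F_q(F_p(x))   =", duality_map(fx, q))
```

The three printed quantities should agree up to rounding, and the last line should return the original vector, in line with the statement that $\mathcal{F}^{-1}$ is the duality map of $X^*$.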


PROPOSITION 3.2.25 If $X$ is a reflexive Banach space and $X^*$ is a locally uniformly convex space (see Definition A.3.21), then the duality map $\mathcal{F} : X \to X^*$ is continuous.

PROOF Let $\{x_n\}_{n \ge 1} \subseteq X$ be a sequence, such that
$$x_n \to x \quad \text{in } X,$$
for some $x \in X$. Then
$$\|\mathcal{F}(x_n)\|_{X^*} \to \|\mathcal{F}(x)\|_{X^*}.$$
Moreover, because $\mathcal{F}$ is demicontinuous (see Proposition 3.2.22), we have
$$\mathcal{F}(x_n) \xrightarrow{w} \mathcal{F}(x) \quad \text{in } X^*.$$
Since $X^*$ is locally uniformly convex, it has the Kadec-Klee property (see Remark A.3.22) and so
$$\mathcal{F}(x_n) \to \mathcal{F}(x) \quad \text{in } X^*.$$
Therefore $\mathcal{F}$ is continuous.

REMARK 3.2.26 Under the hypotheses of Proposition 3.2.25, the map $x \mapsto \psi(x) = \|x\|_X$ is Fréchet differentiable at every $x \ne 0$. Indeed, from Remark 3.2.23, we know that the map $x \mapsto \psi(x)$ is Gâteaux differentiable at every $x \ne 0$. Also Proposition 3.2.25 says that the map
$$x \mapsto \psi'(x) = \frac{\mathcal{F}(x)}{\|x\|_X}$$
is continuous on $X \setminus \{0\}$. So $\psi$ is Fréchet differentiable on $X \setminus \{0\}$.

Combining Propositions 3.2.24 and 3.2.25, we obtain the following proposition.

PROPOSITION 3.2.27 If $X$ is a reflexive Banach space and both $X$ and $X^*$ are locally uniformly convex (see Definition A.3.21), then the duality map $\mathcal{F} : X \to X^*$ is a homeomorphism.

PROPOSITION 3.2.28 If $X$ is a reflexive Banach space and $X^*$ is uniformly convex, then the duality map $\mathcal{F} : X \to X^*$ is uniformly continuous on bounded subsets of $X$.

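PROOF First we show that $\mathcal{F}$ is uniformly continuous on
$$\partial B_1 \stackrel{df}{=} \{x \in X : \|x\|_X = 1\}.$$
If this is not the case, then we can find $\varepsilon > 0$ and two sequences $\{x_n\}_{n \ge 1}, \{y_n\}_{n \ge 1} \subseteq \partial B_1$, such that
$$\|x_n - y_n\|_X \to 0 \quad \text{and} \quad \|\mathcal{F}(x_n) - \mathcal{F}(y_n)\|_{X^*} \ge \varepsilon \quad \forall\, n \ge 1.$$
We have
$$\|\mathcal{F}(x) + \mathcal{F}(y)\|_{X^*}\|x\|_X \ge \langle \mathcal{F}(x) + \mathcal{F}(y), x\rangle_X = \langle \mathcal{F}(x), x\rangle_X + \langle \mathcal{F}(y), y\rangle_X + \langle \mathcal{F}(y), x - y\rangle_X \ge \|x\|_X^2 + \|y\|_X^2 - \|y\|_X\|x - y\|_X \quad \forall\, x, y \in X.$$
Putting $x = x_n$, $y = y_n$ for all $n \ge 1$, we obtain
$$\frac{1}{2}\|\mathcal{F}(x_n) + \mathcal{F}(y_n)\|_{X^*} \ge 1 - \frac{1}{2}\|x_n - y_n\|_X,$$
which contradicts the uniform convexity of $X^*$. Recall that
$$\mathcal{F}(\lambda u) = \lambda\mathcal{F}(u) \quad \forall\, \lambda > 0,\ u \in X.$$
For $x, y \in X \setminus \{0\}$, we have
$$\|\mathcal{F}(x) - \mathcal{F}(y)\|_{X^*} = \left\| \|x\|_X \mathcal{F}\!\left(\frac{x}{\|x\|_X}\right) - \|y\|_X \mathcal{F}\!\left(\frac{y}{\|y\|_X}\right) \right\|_{X^*} \le \|x\|_X \left\| \mathcal{F}\!\left(\frac{x}{\|x\|_X}\right) - \mathcal{F}\!\left(\frac{y}{\|y\|_X}\right) \right\|_{X^*} + \|x - y\|_X \left\| \mathcal{F}\!\left(\frac{y}{\|y\|_X}\right) \right\|_{X^*}.$$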

From the uniform continuity of $\mathcal{F}$ on $\partial B_1$, it follows that $\mathcal{F}$ is uniformly continuous on bounded sets located outside some neighbourhood of the origin. Since $\mathcal{F}$ is continuous at $x = 0$ and $\mathcal{F}(0) = 0$, we conclude that $\mathcal{F}$ is uniformly continuous on bounded sets.

Using the duality map, we can have a necessary and sufficient condition for the maximality of a monotone operator $A$.

THEOREM 3.2.29 If both $X$ and $X^*$ are strictly convex and $A : X \supseteq D(A) \to 2^{X^*}$ is a monotone map, then $A$ is maximal monotone if and only if $R(A + \lambda\mathcal{F}) = X^*$ for all $\lambda > 0$ (equivalently for some $\lambda > 0$).


Theorem 3.2.29 is also a surjectivity result. One of the reasons that maximal monotone operators are important in applications is their remarkable surjectivity properties. We start with a necessary and sufficient condition in order to have surjectivity.

THEOREM 3.2.30 If $A : X \supseteq D(A) \to 2^{X^*}$ is a maximal monotone map, then $R(A) = X^*$ if and only if $A^{-1}$ is locally bounded.

PROOF

“$\Longrightarrow$”: Since $A$ is maximal monotone, so is $A^{-1}$. Because
$$D(A^{-1}) = R(A) = X^*,$$
from Proposition 3.2.11, we have that $A^{-1}$ is locally bounded.

“$\Longleftarrow$”: To show that $R(A) = X^*$, it suffices to show that $R(A)$ is both closed and open in $X^*$. First we show that $R(A)$ is closed. To this end let $\{x_n^*\}_{n \ge 1} \subseteq R(A)$ and suppose that
$$x_n^* \to x^* \quad \text{in } X^*.$$
We have $x_n^* \in A(x_n)$ and from the monotonicity of $A$, it follows that
$$\langle x_n^* - y^*, x_n - y\rangle_X \ge 0 \quad \forall\, n \ge 1,\ (y, y^*) \in \mathrm{Gr}\, A. \tag{3.17}$$
Since by hypothesis $A^{-1}$ is locally bounded, the sequence $\{x_n\}_{n \ge 1} \subseteq X$ is bounded and so by passing to a suitable subsequence if necessary, we may assume that
$$x_n \xrightarrow{w} x \quad \text{in } X.$$
Passing to the limit as $n \to +\infty$ in (3.17), we obtain
$$\langle x^* - y^*, x - y\rangle_X \ge 0 \quad \forall\, (y, y^*) \in \mathrm{Gr}\, A,$$
so $x^* \in A(x)$ (since $A$ is maximal monotone). Therefore $x^* \in R(A)$ and so we have proved that $R(A)$ is closed.

Next we show that $R(A)$ is open in $X^*$. Let $x^* \in R(A)$. We have $x^* \in A(x)$. By considering if necessary
$$\widehat{A}(y) = A(y + x)$$


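(maximal monotonicity is invariant under translation), we may assume that $x = 0$. Let $r > 0$ be such that $A^{-1}|_{B_r(x^*)}$ is bounded, where
$$B_r(x^*) \stackrel{df}{=} \{y^* \in X^* : \|y^* - x^*\|_{X^*} < r\}.$$
By Troyanski’s renorming theorem (see Theorem A.3.23), we may assume without any loss of generality, that both $X$ and $X^*$ are locally uniformly convex. Let $y^* \in B_{r/2}(x^*)$. Then by Theorem 3.2.29, for every $\lambda > 0$, the operator equation
$$x_\lambda^* + \lambda\mathcal{F}(x_\lambda) = y^*, \quad x_\lambda^* \in A(x_\lambda) \tag{3.18}$$
has a solution $x_\lambda$. Because $A$ is maximal monotone, we have
$$\langle y^* - \lambda\mathcal{F}(x_\lambda) - x^*, x_\lambda\rangle_X \ge 0 \quad \forall\, \lambda > 0$$
(recall that $x = 0$), so
$$\|y^* - x^*\|_{X^*}\|x_\lambda\|_X \ge \lambda\|x_\lambda\|_X^2$$
and thus
$$\lambda\|x_\lambda\|_X < \frac{r}{2} \quad \forall\, \lambda > 0.$$
From (3.18), we see that
$$\|y^* - x_\lambda^*\|_{X^*} = \lambda\|\mathcal{F}(x_\lambda)\|_{X^*} = \lambda\|x_\lambda\|_X < \frac{r}{2} \quad \forall\, \lambda > 0, \tag{3.19}$$
so
$$\|x_\lambda^* - x^*\|_{X^*} < r \quad \forall\, \lambda > 0.$$
Recall that $A^{-1}|_{B_r(x^*)}$ is bounded. So we have that $\{x_\lambda\}_{\lambda > 0} \subseteq X$ is bounded. Using this in (3.19), we see that
$$x_\lambda^* \to y^* \quad \text{in } X^* \text{ as } \lambda \searrow 0.$$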

Since from the first part of the proof of this implication, we have that $R(A)$ is closed, it follows that $y^* \in R(A)$ and so $B_{r/2}(x^*) \subseteq R(A)$, which proves that $R(A)$ is open in $X^*$. Thus we conclude that $R(A) = X^*$.

COROLLARY 3.2.31 If $A : X \supseteq D(A) \to 2^{X^*}$ is maximal monotone and weakly coercive, then $A$ is surjective (i.e., $R(A) = X^*$).

PROOF By Troyanski’s renorming theorem (see Theorem A.3.23), we may assume that both $X$ and $X^*$ are locally uniformly convex. The weak coercivity condition is equivalent to saying that $A^{-1}$ is locally bounded. So we can apply Theorem 3.2.30 and conclude that $R(A) = X^*$.
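In the simplest setting $X = \mathbb{R}$ the surjectivity in Corollary 3.2.31 can be made concrete. A continuous nondecreasing function $A : \mathbb{R} \to \mathbb{R}$ is maximal monotone (Proposition 3.2.19) and, if $|A(x)| \to +\infty$ as $|x| \to +\infty$, it is weakly coercive, so $A(x) = y$ is solvable for every $y$. The sketch below finds such a solution by bisection. The helper name `solve_monotone` and the choice $A(x) = x^3 + x$ are assumptions made only for this illustration.

```python
def solve_monotone(A, y, iterations=200):
    """Solve A(x) = y by bisection.

    Assumes A is continuous, nondecreasing and weakly coercive on R,
    i.e. |A(x)| -> +infinity as |x| -> +infinity.
    """
    lo, hi = -1.0, 1.0
    # Weak coercivity guarantees that both bracket-expansion loops terminate.
    while A(lo) > y:
        lo *= 2.0
    while A(hi) < y:
        hi *= 2.0
    # Monotonicity keeps a solution inside [lo, hi] at every step.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if A(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def A(x):
    """Illustrative operator: continuous, strictly increasing, weakly coercive."""
    return x ** 3 + x


if __name__ == "__main__":
    for y in (-50.0, -1.0, 0.0, 0.3, 1000.0):
        x = solve_monotone(A, y)
        print(f"y = {y:9.2f}   x = {x: .6f}   A(x) - y = {A(x) - y: .2e}")
```

Bisection is used here only because it relies on nothing beyond monotonicity and continuity. It is not the argument of the proof, which works through the closedness and openness of $R(A)$.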


COROLLARY 3.2.32 If $A : X \to 2^{X^*}$ is monotone, hemicontinuous with $D(A) = X$ and weakly coercive, then $A$ is surjective (i.e., $R(A) = X^*$).

In a finite dimensional context we can drop the monotonicity hypothesis from the above result, provided we assume coercivity. Namely, we have the following result.

PROPOSITION 3.2.33 If $X$ is a finite dimensional Banach space, $F : X \to 2^{X^*}$ is an upper semicontinuous and coercive multifunction with nonempty, compact and convex values, then $F$ is surjective.

PROOF For every $y^* \in X^*$, the multifunction
$$F_{y^*}(x) \stackrel{df}{=} F(x) - y^*$$
satisfies the same hypotheses as $F$. So it suffices to show that $0 \in R(F)$. Suppose that $0 \notin R(F)$. Then by the strong separation theorem (see Theorem A.3.2), for every $x \in X$ we can find $u(x) \in X \setminus \{0\}$, such that
$$0 < \inf_{x^* \in F(x)} \langle x^*, u(x)\rangle_X.$$
Since by hypothesis $F$ is coercive, given $M > 0$ we can find $r = r(M) > 0$, such that
$$\frac{\langle x^*, x\rangle_X}{\|x\|_X} \ge M \quad \forall\, \|x\|_X \ge r,\ x^* \in F(x),$$
so
$$\langle x^*, x\rangle_X \ge Mr \quad \forall\, \|x\|_X = r,\ x^* \in F(x).$$
For such $x \in X$, we can take $u(x) = x$. Now let $x \in X \setminus \{0\}$. We define
$$U(x) \stackrel{df}{=} \Big\{y \in X : \inf_{y^* \in F(y)} \langle y^*, x\rangle_X > 0\Big\}.$$
Because of our hypotheses on $F$, the map
$$y \mapsto \inf_{y^* \in F(y)} \langle y^*, x\rangle_X$$
is lower semicontinuous (see Hu & Papageorgiou (1997, p. 83)) and so the set $U(x)$ is open. From the above, we have that $\{U(x)\}_{x \in X \setminus \{0\}}$ is an open cover of $X$.


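We choose an open cover $\{V_k\}_{k=1}^m$ of $\overline{B}_r \stackrel{df}{=} \{x \in X : \|x\|_X \le r\}$, such that for each $k \in \{1, \ldots, m\}$, we can find $x_k \in X$, such that $V_k \subseteq U(x_k)$ and if $V_k \cap \partial B_r \ne \emptyset$,

then $x_k \in V_k \cap \partial B_r$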

and diam Vk

1, such that $\vartheta_k(x) > 0$ and for each $x^* \in F(x)$, we have $\langle x^*, x_k\rangle_X > 0$ (since $x \in V_k \subseteq U(x_k)$). So for each $x \in \overline{B}_r$ and any $x^* \in F(x)$, we have
$$\langle x^*, f(x)\rangle_X = \sum_{k=1}^m \vartheta_k(x)\langle x^*, x_k\rangle_X > 0,$$
hence
$$f(x) \ne 0 \quad \forall\, x \in \overline{B}_r.$$
Therefore $d_B(f, B_r, 0) = 0$, where $d_B(f, B_r, 0)$ denotes the Brouwer degree of $f$ on $B_r$ with respect to 0. On the other hand, if $x \in \partial B_r$, $f(x)$ is a convex combination of the points $\{x_k\}_{k=1}^m \subseteq \partial B_r$ and $\|x_k - x\|_X$

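0, we define the resolvent operator of $A$, by
$$J_\lambda \stackrel{df}{=} \big(\mathrm{id}_H + \lambda A\big)^{-1}$$
and the Yosida approximation of $A$, by
$$A_\lambda \stackrel{df}{=} \frac{1}{\lambda}\big(\mathrm{id}_H - J_\lambda\big).$$

REMARK 3.2.37 By virtue of Theorem 3.2.29, we have that
$$D(J_\lambda) = D(A_\lambda) = H \quad \forall\, \lambda > 0.$$
Moreover, it is easy to check that $J_\lambda$ is single-valued and then so is $A_\lambda$.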


The next theorem summarizes the main properties of the resolvent and Yosida approximation of a maximal monotone operator $A$.

THEOREM 3.2.38 If $H$ is a pivot Hilbert space and $A : H \supseteq D(A) \to 2^H$ is a maximal monotone map, then for every $\lambda > 0$, we have
(a) $J_\lambda$ is nonexpansive (i.e., Lipschitz continuous with Lipschitz constant 1);
(b) $A_\lambda(x) \in A\big(J_\lambda(x)\big)$ for every $x \in H$;
(c) $A_\lambda$ is monotone and Lipschitz continuous with Lipschitz constant $\frac{1}{\lambda}$;
(d) $\|A_\lambda(x)\|_H \le \|A^0(x)\|_H$ for every $x \in D(A)$, where $A^0(x) = \mathrm{proj}_{A(x)}(0)$ (recall that $A(x)$ is closed and convex; see Proposition 3.2.7 and Example 3.2.20(a));
(e) $\lim_{\lambda \searrow 0} A_\lambda(x) = A^0(x)$ for all $x \in D(A)$;
(f) $\overline{D(A)}$ is convex and $\lim_{\lambda \searrow 0} J_\lambda(x) = \mathrm{proj}_{\overline{D(A)}}(x)$ for every $x \in H$.

PROOF (a) For $x, y \in H$, we have
$$x - y \in J_\lambda(x) - J_\lambda(y) + \lambda\Big(A\big(J_\lambda(x)\big) - A\big(J_\lambda(y)\big)\Big).$$
We take the inner product with $J_\lambda(x) - J_\lambda(y)$ and use the monotonicity of $A$. We have
$$\|J_\lambda(x) - J_\lambda(y)\|_H^2 \le \langle x - y, J_\lambda(x) - J_\lambda(y)\rangle_H \le \|x - y\|_H \|J_\lambda(x) - J_\lambda(y)\|_H,$$
so
$$\|J_\lambda(x) - J_\lambda(y)\|_H \le \|x - y\|_H.$$

(b) This follows from the equivalence
$$(x, x^*) \in \mathrm{Gr}\, A_\lambda \iff (x - \lambda x^*, x^*) \in \mathrm{Gr}\, A. \tag{3.20}$$

(c) Because $J_\lambda$ is nonexpansive (see (a)), it follows that $\mathrm{id}_H - J_\lambda$ is monotone (see Example 3.2.20(b)) and so $A_\lambda$ is monotone too. We have
$$x - y = J_\lambda(x) - J_\lambda(y) + \lambda\big(A_\lambda(x) - A_\lambda(y)\big),$$


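and
$$\langle x - y, A_\lambda(x) - A_\lambda(y)\rangle_H = \langle J_\lambda(x) - J_\lambda(y), A_\lambda(x) - A_\lambda(y)\rangle_H + \lambda\|A_\lambda(x) - A_\lambda(y)\|_H^2.$$
From the monotonicity of $A$ and (b), we have
$$\lambda\|A_\lambda(x) - A_\lambda(y)\|_H^2 \le \|x - y\|_H \|A_\lambda(x) - A_\lambda(y)\|_H,$$
so
$$\|A_\lambda(x) - A_\lambda(y)\|_H \le \frac{1}{\lambda}\|x - y\|_H.$$
Invoking Proposition 3.2.19, we conclude that $A_\lambda$ is maximal monotone.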

(d) From (b), we have that
$$\langle A^0(x) - A_\lambda(x), x - J_\lambda(x)\rangle_H \ge 0 \quad \forall\, x \in D(A),\ \lambda > 0,$$
so
$$\|A_\lambda(x)\|_H^2 \le \langle A^0(x), A_\lambda(x)\rangle_H \le \|A^0(x)\|_H \|A_\lambda(x)\|_H \tag{3.21}$$
and thus
$$\|A_\lambda(x)\|_H \le \|A^0(x)\|_H.$$

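(e) Using (3.20), we can easily verify that
$$(A_\lambda)_\mu = A_{\lambda + \mu} \quad \forall\, \lambda, \mu > 0.$$
So from (d) and (3.21), we see that
$$\|A_{\lambda+\mu}(x)\|_H \le \|A_\lambda(x)\|_H \quad \forall\, x \in H,\ \lambda, \mu > 0 \tag{3.22}$$
and
$$\|A_{\lambda+\mu}(x)\|_H^2 \le \langle A_\lambda(x), A_{\lambda+\mu}(x)\rangle_H \quad \forall\, x \in H,\ \lambda, \mu > 0. \tag{3.23}$$
From (3.22) and (3.23), it follows that
$$\|A_{\lambda+\mu}(x) - A_\lambda(x)\|_H^2 \le \|A_\lambda(x)\|_H^2 - \|A_{\lambda+\mu}(x)\|_H^2 \quad \forall\, x \in H,\ \lambda, \mu > 0.$$
Therefore, since from (d), $\{\|A_\lambda(x)\|_H\}_{\lambda > 0}$ is bounded for $\lambda > 0$ small enough, we infer that $\{A_\lambda(x)\}_{\lambda > 0}$ is Cauchy and so
$$A_\lambda(x) \to y \quad \text{in } H \text{ as } \lambda \searrow 0.$$
By definition, we have
$$x - J_\lambda(x) = \lambda A_\lambda(x)$$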


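and so
$$J_\lambda(x) \to x \quad \text{in } H \text{ as } \lambda \searrow 0.$$
Using (b) and the maximal monotonicity of $A$, we have that $y \in A(x)$, hence $y = A^0(x)$.

(f) Let
$$C \stackrel{df}{=} \overline{\mathrm{conv}}\, D(A)$$
and $x \in H$. We have
$$\langle A_\lambda(x) - u, J_\lambda(x) - z\rangle_H \ge 0 \quad \forall\, (z, u) \in \mathrm{Gr}\, A,$$
so
$$\langle x - J_\lambda(x) - \lambda u, J_\lambda(x) - z\rangle_H \ge 0 \quad \forall\, (z, u) \in \mathrm{Gr}\, A$$
and thus
$$\|J_\lambda(x)\|_H^2 \le \langle x - \lambda u, J_\lambda(x) - z\rangle_H + \langle J_\lambda(x), z\rangle_H \quad \forall\, (z, u) \in \mathrm{Gr}\, A. \tag{3.24}$$
From (3.24), it follows that $\{J_\lambda(x)\}_{\lambda > 0} \subseteq H$ is bounded. We choose a sequence $\lambda_n \searrow 0$, such that
$$J_{\lambda_n}(x) \xrightarrow{w} v \quad \text{in } H.$$
Then by passing to the limit in (3.24) (with $\lambda = \lambda_n$), we obtain
$$\|v\|_H^2 \le \langle x, v - z\rangle_H + \langle v, z\rangle_H \quad \forall\, z \in D(A),$$
so
$$\langle v - x, v - z\rangle_H \le 0 \quad \forall\, z \in D(A)$$
and so
$$\langle x - v, z - v\rangle_H \le 0 \quad \forall\, z \in C. \tag{3.25}$$
Since $v \in C$, from (3.25), it follows that $v = \mathrm{proj}_C(x)$. It remains to show that $C = \overline{D(A)}$. To this end, note that
$$J_\lambda(x) \in D(A) \quad \forall\, x \in H$$
and
$$J_\lambda(z) \to z \quad \text{as } \lambda \searrow 0 \quad \forall\, z \in C.$$
Therefore, it follows that $C \subseteq \overline{D(A)}$, hence $C = \overline{D(A)}$.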
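The statements of Theorem 3.2.38 can be observed on a one dimensional example. The sketch below assumes $H = \mathbb{R}$ and $A = \partial|\cdot|$, i.e., $A(x) = \{\mathrm{sign}(x)\}$ for $x \ne 0$ and $A(0) = [-1, 1]$. For this operator the resolvent is the well known soft-thresholding map and $A^0(x) = \mathrm{sign}(x)$ for $x \ne 0$, $A^0(0) = 0$. All function names are illustrative choices, not notation from the text.

```python
def resolvent(x, lam):
    """J_lam = (id + lam*A)^(-1) for A = subdifferential of |.| on R (soft-thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def yosida(x, lam):
    """Yosida approximation A_lam = (id - J_lam) / lam."""
    return (x - resolvent(x, lam)) / lam


def in_A(v, z, tol=1e-9):
    """Membership test v in A(z): A(z) = {sign z} for z != 0 and A(0) = [-1, 1]."""
    if z > 0:
        return abs(v - 1.0) <= tol
    if z < 0:
        return abs(v + 1.0) <= tol
    return -1.0 - tol <= v <= 1.0 + tol


def minimal_section(x):
    """A^0(x): the element of A(x) with the smallest absolute value."""
    if x == 0:
        return 0.0
    return 1.0 if x > 0 else -1.0


if __name__ == "__main__":
    for x in (-2.0, -0.7, 0.0, 0.3, 1.7):
        for lam in (1.0, 0.5, 0.1):
            j, a = resolvent(x, lam), yosida(x, lam)
            # (b): A_lam(x) belongs to A(J_lam(x)); (d): |A_lam(x)| <= |A^0(x)|.
            print(f"x={x:5.2f} lam={lam:4.2f} J={j:6.3f} A_lam={a:6.3f} "
                  f"(b) holds: {in_A(a, j)}  A^0={minimal_section(x):4.1f}")
```

As $\lambda$ decreases, the printed values of $A_\lambda(x)$ stabilize at $A^0(x)$ and $J_\lambda(x)$ approaches $x$, which is what parts (e) and (f) predict when $D(A) = H$.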


REMARK 3.2.39 From the proof of (e), it follows that if $x \notin D(A)$, then
$$\|A_\lambda(x)\|_H \nearrow +\infty \quad \text{as } \lambda \searrow 0.$$
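The blow-up described in Remark 3.2.39 is easy to observe numerically. The sketch below assumes $H = \mathbb{R}$ and takes $A$ to be the normal cone of the interval $[0, 1]$ (the subdifferential of its indicator function), so that $D(A) = [0, 1]$. For this operator it is a standard fact that the resolvent $J_\lambda$ is the metric projection onto $[0, 1]$ for every $\lambda > 0$. The function names are illustrative only.

```python
def proj_interval(x, lo=0.0, hi=1.0):
    """Metric projection of x onto the closed interval [lo, hi]."""
    return min(max(x, lo), hi)


def yosida_normal_cone(x, lam, lo=0.0, hi=1.0):
    """Yosida approximation A_lam(x) = (x - J_lam(x)) / lam for A = normal cone of [lo, hi].

    Here J_lam is the metric projection onto [lo, hi], independently of lam > 0.
    """
    return (x - proj_interval(x, lo, hi)) / lam


if __name__ == "__main__":
    for lam in (1.0, 0.1, 0.01, 0.001):
        outside = yosida_normal_cone(2.0, lam)   # x = 2 is not in D(A) = [0, 1]
        inside = yosida_normal_cone(0.4, lam)    # x = 0.4 lies in D(A)
        print(f"lam = {lam:6.3f}   A_lam(2.0) = {outside:10.3f}   A_lam(0.4) = {inside:.3f}")
```

For the point outside $D(A)$ the values grow like $1/\lambda$, while for the point inside they stay at $A^0(x) = 0$. The same example shows $J_\lambda(x) = \mathrm{proj}_{\overline{D(A)}}(x)$, in agreement with Theorem 3.2.38(f).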

Perturbation results for maximal monotone operators play an important role in applications. In this direction we have the following result.

THEOREM 3.2.40 If $H$ is a pivot Hilbert space, $A : H \supseteq D(A) \to 2^H$ and $B : H \supseteq D(B) \to 2^H$ are two maximal monotone maps and $0 \in \mathrm{int}\big(D(A) - D(B)\big)$, then $A + B : H \supseteq D(A) \cap D(B) \to 2^H$ is maximal monotone too.

PROOF We start by showing that if $S : H \to H$ is a Lipschitz continuous map with Lipschitz constant $k_S > 0$, then the map $A + S : H \supseteq D(A) \to 2^H$ is maximal monotone. To this end, we choose $\mu > 0$, such that $\mu k_S < 1$. Then for a given $y \in H$, the equation
$$x + \mu A(x) + \mu S(x) \ni y \tag{3.26}$$
is equivalent to
$$x = \big(\mathrm{id}_H + \mu A\big)^{-1}\big(y - \mu S(x)\big) = J_\mu\big(y - \mu S(x)\big). \tag{3.27}$$
Note that the map
$$z \mapsto E_\mu(z) = J_\mu\big(y - \mu S(z)\big)$$
is Lipschitz continuous with Lipschitz constant $\mu k_S < 1$. So by the Banach fixed point theorem (see Theorem 7.1.2), equation (3.27) (hence inclusion (3.26) too) has a unique solution $x \in D(A)$. By virtue of Theorem 3.2.29, this means that $\mu(A + S)$ is maximal monotone (note that since $H$ is a pivot Hilbert space, $\mathcal{F} = \mathrm{id}_H$). So $A + S$ is maximal monotone too. Using this general fact and Theorems 3.2.29 and 3.2.38(c), we see that for every $\lambda > 0$ we can find $x_\lambda \in D(A)$, such that
$$x_\lambda + A(x_\lambda) + B_\lambda(x_\lambda) \ni y. \tag{3.28}$$

We take the inner product with $x_\lambda - z$, for some $z \in D(A) \cap D(B)$. Exploiting the monotonicity of $A$ and $B_\lambda$, we obtain
$$\|x_\lambda - z\|_H \le \|y - z - A^0(z) - B_\lambda(z)\|_H \quad \forall\, \lambda > 0.$$
Using also Theorem 3.2.38(d), we get
$$\|x_\lambda\|_H \le 2\|z\|_H + \|y\|_H + \|A^0(z)\|_H + \|B^0(z)\|_H \quad \forall\, \lambda > 0. \tag{3.29}$$

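Because of our hypothesis concerning the domains $D(A)$ and $D(B)$, we can find $\varepsilon > 0$, such that
$$\overline{B}_\varepsilon \stackrel{df}{=} \{z \in H : \|z\|_H \le \varepsilon\} \subseteq D(B) - D(A).$$
Let $z \in \overline{B}_\varepsilon$. Then
$$z = b - a \quad \text{with } b \in D(B) \text{ and } a \in D(A).$$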

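Exploiting the monotonicity of $B_\lambda$, we have
$$\langle B_\lambda(x_\lambda), b\rangle_H \le \langle B_\lambda(x_\lambda), x_\lambda\rangle_H - \langle B_\lambda(b), x_\lambda - b\rangle_H.$$
From Theorem 3.2.38(d), we have
$$\langle B_\lambda(x_\lambda), z\rangle_H \le \langle B_\lambda(x_\lambda), x_\lambda - a\rangle_H + \|B^0(b)\|_H \|x_\lambda - b\|_H,$$
so
$$\langle B_\lambda(x_\lambda), z\rangle_H \le \langle y - x_\lambda - u_\lambda, x_\lambda - a\rangle_H + \|B^0(b)\|_H \|x_\lambda - b\|_H,$$
with $u_\lambda \in A(x_\lambda)$. As $A$ is monotone, we have
$$\langle B_\lambda(x_\lambda), z\rangle_H \le \|x_\lambda - a\|_H \big(\|y - x_\lambda\|_H + \|A^0(a)\|_H\big) + \|B^0(b)\|_H \|x_\lambda - b\|_H.$$
From (3.29), we have that $\{x_\lambda\}_{\lambda > 0}$ is bounded. Thus for every $z \in \overline{B}_\varepsilon$, we can find $c(z) > 0$, such that
$$\langle B_\lambda(x_\lambda), z\rangle_H \le c(z).$$
Invoking the uniform boundedness principle (see Theorem A.3.4), we have
$$\sup_{\lambda > 0} \|B_\lambda(x_\lambda)\|_H < +\infty. \tag{3.30}$$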

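From (3.28), we have
$$x_\lambda - x_\mu \in -A(x_\lambda) + A(x_\mu) - B_\lambda(x_\lambda) + B_\mu(x_\mu) \quad \forall\, \lambda, \mu > 0,$$
so from the monotonicity of $A$, we have
$$\|x_\lambda - x_\mu\|_H^2 \le -\langle B_\lambda(x_\lambda) - B_\mu(x_\mu), x_\lambda - x_\mu\rangle_H \quad \forall\, \lambda, \mu > 0.$$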


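Invoking also Theorem 3.2.38(b), we obtain
$$\|x_\lambda - x_\mu\|_H^2 \le -\langle B_\lambda(x_\lambda) - B_\mu(x_\mu), \lambda B_\lambda(x_\lambda) - \mu B_\mu(x_\mu)\rangle_H \quad \forall\, \lambda, \mu > 0.$$
Using (3.30), we infer that $\{x_\lambda\}_{\lambda > 0} \subseteq H$ is Cauchy and so
$$x_\lambda \to x \quad \text{as } \lambda \searrow 0. \tag{3.31}$$
Also we can say that
$$B_\lambda(x_\lambda) \xrightarrow{w} v \quad \text{in } H, \text{ as } \lambda \searrow 0. \tag{3.32}$$
Note that
$$\|J_\lambda(x_\lambda) - x\|_H \le \lambda\|B_\lambda(x_\lambda)\|_H + \|x_\lambda - x\|_H,$$
where $J_\lambda$ denotes the resolvent of $B$, so, using (3.30) and (3.31), we have
$$J_\lambda(x_\lambda) \to x \quad \text{in } H, \text{ as } \lambda \searrow 0.$$
Since $B$ is maximal monotone and
$$B_\lambda(x_\lambda) \in B\big(J_\lambda(x_\lambda)\big)$$
(see Theorem 3.2.38(b)), in the limit as $\lambda \searrow 0$, we obtain $(x, v) \in \mathrm{Gr}\, B$ (see Proposition 3.2.15 and (3.32)). Passing to the limit as $\lambda \searrow 0$ in (3.28) and using the fact that $A$ is maximal monotone, we obtain that $x \in D(A)$ and
$$x + A(x) + B(x) \ni y,$$
so
$$R\big(\mathrm{id}_H + A + B\big) = H$$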

and finally $A + B$ is maximal monotone (see Theorem 3.2.29).

For operators from a reflexive Banach space into its dual, we have the following perturbation result.

THEOREM 3.2.41 If $X$ is a reflexive Banach space, $A : X \supseteq D(A) \to 2^{X^*}$ and $B : X \supseteq D(B) \to 2^{X^*}$ are two maximal monotone maps and $D(A) \cap \mathrm{int}\, D(B) \ne \emptyset$ (or $D(B) \cap \mathrm{int}\, D(A) \ne \emptyset$), then $A + B : X \supseteq D(A) \cap D(B) \to 2^{X^*}$ is maximal monotone.

REMARK 3.2.42 Since $\mathrm{int}\, D(A) - D(B) \subseteq \mathrm{int}\big(D(A) - D(B)\big)$, we see that the hypothesis of Theorem 3.2.40 is weaker than that of Theorem 3.2.41. Note that the condition $0 \in \mathrm{int}\big(D(A) - D(B)\big)$ may hold even if $\mathrm{int}\, D(A) = \mathrm{int}\, D(B) = \emptyset$.


Another useful perturbation result is given in the next theorem.

THEOREM 3.2.43 If $H$ is a pivot Hilbert space, $A : H \supseteq D(A) \to 2^H$ and $B : H \supseteq D(B) \to 2^H$ are maximal monotone maps, $D(A) \cap D(B) \ne \emptyset$ and
$$0 \le \langle y, B_\lambda(x)\rangle_H \quad \forall\, (x, y) \in \mathrm{Gr}\, A,\ \lambda > 0,$$
then $A + B$ is maximal monotone.

PROOF Let $y \in H$ and consider the inclusion
$$x + A(x) + B_\lambda(x) \ni y. \tag{3.33}$$
From the proof of Theorem 3.2.40, we know that (3.33) has a unique solution $x_\lambda \in D(A)$ and $\{x_\lambda\}_{\lambda > 0} \subseteq H$ is bounded. Take the inner product of (3.33) with $B_\lambda(x_\lambda)$ and use the hypothesis, to obtain that
$$\sup_{\lambda > 0} \|B_\lambda(x_\lambda)\|_H < +\infty.$$

Then the remainder of the proof goes as the proof of Theorem 3.2.40.

Next we introduce some generalizations of the concept of monotonicity, which are useful in the study of nonlinear partial differential equations. The mathematical setting remains the same. Namely, $X$ is a reflexive Banach space, $X^*$ is its topological dual and $A : X \to 2^{X^*}$ is an operator.

DEFINITION 3.2.44 The map $A : X \to 2^{X^*}$ is said to be pseudomonotone, if
(a) the set $A(x)$ is nonempty, convex and weakly compact for all $x \in X$;
(b) $A$ is upper semicontinuous from each finite dimensional subspace $V$ of $X$, into $X^*$ furnished with the weak topology;
(c) if $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ are sequences, such that $x_n^* \in A(x_n)$,
$$x_n \xrightarrow{w} x \quad \text{in } X,$$
for some $x \in X$ and
$$\limsup_{n \to +\infty} \langle x_n^*, x_n - x\rangle_X \le 0,$$
then for each $y \in X$, we can find $v^*(y) \in A(x)$, such that
$$\langle v^*(y), x - y\rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - y\rangle_X.$$


To be able to deal with problems in which the nonlinear operators are not everywhere defined and which are not continuous even in a mild sense, we introduce the following notion.

DEFINITION 3.2.45 A map $A : X \to 2^{X^*}$ is said to be generalized pseudomonotone, if for any sequences $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$, such that
$$x_n \xrightarrow{w} x \quad \text{in } X,$$
for some $x \in X$ and
$$x_n^* \xrightarrow{w} x^* \quad \text{in } X^*,$$
for some $x^* \in X^*$, with $x_n^* \in A(x_n)$ for $n \ge 1$ and
$$\limsup_{n \to +\infty} \langle x_n^*, x_n - x\rangle_X \le 0,$$
we have that $x^* \in A(x)$ and
$$\langle x_n^*, x_n\rangle_X \to \langle x^*, x\rangle_X.$$

An immediate consequence of this definition is the following result.

PROPOSITION 3.2.46 A map $A : X \to 2^{X^*}$ is generalized pseudomonotone if and only if $A^{-1} : X^* \to 2^X$ is generalized pseudomonotone.

The class of generalized pseudomonotone maps contains the maximal monotone ones.

PROPOSITION 3.2.47 If $A : X \supseteq D(A) \to 2^{X^*}$ is maximal monotone, then $A$ is generalized pseudomonotone.

PROOF Let $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ be two sequences, such that
$$x_n \xrightarrow{w} x \quad \text{in } X,$$
for some $x \in X$ and
$$x_n^* \xrightarrow{w} x^* \quad \text{in } X^*,$$
for some $x^* \in X^*$, with $x_n^* \in A(x_n)$ for $n \ge 1$ and
$$\limsup_{n \to +\infty} \langle x_n^*, x_n - x\rangle_X \le 0.$$
We need to show that $x^* \in A(x)$ and
$$\langle x_n^*, x_n\rangle_X \to \langle x^*, x\rangle_X.$$

Let $(u, u^*) \in \mathrm{Gr}\, A$. Then since $A$ is monotone, we have
$$0 \le \langle x_n^* - u^*, x_n - u\rangle_X \quad \forall\, n \ge 1. \tag{3.34}$$
Also we have
$$\langle x_n^*, x_n\rangle_X = \langle x_n^* - u^*, x_n - u\rangle_X + \langle x_n^*, u\rangle_X + \langle u^*, x_n\rangle_X - \langle u^*, u\rangle_X.$$
Note that
$$\langle x_n^*, u\rangle_X + \langle u^*, x_n\rangle_X - \langle u^*, u\rangle_X \to \langle x^*, u\rangle_X + \langle u^*, x\rangle_X - \langle u^*, u\rangle_X,$$
so from (3.34), we have
$$\langle x^*, x\rangle_X \ge \limsup_{n \to +\infty} \langle x_n^*, x_n\rangle_X \ge \langle x^*, u\rangle_X + \langle u^*, x\rangle_X - \langle u^*, u\rangle_X$$
and thus
$$\langle x^* - u^*, x - u\rangle_X \ge 0.$$
Since $(u, u^*) \in \mathrm{Gr}\, A$ was arbitrary and $A$ is maximal monotone, it follows that $x^* \in A(x)$. Therefore
$$\langle x_n^* - x^*, x_n - x\rangle_X \ge 0 \quad \forall\, n \ge 1,$$
so
$$\liminf_{n \to +\infty} \langle x_n^*, x_n\rangle_X \ge \langle x^*, x\rangle_X$$
and thus
$$\langle x_n^*, x_n\rangle_X \to \langle x^*, x\rangle_X,$$
i.e., $A$ is generalized pseudomonotone.

In fact every pseudomonotone map is generalized pseudomonotone.

THEOREM 3.2.48 If $A : X \to 2^{X^*}$ is a pseudomonotone map, then $A$ is generalized pseudomonotone.

PROOF

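Suppose that $\{(x_n, x_n^*)\}_{n \ge 1} \subseteq \mathrm{Gr}\, A$ is a sequence, such that
$$(x_n, x_n^*) \xrightarrow{w \times w} (x, x^*) \quad \text{in } X \times X^*,$$
for some $(x, x^*) \in X \times X^*$ and
$$\limsup_{n \to +\infty} \langle x_n^*, x_n - x\rangle_X \le 0.$$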


By virtue of pseudomonotonicity of $A$, for every $y \in X$, we can find $v^*(y) \in A(x)$, such that
$$\langle v^*(y), x - y\rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - y\rangle_X.$$
We may assume that
$$\langle x_n^*, x_n\rangle_X \to \xi,$$
for some $\xi \in \mathbb{R}$ and so
$$\limsup_{n \to +\infty} \langle x_n^*, x_n - x\rangle_X = \xi - \langle x^*, x\rangle_X \le 0. \tag{3.35}$$
Also, we have
$$\xi - \langle x^*, y\rangle_X \ge \liminf_{n \to +\infty} \langle x_n^*, x_n - y\rangle_X \ge \langle v^*(y), x - y\rangle_X,$$
so, from (3.35), we have
$$\langle x^*, x - y\rangle_X \ge \langle v^*(y), x - y\rangle_X \quad \forall\, y \in X. \tag{3.36}$$
We claim that $x^* \in A(x)$. If this is not the case, then since $A(x)$ is convex and $w$-compact (see Definition 3.2.44), we can find $u \in X$, such that
$$\langle x^*, u\rangle_X < \inf_{z^* \in A(x)} \langle z^*, u\rangle_X. \tag{3.37}$$
Let $y = x - u$ in (3.36). Then
$$\langle x^*, u\rangle_X \ge \langle v^*(y), u\rangle_X, \tag{3.38}$$
with $v^*(y) \in A(x)$. Comparing (3.37) and (3.38), we reach a contradiction. Therefore $x^* \in A(x)$. Finally, if $y = x \in X$, then
$$\liminf_{n \to +\infty} \langle x_n^*, x_n - x\rangle_X \ge \langle v^*(x), x - x\rangle_X = 0,$$
so
$$\liminf_{n \to +\infty} \langle x_n^*, x_n\rangle_X \ge \langle x^*, x\rangle_X$$
and recalling the choice of the sequences $\{x_n\}_{n \ge 1}$ and $\{x_n^*\}_{n \ge 1}$, we get
$$\langle x_n^*, x_n\rangle_X \to \langle x^*, x\rangle_X.$$

There is a converse to this theorem.


PROPOSITION 3.2.49 If $A : X \to 2^{X^*}$ is a bounded generalized pseudomonotone map and for every $x \in X$, the set $A(x)$ is nonempty, convex and weakly compact, then $A$ is pseudomonotone.

PROOF First we show that if $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ are two sequences, such that $x_n \xrightarrow{w} x$ in $X$, for some $x \in X$, $x_n^* \in A(x_n)$ for $n \ge 1$ and
$$\limsup_{n \to +\infty} \langle x_n^*, x_n - x\rangle_X \le 0,$$
then for each $u \in X$, we can find $y^*(u) \in A(x)$, such that
$$\langle y^*(u), x - u\rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - u\rangle_X.$$
Suppose that this is not true. Then we can find $u \in X$, such that
$$\liminf_{n \to +\infty} \langle x_n^*, x_n - u\rangle_X < \inf_{v^* \in A(x)} \langle v^*, x - u\rangle_X.$$
By passing to a suitable subsequence, we may say that
$$\lim_{n \to +\infty} \langle x_n^*, x_n - u\rangle_X < \inf_{v^* \in A(x)} \langle v^*, x - u\rangle_X.$$
Since $A$ is bounded, we have that the sequence $\{x_n^*\}_{n \ge 1} \subseteq X^*$ is bounded. So by virtue of the Eberlein-Smulian theorem (see Theorem A.3.8), we may assume that
$$x_n^* \xrightarrow{w} x^* \quad \text{in } X^*.$$
Because $A$ is generalized pseudomonotone, it follows that $x^* \in A(x)$ and
$$\langle x_n^*, x_n\rangle_X \to \langle x^*, x\rangle_X$$
(see Definition 3.2.45). Therefore
$$\lim_{n \to +\infty} \langle x_n^*, x_n - u\rangle_X = \langle x^*, x - u\rangle_X < \inf_{v^* \in A(x)} \langle v^*, x - u\rangle_X,$$
a contradiction, since $x^* \in A(x)$.

Next we show that $A$ is upper semicontinuous from $X$ into $X^*$ furnished with the weak topology. By virtue of the boundedness of $A$ and of Remark 3.2.13, it suffices to show that if $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ are two sequences, such that
$$x_n \to x \quad \text{in } X,$$
for some $x \in X$,
$$x_n^* \xrightarrow{w} x^* \quad \text{in } X^*,$$
for some $x^* \in X^*$ and $x_n^* \in A(x_n)$ for $n \ge 1$, then $x^* \in A(x)$. But this follows from the fact that $A$ is generalized pseudomonotone. Thus we have shown that $A$ is pseudomonotone (see Definition 3.2.44).


Combining this result with Propositions 3.2.11 and 3.2.47, we obtain the following corollary.

COROLLARY 3.2.50 If $A : X \supseteq D(A) \to 2^{X^*}$ is maximal monotone and $D(A) = X$, then $A$ is pseudomonotone.

The class of pseudomonotone maps is invariant under addition of operators.

PROPOSITION 3.2.51 If $A_1, A_2 : X \to 2^{X^*}$ are two pseudomonotone maps, then $A_1 + A_2$ is pseudomonotone too.

PROOF Evidently for each $x \in X$, the set
$$(A_1 + A_2)(x) = A_1(x) + A_2(x)$$
is nonempty, convex and weakly compact. Moreover, it is easy to see that the map $x \mapsto (A_1 + A_2)(x)$ is upper semicontinuous from every finite dimensional subspace of $X$ into $X^*$ equipped with the weak topology. Next we show that if $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ are two sequences, such that $x_n \xrightarrow{w} x$ in $X$, for some $x \in X$, $x_n^* \in (A_1 + A_2)(x_n)$ for $n \ge 1$ and
$$\limsup_{n \to +\infty} \langle x_n^*, x_n - x\rangle_X \le 0,$$
then for every $u \in X$, we can find $y^*(u) \in (A_1 + A_2)(x)$, such that
$$\langle y^*(u), x - u\rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - u\rangle_X.$$
Let
$$x_n^* \stackrel{df}{=} y_n^* + z_n^* \quad \text{with } y_n^* \in A_1(x_n),\ z_n^* \in A_2(x_n) \quad \forall\, n \ge 1.$$
Then, we have
$$\limsup_{n \to +\infty} \Big[\langle y_n^*, x_n - x\rangle_X + \langle z_n^*, x_n - x\rangle_X\Big] \le 0. \tag{3.39}$$


We claim that (3.39) implies
$$\limsup_{n \to +\infty} \langle y_n^*, x_n - x\rangle_X \le 0 \quad \text{and} \quad \limsup_{n \to +\infty} \langle z_n^*, x_n - x\rangle_X \le 0. \tag{3.40}$$
Suppose that (3.40) is not true. Then at least one of the two lim sup is strictly bigger than zero. To fix things, suppose that
$$\limsup_{n \to +\infty} \langle y_n^*, x_n - x\rangle_X > 0.$$
Then we can find $c > 0$ and a suitable subsequence (denoted with the same index), such that
$$\lim_{n \to +\infty} \langle y_n^*, x_n - x\rangle_X \ge c > 0.$$
Then because of (3.39), we have that
$$\limsup_{n \to +\infty} \langle z_n^*, x_n - x\rangle_X \le -c < 0. \tag{3.41}$$
By virtue of the pseudomonotonicity of $A_2$, for every $u \in X$, we can find $y_2^*(u) \in A_2(x)$, such that
$$\langle y_2^*(u), x - u\rangle_X \le \liminf_{n \to +\infty} \langle z_n^*, x_n - u\rangle_X.$$
Let $u = x$. Then we have
$$\liminf_{n \to +\infty} \langle z_n^*, x_n - x\rangle_X \ge 0. \tag{3.42}$$
Comparing (3.41) and (3.42), we reach a contradiction. This proves (3.40). Since both $A_1$ and $A_2$ are pseudomonotone, given $u \in X$, we can find
$$y_1^*(u) \in A_1(x) \quad \text{and} \quad y_2^*(u) \in A_2(x),$$
such that
$$\langle y_1^*(u), x - u\rangle_X \le \liminf_{n \to +\infty} \langle y_n^*, x_n - u\rangle_X$$
and
$$\langle y_2^*(u), x - u\rangle_X \le \liminf_{n \to +\infty} \langle z_n^*, x_n - u\rangle_X.$$
Let
$$y^*(u) \stackrel{df}{=} y_1^*(u) + y_2^*(u) \in (A_1 + A_2)(x).$$
We have
$$\langle y^*(u), x - u\rangle_X \le \liminf_{n \to +\infty} \langle y_n^*, x_n - u\rangle_X + \liminf_{n \to +\infty} \langle z_n^*, x_n - u\rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - u\rangle_X,$$
which means that $A_1 + A_2$ is pseudomonotone.


As was the case with maximal monotone operators, pseudomonotone operators exhibit remarkable surjectivity properties.

THEOREM 3.2.52 If $A : X \to 2^{X^*}$ is pseudomonotone and coercive, then $R(A) = X^*$, i.e., $A$ is surjective.

PROOF Let $\mathcal{T}$ be the family of all finite dimensional subspaces of $X$, equipped with the partial order defined by inclusion. Let $V \in \mathcal{T}$ and let $i_V : V \to X$ denote the embedding operator. Then $i_V^* : X^* \to V^*$ is the corresponding projection operator onto $V^*$. Set
$$A_V := i_V^* A\, i_V : V \to 2^{V^*}.$$
Clearly $A_V$ has nonempty, convex and compact values and it is upper semicontinuous. Moreover, for every $x_V^* \in A_V(x)$, we have $x_V^* = i_V^* x^*$ for some $x^* \in A(x)$ and so
$$\langle x_V^*, x \rangle_V = \langle i_V^* x^*, x \rangle_V = \langle x^*, i_V(x) \rangle_X,$$
so $A_V$ is coercive too. To prove the theorem, it suffices to show that $0 \in R(A)$. Because of Proposition 3.2.33, for every $V \in \mathcal{T}$, we can find $x_V \in V$, such that $0 \in A_V(x_V)$, hence $0 = i_V^* x_V^*$, for some $x_V^* \in A(x_V)$. By virtue of the coercivity of $A$, we have that $\{x_V\}_{V \in \mathcal{T}} \subseteq X$ is bounded. For $V \in \mathcal{T}$, let
$$E_V := \bigcup_{V' \in \mathcal{T},\ V' \supseteq V} \{x_{V'}\}.$$
Then $E_V \subseteq \overline{B}_M$ for some $M > 0$ large enough. Because $X$ is reflexive (hence $\overline{B}_M$ is weakly compact), from the finite intersection property, we have that
$$\bigcap_{V \in \mathcal{T}} \overline{E_V}^{\,w} \neq \emptyset.$$
Let
$$x_0 \in \bigcap_{V \in \mathcal{T}} \overline{E_V}^{\,w} \quad \text{and} \quad y \in X.$$
Choose $V \in \mathcal{T}$, such that $\{x_0, y\} \subseteq V$. Let $\{x_{V_k}\}_{k \ge 1} \subseteq E_V$ be such that
$$x_{V_k} \xrightarrow{w} x_0 \quad \text{in } X \text{ as } k \to +\infty.$$
Recall that $0 = i_{V_k}^* x_{V_k}^*$ with $x_{V_k}^* \in A(x_{V_k})$. So we have
$$\langle x_{V_k}^*, x_{V_k} - x_0 \rangle_X = 0 \quad \forall\, k \ge 1.$$
Since $A$ is pseudomonotone, we can find $y^*(y) \in A(x_0)$, such that
$$\langle y^*(y), x_0 - y \rangle_X \le \lim_{k \to +\infty} \langle x_{V_k}^*, x_{V_k} - y \rangle_X = 0 \quad \forall\, y \in X. \tag{3.43}$$
Suppose that $0 \notin A(x_0)$. Then by the strong separation theorem (see Theorem A.3.2), we can find $y \in X$, such that
$$0 < \inf_{v^* \in A(x_0)} \langle v^*, x_0 - y \rangle_X. \tag{3.44}$$

Comparing (3.43) and (3.44), we obtain a contradiction. This proves the surjectivity of $A$.

The following classes of operators are often useful in nonlinear operator equations.

DEFINITION 3.2.53 Let $A : X \supseteq D(A) \to 2^{X^*}$ and $B : X \supseteq D(B) \to 2^{X^*}$ be two maps.

(a) We say that $B$ is smooth, if $D(B) = X$ and $B$ is bounded, coercive and maximal monotone.

(b) We say that $A$ is regular, if it is generalized pseudomonotone and for every smooth operator $B : X \supseteq D(B) \to 2^{X^*}$, we have $R(A + B) = X^*$.


The next proposition gives an important example of a regular generalized pseudomonotone map.

PROPOSITION 3.2.54 If $A : X \supseteq D(A) \to 2^{X^*}$ is a pseudomonotone operator and there exists $c > 0$, such that
$$\langle x^*, x \rangle_X \ge -c\, \|x\|_X \quad \forall\, (x, x^*) \in \operatorname{Gr} A,$$
then $A$ is regular (so also generalized pseudomonotone).

PROOF First note that $A$ is a generalized pseudomonotone operator (see Theorem 3.2.48). Also let $B : X \supseteq D(B) \to 2^{X^*}$ be an arbitrary smooth map. Then from Corollary 3.2.50, $B$ is pseudomonotone. Then Proposition 3.2.51 implies that $A + B$ is pseudomonotone. Moreover, $A + B$ is coercive, since $B$ is coercive and the lower bound $-c\,\|x\|_X$ on $A$ grows only linearly. So Theorem 3.2.52 implies that $R(A + B) = X^*$, hence $A$ is regular.

We introduce two more classes of nonlinear operators of monotone type.

DEFINITION 3.2.55 Let $A : X \supseteq D(A) \to 2^{X^*}$ be a map.

(a) We say that $A$ is of type $(M)$, if for every $x \in X$, the set $A(x)$ is nonempty, convex, weakly compact, it is upper semicontinuous from every finite dimensional subspace $V$ of $X$ into $X^*$ furnished with the weak topology and if
$$x_n \xrightarrow{w} x \ \text{in } X, \qquad x_n^* \xrightarrow{w} x^* \ \text{in } X^*$$
with $(x_n, x_n^*) \in \operatorname{Gr} A$, then $(x, x^*) \in \operatorname{Gr} A$.

(b) We say that $A$ is of type $(S)_+$, if $A$ is single valued with $D(A) = X$ and for every sequence $\{x_n\}_{n \ge 1} \subseteq X$, such that
$$x_n \xrightarrow{w} x \ \text{in } X,$$
for some $x \in X$ and
$$\limsup_{n \to +\infty} \langle A(x_n), x_n - x \rangle_X \le 0,$$
we have that $x_n \to x$ in $X$.


REMARK 3.2.56 The prototype for operators of type $(M)$ is a monotone, hemicontinuous map. Similarly the prototype for operators of type $(S)_+$ is a uniformly monotone operator defined everywhere. If $A$ is of type $(M)$ (respectively of type $(S)_+$) and $B : X \to X^*$ is completely continuous, then $A + B$ is of type $(M)$ (respectively of type $(S)_+$). In fact in the $(S)_+$ case, $B$ can be compact.

We close this section by briefly discussing two important examples of maximal monotone maps.

PROPOSITION 3.2.57 If $X$ is a separable, reflexive Banach space, $A : X \supseteq D(A) \to 2^{X^*}$ is maximal monotone with $0 \in A(0)$ and $\widehat{A} : L^p(T; X) \supseteq \widehat{D} \to 2^{L^{p'}(T; X^*)}$, where $T = [0, b]$, $p \in (1, +\infty)$, $\frac{1}{p} + \frac{1}{p'} = 1$, is defined by
$$\widehat{A}(x) := \Big\{ h \in L^{p'}(T; X^*) : h(t) \in A\big(x(t)\big) \ \text{for a.a. } t \in T \Big\} \quad \forall\, x \in \widehat{D},$$
where
$$\widehat{D} := \Big\{ x \in L^p(T; X) : x(t) \in D(A) \ \text{for a.a. } t \in T \ \text{and there exists } h \in L^{p'}(T; X^*) \ \text{such that } h(t) \in A\big(x(t)\big) \ \text{for a.a. } t \in T \Big\},$$
then $\widehat{A}$ is maximal monotone too.

PROOF By Troyanski's renorming theorem (see Theorem A.3.23), we may assume without any loss of generality that both $X$ and $X^*$ are locally uniformly convex (see Definition A.3.21). Let $F : X \to X^*$ be the duality map of $X$. From Proposition 3.2.27, we have that $F$ is a homeomorphism. Let $J_0 : L^p(T; X) \to 2^{L^{p'}(T; X^*)}$ be defined by
$$J_0(x)(\cdot) := \big\| F\big(x(\cdot)\big) \big\|_{X^*}^{p-2}\, F\big(x(\cdot)\big).$$
It is easy to see that $J_0$ is continuous, strictly monotone and so maximal monotone too (see Proposition 3.2.18). Clearly $\widehat{A}$ is a monotone map. We claim that
$$R\big(\widehat{A} + J_0\big) = L^{p'}(T; X^*).$$


To this end let $h \in L^{p'}(T; X^*)$ and consider the multifunction $S : T \to 2^X$, defined by
$$S(t) := \big\{ x \in X : A(x) + \varphi(x) \ni h(t) \big\},$$
where $\varphi : X \to X^*$ is the monotone, continuous (hence maximal monotone) map, defined by
$$\varphi(x) := \big\| F(x) \big\|_{X^*}^{p-2}\, F(x) \quad \forall\, x \in X.$$
We know that
$$A + \varphi : X \supseteq D(A) \to 2^{X^*}$$
is maximal monotone (see Theorem 3.2.41). Moreover, because $0 \in A(0)$, we have
$$\langle x^* + \varphi(x), x \rangle_X \ge \langle \varphi(x), x \rangle_X = \big\| F(x) \big\|_{X^*}^{p} = \|x\|_X^{p} \quad \forall\, (x, x^*) \in \operatorname{Gr} A,$$
hence $A + \varphi$ is coercive. Therefore $R(A + \varphi) = X^*$ (see Corollary 3.2.31). It follows that
$$S(t) \neq \emptyset \quad \forall\, t \in T.$$
Note that
$$\operatorname{Gr} S = \big\{ (t, x) \in T \times X : \big(x, h(t) - \varphi(x)\big) \in \operatorname{Gr} A \big\}.$$
Let $\vartheta : T \times X \to X \times X^*$ be the function, defined by
$$\vartheta(t, x) := \big(x, h(t) - \varphi(x)\big).$$
Clearly $\vartheta$ is a Carathéodory function (i.e., $t \mapsto \vartheta(t, x)$ is measurable and $x \mapsto \vartheta(t, x)$ is continuous). Therefore $\vartheta$ is jointly measurable. Note that
$$\operatorname{Gr} S = \vartheta^{-1}\big(\operatorname{Gr} A\big)$$
and from Proposition 3.2.15, we know that $\operatorname{Gr} A \subseteq X \times X_w^*$ is closed (here by $X_w^*$ we denote the space $X^*$ furnished with the weak topology). Hence
$$\operatorname{Gr} A \in B\big(X \times X_w^*\big)$$


(by $B(Z)$ we denote the Borel $\sigma$-field of $Z$). Since $X_w^*$ is a Souslin space (see Definition A.2.29(b)), then
$$B\big(X \times X_w^*\big) = B(X) \times B\big(X_w^*\big)$$
(see Proposition A.2.34(b)). Moreover,
$$B\big(X^*\big) = B\big(X_w^*\big).$$
Therefore
$$B(X) \times B\big(X_w^*\big) = B(X) \times B\big(X^*\big) = B\big(X \times X^*\big).$$
So we have that
$$\operatorname{Gr} A \in B\big(X \times X^*\big),$$
hence $\operatorname{Gr} S = \vartheta^{-1}(\operatorname{Gr} A)$ is measurable, and we can apply the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) to obtain a measurable map $x : T \to X$, such that
$$x(t) \in S(t) \quad \forall\, t \in T.$$
We have
$$h(t) \in A\big(x(t)\big) + \varphi\big(x(t)\big) \quad \text{for a.a. } t \in T.$$
Taking duality brackets with $x(t)$, we obtain
$$\|x(t)\|_X^{p} \le \langle h(t), x(t) \rangle_X \quad \forall\, t \in T$$
(recall that $0 \in A(0)$) and so
$$\|x(t)\|_X^{p-1} \le \|h(t)\|_{X^*},$$
from which it follows that $x \in L^p(T; X)$. This proves that
$$R\big(\widehat{A} + J_0\big) = L^{p'}(T; X^*).$$
Using this surjectivity property we shall establish the maximal monotonicity of the monotone operator $\widehat{A}$. To this end suppose that for some $(y, v) \in L^p(T; X) \times L^{p'}(T; X^*)$, we have
$$\langle u - v, x - y \rangle_{pp'} \ge 0 \quad \forall\, (x, u) \in \operatorname{Gr} \widehat{A}, \tag{3.45}$$
where by $\langle \cdot, \cdot \rangle_{pp'}$ we denote the duality brackets for the pair of spaces $\big(L^{p'}(T; X^*), L^p(T; X)\big)$, i.e.,
$$\langle u, v \rangle_{pp'} := \int_0^b \langle u(t), v(t) \rangle_X \, dt \quad \forall\, (u, v) \in L^{p'}(T; X^*) \times L^p(T; X).$$


Since $\widehat{A} + J_0$ is surjective, we can find $x_1 \in \widehat{D}$, such that
$$u + J_0(x_1) = v + J_0(y),$$
for some $u \in \widehat{A}(x_1)$. Returning to (3.45) and setting $x = x_1$, we obtain
$$\langle J_0(y) - J_0(x_1), x_1 - y \rangle_{pp'} \ge 0. \tag{3.46}$$
But $J_0$ is strictly monotone. So from (3.46), it follows that $x_1 = y$, hence
$$y \in \widehat{D} \quad \text{and} \quad v = u \in \widehat{A}(x_1).$$
This proves the maximality of $\widehat{A}$.

The second important class of maximal monotone operators that we examine before closing this section is that of the linear ones. Note that for a linear operator $A : X \supseteq D(A) \to X^*$, monotonicity is equivalent to saying that
$$\langle A(x), x \rangle_X \ge 0 \quad \forall\, x \in D(A).$$
For linear monotone operators we can characterize maximality using the adjoint operator or in terms of the density of its domain. This brings us to the "doorsteps" of the Hille-Yosida theorem and the theory of semigroups of operators, which form the subject of the next section.

THEOREM 3.2.58 If $A : X \supseteq D(A) \to X^*$ is a linear monotone operator, then the following statements are equivalent:

(a) $A$ is maximal monotone;

(b) $A^*$ is maximal monotone;

(c) $A$ and $A^*$ are both monotone and $A$ is closed and densely defined.

In particular if $X = H$ is a Hilbert space and $A : H \to H^*$ is linear maximal monotone, then $A$ is symmetric if and only if $A$ is selfadjoint. Maximal monotonicity is crucial here since it may happen that $A$ is monotone and symmetric, but $A^*$ is not monotone.

REMARK 3.2.59 In Section 4.3 we shall return to the subject of monotone operators, by examining in more detail the subdifferential of a proper, convex and lower semicontinuous function (see Example 3.2.20(c)). This is a special kind of monotone map, known as cyclically monotone. As we shall see there, not every monotone map is of the subdifferential type.

3.3 Accretive Operators and Semigroups of Operators

In the previous section we studied operators (in general nonlinear) from a Banach space $X$ into its dual $X^*$. In this section we deal with operators from $X$ into $X$ which still exhibit a "monotonicity" property. These are the so-called accretive operators. Of course the two classes of monotone and accretive operators coincide when $X = H$ is a Hilbert space. Accretive operators are intimately connected to the theory of generation of semigroups, which are a basic tool in the study of evolution equations. So the second half of this section is devoted to the presentation of the basics of the theory of semigroups of operators (linear and nonlinear).

Trying to extend the notion of monotonicity to maps from a Banach space $X$ into itself, we immediately face the problem of finding a substitute for the duality brackets. There are two equivalent ways to do this. The first is to use the duality map, which essentially brings us back to the familiar setting of the dual pair $(X, X^*)$. The second approach replaces the duality brackets by a so-called semi-inner product, which is a kind of inner product for the Banach space. Let us start by giving the definition of accretivity based on the duality map and then proceed to introduce semi-inner products on a Banach space and show how they can be used.

DEFINITION 3.3.1 Let $X$ be a Banach space and let $A : X \supseteq D(A) \to 2^X$ be an operator.

(a) We say that $A$ is accretive, if for every $(x_1, u_1), (x_2, u_2) \in \operatorname{Gr} A$, there exists $x^* \in F(x_1 - x_2)$, such that $\langle x^*, u_1 - u_2 \rangle_X \ge 0$.

(b) An accretive operator is said to be maximal accretive, if its graph is not properly included in the graph of another accretive operator.

(c) Finally an accretive operator is said to be m-accretive, if $R(\mathrm{id}_X + A) = X$.

REMARK 3.3.2 When $X = H = H^*$ is a Hilbert pivot space, then $F = \mathrm{id}_X$ and so the notion of accretivity (respectively maximal accretivity) coincides with that of monotonicity (respectively maximal monotonicity). Moreover, in this case by virtue of Theorem 3.2.29, maximal accretivity and m-accretivity coincide. However, this is not true in general. We can find a maximal accretive operator which is not m-accretive (see Miyadera (1992, pp. 42–44)). If $-A$ is an accretive operator, then $A$ is called a dissipative operator. This terminology originates from mechanics, where dissipative forces are forces which do not increase the energy.


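The next lemma leads to an alternative definition of an accretive operator.

LEMMA 3.3.3 If $X$ is a Banach space and $x, y \in X$, then $\|x\|_X \le \|x + \lambda y\|_X$ for all $\lambda > 0$ if and only if there exists $x^* \in F(x)$, such that $\langle x^*, y \rangle_X \ge 0$.

PROOF "$\Longrightarrow$": Without any loss of generality we may assume that $x \neq 0$. Let $x_\lambda^* \in F(x + \lambda y)$; note that $x_\lambda^* \neq 0$ for all $\lambda > 0$, since $\|x + \lambda y\|_X \ge \|x\|_X > 0$. Set
$$v_\lambda^* := \frac{x_\lambda^*}{\|x_\lambda^*\|_{X^*}} \quad \forall\, \lambda > 0.$$
If $\lambda_n \downarrow 0$, by Alaoglu's theorem (see Theorem A.3.9), we can find $v^* \in X^*$ with $\|v^*\|_{X^*} \le 1$, such that
$$\langle v^*, v \rangle_X = \lim_{n \to +\infty} \langle v_{\lambda_n}^*, v \rangle_X \quad \forall\, v \in X.$$
By hypothesis we have
$$\|x\|_X \le \|x + \lambda_n y\|_X = \langle v_{\lambda_n}^*, x + \lambda_n y \rangle_X \le \|x\|_X + \lambda_n \langle v_{\lambda_n}^*, y \rangle_X,$$
so
$$\langle v^*, y \rangle_X \ge 0. \tag{3.47}$$
Also, since
$$\|x + \lambda_n y\|_X = \langle v_{\lambda_n}^*, x + \lambda_n y \rangle_X \to \langle v^*, x \rangle_X,$$
we have that
$$\|x\|_X \le \langle v^*, x \rangle_X, \quad \text{so} \quad \|x\|_X = \langle v^*, x \rangle_X$$
(recall that $\|v^*\|_{X^*} \le 1$). It follows that $x^* = \|x\|_X\, v^* \in F(x)$. We have
$$\langle x^*, y \rangle_X = \|x\|_X \langle v^*, y \rangle_X \ge 0$$
(see (3.47)).

"$\Longleftarrow$": From the definition of the duality map $F$ (see Example 3.2.20(d)) and the hypothesis that $\langle x^*, y \rangle_X \ge 0$, we have
$$\|x\|_X^2 = \langle x^*, x \rangle_X \le \langle x^*, x + \lambda y \rangle_X \le \|x^*\|_{X^*} \|x + \lambda y\|_X.$$
So, because $x^* \in F(x)$ and thus $\|x\|_X = \|x^*\|_{X^*}$, we have $\|x\|_X \le \|x + \lambda y\|_X$.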
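As a quick sanity check of Lemma 3.3.3, the following sketch specializes to the Euclidean plane, where the duality map is single valued with $F(x) = \{x\}$, so the lemma reduces to: $\|x\| \le \|x + \lambda y\|$ for all $\lambda > 0$ if and only if $\langle x, y \rangle \ge 0$. The helper names (`norm`, `inner`, `norm_never_decreases`) and the finite grid `LAMBDAS` are illustrative assumptions, not part of the text; a finite grid can only suggest, not prove, the "for all $\lambda > 0$" statement.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in v))

def inner(u, v):
    """Euclidean inner product; here it plays the role of <x*, y> with x* = x."""
    return sum(a * b for a, b in zip(u, v))

def norm_never_decreases(x, y, lambdas):
    """Check ||x|| <= ||x + lam*y|| for every lam in a finite grid."""
    nx = norm(x)
    return all(
        nx <= norm([a + lam * b for a, b in zip(x, y)]) + 1e-12
        for lam in lambdas
    )

# Illustrative grid of step sizes, including very small ones.
LAMBDAS = [10.0 ** (-k) for k in range(8)] + [2.0, 5.0, 50.0]

x = [1.0, 0.0]
y_good = [0.5, 2.0]    # <x, y_good> = 0.5 >= 0
y_bad = [-0.5, 2.0]    # <x, y_bad> = -0.5 < 0

print(inner(x, y_good), norm_never_decreases(x, y_good, LAMBDAS))
print(inner(x, y_bad), norm_never_decreases(x, y_bad, LAMBDAS))
```

For `y_bad` the norm drops below $\|x\|$ for small step sizes, which is exactly the situation the lemma excludes when a nonnegative pairing exists.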


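Using this lemma, we have the following alternative characterization of accretivity (known as Kato's criterion).

PROPOSITION 3.3.4 (Kato's Criterion) If $X$ is a Banach space and $A : X \supseteq D(A) \to 2^X$, then $A$ is accretive if and only if for all $\lambda > 0$ and any $(x_1, u_1), (x_2, u_2) \in \operatorname{Gr} A$, we have
$$\|x_1 - x_2\|_X \le \|x_1 - x_2 + \lambda(u_1 - u_2)\|_X.$$

Next we define the semi-inner products on $X$.

DEFINITION 3.3.5 Let $X$ be a Banach space and $x, y \in X$. We define the semi-inner products $(\cdot, \cdot)_\pm$ by the following:
$$(y, x)_+ := \|x\|_X \lim_{\lambda \downarrow 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda} = \|x\|_X \inf_{\lambda > 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda},$$
$$(y, x)_- := \|x\|_X \lim_{\lambda \uparrow 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda} = \|x\|_X \sup_{\lambda < 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda}.$$

[…]

(b) $(\lambda y, \mu x)_\pm = \lambda\mu\, (y, x)_\pm$ whenever $\lambda\mu \ge 0$;

(c) $(z + y, x)_\pm \le \|z\|_X \|x\|_X + (y, x)_\pm$;

(d) $(\cdot, \cdot)_+ : X \times X \to \mathbb{R}$ is upper semicontinuous;

(e) $(\cdot, \cdot)_- : X \times X \to \mathbb{R}$ is lower semicontinuous.

PROOF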

(a) It follows easily from Proposition 3.3.7.

(b) Note that $F(\mu x) = \mu F(x)$ and so
$$(\lambda y, \mu x)_+ = \max_{x^* \in F(x)} \langle \mu x^*, \lambda y \rangle_X = \lambda\mu \max_{x^* \in F(x)} \langle x^*, y \rangle_X = \lambda\mu\, (y, x)_+.$$
Similarly for $(\cdot, \cdot)_-$.

(c) It follows easily from Proposition 3.3.7.

(d) From Definition 3.3.5, we know that
$$(y, x)_+ = \|x\|_X \inf_{\lambda > 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda}.$$
Note that for every fixed $\lambda > 0$, the function
$$(y, x) \mapsto \|x\|_X\, \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda}$$
is continuous. Hence
$$\|x\|_X \inf_{\lambda > 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda} = \inf_{\lambda > 0} \|x\|_X\, \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda}$$
is upper semicontinuous, being an infimum of continuous functions.

(e) Note that $(y, x)_- = -(-y, x)_+$ and use part (d).

The next theorem summarizes the different equivalent ways we can use to define accretive operators. It follows immediately from the previous discussion.

THEOREM 3.3.10 If $X$ is a Banach space and $A : X \supseteq D(A) \to 2^X$ is an operator, then the following two statements are equivalent:

(a) $A$ is an accretive operator;

(b) for every $(x_1, u_1), (x_2, u_2) \in \operatorname{Gr} A$, any one of the following three statements is true:

[A1] $\|x_1 - x_2\|_X \le \|x_1 - x_2 + \lambda(u_1 - u_2)\|_X$ for all $\lambda > 0$;

[A2] $\psi'_+(x_1 - x_2; u_1 - u_2) \ge 0$;

[A3] $(u_1 - u_2, x_1 - x_2)_+ \ge 0$.

Motivated by Proposition 3.3.4, we make the following definition.

DEFINITION 3.3.11 Let $X$ be a Banach space and let $A : X \supseteq D(A) \to 2^X$ be an accretive operator. For every $\lambda > 0$ we introduce
$$J_\lambda := \big(\mathrm{id}_X + \lambda A\big)^{-1} \quad \text{and} \quad A_\lambda := \frac{1}{\lambda}\big(\mathrm{id}_X - J_\lambda\big),$$
both defined on $R\big(\mathrm{id}_X + \lambda A\big)$.
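To make Definition 3.3.11 concrete, here is a minimal sketch on $X = \mathbb{R}$ for one particular operator chosen for illustration (it is not an example taken from the text): $A = \partial|\cdot|$, i.e. $A(x) = \{\operatorname{sign} x\}$ for $x \neq 0$ and $A(0) = [-1, 1]$. For this choice the resolvent $J_\lambda$ is the soft-thresholding map and $A_\lambda(x)$ is $x/\lambda$ clipped to $[-1, 1]$. The function names `resolvent`, `yosida`, `in_A` and `min_norm_A` are hypothetical helpers.

```python
def resolvent(x, lam):
    """J_lam(x) = (id + lam*A)^{-1}(x) for A = subdifferential of |.| on R.

    Solving y + lam*s = x with s in A(y) gives soft-thresholding.
    """
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def yosida(x, lam):
    """A_lam(x) = (x - J_lam(x)) / lam."""
    return (x - resolvent(x, lam)) / lam

def in_A(u, x, tol=1e-9):
    """Membership test u in A(x), up to a floating point tolerance."""
    if x > 0:
        return abs(u - 1.0) <= tol
    if x < 0:
        return abs(u + 1.0) <= tol
    return -1.0 - tol <= u <= 1.0 + tol

def min_norm_A(x):
    """|A(x)| = inf{|u| : u in A(x)}: equals 1 for x != 0 and 0 at x = 0."""
    return 0.0 if x == 0 else 1.0

for lam in (0.1, 0.5, 2.0):
    for x in (-3.0, -0.2, 0.0, 0.3, 2.5):
        print(lam, x, resolvent(x, lam), yosida(x, lam))
```

On this example one can observe numerically that $J_\lambda$ is nonexpansive, that $A_\lambda(x) \in A(J_\lambda(x))$ and that $|A_\lambda(x)| \le |A(x)|$.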


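In the next proposition we have collected some elementary properties of the operators $J_\lambda$ and $A_\lambda$.

PROPOSITION 3.3.12 If $X$ is a Banach space and $A : X \supseteq D(A) \to 2^X$ is an accretive operator, then

(a) $J_\lambda$ is nonexpansive on $R(\mathrm{id}_X + \lambda A)$, i.e.,
$$\|J_\lambda(x) - J_\lambda(y)\|_X \le \|x - y\|_X \quad \forall\, x, y \in R(\mathrm{id}_X + \lambda A);$$

(b) $A_\lambda$ is accretive and Lipschitz continuous with constant $\frac{2}{\lambda}$ on $R(\mathrm{id}_X + \lambda A)$;

(c) $A_\lambda(x) \in A\big(J_\lambda(x)\big)$ for every $x \in R(\mathrm{id}_X + \lambda A)$;

(d) $\|A_\lambda(x)\|_X \le |A(x)| := \inf_{u \in A(x)} \|u\|_X$ for every $x \in D(A) \cap R(\mathrm{id}_X + \lambda A)$;

(e) $\lim_{\lambda \downarrow 0} J_\lambda(x) = x$ for every $x \in \overline{D(A)} \cap \Big( \bigcap_{\lambda > 0} R(\mathrm{id}_X + \lambda A) \Big)$.

PROOF (a) This follows at once from Proposition 3.3.4.

(b) Let
$$y_k := \big(\mathrm{id}_X + \mu A_\lambda\big)(x_k) \quad \forall\, k \in \{1, 2\},\ \mu > 0.$$
Then
$$y_k = \Big(\mathrm{id}_X + \frac{\mu}{\lambda}\big(\mathrm{id}_X - J_\lambda\big)\Big)(x_k) \quad \forall\, k \in \{1, 2\}$$
and so
$$y_1 - y_2 + \frac{\mu}{\lambda}\big(J_\lambda(x_1) - J_\lambda(x_2)\big) = \Big(1 + \frac{\mu}{\lambda}\Big)(x_1 - x_2).$$
Hence
$$\Big(1 + \frac{\mu}{\lambda}\Big)\|x_1 - x_2\|_X \le \|y_1 - y_2\|_X + \frac{\mu}{\lambda}\|J_\lambda(x_1) - J_\lambda(x_2)\|_X$$
and from part (a), we have
$$\|x_1 - x_2\|_X \le \|y_1 - y_2\|_X,$$
which, by Proposition 3.3.4, proves the accretivity of $A_\lambda$. Also, from part (a), we have
$$\|A_\lambda(x_1) - A_\lambda(x_2)\|_X = \frac{1}{\lambda}\big\|x_1 - x_2 - \big(J_\lambda(x_1) - J_\lambda(x_2)\big)\big\|_X \le \frac{2}{\lambda}\|x_1 - x_2\|_X$$
and so $A_\lambda$ is a Lipschitz continuous operator with constant $\frac{2}{\lambda}$.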


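(c) We have
$$A_\lambda(x) \in \frac{1}{\lambda}\Big(\big(\mathrm{id}_X + \lambda A\big)\big(J_\lambda(x)\big) - J_\lambda(x)\Big) = A\big(J_\lambda(x)\big) \quad \forall\, x \in R(\mathrm{id}_X + \lambda A).$$

(d) We have
$$A_\lambda(x) = \frac{1}{\lambda}\Big(J_\lambda\big((\mathrm{id}_X + \lambda A)x\big) - J_\lambda(x)\Big) = \frac{1}{\lambda}\Big(J_\lambda\big(x + \lambda u\big) - J_\lambda(x)\Big) \quad \forall\, u \in A(x).$$
Hence, by part (a),
$$\|A_\lambda(x)\|_X \le \|u\|_X, \tag{3.48}$$
which implies
$$\|A_\lambda(x)\|_X \le |A(x)|.$$

(e) From part (d), we have
$$\|J_\lambda(x) - x\|_X = \lambda \|A_\lambda(x)\|_X \le \lambda |A(x)| \quad \forall\, x \in D(A) \cap R(\mathrm{id}_X + \lambda A).$$
Hence
$$J_\lambda(x) \to x \quad \text{as } \lambda \downarrow 0, \quad \forall\, x \in D(A) \cap \bigcap_{\lambda > 0} R(\mathrm{id}_X + \lambda A)$$
and by uniform continuity this extends to all $x \in \overline{D(A)} \cap \bigcap_{\lambda > 0} R(\mathrm{id}_X + \lambda A)$ (the maps $J_\lambda$ are nonexpansive; see part (a)).

REMARK 3.3.13 When $X = H$ is a Hilbert space, then Proposition 3.3.12 coincides with Theorem 3.2.38. Also note that if $\lambda, \mu > 0$ and $x \in D(J_\lambda) = R(\mathrm{id}_X + \lambda A)$, then
$$\frac{\mu}{\lambda} x + \frac{\lambda - \mu}{\lambda} J_\lambda(x) \in D(J_\mu) = R(\mathrm{id}_X + \mu A)$$
and
$$J_\lambda(x) = J_\mu\Big(\frac{\mu}{\lambda} x + \frac{\lambda - \mu}{\lambda} J_\lambda(x)\Big)$$
(this last equality is usually known as the Resolvent Identity). To see this let $x = y + \lambda v$ with $(y, v) \in \operatorname{Gr} A$. So $J_\lambda(x) = y$. Hence we have
$$\frac{\mu}{\lambda} x + \frac{\lambda - \mu}{\lambda} J_\lambda(x) = \frac{\mu}{\lambda}(y + \lambda v) + \frac{\lambda - \mu}{\lambda}\, y = y + \mu v \in R(\mathrm{id}_X + \mu A) = D(J_\mu)$$
and
$$J_\mu\Big(\frac{\mu}{\lambda} x + \frac{\lambda - \mu}{\lambda} J_\lambda(x)\Big) = J_\mu(y + \mu v) = y = J_\lambda(x).$$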
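The resolvent identity can be checked numerically on a simple example. The sketch below again takes, purely as an illustrative assumption, $X = \mathbb{R}$ and $A = \partial|\cdot|$, whose resolvent is soft-thresholding; `resolvent` and `resolvent_identity_gap` are hypothetical helper names.

```python
def resolvent(x, lam):
    """J_lam for A = subdifferential of |.| on R (soft-thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def resolvent_identity_gap(x, lam, mu):
    """Return J_lam(x) - J_mu((mu/lam) x + ((lam - mu)/lam) J_lam(x)).

    The resolvent identity says this difference is zero.
    """
    j = resolvent(x, lam)
    shifted = (mu / lam) * x + ((lam - mu) / lam) * j
    return j - resolvent(shifted, mu)

for lam, mu, x in [(2.0, 0.5, 3.0), (1.0, 3.0, 2.0), (0.5, 3.0, 0.7)]:
    print(lam, mu, x, resolvent_identity_gap(x, lam, mu))
```

Note that the identity holds for any pair $\lambda, \mu > 0$, not only for $\lambda > \mu$; the second and third sample triples use $\mu > \lambda$.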


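Using the resolvent identity, for $\lambda > \mu > 0$, we have that
$$\begin{aligned}
\lambda \|A_\lambda(x)\|_X = \|J_\lambda(x) - x\|_X &\le \|J_\lambda(x) - J_\mu(x)\|_X + \|J_\mu(x) - x\|_X \\
&= \Big\| J_\mu\Big(\frac{\mu}{\lambda} x + \frac{\lambda - \mu}{\lambda} J_\lambda(x)\Big) - J_\mu(x) \Big\|_X + \|J_\mu(x) - x\|_X \\
&\le \Big\| \frac{\mu}{\lambda} x + \frac{\lambda - \mu}{\lambda} J_\lambda(x) - x \Big\|_X + \|J_\mu(x) - x\|_X \\
&= (\lambda - \mu)\|A_\lambda(x)\|_X + \mu \|A_\mu(x)\|_X,
\end{aligned}$$
so
$$\|A_\lambda(x)\|_X \le \|A_\mu(x)\|_X \quad \forall\, \lambda > \mu$$
and thus $\big\{\|A_\lambda(x)\|_X\big\}_{\lambda > 0}$ is increasing as $\lambda$ decreases to $0^+$.

PROPOSITION 3.3.14 If $X$ is a Banach space and $A : X \supseteq D(A) \to 2^X$ is an accretive operator, then $R(\mathrm{id}_X + \lambda A) = X$ for all $\lambda > 0$, if it holds for some $\lambda > 0$.

PROOF Suppose that
$$R(\mathrm{id}_X + \lambda A) = X$$
for some $\lambda > 0$ and let $\mu > \frac{\lambda}{2}$. Then for $u \in X$, we have
$$u \in \big(\mathrm{id}_X + \mu A\big)(x)$$
or equivalently
$$\big(\mathrm{id}_X + \lambda A\big)(x) \ni \frac{\lambda}{\mu} u + \Big(1 - \frac{\lambda}{\mu}\Big) x,$$
which in turn is equivalent to saying that $K(x) = x$ for the contraction
$$K(x) := J_\lambda\Big(\frac{\lambda}{\mu} u + \Big(1 - \frac{\lambda}{\mu}\Big) x\Big) \quad \forall\, x \in X$$
(note that $J_\lambda$ is nonexpansive and $\big|1 - \frac{\lambda}{\mu}\big| < 1$ precisely when $\mu > \frac{\lambda}{2}$). By Banach's fixed point theorem (see Theorem 7.1.2) $K(x) = x$ has a unique solution and so
$$R(\mathrm{id}_X + \mu A) = X \quad \forall\, \mu > \frac{\lambda}{2}.$$
Then by induction we conclude that $R(\mathrm{id}_X + \mu A) = X$ for all $\mu > 0$.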
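The fixed point argument in the proof of Proposition 3.3.14 is constructive, and can be sketched as a Picard iteration. The example below assumes, for illustration only, $X = \mathbb{R}$ and $A = \partial|\cdot|$ (resolvent given by soft-thresholding); it computes a solution of $u \in x + \mu A(x)$ using only $J_\lambda$ for a different $\lambda$, and the result can be compared with the closed form $J_\mu(u)$. The names `resolvent` and `solve_via_contraction` and the iteration count are assumptions of this sketch.

```python
def resolvent(x, lam):
    """J_lam for A = subdifferential of |.| on R (soft-thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def solve_via_contraction(u, lam, mu, iters=200):
    """Approximate the solution x of u in x + mu*A(x) using only J_lam.

    Iterates K(x) = J_lam((lam/mu) u + (1 - lam/mu) x), which is a
    contraction with constant |1 - lam/mu| < 1 when mu > lam/2.
    """
    if mu <= lam / 2:
        raise ValueError("the map K is a contraction only for mu > lam/2")
    x = 0.0
    for _ in range(iters):
        x = resolvent((lam / mu) * u + (1.0 - lam / mu) * x, lam)
    return x

for u, lam, mu in [(3.0, 1.0, 0.6), (10.0, 1.0, 4.0), (-2.0, 0.5, 0.3)]:
    print(u, lam, mu, solve_via_contraction(u, lam, mu), resolvent(u, mu))
```

Repeating the step with $\lambda$ replaced by a value of $\mu$ already reached covers every $\mu > 0$, which is the induction in the proof.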


REMARK 3.3.15 Using Proposition 3.3.14, we can say that an accretive operator $A : X \supseteq D(A) \to 2^X$ is m-accretive if and only if
$$R(\mathrm{id}_X + \lambda A) = X \quad \text{for some } \lambda > 0$$
(equivalently for all $\lambda > 0$; see Definition 3.3.1).

PROPOSITION 3.3.16 If $X$ is a Banach space and $A : X \supseteq D(A) \to 2^X$ is an m-accretive operator, then

(a) $A$ is maximal accretive;

(b) $A$ is closed;

(c) if $x_\lambda \to x$ and $A_\lambda(x_\lambda) \to u$ in $X$ as $\lambda \downarrow 0$, then $(x, u) \in \operatorname{Gr} A$.

PROOF (a) Let $(x_0, u_0) \in X \times X$ and suppose that
$$\|x_0 - x\|_X \le \|x_0 - x + \lambda(u_0 - u)\|_X \quad \forall\, \lambda > 0,\ (x, u) \in \operatorname{Gr} A. \tag{3.49}$$
We need to show that $(x_0, u_0) \in \operatorname{Gr} A$ (see Definition 3.3.1 and Proposition 3.3.4). Since $A$ is m-accretive, we have that $X = R(\mathrm{id}_X + A)$ and so we can find $(x, u) \in \operatorname{Gr} A$, such that
$$x + u = x_0 + u_0.$$
Using this in (3.49) with $\lambda = 1$, we obtain $x = x_0$, hence $u = u_0$ and $(x_0, u_0) \in \operatorname{Gr} A$.

(b) Let $\{(x_n, u_n)\}_{n \ge 1} \subseteq \operatorname{Gr} A$ and assume that $(x_n, u_n) \to (x, u)$ in $X \times X$. We need to show that $(x, u) \in \operatorname{Gr} A$. By virtue of the accretivity of $A$, we have
$$\|x_n - y\|_X \le \|x_n - y + \lambda(u_n - v)\|_X \quad \forall\, n \ge 1,\ \lambda > 0,\ (y, v) \in \operatorname{Gr} A,$$
so
$$\|x - y\|_X \le \|x - y + \lambda(u - v)\|_X \quad \forall\, \lambda > 0,\ (y, v) \in \operatorname{Gr} A. \tag{3.50}$$


But from part (a), we know that $A$ is maximal accretive. So from (3.50), it follows that $(x, u) \in \operatorname{Gr} A$.

(c) From Proposition 3.3.12(a) and (e), we have that
$$J_\lambda(x_\lambda) \to x \quad \text{in } X, \text{ as } \lambda \downarrow 0,$$
while from Proposition 3.3.12(c), we know that
$$A_\lambda(x_\lambda) \in A\big(J_\lambda(x_\lambda)\big) \quad \forall\, \lambda > 0.$$
Then using part (b), we infer that $(x, u) \in \operatorname{Gr} A$.

We can improve conclusion (b) of Proposition 3.3.16, provided we strengthen the condition on the space $X$.

PROPOSITION 3.3.17 If $X$ is a reflexive Banach space with $X^*$ being locally uniformly convex and $A : X \supseteq D(A) \to 2^X$ is an m-accretive operator, then $\operatorname{Gr} A$ is sequentially closed in $X \times X_w$.

PROOF Suppose that $\{(x_n, u_n)\}_{n \ge 1} \subseteq \operatorname{Gr} A$ is a sequence, such that
$$x_n \to x \quad \text{and} \quad u_n \xrightarrow{w} u \quad \text{in } X.$$
We need to show that $(x, u) \in \operatorname{Gr} A$. Because $A$ is m-accretive, from Theorem 3.3.10(b)[A3], we have
$$\langle F(x_n - y), u_n - v \rangle_X \ge 0 \quad \forall\, n \ge 1,\ (y, v) \in \operatorname{Gr} A. \tag{3.51}$$
But from Proposition 3.2.25, we know that the duality map $F : X \to X^*$ is continuous. So if we pass to the limit as $n \to +\infty$ in (3.51), we obtain
$$\langle F(x - y), u - v \rangle_X \ge 0 \quad \forall\, (y, v) \in \operatorname{Gr} A,$$
so
$$(u - v, x - y)_+ \ge 0 \quad \forall\, (y, v) \in \operatorname{Gr} A$$
and thus, from Proposition 3.3.16(a), we conclude that $(x, u) \in \operatorname{Gr} A$.

Another useful result that can be proved by imposing extra conditions on the space X is the following one.


PROPOSITION 3.3.18 If $X$ is a Banach space with $X^*$ strictly convex and $A : X \supseteq D(A) \to 2^X$ is a maximal accretive operator, then the set $A(x) \subseteq X$ is convex and closed for any $x \in D(A)$.

PROOF Because $X^*$ is strictly convex, the duality map $F : X \to X^*$ is single-valued. First we show that for $x \in D(A)$, the set $A(x)$ is convex. So let $u, v \in A(x)$ and set
$$w = tu + (1 - t)v \quad \text{with } t \in [0, 1].$$
For all $(y, h) \in \operatorname{Gr} A$, we have that
$$\langle F(x - y), w - h \rangle_X = t\, \langle F(x - y), u - h \rangle_X + (1 - t)\, \langle F(x - y), v - h \rangle_X \ge 0,$$
so from the maximality of $A$, we have that $(x, w) \in \operatorname{Gr} A$. Next we show that the set $A(x)$ is closed in $X$. To this end let $\{u_n\}_{n \ge 1} \subseteq A(x)$ be a sequence, such that
$$u_n \to u \quad \text{in } X.$$
We have
$$\langle F(x - y), u_n - v \rangle_X \ge 0 \quad \forall\, n \ge 1,\ (y, v) \in \operatorname{Gr} A,$$
so
$$\langle F(x - y), u - v \rangle_X \ge 0 \quad \forall\, (y, v) \in \operatorname{Gr} A$$
and so, from the maximality of $A$, we have that $(x, u) \in \operatorname{Gr} A$.

We continue with the properties of maximal accretive and m-accretive operators.

PROPOSITION 3.3.19 If $X$ is a uniformly convex Banach space and $A : X \supseteq D(A) \to 2^X$ is an m-accretive operator, then $\overline{D(A)}$ is convex.

PROOF Let
$$D_0 := \Big\{ x \in \overline{\operatorname{conv}}\, D(A) : \lim_{\lambda \downarrow 0} J_\lambda(x) = x \Big\}.$$
Evidently $\overline{D(A)} = D_0$ and $D_0$ is closed. So it suffices to show that $D_0$ is convex. For $x, y \in D_0$, we have
$$\Big\| J_\lambda\Big(\frac{x + y}{2}\Big) - J_\lambda(x) \Big\|_X \le \Big\| \frac{x - y}{2} \Big\|_X \quad \forall\, \lambda > 0 \tag{3.52}$$


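and
$$\Big\| J_\lambda\Big(\frac{x + y}{2}\Big) - J_\lambda(y) \Big\|_X \le \Big\| \frac{x - y}{2} \Big\|_X \quad \forall\, \lambda > 0 \tag{3.53}$$
(see Proposition 3.3.12(a)). From (3.52) and (3.53), it follows that
$$\Big\{ J_\lambda\Big(\frac{x + y}{2}\Big) \Big\}_{\lambda \in (0, 1)} \subseteq X \ \text{is bounded}.$$
Since $X$ is reflexive (being uniformly convex; see Remark A.3.22), we can find a sequence $\lambda_n \downarrow 0$, such that
$$J_{\lambda_n}\Big(\frac{x + y}{2}\Big) \xrightarrow{w} h \quad \text{in } X.$$
So if we pass to the limit as $n \to +\infty$ in (3.52) and (3.53), we obtain
$$\|h - x\|_X \le \Big\| \frac{x - y}{2} \Big\|_X \quad \text{and} \quad \|h - y\|_X \le \Big\| \frac{x - y}{2} \Big\|_X. \tag{3.54}$$
We have
$$\|x - y\|_X \le \|x - h\|_X + \|h - y\|_X \le \|x - y\|_X. \tag{3.55}$$
From (3.54) and (3.55), it follows that
$$\|x - h\|_X = \|y - h\|_X = \Big\| \frac{x - y}{2} \Big\|_X$$
and this by virtue of the uniform convexity of $X$ implies that $h = \frac{x + y}{2}$. Thus we have
$$J_{\lambda_n}\Big(\frac{x + y}{2}\Big) \xrightarrow{w} \frac{x + y}{2} \quad \text{in } X. \tag{3.56}$$
Moreover, we have
$$\Big\| \frac{y - x}{2} \Big\|_X \le \liminf_{n \to +\infty} \Big\| J_{\lambda_n}\Big(\frac{x + y}{2}\Big) - J_{\lambda_n}(x) \Big\|_X \le \limsup_{n \to +\infty} \Big\| J_{\lambda_n}\Big(\frac{x + y}{2}\Big) - J_{\lambda_n}(x) \Big\|_X \le \Big\| \frac{y - x}{2} \Big\|_X,$$
so
$$\Big\| J_{\lambda_n}\Big(\frac{x + y}{2}\Big) - J_{\lambda_n}(x) \Big\|_X \to \Big\| \frac{y - x}{2} \Big\|_X. \tag{3.57}$$
Since $J_{\lambda_n}(x) \to x$ in $X$, from (3.56) and (3.57) and the Kadec-Klee property (see Remark A.3.22), we conclude that
$$J_{\lambda_n}\Big(\frac{x + y}{2}\Big) \to \frac{x + y}{2}$$
and so $\frac{x + y}{2} \in D_0$, which proves the convexity of $D_0$, hence the convexity of $\overline{D(A)}$.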


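Next we prove two perturbation results for m-accretive operators. To do this we shall need the following auxiliary result.

LEMMA 3.3.20 If $X$ is a Banach space with $X^*$ being uniformly convex (hence $X$ is reflexive) and $A : X \supseteq D(A) \to 2^X$ and $B : X \supseteq D(B) \to 2^X$ are two m-accretive operators with $D(A) \cap D(B) \neq \emptyset$, then for every $u \in X$ and every $\lambda > 0$, the operator inclusion
$$x + A(x) + B_\lambda(x) \ni u$$
has a unique solution $x_\lambda \in D(A)$ and $\{x_\lambda\}_{\lambda > 0}$ is bounded. Moreover, if $\{B_\lambda(x_\lambda)\}_{\lambda \in (0, 1)}$ is bounded, then the operator inclusion
$$x + A(x) + B(x) \ni u$$
has a unique solution $x \in D(A) \cap D(B)$ and
$$x_\lambda \to x \quad \text{in } X, \text{ as } \lambda \downarrow 0.$$

PROOF The operator inclusion
$$x + A(x) + B_\lambda(x) \ni u$$
is equivalent to
$$x = \Big(\mathrm{id}_X + \frac{\lambda}{\lambda + 1} A\Big)^{-1}\Big(\frac{\lambda}{\lambda + 1} u + \frac{1}{\lambda + 1}\big(\mathrm{id}_X + \lambda B\big)^{-1}(x)\Big),$$
so
$$x = K_\lambda(x), \tag{3.58}$$
with
$$K_\lambda(x) := J^A_{\frac{\lambda}{\lambda + 1}}\Big(\frac{\lambda}{\lambda + 1} u + \frac{1}{\lambda + 1} J^B_\lambda(x)\Big) \quad \forall\, \lambda > 0.$$
Since the operators $J^A_{\frac{\lambda}{\lambda + 1}}$ and $J^B_\lambda$ are nonexpansive on $X$ (see Proposition 3.3.12(a)), we can check that
$$\|K_\lambda(x) - K_\lambda(y)\|_X \le \frac{1}{\lambda + 1}\|x - y\|_X \quad \forall\, x, y \in X,\ \lambda > 0.$$
Invoking Banach's fixed point theorem (see Theorem 7.1.2), we infer that (3.58) has a unique solution $x_\lambda \in D(A)$ for $\lambda > 0$. Let
$$z \in D(A) \cap D(B) \quad \text{and} \quad u_\lambda \in z + A(z) + B_\lambda(z).$$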


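From Proposition 3.3.12(b), we know that $B_\lambda$ is accretive and since the sum of accretive operators is clearly accretive, we have that the operator $A + B_\lambda : X \supseteq D(A) \to 2^X$ is accretive. So
$$\langle F(x_\lambda - z), u - x_\lambda - (u_\lambda - z) \rangle_X \ge 0,$$
hence
$$\|x_\lambda - z\|_X^2 \le \langle F(x_\lambda - z), u - u_\lambda \rangle_X \le \|x_\lambda - z\|_X \|u - u_\lambda\|_X,$$
from which it follows that
$$\|x_\lambda - z\|_X \le \|u - u_\lambda\|_X. \tag{3.59}$$
Because
$$\|B_\lambda(z)\|_X \le |B(z)|$$
(see Proposition 3.3.12(d)), from (3.59), we infer that $\{x_\lambda\}_{\lambda > 0}$ is bounded. Now suppose that $\{B_\lambda(x_\lambda)\}_{\lambda > 0}$ is bounded. For $\lambda, \mu > 0$, we have
$$u - x_\lambda - B_\lambda(x_\lambda) \in A(x_\lambda) \quad \text{and} \quad u - x_\mu - B_\mu(x_\mu) \in A(x_\mu).$$
Exploiting the accretivity of the operator $A$, we obtain
$$\langle F(x_\lambda - x_\mu), x_\mu - x_\lambda + B_\mu(x_\mu) - B_\lambda(x_\lambda) \rangle_X \ge 0,$$
so
$$\|x_\lambda - x_\mu\|_X^2 \le \langle F(x_\lambda - x_\mu), B_\mu(x_\mu) - B_\lambda(x_\lambda) \rangle_X.$$
Because
$$B_\lambda(x_\lambda) \in B\big(J_\lambda(x_\lambda)\big) \quad \text{and} \quad B_\mu(x_\mu) \in B\big(J_\mu(x_\mu)\big)$$
(see Proposition 3.3.12(c); here $J_\lambda$ denotes the resolvent $J^B_\lambda$ of $B$) and $B$ is accretive, we have
$$\big\langle F\big(J_\mu(x_\mu) - J_\lambda(x_\lambda)\big), B_\mu(x_\mu) - B_\lambda(x_\lambda) \big\rangle_X \ge 0.$$
It follows that
$$\|x_\lambda - x_\mu\|_X^2 \le \big\langle F(x_\lambda - x_\mu) + F\big(J_\mu(x_\mu) - J_\lambda(x_\lambda)\big), B_\mu(x_\mu) - B_\lambda(x_\lambda) \big\rangle_X. \tag{3.60}$$
Since
$$\lambda B_\lambda(x_\lambda) = x_\lambda - J_\lambda(x_\lambda)$$
and by hypothesis $\{B_\lambda(x_\lambda)\}_{\lambda \in (0, 1)}$ is bounded, we have that
$$\|x_\lambda - J_\lambda(x_\lambda)\|_X \le M_1 \lambda \quad \forall\, \lambda \in (0, 1),$$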


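for some $M_1 > 0$. Because $X^*$ is uniformly convex, we have that the duality map is uniformly continuous on bounded sets of $X$ (see Proposition 3.2.28). Therefore since the duality map is odd (see Proposition 3.2.22), we see that given $\varepsilon > 0$, for all $\lambda, \mu > 0$ small enough, we have
$$\big\| F(x_\lambda - x_\mu) + F\big(J_\mu(x_\mu) - J_\lambda(x_\lambda)\big) \big\|_{X^*} = \big\| F\big(J_\mu(x_\mu) - J_\lambda(x_\lambda)\big) - F(x_\mu - x_\lambda) \big\|_{X^*} \le \varepsilon.$$
So from (3.60), we have
$$\|x_\lambda - x_\mu\|_X^2 \le M_1 \varepsilon \quad \forall\, \lambda, \mu > 0 \text{ small enough}.$$
Since $\varepsilon > 0$ was arbitrary, we conclude that
$$x_\lambda \to x \quad \text{in } X, \text{ as } \lambda \downarrow 0,$$
for some $x \in X$. Let $\lambda_n \downarrow 0$ be such that
$$B_{\lambda_n}(x_{\lambda_n}) \xrightarrow{w} z \quad \text{in } X$$
(recall that $X$ is reflexive and by hypothesis $\{B_\lambda(x_\lambda)\}_{\lambda \in (0, 1)}$ is bounded). Set
$$v_n := u - x_{\lambda_n} - B_{\lambda_n}(x_{\lambda_n}) \quad \forall\, n \ge 1.$$
Then
$$v_n \xrightarrow{w} v = u - x - z \quad \text{in } X$$
and $v_n \in A(x_{\lambda_n})$. Invoking Proposition 3.3.17, we have that $(x, v) \in \operatorname{Gr} A$. Also
$$\big(J_{\lambda_n}(x_{\lambda_n}), B_{\lambda_n}(x_{\lambda_n})\big) \in \operatorname{Gr} B$$
and
$$J_{\lambda_n}(x_{\lambda_n}) \to x \quad \text{and} \quad B_{\lambda_n}(x_{\lambda_n}) \xrightarrow{w} z \quad \text{in } X.$$
So once again via Proposition 3.3.17, we have that $(x, z) \in \operatorname{Gr} B$. Thus finally
$$x \in D(A) \cap D(B) \quad \text{and} \quad u = x + v + z \quad \text{with } v \in A(x) \text{ and } z \in B(x).$$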
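The construction in the proof of Lemma 3.3.20 can be traced on a one-dimensional example. The sketch below assumes, for illustration only, $X = \mathbb{R}$, $A = \partial|\cdot|$ (so $J^A$ is soft-thresholding) and $B = \mathrm{id}$ (so $J^B_\lambda(x) = x/(1+\lambda)$ and $B_\lambda(x) = x/(1+\lambda)$). For $u = 3$ the regularized inclusion $x + A(x) + B_\lambda(x) \ni u$ has the solution $x_\lambda = 2(1+\lambda)/(2+\lambda)$, and the limit inclusion $x + A(x) + B(x) \ni u$ has the solution $x = 1$. The function names and the iteration budget are assumptions of this sketch.

```python
def resolvent_abs(x, lam):
    """J^A_lam for A = subdifferential of |.| on R (soft-thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def resolvent_id(x, lam):
    """J^B_lam for B = identity on R: solves y + lam*y = x."""
    return x / (1.0 + lam)

def solve_regularized(u, lam):
    """Fixed point of K_lam(x) = J^A_c(c*u + J^B_lam(x)/(lam+1)), c = lam/(lam+1).

    K_lam is a contraction with constant 1/(lam+1), so Picard iteration
    converges; the number of steps is scaled with 1/lam.
    """
    c = lam / (lam + 1.0)
    x = 0.0
    for _ in range(int(30.0 / lam) + 100):
        x = resolvent_abs(c * u + resolvent_id(x, lam) / (lam + 1.0), c)
    return x

u = 3.0
for lam in (1.0, 0.1, 0.01, 0.001):
    x_lam = solve_regularized(u, lam)
    print(lam, x_lam, 2.0 * (1.0 + lam) / (2.0 + lam))
```

As $\lambda$ decreases the computed $x_\lambda$ approaches $1$, in line with the convergence $x_\lambda \to x$ asserted by the lemma.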

Using this lemma we can prove two perturbation theorems for m-accretive operators.


THEOREM 3.3.21 If $X$ is a Banach space with $X^*$ being uniformly convex (hence $X$ is reflexive) and $A : X \supseteq D(A) \to 2^X$, $B : X \supseteq D(B) \to 2^X$ are two m-accretive operators, such that

(i) $D(A) \cap D(B) \neq \emptyset$;

(ii) $\big\langle F\big(B_\lambda(x)\big), u \big\rangle_X \ge 0$ for all $\lambda > 0$ and all $(x, u) \in \operatorname{Gr} A$,

then $A + B$ is m-accretive.

PROOF Let $u \in X$. By virtue of Lemma 3.3.20, we can find a unique $x_\lambda \in D(A)$, such that
$$u \in x_\lambda + A(x_\lambda) + B_\lambda(x_\lambda) \quad \forall\, \lambda > 0.$$
We take the duality brackets with $F\big(B_\lambda(x_\lambda)\big)$. Using (ii) and the fact that $\{x_\lambda\}_{\lambda > 0}$ is bounded (see Lemma 3.3.20), we obtain that $\{B_\lambda(x_\lambda)\}_{\lambda \in (0, 1)}$ is bounded. Then by virtue of Lemma 3.3.20, we have that
$$x_\lambda \to x \quad \text{in } X \text{ as } \lambda \downarrow 0$$
and $u \in x + A(x) + B(x)$, i.e., $R(\mathrm{id}_X + A + B) = X$, which means that $A + B$ is m-accretive.

THEOREM 3.3.22 If $X$ is a Banach space with $X^*$ being uniformly convex (hence $X$ is reflexive) and $A : X \supseteq D(A) \to 2^X$, $B : X \supseteq D(B) \to 2^X$ are two m-accretive operators, such that

(i) $D(A) \subseteq D(B)$;

(ii) for each $r > 0$, there are $c < 1$ and $d > 0$, such that
$$|B(x)| \le c\,|A(x)| + d \quad \forall\, x \in D(A),\ \|x\|_X \le r,$$

then $A + B$ is m-accretive.


PROOF Let $u \in X$ and let $x_\lambda \in D(A)$ be the unique solution of the operator inclusion
$$u \in x_\lambda + A(x_\lambda) + B_\lambda(x_\lambda).$$
Since $x_\lambda \in D(A) \subseteq D(B)$ (see (i)) and
$$\|B_\lambda(x_\lambda)\|_X \le |B(x_\lambda)| \quad \forall\, \lambda > 0$$
(see Proposition 3.3.12(d)), from condition (ii) and since $\{x_\lambda\}_{\lambda > 0}$ is bounded (see Lemma 3.3.20), we obtain
$$|A(x_\lambda)| \le c\,|A(x_\lambda)| + d_0 \quad \forall\, \lambda > 0,$$
for some $d_0 > 0$, so $\{|A(x_\lambda)|\}_{\lambda > 0}$ is bounded (since $c < 1$). Using this fact in condition (ii), we infer that $\{B_\lambda(x_\lambda)\}_{\lambda \in (0, 1)}$ is bounded. Then invoking Lemma 3.3.20, we finish the proof as before.

REMARK 3.3.23 Condition (ii) of Theorem 3.3.22 can be replaced by the following local condition:

(ii)' for every $x_0 \in D(A)$, we can find a neighbourhood $U$ of $x_0$ and constants $c < 1$ and $d > 0$, such that
$$|B(x)| \le c\,|A(x)| + d \quad \forall\, x \in D(A) \cap U$$

(see Kato (1967)). In applications in general Theorem 3.3.21 is more convenient than Theorem 3.3.22.

We present an application of the perturbation results in the study of elliptic boundary value problems. To this end let $\Omega \subseteq \mathbb{R}^N$ be a bounded domain with a $C^2$-boundary $\partial\Omega$. We shall need the following existence, uniqueness and regularity result due to Agmon, Douglis & Nirenberg (1959).

THEOREM 3.3.24 If $\Omega \subseteq \mathbb{R}^N$ is as above, $p \in (1, +\infty)$ and $f \in L^p(\Omega)$, then there exists a unique $x \in W^{2,p}(\Omega) \cap W_0^{1,p}(\Omega)$, such that
$$-\Delta x(z) + x(z) = f(z) \quad \text{for a.a. } z \in \Omega, \qquad x|_{\partial\Omega} = 0.$$
Moreover, if $\partial\Omega$ is a $C^{m+1}$-manifold for some $m \ge 1$ and $f \in W^{m,p}(\Omega)$, then $x \in W^{m+2,p}(\Omega)$ and
$$\|x\|_{W^{m+2,p}(\Omega)} \le c\, \|f\|_{W^{m,p}(\Omega)}$$
for some $c = c(m, p, \Omega) > 0$.


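Let
$$\xi : \mathbb{R} \supseteq D(\xi) \to 2^{\mathbb{R}}$$
be a maximal monotone map with $0 \in \xi(0)$. We consider the realization (lifting) of $\xi$ on $L^p(\Omega) \times L^p(\Omega)$ for $p \in (1, +\infty)$. So we define $\widehat{\xi} : L^p(\Omega) \supseteq D(\widehat{\xi}) \to 2^{L^p(\Omega)}$ by
$$\widehat{\xi}(x) := \Big\{ u \in L^p(\Omega) : u(z) \in \xi\big(x(z)\big) \ \text{for a.a. } z \in \Omega \Big\} \quad \forall\, x \in D(\widehat{\xi}),$$
where
$$D(\widehat{\xi}) := \Big\{ x \in L^p(\Omega) : \text{there exists } u \in L^p(\Omega), \ \text{such that } u(z) \in \xi\big(x(z)\big) \ \text{for a.a. } z \in \Omega \Big\}.$$
A simple measurable selection argument establishes that $\widehat{\xi}$ is m-accretive and we have
$$\big[(\mathrm{id}_X + \lambda \widehat{\xi})^{-1} x\big](z) = (1 + \lambda \xi)^{-1} x(z) \quad \text{for a.a. } z \in \Omega$$
with $X = L^p(\Omega)$ and
$$\widehat{\xi}_\lambda(x)(z) = \xi_\lambda\big(x(z)\big) \quad \text{for a.a. } z \in \Omega, \text{ all } \lambda > 0 \text{ and all } x \in L^p(\Omega).$$

We consider the operator $K : L^p(\Omega) \supseteq D(K) \to 2^{L^p(\Omega)}$, defined by
$$K(x) = -\Delta x + \widehat{\xi}(x) \quad \forall\, x \in D(K), \tag{3.61}$$
where
$$D(K) := W_0^{1,p}(\Omega) \cap W^{2,p}(\Omega) \cap D(\widehat{\xi}).$$

PROPOSITION 3.3.25 If $K : L^p(\Omega) \supseteq D(K) \to 2^{L^p(\Omega)}$ is the operator defined by (3.61), then $K$ is m-accretive.

PROOF It is easy to check that the duality map on $L^p(\Omega)$,
$$F : L^p(\Omega) \to L^{p'}(\Omega)$$
(with $\frac{1}{p} + \frac{1}{p'} = 1$), is defined by
$$F(x)(\cdot) := \frac{x(\cdot)\,|x(\cdot)|^{p-2}}{\|x\|_p^{p-2}}.$$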


Using this we can check that the operator
$$-\Delta : L^p(\Omega) \supseteq W_0^{1,p}(\Omega) \cap W^{2,p}(\Omega) \to L^p(\Omega)$$
is accretive. Moreover, by virtue of Theorem 3.3.24, it follows that $-\Delta$ is m-accretive. We have
$$\big\langle F\big(\widehat{\xi}_\lambda(x)\big), -\Delta x \big\rangle_{L^p(\Omega)} = \int_\Omega -\Delta x(z)\, \xi_\lambda\big(x(z)\big) \big|\xi_\lambda\big(x(z)\big)\big|^{p-2}\, dz. \tag{3.62}$$
If
$$\varphi_\lambda(r) := \xi_\lambda(r)\, |\xi_\lambda(r)|^{p-2},$$
then $\varphi_\lambda$ is a Lipschitz continuous map and
$$\varphi_\lambda'(r) = (p - 1)\, |\xi_\lambda(r)|^{p-2}\, \xi_\lambda'(r) \quad \text{for a.a. } r \in \mathbb{R}.$$
Also
$$\varphi_\lambda\big(x(\cdot)\big) \in W_0^{1,p}(\Omega)$$
(see Proposition 2.4.25 and Remark 2.4.26) and
$$D\varphi_\lambda\big(x(z)\big) = \varphi_\lambda'\big(x(z)\big)\, Dx(z) \quad \text{for a.a. } z \in \Omega.$$
Performing an integration by parts on the right hand side integral of (3.62) and recalling that $\xi_\lambda(0) = 0$ (since $0 \in \xi(0)$), we obtain
$$\big\langle F\big(\widehat{\xi}_\lambda(x)\big), -\Delta x \big\rangle_{L^p(\Omega)} = \int_\Omega \varphi_\lambda'\big(x(z)\big)\, \|Dx(z)\|_{\mathbb{R}^N}^2\, dz \ge 0,$$
since $\varphi_\lambda'\big(x(z)\big) \ge 0$, because $\varphi_\lambda$ is monotone increasing. Applying Theorem 3.3.21, with data $A = -\Delta$, $B = \widehat{\xi}$ and $X = L^p(\Omega)$ (note that $X^* = L^{p'}(\Omega)$ is uniformly convex since $p' \in (1, +\infty)$), we obtain that $A + B = K$ is m-accretive.

The main reason for studying accretive operators is the fact that they are closely related to the generation of semigroups (linear and nonlinear). The theory of semigroups is a valuable tool in the study of partial differential equations, of Volterra integral equations and of control problems. In the rest of this section, we will see how m-accretive operators lead to semigroups of operators, which in turn describe the time-evolution of a dynamical system governed by a differential equation in a Banach space (evolution equation). So we start our discussion with an existence result for evolution equations in which the input and Cauchy data are regular (smooth). First we state two useful auxiliary results.


LEMMA 3.3.26
If $X$ is a Banach space, $x \colon T = [0,b] \longrightarrow X$ is weakly differentiable at $t \in T$, i.e.,
\[ \frac{d}{ds}\big\langle x^*, x(s) \big\rangle_X \Big|_{s=t} = \big\langle x^*, x'(t) \big\rangle_X \quad \forall\, x^* \in X^*, \]
and $s \longmapsto \|x(s)\|_X$ is differentiable at $s = t$, then
\[ \|x(t)\|_X \Big[\frac{d}{ds}\|x(s)\|_X\Big]_{s=t} = \big\langle x^*, x'(t) \big\rangle_X \quad \forall\, x^* \in \mathcal{F}\big(x(t)\big). \]

PROOF
For every $x^* \in \mathcal{F}\big(x(t)\big)$ and $r > 0$, we have
\[ \big\langle x^*, x(t+r) - x(t) \big\rangle_X \leq \|x^*\|_{X^*}\big(\|x(t+r)\|_X - \|x(t)\|_X\big). \]
Since by hypothesis $x$ is weakly differentiable at $t$, dividing by $r > 0$ and letting $r \searrow 0$, we obtain
\[ \big\langle x^*, x'(t) \big\rangle_X \leq \|x(t)\|_X \Big[\frac{d}{ds}\|x(s)\|_X\Big]_{s=t}. \tag{3.63} \]
On the other hand, since
\[ \big\langle x^*, x(t) - x(t-r) \big\rangle_X \geq \|x^*\|_{X^*}\big(\|x(t)\|_X - \|x(t-r)\|_X\big), \]
arguing as above, we obtain
\[ \|x(t)\|_X \Big[\frac{d}{ds}\|x(s)\|_X\Big]_{s=t} \leq \big\langle x^*, x'(t) \big\rangle_X. \tag{3.64} \]
From (3.63) and (3.64), we conclude the desired equality.

The second auxiliary result is a Gronwall-type lemma which is used frequently in the study of evolution equations.

LEMMA 3.3.27
If $\varphi \in L^1(T)$, $\varphi(t) \geq 0$ for almost all $t \in T$, $\eta \in \mathbb{R}$, $u \in C(T)$ and
\[ \frac{1}{2}u(t)^2 \leq \frac{1}{2}\eta^2 + \int_0^t \varphi(s)u(s)\,ds \quad \forall\, t \in T, \]
then
\[ |u(t)| \leq |\eta| + \int_0^t \varphi(s)\,ds \quad \forall\, t \in T. \]

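PROOF
Let
\[ \xi_\varepsilon(t) = \frac{1}{2}(\eta + \varepsilon)^2 + \int_0^t \varphi(s)u(s)\,ds, \]
with $\varepsilon > 0$ and $t \in T$. Then
\[ \xi_\varepsilon'(t) = \varphi(t)u(t) \quad \text{for a.a. } t \in T. \]
Moreover,
\[ \frac{1}{2}u(t)^2 \leq \xi_0(t) \leq \xi_\varepsilon(t) \quad \forall\, \varepsilon > 0,\ t \in T, \]
so
\[ \xi_\varepsilon'(t) \leq \varphi(t)\sqrt{2}\sqrt{\xi_\varepsilon(t)} \quad \forall\, \varepsilon > 0,\ t \in T. \]
Because $t \longmapsto \xi_\varepsilon(t)$ is absolutely continuous with values in $\mathbb{R}_+$, we have
\[ \big(\sqrt{\xi_\varepsilon(t)}\big)' = \frac{1}{2\sqrt{\xi_\varepsilon(t)}}\,\xi_\varepsilon'(t) \quad \text{for a.a. } t \in T, \]
thus
\[ \big(\sqrt{\xi_\varepsilon(t)}\big)' \leq \frac{1}{\sqrt{2}}\,\varphi(t) \quad \text{for a.a. } t \in T \]
and so
\[ \sqrt{\xi_\varepsilon(t)} \leq \sqrt{\xi_\varepsilon(0)} + \frac{1}{\sqrt{2}}\int_0^t \varphi(s)\,ds \quad \forall\, t \in T. \]
Therefore, it follows that
\[ |u(t)| \leq \sqrt{2}\sqrt{\xi_\varepsilon(t)} \leq \sqrt{2\xi_\varepsilon(0)} + \int_0^t \varphi(s)\,ds = |\eta + \varepsilon| + \int_0^t \varphi(s)\,ds. \]
Let $\varepsilon \searrow 0$, to conclude that
\[ |u(t)| \leq |\eta| + \int_0^t \varphi(s)\,ds \quad \forall\, t \in T. \]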
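As a purely numerical illustration of Lemma 3.3.27 (not part of the proof), the following sketch samples $\varphi$ and $u$ on a uniform grid of $[0,b]$ and checks both the hypothesis and the conclusion of the lemma with the trapezoidal rule. The function names, the grid size and the tolerance are assumptions made only for this example.

```python
import math


def cumulative_trapezoid(values, h):
    """Running trapezoidal integral of equally spaced samples (first entry is 0)."""
    out = [0.0]
    for k in range(1, len(values)):
        out.append(out[-1] + 0.5 * h * (values[k - 1] + values[k]))
    return out


def check_gronwall_type_bound(phi, u, eta, b=1.0, n=2000, tol=1e-6):
    """Return (hypothesis_holds, conclusion_holds) on a uniform grid of [0, b].

    hypothesis: u(t)^2 / 2 <= eta^2 / 2 + int_0^t phi(s) u(s) ds
    conclusion: |u(t)| <= |eta| + int_0^t phi(s) ds
    Both are checked up to the (assumed) numerical tolerance tol.
    """
    h = b / n
    ts = [k * h for k in range(n + 1)]
    phi_vals = [phi(t) for t in ts]
    u_vals = [u(t) for t in ts]
    int_phi_u = cumulative_trapezoid([p * v for p, v in zip(phi_vals, u_vals)], h)
    int_phi = cumulative_trapezoid(phi_vals, h)
    hypothesis = all(0.5 * v * v <= 0.5 * eta * eta + i + tol
                     for v, i in zip(u_vals, int_phi_u))
    conclusion = all(abs(v) <= abs(eta) + i + tol
                     for v, i in zip(u_vals, int_phi))
    return hypothesis, conclusion


if __name__ == "__main__":
    phi = lambda t: 1.0 + t
    # Extremal case: u = eta + int_0^t phi turns both inequalities into equalities.
    print(check_gronwall_type_bound(phi, lambda t: 1.0 + t + 0.5 * t * t, eta=1.0))
    # A non-extremal case with an oscillating u.
    print(check_gronwall_type_bound(phi, lambda t: 0.5 * math.cos(3.0 * t), eta=1.0))
```

The first test function is the extremal one: for $u(t) = \eta + \int_0^t \varphi(s)\,ds$ with $\eta \geq 0$, both the hypothesis and the conclusion hold with equality.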

Using these results, we can prove the first existence theorem for evolution equations driven by m-accretive operators.


THEOREM 3.3.28
If $X$ is a Banach space with $X^*$ being uniformly convex (hence $X$ is reflexive), $A \colon X \supseteq D(A) \longrightarrow 2^X$ is an m-accretive operator, $x_0 \in D(A)$, $f \in W^{1,1}\big((0,b);X\big)$ and $\omega \in \mathbb{R}$, then we can find a unique $x \in W^{1,\infty}\big((0,b);X\big)$, such that
\[ \begin{cases} x'(t) + A\big(x(t)\big) \ni \omega x(t) + f(t) & \text{for a.a. } t \in T = [0,b], \\ x(0) = x_0. \end{cases} \]

PROOF
First we show that the solution, if it exists, is unique. So suppose that $x_1, x_2 \in W^{1,\infty}\big((0,b);X\big)$ are two solutions of the evolution Cauchy problem. We have
\[ \big(x_1(t) - x_2(t)\big)' + A\big(x_1(t)\big) - A\big(x_2(t)\big) \ni \omega\big(x_1(t) - x_2(t)\big) \quad \text{for a.a. } t \in T. \]
Let
\[ \varphi(t) = \|x_1(t) - x_2(t)\|_X. \]
Since $x_1, x_2 \in W^{1,\infty}\big((0,b);X\big)$, they are Lipschitz continuous functions (see Theorem 2.2.24) and so they are differentiable almost everywhere on $T$ (see Theorem 2.2.17). Moreover, $\varphi$ is a Lipschitz continuous function too, thus differentiable almost everywhere on $T$. So we can use Lemma 3.3.26 and obtain
\[ \|x_1(t) - x_2(t)\|_X\,\frac{d}{dt}\|x_1(t) - x_2(t)\|_X = \big\langle \mathcal{F}\big(x_1(t) - x_2(t)\big), x_1'(t) - x_2'(t) \big\rangle_X \quad \text{for a.a. } t \in T. \]
Since $A$ is m-accretive, we also have
\[ \|x_1(t) - x_2(t)\|_X\,\frac{d}{dt}\|x_1(t) - x_2(t)\|_X \leq \omega\,\|x_1(t) - x_2(t)\|_X^2 \quad \text{for a.a. } t \in T, \]
so
\[ \frac{d}{dt}\|x_1(t) - x_2(t)\|_X \leq \omega\,\|x_1(t) - x_2(t)\|_X \quad \text{for a.a. } t \in T. \]
Because $x_1(0) - x_2(0) = 0$, by Gronwall's inequality (the differential form), we obtain
\[ \|x_1(t) - x_2(t)\|_X = 0 \quad \forall\, t \in T. \]
Therefore $x_1 = x_2$.

To establish the existence of a solution, first we consider the following approximate evolution equation:
\[ \begin{cases} x'(t) + A_\lambda\big(x(t)\big) = \omega x(t) + f(t) & \text{for a.a. } t \in T = [0,b], \\ x(0) = x_0, \end{cases} \tag{3.65} \]

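with $\lambda > 0$. Because $A_\lambda$ is a Lipschitz continuous operator (see Proposition 3.3.12(b)), problem (3.65) has a unique solution $x_\lambda \in C^1(T;X)$. Using Lemma 3.3.26, we see that for all $\lambda, \mu > 0$, we have
\[ \frac{1}{2}\frac{d}{dt}\|x_\lambda(t) - x_\mu(t)\|_X^2 + \big\langle \mathcal{F}\big(x_\lambda(t) - x_\mu(t)\big), A_\lambda\big(x_\lambda(t)\big) - A_\mu\big(x_\mu(t)\big) \big\rangle_X = \omega\,\|x_\lambda(t) - x_\mu(t)\|_X^2 \quad \forall\, t \in T, \]
so by Gronwall's inequality (see Theorem A.4.7), we have
\[ \|x_\lambda(t) - x_\mu(t)\|_X^2 \leq -2\int_0^b e^{2\omega(t-s)}\big\langle \mathcal{F}\big(x_\lambda(s) - x_\mu(s)\big), A_\lambda\big(x_\lambda(s)\big) - A_\mu\big(x_\mu(s)\big) \big\rangle_X\,ds \quad \forall\, t \in T. \]
Exploiting the accretivity of $A$ and the fact that
\[ A_\alpha\big(x_\alpha(t)\big) \in A\big(J_\alpha\big(x_\alpha(t)\big)\big) \quad \forall\, t \in T \]
(see Proposition 3.3.12(c)), we can write that
\[ \|x_\lambda(t) - x_\mu(t)\|_X^2 \leq -2\int_0^b e^{2\omega(t-s)}\big\langle \mathcal{F}\big(x_\lambda(s) - x_\mu(s)\big) - \mathcal{F}\big(J_\lambda\big(x_\lambda(s)\big) - J_\mu\big(x_\mu(s)\big)\big), A_\lambda\big(x_\lambda(s)\big) - A_\mu\big(x_\mu(s)\big) \big\rangle_X\,ds \quad \forall\, t \in T, \]
so
\[ \|x_\lambda(t) - x_\mu(t)\|_X^2 \leq 2\int_0^b e^{2\omega(t-s)}\big\|\mathcal{F}\big(x_\lambda(s) - x_\mu(s)\big) - \mathcal{F}\big(J_\lambda\big(x_\lambda(s)\big) - J_\mu\big(x_\mu(s)\big)\big)\big\|_{X^*}\,\big\|A_\lambda\big(x_\lambda(s)\big) - A_\mu\big(x_\mu(s)\big)\big\|_X\,ds \quad \forall\, t \in T. \tag{3.66} \]
We claim that
\[ \|x_\lambda'(t)\|_X \leq |A(x_0)| + \omega\,\|x_0\|_X + \|f(0)\|_X + e^{2\omega b}\,\|f'\|_1. \tag{3.67} \]
Indeed, from (3.65), for every $r \in (0,b)$, we have
\[ \big(x_\lambda(t+r) - x_\lambda(t)\big)' + A_\lambda\big(x_\lambda(t+r)\big) - A_\lambda\big(x_\lambda(t)\big) = \omega\big(x_\lambda(t+r) - x_\lambda(t)\big) + f(t+r) - f(t) \quad \text{for a.a. } t \in T_r = [0, b-r]. \]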

Exploiting the accretivity of $A_\lambda$ and Lemma 3.3.26, we obtain
\[ \frac{1}{2}\frac{d}{dt}\|x_\lambda(t+r) - x_\lambda(t)\|_X^2 \leq \omega\,\|x_\lambda(t+r) - x_\lambda(t)\|_X^2 + \big\langle \mathcal{F}\big(x_\lambda(t+r) - x_\lambda(t)\big), f(t+r) - f(t) \big\rangle_X \quad \text{for a.a. } t \in T_r, \]
so
\[ \|x_\lambda(t+r) - x_\lambda(t)\|_X^2 \leq \|x_\lambda(r) - x_\lambda(0)\|_X^2 + 2\int_0^t e^{2\omega(t-s)}\,\|x_\lambda(s+r) - x_\lambda(s)\|_X\,\|f(s+r) - f(s)\|_X\,ds. \]
Thus by Lemma 3.3.27, we obtain
\[ \|x_\lambda(t+r) - x_\lambda(t)\|_X \leq \|x_\lambda(r) - x_\lambda(0)\|_X + \int_0^t e^{2\omega(t-s)}\,\|f(s+r) - f(s)\|_X\,ds. \]
Dividing by $r > 0$ and letting $r \searrow 0$, we obtain
\[ \|x_\lambda'(t)\|_X \leq \|\omega x_0 + f(0) - A_\lambda(x_0)\|_X + \int_0^b e^{2\omega(t-s)}\,\|f'(s)\|_X\,ds \leq |A(x_0)| + \omega\,\|x_0\|_X + \|f(0)\|_X + e^{2\omega b}\,\|f'\|_1. \]
This proves (3.67), from which we infer that there exists $M_1 > 0$, such that
\[ \|x_\lambda'(t)\|_X \leq M_1 \quad \forall\, \lambda > 0,\ t \in T. \tag{3.68} \]
Since
\[ x_\lambda(t) = x_0 + \int_0^t x_\lambda'(s)\,ds \quad \forall\, \lambda > 0,\ t \in T, \]
it follows that there exists $M_2 > 0$, such that
\[ \|x_\lambda(t)\|_X \leq M_2 \quad \forall\, \lambda > 0,\ t \in T. \tag{3.69} \]
Returning to (3.65) and using (3.68) and (3.69), we obtain $M_3 > 0$, such that
\[ \big\|A_\lambda\big(x_\lambda(t)\big)\big\|_X \leq M_3 \quad \forall\, \lambda > 0,\ t \in T, \tag{3.70} \]
so
\[ \big\|x_\lambda(t) - J_\lambda\big(x_\lambda(t)\big)\big\|_X = \lambda\,\big\|A_\lambda\big(x_\lambda(t)\big)\big\|_X \leq \lambda M_3 \quad \forall\, \lambda > 0,\ t \in T. \tag{3.71} \]

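From (3.71), it follows that
\[ \big\|x_\lambda(t) - J_\lambda\big(x_\lambda(t)\big)\big\|_X \longrightarrow 0 \quad \text{as } \lambda \searrow 0, \text{ uniformly on } T. \]
Because $\mathcal{F}$ is uniformly continuous on bounded sets, from (3.66) and (3.70), we see that for a given $\varepsilon > 0$, for all $\lambda, \mu > 0$ small enough, we have
\[ \|x_\lambda(t) - x_\mu(t)\|_X^2 \leq M_4\,\varepsilon, \]
for some $M_4 > 0$, so $x_\lambda \longrightarrow x$ in $C(T;X)$ as $\lambda \searrow 0$. Moreover, from (3.67) we infer that $x$ is a Lipschitz continuous function, i.e., $x \in W^{1,\infty}\big((0,b);X\big)$. We claim that this is the solution of the evolution equation. To this end let $(y,z) \in \operatorname{Gr} A$ and let us set
\[ y_\lambda \overset{\mathrm{df}}{=} y + \lambda z \quad \forall\, \lambda > 0. \]
Hence $z = A_\lambda(y_\lambda)$. Using (3.65) and the accretivity of $A_\lambda$, we obtain
\[ \|x_\lambda(t) - y_\lambda\|_X^2 \leq \|x_\lambda(t_0) - y_\lambda\|_X^2 + 2\int_{t_0}^t \big\langle \mathcal{F}\big(x_\lambda(s) - y_\lambda\big), \omega x_\lambda(s) + f(s) - z \big\rangle_X\,ds \quad \forall\, t, t_0 \in T. \]
Letting $\lambda \searrow 0$, we get
\[ \|x(t) - y\|_X^2 - \|x(t_0) - y\|_X^2 \leq 2\int_{t_0}^t \big\langle \mathcal{F}\big(x(s) - y\big), \omega x(s) + f(s) - z \big\rangle_X\,ds. \tag{3.72} \]
Note that for any $z, h \in X$, we have
\[ \big\langle \mathcal{F}(h), z - h \big\rangle_X \leq \|h\|_X\,\|z\|_X - \|h\|_X^2 \leq \frac{1}{2}\big(\|z\|_X^2 - \|h\|_X^2\big). \]
Using this in (3.72), we have
\[ \Big\langle \mathcal{F}\big(x(t_0) - y\big), \frac{x(t) - x(t_0)}{t - t_0} \Big\rangle_X \leq \frac{1}{t - t_0}\int_{t_0}^t \big\langle \mathcal{F}\big(x(s) - y\big), \omega x(s) + f(s) - z \big\rangle_X\,ds. \tag{3.73} \]
Let $t_0 \in T$ be a point of differentiability of $x$. Passing to the limit as $t \to t_0$ in (3.73), we obtain
\[ \big\langle \mathcal{F}\big(x(t_0) - y\big), x'(t_0) \big\rangle_X = \big\langle \mathcal{F}\big(x(t_0) - y\big), \omega x(t_0) + f(t_0) - z \big\rangle_X, \]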

so
\[ \big\langle \mathcal{F}\big(x(t_0) - y\big), u_0 - z \big\rangle_X \geq 0, \tag{3.74} \]
with
\[ u_0 \overset{\mathrm{df}}{=} -x'(t_0) + \omega x(t_0) + f(t_0) \]
(see (3.65)). Since $A$ is m-accretive, hence maximal accretive, from (3.74), we conclude that
\[ -x'(t_0) + \omega x(t_0) + f(t_0) \in A\big(x(t_0)\big). \]
Because $x$ is almost everywhere differentiable (recall that $x \in W^{1,\infty}\big((0,b);X\big)$), we conclude that $x$ solves the evolution Cauchy problem.
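The three properties of the Yosida approximation used in the proof above (Lipschitz continuity of $A_\lambda$, the inclusion $A_\lambda(x) \in A\big(J_\lambda(x)\big)$, and $\|x - J_\lambda(x)\| = \lambda\|A_\lambda(x)\|$) can be checked by hand in a scalar case. The sketch below assumes, purely for illustration, $X = \mathbb{R}$ and the maximal monotone graph $A = \operatorname{sign}$ (with $A(0) = [-1,1]$); this choice and all function names are not from the text. In this case $J_\lambda$ is the soft-thresholding map and $A_\lambda(x)$ is $x/\lambda$ clipped to $[-1,1]$.

```python
def resolvent(x, lam):
    """J_lam(x) = (I + lam*A)^{-1} x for the sign graph A: soft-thresholding."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def yosida(x, lam):
    """Yosida approximation A_lam(x) = (x - J_lam(x)) / lam."""
    return (x - resolvent(x, lam)) / lam


def in_graph_of_sign(y, v, tol=1e-9):
    """Check v in A(y), where A(y) = {1} for y > 0, {-1} for y < 0, [-1, 1] for y = 0."""
    if y > 0:
        return abs(v - 1.0) <= tol
    if y < 0:
        return abs(v + 1.0) <= tol
    return -1.0 - tol <= v <= 1.0 + tol


if __name__ == "__main__":
    lam = 0.5
    for x in (-2.0, -0.3, 0.0, 0.2, 1.5):
        j, a = resolvent(x, lam), yosida(x, lam)
        print(x, j, a, in_graph_of_sign(j, a))
```

On this example $|A_\lambda(x)| \leq 1$ for every $x$, so $|x - J_\lambda(x)| \leq \lambda$, which is the scalar counterpart of estimate (3.71).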

Next we want to consider evolution equations with less regular data. This can be done with remarkable success using the theory of semigroups of operators. In what follows we present some basic aspects of this theory. We start with the linear theory and then pass to the nonlinear one.

Let us motivate the definition of a semigroup of bounded linear operators (linear semigroup). So let $X$ be a Banach space and $A \in \mathcal{L}(X)$. We consider the following Cauchy problem
\[ \begin{cases} x'(t) = Ax(t) & \forall\, t \geq 0, \\ x(0) = x_0 \in X. \end{cases} \tag{3.75} \]
It is easy to check that the function $x(t) \overset{\mathrm{df}}{=} e^{tA}x_0$ for $t \geq 0$, $x \in C^1(\mathbb{R}_+;X)$, is the unique solution of (3.75). Let us mention the basic properties of this solution. First, for every fixed $t \geq 0$, the map $x_0 \longmapsto x(t)$ is linear. Moreover, since
\[ \|x(t)\|_X \leq e^{t\|A\|_{\mathcal{L}}}\,\|x_0\|_X, \]
it is also bounded. Second, as $t \searrow 0$, we have that
\[ x(t) \longrightarrow x_0 \quad \text{in } X \]
and $x(0) = x_0$. Third, by virtue of the uniqueness of the solution of (3.75), if we start with initial condition $x(t_0)$, $t_0 \geq 0$, and move for time $t \geq 0$, we must reach the state $x(t + t_0)$ (recall that $e^{(t+t_0)A} = e^{tA}e^{t_0A}$). Generalizing these properties we obtain the notion of a $C_0$-semigroup of linear operators.

DEFINITION 3.3.29
Let $X$ be a Banach space and $\{S(t)\}_{t \geq 0} \subseteq \mathcal{L}(X)$. We call $S$ a $C_0$-semigroup on $X$ if the following conditions hold:
(a) $S(0) = \mathrm{id}_X$;
(b) $S(t+s) = S(t)S(s)$ for all $s, t \geq 0$;
(c) $\lim_{t \to 0}\|S(t)x - x\|_X = 0$ for all $x \in X$.

REMARK 3.3.30
Property (b) is the semigroup property, while property (c) implies that the function $t \longmapsto S(t)$ is continuous from $\mathbb{R}_+$ into $\mathcal{L}(X)$ furnished with the strong operator topology. If $A \in \mathcal{L}(X)$, then
\[ \big\{S(t) = e^{tA}\big\}_{t \geq 0} \]
is a $C_0$-semigroup. Also if $X = C_b(\mathbb{R})$ (the space of bounded continuous functions $f \colon \mathbb{R} \longrightarrow \mathbb{R}$ equipped with the supremum norm) and
\[ S(t)f(\cdot) = f(t + \cdot) \quad \forall\, f \in C_b(\mathbb{R}), \]
then $\{S(t)\}_{t \geq 0}$ is a $C_0$-semigroup.
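For the first example in the remark, conditions (a), (b) and (c) of Definition 3.3.29 can be observed numerically in finite dimensions. The sketch below assumes $X = \mathbb{R}^2$ and evaluates $e^{tA}$ by a truncated power series written in plain Python; the helper names, the number of series terms and the sample matrix are choices made only for this illustration.

```python
import math


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(A, t, terms=60):
    """Truncated power series sum_k (tA)^k / k!; adequate for moderate ||tA||."""
    n = len(A)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        # term becomes (tA)^k / k! by multiplying the previous term with tA / k.
        term = mat_mul(term, [[t * a / k for a in row] for row in A])
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))


if __name__ == "__main__":
    A = [[0.0, 1.0], [-2.0, -3.0]]  # sample generator (assumption)
    print(max_abs_diff(expm(A, 0.0), identity(2)))                          # (a)
    print(max_abs_diff(expm(A, 1.2), mat_mul(expm(A, 0.7), expm(A, 0.5))))  # (b)
    print(max_abs_diff(expm(A, 1e-6), identity(2)))                         # (c)
```

For the rotation generator with rows $(0,-1)$ and $(1,0)$ the same routine reproduces the rotation matrix with entries $\cos t$ and $\pm\sin t$, which gives an independent check of the series.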

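PROPOSITION 3.3.31
If $X$ is a Banach space and $\{S(t)\}_{t \geq 0}$ is a $C_0$-semigroup on $X$, then there exist $M \geq 1$ and $\omega \geq 0$, such that
\[ \|S(t)\|_{\mathcal{L}} \leq M e^{\omega t} \quad \forall\, t \geq 0. \]

PROOF
By virtue of property (c) in Definition 3.3.29, we can find $M \geq 1$ and $\delta > 0$, such that
\[ \|S(t)\|_{\mathcal{L}} \leq M \quad \forall\, t \in [0,\delta]. \]
Let
\[ \omega \overset{\mathrm{df}}{=} \frac{\ln M}{\delta} \geq 0. \]
For a given $t \geq 0$, we can find an integer $n \geq 0$ and $\vartheta \in [0,\delta)$, such that
\[ t = n\delta + \vartheta. \]
Because of the semigroup property, we have $S(t) = S(\delta)^n S(\vartheta)$, so
\[ \|S(t)\|_{\mathcal{L}} \leq \|S(\delta)\|_{\mathcal{L}}^n\,\|S(\vartheta)\|_{\mathcal{L}} \leq M^n M \leq M e^{\omega t}, \]
since
\[ \ln M^n = n\ln M = n\omega\delta \leq \omega t. \]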

Using this bound, we can improve (c) in Definition 3.3.29.

COROLLARY 3.3.32
If $X$ is a Banach space and $\{S(t)\}_{t \geq 0}$ is a $C_0$-semigroup on $X$, then for all $x \in X$, the map $t \longmapsto S(t)x$ is continuous from $\mathbb{R}_+$ into $X$.

PROOF
Let $r > 0$. Then using the semigroup property and Proposition 3.3.31, we obtain
\[ \|S(t+r)x - S(t)x\|_X \leq \|S(t)\|_{\mathcal{L}}\,\|S(r)x - x\|_X \leq M e^{\omega t}\,\|S(r)x - x\|_X \longrightarrow 0 \quad \text{as } r \searrow 0. \]
So the function $t \longmapsto S(t)x$ is continuous on $\mathbb{R}_+$.

DEFINITION 3.3.33
Let $X$ be a Banach space and let $\{S(t)\}_{t \geq 0}$ be a $C_0$-semigroup on $X$. From Proposition 3.3.31, we know that
\[ \|S(t)\|_{\mathcal{L}} \leq M e^{\omega t} \quad \forall\, t \geq 0 \]
for some $M \geq 1$ and $\omega \geq 0$. If $M = 1$ and $\omega = 0$, i.e.,
\[ \|S(t)\|_{\mathcal{L}} \leq 1 \quad \forall\, t \geq 0, \]
then we say that we have a contraction semigroup.

The following notion is central in the theory of linear semigroups and is the starting point for determining those operators which generate contraction semigroups (see the Hille-Yosida Theorem 3.3.46).

DEFINITION 3.3.34
Let $X$ be a Banach space and let $\{S(t)\}_{t \geq 0}$ be a $C_0$-semigroup on $X$. We introduce the generator (or infinitesimal generator) of the semigroup $S$ as the linear operator $A \colon X \supseteq D(A) \longrightarrow X$, defined by
\[ Ax \overset{\mathrm{df}}{=} \lim_{t \searrow 0}\frac{S(t)x - x}{t} \quad \forall\, x \in D(A), \]
where
\[ D(A) \overset{\mathrm{df}}{=} \Big\{ x \in X : \lim_{t \searrow 0}\frac{S(t)x - x}{t} \text{ exists} \Big\}. \]

In general the operator A is not bounded.

EXAMPLE 3.3.35
(a) Let $A \in \mathcal{L}(X)$ and $S(t) = e^{tA}$ for $t \geq 0$. This is a $C_0$-semigroup. Then for every $x \in X$, we have
\[ \frac{e^{tA}x - x}{t} = \sum_{k=1}^{\infty}\frac{t^{k-1}}{k!}A^k x = Ax + \sum_{k=2}^{\infty}\frac{t^{k-1}}{k!}A^k x \]
and
\[ \Big\|\sum_{k=2}^{\infty}\frac{t^{k-1}}{k!}A^k x\Big\|_X \leq \sum_{k=2}^{\infty}\frac{t^{k-1}}{k!}\|A\|_{\mathcal{L}}^k\,\|x\|_X \leq |t|\,\|A\|_{\mathcal{L}}^2\,\|x\|_X\sum_{k=0}^{\infty}\frac{t^k}{k!}\|A\|_{\mathcal{L}}^k = t\,\|A\|_{\mathcal{L}}^2\,\|x\|_X\,e^{t\|A\|_{\mathcal{L}}} \longrightarrow 0 \quad \text{as } t \searrow 0. \]
Therefore the generator of $S$ is $A$.

(b) Let $X = C_b(\mathbb{R})$ and $S(t)f(\cdot) \overset{\mathrm{df}}{=} f(t + \cdot)$ (see Remark 3.3.30). We have
\[ \lim_{t \searrow 0}\frac{(S(t)f - f)(s)}{t} = D^+f(s) \quad \text{if it exists.} \]
So if $f \in D(A)$, then $D^+f$ exists at all $s \geq 0$ and it is bounded and uniformly continuous. Also we have
\[ \frac{f(s) - f(s-t)}{t} = \frac{f(s-t+t) - f(s-t)}{t} = D^+f(s-t) + \frac{o(t)}{t} \longrightarrow D^+f(s) \quad \text{as } t \searrow 0 \]
(since $D^+f$ is continuous). Therefore if $f \in D(A)$, then $D^+f = D^-f$, i.e., $f'(t)$ exists at all $t \in \mathbb{R}$ and $f' \in C_b(\mathbb{R})$. So
\[ D(A) = \big\{ f \in C_b(\mathbb{R}) : f' \text{ exists everywhere and } f' \in C_b(\mathbb{R}) \big\} \]
and $Af = f'$ for all $f \in D(A)$.

More generally, if $X = H = L^2(0,b)$ and
\[ S(t)x(s) \overset{\mathrm{df}}{=} \begin{cases} x(t+s) & \text{if } t+s \in (0,b), \\ 0 & \text{if } t+s \notin (0,b), \end{cases} \]
then the generator of $S$ is the operator $A \colon H \supseteq D(A) \longrightarrow H$, defined by
\[ Ax(t) = \frac{d}{dt}x(t), \]
with
\[ D(A) = \big\{ x \in W^{1,2}(0,b) : x(b) = 0 \big\}. \]
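The defining limit of the generator in part (b) can be watched numerically. The following sketch takes the translation semigroup $S(t)f(\cdot) = f(t + \cdot)$ and, as an assumed test function, $f = \sin$, for which $Af = f' = \cos$; it measures the sup-distance on a finite grid between the difference quotient $(S(t)f - f)/t$ and $f'$. The grid and the function names are illustrative choices.

```python
import math


def translate(f, t):
    """Left translation S(t)f = f(t + .)."""
    return lambda s: f(t + s)


def difference_quotient(f, t):
    """(S(t)f - f) / t, whose limit as t decreases to 0 defines Af."""
    shifted = translate(f, t)
    return lambda s: (shifted(s) - f(s)) / t


def sup_error(f, df, t, grid):
    """Maximum over the grid of |(S(t)f - f)(s)/t - df(s)|."""
    q = difference_quotient(f, t)
    return max(abs(q(s) - df(s)) for s in grid)


if __name__ == "__main__":
    grid = [k * 0.01 for k in range(-500, 501)]
    for t in (1e-1, 1e-2, 1e-3):
        print(t, sup_error(math.sin, math.cos, t, grid))
```

By Taylor's formula the error is at most $\frac{t}{2}\sup|f''| = \frac{t}{2}$ for this $f$, so the printed values should decrease roughly linearly in $t$.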

In the next proposition, we summarize the differential properties of $C_0$-semigroups.

PROPOSITION 3.3.36
If $X$ is a Banach space and $\{S(t)\}_{t \geq 0}$ is a $C_0$-semigroup on $X$ with generator $A$, then for all $x \in D(A)$ and all $t \geq 0$, we have
(a) $S(t)x \in D(A)$;
(b) $\frac{d}{dt}S(t)x = AS(t)x = S(t)Ax$ for $t \geq 0$;
(c) $S(t)x - x = \int_0^t S(r)Ax\,dr$;
(d) $D(A)$ is dense in $X$ and the operator $A$ is closed (i.e., $\operatorname{Gr} A \subseteq X \times X$ is closed).

PROOF
(a) For $r > 0$ we have
\[ \frac{S(t+r)x - S(t)x}{r} = S(t)\,\frac{S(r)x - x}{r} \longrightarrow S(t)Ax \quad \text{as } r \searrow 0. \]
Hence $S(t)x \in D(A)$.

(b) From (a), we have
\[ \frac{d^+}{dt}S(t)x = S(t)Ax. \]
Also note that
\[ \frac{S(t+r)x - S(t)x}{r} = S(t)\,\frac{S(r)x - x}{r} = \Big(\frac{S(r) - \mathrm{id}_X}{r}\Big)S(t)x \longrightarrow AS(t)x \quad \text{as } r \searrow 0, \]
so
\[ \frac{d^+}{dt}S(t)x = S(t)Ax = AS(t)x. \]
On the other hand, for $t \geq r > 0$, we have
\[ \frac{S(t)x - S(t-r)x}{r} = S(t-r)\,\frac{S(r)x - x}{r} \longrightarrow S(t)Ax \quad \text{as } r \searrow 0, \]
so
\[ \frac{d^-}{dt}S(t)x = S(t)Ax. \]
Also since
\[ S(t-r)\,\frac{S(r)x - x}{r} = \Big(\frac{S(r) - \mathrm{id}_X}{r}\Big)S(t-r)x, \]

we have
\[ \frac{d^-}{dt}S(t)x = S(t)Ax = AS(t)x. \]
So finally, we conclude that
\[ \frac{d}{dt}S(t)x = S(t)Ax = AS(t)x. \]

(c) By part (b), the function $t \longmapsto S(t)x$ is continuously differentiable. So for all $x^* \in X^*$, we have
\[ \big\langle x^*, S(t)x - x \big\rangle_X = \int_0^t \frac{d}{dr}\big\langle x^*, S(r)x \big\rangle_X\,dr = \int_0^t \Big\langle x^*, \frac{d}{dr}S(r)x \Big\rangle_X\,dr = \Big\langle x^*, \int_0^t S(r)Ax\,dr \Big\rangle_X, \]
hence
\[ S(t)x - x = \int_0^t S(r)Ax\,dr \quad \forall\, t \geq 0. \]

(d) For $t \geq r > 0$ and $x \in X$, we have
\[ \Big(\frac{S(r) - \mathrm{id}_X}{r}\Big)\int_0^t S(\tau)x\,d\tau = \frac{1}{r}\int_0^t \big(S(\tau + r)x - S(\tau)x\big)\,d\tau = \frac{1}{r}\Big[\int_r^{t+r} S(\tau)x\,d\tau - \int_0^t S(\tau)x\,d\tau\Big] = \frac{1}{r}\Big[\int_t^{t+r} S(\tau)x\,d\tau - \int_0^r S(\tau)x\,d\tau\Big] \longrightarrow S(t)x - x \quad \text{as } r \searrow 0, \]
so
\[ \int_0^t S(\tau)x\,d\tau \in D(A). \]
But note that
\[ \lim_{t \searrow 0}\frac{1}{t}\int_0^t S(\tau)x\,d\tau = x \]
and since $x \in X$ was arbitrary, we conclude that $D(A)$ is dense in $X$.

Next let $\{x_n\}_{n \geq 1} \subseteq D(A)$ and assume that
\[ x_n \longrightarrow x \quad \text{and} \quad A(x_n) \longrightarrow y \quad \text{in } X. \]
For every $r \geq 0$, we have
\[ \|S(r)Ax_n - S(r)y\|_X \leq M e^{\omega r}\,\|Ax_n - y\|_X \]
(see Proposition 3.3.31) and thus
\[ S(\cdot)Ax_n \longrightarrow S(\cdot)y \quad \text{in } X \text{ uniformly on } [0,t],\ t \geq 0. \]
From (c) we know that
\[ S(t)x_n - x_n = \int_0^t S(r)Ax_n\,dr. \]
Passing to the limit as $n \to +\infty$, we obtain
\[ S(t)x - x = \int_0^t S(r)y\,dr. \]
Hence
\[ \lim_{t \searrow 0}\frac{S(t)x - x}{t} = y, \]
which implies that $x \in D(A)$ and $y = Ax$, i.e., $A$ is closed.

REMARK 3.3.37
Using part (b) of Proposition 3.3.36 and induction, we can show that for all $n \geq 1$, all $x \in D(A^n)$ and all $t \geq 0$, we have
\[ \frac{d^n}{dt^n}S(t)x = S(t)A^n x = A^n S(t)x. \]
Moreover, it can be shown that the set $\bigcap_{n=1}^{\infty} D(A^n)$ is dense in $X$. For details we refer to Pazy (1983, p. 6). Also because of parts (b) and (d) of Proposition 3.3.36 and using Theorem 2.1.17, we can rewrite part (c) as follows:
(c)' for all $t \geq 0$ and all $x \in X$, we have
\[ S(t)x - x = A\int_0^t S(\tau)x\,d\tau. \]

COROLLARY 3.3.38
An operator $A \colon X \supseteq D(A) \longrightarrow X$ can be the generator of at most one $C_0$-semigroup.

PROOF
Suppose that $S_1$ and $S_2$ are two $C_0$-semigroups with generator $A$. Let $x \in D(A)$ and $t > 0$ and define the function
\[ u(s) \overset{\mathrm{df}}{=} S_1(t-s)S_2(s)x \quad \forall\, s \in (0,t). \]
From Proposition 3.3.36(b), we know that $u \in C^1\big((0,t);X\big)$ and
\[ u'(s) = -AS_1(t-s)S_2(s)x + S_1(t-s)AS_2(s)x = -S_1(t-s)AS_2(s)x + S_1(t-s)AS_2(s)x = 0, \]
so
\[ S_1(t)x - S_2(t)x = u(0) - u(t) = -\int_0^t u'(s)\,ds = 0 \quad \forall\, x \in D(A). \tag{3.76} \]

Because $D(A)$ is dense in $X$ (see Proposition 3.3.36(d)), from (3.76), it follows that
\[ S_1(t) = S_2(t) \quad \forall\, t \geq 0. \]

If $\{S(t)\}_{t \geq 0}$ is a $C_0$-semigroup on a Banach space $X$, then $\{S(t)^*\}_{t \geq 0}$ still has the semigroup property but need not be a $C_0$-semigroup. In fact in general we can only show that for every $x^* \in X^*$, $t \longmapsto S(t)^*x^*$ is weakly$^*$-continuous at $t = 0$, i.e.,
\[ \text{w}^*\text{-}\lim_{t \searrow 0} S(t)^*x^* = x^*. \]
So the map $S(t) \longmapsto S(t)^*$ does not preserve the strong continuity at $t = 0$.

EXAMPLE 3.3.39
Let $X = C_0(\mathbb{R})$ (see Section 2.3) and let $\{S(t)\}_{t \geq 0}$ be the $C_0$-semigroup of left translations, i.e.,
\[ \big(S(t)f\big)(s) \overset{\mathrm{df}}{=} f(t+s) \quad \forall\, t, s \geq 0,\ f \in C_0(\mathbb{R}). \]
We know that $X^* = NBV(\mathbb{R})$, the space of all normalized functions of bounded variation with the total variation norm
\[ \|\vartheta\|_{TV(\mathbb{R})} = (\operatorname{Var}\vartheta)(\mathbb{R}) = \int_{\mathbb{R}} |d\vartheta(t)|. \]
By saying that $\vartheta$ is normalized, we mean that
\[ \vartheta(s) = \frac{\vartheta(s^+) + \vartheta(s^-)}{2} \quad \forall\, s \in \mathbb{R} \]
and
\[ \vartheta(-\infty) = \lim_{s \to -\infty}\vartheta(s) = 0 \quad \text{and} \quad \vartheta(+\infty) = \lim_{s \to +\infty}\vartheta(s) = 0 \]
(see also Theorem 2.3.41). For all $f \in C_0(\mathbb{R})$ and all $\vartheta \in NBV(\mathbb{R})$, we have
\[ \big\langle \vartheta, S(t)f \big\rangle = \int_{\mathbb{R}} f(t+s)\,d\vartheta(s) = \int_{\mathbb{R}} f(s)\,d_s\vartheta(s-t) = \big\langle S(t)^*\vartheta, f \big\rangle, \]
so
\[ \big(S(t)^*\vartheta\big)(s) = \vartheta(s-t), \]
i.e., $\{S(t)^*\}_{t \geq 0}$ is the right translation of $\vartheta$. By a theorem of Plessner (1929), we know that
\[ \Big[\|\vartheta(\cdot - t) - \vartheta(\cdot)\|_{TV(\mathbb{R})} \longrightarrow 0 \text{ as } t \searrow 0\Big] \iff \Big[\vartheta \in AC(\mathbb{R})\Big]. \]
So the function $t \longmapsto S(t)^*\vartheta$ is not in general strongly continuous at $t = 0$ (unless of course $\vartheta \in AC(\mathbb{R})$).

Note that in the previous example $X = C_0(\mathbb{R})$ is not a reflexive Banach space. This is not an accident.

PROPOSITION 3.3.40
If $H$ is a Hilbert space, $H^* = H$ (i.e., $H$ is a pivot space) and $\{S(t)\}_{t \geq 0}$ is a $C_0$-semigroup on $H$, then $\{S(t)^*\}_{t \geq 0}$ is a $C_0$-semigroup on $H$.

PROOF
Clearly $S(\cdot)^*$ satisfies the semigroup property. Therefore we need to show that for all $h \in H$, the map $t \longmapsto S(t)^*h$ is strongly continuous at $t = 0$. For every $x, h \in H$, the function
\[ s \longmapsto \big(S(s)^*h, x\big)_H = \big(h, S(s)x\big)_H \]
is continuous on $\mathbb{R}_+$. Therefore for all $h \in H$, the function $s \longmapsto S(s)^*h$ is weakly continuous. So it follows from the uniform boundedness principle (see Theorem A.3.4) and the semigroup property that the function $s \longmapsto S(s)^*h$ is bounded on any compact interval of $\mathbb{R}_+$. Also it is weakly measurable. Moreover, if $\{s_n\}_{n \geq 1}$ is an enumeration of the rationals in $\mathbb{R}_+$ and we consider
\[ L \overset{\mathrm{df}}{=} \operatorname{span}_{\mathbb{Q}}\big\{S(s_n)^*h\big\}_{n \geq 1} \]

(i.e., finite linear combinations of $\{S(s_n)^*h\}_{n \geq 1}$ with rational coefficients), then $L$ is countable and $H_0 \overset{\mathrm{df}}{=} \overline{\operatorname{span}}\,L$ is separable. Since the function $s \longmapsto S(s)^*h$ is weakly continuous, we see that $\{S(s)^*h\}_{s \geq 0} \subseteq H_0$. Therefore we have proved that the function $s \longmapsto S(s)^*h$ is weakly measurable and separably valued; hence by Theorem 2.1.3, it is strongly measurable. We infer that $S(\cdot)^*h \in L^1_{\mathrm{loc}}(\mathbb{R}_+;H)$. If $t > 0$ and $\eta \in (0,t)$, we have
\[ \|S(t+r)^*h - S(t)^*h\|_H = \Big\|\frac{1}{\eta}\int_0^\eta \big(S(t+r)^*h - S(t)^*h\big)\,d\tau\Big\|_H = \Big\|\frac{1}{\eta}\int_0^\eta S(\tau)^*\big(S(t+r-\tau)^*h - S(t-\tau)^*h\big)\,d\tau\Big\|_H \leq M_1\,\frac{1}{\eta}\int_0^\eta \|S(t+r-\tau)^*h - S(t-\tau)^*h\|_H\,d\tau, \tag{3.77} \]
for some $M_1 > 0$, so by Corollary 2.3.8, we have
\[ \lim_{r \searrow 0}\|S(t+r)^*h - S(t)^*h\|_H \leq \lim_{r \searrow 0}\frac{M_1}{\eta}\int_0^\eta \|S(t+r-\tau)^*h - S(t-\tau)^*h\|_H\,d\tau = 0. \]
Finally let $t_n \searrow 0$ and
\[ C \overset{\mathrm{df}}{=} \operatorname{conv}\big\{S(t_n)^*h\big\}_{n \geq 1}. \]
Because
\[ S(t_n)^*h \overset{w}{\longrightarrow} h \quad \text{in } H, \]
we have that $h \in \overline{C}$. So if
\[ E \overset{\mathrm{df}}{=} \operatorname{span}\big\{S(t_n)^*h\big\}_{n \geq 1}, \]
then from (3.77), it follows that
\[ \lim_{n \to +\infty} S(t_n)^*h = h \quad \forall\, h \in E. \]
Because
\[ \sup_{n \geq 1}\|S(t_n)^*\|_{\mathcal{L}} \leq M_2, \]
for some $M_2 > 0$, we conclude that
\[ \lim_{n \to +\infty} S(t_n)^*h = h \quad \forall\, h \in H. \tag{3.78} \]

REMARK 3.3.41
In fact the result is true if $H$ is replaced by a reflexive Banach space. This is a consequence of the fact that if $\{S(t)\}_{t \geq 0}$ is a semigroup of linear operators, such that for all $x \in X$, the function $t \longmapsto S(t)x$ is strongly measurable (this is the case if for all $x \in X$, the function $t \longmapsto S(t)x$ is weakly continuous), then $S$ is a $C_0$-semigroup. For details we refer to Hille & Phillips (1957, pp. 305–306).

Of great importance in applications are theorems which give necessary and sufficient conditions for an operator $A$ to be the infinitesimal generator of a $C_0$-semigroup. The basic result in this direction is the celebrated Hille-Yosida theorem. To state and prove this fundamental result we need some preparation.

DEFINITION 3.3.42
Let $X$ be a Banach space and let $A \colon X \supseteq D(A) \longrightarrow X$ be a closed, linear operator.
(a) The resolvent set $\varrho(A)$ of $A$ is defined by
\[ \varrho(A) \overset{\mathrm{df}}{=} \big\{ \lambda \in \mathbb{R} : \lambda\,\mathrm{id}_X - A \colon X \supseteq D(A) \longrightarrow X \text{ is bijective} \big\}. \]
(b) If $\lambda \in \varrho(A)$, then the resolvent operator $R_\lambda \colon X \longrightarrow X$ is defined by
\[ R_\lambda x \overset{\mathrm{df}}{=} (\lambda\,\mathrm{id}_X - A)^{-1}x \quad \forall\, x \in X. \]

REMARK 3.3.43
It is easy to check that $R_\lambda$ is closed. So by the closed graph theorem (see Theorem A.3.7), we have that $R_\lambda \in \mathcal{L}(X)$. Moreover, we have $AR_\lambda x = R_\lambda Ax$ for all $x \in D(A)$ (see the proof of Proposition 3.3.44(b)).

PROPOSITION 3.3.44
If $X$ is a Banach space and $A \colon X \supseteq D(A) \longrightarrow X$ is a closed, linear operator, then
(a) for $\lambda, \mu \in \varrho(A)$, we have $R_\lambda - R_\mu = (\mu - \lambda)R_\lambda R_\mu$ (resolvent identity) and $R_\lambda R_\mu = R_\mu R_\lambda$;
(b) if $A$ is the generator of a $C_0$-semigroup $S$ and
\[ \|S(t)\|_{\mathcal{L}} \leq M e^{\omega t} \quad \forall\, t \geq 0, \]
then $\lambda \in \varrho(A)$ for all $\lambda > \omega$ and
\[ R_\lambda x = \int_0^{+\infty} e^{-\lambda t}S(t)x\,dt \quad \forall\, x \in X. \]

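PROOF
(a) Let $x \in D(A)$. We have
\[ (\lambda\,\mathrm{id}_X - A)\big[R_\lambda - R_\mu\big](\mu\,\mathrm{id}_X - A)(x) = (\mu\,\mathrm{id}_X - A)x - (\lambda\,\mathrm{id}_X - A)x = (\mu - \lambda)x, \]
so
\[ R_\lambda - R_\mu = (\mu - \lambda)R_\lambda R_\mu. \tag{3.79} \]
The commutation of $R_\lambda$ and $R_\mu$ follows by interchanging $\lambda$ and $\mu$ in (3.79).

(b) Let us set
\[ \hat{R}_\lambda x \overset{\mathrm{df}}{=} \int_0^{+\infty} e^{-\lambda t}S(t)x\,dt \quad \forall\, \lambda > \omega,\ x \in X. \]
This operator is well defined, since
\[ \|e^{-\lambda t}S(t)x\|_X \leq M e^{(\omega - \lambda)t}\,\|x\|_X \]
and the function $t \longmapsto e^{-\lambda t}S(t)x$ is continuous, thus strongly measurable. We have
\[ \|\hat{R}_\lambda\|_{\mathcal{L}} \leq M\int_0^{+\infty} e^{-(\lambda - \omega)t}\,dt \leq \frac{M}{\lambda - \omega}, \]
i.e., $\hat{R}_\lambda \in \mathcal{L}(X)$. We show that $\hat{R}_\lambda x \in D(A)$ and
\[ (\lambda\,\mathrm{id}_X - A)\hat{R}_\lambda x = x \quad \forall\, x \in X. \]
We have
\[ \frac{S(s) - \mathrm{id}_X}{s}\,\hat{R}_\lambda x = \frac{1}{s}\int_0^{+\infty} e^{-\lambda t}\big(S(t+s) - S(t)\big)x\,dt = \frac{e^{\lambda s} - 1}{s}\int_s^{+\infty} e^{-\lambda t}S(t)x\,dt - \frac{1}{s}\int_0^s e^{-\lambda t}S(t)x\,dt. \]
Passing to the limit as $s \searrow 0$, we obtain
\[ A\hat{R}_\lambda x = \lambda\hat{R}_\lambda x - x \quad \forall\, x \in X. \tag{3.80} \]
Using Proposition 3.3.36(b) and Theorem 2.1.17, for all $x \in D(A)$, we have
\[ \hat{R}_\lambda Ax = \int_0^{+\infty} e^{-\lambda t}S(t)Ax\,dt = \int_0^{+\infty} e^{-\lambda t}AS(t)x\,dt = A\int_0^{+\infty} e^{-\lambda t}S(t)x\,dt = A\hat{R}_\lambda x. \tag{3.81} \]
From (3.80) and (3.81), it follows that
\[ \hat{R}_\lambda(\lambda\,\mathrm{id}_X - A)x = x \quad \forall\, x \in D(A) \qquad \text{and} \qquad (\lambda\,\mathrm{id}_X - A)\hat{R}_\lambda x = x \quad \forall\, x \in X, \]
so
\[ \hat{R}_\lambda = R_\lambda \quad \text{and} \quad \lambda \in \varrho(A). \]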
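The Laplace-transform formula in part (b) and the resolvent identity in part (a) can be checked in the simplest possible setting. The sketch below assumes $X = \mathbb{R}$ and $A = a$ (a real number), so that $S(t) = e^{ta}$, $M = 1$, $\omega = a$ and $R_\lambda = 1/(\lambda - a)$ for $\lambda > a$; the improper integral is truncated at an assumed cutoff `t_max` and evaluated with the composite Simpson rule. All names and numerical parameters are illustrative.

```python
import math


def simpson(g, lo, hi, n=20000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3.0


def laplace_resolvent(a, lam, t_max=60.0):
    """Approximate int_0^infty e^{-lam t} e^{t a} dt by truncating at t_max (needs lam > a)."""
    return simpson(lambda t: math.exp((a - lam) * t), 0.0, t_max)


def exact_resolvent(a, lam):
    """R_lam = (lam - a)^{-1} for the scalar generator a."""
    return 1.0 / (lam - a)


if __name__ == "__main__":
    a = -1.0
    for lam in (2.0, 3.0):
        print(lam, laplace_resolvent(a, lam), exact_resolvent(a, lam))
    # Resolvent identity R_lam - R_mu = (mu - lam) R_lam R_mu with lam = 2, mu = 3.
    r2, r3 = exact_resolvent(a, 2.0), exact_resolvent(a, 3.0)
    print(r2 - r3, (3.0 - 2.0) * r2 * r3)
```

In this scalar case the bound $\|R_\lambda\|_{\mathcal{L}} \leq M/(\lambda - \omega)$ holds with equality.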

REMARK 3.3.45
Because of Proposition 3.3.44(b), we see that the resolvent operator is the Laplace transform of the $C_0$-semigroup generated by $A$. Hence the function $\lambda \longmapsto R_\lambda$ is analytic on $\varrho(A)$.

Now we are ready for the theorem characterizing the generators of $C_0$-semigroups.

THEOREM 3.3.46 (Hille-Yosida Theorem)
If $X$ is a Banach space and $A \colon X \supseteq D(A) \longrightarrow X$ is a closed, densely defined, linear operator, then $A$ is the generator of a $C_0$-semigroup if and only if there exist $M \geq 1$ and $\omega \in \mathbb{R}$, such that
\[ \|R_\lambda^k\|_{\mathcal{L}} \leq \frac{M}{(\lambda - \omega)^k} \quad \forall\, \lambda \in \varrho(A),\ \lambda > \omega,\ k \geq 1. \]
In this case we have
\[ \|S(t)\|_{\mathcal{L}} \leq M e^{\omega t} \quad \forall\, t \geq 0. \]

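PROOF
"$\Longrightarrow$": From Proposition 3.3.44, we know that if $\lambda > \omega$, then $\lambda \in \varrho(A)$ and we have
\[ R_\lambda x = \int_0^{+\infty} e^{-\lambda t}S(t)x\,dt \quad \forall\, x \in X, \]
so
\[ \frac{d^{k-1}}{d\lambda^{k-1}}R_\lambda x = R_\lambda^{(k-1)}x = \int_0^{+\infty} (-t)^{k-1}e^{-\lambda t}S(t)x\,dt \quad \forall\, k \geq 1 \]
and thus
\[ \|R_\lambda^{(k-1)}\|_{\mathcal{L}} \leq M\int_0^{+\infty} t^{k-1}e^{-(\lambda - \omega)t}\,dt = M(k-1)!\,(\lambda - \omega)^{-k}. \tag{3.82} \]
But the function $\lambda \longmapsto R_\lambda$ is analytic on $\varrho(A)$ (see Remark 3.3.45). Hence
\[ R_\lambda^{(k-1)} = (-1)^{k-1}(k-1)!\,R_\lambda^k. \tag{3.83} \]
From (3.82) and (3.83), it follows that
\[ \|R_\lambda^k\|_{\mathcal{L}} \leq \frac{M}{(\lambda - \omega)^k}. \]

"$\Longleftarrow$": For $\lambda > \omega$, let us set
\[ A_\lambda \overset{\mathrm{df}}{=} \lambda^2 R_\lambda - \lambda\,\mathrm{id}_X. \]
We have that $R_\lambda \in \mathcal{L}(X)$ and so we obtain the $C_0$-semigroup
\[ S_\lambda(t) = e^{A_\lambda t} = e^{-\lambda t}\sum_{n=0}^{\infty}\frac{(\lambda^2 t)^n}{n!}(\lambda\,\mathrm{id}_X - A)^{-n} \]
(see Remark 3.3.30). We shall show that as $\lambda \to +\infty$, $S_\lambda(t)$ converges in the strong operator topology to $S(t)$, $t \geq 0$, which is the desired semigroup. Note that
\[ x = R_\lambda(\lambda\,\mathrm{id}_X - A)x = \lambda R_\lambda x - R_\lambda Ax = \lambda R_\lambda x - AR_\lambda x \quad \forall\, x \in D(A) \]
(see Remark 3.3.43), so
\[ A_\lambda x = \lambda R_\lambda Ax = \lambda AR_\lambda x. \tag{3.84} \]
Also, we have
\[ \|\lambda R_\lambda x - x\|_X = \|R_\lambda Ax\|_X \leq \frac{M}{\lambda - \omega}\,\|Ax\|_X \quad \forall\, x \in D(A), \]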

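so
\[ \lambda R_\lambda x \longrightarrow x \quad \text{in } X \text{ as } \lambda \to +\infty, \quad \forall\, x \in D(A). \]
But $D(A)$ is dense in $X$. So for a given $x \in X$, we can find a sequence $\{x_m\}_{m \geq 1} \subseteq D(A)$, such that $x_m \longrightarrow x$ in $X$. Then for $\lambda_n \longrightarrow +\infty$ as $n \to +\infty$, we have
\[ \lambda_n R_{\lambda_n}x_m \longrightarrow x_m \quad \text{in } X \text{ as } n \to +\infty. \]
By the double limit lemma (see Proposition A.2.35), we can find a sequence $\{m(n)\}_{n \geq 1}$ increasing (not necessarily strictly) to $+\infty$, such that
\[ \lambda_n R_{\lambda_n}x_{m(n)} \longrightarrow x \quad \text{in } X \text{ as } n \to +\infty. \]
Then we have
\[ \|\lambda_n R_{\lambda_n}x - x\|_X \leq \|\lambda_n R_{\lambda_n}x - \lambda_n R_{\lambda_n}x_{m(n)}\|_X + \|\lambda_n R_{\lambda_n}x_{m(n)} - x\|_X \leq \frac{\lambda_n M}{\lambda_n - \omega}\,\|x - x_{m(n)}\|_X + \|\lambda_n R_{\lambda_n}x_{m(n)} - x\|_X \longrightarrow 0, \]
so
\[ \lambda R_\lambda x \longrightarrow x \quad \text{in } X \text{ as } \lambda \to +\infty \quad \forall\, x \in X. \]
Then because of (3.84), we have
\[ A_\lambda x \longrightarrow Ax \quad \text{in } X \text{ as } \lambda \to +\infty \quad \forall\, x \in D(A). \]
For every $\lambda > \omega$ and $t \geq 0$, we have
\[ \|S_\lambda(t)\|_{\mathcal{L}} \leq e^{-\lambda t}\sum_{n=0}^{\infty}\frac{(\lambda^2 t)^n}{n!}\,\frac{M}{(\lambda - \omega)^n} = M e^{\frac{\lambda\omega}{\lambda - \omega}t}. \]
Also from Proposition 3.3.44(a), we have that
\[ A_\lambda A_\mu = A_\mu A_\lambda \quad \forall\, \lambda, \mu > 0 \]
and so
\[ A_\lambda S_\mu(t) = S_\mu(t)A_\lambda \quad \forall\, \lambda, \mu > 0,\ t \geq 0. \]
From Proposition 3.3.36(b), it follows that
\[ S_\lambda(t)x - S_\mu(t)x = \int_0^t \frac{d}{ds}\big(S_\mu(t-s)S_\lambda(s)x\big)\,ds = \int_0^t S_\mu(t-s)(A_\lambda - A_\mu)S_\lambda(s)x\,ds = \int_0^t S_\mu(t-s)S_\lambda(s)(A_\lambda - A_\mu)x\,ds \quad \forall\, x \in D(A), \]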

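so
\[ \|S_\lambda(t)x - S_\mu(t)x\|_X \leq M^2 e^{\frac{\mu\omega}{\mu - \omega}t}\,\|(A_\lambda - A_\mu)x\|_X\int_0^t e^{-\frac{(\lambda - \mu)\omega^2}{(\mu - \omega)(\lambda - \omega)}s}\,ds. \]
Let $\lambda > \mu$. Since the integrand is then at most $1$, we have
\[ \|S_\lambda(t)x - S_\mu(t)x\|_X \leq M^2\,t\,e^{\frac{\mu\omega}{\mu - \omega}t}\,\|(A_\lambda - A_\mu)x\|_X \longrightarrow 0 \quad \text{as } \lambda, \mu \to +\infty, \]
and thus $S_\lambda(t)x$ converges to some limit as $\lambda \to +\infty$ uniformly on compact intervals. Denote this limit by $S(t)x$, $x \in D(A)$. As before, exploiting the density of $D(A)$ in $X$, we have that
\[ S_\lambda(t)x \longrightarrow S(t)x \quad \text{in } X \text{ as } \lambda \to +\infty \quad \forall\, x \in X \]
and the convergence is uniform on compact intervals in $\mathbb{R}_+$. This means that the function $t \longmapsto S(t)x$ is continuous and since $S$ clearly satisfies the semigroup property and $S(0) = \mathrm{id}_X$, we conclude that $S$ is a $C_0$-semigroup.

It remains to show that $A$ is the generator of $S$. Let $\hat{A}$ be the generator of $S$. From Proposition 3.3.36(c), we know that
\[ S_\lambda(t)x - x = \int_0^t S_\lambda(s)A_\lambda x\,ds \quad \forall\, t \geq 0,\ \lambda > 0,\ x \in D(A). \tag{3.85} \]
Note that
\[ \|S_\lambda(s)A_\lambda x - S(s)Ax\|_X \leq \|S_\lambda(s)A_\lambda x - S_\lambda(s)Ax\|_X + \|S_\lambda(s)Ax - S(s)Ax\|_X \leq \|S_\lambda(s)\|_{\mathcal{L}}\,\|A_\lambda x - Ax\|_X + \|S_\lambda(s)Ax - S(s)Ax\|_X \quad \forall\, s \in [0,t],\ \lambda > 0,\ x \in D(A), \]
so
\[ \|S_\lambda(s)A_\lambda x - S(s)Ax\|_X \longrightarrow 0 \quad \text{as } \lambda \to +\infty. \]
Thus if we pass to the limit as $\lambda \to +\infty$ in (3.85), we obtain
\[ S(t)x - x = \int_0^t S(s)Ax\,ds \quad \forall\, t \geq 0,\ x \in D(A), \]
so
\[ D(A) \subseteq D(\hat{A}) \quad \text{and} \quad \hat{A}x = Ax \quad \forall\, x \in D(A), \]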

i.e., $\hat{A}$ is an extension of $A$. Now if $\lambda > \omega$, then $\lambda \in \varrho(A) \cap \varrho(\hat{A})$ and so
\[ (\lambda\,\mathrm{id}_X - \hat{A})\big(D(A)\big) = (\lambda\,\mathrm{id}_X - A)\big(D(A)\big) = X. \]
Thus $(\lambda\,\mathrm{id}_X - \hat{A})|_{D(A)}$ is bijective and so $D(A) = D(\hat{A})$, i.e., $A = \hat{A}$.

REMARK 3.3.47
The operator $A_\lambda \in \mathcal{L}(X)$ introduced in the above proof is known as the Yosida approximation of $A$. Note that if $A$ is also dissipative (see Remark 3.3.2), then $A_\lambda$ coincides with the notion introduced in Definition 3.3.11. However, there $A$ was not necessarily linear.

The following generation theorem for perturbed operators is useful in applications and is known as the Phillips theorem.

THEOREM 3.3.48 (Phillips Theorem)
If $X$ is a Banach space, $A \colon X \supseteq D(A) \longrightarrow X$ is the generator of a $C_0$-semigroup and $B \in \mathcal{L}(X)$, then $A + B \colon X \supseteq D(A) \longrightarrow X$ is also the generator of a $C_0$-semigroup.

Another important generation result is the so-called Lumer-Phillips theorem.

THEOREM 3.3.49 (Lumer-Phillips Theorem)
If $X$ is a Banach space and $A \colon X \supseteq D(A) \longrightarrow X$ is a densely defined, linear, m-dissipative operator, then $A$ is the generator of a contraction semigroup.

PROOF
By Lemma 3.3.3, we have that
\[ \|\lambda x - Ax\|_X \geq \lambda\,\|x\|_X \quad \forall\, \lambda > 0,\ x \in D(A). \]
Also
\[ R(\lambda\,\mathrm{id}_X - A) = X \quad \forall\, \lambda > 0, \]
due to the m-dissipativity of $A$ (see Definition 3.3.1 and Proposition 3.3.14). It follows that $(0,+\infty) \subseteq \varrho(A)$ and
\[ \|R_\lambda\|_{\mathcal{L}} \leq \frac{1}{\lambda} \quad \forall\, \lambda > 0. \]
Then by Theorem 3.3.46, $A$ generates a contraction semigroup.
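The Yosida approximation $A_\lambda = \lambda^2 R_\lambda - \lambda\,\mathrm{id}_X$ from the proof of Theorem 3.3.46 converges to $A$ on $D(A)$ as $\lambda \to +\infty$. A small finite-dimensional sketch makes the rate visible: since $A_\lambda - A = A^2 R_\lambda$ for bounded $A$, the error decays like $1/\lambda$. The code assumes $X = \mathbb{R}^2$ and a sample matrix $A$ chosen only for illustration; the helper names are hypothetical.

```python
def inv2(P):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = P
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def resolvent(A, lam):
    """R_lam = (lam*I - A)^{-1} for a 2x2 matrix A."""
    return inv2([[lam - A[0][0], -A[0][1]], [-A[1][0], lam - A[1][1]]])


def yosida_approximation(A, lam):
    """A_lam = lam^2 * R_lam - lam * I."""
    R = resolvent(A, lam)
    return [[lam * lam * R[i][j] - (lam if i == j else 0.0) for j in range(2)]
            for i in range(2)]


def approximation_error(A, lam):
    """Largest entrywise deviation of A_lam from A."""
    A_lam = yosida_approximation(A, lam)
    return max(abs(A_lam[i][j] - A[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    A = [[0.0, 1.0], [-2.0, -3.0]]  # sample matrix with eigenvalues -1 and -2
    for lam in (10.0, 100.0, 1000.0):
        print(lam, approximation_error(A, lam))
```

Each tenfold increase of $\lambda$ should reduce the printed error by roughly a factor of ten, in line with $A_\lambda - A = A^2 R_\lambda$.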

Recall that if $X$ is a Banach space and $A \in \mathcal{L}(X)$, then $A$ is the generator of the semigroup
\[ S(t) \overset{\mathrm{df}}{=} e^{tA} \quad \forall\, t \geq 0. \]
Moreover, from elementary analysis, we know that for the exponential function, we have
\[ e^{-at} = \lim_{n \to +\infty}\Big(1 + \frac{at}{n}\Big)^{-n}. \]
In the next theorem, we show that even if $A$ is unbounded, the limit expression is valid for the semigroup generated by $A$. The result is known as the exponential formula. First a lemma.

LEMMA 3.3.50
If $X$ is a Banach space and $B \in \mathcal{L}(X)$ with $\|B\|_{\mathcal{L}} \leq 1$, then
\[ \|e^{n(B - \mathrm{id}_X)}x - B^n x\|_X \leq \sqrt{n}\,\|x - Bx\|_X \quad \forall\, n \geq 1,\ x \in X. \]

PROOF
For $k \geq n$, we have
\[ \|B^k x - B^n x\|_X = \Big\|\sum_{m=n}^{k-1}\big(B^{m+1}x - B^m x\big)\Big\|_X \leq \|x - Bx\|_X\sum_{m=n}^{k-1}\|B^m\|_{\mathcal{L}} \leq |k - n|\,\|x - Bx\|_X. \]
Then for any $t \geq 0$, we have
\[ \|e^{t(B - \mathrm{id}_X)}x - B^n x\|_X = \Big\|e^{-t}\sum_{k=0}^{\infty}\frac{t^k}{k!}\big(B^k x - B^n x\big)\Big\|_X \leq e^{-t}\Big(\sum_{k=0}^{\infty}\frac{t^k}{k!}\,|k - n|\Big)\|x - Bx\|_X \leq e^{-t}\Big(\sum_{k=0}^{\infty}\frac{t^k}{k!}\Big)^{\frac{1}{2}}\Big(\sum_{k=0}^{\infty}\frac{t^k}{k!}(k - n)^2\Big)^{\frac{1}{2}}\|x - Bx\|_X = e^{-\frac{t}{2}}\big(t^2 - (2n-1)t + n^2\big)^{\frac{1}{2}}e^{\frac{t}{2}}\,\|x - Bx\|_X. \]
Let $t = n$. We conclude that
\[ \|e^{n(B - \mathrm{id}_X)}x - B^n x\|_X \leq \sqrt{n}\,\|x - Bx\|_X. \]

Using this auxiliary result we can prove the exponential formula.

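THEOREM 3.3.51
If $X$ is a Banach space and $S$ is a contraction semigroup on $X$ with generator $A$, then
\[ S(t)x = \lim_{n \to +\infty}\Big(\mathrm{id}_X - \frac{t}{n}A\Big)^{-n}x = \lim_{n \to +\infty}\Big(\frac{n}{t}R_{\frac{n}{t}}\Big)^n x \quad \forall\, x \in X. \]

PROOF
For every $n \geq 1$ and $t > 0$, we have
\[ \frac{n}{t}R_{\frac{n}{t}} = \frac{n}{t}\Big(\frac{n}{t}\,\mathrm{id}_X - A\Big)^{-1} = \Big(\mathrm{id}_X - \frac{t}{n}A\Big)^{-1}. \]
Also we have
\[ tA_{\frac{n}{t}} = t\Big(\frac{n^2}{t^2}R_{\frac{n}{t}} - \frac{n}{t}\,\mathrm{id}_X\Big) = n\Big(\frac{n}{t}R_{\frac{n}{t}} - \mathrm{id}_X\Big). \]
Note that
\[ \Big\|\frac{n}{t}R_{\frac{n}{t}}\Big\|_{\mathcal{L}} \leq 1. \]
So we can apply Lemma 3.3.50 with $B = \frac{n}{t}R_{\frac{n}{t}}$. We obtain
\[ \Big\|\exp\Big(n\Big(\frac{n}{t}R_{\frac{n}{t}} - \mathrm{id}_X\Big)\Big)x - \Big(\frac{n}{t}R_{\frac{n}{t}}\Big)^n x\Big\|_X \leq \sqrt{n}\,\Big\|\frac{n}{t}R_{\frac{n}{t}}x - x\Big\|_X. \]
From the proof of Theorem 3.3.46, we know that
\[ \Big\|\frac{n}{t}R_{\frac{n}{t}}x - x\Big\|_X \leq \frac{t}{n}\,\|Ax\|_X \quad \forall\, x \in D(A). \]
Therefore it follows that
\[ \Big\|\exp\big(tA_{\frac{n}{t}}\big)x - \Big(\frac{n}{t}R_{\frac{n}{t}}\Big)^n x\Big\|_X \leq \frac{t}{\sqrt{n}}\,\|Ax\|_X \quad \forall\, x \in D(A). \]
Again from the proof of Theorem 3.3.46, we know that
\[ S(t)x = \lim_{n \to +\infty}\exp\big(tA_{\frac{n}{t}}\big)x \quad \forall\, x \in X. \]
So we infer that for fixed $t > 0$,
\[ S(t)x = \lim_{n \to +\infty}\Big(\frac{n}{t}R_{\frac{n}{t}}\Big)^n x \quad \forall\, x \in D(A). \tag{3.86} \]
But
\[ \big\|e^{tA_{\frac{n}{t}}}\big\|_{\mathcal{L}} \leq 1 \quad \text{and} \quad \Big\|\Big(\frac{n}{t}R_{\frac{n}{t}}\Big)^n\Big\|_{\mathcal{L}} \leq 1 \]
and $D(A)$ is dense in $X$. So (3.86) is valid for all $x \in X$.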
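In the scalar case the exponential formula is the convergence of the implicit (backward) Euler scheme. The following sketch assumes $X = \mathbb{R}$ and $A = a \leq 0$, so that $S(t) = e^{ta}$ is a contraction semigroup, $\frac{n}{t}R_{\frac{n}{t}} = (1 - \frac{t}{n}a)^{-1}$ and $A_\lambda = \lambda a/(\lambda - a)$. It evaluates both the resolvent power and the Yosida exponential and compares their distance with the bound $\frac{t}{\sqrt{n}}|a|$ obtained in the proof (taking $x = 1$). The function names are illustrative.

```python
import math


def resolvent_power(a, t, n):
    """((n/t) R_{n/t})^n = (1 - (t/n) a)^{-n} for the scalar generator a."""
    return (1.0 - t * a / n) ** (-n)


def yosida_exponential(a, t, n):
    """exp(t * A_lam) with lam = n/t and A_lam = lam^2 R_lam - lam = lam * a / (lam - a)."""
    lam = n / t
    return math.exp(t * lam * a / (lam - a))


if __name__ == "__main__":
    a, t = -2.0, 1.0
    exact = math.exp(t * a)
    for n in (1, 10, 100, 1000):
        rp = resolvent_power(a, t, n)
        ye = yosida_exponential(a, t, n)
        # Columns: n, error of the exponential formula, gap to the Yosida
        # exponential, and the bound t * |a| / sqrt(n) from the proof.
        print(n, abs(rp - exact), abs(ye - rp), t * abs(a) / math.sqrt(n))
```

The bound $t\|Ax\|_X/\sqrt{n}$ is far from sharp here: for bounded generators the actual error of the backward Euler scheme decays like $1/n$.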

3. Nonlinear Operators and Young Measures

387

Before passing to the nonlinear semigroup theory, let us see how we can use semigroups to extend the notion of a solution for an inhomogeneous evolution equation. So let $X$ be a Banach space, $A$ the generator of a $C_0$-semigroup $S$ and $T=[0,b]$. Let $f\in L^1(T;X)$ and consider the evolution equation
\[
\begin{cases}
x'(t) = Ax(t)+f(t) \quad \forall\ t\in T=[0,b],\\
x(0)=x_0.
\end{cases} \tag{3.87}
\]

DEFINITION 3.3.52
(a) A function $x\in W^{1,1}\big((0,b);X\big)$ is a strong solution of (3.87), if $x(0)=x_0$ and it satisfies the equation almost everywhere (hence $x(t)\in D(A)$ for almost all $t\in T$).
(b) A function $x\in C(T;X)$ is a weak solution of (3.87), if for all $x^*\in X^*$, the function $t\longmapsto\langle x^*,x(t)\rangle_X$ is absolutely continuous and
\[
\langle x^*,x(t)\rangle_X = \langle x^*,x_0\rangle_X + \int_0^t\langle A^*x^*,x(s)\rangle_X\,ds + \int_0^t\langle x^*,f(s)\rangle_X\,ds \qquad \forall\ t\in T.
\]
(c) A function $x\in C(T;X)$ is a mild solution of (3.87), if
\[
x(t) = S(t)x_0 + \int_0^t S(t-s)f(s)\,ds \qquad \forall\ t\in T.
\]

The following interesting result is due to Ball (1977), where the reader can find its proof.

THEOREM 3.3.53 If $X$ is a Banach space, $A$ is the generator of a $C_0$-semigroup and $f\in L^1(T;X)$, then $x\in C(T;X)$ is a mild solution of (3.87) if and only if it is a weak solution.

REMARK 3.3.54 In contrast to the strong solution, the mild solution makes sense without having that $x(t)\in D(A)$ for a.a. $t\in T$. Also we need not have $x_0\in D(A)$ (nonregular initial condition). Moreover, it is easy to check that the mild solution map $(f,x_0)\longmapsto x(\cdot;f,x_0)$ is Lipschitz continuous on $L^1(T;X)\times X$.
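To make the variation of constants formula of Definition 3.3.52(c) concrete, here is a small sketch in the scalar case. The setting is an assumption chosen for illustration: $X=\mathbb{R}$, $A=a\,\mathrm{id}$ with $a\le 0$, so that $S(t)=e^{at}$, and the helper name `mild_solution` is hypothetical.

```python
import numpy as np

def mild_solution(t, x0, a, f, steps=2000):
    """Approximate x(t) = S(t)x0 + int_0^t S(t-s) f(s) ds for S(t) = exp(a*t)."""
    s = np.linspace(0.0, t, steps + 1)
    integrand = np.exp(a * (t - s)) * f(s)
    h = t / steps
    # composite trapezoidal rule written out by hand
    integral = h * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return np.exp(a * t) * x0 + integral

a, x0, t = -1.0, 3.0, 2.0
forcing = lambda s: np.ones_like(s)                    # f = 1
approx = mild_solution(t, x0, a, forcing)
exact = np.exp(a * t) * x0 + (1.0 - np.exp(a * t))     # closed form for f = 1, a = -1

# Lipschitz dependence on the data (Remark 3.3.54), here with constant 1
# because |S(t)| <= 1: |x - y| <= |x0 - y0| + ||f - g||_{L^1(0,t)}.
other = mild_solution(t, 2.5, a, lambda s: 0.8 * np.ones_like(s))
gap = abs(approx - other)
lipschitz_bound = abs(x0 - 2.5) + 0.2 * t
```

The quadrature reproduces the closed form up to discretization error, and the two data sets illustrate the Lipschitz dependence mentioned in Remark 3.3.54 for this particular contraction semigroup.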


Now we move to nonlinear semigroups.

DEFINITION 3.3.55 Let $X$ be a Banach space and let $C\subseteq X$ be a nonempty set. A family of maps
\[
S(t)\colon C\longrightarrow C, \qquad t\ge 0,
\]
is said to be a semigroup of nonexpansive maps if
(a) $S(0)=\mathrm{id}_C$;
(b) $S(t+s)=S(t)\circ S(s)$ for all $t,s\ge 0$;
(c) $\|S(t)x-S(t)y\|_X\le\|x-y\|_X$ for all $t\ge 0$ and all $x,y\in C$;
(d) $S(t)x\longrightarrow x$ in $X$ as $t\searrow 0$ for all $x\in C$.

REMARK 3.3.56 Evidently a semigroup $S$ of nonexpansive maps on $C$ can be extended uniquely to a semigroup of nonexpansive maps on $\overline{C}$ and so in Definition 3.3.55 we may assume without any loss of generality that $C\subseteq X$ is closed. If $C=X$ and $S(t)\in\mathcal{L}(X)$, then we recover Definition 3.3.33. Moreover, it is straightforward to check that $\mathbb{R}_+\times C\ni(t,x)\longmapsto S(t)x\in C$ is continuous.

We shall prove a basic generation theorem for nonlinear semigroups of nonexpansive maps. The result will be a nonlinear analog of Theorems 3.3.49 and 3.3.51. To do this we need some preparation. First we prove a combinatorial lemma.

LEMMA 3.3.57 If $n\ge m\ge 1$ are integers and $\alpha,\beta>0$ are such that $\alpha+\beta=1$, then
\[
\sum_{k=0}^{m}\binom{n}{k}\alpha^k\beta^{n-k}(m-k) \le \big[(n\alpha-m)^2+n\alpha\beta\big]^{\frac12}
\]
and
\[
\sum_{k=m}^{n}\binom{k-1}{m-1}\alpha^m\beta^{k-m}(n-k) \le \Big[\frac{m\beta}{\alpha^2}+\Big(\frac{m\beta}{\alpha}+m-n\Big)^2\Big]^{\frac12}.
\]

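PROOF Since $n\ge m$ and using the Cauchy-Schwarz inequality, we have
\begin{align*}
\sum_{k=0}^{m}\binom{n}{k}\alpha^k\beta^{n-k}(m-k)
&\le \sum_{k=0}^{n}\binom{n}{k}\alpha^k\beta^{n-k}|m-k| \\
&\le \Big(\sum_{k=0}^{n}\binom{n}{k}\alpha^k\beta^{n-k}\Big)^{\frac12}\Big(\sum_{k=0}^{n}\binom{n}{k}\alpha^k\beta^{n-k}(m-k)^2\Big)^{\frac12}. \tag{3.88}
\end{align*}
From the binomial theorem, we know that
\[
\sum_{k=0}^{n}\binom{n}{k}\alpha^k\beta^{n-k} = (\alpha+\beta)^n, \tag{3.89}
\]
\[
\sum_{k=0}^{n}k\binom{n}{k}\alpha^k\beta^{n-k} = \alpha n(\alpha+\beta)^{n-1} \tag{3.90}
\]
and
\[
\sum_{k=0}^{n}k^2\binom{n}{k}\alpha^k\beta^{n-k} = \alpha^2 n(n-1)(\alpha+\beta)^{n-2}+\alpha n(\alpha+\beta)^{n-1}. \tag{3.91}
\]
Using the equations (3.89), (3.90) and (3.91) in the right hand side of (3.88) and since $\alpha+\beta=1$, we obtain
\[
\sum_{k=0}^{m}\binom{n}{k}\alpha^k\beta^{n-k}(m-k) \le \big[(n\alpha-m)^2+n\alpha\beta\big]^{\frac12}.
\]
Also using once more the Cauchy-Schwarz inequality, we have
\begin{align*}
\sum_{k=m}^{n}\binom{k-1}{m-1}\alpha^m\beta^{k-m}(n-k)
&\le \sum_{k=m}^{\infty}\binom{k-1}{m-1}\alpha^m\beta^{k-m}|n-k| \tag{3.92}\\
&\le \Big(\sum_{k=m}^{\infty}\binom{k-1}{m-1}\alpha^m\beta^{k-m}\Big)^{\frac12}\Big(\sum_{k=m}^{\infty}\binom{k-1}{m-1}\alpha^m\beta^{k-m}(n-k)^2\Big)^{\frac12}.
\end{align*}
Recall that
\[
\sum_{k=m}^{\infty}\binom{k-1}{m-1}\beta^{k-m} = \frac{1}{(1-\beta)^m} \qquad \forall\ \beta\in(0,1). \tag{3.93}
\]
Using (3.93) and the identities obtained by differentiating it with respect to $\beta$, in the right hand side of (3.92), we obtain
\[
\sum_{k=m}^{n}\binom{k-1}{m-1}\alpha^m\beta^{k-m}(n-k) \le \Big[\frac{m\beta}{\alpha^2}+\Big(\frac{m\beta}{\alpha}+m-n\Big)^2\Big]^{\frac12}.
\]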
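The two inequalities of Lemma 3.3.57 are easy to test numerically. The sketch below evaluates both sides for a small range of $n\ge m\ge 1$ and a few values of $\alpha$; the function names are hypothetical and the ranges are arbitrary choices, so this is a sanity check rather than a proof.

```python
from math import comb, sqrt

def lhs_binomial(n, m, alpha):
    """sum_{k=0}^{m} C(n,k) alpha^k beta^(n-k) (m-k)."""
    beta = 1.0 - alpha
    return sum(comb(n, k) * alpha**k * beta**(n - k) * (m - k) for k in range(m + 1))

def rhs_binomial(n, m, alpha):
    beta = 1.0 - alpha
    return sqrt((n * alpha - m) ** 2 + n * alpha * beta)

def lhs_negative_binomial(n, m, alpha):
    """sum_{k=m}^{n} C(k-1,m-1) alpha^m beta^(k-m) (n-k)."""
    beta = 1.0 - alpha
    return sum(comb(k - 1, m - 1) * alpha**m * beta**(k - m) * (n - k)
               for k in range(m, n + 1))

def rhs_negative_binomial(n, m, alpha):
    beta = 1.0 - alpha
    return sqrt(m * beta / alpha**2 + (m * beta / alpha + m - n) ** 2)

def lemma_holds(n, m, alpha, tol=1e-12):
    """Check both inequalities of the lemma up to a rounding tolerance."""
    first = lhs_binomial(n, m, alpha) <= rhs_binomial(n, m, alpha) + tol
    second = lhs_negative_binomial(n, m, alpha) <= rhs_negative_binomial(n, m, alpha) + tol
    return first and second

checks = []
for n in range(1, 15):
    for m in range(1, n + 1):
        for alpha in (0.1, 0.5, 0.9):
            checks.append(lemma_holds(n, m, alpha))
print(all(checks))
```

The right hand sides are exactly the square roots of the second moments of $m-k$ and $n-k$ under the binomial and negative binomial weights, which is what the Cauchy-Schwarz step in the proof produces.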


Using this auxiliary result we can obtain some estimates for the family $\{J_\lambda^n\}_{n\ge 1,\lambda>0}$.

LEMMA 3.3.58 If $X$ is a Banach space, $A\colon X\supseteq D(A)\longrightarrow X$ is an m-accretive operator, $\lambda\ge\mu>0$ and $n\ge m\ge 1$ are integers, then
(a) $\|J_\lambda^n(x)-x\|_X\le n\,\|J_\lambda(x)-x\|_X$ for all $x\in X$;
(b)
\[
\|J_\mu^n(x)-J_\lambda^m(x)\|_X \le \sum_{k=0}^{m}\alpha^k\beta^{n-k}\binom{n}{k}\|J_\lambda^{m-k}(x)-x\|_X + \sum_{k=m}^{n}\alpha^m\beta^{k-m}\binom{k-1}{m-1}\|J_\mu^{n-k}(x)-x\|_X,
\]
where
\[
\alpha=\frac{\mu}{\lambda} \quad\text{and}\quad \beta=\frac{\lambda-\mu}{\lambda}.
\]

PROOF (a) Using Proposition 3.3.12(a), we have
\[
\|J_\lambda^n x-x\|_X = \Big\|\sum_{k=0}^{n-1}\big(J_\lambda^{n-k}(x)-J_\lambda^{n-(k+1)}(x)\big)\Big\|_X \le \sum_{k=0}^{n-1}\|J_\lambda(x)-x\|_X = n\,\|J_\lambda(x)-x\|_X.
\]
(b) For integers $1\le i\le n$ and $1\le k\le m$, we set
\[
a_{k,i} \stackrel{df}{=} \|J_\mu^i(x)-J_\lambda^k(x)\|_X.
\]
Using the resolvent identity (see Remark 3.3.13), we obtain
\begin{align*}
a_{k,i} &= \Big\|J_\mu^i(x)-J_\mu\Big(\frac{\mu}{\lambda}J_\lambda^{k-1}(x)+\frac{\lambda-\mu}{\lambda}J_\lambda^k(x)\Big)\Big\|_X \\
&\le \Big\|J_\mu^{i-1}(x)-\frac{\mu}{\lambda}J_\lambda^{k-1}(x)-\frac{\lambda-\mu}{\lambda}J_\lambda^k(x)\Big\|_X \\
&\le \frac{\mu}{\lambda}\|J_\mu^{i-1}(x)-J_\lambda^{k-1}(x)\|_X+\frac{\lambda-\mu}{\lambda}\|J_\mu^{i-1}(x)-J_\lambda^k(x)\|_X \\
&= \alpha a_{k-1,i-1}+\beta a_{k,i-1}. \tag{3.94}
\end{align*}
Inequalities (3.94) can be solved to estimate $a_{m,n}$ in terms of $a_{k,0}$ and $a_{0,i}$. This way we obtain the inequality in part (b) of the lemma.
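The ingredients of this proof can be tried out on a concrete operator. As an assumed example (not from the text) take $X=\mathbb{R}$ and $A(y)=y^3$, which is m-accretive; then $J_\lambda(x)$ is the unique root $y$ of $y+\lambda y^3=x$. The sketch below computes it by bisection and checks the resolvent identity, part (a) of Lemma 3.3.58 and the recursion (3.94). The helper names are hypothetical.

```python
def resolvent(lam, x, iters=200):
    """J_lam(x) = (id + lam*A)^{-1} x for A(y) = y**3: the root of y + lam*y**3 = x."""
    lo, hi = -abs(x), abs(x)          # the root always lies in [-|x|, |x|]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + lam * mid**3 < x:    # y -> y + lam*y**3 is increasing
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def resolvent_power(lam, x, n):
    """J_lam^n(x), with J_lam^0 = id."""
    for _ in range(n):
        x = resolvent(lam, x)
    return x

x, lam, mu = 2.0, 0.7, 0.3
alpha, beta = mu / lam, (lam - mu) / lam

# Resolvent identity: J_lam(x) = J_mu(alpha*x + beta*J_lam(x)).
identity_gap = abs(resolvent(lam, x) - resolvent(mu, alpha * x + beta * resolvent(lam, x)))

# Part (a): |J_lam^n(x) - x| <= n |J_lam(x) - x|.
part_a = all(
    abs(resolvent_power(lam, x, n) - x) <= n * abs(resolvent(lam, x) - x) + 1e-12
    for n in range(1, 8)
)

def a(k, i):
    """a_{k,i} = |J_mu^i(x) - J_lam^k(x)|."""
    return abs(resolvent_power(mu, x, i) - resolvent_power(lam, x, k))

# Recursion (3.94): a_{k,i} <= alpha*a_{k-1,i-1} + beta*a_{k,i-1}.
recursion = all(
    a(k, i) <= alpha * a(k - 1, i - 1) + beta * a(k, i - 1) + 1e-12
    for k in range(1, 6) for i in range(1, 6)
)
```

Bisection is used only because it needs no derivative information; any root finder for the increasing function $y\longmapsto y+\lambda y^3$ would serve equally well.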


Now we are ready for the generation theorem for nonlinear semigroups of nonexpansive maps.

THEOREM 3.3.59 If $X$ is a Banach space and $A\colon X\supseteq D(A)\longrightarrow 2^X$ is an m-accretive operator, then
\[
S(t)x = \lim_{n\to+\infty}\Big(\mathrm{id}_X+\frac{t}{n}A\Big)^{-n}x
\]
exists for each $x\in\overline{D(A)}$, uniformly in $t$ on compact intervals in $\mathbb{R}_+$. Moreover, $\{S(t)\}_{t\ge 0}$ is a semigroup of nonexpansive maps on $\overline{D(A)}$ and for each $x\in D(A)$ and $t\ge 0$, we have
\[
\|S(t)x-x\|_X \le t\,|A(x)| = t\inf_{u\in A(x)}\|u\|_X.
\]

PROOF Let $x\in D(A)$, $\lambda\ge\mu>0$ and $n\ge m\ge 1$ positive integers. Using Lemmata 3.3.57 and 3.3.58, we obtain
\begin{align*}
\|J_\mu^n(x)-J_\lambda^m(x)\|_X \le \Big(&\big[(n\mu-\lambda m)^2+n\mu(\lambda-\mu)\big]^{\frac12} \tag{3.95}\\
&+\big[m\lambda(\lambda-\mu)+(m\lambda-n\mu)^2\big]^{\frac12}\Big)\,|A(x)|. \tag{3.96}
\end{align*}
Taking $\mu=\frac{t}{n}$ and $\lambda=\frac{t}{m}$ in (3.96), we obtain
\[
\big\|J_{\frac{t}{n}}^n(x)-J_{\frac{t}{m}}^m(x)\big\|_X \le 2t\Big(\frac{1}{m}-\frac{1}{n}\Big)^{\frac12}|Ax|, \tag{3.97}
\]
so
\[
\lim_{n\to+\infty}J_{\frac{t}{n}}^n(x) = S(t)x
\]
exists uniformly in $t$ on compact intervals of $\mathbb{R}_+$. Moreover, since $J_{\frac{t}{n}}^n$ is nonexpansive on $X$, we see that
\[
\|S(t)x-S(t)y\|_X \le \|x-y\|_X \qquad \forall\ t\ge 0,\ x,y\in D(A).
\]
Therefore
\[
S(t)x = \lim_{n\to+\infty}J_{\frac{t}{n}}^n(x) \text{ exists for } x\in\overline{D(A)} \text{ and } S(t)(\cdot) \text{ is nonexpansive on } \overline{D(A)}. \tag{3.98}
\]
Also if in (3.96), we let $n=m$, $\mu=\frac{t}{n}$ and $\lambda=\frac{s}{n}$ with $0\le t\le s$ and then pass to the limit as $n\to+\infty$, we obtain
\[
\|S(t)x-S(s)x\|_X \le 2|t-s|\,|Ax| \qquad \forall\ x\in D(A). \tag{3.99}
\]
From (3.99), it follows that the function $t\longmapsto S(t)x$ is continuous for all $x\in D(A)$ and then by a use of the double limit lemma (see Proposition A.2.35) as in the proof of Theorem 3.3.46, we obtain that the function $t\longmapsto S(t)x$ is continuous on $\mathbb{R}_+$ for all $x\in\overline{D(A)}$.

Finally we need to verify the semigroup property (see Definition 3.3.55(b)). From (3.97) and (3.98), we have
\[
S(t)^m x = \lim_{n\to+\infty}\big(J_{\frac{t}{n}}^n\big)^m(x) = \lim_{n\to+\infty}\big(J_{\frac{t}{n}}^m\big)^n(x).
\]
Therefore,
\begin{align*}
S(mt)x &= \lim_{n\to+\infty}J_{\frac{mt}{n}}^n(x) = \lim_{k\to+\infty}J_{\frac{mt}{mk}}^{mk}(x) \\
&= \lim_{k\to+\infty}\big(J_{\frac{t}{k}}^m\big)^k(x) = S(t)^m x. \tag{3.100}
\end{align*}
Then if $i,k,r,s>0$ are integers, we have
\begin{align*}
S\Big(\frac{i}{k}+\frac{r}{s}\Big)x &= S\Big(\frac{is+rk}{ks}\Big)x = S\Big(\frac{1}{ks}\Big)^{is+rk}x \\
&= S\Big(\frac{1}{ks}\Big)^{is}S\Big(\frac{1}{ks}\Big)^{rk}x = S\Big(\frac{i}{k}\Big)S\Big(\frac{r}{s}\Big)x,
\end{align*}
so $S(t+\tau)x=S(t)\circ S(\tau)x$ for all rational $t,\tau>0$ and all $x\in\overline{D(A)}$. Exploiting the continuity in $t$ and the nonexpansiveness in $x$, we conclude that
\[
S(t+\tau) = S(t)\circ S(\tau) \qquad \forall\ t,\tau\ge 0.
\]
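The exponential formula of Theorem 3.3.59 amounts to taking $n$ implicit Euler steps of size $t/n$. Below is a minimal sketch for an assumed scalar example, $X=\mathbb{R}$ and $A(y)=y^3$, where the semigroup is known explicitly: the solution of $x'=-x^3$, $x(0)=x_0$ is $x_0/\sqrt{1+2tx_0^2}$. The helper names are hypothetical, and the bound used for comparison is the one obtained from (3.97) by letting the larger index tend to infinity.

```python
from math import sqrt

def resolvent(lam, x, iters=200):
    """J_lam(x) for A(y) = y**3: the root of y + lam*y**3 = x, found by bisection."""
    lo, hi = -abs(x), abs(x)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + lam * mid**3 < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exponential_formula(t, x0, n):
    """(id + (t/n) A)^{-n} x0, i.e. n implicit Euler steps of size t/n."""
    x = x0
    for _ in range(n):
        x = resolvent(t / n, x)
    return x

def exact_flow(t, x0):
    """S(t)x0 for x' = -x**3, x(0) = x0."""
    return x0 / sqrt(1.0 + 2.0 * t * x0**2)

t, x0 = 1.0, 1.0
errors, bounds = {}, {}
for n in (10, 100, 1000):
    errors[n] = abs(exponential_formula(t, x0, n) - exact_flow(t, x0))
    # limit case of (3.97): |S(t)x - J_{t/n}^n x| <= 2 t n^{-1/2} |A x|
    bounds[n] = 2.0 * t * abs(x0**3) / sqrt(n)

# last assertion of the theorem: |S(t)x - x| <= t |A(x)|
displacement_ok = abs(exact_flow(t, x0) - x0) <= t * abs(x0**3)
```

For this smooth example the observed error decays much faster than the general $n^{-1/2}$ bound, which is stated for arbitrary m-accretive operators.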

In applications to evolution equations, it is very helpful to know if $S(t)\colon C\longrightarrow C$, $t>0$, is a compact map (see Definition 3.1.1). So we make the following definition.

DEFINITION 3.3.60 Let $X$ be a Banach space, $C\subseteq X$ a nonempty, closed set and $S(t)\colon C\longrightarrow C$, $t\ge 0$, a semigroup of nonexpansive maps. We say that $S$ is compact, if for all $t>0$, $S(t)$ is a compact map.


REMARK 3.3.61 Since $S(0)=\mathrm{id}_C$, then $S(0)$ is not in general compact unless $C\subseteq X$ is compact or $X$ is finite dimensional.

Next we present two simple but typical examples of (linear) semigroups which are compact and noncompact respectively.

EXAMPLE 3.3.62
(a) Let $X=H=L^2(0,\pi)$ and let $A\colon H\supseteq D(A)\longrightarrow H$ be defined by
\[
Ax \stackrel{df}{=} -\frac{d^2x}{dt^2} \qquad \forall\ x\in W^{2,2}(0,\pi)\cap W_0^{1,2}(0,\pi).
\]
By integration by parts, we can check that $A$ is monotone (i.e., accretive). Also for every $f\in L^2(0,\pi)$ the boundary value problem
\[
\begin{cases}
-x''(t)+x(t)=f(t) \quad \text{for a.a. } t\in[0,\pi],\\
x(0)=x(\pi)=0,
\end{cases}
\]
has a unique solution $x\in D(A)$. Hence $A$ is maximal monotone (i.e., m-accretive). Note that $\overline{D(A)}=H$ and then because of Theorem 3.3.49, $-A$ generates a contraction semigroup $S$ on $H$. We know that
\[
\{\lambda_k=k^2\}_{k\ge 1} \text{ is the spectrum of } \Big(-\frac{d^2}{dt^2},\,W_0^{1,2}(0,\pi)\Big)
\]
and the corresponding eigenfunctions
\[
\Big\{\sqrt{\tfrac{2}{\pi}}\,\sin kx\Big\}_{k\ge 1}
\]
form an orthonormal basis for $H$. Then using sine Fourier expansion, we can easily verify that
\[
S(t)x(\tau) = \sum_{k=1}^{\infty}a_k e^{-k^2t}\sqrt{\tfrac{2}{\pi}}\,\sin k\tau \qquad \forall\ x\in H,\ t\ge 0,\ \tau\in[0,\pi],
\]
with $a_k$ being the $k$-th Fourier coefficient, defined by
\[
a_k \stackrel{df}{=} \sqrt{\tfrac{2}{\pi}}\int_0^{\pi}x(s)\sin ks\,ds.
\]
If we set
\[
S_n(t)x(\tau) \stackrel{df}{=} \sum_{k=1}^{n}a_k e^{-k^2t}\sqrt{\tfrac{2}{\pi}}\,\sin k\tau \qquad \forall\ x\in H,\ t\ge 0,\ \tau\in[0,\pi],
\]
then $S_n\in\mathcal{L}_f(H)$ (i.e., $S_n$ is of finite rank; see Definition 3.1.23) and for all $t>0$,
\[
S_n(t)x\longrightarrow S(t)x \quad \text{in } H, \text{ uniformly on bounded subsets of } H.
\]
Therefore $S(t)$ is compact for $t>0$ (see Proposition 3.1.18).

(b) Let $X=H=L^2(0,2\pi)$ and let $A\colon X\supseteq D(A)\longrightarrow H$ be defined by
\[
Ax \stackrel{df}{=} \frac{dx}{dt} \qquad \forall\ x\in D(A),
\]
with
\[
D(A) \stackrel{df}{=} \big\{x\in W^{2,2}(0,2\pi):\ x(0)=x(2\pi),\ x'(0)=x'(2\pi)\big\}.
\]
A simple integration by parts reveals that $A$ is monotone (i.e., accretive). Also for every $\lambda>0$ and every $h\in L^2(0,2\pi)$, the problem
\[
\begin{cases}
x(t)+\lambda\frac{dx}{dt}(t)=h(t) \quad \forall\ t\in[0,2\pi],\\
x(0)=x(2\pi),\\
x'(0)=x'(2\pi),
\end{cases}
\]
has a unique solution $x\in W^{2,2}(0,2\pi)$ and so $A$ is maximal monotone (i.e., $A$ is m-accretive). Hence by Theorem 3.3.49, $-A$ generates a contraction semigroup $S$ on $H$. This semigroup is defined by
\[
S(t)x(\tau) =
\begin{cases}
x(\tau+t) & \text{if } \tau+t\in[0,2\pi],\\
x(\tau+t-2\pi) & \text{if } \tau+t>2\pi,
\end{cases}
\]
for all $x\in H$, $\tau\in[0,2\pi]$ and $t\ge 0$. Then for every $t\ge 0$, $S(t)$ is an isometry on $H$ and so $S(t)$ is not compact for $t>0$.

REMARK 3.3.63 Roughly speaking, we can say that compact semigroups of nonexpansive maps are generated by m-accretive operators acting in a finite dimensional Banach space or by m-accretive operators arising in the study of parabolic problems. In contrast hyperbolic problems (even very simple ones) generate noncompact semigroups. The compactness of a semigroup $S$ of nonexpansive maps is closely related to the compactness of the nonexpansive operators $J_\lambda=(\mathrm{id}_X+\lambda A)^{-1}$ (see Definition 3.3.11 and Proposition 3.3.12(a)). For this reason we need to have a result determining the relationship between $S(t)$ and $J_\lambda$, $t,\lambda>0$.
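The contrast between the two semigroups of Example 3.3.62 is visible on the level of Fourier multipliers. In the sketch below (an illustration under assumptions: only the first 50 modes are kept, and the translation semigroup is represented by the multipliers $e^{ikt}$ on the exponential basis), the operator norm of the part of a diagonal operator acting on the modes $k>n$ is the largest modulus of the corresponding multipliers. It is this tail that measures how well the finite rank truncation approximates the operator.

```python
import numpy as np

def heat_multipliers(t, modes):
    """Multipliers exp(-k^2 t), k = 1..modes, of the semigroup of part (a)."""
    k = np.arange(1, modes + 1)
    return np.exp(-(k**2) * t)

def translation_multipliers(t, modes):
    """Multipliers exp(i k t), k = 1..modes, of a translation on the exponential basis."""
    k = np.arange(1, modes + 1)
    return np.exp(1j * k * t)

def tail_norm(multipliers, n):
    """Operator norm of the diagonal operator restricted to the modes k > n."""
    return float(np.abs(multipliers[n:]).max())

t, modes = 0.1, 50      # truncation to 50 modes is an assumption of this sketch
heat_tails = [tail_norm(heat_multipliers(t, modes), n) for n in (1, 5, 10)]
shift_tails = [tail_norm(translation_multipliers(t, modes), n) for n in (1, 5, 10)]
```

For the heat semigroup the tails equal $e^{-(n+1)^2t}$ and decay rapidly, so the finite rank truncations converge in operator norm; for the translation the tails stay equal to one, which reflects the fact that an isometry of an infinite dimensional space cannot be compact.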


LEMMA 3.3.64 If $X$ is a Banach space, $C\subseteq X$ is a nonempty, closed set and $S(t)\colon C\longrightarrow C$, $t\ge 0$, is a semigroup of nonexpansive maps, then
\[
\|S(t)x-x\|_X \le \frac{2}{t}\int_0^t\|S(\tau)x-x\|_X\,d\tau \qquad \forall\ x\in C,\ t>0.
\]

PROOF From Definition 3.3.55(b) and (c), we have
\begin{align*}
\Big\|S(t)x-\frac{1}{t}\int_0^t S(\tau)x\,d\tau\Big\|_X
&= \Big\|\frac{1}{t}\int_0^t\big(S(t)x-S(\tau)x\big)\,d\tau\Big\|_X \tag{3.101}\\
&\le \frac{1}{t}\int_0^t\|S(t-\tau)x-x\|_X\,d\tau \tag{3.102}\\
&= \frac{1}{t}\int_0^t\|S(\tau)x-x\|_X\,d\tau \qquad \forall\ x\in C,\ t>0. \tag{3.103}
\end{align*}
Using this inequality, we obtain
\begin{align*}
\|S(t)x-x\|_X
&\le \Big\|S(t)x-\frac{1}{t}\int_0^t S(\tau)x\,d\tau\Big\|_X+\Big\|\frac{1}{t}\int_0^t\big(S(\tau)x-x\big)\,d\tau\Big\|_X \\
&\le \frac{1}{t}\int_0^t\|S(\tau)x-x\|_X\,d\tau+\frac{1}{t}\int_0^t\|S(\tau)x-x\|_X\,d\tau \\
&= \frac{2}{t}\int_0^t\|S(\tau)x-x\|_X\,d\tau \qquad \forall\ x\in C,\ t>0.
\end{align*}
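The estimate of Lemma 3.3.64 can be checked on an explicit nonexpansive semigroup. As an assumed example we use the flow of $x'=-x^3$ on $\mathbb{R}$, namely $S(t)x=x/\sqrt{1+2tx^2}$, and approximate the integral by the trapezoidal rule; the function names are hypothetical.

```python
import numpy as np

def flow(t, x):
    """S(t)x for x' = -x**3 (assumed example of a nonexpansive semigroup on R)."""
    return x / np.sqrt(1.0 + 2.0 * t * x**2)

def averaged_displacement(t, x, steps=4000):
    """(2/t) * int_0^t |S(tau)x - x| d tau, by the trapezoidal rule."""
    tau = np.linspace(0.0, t, steps + 1)
    g = np.abs(flow(tau, x) - x)
    h = t / steps
    return (2.0 / t) * h * (g.sum() - 0.5 * (g[0] + g[-1]))

samples = [(t, x) for t in (0.1, 1.0, 10.0) for x in (-2.0, 0.5, 3.0)]
lemma_3_3_64 = [
    abs(flow(t, x) - x) <= averaged_displacement(t, x) + 1e-12
    for t, x in samples
]
```

For small $t$ the two sides of the inequality agree to first order in $t$, so the constant $2$ in the lemma cannot be improved in general.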

In order to derive the desired relations between S(t) and Jλ , with t, λ > 0, we need to return to nonlinear evolution equations and discuss their solvability when the data are nonregular.


So let $T=[0,b]$, $X$ be a Banach space, $A\colon X\supseteq D(A)\longrightarrow 2^X$ be an m-accretive operator and $f\in L^1(T;X)$. We consider the following nonlinear evolution inclusion:
\[
\begin{cases}
x'(t)+A\big(x(t)\big)\ni f(t) \quad \forall\ t\in T,\\
x(0)=x_0.
\end{cases} \tag{3.104}
\]
From Theorem 3.3.28, we know that if $X^*$ is uniformly convex (hence $X$ is reflexive), $x_0\in D(A)$ and $f\in W^{1,1}\big((0,b);X\big)$, then there exists a unique $x\in W^{1,\infty}\big((0,b);X\big)$ satisfying (3.104) for almost all $t\in T$. Such a solution is usually called a strong solution. However, if $x_0\in\overline{D(A)}\setminus D(A)$ or $f\in L^1(T;X)\setminus W^{1,1}\big((0,b);X\big)$ or $X$ is not reflexive, then there are examples showing that (3.104) need not have a strong solution (for details we refer to Crandall & Liggett (1971)). So if we want to develop a general theory concerning evolution equations of the form (3.104), we need to introduce a new broader solution concept.

Suppose that $x$ is a strong solution. Then
\[
-x'(t)+f(t)\in A\big(x(t)\big) \qquad \text{for a.a. } t\in T.
\]
Because $A$ is accretive, using Theorem 3.3.10, we obtain
\[
0\le\big(-x'(t)+f(t)-v,\ x(t)-y\big)_+ \qquad \text{for a.a. } t\in T \text{ and all } (y,v)\in\mathrm{Gr}\,A,
\]
so
\[
\big(x'(t),\ x(t)-y\big)_+ \le \big(f(t)-v,\ x(t)-y\big)_+ \qquad \text{for a.a. } t\in T.
\]
By virtue of Lemma 3.3.26, we have that
\[
\big(x'(t),\ x(t)-y\big)_+ = \frac12\frac{d}{dt}\|x(t)-y\|_X^2 \qquad \text{for a.a. } t\in T.
\]
So we obtain
\[
\frac12\frac{d}{dt}\|x(t)-y\|_X^2 \le \big(f(t)-v,\ x(t)-y\big)_+ \qquad \text{for a.a. } t\in T.
\]
Integrating both sides of this inequality over $[s,t]\subseteq[0,b]$, we obtain
\[
\|x(t)-y\|_X^2 \le \|x(s)-y\|_X^2+2\int_s^t\big(f(\tau)-v,\ x(\tau)-y\big)_+\,d\tau \qquad \forall\ 0\le s\le t\le b,\ (y,v)\in\mathrm{Gr}\,A.
\]
This leads to the introduction of a new more general solution notion for problem (3.104).


DEFINITION 3.3.65 Let $x_0\in X$ and $f\in L^1(T;X)$. A function $x\colon T\longrightarrow X$ is said to be an integral solution of the Cauchy problem (3.104), if
(a) $x(0)=x_0$;
(b) $x\in C(T;X)$;
(c) for all $0\le s\le t\le b$ and all $(y,v)\in\mathrm{Gr}\,A$, we have
\[
\frac12\|x(t)-y\|_X^2 \le \frac12\|x(s)-y\|_X^2+\int_s^t\big(f(\tau)-v,\ x(\tau)-y\big)_+\,d\tau.
\]

REMARK 3.3.66 The previous discussion shows that every strong solution is also an integral solution. Moreover, for every $x_0\in\overline{D(A)}$, the function
\[
x(t) \stackrel{df}{=} S(t)x_0 = \lim_{n\to+\infty}\Big(\mathrm{id}_X+\frac{t}{n}A\Big)^{-n}x_0
\]
is an integral solution of the autonomous Cauchy problem
\[
\begin{cases}
x'(t)+A\big(x(t)\big)\ni 0 \quad \forall\ t\ge 0,\\
x(0)=x_0.
\end{cases}
\]
For details we refer to Barbu (1976, p. 124). More generally we have the following result due to Bénilan (1972) (see also Barbu (1976, p. 124) and Miyadera (1992, p. 160)).

THEOREM 3.3.67 If $X$ is a Banach space, $A\colon X\supseteq D(A)\longrightarrow 2^X$ is an m-accretive operator, $x_0\in\overline{D(A)}$ and $f\in L^1(T;X)$, then problem (3.104) has a unique integral solution $x(\cdot;f)\in C(T;X)$. Moreover, if $f_1,f_2\in L^1(T;X)$ and
\[
x_1(\cdot)=x(\cdot;f_1), \qquad x_2(\cdot)=x(\cdot;f_2),
\]
we have
\[
\|x_1(t)-x_2(t)\|_X^2 \le \|x_1(s)-x_2(s)\|_X^2+\int_s^t\big(f_1(\tau)-f_2(\tau),\ x_1(\tau)-x_2(\tau)\big)_+\,d\tau
\]
and
\[
\|x_1(t)-x_2(t)\|_X \le \|x_1(s)-x_2(s)\|_X+\int_s^t\|f_1(\tau)-f_2(\tau)\|_X\,d\tau \qquad \forall\ 0\le s\le t\le b.
\]


Now we have all the necessary tools to establish the relation between $S(t)$ and $J_\lambda$ for all $t,\lambda>0$.

PROPOSITION 3.3.68 If $X$ is a Banach space, $A\colon X\supseteq D(A)\longrightarrow 2^X$ is an m-accretive operator and $S(t)\colon\overline{D(A)}\longrightarrow\overline{D(A)}$, $t\ge 0$, is the semigroup of nonexpansive maps generated by $A$, then for all $x_0\in\overline{D(A)}$ and all $t,\lambda>0$, we have
(a) $\|S(t)x_0-x_0\|_X\le\big(2+\frac{t}{\lambda}\big)\|J_\lambda(x_0)-x_0\|_X$;
(b) $\|J_\lambda(x_0)-x_0\|_X\le\frac{2}{t}\big(1+\frac{\lambda}{t}\big)\displaystyle\int_0^t\|S(\tau)x_0-x_0\|_X\,d\tau$.

PROOF (a) From Definition 3.3.55(c) and Theorem 3.3.59, for all $x_0\in\overline{D(A)}$, $(y,v)\in\mathrm{Gr}\,A$ and $t>0$, we have
\begin{align*}
\|S(t)x_0-x_0\|_X
&\le \|S(t)x_0-S(t)y\|_X+\|S(t)y-y\|_X+\|y-x_0\|_X \\
&\le 2\|x_0-y\|_X+\|S(t)y-y\|_X \\
&\le 2\|x_0-y\|_X+t\|v\|_X. \tag{3.105}
\end{align*}
In (3.105), let us set $y=J_\lambda(x_0)$ and $v=A_\lambda(x_0)\in A\big(J_\lambda(x_0)\big)$ (see Proposition 3.3.12(c)). We obtain
\begin{align*}
\|S(t)x_0-x_0\|_X
&\le 2\|x_0-J_\lambda(x_0)\|_X+t\|A_\lambda(x_0)\|_X \\
&\le 2\|x_0-J_\lambda(x_0)\|_X+\frac{t}{\lambda}\|x_0-J_\lambda(x_0)\|_X \\
&= \Big(2+\frac{t}{\lambda}\Big)\|x_0-J_\lambda(x_0)\|_X.
\end{align*}
(b) We know that $x(t)=S(t)x_0$ is the unique integral solution of the autonomous Cauchy problem
\[
\begin{cases}
x'(t)+A\big(x(t)\big)\ni 0 \quad \forall\ t\in T,\\
x(0)=x_0,
\end{cases}
\]
and
\[
\frac12\|S(t)x_0-y\|_X^2 \le \frac12\|x_0-y\|_X^2+\int_0^t\big(-v,\ S(\tau)x_0-y\big)_+\,d\tau \qquad \forall\ (y,v)\in\mathrm{Gr}\,A,\ t\ge 0
\]
(see Definition 3.3.65 and Remark 3.3.66). Using Definition 3.3.5 and Lemma 3.3.27, we obtain
\[
\|S(t)x_0-y\|_X \le \|x_0-y\|_X+\frac{1}{\lambda}\int_0^t\big(\|S(\tau)x_0-y-\lambda v\|_X-\|S(\tau)x_0-y\|_X\big)\,d\tau \qquad \forall\ \lambda>0.
\]
Let
\[
y=J_\lambda(x_0) \quad\text{and}\quad v=A_\lambda(x_0).
\]
We have
\[
\|S(t)x_0-J_\lambda(x_0)\|_X \le \|x_0-J_\lambda(x_0)\|_X+\frac{1}{\lambda}\int_0^t\big(\|S(\tau)x_0-x_0\|_X-\|S(\tau)x_0-J_\lambda(x_0)\|_X\big)\,d\tau. \tag{3.106}
\]
From the triangle inequality, we have
\[
-\|S(t)x_0-x_0\|_X \le \|S(t)x_0-J_\lambda(x_0)\|_X-\|J_\lambda(x_0)-x_0\|_X.
\]
Using this in (3.106), we obtain
\[
-\|S(t)x_0-x_0\|_X+\|J_\lambda(x_0)-x_0\|_X \le \|J_\lambda(x_0)-x_0\|_X+\frac{1}{\lambda}\int_0^t\big(2\|S(\tau)x_0-x_0\|_X-\|J_\lambda(x_0)-x_0\|_X\big)\,d\tau,
\]
so
\[
\|J_\lambda(x_0)-x_0\|_X \le \frac{\lambda}{t}\|S(t)x_0-x_0\|_X+\frac{2}{t}\int_0^t\|S(\tau)x_0-x_0\|_X\,d\tau
\]
and from Lemma 3.3.64, we have
\[
\|J_\lambda(x_0)-x_0\|_X \le \frac{2}{t}\Big(1+\frac{\lambda}{t}\Big)\int_0^t\|S(\tau)x_0-x_0\|_X\,d\tau.
\]
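Both estimates of Proposition 3.3.68 can be tested on the assumed scalar example $A(y)=y^3$ on $\mathbb{R}$, for which $S(t)x=x/\sqrt{1+2tx^2}$ and $J_\lambda(x)$ is the root of $y+\lambda y^3=x$. This is only a sanity check on a handful of parameter values; all helper names are hypothetical.

```python
import numpy as np

def flow(t, x):
    """S(t)x for x' = -x**3."""
    return x / np.sqrt(1.0 + 2.0 * t * x**2)

def resolvent(lam, x, iters=200):
    """J_lam(x): the root of y + lam*y**3 = x, found by bisection."""
    lo, hi = -abs(x), abs(x)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + lam * mid**3 < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def integral_of_displacement(t, x0, steps=4000):
    """int_0^t |S(tau)x0 - x0| d tau, by the trapezoidal rule."""
    tau = np.linspace(0.0, t, steps + 1)
    g = np.abs(flow(tau, x0) - x0)
    h = t / steps
    return h * (g.sum() - 0.5 * (g[0] + g[-1]))

def check(t, lam, x0, tol=1e-10):
    """Return True if parts (a) and (b) both hold for the given data."""
    gap_J = abs(resolvent(lam, x0) - x0)
    part_a = abs(flow(t, x0) - x0) <= (2.0 + t / lam) * gap_J + tol
    part_b = gap_J <= (2.0 / t) * (1.0 + lam / t) * integral_of_displacement(t, x0) + tol
    return bool(part_a and part_b)

results = [
    check(t, lam, x0)
    for t in (0.5, 1.0, 2.0)
    for lam in (0.1, 1.0, 5.0)
    for x0 in (-1.5, 1.0)
]
```

Part (a) controls the semigroup by the resolvent and part (b) does the reverse, which is why the two objects share their compactness properties in the results that follow.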


LEMMA 3.3.69 If $X$ is a Banach space, $C\subseteq X$ is a nonempty, closed set, $f_n\colon C\longrightarrow X$ are compact maps for $n\ge 1$ and
\[
f_n(x)\longrightarrow f(x) \quad \text{in } X,
\]
uniformly on bounded subsets of $C$, then $f\colon C\longrightarrow X$ is compact.

PROOF Clearly $f\colon C\longrightarrow X$ is continuous. Next let $B\subseteq C$ be a bounded set. Then for a given $\varepsilon>0$, we can find $n_0=n_0(\varepsilon,B)\ge 1$, such that
\[
\|f_n(x)-f(x)\|_X<\frac{\varepsilon}{2} \qquad \forall\ n\ge n_0,\ x\in B. \tag{3.107}
\]
For $n\ge n_0$, the set $\overline{f_n(B)}$ is compact in $X$. So we can find $\{x_k\}_{k=1}^{N_0}$, where $N_0=N_0(n,\varepsilon)\ge 1$, such that
\[
f_n(B)\subseteq\bigcup_{k=1}^{N_0}B_{\frac{\varepsilon}{2}}(x_k). \tag{3.108}
\]
Let $x\in B$. From (3.108), we see that there exists $k\in\{1,\ldots,N_0\}$, such that
\[
\|f_n(x)-x_k\|_X<\frac{\varepsilon}{2}. \tag{3.109}
\]
Therefore using (3.107) and (3.109), we have
\[
\|f(x)-x_k\|_X \le \|f(x)-f_n(x)\|_X+\|f_n(x)-x_k\|_X<\varepsilon,
\]
so $f(x)\in B_\varepsilon(x_k)$ and thus
\[
f(B)\subseteq\bigcup_{k=1}^{N_0}B_\varepsilon(x_k),
\]
i.e., $f(B)$ is totally bounded, thus relatively compact in $X$.

DEFINITION 3.3.70 Let $X$ be a Banach space, $C\subseteq X$ nonempty, closed and let $S(t)\colon C\longrightarrow C$, $t\ge 0$, be a semigroup of nonexpansive maps. We say that $S$ is equicontinuous (respectively weakly equicontinuous) if for each bounded set $B\subseteq C$, the family of functions $\{S(\cdot)x\}_{x\in B}$ is equicontinuous (respectively weakly equicontinuous) at each $t>0$.


As expected compactness and equicontinuity of nonlinear semigroups are closely related.

PROPOSITION 3.3.71 If $X$ is a Banach space, $C\subseteq X$ is a nonempty, closed set and $S(t)\colon C\longrightarrow C$, $t\ge 0$, is a compact semigroup of nonexpansive maps, then $S$ is equicontinuous.

PROOF Let $B\subseteq C$ be a bounded set, let $\varepsilon>0$, $t>0$ and choose $r\in(0,t)$. Because $\overline{S(t-r)B}$ is compact, we can find $\{x_k\}_{k=1}^{N(\varepsilon)}\subseteq B$, such that
\[
S(t-r)B\subseteq\bigcup_{k=1}^{N(\varepsilon)}B_{\frac{\varepsilon}{3}}\big(S(t-r)x_k\big). \tag{3.110}
\]
For each $k\in\{1,\ldots,N(\varepsilon)\}$, the map $t\longmapsto S(t)x_k$ is continuous on $\mathbb{R}_+$ and so we can find $\delta=\delta(\varepsilon,t)\in(0,r)$, such that
\[
\|S(t+h)x_k-S(t)x_k\|_X\le\frac{\varepsilon}{3} \qquad \forall\ k\in\{1,\ldots,N(\varepsilon)\},\ h\in[-\delta,\delta]. \tag{3.111}
\]
Then because of (3.110), for each $x\in B$, we can find $k\in\{1,\ldots,N(\varepsilon)\}$, such that
\[
\|S(t-r)x-S(t-r)x_k\|_X\le\frac{\varepsilon}{3}. \tag{3.112}
\]
So using the semigroup property and (3.111) and (3.112), we obtain
\begin{align*}
\|S(t+h)x-S(t)x\|_X
&\le \|S(t+h)x-S(t+h)x_k\|_X+\|S(t+h)x_k-S(t)x_k\|_X+\|S(t)x_k-S(t)x\|_X \\
&\le 2\|S(t-r)x-S(t-r)x_k\|_X+\|S(t+h)x_k-S(t)x_k\|_X \le \varepsilon \qquad \forall\ x\in B,\ h\in[-\delta,\delta].
\end{align*}
This proves that $\{S(\cdot)x\}_{x\in B}$ is equicontinuous at every $t>0$.

Now we are ready to present a characterization of compact nonlinear semigroups.


THEOREM 3.3.72 If $X$ is a Banach space, $A\colon X\supseteq D(A)\longrightarrow 2^X$ is an m-accretive operator and $S(t)\colon\overline{D(A)}\longrightarrow\overline{D(A)}$, $t\ge 0$, is the semigroup of nonexpansive maps generated by $A$ according to Theorem 3.3.59, then the following statements are equivalent:
(a) the semigroup $S$ is compact;
(b) for each $\lambda>0$, $J_\lambda$ is a compact map and the semigroup $S$ is equicontinuous.

PROOF "(a)$\Longrightarrow$(b)": The equicontinuity of $S$ follows from Proposition 3.3.71. So we need to show that for each $\lambda>0$, $J_\lambda$ is a compact map. From Theorem 3.3.59, we have
\[
\|S(t)\circ J_\lambda(x)-J_\lambda(x)\|_X \le t\|A_\lambda(x)\|_X = \frac{t}{\lambda}\|x-J_\lambda(x)\|_X \qquad \forall\ x\in X,\ \lambda>0 \tag{3.113}
\]
(recall that $A_\lambda(x)\in A\big(J_\lambda(x)\big)$; see Proposition 3.3.12(c)). Because $J_\lambda$ is nonexpansive, it maps bounded sets to bounded sets. So from (3.113), it follows that $S(t)\circ J_\lambda\longrightarrow J_\lambda$ as $t\to 0^+$, uniformly on bounded sets of $X$. But $S(t)\circ J_\lambda$ is compact, since $S(t)$ is. Therefore Lemma 3.3.69 implies that $J_\lambda$ is compact.

"(b)$\Longrightarrow$(a)": Using Proposition 3.3.68(b), we have
\[
\big\|\big(J_\lambda\circ S(t)\big)(x)-S(t)x\big\|_X \le \frac{4}{\lambda}\int_0^\lambda\|S(t+\tau)x-S(t)x\|_X\,d\tau \qquad \forall\ \lambda,t>0,\ x\in\overline{D(A)}. \tag{3.114}
\]
Because $S$ is equicontinuous, for every bounded set $B\subseteq\overline{D(A)}$ and for each $t>0$, we can find $\omega\colon\mathbb{R}_+\longrightarrow\mathbb{R}_+$, such that
\[
\lim_{r\searrow 0}\omega(r)=0
\]
and
\[
\|S(t+\tau)x-S(t)x\|_X\le\omega(\tau) \qquad \forall\ \tau\ge 0,\ x\in B. \tag{3.115}
\]
Using (3.115) in (3.114), we obtain
\[
\big\|\big(J_\lambda\circ S(t)\big)(x)-S(t)x\big\|_X \le 4\sup_{\tau\in[0,\lambda]}\omega(\tau) \qquad \forall\ x\in B,
\]
so $J_\lambda\circ S(t)\longrightarrow S(t)$ as $\lambda\searrow 0$, uniformly on bounded subsets of $\overline{D(A)}$. Note that $J_\lambda\circ S(t)$ is compact (see the first part of (b)). So by Lemma 3.3.69, $S(t)$ is compact for all $t>0$.

In the case of linear semigroups, the above theorem takes the following form.

COROLLARY 3.3.73 If $X$ is a Banach space, $A\colon X\supseteq D(A)\longrightarrow X$ is a densely defined, linear, m-accretive operator and $S(t)\colon X\longrightarrow X$, $t\ge 0$, is the contraction semigroup generated by $-A$ according to Theorem 3.3.49, then the following statements are equivalent:
(a) the semigroup $S$ is compact;
(b) for each $\lambda>0$, $J_\lambda$ is a compact operator and the map $t\longmapsto S(t)$ is continuous from $\mathbb{R}_+$ into $\mathcal{L}(X)$ with the operator norm topology.

REMARK 3.3.74 Using the resolvent identity (see Remark 3.3.13), we see that in Theorem 3.3.72 and in Corollary 3.3.73, the map $J_\lambda$ is compact for all $\lambda>0$ if and only if it is compact for some $\lambda>0$.

The next proposition gives an equivalent condition for the map $J_\lambda$ to be compact.

PROPOSITION 3.3.75 If $X$ is a Banach space and $A\colon X\supseteq D(A)\longrightarrow 2^X$ is an m-accretive operator, then the following statements are equivalent:
(a) for each $\lambda>0$, $J_\lambda$ is a compact map;
(b) for every $m>0$, the level set
\[
L_m \stackrel{df}{=} \big\{x\in D(A):\ \|x\|_X+|A(x)|\le m\big\}
\]
is relatively compact in $X$.

PROOF "(a)$\Longrightarrow$(b)": From Proposition 3.3.12(d), we have
\[
\|A_\lambda(x)\|_X \le |A(x)| \le m \qquad \forall\ x\in L_m,\ \lambda>0,
\]
so
\[
\|x-J_\lambda(x)\|_X \le m\lambda \qquad \forall\ x\in L_m,\ \lambda>0
\]
and thus
\[
J_\lambda\longrightarrow\mathrm{id}_{L_m} \quad \text{as } \lambda\searrow 0, \text{ uniformly on } L_m.
\]
From Lemma 3.3.69, it follows that $L_m$ (which is bounded) is relatively compact.

"(b)$\Longrightarrow$(a)": Let $B\subseteq X$ be bounded and $\lambda>0$. Since $J_\lambda$ is nonexpansive, $J_\lambda(B)$ is bounded. Because
\[
A_\lambda(x)\in A\big(J_\lambda(x)\big) \qquad \forall\ x\in X,
\]
we have
\[
\|J_\lambda(x)\|_X+\big|A\big(J_\lambda(x)\big)\big| \le \|J_\lambda(x)\|_X+\|A_\lambda(x)\|_X = \|J_\lambda(x)\|_X+\frac{1}{\lambda}\|x-J_\lambda(x)\|_X \qquad \forall\ x\in X.
\]
So there exists $m>0$ large enough, such that $J_\lambda(B)\subseteq L_m$, hence $\overline{J_\lambda(B)}$ is compact and so $J_\lambda$ is a compact map.

In Section 4.3, we will return to semigroups, when we will examine the subdifferential of a convex function. Before concluding this section, we would like to make an interesting remark concerning accretive operators.

REMARK 3.3.76 In a Hilbert space maximal accretivity (i.e., maximal monotonicity) and m-accretivity coincide (see Theorem 3.2.29). In a general Banach space this is no longer true. For a counterexample we refer to Crandall & Liggett (1971) (see also Miyadera (1992, pp. 42-44)).

3.4 The Nemytskii Operator and Integral Functions

In this section first we examine the Nemytskii (or superposition) operator, which is an important nonlinear operator that arises in many applications and then we pass to the study of nonlinear integral functionals, which leads naturally to the topic of the next section, which is the theory of Young measures.

Consider a set $\Omega$, which in most cases is a measure space or a metric space or both and let $X$, $Y$ be two Hausdorff topological spaces, which in our analysis as well as in most applications are either Euclidean spaces or Banach spaces. Let $f\colon\Omega\times X\longrightarrow Y$ and consider the nonlinear operator
\[
N_f(u)(z) \stackrel{df}{=} f\big(z,u(z)\big) \qquad \forall\ z\in\Omega,
\]
which to each function $u\colon\Omega\longrightarrow X$ assigns the $Y$-valued function $z\longmapsto f\big(z,u(z)\big)$. This operator is known in the literature as the Nemytskii operator corresponding to the function $f$ (also known as the superposition operator of $f$, or the composition operator of $f$, or the substitution operator of $f$). Since in many applications the Nemytskii operator $N_f$ acts on a Lebesgue space $L^p$, it is important to know under what conditions $N_f$ maps $L^p$ into another Lebesgue space $L^r$. It turns out that this leads to a particular growth condition on $f$, namely $f(z,x)=O\big(|x|^{\frac{p}{r}}\big)$, which is both a necessary and a sufficient condition for $N_f$ to act between $L^p$ and $L^r$. This is the well known Krasnoselskii's theorem, which here we prove in a more general form, namely when $N_f$ acts on Lebesgue-Bochner spaces. We start with a definition.

DEFINITION 3.4.1 Let $(\Omega,\Sigma)$ be a measurable space and let $X$, $Y$ be two Hausdorff spaces. A function $f\colon\Omega\times X\longrightarrow Y$ is said to be a Carathéodory function, if
(a) for every $x\in X$, the function $z\longmapsto f(z,x)$ is $\big(\Sigma,\mathcal{B}(Y)\big)$-measurable, with $\mathcal{B}(Y)$ being the Borel $\sigma$-field of $Y$;
(b) for every $z\in\Omega$, the function $x\longmapsto f(z,x)$ is continuous.

REMARK 3.4.2 If $X$ is a separable metric space and $Y$ is a metric space, then the function $(z,x)\longmapsto f(z,x)$ is $\Sigma\times\mathcal{B}(X)$-measurable, with $\mathcal{B}(X)$ being the Borel $\sigma$-field of $X$ (i.e., $f$ is jointly measurable). Therefore $f$ is sup-measurable (superpositionally measurable), meaning that for every measurable function $u\colon\Omega\longrightarrow X$, the function $z\longmapsto f\big(z,u(z)\big)$ is measurable, i.e., the Nemytskii operator $N_f$ maps measurable functions to measurable ones (for details see Denkowski, Migórski & Papageorgiou (2003a, pp. 189-190)).
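On a grid the Nemytskii operator is simply pointwise evaluation, and the effect of the growth condition $f(z,x)=O\big(|x|^{\frac{p}{r}}\big)$ can be seen directly. The following sketch is an illustration under assumptions: $\Omega=(0,1)$ is replaced by a midpoint grid, and the functions `a`, `f` and `u` as well as the constants are hypothetical choices satisfying $|f(z,x)|\le a(z)+c|x|^{\frac{p}{r}}$.

```python
import numpy as np

def nemytskii(f):
    """Build N_f: given samples u(z_i) on a grid z, return the samples f(z_i, u(z_i))."""
    return lambda z, u_values: f(z, u_values)

def lebesgue_norm(values, weights, p):
    """Discrete L^p norm (sum_i |v_i|^p w_i)^(1/p) with quadrature weights w_i."""
    return float(np.sum(np.abs(values) ** p * weights) ** (1.0 / p))

p, r, c = 4.0, 2.0, 1.5
a = lambda z: 1.0 + z                                  # hypothetical a in L^r(0, 1)
f = lambda z, x: a(z) * np.cos(x) + c * np.abs(x) ** (p / r)

z = (np.arange(1000) + 0.5) / 1000                     # midpoints of a uniform grid on (0, 1)
w = np.full_like(z, 1.0 / 1000)                        # midpoint-rule weights
u = 3.0 * np.sin(2.0 * np.pi * z)                      # a sample element of L^p(0, 1)

N_f = nemytskii(f)
lhs = lebesgue_norm(N_f(z, u), w, r)                   # ||N_f(u)||_{L^r}
rhs = lebesgue_norm(a(z), w, r) + c * lebesgue_norm(u, w, p) ** (p / r)
```

The comparison `lhs <= rhs` is the discrete form of $\|N_f(u)\|_{L^r}\le\|a\|_{L^r}+c\|u\|_{L^p}^{\frac{p}{r}}$, which follows from the growth condition and the Minkowski inequality because $\big\||u|^{\frac{p}{r}}\big\|_{L^r}=\|u\|_{L^p}^{\frac{p}{r}}$.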


In what follows, to avoid repeating the same hypotheses, we fix $(\Omega,\Sigma,\mu)$ to be a nonatomic, $\sigma$-finite, complete measure space (in applications usually $\Omega$ is a subset of $\mathbb{R}^N$, equipped with the Lebesgue measure) and $X$, $Y$ are two separable Banach spaces.

LEMMA 3.4.3 If $h\colon\Omega\times X\longrightarrow\mathbb{R}_+$ is a Carathéodory function, such that $h(z,0)=0$ for all $z\in\Omega$ and
\[
\|N_h(u)\|_{L^r(\Omega)}^r \le c^r \qquad \forall\ u\in L^p(\Omega;X),
\]
for some $c>0$, then
\[
\mu(E_k)=0 \qquad \forall\ k\ge 1,
\]
where
\[
E_k \stackrel{df}{=} \Big\{z\in\Omega:\ \sup_{\|x\|_X\le k}h(z,x)=+\infty\Big\} \qquad \forall\ k\ge 1.
\]

PROOF Suppose that for some $k\ge 1$, we have $\mu(E_k)\ne 0$. Because the measure space is nonatomic, $\sigma$-finite, we can find $B_k\in\Sigma$, such that
\[
B_k\subseteq E_k \quad\text{and}\quad 0<\mu(B_k)<+\infty.
\]
For every $z\in B_k$, we set
\[
S_k(z) \stackrel{df}{=} \Big\{x\in X:\ \|x\|_X\le k,\ h(z,x)^r\ge\frac{2c^r}{\mu(B_k)}\Big\}.
\]
Evidently
\[
S_k(z)\ne\emptyset \qquad \forall\ z\in B_k
\]
and $\mathrm{Gr}\,S_k\in(\Sigma\cap B_k)\times\mathcal{B}(X)$, with $\mathcal{B}(X)$ being the Borel $\sigma$-field of $X$. We apply the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) and obtain a $\big(\Sigma,\mathcal{B}(X)\big)$-measurable map $u_k\colon B_k\longrightarrow X$ such that
\[
u_k(z)\in S_k(z) \qquad \forall\ z\in B_k.
\]
We extend $u_k$ to all of $\Omega$ by setting $u_k(z)=0$ if $z\in\Omega\setminus B_k$. Since $h(z,0)=0$ for all $z\in\Omega$ and $u_k\in L^p(\Omega;X)$, we have
\[
\int_\Omega h\big(z,u_k(z)\big)^r\,d\mu = \int_{B_k}h\big(z,u_k(z)\big)^r\,d\mu \ge 2c^r,
\]
a contradiction.


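Using this lemma, we can prove the general version of Krasnoselskii's theorem for $N_f$.

THEOREM 3.4.4 If $f\colon\Omega\times X\longrightarrow Y$ is a Carathéodory function, $p,r\in[1,+\infty)$ and $N_f$ maps $L^p(\Omega;X)$ into $L^r(\Omega;Y)$, then $N_f$ is continuous, bounded (i.e., maps bounded sets into bounded sets) and there exist $a\in L^r(\Omega)_+$ and $c>0$, such that
\[
\|f(z,x)\|_Y \le a(z)+c\,\|x\|_X^{\frac{p}{r}} \qquad \text{for } \mu\text{-a.a. } z\in\Omega.
\]

PROOF Let $\{u_n\}_{n\ge 1}\subseteq L^p(\Omega;X)$ be a sequence, such that
\[
u_n\longrightarrow u \quad \text{in } L^p(\Omega;X),
\]
for some $u\in L^p(\Omega;X)$. Let $g\colon\Omega\times X\longrightarrow\mathbb{R}$ be defined by
\[
g(z,x) \stackrel{df}{=} \big\|f\big(z,x+u(z)\big)-f\big(z,u(z)\big)\big\|_Y^r.
\]
We pick a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$, such that
\[
\|u_{n_k}-u\|_{L^p(\Omega;X)}^p \le \frac{1}{2^k} \qquad \forall\ k\ge 1
\]
and $u_{n_k}(z)\longrightarrow u(z)$ for $\mu$-a.a. $z\in\Omega$. Let
\[
v_k \stackrel{df}{=} u_{n_k}-u \qquad \forall\ k\ge 1.
\]
We have
\[
v_k(z)\longrightarrow 0 \qquad \text{for } \mu\text{-a.a. } z\in\Omega
\]
and so
\[
g\big(z,v_k(z)\big)\longrightarrow 0 \qquad \text{for } \mu\text{-a.a. } z\in\Omega \text{ as } k\to+\infty.
\]
Because $g(z,x)\ge 0$ for all $(z,x)\in\Omega\times X$ and $v_k(z)\longrightarrow 0$ for $\mu$-a.a. $z\in\Omega$, we can find $k(z)\in\mathbb{N}$, such that
\[
\xi(z) = \sup_{k\ge 1}g\big(z,v_k(z)\big) = g\big(z,v_{k(z)}(z)\big).
\]
Let
\[
\widehat{v}(z) \stackrel{df}{=} v_{k(z)}(z).
\]
Since $\xi$ is $\Sigma$-measurable, we see that the function $z\longmapsto\widehat{v}(z)$ is $\Sigma$-measurable. Moreover, we have
\[
\int_\Omega\|\widehat{v}(z)\|_X^p\,d\mu \le \int_\Omega\sup_{k\ge 1}\|v_k(z)\|_X^p\,d\mu \le \sum_{k=1}^{\infty}\|v_k\|_{L^p(\Omega;X)}^p = \sum_{k=1}^{\infty}\|u_{n_k}-u\|_{L^p(\Omega;X)}^p<+\infty,
\]
so $\widehat{v}\in L^p(\Omega;X)$. Then from the definition of $g$ and the hypothesis that $N_f$ maps $L^p(\Omega;X)$ into $L^r(\Omega;Y)$, we infer that
\[
g\big(\cdot,\widehat{v}(\cdot)\big)\in L^1(\Omega)_+.
\]
Since
\[
g\big(z,v_k(z)\big)\le g\big(z,\widehat{v}(z)\big) \qquad \forall\ z\in\Omega,\ k\ge 1
\]
and
\[
g\big(z,v_k(z)\big)\longrightarrow 0 \qquad \text{for } \mu\text{-a.a. } z\in\Omega,
\]
from the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that
\[
\int_\Omega g\big(z,v_k(z)\big)\,d\mu\longrightarrow 0.
\]
Therefore
\[
N_f\big(u_{n_k}\big)\longrightarrow N_f(u) \quad \text{in } L^r(\Omega;Y).
\]
Since every subsequence of $\{N_f(u_n)\}_{n\ge 1}$ has a further subsequence converging in $L^r(\Omega;Y)$ to $N_f(u)$, we conclude that
\[
N_f(u_n)\longrightarrow N_f(u) \quad \text{in } L^r(\Omega;Y)
\]
and so the map $N_f\colon L^p(\Omega;X)\longrightarrow L^r(\Omega;Y)$ is continuous.

Next we prove the boundedness of $N_f$. For $u\in L^p(\Omega;X)$, let
\[
\widehat{f}(z,x) \stackrel{df}{=} f\big(z,x+u(z)\big)-f\big(z,u(z)\big).
\]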


Evidently $\widehat{f}$ is a Carathéodory function, $N_{\widehat{f}}$ maps $L^p(\Omega;X)$ into $L^r(\Omega;Y)$ and in addition
\[
\widehat{f}(z,0)=0 \qquad \forall\ z\in\Omega.
\]
So without any loss of generality, we may assume that
\[
f(z,0)=0 \qquad \forall\ z\in\Omega.
\]
Since $N_f$ is continuous at $0$, we can find $\varrho>0$, such that
\[
\|N_f(u)\|_{L^r(\Omega;Y)}\le 1 \qquad \forall\ \|u\|_{L^p(\Omega;X)}\le\varrho.
\]
Then take an arbitrary $u\in L^p(\Omega;X)$ and let $n\ge 1$ be an integer, such that
\[
n\varrho^p \le \|u\|_{L^p(\Omega;X)}^p \le (n+1)\varrho^p.
\]
We write
\[
\Omega=\bigcup_{k=1}^{n+1}\Omega_k
\]
as a disjoint union, such that
\[
\|u\|_{L^p(\Omega_k;X)}^p\le\varrho^p \qquad \forall\ k\in\{1,\ldots,n+1\}.
\]
Then we have
\[
\int_\Omega\big\|f\big(z,u(z)\big)\big\|_Y^r\,d\mu = \sum_{k=1}^{n+1}\int_{\Omega_k}\big\|f\big(z,u(z)\big)\big\|_Y^r\,d\mu \le n+1 \le \Big(\frac{\|u\|_{L^p(\Omega;X)}}{\varrho}\Big)^p+1,
\]
which proves that $N_f$ is bounded.

Finally we prove the growth condition. Since $N_f$ is bounded, we can find $c>0$, such that
\[
\|N_f(u)\|_{L^r(\Omega;Y)}\le c \qquad \forall\ \|u\|_{L^p(\Omega;X)}\le 1. \tag{3.116}
\]
Let $h\colon\Omega\times X\longrightarrow\mathbb{R}$ be defined by
\[
h(z,x) \stackrel{df}{=} \Big[\|f(z,x)\|_Y-c\,\|x\|_X^{\frac{p}{r}}\Big]^+.
\]
Using the inequality which says that
\[
(\xi_1-\xi_2)^r \le \xi_1^r-\xi_2^r \qquad \forall\ \xi_1\ge\xi_2\ge 0,
\]


we have
\[
h(z,x)^r \le \|f(z,x)\|_Y^r-c^r\|x\|_X^p \qquad \text{when } h(z,x)>0. \tag{3.117}
\]
Let $u\in L^p(\Omega;X)$ and let
\[
C \stackrel{df}{=} \big\{z\in\Omega:\ h\big(z,u(z)\big)>0\big\}.
\]
Then we can find an integer $n\ge 1$ and $\varepsilon\in[0,1)$, such that
\[
\int_C\|u(z)\|_X^p\,d\mu = n+\varepsilon.
\]
So we can write
\[
C=\bigcup_{k=1}^{n+1}C_k,
\]
a disjoint union, such that
\[
\int_{C_k}\|u(z)\|_X^p\,d\mu\le 1 \qquad \forall\ k\in\{1,\ldots,n+1\}.
\]
Then assuming as before without any loss of generality, that
\[
f(z,0)=0 \qquad \forall\ z\in\Omega
\]
and using (3.116), we obtain
\[
\int_C\big\|f\big(z,u(z)\big)\big\|_Y^r\,d\mu = \sum_{k=1}^{n+1}\int_{C_k}\big\|f\big(z,u(z)\big)\big\|_Y^r\,d\mu \le (n+1)c^r. \tag{3.118}
\]
Returning to (3.117) and using (3.118), we have
\[
\int_\Omega h\big(z,u(z)\big)^r\,d\mu \le (n+1)c^r-(n+\varepsilon)c^r \le c^r \qquad \forall\ u\in L^p(\Omega;X). \tag{3.119}
\]
So by virtue of Lemma 3.4.3, we have
\[
\mu\Big(\Big\{z\in\Omega:\ \sup_{\|x\|_X\le k}h(z,x)=+\infty\Big\}\Big)=0 \qquad \forall\ k\ge 1. \tag{3.120}
\]
Since by hypothesis the measure space is $\sigma$-finite, we can find $\{D_k\}_{k\ge 1}\subseteq\Sigma$, such that
\[
\Omega=\bigcup_{k=1}^{\infty}D_k \quad\text{and}\quad \mu(D_k)<+\infty \qquad \forall\ k\ge 1.
\]


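For $z\in D_k$, let
\[
V_k(z) \stackrel{df}{=} \Big\{x\in X:\ \|x\|_X\le k,\ \sup_{\|y\|_X\le k}h(z,y)<+\infty,\ \sup_{\|y\|_X\le k}h(z,y)-\frac{1}{k}\le h(z,x)\Big\}.
\]
Because of (3.120),
\[
V_k(z)\ne\emptyset \qquad \text{for } \mu\text{-a.a. } z\in D_k.
\]
Also we have
\[
\mathrm{Gr}\,V_k\in(\Sigma\cap D_k)\times\mathcal{B}(X).
\]
So the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) gives a $\big(\Sigma,\mathcal{B}(X)\big)$-measurable map $v_k\colon D_k\longrightarrow X$, such that
\[
v_k(z)\in V_k(z) \qquad \forall\ z\in D_k.
\]
Extend $v_k$ to all of $\Omega$ by setting $v_k|_{\Omega\setminus D_k}=0$. Let
\[
a(z) \stackrel{df}{=} \sup_{x\in X}h(z,x).
\]
Because $h$ is a Carathéodory function and $X$ is separable, $a$ is $\Sigma$-measurable. Also we have
\[
\sup_{\|x\|_X\le k}h(z,x)-\frac{1}{k} \le h\big(z,v_k(z)\big) \le a(z) \qquad \text{for } \mu\text{-a.a. } z\in\Omega,
\]
so
\[
h\big(z,v_k(z)\big)\longrightarrow a(z) \qquad \text{for } \mu\text{-a.a. } z\in\Omega, \text{ as } k\to+\infty.
\]
Note that $v_k\in L^p(\Omega;X)$ and so from (3.119), we have
\[
\int_\Omega h\big(z,v_k(z)\big)^r\,d\mu \le c^r \qquad \forall\ k\ge 1.
\]
As $h\ge 0$, we can apply Fatou's lemma (see Theorem A.2.1) and obtain
\[
\int_\Omega a(z)^r\,d\mu \le \liminf_{k\to+\infty}\int_\Omega h\big(z,v_k(z)\big)^r\,d\mu \le c^r
\]
and thus $a\in L^r(\Omega)$. Recalling the definition of $h(z,x)$, we conclude that
\[
\|f(z,x)\|_Y \le a(z)+c\,\|x\|_X^{\frac{p}{r}} \qquad \text{for } \mu\text{-a.a. } z\in\Omega.
\]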


REMARK 3.4.5

By virtue of Theorem 3.4.4, the growth condition
\[
\|f(z,x)\|_Y\le a(z)+c\,\|x\|_X^{p/r}\quad\text{for }\mu\text{-a.a. }z\in\Omega,
\]
with $a\in L^r(\Omega)$, $c>0$, is both a necessary and a sufficient condition for the continuity and boundedness of the Nemytskii operator $N_f:L^p(\Omega;X)\to L^r(\Omega;Y)$.

If in Theorem 3.4.4 we drop the hypothesis that $f(z,x)$ is a Carathéodory function and we only assume that the function $(z,x)\mapsto f(z,x)$ is $\big(\Sigma\times B(X),B(Y)\big)$-measurable and for all $z\in\Omega$, the function $x\mapsto f(z,x)$ is lower semicontinuous, then we no longer have the continuity of the Nemytskii operator $N_f$, even if it maps $L^p(\Omega;X)$ into $L^r(\Omega;Y)$. To see this let $\Omega=[0,1]$ be equipped with the Lebesgue measure, let $X=Y=\mathbb{R}$ and consider the function
\[
f(x)\stackrel{df}{=}\begin{cases}1 & \text{if } x\neq 0,\\ 0 & \text{if } x=0.\end{cases}
\]
Then $N_f$ maps $L^p(\Omega)$ to $L^r(\Omega)$ for every $r\in[1,+\infty)$. However, if we consider
\[
x_n(z)\stackrel{df}{=}\frac{z}{n},
\]
then $x_n\to 0$ in $L^p(\Omega)$ and $N_f(0)=0$, but $N_f(x_n)$ does not converge in measure to zero.

Also if $r=+\infty$ and $N_f$ maps $L^p(\Omega;X)$ into $L^\infty(\Omega;Y)$ ($f$ is still a Carathéodory function), then $N_f$ is again bounded and there exists $M>0$, such that
\[
\|f(z,x)\|_Y\le M\quad\text{for }\mu\text{-a.a. }z\in\Omega\text{ and all }x\in X.
\]
The proof, which is similar to that of Theorem 3.4.4, is left to the reader. However, $N_f:L^p(\Omega;X)\to L^\infty(\Omega;Y)$ is not in general continuous, as the following example illustrates. Let $\Omega=[0,1]$ be equipped with the Lebesgue measure, let $X=Y=\mathbb{R}$ and consider
\[
f(x)\stackrel{df}{=}\begin{cases}-1 & \text{if } x<-1,\\ x & \text{if } -1\le x\le 1,\\ 1 & \text{if } 1<x.\end{cases}
\]
If we take $x_n(z)=z^n$, then $x_n\to 0$ in $L^p[0,1]$, for all $p\in[1,+\infty)$. But $N_f(x_n)$ does not converge to zero in $L^\infty[0,1]$.
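The two counterexamples of Remark 3.4.5 can be checked numerically. The following is a minimal sketch, assuming a uniform grid on $[0,1]$ and a Riemann-sum approximation of the $L^p$ norm; the helper names (`f_jump`, `f_trunc`, `lp_norm`, `first_example`, `second_example`) are illustrative and not taken from the text.

```python
# Numerical illustration of the two counterexamples in Remark 3.4.5.
# Assumption: the L^p[0,1] norm is approximated by a Riemann sum on a uniform grid.

def f_jump(x):
    # lower semicontinuous, but discontinuous at x = 0
    return 0.0 if x == 0.0 else 1.0

def f_trunc(x):
    # truncation to [-1, 1]; continuous in x, hence a Caratheodory function
    return max(-1.0, min(1.0, x))

def lp_norm(values, p):
    # Riemann-sum approximation of the L^p[0,1] norm
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)

def sup_norm(values):
    # maximum over the grid, standing in for the L^infinity norm
    return max(abs(v) for v in values)

GRID = [i / 1000 for i in range(1001)]  # contains z = 0 and z = 1

def first_example(n, p=2):
    # x_n(z) = z / n: returns (||x_n||_p, ||N_f(x_n)||_p) for f = f_jump
    xn = [z / n for z in GRID]
    return lp_norm(xn, p), lp_norm([f_jump(v) for v in xn], p)

def second_example(n, p=2):
    # x_n(z) = z ** n: returns (||x_n||_p, sup |N_f(x_n)|) for f = f_trunc
    xn = [z ** n for z in GRID]
    return lp_norm(xn, p), sup_norm([f_trunc(v) for v in xn])
```

In the first example the $L^p$ norm of $x_n$ shrinks like $1/n$ while the norm of $N_f(x_n)$ stays close to one; in the second the $L^p$ norm of $x_n$ tends to zero while the supremum of $N_f(x_n)$ remains equal to one, because $x_n(1)=1$ for every $n$.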

PROPOSITION 3.4.6
If $f:\Omega\times\mathbb{R}^N\to\mathbb{R}^N$ is a Carathéodory function, for all $z\in\Omega$, $f(z,\cdot)$ is a monotone map and $N_f$ maps $L^p(\Omega;\mathbb{R}^N)$ into $L^{p'}(\Omega;\mathbb{R}^N)$, where $p\in[1,+\infty)$, $\frac{1}{p}+\frac{1}{p'}=1$,
then $N_f:L^p(\Omega;\mathbb{R}^N)\to L^{p'}(\Omega;\mathbb{R}^N)$ is a maximal monotone operator.

PROOF If by $\langle\cdot,\cdot\rangle_{pp'}$ we denote the duality brackets for the pair of spaces $\big(L^{p'}(\Omega;\mathbb{R}^N),L^p(\Omega;\mathbb{R}^N)\big)$, for all $u,v\in L^p(\Omega;\mathbb{R}^N)$, we have
\[
\big\langle N_f(u)-N_f(v),u-v\big\rangle_{pp'}=\int_\Omega\Big(f\big(z,u(z)\big)-f\big(z,v(z)\big),u(z)-v(z)\Big)_{\mathbb{R}^N}\,d\mu\ge 0
\]
due to the monotonicity of $f(z,\cdot)$. Hence $N_f$ is monotone. Moreover, by Theorem 3.4.4, $N_f$ is continuous. Therefore, Proposition 3.2.19 implies that $N_f$ is maximal monotone.

PROPOSITION 3.4.7
If $f:\Omega\times\mathbb{R}^N\to\mathbb{R}^N$ is a Carathéodory function, such that
(i) $f(z,\cdot)$ is a strictly monotone map for $\mu$-almost all $z\in\Omega$;
(ii) $\big(f(z,x),x\big)_{\mathbb{R}^N}\ge c_1\|x\|_{\mathbb{R}^N}^p-a_1(z)$ for $\mu$-almost all $z\in\Omega$ and all $x\in\mathbb{R}^N$, with $a_1\in L^1(\Omega)_+$, $c_1>0$;
(iii) $N_f$ maps $L^p(\Omega;\mathbb{R}^N)$ into $L^{p'}(\Omega;\mathbb{R}^N)$, with $p\in[1,+\infty)$, $\frac{1}{p}+\frac{1}{p'}=1$,
then $N_f:L^p(\Omega;\mathbb{R}^N)\to L^{p'}(\Omega;\mathbb{R}^N)$ is an operator of type $(S)_+$ (see Definition 3.2.55(b)).

PROOF

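Suppose that $\{u_n\}_{n\ge 1}\subseteq L^p(\Omega;\mathbb{R}^N)$ is a sequence, such that
\[
u_n\stackrel{w}{\longrightarrow}u\ \text{ in } L^p(\Omega;\mathbb{R}^N)\quad\text{and}\quad\limsup_{n\to+\infty}\big\langle N_f(u_n)-N_f(u),u_n-u\big\rangle_{pp'}\le 0.
\]
We need to show that
\[
u_n\longrightarrow u\ \text{ in } L^p(\Omega;\mathbb{R}^N).
\]
From the monotonicity of $N_f$ (see Proposition 3.4.6), we have that
\[
\big\langle N_f(u_n)-N_f(u),u_n-u\big\rangle_{pp'}\longrightarrow 0.
\]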

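Note that
\[
\big\langle N_f(u_n)-N_f(u),u_n-u\big\rangle_{pp'}=\int_\Omega\Big(f\big(z,u_n(z)\big)-f\big(z,u(z)\big),u_n(z)-u(z)\Big)_{\mathbb{R}^N}\,d\mu.
\]
Because of the monotonicity of $f(z,\cdot)$, by passing to a subsequence, we may assume that
\[
\beta_n(z)\stackrel{df}{=}\Big(f\big(z,u_n(z)\big)-f\big(z,u(z)\big),u_n(z)-u(z)\Big)_{\mathbb{R}^N}\longrightarrow 0\quad\text{for }\mu\text{-a.a. }z\in\Omega
\]
and
\[
|\beta_n(z)|\le k(z)\quad\text{for }\mu\text{-a.a. }z\in\Omega\text{ and all }n\ge 1,
\]
with $k\in L^1(\Omega)_+$. From Theorem 3.4.4, we know that for $\mu$-almost all $z\in\Omega$ and all $x\in\mathbb{R}^N$, we have
\[
\|f(z,x)\|_{\mathbb{R}^N}\le a(z)+c\,\|x\|_{\mathbb{R}^N}^{p-1},
\]
with $a\in L^{p'}(\Omega)_+$, $c>0$ (recall that $\frac{p}{p'}=p-1$). So for all $z\in\Omega\setminus D$, $\mu(D)=0$ and all $n\ge 1$, we have
\[
\begin{aligned}
k(z)\ge\beta_n(z)\ge{}& c_1\big(\|u_n(z)\|_{\mathbb{R}^N}^p+\|u(z)\|_{\mathbb{R}^N}^p\big)
-\|u_n(z)\|_{\mathbb{R}^N}\big(a(z)+c\,\|u(z)\|_{\mathbb{R}^N}^{p-1}\big)\\
&-\|u(z)\|_{\mathbb{R}^N}\big(a(z)+c\,\|u_n(z)\|_{\mathbb{R}^N}^{p-1}\big)-2a_1(z).
\end{aligned}
\tag{3.121}
\]
Using Young's inequality (see Proposition A.4.5) with $\varepsilon>0$, we have
\[
c\,\|u_n(z)\|_{\mathbb{R}^N}\|u(z)\|_{\mathbb{R}^N}^{p-1}\le\frac{\varepsilon}{p}\|u_n(z)\|_{\mathbb{R}^N}^p+\frac{c^{p'}}{\varepsilon p'}\|u(z)\|_{\mathbb{R}^N}^p
\tag{3.122}
\]
and
\[
c\,\|u(z)\|_{\mathbb{R}^N}\|u_n(z)\|_{\mathbb{R}^N}^{p-1}\le\frac{c^{p'}}{\varepsilon p'}\|u(z)\|_{\mathbb{R}^N}^p+\frac{\varepsilon}{p}\|u_n(z)\|_{\mathbb{R}^N}^p.
\tag{3.123}
\]
Using (3.122) and (3.123) in (3.121), we obtain
\[
\begin{aligned}
c_1\|u_n(z)\|_{\mathbb{R}^N}^p\le{}& k(z)+c_1\|u(z)\|_{\mathbb{R}^N}^p+\frac{\varepsilon}{p}\|u_n(z)\|_{\mathbb{R}^N}^p+\frac{c^{p'}}{\varepsilon p'}\|u(z)\|_{\mathbb{R}^N}^p\\
&+a(z)\big(\|u_n(z)\|_{\mathbb{R}^N}+\|u(z)\|_{\mathbb{R}^N}\big)
+\frac{c^{p'}}{\varepsilon p'}\|u(z)\|_{\mathbb{R}^N}^p+\frac{\varepsilon}{p}\|u_n(z)\|_{\mathbb{R}^N}^p+2a_1(z).
\end{aligned}
\tag{3.124}
\]
Recall that $\frac{1}{p}+\frac{1}{p'}=1$ and choose $\varepsilon<c$. Then from (3.124), it follows that the sequence
\[
\big\{\|u_n(\cdot)\|_{\mathbb{R}^N}^p\big\}_{n\ge 1}\subseteq L^1(\Omega)_+
\]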

is uniformly integrable. Also for all $z\in\Omega\setminus D$, $\mu(D)=0$, the sequence $\{u_n(z)\}_{n\ge 1}\subseteq\mathbb{R}^N$ is bounded. So by passing to a suitable subsequence (depending in general on $z\in\Omega\setminus D$), we may assume that $u_n(z)\to\hat{u}(z)$ in $\mathbb{R}^N$. Recall that $f(z,\cdot)$ is continuous and that $\beta_n(z)\to 0$, so in the limit we obtain
\[
\Big(f\big(z,\hat{u}(z)\big)-f\big(z,u(z)\big),\hat{u}(z)-u(z)\Big)_{\mathbb{R}^N}=0.
\]
Since by hypothesis $f(z,\cdot)$ is strictly monotone, we infer that
\[
\hat{u}(z)=u(z)\quad\forall\ z\in\Omega\setminus D
\]
and so it follows that
\[
u_n(z)\longrightarrow u(z)\ \text{ in }\mathbb{R}^N,\quad\forall\ z\in\Omega\setminus D,\ \mu(D)=0.
\tag{3.125}
\]
From (3.125), the uniform integrability of the sequence $\big\{\|u_n(\cdot)\|_{\mathbb{R}^N}^p\big\}_{n\ge 1}\subseteq L^1(\Omega)_+$ and Vitali's theorem (the extended dominated convergence theorem; see Theorem A.2.9), we obtain that
\[
\|u_n\|_{L^p(\Omega;\mathbb{R}^N)}\longrightarrow\|u\|_{L^p(\Omega;\mathbb{R}^N)}.
\]
Since
\[
u_n\stackrel{w}{\longrightarrow}u\ \text{ in }L^p(\Omega;\mathbb{R}^N)
\]
and the latter space is uniformly convex, from the Kadec-Klee property (see Remark A.3.22), we know that
\[
u_n\longrightarrow u\ \text{ in }L^p(\Omega;\mathbb{R}^N),
\]
hence $N_f$ is of type $(S)_+$.

Next we pass to the study of integral functionals defined on Lebesgue-Bochner spaces. So if $(\Omega,\Sigma,\mu)$ is a nonatomic, complete $\sigma$-finite measure space, $X$ a separable Banach space and $f:\Omega\times X\to\overline{\mathbb{R}}=\mathbb{R}\cup\{+\infty\}$ is a $\Sigma\times B(X)$-measurable function (an integrand), we consider the integral functional
\[
I_f(u)\stackrel{df}{=}\int_\Omega f\big(z,u(z)\big)\,d\mu\quad\forall\ u\in L^p(\Omega;X),
\]
with $p\in[1,+\infty]$. We start with a definition which extends the notion of a Carathéodory function (see Definition 3.4.1).

DEFINITION 3.4.8
Let $(\Omega,\Sigma,\mu)$ be a complete $\sigma$-finite measure space and $X$ a separable metric space. We say that $f:\Omega\times X\to\overline{\mathbb{R}}=\mathbb{R}\cup\{+\infty\}$ is a normal integrand, if
(a) $f$ is $\Sigma\times B(X)$-measurable; and
(b) the function $x\mapsto f(z,x)$ is lower semicontinuous for $\mu$-almost all $z\in\Omega$.

We show that normal integrands which are bounded below can be realized as the upper envelope of a sequence of Carathéodory integrands.

PROPOSITION 3.4.9
If $(\Omega,\Sigma,\mu)$ is a complete $\sigma$-finite measure space, $X$ is a separable metric space with metric $d_X$, $f:\Omega\times X\to\overline{\mathbb{R}}$ is a normal integrand and there exists a function $h:\Omega\to\mathbb{R}$ (not necessarily measurable), such that
\[
h(z)\le f(z,x)\quad\text{for }\mu\text{-a.a. }z\in\Omega\text{ and all }x\in X,
\]
then we can find a sequence of functions $f_n:\Omega\times X\to\mathbb{R}$ for $n\ge 1$, such that for all $n\ge 1$, we have
(a) $h(z)\le f_n(z,x)\le n$ for $\mu$-almost all $z\in\Omega$ and all $x\in X$;
(b) the function $z\mapsto f_n(z,x)$ is measurable for all $x\in X$;
(c) $|f_n(z,x)-f_n(z,v)|\le n\,d_X(x,v)$ for all $z\in\Omega$ and all $x,v\in X$;
(d) $f_n(z,x)\nearrow f(z,x)$ for $\mu$-almost all $z\in\Omega$ and all $x\in X$.

PROOF
For every $n\ge 1$, let
\[
\hat{f}_n(z,x)\stackrel{df}{=}\inf_{y\in X}\big[f(z,y)+n\,d_X(y,x)\big].
\]
Evidently, for all $n\ge 1$, $\mu$-almost all $z\in\Omega$ and all $x\in X$, we have
\[
h(z)\le\hat{f}_1(z,x)\le\ldots\le\hat{f}_n(z,x)\le\hat{f}_{n+1}(z,x)\le\ldots\le f(z,x).
\]
If we fix $n\ge 1$, $x\in X$ and $\lambda\in\mathbb{R}$, we have
\[
\big\{z\in\Omega:\ \hat{f}_n(z,x)<\lambda\big\}=\mathrm{proj}_\Omega\big\{(z,y)\in\Omega\times X:\ f(z,y)+n\,d_X(y,x)<\lambda\big\}.
\]
By virtue of the joint measurability of $f$ (see Definition 3.4.8), we deduce that
\[
C_\lambda\stackrel{df}{=}\big\{(z,y)\in\Omega\times X:\ f(z,y)+n\,d_X(y,x)<\lambda\big\}\in\Sigma\times B(X).
\]

Hence from the Yankov-von Neumann-Aumann projection theorem (see Theorem A.2.32) and since by hypothesis $\Sigma$ is $\mu$-complete, we have
\[
\mathrm{proj}_\Omega C_\lambda\in\Sigma
\]
and so the function $z\mapsto\hat{f}_n(z,x)$ is measurable for all $x\in X$ and all $n\ge 1$. From the definition of $\hat{f}_n$, we have
\[
\hat{f}_n(z,x)\le f(z,y)+n\,d_X(y,x)\quad\forall\ (z,y)\in\Omega\times X,
\]
so
\[
\hat{f}_n(z,x)\le f(z,y)+n\,d_X(y,v)+n\,d_X(v,x)\quad\forall\ v\in X
\]
and thus
\[
\hat{f}_n(z,x)-\hat{f}_n(z,v)\le n\,d_X(x,v).
\]
Interchanging the roles of $x$ and $v$ in the above argument, we conclude that
\[
\big|\hat{f}_n(z,x)-\hat{f}_n(z,v)\big|\le n\,d_X(x,v)\quad\forall\ z\in\Omega,\ x,v\in X.
\]
Therefore $\hat{f}_n$ is $\Sigma\times B(X)$-measurable (see Remark 3.4.2). Moreover, since for all $(z,x)\in\Omega\times X$, the sequence $\{\hat{f}_n(z,x)\}_{n\ge 1}$ is increasing, we have
\[
\lim_{n\to+\infty}\hat{f}_n(z,x)\le f(z,x)\quad\forall\ (z,x)\in\Omega\times X.
\tag{3.126}
\]
Let $D\subseteq\Omega$ be the $\mu$-null set, such that the function $f(z,\cdot)$ is lower semicontinuous for all $z\in\Omega\setminus D$. Then for all $z\in\Omega\setminus D$, $x\in X$ and $\varepsilon>0$, let $\{y_n\}_{n\ge 1}\subseteq X$ be a sequence, such that
\[
f(z,y_n)+n\,d_X(y_n,x)\le\hat{f}_n(z,x)+\varepsilon.
\]
As $n\to+\infty$, either $\hat{f}_n(z,x)\nearrow+\infty$, in which case equality holds in (3.126), or otherwise we have $y_n\to x$ in $X$. Then because of the lower semicontinuity of $f(z,\cdot)$, we have
\[
f(z,x)\le\liminf_{n\to+\infty}f(z,y_n)\le\lim_{n\to+\infty}\hat{f}_n(z,x)+\varepsilon.
\tag{3.127}
\]
Since $z\in\Omega\setminus D$, $x\in X$ and $\varepsilon>0$ were arbitrary, from (3.126) and (3.127), we infer that
\[
\hat{f}_n(z,x)\nearrow f(z,x)\quad\text{for }\mu\text{-a.a. }z\in\Omega\text{ and all }x\in X.
\]
Finally set
\[
f_n(z,x)\stackrel{df}{=}\min\big\{\hat{f}_n(z,x),n\big\}.
\]
Then the sequence $\{f_n\}_{n\ge 1}$ is the desired sequence.
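The construction in the proof of Proposition 3.4.9 can be tried out on a simple example. The sketch below is only an illustration under stated assumptions: there is no dependence on $z$, the space $X$ is replaced by a finite grid `Y` in $[-2,2]$ with the usual distance, and $f$ is a lower semicontinuous step function; all names are hypothetical.

```python
# Sketch of the Lipschitz regularization used in the proof of Proposition 3.4.9.
# Assumptions: no z-dependence, X replaced by a finite grid Y, d_X(y, x) = |y - x|.

def f(x):
    # lower semicontinuous step: 0 on (-inf, 0], 1 on (0, +inf)
    return 0.0 if x <= 0.0 else 1.0

Y = [i / 100 for i in range(-200, 201)]  # finite stand-in for the space X

def f_hat(n, x):
    # inf over y of f(y) + n * d(y, x), here a minimum over the grid
    return min(f(y) + n * abs(y - x) for y in Y)

def f_n(n, x):
    # truncation at level n, as in the last step of the proof
    return min(f_hat(n, x), float(n))
```

On this grid each `f_n(n, .)` is $n$-Lipschitz, the values increase with $n$ and stay below `f`, which mirrors properties (a), (c) and (d) of the proposition; for instance at $x=0.5$ the value moves from $0.5$ for $n=1$ to $1$ for $n=2$.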

Using this approximation result, we have another characterization of normal integrands. First let us recall the Scorza-Dragoni theorem, which is a parametrized version of Lusin's theorem (see Theorem A.2.11).

THEOREM 3.4.10 (Scorza-Dragoni Theorem)
If $\Omega$, $X$ are two Polish spaces (see Definition A.2.29(a)), $Y$ is a separable metric space, $\mu$ is a tight Borel measure on $\Omega$ (see Remark 3.4.11) and $f:\Omega\times X\to Y$ is a Carathéodory function,
then for every $\varepsilon>0$, we can find a compact set $\Omega_\varepsilon\subseteq\Omega$, with $\mu(\Omega\setminus\Omega_\varepsilon)<\varepsilon$, such that $f|_{\Omega_\varepsilon\times X}$ is continuous.

REMARK 3.4.11 Recall that $\mu$ is a tight Borel measure on $\Omega$, if $\mu$ is finite and for every $\varepsilon>0$, we can find a compact subset $K_\varepsilon$ of $\Omega$, such that $\mu(\Omega\setminus K_\varepsilon)<\varepsilon$. On a Polish space every finite Borel measure is tight.

Combining Proposition 3.4.9 with Theorem 3.4.10, we obtain the following characterization of normal integrands.

PROPOSITION 3.4.12
If $\Omega$, $X$ are two Polish spaces, $\mu$ is a finite Borel measure on $\Omega$ and $f:\Omega\times X\to\overline{\mathbb{R}}\stackrel{df}{=}\mathbb{R}\cup\{+\infty\}$,
then $f$ is a normal integrand if and only if for every $\varepsilon>0$ we can find a compact set $K_\varepsilon\subseteq\Omega$, such that $\mu(\Omega\setminus K_\varepsilon)<\varepsilon$ and $f|_{K_\varepsilon\times X}$ is lower semicontinuous.

Now we pass to the study of the integral functional
\[
I_f(u)\stackrel{df}{=}\int_\Omega f\big(z,u(z)\big)\,d\mu\quad\forall\ u\in L^p(\Omega;X),
\]
with $p\in[1,+\infty]$. In our analysis we shall use the computational convention $+\infty-\infty=+\infty$, which is useful when dealing with integral functionals of $\overline{\mathbb{R}}=\mathbb{R}\cup\{+\infty\}$-valued functions.

THEOREM 3.4.13
If $(\Omega,\Sigma,\mu)$ is a nonatomic, $\sigma$-finite measure space, $X$ is a separable Banach space, $f:\Omega\times X\to\overline{\mathbb{R}}$ is a normal integrand, the integral functional $I_f$ is not identically $+\infty$ and $p\in[1,+\infty)$,
then the following properties are equivalent:
(a) $I_f$ is lower semicontinuous on $L^p(\Omega;X)$ and $I_f(u)>-\infty$ for all $u\in L^p(\Omega;X)$.
(b) $I_f:L^p(\Omega;X)\to\overline{\mathbb{R}}$.
(c) There exist $\beta_1\in\mathbb{R}$ and $\beta_2>0$, such that
\[
I_f(u)\ge\beta_1-\beta_2\|u\|_{L^p(\Omega;X)}^p\quad\forall\ u\in L^p(\Omega;X).
\]
(d) There exist $a\in L^1(\Omega)$ and $c>0$, such that
\[
f(z,x)\ge a(z)-c\,\|x\|_X^p\quad\text{for }\mu\text{-a.a. }z\in\Omega\text{ and all }x\in X.
\]

PROOF Clearly implications (d)$\Longrightarrow$(c)$\Longrightarrow$(b) and (a)$\Longrightarrow$(b) hold.

"(b)$\Longrightarrow$(d)": Let us set
\[
g\stackrel{df}{=}\min\{f,0\}.
\]
We claim that
\[
\int_\Omega g\big(z,u(z)\big)\,d\mu>-\infty\quad\forall\ u\in L^p(\Omega;X).
\tag{3.128}
\]
Suppose that (3.128) does not hold. Then for some $u_0\in L^p(\Omega;X)$, we have
\[
\int_\Omega g\big(z,u_0(z)\big)\,d\mu=-\infty.
\]
Let
\[
C\stackrel{df}{=}\big\{z\in\Omega:\ f\big(z,u_0(z)\big)=g\big(z,u_0(z)\big)\big\}\in\Sigma.
\]
For any given $v\in L^p(\Omega;X)$, we define
\[
\hat{u}\stackrel{df}{=}\chi_C u_0+\chi_{C^c}v\in L^p(\Omega;X).
\]
Then we have
\[
-\infty<I_f(\hat{u})=\int_C g\big(z,u_0(z)\big)\,d\mu+\int_{C^c}f\big(z,v(z)\big)\,d\mu,
\]

so
\[
\int_{C^c}f\big(z,v(z)\big)\,d\mu=+\infty
\]
and thus $I_f\equiv+\infty$, a contradiction. From (3.128), it follows that $N_g$ (the Nemytskii operator corresponding to $g$) maps $L^p(\Omega;X)$ into $L^1(\Omega)$. Invoking Theorem 3.4.4, we obtain (d).

"(d)$\Longrightarrow$(a)": Let $\lambda\in\mathbb{R}$ and consider a sequence $\{u_n\}_{n\ge 1}\subseteq L^p(\Omega;X)$, such that
\[
u_n\longrightarrow u\ \text{ in }L^p(\Omega;X),
\]
for some $u\in L^p(\Omega;X)$ and
\[
I_f(u_n)\le\lambda\quad\forall\ n\ge 1.
\]
By passing to a subsequence if necessary, we may also assume that
\[
u_n(z)\longrightarrow u(z)\quad\text{for }\mu\text{-a.a. }z\in\Omega
\]
and
\[
\|u_n(z)\|_X\le k(z)\quad\text{for }\mu\text{-a.a. }z\in\Omega,\ n\ge 1,
\]
with $k\in L^p(\Omega)_+$. Then because of (d), we can apply Fatou's lemma (see Theorem A.2.1) and obtain
\[
I_f(u)=\int_\Omega f\big(z,u(z)\big)\,d\mu\le\int_\Omega\liminf_{n\to+\infty}f\big(z,u_n(z)\big)\,d\mu\le\liminf_{n\to+\infty}\int_\Omega f\big(z,u_n(z)\big)\,d\mu\le\lambda,
\]
so $I_f$ is lower semicontinuous on $L^p(\Omega;X)$. Moreover, it is clear that
\[
I_f(u)>-\infty\quad\forall\ u\in L^p(\Omega;X).
\]

COROLLARY 3.4.14
If $(\Omega,\Sigma,\mu)$ is a nonatomic, complete, $\sigma$-finite measure space, $X$ is a separable Banach space, $f:\Omega\times X\to\mathbb{R}$ is a Carathéodory function, such that
\[
|f(z,x)|\le a(z)+c\,\|x\|_X^p\quad\text{for }\mu\text{-a.a. }z\in\Omega\text{ and all }x\in X,
\]
with $a\in L^1(\Omega)_+$, $c>0$ and $p\in[1,+\infty)$,
then $I_f:L^p(\Omega;X)\to\mathbb{R}$ is continuous.

For $p=+\infty$, we can state the following continuity result.

PROPOSITION 3.4.15
If $(\Omega,\Sigma,\mu)$ is a nonatomic, complete, $\sigma$-finite measure space, $X$ is a separable Banach space, $f:\Omega\times X\to\mathbb{R}$ is a Carathéodory function and for all $r>0$ we can find $a_r\in L^1(\Omega)_+$, such that
\[
|f(z,x)|\le a_r(z)\quad\text{for }\mu\text{-a.a. }z\in\Omega\text{ and all }\|x\|_X\le r,
\]
then $I_f:L^\infty(\Omega;X)\to\mathbb{R}$ is continuous.

PROOF Suppose that $\{u_n\}_{n\ge 1}\subseteq L^\infty(\Omega;X)$ is a sequence, such that
\[
u_n\longrightarrow u\ \text{ in }L^\infty(\Omega;X),
\]
for some $u\in L^\infty(\Omega;X)$. Let
\[
r\stackrel{df}{=}\sup_{n\ge 1}\|u_n\|_{L^\infty(\Omega;X)}<+\infty.
\]
Then
\[
\big|f\big(z,u_n(z)\big)\big|\le a_r(z)\quad\text{for }\mu\text{-a.a. }z\in\Omega.
\]
Since
\[
f\big(z,u_n(z)\big)\longrightarrow f\big(z,u(z)\big)\quad\text{for }\mu\text{-a.a. }z\in\Omega,
\]
from the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that
\[
I_f(u_n)\longrightarrow I_f(u).
\]

PROPOSITION 3.4.16
If $(\Omega,\Sigma,\mu)$ is a nonatomic, complete, $\sigma$-finite measure space, $X$ is a separable Banach space, $f:\Omega\times X\to\overline{\mathbb{R}}$ is a normal integrand, such that $f(z,\cdot)$ is convex for $\mu$-almost all $z\in\Omega$, $u_0\in L^\infty(\Omega;X)$ and $I_f(u_0)\in\mathbb{R}$,
then the following conditions are equivalent:
(a) $I_f$ is continuous at $u_0$ with respect to the norm topology on $L^\infty(\Omega;X)$.
(b) There exist $\varepsilon>0$ and $a\in L^1(\Omega)$, such that
\[
\sup_{\|x\|_X\le\varepsilon}f\big(z,x+u_0(z)\big)\le a(z)\quad\text{for }\mu\text{-a.a. }z\in\Omega.
\]

PROOF "(a)$\Longrightarrow$(b)": Suppose that the implication is not true. Then for any $\varepsilon>0$, the function
\[
\xi_\varepsilon(z)\stackrel{df}{=}\sup_{\|x\|_X\le\varepsilon}f\big(z,x+u_0(z)\big)
\]

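is not integrable (note that since $f$ is a normal integrand, $\xi_\varepsilon$ is measurable). Clearly
\[
\int_\Omega\xi_\varepsilon(z)\,d\mu=+\infty\quad\forall\ \varepsilon>0.
\]
So for every $\varepsilon>0$ and $N\ge 1$, we can find a measurable function $\xi_{\varepsilon,N}:\Omega\to\mathbb{R}$, such that
\[
\int_\Omega\xi_{\varepsilon,N}(z)\,d\mu\ge N\quad\text{and}\quad\xi_{\varepsilon,N}(z)\le\xi_\varepsilon(z)\quad\forall\ z\in\Omega.
\]
Then for every $z\in\Omega$, we define
\[
S_{\varepsilon,N}(z)\stackrel{df}{=}\big\{x\in X:\ \|x-u_0(z)\|_X\le\varepsilon,\ f(z,x)\ge\xi_{\varepsilon,N}(z)\big\}.
\]
Since $(z,x)\mapsto\|x-u_0(z)\|_X$ is a Carathéodory function (hence jointly measurable; see Remark 3.4.2) and $(z,x)\mapsto f(z,x)-\xi_{\varepsilon,N}(z)$ is a $\Sigma\times B(X)$-measurable function (since $f$ is a normal integrand), we infer that
\[
\mathrm{Gr}\,S_{\varepsilon,N}\in\Sigma\times B(X).
\]
Applying the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33), we obtain a measurable map $u_{\varepsilon,N}:\Omega\to X$, such that
\[
u_{\varepsilon,N}(z)\in S_{\varepsilon,N}(z)\quad\forall\ z\in\Omega.
\]
Clearly $u_{\varepsilon,N}\in L^\infty(\Omega;X)$, $\|u_{\varepsilon,N}-u_0\|_{L^\infty(\Omega;X)}\le\varepsilon$ and
\[
I_f(u_{\varepsilon,N})\ge\int_\Omega\xi_{\varepsilon,N}(z)\,d\mu\ge N.
\]
Since $\varepsilon>0$ and $N\ge 1$ were arbitrary, it follows that the convex integral functional $I_f$ is unbounded from above in every $L^\infty(\Omega;X)$-neighbourhood of $u_0$ and so it cannot be continuous at $u_0$, a contradiction.

"(b)$\Longrightarrow$(a)": For every
\[
v\in B_\varepsilon\stackrel{df}{=}\big\{v\in L^\infty(\Omega;X):\ \|v\|_{L^\infty(\Omega;X)}\le\varepsilon\big\},
\]
we have
\[
I_f(u_0\pm v)=\int_\Omega f\big(z,u_0(z)\pm v(z)\big)\,d\mu\le\int_\Omega a(z)\,d\mu=\eta<+\infty.
\]
Since $I_f$ is convex and bounded above in a neighbourhood of $u_0$, it is continuous at $u_0$.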

Thus far we have considered the norm topology on the Lebesgue-Bochner space. If we want to have weak lower semicontinuity, then, as we show, necessarily $f(z,\cdot)$ must be convex. For this purpose we need to use some tools from multivalued analysis, which for the convenience of the reader we recall here. Details can be found in Denkowski, Migórski & Papageorgiou (2003a, Chapter 4).

DEFINITION 3.4.17
Let $Y$ be a separable Banach space and let $G:\Omega\to 2^Y\setminus\{\emptyset\}$ be a multivalued (set-valued) map. We say that $G$ is graph measurable, if $\mathrm{Gr}\,G\in\Sigma\times B(Y)$, where
\[
\mathrm{Gr}\,G\stackrel{df}{=}\big\{(z,y)\in\Omega\times Y:\ y\in G(z)\big\}.
\]
Also, for any $p\in[1,+\infty]$, we set
\[
S_G^p\stackrel{df}{=}\big\{g\in L^p(\Omega;Y):\ g(z)\in G(z)\ \text{for }\mu\text{-a.a. }z\in\Omega\big\}
\]
(the set of $L^p$-selections of the multifunction $G$).

The next result from multivalued analysis is the main tool in establishing the necessity of the convexity of $f(z,\cdot)$.

PROPOSITION 3.4.18
If $(\Omega,\Sigma,\mu)$ is a nonatomic, complete, $\sigma$-finite measure space, $Y$ is a separable Banach space and $G:\Omega\to 2^Y\setminus\{\emptyset\}$ is graph measurable with $S_G^p\neq\emptyset$, $p\in[1,\infty)$,
then $\overline{S_G^p}^{\,w}=S^p_{\overline{\mathrm{conv}}\,G}$, where $w$ stands for the weak topology on $L^p(\Omega;Y)$.

REMARK 3.4.19 There is an analogous result for $p=+\infty$. More precisely, let $(\Omega,\Sigma,\mu)$ and $Y$ be as in Proposition 3.4.18. We denote by $Y^*_{w^*}$ the dual space of $Y$ furnished with the $w^*$-topology. From Theorem 2.2.12, we know that
\[
L^\infty(\Omega;Y^*_{w^*})=L^1(\Omega;Y)^*,
\]
where $L^\infty(\Omega;Y^*_{w^*})$ is understood in the sense of Definition 2.2.10. Let $G:\Omega\to 2^{Y^*}\setminus\{\emptyset\}$ be a multifunction, such that
\[
\mathrm{Gr}\,G\in\Sigma\times B\big(Y^*_{w^*}\big)\quad\text{and}\quad S_G^\infty\neq\emptyset.
\]
Then
\[
\overline{S_G^\infty}^{\,w^*}=S^\infty_{\overline{\mathrm{conv}}^{\,w^*}G}\quad\text{in }L^\infty(\Omega;Y^*_{w^*}).
\]

Using this general result about multifunctions, we can prove the following theorem about the integral functional $I_f$, according to which the weak lower semicontinuity of $I_f$ on $L^1(\Omega;X)$ implies the convexity of $f(z,\cdot)$ for all $z\in\Omega$.

THEOREM 3.4.20
If $(\Omega,\Sigma,\mu)$ is a nonatomic, complete, $\sigma$-finite measure space, $X$ is a separable Banach space, $f:\Omega\times X\to\overline{\mathbb{R}}$ is a normal integrand, there exists $u_0\in L^1(\Omega;X)$, such that $I_f(u_0)<+\infty$ and $I_f$ is weakly lower semicontinuous on $L^1(\Omega;X)$,
then $f(z,\cdot)$ is convex for all $z\in\Omega$.

PROOF Without any loss of generality, we may assume that
\[
f\big(z,u_0(z)\big)=0\quad\forall\ z\in\Omega
\]
(otherwise replace $f(z,x)$ by $\hat{f}(z,x)=f(z,x)-f\big(z,u_0(z)\big)$). Consider the multifunction $E:\Omega\to 2^{X\times\mathbb{R}}$, defined by
\[
E(z)\stackrel{df}{=}\mathrm{epi}\,f(z,\cdot)=\big\{(x,\lambda)\in X\times\mathbb{R}:\ f(z,x)\le\lambda\big\}
\]
(the epigraph of $f(z,\cdot)$). Since
\[
\big(u_0(z),0\big)\in E(z)\quad\forall\ z\in\Omega,
\]
we see that $E$ has nonempty values, which are also closed due to the lower semicontinuity of $f(z,\cdot)$. Moreover,
\[
(u_0,0)\in S_E^1.
\]
We claim that $S_E^1$ is weakly closed in $L^1(\Omega;X\times\mathbb{R})$. To this end let $\{(u_\alpha,\lambda_\alpha)\}_{\alpha\in J}$ be a net in $S_E^1$ and assume that
\[
u_\alpha\stackrel{w}{\longrightarrow}u\ \text{ in }L^1(\Omega;X)\quad\text{and}\quad\lambda_\alpha\stackrel{w}{\longrightarrow}\lambda\ \text{ in }L^1(\Omega).
\]

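For every $C\in\Sigma$, we have that
\[
\chi_C u_\alpha\stackrel{w}{\longrightarrow}\chi_C u\ \text{ in }L^1(\Omega;X)
\]
and so
\[
\chi_C u_\alpha+\chi_{C^c}u_0\stackrel{w}{\longrightarrow}\chi_C u+\chi_{C^c}u_0\ \text{ in }L^1(\Omega;X).
\]
Also
\[
\chi_C\lambda_\alpha\stackrel{w}{\longrightarrow}\chi_C\lambda\ \text{ in }L^1(\Omega).
\tag{3.129}
\]
Note that
\[
I_f(\chi_C u_\alpha+\chi_{C^c}u_0)=\int_C f\big(z,u_\alpha(z)\big)\,d\mu.
\]
Since by hypothesis $I_f$ is weakly lower semicontinuous on $L^1(\Omega;X)$, we have
\[
\liminf_\alpha I_f(\chi_C u_\alpha+\chi_{C^c}u_0)\ge I_f(\chi_C u+\chi_{C^c}u_0)=\int_C f\big(z,u(z)\big)\,d\mu,
\]
so
\[
\liminf_\alpha\int_C f\big(z,u_\alpha(z)\big)\,d\mu\ge\int_C f\big(z,u(z)\big)\,d\mu.
\]
Because
\[
f\big(z,u_\alpha(z)\big)\le\lambda_\alpha(z)\quad\text{for }\mu\text{-a.a. }z\in\Omega,
\]
we have
\[
\int_C f\big(z,u_\alpha(z)\big)\,d\mu\le\int_C\lambda_\alpha(z)\,d\mu,
\]
so, from (3.129), we have
\[
\int_C f\big(z,u(z)\big)\,d\mu\le\int_C\lambda(z)\,d\mu.
\]
The set $C\in\Sigma$ was arbitrary. Hence it follows that
\[
f\big(z,u(z)\big)\le\lambda(z)\quad\text{for }\mu\text{-a.a. }z\in\Omega
\]
and
\[
(u,\lambda)\in S_E^1,
\]
which proves that $S_E^1$ is weakly closed in $L^1(\Omega;X\times\mathbb{R})$.

Then by virtue of Proposition 3.4.18, we have that
\[
S_E^1=S^1_{\overline{\mathrm{conv}}\,E}.
\]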

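We claim that this implies that
\[
E(z)=\overline{\mathrm{conv}}\,E(z)\quad\text{for }\mu\text{-a.a. }z\in\Omega.
\]
We proceed by contradiction. Suppose that
\[
D(z)=\overline{\mathrm{conv}}\,E(z)\setminus E(z)\neq\emptyset\quad\forall\ z\in C\in\Sigma,
\]
with $\mu(C)>0$. Clearly $\mathrm{Gr}\,D\in(\Sigma\cap C)\times B(X\times\mathbb{R})$ and so by the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) we can find two $(\Sigma\cap C)$-measurable functions
\[
\hat{u}:C\to X\quad\text{and}\quad\hat{\lambda}:C\to\mathbb{R},
\]
such that
\[
\big(\hat{u}(z),\hat{\lambda}(z)\big)\in D(z)\quad\forall\ z\in C.
\]
Exploiting the $\sigma$-finiteness of the measure space, we can find $C'\subseteq C$ with
\[
C'\in\Sigma\quad\text{and}\quad 0<\mu(C')<+\infty,
\]
such that
\[
\chi_{C'}\hat{u}\in L^1(\Omega;X)\quad\text{and}\quad\chi_{C'}\hat{\lambda}\in L^1(\Omega).
\]
Then let
\[
u\stackrel{df}{=}\chi_{C'}\hat{u}+\chi_{(C')^c}u_0\quad\text{and}\quad\lambda\stackrel{df}{=}\chi_{C'}\hat{\lambda}.
\]
Evidently $(u,\lambda)\in S^1_{\overline{\mathrm{conv}}\,E}=S_E^1$ and
\[
\big(u(z),\lambda(z)\big)\in D(z)\quad\forall\ z\in C',
\]
a contradiction, since $D(z)\cap E(z)=\emptyset$. Therefore
\[
E(z)=\overline{\mathrm{conv}}\,E(z)\quad\text{for }\mu\text{-a.a. }z\in\Omega
\]
and by redefining $E$ on a $\mu$-null set, we may assume that
\[
E(z)=\overline{\mathrm{conv}}\,E(z)\quad\forall\ z\in\Omega.
\]
This proves the convexity of $f(z,\cdot)$ for all $z\in\Omega$.

In the next section we use the theory of Young measures to prove very general lower semicontinuity results for integral functionals. Moreover, in Section 4.3, we will focus on convex integral functionals.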
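The role of convexity in Theorem 3.4.20 can be seen on a standard one-dimensional example. The following sketch is an illustration under stated assumptions: $\Omega=[0,1]$ with the Lebesgue measure, the nonconvex double-well integrand $f(x)=(x^2-1)^2$, a square wave $u_n$ with values $\pm 1$, and a midpoint rule for the integrals; all names are hypothetical.

```python
# Failure of weak lower semicontinuity for a nonconvex integrand (cf. Theorem 3.4.20).
# Assumptions: Omega = [0, 1], Lebesgue measure, integrals by the midpoint rule.
import math

def f(x):
    # double-well integrand; it vanishes at x = -1 and x = 1 and is not convex
    return (x * x - 1.0) ** 2

def u(n, z):
    # square wave with values +1 / -1 and n periods on [0, 1]
    return 1.0 if math.floor(2 * n * z) % 2 == 0 else -1.0

M = 20000
MID = [(i + 0.5) / M for i in range(M)]  # midpoints of a uniform partition

def integral(g):
    # midpoint-rule approximation of the integral of g over [0, 1]
    return sum(g(z) for z in MID) / M

def pairing(n, phi):
    # approximates the integral of u_n * phi, used to test weak convergence to 0
    return integral(lambda z: u(n, z) * phi(z))

def I_f(v):
    # integral functional I_f evaluated at the function v
    return integral(lambda z: f(v(z)))
```

For a test function such as $\varphi(z)=z$, `pairing(n, phi)` is of order $1/n$, in line with $u_n\stackrel{w}{\to}0$, while `I_f` vanishes on every $u_n$ and equals one at the weak limit $u=0$. So $\liminf I_f(u_n)<I_f(u)$ and $I_f$ is not weakly lower semicontinuous, as the theorem predicts for a nonconvex $f$.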

3.5 Young Measures

According to Proposition 2.3.38, a sequence $\{u_n\}_{n\ge 1}\subseteq L^1(\Omega)$, which converges weakly but not strongly in $L^1(\Omega)$, oscillates violently around its weak limit. However, in the limit all this information about the faster and faster oscillations is lost and only a mean value is recorded. Of course this is not satisfactory, because if for example on $\{u_n\}_{n\ge 1}$ we act with the Nemytskii operator $N_f$, we cannot say that
\[
N_f(u_n)\stackrel{w}{\longrightarrow}N_f(u)\ \text{ in }L^1(\Omega),
\]
unless $f(z,\cdot)$ is affine. The idea is then to embed the sequence $\{u_n\}_{n\ge 1}$ into a larger space and consider the limit there. The appropriate space is that of probability-valued functions (parametrized measures). These are the Young measures.

In what follows let $\Omega$ and $E$ be locally compact, $\sigma$-compact metric spaces, $\Sigma$ a $\sigma$-field on $\Omega$ containing $B(\Omega)$ (the Borel $\sigma$-field of $\Omega$) and $\mu\in M(\Omega)_+$ (see Section 2.3), which is nonatomic and such that $\Sigma$ is $\mu$-complete. Also we set
\[
M^1_+(E)\stackrel{df}{=}\big\{\lambda\in M(E)_+:\ \lambda(E)=1\big\}
\tag{3.130}
\]
(the probability measures on $E$) and
\[
SM^1_+(E)\stackrel{df}{=}\big\{\lambda\in M(E)_+:\ \lambda(E)\le 1\big\}
\]
(the subprobability measures on $E$).

DEFINITION 3.5.1
A transition probability (respectively transition subprobability) on $E$ is a function
\[
\lambda:\Omega\to M^1_+(E)\quad\text{(respectively }\lambda:\Omega\to SM^1_+(E)\text{)},
\]
such that for every $A\in B(E)$, we have that the function $z\mapsto\lambda(z)(A)$ is $\Sigma$-measurable. By $\widehat{R}(\Omega,E)$ (respectively $\widehat{SR}(\Omega,E)$) we denote the space of transition probabilities (respectively subprobabilities) on $\Omega$.

REMARK 3.5.2 On $M^1_+(E)$ we can consider the topology of narrow convergence (see Definition 2.3.42(c)). This is the relative topology on $M^1_+(E)$ induced by $w\big(M(E),C_b(E)\big)$. Recall that $M^1_+(E)$ furnished with this topology is a Polish space (see Definition A.2.29(a)). In the sequel by $M^1_+(E)_n$ we denote the space $M^1_+(E)$ equipped with the topology of narrow convergence.
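The loss of information described at the start of this section can be observed numerically. The sketch below makes several assumptions that are not in the text: $\Omega=[0,1]$ with the Lebesgue measure, the oscillating sequence $u_n(z)=\sin(2\pi nz)$, and a midpoint rule for the integrals; the function names are illustrative.

```python
# Weak limits do not commute with nonlinear maps: u_n(z) = sin(2 pi n z) on [0, 1].
# Assumptions: Lebesgue measure on [0, 1], integrals by the midpoint rule.
import math

M = 20000
MID = [(i + 0.5) / M for i in range(M)]  # midpoints of a uniform partition

def u(n, z):
    # rapidly oscillating function with weak limit 0
    return math.sin(2.0 * math.pi * n * z)

def average(n, h, phi=lambda z: 1.0):
    # midpoint-rule approximation of the integral of phi(z) * h(u_n(z)) over [0, 1]
    return sum(phi(z) * h(u(n, z)) for z in MID) / M
```

With $h(x)=x$ the averages against a test function tend to zero, reflecting $u_n\stackrel{w}{\to}0$, while with $h(x)=x^2$ they stay at $\frac{1}{2}$ instead of $h(0)=0$. The value $\frac{1}{2}$ is the mean of $x^2$ under the arcsine law on $(-1,1)$, which is the distribution of values of the sine over one period; a parametrized measure of this kind is exactly the extra information a Young measure records.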

PROPOSITION 3.5.3
If $\lambda:\Omega\to M^1_+(E)$ is a $\big(\Sigma,B\big(M^1_+(E)_n\big)\big)$-measurable function,
then $\lambda\in\widehat{R}(\Omega,E)$.

PROOF Let $U$ be an open set in $E$. The map $z\mapsto\lambda(z)(U)$ is the composition of $\lambda:\Omega\to M^1_+(E)$, which by hypothesis is $\big(\Sigma,B\big(M^1_+(E)_n\big)\big)$-measurable, and of the map $\xi:M^1_+(E)\to[0,1]$, defined by $\xi(\nu)=\nu(U)$, which is lower semicontinuous (by virtue of the Portmanteau theorem; see Theorem A.2.36). Therefore the map $z\mapsto\lambda(z)(U)$ is $\Sigma$-measurable. Since $\lambda(z)\in M^1_+(E)$ is regular for every $z\in\Omega$, we conclude that the map $z\mapsto\lambda(z)(A)$ is $\Sigma$-measurable for all $A\in B(E)$.

PROPOSITION 3.5.4
If $E$ is compact and $\lambda\in\widehat{R}(\Omega,E)$,
then $\lambda$ is a $\big(\Sigma,B\big(M^1_+(E)_n\big)\big)$-measurable function.

PROOF Recall that $C(E)^*=M(E)$. Then for every $g\in C(E)$, we have that
\[
\text{the map } z\mapsto\big\langle\lambda(z),g\big\rangle_{C(E)}=\int_E g(x)\,\lambda(z)(dx)\ \text{ is }\Sigma\text{-measurable}.
\]
Indeed, suppose that $s:E\to\mathbb{R}$ is a simple function. Then since $\lambda\in\widehat{R}(\Omega,E)$, we have that
\[
\text{the map } z\mapsto\int_E s(x)\,\lambda(z)(dx)\ \text{ is }\Sigma\text{-measurable}.
\]
We can find a sequence $\{s_n\}_{n\ge 1}$ of simple functions on $E$, such that
\[
s_n(x)\longrightarrow g(x)\quad\text{uniformly on }E.
\]
Then
\[
\int_E s_n(x)\,\lambda(z)(dx)\longrightarrow\int_E g(x)\,\lambda(z)(dx),
\]
which proves the $\Sigma$-measurability of the map $z\mapsto\big\langle\lambda(z),g\big\rangle_{C(E)}$. So the map $z\mapsto\lambda(z)$ is weakly$^*$-measurable. Because $M^1_+(E)_n$ is a compact metrizable space, we conclude that $z\mapsto\lambda(z)$ is $\big(\Sigma,B\big(M^1_+(E)_n\big)\big)$-measurable.

Let us recall the notion of image measure.

DEFINITION 3.5.5
Let $(S,T)$ be a measurable space, $Y$ a Hausdorff topological space, $\xi:S\to Y$ a $\big(T,B(Y)\big)$-measurable function and $\lambda:T\to\mathbb{R}_+\cup\{+\infty\}$ a measure. The image of $\lambda$ under $\xi$ is the measure $\nu:B(Y)\to\mathbb{R}_+\cup\{+\infty\}$, defined by
\[
\nu(A)\stackrel{df}{=}\lambda\big(\xi^{-1}(A)\big)\quad\forall\ A\in B(Y).
\]
We often denote $\nu$ by $\lambda\circ\xi^{-1}$.

REMARK 3.5.6 If $g:Y\to\mathbb{R}$ is a $\nu$-integrable map or a measurable and positive map, then
\[
\int_S g\big(\xi(s)\big)\,d\lambda(s)=\int_Y g(y)\,d\nu(y).
\]
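The change of variables formula of Remark 3.5.6 is easy to verify for a measure carried by finitely many points. The sketch below assumes such a discrete setting, with a measure stored as a dictionary from points to masses; the names `image_measure` and `integrate` are illustrative.

```python
# Image measure and change of variables (Definition 3.5.5, Remark 3.5.6),
# sketched for a measure supported on finitely many points.

def image_measure(lam, xi):
    # lam: dict point -> mass (a finite measure on S); xi: map from S to Y.
    # The mass of y under the image measure is the total mass of the preimage of y.
    nu = {}
    for s, mass in lam.items():
        y = xi(s)
        nu[y] = nu.get(y, 0.0) + mass
    return nu

def integrate(measure, g):
    # integral of g with respect to a finitely supported measure
    return sum(g(point) * mass for point, mass in measure.items())

# Uniform measure on four points, pushed forward by s -> s * s.
lam = {-2: 0.25, -1: 0.25, 1: 0.25, 2: 0.25}
nu = image_measure(lam, lambda s: s * s)
```

Here `nu` assigns mass $\frac{1}{2}$ to each of the points $1$ and $4$, and integrating $g\circ\xi$ against `lam` gives the same value as integrating $g$ against `nu`.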

Now we can give the definition of Young measure.

DEFINITION 3.5.7
(a) $\lambda\in M(\Omega\times E)_+$ is a Young measure with respect to $\mu$, if
\[
\lambda(A\times E)=\mu(A)\quad\forall\ A\in\Sigma.
\]
By $Y(\Omega,E,\mu)$, we denote the space of Young measures with respect to $\mu$.
(b) $\lambda\in M(\Omega\times E)_+$ is a Young submeasure with respect to $\mu$, if
\[
\lambda(A\times E)\le\mu(A)\quad\forall\ A\in\Sigma.
\]
By $SY(\Omega,E,\mu)$, we denote the space of Young submeasures with respect to $\mu$.
(c) If $u:\Omega\to E$ is a measurable function, then the Young measure associated to $u$ is the element $\nu\in Y(\Omega,E,\mu)$, defined by
\[
\int_{\Omega\times E}h(z,x)\,d\nu=\int_\Omega h\big(z,u(z)\big)\,d\mu\quad\forall\ h\in C_0(\Omega\times E).
\]

REMARK 3.5.8 Let $\mathrm{proj}_\Omega:\Omega\times E\to\Omega$ be the projection map defined by
\[
\mathrm{proj}_\Omega(z,x)\stackrel{df}{=}z\quad\forall\ (z,x)\in\Omega\times E.
\]
If $\nu\in Y(\Omega,E,\mu)$, then
\[
\mu=\nu\circ\mathrm{proj}_\Omega^{-1}
\]

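(see Definition 3.5.5). Moreover, we have
\[
\nu=\mu\circ\eta^{-1}\quad\text{for all }\mu,\nu\text{ as in Definition 3.5.7(c)},
\]
where $\eta:\Omega\to\Omega\times E$ is defined by
\[
\eta(z)\stackrel{df}{=}\big(z,u(z)\big)\quad\forall\ z\in\Omega.
\]
If $u_n:\Omega\to E$ are measurable functions for $n\ge 1$, $u_n(z)\to u(z)$ for $\mu$-a.a. $z\in\Omega$ and $\nu_n$, $\nu$ are the Young measures associated to $u_n$ and $u$ respectively (see Definition 3.5.7(c)), then
\[
\nu_n\stackrel{w}{\longrightarrow}\nu\ \text{ in }M^1_+(\Omega\times E)
\]
(see Definition 2.3.42(b)). To see this let $h\in C_0(\Omega\times E)$. Using Remark 3.5.6 and the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
\[
\lim_{n\to+\infty}\int_{\Omega\times E}h(z,x)\,d\nu_n=\lim_{n\to+\infty}\int_\Omega h\big(z,u_n(z)\big)\,d\mu=\int_\Omega h\big(z,u(z)\big)\,d\mu=\int_{\Omega\times E}h(z,x)\,d\nu.
\]

PROPOSITION 3.5.9
We have
\[
\begin{aligned}
SY(\Omega,E,\mu)=\Big\{&\nu\in M(\Omega\times E)_+:\ \nu(\Omega\times E)\le\mu(\Omega)\ \text{and for all }\beta\in C_0(\Omega)_+\text{ and all }\xi\in C_0(\Omega\times E)_+,\\
&\text{such that }\xi(z,x)\le\beta(z)\text{ for all }(z,x)\in\Omega\times E,\ \text{we have }\int_{\Omega\times E}\xi(z,x)\,d\nu\le\int_\Omega\beta(z)\,d\mu\Big\}.
\end{aligned}
\]
Moreover, if $E$ is compact, then
\[
Y(\Omega,E,\mu)=\Big\{\nu\in SY(\Omega,E,\mu):\ \text{for all }\beta\in C_0(\Omega)_+,\ \text{we have }\int_\Omega\beta(z)\,d\mu=\int_{\Omega\times E}\beta(z)\,d\nu\Big\}.
\]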

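PROOF Let $K\subseteq\Omega$ and $C\subseteq E$ be nonempty, compact sets, and $\varepsilon>0$. Then by virtue of Urysohn's lemma (see Theorem A.1.13), we can find
\[
\xi\in C_0(\Omega\times E)_+,\quad\beta\in C_0(\Omega)_+,\quad 0\le\xi,\beta\le 1
\]
and a compact set
\[
K_1\subseteq\Omega,\quad K_1\supseteq K,\quad K_1\neq K,
\]
such that
\[
\xi|_{K\times C}=1,\quad\beta|_K=1,\quad\big|\beta|_{K_1^c}\big|\le\varepsilon,\quad\mu(K_1\setminus K)\le\varepsilon
\]
and
\[
\xi(z,x)\le\beta(z)\quad\forall\ (z,x)\in\Omega\times E.
\]
Then we have
\[
\nu(K\times C)\le\int_{\Omega\times E}\xi(z,x)\,d\nu\le\int_\Omega\beta(z)\,d\mu\le\mu(K)+\mu(K_1\setminus K)+\varepsilon\mu(K_1^c)\le\mu(K)+\varepsilon\big(1+\mu(\Omega)\big).
\]
Let $\varepsilon\searrow 0$, to conclude that
\[
\nu(K\times C)\le\mu(K).
\]
If $A\in\Sigma$, then we can find compact sets
\[
K_n\subseteq A\quad\text{and}\quad C_n\subseteq E\quad\forall\ n\ge 1,
\]
such that
\[
K_n\nearrow A\quad\text{and}\quad C_n\nearrow E.
\]
Since
\[
\nu(K_n\times C_n)\le\mu(K_n)\quad\forall\ n\ge 1,
\]
passing to the limit as $n\to+\infty$, we obtain
\[
\nu(A\times E)\le\mu(A).
\]
This proves the first equality. If $E$ is compact, simply note that if $\beta\in C_0(\Omega)$, then $\beta\in C_0(\Omega\times E)$. So the second equality follows at once.

Recall that
\[
M(\Omega\times E)=C_0(\Omega\times E)^*
\]
(see Theorem 2.3.41). So we can equip $Y(\Omega,E,\mu)$ and $SY(\Omega,E,\mu)$ with the relative weak$^*$-topology. If we do this we can have useful topological properties for the space of Young measures and of Young submeasures.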

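THEOREM 3.5.10
$E$ is compact if and only if $Y(\Omega,E,\mu)$ is compact.

PROOF "$\Longrightarrow$": First note that
\[
Y(\Omega,E,\mu)\subseteq SY(\Omega,E,\mu)\subseteq\overline{B}^{\,*}_{\mu(\Omega)},
\]
where
\[
\overline{B}^{\,*}_{\mu(\Omega)}\stackrel{df}{=}\big\{\lambda\in M(\Omega\times E):\ |\lambda|(\Omega\times E)\le\mu(\Omega)\big\}.
\]
We know that the predual space $C_0(\Omega\times E)$ is separable. So on bounded subsets of $M(\Omega\times E)$, the relative weak$^*$-topology is compact and metrizable. Therefore we have to show that $Y(\Omega,E,\mu)$ is sequentially closed. Let $\{\nu_n\}_{n\ge 1}\subseteq Y(\Omega,E,\mu)$ be a sequence, such that
\[
\nu_n\stackrel{w^*}{\longrightarrow}\nu.
\]
If $\beta\in C_0(\Omega)$, then from Proposition 3.5.9, we have
\[
\int_\Omega\beta(z)\,d\mu=\int_{\Omega\times E}\beta(z)\,d\nu_n\quad\forall\ n\ge 1.
\]
Since $\beta\in C_0(\Omega\times E)$, in the limit as $n\to+\infty$, we obtain
\[
\int_\Omega\beta(z)\,d\mu=\int_{\Omega\times E}\beta(z)\,d\nu
\]
and thus due to Proposition 3.5.9, we have $\nu\in Y(\Omega,E,\mu)$. This proves the compactness of $Y(\Omega,E,\mu)$.

"$\Longleftarrow$": Suppose that $E$ is not compact. Then we can find a sequence $\{x_n\}_{n\ge 1}\subseteq E$ with no convergent subsequence. For every $n\ge 1$, let $\nu_n\in Y(\Omega,E,\mu)$ be associated to the constant function $x_n$ (see Definition 3.5.7(c)). Then for each $\xi\in C_0(\Omega\times E)$, we have
\[
\lim_{n\to+\infty}\int_{\Omega\times E}\xi(z,x)\,d\nu_n=\lim_{n\to+\infty}\int_\Omega\xi(z,x_n)\,d\mu=0
\]
and so
\[
\nu_n\stackrel{w^*}{\longrightarrow}0\quad\text{and}\quad 0\notin Y(\Omega,E,\mu),
\]
a contradiction.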

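THEOREM 3.5.11
$SY(\Omega,E,\mu)$ is compact and metrizable.

PROOF Recall that
\[
SY(\Omega,E,\mu)\subseteq\overline{B}^{\,*}_{\mu(\Omega)}
\]
(see the proof of Theorem 3.5.10) and $\overline{B}^{\,*}_{\mu(\Omega)}$ with the relative weak$^*$-topology is compact metrizable. So we need to show that $SY(\Omega,E,\mu)$ is sequentially weak$^*$-closed. To this end let $\{\nu_n\}_{n\ge 1}\subseteq SY(\Omega,E,\mu)$ be a sequence, such that
\[
\nu_n\stackrel{w^*}{\longrightarrow}\nu\ \text{ in }M(\Omega\times E),
\]
for some $\nu\in M(\Omega\times E)_+$. Suppose that
\[
\xi\in C_0(\Omega\times E)_+,\quad\beta\in C_0(\Omega)_+
\]
and
\[
\xi(z,x)\le\beta(z)\quad\forall\ (z,x)\in\Omega\times E.
\]
Then from Proposition 3.5.9, we have
\[
\int_{\Omega\times E}\xi(z,x)\,d\nu_n\le\int_\Omega\beta(z)\,d\mu\quad\forall\ n\ge 1,
\]
so
\[
\int_{\Omega\times E}\xi(z,x)\,d\nu\le\int_\Omega\beta(z)\,d\mu
\]
and from Proposition 3.5.9, we have that $\nu\in SY(\Omega,E,\mu)$.

Let us see how Young measures are related to transition probabilities (see Definition 3.5.1). In what follows on $\widehat{SR}(\Omega,E)$ we consider the equivalence relation
\[
\lambda_1\sim\lambda_2\iff\big[\lambda_1(z)=\lambda_2(z)\ \text{for }\mu\text{-a.a. }z\in\Omega\big].
\]
Then we set
\[
R(\Omega,E)\stackrel{df}{=}\widehat{R}(\Omega,E)/\!\sim\quad\text{and}\quad SR(\Omega,E)\stackrel{df}{=}\widehat{SR}(\Omega,E)/\!\sim.
\]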

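THEOREM 3.5.12
There is a bijection $\psi:R(\Omega,E)\to Y(\Omega,E,\mu)$ given by
\[
\psi(\lambda)\stackrel{df}{=}\nu\quad\forall\ \lambda\in R(\Omega,E),
\]
where
\[
\nu(C)\stackrel{df}{=}\int_\Omega\int_E\chi_C(z,x)\,\lambda(z)(dx)\,d\mu\quad\forall\ C\in\Sigma\times B(E).
\]

PROOF For any $\nu$-integrable or positive and $\Sigma\times B(E)$-measurable function $h:\Omega\times E\to\mathbb{R}$, we have
\[
\int_{\Omega\times E}h(z,x)\,d\nu=\int_\Omega\int_E h(z,x)\,\lambda(z)(dx)\,d\mu,
\]
where $\nu=\psi(\lambda)$. First we show that the map is injective. So let $\lambda_1,\lambda_2\in R(\Omega,E)$ and suppose that $\psi(\lambda_1)=\psi(\lambda_2)$. Then for $A\in\Sigma$ and $\eta\in C_0(E)$, we set
\[
h(z,x)\stackrel{df}{=}\chi_A(z)\eta(x)\quad\forall\ (z,x)\in\Omega\times E.
\]
We have
\[
\int_A\int_E\eta(x)\,\lambda_1(z)(dx)\,d\mu=\int_A\int_E\eta(x)\,\lambda_2(z)(dx)\,d\mu.
\]
Because $A\in\Sigma$ was arbitrary, we obtain
\[
\int_E\eta(x)\,\lambda_1(z)(dx)=\int_E\eta(x)\,\lambda_2(z)(dx)\quad\text{for }\mu\text{-a.a. }z\in\Omega\text{ and all }\eta\in C_0(E),
\]
so
\[
\lambda_1(z)=\lambda_2(z)\quad\text{for }\mu\text{-a.a. }z\in\Omega.
\]
Therefore $\psi$ is indeed injective.

Next we show that $\psi$ is surjective. So let $\nu\in Y(\Omega,E,\mu)$. Then for any $\varepsilon>0$, we can find a sequence $\{D_k\}_{k\ge 1}$ of pairwise disjoint Borel subsets of $E$ with
\[
\mathrm{diam}\,D_k<\varepsilon\quad\text{and}\quad\nu\Big(\Omega\times\bigcap_{k=1}^{\infty}D_k^c\Big)=0
\]
and also a sequence $\{U_m\}_{m\ge 1}$ of pairwise disjoint open subsets of $\Omega$ with
\[
\mathrm{diam}\,U_m<\varepsilon,\quad\mu(\partial U_m)=0\quad\text{and}\quad\mu\Big(\Omega\setminus\bigcup_{m=1}^{\infty}U_m\Big)=0.
\]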

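For each $k\ge 1$, we pick $x_k\in D_k$ and then define
\[
\lambda^\varepsilon(z)\stackrel{df}{=}\sum_{k=1}^{\infty}\frac{\nu(U_m\times D_k)}{\mu(U_m)}\,\delta_{x_k}\quad\forall\ z\in U_m,
\]
where $\delta_{x_k}$ is the Dirac measure concentrated on $x_k$. The normalization by $\mu(U_m)$ makes $\lambda^\varepsilon(z)$ a probability measure, since $\sum_{k\ge 1}\nu(U_m\times D_k)=\nu(U_m\times E)=\mu(U_m)$. Then $\lambda^\varepsilon\in R(\Omega,E)$ and let
\[
\nu^\varepsilon=\psi(\lambda^\varepsilon).
\]
For all $C\in\Sigma\times B(E)$, we have
\[
\nu\Big(\bigcup_{\substack{(m,k)\\ U_m\times D_k\subseteq C}}U_m\times D_k\Big)\le\nu^\varepsilon(C)\le\nu\Big(\bigcup_{\substack{(m,k)\\ U_m\times D_k\supseteq C}}\overline{U}_m\times D_k\Big).
\]
Therefore for any open set $V\subseteq\Omega\times E$ with $\nu(\partial V)=0$, we have
\[
\lim_{\varepsilon\searrow 0}\nu^\varepsilon(V)=\nu(V).
\]
Then by the regularity of $\nu$ and the Portmanteau theorem (see Theorem A.2.36), we have that
\[
\nu^\varepsilon\stackrel{w^*}{\longrightarrow}\nu.
\]
On the other hand note that $\{\lambda^\varepsilon\}_{\varepsilon>0}$ is bounded in $L^\infty\big(\Omega;M(E)\big)$ and so by Alaoglu's theorem (see Theorem A.3.9), we may assume that
\[
\lambda^\varepsilon\stackrel{w^*}{\longrightarrow}\lambda\ \text{ in }L^\infty\big(\Omega;M(E)\big).
\]
Then if $\xi\in C_0(\Omega\times E)$, we have
\[
\xi\in L^1\big(\Omega;C_0(E)\big)
\]
and so
\[
\int_{\Omega\times E}\xi(z,x)\,d\nu=\lim_{\varepsilon\searrow 0}\int_{\Omega\times E}\xi(z,x)\,d\nu^\varepsilon=\lim_{\varepsilon\searrow 0}\int_\Omega\int_E\xi(z,x)\,\lambda^\varepsilon(z)(dx)\,d\mu=\int_\Omega\int_E\xi(z,x)\,\lambda(z)(dx)\,d\mu.
\]
Hence $\lambda\in R(\Omega,E)$ and $\nu=\psi(\lambda)$.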

REMARK 3.5.13 Similarly we establish that $\psi$ is a bijection from $SR(\Omega,E)$ onto $SY(\Omega,E,\mu)$. Moreover, if $\nu$ is the Young measure associated to a measurable function $u:\Omega\to E$, then $\psi(\delta_u)=\nu$. Here $\delta_u$ is the Dirac transition probability associated to $u$, i.e.,
\[
\delta_{u(z)}(A)=\begin{cases}1 & \text{if } u(z)\in A,\\ 0 & \text{if } u(z)\notin A\end{cases}\quad\forall\ A\in B(E).
\]
Also the identification obtained in Theorem 3.5.12 implies that the weak$^*$-topology on $Y(\Omega,E,\mu)$ (resulting from the pair of spaces $\big(M(\Omega\times E),C_0(\Omega\times E)\big)$) is equivalent to the weak$^*$-topology on $R(\Omega,E)$ (resulting from the pair $\big(L^\infty(\Omega;M(E)),L^1(\Omega;C_0(E))\big)$). Finally we should mention that Theorem 3.5.12 is also known as the disintegration theorem.
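For finite sets the disintegration of Theorem 3.5.12 reduces to forming conditional probabilities. The sketch below assumes that $\Omega$ and $E$ are finite, that $\nu$ is stored as a dictionary on pairs $(z,x)$, and that $\mu$ is recovered as the first marginal of $\nu$ (so $\nu$ is a Young measure with respect to it); the names `disintegrate` and `reassemble` are illustrative.

```python
# Disintegration of a measure on a finite product Omega x E (cf. Theorem 3.5.12).
# Assumption: mu is taken to be the first marginal of nu, as for a Young measure.

def disintegrate(nu):
    # nu: dict (z, x) -> mass.  Returns the marginal mu on Omega and the
    # transition probability lam, where lam[z] is a probability measure on E.
    mu = {}
    for (z, x), mass in nu.items():
        mu[z] = mu.get(z, 0.0) + mass
    lam = {z: {} for z in mu}
    for (z, x), mass in nu.items():
        if mu[z] > 0.0:
            lam[z][x] = mass / mu[z]
    return mu, lam

def reassemble(mu, lam):
    # the map psi of the theorem: nu({(z, x)}) = mu({z}) * lam(z)({x})
    return {(z, x): mu[z] * p for z, kernel in lam.items() for x, p in kernel.items()}

nu = {("a", 0): 0.125, ("a", 1): 0.375, ("b", 1): 0.5}
mu, lam = disintegrate(nu)
```

In this example `lam["a"]` puts mass $\frac{1}{4}$ on $0$ and $\frac{3}{4}$ on $1$, `lam["b"]` is the Dirac mass at $1$, and `reassemble(mu, lam)` returns the original `nu`, which is the discrete counterpart of $\nu=\psi(\lambda)$.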

The next approximation result is important in many applications.

THEOREM 3.5.14
If $F:\Omega\to 2^E\setminus\{\emptyset\}$ is a multifunction, such that $\mathrm{Gr}\,F\in\Sigma\times B(E)$,
then
\[
R_F(\Omega,E)=\overline{\big\{\delta_u:\ u:\Omega\to E\text{ measurable},\ u(z)\in F(z)\text{ for }\mu\text{-a.a. }z\in\Omega\big\}}^{\,w^*},
\tag{3.131}
\]
where
\[
R_F(\Omega,E)\stackrel{df}{=}\big\{\lambda\in R(\Omega,E):\ \lambda(z)\big(F(z)\big)=1\ \text{for a.a. }z\in\Omega\big\}
\]
and the closure is taken with respect to the weak$^*$-topology on $L^\infty\big(\Omega;M(E)\big)$.

PROOF In what follows by $M^1_+(E)_{w^*}$ (respectively $M^1_+(E)_n$) we denote the space $M^1_+(E)$ furnished with the relative weak$^*$-topology of $M(E)$ (respectively the narrow topology); see Definition 2.3.42 and Remark 2.3.43. We know that $M^1_+(E)_n$ is a Polish space and because the narrow topology is stronger than the weak$^*$-topology, we infer that $M^1_+(E)_{w^*}$ is a Souslin space (see Definition A.2.29(b)). So the Borel $\sigma$-fields of $M^1_+(E)_n$ and $M^1_+(E)_{w^*}$ are equal and we simply write $B\big(M^1_+(E)\big)$. Now let
\[
S(z)\stackrel{df}{=}\big\{\lambda\in M^1_+(E):\ \lambda\big(F(z)\big)=1\big\}.
\]
We claim that
\[
\mathrm{Gr}\,S\in\Sigma\times B\big(M^1_+(E)\big).
\]
So let $C\in\Sigma\times B(E)$ and consider the map $\varphi_C:\Omega\times M^1_+(E)\to\mathbb{R}$, defined by
\[
\varphi_C(z,\lambda)=\big(\delta_z\otimes\lambda\big)(C)\quad\forall\ (z,\lambda)\in\Omega\times M^1_+(E).
\]


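Here by $\delta_z$ we denote the Dirac measure concentrated at $z \in \Omega$. Let $\mathcal{T}$ be the collection of all sets $C \in \Sigma \times \mathcal{B}(E)$, such that $\varphi_C$ is $\Sigma \times \mathcal{B}\bigl(M_+^1(E)\bigr)$-measurable. For $A \in \Sigma$ and $B \in \mathcal{B}(E)$, we have
\[ \varphi_{A \times B}(z, \lambda) = \chi_A(z)\,\lambda(B) = \chi_A(z)\,\vartheta_B(\lambda), \]
where
\[ \vartheta_B(\lambda) = \lambda(B) \qquad \forall\, \lambda \in M_+^1(E). \]
We show that $\vartheta_B$ is a Borel map on $M_+^1(E)$. Let $\mathcal{B}_1$ be all those sets $G \in \mathcal{B}(E)$, such that $\vartheta_G$ is $\mathcal{B}\bigl(M_+^1(E)\bigr)$-measurable. If $G$ is open, then from the Portmanteau theorem (see Theorem A.2.36), we know that the map
\[ \lambda \longmapsto \vartheta_G(\lambda) \]
is lower semicontinuous on $M_+^1(E)_n$. If $U$ is an open set in $E$ and $K$ is a closed set in $E$, then
\[ \vartheta_{U \cap K} = \vartheta_U - \vartheta_{U \cap K^c}, \]
so the map $\lambda \longmapsto \vartheta_{U \cap K}(\lambda)$ is Borel. Hence if $\{U_k\}_{k=1}^N$ are open sets in $E$ and $\{K_k\}_{k=1}^N$ are closed sets in $E$, then
\[ \bigcup_{k=1}^N \bigl(U_k \cap K_k\bigr) \in \mathcal{B}_1 \]
and so $\mathcal{B}_1$ is a monotone class. From the monotone class theorem, it follows that $\mathcal{B}_1 = \mathcal{B}(E)$. So $\vartheta_G$ is $\mathcal{B}\bigl(M_+^1(E)\bigr)$-measurable for all $G \in \mathcal{B}(E)$. Therefore the map
\[ (z, \lambda) \longmapsto \varphi_{A \times B}(z, \lambda) = \chi_A(z)\,\vartheta_B(\lambda) \]
is $\Sigma \times \mathcal{B}\bigl(M_+^1(E)\bigr)$-measurable for all $A \in \Sigma$ and all $B \in \mathcal{B}(E)$, hence $A \times B \in \mathcal{T}$. Clearly $\mathcal{T}$ is a monotone class and so it follows that $\mathcal{T} = \Sigma \times \mathcal{B}(E)$. We have
\[ \bigl\{(z, \lambda) \in \Omega \times M_+^1(E) : (\delta_z \otimes \lambda)(\operatorname{Gr} F) = 1\bigr\} = \varphi_{\operatorname{Gr} F}^{-1}\bigl(\{1\}\bigr) = \operatorname{Gr} S \in \Sigma \times \mathcal{B}\bigl(M_+^1(E)\bigr). \]
Let $D$ be the subset of $M_+^1(E)$ consisting of the Dirac measures and introduce
\[ S_e(z) \stackrel{df}{=} \bigl\{\lambda \in D : \lambda\bigl(F(z)\bigr) = 1\bigr\}. \]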

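Then
\[ \operatorname{ext} S(z) = S_e(z) \quad \text{and} \quad \operatorname{Gr} S_e \in \Sigma \times \mathcal{B}\bigl(M_+^1(E)\bigr). \]
Using Remark 3.4.19, we have
\[ \overline{S^\infty_{S_e}}^{\,w^*} = S^\infty_{\overline{\operatorname{conv}}^{\,*} S_e} = S^\infty_S, \]
so we obtain (3.131).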


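Exploiting the identification of $\mathcal{R}(\Omega, E)$ with $\mathcal{Y}(\Omega, E, \mu)$ we obtain the following corollary of Theorem 3.5.14.

COROLLARY 3.5.15 The Young measures associated to measurable functions are dense in the space $\mathcal{Y}(\Omega, E, \mu)$ for the weak$^*$-topology on $M(\Omega \times E)$ (or equivalently by Theorem 3.5.12 for the weak$^*$-topology on $L^\infty(\Omega; M(E))$).

In the previous section we introduced the following classes of $\Sigma \times \mathcal{B}(E)$-measurable functions $f : \Omega \times E \to \overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$ (called integrands): the Carathéodory integrands and the normal integrands. Next we introduce further specifications of these classes, which will be helpful in our analysis.

DEFINITION 3.5.16
(a) \[ \mathcal{N}(\Omega, \Sigma, E) \stackrel{df}{=} \bigl\{ f : \Omega \times E \to \overline{\mathbb{R}} : f \text{ is } \Sigma \times \mathcal{B}(E)\text{-measurable and } f(z, \cdot) \text{ is lower semicontinuous} \bigr\}, \]
i.e., $\mathcal{N}(\Omega, \Sigma, E)$ is the set of all normal integrands; see also Definition 3.4.8.
(b) \[ \mathcal{N}_+(\Omega, \Sigma, E) \stackrel{df}{=} \bigl\{ f \in \mathcal{N}(\Omega, \Sigma, E) : f \geq 0 \bigr\}, \]
i.e., $\mathcal{N}_+(\Omega, \Sigma, E)$ is the set of positive normal integrands.
(c) \[ \widehat{K}_b(\Omega, \Sigma, E) \stackrel{df}{=} \bigl\{ f \in \mathcal{N}(\Omega, \Sigma, E) : f(z, \cdot) \in C_b(E) \text{ for } \mu\text{-a.a. } z \in \Omega \text{ and the map } z \longmapsto \|f(z, \cdot)\|_{L^\infty(E)} \text{ belongs in } L^1(\Omega) \bigr\}, \]
i.e., $\widehat{K}_b(\Omega, \Sigma, E)$ is the set of all $C_b$-Carathéodory integrands.
(d) \[ \widehat{K}_0(\Omega, \Sigma, E) \stackrel{df}{=} \bigl\{ f \in \mathcal{N}(\Omega, \Sigma, E) : f(z, \cdot) \in C_0(E) \text{ for } \mu\text{-a.a. } z \in \Omega \text{ and the map } z \longmapsto \|f(z, \cdot)\|_{L^\infty(E)} \text{ belongs in } L^\infty(\Omega) \bigr\}, \]
i.e., $\widehat{K}_0(\Omega, \Sigma, E)$ is the set of all $C_0$-Carathéodory integrands.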

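From Proposition 3.4.9, we obtain the following fact.

PROPOSITION 3.5.17 If $f \in \mathcal{N}_+(\Omega, \Sigma, E)$, then there exists a sequence $\{f_n\}_{n \geq 1} \subseteq \widehat{K}_0(\Omega, \Sigma, E)$, such that $f_n \nearrow f$.

PROPOSITION 3.5.18 If $f \in \mathcal{N}_+(\Omega, \Sigma, E)$, then the map
\[ \nu \longmapsto \int_{\Omega \times E} f\,d\nu \]
is $w^*$-lower semicontinuous on $\mathcal{SY}(\Omega, E, \mu)$.

PROOF By virtue of Proposition 3.5.17, we can find a sequence $\{f_n\}_{n \geq 1} \subseteq \widehat{K}_0(\Omega, \Sigma, E)$, such that $f_n \nearrow f$. By the monotone convergence theorem (see Theorem A.2.10), we have
\[ \int_{\Omega \times E} f\,d\nu = \lim_{n \to +\infty} \int_{\Omega \times E} f_n\,d\nu. \]
Since
\[ f_n \in L^1\bigl(\Omega; C_0(E)\bigr) \qquad \forall\, n \geq 1, \]
the map
\[ \nu \longmapsto \int_{\Omega \times E} f_n\,d\nu \]
is $w^*$-continuous on $\mathcal{SY}(\Omega, E, \mu)$ (see Remark 3.5.13). Therefore the map
\[ \nu \longmapsto \int_{\Omega \times E} f\,d\nu, \]
being the supremum of these $w^*$-continuous maps, is $w^*$-lower semicontinuous on $\mathcal{SY}(\Omega, E, \mu)$.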

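PROPOSITION 3.5.19 If $\{u_n : \Omega \to E\}_{n \geq 1}$ is a sequence of $\Sigma$-measurable functions, $\{\nu_n\}_{n \geq 1}$ is a sequence of Young measures associated with $\{u_n\}_{n \geq 1}$ (i.e., $\nu_n = \delta_{u_n}$ for $n \geq 1$) and
\[ \nu_n \xrightarrow{w^*} \nu \quad \text{in } \mathcal{SY}(\Omega, E, \mu), \]
for some $\nu \in \mathcal{SY}(\Omega, E, \mu)$, then for $\mu$-almost all $z \in \Omega$, the subprobability measure $\nu(z)$ is supported by
\[ \limsup_{n \to +\infty} \{u_n(z)\} = \bigcap_{n=1}^\infty \overline{\{u_k(z) : k \geq n\}}. \]

PROOF We define
\[ F_k(z) \stackrel{df}{=} \overline{\{u_n(z)\}_{n \geq k}} \qquad \forall\, z \in \Omega,\ k \geq 1 \]
and
\[ f_k(z, x) = i_{F_k(z)}(x) = \begin{cases} 0 & \text{if } x \in F_k(z), \\ +\infty & \text{otherwise}. \end{cases} \]
Evidently $f_k \in \mathcal{N}_+(\Omega, \Sigma, E)$ and by virtue of Proposition 3.5.18, we have
\[ 0 \leq \int_{\Omega \times E} f_k(z, x)\,d\nu \leq \liminf_{n \to +\infty} \int_{\Omega \times E} f_k(z, x)\,d\nu_n = \liminf_{n \to +\infty} \int_\Omega f_k\bigl(z, u_n(z)\bigr)\,d\mu = 0, \]
so
\[ \int_{\Omega \times E} f_k(z, x)\,d\nu = 0. \]
From Remark 3.5.13, we have
\[ \int_\Omega \int_E f_k(z, x)\,\nu(z)(dx)\,d\mu = 0, \]
so
\[ \int_E f_k(z, x)\,\nu(z)(dx) = 0 \qquad \text{for } \mu\text{-a.a. } z \in \Omega \]
and thus $\nu(z)$ is supported by $F_k(z)$ for $\mu$-almost all $z \in \Omega$. Because $k \geq 1$ was arbitrary, we obtain the conclusion of the proposition.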


When we examined the space of measures $M(E) = C_0(E)^*$, in addition to the weak$^*$-topology, we considered also a finer topology called the narrow topology, namely the $w\bigl(M(E), C_b(E)\bigr)$-topology. Exploiting the identification of $\mathcal{Y}(\Omega, E, \mu)$ with $\mathcal{R}(\Omega, E)$, we do the same thing for the space $\mathcal{Y}(\Omega, E, \mu)$ of Young measures. We introduce a finer topology, which will lead to some powerful compactness results.

DEFINITION 3.5.20 The narrow topology on $\mathcal{Y}(\Omega, E, \mu)$ is the weakest topology which makes continuous the linear functionals of the form
\[ \nu \longmapsto \int_{\Omega \times E} f\,d\nu \qquad \forall\, f \in \widehat{K}_b(\Omega, \Sigma, E). \]
We say that the sequence $\{\nu_n\}_{n \geq 1}$ converges narrowly to $\nu$, if
\[ \lim_{n \to +\infty} \int_{\Omega \times E} f\,d\nu_n = \int_{\Omega \times E} f\,d\nu \qquad \forall\, f \in \widehat{K}_b(\Omega, \Sigma, E) \]
and we write
\[ \nu_n \xrightarrow{n} \nu. \]

REMARK 3.5.21 If $E$ is compact, then the narrow and weak$^*$-topologies coincide since $\widehat{K}_b(\Omega, \Sigma, E) = \widehat{K}_0(\Omega, \Sigma, E)$.

PROPOSITION 3.5.22 If $\{u_n : \Omega \to E\}_{n \geq 1}$ is a sequence of $\Sigma$-measurable functions, $u : \Omega \to E$ is a $\Sigma$-measurable function, $\{\nu_n\}_{n \geq 1}$ and $\nu$ are the Young measures associated to the functions $\{u_n\}_{n \geq 1}$ and $u$ respectively, then
\[ u_n \xrightarrow{\mu} u \iff \nu_n \xrightarrow{n} \nu. \]

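PROOF "$\Longrightarrow$": Since
\[ u_n \xrightarrow{\mu} u, \]
we can extract a subsequence $\{u_{n_k}\}_{k \geq 1}$, such that $u_{n_k}(z) \to u(z)$ for $\mu$-a.a. $z \in \Omega$. Then for all $f \in \widehat{K}_b(\Omega, \Sigma, E)$, by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
\[ \lim_{k \to +\infty} \int_\Omega f\bigl(z, u_{n_k}(z)\bigr)\,d\mu = \int_\Omega f\bigl(z, u(z)\bigr)\,d\mu, \]
so
\[ \lim_{k \to +\infty} \int_{\Omega \times E} f(z, x)\,d\nu_{n_k} = \int_{\Omega \times E} f(z, x)\,d\nu \qquad \forall\, f \in \widehat{K}_b(\Omega, \Sigma, E) \]
and thus
\[ \nu_{n_k} \xrightarrow{n} \nu \quad \text{as } k \to +\infty, \text{ in } \mathcal{Y}(\Omega, E, \mu). \]
Since every subsequence of $\{\nu_n\}_{n \geq 1}$ has a further subsequence converging narrowly to $\nu$, we conclude that
\[ \nu_n \xrightarrow{n} \nu \quad \text{in } \mathcal{Y}(\Omega, E, \mu). \]

"$\Longleftarrow$": Let
\[ f(z, x) \stackrel{df}{=} \min\bigl\{1, d_E\bigl(x, u(z)\bigr)\bigr\}, \]
where $d_E$ denotes the metric on $E$. Clearly $f \in \widehat{K}_b(\Omega, \Sigma, E)$. So
\[ \lim_{n \to +\infty} \int_{\Omega \times E} f(z, x)\,d\nu_n = \int_{\Omega \times E} f(z, x)\,d\nu, \]
hence
\[ \lim_{n \to +\infty} \int_\Omega f\bigl(z, u_n(z)\bigr)\,d\mu = \int_\Omega f\bigl(z, u(z)\bigr)\,d\mu = 0. \tag{3.132} \]
For a given $\varepsilon \in (0, 1]$ (which suffices for convergence in measure), let
\[ M_{\varepsilon, n} \stackrel{df}{=} \bigl\{ z \in \Omega : d_E\bigl(u_n(z), u(z)\bigr) \geq \varepsilon \bigr\} \qquad \forall\, n \geq 1. \]
We have
\[ \varepsilon\,\mu(M_{\varepsilon, n}) \leq \int_{M_{\varepsilon, n}} f\bigl(z, u_n(z)\bigr)\,d\mu \leq \int_\Omega f\bigl(z, u_n(z)\bigr)\,d\mu, \]
so, from (3.132), we have
\[ \lim_{n \to +\infty} \mu(M_{\varepsilon, n}) = 0 \]
and thus
\[ u_n \xrightarrow{\mu} u. \]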
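The estimate $\varepsilon\,\mu(M_{\varepsilon,n}) \leq \int_\Omega f(z, u_n(z))\,d\mu$ used in the implication "$\Longleftarrow$" can be checked numerically. The following Python sketch is only an illustration under stated assumptions: $\Omega = [0,1]$ with the Lebesgue measure, $E = \mathbb{R}$, limit $u = 0$, and a hypothetical sequence $u_n = n\chi_{[0,1/n)}$; the names `spike`, `GRID` and the midpoint quadrature are introduced here and do not come from the text.

```python
def measure_of_bad_set(u_n, u, eps, grid):
    """Approximate mu({z : |u_n(z) - u(z)| >= eps}) by counting grid midpoints."""
    return sum(1 for z in grid if abs(u_n(z) - u(z)) >= eps) / len(grid)


def truncated_distance_integral(u_n, u, grid):
    """Approximate the integral of f(z, u_n(z)), where f(z, x) = min(1, |x - u(z)|)."""
    return sum(min(1.0, abs(u_n(z) - u(z))) for z in grid) / len(grid)


def spike(n):
    """Hypothetical example: u_n = n on [0, 1/n) and 0 elsewhere."""
    return lambda z: float(n) if z < 1.0 / n else 0.0


def zero(z):
    """The limit function u = 0."""
    return 0.0


# Midpoints of 10000 equal cells of [0, 1].
GRID = [(i + 0.5) / 10000 for i in range(10000)]

if __name__ == "__main__":
    eps = 0.5
    for n in (2, 10, 100):
        bad = measure_of_bad_set(spike(n), zero, eps, GRID)
        integral = truncated_distance_integral(spike(n), zero, GRID)
        # The last column checks eps * mu(M_{eps,n}) <= integral of f(z, u_n(z)).
        print(n, bad, integral, eps * bad <= integral + 1e-12)
```

In this example the $L^1$-norms of $u_n$ stay equal to 1, yet both printed quantities tend to 0, which is consistent with convergence in measure being characterized by the bounded test integrand $f$ rather than by norm convergence.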


By the Alexandrov one-point compactification (see Theorem A.1.3 and Remark A.1.4), we can find a compact metric space $\widehat{E}$, such that $E$ is a dense subset of $\widehat{E}$.

PROPOSITION 3.5.23 If $f \in \mathcal{N}_+(\Omega, \Sigma, E)$, then
(a) there exists a sequence $\{\widehat{f}_n\}_{n \geq 1} \subseteq \widehat{K}_0\bigl(\Omega, \Sigma, \widehat{E}\bigr)$, such that $\widehat{f}_n \nearrow f$ on $\Omega \times E$;
(b) the function $\nu \longmapsto \int_{\Omega \times E} f\,d\nu$ is narrowly lower semicontinuous.

PROOF (a) Let $d_{\widehat{E}}$ be the metric of $\widehat{E}$ which extends $d_E$. Then
\[ \widehat{f}_n(z, x) = \inf_{y \in E} \bigl[ f(z, y) + n\,d_{\widehat{E}}(y, x) \bigr] \qquad \forall\, n \geq 1,\ (z, x) \in \Omega \times \widehat{E} \]
is the desired sequence.
(b) Follows from Proposition 3.5.18 since the narrow topology is finer than the weak$^*$-topology.

PROPOSITION 3.5.24 The narrow topology on $\mathcal{Y}(\Omega, \Sigma, E)$ is the weakest topology $\tau$, such that the map $\nu \longmapsto \int_{\Omega \times E} \widehat{f}\,d\nu$ is continuous for all $\widehat{f} \in \widehat{K}_b(\Omega, \Sigma, \widehat{E})$.

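PROOF Note that $\widehat{K}_b(\Omega, \Sigma, \widehat{E}) \subseteq \widehat{K}_b(\Omega, \Sigma, E)$. So a priori the $\tau$-topology is weaker than the narrow topology. Next let $f \in \widehat{K}_b(\Omega, \Sigma, E)$ and introduce
\[ \beta(z) \stackrel{df}{=} \|f(z, \cdot)\|_{L^\infty(E)} \quad \text{and} \quad g(z, x) \stackrel{df}{=} f(z, x) + \beta(z). \]
Evidently $g \in \mathcal{N}_+(\Omega, \Sigma, E)$ and so by Proposition 3.5.23, we have that the map $\nu \longmapsto \int_{\Omega \times E} g\,d\nu$ is $\tau$-lower semicontinuous. Since
\[ \int_{\Omega \times E} f\,d\nu = \int_{\Omega \times E} g\,d\nu - \int_\Omega \beta\,d\mu, \]
we infer that the map $\nu \longmapsto \int_{\Omega \times E} f\,d\nu$ is $\tau$-lower semicontinuous.

If we repeat the above argument with $f$ replaced by $-f$, we reach the desired conclusion.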
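The regularization $\widehat{f}_n(z,x) = \inf_{y}[f(z,y) + n\,d(y,x)]$ from the proof of Proposition 3.5.23(a) can be visualized in a simple case. The Python sketch below is an illustration under simplifying assumptions that are not in the text: $E = \mathbb{R}$ with the Euclidean distance (instead of the metric of the compactification), an integrand independent of $z$, and an infimum taken over a finite sample `YS`. The names `lipschitz_regularization`, `step` and `YS` are hypothetical.

```python
def lipschitz_regularization(f, n, x, ys):
    """Return the minimum over the sample points ys of f(y) + n * |y - x|."""
    return min(f(y) + n * abs(y - x) for y in ys)


def step(y):
    """A nonnegative lower semicontinuous function: 0 on (-inf, 0], 1 on (0, +inf)."""
    return 0.0 if y <= 0 else 1.0


# Sample points in [-2, 2] with spacing 1/1000; they stand in for the infimum over E.
YS = [i / 1000 for i in range(-2000, 2001)]

if __name__ == "__main__":
    for x in (-0.5, 0.0, 0.05, 0.5):
        values = [lipschitz_regularization(step, n, x, YS) for n in (1, 2, 10, 100)]
        # Each row is nondecreasing in n and bounded above by step(x).
        print(x, values, step(x))
```

For this particular `step` the regularization equals $\min\{1, n\max(x,0)\}$, an $n$-Lipschitz function that increases to `step` pointwise as $n$ grows, which is the behavior the proof relies on.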


PROPOSITION 3.5.25 If on $\widehat{K}_0(\Omega, \Sigma, E)$ we introduce the equivalence relation
\[ f_1 \sim f_2 \iff \mu\bigl(\{ z \in \Omega : f_1(z, \cdot) \neq f_2(z, \cdot) \}\bigr) = 0, \]
then $K_0(\Omega, \Sigma, E) = \widehat{K}_0(\Omega, \Sigma, E)/\!\sim$ is in bijection with $L^1\bigl(\Omega; C_0(E)\bigr)$.

PROOF For each $f \in K_0(\Omega, \Sigma, E)$, let
\[ \psi(f)(z) \stackrel{df}{=} f(z, \cdot). \]
Then $\psi(f)(z) \in C_0(E)$. Also the map $\Omega \ni z \longmapsto \psi(f)(z) \in C_0(E)$ is measurable. To see this let $\{x_n\}_{n \geq 1}$ be dense in $E$. Then for all $h \in C_0(E)$, we have
\[ \|\psi(f)(z) - h\|_{C_0(E)} = \sup_{n \geq 1} \bigl| f(z, x_n) - h(x_n) \bigr|, \]
so the map
\[ z \longmapsto \|\psi(f)(z) - h\|_{C_0(E)} \]
is $\Sigma$-measurable for all $h \in C_0(E)$. Since the space $C_0(E)$ is separable, it follows that the map $\Omega \ni z \longmapsto \psi(f)(z) \in C_0(E)$ is strongly measurable (see Corollary 2.1.4). Moreover,
\[ \int_\Omega \|\psi(f)(z)\|_{C_0(E)}\,d\mu < +\infty, \]
i.e., $\psi(f) \in L^1\bigl(\Omega; C_0(E)\bigr)$. Therefore
\[ \psi : K_0(\Omega, \Sigma, E) \longrightarrow L^1\bigl(\Omega; C_0(E)\bigr) \]
is injective. Moreover, if $g \in L^1\bigl(\Omega; C_0(E)\bigr)$, then $g = \psi(f)$ with
\[ f(z, x) \stackrel{df}{=} g(z)(x), \]
so $\psi$ is bijective.

REMARK 3.5.26 The above proposition permits the identification of $K_0(\Omega, \Sigma, E)$ with $L^1\bigl(\Omega; C_0(E)\bigr)$, which is very convenient because of the identification of $\mathcal{Y}(\Omega, E, \mu)$ with $\mathcal{R}(\Omega, E)$.


We want to investigate the compact sets in $\mathcal{Y}(\Omega, E, \mu)$. For this purpose it is useful to recall the basic results characterizing the compact sets of the space $M(E)$ furnished with the narrow topology, i.e., the $w\bigl(M(E), C_b(E)\bigr)$-topology. If $E$ is compact, this topology coincides with the $w^*$-topology.

THEOREM 3.5.27 (Prohorov Theorem) If $Y$ is a Polish space (see Definition A.2.29(a)) and $C \subseteq M(Y)_+$ is a bounded set, then $C$ is relatively compact in the narrow topology if and only if $C$ is uniformly tight, i.e., for every $\varepsilon > 0$, we can find a compact set $K_\varepsilon \subseteq Y$, such that
\[ \sup_{\lambda \in C} \lambda\bigl(K_\varepsilon^c\bigr) \leq \varepsilon. \]

We have the following characterization of uniformly tight sets in $M(Y)_+$.

PROPOSITION 3.5.28 If $Y$ is a Polish space and $C \subseteq M(Y)_+$ is nonempty, then $C$ is uniformly tight if and only if there exists $\psi : Y \to \overline{\mathbb{R}}_+$, such that
(a) the set $\{ y \in Y : \psi(y) \leq t \}$ is compact for every $t \geq 0$ (i.e., $\psi$ is inf-compact); and
(b) $\displaystyle \sup_{\lambda \in C} \int_Y \psi(y)\,\lambda(dy) < +\infty$.

PROOF "$\Longrightarrow$": Let $\{K_n\}_{n \geq 1}$ be an increasing sequence of compact sets of $Y$, such that
\[ \lambda(Y \setminus K_n) \leq \frac{1}{2^n} \qquad \forall\, n \geq 1,\ \lambda \in C. \]
Let us set
\[ \psi \stackrel{df}{=} \sum_{n=1}^\infty \chi_{Y \setminus K_n}. \]
We have
\[ \int_Y \psi(y)\,\lambda(dy) = \sum_{n=1}^\infty \int_Y \chi_{Y \setminus K_n}(y)\,\lambda(dy) \leq 1 \qquad \forall\, \lambda \in C. \]
Note that $\psi$ is $\mathbb{N} \cup \{+\infty\}$-valued. So
\[ \{\psi \leq t\} = \{\psi \leq [t]\} = K_{[t]+1}, \]
where $[t]$ denotes the integer part of $t \geq 1$. Therefore $\psi$ is inf-compact.
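The construction of $\psi$ in the proof above can be made concrete. The following Python sketch is an illustration under assumptions that are not part of the text: $Y = \mathbb{R}$, $K_n = [-n, n]$, the family $C$ consists of three centered Gaussian laws with standard deviation at most 1, and the infinite sum is truncated at $n = 50$. The names `psi`, `tail_mass`, `RADII` and `SIGMAS` are hypothetical.

```python
import math


def psi(y, radii):
    """Count the indices n for which y lies outside K_n = [-radii[n], radii[n]]."""
    return sum(1 for r in radii if abs(y) > r)


def tail_mass(n, s):
    """Mass of the complement of K_n = [-n, n] under the Gaussian law N(0, s^2)."""
    return math.erfc(n / (s * math.sqrt(2.0)))


RADII = [float(n) for n in range(1, 51)]  # truncation of the sum at n = 50
SIGMAS = [0.25, 0.5, 1.0]                 # hypothetical family C of Gaussian laws

if __name__ == "__main__":
    for s in SIGMAS:
        tails = [tail_mass(n, s) for n in range(1, 51)]
        # First flag: the tail condition lambda(Y minus K_n) <= 2^{-n} holds.
        # Last value: the integral of psi, which is the sum of the tails.
        print(s, all(t <= 2.0 ** -n for n, t in enumerate(tails, start=1)), sum(tails))
    for t in (0.0, 0.7, 1.0, 2.5):
        bound = math.floor(t) + 1
        # The sublevel set {psi <= t} should be exactly K_{[t]+1} = [-bound, bound].
        print(t, psi(bound, RADII) <= t, psi(bound + 0.001, RADII) <= t)
```

With these choices $\psi(y)$ counts how many of the sets $K_n$ miss $y$, the integrals of $\psi$ stay below 1 uniformly over the family, and each sublevel set is one of the compact intervals $K_{[t]+1}$.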


Motivated by Theorem 3.5.27, we introduce the following definition.

DEFINITION 3.5.29 A set $S \subseteq \mathcal{Y}(\Omega, E, \mu)$ is said to be uniformly tight, if and only if for every $\varepsilon > 0$, there exists a compact set $K_\varepsilon \subseteq E$, such that
\[ \sup_{\nu \in S} \nu\bigl(\Omega \times (E \setminus K_\varepsilon)\bigr) \leq \varepsilon. \]

REMARK 3.5.30 The uniform tightness of $S \subseteq \mathcal{Y}(\Omega, E, \mu)$ is equivalent to saying that $C = \bigl\{\nu \circ \operatorname{proj}_E^{-1}\bigr\}_{\nu \in S}$ is uniformly tight in $M(E)_+$ (see Theorem 3.5.27).

Using the identification of $\mathcal{Y}(\Omega, E, \mu)$ with $\mathcal{R}(\Omega, E)$ and Proposition 3.5.28, we are led to the following characterization of uniformly tight sets in $\mathcal{Y}(\Omega, E, \mu)$.

PROPOSITION 3.5.31 $S \subseteq \mathcal{Y}(\Omega, E, \mu)$ is uniformly tight if and only if there exists an inf-compact function $\psi : E \to \overline{\mathbb{R}}_+ \stackrel{df}{=} \mathbb{R}_+ \cup \{+\infty\}$, such that
\[ \sup_{\nu \in S} \int_\Omega \int_E \psi(x)\,\nu(z)(dx)\,d\mu < +\infty. \]

In the literature there is another notion of uniform tightness for Young measures (or equivalently for transition probabilities).

DEFINITION 3.5.32 A set $S \subseteq \mathcal{Y}(\Omega, E, \mu)$ is said to be uniformly B-tight, if there exists a $\Sigma \times \mathcal{B}(E)$-measurable function $\varphi : \Omega \times E \to \overline{\mathbb{R}}_+$, such that $\varphi(z, \cdot)$ is inf-compact for all $z \in \Omega$, i.e., the set
\[ \{ x \in E : \varphi(z, x) \leq t \} \]
is compact for all $t \geq 0$ (hence $\varphi$ is a normal integrand, i.e., $\varphi \in \mathcal{N}_+(\Omega, \Sigma, E)$), and such that
\[ \sup_{\nu \in S} \int_{\Omega \times E} \varphi(z, x)\,d\nu = \sup_{\nu \in S} \int_\Omega \int_E \varphi(z, x)\,\nu(z)(dx)\,d\mu < +\infty. \]

REMARK 3.5.33 Evidently uniform tightness implies uniform B-tightness. In fact since $E$ is a Polish space, the two notions are equivalent (see Valadier (1975, p. 165)).


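LEMMA 3.5.34 If $S \subseteq \mathcal{Y}(\Omega, E, \mu)$ is uniformly tight, then so is $\overline{S}$ (the closure in the narrow topology; see Definition 3.5.20).

PROOF By virtue of Definition 3.5.29, for a given $\varepsilon > 0$, we can find a compact set $K_\varepsilon \subseteq E$, such that
\[ \sup_{\nu \in S} \nu\bigl(\Omega \times (E \setminus K_\varepsilon)\bigr) \leq \varepsilon. \]
Let us set
\[ \psi(x) \stackrel{df}{=} \chi_{E \setminus K_\varepsilon}(x). \]
Then $\psi \in \mathcal{N}_+(\Omega, \Sigma, E)$ and if $\nu \in \mathcal{Y}(\Omega, E, \mu)$, we have
\[ \int_{\Omega \times E} \psi(x)\,d\nu \leq \varepsilon \iff \nu\bigl(\Omega \times (E \setminus K_\varepsilon)\bigr) \leq \varepsilon. \]
Therefore, if $\nu \in \overline{S}$, by Proposition 3.5.23(b), it follows that
\[ \nu\bigl(\Omega \times (E \setminus K_\varepsilon)\bigr) \leq \varepsilon, \]
hence $\overline{S}$ is uniformly tight too.

The next theorem is the extension of Theorem 3.5.27 to Young measures.

THEOREM 3.5.35 $S \subseteq \mathcal{Y}(\Omega, E, \mu)$ is relatively compact for the narrow topology if and only if $S$ is uniformly tight.

PROOF "$\Longrightarrow$": Let $\varphi : \mathcal{Y}(\Omega, E, \mu) \to M(E)_+$ be defined by
\[ \varphi(\nu) \stackrel{df}{=} \nu \circ \operatorname{proj}_E^{-1} \qquad \forall\, \nu \in \mathcal{Y}(\Omega, E, \mu). \]
If $h \in C_0(E)$, then $h \circ \operatorname{proj}_E \in \widehat{K}_b(\Omega, \Sigma, E)$ and so from Remark 3.5.6, we have
\[ \int_E h(x)\,d\bigl(\nu \circ \operatorname{proj}_E^{-1}\bigr) = \int_{\Omega \times E} \bigl(h \circ \operatorname{proj}_E\bigr)(z, x)\,d\nu \]
and thus $\varphi$ is continuous for the narrow topology on $\mathcal{Y}(\Omega, \Sigma, E)$ and on $M(E)_+$. It follows then that $\varphi(S)$ is relatively compact in $M(E)_+$ and so from Theorem 3.5.27, we have that $\varphi(S)$ is uniformly tight in $M(E)_+$. This then implies the uniform tightness of $S$ (see Remark 3.5.30).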


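"$\Longleftarrow$": Let $\{\nu_n\}_{n \geq 1} \subseteq S$ be a sequence and consider $\widehat{\nu}_n \in \mathcal{Y}(\Omega, \widehat{E}, \mu)$, defined by
\[ \widehat{\nu}_n(C) \stackrel{df}{=} \nu_n\bigl(C \cap (\Omega \times E)\bigr) \qquad \forall\, C \in \Sigma \times \mathcal{B}(\widehat{E}),\ n \geq 1. \]
Because the set $\widehat{E}$ is compact, the weak$^*$ and narrow topologies on $\mathcal{Y}(\Omega, \widehat{E}, \mu)$ coincide (see Remark 3.5.21). So because of Theorem 3.5.10, we may assume that $\widehat{\nu}_n \to \widehat{\nu}$ in $\mathcal{Y}(\Omega, \widehat{E}, \mu)$, with the narrow topology. Since $S$ is uniformly tight, for every $m \geq 1$, we can find a compact set $K_m \subseteq E$, such that
\[ \nu_n\bigl(\Omega \times (E \setminus K_m)\bigr) \leq \frac{1}{m} \qquad \forall\, n \geq 1, \]
so
\[ \widehat{\nu}_n\bigl(\Omega \times (E \setminus K_m)\bigr) \leq \frac{1}{m} \qquad \forall\, n \geq 1 \]
and thus
\[ \widehat{\nu}\bigl(\Omega \times (E \setminus K_m)\bigr) \leq \liminf_{n \to +\infty} \widehat{\nu}_n\bigl(\Omega \times (E \setminus K_m)\bigr) \leq \frac{1}{m}. \]
So $\widehat{\nu}$ is supported by $\Omega \times \bigcup_{m=1}^\infty K_m \subseteq \Omega \times E$.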

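Since in the sequel we shall focus on Young measures associated to measurable functions, let us give some examples in this direction.

EXAMPLE 3.5.36 In the examples that follow $\Omega = [0, 1]$, $\mu$ is the Lebesgue measure and $E = \mathbb{R}$.

(a) Consider the Rademacher functions
\[ u_n(z) \stackrel{df}{=} \operatorname{sgn}\bigl(\sin(2^n \pi z)\bigr) \qquad \forall\, n \geq 1, \]
where
\[ \operatorname{sgn} z \stackrel{df}{=} \begin{cases} \dfrac{z}{|z|} & \text{if } z \neq 0, \\ 1 & \text{if } z = 0. \end{cases} \]
So
\[ u_n(z) = \begin{cases} 1 & \text{if } z \in \bigl[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\bigr),\ k \text{ is even}, \\ -1 & \text{otherwise}. \end{cases} \]
We have
\[ u_n \xrightarrow{w^*} 0 \quad \text{in } L^\infty[0, 1]. \]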


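Let $\nu_n$ be the Young measure associated to $u_n$ for $n \geq 1$. We have that
\[ \nu_n \xrightarrow{w^*} \tfrac{1}{2}\bigl(\mu \otimes \delta_1\bigr) + \tfrac{1}{2}\bigl(\mu \otimes \delta_{-1}\bigr). \]
So we see that $w^*$-convergence in $L^\infty(\Omega)$ does not imply the $w^*$-convergence in $M(\Omega \times E)$ of the associated Young measures. Note that the sequence $\{u_n\}_{n \geq 1}$ has no subsequences converging $\mu$-almost everywhere on $\Omega = [0, 1]$ (note that $\|u_n - u_m\|_{L^1[0,1]} = 1$ for $n \neq m$ and compare with Remark 3.5.8).

(b) Let $\{u_n\}_{n \geq 1}$ be the Rademacher functions introduced above. Then define
\[ \widehat{u}_n(z) \stackrel{df}{=} \begin{cases} u_n(z) & \text{if } n \text{ is even}, \\ \tfrac{1}{2} u_n(z) & \text{if } n \text{ is odd}. \end{cases} \]
Clearly we still have
\[ \widehat{u}_n \xrightarrow{w^*} 0 \quad \text{in } L^\infty[0, 1]. \]
However, the sequence $\{\widehat{\nu}_n\}_{n \geq 1}$ of associated Young measures is not convergent, because
\[ \widehat{\nu}_{2n} \xrightarrow{w^*} \tfrac{1}{2}\bigl(\mu \otimes \delta_1\bigr) + \tfrac{1}{2}\bigl(\mu \otimes \delta_{-1}\bigr) \quad \text{in } M(\Omega \times E) \]
and
\[ \widehat{\nu}_{2n+1} \xrightarrow{w^*} \tfrac{1}{2}\bigl(\mu \otimes \delta_{1/2}\bigr) + \tfrac{1}{2}\bigl(\mu \otimes \delta_{-1/2}\bigr) \quad \text{in } M(\Omega \times E). \]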

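(c) Let
\[ u_n(z) \stackrel{df}{=} \sin(nz) \qquad \forall\, z \in [0, 1],\ n \geq 1. \]
We know that
\[ u_n \xrightarrow{w^*} 0 \quad \text{in } L^\infty[0, 1]. \]
On the other hand it can be shown (see Tartar (1979, p. 148)), that if $\{\nu_n\}_{n \geq 1}$ is the sequence of associated Young measures, then
\[ \nu_n \xrightarrow{w^*} \nu \quad \text{in } M(\Omega \times E), \]
where
\[ \nu(A) = \int_A \frac{1}{\pi} \frac{1}{\sqrt{1 - z^2}}\,dz \qquad \text{for all measurable } A \subseteq [0, 1]. \]

(d) Let
\[ u_n(z) \stackrel{df}{=} \vartheta_n \chi_{[0, \frac{1}{n}]}(z) \qquad \forall\, n \geq 1, \]
with $\vartheta_n \in \mathbb{R}$ for $n \geq 1$. No matter what the sequence $\{\vartheta_n\}_{n \geq 1} \subseteq \mathbb{R}$ is, if $\{\nu_n\}_{n \geq 1}$ is the sequence of associated Young measures, we have
\[ \nu_n \xrightarrow{w^*} \mu \otimes \delta_0 \quad \text{in } M(\Omega \times E). \]

(e) Let
\[ u_n(z) = n \qquad \forall\, z \in [0, 1],\ n \geq 1. \]
Also let $\{\nu_n\}_{n \geq 1}$ be the sequence of associated Young measures. Clearly
\[ \nu_n \xrightarrow{w^*} 0 \quad \text{in } M(\Omega \times E). \]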
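The limits in parts (a) and (c) of the example can be probed numerically by pairing the Young measures with a test integrand $\xi(z,x)$, since $\int \xi\,d\nu_n = \int_0^1 \xi(z, u_n(z))\,dz$. The Python sketch below is only an illustration: the test integrand `xi`, the midpoint quadrature, and the reading of the limit in (c) as the arcsine density $1/(\pi\sqrt{1-x^2})$ on $[-1,1]$ for every $z$ are assumptions made here, and all function names are hypothetical.

```python
import math


def rademacher(n):
    """Piecewise form from part (a): +1 on [k/2^n, (k+1)/2^n) for even k, else -1."""
    return lambda z: 1.0 if int(z * 2 ** n) % 2 == 0 else -1.0


def pairing(u, xi, m=2 ** 16):
    """Midpoint approximation of the integral over [0, 1] of xi(z, u(z)) dz."""
    return sum(xi((i + 0.5) / m, u((i + 0.5) / m)) for i in range(m)) / m


def arcsine_average(h, m=2 ** 16):
    """Integral of h(x) / (pi * sqrt(1 - x^2)) over [-1, 1], using x = sin(theta)."""
    return sum(h(math.sin(-math.pi / 2 + (i + 0.5) * math.pi / m)) for i in range(m)) / m


def xi(z, x):
    """Hypothetical continuous test integrand on [0, 1] x [-1, 1]."""
    return (1.0 + z) * x + z * x * x


if __name__ == "__main__":
    # Part (a): compare with the pairing against (mu x delta_1 + mu x delta_{-1}) / 2.
    limit_a = pairing(lambda z: 0.0, lambda z, _: 0.5 * (xi(z, 1.0) + xi(z, -1.0)))
    for n in (2, 6, 10):
        print("rademacher", n, pairing(rademacher(n), xi), limit_a)
    # Part (c): second moment of sin(nz) versus second moment of the arcsine law.
    square = lambda z, x: x * x
    for n in (10, 100, 1000):
        print("sine", n, pairing(lambda z: math.sin(n * z), square),
              arcsine_average(lambda x: x * x))
```

For the Rademacher functions the pairing differs from the limit value $1/2$ by exactly $2^{-(n+1)}$ for this `xi`, and for $\sin(nz)$ the second moments approach $1/2$, which is the second moment of the arcsine law.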

REMARK 3.5.37 In Examples 3.5.36(a), (b) and (c), the functions $u_n$ have values in $[-1, 1]$, so we can take $E = [-1, 1]$ and then from Theorem 3.5.10, it follows that the sequence $\{\nu_n\}_{n \geq 1}$ is relatively $w^*$-compact in $M(\Omega \times E)$. In Examples 3.5.36(d) and (e), we cannot assume that $E$ is compact. Nevertheless the sequence $\{\nu_n\}_{n \geq 1}$ is uniformly tight, thus relatively compact for the narrow topology (hence for the weak$^*$-topology too; see Theorem 3.5.35).

As we already mentioned, in the remaining part of this section, we look at Young measures associated to measurable functions. So in what follows $u_n : \Omega \to E$ are $\Sigma$-measurable functions and $\{\nu_n\}_{n \geq 1}$ is the sequence of corresponding Young measures. Following Definition 3.5.32, we introduce the following notion.

DEFINITION 3.5.38 The sequence $\{u_n\}_{n \geq 1}$ is uniformly tight, if the sequence $\{\nu_n\}_{n \geq 1}$ is uniformly tight (in the sense of Definition 3.5.32).

REMARK 3.5.39 Since $\nu_n(z) = \delta_{u_n(z)}$, according to the above definition, the sequence $\{u_n\}_{n \geq 1}$ is uniformly tight, if for every $\varepsilon > 0$, we can find a compact set $K_\varepsilon \subseteq E$, such that
\[ \sup_{n \geq 1} \mu\bigl(\{ z \in \Omega : u_n(z) \notin K_\varepsilon \}\bigr) \leq \varepsilon. \]
By Proposition 3.5.31, equivalently we can say that there exists an inf-compact function $\psi : E \to \overline{\mathbb{R}}_+$, such that
\[ \sup_{n \geq 1} \int_\Omega \psi\bigl(u_n(z)\bigr)\,d\mu < +\infty. \]
In particular then if $E = \mathbb{R}^N$ and we take $\psi(x) = \|x\|_{\mathbb{R}^N}$, then we see that every bounded sequence $\{u_n\}_{n \geq 1} \subseteq L^1\bigl(\Omega; \mathbb{R}^N\bigr)$ is uniformly tight.
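The last claim can also be seen directly from the Chebyshev-Markov inequality. As a short worked derivation (the constants $M$ and $R$ are names introduced here for illustration), assume $\sup_{n \geq 1} \|u_n\|_{L^1(\Omega; \mathbb{R}^N)} \leq M$. Then:

```latex
\mu\bigl(\{ z \in \Omega : \|u_n(z)\|_{\mathbb{R}^N} > R \}\bigr)
  \;\leq\; \frac{1}{R} \int_\Omega \|u_n(z)\|_{\mathbb{R}^N}\, d\mu
  \;\leq\; \frac{M}{R}
  \qquad \forall\, n \geq 1,\ R > 0 .
```

Choosing $R = M/\varepsilon$ and the compact ball $K_\varepsilon = \{x \in \mathbb{R}^N : \|x\|_{\mathbb{R}^N} \leq R\}$ gives the tightness estimate of the remark.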


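THEOREM 3.5.40 If $\nu_n \xrightarrow{n} \nu$ and $f \in \mathcal{N}(\Omega, \Sigma, E)$ is such that the sequence $\bigl\{f^-\bigl(\cdot, u_n(\cdot)\bigr)\bigr\}_{n \geq 1}$ is uniformly integrable in $L^1(\Omega)$, then
(a) if $\displaystyle \liminf_{n \to +\infty} \int_\Omega f\bigl(z, u_n(z)\bigr)\,d\mu < +\infty$, then $\displaystyle \int_{\Omega \times E} f^+(z, x)\,d\nu < +\infty$;
(b) $\displaystyle \int_{\Omega \times E} f(z, x)\,d\nu \leq \liminf_{n \to +\infty} \int_\Omega f\bigl(z, u_n(z)\bigr)\,d\mu$.

PROOF (a) Fix $c \geq 0$ and let
\[ f_c \stackrel{df}{=} \max\{-c, f\}. \]
From Proposition 3.5.23(b), we have
\[ \int_{\Omega \times E} \bigl(f_c(z, x) + c\bigr)\,d\nu \leq \liminf_{n \to +\infty} \int_\Omega \bigl(f_c(z, u_n(z)) + c\bigr)\,d\mu, \]
so
\[ \int_{\Omega \times E} f_c(z, x)\,d\nu \leq \liminf_{n \to +\infty} \int_\Omega f_c(z, u_n(z))\,d\mu < +\infty. \tag{3.133} \]
If we let $c = 0$, then from (3.133) and the uniform integrability hypothesis, we conclude that the sequence $\bigl\{f^+\bigl(\cdot, u_n(\cdot)\bigr)\bigr\}_{n \geq 1}$ is bounded in $L^1(\Omega)$. This proves (a).

(b) Let
\[ A_{n,c} \stackrel{df}{=} \bigl\{ z \in \Omega : f\bigl(z, u_n(z)\bigr) < -c \bigr\}. \]
Then we have
\[ \int_{A_{n,c}} f\bigl(z, u_n(z)\bigr)\,d\mu \leq 0 \qquad \forall\, n \geq 1, \]
so
\[ \int_{A_{n,c}} f^+\bigl(z, u_n(z)\bigr)\,d\mu - \int_{A_{n,c}} f^-\bigl(z, u_n(z)\bigr)\,d\mu \leq 0 \qquad \forall\, n \geq 1. \]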

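Thus for a given $\varepsilon > 0$, we can find $c > 0$ large enough so that
\[ -\varepsilon \leq \int_{A_{n,c}} f\bigl(z, u_n(z)\bigr)\,d\mu \leq 0. \tag{3.134} \]
Note that
\[ f_c\bigl(z, u_n(z)\bigr) = f\bigl(z, u_n(z)\bigr) \quad \text{on } \Omega \setminus A_{n,c} \qquad \text{and} \qquad f_c\bigl(z, u_n(z)\bigr) = -c \quad \text{on } A_{n,c}. \]
Hence
\begin{align*}
\int_\Omega f\bigl(z, u_n(z)\bigr)\,d\mu
 &= \int_{A_{n,c}} f\bigl(z, u_n(z)\bigr)\,d\mu + \int_{\Omega \setminus A_{n,c}} f\bigl(z, u_n(z)\bigr)\,d\mu \\
 &= \int_{A_{n,c}} f\bigl(z, u_n(z)\bigr)\,d\mu + \int_\Omega f_c\bigl(z, u_n(z)\bigr)\,d\mu - \int_{A_{n,c}} f_c\bigl(z, u_n(z)\bigr)\,d\mu \\
 &\geq \int_{A_{n,c}} f\bigl(z, u_n(z)\bigr)\,d\mu + \int_\Omega f_c\bigl(z, u_n(z)\bigr)\,d\mu
 \;\geq\; \int_\Omega f_c\bigl(z, u_n(z)\bigr)\,d\mu - \varepsilon
\end{align*}
(see (3.134)). So using (3.133) and the fact that $f \leq f_c$, we have
\[ \liminf_{n \to +\infty} \int_\Omega f\bigl(z, u_n(z)\bigr)\,d\mu \geq \liminf_{n \to +\infty} \int_\Omega f_c\bigl(z, u_n(z)\bigr)\,d\mu - \varepsilon \geq \int_{\Omega \times E} f_c(z, x)\,d\nu - \varepsilon \geq \int_{\Omega \times E} f(z, x)\,d\nu - \varepsilon. \]
Let $\varepsilon \searrow 0$ to finish the proof of the theorem.

COROLLARY 3.5.41 If $\nu_n \xrightarrow{n} \nu$, $f_0 : \Omega \times E \to \mathbb{R}$ is $\Sigma \times \mathcal{B}(E)$-measurable, $f_0(z, \cdot) \in C(E)$ for all $z \in \Omega$ and $\bigl\{f_0\bigl(\cdot, u_n(\cdot)\bigr)\bigr\}_{n \geq 1}$ is a sequence of uniformly integrable functions, then $f_0$ is $\nu$-integrable and
\[ \int_{\Omega \times E} f_0(z, x)\,d\nu = \lim_{n \to +\infty} \int_\Omega f_0\bigl(z, u_n(z)\bigr)\,d\mu. \]

PROOF Use Theorem 3.5.40 with $f = f_0$ and $f = -f_0$.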


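COROLLARY 3.5.42 If $\nu_n \xrightarrow{n} \nu$, $h : E \to \mathbb{R}$ is a continuous function and $\bigl\{h\bigl(u_n(\cdot)\bigr)\bigr\}_{n \geq 1}$ is a sequence of uniformly integrable functions, then
(a) for $\mu$-almost all $z \in \Omega$, the function $h$ is $\nu(z)$-integrable and
\[ \int_\Omega \int_E |h(x)|\,\nu(z)(dx)\,d\mu < +\infty; \]
(b) $h(u_n) \xrightarrow{w} \widehat{h}$ in $L^1(\Omega)$, where
\[ \widehat{h}(z) \stackrel{df}{=} \int_E h(x)\,\nu(z)(dx). \]

PROOF (a) Follows from Corollary 3.5.41.

(b) Let $\vartheta \in L^\infty(\Omega)$ and let us set
\[ f_0(z, x) \stackrel{df}{=} \vartheta(z) h(x). \]
We have
\[ \int_A \bigl| f_0\bigl(z, u_n(z)\bigr) \bigr|\,d\mu \leq \|\vartheta\|_\infty \int_A \bigl| h\bigl(u_n(z)\bigr) \bigr|\,d\mu \qquad \forall\, A \in \Sigma, \]
so $\bigl\{f_0\bigl(\cdot, u_n(\cdot)\bigr)\bigr\}_{n \geq 1}$ is a sequence of uniformly integrable functions. We can apply Corollary 3.5.41 and obtain
\[ \int_{\Omega \times E} f_0(z, x)\,d\nu = \lim_{n \to +\infty} \int_\Omega f_0\bigl(z, u_n(z)\bigr)\,d\mu = \lim_{n \to +\infty} \int_\Omega \vartheta(z) h\bigl(u_n(z)\bigr)\,d\mu. \tag{3.135} \]
Because $\bigl\{h\bigl(u_n(\cdot)\bigr)\bigr\}_{n \geq 1}$ is a sequence of uniformly integrable functions, from the Dunford-Pettis theorem (see Theorem 2.3.24), we may assume that
\[ h(u_n) \xrightarrow{w} \widehat{h} \quad \text{in } L^1(\Omega). \]
So from (3.135), we have
\[ \int_{\Omega \times E} f_0(z, x)\,d\nu = \int_\Omega \vartheta(z) \left( \int_E h(x)\,\nu(z)(dx) \right) d\mu = \int_\Omega \vartheta(z) \widehat{h}(z)\,d\mu. \]
Let $\vartheta = \chi_A$, $A \in \Sigma$. Then we obtain
\[ \int_A \int_E h(x)\,\nu(z)(dx)\,d\mu = \int_A \widehat{h}(z)\,d\mu \qquad \forall\, A \in \Sigma, \]
so
\[ \int_E h(x)\,\nu(z)(dx) = \widehat{h}(z) \qquad \text{for } \mu\text{-a.a. } z \in \Omega. \]
As every subsequence of $\bigl\{h\bigl(u_n(\cdot)\bigr)\bigr\}_{n \geq 1}$ has a further subsequence converging weakly in $L^1(\Omega)$ to $\widehat{h}$, we conclude that the original sequence $\bigl\{h\bigl(u_n(\cdot)\bigr)\bigr\}_{n \geq 1}$ converges weakly to $\widehat{h}$.

REMARK 3.5.43 If $E$ is compact, then
\[ h(u_n) \xrightarrow{w^*} \widehat{h} \quad \text{in } L^\infty(\Omega). \]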
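As a worked illustration of Corollary 3.5.42(b), take $u_n(z) = \sin(nz)$ on $\Omega = [0,1]$ and $h(x) = x^2$. The computation below assumes that the limit Young measure of this sequence has, for every $z$, the arcsine density $\frac{1}{\pi\sqrt{1-x^2}}$ on $[-1,1]$ (this reading of the limit is an assumption made here); the weak limit is then obtained in two independent ways.

```latex
\widehat{h}(z)
  = \int_{-1}^{1} \frac{x^2}{\pi \sqrt{1 - x^2}}\, dx
  = \frac{1}{\pi} \int_{-\pi/2}^{\pi/2} \sin^2\theta \, d\theta
  = \frac{1}{2},
\qquad
\int_A \sin^2(nz)\, dz
  = \frac{\mu(A)}{2} - \frac{1}{2} \int_A \cos(2nz)\, dz
  \;\longrightarrow\; \frac{\mu(A)}{2}
  \quad \forall\, A \in \Sigma .
```

The first identity uses the substitution $x = \sin\theta$ and the second the Riemann-Lebesgue lemma, so both routes give $h(u_n) \xrightarrow{w} \tfrac{1}{2}$ in $L^1(0,1)$.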

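PROPOSITION 3.5.44 If $E = \mathbb{R}^N$ and $\{u_n\}_{n \geq 1} \subseteq L^\infty\bigl(\Omega; \mathbb{R}^N\bigr)$, $\{\nu_n\}_{n \geq 1}$ are sequences, such that
\[ u_n \xrightarrow{w^*} u \quad \text{in } L^\infty\bigl(\Omega; \mathbb{R}^N\bigr) \]
and
\[ \nu_n \xrightarrow{w^*} \nu \quad \text{in } M(\Omega \times \mathbb{R}^N), \]
then
\[ u(z) = \int_{\mathbb{R}^N} x\,\nu(z)(dx) \qquad \text{for } \mu\text{-a.a. } z \in \Omega. \]

PROOF Since the sequence $\{u_n\}_{n \geq 1}$ is bounded in $L^\infty\bigl(\Omega; \mathbb{R}^N\bigr)$, we may replace $E = \mathbb{R}^N$ by a compact subset of it. Then the narrow and weak$^*$-topologies coincide. Therefore from Corollary 3.5.42 (see also Remark 3.5.43) with $h(x) = x$, we obtain the result.

REMARK 3.5.45 For a given measurable function $u : \Omega \to E$, we define the barycenter of $u$ to be the set
\[ \operatorname{Bar}(u) \stackrel{df}{=} \left\{ \lambda \in \mathcal{R}(\Omega, E) : u(z) = \int_E x\,\lambda(z)(dx) \text{ for } \mu\text{-a.a. } z \in \Omega \right\}. \]
So the conclusion of Proposition 3.5.44 says that $\nu \in \mathcal{R}\bigl(\Omega, \mathbb{R}^N\bigr)$ belongs in the barycenter of $u$.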


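In Proposition 2.3.38 we produced a criterion for strong convergence in $L^1(\Omega)$. In the next proposition, using the tools provided by the theory of Young measures, we obtain a criterion for strong convergence in $L^p\bigl(\Omega; \mathbb{R}^N\bigr)$, $p \in [1, +\infty)$.

PROPOSITION 3.5.46 If $E = \mathbb{R}^N$, $\{u_n\}_{n \geq 1} \subseteq L^\infty\bigl(\Omega; \mathbb{R}^N\bigr)$ and $\{\nu_n\}_{n \geq 1} \subseteq M(\Omega \times E)$ are sequences, such that
\[ u_n \xrightarrow{w^*} u \quad \text{in } L^\infty\bigl(\Omega; \mathbb{R}^N\bigr) \]
and
\[ \nu_n \xrightarrow{w^*} \nu \quad \text{in } M(\Omega \times E), \]
then
\[ u_n \longrightarrow u \quad \text{in } L^p\bigl(\Omega; \mathbb{R}^N\bigr) \]
for $p \in [1, +\infty)$ if and only if
\[ \nu(z) = \delta_{u(z)} \qquad \text{for } \mu\text{-a.a. } z \in \Omega. \]

PROOF "$\Longrightarrow$": Let $h \in C_0\bigl(\mathbb{R}^N\bigr)$. Then
\[ h(u_n) \longrightarrow h(u) \quad \text{in } L^p(\Omega). \]
On the other hand it is easy to see that $\bigl\{h\bigl(u_n(\cdot)\bigr)\bigr\}_{n \geq 1}$ is a sequence of uniformly integrable functions. So by virtue of Corollary 3.5.42(b), we have that
\[ h(u_n) \xrightarrow{w} \widehat{h} \quad \text{in } L^1(\Omega), \]
with
\[ \widehat{h}(z) \stackrel{df}{=} \int_{\mathbb{R}^N} h(x)\,\nu(z)(dx). \]
We have
\[ h\bigl(u(z)\bigr) = \bigl\langle h, \delta_{u(z)} \bigr\rangle_{C_0(\mathbb{R}^N)} = \int_{\mathbb{R}^N} h(x)\,\nu(z)(dx) = \bigl\langle h, \nu(z) \bigr\rangle_{C_0(\mathbb{R}^N)} \qquad \text{for } \mu\text{-a.a. } z \in \Omega, \]
where by $\langle \cdot, \cdot \rangle_{C_0(\mathbb{R}^N)}$ we denote the duality brackets for the pair of spaces $\bigl(C_0\bigl(\mathbb{R}^N\bigr), M\bigl(\mathbb{R}^N\bigr)\bigr)$. Since $h \in C_0\bigl(\mathbb{R}^N\bigr)$ was arbitrary, we obtain
\[ \delta_{u(z)} = \nu(z) \qquad \text{for } \mu\text{-a.a. } z \in \Omega. \]

"$\Longleftarrow$": It suffices to prove the implication for $p > 1$ (recall that the embedding $L^p\bigl(\Omega; \mathbb{R}^N\bigr) \subseteq L^1\bigl(\Omega; \mathbb{R}^N\bigr)$ is continuous). If
\[ \nu(z) = \delta_{u(z)} \qquad \text{for } \mu\text{-a.a. } z \in \Omega, \]
then from Corollary 3.5.42(b) with $h(x) = \|x\|_{\mathbb{R}^N}^p$, we have
\[ \|u_n\|_{\mathbb{R}^N}^p \xrightarrow{w} \|u\|_{\mathbb{R}^N}^p \quad \text{in } L^1(\Omega) \]
and so $\|u_n\|_p \to \|u\|_p$. Note that
\[ u_n \xrightarrow{w} u \quad \text{in } L^p\bigl(\Omega; \mathbb{R}^N\bigr). \]
So from the Kadec-Klee property (see Remark A.3.22), we have that
\[ u_n \longrightarrow u \quad \text{in } L^p\bigl(\Omega; \mathbb{R}^N\bigr). \]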
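A worked instance of how the criterion separates weak$^*$ from strong convergence is given by the Rademacher functions of Example 3.5.36(a), with $\Omega = [0,1]$, $N = 1$ and weak$^*$ limit $u = 0$. The limit Young measure is not a Dirac mass, and correspondingly the $L^p$-distance to the limit does not decay:

```latex
\nu(z) = \tfrac{1}{2}\,\delta_{1} + \tfrac{1}{2}\,\delta_{-1} \neq \delta_{0} = \delta_{u(z)}
\quad \text{for all } z \in [0,1],
\qquad
\|u_n - u\|_{L^p[0,1]}^p = \int_0^1 |u_n(z)|^p \, dz = 1
\quad \forall\, n \geq 1,\ p \in [1, +\infty).
```

So both sides of the equivalence in Proposition 3.5.46 fail simultaneously for this sequence, as the proposition predicts.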

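We have another result in this direction. First a lemma.

LEMMA 3.5.47 If $\{u_n\}_{n \geq 1} \subseteq L^1\bigl(\Omega; \mathbb{R}^N\bigr)$ is a sequence of uniformly integrable functions, then we can find a subsequence $\{\nu_{n_k}\}_{k \geq 1}$ of $\{\nu_n\}_{n \geq 1}$, such that
(a) $\nu_{n_k} \xrightarrow{n} \nu$ in $\mathcal{Y}(\Omega, \mathbb{R}^N, \mu)$;
(b) $\displaystyle \int_{\mathbb{R}^N} \|x\|_{\mathbb{R}^N}\,\nu(z)(dx) < +\infty$ for $\mu$-a.a. $z \in \Omega$;
(c) $u_{n_k} \xrightarrow{w} u$ in $L^1\bigl(\Omega; \mathbb{R}^N\bigr)$, with $\displaystyle u(z) = \int_{\mathbb{R}^N} x\,\nu(z)(dx)$.

PROOF (a) From Proposition 3.5.31 with $\psi(x) = \|x\|_{\mathbb{R}^N}$, we see that the sequence $\{\nu_n\}_{n \geq 1}$ is uniformly tight. So by virtue of Theorem 3.5.35, we can find a subsequence $\{\nu_{n_k}\}_{k \geq 1}$ of $\{\nu_n\}_{n \geq 1}$, such that
\[ \nu_{n_k} \xrightarrow{n} \nu \quad \text{as } k \to +\infty \text{ in } \mathcal{Y}(\Omega, \mathbb{R}^N, \mu). \]
(b) For the subsequence obtained in part (a), we see that we can apply Corollary 3.5.42(a) with $h(x) = \|x\|_{\mathbb{R}^N}$ and obtain the desired conclusion.
(c) For the subsequence obtained in part (a), we see that we can apply Corollary 3.5.42(b) with $h(x) = x_k$, $k \in \{1, \ldots, N\}$, for all $x = (x_1, \ldots, x_N) \in \mathbb{R}^N$ and obtain the desired conclusion.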


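PROPOSITION 3.5.48 If $\{u_n\}_{n \geq 1} \subseteq L^1\bigl(\Omega; \mathbb{R}^N\bigr)$ is a sequence, such that
\[ u_n \xrightarrow{w} u \quad \text{in } L^1\bigl(\Omega; \mathbb{R}^N\bigr) \]
and for every subsequence $\{\nu_{n_k}\}_{k \geq 1}$ of the sequence $\{\nu_n\}_{n \geq 1}$ for which we have
\[ \nu_{n_k} \xrightarrow{n} \nu \quad \text{as } k \to +\infty \text{ in } \mathcal{Y}(\Omega, \mathbb{R}^N, \mu), \]
$\nu$ is the Young measure associated to a $\Sigma$-measurable function $w : \Omega \to \mathbb{R}^N$, then $w = u$ and
\[ u_n \longrightarrow u \quad \text{in } L^1\bigl(\Omega; \mathbb{R}^N\bigr). \]

PROOF We have
\[ \int_\Omega \|w(z)\|_{\mathbb{R}^N}\,d\mu = \int_\Omega \left\| \int_{\mathbb{R}^N} x\,\nu(z)(dx) \right\|_{\mathbb{R}^N} d\mu \leq \liminf_{k \to +\infty} \int_\Omega \|u_{n_k}(z)\|_{\mathbb{R}^N}\,d\mu < +\infty, \]
so $w \in L^1\bigl(\Omega; \mathbb{R}^N\bigr)$. If
\[ \nu_{n_k} \xrightarrow{n} \nu = \delta_{w(\cdot)}, \]
then by Lemma 3.5.47(c), we have
\[ u_{n_k} \xrightarrow{w} w \quad \text{in } L^1\bigl(\Omega; \mathbb{R}^N\bigr). \]
Hence $w = u$. From Proposition 3.5.22, we have that
\[ u_{n_k} \xrightarrow{\mu} u \]
and by passing to a further subsequence if necessary, we may assume that $u_{n_k}(z) \to u(z)$ for $\mu$-a.a. $z \in \Omega$. Then from the extended dominated convergence theorem (Vitali's theorem; see Theorem A.2.9), we have that
\[ u_n \longrightarrow u \quad \text{in } L^1\bigl(\Omega; \mathbb{R}^N\bigr). \]

Now we can prove a lower semicontinuity result for integral functionals.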


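THEOREM 3.5.49 If $\{u_n\}_{n \geq 1} \subseteq L^1\bigl(\Omega; \mathbb{R}^N\bigr)$ is a sequence, such that
\[ u_n \xrightarrow{w} u \quad \text{in } L^1\bigl(\Omega; \mathbb{R}^N\bigr), \]
$\{w_n : \Omega \to E\}_{n \geq 1}$ is a sequence of $\Sigma$-measurable functions, such that
\[ w_n \xrightarrow{\mu} w, \]
for some $\Sigma$-measurable function $w : \Omega \to E$, $f \in \mathcal{N}\bigl(\Omega, \Sigma, E \times \mathbb{R}^N\bigr)$ and
(i) $f\bigl(z, w(z), \cdot\bigr)$ is convex for $\mu$-almost all $z \in \Omega$;
(ii) $\bigl\{f^-\bigl(\cdot, w_n(\cdot), u_n(\cdot)\bigr)\bigr\}_{n \geq 1}$ is a sequence of uniformly integrable functions,
then
(a) if $\displaystyle \liminf_{n \to +\infty} \int_\Omega f\bigl(z, w_n(z), u_n(z)\bigr)\,d\mu < +\infty$, then $\displaystyle \int_\Omega f^+\bigl(z, w(z), u(z)\bigr)\,d\mu < +\infty$;
(b) $\displaystyle \int_\Omega f\bigl(z, w(z), u(z)\bigr)\,d\mu \leq \liminf_{n \to +\infty} \int_\Omega f\bigl(z, w_n(z), u_n(z)\bigr)\,d\mu$.

PROOF By passing to a suitable subsequence, we may assume that the limit
\[ \lim_{n \to +\infty} \int_\Omega f\bigl(z, w_n(z), u_n(z)\bigr)\,d\mu \]
exists, while by Proposition 3.5.22, we have
\[ \delta_{w_n} \xrightarrow{n} \delta_w \quad \text{in } \mathcal{Y}(\Omega, \Sigma, E) \]
and by Lemma 3.5.47, we have
\[ \delta_{u_n} \xrightarrow{n} \nu \quad \text{in } \mathcal{Y}(\Omega, \Sigma, \mathbb{R}^N). \]
It is easy to see that
\[ \delta_{w_n} \otimes \delta_{u_n} \xrightarrow{n} \delta_w \otimes \nu \quad \text{in } \mathcal{Y}(\Omega, \Sigma, E \times \mathbb{R}^N). \]
Invoking Theorem 3.5.40(a), we obtain the implication
\[ \liminf_{n \to +\infty} \int_\Omega f\bigl(z, w_n(z), u_n(z)\bigr)\,d\mu < +\infty \;\Longrightarrow\; \int_\Omega \left( \int_{\mathbb{R}^N} f^+\bigl(z, w(z), x\bigr)\,\nu(z)(dx) \right) d\mu < +\infty. \tag{3.136} \]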

From Theorem 3.5.40(b), we have
\[ \int_\Omega \left( \int_{\mathbb{R}^N} f\bigl(z, w(z), x\bigr)\,\nu(z)(dx) \right) d\mu \leq \liminf_{n \to +\infty} \int_\Omega f\bigl(z, w_n(z), u_n(z)\bigr)\,d\mu. \tag{3.137} \]
Since by hypothesis $f\bigl(z, w(z), \cdot\bigr)$ is convex for $\mu$-almost all $z \in \Omega$, using Jensen's inequality (see Theorem A.2.26), we obtain
\[ \int_\Omega f\bigl(z, w(z), u(z)\bigr)\,d\mu = \int_\Omega f\left(z, w(z), \int_{\mathbb{R}^N} x\,\nu(z)(dx)\right) d\mu \leq \int_\Omega \left( \int_{\mathbb{R}^N} f\bigl(z, w(z), x\bigr)\,\nu(z)(dx) \right) d\mu. \tag{3.138} \]
Then part (a) follows from (3.136) and (3.138), while part (b) follows from estimates (3.137) and (3.138).

There is a version of this result for integrands defined on Banach spaces.

THEOREM 3.5.50 If $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is a separable Banach space, $Y$ is a separable reflexive Banach space and $f : \Omega \times X \times Y \to \overline{\mathbb{R}}$ is a $\Sigma \times \mathcal{B}(X) \times \mathcal{B}(Y)$-measurable function, such that
(i) the function $(x, y) \longmapsto f(\omega, x, y)$ is lower semicontinuous for $\mu$-almost all $\omega \in \Omega$;
(ii) the function $y \longmapsto f(\omega, x, y)$ is convex for $\mu$-almost all $\omega \in \Omega$ and all $x \in X$;
(iii) there exist $\beta \in L^1(\Omega)$ and $c > 0$, such that
\[ f(\omega, x, y) \geq \beta(\omega) - c\bigl(\|x\|_X + \|y\|_Y\bigr) \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } (x, y) \in X \times Y, \]
then the functional
\[ (u, v) \longmapsto I_f(u, v) \stackrel{df}{=} \int_\Omega f\bigl(\omega, u(\omega), v(\omega)\bigr)\,d\mu \]
is sequentially lower semicontinuous from $L^1(\Omega; X) \times L^1(\Omega; Y)_w$ into $\overline{\mathbb{R}}$.
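The role of the convexity hypothesis in Theorem 3.5.49 can be seen on the Rademacher functions, which converge weakly to 0 in $L^1(0,1)$. The Python sketch below is an illustration only: it takes $\Omega = [0,1]$, $N = 1$, integrands that depend only on the last variable, and a midpoint quadrature; the names `convex`, `double_well` and `functional` are hypothetical and not taken from the text.

```python
def rademacher(n):
    """u_n = +1 on [k/2^n, (k+1)/2^n) for even k and -1 otherwise."""
    return lambda z: 1.0 if int(z * 2 ** n) % 2 == 0 else -1.0


def integral(g, m=2 ** 14):
    """Midpoint-rule approximation of the integral of g over [0, 1]."""
    return sum(g((i + 0.5) / m) for i in range(m)) / m


def functional(f, u):
    """I_f(u): the integral over [0, 1] of f(u(z)) for an integrand f independent of z."""
    return integral(lambda z: f(u(z)))


def convex(x):
    """A convex integrand."""
    return x * x


def double_well(x):
    """A nonconvex integrand vanishing exactly at x = -1 and x = 1."""
    return (x * x - 1.0) ** 2


def zero(z):
    """The weak limit u = 0."""
    return 0.0


if __name__ == "__main__":
    for n in (2, 6, 10):
        u_n = rademacher(n)
        weak_test = integral(lambda z: z * u_n(z))  # pairing with phi(z) = z, tends to 0
        print(n, weak_test, functional(convex, u_n), functional(double_well, u_n))
    print("limit", functional(convex, zero), functional(double_well, zero))
```

For the convex integrand the values along the sequence equal 1 and dominate the value 0 at the weak limit, in agreement with part (b). For the double-well integrand the values along the sequence are 0 while the value at the weak limit is 1, so the inequality of part (b) fails once convexity is dropped.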


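In Proposition 2.3.39, using an extremality condition, we obtained a result concerning strong convergence in $L^1\bigl(\Omega; \mathbb{R}^N\bigr)$. There the result was stated without proof. Now that we have at our disposal the tools of the theory of Young measures, we can give a proof of it.

PROPOSITION 3.5.51 If $\{u_n\}_{n \geq 1} \subseteq L^1\bigl(\Omega; \mathbb{R}^N\bigr)$ is a sequence, such that
\[ u_n \xrightarrow{w} u \quad \text{in } L^1\bigl(\Omega; \mathbb{R}^N\bigr), \]
for some $u \in L^1\bigl(\Omega; \mathbb{R}^N\bigr)$ and
\[ u(z) \in \operatorname{ext}\Bigl( \operatorname{conv} \limsup_{n \to +\infty} \{u_n(z)\} \Bigr) \qquad \text{for } \mu\text{-a.a. } z \in \Omega, \]
then
\[ u_n \longrightarrow u \quad \text{in } L^1\bigl(\Omega; \mathbb{R}^N\bigr). \]

PROOF Let
\[ S(z) \stackrel{df}{=} \operatorname{conv} \limsup_{n \to +\infty} \{u_n(z)\} \qquad \forall\, z \in \Omega. \]
Since
\[ u(z) \in \operatorname{ext} S(z) \qquad \text{for } \mu\text{-a.a. } z \in \Omega, \]
we can find a sequence $\{C_n(z)\}_{n \geq 1}$ of closed, convex sets in $S(z)$, such that
\[ \bigcup_{n=1}^\infty C_n(z) = S(z) \setminus \{u(z)\} \qquad \text{for } \mu\text{-a.a. } z \in \Omega. \]
Let us fix $z \in \Omega \setminus N$ with $\mu(N) = 0$. From Lemma 3.5.47, we have
\[ u(z) = \int_{\mathbb{R}^N} x\,\nu(z)(dx), \]
with $\nu(z)$ being supported by $\limsup_{n \to +\infty} \{u_n(z)\}$. Suppose that
\[ \nu(z)\bigl(C_n(z)\bigr) > 0. \]
If $\nu(z)\bigl(C_n(z)\bigr) = 1$, then $u(z) \in C_n(z)$, a contradiction. Therefore
\[ 0 < \nu(z)\bigl(C_n(z)\bigr) < 1 \]
and we can define
\[ \lambda_1 \stackrel{df}{=} \frac{\nu(z)|_{C_n(z)}}{\nu(z)(C_n(z))} \quad \text{and} \quad \lambda_2 \stackrel{df}{=} \frac{\nu(z)|_{\mathbb{R}^N \setminus C_n(z)}}{1 - \nu(z)(C_n(z))}. \]
It follows that
\[ u(z) = \int_{\mathbb{R}^N} x\,\nu(z)(dx) = \nu(z)\bigl(C_n(z)\bigr) \int_{\mathbb{R}^N} x\,\lambda_1(dx) + \bigl(1 - \nu(z)\bigl(C_n(z)\bigr)\bigr) \int_{\mathbb{R}^N} x\,\lambda_2(dx), \]
with
\[ u(z) \neq \int_{\mathbb{R}^N} x\,\lambda_1(dx), \]
since
\[ \int_{\mathbb{R}^N} x\,\lambda_1(dx) \in C_n(z). \]
So we have a contradiction to the hypothesis that $u(z)$ is extremal in $S(z)$. This implies that
\[ \nu(z)\bigl(C_n(z)\bigr) = 0 \qquad \forall\, n \geq 1,\ z \in \Omega \setminus N, \]
hence
\[ \nu(z)\bigl(\{u(z)\}\bigr) = 1, \]
i.e.,
\[ \nu(z) = \delta_{u(z)} \qquad \text{for } \mu\text{-a.a. } z \in \Omega. \]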

Then the conclusion of the proposition follows from Proposition 3.5.46.

REMARK 3.5.52 The result is true if $(\Omega, \Sigma, \mu)$ is any finite measure space (see Proposition 2.3.39). However, since the theory in this section was developed for a locally compact, $\sigma$-compact metric space $\Omega$ we have kept this assumption.

We know that a bounded sequence $\{u_n\}_{n \geq 1} \subseteq L^1\bigl(\Omega; \mathbb{R}^N\bigr)$ is uniformly tight (see Proposition 3.5.31 with $\psi(x) = \|x\|_{\mathbb{R}^N}$) and so we can extract a subsequence $\{u_{n_k}\}_{k \geq 1}$, such that
\[ \delta_{u_{n_k}} \xrightarrow{n} \nu \quad \text{as } k \to +\infty \text{ in } \mathcal{Y}(\Omega, \Sigma, \mathbb{R}^N) \]
(see Theorem 3.5.35). It is natural to ask what is the relation between the sequence $\{u_n\}_{n \geq 1}$ and the function
\[ u(z) = \int_{\mathbb{R}^N} x\,\nu(z)(dx). \]
In this respect the Chacon biting lemma (see Theorem 2.3.26) is helpful. An equivalent reformulation of this result (for $X = \mathbb{R}^N$) is the following theorem.


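THEOREM 3.5.53 If $\{u_n\}_{n \geq 1} \subseteq L^1\bigl(\Omega; \mathbb{R}^N\bigr)$ is a bounded sequence, then we can extract a subsequence $\{u_{n_k}\}_{k \geq 1}$ of $\{u_n\}_{n \geq 1}$, such that for every $\delta > 0$, there exists $A \in \Sigma$, with $\mu(A) < \delta$ and
\[ u_{n_k} \xrightarrow{w} u \quad \text{in } L^1\bigl(\Omega \setminus A; \mathbb{R}^N\bigr), \text{ as } k \to +\infty. \]

PROPOSITION 3.5.54 If $\{u_n\}_{n \geq 1} \subseteq L^1\bigl(\Omega; \mathbb{R}^N\bigr)$ is a bounded sequence, then we can extract a subsequence $\{u_{n_k}\}_{k \geq 1}$ and a decreasing sequence of sets $\{A_k\}_{k \geq 1} \subseteq \Sigma$, with $\mu(A_k) \searrow 0$, such that
\[ \delta_{u_{n_k}} \xrightarrow{n} \nu \quad \text{in } \mathcal{Y}(\Omega, \Sigma, \mathbb{R}^N) \]
and
\[ w_k \stackrel{df}{=} \chi_{\Omega \setminus A_k} u_{n_k} \xrightarrow{w} u \quad \text{in } L^1\bigl(\Omega; \mathbb{R}^N\bigr), \]
with
\[ u(z) \stackrel{df}{=} \int_{\mathbb{R}^N} x\,\nu(z)(dx) \qquad \text{for } \mu\text{-a.a. } z \in \Omega. \]

PROOF By virtue of Theorem 2.3.26, we can find a subsequence $\{u_{n_k}\}_{k \geq 1}$ of $\{u_n\}_{n \geq 1}$ and a decreasing sequence of sets $\{A_k\}_{k \geq 1} \subseteq \Sigma$ with
\[ \mu(A_k) \searrow 0, \]
such that $\{w_k = \chi_{\Omega \setminus A_k} u_{n_k}\}_{k \geq 1}$ is a sequence of uniformly integrable functions. Then by virtue of Lemma 3.5.47, we need to show that
\[ \delta_{w_k} \xrightarrow{n} \nu \quad \text{in } \mathcal{Y}(\Omega, \Sigma, \mathbb{R}^N). \]
To this end let
\[ f \in \widehat{K}_b\bigl(\Omega, \Sigma, \mathbb{R}^N\bigr). \]
If $\eta_k = \psi(\delta_{w_k})$ (see Theorem 3.5.12), then we have
\[ \left| \int_{\Omega \times \mathbb{R}^N} f(z, x)\,d\eta_k - \int_{\Omega \times \mathbb{R}^N} f(z, x)\,d\nu_{n_k} \right| = \left| \int_{A_k} \bigl( f(z, 0) - f\bigl(z, u_{n_k}(z)\bigr) \bigr)\,d\mu \right| \leq 2\,\|f\|_{L^\infty(\Omega \times \mathbb{R}^N)}\,\mu(A_k) \longrightarrow 0 \quad \text{as } k \to +\infty, \]
so $\eta_k \xrightarrow{n} \nu$.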
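A worked example, under the assumptions $\Omega = [0,1]$ with the Lebesgue measure and $N = 1$, shows what the "biting" accomplishes. Take $u_n = n\chi_{[0,1/n]}$, which is bounded in $L^1[0,1]$ but not uniformly integrable; the choice $n_k = k$, $A_k = [0, 1/k]$ is made here for illustration.

```latex
\|u_n\|_{L^1[0,1]} = 1 \quad \forall\, n \geq 1,
\qquad
\delta_{u_n} \xrightarrow{\;n\;} \mu \otimes \delta_0,
\qquad
u(z) = \int_{\mathbb{R}} x \, \delta_0(dx) = 0,
\qquad
w_k = \chi_{[0,1] \setminus A_k}\, u_k = 0 \xrightarrow{\;w\;} 0 = u ,
\qquad
\int_0^1 u_n \, dz = 1 \not\longrightarrow 0 .
```

The last relation shows that the full sequence does not converge weakly to the barycenter $u = 0$ in $L^1[0,1]$, whereas the bitten sequence $\{w_k\}$ does, in line with Proposition 3.5.54.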

3.6 Remarks

3.1: Compact maps were the first class of operators used to study nonlinear equations in infinite dimensional spaces. Leray & Schauder (1934) used compact perturbations of the identity in order to extend the Brouwer degree to infinite dimensional spaces. However, for linear operators the notion was first introduced by Riesz (1918). Earlier Hilbert (1906) introduced the notion of completely continuous linear operator between Banach spaces. As a property for linear operators, complete continuity actually lies properly between compactness and boundedness. Moreover, when the domain space X is reflexive, the two notions coincide (see Corollary 3.1.8). The basic approximation result for compact maps stated in Theorem 3.1.10 is due to Schauder (1930). Proper maps (see Definition 3.1.13) are discussed in Berger (1977). Theorem 3.1.22 is due to Schauder (1930). The proof of Proposition 3.1.31 can be found in Reed & Simon (1972, p. 191). Theorem 3.1.38 on the spectral properties of compact linear operators is due to Riesz (1918). Compact linear operators and their spectral properties are discussed in detail in Dunford & Schwartz (1958), Kato (1976) and Yosida (1978). The Fredholm alternative (see Theorem 3.1.48) was obtained in the context of linear integral equations by Fredholm (1903). The spectral theory of selfadjoint, compact, linear operators can be found in the books Akhiezer & Glazman (1961, 1963), Gohberg & Goldberg (1981), Halmos (1998) and Kato (1976). For further results on Fredholm operators we refer to Goldberg (1966), Kato (1976) and Schechter (1971). The proof of Proposition 3.1.70 can be found in Schechter (1971, p. 114).

3.2: Monotone operators are rooted in the calculus of variations and were introduced in the early sixties, in order to provide an analytical framework for the study of nonlinear operator equations broader than the one provided by compact operators.
The first mention of monotone operators in a Hilbert space can be traced to the work of Golomb (1935) on nonlinear Hammerstein integral equations. However, the systematic development of the theory of monotone operators started with Kachurovski (1960), who established that the derivative of a convex function is a monotone map and also introduced the term “monotone operator.” Then Minty (1962) obtained the first existence result for nonlinear functional equations in Hilbert spaces under monotonicity assumptions. Even simple one dimensional examples reveal that a complete theory of maximal monotone maps requires the use of multivalued maps. Proposition 3.2.11 is due to Rockafellar (1969), who proved that a monotone map is locally bounded at every point in the interior of its domain. Here we have stated a slightly more general version of this result. Concerning Proposition 3.2.14, Kenderov (1974) proved that if $X$ is a separable, reflexive Banach space and $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is maximal monotone with $\operatorname{int} D(A) \neq \emptyset$,


then there is a dense $G_\delta$ subset $D_0$ of $\operatorname{int} D(A)$, such that $A|_{D_0}$ is single valued and upper semicontinuous for the norm topologies on $X$ and $X^*$. For additional results in this direction, we refer to Phelps (1993). The duality map (see Example 3.2.20(d)) plays a basic role in the study of the geometry of Banach spaces and in the theory of evolution equations. It was first introduced by Beurling & Livingston (1962). Its properties are studied by Browder (1976), Cioranescu (1990) and Zeidler (1990b). Theorem 3.2.29 is due to Minty (1962) for Hilbert spaces and Rockafellar (1970c) for Banach spaces. Its proof can be found in Zeidler (1990b, p. 881). Theorem 3.2.30 is due to Browder (1968) and together with Corollary 3.2.31 explains why maximal monotone operators are a powerful tool in the study of nonlinear operator equations. Theorem 3.2.40 is due to Attouch (1981), while Theorem 3.2.41 is due to Rockafellar (1970c). The proof of Theorem 3.2.41 can be found in Zeidler (1990b, p. 888). The notion of pseudomonotonicity was introduced by Brézis (1968) (using nets) and Browder (1976) (using sequences). The basic works on pseudomonotonicity are those by Browder & Hess (1972) and Kenmochi (1974, 1975). Of course the most important result is Theorem 3.2.52, due to Browder & Hess (1972). Monotone operators and operators of monotone-type are discussed in the books of Brézis (1973), Barbu (1976), Deimling (1985), Hu & Papageorgiou (1997), Morosanu (1988), Pascali & Sburlan (1978) and Showalter (1997). The proof of Theorem 3.2.58 can be found in Hu & Papageorgiou (1997, pp. 311–312).

3.3: Accretive operators were introduced by Kato (1967, 1968), who gave the complete characterization in metric terms involved in Proposition 3.3.4. In the first part of the section, dealing with accretive operators, we have summarized the results of Brézis (1971), Brézis & Pazy (1970), Crandall & Pazy (1969, 1970), Kato (1967, 1968) and Kenmochi (1972, 1973).
Lemma 3.3.26 is due to Kato (1968, 1970), while the Gronwall-type inequality obtained in Lemma 3.3.27 can be found in Brézis (1973, p. 157). Theorem 3.3.28 can be found in Crandall & Pazy (1969). The linear semigroup theory started developing as soon as it was realized that the theory has immediate applications to partial differential equations, Markov processes and ergodic theory. It developed rapidly during the 1940’s and 1950’s thanks to the seminal contributions of Hille, Phillips and Yosida. The main result of this theory is of course Theorem 3.3.46 (the Hille-Yosida generation theorem), which for contraction semigroups (i.e., $M = 1$, $\omega = 0$) was proved independently by Hille (1942) and Yosida (1948), while the general case (proved in Theorem 3.3.46) is independently due to Feller (1953), Miyadera (1952) and Phillips (1953). The proof of the Phillips theorem (see Theorem 3.3.48) can be found in Hille & Phillips (1957, p. 389). Theorem 3.3.49 is due to Lumer & Phillips (1961), while an early Hilbert space version of it was proved by Phillips (1959). The exponential formula in Theorem 3.3.51 is due to Hille (1942). In fact Hille’s proof of the generation theorem was based on it. Another representation formula can be found in Pazy (1983, p. 21).

THEOREM 3.6.1 If $A$ is the infinitesimal generator of a $C_0$-semigroup $\{S(t)\}_{t \geq 0}$ on $X$, then
\[ S(t)x = \lim_{\lambda \to +\infty} e^{tA_\lambda} x. \]

A complete list of representation formulae can be found in Hille & Phillips (1957, p. 354). Theorem 3.3.59 (the generation theorem for nonlinear semigroups) for Hilbert spaces was proved by Komura (1967), while the general case is due to Crandall & Liggett (1971). The notion of integral solution (see Definition 3.3.65) is due to Bénilan (1972). In the linear case, Proposition 3.3.71 is due to Lax (see Hille & Phillips (1957, p. 304)). For nonlinear semigroups, Proposition 3.3.71 and Theorem 3.3.72 were proved by Brézis (1974), while Corollary 3.3.73 is due to Pazy (1968). The theory of linear semigroups can be found in the books of Butzer & Berens (1967), Fattorini (1999), Goldstein (1985), Hille & Phillips (1957) and Pazy (1983), while the theory of nonlinear semigroups can be found in the books of Barbu (1976), Miyadera (1992), Pavel (1987) and Vrabie (1987).

3.4: Theorem 3.4.4 for functions defined on $\Omega \times \mathbb{R}$ with values in $\mathbb{R}$ was proved by Krasnoselskii (1964b, 1964a). The general case is due to Lucchetti & Patrone (1980). Moreover, in addition to Theorem 3.4.4, we can also show the continuity in measure of the operator $N_f$, already known when $X$ and $Y$ are Euclidean spaces.

PROPOSITION 3.6.2 If $\mu(\Omega) < +\infty$, $f : \Omega \times X \longrightarrow Y$ is a Carathéodory function, $\{x_n : \Omega \longrightarrow X\}_{n \geq 1}$ is a sequence of $\Sigma$-measurable functions and
\[ x_n \stackrel{\mu}{\longrightarrow} x, \]
then
\[ N_f(x_n) \stackrel{\mu}{\longrightarrow} N_f(x). \]

The proof of the Scorza-Dragoni theorem (see Theorem 3.4.10) can be found in Denkowski, Migórski & Papageorgiou (2003a, p. 188). Normal integrands were introduced by Rockafellar (1968). The characterization of lower semicontinuity of the integral functional $I_f$ obtained in Theorem 3.4.13 for Euclidean spaces $X$ and $Y$ is due to Poljak (1969), while the general case can be found in Lucchetti & Patrone (1980). Proposition 3.4.16 is due to Ioffe & Levin (1972). The proof of Proposition 3.4.18 can be found in Denkowski, Migórski & Papageorgiou (2003a, p. 460). For $X = \mathbb{R}$, Theorem 3.4.20 can be found in Ekeland & Temam (1976), but their proof is different and is based on results from convex analysis.


Further results on Nemytskii operators and integral functionals can be found in the books of Appell & Zabrejko (1990), Buttazzo (1989), Cesari (1983), Ekeland & Temam (1976), Hu & Papageorgiou (1997), Ioffe & Tihomirov (1979) and Vaĭnberg (1973).

3.5: The theory of Young measures has its roots in the so-called “generalized curves” of Young (1942a, 1942b, 1969), introduced for the study of variational problems which are not inf-compact and consequently do not have a solution. The needs of control theory (relaxation) and of the calculus of variations led to further development of the original ideas of Young. We refer to Berliocchi & Lasry (1973), Ekeland (1972), Ekeland & Temam (1976), Gamkrelidze (1978) and Warga (1972). Recently there was a revival of the theory (motivated also by the needs of problems in theoretical mechanics), which can be traced in the works of Alibert & Bouchitté (1997), Balder (1984, 1997), Ball (1989), Ball & Murat (1989), Ball & Zhang (1990), Di Perna (1985), Di Perna & Majda (1987) and Tartar (1979). Our presentation here follows the survey paper of Valadier (1975). Applications of Young measures to control theory and mechanics can be found in the books of Denkowski, Migórski & Papageorgiou (2003a, 2003b), Gamkrelidze (1978), Hu & Papageorgiou (1997, 2000), Pedregal (1997), Roubíček (1997) and Warga (1972). For the proof of the Prohorov theorem (see Theorem 3.5.27) we refer to Parthasarathy (1967, p. 47). Theorem 3.5.49 was obtained by De Giorgi (1968–1969), with $f \geq 0$. A more general version similar to that of Theorem 3.5.49 was proved by Ioffe (1977a, 1977b). His proof, which is also reproduced by Buttazzo (1989, p. 46), does not use Young measures and is instead based on the approximation of $f$ by certain affine functions. Two different proofs of Theorem 3.5.50 can be found in Balder (1987) and Hu & Papageorgiou (2000, p. 31).

Chapter 4 Smooth and Nonsmooth Analysis and Variational Principles

The purpose of this chapter is to outline the basic aspects of the smooth and nonsmooth calculus in Banach spaces. Special emphasis is placed on the nonsmooth theory, which started developing in the 1960’s, in order to provide a uniform viewpoint for the treatment of large classes of nonlinear extremal problems. The resulting subdifferential theories also found many other applications and today are part of the so-called nonsmooth analysis, which is one of the most robust and interesting research areas of nonlinear functional analysis.

In Section 4.1 we present the basics of the smooth calculus in Banach spaces. We limit ourselves to the discussion of the Gâteaux and Fréchet derivatives, which are the two most useful derivatives for vector valued functions.

In Section 4.2 we consider convex functions defined on Banach spaces. We discuss their continuity and differentiability properties. It turns out that a purely algebraic condition (convexity) has remarkable and powerful topological and differentiability implications. Also differentiability results bring us in contact with the Banach space theory and in particular with the so-called Asplund spaces, which have the useful property that every separable subspace has a separable dual. We also show that every convex continuous function is locally Lipschitz.

Locally Lipschitz functions between Banach spaces are the objects of investigation in Section 4.3. If the two Banach spaces are finite dimensional, then a locally Lipschitz function is differentiable almost everywhere (for the Lebesgue measure). Here we see how this can be generalized to the case where the two spaces are infinite dimensional. The main difficulty is to produce a suitable notion of negligible sets. This is done using the notion of Haar-null sets. We study them in detail and eventually prove an infinite dimensional version of the Rademacher theorem on the almost everywhere differentiability of locally Lipschitz functions.
In Section 4.4 we pass to the nonsmooth part of this chapter. We examine the duality and subdifferentiability properties of convex functions and the subdifferentiability properties of locally Lipschitz functions. At the end of the section, using the notion of bornology, we briefly consider some more subdifferentials of proper functions.

In Section 4.5 we investigate integral functionals defined by convex or nonconvex normal integrands. We determine their duality and subdifferentiability properties.

Finally in Section 4.6 we present some variational principles and their applications. Prominent in our discussion is the so-called Ekeland variational principle, which we show to be equivalent to some other powerful results of nonlinear analysis. We also use it to prove some surjectivity results for nonlinear maps, which extend corresponding results from the linear operator theory. This chapter illustrates in a rather convincing manner how methods and results of nonlinear analysis cover a wide area from a theoretical starting point (Banach space theory) to an applied end (optimization theory).

4.1 Differential Calculus in Banach Spaces

In this section we develop the basics of the differential calculus in Banach spaces. The geometric character of the operation of differentiation becomes very apparent in this general setting and leads naturally to generalizations such as the subdifferentials of convex and of locally Lipschitz functions. Moreover, the needs of the infinite dimensional variational problems which dominate the present landscape of nonlinear analysis require a differential calculus in Banach spaces, along the lines of the one existing in $\mathbb{R}^N$. This section shows that such a theory is possible and the analogy with the finite dimensional calculus is indeed remarkable. In what follows $X$ and $Y$ are two Banach spaces. Additional hypotheses will be introduced as needed.

DEFINITION 4.1.1 A map $f : X \longrightarrow Y$ is said to be Gâteaux differentiable at $x \in X$, if and only if there exists $A(x) \in L(X; Y)$, such that
\[ \lim_{\lambda \to 0} \frac{f(x + \lambda h) - f(x)}{\lambda} = A(x)h \qquad \forall\, h \in X. \]
The operator $A(x)$ is said to be the Gâteaux derivative of $f$ at $x$. It is usually denoted by $f'_G(x)$. We say that $f$ is Gâteaux differentiable, if it is Gâteaux differentiable at every $x \in X$.

REMARK 4.1.2 If we set
\[ \varphi(\lambda) \stackrel{df}{=} f(x + \lambda h), \]
then
\[ f'_G(x)h = \frac{d}{d\lambda} \varphi(\lambda)\Big|_{\lambda = 0} \qquad \forall\, x, h \in X. \]
So the Gâteaux derivative is essentially a one dimensional concept, since it considers the difference quotients along rays. Clearly then the Gâteaux derivative $f'_G(x)$, if it exists, is unique.

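Let us see some examples that illustrate the notion of Gâteaux derivative.

EXAMPLE 4.1.3
(a) If $f = A \in L(X; Y)$, then $f'_G(x) = A$ for all $x \in X$.

(b) Let $X = \mathbb{R}^N$, $Y = \mathbb{R}^M$ and $f = (f_1, \ldots, f_M) : \mathbb{R}^N \longrightarrow \mathbb{R}^M$. Suppose that $f$ is Gâteaux differentiable at $x$, let $A = (a_{kj})$ be the $M \times N$-matrix representing $f'_G(x)$ and let $h = e_j = (0, \ldots, 0, 1, 0, \ldots, 0)$ be the $j$-th coordinate vector. Then
\[ \lim_{\lambda \to 0} \left\| \frac{f(x + \lambda h) - f(x) - \lambda A h}{\lambda} \right\|_Y = 0, \]
so
\[ \lim_{\lambda \to 0} \left| \frac{f_k(x + \lambda e_j) - f_k(x) - \lambda a_{kj}}{\lambda} \right| = 0 \qquad \forall\, k \in \{1, \ldots, M\},\ j \in \{1, \ldots, N\} \]
and so
\[ \frac{\partial f_k}{\partial x_j}(x) = a_{kj} \qquad \forall\, k \in \{1, \ldots, M\},\ j \in \{1, \ldots, N\}. \]
Therefore $f'_G(x)$ has the matrix representation
\[ f'_G(x) = \left( \frac{\partial f_k}{\partial x_j}(x) \right)_{\substack{k \in \{1, \ldots, M\} \\ j \in \{1, \ldots, N\}}}. \]
This matrix is called the Jacobian matrix of $f$ at $x \in \mathbb{R}^N$. If $Y = \mathbb{R}$, then
\[ f'_G(x) = \left( \frac{\partial f}{\partial x_j}(x) \right)_{j=1}^{N}, \]
known as the gradient of $f$ at $x \in \mathbb{R}^N$.

(c) Let $X = Y = C([0, 1])$ and consider the Hammerstein integral operator $\widehat{f}$, defined by
\[ \widehat{f}(x)(t) \stackrel{df}{=} \int_0^1 k(t, s) f\big(s, x(s)\big)\, ds \qquad \forall\, t \in [0, 1], \]
where
\[ k \in C\big([0, 1] \times [0, 1]\big) \quad \text{and} \quad \frac{\partial f}{\partial x} \in C\big([0, 1] \times \mathbb{R}\big). \]
An easy calculation reveals that $\widehat{f}$ is Gâteaux differentiable and
\[ \big(\widehat{f}\,'_G(x)h\big)(t) = \int_0^1 k(t, s) \frac{\partial f}{\partial x}\big(s, x(s)\big) h(s)\, ds \qquad \forall\, h \in C([0, 1]). \]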
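The finite dimensional case (b) of Example 4.1.3 can be illustrated numerically. The following is only a sketch under stated assumptions: the map `f`, the base point and the step `lam` are hypothetical choices made for illustration and do not come from the text. It forms the difference quotients along the coordinate vectors $e_j$ and compares them with the entries of the Jacobian matrix computed by hand.

```python
import math

def f(x):
    """Illustrative map f: R^2 -> R^2 (a hypothetical choice, not from the text)."""
    x1, x2 = x
    return [x1 * x1 * x2, math.sin(x1) + x2]

def jacobian_exact(x):
    """Jacobian matrix of the map f above, computed by hand."""
    x1, x2 = x
    return [[2.0 * x1 * x2, x1 * x1],
            [math.cos(x1), 1.0]]

def jacobian_by_quotients(func, x, lam=1e-6):
    """Approximate each column j by the difference quotient along e_j."""
    fx = func(x)
    n, m = len(x), len(fx)
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        shifted = list(x)
        shifted[j] += lam  # x + lam * e_j
        f_shifted = func(shifted)
        for k in range(m):
            jac[k][j] = (f_shifted[k] - fx[k]) / lam
    return jac

x = [0.7, -1.3]
approx = jacobian_by_quotients(f, x)
exact = jacobian_exact(x)
max_error = max(abs(approx[k][j] - exact[k][j])
                for k in range(2) for j in range(2))
print("largest entrywise deviation:", max_error)
```

The deviation shrinks with `lam` until rounding errors take over, which is the usual behaviour of one-sided difference quotients.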


A stronger differentiability notion is given in the next definition.

DEFINITION 4.1.4 A map $f : X \longrightarrow Y$ is said to be Fréchet differentiable at $x \in X$ if there exists $A(x) \in L(X; Y)$, such that
\[ f(x + h) - f(x) = A(x)h + u(x, h), \]
where
\[ \frac{\|u(x, h)\|_Y}{\|h\|_X} \longrightarrow 0 \quad \text{as } \|h\|_X \to 0. \]
The operator $A(x)$ is said to be the Fréchet derivative of $f$ at $x \in X$ and it is usually denoted by $f'_F(x)$. We say that $f$ is Fréchet differentiable, if it is Fréchet differentiable at every $x \in X$.

REMARK 4.1.5 It is easy to see that the Fréchet derivative $f'_F(x)$, if it exists, is unique. It is clear that if $f$ is Fréchet differentiable at $x$, it is also Gâteaux differentiable at $x$. The converse is not true as the following example shows.

EXAMPLE 4.1.6 Let $X = \mathbb{R}^2$, $Y = \mathbb{R}$ and consider the function $f : \mathbb{R}^2 \longrightarrow \mathbb{R}$, defined by
\[ f(x) \stackrel{df}{=} \begin{cases} \dfrac{x_1^3 x_2}{x_1^4 + x_2^2} & \text{if } x = (x_1, x_2) \neq (0, 0), \\[2mm] 0 & \text{if } x = (x_1, x_2) = (0, 0). \end{cases} \]
The function $f$ is Gâteaux differentiable at $x = 0$ and
\[ f'_G(0) = 0. \]
However, it is not Fréchet differentiable at $x = 0$, since on the curve $h_1^2 = h_2$, we have
\[ \frac{|f(h)|}{\|h\|_{\mathbb{R}^2}} = \frac{|h_1^3 h_2|}{h_1^4 + h_2^2} \cdot \frac{1}{\sqrt{h_1^2 + h_2^2}} = \frac{1}{2} \frac{|h_1|}{\sqrt{h_1^2 + h_2^2}}, \]
so along this curve
\[ \lim_{\|h\|_{\mathbb{R}^2} \to 0} \frac{|f(h)|}{\|h\|_{\mathbb{R}^2}} = \frac{1}{2} \neq 0. \]
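The two limits in Example 4.1.6 can be checked numerically. The sketch below is illustrative only: the direction $h = (1, 2)$ and the sampled step sizes are arbitrary choices, and the helper names are hypothetical. It evaluates the difference quotients along a ray through the origin, which tend to $0$, and the ratio $|f(h)| / \|h\|$ along the curve $h_2 = h_1^2$, which stays near $1/2$.

```python
import math

def f(x1, x2):
    """The function of Example 4.1.6."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1**3 * x2 / (x1**4 + x2**2)

def ray_quotient(h1, h2, lam):
    """Difference quotient (f(0 + lam*h) - f(0)) / lam along the ray through h."""
    return f(lam * h1, lam * h2) / lam

def frechet_ratio_on_parabola(h1):
    """|f(h)| / ||h|| evaluated on the curve h2 = h1^2."""
    h2 = h1 * h1
    return abs(f(h1, h2)) / math.hypot(h1, h2)

steps = [10.0 ** (-k) for k in range(1, 7)]
ray_values = [ray_quotient(1.0, 2.0, lam) for lam in steps]
parabola_values = [frechet_ratio_on_parabola(h1) for h1 in steps]

for lam, r, p in zip(steps, ray_values, parabola_values):
    print(f"step={lam:.0e}  ray quotient={r:.3e}  ratio on parabola={p:.6f}")
```

The ray quotients decay linearly in the step, while the ratio along the parabola does not decay at all, which is exactly the obstruction to Fréchet differentiability.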

The next proposition establishes the exact relation between Gâteaux and Fréchet derivatives.


PROPOSITION 4.1.7 If $f : X \longrightarrow Y$ is a Gâteaux differentiable function at all points of some neighbourhood of $x \in X$ and $f'_G(\cdot)$ is continuous at $x \in X$, then $f$ is also Fréchet differentiable at $x \in X$.

PROOF Let us set
\[ u(x, h) \stackrel{df}{=} f(x + h) - f(x) - f'_G(x)h. \]
Then for every $y^* \in Y^*$, we have
\[ \langle y^*, u(x, h) \rangle_Y = \langle y^*, f(x + h) - f(x) \rangle_Y - \langle y^*, f'_G(x)h \rangle_Y. \]
By virtue of the mean value theorem, we can find $\lambda \in (0, 1)$ (depending on $y^*$), such that
\[ \langle y^*, u(x, h) \rangle_Y = \langle y^*, f'_G(x + \lambda h)h - f'_G(x)h \rangle_Y. \]
We can find $y^* \in Y^*$ with $\|y^*\|_{Y^*} = 1$, such that
\[ \|u(x, h)\|_Y = \big| \langle y^*, u(x, h) \rangle_Y \big|. \]
Then we have
\[ \|u(x, h)\|_Y \leq \|f'_G(x + \lambda h) - f'_G(x)\|_L \|h\|_X, \]
so
\[ \frac{\|u(x, h)\|_Y}{\|h\|_X} \leq \|f'_G(x + \lambda h) - f'_G(x)\|_L, \]
and so
\[ \frac{\|u(x, h)\|_Y}{\|h\|_X} \longrightarrow 0 \quad \text{as } \|h\|_X \to 0 \]
(since by hypothesis $f'_G(\cdot)$ is continuous at $x \in X$).

Before proceeding further, let us give some examples of Fréchet differentiable maps.

EXAMPLE 4.1.8
(a) Let $X = H$ be a Hilbert space. Let $A \in L(H)$ and let $f : H \longrightarrow \mathbb{R}$ be defined by
\[ f(x) \stackrel{df}{=} (Ax, x)_H \qquad \forall\, x \in H. \]
Then
\[ f(x + h) - f(x) - (Ax + A^* x, h)_H = o\big(\|h\|_H\big) \quad \text{as } h \to 0 \]
and so
\[ f'_F(x) = (A + A^*)x. \]
If $A = \mathrm{id}_H$, then $f(x) = \|x\|_H^2$ and we have that
\[ f'_F(x) = 2x \qquad \forall\, x \in H. \]

(b) Let $\Omega \subseteq \mathbb{R}^N$ be an open set and let $f : \Omega \times \mathbb{R} \longrightarrow \mathbb{R}$ be a Carathéodory function. Suppose that
\[ |f(z, x)| \leq a(z) + c|x|^{p-1} \quad \text{for a.a. } z \in \Omega \text{ and all } x \in \mathbb{R}, \]
with $p \in [1, +\infty)$, $a \in L^{p'}(\Omega)_+$, $\frac{1}{p} + \frac{1}{p'} = 1$ and $c > 0$. Let $F$ be the potential function corresponding to $f$, i.e.,
\[ F(z, x) \stackrel{df}{=} \int_0^x f(z, r)\, dr. \]
Using the mean value theorem, we can see that
\[ |F(z, x)| \leq \widehat{a}(z) + \widehat{c}\,|x|^p \quad \text{for a.a. } z \in \Omega \text{ and all } x \in \mathbb{R}, \]
with $\widehat{a} \in L^1(\Omega)_+$ and $\widehat{c} > 0$. Then consider the continuous functional $\varphi : L^p(\Omega) \longrightarrow \mathbb{R}$, defined by
\[ \varphi(u) \stackrel{df}{=} \int_\Omega F\big(z, u(z)\big)\, dz. \]
We claim that $\varphi$ is continuously Fréchet differentiable and
\[ \varphi'(u) = N_f(u) \qquad \forall\, u \in L^p(\Omega). \]
To this end let
\[ \xi(h) = \int_\Omega F\big(z, (u + h)(z)\big)\, dz - \int_\Omega F\big(z, u(z)\big)\, dz - \int_\Omega f\big(z, u(z)\big) h(z)\, dz. \]
Note that
\[ F\big(z, (u + h)(z)\big) - F\big(z, u(z)\big) = \int_0^1 \frac{d}{dt} F\big(z, (u + th)(z)\big)\, dt = \int_0^1 f\big(z, (u + th)(z)\big) h(z)\, dt. \]
Therefore, using Fubini’s theorem and Hölder’s inequality (see, e.g., Theorem A.2.27), we have
\begin{align*}
|\xi(h)| &\leq \int_\Omega \int_0^1 \big| f\big(z, (u + th)(z)\big) - f\big(z, u(z)\big) \big|\, |h(z)|\, dt\, dz \\
&= \int_0^1 \int_\Omega \big| f\big(z, (u + th)(z)\big) - f\big(z, u(z)\big) \big|\, |h(z)|\, dz\, dt \\
&\leq \|h\|_p \int_0^1 \big\| N_f(u + th) - N_f(u) \big\|_{p'}\, dt.
\end{align*}
Because $N_f$ is continuous (see Theorem 3.4.4), by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we can conclude that
\[ \frac{\xi(h)}{\|h\|_p} \longrightarrow 0 \quad \text{as } \|h\|_p \to 0. \]
This proves that
\[ \varphi'_F(u) = N_f(u) \]
and so $\varphi \in C^1\big(L^p(\Omega)\big)$. This example is important in the variational methods for the study of boundary value problems.
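Part (b) of Example 4.1.8 can be illustrated with a crude discretization. The sketch below rests on several assumptions that are not in the text: $\Omega = (0, 1)$, the nonlinearity $f(z, x) = x^3$ (so $p = 4$ and $F(z, x) = x^4/4$), the functions `u` and `g`, and a midpoint quadrature with `N` nodes. It evaluates the remainder $\xi(h)$ for $h = t g$ and prints $|\xi(h)| / \|h\|_p$ for decreasing $t$.

```python
import math

N = 2000  # number of midpoint quadrature nodes on Omega = (0, 1) (assumption)
NODES = [(i + 0.5) / N for i in range(N)]

def integrate(values):
    """Midpoint rule on (0, 1) for values sampled at NODES."""
    return sum(values) / N

def f(x):
    """Hypothetical nonlinearity f(z, x) = x^3, with growth exponent p - 1 = 3."""
    return x ** 3

def F(x):
    """Potential of f: F(z, x) = x^4 / 4."""
    return 0.25 * x ** 4

def phi(u):
    """Discretized integral functional phi(u) = int_Omega F(z, u(z)) dz."""
    return integrate([F(ui) for ui in u])

def lp_norm(h, p=4):
    return integrate([abs(hi) ** p for hi in h]) ** (1.0 / p)

u = [math.sin(2.0 * math.pi * z) for z in NODES]  # base point (assumption)
g = [math.cos(3.0 * z) for z in NODES]            # direction (assumption)

def remainder_ratio(t):
    """|xi(h)| / ||h||_p for h = t * g, with xi as in the example."""
    h = [t * gi for gi in g]
    u_plus_h = [ui + hi for ui, hi in zip(u, h)]
    linear_part = integrate([f(ui) * hi for ui, hi in zip(u, h)])
    xi = phi(u_plus_h) - phi(u) - linear_part
    return abs(xi) / lp_norm(h)

ratios = [remainder_ratio(10.0 ** (-k)) for k in range(0, 4)]
for k, r in enumerate(ratios):
    print(f"t = 1e-{k}:  |xi(h)| / ||h||_4 = {r:.3e}")
```

For this smooth nonlinearity the ratio decays roughly linearly in $t$, consistent with $\varphi'_F(u) = N_f(u)$. A single direction is of course only a consistency check and not a proof.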

PROPOSITION 4.1.9 If $f : X \longrightarrow Y$ is a function which is Fréchet differentiable at $x \in X$, then $f$ is continuous at $x \in X$.

PROOF Since $f$ is Fréchet differentiable at $x$, we can find $\delta > 0$, such that
\[ \|f(x + h) - f(x) - f'_F(x)h\|_Y \leq \|h\|_X \qquad \forall\, \|h\|_X \leq \delta, \]
so
\[ \|f(x + h) - f(x)\|_Y \leq \big(1 + \|f'_F(x)\|_L\big) \|h\|_X \qquad \forall\, \|h\|_X \leq \delta. \]
This proves the continuity of $f$ at $x \in X$.

REMARK 4.1.10 The above proposition is no longer true if Fréchet differentiability is replaced by Gâteaux differentiability. To see this consider the function $f : \mathbb{R}^2 \longrightarrow \mathbb{R}$, defined by
\[ f(x_1, x_2) \stackrel{df}{=} \begin{cases} \dfrac{x_1^4 x_2}{x_1^6 + x_2^3} & \text{if } (x_1, x_2) \neq 0, \\[2mm] 0 & \text{if } (x_1, x_2) = 0, \end{cases} \qquad \forall\, x = (x_1, x_2) \in \mathbb{R}^2. \]
Then $f'_G(0, 0) = 0$ but $f$ is not continuous at the origin.


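In the case of Gâteaux differentiable maps we can conclude that they are continuous along rays.

PROPOSITION 4.1.11 If $f : X \longrightarrow Y$ is a function, which is Gâteaux differentiable at $x \in X$, then
\[ \lim_{\lambda \to 0} \|f(x + \lambda h) - f(x)\|_Y = 0 \qquad \forall\, h \in X. \]

PROOF Let
\[ \varphi(\lambda) \stackrel{df}{=} f(x + \lambda h) \qquad \forall\, \lambda \in \mathbb{R}. \]
Then $\varphi$ is differentiable at $0$, hence continuous there. So $\varphi(\lambda) \longrightarrow \varphi(0)$ as $\lambda \to 0$, which implies that
\[ f(x + \lambda h) \longrightarrow f(x) \quad \text{in } Y, \text{ as } \lambda \to 0. \]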

We have a chain rule for these derivatives.

PROPOSITION 4.1.12 If $Z$ is a Banach space too, $f : X \longrightarrow Y$ is a function which is Gâteaux differentiable at $x \in X$ and $g : Y \longrightarrow Z$ is a function which is Fréchet differentiable at $f(x)$, then the function $k \stackrel{df}{=} g \circ f : X \longrightarrow Z$ is Gâteaux differentiable at $x \in X$ and
\[ k'_G(x) = g'_F\big(f(x)\big) f'_G(x). \]
Moreover, if $f$ is Fréchet differentiable at $x \in X$, then $k$ is Fréchet differentiable at $x$.

PROOF For $\lambda \neq 0$, we have
\begin{align*}
& \frac{1}{|\lambda|} \big\| k(x + \lambda h) - k(x) - \lambda g'_F\big(f(x)\big) f'_G(x)h \big\|_Z \\
& \quad \leq \frac{1}{|\lambda|} \big\| g\big(f(x + \lambda h)\big) - g\big(f(x)\big) - g'_F\big(f(x)\big)\big(f(x + \lambda h) - f(x)\big) \big\|_Z \\
& \qquad + \frac{1}{|\lambda|} \big\| g'_F\big(f(x)\big)\big(f(x + \lambda h) - f(x) - \lambda f'_G(x)h\big) \big\|_Z. \tag{4.1}
\end{align*}
Since $f$ is Gâteaux differentiable at $x \in X$, the second summand in the right hand side of (4.1) goes to zero as $\lambda \to 0$. Also suppose that $f(x + \lambda h) \neq f(x)$. Then since
\[ f(x + \lambda h) \longrightarrow f(x) \quad \text{in } Y \text{ as } \lambda \to 0 \]
(see Proposition 4.1.11) and because $g$ is Fréchet differentiable at $f(x)$, we have that the first summand in the right hand side of (4.1) goes to zero as $\lambda \to 0$. This proves that
\[ k'_G(x) = g'_F\big(f(x)\big) f'_G(x). \]
The proof is similar if $f$ is Fréchet differentiable at $x \in X$.

COROLLARY 4.1.13 If $f : X \longrightarrow Y$ is a function, which is Gâteaux differentiable at every point of the interval
\[ [x, x + h] \stackrel{df}{=} \big\{ u \in X : u = \lambda x + (1 - \lambda)(x + h),\ \lambda \in [0, 1] \big\}, \]
then
\[ f(x + h) - f(x) = \int_0^1 f'_G(x + th)h\, dt. \]
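The integral formula of Corollary 4.1.13 lends itself to a quick numerical check in finite dimensions. In the sketch below the map `f`, its hand-computed Jacobian, the points `x`, `h` and the midpoint quadrature are all assumptions made for illustration. It compares $f(x + h) - f(x)$ with a quadrature approximation of $\int_0^1 f'_G(x + th)h\, dt$.

```python
import math

def f(x):
    """Illustrative smooth map f: R^2 -> R^2 (hypothetical choice)."""
    x1, x2 = x
    return [math.exp(x1) * x2, x1 * x1 + math.cos(x2)]

def derivative_applied(x, h):
    """Apply the hand-computed Jacobian of f at x to the direction h."""
    x1, x2 = x
    h1, h2 = h
    return [math.exp(x1) * x2 * h1 + math.exp(x1) * h2,
            2.0 * x1 * h1 - math.sin(x2) * h2]

def integral_of_derivative(x, h, steps=2000):
    """Midpoint rule for the integral over t in [0, 1] of f'(x + t h) h."""
    total = [0.0, 0.0]
    for i in range(steps):
        t = (i + 0.5) / steps
        point = [x[0] + t * h[0], x[1] + t * h[1]]
        d = derivative_applied(point, h)
        total[0] += d[0] / steps
        total[1] += d[1] / steps
    return total

x = [0.2, -0.5]
h = [1.0, 0.8]
lhs = [a - b for a, b in zip(f([x[0] + h[0], x[1] + h[1]]), f(x))]
rhs = integral_of_derivative(x, h)
gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print("f(x+h) - f(x)     =", lhs)
print("quadrature value  =", rhs)
print("largest deviation =", gap)
```

The remaining deviation is the quadrature error of the midpoint rule and decreases as `steps` grows.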

In the next proposition we show that compactness of a map is passed to its Fréchet derivative.

PROPOSITION 4.1.14 If $f : X \longrightarrow Y$ is a function which is compact and Fréchet differentiable at $x \in X$, then $f'_F(x) \in L_c(X; Y)$.

PROOF Suppose that the proposition is not true. Then we can find $\varepsilon > 0$ and $\{x_n\}_{n \geq 1} \subseteq X$ with $\|x_n\|_X \leq 1$ for all $n \geq 1$, such that
\[ \|f'_F(x)x_n - f'_F(x)x_m\|_Y \geq 3\varepsilon \qquad \forall\, n \neq m. \]
Because $f$ is Fréchet differentiable at $x$, we have
\[ f(x + h) - f(x) = f'_F(x)h + u(x, h) \tag{4.2} \]
and we can find $\delta > 0$, such that
\[ \|u(x, h)\|_Y \leq \varepsilon \|h\|_X \qquad \forall\, \|h\|_X \leq \delta. \]
Therefore, from (4.2), we have
\begin{align*}
\|f(x + \delta x_n) - f(x + \delta x_m)\|_Y
&\geq \delta \|f'_F(x)(x_n - x_m)\|_Y - \|u(x, \delta x_n)\|_Y - \|u(x, \delta x_m)\|_Y \\
&\geq 3\varepsilon\delta - \delta\varepsilon - \delta\varepsilon = \delta\varepsilon,
\end{align*}
a contradiction to the fact that $f$ is compact.


For the next proposition, we need to introduce the following definition.

DEFINITION 4.1.15
(a) A function $f : [a, b] \longrightarrow X$ is said to be right differentiable at $t \in [a, b)$, if the limit
\[ \lim_{h \to 0^+} \frac{1}{h} \big[ f(t + h) - f(t) \big] \]
exists. We denote this limit by $f'_+(t)$ and we call it the right derivative of $f$ at $t$. Evidently $f'_+(t) \in X$.

(b) Similarly a function $f : [a, b] \longrightarrow X$ is said to be left differentiable at $t \in (a, b]$, if the limit
\[ \lim_{h \to 0^-} \frac{1}{h} \big[ f(t + h) - f(t) \big] \]
exists. We denote this limit by $f'_-(t)$ and we call it the left derivative of $f$ at $t$. Evidently $f'_-(t) \in X$.

REMARK 4.1.16 A function $f : [a, b] \longrightarrow X$ is Fréchet differentiable at $t \in (a, b)$ if and only if $f'_-(t) = f'_+(t)$.

PROPOSITION 4.1.17 If $f : [a, b] \longrightarrow X$, $g : [a, b] \longrightarrow \mathbb{R}$ are continuous functions, $f'_+(t)$, $g'_+(t)$ exist at all $t \in (a, b)$ and
\[ \|f'_+(t)\|_X \leq g'_+(t) \qquad \forall\, t \in (a, b), \]
then
\[ \|f(b) - f(a)\|_X \leq g(b) - g(a). \]

PROOF Let $\varepsilon > 0$ be given and consider the set
\[ U \stackrel{df}{=} \big\{ t \in [a, b] : \|f(t) - f(a)\|_X > g(t) - g(a) + \varepsilon(t - a) + \varepsilon \big\}. \]
Clearly $U$ is an open set. Suppose that $U$ is nonempty and let $c \stackrel{df}{=} \inf U$. We can say the following:
(a) $c > a$: this follows from the continuity of $f$ and $g$;
(b) $c \notin U$: since $U$ is open;
(c) $c < b$: otherwise $U = \{b\}$ which is not open.


So we have that $a < c < b$. By hypothesis we have
\[ \|f'_+(c)\|_X \leq g'_+(c). \]
Let $h > 0$ be such that if $t \in (c, c + h]$, we have
\[ \|f'_+(c)\|_X \geq \frac{\|f(t) - f(c)\|_X}{t - c} - \frac{\varepsilon}{2} \quad \text{and} \quad g'_+(c) \leq \frac{g(t) - g(c)}{t - c} + \frac{\varepsilon}{2}. \]
It follows that
\[ \|f(t) - f(c)\|_X \leq g(t) - g(c) + \varepsilon(t - c) \qquad \forall\, t \in [c, c + h]. \tag{4.3} \]
Also because $c \notin U$, we have
\[ \|f(c) - f(a)\|_X \leq g(c) - g(a) + \varepsilon(c - a) + \varepsilon. \tag{4.4} \]
From (4.3) and (4.4), we obtain
\[ \|f(t) - f(a)\|_X \leq g(t) - g(a) + \varepsilon(t - a) + \varepsilon \qquad \forall\, t \in [c, c + h]. \]
We infer that $\inf U \geq c + h$, a contradiction. So $U = \emptyset$ and we obtain
\[ \|f(t) - f(a)\|_X \leq g(t) - g(a) + \varepsilon(t - a) + \varepsilon \qquad \forall\, t \in [a, b]. \]
Let $t = b$ and $\varepsilon \searrow 0$ to obtain the desired inequality.

REMARK 4.1.18 We have an analogous result if we replace the right derivatives by the left ones. Moreover, we can weaken the hypotheses of Proposition 4.1.17 and assume that there is a countable set $D \subseteq [a, b]$, such that $f'_+(t)$, $g'_+(t)$ exist for all $t \in [a, b] \setminus D$ and
\[ \|f'_+(t)\|_X \leq g'_+(t) \qquad \forall\, t \in [a, b] \setminus D. \]

COROLLARY 4.1.19 If $f : [a, b] \longrightarrow X$ is continuous, right differentiable at every $t \in (a, b)$ and
\[ \|f'_+(t)\|_X \leq k \qquad \forall\, t \in (a, b), \]
then
\[ \|f(t) - f(s)\|_X \leq k|t - s| \qquad \forall\, t, s \in [a, b]. \]

COROLLARY 4.1.20 If $g : [a, b] \longrightarrow \mathbb{R}$ is a continuous function, which is right differentiable at every $t \in (a, b)$, then $g$ is increasing if and only if
\[ g'_+(t) \geq 0 \qquad \forall\, t \in (a, b). \]


We have the following mean value theorem.

PROPOSITION 4.1.21 (Mean Value Theorem) If $f : X \longrightarrow \mathbb{R}$ is a Gâteaux differentiable function, then we can find $\lambda_0 \in (0, 1)$, such that
\[ f(x + h) - f(x) = \langle f'_G(x + \lambda_0 h), h \rangle_X. \]

PROOF Let
\[ \varphi(\lambda) \stackrel{df}{=} f(x + \lambda h). \]
Recall that
\[ \langle f'_G(x + \lambda_0 h), h \rangle_X = \varphi'(\lambda_0) \]
(see Remark 4.1.2). Using the mean value theorem for scalar functions, we can find $\lambda_0 \in (0, 1)$, such that
\[ \varphi(1) - \varphi(0) = \varphi'(\lambda_0), \]
so
\[ f(x + h) - f(x) = \langle f'_G(x + \lambda_0 h), h \rangle_X. \]

In general for vector valued functions the mean value theorem fails as the next example illustrates.

EXAMPLE 4.1.22 Let $f : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$ be defined by
\[ f(x) \stackrel{df}{=} (x_1^3, x_2^2) \qquad \forall\, x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \in \mathbb{R}^2. \]
We have
\[ f'(x) = \begin{bmatrix} 3x_1^2 & 0 \\ 0 & 2x_2 \end{bmatrix}. \]
If $x = \binom{0}{0}$ and $y = \binom{1}{1}$, then it is clear that there is no $z \in [x, y] = \big\{ \lambda x + (1 - \lambda)y : \lambda \in [0, 1] \big\}$, such that
\[ f(y) - f(x) = f'(z)(y - x). \]

For vector valued functions the mean value theorem takes an inequality form.

PROPOSITION 4.1.23 (Mean Value Theorem) If $f : X \longrightarrow Y$ is a Gâteaux differentiable function and $x, h \in X$, $y^* \in Y^*$, then we can find $\lambda_0 \in (0, 1)$, such that
\[ \langle y^*, f(x + h) - f(x) \rangle_Y = \langle y^*, f'_G(x + \lambda_0 h)h \rangle_Y \]
and
\[ \|f(x + h) - f(x)\|_Y \leq \|f'_G(x + \lambda_0 h)\|_L \|h\|_X. \]

PROOF Let
\[ g(x) \stackrel{df}{=} \langle y^*, f(x) \rangle_Y. \]
Then
\[ \langle g'_G(x), h \rangle_X = \langle y^*, f'_G(x)h \rangle_Y. \]
From Proposition 4.1.21, we know that we can find $\lambda_0 \in (0, 1)$, such that
\[ g(x + h) - g(x) = \langle g'_G(x + \lambda_0 h), h \rangle_X, \]
hence
\[ \langle y^*, f(x + h) - f(x) \rangle_Y = \langle y^*, f'_G(x + \lambda_0 h)h \rangle_Y. \]
Since $y^* \in Y^*$ is arbitrary, we choose $y^*$ with $\|y^*\|_{Y^*} = 1$, such that
\[ \langle y^*, f(x + h) - f(x) \rangle_Y = \|f(x + h) - f(x)\|_Y. \]
So we can find $\lambda_0 \in (0, 1)$, such that
\[ \|f(x + h) - f(x)\|_Y = \langle y^*, f'_G(x + \lambda_0 h)h \rangle_Y \leq \|f'_G(x + \lambda_0 h)\|_L \|h\|_X. \]
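Example 4.1.22 and the inequality of Proposition 4.1.23 can be examined side by side for the map $f(x) = (x_1^3, x_2^2)$. In the sketch below, the grid over the segment, the Euclidean norm on $\mathbb{R}^2$ and the helper names are assumptions made for illustration. Equality $f(y) - f(x) = f'(z)(y - x)$ with $z = (t, t)$ would need $3t^2 = 1$ and $2t = 1$ at the same time, so the residual stays away from zero, whereas the norm inequality is satisfied for suitable $t$.

```python
import math

def f(x):
    """The map of Example 4.1.22."""
    return [x[0] ** 3, x[1] ** 2]

def derivative(z):
    """Jacobian matrix of f at z."""
    return [[3.0 * z[0] ** 2, 0.0],
            [0.0, 2.0 * z[1]]]

x = [0.0, 0.0]
y = [1.0, 1.0]
increment = [f(y)[i] - f(x)[i] for i in range(2)]  # f(y) - f(x)
direction = [y[i] - x[i] for i in range(2)]        # y - x

def residual(t):
    """Euclidean norm of f'(z)(y - x) - (f(y) - f(x)) at z = x + t (y - x)."""
    z = [x[i] + t * direction[i] for i in range(2)]
    d = derivative(z)
    image = [d[i][0] * direction[0] + d[i][1] * direction[1] for i in range(2)]
    return math.hypot(image[0] - increment[0], image[1] - increment[1])

def operator_norm(z):
    """For a diagonal matrix the Euclidean operator norm is the largest |entry|."""
    return max(abs(3.0 * z[0] ** 2), abs(2.0 * z[1]))

grid = [i / 10000 for i in range(10001)]  # sample points t in [0, 1]
min_residual = min(residual(t) for t in grid)
bound_holds = any(
    math.hypot(*increment) <= operator_norm([t, t]) * math.hypot(*direction)
    for t in grid
)
print("smallest residual on the grid:", min_residual)
print("norm inequality holds for some t:", bound_holds)
```

The grid search is only a numerical illustration; the algebraic incompatibility of $3t^2 = 1$ and $2t = 1$ is what actually rules out equality.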

COROLLARY 4.1.24 If $U \subseteq X$ is connected and open, $f : U \longrightarrow Y$ is a Gâteaux differentiable function and
\[ f'_G(x) = 0 \qquad \forall\, x \in U, \]
then $f$ is constant on $U$.

Next we state and prove two major results of differential calculus. These are the implicit function theorem and the inverse function theorem. The implicit function theorem deals with the following situation. Let a function $f(x, y)$ be given and suppose that $f(x_0, y_0) = c$. Can we find a function $x \longmapsto y = g(x)$, which at least locally satisfies
\[ f\big(x, g(x)\big) = c\,? \]
We want $g$ to be differentiable provided $f$ is differentiable. Moreover, in the neighbourhood where
\[ f\big(x, g(x)\big) = c \]
is valid, $g(x)$ should be the unique solution. To better motivate this consider the following simple example.


EXAMPLE 4.1.25 Let $f : \mathbb{R}^2 \longrightarrow \mathbb{R}$ be defined by
\[ f(x, y) \stackrel{df}{=} x^2 + y^2 - 1. \]
We consider the $0$-level set of $f$, namely the set of those $x, y \in \mathbb{R}$ that satisfy $f(x, y) = 0$, which in our case is of course the unit circle. We look for a function $g(x)$, such that $f\big(x, g(x)\big) = 0$ for all $x$ in the domain of $g$. Evidently
\[ g(x) = \pm\sqrt{1 - x^2} \]
and so $g$ need not be unique unless we restrict its domain. Also near $x_0 = \pm 1$, $g$ could be either square root, so it is not uniquely determined. Note that at $x_0 = \pm 1$, $g$ is not differentiable and
\[ \frac{\partial f}{\partial y} = 0. \]
So to produce a unique differentiable function $g$, such that
\[ f\big(x, g(x)\big) = 0, \]
we need to look locally and impose some condition like
\[ \frac{\partial f}{\partial y} \neq 0. \]

The proof of the implicit function theorem uses the Banach fixed point theorem, which we state here in the form needed and postpone the proof of the general version until Section 7.1.

PROPOSITION 4.1.26 (Banach Fixed Point Theorem) If $V$ is a Banach space, $C$ is a closed subset of $V$ and $S : C \longrightarrow C$ satisfies
\[ \|S(v_1) - S(v_2)\|_V \leq k \|v_1 - v_2\|_V \qquad \forall\, v_1, v_2 \in C, \]
for some $k \in [0, 1)$, then there exists a unique $v \in C$, such that $v = S(v)$. Moreover, if we have a parametrized family $\{S(x)\}_{x \in U}$ (with $U$ being an open subset of a Banach space $W$) satisfying the above contraction condition with $k \in [0, 1)$ independent of $x$, then the unique solution $v = v(x)$ of $v = S(x)v$ depends continuously on $x$.
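The way a parametrized contraction produces an implicit function can be illustrated on the circle of Example 4.1.25. The sketch below is a minimal illustration under assumptions that are not in the text: the base point $(x_0, y_0) = (0, 1)$, the map $S(x)y = y - f(x, y) / \frac{\partial f}{\partial y}(x_0, y_0)$, the tolerance and the sample values of $x$. On $|y - 1| \leq \frac{1}{2}$ the derivative of $S(x)$ in $y$ equals $1 - y$, so the contraction constant can be taken as $k = \frac{1}{2}$, independently of $x$.

```python
import math

def f(x, y):
    """Level function of Example 4.1.25."""
    return x * x + y * y - 1.0

X0, Y0 = 0.0, 1.0  # base point on the circle (assumption)
L0 = 2.0 * Y0      # partial derivative of f with respect to y at (X0, Y0)

def S(x, y):
    """Parametrized map y -> y - f(x, y) / L0, a contraction near y = Y0."""
    return y - f(x, y) / L0

def fixed_point(x, y_start=Y0, tol=1e-14, max_iter=200):
    """Successive approximations y_{n+1} = S(x, y_n)."""
    y = y_start
    for _ in range(max_iter):
        y_next = S(x, y)
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    return y

xs = [-0.3, -0.1, 0.0, 0.1, 0.3]
solutions = [fixed_point(x) for x in xs]
errors = [abs(g - math.sqrt(1.0 - x * x)) for x, g in zip(xs, solutions)]
for x, g, e in zip(xs, solutions, errors):
    print(f"x = {x:+.1f}   fixed point = {g:.12f}   deviation from sqrt(1 - x^2) = {e:.1e}")
```

The fixed points reproduce the upper branch $g(x) = \sqrt{1 - x^2}$ and vary continuously with $x$, as the parametrized part of the proposition predicts.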


Using this proposition we can prove the implicit function theorem. In what follows for a function $f(x, y)$ by $D_1 f(x, y)$ (respectively $D_2 f(x, y)$) we denote the partial derivative of $f$ with respect to $x$ (respectively $y$).

THEOREM 4.1.27 (Implicit Function Theorem) If $X, Y, Z$ are three Banach spaces, $U \subseteq X \times Y$ is an open set, $(x_0, y_0) \in U$, $f : U \longrightarrow Z$ is a continuously differentiable function, $f(x_0, y_0) = 0$ and $D_2 f(x_0, y_0) \in L(Y; Z)$ is invertible with a continuous inverse, i.e., $D_2 f(x_0, y_0)$ is an isomorphism, then there exist neighbourhoods $U_1$ of $x_0$ and $U_2$ of $y_0$, such that $U_1 \times U_2 \subseteq U$ and a unique continuously differentiable function $g : U_1 \longrightarrow U_2$, such that
\[ f\big(x, g(x)\big) = 0 \qquad \forall\, x \in U_1 \]
and
\[ Dg(x) = -\Big( D_2 f\big(x, g(x)\big) \Big)^{-1} D_1 f\big(x, g(x)\big) \qquad \forall\, x \in U_1. \]

PROOF Let
\[ L_0 \stackrel{df}{=} D_2 f(x_0, y_0) \in L(Y; Z). \]
By hypothesis $L_0$ is an isomorphism. Then the equation $f(x, y) = 0$ can be equivalently rewritten as
\[ y = y - L_0^{-1} f(x, y). \tag{4.5} \]
The advantage of passing to (4.5) is that we can apply Proposition 4.1.26. Namely for every $x$, we look for a fixed point of $y \longmapsto y - L_0^{-1} f(x, y)$ and to do this we employ Proposition 4.1.26. Let us set
\[ h(x, y) \stackrel{df}{=} y - L_0^{-1} f(x, y). \]
Since $L_0^{-1} \circ L_0 = \mathrm{id}_Y$, we have
\[ h(x, y_1) - h(x, y_2) = L_0^{-1} \Big[ L_0(y_1 - y_2) - \big( f(x, y_1) - f(x, y_2) \big) \Big]. \]
Because $f$ is $C^1$ at $(x_0, y_0)$ and $L_0$ is an isomorphism, we can find $\delta_1 > 0$ and $\vartheta > 0$, such that if $\|x - x_0\|_X \leq \delta_1$, $\|y_1 - y_0\|_Y \leq \vartheta$, $\|y_2 - y_0\|_Y \leq \vartheta$, then
\[ \|h(x, y_1) - h(x, y_2)\|_Y \leq \frac{1}{2} \|y_1 - y_2\|_Y. \tag{4.6} \]
Also because of the continuity of $h(\cdot, y_0)$, we can find $\delta_2 > 0$, such that if $\|x - x_0\|_X \leq \delta_2$, then
\[ \|h(x, y_0) - h(x_0, y_0)\|_Y < \frac{\vartheta}{2}. \tag{4.7} \]


Therefore, from (4.6) and (4.7), if $\delta = \min\{\delta_1, \delta_2\}$ and $\|x - x_0\|_X \leq \delta$, $\|y - y_0\|_Y \leq \vartheta$, we have
\begin{align*}
\|h(x, y) - y_0\|_Y &= \|h(x, y) - h(x_0, y_0)\|_Y \\
&\leq \|h(x, y) - h(x, y_0)\|_Y + \|h(x, y_0) - h(x_0, y_0)\|_Y \\
&\leq \frac{1}{2} \|y - y_0\|_Y + \frac{\vartheta}{2} \leq \vartheta. \tag{4.8}
\end{align*}
So we see that $h(x, \cdot)$ maps $\overline{B}_\vartheta(y_0) = \{ y \in Y : \|y - y_0\|_Y \leq \vartheta \}$ into itself as well as $B_\vartheta(y_0) = \{ y \in Y : \|y - y_0\|_Y < \vartheta \}$ into itself (see (4.7) and (4.8)), for all $x \in \overline{B}_\delta(x_0) = \{ x \in X : \|x - x_0\|_X \leq \delta \}$. We can apply Proposition 4.1.26 to the parametric family $\{ y \longmapsto h(x, y) \}_{x \in \overline{B}_\delta(x_0)}$. So for every $x \in \overline{B}_\delta(x_0)$, we can find a unique $y = y(x) \in \overline{B}_\vartheta(y_0)$, such that $h(x, y) = y$, hence $f(x, y) = 0$ and the function $g(x) = y(x)$ is continuous. Let
\[ U_1 \stackrel{df}{=} B_\delta(x_0) \quad \text{and} \quad U_2 \stackrel{df}{=} B_\vartheta(y_0). \]
Evidently by choosing $\delta > 0$ and $\vartheta > 0$ small enough we can have that $U_1 \times U_2 \subseteq U$. We claim that the function $g : B_\delta(x_0) \longrightarrow Y$ is continuously differentiable. To this end let $(x_1, y_1) \in U_1 \times U_2$, $y_1 = g(x_1)$ (recall that $h(x, \cdot)$ maps $U_2$ into itself). Exploiting the differentiability of $f$ at $(x_1, y_1)$, we have
\[ f(x, y) = A(x - x_1) + B(y - y_1) + u(x, y) \qquad \forall\, (x, y) \in U, \]
with
\[ A \stackrel{df}{=} D_1 f(x_1, y_1), \qquad B \stackrel{df}{=} D_2 f(x_1, y_1) \]
and
\[ \lim_{(x, y) \to (x_1, y_1)} \frac{\|u(x, y)\|_Z}{\|(x - x_1, y - y_1)\|_{X \times Y}} = 0. \]
Recall that
\[ f\big(x, g(x)\big) = 0 \qquad \forall\, x \in U_1. \]
Hence
\[ g(x) = -B^{-1} A(x - x_1) + y_1 - B^{-1} u\big(x, g(x)\big). \tag{4.9} \]
We can find $r_1, r_2 > 0$, such that if $\|x - x_1\|_X \leq r_1$, $\|y - y_1\|_Y \leq r_2$, then
\[ \|u(x, y)\|_Z \leq \frac{1}{2\|B^{-1}\|_L} \big( \|x - x_1\|_X + \|y - y_1\|_Y \big), \]
so
\[ \|u(x, g(x))\|_Z \leq \frac{1}{2\|B^{-1}\|_L} \big( \|x - x_1\|_X + \|g(x) - g(x_1)\|_Y \big). \tag{4.10} \]


From (4.9) and (4.10), it follows that
$$\|g(x) - g(x_1)\|_Y \le \|B^{-1}A\|_{\mathcal{L}}\|x - x_1\|_X + \tfrac{1}{2}\|x - x_1\|_X + \tfrac{1}{2}\|g(x) - g(x_1)\|_Y,$$
so
$$\|g(x) - g(x_1)\|_Y \le \eta\,\|x - x_1\|_X, \tag{4.11}$$
with $\eta \stackrel{\mathrm{df}}{=} 2\|B^{-1}A\|_{\mathcal{L}} + 1$. Let
$$v(x) \stackrel{\mathrm{df}}{=} -B^{-1}u\big(x, g(x)\big).$$
From (4.9) and $g(x_1) = y_1$, we have
$$g(x) - g(x_1) = -B^{-1}A(x - x_1) + v(x) \tag{4.12}$$
and since
$$\|v(x)\|_Y \le \|B^{-1}\|_{\mathcal{L}}\,\big\|u\big(x, g(x)\big)\big\|_Z,$$
$g$ is continuous and (4.11) holds, we have
$$\lim_{x\to x_1} \frac{\|v(x)\|_Y}{\|x - x_1\|_X} = 0. \tag{4.13}$$

From (4.12) and (4.13), it follows that $g$ is Fréchet differentiable at $x_1 \in U_1$ and
$$g'_F(x_1) = -B^{-1}A = -\big(D_2 f(x_1, y_1)\big)^{-1} D_1 f(x_1, y_1),$$
which means that $g$ is continuously differentiable.

DEFINITION 4.1.28 Let $Z$ be a Banach space and let $V \subseteq Z$ be a closed subspace of $Z$. We say that $V$ is complemented, if there is a closed subspace $W$ of $Z$, such that $Z = V \oplus W$ (i.e., $Z = V + W$ and $V \cap W = \{0\}$).

REMARK 4.1.29 The subspace $V \subseteq Z$ is complemented if and only if there exists a bounded linear projection of $Z$ onto $V$, i.e., there exists $P_V \in \mathcal{L}(Z)$, such that
$$P_V|_V = \mathrm{id}_V \quad \text{and} \quad P_V(Z) = V.$$
The closed subspace $c_0$ of $l^\infty$ is not complemented. If every closed subspace of a Banach space $Z$ is complemented, then $Z$ is isomorphic to a Hilbert space. Every subspace of the Banach space $Z$ which is either finite dimensional or has finite codimension is complemented. Finally in a Hilbert space every closed subspace is complemented (take the orthogonal complement).


An interesting consequence of Theorem 4.1.27 is the following corollary.

COROLLARY 4.1.30 If $U \subseteq X$ is open, $f : U \longrightarrow Y$ is a continuously differentiable function, $f'_F(x_0)$ is surjective and $\ker f'_F(x_0)$ is complemented, then $f(U)$ contains a neighbourhood of $f(x_0)$.

PROOF Let $V \stackrel{\mathrm{df}}{=} \ker f'_F(x_0)$. Then $X = V \oplus W$ (see Definition 4.1.28) and so for all $x \in X$ we have $x = v + w$, with $v \in V$ and $w \in W$. We write $f(x) = f(v, w)$. Evidently
$$D_2 f(x_0) \in \mathcal{L}(W; Y) \quad \text{is an isomorphism.}$$
So we can apply Theorem 4.1.27 and conclude that $f(U)$ contains the neighbourhood $U_2$ of $y_0 = f(x_0)$ postulated by Theorem 4.1.27.

In the next example we use Corollary 4.1.30 to prove an existence theorem for differential equations.

EXAMPLE 4.1.31

Let
$$X = C^1\big([0, 1]\big) \quad \text{and} \quad Y = C\big([0, 1]\big).$$
Let $f : X \longrightarrow Y$ be defined by
$$f(x) \stackrel{\mathrm{df}}{=} \frac{dx}{dt} + x^3.$$
It is easy to see that $f$ is a $C^1$-map and
$$f'_F(0) = \frac{d}{dt} \in \mathcal{L}(X; Y).$$
From the fundamental theorem of calculus, we have that $f'_F(0)$ is surjective. Also $\ker f'_F(0)$ is the space of constant functions, hence it is complemented (see Remark 4.1.29). In fact the complement of $\ker f'_F(0)$ is given by
$$W \stackrel{\mathrm{df}}{=} \Big\{x \in X : \int_0^1 x(t)\,dt = 0\Big\}.$$
So we can apply Corollary 4.1.30 and conclude the following:


“We can find $\varepsilon > 0$, such that if $y \in Y$ with $\|y\|_Y < \varepsilon$, then the differential equation
$$\frac{dx(t)}{dt} + x(t)^3 = y(t) \quad \forall\, t \in [0, 1]$$
has a solution $x \in X$.”

Using the implicit function theorem (see Theorem 4.1.27), we can prove the inverse function theorem.

THEOREM 4.1.32 (Inverse Function Theorem) If $U \subseteq Y$ is an open set, $f : U \longrightarrow X$ is a continuously differentiable function, $y_0 \in U$ and $f'_F(y_0) \in \mathcal{L}(Y; X)$ is an isomorphism, then there exist a neighbourhood $U'$ of $y_0$, $U' \subseteq U$, and a neighbourhood $V'$ of $x_0 = f(y_0)$, such that $f : U' \longrightarrow V'$ is a diffeomorphism and $(f^{-1})'_F(x_0) = f'_F(y_0)^{-1}$.

PROOF Let
$$h(x, y) \stackrel{\mathrm{df}}{=} f(y) - x.$$
Then
$$D_2 h(x_0, y_0) = f'_F(y_0),$$
which by hypothesis is an isomorphism. So by virtue of Theorem 4.1.27 we can find a neighbourhood $V'$ of $x_0$ and a continuously differentiable map $g : V' \longrightarrow Y$, such that $g(V') \subseteq U_0$ for a neighbourhood $U_0$ of $y_0$,
$$h\big(x, g(x)\big) = 0 \quad \forall\, x \in V'$$
(i.e., $f\big(g(x)\big) = x$ for all $x \in V'$) and
$$g(x_0) = y_0.$$
In the sequel we consider $f$ restricted to $g(V')$. Since $f\big(g(x)\big) = x$, we see that $g$ is injective on $V'$, hence a bijection from $V'$ onto $g(V')$. In addition $g(V') = f^{-1}(V')$ is open because $f$ is continuous. So we set
$$U' = g(V')$$
and we have that $f : U' \longrightarrow V'$ is a bijection. Finally since
$$g'_F(x_0) = -\Big(D_2 h\big(x_0, g(x_0)\big)\Big)^{-1} D_1 h\big(x_0, g(x_0)\big)$$
and $D_1 h = -\mathrm{id}_X$, we have
$$f'_F(y_0) \circ g'_F(x_0) = \mathrm{id}_X,$$
hence
$$g'_F(x_0) = (f^{-1})'_F(x_0) = f'_F(y_0)^{-1}.$$

In finite dimensions this theorem has the following useful consequences.

COROLLARY 4.1.33 If $V \subseteq \mathbb{R}^N$ is an open set, $x_0 \in V$, $f : V \longrightarrow \mathbb{R}^M$ is a continuously differentiable function and $y_0 = f(x_0)$, then

(a) if $N \le M$ and $f'_F(x_0)$ is of maximal rank (i.e., of rank $N$), then we can find a neighbourhood $U'$ of $y_0$, a neighbourhood $V'$ of $x_0$ and a continuously differentiable function $g : U' \longrightarrow \mathbb{R}^M$, such that
$$(g \circ f)(x) = i(x) \quad \forall\, x \in V',$$
where $i : \mathbb{R}^N \longrightarrow \mathbb{R}^M$ is the canonical injection, i.e., $i(x_1, \ldots, x_N) = (x_1, \ldots, x_N, 0, \ldots, 0)$;

(b) if $N \ge M$ and $f'_F(x_0)$ is of maximal rank (i.e., of rank $M$), then we can find a neighbourhood $\widehat{V}$ of $x_0$ and a continuously differentiable map $\vartheta : \widehat{V} \longrightarrow V$, with $\vartheta(x_0) = x_0$ and
$$(f \circ \vartheta)(x) = \mathrm{proj}_{\mathbb{R}^M}(x) \quad \forall\, x \in \widehat{V},$$
where $\mathrm{proj}_{\mathbb{R}^M} : \mathbb{R}^N \longrightarrow \mathbb{R}^M$ is the canonical projection, i.e., $\mathrm{proj}_{\mathbb{R}^M}(x_1, \ldots, x_N) = (x_1, \ldots, x_M)$.

PROOF (a) By hypothesis
$$\mathrm{rank}\Big(\frac{\partial f_i}{\partial x_j}(x_0)\Big)_{i\in\{1,\ldots,M\},\ j\in\{1,\ldots,N\}} = N.$$
By relabelling things if necessary, we may assume that
$$\det\Big(\frac{\partial f_i}{\partial x_j}(x_0)\Big)_{i,j\in\{1,\ldots,N\}} \ne 0.$$
Let $\xi : V \times \mathbb{R}^{M-N} \longrightarrow \mathbb{R}^M$ be defined by
$$\xi(x_1, \ldots, x_M) \stackrel{\mathrm{df}}{=} f(x_1, \ldots, x_N) + (0, \ldots, 0, x_{N+1}, \ldots, x_M).$$
We have
$$\det\Big(\frac{\partial \xi_i}{\partial x_j}(x_0, 0)\Big)_{i,j\in\{1,\ldots,M\}} \ne 0.$$
So Theorem 4.1.27 implies that locally there exists an inverse $g$ of $\xi$, such that
$$i(x) = (g \circ \xi \circ i)(x) = (g \circ f)(x).$$

(b) Again we may assume without any loss of generality that
$$\det\Big(\frac{\partial f_i}{\partial x_j}(x_0)\Big)_{i,j\in\{1,\ldots,M\}} \ne 0.$$
We define $\eta : V \longrightarrow \mathbb{R}^N$, by
$$\eta(x_1, \ldots, x_N) \stackrel{\mathrm{df}}{=} \big(f_1(x), \ldots, f_M(x), x_{M+1}, \ldots, x_N\big).$$
Hence
$$\det\Big(\frac{\partial \eta_i}{\partial x_j}(x_0)\Big)_{i,j\in\{1,\ldots,N\}} = \det\Big(\frac{\partial f_i}{\partial x_j}(x_0)\Big)_{i,j\in\{1,\ldots,M\}} \ne 0.$$
So by Theorem 4.1.32, we can find a neighbourhood $\widehat{V}$ of $x_0$ and a continuously differentiable map $\vartheta : \widehat{V} \longrightarrow V$, such that $\vartheta(x_0) = x_0$ and $\vartheta = \eta^{-1}$. Then
$$\mathrm{proj}_{\mathbb{R}^M}(x) = \big(\mathrm{proj}_{\mathbb{R}^M} \circ \eta \circ \vartheta\big)(x) = \big(f \circ \vartheta\big)(x).$$

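REMARK 4.1.34 In Corollary 4.1.33, part (a) tells us that $f$ locally looks like the inclusion map:
$$V' \subseteq \mathbb{R}^N \xrightarrow{\ f\ } U' \subseteq \mathbb{R}^M \xrightarrow{\ g\ } \mathbb{R}^M, \qquad g \circ f = i.$$
On the other hand part (b) tells us that $f$ locally looks like the projection map:
$$\widehat{V} \subseteq \mathbb{R}^N \xrightarrow{\ \vartheta\ } \vartheta(\widehat{V}) \subseteq V \subseteq \mathbb{R}^N \xrightarrow{\ f\ } \mathbb{R}^M, \qquad f \circ \vartheta = p = \mathrm{proj}_{\mathbb{R}^M}.$$
Both diagrams are commutative.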

4.2 Convex Functions

In this section we focus on convex functions and their differentiability properties. We show that the algebraic property of convexity has important topological consequences such as continuity and differentiability. The situation is especially pleasant in the context of separable Banach spaces. Convex functions play a certain role in modern variational analysis and the applications require that we consider extended real valued convex functions, that is functions with values in $\overline{\mathbb{R}} \stackrel{\mathrm{df}}{=} \mathbb{R} \cup \{+\infty\}$.

DEFINITION 4.2.1 Let $X$ be a Hausdorff topological space and let $\varphi : X \longrightarrow \overline{\mathbb{R}}$ be a function. The effective domain of $\varphi$ is the set
$$\mathrm{dom}\,\varphi \stackrel{\mathrm{df}}{=} \big\{x \in X : \varphi(x) < +\infty\big\}.$$
We say that $\varphi$ is proper, if $\mathrm{dom}\,\varphi \ne \emptyset$. The epigraph of $\varphi$ is the set
$$\mathrm{epi}\,\varphi \stackrel{\mathrm{df}}{=} \big\{(x, \lambda) \in X \times \mathbb{R} : \varphi(x) \le \lambda\big\}.$$
The function $\varphi$ is lower semicontinuous, if for every $\lambda \in \mathbb{R}$, the sublevel set
$$L^\varphi_\lambda \stackrel{\mathrm{df}}{=} \big\{x \in X : \varphi(x) \le \lambda\big\}$$
is closed. If $X$ is a Hausdorff linear topological space, we say that $\varphi$ is convex, if for all $x_1, x_2 \in \mathrm{dom}\,\varphi$ and all $\lambda \in [0, 1]$, we have
$$\varphi\big(\lambda x_1 + (1 - \lambda)x_2\big) \le \lambda\varphi(x_1) + (1 - \lambda)\varphi(x_2).$$
We say that $\varphi$ is strictly convex if the above inequality is strict when $x_1 \ne x_2$ and $\lambda \in (0, 1)$. The cone of proper, convex and lower semicontinuous functions is denoted by $\Gamma_0(X)$.

REMARK 4.2.2 It is well known that $\varphi$ is lower semicontinuous if and only if $\mathrm{epi}\,\varphi \subseteq X \times \mathbb{R}$ is closed or equivalently if $\varphi(x) \le \liminf_\alpha \varphi(x_\alpha)$ for every net $(x_\alpha)$ converging to $x$. Also $\varphi$ is convex if and only if $\mathrm{epi}\,\varphi \subseteq X \times \mathbb{R}$ is a convex set. This means that certain properties of proper, convex and lower semicontinuous functions can be deduced from these (rather special) closed, convex sets in $X \times \mathbb{R}$. So one can argue that the study of proper, convex, lower semicontinuous functions is a special case of the study of closed, convex sets. On the other hand, if $C$ is a nonempty subset of $X$, we can introduce the indicator function of $C$, by
$$i_C(x) \stackrel{\mathrm{df}}{=} \begin{cases} 0 & \text{if } x \in C, \\ +\infty & \text{otherwise.} \end{cases}$$


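Then $i_C \in \Gamma_0(X)$ if and only if $C$ is closed and convex. This example shows that it is possible to deduce certain properties of a closed, convex set from the properties of its indicator function, which belongs to $\Gamma_0(X)$. So one can argue that the study of closed, convex sets is a special case of the study of proper, convex, lower semicontinuous functions. Both points of view are legitimate and it is a matter of which approach is more convenient, the geometric or the analytical.

The next theorem summarizes the continuity properties of a proper, convex function.

THEOREM 4.2.3 If $X$ is a Hausdorff linear topological space and $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, convex function, then the following statements are equivalent:
(a) $\varphi$ is bounded from above on a neighbourhood of $x_0 \in X$;
(b) $\varphi$ is continuous at $x_0 \in X$;
(c) $\mathrm{int}\,\mathrm{epi}\,\varphi \ne \emptyset$;
(d) $\mathrm{int}\,\mathrm{dom}\,\varphi \ne \emptyset$ and $\varphi|_{\mathrm{int}\,\mathrm{dom}\,\varphi}$ is continuous.
Moreover, if the above statements hold, then
$$\mathrm{int}\,\mathrm{epi}\,\varphi = \big\{(x, \lambda) \in X \times \mathbb{R} : x \in \mathrm{int}\,\mathrm{dom}\,\varphi,\ \varphi(x) < \lambda\big\}.$$

PROOF “(a)$\Longrightarrow$(b)”: Let $U$ be a neighbourhood of $x_0$, such that $\varphi|_U$ is bounded from above, i.e., there exists $c > 0$, such that
$$\varphi(x) \le c \quad \forall\, x \in U.$$
Replacing, if necessary, $U$ by $U - x_0$ and $\varphi(x)$ by $\varphi(x + x_0) - \varphi(x_0)$, we may assume without any loss of generality that $x_0 = 0$ and that $\varphi(0) = 0$. We will show that $\varphi$ is continuous at $x_0 = 0$. Let $\varepsilon \in (0, c]$ and let us define
$$V_\varepsilon \stackrel{\mathrm{df}}{=} \Big(\frac{\varepsilon}{c}U\Big) \cap \Big(-\frac{\varepsilon}{c}U\Big).$$
Evidently $V_\varepsilon$ is a symmetric neighbourhood of the origin. We shall show that
$$|\varphi(x)| \le \varepsilon \quad \forall\, x \in V_\varepsilon \tag{4.14}$$
(which implies the continuity of $\varphi$ at $x_0 = 0$). So let $x \in V_\varepsilon$. We have $\frac{c}{\varepsilon}x \in U$ and because $\varphi$ is convex, we have
$$\varphi(x) \le \frac{\varepsilon}{c}\,\varphi\Big(\frac{c}{\varepsilon}x\Big) + \Big(1 - \frac{\varepsilon}{c}\Big)\varphi(0) \le \frac{\varepsilon}{c}\,c = \varepsilon.$$
Also $-\frac{c}{\varepsilon}x \in U$ and so
$$0 = \varphi(0) = \varphi\bigg(\frac{1}{1 + \frac{\varepsilon}{c}}\,x + \frac{\frac{\varepsilon}{c}}{1 + \frac{\varepsilon}{c}}\Big(-\frac{c}{\varepsilon}x\Big)\bigg) \le \frac{1}{1 + \frac{\varepsilon}{c}}\,\varphi(x) + \frac{\frac{\varepsilon}{c}}{1 + \frac{\varepsilon}{c}}\,\varphi\Big(-\frac{c}{\varepsilon}x\Big) \le \frac{1}{1 + \frac{\varepsilon}{c}}\,\varphi(x) + \frac{\varepsilon}{1 + \frac{\varepsilon}{c}},$$
hence $-\varepsilon \le \varphi(x)$. So finally we obtain (4.14) and this proves the continuity of $\varphi$ at the origin.

“(b)$\Longrightarrow$(a)”: Since $\varphi$ is continuous at $x_0$, it is bounded on a neighbourhood of $x_0$.

“(a)$\Longrightarrow$(c)”: By hypothesis, there exists a neighbourhood $U$ of $x_0$, such that
$$\varphi(x) \le c \quad \forall\, x \in U.$$
So $U \subseteq \mathrm{int}\,\mathrm{dom}\,\varphi$ and
$$\{(x, \lambda) \in X \times \mathbb{R} : x \in U,\ c < \lambda\} \subseteq \mathrm{epi}\,\varphi,$$
which implies that $\mathrm{int}\,\mathrm{epi}\,\varphi \ne \emptyset$.

“(c)$\Longrightarrow$(a)”: Let $(x, \lambda) \in \mathrm{int}\,\mathrm{epi}\,\varphi$. We can find a neighbourhood $U$ of $x$ and $r > 0$, such that
$$U \times [\lambda - r, \lambda + r] \subseteq \mathrm{epi}\,\varphi,$$
hence $U \times \{\lambda\} \subseteq \mathrm{epi}\,\varphi$ and so
$$\varphi(u) \le \lambda \quad \forall\, u \in U,$$
which means that $\varphi$ is bounded from above in a neighbourhood of $x$.

“(a)$\Longrightarrow$(d)”: As before we may assume that $x_0 = 0$. Let $U$ be the neighbourhood of $x_0 = 0$ postulated by part (a). Evidently $U \subseteq \mathrm{dom}\,\varphi$ and so $\mathrm{int}\,\mathrm{dom}\,\varphi \ne \emptyset$. Let $x \in \mathrm{int}\,\mathrm{dom}\,\varphi$. Note that $\mathrm{dom}\,\varphi$ is convex. So we can find $r > 1$, such that $\widehat{x} = rx \in \mathrm{dom}\,\varphi$. Let
$$V \stackrel{\mathrm{df}}{=} x + \Big(1 - \frac{1}{r}\Big)U,$$
which is a neighbourhood of $x$. Exploiting the convexity of $\varphi$, for all $u \in V$, we have
$$u = x + \Big(1 - \frac{1}{r}\Big)z \quad \text{with } z \in U$$
and
$$\varphi(u) = \varphi\Big(\frac{1}{r}\,\widehat{x} + \Big(1 - \frac{1}{r}\Big)z\Big) \le \frac{1}{r}\,\varphi(\widehat{x}) + \Big(1 - \frac{1}{r}\Big)\varphi(z) \le \frac{1}{r}\,\varphi(\widehat{x}) + \Big(1 - \frac{1}{r}\Big)c = \widehat{c}.$$
So $\varphi$ is bounded from above in a neighbourhood of $x$, hence continuous at $x \in \mathrm{int}\,\mathrm{dom}\,\varphi$ (recall that (a)$\Longleftrightarrow$(b)).

“(d)$\Longrightarrow$(a)”: Obvious.

Finally let us show that
$$\mathrm{int}\,\mathrm{epi}\,\varphi = \{(x, \lambda) \in X \times \mathbb{R} : x \in \mathrm{int}\,\mathrm{dom}\,\varphi,\ \varphi(x) < \lambda\}.$$
Let us denote the right hand side set by $W$. Clearly
$$\mathrm{int}\,\mathrm{epi}\,\varphi \subseteq W.$$
On the other hand let $x \in \mathrm{int}\,\mathrm{dom}\,\varphi$ and $\varphi(x) < \lambda$. Let
$$\widehat{\lambda} \in \big(\varphi(x), \lambda\big).$$
Because $\varphi|_{\mathrm{int}\,\mathrm{dom}\,\varphi}$ is continuous, there exists a neighbourhood $U$ of $x$, such that $U \subseteq \mathrm{int}\,\mathrm{dom}\,\varphi$ and
$$\varphi(u) < \widehat{\lambda} \quad \forall\, u \in U.$$
Hence $(x, \lambda) \in U \times \big(\widehat{\lambda}, +\infty\big) \subseteq \mathrm{int}\,\mathrm{epi}\,\varphi$, and so
$$W \subseteq \mathrm{int}\,\mathrm{epi}\,\varphi.$$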

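REMARK 4.2.4 From the above theorem it follows that
$$\mathrm{int}\,\mathrm{dom}\,\varphi = \big\{x \in X : \text{there exists } \lambda \in \mathbb{R} \text{ such that } (x, \lambda) \in \mathrm{int}\,\mathrm{epi}\,\varphi\big\}.$$
Also a convex function $\varphi$ can be continuous at a boundary point $x$ of $\mathrm{dom}\,\varphi$ where $\varphi(x) = +\infty$. To see this consider the convex function $\varphi : \mathbb{R} \longrightarrow \overline{\mathbb{R}}$, defined by
$$\varphi(x) \stackrel{\mathrm{df}}{=} \begin{cases} \dfrac{1}{x} & \text{if } x \in (0, +\infty), \\ +\infty & \text{if } x \in (-\infty, 0]. \end{cases}$$
Recall that in $\overline{\mathbb{R}}$ the neighbourhoods of $+\infty$ are all the sets $(\lambda, +\infty]$ with $\lambda \in \mathbb{R}$. If $C$ is a nonempty, closed, convex set in $X$, then the indicator function $i_C \in \Gamma_0(X)$ is continuous at $x \in C$ if and only if $x \in \mathrm{int}\,C$. Therefore, if $\mathrm{int}\,C = \emptyset$, then $i_C$ is not continuous at any point of $C = \mathrm{dom}\,i_C$.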


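If $X$ is finite dimensional, the situation is remarkably simple.

PROPOSITION 4.2.5 If $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is convex and $X$ is finite dimensional, then $\varphi$ is continuous on $\mathrm{int}\,\mathrm{dom}\,\varphi$.

PROOF Let $x \in \mathrm{int}\,\mathrm{dom}\,\varphi$. We can find $\{e_k\}_{k=0}^N \subseteq X$ ($N = \dim X$) and $r > 0$, such that
$$B_r(x) \subseteq \mathrm{conv}\,\{e_k\}_{k=0}^N \subseteq \mathrm{dom}\,\varphi.$$
So if $y \in B_r(x)$, we can find $\{\lambda_k\}_{k=0}^N \subseteq [0, 1]$, such that
$$\sum_{k=0}^N \lambda_k = 1 \quad \text{and} \quad y = \sum_{k=0}^N \lambda_k e_k.$$
Then because $\varphi$ is convex, we have
$$\varphi(y) \le \sum_{k=0}^N \lambda_k \varphi(e_k) \le \Big(\sum_{k=0}^N \lambda_k\Big) \max_{k\in\{0,\ldots,N\}} \varphi(e_k) = c < +\infty.$$
So by virtue of Theorem 4.2.3, $\varphi|_{\mathrm{int}\,\mathrm{dom}\,\varphi}$ is continuous.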

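To have an infinite dimensional analog of the above proposition, we need an extra condition on the function $\varphi$.

THEOREM 4.2.6 If $X$ is a Banach space and $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is convex and lower semicontinuous, then $\varphi|_{\mathrm{int}\,\mathrm{dom}\,\varphi}$ is continuous.

PROOF We have
$$\mathrm{dom}\,\varphi = \bigcup_{n=1}^\infty \{\varphi \le n\}.$$
Let $x \in \mathrm{int}\,\mathrm{dom}\,\varphi$. Since $\varphi$ is lower semicontinuous, the sets $\{\varphi \le n\}$ are closed. So by the Baire category theorem (see Theorem A.1.10), we can find $n \ge 1$, such that
$$\mathrm{int}\,\{\varphi < n\} \ne \emptyset \quad \text{and} \quad \varphi(x) < n.$$
Let
$$y \in \mathrm{int}\,\{\varphi < n\}$$
and set
$$h(\lambda) \stackrel{\mathrm{df}}{=} \varphi\big(x + \lambda(y - x)\big) \quad \forall\, \lambda \in \mathbb{R}.$$
Since $x \in \mathrm{int}\,\mathrm{dom}\,\varphi$, we can find $r > 0$, such that $\overline{B}_{r\|y - x\|_X}(x) \subseteq \mathrm{dom}\,\varphi$. We have $[-r, r] \subseteq \mathrm{dom}\,h$, hence $0 \in \mathrm{int}\,\mathrm{dom}\,h$ and so $h$ is continuous at $0$ (see Proposition 4.2.5). Because $h(0) < n$, we can find $\vartheta > 0$, such that
$$h(\lambda) < n \quad \forall\, \lambda \in [-\vartheta, 0].$$
Let
$$z \stackrel{\mathrm{df}}{=} x - \vartheta(y - x).$$
We have $z \in \{\varphi < n\}$ and [...]

[...] $\delta > 0$ and $c > 0$, such that
$$\overline{B}_{2\delta}(x_0) \subseteq U \quad \text{and} \quad |\varphi(y)| \le c \quad \forall\, y \in \overline{B}_{2\delta}(x_0).$$
Let $x, y \in \overline{B}_\delta(x_0)$ with $x \ne y$.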

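Let us set
$$r \stackrel{\mathrm{df}}{=} \|x - y\|_X \quad \text{and} \quad z \stackrel{\mathrm{df}}{=} y + \frac{\delta}{r}(y - x).$$
Then $z \in \overline{B}_{2\delta}(x_0)$. Also we have
$$y = \frac{r}{r + \delta}\,z + \frac{\delta}{r + \delta}\,x.$$
So from the convexity of $\varphi$, we obtain
$$\varphi(y) \le \frac{r}{r + \delta}\,\varphi(z) + \frac{\delta}{r + \delta}\,\varphi(x),$$
so
$$\varphi(y) - \varphi(x) \le \frac{r}{r + \delta}\big(\varphi(z) - \varphi(x)\big) \le \frac{r}{\delta}\,2c = \frac{2c}{\delta}\,\|x - y\|_X.$$
Interchanging the roles of $x$ and $y$ in the above argument, we conclude that
$$|\varphi(y) - \varphi(x)| \le \frac{2c}{\delta}\,\|x - y\|_X \quad \forall\, x, y \in \overline{B}_\delta(x_0).$$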
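The Lipschitz estimate just obtained can be checked numerically on a one dimensional example. The sketch below is only an illustration under stated assumptions: the function $\varphi(x) = x^2$ on $X = \mathbb{R}$, the centre $x_0 = 1$, the radius $\delta = 0.5$ and all Python names are our choices and do not come from the text. It computes the bound $2c/\delta$ with $c = \sup |\varphi|$ on $\overline{B}_{2\delta}(x_0)$ and compares it with sampled difference quotients on $\overline{B}_\delta(x_0)$.

```python
import random

def phi(x):
    # Illustrative convex, continuous function on R (an assumption).
    return x * x

def lipschitz_bound(x0, delta, samples=2001):
    # c bounds |phi| on the closed ball of radius 2*delta around x0;
    # the estimate in the text then gives the Lipschitz constant 2*c/delta.
    grid = [x0 - 2.0 * delta + 4.0 * delta * i / (samples - 1)
            for i in range(samples)]
    c = max(abs(phi(t)) for t in grid)
    return 2.0 * c / delta

def worst_ratio(x0, delta, trials=10000, seed=0):
    # Largest sampled quotient |phi(y) - phi(x)| / |y - x| on the ball
    # of radius delta around x0.
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x = rng.uniform(x0 - delta, x0 + delta)
        y = rng.uniform(x0 - delta, x0 + delta)
        if x != y:
            worst = max(worst, abs(phi(y) - phi(x)) / abs(y - x))
    return worst

if __name__ == "__main__":
    print(worst_ratio(1.0, 0.5), "<=", lipschitz_bound(1.0, 0.5))
```

In this example the bound is far from sharp, which is expected: the constant $2c/\delta$ only uses a sup bound for $\varphi$ on the larger ball.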

“$\Longleftarrow$”: Obvious.

For convex, continuous functions, it is possible to characterize Gâteaux and Fréchet differentiability at $x \in X$ only in terms of $\varphi$, that is without using the linear functionals $\varphi'_F(x)$ and $\varphi'_G(x)$.

PROPOSITION 4.2.8 If $U \subseteq X$ is an open set and $\varphi : U \longrightarrow \mathbb{R}$ is a convex, continuous function, then $\varphi$ is Gâteaux differentiable at $x \in U$ if and only if
$$\lim_{\lambda \searrow 0} \frac{\varphi(x + \lambda h) + \varphi(x - \lambda h) - 2\varphi(x)}{\lambda} = 0 \quad \forall\, h \in X. \tag{4.15}$$

PROOF “$\Longrightarrow$”: Since $\varphi$ is Gâteaux differentiable at $x$, we have
$$\lim_{\lambda \searrow 0} \frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} = \varphi'_G(x)h$$
and
$$\lim_{\lambda \searrow 0} \frac{\varphi(x - \lambda h) - \varphi(x)}{\lambda} = \varphi'_G(x)(-h).$$
From these limits, we obtain immediately (4.15).

“$\Longleftarrow$”: Let
$$\psi(\lambda) \stackrel{\mathrm{df}}{=} \varphi(x + \lambda h) \quad \forall\, \lambda \in \mathbb{R}.$$
Then $\psi$ is convex and by hypothesis we have
$$\psi'_+(0) = \psi'_-(0).$$
Therefore $\psi$ is differentiable at $\lambda = 0$ and so $\varphi$ is Gâteaux differentiable at $x$ (see Remark 4.1.2).

In the case of Fréchet differentiability at $x$, (4.15) holds uniformly with respect to $h$ when $\|h\|_X = 1$.

PROPOSITION 4.2.9 If $U \subseteq X$ is an open set and $\varphi : U \longrightarrow \mathbb{R}$ is a convex, continuous function, then $\varphi$ is Fréchet differentiable at $x \in U$ if and only if for every $\varepsilon > 0$ there exists $\delta > 0$, such that
$$\varphi(x + \lambda h) + \varphi(x - \lambda h) - 2\varphi(x) < \lambda\varepsilon \quad \forall\, \|h\|_X = 1,\ \lambda \in (0, \delta). \tag{4.16}$$
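To see what the symmetric quotient in (4.15) and (4.16) detects, here is a small numerical sketch on $X = \mathbb{R}$. The two test functions ($t \mapsto t^2$ and $t \mapsto |t|$), the point $x = 0$ and the Python names are assumptions made for illustration only. For the smooth function the quotient tends to $0$ with $\lambda$, while at the kink of $|t|$ it stays equal to $2$.

```python
def second_difference_quotient(phi, x, h, lam):
    # (phi(x + lam*h) + phi(x - lam*h) - 2*phi(x)) / lam, as in (4.15).
    return (phi(x + lam * h) + phi(x - lam * h) - 2.0 * phi(x)) / lam

def square(t):
    # Convex and differentiable everywhere.
    return t * t

def absolute(t):
    # Convex, with a kink at t = 0.
    return abs(t)

if __name__ == "__main__":
    for lam in (1e-1, 1e-2, 1e-3):
        q_square = second_difference_quotient(square, 0.0, 1.0, lam)
        q_abs = second_difference_quotient(absolute, 0.0, 1.0, lam)
        print(lam, q_square, q_abs)
```

Away from the kink, for example at $x = 1$, the quotient for $|t|$ vanishes for all small $\lambda$, in agreement with differentiability there.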

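PROOF “$\Longrightarrow$”: Since $\varphi$ is Fréchet differentiable at $x$, for a given $\varepsilon > 0$, we can find $\delta > 0$, such that
$$\varphi(x + \lambda h) - \varphi(x) - \big\langle \varphi'_F(x), \lambda h \big\rangle_X < \frac{\varepsilon}{2}\,\|\lambda h\|_X \quad \forall\, \|h\|_X = 1,\ \lambda \in (0, \delta). \tag{4.17}$$
Rewriting (4.17) with $h$ replaced by $-h$ and adding we obtain (4.16).

“$\Longleftarrow$”: From Proposition 4.2.8 we know that $\varphi$ is Gâteaux differentiable at $x$. The convexity of $\varphi$ implies that
$$\frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} - \big\langle \varphi'_G(x), h \big\rangle_X \ge 0 \quad \forall\, \lambda > 0 \tag{4.18}$$
and
$$\frac{\varphi(x) - \varphi(x - \lambda h)}{\lambda} - \big\langle \varphi'_G(x), h \big\rangle_X \le 0 \quad \forall\, \lambda > 0. \tag{4.19}$$
Therefore
$$\lambda\varepsilon > \varphi(x + \lambda h) + \varphi(x - \lambda h) - 2\varphi(x) = \Big[\varphi(x + \lambda h) - \varphi(x) - \lambda\big\langle \varphi'_G(x), h \big\rangle_X\Big] - \Big[\varphi(x) - \varphi(x - \lambda h) - \lambda\big\langle \varphi'_G(x), h \big\rangle_X\Big] \quad \forall\, \|h\|_X = 1,\ \lambda \in (0, \delta). \tag{4.20}$$
From (4.18) and (4.19), we see that the right hand side of (4.20) is the sum of two nonnegative quantities, both of which have to be less than $\lambda\varepsilon$ for all $\|h\|_X = 1$ and all $\lambda \in (0, \delta)$. This means that $\varphi$ is Fréchet differentiable at $x$.

It is well known from elementary calculus that for a function $\varphi : U \longrightarrow \mathbb{R}$ with $U \subseteq \mathbb{R}^N$ being an open set, existence of all partial derivatives at $x \in U$ does not imply Fréchet differentiability. However, if $\varphi$ is convex this is true.

PROPOSITION 4.2.10 If $U \subseteq \mathbb{R}^N$ is an open set, $\varphi : U \longrightarrow \mathbb{R}$ is a convex function and all the partial derivatives of $\varphi$ at $x \in U$ exist, then $\varphi$ is Fréchet differentiable at $x \in U$.

PROOF The obvious candidate for the Fréchet derivative of $\varphi$ at $x$ is the linear transformation determined by the partial derivatives, that is $A(x) \in \mathcal{L}\big(\mathbb{R}^N; \mathbb{R}\big) = \mathbb{R}^N$, defined by
$$\big(A(x), h\big)_{\mathbb{R}^N} \stackrel{\mathrm{df}}{=} \sum_{k=1}^N \frac{\partial \varphi}{\partial x_k}(x)\,h_k \quad \forall\, h = (h_1, \ldots, h_N) \in \mathbb{R}^N.$$
Let $r > 0$ be such that $B_r(x) \subseteq U$. For each $h \in B_r(0)$, let
$$\psi(h) \stackrel{\mathrm{df}}{=} \varphi(x + h) - \varphi(x) - \big(A(x), h\big)_{\mathbb{R}^N}.$$
Evidently $\psi$ is convex on $B_r(0)$. For each $k \in \{1, \ldots, N\}$, let us define a function $\xi_k : B_r(0) \longrightarrow \mathbb{R}$, by
$$\xi_k(h) \stackrel{\mathrm{df}}{=} \begin{cases} \dfrac{\psi(h_k e_k)}{h_k} & \text{if } h_k \ne 0, \\ 0 & \text{if } h_k = 0 \end{cases} \quad \forall\, h \in B_r(0),$$
where $\{e_k\}_{k=1}^N$ is the standard orthonormal basis of $\mathbb{R}^N$. We have
$$\xi_k(h) \longrightarrow 0 \quad \text{as } \|h\|_{\mathbb{R}^N} \to 0.$$
For each $h = (h_1, \ldots, h_N)$ with $\|h\|_{\mathbb{R}^N} < \frac{r}{N}$, because of the convexity of $\psi$, we have
$$\psi(h) = \psi\Big(\frac{1}{N}\sum_{k=1}^N N h_k e_k\Big) \le \frac{1}{N}\sum_{k=1}^N \psi\big(N h_k e_k\big) = \sum_{k=1}^N h_k\,\xi_k(N h) \le \|h\|_{\mathbb{R}^N} \sum_{k=1}^N \big|\xi_k(N h)\big|.$$
Since
$$0 = \psi\Big(\frac{1}{2}h + \frac{1}{2}(-h)\Big) \le \frac{1}{2}\psi(h) + \frac{1}{2}\psi(-h),$$
we have $-\psi(-h) \le \psi(h)$. Therefore, it follows that
$$-\|h\|_{\mathbb{R}^N} \sum_{k=1}^N \big|\xi_k(-N h)\big| \le \psi(h) \le \|h\|_{\mathbb{R}^N} \sum_{k=1}^N \big|\xi_k(N h)\big|,$$
so
$$\frac{\psi(h)}{\|h\|_{\mathbb{R}^N}} \longrightarrow 0 \quad \text{as } h \to 0$$
and thus $\varphi$ is Fréchet differentiable at $x \in U$ and $\varphi'_F(x) = A(x)$.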
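The role of convexity in this result can be illustrated numerically in $\mathbb{R}^2$. The sketch below rests on assumptions of ours: the convex function $x^2 + xy + y^2$, the nonconvex function $xy/\sqrt{x^2 + y^2}$ (set to $0$ at the origin) and all Python names are illustrative choices, not taken from the text. Both functions vanish at the origin and have zero partial derivatives there, so $A(0) = 0$ and the remainder $\psi(h)$ is just the function value. Along the diagonal, the ratio $\psi(h)/\|h\|$ tends to $0$ for the convex function but stays at $1/2$ for the nonconvex one.

```python
import math

def convex_phi(x, y):
    # Convex quadratic form (an illustrative choice); both partials vanish at 0.
    return x * x + x * y + y * y

def nonconvex_g(x, y):
    # Not convex; zero on both axes, so both partials at 0 exist and equal 0.
    r = math.hypot(x, y)
    return 0.0 if r == 0.0 else x * y / r

def remainder_ratio(func, hx, hy):
    # psi(h) / ||h|| at the origin, where A(0) = 0 so psi(h) = func(h).
    return func(hx, hy) / math.hypot(hx, hy)

if __name__ == "__main__":
    for t in (1e-1, 1e-3, 1e-6):
        print(t,
              remainder_ratio(convex_phi, t, t),
              remainder_ratio(nonconvex_g, t, t))
```

The second column of output should shrink linearly in $t$, while the third should remain constant, which is the failure of Fréchet differentiability that convexity rules out.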

The above proposition implies that for convex functions on a finite dimensional Banach space the situation is straightforward; namely Gâteaux and Fréchet differentiability are equivalent (compare with Proposition 4.1.7).

COROLLARY 4.2.11 If $X$ is a finite dimensional Banach space, $U \subseteq X$ is an open set and $\varphi : U \longrightarrow \mathbb{R}$ is a convex function, then $\varphi$ is Gâteaux differentiable at $x \in U$ if and only if it is Fréchet differentiable at $x \in U$.


From elementary convex analysis we know that a convex function $\varphi : (a, b) \longrightarrow \mathbb{R}$ is differentiable at all except at most a countable number of points of $(a, b)$. We would like to identify those Banach spaces $X$ where the convex continuous functions defined on open convex sets in $X$ have similar Gâteaux differentiability properties. The basic result in this direction is the so-called Mazur's theorem, which implies the automatic generic (i.e., in a dense $G_\delta$-subset) Gâteaux differentiability of convex, continuous functions in separable Banach spaces.

THEOREM 4.2.12 (Mazur Theorem) If $X$ is a separable Banach space, $U \subseteq X$ is an open, convex set and $\varphi : U \longrightarrow \mathbb{R}$ is a convex, continuous function, then $\varphi$ is Gâteaux differentiable on a dense $G_\delta$ subset of $U$.

PROOF For each $h \in X$ and $m \in \mathbb{N}$, let
$$S\Big(h, \frac{1}{m}\Big) \stackrel{\mathrm{df}}{=} \bigg\{x \in U : \text{there exists } \delta = \delta(x, m) > 0, \text{ such that } \sup_{\lambda\in(0,\delta)} \frac{\varphi(x + \lambda h) + \varphi(x - \lambda h) - 2\varphi(x)}{\lambda} < \frac{1}{m}\bigg\}.$$
Since by hypothesis $\varphi$ is continuous, for a given $m \ge 1$ and every $k \ge 1$, the set
$$T_k(h) \stackrel{\mathrm{df}}{=} \bigg\{x \in U : \frac{\varphi\big(x + \frac{1}{k}h\big) + \varphi\big(x - \frac{1}{k}h\big) - 2\varphi(x)}{\frac{1}{k}} < \frac{1}{m}\bigg\}$$
is open in $U$. It follows then that
$$S\Big(h, \frac{1}{m}\Big) = \bigcup_{k=1}^\infty T_k(h)$$
is open in $U$. We claim that it is also dense in $U$. Suppose that this is not the case. Then we can find $x_0 \in U$ and $r > 0$, such that
$$S\Big(h, \frac{1}{m}\Big) \cap B_r(x_0) = \emptyset.$$
If $\psi(\lambda) = \varphi(x_0 + \lambda h)$, it follows that the convex function $\psi$ is not differentiable on $(-r, r)$, a contradiction. So $S\big(h, \frac{1}{m}\big)$ is dense in $U$.

Because $X$ is separable, we can find a sequence $\{h_n\}_{n\ge1}$ which is dense in
$$\partial B_1(0) = \big\{h \in X : \|h\|_X = 1\big\}.$$
By virtue of Proposition 4.2.8, $\varphi$ is Gâteaux differentiable at $x$ if the directional derivative exists in the directions $\{h_n\}_{n\ge1}$. So $\varphi$ is Gâteaux differentiable on the set
$$\bigcap_{n,m=1}^\infty S\Big(h_n, \frac{1}{m}\Big),$$
which is dense in $U$ (Baire's theorem) and of course $G_\delta$.


There exist nonseparable Banach spaces where the above theorem fails.

EXAMPLE 4.2.13 Let $X = l^\infty$ and for $x = \{x_n\}_{n\ge1}$, let
$$\varphi(x) \stackrel{\mathrm{df}}{=} \limsup_{n\to+\infty} |x_n|.$$
Then $\varphi$ is a seminorm (hence it is convex) and it is also continuous since $\varphi(x) \le \|x\|_\infty$. If $\varphi(x) = 0$, then $x_n \longrightarrow 0$ and so taking $h = (1, 1, \ldots)$, we have
$$\frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} = \frac{|\lambda|}{\lambda},$$
which shows that $\varphi$ is not Gâteaux differentiable at any $x \in \varphi^{-1}\big(\{0\}\big)$. If $\varphi(x) > 0$, exploiting the positive homogeneity of $\varphi$, we may assume without any loss of generality that $\varphi(x) = 1$. Let $\{x_{n_k}\}_{k\ge1}$ be a subsequence of $\{x_n\}_{n\ge1}$, such that
$$|x_{n_k}| \longrightarrow 1 \quad \text{as } k \to +\infty.$$
By passing to a further subsequence if necessary, we may assume that all the elements of $\{x_{n_k}\}_{k\ge1}$ have the same sign. Moreover, since $\varphi(x) = \varphi(-x)$, we can say that $x_{n_k} \ge 0$ for all $k \ge 1$. Let
$$h_n \stackrel{\mathrm{df}}{=} \begin{cases} 0 & \text{if either } n \ne n_k \text{ for all } k \ge 1, \text{ or } n = n_k \text{ with } k \text{ odd,} \\ 1 & \text{if } n = n_k \text{ with } k \text{ even.} \end{cases}$$
Let us set $h = \{h_n\}_{n\ge1} \in l^\infty$. For $|\lambda|$ small, we have
$$\frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} = \begin{cases} 1 & \text{if } \lambda > 0, \\ 0 & \text{if } \lambda < 0. \end{cases}$$
So $\varphi$ is nowhere Gâteaux differentiable.

REMARK 4.2.14 The above example should not lead to the conclusion that Theorem 4.2.12 fails in every nonseparable Banach space. There are nonseparable Banach spaces in which Theorem 4.2.12 remains valid, for example the class of weakly compactly generated Banach spaces. A Banach space $X$ is weakly compactly generated, if there exists a weakly compact set $C$ (which we can always take to be convex), whose linear span is dense in $X$ (i.e., $X = \overline{\mathrm{span}}\,C$). Separable Banach spaces are weakly compactly generated. To see this let $\{x_n\}_{n\ge1}$ be dense in $\partial B_1(0)$ and take
$$C \stackrel{\mathrm{df}}{=} \Big\{\frac{1}{n}x_n\Big\}_{n\ge1} \cup \{0\}.$$
Note that $C$ is actually compact. Also reflexive Banach spaces are weakly compactly generated. In this case let $C \stackrel{\mathrm{df}}{=} \overline{B}_1(0)$.

DEFINITION 4.2.15 A Banach space $X$ is said to be a weak Asplund space, if every convex, continuous function defined on an open, convex set $U \subseteq X$ is Gâteaux differentiable at each point of a dense $G_\delta$ subset of $U$.

REMARK 4.2.16 In the above definition the term weak has nothing to do with the weak topology. It is used because sometimes Gâteaux differentiability is called weak differentiability in contrast to Fréchet differentiability, which is called strong differentiability. Theorem 4.2.12 says that every separable Banach space is a weak Asplund space.

What about Fréchet differentiability? In this direction we have the following result due to Asplund (1968) and Lindenstrauss (1963) independently.

THEOREM 4.2.17 If $X$ is a Banach space with separable dual, $U \subseteq X$ is an open, convex set and $\varphi : U \longrightarrow \mathbb{R}$ is a convex, continuous function, then $\varphi$ is Fréchet differentiable on a dense $G_\delta$ subset of $U$.

Then in analogy to Definition 4.2.15, we make the following definition.

DEFINITION 4.2.18 A Banach space $X$ is said to be an Asplund space, if every convex, continuous function defined on an open, convex set $U \subseteq X$ is Fréchet differentiable on a dense $G_\delta$ subset of $U$.

REMARK 4.2.19 Theorem 4.2.17 implies that every Banach space $X$ with a separable dual $X^*$ is an Asplund space. Note that $X$ is separable too. More generally, we can say that a separable Banach space $X$ is an Asplund space if and only if it has a separable dual. If $X = C\big([0, 1]\big)$ (whose dual $M\big([0, 1]\big)$, the space of Radon measures, is not separable) and $\varphi : X \longrightarrow \mathbb{R}_+$ is defined by
$$\varphi(x) \stackrel{\mathrm{df}}{=} \|x\|_\infty,$$
then it can be shown that $\varphi$ is not Fréchet differentiable at any point.

4.3 Haar Null Sets and Locally Lipschitz Functions

From real analysis we know that every Lipschitz continuous function $f : \mathbb{R} \longrightarrow \mathbb{R}$ is differentiable almost everywhere and is the integral of its derivative (a consequence of the fundamental theorem of the Lebesgue calculus). We saw that this theorem can be extended to vector valued functions $f : \mathbb{R} \longrightarrow X$, when $X$ is a Banach space with the RNP (see Theorem 2.2.17 and Remark 2.2.18). We also proved another generalization of Lebesgue's original result, which is due to Rademacher and which says that a locally Lipschitz function $f : \mathbb{R}^N \longrightarrow \mathbb{R}^M$ is differentiable almost everywhere (see Theorem 1.5.8 and Corollary 1.5.9). The purpose of this section is to combine these two generalizations; that is we want to prove a Rademacher type theorem for functions between Banach spaces.

The problem that we face when dealing with such a generalization is that we do not have a natural measure $\mu$ (such as the Lebesgue measure in $\mathbb{R}^N$), which produces a useful class of $\mu$-null sets. So we need to devise new ways to come up with negligible sets. Our experience from the real line suggests that the approach cannot be purely topological, choosing, say, the sets of first Baire category. There are several distinct ways to define negligible sets. Our goal is not to present all of them. Instead we will focus on the so-called Haar-null sets, which historically produced the first generalization of Rademacher's theorem. As the name suggests Haar-null sets are defined on topological groups. So let us start with a brief discussion of them.

DEFINITION 4.3.1 A topological group is a group $G$ endowed with a Hausdorff topology, which is compatible with the group structure; that is the two maps
$$G \times G \ni (x, y) \longmapsto xy \in G \quad \text{and} \quad G \ni x \longmapsto x^{-1} \in G$$
are continuous ($G \times G$ is furnished with the product topology). An isomorphism of a topological group $G$ onto a topological group $\widehat{G}$ is a group isomorphism of $G$ onto $\widehat{G}$ which is bicontinuous. We say that $G$ is an Abelian topological group, if $G$ is Abelian.

REMARK 4.3.2 The map $x \longmapsto x^{-1}$ is the inverse map, the map $x \longmapsto ax$ is left translation by $a$ and the map $x \longmapsto xb$ is right translation by $b$. All three maps are homeomorphisms of $G$ onto itself. The fact that translations are homeomorphisms implies that a topological group is topologically homogeneous. Namely for any $a, b \in G$, the map $x \longmapsto ba^{-1}x$ is a homeomorphism of $G$ which sends $a$ to $b$. Therefore the topological structure at $a$ is reflected at $b$. In particular then the topology is completely determined by the system of neighbourhoods of the neutral element $e$.


DEFINITION 4.3.3 A metric $d_G$ on a topological group $G$ is said to be left-invariant, if
$$d_G(ax, ay) = d_G(x, y) \quad \forall\, a, x, y \in G$$
and right-invariant, if
$$d_G(xa, ya) = d_G(x, y) \quad \forall\, a, x, y \in G.$$
The metric $d_G$ is invariant, if it is both left-invariant and right-invariant.

REMARK 4.3.4 A celebrated theorem of Birkhoff-Kakutani (see, e.g., Hewitt & Ross (1963, pp. 68–70)) says that a topological group $G$ is metrizable if and only if the neutral element $e$ has a countable fundamental system of neighbourhoods. A metrizable topological group admits a left-invariant (or right-invariant) compatible metric. However, a metrizable topological group need not admit an invariant metric. For Abelian groups clearly left and right invariance are equivalent.

We shall consider separable Abelian topological groups and separable Banach spaces, since otherwise we face serious measurability problems. In addition the Abelian topological group will be Polish. So let $G$ be an Abelian Polish topological group. If $G$ is locally compact, then it is well known that there is a unique (up to scalar multiplication) translation invariant measure $\mu$ on $G$ called the Haar measure (see Dieudonné (1969, p. 244)). For $G$ which is not locally compact, no invariant measure exists. Nevertheless, it is still possible to define the notion of Haar-null sets. Since $G$ is Abelian, its operation will be denoted by “$+$.” Also all measures considered in the sequel will be Borel without any explicit mention.

DEFINITION 4.3.5 A Borel set $A \subseteq G$ is said to be Haar-null, if there is a probability measure $\mu$ on $G$, such that $\chi_A * \mu = 0$, i.e.,
$$\int_G \chi_A(x + y)\,\mu(dx) = 0 \quad \forall\, y \in G$$
(the convolution of the characteristic function $\chi_A$ and the measure $\mu$).

REMARK 4.3.6 So according to this definition the Borel set $A$ is Haar-null if and only if there is a probability measure $\mu$, such that all translates of $A$ are $\mu$-null, i.e., $\mu(A + y) = 0$ for all $y \in G$. The measure $\mu$ is called a test measure for $A$.

The next proposition shows that when $G$ is locally compact, then the notion of Haar-null set coincides with that of a set which is negligible with respect to the Haar measure on $G$.


PROPOSITION 4.3.7 If $G$ is a locally compact, Abelian, Polish topological group and $A \subseteq G$ is a Borel set, then the following two properties are equivalent:
(a) $A$ is Haar-null on $G$;
(b) $A$ is negligible for the Haar measure on $G$.

PROOF “(a)$\Longrightarrow$(b)”: Let $\mu$ be a test measure for $A$ and let $h$ be a Haar measure on $G$. We know that $h$ is $\sigma$-finite. Then by virtue of Fubini's theorem, we have
$$\int_G \Big[\int_G \chi_A(x + y)\,h(dx)\Big]\mu(dy) = \int_G \Big[\int_G \chi_A(x + y)\,\mu(dy)\Big]h(dx) = 0.$$
So for $\mu$-almost all $y \in G$, we have
$$\int_G \chi_A(x + y)\,h(dx) = 0,$$
hence
$$h(A) = \int_G \chi_A(x)\,h(dx) = 0.$$

“(b)$\Longrightarrow$(a)”: Again let $h$ be a Haar measure. Let $f \in C_c(G)$, such that
$$\int_G f(x)\,h(dx) = 1.$$
Let us set
$$\mu(B) \stackrel{\mathrm{df}}{=} \int_B f(x)\,h(dx) \quad \forall\, B \in \mathcal{B}(G).$$
Evidently $\mu$ is a probability measure on $G$ and we have
$$\int_G \chi_A(x + y)\,\mu(dx) = \int_G \chi_A(x + y) f(x)\,h(dx) \le c\int_G \chi_A(x + y)\,h(dx) = c\int_G \chi_A(x)\,h(dx) = 0,$$
for some $c > 0$ (since $f \in C_c(G)$).


REMARK 4.3.8 Since on $\mathbb{R}^N$, the Haar measures are multiples of the Lebesgue measure, it follows that the Haar-null sets are the Lebesgue-null sets.

PROPOSITION 4.3.9 If $G$ is an Abelian, Polish topological group and $\{A_n\}_{n\ge1}$ is a sequence of Haar-null sets in $G$, then $\widehat{A} \stackrel{\mathrm{df}}{=} \bigcup_{n=1}^\infty A_n$ is a Haar-null set in $G$.

PROOF Let $M^1_+(G)$ be the space of probability measures on $G$. Furnished with the narrow topology $w\big(M^1_+(G), C_b(G)\big)$, $M^1_+(G)$ becomes a Polish space (see, e.g., Denkowski, Migórski & Papageorgiou (2003a, p. 199)). Let $\varrho$ be a complete metric on $M^1_+(G)$ generating the above narrow topology. By the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that if
$$\mu_n \longrightarrow \mu \quad \text{in } M^1_+(G),$$
then
$$\mu_n * \nu \longrightarrow \mu * \nu \quad \text{in } M^1_+(G).$$
Note that if a probability measure $\mu$ vanishes on a set $A$ and all its translates, then the same is true for every translate of $\mu$ and every probability measure which is absolutely continuous with respect to $\mu$ (see Definition A.2.22). In particular this is the case for measures of the form
$$\mu_B(A) \stackrel{\mathrm{df}}{=} \frac{\mu(A \cap B)}{\mu(B)},$$
with $B \in \mathcal{B}(G)$, such that $\mu(B) > 0$. So let $\mu_0$ be a translation of $\mu$, such that every neighbourhood of the neutral element $e$ of $G$ has strictly positive $\mu_0$-measure. Then for $r > 0$, we set
$$u_r(C) \stackrel{\mathrm{df}}{=} \frac{\mu_0\big(C \cap B_r(e)\big)}{\mu_0\big(B_r(e)\big)},$$
where
$$B_r(e) \stackrel{\mathrm{df}}{=} \big\{x \in G : d_G(x, e) < r\big\},$$
with $d_G$ being a complete metric on $G$. Evidently
$$u_r \longrightarrow \delta_e \quad \text{in } M^1_+(G), \text{ as } r \searrow 0,$$
where $\delta_e$ is the Dirac measure with its mass at $e$. Therefore for every $\varepsilon > 0$, we can find $\nu \in M^1_+(G)$, such that $\chi_A * \nu = 0$ (i.e., $\nu$ is a test measure for $A$) and $\varrho(\nu, \delta_e) < \varepsilon$.

These observations imply that by induction we can generate a sequence $\{\mu_n\}_{n\ge1} \subseteq M^1_+(G)$, such that
$$\chi_{A_n} * \mu_n = 0 \quad \forall\, n \ge 1$$
(i.e., $\mu_n$ is a test measure for $A_n$) and
$$\varrho\big(u, u * \mu_n\big) < \frac{1}{2^n} \quad \forall\, n \ge 1,$$
where $u$ is any convolution of different $\mu_k$'s with $1 \le k \le n - 1$. Since the metric $\varrho$ on $M^1_+(G)$ is complete, we see that the measure
$$\mu \stackrel{\mathrm{df}}{=} \mathop{*}_{k=1}^{\infty} \mu_k$$
is well defined. Because $\mu = \mu_n * \nu_n$ for all $n \ge 1$, where
$$\nu_n = \mathop{*}_{k\ne n} \mu_k,$$
it follows that
$$\chi_{A_n} * \mu = 0 \quad \forall\, n \ge 1,$$
hence
$$\chi_{\bigcup_{n=1}^\infty A_n} * \mu = 0.$$
This proves that $\bigcup_{n=1}^\infty A_n$ is a Haar-null set.

COROLLARY 4.3.10 If $G$ is an Abelian, Polish topological group and $A \subseteq G$ is Haar-null, then $A^c = G \setminus A$ is dense in $G$.

PROOF Let $\{x_n\}_{n\ge 1}$ be a sequence which is dense in $G$ and let $\mu$ be a test measure for $A$. It suffices to show that $\operatorname{int} A = \emptyset$. Suppose that $\operatorname{int} A \ne \emptyset$. Then we can find a neighbourhood $U$ of the neutral element $e$ of $G$ and $a \in A$, such that $a + U \subseteq A$. We have

$$\mu(U + b) = \mu(U + a) = \mu(U) = 0 \quad \forall\, b \in G,$$

so

$$\mu(U + x_n) = 0 \quad \forall\, n \ge 1.$$

Then because of Proposition 4.3.9, we have that

$$G = \bigcup_{n=1}^{\infty} (U + x_n) \ \text{is Haar-null, with test measure } \mu,$$

a contradiction.


EXAMPLE 4.3.11 Let $X$ be a separable Banach space and let $A \subseteq X$ be a Borel set which intersects all the translates of a fixed line $L$ in sets whose one-dimensional Lebesgue measure is zero. For example $A$ can be a proper, Borel linear subspace of $X$. Then $A$ is a Haar-null set. Any probability measure on the line $L$ which is equivalent to the Lebesgue measure on $L$ can be used as a test measure of $A$. Sets like $A$ above are called directionally null sets.

Recall that a set $A \subseteq \mathbb{R}$ can have positive measure, and, in fact, its complement can be Lebesgue-null, without $A$ including any interval of positive length (consider, e.g., the set of irrational numbers). However, if we take all differences between elements in $A$ (i.e., $A - A$), then this set contains a nontrivial interval around zero. This fact is used in the construction of a nonmeasurable set (see, e.g., Halmos (1974, p. 69)). The same property can be proved if $\mathbb{R}$ is replaced by an Abelian, Polish topological group $G$ and we consider a Borel set $A \subseteq G$ which is not Haar-null.

PROPOSITION 4.3.12 If $G$ is an Abelian, Polish topological group and $A \subseteq G$ is a Borel set which is not Haar-null, then $A - A$ is a neighbourhood of the neutral element $e$ of $G$.

PROOF

Let

$$S(A) \overset{df}{=} \big\{ x \in G : (A + x) \cap A \text{ is not Haar-null} \big\}.$$

We claim that $S(A)$ is a neighbourhood of $e$. If we can show this then the proposition follows since $S(A) \subseteq A - A$. Suppose that the claim is not true. Then we can find a sequence $\{x_n\}_{n\ge 1} \subseteq G$, such that

$$d_G(x_n, e) < \frac{1}{2^n} \quad \forall\, n \ge 1$$

($d_G$ being a complete metric on $G$) and

$$(A + x_n) \cap A \ \text{is Haar-null} \quad \forall\, n \ge 1.$$

Because of Proposition 4.3.9, the set

$$\bigcup_{n=1}^{\infty} \big[ (A + x_n) \cap A \big]$$

is Haar-null and so its complement in $A$,

$$\widehat{A} \overset{df}{=} A \setminus \bigcup_{n=1}^{\infty} \big[ (A + x_n) \cap A \big],$$

is a Borel set which is not Haar-null (see Corollary 4.3.10). Note that

$$\big( \widehat{A} + x_n \big) \cap \widehat{A} = \emptyset \quad \forall\, n \ge 1.$$

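Let $C$ be the Cantor group (i.e., $C = \{0,1\}^{\mathbb{N}}$). The elements of $C$ are denoted by $\xi = \{\xi_n\}_{n\ge 1}$ with $\xi_n \in \{0,1\}$ for all $n \ge 1$. Let $\xi^{(n)} = \big\{ \xi^{(n)}_k \big\}_{k\ge 1}$ be the element of $C$, defined by

$$\xi^{(n)}_k \overset{df}{=} \begin{cases} 0 & \text{if } k \ne n, \\ 1 & \text{if } k = n. \end{cases}$$

Consider the map $\vartheta : C \to G$, defined by

$$\vartheta(\xi) \overset{df}{=} \sum_{n=1}^{\infty} \xi_n x_n.$$

Note that $\vartheta\big(\xi^{(n)}\big) = x_n$ and because $d_G$ is complete and invariant, we have that $\vartheta$ is continuous. If we consider the Haar probability measure $h$ on $C$ and the image measure $h \circ \vartheta^{-1} = \mu$ on $G$, then because $\widehat{A}$ is not Haar-null, $\mu$ does not vanish on some translate of $\widehat{A}$; that is, we can find $y \in G$, such that

$$\vartheta^{-1}\big(\widehat{A} + y\big) \ \text{has positive } h\text{-measure}.$$

So we must have that

$$\vartheta^{-1}\big(\widehat{A} + y\big) - \vartheta^{-1}\big(\widehat{A} + y\big) \ \text{is a neighbourhood of } 0 \in C.$$

This means that for all $n \ge 1$ large enough,

$$\xi^{(n)} \in \vartheta^{-1}\big(\widehat{A} + y\big) - \vartheta^{-1}\big(\widehat{A} + y\big).$$

Thus there are $\sigma, \tau \in \vartheta^{-1}\big(\widehat{A} + y\big)$ which differ only in the $n$-th coordinate and $x_n \in \widehat{A} - \widehat{A}$ for all these $n$. But this contradicts the fact that

$$\big( \widehat{A} + x_n \big) \cap \widehat{A} = \emptyset \quad \forall\, n \ge 1.$$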


REMARK 4.3.13 In the above proof, we have shown something stronger. Namely that $S(A)$ is an open set containing $e$. In fact this result can be generalized as follows: “If $A, B$ are two Borel sets in $G$ and

$$S(A, B) \overset{df}{=} \big\{ x \in G : (A + x) \cap B \text{ is not Haar-null} \big\},$$

then $S(A, B)$ is an open (possibly empty) subset of $G$” (see Christensen (1974, p. 118)).

COROLLARY 4.3.14 (a) If $G$ is an Abelian, Polish topological group which is not locally compact, then every compact subset of $G$ is Haar-null. (b) If $G = X$ is a nonreflexive, separable Banach space, then every weakly compact subset of $X$ is Haar-null.

Using the notion of the Haar-null sets we can extend Rademacher's theorem to Lipschitz continuous functions between certain Banach spaces.

LEMMA 4.3.15 If $X$ and $Y$ are two Banach spaces, $U \subseteq X$ is an open set, $f : U \to Y$ is a Lipschitz continuous function, $G \subseteq X$ is a dense additive subgroup and for some $x_0 \in U$ and all $h \in G$,

$$\lim_{\lambda \to 0} \frac{f(x_0 + \lambda h) - f(x_0)}{\lambda} = f'(x_0; h)$$

exists and $f'(x_0; \cdot)$ is additive, then $f$ is Gâteaux differentiable at $x_0$.

PROOF Consider the following family of functions:

$$u_\lambda(h) \overset{df}{=} \frac{f(x_0 + \lambda h) - f(x_0)}{\lambda} \quad \forall\, \lambda \ne 0,\ h \in X.$$

The family $\{u_\lambda\}_{\lambda \ne 0}$ is equicontinuous (since $f$ is Lipschitz continuous on $U$) and since $\lim_{\lambda \to 0} u_\lambda(h)$ exists for all $h \in G$, it also exists for all $h \in \overline{G} = X$. Moreover, from the additivity of $f'(x_0; \cdot)$ on $G$ follows the additivity on $\overline{G} = X$. In addition note that

$$\lim_{\lambda \to 0} u_\lambda(th) = t \lim_{\lambda \to 0} u_\lambda(h) \quad \forall\, t \in \mathbb{R}.$$

So $f'(x_0; \cdot)$ is a linear operator which is bounded by the Lipschitz constant of $f$, i.e., $f'(x_0; \cdot) \in L(X; Y)$. Therefore $f$ is Gâteaux differentiable at $x_0$.


PROPOSITION 4.3.16 If $U \subseteq \mathbb{R}^N$ is an open set, $Y$ is a Banach space with the RNP and $f : U \to Y$ is a Lipschitz continuous function, then $f$ is Gâteaux differentiable almost everywhere on $U$ (on $U$ we consider the $N$-dimensional Lebesgue measure).

PROOF Without any loss of generality, we may assume that $U = \mathbb{R}^N$. Let $\{x_n\}_{n\ge 1} \subseteq \mathbb{R}^N$ be dense and let

$$G \overset{df}{=} \operatorname{span}_{\mathbb{Q}} \{x_n\}_{n\ge 1}$$

(i.e., the linear combinations of the $x_n$'s with rational coefficients). Clearly $G$ is a countable dense additive subgroup of $\mathbb{R}^N$. From Theorem 2.2.17, we know that the directional derivatives

$$f'(x; h) = \lim_{\lambda \to 0} \frac{f(x + \lambda h) - f(x)}{\lambda}$$

exist in all directions $h \in G$ for almost all $x \in \mathbb{R}^N$. Then in the light of Lemma 4.3.15, if we can show that these directional derivatives are additive on $G$, then we will have the desired almost everywhere Gâteaux differentiability. To this end let $\varphi \in C_c^1\big(\mathbb{R}^N\big)$ be such that

$$\int_{\mathbb{R}^N} \varphi(x)\, dx = 1.$$

For example a standard function with these properties is the function

$$\varphi(x) = \begin{cases} c \exp\!\left( \dfrac{1}{\|x\|_{\mathbb{R}^N}^2 - 1} \right) & \text{if } \|x\|_{\mathbb{R}^N} < 1, \\[2mm] 0 & \text{if } \|x\|_{\mathbb{R}^N} \ge 1, \end{cases}$$

with $c \in \mathbb{R}$ chosen so that we have the normalization condition

$$\int_{\mathbb{R}^N} \varphi(x)\, dx = 1.$$

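Let

$$g(x) \overset{df}{=} (f \star \varphi)(x) = \int_{\mathbb{R}^N} f(y) \varphi(x - y)\, dy.$$

We know that $g \in C^1\big(\mathbb{R}^N; Y\big)$ and so we have that

$$g_G'(x) h = \big( f \star \varphi_G' \big)(x)\, h$$

is linear in $h \in \mathbb{R}^N$ for all $x \in \mathbb{R}^N$. Since

$$\int_{\mathbb{R}^N} f(y) \varphi(x - y)\, dy = \int_{\mathbb{R}^N} f(x - y) \varphi(y)\, dy \quad \forall\, x \in \mathbb{R}^N,$$

using the Lebesgue dominated convergence theorem (see Theorem A.2.2), for all $x \in \mathbb{R}^N$ and $h \in G$ we have

$$g_G'(x) h = \lim_{\lambda \to 0} \frac{g(x + \lambda h) - g(x)}{\lambda} = \int_{\mathbb{R}^N} \varphi(y) \lim_{\lambda \to 0} \frac{f(x + \lambda h - y) - f(x - y)}{\lambda}\, dy$$

and it follows that

$$g_G'(x) h = \big( \varphi \star \xi_h \big)(x),$$

where

$$\xi_h(x) \overset{df}{=} \lim_{\lambda \to 0} \frac{f(x + \lambda h) - f(x)}{\lambda}$$

is a bounded measurable function. Then, by the linearity of $g_G'(x)$, we have

$$\varphi \star \big( \xi_{h_1 + h_2} - \xi_{h_1} - \xi_{h_2} \big) = 0 \quad \forall\, h_1, h_2 \in G.$$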

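The same is true if $\varphi$ is replaced by $\varphi_m(x) = m^N \varphi(mx)$. Recall that for every bounded, measurable function $\widehat{g} : \mathbb{R}^N \to Y$, we have that

$$\big( \varphi_m \star \widehat{g} \big)(x) \to \widehat{g}(x) \quad \text{for a.a. } x \in \mathbb{R}^N$$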

(see Proposition 2.4.12(c); the result there is stated for R-valued functions, but it can be extended to Y -valued functions by scalarization using elements of Y ∗ and recalling that by virtue of Theorem 2.1.3, we may assume without any loss of generality that Y is separable). So in the limit, we obtain that ξh1 +h2 (x) = ξh1 (x) + ξh2 (x) for a.a. x ∈ RN and all h1 , h2 ∈ G. Because G is countable, the exceptional Lebesgue-null set is independent of h1 , h2 ∈ G. Now we are ready for the infinite dimensional generalization of Rademacher’s theorem (see Theorem 1.5.8 and Corollary 1.5.9).


THEOREM 4.3.17 If $X$ is a separable Banach space, $U \subseteq X$ is an open set, $Y$ is a Banach space with the RNP and $f : U \to Y$ is a Lipschitz continuous function, then $f$ is a Gâteaux differentiable function on a set $D_f$ with $X \setminus D_f$ being Haar-null in $X$.

PROOF Without any loss of generality we may assume that $U = X$. First we show that the set $D_f$ of points of Gâteaux differentiability of $f$ is a Borel subset of $X$. Indeed let $\{x_n\}_{n\ge 1}$ be dense in $X$ and set

$$G \overset{df}{=} \operatorname{span}_{\mathbb{Q}} \{x_n\}_{n\ge 1}.$$

Then by virtue of Lemma 4.3.15, we have

$$D_f = \left\{ x \in X : \left\| \frac{f(x + \lambda h) - f(x)}{\lambda} - \frac{f(x + r h) - f(x)}{r} \right\|_Y < \frac{1}{m},\ \lambda, r \in \mathbb{Q},\ |\lambda|, |r| \le \frac{1}{n},\ h \in G,\ m, n \ge 1 \right\}.$$

So we see that $D_f \subseteq X$ is a Borel set. Now let $\{y_n\}_{n\ge 1}$ be a sequence of linearly independent vectors in $X$, such that $\overline{\operatorname{span}}\, \{y_n\}_{n\ge 1} = X$. Let

$$V_m \overset{df}{=} \operatorname{span}\, \{y_n\}_{n=1}^{m}.$$

Then $V_1 \subseteq V_2 \subseteq \ldots \subseteq V_m \subseteq \ldots \subseteq X$ and

$$\overline{\bigcup_{m=1}^{\infty} V_m} = X.$$

Using Proposition 4.3.16, we can find $D_n \subseteq V_n$, such that $f$ is Gâteaux differentiable on $D_n$ and $V_n \setminus D_n$ is Lebesgue-null. By virtue of Lemma 4.3.15,

$$D_f = \bigcap_{n=1}^{\infty} D_n.$$

So $X \setminus D_f$ is a Haar-null set in $X$.

COROLLARY 4.3.18 If $X$ is a separable Banach space, $Y$ is a Banach space with the RNP and $f : X \to Y$ is a locally Lipschitz continuous function, then $f$ is a Gâteaux differentiable function on a set $D_f$ with $X \setminus D_f$ being Haar-null in $X$.

4.4 Duality and Subdifferentials

A basic theme that runs through the whole theory of convex analysis is that of “duality.” Namely almost every mathematical notion is paired with another one, which is in some sense dual to it. So convex cones are associated to their polars (generalizing this way the pairing of tangent and normal spaces in differential geometry), closed convex sets are associated to their support functions (a pairing which permits an interchange between geometric and analytical reasoning), and minimization problems in linear programming are associated to maximization problems, known as the dual problems, which provide valuable information about the solvability and the value of the original problem. In general duality permits us to establish close relations between otherwise disparate properties. The starting point of all these correspondences is a deep duality principle between certain pairs of convex functions (known as conjugate functions), which we study in the first half of this section.

The mathematical framework of our analysis is a Hausdorff locally convex vector space $X$ and its dual $X^*$ (i.e., the set of continuous, linear functionals on $X$). Additional hypotheses will be introduced as needed. We supply $X$ with the $w(X, X^*)$-topology and $X^*$ with the $w(X^*, X)$-topology. So $(X^*, X)$ is a dual pair (or dual system) and we denote by $\langle \cdot, \cdot \rangle_X$ their pairing.

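DEFINITION 4.4.1 Let $\varphi : X \to \mathbb{R}^* \overset{df}{=} \mathbb{R} \cup \{\pm\infty\}$. The Legendre-Fenchel transform (or the conjugate) of $\varphi$ is the function $\varphi^* : X^* \to \mathbb{R}^*$, defined by

$$\varphi^*(x^*) \overset{df}{=} \sup_{x \in X} \big[ \langle x^*, x \rangle_X - \varphi(x) \big].$$

The function $(\varphi^*)^* = \varphi^{**} : X \to \mathbb{R}^*$, defined by

$$\varphi^{**}(x) \overset{df}{=} \sup_{x^* \in X^*} \big[ \langle x^*, x \rangle_X - \varphi^*(x^*) \big],$$

is the second conjugate (or biconjugate) of $\varphi$.

REMARK 4.4.2 If $\varphi$ takes the value $-\infty$, then $\varphi^* \equiv +\infty$. Also if the effective domain $\operatorname{dom} \varphi$ is empty, then $\varphi^* \equiv -\infty$. For this reason of interest is the case where $\varphi : X \to \overline{\mathbb{R}} \overset{df}{=} \mathbb{R} \cup \{+\infty\}$ and $\operatorname{dom} \varphi \ne \emptyset$ (i.e., $\varphi$ is a proper function; see Definition 4.2.1). In this case $\varphi^* : X^* \to \overline{\mathbb{R}}$ and it is proper too. The significance of $\varphi^*$ is better understood using epigraphs (see Definition 4.2.1). So we have

$$(x^*, \mu) \in \operatorname{epi} \varphi^* \iff \Big[ \langle x^*, x \rangle_X - \lambda \le \mu \quad \forall\, (x, \lambda) \in \operatorname{epi} \varphi \Big].$$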

If we write the last inequality as

$$\langle x^*, x \rangle_X - \mu \le \lambda \quad \forall\, (x, \lambda) \in \operatorname{epi} \varphi,$$

we see that

$$(x^*, \mu) \in \operatorname{epi} \varphi^* \iff \Big[ \langle x^*, x \rangle_X - \mu \le \varphi(x) \quad \forall\, x \in X \Big].$$

So

$$l_{(x^*, \mu)}(x) \overset{df}{=} \langle x^*, x \rangle_X - \mu$$

is a continuous affine minorant of $\varphi$. Therefore $\varphi^*$ is proper if and only if $\varphi$ admits a continuous affine minorant. Moreover, $\varphi^*$ describes the family of all continuous affine minorants of $\varphi$. On the other hand also note that

$$\varphi^*(x^*) \le \mu \iff \Big[ l_{(x, \lambda)}(x^*) = \langle x^*, x \rangle_X - \lambda \le \mu \quad \forall\, (x, \lambda) \in \operatorname{epi} \varphi \Big].$$

So we see that $\varphi^*$ is the pointwise supremum of all continuous affine functions $\big\{ l_{(x, \lambda)} \big\}_{(x, \lambda) \in \operatorname{epi} \varphi}$. Therefore $\varphi^{**}$ is the pointwise supremum of all continuous affine functions majorized by $\varphi$. From these observations and recalling that the supremum of continuous affine functions on $X$ is convex and lower semicontinuous, we obtain the following result.

PROPOSITION 4.4.3 If $\varphi : X \to \overline{\mathbb{R}}$ is a proper function, then $\varphi^* \in \Gamma_0(X^*)$.

Also directly from the definition of $\varphi^*$, we obtain the following two results.

PROPOSITION 4.4.4 (Young-Fenchel Inequality) If $\varphi : X \to \mathbb{R}^*$ is a function, then

$$\varphi(x) + \varphi^*(x^*) \ge \langle x^*, x \rangle_X \quad \forall\, x \in X,\ x^* \in X^*.$$

PROPOSITION 4.4.5 If $\varphi, \psi : X \to \mathbb{R}^*$ and

$$\varphi(x) \le \psi(x) \quad \forall\, x \in X,$$

then

$$\varphi^*(x^*) \ge \psi^*(x^*) \quad \forall\, x^* \in X^*.$$
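As a purely numerical illustration of Propositions 4.4.4 and 4.4.5 in the simplest setting $X = \mathbb{R}$, the conjugate can be approximated by a maximum over a finite grid. The sketch below is not part of the theory: the helper name `conjugate`, the grids and the two test functions are assumptions chosen only for illustration, and the grid maximum only approximates the true supremum.

```python
# Discrete Legendre-Fenchel transform on a finite grid (illustrative sketch).
def conjugate(phi, xs, slopes):
    """Return {s: max over the grid xs of (s*x - phi(x))} for each s in slopes."""
    return {s: max(s * x - phi(x) for x in xs) for s in slopes}

xs = [i / 10 for i in range(-50, 51)]       # primal grid on [-5, 5]
slopes = [i / 10 for i in range(-20, 21)]   # dual grid on [-2, 2]

phi = lambda x: 0.5 * x * x                 # the smaller function
psi = lambda x: 0.5 * x * x + abs(x)        # phi <= psi everywhere

phi_star = conjugate(phi, xs, slopes)
psi_star = conjugate(psi, xs, slopes)

# Young-Fenchel inequality: phi(x) + phi*(s) >= s*x on the grid
# (a small tolerance absorbs floating-point rounding).
young_fenchel_ok = all(phi(x) + phi_star[s] >= s * x - 1e-12
                       for x in xs for s in slopes)

# Order reversal: phi <= psi implies phi* >= psi*.
order_reversal_ok = all(phi_star[s] >= psi_star[s] - 1e-12 for s in slopes)
```

For $\varphi(x) = x^2/2$ the exact conjugate is $\varphi^*(s) = s^2/2$, so the grid value at $s = 1$ should be close to $1/2$.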


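DEFINITION 4.4.6 (a) Let $C \subseteq X$. The support function of the set $C$ is the function $\sigma_C : X^* \to \mathbb{R}^*$, defined by

$$\sigma_C(x^*) \overset{df}{=} \sup_{c \in C} \langle x^*, c \rangle_X$$

(recall that $\sup \emptyset = -\infty$). If $C \ne \emptyset$, then $\sigma_C$ takes values in $\overline{\mathbb{R}}$.

(b) The infimal convolution of functions $\varphi, \psi : X \to \overline{\mathbb{R}}$ is the function $\varphi \oplus \psi : X \to \mathbb{R}^*$, defined by

$$\big( \varphi \oplus \psi \big)(x) \overset{df}{=} \inf_{y \in X} \big( \varphi(y) + \psi(x - y) \big) = \inf_{z + y = x} \big( \varphi(z) + \psi(y) \big).$$

We say that $\varphi \oplus \psi$ is exact at $x$, if

$$\big( \varphi \oplus \psi \big)(x) = \min_{y \in X} \big( \varphi(y) + \psi(x - y) \big)$$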

(i.e., the infimum is attained). We say that $\varphi \oplus \psi$ is exact, if it is exact at every $x$.

REMARK 4.4.7 Evidently if $C \subseteq X$ is nonempty, then $\sigma_C \in \Gamma_0(X^*)$ and $\sigma_C(0) = 0$. In fact $\sigma_C$ is sublinear (i.e., subadditive and positively homogeneous). Moreover, $\sigma_C = \big( i_C \big)^*$, where $i_C$ is the indicator function of the set $C \subseteq X$, i.e.,

$$i_C(x) \overset{df}{=} \begin{cases} 0 & \text{if } x \in C, \\ +\infty & \text{otherwise} \end{cases}$$

(see Remark 4.2.2). If

$$\operatorname{sepi} \varphi \overset{df}{=} \big\{ (x, \lambda) \in X \times \mathbb{R} : \varphi(x) < \lambda \big\}$$

(the strict epigraph of $\varphi$), then it is easy to check that

$$\operatorname{sepi} \big( \varphi \oplus \psi \big) = \operatorname{sepi} \varphi + \operatorname{sepi} \psi.$$

Also since

$$\big( \varphi \oplus \psi \big)(x) = \inf_{\substack{(z, \lambda) \in \operatorname{epi} \varphi \\ (y, \mu) \in \operatorname{epi} \psi \\ z + y = x}} (\lambda + \mu) = \inf_{(x, \lambda) \in (\operatorname{epi} \varphi + \operatorname{epi} \psi)} \lambda,$$

we see that the infimal convolution of proper, convex functions $\varphi$ and $\psi$ is convex but not necessarily proper. For example, if $\varphi = i_C$ and $\psi = i_D$ and $C, D \subseteq X$ are two nonempty, convex, disjoint sets, then $i_C + i_D \equiv +\infty$. On the other hand, if $\varphi$ and $\psi$ are linear functionals which are not identical, then $\varphi \oplus \psi = -\infty$. In addition note that, if $X$ is a normed space and $C \subseteq X$ is a nonempty set, then for all $x \in X$, we have

$$d_X(x, C) = \inf_{c \in C} \|x - c\|_X = \inf_{y \in X} \big( \|x - y\|_X + i_C(y) \big) = \big( \|\cdot\|_X \oplus i_C \big)(x).$$

Hence for all nonempty, convex sets $C \subseteq X$, the distance function $d_X(\cdot, C)$ is convex. Moreover, it is easy to see that

$$\big| d_X(x, C) - d_X(y, C) \big| \le \|x - y\|_X,$$

i.e., $d_X(\cdot, C)$ is nonexpansive. Finally for any index set $I$, we have

$$\Big( \inf_{i \in I} \varphi_i \Big)^* = \sup_{i \in I} \varphi_i^*.$$
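The identity $d_X(\cdot, C) = \|\cdot\|_X \oplus i_C$ from Remark 4.4.7 can be checked by hand on the real line. The following sketch takes $X = \mathbb{R}$ and $C = [1, 2]$; the helper names, the interval and the grid are assumptions made for this illustration, and the infimum is taken over a finite grid that happens to contain the endpoints of $C$.

```python
INF = float("inf")

def inf_convolution(phi, psi, x, ys):
    """Grid approximation of (phi ⊕ psi)(x) = inf_y [phi(y) + psi(x - y)]."""
    return min(phi(y) + psi(x - y) for y in ys)

# Indicator function of the (assumed) interval C = [1, 2].
indicator_C = lambda y: 0.0 if 1.0 <= y <= 2.0 else INF

def dist_to_C(x):
    """Closed form of the distance from x to [1, 2], for comparison."""
    return max(1.0 - x, 0.0, x - 2.0)

ys = [i / 4 for i in range(-40, 41)]     # grid on [-10, 10]; contains 1 and 2

sample_points = (-3.0, 0.5, 1.5, 4.0)
values = {x: inf_convolution(indicator_C, abs, x, ys) for x in sample_points}

# Nonexpansiveness |d(x, C) - d(y, C)| <= |x - y| on the sample points.
nonexpansive_ok = all(abs(values[a] - values[b]) <= abs(a - b)
                      for a in sample_points for b in sample_points)
```

Because the grid contains the points of $C$ nearest to each sample point, the grid infimum here coincides with the exact distance.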

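PROPOSITION 4.4.8 If $\varphi, \psi : X \to \overline{\mathbb{R}}$ are proper, convex functions, then

$$\big( \varphi \oplus \psi \big)^* = \varphi^* + \psi^*.$$

PROOF According to Definitions 4.4.1 and 4.4.6(b), we have

$$\begin{aligned} \big( \varphi \oplus \psi \big)^*(x^*) &= \sup_{x \in X} \Big( \langle x^*, x \rangle_X - \big( \varphi \oplus \psi \big)(x) \Big) \\ &= \sup_{x \in X} \Big( \langle x^*, x \rangle_X - \inf_{y \in X} \big( \varphi(y) + \psi(x - y) \big) \Big) \\ &= \sup_{y, z \in X} \big( \langle x^*, y \rangle_X - \varphi(y) + \langle x^*, z \rangle_X - \psi(z) \big) \\ &= \varphi^*(x^*) + \psi^*(x^*). \end{aligned}$$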


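PROPOSITION 4.4.9 If $X$ and $Y$ are two Hausdorff, locally convex spaces, $A \in L(X; Y)$ is an isomorphism, $g : Y \to \overline{\mathbb{R}}$ is a proper function and for $y_0 \in Y$, $x_0^* \in X^*$, $\xi_0 \in \mathbb{R}$ and $\lambda_0 > 0$, we set

$$\varphi(x) \overset{df}{=} \lambda_0\, g(Ax + y_0) + \langle x_0^*, x \rangle_X + \xi_0 \quad \forall\, x \in X,$$

then

$$\varphi^*(x^*) = \lambda_0\, g^*\!\left( \frac{1}{\lambda_0} (A^{-1})^* (x^* - x_0^*) \right) - \big\langle x^* - x_0^*, A^{-1} y_0 \big\rangle_X - \xi_0 \quad \forall\, x^* \in X^*.$$

PROOF We have

$$\begin{aligned} \varphi^*(x^*) &= \sup_{x \in X} \big\{ \langle x^*, x \rangle_X - \lambda_0\, g(Ax + y_0) - \langle x_0^*, x \rangle_X - \xi_0 \big\} \\ &= \lambda_0 \sup_{y \in Y} \left\{ \Big\langle \frac{1}{\lambda_0} (A^{-1})^* (x^* - x_0^*), y \Big\rangle_Y - g(y) \right\} - \big\langle x^* - x_0^*, A^{-1} y_0 \big\rangle_X - \xi_0 \\ &= \lambda_0\, g^*\!\left( \frac{1}{\lambda_0} (A^{-1})^* (x^* - x_0^*) \right) - \big\langle x^* - x_0^*, A^{-1} y_0 \big\rangle_X - \xi_0. \end{aligned}$$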

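COROLLARY 4.4.10 If $g : X \to \overline{\mathbb{R}}$ is a proper function and $x_0 \in X$, $x_0^* \in X^*$, $\lambda_0 > 0$, $\vartheta_0 \in \mathbb{R}$, then

(a) for $\varphi(x) = g(x + x_0)$, $\varphi^*(x^*) = g^*(x^*) - \langle x^*, x_0 \rangle_X$;

(b) for $\varphi(x) = g(x) + \langle x_0^*, x \rangle_X$, $\varphi^*(x^*) = g^*(x^* - x_0^*)$;

(c) for $\varphi(x) = \lambda_0\, g(x)$, $\varphi^*(x^*) = \lambda_0\, g^*\big( \tfrac{1}{\lambda_0} x^* \big)$;

(d) for $\varphi(x) = \lambda_0\, g\big( \tfrac{1}{\lambda_0} x \big)$, $\varphi^*(x^*) = \lambda_0\, g^*(x^*)$;

(e) for $\varphi(x) = g(\lambda_0 x)$, $\varphi^*(x^*) = g^*\big( \tfrac{1}{\lambda_0} x^* \big)$;

(f) for $\varphi(x) = \lambda_0\, g\big( \tfrac{1}{\lambda_0} x + x_0 \big) + \vartheta_0$, $\varphi^*(x^*) = \lambda_0\, g^*(x^*) - \lambda_0 \langle x^*, x_0 \rangle_X - \vartheta_0$.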

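Let us give some examples of conjugate functions.

EXAMPLE 4.4.11 (a) Let $X$ be a normed space and

$$C = \overline{B}_1 = \big\{ x \in X : \|x\|_X \le 1 \big\}.$$

Then

$$\sigma_C(x^*) = i_C^*(x^*) = \sup_{c \in \overline{B}_1} \langle x^*, c \rangle_X = \|x^*\|_{X^*}.$$

(b) If $K \subseteq X$ is a cone (i.e., $\lambda K \subseteq K$ for all $\lambda > 0$), then

$$\sigma_K = i_{-K^*},$$

where $K^*$ is the dual cone, i.e.,

$$K^* \overset{df}{=} \big\{ x^* \in X^* : \langle x^*, x \rangle_X \ge 0 \text{ for all } x \in K \big\}.$$

If $K$ is a linear subspace of $X$, then

$$K^* = K^{\perp} \overset{df}{=} \big\{ x^* \in X^* : \langle x^*, x \rangle_X = 0 \text{ for all } x \in K \big\}.$$

(c) Let $X$ be a normed space and $\varphi(x) = \|x\|_X$. Then we have

$$\varphi^*(x^*) = \sup_{x \in X} \big[ \langle x^*, x \rangle_X - \|x\|_X \big].$$

If $\|x^*\|_{X^*} \le 1$, we have

$$\langle x^*, x \rangle_X \le \|x\|_X$$

and so $\varphi^*(x^*) = 0$. On the other hand if $\|x^*\|_{X^*} > 1$, we can find $x \in X$, such that $\|x\|_X < \langle x^*, x \rangle_X$ and so

$$\langle x^*, \lambda x \rangle_X - \|\lambda x\|_X = \lambda \big[ \langle x^*, x \rangle_X - \|x\|_X \big] > 0 \quad \forall\, \lambda > 0.$$

Letting $\lambda \to +\infty$ shows that $\varphi^*(x^*) = +\infty$. So we conclude that

$$\varphi^* = i_{\overline{B}^*}, \quad \text{where } \overline{B}^* = \big\{ x^* \in X^* : \|x^*\|_{X^*} \le 1 \big\}.$$

(d) Let $X$ be a normed space, $C \subseteq X$ a nonempty set and

$$\varphi(x) \overset{df}{=} d_X(x, C) \quad \forall\, x \in X.$$

Then from Remark 4.4.7, we know that $\varphi = \|\cdot\|_X \oplus i_C$. Then by virtue of Proposition 4.4.8, we have that

$$\varphi^* = \|\cdot\|_X^* + i_C^* = i_{\overline{B}^*} + \sigma_C$$

(see (c) and (a) above).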


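(e) If

$$\varphi(x) \overset{df}{=} \langle x_0^*, x \rangle_X + \xi_0 \quad \forall\, x \in X,$$

with $x_0^* \in X^*$ and $\xi_0 \in \mathbb{R}$ (i.e., $\varphi$ is a continuous, affine function), then

$$\varphi^*(x^*) = \sup_{x \in X} \big[ \langle x^* - x_0^*, x \rangle_X - \xi_0 \big] = \begin{cases} -\xi_0 & \text{if } x^* = x_0^*, \\ +\infty & \text{if } x^* \ne x_0^*. \end{cases}$$

(f) If $\varphi : \mathbb{R}^N \to \mathbb{R}$ is defined by

$$\varphi(x) \overset{df}{=} \frac{1}{p} \|x\|_{\mathbb{R}^N}^p,$$

with $p \in (1, +\infty)$, then

$$\varphi^*(x^*) = \frac{1}{p'} \|x^*\|_{\mathbb{R}^N}^{p'},$$

with $\frac{1}{p} + \frac{1}{p'} = 1$. Indeed, let

$$g_{x^*}(x) \overset{df}{=} (x^*, x)_{\mathbb{R}^N} - \frac{1}{p} \|x\|_{\mathbb{R}^N}^p.$$

Then $g_{x^*}$ is concave and

$$g_{x^*}'(x) = x^* - \|x\|_{\mathbb{R}^N}^{p-2} x.$$

We have that $g_{x^*}'(\widehat{x}) = 0$ at the unique point $\widehat{x}$, such that

$$(x^*, \widehat{x})_{\mathbb{R}^N} = \|\widehat{x}\|_{\mathbb{R}^N}^p = \|x^*\|_{\mathbb{R}^N}^{\frac{p}{p-1}}.$$

So if $p' = \frac{p}{p-1}$ and since

$$\varphi^*(x^*) = \sup_{x \in \mathbb{R}^N} g_{x^*}(x) = g_{x^*}(\widehat{x}) = \Big( 1 - \frac{1}{p} \Big) \|x^*\|_{\mathbb{R}^N}^{p'},$$

we have that

$$\varphi^*(x^*) = \frac{1}{p'} \|x^*\|_{\mathbb{R}^N}^{p'}.$$

More generally, let $X$ be a normed space and let $g : \mathbb{R} \to \overline{\mathbb{R}}$ be an even, convex function. If

$$\varphi(x) \overset{df}{=} g\big( \|x\|_X \big) \quad \forall\, x \in X,$$

then

$$\varphi^*(x^*) = g^*\big( \|x^*\|_{X^*} \big) \quad \forall\, x^* \in X^*.$$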
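The closed form in Example 4.4.11(f) can be compared with a brute-force supremum in the case $N = 1$. The sketch below assumes $p = 3$ (so $p' = 3/2$) and a uniform grid on $[-5, 5]$; these choices, and the helper names, are illustrative assumptions only, and the grid maximum is merely an approximation of the supremum.

```python
def conjugate_at(phi, s, xs):
    """Grid approximation of phi*(s) = sup_x [s*x - phi(x)]."""
    return max(s * x - phi(x) for x in xs)

p = 3.0
p_conj = p / (p - 1.0)                       # conjugate exponent p' = 1.5

phi = lambda x: abs(x) ** p / p              # (1/p) |x|^p
closed_form = lambda s: abs(s) ** p_conj / p_conj   # (1/p') |s|^p'

xs = [i / 1000 for i in range(-5000, 5001)]  # grid on [-5, 5], step 0.001

test_slopes = (-4.0, -1.0, 0.0, 0.5, 2.0, 4.0)
errors = [abs(conjugate_at(phi, s, xs) - closed_form(s)) for s in test_slopes]
max_error = max(errors)
```

The maximizer $\widehat{x}$ satisfies $|\widehat{x}| = |s|^{1/(p-1)}$, which stays inside the grid for the slopes used here; for larger slopes the grid would have to be widened.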


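PROPOSITION 4.4.12 If $\varphi : X \to \overline{\mathbb{R}}$ is a convex and lower semicontinuous function, then $\varphi$ admits a continuous affine minorant, i.e.,

$$\langle x_0^*, x \rangle_X - \xi_0 \le \varphi(x) \quad \forall\, x \in X,$$

for some $(x_0^*, \xi_0) \in X^* \times \mathbb{R}$.

PROOF Clearly we may assume that $\varphi$ is proper, i.e., $\varphi \in \Gamma_0(X)$.

Let $x_0 \in X$ and $\eta \in \mathbb{R}$ be such that $\eta < \varphi(x_0)$. Then $(x_0, \eta) \notin \operatorname{epi} \varphi$ and so by the strong separation theorem (see Theorem A.3.2), we can find $(x_0^*, \vartheta_0) \in X^* \times \mathbb{R}$, $(x_0^*, \vartheta_0) \ne (0, 0)$ and $\xi \in \mathbb{R}$, such that

$$\langle x_0^*, x \rangle_X + \vartheta_0 \lambda < \xi < \langle x_0^*, x_0 \rangle_X + \vartheta_0 \eta \quad \forall\, (x, \lambda) \in \operatorname{epi} \varphi. \tag{4.21}$$

Let $(x, \lambda) = \big( x, \varphi(x) \big)$. We have

$$\langle x_0^*, x \rangle_X + \vartheta_0 \varphi(x) < \xi < \langle x_0^*, x_0 \rangle_X + \vartheta_0 \eta,$$

so $\vartheta_0 < 0$. Without any loss of generality, we may assume that $\vartheta_0 = -1$. Then from (4.21), we have

$$\langle x_0^*, x \rangle_X - \varphi(x) < \xi_0 \quad \forall\, x \in X,$$

so

$$\langle x_0^*, x \rangle_X - \xi_0 < \varphi(x) \quad \forall\, x \in X.$$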

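PROPOSITION 4.4.13 For any function $\varphi : X \to \mathbb{R}^*$, we have $\varphi^{**} \le \varphi$.

PROOF From the Young-Fenchel inequality (see Proposition 4.4.4), we have

$$\varphi^{**}(x) = \sup_{x^* \in X^*} \big[ \langle x^*, x \rangle_X - \varphi^*(x^*) \big] \le \varphi(x) \quad \forall\, x \in X.$$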


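The next theorem is very important and determines when we have equality in Proposition 4.4.13.

THEOREM 4.4.14 If $\varphi : X \to \overline{\mathbb{R}}$ is a function, then $\varphi^{**} = \varphi$ if and only if $\varphi$ is convex and lower semicontinuous.

PROOF “$\Longrightarrow$”: Follows from Remark 4.4.2.

“$\Longleftarrow$”: If $\varphi \equiv +\infty$, then $\varphi^* \equiv -\infty$ and so $\varphi^{**} \equiv +\infty = \varphi$.

Therefore we may assume that $\varphi$ is proper. We know that we have $\varphi^{**} \le \varphi$. So we need to show that the opposite inequality also holds. To this end let $x \in X$ and $\mu \in \mathbb{R}$ be such that $\mu < \varphi(x)$. Then $(x, \mu) \notin \operatorname{epi} \varphi$ and so we can apply the strong separation theorem (see Theorem A.3.2) and find $(x^*, \beta) \in X^* \times \mathbb{R}$, $(x^*, \beta) \ne (0, 0)$ and $\delta > 0$, such that

$$\langle x^*, y \rangle_X + \beta \lambda \le \langle x^*, x \rangle_X + \beta \mu - \delta \quad \forall\, (y, \lambda) \in \operatorname{epi} \varphi.$$

Since $\lambda$ can increase to $+\infty$, from this inequality it follows that $\beta \le 0$. First suppose that $\beta < 0$. We have

$$\langle x^*, y \rangle_X + \beta \varphi(y) < \langle x^*, x \rangle_X + \beta \mu \quad \forall\, y \in X,$$

from which it follows that

$$(-\beta \varphi)^*(x^*) \le \langle x^*, x \rangle_X + \beta \mu.$$

Using Corollary 4.4.10(c), we obtain

$$-\beta\, \varphi^*\!\left( \frac{x^*}{-\beta} \right) \le \langle x^*, x \rangle_X + \beta \mu$$

and thus

$$\mu \le \Big\langle -\frac{x^*}{\beta}, x \Big\rangle_X - \varphi^*\!\left( -\frac{x^*}{\beta} \right) \le \varphi^{**}(x).$$

Because $\mu < \varphi(x)$ was arbitrary, we infer that

$$\varphi(x) \le \varphi^{**}(x)$$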

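as desired. Next assume that $\beta = 0$. We have

$$\langle x^*, y \rangle_X \le \langle x^*, x \rangle_X - \delta \quad \forall\, y \in \operatorname{dom} \varphi$$

and so, we see that $x \notin \operatorname{dom} \varphi$ and $\varphi(x) = +\infty$.

It is enough to show that also $\varphi^{**}(x) = +\infty$. Let $\eta \in \mathbb{R}$ be such that

$$\langle x^*, y \rangle_X < \eta < \langle x^*, x \rangle_X \quad \forall\, y \in \operatorname{dom} \varphi.$$

As $\varphi$ is bounded below by an affine function (see Proposition 4.4.12), we have that there exist $y^* \in X^*$ and $\vartheta \in \mathbb{R}$, such that

$$\langle y^*, y \rangle_X - \vartheta \le \varphi(y) \quad \forall\, y \in X.$$

So for all $\gamma > 0$, we have

$$\langle y^*, y \rangle_X - \vartheta + \gamma \big( \langle x^*, y \rangle_X - \eta \big) \le \varphi(y) \quad \forall\, y \in X.$$

Then

$$\langle y^* + \gamma x^*, y \rangle_X - \varphi(y) \le \vartheta + \gamma \eta \quad \forall\, y \in X$$

and so $\varphi^*(y^* + \gamma x^*) \le \vartheta + \gamma \eta$. Therefore

$$\langle y^*, x \rangle_X - \vartheta + \gamma \big( \langle x^*, x \rangle_X - \eta \big) \le \langle y^* + \gamma x^*, x \rangle_X - \varphi^*(y^* + \gamma x^*) \le \varphi^{**}(x).$$

Since $\eta < \langle x^*, x \rangle_X$ and $\gamma > 0$ was arbitrary, we see that the left hand side is arbitrarily large and so $\varphi^{**}(x) = +\infty$. Thus $\varphi(x) \le \varphi^{**}(x)$.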
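Theorem 4.4.14 and Proposition 4.4.13 can be observed on a small grid in $X = \mathbb{R}$: for the convex function $|x|$ the discrete biconjugate reproduces the function, while for a nonconvex "double well" it yields a convex minorant. In the sketch below the grids, the helper names and the two test functions are assumptions chosen so that all arithmetic is exact in floating point; it is a finite-grid illustration, not a general algorithm.

```python
def conjugate(f, points, duals):
    """Return {d: max over points of (d*p - f(p))} for each d in duals."""
    return {d: max(d * p - f(p) for p in points) for d in duals}

xs = [i / 2 for i in range(-6, 7)]        # primal grid on [-3, 3], step 0.5
slopes = [i / 2 for i in range(-4, 5)]    # dual grid on [-2, 2], step 0.5

def biconjugate(phi):
    """Grid version of phi**: conjugate twice, returning {x: phi**(x)}."""
    phi_star = conjugate(phi, xs, slopes)
    return conjugate(lambda s: phi_star[s], slopes, xs)

convex_fn = abs                                          # convex and continuous
double_well = lambda x: min(abs(x - 1.0), abs(x + 1.0))  # not convex
envelope = lambda x: max(abs(x) - 1.0, 0.0)              # its convex envelope

convex_bi = biconjugate(convex_fn)
well_bi = biconjugate(double_well)
```

On this grid `convex_bi` agrees with $|x|$ at every grid point, whereas `well_bi` lies below the double well and coincides with $\max(|x| - 1, 0)$.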


COROLLARY 4.4.15 If $C \subseteq X$ is a nonempty, closed and convex set, then $x \in C$ if and only if

$$\langle x^*, x \rangle_X \le \sigma_C(x^*) \quad \forall\, x^* \in X^*.$$

In Proposition 4.4.8, we saw that addition is the dual operation to infimal convolution. The next proposition shows that under some additional conditions, the converse is also true.

PROPOSITION 4.4.16 If $\varphi, \psi : X \to \overline{\mathbb{R}}$ are proper, convex functions and there exists a point $x \in \operatorname{dom} \varphi$, such that $\psi$ is continuous at $x$, then

$$\big( \varphi + \psi \big)^* = \varphi^* \oplus \psi^*.$$

Now we pass to the study of the subdifferential starting with convex subdifferentials. The convex subdifferential characterizes the local behaviour of convex functions, in a way which is analogous to that in which derivatives determine the local behaviour of smooth functions (see Section 4.1). In fact we can develop a subdifferential calculus which to a high degree parallels the differential calculus of smooth functions. The mathematical setting remains as before. Namely $X$ is a Hausdorff, locally convex vector space, $X^*$ is its topological dual. The spaces $X$ and $X^*$ are supplied with the $w(X, X^*)$ and $w(X^*, X)$ topologies respectively.

Let $\varphi : X \to \overline{\mathbb{R}}$ be a proper, convex function and $x \in \operatorname{dom} \varphi$, $h \in X$. The function

$$u_x(\lambda) \overset{df}{=} \frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda}$$

is increasing on $(0, +\infty)$. So we can make the following definition.

DEFINITION 4.4.17 Let $\varphi : X \to \overline{\mathbb{R}}$ be a proper, convex function and $x_0 \in \operatorname{dom} \varphi$. The directional derivative of $\varphi$ at $x_0$ in the direction $h \in X$ is defined by

$$\varphi'(x_0; h) \overset{df}{=} \inf_{\lambda > 0} \frac{\varphi(x_0 + \lambda h) - \varphi(x_0)}{\lambda} = \lim_{\lambda \searrow 0} \frac{\varphi(x_0 + \lambda h) - \varphi(x_0)}{\lambda}.$$

REMARK 4.4.18 Note that $\varphi'(x_0; h) \in \mathbb{R}^*$ and it is easy to see that $\varphi'(x_0; \cdot)$ is sublinear. Moreover, if $X$ is a Banach space and $\varphi'(x_0; \cdot) \in X^*$, then $\varphi$ is Gâteaux differentiable at $x_0$ and $\varphi'(x_0; \cdot) = \varphi_G'(x_0)$.
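The monotonicity of the difference quotient behind Definition 4.4.17, and the sublinearity noted in Remark 4.4.18, can be seen on a one-dimensional example. The sketch below assumes $\varphi(x) = |x| + x^2$ and $x_0 = 0$, for which the quotient in the directions $h = \pm 1$ equals $1 + \lambda$; the function, the step sizes and the helper name `quotient` are illustrative assumptions.

```python
def quotient(phi, x0, h, lam):
    """Difference quotient (phi(x0 + lam*h) - phi(x0)) / lam for lam > 0."""
    return (phi(x0 + lam * h) - phi(x0)) / lam

phi = lambda x: abs(x) + x * x            # convex, not differentiable at 0

lams = [2.0 ** (-k) for k in range(20)]   # step sizes 1, 1/2, 1/4, ... decreasing
q_plus = [quotient(phi, 0.0, 1.0, lam) for lam in lams]
q_minus = [quotient(phi, 0.0, -1.0, lam) for lam in lams]

# The quotient decreases as lam decreases, so the infimum over lam > 0
# coincides with the limit as lam tends to 0 from the right.
monotone = all(a >= b for a, b in zip(q_plus, q_plus[1:]))

# Approximations of phi'(0; 1) and phi'(0; -1) by the smallest step size.
d_plus, d_minus = q_plus[-1], q_minus[-1]
```

Both one-sided values are close to $1$, so $\varphi'(0; 1) + \varphi'(0; -1) \approx 2 > 0 = \varphi'(0; 0)$: the directional derivative is sublinear but not linear, which matches the fact that this $\varphi$ is not Gâteaux differentiable at $0$.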


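DEFINITION 4.4.19 Let $\varphi : X \to \overline{\mathbb{R}}$ be a proper function and $x_0 \in \operatorname{dom} \varphi$. The subdifferential of $\varphi$ at $x_0$ is the subset $\partial \varphi(x_0)$ (possibly empty) of $X^*$, defined by

$$\partial \varphi(x_0) \overset{df}{=} \big\{ x^* \in X^* : \langle x^*, y - x_0 \rangle_X \le \varphi(y) - \varphi(x_0) \text{ for all } y \in X \big\}.$$

REMARK 4.4.20 From this definition we see that

$$x^* \in \partial \varphi(x_0) \quad \text{if and only if} \quad x_0 \in \operatorname{argmin} (\varphi - x^*),$$

where

$$\operatorname{argmin} \psi \overset{df}{=} \big\{ x \in X : \psi(x) = \inf_X \psi \big\}.$$

The set $\partial \varphi(x)$ is always a closed and convex subset of $X^*$ and it can be empty (consider for example the subdifferential $\partial \varphi(x)$ when $x \notin \operatorname{dom} \varphi$). The domain of the subdifferential multifunction $\partial \varphi$ is the set

$$D(\partial \varphi) = \big\{ x \in X : \partial \varphi(x) \ne \emptyset \big\}.$$

The function $\varphi$ is said to be subdifferentiable at $x \in X$, if $x \in D(\partial \varphi)$. The elements of $\partial \varphi(x)$ are called subgradients of $\varphi$ at $x$. Using the epigraph of $\varphi$ we can better understand the geometric meaning of the subdifferential. So $\varphi$ is subdifferentiable at $x \in X$ and $x^* \in X^*$ is a subgradient of $\varphi$ at $x$ if and only if the graph of the continuous function

$$y \longmapsto \langle x^*, y - x \rangle_X + \varphi(x)$$

is a nonvertical supporting hyperplane to the set $\operatorname{epi} \varphi$ at $\big( x, \varphi(x) \big)$, that is the continuous affine function

$$l(y) \overset{df}{=} \langle x^*, y - x \rangle_X + \varphi(x)$$

is a minorant of $\varphi$ which is exact at $x$, i.e.,

$$l \le \varphi \quad \text{and} \quad l(x) = \varphi(x).$$

Since $l(x) \le \varphi^{**}(x) \le \varphi(x)$ (see Remark 4.4.2 and Proposition 4.4.13), we infer that if $\partial \varphi(x) \ne \emptyset$, then $\varphi(x) = \varphi^{**}(x)$. Consequently, if $\varphi(x) = \varphi^{**}(x)$, then $\partial \varphi(x) = \partial \varphi^{**}(x)$.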


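PROPOSITION 4.4.21 If $\varphi : X \to \overline{\mathbb{R}}$ is a function, then

$$x^* \in \partial \varphi(x) \iff \varphi(x) + \varphi^*(x^*) = \langle x^*, x \rangle_X.$$

PROOF “$\Longrightarrow$”: From the definition of the subdifferential, we have

$$\langle x^*, y \rangle_X - \varphi(y) \le \langle x^*, x \rangle_X - \varphi(x) \quad \forall\, y \in X,$$

so $\varphi^*(x^*) + \varphi(x) \le \langle x^*, x \rangle_X$. Since the opposite inequality is always true (see the Young-Fenchel inequality; Proposition 4.4.4), we conclude that $\varphi(x) + \varphi^*(x^*) = \langle x^*, x \rangle_X$.

“$\Longleftarrow$”: We have

$$\langle x^*, x \rangle_X - \varphi(x) = \varphi^*(x^*) \ge \langle x^*, y \rangle_X - \varphi(y) \quad \forall\, y \in X.$$

Therefore

$$\langle x^*, y - x \rangle_X \le \varphi(y) - \varphi(x) \quad \forall\, y \in X,$$

hence $x^* \in \partial \varphi(x)$.

COROLLARY 4.4.22 If $\varphi : X \to \overline{\mathbb{R}}$ and $x^* \in \partial \varphi(x)$, then $x \in \partial \varphi^*(x^*)$.

PROOF Since $x^* \in \partial \varphi(x)$, we have

$$\varphi^*(x^*) + \varphi(x) = \langle x^*, x \rangle_X$$

(see Proposition 4.4.21). Then since $\varphi^{**} \le \varphi$ (see Proposition 4.4.13), we obtain

$$\varphi^*(x^*) + \varphi^{**}(x) \le \langle x^*, x \rangle_X.$$

A new appeal to Proposition 4.4.21 gives that $x \in \partial \varphi^*(x^*)$.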


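COROLLARY 4.4.23 If $\varphi \in \Gamma_0(X)$, then

$$x^* \in \partial \varphi(x) \iff x \in \partial \varphi^*(x^*).$$

PROOF Since $\varphi \in \Gamma_0(X)$, we have $\varphi = \varphi^{**}$ (see Theorem 4.4.14). So from Corollary 4.4.22 we conclude the desired equivalence.

Before continuing with the investigation of the subdifferentials in the context of convex functions, let us give some examples of subdifferentials.

EXAMPLE 4.4.24 (a) Let $\varphi : \mathbb{R} \to \overline{\mathbb{R}}$ be a proper, convex function and $x \in \operatorname{int} \operatorname{dom} \varphi$. Then it is easily seen that

$$\partial \varphi(x) = \big[ \varphi_-'(x), \varphi_+'(x) \big].$$

(b) Let $X$ be a Banach space and $\varphi(x) \overset{df}{=} \|x\|_X$. If $x \ne 0$, then

$$\partial \varphi(x) = \big\{ x^* \in X^* : \|x^*\|_{X^*} = 1,\ \langle x^*, x \rangle_X = \|x\|_X \big\}.$$

Indeed, let $x^* \in X^*$ be such that

$$\|x^*\|_{X^*} = 1 \quad \text{and} \quad \langle x^*, x \rangle_X = \|x\|_X.$$

Then

$$\langle x^*, y \rangle_X \le \|y\|_X \quad \forall\, y \in X$$

and so

$$\langle x^*, y - x \rangle_X \le \|y\|_X - \|x\|_X,$$

hence $x^* \in \partial \varphi(x)$. On the other hand, let $x^* \in \partial \varphi(x)$. Then, taking $y = 0$ and $y = 2x$ in the definition of the subdifferential, we obtain

$$-\|x\|_X \ge -\langle x^*, x \rangle_X \quad \text{and} \quad \|x\|_X = 2\|x\|_X - \|x\|_X \ge \langle x^*, x \rangle_X,$$

from which we infer that $\langle x^*, x \rangle_X = \|x\|_X$. Also

$$\langle x^*, \lambda y \rangle_X \le \|x + \lambda y\|_X - \|x\|_X \quad \forall\, y \in X,\ \lambda > 0,$$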

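hence

$$\langle x^*, y \rangle_X \le \Big\| \frac{1}{\lambda} x + y \Big\|_X - \frac{1}{\lambda} \|x\|_X.$$

Let $\lambda \to +\infty$, to obtain

$$\langle x^*, y \rangle_X \le \|y\|_X,$$

from which it follows that $\|x^*\|_{X^*} \le 1$. But since $\langle x^*, x \rangle_X = \|x\|_X$, we conclude that $\|x^*\|_{X^*} = 1$. If $x = 0$, then

$$\partial \varphi(0) = \overline{B}_1^* = \big\{ x^* \in X^* : \|x^*\|_{X^*} \le 1 \big\}.$$

Indeed note that

$$x^* \in \partial \varphi(0) \iff \Big[ \langle x^*, x \rangle_X \le \|x\|_X \quad \forall\, x \in X \Big]$$

and the last inequality is equivalent to saying that $\|x^*\|_{X^*} \le 1$.

(c) Let $C$ be a closed, convex set in $X$ and $\varphi(x) \overset{df}{=} i_C(x)$. Then for $x \in C$ we have

$$\partial \varphi(x) = N_C(x) \overset{df}{=} \big\{ x^* \in X^* : \langle x^*, c - x \rangle_X \le 0 \text{ for all } c \in C \big\} = \big\{ x^* \in X^* : \langle x^*, x \rangle_X = \sigma_C(x^*) \big\}.$$

The set $\partial \varphi(x) = N_C(x)$ is a nonempty (because $0 \in \partial \varphi(x) = N_C(x)$), closed and convex cone in $X^*$, known as the normal cone to $C$ at $x$. It generalizes the notion of normal space (see Definition A.1.12(b)) in differential geometry. If $x \notin C$, then $\partial \varphi(x) = N_C(x) = \emptyset$. So $D(\partial \varphi) = C$ and

$$\partial \varphi(x) = N_C(x) = \{0\} \quad \forall\, x \in \operatorname{int} C.$$

If $C = V$ is a linear subspace of $X$, then

$$\partial \varphi(x) = N_V(x) = V^{\perp} = \big\{ x^* \in X^* : \langle x^*, v \rangle_X = 0 \text{ for all } v \in V \big\} \quad \forall\, x \in V.$$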
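Parts (b) and (c) of Example 4.4.24 can be reproduced on the real line by testing the defining inequality of Definition 4.4.19 on a finite grid. In the sketch below the helper `subgradients`, the grids and the interval $C = [1, 2]$ are assumptions made for illustration; only finitely many candidate slopes and test points are examined, so the output is a grid sample of the subdifferential rather than the set itself.

```python
INF = float("inf")

def subgradients(phi, x0, ys, candidates):
    """Candidate slopes s with s*(y - x0) <= phi(y) - phi(x0) for all grid points y."""
    return [s for s in candidates
            if all(s * (y - x0) <= phi(y) - phi(x0) for y in ys)]

ys = [i / 4 for i in range(-40, 41)]         # test points in [-10, 10]
candidates = [i / 4 for i in range(-8, 9)]   # candidate slopes in [-2, 2]

# (b) phi(x) = |x|: the closed unit ball at 0, a single norming functional elsewhere.
at_zero = subgradients(abs, 0.0, ys, candidates)
at_two = subgradients(abs, 2.0, ys, candidates)

# (c) indicator of the (assumed) interval C = [1, 2]: normal cone at a
# boundary point and at an interior point.
indicator_C = lambda y: 0.0 if 1.0 <= y <= 2.0 else INF
normal_cone_boundary = subgradients(indicator_C, 2.0, ys, candidates)
normal_cone_interior = subgradients(indicator_C, 1.5, ys, candidates)
```

Since $0$ belongs to `at_zero`, the point $0$ minimizes $|x|$, in line with the characterization $0 \in \partial \varphi(x)$ of minimizers; `normal_cone_boundary` samples the half-line $[0, +\infty)$ and `normal_cone_interior` reduces to $\{0\}$.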


For convex functions we have an easy criterion for subdifferentiability at $x \in X$.

PROPOSITION 4.4.25 If $X$ is a Banach space and $\varphi : X \to \overline{\mathbb{R}}$ is a convex function which is continuous at $x \in X$, then $\partial \varphi(x) \ne \emptyset$ and $\partial \varphi(x)$ is $w^*$-compact and convex in $X^*$.

PROOF From Theorem 4.2.3, we know that $\operatorname{int} \operatorname{epi} \varphi \ne \emptyset$. Since $\big( x, \varphi(x) \big)$ belongs to the boundary of $\operatorname{epi} \varphi$, we can apply the weak separation theorem (see Theorem A.3.1) and find $(x^*, \eta) \in X^* \times \mathbb{R}$, with $(x^*, \eta) \ne (0, 0)$, such that

$$\eta \big( \varphi(x) - \lambda \big) \le \langle -x^*, x - y \rangle_X \quad \forall\, (y, \lambda) \in \operatorname{epi} \varphi. \tag{4.22}$$

Since for fixed $y \in \operatorname{dom} \varphi$, $\lambda$ can increase up to $+\infty$, from (4.22), we infer that $\eta \ge 0$. If $\eta = 0$, then

$$\langle -x^*, x - y \rangle_X \ge 0 \quad \forall\, y \in \operatorname{dom} \varphi.$$

But $x \in \operatorname{int} \operatorname{dom} \varphi$ (see Theorem 4.2.3). So $x^* = 0$, a contradiction. So $\eta > 0$ and we take $\eta = 1$. Then from (4.22) with $\lambda = \varphi(y)$, we have

$$\varphi(x) - \varphi(y) \le \langle -x^*, x - y \rangle_X,$$

so $-x^* \in \partial \varphi(x) \ne \emptyset$. From Theorems 4.2.3 and 4.2.7, we know that there exists $r > 0$, such that $\varphi|_{B_r(x)}$ is Lipschitz continuous. So for every $x^* \in \partial \varphi(x)$ we have

$$\langle x^*, u \rangle_X \le \varphi(x + u) - \varphi(x) \le k \|u\|_X \quad \forall\, u \in \overline{B}_r(0),$$

for some $k > 0$ and so $\|x^*\|_{X^*} \le k$. By Alaoglu's theorem (see Theorem A.3.9) and since $\partial \varphi(x)$ is clearly $w^*$-closed, we conclude that it is $w^*$-compact and convex.

REMARK 4.4.26 The result is actually true in the more general context of dual pairs of locally convex spaces. However, since the material of Section 4.2 was developed in the context of Banach spaces and to avoid introducing additional functional analytic material, we have stated the result in Banach spaces.


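In fact for a continuous, convex function $\varphi$ we can describe the subdifferential completely.

PROPOSITION 4.4.27 If $X$ is a Banach space and $\varphi : X \to \overline{\mathbb{R}}$ is a convex function which is continuous at $x \in X$, then

$$\sigma_{\partial \varphi(x)}(h) = \varphi'(x; h) \quad \forall\, h \in X.$$

PROOF Let

$$\psi(h) \overset{df}{=} \varphi'(x; h) \quad \forall\, h \in X.$$

Since $\varphi$ is continuous at $x \in X$, we have $\partial \varphi(x) \ne \emptyset$ (see Proposition 4.4.25). So we have

$$\langle x^*, h \rangle_X \le \psi(h) \le \varphi(x + h) - \varphi(x) \quad \forall\, h \in X,\ x^* \in \partial \varphi(x),$$

so $\psi$ is finite everywhere, hence continuous on $X$. Also using Proposition 4.4.9, we see that the conjugate of the function

$$\psi_\lambda(h) \overset{df}{=} \frac{1}{\lambda} \big[ \varphi(x + \lambda h) - \varphi(x) \big] \quad \forall\, \lambda > 0$$

is the function

$$\psi_\lambda^*(x^*) = \frac{1}{\lambda} \big[ \varphi^*(x^*) + \varphi(x) - \langle x^*, x \rangle_X \big] \quad \forall\, \lambda > 0.$$

Since $\psi = \inf_{\lambda > 0} \psi_\lambda$, we have that

$$\psi^* = \sup_{\lambda > 0} \psi_\lambda^*$$

(see Remark 4.4.7). Therefore

$$\psi^*(x^*) = \sup_{\lambda > 0} \frac{1}{\lambda} \big[ \varphi^*(x^*) + \varphi(x) - \langle x^*, x \rangle_X \big].$$

By Proposition 4.4.4 the bracket is nonnegative and by Proposition 4.4.21 it vanishes exactly when $x^* \in \partial \varphi(x)$. So we have

$$\psi^*(x^*) = \begin{cases} 0 & \text{if } x^* \in \partial \varphi(x), \\ +\infty & \text{otherwise}, \end{cases}$$

i.e., $\psi^* = i_{\partial \varphi(x)}$ and so $\psi^{**} = \psi = \sigma_{\partial \varphi(x)}$ (see Theorem 4.4.14).

REMARK 4.4.28 Again the result remains valid in the framework of dual pairs of locally convex spaces.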


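Next we show that for convex functions the case of Gâteaux differentiability is essentially the same as that of uniqueness of the subgradient.

PROPOSITION 4.4.29 Let $X$ be a Banach space and let $\varphi : X \to \overline{\mathbb{R}}$ be a proper, convex function.

(a) If $\varphi$ is Gâteaux differentiable at $x$, then $x \in D(\partial \varphi)$ and $\partial \varphi(x) = \big\{ \varphi_G'(x) \big\}$.

(b) If $\varphi$ is continuous at $x$ and $\partial \varphi(x)$ is a singleton, then $\varphi$ is Gâteaux differentiable at $x$ and $\partial \varphi(x) = \big\{ \varphi_G'(x) \big\}$.

PROOF (a) Due to the convexity and Gâteaux differentiability of $\varphi$ at $x$, we have

$$\langle \varphi_G'(x), h \rangle_X \le \frac{1}{\lambda} \big[ \varphi(x + \lambda h) - \varphi(x) \big] \le \varphi(x + h) - \varphi(x) \quad \forall\, \lambda \in (0, 1),\ h \in X,$$

so $\varphi_G'(x) \in \partial \varphi(x)$. Let $x^* \in X^*$ be any element of $\partial \varphi(x)$. We have

$$\langle x^*, h \rangle_X \le \frac{1}{\lambda} \big[ \varphi(x + \lambda h) - \varphi(x) \big] \quad \forall\, \lambda > 0,\ h \in X,$$

so

$$\langle x^*, h \rangle_X \le \langle \varphi_G'(x), h \rangle_X \quad \forall\, h \in X$$

and thus $x^* = \varphi_G'(x)$, i.e., $\partial \varphi(x) = \big\{ \varphi_G'(x) \big\}$.

(b) Since $\varphi$ is convex, we have

$$\varphi(x) + \lambda \varphi'(x; h) \le \varphi(x + \lambda h) \quad \forall\, \lambda \in \mathbb{R},\ h \in X.$$

So the straight line

$$L \overset{df}{=} \big\{ \big( x + \lambda h, \varphi(x) + \lambda \varphi'(x; h) \big) : \lambda \in \mathbb{R} \big\}$$

does not intersect $\operatorname{int} \operatorname{epi} \varphi \ne \emptyset$ (see Theorem 4.2.3). Then by the weak separation theorem (see Theorem A.3.1), we can find a closed hyperplane $H$ containing the line $L$, such that $H \cap \operatorname{int} \operatorname{epi} \varphi = \emptyset$.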


The hyperplane $H$ is the graph of a continuous affine function $l$ on $X$, such that $l(x) = \varphi(x)$. Since by hypothesis $\partial \varphi(x) = \{x^*\}$, the slope of $l$ is $x^*$ and because $L \subseteq H$, we have

$$\varphi'(x; h) = \langle x^*, h \rangle_X \quad \forall\, h \in X.$$

Thus $\varphi$ is Gâteaux differentiable at $x$ and $\partial \varphi(x) = \big\{ \varphi_G'(x) \big\}$.

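The next proposition explains the central role of the subdifferential in optimization theory. It is a direct consequence of Definition 4.4.19.

PROPOSITION 4.4.30 If $\varphi : X \to \overline{\mathbb{R}}$ is a proper function, then $\varphi$ attains its minimum at $x \in \operatorname{dom} \varphi$ if and only if $0 \in \partial \varphi(x)$.

Next we will establish some basic rules of the subdifferential calculus. We start with two straightforward observations. Here $\varphi, \psi : X \to \overline{\mathbb{R}}$ are proper functions. We have

$$\partial (\lambda \varphi)(x) = \lambda\, \partial \varphi(x) \quad \forall\, \lambda > 0,\ x \in X \tag{4.23}$$

and

$$\partial \varphi(x) + \partial \psi(x) \subseteq \partial (\varphi + \psi)(x) \quad \forall\, x \in \operatorname{dom} \varphi \cap \operatorname{dom} \psi. \tag{4.24}$$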

The next proposition provides a simple situation where equality in (4.24) is realized. PROPOSITION 4.4.31 If ϕ, ψ : X −→ R are proper, convex functions and there exists x b ∈ dom ϕ ∩ dom ψ where ϕ is continuous, then ∂ϕ(x) + ∂ψ(x) = ∂(ϕ + ψ)(x) ∀ x ∈ X. PROOF

Because of (4.24), we need to show that ∂ϕ(x) + ∂ψ(x) ⊇ ∂(ϕ + ψ)(x)

∀ x ∈ X.

(4.25)


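To this end let $x^* \in \partial(\varphi+\psi)(x)$. Then $x \in \mathrm{dom}\,\varphi \cap \mathrm{dom}\,\psi$ and
\[
\psi(x) - \psi(y) \leq \varphi(y) - \varphi(x) - \langle x^*, y-x\rangle_X = g(y) \quad \forall\, y \in X.
\]
We introduce the following two sets
\[
C_1 \stackrel{df}{=} \mathrm{epi}\, g \quad \text{and} \quad C_2 \stackrel{df}{=} \big\{ (y,\mu) \in X \times \mathbb{R} : \mu \leq \psi(x) - \psi(y) \big\}.
\]
Both sets are convex and by virtue of Theorem 4.2.3, $\mathrm{int}\, C_1 \neq \emptyset$. Also $\mathrm{int}\, C_1 \cap C_2 = \emptyset$. Indeed,
\[
g(y) \leq \mu \leq \psi(x) - \psi(y) \leq g(y) \quad \forall\, (y,\mu) \in \mathrm{int}\, C_1 \cap C_2
\]
and so $g(y) = \mu$. Because $(y,\mu) \in \mathrm{int}\, C_1$, we have that $(y, \mu-\varepsilon) \in C_1$ for $\varepsilon > 0$ small and so $g(y) \leq \mu - \varepsilon$, a contradiction. Since $\mathrm{int}\, C_1 \cap C_2 = \emptyset$, we can apply the weak separation theorem (see Theorem A.3.1) and produce $(z^*,\eta) \in X^* \times \mathbb{R}$, $(z^*,\eta) \neq (0,0)$, such that
\[
\langle z^*, z\rangle_X + \eta\lambda \leq \langle z^*, y\rangle_X + \eta\mu \quad \forall\, (z,\lambda) \in C_1,\ (y,\mu) \in C_2 \tag{4.26}
\]
and the inequality is strict if $(z,\lambda) \in \mathrm{int}\, C_1$. Note that $(x,0) \in C_2$. Then from (4.26) and since $\lambda$ can increase to $+\infty$, we obtain $\eta \leq 0$. If $\eta = 0$, then
\[
\langle z^*, z\rangle_X \leq \langle z^*, x\rangle_X \quad \forall\, z \in \mathrm{dom}\, g.
\]
But since $g$ is continuous at $x$, $\mathrm{dom}\, g$ is a neighbourhood of $x$, hence $z^* = 0$, a contradiction to the fact that $(z^*,\eta) \neq (0,0)$. So $\eta < 0$ and we may assume that $\eta = -1$. Then from (4.26), we have
\[
\langle z^*, z\rangle_X - g(z) \leq \langle z^*, x\rangle_X \leq \langle z^*, y\rangle_X - \big(\psi(x) - \psi(y)\big) \quad \forall\, z \in \mathrm{dom}\, g,\ y \in \mathrm{dom}\,\psi.
\]
From the second inequality we have that $-z^* \in \partial\psi(x)$, while from the first we have that $x^* + z^* \in \partial\varphi(x)$. Then
\[
x^* = x^* + z^* + (-z^*) \in \partial\varphi(x) + \partial\psi(x)
\]
and we have proved (4.25).

REMARK 4.4.32 Of course the result is also true for any family $\{\varphi_i\}_{i=1}^n$ of proper, convex functions on $X$, such that there exists $x \in \bigcap_{i=1}^n \mathrm{dom}\,\varphi_i$, where all but one of the functions are continuous.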
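The sum rule of Proposition 4.4.31 can be illustrated on $\mathbb{R}$, where the subdifferential of a finite convex function at a point is the interval between its left and right derivatives. The sketch below is an illustration under assumptions that are not in the original text: the functions `phi`, `psi`, the step `t` and the helper names are all hypothetical choices, and the one-sided derivatives are approximated by difference quotients.

```python
def one_sided_slopes(phi, x, t=1e-6):
    """Approximate the interval [phi'_-(x), phi'_+(x)] by one-sided difference quotients."""
    left = (phi(x) - phi(x - t)) / t
    right = (phi(x + t) - phi(x)) / t
    return left, right

def phi(x):
    return abs(x)          # continuous everywhere, kink at 0

def psi(x):
    return abs(x - 1.0)    # kink at 1, differentiable at 0

def total(x):
    return phi(x) + psi(x)

def interval_sum(a, b):
    """Minkowski sum of two closed intervals given as (low, high)."""
    return a[0] + b[0], a[1] + b[1]

# Left-hand side: sum of the two subdifferentials at x = 0.
lhs = interval_sum(one_sided_slopes(phi, 0.0), one_sided_slopes(psi, 0.0))
# Right-hand side: subdifferential of the sum at x = 0.
rhs = one_sided_slopes(total, 0.0)
```

Up to rounding, both `lhs` and `rhs` describe the interval $[-2, 0]$, that is $[-1,1] + \{-1\}$, as the proposition predicts for this pair.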


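PROPOSITION 4.4.33 If $A \in \mathcal{L}(X;Y)$ and $\varphi : Y \longrightarrow \overline{\mathbb{R}}$ is a proper function, then
\[
A^*\partial\varphi(Ax) \subseteq \partial(\varphi\circ A)(x) \quad \forall\, x \in X
\]
and equality holds if in addition $\varphi$ is convex and continuous at a point in the range of $A$.

PROOF The inclusion follows at once from the definitions. Let us prove that equality holds when $\varphi$ is convex and continuous at a point in the range of $A$. So let $x^* \in \partial(\varphi\circ A)(x)$. We have
\[
\langle x^*, z-x\rangle_X + (\varphi\circ A)(x) \leq (\varphi\circ A)(z) \quad \forall\, z \in X. \tag{4.27}
\]
Let
\[
L \stackrel{df}{=} \big\{ \big(Az,\ \langle x^*, z-x\rangle_X + (\varphi\circ A)(x)\big) \in Y \times \mathbb{R} : z \in X \big\}.
\]
This is an affine subspace of $Y \times \mathbb{R}$ and because of (4.27), $L$ and $\mathrm{epi}\,\varphi$ have only boundary points in common, that is $L \cap \mathrm{int}\,\mathrm{epi}\,\varphi = \emptyset$ (note that by Theorem 4.2.3, $\mathrm{int}\,\mathrm{epi}\,\varphi \neq \emptyset$). So we can apply the weak separation theorem (see Theorem A.3.1) and find a closed hyperplane $H$ containing $L$, such that $H \cap \mathrm{int}\,\mathrm{epi}\,\varphi = \emptyset$. The hyperplane $H$ is the graph of a continuous affine function
\[
l(y) \stackrel{df}{=} \langle y^*, y\rangle_Y + \mu \quad \forall\, y \in Y,
\]
with $(y^*,\mu) \in Y^* \times \mathbb{R}$. Since $H \supseteq L$, we have
\[
\langle y^*, Az\rangle_Y + \mu = \langle x^*, z-x\rangle_X + (\varphi\circ A)(x) \quad \forall\, z \in X,
\]
so taking $z = 0$, we have
\[
\mu = (\varphi\circ A)(x) - \langle x^*, x\rangle_X \quad \text{and} \quad \langle y^*, Az\rangle_Y = \langle x^*, z\rangle_X \quad \forall\, z \in X.
\]
From the second equality, we infer that $x^* = A^*y^*$. Also since $H \cap \mathrm{int}\,\mathrm{epi}\,\varphi = \emptyset$, we have
\[
\langle y^*, y\rangle_Y + (\varphi\circ A)(x) - \langle A^*y^*, x\rangle_X \leq \varphi(y) \quad \forall\, y \in Y,
\]
so
\[
\langle y^*, y - Ax\rangle_Y + (\varphi\circ A)(x) \leq \varphi(y) \quad \forall\, y \in Y
\]
and thus $y^* \in \partial\varphi\big(A(x)\big)$. We infer that
\[
\partial(\varphi\circ A)(x) \subseteq A^*\partial\varphi(Ax) \quad \forall\, x \in X.
\]
Therefore equality must hold.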

Next we study the multifunction $\partial\varphi : X \longrightarrow 2^{X^*}$. The first result explains the connection between subdifferentials and maximal monotone maps and it generalizes the elementary fact that if $\varphi : \mathbb{R} \longrightarrow \mathbb{R}$ is a differentiable, convex function, then $\varphi'$ is increasing.

THEOREM 4.4.34 If $X$ is a reflexive Banach space and $\varphi \in \Gamma_0(X)$, then $\partial\varphi : X \longrightarrow 2^{X^*}$ is a maximal monotone map.

PROOF Using Troyanski's renorming theorem (see Theorem A.3.23), we may assume that both $X$ and $X^*$ are locally uniformly convex (see Definition A.3.21). Let $\mathcal{F} : X \longrightarrow X^*$ be the duality map of $X$. By Proposition 3.2.27, $\mathcal{F}$ is a homeomorphism. Now, directly from the definition, we see that $\partial\varphi$ is monotone. So by virtue of Theorem 3.2.29, to prove the maximal monotonicity of $\partial\varphi$, it suffices to show that
\[
R\big(\partial\varphi + \mathcal{F}\big) = X^*. \tag{4.28}
\]
To this end let $x^* \in X^*$ and consider the function $\psi : X \longrightarrow \overline{\mathbb{R}}$, defined by
\[
\psi(x) \stackrel{df}{=} \frac{1}{2}\|x\|_X^2 + \varphi(x) - \langle x^*, x\rangle_X \quad \forall\, x \in X.
\]
Evidently $\psi \in \Gamma_0(X)$ and
\[
\psi(x) \longrightarrow +\infty \quad \text{as } \|x\|_X \to +\infty
\]
(recall that $\varphi$ is bounded below by a continuous affine function; see Proposition 4.4.12). So by the Weierstrass theorem, we can find $x_0 \in \mathrm{dom}\,\psi$, such that
\[
\psi(x_0) = \inf_X \psi.
\]
Then from Proposition 4.4.30, we have that $0 \in \partial\psi(x_0)$. Using Proposition 4.4.31 (see also Remark 4.4.32), we have
\[
\partial\psi(x_0) = \mathcal{F}(x_0) + \partial\varphi(x_0) - x^*
\]
(recall that $\partial\big(\tfrac{1}{2}\|\cdot\|_X^2\big)(x_0) = \mathcal{F}(x_0)$; see Example 3.2.20(d)). Hence
\[
0 \in \partial\varphi(x_0) + \mathcal{F}(x_0) - x^*
\]
and so
\[
x^* \in \partial\varphi(x_0) + \mathcal{F}(x_0).
\]
Because $x^* \in X^*$ was arbitrary, we conclude that (4.28) holds and thus $\partial\varphi$ is maximal monotone.

REMARK 4.4.35 The result is actually true for $X$ being any Banach space. For a proof of the result in this general case we refer to Rockafellar (1970b) (see also Phelps (1993, p. 59)).

Now we obtain some additional properties which characterize the subdifferentials within the class of maximal monotone maps.

DEFINITION 4.4.36 Let $X$ be a Banach space and $A : X \longrightarrow 2^{X^*}$. We say that $A$ is $n$-cyclically monotone provided that
\[
\sum_{k=0}^{n} \langle x_k^*, x_k - x_{k+1}\rangle_X \geq 0,
\]
whenever $n \geq 1$ and $x_0, x_1, \ldots, x_n \in X$, $x_{n+1} = x_0$ and
\[
x_k^* \in A(x_k) \quad \forall\, k \in \{0, 1, \ldots, n\}.
\]
We say that $A$ is cyclically monotone, if it is $n$-cyclically monotone for every $n \geq 2$. The map $A$ is maximal cyclically monotone, if its graph is not properly included in the graph of a cyclically monotone map.


REMARK 4.4.37 Clearly a 2-cyclically monotone map is monotone. So every cyclically monotone map is monotone.

PROPOSITION 4.4.38 Every monotone map $f : \mathbb{R} \longrightarrow 2^{\mathbb{R}}$ is cyclically monotone.

PROOF Let $x_0, x_1, \ldots, x_n \in D(f)$ and
\[
x_k^* \in f(x_k) \quad \forall\, k \in \{0, 1, \ldots, n\}.
\]
We may assume that $x_k \leq x_{k+1}$ for all $k \in \{0, 1, \ldots, n-1\}$. Then $x_k^* \leq x_{k+1}^*$ for all $k \in \{0, 1, \ldots, n-1\}$ and, since $x_n - x_0 = -\sum_{k=0}^{n-1}(x_k - x_{k+1})$, we have
\[
\sum_{k=0}^{n} x_k^*(x_k - x_{k+1}) = \sum_{k=0}^{n-1} x_k^*(x_k - x_{k+1}) + x_n^*(x_n - x_0) = \sum_{k=0}^{n-1} (x_k^* - x_n^*)(x_k - x_{k+1}) \geq 0
\]
(recall that $x_{n+1} = x_0$).

Directly from the definition, we see that if $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, convex function, then $\partial\varphi$ is cyclically monotone. Moreover, if $\varphi \in \Gamma_0(X)$, then by virtue of Theorem 4.4.34 and Remark 4.4.35, we see that $\partial\varphi$ is maximal cyclically monotone. It turns out that subdifferentials are the only maximal cyclically monotone maps.

THEOREM 4.4.39 If $X$ is a Banach space, then a map $A : X \longrightarrow 2^{X^*}$ is maximal cyclically monotone if and only if there exists $\varphi \in \Gamma_0(X)$, such that $A = \partial\varphi$.

PROOF

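"$\Longrightarrow$": Let us fix $(x_0, x_0^*) \in \mathrm{Gr}\, A$ and for every $x \in X$ we define
\[
\varphi(x) \stackrel{df}{=} \sup \Big\{ \sum_{k=0}^{n-1} \langle x_k^*, x_{k+1} - x_k\rangle_X + \langle x_n^*, x - x_n\rangle_X : (x_k, x_k^*) \in \mathrm{Gr}\, A,\ k \in \{1, \ldots, n\},\ n \geq 1 \Big\}.
\]
Because $\varphi$ is the supremum of continuous affine functions, it follows that $\varphi$ is convex and lower semicontinuous. Moreover, since
\[
\sum_{k=0}^{n} \langle x_k^*, x_{k+1} - x_k\rangle_X \leq 0 \quad \text{with } x_{n+1} = x_0
\]
(due to the cyclical monotonicity of $A$), taking $x = x_0$ gives $\varphi(x_0) \leq 0$ and it follows that $\varphi$ is proper, that is $\varphi \in \Gamma_0(X)$. Let $(x, x^*) \in \mathrm{Gr}\, A$ and $y \in X$. Since in the definition of $\varphi$ the integer $n$ is arbitrary, we have
\[
\varphi(y) \geq \sum_{k=0}^{n-1} \langle x_k^*, x_{k+1} - x_k\rangle_X + \langle x_n^*, x - x_n\rangle_X + \langle x^*, y - x\rangle_X
\]
(i.e., we have added the point $(x, x^*) \in \mathrm{Gr}\, A$ in the definition of $\varphi$). Hence, taking the supremum over the chains ending at $x_n$, we obtain
\[
\varphi(y) \geq \varphi(x) + \langle x^*, y - x\rangle_X \quad \forall\, y \in X,
\]
so $x^* \in \partial\varphi(x)$. Since $(x, x^*) \in \mathrm{Gr}\, A$ was arbitrary, we infer that $\mathrm{Gr}\, A \subseteq \mathrm{Gr}\, \partial\varphi$. Due to the maximality of $A$, we conclude that $\mathrm{Gr}\, A = \mathrm{Gr}\, \partial\varphi$, hence $A = \partial\varphi$.

"$\Longleftarrow$": See the remark before the statement of the theorem.

REMARK 4.4.40 In fact it can be shown that $\varphi \in \Gamma_0(X)$ is unique up to an additive constant; see Rockafellar (1970b).

COROLLARY 4.4.41 Any maximal monotone map $f : \mathbb{R} \longrightarrow 2^{\mathbb{R}}$ has the form
\[
f(x) = \big[ g'_-(x), g'_+(x) \big],
\]
with $g \in \Gamma_0(\mathbb{R})$.

We can use Theorem 4.4.39 to characterize self-adjoint positive operators in a Hilbert space.

PROPOSITION 4.4.42 If $H$ is a Hilbert space and $A : H \supseteq D(A) \longrightarrow H$ is a linear maximal monotone operator, then $A$ is maximal cyclically monotone if and only if $A$ is self-adjoint.

PROOF "$\Longrightarrow$": By virtue of Theorem 4.4.39, we can find $\varphi \in \Gamma_0(H)$, such that $A = \partial\varphi$. Because $A(0) = 0$ and using Remark 4.4.40, we may assume that $\varphi(0) = 0$. For $x \in D(A)$, let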

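\[
g(t) \stackrel{df}{=} \varphi(tx) \quad \forall\, t \in [0,1].
\]
From Proposition 4.4.33, we have that
\[
\partial g(t) = \big(\partial\varphi(tx), x\big)_H.
\]
Using the definition of subdifferential, we infer that
\[
\big| g(t) - g(s) \big| \leq \big(A(x), x\big)_H\, |t - s| \quad \forall\, t, s \in [0,1]
\]
and so $g$ is differentiable for almost all $t \in [0,1]$ and
\[
\frac{d}{dt} g(t) = t\, \big(A(x), x\big)_H.
\]
Then we have
\[
g(1) - g(0) = \int_0^1 g'(t)\, dt = \Big( \int_0^1 t\, dt \Big) \big(A(x), x\big)_H = \frac{1}{2} \big(A(x), x\big)_H,
\]
so
\[
\varphi(x) = \frac{1}{2} \big(A(x), x\big)_H \quad \forall\, x \in D(A).
\]
Then via an easy calculation, we obtain that
\[
\big(\partial\varphi(x), y\big)_H = \frac{1}{2} \Big[ \big(A(x), y\big)_H + \big(x, A(y)\big)_H \Big] \quad \forall\, x, y \in D(A).
\]
Since
\[
\big(\partial\varphi(x), y\big)_H = \big(A(x), y\big)_H,
\]
we obtain
\[
\big(A(x), y\big)_H = \big(x, A(y)\big)_H \quad \forall\, x, y \in D(A)
\]
and so $A \subseteq A^*$. But $A^*$ is monotone (see Theorem 3.2.58) and $A$ is maximal. Therefore $A = A^*$ and we conclude that $A$ is self-adjoint.

"$\Longleftarrow$": Since $A$ is self-adjoint, maximal monotone, there exists a square root $A^{\frac{1}{2}}$ of $A$ with the same properties (see Kato (1976, p. 281)). So $A^{\frac{1}{2}}$ is closed (see Theorem 3.2.58) and if we set
\[
\varphi(x) \stackrel{df}{=} \begin{cases} \frac{1}{2} \big\| A^{\frac{1}{2}} x \big\|_H^2 & \text{if } x \in D\big(A^{\frac{1}{2}}\big), \\ +\infty & \text{otherwise} \end{cases} \quad \forall\, x \in H,
\]
then $\varphi \in \Gamma_0(H)$. Since
\[
\big(A(x), y\big)_H = \big(A^{\frac{1}{2}}(x), A^{\frac{1}{2}}(y)\big)_H \quad \forall\, (x,y) \in D(A) \times D\big(A^{\frac{1}{2}}\big),
\]
we have
\[
\big(A(x), y - x\big)_H \leq \frac{1}{2} \big\| A^{\frac{1}{2}}(y) \big\|_H^2 - \frac{1}{2} \big\| A^{\frac{1}{2}}(x) \big\|_H^2 \quad \forall\, (x,y) \in D(A) \times D\big(A^{\frac{1}{2}}\big),
\]
so $A(x) \in \partial\varphi(x)$, i.e., $A \subseteq \partial\varphi$. Because $A$ is maximal, we conclude that $A = \partial\varphi$.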
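Proposition 4.4.42 can be made concrete in $H = \mathbb{R}^2$ by comparing a self-adjoint monotone map with a monotone map that is not self-adjoint. The sketch below evaluates the cyclic sum of Definition 4.4.36 along one closed cycle; the function names, the two linear maps and the chosen cycle are illustrative assumptions and do not come from the original text.

```python
def cyclic_sum(A, points):
    """Sum of <A(x_k), x_k - x_{k+1}> over the closed cycle x_0, ..., x_n, x_{n+1} = x_0."""
    total = 0.0
    n = len(points)
    for k in range(n):
        x, x_next = points[k], points[(k + 1) % n]
        ax = A(x)
        total += sum(a * (xi - yi) for a, xi, yi in zip(ax, x, x_next))
    return total

def identity(x):
    # Gradient of (1/2)|x|^2: a self-adjoint, monotone linear map.
    return (x[0], x[1])

def rotation(x):
    # Rotation by 90 degrees: monotone since <Rx, x> = 0, but not self-adjoint.
    return (-x[1], x[0])

# One closed cycle of three points (an arbitrary illustrative choice).
CYCLE = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
```

For this cycle `cyclic_sum(identity, CYCLE)` is nonnegative, whereas `cyclic_sum(rotation, CYCLE)` is negative, so the rotation is monotone but not cyclically monotone, consistent with the proposition.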


Using Proposition 3.2.14, we obtain the following result.

PROPOSITION 4.4.43 If $X$ is a reflexive Banach space and $\varphi : X \longrightarrow \mathbb{R}$ is a continuous, convex function, then $\partial\varphi : X \longrightarrow 2^{X^*} \setminus \{\emptyset\}$ is upper semicontinuous from $X$ with the norm topology into $X^*$ with the weak topology.

REMARK 4.4.44 The result is true if $X$ is any Banach space. In this case $X^*$ is supplied with the $w^*$-topology. The proof remains the same.

DEFINITION 4.4.45 Let $Y$ and $Z$ be Hausdorff topological spaces and let $S : Y \supseteq D(S) \longrightarrow 2^Z$ be a multifunction. A selection $f$ of $S$ is a single valued map $f : Y \longrightarrow Z$, such that
\[
f(y) \in S(y) \quad \forall\, y \in D(S).
\]
In the next proposition, using selections of the subdifferential map, we characterize the Gâteaux and Fréchet differentiability of a convex function.

PROPOSITION 4.4.46 If $X$ is a Banach space, $U \subseteq X$ is a nonempty, open, convex set and $\varphi : U \longrightarrow \mathbb{R}$ is a continuous, convex function, then $\varphi$ is Gâteaux (respectively Fréchet) differentiable at $x \in U$ if and only if there is a selection $f$ of the subdifferential map $\partial\varphi$ which is norm-to-weak$^*$ (respectively norm-to-norm) continuous at $x$.

An interesting consequence of Proposition 4.4.46 is that Fréchet differentiable, convex functions are necessarily $C^1$-functions.

COROLLARY 4.4.47 If $X$ is a Banach space, $U \subseteq X$ is a nonempty, open, convex set and $\varphi : U \longrightarrow \mathbb{R}$ is a convex and Fréchet differentiable function, then the function $x \longmapsto \varphi'_F(x)$ is norm-to-norm continuous from $U$ into $X^*$, i.e., $\varphi \in C^1(U)$.


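Another such result is given in the next proposition.

PROPOSITION 4.4.48 If $\varphi \in \Gamma_0(\mathbb{R}^N)$, $\varphi$ is strictly convex and
\[
\frac{\varphi(x)}{\|x\|_{\mathbb{R}^N}} \longrightarrow +\infty \quad \text{as } \|x\|_{\mathbb{R}^N} \to +\infty,
\]
then $\varphi^* \in C^1(\mathbb{R}^N)$.

PROOF Without any loss of generality, we may assume that $0 \in \mathrm{dom}\,\varphi$ and $\varphi(0) = 0$. Fix $x^* \in \mathbb{R}^N$ and consider the function
\[
\psi_{x^*}(x) \stackrel{df}{=} (x^*, x)_{\mathbb{R}^N} - \varphi(x).
\]
Evidently $-\psi_{x^*} \in \Gamma_0(\mathbb{R}^N)$, it is strictly convex and
\[
\psi_{x^*}(x) \longrightarrow -\infty \quad \text{as } \|x\|_{\mathbb{R}^N} \to +\infty.
\]
So by the Weierstrass theorem $\psi_{x^*}$ attains its maximum on $\mathbb{R}^N$ and the maximizer $x$ is unique. By Proposition 4.4.21 and Corollary 4.4.23, we have $\partial\varphi^*(x^*) = \{x\}$, i.e., $\partial\varphi^*$ is single-valued. The map $x^* \longmapsto \partial\varphi^*(x^*)$ is closed. We show that it maps bounded sets to bounded sets, hence it is continuous. To this end let
\[
\|x^*\|_{\mathbb{R}^N} \leq r \quad \text{and} \quad x = \partial\varphi^*(x^*).
\]
We have $x^* \in \partial\varphi(x)$ and so
\[
\frac{\varphi(x)}{\|x\|_{\mathbb{R}^N}} = \frac{\varphi(x) - \varphi(0)}{\|x\|_{\mathbb{R}^N}} \leq \frac{(x^*, x)_{\mathbb{R}^N}}{\|x\|_{\mathbb{R}^N}} \leq \|x^*\|_{\mathbb{R}^N} \leq r;
\]
thus, by the growth hypothesis on $\varphi$, the set $\big\{ \partial\varphi^*(x^*) : \|x^*\|_{\mathbb{R}^N} \leq r \big\}$ is bounded, i.e., $\partial\varphi^*$ is continuous. Finally let $x = \partial\varphi^*(x^*)$ and $x_{h^*} = \partial\varphi^*(x^* + h^*)$, for some $x^* \in \mathbb{R}^N$, $h^* \in \mathbb{R}^N \setminus \{0\}$. From the definition of the subdifferential, we have
\[
0 \leq \frac{\varphi^*(x^* + h^*) - \varphi^*(x^*) - (h^*, x)_{\mathbb{R}^N}}{\|h^*\|_{\mathbb{R}^N}} \leq \frac{(h^*, x_{h^*} - x)_{\mathbb{R}^N}}{\|h^*\|_{\mathbb{R}^N}} \leq \|x_{h^*} - x\|_{\mathbb{R}^N}.
\]
From the continuity of $\partial\varphi^*$, we have
\[
\|x_{h^*} - x\|_{\mathbb{R}^N} \longrightarrow 0 \quad \text{as } h^* \to 0.
\]
So $\varphi^*$ is differentiable at $x^* \in \mathbb{R}^N$ and the derivative is continuous, i.e., $\varphi^* \in C^1(\mathbb{R}^N)$.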
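A one-dimensional instance of Proposition 4.4.48 is $\varphi(x) = x^4/4$, which is strictly convex with superlinear growth. The sketch below approximates the conjugate by a grid search and compares it with the closed form $\varphi^*(s) = \tfrac{3}{4}|s|^{4/3}$ given by the standard conjugacy of $|x|^p/p$ and $|s|^q/q$ with $1/p + 1/q = 1$. The grid, the helper names and the sample point are illustrative assumptions, and the grid search is only valid for slopes whose maximizer lies inside $[-5,5]$.

```python
def phi(x):
    return x ** 4 / 4.0

# Search grid on [-5, 5] with step 0.001 (an arbitrary illustrative choice).
GRID = [-5.0 + 0.001 * i for i in range(10001)]

def conjugate(s):
    """Approximate phi*(s) = sup_x (s*x - phi(x)) and the maximizer on the grid."""
    best_x = max(GRID, key=lambda x: s * x - phi(x))
    return s * best_x - phi(best_x), best_x

def conjugate_closed_form(s):
    """Closed form (3/4)|s|^(4/3) for this particular phi."""
    return 0.75 * abs(s) ** (4.0 / 3.0)
```

At $s = 8$ the grid maximizer is close to $2 = 8^{1/3}$, which is the single element of $\partial\varphi^*(8)$ identified in the proof, and the two values of the conjugate agree up to the grid resolution.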


Before passing to the nonconvex subdifferentials, let us mention a few things about the $\varepsilon$-subdifferential (or approximate subdifferential), which is a useful tool in convex analysis. Its definition results from an innocent looking perturbation of the original subdifferential (see Definition 4.4.19), which however leads to some remarkable properties that are different in nature from those of the "exact" subdifferential. The mathematical setting remains unchanged with $(X, X^*)$ being a dual system of Hausdorff locally convex spaces.

DEFINITION 4.4.49 Let $\varphi : X \longrightarrow \overline{\mathbb{R}}$ be a proper function, $\varepsilon \geq 0$ and $x \in \mathrm{dom}\,\varphi$. The $\varepsilon$-subdifferential of $\varphi$ at $x$ is the set $\partial_\varepsilon\varphi(x)$ (possibly empty), defined by
\[
\partial_\varepsilon\varphi(x) \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, y - x\rangle_X - \varepsilon \leq \varphi(y) - \varphi(x) \text{ for all } y \in X \big\}.
\]

REMARK 4.4.50 Equivalently we can say that $x^* \in \partial_\varepsilon\varphi(x)$ if and only if
\[
\inf_X \big(\varphi - x^*\big) > -\infty \quad \text{and} \quad x \in \varepsilon\text{-}\mathrm{argmin}\big(\varphi - x^*\big),
\]
with
\[
\varepsilon\text{-}\mathrm{argmin}\big(\varphi - x^*\big) = \Big\{ y \in X : \varphi(y) - \langle x^*, y\rangle_X \leq \inf_X \big(\varphi - x^*\big) + \varepsilon \Big\}.
\]
Also $x^* \in \partial_\varepsilon\varphi(x)$ if and only if
\[
\varphi(x) + \varphi^*(x^*) - \langle x^*, x\rangle_X \leq \varepsilon.
\]
Geometrically the definition of $\partial_\varepsilon\varphi(x)$ says that the epigraph of the continuous affine function with slope $x^* \in \partial_\varepsilon\varphi(x)$ and passing through $\big(x, \varphi(x) - \varepsilon\big)$ contains the epigraph of $\varphi$. So for $\varepsilon > 0$, $\partial_\varepsilon\varphi(x)$ is a global notion (in contrast to $\partial\varphi(x)$ which is local), i.e., it may be sensitive to variations of $\varphi$ far away from $x$. When $\varepsilon = 0$, we recover Definition 4.4.19.

The next proposition establishes the main difference between approximate and exact subdifferentials.

PROPOSITION 4.4.51 If $\varphi \in \Gamma_0(X)$, $\varepsilon > 0$ and $x \in \mathrm{dom}\,\varphi$, then $\partial_\varepsilon\varphi(x) \neq \emptyset$ and it is $w^*$-closed, convex.

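PROOF From Theorem 4.4.14, we have
\[
-\varphi(x) = \inf_{x^* \in X^*} \big[ \varphi^*(x^*) - \langle x^*, x\rangle_X \big].
\]
Let
\[
\psi(x^*) \stackrel{df}{=} \varphi^*(x^*) - \langle x^*, x\rangle_X \quad \forall\, x^* \in X^*.
\]
Let $x^* \in \varepsilon\text{-}\mathrm{argmin}\,\psi \neq \emptyset$. We have
\[
\varphi^*(x^*) - \langle x^*, x\rangle_X \leq \inf_{X^*} \psi + \varepsilon = -\varphi(x) + \varepsilon,
\]
hence
\[
\varphi(x) + \varphi^*(x^*) - \langle x^*, x\rangle_X \leq \varepsilon,
\]
which means that $x^* \in \partial_\varepsilon\varphi(x)$ (see Remark 4.4.7). So $\partial_\varepsilon\varphi(x) \neq \emptyset$ and clearly it is $w^*$-closed and convex.

In the study of the $\varepsilon$-subdifferentials (with $\varepsilon > 0$), the directional derivative (see Definition 4.4.17) is replaced by the following quantity.

DEFINITION 4.4.52 Let $\varphi : X \longrightarrow \overline{\mathbb{R}}$ be a proper function, $\varepsilon > 0$ and $x \in \mathrm{dom}\,\varphi$. For every $h \in X$, we define
\[
\varphi'_\varepsilon(x;h) \stackrel{df}{=} \inf_{\lambda > 0} \frac{\varphi(x + \lambda h) - \varphi(x) + \varepsilon}{\lambda}.
\]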
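For $\varphi(x) = x^2/2$ on $\mathbb{R}$, whose conjugate is $\varphi^*(s) = s^2/2$, the second characterization in Remark 4.4.50 reads $\tfrac{1}{2}(s - x)^2 \leq \varepsilon$, so $\partial_\varepsilon\varphi(x) = [x - \sqrt{2\varepsilon},\, x + \sqrt{2\varepsilon}]$. The sketch below encodes this closed form and approximates the quantity of Definition 4.4.52 by a grid search over $\lambda$. All helper names, the grid parameters `steps` and `lam_max`, and the sample values are illustrative assumptions rather than anything from the original text.

```python
import math

def phi(x):
    return 0.5 * x * x

def fenchel_gap(x, s):
    """phi(x) + phi*(s) - s*x for phi(x) = x^2/2, whose conjugate is s^2/2."""
    return phi(x) + 0.5 * s * s - s * x

def eps_subdifferential(x, eps):
    """Closed form for this phi: the interval [x - sqrt(2 eps), x + sqrt(2 eps)]."""
    r = math.sqrt(2.0 * eps)
    return x - r, x + r

def eps_directional_derivative(x, h, eps, steps=10000, lam_max=10.0):
    """Grid approximation of inf over lam > 0 of (phi(x + lam*h) - phi(x) + eps) / lam."""
    lams = (lam_max * i / steps for i in range(1, steps + 1))
    return min((phi(x + lam * h) - phi(x) + eps) / lam for lam in lams)
```

With $x = 1$, $\varepsilon = 0.5$ and $h = 1$, the interval is $[0, 2]$ and the grid value of $\varphi'_\varepsilon(1;1)$ is $2$, the largest element of that interval, which is the support function value one would expect for this example.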

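The next result is analogous to Proposition 4.4.27.

PROPOSITION 4.4.53 If $X$ is a Banach space, $\varphi \in \Gamma_0(X)$, $\varepsilon > 0$ and $x \in \mathrm{dom}\,\varphi$, then
\[
\varphi'_\varepsilon(x;\cdot) = \sigma_{\partial_\varepsilon\varphi(x)}(\cdot).
\]
Let us mention some basic calculus rules for the $\varepsilon$-subdifferential.

PROPOSITION 4.4.54 If $X$ and $Y$ are two Banach spaces, $A \in \mathcal{L}(X;Y)$, $\varphi \in \Gamma_0(Y)$, $\varepsilon > 0$ and $x \in A^{-1}(\mathrm{dom}\,\varphi)$, then
\[
\partial_\varepsilon(\varphi\circ A)(x) = \overline{A^*\big(\partial_\varepsilon\varphi\big(A(x)\big)\big)}^{\,w^*}.
\]

PROOF Since $\varphi\circ A \in \Gamma_0(X)$, from Proposition 4.4.53, we have that
\[
(\varphi\circ A)'_\varepsilon(x;h) = \sigma_{\partial_\varepsilon(\varphi\circ A)(x)}(h) \quad \forall\, h \in X.
\]
We have
\[
(\varphi\circ A)'_\varepsilon(x;h) = \inf_{\lambda > 0} \frac{\varphi\big(A(x) + \lambda A(h)\big) - \varphi\big(A(x)\big) + \varepsilon}{\lambda} = \varphi'_\varepsilon\big(A(x); A(h)\big).
\]
On the other hand, using Proposition 4.4.53, for all $h \in X$, we have
\[
\sigma_{A^*(\partial_\varepsilon\varphi(A(x)))}(h) = \sup_{y^* \in \partial_\varepsilon\varphi(A(x))} \langle A^*(y^*), h\rangle_X = \sup_{y^* \in \partial_\varepsilon\varphi(A(x))} \langle y^*, A(h)\rangle_Y = \varphi'_\varepsilon\big(A(x); A(h)\big).
\]
So we conclude that
\[
\sigma_{\partial_\varepsilon(\varphi\circ A)(x)}(h) = \sigma_{A^*(\partial_\varepsilon\varphi(A(x)))}(h) \quad \forall\, h \in X,
\]
hence we obtain the conclusion of the proposition.

COROLLARY 4.4.55 If $X$ and $Y$ are two Banach spaces, $A \in \mathcal{L}(X;Y)$, $\varphi \in \Gamma_0(Y)$ and $x \in A^{-1}(\mathrm{dom}\,\varphi)$, then
\[
\partial(\varphi\circ A)(x) = \bigcap_{\varepsilon > 0} \overline{A^*\big(\partial_\varepsilon\varphi\big(A(x)\big)\big)}^{\,w^*}.
\]
Moreover, if $X$ is reflexive, then we have
\[
\partial(\varphi\circ A)(x) = \bigcap_{\varepsilon > 0} \overline{A^*\big(\partial_\varepsilon\varphi\big(A(x)\big)\big)}.
\]

PROOF Clearly
\[
\partial(\varphi\circ A)(x) = \bigcap_{\varepsilon > 0} \partial_\varepsilon(\varphi\circ A)(x).
\]
Applying Proposition 4.4.54, we obtain the first equality. For the second equality just note that if $X$ is reflexive, then
\[
\overline{A^*\big(\partial_\varepsilon\varphi\big(A(x)\big)\big)} = \overline{A^*\big(\partial_\varepsilon\varphi\big(A(x)\big)\big)}^{\,w} = \overline{A^*\big(\partial_\varepsilon\varphi\big(A(x)\big)\big)}^{\,w^*}.
\]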

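COROLLARY 4.4.56 If $X$ is a Banach space, $\varphi, \psi \in \Gamma_0(X)$ and $x \in \mathrm{dom}\,\varphi \cap \mathrm{dom}\,\psi$, then
\[
\partial(\varphi + \psi)(x) = \bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*}.
\]
Moreover, if $X$ is reflexive, then
\[
\partial(\varphi + \psi)(x) = \bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}.
\]

PROOF Let
\[
u(x,y) \stackrel{df}{=} \varphi(x) + \psi(y) \quad \forall\, (x,y) \in X \times X
\]
and let $A \in \mathcal{L}(X; X \times X)$ be defined by
\[
A(x) \stackrel{df}{=} (x, x) \quad \forall\, x \in X.
\]
Then we see that
\[
u \circ A = \varphi + \psi \quad \text{on } X.
\]
We have
\[
A^*(x^*, y^*) = x^* + y^* \quad \forall\, (x^*, y^*) \in X^* \times X^*
\]
and
\[
u^*(x^*, y^*) = \varphi^*(x^*) + \psi^*(y^*) \quad \forall\, (x^*, y^*) \in X^* \times X^*.
\]
Let $(x,y) \in \mathrm{dom}\, u$, $\varepsilon > 0$ and $(x^*, y^*) \in \partial_\varepsilon u(x,y)$. We have
\[
\varphi(x) + \psi(y) + \varphi^*(x^*) + \psi^*(y^*) - \langle x^*, x\rangle_X - \langle y^*, y\rangle_X \leq \varepsilon.
\]
So there exist $\varepsilon_1, \varepsilon_2 \geq 0$, such that $\varepsilon_1 + \varepsilon_2 = \varepsilon$ and
\[
x^* \in \partial_{\varepsilon_1}\varphi(x) \subseteq \partial_\varepsilon\varphi(x) \quad \text{and} \quad y^* \in \partial_{\varepsilon_2}\psi(y) \subseteq \partial_\varepsilon\psi(y).
\]
Therefore, we infer that
\[
\partial_\varepsilon u(x,y) \subseteq \partial_\varepsilon\varphi(x) \times \partial_\varepsilon\psi(y).
\]
Using Corollary 4.4.55, we obtain
\[
\partial(\varphi + \psi)(x) = \bigcap_{\varepsilon > 0} \overline{A^*\big(\partial_\varepsilon u(x,x)\big)}^{\,w^*} \subseteq \bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*}. \tag{4.29}
\]
On the other hand note that
\[
\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x) \subseteq \partial_{2\varepsilon}(\varphi + \psi)(x)
\]
and so
\[
\overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*} \subseteq \partial_{2\varepsilon}(\varphi + \psi)(x).
\]
From this inclusion it follows that
\[
\bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*} \subseteq \bigcap_{\varepsilon > 0} \partial_{2\varepsilon}(\varphi + \psi)(x) = \partial(\varphi + \psi)(x). \tag{4.30}
\]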


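From (4.29) and (4.30), we conclude that
\[
\partial(\varphi + \psi)(x) = \bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*}.
\]
Again if $X$ is reflexive, then the norm closure and the weak closure coincide (due to the convexity of the set).

Using the second characterization of the $\varepsilon$-subdifferential, we can easily prove the following rule. We leave the details to the reader.

PROPOSITION 4.4.57 If $\varphi, \psi \in \Gamma_0(X)$, $\varepsilon \geq 0$ and there exists $x_0 \in \mathrm{dom}\,\varphi \cap \mathrm{dom}\,\psi$, such that $\varphi$ is continuous at $x_0$, then
\[
\partial_\varepsilon(\varphi + \psi)(x) = \bigcup_{\substack{\varepsilon_1, \varepsilon_2 \geq 0 \\ \varepsilon_1 + \varepsilon_2 = \varepsilon}} \big[ \partial_{\varepsilon_1}\varphi(x) + \partial_{\varepsilon_2}\psi(x) \big] \quad \forall\, x \in \mathrm{dom}\,\varphi \cap \mathrm{dom}\,\psi.
\]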

Now we pass to a brief discussion of some nonconvex subdifferentials. Historically the first subdifferential defined for nonconvex functions is that for locally Lipschitz functions. The starting point for the introduction of such a subdifferential was Theorem 4.2.7 (that is, that a continuous, convex function is locally Lipschitz) and, when the underlying space is finite dimensional, Theorem 1.5.8 and Corollary 1.5.9 (Rademacher's theorem). The mathematical framework is a Banach space $X$ with $X^*$ its topological dual. Let us start by recalling the definition of a locally Lipschitz function, which is central in what follows.

DEFINITION 4.4.58 A function $\varphi : X \longrightarrow \mathbb{R}$ is locally Lipschitz, if every point $x \in X$ admits a neighbourhood $U \subseteq X$ and a constant $k_U$ (depending on $U$), such that
\[
\big| \varphi(y) - \varphi(z) \big| \leq k_U \|y - z\|_X \quad \forall\, y, z \in U.
\]
A locally Lipschitz function need not have directional derivatives in the sense of Definition 4.4.17. However, exploiting the local Lipschitz structure, we can define a generalized directional derivative as follows.

DEFINITION 4.4.59 Let $\varphi : X \longrightarrow \mathbb{R}$ be a locally Lipschitz function. Then the generalized directional derivative of $\varphi$ at $x \in X$ in the direction $h \in X$ is defined by
\[
\varphi^0(x;h) \stackrel{df}{=} \limsup_{\substack{x' \to x \\ \lambda \searrow 0}} \frac{\varphi(x' + \lambda h) - \varphi(x')}{\lambda}.
\]
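The difference between the generalized directional derivative of Definition 4.4.59 and an ordinary one-sided difference quotient shows up already for $\varphi(x) = -|x|$ at $x = 0$. The sketch below is a crude numerical illustration under stated assumptions: the estimator `clarke_derivative` replaces the $\limsup$ by a maximum over finitely many base points near $x$ and a single small step `lam`, so in general it only gives a heuristic estimate, and all names and parameter values are hypothetical.

```python
def clarke_derivative(phi, x, h, radius=1e-3, samples=201, lam=1e-6):
    """Crude estimate of the generalized directional derivative: maximize the
    difference quotient over base points near x, using one small step lam."""
    best = float("-inf")
    for i in range(samples):
        base = x - radius + 2.0 * radius * i / (samples - 1)
        best = max(best, (phi(base + lam * h) - phi(base)) / lam)
    return best

def one_sided_derivative(phi, x, h, lam=1e-6):
    """Ordinary one-sided difference quotient taken at x itself."""
    return (phi(x + lam * h) - phi(x)) / lam

def neg_abs(x):
    return -abs(x)
```

For `neg_abs` at $0$ the estimator returns approximately $1$ in both directions $h = \pm 1$, while the one-sided quotient in direction $h = 1$ is approximately $-1$. This matches the hand computation $\varphi^0(0;h) = |h|$ for this function, since base points on either side of the kink are allowed.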


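The utility of $\varphi^0$ follows from some useful properties that it exhibits.

PROPOSITION 4.4.60 If $\varphi : X \longrightarrow \mathbb{R}$ is a locally Lipschitz function, then
(a) the function $h \longmapsto \varphi^0(x;h)$ is sublinear and Lipschitz continuous for all $x \in X$;
(b) the function $(x,h) \longmapsto \varphi^0(x;h)$ is upper semicontinuous on $X \times X$;
(c) $\varphi^0(x;-h) = (-\varphi)^0(x;h)$.

PROOF (a) Clearly $\varphi^0(x;\cdot)$ is positively homogeneous. Also let $h_1, h_2 \in X$. We have
\begin{align*}
\varphi^0(x; h_1 + h_2) &= \limsup_{\substack{x' \to x \\ \lambda \searrow 0}} \frac{\varphi\big(x' + \lambda(h_1 + h_2)\big) - \varphi(x')}{\lambda} \\
&= \limsup_{\substack{x' \to x \\ \lambda \searrow 0}} \frac{\varphi(x' + \lambda h_1 + \lambda h_2) - \varphi(x' + \lambda h_2) + \varphi(x' + \lambda h_2) - \varphi(x')}{\lambda} \\
&\leq \limsup_{\substack{x' \to x \\ \lambda \searrow 0}} \frac{\varphi(x' + \lambda h_1 + \lambda h_2) - \varphi(x' + \lambda h_2)}{\lambda} + \limsup_{\substack{x' \to x \\ \lambda \searrow 0}} \frac{\varphi(x' + \lambda h_2) - \varphi(x')}{\lambda} \\
&= \varphi^0(x; h_1) + \varphi^0(x; h_2).
\end{align*}
So we have proved that $\varphi^0(x;\cdot)$ is sublinear. Exploiting the local Lipschitzness of $\varphi$, we see that for all $x' \in X$ near $x \in X$ and for all $\lambda > 0$ near zero, we have
\[
\frac{\varphi(x' + \lambda h) - \varphi(x')}{\lambda} \leq k \|h\|_X \quad \forall\, h \in X,
\]
so
\[
\varphi^0(x;h) \leq k \|h\|_X \quad \forall\, h \in X
\]
and due to the sublinearity of $\varphi^0(x;\cdot)$, we have
\[
\big| \varphi^0(x;h) \big| \leq k \|h\|_X \quad \forall\, h \in X,
\]
so finally we deduce that $\varphi^0(x;\cdot)$ is Lipschitz continuous.

(b) Let
\[
(x_n, h_n) \longrightarrow (x, h) \quad \text{in } X \times X.
\]
From the definition of $\varphi^0(x;h)$, we know that for every $n \geq 1$, we can find $v_n \in X$ and $\lambda_n \in (0,1)$, such that
\[
\|v_n\|_X + \lambda_n \leq \frac{1}{n}
\]
and
\[
\varphi^0(x_n; h_n) \leq \frac{\varphi(x_n + v_n + \lambda_n h_n) - \varphi(x_n + v_n)}{\lambda_n} + \frac{1}{n}.
\]
Since $\varphi$ is Lipschitz near $x$ with some constant $k$, replacing $h_n$ by $h$ in the quotient changes it by at most $k\|h_n - h\|_X \to 0$, so
\[
\limsup_{n \to +\infty} \varphi^0(x_n; h_n) \leq \varphi^0(x;h),
\]
i.e., the function $(x,h) \longmapsto \varphi^0(x;h)$ is upper semicontinuous.

(c) By definition, we have
\[
\varphi^0(x;-h) = \limsup_{\substack{x' \to x \\ \lambda \searrow 0}} \frac{\varphi(x' - \lambda h) - \varphi(x')}{\lambda} = \limsup_{\substack{y \to x \\ \lambda \searrow 0}} \frac{(-\varphi)(y + \lambda h) - (-\varphi)(y)}{\lambda} = (-\varphi)^0(x;h)
\]
(with $y = x' - \lambda h$).

These properties lead to the following definition.

DEFINITION 4.4.61 Let $\varphi : X \longrightarrow \mathbb{R}$ be a locally Lipschitz function. The generalized subdifferential (or Clarke subdifferential) of $\varphi$ at $x$ is defined by
\[
\partial\varphi(x) \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, h\rangle_X \leq \varphi^0(x;h) \text{ for all } h \in X \big\}.
\]
The elements of $\partial\varphi(x)$ are called generalized gradients.

PROPOSITION 4.4.62 If $\varphi : X \longrightarrow \mathbb{R}$ is a locally Lipschitz function, then for every $x \in X$, the set $\partial\varphi(x) \subseteq X^*$ is nonempty, convex and $w^*$-compact, the multifunction $\partial\varphi : X \longrightarrow 2^{X^*} \setminus \{\emptyset\}$ is upper semicontinuous from $X$ with the norm topology into $X^*$ with the $w^*$-topology (see Definition 3.2.12) and
\[
\varphi^0(x;h) = \sigma_{\partial\varphi(x)}(h) \quad \forall\, (x,h) \in X \times X.
\]

PROOF Since by Proposition 4.4.60(a), $\varphi^0(x;\cdot)$ is sublinear, the Hahn-Banach theorem implies that it has a continuous linear minorant. Therefore $\partial\varphi(x) \neq \emptyset$. Clearly the set is convex and by virtue of Proposition 4.4.60(a), it is also closed and bounded, hence $w^*$-compact (by Alaoglu's theorem; see Theorem A.3.9). To show the upper semicontinuity, let $C \subseteq X^*$ be a $w^*$-closed set and let
\[
\{x_n\}_{n \geq 1} \subseteq \partial\varphi^-(C) = \big\{ x \in X : \partial\varphi(x) \cap C \neq \emptyset \big\}
\]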


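be a sequence, such that
\[
x_n \longrightarrow x \quad \text{in } X.
\]
Let us take
\[
x_n^* \in \partial\varphi(x_n) \cap C \quad \forall\, n \geq 1.
\]
Because of Proposition 4.4.60(a), the sequence $\{x_n^*\}_{n \geq 1}$ is bounded in $X^*$. So by Alaoglu's theorem (see Theorem A.3.9), we can find a subnet $\{x_\alpha^*\}_{\alpha \in J}$ of $\{x_n^*\}_{n \geq 1}$, such that
\[
x_\alpha^* \stackrel{w^*}{\longrightarrow} x^*.
\]
We have
\[
\langle x_\alpha^*, h\rangle_X \leq \varphi^0(x_\alpha; h) \quad \forall\, h \in X.
\]
Taking the limit with respect to $\alpha \in J$ and using Proposition 4.4.60(b), we obtain
\[
\langle x^*, h\rangle_X \leq \varphi^0(x;h) \quad \forall\, h \in X,
\]
so $x^* \in \partial\varphi(x)$. Also $x^* \in C$, since $C \subseteq X^*$ is $w^*$-closed. Therefore $x^* \in \partial\varphi(x) \cap C$, hence $x \in \partial\varphi^-(C)$, which proves the upper semicontinuity of the multifunction. Finally, using once more the Hahn-Banach theorem, for every $h_0 \in X$, we can find $x_0^* \in \partial\varphi(x)$, such that
\[
\langle x_0^*, h\rangle_X \leq \varphi^0(x;h) \quad \forall\, h \in X
\]
and $\langle x_0^*, h_0\rangle_X = \varphi^0(x; h_0)$. Therefore $\varphi^0(x;\cdot) = \sigma_{\partial\varphi(x)}(\cdot)$.

PROPOSITION 4.4.63 Let $\varphi : X \longrightarrow \mathbb{R}$ be a locally Lipschitz function.
(a) If $\varphi$ is Gâteaux differentiable at $x \in X$, then $\varphi'_G(x) \in \partial\varphi(x)$.
(b) If $\varphi \in C^1(X)$, then $\partial\varphi(x) = \{\varphi'_F(x)\}$ for all $x \in X$.
(c) If $\varphi$ is also convex, then the convex and generalized subdifferentials of $\varphi$ coincide.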

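PROOF (a) From the definition of $\varphi^0(x;\cdot)$, we have
\[
\langle \varphi'_G(x), h\rangle_X \leq \varphi^0(x;h) \quad \forall\, h \in X.
\]
So $\varphi'_G(x) \in \partial\varphi(x)$.

(b) If $\varphi \in C^1(X)$, then
\[
\varphi^0(x;h) = \langle \varphi'_F(x), h\rangle_X \quad \forall\, h \in X,
\]
hence $\partial\varphi(x) = \{\varphi'_F(x)\}$.

(c) From the definition of $\varphi^0(x;h)$, we have
\[
\varphi^0(x;h) = \lim_{\varepsilon \searrow 0}\ \sup_{\|x' - x\|_X \leq \varepsilon\delta}\ \sup_{0 < \lambda < \varepsilon} \frac{\varphi(x' + \lambda h) - \varphi(x')}{\lambda},
\]
with $\delta > 0$ arbitrary. Because $\varphi$ is convex, the map
\[
\lambda \longmapsto \frac{\varphi(x' + \lambda h) - \varphi(x')}{\lambda}
\]
is increasing on $(0, +\infty)$. So we obtain
\[
\varphi^0(x;h) = \lim_{\varepsilon \searrow 0}\ \sup_{\|x' - x\|_X \leq \varepsilon\delta} \frac{\varphi(x' + \varepsilon h) - \varphi(x')}{\varepsilon}.
\]
From the local Lipschitz property of $\varphi$, we have
\[
\left| \frac{\varphi(x' + \varepsilon h) - \varphi(x')}{\varepsilon} - \frac{\varphi(x + \varepsilon h) - \varphi(x)}{\varepsilon} \right| \leq 2\delta k \quad \forall\, x' \in x + \varepsilon\delta \overline{B}_1,
\]
with $k > 0$, so
\[
\varphi^0(x;h) \leq \lim_{\varepsilon \searrow 0} \frac{\varphi(x + \varepsilon h) - \varphi(x)}{\varepsilon} + 2\delta k.
\]
Since $\delta > 0$ was arbitrary, we obtain
\[
\varphi^0(x;h) \leq \varphi'(x;h) \quad \forall\, h \in X.
\]
Because the opposite inequality is always true, we conclude that $\varphi^0(x;\cdot) = \varphi'(x;\cdot)$. Then by virtue of Proposition 4.4.27 and Definition 4.4.61, we conclude that the two subdifferentials coincide.

EXAMPLE 4.4.64 If $\varphi$ is differentiable but not $C^1$, then $\partial\varphi(x)$ need not be a singleton. To see this consider the function $\varphi : \mathbb{R} \longrightarrow \mathbb{R}$, defined by
\[
\varphi(x) \stackrel{df}{=} \begin{cases} x^2 \sin\big(\frac{1}{x}\big) & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}
\]
Then $\varphi$ is Lipschitz continuous on $[-1,1]$, $\varphi'(0) = 0$ but $\varphi'$ is not continuous at $x = 0$. A straightforward calculation shows that $\varphi^0(0;h) = |h|$, hence $\partial\varphi(0) = [-1,1]$ and so it is not a singleton.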


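When $X = \mathbb{R}^N$, we can use Corollary 1.5.9 (Rademacher's theorem) to give a definition which is less abstract and formal than Definition 4.4.61 in terms of the generalized directional derivative. The new definition is more geometric.

THEOREM 4.4.65 If $\varphi : \mathbb{R}^N \longrightarrow \mathbb{R}$ is a locally Lipschitz function and $E$ is any Lebesgue-null set in $\mathbb{R}^N$, then
\[
\partial\varphi(x) = \mathrm{conv} \Big\{ \lim_{n \to +\infty} \nabla\varphi(x_n) : x_n \longrightarrow x,\ x_n \notin E \cup D_\varphi^c \Big\},
\]
where $D_\varphi^c \subseteq \mathbb{R}^N$ is the Lebesgue-null set where $\varphi$ fails to be differentiable (due to Rademacher's theorem).

PROOF Since, by Proposition 4.4.63(a), we have
\[
\nabla\varphi(x_n) \in \partial\varphi(x_n) \quad \forall\, n \geq 1
\]
and $\partial\varphi$ is locally bounded (see Proposition 4.4.60(a)), we see that the sequence $\{\nabla\varphi(x_n)\}_{n \geq 1}$ has a convergent subsequence. Then Proposition 4.4.62 implies that the limit of the subsequence belongs to $\partial\varphi(x)$. Therefore we have
\[
\mathrm{conv} \Big\{ \lim_{n \to +\infty} \nabla\varphi(x_n) : x_n \longrightarrow x,\ x_n \notin E \cup D_\varphi^c \Big\} \subseteq \partial\varphi(x). \tag{4.31}
\]
On the other hand let
\[
\xi_h = \limsup_{\substack{y \to x \\ y \notin E \cup D_\varphi^c}} \big(\nabla\varphi(y), h\big)_{\mathbb{R}^N},
\]
with $h \neq 0$. For a given $\varepsilon > 0$, we can find $\delta = \delta(\varepsilon) > 0$, such that
\[
\big(\nabla\varphi(y), h\big)_{\mathbb{R}^N} \leq \xi_h + \varepsilon \quad \forall\, y \in x + \delta \overline{B}_1(0),\ y \notin E \cup D_\varphi^c.
\]
For $t \in \Big(0, \frac{\delta}{2\|h\|_{\mathbb{R}^N}}\Big)$, we have
\[
\varphi(y + th) - \varphi(y) \leq t(\xi_h + \varepsilon) \quad \text{for a.a. } y \in x + \frac{\delta}{2} \overline{B}_1(0),
\]
so
\[
\varphi^0(x;h) \leq \xi_h + \varepsilon.
\]
Therefore the support function of the set
\[
\mathrm{conv} \Big\{ \lim_{n \to +\infty} \nabla\varphi(x_n) : x_n \longrightarrow x,\ x_n \notin E \cup D_\varphi^c \Big\}
\]
majorizes $\varphi^0(x;\cdot)$. This combined with (4.31) finishes the proof of the theorem.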
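Theorem 4.4.65 can be tried on the function of Example 4.4.64, $\varphi(x) = x^2\sin(1/x)$, whose derivative for $x \neq 0$ is $2x\sin(1/x) - \cos(1/x)$. The sketch below samples this derivative along two sequences converging to $0$; the function names and the index range `kmin`, `kmax` are illustrative assumptions, and a finite sample can of course only suggest the limits.

```python
import math

def dphi(x):
    """Derivative of phi(x) = x^2 sin(1/x) for x != 0."""
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)

def gradient_samples(kmin=1000, kmax=1010):
    """Derivatives along two sequences tending to 0:
    x = 1/(2k pi) and x = 1/((2k+1) pi)."""
    even = [dphi(1.0 / (2 * k * math.pi)) for k in range(kmin, kmax)]
    odd = [dphi(1.0 / ((2 * k + 1) * math.pi)) for k in range(kmin, kmax)]
    return even, odd
```

Along the first sequence the sampled derivatives are close to $-1$ and along the second close to $+1$. Since $|\varphi'(x)| \leq 2|x| + 1$, no limit of derivatives can leave $[-1,1]$, so the convex hull of the limits is $[-1,1]$, in agreement with $\partial\varphi(0) = [-1,1]$ from Example 4.4.64.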


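COROLLARY 4.4.66 If $\varphi : \mathbb{R}^N \longrightarrow \mathbb{R}$ is a locally Lipschitz function, then
\[
\varphi^0(x;h) = \limsup_{\substack{x' \to x \\ x' \notin E \cup D_\varphi^c}} \big(\nabla\varphi(x'), h\big)_{\mathbb{R}^N}.
\]
Finally let us state a few basic calculus rules for the generalized subdifferential.

PROPOSITION 4.4.67 If $\varphi : X \longrightarrow \mathbb{R}$ is a locally Lipschitz function and $\lambda \in \mathbb{R}$, then $\partial(\lambda\varphi) = \lambda\,\partial\varphi$.

PROOF Evidently the result is true if $\lambda \geq 0$, since
\[
(\lambda\varphi)^0 = \lambda\varphi^0.
\]
So we need to consider the case $\lambda < 0$. We may assume that $\lambda = -1$. Then $x^* \in \partial(-\varphi)(x)$ if and only if
\[
\langle x^*, h\rangle_X \leq (-\varphi)^0(x;h) \quad \forall\, h \in X.
\]
By Proposition 4.4.60(c), we have $(-\varphi)^0(x;h) = \varphi^0(x;-h)$. So $x^* \in \partial(-\varphi)(x)$ if and only if
\[
\langle x^*, h\rangle_X \leq \varphi^0(x;-h) \quad \forall\, h \in X,
\]
hence $-x^* \in \partial\varphi(x)$. Thus finally we have that $x^* \in \partial(-\varphi)(x)$ if and only if $x^* \in -\partial\varphi(x)$.

An interesting consequence of this proposition is the following extension of Fermat's rule for local extrema.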


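PROPOSITION 4.4.68 If $\varphi : X \longrightarrow \mathbb{R}$ is a locally Lipschitz function and has a local maximum or minimum at $x \in X$, then $0 \in \partial\varphi(x)$.

PROOF Since $\partial(-\varphi) = -\partial\varphi$, it suffices to prove the proposition for the case of a local minimum at $x \in X$. Then clearly
\[
\varphi^0(x;h) \geq 0 \quad \forall\, h \in X.
\]
Hence $0 \in \partial\varphi(x)$.

PROPOSITION 4.4.69 If $\varphi_k : X \longrightarrow \mathbb{R}$ for $k \in \{1, \ldots, n\}$ are locally Lipschitz functions, then
\[
\partial\Big( \sum_{k=1}^n \varphi_k \Big) \subseteq \sum_{k=1}^n \partial\varphi_k,
\]
i.e., the generalized subdifferential is subadditive.

PROOF It suffices to prove the result for $n = 2$. The general case follows by induction. The support function of $\partial(\varphi_1 + \varphi_2)$ is $(\varphi_1 + \varphi_2)^0$ and the support function of $\partial\varphi_1 + \partial\varphi_2$ is $\varphi_1^0 + \varphi_2^0$. Also note that $\partial\varphi_1(x) + \partial\varphi_2(x)$ is convex and $w^*$-compact. Since
\[
(\varphi_1 + \varphi_2)^0(x;\cdot) \leq \varphi_1^0(x;\cdot) + \varphi_2^0(x;\cdot),
\]
we conclude that
\[
\partial(\varphi_1 + \varphi_2)(x) \subseteq \partial\varphi_1(x) + \partial\varphi_2(x) \quad \forall\, x \in X.
\]

COROLLARY 4.4.70 If $\varphi_k : X \longrightarrow \mathbb{R}$ for $k \in \{1, \ldots, n\}$ are locally Lipschitz functions and all but one are $C^1$-functions, then
\[
\partial\Big( \sum_{k=1}^n \varphi_k \Big) = \sum_{k=1}^n \partial\varphi_k.
\]

COROLLARY 4.4.71 If $\varphi_k : X \longrightarrow \mathbb{R}$ are locally Lipschitz functions and $\lambda_k \in \mathbb{R}$ for $k \in \{1, \ldots, n\}$, then
\[
\partial\Big( \sum_{k=1}^n \lambda_k\varphi_k \Big) \subseteq \sum_{k=1}^n \lambda_k\,\partial\varphi_k
\]
and equality holds if all but one of the functions are $C^1$-functions.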


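The following mean value theorem is a useful tool in many applications.

THEOREM 4.4.72 (Mean Value Theorem) If $U \subseteq X$ is an open set, $x, y \in X$, $[x,y] \subseteq U$, with
\[
[x,y] \stackrel{df}{=} \big\{ \lambda x + (1-\lambda)y : \lambda \in [0,1] \big\}
\]
and $\varphi : U \longrightarrow \mathbb{R}$ is locally Lipschitz, then there exist $u \in (x,y)$, with
\[
(x,y) \stackrel{df}{=} \big\{ \lambda x + (1-\lambda)y : \lambda \in (0,1) \big\},
\]
and $u^* \in \partial\varphi(u)$, such that
\[
\varphi(y) - \varphi(x) = \langle u^*, y - x\rangle_X.
\]

PROOF Let
\[
x_\lambda \stackrel{df}{=} x + \lambda(y - x)
\]
and consider the function $f : [0,1] \longrightarrow \mathbb{R}$, defined by
\[
f(\lambda) \stackrel{df}{=} \varphi(x_\lambda) \quad \forall\, \lambda \in [0,1].
\]
Clearly $f$ is Lipschitz continuous on $[0,1]$. We claim that
\[
\partial f(\lambda) \subseteq \big\{ \langle u^*, y - x\rangle_X : u^* \in \partial\varphi(x_\lambda) \big\} \quad \forall\, \lambda \in (0,1).
\]
Since both sets are closed, convex, it suffices to show that
\[
\sigma_{\partial f(\lambda)}(\pm 1) \leq \max_{u^* \in \partial\varphi(x_\lambda)} \pm \langle u^*, y - x\rangle_X. \tag{4.32}
\]
To this end, for $h = \pm 1$, we have
\begin{align*}
\limsup_{\substack{\lambda' \to \lambda \\ t \searrow 0}} \frac{f(\lambda' + th) - f(\lambda')}{t} &= \limsup_{\substack{\lambda' \to \lambda \\ t \searrow 0}} \frac{\varphi\big(x + (\lambda' + th)(y - x)\big) - \varphi\big(x + \lambda'(y - x)\big)}{t} \\
&\leq \limsup_{\substack{z \to x_\lambda \\ t \searrow 0}} \frac{\varphi\big(z + th(y - x)\big) - \varphi(z)}{t} \\
&= \varphi^0\big(x_\lambda; h(y - x)\big) = \sup_{u^* \in \partial\varphi(x_\lambda)} \langle u^*, h(y - x)\rangle_X. \tag{4.33}
\end{align*}
From (4.33), we obtain (4.32).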


Now let
\[
\xi(\lambda) \stackrel{df}{=} \varphi(x_\lambda) + \lambda\big(\varphi(x) - \varphi(y)\big) \quad \forall\, \lambda \in [0,1].
\]
We have $\xi(0) = \xi(1) = \varphi(x)$ and so we can find $\lambda \in (0,1)$ at which $\xi$ attains a local extremum. Then by Proposition 4.4.68, we have that $0 \in \partial\xi(\lambda)$, which via Propositions 4.4.67 and 4.4.69 and (4.32) implies that
\[
\varphi(y) - \varphi(x) \in \big\langle \partial\varphi(u), y - x \big\rangle_X,
\]
with $u = x_\lambda$.

The generalized subdifferential has a remarkable calculus which makes it very useful in applications. We mention only two rules which arise in applications. For their proofs and additional results in this direction we refer to Clarke (1983).

PROPOSITION 4.4.73 (Chain Rule) If $X$ and $Y$ are two Banach spaces, $h \in C^1(X;Y)$ and $\varphi : Y \longrightarrow \mathbb{R}$ is a locally Lipschitz function, then
\[
\partial(\varphi \circ h)(x) \subseteq \partial\varphi\big(h(x)\big) \circ h'_F(x) \quad \forall\, x \in X.
\]

REMARK 4.4.74 Note that $h'_F(x) \in \mathcal{L}(X;Y)$. Using its adjoint
\[
\big(h'_F(x)\big)^* \in \mathcal{L}(Y^*; X^*)
\]
we can equivalently rewrite the above chain rule as
\[
\partial(\varphi \circ h)(x) \subseteq \big(h'_F(x)\big)^* \partial\varphi\big(h(x)\big) \quad \forall\, x \in X.
\]
A useful consequence of the above chain rule is the following result.

COROLLARY 4.4.75 If $X$ and $Y$ are two Banach spaces, $X$ is embedded continuously and densely in $Y$, $\varphi : Y \longrightarrow \mathbb{R}$ is a locally Lipschitz function and $\widehat{\varphi} \stackrel{df}{=} \varphi|_X$, then
\[
\partial\widehat{\varphi}(x) = \partial\varphi(x) \quad \forall\, x \in X,
\]
which means that every element in $\partial\widehat{\varphi}(x)$ admits a unique extension to an element of $\partial\varphi(x)$.


PROPOSITION 4.4.76 (Nonsmooth Lagrange Rule) If $X$ is a Banach space, $\varphi, f : X \to \mathbb{R}$ are two locally Lipschitz functions and $x$ is a local solution of the problem
$$\inf_{f(x) \le 0} \varphi(x),$$
then there exist $\lambda_0, \lambda_1 \ge 0$, not both zero, such that
$$0 \in \lambda_0\, \partial\varphi(x) + \lambda_1\, \partial f(x) \quad \text{and} \quad 0 = \lambda_1 f(x).$$

REMARK 4.4.77 If
$$m(b) \stackrel{df}{=} \inf_{f(x) \le b} \varphi(x),$$
$m(0)$ is finite and
$$\liminf_{b \to 0} \frac{m(b) - m(0)}{|b|} > -\infty$$
(calmness condition), then $\lambda_0 > 0$ and so by normalization we can assume that $\lambda_0 = 1$.

Let us conclude this section by mentioning some more nonconvex subdifferentials. This can be done using the following parent notion.

DEFINITION 4.4.78 Let $X$ be a Banach space. A bornology is a collection $B$ of bounded, symmetric (with respect to the origin) subsets of $X$, whose union is $X$.

REMARK 4.4.79 By taking the collection of all finite symmetric sets, we have the so-called Gâteaux bornology denoted by $B_G$. Similarly if the collection consists of all bounded, symmetric sets, then we have the Fréchet bornology denoted by $B_F$. Finally if we consider all symmetric, compact sets, then the resulting bornology is the so-called Hadamard bornology denoted by $B_H$.

DEFINITION 4.4.80 Let $X$ be a Banach space and let $B$ be a bornology in $X$.

(a) The norm of $X$ is said to be $B$-smooth, if it is Gâteaux differentiable at every $x \in X \setminus \{0\}$ and the defining limit exists uniformly on members of $B$.

(b) A function $\varphi : X \to \mathbb{R}$ is said to be $B$-differentiable at $x \in X$ with $B$-derivative $\varphi'_B(x)$, if for every $C \in B$, we have
$$\lim_{\lambda \to 0}\, \sup_{h \in C} \frac{\varphi(x + \lambda h) - \varphi(x) - \lambda \langle \varphi'_B(x), h \rangle_X}{\lambda} = 0.$$

(c) Let $\varphi : X \to \mathbb{R}^* = \mathbb{R} \cup \{\pm\infty\}$ be a lower semicontinuous function and suppose that $\varphi(x) \in \mathbb{R}$. We say that $\varphi$ is $B$-subdifferentiable at $x$, if there exists $x^* \in X^*$, such that for every $\varepsilon > 0$ and every $C \in B$, there exists $\delta > 0$ for which we have
$$\langle x^*, h \rangle_X \le \frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} + \varepsilon \quad \forall\, \lambda \in (0, \delta),\ h \in C.$$
The elements $x^* \in X^*$ are called $B$-subderivatives of $\varphi$ at $x$ and the set of all $B$-subderivatives is the $B$-subdifferential of $\varphi$ at $x$ and it is denoted by $\partial_B \varphi(x)$.
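To see Definition 4.4.80(c) at work, the following sketch computes two $B$-subdifferentials at the origin of the real line. The choice $X = \mathbb{R}$, the two test functions and the assumption that the set $\{-1, 1\}$ belongs to $B$ are ours, made only for illustration.

```latex
% Illustrative example (assumptions: X = \mathbb{R}, C = \{-1, 1\} \in B, base point x = 0).
\begin{align*}
\varphi(x) = |x|: \quad
  & x^* h \le \frac{|\lambda h|}{\lambda} + \varepsilon = |h| + \varepsilon
    \ \ \forall\, h,\ \forall\, \varepsilon > 0
    \iff |x^*| \le 1,
  && \partial_B\varphi(0) = [-1, 1],\\
\psi(x) = -|x|: \quad
  & x^* h \le -|h| + \varepsilon \ \text{ for } h = \pm 1
    \ \Longrightarrow\ |x^*| \le -1 + \varepsilon,
  && \partial_B\psi(0) = \emptyset .
\end{align*}
```

The second line fails as soon as $\varepsilon < 1$, which is why a concave kink has no $B$-subderivative.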

REMARK 4.4.81 If in Definition 4.4.80(b), $B = B_G$ (respectively $B = B_F$), then $\varphi'_B(x) = \varphi'_G(x)$ (respectively $\varphi'_B(x) = \varphi'_F(x)$). If in Definition 4.4.80(c), $\varphi$ is $\overline{\mathbb{R}}$-valued, proper, convex, then $\partial_B \varphi(x) = \partial\varphi(x)$ (the convex subdifferential; see Definition 4.4.19). If in Definition 4.4.80(c), $\varphi$ is $\mathbb{R}$-valued, locally Lipschitz, then $\partial_B \varphi(x) = \partial\varphi(x)$ (the generalized subdifferential; see Definition 4.4.61). Finally if in Definition 4.4.80(c), we reverse the inequality and replace $\varepsilon$ by $-\varepsilon$, we obtain the $B$-superdifferential of $\varphi$ at $x$ denoted by $\partial^B \varphi(x)$. Note that $\partial^B(-\varphi)(x) = -\partial_B \varphi(x)$. The elements of $\partial^B \varphi(x)$ are called $B$-superderivatives of $\varphi$ at $x$.

PROPOSITION 4.4.82 Let $X$ be a Banach space, $B$ a bornology in $X$ and $\varphi : X \to \mathbb{R}^*$.

(a) If $\varphi$ is a lower semicontinuous function, $\varphi(x) \in \mathbb{R}$ and $\partial_B \varphi(x)$ and $\partial^B \varphi(x)$ are nonempty sets, then $\varphi$ is $B$-differentiable at $x$ and $\partial_B \varphi(x) = \varphi'_G(x)$.

(b) If $\varphi$ is a concave function which is continuous in a neighbourhood of $x$ and $\partial_B \varphi(x) \neq \emptyset$, then $\varphi$ is $B$-differentiable at $x$ and $\partial_B \varphi(x) = \{\varphi'_B(x)\}$.

(c) If the norm of $X$ is $B$-smooth and
$$T \stackrel{df}{=} \Big\{ s : X \to \mathbb{R} \ :\ s(x) = \frac{1}{2} \sum_{n=1}^{\infty} \mu_n \|x - z_n\|_X^2, \ \text{where } \mu_n \ge 0,\ \sum_{n=1}^{\infty} \mu_n = 1 \ \text{and } z_n \to z \in X \Big\},$$
then the elements of $T$ are everywhere $B$-differentiable.


PROOF (a) Let $x^* \in \partial_B \varphi(x)$ and $y^* \in \partial^B \varphi(x)$. Then for every $\varepsilon > 0$ and every $h \in X$, we have
$$\langle y^*, h \rangle_X - \langle x^*, h \rangle_X \le 2\varepsilon,$$
hence $x^* = y^*$. Denote the common value by $v^*$. Then clearly $v^* = \varphi'_B(x) = \varphi'_G(x)$.

(b) The function $-\varphi$ is convex and continuous at $x \in X$. Then by Proposition 4.4.25 and Remark 4.4.81, we have
$$\partial(-\varphi)(x) = \partial_B(-\varphi)(x) \neq \emptyset.$$
Since $\partial^B \varphi(x) = -\partial_B(-\varphi)(x)$, we have that $\partial^B \varphi(x) \neq \emptyset$. Also $\partial_B \varphi(x) \neq \emptyset$ by hypothesis. So we can apply part (a) and obtain the claim.

(c) Let $x \in X$. The sequence $\{x - z_n\}_{n \ge 1}$ is bounded in $X$. By hypothesis the norm of $X$ is $B_G$-smooth and so $X^*$ is strictly convex. Hence by Proposition 3.2.22, the duality map $F$ is single valued. Then from the definition of $s \in T$, the series
$$\sum_{n=1}^{\infty} \mu_n F(x - z_n)$$
is norm convergent in $X^*$ to some $x^* \in X^*$. It is straightforward to check that $x^* = s'_B(x)$.

Finally let us give some particular subdifferentials associated with bornologies.

EXAMPLE 4.4.83 Let $X$ be a Banach space and $B$ a bornology on $X$.

(a) Let $\varphi : X \to \overline{\mathbb{R}}$ be a proper, lower semicontinuous function and $x \in \operatorname{dom} \varphi$. The viscosity $B$-subdifferential at $x$ is the set $\partial_B \varphi(x)$ of all $x^* \in X^*$, such that there is a $B$-differentiable function $f$, such that $\varphi - f$ attains a local minimum at $x$ and $x^* = f'_B(x)$.

(b) If in the above case we specialize the perturbation function $f$, we obtain the so-called proximal subdifferential $\partial_p \varphi(x)$ of $\varphi$ at $x$. Assuming that the norm of $X$ is Fréchet differentiable (off the origin), if
$$f(y) \stackrel{df}{=} \langle x^*, y - x \rangle_X - k \|y - x\|_X^2,$$
for some $k > 0$ in (a), then we obtain the proximal subdifferential $\partial_p \varphi(x)$ for a proper, lower semicontinuous function $\varphi : X \to \overline{\mathbb{R}}$. So $x^* \in \partial_p \varphi(x)$ if and only if for some $k > 0$ and all $y$ in a neighbourhood of $x$, we have
$$\varphi(y) - \varphi(x) + k \|y - x\|_X^2 \ge \langle x^*, y - x \rangle_X .$$
This subdifferential is useful if $X$ is finite dimensional or if $X$ is a Hilbert space.

(c) Let $\varphi : X \to \overline{\mathbb{R}}$ be a proper, lower semicontinuous function and $x \in \operatorname{dom} \varphi$. The canonical $B$-subdifferential of $\varphi$ at $x$, denoted by $\partial_B^- \varphi(x)$, is the set of all $x^* \in X^*$, such that
$$\liminf_{\lambda \to 0}\, \inf_{h \in C} \frac{1}{\lambda} \big[ \varphi(x + \lambda h) - \varphi(x) - \lambda \langle x^*, h \rangle_X \big] \ge 0 \quad \forall\, C \in B.$$

REMARK 4.4.84 Of all the subdifferentials the proximal is the smallest. The viscosity $B$-subdifferential is not greater than the corresponding canonical subdifferential. The Fréchet viscosity and canonical subdifferentials coincide, if there exists a Lipschitz continuous, Fréchet differentiable bump function (see Deville, Godefroy & Zizler (1993); recall that $b : X \to \mathbb{R}$ is a bump function if it has a nonempty and bounded support). If $B_1$ and $B_2$ are two bornologies in $X$ and $B_2$ is finer than $B_1$ (i.e., for every $C_1 \in B_1$, we can find $C_2 \in B_2$, such that $C_1 \subseteq C_2$), then any $B_2$-subdifferential is not greater than the corresponding $B_1$-subdifferential.
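The first claim of the remark can be checked by hand on a one dimensional example in which the proximal subdifferential is empty although the function is differentiable. The function below is our own illustrative choice.

```latex
% Illustrative example (assumptions: X = \mathbb{R}, \varphi(x) = -|x|^{3/2}, base point x = 0).
% \varphi is differentiable at 0 with \varphi'(0) = 0, so x^* = 0 is the only candidate.
\begin{align*}
x^* \in \partial_p\varphi(0)
  &\iff -|y|^{3/2} + k\, y^2 \ge x^* y
    \quad \text{for some } k > 0 \text{ and all } y \text{ near } 0,\\
x^* = 0: \quad
  & -|y|^{3/2} + k\, y^2 \ge 0 \iff k\, |y|^{1/2} \ge 1,
    \quad \text{which fails as } y \to 0,\ y \ne 0 .
\end{align*}
% Hence \partial_p\varphi(0) = \emptyset, while the Fr\'echet subdifferential at 0 is \{0\}.
```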

4.5 Integral Functionals and Subdifferentials

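Let $(\Omega, \Sigma, \mu)$ be a σ-finite measure space and let $X$ be a separable Banach space. In this section we describe the subdifferential theory of the integral functionals
$$I_\varphi(u) = \int_\Omega \varphi\big(\omega, u(\omega)\big)\, d\mu,$$
where $\varphi : \Omega \times X \to \overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$ is a normal integrand (see Definition 3.4.8) and $u : \Omega \to X$ belongs in some vector space of functions. We adopt the convention that $+\infty + (-\infty) = +\infty$ (i.e., we let $+\infty$ dominate over $-\infty$). Then for a normal integrand $\varphi : \Omega \times X \to \overline{\mathbb{R}}$, we define the integral functional $I_\varphi : L^1(\Omega; X) \to \mathbb{R}^*$, by
$$I_\varphi(u) \stackrel{df}{=} \begin{cases} \displaystyle\int_\Omega \varphi\big(\omega, u(\omega)\big)\, d\mu & \text{if } \displaystyle\int_\Omega \varphi\big(\omega, u(\omega)\big)^+ d\mu < +\infty,\\[2mm] +\infty & \text{if } \displaystyle\int_\Omega \varphi\big(\omega, u(\omega)\big)^+ d\mu = +\infty. \end{cases}$$

Similarly, if $\varphi^*$ is the conjugate of $\varphi(\omega, \cdot)$, we define the integral functional $I_{\varphi^*} : L^\infty(\Omega; X^*_{w^*}) \to \mathbb{R}^*$, by
$$I_{\varphi^*}(v) \stackrel{df}{=} \begin{cases} \displaystyle\int_\Omega \varphi^*\big(\omega, v(\omega)\big)\, d\mu & \text{if } \displaystyle\int_\Omega \varphi^*\big(\omega, v(\omega)\big)^+ d\mu < +\infty,\\[2mm] +\infty & \text{if } \displaystyle\int_\Omega \varphi^*\big(\omega, v(\omega)\big)^+ d\mu = +\infty. \end{cases}$$

Recall that
$$\big(L^1(\Omega; X)\big)^* = L^\infty(\Omega; X^*_{w^*})$$
(see Theorem 2.2.12; for the Banach space $L^\infty(\Omega; X^*_{w^*})$ see Definition 2.2.10; note that Theorem 2.2.12 was stated with $\mu$ being a finite measure, but the result is true for $\mu$ being σ-finite, see Ionescu-Tulcea & Ionescu-Tulcea (1969, p. 95)). The duality brackets for the pair $\big(L^1(\Omega; X), L^\infty(\Omega; X^*_{w^*})\big)$ are defined by
$$\langle u, v \rangle_{L^1(\Omega; X)} \stackrel{df}{=} \int_\Omega \langle v(\omega), u(\omega) \rangle_X\, d\mu \quad \forall\, u \in L^1(\Omega; X),\ v \in L^\infty(\Omega; X^*_{w^*}).$$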

We mention that the theory can be developed for integral functionals $I_\varphi$ defined on $L^p(\Omega; X)$ with $p \in (1, +\infty)$. In this case
$$\big(L^p(\Omega; X)\big)^* = L^{p'}(\Omega; X^*_{w^*})$$
with $\frac{1}{p} + \frac{1}{p'} = 1$ and if $X^*$ is separable, then
$$\big(L^p(\Omega; X)\big)^* = L^{p'}(\Omega; X^*)$$
(see Theorem 2.2.9). First we need to know that the integral in the definition of $I_{\varphi^*}$ makes sense.

PROPOSITION 4.5.1 If $\varphi : \Omega \times X \to \overline{\mathbb{R}}$ is a normal integrand, then $\varphi^* : \Omega \times X^* \to \overline{\mathbb{R}}$ defined by
$$\varphi^*(\omega, x^*) \stackrel{df}{=} \sup_{x \in X} \big( \langle x^*, x \rangle_X - \varphi(\omega, x) \big)$$
is a convex normal integrand on $\Omega \times X^*_{w^*}$.

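PROOF Consider the multifunction $E : \Omega \to 2^{X \times \mathbb{R}} \setminus \{\emptyset\}$, defined by
$$E(\omega) \stackrel{df}{=} \operatorname{epi} \varphi(\omega, \cdot) = \big\{ (x, \lambda) \in X \times \mathbb{R} : \varphi(\omega, x) \le \lambda \big\}.$$
Evidently $E$ has nonempty and closed values (by virtue of the normality of $\varphi(\omega, \cdot)$). Moreover, we have
$$\operatorname{Gr} E = \big\{ (\omega, x, \lambda) \in \Omega \times X \times \mathbb{R} : \varphi(\omega, x) \le \lambda \big\} \in \Sigma \times B(X) \times B(\mathbb{R}) = \Sigma \times B(X \times \mathbb{R}),$$
where $B(Z)$ is the Borel σ-field of $Z$. So we can find two sequences $\{u_n : \Omega \to X\}_{n \ge 1}$ and $\{\lambda_n : \Omega \to \mathbb{R}\}_{n \ge 1}$ of Σ-measurable functions, such that
$$E(\omega) = \overline{\big\{ \big(u_n(\omega), \lambda_n(\omega)\big) \big\}_{n \ge 1}} \quad \forall\, \omega \in \Omega$$
(see Denkowski, Migórski & Papageorgiou (2003a, p. 433)). Note that
$$\varphi^*(\omega, x^*) = \sup_{n \ge 1} \big[ \langle x^*, u_n(\omega) \rangle_X - \lambda_n(\omega) \big].$$
Thus we conclude that the function $(\omega, x^*) \mapsto \varphi^*(\omega, x^*)$ is a convex normal integrand on $\Omega \times X^*_{w^*}$.

THEOREM 4.5.2 (a) If $I_\varphi : L^1(\Omega; X) \to \mathbb{R}^*$ is finite at $u_0 \in L^1(\Omega; X)$, then $(I_\varphi)^* = I_{\varphi^*}$.

(b) If $\varphi$ is a convex normal integrand (i.e., $\varphi(\omega, \cdot)$ is convex for µ-almost all $\omega \in \Omega$) and $I_\varphi$ and $I_{\varphi^*}$ are finite at $u_0 \in L^1(\Omega; X)$ and $v_0 \in L^\infty(\Omega; X^*_{w^*})$ respectively, then $I_\varphi$ and $I_{\varphi^*}$ are proper, convex and lower semicontinuous functionals which are conjugate to each other.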


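PROOF (a) Evidently it suffices to show that for all $v \in L^\infty(\Omega; X^*_{w^*})$, we have
$$\int_\Omega \varphi^*\big(\omega, v(\omega)\big)\, d\mu \le \sup_{u \in L^1(\Omega; X)} \Big( \langle v, u \rangle_{L^1(\Omega; X)} - I_\varphi(u) \Big) \tag{4.34}$$
(see Proposition 4.4.4). Let $\xi \in \mathbb{R}$ be such that $\xi < I_{\varphi^*}(v)$. We can obtain (4.34) if we show that there exists $u \in L^1(\Omega; X)$, such that
$$\langle v, u \rangle_{L^1(\Omega; X)} - I_\varphi(u) > \xi.$$
Since by hypothesis $I_\varphi(u_0)$ is finite, we can find $\vartheta_0 \in L^1(\Omega)$, such that
$$\langle v(\omega), u_0(\omega) \rangle_X - \varphi\big(\omega, u_0(\omega)\big) \ge \vartheta_0(\omega) \quad \text{for µ-a.a. } \omega \in \Omega. \tag{4.35}$$
We obtain
$$\varphi^*\big(\omega, v(\omega)\big) \ge \vartheta_0(\omega) \quad \text{for µ-a.a. } \omega \in \Omega.$$
We claim that there exists a function $\beta \in L^1(\Omega)$, such that
$$\xi < \int_\Omega \beta(\omega)\, d\mu \quad \text{and} \quad \beta(\omega) < \varphi^*\big(\omega, v(\omega)\big) \quad \text{for µ-a.a. } \omega \in \Omega.$$
Let $g \in L^1(\Omega)$, $g(\omega) > 0$ for µ-almost all $\omega \in \Omega$. If $I_{\varphi^*}(v)$ is finite, then let
$$\beta(\omega) \stackrel{df}{=} \varphi^*\big(\omega, v(\omega)\big) - \varepsilon g(\omega),$$
with $\varepsilon > 0$ sufficiently small (so that $\xi < \int_\Omega \beta(\omega)\, d\mu$). If $I_{\varphi^*}(v) = +\infty$, then we define
$$f_n(\omega) \stackrel{df}{=} \begin{cases} \min\big\{ n g(\omega),\ \tfrac{1}{2} \varphi^*\big(\omega, v(\omega)\big) \big\} & \text{if } \varphi^*\big(\omega, v(\omega)\big) > 0,\\[1mm] \varphi^*\big(\omega, v(\omega)\big) - g(\omega) & \text{if } \varphi^*\big(\omega, v(\omega)\big) \le 0. \end{cases}$$
Note that
$$f_n(\omega) \to \tfrac{1}{2} \varphi^*\big(\omega, v(\omega)\big) \quad \forall\, \omega \in \big\{ \omega \in \Omega : \varphi^*\big(\omega, v(\omega)\big) > 0 \big\}.$$
Hence by the monotone convergence theorem (see Theorem A.2.10), we have that
$$\lim_{n \to +\infty} \int_\Omega f_n(\omega)\, d\mu = +\infty$$
and so we can find $n_0 \ge 1$ large enough so that
$$\xi < \int_\Omega f_{n_0}(\omega)\, d\mu.$$
Therefore if we take $\beta = f_{n_0}$, we have
$$\beta(\omega) < \varphi^*\big(\omega, v(\omega)\big) \quad \text{for µ-a.a. } \omega \in \Omega.$$
Consider the multifunction $S : \Omega \to 2^X$, defined by
$$S(\omega) \stackrel{df}{=} \big\{ x \in X : \langle v(\omega), x \rangle_X - \varphi(\omega, x) \ge \beta(\omega) \big\} \quad \forall\, \omega \in \Omega.$$
Evidently $S$ has nonempty, closed values and
$$\operatorname{Gr} S = \big\{ (\omega, x) \in \Omega \times X : x \in S(\omega) \big\} \in \Sigma \times B(X).$$
By the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.32), we can find a Σ-measurable function $s : \Omega \to X$, such that
$$s(\omega) \in S(\omega) \quad \forall\, \omega \in \Omega.$$
Since $\mu$ is σ-finite, we can find $\Omega_0 \in \Sigma$ with $\mu(\Omega_0) < +\infty$, such that $s|_{\Omega_0}$ is bounded and
$$\int_{\Omega_0} \beta(\omega)\, d\mu + \int_{\Omega \setminus \Omega_0} \vartheta_0(\omega)\, d\mu > \xi.$$
We set
$$u(\omega) \stackrel{df}{=} \begin{cases} s(\omega) & \text{if } \omega \in \Omega_0,\\ u_0(\omega) & \text{if } \omega \in \Omega \setminus \Omega_0. \end{cases}$$
Evidently $u \in L^1(\Omega; X)$ and we have
$$\langle v(\omega), u(\omega) \rangle_X - \varphi\big(\omega, u(\omega)\big) \ge \beta(\omega) \quad \forall\, \omega \in \Omega_0$$
and
$$\langle v(\omega), u(\omega) \rangle_X - \varphi\big(\omega, u(\omega)\big) \ge \vartheta_0(\omega) \quad \forall\, \omega \in \Omega \setminus \Omega_0$$
(see (4.35)). Therefore
$$\int_\Omega \langle v(\omega), u(\omega) \rangle_X\, d\mu - \int_\Omega \varphi\big(\omega, u(\omega)\big)\, d\mu \ge \int_{\Omega_0} \beta(\omega)\, d\mu + \int_{\Omega \setminus \Omega_0} \vartheta_0(\omega)\, d\mu > \xi,$$
so $I_{\varphi^*}(v) = (I_\varphi)^*(v)$.

(b) Since $\varphi = \varphi^{**}$ (see Theorem 4.4.14), this follows at once from part (a).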
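A scalar instance of Theorem 4.5.2(a) can be worked out by hand, which may help to see how the conjugate passes under the integral sign. The measure space and integrand below are our own illustrative assumptions.

```latex
% Illustrative example (assumptions: \Omega = (0,1) with Lebesgue measure, X = \mathbb{R},
% \varphi(\omega, x) = \tfrac12 x^2, so that I_\varphi is finite at u_0 = 0).
\begin{align*}
\varphi^*(\omega, y) &= \sup_{x \in \mathbb{R}} \big( y x - \tfrac12 x^2 \big) = \tfrac12 y^2
  \quad (\text{attained at } x = y),\\
(I_\varphi)^*(v) &= \sup_{u \in L^1(0,1)} \int_0^1 \big( v u - \tfrac12 u^2 \big)\, d\omega
  = \int_0^1 \tfrac12 v(\omega)^2\, d\omega = I_{\varphi^*}(v)
  \quad \forall\, v \in L^\infty(0,1),
\end{align*}
% the supremum being attained at u = v \in L^\infty(0,1) \subseteq L^1(0,1).
```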


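Next we consider integral functionals defined on the Lebesgue-Bochner space $L^\infty(\Omega; X^*)$. So as before $(\Omega, \Sigma, \mu)$ is a σ-finite measure space, but now $X$ is a separable Banach space with a separable dual $X^*$. Recall that $X^*_{w^*}$ (i.e., the Banach space $X^*$ supplied with the $w^*$-topology) is a Souslin space (see Definition A.2.29(b)). It follows that $B(X^*) = B(X^*_{w^*})$ (see Denkowski, Migórski & Papageorgiou (2003a, p. 211)). Also let $\varphi : \Omega \times X^*_{w^*} \to \overline{\mathbb{R}}$ be a convex normal integrand. We consider the following integral functionals
$$I_\varphi(u) \stackrel{df}{=} \int_\Omega \varphi\big(\omega, u(\omega)\big)\, d\mu \quad \forall\, u \in L^1(\Omega; X^*) = L^1(\Omega; X^*_{w^*})$$
and
$$I_{\varphi^*}(v) \stackrel{df}{=} \int_\Omega \varphi^*\big(\omega, v(\omega)\big)\, d\mu \quad \forall\, v \in L^\infty(\Omega; X).$$
From Theorem 4.5.2(b), we know that if $\operatorname{dom} I_\varphi \neq \emptyset$ and $\operatorname{dom} I_{\varphi^*} \neq \emptyset$, then the functionals $I_\varphi$ and $I_{\varphi^*}$ are conjugate to each other. So we have
$$I_{\varphi^*}(v) = \sup_{u \in L^1(\Omega; X^*)} \big[ \langle v, u \rangle_{L^1(\Omega; X^*)} - I_\varphi(u) \big]$$
and
$$I_\varphi(u) = \sup_{v \in L^\infty(\Omega; X)} \big[ \langle u, v \rangle_{L^1(\Omega; X^*)} - I_{\varphi^*}(v) \big].$$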

What about $(I_{\varphi^*})^*$ defined on $\big(L^\infty(\Omega; X)\big)^*$? To get an expression for this conjugate, first we need to introduce the structure of $\big(L^\infty(\Omega; X)\big)^*$.

DEFINITION 4.5.3 Let $(\Omega, \Sigma, \mu)$ be a σ-finite measure space and let $Y$ be a Banach space.

(a) A functional $l \in \big(L^\infty(\Omega; Y)\big)^*$ is said to be absolutely continuous with respect to $\mu$, if
$$l(v) = \int_\Omega \langle u(\omega), v(\omega) \rangle_Y\, d\mu \quad \forall\, v \in L^\infty(\Omega; Y),$$
with $u \in L^1(\Omega; Y^*_{w^*})$. The function $u$ is said to be the density of $l$ with respect to $\mu$. We have
$$\|l\| = \|u\|_{L^1(\Omega; Y^*_{w^*})} = \int_\Omega \|u(\omega)\|_{Y^*}\, d\mu.$$
So we can identify an absolutely continuous functional $l \in \big(L^\infty(\Omega; Y)\big)^*$ with its density with respect to $\mu$.


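(b) A functional $l \in \big(L^\infty(\Omega; Y)\big)^*$ is said to be singular with respect to $\mu$, if there exists a decreasing sequence $\{C_n\}_{n \ge 1} \subseteq \Sigma$, such that $\mu(C_n) \searrow 0$ and $l$ is supported by $C_n$ for $n \ge 1$, that is if $v \in L^\infty(\Omega; Y)$ and $v$ vanishes on some $C_n$, then $l(v) = 0$.

REMARK 4.5.4 If $\mu$ is finite, then $l \in \big(L^\infty(\Omega; Y)\big)^*$ is singular if and only if for every $\varepsilon > 0$, we can find $A \in \Sigma$, such that $\mu(A) \le \varepsilon$ and $l$ is supported by $A$ (i.e., if $v \in L^\infty(\Omega; Y)$, $v|_A = 0$, then $l(v) = 0$). For a given $l \in \big(L^\infty(\Omega; Y)\big)^*$ and $A \in \Sigma$, we define $l^A \in \big(L^\infty(\Omega; Y)\big)^*$, by
$$l^A(v) \stackrel{df}{=} l(\chi_A v) \quad \forall\, v \in L^\infty(\Omega; Y).$$
It is easy to see that if $A, B \in \Sigma$ and $A \cap B = \emptyset$, then
$$\big\| l^{A \cup B} \big\|_{(L^\infty(\Omega; Y))^*} = \big\| l^A \big\|_{(L^\infty(\Omega; Y))^*} + \big\| l^B \big\|_{(L^\infty(\Omega; Y))^*}.$$

PROPOSITION 4.5.5 If $(\Omega, \Sigma, \mu)$ is a σ-finite measure space, $Y$ is a Banach space, $l \in \big(L^\infty(\Omega; Y)\big)^*$ and for every $\varepsilon > 0$, there exists $A \in \Sigma$, with
$$\mu(A) \le \varepsilon \quad \text{and} \quad \|l\|_{(L^\infty(\Omega; Y))^*} - \varepsilon \le \big\| l^A \big\|_{(L^\infty(\Omega; Y))^*},$$
then $l \in \big(L^\infty(\Omega; Y)\big)^*$ is singular with respect to $\mu$.

PROOF Let $\{A_n\}_{n \ge 1} \subseteq \Sigma$ be such that
$$\mu(A_n) \le \frac{1}{2^n} \quad \text{and} \quad \|l\|_{(L^\infty(\Omega; Y))^*} - \frac{1}{2^n} \le \big\| l^{A_n} \big\|_{(L^\infty(\Omega; Y))^*} \quad \forall\, n \ge 1.$$
Since $\|l\|_{(L^\infty(\Omega; Y))^*} = \big\| l^{A_n} \big\|_{(L^\infty(\Omega; Y))^*} + \big\| l^{A_n^c} \big\|_{(L^\infty(\Omega; Y))^*}$, we have
$$\big\| l^{A_n^c} \big\|_{(L^\infty(\Omega; Y))^*} \le \frac{1}{2^n} \quad \forall\, n \ge 1.$$
Let
$$C_n \stackrel{df}{=} \bigcup_{k = n+1}^{\infty} A_k \quad \forall\, n \ge 1.$$
We have
$$\mu(C_n) \le \frac{1}{2^n} \quad \forall\, n \ge 1.$$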


We claim that $l \in \big(L^\infty(\Omega; Y)\big)^*$ is supported by $C_n$ for all $n \ge 1$. To this end let $v \in L^\infty(\Omega; Y)$ with $v|_{C_n} = 0$. So for all $k \ge n + 1$, we have that
$$v = \chi_{A_k^c} v$$
and so
$$|l(v)| = \big| l^{A_k^c}(v) \big| \le \big\| l^{A_k^c} \big\|_{(L^\infty(\Omega; Y))^*} \|v\|_{L^\infty(\Omega; Y)} \le \frac{1}{2^k} \|v\|_{L^\infty(\Omega; Y)}.$$
Let $k \to +\infty$ to obtain that $l(v) = 0$ and so $l \in \big(L^\infty(\Omega; Y)\big)^*$ is singular.

PROPOSITION 4.5.6 If $(\Omega, \Sigma, \mu)$ is a finite measure space, $Y$ is a Banach space,
$$L^\infty(\Omega) \otimes Y = \operatorname{span} \big( \{ g y : g \in L^\infty(\Omega),\ y \in Y \} \big),$$
$l \in \big(L^\infty(\Omega; Y)\big)^*$ and
$$l(w) = 0 \quad \forall\, w \in L^\infty(\Omega) \otimes Y,$$
then $l$ is singular.

PROOF Let $Z$ be the subspace of $L^\infty(\Omega; Y)$ consisting of the equivalence classes of countably valued functions from $\Omega$ into $Y$. From Corollary 2.1.4, we know that $Z$ is dense in $L^\infty(\Omega; Y)$. So for a given $\varepsilon > 0$, we can find $z \in Z$ with $\|z\|_{L^\infty(\Omega; Y)} \le 1$, such that
$$\|l\|_{(L^\infty(\Omega; Y))^*} - \varepsilon \le l(z).$$
So there exist a sequence $\{A_m\}_{m \ge 1} \subseteq \Sigma$ of pairwise disjoint sets with $\Omega = \bigcup_{m=1}^{\infty} A_m$ and a sequence $\{x_m\}_{m \ge 1} \subseteq Y$, such that
$$z(\omega) = x_m \quad \forall\, \omega \in A_m.$$
Let $n \ge 1$ be large enough, such that
$$\mu\Big( \bigcup_{m=n}^{\infty} A_m \Big) \le \varepsilon.$$
Since
$$w(\cdot) = \sum_{m=1}^{n-1} \chi_{A_m}(\cdot)\, x_m \in L^\infty(\Omega) \otimes Y,$$
we have
$$l(z) = l\Big( \chi_{\bigcup_{m=n}^{\infty} A_m}\, z \Big) \ge \|l\|_{(L^\infty(\Omega; Y))^*} - \varepsilon.$$
Applying Proposition 4.5.5, we obtain that $l \in \big(L^\infty(\Omega; Y)\big)^*$ is singular with respect to $\mu$.

Let us state the theorem which characterizes the dual space $\big(L^\infty(\Omega; Y)\big)^*$. The decomposition produced by this theorem is analogous to the Lebesgue decomposition of measures. For a proof of the theorem we refer to Levin (1974).

THEOREM 4.5.7 If $(\Omega, \Sigma, \mu)$ is a σ-finite measure space, $Y$ is a Banach space and $L_s$ is the space of singular continuous linear functionals on $L^\infty(\Omega; Y)$, then $\big(L^\infty(\Omega; Y)\big)^*$ is isometrically isomorphic to $L^1(\Omega; Y^*_{w^*}) \oplus L_s$ and
$$\|l\|_{(L^\infty(\Omega; Y))^*} = \|u\|_{L^1(\Omega; Y^*_{w^*})} + \|l_s\|_{(L^\infty(\Omega; Y))^*},$$
where $l \in \big(L^\infty(\Omega; Y)\big)^*$, $u \in L^1(\Omega; Y^*_{w^*})$ and $l_s \in L_s$.

Now that we have a complete description of the dual space $\big(L^\infty(\Omega; Y)\big)^*$, we can return to our initial problem, namely the formula for $(I_{\varphi^*})^*$, defined on $\big(L^\infty(\Omega; X)\big)^*$. Recall that the mathematical setting is the following: $(\Omega, \Sigma, \mu)$ is a σ-finite measure space, $X$ is a separable Banach space with a separable dual $X^*$ and $\varphi : \Omega \times X^*_{w^*} \to \overline{\mathbb{R}}$ is a convex normal integrand. As we already mentioned $B(X^*) = B(X^*_{w^*})$.

We want to derive a formula for $(I_{\varphi^*})^* : \big(L^\infty(\Omega; X)\big)^* \to \overline{\mathbb{R}}$, defined by
$$(I_{\varphi^*})^*(l) \stackrel{df}{=} \sup_{v \in L^\infty(\Omega; X)} \big[ l(v) - I_{\varphi^*}(v) \big] \quad \forall\, l \in \big(L^\infty(\Omega; X)\big)^*.$$


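THEOREM 4.5.8 If the above hypotheses hold and $\operatorname{dom} I_\varphi$, $\operatorname{dom} I_{\varphi^*}$ are nonempty, then
$$(I_{\varphi^*})^*(l) = I_\varphi(u) + \sigma_{\operatorname{dom} I_{\varphi^*}}(l_s),$$
where $l \in \big(L^\infty(\Omega; X)\big)^*$, $u \in L^1(\Omega; X^*)$ is the density of $l$ with respect to $\mu$ and $l_s \in L_s$ is the singular part of $l$ with respect to $\mu$ (so we have that $l = u + l_s$ and $\|l\|_{(L^\infty(\Omega; X))^*} = \|u\|_{L^1(\Omega; X^*)} + \|l_s\|_{(L^\infty(\Omega; X))^*}$; see Theorem 4.5.7).

PROOF By definition, we have
$$\begin{aligned} (I_{\varphi^*})^*(l) &= \sup_{v \in \operatorname{dom} I_{\varphi^*}} \big[ \langle u, v \rangle_{L^\infty(\Omega; X)} + l_s(v) - I_{\varphi^*}(v) \big]\\ &\le \sup_{v \in \operatorname{dom} I_{\varphi^*}} \big[ \langle u, v \rangle_{L^\infty(\Omega; X)} - I_{\varphi^*}(v) \big] + \sup_{\hat v \in \operatorname{dom} I_{\varphi^*}} l_s(\hat v)\\ &= I_\varphi(u) + \sigma_{\operatorname{dom} I_{\varphi^*}}(l_s). \end{aligned} \tag{4.36}$$
Let $\{C_n\}_{n \ge 1} \subseteq \Sigma$ be a decreasing sequence, such that $\mu(C_n) \searrow 0$ and $l_s$ is supported by $C_n$ for every $n \ge 1$. Also let $v_0 \in \operatorname{dom} I_{\varphi^*}$, $\varepsilon > 0$ and $\xi \in \mathbb{R}$ be such that $\xi < I_\varphi(u)$. Note that
$$\varphi\big(\omega, u(\omega)\big) \ge \langle u(\omega), v_0(\omega) \rangle_X - \varphi^*\big(\omega, v_0(\omega)\big) \quad \text{for µ-a.a. } \omega \in \Omega$$
and so $I_\varphi(u) > -\infty$. Then for $n \ge 1$ large enough, we have
$$\int_{C_n} \big[ \langle u(\omega), v_0(\omega) \rangle_X - \varphi^*\big(\omega, v_0(\omega)\big) \big]\, d\mu \ge -\varepsilon \tag{4.37}$$
and
$$\int_{C_n^c} \varphi\big(\omega, u(\omega)\big)\, d\mu \ge \xi. \tag{4.38}$$
We have
$$\begin{aligned} (I_{\varphi^*})^*(l) &= \sup_{v \in L^\infty(\Omega; X)} \Big[ l(v) - \int_{C_n} \varphi^*\big(\omega, v(\omega)\big)\, d\mu - \int_{C_n^c} \varphi^*\big(\omega, v(\omega)\big)\, d\mu \Big]\\ &= \sup_{v \in L^\infty(\Omega \setminus C_n; X)} \Big[ \langle u, v \rangle_{L^\infty(\Omega; X)} - \int_{C_n^c} \varphi^*\big(\omega, v(\omega)\big)\, d\mu \Big]\\ &\quad + \sup_{\hat v \in L^\infty(C_n; X)} \Big[ \langle u, \hat v \rangle_{L^\infty(\Omega; X)} + l_s(\hat v) - \int_{C_n} \varphi^*\big(\omega, \hat v(\omega)\big)\, d\mu \Big]\\ &= \int_{C_n^c} \varphi\big(\omega, u(\omega)\big)\, d\mu + \sup_{\hat v \in L^\infty(C_n; X)} \Big[ l_s(\hat v) + \int_{C_n} \big[ \langle u(\omega), \hat v(\omega) \rangle_X - \varphi^*\big(\omega, \hat v(\omega)\big) \big]\, d\mu \Big]. \end{aligned}$$
Let $\hat v \stackrel{df}{=} \chi_{C_n} v_0$. Since $l_s(\chi_{C_n^c} v_0) = 0$, we have
$$l_s(v_0) = l_s(\chi_{C_n} v_0)$$
and so, using also (4.37) and (4.38), we obtain
$$\begin{aligned} (I_{\varphi^*})^*(l) &\ge \int_{C_n^c} \varphi\big(\omega, u(\omega)\big)\, d\mu + l_s(v_0) + \int_{C_n} \big[ \langle u(\omega), v_0(\omega) \rangle_X - \varphi^*\big(\omega, v_0(\omega)\big) \big]\, d\mu\\ &\ge \xi + l_s(v_0) - \varepsilon. \end{aligned}$$
Since $\varepsilon > 0$ was arbitrary, we let $\varepsilon \searrow 0$ and have
$$(I_{\varphi^*})^*(l) \ge \xi + l_s(v_0) \quad \forall\, v_0 \in \operatorname{dom} I_{\varphi^*}.$$
Since $\xi < I_\varphi(u)$ was also arbitrary, we obtain
$$(I_{\varphi^*})^*(l) \ge I_\varphi(u) + \sigma_{\operatorname{dom} I_{\varphi^*}}(l_s). \tag{4.39}$$
From (4.36) and (4.39), we conclude that
$$(I_{\varphi^*})^*(l) = I_\varphi(u) + \sigma_{\operatorname{dom} I_{\varphi^*}}(l_s).$$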


REMARK 4.5.9 If $\operatorname{dom} I_{\varphi^*} = L^\infty(\Omega; X)$, then $I_{\varphi^*}$ is continuous, in fact locally Lipschitz on $L^\infty(\Omega; X)$ (see Theorems 4.2.6 and 4.2.7). We will show that we can say more about the continuity of $I_{\varphi^*}$, when $\operatorname{dom} I_{\varphi^*} = L^\infty(\Omega; X)$ (see Proposition 4.5.14).

DEFINITION 4.5.10 Let $V$ and $W$ be two linear spaces.

(a) We say that $(V, W)$ form a dual pair (or dual system), if there exists a bilinear functional $b : V \times W \to \mathbb{R}$, written $b(v, w) = \langle v, w \rangle$, such that
(i) if $\langle v, w \rangle = 0$ for all $w \in W$, then $v = 0$;
(ii) if $\langle v, w \rangle = 0$ for all $v \in V$, then $w = 0$.

(b) For a given dual pair $(V, W)$ and a topology $\tau_V$ on $V$, we say that $\tau_V$ is compatible with the dual pair $(V, W)$, if $\tau_V$ is a locally convex vector topology and $(V_{\tau_V})^* = W$ (that is, if $v^* \in (V_{\tau_V})^*$, then $v^*(v) = \langle v, w \rangle$ for some $w \in W$ and all $v \in V$ and conversely). So we view $W$ as a subspace of the algebraic dual of $V$. Dually, a compatible topology $\tau_W$ on $W$ is one such that $(W_{\tau_W})^* = V$.

(c) The smallest topology on $V$ compatible with the dual pair $(V, W)$ is the weak topology denoted by $w(V, W)$. The largest topology on $V$ compatible with the dual pair is the Mackey topology denoted by $m(V, W)$.

REMARK 4.5.11 From the properties of the bilinear form $\langle \cdot, \cdot \rangle$, we see easily that a compatible topology is always Hausdorff. All compatible topologies have the same closed, convex sets and the same bounded sets (i.e., these properties are duality invariant). The Mackey topology $m(V, W)$ is the topology of uniform convergence on all balanced, convex, $w$-compact sets in $W$. Recall that $A \subseteq W$ is balanced if $\lambda A \subseteq A$ for all $|\lambda| \le 1$. If $V = X$ is a Banach space and $W = X^*$, then $w(V, W)$ is the usual weak topology on $X$ and $m(V, W)$ is the norm topology on $X$. On the other hand, if $V = X^*$ and $W = X$, then $w(V, W)$ is the weak$^*$-topology on $X^*$ and $m(V, W)$ is strictly smaller than the norm topology, unless $X$ is reflexive.

DEFINITION 4.5.12 Let $(V, W)$ be a dual pair, let $\tau_W$ be a compatible topology on $W$ and let $\varphi : W \to \mathbb{R}^*$ be a function. We say that $\varphi$ is $\tau_W$-inf-compact for the slope $v_0 \in V$, if for all $\lambda \in \mathbb{R}$, the set
$$\big\{ w \in W : \varphi(w) - \langle v_0, w \rangle \le \lambda \big\}$$
is $\tau_W$-compact. If $v_0 = 0$, then we simply say that $\varphi$ is $\tau_W$-inf-compact.

The next proposition establishes an interesting connection between continuity of $\varphi$ and $\tau_W$-inf-compactness of $\varphi^*$.


PROPOSITION 4.5.13 If $(V, W)$ is a dual pair, $w = w(V, W)$ and $\varphi \in \Gamma_0(V_w)$, then $\varphi$ is finite and $m(V, W)$-continuous at $v_0 \in V$ if and only if $\varphi^*$ is $w(W, V)$-inf-compact for the slope $v_0 \in V$.

Using Theorem 4.5.8 and Proposition 4.5.13, we infer the following result for the integral functional $I_{\varphi^*}$.

PROPOSITION 4.5.14 If the hypotheses of Theorem 4.5.8 hold with $\operatorname{dom} I_{\varphi^*} = L^\infty(\Omega; X)$, then

(a) $I_{\varphi^*}$ is a $m\big(L^\infty(\Omega; X), L^1(\Omega; X^*)\big)$-continuous function on $L^\infty(\Omega; X)$;

(b) $I_\varphi$ is a $w\big(L^1(\Omega; X^*), L^\infty(\Omega; X)\big)$-inf-compact function for every slope in $L^\infty(\Omega; X)$;

(c) we have
$$(I_{\varphi^*})^*(l) = \begin{cases} I_\varphi(u) & \text{if } l = u \in L^1(\Omega; X^*),\\ +\infty & \text{otherwise.} \end{cases}$$
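Part (c) can be read off from Theorem 4.5.8 in two lines, which we sketch here for orientation. The only facts used are that the support function of the whole space is the indicator of $\{0\}$ and that $I_\varphi(u) > -\infty$, as observed in the proof of Theorem 4.5.8.

```latex
% Sketch: part (c) from Theorem 4.5.8 when dom I_{\varphi^*} = L^\infty(\Omega; X).
\begin{align*}
\sigma_{\mathrm{dom}\, I_{\varphi^*}}(l_s)
  &= \sup_{v \in L^\infty(\Omega; X)} l_s(v)
   = \begin{cases} 0 & \text{if } l_s = 0,\\ +\infty & \text{if } l_s \neq 0, \end{cases}\\
(I_{\varphi^*})^*(l)
  &= I_\varphi(u) + \sigma_{\mathrm{dom}\, I_{\varphi^*}}(l_s)
   = \begin{cases} I_\varphi(u) & \text{if } l = u \in L^1(\Omega; X^*),\\
                   +\infty & \text{otherwise.} \end{cases}
\end{align*}
```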

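Another useful continuity result for the integral functional $I_\varphi$ is the following.

PROPOSITION 4.5.15 If $(\Omega, \Sigma, \mu)$ is a nonatomic finite measure space, $X$ is a separable Banach space, $\varphi : \Omega \times X \to \overline{\mathbb{R}}$ is a convex normal integrand, $p \in [1, +\infty)$ and $I_\varphi : L^p(\Omega; X) \to \overline{\mathbb{R}}$ is continuous at a point, then $I_\varphi$ is continuous everywhere.

PROOF Let $u_0 \in L^p(\Omega; X)$ be the point of continuity of $I_\varphi$. By considering if necessary the functional
$$x \mapsto \hat I_\varphi(x) \stackrel{df}{=} I_\varphi(u_0 + x) - I_\varphi(u_0),$$
we may assume without any loss of generality that $u_0 = 0$ and $I_\varphi(0) = 0$. So we can find $\delta > 0$, such that
$$I_\varphi(u) \le 1 \quad \forall\, \|u\|_{L^p(\Omega; X)} \le \delta.$$
Let $u \in L^p(\Omega; X)$ be arbitrary. Exploiting the nonatomicity of $\mu$ and the absolute continuity of the Lebesgue integral, we can find $\delta_1 > 0$ and pairwise disjoint sets $\{A_k\}_{k=1}^N \subseteq \Sigma$ with union $\Omega$, such that
$$\mu(A_k) \le \delta_1 \quad \text{and} \quad \big\| \chi_{A_k} u \big\|_{L^p(\Omega; X)} \le \delta \quad \forall\, k \in \{1, \ldots, N\}.$$
We have
$$\int_{A_k} \varphi\big(\omega, u(\omega)\big)\, d\mu = \int_\Omega \varphi\big(\omega, \chi_{A_k}(\omega) u(\omega)\big)\, d\mu - \int_{A_k^c} \varphi(\omega, 0)\, d\mu \le 1 + \xi,$$
for some $\xi > 0$ independent of $k \in \{1, \ldots, N\}$. Therefore, we conclude that
$$\sum_{k=1}^{N} \int_{A_k} \varphi\big(\omega, u(\omega)\big)\, d\mu = \int_\Omega \varphi\big(\omega, u(\omega)\big)\, d\mu = I_\varphi(u) < +\infty,$$
so $I_\varphi$ is continuous everywhere on $L^p(\Omega; X)$ (see Theorem 4.2.3).

Next we describe the subdifferential of the integral functional $I_\varphi$.

THEOREM 4.5.16 If $(\Omega, \Sigma, \mu)$ is a σ-finite measure space, $X$ is a separable Banach space, $\varphi : \Omega \times X \to \overline{\mathbb{R}}$ is a convex normal integrand and $I_\varphi : L^p(\Omega; X) \to \overline{\mathbb{R}}$, $p \in [1, +\infty)$ is finite for at least one $u_0 \in L^p(\Omega; X)$, then for every $u \in L^p(\Omega; X)$, we have that $u^* \in \partial I_\varphi(u) \subseteq L^{p'}(\Omega; X^*_{w^*})$ (with $\frac{1}{p} + \frac{1}{p'} = 1$) if and only if
$$u^*(\omega) \in \partial\varphi\big(\omega, u(\omega)\big) \quad \text{for µ-a.a. } \omega \in \Omega.$$

PROOF According to Proposition 4.4.21, we have that $u^* \in \partial I_\varphi(u)$ if and only if
$$I_\varphi(u) + (I_\varphi)^*(u^*) = \langle u^*, u \rangle_{L^p(\Omega; X)}$$
(see Remark 2.2.13). From Theorem 4.5.2(a), we know that $(I_\varphi)^* = I_{\varphi^*}$. So we have that $u^* \in \partial I_\varphi(u)$ if and only if
$$\int_\Omega \big[ \varphi\big(\omega, u(\omega)\big) + \varphi^*\big(\omega, u^*(\omega)\big) \big]\, d\mu = \int_\Omega \langle u^*(\omega), u(\omega) \rangle_X\, d\mu.$$
The result now follows from the fact that
$$\varphi\big(\omega, u(\omega)\big) + \varphi^*\big(\omega, u^*(\omega)\big) \ge \langle u^*(\omega), u(\omega) \rangle_X \quad \text{for µ-a.a. } \omega \in \Omega.$$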
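The pointwise characterization is easiest to appreciate on the $L^1$-norm, where it reproduces the familiar description of the subdifferential of a norm. The scalar setting below is our own illustrative assumption.

```latex
% Illustrative example (assumptions: X = \mathbb{R}, p = 1, \varphi(\omega, x) = |x|),
% so that I_\varphi(u) = \|u\|_{L^1(\Omega)}.
\begin{align*}
\partial\varphi(\omega, x) &=
  \begin{cases} \{\operatorname{sgn} x\} & \text{if } x \neq 0,\\ [-1, 1] & \text{if } x = 0, \end{cases}\\
u^* \in \partial I_\varphi(u) &\iff u^* \in L^\infty(\Omega),\quad
  u^*(\omega) = \operatorname{sgn} u(\omega) \ \text{ a.e. on } \{u \neq 0\},\quad
  |u^*(\omega)| \le 1 \ \text{ a.e. on } \{u = 0\}.
\end{align*}
```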


Before passing to the study of the subdifferentials of nonconvex integral functionals, let us prove a last result for convex subdifferentials. It concerns functionals defined on the space of continuous functions on a compact metric space $K$. So let $K$ be a compact metric space and consider the Banach space $C(K)$ (with the supremum norm). The Riesz-Markov representation theorem (see Theorem 2.3.41) says that $C(K)^* = M(K)$, where $M(K)$ is the Banach space of Radon measures, i.e., the space of all signed Borel measures which are of bounded variation with the norm given by the total variation, namely
$$\|\mu\| = \sup \Big\{ \sum_{k=1}^{N} |\mu(A_k)| \ :\ \bigcup_{k=1}^{N} A_k \subseteq K,\ A_k \cap A_i = \emptyset \text{ for } k \neq i,\ N \ge 1 \Big\}.$$
In what follows by $\langle \cdot, \cdot \rangle_{C(K)}$ we denote the duality brackets for the pair $\big(C(K), M(K)\big)$, i.e.,
$$\langle \mu, u \rangle_{C(K)} \stackrel{df}{=} \int_K u(x)\, d\mu(x) \quad \forall\, u \in C(K),\ \mu \in M(K).$$
We say that $\mu \in M(K)$ is positive, denoted by $\mu \ge 0$, if
$$\langle \mu, u \rangle_{C(K)} \ge 0 \quad \forall\, u \in C(K),\ u \ge 0.$$
If $e \in C(K)$ is the function, such that $e(x) = 1$ for all $x \in K$ and
$$\langle \mu, e \rangle_{C(K)} = 1,$$
then we say that the Radon measure $\mu \in M(K)$ has total mass one. A Radon measure $\mu \in M(K)$ vanishes in an open set $U \subseteq K$, if
$$\langle \mu, u \rangle_{C(K)} = 0 \quad \forall\, u \in C(K),\ \operatorname{supp} u \subseteq U;$$
recall that
$$\operatorname{supp} u \stackrel{df}{=} \overline{\{ x \in K : u(x) \neq 0 \}}.$$
By using a partition of unity, we can show that if $\mu$ vanishes in a collection of open sets $U_r$, then $\mu$ also vanishes on $\bigcup_r U_r$. Hence it follows that there exists a largest open set $\hat U$ where $\mu$ vanishes. Then the set $K \setminus \hat U$ is called the support of $\mu$ and is denoted by $\operatorname{supp} \mu$.


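LEMMA 4.5.17 If $u \in C(K)$, $\mu \in M(K)$ and $u|_{\operatorname{supp} \mu} = 0$, then $\langle \mu, u \rangle_{C(K)} = 0$.

PROOF If $d_K$ is a metric on $K$, for each $\varepsilon > 0$, let
$$(\operatorname{supp} \mu)_\varepsilon \stackrel{df}{=} \big\{ x \in K : d_K(x, \operatorname{supp} \mu) < \varepsilon \big\}.$$
Using Urysohn's lemma (see Theorem A.1.13), for every $n \ge 1$, we can find $\vartheta_n \in C(K)$, such that
$$\vartheta_n|_{(\operatorname{supp} \mu)_{1/n}} \equiv 0 \quad \text{and} \quad \vartheta_n|_{((\operatorname{supp} \mu)_{2/n})^c} \equiv 1.$$
Then $\vartheta_n u \to u$ in $C(K)$ and so $\langle \mu, \vartheta_n u \rangle_{C(K)} \to \langle \mu, u \rangle_{C(K)}$. Note that
$$\operatorname{supp} \vartheta_n u \subseteq \hat U = K \setminus \operatorname{supp} \mu \quad \forall\, n \ge 1.$$
Hence $\langle \mu, \vartheta_n u \rangle_{C(K)} = 0$ and so we conclude that $\langle \mu, u \rangle_{C(K)} = 0$.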

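PROPOSITION 4.5.18 If $\xi : C(K) \to \mathbb{R}$ is defined by
$$\xi(u) \stackrel{df}{=} \max_{x \in K} u(x),$$
then $\xi$ is continuous, convex and for each $u \in C(K)$, we have that $\mu \in \partial\xi(u)$ if and only if
$$\mu \ge 0, \quad \langle \mu, e \rangle_{C(K)} = 1 \quad \text{and} \quad \operatorname{supp} \mu \subseteq \big\{ x \in K : \xi(u) = u(x) \big\}.$$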
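A finite instance of the proposition may help to fix ideas: on a three point space the statement reduces to the classical formula for the subdifferential of the coordinate maximum. The space and the two vectors below are our own illustrative choices.

```latex
% Illustrative example (assumption: K = \{1, 2, 3\} with the discrete metric),
% so that C(K) = \mathbb{R}^3, M(K) = \mathbb{R}^3 and \xi(u) = \max\{u_1, u_2, u_3\}.
\begin{align*}
u = (1, 3, 3): \quad & \xi(u) = 3, \qquad \{x \in K : \xi(u) = u(x)\} = \{2, 3\},\\
& \partial\xi(u) = \{\mu = (0, t, 1 - t) : t \in [0, 1]\},\\
u = (1, 2, 3): \quad & \partial\xi(u) = \{(0, 0, 1)\}
  \quad (\xi \text{ is differentiable at } u).
\end{align*}
```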

PROOF The convexity of $\xi$ is clear. To establish the continuity of $\xi$, we argue as follows. Let $u, v \in C(K)$ and $x_0 \in K$ be such that
$$\xi(u) = \max_{x \in K} u(x) = u(x_0).$$
We have
$$\xi(u) - \xi(v) \le u(x_0) - v(x_0) \le \|u - v\|_\infty.$$
Reversing the roles of $u$ and $v$ in the above argument, we conclude that
$$|\xi(u) - \xi(v)| \le \|u - v\|_\infty,$$
i.e., $\xi$ is Lipschitz continuous. Now let us prove the description of $\partial\xi$.

(a) Let $\mu \in \partial\xi(u)$. Then we have
$$\langle \mu, v - u \rangle_{C(K)} \le \xi(v) - \xi(u) \quad \forall\, v \in C(K). \tag{4.40}$$
Let $g \in C(K)$, $g \ge 0$ and let us set $v \stackrel{df}{=} u - g$. From (4.40), we have
$$-\langle \mu, g \rangle_{C(K)} \le \max_{x \in K} (u - g)(x) - \max_{x \in K} u(x) \le 0,$$
so $\langle \mu, g \rangle_{C(K)} \ge 0$. Also let $c \in \mathbb{R}$ and let us set $v \stackrel{df}{=} u + ce$. From (4.40), we have
$$c \langle \mu, e \rangle_{C(K)} \le \max_{x \in K} (u + ce)(x) - \max_{x \in K} u(x) = c$$
(recall that $e \equiv 1$). Since $c \in \mathbb{R}$ was arbitrary, we obtain $\langle \mu, e \rangle_{C(K)} = 1$. Next we show that
$$\operatorname{supp} \mu \subseteq C \stackrel{df}{=} \big\{ x \in K : \xi(u) = u(x) \big\}.$$
It suffices to show that $\mu$ vanishes in any open set $U \subseteq K \setminus C$. To this end let $g \in C(K)$ be such that $\operatorname{supp} g \subseteq U$. Also let
$$\eta \stackrel{df}{=} \xi(u) - \max_{x \in \operatorname{supp} g} u(x) > 0.$$
We choose $\varepsilon > 0$ so that
$$\pm \varepsilon g(x) < \eta \quad \forall\, x \in K.$$
Then
$$u(x) \pm \varepsilon g(x) < \xi(u) \quad \forall\, x \in \operatorname{supp} g$$
and $\xi(u \pm \varepsilon g) = \xi(u)$. So if in (4.40), we set $v \stackrel{df}{=} u \pm \varepsilon g$, we obtain $\pm \varepsilon \langle \mu, g \rangle_{C(K)} \le 0$, i.e., $\langle \mu, g \rangle_{C(K)} = 0$, so
$$\operatorname{supp} \mu \subseteq \big\{ x \in K : \xi(u) = u(x) \big\}.$$

(b) Conversely, let $g \stackrel{df}{=} u - \xi(u) e$. Note that
$$g(x) = 0 \quad \forall\, x \in \operatorname{supp} \mu.$$
Using Lemma 4.5.17, we obtain $\langle \mu, g \rangle_{C(K)} = 0$ and so $\langle \mu, u \rangle_{C(K)} = \xi(u)$. Hence if $v \in C(K)$, we have
$$\xi(v) - \xi(u) = \xi(v) - \langle \mu, u \rangle_{C(K)}. \tag{4.41}$$
Let $g \stackrel{df}{=} \xi(v) e - v$. Evidently $g \ge 0$ and so $\langle \mu, g \rangle_{C(K)} \ge 0$, hence $\xi(v) \ge \langle \mu, v \rangle_{C(K)}$. Using this in (4.41), we obtain
$$\xi(v) - \xi(u) \ge \langle \mu, v - u \rangle_{C(K)} \quad \forall\, v \in C(K),$$
so $\mu \in \partial\xi(u)$.

Now we consider nonconvex locally Lipschitz integral functionals. The mathematical setting is the following: $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is a separable Banach space and $\varphi : \Omega \times X \to \mathbb{R}$ is a measurable function. We consider the integral functional $I_\varphi : L^p(\Omega; X) \to \mathbb{R}$, $p \in [1, +\infty)$, defined by
$$I_\varphi(u) \stackrel{df}{=} \int_\Omega \varphi\big(\omega, u(\omega)\big)\, d\mu \quad \forall\, u \in L^p(\Omega; X).$$

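Our goal is to describe $\partial I_\varphi(u)$ under one of the following two hypotheses:

$H(\varphi)_1$ We have
$$|\varphi(\omega, x) - \varphi(\omega, y)| \le k(\omega) \|x - y\|_X,$$
for µ-almost all $\omega \in \Omega$ and all $x, y \in X$, with $k \in L^{p'}(\Omega)$, $\frac{1}{p} + \frac{1}{p'} = 1$.

$H(\varphi)_2$ For µ-almost all $\omega \in \Omega$, the function $\varphi(\omega, \cdot)$ is locally Lipschitz and
$$\|x^*\|_{X^*} \le a(\omega) \big( 1 + \|x\|_X^{p-1} \big),$$
for µ-almost all $\omega \in \Omega$, all $x \in X$ and all $x^* \in \partial\varphi(\omega, x)$, with $a \in L^\infty(\Omega)$.

THEOREM 4.5.19 If $\varphi : \Omega \times X \to \mathbb{R}$ is a measurable function and satisfies either hypothesis $H(\varphi)_1$ or $H(\varphi)_2$, then $I_\varphi$ is Lipschitz continuous on bounded sets of $L^p(\Omega; X)$ and if $u^* \in \partial I_\varphi(u)$, then
$$u^*(\omega) \in \partial\varphi\big(\omega, u(\omega)\big) \quad \text{for µ-a.a. } \omega \in \Omega.$$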


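PROOF Case 1. First suppose that hypothesis $H(\varphi)_1$ holds.

Then for $u, v \in L^p(\Omega; X)$, we have
$$\begin{aligned} |I_\varphi(u) - I_\varphi(v)| &\le \int_\Omega \big| \varphi\big(\omega, u(\omega)\big) - \varphi\big(\omega, v(\omega)\big) \big|\, d\mu\\ &\le \int_\Omega k(\omega) \|u(\omega) - v(\omega)\|_X\, d\mu\\ &\le \|k\|_{p'} \|u - v\|_{L^p(\Omega; X)}, \end{aligned}$$
so $I_\varphi$ is Lipschitz continuous (globally) on $L^p(\Omega; X)$.

Case 2. Next suppose that hypothesis $H(\varphi)_2$ holds.

We show that $I_\varphi$ is Lipschitz continuous on bounded sets of $L^p(\Omega; X)$. Let $u, v \in L^p(\Omega; X)$ be such that
$$\|u\|_{L^p(\Omega; X)} \le r \quad \text{and} \quad \|v\|_{L^p(\Omega; X)} \le r.$$
Using Theorem 4.4.72, we can find
$$w(\omega) \in \big[ u(\omega), v(\omega) \big] \stackrel{df}{=} \big\{ (1 - \lambda) u(\omega) + \lambda v(\omega) : \lambda \in [0, 1] \big\}$$
and $w^*(\omega) \in \partial\varphi\big(\omega, w(\omega)\big)$, such that
$$\varphi\big(\omega, v(\omega)\big) - \varphi\big(\omega, u(\omega)\big) = \langle w^*(\omega), v(\omega) - u(\omega) \rangle_X \quad \text{for µ-a.a. } \omega \in \Omega. \tag{4.42}$$
By virtue of hypothesis $H(\varphi)_2$, we have that
$$\|w^*(\omega)\|_{X^*} \le a(\omega) \big( 1 + \|w(\omega)\|_X^{p-1} \big) \le \hat a(\omega) \big( 1 + \|u(\omega)\|_X^{p-1} + \|v(\omega)\|_X^{p-1} \big) \quad \text{for µ-a.a. } \omega \in \Omega,$$
with $\hat a \in L^\infty(\Omega)$. Let
$$\eta(\omega) \stackrel{df}{=} \hat a(\omega) \big( 1 + \|u(\omega)\|_X^{p-1} + \|v(\omega)\|_X^{p-1} \big). \tag{4.43}$$
Then $\eta \in L^{p'}(\Omega)_+$ and $\|\eta\|_{p'} \le c$, where $c > 0$ depends on $\|\hat a\|_\infty$ and on $r > 0$. From (4.42) and (4.43), it follows that
$$|I_\varphi(v) - I_\varphi(u)| \le c \|v - u\|_{L^p(\Omega; X)},$$
so $I_\varphi$ is Lipschitz continuous on bounded sets.

Now let $u^* \in \partial I_\varphi(u) \subseteq L^{p'}(\Omega; X^*_{w^*})$. Then using Fatou's lemma (see Theorem A.2.1), we have
$$\langle u^*, h \rangle_{L^p(\Omega; X)} \le (I_\varphi)^0(u; h) \le \int_\Omega \varphi^0\big(\omega, u(\omega); h(\omega)\big)\, d\mu \quad \forall\, h \in L^p(\Omega; X),$$
so
$$\int_\Omega \big[ \varphi^0\big(\omega, u(\omega); h(\omega)\big) - \langle u^*(\omega), h(\omega) \rangle_X \big]\, d\mu \ge 0 \quad \forall\, h \in L^p(\Omega; X).$$
Let $h \stackrel{df}{=} \chi_A z$, with $A \in \Sigma$, $z \in X$. We obtain
$$\int_A \big[ \varphi^0\big(\omega, u(\omega); z\big) - \langle u^*(\omega), z \rangle_X \big]\, d\mu \ge 0.$$
Since $A \in \Sigma$ is arbitrary, we infer that
$$\langle u^*(\omega), z \rangle_X \le \varphi^0\big(\omega, u(\omega); z\big) \quad \text{for µ-a.a. } \omega \in \Omega \tag{4.44}$$
and the exceptional µ-null set is independent of $z \in X$ since $X$ is separable. Since $z \in X$ is arbitrary, we conclude that
$$u^*(\omega) \in \partial\varphi\big(\omega, u(\omega)\big) \quad \text{for µ-a.a. } \omega \in \Omega.$$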

REMARK 4.5.20 Note that under hypothesis H(ϕ)1 we in fact proved that Iϕ is Lipschitz continuous (globally).
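For a smooth integrand the conclusion of Theorem 4.5.19 pins down $u^*$ completely, as the following scalar sketch indicates. The integrand is our own illustrative choice.

```latex
% Illustrative example (assumptions: X = \mathbb{R}, \varphi(\omega, x) = k(\omega) \sin x
% with k \in L^{p'}(\Omega)).
\begin{align*}
|\varphi(\omega, x) - \varphi(\omega, y)| &\le |k(\omega)|\, |x - y|
  \quad (\text{so } H(\varphi)_1 \text{ holds with } |k| \text{ in place of } k),\\
\partial\varphi(\omega, x) &= \{ k(\omega) \cos x \}
  \quad (\varphi(\omega, \cdot) \text{ is } C^1),\\
u^* \in \partial I_\varphi(u) &\ \Longrightarrow\ u^*(\omega) = k(\omega) \cos u(\omega)
  \quad \text{for } \mu\text{-a.a. } \omega \in \Omega .
\end{align*}
```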

4.6 Variational Principles

Suppose that $X$ is a Banach space, $C \subseteq X$ is a nonempty, noncompact set and $\varphi : X \to \overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$ is a proper, lower semicontinuous function, which is bounded below. Then the problem
$$\inf_{x \in C} \varphi(x)$$
need not have a solution. If $X = \mathbb{R}^N$, the situation can be remedied by considering a suitable small perturbation of $\varphi$. More specifically, for simplicity let $C = \mathbb{R}^N$,
$$m = \inf_{x \in X} \varphi(x), \quad \varepsilon > 0$$
and take $x_0 \in \mathbb{R}^N$ to be such that $\varphi(x_0) \le m + \varepsilon$. Consider the function
$$\varphi_\varepsilon(x) \stackrel{df}{=} \varphi(x) + \varepsilon \|x - x_0\|_{\mathbb{R}^N}.$$
Evidently $\varphi_\varepsilon : \mathbb{R}^N \to \overline{\mathbb{R}}$ is proper, lower semicontinuous and in addition $\varphi_\varepsilon$ is weakly coercive, i.e.,
$$\varphi_\varepsilon(x) \to +\infty \quad \text{as } \|x\|_{\mathbb{R}^N} \to +\infty.$$
So invoking the Weierstrass theorem, we infer that $\varphi_\varepsilon$ attains its infimum at a point $y \in \mathbb{R}^N$. Note that $\|y - x_0\|_{\mathbb{R}^N} \le 1$. Indeed, if $\|y - x_0\|_{\mathbb{R}^N} > 1$, we have
$$\varphi_\varepsilon(y) = \varphi(y) + \varepsilon \|y - x_0\|_{\mathbb{R}^N} > \varphi(y) + \varepsilon \ge m + \varepsilon \ge \varphi(x_0) = \varphi_\varepsilon(x_0) \ge \varphi_\varepsilon(y),$$
a contradiction. Also
$$\varphi_\varepsilon(y) \le \varphi_\varepsilon(x) \quad \forall\, x \in \mathbb{R}^N,$$
hence
$$\varphi(y) + \varepsilon \|y - x_0\|_{\mathbb{R}^N} \le \varphi(x) + \varepsilon \|x - x_0\|_{\mathbb{R}^N},$$
so
$$\varphi(y) \le \varphi(x) + \varepsilon \|x - y\|_{\mathbb{R}^N}.$$
So this argument shows that for a given $\varepsilon > 0$ and $x_0 \in \mathbb{R}^N$ satisfying
$$\varphi(x_0) \le \inf_{x \in X} \varphi(x) + \varepsilon$$
(i.e., $x_0 \in \mathbb{R}^N$ is an ε-minimizer), we can find $y \in \mathbb{R}^N$, such that $\|y - x_0\|_{\mathbb{R}^N} \le 1$ and the function $x \mapsto \varphi(x) + \varepsilon \|x - y\|_{\mathbb{R}^N}$ attains its infimum at $y \in \mathbb{R}^N$.

The main analytical tool in this argument was the theorem of Weierstrass, which guarantees a minimizer for a proper, lower semicontinuous, bounded from below function with at least one bounded sublevel set (this is the case if for example the function is weakly coercive). Evidently this is an essentially finite dimensional situation. In an infinite dimensional space for this to work we need extra conditions, such as the reflexivity of $X$ and the weak lower semicontinuity of the function. In general the argument fails. Nevertheless, we can salvage the principle formulated above. Namely if $x_0 \in X$ is an ε-minimizer of $\varphi : X \to \overline{\mathbb{R}}$, which is proper, lower semicontinuous, bounded from below, then a small Lipschitz perturbation of $\varphi$ attains a strict minimum at a point $y \in X$, which is relatively close to $x_0$ (i.e., we can find a Lipschitz continuous function $h : X \to \mathbb{R}$ with a small Lipschitz constant, such that $\varphi + h$ attains a strict minimum at $y \in X$). In fact this principle can be formulated in any complete metric space. This is the essence of the so-called Ekeland variational principle and its extensions. This result turned out to be an essential tool in many different areas of nonlinear analysis.

THEOREM 4.6.1 If $(X, d_X)$ is a complete metric space, $\varphi : X \to \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function, $\varepsilon > 0$ and $x_0 \in X$ satisfies
$$\varphi(x_0) \le \inf_{x \in X} \varphi(x) + \varepsilon,$$
then for a given $\lambda > 0$, we can find $y_\lambda \in X$, such that
(a) $\varphi(y_\lambda) \le \varphi(x_0)$;
(b) $d_X(y_\lambda, x_0) \le \lambda$;
(c) $\varphi(y_\lambda) < \varphi(x) + \frac{\varepsilon}{\lambda}\, d_X(x, y_\lambda)$ for all $x \neq y_\lambda$.


PROOF By replacing $\varphi$ with
\[ \widehat{\varphi}(x) \stackrel{df}{=} \frac{1}{\varepsilon}\, \varphi(x) \]
and $d_X(\cdot,\cdot)$ with
\[ d_\lambda(\cdot,\cdot) \stackrel{df}{=} \frac{1}{\lambda}\, d_X(\cdot,\cdot), \]
without any loss of generality, we may assume that $\varepsilon = \lambda = 1$. On $X$ we define a relation by
\[ [\, x \le z \,] \stackrel{df}{\iff} [\, \varphi(x) \le \varphi(z) - d_X(x, z) \,]. \]
Evidently $x \le x$ (i.e., the relation $\le$ is reflexive). Also, if $x \le z$, we have $\varphi(x) \le \varphi(z) - d_X(x, z)$ and if $z \le v$, we have $\varphi(z) \le \varphi(v) - d_X(z, v)$. Thus, by the triangle inequality, it follows that
\[ \varphi(x) \le \varphi(v) - [\, d_X(x, z) + d_X(z, v) \,] \le \varphi(v) - d_X(x, v). \]
Therefore $x \le v$ (i.e., the relation $\le$ is transitive). Finally, if $x \le z$ and $z \le x$, we obtain $d_X(x, z) = 0$, hence $x = z$ (i.e., the relation $\le$ is antisymmetric). So we conclude that the relation $\le$ is a partial order.

Inductively we define a sequence $\{S_n\}_{n \ge 1}$ of subsets of $X$ as follows. Let $x_1 = x_0$ and
\[ S_1 \stackrel{df}{=} \{ z \in X : z \le x_1 \}, \qquad x_2 \in S_1 \ \text{is such that} \ \varphi(x_2) \le \inf_{x \in S_1} \varphi(x) + \frac{1}{2^2}, \]
and for the induction step, let
\[ S_n \stackrel{df}{=} \{ z \in X : z \le x_n \}, \qquad x_{n+1} \in S_n \ \text{is such that} \ \varphi(x_{n+1}) \le \inf_{x \in S_n} \varphi(x) + \frac{1}{2^{n+1}}. \]


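Since $x_{n+1} \le x_n$, we have that $S_{n+1} \subseteq S_n$ for $n \ge 1$ and by virtue of the lower semicontinuity of $\varphi$, we have that $S_n$ is closed. If $z \in S_{n+1}$, we have that $z \le x_{n+1} \le x_n$ and so
\[ d_X(z, x_{n+1}) \le \varphi(x_{n+1}) - \varphi(z) \le \inf_{x \in S_n} \varphi(x) + \frac{1}{2^{n+1}} - \varphi(z) \le \varphi(z) + \frac{1}{2^{n+1}} - \varphi(z) = \frac{1}{2^{n+1}}, \]
so
\[ \operatorname{diam} S_{n+1} \le \frac{1}{2^n} \quad \forall\, n \ge 1, \]
i.e., $\operatorname{diam} S_n \longrightarrow 0$. Because $(X, d_X)$ is complete, by Cantor's theorem (see Theorem A.1.11), we have that
\[ \bigcap_{n=1}^{\infty} S_n = \{y\}. \]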

Since $y \in S_1$, we have $y \le x_1 = x_0$ and so
\[ \varphi(y) \le \varphi(x_0) - d_X(y, x_0) \le \varphi(x_0), \]
i.e., (a) holds. Also we have (recall $\varepsilon = \lambda = 1$)
\[ d_X(y, x_0) \le \varphi(x_0) - \varphi(y) \le \inf_{x \in X} \varphi(x) + 1 - \inf_{x \in X} \varphi(x) = 1, \]
i.e., (b) holds. Finally, to prove (c), we need to show that $z \le y$ implies $z = y$. Indeed, if $z \le y$, then $z \le x_n$ for all $n \ge 1$ (since $y \in S_n$ and $\le$ is transitive), hence
\[ z \in \bigcap_{n=1}^{\infty} S_n, \]
which implies that $z = y$. So for every $x \ne y$ the relation $x \le y$ fails, i.e., $\varphi(y) < \varphi(x) + d_X(x, y)$, which is (c).

REMARK 4.6.2 Note that in the above theorem conclusions (b) and (c) are somehow complementary and the choice of $\lambda > 0$ allows us to strike a balance between them, depending on the application we have in mind. If $\lambda > 0$ is large, then (b) provides little information on the whereabouts of $y_\lambda$, while (c) tells us that $y_\lambda$ is close to being a global minimizer of $\varphi$. The opposite situation occurs when $\lambda > 0$ is small. Then (b) implies that $y_\lambda$ is close to $x_0$, but the inequality in (c) gives us little information. Two particular cases are of interest. The first corresponds to $\lambda = 1$, $\varepsilon > 0$ and the second to $\lambda = \sqrt{\varepsilon}$, $\varepsilon > 0$. In the first case we are not interested in conclusion (b) (i.e., we are not interested in how $y_\lambda$ is located with respect to $x_0$). In the second case we are interested in both (b) and (c). We state these two particular cases as corollaries.
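The order relation used in the proof of Theorem 4.6.1 can be tried out numerically. The following is only a sketch under simplifying assumptions: the space is a finite grid of real numbers (so completeness is trivial and the descent stops after finitely many steps), and the function `phi`, the grid and the name `ekeland_point` are hypothetical choices made for illustration, not part of the text.

```python
import math

def ekeland_point(points, phi, dist, x0, eps, lam):
    """Descend along the order  z <= x  iff  phi(z) <= phi(x) - (eps/lam)*dist(z, x).

    On a finite set every strict descent lowers phi, so the loop stops at a
    point with nothing strictly below it in this order.
    """
    c = eps / lam
    y = x0
    while True:
        # points strictly below y in the partial order
        below = [z for z in points
                 if z != y and phi(z) <= phi(y) - c * dist(z, y)]
        if not below:
            return y
        # any element of `below` would do; take the one with the least value
        y = min(below, key=phi)

# Hypothetical test problem: an oscillating function sampled on a grid in [-3, 3].
grid = [i / 100 for i in range(-300, 301)]
phi = lambda x: x * x + 0.3 * math.sin(25 * x)
dist = lambda a, b: abs(a - b)

x0 = 0.5
eps = phi(x0) - min(phi(z) for z in grid)   # x0 is an eps-minimizer on the grid
lam = math.sqrt(eps)                        # the choice lambda = sqrt(eps)
y = ekeland_point(grid, phi, dist, x0, eps, lam)
```

By transitivity of the order, the returned point satisfies (a) and (b) relative to `x0`, and the stopping condition of the loop is exactly (c). In an infinite space the descent need not stop, which is why the proof uses the shrinking sets $S_n$ and completeness instead.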


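COROLLARY 4.6.3 If $(X, d_X)$ is a complete metric space and $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function, then for every $\varepsilon > 0$, we can find $y_\varepsilon \in X$, such that
(a) $\varphi(y_\varepsilon) \le \inf_{x \in X} \varphi(x) + \varepsilon$;
(b) $\varphi(y_\varepsilon) < \varphi(x) + \varepsilon\, d_X(x, y_\varepsilon)$ for all $x \ne y_\varepsilon$.

COROLLARY 4.6.4 If $(X, d_X)$ is a complete metric space, $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function, $\varepsilon > 0$ and $x_\varepsilon \in X$ satisfies
\[ \varphi(x_\varepsilon) \le \inf_{x \in X} \varphi(x) + \varepsilon, \]
then we can find $y_\varepsilon \in X$, such that
(a) $\varphi(y_\varepsilon) \le \varphi(x_\varepsilon)$;
(b) $d_X(y_\varepsilon, x_\varepsilon) \le \sqrt{\varepsilon}$;
(c) $\varphi(y_\varepsilon) < \varphi(x) + \sqrt{\varepsilon}\, d_X(x, y_\varepsilon)$ for all $x \ne y_\varepsilon$.

If we put more structure on the space $X$, we can strengthen the conclusion of Theorem 4.6.1.

THEOREM 4.6.5 If $X$ is a Banach space and $\varphi : X \longrightarrow \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable, then for every $\varepsilon > 0$, we can find $x_\varepsilon \in X$, such that
\[ \varphi(x_\varepsilon) \le \inf_{x \in X} \varphi(x) + \varepsilon \quad \text{and} \quad \| \varphi'_G(x_\varepsilon) \|_{X^*} \le \varepsilon. \]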

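PROOF By virtue of Corollary 4.6.3, we can find $x_\varepsilon \in X$, such that
\[ \varphi(x_\varepsilon) \le \inf_{x \in X} \varphi(x) + \varepsilon \]
and
\[ \varphi(x_\varepsilon) \le \varphi(x) + \varepsilon \|x - x_\varepsilon\|_X \quad \forall\, x \in X. \]
Let $h \in X$ and $\lambda > 0$ be arbitrary. Let us set $x \stackrel{df}{=} x_\varepsilon + \lambda h$. We obtain
\[ \frac{\varphi(x_\varepsilon) - \varphi(x_\varepsilon + \lambda h)}{\lambda} \le \varepsilon \|h\|_X. \]
Passing to the limit as $\lambda \searrow 0$, we obtain
\[ -\langle \varphi'_G(x_\varepsilon), h \rangle_X \le \varepsilon \|h\|_X \quad \forall\, h \in X, \]
so (replacing $h$ by $-h$) $|\langle \varphi'_G(x_\varepsilon), h \rangle_X| \le \varepsilon \|h\|_X$ and thus $\| \varphi'_G(x_\varepsilon) \|_{X^*} \le \varepsilon$.

An immediate consequence of this theorem is the following corollary.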


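COROLLARY 4.6.6 If $X$ is a Banach space and $\varphi : X \longrightarrow \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable, then there exists a sequence $\{x_n\}_{n \ge 1} \subseteq X$, such that
\[ \varphi(x_n) \searrow \inf_{x \in X} \varphi(x) \quad \text{and} \quad \varphi'_G(x_n) \longrightarrow 0. \]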
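A one-dimensional example may help to see that Corollary 4.6.6 does not claim that the infimum is attained. The sketch below assumes the hypothetical choice $\varphi(x) = e^x$ on $X = \mathbb{R}$, which is bounded below by $0$, has no minimizer and no critical point, and still admits the almost critical points promised by Theorem 4.6.5; the helper `almost_critical_point` is an illustrative name only.

```python
import math

def phi(x):
    """Hypothetical example: bounded below by 0, infimum 0 is not attained."""
    return math.exp(x)

def dphi(x):
    """Derivative of phi (the Gateaux derivative on the real line)."""
    return math.exp(x)

def almost_critical_point(eps):
    """A point with phi(x) <= inf phi + eps and |phi'(x)| <= eps, as in Theorem 4.6.5."""
    return math.log(eps)

# A minimizing sequence whose derivatives tend to zero, as in Corollary 4.6.6.
xs = [-float(n) for n in range(1, 21)]
values = [phi(x) for x in xs]
slopes = [abs(dphi(x)) for x in xs]
```

Here the minimizing sequence escapes to $-\infty$, so no subsequence converges; extra compactness conditions are needed before such sequences produce actual minimizers.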

REMARK 4.6.7 The above corollary asserts the existence of a minimizing sequence, whose elements satisfy the first order necessary conditions, up to any desired approximation.

COROLLARY 4.6.8 If $X$ is a Banach space and $\varphi : X \longrightarrow \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable, then for each minimizing sequence $\{y_n\}_{n \ge 1}$ of $\varphi$ (i.e., $\varphi(y_n) \searrow \inf_{x \in X} \varphi(x)$), we can find another minimizing sequence $\{x_n\}_{n \ge 1}$ of $\varphi$, such that:
(a) $\varphi(x_n) \le \varphi(y_n)$;
(b) $\|x_n - y_n\|_X \longrightarrow 0$;
(c) $\| \varphi'_G(x_n) \|_{X^*} \longrightarrow 0$.

As we already said, the Ekeland variational principle is a very powerful tool of nonlinear analysis. Below we show how the well known Caristi's fixed point theorem can be derived from the Ekeland variational principle. In fact we show that the two results are equivalent, in the sense that the Ekeland variational principle can also be derived from Caristi's fixed point theorem. First we state and prove Caristi's fixed point theorem.

THEOREM 4.6.9 (Caristi Fixed Point Theorem) If $(X, d_X)$ is a complete metric space, $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function and $F : X \longrightarrow 2^X \setminus \{\emptyset\}$ is a multifunction, such that
\[ \varphi(y) \le \varphi(x) - d_X(y, x) \quad \text{for some } y \in F(x) \text{ and all } x \in X, \tag{4.45} \]

then there exists $x_0 \in X$, such that $x_0 \in F(x_0)$.

PROOF By virtue of Corollary 4.6.3 with $\varepsilon = 1$, we can find $x_0 \in X$, such that
\[ \varphi(x_0) < \varphi(x) + d_X(x, x_0) \quad \forall\, x \ne x_0. \tag{4.46} \]
We claim that $x_0 \in F(x_0)$. Suppose that this is not true. Then for all $y \in F(x_0)$, we have that $y \ne x_0$. Let $y \in F(x_0)$ be as in (4.45). We have
\[ \varphi(y) \le \varphi(x_0) - d_X(y, x_0) \quad \text{and} \quad \varphi(x_0) < \varphi(y) + d_X(y, x_0) \]
(see (4.46)), a contradiction.
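Condition (4.45) is easy to test on a toy example. The sketch below is an assumption-laden illustration, not the general theorem: the space is a finite set of integers, $F$ is single valued (i.e., $F(x) = \{f(x)\}$), and the map `f`, the function `phi` and both helper names are hypothetical. On a finite set, iterating a map that satisfies (4.45) strictly lowers `phi` until a fixed point is reached; in a general complete metric space no such finite iteration is available and the proof goes through Corollary 4.6.3 instead.

```python
def satisfies_caristi(points, f, phi, dist):
    """Check phi(f(x)) <= phi(x) - dist(f(x), x) for every x, i.e. (4.45) with F(x) = {f(x)}."""
    return all(phi(f(x)) <= phi(x) - dist(f(x), x) for x in points)

def iterate_to_fixed_point(x, f, max_steps=10_000):
    """Follow x, f(x), f(f(x)), ... until the point stops moving."""
    for _ in range(max_steps):
        y = f(x)
        if y == x:
            return x
        x = y
    raise RuntimeError("no fixed point reached within max_steps")

# Hypothetical example on X = {0, 1, ..., 64} with the usual distance.
X = range(0, 65)
f = lambda x: x // 2          # halve and round down
phi = lambda x: x             # phi(x) - phi(f(x)) equals the distance moved
dist = lambda a, b: abs(a - b)

fixed = iterate_to_fixed_point(37, f)
```

No continuity of `f` is used anywhere, which mirrors the fact that Theorem 4.6.9 imposes no regularity on $F$.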


REMARK 4.6.10 We emphasize that on the multifunction $F$ no regularity conditions were imposed except for (4.45), which is a mild restriction. Suppose that $F$ has compact values and
\[ h\big(F(x), F(y)\big) \le k\, d_X(x, y) \quad \forall\, x, y \in X, \]
with $k \in (0, 1)$. Here $h(\cdot,\cdot)$ stands for the Hausdorff metric on the nonempty and closed subsets of $X$. Then we can apply Theorem 4.6.9 with
\[ \varphi(x) = \frac{1}{1-k}\, d_X\big(x, F(x)\big). \]
Indeed, let $y \in F(x)$ be such that $d_X\big(x, F(x)\big) = d_X(x, y)$. Such an element exists since $F(x)$ is compact. Then we have
\[ (1-k)\, d_X(x, y) = d_X\big(x, F(x)\big) - k\, d_X(x, y) \le d_X\big(x, F(x)\big) - h\big(F(x), F(y)\big) \le d_X\big(x, F(x)\big) - d_X\big(y, F(y)\big), \]
so $\varphi(y) \le \varphi(x) - d_X(x, y)$. So condition (4.45) is satisfied. The resulting fixed point theorem is a particular case of Nadler's fixed point theorem (see Theorem 7.4.3). Of course, if $F$ is single valued, we recover the well known Banach's contraction principle (see Theorem 7.1.2). We should point out that Banach's fixed point theorem contains much more information.

PROPOSITION 4.6.11 The Caristi fixed point theorem (see Theorem 4.6.9) implies the Ekeland variational principle in the form of Corollary 4.6.3.

PROOF Let $\widehat{d}_X = \varepsilon\, d_X$. This is an equivalent metric on $X$. Proceeding by contradiction, suppose that there is no $x_\varepsilon \in X$ satisfying inequality (b) in Corollary 4.6.3. Then for every $x \in X$, we have
\[ F(x) = \big\{ y \in X : \varphi(x) \ge \varphi(y) + \widehat{d}_X(x, y),\ y \ne x \big\} \ne \emptyset. \]
The multifunction $F$ satisfies (4.45). So by Theorem 4.6.9, we can find $x_0 \in X$, such that $x_0 \in F(x_0)$. But this cannot happen since from the definition of $F$, $x \notin F(x)$ for all $x \in X$.

There is another geometrical result of nonlinear analysis which is equivalent to some form of the Ekeland variational principle.

DEFINITION 4.6.12 Let $X$ be a normed space, $C \subseteq X$ a nonempty, convex set and $x \in X$. The drop associated with the pair $(x, C)$, denoted by $D(x, C)$, is the convex hull of $\{x\} \cup C$, i.e.,
\[ D(x, C) = \big\{ x + \lambda(c - x) : c \in C,\ \lambda \in [0, 1] \big\}. \]
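To get a feeling for Definition 4.6.12, one can test membership in a drop numerically. The sketch below makes several assumptions that are not in the text: the space is $\mathbb{R}^n$ with the Euclidean norm, $C$ is the closed ball of radius `r` about `center`, and `in_drop` is a hypothetical helper name. It uses the observation that $p = (1-\lambda)x + \lambda c$ with $c \in C$ holds if and only if $\|p - (1-\lambda)x - \lambda\,\mathrm{center}\| \le \lambda r$, and that the left side minus the right side is convex in $\lambda$.

```python
import math

def in_drop(p, x, center, r, tol=1e-9):
    """Test whether p lies in D(x, C), C being the closed Euclidean ball B_r(center)."""
    def gap(lam):
        # p = (1-lam)*x + lam*c for some c in C
        #   <=>  ||p - (1-lam)*x - lam*center|| - lam*r <= 0
        q = [pi - (1 - lam) * xi - lam * ci for pi, xi, ci in zip(p, x, center)]
        return math.sqrt(sum(t * t for t in q)) - lam * r

    lo, hi = 0.0, 1.0
    for _ in range(200):          # ternary search: gap is convex in lam
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if gap(m1) <= gap(m2):
            hi = m2
        else:
            lo = m1
    return min(gap(0.0), gap(1.0), gap((lo + hi) / 2)) <= tol

# Hypothetical example: the drop with vertex (3, 0) over the closed unit ball.
vertex, centre, radius = (3.0, 0.0), (0.0, 0.0), 1.0
```

With these data the drop is the unit ball together with the "cone" joining it to the vertex, which is the shape behind the name.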

REMARK 4.6.13 The set $D(x, C)$ is called a "drop," a suitable name given its geometry.

The next result is known as the drop theorem.

THEOREM 4.6.14 (Drop Theorem) If $X$ is a normed space, $A \subseteq X$ is a complete set, $y \in X \setminus A$, $R = d_X(y, A)$ and $0 < r < R < \varrho$, then there exists $u \in A$, such that
\[ u \in \overline{B}_\varrho(y) \quad \text{and} \quad D\big(u, \overline{B}_r(y)\big) \cap A = \{u\}. \]

PROOF By translating things, if necessary, we may assume that $y = 0$. Let
\[ E \stackrel{df}{=} \overline{B}_\varrho \cap A. \]
This is a closed subset of $A$, hence it is a complete metric space (the metric induced by the norm of $X$). We introduce the continuous function $\varphi : E \longrightarrow \mathbb{R}_+$, defined by
\[ \varphi(x) \stackrel{df}{=} \frac{\varrho + r}{R - r}\, \|x\|_X. \]
We apply Corollary 4.6.3 with $\varepsilon = 1$ to obtain $u \in E$, such that
\[ \varphi(u) < \varphi(x) + \|u - x\|_X \quad \forall\, x \in E,\ x \ne u. \tag{4.47} \]
We need to show that
\[ D\big(u, \overline{B}_r(0)\big) \cap A = \{u\}. \]
Suppose that this is not true and let $v \in D\big(u, \overline{B}_r(0)\big) \cap A$, $v \ne u$. We have $v \in A$ and
\[ v = (1 - \lambda) u + \lambda z \]
for some $z \in \overline{B}_r(0)$ and some $\lambda \in [0, 1]$. Since $v \ne u$ and $r < R$, it follows that $\lambda \in (0, 1)$. We have
\[ \|v\|_X \le (1 - \lambda) \|u\|_X + \lambda \|z\|_X. \]
Because $u \in A$, we have $\|u\|_X \ge R$ and so it follows that
\[ \lambda (R - r) \le \lambda \big( \|u\|_X - \|z\|_X \big) \le \|u\|_X - \|v\|_X. \tag{4.48} \]
In particular $\|v\|_X \le \|u\|_X \le \varrho$, so $v \in E$. From (4.47) with $x = v$, we have
\[ \frac{\varrho + r}{R - r}\, \|u\|_X < \frac{\varrho + r}{R - r}\, \|v\|_X + \|v - u\|_X = \frac{\varrho + r}{R - r}\, \|v\|_X + \lambda \|u - z\|_X, \]

so
\[ \frac{\varrho + r}{R - r}\, \big( \|u\|_X - \|v\|_X \big) < \lambda \|u - z\|_X \]
and thus, using also (4.48), we have
\[ \varrho + r < \|u - z\|_X. \]
But $\|u\|_X \le \varrho$ (recall that $y = 0$) and $z \in \overline{B}_r(0)$. Hence $\|u - z\|_X \le \varrho + r$, a contradiction.

This geometrical result is in fact equivalent to the Ekeland variational principle stated in the following form, which can be easily deduced from Corollary 4.6.3.

PROPOSITION 4.6.15 If $(X, d_X)$ is a complete metric space and $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous function which is bounded from below, then for any $\beta > 0$ and any $x_0 \in X$, there exists $y \in X$, such that:
(a) $\varphi(y) < \varphi(x) + \beta\, d_X(y, x)$ for all $x \ne y$;
(b) $\varphi(y) \le \varphi(x_0) - \beta\, d_X(y, x_0)$.

In this form the Ekeland variational principle is equivalent to the drop theorem. For the proof of this, see Penot (1986).

PROPOSITION 4.6.16 The drop theorem (see Theorem 4.6.14) is equivalent to the Ekeland variational principle in the form of Proposition 4.6.15.

We continue with the applications of the Ekeland variational principle.

PROPOSITION 4.6.17 If $X$ is a Banach space, $\varphi : X \longrightarrow \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable and there exist $a, c > 0$, such that
\[ a \|x\|_X - c \le \varphi(x) \quad \forall\, x \in X, \tag{4.49} \]
then $\varphi'_G(X)$ is dense in $a \overline{B}_1^{X^*}$, where
\[ \overline{B}_1^{X^*} = \big\{ x^* \in X^* : \|x^*\|_{X^*} \le 1 \big\}. \]

PROOF Let $x^* \in a \overline{B}_1^{X^*}$ and consider the function
\[ \psi(x) \stackrel{df}{=} \varphi(x) - \langle x^*, x \rangle_X. \]
Evidently $\psi$ is lower semicontinuous, bounded from below (see (4.49)) and Gâteaux differentiable. Applying Theorem 4.6.5, we obtain $y_\varepsilon \in X$, such that
\[ \| \psi'_G(y_\varepsilon) \|_{X^*} \le \varepsilon. \]
But
\[ \psi'_G(y_\varepsilon) = \varphi'_G(y_\varepsilon) - x^*. \]
Hence
\[ \| \varphi'_G(y_\varepsilon) - x^* \|_{X^*} \le \varepsilon. \]
Since $x^* \in a \overline{B}_1^{X^*}$ was arbitrary, we conclude that $\varphi'_G(X)$ is dense in $a \overline{B}_1^{X^*}$.

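COROLLARY 4.6.18 If $X$ is a Banach space, $\varphi : X \longrightarrow \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable and there exists a continuous function $\vartheta : \mathbb{R}_+ \longrightarrow \mathbb{R}$, such that
\[ \frac{\vartheta(s)}{s} \longrightarrow +\infty \quad \text{as } s \to +\infty \]
and
\[ \varphi(x) \ge \vartheta\big( \|x\|_X \big) \quad \forall\, x \in X, \]
then $\varphi'_G(X)$ is dense in $X^*$.

PROOF Let $a > 0$. We can find $s_a > 0$, such that
\[ \vartheta(s) \ge a s \quad \forall\, s \ge s_a. \]
Hence we have that
\[ \varphi(x) \ge a \|x\|_X \quad \forall\, x \in X,\ \|x\|_X \ge s_a. \]
On the other hand, if $\|x\|_X < s_a$, then $\varphi(x) \ge m_a$, where
\[ m_a \stackrel{df}{=} \min_{s \in [0, s_a]} \vartheta(s). \]
Therefore, with $c_a \stackrel{df}{=} a s_a + |m_a| > 0$, finally we have
\[ \varphi(x) \ge a \|x\|_X - c_a \quad \forall\, x \in X. \]
Apply Proposition 4.6.17 to obtain that $\varphi'_G(X)$ is dense in $a \overline{B}_1^{X^*}$. Since $a > 0$ was arbitrary, we conclude that $\varphi'_G(X)$ is dense in $X^*$.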


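Recall that if $X$ is a Banach space and $\varphi \in \Gamma_0(X)$, then $D(\partial\varphi) \subseteq \operatorname{dom} \varphi$ (recall that $D(\partial\varphi) = \{ x \in X : \partial\varphi(x) \ne \emptyset \}$ and $\operatorname{dom} \varphi = \{ x \in X : \varphi(x) < +\infty \}$). We would like to have a more precise relation between these two sets.

PROPOSITION 4.6.19 If $X$ is a Banach space, $\varphi \in \Gamma_0(X)$, $x_0 \in \operatorname{dom} \varphi$, $\varepsilon > 0$ and $x_0^* \in \partial_\varepsilon \varphi(x_0)$, then there exist $x \in X$ and $x^* \in \partial\varphi(x)$, such that
(a) $\|x - x_0\|_X \le \sqrt{\varepsilon}$;
(b) $|\varphi(x) - \varphi(x_0)| \le \varepsilon + \sqrt{\varepsilon}$;
(c) $\|x^* - x_0^*\|_{X^*} \le \sqrt{\varepsilon}\, \big( 1 + \|x_0^*\|_{X^*} \big)$.

PROOF Let
\[ \psi(x) \stackrel{df}{=} \varphi(x) - \langle x_0^*, x \rangle_X. \]
We have $\psi \in \Gamma_0(X)$ and
\[ \psi(x) \ge -\varphi^*(x_0^*) \ge \varphi(x_0) - \langle x_0^*, x_0 \rangle_X - \varepsilon \]
(see Remark 4.4.50). Moreover, from Remark 4.4.50, we have
\[ \psi(x_0) \le \inf_{x \in X} \psi(x) + \varepsilon. \]
On $X$ we use the norm
\[ |||x|||_X = \|x\|_X + |\langle x_0^*, x \rangle_X|, \]
which is equivalent to the original norm $\|\cdot\|_X$. Invoking Corollary 4.6.4, we obtain $x \in X$, such that $\psi(x) \le \psi(x_0)$ and
\[ |||x - x_0|||_X = \|x - x_0\|_X + |\langle x_0^*, x - x_0 \rangle_X| \le \sqrt{\varepsilon} \tag{4.50} \]
and
\[ \varphi(x) - \langle x_0^*, x \rangle_X \le \varphi(y) - \langle x_0^*, y \rangle_X + \sqrt{\varepsilon}\, |||x - y|||_X \quad \forall\, y \in X. \tag{4.51} \]
From (4.51) we obtain that
\[ \varphi(x) - \langle x_0^*, x \rangle_X = \inf_{y \in X} \psi_1(y), \]
where
\[ \psi_1(y) \stackrel{df}{=} \varphi(y) - \langle x_0^*, y \rangle_X + \sqrt{\varepsilon}\, |||x - y|||_X, \]
so, from Proposition 4.4.30, we have $0 \in \partial\psi_1(x)$.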


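Invoking Proposition 4.4.31, we have
\[ \partial\psi_1(x) = \partial\varphi(x) - x_0^* + \sqrt{\varepsilon}\, \partial\big( |||\cdot|||_X \big)(0). \]
Note that the elements of $\partial\big( |||\cdot|||_X \big)(0)$ have the form
\[ u^* + \lambda x_0^*, \]
with $u^* \in \overline{B}_1^{X^*}$ and $\lambda \in [-1, 1]$ (see Example 4.4.24(b)). Therefore, we have
\[ 0 = x^* - x_0^* + \sqrt{\varepsilon}\, \big( u^* + \lambda x_0^* \big), \]
with $x^* \in \partial\varphi(x)$, so
\[ \|x^* - x_0^*\|_{X^*} \le \sqrt{\varepsilon}\, \big( 1 + \|x_0^*\|_{X^*} \big), \]
which proves (c). Also from (4.50) and since
\[ \psi(x_0) \le \inf_{x \in X} \psi(x) + \varepsilon, \]
we have
\[ |\varphi(x) - \varphi(x_0)| \le \psi(x_0) - \psi(x) + |\langle x_0^*, x - x_0 \rangle_X| \le \varepsilon + \sqrt{\varepsilon}, \]
which proves (b). Finally, again from (4.50), we have $\|x - x_0\|_X \le \sqrt{\varepsilon}$, which proves (a).

THEOREM 4.6.20 If $X$ is a Banach space and $\varphi \in \Gamma_0(X)$, then $D(\partial\varphi)$ is dense in $\operatorname{dom} \varphi$.

PROOF Let $x_0 \in \operatorname{dom} \varphi$, $n \ge 1$ and $x_{0n}^* \in \partial_{\frac{1}{n}} \varphi(x_0)$ (see Proposition 4.4.51). Invoking Proposition 4.6.19, we obtain $x_n \in D(\partial\varphi)$, such that
\[ \|x_n - x_0\|_X \le \frac{1}{\sqrt{n}} \quad \text{and} \quad |\varphi(x_n) - \varphi(x_0)| \le \frac{1}{n} + \frac{1}{\sqrt{n}}. \]

REMARK 4.6.21 Actually in the proof of Theorem 4.6.20, we obtained something stronger. Namely, if $x_0 \in \operatorname{dom} \varphi$, we can find a sequence $\{x_n\}_{n \ge 1} \subseteq D(\partial\varphi)$, such that
\[ x_n \longrightarrow x_0 \ \text{in } X \quad \text{and} \quad \varphi(x_n) \longrightarrow \varphi(x_0) \ \text{in } \mathbb{R}. \]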


Recall the following theorem from the theory of bounded linear operators between Banach spaces.

THEOREM 4.6.22 If $X$ and $Y$ are two Banach spaces and $A \in \mathcal{L}(X; Y)$, then the following statements are equivalent:
(a) $A$ is surjective;
(b) there exists $c > 0$, such that $\|y^*\|_{Y^*} \le c\, \|A^* y^*\|_{X^*}$ for all $y^* \in Y^*$;
(c) $N(A^*) = \{0\}$ and $R(A^*)$ is closed.

REMARK 4.6.23 According to this theorem, if $A$ is surjective, then $A^*$ is injective. If one of the spaces $X$ or $Y$ is finite dimensional, then the converse is also true.

We want to produce a nonlinear analog of Theorem 4.6.22.

THEOREM 4.6.24 If $X$ and $Y$ are two Banach spaces, $\varphi : X \longrightarrow Y$ is a Gâteaux differentiable map, $\varphi(X)$ is closed in $Y$, $y \in Y$ and there exist $\varrho > 0$ and $k \in [0, 1)$, such that
\[ \varphi^{-1}\big( B_\varrho(y) \big) \ne \emptyset \tag{4.52} \]
and
\[ \inf_{z \in R(\varphi'_G(x))} \| y - \varphi(x) - z \|_Y \le k\, \| y - \varphi(x) \|_Y \quad \forall\, x \in \varphi^{-1}\big( B_\varrho(y) \big), \tag{4.53} \]
then $y \in \varphi(X)$.

PROOF Let $A = \varphi(X)$. By hypothesis $A \subseteq Y$ is closed. We proceed by contradiction. So suppose that $y \notin \varphi(X)$. Let
\[ R \stackrel{df}{=} d_Y(y, A) \]
and choose $\varrho, r > 0$, such that
\[ r < R < \varrho \quad \text{and} \quad k\varrho < r. \]
Note that if (4.52) and (4.53) hold for some $\varrho_0 > 0$, then they also hold for any $\varrho \in (R, \varrho_0)$. According to Theorem 4.6.14, we can find $u_0 \in B_\varrho(y)$, such that
\[ D(u_0, C) \cap A = \{u_0\}, \]

where
\[ C \stackrel{df}{=} \overline{B}_r(y). \]
Let $x_0 \in X$ be such that $\varphi(x_0) = u_0$. From (4.53) and recalling that $k\varrho < r$, we have
\[ \inf_{z \in R(\varphi'_G(x_0))} \| y - \varphi(x_0) - z \|_Y \le k\, \| y - \varphi(x_0) \|_Y < r. \]
So we can find $h \in X$, such that
\[ \| y - \varphi(x_0) - \varphi'_G(x_0) h \|_Y < r. \tag{4.54} \]
From this inequality, it follows that for $\lambda > 0$ small, we have
\[ \Big\| y - \varphi(x_0) - \frac{\varphi(x_0 + \lambda h) - \varphi(x_0)}{\lambda} \Big\|_Y < r. \]
Let
\[ v_\lambda \stackrel{df}{=} y - \varphi(x_0) - \frac{\varphi(x_0 + \lambda h) - \varphi(x_0)}{\lambda} \in Y. \]
From Definition 4.6.12, we see that
\[ y - v_\lambda \in D(u_0, C), \]
where $C = \overline{B}_r(y)$, so
\[ (1 - \lambda) u_0 + \lambda (y - v_\lambda) \in D(u_0, C) \quad \forall\, \lambda \in (0, 1) \]
and thus
\[ \varphi(x_0 + \lambda h) \in D(u_0, C) \quad \forall\, \lambda > 0 \text{ small enough.} \]
Because $D(u_0, C) \cap A = \{u_0\}$, it follows that
\[ \varphi(x_0 + \lambda h) = u_0 \quad \forall\, \lambda > 0 \text{ small enough,} \]
so $\varphi'_G(x_0) h = 0$. Using this in (4.54), we obtain
\[ \| y - \varphi(x_0) \|_Y < r < R, \]
a contradiction.

REMARK 4.6.25 If in the above theorem conditions (4.52) and (4.53) hold for all $y \in Y$, then we conclude that $\varphi$ is surjective. Note that conditions (4.52) and (4.53) are in a sense complementary. Namely, the larger $\varrho > 0$, the more difficult it is to verify (4.53).


COROLLARY 4.6.26 If $X$ and $Y$ are two Banach spaces, $\varphi : X \longrightarrow Y$ is a Gâteaux differentiable map, $\varphi(X)$ is closed in $Y$ and
\[ N\big( (\varphi'_G(x))^* \big) = \{0\} \quad \forall\, x \in X, \]
then $\varphi$ is surjective.

PROOF Recall that
\[ \overline{R\big( \varphi'_G(x) \big)} = {}^{\perp} N\big( (\varphi'_G(x))^* \big) \]
(see, e.g., Denkowski, Migórski & Papageorgiou (2003a, p. 320)). Since by hypothesis $N\big( (\varphi'_G(x))^* \big) = \{0\}$, it follows that (4.53) is true with $k = 0$. So for a given $y \in Y$, let $\varrho > 0$ be such that
\[ d_Y\big( y, \varphi(X) \big) < \varrho \]
and let $k = 0$. Then we can apply Theorem 4.6.24 and conclude that $y \in \varphi(X)$. This proves that $\varphi$ is surjective.

REMARK 4.6.27 It is clear from the proof of the above corollary that the hypothesis
\[ N\big( (\varphi'_G(x))^* \big) = \{0\} \quad \forall\, x \in X \]
can be replaced by the hypothesis that
\[ R\big( \varphi'_G(x) \big) \text{ is dense in } Y \quad \forall\, x \in X. \]
Moreover, if $\varphi'_G(x) \in \Phi(X; Y)$ and $\operatorname{ind} \varphi'_G(x) = 0$ for all $x \in X$ (see Definition 3.1.60), then the hypothesis
\[ N\big( (\varphi'_G(x))^* \big) = \{0\} \quad \forall\, x \in X \]
can be replaced by the hypothesis that
\[ N\big( \varphi'_G(x) \big) = \{0\} \quad \forall\, x \in X. \]

The discussion so far has illustrated the power of the Ekeland variational principle. The only difficulty that we encounter when using this principle is that the perturbation function $x \longmapsto \varepsilon \|x - x_\varepsilon\|_X$ is not differentiable at $x = x_\varepsilon$. So it is natural to ask whether it is possible to formulate a similar variational principle but for a different class of perturbations, which would include functions differentiable at points of interest. This was achieved by Borwein & Preiss (1987), who obtained the following theorem.


THEOREM 4.6.28 If $X$ is a Banach space, $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function, $\varepsilon > 0$ and $x_0 \in X$ is such that
\[ \varphi(x_0) \le \inf_{x \in X} \varphi(x) + \varepsilon, \]
then for any $\lambda > 0$ and any $p \in [1, +\infty)$, we can find $y_\lambda \in X$, a sequence $\{x_n\}_{n \ge 1} \subseteq X$, such that $x_n \longrightarrow y_\lambda$ in $X$, and a sequence $\{t_n\}_{n \ge 1} \subseteq [0, 1]$ satisfying $\sum_{n=1}^{\infty} t_n = 1$, such that:
(a) $\varphi(y_\lambda) \le \varphi(x_0)$;
(b) $\|x_n - x_0\|_X \le \lambda$ for all $n \ge 1$;
(c) the function $\psi(x) = \varphi(x) + \frac{\varepsilon}{\lambda^p} \sum_{n=1}^{\infty} t_n \|x - x_n\|_X^p$ attains a strict minimum at $y_\lambda$.

The next step in the development of variational principles was made by Deville, Godefroy & Zizler (1993). Their starting point was the argument in the Borwein-Preiss variational principle. What is important in that argument is the function $\vartheta(x) \stackrel{df}{=} 1 - \|x\|_X$ and in particular what matters is the behaviour of $\vartheta$ within the unit ball. That is, the behaviour of $\vartheta$ outside the domain where $\vartheta$ is nonnegative plays no role in the argument. So we may as well replace $\vartheta$ by the function
\[ \widehat{\vartheta}(x) \stackrel{df}{=} \vartheta^+(x) = \max\big\{ 0,\ 1 - \|x\|_X \big\}. \]
Note that $\widehat{\vartheta}$ is a continuous bump function (i.e., a continuous function on $X$ which has a nonempty and bounded support; see Remark 4.4.84). This observation is interesting because there are Banach spaces in which smooth bump functions can be found, but they do not have an equivalent differentiable norm (see Fabian (1997)).

THEOREM 4.6.29 If $X$ is a Banach space which admits a Lipschitz continuous bump function that is Fréchet (respectively Gâteaux) differentiable, $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function and $\varepsilon > 0$, then there exists a Lipschitz continuous function $g : X \longrightarrow \mathbb{R}$ which is Fréchet (respectively Gâteaux) differentiable, such that
\[ \|g\|_\infty = \sup_{x \in X} |g(x)| \le \varepsilon, \qquad \|g'\|_\infty = \sup_{x \in X} \|g'(x)\|_{X^*} \le \varepsilon \]
and $\varphi - g$ attains its minimum on $X$.


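PROOF We do the proof for the Fréchet differentiable case. The proof of the Gâteaux differentiable case is done similarly. Let $V$ be the linear space of all functions $g : X \longrightarrow \mathbb{R}$ which are Lipschitz continuous and Fréchet differentiable. Evidently
\[ \|g'(x)\|_{X^*} \le \operatorname{Lip}(g) \quad \forall\, g \in V,\ x \in X \]
(by $\operatorname{Lip}(g)$ we denote the Lipschitz constant of $g$). So the function $x \longmapsto g'(x)$ is bounded and then by the mean value theorem (see Proposition 4.1.21), we have that $x \longmapsto g(x)$ is bounded too. It is easy to see that $V$ supplied with the norm
\[ \|g\|_V = \|g\|_\infty + \|g'\|_\infty \]
becomes a Banach space. For every $n \ge 1$, let
\[ A_n \stackrel{df}{=} \Big\{ g \in V : \text{there exists } x_0 \in X, \text{ such that } (\varphi - g)(x_0) < \inf_{x \in X \setminus B_{\frac{1}{n}}(x_0)} (\varphi - g)(x) \Big\}, \]
where
\[ B_{\frac{1}{n}}(x_0) = \Big\{ x \in X : \|x - x_0\|_X < \frac{1}{n} \Big\}. \]
We claim that for every $n \ge 1$, the set $A_n$ is open and dense in $V$. To do this note that $\|\cdot\|_\infty \le \|\cdot\|_V$. So it follows that $A_n$ is open. To show the density of $A_n$, let $g \in V$ and $\varepsilon > 0$. We need to find $h \in V$ with $\|h\|_V < \varepsilon$ and $x_0 \in X$, such that
\[ (\varphi - g - h)(x_0) < \inf_{x \in X \setminus B_{\frac{1}{n}}(x_0)} (\varphi - g - h)(x). \]
By hypothesis, we can find a bump function $b \in V$, such that $b(0) > 0$, $\|b\|_V < \varepsilon$ and $b(x) = 0$ whenever $\|x\|_X \ge \frac{1}{n}$. The function $\varphi - g$ is bounded from below. So we can find $x_0 \in X$, such that
\[ (\varphi - g)(x_0) < \inf_{x \in X} (\varphi - g)(x) + b(0). \]
Let
\[ h(x) \stackrel{df}{=} b(x - x_0). \]
Evidently $h \in V$ with $\|h\|_V < \varepsilon$, and we have
\[ (\varphi - g - h)(x_0) = (\varphi - g)(x_0) - b(0) < \inf_{x \in X} (\varphi - g)(x). \]
Since $h|_{X \setminus B_{\frac{1}{n}}(x_0)} \equiv 0$, we have
\[ (\varphi - g - h)(x) = (\varphi - g)(x) \ge \inf_{x \in X} (\varphi - g)(x) \quad \forall\, x \in X \setminus B_{\frac{1}{n}}(x_0). \]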


Hence $g + h \in A_n$ and this proves the density of $A_n$ in $V$. Then by the Baire category theorem (see Theorem A.1.10), we have that
\[ D = \bigcap_{n=1}^{\infty} A_n \subseteq V \ \text{is a dense } G_\delta \text{ set.} \]
Next we show that if $g \in D$, then $\varphi - g$ attains its minimum on $X$. From the definition of $A_n$, we can find $x_n \in X$, such that
\[ (\varphi - g)(x_n) < \inf_{x \in X \setminus B_{\frac{1}{n}}(x_n)} (\varphi - g)(x). \]
For $m > n$, we have $x_m \in B_{\frac{1}{n}}(x_n)$, or otherwise we would have
\[ (\varphi - g)(x_n) < (\varphi - g)(x_m) \quad \text{and} \quad \|x_n - x_m\|_X \ge \frac{1}{n} \ge \frac{1}{m}. \tag{4.55} \]
By virtue of the second inequality and the choice of $x_m$, we have
\[ (\varphi - g)(x_m) < (\varphi - g)(x_n), \]
which contradicts the first inequality in (4.55). Therefore we infer that $\{x_n\}_{n \ge 1} \subseteq X$ is a Cauchy sequence in $X$ and $x_n \longrightarrow \widehat{x}$ in $X$. We claim that $\widehat{x}$ is a minimizer of $\varphi - g$. Because $\varphi - g$ is lower semicontinuous, we have
\[ (\varphi - g)(\widehat{x}) \le \liminf_{n \to +\infty} (\varphi - g)(x_n) \le \liminf_{n \to +\infty} \Big( \inf_{x \in X \setminus B_{\frac{1}{n}}(x_n)} (\varphi - g)(x) \Big). \tag{4.56} \]
If $u \in X$, $u \ne \widehat{x}$, then
\[ \|x_n - u\|_X \ge \frac{1}{n} \quad \forall\, n \ge 1 \text{ large enough} \]
and so for $n \ge 1$ large enough, we have
\[ \inf_{x \in X \setminus B_{\frac{1}{n}}(x_n)} (\varphi - g)(x) \le (\varphi - g)(u). \]
Using this in (4.56), we conclude that
\[ (\varphi - g)(\widehat{x}) = \inf_{x \in X} (\varphi - g)(x). \]


REMARK 4.6.30 If the norm of $X$ is Fréchet differentiable away from the origin, then the function
\[ x \longmapsto \|x\|_X^2 \]
is a $C^1$-function on $X$. If $\xi : \mathbb{R}_+ \longrightarrow \mathbb{R}$ is a $C^1$-function, such that
\[ \xi(0) = 1 \quad \text{and} \quad \xi(s) = 0 \quad \forall\, s \ge 1, \]
then
\[ b(x) \stackrel{df}{=} \xi\big( \|x\|_X^2 \big) \]
is a $C^1$-function on $X$, such that
\[ b(0) = 1 \quad \text{and} \quad b(x) = 0 \quad \forall\, x \notin B_1(0). \]
This is a $C^1$-bump function. Note that if $X^*$ is separable, then $X$ admits a $C^1$-bump function. Indeed, in this case $X$ admits an equivalent Fréchet differentiable norm and so the $C^1$-bump function is constructed as indicated above. Every separable Banach space admits an equivalent Gâteaux differentiable norm and so it admits a Lipschitz continuous, Gâteaux differentiable bump function.

The following result is an interesting consequence of Theorem 4.6.29.

PROPOSITION 4.6.31 If the Banach space $X$ admits a Lipschitz continuous and Fréchet (respectively Gâteaux) differentiable bump function, then every continuous convex function defined on $X$ is Fréchet (respectively Gâteaux) differentiable on a dense subset of $X$. In particular, if the Banach space admits a Lipschitz continuous and Fréchet differentiable bump function, then $X$ is an Asplund space (see Definition 4.2.18).

PROOF Again we do the proof for the Fréchet differentiable case. The proof for the Gâteaux differentiable case is similar. Let $\varphi$ be a continuous concave function on $X$ and $b$ be a Lipschitz continuous and Fréchet differentiable function on $X$ with
\[ b(0) \ne 0 \quad \text{and} \quad b(x) = 0 \quad \forall\, x \notin B_1(0). \]

Let $x_0 \in X$ and choose $\delta > 0$, such that
\[ \varphi(x_0) - 1 < \varphi(x) \quad \forall\, x \in B_\delta(x_0). \]
Choose $m > \frac{1}{\delta}$. If $\|x - x_0\|_X \ge \delta$, we have
\[ \| m(x - x_0) \|_X > 1 \]
and so
\[ b\big( m(x - x_0) \big) = 0. \]
We define the function
\[ f(x) \stackrel{df}{=} \begin{cases} \dfrac{1}{b\big( m(x - x_0) \big)^2} & \text{if } b\big( m(x - x_0) \big) \ne 0, \\[2mm] +\infty & \text{otherwise.} \end{cases} \]
The function $\varphi + f : X \longrightarrow \overline{\mathbb{R}}$ is proper, lower semicontinuous and bounded from below (by $\varphi(x_0) - 1$). So we can apply Theorem 4.6.29 and obtain a Lipschitz continuous and Fréchet differentiable function $g : X \longrightarrow \mathbb{R}$, such that $\varphi + f - g$ attains its minimum at some point $y_0 \in B_\delta(x_0)$ (since $f \equiv +\infty$ outside $B_\delta(x_0)$). Let $U$ be a neighbourhood of $y_0$ on which the function
\[ x \longmapsto b\big( m(x - x_0) \big) \]
is nonzero. Then the function $f$ is Fréchet differentiable on $U$. We have
\[ \varphi(y_0) + f(y_0) - g(y_0) \le \varphi(y) + f(y) - g(y) \quad \forall\, y \in U, \]
so
\[ -\varphi(y) \le -\varphi(y_0) - f(y_0) + g(y_0) + f(y) - g(y) \quad \forall\, y \in U. \]
Let
\[ v(y) \stackrel{df}{=} -\varphi(y_0) - f(y_0) + g(y_0) + f(y) - g(y) \quad \forall\, y \in U. \]
We have
\[ -\varphi(y) \le v(y) \quad \forall\, y \in U \quad \text{and} \quad -\varphi(y_0) = v(y_0). \]
If $\|h\|_X$ is small, from the convexity of $-\varphi$, we have
\[ 0 \le (-\varphi)(y_0 + h) + (-\varphi)(y_0 - h) - 2(-\varphi)(y_0) \le v(y_0 + h) + v(y_0 - h) - 2 v(y_0). \tag{4.57} \]
Since $v$ is Fréchet differentiable, we have
\[ v(y_0 + h) + v(y_0 - h) - 2 v(y_0) = \langle v'_F(y_0), h \rangle_X - \langle v'_F(y_0), h \rangle_X + o\big( \|h\|_X \big). \tag{4.58} \]
Hence from (4.57) and (4.58), it follows that
\[ (-\varphi)(y_0 + h) + (-\varphi)(y_0 - h) - 2(-\varphi)(y_0) = o\big( \|h\|_X \big) \]


and this by virtue of Proposition 4.2.9 implies that $-\varphi$ is Fréchet differentiable at $y_0$. Since $x_0 \in X$ and $\delta > 0$ were arbitrary, we conclude that $-\varphi$ is Fréchet differentiable on a dense subset of $X$. Finally, for the last part of the proposition, recall that the set of points of differentiability of $-\varphi$ is a $G_\delta$ set (see the proof of Theorem 4.2.12).

REMARK 4.6.32 It is not known whether every Asplund space admits a Fréchet differentiable bump function.

We conclude this section with a generalization of Theorem 4.6.1 which is useful when we study boundary value problems using variational methods. This generalization is due to Zhong (1997), where the interested reader can find its proof.

THEOREM 4.6.33 If $h : \mathbb{R}_+ \longrightarrow \mathbb{R}_+$ is a continuous, nondecreasing function, such that
\[ \int_0^{+\infty} \frac{1}{1 + h(r)}\, dr = +\infty, \]
$(X, d_X)$ is a complete metric space, $x_0 \in X$ is fixed, $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous and bounded below function, $\varepsilon > 0$,
\[ \varphi(y) \le \inf_{x \in X} \varphi(x) + \varepsilon \]
and $\lambda > 0$, then there exists $x_\lambda \in X$, such that
\[ \varphi(x_\lambda) \le \varphi(y), \qquad d_X(x_\lambda, x_0) \le r_0 + \bar{r} \]
and
\[ \varphi(x_\lambda) \le \varphi(x) + \frac{\varepsilon}{\lambda \big( 1 + h(d_X(x_0, x_\lambda)) \big)}\, d_X(x_\lambda, x) \quad \forall\, x \in X, \]
where
\[ r_0 \stackrel{df}{=} d_X(x_0, y) \]
and $\bar{r} > 0$ is such that
\[ \int_{r_0}^{r_0 + \bar{r}} \frac{1}{1 + h(r)}\, dr \ge \lambda. \]

REMARK 4.6.34 If $h \equiv 0$ and $x_0 = y$, then Theorem 4.6.33 reduces to Theorem 4.6.1.

4.7 Remarks

4.1: Gâteaux (1913) gave the definition of directional differentiability when $X$ is simply a linear space and $Y = \mathbb{R}$. Afterwards, Lévy (1920) imposed the requirement that $f'(x; \cdot)$ must be linear. The Fréchet derivative was introduced by Fréchet (1920). Various parts of the calculus in Banach spaces can be found in the books of Abraham & Marsden (1978), Cartan (1967), Denkowski, Migórski & Papageorgiou (2003a, 2003b), Dieudonné (1969), Ioffe & Tihomirov (1979), Vaĭnberg (1973), Zeidler (1985b) and in the survey papers of Averbukh & Smolyanov (1967, 1968) and Nashed (1971). We should also mention Lusternik's theorem, which is useful in variational analysis. For a proof of it see Zeidler (1985b, pp. 287–289). First a definition.

DEFINITION 4.7.1 Let $X$ be a locally convex space, $C \subseteq X$ and $x_0 \in C$.
(a) An admissible curve in $C$ through $x_0$ is a map $u : (-\varepsilon, \varepsilon) \longrightarrow C$ for some $\varepsilon > 0$, such that
\[ u(t) \in C \quad \forall\, t \in (-\varepsilon, \varepsilon), \qquad u(0) = x_0 \quad \text{and} \quad u'(0) \text{ exists.} \]
(b) $h \in X$ is said to be a tangent vector to $C$ at $x_0$ if and only if there exists an admissible curve in $C$ through $x_0$, such that $u'(0) = h$, that is, if there exist an $\varepsilon > 0$ and a map $(-\varepsilon, \varepsilon) \ni \lambda \longmapsto r(\lambda) \in X$, such that
\[ x_0 + \lambda h + r(\lambda) \in C \quad \forall\, \lambda \in (-\varepsilon, \varepsilon) \quad \text{and} \quad \frac{\|r(\lambda)\|_X}{\lambda} \longrightarrow 0 \quad \text{as } \lambda \to 0. \]
(c) The set of all vectors tangent to $C$ at $x_0$ is a closed cone, which is nonempty since the origin belongs to it. This cone is usually called the tangent cone to $C$ at $x_0$ and it is denoted by $T_C(x_0)$. If this cone is a subspace of $X$, then it is called the tangent space to $C$ at $x_0$.


THEOREM 4.7.2
(a) If $X$ and $Y$ are two Banach spaces, $U$ is a neighbourhood of $x_0 \in X$, $\varphi : U \longrightarrow Y$ is a Fréchet differentiable function and
\[ R\big( \varphi'_F(x_0) \big) = Y \]
(i.e., $x_0 \in U$ is a regular point of $\varphi$), then the tangent space to the set
\[ C \stackrel{df}{=} \big\{ x \in X : \varphi(x) = \varphi(x_0) \big\} \]
coincides with the kernel of $\varphi'_F(x_0)$, i.e.,
\[ T_C(x_0) = N\big( \varphi'_F(x_0) \big). \]
(b) If $X$ and $Y$ are two Banach spaces, $U$ is a neighbourhood of $x_0 \in X$, $\varphi \in C^1(U; Y)$ and for all $x \in U$, such that $\varphi(x) = \varphi(x_0)$, we have
\[ R\big( \varphi'_F(x) \big) = Y \]
and $N\big( \varphi'_F(x) \big)$ is complemented in $X$ (see Definition 4.1.28), then the set
\[ C \stackrel{df}{=} \big\{ x \in U : \varphi(x) = \varphi(x_0) \big\} \]
is a $C^1$-manifold in $X$. Moreover, if $\varphi \in C^r(U; Y)$ with $r \ge 1$, then $C$ is a $C^r$-manifold in $X$.

4.2: Convex functions play a central role in many applications. There are several books dealing with the continuity and differentiability properties of convex functions on finite or infinite dimensional Banach spaces. We mention the books of Rockafellar (1970a), Webster (1994) (convex functions defined on $\mathbb{R}^N$) and Barbu & Precupanu (1986), Ekeland & Temam (1976), Giles (1982), Ioffe & Tihomirov (1979), Laurent (1972), Phelps (1993) and Roberts & Varberg (1973) (convex functions defined on Banach and locally convex spaces). Note that the books of Giles (1982) and Phelps (1993) approach convex functions from the point of view of functional analysis and place special emphasis on the relations with Banach space theory. Theorem 4.2.12 is a classical result of Mazur (1933). For convex, continuous functions which are everywhere Gâteaux differentiable, there are some stronger results on Fréchet differentiability. In particular, Deville, Godefroy, Hare & Zizler (1987) characterize the separable Banach spaces $X$ so that every continuous convex function


ϕ : X −→ R, which is everywhere Gˆateaux differentiable, must be Fr´echet differentiable on a dense set. It turns out that X ∗ can be nonseparable, but X cannot contain a subspace isomorphic to l1 . 4.3: Haar-null sets (see Definition 4.3.5) were introduced by Christensen (1972), who also obtained all the results up until Corollary 4.3.14. Theorem 4.3.17 is also due to Christensen (1974). In the paper of Hunt, Sauer & Yorke (1992, 1993) (see also their addendum), we find relations between dynamical systems and Haar-null sets. There are other ways to define negligible sets in an infinite dimensional Banach space (such as Gauss-null sets, Aronszajn-null sets and cube-null sets). A detailed discussion of them and their use in the study of the differentiability properties of Lipschitz continuous functions can be found in the book of Benyamini & Lindenstrauss (1997). 4.4: Duality is in the core of convex analysis. The Legendre-Fenchel transform (see Definition 4.4.1) was first used for convex functions on R by Mandelbrojt (1939). This motivated Fenchel (1951) to introduce an important and more general definition for convex functions in RN . The transform introduced by Fenchel is an extension of the Legendre transform (see Legendre (1786)). This is why the transform is called Legendre-Fenchel transform. This notion was extended to dual pairs of locally convex spaces by Brondsted (1964), Moreau (1966–1967) and Rockafellar (1974). A special case of the inequality in Proposition 4.4.3(a) can be found in Young (1912) and for this reason the inequality is called Young-Fenchel inequality. We should point out that some authors (see Ioffe & Tichomirov (1968, 1979)) prefer to name Young-Fenchel transform for what we call here Legendre-Fenchel transform. 
The finite-dimensional duality theory can be found in the books of Fenchel (1951), Rockafellar (1970a), while the infinite dimensional duality theory can be found in the books of Barbu & Precupanu (1986), Ekeland & Temam (1976), Ioffe & Tihomirov (1979), Laurent (1972). Theorem 4.4.14 is the main result in the duality theory for convex functions and sometimes it is called the Fenchel-Moreau theorem. First Fenchel (1951) observed that $\varphi \in \Gamma_0(\mathbb{R}^N)$ if and only if it is the supremum of all affine continuous functions majorized by $\varphi$. Soon thereafter Hörmander (1955) established the following result.

PROPOSITION 4.7.3 If $X$ is a locally convex space, then there is a bijective correspondence between nonempty, closed, convex sets $C \subseteq X$ and sublinear, $w(X^*, X)$-lower semicontinuous functions on $X^*$ with values in $\overline{\mathbb{R}} \overset{\mathrm{df}}{=} \mathbb{R} \cup \{+\infty\}$, which maps $C$ into $\sigma_C$.

Theorem 4.4.14 in conjunction with Proposition 4.4.13 says that $\varphi^{**}$ is the biggest convex and lower semicontinuous function majorized by $\varphi$ (sometimes this is denoted by writing that $\varphi^{**} = \overline{\mathrm{conv}}\,\varphi$, which is a suggestive notation expressing the fact that $\mathrm{epi}\,\varphi^{**} = \overline{\mathrm{conv}}\,\mathrm{epi}\,\varphi$). This fact is important in control
theory in connection with the relaxation method. If the ambient space is $\mathbb{R}^N$, then using Carathéodory's theorem for convex sets in $\mathbb{R}^N$, we obtain the following useful expression for $\varphi^{**}$ (see Ioffe & Tihomirov (1979, p. 189)).

PROPOSITION 4.7.4 If $\varphi : \mathbb{R}^N \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous function and $\mathrm{dom}\,\varphi^{**} \subseteq \mathbb{R}^N$ is closed, then
$$\varphi^{**}(x) = \inf \Big\{ \sum_{k=1}^{N+1} \lambda_k \varphi(x_k) : x_k \in \mathbb{R}^N,\ \lambda_k > 0,\ \sum_{k=1}^{N+1} \lambda_k = 1,\ \sum_{k=1}^{N+1} \lambda_k x_k = x \Big\}.$$
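The formula in Proposition 4.7.4 can be explored numerically in dimension $N = 1$, where it involves only $N + 1 = 2$ points. The following is a minimal sketch under stated assumptions, not part of the text: the double-well function, the grids `xs` and `ps`, and the helper names `conjugate`, `biconjugate` and `caratheodory_envelope` are all hypothetical choices made for the illustration. The function is sampled on a bounded grid, which amounts to setting it equal to $+\infty$ outside $[-2, 2]$.

```python
import numpy as np


def conjugate(points, values, slopes):
    """Discrete Legendre-Fenchel transform: max over the grid of p*x - f(x)."""
    return np.max(slopes[:, None] * points[None, :] - values[None, :], axis=1)


def biconjugate(points, values, slopes):
    """Apply the discrete transform twice to approximate f** on the grid."""
    f_star = conjugate(points, values, slopes)
    return conjugate(slopes, f_star, points)


def caratheodory_envelope(points, values):
    """Evaluate the two-point (N + 1 = 2) formula on a sorted 1-D grid."""
    envelope = values.copy()  # covers the trivial choice x_1 = x_2 = x
    for i in range(1, len(points) - 1):
        left = points[:i, None]        # candidates x_1 < x
        right = points[None, i + 1:]   # candidates x_2 > x
        lam = (right - points[i]) / (right - left)  # weight of x_1
        mix = lam * values[:i, None] + (1.0 - lam) * values[None, i + 1:]
        envelope[i] = min(envelope[i], mix.min())
    return envelope


# Hypothetical test function: a nonconvex double well with minima at -1 and 1.
xs = np.linspace(-2.0, 2.0, 81)
ps = np.linspace(-30.0, 30.0, 601)
phi = (xs**2 - 1.0) ** 2

phi_bb = biconjugate(xs, phi, ps)
phi_hull = caratheodory_envelope(xs, phi)
print("largest gap between the two computations:",
      float(np.max(np.abs(phi_bb - phi_hull))))
```

On this example both computations flatten the well between the two minima, which is the relaxation effect mentioned above. The slope grid `ps` must be wide and fine enough to contain a supporting slope at every grid point, otherwise the discrete biconjugate only gives a lower bound.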

The operation of infimal convolution (see Definition 4.4.6(b)) was introduced by Moreau (1965, 1966–1967) and its duality properties were studied by Ioffe & Tichomirov (1968, 1979). The proof of Proposition 4.4.16 can be found in Ioffe & Tihomirov (1979, p. 178). Although affine continuous supports for convex functions were considered earlier, the first systematic study of the subdifferential multifunction started with the works of Moreau (1965, 1966–1967) and Rockafellar (1966, 1970b). Moreau (1965) restricts himself to the framework of Hilbert spaces, while Rockafellar (1970b) passes to general Banach spaces. We should also mention the related work of Pshenichnyi (1971) on quasi-differentiable functions. One of the main results of the convex subdifferential theory is Theorem 4.4.34 (see also Remark 4.4.35). This was first proved by Rockafellar (1966), but it was found that his proof had a gap. This was remedied by Rockafellar (1970b), where we also find the proof of Proposition 4.4.31. The notion of cyclically monotone operators is due to Rockafellar (1966), who proved Theorem 4.4.39 (see also Rockafellar (1970b, Theorem B, p. 210)). Proposition 4.4.42 is due to Brézis (1973). For the proof of Proposition 4.4.46, we refer to Phelps (1993, p. 19). The ε-subdifferential (see Definition 4.4.49) was investigated systematically by Hiriart-Urruty (1980, 1982), and Hiriart-Urruty & Phelps (1993). Convex subdifferentials found widespread applications in optimization, control theory and evolution equations, as seen in the books of Barbu (1976, 1994), Barbu & Precupanu (1986), Dontchev & Zolezzi (1993), Ekeland & Temam (1976), Hiriart-Urruty & Lemaréchal (1993), Hu & Papageorgiou (1997, 2000), Ioffe & Tihomirov (1979), Rockafellar (1970a, 1974), Rockafellar & Wets (1998) and Tiba (1990). The proof of Proposition 4.4.53 can be found in Rockafellar (1970a, pp. 219–220).

The subdifferential theory for locally Lipschitz functionals is due to Clarke (1975, 1981, 1983). Only Theorem 4.4.72 is due to Lebourg (1975). Applications of the generalized subdifferential can be found in the books of Clarke (1983, 1989), Clarke, Ledyaev, Stern & Wolenski (1998) and in Naniewicz & Panagiotopoulos (1995) and Gasiński & Papageorgiou (2005) (which deal with hemivariational inequalities). Of the other subdifferentials, the viscosity
subdifferential was explicitly defined by Deville, Godefroy & Zizler (1993) and studied by Borwein & Zhu (1996). The proximal subdifferential is discussed in Clarke (1989), Clarke, Ledyaev, Stern & Wolenski (1998) and Rockafellar & Wets (1998) and the canonical subdifferential was introduced by Penot (1978).

4.5: Integral functionals determined by (convex) normal integrands were first studied by Rockafellar (1968). Several results for integrands defined on $\Omega \times \mathbb{R}^N$ can be found in Rockafellar (1976) with additional results and extensions in Rockafellar (1971a, 1971c, 1971b). The work was extended by Levin (1973, 1974, 1975, 1980), who removed some restrictive finite dimensionality or reflexivity hypotheses. Theorem 4.5.7, known as the Yosida-Hewitt decomposition theorem, was first proved by Yosida & Hewitt (1952) for $X = \mathbb{R}$ and $\mu$ being a finite measure. Another, more direct proof can be found in Dubovitskii & Miljutin (1968). Ioffe & Levin (1972) extended the result to a separable Banach space $X$ and a finite measure $\mu$. Their proof does not extend to $\mu$ being $\sigma$-finite. The general form (see Theorem 4.5.7) is due to Levin (1974). Theorem 4.5.2 is due to Rockafellar (1971a) (for a reflexive, separable Banach space $X$) and Levin (1975) (for a separable Banach space $X$). Similarly Theorem 4.5.8 is due to Rockafellar (1971b) (for $X = \mathbb{R}^N$), Rockafellar (1971a) (for a reflexive, separable Banach space $X$) and Levin (1975) (for a separable Banach space $X$). Proposition 4.5.13 is due to Moreau (1966–1967) and its proof can also be found in Laurent (1972, p. 348), while Theorem 4.5.16 is due to Rockafellar (1971a). Additional results on convex integral functionals can be found in Bismut (1973), Castaing & Valadier (1977), Papageorgiou (1986) and Valadier (1975). There is a continuous analog of the operation of infimal convolution (see Definition 4.4.6(b)).

DEFINITION 4.7.5 Let $(\Omega, \Sigma, \mu)$ be a finite, complete, nonatomic measure space and let $\varphi : \Omega \times \mathbb{R}^N \longrightarrow \overline{\mathbb{R}}$ be a $\Sigma \times \mathcal{B}(\mathbb{R}^N)$-measurable integrand. The inf-convolution integral of $\varphi$ with respect to $\mu$ is the function $\oint_\Omega \varphi_\omega \, d\mu : \mathbb{R}^N \longrightarrow \mathbb{R}^*$, defined by
$$\Big( \oint_\Omega \varphi_\omega \, d\mu \Big)(x) \overset{\mathrm{df}}{=} \inf \Big\{ \lambda \in \mathbb{R} : (x, \lambda) \in \int_\Omega \mathrm{epi}\,\varphi(\omega, \cdot) \, d\mu \Big\}.$$

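REMARK 4.7.6 In the above definition
$$\int_\Omega \mathrm{epi}\,\varphi(\omega, \cdot) \, d\mu = \Big\{ \int_\Omega \big( u(\omega), \lambda(\omega) \big) \, d\mu : (u, \lambda) \in S^1_{\mathrm{epi}\,\varphi(\omega, \cdot)} \Big\}.$$
If for all $x \in L^1(\Omega; \mathbb{R}^N)$, $I_\varphi(x)$ exists (possibly infinite), then
$$\Big( \oint_\Omega \varphi_\omega \, d\mu \Big)(x) = \inf \Big\{ I_\varphi(u) : u \in L^1(\Omega; \mathbb{R}^N),\ \int_\Omega u(\omega) \, d\mu = x \Big\}.$$
In this form this operation arises in mathematical economics (see Aumann & Shapley (1974)). The next result is due to Ioffe & Tichomirov (1968). It is the continuous analog of Proposition 4.4.8.

PROPOSITION 4.7.7 If $\mathrm{dom} \big( \oint_\Omega \varphi_\omega \, d\mu \big) \neq \emptyset$, then
$$\Big( \oint_\Omega \varphi_\omega \, d\mu \Big)^* = \int_\Omega \varphi^*_\omega \, d\mu.$$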

4.6: Theorem 4.6.1 is due to Ekeland (1974). A detailed discussion with various applications can be found in Ekeland (1979, 1989). Theorem 4.6.9 with $F$ being single valued was proved by Caristi (1976) using a different proof based on transfinite induction (see also Caristi & Kirk (1975)). Theorem 4.6.14 is due to Daneš (1972), but his proof used a result of Krasnoselskii & Zabreiko (1984). The proof given here is due to Brondsted (1974). Relations between these and other geometric theorems of nonlinear analysis were proved by Brézis & Browder (1976), Daneš (1972) and Penot (1986). Proposition 4.6.19 and Theorem 4.6.20 are due to Brondsted & Rockafellar (1965). For the proof of Theorem 4.6.22, we refer to Denkowski, Migórski & Papageorgiou (2003a, p. 384). Theorem 4.6.24 and Corollary 4.6.26 are due to Browder (1971a, 1971b). Another nonlinear surjectivity result due to Bates & Ekeland (1980) is the following.

PROPOSITION 4.7.8 If $X$ and $Y$ are two Banach spaces, $\varphi : X \longrightarrow Y$ is continuous, Gâteaux differentiable, $R\big(\varphi'_G(x)\big) = Y$ for all $x \in X$ and there exists $k > 0$, such that for all $x \in X$ and all $y \in Y$, there exists $z \in \big(\varphi'_G(x)\big)^{-1}(y)$ satisfying $\|z\|_X \leqslant k \|y\|_Y$, then $\varphi(X) = Y$, i.e., $\varphi$ is surjective.

Theorem 4.6.28 is due to Borwein & Preiss (1987) and it is known as the Borwein-Preiss smooth variational principle. Theorem 4.6.29 is due to Deville, Godefroy & Zizler (1993). More applications of the Ekeland variational principle can be found in Barbu (1994), Denkowski, Migórski & Papageorgiou (2003b), Fattorini (1999), Li & Yong (1995) and Willem (1996).

Chapter 5 Critical Point Theory

Variational methods are a valuable tool in the analysis of nonlinear problems. According to these methods, we are trying to find solutions of a given nonlinear equation, by looking for critical (stationary) points of a functional defined on the function space in which we want the solution of our problem to lie. The Euler-Lagrange equation satisfied by a critical point is the nonlinear equation that we are trying to solve. The functional, whose critical points we are trying to determine, in many cases is unbo