Series in Mathematical Analysis and Applications Edited by Ravi P. Agarwal and Donal O’Regan
VOLUME 9
NONLINEAR ANALYSIS
SERIES IN MATHEMATICAL ANALYSIS AND APPLICATIONS

Series in Mathematical Analysis and Applications (SIMAA) is edited by Ravi P. Agarwal, Florida Institute of Technology, USA and Donal O’Regan, National University of Ireland, Galway, Ireland. The series is aimed at reporting on new developments in mathematical analysis and applications of a high standard and/or current interest. Each volume in the series is devoted to a topic in analysis that has been applied, or is potentially applicable, to the solutions of scientific, engineering and social problems.

Volume 1: Method of Variation of Parameters for Dynamic Systems, V. Lakshmikantham and S.G. Deo
Volume 2: Integral and Integrodifferential Equations: Theory, Methods and Applications, edited by Ravi P. Agarwal and Donal O’Regan
Volume 3: Theorems of Leray-Schauder Type and Applications, Donal O’Regan and Radu Precup
Volume 4: Set Valued Mappings with Applications in Nonlinear Analysis, edited by Ravi P. Agarwal and Donal O’Regan
Volume 5: Oscillation Theory for Second Order Dynamic Equations, Ravi P. Agarwal, Said R. Grace, and Donal O’Regan
Volume 6: Theory of Fuzzy Differential Equations and Inclusions, V. Lakshmikantham and Ram N. Mohapatra
Volume 7: Monotone Flows and Rapid Convergence for Nonlinear Partial Differential Equations, V. Lakshmikantham, S. Koksal, and Raymond Bonnett
Volume 8: Nonsmooth Critical Point Theory and Nonlinear Boundary Value Problems, Leszek Gasiński and Nikolaos S. Papageorgiou
Volume 9: Nonlinear Analysis, Leszek Gasiński and Nikolaos S. Papageorgiou
Leszek Gasiński
Nikolaos S. Papageorgiou
Boca Raton London New York Singapore
Published in 2005 by Chapman & Hall/CRC Taylor & Francis Group 6000 Broken Sound Parkway NW, Suite 300 Boca Raton, FL 33487-2742 © 2005 by Taylor & Francis Group, LLC Chapman & Hall/CRC is an imprint of Taylor & Francis Group No claim to original U.S. Government works Printed in the United States of America on acid-free paper 10 9 8 7 6 5 4 3 2 1 International Standard Book Number-10: 1-58488-484-3 (Hardcover) International Standard Book Number-13: 978-1-58488-484-2 (Hardcover) Library of Congress Card Number 2005045529 This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use. No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers. For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC) 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged. Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.
Library of Congress Cataloging-in-Publication Data Gasinski, Leszek. Nonlinear analysis / Leszek Gasinski, Nikolaos S. Papageorgiou. p. cm. -- (Series in mathematical analysis and applications ; v. 9) Includes bibliographical references and index. ISBN 1-58488-484-3 1. Nonlinear functional analysis. 2. Nonlinear operators. I. Papageorgiou, Nikolaos Socrates. II. Title. III. Series. QA321.5.G37 2005 515'.7--dc22
2005045529
Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com
Taylor & Francis Group is the Academic Division of T&F Informa plc.
To Prof. Zdzisław Denkowski
Contents
1 Hausdorff Measures and Capacity
1.1 Measure Theoretical Background
1.2 Covering Results
1.3 Hausdorff Measure and Hausdorff Dimension
1.4 Differentiation of Hausdorff Measures
1.5 Lipschitz Functions
1.6 Capacity
1.7 Remarks

2 Lebesgue-Bochner and Sobolev Spaces
2.1 Vector-Valued Functions
2.2 Lebesgue-Bochner Spaces and Evolution Triples
2.3 Compactness Results
2.4 Sobolev Spaces
2.5 Inequalities and Embedding Theorems
2.6 Fine Properties of Functions and BV-Functions
2.7 Remarks

3 Nonlinear Operators and Young Measures
3.1 Compact and Fredholm Operators
3.2 Operators of Monotone Type
3.3 Accretive Operators and Semigroups of Operators
3.4 The Nemytskii Operator and Integral Functions
3.5 Young Measures
3.6 Remarks

4 Smooth and Nonsmooth Analysis and Variational Principles
4.1 Differential Calculus in Banach Spaces
4.2 Convex Functions
4.3 Haar Null Sets and Locally Lipschitz Functions
4.4 Duality and Subdifferentials
4.5 Integral Functionals and Subdifferentials
4.6 Variational Principles
4.7 Remarks

5 Critical Point Theory
5.1 Deformation Results
5.2 Minimax Theorems
5.3 Structure of the Critical Set
5.4 Multiple Critical Points
5.5 Lusternik-Schnirelman Theory and Abstract Eigenvalue Problems
5.6 Remarks

6 Eigenvalue Problems and Maximum Principles
6.1 Linear Elliptic Operators
6.2 The Partial p-Laplacian
6.3 The Ordinary p-Laplacian
6.4 Maximum Principles
6.5 Comparison Principles
6.6 Remarks

7 Fixed Point Theory
7.1 Metric Fixed Point Theory
7.2 Topological Fixed Point Theory
7.3 Partial Order and Fixed Points
7.4 Fixed Points of Multifunctions
7.5 Remarks

Appendix
A.1 Topology
A.2 Measure Theory
A.3 Functional Analysis
A.4 Calculus and Nonlinear Analysis

List of Symbols

References
Preface
Linear functional analysis deals with infinite dimensional topological vector spaces (which mix in a fruitful way the linear (algebraic) structure with the topological one) and the linear operators acting between them. The effort was to extend standard results of linear analysis to an infinite dimensional context. The first half of the twentieth century was marked by intensive theoretical investigations in this area, which were also accompanied by detailed treatment of linear mathematical models. With the exception of a short period during the 1930s (compact operators and Leray-Schauder degree), nonlinear operators were out of the emerging picture. However, mounting evidence from diverse fields such as physics, engineering, economics and biology suggested that there should be an effort to extend the linear theory to various kinds of nonlinear operators. Systematic efforts in this direction started in the early 1960s and mark the beginning of what is known today as “Nonlinear Analysis.” Since then several theories have been developed in this respect; today some of them are well established and approaching their limits, while others are still the object of intense research activity. It is not a coincidence that simultaneously with the advent of nonlinear analysis, we have the appearance of nonsmooth analysis and of multivalued analysis, both of which were motivated by concrete needs in applied areas such as control theory, optimization, game theory and economics. Their development provided nonlinear analysis with new concepts, tools and theories that enriched the subject considerably. Today nonlinear analysis is a well established mathematical discipline, which is characterized by a remarkable mixture of analysis, topology and applications. It is exactly the fact that the subject combines these three items in a beautiful way that makes it attractive to mathematicians.
The notions and techniques of nonlinear analysis provide the appropriate tools to develop more realistic and accurate models describing various phenomena. This gives nonlinear analysis a rather interdisciplinary character. Today the more theoretically inclined nonmathematician (engineer, economist, biologist or chemist) needs a working knowledge of at least a part of nonlinear analysis in order to be able to conduct a complete qualitative analysis of his models. This supports a high demand for books on nonlinear analysis. Of course the subject is big (vast is maybe a more appropriate word) and no single book can cover all its theoretical and applied parts. In this volume, we have focused on those topics of nonlinear analysis which are pertinent to the theory of boundary value problems and their applications such as control theory and calculus of variations.
In Chapter 1 we deal with Hausdorff measures and capacities, which provide the means to estimate the “size” or “dimension” of “thin” or “highly irregular” sets. The recent development of fractal geometry and its uses in a variety of applied areas (such as Brownian motion of particles, turbulence in fluids, geographical coastlines and surfaces, etc.) renewed the interest in Hausdorff measures, which for a long period were a topic of secondary importance within measure theory. In this chapter we also have our first encounter with Lipschitz and locally Lipschitz functionals, which will be examined again in Chapter 4. At this point we prove the celebrated “Rademacher’s theorem.”

Chapter 2 deals with certain classes of function spaces, which arise naturally in the study of boundary value problems. These are the Lebesgue-Bochner spaces (the suitable spaces for the analysis of evolution equations) and the Sobolev spaces (the suitable spaces for weak solutions of elliptic equations). We conduct a detailed study of these spaces with special emphasis on compactness and embedding results. Also, using the tool of Hausdorff measures and capacities, we investigate the fine properties of Sobolev functions and introduce and study functions of bounded variation, which are useful in theoretical mechanics.

In Chapter 3, we deal with certain large classes of nonlinear operators which arise often in applications. We examine compact operators for which we develop in parallel the corresponding linear theory, with one of the main results being the spectral theorem for compact self-adjoint operators on a Hilbert space. We also investigate nonlinear operators of monotone type, which have their roots in the calculus of variations and exhibit remarkable surjectivity properties. Monotone operators lead to accretive operators, the two families being identical in the context of Hilbert spaces. Accretive operators are closely connected with the generation theory of semigroups of operators. We also examine both linear and nonlinear semigroups. Semigroups are basic tools in the study of evolution equations. In addition, we examine the Nemytskii operator, which is a nonlinear operator encountered in almost all problems. Finally, in the last section of the chapter, we discuss Young measures, which provide the right framework to examine the limit behavior of the minimizing sequences of variational problems which do not have a solution. Young measures are used in optimal control and in the calculus of variations in connection with the so-called “relaxation method.”

Chapter 4 presents the calculus of smooth and of certain broad classes of nonsmooth functions. We start with the Gâteaux and Fréchet derivatives. We discuss the generic differentiability of continuous convex functions (Mazur’s theorem) and extend Rademacher’s theorem to locally Lipschitz functions between certain Banach spaces by using the notion of Haar-null sets. Then we pass to nondifferentiable functions and develop the duality properties and subdifferential theory of convex functions and the generalized subdifferential of locally Lipschitz functions. We also examine integral functionals and discuss the celebrated Ekeland variational principle, establishing its equivalence with some other geometric results of nonlinear analysis.

In Chapter 5 we present the critical point theory of $C^1$-functions defined on a Banach space. This theory is at the core of the variational methods used in the study of boundary value problems. We follow the deformation approach, which leads to minimax characterizations of the critical values. We also study the structure of the set of critical points and derive results on the existence of multiple critical points. Next we present the Lusternik-Schnirelman theory, which extends to nonlinear eigenvalue problems the corresponding linear theory of R. Courant.

Chapter 6 uses the abstract results of Chapter 5 as well as results from earlier chapters to develop the spectrum of linear elliptic differential operators, of the partial p-Laplacian (with Dirichlet and Neumann boundary conditions) and of the scalar and vector ordinary p-Laplacian (with Dirichlet, Neumann and periodic boundary conditions). We also present linear and nonlinear maximum principles and comparison results, which are useful tools in the study of boundary value problems.

Finally in Chapter 7 we have gathered some basic fixed point theorems. We present results from metric fixed point theory, from topological fixed point theory and fixed point results based on the partial order induced by a closed, convex pointed cone. We also indicate how many of these results can be extended to multifunctions (set-valued functions).

We have tried to make the volume self-contained. For this reason at the end of the book we have included a rather extended appendix for easy reference of the general results used in the book. Nevertheless, within the text, whenever we need to use some results not proved in the book, we also give exact references where the interested reader can find additional information.

Now that the project has reached its conclusion, we would like to thank the good people of CRC Press (especially Mrs. Jessica Vakili) for their help and kind cooperation during the preparation of this book.
We would like to thank the two editors of this series, Prof. R.P. Agarwal and Prof. D.O’Regan, for supporting this effort.
Chapter 1 Hausdorff Measures and Capacity
During the golden era of measure theory (namely the first two decades of the 20th century), Carathéodory was the first to consider the notion of “length” for sets in $\mathbb{R}^N$. Later, in 1919, Hausdorff, motivated by the ideas of Carathéodory, introduced the measure and dimensional concepts that we shall discuss in this chapter. So in the modern language, the “length” of a set $A \subseteq \mathbb{R}^N$ will be its Hausdorff one-dimensional outer measure (denoted by $\mu^{(1)}$). Following the pioneering works of Carathéodory and Hausdorff, significant contributions to the subject were made by Besicovitch. In fact, in the first decade of development of the subject, the main advances were made by Besicovitch and his students, since geometric measure theory was not part of the mainstream measure theory. However, since the early 1970s, the subject has attracted a large number of researchers, due to its fundamental importance in the study of the so-called “Fractal Geometry.” Fractal sets arise in many applications, such as turbulence in fluids, geographical coastlines and surfaces, fluctuation of prices in stock exchanges, the Brownian motion of particles and others. Mandelbrot was the first to emphasize their use to model a variety of phenomena.

There have been many ways to estimate the “size” or “dimension” of small (thin) sets and of highly irregular sets and to generalize the idea that points, curves and surfaces have dimensions 0, 1 and 2 respectively. Hausdorff measure has the advantage of being a measure and together with the notion of Hausdorff dimension can provide a more delicate sense of the size of sets in $\mathbb{R}^N$ than Lebesgue measure provides. To illustrate this, consider in $\mathbb{R}^2$ the set
$$A \stackrel{df}{=} \left\{ \left( t, \sin\frac{1}{t} \right) : t \in (0,1) \right\}.$$
Suppose we wish to measure the length of the curve $A$. A first approximation can be based on the Carathéodory outer measure, which defines
$$\lambda^1(A) \stackrel{df}{=} \inf_{A \subseteq \bigcup_{n=1}^{\infty} C_n} \sum_{n=1}^{\infty} \delta(C_n),$$
i.e., the infimum is taken over all countable covers of $A$ (by $\delta(A)$ we denote the diameter of the set $A$; see (1.1)). If we adopt this definition, we see that $\lambda^1(A) < +\infty$, while we know that the length of $A$ is infinite. The reason for this is that in the definition of $\lambda^1(A)$, the covers of $A$ are not forced to follow the geometry of $A$. For this reason the Hausdorff $s$-dimensional measures $\mu^{(s)}(A)$ are defined as limits of outer measures $\mu^{(s)}_{\delta}$ which follow the local geometry of $A$ (see Definition 1.3.5). As another illustrative example, consider the unit square $S$ in $\mathbb{R}^2$ (i.e., the square of side length equal to 1) and define
$$\lambda^1(S) \stackrel{df}{=} \inf_{S \subseteq \bigcup_{n=1}^{\infty} C_n} \sum_{n=1}^{\infty} \delta(C_n),$$
i.e., again the infimum is taken over all countable covers of $S$. We observe that we can do no better than cover $S$ by $S$ itself. Indeed, if we cover $S$ with smaller squares of diameter less than or equal to $\frac{1}{n}$, then we see that we need at least $n^2$ squares to achieve the covering and so the approximation of $\lambda^1(S)$ obtained this way exceeds $n\sqrt{2}$. So the smaller the squares we use to cover, the bigger the estimate for $\lambda^1(S)$. Therefore, small squares are irrelevant in the calculation of $\lambda^1(S)$ and yet it is precisely they that should have an influence on the evaluation of $\lambda^1(S)$. We expect $\lambda^1(S) = +\infty$, since the diameter is a one-dimensional concept and it is used to measure a square in $\mathbb{R}^2$, which is a two-dimensional object. For this we need a definition which takes into account the local geometry of the set under consideration.

In this chapter, in Section 1.1 we recall some basic definitions and facts from measure theory, which will be needed in what follows. In Section 1.2, we discuss some “covering theorems.” Covering results play a central role in geometric measure theory. In Section 1.3 we introduce and study Hausdorff measures and the Hausdorff dimension of sets. Among other things we calculate the Hausdorff dimension of some classical irregular sets in $\mathbb{R}$ (Cantor-like sets). From these calculations, the reader will realize that the Hausdorff measure and the Hausdorff dimension of sets (even of simple ones) may be hard to calculate. For this reason sometimes other notions may be more suitable (such as capacity; see Section 1.6). In Section 1.4 we discuss the differentiation of Hausdorff measures and derive the Lebesgue-Besicovitch differentiation theorem. In Section 1.5, using the tools of Hausdorff measures, we study the geometry of Lipschitz continuous functions. Among other things, we obtain the “area and coarea formulas” and the associated “change of variables formulas.” Finally in Section 1.6, we present an alternative analytical notion measuring small sets in $\mathbb{R}^N$, namely the $p$-capacity. We derive some basic properties of the $p$-capacities and compare them to the Hausdorff measures.
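The two introductory examples can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the helper names `zigzag_length` and `grid_cover_sum` are hypothetical, the curve is sampled only at the points $t_k = 2/((2k+1)\pi)$ where $\sin(1/t_k) = \pm 1$, and the square is tiled by $n^2$ squares of side $1/n$ (diameter $\sqrt{2}/n$) instead of ranging over arbitrary covers.

```python
import math


def zigzag_length(num_extrema: int) -> float:
    """Length of the polygon through the first num_extrema points
    (t_k, sin(1/t_k)) with t_k = 2 / ((2k + 1) * pi), where the sine
    alternates between +1 and -1. This is a lower bound for the length
    of the curve A = {(t, sin(1/t)) : 0 < t < 1}."""
    points = []
    for k in range(num_extrema):
        t = 2.0 / ((2 * k + 1) * math.pi)
        points.append((t, math.sin(1.0 / t)))
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def grid_cover_sum(n: int) -> float:
    """Sum of the diameters of the n * n squares of side 1/n that tile
    the unit square S (an assumed, particularly simple cover)."""
    side = 1.0 / n
    diameter = side * math.sqrt(2.0)
    return n * n * diameter


if __name__ == "__main__":
    for k in (10, 100, 1000):
        print(k, zigzag_length(k))       # grows at least like 2 * (k - 1)
    for n in (1, 10, 100):
        print(n, grid_cover_sum(n))      # equals n * sqrt(2)
```

Each segment of the polygon has vertical extent 2, so the length of $A$ is unbounded, although the single set $(0,1) \times [-1,1]$ of diameter $\sqrt{5}$ already covers $A$. For the square, the tiling sum grows with $n$, while the one-set cover by $S$ itself gives $\sqrt{2}$.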
1.1 Measure Theoretical Background
In this section we recall some basic definitions and facts from measure theory, which we shall need in the sequel. Let us start with the concept of outer measure, which, when restricted to a suitable σ-field of sets, leads to a measure.

DEFINITION 1.1.1 Let $X$ be a set. A map $\mu \colon 2^X \longrightarrow [0,+\infty]$ is said to be an outer measure, if
(a) $\mu(\emptyset) = 0$;
(b) $A \subseteq B \Longrightarrow \mu(A) \le \mu(B)$ (monotonicity);
(c) for any sequence of sets $\{A_n\}_{n \ge 1} \subseteq 2^X$, we have
$$\mu\Big( \bigcup_{n=1}^{\infty} A_n \Big) \le \sum_{n=1}^{\infty} \mu(A_n)$$
(subadditivity).
For a given outer measure $\mu$ on $X$ and $A \in 2^X$, we define the restriction of $\mu$ on $A$, denoted by $\mu \lfloor A$, by
$$(\mu \lfloor A)(B) \stackrel{df}{=} \mu(A \cap B) \quad \forall\, B \in 2^X.$$
We say that $\mu$ is a finite outer measure if $\mu(X) < +\infty$ (i.e., $\mu$ has values in $\mathbb{R}_+$).

REMARK 1.1.2 Note that $\mu \lfloor A$ is an outer measure on $X$, while we define $\mu|_A$ to be the restriction of $\mu$ (as a function) on $2^A$, i.e., $\mu|_A \colon 2^A \longrightarrow [0,+\infty]$ is defined by
$$\big( \mu|_A \big)(B) \stackrel{df}{=} \mu(B) \quad \forall\, B \in 2^A \subseteq 2^X.$$
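On a finite set the axioms of Definition 1.1.1 can be verified by brute force, since countable subadditivity then reduces to subadditivity for pairs. The following sketch is an illustration only: the toy set function `mu`, the ground set `X` and the helper names are assumptions made for this example and do not come from the text.

```python
from itertools import combinations

X = frozenset({0, 1, 2})  # assumed toy ground set


def subsets(space):
    """All subsets of a finite set, as frozensets."""
    items = sorted(space)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def mu(a):
    """Assumed toy outer measure: mu(A) = min(|A|, 2)."""
    return min(len(a), 2)


def is_outer_measure(m, space):
    """Check (a) m(empty) = 0, (b) monotonicity, (c) subadditivity for pairs."""
    family = subsets(space)
    if m(frozenset()) != 0:
        return False
    for a in family:
        for b in family:
            if a <= b and m(a) > m(b):
                return False
            if m(a | b) > m(a) + m(b):
                return False
    return True


def restrict(m, a):
    """The restriction (m restricted to A)(B) = m(A & B), still defined on all of 2^X."""
    return lambda b: m(a & b)


if __name__ == "__main__":
    print(is_outer_measure(mu, X))                               # True
    print(is_outer_measure(restrict(mu, frozenset({0, 1})), X))  # True
    print(is_outer_measure(lambda a: len(a) ** 2, X))            # False
```

The last call fails subadditivity (two disjoint singletons have union of size 2, and $4 > 1 + 1$), which shows that the check is not vacuous.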
Outer measures are useful because they lead to measures when restricted to suitably defined σ-fields. These σ-fields can be quite large.

DEFINITION 1.1.3 Let $X$ be a set and $\mu$ an outer measure on $X$. A set $A \in 2^X$ is said to be $\mu$-measurable, if
$$\mu(B) = \mu(A \cap B) + \mu(B \setminus A) \quad \forall\, B \in 2^X,$$
i.e., $A$ “decomposes” every set $B$ additively.

REMARK 1.1.4 Let $X$ be a set and $\mu$ an outer measure on $X$.
(a) By virtue of the subadditivity property of an outer measure, to show that $A \in 2^X$ is $\mu$-measurable, it is enough to check that
$$\mu(B) \ge \mu(A \cap B) + \mu(B \setminus A) \quad \forall\, B \in 2^X.$$
(b) Clearly, if $A \in 2^X$ and $\mu(A) = 0$, then $A$ is $\mu$-measurable.
(c) If $A \in 2^X$, then any $\mu$-measurable set is also $\mu \lfloor A$-measurable.
(d) $A$ is $\mu$-measurable if and only if $A^c = X \setminus A$ is $\mu$-measurable.

It is straightforward to check the following result.

PROPOSITION 1.1.5 If $X$ is a set and $\mu$ is an outer measure on $X$, then the collection $\Sigma_\mu$ of all $\mu$-measurable sets is a σ-field and $\mu$ restricted on $\Sigma_\mu$ is a measure.

REMARK 1.1.6 While Definition 1.1.3 involves only additivity of $\mu$, the conclusion in Proposition 1.1.5 is about σ-additivity of $\mu$ on $\Sigma_\mu$. This reveals the power of Definition 1.1.3. Note that from Remark 1.1.4(b), it follows that the σ-field $\Sigma_\mu$ is $\mu$-complete.

DEFINITION 1.1.7 Let $X$ be a nonempty Hausdorff topological space and let $\mu$ be an outer measure on $X$.
(a) Let $\mathcal{T}$ be a subfamily of $2^X$. We say that $\mu$ is $\mathcal{T}$-regular, if
$$\mu(A) = \inf_{\substack{B \in \mathcal{T} \\ A \subseteq B}} \mu(B) \quad \forall\, A \in 2^X.$$
If $\mathcal{T} = \Sigma_\mu$, then we simply say that $\mu$ is regular.
(b) We say that $\mu$ is a Borel measure, if $\mathcal{B}(X) \subseteq \Sigma_\mu$ with $\mathcal{B}(X)$ being the Borel σ-field of $X$.
(c) We say that $\mu$ is a Borel regular measure, if $\mu$ is a Borel measure which is $\mathcal{B}(X)$-regular.
(d) We say that $\mu$ is a Radon measure, if $\mu$ is a Borel regular measure and
$$\mu(K) < +\infty \quad \forall\, K \subseteq X,\ K \text{ compact}.$$
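The measurability test of Definition 1.1.3, the conclusion of Proposition 1.1.5 and the notion of regularity from Definition 1.1.7(a) can all be explored by exhaustive search on a finite set. The sketch below is a hedged illustration: the toy outer measure `mu`, the ground set `X` and all helper names are assumptions made for this example only.

```python
from itertools import combinations

X = frozenset({0, 1, 2})  # assumed toy ground set


def subsets(space):
    """All subsets of a finite set, as frozensets (smallest first)."""
    items = sorted(space)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def mu(a):
    """Assumed toy outer measure: mu(A) = min(|A|, 2)."""
    return min(len(a), 2)


def measurable_sets(m, space):
    """Sets A with m(B) = m(A & B) + m(B - A) for every B (Definition 1.1.3)."""
    family = subsets(space)
    return [a for a in family
            if all(m(b) == m(a & b) + m(b - a) for b in family)]


def is_sigma_field(family, space):
    """On a finite set: contains the space, closed under complements and unions."""
    fam = set(family)
    return (space in fam
            and all(space - a in fam for a in fam)
            and all(a | b in fam for a in fam for b in fam))


def is_regular(m, space):
    """m(A) equals the infimum of m(B) over measurable B containing A."""
    sigma = measurable_sets(m, space)
    return all(m(a) == min(m(b) for b in sigma if a <= b)
               for a in subsets(space))


if __name__ == "__main__":
    sigma = measurable_sets(mu, X)
    print(sigma)                     # only the empty set and X
    print(is_sigma_field(sigma, X))  # True
    print(is_regular(mu, X))         # False
    print(is_regular(len, X))        # True for the counting measure
```

For this particular `mu` the σ-field $\Sigma_\mu$ is trivial, so a singleton of outer measure 1 cannot be approximated from outside by measurable sets and `mu` is not regular; the counting measure, by contrast, makes every subset measurable.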
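REMARK 1.1.8 Let $X$ be a Hausdorff topological space and let $\mu$ be an outer measure on $X$.
(a) Note that $\mu$ is regular if and only if
$$\forall\, A \in 2^X \ \ \exists\, B \in \Sigma_\mu,\ A \subseteq B : \ \mu(A) = \mu(B).$$
(b) If $\mu$ is regular on $X$ and $\{A_n\}_{n \ge 1} \subseteq 2^X$ is increasing (i.e., $A_n \subseteq A_{n+1}$ for $n \ge 1$), then
$$\mu\Big( \bigcup_{n=1}^{\infty} A_n \Big) = \sup_{n \ge 1} \mu(A_n).$$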
PROPOSITION 1.1.9 If $X$ is a Hausdorff topological space, $\mu$ is an outer measure on $X$ which is Borel regular and $A \in \Sigma_\mu$ with $\mu(A) < +\infty$, then $\mu \lfloor A$ is a Radon measure.

PROOF Let
$$\mu_1 \stackrel{df}{=} \mu \lfloor A.$$
Evidently $\Sigma_\mu \subseteq \Sigma_{\mu_1}$ and so $\mu_1$ is a Borel measure. Also for every compact $K \subseteq X$, we have $\mu_1(K) < +\infty$. It remains to show that $\mu_1$ is Borel regular. To this end note that since $\mu$ is Borel regular, for the given $A \in 2^X$, we can find $B \in \mathcal{B}(X)$, $A \subseteq B$, such that $\mu(A) = \mu(B) < +\infty$. Because $A \in \Sigma_\mu$, from Definition 1.1.3, we have
$$\mu(B \setminus A) = \mu(B) - \mu(A) = 0.$$
Since $A \in \Sigma_\mu$, for every $C \in 2^X$, we have
$$(\mu \lfloor B)(C) = \mu(B \cap C) = \mu(B \cap C \cap A) + \mu\big( (B \cap C) \setminus A \big) \le \mu(C \cap A) + \mu(B \setminus A) = \mu(C \cap A) = (\mu \lfloor A)(C).$$
As $A \subseteq B$, we infer that $\mu \lfloor B = \mu \lfloor A$. So without any loss of generality, we may assume that $A \in \mathcal{B}(X)$. Let $C \in 2^X$. Since $\mu$ is Borel regular, we can find $D \in \mathcal{B}(X)$, such that
$$A \cap C \subseteq D \quad \text{and} \quad \mu(A \cap C) = \mu(D)$$
(see Remark 1.1.8(a)). Let us take
$$E \stackrel{df}{=} D \cup (X \setminus A).$$
Evidently $E \in \mathcal{B}(X)$ and $C \subseteq (A \cap C) \cup (X \setminus A) \subseteq E$. Moreover, since $E \cap A = D \cap A$, we have
$$\mu_1(E) = \mu(E \cap A) = \mu(D \cap A) \le \mu(D) = \mu(A \cap C) = \mu_1(C),$$
so $\mu_1 = \mu \lfloor A$ is Borel regular (see Remark 1.1.8(a)), hence Radon.

We conclude this section by recalling the following basic measure theoretic approximations.

PROPOSITION 1.1.10 If $X$ is a Hausdorff topological space and $\mu$ is an outer measure on $X$ which is Borel, then
(a) if $A \in \mathcal{B}(X)$, $\mu(A) < +\infty$ and $\varepsilon > 0$, then we can find an open set $U_\varepsilon \supseteq A$ and a closed set $C_\varepsilon \subseteq A$, such that $\mu(U_\varepsilon \setminus C_\varepsilon) < \varepsilon$, i.e.,
$$\mu(A) = \inf_{\substack{U \text{ open} \\ A \subseteq U}} \mu(U) = \sup_{\substack{C \text{ closed} \\ C \subseteq A}} \mu(C).$$
(b) if $\mu$ is Radon, then for every $A \in 2^X$, we have
$$\mu(A) = \inf_{\substack{U \text{ open} \\ A \subseteq U}} \mu(U)$$
and if $A \in \Sigma_\mu$, then
$$\mu(A) = \sup_{\substack{K \text{ compact} \\ K \subseteq A}} \mu(K).$$

REMARK 1.1.11 Note that in the first part of Proposition 1.1.10(b), the set $A$ need not be $\mu$-measurable.
1.2 Covering Results
One of the main tools in geometric measure theory is the so-called Vitali covering theorem. For a given sufficiently large family of sets that cover a given set $A$, Vitali’s covering theorem allows us to select a countable subfamily consisting of disjoint sets with exactly the desired approximation properties. The basic principle embodied in the proof of Vitali’s covering theorem is illustrated in the next proposition. In what follows for any subset $A$ of a metric space $(X, d_X)$, we define
$$\delta(A) = \operatorname{diam}(A) \stackrel{df}{=} \sup_{x,y \in A} d_X(x,y), \tag{1.1}$$
the diameter of $A$ (by convention $\operatorname{diam} \emptyset \stackrel{df}{=} 0$).

PROPOSITION 1.2.1 If $\mathcal{T}$ is a collection of nondegenerate balls in $\mathbb{R}^N$ with
$$\sup_{B \in \mathcal{T}} \delta(B) < +\infty,$$
then we can find a finite or countable subfamily $\mathcal{F}$ of $\mathcal{T}$ consisting of disjoint balls, such that
$$\bigcup_{B \in \mathcal{T}} B \subseteq \bigcup_{B \in \mathcal{F}} \widehat{B},$$
with $\widehat{B}$ being the ball concentric with $B$, but with radius five times the radius of $B$.

PROOF Let
$$d_0 \stackrel{df}{=} \sup_{B \in \mathcal{T}} \delta(B), \qquad \mathcal{T}_n \stackrel{df}{=} \Big\{ B \in \mathcal{T} : \frac{d_0}{2^n} < \delta(B) \le \frac{d_0}{2^{n-1}} \Big\} \quad \forall\, n \ge 1.$$
Inductively, we generate subfamilies $\mathcal{F}_n \subseteq \mathcal{T}_n$ for $n \ge 1$. Namely, let $\mathcal{F}_1$ be any maximal disjoint collection of balls in $\mathcal{T}_1$. Suppose we have selected $\mathcal{F}_1, \ldots, \mathcal{F}_m$. We choose $\mathcal{F}_{m+1}$ to be any maximal disjoint subfamily of
$$\Big\{ B \in \mathcal{T}_{m+1} : B \cap B' = \emptyset \ \text{for all } B' \in \bigcup_{k=1}^{m} \mathcal{F}_k \Big\}$$
and finally set
$$\mathcal{F} \stackrel{df}{=} \bigcup_{m=1}^{\infty} \mathcal{F}_m.$$
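Evidently $\mathcal{F} \subseteq \mathcal{T}$ and consists of disjoint balls.

Claim. For each $B \in \mathcal{T}$, we can find $B' \in \mathcal{F}$, such that $B \cap B' \ne \emptyset$ and $\delta(B) \le 2\delta(B')$ (so also $B \subseteq \widehat{B'}$).

For some $m \ge 1$, we have $B \in \mathcal{T}_m$. By virtue of the maximality of $\mathcal{F}_m$, we can find $B' \in \bigcup_{k=1}^{m} \mathcal{F}_k$ with $B \cap B' \ne \emptyset$. We have that
$$\frac{d_0}{2^m} \le \delta(B') \quad \text{and} \quad \delta(B) \le \frac{d_0}{2^{m-1}}.$$
So $\delta(B) \le 2\delta(B')$ and this proves the claim. From the claim it follows at once that
$$\bigcup_{B \in \mathcal{T}} B \subseteq \bigcup_{B' \in \mathcal{F}} \widehat{B'}.$$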
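For a finite family the selection in the proof of Proposition 1.2.1 can be simplified, because a largest ball always exists. The sketch below is such a simplified variant and not the dyadic construction of the proof: it works on the real line (balls are closed intervals given as hypothetical `(center, radius)` pairs), processes the balls from largest to smallest and keeps each one that is disjoint from all balls already kept. The function names are assumptions made for this illustration.

```python
def select_disjoint(balls):
    """Greedy largest-first choice of pairwise disjoint closed intervals.

    Each ball is a pair (center, radius) with radius > 0. Two closed
    intervals are disjoint exactly when the distance between their
    centers exceeds the sum of their radii."""
    chosen = []
    for center, radius in sorted(balls, key=lambda ball: -ball[1]):
        if all(abs(center - c) > radius + r for c, r in chosen):
            chosen.append((center, radius))
    return chosen


def covered_by_enlargement(ball, chosen, factor=5.0):
    """True if the ball lies inside some chosen ball enlarged by `factor`."""
    center, radius = ball
    return any(abs(center - c) + radius <= factor * r for c, r in chosen)


if __name__ == "__main__":
    # Deterministic sample family, assumed only for this demonstration.
    family = [((i * 0.37) % 5.0, 0.1 + (i * 0.13) % 1.0) for i in range(60)]
    kept = select_disjoint(family)
    print(len(family), len(kept))
    print(all(covered_by_enlargement(ball, kept) for ball in family))
```

Every rejected ball meets a kept ball of at least the same radius, so it lies in the threefold, and hence also in the fivefold, enlargement of that ball. The dyadic classes $\mathcal{T}_n$ in the proof replace the "largest first" ordering when the family is infinite and a largest ball need not exist, at the cost of the weaker comparison $\delta(B) \le 2\delta(B')$ and the factor five.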
DEFINITION 1.2.2 Let $A \subseteq \mathbb{R}^N$. A collection $\mathcal{T}$ of sets in $\mathbb{R}^N$ is said to be a Vitali cover of $A$, if $A \subseteq \bigcup_{B \in \mathcal{T}} B$ and for every $x \in A$ and every $\varepsilon > 0$, there exists $B \in \mathcal{T}$, such that $x \in B$ and $0 < \delta(B) < \varepsilon$.

REMARK 1.2.3 Note that from the second requirement of the above definition it follows that
$$\inf_{B \in \mathcal{T}} \delta(B) = 0.$$
So $\mathcal{T}$ is a Vitali cover of a set $A$, if every point $x \in A$ is contained in an arbitrarily small element of $\mathcal{T}$.

As a straightforward consequence of Proposition 1.2.1 we obtain the following proposition.

PROPOSITION 1.2.4 If $A \subseteq \mathbb{R}^N$, $\mathcal{T}$ is a Vitali cover of $A$ consisting of closed balls, such that
$$\sup_{B \in \mathcal{T}} \delta(B) < +\infty,$$
then there exists a countable family $\mathcal{F} = \{B_n\}_{n \ge 1}$ consisting of disjoint balls from $\mathcal{T}$, such that for each $m \ge 1$, we have
$$A \subseteq \bigcup_{n=1}^{m} B_n \cup \bigcup_{n=m+1}^{\infty} \widehat{B}_n,$$
where $\widehat{B}_n$ is the closed ball concentric with $B_n$ and with radius five times the radius of $B_n$.
PROOF Let $\mathcal{F}$ be as in the proof of Proposition 1.2.1. Select $\{B_n\}_{n=1}^{m} \subseteq \mathcal{F}$. If $A \subseteq \bigcup_{n=1}^{m} B_n$, then we are done. Otherwise let $x \in A \setminus \bigcup_{n=1}^{m} B_n$. Since $\mathcal{T}$ is a Vitali cover of $A$ consisting of closed balls, we can find $B \in \mathcal{T}$, such that
$$x \in B \quad \text{and} \quad B \cap B_n = \emptyset \quad \forall\, n \in \{1, \ldots, m\}.$$
But from the claim in the proof of Proposition 1.2.1, we can find $B' \in \mathcal{F}$, such that
$$B \subseteq \widehat{B'} \quad \text{and} \quad B \cap B' \ne \emptyset$$
(so $B' \in \{B_n\}_{n=m+1}^{\infty}$).

Now we are ready to state and prove Vitali’s covering theorem. In what follows by $\lambda^N$ we denote the $N$-dimensional Lebesgue outer measure.

THEOREM 1.2.5 (Vitali Covering Theorem) If $A \subseteq \mathbb{R}^N$ with $0 < \lambda^N(A) < +\infty$ and $\mathcal{T}$ is a Vitali cover of $A$ consisting of closed sets, then we can find a sequence $\{C_n\}_{n \ge 1}$ of elements in $\mathcal{T}$, such that $C_n \cap C_m = \emptyset$ for $n \ne m$ and
$$\lambda^N\Big( A \setminus \bigcup_{n=1}^{\infty} C_n \Big) = 0.$$
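PROOF Without any loss of generality, we can assume that there exists an open set $U \subseteq \mathbb{R}^N$ with $\lambda^N(U) < +\infty$ and
$$C \subseteq U \quad \forall\, C \in \mathcal{T}.$$
We construct the sequence $\{C_n\}_{n \ge 1}$ inductively. Let $C_1 \in \mathcal{T}$. Suppose that $C_1, \ldots, C_n$ are disjoint sets in $\mathcal{T}$. If $A \subseteq \bigcup_{k=1}^{n} C_k$, then we are finished. If not, setting
$$V_n \stackrel{df}{=} U \setminus \bigcup_{k=1}^{n} C_k,$$
we introduce
$$\mathcal{T}_n \stackrel{df}{=} \big\{ C \in \mathcal{T} : C \subseteq V_n \big\} \quad \text{and} \quad \delta_n \stackrel{df}{=} \sup_{C \in \mathcal{T}_n} \lambda^N(C).$$
Because $A \setminus \bigcup_{k=1}^{n} C_k \ne \emptyset$ and $\mathcal{T}$ is a Vitali cover of $A$, we see that $\mathcal{T}_n \ne \emptyset$ and so $\delta_n > 0$. We select
$$C_{n+1} \in \mathcal{T}_n \quad \text{with} \quad \frac{\delta_n}{2} < \lambda^N(C_{n+1}).$$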
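We continue this process. Then either at some finite step $n \ge 1$ we shall have $A \subseteq \bigcup_{k=1}^{n} C_k$, in which case the proof of the theorem is complete, or otherwise we produce a sequence $\{C_n\}_{n \ge 1} \subseteq \mathcal{T}$ of disjoint sets. Then we have
$$\sum_{n=1}^{\infty} \lambda^N(C_n) = \lambda^N\Big( \bigcup_{n=1}^{\infty} C_n \Big) \le \lambda^N(U) < +\infty. \tag{1.2}$$
For each $n \ge 1$ let $B_n$ be a ball with center in $C_n$ and radius equal to $3\delta(C_n)$. We claim that
$$A \setminus \bigcup_{k=1}^{n} C_k \subseteq \bigcup_{k=n+1}^{\infty} B_k \quad \forall\, n \ge 1. \tag{1.3}$$
Let $x \in A \setminus \bigcup_{k=1}^{n} C_k$. Since $\mathcal{T}$ is a Vitali cover of $A$, we can find a set $C_x \in \mathcal{T}_n$, such that
$$x \in C_x \quad \text{and} \quad \lambda^N(C_x) > 0.$$
We shall show that $C_x \cap C_k \ne \emptyset$ for some $k > n$. Indeed, if this is not the case, then $\lambda^N(C_x) \le \delta_k$ for all $k \ge 1$, which contradicts the fact that
$$0 \le \lim_{k \to +\infty} \delta_k \le \lim_{k \to +\infty} 2\lambda^N(C_{k+1}) = 0$$
(recall the choice of $C_{k+1}$ and see (1.2)). Let $m > n$ be the smallest integer, such that $C_x \cap C_m \ne \emptyset$. Since $C_x \in \mathcal{T}_{m-1}$, we have
$$\lambda^N(C_x) \le \delta_{m-1} < 2\lambda^N(C_m)$$
and recalling the choice of $B_m$, also $C_x \subseteq B_m$. So we have proved (1.3). Then for any $n \ge 1$, we have
$$\lambda^N\Big( A \setminus \bigcup_{k=1}^{\infty} C_k \Big) \le \lambda^N\Big( A \setminus \bigcup_{k=1}^{n} C_k \Big) \le \sum_{k=n+1}^{\infty} \lambda^N(B_k). \tag{1.4}$$
Recalling that $B_k$ is a ball of radius $3\delta(C_k)$ and combining (1.2) and (1.4), we conclude that
$$\lambda^N\Big( A \setminus \bigcup_{k=1}^{\infty} C_k \Big) = 0.$$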
Vitali’s covering theorem may be difficult to digest at first and probably it is necessary to see the theorem in action several times before appreciating it. For this reason we present four simple applications from classical analysis of functions of one variable. We start with a definition which establishes the notation for various limits of the difference quotient that we shall use in the sequel. These derivates are often more useful than the ordinary derivative, since they are defined at every point.

DEFINITION 1.2.6 For a given function $f \colon [a,b] \longrightarrow \mathbb{R}$, the upper right and lower right derivates of $f$ at $x \in [a,b)$ are defined by
$$D^+ f(x) \stackrel{df}{=} \limsup_{h \to 0^+} \frac{f(x+h) - f(x)}{h}$$
and
$$D_+ f(x) \stackrel{df}{=} \liminf_{h \to 0^+} \frac{f(x+h) - f(x)}{h}$$
respectively. Similarly the upper left and lower left derivates of $f$ at $x \in (a,b]$ are defined by
$$D^- f(x) \stackrel{df}{=} \limsup_{h \to 0^-} \frac{f(x+h) - f(x)}{h}$$
and
$$D_- f(x) \stackrel{df}{=} \liminf_{h \to 0^-} \frac{f(x+h) - f(x)}{h}$$
respectively.

REMARK 1.2.7 Evidently, the derivates of a function at a point may be infinite. The function $f$ is differentiable at $x \in (a,b)$, if
$$-\infty < D_+ f(x) = D^+ f(x) = D_- f(x) = D^- f(x) < +\infty.$$
The function $f$ is differentiable at $x = a$ or at $x = b$, if the appropriate two derivates are finite and equal. Also the one-sided derivatives exist at a point $x$, if $D_+ f(x) = D^+ f(x)$ and $D_- f(x) = D^- f(x)$. The derivates are also called Dini derivates and clearly we always have
$$D_+ f(x) \le D^+ f(x) \quad \forall\, x \in [a,b)$$
and
$$D_- f(x) \le D^- f(x) \quad \forall\, x \in (a,b].$$
In the literature, one sometimes finds the notion of a derived number for a function $f$ at $x$. We say that $\beta \in \mathbb{R}^*$ is a derived number for $f$ at $x$, if there is a sequence $\{h_n\}_{n \ge 1} \subseteq \mathbb{R}$, such that $h_n \to 0$, $h_n \neq 0$ for all $n \ge 1$ and
\[ \lim_{n \to +\infty} \frac{f(x + h_n) - f(x)}{h_n} = \beta. \]
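A minimal numerical sketch of this definition, under the assumption that we test it on the function $f(x) = x \sin(1/x)$ with $f(0) = 0$. Along the increments $h_n = 1/(\arcsin\beta + 2\pi n)$ the difference quotient at $0$ equals $\sin(1/h_n) = \beta$, so each $\beta \in [-1,1]$ should appear as a derived number at $0$. The helper names are illustrative only.

```python
import math

def f(x):
    """Assumed test function: f(x) = x*sin(1/x) for x != 0, f(0) = 0."""
    return x * math.sin(1.0 / x) if x != 0.0 else 0.0

def difference_quotient(h, x=0.0):
    """(f(x + h) - f(x)) / h for a nonzero increment h."""
    return (f(x + h) - f(x)) / h

def increments_for(beta, n_terms=5):
    """Increments h_n -> 0 along which the quotient at 0 equals beta (|beta| <= 1)."""
    theta = math.asin(beta)
    return [1.0 / (theta + 2.0 * math.pi * n) for n in range(1, n_terms + 1)]

for beta in (-1.0, 0.0, 0.5, 1.0):
    quotients = [difference_quotient(h) for h in increments_for(beta)]
    print(beta, quotients)
```

Each printed list should stay (up to rounding) at the chosen value of `beta`, which illustrates that a single point can carry a whole interval of derived numbers.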
A function $f$ may have many derived numbers at a point $x$. Of course $f$ is differentiable at $x$ if and only if all derived numbers of $f$ at $x$ agree and are finite.
EXAMPLE 1.2.8 Consider the function $f\colon \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) \stackrel{df}{=} \begin{cases} x \sin\dfrac{1}{x} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
We can check that $D_- f(0) = -1 < D^+ f(0) = 1$ and every number in $[-1,1]$ is a derived number for $f$ at $0$. The function $f$ is not of bounded variation (see Definition A.2.15(a)).
LEMMA 1.2.9 If $f\colon [a,b] \to \mathbb{R}$ is nondecreasing, then all four derivates of $f$ are finite almost everywhere on $[a,b]$.
PROOF
Clearly all derivates are nonnegative. So it suffices to show that
\[ D^+ f(x) < +\infty \quad \text{and} \quad D^- f(x) < +\infty \quad \text{for a.a. } x \in [a,b]. \]
Let
\[ A \stackrel{df}{=} \big\{ x \in [a,b] : D^+ f(x) = +\infty \big\} \]
and suppose that $\lambda^*(A) = \beta > 0$, where $\lambda^*$ is the Lebesgue outer measure on $\mathbb{R}$. Let $M > 0$ be such that
\[ f(b) - f(a) < \frac{M\beta}{2}. \]
For each $x \in A$ we can find a sequence $\{h^x_n\}_{n \ge 1}$ with $h^x_n \searrow 0$, $h^x_n \neq 0$, such that
\[ M \le \frac{f(x + h^x_n) - f(x)}{h^x_n} \quad \forall\, n \ge 1. \]
The collection $\big\{ [x, x + h^x_n] \big\}_{x \in A,\, n \ge 1}$ is a Vitali cover of $A$. By virtue of Vitali's covering theorem (see Theorem 1.2.5), we can find a family of disjoint intervals $\big\{ [x_n, x_n + h_n] \big\}_{n=1}^{m}$, such that
\[ \sum_{n=1}^{m} h_n > \frac{\beta}{2}. \]
Therefore
\[ \sum_{n=1}^{m} \big( f(x_n + h_n) - f(x_n) \big) \ge M \sum_{n=1}^{m} h_n > \frac{M\beta}{2} > f(b) - f(a), \]
a contradiction (since $f$ is nondecreasing and the intervals are disjoint, the left-hand side is at most $f(b) - f(a)$). This proves that $\lambda^*(A) = 0$ and so $D^+ f(x) < +\infty$ for a.a. $x \in [a,b]$. Analogously we can prove that $D^- f(x) < +\infty$ for a.a. $x \in [a,b]$.
Using this lemma and Vitali's covering theorem, we can now prove that a nondecreasing function is differentiable almost everywhere on $[a,b]$.
THEOREM 1.2.10 If $f\colon [a,b] \to \mathbb{R}$ is nondecreasing, then $f$ is differentiable almost everywhere on $[a,b]$.
PROOF For $f$ to be differentiable at $x$, we must have that all four derivates at $x$ are finite and equal. By virtue of Lemma 1.2.9, it suffices to show that all four derivates are equal almost everywhere. Let
\[ A \stackrel{df}{=} \big\{ x \in (a,b) : D_+ f(x) < D^+ f(x) \big\}. \]
We show that $A$ is Lebesgue-null. The proof for the other combinations of derivates is similar. Suppose that $\lambda^*(A) > 0$. We can find rational numbers $r, s$, such that the set
\[ B \stackrel{df}{=} \big\{ x \in A : D_+ f(x) < r < s < D^+ f(x) \big\} \]
satisfies $\lambda^*(B) = \beta > 0$. Let $\varepsilon \in (0, \beta)$. From the regularity of the Lebesgue outer measure $\lambda^*$, we know that there exists an open set $U \subseteq (a,b)$, such that
\[ B \subseteq U \quad \text{and} \quad \lambda^1(U) - \varepsilon < \beta. \]
For each $x \in B$ and $n \ge 1$, we can find $h^x_n > 0$, such that
\[ [x, x + h^x_n] \subseteq U \quad \text{and} \quad \frac{f(x + h^x_n) - f(x)}{h^x_n} < r, \]
with $h^x_n \searrow 0$. The family $\big\{ [x, x + h^x_n] \big\}_{x \in B,\, n \ge 1}$ is a Vitali cover of $B$. By virtue of Vitali's covering theorem (see Theorem 1.2.5), for the given $\varepsilon > 0$, we can find a disjoint subfamily $\big\{ [x_n, x_n + h_n] \big\}_{n=1}^{m}$ of the Vitali cover, such that
\[ \lambda^*\Big( B \setminus \bigcup_{n=1}^{m} [x_n, x_n + h_n] \Big) < \varepsilon. \]
We have
\[ \sum_{n=1}^{m} \big( f(x_n + h_n) - f(x_n) \big) < r \sum_{n=1}^{m} h_n \le r \lambda^1(U) < r(\beta + \varepsilon). \]
Let us set
\[ C \stackrel{df}{=} B \cap \Big( \bigcup_{n=1}^{m} [x_n, x_n + h_n] \Big). \]
We have that
\[ \beta - \varepsilon < \lambda^*(C). \tag{1.5} \]
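For every $y \in C$ and $k \ge 1$, we can find $u^y_k \in \big( y, y + \tfrac{1}{k} \big)$, such that
\[ \frac{f(u^y_k) - f(y)}{u^y_k - y} > s \quad \text{and} \quad [y, u^y_k] \subseteq (x_n, x_n + h_n) \ \text{for some } n \in \{1, \ldots, m\}. \]
The family $\big\{ [y, u^y_k] \big\}_{y \in C,\, k \ge 1}$ is a Vitali cover of $C$. Invoking Vitali's covering theorem (see Theorem 1.2.5), we can find a disjoint subfamily $\big\{ [y_k, u_k] \big\}_{k=1}^{l}$, such that
\[ \lambda^*(C) - \varepsilon < \sum_{k=1}^{l} (u_k - y_k). \]
Then
\[ \sum_{k=1}^{l} \big( f(u_k) - f(y_k) \big) > s \sum_{k=1}^{l} (u_k - y_k) > s\big( \lambda^*(C) - \varepsilon \big) > s(\beta - 2\varepsilon). \tag{1.6} \]
For each $1 \le n \le m$, let
\[ J_n \stackrel{df}{=} \big\{ k \in \{1, \ldots, l\} : [y_k, u_k] \subseteq (x_n, x_n + h_n) \big\}. \]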
Since f is nondecreasing, using (1.6) and (1.5), we have s(β − 2ε)
0. Because $f$ is absolutely continuous, we can find $\delta > 0$, such that, if $\big\{ (r_n, s_n) \big\}_{n=1}^{m}$ is a finite family of disjoint subintervals of $[a,b]$ with
\[ \sum_{n=1}^{m} (s_n - r_n) < \delta, \]
then we have
\[ \sum_{n=1}^{m} \big| f(s_n) - f(r_n) \big| < \varepsilon. \]
We introduce the family ¯ ¯ ½ ¾ ¯ f (y) − f (x) ¯ df ¯ 0, A ⊆ [a, b] and at each point x ∈ A there exists a derived number β (see Remark 1.2.7), such that β 0, we can find a bounded open set U ⊆ R, such that A⊆U
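and $\lambda^1(U) - \varepsilon < \lambda^*(A)$.
If $x \in A$, then by hypothesis we can find a sequence $\{h_n\}_{n \ge 1} \subseteq \mathbb{R} \setminus \{0\}$, such that $h_n \to 0$,
\[ [x, x + h_n] \subseteq U \quad \forall\, n \ge 1 \]
(or $[x + h_n, x] \subseteq U$ in the event $h_n < 0$; but in the sequel for simplicity we shall write $[x, x + h_n]$ for both cases) and
\[ \frac{f(x + h_n) - f(x)}{h_n} < r \quad \forall\, n \ge 1. \tag{1.7} \]
For all $n \ge 1$ and $x \in A$, let
\[ D_n(x) \stackrel{df}{=} [x, x + h_n], \qquad E_n(x) \stackrel{df}{=} \big[ f(x), f(x + h_n) \big]. \]
Because $f$ is strictly increasing, $E_n(x)$ is a nondegenerate, closed interval and
\[ f\big( D_n(x) \big) \subseteq E_n(x) \quad \forall\, n \ge 1,\ x \in A. \]
Since
\[ \lambda^1\big( D_n(x) \big) = |h_n| \quad \text{and} \quad \lambda^1\big( E_n(x) \big) = \big| f(x + h_n) - f(x) \big|, \]
from (1.7), we have
\[ \lambda^1\big( E_n(x) \big) < r \lambda^1\big( D_n(x) \big). \tag{1.8} \]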
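Passing to the limit as $n \to +\infty$, we have $|h_n| \to 0$ and so from (1.8), we obtain that
\[ \lim_{n \to +\infty} \lambda^1\big( E_n(x) \big) = 0. \]
Let
\[ \mathcal{T} \stackrel{df}{=} \big\{ E_n(x) \big\}_{x \in A,\, n \ge 1}. \]
Then $\mathcal{T}$ is a Vitali cover of the set $f(A)$. So Vitali's covering theorem (see Theorem 1.2.5) implies the existence of a disjoint sequence $\big\{ E_{n_k}(x_k) \big\}_{k \ge 1} \subseteq \mathcal{T}$, such that
\[ \lambda^1\Big( f(A) \setminus \bigcup_{k=1}^{\infty} E_{n_k}(x_k) \Big) = 0. \tag{1.9} \]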
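Using (1.9) and (1.8), it follows that
\[ \lambda^*\big( f(A) \big) \le \lambda^1\Big( \bigcup_{k=1}^{\infty} E_{n_k}(x_k) \Big) = \sum_{k=1}^{\infty} \lambda^1\big( E_{n_k}(x_k) \big) < r \sum_{k=1}^{\infty} \lambda^1\big( D_{n_k}(x_k) \big). \tag{1.10} \]
Since $f$ is strictly increasing, we see that the sets $\big\{ D_{n_k}(x_k) \big\}_{k \ge 1}$ are pairwise disjoint too. So we have
\[ \sum_{k=1}^{\infty} \lambda^1\big( D_{n_k}(x_k) \big) = \lambda^1\Big( \bigcup_{k=1}^{\infty} D_{n_k}(x_k) \Big) \le \lambda^1(U) \le \lambda^*(A) + \varepsilon. \tag{1.11} \]
From (1.10) and (1.11), we infer that
\[ \lambda^*\big( f(A) \big) \le r\big( \lambda^*(A) + \varepsilon \big). \]
Let $\varepsilon \searrow 0$, to conclude that
\[ \lambda^*\big( f(A) \big) \le r \lambda^*(A). \tag{1.12} \]
In a similar fashion, we can have the following comparison result.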
THEOREM 1.2.15 If $f\colon [a,b] \to \mathbb{R}$ is a strictly increasing function, $s > 0$, $A \subseteq [a,b]$ and at each point $x \in A$ there exists a derived number $\gamma$, such that $\gamma > s$, then
\[ \lambda^*\big( f(A) \big) \ge s \lambda^*(A). \]
The final application of Vitali's covering theorem (see Theorem 1.2.5) is the following criterion for measurability of sets in $\mathbb{R}$.
THEOREM 1.2.16 If $\mathcal{F}$ is any collection of intervals in $\mathbb{R}$ and
\[ A = \bigcup_{D \in \mathcal{F}} D, \]
then $A$ is Lebesgue measurable.
PROOF Let $\mathcal{T}$ be the collection of all intervals $E$, such that $E \subseteq D$ for some $D \in \mathcal{F}$. Evidently $\mathcal{T}$ is a Vitali cover of $A$ and so by Vitali's covering theorem (see Theorem 1.2.5), we can find a sequence $\{E_n\}_{n \ge 1}$ of disjoint elements in $\mathcal{T}$, such that
\[ \lambda^*\Big( A \setminus \bigcup_{n=1}^{\infty} E_n \Big) = 0. \]
Because each $E_n \subseteq A$, the set
\[ A = \bigcup_{n=1}^{\infty} E_n \cup \Big( A \setminus \bigcup_{n=1}^{\infty} E_n \Big) \]
is Lebesgue measurable, being the union of countably many intervals and a Lebesgue-null set.
REMARK 1.2.17 Theorem 1.2.16 can be used to show that the upper and lower derivates of an arbitrary function are measurable. In particular then the four derivates of a measurable function are measurable and so is the derivative of a measurable function. We will not go into that here.
When $\lambda^N$ is replaced by an arbitrary Radon measure $\mu$ on $\mathbb{R}^N$, there is no systematic way to control $\mu(\widehat{B})$ in terms of $\mu(B)$. So the proof of Vitali's covering theorem (see Theorem 1.2.5), which uses the principle involved in Proposition 1.2.1, namely the use of suitable expansions of balls, does not work. We therefore need an analog of Proposition 1.2.1 which does not require enlarging the balls. This is provided by the so-called “Besicovitch covering theorem.”
THEOREM 1.2.18 (Besicovitch Covering Theorem) If $\mathcal{F}$ is any collection of closed balls in $\mathbb{R}^N$,
\[ \sup_{B \in \mathcal{F}} \delta(B) < +\infty \]
and $A$ is the set of centers of all balls $B \in \mathcal{F}$, then there exist a positive integer $k = k(N) \ge 1$ and
\[ \mathcal{T}_n \subseteq \mathcal{F} \quad \forall\, n \in \{1, \ldots, k\}, \]
such that each $\mathcal{T}_n$ is a countable collection of disjoint balls in $\mathcal{F}$ and
\[ A \subseteq \bigcup_{n=1}^{k} \bigcup_{B \in \mathcal{T}_n} B. \]
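Using the above theorem, we can have the following counterpart of Vitali's covering theorem (see Theorem 1.2.5).
THEOREM 1.2.19 If $\mu$ is a Borel measure on $\mathbb{R}^N$, $\mathcal{T}$ is a family of nondegenerate closed balls in $\mathbb{R}^N$, $A$ is the set of centers of balls in $\mathcal{T}$, $\mu(A) < +\infty$,
\[ \inf_{B_r(a) \in \mathcal{T}} r = 0 \quad \forall\, a \in A \]
and $U \subseteq \mathbb{R}^N$ is an open set, then there exists a countable collection $\mathcal{F}$ of disjoint balls from $\mathcal{T}$, such that
\[ \bigcup_{B \in \mathcal{F}} B \subseteq U \quad \text{and} \quad \mu\Big( (A \cap U) \setminus \bigcup_{B \in \mathcal{F}} B \Big) = 0. \]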
1.3 Hausdorff Measure and Hausdorff Dimension
Hausdorff measures were introduced as certain lower dimensional measures on $\mathbb{R}^N$ which allow us to measure “small” subsets of $\mathbb{R}^N$. The Hausdorff measure and the associated Hausdorff dimension of a set provide a more delicate sense of the size of a set in $\mathbb{R}^N$ than the Lebesgue measure provides. We start with the introduction of a special class of outer measures, known as metric outer measures.
DEFINITION 1.3.1 Let $(X, d_X)$ be a metric space ($d_X$ is the metric function).
(a) If $A, B \subseteq X$, then we say that $A$ and $B$ are separated sets, if
\[ d_X(A, B) \stackrel{df}{=} \inf_{a \in A,\ b \in B} d_X(a, b) > 0. \]
(b) If $\mu$ is an outer measure on $X$, then we say that $\mu$ is a metric outer measure, if
\[ \mu(A \cup B) = \mu(A) + \mu(B) \quad \forall\, A, B \subseteq X,\ A \text{ and } B \text{ separated.} \]
We show that if $\mu$ is a metric outer measure, then $\mathcal{B}(X) \subseteq \Sigma(\mu)$, i.e., $\mu$ is Borel. To this end we need the following auxiliary result, known as Carathéodory's lemma. In what follows $(X, d)$ is a metric space.
LEMMA 1.3.2 (Carathéodory Lemma) If $\mu$ is a metric outer measure on $X$, $U \subseteq X$ is an open subset, $U \neq X$, $A \subseteq U$ and
\[ A_n \stackrel{df}{=} \Big\{ x \in A : d(x, U^c) \ge \frac{1}{n} \Big\} \quad \forall\, n \ge 1, \tag{1.13} \]
then $\mu(A) = \lim_{n \to +\infty} \mu(A_n)$.
PROOF Note that the sequence $\{A_n\}_{n \ge 1}$ is increasing and so $\lim_{n \to +\infty} \mu(A_n)$ exists. Moreover, since $A_n \subseteq A$ for $n \ge 1$, we have
\[ \lim_{n \to +\infty} \mu(A_n) \le \mu(A). \]
So we need to show that
\[ \mu(A) \le \lim_{n \to +\infty} \mu(A_n). \tag{1.14} \]
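Because $U$ is open, we have
\[ d(x, U^c) > 0 \quad \forall\, x \in A \]
and so for each $x \in A$ we can find $n_0 \ge 1$ large enough so that $x \in A_{n_0}$. Therefore, we have
\[ A = \bigcup_{n=1}^{\infty} A_n. \]
For each $n \ge 1$, we introduce the set
\[ C_n \stackrel{df}{=} A_{n+1} \setminus A_n = \Big\{ x \in A : \frac{1}{n+1} \le d(x, U^c) < \frac{1}{n} \Big\}. \]
We have
\[ A = A_{2n} \cup \bigcup_{k=2n}^{\infty} C_k = A_{2n} \cup \bigcup_{k=n}^{\infty} C_{2k} \cup \bigcup_{k=n}^{\infty} C_{2k+1} \]
and from the subadditivity of $\mu$, it follows that
\[ \mu(A) \le \mu(A_{2n}) + \sum_{k=n}^{\infty} \mu(C_{2k}) + \sum_{k=n}^{\infty} \mu(C_{2k+1}). \tag{1.15} \]
If both series are convergent, then we obtain (1.14). So suppose that this is not true and, say, we have
\[ \sum_{k=1}^{\infty} \mu(C_{2k}) = +\infty. \tag{1.16} \]
Note that
\[ d\big( C_{2k}, C_{2k+2} \big) \ge \frac{1}{2k+1} - \frac{1}{2k+2} \quad \forall\, k \ge 1 \]
and so the sets $\{C_{2k}\}_{k \ge 1}$ are pairwise separated. Therefore, we have
\[ \mu\Big( \bigcup_{k=1}^{n-1} C_{2k} \Big) = \sum_{k=1}^{n-1} \mu(C_{2k}) \quad \forall\, n \ge 1. \tag{1.17} \]
Note that
\[ \bigcup_{k=1}^{n-1} C_{2k} \subseteq A_{2n} \quad \forall\, n \ge 1 \]
and so
\[ \mu\Big( \bigcup_{k=1}^{n-1} C_{2k} \Big) \le \mu(A_{2n}) \quad \forall\, n \ge 1. \tag{1.18} \]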
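From (1.17) and (1.18), it follows that
\[ \sum_{k=1}^{n-1} \mu(C_{2k}) \le \mu(A_{2n}). \]
Combining this with (1.16), we infer that
\[ \lim_{n \to +\infty} \mu(A_{2n}) = +\infty \]
and so
\[ \mu(A) \le \lim_{n \to +\infty} \mu(A_{2n}), \]
as desired. The argument is similar if $\sum_{k=1}^{\infty} \mu(C_{2k+1}) = +\infty$.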
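THEOREM 1.3.3 If $\mu$ is an outer measure on $X$, then $\mathcal{B}(X) \subseteq \Sigma(\mu)$ (i.e., $\mu$ is Borel) if and only if $\mu$ is a metric outer measure.
PROOF “$\Longrightarrow$”: Let $A_1, A_2 \subseteq X$ be separated sets and let us set
\[ \beta \stackrel{df}{=} d(A_1, A_2) > 0. \]
For every $x \in A_1$, we define
\[ U(x) \stackrel{df}{=} B_{\beta/2}(x) = \Big\{ y \in X : d(y, x) < \frac{\beta}{2} \Big\} \quad \text{and} \quad U \stackrel{df}{=} \bigcup_{x \in A_1} U(x). \]
Evidently $U$ is open, $A_1 \subseteq U$ and $A_2 \cap U = \emptyset$. Since by hypothesis $U \in \Sigma(\mu)$, we have that
\[ \mu(A_1 \cup A_2) = \mu\big( (A_1 \cup A_2) \cap U \big) + \mu\big( (A_1 \cup A_2) \cap U^c \big). \tag{1.19} \]
Because $A_1 \subseteq U$ and $A_2 \cap U = \emptyset$, from (1.19), it follows that
\[ \mu(A_1 \cup A_2) = \mu(A_1) + \mu(A_2), \]
i.e., $\mu$ is a metric outer measure.
“$\Longleftarrow$”: It suffices to show that $\Sigma(\mu)$ contains all closed sets. So let $C \subseteq X$ be closed and let us set $U \stackrel{df}{=} C^c$. Let $D \subseteq X$, $A \stackrel{df}{=} D \setminus C$ and let $\{A_n\}_{n \ge 1}$ be an increasing sequence of subsets of $A$ as in Lemma 1.3.2. Then
\[ d(A_n, C) \ge \frac{1}{n} \quad \forall\, n \ge 1 \]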
and, from Lemma 1.3.2, we have
\[ \mu(D \setminus C) = \mu(A) = \lim_{n \to +\infty} \mu(A_n). \tag{1.20} \]
Since by hypothesis $\mu$ is a metric outer measure and the sets $\{A_n\}_{n \ge 1}$ are separated from $C$, we have
\[ \mu(D) \ge \mu\big( (D \cap C) \cup A_n \big) = \mu(D \cap C) + \mu(A_n) \quad \forall\, n \ge 1. \]
Passing to the limit as $n \to +\infty$ and using (1.20), we obtain
\[ \mu(D) \ge \mu(D \cap C) + \mu(D \setminus C). \]
The reverse inequality is always true (subadditivity). So we obtain
\[ \mu(D) = \mu(D \cap C) + \mu(D \setminus C) \quad \forall\, D \subseteq X. \]
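Thus $C \in \Sigma(\mu)$ and hence $\mathcal{B}(X) \subseteq \Sigma(\mu)$.
To introduce the concept of Hausdorff measure, we shall need the following notion. Recall that by $(X, d)$ we denote a metric space.
DEFINITION 1.3.4 A sequence $\{A_n\}_{n \ge 1}$ of subsets of $X$ is a $\delta$-cover of a set $C$, if
\[ C \subseteq \bigcup_{n=1}^{\infty} A_n \quad \text{and} \quad \delta(A_n) \le \delta \quad \forall\, n \ge 1. \]
By $\mathcal{T}_\delta(C)$ we denote the family of all $\delta$-covers of the set $C$.
Using this notion, we can introduce the Hausdorff $s$-dimensional measure, $s \ge 0$. As usual, for any $A \subseteq X$,
\[ \delta(A) \stackrel{df}{=} \operatorname{diam}(A) = \sup_{x, y \in A} d(x, y) \]
is the diameter of $A$ (by convention $\operatorname{diam} \emptyset \stackrel{df}{=} 0$).
DEFINITION 1.3.5 For any $s \ge 0$, $0 < \delta \le +\infty$ and $C \subseteq X$, we define
\[ \mu^{(s)}_\delta(C) \stackrel{df}{=} \inf_{\{A_n\}_{n \ge 1} \in \mathcal{T}_\delta(C)} \sum_{n=1}^{\infty} \delta(A_n)^s \]
(as always we use the convention that $\inf \emptyset = +\infty$). The Hausdorff $s$-dimensional outer measure $\mu^{(s)}$ is defined by
\[ \mu^{(s)}(C) \stackrel{df}{=} \lim_{\delta \searrow 0} \mu^{(s)}_\delta(C) = \sup_{\delta > 0} \mu^{(s)}_\delta(C). \]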
REMARK 1.3.6 It is easily seen that $\mu^{(s)}$ is an outer measure. Moreover, it is a metric outer measure. Indeed, if $\delta > 0$ is less than the positive distance of two separated sets $A$ and $C$, then no set of a $\delta$-cover in $\mathcal{T}_\delta(A \cup C)$ can intersect both $A$ and $C$ and so it follows that
\[ \mu^{(s)}_\delta(A \cup C) = \mu^{(s)}_\delta(A) + \mu^{(s)}_\delta(C). \]
Letting $\delta \searrow 0$, we obtain the same equality for $\mu^{(s)}$. In addition, by Theorem 1.3.3, $\mu^{(s)}$ is Borel. The restriction of $\mu^{(s)}$ to $\Sigma\big( \mu^{(s)} \big)$ is called the Hausdorff $s$-dimensional measure. Sometimes it is convenient to consider $\delta$-covers consisting of open or alternatively closed sets. In these cases, although a different value of $\mu^{(s)}_\delta$ may be attained for $\delta > 0$, the limit $\mu^{(s)}$ as $\delta \searrow 0$ is the same (see Davies (1970)). However, the limit $\mu^{(s)}$ is different, if we restrict ourselves to $\delta$-covers by balls (see Besicovitch (1928)). In this case the resulting Hausdorff measure is called the spherical Hausdorff measure. Finally, if $X = \mathbb{R}^N$, it is easy to see that $\mu^{(s)}$ remains the same if we consider $\delta$-covers consisting only of convex sets.
Next we show that for any set $C \subseteq X$, there is a critical value $s_0$, such that for $s > s_0$, the corresponding Hausdorff $s$-dimensional measure of $C$ is zero, while for $s < s_0$ the Hausdorff $s$-dimensional measure of $C$ is infinite.
THEOREM 1.3.7 If $A \subseteq \mathbb{R}^N$ and $0 \le s < t < +\infty$, then
(a) if $\mu^{(s)}(A) < +\infty$, then $\mu^{(t)}(A) = 0$;
(b) if $\mu^{(t)}(A) > 0$, then $\mu^{(s)}(A) = +\infty$.
PROOF (a) Let $\mu^{(s)}(A) < +\infty$ and $t > s$. Let $\{A_n\}_{n \ge 1} \in \mathcal{T}_{1/m}(A)$. Then for any $n \ge 1$, we have
\[ \frac{\delta(A_n)^t}{\delta(A_n)^s} = \delta(A_n)^{t-s} \le \Big( \frac{1}{m} \Big)^{t-s}, \]
so
\[ \mu^{(t)}_{1/m}(A) \le \sum_{n=1}^{\infty} \delta(A_n)^t \le \Big( \frac{1}{m} \Big)^{t-s} \sum_{n=1}^{\infty} \delta(A_n)^s \]
and thus, taking the infimum over all such covers,
\[ \mu^{(t)}_{1/m}(A) \le \Big( \frac{1}{m} \Big)^{t-s} \mu^{(s)}_{1/m}(A). \]
Letting $m \to +\infty$, we obtain $\mu^{(t)}(A) = 0$.
(b) Let $\mu^{(t)}(A) > 0$ and $s < t$. Assuming that $\mu^{(s)}(A) < +\infty$, from (a), we get that $\mu^{(t)}(A) = 0$, a contradiction.
This theorem leads to the following definition.
DEFINITION 1.3.8 Let $C \subseteq X$. If there is no $s > 0$, such that $\mu^{(s)}(C) = +\infty$, then $\dim C \stackrel{df}{=} 0$. Otherwise, let
\[ \dim C \stackrel{df}{=} \sup \big\{ s > 0 : \mu^{(s)}(C) = +\infty \big\}. \]
Then $\dim C$ is called the Hausdorff dimension of $C$.
Consider the Cantor ternary set $C$. It is well known that $C$ is a nonempty, bounded, nowhere dense, perfect set in $\mathbb{R}$ which has Lebesgue measure zero. So the Lebesgue measure can contribute no additional information concerning the size of $C$. On the other hand, as we shall see, the Hausdorff dimension provides a more delicate sense of size.
PROPOSITION 1.3.9 If $C \subseteq [0,1]$ is the Cantor ternary set, then $\dim C = \dfrac{\ln 2}{\ln 3}$.
PROOF We start with two simple observations concerning the Hausdorff $s$-dimensional outer measure $\mu^{(s)}$ on $\mathbb{R}$. First note that $\mu^{(s)}$ is translation invariant, namely
\[ \mu^{(s)}(A) = \mu^{(s)}(A + x) \quad \forall\, A \subseteq \mathbb{R},\ x \in \mathbb{R} \]
(here $A + x \stackrel{df}{=} \{ a + x : a \in A \}$). Second, $\mu^{(s)}$ is $s$-positive homogeneous, i.e.,
\[ \mu^{(s)}(\vartheta A) = \vartheta^s \mu^{(s)}(A) \quad \forall\, \vartheta > 0. \]
In the construction of $C$ we start by removing from $[0,1]$ the open middle third $\big( \tfrac13, \tfrac23 \big)$. The resulting set consists of two closed intervals $\big[ 0, \tfrac13 \big]$ and $\big[ \tfrac23, 1 \big]$. Let
\[ C^1 \stackrel{df}{=} C \cap \Big[ 0, \frac13 \Big] \quad \text{and} \quad C^2 \stackrel{df}{=} C \cap \Big[ \frac23, 1 \Big]. \]
Evidently $C^1$ and $C^2$ are translates of a multiple (by $\tfrac13$) of $C$. So we have
\[ \mu^{(s)}(C) = \mu^{(s)}(C^1 \cup C^2) = \mu^{(s)}(C^1) + \mu^{(s)}(C^2) = 2\mu^{(s)}\big( C^2 \big) = 2 \Big( \frac13 \Big)^s \mu^{(s)}(C) \tag{1.21} \]
(see Remark 1.3.6 and the observations in the beginning of this proof). From (1.21), it follows that
\[ \mu^{(s)}(C) = 0 \quad \text{or} \quad \mu^{(s)}(C) = +\infty \quad \text{or} \quad 2 \Big( \frac13 \Big)^s = 1. \]
From the last possibility, it follows that
\[ s = \frac{\ln 2}{\ln 3}. \]
If we can show that $0 < \mu^{(s)}(C) < +\infty$, then $s = \frac{\ln 2}{\ln 3}$ is the Hausdorff dimension of $C$ (see Theorem 1.3.7). First we show that $\mu^{(s)}(C) > 0$. Note that
\[ d(C^1, C^2) \ge \frac13. \]
Let $\delta \le \tfrac13$. Then any collection $\{A_n\}_{n \ge 1} \in \mathcal{T}_\delta(C)$ (which can be taken to consist of open intervals; see Remark 1.3.6) can be decomposed into two subcollections of intervals $\{A_{n,1}\}_{n \ge 1} \in \mathcal{T}_\delta(C^1)$ and $\{A_{n,2}\}_{n \ge 1} \in \mathcal{T}_\delta(C^2)$, such that
\[ \sum_{n=1}^{\infty} \delta(A_n)^s = \sum_{n=1}^{\infty} \delta(A_{n,1})^s + \sum_{n=1}^{\infty} \delta(A_{n,2})^s. \tag{1.22} \]
In the right hand side of (1.22) suppose that the first sum is smaller than the second. Because $C^2$ is a translate of $C^1$, the same translation when applied to the intervals $\{A_{n,1}\}_{n \ge 1}$ gives a subcollection $\{A'_{n,1}\}_{n \ge 1} \in \mathcal{T}_\delta(C^2)$. Also from $\{A_{n,1}\}_{n \ge 1}$ we can produce in a similar way a collection $\{A'_n\}_{n \ge 1}$ covering $C$, such that
\[ \delta(A'_n) = 3\delta(A'_{n,1}) \quad \forall\, n \ge 1. \tag{1.23} \]
Then, from (1.23) and the choice of $s$, we have
\[ \sum_{n=1}^{\infty} \delta(A_n)^s \ge \sum_{n=1}^{\infty} \delta(A_{n,1})^s + \sum_{n=1}^{\infty} \delta(A'_{n,1})^s = 2 \sum_{n=1}^{\infty} \delta(A'_{n,1})^s = 2 \sum_{n=1}^{\infty} \Big( \frac13 \Big)^s \delta(A'_n)^s = \sum_{n=1}^{\infty} \delta(A'_n)^s. \]
If any one of the intervals $\{A'_n\}_{n \ge 1}$ has length greater than or equal to $\tfrac13$, we have
\[ \sum_{n=1}^{\infty} \delta(A_n)^s \ge \Big( \frac13 \Big)^s = \frac12. \]
Because $C$ is compact, we can use only finite coverings and so $\min_{n \ge 1} \delta(A_n) > 0$. The intervals $\{A'_n\}_{n \ge 1}$ are multiples (by (1.23)) of a subfamily of the intervals $\{A_n\}_{n \ge 1}$, hence we have
\[ \min_{n \ge 1} \delta(A'_n) \ge 3 \min_{n \ge 1} \delta(A_n). \]
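If every interval $A'_n$ has length (diameter) less than $\tfrac13$, we can apply the same process to the cover $\{A'_n\}_{n \ge 1}$. After a finite number of such steps, we produce a cover $\{A''_n\}_{n \ge 1}$, such that
\[ \max_{n \ge 1} \delta(A''_n) \ge \frac13 \quad \text{and} \quad \sum_{n=1}^{\infty} \delta(A_n)^s \ge \sum_{n=1}^{\infty} \delta(A''_n)^s, \]
so
\[ \sum_{n=1}^{\infty} \delta(A_n)^s \ge \Big( \frac13 \Big)^s = \frac12 \]
and thus
\[ 0 < \mu^{(s)}(C). \]
Next we show that
\[ \mu^{(s)}(C) < +\infty. \]
Let $\{A_n\}_{n \ge 1} \in \mathcal{T}_\delta(C)$ consist of open intervals. From this family, as above, we obtain covers $\{A_{n,k}\}_{n \ge 1}$ of $C^k$ for $k \in \{1, 2\}$, such that
\[ \delta(A_{n,k}) \le \frac{\delta}{3} \quad \forall\, n \ge 1. \]
Again from the choice of $s$, we have
\[ \delta(A_n)^s = \delta(A_{n,1})^s + \delta(A_{n,2})^s, \]
so
\[ \mu^{(s)}_\delta(C) \ge \mu^{(s)}_{\delta/3}(C). \]
Because $\mu^{(s)}_\delta$ is nonincreasing in $\delta > 0$ (a smaller $\delta$ admits fewer covers), we infer that $\mu^{(s)}_\delta(C)$ is independent of $\delta > 0$. So we can take an open interval of length greater than $1$ as an open cover of $C$ and conclude that $\mu^{(s)}(C) \le 1$. This proves that
\[ \dim C = \frac{\ln 2}{\ln 3}. \]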
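The role of the critical exponent can be illustrated numerically. The sketch below is only a sanity check, not part of the proof: it evaluates the sum $\sum \delta(A_n)^s$ for the natural covers of the Cantor ternary set by the $2^n$ closed intervals of length $3^{-n}$ left after $n$ construction steps. These sums only bound $\mu^{(s)}_\delta(C)$ from above; the function name is illustrative.

```python
import math

def natural_cover_sum(level, s):
    """Sum of delta(A)^s over the 2**level intervals of length 3**(-level)."""
    return (2 ** level) * (3.0 ** (-level)) ** s

s0 = math.log(2) / math.log(3)  # critical exponent ln 2 / ln 3

for s in (s0 - 0.1, s0, s0 + 0.1):
    sums = [natural_cover_sum(level, s) for level in (1, 5, 10, 20)]
    print(round(s, 4), sums)
```

For $s = \ln 2 / \ln 3$ the sums stay equal to $1$ (up to rounding), for larger $s$ they decrease toward $0$, and for smaller $s$ they grow without bound, in line with Theorem 1.3.7.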
One can show that for every $\xi \in [0,1]$, there exists a set $A \subseteq \mathbb{R}$, such that $\dim A = \xi$. This can be done using Cantor-like sets. These are sets which share most of the properties of the Cantor ternary set, but need not be Lebesgue-null. We can construct a Cantor-like set as follows. We start with the interval $[0,1]$ and proceed inductively. We remove an open interval $B_{1,1}$ centered at $\tfrac12$ with length less than $1$. We are left with closed intervals
$D_{1,1}$ and $D_{1,2}$, each with length less than $\tfrac12$. At the $n$-th step of this process we are left with closed intervals $D_{n,1}, D_{n,2}, \ldots, D_{n,2^n}$, each with length less than $\tfrac{1}{2^n}$. In the $(n+1)$-st step, from each closed interval $D_{n,k}$ we remove an open interval $E_{n+1,k}$ having the same center as $D_{n,k}$ and length less than the length of $D_{n,k}$. We set
\[ S_n \stackrel{df}{=} \bigcup_{k=1}^{2^n} D_{n,k} \quad \text{and} \quad S \stackrel{df}{=} \bigcap_{n=1}^{\infty} S_n. \]
The set $S$ is a Cantor-like set. It is known (see Hewitt & Stromberg (1975, p. 71)) that $S$ is nonempty, compact, nowhere dense and perfect (just as the Cantor ternary set). However, unlike the Cantor ternary set, $S$ need not be Lebesgue-null. More precisely, consider a sequence $\{\vartheta_n\}_{n \ge 1}$ of positive numbers, such that
\[ 1 > 2\vartheta_1 > 4\vartheta_2 > \ldots > 2^n \vartheta_n > \ldots. \]
Following the construction of $S$ above, we remove from $[0,1]$ an open interval centered at $\tfrac12$ and having length $1 - 2\vartheta_1$. The remaining closed intervals $D_{1,1}$ and $D_{1,2}$ each have length $\vartheta_1$. Then from each of the intervals $D_{1,1}$ and $D_{1,2}$ we remove concentric open intervals each of length $\vartheta_1 - 2\vartheta_2$. We are left with closed intervals $D_{2,1}$, $D_{2,2}$, $D_{2,3}$ and $D_{2,4}$ each of length $\vartheta_2$. We continue this way. In the $n$-th step we are left with $2^n$ closed intervals each with length $\vartheta_n$. Then we have
\[ \lambda^1(S) = \lim_{n \to +\infty} 2^n \vartheta_n \]
($\lambda^1$ being the Lebesgue measure on $\mathbb{R}$). If $\vartheta_n = \tfrac{1}{3^n}$, then $S = C$ is the Cantor ternary set. Although $S$ is nowhere dense, we can have $\lambda^1(S)$ as close to $1$ as we choose. Indeed, for a given $\xi \in (0,1)$, let
\[ \vartheta_n \stackrel{df}{=} \frac{1}{2^n} \cdot \frac{n\xi + 1}{n + 1} \quad \forall\, n \ge 1. \]
Then we have $\lambda^1(S) = \xi$.
Suppose that in the construction of the Cantor-like set at each step the closed subintervals are divided in the same proportions as the original, namely
\[ \delta(D_{1,1}) = \delta(D_{1,2}) = \vartheta, \qquad \delta(D_{2,1}) = \delta(D_{2,2}) = \delta(D_{2,3}) = \delta(D_{2,4}) = \vartheta^2 \]
and in general
\[ \delta(D_{n,k}) = \vartheta^n \quad \forall\, k \in \{1, \ldots, 2^n\}. \]
Then the resulting Cantor-like set is denoted by $S_\vartheta$. Arguing as in the proof of Proposition 1.3.9, we obtain the following Proposition.
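The choice $\vartheta_n = \frac{1}{2^n} \cdot \frac{n\xi + 1}{n + 1}$ made above can be checked numerically. The following sketch (with illustrative function names and an assumed sample value $\xi = 0.75$) computes the total length $2^n \vartheta_n$ left after $n$ steps and lets one observe that it decreases toward $\xi$.

```python
def theta(n, xi):
    """Length of each of the 2**n closed intervals left after n steps."""
    return (n * xi + 1.0) / ((n + 1.0) * 2.0 ** n)

def remaining_length(n, xi):
    """Total length 2**n * theta_n of the set S_n."""
    return 2.0 ** n * theta(n, xi)

xi = 0.75  # assumed target measure in (0, 1)
lengths = [remaining_length(n, xi) for n in (1, 2, 5, 10, 100, 500)]
print(lengths)
```

The printed values start below $1$, decrease strictly, and approach the prescribed measure $\xi$, as required by the condition $1 > 2\vartheta_1 > 4\vartheta_2 > \ldots$.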
PROPOSITION 1.3.10 If $\vartheta \in \big( 0, \tfrac12 \big)$, then $\dim S_\vartheta = -\dfrac{\ln 2}{\ln \vartheta}$.
REMARK 1.3.11 If $\vartheta = \tfrac13$, then $S_\vartheta = C$ is the Cantor ternary set and Propositions 1.3.9 and 1.3.10 coincide.
COROLLARY 1.3.12 For each $\xi \in [0,1]$, there exists $A \subseteq \mathbb{R}$, such that $\dim A = \xi$.
PROOF If $\xi = 0$, then we take $A$ to be a singleton. If $0 < \xi < 1$, then take $\vartheta = \exp\big( -\tfrac{\ln 2}{\xi} \big) < \tfrac12$ and use Proposition 1.3.10. If $\xi = 1$, let $A = I = [0,1]$. Then we can easily check that
\[ \mu^{(s)}(A) = \begin{cases} +\infty & \text{if } 0 < s < 1, \\ 1 & \text{if } s = 1, \\ 0 & \text{if } s > 1. \end{cases} \]
Therefore $\dim A = 1$.
REMARK 1.3.13 An alternative way to define the Hausdorff dimension of a set $A \subseteq X$ is by
\[ \dim A \stackrel{df}{=} \inf \big\{ s > 0 : \mu^{(s)}(A) = 0 \big\}. \]
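Returning to Corollary 1.3.12, the choice $\vartheta = \exp(-\ln 2 / \xi)$ can be verified numerically against the formula $\dim S_\vartheta = -\ln 2 / \ln \vartheta$ of Proposition 1.3.10. This is a small sketch with illustrative function names; it only checks the arithmetic and does not compute a Hausdorff dimension from covers.

```python
import math

def theta_for_dimension(xi):
    """Ratio theta = exp(-ln 2 / xi) used in Corollary 1.3.12 for 0 < xi < 1."""
    return math.exp(-math.log(2) / xi)

def dimension_of_S(theta):
    """dim S_theta = -ln 2 / ln theta, as stated in Proposition 1.3.10."""
    return -math.log(2) / math.log(theta)

for xi in (0.1, 0.5, math.log(2) / math.log(3), 0.9):
    th = theta_for_dimension(xi)
    print(xi, th, dimension_of_S(th))
```

For $\xi = \ln 2 / \ln 3$ the ratio comes out as $\vartheta = \tfrac13$ (up to rounding), which recovers the Cantor ternary set.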
In general the Hausdorff dimension of a set may be any number in $[0, +\infty]$ and need not be an integer. Even if $\dim A$ is an integer and $k = \dim A > 0$, the set $A$ need not be a “$k$-dimensional surface” in any sense (see Federer (1969)).
Next we turn our attention to the case $X = \mathbb{R}^N$. Let us begin by recalling the definition of the $N$-dimensional outer measure $\lambda^N$.
DEFINITION 1.3.14 (a) We say that $Q \subseteq \mathbb{R}^N$ is a closed $N$-cube, if there exist $a_k < b_k$ for $k = 1, \ldots, N$, such that $Q = \prod_{k=1}^{N} [a_k, b_k]$. We set
\[ |Q| \stackrel{df}{=} \prod_{k=1}^{N} (b_k - a_k). \]
(b) The Lebesgue $N$-dimensional outer measure $\lambda^N$, for all $A \subseteq \mathbb{R}^N$, is defined by
\[ \lambda^N(A) \stackrel{df}{=} \inf \Big\{ \sum_{k=1}^{\infty} |Q_k| : A \subseteq \bigcup_{k=1}^{\infty} Q_k,\ Q_k \text{ is a closed } N\text{-cube} \Big\}. \]
REMARK 1.3.15 Clearly the definitions of $\lambda^1$ and $\mu^{(1)}$ on $\mathbb{R}$ coincide. We shall show that for any $N \ge 1$ the outer measures $\lambda^N$ and $\mu^{(N)}$ are closely related. In fact they differ by a multiplicative constant. This is not easy to establish and requires some preparation which culminates in the so-called “isodiametric inequality,” which says that the set of maximal volume for a given diameter is the ball.
LEMMA 1.3.16 If $f\colon \mathbb{R}^N \to [0, +\infty]$ is Lebesgue measurable, then the set
\[ H \stackrel{df}{=} \big\{ (x, \vartheta) \in \mathbb{R}^N \times \mathbb{R} : 0 \le \vartheta \le f(x) \big\} \]
is Lebesgue measurable in $\mathbb{R}^{N+1}$.
PROOF Let
\[ A \stackrel{df}{=} \big\{ x \in \mathbb{R}^N : f(x) = +\infty \big\}. \]
Then $A$ is Lebesgue measurable. Let $g\colon A^c \times \mathbb{R}_+ \to \mathbb{R}$ be defined by
\[ g(x, \vartheta) \stackrel{df}{=} f(x) - \vartheta \quad \forall\, (x, \vartheta) \in A^c \times \mathbb{R}_+. \]
Evidently $g$ is a Carathéodory function (i.e., it is Lebesgue measurable in $x$ and continuous in $\vartheta$). Therefore $g$ is Lebesgue measurable on $A^c \times \mathbb{R}_+$ and so
\[ H_0 \stackrel{df}{=} \big\{ (x, \vartheta) \in A^c \times \mathbb{R}_+ : \vartheta \le f(x) \big\} \]
is Lebesgue measurable in $\mathbb{R}^{N+1}$. Finally note that $H = H_0 \cup (A \times \mathbb{R}_+)$.
In what follows, for $a, b \in \mathbb{R}^N$, $\|a\|_{\mathbb{R}^N} = 1$, we introduce the following objects:
\[ L(a, b) \stackrel{df}{=} \big\{ b + ta : t \in \mathbb{R} \big\} \]
- the line passing through $b$ in the direction of $a$, and
\[ P(a) \stackrel{df}{=} \big\{ x \in \mathbb{R}^N : (x, a)_{\mathbb{R}^N} = 0 \big\} \]
- the plane passing through the origin, perpendicular to $a$.
DEFINITION 1.3.17 Let $a \in \mathbb{R}^N$ with $\|a\|_{\mathbb{R}^N} = 1$ and $A \subseteq \mathbb{R}^N$. We define the Steiner symmetrization of $A$ with respect to the plane $P(a)$ to be the set
\[ S(a, A) \stackrel{df}{=} \bigcup_{\substack{b \in P(a) \\ A \cap L(a,b) \neq \emptyset}} \Big\{ b + ta : |t| \le \frac12 \mu^{(1)}\big( A \cap L(a, b) \big) \Big\}. \]
REMARK 1.3.18 The above defined Steiner symmetrization with respect to an $(N-1)$-dimensional subspace $Y$ of $\mathbb{R}^N$ is the operation which associates to each $A \subseteq \mathbb{R}^N$ the set $V \subseteq \mathbb{R}^N$, such that for every line $L$ perpendicular to $Y$ either
• $L \cap A = \emptyset$ and $L \cap V = \emptyset$; or
• $L \cap A \neq \emptyset$ and $L \cap V$ is a closed segment centered in $Y$ and $\mu^{(1)}(L \cap A) = \mu^{(1)}(L \cap V)$.
If $A$ is compact, then $V$ is compact too and $\lambda^N(A) = \lambda^N(V)$. Also if $A$ is convex, then $V$ is convex too.
The next Proposition summarizes the properties of the Steiner symmetrization.
PROPOSITION 1.3.19 Let $A \subseteq \mathbb{R}^N$ and $a \in \mathbb{R}^N$.
(a) $\delta\big( S(a, A) \big) \le \delta(A)$.
(b) If $A \subseteq \mathbb{R}^N$ is Lebesgue measurable, then so is $S(a, A)$ and $\lambda^N\big( S(a, A) \big) = \lambda^N(A)$.
PROOF (a) Assume that $\delta(A) < +\infty$, or otherwise the result is trivial. Also we may assume that $A$ is closed. For a given $\varepsilon > 0$, let $x, y \in S(a, A)$ be such that
\[ \delta\big( S(a, A) \big) - \varepsilon \le \|x - y\|_{\mathbb{R}^N}. \]
Let
\[ b \stackrel{df}{=} x - (x, a)_{\mathbb{R}^N}\, a \quad \text{and} \quad c \stackrel{df}{=} y - (y, a)_{\mathbb{R}^N}\, a. \]
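Then $b, c \in P(a)$. Let us set
\[ r \stackrel{df}{=} \inf \{ t \in \mathbb{R} : b + ta \in A \}, \qquad s \stackrel{df}{=} \sup \{ t \in \mathbb{R} : b + ta \in A \}, \]
\[ u \stackrel{df}{=} \inf \{ t \in \mathbb{R} : c + ta \in A \}, \qquad v \stackrel{df}{=} \sup \{ t \in \mathbb{R} : c + ta \in A \}. \]
We may assume without any loss of generality that $v - r \ge s - u$. So
\[ v - r \ge \frac12 (v - r) + \frac12 (s - u) = \frac12 (s - r) + \frac12 (v - u) \ge \frac12 \mu^{(1)}\big( A \cap L(a, b) \big) + \frac12 \mu^{(1)}\big( A \cap L(a, c) \big). \]
Note that
\[ \big| (x, a)_{\mathbb{R}^N} \big| \le \frac12 \mu^{(1)}\big( A \cap L(a, b) \big) \quad \text{and} \quad \big| (y, a)_{\mathbb{R}^N} \big| \le \frac12 \mu^{(1)}\big( A \cap L(a, c) \big) \]
(recall that $x, y \in S(a, A)$). It follows that
\[ v - r \ge \big| (x, a)_{\mathbb{R}^N} \big| + \big| (y, a)_{\mathbb{R}^N} \big| \ge \big| (x - y, a)_{\mathbb{R}^N} \big|. \]
Hence we have
\[ \big( \delta\big( S(a, A) \big) - \varepsilon \big)^2 \le \|x - y\|^2_{\mathbb{R}^N} \le \|b - c\|^2_{\mathbb{R}^N} + \big| (x - y, a)_{\mathbb{R}^N} \big|^2 \le \|b - c\|^2_{\mathbb{R}^N} + (v - r)^2 = \big\| (b + ra) - (c + va) \big\|^2_{\mathbb{R}^N} \le \delta(A)^2 \]
(note that $A$ is closed and so $b + ra,\ c + va \in A$). It follows that
\[ \delta\big( S(a, A) \big) - \varepsilon \le \delta(A). \]
Let $\varepsilon \searrow 0$, to conclude that $\delta\big( S(a, A) \big) \le \delta(A)$.
(b) Recall that the Lebesgue measure $\lambda^N$ is rotation invariant. So we may take
\[ a = e_N = (0, \ldots, 0, 1)^{T}. \]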
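Then $P(a) = P(e_N) = \mathbb{R}^{N-1}$. Note that the function $f\colon \mathbb{R}^{N-1} \to \mathbb{R}$, defined by
\[ f(b) \stackrel{df}{=} \mu^{(1)}\big( A \cap L(a, b) \big) \quad \forall\, b \in \mathbb{R}^{N-1}, \]
is measurable (Fubini's theorem) and
\[ \lambda^N(A) = \int_{\mathbb{R}^{N-1}} f(b)\, d\lambda^{N-1}(b) \]
(since $\lambda^1 = \mu^{(1)}$; see Remark 1.3.15). So by virtue of Lemma 1.3.16, we have that
\[ S(a, A) = \Big\{ (b, \vartheta) \in \mathbb{R}^{N-1} \times \mathbb{R} : -\frac{f(b)}{2} \le \vartheta \le \frac{f(b)}{2} \Big\} \setminus \Big\{ (b, 0) \in \mathbb{R}^{N-1} \times \mathbb{R} : A \cap L(a, b) = \emptyset \Big\} \]
is Lebesgue measurable in $\mathbb{R}^N$ and, moreover,
\[ \lambda^N\big( S(a, A) \big) = \int_{\mathbb{R}^{N-1}} f(b)\, d\lambda^{N-1}(b) = \lambda^N(A). \]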
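Both properties of the Steiner symmetrization can be observed on a small planar example. The sketch below makes several simplifying assumptions that are not in the text: the set is a finite union of axis-parallel rectangles stored as vertical strips of a fixed width `WIDTH`, the symmetrization is taken in the vertical direction (so slices are recentered on the horizontal axis), and the diameter is computed from rectangle corners, where it is attained for such sets. All names are illustrative.

```python
import itertools
import math

WIDTH = 1.0  # assumed common width of every vertical strip

def steiner_symmetrize(strips):
    """Replace each vertical slice by a centered segment of the same total length."""
    result = {}
    for x_left, intervals in strips.items():
        total = sum(hi - lo for lo, hi in intervals)
        result[x_left] = [(-total / 2.0, total / 2.0)]
    return result

def area(strips):
    """Two-dimensional Lebesgue measure of the union of rectangles."""
    return sum(WIDTH * (hi - lo) for ivs in strips.values() for lo, hi in ivs)

def diameter(strips):
    """Diameter of the union of rectangles, attained at rectangle corners."""
    corners = [(x, y)
               for x_left, ivs in strips.items()
               for lo, hi in ivs
               for x in (x_left, x_left + WIDTH)
               for y in (lo, hi)]
    return max(math.dist(p, q) for p, q in itertools.combinations(corners, 2))

# Strips keyed by left endpoint; each value lists disjoint vertical intervals.
A = {0.0: [(0.0, 1.0), (3.0, 4.0)], 1.0: [(2.0, 5.0)], 2.0: [(-1.0, 0.5)]}
S = steiner_symmetrize(A)
print(area(A), area(S))
print(diameter(A), diameter(S))
```

In this example the area is unchanged while the diameter does not increase, in agreement with Proposition 1.3.19.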
Now we are properly equipped to prove the so-called “isodiametric inequality,” which states that, if in $\mathbb{R}^N$ we consider the family of all sets with given diameter, the one with maximum Lebesgue $N$-dimensional outer measure ($N$-volume) is the ball.
THEOREM 1.3.20 (Isodiametric Inequality) For all $A \subseteq \mathbb{R}^N$, we have
\[ \lambda^N(A) \le a(N) \Big( \frac{\delta(A)}{2} \Big)^N, \]
where $a(N) \stackrel{df}{=} \dfrac{\pi^{N/2}}{\big( \frac{N}{2} \big)!}$ is the volume of the unit ball in $\mathbb{R}^N$.
PROOF If $\delta(A) = +\infty$, then there is nothing to prove. So suppose that $\delta(A) < +\infty$.
Let $\{e_k\}_{k=1}^{N}$ be the standard basis of $\mathbb{R}^N$. We introduce
\[ A_1 = S(e_1, A), \quad A_2 = S(e_2, A_1), \quad \ldots, \quad A_N = S(e_N, A_{N-1}). \]
Let us set $A^* = A_N$.
Claim 1. $A^*$ is symmetric with respect to the origin.
By virtue of the definition of the Steiner symmetrization, we have that $A_1$ is symmetric with respect to the plane $P(e_1)$. Let $1 \le k \le N - 1$ and suppose that $A_k$ is symmetric with respect to $P(e_1), \ldots, P(e_k)$. Again $A_{k+1}$ is symmetric with respect to $P(e_{k+1})$. Let us fix $1 \le m \le k$ and let $R_m\colon \mathbb{R}^N \to \mathbb{R}^N$ be the reflection with respect to $P(e_m)$. Let $b \in P(e_{k+1})$. Because $R_m(A_k) = A_k$, we have
\[ \mu^{(1)}\big( A_k \cap L(e_{k+1}, b) \big) = \mu^{(1)}\big( A_k \cap L(e_{k+1}, R_m(b)) \big), \]
so
\[ \big\{ t \in \mathbb{R} : b + t e_{k+1} \in A_{k+1} \big\} = \big\{ t \in \mathbb{R} : R_m(b) + t e_{k+1} \in A_{k+1} \big\} \]
and thus $R_m(A_{k+1}) = A_{k+1}$, i.e., $A_{k+1}$ is symmetric with respect to $P(e_m)$. It follows that $A^* = A_N$ is symmetric with respect to $P(e_1), \ldots, P(e_N)$, hence it is symmetric with respect to the origin.
Claim 2. $\lambda^N(A^*) \le \dfrac{\pi^{N/2}}{\big( \frac{N}{2} \big)!} \Big( \dfrac{\delta(A^*)}{2} \Big)^N$.
Let $x \in A^*$. Then because of Claim 1, we have $-x \in A^*$ and so $2\|x\|_{\mathbb{R}^N} \le \delta(A^*)$. Hence
\[ A^* \subseteq \overline{B}_{\delta(A^*)/2}(0) = \Big\{ y \in \mathbb{R}^N : \|y\|_{\mathbb{R}^N} \le \frac{\delta(A^*)}{2} \Big\} \]
and so
\[ \lambda^N(A^*) \le \lambda^N\big( \overline{B}_{\delta(A^*)/2}(0) \big) \le \frac{\pi^{N/2}}{\big( \frac{N}{2} \big)!} \Big( \frac{\delta(A^*)}{2} \Big)^N. \]
Using Claim 2, we can have the isodiametric inequality. Note that the closure $\overline{A} \subseteq \mathbb{R}^N$ is Lebesgue measurable and so by Proposition 1.3.19, we have
\[ \lambda^N\big( \overline{A}^* \big) = \lambda^N\big( \overline{A} \big) \quad \text{and} \quad \delta\big( \overline{A}^* \big) \le \delta\big( \overline{A} \big). \]
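Using Claim 2 (applied to the closure $\overline{A}$), it follows that
\[ \lambda^N(A) \le \lambda^N\big( \overline{A} \big) = \lambda^N\big( \overline{A}^* \big) \le \frac{\pi^{N/2}}{\big( \frac{N}{2} \big)!} \Big( \frac{\delta(\overline{A}^*)}{2} \Big)^N \le \frac{\pi^{N/2}}{\big( \frac{N}{2} \big)!} \Big( \frac{\delta(\overline{A})}{2} \Big)^N = \frac{\pi^{N/2}}{\big( \frac{N}{2} \big)!} \Big( \frac{\delta(A)}{2} \Big)^N. \]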
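As a quick sanity check of the isodiametric inequality, one can compare the unit cube $[0,1]^N$ (volume $1$, diameter $\sqrt{N}$) with the bound $a(N)\big(\delta/2\big)^N$. The sketch below assumes that $\big(\tfrac{N}{2}\big)!$ is read as $\Gamma\big(\tfrac{N}{2} + 1\big)$; the function names are illustrative.

```python
import math

def unit_ball_volume(n):
    """a(N) = pi^(N/2) / Gamma(N/2 + 1), the volume of the unit ball in R^N."""
    return math.pi ** (n / 2.0) / math.gamma(n / 2.0 + 1.0)

def isodiametric_bound(diameter, n):
    """Right-hand side a(N) * (diameter / 2)^N of the isodiametric inequality."""
    return unit_ball_volume(n) * (diameter / 2.0) ** n

for n in range(1, 7):
    cube_volume = 1.0              # Lebesgue measure of [0, 1]^N
    cube_diameter = math.sqrt(n)   # diameter of [0, 1]^N
    print(n, cube_volume, isodiametric_bound(cube_diameter, n))
```

For $N = 1$ the two sides agree (an interval is a one-dimensional ball), while for $N \ge 2$ the bound is strictly larger than the volume of the cube.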
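THEOREM 1.3.21 If $A \subseteq \mathbb{R}^N$, then $\lambda^N(A) = c_N \mu^{(N)}(A)$, with $c_N \stackrel{df}{=} \dfrac{\pi^{N/2}}{2^N \big( \frac{N}{2} \big)!}$.
PROOF For a given $\varepsilon > 0$, we can find a cover $\{C_n\}_{n \ge 1}$ of $A$ consisting of closed, convex sets, such that
\[ \sum_{n=1}^{\infty} \delta(C_n)^N \le \mu^{(N)}(A) + \varepsilon. \]
By virtue of Theorem 1.3.20, we have
\[ \lambda^N(C_n) \le c_N \delta(C_n)^N \quad \forall\, n \ge 1. \]
So
\[ \lambda^N(A) \le \sum_{n=1}^{\infty} \lambda^N(C_n) \le c_N \sum_{n=1}^{\infty} \delta(C_n)^N \le c_N \mu^{(N)}(A) + c_N \varepsilon. \]
Let $\varepsilon \searrow 0$ to conclude that
\[ \lambda^N(A) \le c_N \mu^{(N)}(A). \tag{1.24} \]
To prove the opposite inequality, first we show that $\mu^{(N)}$ is absolutely continuous with respect to $\lambda^N$ (see Definition A.2.22). Note that for any $N$-cube $Q$, we have
\[ \lambda^N(Q) = |Q| \le \Big( \frac{\delta(Q)}{\sqrt{N}} \Big)^N. \]
So for a given $\delta > 0$, we have
\[ \mu^{(N)}_\delta(A) \le \inf \Big\{ \sum_{n=1}^{\infty} \delta(Q_n)^N : Q_n \text{ is an } N\text{-cube},\ A \subseteq \bigcup_{n=1}^{\infty} Q_n,\ \delta(Q_n) \le \delta \Big\} \le \big( \sqrt{N} \big)^N \lambda^N(A). \]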
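Let $\delta \searrow 0$, to conclude that $\mu^{(N)}$ is absolutely continuous with respect to $\lambda^N$ (see Definition A.2.22). Next for given $\varepsilon, \delta > 0$, we can find a cover $\{Q_n\}_{n \ge 1}$ of $A$ consisting of $N$-cubes, such that
\[ \delta(Q_n) < \delta \quad \forall\, n \ge 1 \]
and
\[ \sum_{n=1}^{\infty} \lambda^N(Q_n) \le \lambda^N(A) + \varepsilon. \tag{1.25} \]
We may suppose that the $N$-cubes are open, by expanding them slightly so that the above inequality remains valid. Invoking Vitali's covering theorem (see Theorem 1.2.5), for every $n \ge 1$ we can find disjoint balls $\{B_{n,k}\}_{k \ge 1}$ contained in $Q_n$, such that
\[ \delta(B_{n,k}) \le \delta \quad \text{and} \quad \lambda^N\Big( Q_n \setminus \bigcup_{k=1}^{\infty} B_{n,k} \Big) = 0. \]
By virtue of the absolute continuity of $\mu^{(N)}$ with respect to $\lambda^N$, we have
\[ \mu^{(N)}\Big( Q_n \setminus \bigcup_{k=1}^{\infty} B_{n,k} \Big) = 0 \quad \text{and} \quad \mu^{(N)}_\delta\Big( Q_n \setminus \bigcup_{k=1}^{\infty} B_{n,k} \Big) = 0. \]
Therefore, using (1.25), we have
\[ \mu^{(N)}_\delta(A) \le \sum_{n=1}^{\infty} \mu^{(N)}_\delta(Q_n) \le \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \mu^{(N)}_\delta(B_{n,k}) + \sum_{n=1}^{\infty} \mu^{(N)}_\delta\Big( Q_n \setminus \bigcup_{k=1}^{\infty} B_{n,k} \Big) \]
\[ \le \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \delta(B_{n,k})^N = \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \frac{1}{c_N} \lambda^N(B_{n,k}) \le \frac{1}{c_N} \sum_{n=1}^{\infty} \lambda^N(Q_n) \le \frac{1}{c_N} \lambda^N(A) + \frac{\varepsilon}{c_N}. \]
Let $\varepsilon, \delta \searrow 0$, to conclude that
\[ c_N \mu^{(N)}(A) \le \lambda^N(A). \tag{1.26} \]
From (1.24) and (1.26), we conclude that $\lambda^N = c_N \mu^{(N)}$.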
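REMARK 1.3.22 Some authors, in order to get rid of the multiplicative constant $c_N$, normalize the definition of the Hausdorff measures on $\mathbb{R}^N$. So if $C \subseteq \mathbb{R}^N$, $0 \le s < +\infty$, $0 < \delta \le +\infty$, they set
\[ \mu^{(s)}_\delta(C) \stackrel{df}{=} \inf \Big\{ \sum_{n=1}^{\infty} a(s) \Big( \frac{\delta(A_n)}{2} \Big)^s : C \subseteq \bigcup_{n=1}^{\infty} A_n,\ \delta(A_n) \le \delta \Big\}, \]
where $a(s) \stackrel{df}{=} \dfrac{\pi^{s/2}}{\Gamma\big( \frac{s}{2} + 1 \big)}$. Here
\[ \Gamma(s) \stackrel{df}{=} \int_0^{+\infty} x^{s-1} e^{-x}\, dx \]
is the Euler gamma function. The Hausdorff $s$-dimensional outer measure $\mu^{(s)}$ is defined by
\[ \mu^{(s)}(C) = \lim_{\delta \searrow 0} \mu^{(s)}_\delta(C) = \sup_{\delta > 0} \mu^{(s)}_\delta(C) \]
(cf., e.g., Evans & Gariepy (1992, p. 60)). Recall that
\[ \lambda^N\big( B(x, r) \big) = a(N) r^N \quad \forall\, x \in \mathbb{R}^N. \]
In this case Theorem 1.3.21 says that $\lambda^N = \mu^{(N)}$. Note that $\mu^{(0)}$ is the counting measure.
Let us prove some further properties of the Hausdorff measures on $\mathbb{R}^N$.
PROPOSITION 1.3.23 Let $0 \le s < +\infty$. We have
(a) $\mu^{(s)}(A) = 0$ for all $A \subseteq \mathbb{R}^N$ and all $s > N$.
(b) $\mu^{(s)}(\xi A) = \xi^s \mu^{(s)}(A)$ for all $A \subseteq \mathbb{R}^N$ and all $\xi > 0$.
(c) $\mu^{(s)}\big( K(A) \big) = \mu^{(s)}(A)$ for all $A \subseteq \mathbb{R}^N$ and for any affine isometry $K\colon \mathbb{R}^N \to \mathbb{R}^N$.
PROOF (a) Let $Q = (0,1)^N$ and let $m \ge 1$ be an integer. For
\[ k = (k_i)_{i=1}^{N} \in \mathcal{K} \stackrel{df}{=} \{0, \ldots, m-1\}^N, \]
we set
\[ Q_k \stackrel{df}{=} \prod_{i=1}^{N} \Big[ \frac{k_i}{m}, \frac{k_i + 1}{m} \Big]. \]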
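Note that
\[
Q = \bigcup_{k \in K} Q_k \qquad \text{and} \qquad \delta(Q_k) = \frac{\sqrt{N}}{m}.
\]
So we have
\[
\mu^{(s)}_{\sqrt{N}/m}(Q) \le \sum_{k \in K} \delta(Q_k)^s = m^{N-s} \big(\sqrt{N}\big)^{s}.
\]
Letting $m \to +\infty$, since $s > N$, we obtain $\mu^{(s)}(Q) = 0$, from which it follows that
\[
\mu^{(s)}(\mathbb{R}^N) = 0.
\]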
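(b) Note that for all $C \subseteq \mathbb{R}^N$, we have $\delta(\xi C) = \xi\, \delta(C)$. So the result follows at once from Definition 1.3.5.

(c) Note that for all $C \subseteq \mathbb{R}^N$, we have $\delta\big(K(C)\big) = \delta(C)$. Again the result follows from Definition 1.3.5.

The next proposition suggests a convenient way to check that $\mu^{(s)}$ vanishes on a set.

PROPOSITION 1.3.24 If $A \subseteq \mathbb{R}^N$, $0 < \delta \le +\infty$ and $0 \le s < +\infty$ are such that $\mu^{(s)}_{\delta}(A) = 0$, then $\mu^{(s)}(A) = 0$.

PROOF If $s = 0$, then $\mu^{(0)}_{\delta}(A) = 0$ implies that $A = \emptyset$ and so $\mu^{(0)}(A) = 0$. So suppose that $s > 0$. For a given $\varepsilon > 0$, we can find $\{C_n\}_{n \ge 1}$, such that
\[
A \subseteq \bigcup_{n=1}^{\infty} C_n, \qquad \delta(C_n) \le \delta \quad \forall\, n \ge 1 \qquad \text{and} \qquad \sum_{n=1}^{\infty} \delta(C_n)^s \le \varepsilon.
\]
Evidently
\[
\delta(C_n)^s \le \varepsilon, \quad \text{i.e.,} \quad \delta(C_n) \le \varepsilon^{1/s} \qquad \forall\, n \ge 1,
\]
and so $\mu^{(s)}_{\varepsilon^{1/s}}(A) \le \varepsilon$. Let $\varepsilon \searrow 0$ to conclude that $\mu^{(s)}(A) = 0$.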
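The covering estimate used in the proof of Proposition 1.3.23(a) can be tabulated numerically. The following is a minimal sketch, not part of the original argument: the function name `cube_cover_sum` is introduced here for illustration only, and it simply evaluates the sum $\sum_{k \in K} \delta(Q_k)^s = m^{N-s} (\sqrt{N})^s$ for the $m^N$ subcubes of side $1/m$.

```python
import math


def cube_cover_sum(N: int, s: float, m: int) -> float:
    """Sum of delta(Q_k)^s over the m^N subcubes of side 1/m of the unit cube.

    Each subcube has diameter sqrt(N)/m, so the sum equals m^(N-s) * N^(s/2).
    """
    diameter = math.sqrt(N) / m
    return (m ** N) * diameter ** s


if __name__ == "__main__":
    N, s = 2, 2.5  # s > N, so the sums should decrease to 0 as m grows
    for m in (1, 10, 100, 1000):
        print(m, cube_cover_sum(N, s, m))
```

For $s = N$ the sum stays constant in $m$, and for $s > N$ it decays like $m^{N-s}$, which is exactly what forces $\mu^{(s)}(Q) = 0$.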
Taking into account that for a Lipschitz continuous function $f$ with constant $c > 0$ and for every $A \subseteq \mathbb{R}^N$ we have
\[
\delta\big(f(A)\big) \le c\, \delta(A),
\]
we obtain the following result.

PROPOSITION 1.3.25 If $f : \mathbb{R}^N \to \mathbb{R}^M$ is a Lipschitz continuous function with Lipschitz constant $c > 0$ (see Definition 1.5.1), $A \subseteq \mathbb{R}^N$ and $0 \le s < +\infty$, then
\[
\mu^{(s)}\big(f(A)\big) \le c^s\, \mu^{(s)}(A).
\]

We conclude this section by returning to the notion of Hausdorff dimension (see Definition 1.3.8) and having a second look at this concept. The Hausdorff dimension has an intuitive appeal when familiar objects are under consideration. So for example $\dim \mathbb{R}^N = N$ (see Theorem 1.3.21). Suppose we want to determine the Hausdorff dimension of a curve $C \subseteq \mathbb{R}^3$. Our first guess will be that $\dim C = 1$. But recall that there are curves in $\mathbb{R}^3$ which fill the unit cube. Such a curve must have Hausdorff dimension 3. Therefore we must proceed with caution.

DEFINITION 1.3.26 Let $(X, d)$ be a metric space.
(a) By a curve in $X$ we mean the image $f\big([0, 1]\big)$ of a continuous function $f : [0, 1] \to X$.
(b) The length of a curve $C = f\big([0, 1]\big)$ is defined by
\[
l(C) \stackrel{\mathrm{df}}{=} \sup \sum_{k=1}^{m} d\big(f(x_{k-1}), f(x_k)\big),
\]
where the supremum is taken over all partitions $0 = x_0 < x_1 < \ldots < x_m = 1$ of $[0, 1]$.
(c) The curve $C$ is said to be rectifiable, if $l(C) < +\infty$.

REMARK 1.3.27 A curve $C$ is a continuum, i.e., a compact and connected set in $X$. In particular a curve is a Borel set; hence it is also $\mu^{(s)}$-measurable. Moreover, if in Definition 1.3.26(a) $f$ is injective, then $f^{-1} : C \to [0, 1]$ exists and is continuous and so $C$ is the homeomorphic image of $[0, 1]$. Also in Definition 1.3.26(a), we can replace $[0, 1]$ by any closed bounded interval $[a, b]$. Some authors require $f$ to be injective.
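The supremum in Definition 1.3.26(b) can be approximated from below by polygonal sums over finer and finer partitions. The sketch below is only an illustration under stated assumptions: the metric is taken to be the Euclidean one on $\mathbb{R}^2$, the curve is a quarter of the unit circle (length $\pi/2$), and the names `polygonal_length`, `uniform_partition` and `quarter_circle` are hypothetical helpers introduced here.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def polygonal_length(f: Callable[[float], Point], partition: Sequence[float]) -> float:
    """Sum of d(f(x_{k-1}), f(x_k)) over a partition, with d the Euclidean metric."""
    points = [f(x) for x in partition]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def uniform_partition(m: int) -> List[float]:
    """Partition 0 = x_0 < x_1 < ... < x_m = 1 with equal steps."""
    return [k / m for k in range(m + 1)]


def quarter_circle(x: float) -> Point:
    """Injective parametrization of a quarter of the unit circle."""
    angle = 0.5 * math.pi * x
    return (math.cos(angle), math.sin(angle))


if __name__ == "__main__":
    for m in (4, 8, 64, 1024):
        print(m, polygonal_length(quarter_circle, uniform_partition(m)))
```

Each doubling of $m$ refines the previous partition, so by the triangle inequality the sums are nondecreasing; they stay below the length $\pi/2$ and approach it.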
PROPOSITION 1.3.28 If $(X, d)$ is a metric space, $f : [0, 1] \to X$ is a nonconstant curve with length $l$ and $C = f\big([0, 1]\big)$, then
(a) $0 < \mu^{(1)}(C) \le l$;
(b) if $f$ is injective, then $\mu^{(1)}(C) = l$.
Therefore, if $C$ is rectifiable (i.e., $l < +\infty$), then $\dim C = 1$.

PROOF (a) First we show that $\mu^{(1)}(C) \le l$. Assume that $l < +\infty$, or otherwise there is nothing to prove. Fix $n \ge 1$ and let $\{A_k\}_{k=1}^{m}$ be a collection of closed subarcs of $C$, such that
\[
C = \bigcup_{k=1}^{m} A_k, \qquad \delta(A_k) \le \frac{1}{n} \qquad \text{and} \qquad \mu^{(1)}_{1/n}(C) \le \sum_{k=1}^{m} \delta(A_k). \tag{1.27}
\]
Let us explicitly construct the subarcs $A_k$ for $k \in \{1, \ldots, m\}$. Note that $f$ is uniformly continuous and so we can find $\eta > 0$, such that
\[
d\big(f(x), f(y)\big) < \frac{1}{n} \qquad \forall\, x, y \in [0, 1],\ |x - y| < \eta.
\]
Consider a partition $0 = x_0 < x_1 < \ldots < x_m = 1$ of $[0, 1]$, such that
\[
|x_k - x_{k-1}| < \eta \qquad \forall\, k \in \{1, \ldots, m\}.
\]
Let
\[
A_k \stackrel{\mathrm{df}}{=} f\big([x_{k-1}, x_k]\big) \qquad \forall\, k \in \{1, \ldots, m\}.
\]
Evidently the subarcs $\{A_k\}_{k=1}^{m}$ cover $C$ and
\[
d\big(f(x_{k-1}), f(x_k)\big) \le \delta(A_k) < \frac{1}{n} \qquad \forall\, k \in \{1, \ldots, m\}.
\]
Note that every $A_k$ is compact and so we can find points $y_k, z_k \in [x_{k-1}, x_k]$, $y_k \le z_k$, such that
\[
d\big(f(y_k), f(z_k)\big) = \delta(A_k).
\]
We generate the finer partition
\[
0 \le y_1 \le z_1 \le y_2 \le z_2 \le \ldots \le y_m \le z_m \le 1.
\]
From (1.27), we have
\[
\mu^{(1)}_{1/n}(C) \le \sum_{k=1}^{m} \delta(A_k) = \sum_{k=1}^{m} d\big(f(y_k), f(z_k)\big) \le l.
\]
Passing to the limit as $n \to +\infty$, we obtain that $\mu^{(1)}(C) \le l$.

Next we show that $0 < \mu^{(1)}(C)$. To this end note that if $0 \le a < b \le 1$, then
\[
d\big(f(a), f(b)\big) \le \mu^{(1)}\big(f([a, b])\big). \tag{1.28}
\]
To see this let $h : E \stackrel{\mathrm{df}}{=} f([a, b]) \to \mathbb{R}$ be the function
\[
h(u) \stackrel{\mathrm{df}}{=} d\big(u, f(a)\big).
\]
Evidently $h$ is a Lipschitz continuous function with Lipschitz constant 1 and, since $E$ is connected and $h(f(a)) = 0$,
\[
J \stackrel{\mathrm{df}}{=} \big[0, h(f(b))\big] = \big[0, d\big(f(a), f(b)\big)\big] \subseteq h(E).
\]
So, from Proposition 1.3.25, we have
\[
d\big(f(a), f(b)\big) = \lambda^1(J) = \mu^{(1)}(J) \le \mu^{(1)}\big(h(E)\big) \le \mu^{(1)}(E).
\]
This proves inequality (1.28). But from (1.28) and since for appropriately chosen $a, b$ we have $d\big(f(a), f(b)\big) > 0$ (recall that the curve is nonconstant), we conclude that $0 < \mu^{(1)}(C)$.

(b) Now suppose that $f$ is injective. Let $0 = x_0 < x_1 < \ldots < x_m = 1$ be a partition of $[0, 1]$. Since $f$ is injective, the sets
\[
A_k \stackrel{\mathrm{df}}{=} f\big([x_{k-1}, x_k]\big)
\]
are Borel subsets of $X$ which are pairwise disjoint except for the common endpoints $f(x_k)$ of consecutive subarcs, and single points are $\mu^{(1)}$-null. Using inequality (1.28) on each subarc, we obtain
\[
\sum_{k=1}^{m} d\big(f(x_{k-1}), f(x_k)\big) \le \sum_{k=1}^{m} \mu^{(1)}\big(f([x_{k-1}, x_k])\big) = \mu^{(1)}\Big(\bigcup_{k=1}^{m} f([x_{k-1}, x_k])\Big) = \mu^{(1)}\big(f([0, 1])\big) = \mu^{(1)}(C).
\]
Since the partition of $[0, 1]$ was arbitrary, it follows that $l \le \mu^{(1)}(C)$. Combining this with (a), we obtain that $l = \mu^{(1)}(C)$.
1.4 Differentiation of Hausdorff Measures
From general measure theory, we know that the differentiation theory of real functions can be extended to a theory of differentiation for measures, which has many similar features and interesting problems. For the Lebesgue measures $\lambda^N$, $N \ge 1$, one of the basic results of this theory is the so-called Lebesgue density theorem, which we recall here.

THEOREM 1.4.1 (Lebesgue Density Theorem) If $A \subseteq \mathbb{R}^N$ is a Lebesgue measurable set, then
\[
\lim_{r \searrow 0} \frac{\lambda^N(\overline{B}_r(x) \cap A)}{\lambda^N(\overline{B}_r(x))} =
\begin{cases}
1 & \text{for } \lambda^N\text{-a.a. } x \in A, \\
0 & \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N \setminus A.
\end{cases}
\]

DEFINITION 1.4.2 Let $A \subseteq \mathbb{R}^N$ and $x \in \mathbb{R}^N$. We say that:
(a) $x$ is a point of density of $A$, if
\[
\lim_{r \searrow 0} \frac{\lambda^N(\overline{B}_r(x) \cap A)}{\lambda^N(\overline{B}_r(x))} = 1;
\]
(b) $x$ is a point of dispersion of $A$, if
\[
\lim_{r \searrow 0} \frac{\lambda^N(\overline{B}_r(x) \cap A)}{\lambda^N(\overline{B}_r(x))} = 0.
\]

REMARK 1.4.3 According to Theorem 1.4.1, $\lambda^N$-almost every point of $A$ is a point of density of $A$ and $\lambda^N$-almost every point of $\mathbb{R}^N \setminus A$ is a point of dispersion of $A$. We can think of the points of density of a set $A$ as forming a kind of measure theoretic interior of $A$, while the points of dispersion of $A$ form a kind of measure theoretic exterior of $A$.

The purpose of this section is to establish analogs of Theorem 1.4.1 for lower dimensional Hausdorff measures. In what follows we work in $\mathbb{R}^N$ and $1 < s < N$.

THEOREM 1.4.4 If $A \subseteq \mathbb{R}^N$ is $\mu^{(s)}$-measurable and $\mu^{(s)}(A) < +\infty$, then
\[
\lim_{r \searrow 0} \frac{\mu^{(s)}(\overline{B}_r(x) \cap A)}{(2r)^s} = 0 \qquad \text{for } \mu^{(s)}\text{-a.a. } x \in \mathbb{R}^N \setminus A.
\]
PROOF For every $t > 0$, let
\[
C_t \stackrel{\mathrm{df}}{=} \Big\{ x \in \mathbb{R}^N \setminus A : \ \limsup_{r \searrow 0} \frac{\mu^{(s)}(\overline{B}_r(x) \cap A)}{(2r)^s} > t \Big\}.
\]
To finish the proof it is enough to show that
\[
\mu^{(s)}(C_t) = 0 \qquad \forall\, t > 0.
\]
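Fix $\varepsilon > 0$. We know that $\mu^{(s)} \lfloor A$ is a Radon measure (see Proposition 1.1.9). So we can find $K \subseteq A$ compact, such that $\mu^{(s)}(A \setminus K) \le \varepsilon$ (see Proposition 1.1.10(b)). Let
\[
U \stackrel{\mathrm{df}}{=} \mathbb{R}^N \setminus K.
\]
Then $U$ is open and $C_t \subseteq U$. For fixed $\delta > 0$, we consider the family of closed balls
\[
\mathcal{T} \stackrel{\mathrm{df}}{=} \Big\{ \overline{B}_r(x) : \ \overline{B}_r(x) \subseteq U,\ 0 < r < \delta,\ \frac{\mu^{(s)}(\overline{B}_r(x) \cap A)}{(2r)^s} > t \Big\}.
\]
Without any loss of generality we may assume that $\mathcal{T} \ne \emptyset$, or otherwise $C_t = \emptyset$ and so $\mu^{(s)}(C_t) = 0$. Invoking Proposition 1.2.1, we can find a sequence $\big\{\overline{B}_{r_n}(x_n)\big\}_{n \ge 1}$ of disjoint elements in $\mathcal{T}$, such that
\[
C_t \subseteq \bigcup_{n=1}^{\infty} \overline{B}_{5 r_n}(x_n).
\]
Then we have
\[
\mu^{(s)}_{10\delta}(C_t) \le \sum_{n=1}^{\infty} (10 r_n)^s \le \frac{5^s}{t} \sum_{n=1}^{\infty} \mu^{(s)}\big(\overline{B}_{r_n}(x_n) \cap A\big) \le \frac{5^s}{t}\, \mu^{(s)}(U \cap A) = \frac{5^s}{t}\, \mu^{(s)}(A \setminus K) \le \frac{5^s \varepsilon}{t}.
\]
Let $\delta \searrow 0$, to obtain
\[
\mu^{(s)}(C_t) \le \frac{5^s \varepsilon}{t}.
\]
Since $\varepsilon > 0$ was arbitrary, we conclude that $\mu^{(s)}(C_t) = 0$.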
To have a complete analog of Theorem 1.4.1, we need to see if something can be said about the density of $A$ at its own points. To do this we will make use of Proposition 1.2.4.

THEOREM 1.4.5 If $A \subseteq \mathbb{R}^N$ is $\mu^{(s)}$-measurable and $\mu^{(s)}(A) < +\infty$, then
\[
\frac{1}{2^s} \le \limsup_{r \searrow 0} \frac{\mu^{(s)}(\overline{B}_r(x) \cap A)}{(2r)^s} \le 1 \qquad \text{for } \mu^{(s)}\text{-a.a. } x \in A.
\]

PROOF First we show that
\[
\limsup_{r \searrow 0} \frac{\mu^{(s)}(\overline{B}_r(x) \cap A)}{(2r)^s} \le 1 \qquad \text{for } \mu^{(s)}\text{-a.a. } x \in A. \tag{1.29}
\]
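To this end, for every $t > 1$, we introduce the set $C_t \subseteq A$ defined by
\[
C_t \stackrel{\mathrm{df}}{=} \Big\{ x \in A : \ \limsup_{r \searrow 0} \frac{\mu^{(s)}(\overline{B}_r(x) \cap A)}{(2r)^s} > t \Big\}.
\]
Fix $\varepsilon > 0$. Again $\mu^{(s)} \lfloor A$ is a Radon measure (see Proposition 1.1.9). We can find an open set $U \subseteq \mathbb{R}^N$, such that
\[
C_t \subseteq U \qquad \text{and} \qquad \mu^{(s)}(U \cap A) - \varepsilon \le \mu^{(s)}(C_t) \tag{1.30}
\]
(see Proposition 1.1.10(b)). For $\delta > 0$, we introduce the family $\mathcal{T}$ of closed balls defined by
\[
\mathcal{T} \stackrel{\mathrm{df}}{=} \Big\{ \overline{B}_r(x) : \ \overline{B}_r(x) \subseteq U,\ 0 < r < \delta,\ \frac{\mu^{(s)}(\overline{B}_r(x) \cap A)}{(2r)^s} > t \Big\}.
\]
By virtue of Proposition 1.2.4, we can find a sequence $\big\{\overline{B}_{r_n}(x_n)\big\}_{n \ge 1}$ of disjoint balls in $\mathcal{T}$, such that
\[
C_t \subseteq \bigcup_{n=1}^{m} \overline{B}_{r_n}(x_n) \cup \bigcup_{n=m+1}^{\infty} \overline{B}_{5 r_n}(x_n) \qquad \forall\, m \ge 1.
\]
Then we have
\[
\begin{aligned}
\mu^{(s)}_{10\delta}(C_t) &\le \sum_{n=1}^{m} (2 r_n)^s + \sum_{n=m+1}^{\infty} (10 r_n)^s \\
&\le \frac{1}{t} \sum_{n=1}^{m} \mu^{(s)}\big(\overline{B}_{r_n}(x_n) \cap A\big) + \frac{5^s}{t} \sum_{n=m+1}^{\infty} \mu^{(s)}\big(\overline{B}_{r_n}(x_n) \cap A\big) \\
&\le \frac{1}{t}\, \mu^{(s)}(U \cap A) + \frac{5^s}{t}\, \mu^{(s)}\Big( \bigcup_{n=m+1}^{\infty} \overline{B}_{r_n}(x_n) \cap A \Big) \qquad \forall\, m \ge 1.
\end{aligned}
\]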
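Using (1.30) and letting $m \to +\infty$, we obtain
\[
\mu^{(s)}_{10\delta}(C_t) \le \frac{1}{t}\, \mu^{(s)}(U \cap A) \le \frac{1}{t} \big( \mu^{(s)}(C_t) + \varepsilon \big).
\]
Letting $\delta \searrow 0$, we see that
\[
\mu^{(s)}(C_t) \le \frac{1}{t} \big( \mu^{(s)}(C_t) + \varepsilon \big).
\]
Since $\varepsilon > 0$ was arbitrary, we finally have that
\[
\mu^{(s)}(C_t) \le \frac{1}{t}\, \mu^{(s)}(C_t), \qquad \text{i.e.,} \qquad \mu^{(s)}(C_t) = 0
\]
(recall that $t > 1$ and $\mu^{(s)}(C_t) \le \mu^{(s)}(A) < +\infty$). This proves (1.29). Next we show that
\[
\frac{1}{2^s} \le \limsup_{r \searrow 0} \frac{\mu^{(s)}(\overline{B}_r(x) \cap A)}{(2r)^s} \qquad \text{for } \mu^{(s)}\text{-a.a. } x \in A.
\]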
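For given $\xi, \delta \in (0, 1)$, we introduce the set $A(\delta, \xi) \subseteq A$, defined by
\[
A(\delta, \xi) \stackrel{\mathrm{df}}{=} \Big\{ x \in A : \ \mu^{(s)}_{\delta}(C \cap A) \le \xi\, \delta(C)^s \ \text{for all } C \subseteq \mathbb{R}^N \text{ with } \delta(C) \le \delta \text{ and } x \in C \Big\}.
\]
Let $\{C_n\}_{n \ge 1}$ be a $\delta$-cover of $A(\delta, \xi)$, such that
\[
A(\delta, \xi) \subseteq \bigcup_{n=1}^{\infty} C_n, \qquad \delta(C_n) \le \delta \qquad \text{and} \qquad C_n \cap A(\delta, \xi) \ne \emptyset \qquad \forall\, n \ge 1.
\]
So
\[
\mu^{(s)}_{\delta}\big(A(\delta, \xi)\big) \le \sum_{n=1}^{\infty} \mu^{(s)}_{\delta}\big(C_n \cap A(\delta, \xi)\big) \le \sum_{n=1}^{\infty} \mu^{(s)}_{\delta}(C_n \cap A) \le \sum_{n=1}^{\infty} \xi\, \delta(C_n)^s
\]
and from Definition 1.3.5, we see that
\[
\mu^{(s)}_{\delta}\big(A(\delta, \xi)\big) \le \xi\, \mu^{(s)}_{\delta}\big(A(\delta, \xi)\big). \tag{1.31}
\]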
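Since $0 < \xi < 1$ and $\mu^{(s)}_{\delta}\big(A(\delta, \xi)\big) < +\infty$, we have
\[
\mu^{(s)}_{\delta}\big(A(\delta, \xi)\big) = 0.
\]
In particular, from Proposition 1.3.24, we see that
\[
\mu^{(s)}\big(A(\delta, 1 - \delta)\big) = 0. \tag{1.32}
\]
Set
\[
D_\infty \stackrel{\mathrm{df}}{=} \Big\{ x \in A : \ \limsup_{r \searrow 0} \frac{\mu^{(s)}(\overline{B}_r(x) \cap A)}{(2r)^s} < \frac{1}{2^s} \Big\}.
\]
If $x \in D_\infty$, then we can find $\delta > 0$, such that
\[
\frac{\mu^{(s)}(\overline{B}_r(x) \cap A)}{(2r)^s} \le \frac{1 - \delta}{2^s} \qquad \forall\, r \in (0, \delta]. \tag{1.33}
\]
For any $C \subseteq \mathbb{R}^N$, with $x \in C \cap A$ and $\delta(C) \le \delta$, from (1.33) applied with $r = \delta(C)$, we have
\[
\mu^{(s)}_{\delta}(C \cap A) \le \mu^{(s)}(C \cap A) \le \mu^{(s)}\big(\overline{B}_{\delta(C)}(x) \cap A\big) \le (1 - \delta)\, \delta(C)^s.
\]
So it follows that $x \in A(\delta, 1 - \delta)$. Therefore, we have
\[
D_\infty \subseteq \bigcup_{n=1}^{\infty} A\Big(\frac{1}{n}, 1 - \frac{1}{n}\Big),
\]
and, using also (1.32), we have $\mu^{(s)}(D_\infty) = 0$. Thus the lower bound $\frac{1}{2^s} \le \limsup_{r \searrow 0} \frac{\mu^{(s)}(\overline{B}_r(x) \cap A)}{(2r)^s}$ holds for $\mu^{(s)}$-a.a. $x \in A$.

For a given locally integrable function, we can estimate the Hausdorff measure of the set where the function is locally large. To do this we shall need the so-called Lebesgue differentiation theorem or Lebesgue-Besicovitch differentiation theorem.

THEOREM 1.4.6 (Lebesgue-Besicovitch Differentiation Theorem) If $f \in L^1_{\mathrm{loc}}\big(\mathbb{R}^N; \mathbb{R}^M\big)$, then
\[
\lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(x))} \int_{\overline{B}_r(x)} \big\| f(y) - f(x) \big\|_{\mathbb{R}^M}\, d\lambda^N(y) = 0 \qquad \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N.
\]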
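PROOF Let $D = \{u_n\}_{n \ge 1}$ be a dense subset of $\mathbb{R}^M$. Then by the classical differentiation theorem of Lebesgue (see for example Cohn (1980, p. 190)), for every $n \ge 1$ we have
\[
\lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(x))} \int_{\overline{B}_r(x)} \big\| f(y) - u_n \big\|_{\mathbb{R}^M}\, d\lambda^N(y) = \big\| f(x) - u_n \big\|_{\mathbb{R}^M} \tag{1.34}
\]
for $\lambda^N$-a.a. $x \in \mathbb{R}^N$. Suppose that $x \in \mathbb{R}^N$ is a point for which (1.34) is valid for all $n \ge 1$. For a given $\varepsilon > 0$, we can choose $u_n$, such that
\[
\big\| f(x) - u_n \big\|_{\mathbb{R}^M} < \varepsilon.
\]
Then we have
\[
\begin{aligned}
&\limsup_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(x))} \int_{\overline{B}_r(x)} \big\| f(y) - f(x) \big\|_{\mathbb{R}^M}\, d\lambda^N(y) \\
&\qquad \le \lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(x))} \int_{\overline{B}_r(x)} \big\| f(y) - u_n \big\|_{\mathbb{R}^M}\, d\lambda^N(y) + \big\| u_n - f(x) \big\|_{\mathbb{R}^M} < 2\varepsilon.
\end{aligned}
\]
Since $\varepsilon > 0$ was arbitrary, we conclude that
\[
\lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(x))} \int_{\overline{B}_r(x)} \big\| f(y) - f(x) \big\|_{\mathbb{R}^M}\, d\lambda^N(y) = 0.
\]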
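COROLLARY 1.4.7 If $f \in L^\infty_{\mathrm{loc}}\big(\mathbb{R}^N; \mathbb{R}^M\big)$, then
\[
\lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(x))} \int_{\overline{B}_r(x)} f(y)\, d\lambda^N(y) = f(x) \qquad \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N.
\]

PROOF Note that
\[
\lim_{r \searrow 0} \Big\| \frac{1}{\lambda^N(\overline{B}_r(x))} \int_{\overline{B}_r(x)} f(y)\, d\lambda^N(y) - f(x) \Big\|_{\mathbb{R}^M} \le \lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(x))} \int_{\overline{B}_r(x)} \big\| f(y) - f(x) \big\|_{\mathbb{R}^M}\, d\lambda^N(y).
\]
So the corollary follows at once from Theorem 1.4.6.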
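The averaging statement of Corollary 1.4.7 can be observed numerically in the case $N = M = 1$, where $\overline{B}_r(x) = [x - r, x + r]$. The following is a minimal sketch under stated assumptions: the integral is approximated by a midpoint rule, and the names `ball_average` and `step` are hypothetical helpers introduced only for this illustration.

```python
from typing import Callable


def ball_average(f: Callable[[float], float], x: float, r: float, samples: int = 10_000) -> float:
    """Midpoint-rule approximation of (1/(2r)) * integral of f over [x - r, x + r]."""
    h = 2.0 * r / samples
    total = sum(f(x - r + (i + 0.5) * h) for i in range(samples))
    return total * h / (2.0 * r)


def step(y: float) -> float:
    """Indicator function of [0, +infinity)."""
    return 1.0 if y >= 0 else 0.0


if __name__ == "__main__":
    for r in (0.4, 0.1, 0.01):
        # At x = 0.5 the averages equal step(0.5) = 1 for small r;
        # at x = 0 they stay at 1/2, which differs from step(0) = 1.
        print(r, ball_average(step, 0.5, r), ball_average(step, 0.0, r))
```

The point $x = 0$ shows why the conclusion holds only almost everywhere: the averages of the step function there remain at $1/2$ for every $r$, but the exceptional set $\{0\}$ is Lebesgue-null.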
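REMARK 1.4.8 Theorem 1.4.6 and Corollary 1.4.7 remain valid if $\lambda^N$ is replaced by any Radon measure on $\mathbb{R}^N$. Also we may replace the ball $\overline{B}_r(x)$ by any other measurable sets $S_r(x)$ containing $x$ which shrink to the point $x \in \mathbb{R}^N$ as $r \searrow 0$. For example we can take $S_r(x)$ to be an $N$-cube with edges of length $2r$. If $N = 1$, we may take for example the intervals
\[
[x - h, x], \qquad [x, x + h] \qquad \text{or} \qquad [x - h, x + h].
\]
In Proposition 2.1.22 we shall see that the results are also valid for Banach space valued functions, i.e., $\mathbb{R}^M$ is replaced by a Banach space.

Now we are ready to estimate the Hausdorff measure of the set where a function $f \in L^1_{\mathrm{loc}}\big(\mathbb{R}^N; \mathbb{R}\big)$ is locally large.

THEOREM 1.4.9 If $f \in L^1_{\mathrm{loc}}\big(\mathbb{R}^N; \mathbb{R}\big)$, $0 \le s < N$ and
\[
C_s \stackrel{\mathrm{df}}{=} \Big\{ x \in \mathbb{R}^N : \ \limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(x)} \big| f(y) \big|\, d\lambda^N(y) > 0 \Big\},
\]
then $\mu^{(s)}(C_s) = 0$.

PROOF It is clear that without any loss of generality, we may assume that $f \in L^1(\mathbb{R}^N; \mathbb{R})$. By virtue of Corollary 1.4.7, we have that
\[
\lim_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(x)} \big| f(y) \big|\, d\lambda^N(y) = 0 \qquad \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N
\]
(recall that $0 \le s < N$). So $\lambda^N(C_s) = 0$. Let $\varepsilon > 0$, $\delta > 0$ and $\xi > 0$ be given. Since $f \in L^1(\mathbb{R}^N; \mathbb{R})$, from the absolute continuity of the Lebesgue integral, we know that we can find $\vartheta > 0$, such that
\[
\int_A \big| f(y) \big|\, d\lambda^N(y) < \xi \qquad \forall\, A \subseteq \mathbb{R}^N \text{ measurable},\ \lambda^N(A) < \vartheta.
\]
We introduce the set $C_s^{\varepsilon} \subseteq C_s$ defined by
\[
C_s^{\varepsilon} \stackrel{\mathrm{df}}{=} \Big\{ x \in C_s : \ \limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(x)} \big| f(y) \big|\, d\lambda^N(y) > \varepsilon \Big\}.
\]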
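We have that $\lambda^N(C_s^{\varepsilon}) = 0$. So we can find an open set $U \subseteq \mathbb{R}^N$, such that $C_s^{\varepsilon} \subseteq U$ and $\lambda^N(U) < \vartheta$. Let us set
\[
\mathcal{T} \stackrel{\mathrm{df}}{=} \Big\{ \overline{B}_r(x) : \ x \in C_s^{\varepsilon},\ 0 < r < \delta,\ \overline{B}_r(x) \subseteq U \ \text{and} \ \int_{\overline{B}_r(x)} \big| f(y) \big|\, d\lambda^N(y) > \varepsilon r^s \Big\}.
\]
Invoking Proposition 1.2.1, we can find a sequence $\big\{\overline{B}_{r_n}(x_n)\big\}_{n \ge 1} \subseteq \mathcal{T}$ of disjoint balls, such that
\[
C_s^{\varepsilon} \subseteq \bigcup_{n=1}^{\infty} \overline{B}_{5 r_n}(x_n).
\]
From this it follows that
\[
\mu^{(s)}_{10\delta}(C_s^{\varepsilon}) \le \sum_{n=1}^{\infty} (10 r_n)^s \le \frac{10^s}{\varepsilon} \sum_{n=1}^{\infty} \int_{\overline{B}_{r_n}(x_n)} \big| f(y) \big|\, d\lambda^N(y) \le \frac{10^s}{\varepsilon} \int_U \big| f(y) \big|\, d\lambda^N(y) \le \frac{10^s}{\varepsilon}\, \xi.
\]
Let $\delta \searrow 0$ and then $\xi \searrow 0$, to conclude that $\mu^{(s)}(C_s^{\varepsilon}) = 0$. Since
\[
C_s = \bigcup_{n=1}^{\infty} C_s^{1/n},
\]
we conclude that $\mu^{(s)}(C_s) = 0$.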
1.5 Lipschitz Functions
In this section we derive some basic properties relating to the behaviour of Lipschitz continuous functions. A first such result was already established in Proposition 1.3.25.

DEFINITION 1.5.1 Let $C \subseteq \mathbb{R}^N$.
(a) A function $f : C \to \mathbb{R}^M$ is said to be Lipschitz continuous, if there exists a constant $c > 0$, such that
\[
\big\| f(x) - f(y) \big\|_{\mathbb{R}^M} \le c\, \| x - y \|_{\mathbb{R}^N} \qquad \forall\, x, y \in C.
\]
(b) If $f : C \to \mathbb{R}^M$ is Lipschitz continuous, then the Lipschitz constant $\mathrm{Lip}(f) \ge 0$ of $f$ is defined by
\[
\mathrm{Lip}(f) \stackrel{\mathrm{df}}{=} \sup_{\substack{x, y \in C \\ x \ne y}} \frac{\| f(x) - f(y) \|_{\mathbb{R}^M}}{\| x - y \|_{\mathbb{R}^N}}.
\]
(c) If $U \subseteq \mathbb{R}^N$ is open, a function $f : U \to \mathbb{R}^M$ is said to be locally Lipschitz, if for every $x \in U$, we can find a neighbourhood $V \subseteq U$ of $x$, such that $f|_V$ is Lipschitz continuous.

THEOREM 1.5.2 If $f : \mathbb{R}^N \to \mathbb{R}^M$, $f = (f_i)_{i=1}^{M}$ and $A \subseteq \mathbb{R}^N$ with $\lambda^N(A) > 0$, then
(a) $\dim \big( \mathrm{Gr}\,(f|_A) \big) \ge N$, where $\mathrm{Gr}\,(f|_A)$ is the graph of $f$ over $A$, defined by
\[
\mathrm{Gr}\,(f|_A) \stackrel{\mathrm{df}}{=} \big\{ (x, y) \in A \times \mathbb{R}^M : \ y = f(x) \big\};
\]
(b) if $f$ is Lipschitz continuous, then $\dim \big( \mathrm{Gr}\,(f|_A) \big) = N$.

PROOF (a) Let $P : \mathbb{R}^{N+M} \to \mathbb{R}^N$ be the projection operator. The operator $P$ is Lipschitz continuous with $\mathrm{Lip}(P) = 1$. By virtue of Theorem 1.3.21 and Proposition 1.3.25, we have that
\[
\frac{1}{c_N}\, \lambda^N(A) = \mu^{(N)}(A) = \mu^{(N)}\big( P\big( \mathrm{Gr}\,(f|_A) \big) \big) \le \mu^{(N)}\big( \mathrm{Gr}\,(f|_A) \big)
\]
and so $\dim \big( \mathrm{Gr}\,(f|_A) \big) \ge N$ (see Definition 1.3.8).
0
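1, then we define $\widehat{f} = \big(\widehat{f}_i\big)_{i=1}^{M}$ ($f_i$ are the component functions of $f$). We have
\[
\big\| \widehat{f}(x) - \widehat{f}(z) \big\|^2_{\mathbb{R}^M} = \sum_{i=1}^{M} \big| \widehat{f}_i(x) - \widehat{f}_i(z) \big|^2 \le M \big( \mathrm{Lip}(f) \big)^2 \| x - z \|^2_{\mathbb{R}^N},
\]
so
\[
\mathrm{Lip}\big(\widehat{f}\,\big) \le \sqrt{M}\, \mathrm{Lip}(f).
\]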
REMARK 1.5.5 Let $f : A \to \mathbb{R}$ and let $\overline{f} : X \to \overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$ be defined by
\[
\overline{f}(x) \stackrel{\mathrm{df}}{=}
\begin{cases}
f(x) & \text{if } x \in A, \\
+\infty & \text{if } x \in X \setminus A.
\end{cases}
\]
Then, as we shall see in Chapter 4, $\widehat{f} = \overline{f} \oplus \mathrm{Lip}(f) \| \cdot \|_X$, where $\oplus$ denotes the operation of infimal convolution (see Definition 4.4.6(b)). Since this operation preserves convexity, if $A \subseteq X$ is convex and $f : A \to \mathbb{R}$ is Lipschitz continuous and convex, then so is $\widehat{f} : X \to \mathbb{R}$. Also note that the extension $\widehat{f}$ obtained in Theorem 1.5.4 is maximal in the sense that if $g : X \to \mathbb{R}$ is any Lipschitz continuous function with $\mathrm{Lip}(g) \le \mathrm{Lip}(f)$, such that $g|_A = f$, then $g \le \widehat{f}$. Indeed note that
\[
g(x) - f(y) \le \mathrm{Lip}(f)\, \| x - y \|_X \qquad \forall\, x \in X,\ y \in A,
\]
hence $g(x) \le \widehat{f}(x)$. A minimal such extension can be obtained by considering the function
\[
\widetilde{f}(x) \stackrel{\mathrm{df}}{=} \sup_{a \in A} \big[ f(a) - \mathrm{Lip}(f)\, \| x - a \|_X \big].
\]
This extension is known as the McShane extension of $f$ and was obtained by McShane (1934), who was the first to study the problem of extension of Lipschitz continuous functions. Finally we mention that Kirszbraun (1934) produced an extension $\widehat{f}$ of a Lipschitz continuous function $f : A \to \mathbb{R}^M$, such that $\mathrm{Lip}(\widehat{f}\,) = \mathrm{Lip}(f)$ (see also Federer (1969, p. 201)).
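The maximal and minimal extensions of Remark 1.5.5 are easy to evaluate when $A$ is a finite subset of $\mathbb{R}$. The following is a minimal sketch under explicit assumptions: the maximal extension is taken to be $\widehat{f}(x) = \inf_{a \in A} [f(a) + \mathrm{Lip}(f)\,|x - a|]$, which is what the infimal convolution formula of the remark gives, and the function names are hypothetical helpers introduced for this illustration.

```python
from typing import Sequence


def lipschitz_constant(points: Sequence[float], values: Sequence[float]) -> float:
    """Lip(f) for f given by its values on a finite set of distinct real points."""
    n = len(points)
    return max(
        abs(values[i] - values[j]) / abs(points[i] - points[j])
        for i in range(n)
        for j in range(i + 1, n)
    )


def upper_extension(points: Sequence[float], values: Sequence[float], lip: float, x: float) -> float:
    """Maximal extension: inf over a in A of f(a) + Lip(f) * |x - a|."""
    return min(v + lip * abs(x - a) for a, v in zip(points, values))


def lower_extension(points: Sequence[float], values: Sequence[float], lip: float, x: float) -> float:
    """Minimal extension: sup over a in A of f(a) - Lip(f) * |x - a|."""
    return max(v - lip * abs(x - a) for a, v in zip(points, values))


if __name__ == "__main__":
    A = [0.0, 1.0, 3.0]
    f_on_A = [0.0, 1.0, 0.0]
    lip = lipschitz_constant(A, f_on_A)
    for x in (-1.0, 0.5, 2.0, 4.0):
        print(x, lower_extension(A, f_on_A, lip, x), upper_extension(A, f_on_A, lip, x))
```

Both functions agree with $f$ on $A$, and any other extension $g$ with $\mathrm{Lip}(g) \le \mathrm{Lip}(f)$ is squeezed between them.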
One of the main theorems concerning Lipschitz continuous functions is the so-called Rademacher's theorem, which asserts that a Lipschitz continuous function $f : \mathbb{R}^N \to \mathbb{R}^M$ is differentiable almost everywhere. This is the starting point for extending the subdifferential theory beyond the family of convex functions (see Chapter 4). First let us recall the following basic definition from multivariable calculus.

DEFINITION 1.5.6 Let $U \subseteq \mathbb{R}^N$ be an open set. We say that a function $f : U \to \mathbb{R}^M$ is differentiable (or Fréchet differentiable) at $x \in U$, if there exists $L(x) \in \mathcal{L}(\mathbb{R}^N; \mathbb{R}^M)$, such that
\[
\lim_{h \to 0} \frac{f(x + h) - f(x) - L(x) h}{\| h \|_{\mathbb{R}^N}} = 0.
\]
REMARK 1.5.7 Evidently $L(x)$ is unique; it is usually denoted by $Df(x)$ or $f'(x)$ and is called the derivative of $f$ at $x$. From multivariable calculus, we know that if $M = 1$, then
\[
Df(x) u = \sum_{k=1}^{N} \frac{\partial f}{\partial x_k}(x)\, u_k = \sum_{k=1}^{N} f'(x; e_k)\, u_k \qquad \forall\, u \in \mathbb{R}^N,
\]
where $f'(x; v)$ is the directional or Gâteaux derivative of $f$ at $x$ in the direction $v$, defined by
\[
f'(x; v) \stackrel{\mathrm{df}}{=} \lim_{\lambda \to 0} \frac{f(x + \lambda v) - f(x)}{\lambda},
\]
and $\{e_k\}_{k=1}^{N}$ is the canonical basis of $\mathbb{R}^N$.
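THEOREM 1.5.8 (Rademacher Theorem) If $U \subseteq \mathbb{R}^N$ is an open set and $f : U \to \mathbb{R}^M$ is a Lipschitz continuous function, then $f$ is differentiable at $\lambda^N$-almost all $x \in U$.

PROOF Clearly we may assume that $M = 1$. For any $u \in \mathbb{R}^N$ with $\| u \|_{\mathbb{R}^N} = 1$, we set
\[
f'(x; u) \stackrel{\mathrm{df}}{=} \lim_{\lambda \to 0} \frac{f(x + \lambda u) - f(x)}{\lambda} \qquad \forall\, x \in U,
\]
provided this limit exists.

Claim 1. $f'(x; u)$ exists for $\lambda^N$-almost all $x \in U$.

Let
\[
f'_{+}(x; u) \stackrel{\mathrm{df}}{=} \limsup_{\lambda \to 0} \frac{f(x + \lambda u) - f(x)}{\lambda} \qquad \forall\, x \in U,
\]
\[
f'_{-}(x; u) \stackrel{\mathrm{df}}{=} \liminf_{\lambda \to 0} \frac{f(x + \lambda u) - f(x)}{\lambda} \qquad \forall\, x \in U.
\]
Evidently if
\[
C_u \stackrel{\mathrm{df}}{=} \big\{ x \in U : \ f'(x; u) \text{ does not exist} \big\},
\]
then
\[
C_u = \big\{ x \in U : \ f'_{-}(x; u) < f'_{+}(x; u) \big\}.
\]
Note that
\[
f'_{+}(x; u) = \inf_{k \ge 1} \ \sup_{\substack{0 < |\lambda| < \frac{1}{k} \\ \lambda \in \mathbb{Q}}} \frac{f(x + \lambda u) - f(x)}{\lambda},
\]
so $f'_{+}(\cdot; u)$ is a Borel measurable function. Similarly we show that $f'_{-}(\cdot; u)$ is a Borel measurable function. It follows that $C_u \in \mathcal{B}(U)$ (i.e., $C_u \subseteq U$ is a Borel measurable set). Next for every $x, u \in \mathbb{R}^N$ with $\| u \|_{\mathbb{R}^N} = 1$, let $\varphi : \mathbb{R} \to \mathbb{R}$ be defined by
\[
\varphi(\lambda) \stackrel{\mathrm{df}}{=} f(x + \lambda u) \qquad \forall\, \lambda \in \mathbb{R}.
\]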
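The function $\varphi$ is Lipschitz continuous, hence absolutely continuous, and so by the fundamental theorem of Lebesgue calculus (see Theorem A.2.20), it is differentiable at almost every $\lambda \in \mathbb{R}$. Therefore
\[
\mu^{(1)}(C_u \cap L) = 0
\]
for every line $L$ parallel to the direction $u$. Hence by Fubini's theorem, we have $\lambda^N(C_u) = 0$ and this proves the claim.

From Claim 1 and Remark 1.5.7, we see that
\[
\nabla f(x) = \Big( \frac{\partial f}{\partial x_k}(x) \Big)_{k=1}^{N} \qquad \text{exists for } \lambda^N\text{-a.a. } x \in U.
\]

Claim 2. $f'(x; u) = \big(u, \nabla f(x)\big)_{\mathbb{R}^N}$ for $\lambda^N$-almost all $x \in U$.

Let $\vartheta \in C_c^{\infty}(U)$. We have
\[
\int_U \frac{f(x + \lambda u) - f(x)}{\lambda}\, \vartheta(x)\, d\lambda^N(x) = - \int_U f(x)\, \frac{\vartheta(x) - \vartheta(x - \lambda u)}{\lambda}\, d\lambda^N(x).
\]
Let $\lambda = \frac{1}{k}$ for $k \ge 1$. Since $f$ is Lipschitz continuous, we have
\[
\Big| \frac{f\big(x + \frac{1}{k} u\big) - f(x)}{\frac{1}{k}} \Big| \le \mathrm{Lip}(f)\, \| u \|_{\mathbb{R}^N} = \mathrm{Lip}(f).
\]
Therefore when $k \to +\infty$, from the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that
\[
\begin{aligned}
\int_U f'(x; u)\, \vartheta(x)\, d\lambda^N(x) &= - \int_U f(x)\, \vartheta'(x; u)\, d\lambda^N(x) \\
&= - \sum_{k=1}^{N} u_k \int_U f(x)\, \frac{\partial \vartheta}{\partial x_k}(x)\, d\lambda^N(x) = \sum_{k=1}^{N} u_k \int_U \frac{\partial f}{\partial x_k}(x)\, \vartheta(x)\, d\lambda^N(x) \\
&= \int_U \big(u, \nabla f(x)\big)_{\mathbb{R}^N}\, \vartheta(x)\, d\lambda^N(x).
\end{aligned}
\]
Because $\vartheta \in C_c^{\infty}(U)$ is arbitrary, it follows that
\[
f'(x; u) = \big(u, \nabla f(x)\big)_{\mathbb{R}^N} \qquad \text{for } \lambda^N\text{-a.a. } x \in U.
\]
This proves the second claim.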
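Let $\{u_n\}_{n \ge 1}$ be a dense subset of $\partial B_1(0)$. For $n \ge 1$, we define
\[
E_n \stackrel{\mathrm{df}}{=} \Big\{ x \in U : \ f'(x; u_n) \text{ and } \nabla f(x) \text{ exist and } f'(x; u_n) = \big(u_n, \nabla f(x)\big)_{\mathbb{R}^N} \Big\}
\]
and
\[
E \stackrel{\mathrm{df}}{=} \bigcap_{n=1}^{\infty} E_n.
\]
By virtue of Claim 2, we have that $\lambda^N(U \setminus E) = 0$.

Claim 3. $f$ is differentiable at every $x \in E$.

Let $x \in E$, $u \in \partial B_1(0)$, $\lambda \in \mathbb{R} \setminus \{0\}$ and set
\[
\eta(x, u, \lambda) \stackrel{\mathrm{df}}{=} \frac{f(x + \lambda u) - f(x)}{\lambda} - \big(u, \nabla f(x)\big)_{\mathbb{R}^N}.
\]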
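If $v \in \partial B_1(0)$, we have
\[
\begin{aligned}
\big| \eta(x, u, \lambda) - \eta(x, v, \lambda) \big| &\le \Big| \frac{f(x + \lambda u) - f(x + \lambda v)}{\lambda} \Big| + \big| \big(u - v, \nabla f(x)\big)_{\mathbb{R}^N} \big| \\
&\le \mathrm{Lip}(f)\, \| u - v \|_{\mathbb{R}^N} + \big\| \nabla f(x) \big\|_{\mathbb{R}^N} \| u - v \|_{\mathbb{R}^N}.
\end{aligned}
\]
Note that
\[
\Big| \frac{\partial f}{\partial x_k}(x) \Big| \le \mathrm{Lip}(f) \qquad \forall\, k \in \{1, \ldots, N\}
\]
and so
\[
\big\| \nabla f(x) \big\|_{\mathbb{R}^N} \le \sqrt{N}\, \mathrm{Lip}(f) \le 2 \sqrt{N}\, \mathrm{Lip}(f)
\]
(see the proof of Theorem 1.5.4). Therefore, we have
\[
\big| \eta(x, u, \lambda) - \eta(x, v, \lambda) \big| \le \big(1 + 2\sqrt{N}\big)\, \mathrm{Lip}(f)\, \| u - v \|_{\mathbb{R}^N}. \tag{1.38}
\]
Let $\varepsilon > 0$ be given. We can choose $l \ge 1$ large enough so that
\[
\forall\, v \in \partial B_1(0) \ \ \exists\, k \in \{1, \ldots, l\} : \quad \| v - u_k \|_{\mathbb{R}^N} \le \frac{\varepsilon}{2 \big(1 + 2\sqrt{N}\big)\, \mathrm{Lip}(f)}. \tag{1.39}
\]
As $x \in E$, we have
\[
\lim_{\lambda \to 0} \eta(x, u_k, \lambda) = 0.
\]
So we can find $\delta > 0$, such that
\[
\big| \eta(x, u_k, \lambda) \big| < \frac{\varepsilon}{2} \qquad \forall\, 0 < |\lambda| < \delta,\ k \in \{1, \ldots, l\}. \tag{1.40}
\]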
Thus from (1.38), (1.39) and (1.40), for every $v \in \partial B_1(0)$, we can find $k \in \{1, \ldots, l\}$, such that for all $0 < |\lambda| < \delta$, we have
\[
\big| \eta(x, v, \lambda) \big| \le \big| \eta(x, u_k, \lambda) \big| + \big| \eta(x, v, \lambda) - \eta(x, u_k, \lambda) \big| < \varepsilon. \tag{1.41}
\]
We emphasize that $\delta > 0$ is independent of $v \in \partial B_1(0)$. Let $y \in U$, $y \ne x$ and let us set
\[
v \stackrel{\mathrm{df}}{=} \frac{y - x}{\| y - x \|_{\mathbb{R}^N}} \in \partial B_1(0).
\]
We have $y = x + \lambda v$, with $\lambda = \| y - x \|_{\mathbb{R}^N}$. Then
\[
f(y) - f(x) - \big(\nabla f(x), y - x\big)_{\mathbb{R}^N} = f(x + \lambda v) - f(x) - \lambda \big(\nabla f(x), v\big)_{\mathbb{R}^N} = o(\lambda) = o\big( \| y - x \|_{\mathbb{R}^N} \big) \qquad \text{as } y \to x.
\]
Therefore $f$ is differentiable at $x \in E$ with $Df(x) = \nabla f(x)$. This proves the claim and the theorem.

COROLLARY 1.5.9 If $U \subseteq \mathbb{R}^N$ is an open set and $f : U \to \mathbb{R}^M$ is a locally Lipschitz function, then $f$ is differentiable at $\lambda^N$-almost all $x \in U$.

PROOF Again we may assume that $M = 1$. Note that since $f$ is locally Lipschitz, it is Lipschitz continuous when restricted to any compact set $K \subseteq U$. Indeed, if this is not true, then we can find a compact set $K \subseteq U$ and two sequences $\{x_n\}_{n \ge 1}, \{y_n\}_{n \ge 1} \subseteq K$, such that
\[
n\, \| x_n - y_n \|_{\mathbb{R}^N} < \big| f(x_n) - f(y_n) \big| \qquad \forall\, n \ge 1.
\]
Note that
\[
\| x_n - y_n \|_{\mathbb{R}^N} \le \frac{2}{n} \max_K |f| \qquad \forall\, n \ge 1.
\]
Since $K$ is compact, we can produce two subsequences $\{x_{n_k}\}_{k \ge 1}$ of $\{x_n\}_{n \ge 1}$ and $\{y_{n_k}\}_{k \ge 1}$ of $\{y_n\}_{n \ge 1}$, such that
\[
x_{n_k} \to v \qquad \text{and} \qquad y_{n_k} \to v,
\]
for some $v \in K$, which contradicts the fact that in a neighbourhood of $v$, the function $f$ is Lipschitz continuous.

Next let $\{U_n\}_{n \ge 1}$ be an increasing sequence of bounded open subsets of $U$, such that $U = \bigcup_{n=1}^{\infty} U_n$, for example let
\[
U_n \stackrel{\mathrm{df}}{=} \Big\{ x \in U : \ \| x \|_{\mathbb{R}^N} < n,\ \frac{1}{n} < d(x, \partial U) \Big\} \qquad \forall\, n \ge 1.
\]
Then by virtue of Theorem 1.5.8, $f$ is differentiable at $\lambda^N$-almost all $x \in U_n$, for $n \ge 1$. Therefore $f$ is differentiable $\lambda^N$-almost everywhere on $U$.
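COROLLARY 1.5.10 If $U \subseteq \mathbb{R}^N$ is an open set, $f : U \to \mathbb{R}^M$ is a locally Lipschitz function and
\[
Z \stackrel{\mathrm{df}}{=} \big\{ x \in U : \ f(x) = 0 \big\},
\]
then $Df(x) = 0$ for $\lambda^N$-almost all $x \in Z$.

PROOF As before we may assume that $M = 1$. Also suppose that $\lambda^N(Z) > 0$, or otherwise there is nothing to prove. By virtue of Theorems 1.4.1 and 1.5.8, for $\lambda^N$-almost all $x \in Z$ the derivative $Df(x)$ exists and
\[
\lim_{r \searrow 0} \frac{\lambda^N(\overline{B}_r(x) \cap Z)}{\lambda^N(\overline{B}_r(x))} = 1. \tag{1.42}
\]
Fix such an $x$. Since $f(x) = 0$, we have
\[
f(y) = \big(\nabla f(x), y - x\big)_{\mathbb{R}^N} + o\big( \| y - x \|_{\mathbb{R}^N} \big) \qquad \text{as } y \to x. \tag{1.43}
\]
Suppose that
\[
v \stackrel{\mathrm{df}}{=} \nabla f(x) \ne 0.
\]
We introduce the set
\[
C \stackrel{\mathrm{df}}{=} \Big\{ u \in \partial B_1(0) : \ (v, u)_{\mathbb{R}^N} \ge \frac{\| v \|_{\mathbb{R}^N}}{2} \Big\}.
\]
For a given $u \in C$ and $t > 0$, in (1.43), we set $y = x + t u$ and we have
\[
f(x + t u) = t \big(\nabla f(x), u\big)_{\mathbb{R}^N} + o\big(t \| u \|_{\mathbb{R}^N}\big) \ge \frac{t\, \| v \|_{\mathbb{R}^N}}{2} + o(t),
\]
so
\[
\frac{f(x + t u)}{t} \ge \frac{\| v \|_{\mathbb{R}^N}}{2} + \frac{o(t)}{t}
\]
and
\[
f(x + t u) > 0 \qquad \forall\, t \in (0, t^*),\ u \in C,
\]
for some $t^* > 0$. So $Z$ misses the cone $\{x + t u : \ u \in C,\ 0 < t < t^*\}$, which occupies a fixed positive fraction of every small ball centered at $x$, a contradiction to (1.42).

COROLLARY 1.5.11 If $f_1, f_2 : \mathbb{R}^N \to \mathbb{R}^N$ are locally Lipschitz and
\[
E \stackrel{\mathrm{df}}{=} \big\{ x \in \mathbb{R}^N : \ (f_2 \circ f_1)(x) = x \big\},
\]
then $D(f_2 \circ f_1)(x) = Df_2\big(f_1(x)\big)\, Df_1(x) = \mathrm{id}_{\mathbb{R}^N}$ for $\lambda^N$-almost all $x \in E$.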
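PROOF Let
\[
\mathrm{dom}\, f_i \stackrel{\mathrm{df}}{=} \big\{ x \in \mathbb{R}^N : \ Df_i(x) \text{ exists} \big\}, \qquad \text{for } i = 1, 2.
\]
We set
\[
C \stackrel{\mathrm{df}}{=} E \cap \mathrm{dom}\, f_1 \cap f_1^{-1}(\mathrm{dom}\, f_2).
\]
If $x \in E \setminus f_1^{-1}(\mathrm{dom}\, f_2)$, then $f_1(x) \in \mathbb{R}^N \setminus \mathrm{dom}\, f_2$ and so
\[
x = f_2\big(f_1(x)\big) = (f_2 \circ f_1)(x) \in f_2\big( \mathbb{R}^N \setminus \mathrm{dom}\, f_2 \big).
\]
It follows that
\[
E \setminus C \subseteq \big( \mathbb{R}^N \setminus \mathrm{dom}\, f_1 \big) \cup f_2\big( \mathbb{R}^N \setminus \mathrm{dom}\, f_2 \big). \tag{1.44}
\]
Invoking Theorem 1.5.8, from (1.44), we infer that $\lambda^N(E \setminus C) = 0$ (recall that a Lipschitz continuous function maps Lebesgue-null sets to Lebesgue-null sets). If $x \in C$, then from the definition of $C$, we see that $Df_1(x)$ and $Df_2\big(f_1(x)\big)$ exist and so
\[
Df_2\big(f_1(x)\big)\, Df_1(x) = D(f_2 \circ f_1)(x)
\]
(chain rule). Since $(f_2 \circ f_1)(x) - x = 0$ for all $x \in E$, from Corollary 1.5.10, we infer that
\[
Df_2\big(f_1(x)\big)\, Df_1(x) = \mathrm{id}_{\mathbb{R}^N} \qquad \text{for } \lambda^N\text{-a.a. } x \in E.
\]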
Continuing with our investigation of Lipschitz continuous maps $f : \mathbb{R}^N \to \mathbb{R}^M$, we aim at deriving change of variables formulas. We distinguish two cases. In the first case $N \le M$ and the change of variables formula is obtained via the so-called area formula, which asserts that the $N$-dimensional measure of $f(A)$ can be calculated by integrating a suitable Jacobian. In the second case $M \le N$ and the change of variables formula passes through the so-called coarea formula, which asserts that the integral of the $(N - M)$-dimensional measure of the level sets of $f$ is computed by integrating a suitable Jacobian. First we derive the area formula and for this we need some preparation. We start with a result from linear algebra, known as polar decomposition. It produces for a linear operator $L : \mathbb{R}^N \to \mathbb{R}^M$ an analog of the polar representation $z = r e^{i\vartheta}$ of a complex number. First some definitions.
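DEFINITION 1.5.12
(a) A linear operator $U : \mathbb{R}^N \to \mathbb{R}^M$ is said to be orthogonal, if
\[
\big(U x, U y\big)_{\mathbb{R}^M} = (x, y)_{\mathbb{R}^N} \qquad \forall\, x, y \in \mathbb{R}^N.
\]
(b) For a given linear operator $L : \mathbb{R}^N \to \mathbb{R}^M$, its adjoint $L^* : \mathbb{R}^M \to \mathbb{R}^N$ is defined by
\[
\big(L x, y\big)_{\mathbb{R}^M} = \big(x, L^* y\big)_{\mathbb{R}^N} \qquad \forall\, x \in \mathbb{R}^N,\ y \in \mathbb{R}^M.
\]
(c) A linear operator $L : \mathbb{R}^N \to \mathbb{R}^N$ is self-adjoint, if $L^* = L$.
(d) A linear operator $S : \mathbb{R}^k \to \mathbb{R}^k$ is said to be positive (we write $S \ge 0$), if it is self-adjoint (i.e., $S = S^*$) and
\[
\big(S x, x\big)_{\mathbb{R}^k} \ge 0 \qquad \forall\, x \in \mathbb{R}^k.
\]

REMARK 1.5.13 If $N = M$, then $U : \mathbb{R}^N \to \mathbb{R}^N$ is orthogonal if and only if $U^* = U^{-1}$ (hence $U$ is invertible). In general, if $U : \mathbb{R}^N \to \mathbb{R}^M$ is orthogonal, then $N \le M$ and $U^* \circ U = \mathrm{id}_{\mathbb{R}^N}$. Also a linear operator $U : \mathbb{R}^N \to \mathbb{R}^M$ is orthogonal if and only if it is an isometry. Finally, if $S : \mathbb{R}^N \to \mathbb{R}^N$ is self-adjoint, then we can find an orthogonal operator $U : \mathbb{R}^N \to \mathbb{R}^N$ and a diagonal operator $D : \mathbb{R}^N \to \mathbb{R}^N$, such that $S = U \circ D \circ U^{-1}$. A positive operator $S : \mathbb{R}^N \to \mathbb{R}^N$ has a unique positive square root $T : \mathbb{R}^N \to \mathbb{R}^N$, i.e., $T^2 = S$.

THEOREM 1.5.14 (Polar Decomposition Theorem) Let $L : \mathbb{R}^N \to \mathbb{R}^M$ be a linear operator.
(a) If $N \le M$, then there exist a positive operator $S : \mathbb{R}^N \to \mathbb{R}^N$ and an orthogonal operator $U : \mathbb{R}^N \to \mathbb{R}^M$, such that $L = U \circ S$.
(b) If $M \le N$, then there exist a positive operator $S : \mathbb{R}^M \to \mathbb{R}^M$ and an orthogonal operator $U : \mathbb{R}^M \to \mathbb{R}^N$, such that $L = S \circ U^*$.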
PROOF (a) Since $L^{**} = L$, the operator $L^* \circ L : \mathbb{R}^N \to \mathbb{R}^N$ is clearly positive. So it admits a unique positive square root $S : \mathbb{R}^N \to \mathbb{R}^N$ (see Remark 1.5.13). For each $y = S x \in R(S)$, we write $U y = L x$, motivated by the fact that eventually we must have $L = U \circ S$. First, we need to show that with this definition $U$ is unambiguously defined on $R(S)$, that is if $S x_1 = S x_2$, then $L x_1 = L x_2$. Note that $\| S x \|^2_{\mathbb{R}^N} = \big(S^2 x, x\big)_{\mathbb{R}^N} = \big(L^* L x, x\big)_{\mathbb{R}^N} = \| L x \|^2_{\mathbb{R}^M}$ for all $x \in \mathbb{R}^N$. So $S x_1 = S x_2$ is equivalent to saying that $\big\| S(x_1 - x_2) \big\|_{\mathbb{R}^N} = 0$ and this condition implies that
\[
\big\| L(x_1 - x_2) \big\|_{\mathbb{R}^M} = 0.
\]
Therefore $U$ is well defined on $R(S)$ and its range equals $R(L)$. Note that $L$ and $S$ have the same kernel. So
\[
\dim R(L) = \dim R(S) \qquad \text{and} \qquad \dim R(S)^{\perp} = N - \dim R(S) \le M - \dim R(L) = \dim R(L)^{\perp}.
\]
Therefore there exists a linear isometry $U_0 : R(S)^{\perp} \to R(L)^{\perp}$. We extend $U$ to $R(S)^{\perp}$ by setting it equal to $U_0$. Since $\mathbb{R}^N = R(S) \oplus R(S)^{\perp}$, every $y \in \mathbb{R}^N$ can be written in a unique way as
\[
y = S x + u, \qquad \text{with } u \in R(S)^{\perp}.
\]
We set $U y = L x + U_0 u$ and we have $U : \mathbb{R}^N \to \mathbb{R}^M$, which is linear and well defined. Also, exploiting the orthogonality of $R(S)$ and $R(S)^{\perp}$ and of $R(L)$ and $R(L)^{\perp}$, we have
\[
\begin{aligned}
\big(U y, U y\big)_{\mathbb{R}^M} &= \big(L x + U_0 u, L x + U_0 u\big)_{\mathbb{R}^M} = \big(L x, L x\big)_{\mathbb{R}^M} + \big(U_0 u, U_0 u\big)_{\mathbb{R}^M} \\
&= \big(S x, S x\big)_{\mathbb{R}^N} + \big(u, u\big)_{\mathbb{R}^N} = \big(y, y\big)_{\mathbb{R}^N},
\end{aligned}
\]
so $U$ is orthogonal and $U \circ S = L$.

(b) Follows if we apply (a) to the operator $L^* : \mathbb{R}^M \to \mathbb{R}^N$.

REMARK 1.5.15 In a polar decomposition $L = U \circ S$, the positive operator $S$ is unique. Indeed suppose that $U \circ S = U_1 \circ S_1$. Then by taking adjoints, we obtain $S \circ U^* = S_1 \circ U_1^*$ and so
\[
S^2 = S \circ U^* \circ U \circ S = S_1 \circ U_1^* \circ U_1 \circ S_1 = S_1^2.
\]
The positive operator $S^2 = S_1^2$ has a unique positive square root, hence $S = S_1$. Moreover, if $N = M$ and the operator $L$ is invertible, then in the polar decomposition $L = U \circ S$, the orthogonal operator $U$ is unique too. Indeed, since $L$ is invertible, so is $S$ (since $S = U^{-1} \circ L$). Then from $U \circ S = U_1 \circ S_1$ and since $S^{-1} = S_1^{-1}$, we have that
\[
U = U_1 \circ S_1 \circ S^{-1} = U_1 \circ S_1 \circ S_1^{-1} = U_1.
\]
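We can use Theorem 1.5.14 to define the Jacobian of a Lipschitz continuous map $f : \mathbb{R}^N \to \mathbb{R}^M$.

DEFINITION 1.5.16 Let $L : \mathbb{R}^N \to \mathbb{R}^M$ be a linear operator.
(a) If $N \le M$ and $L = U \circ S$ is a polar decomposition of $L$ (see Theorem 1.5.14), then we define the Jacobian of $L$ to be
\[
\mathrm{jac}\, L \stackrel{\mathrm{df}}{=} \big| \det S \big|.
\]
(b) If $M \le N$ and $L = S \circ U^*$ is a polar decomposition of $L$ (see Theorem 1.5.14), then we define the Jacobian of $L$ to be
\[
\mathrm{jac}\, L \stackrel{\mathrm{df}}{=} \big| \det S \big|.
\]
(c) If $f : \mathbb{R}^N \to \mathbb{R}^M$ is Lipschitz continuous and
\[
Df =
\begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_N} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f_M}{\partial x_1} & \cdots & \dfrac{\partial f_M}{\partial x_N}
\end{pmatrix}
\]
is the $M \times N$-gradient matrix, then the Jacobian of $f$ is defined by
\[
Jf(x) \stackrel{\mathrm{df}}{=} \mathrm{jac}\, Df(x) \qquad \text{for } \lambda^N\text{-a.a. } x \in \mathbb{R}^N.
\]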
REMARK 1.5.17 Since in a polar decomposition the positive operator is uniquely defined (see Remark 1.5.15), the notions introduced in Definition 1.5.16 are well defined. If $L : \mathbb{R}^N \to \mathbb{R}^M$ is a linear operator, then we can easily check that if $N \le M$, we have
\[
(\mathrm{jac}\, L)^2 = \det(L^* \circ L),
\]
while if $M \le N$, we have
\[
(\mathrm{jac}\, L)^2 = \det(L \circ L^*).
\]
Another expression computing $(\mathrm{jac}\, L)^2$ is given by the so-called Binet-Cauchy formula. So let $N \le M$ and set
\[
\Theta(N, M) \stackrel{\mathrm{df}}{=} \big\{ \theta : \{1, \ldots, N\} \to \{1, \ldots, M\} \ \text{is increasing} \big\}.
\]
For each $\theta \in \Theta(N, M)$, we define $P_\theta : \mathbb{R}^M \to \mathbb{R}^N$ by
\[
P_\theta(x_1, \ldots, x_M) \stackrel{\mathrm{df}}{=} \big(x_{\theta(1)}, \ldots, x_{\theta(N)}\big).
\]
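Clearly $P_\theta$ is the projection operator of $\mathbb{R}^M$ on some $N$-dimensional subspace $V = \mathrm{span}\, \{e_{\theta(k)}\}_{k=1}^{N}$. Then, if $L : \mathbb{R}^N \to \mathbb{R}^M$ is a linear operator, a straightforward but cumbersome proof gives the Binet-Cauchy formula:
\[
(\mathrm{jac}\, L)^2 = \sum_{\theta \in \Theta(N, M)} \det(P_\theta \circ L)^2.
\]
For details we refer to Evans & Gariepy (1992, p. 89).

LEMMA 1.5.18 If $L : \mathbb{R}^N \to \mathbb{R}^M$ is a linear operator, $N \le M$ and $A \subseteq \mathbb{R}^N$, then
\[
\mu^{(N)}\big(L(A)\big) = \mathrm{jac}\, L \cdot \lambda^N(A).
\]

PROOF Let $L = U \circ S$ be a polar decomposition of $L$ (see Theorem 1.5.14). We know that $\mathrm{jac}\, L = | \det S |$ (see Definition 1.5.16(a)). If $\mathrm{jac}\, L = 0$, then $\det S = 0$ and so $S$ is not surjective, i.e., $\dim R(S) \le N - 1$. It follows that $\mu^{(N)}\big(L(A)\big) = 0$ (see, e.g., Proposition 1.3.23). If $\mathrm{jac}\, L > 0$, then using the orthogonality of $U$ and the facts that $\mu^{(N)} = \lambda^N$ in $\mathbb{R}^N$, $L = U \circ S$ and $U^* \circ U = \mathrm{id}_{\mathbb{R}^N}$, we have
\[
\begin{aligned}
\frac{\mu^{(N)}\big(L(\overline{B}_r(x))\big)}{\lambda^N(\overline{B}_r(x))} &= \frac{\lambda^N\big((U^* \circ L)(\overline{B}_r(x))\big)}{\lambda^N(\overline{B}_r(x))} = \frac{\lambda^N\big((U^* \circ U \circ S)(\overline{B}_r(x))\big)}{\lambda^N(\overline{B}_r(x))} \\
&= \frac{\lambda^N\big(S(\overline{B}_r(x))\big)}{\lambda^N(\overline{B}_r(x))} = \frac{\lambda^N\big(S(\overline{B}_1(0))\big)}{a(N)} = \big| \det S \big| = \mathrm{jac}\, L,
\end{aligned} \tag{1.45}
\]
where $a(N) \stackrel{\mathrm{df}}{=} \dfrac{\pi^{N/2}}{\Gamma\big(\frac{N}{2} + 1\big)}$ is the volume of the unit ball in $\mathbb{R}^N$. Finally let
\[
\vartheta(A) \stackrel{\mathrm{df}}{=} \mu^{(N)}\big(L(A)\big) \qquad \forall\, A \subseteq \mathbb{R}^N.
\]
Then $\vartheta$ is a Radon measure and $\vartheta \ll \lambda^N$. So the Radon-Nikodym derivative of $\vartheta$ with respect to $\lambda^N$ (see Theorem A.2.24 and Remark A.2.25) exists and is given by
\[
\frac{d\vartheta}{d\lambda^N}(x) = \lim_{r \searrow 0} \frac{\vartheta(\overline{B}_r(x))}{\lambda^N(\overline{B}_r(x))} = \mathrm{jac}\, L
\]
(see (1.45) and Widom (1969, p. 119)). From the Radon-Nikodym theorem (see Theorem A.2.24), we infer that for all Borel sets $A \subseteq \mathbb{R}^N$, we have
\[
\mu^{(N)}\big(L(A)\big) = \mathrm{jac}\, L \cdot \lambda^N(A). \tag{1.46}
\]
Because $\vartheta$ and $\lambda^N$ are both Radon measures, we conclude that (1.46) holds for all $A \subseteq \mathbb{R}^N$.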
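For a concrete linear operator the two expressions for $(\mathrm{jac}\, L)^2$ in Remark 1.5.17 can be compared directly. The sketch below is only an illustration with hypothetical helper names: $L : \mathbb{R}^2 \to \mathbb{R}^3$ is stored as a list of $M = 3$ rows of length $N = 2$, the first function evaluates $\det(L^* \circ L)$, and the second sums the squared $N \times N$ minors obtained by selecting increasing row indices, as in the Binet-Cauchy formula.

```python
import math
from itertools import combinations
from typing import List

Matrix = List[List[float]]


def det(matrix: Matrix) -> float:
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * matrix[0][j] * det(minor)
    return total


def gram(L: Matrix) -> Matrix:
    """Matrix of L* o L for L given as M rows of length N."""
    N = len(L[0])
    return [[sum(row[i] * row[j] for row in L) for j in range(N)] for i in range(N)]


def jac_squared_gram(L: Matrix) -> float:
    """(jac L)^2 computed as det(L* o L)."""
    return det(gram(L))


def jac_squared_binet_cauchy(L: Matrix) -> float:
    """(jac L)^2 computed as the sum of det(P_theta o L)^2 over increasing theta."""
    M, N = len(L), len(L[0])
    return sum(det([L[i] for i in rows]) ** 2 for rows in combinations(range(M), N))


if __name__ == "__main__":
    # Columns (1, 0, 1) and (0, 1, 1) are the images of the canonical basis of R^2.
    L = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(jac_squared_gram(L), jac_squared_binet_cauchy(L))
    print(math.sqrt(jac_squared_gram(L)))  # jac L
```

In this example $\mathrm{jac}\, L$ is the area of the parallelogram spanned by $(1, 0, 1)$ and $(0, 1, 1)$, in agreement with Lemma 1.5.18 applied to the unit square.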
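LEMMA 1.5.19 If $f : \mathbb{R}^N \to \mathbb{R}^M$ is Lipschitz continuous, $N \le M$ and $A \subseteq \mathbb{R}^N$ is Lebesgue measurable, then
(a) $f(A)$ is a $\mu^{(N)}$-measurable set;
(b) the map $y \mapsto \mu^{(0)}\big(A \cap f^{-1}(y)\big)$ is $\mu^{(N)}$-measurable on $\mathbb{R}^M$;
(c) $\displaystyle \int_{\mathbb{R}^M} \mu^{(0)}\big(A \cap f^{-1}(y)\big)\, d\mu^{(N)}(y) \le \big( \mathrm{Lip}(f) \big)^N \lambda^N(A)$.

PROOF Clearly we can assume without any loss of generality that $A$ is bounded (if not, consider instead $A \cap B_r(0)$).

(a) From the regularity of the Lebesgue measure, we know that for every $i \ge 1$, we can find a compact set $K_i \subseteq A$, such that
\[
\lambda^N(A \setminus K_i) \le \frac{1}{i} \qquad \forall\, i \ge 1.
\]
Because $f$ is continuous, $f(K_i) \subseteq \mathbb{R}^M$ is compact and so it is $\mu^{(N)}$-measurable. Then
\[
f\Big( \bigcup_{i=1}^{\infty} K_i \Big) = \bigcup_{i=1}^{\infty} f(K_i) \qquad \text{is a } \mu^{(N)}\text{-measurable set.}
\]
Also, using Proposition 1.3.25, we have
\[
\mu^{(N)}\Big( f(A) \setminus f\Big( \bigcup_{i=1}^{\infty} K_i \Big) \Big) \le \mu^{(N)}\Big( f\Big( A \setminus \bigcup_{i=1}^{\infty} K_i \Big) \Big) \le \mathrm{Lip}(f)^N \lambda^N\Big( A \setminus \bigcup_{i=1}^{\infty} K_i \Big) = 0,
\]
so $f(A)$ is $\mu^{(N)}$-measurable.

(b) For $k \ge 1$, we introduce the following families of $N$-cubes
\[
\mathcal{F}_k \stackrel{\mathrm{df}}{=} \Big\{ Q \subseteq \mathbb{R}^N : \ Q = \prod_{j=1}^{N} \Big( \frac{c_j}{k}, \frac{c_j + 1}{k} \Big],\ c_j \text{ are integers},\ j \in \{1, \ldots, N\} \Big\}.
\]
Let
\[
h_k \stackrel{\mathrm{df}}{=} \sum_{Q \in \mathcal{F}_k} \chi_{f(A \cap Q)}.
\]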
1. Hausdorff Measures and Capacity
67
Then by part (a), hk is µ(N ) -measurable and for every y ∈ RN , hk (y) is the number of cubes Q ∈ Fk , such that ¡ ¢ (A ∩ Q) ∩ f −1 {y} 6= ∅. Hence for all y ∈ RN , we have ¡ ¡ ¢¢ as k → +∞, hk (y) % µ(0) A ∩ f −1 {y} ¡ ¢ so the function y 7−→ µ(0) A ∩ f −1 (y) is µ(N ) -measurable. (c) Using the monotone convergence theorem (see Theorem A.2.10) and using Proposition 1.3.25 and the fact that [ RN = Q ∀ k > 1, Q∈Fk
we have
Z (0)
µ RM
=
lim
k→+∞
¡
A∩f
−1
¢ ({y}) dµ(N ) (y) =
Z lim
k→+∞ RM
hk (y) dµ(N ) (y)
X
X ¡ ¡ ¢ ¢N µ(N ) f (A ∩ Q) 6 lim sup Lip(f ) λN (A ∩ Q)
Q∈Fk
k→+∞ Q∈F k
¡ ¢N = Lip(f ) λN (A).
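As a sanity check of part (c), one may take $N = M = 1$, $f = \sin$ and $A = [0, 2\pi]$ (so $\mathrm{Lip}(f) = 1$ and $\mu^{(1)} = \lambda^1$ on $\mathbb{R}$). The following sketch approximates the integral of the multiplicity function; the grid sizes and the sign-change counter `multiplicity` are hypothetical choices made for the illustration, not part of the text.

```python
import math


def multiplicity(f, a, b, y, steps=2000):
    """Approximate the number of solutions of f(x) = y in [a, b]
    by counting sign changes of f - y on a uniform grid."""
    count = 0
    prev = f(a) - y
    for i in range(1, steps + 1):
        cur = f(a + (b - a) * i / steps) - y
        if prev == 0.0 or prev * cur < 0.0:
            count += 1
        prev = cur
    return count


def multiplicity_integral(f, a, b, y_min, y_max, y_points=200):
    """Midpoint rule in y for the integral of the multiplicity function."""
    dy = (y_max - y_min) / y_points
    return sum(multiplicity(f, a, b, y_min + (j + 0.5) * dy)
               for j in range(y_points)) * dy


a, b = 0.0, 2.0 * math.pi
integral = multiplicity_integral(math.sin, a, b, -1.0, 1.0)
lipschitz_bound = 1.0 * (b - a)  # Lip(sin)^1 * lambda^1([0, 2*pi])
```

Every level $y\in(-1,1)$, $y \ne 0$, is attained exactly twice on $[0,2\pi]$, so the integral is $4$, which is below the bound $2\pi$ of the lemma and coincides with $\int_0^{2\pi}|\cos t|\,dt$.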
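LEMMA 1.5.20 If $f\colon \mathbb{R}^N \to \mathbb{R}^M$ is Lipschitz continuous, $t > 1$ and
$$C \stackrel{df}{=} \big\{ x\in\mathbb{R}^N : Df(x) \text{ exists and } Jf(x) > 0 \big\},$$
then there exists a sequence $\{E_i\}_{i\ge 1}$ of Borel subsets of $\mathbb{R}^N$, such that
(a) $C = \bigcup_{i=1}^{\infty} E_i$;
(b) $f|_{E_i}$ is injective for all $i\ge 1$;
(c) for each $i\ge 1$ there exists a self-adjoint isomorphism $L_i\colon \mathbb{R}^N \to \mathbb{R}^N$, such that
$$\mathrm{Lip}\big((f|_{E_i})\circ L_i^{-1}\big) \le t, \qquad \mathrm{Lip}\big(L_i\circ (f|_{E_i})^{-1}\big) \le t$$
and
$$\frac{1}{t^N}\,|\det L_i| \le Jf|_{E_i} \le t^N\, |\det L_i|.$$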
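PROOF Choose $\varepsilon > 0$ so that $\frac{1}{t} + \varepsilon < 1 < t - \varepsilon$. Let $E$ be a countable dense subset of $C$ and let $G$ be a countable dense subset of the space of self-adjoint isomorphisms of $\mathbb{R}^N$. For each $u\in E$, $L\in G$ and $k\ge 1$, we set $E(u,L,k)$ to be the set of all $x\in C\cap \overline{B}_{1/k}(u)$, such that
$$\Big(\frac{1}{t}+\varepsilon\Big)\|Lh\|_{\mathbb{R}^N} \le \|Df(x)h\|_{\mathbb{R}^M} \le (t-\varepsilon)\|Lh\|_{\mathbb{R}^N} \qquad \forall\, h\in\mathbb{R}^N \qquad (1.47)$$
and
$$\|f(y)-f(x)-Df(x)(y-x)\|_{\mathbb{R}^M} \le \varepsilon\,\|L(y-x)\|_{\mathbb{R}^N} \qquad \forall\, y\in \overline{B}_{2/k}(x). \qquad (1.48)$$
Since $x\mapsto Df(x)$ is a Borel function, we see that $E(u,L,k)$ is a Borel subset of $\mathbb{R}^N$. From (1.47) and (1.48), it follows that
$$\frac{1}{t}\,\|L(y-x)\|_{\mathbb{R}^N} \le \|f(y)-f(x)\|_{\mathbb{R}^M} \le t\,\|L(y-x)\|_{\mathbb{R}^N} \qquad \forall\, x\in E(u,L,k),\ y\in\overline{B}_{2/k}(x). \qquad (1.49)$$

Claim 1. If $x\in E(u,L,k)$, then
$$\Big(\frac{1}{t}+\varepsilon\Big)^N |\det L| \le Jf(x) \le (t-\varepsilon)^N |\det L|.$$
Let $Df(x) = U\circ S$ (see Theorem 1.5.14(a)). According to Definition 1.5.16(c),
$$Jf(x) = \mathrm{jac}\, Df(x) = |\det S|.$$
From (1.47), we have
$$\Big(\frac{1}{t}+\varepsilon\Big)\|Lh\|_{\mathbb{R}^N} \le \|(U\circ S)h\|_{\mathbb{R}^M} = \|Sh\|_{\mathbb{R}^N} \le (t-\varepsilon)\|Lh\|_{\mathbb{R}^N} \qquad \forall\, h\in\mathbb{R}^N.$$
Since $L\in G$ is invertible, we have
$$\Big(\frac{1}{t}+\varepsilon\Big)\|h\|_{\mathbb{R}^N} \le \|(S\circ L^{-1})h\|_{\mathbb{R}^N} \le (t-\varepsilon)\|h\|_{\mathbb{R}^N} \qquad \forall\, h\in\mathbb{R}^N,$$
so
$$(S\circ L^{-1})\big(\overline{B}_1(0)\big) \subseteq \overline{B}_{t-\varepsilon}(0),$$
thus
$$|\det(S\circ L^{-1})|\, a(N) \le \lambda^N\big(\overline{B}_{t-\varepsilon}(0)\big) = a(N)(t-\varepsilon)^N,$$
where $a(N) \stackrel{df}{=} \dfrac{\pi^{N/2}}{\left(\frac{N}{2}\right)!}$ is the volume of the unit ball in $\mathbb{R}^N$. Finally
$$|\det S| \le (t-\varepsilon)^N |\det L|.$$
Similarly, we prove the other inequality of the claim. So Claim 1 is proved.

Let $\{E_i\}_{i\ge 1}$ be an enumeration of the countable set
$$\big\{ E(u,L,k) : u\in E,\ L\in G,\ k\ge 1 \big\}.$$

(a) Let $x\in C$ and let $Df(x) = U\circ S$ (by Theorem 1.5.14(a)). Note that because $x\in C$, we have that $S$ is invertible. Select $L\in G$, such that
$$\mathrm{Lip}(L\circ S^{-1}) = \|L\circ S^{-1}\|_{\mathcal{L}} \le \Big(\frac{1}{t}+\varepsilon\Big)^{-1} \quad\text{and}\quad \mathrm{Lip}(S\circ L^{-1}) = \|S\circ L^{-1}\|_{\mathcal{L}} \le t-\varepsilon.$$
Also select $k\ge 1$ and $u\in E$, such that
$$\|x-u\|_{\mathbb{R}^N} < \frac{1}{k}$$
and
$$\|f(y)-f(x)-Df(x)(y-x)\|_{\mathbb{R}^M} \le \frac{\varepsilon}{\mathrm{Lip}(L^{-1})}\,\|y-x\|_{\mathbb{R}^N} \le \varepsilon\,\|L(y-x)\|_{\mathbb{R}^N} \qquad \forall\, y\in\overline{B}_{2/k}(x).$$
We infer that $x\in E(u,L,k)$ and so $x\in E_i$ for some $i\ge 1$. Because $x\in C$ was arbitrary, we have proved statement (a).

(b) Choose $E_i$ from the countable collection $\{E_i\}_{i\ge 1}$. We have $E_i = E(u,L,k)$ with some $u\in E$, $L\in G$ and $k\ge 1$. Let us set $L_i = L$. From (1.49), we have
$$\frac{1}{t}\,\|L_i(y-x)\|_{\mathbb{R}^N} \le \|f(y)-f(x)\|_{\mathbb{R}^M} \le t\,\|L_i(y-x)\|_{\mathbb{R}^N} \qquad \forall\, x\in E_i,\ y\in\overline{B}_{2/k}(x).$$
Since $E_i \subseteq \overline{B}_{1/k}(u) \subseteq \overline{B}_{2/k}(x)$ for every $x\in E_i$, we have that
$$\frac{1}{t}\,\|L_i(y-x)\|_{\mathbb{R}^N} \le \|f(y)-f(x)\|_{\mathbb{R}^M} \le t\,\|L_i(y-x)\|_{\mathbb{R}^N} \qquad \forall\, x,y\in E_i, \qquad (1.50)$$
so $f|_{E_i}$ is injective.

(c) From (1.50), it follows that
$$\mathrm{Lip}\big((f|_{E_i})\circ L_i^{-1}\big) \le t \quad\text{and}\quad \mathrm{Lip}\big(L_i\circ(f|_{E_i})^{-1}\big) \le t.$$
Moreover, from Claim 1 and letting $\varepsilon\searrow 0$, we obtain
$$\frac{1}{t^N}\,|\det L_i| \le Jf|_{E_i} \le t^N\,|\det L_i|.$$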
Now we are ready for the area formula.

THEOREM 1.5.21 (Area Formula) If $f\colon\mathbb{R}^N\to\mathbb{R}^M$ is a Lipschitz continuous function, $N\le M$ and $A\subseteq\mathbb{R}^N$ is a Lebesgue measurable set, then
$$\int_A Jf(x)\, d\lambda^N(x) = \int_{\mathbb{R}^M} \mu^{(0)}\big(A\cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y).$$

PROOF By virtue of Theorem 1.5.8, without any loss of generality we may assume that $Df(x)$ (and so $Jf(x)$ too) exists for all $x\in A$. Also as before we may suppose that $\lambda^N(A) < +\infty$.

Case 1. $A\subseteq\{Jf>0\}$.
In this case we may use Lemma 1.5.20 and produce a sequence $\{E_i\}_{i\ge 1}$ of Borel subsets of $\mathbb{R}^N$ which satisfy the postulates of Lemma 1.5.20. We may additionally assume that the sets $\{E_i\}_{i\ge 1}$ are disjoint. Let $\mathcal{F}_k$ be the following family of $N$-cubes
$$\mathcal{F}_k \stackrel{df}{=} \Big\{ Q\subseteq\mathbb{R}^N : Q = \prod_{j=1}^{N}\Big(\frac{c_j}{k},\frac{c_j+1}{k}\Big],\ c_j \text{ are integers},\ j\in\{1,\ldots,N\}\Big\}$$
(compare with the proof of Lemma 1.5.19(b)). We set
$$F^k_{i,n} \stackrel{df}{=} E_i\cap Q^k_n\cap A \quad\text{with } Q^k_n\in\mathcal{F}_k, \qquad \forall\, i\ge 1,\ n\ge 1.$$
Evidently the sets $F^k_{i,n}$ are disjoint and
$$A = \bigcup_{i,n=1}^{\infty} F^k_{i,n} \qquad \forall\, k\ge 1.$$
First we show that
$$\lim_{k\to+\infty} \sum_{i,n=1}^{\infty} \mu^{(N)}\big(f(F^k_{i,n})\big) = \int_{\mathbb{R}^M} \mu^{(0)}\big(A\cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y). \qquad (1.51)$$
To this end, we introduce
$$h_k \stackrel{df}{=} \sum_{i,n=1}^{\infty} \chi_{f(F^k_{i,n})} \qquad \forall\, k\ge 1.$$
Therefore $h_k(y)$ is the number of sets from the sequence $\{F^k_{i,n}\}_{i,n\ge 1}$ which intersect $f^{-1}(\{y\})$. Note that
$$h_k(y)\nearrow \mu^{(0)}\big(A\cap f^{-1}(\{y\})\big) \quad\text{as } k\to+\infty.$$
Then (1.51) follows from the monotone convergence theorem (see Theorem A.2.10). Let $t>1$. Because of Lemma 1.5.20 and Proposition 1.3.25, we have
$$\mu^{(N)}\big(f(F^k_{i,n})\big) = \mu^{(N)}\Big(\big((f|_{E_i})\circ L_i^{-1}\circ L_i\big)(F^k_{i,n})\Big) \le t^N \lambda^N\big(L_i(F^k_{i,n})\big) \qquad (1.52)$$
and
$$\lambda^N\big(L_i(F^k_{i,n})\big) = \mu^{(N)}\Big(\big(L_i\circ (f|_{E_i})^{-1}\circ f\big)(F^k_{i,n})\Big) \le t^N \mu^{(N)}\big(f(F^k_{i,n})\big). \qquad (1.53)$$
So, using (1.52), (1.53) and Lemmas 1.5.18 and 1.5.20(c), it follows that
$$\frac{1}{t^{2N}}\,\mu^{(N)}\big(f(F^k_{i,n})\big) \le \frac{1}{t^{N}}\,\lambda^N\big(L_i(F^k_{i,n})\big) = \frac{1}{t^N}\,|\det L_i|\,\lambda^N(F^k_{i,n}) \le \int_{F^k_{i,n}} Jf(x)\, d\lambda^N(x) \le t^N |\det L_i|\,\lambda^N(F^k_{i,n}) = t^N \lambda^N\big(L_i(F^k_{i,n})\big) \le t^{2N}\mu^{(N)}\big(f(F^k_{i,n})\big).$$
We take the sum for the parameters $i,n\ge 1$. Recalling that the sets $F^k_{i,n}$ are disjoint and since $f|_{E_i}$ is injective (see Lemma 1.5.20(b)), we obtain
$$\frac{1}{t^{2N}} \sum_{i,n=1}^{\infty} \mu^{(N)}\big(f(F^k_{i,n})\big) \le \int_A Jf(x)\, d\lambda^N(x) \le t^{2N} \sum_{i,n=1}^{\infty} \mu^{(N)}\big(f(F^k_{i,n})\big).$$
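Let $k\to+\infty$ and use (1.51) to write that
$$\frac{1}{t^{2N}} \int_{\mathbb{R}^M} \mu^{(0)}\big(A\cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y) \le \int_A Jf(x)\, d\lambda^N(x) \le t^{2N} \int_{\mathbb{R}^M} \mu^{(0)}\big(A\cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y).$$
Since $t>1$ was arbitrary, we let $t\searrow 1$ and obtain the "area formula" for the case when $A\subseteq\{Jf>0\}$.

Case 2. $A\subseteq\{Jf=0\}$.
Let $\varepsilon>0$. We write $f = p\circ g$, where
$$g\colon \mathbb{R}^N\to\mathbb{R}^M\times\mathbb{R}^N \quad\text{and}\quad p\colon\mathbb{R}^M\times\mathbb{R}^N\to\mathbb{R}^M$$
are defined by
$$g(x) \stackrel{df}{=} \big(f(x),\varepsilon x\big) \quad\text{and}\quad p(y,z) = y \qquad \forall\, x,z\in\mathbb{R}^N,\ y\in\mathbb{R}^M.$$
We show that there exists $\xi>0$, such that
$$0 < Jg(x) \le \xi\varepsilon \qquad \forall\, x\in A. \qquad (1.54)$$
Note that
$$Dg(x) = \begin{bmatrix} Df(x) \\ \varepsilon I_N \end{bmatrix}_{(M+N)\times N},$$
where $I_N$ is the $N\times N$-identity matrix. Then by virtue of the Binet-Cauchy Formula (see Remark 1.5.17), we have that
$$Jg(x)^2 = \text{“sum of squares of } (N\times N)\text{-subdeterminants of } Dg(x)\text{”} \ge \varepsilon^{2N} > 0.$$
Moreover, since $\|Df(x)\|_{\mathcal{L}} \le \mathrm{Lip}(f)$ for $\lambda^N$-a.a. $x\in\mathbb{R}^N$, once again from the Binet-Cauchy Formula, we have that
$$Jg(x)^2 = Jf(x)^2 + \Big\{ \begin{array}{c}\text{sum of squares of terms each}\\ \text{involving at least one } \varepsilon>0 \end{array}\Big\} \le \xi\varepsilon^2,$$
for some $\xi>0$ and all $x\in A$.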
This shows that (1.54) is true. Then, using Proposition 1.3.25, Case 1, (1.54) and the fact that $\|p\|_{\mathcal{L}} = 1$, we have
$$\mu^{(N)}\big(f(A)\big) \le \mu^{(N)}\big(g(A)\big) \le \int_{\mathbb{R}^{M+N}} \mu^{(0)}\big(A\cap g^{-1}(\{(y,z)\})\big)\, d\mu^{(N)}(y,z) = \int_A Jg(x)\, d\lambda^N(x) \le \sqrt{\xi}\,\varepsilon\,\lambda^N(A).$$
Let $\varepsilon\searrow 0$ to conclude that
$$\mu^{(N)}\big(f(A)\big) = 0.$$
Note that
$$\mathrm{supp}\ \mu^{(0)}\big(A\cap f^{-1}(\{\cdot\})\big) \subseteq f(A).$$
Hence
$$\int_{\mathbb{R}^M} \mu^{(0)}\big(A\cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y) = \int_A Jf(x)\, d\lambda^N(x) = 0.$$
This proves Case 2.

Finally for the general case, we write $A = A_0\cup A_1$ with
$$A_0\subseteq\{Jf=0\} \quad\text{and}\quad A_1\subseteq\{Jf>0\}$$
and apply the result on each set separately.

REMARK 1.5.22 The function
$$y\mapsto \mu^{(0)}\big(A\cap f^{-1}(\{y\})\big)$$
on $\mathbb{R}^M$ is called the multiplicity function. Also note that from Theorem 1.5.21, we infer $f^{-1}(\{y\})$ is at most countable for $\mu^{(N)}$-almost all $y\in\mathbb{R}^M$.

THEOREM 1.5.23 (Change of Variables Formula I) If $f\colon\mathbb{R}^N\to\mathbb{R}^M$ is a Lipschitz continuous function, $N\le M$ and $g\in L^1(\mathbb{R}^N)$, then
$$\int_{\mathbb{R}^N} g(x)Jf(x)\, d\lambda^N(x) = \int_{\mathbb{R}^M}\Big[\sum_{x\in f^{-1}(\{y\})} g(x)\Big]\, d\mu^{(N)}(y).$$
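PROOF First assume that $g\ge 0$. Let
$$A_1 \stackrel{df}{=} \big\{x\in\mathbb{R}^N : g(x)\ge 1\big\}$$
and inductively define
$$A_n \stackrel{df}{=} \Big\{x\in\mathbb{R}^N : g(x)\ge \frac{1}{n} + \sum_{i=1}^{n-1}\frac{1}{i}\,\chi_{A_i}(x)\Big\} \qquad \forall\, n\ge 2.$$
Then
$$g \ge \sum_{n=1}^{\infty} \frac{1}{n}\,\chi_{A_n}.$$
If $g(x) = +\infty$, we see that $x\in A_n$ for all $n\ge 1$. On the other hand, if $0\le g(x)<+\infty$, then $x\notin A_n$ for infinitely many $n\ge 1$ (because the harmonic series diverges). So for infinitely many $n$, we have
$$0 \le g(x) - \sum_{i=1}^{n-1}\frac{1}{i}\,\chi_{A_i}(x) \le \frac{1}{n}$$
and so we conclude that
$$g = \sum_{n=1}^{\infty}\frac{1}{n}\,\chi_{A_n}.$$
From the monotone convergence theorem (see Theorem A.2.10) and Theorem 1.5.21, we have
$$\begin{aligned}
\int_{\mathbb{R}^N} g(x)Jf(x)\, d\lambda^N(x) &= \sum_{n=1}^{\infty}\frac{1}{n}\int_{\mathbb{R}^N}\chi_{A_n}(x)Jf(x)\, d\lambda^N(x) \\
&= \sum_{n=1}^{\infty}\frac{1}{n}\int_{A_n} Jf(x)\, d\lambda^N(x) = \sum_{n=1}^{\infty}\frac{1}{n}\int_{\mathbb{R}^M}\mu^{(0)}\big(A_n\cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y) \\
&= \int_{\mathbb{R}^M}\Big(\sum_{n=1}^{\infty}\frac{1}{n}\sum_{x\in f^{-1}(\{y\})}\chi_{A_n}(x)\Big)\, d\mu^{(N)}(y) \\
&= \int_{\mathbb{R}^M}\Big(\sum_{x\in f^{-1}(\{y\})}\sum_{n=1}^{\infty}\frac{1}{n}\,\chi_{A_n}(x)\Big)\, d\mu^{(N)}(y) \\
&= \int_{\mathbb{R}^M}\Big(\sum_{x\in f^{-1}(\{y\})} g(x)\Big)\, d\mu^{(N)}(y).
\end{aligned}$$
In the general case let $g = g^+ - g^-$ and apply the first part of the proof on each component function $g^+\ge 0$, $g^-\ge 0$.

EXAMPLE 1.5.24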
(a) Let $N=1$, $M\ge 1$. Suppose that $f\colon\mathbb{R}\to\mathbb{R}^M$ is Lipschitz continuous and injective. We have
$$f = (f_k)_{k=1}^M \quad\text{and}\quad Df = (f_k')_{k=1}^M, \quad\text{with } {}' = \frac{d}{dt}.$$
Let $-\infty<a<b<+\infty$ and $C = f\big([a,b]\big)$ (the curve defined by $f$). Using Theorem 1.5.21, we have
$$\mu^{(1)}(C) = \int_a^b |f'(t)|\, d\lambda^1(t),$$
the length of $C$.

(b) Let $N\ge 1$, $M = N+1$. Suppose that $g\colon\mathbb{R}^N\to\mathbb{R}$ is Lipschitz continuous and let $f\colon\mathbb{R}^N\to\mathbb{R}^{N+1}$ be the Lipschitz continuous function defined by
$$f(x) \stackrel{df}{=} \big(x,g(x)\big) \qquad \forall\, x\in\mathbb{R}^N.$$
We have
$$Df(x) = \begin{pmatrix} I_N \\ \nabla g(x)\end{pmatrix}_{(N+1)\times N} \quad\text{for } \lambda^N\text{-a.a. } x\in\mathbb{R}^N,$$
where $I_N$ is the $N\times N$-identity matrix. Therefore
$$(Jf)^2 = \Big\{\begin{array}{c}\text{sum of squares of}\\ N\times N\text{-subdeterminants}\end{array}\Big\} = 1 + \|\nabla g\|^2_{\mathbb{R}^N}.$$
Then, if
$$G \stackrel{df}{=} \big\{(x,y)\in\mathbb{R}^N\times\mathbb{R} : y = g(x)\big\}$$
(the graph of $g$), from Theorem 1.5.21, we have
$$\mu^{(N)}(G) = \int_{\mathbb{R}^N}\sqrt{1+\|\nabla g(x)\|^2_{\mathbb{R}^N}}\; d\lambda^N(x),$$
the surface area of $G$.

(c) Let $N\ge 1$, $M=N+1$. Suppose that $f\colon\mathbb{R}^N\to\mathbb{R}^{N+1}$ is Lipschitz continuous and injective. Then
$$f = (f_k)_{k=1}^{N+1} \quad\text{and}\quad Df = \Big(\frac{\partial f_k}{\partial x_i}\Big)_{\substack{k=1,\ldots,N+1\\ i=1,\ldots,N}}.$$
So
$$(Jf)^2 = \sum_{k=1}^{N+1}\Big[\frac{\partial(f_1,\ldots,f_{k-1},f_{k+1},\ldots,f_{N+1})}{\partial(x_1,\ldots,x_N)}\Big]^2,$$
the sum of squares of $N\times N$-subdeterminants. Therefore, if $U\subseteq\mathbb{R}^N$ is any open set and $A = f(U)\subseteq\mathbb{R}^{N+1}$, then by Theorem 1.5.21, we have
$$\mu^{(N)}(A) = \int_U\Big(\sum_{k=1}^{N+1}\Big|\frac{\partial(f_1,\ldots,f_{k-1},f_{k+1},\ldots,f_{N+1})}{\partial(x_1,\ldots,x_N)}\Big|^2\Big)^{\frac12} d\lambda^N(x).$$
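Part (a) of the example can be illustrated numerically. The sketch below uses a helix $f(t) = (\cos t, \sin t, t)$ on $[0,2\pi]$, which is an assumed test curve and not taken from the text; it compares the Jacobian integral $\int_a^b |f'(t)|\,dt$ with the length of a fine inscribed polygon, used here as a stand-in for $\mu^{(1)}(C)$.

```python
import math


def f(t):
    """Injective Lipschitz curve R -> R^3 (a helix, chosen for illustration)."""
    return (math.cos(t), math.sin(t), t)


def df(t):
    """Componentwise derivative of the helix."""
    return (-math.sin(t), math.cos(t), 1.0)


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def jacobian_integral(a, b, n=10000):
    """Midpoint rule for the integral of |f'(t)| over [a, b]."""
    h = (b - a) / n
    return sum(norm(df(a + (i + 0.5) * h)) for i in range(n)) * h


def polygonal_length(a, b, n=10000):
    """Length of the inscribed polygon with n equal parameter steps."""
    h = (b - a) / n
    pts = [f(a + i * h) for i in range(n + 1)]
    return sum(norm(tuple(q - p for p, q in zip(pts[i], pts[i + 1])))
               for i in range(n))


a, b = 0.0, 2.0 * math.pi
from_area_formula = jacobian_integral(a, b)
from_polygon = polygonal_length(a, b)
exact = 2.0 * math.pi * math.sqrt(2.0)  # |f'(t)| = sqrt(2) for the helix
```

For this curve $|f'(t)| = \sqrt{2}$, so both approximations should be close to $2\pi\sqrt{2}$.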
In Theorem 1.5.21, we proved that if $f\colon\mathbb{R}^N\to\mathbb{R}^M$ is Lipschitz continuous, $N\le M$ and $A\subseteq\mathbb{R}^N$ is Lebesgue measurable, then the Jacobian integral
$$\int_A Jf(x)\, d\lambda^N(x)$$
equals the $N$-dimensional Hausdorff area of $f|_A$, given by
$$\int_{\mathbb{R}^M}\mu^{(0)}\big(A\cap f^{-1}(\{y\})\big)\, d\mu^{(N)}(y).$$
If $N\ge M$, then the Jacobian integral equals the "coarea" of $f|_A$, defined by
$$\int_{\mathbb{R}^M}\mu^{(N-M)}\big(A\cap f^{-1}(\{y\})\big)\, d\lambda^M(y).$$
This result is known as the coarea formula.

THEOREM 1.5.25 (Coarea Formula) If $f\colon\mathbb{R}^N\to\mathbb{R}^M$ is Lipschitz continuous, $M\le N$ and $A\subseteq\mathbb{R}^N$ is Lebesgue measurable, then
$$\int_A Jf(x)\, d\lambda^N(x) = \int_{\mathbb{R}^M}\mu^{(N-M)}\big(A\cap f^{-1}(\{y\})\big)\, d\lambda^M(y).$$
As was the case with the area formula, the coarea formula leads to a change of variables formula.

THEOREM 1.5.26 (Change of Variables Formula II) If $f\colon\mathbb{R}^N\to\mathbb{R}^M$ is a Lipschitz continuous function, $N\ge M$ and $g\in L^1(\mathbb{R}^N)$, then
$$\int_{\mathbb{R}^N} g(x)Jf(x)\, d\lambda^N(x) = \int_{\mathbb{R}^M}\Big[\int_{f^{-1}(\{y\})} g(x)\, d\mu^{(N-M)}(x)\Big]\, d\lambda^M(y).$$

PROOF First assume that $g\ge 0$. As in the proof of Theorem 1.5.23, we can find Lebesgue measurable sets $\{A_n\}_{n\ge 1}\subseteq\mathbb{R}^N$, such that
$$g = \sum_{n=1}^{\infty}\frac{1}{n}\,\chi_{A_n}.$$
Invoking the monotone convergence theorem (see Theorem A.2.10) and Theorem 1.5.25, we have
$$\begin{aligned}
\int_{\mathbb{R}^N} g(x)Jf(x)\, d\lambda^N(x) &= \sum_{n=1}^{\infty}\frac{1}{n}\int_{A_n} Jf(x)\, d\lambda^N(x) \\
&= \sum_{n=1}^{\infty}\frac{1}{n}\int_{\mathbb{R}^M}\mu^{(N-M)}\big(A_n\cap f^{-1}(\{y\})\big)\, d\lambda^M(y) \\
&= \int_{\mathbb{R}^M}\Big[\sum_{n=1}^{\infty}\frac{1}{n}\,\mu^{(N-M)}\big(A_n\cap f^{-1}(\{y\})\big)\Big]\, d\lambda^M(y) \\
&= \int_{\mathbb{R}^M}\Big[\int_{f^{-1}(\{y\})} g(x)\, d\mu^{(N-M)}(x)\Big]\, d\lambda^M(y).
\end{aligned}$$
For the general case let $g = g^+ - g^-$ and apply the first part to each component function $g^+\ge 0$ and $g^-\ge 0$.
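EXAMPLE 1.5.27 (a) Let $N\ge 1$, $M=1$. Suppose that $f\colon\mathbb{R}^N\to\mathbb{R}$ is defined by
$$f(x) = \|x\|_{\mathbb{R}^N}$$
and $g\in L^1(\mathbb{R}^N)$. Then, we have
$$Df(x) = \frac{x}{\|x\|_{\mathbb{R}^N}} \quad\text{and}\quad Jf(x) = 1 \qquad \forall\, x\in\mathbb{R}^N\setminus\{0\}.$$
From Theorem 1.5.26, we have
$$\int_{\mathbb{R}^N} g(x)\, d\lambda^N(x) = \int_0^{+\infty}\Big[\int_{\partial B_r(0)} g(x)\, d\mu^{(N-1)}(x)\Big]\, d\lambda^1(r) = \int_0^{+\infty} r^{N-1}\Big[\int_{\partial B_1(0)} g(rx)\, d\mu^{(N-1)}\Big]\, d\lambda^1(r). \qquad (1.55)$$
In particular if $g = \chi_{\overline{B}_1(0)}$, from (1.55), it follows that
$$a(N) = \frac{1}{N}\,\mu^{(N-1)}\big(\partial B_1(0)\big),$$
where $a(N) \stackrel{df}{=} \dfrac{\pi^{N/2}}{\left(\frac{N}{2}\right)!}$ is the volume of the unit ball in $\mathbb{R}^N$.

(b) Let $N\ge 1$ and $M=1$. Suppose that $f\colon\mathbb{R}^N\to\mathbb{R}$ is a Lipschitz continuous function. Then $Jf = \|Df\|_{\mathbb{R}^N}$ and so from Theorem 1.5.25, we have that
$$\int_{\mathbb{R}^N}\|Df(x)\|_{\mathbb{R}^N}\, d\lambda^N(x) = \int_{-\infty}^{+\infty}\mu^{(N-1)}\big(\{f=t\}\big)\, d\lambda^1(t).$$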
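The identity $a(N) = \frac{1}{N}\mu^{(N-1)}(\partial B_1(0))$ from part (a) can be cross-checked against closed-form expressions. In the sketch below, $\left(\frac{N}{2}\right)!$ is interpreted as $\Gamma\!\left(\frac{N}{2}+1\right)$, and the surface area of the unit sphere is taken to be $2\pi^{N/2}/\Gamma(N/2)$; this second formula is a standard fact that is assumed here rather than derived in the text. The function names are hypothetical.

```python
import math


def unit_ball_volume(N):
    """a(N) = pi^(N/2) / Gamma(N/2 + 1), the volume of the unit ball in R^N."""
    return math.pi ** (N / 2) / math.gamma(N / 2 + 1)


def unit_sphere_area(N):
    """(N-1)-dimensional area of the unit sphere in R^N (assumed closed form)."""
    return 2.0 * math.pi ** (N / 2) / math.gamma(N / 2)


def radial_integral(N, n=100000):
    """Right-hand side of (1.55) for g = indicator of the unit ball:
    midpoint rule for the integral of r^(N-1) * area(sphere) over [0, 1]."""
    h = 1.0 / n
    return unit_sphere_area(N) * sum(((i + 0.5) * h) ** (N - 1) for i in range(n)) * h


checks = [(N, unit_ball_volume(N), unit_sphere_area(N) / N, radial_integral(N))
          for N in range(1, 7)]
```

For each $N$ in `checks`, the three computed numbers should agree up to quadrature error, for example $\pi$ for $N=2$ and $4\pi/3$ for $N=3$.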
We conclude this section with some additional useful results involving the multiplicity function
$$y\mapsto\mu^{(0)}\big(A\cap f^{-1}(\{y\})\big)$$
of a Lipschitz continuous function $f$.

PROPOSITION 1.5.28 If $X$, $Y$ are separable metric spaces, $\xi$ is an outer measure on $Y$, $f\colon X\to Y$ is a map such that for every Borel set $B\subseteq X$, the set $f(B)$ is $\xi$-measurable, $\vartheta\colon 2^X\to\overline{\mathbb{R}}_+ = \mathbb{R}_+\cup\{+\infty\}$ is the outer measure on $X$, defined by
$$\vartheta(A) \stackrel{df}{=} \xi\big(f(A)\big) \qquad \forall\, A\subseteq X$$
and $\widehat{\vartheta}$ is the Borel measure resulting from $\vartheta$ by the Carathéodory construction, then for every Borel set $B\subseteq X$, we have
$$\widehat{\vartheta}(B) = \int_Y \mu^{(0)}\big(B\cap f^{-1}(\{y\})\big)\, d\xi(y).$$
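PROOF Let $\{\mathcal{B}_k\}_{k\ge 1}$ be a sequence of Borel partitions of $B$, such that every member of $\mathcal{B}_k$ is the union of some subcollection in $\mathcal{B}_{k+1}$ and
$$\sup_{A\in\mathcal{B}_k}\delta(A)\to 0 \quad\text{as } k\to+\infty,$$
i.e., $\mathcal{B} = \bigcup_{k=1}^{\infty}\mathcal{B}_k$ is a Vitali cover of $B$. Note that if
$$h_k(y) \stackrel{df}{=} \sum_{A\in\mathcal{B}_k}\chi_{f(A)}(y) \qquad \forall\, k\ge 1,\ y\in Y,$$
then
$$h_k(y)\nearrow\mu^{(0)}\big(B\cap f^{-1}(\{y\})\big) \quad\text{as } k\to+\infty.$$
So by the monotone convergence theorem (see Theorem A.2.10), we have that
$$\widehat{\vartheta}(B) = \lim_{k\to+\infty}\sum_{A\in\mathcal{B}_k}\xi\big(f(A)\big) = \lim_{k\to+\infty}\int_Y\sum_{A\in\mathcal{B}_k}\chi_{f(A)}(y)\, d\xi(y) = \int_Y\mu^{(0)}\big(B\cap f^{-1}(\{y\})\big)\, d\xi(y).$$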
REMARK 1.5.29 Recall that if $X$ is a separable metric space, $Y$ is a Hausdorff topological space, $f\colon X\to Y$ is a continuous map, $\xi$ is a Borel outer measure on $Y$, then for every Borel set $B\subseteq X$, the set $f(B)$ is $\xi$-measurable. This fact is essentially the starting point for the theory of Souslin sets (see Definition A.2.29(b)).
PROPOSITION 1.5.30 If $X$ is a Polish space (see Definition A.2.29(a)), $Y$ is a separable metric space, $f\colon X\to Y$ is a Lipschitz continuous function, $0\le k<+\infty$ and $A\subseteq X$ is a Borel set, then
$$\int_Y\mu^{(0)}\big(A\cap f^{-1}(\{y\})\big)\, d\mu^{(k)}(y) \le \big(\mathrm{Lip}(f)\big)^k\mu^{(k)}(A).$$

PROOF From Proposition 1.3.25, we know that
$$\vartheta(A) \stackrel{df}{=} \mu^{(k)}\big(f(A)\big) \le \big(\mathrm{Lip}(f)\big)^k\mu^{(k)}(A) \qquad \forall\, A\subseteq X.$$
Then apply Proposition 1.5.28 on the outer measure $\vartheta$.

PROPOSITION 1.5.31 If $X$ is a separable metric space, then for every connected set $C\subseteq X$, we have
$$\delta(C)\le\mu^{(1)}(C).$$

PROOF Clearly we may assume that
$$\mu^{(1)}(C)<+\infty$$
or otherwise the inequality is obvious. Since $\mu^{(1)}$ is a Borel measure, we can find a Borel set $B\supseteq C$, such that $\mu^{(1)}(B) = \mu^{(1)}(C)$. Let $u,v\in C$ and let $f\colon X\to\mathbb{R}$ be defined by
$$f(x) \stackrel{df}{=} d_X(x,u) \qquad \forall\, x\in X,$$
where $d_X$ is the metric in $X$. Since $f$ is Lipschitz continuous with $\mathrm{Lip}(f) = 1$ and $f(u), f(v)\in f(C) = [a,b]$, from Proposition 1.5.30, we have that
$$\mu^{(1)}(C) = \mu^{(1)}(B) \ge \int_{\mathbb{R}}\mu^{(0)}\big(B\cap f^{-1}(\{y\})\big)\, d\mu^{(1)}(y) \ge \mu^{(1)}\big(f(B)\big) \ge d_X(v,u).$$
1.6 Capacity
The notion of capacity plays a crucial role in the study of local properties of Sobolev functions. In a sense it takes the place of measure and it is used to characterize the smallness of subsets in $\mathbb{R}^N$. For this reason, it is indispensable in the study of the continuity properties of Sobolev functions. We shall deal with these issues in Section 2.7. Moreover, the concept of capacity enters the study of obstacle problems. In this section we develop the theory of the so-called "p-capacity" (variational capacity). The development of the theory of the p-capacity requires knowledge of the definition of Sobolev spaces and some results from their theory. To make this section self-contained, we state here the necessary material from the theory of Sobolev spaces, but we postpone the proofs until Section 2.4, where we conduct a more systematic study of Sobolev spaces.

DEFINITION 1.6.1 Let $U\subseteq\mathbb{R}^N$ be a nonempty open set. By $z = (z_k)_{k=1}^N$, we denote a generic point of $U$.
(a) Suppose that $f\in L^1_{loc}(U)$. We say that $g_k\in L^1_{loc}(U)$ is the distributional (or weak) partial derivative of $f$ with respect to $z_k$ (with $k\in\{1,\ldots,N\}$) in $U$, if
$$\int_U f\,\frac{\partial\varphi}{\partial z_k}\, dz = -\int_U g_k\,\varphi\, dz \qquad \forall\,\varphi\in C_c^\infty(U)$$
(here $C_c^\infty(U)$ is the space of all $C^\infty(U)$-functions with compact supports, i.e., the space of test functions). We write
$$g_k = \frac{\partial f}{\partial z_k} = D_kf \qquad \forall\, k\in\{1,\ldots,N\}.$$
If all of the distributional (weak) partial derivatives $D_kf$ exist for $k = 1,\ldots,N$, then $Df \stackrel{df}{=} (D_kf)_{k=1}^N$ is the distributional (weak) derivative of $f$.
(b) Let $p\in[1,+\infty]$. We define the Sobolev space $W^{1,p}(U)$, by
$$W^{1,p}(U) \stackrel{df}{=} \big\{f\in L^p(U) : Df\in L^p\big(U;\mathbb{R}^N\big)\big\}.$$
Also we define
$$W^{1,p}_{loc}(U) \stackrel{df}{=} \big\{f\colon U\to\mathbb{R} : f|_V\in W^{1,p}(V) \text{ for all } V\subset\subset U\big\},$$
where $V\subset\subset U$ means that $V$ is a bounded open subset of $U$ such that $\overline{V}\subseteq U$. The elements of $W^{1,p}_{loc}(U)$ are called Sobolev functions.
(c) Let $p\in[1,N)$. We define the critical Sobolev exponent
$$p^* \stackrel{df}{=} \frac{Np}{N-p}$$
and the space
$$K^p \stackrel{df}{=} \big\{f\in L^{p^*}(\mathbb{R}^N) : f\ge 0,\ Df\in L^p\big(\mathbb{R}^N;\mathbb{R}^N\big)\big\}.$$
(d) Let $p\in[1,N)$ and $C\subseteq\mathbb{R}^N$. The p-capacity of $C$ is defined by
$$\mathrm{cap}_p(C) \stackrel{df}{=} \inf\big\{\|Df\|_p^p : f\in K^p,\ C\subseteq\mathrm{int}\,\{f\ge 1\}\big\}.$$
REMARK 1.6.2 (a) Clearly if the distributional (weak) partial derivative $D_kf$ exists, it is uniquely defined modulo a Lebesgue-null set in $\mathbb{R}^N$.
(b) If $f\in W^{1,p}(U)$, we define
$$\|f\|_{1,p} \stackrel{df}{=} \big(\|f\|_p^p + \|Df\|_p^p\big)^{\frac1p} \qquad \forall\, p\in[1,+\infty)$$
and
$$\|f\|_{1,\infty} \stackrel{df}{=} \|f\|_\infty + \|Df\|_\infty.$$
These are norms in $W^{1,p}(U)$ for $p\in[1,+\infty)$ and $W^{1,\infty}(U)$ respectively. Normed this way the Sobolev spaces are Banach spaces.
(c) Although $p<p^*$, we do not have $L^{p^*}(\mathbb{R}^N)\subseteq L^p(\mathbb{R}^N)$ and so we cannot say that $K^p\subseteq W^{1,p}(\mathbb{R}^N)$.
(d) If $K\subseteq\mathbb{R}^N$ is a compact set, then by using standard regularization (via mollification; see also Definition 2.4.10) of the characteristic function $\chi_K$, we can check that
$$\mathrm{cap}_p(K) = \inf\big\{\|Df\|_p^p : f\in C_c^\infty\big(\mathbb{R}^N\big),\ f\ge\chi_K\big\}.$$
(e) Evidently, if $C_1\subseteq C_2$, then $\mathrm{cap}_p(C_1)\le\mathrm{cap}_p(C_2)$ (monotonicity).
(f) Because $p<p^*$, the elements of $K^p$ are Sobolev functions.
As we already mentioned, for easy reference, we present four results from the theory of Sobolev spaces, which will be used in the sequel. The proofs of these results will be given in Section 2.4.

PROPOSITION 1.6.3 If $U\subseteq\mathbb{R}^N$ is open, $p\in[1,+\infty)$ and $f\in W^{1,p}(U)$, then we can find a sequence $\{f_n\}_{n\ge 1}\subseteq W^{1,p}(U)\cap C^\infty(U)$, such that
$$f_n\to f \quad\text{in } W^{1,p}(U).$$

PROPOSITION 1.6.4 Let $U\subseteq\mathbb{R}^N$ be an open set and let $p\in[1,+\infty)$.
(a) If $f,g\in W^{1,p}(U)$, then
$$h_0 \stackrel{df}{=} \min\{f,g\}\in W^{1,p}(U), \qquad h_1 \stackrel{df}{=} \max\{f,g\}\in W^{1,p}(U)$$
and
$$Dh_0(x) = \begin{cases} Df(x) & \text{for } \lambda^N\text{-a.a. } x\in\{f\le g\},\\ Dg(x) & \text{for } \lambda^N\text{-a.a. } x\in\{f\ge g\},\end{cases} \qquad Dh_1(x) = \begin{cases} Dg(x) & \text{for } \lambda^N\text{-a.a. } x\in\{f\le g\},\\ Df(x) & \text{for } \lambda^N\text{-a.a. } x\in\{f\ge g\}.\end{cases}$$
In particular $f^+, f^-, |f|\in W^{1,p}(U)$.
(b) If $\{f_n\}_{n\ge 1}\subseteq W^{1,p}(U)$ is a sequence, then
$$h \stackrel{df}{=} \sup_{n\ge 1} f_n\in W^{1,p}(U), \qquad u \stackrel{df}{=} \sup_{n\ge 1}\|Df_n\|_{\mathbb{R}^N}\in L^p(U)$$
and
$$\|Df_n(z)\|_{\mathbb{R}^N}\le u(z) \quad\text{for } \lambda^N\text{-a.a. } z\in U.$$

PROPOSITION 1.6.5 If $U\subseteq\mathbb{R}^N$ is a bounded open set with a $C^1$-boundary and $p\in[1,+\infty)$, then there exists $E\in\mathcal{L}\big(W^{1,p}(U),W^{1,p}(\mathbb{R}^N)\big)$, such that
$$E(f)|_U = f.$$

REMARK 1.6.6 The function $E(f)$ is called an extension of $f$ on $\mathbb{R}^N$.
Finally we mention two basic inequalities. The first is known as the "Sobolev inequality" (or "Sobolev-Nirenberg-Gagliardo inequality") and the second is known as the "Poincaré-Wirtinger inequality."

THEOREM 1.6.7 (Sobolev-Nirenberg-Gagliardo Inequality) If $p\in[1,N)$, then there exists $C = C(N,p)>0$, such that
$$\|f\|_{p^*}\le C\,\|Df\|_p \qquad \forall\, f\in W^{1,p}(\mathbb{R}^N).$$

THEOREM 1.6.8 (Poincaré-Wirtinger Inequality) If $U\subseteq\mathbb{R}^N$ is a bounded, connected and open set (i.e., a bounded domain in $\mathbb{R}^N$) with a $C^1$-boundary and $p\in[1,+\infty)$, then there exists $C_0 = C_0(N,p)>0$, such that
$$\|f-\overline{f}\|_p\le C_0\,\|Df\|_p \qquad \forall\, f\in W^{1,p}(U),$$
with
$$\overline{f} = \frac{1}{\lambda^N(U)}\int_U f(z)\, dz.$$
If $p<N$, then
$$\|f-\overline{f}\|_{p^*}\le C_0\,\|Df\|_p.$$

A Sobolev inequality is also valid for the elements in $K^p$.

PROPOSITION 1.6.9 If $p\in[1,N)$, then there exists $C = C(N,p)>0$, such that
$$\|f\|_{p^*}\le C\,\|Df\|_p \qquad \forall\, f\in K^p.$$

PROOF
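First we produce a sequence $\{\varphi_n\}_{n\ge 1}\subseteq C_c^\infty\big(\mathbb{R}^N\big)$, such that
$$0\le\varphi_n\le 1 \quad \forall\, n\ge 1, \qquad \varphi_n(z)\nearrow 1 \quad\text{for a.a. } z\in\mathbb{R}^N, \qquad \varphi_n(z) = 1 \quad \forall\,\|z\|_{\mathbb{R}^N}<n$$
and
$$\sup_{n\ge 1}\|D\varphi_n\|_{\mathbb{R}^N}<+\infty.$$
To this end, let $\varphi\in C_c^\infty\big(B_2(0)\big)$, such that
$$0\le\varphi\le 1 \quad\text{and}\quad \varphi|_{\overline{B}_1(0)} = 1.$$
Let us set
$$\varphi_n(z) = \varphi\Big(\frac{z}{n}\Big) \qquad \forall\, z\in\mathbb{R}^N,\ n\ge 1.$$
This is the desired sequence. Note that
$$\varphi_nf\in W^{1,p}(\mathbb{R}^N)$$
(recall that $p<p^*$ and use the product rule). Invoking the Sobolev-Nirenberg-Gagliardo inequality (see Theorem 1.6.7), we can find $C>0$, such that for all $n\ge 1$, we have
$$\|\varphi_nf\|_{p^*}\le C\,\|D(\varphi_nf)\|_p\le C\,\|Df\|_p + C\,\|fD\varphi_n\|_p;$$
thus by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
$$\|f\|_{p^*}\le C\,\|Df\|_p + C\liminf_{n\to+\infty}\|fD\varphi_n\|_p. \qquad (1.56)$$
Using Hölder's inequality (see Theorem A.2.27) (as $\frac{p}{p^*} + \frac{1}{(p^*/p)'} = 1$), the fact that $\varphi_n|_{B_n(0)}\equiv 1$ and since $p\big(\frac{p^*}{p}\big)' = N$ and $\sup_{n\ge 1}\|D\varphi_n\|_{\mathbb{R}^N}<+\infty$, for every $n\ge 1$, we have
$$\begin{aligned}
\|fD\varphi_n\|_p^p &= \int_{\mathbb{R}^N}|f(z)D\varphi_n(z)|^p\, dz = \int_{\{\|z\|_{\mathbb{R}^N}\ge n\}}|f(z)D\varphi_n(z)|^p\, dz \\
&\le \Big(\int_{\{\|z\|_{\mathbb{R}^N}\ge n\}}|f(z)|^{p^*}\, dz\Big)^{\frac{p}{p^*}}\Big(\int_{\mathbb{R}^N}\|D\varphi_n(z)\|_{\mathbb{R}^N}^{p\left(\frac{p^*}{p}\right)'}\, dz\Big)^{1-\frac{p}{p^*}} \\
&\le C_1\Big(\int_{\{\|z\|_{\mathbb{R}^N}\ge n\}}|f(z)|^{p^*}\, dz\Big)^{\frac{p}{p^*}},
\end{aligned}$$
for some $C_1>0$. Since $|f|^{p^*}\in L^1(\mathbb{R}^N)$, we have
$$\lim_{n\to+\infty}\|fD\varphi_n\|_p^p\le C_1\lim_{n\to+\infty}\Big(\int_{\{\|z\|_{\mathbb{R}^N}\ge n\}}|f(z)|^{p^*}\, dz\Big)^{\frac{p}{p^*}} = 0.$$
Using this in (1.56), we conclude that
$$\|f\|_{p^*}\le C\,\|Df\|_p.$$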
We use this inequality to establish that the p-capacity capp is an outer measure on RN .
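THEOREM 1.6.10 If $p\in[1,N)$, then $\mathrm{cap}_p$ is an outer measure on $\mathbb{R}^N$.

PROOF Clearly $\mathrm{cap}_p(\emptyset) = 0$ and $\mathrm{cap}_p$ is monotone (see Remark 1.6.2(e)). So it remains to show that if $\{C_n\}_{n\ge 1}$ is a sequence of subsets of $\mathbb{R}^N$ and $C = \bigcup_{n=1}^{\infty} C_n$, then
$$\mathrm{cap}_p(C)\le\sum_{n=1}^{\infty}\mathrm{cap}_p(C_n)$$
(see Definition 1.1.1). We assume that
$$\sum_{n=1}^{\infty}\mathrm{cap}_p(C_n)<+\infty$$
or otherwise the inequality is obvious. According to Definition 1.6.1(d), for a given $\varepsilon>0$, we can find $f_n\in K^p$, such that
$$C_n\subseteq\mathrm{int}\,\{f_n\ge 1\} \quad\text{and}\quad \|Df_n\|_p^p\le\mathrm{cap}_p(C_n) + \frac{\varepsilon}{2^n} \qquad \forall\, n\ge 1. \qquad (1.57)$$
Let $h = \sup_{n\ge 1} f_n$. Evidently $C\subseteq\mathrm{int}\,\{h\ge 1\}$. Also, using the monotone convergence theorem (see Theorem A.2.10), Proposition 1.6.9 and (1.57), we have
$$\begin{aligned}
\int_{\mathbb{R}^N} h(z)^{p^*}\, dz &= \int_{\mathbb{R}^N}\sup_{n\ge 1} f_n(z)^{p^*}\, dz \le \sum_{n=1}^{\infty}\int_{\mathbb{R}^N} f_n(z)^{p^*}\, dz \\
&\le C\sum_{n=1}^{\infty}\|Df_n\|_p^{p^*} \le C\sum_{n=1}^{\infty}\Big(\mathrm{cap}_p(C_n) + \frac{\varepsilon}{2^n}\Big)^{\frac{p^*}{p}} \\
&\le C_1\Big[\sum_{n=1}^{\infty}\Big(\mathrm{cap}_p(C_n) + \frac{\varepsilon}{2^n}\Big)\Big]^{\frac{p^*}{p}} < +\infty,
\end{aligned}$$
for some $C_1>0$. As $h\in L^{p^*}\big(\mathbb{R}^N\big)$ and $p<p^*$, we have $h\in L^p_{loc}(\mathbb{R}^N)$. Also if $u = \sup_{n\ge 1}\|Df_n\|_{\mathbb{R}^N}$, from (1.57), we have
$$\int_{\mathbb{R}^N}|u(z)|^p\, dz\le\sum_{n=1}^{\infty}\int_{\mathbb{R}^N}\|Df_n(z)\|^p_{\mathbb{R}^N}\, dz<+\infty,$$
so $u\in L^p(\mathbb{R}^N)$. By Proposition 1.6.4(b), we have that $h\in W^{1,p}_{loc}(\mathbb{R}^N)$ and $\|Dh(z)\|\le u(z)$ for almost all $z\in\mathbb{R}^N$. Therefore $Dh\in L^p\big(\mathbb{R}^N;\mathbb{R}^N\big)$ and so $h\in K^p$. By virtue of Definition 1.6.1(d), the monotone convergence theorem (see Theorem A.2.10) and (1.57), we have
$$\mathrm{cap}_p(C)\le\int_{\mathbb{R}^N}\|Dh(z)\|^p_{\mathbb{R}^N}\, dz\le\int_{\mathbb{R}^N} u(z)^p\, dz\le\sum_{n=1}^{\infty}\int_{\mathbb{R}^N}\|Df_n(z)\|^p_{\mathbb{R}^N}\, dz\le\sum_{n=1}^{\infty}\mathrm{cap}_p(C_n) + \varepsilon.$$
Let $\varepsilon\searrow 0$, to conclude that
$$\mathrm{cap}_p(C)\le\sum_{n=1}^{\infty}\mathrm{cap}_p(C_n).$$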
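In the next Theorem, we have collected the basic properties of the p-capacity $\mathrm{cap}_p$.

THEOREM 1.6.11 If $p\in[1,N)$ and $A, B\subseteq\mathbb{R}^N$, then
(a) $\mathrm{cap}_p(A) = \inf\big\{\mathrm{cap}_p(U) : A\subseteq U,\ U \text{ is open}\big\}$.
(b) $\mathrm{cap}_p(\xi A) = \xi^{N-p}\,\mathrm{cap}_p(A)$ for all $\xi>0$.
(c) $\mathrm{cap}_p\big(L(A)\big) = \mathrm{cap}_p(A)$ for every affine isometry $L\colon\mathbb{R}^N\to\mathbb{R}^N$.
(d) $\mathrm{cap}_p(A)\le C\mu^{(N-p)}(A)$ for some $C = C(N,p)>0$.
(e) $\lambda^N(A)\le C\big(\mathrm{cap}_p(A)\big)^{\frac{N}{N-p}}$ for some $C = C(N,p)>0$.
(f) $\mathrm{cap}_p(A\cap B) + \mathrm{cap}_p(A\cup B)\le\mathrm{cap}_p(A) + \mathrm{cap}_p(B)$.
(g) if $\{A_n\}_{n\ge 1}$ is an increasing sequence (i.e., $A_n\subseteq A_{n+1}$ for all $n\ge 1$), then
$$\lim_{n\to+\infty}\mathrm{cap}_p(A_n) = \mathrm{cap}_p\Big(\bigcup_{n=1}^{\infty} A_n\Big).$$
(h) if $\{A_n\}_{n\ge 1}$ is a decreasing sequence (i.e., $A_n\supseteq A_{n+1}$ for all $n\ge 1$) of compact sets in $\mathbb{R}^N$, then
$$\lim_{n\to+\infty}\mathrm{cap}_p(A_n) = \mathrm{cap}_p\Big(\bigcap_{n=1}^{\infty} A_n\Big).$$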
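PROOF (a) From the monotonicity of the p-capacity, we have
$$\mathrm{cap}_p(A)\le\inf_{\substack{A\subseteq U\\ U \text{ is open}}}\mathrm{cap}_p(U). \qquad (1.58)$$
For a given $\varepsilon>0$, we can find $f\in K^p$, such that $A\subseteq\mathrm{int}\,\{f\ge 1\}$ and
$$\|Df\|_p^p\le\mathrm{cap}_p(A) + \varepsilon. \qquad (1.59)$$
Let $U = \mathrm{int}\,\{f\ge 1\}$. Then from Definition 1.6.1(d) and (1.59), we have
$$\mathrm{cap}_p(U)\le\|Df\|_p^p\le\mathrm{cap}_p(A) + \varepsilon.$$
Since $U$ is open and contains $A$, letting $\varepsilon\searrow 0$ we obtain that
$$\inf_{\substack{A\subseteq U\\ U \text{ is open}}}\mathrm{cap}_p(U)\le\mathrm{cap}_p(A).$$
Combining this with (1.58), we conclude that
$$\mathrm{cap}_p(A) = \inf_{\substack{A\subseteq U\\ U \text{ is open}}}\mathrm{cap}_p(U).$$

(b) Let $\varepsilon>0$ be given. Then we can find $f\in K^p$, such that $A\subseteq\mathrm{int}\,\{f\ge 1\}$ and
$$\|Df\|_p^p\le\mathrm{cap}_p(A) + \varepsilon.$$
Let $\xi>0$ and $h(z) \stackrel{df}{=} f\big(\frac{z}{\xi}\big)$. We have $h\in K^p$ and $\xi A\subseteq\mathrm{int}\,\{h\ge 1\}$. So
$$\mathrm{cap}_p(\xi A)\le\|Dh\|_p^p = \xi^{N-p}\|Df\|_p^p\le\xi^{N-p}\big(\mathrm{cap}_p(A) + \varepsilon\big).$$
Let $\varepsilon\searrow 0$ to obtain
$$\mathrm{cap}_p(\xi A)\le\xi^{N-p}\,\mathrm{cap}_p(A). \qquad (1.60)$$
Using (1.60), we see that
$$\mathrm{cap}_p(A) = \mathrm{cap}_p\Big(\frac{1}{\xi}(\xi A)\Big)\le\frac{1}{\xi^{N-p}}\,\mathrm{cap}_p(\xi A),$$
so
$$\xi^{N-p}\,\mathrm{cap}_p(A)\le\mathrm{cap}_p(\xi A). \qquad (1.61)$$
From (1.60) and (1.61), we conclude that
$$\mathrm{cap}_p(\xi A) = \xi^{N-p}\,\mathrm{cap}_p(A) \qquad \forall\,\xi>0.$$

(c) The proof is similar to that of (b).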
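(d) Let $\delta>0$ and suppose that $A\subseteq\bigcup_{n=1}^{\infty}\overline{B}_{r_n}(x_n)$, with $2r_n<\delta$ for all $n\ge 1$. Since $\mathrm{cap}_p$ is an outer measure (see Theorem 1.6.10), using also (b) and (c) as $\overline{B}_{r_n}(x_n) = x_n + r_n\overline{B}_1(0)$, we have
$$\mathrm{cap}_p(A)\le\sum_{n=1}^{\infty}\mathrm{cap}_p\big(\overline{B}_{r_n}(x_n)\big) = \mathrm{cap}_p\big(\overline{B}_1(0)\big)\sum_{n=1}^{\infty} r_n^{N-p},$$
so
$$\mathrm{cap}_p(A)\le C\mu^{(N-p)}(A).$$

(e) Let $\varepsilon>0$ and select $f\in K^p$, such that $A\subseteq\mathrm{int}\,\{f\ge 1\}$ and
$$\|Df\|_p^p\le\mathrm{cap}_p(A) + \varepsilon. \qquad (1.62)$$
Using Proposition 1.6.9 and (1.62), we obtain
$$\lambda^N(A)^{\frac{1}{p^*}}\le\|f\|_{p^*}\le C\,\|Df\|_p\le C\big(\mathrm{cap}_p(A) + \varepsilon\big)^{\frac1p},$$
for some $C>0$ and so
$$\lambda^N(A)\le C\,\mathrm{cap}_p(A)^{\frac{p^*}{p}} = C\,\mathrm{cap}_p(A)^{\frac{N}{N-p}},$$
for some $C>0$.

(f) Let $\varepsilon>0$ and select $f,g\in K^p$, such that
$$A\subseteq\mathrm{int}\,\{f\ge 1\}, \qquad B\subseteq\mathrm{int}\,\{g\ge 1\}$$
and
$$\|Df\|_p^p\le\mathrm{cap}_p(A) + \varepsilon, \qquad \|Dg\|_p^p\le\mathrm{cap}_p(B) + \varepsilon. \qquad (1.63)$$
Let
$$h_0 \stackrel{df}{=} \min\{f,g\} \quad\text{and}\quad h_1 \stackrel{df}{=} \max\{f,g\}.$$
Using Proposition 1.6.4(a), we see that $h_0,h_1\in K^p$. Also, we have
$$\|Dh_0(z)\|^p_{\mathbb{R}^N} + \|Dh_1(z)\|^p_{\mathbb{R}^N} = \|Df(z)\|^p_{\mathbb{R}^N} + \|Dg(z)\|^p_{\mathbb{R}^N} \quad\text{for } \lambda^N\text{-a.a. } z\in\mathbb{R}^N \qquad (1.64)$$
and
$$A\cap B\subseteq\mathrm{int}\,\{h_0\ge 1\}, \qquad A\cup B\subseteq\mathrm{int}\,\{h_1\ge 1\}. \qquad (1.65)$$
Therefore from (1.65), (1.64), (1.63) and since $h_0,h_1\in K^p$, we obtain
$$\mathrm{cap}_p(A\cap B) + \mathrm{cap}_p(A\cup B)\le\|Dh_0\|_p^p + \|Dh_1\|_p^p = \|Df\|_p^p + \|Dg\|_p^p\le\mathrm{cap}_p(A) + \mathrm{cap}_p(B) + 2\varepsilon,$$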
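so $\mathrm{cap}_p(A\cap B) + \mathrm{cap}_p(A\cup B)\le\mathrm{cap}_p(A) + \mathrm{cap}_p(B)$.

(g) We do the proof for the case $p\in(1,N)$. For the case $p=1$ we refer to Federer & Ziemer (1972). By virtue of the monotonicity property, we have
$$\lim_{n\to+\infty}\mathrm{cap}_p(A_n)\le\mathrm{cap}_p\Big(\bigcup_{n=1}^{\infty} A_n\Big). \qquad (1.66)$$
Suppose that
$$\lim_{n\to+\infty}\mathrm{cap}_p(A_n)<+\infty,$$
as otherwise, by (1.66), there is nothing to prove. Let $\varepsilon>0$ and for every $n\ge 1$ let us select $f_n\in K^p$, such that
$$A_n\subseteq\mathrm{int}\,\{f_n\ge 1\} \quad\text{and}\quad \|Df_n\|_p^p\le\mathrm{cap}_p(A_n) + \frac{\varepsilon}{2^n}. \qquad (1.67)$$
Let us set $h_0 \stackrel{df}{=} 0$ and $h_k \stackrel{df}{=} \max_{1\le n\le k} f_n$. We know that $\{h_k\}_{k\ge 0}\subseteq K^p$, $h_k = \max\{f_k,h_{k-1}\}$ and $A_{k-1}\subseteq\mathrm{int}\,\big\{\min\{f_k,h_{k-1}\}\ge 1\big\}$. So, using (1.67), we have
$$\|Dh_k\|_p^p + \mathrm{cap}_p(A_{k-1})\le\big\|D\big(\max\{f_k,h_{k-1}\}\big)\big\|_p^p + \big\|D\big(\min\{f_k,h_{k-1}\}\big)\big\|_p^p = \|Df_k\|_p^p + \|Dh_{k-1}\|_p^p\le\mathrm{cap}_p(A_k) + \frac{\varepsilon}{2^k} + \|Dh_{k-1}\|_p^p,$$
so
$$\|Dh_k\|_p^p - \|Dh_{k-1}\|_p^p\le\mathrm{cap}_p(A_k) - \mathrm{cap}_p(A_{k-1}) + \frac{\varepsilon}{2^k}.$$
Adding and recalling that $h_0 = 0$, we obtain
$$\|Dh_k\|_p^p\le\mathrm{cap}_p(A_k) + \varepsilon \qquad \forall\, k\ge 1. \qquad (1.68)$$
Let $u \stackrel{df}{=} \lim_{k\to+\infty} h_k$. Evidently
$$\bigcup_{k=1}^{\infty} A_k\subseteq\mathrm{int}\,\{u\ge 1\} \qquad (1.69)$$
and so, by the monotone convergence theorem (see Theorem A.2.10), Proposition 1.6.9 and (1.68), we have
$$\|u\|_{p^*} = \lim_{k\to+\infty}\|h_k\|_{p^*}\le C\liminf_{k\to+\infty}\|Dh_k\|_p\le C\Big(\lim_{k\to+\infty}\mathrm{cap}_p(A_k) + \varepsilon\Big)^{\frac1p}. \qquad (1.70)$$
So the sequence $\{Dh_k\}_{k\ge 1}\subseteq L^p\big(\mathbb{R}^N;\mathbb{R}^N\big)$ is bounded. Hence, passing to a subsequence if necessary, we may assume that
$$Dh_k\xrightarrow{\ w\ } Du \quad\text{in } L^p\big(\mathbb{R}^N;\mathbb{R}^N\big)$$
(recall that we have assumed that $p>1$). Then from (1.70) and since
$$\|Du\|_p\le\liminf_{k\to+\infty}\|Dh_k\|_p, \qquad (1.71)$$
we infer that $u\in K^p$. Therefore, using (1.69), (1.71) and (1.68), we have
$$\mathrm{cap}_p\Big(\bigcup_{n=1}^{\infty} A_n\Big)\le\|Du\|_p^p\le\lim_{n\to+\infty}\mathrm{cap}_p(A_n) + \varepsilon,$$
and since $\varepsilon>0$ was arbitrary,
$$\mathrm{cap}_p\Big(\bigcup_{n=1}^{\infty} A_n\Big)\le\lim_{n\to+\infty}\mathrm{cap}_p(A_n). \qquad (1.72)$$
From (1.66) and (1.72), it follows that
$$\lim_{n\to+\infty}\mathrm{cap}_p(A_n) = \mathrm{cap}_p\Big(\bigcup_{n=1}^{\infty} A_n\Big).$$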
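(h) Note that due to the monotonicity property, we have
$$\lim_{n\to+\infty}\mathrm{cap}_p(A_n)\ge\mathrm{cap}_p\Big(\bigcap_{n=1}^{\infty} A_n\Big). \qquad (1.73)$$
Let $U$ be an open set such that $\bigcap_{n=1}^{\infty} A_n\subseteq U$. The set $\bigcap_{n=1}^{\infty} A_n$ is compact and so for some $n_0\ge 1$, we have that $A_{n_0}\subseteq U$, hence $A_n\subseteq U$ for all $n\ge n_0$. It follows that
$$\lim_{n\to+\infty}\mathrm{cap}_p(A_n)\le\mathrm{cap}_p(U)$$
and so, using also (a), we have
$$\lim_{n\to+\infty}\mathrm{cap}_p(A_n)\le\inf_{\substack{\bigcap A_n\subseteq U\\ U \text{ is open}}}\mathrm{cap}_p(U) = \mathrm{cap}_p\Big(\bigcap_{n=1}^{\infty} A_n\Big). \qquad (1.74)$$
From (1.73) and (1.74), we conclude that
$$\lim_{n\to+\infty}\mathrm{cap}_p(A_n) = \mathrm{cap}_p\Big(\bigcap_{n=1}^{\infty} A_n\Big).$$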
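The scaling identity $\|Dh\|_p^p = \xi^{N-p}\|Df\|_p^p$ for $h(z) = f(z/\xi)$, used in part (b), can be checked on a radial test function. The sketch below is an illustration under stated assumptions: the profile $\varphi(r) = \max\{0,\min\{1, 2-r\}\}$ is a hypothetical choice, the energy is reduced to a one-dimensional integral via the polar coordinates formula (1.55), and the closed form $2\pi^{N/2}/\Gamma(N/2)$ for the area of the unit sphere is assumed. It computes the energy of one admissible function, not the capacity itself.

```python
import math


def sphere_area(N):
    """Area of the unit sphere in R^N (standard closed form, assumed here)."""
    return 2.0 * math.pi ** (N / 2) / math.gamma(N / 2)


def profile_derivative(r):
    """Derivative of the radial profile phi(r) = max(0, min(1, 2 - r)) away from the kinks."""
    return -1.0 if 1.0 < r < 2.0 else 0.0


def energy(xi, N, p, n=30000):
    """Approximate ||Dh||_p^p for h(z) = phi(|z| / xi) using polar coordinates:
    area(sphere) * integral over [0, 3*xi] of |phi'(r/xi)/xi|^p * r^(N-1) dr."""
    step = 3.0 * xi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * step  # midpoints never coincide with the kinks at xi and 2*xi
        total += abs(profile_derivative(r / xi) / xi) ** p * r ** (N - 1)
    return sphere_area(N) * total * step


N, p, xi = 3, 2.0, 2.5
base = energy(1.0, N, p)
scaled = energy(xi, N, p)
ratio = scaled / base
```

With these values the ratio should be close to $\xi^{N-p} = 2.5$, and `base` should be close to $a(3)(2^3-1) = \frac{28\pi}{3}$, since $\|D\varphi\| = 1$ exactly on the annulus $1<\|z\|<2$.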
REMARK 1.6.12 The monotonicity of $\mathrm{cap}_p$ together with properties (g) and (h) in Theorem 1.6.11 imply that the set-function $A\mapsto\mathrm{cap}_p(A)$ is a "Choquet capacity" (see Definition A.2.37). Using Choquet's capacitability theorem (see Theorem A.2.39 or cf. Choquet (1955)), we can say that for all Souslin (analytic) subsets $A$ (see Definition A.2.29(b) and Remark A.2.30) of $\mathbb{R}^N$ (in particular then for all Borel sets $A$ of $\mathbb{R}^N$), we have
$$\mathrm{cap}_p(A) = \sup\big\{\mathrm{cap}_p(K) : K\subseteq A,\ K \text{ is compact}\big\}.$$

In Theorem 1.6.11(d) we obtained a first relation between p-capacity and Hausdorff measures, both of which measure small sets in $\mathbb{R}^N$. We can improve this result as follows.

THEOREM 1.6.13 Let $p\in(1,N)$ and $A\subseteq\mathbb{R}^N$.
(a) If $\mu^{(N-p)}(A)<+\infty$, then $\mathrm{cap}_p(A) = 0$.
(b) If $\mathrm{cap}_p(A) = 0$, then $\mu^{(s)}(A) = 0$ for all $s>N-p$.

PROOF (a) Clearly we may assume that $A\subseteq\mathbb{R}^N$ is compact.
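Claim 1. We can find $C = C(N,p,A)>0$, such that, if $V\subseteq\mathbb{R}^N$ is open with $A\subseteq V$, then we can find an open set $U\subseteq\mathbb{R}^N$ and $f\in K^p$, such that
$$A\subseteq U\subseteq\{f=1\}, \qquad \mathrm{supp}\, f\subseteq V \quad\text{and}\quad \|Df\|_p^p\le C.$$
Let $V\subseteq\mathbb{R}^N$ be an open set, such that $A\subseteq V$. Let us set
$$\delta \stackrel{df}{=} \frac{d(A,V^c)}{2}>0.$$
Because $A$ is compact and $\mu^{(N-p)}(A)<+\infty$, we can find $\{z_k\}_{k=1}^m\subseteq A$ and $\{r_k\}_{k=1}^m\subseteq\mathbb{R}_+\setminus\{0\}$, such that
$$2r_k<\delta, \qquad A\subseteq\bigcup_{k=1}^{m} B_{r_k}(z_k) \quad\text{and}\quad \sum_{k=1}^{m} r_k^{N-p}\le\mu^{(N-p)}(A) + 1. \qquad (1.75)$$
Let us set
$$U \stackrel{df}{=} \bigcup_{k=1}^{m} B_{r_k}(z_k)$$
and let $f_k\in K^p$ be defined by
$$f_k(z) \stackrel{df}{=} \begin{cases} 1 & \text{if } \|z-z_k\|_{\mathbb{R}^N}<r_k,\\[2pt] 2-\dfrac{\|z-z_k\|_{\mathbb{R}^N}}{r_k} & \text{if } r_k\le\|z-z_k\|_{\mathbb{R}^N}\le 2r_k,\\[4pt] 0 & \text{if } 2r_k<\|z-z_k\|_{\mathbb{R}^N}.\end{cases}$$
Using Proposition 1.6.4(a), we see that
$$\|Df_k\|_p^p\le Cr_k^{N-p} \qquad \forall\, k\in\{1,\ldots,m\}. \qquad (1.76)$$
Let us set $f \stackrel{df}{=} \max_{1\le k\le m} f_k$. Then $f\in K^p$, $U\subseteq\{f=1\}$, $\mathrm{supp}\, f\subseteq V$ and from (1.76) and (1.75), we have
$$\|Df\|_p^p\le\sum_{k=1}^{m}\|Df_k\|_p^p\le C\sum_{k=1}^{m} r_k^{N-p}\le C\big(\mu^{(N-p)}(A) + 1\big),$$
which proves Claim 1.

We use the claim inductively and produce a sequence $\{U_n\}_{n\ge 1}$ of open sets in $\mathbb{R}^N$ and functions $\{f_n\}_{n\ge 1}\subseteq K^p$, such that
$$A\subseteq U_{n+1}\subseteq U_n, \qquad \overline{U}_{n+1}\subseteq\mathrm{int}\,\{f_n=1\} \qquad \forall\, n\ge 1$$
and
$$\mathrm{supp}\, f_n\subseteq U_n, \qquad \|Df_n\|_p^p\le C \qquad \forall\, n\ge 1.$$
Let
$$S_m \stackrel{df}{=} \sum_{n=1}^{m}\frac{1}{n} \quad\text{and}\quad h_m \stackrel{df}{=} \frac{1}{S_m}\sum_{n=1}^{m}\frac{1}{n}\, f_n.$$
We have
$$h_m\in K^p \quad\text{and}\quad h_m\ge 1 \text{ on } U_{m+1}.$$
Also because $\mathrm{supp}\,\big(\|Df_n(\cdot)\|_{\mathbb{R}^N}\big)\subseteq U_n\setminus\overline{U}_{n+1}$ and $p>1$, we see that
$$\mathrm{cap}_p(A)\le\|Dh_m\|_p^p\le\frac{1}{S_m^p}\sum_{n=1}^{m}\frac{1}{n^p}\,\|Df_n\|_p^p\le\frac{C}{S_m^p}\sum_{n=1}^{m}\frac{1}{n^p}\to 0 \quad\text{as } m\to+\infty.$$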
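(b) Since $\operatorname{cap}_p(A) = 0$, for every $n \geq 1$ we can find $f_n \in K^p$, such that
$$A \subseteq \operatorname{int}\{f_n \geq 1\} \quad \text{and} \quad \|Df_n\|_p^p \leq \frac{1}{2^n}. \tag{1.77}$$
Let us set $h \stackrel{df}{=} \sum_{n=1}^\infty f_n$. From (1.77), we have
$$\|Dh\|_p \leq \sum_{n=1}^\infty \|Df_n\|_p < +\infty. \tag{1.78}$$
Using Proposition 1.6.9 and (1.78), we have that
$$\|h\|_{p^*} \leq \sum_{n=1}^\infty \|f_n\|_{p^*} \leq \sum_{n=1}^\infty C \|Df_n\|_p < +\infty,$$
so $h \in K^p$.

Observe that $A \subseteq \operatorname{int}\{h \geq m\}$ for all $m \geq 1$. Let $z_0 \in A$. For $r > 0$ small, we have $\overline{B}_r(z_0) \subseteq \operatorname{int}\{h \geq m\}$, hence
$$h_{z_0,r} \stackrel{df}{=} \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} h(z)\,dz \geq m,$$
which implies that
$$h_{z_0,r} \longrightarrow +\infty \quad \text{as } r \searrow 0. \tag{1.79}$$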
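Claim 2. For every $z_0 \in A$, we have
$$\limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|^p_{\mathbb{R}^N}\,dz = +\infty,$$
for any $s > N - p$.

To prove this claim, we proceed by contradiction. Let $z_0 \in A$ and suppose that
$$\limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|^p_{\mathbb{R}^N}\,dz < +\infty.$$
Then we can find $M < +\infty$, such that for all $r \in (0, 1]$, we have
$$\frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|^p_{\mathbb{R}^N}\,dz \leq M.$$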
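Invoking the Poincaré-Wirtinger inequality (see Theorem 1.6.8), we have
$$\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} \big|h(z) - h_{z_0,r}\big|^p\,dz \leq \frac{C r^p}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} \|Dh(z)\|^p_{\mathbb{R}^N}\,dz \leq C_1 r^\vartheta, \tag{1.80}$$
for some $C, C_1 > 0$, $\vartheta = s - (N - p)$ and for all $r \in (0, 1]$ (recall that $\lambda^N\big(\overline{B}_r(z_0)\big) = a(N) r^N$). Since
$$\lambda^N\big(\overline{B}_r(z_0)\big) = 2^N \lambda^N\big(\overline{B}_{r/2}(z_0)\big)$$
and using Jensen's inequality (see Theorem A.2.26) and (1.80), we have that
$$\big|h_{z_0,\frac{r}{2}} - h_{z_0,r}\big| = \frac{1}{\lambda^N(\overline{B}_{r/2}(z_0))} \left| \int_{\overline{B}_{r/2}(z_0)} \big(h(z) - h_{z_0,r}\big)\,dz \right| \leq \frac{2^N}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} \big|h(z) - h_{z_0,r}\big|\,dz$$
$$\leq 2^N \left( \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} \big|h(z) - h_{z_0,r}\big|^p\,dz \right)^{\frac{1}{p}} \leq C_2 r^{\frac{\vartheta}{p}}, \tag{1.81}$$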
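for some $C_2 > 0$. Therefore, using (1.81), for $k > i$, we have
$$\big|h_{z_0,\frac{1}{2^k}} - h_{z_0,\frac{1}{2^i}}\big| \leq \sum_{l=i+1}^k \big|h_{z_0,\frac{1}{2^l}} - h_{z_0,\frac{1}{2^{l-1}}}\big| \leq C_2 \sum_{l=i+1}^k \left(\frac{1}{2^{l-1}}\right)^{\frac{\vartheta}{p}},$$
so $\big\{h_{z_0,\frac{1}{2^n}}\big\}_{n \geq 1}$ is a Cauchy sequence and this contradicts the fact that $h_{z_0,\frac{1}{2^n}} \longrightarrow +\infty$ (see (1.79)). This proves Claim 2.

Then we have
$$A \subseteq \left\{ z_0 \in \mathbb{R}^N : \limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|^p_{\mathbb{R}^N}\,dz = +\infty \right\} \subseteq \left\{ z_0 \in \mathbb{R}^N : \limsup_{r \searrow 0} \frac{1}{r^s} \int_{\overline{B}_r(z_0)} \|Dh(z)\|^p_{\mathbb{R}^N}\,dz > 0 \right\} = C_s.$$
But from Theorem 1.4.9, we have that $\mu^{(s)}(C_s) = 0$, hence $\mu^{(s)}(A) = 0$.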
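The exponent bookkeeping behind (1.80) and the Cauchy property in Claim 2 can be checked as follows. This is a sketch that assumes the Poincaré-Wirtinger inequality on a ball of radius $r$ carries the usual scaling factor $r^p$, and it introduces the abbreviation $q = 2^{-\vartheta/p}$ purely for illustration.

```latex
% Exponent in (1.80): combine lambda^N(ball) = a(N) r^N with the bound M r^s.
\[
  \frac{C r^p}{a(N)\, r^N}\, \int_{\overline{B}_r(z_0)} \|Dh(z)\|^p_{\mathbb{R}^N}\,dz
  \ \leq\ \frac{C M}{a(N)}\, r^{\,p - N + s}
  = C_1\, r^{\vartheta},
  \qquad \vartheta = s - (N-p) > 0 .
\]
% Cauchy property: a geometric tail with ratio q in (0,1).
\[
  q \stackrel{df}{=} 2^{-\vartheta/p} \in (0,1),
  \qquad
  C_2 \sum_{l=i+1}^{k} \Big(\frac{1}{2^{\,l-1}}\Big)^{\vartheta/p}
  = C_2 \sum_{j=i}^{k-1} q^{\,j}
  \ \leq\ \frac{C_2\, q^{\,i}}{1-q} \longrightarrow 0
  \quad \text{as } i \to +\infty .
\]
```

The strict inequality $s > N - p$ is exactly what makes $\vartheta$ positive, and hence $q < 1$; for $s = N - p$ the argument gives no decay.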
REMARK 1.6.14 If $p = 1$ and $A \subseteq \mathbb{R}^N$, it can be shown that $\operatorname{cap}_1(A) = 0$ if and only if $\mu^{(1)}(A) = 0$. The proof of this result, which uses functions of bounded variation and the isoperimetric inequality, can be found in Evans & Gariepy (1992, p. 193).

PROPOSITION 1.6.15 If $T \subseteq (0, 1)$ is such that $\lambda^1(T) > 0$, $p \in [N - 1, N)$, $A \subseteq \overline{B}_1(0) \subseteq \mathbb{R}^N$ and for each $r \in T$ there exists a unique $z_r \in \partial B_r(0)$, such that $z_r \in A$, then $\operatorname{cap}_p(A) > 0$.

PROOF Let $f : \mathbb{R}^N \longrightarrow \mathbb{R}$ be defined by
$$f(z) \stackrel{df}{=} \|z\|_{\mathbb{R}^N} = \left( \sum_{n=1}^N z_n^2 \right)^{\frac{1}{2}} \quad \forall\, z = (z_n)_{n=1}^N \in \mathbb{R}^N.$$
Evidently $f$ is Lipschitz continuous with $\operatorname{Lip}(f) = 1$. So by Proposition 1.3.25, we have
$$\mu^{(1)}\big(f(A)\big) \leq \mu^{(1)}(A).$$
Note that $T = f(A)$. So by virtue of Theorem 1.3.21, we have that
$$0 < \lambda^1(T) = \mu^{(1)}\big(f(A)\big) \leq \mu^{(1)}(A).$$
If for some $p \in [N - 1, N)$, we have that $\operatorname{cap}_p(A) = 0$, then from Theorem 1.6.13(b) (see also Remark 1.6.14), we have $\mu^{(1)}(A) = 0$ (note that $1 \geq N - p$), a contradiction. So
$$\operatorname{cap}_p(A) > 0 \quad \forall\, p \in [N - 1, N).$$
The next result provides a kind of Chebyshev inequality in terms of $p$-capacities.

PROPOSITION 1.6.16 If $p \in [1, N)$, $f \in K^p$, $\varepsilon > 0$ and
$$A \stackrel{df}{=} \left\{ z_0 \in \mathbb{R}^N : \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} f(z)\,dz > \varepsilon \ \text{for some } r > 0 \right\},$$
then there exists a constant $C = C(N, p)$, such that
$$\operatorname{cap}_p(A) \leq \frac{C}{\varepsilon^p} \|Df\|_p^p.$$
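PROOF First, we show that the set $A \subseteq \mathbb{R}^N$ is open. Let $z_0 \in A$. Then for some $r > 0$ and $\xi > 0$, we have
$$\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} f(z)\,dz = \varepsilon + \xi.$$
Since $f \in L^1_{loc}\big(\mathbb{R}^N\big)$, exploiting the absolute continuity of the Lebesgue integral, we can find $\vartheta > 0$ small enough so that if $\lambda^N(B) < \vartheta$, then
$$\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_B f(z)\,dz < \xi.$$
Also let $\delta > 0$ be such that
$$\lambda^N\big(\overline{B}_r(z) \,\triangle\, \overline{B}_r(z_0)\big) < \vartheta \quad \forall\, \|z - z_0\|_{\mathbb{R}^N} < \delta,$$
where $X \triangle Y \stackrel{df}{=} (X \setminus Y) \cup (Y \setminus X)$ is the symmetric difference of $X$ and $Y$. So, if $\|z - z_0\|_{\mathbb{R}^N} < \delta$, we have
$$\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z)} f(y)\,dy = \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} f(y)\,dy + \frac{1}{\lambda^N(\overline{B}_r(z_0))} \left[ \int_{\overline{B}_r(z)} f(y)\,dy - \int_{\overline{B}_r(z_0)} f(y)\,dy \right]$$
$$\geq \varepsilon + \xi - \frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z) \triangle \overline{B}_r(z_0)} f(y)\,dy > \varepsilon + \xi - \xi = \varepsilon,$$
so $z \in A$ and we infer that $A$ is open.

Next let $z_0 \in A$ and let $r > 0$ be such that
$$\frac{1}{\lambda^N(\overline{B}_r(z_0))} \int_{\overline{B}_r(z_0)} f(y)\,dy > \varepsilon.$$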
Then, by Jensen's inequality (see Theorem A.2.26), we have
$$a(N) r^N \varepsilon < \int_{\overline{B}_r(z_0)} f(y)\,dy \leq \big(a(N) r^N\big)^{1 - \frac{1}{p^*}} \left( \int_{\overline{B}_r(z_0)} f(y)^{p^*}\,dy \right)^{\frac{1}{p^*}} \leq \big(a(N) r^N\big)^{1 - \frac{1}{p^*}} \|f\|_{p^*},$$
so $r \leq C_1$ for some $C_1 > 0$ independent of $z_0$.
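The bound $r \leq C_1$ follows by solving the last inequality for $r$. As a sketch (the explicit value of $C_1$ below is one admissible choice, written out only for illustration):

```latex
\[
  a(N)\, r^N \varepsilon < \big(a(N)\, r^N\big)^{1-\frac{1}{p^*}} \|f\|_{p^*}
  \ \Longrightarrow\
  \big(a(N)\, r^N\big)^{\frac{1}{p^*}} < \frac{\|f\|_{p^*}}{\varepsilon}
  \ \Longrightarrow\
  r < a(N)^{-\frac{1}{N}} \Big(\frac{\|f\|_{p^*}}{\varepsilon}\Big)^{\frac{p^*}{N}}
  \stackrel{df}{=} C_1 .
\]
```

The constant depends on $N$, $p$, $\varepsilon$ and $\|f\|_{p^*}$ but not on the center $z_0$, which is what allows a covering theorem for balls of bounded radius to be applied next.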
Invoking the Besicovitch covering theorem (see Theorem 1.2.18), we can find a positive integer $k = k(N) \geq 1$ and countable collections $\{T_n\}_{n=1}^k$ of closed balls, such that
$$A \subseteq \bigcup_{n=1}^k \bigcup_{B \in T_n} B$$
and
$$\frac{1}{\lambda^N(B)} \int_B f(y)\,dy > \varepsilon \quad \forall\, B \in \bigcup_{n=1}^k T_n. \tag{1.82}$$
Let $\big\{B_i^{(n)}\big\}_{i \geq 1}$ be an enumeration of the elements in the countable collection $T_n$ for $n = 1, \ldots, k$. Using Proposition 1.6.4(a), we can check that
$$\left( \frac{1}{\lambda^N(B_i^{(n)})} \int_{B_i^{(n)}} f(y)\,dy - f \right)^+ \in W^{1,p}\big(B_i^{(n)}\big).$$
Then Poincaré's inequality (see Theorem 1.6.8) implies that
$$\left\| \left( \frac{1}{\lambda^N(B_i^{(n)})} \int_{B_i^{(n)}} f(y)\,dy - f \right)^+ \right\|_{W^{1,p}(B_i^{(n)})} \leq C_2 \|Df\|_{L^p(B_i^{(n)};\mathbb{R}^N)},$$
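for some $C_2 > 0$. Invoking Proposition 1.6.5, we can find $g_i^{(n)} \in W^{1,p}(\mathbb{R}^N)$, such that
$$g_i^{(n)} \geq 0, \quad g_i^{(n)}(z) = \left( \frac{1}{\lambda^N(B_i^{(n)})} \int_{B_i^{(n)}} f(y)\,dy - f \right)^+(z) \quad \text{for a.a. } z \in B_i^{(n)}$$
and
$$\big\|g_i^{(n)}\big\|_{W^{1,p}(\mathbb{R}^N)} \leq C_3 \|Df\|_{L^p(B_i^{(n)};\mathbb{R}^N)},$$
for some $C_3 = C_3(N, p) > 0$. From (1.82), we have
$$f + g_i^{(n)} \geq \frac{1}{\lambda^N(B_i^{(n)})} \int_{B_i^{(n)}} f(y)\,dy > \varepsilon \quad \text{for a.a. } z \in B_i^{(n)}.$$
Also if
$$g \stackrel{df}{=} \sup_{\substack{i \geq 1 \\ n = 1, \ldots, k}} g_i^{(n)}, \tag{1.83}$$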
we have $g \geq 0$. We claim that $g \in K^p$. To this end, using (1.83), note that
$$\int_{\mathbb{R}^N} \sup_{\substack{i \geq 1 \\ n = 1, \ldots, k}} \big|g_i^{(n)}(y)\big|^p\,dy \leq \sum_{n=1}^k \sum_{i=1}^\infty \int_{\mathbb{R}^N} \big|g_i^{(n)}(y)\big|^p\,dy \leq C_3^p \sum_{n=1}^k \sum_{i=1}^\infty \int_{B_i^{(n)}} \|Df(y)\|^p_{\mathbb{R}^N}\,dy \leq k C_3^p \|Df\|_p^p,$$
so $g \in L^p(\mathbb{R}^N)$. Also, we have
$$\int_{\mathbb{R}^N} \sup_{\substack{i \geq 1 \\ n = 1, \ldots, k}} \big\|Dg_i^{(n)}(y)\big\|^p_{\mathbb{R}^N}\,dy \leq \sum_{n=1}^k \sum_{i=1}^\infty \big\|Dg_i^{(n)}\big\|_p^p \leq C_3^p \sum_{n=1}^k \sum_{i=1}^\infty \int_{B_i^{(n)}} \|Df(y)\|^p_{\mathbb{R}^N}\,dy \leq k C_3^p \|Df\|_p^p. \tag{1.84}$$
From Proposition 1.6.4(b), we infer that
$$\|Dg(y)\|_{\mathbb{R}^N} \leq \sup_{\substack{i \geq 1 \\ n = 1, \ldots, k}} \big\|Dg_i^{(n)}(y)\big\|_{\mathbb{R}^N} \quad \text{for a.a. } y \in \mathbb{R}^N,$$
so by (1.84), we have that $Dg \in L^p\big(\mathbb{R}^N; \mathbb{R}^N\big)$ and thus $g \in K^p$. Because $f + g \geq \varepsilon$ almost everywhere on $A$ and $A$ is open, using also (1.84), it follows that
$$\operatorname{cap}_p(A) \leq \int_{\mathbb{R}^N} \left\| \frac{1}{\varepsilon} D(f + g)(y) \right\|^p_{\mathbb{R}^N} dy \leq \frac{C_4}{\varepsilon^p} \Big( \|Df\|_p^p + \|Dg\|_p^p \Big) \leq \frac{C_5}{\varepsilon^p} \|Df\|_p^p,$$
for some $C_4, C_5 = C_5(N, p) > 0$. Setting $C = C_5$, we obtain the result.

DEFINITION 1.6.17 A function $f : \mathbb{R}^N \longrightarrow \mathbb{R}$ is said to be $p$-quasicontinuous, if for each $\varepsilon > 0$, we can find an open set $U \subseteq \mathbb{R}^N$, such that $\operatorname{cap}_p(U) < \varepsilon$ and $f|_{\mathbb{R}^N \setminus U}$ is continuous.

We have all the necessary tools to prove the following “differentiability” result for Sobolev functions. As we already said, a systematic study of Sobolev functions and of their differentiability properties will be conducted in Chapter 2. Here we state a result which says that, up to a set $A$ of $p$-capacity zero, a function $f \in W^{1,p}(\mathbb{R}^N)$ can be represented by a $p$-quasicontinuous function.
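THEOREM 1.6.18 If $p \in [1, N)$ and $f \in W^{1,p}(\mathbb{R}^N)$, then there exists a $p$-quasicontinuous function $f^* : \mathbb{R}^N \longrightarrow \mathbb{R}$, such that
(a) there exists a Borel set $A \subseteq \mathbb{R}^N$ with $\operatorname{cap}_p(A) = 0$, such that
$$\lim_{r \to 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} f(y)\,dy = f^*(z) \quad \forall\, z \in \mathbb{R}^N \setminus A;$$
(b) for each $z \in \mathbb{R}^N \setminus A$, we have
$$\lim_{r \to 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \big|f(y) - f^*(z)\big|^{p^*}\,dy = 0.$$

PROOF (a) Let
$$C \stackrel{df}{=} \left\{ z \in \mathbb{R}^N : \limsup_{r \to 0} \frac{1}{r^{N-p}} \int_{\overline{B}_r(z)} \|Df(y)\|^p_{\mathbb{R}^N}\,dy > 0 \right\}.$$
From Theorem 1.4.9, we have that $\mu^{(N-p)}(C) = 0$ and this by Theorem 1.6.13(a) implies that $\operatorname{cap}_p(C) = 0$. Moreover, from the Poincaré-Wirtinger inequality (see Theorem 1.6.8), we see that
$$\lim_{r \to 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \big|f(y) - \overline{f}_{z,r}\big|^{p^*}\,dy = 0 \quad \forall\, z \in \mathbb{R}^N \setminus C. \tag{1.85}$$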
By virtue of Proposition 1.6.3, we can find a sequence $\{f_n\}_{n \geq 1} \subseteq W^{1,p}(\mathbb{R}^N) \cap C^\infty(\mathbb{R}^N)$, such that
$$\|Df - Df_n\|_p^p \leq \frac{1}{2^{(p+1)n}} \quad \forall\, n \geq 1. \tag{1.86}$$
For $n \geq 1$, we introduce the sets
$$E_n \stackrel{df}{=} \left\{ z \in \mathbb{R}^N : \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} |f(y) - f_n(y)|\,dy > \frac{1}{2^n} \ \text{for some } r > 0 \right\}.$$
From Proposition 1.6.16 and (1.86), we have that
$$\frac{\operatorname{cap}_p(E_n)}{2^{pn}} \leq c\,\|Df - Df_n\|_p^p \leq \frac{c}{2^{(p+1)n}},$$
for some $c = c(N, p) > 0$, so
$$\operatorname{cap}_p(E_n) \leq \frac{c}{2^n}. \tag{1.87}$$
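The arithmetic behind (1.87), and the reason the choice of exponent $(p+1)n$ in (1.86) is convenient, can be spelled out as follows. This is a sketch: Proposition 1.6.16 is applied to $|f - f_n|$ with $\varepsilon = 2^{-n}$, and $c$ stands for the constant of that proposition.

```latex
\[
  \operatorname{cap}_p(E_n)
  \ \leq\ \frac{c}{(2^{-n})^p}\, \|Df - Df_n\|_p^p
  \ \leq\ c\, 2^{pn} \cdot 2^{-(p+1)n}
  = \frac{c}{2^{n}},
\]
\[
  \sum_{n=k}^{\infty} \operatorname{cap}_p(E_n)
  \ \leq\ c \sum_{n=k}^{\infty} \frac{1}{2^{n}}
  = \frac{c}{2^{\,k-1}} \longrightarrow 0
  \quad \text{as } k \to +\infty .
\]
```

The extra factor $2^{-n}$ gained in (1.86) is what makes the capacities summable, so that the tails of the series can be made arbitrarily small.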
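Moreover, we have
$$\big|\overline{f}_{z,r} - f_n(z)\big| \leq \frac{1}{\lambda^N(B_r(z))} \left\{ \int_{B_r(z)} \big|f(y) - \overline{f}_{z,r}\big|\,dy + \int_{B_r(z)} |f(y) - f_n(y)|\,dy + \int_{B_r(z)} |f_n(y) - f_n(z)|\,dy \right\},$$
so from (1.85), we have
$$\limsup_{r \to 0} \big|\overline{f}_{z,r} - f_n(z)\big| \leq \frac{1}{2^n} \quad \forall\, z \notin C \cup E_n. \tag{1.88}$$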
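We set
$$A_k \stackrel{df}{=} C \cup \left( \bigcup_{n=k}^\infty E_n \right) \quad \forall\, k \geq 1.$$
Evidently $A_k$ is Borel for $k \geq 1$ and we have that
$$\operatorname{cap}_p(A_k) \leq \operatorname{cap}_p(C) + \sum_{n=k}^\infty \operatorname{cap}_p(E_n) \leq c \sum_{n=k}^\infty \frac{1}{2^n}. \tag{1.89}$$
Then, if
$$A \stackrel{df}{=} \bigcap_{k=1}^\infty A_k,$$
from (1.89), we have that
$$\operatorname{cap}_p(A) \leq \lim_{k \to +\infty} \operatorname{cap}_p(A_k) = 0.$$
Note that, if $z \in \mathbb{R}^N \setminus A_k$ and $n, m \geq k$, from (1.88), we have
$$\big|f_n(z) - f_m(z)\big| \leq \limsup_{r \to 0} \big|\overline{f}_{z,r} - f_n(z)\big| + \limsup_{r \to 0} \big|\overline{f}_{z,r} - f_m(z)\big| \leq \frac{1}{2^n} + \frac{1}{2^m},$$
so the sequence $\{f_n\}_{n \geq 1}$ converges uniformly on $\mathbb{R}^N \setminus A_k$ to some $h \in C\big(\mathbb{R}^N\big)$. Also we have
$$\limsup_{r \to 0} \big|h(z) - \overline{f}_{z,r}\big| \leq |h(z) - f_n(z)| + \limsup_{r \to 0} \big|f_n(z) - \overline{f}_{z,r}\big|,$$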
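so, from (1.88), we have
$$h(z) = \lim_{r \to 0} \overline{f}_{z,r} = f^*(z) \quad \forall\, z \in \mathbb{R}^N \setminus A_k,\ k \geq 1,$$
hence
$$f^*(z) = \lim_{r \to 0} \overline{f}_{z,r} \quad \forall\, z \in \mathbb{R}^N \setminus A.$$
We need to show that $f^*$ is $p$-quasicontinuous. For this purpose let $\varepsilon > 0$ be given. We choose $k \geq 1$, such that $\operatorname{cap}_p(A_k) < \varepsilon$, and then an open set $U \supseteq A_k$ with $\operatorname{cap}_p(U) < \varepsilon$. Since the functions $f_n$ are continuous and $\{f_n\}_{n \geq 1}$ converges to $f^*$ uniformly on $\mathbb{R}^N \setminus U$, we infer that the function $f^*|_{\mathbb{R}^N \setminus U}$ is continuous, hence $f^*$ is $p$-quasicontinuous.

(b) Note that $C \subseteq A$ and so from (1.85), we see that for all $z \in \mathbb{R}^N \setminus A$, we have that
$$\lim_{r \to 0} \left( \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} |f(y) - f^*(z)|^{p^*}\,dy \right)^{\frac{1}{p^*}} \leq \lim_{r \to 0} \big|\overline{f}_{z,r} - f^*(z)\big| + \lim_{r \to 0} \left( \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \big|f(y) - \overline{f}_{z,r}\big|^{p^*}\,dy \right)^{\frac{1}{p^*}} = 0.$$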
REMARK 1.6.19 By virtue of Theorem 1.6.13(b), we have that
$$\mu^{(s)}(A) = 0 \quad \forall\, s > N - p.$$
Hence $\dim A \leq N - p$. On the other hand this last inequality does not necessarily imply that $\mu^{(N-p)}(A) < +\infty$, which in turn would give us that $\operatorname{cap}_p(A) = 0$ (see Theorem 1.6.13(a)). Therefore the conclusion $\operatorname{cap}_p(A) = 0$ in the statement of Theorem 1.6.18 is stronger than the dimensionality condition $\dim A \leq N - p$.
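A concrete instance may help to fix the roles of the two parts of Theorem 1.6.13. The sets $A_0$ and $S_0$ below are hypothetical examples chosen for illustration, in the case $N = 3$, $p = 2$, so that $N - p = 1$.

```latex
\[
  A_0 \stackrel{df}{=} [0,1] \times \{0\} \times \{0\} \subseteq \mathbb{R}^3 :
  \qquad
  \mu^{(1)}(A_0) < +\infty
  \ \overset{\text{Thm 1.6.13(a)}}{\Longrightarrow}\
  \operatorname{cap}_2(A_0) = 0 ,
\]
\[
  S_0 \stackrel{df}{=} [0,1] \times [0,1] \times \{0\} \subseteq \mathbb{R}^3 :
  \qquad
  \mu^{(2)}(S_0) > 0 \ \text{ with } 2 > N - p = 1
  \ \overset{\text{Thm 1.6.13(b)}}{\Longrightarrow}\
  \operatorname{cap}_2(S_0) > 0 .
\]
```

So a line segment is invisible to the $2$-capacity in $\mathbb{R}^3$, while a piece of a plane is not; the second implication is the contrapositive of part (b).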
1.7 Remarks
1.1: The necessary measure theoretic background is standard and can be found in any book on abstract measure theory. We mention the books of Ash (1972), Denkowski, Migórski & Papageorgiou (2003a, Chapter 2), Dudley (1989), Hewitt & Stromberg (1975) and Royden (1968). Outer measures were first introduced by Carathéodory (1914), who also gave the definition of a $\mu$-measurable set (see Definition 1.1.3) and proved that $\Sigma(\mu)$ is a $\sigma$-field and the outer measure $\mu$ restricted to $\Sigma(\mu)$ is a measure. For a proof of Proposition 1.1.10 we refer to Denkowski, Migórski & Papageorgiou (2003a, p. 184-186) and Evans & Gariepy (1992, p. 6-9).

1.2: In some books a Vitali cover (see Definition 1.2.2) is called a “fine cover” (see, e.g., Evans & Gariepy (1992)). The original version of Theorem 1.2.5 (Vitali covering theorem) is due to Vitali (1908), who employed closed cubes. The first to study the differentiability of monotone functions (or more generally of functions of bounded variation, and in particular of absolutely continuous functions) was Lebesgue (1904, 1910). Evans & Gariepy (1992, p. 30), Hardt (1979), Simon (1983) and Ziemer (1989) contain the proof of Theorem 1.2.18. For the proof of Theorem 1.2.19 see Evans & Gariepy (1992, p. 35). In general covering theorems are useful in harmonic analysis and in geometric measure theory. More about them can be found in de Guzman (1975).

1.3: Carathéodory (1914), working with outer measures, was the first to introduce “Hausdorff measures.” More precisely, he introduced “1-dimensional” (or “linear”) measures in $\mathbb{R}^N$ and also indicated that similarly one can define $k$-dimensional measures in $\mathbb{R}^N$ for any integer $k \geq 1$. Hausdorff (1919) realized that Carathéodory's definition can be used also for noninteger $s > 0$. He then went on to show that Cantor's ternary set has fractional dimension $s = \frac{\ln 2}{\ln 3}$.
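The value $\frac{\ln 2}{\ln 3}$ can be recovered by a short scaling heuristic. The sketch below assumes that the $s$-dimensional Hausdorff measure of the Cantor set $K$ is positive and finite for the critical $s$ (a fact that requires a separate proof) and uses only the scaling property of $\mu^{(s)}$ under similarities; the symbol $K$ is introduced here for illustration.

```latex
% K is the union of two disjoint copies of itself scaled by 1/3.
\[
  K = \tfrac{1}{3} K \ \cup\ \big( \tfrac{1}{3} K + \tfrac{2}{3} \big),
  \qquad
  \mu^{(s)}\big( \tfrac{1}{3} K \big) = 3^{-s}\, \mu^{(s)}(K),
\]
\[
  \mu^{(s)}(K) = 2 \cdot 3^{-s}\, \mu^{(s)}(K)
  \ \Longrightarrow\
  2 \cdot 3^{-s} = 1
  \ \Longrightarrow\
  s = \frac{\ln 2}{\ln 3} \approx 0.6309 .
\]
```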
An extension of the theory can be achieved by replacing $\delta(A_n)$ in the definition of the Hausdorff measure by $\xi(A_n)$, where $\xi : 2^X \longrightarrow \mathbb{R}_+$ is any premeasure, i.e., $\xi(\emptyset) = 0$ and if $U \subseteq V$ then $\xi(U) \leq \xi(V)$ (monotonicity). Of special interest are premeasures resulting from Hausdorff functions $h$. Namely we consider a function $h : \mathbb{R}_+ \longrightarrow \mathbb{R}_+$ satisfying:
(a) $h(t) > 0$ for all $t > 0$,
(b) if $t \leq s$, then $h(t) \leq h(s)$,
(c) $h$ is right continuous at every $t \geq 0$.
Such a function is called a Hausdorff function. For such a function and a positive constant $\vartheta$, we define a premeasure $\xi$ on the metric space $X$, by
$$\xi(A) \stackrel{df}{=} \begin{cases} \min\big\{ h\big(\delta(A)\big), h(\vartheta) \big\} & \text{if } A \neq \emptyset, \\ 0 & \text{if } A = \emptyset. \end{cases}$$
Then $\xi$ is the premeasure defined by $h$ and the cut-off level $\vartheta$. For more details about this generalization, we refer to Davies (1970) and Davies & Samuels (1974). In the presentation of the isodiametric inequality (see Theorem 1.3.20) and of the fact that $\mu^{(N)}$ is a multiple of the Lebesgue measure $\lambda^N$, we follow Evans & Gariepy (1992, Chapter 3). We refer also to Falconer (1985, Section 1.6), Federer (1969, Section 2.10.33) and Hardt (1979). It would be a grave omission not to mention the fundamental contributions to the field of Hausdorff measures made by Besicovitch. We mention the works of Besicovitch (1945, 1946) related to Theorem 1.2.18 (the covering theorem bearing his name). A more complete list of the works of Besicovitch can be found in the book of Falconer (1985).

1.4: The intuitive meaning of Theorem 1.4.1 is that for $\lambda^N$-almost all $x \in A$, small balls centered at $x$ consist predominantly of points of $A$. Theorems 1.4.1 and 1.4.6 are due to Lebesgue (1910). They were generalized by Besicovitch (1945, 1946), who replaced the Lebesgue measure $\lambda^N$ by a Radon measure on $\mathbb{R}^N$. Another source of information for the differentiation of measures in $\mathbb{R}^N$ is the book of Widom (1969).

1.5: Theorem 1.5.4 was originally proved by McShane (1934), who produced the minimal Lipschitz extension of $f$. Theorem 1.5.8 was originally proved by Rademacher (1919). The proof that is given here is essentially due to Morrey (1966, Theorem 3.1.6). It can be found also in Evans & Gariepy (1992), Simon (1983) and Ziemer (1989). If we employ the notion of Haar-null set, we can also have an extension of Rademacher's theorem to locally Lipschitz functions between Banach spaces.

DEFINITION 1.7.1 Let $(G, +)$ be an abelian Polish group and $d$ an invariant metric on $G$ compatible with the topology (therefore automatically complete). A universally measurable set $A \subseteq G$ is a Haar-null set, if there exists a probability measure $\mu$ on $G$ (not unique), such that $\chi_A \star \mu = 0$, where $\chi_A$ is the characteristic function of the set $A$ and
$$\big(\chi_A \star \mu\big)(x) = \int_G \chi_A(x + y)\,d\mu(y).$$

REMARK 1.7.2 The above definition is equivalent to the requirement that every translate of the set $A$ is a zero set for the measure $\mu$. The measure $\mu$ is usually called a test measure.

The next theorem is the extension of Theorem 1.5.8 to functions between Banach spaces. It will be proved in Section 4.3 (see Theorem 4.3.17).

THEOREM 1.7.3 If $X$ is a separable Banach space, $Y$ is a Banach space with the RNP (see Section 2.1) and $f : X \longrightarrow Y$ is locally Lipschitz, then there exists a universally measurable set $D \subseteq X$, such that $X \setminus D$ is Haar-null and $f|_D$ is differentiable in the sense of Gâteaux.
Our derivation of the area formula (see Theorem 1.5.21) is based on Evans & Gariepy (1992, Section 3.3) (see also Federer (1969, Section 3.2) and Hardt (1979)). Lipschitz continuous functions and their properties are discussed in Federer (1969, Section 3.3). If $N = M$, from the two change of variables results (see Theorem 1.5.23 and Theorem 1.5.26), we obtain the following change of variables formula.

THEOREM 1.7.4 If $U, V \subseteq \mathbb{R}^N$ are open sets, $f : U \longrightarrow V$ is a locally Lipschitz homeomorphism and $u \in L^1(V)$, then
$$v = (u \circ f)\,\big|\det J_f\big| \in L^1(U)$$
and
$$\int_U u\big(f(x)\big)\,\big|\det J_f(x)\big|\,d\lambda^N(x) = \int_V u(y)\,d\lambda^N(y).$$
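As a sanity check of the formula, consider a one-dimensional example. The particular choices $U = (0,1)$, $V = (0,2)$, $f(x) = 2x$ and $u(y) = y$ are hypothetical and serve only as an illustration.

```latex
\[
  N = 1, \quad U = (0,1), \quad V = (0,2), \quad f(x) = 2x, \quad J_f(x) = 2, \quad u(y) = y,
\]
\[
  \int_U u\big(f(x)\big)\, \big|\det J_f(x)\big| \, d\lambda^1(x)
  = \int_0^1 2x \cdot 2 \, dx = 2,
  \qquad
  \int_V u(y)\, d\lambda^1(y) = \int_0^2 y \, dy = 2 .
\]
```

Here $f$ is a Lipschitz homeomorphism of $U$ onto $V$, so the hypotheses of the theorem hold, and both sides agree.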
The proof of Theorem 1.5.25 can be found in Evans & Gariepy (1992, Section 3.4) or Federer (1969, Subsection 3.2.11). 1.6: Our treatment of capacity follows Evans & Gariepy (1992, Section 4.7) (see also Federer & Ziemer (1972)). There are other notions of capacity as for example Bessel capacity, Riesz capacity, etc., which are discussed in Stein (1970) and Ziemer (1989). The abstract theory of capacities in Banach spaces can be found in Fowler (1973). Moreover, for the use of capacities in the convergence of obstacles, we refer to Dal Maso (1985).
Chapter 2 Lebesgue-Bochner and Sobolev Spaces
The functional-analytic approach to the solution of (partial) differential equations requires knowledge of the properties of spaces of functions of one or several real variables. A large class of infinite dimensional dynamical systems (evolution systems) can be modelled as an abstract differential equation defined on a suitable Banach space or on a suitable manifold therein. The advantage of such an abstract formulation lies not only on its generality but also in the insight that can be gained about the many common unifying properties that tie together apparently diverse problems. It is clear that such a study relies on the knowledge of various spaces of vector valued functions (i.e., of Banach space valued functions). For this reason Section 2.1 deals with vector valued functions. We introduce the various notions of measurability for such functions and then based on them we define the different integrals corresponding to them. The emphasis is on the so-called Bochner integral, which generalizes in a very natural way the classical Lebesgue integral to vector valued functions. In Section 2.2 we continue with vector valued functions and introduce the so-called Lebesgue-Bochner spaces, which extend to vector valued functions the well known Lebesgue Lp -spaces. We also consider evolution triples and the function spaces associated with them. Evolution triples provide a suitable analytical framework for the study of a large class of linear and nonlinear evolution equations. In Section 2.3 we have compactness results for the spaces introduced and studied in the previous section. The compactness results refer to both the strong and the weak topologies on the spaces under consideration. Thus far we are dealing with function spaces arising in evolutionary problems. In Section 2.4 we study Sobolev spaces, which are the main tools in the analysis of both stationary and nonstationary equations. 
Sobolev spaces play a central role in the modern theory of partial differential equations and they allow us to broaden significantly the notion of solution of a boundary value problem. They provide a natural functional analytical framework for the study of weak solutions of elliptic boundary value problems. No specific applications to problems in partial differential equations are discussed. Instead the section aims to serve as a concise introduction to the properties of
the Sobolev spaces (of both one and several variables). In Section 2.5 we present some fundamental inequalities associated with Sobolev functions, the celebrated embedding theorems for the Sobolev spaces and some of their consequences. The embedding theorems are arguably the most important results in this theory and the reason why Sobolev spaces are so effective in dealing with boundary value problems. Finally in Section 2.6 we establish some fine properties of Sobolev spaces and introduce functions of bounded variation (BV-functions). These are functions whose weak first partial derivatives are Radon measures and this is essentially the weakest measure theoretic sense in which a function can be differentiable. They are particularly useful in theoretical mechanics.
2.1 Vector-Valued Functions

In this section we deal with functions which take values in Banach spaces. For such functions we define the various notions of measurability and different integrals corresponding to them. The domain of a function is a finite measure space $(\Omega, \Sigma, \mu)$ and the range is a Banach space $X$. By $X^*$ we denote the topological dual of $X$ and by $\langle \cdot, \cdot \rangle_X$ the duality brackets for the pair $(X^*, X)$. By $B(X)$ we denote the Borel $\sigma$-field of $X$.

DEFINITION 2.1.1 Let $f : \Omega \longrightarrow X$ be a function.

(a) The function $f$ is said to be a simple function, if it takes only a finite number of values, say $x_1, \ldots, x_N$, and
$$C_k = f^{-1}\big(\{x_k\}\big) \in \Sigma \quad \forall\, k \in \{1, \ldots, N\}.$$
The formula $f = \sum_{k=1}^N x_k \chi_{C_k}$ is called the standard representation of $f$.

(b) The function $f$ is said to be strongly measurable (or Bochner measurable), if there exists a sequence $\{s_n : \Omega \longrightarrow X\}_{n \geq 1}$ of simple functions, such that $s_n(\omega) \longrightarrow f(\omega)$ for $\mu$-a.a. $\omega \in \Omega$, where $\longrightarrow$ denotes the convergence in the norm topology of $X$, i.e.,
$$\big\|f(\omega) - s_n(\omega)\big\|_X \longrightarrow 0 \quad \text{for } \mu\text{-a.a. } \omega \in \Omega.$$

(c) The function $f$ is said to be weakly measurable, if for all $x^* \in X^*$, the function $\omega \longmapsto \langle x^*, f(\omega) \rangle_X$ is $\Sigma$-measurable.

(d) A function $f : \Omega \longrightarrow X^*$ is said to be weak$^*$-measurable, if for all $x \in X$, the function $\omega \longmapsto \langle f(\omega), x \rangle_X$ is $\Sigma$-measurable.
REMARK 2.1.2 Evidently strong measurability of a function $f : \Omega \longrightarrow X$ implies its weak measurability. Also strong measurability implies that for every $B \in B(X)$, we have that $f^{-1}(B) \in \Sigma$ (i.e., $f$ is Borel measurable). Moreover, adapting the proof of the classical result, which asserts that a measurable $\mathbb{R}$-valued function is the $\mu$-almost everywhere limit of a sequence of simple functions, we see that if $X$ is separable, then $f : \Omega \longrightarrow X$ is strongly measurable if and only if it is Borel measurable. In fact, by the next theorem, known as the Pettis measurability theorem, the situation simplifies considerably when $X$ is separable.

THEOREM 2.1.3 A function $f : \Omega \longrightarrow X$ is strongly measurable if and only if it is weakly measurable and $\mu$-almost separably valued (i.e., there exists a set $A \in \Sigma$ with $\mu(A) = 0$, such that $f(\Omega \setminus A)$ is separable in $X$).

PROOF “$\Longrightarrow$”: Let $f : \Omega \longrightarrow X$ be a strongly measurable function. As we already pointed out, $f$ is weakly measurable. Also since $f$ is strongly measurable, we can find a sequence $\{s_n\}_{n \geq 1}$ of $X$-valued simple functions and a $\mu$-null set $A \in \Sigma$, such that
$$s_n(\omega) \longrightarrow f(\omega) \quad \text{in } X, \text{ for all } \omega \in \Omega \setminus A.$$
Let $\{y_n\}_{n \geq 1}$ be a sequence of all the values taken by the sequence $\{s_n\}_{n \geq 1}$ (clearly the set is countable). Let
$$Y \stackrel{df}{=} \overline{\operatorname{span}}\,\{y_n\}_{n \geq 1}.$$
Then $Y$ is a closed separable subspace of $X$. Moreover, $f(\Omega \setminus A) \subseteq Y$ and so $f$ is $\mu$-almost separably valued.

“$\Longleftarrow$”: Let $f : \Omega \longrightarrow X$ be a weakly measurable and $\mu$-almost separably valued function. Without any loss of generality, we may assume that $f$ is separably valued. Then replacing $X$ by $Y = \overline{\operatorname{span}}\, f(\Omega)$, which is separable, we see that we may assume that $X$ is separable. Let $\{x_n^*\}_{n \geq 1}$ be dense in $\partial \overline{B}_1^{X^*}(0)$, where
$$\partial \overline{B}_1^{X^*}(0) = \big\{ x^* \in X^* : \|x^*\|_{X^*} = 1 \big\}.$$
Then
$$\big\|f(\omega)\big\|_X = \sup_{n \geq 1} \big| \langle x_n^*, f(\omega) \rangle_X \big|.$$
But for each $n \geq 1$, the function $\omega \longmapsto \langle x_n^*, f(\omega) \rangle_X$ is $\Sigma$-measurable,
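hence the function $\omega \longmapsto \big\|f(\omega)\big\|_X$ is $\Sigma$-measurable. Let
$$C_0 \stackrel{df}{=} \big\{ \omega \in \Omega : \big\|f(\omega)\big\|_X > 0 \big\}.$$
We have that $C_0 \in \Sigma$ and for every $y \in X$, the function $\omega \longmapsto f(\omega) - y$ is $\Sigma \cap C_0$-measurable. Therefore, the function $\omega \longmapsto \big\|f(\omega) - y\big\|_X$ is $\Sigma \cap C_0$-measurable. Let $\{z_n\}_{n \geq 1}$ be dense in $f(\Omega)$. For a given $\varepsilon > 0$, we define
$$D_n \stackrel{df}{=} \big\{ \omega \in C_0 : \|f(\omega) - z_n\|_X < \varepsilon \big\}.$$
Evidently
$$D_n \in \Sigma \cap C_0 \quad \text{and} \quad C_0 = \bigcup_{n=1}^\infty D_n.$$
Let
$$E_n \stackrel{df}{=} D_n \setminus \bigcup_{i=1}^{n-1} D_i.$$
Then $\{E_n\}_{n \geq 1} \subseteq \Sigma \cap C_0$ is a sequence of disjoint sets and
$$C_0 = \bigcup_{n=1}^\infty E_n.$$
We define
$$f_\varepsilon(\omega) \stackrel{df}{=} \begin{cases} z_n & \text{if } \omega \in E_n,\ n \geq 1, \\ 0 & \text{if } \omega \in \Omega \setminus C_0. \end{cases}$$
Clearly $f_\varepsilon : \Omega \longrightarrow X$ is $\Sigma$-measurable, countably-valued (i.e., takes countably many values) and
$$\|f(\omega) - f_\varepsilon(\omega)\|_X < \varepsilon \quad \forall\, \omega \in \Omega.$$
Taking $\varepsilon = \frac{1}{k}$, $k \geq 1$, we see that $f$ is the uniform limit of a sequence $\big\{f_{\frac{1}{k}}\big\}_{k \geq 1}$ of countably-valued functions, hence $f$ is strongly measurable.

An interesting byproduct of the previous proof is the following result.

COROLLARY 2.1.4 A function $f : \Omega \longrightarrow X$ is strongly measurable if and only if it is the uniform limit almost everywhere of a sequence of countably-valued, $\Sigma$-measurable functions.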
By virtue of Theorem 2.1.3, we see that the measurability situation of $X$-valued functions simplifies considerably when $X$ is separable.

THEOREM 2.1.5 If $X$ is separable and $f : \Omega \longrightarrow X$, then the following three properties are equivalent:
(a) $f$ is strongly measurable;
(b) $f$ is Borel measurable;
(c) $f$ is weakly measurable.

REMARK 2.1.6 The usual facts regarding the stability of strongly measurable functions under sum, scalar multiplication and pointwise $\mu$-almost everywhere limits hold. Also by just replacing absolute values by norms in the proof of the classical Egorov's theorem (see Theorem A.2.12), we see that the result generalizes to $X$-valued functions. Finally for any Banach space $X$ and a strongly measurable function $f : \Omega \longrightarrow X$, the function $\omega \longmapsto \|f(\omega)\|_X$ is $\Sigma$-measurable. Indeed, if $\{s_n\}_{n \geq 1}$ is the sequence of $X$-valued simple functions, such that $s_n(\omega) \longrightarrow f(\omega)$ in $X$ for $\mu$-a.a. $\omega \in \Omega$, then
$$\Big| \big\|f(\omega)\big\|_X - \big\|s_n(\omega)\big\|_X \Big| \leq \big\|f(\omega) - s_n(\omega)\big\|_X \longrightarrow 0 \quad \text{for } \mu\text{-a.a. } \omega \in \Omega$$
and so the function $\omega \longmapsto \big\|f(\omega)\big\|_X$ is $\Sigma$-measurable.

EXAMPLE 2.1.7 It can be shown that weak measurability does not imply strong measurability. Because of Theorem 2.1.5, we look for functions with values in a nonseparable Banach space. So consider the nonseparable Hilbert space $X = l^2\big([0, 1]\big)$ and let $\{e_t\}_{t \in [0,1]}$ be an orthonormal basis. The function $f : [0, 1] \longrightarrow l^2\big([0, 1]\big)$ defined by $f(t) = e_t$ is weakly measurable, since for every $x^* \in l^2\big([0, 1]\big)^* = l^2\big([0, 1]\big)$, we have
$$\big(x^*, f(t)\big)_X = \big(x^*, e_t\big)_X = 0$$
for all but countably many $t \in [0, 1]$. On the other hand, if $A \subseteq [0, 1]$, then $f\big([0, 1] \setminus A\big)$ is separable if and only if $[0, 1] \setminus A$ is countable and so we cannot have $\lambda^1(A) = 0$. Therefore by virtue of Corollary 2.1.4, $f$ is not strongly measurable.
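The two facts used in Example 2.1.7 can be verified directly. The sketch below relies only on Parseval's identity and on the orthonormality of $\{e_t\}_{t \in [0,1]}$.

```latex
% Weak measurability: each x* sees only countably many e_t.
\[
  \sum_{t \in [0,1]} \big| (x^*, e_t)_X \big|^2 = \|x^*\|_X^2 < +\infty
  \ \Longrightarrow\
  \big\{ t \in [0,1] : (x^*, e_t)_X \neq 0 \big\} \ \text{is countable, hence } \lambda^1\text{-null}.
\]
% Failure of almost separable range: distinct basis vectors are uniformly separated.
\[
  \| e_t - e_s \|_X^2 = \|e_t\|_X^2 + \|e_s\|_X^2 = 2
  \quad \forall\, t \neq s
  \ \Longrightarrow\
  \{ e_t : t \in S \} \ \text{is separable if and only if } S \text{ is countable}.
\]
```

Thus $t \longmapsto (x^*, f(t))_X$ vanishes off a Lebesgue-null set and is measurable, while no set of full measure in $[0,1]$ can be countable.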
Now we are ready to define the Bochner integral for strongly measurable functions.

DEFINITION 2.1.8 (a) Let
$$s(\omega) \stackrel{df}{=} \sum_{k=1}^N x_k \chi_{C_k}(\omega), \quad x_k \in X, \quad C_k \in \Sigma$$
be an $X$-valued simple function. The Bochner integral of $s$ is defined by
$$\int_\Omega s(\omega)\,d\mu(\omega) \stackrel{df}{=} \sum_{k=1}^N \mu(C_k) x_k.$$

(b) A function $f : \Omega \longrightarrow X$ is said to be Bochner integrable, if there exists a sequence $\{s_n\}_{n \geq 1}$ of simple functions, such that
$$\lim_{n \to +\infty} \int_\Omega \big\|f(\omega) - s_n(\omega)\big\|_X\,d\mu(\omega) = 0.$$
If $A \in \Sigma$, we define the Bochner integral $\int_A f(\omega)\,d\mu(\omega)$ of $f$ on $A$, by
$$\int_A f(\omega)\,d\mu(\omega) = \lim_{n \to +\infty} \int_\Omega \chi_A(\omega) s_n(\omega)\,d\mu(\omega). \tag{2.1}$$
Instead of $\int_\Omega f(\omega)\,d\mu(\omega)$, we will often write $\int_\Omega f(\omega)\,d\mu$ or even $\int_\Omega f\,d\mu$, when no confusion is possible.

REMARK 2.1.9 It is easy to verify that in Definition 2.1.8(b), the limit in (2.1) exists and is independent of the sequence of simple functions $\{s_n\}_{n \geq 1}$ with the properties postulated there.

The next proposition gives a necessary and sufficient condition for the Bochner integrability of a function $f : \Omega \longrightarrow X$.

PROPOSITION 2.1.10 A strongly measurable function $f : \Omega \longrightarrow X$ is Bochner integrable if and only if the function $\omega \longmapsto \big\|f(\omega)\big\|_X$ is Lebesgue integrable (i.e., $\big\|f(\cdot)\big\|_X \in L^1(\Omega)$).
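PROOF “$\Longrightarrow$”: Let $\{s_n\}_{n \geq 1}$ be a sequence of $X$-valued simple functions, such that
$$\int_\Omega \big\|f(\omega) - s_n(\omega)\big\|_X\,d\mu \longrightarrow 0.$$
Then for any $n \geq 1$, we have
$$\int_\Omega \big\|f(\omega)\big\|_X\,d\mu \leq \int_\Omega \big\|f(\omega) - s_n(\omega)\big\|_X\,d\mu + \int_\Omega \big\|s_n(\omega)\big\|_X\,d\mu < +\infty,$$
so $\|f(\cdot)\|_X \in L^1(\Omega)$.

“$\Longleftarrow$”: Since $f : \Omega \longrightarrow X$ is strongly measurable, we can find a sequence $\{s_n\}_{n \geq 1}$ of $X$-valued, simple functions, such that
$$\lim_{n \to +\infty} \big\|f(\omega) - s_n(\omega)\big\|_X = 0 \quad \forall\, \omega \in \Omega \setminus A,$$
with $\mu(A) = 0$. Hence
$$\lim_{n \to +\infty} \big\|s_n(\omega)\big\|_X = \big\|f(\omega)\big\|_X \quad \forall\, \omega \in \Omega \setminus A.$$
Let $h_n : \Omega \longrightarrow X$, $n \geq 1$, be defined by
$$h_n(\omega) \stackrel{df}{=} \begin{cases} s_n(\omega) & \text{if } \|s_n(\omega)\|_X < 2 \|f(\omega)\|_X, \\ 0 & \text{otherwise}. \end{cases}$$
Evidently for every $n \geq 1$, $h_n$ is an $X$-valued simple function. Also
$$\lim_{n \to +\infty} \big\|f(\omega) - h_n(\omega)\big\|_X = 0 \quad \forall\, \omega \in \Omega \setminus A$$
and
$$\big\|f(\omega) - h_n(\omega)\big\|_X \leq 3 \big\|f(\omega)\big\|_X \quad \forall\, \omega \in \Omega \setminus A.$$
So by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
$$\lim_{n \to +\infty} \int_\Omega \big\|f(\omega) - h_n(\omega)\big\|_X\,d\mu = 0,$$
so $f$ is Bochner integrable (see Definition 2.1.8(b)).

COROLLARY 2.1.11 If $f : \Omega \longrightarrow X$ is a Bochner integrable function and $A \in \Sigma$, then
$$\left\| \int_A f(\omega)\,d\mu \right\|_X \leq \int_A \big\|f(\omega)\big\|_X\,d\mu.$$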
PROOF It is clear that the corollary holds for any simple function $s : \Omega \longrightarrow X$. Then use Proposition 2.1.10.

It is a direct consequence of Definition 2.1.8(b) that the Bochner integral is a linear operator. Namely we have the following proposition.

PROPOSITION 2.1.12 If $f, g : \Omega \longrightarrow X$ are two Bochner integrable functions, $A \in \Sigma$ and $\xi \in \mathbb{R}$, then $f + \xi g$ is Bochner integrable too and
$$\int_A \big(f + \xi g\big)(\omega)\,d\mu = \int_A f(\omega)\,d\mu + \xi \int_A g(\omega)\,d\mu.$$
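A small worked example illustrates Definition 2.1.8(a) and the norm estimate of Corollary 2.1.11. The choices $\Omega = [0,1]$ with the Lebesgue measure, $X = \mathbb{R}^2$ with the Euclidean norm, and the vectors $x_1, x_2$ are hypothetical and serve only as an illustration.

```latex
\[
  s = x_1 \chi_{[0,\frac12)} + x_2 \chi_{[\frac12,1]},
  \qquad x_1 = (1,0), \quad x_2 = (0,2),
\]
\[
  \int_\Omega s \, d\mu
  = \mu\big([0,\tfrac12)\big)\, x_1 + \mu\big([\tfrac12,1]\big)\, x_2
  = \tfrac12 (1,0) + \tfrac12 (0,2)
  = \big(\tfrac12,\, 1\big),
\]
\[
  \Big\| \int_\Omega s \, d\mu \Big\|_X
  = \sqrt{\tfrac14 + 1} = \tfrac{\sqrt5}{2} \approx 1.118
  \ \leq\
  \int_\Omega \|s\|_X \, d\mu
  = \tfrac12 \cdot 1 + \tfrac12 \cdot 2 = \tfrac32 .
\]
```

The inequality is strict here because $x_1$ and $x_2$ point in different directions; equality would require the values of $s$ to be nonnegative multiples of one vector.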
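The Lebesgue dominated convergence theorem (see Theorem A.2.2) applies also to Bochner integrable functions.

PROPOSITION 2.1.13 If $f : \Omega \longrightarrow X$ is a strongly measurable function, $f_n : \Omega \longrightarrow X$, $n \geq 1$, are Bochner integrable,
$$f_n(\omega) \longrightarrow f(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega$$
and there exists $h \in L^1(\Omega)_+$, such that
$$\big\|f_n(\omega)\big\|_X \leq h(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } n \geq 1,$$
then $f$ is Bochner integrable and we have
$$\int_A f(\omega)\,d\mu = \lim_{n \to +\infty} \int_A f_n(\omega)\,d\mu \quad \forall\, A \in \Sigma.$$

PROOF Clearly
$$\big\|f(\omega)\big\|_X \leq h(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega$$
and the function
$$\omega \longmapsto \big\|f(\omega) - f_n(\omega)\big\|_X$$
is $\Sigma$-measurable for every $n \geq 1$. Since
$$\big\|f(\omega) - f_n(\omega)\big\|_X \leq 2h(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega,$$
we have that
$$\big\|f(\cdot) - f_n(\cdot)\big\|_X \in L^1(\Omega) \quad \forall\, n \geq 1.$$
Thus by the Lebesgue dominated convergence theorem (see Theorem A.2.2) for $\mathbb{R}$-valued functions, we have
$$\int_\Omega \big\|f(\omega) - f_n(\omega)\big\|_X\,d\mu \longrightarrow 0. \tag{2.2}$$
By virtue of Definition 2.1.8(b), for each $n \geq 1$, we can find an $X$-valued simple function $s_n$, such that
$$\int_\Omega \big\|f_n(\omega) - s_n(\omega)\big\|_X\,d\mu < \frac{1}{n}.$$
We have
$$\int_\Omega \big\|f(\omega) - s_n(\omega)\big\|_X\,d\mu \leq \int_\Omega \big\|f(\omega) - f_n(\omega)\big\|_X\,d\mu + \int_\Omega \big\|f_n(\omega) - s_n(\omega)\big\|_X\,d\mu \longrightarrow 0 \quad \text{as } n \to +\infty,$$
so $f$ is Bochner integrable. Moreover, from Corollary 2.1.11 and (2.2), we have
$$\left\| \int_A f(\omega)\,d\mu - \int_A s_n(\omega)\,d\mu \right\|_X \leq \int_A \big\|f(\omega) - s_n(\omega)\big\|_X\,d\mu \leq \int_\Omega \big\|f(\omega) - s_n(\omega)\big\|_X\,d\mu \longrightarrow 0 \quad \text{as } n \to +\infty.$$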
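Also we have a version of Fatou's lemma (see Theorem A.2.1).

PROPOSITION 2.1.14 If $f_n : \Omega \longrightarrow X$, $n \geq 1$, are Bochner integrable, $\big\{ \int_\Omega \|f_n\|\,d\mu \big\}_{n \geq 1}$ is bounded and
$$f_n(\omega) \xrightarrow{\ w\ } f(\omega) \quad \text{for a.a. } \omega \in \Omega,$$
then $f$ is Bochner integrable and
$$\int_\Omega \big\|f(\omega)\big\|_X\,d\mu \leq \liminf_{n \to +\infty} \int_\Omega \big\|f_n(\omega)\big\|_X\,d\mu.$$

PROOF Evidently $f$ is weakly measurable. Also by Theorem 2.1.3 for every $n \geq 1$, we can find $A_n \in \Sigma$ which is $\mu$-null and such that $f_n(\Omega \setminus A_n)$ is separable. Let $C \in \Sigma$ be the $\mu$-null set, such that for $\omega \in \Omega \setminus C$, we have
$$f_n(\omega) \xrightarrow{\ w\ } f(\omega).$$
Let
$$A \stackrel{df}{=} \left( \bigcup_{n=1}^\infty A_n \right) \cup C.$$
Then $A \in \Sigma$ and it is $\mu$-null. Let
$$Y \stackrel{df}{=} \overline{\operatorname{span}} \bigcup_{n=1}^\infty f_n\big(\Omega \setminus A_n\big).$$
Evidently $Y$ is a separable Banach subspace of $X$ and
$$f(\Omega \setminus A) \subseteq Y.$$
So by virtue of the weak lower semicontinuity of the norm functional in a Banach space, we have
$$\big\|f(\omega)\big\|_X \leq \liminf_{n \to +\infty} \big\|f_n(\omega)\big\|_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega.$$
Since $\big\|f_n(\cdot)\big\|_X$, $n \geq 1$, is Lebesgue integrable (see Proposition 2.1.10), by Fatou's lemma (see Theorem A.2.1), we have
$$\int_\Omega \big\|f(\omega)\big\|_X\,d\mu \leq \liminf_{n \to +\infty} \int_\Omega \big\|f_n(\omega)\big\|_X\,d\mu.$$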
DEFINITION 2.1.15 A set function $m : \Sigma \longrightarrow X$ is said to be a vector measure, if for all sequences $\{A_n\}_{n \geq 1} \subseteq \Sigma$ of pairwise disjoint sets, we have
$$m\left( \bigcup_{n=1}^\infty A_n \right) = \sum_{n=1}^\infty m(A_n),$$
where the series converges in the norm topology of $X$.

The next proposition shows that the indefinite Bochner integral
$$A \longmapsto \int_A f\,d\mu$$
of a Bochner integrable function $f : \Omega \longrightarrow X$ is a vector measure which is absolutely continuous with respect to $\mu$ (i.e., $m \ll \mu$).
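PROPOSITION 2.1.16 If $f : \Omega \longrightarrow X$ is a Bochner integrable function, then the set function $m : \Sigma \longrightarrow X$ defined by
$$m(A) \stackrel{df}{=} \int_A f(\omega)\,d\mu \quad \forall\, A \in \Sigma$$
is a vector measure and $m \ll \mu$, i.e.,
$$\lim_{\mu(A) \searrow 0} m(A) = 0.$$

PROOF Let $\{A_n\}_{n \geq 1} \subseteq \Sigma$ be a sequence of pairwise disjoint sets. Since
$$\left\| \int_{A_n} f(\omega)\,d\mu \right\|_X \leq \int_{A_n} \big\|f(\omega)\big\|_X\,d\mu \quad \forall\, n \geq 1,$$
the series
$$\sum_{n=1}^\infty \int_{A_n} f(\omega)\,d\mu$$
is dominated term-by-term by the convergent series of positive terms
$$\sum_{n=1}^\infty \int_{A_n} \big\|f(\omega)\big\|_X\,d\mu \leq \int_\Omega \big\|f(\omega)\big\|_X\,d\mu < +\infty$$
(see Proposition 2.1.10). Therefore the series
$$\sum_{n=1}^\infty \int_{A_n} f(\omega)\,d\mu$$
is absolutely convergent. Moreover, for all $k \geq 1$, we have
$$\left\| \int_{\bigcup_{n=1}^\infty A_n} f(\omega)\,d\mu - \sum_{n=1}^k \int_{A_n} f(\omega)\,d\mu \right\|_X = \left\| \int_{\bigcup_{n=k+1}^\infty A_n} f(\omega)\,d\mu \right\|_X \leq \int_{\bigcup_{n=k+1}^\infty A_n} \big\|f(\omega)\big\|_X\,d\mu \longrightarrow 0 \quad \text{as } k \to +\infty,$$
so
$$m\left( \bigcup_{n=1}^\infty A_n \right) = \sum_{n=1}^\infty m(A_n),$$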
i.e., m is°a vector measure. ° Since °f (·)°X ∈ L1 (Ω), from the absolute continuity of the Lebesgue integral, we have Z ° ° °f (ω)° dµ = 0. lim X µ(A)&0
A
From Corollary 2.1.11, we have °Z ° ° ° ° ° ° f (ω) dµ lim °m(A)°X = lim ° ° ° µ(A)&0 µ(A)&0
X
A
Z 6
lim
µ(A)&0
° ° °f (ω)° dµ = 0 X
A
and so m ≺≺ µ.
Thus far the theory of Bochner integration is a straightforward extension of the theory of Lebesgue integration, with the absolute values replaced by norms. The next theorem exhibits a strong property of the Bochner integral that has no counterpart in the theory of Lebesgue integration.
THEOREM 2.1.17 If Y is another Banach space, L : X ⊇ D −→ Y is a closed linear operator and f : Ω −→ X is a Bochner integrable function, such that $L(f(\cdot))$ : Ω −→ Y is Bochner integrable too, then
\[ L\Big( \int_A f(\omega) \, d\mu \Big) = \int_A L\big(f(\omega)\big) \, d\mu \quad \forall\, A \in \Sigma. \]
PROOF Let
\[ C_0 \stackrel{\mathrm{df}}{=} \big\{ \omega \in \Omega : \|f(\omega)\|_X > 0 \big\} \in \Sigma. \]
By Corollary 2.1.4, for a given ε > 0, we can find countably valued functions $h_\varepsilon : C_0 \to X$ and $g_\varepsilon : C_0 \to Y$, such that
\[ \sup_{\omega \in \Omega \setminus E} \|f(\omega) - h_\varepsilon(\omega)\|_X < \frac{\varepsilon}{2} \]
and
\[ \sup_{\omega \in \Omega \setminus E} \big\|L\big(f(\omega)\big) - g_\varepsilon(\omega)\big\|_Y < \frac{\varepsilon}{2}, \]
with E ∈ Σ being a µ-null set. Let $\{B_n\}_{n\ge 1}$ ⊆ Σ ∩ C0 be a common refinement of the subdivisions corresponding to $h_\varepsilon$ and $g_\varepsilon$ and let $\omega_n \in B_n$ for all n ≥ 1.
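We introduce the function $u_\varepsilon$ : Ω −→ X, defined by
\[ u_\varepsilon(\omega) \stackrel{\mathrm{df}}{=} \begin{cases} f(\omega_n) & \text{if } \omega \in B_n,\ n \ge 1, \\ 0 & \text{if } \omega \in \Omega \setminus C_0. \end{cases} \]
Then, we have
\[ \int_\Omega \|f(\omega) - u_\varepsilon(\omega)\|_X \, d\mu < \varepsilon \mu(\Omega) \tag{2.3} \]
and
\[ \int_\Omega \big\|L\big(f(\omega)\big) - L\big(u_\varepsilon(\omega)\big)\big\|_Y \, d\mu < \varepsilon \mu(\Omega). \tag{2.4} \]
Also for every A ∈ Σ, we have
\[ \int_A u_\varepsilon(\omega) \, d\mu = \sum_{n=1}^{\infty} f(\omega_n) \mu(B_n \cap A) = \lim_{N \to +\infty} \sum_{n=1}^{N} f(\omega_n) \mu(B_n \cap A) \tag{2.5} \]
and
\[ \int_A L\big(u_\varepsilon(\omega)\big) \, d\mu = \sum_{n=1}^{\infty} L\big(f(\omega_n)\big) \mu(B_n \cap A) = \lim_{N \to +\infty} \sum_{n=1}^{N} L\big(f(\omega_n)\big) \mu(B_n \cap A). \tag{2.6} \]
Since by hypothesis L is a closed linear operator, from (2.5) and (2.6), we have that
\[ \Big( \int_A u_\varepsilon \, d\mu, \ \int_A L(u_\varepsilon) \, d\mu \Big) \in \mathrm{Gr}\, L. \]
Consider a sequence $\varepsilon_n \searrow 0$. From (2.3) and (2.4), we have
\[ \int_A u_{\varepsilon_n} \, d\mu \longrightarrow \int_A f \, d\mu \quad \text{and} \quad \int_A L(u_{\varepsilon_n}) \, d\mu \longrightarrow \int_A L(f) \, d\mu. \]
Since
\[ L\Big( \int_A u_{\varepsilon_n} \, d\mu \Big) = \int_A L(u_{\varepsilon_n}) \, d\mu \quad \forall\, n \ge 1 \]
and L is closed, it follows that
\[ L\Big( \int_A f(\omega) \, d\mu \Big) = \int_A L\big(f(\omega)\big) \, d\mu \quad \forall\, A \in \Sigma. \]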
REMARK 2.1.18 If f : Ω −→ X is a Bochner integrable function and L ∈ L(X; Y), then L(f) is Bochner integrable, since
\[ \big\|L\big(f(\omega)\big)\big\|_Y \le \|L\|_{\mathcal{L}} \, \|f(\omega)\|_X \quad \forall\, \omega \in \Omega. \]
COROLLARY 2.1.19 If f, g : Ω −→ X are two Bochner integrable functions and
\[ \int_A f(\omega) \, d\mu = \int_A g(\omega) \, d\mu \quad \forall\, A \in \Sigma, \]
then f(ω) = g(ω) for µ-almost all ω ∈ Ω.
PROOF We may assume that g = 0 and that X is separable (see Theorem 2.1.3). Then the ball
\[ \overline{B}_1^{*} = \big\{ x^* \in X^* : \|x^*\|_{X^*} \le 1 \big\} \]
furnished with the relative weak∗-topology is compact and metrizable (see the Alaoglu theorem; Theorem A.3.9 and Theorem A.3.13). Let $\{x_n^*\}_{n\ge 1}$ be a countable w∗-dense subset of $\overline{B}_1^{*}$. By Theorem 2.1.17, for every n ≥ 1 and A ∈ Σ, we have
\[ \int_A \langle x_n^*, f(\omega) \rangle_X \, d\mu = \Big\langle x_n^*, \int_A f(\omega) \, d\mu \Big\rangle_X = 0, \]
so
\[ \langle x_n^*, f(\omega) \rangle_X = 0 \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]
Since
\[ \|f(\omega)\|_X = \sup_{x^* \in \overline{B}_1^{*}} \big| \langle x^*, f(\omega) \rangle_X \big|, \]
we have
\[ \|f(\omega)\|_X = 0 \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \]
and so f(ω) = 0 for µ-a.a. ω ∈ Ω.
A similar proof gives us the following result.
COROLLARY 2.1.20 If f, g : Ω −→ X are two strongly measurable functions and
\[ \langle x^*, f(\omega) \rangle_X = \langle x^*, g(\omega) \rangle_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } x^* \in X^* \]
(the exceptional µ-null set may depend on x∗ ∈ X∗), then f(ω) = g(ω) for µ-a.a. ω ∈ Ω.
The next result can be viewed as a kind of mean value theorem for the Bochner integral.
PROPOSITION 2.1.21 If f : Ω −→ X is a Bochner integrable function and A ∈ Σ with µ(A) > 0, then
\[ \frac{1}{\mu(A)} \int_A f(\omega) \, d\mu \in \overline{\mathrm{conv}}\, f(A). \]
PROOF We proceed by contradiction. Suppose that
\[ \frac{1}{\mu(A)} \int_A f(\omega) \, d\mu \notin \overline{\mathrm{conv}}\, f(A). \]
Then by the strong separation theorem for convex sets (see Theorem A.3.2), we can find x∗ ∈ X∗ \ {0} and ϑ ∈ R, such that
\[ \Big\langle x^*, \frac{1}{\mu(A)} \int_A f(\omega) \, d\mu \Big\rangle_X < \vartheta \le \langle x^*, f(\omega) \rangle_X \quad \forall\, \omega \in A, \]
so using Theorem 2.1.17, we have
\[ \frac{1}{\mu(A)} \int_A \langle x^*, f(\omega) \rangle_X \, d\mu < \vartheta \le \langle x^*, f(\omega) \rangle_X \quad \forall\, \omega \in A. \]
Integrating this inequality over A, we obtain
\[ \int_A \langle x^*, f(\omega) \rangle_X \, d\mu < \vartheta \mu(A) \le \int_A \langle x^*, f(\omega) \rangle_X \, d\mu, \]
a contradiction.
Also for Bochner integrable functions, the Lebesgue differentiation theorem holds (see Theorem 1.4.6). So we have the following result.
PROPOSITION 2.1.22 If Z ⊆ R^N is a bounded open set and f : Z −→ X is a Bochner integrable function, then
\[ \lim_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \|f(y) - f(x)\|_X \, d\lambda^N(y) = 0 \quad \text{for } \lambda^N\text{-a.a. } x \in Z, \]
where
\[ a(N) \stackrel{\mathrm{df}}{=} \frac{\pi^{N/2}}{\big(\tfrac{N}{2}\big)!} \]
is the volume of the unit ball in R^N.
PROOF Invoking Theorem 2.1.3, we may assume that X is separable. Let $\{x_n\}_{n\ge 1}$ be a dense set in X. Then by Theorem 1.4.6, we have
\[ \lim_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \|f(y) - x_n\|_X \, d\lambda^N(y) = \|f(x) - x_n\|_X \quad \text{for } \lambda^N\text{-a.a. } x \in Z \text{ and all } n \ge 1. \tag{2.7} \]
Let x ∈ Z be a point where (2.7) is valid. Then for a given ε > 0, we can select $x_n$, such that
\[ \|f(x) - x_n\|_X < \varepsilon. \]
We have
\[ \limsup_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \|f(y) - f(x)\|_X \, d\lambda^N(y) \le \limsup_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \Big[ \|f(y) - x_n\|_X + \|x_n - f(x)\|_X \Big] d\lambda^N(y) < 2\varepsilon, \]
so
\[ \limsup_{r \searrow 0} \frac{1}{a(N) r^N} \int_{B_r(x)} \|f(y) - f(x)\|_X \, d\lambda^N(y) = 0 \quad \text{for } \lambda^N\text{-a.a. } x \in Z. \]
We conclude this section by introducing three weaker integrals for Banach space valued functions.
DEFINITION 2.1.23 Let f : Ω −→ X be a function.
(a) Suppose that f : Ω −→ X is weakly measurable. We say that f is Pettis integrable, if for each A ∈ Σ, there exists $x_A$ ∈ X, such that
\[ \langle x^*, x_A \rangle_X = \int_A \langle x^*, f(\omega) \rangle_X \, d\mu \quad \forall\, x^* \in X^*. \]
Then we write
\[ x_A = (P)\text{-}\!\int_A f(\omega) \, d\mu. \]
(b) Suppose that f : Ω −→ X is weakly measurable. We say that f is Dunford integrable, if for each A ∈ Σ, there exists $x_A^{**}$ ∈ X∗∗, such that
\[ \langle x_A^{**}, x^* \rangle_{X^*} = \int_A \langle x^*, f(\omega) \rangle_X \, d\mu \quad \forall\, x^* \in X^*. \]
Then we write
\[ x_A^{**} = (D)\text{-}\!\int_A f(\omega) \, d\mu. \]
(c) Suppose that f : Ω −→ X∗ is w∗-measurable. We say that f is Gelfand integrable, if for each A ∈ Σ, there exists $x_A^{*}$ ∈ X∗, such that
\[ \langle x_A^{*}, x \rangle_X = \int_A \langle f(\omega), x \rangle_X \, d\mu \quad \forall\, x \in X. \]
Then we write
\[ x_A^{*} = (G)\text{-}\!\int_A f(\omega) \, d\mu. \]
REMARK 2.1.24 Clearly we have that Bochner integrability implies Pettis integrability and Pettis integrability implies Dunford integrability. The reverse implications need not be true. Of course if X is reflexive, then the Pettis and Dunford integrals coincide. Finally note that the Gelfand integral is actually the Pettis integral for X∗-valued functions.
For a Pettis integrable function f : Ω −→ X, we consider the set function
\[ A \longmapsto (P)\text{-}\!\int_A f \, d\mu. \]
We want to know if this is a µ-continuous vector measure, as was the case with the Bochner integral (see Proposition 2.1.16). To answer this we need some preparation.
DEFINITION 2.1.25 Let $\sum_{n=1}^{\infty} x_n$ be a series of elements of X.
(a) We say that the series $\sum_{n=1}^{\infty} x_n$ is unconditionally convergent to x, if for all permutations π of N, the series $\sum_{n=1}^{\infty} x_{\pi(n)}$ converges to x.
(b) We say that the series $\sum_{n=1}^{\infty} x_n$ is weakly subseries convergent, if for every strictly increasing sequence $\{n_k\}_{k\ge 1}$ of integers, the series $\sum_{k=1}^{\infty} x_{n_k}$ is weakly convergent.
REMARK 2.1.26 If $\sum_{n=1}^{\infty} x_n$ is absolutely convergent (i.e., the series $\sum_{n=1}^{\infty} \|x_n\|_X$ is convergent), then it is unconditionally convergent. Also unconditional convergence is equivalent to subseries convergence (in the norm topology of X) and implies convergence.
The next result is known as the Orlicz-Pettis theorem.
THEOREM 2.1.27 (Orlicz-Pettis Theorem) A formal series $\sum_{n=1}^{\infty} x_n$ in X is unconditionally convergent if and only if it is weakly subseries convergent.
REMARK 2.1.28 An interesting consequence of the above theorem is that if m : Σ −→ X is a weakly countably additive set function, then it is a vector measure.
PROPOSITION 2.1.29 If f : Ω −→ X is Pettis integrable, then the function
\[ \Sigma \ni A \longmapsto m(A) = (P)\text{-}\!\int_A f \, d\mu \]
is a vector measure.
PROOF Let $\{A_n\}_{n\ge 1}$ be a sequence of pairwise disjoint sets in Σ. For every x∗ ∈ X∗, we have
\[ \Big\langle x^*, (P)\text{-}\!\int_{\bigcup_{n=1}^{\infty} A_n} f(\omega) \, d\mu \Big\rangle_X = \int_{\bigcup_{n=1}^{\infty} A_n} \langle x^*, f(\omega) \rangle_X \, d\mu = \sum_{n=1}^{\infty} \int_{A_n} \langle x^*, f(\omega) \rangle_X \, d\mu = \sum_{n=1}^{\infty} \Big\langle x^*, (P)\text{-}\!\int_{A_n} f(\omega) \, d\mu \Big\rangle_X, \]
so the function
\[ \Sigma \ni A \longmapsto m(A) = (P)\text{-}\!\int_A f \, d\mu \]
is weakly countably additive. Of course the same argument applies to any subsequence of $\{A_n\}_{n\ge 1}$. So we can invoke Theorem 2.1.27 and conclude that the function
\[ \Sigma \ni A \longmapsto m(A) = (P)\text{-}\!\int_A f \, d\mu \]
is a vector measure.
REMARK 2.1.30 The result is not true for the Dunford integral, which is not even strongly additive (see Diestel & Uhl (1977, p. 53)).
The next result provides an easy test for checking the Gelfand integrability of f : Ω −→ X∗.
PROPOSITION 2.1.31 If f : Ω −→ X∗ has the following property
\[ \langle f(\cdot), x \rangle_X \in L^1(\Omega) \quad \forall\, x \in X, \]
then f is Gelfand integrable.
PROOF Let A ∈ Σ and let L : X −→ L^1(Ω) be defined by
\[ L(x) \stackrel{\mathrm{df}}{=} \langle f(\cdot), x \rangle_X \quad \forall\, x \in X. \]
We claim that the linear operator L has a closed graph. To this end suppose that
\[ x_n \longrightarrow x \ \text{ in } X \quad \text{and} \quad \langle f(\cdot), x_n \rangle_X \longrightarrow g \ \text{ in } L^1(\Omega). \]
Then by passing to a suitable subsequence of $\{x_n\}_{n\ge 1}$ if necessary, we may assume that
\[ \langle f(\omega), x_n \rangle_X \longrightarrow g(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]
Therefore
\[ g(\omega) = \langle f(\omega), x \rangle_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega, \]
hence (x, g) ∈ Gr L, i.e., L has closed graph. By the closed graph theorem (see Theorem A.3.7), L is continuous and if $I_A$ : L^1(Ω) −→ R is the integral operator, defined by
\[ I_A(g) \stackrel{\mathrm{df}}{=} \int_A g(\omega) \, d\mu \quad \forall\, g \in L^1(\Omega), \]
we have that $I_A \circ L$ ∈ X∗ and so there exists $x_A^{*}$ ∈ X∗, such that
\[ \langle x_A^{*}, x \rangle_X = \int_A \langle f(\omega), x \rangle_X \, d\mu \quad \forall\, x \in X. \]
Therefore f is Gelfand integrable.
REMARK 2.1.32 The same closed graph argument shows that if f : Ω −→ X is such that
\[ \langle x^*, f(\cdot) \rangle_X \in L^1(\Omega) \quad \forall\, x^* \in X^*, \]
then f is Dunford integrable. The situation with the Pettis integrability is less satisfactory and more sophisticated criteria are needed to establish it (see Diestel & Uhl (1977, pp. 54–56)).
COROLLARY 2.1.33 If f : Ω −→ X∗ is a w∗-measurable function and has range which is norm bounded in X∗, then f is Gelfand integrable.
The same argument as in the proof of Proposition 2.1.21 gives the following mean value theorem for the Gelfand integral.
THEOREM 2.1.34 If f : Ω −→ X∗ is a Gelfand integrable function and A ∈ Σ with µ(A) > 0, then
\[ \frac{1}{\mu(A)} \, (G)\text{-}\!\int_A f \, d\mu \in \overline{\mathrm{conv}}^{\,w^*} f(A). \]
2.2 Lebesgue-Bochner Spaces and Evolution Triples
Using the Bochner integral introduced in the previous section, we can introduce generalizations of the classical Lebesgue spaces to Banach space valued functions. As in the previous section (Ω, Σ, µ) is a finite measure space and X is a Banach space. Additional hypotheses will be introduced as needed.
DEFINITION 2.2.1 Let p ∈ [1, +∞]. By $L^p(\Omega; X)$ we denote the space of equivalence classes of strongly measurable functions f : Ω −→ X, such that $\|f(\cdot)\|_X \in L^p(\Omega)$. Also we introduce their respective norms by
\[ \|f\|_p \stackrel{\mathrm{df}}{=} \Big( \int_\Omega \|f(\omega)\|_X^p \, d\mu \Big)^{1/p} \quad \text{if } p \in [1, +\infty) \]
and
\[ \|f\|_\infty \stackrel{\mathrm{df}}{=} \operatorname*{esssup}_{\omega \in \Omega} \|f(\omega)\|_X. \]
REMARK 2.2.2 As with R-valued functions, the equivalence relation used in the above definition is the following:
\[ f \sim g \quad \text{if and only if} \quad f(\omega) = g(\omega) \ \text{ for } \mu\text{-a.a. } \omega \in \Omega. \]
It is routine to check the following facts.
PROPOSITION 2.2.3
(a) $\big( L^p(\Omega; X), \|\cdot\|_p \big)$ is a Banach space for p ∈ [1, +∞].
(b) If p ∈ [1, +∞), Σ is countably generated and X is separable, then $L^p(\Omega; X)$ is separable.
(c) If p ∈ (1, +∞) and X is reflexive, then $L^p(\Omega; X)$ is reflexive.
(d) If X is a Hilbert space, then $L^2(\Omega; X)$ is a Hilbert space too with inner product
\[ (f, g)_2 = \int_\Omega \big( f(\omega), g(\omega) \big)_X \, d\mu. \]
REMARK 2.2.4 The σ-field Σ is countably generated if there exists a countable subfamily T, such that Σ = σ(T). If Ω is an open or closed subset of R^N, then the Borel σ-field B(Ω) is countably generated. Also clearly if p ∈ [1, +∞) and $L^p(\Omega; X)$ is separable, then X is separable. Additional conditions on X usually translate to corresponding properties of the Lebesgue-Bochner space $L^p(\Omega; X)$. So if p ∈ (1, +∞), then $L^p(\Omega; X)$ is uniformly convex if and only if X is uniformly convex (see Day (1955, 1973)). Moreover, as for the Lebesgue spaces, simple functions are dense in $L^p(\Omega; X)$ and if Z ⊆ R^N is a bounded open set then $C^\infty(\overline{Z}; X)$ is dense in $L^p(Z; X)$ for p ∈ [1, +∞).
PROPOSITION 2.2.5 If Y is another Banach space, X ⊆ Y and the embedding is continuous, p, r ∈ [1, +∞], p ≤ r, then $L^r(\Omega; X) \subseteq L^p(\Omega; Y)$ and the embedding is continuous.
PROOF Let $f \in L^r(\Omega; X)$. Since the embedding X ⊆ Y is continuous, using Hölder's inequality (see Theorem A.2.27; as p ≤ r), we have
\[ \Big( \int_\Omega \|f(\omega)\|_Y^p \, d\mu \Big)^{1/p} \le c_1 \Big( \int_\Omega \|f(\omega)\|_X^p \, d\mu \Big)^{1/p} \le c_2 \Big( \int_\Omega \|f(\omega)\|_X^r \, d\mu \Big)^{1/r}, \]
for some $c_1, c_2 > 0$. So $L^r(\Omega; X) \subseteq L^p(\Omega; Y)$ and the embedding is continuous.
We want to identify the dual of $L^p(\Omega; X)$ for p ∈ [1, +∞). First a definition which is motivated by the fact that the proof of the classical Riesz representation theorem (see Theorem A.3.24) uses the Radon-Nikodym theorem (see Theorem A.2.24).
DEFINITION 2.2.6
(a) Let m : Σ −→ X be a vector measure (see Definition 2.1.15). We say that m is of bounded variation, if |m|(Ω) < +∞, where
\[ |m|(A) = \sup_{T_A} \sum_{C \in T_A} \|m(C)\|_X \quad \forall\, A \in \Sigma, \]
with $T_A$ running through the set of all finite Σ-partitions of A. The quantity |m| : Σ −→ R+ is called the variation of m and is a measure.
(b) A Banach space X is said to have the Radon-Nikodym property (RNP for short), if for every probability space (Ω, Σ, µ) and every vector measure m : Σ −→ X of bounded variation such that m ≺≺ µ (i.e., if µ(A) = 0 then m(A) = 0), there exists $f \in L^1(\Omega; X)$, such that
\[ m(A) = \int_A f(\omega) \, d\mu \quad \forall\, A \in \Sigma. \]
REMARK 2.2.7 The RNP is not a property that every Banach space has. Suppose that $X = c_0$. On $\big([0, 1], B([0, 1]), \lambda^1\big)$ (here B([0, 1]) is the Borel σ-field of [0, 1] and $\lambda^1$ is the Lebesgue measure) consider the vector measure $m : B([0, 1]) \to c_0$, defined by
\[ m(A) = \Big\{ \int_A \cos nt \, dt \Big\}_{n \ge 1} \quad \forall\, A \in B([0, 1]). \]
The Riemann-Lebesgue Lemma guarantees that
\[ m(A) \in c_0 \quad \forall\, A \in B([0, 1]). \]
Also $m \prec\prec \lambda^1$. However, m cannot have a Radon-Nikodym derivative (see Theorem A.2.24 and Remark A.2.25) in $L^1([0, 1]; c_0)$, since
\[ \{\cos nt\}_{n \ge 1} \notin c_0 \quad \text{for a.a. } t \in [0, 1]. \]
Therefore $c_0$ lacks the RNP.
However, there are two large classes of Banach spaces which have the RNP.
PROPOSITION 2.2.8 If X is reflexive or it is a separable dual space, then X has the RNP.
Now we state the Riesz representation theorem for the Lebesgue-Bochner spaces.
THEOREM 2.2.9 (Riesz Representation Theorem for the Lebesgue-Bochner Spaces) If p ∈ [1, +∞) and $\frac{1}{p} + \frac{1}{p'} = 1$, then $\big( L^p(\Omega; X) \big)^* = L^{p'}(\Omega; X^*)$ if and only if X∗ has the RNP and the duality pairing is given by
\[ \langle g, f \rangle_{L^p(\Omega; X)} = \int_\Omega \langle g(\omega), f(\omega) \rangle_X \, d\mu \quad \forall\, f \in L^p(\Omega; X),\ g \in L^{p'}(\Omega; X^*). \]
What can be said if X∗ does not have the RNP (for example when X = C([0, 1]))? We can still have a representation theorem for $L^1(\Omega; X)$. First a definition.
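DEFINITION 2.2.10 By $L^\infty(\Omega; X^*_{w^*})$ we denote the space of all w∗-measurable functions g : Ω −→ X∗, such that there exists c > 0 with
\[ \big| \langle g(\omega), x \rangle_X \big| \le c \, \|x\|_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } x \in X \tag{2.8} \]
(the exceptional µ-null set may depend on x). Two functions g, h are equivalent in $L^\infty(\Omega; X^*_{w^*})$ (denoted by g ≈ h) if
\[ \langle g(\omega), x \rangle_X = \langle h(\omega), x \rangle_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } x \in X. \]
The infimum of all c > 0 for which the above inequality (2.8) is true is denoted by $\|g\|_{L^\infty(\Omega; X^*_{w^*})}$ and we have
\[ \big| \langle g(\omega), x \rangle_X \big| \le \|g\|_{L^\infty(\Omega; X^*_{w^*})} \, \|x\|_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]
We can easily check that $\|\cdot\|_{L^\infty(\Omega; X^*_{w^*})}$ is a norm.
REMARK 2.2.11 (a) The equivalence relation in $L^\infty(\Omega; X^*_{w^*})$ does not coincide with the usual one in the $L^p$-space, since
\[ \langle g(\omega), x \rangle_X = 0 \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } x \in X \]
does not necessarily imply that
\[ g(\omega) = 0 \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \]
Indeed let Ω = [0, 1] and $X = l^2([0, 1])$ (it is a nonseparable Hilbert space). Then
\[ L^\infty(\Omega; X^*_{w^*}) = L^\infty(\Omega; X_w) \]
and let $g(\omega) = \big( g_a(\omega) \big)_{a \in [0,1]}$ with
\[ g_a(\omega) = \begin{cases} 1 & \text{if } \omega = a, \\ 0 & \text{if } \omega \ne a. \end{cases} \]
Then g ≈ 0 in $L^\infty(\Omega; X_w)$, but
\[ \|g(\omega)\|_X = 1 \quad \text{for a.a. } \omega \in [0, 1]. \]
However, if X is separable and $g \in L^\infty(\Omega; X^*_{w^*})$, then the function
\[ \omega \longmapsto \|g(\omega)\|_{X^*} \]
is measurable, essentially bounded and
\[ \|g\|_{L^\infty(\Omega; X^*_{w^*})} = \operatorname*{esssup}_{\omega \in \Omega} \|g(\omega)\|_{X^*}. \]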
(b) In general, we have $L^\infty(\Omega; X^*_{w^*}) \ne L^\infty(\Omega; X^*)$, even if X is separable. To see this let Ω = [0, 1] and X = C([0, 1]). We know that X∗ = M([0, 1]), the space of finite Borel measures on [0, 1] equipped with the total variation norm. Let g : Ω −→ X∗ be defined by
\[ g(\omega) \stackrel{\mathrm{df}}{=} \delta_\omega, \]
the Dirac measure at ω ∈ [0, 1]. Then $g \in L^\infty(\Omega; X^*_{w^*})$, but it is not strongly measurable, nor equivalent to any strongly measurable function. To see this, note that due to the separability of X, g ≈ h if and only if g(ω) = h(ω) for almost all ω ∈ Ω (with h being strongly measurable). Then g is strongly measurable too and so by virtue of Corollary 2.1.4, there exists a countably valued function u, such that
\[ \|g(\omega) - u(\omega)\|_{X^*} < \ [\ldots] \]
[...] $\subseteq \overline{B}_1^{*} = \big\{ y^* \in X^* : \|y^*\|_{X^*} \le 1 \big\}$, such that
\[ \|x\|_X = \sup_{n \ge 1} \big| \langle x_n^*, x \rangle_X \big| \quad \forall\, x \in X. \]
Therefore for every y∗ ∈ X∗, the function $\omega \longmapsto \|g(\omega) - y^*\|_{X^*}$ is Σ-measurable and then from the proof of Theorem 2.1.3, we can infer that the function ω ↦ g(ω) is strongly measurable. Hence $L^\infty(\Omega; X^*_{w^*}) = L^\infty(\Omega; X^*)$ and Theorem 2.2.12 coincides with Theorem 2.2.9.
In complete analogy with the case of R-valued functions, we introduce the notion of absolutely continuous X-valued function.
DEFINITION 2.2.14 A function f : T = [0, b] −→ X is said to be absolutely continuous, if for every ε > 0, we can find δ(ε) > 0, such that for each sequence $\{(a_n, b_n)\}_{n\ge 1}$ of pairwise disjoint intervals in T with $\sum_{n=1}^{\infty} (b_n - a_n) < \delta$, we have
\[ \sum_{n=1}^{\infty} \|f(b_n) - f(a_n)\|_X < \varepsilon. \]
Also for a function f : T = [0, b] −→ X and a partition $P : 0 = x_0 < \ldots < x_n = b$ of T, we define
\[ V(f, P) \stackrel{\mathrm{df}}{=} \sum_{k=1}^{n} \|f(x_k) - f(x_{k-1})\|_X. \]
The variation of f on T is defined by
\[ V(f)(b) \stackrel{\mathrm{df}}{=} \sup \big\{ V(f, P) : P \text{ is a partition of } T \big\}. \]
When V(f)(b) is finite, we say that f is of bounded variation.
REMARK 2.2.15 Clearly the function t ↦ V(f)(t) is an increasing function and if f : T = [0, b] −→ X is absolutely continuous, then it is of bounded variation. The converse is not true.
It is well known that an R-valued, absolutely continuous function is almost everywhere differentiable on T and it is the indefinite integral of its derivative. The result is no longer true for X-valued functions in general.
EXAMPLE 2.2.16 Let $X = L^1[0, 1]$ and consider the function f : [0, 1] −→ X, defined by
\[ f(t) \stackrel{\mathrm{df}}{=} \chi_{[0,t]} \quad \forall\, t \in [0, 1]. \]
It is easy to see that f is absolutely continuous. However, f is nowhere differentiable on [0, 1]. Indeed, if f is differentiable at $t = t_0 \in [0, 1]$, then for every $g \in L^\infty[0, 1] = \big( L^1[0, 1] \big)^*$, the function
\[ t \longmapsto \vartheta(t) \stackrel{\mathrm{df}}{=} \langle g, f(t) \rangle_{L^1[0,1]} = \int_0^1 g(s) f(t)(s) \, ds = \int_0^t g(s) \, ds \]
is differentiable at $t = t_0$. Let
\[ g(s) \stackrel{\mathrm{df}}{=} \begin{cases} 1 & \text{if } s \le t_0, \\ -1 & \text{if } s > t_0. \end{cases} \]
We have
\[ \vartheta(t) = \begin{cases} t & \text{if } t \le t_0, \\ 2t_0 - t & \text{if } t > t_0, \end{cases} \]
and ϑ clearly is not differentiable at $t = t_0$. Note that in this example $X = L^1[0, 1]$ does not have the RNP.
THEOREM 2.2.17 If X is reflexive and f : T = [0, b] −→ X is absolutely continuous, then f is differentiable at almost all t ∈ T and
\[ f(t) = f(0) + \int_0^t f'(s) \, ds \quad \forall\, t \in T. \]
PROOF Because of Theorem 2.1.3, we may assume that X is also separable. Since f is absolutely continuous, it is of bounded variation and the function t ↦ V(f)(t) is increasing on T = [0, b] (see Definition 2.2.14 and Remark 2.2.15). For 0 ≤ t ≤ t + h ≤ b, we have
\[ \|f(t + h) - f(t)\|_X \le V(f)(t + h) - V(f)(t), \]
so
\[ \frac{\|f(t + h) - f(t)\|_X}{h} \le \frac{1}{h} \big( V(f)(t + h) - V(f)(t) \big) \quad \forall\, h > 0 \]
and
\[ \limsup_{h \to 0} \frac{\|f(t + h) - f(t)\|_X}{h} \le \frac{d}{dt} V(f)(t) < +\infty \quad \text{for a.a. } t \in T. \tag{2.9} \]
Since X is separable, reflexive, X∗ is separable too (see Remark A.3.14). Let $\{x_n^*\}_{n\ge 1}$ be a dense sequence in X∗. For every n ≥ 1, the function $t \longmapsto \langle x_n^*, f(t) \rangle_X$ is differentiable at every point of $T \setminus D_n$, with $\lambda^1(D_n) = 0$ (as before $\lambda^1$ denotes the Lebesgue measure on T). Also let
\[ D_0 \stackrel{\mathrm{df}}{=} \Big\{ t \in T : \limsup_{h \to 0} \frac{\|f(t + h) - f(t)\|_X}{h} = +\infty \Big\} \]
and let us set
\[ D \stackrel{\mathrm{df}}{=} \bigcup_{n=0}^{\infty} D_n. \]
From (2.9), we have that $\lambda^1(D) = 0$. Then for ε > 0 small enough and t ∈ T \ D, the family
\[ \Big\{ \frac{\|f(t + h) - f(t)\|_X}{h} : |h| \le \varepsilon \Big\} \]
is bounded. Since for every n ≥ 1 and every t ∈ T \ D, the limit
\[ \lim_{h \to 0} \Big\langle x_n^*, \frac{f(t + h) - f(t)}{h} \Big\rangle_X \]
exists, we infer that there exists u(t) ∈ X, such that for all x∗ ∈ X∗ and all t ∈ T \ D, we have
\[ \lim_{h \to 0} \Big\langle x^*, \frac{f(t + h) - f(t)}{h} \Big\rangle_X = \langle x^*, u(t) \rangle_X, \]
so f is weakly differentiable at every t ∈ T \ D. Let f′ be the weak derivative of f (i.e., f′(t) = u(t) for all t ∈ T \ D). Clearly f′ is weakly measurable and so by Theorem 2.1.3 it is also strongly measurable. Moreover, from the weak lower semicontinuity of the norm in a Banach space, we have
\[ \|f'(t)\|_X \le \liminf_{h \to 0} \frac{\|f(t + h) - f(t)\|_X}{h} \quad \forall\, t \in T \setminus D. \tag{2.10} \]
Then from (2.10) and Fatou's lemma (see Theorem A.2.1), we have that
\[ \int_0^b \|f'(t)\|_X \, dt \le V(f)(b), \]
i.e., $f' \in L^1(T; X)$. Also
\[ \langle x^*, f(t) - f(0) \rangle_X = \int_0^t \langle x^*, f'(s) \rangle_X \, ds \quad \forall\, x^* \in X^*,\ t \in T, \]
so from Theorem 2.1.17, we have
\[ f(t) - f(0) = \int_0^t f'(s) \, ds \quad \forall\, t \in T \tag{2.11} \]
and finally f is almost everywhere strongly differentiable with
\[ \frac{df}{dt} = f' \in L^1(T; X) \]
and (2.11) holds.
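The failure of differentiability in Example 2.2.16, which Theorem 2.2.17 rules out for reflexive X, can be checked by exact arithmetic. The following is a minimal sketch, assuming only the formula f(t) = χ_[0,t] from that example; the helper names (`l1_norm_two_steps`, `increment_norm`, `quotient_gap`) are hypothetical and chosen for illustration. It computes ‖f(t+h) − f(t)‖ in L^1[0, 1], which equals h (so f is Lipschitz, hence absolutely continuous), and the L^1-distance between the difference quotients for step sizes h and h/2, which equals 1 for every h, so the quotients cannot converge in L^1[0, 1] as h decreases to 0.

```python
from fractions import Fraction


def l1_norm_two_steps(a, len_a, b, len_b):
    """L^1 norm of a*chi_(t, t+len_a] - b*chi_(t, t+len_b].

    Both steps start at the same point t, so the value does not depend on t
    (as long as both intervals stay inside [0, 1]).
    """
    overlap = min(len_a, len_b)
    tail = max(len_a, len_b) - overlap
    # On the tail only the longer step is nonzero.
    tail_height = a if len_a > len_b else b
    return abs(a - b) * overlap + abs(tail_height) * tail


def increment_norm(h):
    # ||f(t+h) - f(t)||_{L^1} = ||chi_(t, t+h]||_{L^1}
    return l1_norm_two_steps(Fraction(1), h, Fraction(0), Fraction(0))


def quotient_gap(h):
    # ||q_h - q_{h/2}||_{L^1}, where q_h = (f(t+h) - f(t)) / h
    return l1_norm_two_steps(1 / h, h, 2 / h, h / 2)


for k in range(1, 6):
    h = Fraction(1, 2 ** k)
    print(h, increment_norm(h), quotient_gap(h))
```

Since the gap between consecutive dyadic difference quotients never shrinks, they do not form a Cauchy sequence in L^1[0, 1], which is consistent with the duality argument given in Example 2.2.16.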
REMARK 2.2.18 The result is more generally true if we assume that X has the RNP. This follows from the fact that the RNP is passed to closed linear subspaces of X and if X is a separable Banach space with the RNP, then it has the separable dual (see Diestel & Uhl (1977, pp. 217–218)). So a careful reading of the previous proof reveals that it remains valid if instead we assume only that X has the RNP.
The next result is an extension of the so-called "Lagrange lemma" and "DuBois-Reymond lemma" (see Denkowski, Migórski & Papageorgiou (2003b, p. 673)) to Banach space valued functions.
PROPOSITION 2.2.19 Let $f \in L^1(T; X)$ (with T = [0, b]).
(a) If
\[ \int_0^b f(t) \vartheta(t) \, dt = 0 \quad \forall\, \vartheta \in C_c^\infty\big((0, b)\big), \]
then f = 0.
(b) If
\[ \int_0^b f(t) \vartheta'(t) \, dt = 0 \quad \forall\, \vartheta \in C_c^\infty\big((0, b)\big), \]
then f is constant.
PROOF (a) By virtue of Theorem 2.1.3, we may assume that X is separable. Then $X^*_{w^*}$ (the dual space X∗ furnished with the w∗-topology) is w∗-separable (in fact $X^*_{w^*}$ is a Souslin space; see Definition A.2.29(b) and Remark A.2.30). Let $\{x_n^*\}_{n\ge 1}$ be w∗-dense in X∗. Then for all n ≥ 1 and all $\vartheta \in C_c^\infty\big((0, b)\big)$, we have
\[ \int_0^b \vartheta(t) \langle x_n^*, f(t) \rangle_X \, dt = \Big\langle x_n^*, \int_0^b f(t) \vartheta(t) \, dt \Big\rangle_X = 0, \]
so by the Lagrange lemma, we have
\[ \langle x_n^*, f(t) \rangle_X = 0 \quad \text{for a.a. } t \in T \text{ and all } n \ge 1 \]
and since $\overline{\{x_n^*\}_{n\ge 1}}^{\,w^*} = X^*$, we obtain that
\[ f(t) = 0 \quad \text{for a.a. } t \in T. \]
(b) The proof is similar, using this time the DuBois-Reymond lemma.
The next proposition permits the identification of the space of X-valued absolutely continuous functions with a vector Sobolev space.
PROPOSITION 2.2.20 If $f, g \in L^1(T; X)$ (with T = [0, b]), then the following conditions are equivalent:
(a) $f(t) = v + \int_0^t g(s) \, ds$, v ∈ X, for almost all t ∈ T;
(b) $\int_0^b f(t) \vartheta'(t) \, dt = - \int_0^b g(t) \vartheta(t) \, dt$ for all $\vartheta \in C_c^\infty\big((0, b)\big)$;
(c) for every x∗ ∈ X∗,
\[ \frac{d}{dt} \langle x^*, f(\cdot) \rangle_X = \langle x^*, g(\cdot) \rangle_X \]
in the distributional sense on (0, b) (see Definition 1.6.1(a)).
PROOF "(a)=⇒(b),(c)": These implications follow from a simple integration by parts.
"(c)=⇒(b)": From the definition of distributional derivative (see Definition 1.6.1(a)), for all $\vartheta \in C_c^\infty\big((0, b)\big)$ and all x∗ ∈ X∗, we have
\[ \int_0^b \langle x^*, f(t) \rangle_X \vartheta'(t) \, dt = - \int_0^b \frac{d}{dt} \langle x^*, f(t) \rangle_X \vartheta(t) \, dt = - \int_0^b \langle x^*, g(t) \rangle_X \vartheta(t) \, dt, \]
so
\[ \int_0^b \langle x^*, \vartheta'(t) f(t) + \vartheta(t) g(t) \rangle_X \, dt = \Big\langle x^*, \int_0^b \big( \vartheta'(t) f(t) + \vartheta(t) g(t) \big) \, dt \Big\rangle_X = 0 \quad \forall\, x^* \in X^* \]
and thus
\[ \int_0^b f(t) \vartheta'(t) \, dt = - \int_0^b g(t) \vartheta(t) \, dt \quad \forall\, \vartheta \in C_c^\infty\big((0, b)\big). \]
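"(b)=⇒(a)": Let
\[ \widehat{f}(t) \stackrel{\mathrm{df}}{=} \int_0^t g(s) \, ds \quad \forall\, t \in T. \]
Evidently $\widehat{f}$ is absolutely continuous and
\[ \widehat{f}\,'(t) = g(t) \quad \text{for a.a. } t \in T. \]
Let
\[ h \stackrel{\mathrm{df}}{=} f - \widehat{f}. \]
Combining (b) with an integration by parts for $\widehat{f}$, we have
\[ \int_0^b h(t) \vartheta'(t) \, dt = 0 \quad \forall\, \vartheta \in C_c^\infty\big((0, b)\big), \]
so, using Proposition 2.2.19(b), we have
\[ h(t) = v \in X \quad \text{for a.a. } t \in T \]
and finally
\[ f(t) = v + \int_0^t g(s) \, ds \quad \text{for a.a. } t \in T. \]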
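The equivalence of (a) and (b) in Proposition 2.2.20 can be illustrated numerically in the simplest setting X = R^2. The sketch below is only an illustration under stated assumptions: the choices b = 2, v = (1, −1), g(s) = (cos s, 2s), the bump function ϑ(t) = exp(−1/(t(b − t))) and the midpoint quadrature are all made up for this example and do not come from the text. With f(t) = v + ∫_0^t g(s) ds, the two sides of (b) should agree up to quadrature error.

```python
import math


def bump(t, b):
    """A smooth test function with compact support in (0, b)."""
    if t <= 0.0 or t >= b:
        return 0.0
    return math.exp(-1.0 / (t * (b - t)))


def bump_prime(t, b):
    """Derivative of bump: exp(-1/q) * q'/q^2 with q = t(b - t), q' = b - 2t."""
    if t <= 0.0 or t >= b:
        return 0.0
    q = t * (b - t)
    return math.exp(-1.0 / q) * (b - 2.0 * t) / (q * q)


def midpoint_integral(func, b, n=20000):
    """Composite midpoint rule on [0, b]."""
    step = b / n
    return step * sum(func((i + 0.5) * step) for i in range(n))


def check_identity(b=2.0):
    v = (1.0, -1.0)  # illustrative constant vector in X = R^2

    def g(s):
        return (math.cos(s), 2.0 * s)

    def f(t):
        # f(t) = v + integral of g over [0, t], computed in closed form
        return (v[0] + math.sin(t), v[1] + t * t)

    lhs = [midpoint_integral(lambda t, k=k: f(t)[k] * bump_prime(t, b), b)
           for k in range(2)]
    rhs = [-midpoint_integral(lambda t, k=k: g(t)[k] * bump(t, b), b)
           for k in range(2)]
    return lhs, rhs


lhs, rhs = check_identity()
print(lhs, rhs)
```

The constant v drops out of the left-hand side because ϑ′ integrates to zero over (0, b), which is the mechanism behind Proposition 2.2.19(b).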
COROLLARY 2.2.21 If $f, g \in L^1(T; X)$ (T = [0, b]) and one of the equivalent statements (a), (b) or (c) in Proposition 2.2.20 holds, then f is almost everywhere equal to an absolutely continuous function $f_1$ : T −→ X.
Extending the notion of distributional (weak) derivative and the resulting Sobolev spaces (see Definition 1.6.1) to X-valued functions, we make the following definitions.
DEFINITION 2.2.22 (a) Let $f, g \in L^1(T; X)$ (with T = [0, b]). We say that g is the distributional (weak) derivative of f, if
\[ \int_0^b f(t) \vartheta'(t) \, dt = - \int_0^b g(t) \vartheta(t) \, dt \quad \forall\, \vartheta \in C_c^\infty\big((0, b)\big). \]
We denote this derivative of f by Df.
(b) Let p ∈ [1, +∞] and T = [0, b]. We define
\[ W^{1,p}\big((0, b); X\big) \stackrel{\mathrm{df}}{=} \big\{ f \in L^p(T; X) : Df \in L^p(T; X) \big\}. \]
(c) Let p ∈ [1, +∞] and T = [0, b]. We define
\[ AC^{1,p}(T, X) \stackrel{\mathrm{df}}{=} \big\{ f : T \to X : f \text{ is absolutely continuous, differentiable almost everywhere with derivative } f' \in L^p(T; X) \big\}. \]
REMARK 2.2.23 According to Theorem 2.2.17 (see also Remark 2.2.18), if X is reflexive (or more generally if X has RNP), then $f \in AC^{1,p}(T, X)$ if and only if there exists a function $g \in L^p(T; X)$, such that
\[ f(t) = f(0) + \int_0^t g(s) \, ds \quad \forall\, t \in T. \]
Invoking Proposition 2.2.20, we see that the spaces $W^{1,p}\big((0, b); X\big)$ and $AC^{1,p}(T, X)$ (for p ∈ [1, +∞]) can be identified.
THEOREM 2.2.24 If p ∈ [1, +∞] and $f \in L^p(T; X)$ (with T = [0, b]), then the following statements are equivalent:
(a) $f \in W^{1,p}\big((0, b); X\big)$;
(b) there exists $f_1 \in AC^{1,p}(T, X)$, such that $f(t) = f_1(t)$ for almost all t ∈ T.
REMARK 2.2.25 In Section 2.4, we shall see that this property distinguishes Sobolev functions of one variable (i.e., defined on (0, b)) from Sobolev functions of several variables (i.e., functions defined on an open set Z ⊆ R^N with N > 1).
PROPOSITION 2.2.26 If X is reflexive, p ∈ (1, +∞) and $f \in L^p(T; X)$ (with T = [0, b]), then the following two conditions are equivalent:
(a) $f \in W^{1,p}\big((0, b); X\big)$ (or there exists $f_1 \in AC^{1,p}(T, X)$, such that $f(t) = f_1(t)$ for almost all t ∈ T);
(b) $\int_0^{b-h} \|f(t + h) - f(t)\|_X^p \, dt \le c h^p$ for some c > 0 and all h ∈ (0, b).
PROOF "(a)=⇒(b)": By Theorem 2.2.24, we have
\[ f(t + h) - f(t) = \int_t^{t+h} Df_1(s) \, ds \quad \forall\, t, t + h \in T = [0, b]. \]
By Jensen inequality (see Theorem A.2.26), we have
\[ \|f(t + h) - f(t)\|_X^p \le h^{p-1} \int_t^{t+h} \|Df_1(s)\|_X^p \, ds, \]
so
\[ \int_0^{b-h} \|f(t + h) - f(t)\|_X^p \, dt \le h^{p-1} \int_0^{b-h} \int_t^{t+h} \|Df_1(s)\|_X^p \, ds \, dt. \tag{2.12} \]
Note that
\[ \int_0^{b-h} \frac{1}{h} \int_t^{t+h} \|Df_1(s)\|_X^p \, ds \, dt \longrightarrow \int_0^b \|Df_1(s)\|_X^p \, ds \quad \text{as } h \to 0 \]
(see Proposition 2.1.22). So from (2.12), we conclude that
\[ \int_0^{b-h} \|f(t + h) - f(t)\|_X^p \, dt \le c h^p \quad \forall\, h \in (0, b), \]
for some constant c > 0.
"(b)=⇒(a)": For every n ≥ 1, let
\[ g_n(t) \stackrel{\mathrm{df}}{=} \chi_{[0, b - \frac{1}{n}]}(t) \, \frac{f\big(t + \frac{1}{n}\big) - f(t)}{\frac{1}{n}}. \]
By virtue of condition (b), the sequence $\{g_n\}_{n\ge 1} \subseteq L^p(T; X)$ is bounded. Since p ∈ (1, +∞) and X is reflexive, the Lebesgue-Bochner space $L^p(T; X)$ is reflexive too (see Proposition 2.2.3(c)). So by the Eberlein-Smulian theorem (see Theorem A.3.8), we may assume that
\[ g_n \stackrel{w}{\longrightarrow} g \quad \text{in } L^p(T; X), \]
for some $g \in L^p(T; X)$.
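For every $\vartheta^* \in C_c^\infty\big((0, b); X^*\big)$, we have
\[ \int_0^b \langle \vartheta^*(t), g(t) \rangle_X \, dt = \lim_{n \to +\infty} \int_0^b \langle \vartheta^*(t), g_n(t) \rangle_X \, dt = \lim_{n \to +\infty} \int_0^{b - \frac{1}{n}} \Big\langle \vartheta^*(t), \frac{f\big(t + \frac{1}{n}\big) - f(t)}{\frac{1}{n}} \Big\rangle_X dt \]
\[ = \lim_{n \to +\infty} \bigg[ \int_0^{b - \frac{1}{n}} \Big\langle \frac{\vartheta^*\big(t - \frac{1}{n}\big) - \vartheta^*(t)}{\frac{1}{n}}, f(t) \Big\rangle_X dt - n \int_{b - \frac{1}{n}}^{b} \langle \vartheta^*(t), f(t) \rangle_X \, dt \bigg]. \tag{2.13} \]
Because $\vartheta^* \in C_c^\infty\big((0, b); X^*\big)$, for n ≥ 1 large enough, we have
\[ \int_{b - \frac{1}{n}}^{b} \langle \vartheta^*(t), f(t) \rangle_X \, dt = 0. \]
Also
\[ \int_0^{b - \frac{1}{n}} \Big\langle \frac{\vartheta^*\big(t - \frac{1}{n}\big) - \vartheta^*(t)}{\frac{1}{n}}, f(t) \Big\rangle_X dt \longrightarrow - \int_0^b \langle \vartheta^{*\prime}(t), f(t) \rangle_X \, dt. \]
So from (2.13), we have
\[ \int_0^b \langle \vartheta^*(t), g(t) \rangle_X \, dt = - \int_0^b \langle \vartheta^{*\prime}(t), f(t) \rangle_X \, dt \quad \forall\, \vartheta^* \in C_c^\infty\big((0, b); X^*\big) \]
and finally
\[ Df = g \quad \text{in } L^p(T; X), \]
i.e., $f \in W^{1,p}\big((0, b); X\big)$.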
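Condition (b) of Proposition 2.2.26 can be observed numerically for a smooth function. In the sketch below, which is only an illustration, X = R^2 with the Euclidean norm, b = 1, p = 3 and f(t) = (sin 3t, t^2) are assumptions made up for this example, and the integrals are approximated by a midpoint rule. Estimate (2.12), combined with an interchange of the order of integration, suggests that c can be taken to be the integral of ‖f′(s)‖^p over [0, b]; the code prints the ratio of the left-hand side of (b) to c·h^p for several h, and each ratio should not exceed 1.

```python
import math


def f(t):
    # Illustrative smooth curve in X = R^2
    return (math.sin(3.0 * t), t * t)


def f_prime(t):
    return (3.0 * math.cos(3.0 * t), 2.0 * t)


def norm(x):
    return math.hypot(x[0], x[1])


def midpoint_integral(func, lo, hi, n=4000):
    """Composite midpoint rule on [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(func(lo + (i + 0.5) * step) for i in range(n))


def shifted_difference_integral(h, b=1.0, p=3.0):
    """Integral over [0, b - h] of ||f(t + h) - f(t)||^p."""
    def integrand(t):
        u, w = f(t + h), f(t)
        return norm((u[0] - w[0], u[1] - w[1])) ** p
    return midpoint_integral(integrand, 0.0, b - h)


def derivative_energy(b=1.0, p=3.0):
    """Integral over [0, b] of ||f'(s)||^p, the candidate constant c."""
    return midpoint_integral(lambda s: norm(f_prime(s)) ** p, 0.0, b)


def ratios(hs, b=1.0, p=3.0):
    c = derivative_energy(b, p)
    return [shifted_difference_integral(h, b, p) / (c * h ** p) for h in hs]


print(ratios([0.5, 0.25, 0.1, 0.05]))
```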
To prove the next result concerning X-valued functions, we shall need the following general result about embeddings of Banach spaces, which will also be helpful in our discussion of evolution triples later in this section.
LEMMA 2.2.27 If Y is another Banach space, such that X ⊆ Y, the embedding is continuous and X is dense in Y, then
(a) the embedding Y∗ ⊆ X∗ is continuous;
(b) if X is reflexive, then Y∗ is dense in X∗.
PROOF (a) Since by hypothesis X is embedded continuously in Y, there exists $c_1 > 0$, such that
\[ \|x\|_Y \le c_1 \|x\|_X \quad \forall\, x \in X. \]
Let y∗ ∈ Y∗. Then
\[ \big| \langle y^*, x \rangle_Y \big| \le \|y^*\|_{Y^*} \|x\|_Y \le c_1 \|y^*\|_{Y^*} \|x\|_X. \tag{2.14} \]
Let $\widehat{y}^* = y^*|_X$. Then from (2.14), we have $\widehat{y}^* \in X^*$ and
\[ \|\widehat{y}^*\|_{X^*} \le c_1 \|y^*\|_{Y^*}. \tag{2.15} \]
We show that $\widehat{y}^* = 0$ implies that y∗ = 0. Indeed for all x ∈ X, we have
\[ 0 = \langle \widehat{y}^*, x \rangle_X = \langle y^*, x \rangle_Y. \]
Because X is dense in Y, it follows that y∗ = 0. So the map i∗ : Y∗ −→ X∗, defined by
\[ i^*(y^*) \stackrel{\mathrm{df}}{=} \widehat{y}^*, \]
is continuous, injective. Hence y∗ can be identified with $\widehat{y}^*$ and so Y∗ ⊆ X∗ with continuous injection (see (2.15)).
(b) Suppose that the assertion is not true. Then
\[ \overline{Y^*}^{\,\|\cdot\|_{X^*}} \ne X^* \]
and so by the Hahn-Banach theorem, we can find u ∈ X∗∗ = X (since X is reflexive), u ≠ 0, such that
\[ \langle x^*, u \rangle_X = 0 \quad \forall\, x^* \in Y^*. \]
It follows that u = 0, a contradiction.
PROPOSITION 2.2.28 If X is reflexive, Y is another Banach space, X ⊆ Y, the embedding is continuous and $f \in L^\infty(T; X) \cap C(T; Y_w)$ (T = [0, b] and $Y_w$ is the Banach space Y equipped with the weak topology), then $f \in C(T; X_w)$ (where $X_w$ is the Banach space X equipped with the weak topology).
PROOF By replacing Y with $\overline{X}^{\,\|\cdot\|_Y}$ if necessary, we may assume that X is dense in Y. So by virtue of Lemma 2.2.27(b), Y∗ ⊆ X∗ and the embedding is continuous and dense. From Corollary 2.1.4, we know that there exists a sequence $\{f_n\}_{n\ge 1}$ of X-valued, countably valued functions on T, such that
\[ f_n \longrightarrow f \quad \text{uniformly on } T \text{ in } X. \]
We know that
\[ \|f_n(t)\|_X \le c_1 \|f\|_{L^\infty(T; X)} \quad \forall\, t \in T,\ n \ge 1, \]
for some $c_1 > 0$ and
\[ \langle y^*, f_n(t) \rangle_X \longrightarrow \langle y^*, f(t) \rangle_X \quad \forall\, y^* \in Y^*,\ t \in T. \]
It follows that
\[ \big| \langle y^*, f_n(t) \rangle_X \big| \le c_1 \|y^*\|_{X^*} \|f\|_{L^\infty(T; X)} \quad \forall\, n \ge 1,\ t \in T, \]
thus
\[ \big| \langle y^*, f(t) \rangle_X \big| \le c_1 \|y^*\|_{X^*} \|f\|_{L^\infty(T; X)} \quad \forall\, t \in T \]
and so
\[ f(t) \in X \quad \text{and} \quad \|f(t)\|_X \le c_1 \|f\|_{L^\infty(T; X)} \quad \forall\, t \in T. \tag{2.16} \]
Next let x∗ ∈ X∗. We can find a sequence $\{y_m^*\}_{m\ge 1} \subseteq Y^*$, such that
\[ y_m^* \longrightarrow x^* \quad \text{in } X^*. \]
Also let $t_n \to t$ in T. Because $f \in C(T; Y_w)$, we have
\[ \langle y_m^*, f(t_n) \rangle_X = \langle y_m^*, f(t_n) \rangle_Y \longrightarrow \langle y_m^*, f(t) \rangle_Y \quad \text{as } n \to +\infty, \text{ for all } m \ge 1 \text{ and all } t \in T. \tag{2.17} \]
Also we have
\[ \langle y_m^*, f(t) \rangle_Y = \langle y_m^*, f(t) \rangle_X \longrightarrow \langle x^*, f(t) \rangle_X \quad \text{as } m \to +\infty, \text{ for all } t \in T. \tag{2.18} \]
From (2.17) and (2.18), via the double limit lemma (see Proposition A.2.35), we deduce that there exists a sequence $\{m(n)\}_{n\ge 1}$ increasing (not necessarily strictly) to +∞ such that
\[ \langle y_{m(n)}^*, f(t_n) \rangle_X \longrightarrow \langle x^*, f(t) \rangle_X. \tag{2.19} \]
From (2.16) and (2.19), we have
\[ \big| \langle x^*, f(t_n) \rangle_X - \langle x^*, f(t) \rangle_X \big| \le \big| \langle x^*, f(t_n) \rangle_X - \langle y_{m(n)}^*, f(t_n) \rangle_X \big| + \big| \langle y_{m(n)}^*, f(t_n) \rangle_X - \langle x^*, f(t) \rangle_X \big| \]
\[ \le \|x^* - y_{m(n)}^*\|_{X^*} \|f(t_n)\|_X + \big| \langle y_{m(n)}^*, f(t_n) \rangle_X - \langle x^*, f(t) \rangle_X \big| \longrightarrow 0, \]
so $f \in C(T; X_w)$.
The next lemma is crucial in obtaining compactness theorems for function spaces which arise in the study of evolution equations.
LEMMA 2.2.29 If X, Y, Z are three Banach spaces, such that X ⊆ Y ⊆ Z with the first embedding compact and the second continuous, then for every ξ > 0, we can find c(ξ) > 0, such that
\[ \|x\|_Y \le \xi \|x\|_X + c(\xi) \|x\|_Z \quad \forall\, x \in X. \]
PROOF Suppose the lemma is not true. Then we can find ξ > 0 and a sequence $\{x_n\}_{n\ge 1}$ ⊆ X, such that
\[ \|x_n\|_Y > \xi \|x_n\|_X + n \|x_n\|_Z \quad \forall\, n \ge 1. \]
Let $y_n \stackrel{\mathrm{df}}{=} \frac{x_n}{\|x_n\|_X}$ for all n ≥ 1. We have
\[ \|y_n\|_Y > \xi + n \|y_n\|_Z \quad \forall\, n \ge 1. \tag{2.20} \]
Since $\|y_n\|_X = 1$ for all n ≥ 1 and the embedding X ⊆ Y is compact, from (2.20), we have that
\[ \|y_n\|_Z \longrightarrow 0 \tag{2.21} \]
and also the sequence $\{y_n\}_{n\ge 1}$ ⊆ Y is relatively compact. Thus we can find a subsequence $\{y_{n_k}\}_{k\ge 1}$ of $\{y_n\}_{n\ge 1}$, such that
\[ y_{n_k} \longrightarrow u \quad \text{in } Y. \]
Since Y is embedded continuously in Z, we have also that $y_{n_k} \to u$ in Z. Because of (2.21), we have that u = 0. On the other hand from (2.20) in the limit as k → +∞, we have $\|u\|_Y \ge \xi > 0$, a contradiction. This proves the lemma.
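Lemma 2.2.29 is proved by contradiction and gives no formula for c(ξ), but in a simple model case the constant can be written down. The sketch below is an illustration under assumptions that are not in the text: X, Y, Z are taken to be weighted sequence spaces with norms ‖x‖_X = (Σ k² x_k²)^{1/2}, ‖x‖_Y = (Σ x_k²)^{1/2}, ‖x‖_Z = (Σ x_k²/k²)^{1/2} (so X ⊆ Y is compact and Y ⊆ Z is continuous), truncated to finitely many coordinates for computation. There the Cauchy-Schwarz inequality gives ‖x‖_Y² ≤ ‖x‖_X ‖x‖_Z, and the arithmetic-geometric mean inequality then yields the estimate of the lemma with c(ξ) = 1/(4ξ). The function names are hypothetical.

```python
import math
import random


def norm_x(x):
    # Weighted norm with weights k (the "strong" space X)
    return math.sqrt(sum((k * xk) ** 2 for k, xk in enumerate(x, start=1)))


def norm_y(x):
    # Plain Euclidean norm (the intermediate space Y)
    return math.sqrt(sum(xk ** 2 for xk in x))


def norm_z(x):
    # Weighted norm with weights 1/k (the "weak" space Z)
    return math.sqrt(sum((xk / k) ** 2 for k, xk in enumerate(x, start=1)))


def ehrling_gap(x, xi):
    """xi*||x||_X + ||x||_Z/(4*xi) - ||x||_Y; nonnegative when the estimate holds."""
    return xi * norm_x(x) + norm_z(x) / (4.0 * xi) - norm_y(x)


def worst_gap(xi, trials=200, dim=50, seed=0):
    """Smallest gap over randomly drawn vectors (seeded for reproducibility)."""
    rng = random.Random(seed)
    return min(
        ehrling_gap([rng.uniform(-1.0, 1.0) for _ in range(dim)], xi)
        for _ in range(trials)
    )


for xi in (0.01, 0.1, 1.0):
    print(xi, worst_gap(xi))
```

The term c(ξ)‖x‖_Z cannot be dropped: for the first unit vector, ‖x‖_Y = ‖x‖_X = 1, so ‖x‖_Y ≤ ξ‖x‖_X fails for every ξ < 1.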
Let $X, Y, Z$ be three Banach spaces, with $X, Y$ reflexive. Assume that $X \subseteq Y \subseteq Z$, with the embeddings being continuous. Moreover, we suppose that the first embedding is compact. Let $T = [0, b]$ and $1 < p, r$. We introduce the space
\[
W_{pr}(T) \overset{df}{=} \bigl\{ u \in L^p(T; X) : u' = Du \in L^r(T; Z) \bigr\}.
\]
Here $u' = Du$ denotes the derivative in the distributional sense in $Z$, i.e.,
\[
\int_0^b u(t)\,\vartheta'(t)\,dt = -\int_0^b u'(t)\,\vartheta(t)\,dt \quad \text{in } Z \quad \forall\, \vartheta \in C_c^\infty\bigl((0,b)\bigr).
\]
We furnish $W_{pr}(T)$ with the norm $\|u\|_{pr} = \|u\|_p + \|u'\|_r$. Clearly $W_{pr}(T)$ normed this way is a Banach space. Indeed, consider the isomorphism $\eta : W_{pr}(T) \longrightarrow L^p(T; X) \times L^r(T; Z)$, given by
\[
\eta(x) \overset{df}{=} (x, x') \quad \forall\, x \in W_{pr}(T)
\]
and view $W_{pr}(T)$ as a closed subspace of $L^p(T; X) \times L^r(T; Z)$. Moreover, if $X$ and $Z$ are separable, then so is $W_{pr}(T)$ and finally if $X$ and $Z$ are reflexive, then $W_{pr}(T)$ is reflexive too.

It is evident that $W_{pr}(T) \subseteq L^p(T; X) \subseteq L^p(T; Z) \subseteq L^s(T; Z)$, with $s = \min\{p, r\}$. Then
\[
W_{pr}(T) \subseteq W^{1,s}\bigl((0,b); Z\bigr)
\]
and so every $u \in W_{pr}(T)$ viewed as a $Z$-valued function belongs in $AC^{1,s}(T; Z)$ (see Theorem 2.2.24). Therefore the derivative $u' = Du$ is actually a strong derivative in $Z$ almost everywhere, i.e.,
\[
u' = \frac{du}{dt}.
\]
We note that
\[
W_{pr}(T) \subseteq L^p(T; X) \subseteq L^p(T; Y)
\]
and clearly the embeddings are continuous. We can say more about the embedding $W_{pr}(T) \subseteq L^p(T; Y)$, provided we strengthen our conditions on the spaces $X$, $Y$ and $Z$.
THEOREM 2.2.30 If $X, Y, Z$ are Banach spaces, with $X, Z$ being reflexive, the embeddings $X \subseteq Y \subseteq Z$ being continuous and the embedding $X \subseteq Y$ being compact, then the embedding $W_{pr}(T) \subseteq L^p(T; Y)$ is compact.

PROOF Let $\{u_n\}_{n\ge 1} \subseteq W_{pr}(T)$ be a bounded sequence. We need to show that it has a subsequence which converges strongly in $L^p(T; Y)$. Note that $W_{pr}(T)$ is reflexive. Passing to a subsequence if necessary, we may assume that $u_n \xrightarrow{w} u$ in $W_{pr}(T)$. This means that
\[
u_n \xrightarrow{w} u \ \text{ in } L^p(T; X) \quad \text{and} \quad u_n' \xrightarrow{w} u' \ \text{ in } L^r(T; Z).
\]
Recall that $W_{pr}(T) \subseteq C(T; Z)$.

Claim 1. The embedding $W_{pr}(T) \subseteq C(T; Z)$ is continuous.

To see this suppose that
\[
u_n \longrightarrow u \quad \text{in } W_{pr}(T). \tag{2.22}
\]
Then
\[
u_n(t) \longrightarrow u(t) \quad \text{in } X \quad \forall\, t \in T \setminus D,
\]
with $\lambda^1(D) = 0$. Evidently
\[
u_n(t) \longrightarrow u(t) \quad \text{in } Z \quad \forall\, t \in T \setminus D.
\]
For $t \in T$ and $s \in T \setminus D$, from Proposition 2.2.20, we have
\[
\|u_n(t) - u(t)\|_Z \le \|u_n(s) - u(s)\|_Z + c_1\,\|u_n' - u'\|_{L^r(T;Z)} \quad \forall\, n \ge 1, \tag{2.23}
\]
for some $c_1 > 0$. For every $n \ge 1$, we choose $t_n \in T$, such that
\[
\|u_n - u\|_{C(T;Z)} = \|u_n(t_n) - u(t_n)\|_Z.
\]
So from (2.22) and (2.23), we have
\[
\|u_n - u\|_{C(T;Z)} \le \|u_n(s) - u(s)\|_Z + c_1\,\|u_n' - u'\|_{L^r(T;Z)} \longrightarrow 0.
\]
This proves Claim 1.

Let
\[
v_n \overset{df}{=} u_n - u \quad \forall\, n \ge 1.
\]
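Then from Claim 1, it follows that we can find $c_2 > 0$, such that
\[
\|v_n\|_{C(T;Z)} = \|u_n - u\|_{C(T;Z)} \le c_2 \quad \forall\, n \ge 1. \tag{2.24}
\]
We claim that
\[
v_n(t) \longrightarrow 0 \quad \text{in } Z \quad \forall\, t \in T.
\]
We shall prove this for $t = 0$. The proof is similar for any other $t \in T$. We have
\[
v_n(0) = v_n(t) - \int_0^t v_n'(\tau)\,d\tau,
\]
so
\[
v_n(0) = \frac{1}{s}\int_0^s v_n(t)\,dt - \frac{1}{s}\int_0^s\!\!\int_0^t v_n'(\tau)\,d\tau\,dt,
\]
thus
\[
v_n(0) = \xi_n + \eta_n \quad \forall\, n \ge 1,
\]
with
\[
\xi_n = \frac{1}{s}\int_0^s v_n(t)\,dt \quad \text{and} \quad \eta_n = -\frac{1}{s}\int_0^s\!\!\int_0^t v_n'(\tau)\,d\tau\,dt \quad \forall\, n \ge 1.
\]
Note that
\[
\eta_n = -\frac{1}{s}\int_0^s (s - t)\,v_n'(t)\,dt.
\]
For a given $\varepsilon > 0$, select $s \in T$ so that
\[
\|\eta_n\|_Z \le \int_0^s \|v_n'(t)\|_Z\,dt \le \frac{\varepsilon}{2} \quad \forall\, n \ge 1.
\]
For this fixed $s \in T$, note that
\[
\xi_n \xrightarrow{w} 0 \quad \text{in } X
\]
and so
\[
\xi_n \longrightarrow 0 \quad \text{in } Z
\]
(since $X$ is embedded compactly in $Z$). So for $n \ge 1$ large enough, we have
\[
\|\xi_n\|_Z \le \frac{\varepsilon}{2}.
\]
This means that
\[
v_n(t) \longrightarrow 0 \quad \text{in } Z \quad \forall\, t \in T.
\]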
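The two facts about $\eta_n$ used in this step can be checked directly. The sketch below assumes, as in the setting of the theorem, that $\{v_n'\}_{n\ge 1}$ is bounded in $L^r(T;Z)$ with $r > 1$; the constant $M$ and the conjugate exponent $r'$ are notation introduced only here.

```latex
% Notation for this sketch only:
%   M := \sup_n \|v_n'\|_{L^r(T;Z)} < \infty,   1/r + 1/r' = 1.
\begin{align*}
\int_0^s\!\!\int_0^t v_n'(\tau)\,d\tau\,dt
 &= \int_0^s\Bigl(\int_\tau^s dt\Bigr)\,v_n'(\tau)\,d\tau
  = \int_0^s (s-\tau)\,v_n'(\tau)\,d\tau ,\\
\|\eta_n\|_Z
 &\le \frac1s\int_0^s (s-t)\,\|v_n'(t)\|_Z\,dt
 \le \int_0^s \|v_n'(t)\|_Z\,dt
 \le s^{1/r'}\,\|v_n'\|_{L^r(T;Z)}
 \le s^{1/r'} M .
\end{align*}
% Choosing s > 0 with s^{1/r'} M \le \varepsilon/2 gives the bound uniformly in n.
```

The first line is Fubini's theorem on the triangle $\{0 \le \tau \le t \le s\}$; the last estimate is Hölder's inequality, and it is here that $r > 1$ is used.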
Because of (2.24), we can apply Proposition 2.1.13 and infer that
\[
v_n \longrightarrow 0 \quad \text{in } L^p(T; Z).
\]
By virtue of Lemma 2.2.29, for a given $\gamma > 0$ we can find $c(\gamma) > 0$, such that
\[
\|v_n\|_{L^p(T;Y)} \le \gamma\,\|v_n\|_{L^p(T;X)} + c(\gamma)\,\|v_n\|_{L^p(T;Z)},
\]
so
\[
\|v_n\|_{L^p(T;Y)} \le \gamma c_3 + c(\gamma)\,\|v_n\|_{L^p(T;Z)} \quad \forall\, n \ge 1, \tag{2.25}
\]
for some $c_3 > 0$. Since $\|v_n\|_{L^p(T;Z)} \longrightarrow 0$, from (2.25) we obtain $\limsup_{n\to+\infty} \|v_n\|_{L^p(T;Y)} \le \gamma c_3$, and since $\gamma > 0$ was arbitrary, we infer that
\[
\limsup_{n\to+\infty} \|v_n\|_{L^p(T;Y)} \le 0,
\]
i.e., $v_n \to 0$ in $L^p(T; Y)$.

Now we are about to introduce a notion that plays a central role in the study of evolution equations. The modern strategy in studying parabolic equations is to make use of many different function spaces. The concept of evolution triple, which we define next, provides an appropriate analytical framework to realize this strategy.

DEFINITION 2.2.31 A triple of spaces $(X, H, X^*)$ is said to be an evolution triple, if the following are true:
(a) $X$ is a separable, reflexive Banach space;
(b) $H$ is a separable Hilbert space;
(c) the embedding $X \subseteq H$ is continuous and dense.

REMARK 2.2.32 By virtue of Lemma 2.2.27(b), the embedding $H^* \subseteq X^*$ is continuous and dense. Since by the Riesz-Fréchet representation theorem (see, e.g., Denkowski, Migórski & Papageorgiou (2003a, p. 316)) we can assume that $H = H^*$, then we have that all embeddings $X \subseteq H \subseteq X^*$ are continuous and dense. For all $h \in H$ and all $x \in X$, we have $\langle h, x\rangle_X = (h, x)_H$, i.e., $\langle\cdot,\cdot\rangle_X|_{H\times X} = (\cdot,\cdot)_H$. Also for $x^* \in X^*$, $x \in X$, we have
\[
\langle x^*, x\rangle_X = \lim_{\substack{h \xrightarrow{\ \|\cdot\|_{X^*}\ } x^* \\ h \in H}} (h, x)_H
\]
(since $H$ is dense in $X^*$). Therefore if $X$ is a Hilbert space too, we do not represent the elements of $X^*$ using the inner product of $X$ (the Riesz-Fréchet theorem), but using the inner product of $H$.
EXAMPLE 2.2.33 If $Z \subseteq \mathbb{R}^N$ is a bounded open set with smooth boundary and $p \in [2, +\infty)$, then as we shall see in Section 2.4, the spaces
\[
X = W^{1,p}(Z), \quad H = L^2(Z) \quad \text{and} \quad X^* = \bigl(W^{1,p}(Z)\bigr)^*
\]
form an evolution triple.

For the evolution triple $(X, H, X^*)$, we can consider the reflexive Banach space
\[
W_{pp'}(T) \overset{df}{=} \bigl\{ u \in L^p(T; X) : u' \in L^{p'}(T; X^*) \bigr\},
\]
with $\frac{1}{p} + \frac{1}{p'} = 1$, introduced earlier. In the next proposition we establish a regularity property for the elements of $W_{pp'}(T)$ and also derive an “integration by parts formula,” which is crucial in the treatment of evolution equations.

PROPOSITION 2.2.34 (Integration by Parts Formula) If $(X, H, X^*)$ is an evolution triple and $1 < p, p' < +\infty$ with $\frac{1}{p} + \frac{1}{p'} = 1$, then
(a) $W_{pp'}(T) \subseteq C(T; H)$ and the embedding is continuous;
(b) for all $u, v \in W_{pp'}(T)$ and all $0 \le s \le t \le b$, we have
\[
\bigl(u(t), v(t)\bigr)_H - \bigl(u(s), v(s)\bigr)_H = \int_s^t \bigl[\langle u'(\tau), v(\tau)\rangle_X + \langle u(\tau), v'(\tau)\rangle_X\bigr]\,d\tau.
\]
PROOF (a) Note that by the generalized Weierstrass approximation theorem, the space of $X$-valued polynomials is dense in $W_{pp'}(T)$. In particular then the embedding $C^1(T; X) \subseteq W_{pp'}(T)$ is dense. Now let $u, v \in C^1(T; X)$. We have
\[
\frac{d}{dt}\bigl(u(t), v(t)\bigr)_H = \bigl(u'(t), v(t)\bigr)_H + \bigl(u(t), v'(t)\bigr)_H \quad \forall\, t \in T.
\]
Thus
\[
\bigl(u(t), v(t)\bigr)_H - \bigl(u(s), v(s)\bigr)_H = \int_s^t \Bigl[\bigl(u'(\tau), v(\tau)\bigr)_H + \bigl(u(\tau), v'(\tau)\bigr)_H\Bigr]\,d\tau \quad \forall\, 0 \le s \le t \le b,
\]
so
\[
\bigl(u(t), v(t)\bigr)_H - \bigl(u(s), v(s)\bigr)_H = \int_s^t \Bigl[\langle u'(\tau), v(\tau)\rangle_X + \langle u(\tau), v'(\tau)\rangle_X\Bigr]\,d\tau \quad \forall\, 0 \le s \le t \le b. \tag{2.26}
\]
Choose $\vartheta \in C^1(\mathbb{R})$, such that
\[
\vartheta(s) = 0, \quad \vartheta(t) = 1 \quad \text{and} \quad |\vartheta| + |\vartheta'| \le 1 \ \text{ on } \mathbb{R}.
\]
Let $v \overset{df}{=} \vartheta u$. Then
\[
v' = \vartheta' u + \vartheta u'
\]
and using Hölder's inequality (see Theorem A.2.27) from (2.26), we obtain
\[
|u(t)|^2 \le c_1\,\|u\|_{pp'}^2 \quad \forall\, t \in T,
\]
for some $c_1 > 0$ and so
\[
\|u\|_{C(T;H)} \le \sqrt{c_1}\,\|u\|_{pp'} \quad \forall\, u \in C^1(T; X). \tag{2.27}
\]
Therefore the identity map
\[
i : \bigl(C^1(T; X), \|\cdot\|_{pp'}\bigr) \longrightarrow C(T; H)
\]
is continuous. But as we said in the beginning of the proof the embedding $C^1(T; X) \subseteq W_{pp'}(T)$ is dense. So we can extend $i$ continuously on $W_{pp'}(T)$. Hence the embedding $W_{pp'}(T) \subseteq C(T; H)$ is continuous.

(b) The integration by parts formula follows from (2.26) and the density of the embedding $C^1(T; X) \subseteq W_{pp'}(T)$.

REMARK 2.2.35
Even if the embedding $X \subseteq H$ is compact, the embedding $W_{pp'}(T) \subseteq C(T; H)$ is not compact (see Migórski (1994)). In general, if $X$ and $Z$ are Banach spaces, the embedding $X \subseteq Z$ is continuous (a special case is if $X = Z$), $p, r \in [1, \infty]$ and
\[
W_{pr}(T) \overset{df}{=} \bigl\{ u \in L^p(T; X) : u' \in L^r(T; Z) \bigr\},
\]
then $W_{pr}(T) \subseteq C(T; Z)$. This inclusion as well as that of Proposition 2.2.34(a) means that if $u \in W_{pp'}(T)$ (respectively $u \in W_{pr}(T)$), then there exists $u_1 \in C(T; H)$ (respectively $u_1 \in C(T; Z)$), such that
\[
u(t) = u_1(t) \quad \text{for a.a. } t \in T.
\]
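A frequently used special case of the integration by parts formula of Proposition 2.2.34(b) is obtained by taking $v = u$. The short computation below only specializes that formula for $u \in W_{pp'}(T)$; it introduces no new hypotheses.

```latex
% Special case v = u of Proposition 2.2.34(b); both pairings in the integrand coincide.
\begin{align*}
|u(t)|_H^2 - |u(s)|_H^2
 &= \int_s^t \bigl[\langle u'(\tau),u(\tau)\rangle_X + \langle u(\tau),u'(\tau)\rangle_X\bigr]\,d\tau
  = 2\int_s^t \langle u'(\tau),u(\tau)\rangle_X\,d\tau
 \qquad \forall\, 0\le s\le t\le b .
\end{align*}
% Since \tau \mapsto \langle u'(\tau),u(\tau)\rangle_X is integrable (Hölder), t \mapsto |u(t)|_H^2
% is absolutely continuous and
%   \frac{d}{dt}|u(t)|_H^2 = 2\langle u'(t),u(t)\rangle_X   for a.a. t in T.
```

This is the form in which the formula typically enters energy estimates for evolution equations.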
2.3 Compactness Results
In this section we prove compactness and weak compactness results for subsets of $C(T; X)$ and $L^p(T; X)$ ($p \in [1, +\infty)$). Throughout this section $T = [0, b]$ ($b < +\infty$) and $X$ is a Banach space. Additional hypotheses will be introduced as needed. We start with the classical “Arzela-Ascoli theorem” which characterizes the compact subsets of $C(T; X)$. In its proof we shall need the following lemma.

LEMMA 2.3.1 If $K \subseteq X$ is a nonempty set and for every $\varepsilon > 0$ there exists a relatively compact set $K_\varepsilon \subseteq X$, such that for every $x \in K$ we can find $x_\varepsilon \in K_\varepsilon$, such that $\|x - x_\varepsilon\|_X < \varepsilon$, then $K$ is relatively compact.

PROOF Let $\varepsilon > 0$. Choose $K_{\varepsilon/2} \subseteq X$ to be the relatively compact subset postulated by the hypothesis of the lemma. We can find $\{x_\varepsilon^k\}_{k=1}^n \subseteq K_{\varepsilon/2}$, such that
\[
K_{\varepsilon/2} \subseteq \bigcup_{k=1}^n B_{\varepsilon/2}(x_\varepsilon^k).
\]
By hypothesis for every $x \in K$, there exists $x_{\varepsilon/2} \in K_{\varepsilon/2}$, such that
\[
\|x - x_{\varepsilon/2}\|_X < \frac{\varepsilon}{2} \quad […]
\]

[…] for every $\varepsilon > 0$ there exists $\delta(\varepsilon) > 0$, such that, if $t, s \in T$ and $|t - s| < \delta$, then
\[
\|u(t) - u(s)\|_X < \varepsilon \quad \forall\, u \in K
\]
(the equicontinuity is uniform in $t \in T$ since $T$ is compact).
PROOF “$\Longrightarrow$”: Property (a) follows from the fact that for every $t \in T$ the evaluation at $t$ map $e_t : C(T; X) \ni u \longmapsto u(t) \in X$ is continuous. To prove property (b) (the equicontinuity property), we proceed as follows. Let $\varepsilon > 0$. Because $K$ is relatively compact in $C(T; X)$, we can find $\{u_k\}_{k=1}^n \subseteq K$, such that
\[
K \subseteq \bigcup_{k=1}^n B_{\varepsilon/3}(u_k).
\]
If $t \in T$, there is a $\delta = \delta(\varepsilon) > 0$, such that if $s \in T$ and $|t - s| < \delta$, then
\[
\|u_k(t) - u_k(s)\|_X < \frac{\varepsilon}{3} \quad \forall\, k \in \{1, \ldots, n\}
\]
(recall that the functions $\{u_k\}_{k=1}^n$ are uniformly continuous on $T$ since $T$ is compact). Now let $s \in T$ with $|t - s| < \delta$ and $u \in K$. Choose $k_0 \in \{1, \ldots, n\}$, such that $\|u - u_{k_0}\|_\infty < \frac{\varepsilon}{3}$. We have
\begin{align*}
\|u(t) - u(s)\|_X
&\le \|u(t) - u_{k_0}(t)\|_X + \|u_{k_0}(t) - u_{k_0}(s)\|_X + \|u_{k_0}(s) - u(s)\|_X \\
&\le \|u - u_{k_0}\|_\infty + \frac{\varepsilon}{3} + \|u_{k_0} - u\|_\infty < \varepsilon,
\end{align*}
so $K$ is equicontinuous.

“$\Longleftarrow$”: First note that $K(0)$ and $K(b)$ are both relatively compact. Indeed, for a given $\varepsilon > 0$, we can find $\delta = \delta(\varepsilon) > 0$, such that if $0 < s < \delta$, then
\[
\|u(s) - u(0)\|_X < \varepsilon \quad \forall\, u \in K.
\]
Since by hypothesis $K(s) \subseteq X$ is relatively compact, from Lemma 2.3.1, it follows that $K(0) \subseteq X$ is relatively compact. Similarly for $K(b) \subseteq X$. For every integer $N$, let $u_N : T \longrightarrow X$ be the function equal to $u \in K$ at the points $t_k = \frac{kb}{N}$, $k = 0, \ldots, N$ and linear between these points. Then the set $K_N \overset{df}{=} \{u_N : u \in K\}$ is isomorphic to
\[
\prod_{k=0}^N K\Bigl(\frac{kb}{N}\Bigr) \subseteq X^{N+1},
\]
which is relatively compact (Tychonoff's theorem). Therefore $K_N \subseteq C(T; X)$ is relatively compact. Also if $N > \frac{b}{\delta}$, then by property (b), we have
\[
\|u - u_N\|_\infty < \varepsilon.
\]
So by Lemma 2.3.1, we conclude that $K \subseteq C(T; X)$ is relatively compact.

We can have a “weak” variant of the Arzela-Ascoli theorem. First a definition.
DEFINITION 2.3.3 A set $K \subseteq C(T; X_w)$ is weakly equicontinuous, if for every $\varepsilon > 0$ and $x^* \in X^*$, we can find $\delta = \delta(\varepsilon, x^*) > 0$, such that if $t, s \in T$ and $|t - s| < \delta$, then
\[
\bigl|\langle x^*, u(t) - u(s)\rangle_X\bigr| < \varepsilon \quad \forall\, u \in K.
\]
Also we say that a sequence of functions $u_n : T \longrightarrow X$, $n \ge 1$ converges weakly uniformly to $u : T \longrightarrow X$, if for every $\varepsilon > 0$ and $x^* \in X^*$, we can find $n_0 = n_0(\varepsilon, x^*) \ge 1$, such that
\[
\bigl|\langle x^*, u_n(t) - u(t)\rangle_X\bigr| < \varepsilon \quad \forall\, t \in T,\ n \ge n_0.
\]

THEOREM 2.3.4 If $X^*$ is separable, $\{u_n\}_{n\ge 1} \subseteq C(T; X_w)$, for every $t \in T$, the set $\overline{\{u_n(t)\}_{n\ge 1}}^{\,w}$ is weakly compact in $X$ and the sequence $\{u_n\}_{n\ge 1}$ is weakly equicontinuous, then we can find $u \in C(T; X_w)$ and a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$ such that
\[
u_{n_k} \longrightarrow u \quad \text{weakly uniformly in } T.
\]

PROOF Let $C^*$ be a countable dense subset of $X^*$. We introduce $D^* \overset{df}{=} \operatorname{span}_{\mathbb{Q}} C^*$, the set of linear combinations with rational coefficients of the elements of $C^*$. Evidently $D^*$ is countable and dense in $X^*$. Using the classical Arzela-Ascoli theorem on $C(T)$ together with the Cantor diagonal process, we can find a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$, such that
\[
\langle x^*, u_{n_k}(\cdot)\rangle_X \longrightarrow v(x^*)(\cdot) \quad \text{in } C(T) \text{ as } k \to +\infty.
\]
Note that $x^* \longmapsto v(x^*)$ is a map defined on $D^*$ with values in $C(T)$. Moreover, we have
\[
\bigl|\langle x^* - y^*, u_n(t)\rangle_X\bigr| \le c_1\,\|x^* - y^*\|_{X^*} \quad \forall\, t \in T,\ x^*, y^* \in D^*,
\]
for some $c_1 > 0$, so
\[
\bigl|v(x^*)(t) - v(y^*)(t)\bigr| \le c_1\,\|x^* - y^*\|_{X^*} \quad \forall\, t \in T,\ x^*, y^* \in D^*
\]
and thus
\[
\|v(x^*) - v(y^*)\|_{C(T)} \le c_1\,\|x^* - y^*\|_{X^*}.
\]
Therefore the map $v : D^* \longrightarrow C(T)$ is uniformly continuous. Thus it can be extended to a unique continuous map $\hat v : \overline{D^*}^{\,\|\cdot\|_{X^*}} = X^* \longrightarrow C(T)$. Clearly $\hat v$ is continuous. This together with the fact that $K(t) \subseteq X$ is weakly compact imply that we can find $u : T \longrightarrow X$, such that
\[
\langle x^*, u(t)\rangle_X = \hat v(x^*)(t) \quad \forall\, x^* \in X^*,\ t \in T.
\]
We conclude that $u \in C(T; X_w)$ and $u_{n_k} \longrightarrow u$ weakly uniformly on $T$.
DEFINITION 2.3.5 A subset $K \subseteq L^p(T; X)$ ($p \in [1, +\infty)$) is said to be p-equiintegrable, if it is uniformly integrable (see Definition A.2.3) and
\[
\lim_{h \searrow 0} \int_0^{b-h} \|u(t + h) - u(t)\|_X^p\,dt = 0 \quad \text{uniformly for all } u \in K.
\]

In the next theorem we present a characterization of relatively compact sets of the Lebesgue-Bochner spaces $L^p(T; X)$ ($p \in [1, +\infty]$) and also obtain an alternative criterion for compactness in $C(T; X)$ in which the compactness condition of the Arzela-Ascoli theorem (see Theorem 2.3.2) is replaced by a similar one for integrals. In what follows
\[
\tau_h(u)(t) \overset{df}{=} u(t + h) \quad \forall\, h > 0.
\]
So if $u$ is defined on $T$, this translated version $\tau_h(u)$ is defined on $[-h, b - h]$. Note that the definition of p-equiintegrability is equivalent to saying that
\[
\|\tau_h(u) - u\|_{L^p(T_h; X)} \longrightarrow 0 \quad \text{as } h \to 0, \text{ uniformly for all } u \in K,
\]
with $T_h \overset{df}{=} [0, b - h]$.

THEOREM 2.3.6 $K \subseteq L^p(T; X)$, $p \in [1, +\infty)$ (respectively $K \subseteq C(T; X)$) is relatively compact if and only if
(a) for all $t, s \in (0, b)$, $s < t$, we have that the set $\Bigl\{\int_s^t u(\tau)\,d\tau : u \in K\Bigr\}$ is relatively compact in $X$; and
(b) $K$ is p-equiintegrable (respectively $\lim_{h \searrow 0} \|\tau_h(u) - u\|_{L^\infty(T_h; X)} = 0$ uniformly in $u \in K$).

PROOF
Suppose that $K \subseteq L^p(T; X)$, $p \in [1, +\infty)$.

“$\Longrightarrow$”: Since the map $u \longmapsto \int_s^t u(\tau)\,d\tau$ is continuous from $L^p(T; X)$ into $X$, property (a) is satisfied. Let $\varepsilon > 0$. Due to the relative compactness of $K \subseteq L^p(T; X)$, we can find a finite family $\{u_k\}_{k=1}^n \subseteq L^p(T; X)$, such that for every $u \in K$, we can find $k \in \{1, \ldots, n\}$, such that $\|u - u_k\|_p < \frac{\varepsilon}{3}$. Because the embedding $C(T; X) \subseteq L^p(T; X)$ is dense, we can assume that $u_k \in C(T; X)$. Then we can find $h_k > 0$, such that for $h \in (0, h_k)$, we have
\[
\|\tau_h(u_k) - u_k\|_{L^p(T_h; X)} < \frac{\varepsilon}{3}.
\]
Let $\hat h = \min_{1 \le k \le n} h_k$. We have
\[
\tau_h(u) - u = \tau_h(u - u_k) - (u - u_k) + \bigl(\tau_h(u_k) - u_k\bigr) \quad \forall\, h \le \hat h,\ u \in K,
\]
and so
\[
\|\tau_h(u) - u\|_{L^p(T_h; X)} < \varepsilon
\]
so
\[
\lim_{h \searrow 0} \|\tau_h(u) - u\|_{L^p(T_h; X)} = 0 \quad \text{uniformly for } u \in K.
\]
“$\Longleftarrow$”: Let $u \in K$ and $r > 0$. We set
\[
M_r(u)(t) \overset{df}{=} \frac{1}{r}\int_t^{t+r} u(s)\,ds.
\]
We have $M_r(u) \in C(T_r; X)$ with $T_r = [0, b - r]$. For every $t, s \in [0, b - r]$, $s \le t$, we have
\[
\|M_r(u)(t) - M_r(u)(s)\|_X = \Bigl\|\frac{1}{r}\int_s^{s+r} \bigl(\tau_{t-s}(u) - u\bigr)(\tau)\,d\tau\Bigr\|_X \le \frac{1}{r}\,\|\tau_{t-s}(u) - u\|_{L^1(T_{t-s}; X)},
\]
so
\[
M_r K = \bigl\{M_r u : u \in K\bigr\} \subseteq C(T_r; X)
\]
is uniformly equicontinuous (see condition (b)). Also from condition (a), we see that for every $t \in (0, b - r)$, the set $M_r K(t) \subseteq X$ is relatively compact. So by Theorem 2.3.2, we have that $M_r K \subseteq C(T_r; X)$ is relatively compact. Note that
\[
M_r(u)(t) - u(t) = \frac{1}{r}\int_0^r \bigl(\tau_h(u) - u\bigr)(t)\,dh \quad \forall\, t \in T_r,
\]
so
\[
\|M_r(u) - u\|_{L^p(T_r; X)} \le \max_{h \in [0, r]} \|\tau_h(u) - u\|_{L^p(T_r; X)}.
\]
But because of condition (b), for all $\hat b < b$, $K$ is the uniform limit of $M_r K$ in $L^p(\widehat T; X)$ with $\widehat T = [0, \hat b]$ as $r \to 0$, $r \le b - \hat b$. But since $M_r K$ is relatively compact in $C(\widehat T; X)$ and the embedding $C(\widehat T; X) \subseteq L^p(\widehat T; X)$ is continuous, we see that $K$ is relatively compact in $L^p(\widehat T; X)$. Conditions (a) and (b) remain valid if one changes the time direction. Namely if $\overline u(t) = u(b - t)$, then the set $\overline K = \{\overline u : u \in K\}$ still satisfies
conditions (a) and (b). Then from the previous argument we have that $\overline K$ is relatively compact in $L^p(\widehat T; X)$. It follows that $K$ is relatively compact in $L^p\bigl([\hat b, b]; X\bigr)$. Setting for example $\hat b = \frac{b}{2}$, we obtain the relative compactness of $K$ in $L^p(T; X)$. The proof of the case when $K \subseteq C(T; X)$ is similar.

REMARK 2.3.7 The restriction $p < +\infty$ is necessary, because if $K = \{u\}$ with $u$ bounded but discontinuous from $T$ into $X$, then $K$ is compact in $L^\infty(T; X)$ but condition (b) is not satisfied.

COROLLARY 2.3.8 If $u \in L^p(T; X)$, $p \in [1, +\infty)$, then
\[
\|\tau_h(u) - u\|_{L^p(T_h; X)} \longrightarrow 0 \quad \text{as } h \searrow 0.
\]
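Corollary 2.3.8 and Remark 2.3.7 can be seen side by side on a simple scalar example. The choice $X = \mathbb{R}$ and $u = \chi_{[0,c)}$ with $0 < c < b$ is an assumption made only for this illustration.

```latex
% Illustration only: X = \mathbb{R}, u = \chi_{[0,c)}, 0 < c < b, 0 < h \le \min\{c,\, b-c\}.
\begin{align*}
\tau_h(u)(t) - u(t) &= \chi_{[0,c)}(t+h) - \chi_{[0,c)}(t)
 = \begin{cases} -1 & \text{if } t \in [c-h,\,c),\\ \phantom{-}0 & \text{otherwise in } T_h = [0,b-h], \end{cases}\\[2pt]
\|\tau_h(u)-u\|_{L^p(T_h)} &= h^{1/p} \longrightarrow 0 \quad \text{as } h \searrow 0 \quad (1 \le p < +\infty),\\
\|\tau_h(u)-u\|_{L^\infty(T_h)} &= 1 \quad \text{for all such } h .
\end{align*}
```

So translations converge in every $L^p$ with $p < +\infty$, while the $L^\infty$-distance stays equal to $1$, in line with the remark on the case $p = +\infty$.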
When $X = \mathbb{R}$, we have the so-called “Riesz-Kolmogorov theorem” for $p \in [1, +\infty)$.

COROLLARY 2.3.9 (Riesz-Kolmogorov Theorem) $K \subseteq L^p(T)$, $p \in [1, +\infty)$ (respectively $K \subseteq C(T)$) is relatively compact if and only if
(a) there exist $t, s \in (0, b)$, $s < t$, such that the set
\[
\Bigl\{\int_s^t u(\tau)\,d\tau : u \in K\Bigr\} \subseteq \mathbb{R}
\]
is bounded; and
(b) $\displaystyle\int_0^{b-h} |u(t + h) - u(t)|^p\,dt \longrightarrow 0$ as $h \searrow 0$ uniformly for $u \in K$.

PROOF From (a) and (b) it follows that for all $t, s \in (0, b)$, $s < t$, we have that the set $\Bigl\{\int_s^t u(\tau)\,d\tau : u \in K\Bigr\} \subseteq \mathbb{R}$ is bounded. So we can apply Theorem 2.3.6.

REMARK 2.3.10 Theorem 2.3.6 and Corollary 2.3.9 provide characterizations of compact sets in $C(T; X)$ and $C(T)$ respectively. Compared with the classical Arzela-Ascoli theorem (see Theorem 2.3.2), we see that condition (a) (the space criterion) is now a condition on integrals, while condition (b) (the time criterion) remains the same.
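Condition (b) cannot be dropped from Corollary 2.3.9. The following scalar example uses functions chosen here for illustration (they are not taken from the text): the integrals in (a) stay bounded, in fact they tend to zero, while the translation condition fails.

```latex
% Illustration only: u_n(t) = \sin(n\pi t/b), n \ge 2, viewed in L^2(T), T = [0,b].
\begin{align*}
\Bigl|\int_s^t u_n(\tau)\,d\tau\Bigr|
 &= \frac{b}{n\pi}\,\bigl|\cos(n\pi s/b)-\cos(n\pi t/b)\bigr| \le \frac{2b}{n\pi},\\
u_n(t+h) - u_n(t) &= \sin(n\pi t/b + \pi) - \sin(n\pi t/b) = -2\sin(n\pi t/b)
 \quad\text{for } h = b/n,\\
\int_0^{b-h} |u_n(t+h)-u_n(t)|^2\,dt
 &= 4\int_0^{(n-1)b/n}\sin^2(n\pi t/b)\,dt
  = 4\cdot\frac{(n-1)\,b}{2n} \ge b .
\end{align*}
% Orthogonality: \int_0^b u_n u_m\,dt = 0 for n \ne m and \|u_n\|_2^2 = b/2,
% so \|u_n - u_m\|_2^2 = b and \{u_n\} has no Cauchy subsequence in L^2(T).
```

Since $h = b/n \to 0$ while the integral stays at least $b$, the convergence in (b) is not uniform over $K = \{u_n\}_{n\ge 2}$, which is consistent with $K$ not being relatively compact.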
Next we shall characterize sets which are bounded in $L^p(T; X)$ and compact in $L^r(T; X)$ with $r < p$. Such results are known as “partial compactness” results, since the compactness is not achieved for the larger order $p$ for which the set is actually bounded. First we obtain two auxiliary results relating compactness with time-local compactness.

LEMMA 2.3.11 The set $K \subseteq L^p(T; X)$ (with $p \in [1, +\infty)$) is relatively compact if and only if
(a) $K \subseteq L^p_{loc}(T; X)$ is relatively compact (i.e., for all $s, t \in (0, b)$, $s < t$, the set $K|_{[s,t]} \subseteq L^p\bigl([s, t]; X\bigr)$ is relatively compact); and
(b) $\displaystyle\int_0^h \|u(t)\|_X^p\,dt + \int_{b-h}^b \|u(t)\|_X^p\,dt \longrightarrow 0$ as $h \to 0$ uniformly for $u \in K$.

PROOF “$\Longrightarrow$”: Condition (a) is automatically true. Let $\overline u$ be the extension by $0$ outside $T$ of $u$. Then the set
\[
\overline K = \bigl\{\overline u : u \in K\bigr\}
\]
is relatively compact in $L^p\bigl([-b, 2b]; X\bigr)$. As
\[
\|\tau_h(\overline u) - \overline u\|_{L^p([-b,2b]; X)}^p = \int_0^h \|u(t)\|_X^p\,dt + \int_0^{b-h} \|u(t + h) - u(t)\|_X^p\,dt + \int_{b-h}^b \|u(t)\|_X^p\,dt,
\]
applying Theorem 2.3.6, we obtain
\[
\int_0^h \|u(t)\|_X^p\,dt + \int_{b-h}^b \|u(t)\|_X^p\,dt \longrightarrow 0 \quad \text{as } h \searrow 0 \quad \text{uniformly for } u \in K.
\]

“$\Longleftarrow$”: Let
\[
u_h \overset{df}{=} \chi_{[h, b-h]}\,u \quad \text{and} \quad K_h \overset{df}{=} \bigl\{u_h : u \in K\bigr\}.
\]
Condition (b) implies that for a given $\varepsilon > 0$, we can find $h > 0$ small enough, such that
\[
\|u_h - u\|_{L^p(T; X)} < \varepsilon \quad \forall\, u \in K.
\]
Since $K_h \subseteq L^p(T; X)$ is relatively compact (see condition (a)), from Lemma 2.3.1, we infer that $K \subseteq L^p(T; X)$ is relatively compact.
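To see that condition (b) of Lemma 2.3.11 carries real information, consider a family that concentrates at the endpoint $t = 0$. The functions below are chosen here for illustration, with $X = \mathbb{R}$; they are not taken from the text.

```latex
% Illustration only: X = \mathbb{R}, u_n = n^{1/p}\chi_{[0,1/n]} for n \ge 1/b, K = \{u_n\}.
\begin{align*}
\|u_n\|_{L^p(T)}^p &= n\cdot\frac1n = 1 ,\\
u_n|_{[s,t]} &= 0 \quad\text{whenever } 0 < s < t < b \text{ and } n > 1/s ,\\
\int_0^h |u_n(t)|^p\,dt &= 1 \quad\text{whenever } n \ge 1/h .
\end{align*}
% (a) holds: on each [s,t] \subset (0,b) only finitely many restrictions are nonzero.
% (b) fails: the first integral equals 1 for all large n, however small h is.
```

Accordingly $K$ is not relatively compact in $L^p(T)$: $u_n \to 0$ almost everywhere, so $0$ is the only possible limit of a subsequence, yet $\|u_n\|_{L^p(T)} = 1$ for all $n$.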
LEMMA 2.3.12 If $K \subseteq L^p(T; X)$ (with $p \in (1, +\infty]$) is bounded and $K \subseteq L^1_{loc}(T; X)$ is relatively compact (i.e., for all $t, s \in (0, b)$, $s < t$, $K|_{[s,t]}$ is relatively compact in $L^1\bigl([s,t]; X\bigr)$), then $K \subseteq L^r(T; X)$ is relatively compact for all $r \in (1, p)$.

PROOF For every $h \le b$ and every $u \in K$, from Hölder's inequality (see Theorem A.2.27), we have
\[
\int_0^h \|u(t)\|_X\,dt + \int_{b-h}^b \|u(t)\|_X\,dt \le 2h^{\frac{1}{q'}}\,\|u\|_q,
\]
so $K \subseteq L^1(T; X)$ is relatively compact (see Lemma 2.3.11). So for a given $\varepsilon > 0$, we can find $\{u_k\}_{k=1}^n \subseteq K$, such that for every $u \in K$, there exists $k \in \{1, \ldots, n\}$, such that
\[
\|u - u_k\|_1 < \quad […]
\]

[…] $\varepsilon > 0$, we can find $\{u_k\}_{k=1}^n \subseteq K$, such that for each $u \in K$ there exists $k \in \{1, \ldots, n\}$, such that
\[
\|u - u_k\|_{L^p(T; Z)} < \varepsilon.
\]
Invoking Lemma 2.2.29, for every $\xi > 0$, we can find $c = c(\xi) > 0$, such that
\[
\|u - u_k\|_{L^p(T; Y)} \le \xi\,\|u - u_k\|_{L^p(T; X)} + c\,\|u - u_k\|_{L^p(T; Z)} \le \xi\,\delta_X + c\,\varepsilon, \tag{2.28}
\]
where $\delta_X \overset{df}{=} \operatorname{diam}_{L^p(T; X)} K$. For a given $\varepsilon' > 0$, select $\xi \overset{df}{=} \frac{\varepsilon'}{2\delta_X}$ and $\varepsilon \overset{df}{=} \frac{\varepsilon'}{2c}$. Then from (2.28), we have
\[
\|u - u_k\|_{L^p(T; Y)} \le \varepsilon',
\]
so $K \subseteq L^p(T; Y)$ is relatively compact.

Based on the above lemma, we can have the following compactness result for an intermediate space.

THEOREM 2.3.19 If $Y, Z$ are Banach spaces, the embeddings $X \subseteq Y \subseteq Z$ are continuous, with the first embedding compact, $p \in [1, +\infty]$ and
(i) $K \subseteq L^p(T; X)$ is bounded,
(ii) $\|\tau_h(u) - u\|_{L^p(T_h; Z)} \longrightarrow 0$ as $h \searrow 0$ uniformly for $u \in K$,
then $K$ is relatively compact in $L^p(T; Y)$ if $p \in [1, +\infty)$ and in $C(T; Y)$ if $p = +\infty$.

PROOF Because of the compactness of the embedding $X \subseteq Y$ and of Theorem 2.3.6, the set $K \subseteq L^p(T; Z)$ is relatively compact. An application of Lemma 2.3.18 finishes the proof.

This result permits an extension of Theorem 2.2.30.
THEOREM 2.3.20 If $Y, Z$ are Banach spaces, the embeddings $X \subseteq Y \subseteq Z$ are continuous with the first embedding compact, then
(a) if $K \subseteq L^p(T; X)$ (with $p \in [1, +\infty)$) is bounded and the set $K' \overset{df}{=} \bigl\{u' = Du : u \in K\bigr\} \subseteq L^1(T; Z)$ is bounded, we have that $K \subseteq L^p(T; Y)$ is relatively compact;
(b) if $K \subseteq L^\infty(T; X)$ is bounded and $K' \overset{df}{=} \bigl\{u' = Du : u \in K\bigr\} \subseteq L^r(T; Z)$ (with $r > 1$) is bounded, we have that $K \subseteq C(T; Y)$ is relatively compact.

Now let us look at weakly compact subsets of $L^1(\Omega; X)$. To describe a large class of such sets in $L^1(\Omega; X)$, we shall need two results which for easy reference we state here without proofs. The first is the celebrated James theorem.

THEOREM 2.3.21 (James Theorem) A nonempty, weakly closed and bounded subset of a Banach space $X$ is weakly compact if and only if every $x^* \in X^*$ attains its maximum on the set.

The second result is a remarkable consequence of the property of decomposability. If $(\Omega, \Sigma, \mu)$ is a finite measure space and $X$ is a Banach space, a set $K \subseteq L^1(\Omega; X)$ is said to be decomposable if and only if $\chi_A u_1 + \chi_{A^c} u_2 \in K$ for all $(u_1, u_2, A) \in K \times K \times \Sigma$.

PROPOSITION 2.3.22 If $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is a separable Banach space, $\varphi : \Omega \times X \longrightarrow \overline{\mathbb{R}} \overset{df}{=} \mathbb{R} \cup \{+\infty\}$ is jointly measurable, $F : \Omega \longrightarrow 2^X \setminus \{\emptyset\}$ is graph measurable (i.e., $\operatorname{Gr} F \overset{df}{=} \bigl\{(\omega, x) \in \Omega \times X : x \in F(\omega)\bigr\} \in \Sigma \times \mathcal{B}(X)$ with $\mathcal{B}(X)$ being the Borel $\sigma$-field of $X$),
\[
I_\varphi(u) = \int_\Omega \varphi\bigl(\omega, u(\omega)\bigr)\,d\mu
\]
is defined (maybe $+\infty$ or $-\infty$) for all $u \in S_F^1$ with
\[
S_F^1 \overset{df}{=} \bigl\{u \in L^1(\Omega; X) : u(\omega) \in F(\omega) \text{ for } \mu\text{-a.a. } \omega \in \Omega\bigr\}
\]
and there exists $u_0 \in S_F^1$, such that $I_\varphi(u_0) > -\infty$, then
\[
\sup_{u \in S_F^1} I_\varphi(u) = \int_\Omega \sup_{x \in F(\omega)} \varphi(\omega, x)\,d\mu.
\]
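Using these results, we can identify a large class of weakly compact subsets of $L^1(\Omega; X)$.

THEOREM 2.3.23 If $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is a separable Banach space, $F : \Omega \longrightarrow 2^X \setminus \{\emptyset\}$ is graph measurable, for $\mu$-almost all $\omega \in \Omega$, $F(\omega)$ is weakly compact, convex and there exists $h \in L^1(\Omega)_+$, such that
\[
\sup_{x \in F(\omega)} \|x\|_X \le h(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega,
\]
then
\[
S_F^1 = \bigl\{u \in L^1(\Omega; X) : u(\omega) \in F(\omega) \text{ for } \mu\text{-a.a. } \omega \in \Omega\bigr\}
\]
is weakly compact and convex.

PROOF Convexity of $S_F^1$ is obvious. Moreover, because of the boundedness by $h \in L^1(\Omega)_+$, we have $S_F^1 \ne \emptyset$ (see Denkowski, Migórski & Papageorgiou (2003a, p. 432)). So we show that $S_F^1$ is weakly compact in $L^1(\Omega; X)$. According to Theorem 2.3.21, it suffices to show that every $u^* \in \bigl(L^1(\Omega; X)\bigr)^*$ attains its supremum on $S_F^1$. From Theorem 2.2.12, we know that
\[
\bigl(L^1(\Omega; X)\bigr)^* = L^\infty\bigl(\Omega; X^*_{w^*}\bigr)
\]
and the duality pairing is given by
\[
\langle u^*, u\rangle_{L^1(\Omega; X)} = \int_\Omega \langle u^*(\omega), u(\omega)\rangle_X\,d\mu.
\]
From Proposition 2.3.22, we have
\[
\sup_{u \in S_F^1} \langle u^*, u\rangle_{L^1(\Omega; X)} = \sup_{u \in S_F^1} \int_\Omega \langle u^*(\omega), u(\omega)\rangle_X\,d\mu = \int_\Omega \sup_{x \in F(\omega)} \langle u^*(\omega), x\rangle_X\,d\mu.
\]
Let
\[
M(\omega) \overset{df}{=} \Bigl\{y \in F(\omega) : \langle u^*(\omega), y\rangle_X = \sup_{x \in F(\omega)} \langle u^*(\omega), x\rangle_X\Bigr\}.
\]
Since $F(\omega)$ is $\mu$-almost everywhere $w$-compact, we see that
\[
M(\omega) \ne \emptyset \quad \text{for } \mu\text{-a.a. } \omega \in \Omega.
\]
By setting $F(\omega) = \{0\}$ on the exceptional $\mu$-null set, we can say that
\[
M(\omega) \ne \emptyset \quad \forall\, \omega \in \Omega.
\]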
Also from Denkowski, Migórski & Papageorgiou (2003a, p. 433), we know that we can find a sequence of $\Sigma$-measurable functions $f_n : \Omega \longrightarrow X$, such that
\[
f_n(\omega) \in F(\omega) \quad \forall\, \omega \in \Omega,\ n \ge 1
\]
and
\[
F(\omega) = \overline{\{f_n(\omega)\}_{n\ge 1}}^{\,\|\cdot\|_X} \quad \forall\, \omega \in \Omega.
\]
Hence
\[
\sup_{x \in F(\omega)} \langle u^*(\omega), x\rangle_X = \sup_{n \ge 1} \langle u^*(\omega), f_n(\omega)\rangle_X
\]
and so
\[
\omega \longmapsto \sup_{x \in F(\omega)} \langle u^*(\omega), x\rangle_X = m^*(\omega) \quad \text{is } \Sigma\text{-measurable}.
\]
Then
\[
\operatorname{Gr} M = \bigl\{(\omega, x) \in \Omega \times X : x \in M(\omega)\bigr\} = \bigl\{(\omega, x) \in \Omega \times X : \langle u^*(\omega), x\rangle_X = m^*(\omega)\bigr\}.
\]
Since $u^*$ is $w^*$-measurable, it follows that $\operatorname{Gr} M \in \Sigma \times \mathcal{B}(X)$. So we can apply the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) and obtain a strongly measurable function $u_0 : \Omega \longrightarrow X$, such that $u_0(\omega) \in M(\omega)$ for $\mu$-a.a. $\omega \in \Omega$. Evidently
\[
u_0 \in S_F^1 \quad \text{and} \quad \langle u^*, u_0\rangle_{L^1(\Omega; X)} = \sup_{u \in S_F^1} \langle u^*, u\rangle_{L^1(\Omega; X)}.
\]
Since $u^* \in L^\infty\bigl(\Omega; X^*_{w^*}\bigr) = \bigl(L^1(\Omega; X)\bigr)^*$ was arbitrary and clearly $S_F^1$ is weakly closed and bounded, from Theorem 2.3.21, we conclude that $S_F^1 \subseteq L^1(\Omega; X)$ is weakly compact.

A classical theorem of Dunford-Pettis isolates the relatively weakly compact subsets of $L^1(\Omega)$ as the bounded, uniformly integrable subsets (see Definition A.2.3). If $X$ is reflexive, the original proof for $L^1(\Omega)$ extends with only notational changes to the present vector valued setting. So we have

THEOREM 2.3.24 (Dunford-Pettis Theorem) If $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is reflexive and $K \subseteq L^1(\Omega; X)$ is bounded, then $K$ is relatively weakly compact in $L^1(\Omega; X)$ if and only if it is uniformly integrable.
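A standard example shows how boundedness alone falls short of the hypothesis of the Dunford-Pettis theorem. The sketch assumes the usual formulation of uniform integrability, namely $\lim_{c\to+\infty}\sup_{u\in K}\int_{\{\|u\|_X\ge c\}}\|u\|_X\,d\mu = 0$ (Definition A.2.3 is not reproduced in this section), and the specific choices of $\Omega$, $x_0$ and $u_n$ are made here only for illustration.

```latex
% Illustration only: \Omega = [0,1] with Lebesgue measure \mu, x_0 \in X with \|x_0\|_X = 1,
% u_n = n\,\chi_{[0,1/n]}\,x_0 for n \ge 1.
\begin{align*}
\|u_n\|_{L^1(\Omega;X)} &= n\cdot\frac1n = 1 \qquad \forall\, n\ge 1,\\
\int_{\{\|u_n\|_X \ge c\}} \|u_n(\omega)\|_X\,d\mu &=
 \begin{cases} 1 & \text{if } n \ge c,\\ 0 & \text{if } n < c, \end{cases}
 \qquad\text{so}\qquad
 \sup_{n\ge 1}\int_{\{\|u_n\|_X \ge c\}} \|u_n(\omega)\|_X\,d\mu = 1 \quad \forall\, c>0 .
\end{align*}
```

The sequence is bounded in $L^1(\Omega; X)$ but not uniformly integrable, so for reflexive $X$ Theorem 2.3.24 shows that it is not relatively weakly compact.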
The relative weak compactness in $L^1(\Omega; X)$ is closely related with the so-called “biting convergence,” which is useful in the calculus of variations and in optimal control.

DEFINITION 2.3.25 A sequence $\{u_n\}_{n\ge 1} \subseteq L^1(\Omega; X)$ is said to converge to $u \in L^1(\Omega; X)$ in the biting sense, denoted by $u_n \xrightarrow{b} u$, if there exists an increasing sequence $\{C_m\}_{m\ge 1} \subseteq \Sigma$, such that
\[
\mu(C_m) \nearrow \mu(\Omega) \quad \text{as } m \to +\infty
\]
and
\[
u_n \xrightarrow{w} u \quad \text{in } L^1(C_m; X) \quad \forall\, m \ge 1.
\]

The so-called “Chacon Biting Lemma” says that if $X$ is a reflexive Banach space, then every bounded sequence in $L^1(\Omega; X)$ has a subsequence converging in $L^1(\Omega; X)$ in the biting sense. The next result is a slightly stronger version of the original biting lemma.

THEOREM 2.3.26 If $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is a Banach space and $\{u_n\}_{n\ge 1} \subseteq L^1(\Omega; X)$ is bounded, then there exists a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$ and an increasing sequence $\{C_m\}_{m\ge 1} \subseteq \Sigma$, such that $\mu(C_m) \nearrow \mu(\Omega)$ as $m \to +\infty$ and $\bigl\{\chi_{C_k} u_{n_k}\bigr\}_{k\ge 1}$ is uniformly integrable.

PROOF Let $m \ge 1$ and define
\[
h_m(t) \overset{df}{=} \sup_{n \ge m} \int_{\{\|u_n\|_X \ge t\}} \|u_n(\omega)\|_X\,d\mu \quad \forall\, t \ge 0.
\]
Note that $h_m : \mathbb{R}_+ \longrightarrow \mathbb{R}_+$ and it is decreasing. So $\lim_{t\to+\infty} h_m(t)$ exists for all $m \ge 1$. Let
\[
\xi \overset{df}{=} \lim_{t\to+\infty} h_1(t).
\]
Since for every $m \ge 1$, $\{u_1, \ldots, u_{m-1}\}$ is uniformly integrable, it follows that
\[
\lim_{t\to+\infty} h_m(t) = \xi.
\]
Let $\{t_i\}_{i\ge 1} \subseteq \mathbb{R}_+$ be a sequence increasing to $+\infty$, such that
\[
\xi \le h_1(t_i) \le \xi + \frac{1}{i} \quad \forall\, i \ge 1.
\]
Because
\[
\xi \le h_m(t) \quad \forall\, m \ge 1,\ t \in \mathbb{R}_+,
\]
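we can find a strictly increasing sequence $\{l_i\}_{i\ge 1}$, such that
\[
\int_{\{\|u_{l_i}\|_X \ge t_i\}} \|u_{l_i}(\omega)\|_X\,d\mu \ge \xi - \frac{1}{i}.
\]
Set $D_i \overset{df}{=} \{\|u_{l_i}\|_X \ge t_i\}$. Then
\[
t_i\,\mu(D_i) \le \sup_{n \ge 1} \|u_n\|_1
\]
and so $\mu(D_i) \longrightarrow 0$ as $i \to +\infty$. Let $C_i \overset{df}{=} \Omega \setminus D_i$. We claim that the sequence $\bigl\{\chi_{C_i} u_{l_i}\bigr\}_{i\ge 1}$ is uniformly integrable. To this end let
\[
h(t) \overset{df}{=} \sup_{i \ge 1} \int_{C_i \cap \{\|u_{l_i}\|_X \ge t\}} \|u_{l_i}(\omega)\|_X\,d\mu.
\]
We need to show that $h(t) \longrightarrow 0$ as $t \to +\infty$. We have
\begin{align*}
h(t_r) &= \sup_{i \ge r} \int_{\{t_r \le \|u_{l_i}\|_X < t_i\}} \|u_{l_i}(\omega)\|_X\,d\mu \\
&= \sup_{i \ge r} \Bigl[\int_{\{\|u_{l_i}\|_X \ge t_r\}} \|u_{l_i}(\omega)\|_X\,d\mu - \int_{\{\|u_{l_i}\|_X \ge t_i\}} \|u_{l_i}(\omega)\|_X\,d\mu\Bigr] \\
&\le \sup_{i \ge r} \Bigl[\Bigl(\xi + \frac{1}{r}\Bigr) - \Bigl(\xi - \frac{1}{i}\Bigr)\Bigr] \le \frac{2}{r},
\end{align*}
so $h(t) \longrightarrow 0$ as $t \to +\infty$.

Finally replace $\{D_i\}_{i\ge 1}$ by a sequence decreasing to $\emptyset$ (or a $\mu$-null set). Then there exists a strictly increasing sequence $\{i_k\}_{k\ge 1}$, such that
\[
\mu\bigl(D_{i_k}\bigr) \le \frac{1}{k^2} \quad \forall\, k \ge 1.
\]
Set $A_k \overset{df}{=} \bigcup_{j=k}^\infty D_{i_j}$ for $k \ge 1$. Then $\{A_k\}_{k\ge 1}$ decreases to a $\mu$-null set. The subsequence of the statement of the theorem is defined by setting $u_{n_k} = u_{l_i}$ with $i = i_k$. Also $C_k = \Omega \setminus A_k$ for $k \ge 1$. The uniform integrability of the sequence $\bigl\{\chi_{C_k} u_{n_k}\bigr\}_{k\ge 1}$ follows from the inclusion $\Omega \setminus A_k \subseteq \Omega \setminus D_{i_k}$.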
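The last step of the proof rests on a tail estimate for the sets $A_k$, which can be spelled out as follows. The sketch assumes that the bound on $\mu(D_{i_k})$ in the proof reads $1/k^2$; any summable bound would serve equally well.

```latex
% Assumes \mu(D_{i_k}) \le 1/k^2 for all k \ge 1; only summability of the bound is used.
\begin{align*}
\mu(A_k) &= \mu\Bigl(\bigcup_{j=k}^{\infty} D_{i_j}\Bigr)
 \le \sum_{j=k}^{\infty}\mu\bigl(D_{i_j}\bigr)
 \le \sum_{j=k}^{\infty}\frac{1}{j^2} \longrightarrow 0 \quad\text{as } k\to+\infty,\\
\mu(C_k) &= \mu(\Omega) - \mu(A_k) \nearrow \mu(\Omega) \quad\text{as } k\to+\infty .
\end{align*}
% A_{k+1} \subseteq A_k, hence C_k = \Omega\setminus A_k is increasing, as the theorem requires.
```

This is why the sets $D_{i_k}$ are first thinned out to a summable subsequence: the complements $\Omega \setminus D_i$ themselves need not be increasing.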
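EXAMPLE 2.3.27 In the previous theorem, it is necessary to pass to a subsequence. To see this let
\[
\Omega = [0, 1] \quad \text{and} \quad \mu = \lambda^1
\]
(the Lebesgue measure on $\mathbb{R}$). If $n = 2^k + i$ with $k \ge 1$, $i \in \{0, \ldots, 2^k - 1\}$, we set
\[
u_n(\omega) \overset{df}{=} \begin{cases} 2^k & \text{if } \omega \in \Bigl[\dfrac{i}{2^k}, \dfrac{i+1}{2^k}\Bigr), \\[4pt] 0 & \text{otherwise.} \end{cases}
\]
Then there is no increasing sequence $\{C_m\}_{m\ge 1}$ with $\Omega = \bigcup_{m=1}^\infty C_m$, such that $\bigl\{\chi_{C_m} u_n\bigr\}_{n\ge 1}$ is uniformly integrable. Indeed, if
\[
\lambda^1\bigl(\Omega \setminus C_m\bigr) \le \frac{1}{2},
\]
then for all $k \ge 1$, there exists $i \in \{0, \ldots, 2^k - 1\}$, such that
\[
\lambda^1\Bigl(\Bigl[\frac{i}{2^k}, \frac{i+1}{2^k}\Bigr) \cap C_m\Bigr) \ge \frac{1}{2^{k+1}}.
\]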
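The estimate at the end of Example 2.3.27 and its consequence can be made explicit. The computation below uses only the definitions of the example; the pigeonhole step in the second line is the justification for the existence of the index $i$.

```latex
% n = 2^k + i,  u_n = 2^k \chi_{[i/2^k,(i+1)/2^k)},  \lambda^1(\Omega\setminus C_m) \le 1/2.
\begin{align*}
\|u_n\|_{L^1(\Omega)} &= 2^k\cdot 2^{-k} = 1 \qquad\text{(the sequence is bounded in } L^1(\Omega)),\\
\sum_{i=0}^{2^k-1}\lambda^1\Bigl(\Bigl[\tfrac{i}{2^k},\tfrac{i+1}{2^k}\Bigr)\cap C_m\Bigr)
 &= \lambda^1\bigl(C_m\cap[0,1)\bigr) \ge \tfrac12
 \;\Longrightarrow\;
 \max_{0\le i\le 2^k-1}\lambda^1\Bigl(\Bigl[\tfrac{i}{2^k},\tfrac{i+1}{2^k}\Bigr)\cap C_m\Bigr) \ge \frac{1}{2^{k+1}},\\
\int_{C_m\cap\{u_n \ge 2^k\}} u_n\,d\lambda^1
 &\ge 2^k\cdot\frac{1}{2^{k+1}} = \frac12 \qquad\text{for } n = 2^k + i \text{ with this } i .
\end{align*}
```

Since $k \ge 1$ is arbitrary, the mass above level $2^k$ does not tend to zero uniformly in $n$, so $\{\chi_{C_m} u_n\}_{n\ge 1}$ is not uniformly integrable.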
COROLLARY 2.3.28 If $X$ is reflexive and $\{u_n\}_{n\ge 1} \subseteq L^1(\Omega; X)$ is bounded, then we can find a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$ and $u \in L^1(\Omega; X)$, such that $u_{n_k} \xrightarrow{b} u$ as $k \to +\infty$.

REMARK 2.3.29 As we shall see in Section 2.5, some of the ideas involved in the “Biting Lemma” are common in the “concentration compactness theorem” (see Theorem 2.5.30).

To have an analogous result for the spaces $L^p(T; X)$, with $p \in (1, +\infty)$, we need the following result which is useful in many situations since it provides information about the pointwise behaviour of a weakly convergent sequence in $L^p(T; X)$ for $p \in [1, +\infty)$. First a definition-notation.

DEFINITION 2.3.30 If $X$ is a Banach space and $\{A_n\}_{n\ge 1} \subseteq 2^X \setminus \{\emptyset\}$, we set
\[
w\text{-}\limsup_{n\to+\infty} A_n = \Bigl\{x \in X : x = w\text{-}\lim_{k\to+\infty} x_{n_k},\ x_{n_k} \in A_{n_k},\ n_1 < n_2 < \ldots\Bigr\}.
\]
Here $w$ stands for the weak topology on $X$.
PROPOSITION 2.3.31 If $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is a Banach space, $\{u_n\}_{n\ge 1} \subseteq L^p(\Omega; X)$ and $u \in L^p(\Omega; X)$ with $p \in [1, +\infty)$,
\[
u_n \xrightarrow{w} u \quad \text{in } L^p(\Omega; X)
\]
and for $\mu$-almost all $\omega \in \Omega$, the sequence $\{u_n(\omega)\}_{n\ge 1}$ is relatively weakly compact, then
\[
u(\omega) \in \overline{\operatorname{conv}}\ w\text{-}\limsup_{n\to+\infty} \bigl\{u_n(\omega)\bigr\} \quad \text{for } \mu\text{-a.a. } \omega \in \Omega.
\]

Using this Proposition we can have the following result for bounded sequences in $L^p(T; X)$ (with $p \in (1, +\infty)$).

THEOREM 2.3.32 If $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is a reflexive Banach space, $\{u_n\}_{n\ge 1} \subseteq L^p(\Omega; X)$ (with $p \in (1, +\infty)$) is bounded and
\[
u_n(\omega) \xrightarrow{w} u(\omega) \quad \text{in } X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega, \tag{2.29}
\]
then $u \in L^p(\Omega; X)$ and
\[
u_n \xrightarrow{w} u \quad \text{in } L^p(\Omega; X).
\]

PROOF From Proposition 2.2.3(c), we know that $L^p(\Omega; X)$ is reflexive. So by the Eberlein-Smulian theorem (see Theorem A.3.8), we can find a subsequence $\{u_{n_k}\}_{k\ge 1}$ of $\{u_n\}_{n\ge 1}$, such that
\[
u_{n_k} \xrightarrow{w} \hat u \quad \text{in } L^p(\Omega; X).
\]
Using Proposition 2.3.31 and (2.29), we infer that $u = \hat u \in L^p(\Omega; X)$. So every subsequence of $\{u_n\}_{n\ge 1}$ has a further subsequence weakly convergent in $L^p(\Omega; X)$ to $u$ and from this it follows that
\[
u_n \xrightarrow{w} u \quad \text{in } L^p(\Omega; X).
\]

Another notion related to the weak convergence in $L^1(T; X)$ ($T = [0, b]$) is given in the next definition.

DEFINITION 2.3.33 Let $T = [0, b]$ ($b < +\infty$) and $X$ a Banach space. The weak norm on $L^1(T; X)$ is defined by
\[
\|u\|_w \overset{df}{=} \max_{0 \le s \le t \le b} \Bigl\|\int_s^t u(\tau)\,d\tau\Bigr\|_X \quad \forall\, u \in L^1(T; X).
\]
REMARK 2.3.34 Equivalently we can define
\[
\|u\|_w = \max_{t \in T} \Bigl\|\int_0^t u(\tau)\,d\tau\Bigr\|_X \quad \forall\, u \in L^1(T; X).
\]
Evidently $\|\cdot\|_w$ is a norm on $L^1(T; X)$ weaker than the usual norm
\[
\|u\|_1 = \int_0^b \|u(t)\|_X\,dt \quad \forall\, u \in L^1(T; X).
\]
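Two points of Remark 2.3.34 can be verified by a short computation: the two expressions for the weak norm are equivalent as norms (they need not be equal), and the weak norm is strictly weaker than $\|\cdot\|_1$. The sequence $u_n$ in the second part is an example chosen here, with $x_0 \in X$ an arbitrary nonzero vector.

```latex
% Part 1: equivalence of the two definitions, using \int_s^t = \int_0^t - \int_0^s.
\begin{align*}
\max_{t\in T}\Bigl\|\int_0^t u(\tau)\,d\tau\Bigr\|_X
 \;\le\; \max_{0\le s\le t\le b}\Bigl\|\int_s^t u(\tau)\,d\tau\Bigr\|_X
 \;\le\; 2\max_{t\in T}\Bigl\|\int_0^t u(\tau)\,d\tau\Bigr\|_X .
\end{align*}
% Part 2 (illustration only): u_n(t) = \sin(nt)\,x_0 with x_0 \in X, x_0 \ne 0,
% and \|\cdot\|_w as in Definition 2.3.33.
\begin{align*}
\|u_n\|_w &= \max_{0\le s\le t\le b}\frac{|\cos(ns)-\cos(nt)|}{n}\,\|x_0\|_X
 \le \frac{2}{n}\,\|x_0\|_X \longrightarrow 0,\\
\|u_n\|_1 &= \|x_0\|_X\int_0^b |\sin(nt)|\,dt \longrightarrow \frac{2b}{\pi}\,\|x_0\|_X \ne 0 .
\end{align*}
```

So $u_n \to 0$ in the weak norm while $\{u_n\}_{n\ge 1}$ stays away from $0$ in $L^1(T; X)$; rapid oscillation is invisible to $\|\cdot\|_w$.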
We shall show that for a broad class of subsets of $L^1(T; X)$, the topology generated by the weak norm $\|\cdot\|_w$ and the weak $L^1(T; X)$-topology coincide. For this purpose we introduce the following property for subsets of $L^1(T; X)$.

DEFINITION 2.3.35 Let $T = [0, b]$ and $X$ a Banach space. We say that $K \subseteq L^1(T; X)$ has property U, if
(a) $K$ is uniformly integrable; and
(b) for every $\varepsilon > 0$, there exists a compact set $C_\varepsilon \subseteq X$, such that for every $u \in K$ there exists a Lebesgue measurable set $A_{\varepsilon,u} \subseteq T$, such that
\[
\lambda^1\bigl(T \setminus A_{\varepsilon,u}\bigr) < \varepsilon \quad \text{and} \quad u(t) \in C_\varepsilon \quad \forall\, t \in A_{\varepsilon,u}
\]
(here $\lambda^1$ stands for the Lebesgue measure on $T$).

REMARK 2.3.36 Since the Lebesgue measure $\lambda^1$ is nonatomic (see Theorem A.2.5 and Remark A.2.6), the uniform integrability property implies that $K$ is bounded. Also if $K \subseteq L^1(T; X)$ has property U, then $K$ is relatively $w$-compact (see Bourgain (1979)).

THEOREM 2.3.37 If $T = [0, b]$, $X$ is a Banach space and $K \subseteq L^1(T; X)$ has property U, then the weak $L^1(T; X)$-topology and $\|\cdot\|_w$-norm topology on $K$ coincide. Moreover, $K$ is relatively $\|\cdot\|_w$-compact.

PROOF For every $n \ge 1$, let $C_{1/n} \subseteq X$ be the compact set postulated by Definition 2.3.35. The set
\[
C \overset{df}{=} \bigcup_{n=1}^\infty C_{1/n}
\]
is separable in $X$ and note that
\[
u(t) \in C \quad \forall\, t \in T \setminus A_u,
\]
with λ1 (Au ) = 0. So by replacing X by span C if necessary, we may assume that X is a separable Banach space. Then the dual unit ball © ª ∗ B 1 = x∗ ∈ X ∗ : kx∗ kX ∗ 6 1 furnished with the relative w∗ -topology is compact metrizable. Let {tn }n>1 ⊆ T ∗
be a dense set and consider a w∗ -dense set {x∗n }n>1 ⊆ B 1 . The family n o χ[tm ,tk ] x∗n : n, m, k > 1, tm < tk is countable and so it can be enumerated as {ϕi }i>1 . We have kukw
¯ b ¯ ¯Z ¯ ¯ ¯ ® = sup ¯¯ ϕi (t), u(t) X dt¯¯ i>1 ¯ ¯ 0
(see Definition 2.3.33). Let $S\colon L^1(T;X)\to l^\infty$ be the continuous, linear operator defined by
$$S(u) \stackrel{df}{=} \Big\{\int_0^b \langle\varphi_i(t),u(t)\rangle_X\,dt\Big\}_{i\ge 1}.$$
Note that
$$\|S(u)\|_{l^\infty} = \|u\|_w.$$
We claim that $S(K)$ is relatively strongly compact in $l^\infty$. First suppose that there exists a norm compact set $D\subseteq X$, such that
$$K\subseteq S_D^1 = \big\{u\in L^1(T;X) : u(t)\in D \text{ for a.a. } t\in T\big\}.$$
Let $\{e_i\}_{i\ge 1}$ be the standard basis in $l^1$ and $C(D)$ the space of continuous $\mathbb{R}$-valued functions on $D$. Let $\widehat{S}\colon l^1\to L^1\big(T;C(D)\big)$ be defined by
$$\widehat{S}(e_i) \stackrel{df}{=} \varphi_i \qquad \forall\, i\ge 1$$
and on all of $l^1$ by linearity and continuity. Using Theorem 2.3.6, we can see that $\{\varphi_i\}_{i\ge 1}\subseteq L^1\big(T;C(D)\big)$ is relatively norm-compact. Hence $\widehat{S}$ is a compact operator and then by Schauder's theorem, the adjoint operator
$$\widehat{S}^*\colon L^1\big(T;C(D)\big)^* = L^\infty\big(T;M(D)_{w^*}\big)\to l^\infty$$
is compact (here by $M(D)_{w^*}$ we denote the space of Radon measures furnished with the $w^*$-topology; recall that by the Riesz-Markov theorem, $C(D)^*=M(D)$; see Theorem A.3.25). For every $g\in L^\infty\big(T;M(D)_{w^*}\big)=L^1\big(T;C(D)\big)^*$, we have
$$\widehat{S}^*(g) = \Big\{\int_0^b \langle g(t),\varphi_i(t)\rangle_{C(D)}\,dt\Big\}_{i\ge 1},$$
where
$$\langle g(t),\varphi_i(t)\rangle_{C(D)} = \int_D \varphi_i(t)(x)\,dg(t)(x),$$
for the measure $g(t)\in M(D)$. If for every $t\in T$, $g(t)$ is the Dirac measure concentrated on $u(t)\in D\subseteq X$, then $g\in L^\infty\big(T;M(D)_{w^*}\big)$ and
$$\widehat{S}^*(g) = \Big\{\int_0^b \langle\varphi_i(t),u(t)\rangle_X\,dt\Big\}_{i\ge 1} = S(u).$$
Therefore we see that the action of the operator $S$ on $K$ can be identified with the action of the operator $\widehat{S}^*$ and so $S(K)\subseteq l^\infty$ is relatively compact.

Now we pass to the general case and assume that $K$ has property $U$. For $\varepsilon>0$ consider the set
$$K_\varepsilon \stackrel{df}{=} \big\{\chi_{A_{\varepsilon,u}}u : u\in K\big\},$$
where $A_{\varepsilon,u}\subseteq T$ is the Lebesgue measurable set postulated by Definition 2.3.35. By virtue of the uniform integrability of $K$, for each $\delta>0$, we can find $\varepsilon>0$, such that
$$\inf_{v\in K_\varepsilon}\|u-v\|_1<\delta \qquad \forall\, u\in K.$$
Note that $\|\cdot\|_w\le\|\cdot\|_1$. Therefore from the definition of the operator $S$, we have
$$\inf_{y\in S(K_\varepsilon)}\|S(u)-y\|_{l^\infty}<\delta \qquad \forall\, u\in K.$$
But the set $S(K_\varepsilon)\subseteq l^\infty$ is relatively norm-compact (by the first part of the proof, since the elements of $K_\varepsilon$ take their values in the compact set $C_\varepsilon\cup\{0\}$). Hence $S(K)\subseteq l^\infty$ is relatively norm-compact. Since $S\in\mathcal{L}\big(L^1(T;X);l^\infty\big)$, we also have that $S$ is weak-to-weak continuous. The weak and norm topologies coincide on $S(K)$. Thus $S|_K$ is weak-to-norm continuous. Recall that because $K$ has property $U$, it is relatively $w$-compact (see Remark 2.3.36). Since without any loss of generality we may assume that $K$ is convex, the linear map $S\colon K\to S(K)$ is a weak-to-norm homeomorphism. The norm topology of $l^\infty$ on $S(K)$ is the norm topology of $\big(L^1(T;X),\|\cdot\|_w\big)$ on $K$. Therefore the set $K$ is relatively $\|\cdot\|_w$-compact and the proof of the theorem is finished.
Next we ask when a weakly convergent sequence in $L^p$ is strongly convergent. If $p\in(1,+\infty)$ and $X$ is uniformly convex (hence reflexive; see Remark A.3.22), the Lebesgue-Bochner space $L^p(\Omega;X)$ is uniformly convex and so we have the Kadec-Klee property which says that if
$$u_n\xrightarrow{w}u \ \text{ in } L^p(\Omega;X) \quad\text{and}\quad \|u_n\|_p\to\|u\|_p,$$
then
$$u_n\to u \ \text{ in } L^p(\Omega;X).$$
This is no longer true for $L^1(\Omega;X)$. The next proposition illustrates the difference between weak and strong convergence in $L^1(\Omega)$. A sequence $\{u_n\}_{n\ge 1}\subseteq L^1(\Omega)$ which converges weakly but not strongly oscillates violently around its weak limit.

PROPOSITION 2.3.38 If $(\Omega,\Sigma,\mu)$ is a finite measure space, $\{u_n\}_{n\ge 1}\subseteq L^1(\Omega)$, $u_n\xrightarrow{w}u$ in $L^1(\Omega)$ and
$$u(\omega)\le\liminf_{n\to+\infty}u_n(\omega) \quad \text{for } \mu\text{-a.a. } \omega\in\Omega,$$
then $u_n\to u$ in $L^1(\Omega)$.

PROOF Without any loss of generality, we may assume that $u=0$. From Theorem 2.3.24, we know that the sequence $\{u_n\}_{n\ge 1}\subseteq L^1(\Omega)$ is uniformly integrable. So given $\varepsilon>0$ we can find $\delta=\delta(\varepsilon)>0$, such that if $A\in\Sigma$, $\mu(A)<\delta$, then
$$\int_A|u_n|\,d\mu<\varepsilon \qquad \forall\, n\ge 1.$$
For every $N\ge 1$, let
$$\Omega_N \stackrel{df}{=} \Big\{\omega\in\Omega : \inf_{n\ge N}u_n(\omega)>-\frac{\varepsilon}{\mu(\Omega)}\Big\}.$$
Because of our hypothesis and since we have assumed that $u=0$, we can find $N\ge 1$ large so that $\mu\big(\Omega_N^c\big)<\delta$. Also since
$$u_n\xrightarrow{w}u=0 \ \text{ in } L^1(\Omega),$$
we can find $N_1\ge N$, such that for all $n\ge N_1$, we have
$$\Big|\int_{\Omega_N}u_n\,d\mu\Big|<\varepsilon.$$
So for all $n\ge N_1$, we have
$$\int_\Omega|u_n|\,d\mu = \int_{\Omega_N}|u_n|\,d\mu+\int_{\Omega_N^c}|u_n|\,d\mu \le \Big|\int_{\Omega_N}\Big(u_n+\frac{\varepsilon}{\mu(\Omega)}\Big)d\mu\Big|+\int_{\Omega_N}\frac{\varepsilon}{\mu(\Omega)}\,d\mu+\int_{\Omega_N^c}|u_n|\,d\mu\le 4\varepsilon,$$
so
$$\int_\Omega|u_n|\,d\mu\to 0 \quad \text{as } n\to+\infty,$$
i.e.,
$$u_n\to u=0 \ \text{ in } L^1(\Omega).$$
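When we deal with $\mathbb{R}^N$-valued functions, an extremality condition replaces the inequality hypothesis in the previous proposition. The result is due to Visintin (1984), where the reader can find the proof.

PROPOSITION 2.3.39 If $(\Omega,\Sigma,\mu)$ is a finite measure space, $\{f_n\}_{n\ge 1}\subseteq L^1\big(\Omega;\mathbb{R}^N\big)$ is a sequence such that
$$f_n\xrightarrow{w}f \ \text{ in } L^1\big(\Omega;\mathbb{R}^N\big),$$
for some $f\in L^1\big(\Omega;\mathbb{R}^N\big)$ and
$$f(\omega)\in\operatorname{ext}\,\overline{\operatorname{conv}}\Big(\limsup_{n\to+\infty}\{f_n(\omega)\}\Big) \quad \text{for } \mu\text{-a.a. } \omega\in\Omega,$$
then
$$f_n\to f \ \text{ in } L^1\big(\Omega;\mathbb{R}^N\big).$$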
We conclude this section with a brief look at the space of Radon measures, which appears in applications (such as optimal control, game theory, mathematical economics etc.) and is also useful in the study of Sobolev spaces (see Section 2.4). So let $Z$ be a locally compact, $\sigma$-compact metric space. We consider the following three spaces of continuous functions on $Z$:
$$C_c(Z) \stackrel{df}{=} \{u\colon Z\to\mathbb{R} \text{ continuous with compact support}\},$$
$$C_0(Z) \stackrel{df}{=} \{u\colon Z\to\mathbb{R} \text{ continuous and vanishes at infinity, i.e., for all } \varepsilon>0 \text{ there exists a compact set } K_\varepsilon\subseteq Z \text{ such that } |u(z)|<\varepsilon \text{ for all } z\notin K_\varepsilon\},$$
$$C_b(Z) \stackrel{df}{=} \{u\colon Z\to\mathbb{R} \text{ continuous and bounded}\}.$$
Evidently we have the following inclusions:
$$C_c(Z)\subseteq C_0(Z)\subseteq C_b(Z).$$
If $Z$ is compact, then these three spaces coincide. If $Z$ is not compact, each inclusion is strict. We can define a norm on $C_b(Z)$ by setting
$$\|u\|_{C_b(Z)} = \|u\|_\infty \stackrel{df}{=} \sup_{z\in Z}|u(z)|.$$
By restriction, this norm also passes to the spaces $C_c(Z)$ and $C_0(Z)$.

PROPOSITION 2.3.40 The space $C_b(Z)$ equipped with the norm $\|\cdot\|_\infty$ is a Banach space. The space $C_0(Z)$ is a closed subspace of this Banach space (hence itself a Banach space). The space $C_c(Z)$ is $\|\cdot\|_\infty$-dense in $C_0(Z)$.

PROOF The first two statements are obvious. Only the third requires some work. Since $Z$ is a locally compact, $\sigma$-compact metric space, $Z=\bigcup_{n=1}^{\infty}C_n$, where $\{C_n\}_{n\ge 1}$ is a sequence of compact sets with $C_n\subseteq\operatorname{int}C_{n+1}$ for all $n\ge 1$. Let $\{\vartheta_n\}_{n\ge 1}$ and $\{\xi_n\}_{n\ge 1}$ be continuous partitions of unity subordinate to the open covers $\{\operatorname{int}C_n\}_{n\ge 1}$ and $\{C_n^c\}_{n\ge 1}$ respectively. We have $\vartheta_n+\xi_n=1$ on $Z$ and so $\vartheta_n=1$ on $C_n$ for $n\ge 1$. Let $u\in C_0(Z)$ and set $u_n\stackrel{df}{=}\vartheta_n u$. Then $u_n\in C_c(Z)$ and
$$\|u-u_n\|_\infty = \|\xi_n u\|_\infty \qquad \forall\, n\ge 1.$$
Since $\operatorname{supp}\xi_n\subseteq C_n^c$ and $u(z)\to 0$ as $z$ tends to infinity (in the one point Alexandrov compactification of $Z$; see Theorem A.1.3 and Remark A.1.4), we conclude that $\|u-u_n\|_\infty\to 0$ as $n\to+\infty$.
Also by M (Z) we denote the space of all signed measures m : B(Z) −→ R (with B(Z) being the Borel σ-field of Z) that have bounded variation. Since Z is a metric space such measures are regular. The measures in M (Z) are known as Radon measures. The Riesz-Markov representation theorem says that M (Z) is the dual space of C0 (Z).
THEOREM 2.3.41 (Riesz-Markov Representation Theorem) If $Z$ is a locally compact, $\sigma$-compact metric space, then $C_0(Z)^*=M(Z)$ and the duality pairing is given by
$$\langle\mu,u\rangle_{C_0(Z)} = \int_Z u(z)\,d\mu \qquad \forall\, u\in C_0(Z),\ \mu\in M(Z).$$
Using the three spaces of continuous functions on $Z$ introduced earlier, we can define three different notions of convergence for sequences of Radon measures.

DEFINITION 2.3.42 Let $Z$ be a locally compact, $\sigma$-compact metric space and $\{\mu_n\}_{n\ge 1}\subseteq M(Z)$.
(a) We say that the sequence $\{\mu_n\}_{n\ge 1}$ converges vaguely to $\mu\in M(Z)$ if and only if
$$\int_Z u(z)\,d\mu_n\to\int_Z u\,d\mu \qquad \forall\, u\in C_c(Z).$$
(b) We say that the sequence $\{\mu_n\}_{n\ge 1}$ converges weakly to $\mu\in M(Z)$ if and only if
$$\int_Z u(z)\,d\mu_n\to\int_Z u\,d\mu \qquad \forall\, u\in C_0(Z).$$
We denote this convergence by $\mu_n\xrightarrow{w}\mu$.
(c) We say that the sequence $\{\mu_n\}_{n\ge 1}$ converges narrowly to $\mu\in M(Z)$ if and only if
$$\int_Z u(z)\,d\mu_n\to\int_Z u\,d\mu \qquad \forall\, u\in C_b(Z).$$
We denote this convergence by $\mu_n\xrightarrow{n}\mu$.

REMARK 2.3.43 Evidently we have that
• the norm convergence in $M(Z)$ implies the narrow convergence in $M(Z)$;
• the narrow convergence in $M(Z)$ implies the weak convergence in $M(Z)$;
• the weak convergence in $M(Z)$ implies the vague convergence in $M(Z)$.
In functional analytic terms the weak convergence is actually the weak$^*$ convergence in the Banach space $M(Z)$ (see Theorem 2.3.41). The term weak convergence originates from probability theory. Also the term narrow convergence is the English translation of the term “convergence étroite” first used by Bourbaki (1969).
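To see that these notions of Definition 2.3.42 really differ, one may look at $Z=\mathbb{R}$ and the Dirac measures $\mu_n=\delta_n$ concentrated at the integers $n\ge 1$, for which $\int_Z u\,d\delta_n=u(n)$. The sketch below is only an illustration; the three test functions `bump`, `decaying` and `constant_one` are hypothetical choices representing $C_c(\mathbb{R})$, $C_0(\mathbb{R})$ and $C_b(\mathbb{R})$ respectively. The first two integrals tend to $0$ (so $\delta_n\to 0$ vaguely and weakly), while the third stays equal to $1$ (so $\delta_n$ does not converge narrowly to $0$).

```python
def integrate_against_dirac(u, n):
    """Integral of u with respect to the Dirac measure concentrated at the point n."""
    return u(float(n))

def bump(z):
    # continuous with compact support in [-1, 1]: an element of C_c(R)
    return max(0.0, 1.0 - abs(z))

def decaying(z):
    # continuous and vanishing at infinity: an element of C_0(R)
    return 1.0 / (1.0 + z * z)

def constant_one(z):
    # continuous and bounded, but not vanishing at infinity: an element of C_b(R)
    return 1.0

for n in (1, 10, 100, 1000):
    values = [integrate_against_dirac(u, n) for u in (bump, decaying, constant_one)]
    print(n, values)
```

The mass of $\delta_n$ escapes to infinity, which is exactly what the tightness-type condition in Proposition 2.3.45(a) rules out.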
PROPOSITION 2.3.44 If $Z$ is a locally compact, $\sigma$-compact metric space, $\{u_n\}_{n\ge 1}\subseteq C_0(Z)$ is a sequence and $u\in C_0(Z)$, then $u_n\xrightarrow{w}u$ in $C_0(Z)$ if and only if
$$\sup_{n\ge 1}\|u_n\|_\infty<\infty \quad\text{and}\quad u_n(z)\to u(z) \quad \forall\, z\in Z.$$

PROOF “$\Longrightarrow$”: A weakly convergent sequence in a Banach space is bounded. So $\sup_{n\ge 1}\|u_n\|_\infty<+\infty$. Also if $\mu=\delta_z$ is the Dirac measure concentrated at $z\in Z$, then
$$\langle\delta_z,u_n\rangle\to\langle\delta_z,u\rangle.$$
But $\langle\delta_z,u_n\rangle=u_n(z)$ and $\langle\delta_z,u\rangle=u(z)$. So
$$u_n(z)\to u(z) \qquad \forall\, z\in Z.$$
“$\Longleftarrow$”: This is an immediate consequence of the Lebesgue dominated convergence theorem (see Theorem A.2.2).

PROPOSITION 2.3.45 If $Z$ is a locally compact, $\sigma$-compact metric space and $\{\mu_n\}_{n\ge 1}\subseteq M(Z)$ is a sequence, then
(a) if $\mu_n\to\mu$ vaguely in $M(Z)$ and for every $\varepsilon>0$ there exists a compact set $K_\varepsilon\subseteq Z$, such that
$$|\mu_n|\big(K_\varepsilon^c\big)<\varepsilon \qquad \forall\, n\ge n_0,$$
then $\mu_n\to\mu$ narrowly in $M(Z)$;
(b) if $\mu_n\ge 0$ for all $n\ge 1$, $\mu_n\to\mu$ vaguely in $M(Z)$, and $\mu_n(Z)\to\mu(Z)$, then $\mu_n\to\mu$ narrowly in $M(Z)$.
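PROOF (a) Let $u\in C_b(Z)$ and $\varepsilon>0$. Let $K_\varepsilon\subseteq Z$ be the compact set postulated by the hypotheses. We take $\xi_\varepsilon\in C_c(Z)$, such that $0\le\xi_\varepsilon\le 1$ and $\xi_\varepsilon|_{K_\varepsilon}=1$. Evidently $u=\xi_\varepsilon u+v$ with $\operatorname{supp}v\subseteq K_\varepsilon^c$. So we have
$$\int_Z u\,d\mu_n = \int_Z\xi_\varepsilon u\,d\mu_n+\int_Z v\,d\mu_n.$$
Since $\mu_n\to\mu$ vaguely in $M(Z)$ and $\xi_\varepsilon u\in C_c(Z)$, we have
$$\int_Z\xi_\varepsilon u\,d\mu_n\to\int_Z\xi_\varepsilon u\,d\mu.$$
Also
$$\Big|\int_Z v\,d\mu_n\Big| = \Big|\int_{K_\varepsilon^c}v\,d\mu_n\Big|\le\|v\|_\infty\,|\mu_n|\big(K_\varepsilon^c\big)\le\varepsilon\|u\|_\infty.$$
So we obtain
$$\limsup_{n\to+\infty}\int_Z u\,d\mu_n\le\int_Z\xi_\varepsilon u\,d\mu+\varepsilon\|u\|_\infty \qquad (2.30)$$
and
$$\liminf_{n\to+\infty}\int_Z u\,d\mu_n\ge\int_Z\xi_\varepsilon u\,d\mu-\varepsilon\|u\|_\infty.$$
Since $\varepsilon>0$ was arbitrary and $\xi_\varepsilon\to 1$, we obtain that
$$\int_Z u\,d\mu_n\to\int_Z u\,d\mu,$$
i.e., $\mu_n\to\mu$ narrowly in $M(Z)$.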
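(b) Every measure $\mu'\in M(Z)$, $\mu'\ge 0$, is tight; in particular this applies to $\mu$. So given $\varepsilon>0$, we can find a compact set $K_\varepsilon\subseteq Z$, such that $\mu\big(K_\varepsilon^c\big)<\varepsilon$. Let $u\in C_c(Z)$ be such that
$$\operatorname{supp}u\subseteq K_\varepsilon, \quad 0\le u\le 1 \quad\text{and}\quad \|\mu\|_*-\varepsilon<\int_Z u\,d\mu.$$
Let $n_0\ge 1$ be such that
$$\big|\mu_n(Z)-\mu(Z)\big|<\varepsilon \quad\text{and}\quad \Big|\int_Z u\,d\mu_n-\int_Z u\,d\mu\Big|<\varepsilon \qquad \forall\, n\ge n_0. \qquad (2.31)$$
Then for $n\ge n_0$, from (2.31), we have
$$\mu_n\big(K_\varepsilon^c\big)\le\|\mu_n\|_*-\int_Z u\,d\mu_n\le\|\mu_n\|_*+\varepsilon-\int_Z u\,d\mu\le\|\mu_n\|_*-\|\mu\|_*+2\varepsilon<3\varepsilon.$$
So, from part (a), we conclude that $\mu_n\to\mu$ narrowly in $M(Z)$.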
We have a compactness result for the weak convergence of Radon measures.

THEOREM 2.3.46 If $Z$ is a locally compact, $\sigma$-compact metric space and $\{\mu_n\}_{n\ge 1}\subseteq M(Z)$ is bounded, then there is a subsequence $\{\mu_{n_k}\}_{k\ge 1}$ of $\{\mu_n\}_{n\ge 1}$ and $\mu\in M(Z)$, such that
$$\mu_{n_k}\xrightarrow{w}\mu \quad \text{as } k\to+\infty.$$

PROOF Let $Z^*$ be the Alexandrov one-point compactification of $Z$ (see Theorem A.1.3 and Remark A.1.4). Then $Z^*$ is a compact metrizable space and so $C(Z^*)$ is a separable Banach space. Set
$$E \stackrel{df}{=} \{u\in C(Z^*) : u(\infty)=0\}.$$
Then this is a closed subspace of $C(Z^*)$; thus $E$ is a separable Banach space too. For every $u\in E$, let $\widehat{u}$ denote the restriction of $u$ to $Z$. Consider the linear map $L\colon E\to C_b(Z)$ defined by
$$L(u) \stackrel{df}{=} \widehat{u} \qquad \forall\, u\in E.$$
We claim that $L$ is an isometry of $E$ onto $C_0(Z)$. To this end, let $u\in E$. Since $u$ is continuous at $\infty$, for every $\varepsilon>0$, there exists a compact set $K_\varepsilon\subseteq Z$, such that
$$\big|u(z)-u(\infty)\big|<\varepsilon \qquad \forall\, z\in K_\varepsilon^c.$$
This means that $\widehat{u}\in C_0(Z)$. On the other hand let $v\in C_0(Z)$. Then $v$ can be extended to $Z^*$ by setting
$$v_1(\infty)=0 \quad\text{and}\quad v_1(z)=v(z) \quad \forall\, z\in Z.$$
Since $v\in C_0(Z)$, we see that $v_1\in C(Z^*)$ and so $v_1\in E$. This isometry shows that $C_0(Z)$ is separable. Then every closed ball of $M(Z)=C_0(Z)^*$ furnished with the weak$^*$ topology is compact (by Alaoglu's theorem; see Theorem A.3.9) and metrizable. This proves the theorem.

REMARK 2.3.47 Using the compactification technique of the previous proof we can show that if
$$G \stackrel{df}{=} \big\{\mu\in M(Z^*) : \mu\big(\{\infty\}\big)=0\big\}$$
and $S\colon G\to M(Z)$ is defined by
$$S(\mu) \stackrel{df}{=} \widehat{\mu} \quad \forall\, \mu\in G, \qquad\text{with}\quad \widehat{\mu}(A)=\mu(A) \quad \forall\, A\in\mathcal{B}(Z)\subseteq\mathcal{B}(Z^*),$$
then $S$ is an isometry of $G$ onto $M(Z)$.

When $Z$ is compact, Theorem 2.3.46 can be strengthened. In what follows by $M(Z)^+$ we denote the elements $\mu$ of $M(Z)$ for which we have $\mu\ge 0$ (i.e., they are measures).

THEOREM 2.3.48 If $Z$ is a compact metric space and $\{\mu_n\}_{n\ge 1}\subseteq M(Z)^+$ is such that
$$\|\mu_n\|=r \qquad \forall\, n\ge 1,$$
then there exists a subsequence $\{\mu_{n_k}\}_{k\ge 1}$ of $\{\mu_n\}_{n\ge 1}$ and $\mu\in M(Z)^+$ with $\|\mu\|=r$, such that
$$\mu_{n_k}\xrightarrow{w}\mu \quad \text{as } k\to+\infty$$
(i.e., the set $S_r^+=\{\mu\in M(Z)^+ : \|\mu\|=r\}$ is $w^*$-sequentially compact).

We conclude with a result for sequences of functions which converge simultaneously pointwise and weakly in $L^p(\Omega)$ ($p\in[1,+\infty)$). This result can be viewed as a refinement of Fatou's Lemma.
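PROPOSITION 2.3.49 If $(\Omega,\Sigma,\mu)$ is a finite measure space, $\{u_n\}_{n\ge 1}\subseteq L^p(\Omega)$ ($p\in[1,+\infty)$),
$$u_n\xrightarrow{w}u \ \text{ in } L^p(\Omega) \quad\text{and}\quad u_n(\omega)\to u(\omega) \ \text{ for } \mu\text{-a.a. } \omega\in\Omega,$$
then
$$\lim_{n\to+\infty}\Big(\|u_n\|_p^p-\|u_n-u\|_p^p\Big) = \|u\|_p^p.$$

PROOF For a given $\varepsilon>0$, we can find $c(\varepsilon)>0$, such that for all $a,b\in\mathbb{R}$, we have
$$\big||a+b|^p-|a|^p\big|\le\varepsilon|a|^p+c(\varepsilon)|b|^p. \qquad (2.32)$$
We set
$$h_n^\varepsilon \stackrel{df}{=} \Big(\big||u_n|^p-|u_n-u|^p-|u|^p\big|-\varepsilon|u_n-u|^p\Big)^+.$$
Evidently we have $h_n^\varepsilon(\omega)\to 0$ for $\mu$-a.a. $\omega\in\Omega$ and from (2.32) (applied with $a=u_n-u$ and $b=u$),
$$h_n^\varepsilon(\omega)\le\big(1+c(\varepsilon)\big)\big|u(\omega)\big|^p.$$
So from the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
$$\int_\Omega h_n^\varepsilon\,d\mu\to 0 \quad \text{as } n\to+\infty. \qquad (2.33)$$
But note that
$$\big||u_n|^p-|u_n-u|^p-|u|^p\big|\le h_n^\varepsilon+\varepsilon|u_n-u|^p \quad \text{for } \mu\text{-a.a. } \omega\in\Omega.$$
Hence, using (2.33), we obtain
$$\limsup_{n\to+\infty}\int_\Omega\big||u_n|^p-|u_n-u|^p-|u|^p\big|\,d\mu\le M\varepsilon,$$
where
$$M \stackrel{df}{=} \sup_{n\ge 1}\|u_n-u\|_p^p.$$
Since $\varepsilon>0$ was arbitrary, we have
$$\lim_{n\to+\infty}\Big(\|u_n\|_p^p-\|u_n-u\|_p^p\Big) = \|u\|_p^p.$$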
REMARK 2.3.50 Note that if p = 2, then we do not need the µ-almost everywhere pointwise convergence of the sequence {un }n>1 to u.
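The limit in Proposition 2.3.49 can be checked on a simple concentrating sequence. The sketch below is an illustration only: it assumes $\Omega=[0,1]$ with the Lebesgue measure, $p>1$, $u\equiv 1$ and $u_n=1+n^{1/p}\chi_{[0,1/n)}$, which is bounded in $L^p(\Omega)$ and converges to $u$ almost everywhere (hence weakly in $L^p(\Omega)$ for $p>1$). All three norms have closed forms, and the helper name `gap` is hypothetical.

```python
def gap(n, p):
    """Return ||u_n||_p^p - ||u_n - u||_p^p - ||u||_p^p for the spike sequence.

    Assumed setting: Omega = [0, 1], u = 1, u_n = 1 + n**(1/p) on [0, 1/n), u_n = 1 elsewhere.
    """
    mass = 1.0 / n                    # Lebesgue measure of the spike set [0, 1/n)
    height = n ** (1.0 / p)           # height of the spike added to u
    norm_un = (1.0 + height) ** p * mass + (1.0 - mass)   # ||u_n||_p^p
    norm_diff = height ** p * mass                        # ||u_n - u||_p^p (equal to 1)
    norm_u = 1.0                                          # ||u||_p^p
    return norm_un - norm_diff - norm_u

for p in (2.0, 3.0):
    for n in (10**2, 10**4, 10**6):
        print(p, n, gap(n, p))
```

In this example $\|u_n\|_p^p\to 2$, so the norms themselves do not converge to $\|u\|_p^p=1$; it is only after subtracting $\|u_n-u\|_p^p$ that the limit $\|u\|_p^p$ is recovered.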
2.4 Sobolev Spaces

Already in Section 1.6, we introduced the Sobolev space $W^{1,p}(Z)$ (see Definition 1.6.1). Here we introduce Sobolev spaces of any order $m\ge 1$ and conduct a systematic study of them, proving among other things those results stated in Section 1.6 without a proof.

Let us start by fixing the notation. An element $\alpha=(\alpha_k)_{k=1}^N\in\mathbb{N}^N$ is said to be a multi-index. Associated to a multi-index $\alpha$, we have the following symbols:
$$|\alpha| \stackrel{df}{=} \sum_{k=1}^N\alpha_k,$$
the length of $\alpha$, and
$$z^\alpha = z_1^{\alpha_1}\cdots z_N^{\alpha_N} \qquad \forall\, z=(z_k)_{k=1}^N\in\mathbb{R}^N.$$
We say that two multi-indices $\alpha,\beta$ are related by $\alpha\le\beta$, if $\alpha_k\le\beta_k$ for all $k\in\{1,\ldots,N\}$. Finally we set
$$D_k \stackrel{df}{=} \frac{\partial}{\partial z_k} \qquad \forall\, k\in\{1,\ldots,N\}$$
and
$$D^\alpha \stackrel{df}{=} D_1^{\alpha_1}\cdots D_N^{\alpha_N} = \frac{\partial^{|\alpha|}}{\partial z_1^{\alpha_1}\cdots\partial z_N^{\alpha_N}}.$$
DEFINITION 2.4.1 Let $Z\subseteq\mathbb{R}^N$ be an open set. By $\mathcal{D}(Z)$ we denote the space $C_c^\infty(Z)$ (the space of $C^\infty(Z)$ functions with compact support) equipped with the following convergence notion: “the sequence $\{\vartheta_n\}_{n\ge 1}\subseteq C_c^\infty(Z)$ is said to converge to $0$, if there exists a fixed compact set $K\subseteq Z$, such that $\operatorname{supp}\vartheta_n\subseteq K$ for all $n\ge 1$ and $\{D^\alpha\vartheta_n\}_{n\ge 1}$ converges uniformly to $0$ for all $\alpha\in\mathbb{N}^N$.” The elements of the space $\mathcal{D}(Z)$ are called test functions. A linear functional $T\colon\mathcal{D}(Z)\to\mathbb{R}$, such that
$$\vartheta_n\to 0 \ \text{ in } \mathcal{D}(Z) \quad\text{implies}\quad T(\vartheta_n)\to 0,$$
is called a distribution. The space of distributions is denoted by $\mathcal{D}(Z)^*$.
REMARK 2.4.2 The convergence notion introduced on $C_c^\infty(Z)$ is actually topological, i.e., corresponds to a topology on $C_c^\infty(Z)$. Therefore $\mathcal{D}(Z)^*$ is the dual of the space of test functions. Recall that $\mathcal{D}(Z)$ is dense in $L^p(Z)$ for all $p\in[1,+\infty)$. If $u\in L^1_{loc}(Z)$ and $T_u\colon\mathcal{D}(Z)\to\mathbb{R}$ is defined by
$$T_u(\vartheta) \stackrel{df}{=} \int_Z u\vartheta\,dz \qquad \forall\, \vartheta\in\mathcal{D}(Z),$$
then $T_u\in\mathcal{D}(Z)^*$. Moreover, if $u,v\in L^1_{loc}(Z)$ and
$$u=v \quad \text{for a.a. } z\in Z,$$
then $T_u=T_v$. In particular, if
$$u(z)=0 \quad \text{for a.a. } z\in Z,$$
it defines the zero distribution. In fact the converse is also true. If $T_u=0$, then $u(z)=0$ for almost all $z\in Z$, provided that $u\in L^1_{loc}(Z)$. Distributions resulting from locally integrable functions are usually called regular distributions. Another important distribution is the Dirac $\delta$-function; namely for $z\in Z$, we define
$$\delta_z(\vartheta) \stackrel{df}{=} \vartheta(z) \qquad \forall\, \vartheta\in\mathcal{D}(Z).$$
This distribution is not regular.

DEFINITION 2.4.3 For every distribution $T\in\mathcal{D}(Z)^*$ and every $\alpha\in\mathbb{N}^N$, the distribution $D^\alpha T$ is defined by
$$D^\alpha T(\vartheta) \stackrel{df}{=} (-1)^{|\alpha|}\,T\big(D^\alpha\vartheta\big) \qquad \forall\, \vartheta\in\mathcal{D}(Z).$$
Then $D^\alpha T$ is the derivative of order $\alpha$ of the distribution $T$. Given two functions $u,v\in L^1_{loc}(Z)$ and $\alpha\in\mathbb{N}^N$, we write $v=D^\alpha u$ to express the fact that $D^\alpha T_u=T_v$. This is equivalent to saying that
$$\int_Z v\vartheta\,dz = (-1)^{|\alpha|}\int_Z uD^\alpha\vartheta\,dz \qquad \forall\, \vartheta\in\mathcal{D}(Z).$$
The function $v=D^\alpha u$ is the derivative of order $\alpha$ in the sense of distributions of the function $u$.

REMARK 2.4.4 If $u\in C^{|\alpha|}(Z)$, then the distributional derivative $D^\alpha u$ coincides with the classical partial derivative $\dfrac{\partial^{|\alpha|}u}{\partial z_1^{\alpha_1}\cdots\partial z_N^{\alpha_N}}$.
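Now we are ready to give the definition of Sobolev space.

DEFINITION 2.4.5 Let $Z\subseteq\mathbb{R}^N$ be an open set. The Sobolev space $W^{m,p}(Z)$ for $m\in\mathbb{N}_0$, $p\in[1,+\infty]$, is defined by
$$W^{m,p}(Z) \stackrel{df}{=} \big\{u\in L^p(Z) : D^\alpha u\in L^p(Z) \text{ for all } \alpha\in\mathbb{N}^N \text{ with } |\alpha|\le m\big\}.$$
For every $u\in W^{m,p}(Z)$, we define
$$\|u\|_{W^{m,p}(Z)} \stackrel{df}{=} \Big(\sum_{|\alpha|\le m}\|D^\alpha u\|_p^p\Big)^{\frac{1}{p}} \quad \text{if } p\in[1,+\infty)$$
and
$$\|u\|_{W^{m,\infty}(Z)} \stackrel{df}{=} \sum_{|\alpha|\le m}\|D^\alpha u\|_\infty.$$
Clearly this is a norm on $W^{m,p}(Z)$. Finally we set
$$W_0^{m,p}(Z) \stackrel{df}{=} \overline{\mathcal{D}(Z)}^{\,\|\cdot\|_{W^{m,p}(Z)}},$$
for $p\in[1,+\infty)$.

REMARK 2.4.6 Evidently
$$u_n\to u \ \text{ in } W^{m,p}(Z)$$
if and only if for all $\alpha\in\mathbb{N}^N$ with $|\alpha|\le m$, we have
$$D^\alpha u_n\to D^\alpha u \ \text{ in } L^p(Z).$$
Let
$$r \stackrel{df}{=} \operatorname{card}\{\alpha : \alpha \text{ is a multi-index}, |\alpha|\le m\}$$
and consider the map $L\colon W^{m,p}(Z)\to\big(L^p(Z)\big)^r$, defined by
$$L(u) \stackrel{df}{=} \big(D^\alpha u\big)_{|\alpha|\le m} \qquad \forall\, u\in W^{m,p}(Z).$$
It is easily seen that $L$ is an isometric isomorphism of $W^{m,p}(Z)$ onto its range, which is a closed subspace of $\big(L^p(Z)\big)^r$. Based on this observation, we can state the following result.

PROPOSITION 2.4.7 The spaces $\big(W^{m,p}(Z),\|\cdot\|_{W^{m,p}(Z)}\big)$ (with $p\in[1,+\infty]$, $m\in\mathbb{N}_0$) are Banach spaces, which are separable for $p\in[1,+\infty)$, reflexive and uniformly convex for $p\in(1,+\infty)$.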
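The multi-index bookkeeping behind the number $r$ of Remark 2.4.6 is easy to tabulate. The sketch below is only an illustration and assumes that multi-indices have nonnegative integer entries (so that $\alpha=0$, corresponding to $u$ itself, is included); the helper names `multi_indices` and `monomial` are hypothetical. Under that assumption the count is $r=\binom{N+m}{m}$.

```python
from itertools import product
from math import comb, prod

def multi_indices(N, m):
    """All alpha in {0, 1, ...}^N with length |alpha| = alpha_1 + ... + alpha_N <= m."""
    return [alpha for alpha in product(range(m + 1), repeat=N) if sum(alpha) <= m]

def monomial(z, alpha):
    """The monomial z^alpha = z_1^alpha_1 * ... * z_N^alpha_N."""
    return prod(zk ** ak for zk, ak in zip(z, alpha))

for N, m in ((1, 1), (2, 1), (2, 2), (3, 2)):
    r = len(multi_indices(N, m))
    print(N, m, r, comb(N + m, m))   # the last two columns agree

print(monomial((2.0, 3.0), (1, 2)))  # 2^1 * 3^2
```

For instance, for $N=2$ and $m=1$ the map $L$ has the three components $u$, $D_1u$ and $D_2u$.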
COROLLARY 2.4.8 For every $m\in\mathbb{N}_0$ and $p\in[1,+\infty]$, the space $W_0^{m,p}(Z)$ is a closed subspace of $W^{m,p}(Z)$.

PROOF We need to show that
$$W_0^{m,p}(Z)\subseteq W^{m,p}(Z).$$
So let $u\in W_0^{m,p}(Z)$ and let $\{\vartheta_n\}_{n\ge 1}\subseteq\mathcal{D}(Z)$ be such that $\vartheta_n\to u$ in $W^{m,p}(Z)$. From Proposition 2.4.7, it follows that $u\in W^{m,p}(Z)$.

REMARK 2.4.9 The case $p=2$ is important and we reserve a special notation for it. We set
$$H^m(Z) \stackrel{df}{=} W^{m,2}(Z) \quad\text{and}\quad H_0^m(Z) \stackrel{df}{=} W_0^{m,2}(Z).$$
For $u,v\in H^m(Z)$, we define
$$(u,v)_{H^m(Z)} \stackrel{df}{=} \sum_{|\alpha|\le m}(D^\alpha u,D^\alpha v)_{L^2(Z)} = \sum_{|\alpha|\le m}\int_Z D^\alpha u\,D^\alpha v\,dz.$$
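Clearly $(\cdot,\cdot)_{H^m(Z)}$ defines an inner product on $H^m(Z)$ which generates the norm $\|\cdot\|_{W^{m,2}(Z)}$. From Proposition 2.4.7, it follows that $H^m(Z)$ and $H_0^m(Z)$ are Hilbert spaces.

From now on $m=1$. So we examine the first Sobolev spaces $W^{1,p}(Z)$ and $W_0^{1,p}(Z)$, with $p\in[1,+\infty]$. Next we derive ways to approximate the elements of the Sobolev space $W^{1,p}(Z)$ by smooth functions. For this purpose we introduce certain regularizing sequences known as mollifiers.

DEFINITION 2.4.10 Let $\varphi\in C_c^\infty\big(\mathbb{R}^N\big)$, $\varphi\ge 0$ be such that
$$\operatorname{supp}\varphi\subseteq\big\{z\in\mathbb{R}^N : \|z\|_{\mathbb{R}^N}\le 1\big\} \quad\text{and}\quad \int_{\mathbb{R}^N}\varphi(z)\,dz=1.$$
A possible choice is the function
$$\varphi(z) \stackrel{df}{=} \begin{cases} c\exp\Big(\dfrac{1}{\|z\|_{\mathbb{R}^N}^2-1}\Big) & \text{if } \|z\|_{\mathbb{R}^N}<1,\\[1ex] 0 & \text{if } \|z\|_{\mathbb{R}^N}\ge 1,\end{cases}$$
with $c>0$ chosen in such a way so that $\int_{\mathbb{R}^N}\varphi(z)\,dz=1$. If $\varepsilon>0$, we define
$$\varphi_\varepsilon(z) = \frac{1}{\varepsilon^N}\,\varphi\Big(\frac{z}{\varepsilon}\Big).$$
Then $\varphi_\varepsilon\in C_c^\infty\big(\mathbb{R}^N\big)$, $\varphi_\varepsilon\ge 0$ is such that
$$\operatorname{supp}\varphi_\varepsilon\subseteq\big\{z\in\mathbb{R}^N : \|z\|_{\mathbb{R}^N}\le\varepsilon\big\} \quad\text{and}\quad \int_{\mathbb{R}^N}\varphi_\varepsilon(z)\,dz=1.$$
The function $\varphi_\varepsilon$ is called a mollifier and given $u\in L^1_{loc}(Z)$, the mollification (or regularization) of $u$ corresponding to $\{\varphi_\varepsilon\}_{\varepsilon>0}$ is given by
$$u_\varepsilon(z) \stackrel{df}{=} \int_{\mathbb{R}^N}u(z-y)\varphi_\varepsilon(y)\,dy = \int_{\mathbb{R}^N}u(y)\varphi_\varepsilon(z-y)\,dy,$$
where we have extended $u$ to all of $\mathbb{R}^N$ as zero (i.e., $u_\varepsilon=u\star\varphi_\varepsilon$ with $\star$ denoting the convolution operation).

REMARK 2.4.11 Note that
$$\operatorname{supp}u_\varepsilon\subseteq\operatorname{supp}u+\big\{z\in\mathbb{R}^N : \|z\|_{\mathbb{R}^N}\le\varepsilon\big\}.$$

The next proposition summarizes the approximations achieved via mollification.

PROPOSITION 2.4.12 If $Z\subseteq\mathbb{R}^N$ is an open set, then
(a) for every $u\in L^1_{loc}(Z)$ and every $\varepsilon>0$, $u_\varepsilon\in C^\infty\big(Z_{-\varepsilon}\big)$, where $Z_{-\varepsilon}\stackrel{df}{=}\{z\in Z : d(z,\partial Z)>\varepsilon\}$;
(b) if $u\in C(Z)$, then $u_\varepsilon\to u$ as $\varepsilon\searrow 0$ uniformly on compact subsets of $Z$;
(c) if $u\in L^p_{loc}(Z)$ for some $p\in[1,+\infty)$, then $u_\varepsilon\to u$ in $L^p_{loc}(Z)$;
(d) if $u\in W^{1,p}(Z)$, $p\in[1,+\infty]$, then $D_iu_\varepsilon=\varphi_\varepsilon\star D_iu$ for all $i\in\{1,\ldots,N\}$;
(e) if $u\in W^{1,p}(Z)$, $p\in[1,+\infty)$, then $u_\varepsilon\to u$ in $W^{1,p}(Z)$.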
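The construction of Definition 2.4.10 can be tried out numerically in dimension $N=1$. The following sketch is an illustration under stated assumptions: the normalizing constant $c$ and the convolution integral are approximated by a midpoint rule, the function being mollified is the Heaviside step (a hypothetical choice), and the names `bump`, `integrate`, `phi_eps`, `mollify` and `step` are not from the text.

```python
import math

def bump(x):
    """Unnormalized profile exp(1 / (x^2 - 1)) on (-1, 1), extended by zero."""
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def integrate(f, a, b, steps=4000):
    """Midpoint rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

C = 1.0 / integrate(bump, -1.0, 1.0)   # numerical value of the constant c

def phi_eps(x, eps):
    """Mollifier phi_eps(x) = phi(x / eps) / eps in dimension N = 1."""
    return C * bump(x / eps) / eps

def mollify(u, z, eps):
    """u_eps(z): integral of u(z - y) * phi_eps(y) over the support |y| <= eps."""
    return integrate(lambda y: u(z - y) * phi_eps(y, eps), -eps, eps)

def step(z):
    """Heaviside step function, discontinuous at 0."""
    return 1.0 if z > 0.0 else 0.0

for z in (-0.2, -0.05, 0.0, 0.05, 0.2):
    print(z, round(mollify(step, z, 0.1), 4))
```

Outside the interval $[-\varepsilon,\varepsilon]$ the mollified function agrees with the step, and inside it interpolates smoothly between $0$ and $1$, taking the value $1/2$ at the origin by the symmetry of $\varphi$.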
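PROOF (a) Note that $u_\varepsilon$ is defined on $Z_{-\varepsilon}$ (see Remark 2.4.11). Let $z\in Z_{-\varepsilon}$, $i\in\{1,\ldots,N\}$ and let $e_1,\ldots,e_N$ be the canonical basis of $\mathbb{R}^N$. For $|t|$ small enough, we have that
$$z+te_i\in Z_{-\varepsilon} \qquad \forall\, i\in\{1,\ldots,N\}.$$
So if $t_k\to 0$, we can assume that
$$z+t_ke_i\in Z_{-\varepsilon} \qquad \forall\, k\ge 1.$$
We set
$$h(z,y) = \frac{1}{\varepsilon^N}\,\varphi\Big(\frac{z-y}{\varepsilon}\Big)u(y)$$
and
$$f_k(z,y) = \frac{1}{t_k}\big(h(z+t_ke_i,y)-h(z,y)\big).$$
We have
$$\frac{1}{t_k}\big(u_\varepsilon(z+t_ke_i)-u_\varepsilon(z)\big) = \int_{Z_{-\varepsilon}}f_k(z,y)\,dy.$$
Note that
$$f_k(z,y)\to\frac{\partial\varphi_\varepsilon}{\partial z_i}(z-y)\,u(y) \quad \text{as } k\to+\infty, \qquad \forall\, y\in Z_{-\varepsilon}.$$
Moreover, by the mean value theorem, we have
$$\big|f_k(z,y)\big|\le\frac{1}{\varepsilon^{N+1}}\,\|D\varphi\|_\infty\,|u|\in L^1\big(Z_{-\varepsilon}\big).$$
So by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
$$\frac{\partial u_\varepsilon}{\partial z_i}(z) = \lim_{k\to+\infty}\frac{1}{t_k}\big(u_\varepsilon(z+t_ke_i)-u_\varepsilon(z)\big) = \int_Z\frac{\partial\varphi_\varepsilon}{\partial z_i}(z-y)\,u(y)\,dy.$$
In a similar way we show that the partial derivatives of $u_\varepsilon$ of all orders exist and are continuous on $Z_{-\varepsilon}$. Therefore
$$u_\varepsilon\in C^\infty\big(Z_{-\varepsilon}\big).$$
Note that we have
$$\frac{\partial}{\partial z_i}\big(u\star\varphi_\varepsilon\big) = u\star\frac{\partial\varphi_\varepsilon}{\partial z_i} \qquad \forall\, i\in\{1,\ldots,N\}.$$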
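(b) Let $K$ be a compact subset of $Z$. Take $z\in K$ and set
$$x \stackrel{df}{=} \frac{y-z}{\varepsilon}, \qquad y\in Z.$$
Note that $\varphi(-x)=\varphi(x)$. Then we have
$$\big|u_\varepsilon(z)-u(z)\big|\le\int_{B_\varepsilon(z)}\frac{1}{\varepsilon^N}\,\varphi\Big(\frac{z-y}{\varepsilon}\Big)\big|u(y)-u(z)\big|\,dy \le\int_{B_1(0)}\varphi(x)\big|u(z+\varepsilon x)-u(z)\big|\,dx\le\xi(\varepsilon)\int_{B_1(0)}\varphi(x)\,dx = \xi(\varepsilon), \qquad (2.34)$$
where
$$\xi(\varepsilon) = \sup\big\{|u(y)-u(v)| : (y,v)\in K_\varepsilon\times K_\varepsilon,\ \|y-v\|\le\varepsilon\big\}.$$
Since $u$ is uniformly continuous on the compact subsets of $Z$, we have that
$$\lim_{\varepsilon\searrow 0}\xi(\varepsilon)=0.$$
So from (2.34), we conclude that $u_\varepsilon\to u$ as $\varepsilon\searrow 0$ uniformly on compact subsets of $Z$.

(c) Let $K$ be a compact subset of $Z$ and $0<\varepsilon<d\big(K,\partial Z\big)$. Then for $z\in K$, we have
$$\big|u_\varepsilon(z)\big|^p\le\int_{B_1(0)}\varphi(y)\big|u(z+\varepsilon y)\big|^p\,dy. \qquad (2.35)$$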
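To see (2.35) note that it is clearly true if $p=1$. So suppose that $p\in(1,+\infty)$. If $\frac{1}{p}+\frac{1}{p'}=1$, then
$$\varphi(y)u(z+\varepsilon y) = \varphi(y)^{\frac{1}{p'}}\,\varphi(y)^{\frac{1}{p}}\,u(z+\varepsilon y).$$
Invoking Hölder's inequality (see Theorem A.2.27), we have
$$\big|u_\varepsilon(z)\big|\le\Big(\int_{B_1(0)}\varphi(y)\,dy\Big)^{\frac{1}{p'}}\Big(\int_{B_1(0)}\varphi(y)\,|u(z+\varepsilon y)|^p\,dy\Big)^{\frac{1}{p}}. \qquad (2.36)$$
Since
$$\int_{B_1(0)}\varphi(y)\,dy=1,$$
from (2.36), we obtain (2.35). Invoking (2.35), we have
$$\int_K\big|u_\varepsilon(z)\big|^p\,dz\le\int_K\Big(\int_{B_1(0)}\varphi(y)\big|u(z+\varepsilon y)\big|^p\,dy\Big)dz;$$
thus by Fubini's theorem, we have
$$\int_K\big|u_\varepsilon(z)\big|^p\,dz\le\int_{B_1(0)}\varphi(y)\int_K\big|u(z+\varepsilon y)\big|^p\,dz\,dy\le\int_{B_1(0)}\varphi(y)\int_{K_\varepsilon}\big|u(v)\big|^p\,dv\,dy,$$
where
$$K_\varepsilon \stackrel{df}{=} \big\{v\in\mathbb{R}^N : d(v,K)\le\varepsilon\big\}.$$
Since $\int_{B_1(0)}\varphi(y)\,dy=1$, we have
$$\|u_\varepsilon\|_{L^p(K)}^p\le\int_{K_\varepsilon}\big|u(v)\big|^p\,dv, \qquad (2.37)$$
so $u_\varepsilon\in L^p(K)$. Let $V$ be a bounded open set, such that
$$K_\varepsilon\subseteq V\subseteq\overline{V}\subseteq Z \qquad \forall\, \varepsilon>0 \text{ small enough}.$$
Let $\delta>0$ be given and select $h\in C\big(\overline{V}\big)$, such that $\|u-h\|_{L^p(V)}<\delta$. Here we exploit the density of the embedding $C\big(\overline{V}\big)\subseteq L^p(V)$. Then, using (2.37) and part (b), for $\varepsilon>0$ small enough, we have
$$\|u-u_\varepsilon\|_{L^p(K)}\le\|u-h\|_{L^p(K)}+\|h-h_\varepsilon\|_{L^p(K)}+\|(h-u)_\varepsilon\|_{L^p(K)} \le\|u-h\|_{L^p(K)}+\|h-h_\varepsilon\|_{L^p(K)}+\|h-u\|_{L^p(K_\varepsilon)}\le 3\delta,$$
so
$$u_\varepsilon\to u \ \text{ in } L^p_{loc}(Z) \text{ as } \varepsilon\searrow 0.$$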
(d) Suppose that $u\in W^{1,p}(Z)$, $p\in(1,+\infty)$. From the proof of part (a) and integrating by parts, we know that
$$D_iu_\varepsilon(z) = \int_Z\frac{\partial\varphi_\varepsilon}{\partial z_i}(z-y)\,u(y)\,dy = -\int_Z\frac{\partial\varphi_\varepsilon}{\partial y_i}(z-y)\,u(y)\,dy = \int_Z\varphi_\varepsilon(z-y)\,D_iu(y)\,dy = \big(\varphi_\varepsilon\star D_iu\big)(z),$$
for $z\in Z_{-\varepsilon}$, $\varepsilon>0$ and $i\in\{1,\ldots,N\}$.

(e) This follows from (c) and (d).
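The next theorem shows that smooth functions are dense in $W^{1,p}(Z)$, $p\in[1,+\infty)$. So equivalently the space $W^{1,p}(Z)$ can be defined as the closure in the $\|\cdot\|_{W^{1,p}(Z)}$-norm of $C^\infty(Z)\cap W^{1,p}(Z)$. The result is known in the literature as the Meyers-Serrin theorem.

THEOREM 2.4.13 (Meyers-Serrin Theorem) If $p\in[1,+\infty)$, then the embedding $C^\infty(Z)\cap W^{1,p}(Z)\subseteq W^{1,p}(Z)$ is dense.

PROOF We define
$$Z_{-n} \stackrel{df}{=} \Big\{z\in Z : d\big(z,\partial Z\big)>\frac{1}{n},\ \|z\|_{\mathbb{R}^N}<n\Big\} \qquad \forall\, n\ge 1$$
and $Z_0\stackrel{df}{=}\emptyset$. Set
$$U_n \stackrel{df}{=} Z_{-(n+1)}\setminus\overline{Z}_{-(n-1)} \qquad \forall\, n\ge 1.$$
The collection $\{U_n\}_{n\ge 1}$ is an open cover of $Z$. So we can find a smooth partition of unity $\{\xi_n\}_{n\ge 1}$ subordinate to $\{U_n\}_{n\ge 1}$. Then
$$\xi_n\in C_c^\infty(U_n), \quad 0\le\xi_n\le 1 \quad \forall\, n\ge 1 \qquad\text{and}\qquad \sum_{n=1}^\infty\xi_n(z)=1 \quad \forall\, z\in Z.$$
Let $u\in W^{1,p}(Z)$ and $\delta>0$. We have
$$\xi_nu\in W^{1,p}(Z) \quad\text{and}\quad \operatorname{supp}\big(\xi_nu\big)\subseteq U_n \qquad \forall\, n\ge 1.$$
Thus by virtue of Proposition 2.4.12(e), there exist $\varepsilon_n>0$, such that
$$\operatorname{supp}\big(\varphi_{\varepsilon_n}\star(\xi_nu)\big)\subseteq U_n \quad\text{and}\quad \big\|\varphi_{\varepsilon_n}\star(\xi_nu)-\xi_nu\big\|_{W^{1,p}(Z)}<\frac{\delta}{2^n}. \qquad (2.38)$$
Define
$$u_\delta \stackrel{df}{=} \sum_{n=1}^\infty\varphi_{\varepsilon_n}\star(\xi_nu).$$
In some neighbourhood of each point $z\in Z$, there are only finitely many nonzero terms in the sum. Hence $u_\delta\in C^\infty(Z)$. Next note that
$$u = \sum_{n=1}^\infty\xi_nu.$$
From (2.38), we have
$$\|u_\delta-u\|_{W^{1,p}(Z)}\le\sum_{n=1}^\infty\Big(\int_Z\big|\varphi_{\varepsilon_n}\star(\xi_nu)-\xi_nu\big|^p\,dz\Big)^{\frac{1}{p}}+\sum_{n=1}^\infty\Big(\int_Z\big\|\varphi_{\varepsilon_n}\star D(\xi_nu)-D(\xi_nu)\big\|_{\mathbb{R}^N}^p\,dz\Big)^{\frac{1}{p}}<2\delta,$$
so $u_\delta\in C^\infty(Z)\cap W^{1,p}(Z)$ and $u_\delta\to u$ in $W^{1,p}(Z)$ as $\delta\searrow 0$.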
REMARK 2.4.14 The result is true for all Sobolev spaces $W^{m,p}(Z)$, $m\ge 1$. We emphasize that in the above approximation result, we do not claim that the approximating functions belong to $C^\infty\big(\overline{Z}\big)$. To obtain this we need to strengthen the geometry of the boundary $\partial Z$ of $Z$.

DEFINITION 2.4.15 A bounded open set $Z\subseteq\mathbb{R}^N$ is said to be Lipschitz, if for each $z\in\partial Z$, there exists a neighbourhood $U$ of $z$, such that
$$Z\cap U = \big\{y=(y_k)_{k=1}^N\in\mathbb{R}^N : \eta(y_1,\ldots,y_{N-1})<y_N\big\}\cap U,$$
where $\eta\colon\mathbb{R}^{N-1}\to\mathbb{R}$ is a Lipschitz continuous function and $\{y_k\}_{k=1}^N$ is a system of Cartesian coordinates of $\mathbb{R}^N$.

REMARK 2.4.16 From this definition it follows that $\partial Z$ locally has a representation of the form $y_N=\eta(y_1,\ldots,y_{N-1})$, i.e., near $z\in\partial Z$, the boundary $\partial Z$ is the graph of a Lipschitz continuous function. By Rademacher's theorem (see Theorem 1.5.8), the outer unit normal $n(z)$ to the domain $Z$ exists for $\mu^{(N-1)}$-almost all $z\in\partial Z$. If $Z$ is a bounded polyhedron, then $Z$ is Lipschitz. Also if $Z$ is a $C^\infty$-submanifold with $C^\infty$ boundary $\partial Z$, then $Z$ is Lipschitz. Every Lipschitz open set $Z\subseteq\mathbb{R}^N$ is locally star-shaped.

For Lipschitz $Z\subseteq\mathbb{R}^N$ we can improve the conclusion of Theorem 2.4.13.
THEOREM 2.4.17 If $Z\subseteq\mathbb{R}^N$ is a bounded open set, which is Lipschitz and $p\in[1,+\infty)$, then the embedding $C^\infty\big(\overline{Z}\big)\subseteq W^{1,p}(Z)$ is dense.

REMARK 2.4.18 Theorem 2.4.17 implies that for any bounded, open, Lipschitz set $Z\subseteq\mathbb{R}^N$ and any given $u\in W^{1,p}(Z)$ ($p\in[1,+\infty)$), there exists a sequence $\{u_n\}_{n\ge 1}\subseteq\mathcal{D}(\mathbb{R}^N)$, such that $u_n|_Z\to u$ in $W^{1,p}(Z)$. In general for any open set $Z\subseteq\mathbb{R}^N$ and $u\in W^{1,p}(Z)$ ($p\in[1,+\infty)$), we can say that there exists a sequence $\{u_n\}_{n\ge 1}\subseteq\mathcal{D}(\mathbb{R}^N)$, such that
$$u_n\xrightarrow{w}u \ \text{ in } L^p(Z), \qquad D_iu_n|_{Z'}\to D_iu|_{Z'} \ \text{ in } L^p(Z') \qquad \forall\, i\in\{1,\ldots,N\},\ Z'\subset\subset Z$$
(i.e., $Z'$ is a bounded open set with $\overline{Z'}\subseteq Z$). This result is known in the literature as Friedrichs' theorem. Theorem 2.4.17 holds for all Sobolev spaces $W^{m,p}(Z)$, $m\ge 1$.

None of these approximation results (Theorems 2.4.13 and 2.4.17) is true for $p=+\infty$. Indeed consider the following examples.

EXAMPLE 2.4.19 (a) Let $Z=\mathbb{R}^N$. We know that
$$\overline{C_c\big(\mathbb{R}^N\big)}^{\,\|\cdot\|_\infty} = C_0\big(\mathbb{R}^N\big).$$
Thus while $u\equiv 1\in W^{1,\infty}\big(\mathbb{R}^N\big)$, it cannot be approximated by functions in $C_c^\infty\big(\mathbb{R}^N\big)$.
(b) Let $Z=(-1,1)$ and consider the function
$$u(z) \stackrel{df}{=} \begin{cases} 0 & \text{if } z\le 0,\\ z & \text{if } z>0.\end{cases}$$
Then $u$ is absolutely continuous. Its derivative in the sense of distributions is given by
$$Du(z) \stackrel{df}{=} \begin{cases} 0 & \text{if } z<0,\\ 1 & \text{if } z>0.\end{cases}$$
Let $\vartheta\in C^\infty(Z)$ be such that $\|\vartheta'-u'\|_\infty<\varepsilon$. So if $z<0$, then $|\vartheta'(z)|<\varepsilon$ and if $z>0$, then $|\vartheta'(z)-1|<\varepsilon$, hence $1-\varepsilon<\vartheta'(z)$. By continuity, we obtain $\vartheta'(0)\le\varepsilon$ and $\vartheta'(0)\ge 1-\varepsilon$. If $\varepsilon<\frac{1}{2}$, we reach a contradiction. This shows that $u$ cannot be approximated in $W^{1,\infty}(Z)$ by smooth functions.
The next proposition provides a simple characterization of the elements in $W^{1,p}(Z)$ ($p\in(1,+\infty]$).

PROPOSITION 2.4.20 If $Z\subseteq\mathbb{R}^N$ is open and $u\in L^p(Z)$ with $p\in(1,+\infty]$, then the following statements are equivalent:
(a) $u\in W^{1,p}(Z)$;
(b) there exists a constant $c>0$, such that
$$\Big|\int_Z u\,\frac{\partial\vartheta}{\partial z_k}\,dz\Big|\le c\,\|\vartheta\|_{p'} \qquad \forall\, \vartheta\in C_c^\infty(Z),\ k\in\{1,\ldots,N\},$$
with $\frac{1}{p}+\frac{1}{p'}=1$;
(c) there exists a constant $c>0$, such that for all $Z'\subset\subset Z$ (i.e., $Z'$ is a bounded open set such that $\overline{Z'}\subseteq Z$), we have
$$\big\|\tau_y(u)-u\big\|_{L^p(Z')}\le c\,\|y\|_{\mathbb{R}^N} \qquad \forall\, y\in\mathbb{R}^N \text{ with } \|y\|_{\mathbb{R}^N}<d\big(Z',Z^c\big).$$
Moreover, in both (b) and (c) we can take $c=\|Du\|_p$.

PROOF “(a)$\Longrightarrow$(b)”: Obvious.

“(b)$\Longrightarrow$(a)”: Let $L_k\colon C_c^\infty(Z)\to\mathbb{R}$ be defined by
$$L_k(\vartheta) \stackrel{df}{=} \int_Z u\,\frac{\partial\vartheta}{\partial z_k}\,dz \qquad \forall\, k\in\{1,\ldots,N\}.$$
Evidently $L_k$ is linear and $L^{p'}$-continuous. Since the embedding $C_c^\infty(Z)\subseteq L^{p'}(Z)$ is dense, we can extend $L_k$ continuously on all of $L^{p'}(Z)$. So $L_k\in L^{p'}(Z)^*$ and by the Riesz representation theorem (see Theorem A.3.24), we can find $h\in L^p(Z)$, such that
$$L_k(v) = \int_Z hv\,dz \qquad \forall\, v\in L^{p'}(Z),$$
so
$$\int_Z u\,\frac{\partial\vartheta}{\partial z_k}\,dz = \int_Z h\vartheta\,dz \qquad \forall\, \vartheta\in C_c^\infty(Z)$$
and
$$D_ku = -h\in L^p(Z) \qquad \forall\, k\in\{1,\ldots,N\},$$
hence $u\in W^{1,p}(Z)$.
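“(a)$\Longrightarrow$(c)”: First suppose that $u\in C_c^\infty\big(\mathbb{R}^N\big)$. Let $y\in\mathbb{R}^N$ and set
$$v(t) \stackrel{df}{=} u(z+ty) \qquad \forall\, t\in\mathbb{R}.$$
Then from the chain rule, we have $v'(t)=\big(Du(z+ty),y\big)_{\mathbb{R}^N}$. Integrating, we obtain
$$u(z+y)-u(z) = v(1)-v(0) = \int_0^1v'(t)\,dt = \int_0^1\big(Du(z+ty),y\big)_{\mathbb{R}^N}\,dt$$
and so
$$\big\|\tau_y(u)-u\big\|_{L^p(Z')}^p\le\|y\|_{\mathbb{R}^N}^p\int_{Z'}\int_0^1\|Du(z+ty)\|_{\mathbb{R}^N}^p\,dt\,dz = \|y\|_{\mathbb{R}^N}^p\int_0^1\int_{Z'}\|Du(z+ty)\|_{\mathbb{R}^N}^p\,dz\,dt = \|y\|_{\mathbb{R}^N}^p\int_0^1\int_{Z'+ty}\|Du(r)\|_{\mathbb{R}^N}^p\,dr\,dt.$$
If
$$\|y\|_{\mathbb{R}^N}<d_{\mathbb{R}^N}\big(Z',Z^c\big),$$
we can find a bounded open set $Z''$, such that
$$\overline{Z''}\subseteq Z \quad\text{and}\quad Z'+ty\subseteq Z'' \qquad \forall\, t\in[0,1].$$
Therefore
$$\big\|\tau_y(u)-u\big\|_{L^p(Z')}^p\le\|y\|_{\mathbb{R}^N}^p\int_{Z''}\|Du\|_{\mathbb{R}^N}^p\,dz. \qquad (2.39)$$
For the general case, suppose that $u\in W^{1,p}(Z)$, $p\in(1,+\infty)$. Then we can find a sequence $\{u_n\}_{n\ge 1}\subseteq C_c^\infty\big(\mathbb{R}^N\big)$, such that
$$u_n\to u \ \text{ in } L^p(Z), \qquad Du_n\to Du \ \text{ in } L^p(Z'),$$
for any $Z'\subset\subset Z$ (see Remark 2.4.18). From (2.39), we have
$$\big\|\tau_y(u_n)-u_n\big\|_{L^p(Z')}^p\le\|y\|_{\mathbb{R}^N}^p\int_{Z''}\|Du_n\|_{\mathbb{R}^N}^p\,dz,$$
so
$$\big\|\tau_y(u)-u\big\|_{L^p(Z')}^p\le\|y\|_{\mathbb{R}^N}^p\int_{Z''}\|Du\|_{\mathbb{R}^N}^p\,dz. \qquad (2.40)$$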
If $p=+\infty$, we obtain (2.40) for $p<+\infty$ and then let $p\to+\infty$.

“(c)$\Longrightarrow$(b)”: Let $\vartheta\in C_c^\infty(Z)$ and consider an open set $Z'$, such that
$$\operatorname{supp}\vartheta\subseteq Z'\subset\subset Z.$$
Let $y\in\mathbb{R}^N$ with
$$\|y\|_{\mathbb{R}^N}<d\big(Z',Z^c\big).$$
By hypothesis we have
$$\Big|\int_Z\big(\tau_y(u)-u\big)\vartheta\,dz\Big|\le c\,\|y\|_{\mathbb{R}^N}\|\vartheta\|_{p'}.$$
Note that
$$\int_Z\big(u(z+y)-u(z)\big)\vartheta(z)\,dz = \int_Z u(z)\big(\vartheta(z-y)-\vartheta(z)\big)\,dz,$$
so
$$\Big|\int_Z u(z)\,\frac{\vartheta(z-y)-\vartheta(z)}{\|y\|_{\mathbb{R}^N}}\,dz\Big|\le c\,\|\vartheta\|_{p'}.$$
Let $y=te_k$, $t\in\mathbb{R}$, $k\in\{1,\ldots,N\}$. Passing to the limit as $t\to 0$, we obtain
$$\Big|\int_Z u\,\frac{\partial\vartheta}{\partial z_k}\,dz\Big|\le c\,\|\vartheta\|_{p'} \qquad \forall\, \vartheta\in C_c^\infty(Z),\ k\in\{1,\ldots,N\}.$$
Finally it is clear from the above proofs that in (b) and (c), the constant $c>0$ can be taken as $c=\|Du\|_p$.

REMARK 2.4.21 If $p=1$, then we have (a) $\Longrightarrow$ (b) $\Longrightarrow$ (c). From the implication (a) $\Longrightarrow$ (c), we can see that if $Z\subseteq\mathbb{R}^N$ is an open set and $u\in W^{1,\infty}(Z)$, then
$$\big|u(z)-u(y)\big|\le\|Du\|_\infty\,\|z-y\|_{\mathbb{R}^N} \qquad \forall\, z,y\in Z.$$
So $W^{1,\infty}(Z)$ is the space of Lipschitz continuous functions on $Z$. In particular
$$W^{1,\infty}(Z)\subseteq C\big(\overline{Z}\big).$$
More generally, it is easy to show that if $u\colon Z\to\mathbb{R}$ is locally Lipschitz, then $u\in W^{1,p}_{loc}(Z)$ ($p\in[1,+\infty]$).
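The translation estimate in Proposition 2.4.20(c) with $c=\|Du\|_p$ can be checked numerically in a simple case. The sketch below is an illustration only and rests on assumed choices: $N=1$, $Z=(0,2\pi)$, $Z'=(1,5)$ (so that $d(Z',Z^c)=1$), $u(z)=\sin z$, $p=2$, the translation $\tau_y(u)(z)=u(z+y)$ as used in the proof, and a midpoint rule for the integrals. The helper names `lp_norm` and `translation_estimate` are hypothetical.

```python
import math

def lp_norm(f, a, b, p, steps=20000):
    """Midpoint rule approximation of the L^p-norm of f on the interval (a, b)."""
    h = (b - a) / steps
    total = sum(abs(f(a + (k + 0.5) * h)) ** p for k in range(steps))
    return (h * total) ** (1.0 / p)

def translation_estimate(y, p=2.0):
    """Return (||tau_y u - u||_{L^p(Z')}, ||Du||_{L^p(Z)} * |y|) for u = sin."""
    lhs = lp_norm(lambda z: math.sin(z + y) - math.sin(z), 1.0, 5.0, p)
    rhs = lp_norm(math.cos, 0.0, 2.0 * math.pi, p) * abs(y)
    return lhs, rhs

for y in (0.1, 0.5, 0.9, -0.5):
    lhs, rhs = translation_estimate(y)
    print(y, round(lhs, 4), round(rhs, 4), lhs <= rhs)
```

In each case the left-hand side grows at most linearly in $|y|$, which is the difference-quotient form of the statement that $Du\in L^p(Z)$.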
PROPOSITION 2.4.22 If $Z \subseteq \mathbb{R}^N$ is an open set and $u, v \in W^{1,p}(Z) \cap L^\infty(Z)$ with $p \in [1,+\infty]$, then $uv \in W^{1,p}(Z)$ and $D(uv) = uDv + vDu$ (product rule).

PROOF If $p = +\infty$, then $u, v$ are Lipschitz continuous functions and so differentiable for almost all $z \in Z$ (see Theorem 1.5.8). Clearly $uv$ is Lipschitz continuous too, hence in $W^{1,\infty}(Z)$ and the product rule results from the usual product rule of differentiable functions. So suppose that $p \in [1,+\infty)$. We assume that $|u(z)|, |v(z)| \le 1$ for a.a. $z \in Z$. Invoking Theorem 2.4.13, we can find sequences $\{\widehat{u}_n\}_{n\ge1}, \{\widehat{v}_n\}_{n\ge1} \subseteq C^\infty(Z) \cap W^{1,p}(Z)$, such that
\[ \widehat{u}_n \longrightarrow u \ \text{in } W^{1,p}(Z), \qquad \widehat{u}_n(z) \longrightarrow u(z) \ \text{for a.a. } z \in Z, \]
\[ \widehat{v}_n \longrightarrow v \ \text{in } W^{1,p}(Z), \qquad \widehat{v}_n(z) \longrightarrow v(z) \ \text{for a.a. } z \in Z. \]
Let
\[ u_n \stackrel{df}{=} \max\big\{-1, \min\{\widehat{u}_n, 1\}\big\}, \qquad v_n \stackrel{df}{=} \max\big\{-1, \min\{\widehat{v}_n, 1\}\big\}. \]
Then $u_n v_n$ is locally Lipschitz. Moreover, we have $D(u_n v_n) = u_n Dv_n + v_n Du_n \in L^p(Z)$, so
\[ u_n v_n \in W^{1,p}(Z) \qquad \forall\ n \ge 1. \]
Note that $u_n \longrightarrow u$ in $W^{1,p}(Z)$, $u_n(z) \longrightarrow u(z)$ for a.a. $z \in Z$, $v_n \longrightarrow v$ in $W^{1,p}(Z)$, $v_n(z) \longrightarrow v(z)$ for a.a. $z \in Z$. We have
\[ \|u_n v_n - uv\|_p^p \le \int_Z |v_n(z)|^p\,|u_n(z) - u(z)|^p\,dz + \int_Z |u(z)|^p\,|v_n(z) - v(z)|^p\,dz \longrightarrow 0. \]
In addition
\begin{align*}
\big\|u_n Dv_n + v_n Du_n - (uDv + vDu)\big\|_p^p
&\le \|u_n Dv_n - uDv\|_p^p + \|v_n Du_n - vDu\|_p^p \\
&\le \int_Z |u_n(z)|^p\,\|Dv_n - Dv\|_{\mathbb{R}^N}^p\,dz + \int_Z \|Dv(z)\|_{\mathbb{R}^N}^p\,|u_n(z) - u(z)|^p\,dz \\
&\quad + \int_Z |v_n(z)|^p\,\|Du_n - Du\|_{\mathbb{R}^N}^p\,dz + \int_Z \|Du(z)\|_{\mathbb{R}^N}^p\,|v_n(z) - v(z)|^p\,dz \longrightarrow 0.
\end{align*}
Therefore we conclude that $uv \in W^{1,p}(Z)$ and $D(uv) = uDv + vDu$.
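The limits at the end of this proof rest on a dominated convergence argument that the text leaves implicit. A sketch of that step, using only the normalization $|u|, |v| \le 1$ made in the proof (so that $|u_n|, |v_n| \le 1$ as well), might read as follows.

```latex
% Sketch of the omitted limit argument (assumes |u_n|, |u| <= 1 and a.e. convergence).
\begin{align*}
\int_Z |u_n|^p\,\|Dv_n - Dv\|_{\mathbb{R}^N}^p\,dz
  &\le \|Dv_n - Dv\|_p^p \longrightarrow 0, \\
\|Dv\|_{\mathbb{R}^N}^p\,|u_n - u|^p
  &\le 2^p\,\|Dv\|_{\mathbb{R}^N}^p \in L^1(Z),
  \qquad |u_n(z) - u(z)| \longrightarrow 0 \ \text{ for a.a. } z \in Z .
\end{align*}
```

By the dominated convergence theorem the integral of the second expression tends to zero; the two terms involving $v_n$ and $Du_n$ can be handled symmetrically.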
In fact a careful reading of this proof reveals that the following result is also true.

PROPOSITION 2.4.23 If $Z \subseteq \mathbb{R}^N$ is an open set, $u \in W^{1,p}(Z)$, $p \in [1,+\infty]$ and $v \in W^{1,\infty}(Z)$, then $uv \in W^{1,p}(Z)$ and $D(uv) = uDv + vDu$.

Next we prove a chain rule for Sobolev functions.

PROPOSITION 2.4.24 (Chain Rule for Sobolev Functions) If $Z \subseteq \mathbb{R}^N$ is an open set, $\xi \in C^1(\mathbb{R})$, $\xi' \in L^\infty(\mathbb{R})$, $\xi(0) = 0$ and $u \in W^{1,p}(Z)$ with $p \in [1,+\infty]$, then $\xi \circ u \in W^{1,p}(Z)$ and $D(\xi \circ u) = \xi'(u)Du$. Moreover, if $Z$ is bounded, then we can drop the condition that $\xi(0) = 0$.

PROOF Let $p \in [1,+\infty)$ and $\vartheta \in C_c^\infty(Z)$ be such that $\operatorname{supp}\vartheta \subseteq V \subset\subset Z$.
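Using Proposition 2.4.12(e) (twice) and integration by parts, we have
\begin{align*}
\int_Z \xi(u)\,\frac{\partial\vartheta}{\partial z_k}\,dz
&= \int_V \xi(u)\,\frac{\partial\vartheta}{\partial z_k}\,dz
 = \lim_{\varepsilon\searrow0} \int_V \xi(u_\varepsilon)\,\frac{\partial\vartheta}{\partial z_k}\,dz
 = -\lim_{\varepsilon\searrow0} \int_V \xi'(u_\varepsilon)\,\frac{\partial u_\varepsilon}{\partial z_k}\,\vartheta\,dz \\
&= -\int_V \xi'(u)(D_k u)\vartheta\,dz = -\int_Z \xi'(u)(D_k u)\vartheta\,dz,
\end{align*}
so
\[ D_k(\xi \circ u) = \xi'(u)D_k u \qquad \forall\ k \in \{1,\dots,N\}. \]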
It is clear from the above argument (see second equality) that if $Z$ is bounded, then the condition $\xi(0) = 0$ can be dropped. If $p = +\infty$, then $\xi \circ u$ is a Lipschitz continuous function (see Remark 2.4.21) and the result follows from the classical chain rule.

In fact there is a stronger version of the previous proposition. It is due to Marcus & Mizel (1972), where the interested reader can find the proof.

PROPOSITION 2.4.25 If $Z \subseteq \mathbb{R}^N$ is an open set, $\xi : \mathbb{R} \longrightarrow \mathbb{R}$ is Lipschitz continuous, $\xi(0) = 0$ and $u \in W^{1,p}(Z)$, $p \in [1,+\infty]$, then $\xi \circ u \in W^{1,p}(Z)$ and $D(\xi \circ u) = \xi^*(u)Du$ almost everywhere on $Z$ with $\xi^* : \mathbb{R} \longrightarrow \mathbb{R}$ being any bounded Borel function such that
\[ \xi^*(t) = \xi'(t) \qquad \text{for a.a. } t \in \mathbb{R}. \]

REMARK 2.4.26 The function $\xi^*$ can always be taken to be bounded, by virtue of the following result due to Stampacchia (1966): "If $u \in W^{1,p}(Z)$ and $A \subseteq \mathbb{R}$ is a Lebesgue-null set, then $Du(z) = 0$ for almost all $z \in u^{-1}(A)$." Moreover, note that the chain rule (see Proposition 2.4.25) is also valid for $W_0^{1,p}(Z)$ (see Corollary 2.4.8).

Recall that
\[ u^+ = \max\{u, 0\} \qquad \text{and} \qquad u^- = \max\{-u, 0\}. \]
We have
\[ u = u^+ - u^- \qquad \text{and} \qquad |u| = u^+ + u^-. \]
Using the general version of the chain rule (see Proposition 2.4.25), we obtain at once the following result.
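PROPOSITION 2.4.27 If $Z \subseteq \mathbb{R}^N$ is an open set and $u \in W^{1,p}(Z)$, $p \in [1,+\infty]$, then $u^+, u^-, |u| \in W^{1,p}(Z)$ and we have
\[ Du^+ = \begin{cases} Du & \text{for a.a. } z \in \{u > 0\}, \\ 0 & \text{for a.a. } z \in \{u \le 0\}, \end{cases}
\qquad
Du^- = \begin{cases} 0 & \text{for a.a. } z \in \{u \ge 0\}, \\ -Du & \text{for a.a. } z \in \{u < 0\}, \end{cases} \]
\[ D|u| = \begin{cases} Du & \text{for a.a. } z \in \{u > 0\}, \\ 0 & \text{for a.a. } z \in \{u = 0\}, \\ -Du & \text{for a.a. } z \in \{u < 0\}. \end{cases} \]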
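One way to see how these formulas come out of Proposition 2.4.25 is to apply it to the Lipschitz function $\xi(t) = \max\{t, 0\}$. The following is only a sketch; the particular Borel representative $\xi^* = \chi_{(0,+\infty)}$ is a choice made here for illustration.

```latex
% Sketch: Proposition 2.4.27 from the general chain rule, with xi(t) = max{t, 0}.
\begin{align*}
\xi(t) &= \max\{t,0\}, \qquad \xi(0) = 0, \qquad \operatorname{Lip}(\xi) = 1,
  \qquad \xi'(t) = \chi_{(0,+\infty)}(t) \ \text{ for all } t \neq 0, \\
Du^+ &= D(\xi\circ u) = \chi_{(0,+\infty)}(u)\,Du = \chi_{\{u>0\}}\,Du, \\
Du^- &= D\big(\xi\circ(-u)\big) = \chi_{(0,+\infty)}(-u)\,D(-u) = -\chi_{\{u<0\}}\,Du, \\
D|u| &= Du^+ + Du^- = \big(\chi_{\{u>0\}} - \chi_{\{u<0\}}\big)\,Du .
\end{align*}
```

The value assigned to $\xi^*$ at the single point $t = 0$ does not affect the result, in line with the result of Stampacchia quoted in Remark 2.4.26.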
Using this proposition we can show that the Sobolev spaces $W^{1,p}(Z)$, $p \in [1,+\infty]$ have a lattice structure.

COROLLARY 2.4.28 If $Z \subseteq \mathbb{R}^N$ is an open set and $u, v \in W^{1,p}(Z)$, $p \in [1,+\infty]$, then
\[ h_0 \stackrel{df}{=} \min\{u, v\} \in W^{1,p}(Z), \qquad h_1 \stackrel{df}{=} \max\{u, v\} \in W^{1,p}(Z) \]
and we have
\[ Dh_0 = \begin{cases} Du & \text{for a.a. } z \in \{u \le v\}, \\ Dv & \text{for a.a. } z \in \{u > v\}, \end{cases}
\qquad
Dh_1 = \begin{cases} Du & \text{for a.a. } z \in \{u > v\}, \\ Dv & \text{for a.a. } z \in \{u \le v\}. \end{cases} \]

PROOF Note that
\[ h_1 = (u - v)^+ + v \qquad \text{and} \qquad h_0 = u - (u - v)^+. \]
Then the result follows at once from Proposition 2.4.27.

An immediate consequence of this Corollary is the following particular case of the result of Stampacchia mentioned in Remark 2.4.26.

COROLLARY 2.4.29 If $Z \subseteq \mathbb{R}^N$ is an open set and $u \in W^{1,p}(Z)$, $p \in [1,+\infty]$, then for every $\eta \in \mathbb{R}$, we have
\[ Du(z) = 0 \qquad \text{for a.a. } z \in \{u = \eta\}. \]
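PROPOSITION 2.4.30 If $Z \subseteq \mathbb{R}^N$ is an open set, $\{u_n\}_{n\ge1} \subseteq W^{1,p}(Z)$, $p \in (1,+\infty)$,
\[ h \stackrel{df}{=} \sup_{n\ge1} \|Du_n\|_{\mathbb{R}^N} \in L^p(Z) \qquad \text{and} \qquad g \stackrel{df}{=} \sup_{n\ge1} u_n, \]
then $g \in W^{1,p}(Z)$ and $\|Dg(z)\|_{\mathbb{R}^N} \le h(z)$ for almost all $z \in Z$.

PROOF Let
\[ g_k \stackrel{df}{=} \max_{1\le n\le k} u_n \qquad \forall\ k \ge 1. \]
From Corollary 2.4.28, we have that $g_k \in W^{1,p}(Z)$ and
\[ \|Dg_k(z)\|_{\mathbb{R}^N} \le \max_{1\le n\le k} \|Du_n(z)\|_{\mathbb{R}^N} \le h(z) \qquad \text{for a.a. } z \in Z. \tag{2.41} \]
Evidently the sequence $\{g_k\}_{k\ge1}$ is increasing and
\[ g_k(z) \longrightarrow g(z) \qquad \text{for a.a. } z \in Z \text{ as } k \to +\infty. \]
Also from (2.41), we see that the sequence $\{Dg_k\}_{k\ge1} \subseteq L^p(Z;\mathbb{R}^N)$ is bounded. Then from the monotone convergence theorem (see Theorem A.2.10) and the Eberlein-Smulian theorem (see Theorem A.3.8), we have
\[ g_k \longrightarrow g \ \text{ in } L^p(Z), \qquad Dg_{k_m} \stackrel{w}{\longrightarrow} w \ \text{ in } L^p(Z;\mathbb{R}^N), \]
with $\{g_{k_m}\}_{m\ge1}$ being a subsequence of $\{g_k\}_{k\ge1}$. For every $\vartheta \in C_c^\infty(Z)$ and every $i \in \{1,\dots,N\}$, from the definition of the distributional derivative, we have
\[ \int_Z (D_i g_k)\vartheta\,dz = -\int_Z g_k D_i\vartheta\,dz, \]
so
\[ \int_Z w_i\vartheta\,dz = -\int_Z g\,D_i\vartheta\,dz, \]
where $w = (w_i)_{i=1}^N \in L^p(Z;\mathbb{R}^N)$ and finally
\[ w_i = D_i g \qquad \forall\ i \in \{1,\dots,N\}. \]
Therefore, we infer that for the whole sequence, we have
\[ Dg_k \stackrel{w}{\longrightarrow} Dg \qquad \text{in } L^p(Z;\mathbb{R}^N). \]
So $g \in W^{1,p}(Z)$ and
\[ \|Dg(z)\|_{\mathbb{R}^N} \le h(z) \qquad \text{for a.a. } z \in Z. \]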
From the above proofs, it is clear that we have: PROPOSITION 2.4.31 If Z ⊆ RN is an open set and {un }n>1 ⊆ W 1,p (Z), p ∈ [1, +∞) is a sequence, such that w
in Lp (Z) ¡ ¢ Dun −→ w in Lp Z; RN , un −→ u w
then u ∈ W 1,p (Z) and Du = w. PROPOSITION 2.4.32 If Z ⊆ RN is an open set, {un }n>1 , {vn }n>1 ⊆ W 1,p (Z), p ∈ [1, +∞) and w
in W 1,p (Z),
w
in W 1,p (Z),
un −→ u vn −→ v then
© ª © ª min un , vn −→ min u, v in W 1,p (Z), © ª © ª max un , vn −→ max u, v in W 1,p (Z). PROOF
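It suffices to show that
\[ \text{if } u_n \longrightarrow u \text{ in } W^{1,p}(Z), \text{ then } u_n^+ \longrightarrow u^+ \text{ in } W^{1,p}(Z). \]
First note that
\[ |u_n^+ - u^+| \le |u_n - u| \]
and so it follows that
\[ u_n^+ \longrightarrow u^+ \qquad \text{in } L^p(Z). \]
Next let $h = \chi_{(0,+\infty)}$. Using Proposition 2.4.27, we have
\begin{align*}
\|Du_n^+ - Du^+\|_p^p &= \int_Z \|h(u_n)Du_n - h(u)Du\|_{\mathbb{R}^N}^p\,dz \\
&\le \int_Z \|Du_n(z) - Du(z)\|_{\mathbb{R}^N}^p\,dz + \int_Z \|Du(z)\|_{\mathbb{R}^N}^p\,|h(u_n) - h(u)|^p\,dz \longrightarrow 0.
\end{align*}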
Using this, we can conclude the validity of Proposition 2.4.27 and Corollary 2.4.28 for the spaces $W_0^{1,p}(Z)$, $p \in [1,+\infty)$. So $W_0^{1,p}(Z)$, $p \in [1,+\infty)$ has a lattice structure.

PROPOSITION 2.4.33 If $Z \subseteq \mathbb{R}^N$ is an open set and $u, v \in W_0^{1,p}(Z)$, $p \in [1,+\infty)$, then
\[ \max\{u, v\},\ \min\{u, v\} \in W_0^{1,p}(Z). \]
In particular
\[ u^+,\ u^-,\ |u| \in W_0^{1,p}(Z). \]

PROOF Again it suffices to show that $u^+ \in W_0^{1,p}(Z)$. Let $\{\vartheta_n\}_{n\ge1} \subseteq C_c^\infty(Z)$ be such that $\vartheta_n \longrightarrow u$ in $W^{1,p}(Z)$. From the proof of Proposition 2.4.12(c) and (d), it follows that for every $n \ge 1$ we can find a sequence $\{\psi_{nm}\}_{m\ge1} \subseteq C_c^\infty(Z)$, $\psi_{nm} \ge 0$, such that
\[ \psi_{nm} \longrightarrow \vartheta_n^+ \quad \text{in } W_0^{1,p}(Z) \text{ as } m \to +\infty \qquad \forall\ n \ge 1. \]
Since
\[ \vartheta_n^+ \longrightarrow u^+ \qquad \text{in } W^{1,p}(Z) \]
(see Proposition 2.4.32), via the double limit lemma (see Proposition A.2.35), we can find a sequence $\{m(n)\}_{n\ge1}$ increasing (not necessarily strictly) to $+\infty$, such that $\psi_{n\,m(n)} \longrightarrow u^+$ in $W^{1,p}(Z)$. Since $\psi_{n\,m(n)} \in C_c^\infty(Z)$, we deduce that $u^+ \in W_0^{1,p}(Z)$.

In fact the previous result can be also obtained by having a chain rule for $W_0^{1,p}(Z)$, $p \in (1,+\infty)$ (see also Proposition 2.4.25). First an auxiliary result which is actually of independent interest.

PROPOSITION 2.4.34 If $Z \subseteq \mathbb{R}^N$ is an open set, $u \in W^{1,p}(Z)$, $p \in [1,+\infty)$ and $u$ vanishes outside a compact $K \subseteq Z$, then $u \in W_0^{1,p}(Z)$.

PROOF
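Let $Z'$ be a bounded open set in $\mathbb{R}^N$, such that $K \subseteq Z' \subset\subset Z$. Let $\varphi \in C_c^\infty(\mathbb{R}^N)$, such that
\[ \varphi|_K \equiv 1 \]
(i.e., $\varphi$ is what we usually call a cut off function). We have $\varphi u = u$. We can find $\{\vartheta_n\}_{n\ge1} \subseteq C_c^\infty(Z)$, such that
\[ \vartheta_n \longrightarrow u \ \text{ in } L^p(Z) \qquad \text{and} \qquad D\vartheta_n \longrightarrow Du \ \text{ in } L^p(Z';\mathbb{R}^N) \]
(see Remark 2.4.18). We have
\[ \varphi\vartheta_n \longrightarrow \varphi u \qquad \text{in } W^{1,p}(Z) \]
and $\varphi\vartheta_n \in C_c^\infty(Z)$. Therefore $\varphi u = u \in W_0^{1,p}(Z)$.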
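Using this result we can prove the chain rule for the Sobolev spaces $W_0^{1,p}(Z)$, $p \in (1,+\infty)$.

PROPOSITION 2.4.35 If $Z \subseteq \mathbb{R}^N$ is a bounded open set, $\xi : \mathbb{R} \longrightarrow \mathbb{R}$ is Lipschitz continuous, $\xi(0) = 0$ and $u \in W_0^{1,p}(Z)$, $p \in (1,+\infty)$, then $\xi \circ u \in W_0^{1,p}(Z)$ and $D(\xi \circ u) = \xi'(u)Du$.

PROOF Let $\{\vartheta_n\}_{n\ge1} \subseteq C_c^\infty(Z)$ be such that $\vartheta_n \longrightarrow u$ in $W^{1,p}(Z)$. Set
\[ h_n \stackrel{df}{=} \xi \circ \vartheta_n \qquad \forall\ n \ge 1. \]
Evidently $h_n$ is a Lipschitz continuous function and since $\vartheta_n$ has compact support, so does $h_n$. Also
\[ \Big|\frac{\partial h_n}{\partial z_i}\Big| \le \operatorname{Lip}(h_n) \qquad \forall\ i \in \{1,\dots,N\},\ n \ge 1. \]
Because $Z \subseteq \mathbb{R}^N$ is bounded, we infer that $\frac{\partial h_n}{\partial z_i} \in L^p(Z)$,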
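hence $h_n \in W^{1,p}(Z)$ and has compact support. So Proposition 2.4.34 implies that $h_n \in W_0^{1,p}(Z)$. Also we have
\[ \big|h_n(z) - \xi(u(z))\big| = \big|\xi(\vartheta_n(z)) - \xi(u(z))\big| \le \operatorname{Lip}(\xi)\,|\vartheta_n(z) - u(z)| \qquad \text{for a.a. } z \in Z, \]
so
\[ h_n \longrightarrow \xi \circ u \qquad \text{in } L^p(Z). \]
Moreover, if $\{e_k\}_{k=1}^N$ is the standard orthonormal basis of $\mathbb{R}^N$, we have
\[ \frac{|h_n(z + te_i) - h_n(z)|}{|t|} \le \operatorname{Lip}(\xi)\,\frac{|\vartheta_n(z + te_i) - \vartheta_n(z)|}{|t|}, \]
so
\[ \limsup_{n\to+\infty} \Big\|\frac{\partial h_n}{\partial z_i}\Big\|_p \le \operatorname{Lip}(\xi)\,\limsup_{n\to+\infty} \Big\|\frac{\partial\vartheta_n}{\partial z_i}\Big\|_p. \tag{2.42} \]
But
\[ \frac{\partial\vartheta_n}{\partial z_i} \longrightarrow \frac{\partial u}{\partial z_i} \quad \text{in } L^p(Z) \qquad \forall\ i \in \{1,\dots,N\}. \]
So from (2.42), we infer that the sequence $\big\{\frac{\partial h_n}{\partial z_i}\big\}_{n\ge1} \subseteq L^p(Z)$ is bounded. Since $p \in (1,+\infty)$, by passing to a subsequence if necessary, we may assume that
\[ \frac{\partial h_n}{\partial z_i} \stackrel{w}{\longrightarrow} w_i \quad \text{in } L^p(Z) \qquad \forall\ i \in \{1,\dots,N\}. \]
From Proposition 2.4.31, we have that
\[ w_i = \frac{\partial \xi(u)}{\partial z_i} \]
and so
\[ h_n \stackrel{w}{\longrightarrow} \xi(u) \qquad \text{in } W^{1,p}(Z). \]
Because $h_n \in W_0^{1,p}(Z)$ and $W_0^{1,p}(Z)$ is a closed subspace of $W^{1,p}(Z)$, hence weakly closed, we conclude that $\xi \circ u = \xi(u) \in W_0^{1,p}(Z)$. Finally note that
\[ Dh_n = \xi'(\vartheta_n)D\vartheta_n \]
and so in the limit we have $D(\xi \circ u) = \xi'(u)Du$.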
REMARK 2.4.36 If Z ⊆ RN is bounded, open, Lipschitz, then in the above proof we can choose {ϑn }n>1 ⊆ Cc∞ (Z), such that ϑn −→ u in W 1,p (Z). Then the same proof is valid and so we have a proof of Proposition 2.4.25 with the extra hypothesis that Z ⊆ RN is bounded and Lipschitz.
We can also have the product rule for the spaces $W_0^{1,p}(Z)$, $p \in [1,+\infty]$. The proof is the same as that of Proposition 2.4.22, using this time a sequence $\{\widehat{u}_n\}_{n\ge1} \subseteq C_c^\infty(Z)$.

PROPOSITION 2.4.37 If $Z \subseteq \mathbb{R}^N$ is an open set,
\[ u \in W_0^{1,p}(Z) \cap L^\infty(Z) \qquad \text{and} \qquad v \in W^{1,p}(Z) \cap L^\infty(Z), \]
$p \in [1,+\infty]$, then $uv \in W_0^{1,p}(Z)$ and $D(uv) = uDv + vDu$.

Continuing with the Sobolev spaces $W_0^{1,p}(Z)$, $p \in [1,+\infty)$, we have the following results.

PROPOSITION 2.4.38 If $Z \subseteq \mathbb{R}^N$ is an open set,
\[ u \in W_0^{1,p}(Z) \qquad \text{and} \qquad v \in W^{1,p}(Z), \]
$p \in [1,+\infty]$ and
\[ 0 \le v(z) \le u(z) \qquad \text{for a.a. } z \in Z, \]
then $v \in W_0^{1,p}(Z)$.

PROOF From the proof of Proposition 2.4.33, we know that there exists a sequence $\{\vartheta_n\}_{n\ge1} \subseteq C_c^\infty(Z)$, such that
\[ \vartheta_n \ge 0 \quad \forall\ n \ge 1 \qquad \text{and} \qquad \vartheta_n \longrightarrow u \ \text{ in } W^{1,p}(Z). \]
Let
\[ h_n \stackrel{df}{=} \min\{v, \vartheta_n\} \qquad \forall\ n \ge 1. \]
Evidently $h_n$ has compact support and so by Proposition 2.4.34, $h_n \in W_0^{1,p}(Z)$. Moreover, from Proposition 2.4.32, we have that
\[ h_n \longrightarrow \min\{v, u\} = v \qquad \text{in } W^{1,p}(Z). \]
So $v \in W_0^{1,p}(Z)$.
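PROPOSITION 2.4.39 If $Z \subseteq \mathbb{R}^N$ is an open set, $u \in W_0^{1,p}(Z)$, $v \in W^{1,p}(Z)$, $p \in [1,+\infty)$ and
\[ |v(z)| \le |u(z)| \qquad \text{for a.a. } z \in Z \setminus K, \]
where $K$ is a compact subset of $Z$, then $v \in W_0^{1,p}(Z)$.

PROOF Let $\varphi \in C_c^\infty(Z)$ be such that $0 \le \varphi \le 1$ and
\[ \varphi|_K = 1 \]
(a cut off function). We set
\[ \widehat{u} \stackrel{df}{=} (1 - \varphi)|u| + \varphi v^+. \]
From Propositions 2.4.33 and 2.4.34, we have that $\widehat{u} \in W_0^{1,p}(Z)$ and
\[ 0 \le v^+ \le \widehat{u}. \]
So
\[ v^+ \in W_0^{1,p}(Z) \]
(see Proposition 2.4.38). Similarly we show that $v^- \in W_0^{1,p}(Z)$. Hence $v \in W_0^{1,p}(Z)$.

We can improve Proposition 2.4.34 and motivate the discussion of trace which follows.

PROPOSITION 2.4.40 If $Z \subseteq \mathbb{R}^N$ is a bounded open set, $u \in W^{1,p}(Z)$, $p \in [1,+\infty)$ and
\[ \lim_{z\to y} u(z) = 0 \qquad \forall\ y \in \partial Z, \]
then $u \in W_0^{1,p}(Z)$.

PROOF Since
\[ u = u^+ - u^-, \]
we may assume that $u \ge 0$. For $\varepsilon > 0$ let $u_\varepsilon = (u - \varepsilon)^+$. Then $u_\varepsilon \in W^{1,p}(Z)$ (see Proposition 2.4.27) and $u_\varepsilon$ has compact support. Therefore by Proposition 2.4.34, we have that $u_\varepsilon \in W_0^{1,p}(Z)$. Now note that
\[ u_\varepsilon \longrightarrow u \quad \text{in } W^{1,p}(Z) \qquad \text{as } \varepsilon \searrow 0. \]
Thus $u \in W_0^{1,p}(Z)$.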
So roughly speaking a function u ∈ W 1,p (Z) belongs to W01,p (Z), if u is vanishing on ∂Z. But it is not meaningful to talk of values of u on a set of measure zero. Hence we must be more careful on how we assign boundary values to Sobolev functions. Trace theory does exactly this, namely defines and studies the concept of boundary values for the Sobolev spaces W 1,p (Z), p ∈ [1, +∞). The trace of a Sobolev function is an extension of the restriction of a continuous function on ∂Z. We start with a simple lemma.
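LEMMA 2.4.41 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, then for all $p \in [1,+\infty)$ there exists $c > 0$, such that
\[ \int_{\partial Z} |u|^p\,d\mu^{(N-1)} \le c\,\|u\|_{W^{1,p}(Z)}^p \qquad \forall\ u \in C^1(Z). \]

PROOF Since by hypothesis the boundary $\partial Z$ is Lipschitz, for any $z = (z_k)_{k=1}^N \in \partial Z$ we can find $r > 0$ and a Lipschitz continuous function $\eta : \mathbb{R}^{N-1} \longrightarrow \mathbb{R}$, such that (upon rotating and relabelling the coordinate axes if necessary), we have
\[ Z \cap C_r(z) = \big\{y = (y_k)_{k=1}^N \in \mathbb{R}^N : \eta(y_1,\dots,y_{N-1}) < y_N\big\} \cap C_r(z), \]
where
\[ C_r(z) \stackrel{df}{=} \big\{y = (y_k)_{k=1}^N \in \mathbb{R}^N : |y_k - z_k| < r,\ k = 1,\dots,N\big\}. \]
First assume that $u|_{Z\setminus C_r(z)} = 0$. If $\{e_k\}_{k=1}^N$ is the standard orthonormal basis of $\mathbb{R}^N$ and $n(\cdot)$ is the outward unit normal vector on $\partial Z$, then we have
\[ -\big(e_N, n(y)\big)_{\mathbb{R}^N} \ge \big(1 + \operatorname{Lip}(\eta)^2\big)^{-\frac12} > 0 \qquad \text{for } \mu^{(N-1)}\text{-a.a. } y \in \partial Z \cap C_r(z). \tag{2.43} \]
Let $\varepsilon > 0$ be given and set
\[ \xi_\varepsilon(t) \stackrel{df}{=} (t^2 + \varepsilon^2)^{\frac12} - \varepsilon \qquad \forall\ t \in \mathbb{R}. \]
Using the Gauss-Green theorem (see Theorem A.4.1) and since $|\xi_\varepsilon'(t)| \le 1$ for all $t \in \mathbb{R}$, we have
\begin{align*}
\int_{\partial Z} \xi_\varepsilon\big(u(y)\big)\,d\mu^{(N-1)}
&= \int_{\partial Z\cap C_r(z)} \xi_\varepsilon\big(u(y)\big)\,d\mu^{(N-1)}
 \le c \int_{\partial Z\cap C_r(z)} \xi_\varepsilon\big(u(y)\big)\,\big(-e_N, n(y)\big)_{\mathbb{R}^N}\,d\mu^{(N-1)} \\
&= -c \int_Z \frac{\partial}{\partial y_N}\Big(\xi_\varepsilon\big(u(y)\big)\Big)\,dy
 \le c \int_Z \big|\xi_\varepsilon'\big(u(y)\big)\big|\,\|Du(y)\|_{\mathbb{R}^N}\,dy
 \le c \int_Z \|Du(y)\|_{\mathbb{R}^N}\,dy,
\end{align*}
with $c > 0$ independent of $u$ (see (2.43)). Note that $\xi_\varepsilon(u) \longrightarrow |u|$ as $\varepsilon \searrow 0$, so in the limit we obtain
\[ \int_{\partial Z} |u|\,d\mu^{(N-1)} \le c \int_Z \|Du(y)\|_{\mathbb{R}^N}\,dy. \tag{2.44} \]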
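Now we remove the extra hypothesis that $u|_{Z\setminus C_r(z)} = 0$. In the general case, we can cover $\partial Z$ by a finite number of such cubes $C_{r_i}(z_i) = C_i$ for $i = 1,\dots,m$. Then we can find smooth functions $\{\xi_i\}_{i=0}^m$, such that
\[ 0 \le \xi_i \le 1, \quad \operatorname{supp}\xi_i \subseteq C_i \quad \forall\ i \in \{1,\dots,m\}, \qquad 0 \le \xi_0 \le 1, \quad \operatorname{supp}\xi_0 \subseteq Z \]
and
\[ \sum_{i=0}^m \xi_i(z) = 1 \qquad \forall\ z \in Z. \]
We set
\[ u_i \stackrel{df}{=} \xi_i u \qquad \forall\ i \in \{0,1,\dots,m\}. \]
Evidently each $u_i|_{Z\setminus C_i} = 0$ and so using (2.44), we obtain
\[ \int_{\partial Z} |u|\,d\mu^{(N-1)} \le c \int_Z \big(|u(y)| + \|Du(y)\|_{\mathbb{R}^N}\big)\,dy \qquad \forall\ u \in C^1(Z). \tag{2.45} \]
If $p \in (1,+\infty)$, then we use (2.45) with $|u|^p$ replacing $|u|$. So finally
\[ \int_{\partial Z} |u(z)|^p\,d\mu^{(N-1)} \le c\,\|u\|_{W^{1,p}(Z)}^p \qquad \forall\ p \in [1,+\infty). \]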
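The last step, applying (2.45) to $|u|^p$ with $p > 1$, hides a short computation. One way to carry it out is sketched below; it assumes the Sobolev norm is normalized as $\|u\|_{W^{1,p}(Z)}^p = \|u\|_p^p + \|Du\|_p^p$ and uses Young's inequality with the conjugate exponents $p'$ and $p$.

```latex
% Sketch: (2.45) applied to |u|^p (which is C^1 when u is C^1 and p > 1).
\begin{align*}
\big\|D(|u|^p)\big\|_{\mathbb{R}^N} &= p\,|u|^{p-1}\,\|Du\|_{\mathbb{R}^N}
  \le p\Big(\tfrac{1}{p'}\,|u|^{(p-1)p'} + \tfrac{1}{p}\,\|Du\|_{\mathbb{R}^N}^p\Big)
  = (p-1)\,|u|^p + \|Du\|_{\mathbb{R}^N}^p, \\
\int_{\partial Z} |u|^p\,d\mu^{(N-1)}
  &\le c\int_Z \Big(|u|^p + \big\|D(|u|^p)\big\|_{\mathbb{R}^N}\Big)\,dy
  \le c\int_Z \Big(p\,|u|^p + \|Du\|_{\mathbb{R}^N}^p\Big)\,dy
  \le c\,p\,\|u\|_{W^{1,p}(Z)}^p .
\end{align*}
```

Here $(p-1)p' = p$ was used in the first line, so the constant in the lemma can be taken as $cp$ under the stated normalization.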
Using this Lemma, we can state and prove the trace theorem which gives meaning to the concept of boundary values for Sobolev spaces.

THEOREM 2.4.42 (Trace Theorem) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $p \in [1,+\infty)$, then there exists a unique continuous linear map
\[ \gamma_0 : W^{1,p}(Z) \longrightarrow L^p\big(\partial Z, \mu^{(N-1)}\big), \]
such that
\[ \gamma_0(u) = u|_{\partial Z} \qquad \forall\ u \in C^1(Z). \]

PROOF By virtue of Theorem 2.4.17, the embedding $C^1(Z) \subseteq W^{1,p}(Z)$ is dense. From Lemma 2.4.41, we know that
\[ \big\|u|_{\partial Z}\big\|_{L^p(\partial Z,\mu^{(N-1)})}^p \le c\,\|u\|_{W^{1,p}(Z)}^p \qquad \forall\ u \in C^1(Z), \]
for some $c > 0$. So we can extend the restriction map uniquely to a continuous linear map
\[ \gamma_0 : W^{1,p}(Z) \longrightarrow L^p\big(\partial Z, \mu^{(N-1)}\big). \]

DEFINITION 2.4.43 For every $u \in W^{1,p}(Z)$, we say that $\gamma_0(u)$ is the trace of $u$ on $\partial Z$.

PROPOSITION 2.4.44 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $p \in [1,+\infty)$, then
\[ \int_Z D_i u(z)\,dz = \int_{\partial Z} \gamma_0(u)\,n_i\,d\mu^{(N-1)} \qquad \forall\ u \in W^{1,p}(Z),\ i \in \{1,\dots,N\} \]
(as before $n = (n_i)_{i=1}^N$ is the outward unit normal on $\partial Z$).

PROOF Let $\{u_n\}_{n\ge1} \subseteq C^1(Z)$ be such that
\[ \|u - u_n\|_{W^{1,p}(Z)} \longrightarrow 0 \]
(see Theorem 2.4.17). From the divergence theorem of multivariable calculus (see Theorem A.4.1), we have
\[ \int_Z D_i u_n(z)\,dz = \int_{\partial Z} \gamma_0(u_n)\,n_i\,d\mu^{(N-1)} \qquad \forall\ n \ge 1. \tag{2.46} \]
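From Theorem 2.4.42, we know that
\[ \gamma_0(u_n) \longrightarrow \gamma_0(u) \qquad \text{in } L^p\big(\partial Z, \mu^{(N-1)}\big). \]
Also since $u_n \longrightarrow u$ in $W^{1,p}(Z)$, we have $D_i u_n \longrightarrow D_i u$ in $L^p(Z)$. So passing to the limit as $n \to +\infty$ in (2.46), we obtain
\[ \int_Z D_i u(z)\,dz = \int_{\partial Z} \gamma_0(u)\,n_i\,d\mu^{(N-1)}. \]

This proposition leads to a Green's Formula for Sobolev functions. First an auxiliary result which provides still another version of the product rule.

LEMMA 2.4.45 If $Z \subseteq \mathbb{R}^N$ is an open set, $p \in (1,+\infty)$ and $\frac1p + \frac1{p'} = 1$, then for all $u \in W^{1,p}(Z)$, $v \in W^{1,p'}(Z)$, we have $uv \in W^{1,1}(Z)$ and
\[ D_i(uv) = uD_i v + vD_i u \qquad \forall\ i \in \{1,\dots,N\}. \]

PROOF First assume that $u \in C^1(Z)$ and consider a sequence $\{v_n\}_{n\ge1} \subseteq C^1(Z)$, such that
\[ v_n \longrightarrow v \quad \text{in } W^{1,p'}(Z') \qquad \forall\ Z' \subset\subset Z \]
(see Remark 2.4.18). Let $\vartheta \in D(Z)$ and consider $Z' \subseteq \mathbb{R}^N$ bounded, open set, such that $\operatorname{supp}\vartheta \subseteq Z' \subseteq \overline{Z'} \subseteq Z$. For every $i \in \{1,\dots,N\}$, we have
\[ \int_Z uv_n D_i\vartheta\,dz = \int_{Z'} uv_n D_i\vartheta\,dz = -\int_{Z'} \big(uD_i v_n + v_n D_i u\big)\vartheta\,dz, \]
so
\[ \lim_{n\to+\infty} \int_{Z'} \big(uD_i v_n + v_n D_i u\big)\vartheta\,dz = \int_{Z'} (uD_i v + vD_i u)\,\vartheta\,dz. \]
Since
\[ v_n \longrightarrow v \qquad \text{in } W^{1,p'}(Z'), \]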
we have
\[ \lim_{n\to+\infty} \int_Z uv_n D_i\vartheta\,dz = \int_Z uvD_i\vartheta\,dz. \]
So in the limit as $n \to +\infty$, we obtain
\[ \int_Z uvD_i\vartheta\,dz = -\int_{Z'} \big(uD_i v + vD_i u\big)\vartheta\,dz \qquad \forall\ \vartheta \in D(Z), \]
so
\[ D_i(uv) = uD_i v + vD_i u \in L^1(Z), \]
i.e., $uv \in W^{1,1}(Z)$. Now we remove the restriction that $u \in C^1(Z)$. If $u \in W^{1,p}(Z)$, we can find a sequence $\{u_n\}_{n\ge1} \subseteq C^1(Z)$, such that $u_n \longrightarrow u$ in $W^{1,p}(Z')$ for all open sets $Z' \subset\subset Z$. From the first part of the proof we know that
\[ D_i(u_n v) = u_n D_i v + vD_i u_n \qquad \forall\ n \ge 1. \]
Hence
\[ u_n v \longrightarrow uv \qquad \text{in } L^1(Z) \]
and
\[ D_i(u_n v) \longrightarrow uD_i v + vD_i u \qquad \text{in } L^1(Z). \]
Therefore $\{u_n v\}_{n\ge1}$ is a Cauchy sequence in $W^{1,1}(Z)$ and so
\[ u_n v \longrightarrow uv \qquad \text{in } W^{1,1}(Z) \]
and $D_i(uv) = uD_i v + vD_i u$.

THEOREM 2.4.46 (Green Formula) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $p \in (1,+\infty)$, $\frac1p + \frac1{p'} = 1$, then for all $u \in W^{1,p}(Z)$, $v \in W^{1,p'}(Z)$ and $i \in \{1,\dots,N\}$, we have
\[ \int_Z uD_i v\,dz + \int_Z vD_i u\,dz = \int_{\partial Z} \gamma_0(uv)\,n_i\,d\mu^{(N-1)}. \]

PROOF From Lemma 2.4.45, we know that
\[ uv \in W^{1,1}(Z) \qquad \text{and} \qquad D_i(uv) = uD_i v + vD_i u. \]
An application of Proposition 2.4.44 leads to Green's formula.
COROLLARY 2.4.47 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $p \in (1,+\infty)$ and $\frac1p + \frac1{p'} = 1$, then for all $u \in W^{1,p}(Z)$ and $h \in C^1(\mathbb{R}^N;\mathbb{R}^N)$, we have
\[ \int_Z u\operatorname{div} h\,dz + \int_Z \big(Du(z), h(z)\big)_{\mathbb{R}^N}\,dz = \int_{\partial Z} \gamma_0(u)\,(h, n)_{\mathbb{R}^N}\,d\mu^{(N-1)}. \]

Theorem 2.4.42 gives meaning to the quantity $u|_{\partial Z}$ for any $u \in W^{1,p}(Z)$, $p \in [1,+\infty)$. In fact we can do the same thing for $\frac{\partial^m}{\partial n^m}$ for any $m \ge 1$. Here $\frac{\partial}{\partial n}$ denotes the outward normal derivative on $\partial Z$. Also we can give a more precise description of the range of the trace map. To do this we need to introduce Sobolev spaces of fractional order on manifolds.

DEFINITION 2.4.48 Let $M$ be a compact manifold in $\mathbb{R}^N$. For any $s \in (0,1)$, $p \in [1,+\infty)$ and $u \in C^\infty(M)$, we define
\[ \|u\|_{W^{s,p}(M)} \stackrel{df}{=} \bigg( \int_M |u(z)|^p\,dz + \iint_{M\times M} \frac{|u(z) - u(z')|^p}{\|z - z'\|_{\mathbb{R}^N}^{N-1+sp}}\,dz\,dz' \bigg)^{\frac1p}. \]
This is a norm. The completion of $C^\infty(M)$ under this norm is denoted by $W^{s,p}(M)$. For any $s > 0$, we set $s = k + \eta$, with a positive integer $k$ and $\eta \in (0,1)$ (if $s$ is not an integer). We define
\[ W^{s,p}(M) \stackrel{df}{=} \big\{u \in W^{k,p}(M) : D^\alpha u \in W^{\eta,p}(M) \text{ for all } |\alpha| = k\big\}. \]

REMARK 2.4.49 The definition makes sense also for any $Z \subseteq \mathbb{R}^N$ bounded and open. Also if $s = 0$, by convention $W^{0,p}(M) = L^p(M)$.

Now we can state the full version of the trace theorem.

THEOREM 2.4.50 (Trace Theorem) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $m \ge 1$ is a positive integer and $p \in [1,+\infty)$, then there exists a unique bounded, linear operator
\[ \gamma = (\gamma_k)_{k=0}^{m-1} : W^{m,p}(Z) \longrightarrow L^p(\partial Z)^m, \]
such that

(a) if $u \in C^\infty(\overline{Z})$, then $\gamma_k(u) = \frac{\partial^k u}{\partial n^k}$ for $k = 0,\dots,m-1$;

(b) $\operatorname{range}\gamma = \prod_{k=0}^{m-1} W^{m-k-\frac1p,\,p}(\partial Z)$;

(c) $\ker\gamma = W_0^{m,p}(Z)$.
Using Theorem 2.4.46 and the continuity of the trace map $\gamma_1$, we obtain the following result.

THEOREM 2.4.51 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $u \in H^2(Z)$, $v \in H^1(Z)$, then
\[ \int_Z (\Delta u)v\,dz + \int_Z (Du, Dv)_{\mathbb{R}^N}\,dz = \int_{\partial Z} \frac{\partial u}{\partial n}\,v\,d\mu^{(N-1)}. \]

REMARK 2.4.52 The equality in the above theorem is sometimes called Second Green's Identity. We can have a nonlinear extension of this theorem (i.e., $p \ne 2$). For this purpose if $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $q \in (1,+\infty)$, we introduce the space:
\[ V^q(Z,\operatorname{div}) \stackrel{df}{=} \big\{v \in L^q(Z;\mathbb{R}^N) : \operatorname{div} v \in L^q(Z)\big\}. \]
We furnish $V^q(Z,\operatorname{div})$ with the norm
\[ \|v\|_{V^q(Z,\operatorname{div})} \stackrel{df}{=} \Big[\|v\|_{L^q(Z;\mathbb{R}^N)}^q + \|\operatorname{div} v\|_{L^q(Z)}^q\Big]^{\frac1q}. \]
It is easy to see that $V^q(Z,\operatorname{div})$ equipped with this norm is a separable, reflexive Banach space and the embedding $C^\infty(\overline{Z};\mathbb{R}^N) \subseteq V^q(Z,\operatorname{div})$ is dense. The next theorem extends Theorem 2.4.46. For a proof of it we refer to Casas & Fernández (1989) and Kenmochi (1975).

THEOREM 2.4.53 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $p \in (1,+\infty)$ and $\frac1p + \frac1{p'} = 1$, then there exists a unique bounded, linear operator
\[ \gamma_n : V^{p'}(Z,\operatorname{div}) \longrightarrow W^{-\frac1{p'},p'}(\partial Z) = W^{\frac1{p'},p}(\partial Z)^*, \]
such that
\[ \gamma_n(v) = (v, n)_{\mathbb{R}^N} \qquad \forall\ v \in C^\infty(\overline{Z};\mathbb{R}^N) \]
and
\[ \int_Z u\operatorname{div} v\,dz + \int_Z (Du, v)_{\mathbb{R}^N}\,dz = \big(\gamma_n(v), \gamma_0(u)\big)_{W^{\frac1{p'},p}(\partial Z)} \qquad \forall\ v \in V^{p'}(Z,\operatorname{div}),\ u \in W^{1,p}(Z). \]
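If for $u \in W^{1,p}(Z)$, we set
\[ \Delta_p u \stackrel{df}{=} \operatorname{div}\big(\|Du\|_{\mathbb{R}^N}^{p-2}Du\big) \]
(the $p$-Laplacian), then from Theorem 2.4.53, we obtain the following nonlinear extension of Theorem 2.4.51.

THEOREM 2.4.54 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $p \in (1,+\infty)$, $\frac1p + \frac1{p'} = 1$,
\[ u \in W^{1,p}(Z) \qquad \text{and} \qquad \Delta_p u \in L^{p'}(Z), \]
then there exists a unique element of $W^{-\frac1{p'},p'}(\partial Z)$, which by extension we denote by $\frac{\partial u}{\partial n_p}$, satisfying for all $v \in W^{1,p}(Z)$,
\[ \int_Z (\Delta_p u)\,v\,dz + \int_Z \|Du\|_{\mathbb{R}^N}^{p-2}(Du, Dv)_{\mathbb{R}^N}\,dz = \Big(\frac{\partial u}{\partial n_p}, \gamma_0(v)\Big)_{W^{\frac1{p'},p}(\partial Z)}. \]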
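A plausible way to see how Theorem 2.4.54 follows from Theorem 2.4.53 is to check that the vector field $\|Du\|^{p-2}Du$ belongs to $V^{p'}(Z,\operatorname{div})$. In the sketch below the names $v$ (for this field) and $w$ (for the test function) are introduced only for illustration.

```latex
% Sketch: v := ||Du||^{p-2} Du lies in V^{p'}(Z, div), so Theorem 2.4.53 applies.
\begin{align*}
\|v\|_{\mathbb{R}^N}^{p'} &= \|Du\|_{\mathbb{R}^N}^{(p-1)p'} = \|Du\|_{\mathbb{R}^N}^{p}
  \quad\Longrightarrow\quad
  \|v\|_{L^{p'}(Z;\mathbb{R}^N)}^{p'} = \|Du\|_p^p < +\infty, \\
\operatorname{div} v &= \Delta_p u \in L^{p'}(Z)
  \quad\Longrightarrow\quad v \in V^{p'}(Z,\operatorname{div}), \\
\int_Z w\,\Delta_p u\,dz + \int_Z \|Du\|_{\mathbb{R}^N}^{p-2}\,(Du, Dw)_{\mathbb{R}^N}\,dz
  &= \big(\gamma_n(v), \gamma_0(w)\big)_{W^{\frac{1}{p'},p}(\partial Z)}
  \qquad \forall\ w \in W^{1,p}(Z).
\end{align*}
```

Under this reading the boundary element of the theorem would be $\gamma_n\big(\|Du\|^{p-2}Du\big)$, and for $p = 2$ the formula has the same shape as the identity of Theorem 2.4.51.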
If $u \in W_0^{1,p}(Z)$, then we can extend $u$ to $\widehat{u} \in W^{1,p}(\mathbb{R}^N)$ by simply setting $u$ equal to zero on $\mathbb{R}^N \setminus Z$. It is not clear whether this extension is possible for $u \in W^{1,p}(Z)$. The next theorem shows when this is possible. It is known as the extension theorem.

THEOREM 2.4.55 (Extension Theorem) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz and $\widehat{Z} \supseteq Z$ is open, then there exists an extension operator
\[ E : W^{1,p}(Z) \longrightarrow W^{1,p}(\widehat{Z}), \]
such that $E(u)|_Z = u$,
\[ \|E(u)\|_{L^p(\widehat{Z})} \le c\,\|u\|_{L^p(Z)} \qquad \forall\ u \in W^{1,p}(Z) \]
and
\[ \|E(u)\|_{W^{1,p}(\widehat{Z})} \le c\,\|u\|_{W^{1,p}(Z)} \qquad \forall\ u \in W^{1,p}(Z), \]
for some $c = c(Z,\widehat{Z}) > 0$.

Next let us define the dual of $W^{1,p}(Z)$ for an open set $Z \subseteq \mathbb{R}^N$ and $p \in [1,+\infty)$. By considering the map $L_1 : W^{1,p}(Z) \longrightarrow L^p(Z)^{N+1}$, defined by
\[ L_1(u) \stackrel{df}{=} (u, Du) \qquad \forall\ u \in W^{1,p}(Z), \]
we see that $W^{1,p}(Z)$ is isometrically isomorphic to a subspace of $L^p(Z)^{N+1}$. So from the Riesz representation theorem (see Theorem A.3.24), we have
THEOREM 2.4.56 If $Z \subseteq \mathbb{R}^N$ is an open set, $p \in [1,+\infty)$, $\frac1p + \frac1{p'} = 1$ and $G \in W^{1,p}(Z)^*$, then
\[ G(u) = \int_Z (h, Du)_{\mathbb{R}^N}\,dz \qquad \forall\ u \in W^{1,p}(Z), \]
for some $h \in L^{p'}(Z;\mathbb{R}^N)$.

The dual of $W^{1,p}(Z)$ is generally more than a space of distributions on $Z$. Clearly the restriction on $C_c^\infty(Z)$ of an element in $W^{1,p}(Z)^*$ belongs to $D(Z)^*$. However, this restriction is not injective because $C_c^\infty(Z)$ is not dense in $W^{1,p}(Z)$. The problem is that the elements of $W^{1,p}(Z)$ can have nonzero boundary values (in the sense of trace). On the other hand $C_c^\infty(Z)$ is dense in $W_0^{1,p}(Z)$. So for this Sobolev space the restriction is injective and we can have a convenient description of $W_0^{1,p}(Z)^*$.

THEOREM 2.4.57 If $Z \subseteq \mathbb{R}^N$ is an open set and $p \in [1,+\infty)$, then
\[ W_0^{1,p}(Z)^* \stackrel{df}{=} \Big\{G \in D(Z)^* : G = -\sum_{k=1}^N D_k T_{g_k}, \text{ for some } g = (g_k)_{k=1}^N \in L^{p'}(Z)^N\Big\}. \]
We set $W^{-1,p'}(Z) \stackrel{df}{=} W_0^{1,p}(Z)^*$, with $\frac1p + \frac1{p'} = 1$.

For Sobolev functions of one variable (i.e., $N = 1$), using Theorem 2.2.24, we have the following convenient characterization.

THEOREM 2.4.58 If $Z = T = [0,b]$ ($b < +\infty$) and $u \in W^{1,p}(T)$, $\frac1p + \frac1{p'} = 1$, then $u$ admits a representative which is absolutely continuous. Moreover, for $p = +\infty$, the representative is Lipschitz continuous.

REMARK 2.4.59 The result is also true if $T$ is unbounded. In this case the representative is locally absolutely continuous (see Definition A.2.15(b)). For $p = +\infty$, again the representative is Lipschitz continuous (see Remark 2.4.21).
2.5 Inequalities and Embedding Theorems
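The study of Sobolev spaces is useful because their elements possess special properties. Many of those properties are a consequence of the so-called embedding theorems. Among other things, the embedding theorems establish regularity properties for the Sobolev functions, in addition to the ones implied by their definition. Let us start with a negative observation.

EXAMPLE 2.5.1 $H^1(Z)$ is in general not embedded in $L^\infty(Z)$. To see this let
\[ Z \stackrel{df}{=} \Big\{(x,y) \in \mathbb{R}^2 : x^2 + y^2 < \frac{1}{e^2}\Big\}, \]
$\eta \in \big(0, \frac12\big)$ and let $u(x,y) = |\ln r|^\eta$, where $r = \sqrt{x^2 + y^2}$ and $(x,y) \in Z$. We have
\[ \int_Z |u|^2\,dx\,dy = \int_0^{2\pi}\Big(\int_0^{\frac1e} |\ln r|^{2\eta}\,r\,dr\Big)d\vartheta \le 2\pi \int_0^{\frac1e} |\ln r|^{2\eta}\,r\,dr < +\infty, \]
i.e., $u \in L^2(Z)$. Note that $\frac{\partial u}{\partial x}$ exists on $Z \setminus \{(0,0)\}$ and we have
\[ \frac{\partial u}{\partial x} = -\eta\,|\ln r|^{\eta-1}\,\frac1r\,\frac xr = -\eta\,|\ln r|^{\eta-1}\,\frac1r\cos\vartheta, \]
so
\begin{align*}
\int_Z \Big|\frac{\partial u}{\partial x}\Big|^2 dx\,dy
&\le \eta^2 \int_0^{2\pi}\Big(\int_0^{\frac1e} |\ln r|^{2\eta-2}\,\frac1r\,dr\Big)\cos^2\vartheta\,d\vartheta
 \le 2\pi\eta^2 \int_0^{\frac1e} |\ln r|^{2\eta-2}\,\frac1r\,dr \\
&\le 2\pi\eta^2 \Big[-\frac{(-\ln r)^{2\eta-1}}{2\eta-1}\Big]_0^{\frac1e} \le \frac{2\pi\eta^2}{1-2\eta}
\end{align*}
and thus
\[ \frac{\partial u}{\partial x} \in L^2(Z) \qquad \text{and} \qquad \frac{\partial u}{\partial x} \in C^1\big(Z \setminus \{(0,0)\}\big). \]
Let $\vartheta \in D(Z)$. We have
\[ u(\cdot,y) \in C^1\Big(-\frac1e, \frac1e\Big) \qquad \forall\ y \in \Big(-\frac1e, \frac1e\Big) \setminus \{0\} \]
and so
\[ -\int_{-\frac1e}^{\frac1e} \frac{\partial u}{\partial x}(x,y)\,\vartheta(x,y)\,dx = \int_{-\frac1e}^{\frac1e} u(x,y)\,\frac{\partial\vartheta}{\partial x}(x,y)\,dx. \]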
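Integrating with respect to $y \in \big(-\frac1e, \frac1e\big) \setminus \{0\}$ and using Fubini's theorem, we obtain
\[ -\int_Z \frac{\partial u}{\partial x}\,\vartheta\,dx\,dy = \int_Z u\,\frac{\partial\vartheta}{\partial x}\,dx\,dy, \]
so
\[ D_x T_u = T_{\frac{\partial u}{\partial x}} \]
and similarly we show that
\[ D_y T_u = T_{\frac{\partial u}{\partial y}}. \]
This proves that $u \in H^1(Z)$. From this example we see that $H^1(Z) \not\subseteq L^\infty(Z)$.

Next we prove that if $f \in W^{1,p}(\mathbb{R}^N)$, then $f \in L^r(\mathbb{R}^N)$ for a certain range of $r \ge 1$ (including $r = p$).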
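Two facts behind Example 2.5.1 can be checked by the substitution $s = -\ln r$: the radial integrals are finite, while $u$ itself is unbounded near the origin. The following is a sketch of that check; the substitution variable $s$ and the threshold $M$ are names introduced here.

```latex
% Sketch: s = -ln r, so r = e^{-s}, dr/r = -ds, and r in (0, 1/e) corresponds to s in (1, +infty).
\begin{align*}
\int_0^{\frac1e} |\ln r|^{2\eta-2}\,\frac{dr}{r}
  &= \int_1^{+\infty} s^{2\eta-2}\,ds = \frac{1}{1-2\eta}
  \qquad (\text{finite since } 2\eta - 2 < -1), \\
\int_0^{\frac1e} |\ln r|^{2\eta}\,r\,dr
  &= \int_1^{+\infty} s^{2\eta}\,e^{-2s}\,ds < +\infty, \\
u &= |\ln r|^{\eta} > M \quad \text{whenever } 0 < r < \min\big\{e^{-1},\, e^{-M^{1/\eta}}\big\},
  \qquad \text{for every } M > 0 .
\end{align*}
```

The last line shows that $u$ exceeds any bound on a disc of positive measure, so $u \notin L^\infty(Z)$, which is what the example needs.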
DEFINITION 2.5.2 If $p \in [1,N)$, then $p^* \stackrel{df}{=} \frac{Np}{N-p}$ is called the critical Sobolev exponent corresponding to $p$.
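A few sample values may help fix the definition. The identity in the first line below is just a rearrangement of $p^* = \frac{Np}{N-p}$; the numerical cases are illustrative choices.

```latex
% Illustration of the critical Sobolev exponent p^* = Np/(N-p), valid for 1 <= p < N.
\begin{align*}
\frac{1}{p^*} &= \frac{N-p}{Np} = \frac1p - \frac1N, \qquad\text{so } p^* > p
  \ \text{ and } \ p^* \to +\infty \ \text{ as } p \nearrow N, \\
N = 3,\ p = 2 &: \quad p^* = \frac{3\cdot 2}{3-2} = 6, \\
N = 3,\ p = 1 &: \quad p^* = \frac{3}{2}, \\
N = 2,\ p = 1 &: \quad p^* = 2 .
\end{align*}
```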
The next inequality, known as the "Sobolev-Nirenberg-Gagliardo inequality" (or simply as "Sobolev inequality"; see also Theorem 1.6.7), implies that the embedding $W^{1,p}(\mathbb{R}^N) \subseteq L^r(\mathbb{R}^N)$ is continuous for $r \in [p, p^*]$.

THEOREM 2.5.3 (Sobolev-Nirenberg-Gagliardo Inequality) If $p \in [1,N)$, then there exists $c > 0$, such that
\[ \|u\|_{p^*} \le c\,\|Du\|_p \qquad \forall\ u \in W^{1,p}(\mathbb{R}^N). \]

PROOF By virtue of Theorem 2.4.13, we may assume that $u \in C_c^1(\mathbb{R}^N)$. For every $i \in \{1,\dots,N\}$, we have
\[ u(x_1,\dots,x_i,\dots,x_N) = \int_{-\infty}^{x_i} \frac{\partial u}{\partial x_i}(x_1,\dots,t_i,\dots,x_N)\,dt_i, \]
thus
\[ |u(x)| \le \int_{-\infty}^{+\infty} \big\|Du(x_1,\dots,t_i,\dots,x_N)\big\|_{\mathbb{R}^N}\,dt_i \qquad \forall\ i \in \{1,\dots,N\} \]
and so
\[ |u(x)|^{\frac{N}{N-1}} \le \prod_{i=1}^N \Big(\int_{-\infty}^{+\infty} \big\|Du(x_1,\dots,t_i,\dots,x_N)\big\|_{\mathbb{R}^N}\,dt_i\Big)^{\frac{1}{N-1}}. \]
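We integrate with respect to $x_1 \in \mathbb{R}$.
\begin{align*}
\int_{-\infty}^{+\infty} |u|^{1^*}\,dx_1
&\le \Big(\int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\,dt_1\Big)^{\frac{1}{N-1}} \int_{-\infty}^{+\infty} \prod_{i=2}^N \Big(\int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\,dt_i\Big)^{\frac{1}{N-1}} dx_1 \\
&\le \Big(\int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\,dt_1\Big)^{\frac{1}{N-1}} \prod_{i=2}^N \Big(\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\,dx_1\,dt_i\Big)^{\frac{1}{N-1}}.
\end{align*}
Next we integrate with respect to $x_2 \in \mathbb{R}$.
\begin{align*}
\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} |u|^{1^*}\,dx_1\,dx_2
&\le \Big(\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\,dx_1\,dx_2\Big)^{\frac{1}{N-1}} \Big(\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\,dt_1\,dt_2\Big)^{\frac{1}{N-1}} \\
&\quad \times \prod_{i=3}^N \Big(\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\,dx_1\,dx_2\,dt_i\Big)^{\frac{1}{N-1}}.
\end{align*}
We continue this way and we obtain
\[ \int_{\mathbb{R}^N} |u|^{1^*}\,dx \le \prod_{i=1}^N \Big(\int_{-\infty}^{+\infty}\!\!\cdots\int_{-\infty}^{+\infty} \|Du\|_{\mathbb{R}^N}\,dx_1\dots dx_N\Big)^{\frac{1}{N-1}} = \|Du\|_1^{\frac{N}{N-1}}, \]
so
\[ \|u\|_{1^*} \le \|Du\|_1. \tag{2.47} \]
This proves the theorem for $p = 1$. Now suppose that $p \in (1,N)$. Set $h = |u|^\eta$ with $\eta > 0$ to be chosen in the process of the proof. Using (2.47) and Hölder's inequality (see Theorem A.2.27), we obtain
\[ \Big(\int_{\mathbb{R}^N} |u|^{\frac{\eta N}{N-1}}\,dx\Big)^{\frac{N-1}{N}} \le \int_{\mathbb{R}^N} \big\|D|u|^\eta\big\|_{\mathbb{R}^N}\,dx \le \eta \int_{\mathbb{R}^N} |u|^{\eta-1}\,\|Du\|_{\mathbb{R}^N}\,dx \le \eta\,\Big(\int_{\mathbb{R}^N} |u|^{(\eta-1)p'}\,dx\Big)^{\frac{1}{p'}} \|Du\|_p, \]
with $\frac1p + \frac1{p'} = 1$. Choose $\eta > 0$ so that
\[ \frac{\eta N}{N-1} = (\eta - 1)\,\frac{p}{p-1}. \]
Then
\[ \frac{p}{p-1} = \eta\,\Big(\frac{p}{p-1} - \frac{N}{N-1}\Big) \qquad \text{and} \qquad p = \eta\,\frac{N-p}{N-1}, \]
hence
\[ \eta = \frac{Np - p}{N-p} = \frac{N-1}{N}\,p^*. \]
So we have
\[ \Big(\int_{\mathbb{R}^N} |u|^{p^*}\,dx\Big)^{\frac{N-1}{N}} \le \eta\,\Big(\int_{\mathbb{R}^N} |u|^{p^*}\,dx\Big)^{\frac{1}{p'}} \|Du\|_p, \]
so
\[ \|u\|_{p^*} \le c\,\|Du\|_p, \]
with $c = c(N,p) > 0$ (note that $\frac{N-1}{N} > \frac{N-p}{Np}$).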
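The exponent bookkeeping in the case $p \in (1,N)$ can be verified directly. The following sketch solves the defining equation for $\eta$ and checks that both integrals carry the power $p^*$ and that the leftover exponent is exactly $\frac{1}{p^*}$.

```latex
% Check of the choice of eta and of the resulting exponents (1 < p < N).
\begin{align*}
\frac{\eta N}{N-1} = \frac{(\eta-1)p}{p-1}
  &\iff \eta N(p-1) = (\eta-1)\,p\,(N-1)
  \iff \eta\,(N-p) = p\,(N-1)
  \iff \eta = \frac{p(N-1)}{N-p}, \\
\frac{\eta N}{N-1} &= \frac{Np}{N-p} = p^*, \qquad (\eta-1)\,p' = \frac{\eta N}{N-1} = p^*, \\
\frac{N-1}{N} - \frac{1}{p'} &= \frac{N-1}{N} - \frac{p-1}{p} = \frac{N-p}{Np} = \frac{1}{p^*} .
\end{align*}
```

Dividing the last displayed inequality of the proof by $\big(\int_{\mathbb{R}^N}|u|^{p^*}dx\big)^{1/p'}$ (when it is nonzero) therefore gives $\|u\|_{p^*} \le \eta\,\|Du\|_p$, so one admissible constant is $c = \eta = \frac{p(N-1)}{N-p}$.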
The next inequality is known as "Poincaré's inequality" and is very useful in the study of Dirichlet elliptic equations.

THEOREM 2.5.4 (Poincaré's Inequality) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set and $p \in [1,+\infty)$, then there exists $c = c(Z,p) > 0$, such that
\[ \|u\|_p \le c\,\|Du\|_p \qquad \forall\ u \in W_0^{1,p}(Z). \]

PROOF Since $Z \subseteq \mathbb{R}^N$ is bounded, we can find $\xi > 0$, such that
\[ Z \subseteq (-\xi,\xi)^N. \]
Let $\vartheta \in D(Z)$ and extend it by zero on the whole "cube" $(-\xi,\xi)^N$. For every $z = (z_k)_{k=1}^N$, we have
\[ \vartheta(z) = \int_{-\xi}^{z_N} \frac{\partial\vartheta}{\partial z_N}(z_1,\dots,z_{N-1},t)\,dt. \]
By Hölder's inequality (see Theorem A.2.27), with $\frac1p + \frac1{p'} = 1$, we have
\[ |\vartheta(z_1,\dots,z_N)|^p \le (2\xi)^{\frac{p}{p'}} \int_{-\xi}^{\xi} \Big|\frac{\partial\vartheta}{\partial z_N}(z_1,\dots,z_{N-1},t)\Big|^p dt, \]
so
\[ \int_{(-\xi,\xi)^{N-1}} |\vartheta(z_1,\dots,z_N)|^p\,dz_1\dots dz_{N-1} \le (2\xi)^{\frac{p}{p'}} \int_{(-\xi,\xi)^{N-1}} \Big(\int_{-\xi}^{\xi} \Big|\frac{\partial\vartheta}{\partial z_N}(z_1,\dots,z_{N-1},t)\Big|^p dt\Big)\,dz_1\dots dz_{N-1}, \]
thus
\[ \int_Z |\vartheta(z)|^p\,dz \le (2\xi)^{\frac{p}{p'}+1} \int_Z \|D\vartheta\|_{\mathbb{R}^N}^p\,dz \]
and since $\frac{p}{p'} + 1 = p$, we have
\[ \|\vartheta\|_p \le 2\xi\,\|D\vartheta\|_p. \]
Since $D(Z)$ is dense in $W_0^{1,p}(Z)$, we conclude that
\[ \|u\|_p \le c\,\|Du\|_p \qquad \forall\ u \in W_0^{1,p}(Z), \]
for some $c = c(Z,p) > 0$.

REMARK 2.5.5 In fact the result is true if $Z \subseteq \mathbb{R}^N$ is unbounded but of finite width, namely it lies between two parallel hyperplanes (see Adams (1975, p. 158)). However, the result fails in truly unbounded domains $Z \subseteq \mathbb{R}^N$.

EXAMPLE 2.5.6 Let $Z = \mathbb{R}^N$ and $\vartheta \in D(\mathbb{R}^N)$ be such that
\[ \vartheta|_{B_1(0)} \equiv 1, \qquad \vartheta|_{B_2(0)^c} \equiv 0 \qquad \text{and} \qquad 0 \le \vartheta \le 1. \]
Let $\vartheta_m(z) \stackrel{df}{=} \vartheta\big(\frac{z}{m}\big)$. Then if $N < p$, we have
\[ \|D\vartheta_m\|_p \longrightarrow 0 \qquad \text{as } m \to +\infty, \]
while
\[ \|\vartheta_m\|_p^p \ge \lambda^N\big(B_m(0)\big) \longrightarrow +\infty \qquad \text{as } m \to +\infty. \]
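The scaling behind Example 2.5.6 can be made explicit by the change of variables $z = mw$. The following is a sketch of that computation; the variable $w$ is a name introduced here.

```latex
% Sketch: scaling of the norms of theta_m(z) = theta(z/m) on R^N (z = m w, dz = m^N dw).
\begin{align*}
D\vartheta_m(z) &= \frac1m\,(D\vartheta)\Big(\frac zm\Big), \\
\|D\vartheta_m\|_p^p &= m^{-p}\int_{\mathbb{R}^N} \Big\|(D\vartheta)\Big(\frac zm\Big)\Big\|_{\mathbb{R}^N}^p dz
  = m^{N-p}\,\|D\vartheta\|_p^p \longrightarrow 0 \quad \text{if } N < p, \\
\|\vartheta_m\|_p^p &\ge \int_{B_m(0)} 1\,dz = \lambda^N\big(B_m(0)\big) = m^N\,\lambda^N\big(B_1(0)\big) \longrightarrow +\infty .
\end{align*}
```

So no constant $c$ can satisfy $\|\vartheta_m\|_p \le c\,\|D\vartheta_m\|_p$ for all $m$, which is the failure of the Poincaré inequality on $\mathbb{R}^N$ asserted in Remark 2.5.5.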
An immediate useful consequence of Theorem 2.5.4, which is a basic tool in the study of Dirichlet elliptic problems, is the following result.
COROLLARY 2.5.7 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set and $p \in [1,+\infty)$, then
\[ \|Du\|_p \stackrel{df}{=} \Big(\int_Z \|Du(z)\|_{\mathbb{R}^N}^p\,dz\Big)^{\frac1p} \qquad \forall\ u \in W_0^{1,p}(Z) \]
is a norm on $W_0^{1,p}(Z)$ equivalent to the usual Sobolev norm $\|u\|_{W^{1,p}(Z)}$.

Let us use this opportunity to mention a few equivalent norms for the Sobolev spaces $W^{1,p}(Z)$.

PROPOSITION 2.5.8 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set and $p \in [1,+\infty)$, then the following three norms are equivalent to the original Sobolev norm $\|\cdot\|_{W^{1,p}(Z)}$:
\begin{align*}
\|u\|_{(1)} &\stackrel{df}{=} \Big(\|Du\|_p^p + \Big|\int_Z u\,dz\Big|^p\Big)^{\frac1p}, \\
\|u\|_{(2)} &\stackrel{df}{=} \Big(\|Du\|_p^p + \Big|\int_{\partial Z} u\,d\mu^{(N-1)}\Big|^p\Big)^{\frac1p}, \\
\|u\|_{(3)} &\stackrel{df}{=} \Big(\|Du\|_p^p + \int_{\partial Z} |u|^p\,d\mu^{(N-1)}\Big)^{\frac1p}.
\end{align*}

REMARK 2.5.9 If $N = 1$ and $Z = (0,b)$ ($b \in (0,+\infty)$), then
\[ \int_{\partial Z} u\,d\mu^{(N-1)} = u(0) + u(b). \]

Before passing to the so-called embedding theorems, let us mention one more inequality, known in the literature as "Morrey's inequality." First a definition.

DEFINITION 2.5.10 Let $\eta \in (0,1)$. A function $u : \mathbb{R}^N \longrightarrow \mathbb{R}$ is said to be Hölder continuous with exponent $\eta$, if
\[ \sup_{\substack{x,y\in\mathbb{R}^N \\ x \ne y}} \frac{|u(x) - u(y)|}{\|x - y\|_{\mathbb{R}^N}^\eta} < +\infty. \]
In the proof of Morrey's inequality, we shall use the following lemma.
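LEMMA 2.5.11 For every $p \in [1,+\infty)$, there exists $c = c(N,p) > 0$, such that
\[ \int_{\overline{B}_r(x)} |u(y) - u(z)|^p\,dy \le c\,r^{N+p-1} \int_{\overline{B}_r(x)} \|Du(y)\|_{\mathbb{R}^N}^p\,\|y - z\|_{\mathbb{R}^N}^{1-N}\,dy, \]
for all $r > 0$, $u \in C^1\big(\overline{B}_r(x)\big)$ and all $z \in \overline{B}_r(x)$.

PROOF If $y, z \in \overline{B}_r(x)$, then
\[ u(y) - u(z) = \int_0^1 \frac{d}{dt}\,u\big(z + t(y - z)\big)\,dt = \int_0^1 \big(Du(z + t(y - z)), y - z\big)_{\mathbb{R}^N}\,dt, \]
thus
\[ |u(y) - u(z)|^p \le \|y - z\|_{\mathbb{R}^N}^p \int_0^1 \big\|Du\big(z + t(y - z)\big)\big\|_{\mathbb{R}^N}^p\,dt. \]
So using Proposition 1.3.23(b) and (c), for $s > 0$, we have
\begin{align*}
\int_{\overline{B}_r(x)\cap\partial B_s(z)} |u(y) - u(z)|^p\,d\mu^{(N-1)}(y)
&\le s^p \int_0^1 \int_{\overline{B}_r(x)\cap\partial B_s(z)} \big\|Du\big(z + t(y - z)\big)\big\|_{\mathbb{R}^N}^p\,d\mu^{(N-1)}(y)\,dt \\
&\le s^p \int_0^1 \frac{1}{t^{N-1}} \int_{\overline{B}_r(x)\cap\partial B_{st}(z)} \|Du(w)\|_{\mathbb{R}^N}^p\,d\mu^{(N-1)}(w)\,dt \\
&= s^{N+p-1} \int_0^1 \int_{\overline{B}_r(x)\cap\partial B_{st}(z)} \|Du(w)\|_{\mathbb{R}^N}^p\,\|w - z\|_{\mathbb{R}^N}^{1-N}\,d\mu^{(N-1)}(w)\,dt \\
&= s^{N+p-2} \int_{\overline{B}_r(x)\cap B_s(z)} \|Du(w)\|_{\mathbb{R}^N}^p\,\|w - z\|_{\mathbb{R}^N}^{1-N}\,dw.
\end{align*}
Then from Example 1.5.27(a), we have
\[ \int_{\overline{B}_r(x)} |u(y) - u(z)|^p\,dy \le c\,r^{N+p-1} \int_{\overline{B}_r(x)} \|Du(y)\|_{\mathbb{R}^N}^p\,\|y - z\|_{\mathbb{R}^N}^{1-N}\,dy, \]
with $c = c(N,p) > 0$.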
THEOREM 2.5.12 (Morrey Inequality) (a) For every p ∈ (N, +∞), there exists c = c(N, p) > 0, such that ¯ ¯ ¯u(y) − u(z)¯ 6 cr
Z
1 λN (B r (x))
° ° °Du(w)°p N dw, R
B r (x)
¡ ¢ for all r > 0, u ∈ W 1,p Br (x) and λN -almost all y, z ∈ B r (x). (b) If p ∈ (N, +∞) and u ∈ W 1,p (RN ), then the limit lim ur,z = u∗ (z)
r&0
exists for all z ∈ RN and u∗ is H¨ older continuous with exponent 1 − Np ; recall that Z 1 ur,z = N u(x) dλN (x). λ (B r (z)) B r (z)
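PROOF (a) First suppose that $u \in C^1\big(B_r(x)\big)$. Using Lemma 2.5.11 with $p = 1$, Hölder's inequality (see Theorem A.2.27) and recalling that $p' = \frac{p}{p-1}$, for all $y, z \in \overline{B}_r(x)$, we have
\[
\begin{aligned}
|u(y) - u(z)|
&\le \frac{1}{\lambda^N(\overline{B}_r(x))} \int_{\overline{B}_r(x)} \big( |u(y) - u(w)| + |u(w) - u(z)| \big) \, dw \\
&\le c \int_{\overline{B}_r(x)} \|Du(w)\|_{\mathbb{R}^N} \Big( \|y - w\|_{\mathbb{R}^N}^{1-N} + \|z - w\|_{\mathbb{R}^N}^{1-N} \Big) \, dw \\
&\le c \left( \int_{\overline{B}_r(x)} \Big( \|y - w\|_{\mathbb{R}^N}^{1-N} + \|z - w\|_{\mathbb{R}^N}^{1-N} \Big)^{p'} dw \right)^{\frac{1}{p'}} \left( \int_{\overline{B}_r(x)} \|Du\|_{\mathbb{R}^N}^p \, dw \right)^{\frac{1}{p}} \\
&\le c\, r^{(N - (N-1)p') \frac{1}{p'}} \left( \int_{\overline{B}_r(x)} \|Du(w)\|_{\mathbb{R}^N}^p \, dw \right)^{\frac{1}{p}} \\
&\le c\, r^{1 - \frac{N}{p}} \left( \int_{\overline{B}_r(x)} \|Du(w)\|_{\mathbb{R}^N}^p \, dw \right)^{\frac{1}{p}}.
\end{aligned}
\]
Invoking Theorem 2.4.13, we see that the same estimate holds for all $u \in W^{1,p}\big(B_r(x)\big)$ and for $\lambda^N$-almost all $y, z \in B_r(x)$.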
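The exponent bookkeeping in the last two lines of the estimate, and the reason the hypothesis $p > N$ is needed there, can be spelled out as follows. This is a sketch added for orientation, not part of the original proof; $\omega_{N-1}$ denotes the surface measure of the unit sphere in $\mathbb{R}^N$, a symbol introduced here only for this computation.

```latex
% Illustrative computation (not from the source).
% (i) The power of r: with p' = p/(p-1) we have N/p' = N - N/p, hence
\[
\big(N - (N-1)p'\big)\frac{1}{p'} = \frac{N}{p'} - (N-1) = 1 - \frac{N}{p}.
\]
% (ii) Integrability of the kernel: since \overline{B}_r(x) \subseteq B_{2r}(y)
% for y in \overline{B}_r(x), polar coordinates centered at y give
\[
\int_{\overline{B}_r(x)} \|y - w\|^{(1-N)p'} \, dw
\le \omega_{N-1} \int_0^{2r} \rho^{(1-N)p' + N - 1} \, d\rho
= \omega_{N-1}\, \frac{(2r)^{\,N - (N-1)p'}}{N - (N-1)p'} ,
\]
% which is finite exactly when N - (N-1)p' > 0, and
\[
N - (N-1)p' > 0
\iff p' < \frac{N}{N-1}
\iff \frac{1}{p} < \frac{1}{N}
\iff p > N .
\]
```

For $p \le N$ the kernel $\|y - w\|^{(1-N)p'}$ fails to be integrable near $w = y$, so this step of the argument is not available.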
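(b) From part (a), for $\lambda^N$-almost all $y, z \in \overline{B}_r(x)$, with $r = \|x - y\|_{\mathbb{R}^N}$, we have
\[
|u(y) - u(z)| \le c\, \|y - z\|_{\mathbb{R}^N}^{1 - \frac{N}{p}} \left( \int_{\overline{B}_r(x)} \|Du(w)\|_{\mathbb{R}^N}^p \, dw \right)^{\frac{1}{p}}
\le c\, \|Du\|_{L^p(Z; \mathbb{R}^N)} \, \|y - z\|_{\mathbb{R}^N}^{1 - \frac{N}{p}},
\]
so $u$ is $\lambda^N$-almost everywhere equal to a Hölder continuous function $u^*$ with exponent $1 - \frac{N}{p}$. So
\[
\lim_{r \searrow 0} u_{r,z} = u^*(z) \qquad \forall\, z \in \mathbb{R}^N.
\]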
REMARK 2.5.13 If $p = +\infty$, then we know that the elements of $W^{1,\infty}(\mathbb{R}^N)$ are Lipschitz continuous functions (see Remark 2.4.21). From Theorem 2.5.12, it follows that, if $u \in W^{1,p}(\mathbb{R}^N)$, $N < p$, then
\[
\lim_{\|z\|_{\mathbb{R}^N} \to +\infty} u(z) = 0.
\]

Already from Theorem 2.5.3, we know that the embedding $W^{1,p}(\mathbb{R}^N) \subseteq L^r(\mathbb{R}^N)$ is continuous for all $r \in [1, p^*]$. Moreover, from Theorem 2.5.12, we know that if $p > N$, then the embedding $W^{1,p}(\mathbb{R}^N) \subseteq L^\infty(\mathbb{R}^N)$ is continuous. The next two theorems make these facts much more precise. The first theorem is known as the Sobolev embedding theorem, while the second is known as the Rellich-Kondrachov embedding theorem. First let us introduce a new kind of boundary regularity.

DEFINITION 2.5.14
(a) For given $z \in \mathbb{R}^N$, an open ball $B_1$ with center $z$ and an open ball $B_2$ not containing $z$, the set
\[
C_z \stackrel{df}{=} \big\{ z + t(y - z) : y \in B_2,\ t > 0 \big\} \cap B_1
\]
is called a finite cone in $\mathbb{R}^N$.
(b) Let $Z \subseteq \mathbb{R}^N$ be an open set. We say that $Z$ has the cone property, if there exists a finite cone $C_0$, such that for each $z \in Z$, there exists an orthogonal transformation $U_x \in \mathcal{L}\big(\mathbb{R}^N; \mathbb{R}^N\big)$, for which we have $U_x(C_0) \subseteq Z$.
REMARK 2.5.15 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, then it has the cone property (see Adams (1975, p. 51)). Also it is clear that if $Z$ is $C^1$, then it has the cone property.

THEOREM 2.5.16 (Sobolev Embedding Theorem) If $Z \subseteq \mathbb{R}^N$ is an open set with the cone property, $p \in [1, +\infty)$, $k, m$ are integers, $k \ge 0$, $m \ge 1$, then
(a) if $mp$ […] $N$, then the embedding $W^{k+m,p}(Z) \subseteq C_b^k(Z)$ is continuous with $C_b^k(Z)$ being the space of all functions $u \in C^k(Z)$, such that $D^\alpha u$ is bounded on $Z$ for all multiindices $\alpha \in \mathbb{N}^N$ with $|\alpha| \le k$.

When $Z \subseteq \mathbb{R}^N$ is bounded, the conclusions are stronger.

THEOREM 2.5.17 (Rellich-Kondrachov Embedding Theorem) If $Z \subseteq \mathbb{R}^N$ is an open, bounded set with the cone property, $p \in [1, +\infty)$, $k, m$ are integers, $k \ge 0$, $m \ge 1$, then
(a) if $mp$ […] $N$, then the embedding $W^{k+m,p}(Z) \subseteq C^k\big(\overline{Z}\big)$ is compact for all $r \in [1, +\infty]$ and in particular if $k = 0$, we have that the embedding $W^{m,p}(Z) \subseteq L^r(Z)$ is compact for all $r \in [1, +\infty]$.

REMARK 2.5.18 Since $W_0^{1,p}(Z)$ is a closed subspace of $W^{1,p}(Z)$, we see that both embedding theorems (i.e., Theorems 2.5.16 and 2.5.17) are also valid for $W_0^{1,p}(Z)$. In fact in this case the cone property can be dropped.

Using the embedding theorems, we can prove a generalized form of Poincaré's inequality (see Theorem 2.5.4).
THEOREM 2.5.19 (Generalized Poincaré Inequality) If $Z \subseteq \mathbb{R}^N$ is a bounded, open, connected set with the cone property, $p \in (1, +\infty)$ and $V$ is a closed linear subspace of $W^{1,p}(Z)$, such that the only constant function in $V$ is the zero function, then
\[
\|u\|_p \le c\, \|Du\|_p \qquad \forall\, u \in V,
\]
for some $c > 0$.

PROOF We proceed by contradiction. So suppose that the conclusion of the theorem is not true. We can find a sequence $\{u_n\}_{n \ge 1} \subseteq V$, such that
\[
\|u_n\|_p > n\, \|Du_n\|_p \qquad \forall\, n \ge 1.
\]
Let
\[
v_n \stackrel{df}{=} \frac{u_n}{\|u_n\|_p} \qquad \forall\, n \ge 1.
\]
Then $\|v_n\|_p = 1$ and $\|Dv_n\|_p < \frac{1}{n}$ for all $n \ge 1$, so the sequence $\{v_n\}_{n \ge 1} \subseteq W^{1,p}(Z)$ is bounded and so by passing to a subsequence if necessary, we may assume that
\[
v_n \xrightarrow{w} v \qquad \text{in } W^{1,p}(Z).
\]
Because of Theorem 2.5.17(a), we have that
\[
v_n \longrightarrow v \qquad \text{in } L^p(Z).
\]
Hence $\|v\|_p = 1$. Also exploiting the weak lower semicontinuity of the norm in a Banach space, we have
\[
\|Dv\|_p \le \liminf_{n \to +\infty} \|Dv_n\|_p = 0,
\]
so $v \in V$ is constant, thus $v = 0$, which is a contradiction to the fact that $\|v\|_p = 1$.
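The hypothesis that $V$ contains no nonzero constant cannot be dropped. The following one-line check, added here as an illustration and not taken from the text, shows what goes wrong for the full space $V = W^{1,p}(Z)$.

```latex
% Illustrative counterexample (not from the source): V = W^{1,p}(Z), u \equiv 1.
% Z is bounded and open, so 0 < \lambda^N(Z) < +\infty.
\[
\|u\|_p = \lambda^N(Z)^{\frac{1}{p}} > 0
\qquad \text{while} \qquad
\|Du\|_p = 0 ,
\]
% so no constant c > 0 can satisfy \|u\|_p \le c \, \|Du\|_p on all of W^{1,p}(Z).
```

The closed subspaces used below (functions with vanishing trace on a piece of the boundary, functions with zero mean) are two standard ways of excluding the constants.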
COROLLARY 2.5.20 If $Z \subseteq \mathbb{R}^N$ is a bounded, open, connected set which is Lipschitz, $S_0 \subseteq \partial Z$ with $\mu^{(N-1)}(S_0) > 0$, $p \in (1, +\infty)$ and
\[
V \stackrel{df}{=} \big\{ u \in W^{1,p}(Z) : \gamma_0(u) = 0 \text{ on } S_0 \big\}
\]
($\gamma_0$ being the trace operator on $W^{1,p}(Z)$; see Theorem 2.4.42), then
\[
\|u\|_p \le c\, \|Du\|_p \qquad \forall\, u \in V,
\]
for some $c > 0$.

PROOF Since $\gamma_0$ is continuous linear (see Theorem 2.4.42), $V$ is a closed, linear subspace of $W^{1,p}(Z)$. Suppose that the constant function $u \equiv c \in V$. We have
\[
0 = \gamma_0(u) = \gamma_0(c) = c.
\]
This permits the application of Theorem 2.5.19.

This leads to another fundamental inequality, known as the "Poincaré-Wirtinger inequality." It is an essential tool in the study of periodic ordinary differential equations and Neumann partial differential equations.

THEOREM 2.5.21 (Poincaré-Wirtinger Inequality) If $Z \subseteq \mathbb{R}^N$ is a bounded, open, connected set with the cone property and $p \in (1, +\infty)$, then
\[
\|u - \overline{u}\|_p \le c\, \|Du\|_p \qquad \forall\, u \in W^{1,p}(Z),
\]
for some $c > 0$, where $\overline{u} \stackrel{df}{=} \frac{1}{\lambda^N(Z)} \int_Z u(z) \, dz$.

PROOF Let
\[
V \stackrel{df}{=} \Big\{ u \in W^{1,p}(Z) : \int_Z u(z) \, dz = 0 \Big\}.
\]
Clearly $V$ is a closed, linear subspace of $W^{1,p}(Z)$. If the constant function $u \equiv c \in V$, then
\[
\int_Z u(z) \, dz = c\, \lambda^N(Z) = 0,
\]
hence $c = 0$. Also
\[
u - \overline{u} \in V \qquad \forall\, u \in W^{1,p}(Z).
\]
So an application of Theorem 2.5.19 finishes the proof.
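A concrete one-dimensional instance may help to see the size of the constant. The computation below is an illustration added here (it is not in the text); it takes $N = 1$, $Z = (0, b)$, $p = 2$ and the test function $u(t) = \cos(\pi t / b)$, all of which are choices made only for this example.

```latex
% Illustrative example (not from the source): Z = (0,b), p = 2, u(t) = cos(pi t / b).
% Mean value:
\[
\overline{u} = \frac{1}{b} \int_0^b \cos\frac{\pi t}{b} \, dt
= \frac{1}{\pi} \Big[ \sin\frac{\pi t}{b} \Big]_0^b = 0 .
\]
% Norms of u and u':
\[
\|u - \overline{u}\|_2^2 = \int_0^b \cos^2\frac{\pi t}{b} \, dt = \frac{b}{2},
\qquad
\|u'\|_2^2 = \frac{\pi^2}{b^2} \int_0^b \sin^2\frac{\pi t}{b} \, dt = \frac{\pi^2}{2b} .
\]
% Ratio:
\[
\frac{\|u - \overline{u}\|_2}{\|u'\|_2} = \frac{b}{\pi} ,
\]
% so any constant c that works in Theorem 2.5.21 for this Z and p = 2
% must satisfy c \ge b / \pi.
```

In particular the constant necessarily grows with the length of the interval, which is consistent with the boundedness assumption on $Z$.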
We already know that for the Sobolev functions of one variable (i.e., $N = 1$), the situation is better (see Theorem 2.4.58). In this case the embedding theorems take a sharper form.

THEOREM 2.5.22 Let $T \subseteq \mathbb{R}$ be an interval.
(a) If $T$ is open and $p \in [1, +\infty]$, then the embedding $W^{1,p}(T) \subseteq L^\infty(T)$ is continuous;
(b) If $T$ is open, bounded and $p \in (1, +\infty]$, then the embedding $W^{1,p}(T) \subseteq C\big(\overline{T}\big)$ is compact;
(c) If $T$ is open, bounded and $p \in [1, +\infty)$, then the embedding $W^{1,1}(T) \subseteq L^p(T)$ is compact.

PROOF (a) Using the extension theorem (see Theorem 2.4.55), we see that without any loss of generality, we may assume that $T = \mathbb{R}$. First let $u \in C_c^1(\mathbb{R})$ and for $p \in [1, +\infty)$ let
\[
\xi_p(r) \stackrel{df}{=} |r|^{p-1} r \qquad \forall\, r \in \mathbb{R}.
\]
We have that $\xi_p(u) \in C_c^1(\mathbb{R})$ and from the chain rule, we have
\[
\frac{d}{dt}\, \xi_p\big(u(t)\big) = \xi_p'\big(u(t)\big) \frac{d}{dt} u(t) = p\, |u(t)|^{p-1} \frac{d}{dt} u(t)
\]
(since $\xi_p'(r) = p |r|^{p-1}$). Hence for every $t \in \mathbb{R}$, we have
\[
\xi_p\big(u(t)\big) = \int_{-\infty}^{t} p\, |u(s)|^{p-1} u'(s) \, ds
\]
(since $u \in C_c^1(\mathbb{R})$), so by Hölder's inequality (see Theorem A.2.27), we have
\[
\big| \xi_p\big(u(t)\big) \big| = |u(t)|^p \le \int_{\mathbb{R}} p\, |u(s)|^{p-1} |u'(s)| \, ds \le p\, \|u\|_p^{p-1} \|u'\|_p.
\]
By Young's inequality (see Proposition A.4.5), we obtain
\[
\|u\|_\infty \le c\, \|u\|_{W^{1,p}(T)} \qquad \forall\, u \in C_c^1(\mathbb{R}), \tag{2.48}
\]
for some $c > 0$. Now let $u \in W^{1,p}(\mathbb{R})$, with $p \in [1, +\infty)$. Then we can find a sequence $\{u_n\}_{n \ge 1} \subseteq C_c^1(\mathbb{R})$, such that $u_n \longrightarrow u$ in $W^{1,p}(\mathbb{R})$. From (2.48), we have
\[
\|u_n - u_m\|_\infty \le c\, \|u_n - u_m\|_{W^{1,p}(T)} \qquad \forall\, n, m \ge 1,
\]
hence the sequence $\{u_n\}_{n \ge 1} \subseteq L^\infty(\mathbb{R})$ is a Cauchy sequence. Therefore $u_n \longrightarrow u$ in $L^\infty(\mathbb{R})$ and we have proved the continuity of the embedding $W^{1,p}(\mathbb{R}) \subseteq L^\infty(\mathbb{R})$ for $p \in [1, +\infty)$. Of course the result is trivially true for $p = +\infty$.

(b) Let $\overline{B}_1(0)$ be the closed unit ball in $W^{1,p}(T)$, $p \in (1, +\infty]$. Let $u \in \overline{B}_1(0)$. We have
\[
|u(t) - u(s)| = \left| \int_s^t u'(\tau) \, d\tau \right| \le \|u'\|_p \, |t - s|^{\frac{1}{p'}} \le |t - s|^{\frac{1}{p'}} \qquad \forall\, t, s \in T,
\]
where $\frac{1}{p} + \frac{1}{p'} = 1$ (see Theorem 2.4.58). Then the Arzela-Ascoli theorem (see Theorem 2.3.2) implies that $\overline{B}_1(0)$ is relatively compact in $C\big(\overline{T}\big)$ and so we have proved the compactness of the embedding $W^{1,p}(T) \subseteq C\big(\overline{T}\big)$ for $p \in (1, +\infty]$ with $T \subseteq \mathbb{R}$ being a bounded, open interval.

(c) Let $\overline{B}_1(0)$ be the closed unit ball in $W^{1,1}(T)$. Let $E$ be an open subset of $T$, such that $\overline{E} \subseteq T$. Let $|h| < d_{\mathbb{R}}\big(\overline{E}, T^c\big)$ and $u \in \overline{B}_1(0)$. From Remark 2.4.21, we know that
\[
\|\tau_h(u) - u\|_{L^1(E)} \le |h|\, \|u'\|_{L^1(T)} \le |h|,
\]
so
\[
\int_E |u(t + h) - u(t)|^p \, dt \le \big( 2\, \|u\|_{L^\infty(T)} \big)^{p-1} \int_E |u(t + h) - u(t)| \, dt \le c\, |h|,
\]
for some $c > 0$ with $p \in [1, +\infty)$. Thus
\[
\left( \int_E |u(t + h) - u(t)|^p \, dt \right)^{\frac{1}{p}} \le c^{\frac{1}{p}} |h|^{\frac{1}{p}}.
\]
Invoking Theorem 2.3.6, we infer that $\overline{B}_1(0)$ is relatively compact in $L^p(T)$, $p \in [1, +\infty)$. This proves the compactness of the embedding $W^{1,1}(T) \subseteq L^p(T)$ for $p \in [1, +\infty)$ with $T \subseteq \mathbb{R}$ being a bounded, open interval.
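In part (a) of the proof, the passage from the Hölder estimate to (2.48) via Young's inequality is only indicated. One way to carry it out is sketched below; this is an added illustration, and it assumes Young's inequality in the form $ab \le \frac{a^{p'}}{p'} + \frac{b^{p}}{p}$ for $a, b \ge 0$, $p \in (1, +\infty)$, which may differ in presentation from Proposition A.4.5.

```latex
% Illustrative derivation (not from the source) of (2.48) for p in (1, +\infty).
% Apply Young's inequality with a = \|u\|_p^{p-1} and b = \|u'\|_p,
% using (p-1)p' = p and p/p' = p - 1:
\[
\|u\|_\infty^p \le p\, \|u\|_p^{p-1} \|u'\|_p
\le p \left( \frac{\|u\|_p^{(p-1)p'}}{p'} + \frac{\|u'\|_p^{p}}{p} \right)
= (p-1)\, \|u\|_p^p + \|u'\|_p^p .
\]
% Hence
\[
\|u\|_\infty \le \max\{p-1,\, 1\}^{\frac{1}{p}} \big( \|u\|_p^p + \|u'\|_p^p \big)^{\frac{1}{p}}
\le \max\{p-1,\, 1\}^{\frac{1}{p}} \big( \|u\|_p + \|u'\|_p \big).
\]
% For p = 1 no Young step is needed: |u(t)| \le \int_{\mathbb{R}} |u'(s)| \, ds = \|u'\|_1.
```

So $c = \max\{p-1, 1\}^{1/p}$ is an admissible constant in (2.48), whichever of the two usual equivalent normalizations of the $W^{1,p}$-norm is used.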
REMARK 2.5.23 The embedding
\[
W^{1,1}(T) \subseteq C\big(\overline{T}\big)
\]
is continuous but never compact even if the open interval $T$ is bounded. If $T \subseteq \mathbb{R}$ is a bounded, open interval and $\{u_n\}_{n \ge 1} \subseteq W^{1,1}(T)$ is a bounded sequence, then we can extract a subsequence $\{u_{n_k}\}_{k \ge 1}$, such that $\{u_{n_k}(t)\}_{k \ge 1}$ converges for every $t \in T$ (see Denkowski, Migórski & Papageorgiou (2003a, p. 229)). Also if $T \subseteq \mathbb{R}$ is an unbounded, open interval and $p \in (1, +\infty]$, then the embedding $W^{1,p}(T) \subseteq L^\infty(T)$ is continuous, but not compact.

To the equivalent Sobolev norms mentioned in Proposition 2.5.8, we can add one more.

THEOREM 2.5.24
(a) If $T \subseteq \mathbb{R}$ is a bounded, open interval and $r \in [1, +\infty]$, then
\[
|||u|||_{W^{1,p}(T)} \stackrel{df}{=} \|u\|_r + \|u'\|_p
\]
is equivalent to the usual norm $\|\cdot\|_{W^{1,p}(T)}$ on $W^{1,p}(T)$;
(b) If $Z \subseteq \mathbb{R}^N$ is a bounded, open set with the cone property and $p \in [1, +\infty)$, then
\[
|||u|||_{W^{1,p}(Z)} \stackrel{df}{=} \|u\|_r + \|Du\|_p
\]
is equivalent to the usual norm $\|\cdot\|_{W^{1,p}(Z)}$ on $W^{1,p}(Z)$ provided that $r \in [1, p^*]$ if $p < N$, $r \in [1, +\infty)$ if $p = N$ and $r \in [1, +\infty]$ if $p > N$.

Now let $Z \subseteq \mathbb{R}^N$ be a bounded, open set which is Lipschitz. Consider the Banach space $M(Z)$ of Radon measures on $Z$ with the total variation norm. Recall that $M(Z) = \big(C_0(Z)\big)^*$ (see Theorem 2.3.41). From Theorem 2.4.17 (see also Remark 2.5.18), we know that if $r > N$, then the embedding
\[
W_0^{1,r}(Z) \subseteq C_0\big(\overline{Z}\big) = \big\{ u \in C\big(\overline{Z}\big) : u|_{\partial Z} = 0 \big\}
\]
is continuous and dense. So by virtue of Lemma 2.2.27(a), we have that
\[
M(Z) \subseteq W^{-1,r'}(Z),
\]
with $\frac{1}{r} + \frac{1}{r'} = 1$ (see Theorem 2.4.57). This observation is crucial in proving the next compactness result.
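The range of dual exponents obtained in this way can be read off directly from the relation between $r$ and $r'$. The short computation below is an added illustration (not from the text), assuming $N \ge 2$; it explains where the bound $1^* = \frac{N}{N-1}$ of Theorem 2.5.25 comes from, where the dual exponent is itself called $r$.

```latex
% Illustrative computation (not from the source), for N \ge 2.
% The conjugate exponent r' = r/(r-1) is strictly decreasing in r on (1, +\infty), and
\[
r > N
\iff \frac{1}{r} < \frac{1}{N}
\iff \frac{1}{r'} = 1 - \frac{1}{r} > \frac{N-1}{N}
\iff r' < \frac{N}{N-1} .
\]
% Hence, as r ranges over (N, +\infty], the dual exponent r' ranges over
\[
r' \in \Big[ 1,\ \frac{N}{N-1} \Big) .
\]
```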
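THEOREM 2.5.25 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set with the cone property and $\{\mu_n\}_{n \ge 1} \subseteq M(Z)$ is a bounded sequence, then the sequence $\{\mu_n\}_{n \ge 1}$ is relatively compact in $W^{-1,r}(Z)$ for every $r \in [1, 1^*)$, where $1^* \stackrel{df}{=} \frac{N}{N-1}$.

PROOF By virtue of Theorem 2.3.46, we can find a subsequence $\{\mu_{n_k}\}_{k \ge 1}$ of the sequence $\{\mu_n\}_{n \ge 1}$ and $\mu \in M(Z)$, such that
\[
\mu_{n_k} \xrightarrow{w} \mu \qquad \text{in } M(Z).
\]
Let $r' > 0$ be the conjugate exponent of $r \in [1, 1^*)$, i.e., $\frac{1}{r} + \frac{1}{r'} = 1$. Then $r' > N$ and so by Theorem 2.5.17(c), the embedding $W_0^{1,r'}(Z) \subseteq C_0\big(\overline{Z}\big)$ is compact. Let $\overline{B}_1(0)$ be the closed unit ball in $W_0^{1,r'}(Z)$. We see that $\overline{B}_1(0)$ is compact in $C_0\big(\overline{Z}\big)$. So given $\varepsilon > 0$ we can find a finite sequence $\{u_i\}_{i=1}^{m(\varepsilon)}$, such that for every $u \in \overline{B}_1(0)$, we have
\[
\min_{1 \le i \le m(\varepsilon)} \|u - u_i\|_{C_0(\overline{Z})} < \varepsilon. \tag{2.49}
\]
So, if $u \in \overline{B}_1(0)$, for some $i \in \{1, \ldots, m(\varepsilon)\}$, we have
\[
\begin{aligned}
\left| \int_Z u \, d\mu_{n_k} - \int_Z u \, d\mu \right|
&\le \left| \int_Z (u - u_i) \, d\mu_{n_k} \right| + \left| \int_Z u_i \, d\mu_{n_k} - \int_Z u_i \, d\mu \right| + \left| \int_Z (u_i - u) \, d\mu \right| \\
&\le 2 \varepsilon \sup_{k \ge 1} |\mu_{n_k}|(Z) + \left| \int_Z u_i \, d\mu_{n_k} - \int_Z u_i \, d\mu \right|.
\end{aligned}
\]
Since
\[
\mu_{n_k} \xrightarrow{w} \mu \qquad \text{as } k \to +\infty,
\]
we have that
\[
\left| \int_Z u_i \, d\mu_{n_k} - \int_Z u_i \, d\mu \right| \longrightarrow 0 \qquad \text{as } k \to +\infty.
\]
Therefore, we conclude that
\[
\lim_{k \to +\infty} \sup_{u \in \overline{B}_1(0)} \left| \int_Z u \, d\mu_{n_k} - \int_Z u \, d\mu \right| = 0,
\]
so
\[
\mu_{n_k} \longrightarrow \mu \qquad \text{in } W^{-1,r}(Z).
\]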
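Next we sharpen both the Rellich-Kondrachov theorem (see Theorem 2.5.17) and Egorov's theorem (see Theorem A.2.10). We show that a sequence bounded in $W^{1,r}(Z)$ ($r \in (1, N)$) has a subsequence converging uniformly outside a very small set. The set is not only small in the Lebesgue measure (as Egorov's theorem postulates; see Theorem A.2.10), but it is also small in $p$-capacity, for $p \in [1, r)$. First we introduce a notion, which is useful in the study of the pointwise properties of Sobolev functions.

DEFINITION 2.5.26 Suppose that $u \in L^1_{loc}\big(\mathbb{R}^N\big)$. Then
\[
u^*(z) \stackrel{df}{=}
\begin{cases}
\displaystyle \lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} u(y) \, dy & \text{if the limit exists,} \\[2ex]
0 & \text{otherwise}
\end{cases}
\]
is the precise representative of $u$.

REMARK 2.5.27 If $u, v \in L^1_{loc}\big(\mathbb{R}^N\big)$ and
\[
u(z) = v(z) \qquad \text{for } \lambda^N\text{-almost all } z \in \mathbb{R}^N,
\]
then
\[
u^*(z) = v^*(z) \qquad \forall\, z \in \mathbb{R}^N.
\]
Moreover, in view of the Lebesgue differentiation theorem (see Theorem 1.4.6), the limit in the definition of $u^*$ (see Definition 2.5.26) exists for $\lambda^N$-almost all $z \in Z$. In the next theorem, we identify each function in the Sobolev space $W^{1,r}(Z)$ with its precise representative.

THEOREM 2.5.28 If $Z \subseteq \mathbb{R}^N$ is a bounded, open set which is Lipschitz, $r \in (1, N)$ and $\{u_n\}_{n \ge 1} \subseteq W^{1,r}(Z)$ is bounded, then there exist a subsequence $\{u_{n_k}\}_{k \ge 1}$ of $\{u_n\}_{n \ge 1}$ and $u \in W^{1,r}(Z)$, such that for every $p \in [1, r)$ and every $\delta > 0$, there exists a relatively closed set $A_\delta \subseteq Z$, such that
\[
\mathrm{cap}_p\big(Z \setminus A_\delta\big) \le \delta
\qquad \text{and} \qquad
u_{n_k} \longrightarrow u \quad \text{uniformly on } A_\delta.
\]

PROOF We may assume that $u_n \in W_0^{1,r}(Z)$ for all $n \ge 1$. Indeed if this is not the case we choose a bounded, open set $U \supseteq \overline{Z} \supseteq Z$ and a cut-off function $\varphi \in C_c\big(\mathbb{R}^N\big)$, such that
\[
\varphi|_Z = 1 \qquad \text{and} \qquad \varphi|_{U^c} = 0.
\]
If $u \in W^{1,r}(Z)$ and $E(u) \in W^{1,r}(U)$ is the extension (see Theorem 2.4.55), then $\varphi E(u) \in W_0^{1,r}(U)$. Because by hypothesis the sequence $\{u_n\}_{n \ge 1} \subseteq W_0^{1,r}(Z)$ is bounded, by passing to a subsequence if necessary, we may assume that
\[
u_n \xrightarrow{w} u \quad \text{in } W_0^{1,r}(Z),
\qquad
u_n \longrightarrow u \quad \text{in } L^r(Z),
\]
with $u \in W_0^{1,r}(Z)$ (see Theorem 2.5.17(a)). Fix $\delta, \varepsilon > 0$ and let
\[
C_n^\varepsilon \stackrel{df}{=} \big\{ z \in Z : |u_n(z) - u(z)| \ge \varepsilon \big\}
\qquad \text{and} \qquad
h_n^\varepsilon \stackrel{df}{=} \frac{2}{\varepsilon} \Big( |u_n - u| - \frac{\varepsilon}{2} \Big)^+.
\]
Note that $h_n^\varepsilon \in W_0^{1,r}(Z)$ (see Proposition 2.4.33) and $h_n^\varepsilon \ge 1$ on $C_n^\varepsilon$. From Definition 1.6.1(d), by Hölder's and Poincaré's inequalities (see Theorem A.2.27), for $p \in [1, r)$, we have
\[
\begin{aligned}
\mathrm{cap}_p\big(C_n^\varepsilon\big)
&\le \big\| Dh_n^\varepsilon \big\|_p^p \\
&\le \Big( \frac{2}{\varepsilon} \Big)^p \Big( \lambda^N \Big\{ |u_n - u| > \frac{\varepsilon}{2} \Big\} \Big)^{1 - \frac{p}{r}} \big( \|Du_n\|_r^r + \|Du\|_r^r \big)^{\frac{p}{r}} \\
&\le c(\varepsilon)\, \|u_n - u\|_{L^r(Z)}^{r-p}.
\end{aligned}
\]
We choose a subsequence $\{u_{n_k}\}_{k \ge 1}$ of $\{u_n\}_{n \ge 1}$, such that
\[
\sum_{k=1}^{\infty} \|u_{n_k} - u\|_r^{r-p} < +\infty.
\]
We set
\[
D_m^i \stackrel{df}{=} \bigcup_{k=m}^{\infty} C_{n_k}^{\frac{1}{i}}.
\]
Then since $\mathrm{cap}_p$ is an outer measure on $\mathbb{R}^N$ (see Theorem 1.6.10), we have
\[
\mathrm{cap}_p\big(D_m^i\big) \le \sum_{k=m}^{\infty} \mathrm{cap}_p\Big( C_{n_k}^{\frac{1}{i}} \Big)
\le c\Big( \frac{1}{i} \Big) \sum_{k=m}^{\infty} \|u_{n_k} - u\|_r^{r-p} < \frac{\delta}{2^{i+1}},
\]
provided that $m = m(i) \ge 1$ is large enough. From Theorem 1.6.11(a), we know that we can find an open set $V_m^i \supseteq D_m^i$ with
\[
\mathrm{cap}_p\big(V_m^i\big) < \frac{\delta}{2^{i}}.
\]
Set
\[
A_\delta \stackrel{df}{=} Z \setminus \bigcup_{i=1}^{\infty} V_{m(i)}^i.
\]
Then $A_\delta \subseteq Z$ is relatively closed,
\[
\mathrm{cap}_p\big(Z \setminus A_\delta\big) \le \sum_{i=1}^{\infty} \mathrm{cap}_p\big( V_{m(i)}^i \big) \le \delta
\]
and
\[
u_{n_k} \longrightarrow u \qquad \text{uniformly on } A_\delta.
\]

REMARK 2.5.29 The result in general fails if $p = r$.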
Now we shall discuss how the continuous embedding
\[
W^{1,p}(Z) \subseteq L^{p^*}(Z)
\]
(with $Z \subseteq \mathbb{R}^N$ and $p \in [1, N)$) fails to be compact (see Theorem 2.5.17). It has to do with the so-called "concentration phenomena." To start having an idea about such situations, recall Proposition 2.3.38. There we saw that if
\[
u_n \xrightarrow{w} u \qquad \text{in } L^1(\mathbb{R})
\]
and oscillates rapidly around its weak limit, then the sequence $\{u_n\}_{n \ge 1}$ cannot converge strongly in $L^1(Z)$. Moreover, if $p \in (1, +\infty)$,
\[
u_n \xrightarrow{w} u \qquad \text{in } L^p(Z)
\]
and we also know that
\[
u_n(z) \longrightarrow u(z) \qquad \text{for a.a. } z \in Z;
\]
still we cannot in general deduce strong convergence in $L^p(Z)$. The problem is that the mass of $|u_n - u|^p$ may coalesce onto a set of zero Lebesgue measure. This is the problem of "concentration." For this reason, in contrast to the case $p = 1$, for $p > 1$ the best constant in the Sobolev inequality (see Theorem 2.5.3) is never achieved when $Z \subseteq \mathbb{R}^N$, $Z \ne \mathbb{R}^N$ is an open set which is Lipschitz and in particular is never achieved on a bounded Lipschitz domain. For this reason our analysis will be for $Z = \mathbb{R}^N$. So let $N > 1$ and let $p \in [1, N)$ and consider
\[
D^{1,p}(\mathbb{R}^N) \stackrel{df}{=} \big\{ u \in L^{p^*}(\mathbb{R}^N) : Du \in L^p\big(\mathbb{R}^N; \mathbb{R}^N\big) \big\},
\]
(where $p^* \stackrel{df}{=} \frac{Np}{N-p}$) furnished with the norm
\[
\|u\|_{D^{1,p}(\mathbb{R}^N)} = \|Du\|_p.
\]
That this is a norm on $D^{1,p}(\mathbb{R}^N)$ follows from the Sobolev-Nirenberg-Gagliardo inequality (see Theorem 2.5.3; normed this way $D^{1,p}(\mathbb{R}^N)$ is a separable Banach space, Hilbert space if $p = 2$). The best constant $c > 0$ in that inequality is given by
\[
(c^{-1})^p = S = \inf_{\substack{u \in D^{1,p}(\mathbb{R}^N) \\ u \ne 0}} \frac{\|Du\|_p^p}{\|u\|_{p^*}^p}
= \inf_{\substack{u \in D^{1,p}(\mathbb{R}^N) \\ \|u\|_{p^*} = 1}} \|Du\|_p^p.
\]
The question is whether this infimum is realized by an element in $D^{1,p}(\mathbb{R}^N)$. So consider a minimizing sequence $\{u_n\}_{n \ge 1} \subseteq D^{1,p}(\mathbb{R}^N)$, i.e.,
\[
\|Du_n\|_p^p \longrightarrow S, \qquad \text{with } \|u_n\|_{p^*} = 1 \quad \forall\, n \ge 1.
\]
By passing to a subsequence if necessary, we may assume that
\[
u_n \xrightarrow{w} u \qquad \text{in } D^{1,p}(\mathbb{R}^N)
\]
and so
\[
\|Du\|_p^p \le \liminf_{n \to +\infty} \|Du_n\|_p^p.
\]
This $u$ is a minimizer provided that $\|u\|_{p^*} = 1$. But since
\[
u_n \xrightarrow{w} u \qquad \text{in } L^{p^*}(\mathbb{R}^N),
\]
we only know that $\|u\|_{p^*} \le 1$. Note that if $v \in D^{1,p}(\mathbb{R}^N)$, $y \in \mathbb{R}^N$ and $\lambda > 0$, then the rescaled function
\[
v^{y,\lambda}(z) = \lambda^{\frac{N-p}{p}}\, v(\lambda z + y)
\]
satisfies
\[
\big\| Dv^{y,\lambda} \big\|_p = \|Dv\|_p
\qquad \text{and} \qquad
\big\| v^{y,\lambda} \big\|_{p^*} = \|v\|_{p^*}.
\]
So the problem is invariant under translations and dilations. In order to avoid noncompactness of the minimizing sequence (hence achieve $\|u\|_{p^*} = 1$) we need the following result, known as the Concentration-Compactness Lemma. In what follows we shall regard $L^1(\mathbb{R}^N)$ in a natural way as a subset of $M(\mathbb{R}^N)$, by associating to $u \in L^1(\mathbb{R}^N)$ the measure
\[
\mu(A) = \int_A u(z) \, dz,
\]
i.e., $d\mu = u \, dz$.
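The two invariance identities for the rescaled function $v^{y,\lambda}$ follow from a change of variables. The verification below is an added sketch, not part of the original text; the substitution variable $w = \lambda z + y$ is introduced only for this computation.

```latex
% Illustrative verification (not from the source) of the scaling invariance.
% Substitution: w = \lambda z + y, so dz = \lambda^{-N} dw.
% Gradient norm: Dv^{y,\lambda}(z) = \lambda^{\frac{N-p}{p}} \lambda \, Dv(\lambda z + y)
%              = \lambda^{\frac{N}{p}} Dv(\lambda z + y), hence
\[
\int_{\mathbb{R}^N} \big\| Dv^{y,\lambda}(z) \big\|^p \, dz
= \lambda^{N} \int_{\mathbb{R}^N} \big\| Dv(\lambda z + y) \big\|^p \, dz
= \int_{\mathbb{R}^N} \| Dv(w) \|^p \, dw .
\]
% Critical norm: since \frac{N-p}{p} \, p^* = \frac{N-p}{p} \cdot \frac{Np}{N-p} = N,
\[
\int_{\mathbb{R}^N} \big| v^{y,\lambda}(z) \big|^{p^*} dz
= \lambda^{N} \int_{\mathbb{R}^N} | v(\lambda z + y) |^{p^*} dz
= \int_{\mathbb{R}^N} | v(w) |^{p^*} dw .
\]
```

The exponent $\frac{N-p}{p}$ is thus exactly the power of $\lambda$ that balances the Jacobian factor $\lambda^{-N}$ in both integrals simultaneously, which is possible only for the critical exponent $p^*$.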
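THEOREM 2.5.30 If $p \in [1, N)$,
\[
u_n \xrightarrow{w} u \quad \text{in } D^{1,p}(\mathbb{R}^N),
\qquad
\widehat{\mu}_n = \|Du_n\|_{\mathbb{R}^N}^p \xrightarrow{w} \mu
\quad \text{and} \quad
\widehat{\nu}_n = |u_n|^{p^*} \xrightarrow{w} \nu \quad \text{in } M(\mathbb{R}^N),
\]
with $\mu, \nu \in M(\mathbb{R}^N)$, $\mu, \nu \ge 0$, then
(a) there exists an at most countable index set $I$, a family $\{z_i\}_{i \in I}$ of distinct points in $\mathbb{R}^N$ and nonnegative numbers $\{\mu_i, \nu_i\}_{i \in I}$, such that
\[
\mu \ge \|Du\|_{\mathbb{R}^N}^p + \sum_{i \in I} \mu_i \delta_{z_i}
\qquad \text{and} \qquad
\nu = |u|^{p^*} + \sum_{i \in I} \nu_i \delta_{z_i};
\]
(b) for all $i \in I$,
\[
S\, \nu_i^{\frac{p}{p^*}} \le \mu_i
\]
and in particular
\[
\sum_{i \in I} \nu_i^{\frac{p}{p^*}} < +\infty.
\]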
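PROOF First suppose that $u = 0$. Let $\vartheta \in C_c^\infty\big(\mathbb{R}^N\big)$. We have
\[
\int_{\mathbb{R}^N} |\vartheta u_n|^{p^*} \, dz \le S^{-\frac{p^*}{p}} \left( \int_{\mathbb{R}^N} \|D(\vartheta u_n)\|^p \, dz \right)^{\frac{p^*}{p}} \qquad \forall\, n \ge 1. \tag{2.50}
\]
Since
\[
|u_n|^{p^*} \xrightarrow{w} \nu \qquad \text{in } M(\mathbb{R}^N),
\]
we have that
\[
\int_{\mathbb{R}^N} |\vartheta u_n|^{p^*} \, dz \longrightarrow \int_{\mathbb{R}^N} |\vartheta|^{p^*} \, d\nu. \tag{2.51}
\]
Also, using the facts that
\[
u_n \longrightarrow u \quad \text{in } L^p_{loc}\big(\mathbb{R}^N\big)
\]
(see Theorem 2.5.17(a)) and
\[
\|Du_n\|_{\mathbb{R}^N}^p \xrightarrow{w} \mu \qquad \text{in } M(\mathbb{R}^N),
\]
we have that
\[
\liminf_{n \to +\infty} \left( \int_{\mathbb{R}^N} \|D(\vartheta u_n)\|_{\mathbb{R}^N}^p \, dz \right)^{\frac{p^*}{p}}
= \liminf_{n \to +\infty} \left( \int_{\mathbb{R}^N} |\vartheta|^p \|Du_n\|_{\mathbb{R}^N}^p \, dz \right)^{\frac{p^*}{p}}
= \left( \int_{\mathbb{R}^N} |\vartheta|^p \, d\mu \right)^{\frac{p^*}{p}}. \tag{2.52}
\]
From (2.50), (2.51) and (2.52), we infer that in the limit as $n \to +\infty$, we have
\[
\left( \int_{\mathbb{R}^N} |\vartheta|^{p^*} \, d\nu \right)^{\frac{1}{p^*}} \le S^{-\frac{1}{p}} \left( \int_{\mathbb{R}^N} |\vartheta|^p \, d\mu \right)^{\frac{1}{p}} \qquad \forall\, \vartheta \in C_c^\infty\big(\mathbb{R}^N\big). \tag{2.53}
\]
From (2.53), it follows that for all compact sets $K \subseteq \mathbb{R}^N$, we have
\[
\nu(K)^{\frac{1}{p^*}} \le S^{-\frac{1}{p}} \mu(K)^{\frac{1}{p}}. \tag{2.54}
\]
Because the measures are Radon, from (2.54), we deduce that
\[
\nu(A)^{\frac{1}{p^*}} \le S^{-\frac{1}{p}} \mu(A)^{\frac{1}{p}} \qquad \text{for all Borel sets } A \subseteq \mathbb{R}^N. \tag{2.55}
\]
From Saks Lemma (see Theorem A.2.13), we know that
\[
\mu = \mu_0 + \mu_a,
\]
with $\mu_0$ nonatomic measure, $\mu_0 \ge 0$, $\mu_a$ purely atomic and
\[
\mu_a = \sum_{i \in I} \mu_i \delta_{z_i},
\]
where $I$ is a countable index set, $\{\mu_i\}_{i \in I} \subseteq \mathbb{R}_+ \setminus \{0\}$ and $\{z_i\}_{i \in I} \subseteq \mathbb{R}^N$. Because of (2.55), we see that $\nu \ll \mu$ and so from the Radon-Nikodym theorem (see Theorem A.2.24), we have that
\[
\nu(A) = \int_A \frac{d\nu}{d\mu} \, d\mu \qquad \text{for all Borel sets } A \subseteq \mathbb{R}^N, \tag{2.56}
\]
where
\[
\frac{d\nu}{d\mu} \in L^1(\mathbb{R}^N; \mu)
\]
is the Radon-Nikodym derivative (see Remark A.2.25). So
\[
\frac{d\nu}{d\mu}(z) = \lim_{r \searrow 0} \frac{\nu(\overline{B}_r(z))}{\mu(\overline{B}_r(z))} \qquad \text{for } \mu\text{-a.a. } z \in \mathbb{R}^N. \tag{2.57}
\]
From (2.55), we see that
\[
\frac{\nu(\overline{B}_r(z))}{\mu(\overline{B}_r(z))} \le S^{-\frac{p^*}{p}} \mu\big(\overline{B}_r(z)\big)^{\frac{p}{N-p}}, \tag{2.58}
\]
provided $\mu\big(\overline{B}_r(z)\big) \ne 0$. From (2.57) and (2.58), it follows that
\[
\frac{d\nu}{d\mu}(z) = 0 \qquad \text{for a.a. } z \in \operatorname{supp} \mu_0 = \mathbb{R}^N \setminus \{z_i\}_{i \in I}. \tag{2.59}
\]
Set
\[
\nu_i = \frac{d\nu}{d\mu}(z_i)\, \mu_i.
\]
From (2.56), (2.57), (2.58) and (2.59), we see that the theorem holds when $u = 0$.

Now suppose that $u \ne 0$. Set
\[
w_n \stackrel{df}{=} u_n - u.
\]
Then the previous calculations apply to $\{w_n\}_{n \ge 1}$. Moreover, by virtue of Proposition 2.3.49, we have
\[
\|Dw_n\|_p^p = \|Du_n\|_p^p - \|Du\|_p^p + \varepsilon_n \qquad \text{with } \varepsilon_n \searrow 0,
\]
so
\[
\|Dw_n\|_{\mathbb{R}^N}^p \xrightarrow{w} \mu - \|Du\|_{\mathbb{R}^N}^p = \widehat{\mu} \ge 0 \qquad \text{in } M(\mathbb{R}^N).
\]
Similarly, we have
\[
|w_n|^{p^*} \longrightarrow \nu - |u|^{p^*} = \widehat{\nu} \qquad \text{in } M(\mathbb{R}^N).
\]
So, we are back to the case $u = 0$, with $u_n$ replaced by $w_n$, $\mu$ replaced by $\widehat{\mu}$ and $\nu$ replaced by $\widehat{\nu}$.

COROLLARY 2.5.31 If $p \in [1, +\infty)$,
\[
u_n \xrightarrow{w} 0 \quad \text{in } D^{1,p}(\mathbb{R}^N),
\qquad
\mu_n = \|Du_n\|_{\mathbb{R}^N}^p \xrightarrow{w} \mu
\quad \text{and} \quad
\nu_n = |u_n|^{p^*} \xrightarrow{w} \nu \quad \text{in } M(\mathbb{R}^N),
\]
with $\mu, \nu \ge 0$ and
\[
\mu(\mathbb{R}^N)^{\frac{1}{p}} \le S^{\frac{1}{p}}\, \nu(\mathbb{R}^N)^{\frac{1}{p^*}},
\]
then $\nu$ is a Dirac measure.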
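PROOF From (2.55) and the hypotheses, we see that
\[
\mu(\mathbb{R}^N)^{\frac{1}{p}} = S^{\frac{1}{p}}\, \nu(\mathbb{R}^N)^{\frac{1}{p^*}}.
\]
Also from (2.53), we have that
\[
\left( \int_{\mathbb{R}^N} |\vartheta|^{p^*} \, d\nu \right)^{\frac{1}{p^*}} \le S^{-\frac{1}{p}}\, \mu(\mathbb{R}^N)^{\frac{1}{N}} \left( \int_{\mathbb{R}^N} |\vartheta|^{p^*} \, d\mu \right)^{\frac{1}{p^*}} \qquad \forall\, \vartheta \in C_c^\infty\big(\mathbb{R}^N\big).
\]
Thus we infer that
\[
\nu = S^{-\frac{p^*}{p}}\, \mu(\mathbb{R}^N)^{\frac{p}{N-p}}\, \mu
\]
and (2.53) becomes
\[
\left( \int_{\mathbb{R}^N} |\vartheta|^{p^*} \, d\nu \right)^{\frac{1}{p^*}} \big( \nu(\mathbb{R}^N) \big)^{\frac{1}{N}} \le \left( \int_{\mathbb{R}^N} |\vartheta|^{p} \, d\nu \right)^{\frac{1}{p}},
\]
so
\[
\nu(A)^{\frac{1}{p^*}}\, \nu(\mathbb{R}^N)^{\frac{1}{N}} \le \nu(A)^{\frac{1}{p}} \qquad \text{for all Borel sets } A \subseteq \mathbb{R}^N.
\]
But this is impossible if $\nu$ is not a Dirac measure.

We return to the problem of determining the best Sobolev constant, i.e.,
\[
S = \inf_{\substack{u \in D^{1,p}(\mathbb{R}^N) \\ \|u\|_{p^*} = 1}} \|Du\|_p^p. \tag{2.60}
\]
We have the following existence result for problem (2.60). The result is due to Lions (1985a, 1985a) and its proof, which also uses Theorem 2.5.30, can be found there.

THEOREM 2.5.32 If $p \in (1, N)$ and $\{u_n\}_{n \ge 1} \subseteq D^{1,p}(\mathbb{R}^N)$ is a minimizing sequence for problem (2.60), then $\{u_n\}_{n \ge 1}$ up to translation and dilation is relatively compact in $D^{1,p}(\mathbb{R}^N)$, i.e., there exists a sequence $\{(z_n, \lambda_n)\}_{n \ge 1} \subseteq \mathbb{R}^N \times (\mathbb{R}_+ \setminus \{0\})$, such that the sequence
\[
u_n^{z_n, \lambda_n}(z) = \lambda_n^{\frac{N-p}{p}}\, u_n(\lambda_n z + z_n) \qquad \forall\, z \in \mathbb{R}^N
\]
is relatively compact in $D^{1,p}(\mathbb{R}^N)$.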
REMARK 2.5.33 If $p = 2$, the function
\[
u(z) \stackrel{df}{=} \frac{[N(N-2)]^{\frac{N-2}{4}}}{\big(1 + \|z\|_{\mathbb{R}^N}^2\big)^{\frac{N-2}{2}}}
\]
is a minimizer for problem (2.60) (see Aubin (1976) and Talenti (1976)). If $Z \subseteq \mathbb{R}^N$ is any open set (not necessarily equal to $\mathbb{R}^N$) and by $S(Z)$ we denote the value of problem (2.60) when $\mathbb{R}^N$ is replaced by $Z$, then $S(Z) = S$, but $S(Z)$ is never attained if $Z \ne \mathbb{R}^N$. Finally if $p = 1$, then the best constant for the embedding
\[
D^{1,1}(\mathbb{R}^N) \subseteq L^{\frac{N}{N-1}}(\mathbb{R}^N)
\]
is attained on the characteristic functions of balls, that is in $BV_{loc}(\mathbb{R}^N)$ (see Section 2.6).

Since we are in the business of determining sharp constants in inequalities, let us check to see what happens with the constant in the Poincaré-Wirtinger inequality (see Theorem 2.5.21) for Sobolev functions of one variable. Let $T = (0, b)$ ($b < +\infty$) and $p \in (1, +\infty)$. We introduce the space
\[
W^{1,p}_{per}\big(T; \mathbb{R}^N\big) \stackrel{df}{=} \big\{ u \in W^{1,p}\big(T; \mathbb{R}^N\big) : u(0) = u(b) \big\}.
\]
From Theorem 2.5.22(b), we have that $W^{1,p}_{per}\big(T; \mathbb{R}^N\big)$ is embedded continuously (in fact compactly) in $C\big(\overline{T}; \mathbb{R}^N\big)$. Therefore the evaluations at $t = b$ and $t = 0$ make sense.

PROPOSITION 2.5.34 If $u \in W^{1,p}_{per}\big(T; \mathbb{R}^N\big)$ ($p \in (1, +\infty)$) and $\int_0^b u(t) \, dt = 0$, then
\[
\|u\|_\infty \le b^{\frac{1}{p'}} \|u'\|_p.
\]

PROOF Arguing on each component separately, we may assume without any loss of generality that $N = 1$. Then from the mean value theorem for integrals, we can find $\tau \in T = (0, b)$, such that
\[
u(\tau) = \frac{1}{b} \int_0^b u(s) \, ds = 0.
\]
By Hölder's inequality (see Theorem A.2.27), with $\frac{1}{p} + \frac{1}{p'} = 1$, we have
\[
|u(t)| = \left| \int_\tau^t u'(s) \, ds \right| \le \int_0^b |u'(s)| \, ds \le b^{\frac{1}{p'}} \|u'\|_p \qquad \forall\, t \in [0, b],
\]
so
\[
\|u\|_\infty \le b^{\frac{1}{p'}} \|u'\|_p.
\]
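In the case of the Hilbert space $W^{1,2}_{per}\big(T; \mathbb{R}^N\big)$, we have the following sharp estimates.

PROPOSITION 2.5.35 If $u \in W^{1,2}_{per}\big(T; \mathbb{R}^N\big)$ and
\[
\int_0^b u(t) \, dt = 0,
\]
then
(a) $\|u\|_2^2 \le \dfrac{b^2}{4\pi^2} \|u'\|_2^2$;
(b) $\|u\|_\infty^2 \le \dfrac{b}{12} \|u'\|_2^2$.

PROOF Again we may assume that $N = 1$.

(a) We consider the Fourier expansion of $u$, i.e.,
\[
u(t) = \sum_{\substack{k = -\infty \\ k \ne 0}}^{+\infty} a_k \exp\Big( \frac{2 i \pi k t}{b} \Big).
\]
Parseval's equality implies that
\[
\|u'\|_2^2 = \sum_{\substack{k = -\infty \\ k \ne 0}}^{+\infty} b\, \frac{4\pi^2 k^2}{b^2} |a_k|^2
\ge \frac{4\pi^2}{b^2} \sum_{\substack{k = -\infty \\ k \ne 0}}^{+\infty} b\, |a_k|^2
= \frac{4\pi^2}{b^2} \|u\|_2^2.
\]

(b) Using the Cauchy-Schwarz-Bunyakowski inequality (see Proposition A.4.5 and Remark A.4.6), Parseval's equality and since
\[
\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6},
\]
for every $t \in [0, b]$, we have
\[
|u(t)|^2 \le \Bigg( \sum_{\substack{k = -\infty \\ k \ne 0}}^{+\infty} |a_k| \Bigg)^2
\le \Bigg( \sum_{\substack{k = -\infty \\ k \ne 0}}^{+\infty} \frac{b}{4\pi^2 k^2} \Bigg) \Bigg( \sum_{\substack{k = -\infty \\ k \ne 0}}^{+\infty} \frac{4\pi^2 k^2}{b} |a_k|^2 \Bigg)
= \frac{b}{12} \|u'\|_2^2.
\]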
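That the constant in part (a) cannot be improved can be seen on a single Fourier mode. The check below is an added illustration (not from the text), with $N = 1$ and the test function $u(t) = \sin(2\pi t / b)$ chosen here for the purpose.

```latex
% Illustrative check (not from the source): u(t) = sin(2 pi t / b) on T = (0,b).
% u(0) = u(b) = 0 and \int_0^b u(t) dt = 0, so u is admissible in Proposition 2.5.35.
\[
\|u\|_2^2 = \int_0^b \sin^2\frac{2\pi t}{b} \, dt = \frac{b}{2},
\qquad
\|u'\|_2^2 = \frac{4\pi^2}{b^2} \int_0^b \cos^2\frac{2\pi t}{b} \, dt = \frac{2\pi^2}{b} .
\]
% Part (a) holds with equality:
\[
\frac{b^2}{4\pi^2} \|u'\|_2^2 = \frac{b^2}{4\pi^2} \cdot \frac{2\pi^2}{b} = \frac{b}{2} = \|u\|_2^2 .
\]
% Part (b) holds with strict inequality for this particular u:
\[
\|u\|_\infty^2 = 1 < \frac{\pi^2}{6} = \frac{b}{12} \cdot \frac{2\pi^2}{b} = \frac{b}{12} \|u'\|_2^2 .
\]
```

Equality in (a) reflects the fact that only the lowest modes $k = \pm 1$ are present, which is exactly the case in which the inequality $k^2 \ge 1$ used in the proof is an equality.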
2.6 Fine Properties of Functions and BV-Functions
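In this section we establish some further differentiability properties of Sobolev functions and also introduce the space of functions of bounded variation ($BV$-functions) and establish some of their basic properties. We start with a result on the $L^{p^*}$-differentiability of Sobolev functions.

PROPOSITION 2.6.1 If $u \in W^{1,p}_{loc}(\mathbb{R}^N)$ with $p \in [1, N)$, then for $\lambda^N$-almost all $z \in \mathbb{R}^N$, we have
\[
\left( \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \big| u(y) - u(z) - (Du(z), y - z)_{\mathbb{R}^N} \big|^{p^*} \, dy \right)^{\frac{1}{p^*}} = o(r) \qquad \text{as } r \searrow 0.
\]

PROOF From Theorem 1.4.6, we know that for $\lambda^N$-almost all $z \in \mathbb{R}^N$, we have
\[
\lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} |u(y) - u(z)|^p \, dy = 0
\]
and
\[
\lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \|Du(y) - Du(z)\|_{\mathbb{R}^N}^p \, dy = 0.
\]
We fix such a point $z \in \mathbb{R}^N$ (known as a Lebesgue point for the functions $u$ and $Du$). Clearly exploiting the translation invariance of the Lebesgue measure $\lambda^N$, we can take $z = 0$. We choose $\vartheta \in C_c^1\big(\overline{B}_r(0)\big)$ with $\|\vartheta\|_{p'} \le 1$ (here $\frac{1}{p} + \frac{1}{p'} = 1$). Let $\varphi$ be a mollifier (see Definition 2.4.10) and for every $\varepsilon > 0$, set
\[
u_\varepsilon \stackrel{df}{=} \varphi_\varepsilon \star u.
\]
Choose $y \in \overline{B}_r(0)$ and let $h(t) = u_\varepsilon(ty)$. Then
\[
h(1) = h(0) + \int_0^1 h'(s) \, ds,
\]
so
\[
u_\varepsilon(y) = u_\varepsilon(0) + \int_0^1 \big( Du_\varepsilon(sy), y \big)_{\mathbb{R}^N} \, ds
= u_\varepsilon(0) + \big( Du(0), y \big)_{\mathbb{R}^N} + \int_0^1 \big( Du_\varepsilon(sy) - Du(0), y \big)_{\mathbb{R}^N} \, ds. \tag{2.61}
\]
Using Fubini's theorem and a change of variables, we have
\[
\begin{aligned}
\frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \vartheta(y) \big( u_\varepsilon(y) - u_\varepsilon(0) - (Du(0), y)_{\mathbb{R}^N} \big) \, dy
&= \int_0^1 \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \vartheta(y) \big( Du_\varepsilon(sy) - Du(0), y \big)_{\mathbb{R}^N} \, dy \, ds \\
&= \int_0^1 \frac{1}{s\, \lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \vartheta\Big( \frac{y}{s} \Big) \big( Du_\varepsilon(y) - Du(0), y \big)_{\mathbb{R}^N} \, dy \, ds.
\end{aligned}
\]
Letting $\varepsilon \searrow 0$, in the limit we obtain
\[
\begin{aligned}
\frac{1}{\lambda^N(\overline{B}_r(0))} & \int_{\overline{B}_r(0)} \vartheta(y) \big( u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big) \, dy \\
&= \int_0^1 \frac{1}{s\, \lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \vartheta\Big( \frac{y}{s} \Big) \big( Du(y) - Du(0), y \big)_{\mathbb{R}^N} \, dy \, ds \\
&\le r \int_0^1 \left( \frac{1}{\lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \Big| \vartheta\Big( \frac{y}{s} \Big) \Big|^{p'} dy \right)^{\frac{1}{p'}}
\left( \frac{1}{\lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \|Du(y) - Du(0)\|_{\mathbb{R}^N}^p \, dy \right)^{\frac{1}{p}} ds.
\end{aligned}
\]
Note that
\[
\frac{1}{\lambda^N(\overline{B}_{rs}(0))} \int_{\overline{B}_{rs}(0)} \Big| \vartheta\Big( \frac{y}{s} \Big) \Big|^{p'} dy
= \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} |\vartheta(y)|^{p'} \, dy \le \frac{1}{a(N)\, r^N}
\]
(see Remark 1.3.22). So we obtain
\[
\frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \vartheta(y) \big( u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big) \, dy = o\Big( r^{1 - \frac{N}{p'}} \Big) \qquad \text{as } r \searrow 0.
\]
Taking the supremum over all $\vartheta$, we obtain
\[
\frac{1}{r^N} \left( \int_{\overline{B}_r(0)} \big| u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big|^p \, dy \right)^{\frac{1}{p}} = o\Big( r^{1 - \frac{N}{p'}} \Big) \qquad \text{as } r \searrow 0,
\]
so
\[
\left( \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \big| u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big|^p \, dy \right)^{\frac{1}{p}} = o(r) \qquad \text{as } r \searrow 0. \tag{2.62}
\]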
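Set
\[
h(y) \stackrel{df}{=} u(y) - u(0) - \big( Du(0), y \big)_{\mathbb{R}^N},
\]
so $h \in W^{1,p}(B_r(0))$ and consider its extension $E(h) \in W^{1,p}(\mathbb{R}^N)$. We have
\[
\|E(h)\|_{W^{1,p}(\mathbb{R}^N)} \le c_1 \|h\|_{W^{1,p}(B_r(0))}, \tag{2.63}
\]
for some $c_1 > 0$ (see Theorem 2.4.55). Then, via Sobolev's inequality (see Theorem 2.5.3) and (2.63), we have
\[
\begin{aligned}
\left( \int_{\overline{B}_r(0)} |h(y)|^{p^*} \, dy \right)^{\frac{1}{p^*}}
&\le \left( \int_{\mathbb{R}^N} |E(h)(y)|^{p^*} \, dy \right)^{\frac{1}{p^*}} \\
&\le c_2 \left( \int_{\mathbb{R}^N} \|DE(h)(y)\|_{\mathbb{R}^N}^p \, dy \right)^{\frac{1}{p}} \\
&\le c_3 \left( \int_{\overline{B}_r(0)} \big( |h(y)|^p + \|Dh(y)\|_{\mathbb{R}^N}^p \big) \, dy \right)^{\frac{1}{p}}.
\end{aligned} \tag{2.64}
\]
Therefore using (2.62) and (2.64), we conclude that
\[
\begin{aligned}
\left( \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \big| u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big|^{p^*} \, dy \right)^{\frac{1}{p^*}}
&\le c_4\, r \left( \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \|Du(y) - Du(0)\|_{\mathbb{R}^N}^p \, dy \right)^{\frac{1}{p}} \\
&\quad + c_4 \left( \frac{1}{\lambda^N(\overline{B}_r(0))} \int_{\overline{B}_r(0)} \big| u(y) - u(0) - (Du(0), y)_{\mathbb{R}^N} \big|^{p} \, dy \right)^{\frac{1}{p}} \\
&= o(r) \qquad \text{as } r \searrow 0.
\end{aligned}
\]

For differentiability $\lambda^N$-almost everywhere, as probably expected, we consider the case $p \in (N, +\infty]$.

PROPOSITION 2.6.2 If $u \in W^{1,p}_{loc}(\mathbb{R}^N)$ with $p \in (N, +\infty]$, then $u$ is differentiable $\lambda^N$-almost everywhere and the derivative equals the distributional derivative $\lambda^N$-almost everywhere.

PROOF Since $W^{1,\infty}_{loc}(\mathbb{R}^N) \subseteq W^{1,p}_{loc}(\mathbb{R}^N)$ for any $p < +\infty$, we may assume that $p \in (N, +\infty)$. For $\lambda^N$-almost all $z \in \mathbb{R}^N$, we have
\[
\lim_{r \searrow 0} \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \|Du(y) - Du(z)\|_{\mathbb{R}^N}^p \, dy = 0. \tag{2.65}
\]
Choose $z \in \mathbb{R}^N$, such that (2.65) holds. Set
\[
h(y) \stackrel{df}{=} u(y) - u(z) - \big( Du(z), y - z \big)_{\mathbb{R}^N} \qquad \forall\, y \in \overline{B}_r(z).
\]
Using Morrey's inequality (see Theorem 2.5.12(a)), we have
\[
|h(y) - h(z)| \le c\, r \left( \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \|Dh(y)\|_{\mathbb{R}^N}^p \, dy \right)^{\frac{1}{p}},
\]
with $r = \|y - z\|_{\mathbb{R}^N}$. Since
\[
h(z) = 0 \qquad \text{and} \qquad Dh = Du - Du(z),
\]
using (2.65), we obtain
\[
\frac{\big| u(y) - u(z) - (Du(z), y - z)_{\mathbb{R}^N} \big|}{\|y - z\|_{\mathbb{R}^N}}
\le c \left( \frac{1}{\lambda^N(\overline{B}_r(z))} \int_{\overline{B}_r(z)} \|Dh(y)\|_{\mathbb{R}^N}^p \, dy \right)^{\frac{1}{p}} \longrightarrow 0 \qquad \text{as } y \to z,
\]
so $u$ is $\lambda^N$-almost everywhere differentiable and $\nabla u(z) = Du(z)$ for a.a. $z \in \mathbb{R}^N$.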
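Next we investigate the properties of a Sobolev function $u$ (or more exactly of its precise representative $u^*$ (see Definition 2.5.26)) along lines. In this direction we have the following result.

PROPOSITION 2.6.3
(a) If $u \in W^{1,p}_{loc}(\mathbb{R}^N)$, $p \in [1, +\infty)$, then for each $k \in \{1, \ldots, N\}$ the function
\[
u_k^*(z', t) = u^*\big( z_1, \ldots, z_{k-1}, t, z_{k+1}, \ldots, z_N \big)
\]
is locally absolutely continuous in $t$ for $\lambda^{N-1}$-almost all $z' = (z_i)_{i=1, i \ne k}^{N} \in \mathbb{R}^{N-1}$ (see Definition A.2.15(b)). Moreover $(u_k^*)' \in L^p_{loc}\big(\mathbb{R}^N\big)$.
(b) If $u \in L^p_{loc}\big(\mathbb{R}^N\big)$ and $u = h$ $\lambda^N$-almost everywhere where for each $k \in \{1, \ldots, N\}$, the function
\[
h_k(z', t) \stackrel{df}{=} h\big( z_1, \ldots, z_{k-1}, t, z_{k+1}, \ldots, z_N \big)
\]
is locally absolutely continuous in $t$ for $\lambda^{N-1}$-almost all $z' = (z_i)_{i=1, i \ne k}^{N} \in \mathbb{R}^{N-1}$ and $h_k' \in L^p_{loc}\big(\mathbb{R}^N\big)$, then $u \in W^{1,p}_{loc}(\mathbb{R}^N)$.

PROOF (a) Clearly we may assume that $k = N$. Set $u_\varepsilon = \varphi_\varepsilon \star u$ with $\{\varphi_\varepsilon\}_{\varepsilon > 0}$ being a family of mollifiers. We know that
\[
u_\varepsilon \longrightarrow u \qquad \text{in } W^{1,p}_{loc}(\mathbb{R}^N)
\]
(see Proposition 2.4.12(e)). For every $M > 0$ and $\lambda^{N-1}$-almost all $z' = (z_i)_{i=1}^{N-1} \in \mathbb{R}^{N-1}$, from Fubini's theorem, we have
\[
\int_{-M}^{M} \left( |u_\varepsilon(z', t) - u(z', t)|^p + \Big| \frac{\partial u_\varepsilon}{\partial z_N}(z', t) - \frac{\partial u}{\partial z_N}(z', t) \Big|^p \right) dt \longrightarrow 0 \qquad \text{as } \varepsilon \searrow 0.
\]
Let
\[
u_{N,\varepsilon}(t) \stackrel{df}{=} u_\varepsilon(z', t).
\]
Then
\[
u_{N,\varepsilon} \longrightarrow u_N \qquad \text{in } W^{1,p}_{loc}(\mathbb{R}) \quad \text{as } \varepsilon \searrow 0
\]
and so also locally uniformly to a locally absolutely continuous function $u_N$ with
\[
u_N'(t) = D_N u(z', t) \qquad \text{for } \lambda^1\text{-a.a. } t \in \mathbb{R}.
\]
Also from Theorems 1.6.18, 1.6.13(b) and Remark 1.6.14, we have that
\[
u_\varepsilon \longrightarrow u^* \qquad \mu^{(N-1)}\text{-a.e.},
\]
so from Proposition 1.3.25, we have
\[
u_{N,\varepsilon}(t) \longrightarrow u^*(z', t) \qquad \text{for } \lambda^{N-1}\text{-a.a. } z' \in \mathbb{R}^{N-1} \text{ and all } t \in \mathbb{R}.
\]
Thus
\[
u_N(t) = u^*(z', t) \qquad \text{for } \lambda^{N-1}\text{-a.a. } z' \in \mathbb{R}^{N-1} \text{ and all } t \in \mathbb{R}.
\]

(b) For each $\vartheta \in C_c^1\big(\mathbb{R}^N\big)$, we have
\[
\begin{aligned}
\int_{\mathbb{R}^N} u\, \frac{\partial \vartheta}{\partial z_k} \, dz = \int_{\mathbb{R}^N} h\, \frac{\partial \vartheta}{\partial z_k} \, dz
&= \int_{\mathbb{R}^{N-1}} \left( \int_{-\infty}^{+\infty} h_k(z', t)\, \vartheta'(z', t) \, dt \right) dz' \\
&= - \int_{\mathbb{R}^{N-1}} \left( \int_{-\infty}^{+\infty} h_k'(z', t)\, \vartheta(z', t) \, dt \right) dz'
= - \int_{\mathbb{R}^N} h_k'\, \vartheta \, dz,
\end{aligned}
\]
so
\[
D_k u(z) = h_k'(z) \qquad \text{for } \lambda^N\text{-a.a. } z \in \mathbb{R}^N \text{ and all } k \in \{1, \ldots, N\},
\]
thus $u \in W^{1,p}_{loc}(\mathbb{R}^N)$.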
Before starting to discuss $BV$-functions, let us prove a result on the superposition operator defined on a Sobolev space. More precisely let $Z \subseteq \mathbb{R}^N$ be an open set and let $\xi \colon \mathbb{R} \to \mathbb{R}$ be a Lipschitz continuous function. If $Z$ is unbounded, we also assume that $\xi(0) = 0$. From Proposition 2.4.25, we know that if $u \in W^{1,p}(Z)$, then $\xi \circ u \in W^{1,p}(Z)$. So we can define the map $N_\xi \colon W^{1,p}(Z) \to W^{1,p}(Z)$, by
\[
N_\xi(u) \stackrel{df}{=} \xi \circ u \qquad \forall\, u \in W^{1,p}(Z).
\]
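PROPOSITION 2.6.4 If $p \in (1, +\infty)$, then $N_\xi \colon W^{1,p}(Z) \to W^{1,p}(Z)$ is continuous.

PROOF Suppose that
\[
u_n \longrightarrow u \qquad \text{in } W^{1,p}(Z).
\]
Then
\[
\xi(u_n) \longrightarrow \xi(u) \qquad \text{in } L^p(Z).
\]
Also from Proposition 2.4.25, we know that
\[
D(\xi \circ u_n)(z) = (\xi^* \circ u_n)\, Du_n(z) \qquad \text{for a.a. } z \in Z,
\]
with a bounded Borel measurable function $\xi^* \colon \mathbb{R} \to \mathbb{R}$, such that
\[
\xi^*(z) = \xi'(z) \qquad \text{for a.a. } z \in \mathbb{R}.
\]
So the sequence
\[
\{D(\xi \circ u_n)\}_{n \ge 1} \subseteq L^p\big(Z; \mathbb{R}^N\big)
\]
is bounded and it follows that
\[
D(\xi \circ u_n) \xrightarrow{w} D(\xi \circ u) \qquad \text{in } L^p\big(Z; \mathbb{R}^N\big).
\]
First suppose that $\xi^* = \chi_A$, with $A$ being a Borel set. Set
\[
\eta^*(t) \stackrel{df}{=} \xi^*(t) - \frac{1}{2}
\qquad \text{and} \qquad
\eta(t) \stackrel{df}{=} \xi(t) - \frac{t}{2}.
\]
We have
\[
\int_Z \|D(\eta \circ u_n)\|_{\mathbb{R}^N}^p \, dz
= \int_Z \|(\eta^* \circ u_n)\, Du_n\|_{\mathbb{R}^N}^p \, dz
= \frac{1}{2^p} \int_Z \|Du_n\|_{\mathbb{R}^N}^p \, dz
\longrightarrow \frac{1}{2^p} \int_Z \|Du\|_{\mathbb{R}^N}^p \, dz
= \int_Z \|D(\eta \circ u)\|_{\mathbb{R}^N}^p \, dz.
\]
Since
\[
D(\eta \circ u_n) \xrightarrow{w} D(\eta \circ u) \quad \text{in } L^p\big(Z; \mathbb{R}^N\big),
\qquad
\|D(\eta \circ u_n)\|_p \longrightarrow \|D(\eta \circ u)\|_p
\]
and $L^p\big(Z; \mathbb{R}^N\big)$ is uniformly convex ($p \in (1, +\infty)$), from the Kadec-Klee property (see Remark A.3.22), we have that
\[
D(\xi \circ u_n) \longrightarrow D(\xi \circ u) \qquad \text{in } L^p\big(Z; \mathbb{R}^N\big),
\]
whenever $\xi^*$ is a characteristic function of a Borel set. Clearly then the same is true for $\xi^*$ being a countably-valued Borel function. Now suppose that $\xi^*$ is an arbitrary bounded Borel function. For a given $\varepsilon > 0$, we can find a countably-valued function $s^*$, such that
\[
\sup_{t \in \mathbb{R}} |\xi^*(t) - s^*(t)| \le \varepsilon
\]
(see Corollary 2.1.4). So using Proposition 2.4.25, we have
\[
\|D(\xi \circ u_n) - D(\xi \circ u)\|_p \le \|(s^* \circ u_n)\, Du_n - (s^* \circ u)\, Du\|_p + \varepsilon \big( \|Du_n\|_p + \|Du\|_p \big),
\]
so
\[
\limsup_{n \to +\infty} \|D(\xi \circ u_n) - D(\xi \circ u)\|_p \le 2 \varepsilon\, \|Du\|_p.
\]
Let $\varepsilon \searrow 0$, to obtain
\[
D(\xi \circ u_n) \longrightarrow D(\xi \circ u) \qquad \text{in } L^p\big(Z; \mathbb{R}^N\big),
\]
hence
\[
\xi \circ u_n \longrightarrow \xi \circ u \qquad \text{in } W^{1,p}(Z).
\]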
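A familiar special case may help to place the first step of the proof. The example below is an added illustration, not taken from the text: it takes $\xi(t) = t^+ = \max\{t, 0\}$, and it assumes that the representative $\xi^*$ supplied by Proposition 2.4.25 may be chosen as indicated.

```latex
% Illustrative example (not from the source): the positive-part map.
% \xi(t) = t^+ is Lipschitz with constant 1, \xi(0) = 0, and
\[
\xi'(t) = \chi_{(0,+\infty)}(t) \qquad \text{for all } t \ne 0 ,
\]
% so one may take \xi^* = \chi_A with the Borel set A = (0,+\infty),
% which is exactly the first case treated in the proof. Then
\[
N_\xi(u) = u^+ ,
\qquad
D(u^+) = \chi_{\{u > 0\}} \, Du \quad \text{a.e. in } Z .
\]
% The shifted function used in the proof has constant modulus:
\[
\Big| \xi^*(t) - \frac{1}{2} \Big| = \frac{1}{2} \qquad \forall\, t \in \mathbb{R} .
\]
```

For this choice Proposition 2.6.4 gives the continuity of $u \mapsto u^+$ on $W^{1,p}(Z)$, $p \in (1, +\infty)$. The constant modulus of $\xi^* - \frac{1}{2}$ is what allows the norms $\|D(\eta \circ u_n)\|_p$ to be computed exactly in terms of $\|Du_n\|_p$.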
REMARK 2.6.5 The result is also true for $p = 1$, but the proof is more involved. We refer to Marcus & Mizel (1979), for details.

The weakest measure theoretic sense in which a function $w \in L^1(Z)$ can be differentiable is to require that its partial derivatives in the sense of distributions are Radon measures. Such functions are called functions of bounded variation. More precisely we make the following definition.

DEFINITION 2.6.6 Let $Z \subseteq \mathbb{R}^N$ be an open set. A function $u \in L^1(Z)$ is said to be of bounded variation, if and only if there exist bounded Borel signed measures $\mu_k \colon \mathcal{B}(Z) \to \mathbb{R}$, for $k \in \{1, \ldots, N\}$, such that
\[
\int_Z u\, D_k \vartheta \, dz = - \int_Z \vartheta \, d\mu_k \qquad \forall\, \vartheta \in C_c^\infty(Z).
\]
The space of functions of bounded variation is denoted by $BV(Z)$.

The next Proposition clarifies the structure of the functions of bounded variation.
2. Lebesgue-Bochner and Sobolev Spaces
247
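PROPOSITION 2.6.7 If $Z \subseteq \mathbb{R}^N$ is an open set, $u \in BV(Z)$ and for $h \in C_c(Z)$, $h \geq 0$, we set
\[ \|Du\|(h) \stackrel{df}{=} \sup\Big\{ \int_Z u\,\mathrm{div}\,\vartheta\,dz : \vartheta = (\vartheta_k)_{k=1}^N \in C_c^\infty(Z;\mathbb{R}^N),\ \|\vartheta(z)\|_{\mathbb{R}^N} \leq h(z),\ z \in Z \Big\}, \]
then $\|Du\|$ is a Radon measure.

PROOF According to the Riesz-Markov representation theorem (see Theorem 2.3.41), we need to show that $\|Du\|$ is a positive linear functional on $C_c(Z)$ which is continuous under monotone convergence, i.e., if $h_n \nearrow h$ in $C_c(Z)$, then $\|Du\|(h_n) \longrightarrow \|Du\|(h)$. To this end let $\mu = (\mu_k)_{k=1}^N = Du$. From Definition 2.6.6, we have that
\[ \int_Z u\,\mathrm{div}\,\vartheta\,dz = -\int_Z \vartheta\,d\mu \qquad \forall\ \vartheta \in C_c^\infty(Z;\mathbb{R}^N). \]
Thus, we may write
\[ \|Du\|(h) = \sup\Big\{ \int_Z v\,d\mu : v = (v_k)_{k=1}^N \in C_c(Z;\mathbb{R}^N),\ \|v(z)\|_{\mathbb{R}^N} \leq h(z) \text{ for all } z \in Z \Big\}. \]
We show that $\|Du\|(\cdot)$ is additive. So let $h_1, h_2 \in C_c(Z)$, $h_1, h_2 \geq 0$ and suppose that $v \in C_c(Z;\mathbb{R}^N)$ is such that
\[ \|v(z)\|_{\mathbb{R}^N} \leq h_1(z) + h_2(z) \qquad \forall\ z \in Z. \]
Let $g = \min\{h_1, \|v\|\}$ and
\[ w(z) \stackrel{df}{=} \begin{cases} g(z)\,\dfrac{v(z)}{\|v(z)\|_{\mathbb{R}^N}} & \text{if } v(z) \neq 0, \\ 0 & \text{if } v(z) = 0. \end{cases} \]
Clearly $w \in C_c(Z;\mathbb{R}^N)$ and
\[ \|v(z) - w(z)\|_{\mathbb{R}^N} = \|v(z)\|_{\mathbb{R}^N} - g(z) \leq h_2(z) \qquad \forall\ z \in Z. \]
Therefore, since
\[ \|w(z)\|_{\mathbb{R}^N} = g(z) \leq h_1(z) \qquad \forall\ z \in Z, \]
we have
\[ \int_Z v\,d\mu = \int_Z w\,d\mu + \int_Z (v - w)\,d\mu \leq \|Du\|(h_1) + \|Du\|(h_2), \]
so $\|Du\|(h_1 + h_2) \leq \|Du\|(h_1) + \|Du\|(h_2)$. Since the opposite inequality is clearly true, we conclude that $\|Du\|(\cdot)$ is additive. Also it is clearly positively homogeneous. Thus it remains to show that if $h_n \nearrow h$ in $C_c(Z)_+$, then
\[ \|Du\|(h_n) \longrightarrow \|Du\|(h). \]
Let $v \in C_c(Z;\mathbb{R}^N)$, such that
\[ \|v(z)\|_{\mathbb{R}^N} \leq h(z) \qquad \forall\ z \in Z. \]
Let $g_n \stackrel{df}{=} \min\{h_n, \|v\|\}$ and
\[ w_n(z) \stackrel{df}{=} \begin{cases} g_n(z)\,\dfrac{v(z)}{\|v(z)\|_{\mathbb{R}^N}} & \text{if } v(z) \neq 0, \\ 0 & \text{if } v(z) = 0. \end{cases} \]
We have $w_n \in C_c(Z;\mathbb{R}^N)$,
\[ \|w_n(z)\|_{\mathbb{R}^N} = g_n(z) \leq h_n(z) \qquad \forall\ z \in Z \]
and $\|v - w_n\| = \|v\| - g_n \searrow 0$. Because $\|v - w_n\| = \|v\| - g_n \leq 2\|v\|$, by virtue of the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that
\[ \int_Z v\,d\mu = \langle Du, v\rangle_{C_c(Z;\mathbb{R}^N)} = \lim_{n \to +\infty} \langle Du, w_n\rangle_{C_c(Z;\mathbb{R}^N)} \leq \lim_{n \to +\infty} \|Du\|(g_n), \]
so
\[ \|Du\|(h) \leq \lim_{n \to +\infty} \|Du\|(g_n). \]
Since $g_n \leq h_n \leq h$ for all $n \geq 1$, we have $\|Du\|(g_n) \leq \|Du\|(h_n) \leq \|Du\|(h)$, so the opposite inequality also holds, hence
\[ \|Du\|(h) \geq \lim_{n \to +\infty} \|Du\|(h_n) \geq \lim_{n \to +\infty} \|Du\|(g_n), \]
so
\[ \|Du\|(h) = \lim_{n \to +\infty} \|Du\|(h_n). \]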
COROLLARY 2.6.8 If $Z \subseteq \mathbb{R}^N$ is an open set and $u \in BV(Z)$, then there exists a Borel measurable function $\xi : Z \longrightarrow \mathbb{R}^N$, such that
\[ \|\xi(z)\|_{\mathbb{R}^N} = 1 \qquad \mu = Du\text{-a.e.} \]
and
\[ \int_Z u\,\mathrm{div}\,\vartheta\,dz = -\int_Z (\vartheta, \xi)_{\mathbb{R}^N}\,d\|Du\| \qquad \forall\ \vartheta \in C_c^1(Z;\mathbb{R}^N). \]

REMARK 2.6.9 Evidently
\[ \xi = \frac{d(Du)}{d\|Du\|} \]
(i.e., the Radon-Nikodym derivative of $\mu = Du$ with respect to $\|Du\|$, since $Du \ll \|Du\|$; see Theorem A.2.24 and Remark A.2.25). So, we have
\[ \int_Z u\,\mathrm{div}\,\vartheta\,dz = -\int_Z \vartheta\,dDu \qquad \forall\ \vartheta \in C_c^1(Z;\mathbb{R}^N). \]

In the sequel for $u \in L^1_{loc}(Z)$, we say that $u \in BV_{loc}(Z)$ (i.e., has locally bounded variation in $Z$), if for every bounded open set $V \subseteq Z$ with $\overline{V} \subseteq Z$, we have that $u \in BV(V)$. Note that the total variation $\|Du\|(Z)$ is given by
\[ \|Du\|(Z) = \sup\Big\{ \int_Z u\,\mathrm{div}\,\vartheta\,dz : \vartheta = (\vartheta_k)_{k=1}^N \in C_c^\infty(Z;\mathbb{R}^N),\ \|\vartheta(z)\|_{\mathbb{R}^N} \leq 1 \text{ for all } z \in Z \Big\}. \]
The norm of $BV(Z)$ is given by
\[ \|u\|_{BV(Z)} = \|u\|_1 + \|Du\| \]
and makes $BV(Z)$ a Banach space. It is also well known that an absolutely continuous function $u : \mathbb{R} \longrightarrow \mathbb{R}$ with $u' \in L^1(\mathbb{R})$ is of bounded variation in $\mathbb{R}$. In particular then $W^{1,1}(\mathbb{R}) \subseteq BV(\mathbb{R})$. Next we show that the same is true in higher dimensions (i.e., for $N > 1$). First two examples to motivate what follows.
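EXAMPLE 2.6.10 (a) Let $Z \subseteq \mathbb{R}^N$ be an open set and $u \in W^{1,1}(Z)$, then $u \in BV(Z)$ (i.e., $W^{1,1}(Z) \subseteq BV(Z)$). To see this let $\vartheta \in C_c^1(Z;\mathbb{R}^N)$ with $\|\vartheta(z)\|_{\mathbb{R}^N} \leq 1$ for all $z \in Z$. We have
\[ \int_Z u\,\mathrm{div}\,\vartheta\,dz = -\int_Z (Du, \vartheta)_{\mathbb{R}^N}\,dz, \]
so
\[ \|Du\| = \int_Z \|Du(z)\|_{\mathbb{R}^N}\,dz \]
and
\[ \xi(z) = \begin{cases} \dfrac{Du(z)}{\|Du(z)\|_{\mathbb{R}^N}} & \text{if } Du(z) \neq 0, \\ 0 & \text{if } Du(z) = 0 \end{cases} \qquad \text{for } \lambda^N\text{-a.a. } z \in Z. \]
Similarly we show that $W^{1,1}_{loc}(Z) \subseteq BV_{loc}(Z)$. In particular then $W^{1,p}_{loc}(Z) \subseteq BV_{loc}(Z)$ for all $p \in [1,+\infty)$ and if $Z \subseteq \mathbb{R}^N$ is bounded and open then $W^{1,p}(Z) \subseteq BV(Z)$ for all $p \geq 1$.

(b) Let $Z \subseteq \mathbb{R}^N$ be an open set, $U \subseteq \mathbb{R}^N$ another open set with $C^2$-boundary $\partial U$, such that
\[ \mu^{(N-1)}(\partial U \cap K) < +\infty \quad \text{for all compact sets } K \subseteq Z. \]
Then from Proposition 2.4.44, for $\vartheta \in C_c^1(Z;\mathbb{R}^N)$, we have
\[ \int_U \mathrm{div}\,\vartheta\,dz = \int_{\partial U} (\vartheta, n)_{\mathbb{R}^N}\,d\mu^{(N-1)} \]
(here $n$ denotes the outward unit normal along $\partial U$). Hence for any bounded, open set $V \subseteq Z$, with $\overline{V} \subseteq Z$ and for any $\vartheta \in C_c^1(V;\mathbb{R}^N)$ with $\|\vartheta(z)\|_{\mathbb{R}^N} \leq 1$ for all $z \in V$, we have
\[ \int_U \mathrm{div}\,\vartheta\,dz = \int_{\partial U \cap V} (\vartheta, n)_{\mathbb{R}^N}\,d\mu^{(N-1)} \leq \mu^{(N-1)}(\partial U \cap V), \]
so $\chi_U \in BV_{loc}(Z)$. Moreover,
\[ \|\partial\chi_U\|(Z) = \mu^{(N-1)}(\partial U \cap Z). \]
Thus $\|\partial\chi_U\|(Z)$ measures the size of $\partial U$ in $Z$. Since $\chi_U$ is not in general in $W^{1,1}_{loc}(Z)$, we see that not every function of (locally) bounded variation is a Sobolev function.

Motivated by Example 2.6.10(b), we make the following definition.

DEFINITION 2.6.11 A Lebesgue measurable set $A \subseteq \mathbb{R}^N$ is said to have finite perimeter in an open set $Z \subseteq \mathbb{R}^N$, if $\chi_A \in BV(Z)$.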
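
The two cases of Example 2.6.10 can be illustrated numerically in dimension $N = 1$. The following is only a minimal sketch under stated assumptions: the helper names (`total_variation`, `sample`, `indicator`, `ramp`), the interval $U = (0.25, 0.75)$ inside $Z = (0,1)$ and the grid sizes are illustrative choices, not taken from the text. The discrete variation of $\chi_U$ stays equal to $2$ (the number of boundary points of $U$ in $Z$) on every grid, while for the smooth function $u(x) = x^2$ it agrees with $\int_0^1 |u'(x)|\,dx = 1$.

```python
def total_variation(samples):
    """Discrete total variation: sum of absolute jumps between neighbours."""
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))


def sample(f, a, b, n):
    """Sample f at n + 1 equally spaced points of [a, b]."""
    h = (b - a) / n
    return [f(a + i * h) for i in range(n + 1)]


def indicator(x):
    # chi_U for the (hypothetical) choice U = (0.25, 0.75) inside Z = (0, 1)
    return 1.0 if 0.25 < x < 0.75 else 0.0


def ramp(x):
    # a W^{1,1}(0, 1) function: u(x) = x^2, whose variation is int |u'| dx = 1
    return x * x


# Variation of chi_U on finer and finer grids: always two unit jumps.
tv_indicator = [total_variation(sample(indicator, 0.0, 1.0, n)) for n in (10, 100, 1000)]

# Variation of the smooth function: matches the L^1-norm of its derivative.
tv_ramp = total_variation(sample(ramp, 0.0, 1.0, 1000))

print(tv_indicator, tv_ramp)
```

The design point is that the variation of $\chi_U$ does not shrink under grid refinement: it is carried by the two boundary points, which is the one-dimensional counterpart of $\|\partial\chi_U\|(Z) = \mu^{(N-1)}(\partial U \cap Z)$.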
REMARK 2.6.12 Some authors call sets of finite perimeter Caccioppoli sets.
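Next we shall establish some elementary properties of $BV$-functions. The first is the lower semicontinuity of the variational measure.

PROPOSITION 2.6.13 If $Z \subseteq \mathbb{R}^N$ is an open set and $\{u_n\}_{n \geq 1} \subseteq BV(Z)$ is such that
\[ u_n \longrightarrow u \quad \text{in } L^1_{loc}(Z), \]
then for every open set $U \subseteq Z$, we have
\[ \|Du\|(U) \leq \liminf_{n \to +\infty} \|Du_n\|(U). \]

PROOF Let $\vartheta \in C_c^\infty(U;\mathbb{R}^N)$ be such that
\[ \|\vartheta(z)\|_{\mathbb{R}^N} \leq 1 \qquad \forall\ z \in U. \]
We have
\[ \int_U u\,\mathrm{div}\,\vartheta\,dz = \lim_{n \to +\infty} \int_U u_n\,\mathrm{div}\,\vartheta\,dz \leq \liminf_{n \to +\infty} \|Du_n\|(U); \]
so from Remark 2.6.9, we have
\[ \|Du\|(U) \leq \liminf_{n \to +\infty} \|Du_n\|(U). \]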
REMARK 2.6.14 The above Proposition does not assert that $u \in BV(Z)$. This will be true if $u \in L^1(Z)$ and $\sup_{n \geq 1} \|Du_n\|(Z) < +\infty$. To see this let $\vartheta \in C_c^1(Z)$ and $k = 1, \ldots, N$. We have
\[ \lim_{n \to +\infty} \int_Z \vartheta D_k u_n\,dz = -\lim_{n \to +\infty} \int_Z u_n D_k\vartheta\,dz = -\int_Z u D_k\vartheta\,dz, \]
so
\[ \Big| \int_Z u D_k\vartheta\,dz \Big| \leq \|\vartheta\|_\infty \liminf_{n \to +\infty} \|Du_n\|(Z) < +\infty. \]
Because the embedding $C_c^1(Z) \subseteq C_c(Z)$ is dense, we have that
\[ D_k u(\vartheta) = -\int_Z u D_k\vartheta\,dz \qquad \forall\ k = 1, \ldots, N \]
is a bounded linear functional on $C_c(Z)$, hence a measure.
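
The inequality in Proposition 2.6.13 can be strict. A minimal numerical sketch, assuming the illustrative choice $Z = (0, 2\pi)$ and $u_n(x) = \sin(nx)/n$ (neither is taken from the text; the function names and grid size are likewise hypothetical): here $u_n \longrightarrow 0$ uniformly, hence in $L^1_{loc}(Z)$, so $\|Du\|(Z) = 0$, while $\|Du_n\|(Z) = \int_0^{2\pi} |\cos(nx)|\,dx = 4$ for every $n \geq 1$.

```python
import math


def total_variation(samples):
    """Discrete total variation: sum of absolute jumps between neighbours."""
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))


def u_n(n, x):
    # illustrative sequence: small in sup-norm, but with oscillation of fixed size
    return math.sin(n * x) / n


def tv_and_sup(n, points=200_000):
    """Return (discrete variation on [0, 2*pi], sup-norm) of u_n on a fine grid."""
    h = 2.0 * math.pi / points
    samples = [u_n(n, i * h) for i in range(points + 1)]
    return total_variation(samples), max(abs(s) for s in samples)


results = {n: tv_and_sup(n) for n in (1, 5, 25)}
for n, (tv, sup_norm) in results.items():
    print(n, round(tv, 6), sup_norm)
```

So $\liminf_{n \to +\infty} \|Du_n\|(Z) = 4 > 0 = \|Du\|(Z)$ in this sketch: the variation measure is lower semicontinuous but not continuous along $L^1_{loc}$-convergent sequences.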
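In the next Proposition, we establish an upper semicontinuity property of the total variation measure.

PROPOSITION 2.6.15 If $Z \subseteq \mathbb{R}^N$ is an open set, $\{u_n\}_{n \geq 1} \subseteq BV(Z)$,
\[ u_n \longrightarrow u \quad \text{in } L^1_{loc}(Z) \]
and
\[ \|Du\|(Z) = \lim_{n \to +\infty} \|Du_n\|(Z), \]
then
\[ \limsup_{n \to +\infty} \|Du_n\|(\overline{U} \cap Z) \leq \|Du\|(\overline{U} \cap Z) \quad \text{for all open sets } U \subseteq Z. \]

PROOF The set $V = Z \setminus \overline{U}$ is open and so from Proposition 2.6.13, we have that
\[ \|Du\|(V) \leq \liminf_{n \to +\infty} \|Du_n\|(V). \tag{2.66} \]
Then we have
\[ \|Du\|(\overline{U} \cap Z) + \|Du\|(V) = \|Du\|(Z) = \lim_{n \to +\infty} \|Du_n\|(Z) \geq \limsup_{n \to +\infty} \|Du_n\|(\overline{U} \cap Z) + \liminf_{n \to +\infty} \|Du_n\|(V) \geq \limsup_{n \to +\infty} \|Du_n\|(\overline{U} \cap Z) + \|Du\|(V), \]
so
\[ \limsup_{n \to +\infty} \|Du_n\|(\overline{U} \cap Z) \leq \|Du\|(\overline{U} \cap Z). \]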
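Combining Propositions 2.6.13 and 2.6.15, we have the following.

COROLLARY 2.6.16 If $Z \subseteq \mathbb{R}^N$ is an open set, $\{u_n\}_{n \geq 1} \subseteq BV(Z)$,
\[ u_n \longrightarrow u \quad \text{in } L^1_{loc}(Z), \qquad \|Du_n\|(Z) \longrightarrow \|Du\|(Z), \]
then for all open sets $U \subseteq Z$ with
\[ \|Du\|(\partial U) = 0, \]
we have $\|Du_n\|(U) \longrightarrow \|Du\|(U)$.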
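The next theorem is the counterpart for the space $BV(Z)$ of the Meyers-Serrin theorem (see Theorem 2.4.13).

THEOREM 2.6.17 If $Z \subseteq \mathbb{R}^N$ is an open set and $u \in BV(Z)$, then we can find a sequence $\{u_n\}_{n \geq 1} \subseteq BV(Z) \cap C^\infty(Z)$, such that
\[ u_n \longrightarrow u \quad \text{in } L^1(Z) \qquad \text{and} \qquad \|Du_n\|(Z) \longrightarrow \|Du\|(Z). \]

PROOF Let $\varepsilon > 0$. For a given positive integer $m \geq 1$, we define the following open subsets of $Z$:
\[ Z_k \stackrel{df}{=} \Big\{ z \in Z : d(z, \partial Z) > \frac{1}{k+m} \Big\} \cap B_{k+m}(0) \qquad \forall\ k \geq 1. \]
Choose $m \geq 1$ large enough so that
\[ \|Du\|(Z \setminus Z_1) < \varepsilon. \tag{2.67} \]
Setting $Z_0 = \emptyset$, we introduce the following sequence of open sets of $Z$:
\[ V_k \stackrel{df}{=} Z_{k+1} \setminus \overline{Z}_{k-1} \qquad \forall\ k \geq 1. \]
Let $\{\xi_k\}_{k \geq 1}$ be a $C^\infty$-partition of unity subordinate to the open cover $\{V_k\}_{k \geq 1}$ of $Z$, i.e.,
\[ \xi_k \in C_c^\infty(V_k), \qquad 0 \leq \xi_k \leq 1 \qquad \text{and} \qquad \sum_{k=1}^\infty \xi_k = 1 \ \text{on } Z. \]
Let $\varphi$ be a mollifier and for each $k \geq 1$, choose $\varepsilon_k > 0$, such that
\[ \begin{cases} \mathrm{supp}\,(\varphi_{\varepsilon_k} \star (\xi_k u)) \subseteq V_k, \\ \|\varphi_{\varepsilon_k} \star (\xi_k u) - \xi_k u\|_1 < \dfrac{\varepsilon}{2^k}, \\ \|\varphi_{\varepsilon_k} \star (u D\xi_k) - u D\xi_k\|_1 < \dfrac{\varepsilon}{2^k}. \end{cases} \tag{2.68} \]
Let
\[ u_\varepsilon \stackrel{df}{=} \sum_{k=1}^\infty \varphi_{\varepsilon_k} \star (\xi_k u). \]
Then $u_\varepsilon \in C^\infty(Z)$ and because $u = \sum_{k=1}^\infty \xi_k u$, from (2.68), we have
\[ \|u_\varepsilon - u\|_1 < \varepsilon, \]
so
\[ u_\varepsilon \longrightarrow u \quad \text{in } L^1(Z) \quad \text{as } \varepsilon \searrow 0. \tag{2.69} \]
Invoking Proposition 2.6.13, we have
\[ \|Du\|(Z) \leq \liminf_{\varepsilon \searrow 0} \|Du_\varepsilon\|(Z). \tag{2.70} \]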
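Now let $\vartheta \in C_c^1(Z;\mathbb{R}^N)$ be such that
\[ \|\vartheta(z)\|_{\mathbb{R}^N} \leq 1 \qquad \forall\ z \in Z. \]
We have
\begin{align*}
\int_Z u_\varepsilon\,\mathrm{div}\,\vartheta\,dz &= \sum_{k=1}^\infty \int_Z \varphi_{\varepsilon_k} \star (\xi_k u)\,\mathrm{div}\,\vartheta\,dz \\
&= \sum_{k=1}^\infty \int_Z \xi_k u\,\mathrm{div}\,(\varphi_{\varepsilon_k} \star \vartheta)\,dz \\
&= \sum_{k=1}^\infty \int_Z u\,\mathrm{div}\,\big(\xi_k(\varphi_{\varepsilon_k} \star \vartheta)\big)\,dz - \sum_{k=1}^\infty \int_Z u\,\big(D\xi_k, \varphi_{\varepsilon_k} \star \vartheta\big)_{\mathbb{R}^N}\,dz \\
&= \sum_{k=1}^\infty \int_Z u\,\mathrm{div}\,\big(\xi_k(\varphi_{\varepsilon_k} \star \vartheta)\big)\,dz - \sum_{k=1}^\infty \int_Z \big(\vartheta, \varphi_{\varepsilon_k} \star (u D\xi_k) - u D\xi_k\big)_{\mathbb{R}^N}\,dz \\
&= \eta_{1,\varepsilon} + \eta_{2,\varepsilon}.
\end{align*}
Note that
\[ \|\xi_k(\varphi_{\varepsilon_k} \star \vartheta)(z)\|_{\mathbb{R}^N} \leq 1 \qquad \forall\ z \in Z,\ k \geq 1. \]
Also each $z \in Z$ belongs to at most three elements of the cover $\{V_k\}_{k \geq 1}$. So we have
\begin{align*}
|\eta_{1,\varepsilon}| &= \Big| \int_Z u\,\mathrm{div}\,\big(\xi_1(\varphi_{\varepsilon_1} \star \vartheta)\big)\,dz + \sum_{k=2}^\infty \int_Z u\,\mathrm{div}\,\big(\xi_k(\varphi_{\varepsilon_k} \star \vartheta)\big)\,dz \Big| \\
&\leq \|Du\|(Z) + \sum_{k=2}^\infty \|Du\|(V_k) \\
&\leq \|Du\|(Z) + 3\|Du\|(Z \setminus Z_1) \leq \|Du\|(Z) + 3\varepsilon. \tag{2.71}
\end{align*}
Also from (2.68), we have that
\[ |\eta_{2,\varepsilon}| < \varepsilon. \tag{2.72} \]
From (2.71) and (2.72), it follows that
\[ \int_Z u_\varepsilon\,\mathrm{div}\,\vartheta\,dz \leq \|Du\|(Z) + 4\varepsilon, \]
thus
\[ \|Du_\varepsilon\|(Z) \leq \|Du\|(Z) + 4\varepsilon \]
and so
\[ \limsup_{\varepsilon \to 0} \|Du_\varepsilon\|(Z) \leq \|Du\|(Z). \tag{2.73} \]
From (2.70) and (2.73), we infer that
\[ \|Du_\varepsilon\|(Z) \longrightarrow \|Du\|(Z) \quad \text{as } \varepsilon \searrow 0. \]
This combined with (2.69) finishes the proof of the theorem.

REMARK 2.6.18 Note that in the previous "local" approximation result, we do not have that
\[ \|D(u_\varepsilon - u)\|(Z) \longrightarrow 0 \quad \text{as } \varepsilon \searrow 0 \]
and so we cannot claim the density of $BV(Z) \cap C^\infty(Z)$ in $BV(Z)$.

COROLLARY 2.6.19 If $Z \subseteq \mathbb{R}^N$ is a bounded open set which is Lipschitz, then $\overline{B}_r$, for $r > 0$, is compact in $L^1(Z)$, where
\[ \overline{B}_r = \{u \in BV(Z) : \|u\|_{BV} \leq r\}. \]

PROOF Let $\{u_n\}_{n \geq 1} \subseteq \overline{B}_r$. By Theorem 2.6.17, we can find $\psi_n \in C^\infty(Z)$, such that
\[ \|u_n - \psi_n\|_1 \leq \frac{1}{n} \qquad \text{and} \qquad \|D\psi_n\| = \int_Z \|D\psi_n(z)\|_{\mathbb{R}^N}\,dz \leq r + 1. \]
It follows that $\{\psi_n\}_{n \geq 1} \subseteq W^{1,1}(Z)$ is bounded. By virtue of Theorem 2.5.17, the sequence $\{\psi_n\}_{n \geq 1} \subseteq L^1(Z)$ is relatively compact. So we may assume that $\psi_n \longrightarrow u$ in $L^1(Z)$ (hence also $u_n \longrightarrow u$ in $L^1(Z)$, since $\|u_n - \psi_n\|_1 \leq \frac{1}{n}$). From Remark 2.6.14, we have that $u \in BV(Z)$ and from Proposition 2.6.13, we have that $\|u\|_{BV} \leq r$, i.e., $u \in \overline{B}_r$.

REMARK 2.6.20 According to Corollary 2.6.19, if $Z \subseteq \mathbb{R}^N$ is a bounded open set which is Lipschitz, then the embedding $BV(Z) \subseteq L^1(Z)$ is compact.

An interesting application of this compact embedding is the following result.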
PROPOSITION 2.6.21 If $Z \subseteq \mathbb{R}^N$ is a bounded open set which is Lipschitz,
\[ T \stackrel{df}{=} \Big\{ A \subseteq Z : A \text{ is Lebesgue measurable},\ \lambda^N(A) = \lambda^N(Z \setminus A) = \frac{1}{2}\lambda^N(Z) \Big\} \]
and
\[ P(A, Z) = \|D\chi_A\| \qquad \forall\ A \in T \]
(the perimeter with respect to $Z$ functional), then there exists $A^* \in T$, such that
\[ P(A^*, Z) = \inf_{A \in T} P(A, Z). \]

PROOF Let
\[ S \stackrel{df}{=} \{\chi_A : A \in T\} \subseteq L^1(Z). \]
We furnish $S$ with the relative $L^1(Z)$-topology. Since
\[ \|\chi_A\|_1 \leq \lambda^N(Z) \qquad \forall\ A \in T, \]
we see that the functional $\xi : S \longrightarrow \mathbb{R}$ defined by
\[ \xi(\chi_A) \stackrel{df}{=} \|D\chi_A\| = P(A, Z) \]
is coercive on $S$ for the $BV(Z)$-norm. Therefore the sub-level sets of $\xi$ are bounded in $BV(Z)$, thus relatively compact in $S \subseteq L^1(Z)$ (note that $S$ is closed in $L^1(Z)$ and see Remark 2.6.20). Also from Proposition 2.6.13, we know that $\xi$ is lower semicontinuous on $S$. This means that its sub-level sets are compact in $L^1(Z)$. So by the Weierstrass theorem, we can find $A^* \in T$, such that
\[ P(A^*, Z) = \inf_{A \in T} P(A, Z). \]
We can relate the variation measure of $u$ and the perimeters of its superlevel sets. The result is actually a "Co-Area Formula" for $BV$-functions (see also Theorem 1.5.25).

THEOREM 2.6.22 If $Z \subseteq \mathbb{R}^N$ is an open set, $u \in L^1(Z)$ and for every $r \in \mathbb{R}$ let
\[ L_r \stackrel{df}{=} \{z \in Z : u(z) > r\}, \]
then
(a) $u \in BV(Z)$ implies that
\[ \|Du\|(Z) = \int_{-\infty}^{\infty} \|D\chi_{L_r}\|(Z)\,dr; \]
(b) if for almost all $r \in \mathbb{R}$, $L_r$ has a finite perimeter, then $u \in BV(Z)$.
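
A discrete, one-dimensional analogue of the identity in Theorem 2.6.22(a) can be checked directly. This is only a sketch under assumptions that are not part of the text: $u$ is replaced by a finite list of integer samples, the variation by the sum of absolute jumps, and the perimeter of $L_r$ by the number of jumps of the indicator of $\{u > r\}$; all function names are hypothetical. For integer data the perimeter is constant for $r \in (k, k+1)$, so summing its values at the midpoints $k + \frac{1}{2}$ computes the integral over $r$ exactly.

```python
def total_variation(samples):
    """Discrete total variation: sum of absolute jumps between neighbours."""
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))


def perimeter_of_superlevel(samples, r):
    """Number of jumps of the indicator of {u > r}: discrete perimeter of L_r."""
    chi = [1 if s > r else 0 for s in samples]
    return total_variation(chi)


def coarea_integral(samples):
    """Integrate r -> perimeter(L_r) exactly for integer-valued samples.

    The perimeter is piecewise constant with breakpoints at the integers,
    so each unit interval (k, k + 1) contributes its midpoint value.
    """
    lo, hi = min(samples), max(samples)
    return sum(perimeter_of_superlevel(samples, k + 0.5) for k in range(lo, hi))


u = [0, 3, 1, 4, 4, -2, 0]   # illustrative integer-valued "function"
lhs = total_variation(u)     # discrete analogue of ||Du||(Z)
rhs = coarea_integral(u)     # discrete analogue of the integral of perimeters
print(lhs, rhs)
```

Each jump $|u_{i+1} - u_i|$ is counted once for every level $r$ lying strictly between the two values, which is the mechanism behind the formula.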
2.7 Remarks
2.1: To have a good theory of integration, we need a reasonable notion of measurability of functions. In this direction the basic result is the Pettis measurability theorem (see Theorem 2.1.3), which was proved by Pettis (1938a). The main integral for vector valued functions, which has a rich enough structure to have significant applications, is the Bochner integral. The Bochner integral can be traced to the works of Bochner (1933) and Dunford (1935) and for this reason is also known as "Dunford's first integral." Most of the properties of the Bochner integral follow from the corresponding properties of the classical Lebesgue integral, by virtue of Proposition 2.1.10. So some analysts say that the Bochner integral is the Lebesgue integral with the absolute value replaced by norms. The Pettis integral has far fewer applications, which require knowledge and use of sophisticated measure theoretic results. The theory of Pettis integration started with the work of Pettis and attracted renewed attention after the paper of Edgar (1977). A detailed study of the Pettis integral with applications can be found in the monograph of Talagrand (1984a). On the subject of vector valued functions and their integration, the reader can consult the books of Diestel & Uhl (1977), Dunford & Schwartz (1958) and Hille & Phillips (1957). The proof of the Orlicz-Pettis theorem can be found in Diestel & Uhl (1977, p. 22).

2.2: A reference to Lebesgue-Bochner spaces can be found in every book dealing with infinite dimensional dynamical systems. They are a natural generalization of the classical Lebesgue spaces using the notion of Bochner integral. Vector measures were already considered by Pettis (1938b). However, the real expansion on the subject occurred in the late 1960s and during the 1970s, when there was a systematic study of the geometry of Banach spaces. That is when RNP spaces were introduced and studied in detail. That a reflexive Banach space has the RNP was established by Phillips (1940), while the fact that a separable dual Banach space is an RNP space is due to Dunford & Pettis (1940). The proof of Proposition 2.2.8 can be found in Diestel & Uhl (1977, pp. 79 and 82). Theorem 2.2.9 (the Riesz Representation theorem for the Lebesgue-Bochner spaces $L^p(\Omega;X)$, $p \in [1,+\infty)$) is essentially due to Bochner & Taylor (1938). Its proof can be found in Diestel & Uhl (1977, p. 97). Its extension (for $p = 1$) mentioned in Theorem 2.2.12 (called the Dinculeanu-Foias theorem) is due to Dinculeanu & Foias (1961) and its proof, based on "lifting theory," can be found in Ionescu-Tulcea & Ionescu-Tulcea (1969, p. 93). Absolute continuity of real valued functions (see Definition 2.2.14) was introduced by Vitali (1908), who established the fundamental fact that a real valued function on $[0,1]$ is absolutely continuous if and only if it is the integral of its derivative (the fundamental theorem of Lebesgue calculus). Theorem 2.2.17 is due to Komura (1967). Lemma 2.2.29
can be found in Lions (1969, p. 58) and shows how new inequalities can be derived from the properties of embedding operators. Theorem 2.2.30 is due to Aubin (1963) and plays a central role in the theory of evolution equations. Evolution triples (see Definition 2.2.31) are also known as "Gelfand triples," because of their systematic use by Gelfand & Shilov (1977) (see also Wloka (1987)). Evolution triples and their properties and applications can be found in Denkowski, Migórski & Papageorgiou (2003a, 2003b), Hu & Papageorgiou (1997, 2000), Lions (1969), Showalter (1997) and Zeidler (1990a, 1990b). Finally we mention a result on the structure of $L^1(\Omega;X)$ due to Talagrand (1984b).

PROPOSITION 2.7.1 If $(\Omega, \Sigma, \mu)$ is a finite measure space and $X$ is a Banach space which is weakly sequentially complete, then $L^1(\Omega;X)$ is weakly sequentially complete too.

2.3: Theorem 2.3.2 is known as the "Arzela-Ascoli theorem" although some authors use only one of the two names. Working on $C([0,1])$, Arzela (1889) proved the necessity part, while Ascoli (1883–1884) proved the sufficiency part. A general formulation of this theorem can be found in Denkowski, Migórski & Papageorgiou (2003a, p. 73). The results on the compactness of various sets in $L^p(T;X)$ ($p \in [1,+\infty)$) and in $C(T;X)$ (variations of the Arzela-Ascoli theorem) can be found in Simon (1987). They are formulations and extensions of the classical criterion for strong compactness in $L^p(T)$ ($p \in [1,+\infty)$), due to Riesz (1933) and Kolmogorov (1931). James' theorem (see Theorem 2.3.21), due to James (1964), is one of the deepest and most influential results of functional analysis. From it, it follows that a Banach space $X$ is reflexive if and only if every $x^* \in X^*$ attains its supremum on the unit ball of $X$. For a proof of James' theorem see Holmes (1975, pp. 157–161). Theorem 2.3.21 can be found in Papageorgiou (1985) (see also Denkowski, Migórski & Papageorgiou (2003a, p. 462)), where a kind of converse of it can also be found. The proof of Proposition 2.3.22 can be found in Denkowski, Migórski & Papageorgiou (2003a, p. 458). Ionescu-Tulcea & Ionescu-Tulcea (1969) were the first to observe that the classical Dunford-Pettis theorem (see Dunford (1935)) can be extended to $X$-valued functions with $X$ being a reflexive Banach space, after some straightforward modifications in the original proof (see Theorem 2.3.24). The proof of Proposition 2.3.31 can be found in Denkowski, Migórski & Papageorgiou (2003a, p. 484). The notion of biting convergence (see Definition 2.3.35) is due to Chacon (see Brooks & Chacon (1980) and Ball & Murat (1989)). In Brooks & Chacon (1980), we can find the original version of Theorem 2.3.26 (Biting Theorem). Property U (see Definition 2.3.33) is natural in the context of solution flows of a differential equation. Theorem 2.3.37 is due to Gutman (1985). Extensions of Proposition 2.3.39 to Banach space valued functions can be found in Rzezuchowski
(1989). The notation for the various spaces of continuous functions is not standard (see, e.g., Hewitt & Stromberg (1975, p. 86)). For a proof of Theorem 2.3.41 (Riesz-Markov representation theorem) we refer to Denkowski, Migórski & Papageorgiou (2003a, p. 322). Also the names for the various modes of convergence introduced in Definition 2.3.42 vary among authors. So we caution the reader to be careful. Since we are dealing with the space of measures, let us mention two striking results concerning them. Let $(\Omega, \Sigma)$ be a measurable space and let $ca(\Sigma)$ be the space of all signed measures on $\Sigma$ of bounded variation endowed with the total variation norm
\[ \|\mu\|_1 \stackrel{df}{=} |\mu|(\Omega) \qquad \forall\ \mu \in ca(\Sigma). \]
We can also introduce another norm given by
\[ \|\mu\|_\infty \stackrel{df}{=} \sup_{A \in \Sigma} |\mu(A)| \qquad \forall\ \mu \in ca(\Sigma). \]
Then
\[ \|\mu\|_\infty \leq \|\mu\|_1 \leq 4\|\mu\|_\infty \qquad \forall\ \mu \in ca(\Sigma) \]
(i.e., the two norms are equivalent). The space $(ca(\Sigma), \|\cdot\|_1)$ is a Banach space. The first result is a remarkable improvement of the Uniform Boundedness Principle and is known as "Nikodym's boundedness theorem."

PROPOSITION 2.7.2 If $\{\mu_s\}_{s \in S} \subseteq ca(\Sigma)$ and
\[ \sup_{s \in S} |\mu_s(A)| < +\infty \qquad \forall\ A \in \Sigma, \]
then
\[ \sup_{s \in S,\ A \in \Sigma} |\mu_s(A)| < +\infty. \]

The second result is known as "Nikodym's convergence theorem."

PROPOSITION 2.7.3 If $\{\mu_n\}_{n \geq 1} \subseteq ca(\Sigma)$ and
\[ \lim_{n \to +\infty} \mu_n(A) = \mu(A) \ \text{exists} \qquad \forall\ A \in \Sigma, \]
then $\mu \in ca(\Sigma)$ and moreover, if $\mu_n \ll \lambda$ for all $n \geq 1$ with $\lambda \in ca(\Sigma)$, then $\mu \ll \lambda$.

Both results can be found in Diestel (1984, pp. 80 and 90) and Dunford & Schwartz (1958, pp. 309 and 321).
For a proof of Theorem 2.3.48 we refer to Denkowski, Migórski & Papageorgiou (2003a, p. 198) and Parthasarathy (1967, p. 45). Proposition 2.3.49 is due to Brézis & Lieb (1983). Finally we state a compactness result concerning vector measures. The result is known as "Lyapunov's convexity theorem" and has important ramifications in Control Theory (see Hermes & LaSalle (1969)).

THEOREM 2.7.4 Let $(\Omega, \Sigma)$ be a measurable space.
(a) If $\mu_k : \Sigma \longrightarrow \mathbb{R}$, $k = 1, \ldots, N$ are finite nonatomic measures, then $R = \bigcup_{A \in \Sigma} \big\{(\mu_k(A))_{k=1}^N\big\}$ is compact and convex in $\mathbb{R}^N$.
(b) If $X$ is a Banach space with the RNP and $m : \Sigma \longrightarrow X$ is a vector measure which is nonatomic and of bounded variation, then $R = \overline{\bigcup_{A \in \Sigma} \{m(A)\}}^{\|\cdot\|}$ is strongly compact and convex.
2.4: Sobolev spaces were introduced by Sobolev (1963a, 1963b). Related spaces were also studied by Morrey (1940, 1966) and later by Deny & Lions (1953–1954). Today there are many well known books on the subject. We mention Adams (1975), Brézis (1983), Evans & Gariepy (1992), Kufner, John & Fučik (1977), Lions & Magenes (1972), Maz'ja (1985) and Ziemer (1989). We mention that for functions of several variables (i.e., $N > 1$), when $p = 2$, we use the notation $H^m(Z)$ (respectively $H_0^m(Z)$) for the Sobolev space $W^{m,2}(Z)$ (respectively $W_0^{m,2}(Z)$). However, for functions of one variable (i.e., $N = 1$, hence $Z = T = (a,b)$), we keep the notation $W^{m,2}(T)$ (respectively $W_0^{m,2}(T)$). Theorem 2.4.13 is due to Meyers & Serrin (1964). The result is often called "local approximation theorem." A discussion of the various geometric conditions imposed on the boundary $\partial Z$ can be found in Adams (1975, pp. 66–67). For a proof of the approximation result given in Theorem 2.4.17, we refer to Evans & Gariepy (1992, p. 127). To see that without further conditions on the domain $Z$, Theorem 2.4.17 is not true, consider the following example.

EXAMPLE 2.7.5 Let
\[ Z \stackrel{df}{=} \{(z_1, z_2) \in \mathbb{R}^2 : 0 < |z_1| < 1,\ 0 < z_2 < 1\} \]
and
\[ u(z_1, z_2) \stackrel{df}{=} \begin{cases} 1 & \text{if } z_1 > 0, \\ 0 & \text{if } z_1 < 0. \end{cases} \]
Clearly $u \in W^{1,p}(Z)$ ($p \in [1,+\infty)$). However, given $\varepsilon > 0$ sufficiently small, it is easy to see that we cannot find $\vartheta \in C^1(\overline{Z})$, such that $\|u - \vartheta\|_{W^{1,p}(Z)} < \varepsilon$. Note that this particular $Z$ lies on both sides of its boundary.
Another approximation result, useful in optimal control problems, is given below. First a definition.

DEFINITION 2.7.6 Let $Z$ be an open set. We say that $u : Z \longrightarrow \mathbb{R}$ is affine, if it is the restriction to $Z$ of an affine function over $\mathbb{R}^N$. We say that $u : Z \longrightarrow \mathbb{R}$ is piecewise affine, if it is continuous and there exists a partition of $Z$ into a Lebesgue-null set and a finite number of open sets on which $u$ is affine.

REMARK 2.7.7 If $u : Z \longrightarrow \mathbb{R}$ is affine, then
\[ Du = \text{constant} \]
and the converse is true, if $Z$ is connected. We have
\[ u(z) = (Du(z), z)_{\mathbb{R}^N} + c, \]
with $c \in \mathbb{R}$.

PROPOSITION 2.7.8 If $Z \subseteq \mathbb{R}^N$ is a bounded open set which is Lipschitz and $u \in W_0^{1,p}(Z)$ ($p \in (1,+\infty)$), then we can find a sequence $\{u_n\}_{n \geq 1}$ of piecewise affine functions over $Z$, null on $\partial Z$ (i.e., $\{u_n\}_{n \geq 1} \subseteq W_0^{1,p}(Z)$), such that
\[ u_n \longrightarrow u \quad \text{in } W_0^{1,p}(Z). \]

For the case $p = +\infty$, there is the following approximation result.

PROPOSITION 2.7.9 If $Z \subseteq \mathbb{R}^N$ is a bounded open set which is Lipschitz and $u \in W^{1,\infty}(Z)$, then there exists a sequence $\{u_n, Z_n\}_{n \geq 1}$ where $u_n \in W^{1,\infty}(Z)$, $Z_n \subseteq Z$ are open,
\[ Z_n \subseteq Z_{n+1} \quad \forall\ n \geq 1, \qquad \lambda^N(Z \setminus Z_n) \longrightarrow 0, \]
$u_n|_{Z_n}$ are piecewise affine,
\[ u_n(z) = u(z) \quad \forall\ z \in \partial Z,\ n \geq 1, \qquad u_n \longrightarrow u \ \text{uniformly on } Z, \]
\[ Du_n(z) \longrightarrow Du(z) \quad \text{for a.a. } z \in Z \]
and
\[ \|Du_n\|_\infty \leq \|Du\|_\infty + \varepsilon(n), \]
with $\varepsilon(n) \longrightarrow 0$ as $n \to +\infty$.
REMARK 2.7.10 Recall that, if $u \in W^{1,\infty}(Z)$, then it is Lipschitz continuous on $Z$ and so it can be extended continuously to $\overline{Z}$ (i.e., $W^{1,\infty}(Z) \subseteq C(\overline{Z})$). So the boundary values of $u$ are well defined. Both the previous approximation results can be found in Ekeland & Temam (1976, pp. 316–317).

A detailed discussion of Sobolev spaces of fractional order and on manifolds can be found in Adams (1975) and Kufner, John & Fučik (1977). Theorem 2.4.54 can be found in Kenmochi (1975) and Casas & Fernández (1989). Finally we mention a Proposition useful in the interpretation of the variational formulation of various equations, such as the Navier-Stokes equation. The result is due to de Rham (1955).

PROPOSITION 2.7.11 If $Z \subseteq \mathbb{R}^N$ is an open set and $u = (u_k)_{k=1}^N \in \mathcal{D}(Z;\mathbb{R}^N)^*$, then a necessary and sufficient condition that $u = Dh$ for some $h \in \mathcal{D}(Z)^*$ is that
\[ \langle u, \vartheta\rangle = 0 \qquad \forall\ \vartheta \in V \stackrel{df}{=} \{\vartheta \in \mathcal{D}(Z;\mathbb{R}^N) : \mathrm{div}\,\vartheta = 0\}. \]

REMARK 2.7.12 Note that the divergence operator $\mathrm{div}$ maps $W_0^{1,p}(Z;\mathbb{R}^N)$ onto the space
\[ V \stackrel{df}{=} \Big\{ h \in L^p(Z) : \int_Z h(z)\,dz = 0 \Big\} = L^p(Z)/\mathbb{R} \]
(recall that $-\mathrm{div}$ is the adjoint of the gradient operator). For the proof of the trace theorem (see Theorem 2.4.50), we refer to Adams (1975, p. 216) and Kufner, John & Fučik (1977, p. 337) and for the proof of the Extension Theorem (see Theorem 2.4.55), we refer to Brézis (1983, p. 158).
2.5: Theorem 2.5.3 is the classical "Sobolev inequality" (see Sobolev (1963a, 1963b)), which was also developed by Gagliardo (1958), Morrey (1940, 1966) and Nirenberg (1959). The proof given here is due to Nirenberg (1959). For the Poincaré inequality (see Theorem 2.5.4) and the Poincaré-Wirtinger inequality (see Theorem 2.5.21) we refer to Meyers (1978). For the proof of Proposition 2.5.8 we refer to Maz'ja (1985, p. 27). The Sobolev embedding theorem (see Theorem 2.5.16) originated in the work of Sobolev (1963a), with important refinements by Morrey (1940) and Gagliardo (1958). The Rellich-Kondrachov embedding theorem (see Theorem 2.5.17) originated in a paper by Rellich (1930) for $p = 2$ and by Kondrachov (1945) for the general case. For the proofs of both Theorems 2.5.16 and 2.5.17 we refer to Brézis (1983, pp. 168–170). There are variations of this theorem with interesting applications, like the following one due to Frehse (1984).
PROPOSITION 2.7.13 If $Z \subseteq \mathbb{R}^N$ is a bounded open set, $\{u_n\}_{n \geq 1} \subseteq W^{1,p}(Z)$ (with $p \in [1,+\infty)$) is a bounded sequence and
\[ \int_Z \|Du_n\|^{p-2}\,(Du_n, Dh)_{\mathbb{R}^N}\,dz \leq M\|h\|_\infty \qquad \forall\ n \geq 1,\ h \in W^{1,p}(Z) \cap L^\infty(Z), \]
for some $M > 0$, then there exist $u \in W^{1,p}(Z)$ and a subsequence $\{u_{n_k}\}_{k \geq 1}$ of $\{u_n\}_{n \geq 1}$, such that
\[ u_{n_k} \longrightarrow u \quad \text{in } W^{1,r}(Z) \qquad \forall\ r < p. \]

For the proof of Theorem 2.5.24, we refer to Adams (1975, p. 79). Theorem 2.5.28 is another refinement of the Rellich-Kondrachov theorem (and simultaneously of the Egorov theorem; see Theorem A.2.10) and can be found in Evans (1990, p. 8). Theorem 2.5.30 (the Concentration-Compactness Lemma) is due to Lions (1985a, 1985b) and is important in the study of elliptic differential equations involving critical exponents. For additional results in this direction we refer to the work of Ben Naoum, Troestler & Willem (1996) and Bianchi, Chabrowski & Szulkin (1995) and the monographs of Evans (1990) and Willem (1996). Propositions 2.5.34 and 2.5.35 can be found in Mawhin & Willem (1989).

2.6: For the $L^p$-differentiability and $\lambda^N$-a.e. differentiability of Sobolev functions we refer to Bagby & Ziemer (1974), Liu (1977) and Resetnjak (1969) and the books of Evans & Gariepy (1992), Federer (1969), Simon (1983), Stein (1970) and Ziemer (1989). Proposition 2.6.3 is due to Marcus & Mizel (1972), where the interested reader can find additional results in this direction. Proposition 2.6.4 is due to Marcus & Mizel (1979), where the authors prove that the result is also valid for $p = 1$. Functions of bounded variation on $\mathbb{R}$ were introduced by Jordan (1881), who placed integration within the context of a "measurable" set. Lebesgue (1910) proved that a function of bounded variation on $\mathbb{R}$ is almost everywhere differentiable (for a proof which does not use measure theory, except sets of measure zero, we refer to Riesz & Nagy (1955, pp. 3–10)). Before the formal introduction of distributions, extensions of the notion of bounded variation to functions of many variables were suggested by Tonelli (1926) and Cesari (1936). These involved consideration of functions along the coordinate axes. Theorem 2.6.17 is due to Krickerberg (1957). The theory of sets of finite perimeter was introduced by Caccioppoli (1953) and De Giorgi (1954, 1955) (where one can find the Co-Area Formula for $BV$-functions; see Theorem 2.6.22). The proof of Theorem 2.6.22 can also be found in Evans & Gariepy (1992, p. 185) and Ziemer (1989, p. 231). Further contributions were made by Federer (1958), Fleming (1960) and Krickerberg (1957). More details on the space of $BV$-functions can be found in the books of Evans & Gariepy (1992), Giusti (1984) and Ziemer (1989).
Chapter 3 Nonlinear Operators and Young Measures
In this chapter we study certain nonlinear operators which arise in applications and we also discuss the so-called Young measures, which roughly speaking capture the limits of minimizing sequences in variational problems which do not have a solution. For some cases we also develop the corresponding linear theory in order to have a complete picture of the theory, see the similarities and differences of the two and appreciate the limitations of the nonlinear theory. In Section 3.1, we consider compact operators. Compactness was introduced as a first attempt to deal with infinite dimensional nonlinear operator equations. By its nature, compactness approximates infinite objects by finite ones. We see that in the context of compact operators (linear and nonlinear alike) this principle is in general true. We also discuss proper maps, the spectral theory of linear, compact, self-adjoint operators on a Hilbert space and Fredholm operators. A broader framework for the analysis of infinite dimensional problems is provided by monotone operators, which extend to an infinite dimensional context, the simple notion of an increasing real function. In Section 3.2 we examine monotone operators from a Banach space into its dual, with special emphasis on maximal monotone operators, which are a generalization of a continuous increasing real function. Maximal monotone operators have remarkable surjectivity properties. We point out that surjectivity results are important because they correspond to existence results for certain classes of nonlinear operator equations. At the end of the section we also discuss generalizations of the notion of monotonicity. These are the so-called operators of monotone type, the most important of which are the pseudomonotone operators. Monotone operators map a Banach space to its dual. If instead we want to consider nonlinear operators mapping a Banach space to itself, we need to consider accretive and m-accretive operators. 
Their importance comes from the fact that they are the generators of linear and nonlinear semigroups, which, roughly speaking, are an abstraction of the trajectories of a given differential equation. In Section 3.3 first we examine accretive operators and then we look at semigroups of operators generated by certain accretive operators. We present in detail both the linear and nonlinear theories. Undoubtedly the most common nonlinear operator is the so-called Nemytskii (or superposition) operator. In Section 3.4 we examine this operator and we have a first look at integral functionals corresponding to normal integrands. In a variational problem, when the objective functional is not inf-compact, a solution need not exist. Nevertheless, the minimizing sequences (or appropriate subsequences of them) have a limit behaviour (usually more and more oscillating), which is captured by embedding the original functions into the space of Young measures (or parametrized probabilities). This embedding leads to a larger inf-compact problem which has a solution (relaxation). In Section 3.5 we discuss the theory of the Young measures and obtain additional lower semicontinuity results for integral functionals. Some of the topics of this chapter will be revisited in the course of the next chapter.
3.1 Compact and Fredholm Operators
The first efforts to solve nonlinear functional equations involved various aspects of compactness. For this reason compact operators were introduced. They constitute a class of maps to which we can generalize several of the results which are valid for maps between finite dimensional Banach spaces. Degree theory and fixed point theory, which provide important tools for the study of functional equations, depend on the notion of compact maps.

DEFINITION 3.1.1 Let $X$, $Y$ be two Banach spaces and let $D$ be a subset of $X$. We say that $f : D \longrightarrow Y$ is compact, if it is continuous and for every bounded set $B \subseteq D$, the set $\overline{f(B)}$ is compact in $Y$. We denote the set of compact maps by $K(D;Y)$. Also if $D = X$, we set
\[ L_c(X;Y) \stackrel{df}{=} K(X;Y) \cap L(X;Y). \]

REMARK 3.1.2 Evidently $K(D;Y)$ is a linear space which is closed under composition with continuous bounded maps. If $\dim Y < +\infty$, then every continuous bounded map $f : D \longrightarrow Y$ is compact. In the sequel we shall see that the space $K(D;Y)$ consists of precisely those maps which can be approximated by mappings with a finite dimensional range (see Theorem 3.1.10). Note that if $L : X \longrightarrow Y$ is linear and maps bounded sets in $X$ into relatively compact sets in $Y$, then $L \in L_c(X;Y)$ (i.e., $L$ is also continuous). Finally, if $L \in L_c(X;Y)$, then $L$ has a separable range.
Another notion involving compactness is given in the next definition.
DEFINITION 3.1.3 Let X, Y be two Banach spaces and let D be a subset of X. We say that f : D −→ Y is completely continuous, if for every sequence $\{x_n\}_{n\geq 1} \subseteq D$, such that
$$x_n \xrightarrow{w} x \quad \text{in } X,$$
for some x ∈ D, we have that
$$f(x_n) \longrightarrow f(x) \quad \text{in } Y$$
(i.e., f is sequentially continuous from D with the relative weak topology of X into Y with the norm topology).
REMARK 3.1.4 A completely continuous linear operator L : X −→ Y is also known as a Dunford-Pettis operator and is of course continuous. In general the classes of compact maps and completely continuous maps are not comparable. However, for linear operators the situation is better. We can establish that complete continuity actually lies properly between compactness and boundedness.
PROPOSITION 3.1.5 If X, Y are two Banach spaces and L ∈ Lc(X; Y) = K(X; Y) ∩ L(X; Y), then L is completely continuous.
PROOF If $x_n \xrightarrow{w} x$ in X, then the sequence $\{x_n\}_{n\geq 1} \subseteq X$ is bounded. Because L ∈ K(X; Y), we have that $\overline{\{L(x_n)\}_{n\geq 1}}^{\|\cdot\|_Y}$ is compact in Y. Thus we can find a subsequence $\{x_{n_k}\}_{k\geq 1}$ of $\{x_n\}_{n\geq 1}$, such that
$$L(x_{n_k}) \longrightarrow y \quad \text{in } Y.$$
But because L ∈ L(X; Y), we also have
$$L(x_n) \xrightarrow{w} L(x) \quad \text{in } Y.$$
Therefore y = L(x). Since the same argument applies to every subsequence of $\{x_n\}_{n\geq 1}$, we conclude that
$$L(x_n) \longrightarrow L(x) \quad \text{in } Y,$$
i.e., L is completely continuous.
The converse of the above Proposition is not in general true.
EXAMPLE 3.1.6 Recall that the Banach space $l^1$ has the Schur property, namely if $x_n \xrightarrow{w} x$ in $l^1$, then $x_n \to x$ in $l^1$. Using this we see that the identity map $i : l^1 \to l^1$ is a completely continuous linear operator which is not compact (the unit ball of the infinite dimensional space $l^1$ is not relatively compact).
However, if we strengthen the condition on the space X, the situation improves.
PROPOSITION 3.1.7 If X is a reflexive Banach space, Y is a Banach space, D ⊆ X is a nonempty, closed set, and f : D −→ Y is completely continuous, then f ∈ K(D; Y).
PROOF Clearly f is continuous. Let B ⊆ D be a bounded set. We need to show that $\overline{f(B)}$ is compact in Y. To this end let $\{y_n\}_{n\geq 1} \subseteq f(B)$. Then
$$y_n = f(x_n) \quad \text{with } x_n \in B \quad \forall\, n \geq 1.$$
Since X is reflexive, by passing to a subsequence if necessary, we may assume that $x_n \xrightarrow{w} x$ in D. Then
$$f(x_n) \longrightarrow f(x) \quad \text{in } Y$$
and so $\overline{f(B)}$ is indeed compact in Y.
Combining Propositions 3.1.5 and 3.1.7, we have the following.
COROLLARY 3.1.8 If X is a reflexive Banach space, Y is a Banach space and L ∈ L(X; Y), then L is compact if and only if L is completely continuous.
REMARK 3.1.9 In both Proposition 3.1.7 and Corollary 3.1.8 the condition that X is reflexive cannot be relaxed (see Example 3.1.6).
The next theorem gives a characterization of compact maps defined on a bounded set, which explains why compact maps are the suitable class to extend the properties of maps between finite dimensional Banach spaces.
THEOREM 3.1.10 If X, Y are two Banach spaces, D ⊆ X is a bounded set and f : D −→ Y, then the following are equivalent:
(a) f ∈ K(D; Y);
(b) given ε > 0 we can find a continuous, bounded map $f_\varepsilon : D \to Y$, such that
$$\|f(x) - f_\varepsilon(x)\|_Y < \varepsilon \quad \forall\, x \in D, \qquad f_\varepsilon(D) \subseteq \mathrm{conv}\, f(D) \quad \text{and} \quad \dim \mathrm{span}\, f_\varepsilon(D) < +\infty.$$
PROOF “(a)=⇒(b)”: Since f ∈ K(D; Y), we have that the set $\overline{f(D)}$ is compact in Y. So given ε > 0, we can find $\{y_k\}_{k=1}^m \subseteq Y$, such that
$$f(D) \subseteq \bigcup_{k=1}^m B_\varepsilon(y_k).$$
Let
$$a_k(y) \stackrel{df}{=} \max\{\varepsilon - \|y - y_k\|_Y,\, 0\} \quad \text{and} \quad \vartheta_k(y) \stackrel{df}{=} \frac{a_k(y)}{\sum_{k=1}^m a_k(y)} \quad \forall\, y \in f(D).$$
We define
$$f_\varepsilon(x) \stackrel{df}{=} \sum_{k=1}^m \vartheta_k\big(f(x)\big)\, y_k \quad \forall\, x \in D.$$
Evidently the function $f_\varepsilon : D \to Y$ is continuous, $f_\varepsilon(D) \subseteq \mathrm{span}\,\{y_k\}_{k=1}^m$, the set $\overline{f_\varepsilon(D)}$ is compact and
$$\|f(x) - f_\varepsilon(x)\|_Y = \frac{1}{\sum_{k=1}^m a_k(f(x))} \Big\| \sum_{k=1}^m a_k\big(f(x)\big)\big(y_k - f(x)\big) \Big\|_Y < \frac{\sum_{k=1}^m a_k\big(f(x)\big)\,\varepsilon}{\sum_{k=1}^m a_k\big(f(x)\big)} = \varepsilon \quad \forall\, x \in D.$$
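The construction of $f_\varepsilon$ in the step “(a)=⇒(b)” can be tried out numerically. The following is only a sketch in the plane with the Euclidean norm: the function name `schauder_projection`, the grid of centers and the sample point are illustrative assumptions, not part of the text.

```python
import math

def schauder_projection(y, centers, eps):
    """Return sum_k theta_k(y) * y_k, with theta_k = a_k / sum_j a_j."""
    # a_k(y) = max(eps - ||y - y_k||, 0), as in the proof above
    weights = [max(eps - math.dist(y, c), 0.0) for c in centers]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("y is not within eps of any center")
    dim = len(y)
    return tuple(
        sum(w * c[i] for w, c in zip(weights, centers)) / total
        for i in range(dim)
    )

# Hypothetical data: a grid of mesh 0.25 covers the unit square by eps-balls
eps = 0.25
grid = [i * 0.25 for i in range(5)]
centers = [(a, b) for a in grid for b in grid]

y = (0.3, 0.6)                       # plays the role of f(x)
p = schauder_projection(y, centers, eps)
error = math.dist(y, p)              # ||f(x) - f_eps(x)||, expected below eps
```

Only centers within distance ε of y receive a positive weight, so the output is a convex combination of points ε-close to y; this is exactly why the error stays below ε.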
“(b)=⇒(a)”: Let $\varepsilon_n \stackrel{df}{=} \frac{1}{n}$ and let $f_{\varepsilon_n} = f_n$ be the continuous, bounded map with finite dimensional range postulated by statement (b). Then f, being the uniform limit of the sequence $\{f_n\}_{n\geq 1}$ of continuous maps, is itself continuous. Also let y = f(x) with x ∈ D. We have $\|y - y_n\|_Y$
1 being strongly convergent in Y , imply xn −→ x
in X,
then f is proper.
PROOF First suppose that hypothesis (i) holds. Let C ⊆ Y be a compact set. We need to show that $f^{-1}(C)$ is compact in X. Let $\{x_n\}_{n\geq 1} \subseteq f^{-1}(C)$. Then $f(x_n) = y_n \in C$ ∀ n ≥ 1. Because C ⊆ Y is compact, by passing to a suitable subsequence if necessary, we may assume that $y_n \to y \in C$ in Y. The weak coercivity of f implies that the sequence $\{x_n\}_{n\geq 1} \subseteq X$ is bounded. So the sequence $\{u(x_n)\}_{n\geq 1} \subseteq Y$ is relatively compact and we may assume that $u(x_n) \to z$ in Y. Then
$$g(x_n) = f(x_n) - u(x_n) \longrightarrow y - z \quad \text{in } Y.$$
Because g is proper, it follows that the sequence $\{x_n\}_{n\geq 1}$ has a subsequence $\{x_{n_k}\}_{k\geq 1}$, such that $x_{n_k} \to x$ in X. Therefore $f(x_{n_k}) \to f(x)$ and so y = f(x), i.e., $x \in f^{-1}(C)$, which proves the properness of f.
Next suppose that hypothesis (ii) holds. Again
$$f(x_n) = y_n \longrightarrow y \quad \text{in } Y$$
and due to the weak coercivity of f, the sequence $\{x_n\}_{n\geq 1} \subseteq X$ is bounded. Because X is reflexive, we may assume that
$$x_n \xrightarrow{w} x \quad \text{in } X.$$
Then hypothesis (ii) implies that $x_n \to x$ in X and so $y_n = f(x_n) \to f(x)$ in Y,
hence y = f(x). This proves the properness of f.
Compactness and properness are related as follows.
PROPOSITION 3.1.17 If X is a Banach space, D ⊆ X is a closed, bounded set and f ∈ K(D; X), then $id_X - f$ is proper ($id_X$ is the identity operator on X).
PROOF Let C ⊆ X be a compact set and let $\{x_n\}_{n\geq 1} \subseteq (id_X - f)^{-1}(C)$. Then
$$x_n - f(x_n) = c_n, \quad c_n \in C \quad \forall\, n \geq 1.$$
Since C is compact and f ∈ K(D; X), by passing to a suitable subsequence if necessary, we may assume that
$$c_n \longrightarrow c \in C \quad \text{and} \quad f(x_n) \longrightarrow y \quad \text{in } X.$$
Then $x_n = c_n + f(x_n) \to c + y = x$ in X and so $f(x_n) \to f(x)$ in X. Thus y = f(x) and we have c = x − f(x), hence
$$x \in (id_X - f)^{-1}(C),$$
which shows that $(id_X - f)^{-1}(C)$ is compact.
Next we have a closer look at the space of compact linear operators Lc(X; Y).
PROPOSITION 3.1.18 If X, Y are two Banach spaces, then Lc(X; Y) with the operator norm is a Banach space.
PROOF Clearly Lc(X; Y) is a linear subspace of L(X; Y) (see also Remark 3.1.2). Because L(X; Y) with the operator norm is a Banach space, it suffices to show that Lc(X; Y) is closed in L(X; Y). So let $\{L_n\}_{n\geq 1} \subseteq L_c(X;Y)$ and suppose that
$$\|L_n - L\|_L \longrightarrow 0. \qquad (3.1)$$
We need to show that L ∈ Lc(X; Y). Because of (3.1), we have that
$$\sup_{\|x\|_X \leq 1} \|L_n(x) - L(x)\|_Y \longrightarrow 0.$$
So given ε > 0, we can find $n_0 = n_0(\varepsilon) \geq 1$, such that
$$\|L_n(x) - L(x)\|_Y < \frac{\varepsilon}{2} \quad \forall\, n \geq n_0,\ \|x\|_X \leq 1.$$
If
$$B_1^X \stackrel{df}{=} \{x \in X : \|x\|_X < 1\},$$
then the set $L_{n_0}(B_1^X)$ is relatively compact. So we can find a finite set F ⊆ Y, such that
$$L_{n_0}(B_1^X) \subseteq \bigcup_{y \in F} B_{\varepsilon/2}(y).$$
We claim that
$$L(B_1^X) \subseteq \bigcup_{y \in F} B_\varepsilon(y).$$
Indeed for a given $x \in B_1^X$, we can find y ∈ F, such that
$$\|L_{n_0}(x) - y\|_Y < \frac{\varepsilon}{2}.$$
Then
$$\|L(x) - y\|_Y \leq \|L(x) - L_{n_0}(x)\|_Y + \|L_{n_0}(x) - y\|_Y < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,$$
which shows that $L(B_1^X)$ is totally bounded, thus relatively compact.
REMARK 3.1.19 Since composition of two operators, one of which is compact, is again a compact operator, we infer that Lc(X) is a closed two-sided ideal of the Banach algebra L(X). Moreover, it is clear that if L ∈ Lc(X) and dim X = +∞, then a bounded inverse $L^{-1} \in L(X)$ does not exist (i.e., L is a singular operator).
The next characterization of the elements in the Banach space Lc(X; Y) is known as “Schauder’s theorem.” First a definition.
DEFINITION 3.1.20 If X, Y are two Banach spaces and L ∈ L(X; Y), then its adjoint $L^* : Y^* \to X^*$ is the linear operator given by
$$L^*(y^*) \stackrel{df}{=} y^* \circ L \quad \forall\, y^* \in Y^*,$$
i.e.,
$$\langle L^*(y^*), x \rangle_X = \langle y^*, L(x) \rangle_Y \quad \forall\, x \in X,\ y^* \in Y^*,$$
where by $\langle \cdot, \cdot \rangle_Z$ we denote the duality brackets for the pair $(Z, Z^*)$ for any Banach space Z.
REMARK 3.1.21 Clearly
$$L^* \in L(Y^*; X^*) \quad \text{and} \quad \|L\|_L = \|L^*\|_L.$$
So the map $L \mapsto L^*$ is an isometric isomorphism from L(X; Y) into $L(Y^*; X^*)$. Moreover, $(L^{-1})^* = (L^*)^{-1}$ and $L^*(Y^*)$ is closed if and only if it is $w^*$-closed.
THEOREM 3.1.22 (Schauder Theorem) If X, Y are two Banach spaces and L ∈ L(X; Y), then L ∈ Lc(X; Y) if and only if $L^* \in L_c(Y^*; X^*)$.
PROOF “=⇒”: We need to show that $L^*\big(B_1^{Y^*}\big)$ is relatively compact, where
$$B_1^{Y^*} \stackrel{df}{=} \{y^* \in Y^* : \|y^*\|_{Y^*} < 1\}.$$
Let $\{y_n^*\}_{n\geq 1} \subseteq B_1^{Y^*}$ be a sequence. Consider the elements $y_n^*$ for n ≥ 1, restricted on the set $\overline{L(B_1^X)}$, which is compact in Y. Clearly the sequence $\{y_n^*\}_{n\geq 1} \subseteq C\big(\overline{L(B_1^X)}\big)$ is bounded and equicontinuous (see Definition A.1.15). So by the Arzela-Ascoli theorem (see Theorem 2.3.2), the sequence $\{y_n^*\}_{n\geq 1} \subseteq C\big(\overline{L(B_1^X)}\big)$ is relatively compact. Hence we can find a subsequence $\{y_{n_k}^*\}_{k\geq 1}$ of $\{y_n^*\}_{n\geq 1}$, such that
$$\sup_{x \in B_1^X} \big| \langle y_{n_k}^*, L(x) \rangle_Y - \langle y_{n_m}^*, L(x) \rangle_Y \big| \longrightarrow 0 \quad \text{as } k, m \to +\infty.$$
So
$$\lim_{k,m \to +\infty} \|L^*(y_{n_k}^*) - L^*(y_{n_m}^*)\|_{X^*} = \lim_{k,m \to +\infty} \sup_{x \in B_1^X} \big| \langle L^*(y_{n_k}^*) - L^*(y_{n_m}^*), x \rangle_X \big| = \lim_{k,m \to +\infty} \sup_{x \in B_1^X} \big| \langle y_{n_k}^* - y_{n_m}^*, L(x) \rangle_Y \big| = 0.$$
Therefore $\{L^*(y_{n_k}^*)\}_{k\geq 1} \subseteq X^*$ is a Cauchy sequence and so it is convergent. This implies that the set $\overline{L^*\big(B_1^{Y^*}\big)}$ is compact in $X^*$ and so $L^* \in L_c(Y^*; X^*)$.
“⇐=”: Let $r : X \to X^{**}$ be the canonical embedding of X into $X^{**}$. We have
$$L^{**} r = r L.$$
So identifying X with r(X), we have that $L^{**}|_X = L$. From the first part of the proof and since by hypothesis $L^* \in L_c(Y^*; X^*)$, we have that
$$\overline{L^{**}\big(B_1^{X^{**}}\big)} \subseteq Y^{**} \quad \text{is compact},$$
where
$$B_1^{X^{**}} \stackrel{df}{=} \{x^{**} \in X^{**} : \|x^{**}\|_{X^{**}} < 1\}.$$
Since $\overline{L^{**}(B_1^X)}$ is a closed subset of $\overline{L^{**}\big(B_1^{X^{**}}\big)}$, it follows that
$$\overline{L^{**}(B_1^X)} \subseteq Y^{**} \quad \text{is compact}.$$
But as we already established $L^{**}(B_1^X) = L(B_1^X)$. So $\overline{L(B_1^X)} \subseteq Y$ is compact and we conclude that L ∈ Lc(X; Y).
DEFINITION 3.1.23 Let X, Y be two Banach spaces and L ∈ L(X; Y). We say that L is a finite rank operator (or finite dimensional operator or degenerate operator), if dim L(X) < +∞. We denote the space of all finite dimensional operators from X into Y, equipped with the norm inherited from L(X; Y), by Lf(X; Y). If L ∈ Lf(X; Y), then $\mathrm{rank}\, L \stackrel{df}{=} \dim L(X)$.
REMARK 3.1.24 Clearly Lf(X; Y) ⊆ Lc(X; Y). The inclusion is in general strict as the next example illustrates.
EXAMPLE 3.1.25 Consider the operator $L \in L(l^2)$, defined by
$$L(x) \stackrel{df}{=} \Big\{ \frac{x_n}{2^n} \Big\}_{n\geq 1} \quad \forall\, x = \{x_n\}_{n\geq 1} \in l^2.$$
We claim that $L \in L_c(l^2) \setminus L_f(l^2)$. Clearly $L \notin L_f(l^2)$. So let us show that $L \in L_c(l^2)$. We need to show that $L\big(B_1^{l^2}\big) \subseteq l^2$ is relatively compact. For a given ε > 0, find $n_0 = n_0(\varepsilon) \geq 1$, such that
$$\sum_{n=n_0+1}^{\infty} \frac{1}{2^n} \leq \varepsilon.$$
The set
$$C \stackrel{df}{=} \Big\{ \Big( \frac{x_1}{2}, \frac{x_2}{2^2}, \ldots, \frac{x_{n_0}}{2^{n_0}}, 0, \ldots \Big) : |x_n| \leq 1 \Big\}$$
is compact in $l^2$ (view it as a subset of $\mathbb{R}^{n_0}$). So we can find a finite set F ⊆ C, such that
$$C \subseteq \bigcup_{v \in F} B_\varepsilon(v).$$
Let $x \in \overline{B}_1^{\,l^2}$. We have $|x_n| \leq 1$ ∀ n ≥ 1 and so there exists v ∈ F, such that
$$\sum_{n=1}^{n_0} \Big| \frac{x_n}{2^n} - v_n \Big|^2 < \varepsilon^2.$$
Then we have
$$\Big\| \Big\{ \frac{x_n}{2^n} \Big\}_{n\geq 1} - v \Big\|_{l^2}^2 = \sum_{n=1}^{n_0} \Big| \frac{x_n}{2^n} - v_n \Big|^2 + \sum_{n=n_0+1}^{\infty} \Big| \frac{x_n}{2^n} - v_n \Big|^2 < \varepsilon^2 + \varepsilon^2 = 2\varepsilon^2,$$
so $L\big(B_1^{l^2}\big)$ is relatively compact in $l^2$, hence $L \in L_c(l^2)$.
Making use of a finite basis to describe the finite dimensional range of L ∈ Lf(X; Y), we can easily establish the following result.
PROPOSITION 3.1.26 If X, Y are two Banach spaces and L ∈ L(X; Y), then L ∈ Lf(X; Y) if and only if $L^* \in L_f(Y^*; X^*)$. Moreover, rank L = rank $L^*$.
From Theorem 3.1.10 we know that every compact map can be uniformly approximated locally by maps with range in a finite dimensional space. Motivated by this fact, it is natural to ask whether
$$L_c(X;Y) = \overline{L_f(X;Y)}^{\|\cdot\|_L}.$$
In fact for a long time this was one of the major open problems in Banach space theory. But let us formulate the problem precisely. We start with a definition.
DEFINITION 3.1.27 A Banach space Y has the approximation property, if for every Banach space X, we have that
$$L_c(X;Y) = \overline{L_f(X;Y)}^{\|\cdot\|_L}.$$
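For the diagonal operator of Example 3.1.25 the approximation by finite rank operators can be made explicit: truncating to the first N coordinates gives a finite rank operator $L_N$, and for a diagonal operator on $l^2$ the operator norm of $L - L_N$ is the supremum of the remaining diagonal entries, namely $2^{-(N+1)}$. The sketch below checks this numerically; the helper names and the finite `horizon` used to evaluate the supremum are assumptions made for illustration.

```python
def diagonal_entry(n):
    """n-th diagonal entry (n >= 1) of the operator L(x) = (x_n / 2**n)."""
    return 1.0 / 2**n

def truncation_error(N, horizon=60):
    """Operator norm of L - L_N, where L_N keeps the first N coordinates.

    The norm of a diagonal operator on l^2 is the supremum of the absolute
    values of its diagonal entries. The supremum over n > N is evaluated on
    the finite window N < n <= horizon, which is harmless here because the
    entries decrease with n.
    """
    return max(diagonal_entry(n) for n in range(N + 1, horizon + 1))

# Errors for N = 1, ..., 5 halve at each step and tend to 0
errors = [truncation_error(N) for N in range(1, 6)]
```

So this particular L is a norm limit of finite rank operators, in agreement with the fact that Hilbert spaces have the approximation property.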
The famous open (until 1973) problem in Banach space theory is known as the “approximation problem” and asks whether every Banach space Y has the approximation property. It was settled in the negative by Enflo (1973), who found a separable, reflexive Banach space (necessarily infinite dimensional), which lacks the approximation property.
Let us also mention a few things about the spectrum of compact operators. The spaces in much of applied mathematics are actually real vector spaces, as was the case in our study so far. However, to work all the time in real spaces is mathematically inconvenient. Eigenvalue-eigenvector theory is such an instance. The theory is crippled if we insist on real vector spaces. For this reason in the next definition we consider a complex Banach space.
DEFINITION 3.1.28 Let X be a complex Banach space and L ∈ L(X). The resolvent set ρ(L) of L is defined by
$$\rho(L) \stackrel{df}{=} \{\lambda \in \mathbb{C} : (\lambda\, id_X - L)^{-1} \text{ exists and belongs to } L(X)\}.$$
The operator
$$R(\lambda) \stackrel{df}{=} (\lambda\, id_X - L)^{-1}$$
is called the resolvent of L at λ. The points of ρ(L) are called regular values of L. The set
$$\sigma(L) \stackrel{df}{=} \mathbb{C} \setminus \rho(L)$$
is called the spectrum of L. The point spectrum of L is the subset $\sigma_p(L)$ of σ(L) defined by
$$\sigma_p(L) \stackrel{df}{=} \{\lambda \in \sigma(L) : \ker(\lambda\, id_X - L) \neq \{0\}\}.$$
The elements of $\sigma_p(L)$ are called eigenvalues of L and for each $\lambda \in \sigma_p(L)$ the closed subspace $\ker(\lambda\, id_X - L)$ of X is the eigenspace corresponding to the eigenvalue λ, while the nonzero elements of $\ker(\lambda\, id_X - L)$ are called eigenvectors of L.
REMARK 3.1.29 If dim X = +∞ and L ∈ Lc(X), then 0 ∈ σ(L). Indeed, if L has a bounded inverse, then we could define an equivalent norm
$$|||x|||_X \stackrel{df}{=} \|L(x)\|_X \quad \text{on } X,$$
whose closed unit ball is $L(B_1^X)$. But the latter set is compact (since L ∈ Lc(X)) and so dim X < +∞, a contradiction. Also, if dim X < +∞, then from linear algebra we know that the operator $\lambda\, id_X - L$ is invertible (and automatically $(\lambda\, id_X - L)^{-1}$ is continuous) if and only if $\lambda\, id_X - L$ is bijective. So in this case $\sigma(L) = \sigma_p(L)$. In general we can have that $\sigma_p(L) = \emptyset$ and $\sigma(L) \neq \emptyset$.
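EXAMPLE 3.1.30 Consider the Hilbert space $L^2[0,1]$ (over the complex scalars). Let $L \in L\big(L^2[0,1]\big)$ be defined by
$$L(x)(t) \stackrel{df}{=} t\,x(t) \quad \forall\, t \in [0,1].$$
We claim that $\sigma_p(L) = \emptyset$. Indeed, if for some λ ∈ C we have
$$\lambda x(t) = t\,x(t) \quad \forall\, t \in [0,1],$$
then $(\lambda - t)x(t) = 0$ and so
$$x(t) = 0 \quad \text{for a.a. } t \in [0,1].$$
On the other hand [0, 1] ⊆ σ(L). To this end let λ ∈ [0, 1] and take ε > 0, such that
$$[\lambda, \lambda + \varepsilon] \subseteq [0,1] \quad \text{or} \quad [\lambda - \varepsilon, \lambda] \subseteq [0,1].$$
To fix things, we assume that the first is true. We define
$$x_\varepsilon(t) \stackrel{df}{=} \begin{cases} \dfrac{1}{\sqrt{\varepsilon}} & \text{if } t \in [\lambda, \lambda + \varepsilon], \\ 0 & \text{if } t \notin [\lambda, \lambda + \varepsilon]. \end{cases}$$
Then we have
$$\int_0^1 x_\varepsilon(t)^2\, dt = \int_\lambda^{\lambda+\varepsilon} \frac{1}{\varepsilon}\, dt = 1$$
and so we have $\|x_\varepsilon\|_2 = 1$. Also
$$\big(\lambda\, id_X - L\big)(x_\varepsilon)(t) = (\lambda - t)\,x_\varepsilon(t) \quad \forall\, t \in [0,1].$$
Therefore
$$\big\|(\lambda\, id_X - L)(x_\varepsilon)\big\|_2^2 = \int_\lambda^{\lambda+\varepsilon} \frac{1}{\varepsilon}(\lambda - t)^2\, dt = \frac{\varepsilon^2}{3}$$
and so
$$\big(\lambda\, id_X - L\big)(x_\varepsilon) \longrightarrow 0 \quad \text{in } L^2[0,1] \text{ as } \varepsilon \searrow 0.$$
If $\lambda\, id_X - L$ has a bounded inverse, then
$$x_\varepsilon = \big(\lambda\, id_X - L\big)^{-1}(\lambda\, id_X - L)(x_\varepsilon) \longrightarrow 0 \quad \text{as } \varepsilon \searrow 0,$$
a contradiction to the fact that $\|x_\varepsilon\|_2 = 1$ for all ε > 0.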
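The value $\varepsilon^2/3$ obtained in Example 3.1.30 for $\|(\lambda\, id_X - L)(x_\varepsilon)\|_2^2$ can be checked with a simple quadrature. The following sketch uses a midpoint rule; the function name, the choice λ = 0.3 and the number of subintervals are illustrative assumptions.

```python
def residual_norm_squared(lam, eps, steps=10_000):
    """Midpoint-rule approximation of ||(lam*id - L) x_eps||^2 in L^2[0,1].

    Here L(x)(t) = t*x(t) and x_eps = 1/sqrt(eps) on [lam, lam + eps],
    zero elsewhere, so only [lam, lam + eps] contributes to the integral.
    """
    h = eps / steps
    total = 0.0
    for i in range(steps):
        t = lam + (i + 0.5) * h          # midpoint of the i-th subinterval
        x = 1.0 / eps ** 0.5             # value of x_eps at t
        total += ((lam - t) * x) ** 2 * h
    return total

lam = 0.3
values = {eps: residual_norm_squared(lam, eps) for eps in (0.1, 0.01, 0.001)}
```

The computed values shrink like $\varepsilon^2/3$ while each $x_\varepsilon$ keeps norm one, which is the behaviour of an approximate eigenvector that is not an eigenvector.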
Using the theory of analytic functions one can show the following proposition.
PROPOSITION 3.1.31 If X is a complex Banach space and L ∈ L(X), then σ(L) ≠ ∅.
REMARK 3.1.32 The result is no longer valid if we consider a real Banach space.
PROPOSITION 3.1.33 If X is a complex Banach space and λ ≠ 0 is an eigenvalue of L ∈ Lc(X), then $\dim (\lambda\, id_X - L)^{-1}(0) < +\infty$ (i.e., the eigenspace corresponding to λ is finite dimensional).
PROOF Set
$$N_\lambda \stackrel{df}{=} (\lambda\, id_X - L)^{-1}(0)$$
and let B be a bounded subset of $N_\lambda$. For each x ∈ B, we have L(x) = λx. Since L ∈ Lc(X), we have that $\overline{L(B)} \subseteq X$ is compact. Hence $\overline{\lambda B} \subseteq X$ is compact. Since all bounded sets in $N_\lambda$ are relatively compact, it follows that $\dim N_\lambda < +\infty$.
PROPOSITION 3.1.34 If X is a complex Banach space, L ∈ Lc(X) and ε > 0, then L has only finitely many linearly independent eigenvectors corresponding to eigenvalues having absolute value larger than ε.
PROOF Suppose that the assertion is not true. Then we can find a sequence $\{x_n\}_{n\geq 1}$ of linearly independent eigenvectors corresponding to eigenvalues $\lambda_k$ satisfying $|\lambda_k| > \varepsilon$. Set
$$X_n \stackrel{df}{=} \mathrm{span}\,\{x_k\}_{k=1}^n \quad \forall\, n \geq 1.$$
Note that $L(X_n) = X_n$ and use Riesz lemma (see Proposition A.3.15), to obtain $y_n \in X_n$ with $\|y_n\| = 1$, such that
$$d\big(y_n, X_{n-1}\big) > \frac{1}{2}.$$
Let
$$u_n \stackrel{df}{=} \frac{y_n}{\lambda_n}.$$
Then
$$\|u_n\|_X < \frac{1}{\varepsilon} \quad \text{and} \quad L(u_n) \in X_n.$$
Also if $y_n = \sum_{k=1}^n a_k x_k$, we have
$$L(u_n) - y_n = \sum_{k=1}^n \Big( \frac{\lambda_k}{\lambda_n} - 1 \Big) a_k x_k = \sum_{k=1}^{n-1} \Big( \frac{\lambda_k}{\lambda_n} - 1 \Big) a_k x_k \in X_{n-1}.$$
If n > m, then
$$L(u_m) \in X_m \subseteq X_{n-1} \quad \text{and} \quad L(u_n) - y_n \in X_{n-1}.$$
So we have
$$\|L(u_n) - L(u_m)\| \geq d\big(L(u_n), X_{n-1}\big) = d\big(L(u_n) + y_n - L(u_n), X_{n-1}\big) = d\big(y_n, X_{n-1}\big) > \frac{1}{2},$$
so the bounded sequence $\{u_n\}_{n\geq 1}$ has an image $\{L(u_n)\}_{n\geq 1}$ with no convergent subsequence, a contradiction to the fact that L ∈ Lc(X).
PROPOSITION 3.1.35 If X is a Banach space, L ∈ Lc(X) and λ is a nonzero scalar, then $R(\lambda\, id_X - L)$ is closed.
PROOF
Without any loss of generality, we may assume that λ = 1. Set
$$V \stackrel{df}{=} id_X - L \quad \text{and} \quad N_1 \stackrel{df}{=} V^{-1}\big(\{0\}\big).$$
If $\lambda \notin \sigma_p(L)$, then $N_1 = \{0\}$. If $\lambda \in \sigma_p(L)$, then $\dim N_1 < +\infty$ (see Proposition 3.1.33). So in both cases we see that $\dim N_1 < +\infty$. Thus we can write that
$$X = N_1 \oplus E$$
with E being a closed subspace of X. Let
$$\widehat{V} \stackrel{df}{=} V|_E.$$
We have
$$V(X) = V(E) = \widehat{V}(E) \quad \text{and} \quad \widehat{V}^{-1}\big(\{0\}\big) = V^{-1}\big(\{0\}\big) \cap E = \{0\}.$$
This shows that $\widehat{V}$ is injective from E into X. We claim that
$$\inf_{x \in E,\ \|x\|_X = 1} \|\widehat{V}(x)\|_X > 0.$$
Suppose that the claim is not true. Then we can find a sequence $\{x_n\}_{n\geq 1} \subseteq E$ with $\|x_n\|_X = 1$, such that
$$\|\widehat{V}(x_n)\|_X \searrow 0.$$
Since L ∈ Lc(X), by passing to a suitable subsequence if necessary, we may assume that $L(x_n) \to u$ in X. Then
$$x_n = \big(\widehat{V} + L\big)(x_n) \longrightarrow u \quad \text{in } X$$
and so u ∈ E (because E is closed) and $\|u\|_X = 1$. Moreover,
$$\widehat{V}(x_n) \longrightarrow \widehat{V}(u) \quad \text{in } X$$
and so
$$\widehat{V}(u) = 0,$$
a contradiction to the fact that $\widehat{V} : E \to X$ is injective. So the claim is true and therefore there exists c > 0, such that
$$\|\widehat{V}(x)\|_X \geq c\,\|x\|_X \quad \forall\, x \in E.$$
This implies that $\widehat{V}(E)$ is closed. Indeed, let
$$u_n \in \widehat{V}(E) \quad \forall\, n \geq 1$$
and assume that $u_n \to u$ in X. Then
$$u_n = \widehat{V}(x_n) \quad \text{with } x_n \in E \quad \forall\, n \geq 1.$$
We have
$$\|x_n - x_m\|_X \leq \frac{1}{c}\,\|\widehat{V}(x_n - x_m)\|_X = \frac{1}{c}\,\|u_n - u_m\|_X \longrightarrow 0 \quad \text{as } n, m \to +\infty.$$
So $x_n \to x$ in X for some x ∈ E and $\widehat{V}(x_n) \to \widehat{V}(x) = u \in \widehat{V}(E)$ in X. Finally recall that $\widehat{V}(E) = V(E) = V(X)$.
To produce a characterization of the spectrum of a compact operator, we shall need the following straightforward auxiliary result.
LEMMA 3.1.36 If X is a Banach space, L ∈ Lc(X) and $E = (id_X - L)(X)$ is a proper subspace of X, then for every ε > 0 we can find $x_\varepsilon \in \overline{B}_1^X$, such that
$$d_X\big(L(x_\varepsilon), L(E)\big) > 1 - \varepsilon.$$
PROOF By virtue of the Riesz lemma (see Proposition A.3.15), we can find $x_\varepsilon \in X$ with $\|x_\varepsilon\|_X = 1$, such that
$$d\big(x_\varepsilon, E\big) > 1 - \varepsilon.$$
Note that $(id_X - L)(x_\varepsilon) \in E$ and L(E) ⊆ E. Therefore
$$d_X\big(L(x_\varepsilon), L(E)\big) \geq d_X\big(x_\varepsilon - (id_X - L)(x_\varepsilon), E\big) = d_X\big(x_\varepsilon, E\big) > 1 - \varepsilon.$$
Using this lemma we can have the following remarkable property of the spectrum of a compact operator.
THEOREM 3.1.37 If X is a complex Banach space, L ∈ Lc(X) and λ ∈ σ(L) \ {0}, then $\lambda \in \sigma_p(L)$.
PROOF Without any loss of generality, we may assume that λ = 1. Let
$$V \stackrel{df}{=} id_X - L$$
and suppose that
$$V^{-1}\big(\{0\}\big) = \{0\}$$
(i.e., $\lambda = 1 \notin \sigma_p(L)$). We set
$$E_n \stackrel{df}{=} V^n(X)$$
and note that
$$E_n = V^n(X) = V^{n-1}\big(V(X)\big) \subseteq V^{n-1}(X) = E_{n-1} \quad \forall\, n \geq 1.$$
From Proposition 3.1.35, we know that
$$E_n \text{ is closed} \quad \forall\, n \geq 1.$$
Suppose that
$$E_{n+1} \neq E_n \quad \forall\, n \geq 1.$$
Then according to Lemma 3.1.36, we can find $x_n \in \overline{B}_1^X$, such that
$$d_X\big(L(x_n), L(E_{n+1})\big) \geq \frac{1}{2} \quad \forall\, n \geq 1,$$
so
$$\|L(x_n) - L(x_m)\|_X \geq \frac{1}{2} \quad \forall\, n \neq m,$$
a contradiction to the compactness of L. So
$$E_{n+1} = E_n \quad \text{for some } n \geq 1.$$
We shall show that $X = E_0 = E_1$. Suppose that this is not true, i.e., $E_0 \neq E_1$. Let m ≥ 1 be the smallest positive integer, such that
$$E_{m-1} \neq E_m = E_{m+1}.$$
We choose $y \in E_{m-1} \setminus E_m$. Then $V(y) \in E_m = E_{m+1}$. Hence we can find $z \in E_m$, such that
$$V(y) = V(z) \quad \text{and} \quad y \neq z,$$
since $y \notin E_m = E_{m+1}$. Therefore V(y − z) = 0 and so y − z ∈ ker V, a contradiction to the hypothesis that $V^{-1}\big(\{0\}\big) = \{0\}$. So V is surjective and by Banach’s theorem (see Theorem A.3.6), we have that $V^{-1}$ exists and is bounded. Hence λ = 1 ∉ σ(L), a contradiction.
From Theorem 3.1.37, Proposition 3.1.34 and the well known fact from linear algebra, which says that eigenvectors corresponding to distinct eigenvalues are linearly independent, we obtain the following characterization of the spectrum of a compact operator.
THEOREM 3.1.38 If X is an infinite dimensional complex Banach space and L ∈ Lc(X), then
(a) σ(L) is a countable compact set whose only possible limit point is 0;
(b) $\sigma(L) = \{0\} \cup \sigma_p(L)$;
(c) if $\lambda \in \sigma_p(L) \setminus \{0\}$, then the eigenspace of L corresponding to λ is finite dimensional.
REMARK 3.1.39 The above theorem does not say that σ(L) is the disjoint union of {0} and $\sigma_p(L)$. For example if $X = l^2$ and $L \in L_c(l^2)$ is given by
$$L\big(\{x_n\}_{n\geq 1}\big) \stackrel{df}{=} \big(x_1, 0, 0, \ldots\big),$$
then λ = 0 is an eigenvalue of L and the associated eigenspace is infinite dimensional (it has codimension equal to 1). On the other hand the operator $L \in L_c(l^2)$ in Example 3.1.25 is injective and so does not have 0 as an eigenvalue.
As we mentioned in the beginning of this section, compact operators generalize to infinite dimensions the properties of operators between finite dimensional spaces. One such property is that if dim X < +∞ and L ∈ L(X), then L is surjective if and only if L is injective. The result is no longer true if dim X = +∞. For example let $X = l^2$ and let
$$L\big(\{x_n\}_{n\geq 1}\big) \stackrel{df}{=} \big(0, x_1, x_2, \ldots\big)$$
(the right shift operator), which is injective but not surjective. However, if L ∈ Lc(X), then as we show in the sequel, the result is true for $id_X - L$. We start with a definition.
DEFINITION 3.1.40 Let X be a Banach space, A ⊆ X and $C \subseteq X^*$. We introduce the sets $A^\perp \subseteq X^*$ (pronounced “A perp”) and ${}^\perp C \subseteq X$ (pronounced “perp C”), defined by
$$A^\perp \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, a \rangle_X = 0 \text{ for all } a \in A \big\},$$
$${}^\perp C \stackrel{df}{=} \big\{ x \in X : \langle c^*, x \rangle_X = 0 \text{ for all } c^* \in C \big\}.$$
REMARK 3.1.41 The sets $A^\perp$ and ${}^\perp C$ are closed subsets of $X^*$ and X respectively and
$${}^\perp\big(A^\perp\big) = \overline{\mathrm{span}}\, A \quad \text{and} \quad \big({}^\perp C\big)^\perp = \overline{\mathrm{span}}^{\,w^*} C.$$
Also if E is a closed subspace of X, then
$$\big(X/E\big)^* = E^\perp \quad \text{and} \quad X^*/E^\perp = E^*$$
(see, e.g., Beauzamy (1982, pp. 41 and 43)).
LEMMA 3.1.42 If X, Y are two Banach spaces and L ∈ L(X; Y), then
$$\ker L = {}^\perp\big(L^*(Y^*)\big) \quad \text{and} \quad \ker L^* = L(X)^\perp.$$
PROOF Recall that $Y^*$ is a separating family of functions on Y. So x ∈ ker L if and only if
$$\langle L^*(y^*), x \rangle_X = \langle y^*, L(x) \rangle_Y = 0 \quad \forall\, y^* \in Y^*,$$
hence $x \in {}^\perp\big(L^*(Y^*)\big)$. In a similar fashion, we have that $y^* \in \ker L^*$ if and only if
$$\langle y^*, L(x) \rangle_Y = \langle L^*(y^*), x \rangle_X = 0 \quad \forall\, x \in X,$$
so $y^* \in L(X)^\perp$.
LEMMA 3.1.43 If X is a Banach space, L ∈ Lc(X) and λ is a nonzero scalar, then $R(\lambda\, id_X - L) = X$ implies that $\ker(\lambda\, id_X - L) = \{0\}$, i.e., if $\lambda\, id_X - L$ is surjective, then it is injective.
PROOF Without any loss of generality we may assume that λ = 1. Recall that L commutes with $(id_X - L)^n$ (consider the polynomial expansion of $(id_X - L)^n$). So
$$L\big(\ker(id_X - L)^n\big) \subseteq \ker(id_X - L)^n \quad \forall\, n \geq 1.$$
Suppose that although $R(id_X - L) = X$, the operator $id_X - L$ is not injective. Note that
$$(id_X - L)^n(X) = X \quad \forall\, n \geq 1$$
and so $(id_X - L)^{n+1}$ maps some elements of X to 0 that $(id_X - L)^n$ does not. Hence
$$\ker(id_X - L)^n \subsetneq \ker(id_X - L)^{n+1}.$$
Using Riesz lemma (see Proposition A.3.15), we can find $x_n \in \ker(id_X - L)^{n+1}$ with $\|x_n\|_X = 1$, such that
$$\|x_n - y\|_X \geq \frac{1}{2} \quad \forall\, n \geq 1,\ y \in \ker(id_X - L)^n.$$
If n > m, we have $(id_X - L)(x_n) + L(x_m) \in \ker(id_X - L)^n$ and so
$$\|L(x_n) - L(x_m)\|_X = \big\|x_n - \big((id_X - L)(x_n) + L(x_m)\big)\big\|_X \geq \frac{1}{2},$$
a contradiction to the fact that L ∈ Lc(X).
PROPOSITION 3.1.44 If X is a Banach space, L ∈ Lc(X) and λ is a nonzero scalar, then
$$\dim \ker(\lambda\, id_X - L) = \mathrm{codim}\, R(\lambda\, id_X - L)$$
(recall that $\mathrm{codim}\, R(\lambda\, id_X - L) = \dim \big(X/R(\lambda\, id_X - L)\big)$).
PROOF Without any loss of generality, we may assume that λ = 1. From Remark 3.1.41 and Lemma 3.1.42, we have that
$$\big(X/R(id_X - L)\big)^* = R(id_X - L)^\perp = \ker\big(id_X^* - L^*\big). \qquad (3.2)$$
From Theorem 3.1.22, we know that $L^* \in L_c(X^*)$ and so Proposition 3.1.33 implies that
$$\dim \ker\big(id_X^* - L^*\big) < +\infty.$$
A finite dimensional Banach space has the same dimension as its dual. So from (3.2), we have that
$$\mathrm{codim}\, R(id_X - L) = \dim \ker(id_X^* - L^*). \qquad (3.3)$$
Because $L^* \in L_c(X^*)$, from Proposition 3.1.35, we have that $R(id_X^* - L^*)$ is closed, hence $w^*$-closed too (see Remark 3.1.21). So from Remark 3.1.41 and Lemma 3.1.42, we have
$$\ker(id_X - L)^\perp = \big({}^\perp R(id_X^* - L^*)\big)^\perp = \overline{R\big(id_X^* - L^*\big)}^{\,w^*} = R\big(id_X^* - L^*\big),$$
so
$$X^*/R(id_X^* - L^*) = X^*/\ker(id_X - L)^\perp = \big[\ker(id_X - L)\big]^*.$$
Using as before the fact that a finite dimensional Banach space has the same dimension as its dual, we obtain
$$\mathrm{codim}\, R(id_X^* - L^*) = \dim \ker(id_X - L). \qquad (3.4)$$
Suppose that
$$\dim \ker(id_X - L) > \mathrm{codim}\, R(id_X - L).$$
Then we can find a closed subspace E of X, such that
$$X = R(id_X - L) \oplus E.$$
Let $P_E$ be the projection operator onto E. Then $\ker P_E = R(id_X - L)$. We have that
$$X/\ker P_E = X/R(id_X - L) = E$$
and so $\mathrm{codim}\, R(id_X - L) = \dim E$. Therefore there is a bounded linear operator T which is not injective and maps $\ker(id_X - L)$ onto E. Then
$$T \in L_c\big(\ker(id_X - L); X\big).$$
Let F be a closed subspace of X, such that
$$X = \ker(id_X - L) \oplus F$$
and $P_0$ the projection operator onto $\ker(id_X - L)$ and with kernel F. Set
$$G \stackrel{df}{=} L + T P_0.$$
Evidently G ∈ Lc(X) and we have
$$(id_X - G)(X) = \big((id_X - L) - T P_0\big)\big(\ker(id_X - L)\big) + \big((id_X - L) - T P_0\big)(F) = E + (id_X - L)(F) = E + (id_X - L)(X) = X,$$
so, from Lemma 3.1.43, we have that $id_X - G$ is injective. But there is a nonzero $u \in \ker T \subseteq \ker(id_X - L)$, such that
$$(id_X - G)(u) = (id_X - L)(u) - T P_0(u) = 0,$$
a contradiction. So, it follows that
$$\dim \ker(id_X - L) \leq \mathrm{codim}\, R(id_X - L). \qquad (3.5)$$
Moreover, because $L^* \in L_c(X^*)$, we also have
$$\dim \ker(id_X^* - L^*) \leq \mathrm{codim}\, R(id_X^* - L^*). \qquad (3.6)$$
From (3.3), (3.4), (3.5) and (3.6), we conclude that
$$\dim \ker(id_X - L) = \mathrm{codim}\, R(id_X - L).$$
A byproduct of the above proof is the following result. COROLLARY 3.1.45 If X is a Banach space, L ∈ Lc (X) and λ is a nonzero scalar, then dim ker (λidX − L) = dim ker (λid∗X − L∗ ). Clearly Proposition 3.1.44 permits the improvement of Lemma 3.1.43. This is done in the next theorem which summarizes all the above properties of a compact operator. THEOREM 3.1.46 If X is a Banach space, L ∈ Lc (X) and λ is a nonzero scalar, then (a) ker (λidX − L) is finite dimensional; (b) R(λidX − L) is closed and R(λidX − L) = ker (λid∗X − L∗ )⊥ ; (c) ker (λidX − L) = {0} if and only if R(λidX − L) = X; (d) dim ker (λidX − L) = dim ker (λid∗X − L∗ ). REMARK 3.1.47 Statement (c) expresses the fact that λidX − L is injective if and only if λidX − L is surjective, a well known property of linear operators between finite dimensional spaces.
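In finite dimensions statements (b) and (d) of Theorem 3.1.46 can be verified by hand. The sketch below uses a hypothetical 2×2 real matrix L with λ = 1 (the matrix, the vectors and the helper names are assumptions chosen for illustration; the adjoint is the transpose here) and checks that the kernels of $\lambda\, id - L$ and of its adjoint have the same dimension, and that the equation $(\lambda\, id - L)(x) = u$ is solvable exactly when u is orthogonal to the kernel of the adjoint.

```python
# Hypothetical example: L = [[1, 1], [0, 0]] and lam = 1,
# so A = lam*I - L = [[0, -1], [0, 1]].
A = [[0.0, -1.0], [0.0, 1.0]]

def apply(matrix, v):
    """Matrix-vector product for 2x2 matrices stored as nested lists."""
    return [sum(matrix[i][j] * v[j] for j in range(2)) for i in range(2)]

def transpose(matrix):
    return [[matrix[j][i] for j in range(2)] for i in range(2)]

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

kernel_vector = [1.0, 0.0]     # spans ker A (one dimensional)
adjoint_kernel = [1.0, 1.0]    # spans ker A^T (also one dimensional)

def is_solvable(u, tol=1e-12):
    """A x = u has a solution iff u is orthogonal to ker A^T."""
    return abs(dot(u, adjoint_kernel)) < tol

u_good = [-2.0, 2.0]           # orthogonal to (1, 1): solvable
x_particular = [0.0, 2.0]      # one solution of A x = u_good
u_bad = [1.0, 0.0]             # not orthogonal to (1, 1): no solution
```

The solution of the solvable equation is not unique: adding any multiple of `kernel_vector` to `x_particular` gives another one.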
This leads us to the Fredholm alternative theorem, an important tool in the study of integral equations and boundary value problems.
THEOREM 3.1.48 (Fredholm Alternative Theorem) If X is a Banach space, L ∈ Lc(X) and λ is a nonzero scalar, then one and only one of the following two alternatives holds:
(a) for every u ∈ X, the equation $(\lambda\, id_X - L)(x) = u$ has a unique solution x ∈ X; or
(b) the homogeneous equation $(\lambda\, id_X - L)(x) = 0$ has N linearly independent solutions with N ≥ 1; in this case the nonhomogeneous equation $(\lambda\, id_X - L)(x) = u$ has a solution if and only if u verifies N conditions of orthogonality, i.e., $u \in {}^\perp\ker(\lambda\, id_X^* - L^*)$.
Next let us say a few words about the spectrum of a self-adjoint, compact operator on a Hilbert space.
DEFINITION 3.1.49 Let H be a Hilbert space and L ∈ L(H). We say that L is self-adjoint (or hermitian) if and only if $L^* = L$, i.e.,
$$\big(L(x), y\big)_H = \big(x, L(y)\big)_H \quad \forall\, x, y \in H.$$
REMARK 3.1.50 If H is a complex Hilbert space and L ∈ L(H) is a self-adjoint operator, then
$$\big(L(x), x\big)_H = \big(x, L(x)\big)_H = \overline{\big(L(x), x\big)_H},$$
hence
$$\big(L(x), x\big)_H \in \mathbb{R}.$$
Also one can check that
$$\|L^n\|_L = \|L\|_L^n \quad \forall\, n \geq 1$$
and
$$\|L\|_L = \sup_{\|x\|_H \leq 1} \big|\big(L(x), x\big)_H\big|.$$
PROPOSITION 3.1.51 If H is a Hilbert space and L ∈ L(H) is a self-adjoint operator, then all eigenvalues of L are real and eigenvectors corresponding to different eigenvalues are orthogonal.
PROOF Let λ be an eigenvalue with an eigenvector x. We have
$$\big(L(x), x\big)_H = \big(\lambda x, x\big)_H,$$
so, from Remark 3.1.50, we have
$$\lambda = \frac{(L(x), x)_H}{\|x\|_H^2} \in \mathbb{R}.$$
Also if µ is another eigenvalue with an eigenvector y, we have
$$\big(L(x), y\big)_H = \lambda\,(x, y)_H \quad \text{and} \quad \big(L(y), x\big)_H = \mu\,(y, x)_H.$$
Since L is self-adjoint and λ, µ are real, it follows that
$$(\lambda - \mu)\,(x, y)_H = 0.$$
Because λ ≠ µ, we conclude that $(x, y)_H = 0$.
PROPOSITION 3.1.52 If H is a Hilbert space and L ∈ L(H) is a self-adjoint operator, then λ ∈ σ(L) if and only if
$$\inf_{\|x\|_H = 1} \|(\lambda\, id_X - L)(x)\|_H = 0.$$
PROOF “⇐=”: If λ ∈ ρ(L), then
$$(\lambda\, id_X - L)^{-1} \in L(H)$$
and so for x ∈ H with $\|x\|_H = 1$, we have
$$1 = \|x\|_H = \|(\lambda\, id_X - L)^{-1}(\lambda\, id_X - L)(x)\|_H \leq \|(\lambda\, id_X - L)^{-1}\|_L\, \|(\lambda\, id_X - L)(x)\|_H,$$
so
$$\inf_{\|x\|_H = 1} \|(\lambda\, id_X - L)(x)\|_H \geq \|(\lambda\, id_X - L)^{-1}\|_L^{-1} > 0.$$
“=⇒”: We proceed by contradiction. So suppose that
$$\inf_{\|x\|_H = 1} \|(\lambda\, id_X - L)(x)\|_H = c > 0.$$
Then by positive homogeneity, we have
$$\|(\lambda\, id_X - L)(x)\|_H \geq c\,\|x\|_H \quad \forall\, x \in H.$$
Hence $\lambda\, id_X - L$ is injective. If we show that $\lambda\, id_X - L$ is also surjective, then by Banach’s theorem (see Theorem A.3.6), we have that $(\lambda\, id_X - L)^{-1} \in L(H)$, a contradiction to the fact that λ ∈ σ(L). We establish the surjectivity of $\lambda\, id_X - L$ in two steps. First we show that $(\lambda\, id_X - L)(H)$ is dense in H and then that $(\lambda\, id_X - L)(H)$ is closed in H.
Suppose that $(\lambda\, id_X - L)(H)$ is not dense in H. Then we can find u ∈ H \ {0}, such that
$$\big(u, (\lambda\, id_X - L)(x)\big)_H = 0 \quad \forall\, x \in H.$$
Since L is self-adjoint, we have that
$$0 = \big(u, (\lambda\, id_X - L)(x)\big)_H = \big((\overline{\lambda}\, id_X - L)(u), x\big)_H \quad \forall\, x \in H,$$
hence
$$\big(\overline{\lambda}\, id_X - L\big)(u) = 0.$$
This means that $\overline{\lambda} \in \sigma_p(L)$. But from Proposition 3.1.51, we know that $\sigma_p(L) \subseteq \mathbb{R}$. Hence $\lambda = \overline{\lambda}$. Therefore
$$\big(\lambda\, id_X - L\big)(u) = 0, \quad u \neq 0,$$
a contradiction to the fact that
$$\|(\lambda\, id_X - L)(u)\|_H \geq c\,\|u\|_H > 0.$$
This proves that $(\lambda\, id_X - L)(H)$ is dense in H.
Now we show that $(\lambda\, id_X - L)(H)$ is closed in H. To this end let
$$(\lambda\, id_X - L)(x_n) \longrightarrow y \quad \text{in } H.$$
Then we have
$$\|(\lambda\, id_X - L)(x_n - x_m)\|_H \longrightarrow 0 \quad \text{as } n, m \to +\infty.$$
Since
$$\|(\lambda\, id_X - L)(x_n - x_m)\|_H \geq c\,\|x_n - x_m\|_H,$$
we have that
$$\|x_n - x_m\|_H \longrightarrow 0 \quad \text{as } n, m \to +\infty.$$
Therefore $x_n \to x \in H$ and so $(\lambda\, id_X - L)(x_n) \to (\lambda\, id_X - L)(x)$, hence $y = (\lambda\, id_X - L)(x)$, which proves the closedness of $(\lambda\, id_X - L)(H)$. We conclude that
$$\big(\lambda\, id_X - L\big)(H) = H$$
and so by Banach’s theorem, we have $(\lambda\, id_X - L)^{-1} \in L(H)$, a contradiction to the fact that λ ∈ σ(L).
In Proposition 3.1.51, we saw that if L ∈ L(H) is a self-adjoint operator, then $\sigma_p(L) \subseteq \mathbb{R}$. Next we show that in fact the whole spectrum is real.
PROPOSITION 3.1.53 If H is a Hilbert space and L ∈ L(H) is a self-adjoint operator, then σ(L) ⊆ R.
PROOF Let λ = a + ic with c ≠ 0. We show that λ ∈ ρ(L). For every x ∈ H, we have
$$\big((\lambda\, id_X - L)(x), x\big)_H - \big(x, (\lambda\, id_X - L)(x)\big)_H = \lambda\,\|x\|_H^2 - \big(L(x), x\big)_H - \overline{\lambda}\,\|x\|_H^2 + \big(x, L(x)\big)_H = \big(\lambda - \overline{\lambda}\big)\,\|x\|_H^2 = 2ic\,\|x\|_H^2.$$
So for every x ∈ H, we have
$$2|c|\,\|x\|_H^2 = \big|\big((\lambda\, id_X - L)(x), x\big)_H - \big(x, (\lambda\, id_X - L)(x)\big)_H\big| \leq \big|\big((\lambda\, id_X - L)(x), x\big)_H\big| + \big|\big(x, (\lambda\, id_X - L)(x)\big)_H\big| \leq 2\,\|(\lambda\, id_X - L)(x)\|_H\, \|x\|_H,$$
hence
$$|c|\,\|x\|_H \leq \|(\lambda\, id_X - L)(x)\|_H.$$
Invoking Proposition 3.1.52, we infer that λ ∈ ρ(L). Therefore σ(L) ⊆ R.
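We can say more about the position of $\sigma(L)$ in the real line $\mathbb{R}$ when $L \in \mathcal{L}(H)$ is self-adjoint.
PROPOSITION 3.1.54 If $H$ is a Hilbert space and $L \in \mathcal{L}(H)$ is a self-adjoint operator, then $\sigma(L) \subseteq [m, M]$, where
\[ m \stackrel{df}{=} \inf_{\|x\|_H = 1} \big(L(x), x\big)_H \quad \text{and} \quad M \stackrel{df}{=} \sup_{\|x\|_H = 1} \big(L(x), x\big)_H. \]
Moreover, $m, M \in \sigma(L)$.
PROOF From Proposition 3.1.53, we know that $\sigma(L) \subseteq \mathbb{R}$. Let $r > 0$ and $\lambda \stackrel{df}{=} M + r$. Then for every $x \in H$ with $\|x\|_H = 1$, we have
\[ \big((\lambda\,\mathrm{id}_X - L)(x), x\big)_H = \lambda\|x\|_H^2 - \big(L(x), x\big)_H \geq \lambda\|x\|_H^2 - M\|x\|_H^2 = r\|x\|_H^2 = r, \]
so
\[ r \leq \|(\lambda\,\mathrm{id}_X - L)(x)\|_H. \]
Invoking Proposition 3.1.52, we infer that $\lambda \in \varrho(L)$. Similarly if $\lambda = m - r$. So $\sigma(L) \subseteq [m, M]$. Next we show that $M \in \sigma(L)$. Note that
\[ \sigma(L - \mu\,\mathrm{id}_X) = \sigma(L) - \mu \quad \forall\, \mu \in \mathbb{R}. \]
So by replacing $L$ with $L + \mu\,\mathrm{id}_X$ with $\mu > 0$ sufficiently large, we may assume that $0 \leq m \leq M$. Then $M = \|L\|_{\mathcal{L}}$ (see Remark 3.1.50). Let $\{x_n\}_{n \geq 1} \subseteq H$ be a sequence, such that
\[ \|x_n\|_H = 1 \quad \forall\, n \geq 1 \quad \text{and} \quad \big(L(x_n), x_n\big)_H \nearrow M = \|L\|_{\mathcal{L}}. \]
Then we have
\[ \|(M\,\mathrm{id}_X - L)(x_n)\|_H^2 = \big(M x_n - L(x_n), M x_n - L(x_n)\big)_H = M^2\|x_n\|_H^2 + \|L(x_n)\|_H^2 - 2M\big(L(x_n), x_n\big)_H \leq M^2 + M^2 - 2M\big(L(x_n), x_n\big)_H, \]
hence
\[ \|(M\,\mathrm{id}_X - L)(x_n)\|_H \longrightarrow 0. \]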
So Proposition 3.1.52 implies that M ∈ σ(L). The proof that m ∈ σ(L) is similar.
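The statement of Proposition 3.1.54 can be checked numerically in the simplest setting $H = \mathbb{R}^2$. The following is only a sketch under stated assumptions: the matrix `L` is a hypothetical example of ours, the bounds $m$ and $M$ are approximated by sampling the unit circle, and the eigenvalues come from the closed-form formula for symmetric $2 \times 2$ matrices.

```python
import math

def apply(L, x):
    """Apply the 2x2 matrix L (a tuple of rows) to the vector x."""
    return (L[0][0] * x[0] + L[0][1] * x[1],
            L[1][0] * x[0] + L[1][1] * x[1])

def inner(x, y):
    """Euclidean inner product on R^2."""
    return x[0] * y[0] + x[1] * y[1]

def rayleigh_bounds(L, samples=20000):
    """Approximate m = inf (L(x), x) and M = sup (L(x), x) over unit vectors."""
    values = []
    for k in range(samples):
        # Half a circle suffices, because x and -x give the same value.
        t = math.pi * k / samples
        x = (math.cos(t), math.sin(t))
        values.append(inner(apply(L, x), x))
    return min(values), max(values)

def symmetric_eigenvalues(L):
    """Closed-form eigenvalues of a symmetric 2x2 matrix, in increasing order."""
    a, b, d = L[0][0], L[0][1], L[1][1]
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

L = ((2.0, 1.0), (1.0, -1.0))   # hypothetical self-adjoint example
m, M = rayleigh_bounds(L)
lo, hi = symmetric_eigenvalues(L)
print("m ~", m, " smallest eigenvalue:", lo)
print("M ~", M, " largest eigenvalue:", hi)
```

In finite dimensions the spectrum consists of eigenvalues only, so here the sampled bounds should agree with the extreme eigenvalues up to the sampling error, in line with $m, M \in \sigma(L)$.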
PROPOSITION 3.1.55 If $H$ is a Hilbert space and $L \in \mathcal{L}_c(H)$ is a self-adjoint operator, then $\sigma_p(L) \neq \emptyset$.
PROOF If $L = 0$, then $\lambda = 0$ is an eigenvalue of $L$. If $L \neq 0$, then by Proposition 3.1.54, at least one of $m$ or $M$ is a nonzero element of $\sigma(L)$. Invoking Theorem 3.1.38(b), we conclude that $\sigma_p(L) \neq \emptyset$.
PROPOSITION 3.1.56 If $H$ is a Hilbert space and $L \in \mathcal{L}_c(H)$ is a self-adjoint operator, then there is an orthonormal basis of $H$ consisting of eigenvectors of $L$.
PROOF For $\lambda \in \sigma_p(L)$ let
\[ E(\lambda) \stackrel{df}{=} (\lambda\,\mathrm{id}_X - L)^{-1}(0) \]
(the eigenspace corresponding to the eigenvalue $\lambda$). Let $B(\lambda)$ be an orthonormal basis for each finite dimensional eigenspace $E(\lambda)$. By virtue of Proposition 3.1.51, we have that
\[ \bigcup_{\lambda \in \sigma_p(L)} B(\lambda) \ \text{ is an orthonormal set in } H. \]
Suppose that
\[ \overline{\mathrm{span}} \bigcup_{\lambda \in \sigma_p(L)} B(\lambda) \neq H. \]
Then set
\[ F \stackrel{df}{=} \Big[\, \overline{\mathrm{span}} \bigcup_{\lambda \in \sigma_p(L)} B(\lambda) \Big]^{\perp}. \]
Clearly $L(F) \subseteq F$. So $L|_F$ has an eigenvalue (see Proposition 3.1.55). Let $u \in F \setminus \{0\}$ be an eigenvector of $L|_F$. Evidently $u$ is an eigenvector of $L$ and so
\[ F \cap \overline{\mathrm{span}} \bigcup_{\lambda \in \sigma_p(L)} B(\lambda) \neq \{0\}, \]
a contradiction. Therefore
\[ \overline{\mathrm{span}} \bigcup_{\lambda \in \sigma_p(L)} B(\lambda) = H. \]
Now we can state the so-called spectral theorem for compact self-adjoint operators on a separable Hilbert space.
THEOREM 3.1.57 (Spectral Theorem) If $H$ is an infinite dimensional separable Hilbert space and $L \in \mathcal{L}_c(H)$ is a self-adjoint operator, then there exists an orthonormal basis $\{e_n\}_{n \geq 1}$ of $H$ formed by eigenvectors of $L$, such that
\[ L(x) = \sum_{n=1}^{\infty} \lambda_n (x, e_n)_H\, e_n \quad \forall\, x \in H, \]
with $\{\lambda_n\}_{n \geq 1}$ being the eigenvalues corresponding to $\{e_n\}_{n \geq 1}$.
PROOF From Proposition 3.1.56, we know that there exists an orthonormal basis of $H$ consisting of eigenvectors of $L$. This orthonormal basis is countable, because $H$ is separable. Denote it by $\{e_n\}_{n \geq 1}$. For every $x \in H$ and $m \geq k \geq 1$, since
\[ |\lambda_n| \leq \sup_{\|x\|_H \leq 1} \big|\big(L(x), x\big)_H\big| = \|L\|_{\mathcal{L}} \]
(see Remark 3.1.50), we have
\[ \Big\| \sum_{n=k}^{m} \lambda_n (x, e_n)_H\, e_n \Big\|_H^2 = \sum_{n=k}^{m} \big|\lambda_n (x, e_n)_H\big|^2 \leq \|L\|_{\mathcal{L}}^2 \sum_{n=k}^{m} \big|(x, e_n)_H\big|^2 \longrightarrow 0 \quad \text{as } k, m \to +\infty, \]
so $\sum_{n=1}^{\infty} \lambda_n (x, e_n)_H\, e_n$ is convergent in $H$.
Moreover, if $x \in H$ with $\|x\|_H \leq 1$, then for every $m \geq 1$, we have
\[ \Big\| \sum_{n=1}^{m} \lambda_n (x, e_n)_H\, e_n \Big\|_H^2 = \sum_{n=1}^{m} |\lambda_n|^2 \big|(x, e_n)_H\big|^2 \leq \|L\|_{\mathcal{L}}^2 \sum_{n=1}^{m} \big|(x, e_n)_H\big|^2 \leq \|L\|_{\mathcal{L}}^2 \sum_{n=1}^{\infty} \big|(x, e_n)_H\big|^2 = \|L\|_{\mathcal{L}}^2 \|x\|_H^2. \tag{3.7} \]
Therefore, if we define
\[ T(x) \stackrel{df}{=} \sum_{n=1}^{\infty} \lambda_n (x, e_n)_H\, e_n, \]
from (3.7), we see that $T \in \mathcal{L}(H)$. Note that $L(e_n) = T(e_n)$ for all $n \geq 1$. So by linearity and continuity, we conclude that $L = T$.
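A finite dimensional analogue of the expansion in Theorem 3.1.57 can be verified directly. The sketch below assumes $H = \mathbb{R}^2$ and a hypothetical symmetric matrix with entries `a`, `b`, `d`; the orthonormal eigenvectors are obtained from the classical rotation angle that diagonalises a symmetric $2 \times 2$ matrix, and the finite sum $\sum_n \lambda_n (x, e_n) e_n$ is compared with $L(x)$.

```python
import math

def eigenpairs(a, b, d):
    """Eigenvalues and orthonormal eigenvectors of the symmetric matrix [[a, b], [b, d]]."""
    # Rotation angle that diagonalises the matrix (tan(2*theta) = 2b / (a - d)).
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    e1 = (math.cos(theta), math.sin(theta))
    e2 = (-math.sin(theta), math.cos(theta))
    # Each eigenvalue is the quadratic form evaluated at its unit eigenvector.
    lam1 = a * e1[0] ** 2 + 2.0 * b * e1[0] * e1[1] + d * e1[1] ** 2
    lam2 = a * e2[0] ** 2 + 2.0 * b * e2[0] * e2[1] + d * e2[1] ** 2
    return [(lam1, e1), (lam2, e2)]

def expand(pairs, x):
    """Evaluate sum_n lambda_n (x, e_n) e_n."""
    out = [0.0, 0.0]
    for lam, e in pairs:
        coeff = lam * (x[0] * e[0] + x[1] * e[1])
        out[0] += coeff * e[0]
        out[1] += coeff * e[1]
    return tuple(out)

a, b, d = 2.0, 1.0, -1.0          # hypothetical example matrix
x = (0.3, -1.7)                   # hypothetical test vector
direct = (a * x[0] + b * x[1], b * x[0] + d * x[1])
series = expand(eigenpairs(a, b, d), x)
print("L(x)            =", direct)
print("spectral series =", series)
```

Both printed vectors should coincide up to rounding, which is the content of $L = T$ in the proof above.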
Before passing to Fredholm operators, let us mention two more results on compact maps.
PROPOSITION 3.1.58 If $X, Y$ are two Banach spaces, $U \subseteq X$ is an open set, $f \in K(U; Y)$ and it is Fréchet differentiable, then
\[ f'(x) \in \mathcal{L}_c(X; Y) \quad \forall\, x \in U. \]
PROOF Suppose that $f'(x)$ is not compact. Then we can find a sequence $\{u_n\}_{n \geq 1} \subseteq X$ with $\|u_n\|_X \leq 1$ for all $n \geq 1$ and $\varepsilon > 0$, such that
\[ \|f'(x)u_n - f'(x)u_m\|_Y \geq \varepsilon \quad \forall\, n \neq m. \]
We have
\[ f(x + h) - f(x) = f'(x)h + o_x(h), \]
where
\[ \|o_x(h)\|_Y \leq \frac{\varepsilon}{3}\|h\|_X \quad \forall\, h \in X,\ \|h\|_X \leq \delta, \]
for some $\delta = \delta(\varepsilon, x) > 0$. Therefore
\[ \|f(x + \delta u_n) - f(x + \delta u_m)\|_Y \geq \delta\,\|f'(x)(u_n - u_m)\|_Y - \|o_x(\delta u_n)\|_Y - \|o_x(\delta u_m)\|_Y \geq \varepsilon\delta - \frac{2\varepsilon}{3}\delta = \frac{\varepsilon}{3}\delta, \]
a contradiction to the fact that $f \in K(U; Y)$.
The converse of the above result is also true, provided that the map $x \longmapsto f'(x)$ belongs to $K(U; \mathcal{L}(X; Y))$. For details see Vaĭnberg (1973, pp. 47 and 51).
PROPOSITION 3.1.59 Let $X, Y$ be two Banach spaces.
(a) If $f : X \longrightarrow Y$ is a Fréchet differentiable operator, $f'(x) \in \mathcal{L}_c(X; Y)$ for every $x \in X$ and $x \longmapsto f'(x)$ belongs to $K\big(X; \mathcal{L}(X; Y)\big)$, then $f \in K(X; Y)$.
(b) If $f \in K(X; Y)$, $f$ is Fréchet differentiable, and $x \longmapsto f'(x)$ belongs to $K\big(X; \mathcal{L}(X; Y)\big)$, then $f$ is completely continuous.
The study of compact operators leads us to the following definition.
DEFINITION 3.1.60 Let $X, Y$ be two Banach spaces and $L \in \mathcal{L}(X; Y)$. We say that $L$ is a Fredholm operator, if
\[ \alpha(L) = \dim \ker L < +\infty \quad \text{and} \quad \beta(L) = \mathrm{codim}\, R(L) < +\infty. \]
The class of Fredholm operators is denoted by $\Phi(X; Y)$. The quantity $\alpha(L)$ is called the kernel index of $L$ and the quantity $\beta(L)$ is called the deficiency index of $L$. The index of $L$ is defined by
\[ \mathrm{ind}\,(L) \stackrel{df}{=} \alpha(L) - \beta(L). \]
If we have only that $\alpha(L) < +\infty$ and that $R(L)$ is closed, then we say that $L$ is a semi-Fredholm operator and the class of all semi-Fredholm operators is denoted by $\Phi_+(X; Y)$. If $X = Y$ we write $\Phi(X)$ and $\Phi_+(X)$.
REMARK 3.1.61 We have
\[ \Phi(X; Y) \subseteq \Phi_+(X; Y), \]
since as we show in the sequel, the condition that $\beta(L) < +\infty$ implies that $R(L)$ is closed.
LEMMA 3.1.62 If $X, Y$ are two Banach spaces, $L \in \mathcal{L}(X; Y)$ is injective and $L^{-1} : R(L) \longrightarrow X$ is bounded, then $R(L)$ is closed.
PROOF Let
\[ L(x_n) \longrightarrow y \quad \text{in } Y. \]
Since by hypothesis $L$ has a bounded inverse on $R(L)$, we must have that
\[ \|L(x_n) - L(x_m)\|_Y \geq c\,\|x_n - x_m\|_X \quad \forall\, n \neq m, \]
for some $c > 0$. Therefore $\{x_n\}_{n \geq 1} \subseteq X$ is a Cauchy sequence and we have that $x_n \longrightarrow x$, for some $x \in X$. Hence
\[ L(x_n) \longrightarrow L(x) \quad \text{in } Y \]
and so $y = L(x)$. Therefore $y \in R(L)$ and we conclude that $R(L)$ is closed in $Y$.
The next definition formalizes an idea which was used in earlier proofs, when we wanted to get rid of the nontrivial kernel of an operator $L \in \mathcal{L}(X; Y)$.
DEFINITION 3.1.63 Let $X, Y$ be two Banach spaces and $L \in \mathcal{L}(X; Y)$. The operator $\widehat{L}$ induced by $L$ is the operator from $X / \ker L$ into $Y$ defined by
\[ \widehat{L}\big([x]\big) \stackrel{df}{=} L(x) \quad \forall\, x \in X. \]
REMARK 3.1.64 Evidently $\widehat{L}$ is injective and
\[ R\big(\widehat{L}\big) = R(L). \]
In fact it is straightforward to show that $\widehat{L}$ is continuous and
\[ \|L\|_{\mathcal{L}} = \big\|\widehat{L}\big\|_{\mathcal{L}}. \]
PROPOSITION 3.1.65 If $X, Y$ are two Banach spaces and $L \in \mathcal{L}(X; Y)$, then $R(L)$ is closed if and only if there exists $c > 0$, such that
\[ \|L(x)\|_Y \geq c\, d_X(x, \ker L) \quad \forall\, x \in X. \]
PROOF Consider the operator $\widehat{L}$ induced by $L$ (see Definition 3.1.63). We know that $\widehat{L}$ is injective and
\[ R\big(\widehat{L}\big) = R(L) \]
(see Remark 3.1.64). By virtue of Lemma 3.1.62, $R(L) = R\big(\widehat{L}\big)$ is closed if and only if $\widehat{L}$ has a bounded inverse, which in turn is equivalent to saying that
\[ 0 < c = \inf_{x \notin \ker L} \frac{\|\widehat{L}([x])\|_Y}{\|[x]\|} = \inf_{x \notin \ker L} \frac{\|L(x)\|_Y}{d_X(x, \ker L)}. \]
We can define the quantity
\[ \gamma(L) \stackrel{df}{=} \inf_{x \notin \ker L} \frac{\|L(x)\|_Y}{d_X(x, \ker L)}. \]
This quantity is known as the minimum modulus of $L$. From the previous discussion, we have the following proposition.
PROPOSITION 3.1.66 If $X, Y$ are two Banach spaces and $L : X \longrightarrow Y$ is linear, then any two of the following three properties imply the other:
(a) $L \in \mathcal{L}(X; Y)$;
(b) $R(L)$ is closed in $Y$;
(c) $\gamma(L) > 0$.
PROPOSITION 3.1.67 If $X, Y$ are two Banach spaces, $L \in \mathcal{L}(X; Y)$ and $E$ is a closed subspace of $Y$, such that $R(L) \oplus E$ is closed in $Y$, then $R(L)$ is closed.
PROOF Let $L_0 \in \mathcal{L}(X \times E; Y)$ be defined by
\[ L_0(x, u) \stackrel{df}{=} L(x) + u \quad \forall\, (x, u) \in X \times E. \]
Since $R(L) \cap E = \{0\}$, we have that $\ker L_0 = \ker L \times \{0\}$. By hypothesis $R(L_0) = R(L) \oplus E$ is closed. So according to Proposition 3.1.66, we have that $\gamma(L_0) > 0$. Then for all $x \in X$, we have
\[ \|L(x)\|_Y = \|L_0(x, 0)\|_Y \geq \gamma(L_0)\, d\big((x, 0), \ker L_0\big) = \gamma(L_0)\, d_X(x, \ker L), \]
so $\gamma(L) \geq \gamma(L_0) > 0$. Invoking Proposition 3.1.66, we conclude that $R(L) \subseteq Y$ is closed.
COROLLARY 3.1.68 If $X, Y$ are two Banach spaces, $L \in \mathcal{L}(X; Y)$ and $\beta(L) < +\infty$, then $R(L)$ is closed.
PROOF We have
\[ Y = R(L) \oplus E \]
for some finite dimensional subspace $E$ of $Y$. Apply Proposition 3.1.67.
This corollary implies that every $L \in \Phi(X; Y)$ has closed range and so $\Phi(X; Y) \subseteq \Phi_+(X; Y)$ (see Remark 3.1.61). The propositions that follow summarize some of the basic properties of Fredholm operators.
PROPOSITION 3.1.69 If $X, Y$ are two Banach spaces and $L \in \Phi(X; Y)$, then
(a) if $\mathrm{ind}\, L = 0$ and $\ker L = \{0\}$, then for every $y \in Y$ the equation $L(x) = y$ has a unique solution and $L^{-1}$ exists and is bounded (i.e., $L^{-1} \in \mathcal{L}(Y; X)$);
(b) for given $y \in Y$, the equation $L(x) = y$ has a solution if and only if
\[ \langle y^*, y \rangle_Y = 0 \quad \forall\, y^* \in \ker L^*, \]
i.e., $y \in {}^{\perp}(\ker L^*)$;
(c) $L^* \in \Phi(Y^*; X^*)$ and $\alpha(L^*) = \beta(L)$, $\beta(L^*) = \alpha(L)$, $\mathrm{ind}\, L^* = -\mathrm{ind}\, L$.
PROOF (a) Since $\ker L = \{0\}$, $L$ is injective. Also because $\mathrm{ind}\, L = 0$, we have $\beta(L) = 0$ and so $R(L) = Y$. Invoking Banach's Theorem, we conclude that $L^{-1} \in \mathcal{L}(Y; X)$ and $L(x) = y$ has a unique solution.
(b) From Corollary 3.1.68, we know that $R(L)$ is closed. Hence
\[ R(L) = {}^{\perp}(\ker L^*). \]
(c) Since $R(L)$ is closed, we have
\[ \ker L^* = R(L)^{\perp} \quad \text{and} \quad {}^{\perp}\big(R(L^*)\big) = \ker L. \]
So
\[ \alpha(L^*) = \beta(L) \quad \text{and} \quad \beta(L^*) = \alpha(L). \]
Because $L \in \Phi(X; Y)$, it follows that
\[ L^* \in \Phi(Y^*; X^*) \quad \text{and} \quad \mathrm{ind}\, L^* = -\mathrm{ind}\, L. \]
The next proposition gives a basic stability property of Fredholm operators.
PROPOSITION 3.1.70 If $X, Y$ are two Banach spaces, $L \in \Phi(X; Y)$ and $T \in \mathcal{L}_c(X; Y)$, then $G \stackrel{df}{=} L + T \in \Phi(X; Y)$ and $\mathrm{ind}\, G = \mathrm{ind}\, L$.
REMARK 3.1.71 The result is also true if instead of $T \in \mathcal{L}_c(X; Y)$ we assume that $T \in \mathcal{L}(X; Y)$ with $\|T\|_{\mathcal{L}} \leq \delta$ for some $\delta = \delta(L) > 0$. Also note that in particular Proposition 3.1.70 implies that if $T \in \mathcal{L}_c(X)$, then $\mathrm{id}_X - T \in \Phi(X)$.
PROPOSITION 3.1.72 If $X$ is a Banach space and $L \in \mathcal{L}(X)$, then $L \in \Phi_+(X)$ if and only if for every closed and bounded set $B \subseteq X$, $L|_B$ is proper.
PROOF "$\Longrightarrow$": Let $\{x_n\}_{n \geq 1} \subseteq B$ be such that
\[ L(x_n) \longrightarrow u \quad \text{in } X. \]
We have that $X = \ker L \oplus E$, with a closed subspace $E \subseteq X$ (since $\dim \ker L < +\infty$). So $x_n = z_n + e_n$ with $z_n \in \ker L$, $e_n \in E$. We have
\[ L(x_n) = L(e_n) \longrightarrow u \quad \text{in } X. \]
The operator $L|_E = \widehat{L}$ (see Definition 3.1.63) is injective and so by Banach's Theorem,
\[ (L|_E)^{-1} \in \mathcal{L}\big(R(L); E\big). \]
Therefore, we have
\[ e_n \longrightarrow e \quad \text{in } X, \]
for some $e \in E$. The sequence $\{z_n\}_{n \geq 1} \subseteq \ker L$ is bounded. So exploiting the finite dimensionality of $\ker L$, we have that the sequence $\{z_n\}_{n \geq 1}$ is relatively compact in $X$. Therefore we conclude that the sequence $\{x_n\}_{n \geq 1}$ is relatively compact in $X$, which proves that $L|_B$ is proper.
"$\Longleftarrow$": The set
\[ \big\{ x \in X : x \in \ker L,\ \|x\|_X \leq 1 \big\} \]
is compact by hypothesis. So it follows that $\ker L$ is finite dimensional. We can write $X = \ker L \oplus E$, with a closed subspace $E \subseteq X$. Then Proposition 3.1.67 implies that $R(L)$ is closed, hence $L \in \Phi_+(X)$.
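In finite dimensions every linear map is Fredholm, and the quantities of Definition 3.1.60 can be computed by elimination. The sketch below assumes $X = \mathbb{R}^3$, $Y = \mathbb{R}^2$ and two hypothetical matrices `L` and `T` chosen by us; it computes $\alpha$, $\beta$ and the index via the rank, and illustrates the index stability of Proposition 3.1.70 (every operator between finite dimensional spaces is compact).

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix, by Gaussian elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a pivot in this column at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [u - factor * v for u, v in zip(m[i], m[r])]
        r += 1
    return r

def fredholm_data(rows):
    """Return (alpha, beta, ind) for the map R^n -> R^m given by an m x n matrix."""
    n_rows, n_cols = len(rows), len(rows[0])
    r = rank(rows)
    alpha = n_cols - r   # dim ker L, by the rank-nullity theorem
    beta = n_rows - r    # codim R(L)
    return alpha, beta, alpha - beta

L = [[1, 2, 3], [2, 4, 6]]   # hypothetical rank-one map from R^3 to R^2
T = [[0, 0, 0], [0, 1, 0]]   # hypothetical perturbation
G = [[l + t for l, t in zip(l_row, t_row)] for l_row, t_row in zip(L, T)]
print("L:", fredholm_data(L))
print("G = L + T:", fredholm_data(G))
```

The perturbation changes both $\alpha$ and $\beta$, yet the index stays equal to $\dim X - \dim Y$, which is the finite dimensional shadow of $\mathrm{ind}\, G = \mathrm{ind}\, L$.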
3.2 Operators of Monotone Type
Operators of monotone type were introduced to provide an analytic framework broader than compact operators for the study of nonlinear functional equations. The systematic study of monotone operators started in the early 1960s and marks the advent of nonlinear functional analysis. Monotone operators are rooted in the theory of variational problems. Moreover, recalling that the Gâteaux derivative of a convex function is the prototypical example of a nonlinear monotone operator, it is no surprise that for a long period the theory of monotone operators and convex analysis developed in parallel and interacted heavily.
The mathematical framework of the analysis in this section is the following. Let $X$ be a reflexive Banach space and $X^*$ its topological dual. By $\langle \cdot, \cdot \rangle_X$ we denote the duality pairing for the spaces $X^*$ and $X$. Also
\[ A : X \supseteq D(A) \longrightarrow 2^{X^*} \]
is a generally multivalued operator. The domain $D(A)$ of $A$ is defined by
\[ D(A) \stackrel{df}{=} \big\{ x \in X : A(x) \neq \emptyset \big\} \]
and the graph $\mathrm{Gr}\, A$ of $A$ is defined by
\[ \mathrm{Gr}\, A \stackrel{df}{=} \big\{ (x, x^*) \in X \times X^* : x^* \in A(x) \big\}. \]
Also we can define $A^{-1} : X^* \supseteq D(A^{-1}) \longrightarrow 2^X$ by
\[ \mathrm{Gr}\, A^{-1} \stackrel{df}{=} \big\{ (x^*, x) \in X^* \times X : (x, x^*) \in \mathrm{Gr}\, A \big\}. \]
Note that $A^{-1}$ is always defined as a multivalued operator. Some of the results in this section are actually true in a more general setting. However, in order to have a uniform presentation, we have chosen to work in the above setting, which after all is what we encounter in most applications.
DEFINITION 3.2.1 Let $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ be a multivalued operator.
(a) We say that $A$ is monotone, if
\[ \langle x^* - y^*, x - y \rangle_X \geq 0, \]
for all $x, y \in D(A)$ and all $x^* \in A(x)$, $y^* \in A(y)$.
(b) We say that $A$ is strictly monotone, if it is monotone and
\[ \langle x^* - y^*, x - y \rangle_X > 0, \]
for all $x, y \in D(A)$, $x \neq y$ and all $x^* \in A(x)$, $y^* \in A(y)$.
(c) We say that $A$ is strongly monotone, if there exists $c > 0$, such that
\[ \langle x^* - y^*, x - y \rangle_X \geq c\,\|x - y\|_X^2, \]
for all $x, y \in D(A)$ and all $x^* \in A(x)$, $y^* \in A(y)$.
(d) We say that $A$ is uniformly monotone, if there exists a continuous function $c : \mathbb{R}_+ \longrightarrow \mathbb{R}_+$, which is strictly increasing, $c(0) = 0$, $c(r) \longrightarrow +\infty$ as $r \to +\infty$ and
\[ \langle x^* - y^*, x - y \rangle_X \geq c\big(\|x - y\|_X\big)\|x - y\|_X, \]
for all $x, y \in D(A)$ and all $x^* \in A(x)$, $y^* \in A(y)$.
(e) We say that $A$ is coercive, if $D(A)$ is bounded or $D(A)$ is unbounded and
\[ \frac{\inf\{\langle x^*, x \rangle_X : x^* \in A(x)\}}{\|x\|_X} \longrightarrow +\infty \quad \text{as } \|x\|_X \to +\infty,\ x \in D(A). \]
We say that $A$ is weakly coercive, if $D(A)$ is bounded or $D(A)$ is unbounded and
\[ \inf_{x^* \in A(x)} \|x^*\|_{X^*} \longrightarrow +\infty \quad \text{as } \|x\|_X \to +\infty,\ x \in D(A). \]
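The difference between strict and strong monotonicity in Definition 3.2.1 can be probed numerically for single-valued maps on $X = \mathbb{R}$. The sketch below is only an illustration: the two functions and the sample grid are our own hypothetical choices, and a check over finitely many sample pairs gives evidence, not a proof. It computes the smallest value of $\langle f(x) - f(y), x - y \rangle / |x - y|^2$ over the sampled pairs, which is a candidate for the constant $c$ in part (c).

```python
import itertools
import math

def monotonicity_ratio(f, points):
    """Smallest value of (f(x) - f(y)) (x - y) / (x - y)^2 over distinct sample pairs."""
    return min((f(x) - f(y)) * (x - y) / (x - y) ** 2
               for x, y in itertools.combinations(points, 2))

# Sample grid on [-3, 3] with step 0.01 (hypothetical choice).
points = [k / 100.0 for k in range(-300, 301)]

# x -> x^3 is strictly monotone, but the ratio degenerates near the origin.
cubic = monotonicity_ratio(lambda x: x ** 3, points)

# x -> 2x + sin(x) has derivative at least 1, so c = 1 works in part (c).
shifted = monotonicity_ratio(lambda x: 2.0 * x + math.sin(x), points)

print("x^3        :", cubic)
print("2x + sin x :", shifted)
```

For the cubic the ratio equals $x^2 + xy + y^2$, which is positive for $x \neq y$ but has infimum $0$, so the map is strictly but not strongly monotone; refining the grid drives the printed value towards $0$.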
REMARK 3.2.2 If $A$ is strongly monotone, then $A$ is uniformly monotone. If $A$ is uniformly monotone, then $A$ is strictly monotone. If $A$ is strictly monotone, then $A$ is monotone. If $A$ is uniformly monotone, then $A$ is coercive. If $A$ is coercive, then $A$ is weakly coercive. Sometimes it is convenient to identify $A$ with its graph. For this reason some authors speak of monotone sets in $X \times X^*$.
DEFINITION 3.2.3 A monotone map $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is said to be maximal monotone, if the inequality
\[ \langle x^* - y^*, x - y \rangle_X \geq 0 \quad \forall\, (x, x^*) \in \mathrm{Gr}\, A \]
implies that $(y, y^*) \in \mathrm{Gr}\, A$.
REMARK 3.2.4 The above definition implies that the graph of a maximal monotone map is not properly included in the graph of another monotone map (i.e., it is maximal with respect to inclusion).
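The test in Definition 3.2.3 can be tried out on $X = \mathbb{R}$. In the sketch below, the two graphs and the candidate pair are hypothetical choices of ours, and only finitely many graph points are sampled, so a positive answer is a necessary condition checked on a sample rather than a proof. It checks whether a pair $(y, y^*)$ outside the graph is monotonically related to every sampled graph point.

```python
def monotonically_related(graph, candidate):
    """Check (x* - y*)(x - y) >= 0 for every sampled pair (x, x*) of the graph."""
    y, y_star = candidate
    return all((x_star - y_star) * (x - y) >= 0 for x, x_star in graph)

xs = [k / 50.0 for k in range(-100, 101)]          # sample points in [-2, 2]
step = [(x, 0.0 if x < 0 else 1.0) for x in xs]    # increasing, with a jump at 0
identity = [(x, x) for x in xs]                    # increasing and continuous

candidate = (0.0, 0.5)   # a point inside the jump of the step function

print("related to step graph:    ", monotonically_related(step, candidate))
print("related to identity graph:", monotonically_related(identity, candidate))
print("candidate in step graph:  ", candidate in step)
```

For the step function the inequality can also be verified by hand for all $x \in \mathbb{R}$, so its graph admits the proper monotone extension obtained by adding $(0, 1/2)$ and the map is not maximal monotone; for the identity the same candidate violates the inequality at, for example, $x = 1/4$.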
EXAMPLE 3.2.5 An increasing continuous function $f : \mathbb{R} \longrightarrow \mathbb{R}$ is maximal monotone. However, an increasing discontinuous function $f : \mathbb{R} \longrightarrow \mathbb{R}$ is monotone but not maximal monotone, since it admits a monotone extension by filling in the jumps at the discontinuity points. This example underlines the necessity of multivalued operators in the study of maximal monotonicity.
The next result is an immediate consequence of Definition 3.2.3.
PROPOSITION 3.2.6 A map $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is maximal monotone if and only if $A^{-1} : X^* \supseteq D(A^{-1}) \longrightarrow 2^X$ is maximal monotone.
PROPOSITION 3.2.7 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is a maximal monotone map, then for every $x \in D(A)$, the set $A(x)$ is nonempty, convex and closed.
PROOF Since $x \in D(A)$, $A(x) \neq \emptyset$. Let $x^*, y^* \in A(x)$. Set
\[ u^*_\lambda \stackrel{df}{=} \lambda x^* + (1 - \lambda) y^* \quad \forall\, \lambda \in [0, 1]. \]
For all $(z, z^*) \in \mathrm{Gr}\, A$, we have
\[ \langle u^*_\lambda - z^*, x - z \rangle_X = \lambda \langle x^* - z^*, x - z \rangle_X + (1 - \lambda) \langle y^* - z^*, x - z \rangle_X \geq 0, \]
hence $u^*_\lambda \in A(x)$ (see Definition 3.2.3). Therefore $A(x)$ is convex. Also suppose that $\{x^*_n\}_{n \geq 1} \subseteq A(x)$ is a sequence, such that
\[ x^*_n \longrightarrow x^* \quad \text{in } X^*. \]
We have
\[ \langle x^*_n - z^*, x - z \rangle_X \geq 0 \quad \forall\, n \geq 1,\ (z, z^*) \in \mathrm{Gr}\, A. \]
In the limit as $n \to +\infty$, we have $\langle x^* - z^*, x - z \rangle_X \geq 0$, hence $(x, x^*) \in \mathrm{Gr}\, A$, i.e., $A(x)$ is closed in $X^*$.
A fundamental property of monotone maps is local boundedness.
DEFINITION 3.2.8 A monotone map $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is said to be locally bounded at $x \in D(A)$, if there exist $M > 0$ and $r > 0$, such that
\[ \|y^*\|_{X^*} \leq M \quad \forall\, y \in D(A) \cap \overline{B}_r(x),\ y^* \in A(y). \]
DEFINITION 3.2.9 If $C \subseteq X$ is a nonempty set, a point $x \in C$ is an absorbing point of $C$, if the set $C - x$ is absorbing, i.e.,
\[ X = \bigcup_{\lambda > 0} \lambda (C - x). \]
REMARK 3.2.10 If $\mathrm{int}\, C \neq \emptyset$, then any $x \in \mathrm{int}\, C$ is an absorbing point of $C$. If $C = \partial B_1 \cup \{0\}$, then $0$ is an absorbing point although $\mathrm{int}\, C = \emptyset$.
PROPOSITION 3.2.11 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is monotone and $x \in D(A)$ is an absorbing point of $D(A)$, then $A$ is locally bounded at $x$.
PROOF Without any loss of generality we may assume that $x = 0$ and $0 \in A(0)$ (i.e., $(0, 0) \in \mathrm{Gr}\, A$). Indeed if this is not the case, we choose $x^* \in A(x)$ and consider the map
\[ A_1(y) \stackrel{df}{=} A(y + x) - x^*. \]
Evidently $A_1$ is still monotone, $(0, 0) \in \mathrm{Gr}\, A_1$ and $D(A_1) = D(A) - x$. So we can replace $A$ with $A_1$. Therefore we need to show that $A$ is locally bounded at $0$. For every $u \in X$, we define
\[ \varphi(u) \stackrel{df}{=} \sup_{\substack{y \in D(A),\ \|y\|_X \leq 1 \\ y^* \in A(y)}} \langle y^*, u - y \rangle_X. \]
Clearly $\varphi$ is the supremum of affine continuous functions, hence $\varphi$ is convex and lower semicontinuous and because $(0, 0) \in \mathrm{Gr}\, A$, we have $\varphi \geq 0$. The set
\[ C \stackrel{df}{=} \big\{ u \in X : \varphi(u) \leq 1 \big\} \]
is closed and convex. We claim that $0 \in C$. Indeed because $(0, 0) \in \mathrm{Gr}\, A$, we have
\[ 0 \leq \langle y^*, y \rangle_X \quad \forall\, (y, y^*) \in \mathrm{Gr}\, A \]
and so $\varphi(0) \leq 0$. Let
\[ E \stackrel{df}{=} C \cap (-C). \]
This is a closed, convex and symmetric set. We claim that it is absorbing too. Let $u \in X$. Since by hypothesis $D(A)$ is absorbing, we can find $\lambda > 0$, such that $\lambda u \in D(A)$, i.e., $A(\lambda u) \neq \emptyset$. Choose $v^* \in A(\lambda u)$. If $(y, y^*) \in \mathrm{Gr}\, A$, from the monotonicity of $A$, we have
\[ \langle y^*, \lambda u - y \rangle_X \leq \langle v^*, \lambda u - y \rangle_X, \]
so
\[ \varphi(\lambda u) \leq \sup_{y \in D(A),\ \|y\|_X \leq 1} \langle v^*, \lambda u - y \rangle_X \leq \langle v^*, \lambda u \rangle_X + \|v^*\|_{X^*} < +\infty. \]
Choose $t \in (0, 1)$, such that $t\varphi(\lambda u) < 1$. Because $\varphi$ is convex, we have
\[ \varphi(t\lambda u) \leq t\varphi(\lambda u) + (1 - t)\varphi(0) = t\varphi(\lambda u) < 1, \]
so $t\lambda u \in C$. This shows that $C$ is absorbing, hence $E$ is absorbing too. Thus $E$ is a neighbourhood of the origin and so we can find $\delta > 0$, such that
\[ \varphi(u) \leq 1 \quad \forall\, \|u\|_X \leq 2\delta. \]
This means that
\[ \langle y^*, u \rangle_X \leq 1 + \langle y^*, y \rangle_X \quad \forall\, y \in D(A),\ \|y\|_X \leq 1,\ y^* \in A(y),\ \|u\|_X \leq 2\delta. \]
Therefore, if $y \in D(A) \cap \overline{B}_\delta$ and $y^* \in A(y)$, we have
\[ 2\delta\,\|y^*\|_{X^*} = \sup_{\|u\|_X \leq 2\delta} \langle y^*, u \rangle_X \leq 1 + \|y^*\|_{X^*} \|y\|_X \leq 1 + \delta\,\|y^*\|_{X^*}, \]
so $\|y^*\|_{X^*} \leq \frac{1}{\delta}$.
Using this result, we can determine the continuity properties of maximal monotone maps. First let us recall the following notion from multivalued analysis.
DEFINITION 3.2.12 Let $Y, Z$ be two Hausdorff topological spaces. A multifunction (set-valued map) $F : Y \longrightarrow 2^Z \setminus \{\emptyset\}$ is said to be upper semicontinuous, if for any closed set $C \subseteq Z$, the set
\[ F^-(C) \stackrel{df}{=} \big\{ y \in Y : F(y) \cap C \neq \emptyset \big\} \]
is closed.
REMARK 3.2.13 It is easy to check that the above definition is equivalent to saying that for any open set $U \subseteq Z$, the set
\[ F^+(U) \stackrel{df}{=} \big\{ y \in Y : F(y) \subseteq U \big\} \]
is open. Moreover, if for all $y \in Y$, the set $F(y) \subseteq Z$ is closed and $Z$ is regular, then upper semicontinuity of $F$ implies that $\mathrm{Gr}\, F \subseteq Y \times Z$ is closed (see Hu & Papageorgiou (1997, p. 41)). The converse is true if $F$ is locally compact, i.e., for every $y \in Y$, we can find a neighbourhood $U$ of $y$, such that $\overline{F(U)}$ is compact in $Z$. Finally note that if $F$ is single-valued, then the notion of upper semicontinuity coincides with that of continuity.
PROPOSITION 3.2.14 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is a maximal monotone map and $\mathrm{int}\, D(A) \neq \emptyset$, then $A|_{\mathrm{int}\, D(A)}$ is upper semicontinuous from $X$ with the norm topology into $X^*$ with the weak topology.
PROOF Let $C \subseteq X^*$ be a weakly closed set. We need to show that the set
\[ \big(A|_{\mathrm{int}\, D(A)}\big)^-(C) = \big\{ x \in \mathrm{int}\, D(A) : A(x) \cap C \neq \emptyset \big\} \]
is closed in $\mathrm{int}\, D(A)$. To this end let
\[ \{x_n\}_{n \geq 1} \subseteq \big(A|_{\mathrm{int}\, D(A)}\big)^-(C) \]
be a sequence, such that
\[ x_n \longrightarrow x \quad \text{in } X, \]
for some $x \in \mathrm{int}\, D(A)$. Let
\[ x^*_n \in A(x_n) \cap C \quad \forall\, n \geq 1. \]
Then Proposition 3.2.11 implies that the sequence $\{x^*_n\}_{n \geq 1} \subseteq X^*$ is bounded. By virtue of the reflexivity of $X^*$ and the Eberlein-Smulian Theorem (see Theorem A.3.8), we may assume that
\[ x^*_n \stackrel{w}{\longrightarrow} x^* \quad \text{in } X^*. \]
Clearly $x^* \in C$. Also, we have
\[ \langle x^*_n - y^*, x_n - y \rangle_X \geq 0 \quad \forall\, n \geq 1,\ (y, y^*) \in \mathrm{Gr}\, A, \]
so
\[ \lim_{n \to +\infty} \langle x^*_n - y^*, x_n - y \rangle_X = \langle x^* - y^*, x - y \rangle_X \geq 0. \]
Because $A$ is maximal monotone, we infer that $x^* \in A(x)$. Therefore $x \in \big(A|_{\mathrm{int}\, D(A)}\big)^-(C)$ and this proves the claimed upper semicontinuity of $A|_{\mathrm{int}\, D(A)}$.
A careful reading of the previous proof reveals that the following is also true.
PROPOSITION 3.2.15 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is a maximal monotone map, then $\mathrm{Gr}\, A \subseteq X \times X^*_w$ and $\mathrm{Gr}\, A \subseteq X_w \times X^*$ are closed sets (here $Z_w$ denotes the space $Z$ furnished with the weak topology).
DEFINITION 3.2.16 Let $Y, Z$ be two Banach spaces and let $V : Y \longrightarrow 2^Z \setminus \{\emptyset\}$ be a multifunction.
(a) We say that $V$ is demicontinuous, if it is upper semicontinuous from $Y$ with the norm topology into $Z$ with the weak topology.
(b) We say that $V$ is hemicontinuous, if for all $x, y \in Y$, the multivalued map $\lambda \longmapsto V\big(\lambda x + (1 - \lambda) y\big)$ is upper semicontinuous from $[0, 1]$ into $Z$ with the weak topology.
(c) We say that $V$ is bounded, if it maps bounded sets in $Y$ into bounded sets in $Z$.
REMARK 3.2.17 Evidently demicontinuity implies hemicontinuity. For monotone maps $A : X \longrightarrow 2^{X^*}$ with $D(A) = X$, the converse is also true.
PROPOSITION 3.2.18 If $A : X \longrightarrow 2^{X^*}$ is a monotone hemicontinuous map with $D(A) = X$, then $A$ is demicontinuous.
PROOF If $C \subseteq X^*$ is $w$-closed, we need to show that the set
\[ A^-(C) = \big\{ x \in X : A(x) \cap C \neq \emptyset \big\} \]
is norm closed in $X$. To this end let $\{x_n\}_{n \geq 1} \subseteq A^-(C)$ be a sequence, such that
\[ x_n \longrightarrow x \quad \text{in } X. \]
Let
\[ x^*_n \in A(x_n) \cap C \quad \forall\, n \geq 1. \]
Then Proposition 3.2.11 implies that the sequence $\{x^*_n\}_{n \geq 1} \subseteq X^*$ is bounded and so we may assume that
\[ x^*_n \stackrel{w}{\longrightarrow} x^* \quad \text{in } X^*. \]
Set
\[ y_\lambda \stackrel{df}{=} x + \lambda y, \]
and let
\[ y^*_\lambda \in A(y_\lambda) \quad \forall\, \lambda > 0,\ y \in X. \]
From the monotonicity of $A$, we have
\[ \langle x^*_n - y^*_\lambda, x_n - x \rangle_X - \lambda \langle x^*_n - y^*_\lambda, y \rangle_X \geq 0 \quad \forall\, n \geq 1, \]
so
\[ \langle x^*_n - y^*_\lambda, y \rangle_X \leq \frac{1}{\lambda} \langle x^*_n - y^*_\lambda, x_n - x \rangle_X \quad \forall\, n \geq 1. \]
Passing to the limit as $n \to +\infty$, we obtain
\[ \langle x^* - y^*_\lambda, y \rangle_X \leq 0. \]
Next let $\lambda \searrow 0$. Due to the hemicontinuity of $A$, we may say that
\[ y^*_\lambda \stackrel{w}{\longrightarrow} y^* \quad \text{in } X^*, \]
for some $y^* \in A(x)$. So we obtain that $\langle x^* - y^*, y \rangle_X \leq 0$. Because $y \in X$ was arbitrary, it follows that $x^* = y^* \in A(x)$. Therefore $x \in A^-(C)$ and we have proved the demicontinuity of $A$.
Next we give a sufficient condition for maximality of a monotone map.
PROPOSITION 3.2.19 If $A : X \longrightarrow 2^{X^*}$ is a monotone map with $D(A) = X$, which is hemicontinuous and for every $x \in X$, the set $A(x) \subseteq X^*$ is closed and convex, then $A$ is maximal monotone.
PROOF Suppose that $\widehat{A}$ is a monotone extension of $A$ and $x^*_0 \in \widehat{A}(x_0)$. We need to show that $x^*_0 \in A(x_0)$. If this is not true, then from the strong separation theorem (see Theorem A.3.2), we can find $u \in X \setminus \{0\}$, such that
\[ \langle x^*, u \rangle_X < \langle x^*_0, u \rangle_X \quad \forall\, x^* \in A(x_0). \tag{3.8} \]
Let $\lambda > 0$ and $x_\lambda = x_0 + \lambda u$. By virtue of the monotonicity of $\widehat{A}$, we have
\[ \lambda \langle x^*_\lambda - x^*_0, u \rangle_X \geq 0 \quad \forall\, x^*_\lambda \in A(x_\lambda), \]
so
\[ \langle x^*_\lambda - x^*_0, u \rangle_X \geq 0 \quad \forall\, x^*_\lambda \in A(x_\lambda). \tag{3.9} \]
Because of the hemicontinuity of $A$, we can say that
\[ x^*_\lambda \stackrel{w}{\longrightarrow} x^* \quad \text{in } X^* \quad \text{as } \lambda \searrow 0, \]
for some $x^* \in A(x_0)$. So from (3.9), we have that $\langle x^* - x^*_0, u \rangle_X \geq 0$, which contradicts (3.8).
At this point let us give some standard examples of maximal monotone maps.
EXAMPLE 3.2.20 (a) Let $H$ be a Hilbert space and let $C \subseteq H$ be a closed, convex set. It is well known that for each $x \in H$, there exists a unique element in $C$, denoted by $\mathrm{proj}_C(x)$, such that
\[ \|x - \mathrm{proj}_C(x)\|_H = \inf_{c \in C} \|x - c\|_H \]
(best approximation of $x$ in $C$). The map $\mathrm{proj}_C : H \longrightarrow C$ is known as the metric projection on $C$. Then $\mathrm{proj}_C$ is a maximal monotone map. Indeed, recalling that
\[ \langle x - \mathrm{proj}_C(x), c - \mathrm{proj}_C(x) \rangle_H \leq 0 \quad \forall\, c \in C, \]
we can easily check that
\[ \|\mathrm{proj}_C(x) - \mathrm{proj}_C(y)\|_H^2 \leq \langle x - y, \mathrm{proj}_C(x) - \mathrm{proj}_C(y) \rangle_H \]
and so
\[ \|\mathrm{proj}_C(x) - \mathrm{proj}_C(y)\|_H \leq \|x - y\|_H, \]
i.e., the map $x \longmapsto \mathrm{proj}_C(x)$ is nonexpansive and it is monotone. So by Proposition 3.2.19, the map $x \longmapsto \mathrm{proj}_C(x)$ is maximal monotone.
(b) If $H$ is a Hilbert space and $A : H \longrightarrow H$ is nonexpansive (i.e., Lipschitz continuous with Lipschitz constant equal to 1), then it is easy to check that $\mathrm{id}_H + A$ is maximal monotone.
(c) Let $X$ be a reflexive Banach space and $\varphi : X \longrightarrow \overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$ a proper (i.e., not identically $+\infty$), convex, lower semicontinuous function. The subdifferential of $\varphi$ is defined by
\[ \partial\varphi(x) \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, y - x \rangle_X \leq \varphi(y) - \varphi(x) \quad \forall\, y \in X \big\}. \]
Then $\partial\varphi : X \longrightarrow 2^{X^*}$ is a maximal monotone map. We shall prove this and more in Section 4.3, where we conduct a detailed study of the convex subdifferential. For the moment, we keep in mind the maximality of the subdifferential map, in order to better understand the next example.
(d) Let $X$ be a reflexive Banach space and consider the map $\mathcal{F} : X \longrightarrow 2^{X^*}$ defined by
\[ \mathcal{F}(x) \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, x \rangle_X = \|x\|_X^2 = \|x^*\|_{X^*}^2 \big\}. \]
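According to the Hahn-Banach Theorem, we see that $\mathcal{F}(x) \neq \emptyset$ for all $x \in X$. Moreover, we have that
\[ \mathcal{F}(x) = \partial\varphi(x), \quad \text{where } \varphi(x) = \frac{1}{2}\|x\|_X^2. \]
Indeed, if $x^* \in \mathcal{F}(x)$, then
\[ \langle x^*, y - x \rangle_X \leq \|x^*\|_{X^*}\|y\|_X - \|x\|_X^2 \leq \frac{1}{2}\big(\|x\|_X^2 + \|y\|_X^2\big) - \|x\|_X^2 = \varphi(y) - \varphi(x), \]
so $x^* \in \partial\varphi(x)$. Conversely, let $x^* \in \partial\varphi(x)$ and let $\psi(x) \stackrel{df}{=} \|x\|_X$ for all $x \in X$. By $\varphi'(x; \cdot)$ and $\psi'(x; \cdot)$ we denote the directional derivatives at $x \in X$ of the convex functions $\varphi$ and $\psi$ respectively, i.e., for all $h \in X$, we have
\[ \varphi'(x; h) \stackrel{df}{=} \lim_{\lambda \searrow 0} \frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda}, \qquad \psi'(x; h) \stackrel{df}{=} \lim_{\lambda \searrow 0} \frac{\psi(x + \lambda h) - \psi(x)}{\lambda}. \]
The limits exist since the difference quotients decrease as $\lambda \searrow 0$, because of the convexity of the functions. Then we have
\[ \psi'(x; h)\|x\|_X = \lim_{\lambda \searrow 0} \frac{\|x + \lambda h\|_X \|x\|_X - \|x\|_X^2}{\lambda} \leq \lim_{\lambda \searrow 0} \frac{1}{2}\, \frac{\|x + \lambda h\|_X^2 - \|x\|_X^2}{\lambda} = \varphi'(x; h) \tag{3.10} \]
and
\[ \varphi'(x; h) = \lim_{\lambda \searrow 0} \frac{1}{2}\, \frac{\|x + \lambda h\|_X^2 - \|x\|_X^2}{\lambda} = \lim_{\lambda \searrow 0} \frac{1}{2} \Big[ \frac{\|x + \lambda h\|_X - \|x\|_X}{\lambda} \big(\|x + \lambda h\|_X + \|x\|_X\big) \Big] = \psi'(x; h)\|x\|_X. \tag{3.11} \]
From (3.10) and (3.11), we infer that $\varphi'(x; h) = \psi'(x; h)\|x\|_X$. Clearly from the definition of $\partial\varphi(x)$, we see that $x^* \in \partial\varphi(x)$ if and only if $\langle x^*, h \rangle_X \leq \varphi'(x; h) = \psi'(x; h)\|x\|_X$. So, we have
\[ \Big\langle \frac{x^*}{\|x\|_X}, h \Big\rangle_X \leq \psi'(x; h) \leq \psi(x + h) - \psi(x) \leq \|h\|_X \quad \forall\, h \in X, \]
so
\[ \|x^*\|_{X^*} \leq \|x\|_X. \tag{3.12} \]
On the other hand since
\[ \Big\langle \frac{x^*}{\|x\|_X}, h \Big\rangle_X \leq \psi'(x; h) \quad \forall\, h \in X, \]
we have that $\frac{x^*}{\|x\|_X} \in \partial\psi(x)$ and so
\[ \Big\langle \frac{x^*}{\|x\|_X}, y - x \Big\rangle_X \leq \psi(y) - \psi(x) \quad \forall\, y \in X. \]
Let $y = 0$. We obtain
\[ \Big\langle \frac{x^*}{\|x\|_X}, x \Big\rangle_X \geq \|x\|_X. \]
Therefore
\[ \|x^*\|_{X^*} \geq \|x\|_X. \tag{3.13} \]
From (3.12) and (3.13), it follows that $\|x^*\|_{X^*} = \|x\|_X$. Since moreover $\|x\|_X^2 \leq \langle x^*, x \rangle_X \leq \|x^*\|_{X^*}\|x\|_X = \|x\|_X^2$, we conclude that $x^* \in \mathcal{F}(x)$. Note that if $X$ is a Hilbert space, then $\mathcal{F}$ is the canonical isomorphism between $X$ and $X^*$. So if $X = H$ is a pivot Hilbert space (i.e., $H = H^*$), then $\mathcal{F}$ is the identity operator.
REMARK 3.2.21 The duality map introduced in Example 3.2.20(d) can actually be defined on any Banach space (not necessarily reflexive) and is essentially dependent on the norm of the space. More precisely, if $\|\cdot\|_1$ and $\|\cdot\|_2$ are two equivalent norms on $X$ and $\mathcal{F}_1$ and $\mathcal{F}_2$ the corresponding duality maps, then we need not have $\mathcal{F}_1 = \mathcal{F}_2$. In fact in the proposition that follows, we show that the geometry of $X$ and $X^*$ is closely related to the properties of the duality map $\mathcal{F}$.
PROPOSITION 3.2.22 If $X$ is a reflexive Banach space with $X^*$ strictly convex, then the duality map $\mathcal{F} : X \longrightarrow X^*$ is single-valued, odd, demicontinuous, maximal monotone, coercive and bounded.
PROOF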
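Let $x^*_1, x^*_2 \in \mathcal{F}(x)$. Then we have
\[ \langle x^*_k, x \rangle_X = \|x\|_X^2 = \|x^*_k\|_{X^*}^2 \quad \text{for } k \in \{1, 2\}. \]
So, we have
\[ 2\|x^*_1\|_{X^*}\|x\|_X \leq \|x^*_1\|_{X^*}^2 + \|x^*_2\|_{X^*}^2 = \langle x^*_1 + x^*_2, x \rangle_X \leq \|x^*_1 + x^*_2\|_{X^*}\|x\|_X, \]
thus
\[ \|x^*_1\|_{X^*} \leq \frac{1}{2}\|x^*_1 + x^*_2\|_{X^*} \]
and so $x^*_1 = x^*_2$ due to the strict convexity of $X^*$. Clearly $\mathcal{F}(-x) = -\mathcal{F}(x)$, i.e., $\mathcal{F}$ is odd. To show the demicontinuity of $\mathcal{F}$, suppose that $\{x_n\}_{n \geq 1} \subseteq X$ is a sequence, such that $x_n \longrightarrow x$ in $X$, for some $x \in X$. Then
\[ \|\mathcal{F}(x_n)\|_{X^*} = \|x_n\|_X \longrightarrow \|x\|_X \]
and so the sequence $\{\mathcal{F}(x_n)\}_{n \geq 1} \subseteq X^*$ is bounded. Because of the reflexivity of $X^*$, we may assume that
\[ \mathcal{F}(x_n) \stackrel{w}{\longrightarrow} x^* \quad \text{in } X^*. \]
We have
\[ \langle x^*, h \rangle_X = \lim_{n \to +\infty} \langle \mathcal{F}(x_n), h \rangle_X \leq \lim_{n \to +\infty} \|x_n\|_X \|h\|_X = \|x\|_X \|h\|_X \quad \forall\, h \in X \tag{3.14} \]
and
\[ \langle x^*, x \rangle_X = \lim_{n \to +\infty} \langle \mathcal{F}(x_n), x_n \rangle_X = \lim_{n \to +\infty} \|x_n\|_X^2 = \|x\|_X^2. \tag{3.15} \]
From (3.14), we have
\[ \|x^*\|_{X^*} \leq \|x\|_X \]
and from (3.15), we have $\|x^*\|_{X^*} \geq \|x\|_X$. Therefore
\[ \|x^*\|_{X^*} = \|x\|_X \]
and so $x^* \in \mathcal{F}(x)$, which proves the demicontinuity of $\mathcal{F}$. Maximal monotonicity follows from Examples 3.2.20(c) and (d). A more direct proof is the following. Let $x, y \in X$. Then we have
\[ \langle \mathcal{F}(x) - \mathcal{F}(y), x - y \rangle_X \geq \|x\|_X^2 + \|y\|_X^2 - 2\|x\|_X \|y\|_X = \big(\|x\|_X - \|y\|_X\big)^2 \geq 0, \tag{3.16} \]
so $\mathcal{F}$ is monotone. Invoking Proposition 3.2.19, we conclude that $\mathcal{F}$ is maximal monotone. Also we have
\[ \langle \mathcal{F}(x), x \rangle_X = \|x\|_X^2, \]
i.e., $\mathcal{F}$ is coercive. Finally it is clear that $\mathcal{F}$ is bounded.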
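For a concrete feel of Proposition 3.2.22, one may take $X = (\mathbb{R}^n, \|\cdot\|_p)$ with $1 < p < +\infty$, whose dual norm is the $q$-norm with $1/p + 1/q = 1$. The sketch below is an illustration under these assumptions, with hypothetical test vectors; it uses the standard closed form $\mathcal{F}(x)_i = \|x\|_p^{2-p} |x_i|^{p-1}\,\mathrm{sign}(x_i)$ (not derived in the text) and checks the defining identities of $\mathcal{F}$, oddness and inequality (3.16).

```python
def p_norm(x, p):
    """The p-norm of a vector of R^n."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def duality_map(x, p):
    """Duality map of (R^n, ||.||_p) for 1 < p < infinity (standard closed form)."""
    norm = p_norm(x, p)
    if norm == 0.0:
        return tuple(0.0 for _ in x)
    scale = norm ** (2.0 - p)
    return tuple(scale * abs(t) ** (p - 1.0) * (1.0 if t >= 0 else -1.0) for t in x)

def pairing(x_star, x):
    """Duality pairing <x*, x> on R^n."""
    return sum(s * t for s, t in zip(x_star, x))

p = 3.0
q = p / (p - 1.0)                   # conjugate exponent
x, y = (1.0, -2.0), (0.5, 3.0)      # hypothetical test vectors
Fx, Fy = duality_map(x, p), duality_map(y, p)

# Defining identities: <F(x), x> = ||x||^2 = ||F(x)||_*^2.
print(pairing(Fx, x), p_norm(x, p) ** 2, p_norm(Fx, q) ** 2)

# Inequality (3.16): <F(x) - F(y), x - y> >= (||x|| - ||y||)^2.
diff_star = tuple(s - t for s, t in zip(Fx, Fy))
diff = tuple(s - t for s, t in zip(x, y))
lhs = pairing(diff_star, diff)
rhs = (p_norm(x, p) - p_norm(y, p)) ** 2
print(lhs, ">=", rhs)
```

Changing `p` changes the map, which illustrates the dependence on the norm mentioned in Remark 3.2.21; for `p = 2.0` the map reduces to the identity, as in the Hilbert space case.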
REMARK 3.2.23 As we shall see later in this section (see Corollary 3.2.31), the maximal monotonicity and coercivity of $\mathcal{F}$ imply that $\mathcal{F}$ is surjective. Also we have
\[ \varphi'(x; h) = \langle \mathcal{F}(x), h \rangle_X \quad \forall\, h \in X, \]
which means that $\varphi$ is Gâteaux differentiable at every $x \in X$ and $\varphi'(x) = \mathcal{F}(x)$. Moreover, from Example 3.2.20(d), we see that the map $x \longmapsto \psi(x) = \|x\|_X$ is Gâteaux differentiable at every $x \neq 0$ and
\[ \psi'(x) = \frac{\mathcal{F}(x)}{\|x\|_X}. \]
It is a result of Banach space theory that the reflexive Banach space $X$ is smooth (i.e., its norm is Gâteaux differentiable at every $x \neq 0$) if and only if $X^*$ is strictly convex. Similarly the reflexive Banach space $X$ is strictly convex if and only if $X^*$ is smooth (see Day (1973, p. 144)).
PROPOSITION 3.2.24 If $X$ is a reflexive Banach space and both $X$ and $X^*$ are strictly convex, then the duality map $\mathcal{F} : X \longrightarrow X^*$ is strictly monotone and bijective and $\mathcal{F}^{-1}$ is the duality map of $X^*$.
PROOF Suppose that
\[ \langle \mathcal{F}(x) - \mathcal{F}(y), x - y \rangle_X = 0. \]
From (3.16), we have
\[ 0 = \Big\langle \mathcal{F}(x) - \mathcal{F}\Big(\frac{x + y}{2}\Big), \frac{x - y}{2} \Big\rangle_X + \Big\langle \mathcal{F}\Big(\frac{x + y}{2}\Big) - \mathcal{F}(y), \frac{x - y}{2} \Big\rangle_X \geq \Big( \|x\|_X - \Big\|\frac{x + y}{2}\Big\|_X \Big)^2 + \Big( \Big\|\frac{x + y}{2}\Big\|_X - \|y\|_X \Big)^2, \]
so
\[ \|x\|_X = \Big\|\frac{x + y}{2}\Big\|_X = \|y\|_X. \]
Because $X$ is strictly convex, it follows that $x = y$. So $\mathcal{F}$ is strictly monotone. Hence it is injective (see also Proposition 3.2.22). Moreover, we know that $\mathcal{F}$ is surjective (see Remark 3.2.23). Therefore $\mathcal{F}$ is bijective. Finally, it is clear that $\mathcal{F}^{-1} : X^* \longrightarrow X$ is the duality map of $X^*$.
PROPOSITION 3.2.25 If $X$ is a reflexive Banach space and $X^*$ is a locally uniformly convex space (see Definition A.3.21), then the duality map $\mathcal{F} : X \longrightarrow X^*$ is continuous.
PROOF Let $\{x_n\}_{n \geq 1} \subseteq X$ be a sequence, such that
\[ x_n \longrightarrow x \quad \text{in } X, \]
for some $x \in X$. Then
\[ \|\mathcal{F}(x_n)\|_{X^*} \longrightarrow \|\mathcal{F}(x)\|_{X^*}. \]
Moreover, because $\mathcal{F}$ is demicontinuous (see Proposition 3.2.22), we have
\[ \mathcal{F}(x_n) \stackrel{w}{\longrightarrow} \mathcal{F}(x) \quad \text{in } X^*. \]
Since $X^*$ is locally uniformly convex, it has the Kadec-Klee property (see Remark A.3.22) and so
\[ \mathcal{F}(x_n) \longrightarrow \mathcal{F}(x) \quad \text{in } X^*. \]
Therefore $\mathcal{F}$ is continuous.
REMARK 3.2.26 Under the hypotheses of Proposition 3.2.25, the map $x \longmapsto \psi(x) = \|x\|_X$ is Fréchet differentiable at every $x \neq 0$. Indeed, from Remark 3.2.23, we know that the map $x \longmapsto \psi(x)$ is Gâteaux differentiable at every $x \neq 0$. Also Proposition 3.2.25 says that the map
\[ x \longmapsto \psi'(x) = \frac{\mathcal{F}(x)}{\|x\|_X} \]
is continuous on $X \setminus \{0\}$. So $\psi$ is Fréchet differentiable on $X \setminus \{0\}$.
Combining Propositions 3.2.24 and 3.2.25, we obtain the following proposition.
PROPOSITION 3.2.27 If $X$ is a reflexive Banach space and both $X$ and $X^*$ are locally uniformly convex (see Definition A.3.21), then the duality map $\mathcal{F} : X \longrightarrow X^*$ is a homeomorphism.
PROPOSITION 3.2.28 If $X$ is a reflexive Banach space and $X^*$ is uniformly convex, then the duality map $\mathcal{F} : X \longrightarrow X^*$ is uniformly continuous on bounded subsets of $X$.
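PROOF First we show that $\mathcal{F}$ is uniformly continuous on
\[ \partial B_1 \stackrel{df}{=} \big\{ x \in X : \|x\|_X = 1 \big\}. \]
If this is not the case, then we can find $\varepsilon > 0$ and two sequences $\{x_n\}_{n \geq 1}, \{y_n\}_{n \geq 1} \subseteq \partial B_1$, such that
\[ \|x_n - y_n\|_X \longrightarrow 0 \quad \text{and} \quad \|\mathcal{F}(x_n) - \mathcal{F}(y_n)\|_{X^*} \geq \varepsilon \quad \forall\, n \geq 1. \]
We have
\[ \|\mathcal{F}(x) + \mathcal{F}(y)\|_{X^*}\|x\|_X \geq \langle \mathcal{F}(x) + \mathcal{F}(y), x \rangle_X = \langle \mathcal{F}(x), x \rangle_X + \langle \mathcal{F}(y), y \rangle_X + \langle \mathcal{F}(y), x - y \rangle_X \geq \|x\|_X^2 + \|y\|_X^2 - \|y\|_X \|x - y\|_X \quad \forall\, x, y \in X. \]
Putting $x = x_n$, $y = y_n$ for all $n \geq 1$, we obtain
\[ \frac{1}{2}\|\mathcal{F}(x_n) + \mathcal{F}(y_n)\|_{X^*} \geq 1 - \frac{1}{2}\|x_n - y_n\|_X, \]
which contradicts the uniform convexity of $X^*$. Recall that
\[ \mathcal{F}(\lambda u) = \lambda \mathcal{F}(u) \quad \forall\, \lambda > 0,\ u \in X. \]
For $x, y \in X \setminus \{0\}$, we have
\[ \|\mathcal{F}(x) - \mathcal{F}(y)\|_{X^*} = \Big\| \|x\|_X \mathcal{F}\Big(\frac{x}{\|x\|_X}\Big) - \|y\|_X \mathcal{F}\Big(\frac{y}{\|y\|_X}\Big) \Big\|_{X^*} \leq \|x\|_X \Big\| \mathcal{F}\Big(\frac{x}{\|x\|_X}\Big) - \mathcal{F}\Big(\frac{y}{\|y\|_X}\Big) \Big\|_{X^*} + \|x - y\|_X \Big\| \mathcal{F}\Big(\frac{y}{\|y\|_X}\Big) \Big\|_{X^*}. \]
From the uniform continuity of $\mathcal{F}$ on $\partial B_1$, it follows that $\mathcal{F}$ is uniformly continuous on bounded sets located outside some neighbourhood of the origin. Since $\mathcal{F}$ is continuous at $x = 0$ and $\mathcal{F}(0) = 0$, we conclude that $\mathcal{F}$ is uniformly continuous on bounded sets.
Using the duality map, we can have a necessary and sufficient condition for the maximality of a monotone operator $A$.
THEOREM 3.2.29 If both $X$ and $X^*$ are strictly convex and $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is a monotone map, then $A$ is maximal monotone if and only if $R(A + \lambda\mathcal{F}) = X^*$ for all $\lambda > 0$ (equivalently for some $\lambda > 0$).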
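The range condition of Theorem 3.2.29 is easy to observe on $X = \mathbb{R}$, where $\mathcal{F}$ is the identity. The sketch below is a hypothetical illustration of ours: it takes $A = \arctan$, which is continuous and increasing (hence maximal monotone by Example 3.2.5) but not surjective, and solves $A(x) + \lambda x = y$ by bisection for several right-hand sides, including values outside the range of $A$. The bracketing interval and the iteration count are arbitrary choices.

```python
import math

def solve_range_condition(A, lam, y, lo=-1e6, hi=1e6, iterations=200):
    """Solve A(x) + lam * x = y by bisection, for A increasing and continuous on R.

    Assumes the solution lies in [lo, hi], which holds here because
    A(x) + lam * x - y changes sign between the two end points.
    """
    def g(x):
        return A(x) + lam * x - y
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam = 0.1
targets = [-5.0, -1.0, 0.0, 2.0, 7.5]   # 2.0 and 7.5 exceed sup arctan = pi / 2
solutions = [solve_range_condition(math.atan, lam, y) for y in targets]
residuals = [abs(math.atan(x) + lam * x - y) for x, y in zip(solutions, targets)]
print("solutions:", solutions)
print("largest residual:", max(residuals))
```

Every target is reached by $A + \lambda\,\mathrm{id}$ although $R(A) = (-\pi/2, \pi/2)$ is a proper subset of $\mathbb{R}$, which shows that the perturbation by $\lambda\mathcal{F}$ is what produces surjectivity.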
318
Nonlinear Analysis
Theorem 3.2.29 is also a surjectivity result. One of the reasons that maximal monotone operators are important in applications is their remarkable surjectivity properties. We start with a necessary and sufficient condition for surjectivity.

THEOREM 3.2.30 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is a maximal monotone map, then $R(A) = X^*$ if and only if $A^{-1}$ is locally bounded.

PROOF "$\Longrightarrow$": Since $A$ is maximal monotone, so is $A^{-1}$. Because
\[ D(A^{-1}) = R(A) = X^*, \]
from Proposition 3.2.11, we have that $A^{-1}$ is locally bounded.

"$\Longleftarrow$": To show that $R(A) = X^*$, it suffices to show that $R(A)$ is both closed and open in $X^*$. First we show that $R(A)$ is closed. To this end let $\{x_n^*\}_{n \ge 1} \subseteq R(A)$ and suppose that
\[ x_n^* \longrightarrow x^* \quad \text{in } X^*. \]
We have $x_n^* \in A(x_n)$ for some $x_n \in D(A)$ and from the monotonicity of $A$, it follows that
\[ \langle x_n^* - y^*, x_n - y \rangle_X \ge 0 \quad \forall\, n \ge 1,\ (y, y^*) \in \operatorname{Gr} A. \tag{3.17} \]
Since by hypothesis $A^{-1}$ is locally bounded, the sequence $\{x_n\}_{n \ge 1} \subseteq X$ is bounded and so by passing to a suitable subsequence if necessary, we may assume that $x_n \xrightarrow{w} x$ in $X$. Passing to the limit as $n \to +\infty$ in (3.17), we obtain
\[ \langle x^* - y^*, x - y \rangle_X \ge 0 \quad \forall\, (y, y^*) \in \operatorname{Gr} A, \]
so $x^* \in A(x)$ (since $A$ is maximal monotone). Therefore $x^* \in R(A)$ and so we have proved that $R(A)$ is closed.

Next we show that $R(A)$ is open in $X^*$. Let $x^* \in R(A)$. We have $x^* \in A(x)$ for some $x \in D(A)$. By considering if necessary
\[ \widehat{A}(y) = A(y + x) \]
(maximal monotonicity is invariant under translation), we may assume that $x = 0$. Let $r > 0$ be such that $A^{-1}|_{B_r(x^*)}$ is bounded, where
\[ B_r(x^*) := \{ y^* \in X^* : \|y^* - x^*\|_{X^*} < r \}. \]
By Troyanski's renorming theorem (see Theorem A.3.23), we may assume without any loss of generality that both $X$ and $X^*$ are locally uniformly convex. Let $y^* \in B_{r/2}(x^*)$. Then by Theorem 3.2.29, for every $\lambda > 0$, the operator equation
\[ x_\lambda^* + \lambda \mathcal{F}(x_\lambda) = y^*, \quad x_\lambda^* \in A(x_\lambda) \tag{3.18} \]
has a solution $x_\lambda$. Because $A$ is monotone, we have
\[ \langle y^* - \lambda \mathcal{F}(x_\lambda) - x^*, x_\lambda \rangle_X \ge 0 \quad \forall\, \lambda > 0 \]
(recall that $x = 0$), so
\[ \|y^* - x^*\|_{X^*} \|x_\lambda\|_X \ge \lambda \|x_\lambda\|_X^2 \]
and thus
\[ \lambda \|x_\lambda\|_X \le \|y^* - x^*\|_{X^*} < \frac{r}{2} \quad \forall\, \lambda > 0. \]
From (3.18), we see that
\[ \|y^* - x_\lambda^*\|_{X^*} = \lambda \|\mathcal{F}(x_\lambda)\|_{X^*} = \lambda \|x_\lambda\|_X < \frac{r}{2} \quad \forall\, \lambda > 0, \tag{3.19} \]
so
\[ \|x_\lambda^* - x^*\|_{X^*} < r \quad \forall\, \lambda > 0. \]
Recall that $A^{-1}|_{B_r(x^*)}$ is bounded. So we have that $\{x_\lambda\}_{\lambda > 0} \subseteq X$ is bounded. Using this in (3.19), we see that
\[ x_\lambda^* \longrightarrow y^* \quad \text{in } X^* \text{ as } \lambda \searrow 0. \]
Since from the first part of the proof of this implication we have that $R(A)$ is closed, it follows that $y^* \in R(A)$ and so $B_{r/2}(x^*) \subseteq R(A)$, which proves that $R(A)$ is open in $X^*$. Thus we conclude that $R(A) = X^*$.

COROLLARY 3.2.31 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is maximal monotone and weakly coercive, then $A$ is surjective (i.e., $R(A) = X^*$).

PROOF By Troyanski's renorming theorem (see Theorem A.3.23), we may assume that both $X$ and $X^*$ are locally uniformly convex. The weak coercivity condition is equivalent to saying that $A^{-1}$ is locally bounded. So we can apply Theorem 3.2.30 and conclude that $R(A) = X^*$.
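As a one-dimensional illustration of Corollary 3.2.31, the following sketch takes $X = \mathbb{R}$ and a continuous, nondecreasing, coercive function $A$, and locates a solution of $A(x) = y$ by bisection. The function names and the sample map $x \mapsto x^3 + x$ are hypothetical choices made only for this illustration; they do not come from the text.

```python
def solve_monotone(A, y, lo=-1.0, hi=1.0, tol=1e-12):
    """Solve A(x) = y for a continuous, nondecreasing, coercive A on R.

    Coercivity guarantees that the bracket [lo, hi] can be enlarged until
    A(lo) <= y <= A(hi); monotonicity makes bisection valid afterwards.
    """
    while A(lo) > y:      # push the left endpoint further left
        lo *= 2.0
    while A(hi) < y:      # push the right endpoint further right
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if A(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sample_map(x):
    # Hypothetical example: strictly increasing and coercive on R.
    return x ** 3 + x


if __name__ == "__main__":
    for target in (-10.0, 0.0, 3.5, 1000.0):
        x = solve_monotone(sample_map, target)
        print(target, x, sample_map(x))
```

Every target value is attained, which is the surjectivity statement in this very special setting; the general result of course needs the functional analytic argument given above.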
COROLLARY 3.2.32 If $A : X \longrightarrow 2^{X^*}$ is monotone, hemicontinuous with $D(A) = X$ and weakly coercive, then $A$ is surjective (i.e., $R(A) = X^*$).

In a finite dimensional context we can drop the monotonicity hypothesis from the above result, provided we assume coercivity. Namely, we have the following result.

PROPOSITION 3.2.33 If $X$ is a finite dimensional Banach space and $F : X \longrightarrow 2^{X^*}$ is an upper semicontinuous and coercive multifunction with nonempty, compact and convex values, then $F$ is surjective.

PROOF For every $y^* \in X^*$, the multifunction
\[ F_{y^*}(x) := F(x) - y^* \]
satisfies the same hypotheses as $F$. So it suffices to show that $0 \in R(F)$. Suppose that $0 \notin R(F)$. Then by the strong separation theorem (see Theorem A.3.2), for every $x \in X$ we can find $u(x) \in X \setminus \{0\}$, such that
\[ 0 < \inf_{x^* \in F(x)} \langle x^*, u(x) \rangle_X. \]
Since by hypothesis $F$ is coercive, given $M > 0$ we can find $r = r(M) > 0$, such that
\[ \frac{\langle x^*, x \rangle_X}{\|x\|_X} \ge M \quad \forall\, \|x\|_X \ge r,\ x^* \in F(x), \]
so
\[ \langle x^*, x \rangle_X \ge M r \quad \forall\, \|x\|_X = r,\ x^* \in F(x). \]
For such $x \in X$, we can take $u(x) = x$. Now let $x \in X \setminus \{0\}$. We define
\[ U(x) := \Big\{ y \in X : \inf_{y^* \in F(y)} \langle y^*, x \rangle_X > 0 \Big\}. \]
Because of our hypotheses on $F$, the map
\[ y \longmapsto \inf_{y^* \in F(y)} \langle y^*, x \rangle_X \]
is lower semicontinuous (see Hu & Papageorgiou (1997, p. 83)) and so the set $U(x)$ is open. From the above, we have that $\{U(x)\}_{x \in X \setminus \{0\}}$ is an open cover of $X$.

We choose an open cover $\{V_k\}_{k=1}^m$ of $\overline{B}_r := \{ x \in X : \|x\|_X \le r \}$, such that for each $k \in \{1, \ldots, m\}$, we can find $x_k \in X$, such that $V_k \subseteq U(x_k)$ and if $V_k \cap \partial B_r \neq \emptyset$, then $x_k \in V_k \cap \partial B_r$ and $\operatorname{diam} V_k$
1, such that $\vartheta_k(x) > 0$ and for each $x^* \in F(x)$, we have $\langle x^*, x_k \rangle_X > 0$ (since $x \in V_k \subseteq U(x_k)$). So for each $x \in \overline{B}_r$ and any $x^* \in F(x)$, we have
\[ \langle x^*, f(x) \rangle_X = \sum_{k=1}^m \vartheta_k(x) \langle x^*, x_k \rangle_X > 0, \]
hence
\[ f(x) \neq 0 \quad \forall\, x \in \overline{B}_r. \]
Therefore $d_B(f, B_r, 0) = 0$, where $d_B(f, B_r, 0)$ denotes the Brouwer degree of $f$ on $B_r$ with respect to $0$. On the other hand, if $x \in \partial B_r$, $f(x)$ is a convex combination of the points $\{x_k\}_{k=1}^m \subseteq \partial B_r$ and $\|x_k - x\|_X$
0, we define the resolvent operator of $A$ by
\[ J_\lambda := (\mathrm{id}_H + \lambda A)^{-1} \]
and the Yosida approximation of $A$ by
\[ A_\lambda := \frac{1}{\lambda} (\mathrm{id}_H - J_\lambda). \]

REMARK 3.2.37 By virtue of Theorem 3.2.29, we have that
\[ D(J_\lambda) = D(A_\lambda) = H \quad \forall\, \lambda > 0. \]
Moreover, it is easy to check that $J_\lambda$ is single-valued and then so is $A_\lambda$.

The next theorem summarizes the main properties of the resolvent and Yosida approximation of a maximal monotone operator $A$.

THEOREM 3.2.38 If $H$ is a pivot Hilbert space and $A : H \supseteq D(A) \longrightarrow 2^H$ is a maximal monotone map, then for every $\lambda > 0$, we have

(a) $J_\lambda$ is nonexpansive (i.e., Lipschitz continuous with Lipschitz constant 1);

(b) $A_\lambda(x) \in A(J_\lambda(x))$ for every $x \in H$;

(c) $A_\lambda$ is monotone and Lipschitz continuous with Lipschitz constant $\frac{1}{\lambda}$;

(d) $\|A_\lambda(x)\|_H \le \|A^0(x)\|_H$ for every $x \in D(A)$, where $A^0(x) = \operatorname{proj}_{A(x)}(0)$ (recall that $A(x)$ is closed and convex; see Proposition 3.2.7 and Example 3.2.20(a));

(e) $\lim_{\lambda \searrow 0} A_\lambda(x) = A^0(x)$ for all $x \in D(A)$;

(f) $\overline{D(A)}$ is convex and $\lim_{\lambda \searrow 0} J_\lambda(x) = \operatorname{proj}_{\overline{D(A)}}(x)$ for every $x \in H$.
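PROOF (a) For $x, y \in H$, we have
\[ x - y \in J_\lambda(x) - J_\lambda(y) + \lambda \big( A(J_\lambda(x)) - A(J_\lambda(y)) \big). \]
We take the inner product with $J_\lambda(x) - J_\lambda(y)$ and use the monotonicity of $A$. We have
\[ \|J_\lambda(x) - J_\lambda(y)\|_H^2 \le \langle x - y, J_\lambda(x) - J_\lambda(y) \rangle_H \le \|x - y\|_H \|J_\lambda(x) - J_\lambda(y)\|_H, \]
so
\[ \|J_\lambda(x) - J_\lambda(y)\|_H \le \|x - y\|_H. \]

(b) This follows from the equivalence
\[ (x, x^*) \in \operatorname{Gr} A_\lambda \iff (x - \lambda x^*, x^*) \in \operatorname{Gr} A. \tag{3.20} \]

(c) Because $J_\lambda$ is nonexpansive (see (a)), it follows that $\mathrm{id}_H - J_\lambda$ is monotone (see Example 3.2.20(b)) and so $A_\lambda$ is monotone too. We have
\[ x - y = J_\lambda(x) - J_\lambda(y) + \lambda \big( A_\lambda(x) - A_\lambda(y) \big), \]
and
\[ \langle x - y, A_\lambda(x) - A_\lambda(y) \rangle_H = \langle J_\lambda(x) - J_\lambda(y), A_\lambda(x) - A_\lambda(y) \rangle_H + \lambda \|A_\lambda(x) - A_\lambda(y)\|_H^2. \]
From the monotonicity of $A$ and (b), the first term on the right-hand side is nonnegative, so
\[ \lambda \|A_\lambda(x) - A_\lambda(y)\|_H^2 \le \|x - y\|_H \|A_\lambda(x) - A_\lambda(y)\|_H, \]
hence
\[ \|A_\lambda(x) - A_\lambda(y)\|_H \le \frac{1}{\lambda} \|x - y\|_H. \]
Invoking Proposition 3.2.19, we conclude that $A_\lambda$ is maximal monotone.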
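(d) From (b), we have that
\[ \langle A^0(x) - A_\lambda(x), x - J_\lambda(x) \rangle_H \ge 0 \quad \forall\, x \in D(A),\ \lambda > 0, \]
so, since $x - J_\lambda(x) = \lambda A_\lambda(x)$,
\[ \|A_\lambda(x)\|_H^2 \le \langle A^0(x), A_\lambda(x) \rangle_H \le \|A^0(x)\|_H \|A_\lambda(x)\|_H \tag{3.21} \]
and thus
\[ \|A_\lambda(x)\|_H \le \|A^0(x)\|_H. \]

(e) Using (3.20), we can easily verify that
\[ (A_\lambda)_\mu = A_{\lambda + \mu} \quad \forall\, \lambda, \mu > 0. \]
So from (d) and (3.21), we see that
\[ \|A_{\lambda + \mu}(x)\|_H \le \|A_\lambda(x)\|_H \quad \forall\, x \in H,\ \lambda, \mu > 0 \tag{3.22} \]
and
\[ \|A_{\lambda + \mu}(x)\|_H^2 \le \langle A_\lambda(x), A_{\lambda + \mu}(x) \rangle_H \quad \forall\, x \in H,\ \lambda, \mu > 0. \tag{3.23} \]
From (3.22) and (3.23), it follows that
\[ \|A_{\lambda + \mu}(x) - A_\lambda(x)\|_H^2 \le \|A_\lambda(x)\|_H^2 - \|A_{\lambda + \mu}(x)\|_H^2 \quad \forall\, x \in H,\ \lambda, \mu > 0. \]
Therefore, since from (d), $\{\|A_\lambda(x)\|_H\}_{\lambda > 0}$ is bounded for $\lambda > 0$ small enough, we infer that $\{A_\lambda(x)\}_{\lambda > 0}$ is Cauchy and so
\[ A_\lambda(x) \longrightarrow y \quad \text{in } H \text{ as } \lambda \searrow 0. \]
By definition, we have
\[ x - J_\lambda(x) = \lambda A_\lambda(x) \]
and so
\[ J_\lambda(x) \longrightarrow x \quad \text{in } H \text{ as } \lambda \searrow 0. \]
Using (b) and the maximal monotonicity of $A$, we have that $y \in A(x)$. Since by (d) we also have $\|y\|_H \le \|A^0(x)\|_H$, it follows that $y = A^0(x)$.

(f) Let
\[ C := \overline{\operatorname{conv}}\, D(A) \]
and $x \in H$. We have
\[ \langle A_\lambda(x) - u, J_\lambda(x) - z \rangle_H \ge 0 \quad \forall\, (z, u) \in \operatorname{Gr} A, \]
so
\[ \langle x - J_\lambda(x) - \lambda u, J_\lambda(x) - z \rangle_H \ge 0 \quad \forall\, (z, u) \in \operatorname{Gr} A \]
and thus
\[ \|J_\lambda(x)\|_H^2 \le \langle x - \lambda u, J_\lambda(x) - z \rangle_H + \langle J_\lambda(x), z \rangle_H \quad \forall\, (z, u) \in \operatorname{Gr} A. \tag{3.24} \]
From (3.24), it follows that $\{J_\lambda(x)\}_{\lambda > 0} \subseteq H$ is bounded. We choose a sequence $\lambda_n \searrow 0$, such that
\[ J_{\lambda_n}(x) \xrightarrow{w} v \quad \text{in } H. \]
Then by passing to the limit in (3.24) (with $\lambda = \lambda_n$), we obtain
\[ \|v\|_H^2 \le \langle x, v - z \rangle_H + \langle v, z \rangle_H \quad \forall\, z \in D(A), \]
so
\[ \langle v - x, v - z \rangle_H \le 0 \quad \forall\, z \in D(A) \]
and so
\[ \langle x - v, z - v \rangle_H \le 0 \quad \forall\, z \in C. \tag{3.25} \]
Since $v \in C$, from (3.25), it follows that $v = \operatorname{proj}_C(x)$. It remains to show that $C = \overline{D(A)}$. To this end, note that
\[ J_\lambda(x) \in D(A) \quad \forall\, x \in H \]
and
\[ J_\lambda(z) \longrightarrow z \quad \text{as } \lambda \searrow 0 \quad \forall\, z \in C. \]
Therefore, it follows that $C \subseteq \overline{D(A)}$, hence $C = \overline{D(A)}$.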
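The properties in Theorem 3.2.38 can be checked by hand in the simplest nontrivial case. The sketch below assumes $H = \mathbb{R}$ and $A = \partial |\cdot|$ (so $A(x) = \{\operatorname{sign}(x)\}$ for $x \neq 0$ and $A(0) = [-1, 1]$); for this operator the resolvent is the soft-thresholding map. This choice of $A$ and all function names are illustrative assumptions, not part of the text.

```python
def resolvent(x, lam):
    """J_lam(x) = (id + lam*A)^{-1}(x) for A = subdifferential of |.| on R."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def yosida(x, lam):
    """Yosida approximation A_lam(x) = (x - J_lam(x)) / lam."""
    return (x - resolvent(x, lam)) / lam


def minimal_section(x):
    """A^0(x): the element of A(x) of smallest absolute value."""
    return float((x > 0) - (x < 0))


if __name__ == "__main__":
    points = [-2.0, -0.3, 0.0, 0.4, 1.5]
    lam = 0.5
    for x in points:
        for y in points:
            # (a) J_lam is nonexpansive, (c) A_lam is (1/lam)-Lipschitz
            assert abs(resolvent(x, lam) - resolvent(y, lam)) <= abs(x - y) + 1e-12
            assert abs(yosida(x, lam) - yosida(y, lam)) <= abs(x - y) / lam + 1e-12
        # (d) |A_lam(x)| <= |A^0(x)|
        assert abs(yosida(x, lam)) <= abs(minimal_section(x)) + 1e-12
    # (e) A_lam(x) approaches A^0(x) as lam decreases to 0
    print([yosida(0.4, l) for l in (1.0, 0.1, 0.01)])
```

Here $A_\lambda(x)$ is just $x/\lambda$ clipped to $[-1, 1]$, so (b) can also be read off directly: when $J_\lambda(x) = 0$ the value $x/\lambda$ lies in $A(0) = [-1, 1]$, and otherwise $A_\lambda(x) = \operatorname{sign}(x)$.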
REMARK 3.2.39 From the proof of (e), it follows that if $x \notin D(A)$, then
\[ \|A_\lambda(x)\|_H \nearrow +\infty \quad \text{as } \lambda \searrow 0. \]
Perturbation results for maximal monotone operators play an important role in applications. In this direction we have the following result.

THEOREM 3.2.40 If $H$ is a pivot Hilbert space, $A : H \supseteq D(A) \longrightarrow 2^H$ and $B : H \supseteq D(B) \longrightarrow 2^H$ are two maximal monotone maps and $0 \in \operatorname{int} \big( D(A) - D(B) \big)$, then $A + B : H \supseteq D(A) \cap D(B) \longrightarrow 2^H$ is maximal monotone too.

PROOF We start by showing that if $S : H \longrightarrow H$ is a monotone, Lipschitz continuous map with Lipschitz constant $k_S > 0$, then the map $A + S : H \supseteq D(A) \longrightarrow 2^H$ is maximal monotone. To this end, we choose $\mu > 0$, such that $\mu k_S < 1$. Then for a given $y \in H$, the inclusion
\[ x + \mu A(x) + \mu S(x) \ni y \tag{3.26} \]
is equivalent to
\[ x = (\mathrm{id}_H + \mu A)^{-1} \big( y - \mu S(x) \big) = J_\mu \big( y - \mu S(x) \big). \tag{3.27} \]
Note that the map
\[ z \longmapsto E_\mu(z) = J_\mu \big( y - \mu S(z) \big) \]
is Lipschitz continuous with Lipschitz constant $\mu k_S < 1$. So by the Banach fixed point theorem (see Theorem 7.1.2), equation (3.27) (hence inclusion (3.26) too) has a unique solution $x \in D(A)$. By virtue of Theorem 3.2.29, this means that $\mu(A + S)$ is maximal monotone (note that since $H$ is a pivot Hilbert space, $\mathcal{F} = \mathrm{id}_H$). So $A + S$ is maximal monotone too.

Using this general fact and Theorems 3.2.29 and 3.2.38(c), we see that for every $\lambda > 0$ we can find $x_\lambda \in D(A)$, such that
\[ x_\lambda + A(x_\lambda) + B_\lambda(x_\lambda) \ni y. \tag{3.28} \]
We take the inner product with $x_\lambda - z$, for some $z \in D(A) \cap D(B)$. Exploiting the monotonicity of $A$ and $B_\lambda$, we obtain
\[ \|x_\lambda - z\|_H \le \|y - z - A^0(z) - B_\lambda(z)\|_H \quad \forall\, \lambda > 0. \]
Using also Theorem 3.2.38(d), we get
\[ \|x_\lambda\|_H \le 2 \|z\|_H + \|y\|_H + \|A^0(z)\|_H + \|B^0(z)\|_H \quad \forall\, \lambda > 0. \tag{3.29} \]
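Because of our hypothesis concerning the domains $D(A)$ and $D(B)$, we can find $\varepsilon > 0$, such that
\[ \overline{B}_\varepsilon := \{ z \in H : \|z\|_H \le \varepsilon \} \subseteq D(B) - D(A). \]
Let $z \in \overline{B}_\varepsilon$. Then
\[ z = b - a \quad \text{with } b \in D(B) \text{ and } a \in D(A). \]
Exploiting the monotonicity of $B_\lambda$, we have
\[ \langle B_\lambda(x_\lambda), b \rangle_H \le \langle B_\lambda(x_\lambda), x_\lambda \rangle_H - \langle B_\lambda(b), x_\lambda - b \rangle_H. \]
From Theorem 3.2.38(d), we have
\[ \langle B_\lambda(x_\lambda), z \rangle_H \le \langle B_\lambda(x_\lambda), x_\lambda - a \rangle_H + \|B^0(b)\|_H \|x_\lambda - b\|_H, \]
so
\[ \langle B_\lambda(x_\lambda), z \rangle_H \le \langle y - x_\lambda - u_\lambda, x_\lambda - a \rangle_H + \|B^0(b)\|_H \|x_\lambda - b\|_H, \]
with $u_\lambda \in A(x_\lambda)$. As $A$ is monotone, we have
\[ \langle B_\lambda(x_\lambda), z \rangle_H \le \|x_\lambda - a\|_H \big( \|y - x_\lambda\|_H + \|A^0(a)\|_H \big) + \|B^0(b)\|_H \|x_\lambda - b\|_H. \]
From (3.29), we have that $\{x_\lambda\}_{\lambda > 0}$ is bounded. Thus for every $z \in \overline{B}_\varepsilon$, we can find $c(z) > 0$, such that
\[ \langle B_\lambda(x_\lambda), z \rangle_H \le c(z). \]
Invoking the uniform boundedness principle (see Theorem A.3.4), we have
\[ \sup_{\lambda > 0} \|B_\lambda(x_\lambda)\|_H < +\infty. \tag{3.30} \]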
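From (3.28), we have
\[ x_\lambda - x_\mu \in -A(x_\lambda) + A(x_\mu) - B_\lambda(x_\lambda) + B_\mu(x_\mu) \quad \forall\, \lambda, \mu > 0, \]
so from the monotonicity of $A$, we have
\[ \|x_\lambda - x_\mu\|_H^2 \le -\langle B_\lambda(x_\lambda) - B_\mu(x_\mu), x_\lambda - x_\mu \rangle_H \quad \forall\, \lambda, \mu > 0. \]
Invoking also Theorem 3.2.38(b), we obtain
\[ \|x_\lambda - x_\mu\|_H^2 \le -\langle B_\lambda(x_\lambda) - B_\mu(x_\mu), \lambda B_\lambda(x_\lambda) - \mu B_\mu(x_\mu) \rangle_H \quad \forall\, \lambda, \mu > 0. \]
Using (3.30), we infer that $\{x_\lambda\}_{\lambda > 0} \subseteq H$ is Cauchy and so
\[ x_\lambda \longrightarrow x \quad \text{as } \lambda \searrow 0. \tag{3.31} \]
Also we can say that
\[ B_\lambda(x_\lambda) \xrightarrow{w} v \quad \text{in } H, \text{ as } \lambda \searrow 0. \tag{3.32} \]
Note that, with $J_\lambda$ denoting here the resolvent of $B$,
\[ \|J_\lambda(x_\lambda) - x\|_H \le \lambda \|B_\lambda(x_\lambda)\|_H + \|x_\lambda - x\|_H, \]
so, using (3.30) and (3.31), we have
\[ J_\lambda(x_\lambda) \longrightarrow x \quad \text{in } H, \text{ as } \lambda \searrow 0. \]
Since $B$ is maximal monotone and
\[ B_\lambda(x_\lambda) \in B\big( J_\lambda(x_\lambda) \big) \]
(see Theorem 3.2.38(b)), in the limit as $\lambda \searrow 0$, we obtain $(x, v) \in \operatorname{Gr} B$ (see Proposition 3.2.15 and (3.32)). Passing to the limit as $\lambda \searrow 0$ in (3.28) and using the fact that $A$ is maximal monotone, we obtain that $x \in D(A)$ and
\[ x + A(x) + B(x) \ni y, \]
so
\[ R\big( \mathrm{id}_H + A + B \big) = H \]
and finally $A + B$ is maximal monotone (see Theorem 3.2.29).

For operators from a reflexive Banach space into its dual, we have the following perturbation result.

THEOREM 3.2.41 If $X$ is a reflexive Banach space, $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ and $B : X \supseteq D(B) \longrightarrow 2^{X^*}$ are two maximal monotone maps and
\[ D(A) \cap \operatorname{int} D(B) \neq \emptyset \quad (\text{or } D(B) \cap \operatorname{int} D(A) \neq \emptyset), \]
then $A + B : X \supseteq D(A) \cap D(B) \longrightarrow 2^{X^*}$ is maximal monotone.

REMARK 3.2.42 Since $\operatorname{int} D(A) - D(B) \subseteq \operatorname{int} \big( D(A) - D(B) \big)$, we see that the hypothesis of Theorem 3.2.40 is weaker than that of Theorem 3.2.41. Note that the condition $0 \in \operatorname{int} \big( D(A) - D(B) \big)$ may hold even if $\operatorname{int} D(A) = \operatorname{int} D(B) = \emptyset$.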
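The fixed point step at the start of the proof of Theorem 3.2.40 (solving $x + \mu A(x) + \mu S(x) \ni y$ through $x = J_\mu(y - \mu S(x))$) can be tried numerically. The sketch below assumes $H = \mathbb{R}$, $A = \partial |\cdot|$ (whose resolvent is soft-thresholding) and the hypothetical perturbation $S(x) = 2x + \sin x$, which is monotone with Lipschitz constant $k_S = 3$; with $\mu = 0.25$ we have $\mu k_S < 1$. None of these concrete choices come from the text.

```python
import math


def soft_threshold(x, mu):
    """Resolvent J_mu of A = subdifferential of |.| on R."""
    if x > mu:
        return x - mu
    if x < -mu:
        return x + mu
    return 0.0


def S(x):
    # Hypothetical perturbation: monotone, Lipschitz with constant 3.
    return 2.0 * x + math.sin(x)


def solve_inclusion(y, mu=0.25, iters=200):
    """Iterate x <- J_mu(y - mu*S(x)); a contraction since mu * 3 < 1."""
    x = 0.0
    for _ in range(iters):
        x = soft_threshold(y - mu * S(x), mu)
    return x


def residual(x, y, mu=0.25):
    """Element (y - x - mu*S(x)) / mu, which must belong to A(x)."""
    return (y - x - mu * S(x)) / mu


if __name__ == "__main__":
    for y in (2.0, 0.1, -2.0):
        x = solve_inclusion(y)
        print(y, x, residual(x, y))
```

For a solution $x \neq 0$ the residual should equal $\operatorname{sign}(x)$, and for $x = 0$ it should lie in $[-1, 1]$; both cases occur for the sample right-hand sides above.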
Another useful perturbation result is given in the next theorem.

THEOREM 3.2.43 If $H$ is a pivot Hilbert space, $A : H \supseteq D(A) \longrightarrow 2^H$ and $B : H \supseteq D(B) \longrightarrow 2^H$ are maximal monotone maps, $D(A) \cap D(B) \neq \emptyset$ and
\[ 0 \le \langle y, B_\lambda(x) \rangle_H \quad \forall\, (x, y) \in \operatorname{Gr} A,\ \lambda > 0, \]
then $A + B$ is maximal monotone.

PROOF Let $y \in H$ and consider the inclusion
\[ x + A(x) + B_\lambda(x) \ni y. \tag{3.33} \]
From the proof of Theorem 3.2.40, we know that (3.33) has a unique solution $x_\lambda \in D(A)$ and $\{x_\lambda\}_{\lambda > 0} \subseteq H$ is bounded. Take the inner product of (3.33) with $B_\lambda(x_\lambda)$ and use the hypothesis, to obtain that
\[ \sup_{\lambda > 0} \|B_\lambda(x_\lambda)\|_H < +\infty. \]
Then the remainder of the proof goes as the proof of Theorem 3.2.40.

Next we introduce some generalizations of the concept of monotonicity, which are useful in the study of nonlinear partial differential equations. The mathematical setting remains the same. Namely, $X$ is a reflexive Banach space, $X^*$ is its topological dual and $A : X \longrightarrow 2^{X^*}$ is an operator.

DEFINITION 3.2.44 The map $A : X \longrightarrow 2^{X^*}$ is said to be pseudomonotone, if

(a) the set $A(x)$ is nonempty, convex and weakly compact for all $x \in X$;

(b) $A$ is upper semicontinuous from each finite dimensional subspace $V$ of $X$ into $X^*$ furnished with the weak topology;

(c) if $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ are sequences, such that $x_n^* \in A(x_n)$,
\[ x_n \xrightarrow{w} x \quad \text{in } X, \]
for some $x \in X$ and
\[ \limsup_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X \le 0, \]
then for each $y \in X$, we can find $v^*(y) \in A(x)$, such that
\[ \langle v^*(y), x - y \rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - y \rangle_X. \]

To be able to deal with problems in which the nonlinear operators are not everywhere defined and which are not continuous even in a mild sense, we introduce the following notion.
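DEFINITION 3.2.45 A map $A : X \longrightarrow 2^{X^*}$ is said to be generalized pseudomonotone, if for any sequences $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$, such that
\[ x_n \xrightarrow{w} x \quad \text{in } X, \]
for some $x \in X$ and
\[ x_n^* \xrightarrow{w} x^* \quad \text{in } X^*, \]
for some $x^* \in X^*$, with $x_n^* \in A(x_n)$ for $n \ge 1$ and
\[ \limsup_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X \le 0, \]
we have that $x^* \in A(x)$ and
\[ \langle x_n^*, x_n \rangle_X \longrightarrow \langle x^*, x \rangle_X. \]

An immediate consequence of this definition is the following result.

PROPOSITION 3.2.46 A map $A : X \longrightarrow 2^{X^*}$ is generalized pseudomonotone if and only if $A^{-1} : X^* \longrightarrow 2^X$ is generalized pseudomonotone.

The class of generalized pseudomonotone maps contains the maximal monotone ones.

PROPOSITION 3.2.47 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is maximal monotone, then $A$ is generalized pseudomonotone.

PROOF Let $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ be two sequences, such that
\[ x_n \xrightarrow{w} x \quad \text{in } X, \]
for some $x \in X$ and
\[ x_n^* \xrightarrow{w} x^* \quad \text{in } X^*, \]
for some $x^* \in X^*$, with $x_n^* \in A(x_n)$ for $n \ge 1$ and
\[ \limsup_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X \le 0. \]
We need to show that $x^* \in A(x)$ and
\[ \langle x_n^*, x_n \rangle_X \longrightarrow \langle x^*, x \rangle_X. \]
Let $(u, u^*) \in \operatorname{Gr} A$. Then since $A$ is monotone, we have
\[ 0 \le \langle x_n^* - u^*, x_n - u \rangle_X \quad \forall\, n \ge 1. \tag{3.34} \]
Also we have
\[ \langle x_n^*, x_n \rangle_X = \langle x_n^* - u^*, x_n - u \rangle_X + \langle x_n^*, u \rangle_X + \langle u^*, x_n \rangle_X - \langle u^*, u \rangle_X. \]
Note that
\[ \langle x_n^*, u \rangle_X + \langle u^*, x_n \rangle_X - \langle u^*, u \rangle_X \longrightarrow \langle x^*, u \rangle_X + \langle u^*, x \rangle_X - \langle u^*, u \rangle_X, \]
so from (3.34), we have
\[ \langle x^*, x \rangle_X \ge \limsup_{n \to +\infty} \langle x_n^*, x_n \rangle_X \ge \langle x^*, u \rangle_X + \langle u^*, x \rangle_X - \langle u^*, u \rangle_X \]
and thus
\[ \langle x^* - u^*, x - u \rangle_X \ge 0. \]
Since $(u, u^*) \in \operatorname{Gr} A$ was arbitrary and $A$ is maximal monotone, it follows that $x^* \in A(x)$. Therefore
\[ \langle x_n^* - x^*, x_n - x \rangle_X \ge 0 \quad \forall\, n \ge 1, \]
so
\[ \liminf_{n \to +\infty} \langle x_n^*, x_n \rangle_X \ge \langle x^*, x \rangle_X \]
and thus
\[ \langle x_n^*, x_n \rangle_X \longrightarrow \langle x^*, x \rangle_X, \]
i.e., $A$ is generalized pseudomonotone.

In fact every pseudomonotone map is generalized pseudomonotone.

THEOREM 3.2.48 If $A : X \longrightarrow 2^{X^*}$ is a pseudomonotone map, then $A$ is generalized pseudomonotone.

PROOF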
Suppose that $\{(x_n, x_n^*)\}_{n \ge 1} \subseteq \operatorname{Gr} A$ is a sequence, such that
\[ (x_n, x_n^*) \xrightarrow{w \times w} (x, x^*) \quad \text{in } X \times X^*, \]
for some $(x, x^*) \in X \times X^*$ and
\[ \limsup_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X \le 0. \]
By virtue of the pseudomonotonicity of $A$, for every $y \in X$, we can find $v^*(y) \in A(x)$, such that
\[ \langle v^*(y), x - y \rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - y \rangle_X. \]
We may assume that
\[ \langle x_n^*, x_n \rangle_X \longrightarrow \xi, \]
for some $\xi \in \mathbb{R}$ and so
\[ \limsup_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X = \xi - \langle x^*, x \rangle_X \le 0. \tag{3.35} \]
Also, we have
\[ \xi - \langle x^*, y \rangle_X \ge \liminf_{n \to +\infty} \langle x_n^*, x_n - y \rangle_X \ge \langle v^*(y), x - y \rangle_X, \]
so, from (3.35), we have
\[ \langle x^*, x - y \rangle_X \ge \langle v^*(y), x - y \rangle_X \quad \forall\, y \in X. \tag{3.36} \]
We claim that $x^* \in A(x)$. If this is not the case, then since $A(x)$ is convex and $w$-compact (see Definition 3.2.44), we can find $u \in X$, such that
\[ \langle x^*, u \rangle_X < \inf_{z^* \in A(x)} \langle z^*, u \rangle_X. \tag{3.37} \]
Let $y = x - u$ in (3.36). Then
\[ \langle x^*, u \rangle_X \ge \langle v^*(y), u \rangle_X, \tag{3.38} \]
with $v^*(y) \in A(x)$. Comparing (3.37) and (3.38), we reach a contradiction. Therefore $x^* \in A(x)$. Finally, if $y = x \in X$, then
\[ \liminf_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X \ge \langle v^*(x), x - x \rangle_X = 0, \]
so
\[ \liminf_{n \to +\infty} \langle x_n^*, x_n \rangle_X \ge \langle x^*, x \rangle_X \]
and recalling the choice of the sequences $\{x_n\}_{n \ge 1}$ and $\{x_n^*\}_{n \ge 1}$, we get
\[ \langle x_n^*, x_n \rangle_X \longrightarrow \langle x^*, x \rangle_X. \]

There is a converse to this theorem.
PROPOSITION 3.2.49 If $A : X \longrightarrow 2^{X^*}$ is a bounded generalized pseudomonotone map and for every $x \in X$, the set $A(x)$ is nonempty, convex and weakly compact, then $A$ is pseudomonotone.

PROOF First we show that if $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ are two sequences, such that $x_n \xrightarrow{w} x$ in $X$, for some $x \in X$, $x_n^* \in A(x_n)$ for $n \ge 1$ and
\[ \limsup_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X \le 0, \]
then for each $u \in X$, we can find $y^*(u) \in A(x)$, such that
\[ \langle y^*(u), x - u \rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - u \rangle_X. \]
Suppose that this is not true. Then we can find $u \in X$, such that
\[ \liminf_{n \to +\infty} \langle x_n^*, x_n - u \rangle_X < \inf_{v^* \in A(x)} \langle v^*, x - u \rangle_X. \]
By passing to a suitable subsequence, we may say that
\[ \lim_{n \to +\infty} \langle x_n^*, x_n - u \rangle_X < \inf_{v^* \in A(x)} \langle v^*, x - u \rangle_X. \]
Since $A$ is bounded, we have that the sequence $\{x_n^*\}_{n \ge 1} \subseteq X^*$ is bounded. So by virtue of the Eberlein-Smulian theorem (see Theorem A.3.8), we may assume that $x_n^* \xrightarrow{w} x^*$ in $X^*$. Because $A$ is generalized pseudomonotone, it follows that $x^* \in A(x)$ and
\[ \langle x_n^*, x_n \rangle_X \longrightarrow \langle x^*, x \rangle_X \]
(see Definition 3.2.45). Therefore
\[ \lim_{n \to +\infty} \langle x_n^*, x_n - u \rangle_X = \langle x^*, x - u \rangle_X < \inf_{v^* \in A(x)} \langle v^*, x - u \rangle_X, \]
a contradiction, since $x^* \in A(x)$.

Next we show that $A$ is upper semicontinuous from $X$ into $X^*$ furnished with the weak topology. By virtue of the boundedness of $A$ and of Remark 3.2.13, it suffices to show that if $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ are two sequences, such that
\[ x_n \longrightarrow x \quad \text{in } X, \]
for some $x \in X$,
\[ x_n^* \xrightarrow{w} x^* \quad \text{in } X^*, \]
for some $x^* \in X^*$ and $x_n^* \in A(x_n)$ for $n \ge 1$, then $x^* \in A(x)$. But this follows from the fact that $A$ is generalized pseudomonotone. Thus we have shown that $A$ is pseudomonotone (see Definition 3.2.44).
Combining this result with Propositions 3.2.11 and 3.2.47, we obtain the following corollary.

COROLLARY 3.2.50 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is maximal monotone and $D(A) = X$, then $A$ is pseudomonotone.

The class of pseudomonotone maps is invariant under addition of operators.

PROPOSITION 3.2.51 If $A_1, A_2 : X \longrightarrow 2^{X^*}$ are two pseudomonotone maps, then $A_1 + A_2$ is pseudomonotone too.

PROOF Evidently for each $x \in X$, the set
\[ (A_1 + A_2)(x) = A_1(x) + A_2(x) \]
is nonempty, convex and weakly compact. Moreover, it is easy to see that the map $x \longmapsto (A_1 + A_2)(x)$ is upper semicontinuous from every finite dimensional subspace of $X$ into $X^*$ equipped with the weak topology. Next we show that if $\{x_n\}_{n \ge 1} \subseteq X$ and $\{x_n^*\}_{n \ge 1} \subseteq X^*$ are two sequences, such that $x_n \xrightarrow{w} x$ in $X$, for some $x \in X$, $x_n^* \in (A_1 + A_2)(x_n)$ for $n \ge 1$ and
\[ \limsup_{n \to +\infty} \langle x_n^*, x_n - x \rangle_X \le 0, \]
then for every $u \in X$, we can find $y^*(u) \in (A_1 + A_2)(x)$, such that
\[ \langle y^*(u), x - u \rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - u \rangle_X. \]
Let
\[ x_n^* = y_n^* + z_n^* \quad \text{with } y_n^* \in A_1(x_n),\ z_n^* \in A_2(x_n) \quad \forall\, n \ge 1. \]
Then, we have
\[ \limsup_{n \to +\infty} \big[ \langle y_n^*, x_n - x \rangle_X + \langle z_n^*, x_n - x \rangle_X \big] \le 0. \tag{3.39} \]
We claim that (3.39) implies
\[ \limsup_{n \to +\infty} \langle y_n^*, x_n - x \rangle_X \le 0 \quad \text{and} \quad \limsup_{n \to +\infty} \langle z_n^*, x_n - x \rangle_X \le 0. \tag{3.40} \]
Suppose that (3.40) is not true. Then at least one of the two upper limits is strictly bigger than zero. To fix things, suppose that
\[ \limsup_{n \to +\infty} \langle y_n^*, x_n - x \rangle_X > 0. \]
Then we can find $c > 0$ and a suitable subsequence (denoted with the same index), such that
\[ \lim_{n \to +\infty} \langle y_n^*, x_n - x \rangle_X = c > 0. \]
Then because of (3.39), we have that
\[ \limsup_{n \to +\infty} \langle z_n^*, x_n - x \rangle_X \le -c < 0. \tag{3.41} \]
By virtue of the pseudomonotonicity of $A_2$, for every $u \in X$, we can find $y_2^*(u) \in A_2(x)$, such that
\[ \langle y_2^*(u), x - u \rangle_X \le \liminf_{n \to +\infty} \langle z_n^*, x_n - u \rangle_X. \]
Let $u = x$. Then we have
\[ \liminf_{n \to +\infty} \langle z_n^*, x_n - x \rangle_X \ge 0. \tag{3.42} \]
Comparing (3.41) and (3.42), we reach a contradiction. This proves (3.40). Since both $A_1$ and $A_2$ are pseudomonotone, given $u \in X$, we can find
\[ y_1^*(u) \in A_1(x) \quad \text{and} \quad y_2^*(u) \in A_2(x), \]
such that
\[ \langle y_1^*(u), x - u \rangle_X \le \liminf_{n \to +\infty} \langle y_n^*, x_n - u \rangle_X \]
and
\[ \langle y_2^*(u), x - u \rangle_X \le \liminf_{n \to +\infty} \langle z_n^*, x_n - u \rangle_X. \]
Let
\[ y^*(u) := y_1^*(u) + y_2^*(u) \in (A_1 + A_2)(x). \]
We have
\[ \langle y^*(u), x - u \rangle_X \le \liminf_{n \to +\infty} \langle y_n^*, x_n - u \rangle_X + \liminf_{n \to +\infty} \langle z_n^*, x_n - u \rangle_X \le \liminf_{n \to +\infty} \langle x_n^*, x_n - u \rangle_X, \]
which means that $A_1 + A_2$ is pseudomonotone.
As was the case with maximal monotone operators, pseudomonotone operators exhibit remarkable surjectivity properties.

THEOREM 3.2.52 If $A : X \longrightarrow 2^{X^*}$ is pseudomonotone and coercive, then $R(A) = X^*$, i.e., $A$ is surjective.

PROOF Let $\mathcal{T}$ be the family of all finite dimensional subspaces of $X$, equipped with the partial order defined by inclusion. Let $V \in \mathcal{T}$ and let $i_V : V \longrightarrow X$ denote the embedding operator. Then $i_V^* : X^* \longrightarrow V^*$ is the corresponding projection operator onto $V^*$. Let
\[ A_V = i_V^* A i_V : V \longrightarrow 2^{V^*}. \]
Clearly $A_V$ has nonempty, convex and compact values and it is upper semicontinuous. Moreover, for every $x_V^* \in A_V(x)$, we have $x_V^* = i_V^* x^*$ for some $x^* \in A(x)$ and so
\[ \langle x_V^*, x \rangle_V = \langle i_V^* x^*, x \rangle_V = \langle x^*, i_V(x) \rangle_X, \]
so $A_V$ is coercive too. To prove the theorem, it suffices to show that $0 \in R(A)$. Because of Proposition 3.2.33, for every $V \in \mathcal{T}$, we can find $x_V \in V$, such that $0 \in A_V(x_V)$, hence
\[ 0 = i_V^* x_V^*, \]
for some $x_V^* \in A(x_V)$. By virtue of the coercivity of $A$, we have that $\{x_V\}_{V \in \mathcal{T}} \subseteq X$ is bounded. For $V \in \mathcal{T}$, let
\[ E_V := \bigcup_{\substack{V' \in \mathcal{T} \\ V' \supseteq V}} \{x_{V'}\}. \]
Then
\[ E_V \subseteq \overline{B}_M \]
for some $M > 0$ large enough. Because $X$ is reflexive (hence $\overline{B}_M$ is weakly compact), from the finite intersection property, we have that
\[ \bigcap_{V \in \mathcal{T}} \overline{E_V}^{\,w} \neq \emptyset. \]
Let
\[ x_0 \in \bigcap_{V \in \mathcal{T}} \overline{E_V}^{\,w} \quad \text{and} \quad y \in X. \]
Choose $V \in \mathcal{T}$, such that $\{x_0, y\} \subseteq V$. Let $\{x_{V_k}\}_{k \ge 1} \subseteq E_V$ be such that
\[ x_{V_k} \xrightarrow{w} x_0 \quad \text{in } X \text{ as } k \to +\infty. \]
Recall that $0 = i_{V_k}^* x_{V_k}^*$ with $x_{V_k}^* \in A(x_{V_k})$. So we have
\[ \langle x_{V_k}^*, x_{V_k} - x_0 \rangle_X = 0 \quad \forall\, k \ge 1. \]
Since $A$ is pseudomonotone, we can find $y^*(y) \in A(x_0)$, such that
\[ \langle y^*(y), x_0 - y \rangle_X \le \lim_{k \to +\infty} \langle x_{V_k}^*, x_{V_k} - y \rangle_X = 0 \quad \forall\, y \in X. \tag{3.43} \]
Suppose that $0 \notin A(x_0)$. Then by the strong separation theorem (see Theorem A.3.2), we can find $y \in X$, such that
\[ 0 < \inf_{v^* \in A(x_0)} \langle v^*, x_0 - y \rangle_X. \tag{3.44} \]
Comparing (3.43) and (3.44), we obtain a contradiction. This proves the surjectivity of $A$.

The following classes of operators are often useful in nonlinear operator equations.

DEFINITION 3.2.53 Let
\[ A : X \supseteq D(A) \longrightarrow 2^{X^*} \quad \text{and} \quad B : X \supseteq D(B) \longrightarrow 2^{X^*} \]
be two maps.

(a) We say that $B$ is smooth, if $D(B) = X$ and $B$ is bounded, coercive and maximal monotone.

(b) We say that $A$ is regular, if it is generalized pseudomonotone and for every smooth operator $B : X \supseteq D(B) \longrightarrow 2^{X^*}$, we have $R(A + B) = X^*$.
The next proposition gives an important example of a regular generalized pseudomonotone map.

PROPOSITION 3.2.54 If $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is a pseudomonotone operator and there exists $c > 0$, such that
\[ \langle x^*, x \rangle_X \ge -c \|x\|_X \quad \forall\, (x, x^*) \in \operatorname{Gr} A, \]
then $A$ is regular (so also generalized pseudomonotone).

PROOF First note that $A$ is a generalized pseudomonotone operator (see Theorem 3.2.48). Also let $B : X \supseteq D(B) \longrightarrow 2^{X^*}$ be an arbitrary smooth map. Then from Corollary 3.2.50, $B$ is pseudomonotone. Then Proposition 3.2.51 implies that $A + B$ is pseudomonotone. Moreover, $A + B$ is coercive. So Theorem 3.2.52 implies that $R(A + B) = X^*$, hence $A$ is regular.

We introduce two more classes of nonlinear operators of monotone type.

DEFINITION 3.2.55 Let $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ be a map.

(a) We say that $A$ is of type $(M)$, if for every $x \in X$, the set $A(x)$ is nonempty, convex, weakly compact, it is upper semicontinuous from every finite dimensional subspace $V$ of $X$ into $X^*$ furnished with the weak topology and if
\[ x_n \xrightarrow{w} x \quad \text{in } X, \qquad x_n^* \xrightarrow{w} x^* \quad \text{in } X^* \]
with $(x_n, x_n^*) \in \operatorname{Gr} A$, then
\[ (x, x^*) \in \operatorname{Gr} A. \]

(b) We say that $A$ is of type $(S)_+$, if $A$ is single valued with $D(A) = X$ and for every sequence $\{x_n\}_{n \ge 1} \subseteq X$, such that
\[ x_n \xrightarrow{w} x \quad \text{in } X, \]
for some $x \in X$ and
\[ \limsup_{n \to +\infty} \langle A(x_n), x_n - x \rangle_X \le 0, \]
we have that
\[ x_n \longrightarrow x \quad \text{in } X. \]
REMARK 3.2.56 The prototype for operators of type $(M)$ is a monotone, hemicontinuous map. Similarly the prototype for operators of type $(S)_+$ is a uniformly monotone operator defined everywhere. If $A$ is of type $(M)$ (respectively of type $(S)_+$) and $B : X \longrightarrow X^*$ is completely continuous, then $A + B$ is of type $(M)$ (respectively of type $(S)_+$). In fact in the $(S)_+$ case, $B$ can be compact.

We close this section by briefly discussing two important examples of maximal monotone maps.

PROPOSITION 3.2.57 If $X$ is a separable, reflexive Banach space, $A : X \supseteq D(A) \longrightarrow 2^{X^*}$ is maximal monotone with $0 \in A(0)$ and $\widehat{A} : L^p(T; X) \supseteq \widehat{D} \longrightarrow 2^{L^{p'}(T; X^*)}$, where $T = [0, b]$, $p \in (1, +\infty)$, $\frac{1}{p} + \frac{1}{p'} = 1$, is defined by
\[ \widehat{A}(x) := \big\{ h \in L^{p'}(T; X^*) : h(t) \in A(x(t)) \text{ for a.a. } t \in T \big\} \quad \forall\, x \in \widehat{D}, \]
where
\[ \widehat{D} := \big\{ x \in L^p(T; X) : x(t) \in D(A) \text{ for a.a. } t \in T \text{ and there exists } h \in L^{p'}(T; X^*) \text{ such that } h(t) \in A(x(t)) \text{ for a.a. } t \in T \big\}, \]
then $\widehat{A}$ is maximal monotone too.

PROOF By Troyanski's renorming theorem (see Theorem A.3.23), we may assume without any loss of generality that both $X$ and $X^*$ are locally uniformly convex (see Definition A.3.21). Let $\mathcal{F} : X \longrightarrow X^*$ be the duality map of $X$. From Proposition 3.2.27, we have that $\mathcal{F}$ is a homeomorphism. Let $J_0 : L^p(T; X) \longrightarrow 2^{L^{p'}(T; X^*)}$ be defined by
\[ J_0(x)(\cdot) := \|\mathcal{F}(x(\cdot))\|_{X^*}^{p-2} \mathcal{F}(x(\cdot)). \]
It is easy to see that $J_0$ is continuous, strictly monotone and so maximal monotone too (see Proposition 3.2.18). Clearly $\widehat{A}$ is a monotone map. We claim that
\[ R\big( \widehat{A} + J_0 \big) = L^{p'}(T; X^*). \]
To this end let $h \in L^{p'}(T; X^*)$ and consider the multifunction $S : T \longrightarrow 2^X$, defined by
\[ S(t) := \{ x \in X : A(x) + \varphi(x) \ni h(t) \}, \]
where $\varphi : X \longrightarrow X^*$ is the monotone, continuous (hence maximal monotone) map, defined by
\[ \varphi(x) := \|\mathcal{F}(x)\|_{X^*}^{p-2} \mathcal{F}(x) \quad \forall\, x \in X. \]
We know that
\[ A + \varphi : X \supseteq D(A) \longrightarrow 2^{X^*} \]
is maximal monotone (see Theorem 3.2.41). Moreover, because $0 \in A(0)$, we have
\[ \langle x^* + \varphi(x), x \rangle_X \ge \langle \varphi(x), x \rangle_X = \|\mathcal{F}(x)\|_{X^*}^p = \|x\|_X^p \quad \forall\, (x, x^*) \in \operatorname{Gr} A, \]
hence $A + \varphi$ is coercive. Therefore $R(A + \varphi) = X^*$ (see Corollary 3.2.31). It follows that
\[ S(t) \neq \emptyset \quad \forall\, t \in T. \]
Note that
\[ \operatorname{Gr} S = \big\{ (t, x) \in T \times X : \big( x, h(t) - \varphi(x) \big) \in \operatorname{Gr} A \big\}. \]
Let $\vartheta : T \times X \longrightarrow X \times X^*$ be the function, defined by
\[ \vartheta(t, x) := \big( x, h(t) - \varphi(x) \big). \]
Clearly $\vartheta$ is a Carathéodory function (i.e., $t \longmapsto \vartheta(t, x)$ is measurable and $x \longmapsto \vartheta(t, x)$ is continuous). Therefore $\vartheta$ is jointly measurable. Note that
\[ \operatorname{Gr} S = \vartheta^{-1}(\operatorname{Gr} A) \]
and from Proposition 3.2.15, we know that
\[ \operatorname{Gr} A \subseteq X \times X_w^* \text{ is closed} \]
(here by $X_w^*$ we denote the space $X^*$ furnished with the weak topology). Hence
\[ \operatorname{Gr} A \in \mathcal{B}\big( X \times X_w^* \big) \]
(by $\mathcal{B}(Z)$ we denote the Borel $\sigma$-field of $Z$). Since $X_w^*$ is a Souslin space (see Definition A.2.29(b)), then
\[ \mathcal{B}\big( X \times X_w^* \big) = \mathcal{B}(X) \times \mathcal{B}\big( X_w^* \big) \]
(see Proposition A.2.34(b)). Moreover,
\[ \mathcal{B}(X^*) = \mathcal{B}\big( X_w^* \big). \]
Therefore
\[ \mathcal{B}(X) \times \mathcal{B}\big( X_w^* \big) = \mathcal{B}(X) \times \mathcal{B}(X^*) = \mathcal{B}(X \times X^*). \]
So we have that
\[ \operatorname{Gr} A \in \mathcal{B}(X \times X^*), \]
hence $\operatorname{Gr} S = \vartheta^{-1}(\operatorname{Gr} A)$ is measurable, and we can apply the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) to obtain a measurable map $x : T \longrightarrow X$, such that
\[ x(t) \in S(t) \quad \forall\, t \in T. \]
We have
\[ h(t) \in A(x(t)) + \varphi(x(t)) \quad \text{for a.a. } t \in T. \]
Taking duality brackets with $x(t)$, we obtain
\[ \|x(t)\|_X^p \le \langle h(t), x(t) \rangle_X \quad \forall\, t \in T \]
(recall that $0 \in A(0)$) and so
\[ \|x(t)\|_X^{p-1} \le \|h(t)\|_{X^*}, \]
from which it follows that $x \in L^p(T; X)$. This proves that
\[ R\big( \widehat{A} + J_0 \big) = L^{p'}(T; X^*). \]
Using this surjectivity property we shall establish the maximal monotonicity of the monotone operator $\widehat{A}$. To this end suppose that for some
\[ (y, v) \in L^p(T; X) \times L^{p'}(T; X^*), \]
we have
\[ \langle u - v, x - y \rangle_{pp'} \ge 0 \quad \forall\, (x, u) \in \operatorname{Gr} \widehat{A}, \tag{3.45} \]
where by $\langle \cdot, \cdot \rangle_{pp'}$ we denote the duality brackets for the pair of spaces $\big( L^{p'}(T; X^*), L^p(T; X) \big)$, i.e.,
\[ \langle u, v \rangle_{pp'} := \int_0^b \langle u(t), v(t) \rangle_X \, dt \quad \forall\, (u, v) \in L^{p'}(T; X^*) \times L^p(T; X). \]
Since $\widehat{A} + J_0$ is surjective, we can find $x_1 \in \widehat{D}$, such that
\[ u + J_0(x_1) = v + J_0(y), \]
for some $u \in \widehat{A}(x_1)$. Returning to (3.45) and setting $x = x_1$, we obtain
\[ \langle J_0(y) - J_0(x_1), x_1 - y \rangle_{pp'} \ge 0. \tag{3.46} \]
But $J_0$ is strictly monotone. So from (3.46), it follows that $x_1 = y$, hence
\[ y \in \widehat{D} \quad \text{and} \quad v = u \in \widehat{A}(x_1). \]
This proves the maximality of $\widehat{A}$.
The second important class of maximal monotone operators that we would like to discuss before closing this section consists of the linear ones. Note that for a linear operator $A : X \supseteq D(A) \longrightarrow X^*$, monotonicity is equivalent to saying that
\[ \langle A(x), x \rangle_X \ge 0 \quad \forall\, x \in D(A). \]
For linear monotone operators we can characterize maximality using the adjoint operator or in terms of the density of its domain. This brings us to the “doorsteps” of the Hille-Yosida theorem and the theory of semigroups of operators, which form the subject of the next section.

THEOREM 3.2.58 If $A : X \supseteq D(A) \longrightarrow X^*$ is a linear monotone operator, then the following statements are equivalent:

(a) $A$ is maximal monotone;

(b) $A^*$ is maximal monotone;

(c) $A$ and $A^*$ are both monotone and $A$ is closed and densely defined.

In particular if $X = H$ is a Hilbert space and $A : H \longrightarrow H^*$ is linear maximal monotone, then $A$ is symmetric if and only if $A$ is selfadjoint. Maximal monotonicity is crucial here since it may happen that $A$ is monotone symmetric, but $A^*$ is not monotone.

REMARK 3.2.59 In Section 4.3 we shall return to the subject of monotone operators, by examining in more detail the subdifferential of a proper, convex and lower semicontinuous function (see Example 3.2.20(c)). This is a special kind of monotone map, known as cyclically monotone. As we shall see there, not every monotone map is of the subdifferential type.
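In finite dimensions the monotonicity condition $\langle A(x), x \rangle \ge 0$ for a linear map is easy to probe numerically. The sketch below assumes $X = H = \mathbb{R}^2$ with the Euclidean inner product, so that the adjoint is the transpose; the matrices and helper names are hypothetical and chosen only for illustration. Sampling can refute monotonicity but of course cannot prove it.

```python
def apply(A, x):
    """Matrix-vector product A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def pairing(u, x):
    """Euclidean inner product <u, x>."""
    return sum(ui * xi for ui, xi in zip(u, x))


def transpose(A):
    """Adjoint of A with respect to the Euclidean inner product."""
    return [list(col) for col in zip(*A)]


def is_monotone_on(A, samples, tol=1e-12):
    """Check <A x, x> >= 0 on a finite sample (a necessary test only)."""
    return all(pairing(apply(A, x), x) >= -tol for x in samples)


samples = [[i / 2.0, j / 2.0] for i in range(-4, 5) for j in range(-4, 5)]

# <A x, x> = x1^2 + x2^2: monotone but not symmetric; A^T is monotone too.
A_skew = [[1.0, 2.0], [-2.0, 1.0]]
# <A x, x> = x1^2 + 3 x1 x2 + x2^2 is negative at x = (1, -1).
A_bad = [[1.0, 3.0], [0.0, 1.0]]

if __name__ == "__main__":
    print(is_monotone_on(A_skew, samples))
    print(is_monotone_on(transpose(A_skew), samples))
    print(is_monotone_on(A_bad, samples))
```

The first matrix illustrates that a monotone linear operator need not be symmetric: only its symmetric part enters the quadratic form $\langle A(x), x \rangle$.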
3.3 Accretive Operators and Semigroups of Operators
In the previous section we studied operators (in general nonlinear) from a Banach space $X$ into its dual $X^*$. In this section we deal with operators from $X$ into $X$ which still exhibit a “monotonicity” property. These are the so-called accretive operators. Of course the two classes of monotone and accretive operators coincide when $X = H$ is a Hilbert space. Accretive operators are intimately connected to the theory of generation of semigroups, which are a basic tool in the study of evolution equations. So the second half of this section is devoted to the presentation of the basics of the theory of semigroups of operators (linear and nonlinear).

In trying to extend the notion of monotonicity to maps from a Banach space $X$ into itself, we immediately face the problem of finding a substitute for the duality brackets. There are two equivalent ways to do this. The first is to use the duality map, which essentially brings us back to the familiar setting of the dual pair $(X, X^*)$. The second approach replaces the duality brackets by a so-called semi-inner product, which is a kind of inner product for the Banach space. Let us start by giving the definition of accretivity based on the duality map and then proceed to introduce semi-inner products on a Banach space and show how they can be used.

DEFINITION 3.3.1 Let $X$ be a Banach space and let $A : X \supseteq D(A) \longrightarrow 2^X$ be an operator.

(a) We say that $A$ is accretive, if for every $(x_1, u_1), (x_2, u_2) \in \operatorname{Gr} A$, there exists $x^* \in \mathcal{F}(x_1 - x_2)$, such that $\langle x^*, u_1 - u_2 \rangle_X \ge 0$.

(b) An accretive operator is said to be maximal accretive, if its graph is not properly included in the graph of another accretive operator.

(c) Finally an accretive operator is said to be m-accretive, if $R(\mathrm{id}_X + A) = X$.

REMARK 3.3.2 When $X = H = H^*$ is a Hilbert pivot space, then $\mathcal{F} = \mathrm{id}_X$ and so the notion of accretivity (respectively maximal accretivity) coincides with that of monotonicity (respectively maximal monotonicity). Moreover, in this case by virtue of Theorem 3.2.29, maximal accretivity and m-accretivity coincide. However, this is not true in general. We can find a maximal accretive operator which is not m-accretive (see Miyadera (1992, pp. 42–44)). If $-A$ is an accretive operator, then $A$ is called a dissipative operator. This terminology originates from mechanics, where dissipative forces are forces which do not increase the energy.
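The next lemma leads to an alternative definition of an accretive operator.

LEMMA 3.3.3 If $X$ is a Banach space and $x, y \in X$, then $\|x\|_X \le \|x + \lambda y\|_X$ for all $\lambda > 0$ if and only if there exists $x^* \in \mathcal{F}(x)$, such that $\langle x^*, y\rangle_X \ge 0$.

PROOF "$\Longrightarrow$": Without any loss of generality we may assume that $x \ne 0$. Let $x^*_\lambda \in \mathcal{F}(x + \lambda y)$, $x^*_\lambda \ne 0$ for all $\lambda > 0$. Set
$$v^*_\lambda \overset{\mathrm{df}}{=} \frac{x^*_\lambda}{\|x^*_\lambda\|_{X^*}} \qquad \forall\ \lambda > 0.$$
If $\lambda_n \searrow 0$, by Alaoglu's theorem (see Theorem A.3.9), we can find $v^* \in X^*$ with $\|v^*\|_{X^*} \le 1$, such that
$$\langle v^*, v\rangle_X = \lim_{n\to+\infty} \langle v^*_{\lambda_n}, v\rangle_X \qquad \forall\ v \in X.$$
By hypothesis we have
$$\|x\|_X \le \|x + \lambda_n y\|_X = \langle v^*_{\lambda_n}, x + \lambda_n y\rangle_X \le \|x\|_X + \lambda_n \langle v^*_{\lambda_n}, y\rangle_X,$$
so
$$\langle v^*, y\rangle_X \ge 0. \tag{3.47}$$
Also, since
$$\|x + \lambda_n y\|_X = \langle v^*_{\lambda_n}, x + \lambda_n y\rangle_X \longrightarrow \langle v^*, x\rangle_X,$$
we have that
$$\|x\|_X \le \langle v^*, x\rangle_X,$$
so
$$\|x\|_X = \langle v^*, x\rangle_X.$$
It follows that $x^* = \|x\|_X v^* \in \mathcal{F}(x)$. We have $\langle x^*, y\rangle_X \ge \|x\|_X \langle v^*, y\rangle_X \ge 0$ (see (3.47)).

"$\Longleftarrow$": From the definition of the duality map $\mathcal{F}$ (see Example 3.2.20(d)) and the hypothesis that $\langle x^*, y\rangle_X \ge 0$, we have
$$\|x\|_X^2 = \langle x^*, x\rangle_X \le \langle x^*, x + \lambda y\rangle_X \le \|x^*\|_{X^*} \|x + \lambda y\|_X.$$
So, because $x^* \in \mathcal{F}(x)$ and thus $\|x\|_X = \|x^*\|_{X^*}$, we have $\|x\|_X \le \|x + \lambda y\|_X$.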
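The equivalence in Lemma 3.3.3 can be checked by hand in a small non-Hilbert example. The sketch below is only an illustration under stated assumptions: it takes $X = \mathbb{R}^2$ with the $\ell^1$-norm (a choice not made in the text) and uses the standard description of the duality map for that norm, namely $x^*_i = \|x\|_1 \operatorname{sign}(x_i)$ when $x_i \ne 0$ and $|x^*_i| \le \|x\|_1$ when $x_i = 0$. The function names are hypothetical.

```python
def norm1(x):
    """l1-norm on R^n."""
    return sum(abs(c) for c in x)


def max_duality_pairing(x, y):
    """Largest value of <x*, y> over x* in F(x), for the l1-norm (assumed form of F)."""
    total = 0.0
    for xi, yi in zip(x, y):
        if xi > 0:
            total += yi          # x*_i is forced to be +||x||_1
        elif xi < 0:
            total -= yi          # x*_i is forced to be -||x||_1
        else:
            total += abs(yi)     # x*_i is free in [-||x||_1, ||x||_1]
    return norm1(x) * total


def norm_never_decreases(x, y, lambdas, tol=1e-12):
    """Check ||x|| <= ||x + lam*y|| on a finite sample of lam > 0."""
    base = norm1(x)
    return all(
        norm1([xi + lam * yi for xi, yi in zip(x, y)]) >= base - tol
        for lam in lambdas
    )


LAMBDAS = [10.0 ** k for k in range(-6, 3)]
X_POINT = [1.0, 0.0]
Y_GOOD = [-1.0, 2.0]   # some x* in F(x) pairs nonnegatively with y
Y_BAD = [-1.0, 0.5]    # every x* in F(x) pairs negatively with y
```

For `Y_GOOD` both sides of the lemma hold, and for `Y_BAD` both fail. The sample of values of $\lambda$ is finite, so this is a sanity check and not a proof.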
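Using this lemma, we have the following alternative characterization of accretivity (known as Kato's criterion).

PROPOSITION 3.3.4 (Kato's Criterion) If $X$ is a Banach space and $A\colon X \supseteq D(A) \to 2^X$, then $A$ is accretive if and only if for all $\lambda > 0$ and any $(x_1, u_1), (x_2, u_2) \in \operatorname{Gr} A$, we have
$$\|x_1 - x_2\|_X \le \|x_1 - x_2 + \lambda(u_1 - u_2)\|_X.$$

Next we define the semi-inner products on $X$.

DEFINITION 3.3.5 Let $X$ be a Banach space and $x, y \in X$. We define the semi-inner products $(\cdot,\cdot)_\pm$ by the following:
$$(y, x)_+ \overset{\mathrm{df}}{=} \|x\|_X \lim_{\lambda \searrow 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda} = \|x\|_X \inf_{\lambda > 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda},$$
$$(y, x)_- \overset{\mathrm{df}}{=} \|x\|_X \lim_{\lambda \nearrow 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda} = \|x\|_X \sup_{\lambda < 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda}.$$

… 0;
(c) $(z + y, x)_\pm \le \|z\|_X \|x\|_X + (y, x)_\pm$;
(d) $(\cdot,\cdot)_+\colon X \times X \to \mathbb{R}$ is upper semicontinuous;
(e) $(\cdot,\cdot)_-\colon X \times X \to \mathbb{R}$ is lower semicontinuous.

PROOF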
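(a) It follows easily from Proposition 3.3.7.

(b) Note that $\mathcal{F}(\mu x) = \mu \mathcal{F}(x)$ and so
$$(\lambda y, \mu x)_+ = \max_{x^* \in \mathcal{F}(x)} \langle \mu x^*, \lambda y\rangle_X = \lambda\mu \max_{x^* \in \mathcal{F}(x)} \langle x^*, y\rangle_X = \lambda\mu\, (y, x)_+.$$
Similarly for $(\cdot,\cdot)_-$.

(c) It follows easily from Proposition 3.3.7.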
(d) From Definition 3.3.5, we know that
$$(y, x)_+ = \|x\|_X \inf_{\lambda > 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda}.$$
Note that for each fixed $\lambda > 0$ the function
$$(y, x) \longmapsto \|x\|_X \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda}$$
is continuous. Hence
$$\|x\|_X \inf_{\lambda > 0} \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda} = \inf_{\lambda > 0} \|x\|_X \frac{\|x + \lambda y\|_X - \|x\|_X}{\lambda}$$
is upper semicontinuous, being the infimum of a family of continuous functions.

(e) Note that $(y, x)_- = -(-y, x)_+$ and use part (d).

The next theorem summarizes the different equivalent ways we can use to define accretive operators. It follows immediately from the previous discussion.

THEOREM 3.3.10 If $X$ is a Banach space and $A\colon X \supseteq D(A) \to 2^X$ is an operator, then the following two statements are equivalent:
(a) $A$ is an accretive operator;
(b) for every $(x_1, u_1), (x_2, u_2) \in \operatorname{Gr} A$, any one of the following three statements is true:
[A1] $\|x_1 - x_2\|_X \le \|x_1 - x_2 + \lambda(u_1 - u_2)\|_X$ for all $\lambda > 0$;
[A2] $\psi'_+(x_1 - x_2; u_1 - u_2) \ge 0$;
[A3] $(u_1 - u_2, x_1 - x_2)_+ \ge 0$.

Motivated by Proposition 3.3.4, we make the following definition.

DEFINITION 3.3.11 Let $X$ be a Banach space and let $A\colon X \supseteq D(A) \to 2^X$ be an accretive operator. For every $\lambda > 0$ we introduce
$$J_\lambda \overset{\mathrm{df}}{=} (\mathrm{id}_X + \lambda A)^{-1} \qquad\text{and}\qquad A_\lambda \overset{\mathrm{df}}{=} \frac{1}{\lambda}\bigl(\mathrm{id}_X - J_\lambda\bigr),$$
both defined on $R(\mathrm{id}_X + \lambda A)$.
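To make Definition 3.3.11 concrete, the following sketch computes $J_\lambda$ and $A_\lambda$ in closed form for one simple operator. Everything here is an illustrative assumption and not taken from the text: $X = \mathbb{R}$ and $A$ is the multivalued sign map ($A(x) = \{1\}$ for $x > 0$, $\{-1\}$ for $x < 0$, $[-1, 1]$ for $x = 0$), for which the resolvent is the familiar soft-thresholding map. The function names are hypothetical.

```python
def resolvent(x, lam):
    """J_lam(x) = (id + lam*A)^{-1}(x) for the multivalued sign map A on R."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def yosida(x, lam):
    """A_lam(x) = (x - J_lam(x)) / lam."""
    return (x - resolvent(x, lam)) / lam


def in_graph_of_sign(x, u, tol=1e-9):
    """True if (x, u) lies in the graph of the multivalued sign map (up to tol)."""
    if x > 0:
        return abs(u - 1.0) <= tol
    if x < 0:
        return abs(u + 1.0) <= tol
    return -1.0 - tol <= u <= 1.0 + tol


POINTS = [-3.0, -0.5, 0.0, 0.2, 1.0, 4.0]
LAMBDAS = [0.1, 0.5, 2.0]


def is_nonexpansive(lam, tol=1e-12):
    """Check |J_lam(x) - J_lam(y)| <= |x - y| on the sample points."""
    return all(
        abs(resolvent(x, lam) - resolvent(y, lam)) <= abs(x - y) + tol
        for x in POINTS
        for y in POINTS
    )


def yosida_selects_from_A(lam):
    """Check A_lam(x) in A(J_lam(x)) on the sample points."""
    return all(in_graph_of_sign(resolvent(x, lam), yosida(x, lam)) for x in POINTS)
```

In this example $A_\lambda(x)$ is just $x/\lambda$ clipped to $[-1, 1]$, a Lipschitz single-valued map that selects an element of $A$ at the point $J_\lambda(x)$.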
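In the next proposition we have collected some elementary properties of the operators $J_\lambda$ and $A_\lambda$.

PROPOSITION 3.3.12 If $X$ is a Banach space and $A\colon X \supseteq D(A) \to 2^X$ is an accretive operator, then
(a) $J_\lambda$ is nonexpansive on $R(\mathrm{id}_X + \lambda A)$, i.e.,
$$\|J_\lambda(x) - J_\lambda(y)\|_X \le \|x - y\|_X \qquad \forall\ x, y \in R(\mathrm{id}_X + \lambda A);$$
(b) $A_\lambda$ is accretive and Lipschitz continuous with constant $\frac{2}{\lambda}$ on $R(\mathrm{id}_X + \lambda A)$;
(c) $A_\lambda(x) \in A\bigl(J_\lambda(x)\bigr)$ for every $x \in R(\mathrm{id}_X + \lambda A)$;
(d) $\|A_\lambda(x)\|_X \le |A(x)| \overset{\mathrm{df}}{=} \inf_{u \in A(x)} \|u\|_X$ for every $x \in D(A) \cap R(\mathrm{id}_X + \lambda A)$;
(e) $\lim_{\lambda \searrow 0} J_\lambda(x) = x$ for every $x \in \overline{D(A)} \cap \Bigl(\bigcap_{\lambda > 0} R(\mathrm{id}_X + \lambda A)\Bigr)$.

PROOF
(a) This follows at once from Proposition 3.3.4.

(b) Let
$$y_k \overset{\mathrm{df}}{=} \bigl(\mathrm{id}_X + \mu A_\lambda\bigr)(x_k) \qquad \forall\ k \in \{1, 2\},\ \mu > 0.$$
Then
$$y_k = \Bigl(\mathrm{id}_X + \frac{\mu}{\lambda}\bigl(\mathrm{id}_X - J_\lambda\bigr)\Bigr)(x_k) \qquad \forall\ k \in \{1, 2\}$$
and so
$$y_1 - y_2 + \frac{\mu}{\lambda}\bigl(J_\lambda(x_1) - J_\lambda(x_2)\bigr) = \Bigl(1 + \frac{\mu}{\lambda}\Bigr)(x_1 - x_2).$$
Hence
$$\Bigl(1 + \frac{\mu}{\lambda}\Bigr)\|x_1 - x_2\|_X \le \|y_1 - y_2\|_X + \frac{\mu}{\lambda}\|J_\lambda(x_1) - J_\lambda(x_2)\|_X$$
and from part (a), we have
$$\|x_1 - x_2\|_X \le \|y_1 - y_2\|_X,$$
which proves the accretivity of $A_\lambda$ (see Proposition 3.3.4). Also, from part (a), we have
$$\|A_\lambda(x_1) - A_\lambda(x_2)\|_X = \frac{1}{\lambda}\bigl\|x_1 - x_2 - \bigl(J_\lambda(x_1) - J_\lambda(x_2)\bigr)\bigr\|_X \le \frac{2}{\lambda}\|x_1 - x_2\|_X$$
and so $A_\lambda$ is a Lipschitz continuous operator with constant $\frac{2}{\lambda}$.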
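(c) We have
$$A_\lambda(x) \in \frac{1}{\lambda}\Bigl((\mathrm{id}_X + \lambda A)\bigl(J_\lambda(x)\bigr) - J_\lambda(x)\Bigr) = A\bigl(J_\lambda(x)\bigr) \qquad \forall\ x \in R(\mathrm{id}_X + \lambda A).$$

(d) We have
$$A_\lambda(x) = \frac{1}{\lambda}\Bigl(J_\lambda\bigl((\mathrm{id}_X + \lambda A)x\bigr) - J_\lambda(x)\Bigr) = \frac{1}{\lambda}\Bigl(J_\lambda(x + \lambda u) - J_\lambda(x)\Bigr) \qquad \forall\ u \in A(x).$$
Hence, by part (a),
$$\|A_\lambda(x)\|_X \le \|u\|_X, \tag{3.48}$$
which implies
$$\|A_\lambda(x)\|_X \le |A(x)|.$$

(e) From part (d), we have
$$\|J_\lambda(x) - x\|_X = \lambda\|A_\lambda(x)\|_X \le \lambda|A(x)| \qquad \forall\ x \in D(A) \cap R(\mathrm{id}_X + \lambda A).$$
Hence
$$J_\lambda(x) \longrightarrow x \quad\text{as } \lambda \searrow 0, \qquad \forall\ x \in D(A) \cap \Bigl(\bigcap_{\lambda > 0} R(\mathrm{id}_X + \lambda A)\Bigr)$$
and, since the maps $J_\lambda$ are nonexpansive (see part (a)), this extends to all $x \in \overline{D(A)} \cap \Bigl(\bigcap_{\lambda > 0} R(\mathrm{id}_X + \lambda A)\Bigr)$.

REMARK 3.3.13 When $X = H$ is a Hilbert space, then Proposition 3.3.12 coincides with Theorem 3.2.38. Also note that if $\lambda, \mu > 0$ and $x \in D(J_\lambda) = R(\mathrm{id}_X + \lambda A)$, then
$$\frac{\mu}{\lambda}x + \frac{\lambda - \mu}{\lambda}J_\lambda(x) \in D(J_\mu) = R(\mathrm{id}_X + \mu A)$$
and
$$J_\lambda(x) = J_\mu\Bigl(\frac{\mu}{\lambda}x + \frac{\lambda - \mu}{\lambda}J_\lambda(x)\Bigr)$$
(this last equality is usually known as the Resolvent Identity). To see this let $x = y + \lambda v$ with $(y, v) \in \operatorname{Gr} A$. So $J_\lambda(x) = y$. Hence we have
$$\frac{\mu}{\lambda}x + \frac{\lambda - \mu}{\lambda}J_\lambda(x) = \frac{\mu}{\lambda}(y + \lambda v) + \frac{\lambda - \mu}{\lambda}y = y + \mu v \in R(\mathrm{id}_X + \mu A) = D(J_\mu)$$
and
$$J_\mu\Bigl(\frac{\mu}{\lambda}x + \frac{\lambda - \mu}{\lambda}J_\lambda(x)\Bigr) = J_\mu(y + \mu v) = y = J_\lambda(x).$$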
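Using the resolvent identity, for $\lambda > \mu > 0$, we have that
$$\begin{aligned}
\lambda\|A_\lambda(x)\|_X &= \|J_\lambda(x) - x\|_X \le \|J_\lambda(x) - J_\mu(x)\|_X + \|J_\mu(x) - x\|_X \\
&= \Bigl\|J_\mu\Bigl(\frac{\mu}{\lambda}x + \frac{\lambda - \mu}{\lambda}J_\lambda(x)\Bigr) - J_\mu(x)\Bigr\|_X + \|J_\mu(x) - x\|_X \\
&\le \Bigl\|\frac{\mu}{\lambda}x + \frac{\lambda - \mu}{\lambda}J_\lambda(x) - x\Bigr\|_X + \|J_\mu(x) - x\|_X \\
&= (\lambda - \mu)\|A_\lambda(x)\|_X + \mu\|A_\mu(x)\|_X,
\end{aligned}$$
so
$$\|A_\lambda(x)\|_X \le \|A_\mu(x)\|_X \qquad \forall\ \lambda > \mu$$
and thus $\bigl\{\|A_\lambda(x)\|_X\bigr\}_{\lambda > 0}$ is increasing as $\lambda$ decreases to $0^+$.

PROPOSITION 3.3.14 If $X$ is a Banach space and $A\colon X \supseteq D(A) \to 2^X$ is an accretive operator, then $R(\mathrm{id}_X + \lambda A) = X$ for all $\lambda > 0$, if it holds for some $\lambda > 0$.

PROOF Suppose that
$$R(\mathrm{id}_X + \lambda A) = X$$
for some $\lambda > 0$ and let $\mu > \frac{\lambda}{2}$. Then for $u \in X$, we have
$$u \in (\mathrm{id}_X + \mu A)(x)$$
or equivalently
$$(\mathrm{id}_X + \lambda A)(x) \ni \frac{\lambda}{\mu}u + \Bigl(1 - \frac{\lambda}{\mu}\Bigr)x,$$
which in turn is equivalent to saying that $K(x) = x$ for the contraction
$$K(x) \overset{\mathrm{df}}{=} J_\lambda\Bigl(\frac{\lambda}{\mu}u + \Bigl(1 - \frac{\lambda}{\mu}\Bigr)x\Bigr) \qquad \forall\ x \in X.$$
By Banach's fixed point theorem (see Theorem 7.1.2) $K(x) = x$ has a unique solution and so
$$R(\mathrm{id}_X + \mu A) = X \qquad \forall\ \mu > \frac{\lambda}{2}.$$
Then by induction we conclude that $R(\mathrm{id}_X + \mu A) = X$ for all $\mu > 0$.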
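The fixed point argument in the proof of Proposition 3.3.14 is constructive: iterating $K$ produces the solution of $u \in (\mathrm{id}_X + \mu A)(x)$ from evaluations of $J_\lambda$ alone. The sketch below tries this on an illustrative operator that is not taken from the text, the multivalued sign map on $\mathbb{R}$, whose resolvent is soft thresholding. This lets the limit be compared with the closed form of $J_\mu(u)$. Names and the iteration count are hypothetical choices.

```python
def soft_threshold(v, lam):
    """Resolvent J_lam of the multivalued sign map on R (assumed example operator)."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0


def resolvent_via_fixed_point(u, lam, mu, iterations=200):
    """Approximate J_mu(u) by iterating K(x) = J_lam((lam/mu)*u + (1 - lam/mu)*x).

    K is a contraction with constant |1 - lam/mu| < 1 exactly when mu > lam/2.
    """
    if mu <= lam / 2:
        raise ValueError("the map K is a contraction only for mu > lam/2")
    ratio = lam / mu
    x = 0.0
    for _ in range(iterations):
        x = soft_threshold(ratio * u + (1.0 - ratio) * x, lam)
    return x
```

For example, `resolvent_via_fixed_point(3.0, 1.0, 0.6)` should agree with `soft_threshold(3.0, 0.6)` up to rounding. Reaching an arbitrary $\mu > 0$ from a given $\lambda$ would require repeating the construction, as in the induction step of the proof.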
REMARK 3.3.15 Using Proposition 3.3.14, we can say that an accretive operator $A\colon X \supseteq D(A) \to 2^X$ is m-accretive if and only if
$$R(\mathrm{id}_X + \lambda A) = X \qquad\text{for some } \lambda > 0$$
(equivalently for all $\lambda > 0$; see Definition 3.3.1).

PROPOSITION 3.3.16 If $X$ is a Banach space and $A\colon X \supseteq D(A) \to 2^X$ is an m-accretive operator, then
(a) $A$ is maximal accretive;
(b) $A$ is closed;
(c) if $x_\lambda \to x$ and $A_\lambda(x_\lambda) \to u$ in $X$ as $\lambda \searrow 0$, then $(x, u) \in \operatorname{Gr} A$.

PROOF
(a) Let $(x_0, u_0) \in X \times X$ and suppose that
$$\|x_0 - x\|_X \le \|x_0 - x + \lambda(u_0 - u)\|_X \qquad \forall\ \lambda > 0,\ (x, u) \in \operatorname{Gr} A. \tag{3.49}$$
We need to show that $(x_0, u_0) \in \operatorname{Gr} A$ (see Definition 3.3.1 and Proposition 3.3.4). Since $A$ is m-accretive, we have that $X = R(\mathrm{id}_X + A)$ and so we can find $(x, u) \in \operatorname{Gr} A$, such that $x + u = x_0 + u_0$. Using this in (3.49) with $\lambda = 1$, we obtain $x = x_0$, hence $u = u_0$ and $(x_0, u_0) \in \operatorname{Gr} A$.

(b) Let $\{(x_n, u_n)\}_{n \ge 1} \subseteq \operatorname{Gr} A$ and assume that $(x_n, u_n) \to (x, u)$ in $X \times X$. We need to show that $(x, u) \in \operatorname{Gr} A$. By virtue of the accretivity of $A$, we have
$$\|x_n - y\|_X \le \|x_n - y + \lambda(u_n - v)\|_X \qquad \forall\ n \ge 1,\ \lambda > 0,\ (y, v) \in \operatorname{Gr} A,$$
so
$$\|x - y\|_X \le \|x - y + \lambda(u - v)\|_X \qquad \forall\ \lambda > 0,\ (y, v) \in \operatorname{Gr} A. \tag{3.50}$$
But from part (a), we know that $A$ is maximal accretive. So from (3.50), it follows that $(x, u) \in \operatorname{Gr} A$.

(c) From Proposition 3.3.12(a) and (e), we have that
$$J_\lambda(x_\lambda) \longrightarrow x \quad\text{in } X, \text{ as } \lambda \searrow 0,$$
while from Proposition 3.3.12(c), we know that
$$A_\lambda(x_\lambda) \in A\bigl(J_\lambda(x_\lambda)\bigr) \qquad \forall\ \lambda > 0.$$
Then using part (b), we infer that $(x, u) \in \operatorname{Gr} A$.

We can improve conclusion (b) of Proposition 3.3.16, provided we strengthen the condition on the space $X$.

PROPOSITION 3.3.17 If $X$ is a reflexive Banach space with $X^*$ being locally uniformly convex and $A\colon X \supseteq D(A) \to 2^X$ is an m-accretive operator, then $\operatorname{Gr} A$ is sequentially closed in $X \times X_w$.

PROOF Suppose that $\{(x_n, u_n)\}_{n \ge 1} \subseteq \operatorname{Gr} A$ is a sequence, such that
$$x_n \longrightarrow x \quad\text{and}\quad u_n \overset{w}{\longrightarrow} u \quad\text{in } X.$$
We need to show that $(x, u) \in \operatorname{Gr} A$. Because $A$ is m-accretive, from Theorem 3.3.10(b)[A3], we have
$$\langle \mathcal{F}(x_n - y), u_n - v\rangle_X \ge 0 \qquad \forall\ n \ge 1,\ (y, v) \in \operatorname{Gr} A. \tag{3.51}$$
But from Proposition 3.2.25, we know that the duality map $\mathcal{F}\colon X \to X^*$ is continuous. So if we pass to the limit as $n \to +\infty$ in (3.51), we obtain
$$\langle \mathcal{F}(x - y), u - v\rangle_X \ge 0 \qquad \forall\ (y, v) \in \operatorname{Gr} A,$$
so
$$(u - v, x - y)_+ \ge 0 \qquad \forall\ (y, v) \in \operatorname{Gr} A$$
and thus, from Proposition 3.3.16(a), we conclude that $(x, u) \in \operatorname{Gr} A$.
Another useful result that can be proved by imposing extra conditions on the space X is the following one.
PROPOSITION 3.3.18 If $X$ is a Banach space with $X^*$ strictly convex and $A\colon X \supseteq D(A) \to 2^X$ is a maximal accretive operator, then the set $A(x) \subseteq X$ is convex and closed for any $x \in D(A)$.

PROOF Because $X^*$ is strictly convex, the duality map $\mathcal{F}\colon X \to X^*$ is single-valued. First we show that for $x \in D(A)$, the set $A(x)$ is convex. So let $u, v \in A(x)$ and set
$$w = tu + (1 - t)v \qquad\text{with } t \in [0, 1].$$
For all $(y, h) \in \operatorname{Gr} A$, we have that
$$\langle \mathcal{F}(x - y), w - h\rangle_X = t\langle \mathcal{F}(x - y), u - h\rangle_X + (1 - t)\langle \mathcal{F}(x - y), v - h\rangle_X \ge 0,$$
so from the maximality of $A$, we have that $(x, w) \in \operatorname{Gr} A$.

Next we show that the set $A(x)$ is closed in $X$. To this end let $\{u_n\}_{n \ge 1} \subseteq A(x)$ be a sequence, such that
$$u_n \longrightarrow u \quad\text{in } X.$$
We have
$$\langle \mathcal{F}(x - y), u_n - v\rangle_X \ge 0 \qquad \forall\ n \ge 1,\ (y, v) \in \operatorname{Gr} A,$$
so
$$\langle \mathcal{F}(x - y), u - v\rangle_X \ge 0 \qquad \forall\ (y, v) \in \operatorname{Gr} A$$
and so, from the maximality of $A$, we have that $(x, u) \in \operatorname{Gr} A$.

We continue with the properties of maximal accretive and m-accretive operators.

PROPOSITION 3.3.19 If $X$ is a uniformly convex Banach space and $A\colon X \supseteq D(A) \to 2^X$ is an m-accretive operator, then $\overline{D(A)}$ is convex.

PROOF
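Let
$$D_0 \overset{\mathrm{df}}{=} \Bigl\{x \in \overline{\operatorname{conv}}\, D(A) : \lim_{\lambda \searrow 0} J_\lambda(x) = x\Bigr\}.$$
Evidently $\overline{D(A)} = D_0$ and $D_0$ is closed. So it suffices to show that $D_0$ is convex. For $x, y \in D_0$, we have
$$\Bigl\|J_\lambda\Bigl(\frac{x + y}{2}\Bigr) - J_\lambda(x)\Bigr\|_X \le \Bigl\|\frac{x - y}{2}\Bigr\|_X \qquad \forall\ \lambda > 0 \tag{3.52}$$
and
$$\Bigl\|J_\lambda\Bigl(\frac{x + y}{2}\Bigr) - J_\lambda(y)\Bigr\|_X \le \Bigl\|\frac{x - y}{2}\Bigr\|_X \qquad \forall\ \lambda > 0 \tag{3.53}$$
(see Proposition 3.3.12(a)). From (3.52) and (3.53), it follows that
$$\Bigl\{J_\lambda\Bigl(\frac{x + y}{2}\Bigr)\Bigr\}_{\lambda \in (0,1)} \subseteq X \quad\text{is bounded.}$$
Since $X$ is reflexive (being uniformly convex; see Remark A.3.22), we can find a sequence $\lambda_n \searrow 0$, such that
$$J_{\lambda_n}\Bigl(\frac{x + y}{2}\Bigr) \overset{w}{\longrightarrow} h \quad\text{in } X.$$
So if we pass to the limit as $n \to +\infty$ in (3.52) and (3.53), we obtain
$$\|h - x\|_X \le \Bigl\|\frac{x - y}{2}\Bigr\|_X \quad\text{and}\quad \|h - y\|_X \le \Bigl\|\frac{x - y}{2}\Bigr\|_X. \tag{3.54}$$
We have
$$\|x - y\|_X \le \|x - h\|_X + \|h - y\|_X \le \|x - y\|_X. \tag{3.55}$$
From (3.54) and (3.55), it follows that
$$\|x - h\|_X = \|y - h\|_X = \Bigl\|\frac{x - y}{2}\Bigr\|_X$$
and this by virtue of the uniform convexity of $X$ implies that $h = \frac{x + y}{2}$. Thus we have
$$J_{\lambda_n}\Bigl(\frac{x + y}{2}\Bigr) \overset{w}{\longrightarrow} \frac{x + y}{2} \quad\text{in } X. \tag{3.56}$$
Moreover, we have
$$\Bigl\|\frac{y - x}{2}\Bigr\|_X \le \liminf_{n \to +\infty} \Bigl\|J_{\lambda_n}\Bigl(\frac{x + y}{2}\Bigr) - J_{\lambda_n}(x)\Bigr\|_X \le \limsup_{n \to +\infty} \Bigl\|J_{\lambda_n}\Bigl(\frac{x + y}{2}\Bigr) - J_{\lambda_n}(x)\Bigr\|_X \le \Bigl\|\frac{y - x}{2}\Bigr\|_X,$$
so
$$\Bigl\|J_{\lambda_n}\Bigl(\frac{x + y}{2}\Bigr) - J_{\lambda_n}(x)\Bigr\|_X \longrightarrow \Bigl\|\frac{y - x}{2}\Bigr\|_X. \tag{3.57}$$
Since $J_{\lambda_n}(x) \to x$ in $X$, from (3.56) and (3.57) and the Kadec-Klee property (see Remark A.3.22), we conclude that
$$J_{\lambda_n}\Bigl(\frac{x + y}{2}\Bigr) \longrightarrow \frac{x + y}{2}$$
and so $\frac{x + y}{2} \in D_0$, which proves the convexity of $D_0$, hence the convexity of $\overline{D(A)}$.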
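Next we prove two perturbation results for m-accretive operators. To do this we shall need the following auxiliary result.

LEMMA 3.3.20 If $X$ is a Banach space with $X^*$ being uniformly convex (hence $X$ is reflexive) and $A\colon X \supseteq D(A) \to 2^X$ and $B\colon X \supseteq D(B) \to 2^X$ are two m-accretive operators with $D(A) \cap D(B) \ne \emptyset$, then for every $u \in X$ and every $\lambda > 0$, the operator inclusion
$$x + A(x) + B_\lambda(x) \ni u$$
has a unique solution $x_\lambda \in D(A)$ and $\{x_\lambda\}_{\lambda > 0}$ is bounded. Moreover, if $\{B_\lambda(x_\lambda)\}_{\lambda \in (0,1)}$ is bounded, then the operator inclusion
$$x + A(x) + B(x) \ni u$$
has a unique solution $x \in D(A) \cap D(B)$ and
$$x_\lambda \longrightarrow x \quad\text{in } X, \text{ as } \lambda \searrow 0.$$

PROOF The operator inclusion
$$x + A(x) + B_\lambda(x) \ni u$$
is equivalent to
$$x = \Bigl(\mathrm{id}_X + \frac{\lambda}{\lambda + 1}A\Bigr)^{-1}\Bigl(\frac{\lambda}{\lambda + 1}u + \frac{1}{\lambda + 1}(\mathrm{id}_X + \lambda B)^{-1}(x)\Bigr),$$
so
$$x = K_\lambda(x), \tag{3.58}$$
with
$$K_\lambda(x) \overset{\mathrm{df}}{=} J^A_{\frac{\lambda}{\lambda + 1}}\Bigl(\frac{\lambda}{\lambda + 1}u + \frac{1}{\lambda + 1}J^B_\lambda(x)\Bigr) \qquad \forall\ \lambda > 0.$$
Since the operators $J^A_{\frac{\lambda}{\lambda + 1}}$ and $J^B_\lambda$ are nonexpansive on $X$ (see Proposition 3.3.12(a)), we can check that
$$\|K_\lambda(x) - K_\lambda(y)\|_X \le \frac{1}{\lambda + 1}\|x - y\|_X \qquad \forall\ x, y \in X,\ \lambda > 0.$$
Invoking Banach's fixed point theorem (see Theorem 7.1.2), we infer that (3.58) has a unique solution $x_\lambda \in D(A)$ for $\lambda > 0$. Let
$$z \in D(A) \cap D(B) \quad\text{and}\quad u_\lambda \in z + A(z) + B_\lambda(z).$$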
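From Proposition 3.3.12(b), we know that $B_\lambda$ is accretive and since the sum of accretive operators is clearly accretive, we have that the operator $A + B_\lambda\colon X \supseteq D(A) \to 2^X$ is accretive. So
$$\langle \mathcal{F}(x_\lambda - z), u - x_\lambda - (u_\lambda - z)\rangle_X \ge 0,$$
hence
$$\|x_\lambda - z\|_X^2 \le \langle \mathcal{F}(x_\lambda - z), u - u_\lambda\rangle_X \le \|x_\lambda - z\|_X \|u - u_\lambda\|_X,$$
from which it follows that
$$\|x_\lambda - z\|_X \le \|u - u_\lambda\|_X. \tag{3.59}$$
Because
$$\|B_\lambda(z)\|_X \le |B(z)|$$
(see Proposition 3.3.12(d)), from (3.59), we infer that $\{x_\lambda\}_{\lambda > 0}$ is bounded.

Now suppose that $\{B_\lambda(x_\lambda)\}_{\lambda \in (0,1)}$ is bounded. For $\lambda, \mu > 0$, we have
$$u - x_\lambda - B_\lambda(x_\lambda) \in A(x_\lambda) \quad\text{and}\quad u - x_\mu - B_\mu(x_\mu) \in A(x_\mu).$$
Exploiting the accretivity of the operator $A$, we obtain
$$\langle \mathcal{F}(x_\lambda - x_\mu), x_\mu - x_\lambda + B_\mu(x_\mu) - B_\lambda(x_\lambda)\rangle_X \ge 0,$$
so
$$\|x_\lambda - x_\mu\|_X^2 \le \langle \mathcal{F}(x_\lambda - x_\mu), B_\mu(x_\mu) - B_\lambda(x_\lambda)\rangle_X.$$
Because
$$B_\lambda(x_\lambda) \in B\bigl(J_\lambda(x_\lambda)\bigr) \quad\text{and}\quad B_\mu(x_\mu) \in B\bigl(J_\mu(x_\mu)\bigr)$$
(see Proposition 3.3.12(c)) and $B$ is accretive, we have
$$\bigl\langle \mathcal{F}\bigl(J_\mu(x_\mu) - J_\lambda(x_\lambda)\bigr), B_\mu(x_\mu) - B_\lambda(x_\lambda)\bigr\rangle_X \ge 0.$$
It follows that
$$\|x_\lambda - x_\mu\|_X^2 \le \bigl\langle \mathcal{F}(x_\lambda - x_\mu) + \mathcal{F}\bigl(J_\mu(x_\mu) - J_\lambda(x_\lambda)\bigr), B_\mu(x_\mu) - B_\lambda(x_\lambda)\bigr\rangle_X. \tag{3.60}$$
Since
$$\lambda B_\lambda(x_\lambda) = x_\lambda - J_\lambda(x_\lambda)$$
and by hypothesis $\{B_\lambda(x_\lambda)\}_{\lambda \in (0,1)}$ is bounded, we have that
$$\|x_\lambda - J_\lambda(x_\lambda)\|_X \le M_1\lambda \qquad \forall\ \lambda \in (0, 1),$$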
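for some $M_1 > 0$. Because $X^*$ is uniformly convex, we have that the duality map is uniformly continuous on bounded sets of $X$ (see Proposition 3.2.28). Therefore since the duality map is odd (see Proposition 3.2.22), we see that given $\varepsilon > 0$, for all $\lambda, \mu > 0$ small enough, we have
$$\bigl\|\mathcal{F}(x_\lambda - x_\mu) + \mathcal{F}\bigl(J_\mu(x_\mu) - J_\lambda(x_\lambda)\bigr)\bigr\|_{X^*} = \bigl\|\mathcal{F}\bigl(J_\mu(x_\mu) - J_\lambda(x_\lambda)\bigr) - \mathcal{F}(x_\mu - x_\lambda)\bigr\|_{X^*} \le \varepsilon.$$
So from (3.60), we have
$$\|x_\lambda - x_\mu\|_X^2 \le M_1\varepsilon \qquad \forall\ \lambda, \mu > 0 \text{ small enough.}$$
Since $\varepsilon > 0$ was arbitrary, we conclude that
$$x_\lambda \longrightarrow x \quad\text{in } X, \text{ as } \lambda \searrow 0.$$
Let $\lambda_n \searrow 0$ be such that
$$B_{\lambda_n}(x_{\lambda_n}) \overset{w}{\longrightarrow} z \quad\text{in } X$$
(recall that $X$ is reflexive and by hypothesis $\{B_\lambda(x_\lambda)\}_{\lambda \in (0,1)}$ is bounded). Set
$$v_n \overset{\mathrm{df}}{=} u - x_{\lambda_n} - B_{\lambda_n}(x_{\lambda_n}) \qquad \forall\ n \ge 1.$$
Then
$$v_n \overset{w}{\longrightarrow} v = u - x - z \quad\text{in } X$$
and $v_n \in A(x_{\lambda_n})$. Invoking Proposition 3.3.17, we have that $(x, v) \in \operatorname{Gr} A$. Also
$$\bigl(J_{\lambda_n}(x_{\lambda_n}), B_{\lambda_n}(x_{\lambda_n})\bigr) \in \operatorname{Gr} B$$
and
$$J_{\lambda_n}(x_{\lambda_n}) \longrightarrow x \quad\text{and}\quad B_{\lambda_n}(x_{\lambda_n}) \overset{w}{\longrightarrow} z \quad\text{in } X.$$
So once again via Proposition 3.3.17, we have that $(x, z) \in \operatorname{Gr} B$. Thus finally
$$x \in D(A) \cap D(B) \quad\text{and}\quad u = x + v + z \quad\text{with } v \in A(x) \text{ and } z \in B(x).$$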
Using this lemma we can prove two perturbation theorems for m-accretive operators.
THEOREM 3.3.21 If $X$ is a Banach space with $X^*$ being uniformly convex (hence $X$ is reflexive) and $A\colon X \supseteq D(A) \to 2^X$, $B\colon X \supseteq D(B) \to 2^X$ are two m-accretive operators, such that
(i) $D(A) \cap D(B) \ne \emptyset$;
(ii) $\bigl\langle \mathcal{F}\bigl(B_\lambda(x)\bigr), u\bigr\rangle_X \ge 0$ for all $\lambda > 0$ and all $(x, u) \in \operatorname{Gr} A$,
then $A + B$ is m-accretive.

PROOF Let $u \in X$. By virtue of Lemma 3.3.20, we can find a unique $x_\lambda \in D(A)$, such that
$$u \in x_\lambda + A(x_\lambda) + B_\lambda(x_\lambda) \qquad \forall\ \lambda > 0.$$
We take the duality brackets with $\mathcal{F}\bigl(B_\lambda(x_\lambda)\bigr)$. Using (ii) and the fact that $\{x_\lambda\}_{\lambda > 0}$ is bounded (see Lemma 3.3.20), we obtain that $\{B_\lambda(x_\lambda)\}_{\lambda \in (0,1)}$ is bounded. Then by virtue of Lemma 3.3.20, we have that
$$x_\lambda \longrightarrow x \quad\text{in } X \text{ as } \lambda \searrow 0$$
and $u \in x + A(x) + B(x)$, i.e., $R(\mathrm{id}_X + A + B) = X$, which means that $A + B$ is m-accretive.

THEOREM 3.3.22 If $X$ is a Banach space with $X^*$ being uniformly convex (hence $X$ is reflexive) and $A\colon X \supseteq D(A) \to 2^X$, $B\colon X \supseteq D(B) \to 2^X$ are two m-accretive operators, such that
(i) $D(A) \subseteq D(B)$;
(ii) for each $r > 0$, there are $c < 1$ and $d > 0$, such that
$$|B(x)| \le c|A(x)| + d \qquad \forall\ x \in D(A),\ \|x\|_X \le r,$$
then $A + B$ is m-accretive.
PROOF Let $u \in X$ and let $x_\lambda \in D(A)$ be the unique solution of the operator inclusion
$$u \in x_\lambda + A(x_\lambda) + B_\lambda(x_\lambda).$$
Since $x_\lambda \in D(A) \subseteq D(B)$ (see (i)) and
$$\|B_\lambda(x_\lambda)\|_X \le |B(x_\lambda)| \qquad \forall\ \lambda > 0$$
(see Proposition 3.3.12(d)), from condition (ii) and since $\{x_\lambda\}_{\lambda > 0}$ is bounded (see Lemma 3.3.20), we obtain
$$|A(x_\lambda)| \le c|A(x_\lambda)| + d_0 \qquad \forall\ \lambda > 0,$$
for some $d_0 > 0$, so $\{|A(x_\lambda)|\}_{\lambda > 0}$ is bounded (since $c < 1$). Using this fact in condition (ii), we infer that $\{B_\lambda(x_\lambda)\}_{\lambda \in (0,1)}$ is bounded. Then invoking Lemma 3.3.20, we finish the proof as before.

REMARK 3.3.23 Condition (ii) of Theorem 3.3.22 can be replaced by the following local condition:
(ii)' for every $x_0 \in D(A)$, we can find a neighbourhood $U$ of $x_0$ and constants $c < 1$ and $d > 0$, such that
$$|B(x)| \le c|A(x)| + d \qquad \forall\ x \in D(A) \cap U$$
(see Kato (1967)). In applications in general Theorem 3.3.21 is more convenient than Theorem 3.3.22.

We present an application of the perturbation results in the study of elliptic boundary value problems. To this end let $\Omega \subseteq \mathbb{R}^N$ be a bounded domain with a $C^2$-boundary $\partial\Omega$. We shall need the following existence, uniqueness and regularity result due to Agmon, Douglis & Nirenberg (1959).

THEOREM 3.3.24 If $\Omega \subseteq \mathbb{R}^N$ is as above, $p \in (1, +\infty)$ and $f \in L^p(\Omega)$, then there exists a unique $x \in W^{2,p}(\Omega) \cap W^{1,p}_0(\Omega)$, such that
$$-\Delta x(z) + x(z) = f(z) \quad\text{for a.a. } z \in \Omega, \qquad x|_{\partial\Omega} = 0.$$
Moreover, if $\partial\Omega$ is a $C^{m+1}$-manifold for some $m \ge 1$ and $f \in W^{m,p}(\Omega)$, then $x \in W^{m+2,p}(\Omega)$ and
$$\|x\|_{W^{m+2,p}(\Omega)} \le c\,\|f\|_{W^{m,p}(\Omega)} \qquad \forall\ x \in W^{m+2,p}(\Omega),$$
for some $c = c(m, p, \Omega) > 0$.
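Let
$$\xi\colon \mathbb{R} \supseteq D(\xi) \to 2^{\mathbb{R}}$$
be a maximal monotone map with $0 \in \xi(0)$. We consider the realization (lifting) of $\xi$ on $L^p(\Omega) \times L^p(\Omega)$ for $p \in (1, +\infty)$. So we define $\hat\xi\colon L^p(\Omega) \supseteq D(\hat\xi) \to 2^{L^p(\Omega)}$ by
$$\hat\xi(x) \overset{\mathrm{df}}{=} \bigl\{u \in L^p(\Omega) : u(z) \in \xi\bigl(x(z)\bigr) \text{ for a.a. } z \in \Omega\bigr\} \qquad \forall\ x \in D(\hat\xi),$$
where
$$D(\hat\xi) \overset{\mathrm{df}}{=} \bigl\{x \in L^p(\Omega) : \text{there exists } u \in L^p(\Omega), \text{ such that } u(z) \in \xi\bigl(x(z)\bigr) \text{ for a.a. } z \in \Omega\bigr\}.$$
A simple measurable selection argument establishes that $\hat\xi$ is m-accretive and we have
$$\bigl[(\mathrm{id}_X + \lambda\hat\xi)^{-1}x\bigr](z) = (1 + \lambda\xi)^{-1}x(z) \quad\text{for a.a. } z \in \Omega$$
with $X = L^p(\Omega)$ and
$$\hat\xi_\lambda(x)(z) = \xi_\lambda\bigl(x(z)\bigr) \quad\text{for a.a. } z \in \Omega, \text{ all } \lambda > 0 \text{ and all } x \in L^p(\Omega).$$
We consider the operator $K\colon L^p(\Omega) \supseteq D(K) \to 2^{L^p(\Omega)}$, defined by
$$K(x) = -\Delta x + \hat\xi(x) \qquad \forall\ x \in D(K), \tag{3.61}$$
where
$$D(K) \overset{\mathrm{df}}{=} W^{1,p}_0(\Omega) \cap W^{2,p}(\Omega) \cap D(\hat\xi).$$

PROPOSITION 3.3.25 If $K\colon L^p(\Omega) \supseteq D(K) \to 2^{L^p(\Omega)}$ is the operator defined by (3.61), then $K$ is m-accretive.

PROOF It is easy to check that the duality map on $L^p(\Omega)$, $\mathcal{F}\colon L^p(\Omega) \to L^{p'}(\Omega)$ (with $\frac{1}{p} + \frac{1}{p'} = 1$), is defined by
$$\mathcal{F}(x)(\cdot) \overset{\mathrm{df}}{=} \frac{x(\cdot)\,|x(\cdot)|^{p-2}}{\|x\|_p^{p-2}}.$$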
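The check of the formula for the duality map on $L^p(\Omega)$ can be sketched as follows, for $x \ne 0$. This is a routine verification added for convenience; it uses only the two defining properties $\langle \mathcal{F}(x), x\rangle = \|x\|_p^2$ and $\|\mathcal{F}(x)\|_{p'} = \|x\|_p$, together with the identities $(p-1)p' = p$ and $p/p' = p - 1$.

```latex
\begin{align*}
\langle \mathcal{F}(x), x \rangle_{L^p(\Omega)}
  &= \frac{1}{\|x\|_p^{p-2}} \int_\Omega |x(z)|^{p}\, dz
   = \frac{\|x\|_p^{p}}{\|x\|_p^{p-2}}
   = \|x\|_p^{2}, \\[4pt]
\|\mathcal{F}(x)\|_{p'}
  &= \frac{1}{\|x\|_p^{p-2}}
     \left( \int_\Omega |x(z)|^{(p-1)p'}\, dz \right)^{1/p'}
   = \frac{\|x\|_p^{p/p'}}{\|x\|_p^{p-2}}
   = \frac{\|x\|_p^{p-1}}{\|x\|_p^{p-2}}
   = \|x\|_p .
\end{align*}
```

Since $L^{p'}(\Omega)$ is strictly convex for $p \in (1, +\infty)$, the duality map is single-valued, so these two identities determine $\mathcal{F}(x)$.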
Using this we can check that the operator
$$-\Delta\colon L^p(\Omega) \supseteq W^{1,p}_0(\Omega) \cap W^{2,p}(\Omega) \to L^p(\Omega)$$
is accretive. Moreover, by virtue of Theorem 3.3.24, it follows that $-\Delta$ is m-accretive. We have
$$\bigl\langle \mathcal{F}\bigl(\hat\xi_\lambda(x)\bigr), -\Delta x\bigr\rangle_{L^p(\Omega)} = \int_\Omega -\Delta x(z)\,\xi_\lambda\bigl(x(z)\bigr)\,\bigl|\xi_\lambda\bigl(x(z)\bigr)\bigr|^{p-2}\, dz. \tag{3.62}$$
If
$$\varphi_\lambda(r) \overset{\mathrm{df}}{=} \xi_\lambda(r)\,|\xi_\lambda(r)|^{p-2},$$
then $\varphi_\lambda$ is a Lipschitz continuous map and
$$\varphi'_\lambda(r) = (p - 1)\,|\xi_\lambda(r)|^{p-2}\,\xi'_\lambda(r) \quad\text{for a.a. } r \in \mathbb{R}.$$
Also
$$\varphi_\lambda\bigl(x(\cdot)\bigr) \in W^{1,p}_0(\Omega)$$
(see Proposition 2.4.25 and Remark 2.4.26) and
$$D\varphi_\lambda\bigl(x(z)\bigr) = \varphi'_\lambda\bigl(x(z)\bigr)\,Dx(z) \quad\text{for a.a. } z \in \Omega.$$
Performing an integration by parts on the right hand side integral of (3.62) and recalling that $\xi_\lambda(0) = 0$ (since $0 \in \xi(0)$), we obtain
$$\bigl\langle \mathcal{F}\bigl(\hat\xi_\lambda(x)\bigr), -\Delta x\bigr\rangle_{L^p(\Omega)} = \int_\Omega \varphi'_\lambda\bigl(x(z)\bigr)\,\|Dx(z)\|^2_{\mathbb{R}^N}\, dz \ge 0,$$
since $\varphi'_\lambda\bigl(x(z)\bigr) \ge 0$, because $\varphi_\lambda$ is monotone increasing. Applying Theorem 3.3.21, with data $A = -\Delta$, $B = \hat\xi$ and $X = L^p(\Omega)$ (note that $X^* = L^{p'}(\Omega)$ is uniformly convex since $p' \in (1, +\infty)$), we obtain that $A + B = K$ is m-accretive.

The main reason for studying accretive operators is the fact that they are closely related to the generation of semigroups (linear and nonlinear). The theory of semigroups is a valuable tool in the study of partial differential equations, of Volterra integral equations and of control problems. In the rest of this section, we will see how m-accretive operators lead to semigroups of operators, which in turn describe the time-evolution of a dynamical system governed by a differential equation in a Banach space (evolution equation). So we start our discussion with an existence result for evolution equations in which the input and Cauchy data are regular (smooth). We begin with two useful auxiliary results.
LEMMA 3.3.26 If $X$ is a Banach space, $x\colon T = [0, b] \to X$ is weakly differentiable at $t \in T$, i.e.,
$$\frac{d}{ds}\langle x^*, x(s)\rangle_X\Big|_{s=t} = \langle x^*, x'(t)\rangle_X \qquad \forall\ x^* \in X^*,$$
and $s \longmapsto \|x(s)\|_X$ is differentiable at $s = t$, then
$$\|x(t)\|_X \Bigl[\frac{d}{ds}\|x(s)\|_X\Bigr]\Big|_{s=t} = \langle x^*, x'(t)\rangle_X \qquad \forall\ x^* \in \mathcal{F}\bigl(x(t)\bigr).$$

PROOF For every $x^* \in \mathcal{F}\bigl(x(t)\bigr)$ and $r > 0$, we have
$$\langle x^*, x(t + r) - x(t)\rangle_X \le \|x^*\|_{X^*}\bigl(\|x(t + r)\|_X - \|x(t)\|_X\bigr).$$
Since by hypothesis $x$ is weakly differentiable at $t$, dividing by $r > 0$ and letting $r \searrow 0$, we obtain
$$\langle x^*, x'(t)\rangle_X \le \|x(t)\|_X \Bigl[\frac{d}{ds}\|x(s)\|_X\Bigr]\Big|_{s=t}. \tag{3.63}$$
On the other hand, since
$$\langle x^*, x(t) - x(t - r)\rangle_X \ge \|x^*\|_{X^*}\bigl(\|x(t)\|_X - \|x(t - r)\|_X\bigr),$$
arguing as above, we obtain
$$\|x(t)\|_X \Bigl[\frac{d}{ds}\|x(s)\|_X\Bigr]\Big|_{s=t} \le \langle x^*, x'(t)\rangle_X. \tag{3.64}$$
From (3.63) and (3.64), we conclude the desired equality.

The second auxiliary result is a Gronwall-type lemma which is used frequently in the study of evolution equations.

LEMMA 3.3.27 If $\varphi \in L^1(T)$, $\varphi(t) \ge 0$ for almost all $t \in T$, $\eta \in \mathbb{R}$, $u \in C(T)$ and
$$\frac{1}{2}u(t)^2 \le \frac{1}{2}\eta^2 + \int_0^t \varphi(s)u(s)\, ds \qquad \forall\ t \in T,$$
then
$$|u(t)| \le |\eta| + \int_0^t \varphi(s)\, ds \qquad \forall\ t \in T.$$
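PROOF Let
$$\xi_\varepsilon(t) = \frac{1}{2}(\eta + \varepsilon)^2 + \int_0^t \varphi(s)u(s)\, ds,$$
with $\varepsilon > 0$ and $t \in T$. Then
$$\xi'_\varepsilon(t) = \varphi(t)u(t) \quad\text{for a.a. } t \in T.$$
Moreover,
$$\frac{1}{2}u(t)^2 \le \xi_0(t) \le \xi_\varepsilon(t) \qquad \forall\ \varepsilon > 0,\ t \in T,$$
so
$$\xi'_\varepsilon(t) \le \varphi(t)\sqrt{2}\sqrt{\xi_\varepsilon(t)} \qquad \forall\ \varepsilon > 0,\ t \in T.$$
Because $t \longmapsto \xi_\varepsilon(t)$ is absolutely continuous with values in $\mathbb{R}_+$, we have
$$\Bigl(\sqrt{\xi_\varepsilon(t)}\Bigr)' = \frac{1}{2\sqrt{\xi_\varepsilon(t)}}\,\xi'_\varepsilon(t) \quad\text{for a.a. } t \in T,$$
thus
$$\Bigl(\sqrt{\xi_\varepsilon(t)}\Bigr)' \le \frac{1}{\sqrt{2}}\varphi(t) \quad\text{for a.a. } t \in T$$
and so
$$\sqrt{\xi_\varepsilon(t)} \le \sqrt{\xi_\varepsilon(0)} + \frac{1}{\sqrt{2}}\int_0^t \varphi(s)\, ds \qquad \forall\ t \in T.$$
Therefore, it follows that
$$|u(t)| \le \sqrt{2}\sqrt{\xi_\varepsilon(t)} \le \sqrt{2\xi_\varepsilon(0)} + \int_0^t \varphi(s)\, ds = |\eta + \varepsilon| + \int_0^t \varphi(s)\, ds.$$
Let $\varepsilon \searrow 0$, to conclude that
$$|u(t)| \le |\eta| + \int_0^t \varphi(s)\, ds \qquad \forall\ t \in T.$$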
Using these results, we can prove the first existence theorem for evolution equations driven by m-accretive operators.
THEOREM 3.3.28 If $X$ is a Banach space with $X^*$ being uniformly convex (hence $X$ is reflexive), $A\colon X \supseteq D(A) \to 2^X$ is an m-accretive operator, $x_0 \in D(A)$, $f \in W^{1,1}\bigl((0, b); X\bigr)$ and $\omega \in \mathbb{R}$, then we can find a unique $x \in W^{1,\infty}\bigl((0, b); X\bigr)$, such that
$$\begin{cases} x'(t) + A\bigl(x(t)\bigr) \ni \omega x(t) + f(t) & \text{for a.a. } t \in T = [0, b], \\ x(0) = x_0. \end{cases}$$

PROOF First we show that the solution, if it exists, is unique. So suppose that $x_1, x_2 \in W^{1,\infty}\bigl((0, b); X\bigr)$ are two solutions of the evolution Cauchy problem. We have
$$\bigl(x_1(t) - x_2(t)\bigr)' + A\bigl(x_1(t)\bigr) - A\bigl(x_2(t)\bigr) \ni \omega\bigl(x_1(t) - x_2(t)\bigr) \quad\text{for a.a. } t \in T.$$
Let
$$\varphi(t) = \|x_1(t) - x_2(t)\|_X.$$
Since $x_1, x_2 \in W^{1,\infty}\bigl((0, b); X\bigr)$, they are Lipschitz continuous functions (see Theorem 2.2.24) and so they are differentiable almost everywhere on $T$ (see Theorem 2.2.17). Moreover, $\varphi$ is a Lipschitz continuous function too, thus differentiable almost everywhere on $T$. So we can use Lemma 3.3.26 and obtain
$$\|x_1(t) - x_2(t)\|_X \frac{d}{dt}\|x_1(t) - x_2(t)\|_X = \bigl\langle \mathcal{F}\bigl(x_1(t) - x_2(t)\bigr), x'_1(t) - x'_2(t)\bigr\rangle_X \quad\text{for a.a. } t \in T.$$
Since $A$ is m-accretive, we also have
$$\|x_1(t) - x_2(t)\|_X \frac{d}{dt}\|x_1(t) - x_2(t)\|_X \le \omega\|x_1(t) - x_2(t)\|_X^2 \quad\text{for a.a. } t \in T,$$
so
$$\frac{d}{dt}\|x_1(t) - x_2(t)\|_X \le \omega\|x_1(t) - x_2(t)\|_X \quad\text{for a.a. } t \in T.$$
Because $x_1(0) - x_2(0) = 0$, by Gronwall's inequality (the differential form), we obtain
$$\|x_1(t) - x_2(t)\|_X = 0 \qquad \forall\ t \in T.$$
Therefore $x_1 = x_2$.

To establish the existence of a solution, first we consider the following approximate evolution equation:
$$\begin{cases} x'(t) + A_\lambda\bigl(x(t)\bigr) = \omega x(t) + f(t) & \text{for a.a. } t \in T = [0, b], \\ x(0) = x_0, \end{cases} \tag{3.65}$$
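with $\lambda > 0$. Because $A_\lambda$ is a Lipschitz continuous operator (see Proposition 3.3.12(b)), problem (3.65) has a unique solution $x_\lambda \in C^1(T; X)$. Using Lemma 3.3.26, we see that for all $\lambda, \mu > 0$, we have
$$\frac{1}{2}\frac{d}{dt}\|x_\lambda(t) - x_\mu(t)\|_X^2 + \bigl\langle \mathcal{F}\bigl(x_\lambda(t) - x_\mu(t)\bigr), A_\lambda\bigl(x_\lambda(t)\bigr) - A_\mu\bigl(x_\mu(t)\bigr)\bigr\rangle_X = \omega\|x_\lambda(t) - x_\mu(t)\|_X^2 \qquad \forall\ t \in T,$$
so by Gronwall's inequality (see Theorem A.4.7), we have
$$\|x_\lambda(t) - x_\mu(t)\|_X^2 \le -2\int_0^b e^{2\omega(t - s)}\bigl\langle \mathcal{F}\bigl(x_\lambda(s) - x_\mu(s)\bigr), A_\lambda\bigl(x_\lambda(s)\bigr) - A_\mu\bigl(x_\mu(s)\bigr)\bigr\rangle_X\, ds \qquad \forall\ t \in T.$$
Exploiting the accretivity of $A$ and the fact that
$$A_\alpha\bigl(x_\alpha(t)\bigr) \in A\bigl(J_\alpha\bigl(x_\alpha(t)\bigr)\bigr) \qquad \forall\ t \in T$$
(see Proposition 3.3.12(c)), we can write that
$$\|x_\lambda(t) - x_\mu(t)\|_X^2 \le -2\int_0^b e^{2\omega(t - s)}\bigl\langle \mathcal{F}\bigl(x_\lambda(s) - x_\mu(s)\bigr) - \mathcal{F}\bigl(J_\lambda\bigl(x_\lambda(s)\bigr) - J_\mu\bigl(x_\mu(s)\bigr)\bigr), A_\lambda\bigl(x_\lambda(s)\bigr) - A_\mu\bigl(x_\mu(s)\bigr)\bigr\rangle_X\, ds \qquad \forall\ t \in T,$$
so
$$\|x_\lambda(t) - x_\mu(t)\|_X^2 \le 2\int_0^b e^{2\omega(t - s)}\bigl\|\mathcal{F}\bigl(x_\lambda(s) - x_\mu(s)\bigr) - \mathcal{F}\bigl(J_\lambda\bigl(x_\lambda(s)\bigr) - J_\mu\bigl(x_\mu(s)\bigr)\bigr)\bigr\|_{X^*}\,\bigl\|A_\lambda\bigl(x_\lambda(s)\bigr) - A_\mu\bigl(x_\mu(s)\bigr)\bigr\|_X\, ds \qquad \forall\ t \in T. \tag{3.66}$$
We claim that
$$\|x'_\lambda(t)\|_X \le |A(x_0)| + \omega\|x_0\|_X + \|f(0)\|_X + e^{2\omega b}\|f'\|_1. \tag{3.67}$$
Indeed, from (3.65), for every $r \in (0, b)$, we have
$$\bigl(x_\lambda(t + r) - x_\lambda(t)\bigr)' + A_\lambda\bigl(x_\lambda(t + r)\bigr) - A_\lambda\bigl(x_\lambda(t)\bigr) = \omega\bigl(x_\lambda(t + r) - x_\lambda(t)\bigr) + f(t + r) - f(t) \quad\text{for a.a. } t \in T_r = [0, b - r].$$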
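Exploiting the accretivity of $A_\lambda$ and Lemma 3.3.26, we obtain
$$\frac{1}{2}\frac{d}{dt}\|x_\lambda(t + r) - x_\lambda(t)\|_X^2 \le \omega\|x_\lambda(t + r) - x_\lambda(t)\|_X^2 + \bigl\langle \mathcal{F}\bigl(x_\lambda(t + r) - x_\lambda(t)\bigr), f(t + r) - f(t)\bigr\rangle_X \quad\text{for a.a. } t \in T_r,$$
so
$$\|x_\lambda(t + r) - x_\lambda(t)\|_X^2 \le \|x_\lambda(r) - x_\lambda(0)\|_X^2 + 2\int_0^t e^{2\omega(t - s)}\|x_\lambda(s + r) - x_\lambda(s)\|_X\,\|f(s + r) - f(s)\|_X\, ds.$$
Thus by Lemma 3.3.27, we obtain
$$\|x_\lambda(t + r) - x_\lambda(t)\|_X \le \|x_\lambda(r) - x_\lambda(0)\|_X + \int_0^t e^{2\omega(t - s)}\|f(s + r) - f(s)\|_X\, ds.$$
Dividing by $r > 0$ and letting $r \searrow 0$, we obtain
$$\|x'_\lambda(t)\|_X \le \|\omega x_0 + f(0) - A_\lambda(x_0)\|_X + \int_0^b e^{2\omega(t - s)}\|f'(s)\|_X\, ds \le |A(x_0)| + \omega\|x_0\|_X + \|f(0)\|_X + e^{2\omega b}\|f'\|_1.$$
This proves (3.67), from which we infer that there exists $M_1 > 0$, such that
$$\|x'_\lambda(t)\|_X \le M_1 \qquad \forall\ \lambda > 0,\ t \in T. \tag{3.68}$$
Since
$$x_\lambda(t) = x_0 + \int_0^t x'_\lambda(s)\, ds \qquad \forall\ \lambda > 0,\ t \in T,$$
it follows that there exists $M_2 > 0$, such that
$$\|x_\lambda(t)\|_X \le M_2 \qquad \forall\ \lambda > 0,\ t \in T. \tag{3.69}$$
Returning to (3.65) and using (3.68) and (3.69), we obtain $M_3 > 0$, such that
$$\bigl\|A_\lambda\bigl(x_\lambda(t)\bigr)\bigr\|_X \le M_3 \qquad \forall\ \lambda > 0,\ t \in T, \tag{3.70}$$
so
$$\bigl\|x_\lambda(t) - J_\lambda\bigl(x_\lambda(t)\bigr)\bigr\|_X = \lambda\bigl\|A_\lambda\bigl(x_\lambda(t)\bigr)\bigr\|_X \le \lambda M_3 \qquad \forall\ \lambda > 0,\ t \in T. \tag{3.71}$$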
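From (3.71), it follows that
$$\bigl\|x_\lambda(t) - J_\lambda\bigl(x_\lambda(t)\bigr)\bigr\|_X \longrightarrow 0 \quad\text{as } \lambda \searrow 0, \text{ uniformly on } T.$$
Because $\mathcal{F}$ is uniformly continuous on bounded sets, from (3.66) and (3.70), we see that for a given $\varepsilon > 0$, for all $\lambda, \mu > 0$ small enough, we have
$$\|x_\lambda(t) - x_\mu(t)\|_X^2 \le M_4\varepsilon,$$
for some $M_4 > 0$, so $x_\lambda \to x$ in $C(T; X)$ as $\lambda \searrow 0$. Moreover, from (3.67) we infer that $x$ is a Lipschitz continuous function, i.e., $x \in W^{1,\infty}\bigl((0, b); X\bigr)$. We claim that this is the solution of the evolution equation. To this end let $(y, z) \in \operatorname{Gr} A$ and let us set
$$y_\lambda \overset{\mathrm{df}}{=} y + \lambda z \qquad \forall\ \lambda > 0.$$
Hence $z = A_\lambda(y_\lambda)$. Using (3.65) and the accretivity of $A_\lambda$, we obtain
$$\|x_\lambda(t) - y_\lambda\|_X^2 \le \|x_\lambda(t_0) - y_\lambda\|_X^2 + 2\int_{t_0}^t \bigl\langle \mathcal{F}\bigl(x_\lambda(s) - y_\lambda\bigr), \omega x_\lambda(s) + f(s) - z\bigr\rangle_X\, ds \qquad \forall\ t, t_0 \in T,\ t_0 \le t.$$
Letting $\lambda \searrow 0$, we get
$$\|x(t) - y\|_X^2 - \|x(t_0) - y\|_X^2 \le 2\int_{t_0}^t \bigl\langle \mathcal{F}\bigl(x(s) - y\bigr), \omega x(s) + f(s) - z\bigr\rangle_X\, ds. \tag{3.72}$$
Note that for any $z, h \in X$, we have
$$\langle \mathcal{F}(h), z - h\rangle_X \le \|h\|_X\|z\|_X - \|h\|_X^2 \le \frac{1}{2}\bigl(\|z\|_X^2 - \|h\|_X^2\bigr).$$
Using this in (3.72), we have
$$\Bigl\langle \mathcal{F}\bigl(x(t_0) - y\bigr), \frac{x(t) - x(t_0)}{t - t_0}\Bigr\rangle_X \le \frac{1}{t - t_0}\int_{t_0}^t \bigl\langle \mathcal{F}\bigl(x(s) - y\bigr), \omega x(s) + f(s) - z\bigr\rangle_X\, ds. \tag{3.73}$$
Let $t_0 \in T$ be a point of differentiability of $x$. Passing to the limit as $t \to t_0$ in (3.73), we obtain
$$\bigl\langle \mathcal{F}\bigl(x(t_0) - y\bigr), x'(t_0)\bigr\rangle_X \le \bigl\langle \mathcal{F}\bigl(x(t_0) - y\bigr), \omega x(t_0) + f(t_0) - z\bigr\rangle_X,$$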
so
$$\bigl\langle \mathcal{F}\bigl(x(t_0) - y\bigr), u_0 - z\bigr\rangle_X \ge 0, \tag{3.74}$$
with
$$u_0 \overset{\mathrm{df}}{=} -x'(t_0) + \omega x(t_0) + f(t_0).$$
Since $A$ is m-accretive, hence maximal accretive, and $(y, z) \in \operatorname{Gr} A$ was arbitrary, from (3.74), we conclude that
$$-x'(t_0) + \omega x(t_0) + f(t_0) \in A\bigl(x(t_0)\bigr).$$
Because $x$ is almost everywhere differentiable (recall that $x \in W^{1,\infty}\bigl((0, b); X\bigr)$), we conclude that $x$ solves the original Cauchy problem.
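The approximation scheme in the proof of Theorem 3.3.28 (replace $A$ by $A_\lambda$, solve the Lipschitz problem, let $\lambda \searrow 0$) can be tried out numerically in a toy case. The sketch below is an illustration under assumptions that are not in the text: $X = \mathbb{R}$, $A$ is the multivalued sign map, $\omega = 0$, $f = 0$ and $x_0 \ge 0$, for which the exact solution is $x(t) = \max(x_0 - t, 0)$. The regularized equation is integrated with the explicit Euler method using a step proportional to $\lambda$. The function names and step choice are hypothetical.

```python
def yosida_sign(x, lam):
    """Yosida approximation of the multivalued sign map: x/lam clipped to [-1, 1]."""
    return max(-1.0, min(1.0, x / lam))


def solve_regularized(x0, lam, t_end, steps_per_lambda=10):
    """Explicit Euler for x'(t) + A_lam(x(t)) = 0, x(0) = x0, on [0, t_end]."""
    dt = lam / steps_per_lambda          # step below 1/Lipschitz constant of A_lam
    n_steps = int(round(t_end / dt))
    times, values = [0.0], [x0]
    x = x0
    for k in range(1, n_steps + 1):
        x = x - dt * yosida_sign(x, lam)
        times.append(k * dt)
        values.append(x)
    return times, values


def exact_solution(t, x0):
    """Solution of x' + sign(x) containing 0, x(0) = x0 >= 0 (assumed toy problem)."""
    return max(x0 - t, 0.0)


def max_error(lam, x0=1.0, t_end=2.0):
    """Largest deviation of the regularized Euler solution from the exact one."""
    times, values = solve_regularized(x0, lam, t_end)
    return max(abs(v - exact_solution(t, x0)) for t, v in zip(times, values))
```

In this toy problem the error is largest near the time $t = x_0$ at which the exact solution reaches zero, and it shrinks as $\lambda$ decreases, in line with the convergence $x_\lambda \to x$ in $C(T; X)$ established in the proof.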
Next we want to consider evolution equations with less regular data. This can be done with remarkable success using the theory of semigroups of operators. In what follows we present some basic aspects of this theory. We start with the linear theory and then pass to the nonlinear one. Let us motivate the definition of a semigroup of bounded linear operators (linear semigroup). So let $X$ be a Banach space and $A \in \mathcal{L}(X)$. We consider the following Cauchy problem:
$$\begin{cases} x'(t) = Ax(t) & \forall\ t \ge 0, \\ x(0) = x_0 \in X. \end{cases} \tag{3.75}$$
It is easy to check that the function $x(t) \overset{\mathrm{df}}{=} e^{tA}x_0$ for $t \ge 0$, $x \in C^1(\mathbb{R}_+; X)$, is the unique solution of (3.75). Let us mention the basic properties of this solution. First, for every fixed $t \ge 0$, the map $x_0 \longmapsto x(t)$ is linear. Moreover, since
$$\|x(t)\|_X \le e^{t\|A\|_{\mathcal{L}}}\|x_0\|_X,$$
it is also bounded. Second, as $t \searrow 0$, we have that
$$x(t) \longrightarrow x_0 \quad\text{in } X$$
and $x(0) = x_0$. Finally, by virtue of the uniqueness of the solution of (3.75), if we start with initial condition $x(t_0)$, $t_0 \ge 0$, and move for time $t \ge 0$, we must reach the state $x(t + t_0)$ (recall that $e^{(t + t_0)A} = e^{tA}e^{t_0A}$). Generalizing these properties we obtain the notion of a $C_0$-semigroup of linear operators.

DEFINITION 3.3.29 Let $X$ be a Banach space and $\{S(t)\}_{t \ge 0} \subseteq \mathcal{L}(X)$. We call $S$ a $C_0$-semigroup on $X$ if the following conditions hold:
(a) $S(0) = \mathrm{id}_X$;
(b) $S(t + s) = S(t)S(s)$ for all $s, t \ge 0$;
(c) $\lim_{t \searrow 0}\|S(t)x - x\|_X = 0$ for all $x \in X$.
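The three conditions of Definition 3.3.29 can be observed directly for $S(t) = e^{tA}$ in finite dimensions. The sketch below is a minimal illustration and not part of the text: it takes $X = \mathbb{R}^2$ and the rotation generator $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ (an arbitrary choice, for which $e^{tA}$ has entries $\cos t$ and $\pm\sin t$), and sums the exponential series by hand to stay dependency-free. All names are hypothetical.

```python
import math

GENERATOR = [[0.0, 1.0], [-1.0, 0.0]]   # assumed example: generator of rotations


def mat_mul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def semigroup(a, t, terms=40):
    """S(t) = exp(t*a), computed from the truncated power series sum of (t*a)^k / k!."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        # term becomes (t*a)^k / k! from (t*a)^(k-1) / (k-1)!
        term = [[t * entry / k for entry in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def max_abs_diff(p, q):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))


def apply(m, x):
    """Apply a 2x2 matrix to a vector in R^2."""
    return [m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1]]
```

With these helpers one can compare `semigroup(GENERATOR, t + s)` with `mat_mul(semigroup(GENERATOR, t), semigroup(GENERATOR, s))` for property (b), and watch `apply(semigroup(GENERATOR, t), x)` approach `x` as `t` shrinks for property (c). The truncation at 40 terms is adequate only for moderate values of $t\|A\|$.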
REMARK 3.3.30 Property (b) is the semigroup property, while property (c) implies that the function $t \longmapsto S(t)$ is continuous from $\mathbb{R}_+$ into $\mathcal{L}(X)$ furnished with the strong operator topology. If $A \in \mathcal{L}(X)$, then $\{S(t) = e^{tA}\}_{t \ge 0}$ is a $C_0$-semigroup. Also if $X = C_b(\mathbb{R})$ (the space of bounded continuous functions $f\colon \mathbb{R} \to \mathbb{R}$ equipped with the supremum norm) and
$$S(t)f(\cdot) = f(t + \cdot) \qquad \forall\ f \in C_b(\mathbb{R}),$$
then $\{S(t)\}_{t \ge 0}$ is a $C_0$-semigroup.

PROPOSITION 3.3.31 If $X$ is a Banach space and $\{S(t)\}_{t \ge 0}$ is a $C_0$-semigroup on $X$, then there exist $M \ge 1$ and $\omega \ge 0$, such that
$$\|S(t)\|_{\mathcal{L}} \le Me^{\omega t} \qquad \forall\ t \ge 0.$$

PROOF By virtue of property (c) in Definition 3.3.29, we can find $M \ge 1$ and $\delta > 0$, such that
$$\|S(t)\|_{\mathcal{L}} \le M \qquad \forall\ t \in [0, \delta].$$
Let
$$\omega \overset{\mathrm{df}}{=} \frac{\ln M}{\delta} \ge 0.$$
For a given $t \ge 0$, we can find an integer $n \ge 0$ and $\vartheta \in [0, \delta)$, such that
$$t = n\delta + \vartheta.$$
Because of the semigroup property, we have $S(t) = S(\delta)^n S(\vartheta)$, so
$$\|S(t)\|_{\mathcal{L}} \le \|S(\delta)\|_{\mathcal{L}}^n\,\|S(\vartheta)\|_{\mathcal{L}} \le M^n M \le Me^{\omega t},$$
since
$$\ln M^n = n\ln M = n\omega\delta \le \omega t.$$
Using this bound, we can improve (c) in Definition 3.3.29.

COROLLARY 3.3.32 If $X$ is a Banach space and $\{S(t)\}_{t \ge 0}$ is a $C_0$-semigroup on $X$, then for all $x \in X$, the map $t \longmapsto S(t)x$ is continuous from $\mathbb{R}_+$ into $X$.

PROOF Let $r > 0$. Then using the semigroup property and Proposition 3.3.31, we obtain
$$\|S(t + r)x - S(t)x\|_X \le \|S(t)\|_{\mathcal{L}}\,\|S(r)x - x\|_X \le Me^{\omega t}\|S(r)x - x\|_X \longrightarrow 0 \quad\text{as } r \searrow 0.$$
So the function $t \longmapsto S(t)x$ is continuous on $\mathbb{R}_+$.

DEFINITION 3.3.33 Let $X$ be a Banach space and let $\{S(t)\}_{t \ge 0}$ be a $C_0$-semigroup on $X$. From Proposition 3.3.31, we know that
$$\|S(t)\|_{\mathcal{L}} \le Me^{\omega t} \qquad \forall\ t \ge 0$$
for some $M \ge 1$ and $\omega \ge 0$. If $M = 1$ and $\omega = 0$, i.e.,
$$\|S(t)\|_{\mathcal{L}} \le 1 \qquad \forall\ t \ge 0,$$
then we say that we have a contraction semigroup.

The following notion is central in the theory of linear semigroups and is the starting point for determining those operators which generate contraction semigroups (see the Hille-Yosida Theorem 3.3.46).

DEFINITION 3.3.34 Let $X$ be a Banach space and let $\{S(t)\}_{t \ge 0}$ be a $C_0$-semigroup on $X$. We introduce the generator (or infinitesimal generator) of the semigroup $S$ as the linear operator $A\colon X \supseteq D(A) \to X$, defined by
$$Ax \overset{\mathrm{df}}{=} \lim_{t \searrow 0}\frac{S(t)x - x}{t} \qquad \forall\ x \in D(A),$$
where
$$D(A) \overset{\mathrm{df}}{=} \Bigl\{x \in X : \lim_{t \searrow 0}\frac{S(t)x - x}{t} \text{ exists}\Bigr\}.$$
In general the operator $A$ is not bounded.
3. Nonlinear Operators and Young Measures
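EXAMPLE 3.3.35 (a) Let $A \in \mathcal{L}(X)$ and $S(t) = e^{tA}$ for $t \ge 0$. This is a $C_0$-semigroup. Then for every $x \in X$, we have
\[
\frac{e^{tA}x - x}{t} = \sum_{k=1}^{\infty} \frac{t^{k-1}}{k!} A^k x = Ax + \sum_{k=2}^{\infty} \frac{t^{k-1}}{k!} A^k x
\]
and
\[
\left\| \sum_{k=2}^{\infty} \frac{t^{k-1}}{k!} A^k x \right\|_X
\le \sum_{k=2}^{\infty} \frac{t^{k-1}}{k!} \|A\|_{\mathcal{L}}^k \|x\|_X
\le t \, \|A\|_{\mathcal{L}}^2 \sum_{k=0}^{\infty} \frac{t^k}{k!} \|A\|_{\mathcal{L}}^k \|x\|_X
= t \, \|A\|_{\mathcal{L}}^2 \|x\|_X \, e^{t\|A\|_{\mathcal{L}}} \longrightarrow 0 \quad \text{as } t \searrow 0.
\]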
Therefore the generator of $S$ is $A$.

(b) Let $X = C_b(\mathbb{R})$ and
\[
S(t)f(\cdot) \stackrel{df}{=} f(t + \cdot)
\]
(see Remark 3.3.30). We have
\[
\lim_{t \searrow 0} \frac{(S(t)f - f)(s)}{t} = D^+ f(s) \quad \text{if it exists.}
\]
So if $f \in D(A)$, then $D^+ f$ exists at all $s \in \mathbb{R}$ and it is bounded and uniformly continuous. Also we have
\[
\frac{f(s) - f(s - t)}{t} = \frac{f(s - t + t) - f(s - t)}{t} = D^+ f(s - t) + \frac{o(t)}{t} \longrightarrow D^+ f(s) \quad \text{as } t \searrow 0
\]
(since $D^+ f$ is continuous). Therefore if $f \in D(A)$, then $D^+ f = D^- f$, i.e., $f'(t)$ exists at all $t \in \mathbb{R}$ and $f' \in C_b(\mathbb{R})$. So
\[
D(A) = \left\{ f \in C_b(\mathbb{R}) : f' \ \text{exists everywhere and} \ f' \in C_b(\mathbb{R}) \right\}
\]
and $Af = f'$ for all $f \in D(A)$.

More generally, if $X = H = L^2(0, b)$ and
\[
S(t)x(s) \stackrel{df}{=}
\begin{cases}
x(t + s) & \text{if } t + s \in (0, b), \\
0 & \text{if } t + s \notin (0, b),
\end{cases}
\]
then the generator of $S$ is the operator $A : H \supseteq D(A) \to H$, defined by
\[
Ax(t) = \frac{d}{dt} x(t),
\]
with
\[
D(A) = \left\{ x \in W^{1,2}(0, b) : x(b) = 0 \right\}.
\]
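The estimate in Example 3.3.35(a) can be checked numerically in finite dimensions. The following is only a minimal sketch: the matrix `A` (the generator of plane rotations, with operator norm 1), the vector `x` and the helper names are illustrative assumptions, not taken from the text. It evaluates the difference quotient $(e^{tA}x - x)/t$ with a truncated power series and compares its distance to $Ax$ with the bound $t\|A\|^2\|x\|e^{t\|A\|}$.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, t, terms=40):
    """Truncated power series for exp(t*a); adequate for small t*||a||."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term_k = term_{k-1} * a * t / k, i.e. (t a)^k / k!
        term = [[t * v / k for v in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def mat_vec(a, x):
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]

def norm(x):
    return math.sqrt(sum(v * v for v in x))

def generator_error(a, x, t):
    """Euclidean norm of (exp(t a) x - x) / t - a x."""
    s = mat_vec(mat_exp(a, t), x)
    ax = mat_vec(a, x)
    return norm([(s[i] - x[i]) / t - ax[i] for i in range(len(x))])

# Illustrative data (assumed): rotation generator, ||A|| = 1, ||x|| = sqrt(5).
A = [[0.0, 1.0], [-1.0, 0.0]]
x = [1.0, 2.0]
times = (1e-1, 1e-2, 1e-3)
errors = [generator_error(A, x, t) for t in times]
bounds = [t * 1.0 ** 2 * norm(x) * math.exp(t * 1.0) for t in times]
```

Under these assumptions the entries of `errors` shrink roughly linearly in `t` and stay below the corresponding entries of `bounds`, which is the behaviour the estimate predicts.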
In the next proposition, we summarize the differential properties of $C_0$-semigroups.

PROPOSITION 3.3.36 If $X$ is a Banach space and $\{S(t)\}_{t \ge 0}$ is a $C_0$-semigroup on $X$ with generator $A$, then for all $x \in D(A)$ and all $t \ge 0$, we have
(a) $S(t)x \in D(A)$;
(b) $\dfrac{d}{dt} S(t)x = AS(t)x = S(t)Ax$ for $t \ge 0$;
(c) $S(t)x - x = \displaystyle\int_0^t S(r)Ax \, dr$;
(d) $D(A)$ is dense in $X$ and the operator $A$ is closed (i.e., $\operatorname{Gr} A \subseteq X \times X$ is closed).

PROOF
(a) For $r > 0$ we have
\[
\frac{S(t + r)x - S(t)x}{r} = S(t) \, \frac{S(r)x - x}{r} \longrightarrow S(t)Ax \quad \text{as } r \searrow 0.
\]
Hence $S(t)x \in D(A)$.

(b) From (a), we have
\[
\frac{d^+}{dt} S(t)x = S(t)Ax.
\]
Also note that
\[
\frac{S(t + r)x - S(t)x}{r} = S(t) \, \frac{S(r)x - x}{r} = \left( \frac{S(r) - \mathrm{id}_X}{r} \right) S(t)x \longrightarrow AS(t)x \quad \text{as } r \searrow 0,
\]
so
\[
\frac{d^+}{dt} S(t)x = S(t)Ax = AS(t)x.
\]
On the other hand, for $t \ge r > 0$, we have
\[
\frac{S(t)x - S(t - r)x}{r} = S(t - r) \, \frac{S(r)x - x}{r} \longrightarrow S(t)Ax \quad \text{as } r \searrow 0,
\]
so
\[
\frac{d^-}{dt} S(t)x = S(t)Ax.
\]
Also since
\[
S(t - r) \, \frac{S(r)x - x}{r} = \left( \frac{S(r) - \mathrm{id}_X}{r} \right) S(t - r)x,
\]
we have
\[
\frac{d^-}{dt} S(t)x = S(t)Ax = AS(t)x.
\]
So finally, we conclude that
\[
\frac{d}{dt} S(t)x = S(t)Ax = AS(t)x.
\]

(c) By part (b), the function $t \mapsto S(t)x$ is continuously differentiable. So for all $x^* \in X^*$, we have
\[
\langle x^*, S(t)x - x \rangle_X = \int_0^t \frac{d}{dr} \langle x^*, S(r)x \rangle_X \, dr = \int_0^t \Big\langle x^*, \frac{d}{dr} S(r)x \Big\rangle_X dr = \Big\langle x^*, \int_0^t S(r)Ax \, dr \Big\rangle_X,
\]
hence
\[
S(t)x - x = \int_0^t S(r)Ax \, dr \qquad \forall\, t \ge 0.
\]
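(d) For $t \ge r > 0$ and $x \in X$, we have
\[
\begin{aligned}
\left( \frac{S(r) - \mathrm{id}_X}{r} \right) \int_0^t S(\tau)x \, d\tau
&= \frac{1}{r} \int_0^t \big( S(\tau + r)x - S(\tau)x \big) \, d\tau \\
&= \frac{1}{r} \left[ \int_r^{t+r} S(\tau)x \, d\tau - \int_0^t S(\tau)x \, d\tau \right] \\
&= \frac{1}{r} \left[ \int_t^{t+r} S(\tau)x \, d\tau - \int_0^r S(\tau)x \, d\tau \right] \longrightarrow S(t)x - x \quad \text{as } r \searrow 0,
\end{aligned}
\]
so
\[
\int_0^t S(\tau)x \, d\tau \in D(A).
\]
But note that
\[
\lim_{t \searrow 0} \frac{1}{t} \int_0^t S(\tau)x \, d\tau = x
\]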
and since $x \in X$ was arbitrary, we conclude that $D(A)$ is dense in $X$.

Next let $\{x_n\}_{n \ge 1} \subseteq D(A)$ and assume that
\[
x_n \longrightarrow x \quad \text{and} \quad A(x_n) \longrightarrow y \quad \text{in } X.
\]
For every $r \ge 0$, we have
\[
\|S(r)Ax_n - S(r)y\|_X \le M e^{\omega r} \|Ax_n - y\|_X
\]
(see Proposition 3.3.31) and thus
\[
S(\cdot)Ax_n \longrightarrow S(\cdot)y \quad \text{in } X \ \text{uniformly on } [0, t], \ t > 0.
\]
From (c) we know that
\[
S(t)x_n - x_n = \int_0^t S(r)Ax_n \, dr.
\]
Passing to the limit as $n \to +\infty$, we obtain
\[
S(t)x - x = \int_0^t S(r)y \, dr.
\]
Hence
\[
\lim_{t \searrow 0} \frac{S(t)x - x}{t} = y,
\]
which implies that $x \in D(A)$ and $y = Ax$, i.e., $A$ is closed.

REMARK 3.3.37 Using part (b) of Proposition 3.3.36 and induction, we can show that for all $n \ge 1$, all $x \in D(A^n)$ and all $t \ge 0$, we have
\[
\frac{d^n}{dt^n} S(t)x = S(t)A^n x = A^n S(t)x.
\]
Moreover, it can be shown that the set $\bigcap_{n=1}^{\infty} D(A^n)$ is dense in $X$. For details we refer to Pazy (1983, p. 6). Also because of parts (b) and (d) of Proposition 3.3.36 and using Theorem 2.1.17, we can rewrite part (c) as follows:
(c)' for all $t \ge 0$ and all $x \in X$, we have
\[
S(t)x - x = A \int_0^t S(\tau)x \, d\tau.
\]
COROLLARY 3.3.38 An operator $A : X \supseteq D(A) \to X$ can be the generator of at most one $C_0$-semigroup.

PROOF Suppose that $S_1$ and $S_2$ are two $C_0$-semigroups with generator $A$. Let $x \in D(A)$ and $t > 0$ and define the function
\[
u(s) \stackrel{df}{=} S_1(t - s) S_2(s)x \qquad \forall\, s \in (0, t).
\]
From Proposition 3.3.36(b), we know that $u \in C^1\big((0, t); X\big)$ and
\[
u'(s) = -AS_1(t - s)S_2(s)x + S_1(t - s)AS_2(s)x = -S_1(t - s)AS_2(s)x + S_1(t - s)AS_2(s)x = 0,
\]
so
\[
S_1(t)x - S_2(t)x = u(0) - u(t) = -\int_0^t u'(s) \, ds = 0 \qquad \forall\, x \in D(A). \tag{3.76}
\]
Because $D(A)$ is dense in $X$ (see Proposition 3.3.36(d)), from (3.76), it follows that $S_1(t) = S_2(t)$ for all $t \ge 0$.

If $\{S(t)\}_{t \ge 0}$ is a $C_0$-semigroup on a Banach space $X$, then $\{S(t)^*\}_{t \ge 0}$ still has the semigroup property but need not be a $C_0$-semigroup. In fact in general we can only show that for every $x^* \in X^*$, $t \mapsto S(t)^* x^*$ is weakly$^*$-continuous at $t = 0$, i.e.,
\[
w^*\text{-}\lim_{t \searrow 0} S(t)^* x^* = x^*.
\]
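So the map $S(t) \mapsto S(t)^*$ does not preserve the strong continuity at $t = 0$.

EXAMPLE 3.3.39 Let $X = C_0(\mathbb{R})$ (see Section 2.3) and let $\{S(t)\}_{t \ge 0}$ be the $C_0$-semigroup of left translations, i.e.,
\[
\big( S(t)f \big)(s) \stackrel{df}{=} f(t + s) \qquad \forall\, t \ge 0, \ s \in \mathbb{R}, \ f \in C_0(\mathbb{R}).
\]
We know that $X^* = NBV(\mathbb{R})$, the space of all normalized functions of bounded variation with the total variation norm
\[
\|\vartheta\|_{TV(\mathbb{R})} = (\operatorname{Var} \vartheta)(\mathbb{R}) = \int_{\mathbb{R}} |d\vartheta(t)|.
\]
By saying that $\vartheta$ is normalized, we mean that
\[
\vartheta(s) = \frac{\vartheta(s^+) + \vartheta(s^-)}{2} \qquad \forall\, s \in \mathbb{R}
\]
and
\[
\vartheta(-\infty) = \lim_{s \to -\infty} \vartheta(s) = 0 \quad \text{and} \quad \vartheta(+\infty) = \lim_{s \to +\infty} \vartheta(s) = 0
\]
(see also Theorem 2.3.41). For all $f \in C_0(\mathbb{R})$ and all $\vartheta \in NBV(\mathbb{R})$, we have
\[
\langle \vartheta, S(t)f \rangle = \int_{\mathbb{R}} f(t + s) \, d\vartheta(s) = \int_{\mathbb{R}} f(s) \, d_s \vartheta(s - t) = \langle S(t)^* \vartheta, f \rangle,
\]
so
\[
\big( S(t)^* \vartheta \big)(s) = \vartheta(s - t),
\]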
i.e., $\{S(t)^*\}_{t \ge 0}$ is the right translation of $\vartheta$. By a theorem of Plessner (1929), we know that
\[
\Big[ \|\vartheta(\cdot - t) - \vartheta(\cdot)\|_{TV(\mathbb{R})} \longrightarrow 0 \ \text{as } t \searrow 0 \Big] \iff \Big[ \vartheta \in AC(\mathbb{R}) \Big].
\]
So the function $t \mapsto S(t)^* \vartheta$ is not in general strongly continuous at $t = 0$ (unless of course $\vartheta \in AC(\mathbb{R})$).

Note that in the previous example $X = C_0(\mathbb{R})$ is not a reflexive Banach space. This is not an accident.

PROPOSITION 3.3.40 If $H$ is a Hilbert space, $H^* = H$ (i.e., $H$ is a pivot space) and $\{S(t)\}_{t \ge 0}$ is a $C_0$-semigroup on $H$, then $\{S(t)^*\}_{t \ge 0}$ is a $C_0$-semigroup on $H$.

PROOF Clearly $S(\cdot)^*$ satisfies the semigroup property. Therefore we need to show that for all $h \in H$, the map $t \mapsto S(t)^* h$ is strongly continuous at $t = 0$. For every $x, h \in H$, the function
\[
s \longmapsto \big( S(s)^* h, x \big)_H = \big( h, S(s)x \big)_H
\]
is continuous on $\mathbb{R}_+$. Therefore for all $h \in H$, the function $s \mapsto S(s)^* h$ is weakly continuous. So it follows from the uniform boundedness principle (see Theorem A.3.4) and the semigroup property that the function $s \mapsto S(s)^* h$ is bounded on any compact interval of $\mathbb{R}_+$. Also it is weakly measurable. Moreover, if $\{s_n\}_{n \ge 1}$ is an enumeration of the rationals in $\mathbb{R}_+$ and we consider
\[
L \stackrel{df}{=} \operatorname{span}_{\mathbb{Q}} \big\{ S(s_n)^* h \big\}_{n \ge 1}
\]
(i.e., finite linear combinations of $\{S(s_n)^* h\}_{n \ge 1}$ with rational coefficients), then $L$ is countable and $H_0 \stackrel{df}{=} \overline{\operatorname{span}}\, L$ is separable. Since the function $s \mapsto S(s)^* h$ is weakly continuous, we see that $\{S(s)^* h\}_{s \ge 0} \subseteq H_0$. Therefore we have proved that the function $s \mapsto S(s)^* h$ is weakly measurable and separably valued; hence by Theorem 2.1.3, it is strongly measurable. We infer that $S(\cdot)^* h \in L^1_{loc}(\mathbb{R}_+; H)$. If $t > 0$ and $\eta \in (0, t)$, we have
\[
\begin{aligned}
\|S(t + r)^* h - S(t)^* h\|_H
&= \left\| \frac{1}{\eta} \int_0^{\eta} \big( S(t + r)^* h - S(t)^* h \big) \, d\tau \right\|_H \\
&= \left\| \frac{1}{\eta} \int_0^{\eta} S(\tau)^* \big( S(t + r - \tau)^* h - S(t - \tau)^* h \big) \, d\tau \right\|_H \\
&\le \frac{M_1}{\eta} \int_0^{\eta} \|S(t + r - \tau)^* h - S(t - \tau)^* h\|_H \, d\tau,
\end{aligned}
\tag{3.77}
\]
for some $M_1 > 0$, so by Corollary 2.3.8, we have
\[
\lim_{r \searrow 0} \|S(t + r)^* h - S(t)^* h\|_H \le \lim_{r \searrow 0} \frac{M_1}{\eta} \int_0^{\eta} \|S(t + r - \tau)^* h - S(t - \tau)^* h\|_H \, d\tau = 0.
\]
Finally let $t_n \searrow 0$ and
\[
C \stackrel{df}{=} \operatorname{conv} \big\{ S(t_n)^* h \big\}_{n \ge 1}.
\]
Because
\[
S(t_n)^* h \stackrel{w}{\longrightarrow} h \quad \text{in } H,
\]
we have that $h \in \overline{C}$. So if
\[
E \stackrel{df}{=} \operatorname{span} \big\{ S(t_n)^* h \big\}_{n \ge 1},
\]
then from (3.77), it follows that
\[
\lim_{n \to +\infty} S(t_n)^* h = h \qquad \forall\, h \in E.
\]
Because
\[
\sup_{n \ge 1} \|S(t_n)^*\|_{\mathcal{L}} \le M_2,
\]
for some $M_2 > 0$, we conclude that
\[
\lim_{n \to +\infty} S(t_n)^* h = h \qquad \forall\, h \in H. \tag{3.78}
\]
REMARK 3.3.41 In fact the result is true if $H$ is replaced by a reflexive Banach space. This is a consequence of the fact that if $\{S(t)\}_{t \ge 0}$ is a semigroup of linear operators, such that for all $x \in X$, the function $t \mapsto S(t)x$ is strongly measurable (this is the case if for all $x \in X$, the function $t \mapsto S(t)x$ is weakly continuous), then $S$ is a $C_0$-semigroup. For details we refer to Hille & Phillips (1957, pp. 305–306).

Of great importance in applications are theorems which give necessary and sufficient conditions for an operator $A$ to be the infinitesimal generator of a $C_0$-semigroup. The basic result in this direction is the celebrated Hille-Yosida theorem. To state and prove this fundamental result we need some preparation.

DEFINITION 3.3.42 Let $X$ be a Banach space and let $A : X \supseteq D(A) \to X$ be a closed, linear operator.
(a) The resolvent set $\varrho(A)$ of $A$ is defined by
\[
\varrho(A) \stackrel{df}{=} \big\{ \lambda \in \mathbb{R} : \lambda \mathrm{id}_X - A : X \supseteq D(A) \to X \ \text{is bijective} \big\}.
\]
(b) If $\lambda \in \varrho(A)$, then the resolvent operator $R_\lambda : X \to X$ is defined by
\[
R_\lambda x \stackrel{df}{=} (\lambda \mathrm{id}_X - A)^{-1} x \qquad \forall\, x \in X.
\]
REMARK 3.3.43 It is easy to check that $R_\lambda$ is closed. So by the closed graph theorem (see Theorem A.3.7), we have that $R_\lambda \in \mathcal{L}(X)$. Moreover, we have $AR_\lambda x = R_\lambda Ax$ for all $x \in D(A)$ (see the proof of Proposition 3.3.44(b)).

PROPOSITION 3.3.44 If $X$ is a Banach space and $A : X \supseteq D(A) \to X$ is a closed, linear operator, then
(a) for $\lambda, \mu \in \varrho(A)$, we have $R_\lambda - R_\mu = (\mu - \lambda) R_\lambda R_\mu$ (resolvent identity) and $R_\lambda R_\mu = R_\mu R_\lambda$;
(b) if $A$ is the generator of a $C_0$-semigroup $S$ and
\[
\|S(t)\|_{\mathcal{L}} \le M e^{\omega t} \qquad \forall\, t \ge 0,
\]
then $\lambda \in \varrho(A)$ for all $\lambda > \omega$ and
\[
R_\lambda x = \int_0^{+\infty} e^{-\lambda t} S(t)x \, dt \qquad \forall\, x \in X.
\]
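PROOF (a) Let $x \in D(A)$. We have
\[
(\lambda \mathrm{id}_X - A) \big[ R_\lambda - R_\mu \big] (\mu \mathrm{id}_X - A)(x) = (\mu \mathrm{id}_X - A)x - (\lambda \mathrm{id}_X - A)x = (\mu - \lambda)x,
\]
so
\[
R_\lambda - R_\mu = (\mu - \lambda) R_\lambda R_\mu. \tag{3.79}
\]
The commutation of $R_\lambda$ and $R_\mu$ follows by interchanging $\lambda$ and $\mu$ in (3.79).

(b) Let us set
\[
\widehat{R}_\lambda x \stackrel{df}{=} \int_0^{+\infty} e^{-\lambda t} S(t)x \, dt \qquad \forall\, \lambda > \omega, \ x \in X.
\]
This operator is well defined, since
\[
\|e^{-\lambda t} S(t)x\|_X \le M e^{(\omega - \lambda)t} \|x\|_X
\]
and the function $t \mapsto e^{-\lambda t} S(t)x$ is continuous, thus strongly measurable. We have
\[
\|\widehat{R}_\lambda\|_{\mathcal{L}} \le M \int_0^{+\infty} e^{-(\lambda - \omega)t} \, dt \le \frac{M}{\lambda - \omega},
\]
i.e., $\widehat{R}_\lambda \in \mathcal{L}(X)$. We show that $\widehat{R}_\lambda x \in D(A)$ and
\[
(\lambda \mathrm{id}_X - A) \widehat{R}_\lambda x = x \qquad \forall\, x \in X.
\]
We have
\[
\begin{aligned}
\frac{S(s) - \mathrm{id}_X}{s} \, \widehat{R}_\lambda x
&= \frac{1}{s} \int_0^{+\infty} e^{-\lambda t} \big( S(t + s) - S(t) \big) x \, dt \\
&= \frac{e^{\lambda s} - 1}{s} \int_s^{+\infty} e^{-\lambda t} S(t)x \, dt - \frac{1}{s} \int_0^s e^{-\lambda t} S(t)x \, dt.
\end{aligned}
\]
Passing to the limit as $s \searrow 0$, we obtain
\[
A \widehat{R}_\lambda x = \lambda \widehat{R}_\lambda x - x \qquad \forall\, x \in X. \tag{3.80}
\]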
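Using Proposition 3.3.36(b) and Theorem 2.1.17, for all $x \in D(A)$, we have
\[
\widehat{R}_\lambda Ax = \int_0^{+\infty} e^{-\lambda t} S(t)Ax \, dt = \int_0^{+\infty} e^{-\lambda t} AS(t)x \, dt = A \int_0^{+\infty} e^{-\lambda t} S(t)x \, dt = A \widehat{R}_\lambda x. \tag{3.81}
\]
From (3.80) and (3.81), it follows that
\[
\widehat{R}_\lambda (\lambda \mathrm{id}_X - A)x = x \quad \forall\, x \in D(A) \qquad \text{and} \qquad (\lambda \mathrm{id}_X - A) \widehat{R}_\lambda x = x \quad \forall\, x \in X,
\]
so
\[
\widehat{R}_\lambda = R_\lambda \quad \text{and} \quad \lambda \in \varrho(A).
\]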
REMARK 3.3.45 Because of Proposition 3.3.44(b), we see that the resolvent operator is the Laplace transform of the $C_0$-semigroup generated by $A$. Hence the function $\lambda \mapsto R_\lambda$ is analytic on $\varrho(A)$.

Now we are ready for the theorem characterizing the generators of $C_0$-semigroups.

THEOREM 3.3.46 (Hille-Yosida Theorem) If $X$ is a Banach space and $A : X \supseteq D(A) \to X$ is a closed, densely defined, linear operator, then $A$ is the generator of a $C_0$-semigroup if and only if there exist $M \ge 1$ and $\omega \in \mathbb{R}$, such that
\[
\|R_\lambda^k\|_{\mathcal{L}} \le \frac{M}{(\lambda - \omega)^k} \qquad \forall\, \lambda \in \varrho(A), \ \lambda > \omega, \ k \ge 1.
\]
In this case we have
\[
\|S(t)\|_{\mathcal{L}} \le M e^{\omega t} \qquad \forall\, t \ge 0.
\]
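PROOF "$\Longrightarrow$": From Proposition 3.3.44, we know that if $\lambda > \omega$, then $\lambda \in \varrho(A)$ and we have
\[
R_\lambda x = \int_0^{+\infty} e^{-\lambda t} S(t)x \, dt \qquad \forall\, x \in X,
\]
so, writing $R_\lambda^{(k-1)}$ for the derivative of order $k - 1$ of $\lambda \mapsto R_\lambda$,
\[
\frac{d^{k-1}}{d\lambda^{k-1}} R_\lambda x = R_\lambda^{(k-1)} x = \int_0^{+\infty} (-t)^{k-1} e^{-\lambda t} S(t)x \, dt \qquad \forall\, k \ge 1
\]
and thus
\[
\|R_\lambda^{(k-1)}\|_{\mathcal{L}} \le M \int_0^{+\infty} t^{k-1} e^{-(\lambda - \omega)t} \, dt = M (k - 1)! \, (\lambda - \omega)^{-k}. \tag{3.82}
\]
But the function $\lambda \mapsto R_\lambda$ is analytic on $\varrho(A)$ (see Remark 3.3.45). Hence
\[
R_\lambda^{(k-1)} = (-1)^{k-1} (k - 1)! \, R_\lambda^k. \tag{3.83}
\]
From (3.82) and (3.83), it follows that
\[
\|R_\lambda^k\|_{\mathcal{L}} \le \frac{M}{(\lambda - \omega)^k}.
\]
"$\Longleftarrow$": For $\lambda > \omega$, let us set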
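\[
A_\lambda \stackrel{df}{=} \lambda^2 R_\lambda - \lambda \mathrm{id}_X.
\]
We have that $A_\lambda \in \mathcal{L}(X)$ (since $R_\lambda \in \mathcal{L}(X)$) and so we obtain the $C_0$-semigroup
\[
S_\lambda(t) = e^{A_\lambda t} = e^{-\lambda t} \sum_{n=0}^{\infty} \frac{(\lambda^2 t)^n}{n!} (\lambda \mathrm{id}_X - A)^{-n}
\]
(see Remark 3.3.30). We shall show that as $\lambda \to +\infty$, $S_\lambda(t)$ converges in the strong operator topology to $S(t)$, $t \ge 0$, which is the desired semigroup. Note that
\[
x = R_\lambda (\lambda \mathrm{id}_X - A)x = \lambda R_\lambda x - R_\lambda Ax = \lambda R_\lambda x - AR_\lambda x \qquad \forall\, x \in D(A)
\]
(see Remark 3.3.43), so
\[
A_\lambda x = \lambda R_\lambda Ax = \lambda AR_\lambda x. \tag{3.84}
\]
Also, we have
\[
\|\lambda R_\lambda x - x\|_X = \|R_\lambda Ax\|_X \le \frac{M}{\lambda - \omega} \|Ax\|_X \qquad \forall\, x \in D(A),
\]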
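so
\[
\lambda R_\lambda x \longrightarrow x \quad \text{in } X \ \text{as } \lambda \to +\infty, \qquad \forall\, x \in D(A).
\]
But $D(A)$ is dense in $X$. So for a given $x \in X$, we can find a sequence $\{x_m\}_{m \ge 1} \subseteq D(A)$, such that $x_m \to x$ in $X$. Then for $\lambda_n \to +\infty$ as $n \to +\infty$, we have
\[
\lambda_n R_{\lambda_n} x_m \longrightarrow x_m \quad \text{in } X \ \text{as } n \to +\infty.
\]
By the double limit lemma (see Proposition A.2.35), we can find an increasing sequence $\{m(n)\}_{n \ge 1}$ (not necessarily strictly) to $+\infty$, such that
\[
\lambda_n R_{\lambda_n} x_{m(n)} \longrightarrow x \quad \text{in } X \ \text{as } n \to +\infty.
\]
Then we have
\[
\begin{aligned}
\|\lambda_n R_{\lambda_n} x - x\|_X
&\le \|\lambda_n R_{\lambda_n} x - \lambda_n R_{\lambda_n} x_{m(n)}\|_X + \|\lambda_n R_{\lambda_n} x_{m(n)} - x\|_X \\
&\le \frac{\lambda_n M}{\lambda_n - \omega} \|x - x_{m(n)}\|_X + \|\lambda_n R_{\lambda_n} x_{m(n)} - x\|_X \longrightarrow 0,
\end{aligned}
\]
so
\[
\lambda R_\lambda x \longrightarrow x \quad \text{in } X \ \text{as } \lambda \to +\infty \qquad \forall\, x \in X.
\]
Then because of (3.84), we have
\[
A_\lambda x \longrightarrow Ax \quad \text{in } X \ \text{as } \lambda \to +\infty \qquad \forall\, x \in D(A).
\]
For every $\lambda > \omega$ and $t \ge 0$, we have
\[
\|S_\lambda(t)\|_{\mathcal{L}} \le e^{-\lambda t} \sum_{n=0}^{\infty} \frac{(\lambda^2 t)^n}{n!} \frac{M}{(\lambda - \omega)^n} = M e^{\frac{\lambda \omega}{\lambda - \omega} t}.
\]
Also from Proposition 3.3.44(a), we have that
\[
A_\lambda A_\mu = A_\mu A_\lambda \qquad \forall\, \lambda, \mu > 0
\]
and so
\[
A_\lambda S_\mu(t) = S_\mu(t) A_\lambda \qquad \forall\, \lambda, \mu > 0, \ t \ge 0.
\]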
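From Proposition 3.3.36(b), it follows that
\[
\begin{aligned}
S_\lambda(t)x - S_\mu(t)x
&= \int_0^t \frac{d}{ds} \big( S_\mu(t - s) S_\lambda(s)x \big) \, ds \\
&= \int_0^t S_\mu(t - s)(A_\lambda - A_\mu) S_\lambda(s)x \, ds \\
&= \int_0^t S_\mu(t - s) S_\lambda(s)(A_\lambda - A_\mu)x \, ds \qquad \forall\, x \in D(A),
\end{aligned}
\]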
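so
\[
\|S_\lambda(t)x - S_\mu(t)x\|_X \le M^2 e^{\frac{\mu \omega}{\mu - \omega} t} \, \|(A_\lambda - A_\mu)x\|_X \int_0^t e^{-\frac{(\lambda - \mu)\omega^2}{(\mu - \omega)(\lambda - \omega)} s} \, ds.
\]
Let $\lambda \ge \mu$. Then the integrand is at most $1$, so the integral is at most $t$ and we have
\[
\|S_\lambda(t)x - S_\mu(t)x\|_X \le M^2 \, t \, e^{\frac{\mu \omega}{\mu - \omega} t} \, \|(A_\lambda - A_\mu)x\|_X \longrightarrow 0 \quad \text{as } \lambda, \mu \to +\infty,
\]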
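and thus $S_\lambda(t)x$ converges to some limit as $\lambda \to +\infty$ uniformly on compact intervals. Denote this limit by $S(t)x$, $x \in D(A)$. As before exploiting the density of $D(A)$ in $X$, we have that
\[
S_\lambda(t)x \longrightarrow S(t)x \quad \text{in } X \ \text{as } \lambda \to +\infty \qquad \forall\, x \in X
\]
and the convergence is uniform on compact intervals in $\mathbb{R}_+$. This means that the function $t \mapsto S(t)x$ is continuous and since $S$ clearly satisfies the semigroup property and $S(0) = \mathrm{id}_X$, we conclude that $S$ is a $C_0$-semigroup.

It remains to show that $A$ is the generator of $S$. Let $\widehat{A}$ be the generator of $S$. From Proposition 3.3.36(c), we know that
\[
S_\lambda(t)x - x = \int_0^t S_\lambda(s) A_\lambda x \, ds \qquad \forall\, t \ge 0, \ \lambda > 0, \ x \in D(A). \tag{3.85}
\]
Note that
\[
\begin{aligned}
\|S_\lambda(s) A_\lambda x - S(s)Ax\|_X
&\le \|S_\lambda(s) A_\lambda x - S_\lambda(s)Ax\|_X + \|S_\lambda(s)Ax - S(s)Ax\|_X \\
&\le \|S_\lambda(s)\|_{\mathcal{L}} \, \|A_\lambda x - Ax\|_X + \|S_\lambda(s)Ax - S(s)Ax\|_X \qquad \forall\, s \in [0, t], \ \lambda > 0, \ x \in D(A),
\end{aligned}
\]
so
\[
\|S_\lambda(s) A_\lambda x - S(s)Ax\|_X \longrightarrow 0 \quad \text{as } \lambda \to +\infty.
\]
Thus if we pass to the limit as $\lambda \to +\infty$ in (3.85), we obtain
\[
S(t)x - x = \int_0^t S(s)Ax \, ds \qquad \forall\, t \ge 0, \ x \in D(A),
\]
so
\[
D(A) \subseteq D(\widehat{A}) \quad \text{and} \quad \widehat{A}x = Ax \qquad \forall\, x \in D(A),
\]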
i.e., $\widehat{A}$ is an extension of $A$. Now if $\lambda > \omega$, then $\lambda \in \varrho(A) \cap \varrho(\widehat{A})$ and so
\[
(\lambda \mathrm{id}_X - \widehat{A})\big( D(A) \big) = (\lambda \mathrm{id}_X - A)\big( D(A) \big) = X.
\]
Thus $\big( \lambda \mathrm{id}_X - \widehat{A} \big)\big|_{D(A)}$ is bijective and so $D(A) = D(\widehat{A})$, i.e., $A = \widehat{A}$.

REMARK 3.3.47 The operator $A_\lambda \in \mathcal{L}(X)$ introduced in the above proof is known as the Yosida approximation of $A$. Note that if $A$ is also dissipative (see Remark 3.3.2), then $A_\lambda$ coincides with the notion introduced in Definition 3.3.11. However, there $A$ was not necessarily linear.

The following generation theorem for perturbed operators is useful in applications and is known as the Phillips theorem.

THEOREM 3.3.48 (Phillips Theorem) If $X$ is a Banach space, $A : X \supseteq D(A) \to X$ is the generator of a $C_0$-semigroup and $B \in \mathcal{L}(X)$, then $A + B : X \supseteq D(A) \to X$ is also the generator of a $C_0$-semigroup.

Another important generation result is the so-called Lumer-Phillips theorem.

THEOREM 3.3.49 (Lumer-Phillips Theorem) If $X$ is a Banach space and $A : X \supseteq D(A) \to X$ is a densely defined, linear, m-dissipative operator, then $A$ is the generator of a contraction semigroup.

PROOF By Lemma 3.3.3, we have that
\[
\|\lambda x - Ax\|_X \ge \lambda \|x\|_X \qquad \forall\, \lambda > 0, \ x \in D(A).
\]
Also
\[
R(\lambda \mathrm{id}_X - A) = X \qquad \forall\, \lambda > 0,
\]
due to the m-dissipativity of $A$ (see Definition 3.3.1 and Proposition 3.3.14). It follows that $(0, +\infty) \subseteq \varrho(A)$ and
\[
\|R_\lambda\|_{\mathcal{L}} \le \frac{1}{\lambda} \qquad \forall\, \lambda > 0,
\]
hence $\|R_\lambda^k\|_{\mathcal{L}} \le \lambda^{-k}$ for all $k \ge 1$. Then by Theorem 3.3.46 (with $M = 1$ and $\omega = 0$), $A$ generates a contraction semigroup.
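To see the Yosida approximation from the proof of Theorem 3.3.46 at work, one can look at the simplest possible case, $X = \mathbb{R}$ and $A = a$ with $a \le 0$, where $R_\lambda = 1/(\lambda - a)$ and $A_\lambda = \lambda^2 R_\lambda - \lambda = \lambda a/(\lambda - a)$. The sketch below is an illustration only; the value of `a`, the time `t` and the function name `yosida` are assumptions made for the example.

```python
import math

def yosida(a, lam):
    """Scalar Yosida approximation A_lam = lam^2 * R_lam - lam, with R_lam = 1/(lam - a)."""
    resolvent = 1.0 / (lam - a)
    return lam * lam * resolvent - lam

# Illustrative data (assumed): a = -2 generates the contraction semigroup e^{-2t}.
a = -2.0
t = 1.0
lams = (10.0, 100.0, 1000.0)

# A_lam -> a as lam -> +infinity ...
generator_gaps = [abs(yosida(a, lam) - a) for lam in lams]
# ... and S_lam(t) = exp(t * A_lam) -> exp(t * a).
semigroup_gaps = [abs(math.exp(t * yosida(a, lam)) - math.exp(t * a)) for lam in lams]
```

In this scalar setting both lists decrease towards zero as $\lambda$ grows, mirroring the convergences $A_\lambda x \to Ax$ and $S_\lambda(t)x \to S(t)x$ used in the proof.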
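Recall that if $X$ is a Banach space and $A \in \mathcal{L}(X)$, then $A$ is the generator of the semigroup
\[
S(t) \stackrel{df}{=} e^{tA} \qquad \forall\, t \ge 0.
\]
Moreover, from elementary analysis, we know that for the exponential function, we have
\[
e^{-at} = \lim_{n \to +\infty} \left( 1 + \frac{at}{n} \right)^{-n}.
\]
In the next theorem, we show that even if $A$ is unbounded, the limit expression is valid for the semigroup generated by $A$. The result is known as the exponential formula. First a lemma.

LEMMA 3.3.50 If $X$ is a Banach space and $B \in \mathcal{L}(X)$ with $\|B\|_{\mathcal{L}} \le 1$, then
\[
\|e^{n(B - \mathrm{id}_X)} x - B^n x\|_X \le \sqrt{n} \, \|x - Bx\|_X \qquad \forall\, n \ge 1, \ x \in X.
\]

PROOF For $k \ge n$, we have
\[
\|B^k x - B^n x\|_X = \left\| \sum_{m=n}^{k-1} \big( B^{m+1} x - B^m x \big) \right\|_X \le \|x - Bx\|_X \sum_{m=n}^{k-1} \|B^m\|_{\mathcal{L}} \le |k - n| \, \|x - Bx\|_X
\]
and the same bound holds for $k < n$ by symmetry. Then for any $t \ge 0$, we have
\[
\begin{aligned}
\|e^{t(B - \mathrm{id}_X)} x - B^n x\|_X
&= \left\| e^{-t} \sum_{k=0}^{\infty} \frac{t^k}{k!} \big( B^k x - B^n x \big) \right\|_X \\
&\le e^{-t} \left( \sum_{k=0}^{\infty} \frac{t^k}{k!} |k - n| \right) \|x - Bx\|_X \\
&\le e^{-t} \left( \sum_{k=0}^{\infty} \frac{t^k}{k!} \right)^{\frac{1}{2}} \left( \sum_{k=0}^{\infty} \frac{t^k}{k!} (k - n)^2 \right)^{\frac{1}{2}} \|x - Bx\|_X \\
&= e^{-\frac{t}{2}} \big( t^2 - (2n - 1)t + n^2 \big)^{\frac{1}{2}} e^{\frac{t}{2}} \|x - Bx\|_X.
\end{aligned}
\]
Let $t = n$. We conclude that
\[
\|e^{n(B - \mathrm{id}_X)} x - B^n x\|_X \le \sqrt{n} \, \|x - Bx\|_X.
\]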
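Both the inequality of Lemma 3.3.50 and the second-moment identity used in its proof can be tested in the scalar case $X = \mathbb{R}$, $B = b$ with $|b| \le 1$. The following sketch is only an illustration under that assumption; the sample values of `b` and `n` and the function names are not from the text.

```python
import math

def lemma_sides(b, n):
    """Return (|e^{n(b-1)} - b^n|, sqrt(n) * |1 - b|) for a scalar contraction b."""
    lhs = abs(math.exp(n * (b - 1.0)) - b ** n)
    rhs = math.sqrt(n) * abs(1.0 - b)
    return lhs, rhs

def poisson_second_moment(t, n, terms=400):
    """e^{-t} * sum_k t^k / k! * (k - n)^2, which should equal t^2 - (2n - 1) t + n^2."""
    term = math.exp(-t)          # e^{-t} t^0 / 0!
    total = term * (0 - n) ** 2
    for k in range(1, terms):
        term *= t / k            # now e^{-t} t^k / k!
        total += term * (k - n) ** 2
    return total

# Illustrative sample (assumed) of scalar contractions and exponents.
checks = [lemma_sides(b, n)
          for b in (-1.0, -0.5, 0.0, 0.3, 0.9, 1.0)
          for n in (1, 2, 5, 20, 100)]
moment_at_t_equals_n = poisson_second_moment(5.0, 5)
```

Every pair in `checks` satisfies `lhs <= rhs`, and `moment_at_t_equals_n` is close to 5, in line with the choice $t = n$ that produces the factor $\sqrt{n}$.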
Using this auxiliary result we can prove the exponential formula.
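THEOREM 3.3.51 If $X$ is a Banach space and $S$ is a contraction semigroup on $X$ with generator $A$, then
\[
S(t)x = \lim_{n \to +\infty} \left( \mathrm{id}_X - \frac{t}{n} A \right)^{-n} x = \lim_{n \to +\infty} \left( \frac{n}{t} R_{\frac{n}{t}} \right)^n x \qquad \forall\, x \in X.
\]

PROOF For every $n \ge 1$ and $t > 0$, we have
\[
\frac{n}{t} R_{\frac{n}{t}} = \frac{n}{t} \left( \frac{n}{t} \mathrm{id}_X - A \right)^{-1} = \left( \mathrm{id}_X - \frac{t}{n} A \right)^{-1}.
\]
Also we have
\[
t A_{\frac{n}{t}} = t \left( \frac{n^2}{t^2} R_{\frac{n}{t}} - \frac{n}{t} \mathrm{id}_X \right) = n \left( \frac{n}{t} R_{\frac{n}{t}} - \mathrm{id}_X \right).
\]
Note that
\[
\left\| \frac{n}{t} R_{\frac{n}{t}} \right\|_{\mathcal{L}} \le 1.
\]
So we can apply Lemma 3.3.50 with $B = \frac{n}{t} R_{\frac{n}{t}}$. We obtain
\[
\left\| \exp\left( n \left( \frac{n}{t} R_{\frac{n}{t}} - \mathrm{id}_X \right) \right) x - \left( \frac{n}{t} R_{\frac{n}{t}} \right)^n x \right\|_X \le \sqrt{n} \, \left\| \frac{n}{t} R_{\frac{n}{t}} x - x \right\|_X.
\]
From the proof of Theorem 3.3.46, we know that
\[
\left\| \frac{n}{t} R_{\frac{n}{t}} x - x \right\|_X \le \frac{t}{n} \|Ax\|_X \qquad \forall\, x \in D(A).
\]
Therefore it follows that
\[
\left\| \exp\big( t A_{\frac{n}{t}} \big) x - \left( \frac{n}{t} R_{\frac{n}{t}} \right)^n x \right\|_X \le \frac{t}{\sqrt{n}} \|Ax\|_X \qquad \forall\, x \in D(A).
\]
Again from the proof of Theorem 3.3.46, we know that
\[
S(t)x = \lim_{n \to +\infty} \exp\big( t A_{\frac{n}{t}} \big) x \qquad \forall\, x \in X.
\]
So we infer that for fixed $t > 0$,
\[
S(t)x = \lim_{n \to +\infty} \left( \frac{n}{t} R_{\frac{n}{t}} \right)^n x \qquad \forall\, x \in D(A). \tag{3.86}
\]
But
\[
\big\| e^{t A_{\frac{n}{t}}} \big\|_{\mathcal{L}} \le 1 \quad \text{and} \quad \left\| \left( \frac{n}{t} R_{\frac{n}{t}} \right)^n \right\|_{\mathcal{L}} \le 1
\]
and $D(A)$ is dense in $X$. So (3.86) is valid for all $x \in X$.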
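The exponential formula can be illustrated in the scalar case $X = \mathbb{R}$, $A = a \le 0$, where $\left( \mathrm{id}_X - \frac{t}{n} A \right)^{-n}$ is simply $(1 - ta/n)^{-n}$, i.e., $n$ steps of the implicit (backward) Euler scheme. The sketch below is a hedged illustration; the values of `a` and `t` and the function name are assumptions made for the example.

```python
import math

def resolvent_power(a, t, n):
    """Scalar version of (id - (t/n) A)^{-n}: n backward Euler steps of size t/n."""
    return (1.0 - t * a / n) ** (-n)

# Illustrative data (assumed): a = -3 generates the contraction semigroup e^{-3t}.
a, t = -3.0, 1.5
steps = (10, 100, 1000, 10000)
approximations = [resolvent_power(a, t, n) for n in steps]
errors = [abs(v - math.exp(t * a)) for v in approximations]
```

Here `errors` decreases as `n` grows and every entry of `approximations` lies in $(0, 1]$, which matches the bound $\| ( \frac{n}{t} R_{n/t} )^n \|_{\mathcal{L}} \le 1$ used in the proof.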
Before passing to the nonlinear semigroup theory, let us see how we can use semigroups to extend the notion of a solution for an inhomogeneous evolution equation. So let $X$ be a Banach space, $A$ the generator of a $C_0$-semigroup $S$ and $T = [0, b]$. Let $f \in L^1(T; X)$ and consider the evolution equation
\[
\begin{cases}
x'(t) = Ax(t) + f(t) \quad \forall\, t \in T = [0, b], \\
x(0) = x_0.
\end{cases}
\tag{3.87}
\]

DEFINITION 3.3.52 (a) A function $x \in W^{1,1}\big((0, b); X\big)$ is a strong solution of (3.87), if $x(0) = x_0$ and it satisfies the equation almost everywhere (hence $x(t) \in D(A)$ for almost all $t \in T$).
(b) A function $x \in C(T; X)$ is a weak solution of (3.87), if for all $x^* \in D(A^*)$, the function $t \mapsto \langle x^*, x(t) \rangle_X$ is absolutely continuous and
\[
\langle x^*, x(t) \rangle_X = \langle x^*, x_0 \rangle_X + \int_0^t \langle A^* x^*, x(s) \rangle_X \, ds + \int_0^t \langle x^*, f(s) \rangle_X \, ds \qquad \forall\, t \in T.
\]
(c) A function $x \in C(T; X)$ is a mild solution of (3.87), if
\[
x(t) = S(t)x_0 + \int_0^t S(t - s) f(s) \, ds \qquad \forall\, t \in T.
\]

The following interesting result is due to Ball (1977), where the reader can find its proof.

THEOREM 3.3.53 If $X$ is a Banach space, $A$ is the generator of a $C_0$-semigroup and $f \in L^1(T; X)$, then $x \in C(T; X)$ is a mild solution of (3.87) if and only if it is a weak solution.

REMARK 3.3.54 In contrast to the strong solution, the mild solution makes sense without having that $x(t) \in D(A)$ for a.a. $t \in T$. Also we need not have $x_0 \in D(A)$ (nonregular initial condition). Moreover, it is easy to check that the mild solution map $(f, x_0) \mapsto x(\cdot; f, x_0)$ is Lipschitz continuous on $L^1(T; X) \times X$.
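The mild solution formula in Definition 3.3.52(c) is the variation of constants formula, and it can be evaluated directly. As a minimal sketch, assume $X = \mathbb{R}$ and $A = a$, so that $S(t) = e^{at}$; the convolution integral is approximated by the composite trapezoid rule. The function name, the quadrature rule and the test data are assumptions chosen for illustration.

```python
import math

def mild_solution(a, x0, f, t, steps=2000):
    """x(t) = e^{a t} x0 + int_0^t e^{a (t - s)} f(s) ds, via the trapezoid rule."""
    h = t / steps
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * math.exp(a * (t - s)) * f(s)
    return math.exp(a * t) * x0 + h * total

# Illustrative data (assumed): x' = -x + 1, x(0) = 2, whose solution is 1 + e^{-t}.
a, x0 = -1.0, 2.0
value = mild_solution(a, x0, lambda s: 1.0, 1.0)
exact = 1.0 + math.exp(-1.0)
```

For this smooth forcing term the mild solution coincides with the classical one, so `value` agrees with `exact` up to the quadrature error. The same function accepts a merely integrable `f`, for which the classical derivative need not exist.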
Now we move to nonlinear semigroups.

DEFINITION 3.3.55 Let $X$ be a Banach space and let $C \subseteq X$ be a nonempty set. A family of maps
\[
S(t) : C \longrightarrow C, \qquad t \ge 0
\]
is said to be a semigroup of nonexpansive maps if
(a) $S(0) = \mathrm{id}_C$;
(b) $S(t + s) = S(t) \circ S(s)$ for all $t, s \ge 0$;
(c) $\|S(t)x - S(t)y\|_X \le \|x - y\|_X$ for all $t \ge 0$ and all $x, y \in C$;
(d) $S(t)x \to x$ in $X$ as $t \searrow 0$ for all $x \in C$.

REMARK 3.3.56 Evidently a semigroup $S$ of nonexpansive maps on $C$ can be extended uniquely to a semigroup of nonexpansive maps on $\overline{C}$ and so in Definition 3.3.55 we may assume without any loss of generality that $C \subseteq X$ is closed. If $C = X$ and $S(t) \in \mathcal{L}(X)$, then we recover Definition 3.3.33. Moreover, it is straightforward to check that
\[
\mathbb{R}_+ \times C \ni (t, x) \longmapsto S(t)x \in C
\]
is continuous.

We shall prove a basic generation theorem for nonlinear semigroups of nonexpansive maps. The result will be a nonlinear analog of Theorems 3.3.49 and 3.3.51. To do this we need some preparation. First we prove a combinatorial lemma.

LEMMA 3.3.57 If $n \ge m \ge 1$ are integers and $\alpha, \beta > 0$ are such that $\alpha + \beta = 1$, then
\[
\sum_{k=0}^{m} \binom{n}{k} \alpha^k \beta^{n-k} (m - k) \le \big[ (n\alpha - m)^2 + n\alpha\beta \big]^{\frac{1}{2}}
\]
and
\[
\sum_{k=m}^{n} \binom{k-1}{m-1} \alpha^m \beta^{k-m} (n - k) \le \left[ \frac{m\beta}{\alpha^2} + \left( \frac{m\beta}{\alpha} + m - n \right)^2 \right]^{\frac{1}{2}}.
\]
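PROOF Since $n \ge m$ and using the Cauchy-Schwarz inequality, we have
\[
\begin{aligned}
\sum_{k=0}^{m} \binom{n}{k} \alpha^k \beta^{n-k} (m - k)
&\le \sum_{k=0}^{n} \binom{n}{k} \alpha^k \beta^{n-k} |m - k| \\
&\le \left( \sum_{k=0}^{n} \binom{n}{k} \alpha^k \beta^{n-k} \right)^{\frac{1}{2}} \left( \sum_{k=0}^{n} \binom{n}{k} \alpha^k \beta^{n-k} (m - k)^2 \right)^{\frac{1}{2}}.
\end{aligned}
\tag{3.88}
\]
From the binomial theorem, we know that
\[
\sum_{k=0}^{n} \binom{n}{k} \alpha^k \beta^{n-k} = (\alpha + \beta)^n, \tag{3.89}
\]
\[
\sum_{k=0}^{n} k \binom{n}{k} \alpha^k \beta^{n-k} = \alpha n (\alpha + \beta)^{n-1} \tag{3.90}
\]
and
\[
\sum_{k=0}^{n} k^2 \binom{n}{k} \alpha^k \beta^{n-k} = \alpha^2 n(n - 1)(\alpha + \beta)^{n-2} + \alpha n (\alpha + \beta)^{n-1}. \tag{3.91}
\]
Using equations (3.89), (3.90) and (3.91) in the right hand side of (3.88) and since $\alpha + \beta = 1$, we obtain
\[
\sum_{k=0}^{m} \binom{n}{k} \alpha^k \beta^{n-k} (m - k) \le \big[ (n\alpha - m)^2 + n\alpha\beta \big]^{\frac{1}{2}}.
\]
Also using once more the Cauchy-Schwarz inequality, we have
\[
\begin{aligned}
\sum_{k=m}^{n} \binom{k-1}{m-1} \alpha^m \beta^{k-m} (n - k)
&\le \sum_{k=m}^{\infty} \binom{k-1}{m-1} \alpha^m \beta^{k-m} |n - k| \\
&\le \left( \sum_{k=m}^{\infty} \binom{k-1}{m-1} \alpha^m \beta^{k-m} \right)^{\frac{1}{2}} \left( \sum_{k=m}^{\infty} \binom{k-1}{m-1} \alpha^m \beta^{k-m} (n - k)^2 \right)^{\frac{1}{2}}.
\end{aligned}
\tag{3.92}
\]
Recall that
\[
\sum_{k=m}^{\infty} \binom{k-1}{m-1} \beta^{k-m} = \frac{1}{(1 - \beta)^m} \qquad \forall\, \beta \in (0, 1). \tag{3.93}
\]
Using (3.93) and the identities obtained by differentiating it with respect to $\beta$, in the right hand side of (3.92), we obtain
\[
\sum_{k=m}^{n} \binom{k-1}{m-1} \alpha^m \beta^{k-m} (n - k) \le \left[ \frac{m\beta}{\alpha^2} + \left( \frac{m\beta}{\alpha} + m - n \right)^2 \right]^{\frac{1}{2}}.
\]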
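Both inequalities of Lemma 3.3.57 are finite sums and can be checked by brute force. The sketch below does this for a small grid of parameters; the grid and the function names are illustrative assumptions, and `math.comb` requires Python 3.8 or later.

```python
import math

def binomial_lhs(n, m, alpha):
    """sum_{k=0}^{m} C(n, k) alpha^k beta^(n-k) (m - k), with beta = 1 - alpha."""
    beta = 1.0 - alpha
    return sum(math.comb(n, k) * alpha ** k * beta ** (n - k) * (m - k)
               for k in range(m + 1))

def binomial_rhs(n, m, alpha):
    beta = 1.0 - alpha
    return math.sqrt((n * alpha - m) ** 2 + n * alpha * beta)

def negbin_lhs(n, m, alpha):
    """sum_{k=m}^{n} C(k-1, m-1) alpha^m beta^(k-m) (n - k), with beta = 1 - alpha."""
    beta = 1.0 - alpha
    return sum(math.comb(k - 1, m - 1) * alpha ** m * beta ** (k - m) * (n - k)
               for k in range(m, n + 1))

def negbin_rhs(n, m, alpha):
    beta = 1.0 - alpha
    return math.sqrt(m * beta / alpha ** 2 + (m * beta / alpha + m - n) ** 2)

# Illustrative grid (assumed): all 1 <= m <= n <= 12 and three values of alpha.
grid = [(n, m, alpha)
        for n in range(1, 13)
        for m in range(1, n + 1)
        for alpha in (0.1, 0.5, 0.9)]
first_ok = all(binomial_lhs(n, m, a) <= binomial_rhs(n, m, a) + 1e-12 for n, m, a in grid)
second_ok = all(negbin_lhs(n, m, a) <= negbin_rhs(n, m, a) + 1e-12 for n, m, a in grid)
```

In probabilistic terms the two right hand sides are the root mean square deviations of a binomial and a negative binomial variable from $m$ and $n$ respectively, which is one way to read the Cauchy-Schwarz step in the proof.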
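Using this auxiliary result we can obtain some estimates for the family $\{J_\lambda^n\}_{n \ge 1, \lambda > 0}$.

LEMMA 3.3.58 If $X$ is a Banach space, $A : X \supseteq D(A) \to X$ is an m-accretive operator, $\lambda \ge \mu > 0$ and $n \ge m \ge 1$ are integers, then
(a) $\|J_\lambda^n(x) - x\|_X \le n \|J_\lambda(x) - x\|_X$ for all $x \in X$;
(b)
\[
\|J_\mu^n(x) - J_\lambda^m(x)\|_X \le \sum_{k=0}^{m} \alpha^k \beta^{n-k} \binom{n}{k} \|J_\lambda^{m-k}(x) - x\|_X + \sum_{k=m}^{n} \alpha^m \beta^{k-m} \binom{k-1}{m-1} \|J_\mu^{n-k}(x) - x\|_X,
\]
where
\[
\alpha = \frac{\mu}{\lambda} \quad \text{and} \quad \beta = \frac{\lambda - \mu}{\lambda}.
\]

PROOF (a) Using Proposition 3.3.12(a), we have
\[
\|J_\lambda^n x - x\|_X = \left\| \sum_{k=0}^{n-1} \big( J_\lambda^{n-k}(x) - J_\lambda^{n-(k+1)}(x) \big) \right\|_X \le \sum_{k=0}^{n-1} \|J_\lambda(x) - x\|_X = n \|J_\lambda(x) - x\|_X.
\]

(b) For integers $1 \le i \le n$ and $1 \le k \le m$, we set
\[
a_{k,i} \stackrel{df}{=} \|J_\mu^i(x) - J_\lambda^k(x)\|_X.
\]
Using the resolvent identity (see Remark 3.3.13), we obtain
\[
\begin{aligned}
a_{k,i}
&= \left\| J_\mu^i(x) - J_\mu \left( \frac{\mu}{\lambda} J_\lambda^{k-1}(x) + \frac{\lambda - \mu}{\lambda} J_\lambda^k(x) \right) \right\|_X \\
&\le \left\| J_\mu^{i-1}(x) - \frac{\mu}{\lambda} J_\lambda^{k-1}(x) - \frac{\lambda - \mu}{\lambda} J_\lambda^k(x) \right\|_X \\
&\le \frac{\mu}{\lambda} \|J_\mu^{i-1}(x) - J_\lambda^{k-1}(x)\|_X + \frac{\lambda - \mu}{\lambda} \|J_\mu^{i-1}(x) - J_\lambda^k(x)\|_X \\
&= \alpha a_{k-1,i-1} + \beta a_{k,i-1}.
\end{aligned}
\tag{3.94}
\]
Inequalities (3.94) can be solved to estimate $a_{m,n}$ in terms of $a_{k,0}$ and $a_{0,i}$. This way we obtain the inequality in part (b) of the lemma.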
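Now we are ready for the generation theorem for nonlinear semigroups of nonexpansive maps.

THEOREM 3.3.59 If $X$ is a Banach space and $A : X \supseteq D(A) \to 2^X$ is an m-accretive operator, then
\[
S(t)x = \lim_{n \to +\infty} \left( \mathrm{id}_X + \frac{t}{n} A \right)^{-n} x
\]
exists for each $x \in \overline{D(A)}$, uniformly in $t$ on compact intervals in $\mathbb{R}_+$. Moreover, $\{S(t)\}_{t \ge 0}$ is a semigroup of nonexpansive maps on $\overline{D(A)}$ and for each $x \in D(A)$ and $t \ge 0$, we have
\[
\|S(t)x - x\|_X \le t \, |A(x)| = t \inf_{u \in A(x)} \|u\|_X.
\]

PROOF Let $x \in D(A)$, $\lambda \ge \mu > 0$ and $n \ge m \ge 1$ positive integers. Using Lemmata 3.3.57 and 3.3.58, we obtain
\[
\begin{aligned}
\|J_\mu^n(x) - J_\lambda^m(x)\|_X
\le \Big( & \big[ (n\mu - \lambda m)^2 + n\mu(\lambda - \mu) \big]^{\frac{1}{2}} \qquad\qquad (3.95) \\
& + \big[ m\lambda(\lambda - \mu) + (m\lambda - n\mu)^2 \big]^{\frac{1}{2}} \Big) |A(x)|. \qquad (3.96)
\end{aligned}
\]
Taking $\mu = \frac{t}{n}$ and $\lambda = \frac{t}{m}$ in (3.96), we obtain
\[
\big\| J_{\frac{t}{n}}^n(x) - J_{\frac{t}{m}}^m(x) \big\|_X \le 2t \left( \frac{1}{m} - \frac{1}{n} \right)^{\frac{1}{2}} |Ax|, \tag{3.97}
\]
so
\[
\lim_{n \to +\infty} J_{\frac{t}{n}}^n(x) = S(t)x
\]
exists uniformly in $t$ on compact intervals of $\mathbb{R}_+$. Moreover, since $J_{\frac{t}{n}}^n$ is nonexpansive on $X$, we see that
\[
\|S(t)x - S(t)y\|_X \le \|x - y\|_X \qquad \forall\, t \ge 0, \ x, y \in D(A).
\]
Therefore
\[
S(t)x = \lim_{n \to +\infty} J_{\frac{t}{n}}^n(x) \ \text{exists for } x \in \overline{D(A)} \ \text{and } S(t)(\cdot) \ \text{is nonexpansive on } \overline{D(A)}. \tag{3.98}
\]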
Also if in (3.96), we let $n = m$, $\mu = \frac{t}{n}$ and $\lambda = \frac{s}{n}$ with $0 \le t \le s$ and then pass to the limit as $n \to +\infty$, we obtain
\[
\|S(t)x - S(s)x\|_X \le 2 |t - s| \, |Ax| \qquad \forall\, x \in D(A). \tag{3.99}
\]
From (3.99), it follows that the function $t \mapsto S(t)x$ is continuous for all $x \in D(A)$ and then by a use of the double limit lemma (see Proposition A.2.35) as in the proof of Theorem 3.3.46, we obtain that the function $t \mapsto S(t)x$ is continuous on $\mathbb{R}_+$ for all $x \in \overline{D(A)}$.

Finally we need to verify the semigroup property (see Definition 3.3.55(b)). From (3.97) and (3.98), we have
\[
S(t)^m x = \lim_{n \to +\infty} \big( J_{\frac{t}{n}}^n \big)^m (x) = \lim_{n \to +\infty} J_{\frac{t}{n}}^{nm}(x).
\]
Therefore,
\[
S(mt)x = \lim_{n \to +\infty} J_{\frac{mt}{n}}^n(x) = \lim_{k \to +\infty} J_{\frac{mt}{mk}}^{mk}(x) = \lim_{k \to +\infty} J_{\frac{t}{k}}^{mk}(x) = S(t)^m x. \tag{3.100}
\]
Then if $i, k, r, s > 0$ are integers, we have
\[
S\left( \frac{i}{k} + \frac{r}{s} \right) x = S\left( \frac{is + rk}{ks} \right) x = S\left( \frac{1}{ks} \right)^{is + rk} x = S\left( \frac{1}{ks} \right)^{is} S\left( \frac{1}{ks} \right)^{rk} x = S\left( \frac{i}{k} \right) S\left( \frac{r}{s} \right) x,
\]
so $S(t + \tau)x = S(t) \circ S(\tau)x$ for all rational $t, \tau > 0$ and all $x \in \overline{D(A)}$. Exploiting the continuity in $t$ and the nonexpansiveness in $x$, we conclude that
\[
S(t + \tau) = S(t) \circ S(\tau) \qquad \forall\, t, \tau \ge 0.
\]
In applications to evolution equations, it is very helpful to know if
\[
S(t) : C \longrightarrow C, \qquad t > 0
\]
is a compact map (see Definition 3.1.1). So we make the following definition.

DEFINITION 3.3.60 Let $X$ be a Banach space, let $C \subseteq X$ be a nonempty, closed set and let $S(t) : C \to C$, $t \ge 0$, be a semigroup of nonexpansive maps. We say that $S$ is compact, if for all $t > 0$, $S(t)$ is a compact map.
REMARK 3.3.61 Since $S(0) = \mathrm{id}_C$, then $S(0)$ is not in general compact unless $C \subseteq X$ is compact or $X$ is finite dimensional.

Next we present two simple but typical examples of (linear) semigroups which are compact and noncompact respectively.

EXAMPLE 3.3.62 (a) Let $X = H = L^2(0, \pi)$ and let $A : H \supseteq D(A) \to H$ be defined by
\[
Ax \stackrel{df}{=} -\frac{d^2 x}{dt^2} \qquad \forall\, x \in D(A) = W^{2,2}(0, \pi) \cap W_0^{1,2}(0, \pi).
\]
By integration by parts, we can check that $A$ is monotone (i.e., accretive). Also for every $f \in L^2(0, \pi)$ the boundary value problem
\[
\begin{cases}
-x''(t) + x(t) = f(t) \quad \text{for a.a. } t \in [0, \pi], \\
x(0) = x(\pi) = 0,
\end{cases}
\]
has a unique solution $x \in D(A)$. Hence $A$ is maximal monotone (i.e., m-accretive). Note that $\overline{D(A)} = H$ and then because of Theorem 3.3.49, $-A$ generates a contraction semigroup $S$ on $H$. We know that
\[
\{\lambda_k = k^2\}_{k \ge 1} \ \text{is the spectrum of} \ \left( -\frac{d^2}{dt^2}, W_0^{1,2}(0, \pi) \right)
\]
and the corresponding eigenfunctions
\[
\left\{ \sqrt{\frac{2}{\pi}} \sin k\tau \right\}_{k \ge 1}
\]
form an orthonormal basis for $H$. Then using sine Fourier expansion, we can easily verify that
\[
S(t)x(\tau) = \sum_{k=1}^{\infty} a_k e^{-k^2 t} \sqrt{\frac{2}{\pi}} \sin k\tau \qquad \forall\, x \in H, \ t \ge 0, \ \tau \in [0, \pi],
\]
with $a_k$ being the $k$-th Fourier coefficient, defined by
\[
a_k \stackrel{df}{=} \sqrt{\frac{2}{\pi}} \int_0^{\pi} x(s) \sin ks \, ds.
\]
If we set
\[
S_n(t)x(\tau) \stackrel{df}{=} \sum_{k=1}^{n} a_k e^{-k^2 t} \sqrt{\frac{2}{\pi}} \sin k\tau \qquad \forall\, x \in H, \ t \ge 0, \ \tau \in [0, \pi],
\]
then $S_n(t) \in \mathcal{L}_f(H)$ (i.e., $S_n(t)$ is of finite rank; see Definition 3.1.23) and for all $t > 0$,
\[
S_n(t)x \longrightarrow S(t)x \quad \text{in } H, \ \text{uniformly on bounded subsets of } H.
\]
Therefore $S(t)$ is compact for $t > 0$ (see Proposition 3.1.18).

(b) Let $X = H = L^2(0, 2\pi)$ and let $A : H \supseteq D(A) \to H$ be defined by
\[
Ax \stackrel{df}{=} \frac{dx}{dt} \qquad \forall\, x \in D(A),
\]
with
\[
D(A) \stackrel{df}{=} \left\{ x \in W^{2,2}(0, 2\pi) : x(0) = x(2\pi), \ x'(0) = x'(2\pi) \right\}.
\]
A simple integration by parts reveals that $A$ is monotone (i.e., accretive). Also for every $\lambda > 0$ and every $h \in L^2(0, 2\pi)$, the problem
\[
\begin{cases}
x(t) + \lambda \dfrac{dx}{dt}(t) = h(t) \quad \forall\, t \in [0, 2\pi], \\
x(0) = x(2\pi), \\
x'(0) = x'(2\pi),
\end{cases}
\]
has a unique solution $x \in W^{2,2}(0, 2\pi)$ and so $A$ is maximal monotone (i.e., $A$ is m-accretive). Hence by Theorem 3.3.49, $-A$ generates a contraction semigroup $S$ on $H$. This semigroup is defined by
\[
S(t)x(\tau) =
\begin{cases}
x(\tau + t) & \text{if } \tau + t \in [0, 2\pi], \\
x(\tau + t - 2\pi) & \text{if } \tau + t > 2\pi,
\end{cases}
\]
for all $x \in H$, $\tau \in [0, 2\pi]$ and $t \ge 0$. Then for every $t \ge 0$, $S(t)$ is an isometry on $H$ and so $S(t)$ is not compact for $t > 0$.

REMARK 3.3.63 Roughly speaking, we can say that compact semigroups of nonexpansive maps are generated by m-accretive operators acting in a finite dimensional Banach space or by m-accretive operators arising in the study of parabolic problems. In contrast hyperbolic problems (even very simple ones) generate noncompact semigroups.

The compactness of a semigroup $S$ of nonexpansive maps is closely related to the compactness of the nonexpansive operators $J_\lambda = (\mathrm{id}_X + \lambda A)^{-1}$ (see Definition 3.3.11 and Proposition 3.3.12(a)). For this reason we need to have a result determining the relationship between $S(t)$ and $J_\lambda$, $t, \lambda > 0$.
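The heat semigroup of Example 3.3.62(a) acts diagonally on sine Fourier coefficients, $a_k \mapsto e^{-k^2 t} a_k$, and by Parseval's identity the $L^2(0, \pi)$-norm of a function equals the $\ell^2$-norm of its coefficient sequence. The following sketch works entirely on finite coefficient lists, which is an assumption made for illustration (as are the function names and the sample data). It checks the semigroup property, the contraction property, and the finite rank approximation bound $\|S(t)x - S_n(t)x\| \le e^{-(n+1)^2 t} \|x\|$ that underlies the compactness of $S(t)$ for $t > 0$.

```python
import math

def heat_semigroup(coeffs, t):
    """Apply S(t) to sine-Fourier coefficients (a_1, ..., a_K): a_k -> exp(-k^2 t) a_k."""
    return [math.exp(-(k * k) * t) * a for k, a in enumerate(coeffs, start=1)]

def truncate(coeffs, n):
    """Keep the first n coefficients and zero the rest (the finite rank part)."""
    return [a if k <= n else 0.0 for k, a in enumerate(coeffs, start=1)]

def l2_norm(coeffs):
    return math.sqrt(sum(a * a for a in coeffs))

def truncation_error(coeffs, t, n):
    """Norm of S(t)x - S_n(t)x, computed on coefficients."""
    full = heat_semigroup(coeffs, t)
    cut = heat_semigroup(truncate(coeffs, n), t)
    return l2_norm([u - v for u, v in zip(full, cut)])

# Illustrative data (assumed): coefficients a_k = 1/k for k = 1, ..., 50.
x = [1.0 / k for k in range(1, 51)]
t, s, n = 0.1, 0.2, 5
lhs = heat_semigroup(heat_semigroup(x, s), t)   # S(t) S(s) x
rhs = heat_semigroup(x, t + s)                  # S(t + s) x
semigroup_gap = l2_norm([u - v for u, v in zip(lhs, rhs)])
error = truncation_error(x, t, n)
bound = math.exp(-((n + 1) ** 2) * t) * l2_norm(x)
```

Because the bound depends on $x$ only through its norm, the convergence $S_n(t) \to S(t)$ is uniform on bounded sets for each fixed $t > 0$. At $t = 0$ the bound degenerates to $\|x\|$, consistent with Remark 3.3.61.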
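LEMMA 3.3.64 If $X$ is a Banach space, $C \subseteq X$ is a nonempty, closed set and
\[
S(t) : C \longrightarrow C, \qquad t \ge 0
\]
is a semigroup of nonexpansive maps, then
\[
\|S(t)x - x\|_X \le \frac{2}{t} \int_0^t \|S(\tau)x - x\|_X \, d\tau \qquad \forall\, x \in C, \ t > 0.
\]

PROOF From Definition 3.3.55(b) and (c), we have
\[
\begin{aligned}
\left\| S(t)x - \frac{1}{t} \int_0^t S(\tau)x \, d\tau \right\|_X
&= \left\| \frac{1}{t} \int_0^t \big( S(t)x - S(\tau)x \big) \, d\tau \right\|_X \qquad\qquad (3.101) \\
&\le \frac{1}{t} \int_0^t \|S(t - \tau)x - x\|_X \, d\tau \qquad\qquad\qquad (3.102) \\
&= \frac{1}{t} \int_0^t \|S(\tau)x - x\|_X \, d\tau \qquad \forall\, x \in C, \ t > 0. \quad (3.103)
\end{aligned}
\]
Using this inequality, we obtain
\[
\begin{aligned}
\|S(t)x - x\|_X
&\le \left\| S(t)x - \frac{1}{t} \int_0^t S(\tau)x \, d\tau \right\|_X + \left\| \frac{1}{t} \int_0^t \big( S(\tau)x - x \big) \, d\tau \right\|_X \\
&\le \frac{1}{t} \int_0^t \|S(\tau)x - x\|_X \, d\tau + \frac{1}{t} \int_0^t \|S(\tau)x - x\|_X \, d\tau \\
&= \frac{2}{t} \int_0^t \|S(\tau)x - x\|_X \, d\tau \qquad \forall\, x \in C, \ t > 0.
\end{aligned}
\]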
In order to derive the desired relations between S(t) and Jλ , with t, λ > 0, we need to return to nonlinear evolution equations and discuss their solvability when the data are nonregular.
So let $T = [0, b]$, $X$ be a Banach space, $A : X \supseteq D(A) \to 2^X$ be an m-accretive operator and $f \in L^1(T; X)$. We consider the following nonlinear evolution inclusion:
\[
\begin{cases}
x'(t) + A\big( x(t) \big) \ni f(t) \quad \forall\, t \in T, \\
x(0) = x_0.
\end{cases}
\tag{3.104}
\]
From Theorem 3.3.28, we know that if $X^*$ is uniformly convex (hence $X$ is reflexive), $x_0 \in D(A)$ and $f \in W^{1,1}\big((0, b); X\big)$, then there exists a unique $x \in W^{1,\infty}\big((0, b); X\big)$ satisfying (3.104) for almost all $t \in T$. Such a solution is usually called a strong solution. However, if $x_0 \in \overline{D(A)} \setminus D(A)$ or $f \in L^1(T; X) \setminus W^{1,1}\big((0, b); X\big)$ or $X$ is not reflexive, then there are examples showing that (3.104) need not have a strong solution (for details we refer to Crandall & Liggett (1971)). So if we want to develop a general theory concerning evolution equations of the form (3.104), we need to introduce a new broader solution concept.

Suppose that $x$ is a strong solution. Then
\[
-x'(t) + f(t) \in A\big( x(t) \big) \qquad \text{for a.a. } t \in T.
\]
Because $A$ is accretive, using Theorem 3.3.10, we obtain
\[
0 \le \big( -x'(t) + f(t) - v, \, x(t) - y \big)_+ \qquad \text{for a.a. } t \in T \ \text{and all } (y, v) \in \operatorname{Gr} A,
\]
so
\[
\big( x'(t), \, x(t) - y \big)_+ \le \big( f(t) - v, \, x(t) - y \big)_+ \qquad \text{for a.a. } t \in T.
\]
By virtue of Lemma 3.3.26, we have that
\[
\big( x'(t), \, x(t) - y \big)_+ = \frac{1}{2} \frac{d}{dt} \|x(t) - y\|_X^2 \qquad \text{for a.a. } t \in T.
\]
So we obtain
\[
\frac{1}{2} \frac{d}{dt} \|x(t) - y\|_X^2 \le \big( f(t) - v, \, x(t) - y \big)_+ \qquad \text{for a.a. } t \in T.
\]
Integrating both sides of this inequality over $[s, t] \subseteq [0, b]$, we obtain
\[
\|x(t) - y\|_X^2 \le \|x(s) - y\|_X^2 + 2 \int_s^t \big( f(\tau) - v, \, x(\tau) - y \big)_+ \, d\tau \qquad \forall\, 0 \le s \le t \le b, \ (y, v) \in \operatorname{Gr} A.
\]
This leads to the introduction of a new more general solution notion for problem (3.104).
DEFINITION 3.3.65 Let $x_0 \in X$ and $f \in L^1(T; X)$. A function $x : T \to X$ is said to be an integral solution of the Cauchy problem (3.104), if
(a) $x(0) = x_0$;
(b) $x \in C(T; X)$;
(c) for all $0 \le s \le t \le b$ and all $(y, v) \in \operatorname{Gr} A$, we have
\[
\frac{1}{2} \|x(t) - y\|_X^2 \le \frac{1}{2} \|x(s) - y\|_X^2 + \int_s^t \big( f(\tau) - v, \, x(\tau) - y \big)_+ \, d\tau.
\]

REMARK 3.3.66 The previous discussion shows that every strong solution is also an integral solution. Moreover, for every $x_0 \in \overline{D(A)}$, the function
\[
x(t) \stackrel{df}{=} S(t)x_0 = \lim_{n \to +\infty} \left( \mathrm{id}_X + \frac{t}{n} A \right)^{-n} x_0
\]
is an integral solution of the autonomous Cauchy problem
\[
\begin{cases}
x'(t) + A\big( x(t) \big) \ni 0 \quad \forall\, t \ge 0, \\
x(0) = x_0.
\end{cases}
\]
For details we refer to Barbu (1976, p. 124). More generally we have the following result due to B´enilan (1972) (see also Barbu (1976, p. 124) and Miyadera (1992, p. 160)). THEOREM 3.3.67 If X is a Banach space, A : X ⊇ 2X is an m-accretive operator, x0 ∈ D(A) and f ∈ L1 (T ; X), then problem (3.104) has a unique integral solution x(·; f ) ∈ C(T ; X). Moreover, if f1 , f2 ∈ L1 (T ; X) and x1 (·) = x(·; f1 ),
x2 (·) = x(·; f2 ),
we have ° ° ° ° °x1 (t) − x2 (t)°2 6 °x1 (s) − x2 (s)°2 X X Zt +
¡ ¢ f1 (τ ) − f2 (τ ), x1 (τ ) − x2 (τ ) + dτ
s
and
° ° ° ° °x1 (t) − x2 (t)° 6 °x1 (s) − x2 (s)° X X Zt + s
° ° °f1 (τ ) − f2 (τ )° dτ X
∀ 0 6 s 6 t 6 b.
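The exponential formula of Remark 3.3.66 can be tried out numerically in the simplest possible setting. The following is only a minimal sketch under stated assumptions: we take $X = \mathbb{R}$ and the linear m-accretive operator $A(x) = x$ (our choice, not an example from the text), for which the resolvent is $J_\lambda(x) = x/(1+\lambda)$ and the semigroup is $S(t)x_0 = e^{-t}x_0$. The function names are hypothetical.

```python
import math

def resolvent_linear(x, lam):
    """J_lam(x) = (id + lam*A)^{-1}(x) for the assumed scalar operator A(x) = x."""
    return x / (1.0 + lam)

def exponential_formula(x0, t, n):
    """Apply the resolvent with step t/n a total of n times (implicit Euler scheme)."""
    x = x0
    for _ in range(n):
        x = resolvent_linear(x, t / n)
    return x

x0, t = 2.0, 1.5
exact = math.exp(-t) * x0  # S(t)x0 for the problem x' + x = 0, x(0) = x0
errors = [abs(exponential_formula(x0, t, n) - exact) for n in (10, 100, 1000)]
```

In this toy case the iterates are $(1 + t/n)^{-n} x_0$, so the entries of `errors` shrink roughly like $1/n$, which is consistent with the limit defining $S(t)x_0$.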
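Now we have all the necessary tools to establish the relation between $S(t)$ and $J_\lambda$ for all $t, \lambda > 0$.

PROPOSITION 3.3.68 If $X$ is a Banach space, $A : X \supseteq D(A) \longrightarrow 2^X$ is an m-accretive operator and
\[
S(t) : \overline{D(A)} \longrightarrow \overline{D(A)}, \quad t \ge 0
\]
is the semigroup of nonexpansive maps generated by $A$, then for all $x_0 \in \overline{D(A)}$ and all $t, \lambda > 0$, we have
(a) $\big\| S(t) x_0 - x_0 \big\|_X \le \Big( 2 + \dfrac{t}{\lambda} \Big) \big\| J_\lambda(x_0) - x_0 \big\|_X$;
(b) $\big\| J_\lambda(x_0) - x_0 \big\|_X \le \dfrac{2}{t} \Big( 1 + \dfrac{\lambda}{t} \Big) \displaystyle\int_0^t \big\| S(\tau) x_0 - x_0 \big\|_X \, d\tau$.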
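PROOF (a) From Definition 3.3.55(c) and Theorem 3.3.59, for all $x_0 \in \overline{D(A)}$, $(y, v) \in \mathrm{Gr}\, A$ and $t \ge 0$, we have
\[
\begin{aligned}
\big\| S(t) x_0 - x_0 \big\|_X
&\le \big\| S(t) x_0 - S(t) y \big\|_X + \big\| S(t) y - y \big\|_X + \big\| y - x_0 \big\|_X \\
&\le 2 \| x_0 - y \|_X + \big\| S(t) y - y \big\|_X \le 2 \| x_0 - y \|_X + t \| v \|_X.
\end{aligned}
\tag{3.105}
\]
In (3.105), let us set $y = J_\lambda(x_0)$ and $v = A_\lambda(x_0) \in A\big( J_\lambda(x_0) \big)$ (see Proposition 3.3.12(c)). We obtain
\[
\begin{aligned}
\big\| S(t) x_0 - x_0 \big\|_X
&\le 2 \big\| x_0 - J_\lambda(x_0) \big\|_X + t \big\| A_\lambda(x_0) \big\|_X \\
&= 2 \big\| x_0 - J_\lambda(x_0) \big\|_X + \frac{t}{\lambda} \big\| x_0 - J_\lambda(x_0) \big\|_X
= \Big( 2 + \frac{t}{\lambda} \Big) \big\| x_0 - J_\lambda(x_0) \big\|_X.
\end{aligned}
\]
(b) We know that $x(t) = S(t) x_0$ is the unique integral solution of the autonomous Cauchy problem
\[
\begin{cases}
x'(t) + A\big(x(t)\big) \ni 0 & \forall\, t \in T, \\
x(0) = x_0,
\end{cases}
\]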
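and
\[
\frac{1}{2} \big\| S(t) x_0 - y \big\|_X^2 \le \frac{1}{2} \| x_0 - y \|_X^2 + \int_0^t \big( -v, S(\tau) x_0 - y \big)_+ \, d\tau \quad \forall\, (y, v) \in \mathrm{Gr}\, A,\ t \ge 0
\]
(see Definition 3.3.65 and Remark 3.3.66). Using Definition 3.3.5 and Lemma 3.3.27, we obtain
\[
\big\| S(t) x_0 - y \big\|_X \le \| x_0 - y \|_X + \frac{1}{\lambda} \int_0^t \Big( \big\| S(\tau) x_0 - y - \lambda v \big\|_X - \big\| S(\tau) x_0 - y \big\|_X \Big) \, d\tau \quad \forall\, \lambda > 0.
\]
Let $y = J_\lambda(x_0)$ and $v = A_\lambda(x_0)$, so that $y + \lambda v = x_0$. We have
\[
\big\| S(t) x_0 - J_\lambda(x_0) \big\|_X \le \big\| x_0 - J_\lambda(x_0) \big\|_X + \frac{1}{\lambda} \int_0^t \Big( \big\| S(\tau) x_0 - x_0 \big\|_X - \big\| S(\tau) x_0 - J_\lambda(x_0) \big\|_X \Big) \, d\tau.
\tag{3.106}
\]
From the triangle inequality, we have
\[
- \big\| S(t) x_0 - x_0 \big\|_X \le \big\| S(t) x_0 - J_\lambda(x_0) \big\|_X - \big\| J_\lambda(x_0) - x_0 \big\|_X.
\]
Using this in (3.106), we obtain
\[
- \big\| S(t) x_0 - x_0 \big\|_X + \big\| J_\lambda(x_0) - x_0 \big\|_X \le \big\| J_\lambda(x_0) - x_0 \big\|_X + \frac{1}{\lambda} \int_0^t \Big( 2 \big\| S(\tau) x_0 - x_0 \big\|_X - \big\| J_\lambda(x_0) - x_0 \big\|_X \Big) \, d\tau,
\]
so
\[
\big\| J_\lambda(x_0) - x_0 \big\|_X \le \frac{\lambda}{t} \big\| S(t) x_0 - x_0 \big\|_X + \frac{2}{t} \int_0^t \big\| S(\tau) x_0 - x_0 \big\|_X \, d\tau
\]
and from Lemma 3.3.64, we have
\[
\big\| J_\lambda(x_0) - x_0 \big\|_X \le \frac{2}{t} \Big( 1 + \frac{\lambda}{t} \Big) \int_0^t \big\| S(\tau) x_0 - x_0 \big\|_X \, d\tau.
\]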
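The two estimates of Proposition 3.3.68 can be sanity-checked in a toy case. This is a hedged sketch, not part of the original argument: we assume $X = \mathbb{R}$ and $A(x) = x$, so that $S(t)x_0 = e^{-t}x_0$, $J_\lambda(x_0) = x_0/(1+\lambda)$ and $\int_0^t |S(\tau)x_0 - x_0| \, d\tau = |x_0|(t - 1 + e^{-t})$. All function names are hypothetical.

```python
import math

def S(t, x0):
    """Semigroup generated by the assumed operator A(x) = x on R (solves x' + x = 0)."""
    return math.exp(-t) * x0

def J(lam, x0):
    """Resolvent (id + lam*A)^{-1} for A(x) = x."""
    return x0 / (1.0 + lam)

def integral_S_minus_id(t, x0):
    """Closed form of the integral of |S(tau)x0 - x0| over [0, t]."""
    return abs(x0) * (t - 1.0 + math.exp(-t))

def check(t, lam, x0, tol=1e-12):
    """Return True when both estimates (a) and (b) hold for the given data."""
    lhs_a = abs(S(t, x0) - x0)
    rhs_a = (2.0 + t / lam) * abs(J(lam, x0) - x0)
    lhs_b = abs(J(lam, x0) - x0)
    rhs_b = (2.0 / t) * (1.0 + lam / t) * integral_S_minus_id(t, x0)
    return lhs_a <= rhs_a + tol and lhs_b <= rhs_b + tol

grid = [0.01, 0.1, 1.0, 10.0, 100.0]
all_ok = all(check(t, lam, 3.0) for t in grid for lam in grid)
```

The check covers several orders of magnitude of $t$ and $\lambda$; it only illustrates the inequalities in one example and of course proves nothing about the general case.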
LEMMA 3.3.69 If $X$ is a Banach space, $C \subseteq X$ is a nonempty, closed set, $f_n : C \longrightarrow X$ are compact maps for $n \ge 1$ and
\[
f_n(x) \longrightarrow f(x) \quad \text{in } X,
\]
uniformly on bounded subsets of $C$, then $f : C \longrightarrow X$ is compact.

PROOF Clearly $f : C \longrightarrow X$ is continuous. Next let $B \subseteq C$ be a bounded set. Then for a given $\varepsilon > 0$, we can find $n_0 = n_0(\varepsilon, B) \ge 1$, such that
\[
\big\| f_n(x) - f(x) \big\|_X < \frac{\varepsilon}{2} \quad \forall\, n \ge n_0,\ x \in B.
\tag{3.107}
\]
For $n \ge n_0$, the set $f_n(B)$ is relatively compact in $X$. So we can find $\{x_k\}_{k=1}^{N_0}$, where $N_0 = N_0(n, \varepsilon) \ge 1$, such that
\[
f_n(B) \subseteq \bigcup_{k=1}^{N_0} B_{\frac{\varepsilon}{2}}(x_k).
\tag{3.108}
\]
Let $x \in B$. From (3.108), we see that there exists $k \in \{1, \ldots, N_0\}$, such that
\[
\big\| f_n(x) - x_k \big\|_X < \frac{\varepsilon}{2}.
\tag{3.109}
\]
Therefore using (3.107) and (3.109), we have
\[
\big\| f(x) - x_k \big\|_X \le \big\| f(x) - f_n(x) \big\|_X + \big\| f_n(x) - x_k \big\|_X < \varepsilon,
\]
so $f(x) \in B_\varepsilon(x_k)$ and thus
\[
f(B) \subseteq \bigcup_{k=1}^{N_0} B_\varepsilon(x_k),
\]
i.e., $f(B)$ is totally bounded, thus relatively compact in $X$.

DEFINITION 3.3.70 Let $X$ be a Banach space, $C \subseteq X$ nonempty, closed and let
\[
S(t) : C \longrightarrow C, \quad t \ge 0,
\]
be a semigroup of nonexpansive maps. We say that $S$ is equicontinuous (respectively weakly equicontinuous) if for each bounded set $B \subseteq C$, the family of functions $\big\{ S(\cdot) x \big\}_{x \in B}$ is equicontinuous (respectively weakly equicontinuous) at each $t > 0$.
As expected, compactness and equicontinuity of nonlinear semigroups are closely related.

PROPOSITION 3.3.71 If $X$ is a Banach space, $C \subseteq X$ is a nonempty, closed set and
\[
S(t) : C \longrightarrow C, \quad t \ge 0
\]
is a compact semigroup of nonexpansive maps, then $S$ is equicontinuous.

PROOF Let $B \subseteq C$ be a bounded set, let $t > 0$, let $\varepsilon > 0$ and choose $r \in (0, t)$. Because $S(t - r)B$ is relatively compact, we can find $\{x_k\}_{k=1}^{N(\varepsilon)} \subseteq B$, such that
\[
S(t - r)B \subseteq \bigcup_{k=1}^{N(\varepsilon)} B_{\frac{\varepsilon}{3}}\big( S(t - r) x_k \big).
\tag{3.110}
\]
For each $k \in \{1, \ldots, N(\varepsilon)\}$, the map $t \longmapsto S(t) x_k$ is continuous on $\mathbb{R}_+$ and so we can find $\delta = \delta(\varepsilon, t) \in (0, r)$, such that
\[
\big\| S(t + h) x_k - S(t) x_k \big\|_X \le \frac{\varepsilon}{3} \quad \forall\, k \in \{1, \ldots, N(\varepsilon)\},\ h \in [-\delta, \delta].
\tag{3.111}
\]
Then because of (3.110), for each $x \in B$, we can find $k \in \{1, \ldots, N(\varepsilon)\}$, such that
\[
\big\| S(t - r) x - S(t - r) x_k \big\|_X \le \frac{\varepsilon}{3}.
\tag{3.112}
\]
So using the semigroup property, the nonexpansiveness of $S(\cdot)$ and (3.111) and (3.112), we obtain
\[
\begin{aligned}
\big\| S(t + h) x - S(t) x \big\|_X
&\le \big\| S(t + h) x - S(t + h) x_k \big\|_X + \big\| S(t + h) x_k - S(t) x_k \big\|_X + \big\| S(t) x_k - S(t) x \big\|_X \\
&\le 2 \big\| S(t - r) x - S(t - r) x_k \big\|_X + \big\| S(t + h) x_k - S(t) x_k \big\|_X \le \varepsilon
\quad \forall\, x \in B,\ h \in [-\delta, \delta].
\end{aligned}
\]
This proves that $\big\{ S(\cdot) x \big\}_{x \in B}$ is equicontinuous at every $t > 0$.

Now we are ready to present a characterization of compact nonlinear semigroups.
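THEOREM 3.3.72 If $X$ is a Banach space, $A : X \supseteq D(A) \longrightarrow 2^X$ is an m-accretive operator and $S(t) : \overline{D(A)} \longrightarrow \overline{D(A)}$, $t \ge 0$ is the semigroup of nonexpansive maps generated by $A$ according to Theorem 3.3.59, then the following statements are equivalent:
(a) the semigroup $S$ is compact;
(b) for each $\lambda > 0$, $J_\lambda$ is a compact map and the semigroup $S$ is equicontinuous.

PROOF “(a)$\Longrightarrow$(b)”: The equicontinuity of $S$ follows from Proposition 3.3.71. So we need to show that for each $\lambda > 0$, $J_\lambda$ is a compact map. From Theorem 3.3.59, we have
\[
\big\| S(t) \circ J_\lambda(x) - J_\lambda(x) \big\|_X \le t \big\| A_\lambda(x) \big\|_X = \frac{t}{\lambda} \big\| x - J_\lambda(x) \big\|_X \quad \forall\, x \in X,\ \lambda > 0
\tag{3.113}
\]
(recall that $A_\lambda(x) \in A\big( J_\lambda(x) \big)$; see Proposition 3.3.12(c)). Because $J_\lambda$ is nonexpansive, it maps bounded sets to bounded sets. So from (3.113), it follows that $S(t) \circ J_\lambda \longrightarrow J_\lambda$ as $t \to 0^+$, uniformly on bounded sets of $X$. But $S(t) \circ J_\lambda$ is compact, since $S(t)$ is. Therefore Lemma 3.3.69 implies that $J_\lambda$ is compact.

“(b)$\Longrightarrow$(a)”: Using Proposition 3.3.68(b) (with $t$ there replaced by $\lambda$ and $x_0$ by $S(t)x$), we have
\[
\big\| \big( J_\lambda \circ S(t) \big)(x) - S(t) x \big\|_X \le \frac{4}{\lambda} \int_0^\lambda \big\| S(t + \tau) x - S(t) x \big\|_X \, d\tau \quad \forall\, \lambda, t > 0,\ x \in \overline{D(A)}.
\tag{3.114}
\]
Because $S$ is equicontinuous, for every bounded set $B \subseteq \overline{D(A)}$ and for each $t > 0$, we can find $\omega : \mathbb{R}_+ \longrightarrow \mathbb{R}_+$, such that
\[
\lim_{r \searrow 0} \omega(r) = 0
\]
and
\[
\big\| S(t + \tau) x - S(t) x \big\|_X \le \omega(\tau) \quad \forall\, \tau \ge 0,\ x \in B.
\tag{3.115}
\]
Using (3.115) in (3.114), we obtain
\[
\big\| \big( J_\lambda \circ S(t) \big)(x) - S(t) x \big\|_X \le 4 \sup_{\tau \in [0, \lambda]} \omega(\tau) \quad \forall\, x \in B,
\]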
so $J_\lambda \circ S(t) \longrightarrow S(t)$ as $\lambda \searrow 0$, uniformly on bounded subsets of $\overline{D(A)}$. Note that $J_\lambda \circ S(t)$ is compact (see the first part of (b)). So, by Lemma 3.3.69, $S(t)$ is compact for all $t > 0$.

In the case of linear semigroups, the above theorem takes the following form.

COROLLARY 3.3.73 If $X$ is a Banach space, $A : X \supseteq D(A) \longrightarrow X$ is a densely defined, linear, m-accretive operator and $S(t) : X \longrightarrow X$, $t \ge 0$ is the contraction semigroup generated by $-A$ according to Theorem 3.3.49, then the following statements are equivalent:
(a) the semigroup $S$ is compact;
(b) for each $\lambda > 0$, $J_\lambda$ is a compact operator and the map $t \longmapsto S(t)$ is continuous from $\mathbb{R}_+$ into $\mathcal{L}(X)$ with the operator norm topology.

REMARK 3.3.74 Using the resolvent identity (see Remark 3.3.13), we see that in Theorem 3.3.72 and in Corollary 3.3.73, the map $J_\lambda$ is compact for all $\lambda > 0$ if and only if it is compact for some $\lambda > 0$.

The next proposition gives an equivalent condition for the map $J_\lambda$ to be compact.

PROPOSITION 3.3.75 If $X$ is a Banach space and $A : X \supseteq D(A) \longrightarrow 2^X$ is an m-accretive operator, then the following statements are equivalent:
(a) for each $\lambda > 0$, $J_\lambda$ is a compact map;
(b) for every $m > 0$, the level set
\[
L_m \stackrel{df}{=} \Big\{ x \in D(A) : \| x \|_X + \big| A(x) \big| \le m \Big\}
\]
is relatively compact in $X$.
PROOF “(a)$\Longrightarrow$(b)”: From Proposition 3.3.12(d), we have
\[
\big\| A_\lambda(x) \big\|_X \le \big| A(x) \big| \le m \quad \forall\, x \in L_m,\ \lambda > 0,
\]
so
\[
\big\| x - J_\lambda(x) \big\|_X \le m \lambda \quad \forall\, x \in L_m,\ \lambda > 0
\]
and thus
\[
J_\lambda \longrightarrow \mathrm{id}_{L_m} \quad \text{as } \lambda \searrow 0, \quad \text{uniformly on } L_m.
\]
From Lemma 3.3.69, it follows that $L_m$ (which is bounded) is relatively compact.

“(b)$\Longrightarrow$(a)”: Let $B \subseteq X$ be bounded and $\lambda > 0$. Since $J_\lambda$ is nonexpansive, $J_\lambda(B)$ is bounded. Because
\[
A_\lambda(x) \in A\big( J_\lambda(x) \big) \quad \forall\, x \in X,
\]
we have
\[
\big\| J_\lambda(x) \big\|_X + \big| A\big( J_\lambda(x) \big) \big| \le \big\| J_\lambda(x) \big\|_X + \big\| A_\lambda(x) \big\|_X = \big\| J_\lambda(x) \big\|_X + \frac{1}{\lambda} \big\| x - J_\lambda(x) \big\|_X \quad \forall\, x \in X.
\]
So there exists $m > 0$ large enough, such that $J_\lambda(B) \subseteq L_m$, hence $J_\lambda(B)$ is relatively compact and so $J_\lambda$ is a compact map.

In Section 4.3, we will return to semigroups, when we examine the subdifferential of a convex function. Before concluding this section, we would like to make an interesting remark concerning accretive operators.

REMARK 3.3.76 In a Hilbert space maximal accretivity (i.e., maximal monotonicity) and m-accretivity coincide (see Theorem 3.2.29). In a general Banach space this is no longer true. For a counterexample we refer to Crandall & Liggett (1971) (see also Miyadera (1992, pp. 42–44)).
3.4 The Nemytskii Operator and Integral Functions
In this section we first examine the Nemytskii (or superposition) operator, which is an important nonlinear operator that arises in many applications. Then we pass to the study of nonlinear integral functionals, which leads naturally to the topic of the next section, the theory of Young measures.

Consider a set $\Omega$, which in most cases is a measure space or a metric space or both, and let $X$, $Y$ be two Hausdorff topological spaces, which in our analysis as well as in most applications are either Euclidean spaces or Banach spaces. Let $f : \Omega \times X \longrightarrow Y$ and consider the nonlinear operator
\[
N_f(u)(z) \stackrel{df}{=} f\big( z, u(z) \big) \quad \forall\, z \in \Omega,
\]
which to each function $u : \Omega \longrightarrow X$ assigns the $Y$-valued function $z \longmapsto f\big( z, u(z) \big)$. This operator is known in the literature as the Nemytskii operator corresponding to the function $f$ (also known as the superposition operator of $f$, or the composition operator of $f$, or the substitution operator of $f$). Since in many applications the Nemytskii operator $N_f$ acts on a Lebesgue space $L^p$, it is important to know under what conditions $N_f$ maps $L^p$ into another Lebesgue space $L^r$. It turns out that this leads to a particular growth condition on $f$, namely $f(z, x) = O\big( |x|^{\frac{p}{r}} \big)$, which is both a necessary and a sufficient condition for $N_f$ to act between $L^p$ and $L^r$. This is the well known Krasnoselskii theorem, which we prove here in a more general form, namely when $N_f$ acts on Lebesgue-Bochner spaces. We start with a definition.

DEFINITION 3.4.1 Let $(\Omega, \Sigma)$ be a measurable space and let $X$, $Y$ be two Hausdorff spaces. A function $f : \Omega \times X \longrightarrow Y$ is said to be a Carathéodory function, if
(a) for every $x \in X$, the function $z \longmapsto f(z, x)$ is $\big( \Sigma, \mathcal{B}(Y) \big)$-measurable, with $\mathcal{B}(Y)$ being the Borel $\sigma$-field of $Y$;
(b) for every $z \in \Omega$, the function $x \longmapsto f(z, x)$ is continuous.

REMARK 3.4.2 If $X$ is a separable metric space and $Y$ is a metric space, then every Carathéodory function $(z, x) \longmapsto f(z, x)$ is $\Sigma \times \mathcal{B}(X)$-measurable, with $\mathcal{B}(X)$ being the Borel $\sigma$-field of $X$ (i.e., $f$ is jointly measurable). Therefore $f$ is sup-measurable (superpositionally measurable), meaning that for every measurable function $u : \Omega \longrightarrow X$, the function $z \longmapsto f\big( z, u(z) \big)$ is measurable, i.e., the Nemytskii operator $N_f$ maps measurable functions to measurable ones (for details see Denkowski, Migórski & Papageorgiou (2003a, pp. 189–190)).
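To make the definition of $N_f$ and the role of the exponent $p/r$ concrete, here is a minimal sketch under our own assumptions: $\Omega = (0,1)$ with the Lebesgue measure, $X = Y = \mathbb{R}$, the model integrand $f(z,x) = |x|^{p/r}$, and a midpoint rule standing in for the integral. The helper names `nemytskii` and `lp_norm` are hypothetical. For this integrand, $\|N_f(u)\|_{L^r}^r = \|u\|_{L^p}^p$, so $N_f$ maps $L^p$ into $L^r$.

```python
def nemytskii(f):
    """Return N_f, which maps a function u to the function z -> f(z, u(z))."""
    def N_f(u):
        return lambda z: f(z, u(z))
    return N_f

def lp_norm(u, p, n=20000):
    """Midpoint-rule approximation of the L^p(0,1) norm of a scalar function u."""
    h = 1.0 / n
    return (sum(abs(u((k + 0.5) * h)) ** p for k in range(n)) * h) ** (1.0 / p)

p, r = 4.0, 2.0
f = lambda z, x: abs(x) ** (p / r)       # model integrand with growth |x|^(p/r)
u = lambda z: 1.0 + z                    # a sample element of L^p(0,1)
lhs = lp_norm(nemytskii(f)(u), r) ** r   # approximates ||N_f(u)||_{L^r}^r
rhs = lp_norm(u, p) ** p                 # approximates ||u||_{L^p}^p
```

Both quantities approximate $\int_0^1 (1+z)^4 \, dz = 31/5$, in line with the growth condition described above.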
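In what follows, to avoid repeating the same hypotheses, we fix $(\Omega, \Sigma, \mu)$ to be a nonatomic, $\sigma$-finite, complete measure space (in applications usually $\Omega$ is a subset of $\mathbb{R}^N$, equipped with the Lebesgue measure) and $X$, $Y$ are two separable Banach spaces.

LEMMA 3.4.3 If $h : \Omega \times X \longrightarrow \mathbb{R}_+$ is a Carathéodory function, such that $h(z, 0) = 0$ for all $z \in \Omega$ and
\[
\big\| N_h(u) \big\|_{L^r(\Omega)}^r \le c^r \quad \forall\, u \in L^p(\Omega; X),
\]
for some $c > 0$, then
\[
\mu(E_k) = 0 \quad \forall\, k \ge 1,
\]
where
\[
E_k \stackrel{df}{=} \Big\{ z \in \Omega : \sup_{\|x\|_X \le k} h(z, x) = +\infty \Big\} \quad \forall\, k \ge 1.
\]
PROOF Suppose that for some $k \ge 1$, we have $\mu(E_k) \ne 0$. Because the measure space is nonatomic and $\sigma$-finite, we can find $B_k \in \Sigma$, such that
\[
B_k \subseteq E_k \quad \text{and} \quad 0 < \mu(B_k) < +\infty.
\]
For every $z \in B_k$, we define
\[
S_k(z) \stackrel{df}{=} \Big\{ x \in X : \|x\|_X \le k,\ h(z, x)^r \ge \frac{2 c^r}{\mu(B_k)} \Big\}.
\]
Evidently
\[
S_k(z) \ne \emptyset \quad \forall\, z \in B_k
\]
and $\mathrm{Gr}\, S_k \in (\Sigma \cap B_k) \times \mathcal{B}(X)$, with $\mathcal{B}(X)$ being the Borel $\sigma$-field of $X$. We apply the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) and obtain a $\big( \Sigma, \mathcal{B}(X) \big)$-measurable map $u_k : B_k \longrightarrow X$, such that
\[
u_k(z) \in S_k(z) \quad \forall\, z \in B_k.
\]
We extend $u_k$ to all of $\Omega$ by setting $u_k(z) = 0$ if $z \in \Omega \setminus B_k$. Since
\[
h(z, 0) = 0 \quad \forall\, z \in \Omega
\]
and $u_k \in L^p(\Omega; X)$, we have
\[
\int_\Omega h\big( z, u_k(z) \big)^r \, d\mu = \int_{B_k} h\big( z, u_k(z) \big)^r \, d\mu \ge 2 c^r,
\]
a contradiction.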
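Using this lemma, we can prove the general version of Krasnoselskii's theorem for $N_f$.

THEOREM 3.4.4 If $f : \Omega \times X \longrightarrow Y$ is a Carathéodory function, $p, r \in [1, +\infty)$ and $N_f$ maps $L^p(\Omega; X)$ into $L^r(\Omega; Y)$, then $N_f$ is continuous, bounded (i.e., maps bounded sets into bounded sets) and there exist $a \in L^r(\Omega)_+$ and $c > 0$, such that
\[
\big\| f(z, x) \big\|_Y \le a(z) + c \|x\|_X^{\frac{p}{r}} \quad \text{for } \mu\text{-a.a. } z \in \Omega \text{ and all } x \in X.
\]
PROOF Let $\{u_n\}_{n \ge 1} \subseteq L^p(\Omega; X)$ be a sequence, such that
\[
u_n \longrightarrow u \quad \text{in } L^p(\Omega; X),
\]
for some $u \in L^p(\Omega; X)$. Let $g : \Omega \times X \longrightarrow \mathbb{R}$ be defined by
\[
g(z, x) \stackrel{df}{=} \big\| f\big( z, x + u(z) \big) - f\big( z, u(z) \big) \big\|_Y^r.
\]
We pick a subsequence $\{u_{n_k}\}_{k \ge 1}$ of $\{u_n\}_{n \ge 1}$, such that
\[
\big\| u_{n_k} - u \big\|_{L^p(\Omega; X)}^p \le \frac{1}{2^k} \quad \forall\, k \ge 1
\]
and $u_{n_k}(z) \longrightarrow u(z)$ for $\mu$-a.a. $z \in \Omega$. Let
\[
v_k \stackrel{df}{=} u_{n_k} - u \quad \forall\, k \ge 1.
\]
We have
\[
v_k(z) \longrightarrow 0 \quad \text{for } \mu\text{-a.a. } z \in \Omega
\]
and so
\[
g\big( z, v_k(z) \big) \longrightarrow 0 \quad \text{for } \mu\text{-a.a. } z \in \Omega \quad \text{as } k \to +\infty.
\]
Because
\[
g(z, x) \ge 0 \quad \forall\, (z, x) \in \Omega \times X
\]
and $g\big( z, v_k(z) \big) \longrightarrow 0$ for $\mu$-a.a. $z \in \Omega$, the supremum below is attained and we can find $k(z) \in \mathbb{N}$, such that
\[
\xi(z) = \sup_{k \ge 1} g\big( z, v_k(z) \big) = g\big( z, v_{k(z)}(z) \big).
\]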
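Let
\[
\hat{v}(z) \stackrel{df}{=} v_{k(z)}(z).
\]
Since $\xi$ is $\Sigma$-measurable, we see that the function $z \longmapsto \hat{v}(z)$ is $\Sigma$-measurable. Moreover, we have
\[
\int_\Omega \big\| \hat{v}(z) \big\|_X^p \, d\mu \le \int_\Omega \sup_{k \ge 1} \big\| v_k(z) \big\|_X^p \, d\mu \le \sum_{k=1}^\infty \| v_k \|_{L^p(\Omega; X)}^p = \sum_{k=1}^\infty \big\| u_{n_k} - u \big\|_{L^p(\Omega; X)}^p < +\infty,
\]
so $\hat{v} \in L^p(\Omega; X)$. Then from the definition of $g$ and the hypothesis that $N_f$ maps $L^p(\Omega; X)$ into $L^r(\Omega; Y)$, we infer that
\[
g\big( \cdot, \hat{v}(\cdot) \big) \in L^1(\Omega)_+.
\]
Since
\[
g\big( z, v_k(z) \big) \le g\big( z, \hat{v}(z) \big) \quad \forall\, z \in \Omega,\ k \ge 1
\]
and
\[
g\big( z, v_k(z) \big) \longrightarrow 0 \quad \text{for } \mu\text{-a.a. } z \in \Omega,
\]
from the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that
\[
\int_\Omega g\big( z, v_k(z) \big) \, d\mu \longrightarrow 0.
\]
Therefore
\[
N_f(u_{n_k}) \longrightarrow N_f(u) \quad \text{in } L^r(\Omega; Y).
\]
Since every subsequence of $\big\{ N_f(u_n) \big\}_{n \ge 1}$ has a further subsequence converging in $L^r(\Omega; Y)$ to $N_f(u)$, we conclude that
\[
N_f(u_n) \longrightarrow N_f(u) \quad \text{in } L^r(\Omega; Y)
\]
and so the map $N_f : L^p(\Omega; X) \longrightarrow L^r(\Omega; Y)$ is continuous.

Next we prove the boundedness of $N_f$. For $u \in L^p(\Omega; X)$, let
\[
\hat{f}(z, x) \stackrel{df}{=} f\big( z, x + u(z) \big) - f\big( z, u(z) \big).
\]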
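Evidently $\hat{f}$ is a Carathéodory function, $N_{\hat{f}}$ maps $L^p(\Omega; X)$ into $L^r(\Omega; Y)$ and in addition
\[
\hat{f}(z, 0) = 0 \quad \forall\, z \in \Omega.
\]
So without any loss of generality, we may assume that
\[
f(z, 0) = 0 \quad \forall\, z \in \Omega.
\]
Since $N_f$ is continuous at $0$, we can find $\varrho > 0$, such that
\[
\big\| N_f(u) \big\|_{L^r(\Omega; Y)} \le 1 \quad \forall\, \|u\|_{L^p(\Omega; X)} \le \varrho.
\]
Then take an arbitrary $u \in L^p(\Omega; X)$ and let $n \ge 0$ be an integer, such that
\[
n \varrho^p \le \|u\|_{L^p(\Omega; X)}^p \le (n + 1) \varrho^p.
\]
Because the measure space is nonatomic, we can write
\[
\Omega = \bigcup_{k=1}^{n+1} \Omega_k
\]
as a disjoint union, such that
\[
\|u\|_{L^p(\Omega_k; X)}^p \le \varrho^p \quad \forall\, k \in \{1, \ldots, n + 1\}.
\]
Then we have
\[
\int_\Omega \big\| f\big( z, u(z) \big) \big\|_Y^r \, d\mu = \sum_{k=1}^{n+1} \int_{\Omega_k} \big\| f\big( z, u(z) \big) \big\|_Y^r \, d\mu \le n + 1 \le \Big( \frac{\|u\|_{L^p(\Omega; X)}}{\varrho} \Big)^p + 1,
\]
which proves that $N_f$ is bounded.

Finally we prove the growth condition. Since $N_f$ is bounded, we can find $c > 0$, such that
\[
\big\| N_f(u) \big\|_{L^r(\Omega; Y)} \le c \quad \forall\, \|u\|_{L^p(\Omega; X)} \le 1.
\tag{3.116}
\]
Let $h : \Omega \times X \longrightarrow \mathbb{R}$ be defined by
\[
h(z, x) \stackrel{df}{=} \Big[ \big\| f(z, x) \big\|_Y - c \|x\|_X^{\frac{p}{r}} \Big]^+.
\]
Using the inequality which says that
\[
(\xi_1 - \xi_2)^r \le \xi_1^r - \xi_2^r \quad \forall\, \xi_1 \ge \xi_2 \ge 0,
\]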
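we have
\[
h(z, x)^r \le \big\| f(z, x) \big\|_Y^r - c^r \|x\|_X^p \quad \text{when } h(z, x) > 0.
\tag{3.117}
\]
Let $u \in L^p(\Omega; X)$ and let
\[
C \stackrel{df}{=} \big\{ z \in \Omega : h\big( z, u(z) \big) > 0 \big\}.
\]
Then we can find an integer $n \ge 0$ and $\varepsilon \in [0, 1)$, such that
\[
\int_C \big\| u(z) \big\|_X^p \, d\mu = n + \varepsilon.
\]
So we can write
\[
C = \bigcup_{k=1}^{n+1} C_k,
\]
a disjoint union, such that
\[
\int_{C_k} \big\| u(z) \big\|_X^p \, d\mu \le 1 \quad \forall\, k \in \{1, \ldots, n + 1\}.
\]
Then assuming as before, without any loss of generality, that
\[
f(z, 0) = 0 \quad \forall\, z \in \Omega
\]
and using (3.116), we obtain
\[
\int_C \big\| f\big( z, u(z) \big) \big\|_Y^r \, d\mu = \sum_{k=1}^{n+1} \int_{C_k} \big\| f\big( z, u(z) \big) \big\|_Y^r \, d\mu \le (n + 1) c^r.
\tag{3.118}
\]
Returning to (3.117) and using (3.118), we have
\[
\int_\Omega h\big( z, u(z) \big)^r \, d\mu \le (n + 1) c^r - (n + \varepsilon) c^r \le c^r \quad \forall\, u \in L^p(\Omega; X).
\tag{3.119}
\]
So by virtue of Lemma 3.4.3, we have
\[
\mu \Big( \Big\{ z \in \Omega : \sup_{\|x\|_X \le k} h(z, x) = +\infty \Big\} \Big) = 0 \quad \forall\, k \ge 1.
\tag{3.120}
\]
Since by hypothesis the measure space is $\sigma$-finite, we can find $\{D_k\}_{k \ge 1} \subseteq \Sigma$, such that
\[
\Omega = \bigcup_{k=1}^\infty D_k \quad \text{and} \quad \mu(D_k) < +\infty \quad \forall\, k \ge 1.
\]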
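For $z \in D_k$, let
\[
V_k(z) \stackrel{df}{=} \Big\{ x \in X : \|x\|_X \le k,\ \sup_{\|x'\|_X \le k} h(z, x') < +\infty,\ \sup_{\|x'\|_X \le k} h(z, x') - \frac{1}{k} \le h(z, x) \Big\}.
\]
Because of (3.120),
\[
V_k(z) \ne \emptyset \quad \text{for } \mu\text{-a.a. } z \in D_k.
\]
Also we have
\[
\mathrm{Gr}\, V_k \in (\Sigma \cap D_k) \times \mathcal{B}(X).
\]
So the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) gives a $\big( \Sigma, \mathcal{B}(X) \big)$-measurable map $v_k : D_k \longrightarrow X$, such that
\[
v_k(z) \in V_k(z) \quad \text{for } \mu\text{-a.a. } z \in D_k.
\]
Extend $v_k$ to all of $\Omega$ by setting $v_k|_{\Omega \setminus D_k} = 0$. Let
\[
a(z) \stackrel{df}{=} \sup_{x \in X} h(z, x).
\]
Because $h$ is a Carathéodory function and $X$ is separable, $a$ is $\Sigma$-measurable. Also we have
\[
\sup_{\|x\|_X \le k} h(z, x) - \frac{1}{k} \le h\big( z, v_k(z) \big) \le a(z) \quad \text{for } \mu\text{-a.a. } z \in D_k,
\]
so
\[
h\big( z, v_k(z) \big) \longrightarrow a(z) \quad \text{for } \mu\text{-a.a. } z \in \Omega, \quad \text{as } k \to +\infty.
\]
Note that $v_k \in L^p(\Omega; X)$ and so from (3.119), we have
\[
\int_\Omega h\big( z, v_k(z) \big)^r \, d\mu \le c^r \quad \forall\, k \ge 1.
\]
As $h \ge 0$, we can apply Fatou's lemma (see Theorem A.2.1) and obtain
\[
\int_\Omega a(z)^r \, d\mu \le \liminf_{k \to +\infty} \int_\Omega h\big( z, v_k(z) \big)^r \, d\mu \le c^r
\]
and thus $a \in L^r(\Omega)$. Recalling the definition of $h(z, x)$, we conclude that
\[
\big\| f(z, x) \big\|_Y \le a(z) + c \|x\|_X^{\frac{p}{r}} \quad \text{for } \mu\text{-a.a. } z \in \Omega \text{ and all } x \in X.
\]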
REMARK 3.4.5 By virtue of Theorem 3.4.4, the growth condition
\[
\big\| f(z, x) \big\|_Y \le a(z) + c \|x\|_X^{\frac{p}{r}} \quad \text{for } \mu\text{-a.a. } z \in \Omega \text{ and all } x \in X,
\]
with $a \in L^r(\Omega)$, $c > 0$, is a necessary and sufficient condition for the continuity and boundedness of the Nemytskii operator $N_f : L^p(\Omega; X) \longrightarrow L^r(\Omega; Y)$.

If in Theorem 3.4.4 we drop the hypothesis that $f(z, x)$ is a Carathéodory function and we only assume that the function $(z, x) \longmapsto f(z, x)$ is $\big( \Sigma \times \mathcal{B}(X), \mathcal{B}(Y) \big)$-measurable and for all $z \in \Omega$, the function $x \longmapsto f(z, x)$ is lower semicontinuous, then we no longer have the continuity of the Nemytskii operator $N_f$, even if it maps $L^p(\Omega; X)$ into $L^r(\Omega; Y)$. To see this let $\Omega = [0, 1]$ be equipped with the Lebesgue measure, let $X = Y = \mathbb{R}$ and consider the function
\[
f(x) \stackrel{df}{=} \begin{cases} 1 & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}
\]
Then $N_f$ maps $L^p(\Omega)$ to $L^r(\Omega)$ for every $r \in [1, +\infty)$. However, if we consider
\[
x_n(z) \stackrel{df}{=} \frac{z}{n},
\]
then $x_n \longrightarrow 0$ in $L^p(\Omega)$, but $N_f(x_n)$ does not converge in measure to $N_f(0) = 0$.

Also if $r = +\infty$ and $N_f$ maps $L^p(\Omega; X)$ into $L^\infty(\Omega; Y)$ ($f$ is still a Carathéodory function), then $N_f$ is again bounded and there exists $M > 0$, such that
\[
\big\| f(z, x) \big\|_Y \le M \quad \text{for } \mu\text{-a.a. } z \in \Omega \text{ and all } x \in X.
\]
The proof, which is similar to that of Theorem 3.4.4, is left to the reader. However, $N_f : L^p(\Omega; X) \longrightarrow L^\infty(\Omega; Y)$ is not in general continuous, as the following example illustrates. Let $\Omega = [0, 1]$ be equipped with the Lebesgue measure, let $X = Y = \mathbb{R}$ and consider
\[
f(x) \stackrel{df}{=} \begin{cases} -1 & \text{if } x < -1, \\ x & \text{if } -1 \le x \le 1, \\ 1 & \text{if } 1 < x. \end{cases}
\]
If we take $u_n(z) = z^n$, then $u_n \longrightarrow 0$ in $L^p[0, 1]$, for all $p \in [1, +\infty)$. But $N_f(u_n)$ does not converge to zero in $L^\infty[0, 1]$.
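The second example of Remark 3.4.5 can be checked by direct computation. The sketch below is illustrative only and uses hypothetical helper names: the $L^p(0,1)$ norm of $z \longmapsto z^n$ is computed from the closed form $(np+1)^{-1/p}$, while the supremum of the continuous function $N_f(z^n)$ is sampled on a grid that contains the endpoint $z = 1$.

```python
def truncation(x):
    """f(x) = -1 for x < -1, x for -1 <= x <= 1, and 1 for x > 1."""
    return max(-1.0, min(1.0, x))

def lp_norm_of_power(n, p):
    """Exact L^p(0,1) norm of z -> z^n, namely (1/(n*p + 1))^(1/p)."""
    return (1.0 / (n * p + 1.0)) ** (1.0 / p)

def sup_on_grid(g, m=1001):
    """Maximum of |g| over m equally spaced points of [0, 1] (z = 1 included)."""
    return max(abs(g(k / (m - 1))) for k in range(m))

p = 2.0
exponents = (1, 10, 100, 1000)
lp_norms = [lp_norm_of_power(n, p) for n in exponents]
sup_norms = [sup_on_grid(lambda z, n=n: truncation(z ** n)) for n in exponents]
```

The values in `lp_norms` decrease to zero, whereas every entry of `sup_norms` equals 1, which is exactly the failure of continuity into $L^\infty[0,1]$ described in the remark.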
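PROPOSITION 3.4.6 If $f : \Omega \times \mathbb{R}^N \longrightarrow \mathbb{R}^N$ is a Carathéodory function, for all $z \in \Omega$, $f(z, \cdot)$ is a monotone map and $N_f$ maps $L^p\big( \Omega; \mathbb{R}^N \big)$ into $L^{p'}\big( \Omega; \mathbb{R}^N \big)$, where $p \in [1, +\infty)$, $\frac{1}{p} + \frac{1}{p'} = 1$, then $N_f : L^p\big( \Omega; \mathbb{R}^N \big) \longrightarrow L^{p'}\big( \Omega; \mathbb{R}^N \big)$ is a maximal monotone operator.

PROOF If by $\langle \cdot, \cdot \rangle_{pp'}$ we denote the duality brackets for the pair of spaces $\big( L^{p'}\big( \Omega; \mathbb{R}^N \big), L^p\big( \Omega; \mathbb{R}^N \big) \big)$, for all $u, v \in L^p\big( \Omega; \mathbb{R}^N \big)$, we have
\[
\big\langle N_f(u) - N_f(v), u - v \big\rangle_{pp'} = \int_\Omega \Big( f\big( z, u(z) \big) - f\big( z, v(z) \big), u(z) - v(z) \Big)_{\mathbb{R}^N} \, d\mu \ge 0
\]
due to the monotonicity of $f(z, \cdot)$. Hence $N_f$ is monotone. Moreover, by Theorem 3.4.4, $N_f$ is continuous. Therefore, Proposition 3.2.19 implies that $N_f$ is maximal monotone.

PROPOSITION 3.4.7 If $f : \Omega \times \mathbb{R}^N \longrightarrow \mathbb{R}^N$ is a Carathéodory function, such that
(i) $f(z, \cdot)$ is a strictly monotone map for $\mu$-almost all $z \in \Omega$;
(ii) $\big( f(z, x), x \big)_{\mathbb{R}^N} \ge c_1 \|x\|_{\mathbb{R}^N}^p - a_1(z)$ for $\mu$-almost all $z \in \Omega$ and all $x \in \mathbb{R}^N$, with $a_1 \in L^1(\Omega)_+$, $c_1 > 0$;
(iii) $N_f$ maps $L^p\big( \Omega; \mathbb{R}^N \big)$ into $L^{p'}\big( \Omega; \mathbb{R}^N \big)$, with $p \in [1, +\infty)$, $\frac{1}{p} + \frac{1}{p'} = 1$,
then $N_f : L^p\big( \Omega; \mathbb{R}^N \big) \longrightarrow L^{p'}\big( \Omega; \mathbb{R}^N \big)$ is an operator of type $(S)_+$ (see Definition 3.2.55(b)).

PROOF Suppose that $\{u_n\}_{n \ge 1} \subseteq L^p\big( \Omega; \mathbb{R}^N \big)$ is a sequence, such that
\[
u_n \stackrel{w}{\longrightarrow} u \quad \text{in } L^p\big( \Omega; \mathbb{R}^N \big)
\]
and
\[
\limsup_{n \to +\infty} \big\langle N_f(u_n) - N_f(u), u_n - u \big\rangle_{pp'} \le 0.
\]
We need to show that
\[
u_n \longrightarrow u \quad \text{in } L^p\big( \Omega; \mathbb{R}^N \big).
\]
From the monotonicity of $N_f$ (see Proposition 3.4.6), we have that
\[
\big\langle N_f(u_n) - N_f(u), u_n - u \big\rangle_{pp'} \longrightarrow 0.
\]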
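Note that
\[
\big\langle N_f(u_n) - N_f(u), u_n - u \big\rangle_{pp'} = \int_\Omega \Big( f\big( z, u_n(z) \big) - f\big( z, u(z) \big), u_n(z) - u(z) \Big)_{\mathbb{R}^N} \, d\mu.
\]
Because of the monotonicity of $f(z, \cdot)$, by passing to a subsequence, we may assume that
\[
\beta_n(z) \stackrel{df}{=} \Big( f\big( z, u_n(z) \big) - f\big( z, u(z) \big), u_n(z) - u(z) \Big)_{\mathbb{R}^N} \longrightarrow 0 \quad \text{for } \mu\text{-a.a. } z \in \Omega
\]
and
\[
\big| \beta_n(z) \big| \le k(z) \quad \text{for } \mu\text{-a.a. } z \in \Omega \text{ and all } n \ge 1,
\]
with $k \in L^1(\Omega)_+$. From Theorem 3.4.4, we know that for $\mu$-almost all $z \in \Omega$ and all $x \in \mathbb{R}^N$, we have
\[
\big\| f(z, x) \big\|_{\mathbb{R}^N} \le a(z) + c \|x\|_{\mathbb{R}^N}^{p-1},
\]
with $a \in L^{p'}(\Omega)_+$, $c > 0$ (recall that $\frac{p}{p'} = p - 1$). So for all $z \in \Omega \setminus D$, $\mu(D) = 0$ and all $n \ge 1$, we have
\[
\begin{aligned}
k(z) \ge \beta_n(z) \ge{}& c_1 \Big( \big\| u_n(z) \big\|_{\mathbb{R}^N}^p + \big\| u(z) \big\|_{\mathbb{R}^N}^p \Big) - \big\| u_n(z) \big\|_{\mathbb{R}^N} \Big( a(z) + c \big\| u(z) \big\|_{\mathbb{R}^N}^{p-1} \Big) \\
& - \big\| u(z) \big\|_{\mathbb{R}^N} \Big( a(z) + c \big\| u_n(z) \big\|_{\mathbb{R}^N}^{p-1} \Big) - 2 a_1(z).
\end{aligned}
\tag{3.121}
\]
Using Young's inequality (see Proposition A.4.5) with $\varepsilon > 0$, we have
\[
c \big\| u_n(z) \big\|_{\mathbb{R}^N} \big\| u(z) \big\|_{\mathbb{R}^N}^{p-1} \le \frac{\varepsilon}{p} \big\| u_n(z) \big\|_{\mathbb{R}^N}^p + \frac{c^{p'}}{\varepsilon^{p'/p} p'} \big\| u(z) \big\|_{\mathbb{R}^N}^p
\tag{3.122}
\]
and
\[
c \big\| u(z) \big\|_{\mathbb{R}^N} \big\| u_n(z) \big\|_{\mathbb{R}^N}^{p-1} \le \frac{c^p}{\varepsilon^{p/p'} p} \big\| u(z) \big\|_{\mathbb{R}^N}^p + \frac{\varepsilon}{p'} \big\| u_n(z) \big\|_{\mathbb{R}^N}^p.
\tag{3.123}
\]
Using (3.122) and (3.123) in (3.121), we obtain
\[
\begin{aligned}
c_1 \big\| u_n(z) \big\|_{\mathbb{R}^N}^p \le{}& k(z) + c_1 \big\| u(z) \big\|_{\mathbb{R}^N}^p + \frac{\varepsilon}{p} \big\| u_n(z) \big\|_{\mathbb{R}^N}^p + \frac{c^{p'}}{\varepsilon^{p'/p} p'} \big\| u(z) \big\|_{\mathbb{R}^N}^p \\
& + a(z) \Big( \big\| u_n(z) \big\|_{\mathbb{R}^N} + \big\| u(z) \big\|_{\mathbb{R}^N} \Big) + \frac{c^p}{\varepsilon^{p/p'} p} \big\| u(z) \big\|_{\mathbb{R}^N}^p + \frac{\varepsilon}{p'} \big\| u_n(z) \big\|_{\mathbb{R}^N}^p + 2 a_1(z).
\end{aligned}
\tag{3.124}
\]
Recall that $\frac{1}{p} + \frac{1}{p'} = 1$ and choose $\varepsilon < c_1$. Then from (3.124), it follows that the sequence
\[
\Big\{ \big\| u_n(\cdot) \big\|_{\mathbb{R}^N}^p \Big\}_{n \ge 1} \subseteq L^1(\Omega)_+
\]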
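is uniformly integrable. Also for all $z \in \Omega \setminus D$, $\mu(D) = 0$, the sequence $\big\{ u_n(z) \big\}_{n \ge 1} \subseteq \mathbb{R}^N$ is bounded. So by passing to a suitable subsequence (depending in general on $z \in \Omega \setminus D$), we may assume that $u_n(z) \longrightarrow \hat{u}(z)$ in $\mathbb{R}^N$. Recall that $f(z, \cdot)$ is continuous and that $\beta_n(z) \longrightarrow 0$, so in the limit we obtain
\[
\Big( f\big( z, \hat{u}(z) \big) - f\big( z, u(z) \big), \hat{u}(z) - u(z) \Big)_{\mathbb{R}^N} = 0.
\]
Since by hypothesis $f(z, \cdot)$ is strictly monotone, we infer that
\[
\hat{u}(z) = u(z) \quad \forall\, z \in \Omega \setminus D
\]
and so it follows that
\[
u_n(z) \longrightarrow u(z) \quad \text{in } \mathbb{R}^N, \quad \forall\, z \in \Omega \setminus D,\ \mu(D) = 0.
\tag{3.125}
\]
From (3.125), the uniform integrability of the sequence
\[
\Big\{ \big\| u_n(\cdot) \big\|_{\mathbb{R}^N}^p \Big\}_{n \ge 1} \subseteq L^1(\Omega)_+
\]
and Vitali's theorem (the extended dominated convergence theorem; see Theorem A.2.9), we obtain that
\[
\| u_n \|_{L^p(\Omega; \mathbb{R}^N)} \longrightarrow \| u \|_{L^p(\Omega; \mathbb{R}^N)}.
\]
Since
\[
u_n \stackrel{w}{\longrightarrow} u \quad \text{in } L^p\big( \Omega; \mathbb{R}^N \big)
\]
and the latter space is uniformly convex, from the Kadec-Klee property (see Remark A.3.22), we know that
\[
u_n \longrightarrow u \quad \text{in } L^p\big( \Omega; \mathbb{R}^N \big),
\]
hence $N_f$ is of type $(S)_+$.

Next we pass to the study of integral functionals defined on Lebesgue-Bochner spaces. So if $(\Omega, \Sigma, \mu)$ is a nonatomic, complete, $\sigma$-finite measure space, $X$ a separable Banach space and $f : \Omega \times X \longrightarrow \overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$ is a $\Sigma \times \mathcal{B}(X)$-measurable function (an integrand), we consider the integral functional
\[
I_f(u) \stackrel{df}{=} \int_\Omega f\big( z, u(z) \big) \, d\mu \quad \forall\, u \in L^p(\Omega; X),
\]
with $p \in [1, +\infty]$. We start with a definition which extends the notion of a Carathéodory function (see Definition 3.4.1).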
DEFINITION 3.4.8 Let $(\Omega, \Sigma, \mu)$ be a complete $\sigma$-finite measure space and $X$ a separable metric space. We say that $f : \Omega \times X \longrightarrow \overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$ is a normal integrand, if
(a) $f$ is $\Sigma \times \mathcal{B}(X)$-measurable; and
(b) the function $x \longmapsto f(z, x)$ is lower semicontinuous for $\mu$-almost all $z \in \Omega$.

We show that normal integrands which are bounded below can be realized as the upper envelope of a sequence of Carathéodory integrands.

PROPOSITION 3.4.9 If $(\Omega, \Sigma, \mu)$ is a complete $\sigma$-finite measure space, $X$ is a separable metric space with metric $d_X$, $f : \Omega \times X \longrightarrow \overline{\mathbb{R}}$ is a normal integrand and there exists a function $h : \Omega \longrightarrow \mathbb{R}$ (not necessarily measurable), such that
\[
h(z) \le f(z, x) \quad \text{for } \mu\text{-a.a. } z \in \Omega \text{ and all } x \in X,
\]
then we can find a sequence of functions $f_n : \Omega \times X \longrightarrow \mathbb{R}$ for $n \ge 1$, such that for all $n \ge 1$, we have
(a) $h(z) \le f_n(z, x) \le n$ for $\mu$-almost all $z \in \Omega$ and all $x \in X$;
(b) the function $z \longmapsto f_n(z, x)$ is measurable for all $x \in X$;
(c) $\big| f_n(z, x) - f_n(z, v) \big| \le n\, d_X(x, v)$ for all $z \in \Omega$ and all $x, v \in X$;
(d) $f_n(z, x) \nearrow f(z, x)$ for $\mu$-almost all $z \in \Omega$ and all $x \in X$.

PROOF For every $n \ge 1$, let
\[
\hat{f}_n(z, x) \stackrel{df}{=} \inf_{y \in X} \big[ f(z, y) + n\, d_X(y, x) \big].
\]
Evidently, for all $n \ge 1$, $\mu$-almost all $z \in \Omega$ and all $x \in X$, we have
\[
h(z) \le \hat{f}_1(z, x) \le \ldots \le \hat{f}_n(z, x) \le \hat{f}_{n+1}(z, x) \le \ldots \le f(z, x).
\]
If we fix $n \ge 1$, $x \in X$ and $\lambda \in \mathbb{R}$, we have
\[
\Big\{ z \in \Omega : \hat{f}_n(z, x) < \lambda \Big\} = \mathrm{proj}_\Omega \Big\{ (z, y) \in \Omega \times X : f(z, y) + n\, d_X(y, x) < \lambda \Big\}.
\]
By virtue of the joint measurability of $f$ (see Definition 3.4.8), we deduce that
\[
C_\lambda \stackrel{df}{=} \Big\{ (z, y) \in \Omega \times X : f(z, y) + n\, d_X(y, x) < \lambda \Big\} \in \Sigma \times \mathcal{B}(X).
\]
Hence from the Yankov-von Neumann-Aumann projection theorem (see Theorem A.2.32) and since by hypothesis $\Sigma$ is $\mu$-complete, we have $\mathrm{proj}_\Omega C_\lambda \in \Sigma$ and so the function $z \longmapsto \hat{f}_n(z, x)$ is measurable for all $x \in X$ and all $n \ge 1$. From the definition of $\hat{f}_n$, we have
\[
\hat{f}_n(z, x) \le f(z, y) + n\, d_X(y, x) \quad \forall\, (z, y) \in \Omega \times X,
\]
so
\[
\hat{f}_n(z, x) \le f(z, y) + n\, d_X(y, v) + n\, d_X(v, x) \quad \forall\, v \in X
\]
and thus, taking the infimum over $y \in X$,
\[
\hat{f}_n(z, x) - \hat{f}_n(z, v) \le n\, d_X(x, v).
\]
Interchanging the roles of $x$ and $v$ in the above argument, we conclude that
\[
\big| \hat{f}_n(z, x) - \hat{f}_n(z, v) \big| \le n\, d_X(x, v) \quad \forall\, z \in \Omega,\ x, v \in X.
\]
Therefore $\hat{f}_n$ is $\Sigma \times \mathcal{B}(X)$-measurable (see Remark 3.4.2). Moreover, since for all $(z, x) \in \Omega \times X$, the sequence $\big\{ \hat{f}_n(z, x) \big\}_{n \ge 1}$ is increasing, we have
\[
\lim_{n \to +\infty} \hat{f}_n(z, x) \le f(z, x) \quad \forall\, (z, x) \in \Omega \times X.
\tag{3.126}
\]
Let $D \subseteq \Omega$ be the $\mu$-null set, such that the function $f(z, \cdot)$ is lower semicontinuous for all $z \in \Omega \setminus D$. Then for all $z \in \Omega \setminus D$, $x \in X$ and $\varepsilon > 0$, let $\{y_n\}_{n \ge 1} \subseteq X$ be a sequence, such that
\[
f(z, y_n) + n\, d_X(y_n, x) \le \hat{f}_n(z, x) + \varepsilon.
\]
As $n \to +\infty$, either $\hat{f}_n(z, x) \nearrow +\infty$, in which case equality holds in (3.126), or otherwise we have $y_n \longrightarrow x$ in $X$. Then because of the lower semicontinuity of $f(z, \cdot)$, we have
\[
f(z, x) \le \liminf_{n \to +\infty} f(z, y_n) \le \lim_{n \to +\infty} \hat{f}_n(z, x) + \varepsilon.
\tag{3.127}
\]
Since $z \in \Omega \setminus D$, $x \in X$ and $\varepsilon > 0$ were arbitrary, from (3.126) and (3.127), we infer that
\[
\hat{f}_n(z, x) \nearrow f(z, x) \quad \text{for } \mu\text{-a.a. } z \in \Omega \text{ and all } x \in X.
\]
Finally set
\[
f_n(z, x) \stackrel{df}{=} \min \big\{ \hat{f}_n(z, x), n \big\}.
\]
Then the sequence $\{f_n\}_{n \ge 1}$ is the desired sequence.
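The regularization $f_n = \min\{\hat{f}_n, n\}$ used in the proof above can be visualized on a grid. The following sketch makes several simplifying assumptions of our own: there is no dependence on $z$, the space is $X = [-1, 1]$ with $d_X(x, y) = |x - y|$, the infimum is taken only over grid points, and $f$ is the lower semicontinuous step function equal to 0 on $[-1, 0]$ and 1 on $(0, 1]$. The name `lipschitz_regularization` is hypothetical.

```python
def lipschitz_regularization(f, grid, n):
    """Values of min(inf_y [f(y) + n*|y - x|], n) for x in grid, with y restricted to grid."""
    return [min(min(f(y) + n * abs(y - x) for y in grid), n) for x in grid]

# A lower semicontinuous step function on [-1, 1]: 0 on [-1, 0], 1 on (0, 1].
f = lambda x: 0.0 if x <= 0.0 else 1.0
grid = [k / 100.0 for k in range(-100, 101)]
levels = {n: lipschitz_regularization(f, grid, n) for n in (1, 2, 4, 8, 16)}
```

In this example $f_n(x) = \min\{1, n x^+\}$ at the grid points, so each `levels[n]` is $n$-Lipschitz, lies below $f$, and the values increase with $n$ towards $f$, mirroring properties (a), (c) and (d) of Proposition 3.4.9.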
Using this approximation result, we have another characterization of normal integrands. First let us recall the Scorza-Dragoni theorem, which is a parametrized version of Lusin's theorem (see Theorem A.2.11).

THEOREM 3.4.10 (Scorza-Dragoni Theorem) If $\Omega$, $X$ are two Polish spaces (see Definition A.2.29(a)), $Y$ is a separable metric space, $\mu$ is a tight Borel measure on $\Omega$ (see Remark 3.4.11) and $f : \Omega \times X \longrightarrow Y$ is a Carathéodory function, then for every $\varepsilon > 0$, we can find a compact set $\Omega_\varepsilon \subseteq \Omega$, with $\mu(\Omega \setminus \Omega_\varepsilon) < \varepsilon$, such that $f|_{\Omega_\varepsilon \times X}$ is continuous.

REMARK 3.4.11 Recall that $\mu$ is a tight Borel measure on $\Omega$, if $\mu$ is finite and for every $\varepsilon > 0$, we can find a compact subset $K_\varepsilon$ of $\Omega$, such that $\mu(\Omega \setminus K_\varepsilon) < \varepsilon$. On a Polish space every finite Borel measure is tight.

Combining Proposition 3.4.9 with Theorem 3.4.10, we obtain the following characterization of normal integrands.

PROPOSITION 3.4.12 If $\Omega$, $X$ are two Polish spaces, $\mu$ is a finite Borel measure on $\Omega$ and $f : \Omega \times X \longrightarrow \overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$, then $f$ is a normal integrand if and only if for every $\varepsilon > 0$ we can find a compact set $K_\varepsilon \subseteq \Omega$, such that $\mu(\Omega \setminus K_\varepsilon) < \varepsilon$ and $f|_{K_\varepsilon \times X}$ is lower semicontinuous.

Now we pass to the study of the integral functional
\[
I_f(u) \stackrel{df}{=} \int_\Omega f\big( z, u(z) \big) \, d\mu \quad \forall\, u \in L^p(\Omega; X),
\]
with $p \in [1, +\infty]$. In our analysis we shall use the computational convention $+\infty - \infty = +\infty$, which is useful when dealing with integral functionals of $\overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$-valued functions.
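THEOREM 3.4.13 If $(\Omega, \Sigma, \mu)$ is a nonatomic, $\sigma$-finite measure space, $X$ is a separable Banach space, $f : \Omega \times X \longrightarrow \overline{\mathbb{R}}$ is a normal integrand, the integral functional $I_f$ is not identically $+\infty$ and $p \in [1, +\infty)$, then the following properties are equivalent:
(a) $I_f$ is lower semicontinuous on $L^p(\Omega; X)$ and $I_f(u) > -\infty$ for all $u \in L^p(\Omega; X)$.
(b) $I_f : L^p(\Omega; X) \longrightarrow \overline{\mathbb{R}}$.
(c) There exist $\beta_1 \in \mathbb{R}$ and $\beta_2 > 0$, such that
\[
I_f(u) \ge \beta_1 - \beta_2 \|u\|_{L^p(\Omega; X)}^p \quad \forall\, u \in L^p(\Omega; X).
\]
(d) There exist $a \in L^1(\Omega)$ and $c > 0$, such that
\[
f(z, x) \ge a(z) - c \|x\|_X^p \quad \text{for } \mu\text{-a.a. } z \in \Omega \text{ and all } x \in X.
\]
PROOF Clearly implications (d)$\Longrightarrow$(c)$\Longrightarrow$(b) and (a)$\Longrightarrow$(b) hold.

“(b)$\Longrightarrow$(d)”: Let us set
\[
g \stackrel{df}{=} \min \{ f, 0 \}.
\]
We claim that
\[
\int_\Omega g\big( z, u(z) \big) \, d\mu > -\infty \quad \forall\, u \in L^p(\Omega; X).
\tag{3.128}
\]
Suppose that (3.128) does not hold. Then for some $u_0 \in L^p(\Omega; X)$, we have
\[
\int_\Omega g\big( z, u_0(z) \big) \, d\mu = -\infty.
\]
Let
\[
C \stackrel{df}{=} \Big\{ z \in \Omega : f\big( z, u_0(z) \big) = g\big( z, u_0(z) \big) \Big\} \in \Sigma.
\]
For any given $v \in L^p(\Omega; X)$, we define
\[
\hat{u} \stackrel{df}{=} \chi_C u_0 + \chi_{C^c} v \in L^p(\Omega; X).
\]
Then we have
\[
-\infty < I_f(\hat{u}) = \int_C g\big( z, u_0(z) \big) \, d\mu + \int_{C^c} f\big( z, v(z) \big) \, d\mu,
\]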
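so
\[
\int_{C^c} f\big( z, v(z) \big) \, d\mu = +\infty
\]
and thus $I_f \equiv +\infty$, a contradiction. From (3.128), it follows that $N_g$ (the Nemytskii operator corresponding to $g$) maps $L^p(\Omega; X)$ into $L^1(\Omega)$. Invoking Theorem 3.4.4, we obtain (d).

“(d)$\Longrightarrow$(a)”: Let $\lambda \in \mathbb{R}$ and consider a sequence $\{u_n\}_{n \ge 1} \subseteq L^p(\Omega; X)$, such that $u_n \longrightarrow u$ in $L^p(\Omega; X)$, for some $u \in L^p(\Omega; X)$ and
\[
I_f(u_n) \le \lambda \quad \forall\, n \ge 1.
\]
By passing to a subsequence if necessary, we may also assume that
\[
u_n(z) \longrightarrow u(z) \quad \text{for } \mu\text{-a.a. } z \in \Omega
\]
and
\[
\big\| u_n(z) \big\|_X \le k(z) \quad \text{for } \mu\text{-a.a. } z \in \Omega,\ n \ge 1,
\]
with $k \in L^p(\Omega)_+$. Then because of (d) and the lower semicontinuity of $f(z, \cdot)$, we can apply Fatou's lemma (see Theorem A.2.1) and obtain
\[
I_f(u) = \int_\Omega f\big( z, u(z) \big) \, d\mu \le \int_\Omega \liminf_{n \to +\infty} f\big( z, u_n(z) \big) \, d\mu \le \liminf_{n \to +\infty} \int_\Omega f\big( z, u_n(z) \big) \, d\mu \le \lambda,
\]
so $I_f$ is lower semicontinuous on $L^p(\Omega; X)$. Moreover, it is clear that
\[
I_f(u) > -\infty \quad \forall\, u \in L^p(\Omega; X).
\]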
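To see why a growth restriction such as (d) in Theorem 3.4.13 cannot be dropped, one can look at an integrand that decays faster than $-\|x\|^p$. The sketch below rests entirely on our own assumptions: $\Omega = (0, 1)$ with the Lebesgue measure, $X = \mathbb{R}$, $p = 1$, the integrand $f(z, x) = -|x|^2$ and the function $u(z) = z^{-0.6}$. It evaluates the truncated integrals over $[\delta, 1]$ in closed form; the helper name is hypothetical.

```python
def truncated_power_integral(s, delta):
    """Exact value of the integral of z^(-s) over [delta, 1], for s != 1."""
    return (1.0 - delta ** (1.0 - s)) / (1.0 - s)

p, q, s = 1.0, 2.0, 0.6   # u(z) = z^(-s), integrand f(z, x) = -|x|^q with q > p
deltas = [10.0 ** (-k) for k in (2, 4, 8, 16)]
norm_p = [truncated_power_integral(s * p, d) for d in deltas]   # bounded by 1/(1 - s*p)
I_f = [-truncated_power_integral(s * q, d) for d in deltas]     # decreases without bound
```

As $\delta \searrow 0$ the entries of `norm_p` stay below $2.5$, so $u \in L^1(0,1)$, while the entries of `I_f` diverge to $-\infty$. Thus $I_f(u) = -\infty$ for this $u$, so property (b) fails, in agreement with the fact that this integrand violates (d) for $p = 1$.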
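COROLLARY 3.4.14 If $(\Omega, \Sigma, \mu)$ is a nonatomic, complete, $\sigma$-finite measure space, $X$ is a separable Banach space, $f : \Omega \times X \longrightarrow \mathbb{R}$ is a Carathéodory function, such that
\[
\big| f(z, x) \big| \le a(z) + c \|x\|_X^p \quad \text{for } \mu\text{-a.a. } z \in \Omega \text{ and all } x \in X,
\]
with $a \in L^1(\Omega)_+$, $c > 0$ and $p \in [1, +\infty)$, then $I_f : L^p(\Omega; X) \longrightarrow \mathbb{R}$ is continuous.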
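For $p = +\infty$, we can state the following continuity result.

PROPOSITION 3.4.15 If $(\Omega, \Sigma, \mu)$ is a nonatomic, complete, $\sigma$-finite measure space, $X$ is a separable Banach space, $f : \Omega \times X \longrightarrow \mathbb{R}$ is a Carathéodory function and for all $r > 0$ we can find $a_r \in L^1(\Omega)_+$, such that
\[
\big| f(z, x) \big| \le a_r(z) \quad \text{for } \mu\text{-a.a. } z \in \Omega \text{ and all } \|x\|_X \le r,
\]
then $I_f : L^\infty(\Omega; X) \longrightarrow \mathbb{R}$ is continuous.

PROOF Suppose that $\{u_n\}_{n \ge 1} \subseteq L^\infty(\Omega; X)$ is a sequence, such that
\[
u_n \longrightarrow u \quad \text{in } L^\infty(\Omega; X),
\]
for some $u \in L^\infty(\Omega; X)$. Let
\[
r \stackrel{df}{=} \sup_{n \ge 1} \| u_n \|_{L^\infty(\Omega; X)} < +\infty.
\]
Then
\[
\big| f\big( z, u_n(z) \big) \big| \le a_r(z) \quad \text{for } \mu\text{-a.a. } z \in \Omega.
\]
Since
\[
f\big( z, u_n(z) \big) \longrightarrow f\big( z, u(z) \big) \quad \text{for } \mu\text{-a.a. } z \in \Omega,
\]
from the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that $I_f(u_n) \longrightarrow I_f(u)$.

PROPOSITION 3.4.16 If $(\Omega, \Sigma, \mu)$ is a nonatomic, complete, $\sigma$-finite measure space, $X$ is a separable Banach space, $f : \Omega \times X \longrightarrow \overline{\mathbb{R}}$ is a normal integrand, such that $f(z, \cdot)$ is convex for $\mu$-almost all $z \in \Omega$, $u_0 \in L^\infty(\Omega; X)$ and $I_f(u_0) \in \mathbb{R}$, then the following conditions are equivalent:
(a) $I_f$ is continuous at $u_0$ with respect to the norm topology on $L^\infty(\Omega; X)$.
(b) There exist $\varepsilon > 0$ and $a \in L^1(\Omega)$, such that
\[
\sup_{\|x\|_X \le \varepsilon} f\big( z, x + u_0(z) \big) \le a(z) \quad \text{for } \mu\text{-a.a. } z \in \Omega.
\]
PROOF “(a)$\Longrightarrow$(b)”: Suppose that the implication is not true. Then for any $\varepsilon > 0$, the function
\[
\xi_\varepsilon(z) \stackrel{df}{=} \sup_{\|x\|_X \le \varepsilon} f\big( z, x + u_0(z) \big)
\]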
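is not integrable (note that since $f$ is a normal integrand, $\xi_\varepsilon$ is measurable). Clearly
\[
\int_\Omega \xi_\varepsilon(z) \, d\mu = +\infty \quad \forall\, \varepsilon > 0.
\]
So for every $\varepsilon > 0$ and $N \ge 1$, we can find a measurable function $\xi_{\varepsilon, N} : \Omega \longrightarrow \mathbb{R}$, such that
\[
\int_\Omega \xi_{\varepsilon, N}(z) \, d\mu > N \quad \text{and} \quad \xi_{\varepsilon, N}(z) \le \xi_\varepsilon(z) \quad \forall\, z \in \Omega.
\]
Then for every $z \in \Omega$, we define
\[
S_{\varepsilon, N}(z) \stackrel{df}{=} \Big\{ x \in X : \big\| x - u_0(z) \big\|_X \le \varepsilon,\ f(z, x) \ge \xi_{\varepsilon, N}(z) \Big\}.
\]
Since $(z, x) \longmapsto \big\| x - u_0(z) \big\|_X$ is a Carathéodory function (hence jointly measurable; see Remark 3.4.2) and $(z, x) \longmapsto f(z, x) - \xi_{\varepsilon, N}(z)$ is a $\Sigma \times \mathcal{B}(X)$-measurable function (since $f$ is a normal integrand), we infer that
\[
\mathrm{Gr}\, S_{\varepsilon, N} \in \Sigma \times \mathcal{B}(X).
\]
Applying the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33), we obtain a measurable map $u_{\varepsilon, N} : \Omega \longrightarrow X$, such that
\[
u_{\varepsilon, N}(z) \in S_{\varepsilon, N}(z) \quad \forall\, z \in \Omega.
\]
Clearly $u_{\varepsilon, N} \in L^\infty(\Omega; X)$, $\big\| u_{\varepsilon, N} - u_0 \big\|_{L^\infty(\Omega; X)} \le \varepsilon$ and
\[
I_f(u_{\varepsilon, N}) \ge \int_\Omega \xi_{\varepsilon, N}(z) \, d\mu > N.
\]
Since $\varepsilon > 0$ and $N \ge 1$ were arbitrary, it follows that the convex integral functional $I_f$ is unbounded from above in every $L^\infty(\Omega; X)$-neighbourhood of $u_0$ and so it cannot be continuous at $u_0$, a contradiction.

“(b)$\Longrightarrow$(a)”: For every
\[
v \in B_\varepsilon \stackrel{df}{=} \Big\{ v \in L^\infty(\Omega; X) : \| v \|_{L^\infty(\Omega; X)} \le \varepsilon \Big\},
\]
we have
\[
I_f(u_0 \pm v) = \int_\Omega f\big( z, u_0(z) \pm v(z) \big) \, d\mu \le \int_\Omega a(z) \, d\mu = \eta < +\infty.
\]
Since $I_f$ is convex and bounded above in a neighbourhood of $u_0$, it is continuous at $u_0$.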
Thus far we have considered the norm topology on the Lebesgue-Bochner space. If we want to have weak lower semicontinuity, then, as we show, necessarily $f(z,\cdot)$ must be convex. For this purpose we need to use some tools from multivalued analysis, which for the convenience of the reader we recall here. Details can be found in Denkowski, Migórski & Papageorgiou (2003a, Chapter 4).
DEFINITION 3.4.17 Let $Y$ be a separable Banach space and let $G:\Omega\to 2^Y\setminus\{\emptyset\}$ be a multivalued (set-valued) map. We say that $G$ is graph measurable, if $\mathrm{Gr}\,G\in\Sigma\times\mathcal{B}(Y)$, where
$$\mathrm{Gr}\,G := \big\{(z,y)\in\Omega\times Y : y\in G(z)\big\}.$$
Also, for any $p\in[1,+\infty]$, we set
$$S^p_G := \big\{g\in L^p(\Omega;Y) : g(z)\in G(z) \text{ for } \mu\text{-a.a. } z\in\Omega\big\}$$
(the set of $L^p$-selections of the multifunction $G$).
The next result from multivalued analysis is the main tool in establishing the necessity of the convexity of $f(z,\cdot)$.
PROPOSITION 3.4.18 If $(\Omega,\Sigma,\mu)$ is a nonatomic, complete, σ-finite measure space, $Y$ is a separable Banach space and $G:\Omega\to 2^Y\setminus\{\emptyset\}$ is graph measurable with $S^p_G\neq\emptyset$, $p\in[1,\infty)$, then
$$\overline{S^p_G}^{\,w} = S^p_{\overline{\mathrm{conv}}\,G},$$
where $w$ stands for the weak topology on $L^p(\Omega;Y)$.
REMARK 3.4.19 There is an analogous result for $p=+\infty$. More precisely, let $(\Omega,\Sigma,\mu)$ and $Y$ be as in Proposition 3.4.18. We denote by $Y^*_{w^*}$ the dual space of $Y$ furnished with the $w^*$-topology. From Theorem 2.2.12, we know that $L^\infty(\Omega;Y^*_{w^*}) = L^1(\Omega;Y)^*$, where $L^\infty(\Omega;Y^*_{w^*})$ is understood in the sense of Definition 2.2.10. Let $G:\Omega\to 2^{Y^*}\setminus\{\emptyset\}$ be a multifunction, such that
$$\mathrm{Gr}\,G\in\Sigma\times\mathcal{B}\big(Y^*_{w^*}\big) \quad \text{and} \quad S^\infty_G\neq\emptyset.$$
Then
$$\overline{S^\infty_G}^{\,w^*} = S^\infty_{\overline{\mathrm{conv}}^{\,w^*} G} \quad \text{in } L^\infty(\Omega;Y^*_{w^*}).$$
Using this general result about multifunctions, we can prove the following theorem about the integral functional $I_f$, according to which the weak lower semicontinuity of $I_f$ on $L^1(\Omega;X)$ implies the convexity of $f(z,\cdot)$ for all $z\in\Omega$.
THEOREM 3.4.20 If $(\Omega,\Sigma,\mu)$ is a nonatomic, complete, σ-finite measure space, $X$ is a separable Banach space, $f:\Omega\times X\to\overline{\mathbb{R}}$ is a normal integrand, there exists $u_0\in L^1(\Omega;X)$, such that $I_f(u_0)<+\infty$ and $I_f$ is weakly lower semicontinuous on $L^1(\Omega;X)$, then $f(z,\cdot)$ is convex for all $z\in\Omega$.
PROOF Without any loss of generality, we may assume that
$$f\big(z,u_0(z)\big) = 0 \quad \forall\, z\in\Omega$$
(otherwise replace $f(z,x)$ by $\widehat{f}(z,x) = f(z,x)-f\big(z,u_0(z)\big)$). Consider the multifunction $E:\Omega\to 2^{X\times\mathbb{R}}$, defined by
$$E(z) := \mathrm{epi}\,f(z,\cdot) = \big\{(x,\lambda)\in X\times\mathbb{R} : f(z,x)\le\lambda\big\}$$
(the epigraph of $f(z,\cdot)$). Since
$$\big(u_0(z),0\big)\in E(z) \quad \forall\, z\in\Omega,$$
we see that $E$ has nonempty values, which are also closed due to the lower semicontinuity of $f(z,\cdot)$. Moreover, $(u_0,0)\in S^1_E$.
We claim that $S^1_E$ is weakly closed in $L^1(\Omega;X\times\mathbb{R})$. To this end let $\{(u_\alpha,\lambda_\alpha)\}_{\alpha\in J}$ be a net in $S^1_E$ and assume that
$$u_\alpha \xrightarrow{w} u \ \text{ in } L^1(\Omega;X) \quad \text{and} \quad \lambda_\alpha \xrightarrow{w} \lambda \ \text{ in } L^1(\Omega).$$
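For every $C\in\Sigma$, we have that
$$\chi_C u_\alpha \xrightarrow{w} \chi_C u \quad \text{in } L^1(\Omega;X)$$
and so
$$\chi_C u_\alpha + \chi_{C^c}u_0 \xrightarrow{w} \chi_C u + \chi_{C^c}u_0 \quad \text{in } L^1(\Omega;X).$$
Also
$$\chi_C\lambda_\alpha \xrightarrow{w} \chi_C\lambda \quad \text{in } L^1(\Omega).$$
Note that
$$I_f(\chi_C u_\alpha + \chi_{C^c}u_0) = \int_C f\big(z,u_\alpha(z)\big)\,d\mu. \tag{3.129}$$
Since by hypothesis $I_f$ is weakly lower semicontinuous on $L^1(\Omega;X)$, we have
$$\liminf_\alpha I_f(\chi_C u_\alpha + \chi_{C^c}u_0) \ge I_f(\chi_C u + \chi_{C^c}u_0) = \int_C f\big(z,u(z)\big)\,d\mu,$$
so
$$\liminf_\alpha \int_C f\big(z,u_\alpha(z)\big)\,d\mu \ge \int_C f\big(z,u(z)\big)\,d\mu.$$
Because
$$f\big(z,u_\alpha(z)\big)\le\lambda_\alpha(z) \quad \text{for } \mu\text{-a.a. } z\in\Omega,$$
we have
$$\int_C f\big(z,u_\alpha(z)\big)\,d\mu \le \int_C \lambda_\alpha(z)\,d\mu,$$
so, from (3.129), we have
$$\int_C f\big(z,u(z)\big)\,d\mu \le \int_C \lambda(z)\,d\mu.$$
The set $C\in\Sigma$ was arbitrary. Hence it follows that
$$f\big(z,u(z)\big)\le\lambda(z) \quad \text{for } \mu\text{-a.a. } z\in\Omega$$
and $(u,\lambda)\in S^1_E$, which proves that $S^1_E$ is weakly closed in $L^1(\Omega;X\times\mathbb{R})$.
Then by virtue of Proposition 3.4.18, we have that
$$S^1_E = S^1_{\overline{\mathrm{conv}}\,E}.$$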
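We claim that this implies that
$$E(z) = \overline{\mathrm{conv}}\,E(z) \quad \text{for } \mu\text{-a.a. } z\in\Omega.$$
We proceed by contradiction. Suppose that
$$D(z) = \overline{\mathrm{conv}}\,E(z)\setminus E(z)\neq\emptyset \quad \forall\, z\in C\in\Sigma,$$
with $\mu(C)>0$. Clearly $\mathrm{Gr}\,D\in(\Sigma\cap C)\times\mathcal{B}(X\times\mathbb{R})$ and so by the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.33) we can find two $(\Sigma\cap C)$-measurable functions $\widehat{u}:C\to X$ and $\widehat{\lambda}:C\to\mathbb{R}$, such that
$$\big(\widehat{u}(z),\widehat{\lambda}(z)\big)\in D(z) \quad \forall\, z\in C.$$
Exploiting the σ-finiteness of the measure space, we can find $C'\subseteq C$ with $C'\in\Sigma$ and $0<\mu(C')<+\infty$, such that
$$\chi_{C'}\widehat{u}\in L^1(\Omega;X) \quad \text{and} \quad \chi_{C'}\widehat{\lambda}\in L^1(\Omega).$$
Then let
$$u := \chi_{C'}\widehat{u} + \chi_{(C')^c}u_0 \quad \text{and} \quad \lambda := \chi_{C'}\widehat{\lambda}.$$
Evidently $(u,\lambda)\in S^1_{\overline{\mathrm{conv}}\,E} = S^1_E$ and
$$\big(u(z),\lambda(z)\big)\in D(z) \quad \forall\, z\in C',$$
a contradiction. Therefore $E(z)=\overline{\mathrm{conv}}\,E(z)$ for $\mu$-a.a. $z\in\Omega$ and by redefining $E$ on a $\mu$-null set, we may assume that
$$E(z) = \overline{\mathrm{conv}}\,E(z) \quad \forall\, z\in\Omega.$$
This proves the convexity of $f(z,\cdot)$ for all $z\in\Omega$.
In the next section we use the theory of Young measures to prove very general lower semicontinuity results for integral functionals. Moreover, in Section 4.3, we will focus on convex integral functionals.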
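The necessity of convexity in Theorem 3.4.20 can be illustrated numerically. The following is a minimal sketch under assumptions that are not part of the text: $\Omega=[0,1]$ with Lebesgue measure, $X=\mathbb{R}$, the nonconvex integrand $f(x)=(x^2-1)^2$, and the oscillating sequence $u_n(z)=\pm 1$ on dyadic intervals, which converges weakly to $0$ in $L^1[0,1]$. All function names are hypothetical. The sketch checks that $I_f(u_n)=0$ while $I_f(0)=1$, so $I_f$ is not weakly lower semicontinuous for this $f$.

```python
import math

def rademacher(n, z):
    """u_n(z) = +1 on [k/2^n, (k+1)/2^n) for even k, and -1 for odd k.

    This agrees almost everywhere on [0, 1) with sgn(sin(2^n * pi * z)).
    """
    return 1.0 if math.floor((2 ** n) * z) % 2 == 0 else -1.0

def integrate(g, cells=4096):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / cells
    return h * sum(g((j + 0.5) * h) for j in range(cells))

def f(x):
    """Double-well integrand f(x) = (x^2 - 1)^2: not convex, no z-dependence."""
    return (x * x - 1.0) ** 2

def I_f(u):
    """Integral functional u -> integral over [0, 1] of f(u(z)) dz."""
    return integrate(lambda z: f(u(z)))

def pairing(n, phi):
    """Integral of phi * u_n over [0, 1]; small values indicate u_n -> 0 weakly."""
    return integrate(lambda z: phi(z) * rademacher(n, z))

if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, pairing(n, lambda z: z), I_f(lambda z: rademacher(n, z)))
    print("I_f(0) =", I_f(lambda z: 0.0))
```

The pairings against the test function $\varphi(z)=z$ shrink as $n$ grows, whereas the values $I_f(u_n)$ stay at $0$, strictly below $I_f(0)$.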
3.5 Young Measures
According to Proposition 2.3.38, a sequence $\{u_n\}_{n\ge 1}\subseteq L^1(\Omega)$, which converges weakly but not strongly in $L^1(\Omega)$, oscillates violently around its weak limit. However, in the limit all this information about the faster and faster oscillations is lost and only a mean value is recorded. Of course this is not satisfactory, because if for example on $\{u_n\}_{n\ge 1}$ we act with the Nemytskii operator $N_f$, we cannot say that
$$N_f(u_n) \xrightarrow{w} N_f(u) \quad \text{in } L^1(\Omega),$$
unless $f(z,\cdot)$ is affine. The idea is then to embed the sequence $\{u_n\}_{n\ge 1}$ into a larger space and consider the limit there. The appropriate space is that of probability-valued functions (parametrized measures). These are the Young measures.
In what follows let $\Omega$ and $E$ be locally compact, σ-compact metric spaces, $\Sigma$ a σ-field on $\Omega$ containing $\mathcal{B}(\Omega)$ (the Borel σ-field of $\Omega$) and $\mu\in M(\Omega)_+$ (see Section 2.3), which is nonatomic and such that $\Sigma$ is $\mu$-complete. Also we set
$$M^1_+(E) := \big\{\lambda\in M(E)_+ : \lambda(E)=1\big\} \tag{3.130}$$
(the probability measures on $E$) and
$$SM^1_+(E) := \big\{\lambda\in M(E)_+ : \lambda(E)\le 1\big\}$$
(the subprobability measures on $E$).
DEFINITION 3.5.1 A transition probability (respectively transition subprobability) on $E$ is a function
$$\lambda:\Omega\to M^1_+(E) \quad (\text{respectively } \lambda:\Omega\to SM^1_+(E)),$$
such that for every $A\in\mathcal{B}(E)$, we have that the function $z\mapsto\lambda(z)(A)$ is $\Sigma$-measurable. By $\widehat{\mathcal{R}}(\Omega,E)$ (respectively $\widehat{\mathcal{SR}}(\Omega,E)$) we denote the space of transition probabilities (respectively subprobabilities) on $\Omega$.
REMARK 3.5.2 On $M^1_+(E)$ we can consider the topology of narrow convergence (see Definition 2.3.42(c)). This is the relative topology on $M^1_+(E)$ induced by $w\big(M(E),C_b(E)\big)$. Recall that $M^1_+(E)$ furnished with this topology is a Polish space (see Definition A.2.29(a)). In the sequel by $M^1_+(E)_n$ we denote the space $M^1_+(E)$ equipped with the topology of narrow convergence.
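The loss of information under weak convergence, described at the start of this section, can be observed numerically. The sketch below rests on assumptions that are not in the text: $\Omega=[0,1]$ with Lebesgue measure, $u_n(z)=\sin(nz)$, and hypothetical helper names. It compares the averages of $g(u_n)$ over $[0,1]$ with the average of $g(\sin t)$ over one period. For continuous $g$ the former approach the latter as $n$ grows, and the limit differs from $g(0)$ unless $g$ is affine.

```python
import math

def integrate(g, a, b, cells=20_000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / cells
    return h * sum(g(a + (j + 0.5) * h) for j in range(cells))

def mean_of_g_sin(g, n):
    """Integral over [0, 1] of g(sin(n z)) dz."""
    return integrate(lambda z: g(math.sin(n * z)), 0.0, 1.0)

def period_average(g):
    """(1 / (2 pi)) * integral over [0, 2 pi] of g(sin t) dt."""
    return integrate(lambda t: g(math.sin(t)), 0.0, 2.0 * math.pi) / (2.0 * math.pi)

if __name__ == "__main__":
    n = 200
    # Affine g: only the mean value 0 of the oscillation is seen.
    print("mean of u_n      :", mean_of_g_sin(lambda x: x, n))
    # Nonlinear g: the limit is the period average, not g(0) = 0.
    print("mean of u_n^2    :", mean_of_g_sin(lambda x: x * x, n))
    print("period average   :", period_average(lambda x: x * x))
```

With $g(x)=x^2$ the averages approach $1/2$ rather than $g(0)=0$, which is the kind of information a parametrized measure is designed to retain.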
PROPOSITION 3.5.3 If $\lambda:\Omega\to M^1_+(E)$ is a $\big(\Sigma,\mathcal{B}(M^1_+(E)_n)\big)$-measurable function, then $\lambda\in\widehat{\mathcal{R}}(\Omega,E)$.
PROOF Let $U$ be an open set in $E$. The map $z\mapsto\lambda(z)(U)$ is the composition of $\lambda:\Omega\to M^1_+(E)$, which by hypothesis is $\big(\Sigma,\mathcal{B}(M^1_+(E)_n)\big)$-measurable, and of the map $\xi:M^1_+(E)\to[0,1]$, defined by $\xi(\nu)=\nu(U)$, which is lower semicontinuous (by virtue of the Portmanteau theorem; see Theorem A.2.36). Therefore the map $z\mapsto\lambda(z)(U)$ is $\Sigma$-measurable. Since $\lambda(z)\in M^1_+(E)$ is regular for every $z\in\Omega$, we conclude that the map $z\mapsto\lambda(z)(A)$ is $\Sigma$-measurable for all $A\in\mathcal{B}(E)$.
PROPOSITION 3.5.4 If $E$ is compact and $\lambda\in\widehat{\mathcal{R}}(\Omega,E)$, then $\lambda$ is a $\big(\Sigma,\mathcal{B}(M^1_+(E)_n)\big)$-measurable function.
PROOF Recall that $C(E)^* = M(E)$. Then for every $g\in C(E)$, we have that the map
$$z\mapsto \big\langle\lambda(z),g\big\rangle_{C(E)} = \int_E g(x)\,\lambda(z)(dx)$$
is $\Sigma$-measurable. Indeed, suppose that $s:E\to\mathbb{R}$ is a simple function. Then since $\lambda\in\widehat{\mathcal{R}}(\Omega,E)$, we have that the map
$$z\mapsto \int_E s(x)\,\lambda(z)(dx)$$
is $\Sigma$-measurable. We can find a sequence $\{s_n\}_{n\ge 1}$ of simple functions on $E$, such that $s_n(x)\to g(x)$ uniformly on $E$. Then
$$\int_E s_n(x)\,\lambda(z)(dx) \to \int_E g(x)\,\lambda(z)(dx),$$
which proves the $\Sigma$-measurability of the map $z\mapsto\langle\lambda(z),g\rangle_{C(E)}$. So the map $z\mapsto\lambda(z)$ is weakly$^*$-measurable. Because $M^1_+(E)_n$ is a compact metrizable space, we conclude that $z\mapsto\lambda(z)$ is $\big(\Sigma,\mathcal{B}(M^1_+(E)_n)\big)$-measurable.
Let us recall the notion of image measure.
DEFINITION 3.5.5 Let $(S,\mathcal{T})$ be a measurable space, $Y$ a Hausdorff topological space, $\xi:S\to Y$ a $\big(\mathcal{T},\mathcal{B}(Y)\big)$-measurable function and $\lambda:\mathcal{T}\to\mathbb{R}_+\cup\{+\infty\}$ a measure. The image of $\lambda$ under $\xi$ is the measure $\nu:\mathcal{B}(Y)\to\mathbb{R}_+\cup\{+\infty\}$, defined by
$$\nu(A) := \lambda\big(\xi^{-1}(A)\big) \quad \forall\, A\in\mathcal{B}(Y).$$
We often denote $\nu$ by $\lambda\circ\xi^{-1}$.
REMARK 3.5.6 If $g:Y\to\mathbb{R}$ is a $\nu$-integrable map or a measurable and positive map, then
$$\int_S g\big(\xi(s)\big)\,d\lambda(s) = \int_Y g(y)\,d\nu(y).$$
Now we can give the definition of Young measure.
DEFINITION 3.5.7
(a) $\lambda\in M(\Omega\times E)_+$ is a Young measure with respect to $\mu$, if
$$\lambda(A\times E) = \mu(A) \quad \forall\, A\in\Sigma.$$
By $\mathcal{Y}(\Omega,E,\mu)$, we denote the space of Young measures with respect to $\mu$.
(b) $\lambda\in M(\Omega\times E)_+$ is a Young submeasure with respect to $\mu$, if
$$\lambda(A\times E) \le \mu(A) \quad \forall\, A\in\Sigma.$$
By $\mathcal{SY}(\Omega,E,\mu)$, we denote the space of Young submeasures with respect to $\mu$.
(c) If $u:\Omega\to E$ is a measurable function, then the Young measure associated to $u$ is the element $\nu\in\mathcal{Y}(\Omega,E,\mu)$, defined by
$$\int_{\Omega\times E} h(z,x)\,d\nu = \int_\Omega h\big(z,u(z)\big)\,d\mu \quad \forall\, h\in C_0(\Omega\times E).$$
REMARK 3.5.8 Let $\mathrm{proj}_\Omega:\Omega\times E\to\Omega$ be the projection map defined by
$$\mathrm{proj}_\Omega(z,x) := z \quad \forall\, (z,x)\in\Omega\times E.$$
If $\nu\in\mathcal{Y}(\Omega,E,\mu)$, then
$$\mu = \nu\circ\mathrm{proj}_\Omega^{-1}$$
(see Definition 3.5.5). Moreover, we have
$$\nu = \mu\circ\eta^{-1} \quad \text{for all } \mu,\nu \text{ as in Definition 3.5.7(c)},$$
where $\eta:\Omega\to\Omega\times E$ is defined by
$$\eta(z) := \big(z,u(z)\big) \quad \forall\, z\in\Omega.$$
If $u_n:\Omega\to E$ are measurable functions for $n\ge 1$, $u_n(z)\to u(z)$ for $\mu$-a.a. $z\in\Omega$ and $\nu_n$, $\nu$ are the Young measures associated to $u_n$ and $u$ respectively (see Definition 3.5.7(c)), then
$$\nu_n \xrightarrow{w} \nu \quad \text{in } M^1_+(\Omega\times E)$$
(see Definition 2.3.42(b)). To see this let $h\in C_0(\Omega\times E)$. Using Remark 3.5.6 and the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
$$\lim_{n\to+\infty}\int_{\Omega\times E} h(z,x)\,d\nu_n = \lim_{n\to+\infty}\int_\Omega h\big(z,u_n(z)\big)\,d\mu = \int_\Omega h\big(z,u(z)\big)\,d\mu = \int_{\Omega\times E} h(z,x)\,d\nu.$$
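The convergence stated in Remark 3.5.8 can be checked on a concrete case. The following sketch makes assumptions that are not in the text: $\Omega=[0,1]$ with Lebesgue measure, $E=\mathbb{R}$, the functions $u_n(z)=z^n$ (which tend to $u=0$ almost everywhere), and the test integrand $h(z,x)=(1+z)e^{-x^2}$. The helper names are hypothetical. By Definition 3.5.7(c), pairing $h$ with the Young measure of $u$ amounts to integrating $h(z,u(z))$ over $\Omega$.

```python
import math

def integrate(g, cells=20_000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / cells
    return h * sum(g((j + 0.5) * h) for j in range(cells))

def h_test(z, x):
    """Continuous test integrand on [0, 1] x R that vanishes as |x| grows."""
    return (1.0 + z) * math.exp(-x * x)

def young_pairing(u):
    """Integral of h_test against the Young measure associated to u."""
    return integrate(lambda z: h_test(z, u(z)))

if __name__ == "__main__":
    limit = young_pairing(lambda z: 0.0)  # equals the integral of (1 + z), i.e. 1.5
    for n in (1, 10, 100, 200):
        value = young_pairing(lambda z, n=n: z ** n)
        print(n, value, limit - value)
```

The gap between the pairing for $u_n$ and the pairing for the limit shrinks as $n$ increases, in line with the dominated convergence argument of the remark.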
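PROPOSITION 3.5.9 We have
$$\mathcal{SY}(\Omega,E,\mu) = \Big\{\nu\in M(\Omega\times E)_+ : \nu(\Omega\times E)\le\mu(\Omega) \text{ and for all } \beta\in C_0(\Omega)_+ \text{ and all } \xi\in C_0(\Omega\times E)_+,$$
$$\text{such that } \xi(z,x)\le\beta(z) \text{ for all } (z,x)\in\Omega\times E, \text{ we have } \int_{\Omega\times E}\xi(z,x)\,d\nu \le \int_\Omega \beta(z)\,d\mu\Big\}.$$
Moreover, if $E$ is compact, then
$$\mathcal{Y}(\Omega,E,\mu) = \Big\{\nu\in\mathcal{SY}(\Omega,E,\mu) : \text{for all } \beta\in C_0(\Omega)_+, \text{ we have } \int_\Omega \beta(z)\,d\mu = \int_{\Omega\times E}\beta(z)\,d\nu\Big\}.$$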
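PROOF Let $K\subseteq\Omega$ and $C\subseteq E$ be nonempty, compact sets, and $\varepsilon>0$. Then by virtue of Urysohn's lemma (see Theorem A.1.13), we can find
$$\xi\in C_0(\Omega\times E)_+, \quad \beta\in C(\Omega)_+, \quad 0\le\xi,\beta\le 1$$
and a compact set $K_1\subseteq\Omega$, $K_1\supseteq K$, $K_1\neq K$, such that
$$\xi|_{K\times C} = 1, \quad \beta|_K = 1, \quad \big|\beta|_{K_1^c}\big|\le\varepsilon, \quad \mu(K_1\setminus K)\le\varepsilon$$
and
$$\xi(z,x)\le\beta(z) \quad \forall\, (z,x)\in\Omega\times E.$$
Then we have
$$\nu(K\times C) \le \int_{\Omega\times E}\xi(z,x)\,d\nu \le \int_\Omega\beta(z)\,d\mu \le \mu(K)+\mu(K_1\setminus K)+\varepsilon\mu(K_1^c) \le \mu(K)+\varepsilon\big(1+\mu(\Omega)\big).$$
Let $\varepsilon\searrow 0$, to conclude that $\nu(K\times C)\le\mu(K)$. If $A\in\Sigma$, then we can find compact sets
$$K_n\subseteq A \quad \text{and} \quad C_n\subseteq E \quad \forall\, n\ge 1,$$
such that
$$K_n\nearrow A \quad \text{and} \quad C_n\nearrow E.$$
Since
$$\nu(K_n\times C_n)\le\mu(K_n) \quad \forall\, n\ge 1,$$
passing to the limit as $n\to+\infty$, we obtain $\nu(A\times E)\le\mu(A)$. This proves the first equality. If $E$ is compact, simply note that if $\beta\in C_0(\Omega)$, then $\beta\in C_0(\Omega\times E)$. So the second equality follows at once.
Recall that
$$M(\Omega\times E) = C_0(\Omega\times E)^*$$
(see Theorem 2.3.41). So we can equip $\mathcal{Y}(\Omega,E,\mu)$ and $\mathcal{SY}(\Omega,E,\mu)$ with the relative weak$^*$-topology. If we do this we can have useful topological properties for the space of Young measures and of Young submeasures.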
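THEOREM 3.5.10 $E$ is compact if and only if $\mathcal{Y}(\Omega,E,\mu)$ is compact.
PROOF "⟹": First note that
$$\mathcal{Y}(\Omega,E,\mu)\subseteq\mathcal{SY}(\Omega,E,\mu)\subseteq\overline{B}^{\,*}_{\mu(\Omega)},$$
where
$$\overline{B}^{\,*}_{\mu(\Omega)} := \big\{\lambda\in M(\Omega\times E) : |\lambda|(\Omega\times E)\le\mu(\Omega)\big\}.$$
We know that the predual space $C_0(\Omega\times E)$ is separable. So $\overline{B}^{\,*}_{\mu(\Omega)}$ with the relative weak$^*$-topology is compact and metrizable. Therefore we have to show that $\mathcal{Y}(\Omega,E,\mu)$ is sequentially closed. Let $\{\nu_n\}_{n\ge 1}\subseteq\mathcal{Y}(\Omega,E,\mu)$ be a sequence, such that
$$\nu_n \xrightarrow{w^*} \nu.$$
If $\beta\in C_0(\Omega)$, then from Proposition 3.5.9, we have
$$\int_\Omega\beta(z)\,d\mu = \int_{\Omega\times E}\beta(z)\,d\nu_n \quad \forall\, n\ge 1.$$
Since $\beta\in C_0(\Omega\times E)$, in the limit as $n\to+\infty$, we obtain
$$\int_\Omega\beta(z)\,d\mu = \int_{\Omega\times E}\beta(z)\,d\nu$$
and thus due to Proposition 3.5.9, we have $\nu\in\mathcal{Y}(\Omega,E,\mu)$. This proves the compactness of $\mathcal{Y}(\Omega,E,\mu)$.
"⟸": Suppose that $E$ is not compact. Then we can find a sequence $\{x_n\}_{n\ge 1}\subseteq E$ with no convergent subsequence. For every $n\ge 1$, let $\nu_n\in\mathcal{Y}(\Omega,E,\mu)$ be associated to the constant function $x_n$ (see Definition 3.5.7(c)). Then for each $\xi\in C_0(\Omega\times E)$, we have
$$\lim_{n\to+\infty}\int_{\Omega\times E}\xi(z,x)\,d\nu_n = \lim_{n\to+\infty}\int_\Omega\xi(z,x_n)\,d\mu = 0$$
and so
$$\nu_n \xrightarrow{w^*} 0 \quad \text{and} \quad 0\notin\mathcal{Y}(\Omega,E,\mu),$$
a contradiction.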
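THEOREM 3.5.11 $\mathcal{SY}(\Omega,E,\mu)$ is compact and metrizable.
PROOF Recall that
$$\mathcal{SY}(\Omega,E,\mu)\subseteq\overline{B}^{\,*}_{\mu(\Omega)}$$
(see the proof of Theorem 3.5.10) and $\overline{B}^{\,*}_{\mu(\Omega)}$ with the relative weak$^*$-topology is compact metrizable. So we need to show that $\mathcal{SY}(\Omega,E,\mu)$ is sequentially weak$^*$-closed. To this end let $\{\nu_n\}_{n\ge 1}\subseteq\mathcal{SY}(\Omega,E,\mu)$ be a sequence, such that
$$\nu_n \xrightarrow{w^*} \nu \quad \text{in } M(\Omega\times E),$$
for some $\nu\in M(\Omega\times E)_+$. Suppose that
$$\xi\in C_0(\Omega\times E)_+, \quad \beta\in C_0(\Omega)_+$$
and
$$\xi(z,x)\le\beta(z) \quad \forall\, (z,x)\in\Omega\times E.$$
Then from Proposition 3.5.9, we have
$$\int_{\Omega\times E}\xi(z,x)\,d\nu_n \le \int_\Omega\beta(z)\,d\mu \quad \forall\, n\ge 1,$$
so
$$\int_{\Omega\times E}\xi(z,x)\,d\nu \le \int_\Omega\beta(z)\,d\mu$$
and from Proposition 3.5.9, we have that $\nu\in\mathcal{SY}(\Omega,E,\mu)$.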
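Let us see how Young measures are related to transition probabilities (see Definition 3.5.1). In what follows on $\widehat{\mathcal{SR}}(\Omega,E)$ we consider the equivalence relation
$$\lambda_1\sim\lambda_2 \iff \big[\lambda_1(z) = \lambda_2(z) \text{ for } \mu\text{-a.a. } z\in\Omega\big].$$
Then we set
$$\mathcal{R}(\Omega,E) := \widehat{\mathcal{R}}(\Omega,E)/\!\sim \quad \text{and} \quad \mathcal{SR}(\Omega,E) := \widehat{\mathcal{SR}}(\Omega,E)/\!\sim.$$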
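THEOREM 3.5.12 There is a bijection $\psi:\mathcal{R}(\Omega,E)\to\mathcal{Y}(\Omega,E,\mu)$ given by
$$\psi(\lambda) := \nu \quad \forall\, \lambda\in\mathcal{R}(\Omega,E),$$
where
$$\nu(C) := \int_\Omega\int_E \chi_C(z,x)\,\lambda(z)(dx)\,d\mu \quad \forall\, C\in\Sigma\times\mathcal{B}(E).$$
PROOF For any $\nu$-integrable or positive and $\Sigma\times\mathcal{B}(E)$-measurable function $h:\Omega\times E\to\mathbb{R}$, we have
$$\int_{\Omega\times E} h(z,x)\,d\nu = \int_\Omega\int_E h(z,x)\,\lambda(z)(dx)\,d\mu,$$
where $\nu=\psi(\lambda)$. First we show that the map is injective. So let $\lambda_1,\lambda_2\in\mathcal{R}(\Omega,E)$ and suppose that $\psi(\lambda_1)=\psi(\lambda_2)$. Then for $A\in\Sigma$ and $\eta\in C_0(E)$, we set
$$h(z,x) := \chi_A(z)\eta(x) \quad \forall\, (z,x)\in\Omega\times E.$$
We have
$$\int_A\int_E \eta(x)\,\lambda_1(z)(dx)\,d\mu = \int_A\int_E \eta(x)\,\lambda_2(z)(dx)\,d\mu.$$
Because $A\in\Sigma$ was arbitrary, we obtain
$$\int_E \eta(x)\,\lambda_1(z)(dx) = \int_E \eta(x)\,\lambda_2(z)(dx) \quad \text{for } \mu\text{-a.a. } z\in\Omega \text{ and all } \eta\in C_0(E),$$
so $\lambda_1(z)=\lambda_2(z)$ for $\mu$-a.a. $z\in\Omega$. Therefore $\psi$ is indeed injective.
Next we show that $\psi$ is surjective. So let $\nu\in\mathcal{Y}(\Omega,E,\mu)$. Then for any $\varepsilon>0$, we can find a sequence $\{D_k\}_{k\ge 1}$ of pairwise disjoint Borel subsets of $E$ with
$$\mathrm{diam}\,D_k<\varepsilon \quad \text{and} \quad \nu\Big(\Omega\times\bigcap_{k=1}^\infty D_k^c\Big) = 0$$
and also a sequence $\{U_m\}_{m\ge 1}$ of pairwise disjoint open sets in $\Omega$ with
$$\mathrm{diam}\,U_m<\varepsilon, \quad \mu(\partial U_m) = 0 \quad \text{and} \quad \mu\Big(\Omega\setminus\bigcup_{m=1}^\infty U_m\Big) = 0.$$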
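For each $k\ge 1$, we pick $x_k\in D_k$ and then define
$$\lambda^\varepsilon(z) := \sum_{k=1}^\infty \frac{\nu(U_m\times D_k)}{\mu(U_m)}\,\delta_{x_k} \quad \forall\, z\in U_m,$$
where $\delta_{x_k}$ is the Dirac measure concentrated on $x_k$ (the normalization by $\mu(U_m)$ gives $\lambda^\varepsilon(z)(E) = \nu(U_m\times E)/\mu(U_m) = 1$). Then $\lambda^\varepsilon\in\mathcal{R}(\Omega,E)$ and let
$$\nu^\varepsilon = \psi(\lambda^\varepsilon).$$
For all $C\in\Sigma\times\mathcal{B}(E)$, we have
$$\nu\Big(\bigcup_{\substack{(m,k) \\ U_m\times D_k\subseteq C}} U_m\times D_k\Big) \le \nu^\varepsilon(C) \le \nu\Big(\bigcup_{\substack{(m,k) \\ (U_m\times D_k)\cap C\neq\emptyset}} U_m\times D_k\Big).$$
Therefore for any open set $V\subseteq\Omega\times E$ with $\nu(\partial V)=0$, we have
$$\lim_{\varepsilon\searrow 0}\nu^\varepsilon(V) = \nu(V).$$
Then by the regularity of $\nu$ and the Portmanteau theorem (see Theorem A.2.36), we have that
$$\nu^\varepsilon \xrightarrow{w^*} \nu.$$
On the other hand note that $\{\lambda^\varepsilon\}_{\varepsilon>0}$ is bounded in $L^\infty(\Omega;M(E))$ and so by Alaoglu's theorem (see Theorem A.3.9), we may assume that
$$\lambda^\varepsilon \xrightarrow{w^*} \lambda \quad \text{in } L^\infty(\Omega;M(E)).$$
Then if $\xi\in C_0(\Omega\times E)$, we have $\xi\in L^1\big(\Omega;C_0(E)\big)$ and so
$$\int_{\Omega\times E}\xi(z,x)\,d\nu = \lim_{\varepsilon\searrow 0}\int_{\Omega\times E}\xi(z,x)\,d\nu^\varepsilon = \lim_{\varepsilon\searrow 0}\int_\Omega\int_E\xi(z,x)\,\lambda^\varepsilon(z)(dx)\,d\mu = \int_\Omega\int_E\xi(z,x)\,\lambda(z)(dx)\,d\mu.$$
Hence $\lambda\in\mathcal{R}(\Omega,E)$ and $\nu=\psi(\lambda)$.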
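The correspondence $\lambda\mapsto\nu$ of Theorem 3.5.12 has an elementary finite analogue, sketched below. This is only a toy model and not the setting of the theorem: $\Omega$ and $E$ are finite sets and $\mu$ is purely atomic, whereas the text assumes a nonatomic $\mu$. The names `disintegrate` and `reassemble` are hypothetical. The sketch recovers the kernel $\lambda(z)=\nu(\{z\}\times\cdot)/\mu(\{z\})$ from a measure $\nu$ with first marginal $\mu$, and then rebuilds $\nu$ from $\lambda$ and $\mu$.

```python
from fractions import Fraction

def disintegrate(nu, mu):
    """Return lam with lam[z][x] = nu[(z, x)] / mu[z]; requires mu[z] > 0."""
    lam = {z: {} for z in mu}
    for (z, x), mass in nu.items():
        lam[z][x] = lam[z].get(x, Fraction(0)) + mass / mu[z]
    return lam

def reassemble(lam, mu):
    """Finite version of psi: nu({(z, x)}) = mu({z}) * lam(z)({x})."""
    return {(z, x): mu[z] * p for z, kernel in lam.items() for x, p in kernel.items()}

if __name__ == "__main__":
    mu = {0: Fraction(1, 2), 1: Fraction(1, 2)}
    nu = {(0, "a"): Fraction(1, 4), (0, "b"): Fraction(1, 4), (1, "a"): Fraction(1, 2)}
    lam = disintegrate(nu, mu)
    print(lam)                        # each lam[z] is a probability vector on E
    print(reassemble(lam, mu) == nu)  # the two maps are mutually inverse here
```

Exact rational arithmetic is used so that the round trip can be compared with equality rather than with a tolerance.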
REMARK 3.5.13 Similarly we establish that $\psi$ is a bijection from $\mathcal{SR}(\Omega,E)$ onto $\mathcal{SY}(\Omega,E,\mu)$. Moreover, if $\nu$ is the Young measure associated to a measurable function $u:\Omega\to E$, then $\psi(\delta_u)=\nu$. Here $\delta_u$ is the Dirac transition probability associated to $u$, i.e.,
$$\delta_{u(z)}(A) = \begin{cases} 1 & \text{if } u(z)\in A, \\ 0 & \text{if } u(z)\notin A, \end{cases} \quad \forall\, A\in\mathcal{B}(E).$$
Also the identification obtained in Theorem 3.5.12 implies that the weak$^*$-topology on $\mathcal{Y}(\Omega,E,\mu)$ (resulting from the pair of spaces $\big(M(\Omega\times E),C_0(\Omega\times E)\big)$) is equivalent to the weak$^*$-topology on $\mathcal{R}(\Omega,E)$ (resulting from the pair $\big(L^\infty(\Omega;M(E)),L^1(\Omega;C_0(E))\big)$). Finally we should mention that Theorem 3.5.12 is also known as the disintegration theorem.
The next approximation result is important in many applications.
THEOREM 3.5.14 If $F:\Omega\to 2^E\setminus\{\emptyset\}$ is a multifunction, such that $\mathrm{Gr}\,F\in\Sigma\times\mathcal{B}(E)$, then
$$\mathcal{R}_F(\Omega,E) = \overline{\big\{\delta_u : u:\Omega\to E \text{ measurable},\ u(z)\in F(z) \text{ for } \mu\text{-a.a. } z\in\Omega\big\}}^{\,w^*}, \tag{3.131}$$
where
$$\mathcal{R}_F(\Omega,E) := \big\{\lambda\in\mathcal{R}(\Omega,E) : \lambda(z)\big(F(z)\big) = 1 \text{ for a.a. } z\in\Omega\big\}$$
and the closure is taken with respect to the weak$^*$-topology on $L^\infty(\Omega;M(E))$.
PROOF In what follows by $M^1_+(E)_{w^*}$ (respectively $M^1_+(E)_n$) we denote the space $M^1_+(E)$ furnished with the relative weak$^*$-topology of $M(E)$ (respectively the narrow topology); see Definition 2.3.42 and Remark 2.3.43. We know that $M^1_+(E)_n$ is a Polish space and because the narrow topology is stronger than the weak$^*$-topology, we infer that $M^1_+(E)_{w^*}$ is a Souslin space (see Definition A.2.29(b)). So the Borel σ-fields of $M^1_+(E)_n$ and $M^1_+(E)_{w^*}$ are equal and we simply write $\mathcal{B}\big(M^1_+(E)\big)$. Now let
$$S(z) := \big\{\lambda\in M^1_+(E) : \lambda\big(F(z)\big) = 1\big\}.$$
We claim that
$$\mathrm{Gr}\,S\in\Sigma\times\mathcal{B}\big(M^1_+(E)\big).$$
So let $C\in\Sigma\times\mathcal{B}(E)$ and consider the map $\varphi_C:\Omega\times M^1_+(E)\to\mathbb{R}$, defined by
$$\varphi_C(z,\lambda) = \big(\delta_z\otimes\lambda\big)(C) \quad \forall\, (z,\lambda)\in\Omega\times M^1_+(E).$$
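Here by $\delta_z$ we denote the Dirac measure concentrated at $z\in\Omega$. Let $\mathcal{T}$ be the collection of all sets $C\in\Sigma\times\mathcal{B}(E)$, such that $\varphi_C$ is $\Sigma\times\mathcal{B}\big(M^1_+(E)\big)$-measurable. For $A\in\Sigma$ and $B\in\mathcal{B}(E)$, we have
$$\varphi_{A\times B}(z,\lambda) = \chi_A(z)\,\lambda(B) = \chi_A(z)\,\vartheta_B(\lambda),$$
where
$$\vartheta_B(\lambda) = \lambda(B) \quad \forall\, \lambda\in M^1_+(E).$$
We show that $\vartheta_B$ is a Borel map on $M^1_+(E)$. Let $\mathcal{B}_1$ be all those sets $G\in\mathcal{B}(E)$, such that $\vartheta_G$ is $\mathcal{B}\big(M^1_+(E)\big)$-measurable. If $G$ is open, then from the Portmanteau theorem (see Theorem A.2.36), we know that the map $\lambda\mapsto\vartheta_G(\lambda)$ is lower semicontinuous on $M^1_+(E)_n$. If $U$ is an open set in $E$ and $K$ is a closed set in $E$, then
$$\vartheta_{U\cap K} = \vartheta_U-\vartheta_{U\cap K^c},$$
so the map $\lambda\mapsto\vartheta_{U\cap K}(\lambda)$ is Borel. Hence if $\{U_k\}_{k=1}^N$ are open sets in $E$ and $\{K_k\}_{k=1}^N$ are closed sets in $E$, then
$$\bigcup_{k=1}^N \big(U_k\cap K_k\big)\in\mathcal{B}_1$$
and so $\mathcal{B}_1$ is a monotone class. From the monotone class theorem, it follows that $\mathcal{B}_1=\mathcal{B}(E)$. So $\vartheta_G$ is $\mathcal{B}\big(M^1_+(E)\big)$-measurable for all $G\in\mathcal{B}(E)$. Therefore the map
$$(z,\lambda)\mapsto\varphi_{A\times B}(z,\lambda) = \chi_A(z)\,\vartheta_B(\lambda)$$
is $\Sigma\times\mathcal{B}\big(M^1_+(E)\big)$-measurable for all $A\in\Sigma$ and all $B\in\mathcal{B}(E)$, hence $A\times B\in\mathcal{T}$. Clearly $\mathcal{T}$ is a monotone class and so it follows that $\mathcal{T}=\Sigma\times\mathcal{B}(E)$. We have
$$\big\{(z,\lambda)\in\Omega\times M^1_+(E) : (\delta_z\otimes\lambda)(\mathrm{Gr}\,F) = 1\big\} = \varphi^{-1}_{\mathrm{Gr}\,F}\big(\{1\}\big) = \mathrm{Gr}\,S\in\Sigma\times\mathcal{B}\big(M^1_+(E)\big).$$
Let $D$ be the subset of $M^1_+(E)$ consisting of the Dirac measures and introduce
$$S_e(z) := \big\{\lambda\in D : \lambda\big(F(z)\big) = 1\big\}.$$
Then
$$\mathrm{ext}\,S(z) = S_e(z) \quad \text{and} \quad \mathrm{Gr}\,S_e\in\Sigma\times\mathcal{B}\big(M^1_+(E)\big).$$
Using Remark 3.4.19, we have
$$\overline{S^\infty_{S_e}}^{\,w^*} = S^\infty_{\overline{\mathrm{conv}}^{\,w^*} S_e} = S^\infty_S,$$
so we obtain (3.131).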
Exploiting the identification of $\mathcal{R}(\Omega,E)$ with $\mathcal{Y}(\Omega,E,\mu)$ we obtain the following corollary of Theorem 3.5.14.
COROLLARY 3.5.15 The Young measures associated to measurable functions are dense in the space $\mathcal{Y}(\Omega,E,\mu)$ for the weak$^*$-topology on $M(\Omega\times E)$ (or equivalently by Theorem 3.5.12 for the weak$^*$-topology on $L^\infty(\Omega;M(E))$).
In the previous section we introduced the following classes of $\Sigma\times\mathcal{B}(E)$-measurable functions $f:\Omega\times E\to\overline{\mathbb{R}} := \mathbb{R}\cup\{+\infty\}$ (called integrands): the Carathéodory integrands and the normal integrands. Next we introduce further specifications of these classes, which will be helpful in our analysis.
DEFINITION 3.5.16
(a)
$$\mathcal{N}(\Omega,\Sigma,E) := \big\{f:\Omega\times E\to\overline{\mathbb{R}} : f \text{ is } \Sigma\times\mathcal{B}(E)\text{-measurable and } f(z,\cdot) \text{ is lower semicontinuous}\big\},$$
i.e., $\mathcal{N}(\Omega,\Sigma,E)$ is the set of all normal integrands; see also Definition 3.4.8.
(b)
$$\mathcal{N}_+(\Omega,\Sigma,E) := \big\{f\in\mathcal{N}(\Omega,\Sigma,E) : f\ge 0\big\},$$
i.e., $\mathcal{N}_+(\Omega,\Sigma,E)$ is the set of positive normal integrands.
(c)
$$\widehat{\mathcal{K}}_b(\Omega,\Sigma,E) := \big\{f\in\mathcal{N}(\Omega,\Sigma,E) : f(z,\cdot)\in C_b(E) \text{ for } \mu\text{-a.a. } z\in\Omega \text{ and the map } z\mapsto\|f(z,\cdot)\|_{L^\infty(E)} \text{ belongs in } L^1(\Omega)\big\},$$
i.e., $\widehat{\mathcal{K}}_b(\Omega,\Sigma,E)$ is the set of all $C_b$-Carathéodory integrands.
(d)
$$\widehat{\mathcal{K}}_0(\Omega,\Sigma,E) := \big\{f\in\mathcal{N}(\Omega,\Sigma,E) : f(z,\cdot)\in C_0(E) \text{ for } \mu\text{-a.a. } z\in\Omega \text{ and the map } z\mapsto\|f(z,\cdot)\|_{L^\infty(E)} \text{ belongs in } L^1(\Omega)\big\},$$
i.e., $\widehat{\mathcal{K}}_0(\Omega,\Sigma,E)$ is the set of all $C_0$-Carathéodory integrands.
From Proposition 3.4.9, we obtain the following fact.
PROPOSITION 3.5.17 If $f\in\mathcal{N}_+(\Omega,\Sigma,E)$, then there exists a sequence $\{f_n\}_{n\ge 1}\subseteq\widehat{\mathcal{K}}_0(\Omega,\Sigma,E)$, such that $f_n\nearrow f$.
PROPOSITION 3.5.18 If $f\in\mathcal{N}_+(\Omega,\Sigma,E)$, then the map
$$\nu\mapsto\int_{\Omega\times E} f\,d\nu$$
is $w^*$-lower semicontinuous on $\mathcal{SY}(\Omega,E,\mu)$.
PROOF By virtue of Proposition 3.5.17, we can find a sequence $\{f_n\}_{n\ge 1}\subseteq\widehat{\mathcal{K}}_0(\Omega,\Sigma,E)$, such that $f_n\nearrow f$. By the monotone convergence theorem (see Theorem A.2.10), we have
$$\int_{\Omega\times E} f\,d\nu = \lim_{n\to+\infty}\int_{\Omega\times E} f_n\,d\nu.$$
Since
$$f_n\in L^1\big(\Omega;C_0(E)\big) \quad \forall\, n\ge 1,$$
the map
$$\nu\mapsto\int_{\Omega\times E} f_n\,d\nu$$
is $w^*$-continuous on $\mathcal{SY}(\Omega,E,\mu)$ (see Remark 3.5.13). Therefore the map
$$\nu\mapsto\int_{\Omega\times E} f\,d\nu$$
is $w^*$-lower semicontinuous on $\mathcal{SY}(\Omega,E,\mu)$.
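PROPOSITION 3.5.19 If $\{u_n:\Omega\to E\}_{n\ge 1}$ is a sequence of $\Sigma$-measurable functions, $\{\nu_n\}_{n\ge 1}$ is the sequence of Young measures associated with $\{u_n\}_{n\ge 1}$ (i.e., $\nu_n=\delta_{u_n}$ for $n\ge 1$) and
$$\nu_n \xrightarrow{w^*} \nu \quad \text{in } \mathcal{SY}(\Omega,E,\mu),$$
for some $\nu\in\mathcal{SY}(\Omega,E,\mu)$, then for $\mu$-almost all $z\in\Omega$, the subprobability measure $\nu(z)$ is supported by
$$\limsup_{n\to+\infty}\big\{u_n(z)\big\} = \bigcap_{n=1}^\infty \overline{\big\{u_k(z) : k\ge n\big\}}.$$
PROOF We define
$$F_k(z) := \overline{\big\{u_n(z)\big\}_{n\ge k}} \quad \forall\, z\in\Omega,\ k\ge 1$$
and
$$f_k(z,x) = i_{F_k(z)}(x) = \begin{cases} 0 & \text{if } x\in F_k(z), \\ +\infty & \text{otherwise.} \end{cases}$$
Evidently $f_k\in\mathcal{N}_+(\Omega,\Sigma,E)$ and by virtue of Proposition 3.5.18, we have
$$0 \le \int_{\Omega\times E} f_k(z,x)\,d\nu \le \liminf_{n\to+\infty}\int_{\Omega\times E} f_k(z,x)\,d\nu_n = \liminf_{n\to+\infty}\int_\Omega f_k\big(z,u_n(z)\big)\,d\mu = 0,$$
so
$$\int_{\Omega\times E} f_k(z,x)\,d\nu = 0.$$
From Remark 3.5.13, we have
$$\int_\Omega\int_E f_k(z,x)\,\nu(z)(dx)\,d\mu = 0,$$
so
$$\int_E f_k(z,x)\,\nu(z)(dx) = 0 \quad \text{for } \mu\text{-a.a. } z\in\Omega$$
and thus $\nu(z)$ is supported by $F_k(z)$ for $\mu$-almost all $z\in\Omega$. Because $k\ge 1$ was arbitrary, we obtain the conclusion of the proposition.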
When we examined the space of measures $M(E)=C_0(E)^*$, in addition to the weak$^*$-topology, we considered also a finer topology called the narrow topology, namely the $w\big(M(E),C_b(E)\big)$-topology. Exploiting the identification of $\mathcal{Y}(\Omega,E,\mu)$ with $\mathcal{R}(\Omega,E)$, we do the same thing for the space $\mathcal{Y}(\Omega,E,\mu)$ of Young measures. We introduce a finer topology, which will lead to some powerful compactness results.
DEFINITION 3.5.20 The narrow topology on $\mathcal{Y}(\Omega,E,\mu)$ is the weakest topology which makes continuous the linear functionals of the form
$$\nu\mapsto\int_{\Omega\times E} f\,d\nu \quad \forall\, f\in\widehat{\mathcal{K}}_b(\Omega,\Sigma,E).$$
We say that the sequence $\{\nu_n\}_{n\ge 1}$ converges narrowly to $\nu$, if
$$\lim_{n\to+\infty}\int_{\Omega\times E} f\,d\nu_n = \int_{\Omega\times E} f\,d\nu \quad \forall\, f\in\widehat{\mathcal{K}}_b(\Omega,\Sigma,E)$$
and we write
$$\nu_n \xrightarrow{n} \nu.$$
REMARK 3.5.21 If $E$ is compact, then the narrow and weak$^*$-topologies coincide since $\widehat{\mathcal{K}}_b(\Omega,\Sigma,E)=\widehat{\mathcal{K}}_0(\Omega,\Sigma,E)$.
PROPOSITION 3.5.22 If $\{u_n:\Omega\to E\}_{n\ge 1}$ is a sequence of $\Sigma$-measurable functions, $u:\Omega\to E$ is a $\Sigma$-measurable function, $\{\nu_n\}_{n\ge 1}$ and $\nu$ are the Young measures associated to the functions $\{u_n\}_{n\ge 1}$ and $u$ respectively, then
$$u_n \xrightarrow{\mu} u \iff \nu_n \xrightarrow{n} \nu.$$
PROOF "⟹": Since
$$u_n \xrightarrow{\mu} u,$$
we can extract a subsequence $\{u_{n_k}\}_{k\ge 1}$, such that
$$u_{n_k}(z)\to u(z) \quad \text{for } \mu\text{-a.a. } z\in\Omega.$$
Then for all $f\in\widehat{\mathcal{K}}_b(\Omega,\Sigma,E)$, by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have
$$\lim_{k\to+\infty}\int_\Omega f\big(z,u_{n_k}(z)\big)\,d\mu = \int_\Omega f\big(z,u(z)\big)\,d\mu,$$
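so
$$\lim_{k\to+\infty}\int_{\Omega\times E} f(z,x)\,d\nu_{n_k} = \int_{\Omega\times E} f(z,x)\,d\nu \quad \forall\, f\in\widehat{\mathcal{K}}_b(\Omega,\Sigma,E)$$
and thus
$$\nu_{n_k} \xrightarrow{n} \nu \quad \text{as } k\to+\infty, \text{ in } \mathcal{Y}(\Omega,E,\mu).$$
Since every subsequence of $\{\nu_n\}_{n\ge 1}$ has a further subsequence converging narrowly to $\nu$, we conclude that
$$\nu_n \xrightarrow{n} \nu \quad \text{in } \mathcal{Y}(\Omega,E,\mu).$$
"⟸": Let
$$f(z,x) := \min\big\{1,d_E\big(x,u(z)\big)\big\},$$
where $d_E$ denotes the metric on $E$. Clearly $f\in\widehat{\mathcal{K}}_b(\Omega,\Sigma,E)$. So
$$\lim_{n\to+\infty}\int_{\Omega\times E} f(z,x)\,d\nu_n = \int_{\Omega\times E} f(z,x)\,d\nu,$$
hence
$$\lim_{n\to+\infty}\int_\Omega f\big(z,u_n(z)\big)\,d\mu = \int_\Omega f\big(z,u(z)\big)\,d\mu = 0. \tag{3.132}$$
For a given $\varepsilon\in(0,1]$, let
$$M_{\varepsilon,n} := \big\{z\in\Omega : d_E\big(u_n(z),u(z)\big)\ge\varepsilon\big\} \quad \forall\, n\ge 1.$$
We have
$$\varepsilon\mu(M_{\varepsilon,n}) \le \int_{M_{\varepsilon,n}} f\big(z,u_n(z)\big)\,d\mu \le \int_\Omega f\big(z,u_n(z)\big)\,d\mu,$$
so, from (3.132), we have
$$\lim_{n\to+\infty}\mu(M_{\varepsilon,n}) = 0$$
and thus
$$u_n \xrightarrow{\mu} u.$$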
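The estimate used in the second half of the proof of Proposition 3.5.22 can be verified exactly on a simple family. The sketch below assumes, beyond the text, that $\Omega=[0,1]$ with Lebesgue measure, $E=\mathbb{R}$, $u=0$ and $u_n=n\chi_{[0,1/n]}$. The functions are stored as lists of (cell length, value) pairs and all names are hypothetical. The family converges to $0$ in measure although $\|u_n\|_{L^1}=1$ for all $n$, and the integral of the truncated distance $\min\{1,|u_n-u|\}$ controls $\varepsilon\mu(M_{\varepsilon,n})$ for $\varepsilon\le 1$.

```python
from fractions import Fraction

def spike(n):
    """u_n = n on [0, 1/n] and 0 elsewhere, as (cell length, value) pairs."""
    return [(Fraction(1, n), Fraction(n)), (1 - Fraction(1, n), Fraction(0))]

def truncated_distance_integral(cells):
    """Integral over [0, 1] of min{1, |u_n(z) - u(z)|} dz for the limit u = 0."""
    return sum(length * min(Fraction(1), abs(value)) for length, value in cells)

def measure_of_deviation(cells, eps):
    """mu({z : |u_n(z) - u(z)| >= eps}) for the limit u = 0."""
    return sum(length for length, value in cells if abs(value) >= eps)

def l1_norm(cells):
    """Integral over [0, 1] of |u_n(z)| dz."""
    return sum(length * abs(value) for length, value in cells)

if __name__ == "__main__":
    eps = Fraction(1, 2)
    for n in (1, 10, 100):
        u_n = spike(n)
        lhs = eps * measure_of_deviation(u_n, eps)
        rhs = truncated_distance_integral(u_n)
        print(n, lhs, rhs, l1_norm(u_n))
```

The truncation at $1$ is what keeps the test integrand bounded, so that it belongs to the class used to define narrow convergence even when $u_n$ is unbounded in $L^\infty$.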
By the Alexandrov one-point compactification (see Theorem A.1.3 and Remark A.1.4), we can find a compact metric space $\widehat{E}$, such that $E$ is a dense subset of $\widehat{E}$.
PROPOSITION 3.5.23 If $f\in\mathcal{N}_+(\Omega,\Sigma,E)$, then
(a) there exists a sequence $\{\widehat{f}_n\}_{n\ge 1}\subseteq\widehat{\mathcal{K}}_0\big(\Omega,\Sigma,\widehat{E}\big)$, such that
$$\widehat{f}_n\nearrow f \quad \text{on } \Omega\times E;$$
(b) the function $\nu\mapsto\int_{\Omega\times E} f\,d\nu$ is narrowly lower semicontinuous.
PROOF (a) Let $d_{\widehat{E}}$ be the metric of $\widehat{E}$ which extends $d_E$. Then
$$\widehat{f}_n(z,x) = \inf_{y\in E}\big[f(z,y)+n\,d_{\widehat{E}}(y,x)\big] \quad \forall\, n\ge 1,\ (z,x)\in\Omega\times\widehat{E}$$
is the desired sequence.
(b) Follows from Proposition 3.5.18 since the narrow topology is finer than the weak$^*$-topology.
PROPOSITION 3.5.24 The narrow topology on $\mathcal{Y}(\Omega,E,\mu)$ is the weakest topology $\tau$, such that the map $\nu\mapsto\int_{\Omega\times E}\widehat{f}\,d\nu$ is continuous for all $\widehat{f}\in\widehat{\mathcal{K}}_b(\Omega,\Sigma,\widehat{E})$.
PROOF Note that $\widehat{\mathcal{K}}_b(\Omega,\Sigma,\widehat{E})\subseteq\widehat{\mathcal{K}}_b(\Omega,\Sigma,E)$. So a priori the $\tau$-topology is weaker than the narrow topology. Next let $f\in\widehat{\mathcal{K}}_b(\Omega,\Sigma,E)$ and introduce
$$\beta(z) := \|f(z,\cdot)\|_{L^\infty(E)} \quad \text{and} \quad g(z,x) := f(z,x)+\beta(z).$$
Evidently $g\in\mathcal{N}_+(\Omega,\Sigma,E)$ and so by Proposition 3.5.23, we have that the map $\nu\mapsto\int_{\Omega\times E} g\,d\nu$ is $\tau$-lower semicontinuous. Since
$$\int_{\Omega\times E} f\,d\nu = \int_{\Omega\times E} g\,d\nu-\int_\Omega\beta\,d\mu,$$
we infer that the map $\nu\mapsto\int_{\Omega\times E} f\,d\nu$ is $\tau$-lower semicontinuous.
If we repeat the above argument with $f$ replaced by $-f$, we reach the desired conclusion.
PROPOSITION 3.5.25 If on $\widehat{\mathcal{K}}_0(\Omega,\Sigma,E)$ we introduce the equivalence relation
$$f_1\sim f_2 \iff \mu\big(\big\{z\in\Omega : f_1(z,\cdot)\neq f_2(z,\cdot)\big\}\big) = 0,$$
then $\mathcal{K}_0(\Omega,\Sigma,E) := \widehat{\mathcal{K}}_0(\Omega,\Sigma,E)/\!\sim$ is in bijection with $L^1\big(\Omega;C_0(E)\big)$.
PROOF For each $f\in\mathcal{K}_0(\Omega,\Sigma,E)$, let
$$\psi(f)(z) := f(z,\cdot).$$
Then $\psi(f)(z)\in C_0(E)$. Also the map $\Omega\ni z\mapsto\psi(f)(z)\in C_0(E)$ is measurable. To see this let $\{x_n\}_{n\ge 1}$ be dense in $E$. Then for all $h\in C_0(E)$, we have
$$\big\|\psi(f)(z)-h\big\|_{C_0(E)} = \sup_{n\ge 1}\big|f(z,x_n)-h(x_n)\big|,$$
so the map
$$z\mapsto\big\|\psi(f)(z)-h\big\|_{C_0(E)}$$
is $\Sigma$-measurable for all $h\in C_0(E)$. Since the space $C_0(E)$ is separable, it follows that the map $\Omega\ni z\mapsto\psi(f)(z)\in C_0(E)$ is strongly measurable (see Corollary 2.1.4). Moreover,
$$\int_\Omega\big\|\psi(f)(z)\big\|_{C_0(E)}\,d\mu < +\infty,$$
i.e., $\psi(f)\in L^1\big(\Omega;C_0(E)\big)$. Therefore
$$\psi:\mathcal{K}_0(\Omega,\Sigma,E)\to L^1\big(\Omega;C_0(E)\big)$$
is injective. Moreover, if $g\in L^1\big(\Omega;C_0(E)\big)$, then $g=\psi(f)$ with
$$f(z,x) := g(z)(x),$$
so $\psi$ is bijective.
REMARK 3.5.26 The above proposition permits the identification of $\mathcal{K}_0(\Omega,\Sigma,E)$ with $L^1\big(\Omega;C_0(E)\big)$, which is very convenient because of the identification of $\mathcal{Y}(\Omega,E,\mu)$ with $\mathcal{R}(\Omega,E)$.
We want to investigate the compact sets in $\mathcal{Y}(\Omega,E,\mu)$. For this purpose it is useful to recall the basic results characterizing the compact sets of the space $M(E)$ furnished with the narrow topology, i.e., the $w\big(M(E),C_b(E)\big)$-topology. If $E$ is compact, this topology coincides with the $w^*$-topology.
THEOREM 3.5.27 (Prohorov Theorem) If $Y$ is a Polish space (see Definition A.2.29(a)) and $C\subseteq M(Y)_+$ is a bounded set, then $C$ is relatively compact in the narrow topology if and only if $C$ is uniformly tight, i.e., for every $\varepsilon>0$, we can find a compact set $K_\varepsilon\subseteq Y$, such that
$$\sup_{\lambda\in C}\lambda\big(K_\varepsilon^c\big)\le\varepsilon.$$
We have the following characterization of uniformly tight sets in $M(Y)_+$.
PROPOSITION 3.5.28 If $Y$ is a Polish space and $C\subseteq M(Y)_+$ is nonempty, then $C$ is uniformly tight if and only if there exists $\psi:Y\to\mathbb{R}_+$, such that
(a) the set $\big\{y\in Y : \psi(y)\le t\big\}$ is compact for every $t>0$ (i.e., $\psi$ is inf-compact); and
(b) $\displaystyle\sup_{\lambda\in C}\int_Y\psi(y)\,\lambda(dy) < +\infty$.
PROOF "⟹": Let $\{K_n\}_{n\ge 1}$ be an increasing sequence of compact sets of $Y$, such that
$$\lambda(Y\setminus K_n)\le\frac{1}{2^n} \quad \forall\, n\ge 1,\ \lambda\in C.$$
Let us set
$$\psi := \sum_{n=1}^\infty\chi_{Y\setminus K_n}.$$
We have
$$\int_Y\psi(y)\,\lambda(dy) = \sum_{n=1}^\infty\int_Y\chi_{Y\setminus K_n}(y)\,\lambda(dy)\le 1 \quad \forall\, \lambda\in C.$$
Note that $\psi$ is $\mathbb{N}\cup\{+\infty\}$-valued. So
$$\big\{\psi\le t\big\} = \big\{\psi\le[t]\big\} = K_{[t]+1},$$
where $[t]$ denotes the integer part of $t>1$. Therefore $\psi$ is inf-compact.
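The construction of $\psi$ in the proof above can be traced on a discrete example. The sketch assumes, beyond the text, that $Y=\{0,1,2,\dots\}$ with the discrete topology (so compact sets are the finite sets), that $C$ consists of the geometric laws $\lambda_q(\{j\})=(1-q)q^j$ with $q\in\{0.1,\dots,0.5\}$, and that $K_n=\{0,\dots,n-1\}$. All names are hypothetical. Under these choices $\lambda_q(Y\setminus K_n)=q^n\le 2^{-n}$ and the resulting $\psi$ is simply $\psi(y)=y$.

```python
from fractions import Fraction

def K(n):
    """Finite (hence compact) set K_n = {0, ..., n-1} in Y = {0, 1, 2, ...}."""
    return set(range(n))

def geometric_mass(q, j):
    """lambda_q({j}) = (1 - q) * q**j."""
    return (1 - q) * q ** j

def tail_outside_K(q, n):
    """lambda_q(Y minus K_n) = q**n."""
    return q ** n

def psi(y, n_max=64):
    """Number of n in 1..n_max with y outside K_n (exact value of psi for y < n_max)."""
    return sum(1 for n in range(1, n_max + 1) if y not in K(n))

def integral_of_psi(q, j_max=64):
    """Partial sum (a lower bound) of the integral of psi against lambda_q."""
    return sum(psi(j) * geometric_mass(q, j) for j in range(j_max))

if __name__ == "__main__":
    for k in range(1, 6):
        q = Fraction(k, 10)
        print(q, float(integral_of_psi(q)))
    # Sublevel sets of psi are the sets K_{[t]+1}.
    print({y for y in range(64) if psi(y) <= 2.5} == K(3))
```

Here the full integral of $\psi$ against $\lambda_q$ equals $\sum_{n\ge 1}q^n = q/(1-q)\le 1$, matching the bound obtained in the proof.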
Motivated by Theorem 3.5.27, we introduce the following definition.
DEFINITION 3.5.29 A set $S\subseteq\mathcal{Y}(\Omega,E,\mu)$ is said to be uniformly tight, if and only if for every $\varepsilon>0$, there exists a compact set $K_\varepsilon\subseteq E$, such that
$$\sup_{\nu\in S}\nu\big(\Omega\times(E\setminus K_\varepsilon)\big)\le\varepsilon.$$
REMARK 3.5.30 The uniform tightness of $S\subseteq\mathcal{Y}(\Omega,E,\mu)$ is equivalent to saying that $C = \big\{\nu\circ\mathrm{proj}_E^{-1}\big\}_{\nu\in S}$ is uniformly tight in $M(E)_+$ (see Theorem 3.5.27).
Using the identification of $\mathcal{Y}(\Omega,E,\mu)$ with $\mathcal{R}(\Omega,E)$ and Proposition 3.5.28, we are led to the following characterization of uniformly tight sets in $\mathcal{Y}(\Omega,E,\mu)$.
PROPOSITION 3.5.31 $S\subseteq\mathcal{Y}(\Omega,E,\mu)$ is uniformly tight if and only if there exists an inf-compact function $\psi:E\to\overline{\mathbb{R}}_+ := \mathbb{R}_+\cup\{+\infty\}$, such that
$$\sup_{\nu\in S}\int_\Omega\int_E\psi(x)\,\nu(z)(dx)\,d\mu < +\infty.$$
In the literature there is another notion of uniform tightness for Young measures (or equivalently for transition probabilities).
DEFINITION 3.5.32 A set $S\subseteq\mathcal{Y}(\Omega,E,\mu)$ is said to be uniformly B-tight, if there exists a $\Sigma\times\mathcal{B}(E)$-measurable function $\varphi:\Omega\times E\to\overline{\mathbb{R}}_+$, such that $\varphi(z,\cdot)$ is inf-compact for all $z\in\Omega$, i.e., the set
$$\big\{x\in E : \varphi(z,x)\le t\big\}$$
is compact for all $t>0$ (hence $\varphi$ is a normal integrand, i.e., $\varphi\in\mathcal{N}_+(\Omega,\Sigma,E)$), and
$$\sup_{\nu\in S}\int_{\Omega\times E}\varphi(z,x)\,d\nu = \sup_{\nu\in S}\int_\Omega\int_E\varphi(z,x)\,\nu(z)(dx)\,d\mu < +\infty.$$
REMARK 3.5.33 Evidently uniform tightness implies uniform B-tightness. In fact since $E$ is a Polish space, the two notions are equivalent (see Valadier (1975, p. 165)).
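LEMMA 3.5.34 If $S\subseteq\mathcal{Y}(\Omega,E,\mu)$ is uniformly tight, then so is $\overline{S}$ (the closure in the narrow topology; see Definition 3.5.20).
PROOF By virtue of Definition 3.5.29, for a given $\varepsilon>0$, we can find a compact set $K_\varepsilon\subseteq E$, such that
$$\sup_{\nu\in S}\nu\big(\Omega\times(E\setminus K_\varepsilon)\big)\le\varepsilon.$$
Let us set
$$\psi(x) := \chi_{E\setminus K_\varepsilon}(x).$$
Then $\psi\in\mathcal{N}_+(\Omega,\Sigma,E)$ and if $\nu\in\mathcal{Y}(\Omega,E,\mu)$, we have
$$\int_{\Omega\times E}\psi(x)\,d\nu\le\varepsilon \iff \nu\big(\Omega\times(E\setminus K_\varepsilon)\big)\le\varepsilon.$$
Therefore, if $\nu\in\overline{S}$, by Proposition 3.5.23(b), it follows that
$$\nu\big(\Omega\times(E\setminus K_\varepsilon)\big)\le\varepsilon,$$
hence $\overline{S}$ is uniformly tight too.
The next theorem is the extension of Theorem 3.5.27 to Young measures.
THEOREM 3.5.35 $S\subseteq\mathcal{Y}(\Omega,E,\mu)$ is relatively compact for the narrow topology if and only if $S$ is uniformly tight.
PROOF "⟹": Let $\varphi:\mathcal{Y}(\Omega,E,\mu)\to M(E)_+$ be defined by
$$\varphi(\nu) := \nu\circ\mathrm{proj}_E^{-1} \quad \forall\, \nu\in\mathcal{Y}(\Omega,E,\mu).$$
If $h\in C_0(E)$, then $h\circ\mathrm{proj}_E\in\widehat{\mathcal{K}}_b(\Omega,\Sigma,E)$ and so from Remark 3.5.6, we have
$$\int_E h(x)\,d\big(\nu\circ\mathrm{proj}_E^{-1}\big) = \int_{\Omega\times E}\big(h\circ\mathrm{proj}_E\big)(z,x)\,d\nu$$
and thus $\varphi$ is continuous for the narrow topology on $\mathcal{Y}(\Omega,E,\mu)$ and on $M(E)_+$. It follows then that $\varphi(S)$ is relatively compact in $M(E)_+$ and so from Theorem 3.5.27, we have that $\varphi(S)$ is uniformly tight in $M(E)_+$. This then implies the uniform tightness of $S$ (see Remark 3.5.30).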
b µ), defined “⇐=”: Let {νn }n>1 ⊆ S be a sequence and consider νbn ∈ Y(Ω, E, by ¡ ¢ df b n > 1. νbn (C) = νn C ∩ (Ω × E) ∀ C ∈ Σ × B(E), b is compact, the weak∗ and narrow topologies on Y(Ω, E, b µ) Because the set E coincide (see Remark 3.5.21). So because of Theorem 3.5.10, we may assume that νbn −→ νb in Y(Ω, E, µ), with the narrow topology. Since S is uniformly tight, for every m > 1, we can find a compact set Km ⊆ E, such that
so
¡ ¢ 1 νn Ω × (E \ Km ) 6 m
∀ n > 1,
¡ ¢ 1 νbn Ω × (E \ Km ) 6 m
∀n>1
and thus ¡ ¢ ¡ ¢ 1 νb Ω × (E \ Km ) 6 lim inf νbn Ω × (E \ Km ) 6 . n→+∞ m So νb is supported by Ω ×
∞ S m=1
Km ⊆ Ω × E.
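Since in the sequel we shall focus on Young measures associated to measurable functions, let us give some examples in this direction.

EXAMPLE 3.5.36 In the examples that follow $\Omega = [0, 1]$, $\mu$ is the Lebesgue measure and $E = \mathbb{R}$.

(a) Consider the Rademacher functions
$$u_n(z) \overset{df}{=} \operatorname{sgn}\big(\sin(2^n \pi z)\big) \qquad \forall\, n \ge 1,$$
where
$$\operatorname{sgn} z \overset{df}{=} \begin{cases} \dfrac{z}{|z|} & \text{if } z \ne 0, \\ 1 & \text{if } z = 0. \end{cases}$$
So
$$u_n(z) = \begin{cases} 1 & \text{if } z \in \big[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\big),\ k \text{ is even}, \\ -1 & \text{otherwise}. \end{cases}$$
We have
$$u_n \xrightarrow{w^*} 0 \quad \text{in } L^\infty[0, 1].$$
Let $\nu_n$ be the Young measure associated to $u_n$ for $n \ge 1$. We have that
$$\nu_n \xrightarrow{w^*} \frac{1}{2}\big(\mu \otimes \delta_1\big) + \frac{1}{2}\big(\mu \otimes \delta_{-1}\big).$$
So we see that $w^*$-convergence in $L^\infty(\Omega)$ does not imply the $w^*$-convergence in $M(\Omega \times E)$ of the associated Young measures to the Young measure $\mu \otimes \delta_0$ of the limit function. Note that the sequence $\{u_n\}_{n \ge 1}$ has no subsequences converging $\mu$-almost everywhere on $\Omega = [0, 1]$ (note that $\|u_n - u_m\|_{L^1[0,1]} = 1$ for $n \ne m$ and compare with Remark 3.5.8).

(b) Let $\{u_n\}_{n \ge 1}$ be the Rademacher functions introduced above. Then define
$$\widehat{u}_n(z) \overset{df}{=} \begin{cases} u_n(z) & \text{if } n \text{ is even}, \\ \tfrac{1}{2} u_n(z) & \text{if } n \text{ is odd}. \end{cases}$$
Clearly we still have
$$\widehat{u}_n \xrightarrow{w^*} 0 \quad \text{in } L^\infty[0, 1].$$
However, the sequence $\{\widehat{\nu}_n\}_{n \ge 1}$ of associated Young measures is not convergent, because
$$\widehat{\nu}_{2n} \xrightarrow{w^*} \frac{1}{2}\big(\mu \otimes \delta_1\big) + \frac{1}{2}\big(\mu \otimes \delta_{-1}\big) \quad \text{in } M(\Omega \times E)$$
and
$$\widehat{\nu}_{2n+1} \xrightarrow{w^*} \frac{1}{2}\big(\mu \otimes \delta_{\frac{1}{2}}\big) + \frac{1}{2}\big(\mu \otimes \delta_{-\frac{1}{2}}\big) \quad \text{in } M(\Omega \times E).$$

(c) Let
$$u_n(z) \overset{df}{=} \sin(nz) \qquad \forall\, z \in [0, 1],\ n \ge 1.$$
We know that
$$u_n \xrightarrow{w^*} 0 \quad \text{in } L^\infty[0, 1].$$
On the other hand it can be shown (see Tartar (1979, p. 148)), that if $\{\nu_n\}_{n \ge 1}$ is the sequence of associated Young measures, then
$$\nu_n \xrightarrow{w^*} \nu \quad \text{in } M(\Omega \times E),$$
where
$$\nu(A) = \int_A \frac{1}{\pi} \frac{1}{\sqrt{1 - z^2}}\, dz$$
for all measurable $A \subseteq [0, 1]$.

(d) Let
$$u_n(z) \overset{df}{=} \vartheta_n \chi_{[0, \frac{1}{n}]}(z) \qquad \forall\, n \ge 1,$$
with $\vartheta_n \in \mathbb{R}$ for $n \ge 1$. Whatever the sequence $\{\vartheta_n\}_{n \ge 1} \subseteq \mathbb{R}$, if $\{\nu_n\}_{n \ge 1}$ is the sequence of associated Young measures, we have
$$\nu_n \xrightarrow{w^*} \mu \otimes \delta_0 \quad \text{in } M(\Omega \times E).$$

(e) Let
$$u_n(z) = n \qquad \forall\, z \in [0, 1],\ n \ge 1.$$
Also let $\{\nu_n\}_{n \ge 1}$ be the sequence of associated Young measures. Clearly
$$\nu_n \xrightarrow{w^*} 0 \quad \text{in } M(\Omega \times E).$$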
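The limits in Example 3.5.36(a) and (d) can be checked numerically. The following is only a rough sketch: the helper names `rademacher` and `pairing`, the test functions and the midpoint-rule grid are illustrative choices, not part of the text. It approximates the pairing $\int_0^1 \phi(z)\, h(u_n(z))\, dz$, which is how the Young measure of $u_n$ acts on the integrand $\phi(z) h(x)$.

```python
def rademacher(n, z):
    """Rademacher function u_n(z): +1 on [k/2^n, (k+1)/2^n) for even k, -1 otherwise."""
    return 1.0 if int(2**n * z) % 2 == 0 else -1.0


def pairing(u, phi, h, cells=2**14):
    """Midpoint-rule approximation of the integral of phi(z) * h(u(z)) over [0, 1]."""
    total = 0.0
    for i in range(cells):
        z = (i + 0.5) / cells
        total += phi(z) * h(u(z))
    return total / cells


if __name__ == "__main__":
    phi = lambda z: z                    # test function in the z variable
    h = lambda x: 1.0 / (1.0 + x * x)    # a function in C_0(R)
    u8 = lambda z: rademacher(8, z)
    spike = lambda z: 100.0**2 if z <= 1.0 / 100.0 else 0.0  # Example (d) with theta_n = n^2, n = 100

    # The weak* limit of u_n is 0: the pairing with h(x) = x is small.
    print(pairing(u8, phi, lambda x: x))
    # The Young measures do not approach mu (x) delta_0: the value stays at
    # (h(1) + h(-1)) / 2 * 1/2 = 1/4 rather than h(0) * 1/2 = 1/2.
    print(pairing(u8, phi, h))
    # In Example (d) the pairing approaches h(0) * 1/2 = 1/2.
    print(pairing(spike, phi, h))
```

With these choices the second value equals $1/4$ for every $n$, which separates the limit $\frac{1}{2}(\mu \otimes \delta_1) + \frac{1}{2}(\mu \otimes \delta_{-1})$ from $\mu \otimes \delta_0$.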
REMARK 3.5.37 In Examples 3.5.36(a), (b) and (c), the functions $u_n$ take values in $[-1, 1]$, so we can take $E = [-1, 1]$ and then from Theorem 3.5.10, it follows that the sequence $\{\nu_n\}_{n \ge 1}$ is relatively $w^*$-compact in $M(\Omega \times E)$. In Examples 3.5.36(d) and (e), we cannot assume that $E$ is compact. Nevertheless, in Example 3.5.36(d) the sequence $\{\nu_n\}_{n \ge 1}$ is uniformly tight, thus relatively compact for the narrow topology (hence for the weak$^*$-topology too; see Theorem 3.5.35). In Example 3.5.36(e) the sequence is not uniformly tight (for every compact $K \subseteq \mathbb{R}$ we have $\mu\{z \in \Omega : u_n(z) \notin K\} = 1$ for all large $n$), which is consistent with its $w^*$-limit $0$ not being a Young measure.

As we already mentioned, in the remaining part of this section, we look at Young measures associated to measurable functions. So in what follows $u_n : \Omega \to E$ are $\Sigma$-measurable functions and $\{\nu_n\}_{n \ge 1}$ is the sequence of corresponding Young measures. Following Definition 3.5.32, we introduce the following notion.

DEFINITION 3.5.38 The sequence $\{u_n\}_{n \ge 1}$ is uniformly tight, if the sequence $\{\nu_n\}_{n \ge 1}$ is uniformly tight (in the sense of Definition 3.5.32).

REMARK 3.5.39 Since $\nu_n(z) = \delta_{u_n(z)}$, according to the above definition, the sequence $\{u_n\}_{n \ge 1}$ is uniformly tight, if for every $\varepsilon > 0$, we can find a compact set $K_\varepsilon \subseteq E$, such that
$$\sup_{n \ge 1} \mu\big(\{z \in \Omega : u_n(z) \notin K_\varepsilon\}\big) \le \varepsilon.$$
By Proposition 3.5.31, equivalently we can say that there exists an inf-compact function $\psi : E \to \mathbb{R}_+$, such that
$$\sup_{n \ge 1} \int_\Omega \psi\big(u_n(z)\big)\, d\mu < +\infty.$$
In particular then if $E = \mathbb{R}^N$ and we take $\psi(x) = \|x\|_{\mathbb{R}^N}$, then we see that every bounded sequence $\{u_n\}_{n \ge 1} \subseteq L^1(\Omega; \mathbb{R}^N)$ is uniformly tight.
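THEOREM 3.5.40 If $\nu_n \xrightarrow{n} \nu$ and $f \in N(\Omega, \Sigma, E)$ is such that the sequence $\{f^-(\cdot, u_n(\cdot))\}_{n \ge 1}$ is uniformly integrable in $L^1(\Omega)$, then

(a) if $\liminf\limits_{n \to +\infty} \int_\Omega f\big(z, u_n(z)\big)\, d\mu < +\infty$, then $\int_{\Omega \times E} f^+(z, x)\, d\nu < +\infty$;

(b) $\int_{\Omega \times E} f(z, x)\, d\nu \le \liminf\limits_{n \to +\infty} \int_\Omega f\big(z, u_n(z)\big)\, d\mu$.

PROOF (a) Fix $c \ge 0$ and let
$$f_c \overset{df}{=} \max\{-c, f\}.$$
From Proposition 3.5.23(b), we have
$$\int_{\Omega \times E} \big(f_c(z, x) + c\big)\, d\nu \le \liminf_{n \to +\infty} \int_\Omega \big(f_c(z, u_n(z)) + c\big)\, d\mu,$$
so
$$\int_{\Omega \times E} f_c(z, x)\, d\nu \le \liminf_{n \to +\infty} \int_\Omega f_c(z, u_n(z))\, d\mu < +\infty. \tag{3.133}$$
If we let $c = 0$, then from (3.133) and the uniform integrability hypothesis, we conclude that the sequence $\{f^+(\cdot, u_n(\cdot))\}_{n \ge 1}$ is bounded in $L^1(\Omega)$. This proves (a).

(b) Let
$$A_{n,c} \overset{df}{=} \big\{z \in \Omega : f\big(z, u_n(z)\big) < -c\big\}.$$
Then we have
$$\int_{A_{n,c}} f\big(z, u_n(z)\big)\, d\mu \le 0 \qquad \forall\, n \ge 1,$$
so
$$\int_{A_{n,c}} f^+\big(z, u_n(z)\big)\, d\mu - \int_{A_{n,c}} f^-\big(z, u_n(z)\big)\, d\mu \le 0 \qquad \forall\, n \ge 1.$$
Thus for a given $\varepsilon > 0$, we can find $c > 0$ large enough so that
$$-\varepsilon \le \int_{A_{n,c}} f\big(z, u_n(z)\big)\, d\mu \le 0. \tag{3.134}$$
Note that
$$f_c\big(z, u_n(z)\big) = f\big(z, u_n(z)\big) \quad \text{on } \Omega \setminus A_{n,c}$$
and
$$f_c\big(z, u_n(z)\big) = -c \quad \text{on } A_{n,c}.$$
Hence
$$\begin{aligned}
\int_\Omega f\big(z, u_n(z)\big)\, d\mu &= \int_{A_{n,c}} f\big(z, u_n(z)\big)\, d\mu + \int_{\Omega \setminus A_{n,c}} f\big(z, u_n(z)\big)\, d\mu \\
&= \int_{A_{n,c}} f\big(z, u_n(z)\big)\, d\mu + \int_\Omega f_c\big(z, u_n(z)\big)\, d\mu - \int_{A_{n,c}} f_c\big(z, u_n(z)\big)\, d\mu \\
&\ge \int_{A_{n,c}} f\big(z, u_n(z)\big)\, d\mu + \int_\Omega f_c\big(z, u_n(z)\big)\, d\mu \\
&\ge \int_\Omega f_c\big(z, u_n(z)\big)\, d\mu - \varepsilon
\end{aligned}$$
(see (3.134)). So using (3.133) and the fact that $f \le f_c$, we have
$$\begin{aligned}
\liminf_{n \to +\infty} \int_\Omega f\big(z, u_n(z)\big)\, d\mu &\ge \liminf_{n \to +\infty} \int_\Omega f_c\big(z, u_n(z)\big)\, d\mu - \varepsilon \\
&\ge \int_{\Omega \times E} f_c(z, x)\, d\nu - \varepsilon \ge \int_{\Omega \times E} f(z, x)\, d\nu - \varepsilon.
\end{aligned}$$
Let $\varepsilon \searrow 0$ to finish the proof of the theorem.

COROLLARY 3.5.41 If $\nu_n \xrightarrow{n} \nu$, $f_0 : \Omega \times E \to \mathbb{R}$ is $\Sigma \times B(E)$-measurable, $f_0(z, \cdot) \in C(E)$ for all $z \in \Omega$ and $\{f_0(\cdot, u_n(\cdot))\}_{n \ge 1}$ is a sequence of uniformly integrable functions, then $f_0$ is $\nu$-integrable and
$$\int_{\Omega \times E} f_0(z, x)\, d\nu = \lim_{n \to +\infty} \int_\Omega f_0\big(z, u_n(z)\big)\, d\mu.$$

PROOF Use Theorem 3.5.40 with $f = f_0$ and $f = -f_0$.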
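COROLLARY 3.5.42 If $\nu_n \xrightarrow{n} \nu$, $h : E \to \mathbb{R}$ is a continuous function and $\{h(u_n(\cdot))\}_{n \ge 1}$ is a sequence of uniformly integrable functions, then

(a) for $\mu$-almost all $z \in \Omega$, the function $h$ is $\nu(z)$-integrable and
$$\int_\Omega \int_E |h(x)|\, \nu(z)(dx)\, d\mu < +\infty;$$

(b) $h(u_n) \xrightarrow{w} \widehat{h}$ in $L^1(\Omega)$, where
$$\widehat{h}(z) \overset{df}{=} \int_E h(x)\, \nu(z)(dx).$$

PROOF (a) Follows from Corollary 3.5.41.

(b) Let $\vartheta \in L^\infty(\Omega)$ and let us set
$$f_0(z, x) \overset{df}{=} \vartheta(z) h(x).$$
We have
$$\int_A \big|f_0\big(z, u_n(z)\big)\big|\, d\mu \le \|\vartheta\|_\infty \int_A \big|h\big(u_n(z)\big)\big|\, d\mu \qquad \forall\, A \in \Sigma,$$
so $\{f_0(\cdot, u_n(\cdot))\}_{n \ge 1}$ is a sequence of uniformly integrable functions. We can apply Corollary 3.5.41 and obtain
$$\int_{\Omega \times E} f_0(z, x)\, d\nu = \lim_{n \to +\infty} \int_\Omega f_0\big(z, u_n(z)\big)\, d\mu = \lim_{n \to +\infty} \int_\Omega \vartheta(z) h\big(u_n(z)\big)\, d\mu. \tag{3.135}$$
Because $\{h(u_n(\cdot))\}_{n \ge 1}$ is a sequence of uniformly integrable functions, from the Dunford-Pettis theorem (see Theorem 2.3.24), we may assume that
$$h(u_n) \xrightarrow{w} \widehat{h} \quad \text{in } L^1(\Omega).$$
So from (3.135), we have
$$\int_{\Omega \times E} f_0(z, x)\, d\nu = \int_\Omega \vartheta(z) \left(\int_E h(x)\, \nu(z)(dx)\right) d\mu = \int_\Omega \vartheta(z) \widehat{h}(z)\, d\mu.$$
Let $\vartheta = \chi_A$, $A \in \Sigma$. Then we obtain
$$\int_A \int_E h(x)\, \nu(z)(dx)\, d\mu = \int_A \widehat{h}(z)\, d\mu \qquad \forall\, A \in \Sigma,$$
so
$$\int_E h(x)\, \nu(z)(dx) = \widehat{h}(z) \quad \text{for } \mu\text{-a.a. } z \in \Omega.$$
As every subsequence of $\{h(u_n(\cdot))\}_{n \ge 1}$ has a further subsequence converging weakly in $L^1(\Omega)$ to $\widehat{h}$, we conclude that the original sequence $\{h(u_n(\cdot))\}_{n \ge 1}$ converges.

REMARK 3.5.43 If $E$ is compact, then
$$h(u_n) \xrightarrow{w^*} \widehat{h} \quad \text{in } L^\infty(\Omega).$$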
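PROPOSITION 3.5.44 If $E = \mathbb{R}^N$ and $\{u_n\}_{n \ge 1} \subseteq L^\infty(\Omega; \mathbb{R}^N)$, $\{\nu_n\}_{n \ge 1}$ are sequences, such that
$$u_n \xrightarrow{w^*} u \quad \text{in } L^\infty(\Omega; \mathbb{R}^N)$$
and
$$\nu_n \xrightarrow{w^*} \nu \quad \text{in } M(\Omega \times \mathbb{R}^N),$$
then
$$u(z) = \int_{\mathbb{R}^N} x\, \nu(z)(dx) \quad \text{for } \mu\text{-a.a. } z \in \Omega.$$

PROOF Since the sequence $\{u_n\}_{n \ge 1}$ is bounded in $L^\infty(\Omega; \mathbb{R}^N)$, we may replace $E = \mathbb{R}^N$ by a compact subset of it. Then the narrow and weak$^*$ topologies coincide. Therefore from Corollary 3.5.42 (see also Remark 3.5.43) with $h(x) = x$, we obtain the result.

REMARK 3.5.45 For a given measurable function $u : \Omega \to E$, we define the barycenter of $u$ to be the set
$$\mathrm{Bar}(u) \overset{df}{=} \left\{\lambda \in R(\Omega, E) : u(z) = \int_E x\, \lambda(z)(dx) \text{ for } \mu\text{-a.a. } z \in \Omega\right\}.$$
So the conclusion of Proposition 3.5.44 says that $\nu \in R(\Omega, \mathbb{R}^N)$ belongs to the barycenter of $u$.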
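In Proposition 2.3.38 we produced a criterion for strong convergence in $L^1(\Omega)$. In the next proposition, using the tools provided by the theory of Young measures, we obtain a criterion for strong convergence in $L^p(\Omega; \mathbb{R}^N)$, $p \in [1, +\infty)$.

PROPOSITION 3.5.46 If $E = \mathbb{R}^N$, $\{u_n\}_{n \ge 1} \subseteq L^\infty(\Omega; \mathbb{R}^N)$ and $\{\nu_n\}_{n \ge 1} \subseteq M(\Omega \times E)$ are sequences, such that
$$u_n \xrightarrow{w^*} u \quad \text{in } L^\infty(\Omega; \mathbb{R}^N)$$
and
$$\nu_n \xrightarrow{w^*} \nu \quad \text{in } M(\Omega \times E),$$
then
$$u_n \to u \quad \text{in } L^p(\Omega; \mathbb{R}^N)$$
for $p \in [1, +\infty)$ if and only if
$$\nu(z) = \delta_{u(z)} \quad \text{for } \mu\text{-a.a. } z \in \Omega.$$

PROOF "$\Longrightarrow$": Let $h \in C_0(\mathbb{R}^N)$. Then
$$h(u_n) \to h(u) \quad \text{in } L^p(\Omega).$$
On the other hand it is easy to see that $\{h(u_n(\cdot))\}_{n \ge 1}$ is a sequence of uniformly integrable functions. So by virtue of Corollary 3.5.42(b), we have that
$$h(u_n) \xrightarrow{w} \widehat{h} \quad \text{in } L^1(\Omega),$$
with
$$\widehat{h}(z) \overset{df}{=} \int_{\mathbb{R}^N} h(x)\, \nu(z)(dx).$$
We have
$$h\big(u(z)\big) = \big\langle h, \delta_{u(z)} \big\rangle_{C_0(\mathbb{R}^N)} = \int_{\mathbb{R}^N} h(x)\, \nu(z)(dx) = \big\langle h, \nu(z) \big\rangle_{C_0(\mathbb{R}^N)} \quad \text{for } \mu\text{-a.a. } z \in \Omega,$$
where by $\langle \cdot, \cdot \rangle_{C_0(\mathbb{R}^N)}$ we denote the duality brackets for the pair of spaces $\big(C_0(\mathbb{R}^N), M(\mathbb{R}^N)\big)$. Since $h \in C_0(\mathbb{R}^N)$ was arbitrary, we obtain $\delta_{u(z)} = \nu(z)$ for $\mu$-a.a. $z \in \Omega$.

"$\Longleftarrow$": It suffices to prove the implication for $p > 1$ (recall that the embedding $L^p(\Omega; \mathbb{R}^N) \subseteq L^1(\Omega; \mathbb{R}^N)$ is continuous). If
$$\nu(z) = \delta_{u(z)} \quad \text{for } \mu\text{-a.a. } z \in \Omega,$$
then from Corollary 3.5.42(b) with
$$h(x) = \|x\|^p_{\mathbb{R}^N},$$
we have
$$\|u_n\|^p_{\mathbb{R}^N} \xrightarrow{w} \|u\|^p_{\mathbb{R}^N} \quad \text{in } L^1(\Omega)$$
and so $\|u_n\|_p \to \|u\|_p$. Note that
$$u_n \xrightarrow{w} u \quad \text{in } L^p(\Omega; \mathbb{R}^N).$$
So from the Kadec-Klee property (see Remark A.3.22), we have that
$$u_n \to u \quad \text{in } L^p(\Omega; \mathbb{R}^N).$$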
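To see the dichotomy of Proposition 3.5.46 in a concrete case, the sketch below compares two bounded sequences on $\Omega = [0, 1]$ for $p = 2$. The names `l2_distance` and `damped`, and the second sequence $v_n(z) = z + \sin(nz)/n$, are illustrative assumptions and do not come from the text. The Rademacher functions converge weakly$^*$ to $0$ but their Young measures do not tend to $\mu \otimes \delta_0$, while $v_n$ converges uniformly to $u(z) = z$, so its Young measures tend to $\delta_{u(z)}$.

```python
import math


def rademacher(n, z):
    """Rademacher function u_n(z): +1 on [k/2^n, (k+1)/2^n) for even k, -1 otherwise."""
    return 1.0 if int(2**n * z) % 2 == 0 else -1.0


def damped(n, z):
    """Hypothetical sequence v_n(z) = z + sin(n z) / n, converging uniformly to z."""
    return z + math.sin(n * z) / n


def l2_distance(u, v, cells=2**14):
    """Midpoint-rule approximation of the L^2(0, 1) distance between u and v."""
    total = 0.0
    for i in range(cells):
        z = (i + 0.5) / cells
        total += (u(z) - v(z)) ** 2
    return math.sqrt(total / cells)


if __name__ == "__main__":
    # Distance of the Rademacher function u_8 from its weak* limit 0: no decay.
    print(l2_distance(lambda z: rademacher(8, z), lambda z: 0.0))
    # Distance of v_100 from its limit u(z) = z: at most 1/100.
    print(l2_distance(lambda z: damped(100, z), lambda z: z))
```

The first distance stays equal to 1 for every $n$, in line with $\nu(z) \ne \delta_0$; the second is bounded by $1/n$.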
We have another result in this direction. First a lemma.

LEMMA 3.5.47 If $\{u_n\}_{n \ge 1} \subseteq L^1(\Omega; \mathbb{R}^N)$ is a sequence of uniformly integrable functions, then we can find a subsequence $\{\nu_{n_k}\}_{k \ge 1}$ of $\{\nu_n\}_{n \ge 1}$, such that

(a) $\nu_{n_k} \xrightarrow{n} \nu$ in $Y(\Omega, \mathbb{R}^N, \mu)$;

(b) $\int_{\mathbb{R}^N} \|x\|_{\mathbb{R}^N}\, \nu(z)(dx) < +\infty$ for $\mu$-a.a. $z \in \Omega$;

(c) $u_{n_k} \xrightarrow{w} u$ in $L^1(\Omega; \mathbb{R}^N)$, with $u(z) = \int_{\mathbb{R}^N} x\, \nu(z)(dx)$.

PROOF (a) From Proposition 3.5.31 with $\psi(x) = \|x\|_{\mathbb{R}^N}$, we see that the sequence $\{\nu_n\}_{n \ge 1}$ is uniformly tight. So by virtue of Theorem 3.5.35, we can find a subsequence $\{\nu_{n_k}\}_{k \ge 1}$ of $\{\nu_n\}_{n \ge 1}$, such that
$$\nu_{n_k} \xrightarrow{n} \nu \quad \text{as } k \to +\infty \quad \text{in } Y(\Omega, \mathbb{R}^N, \mu).$$

(b) For the subsequence obtained in part (a), we see that we can apply Corollary 3.5.42(a) with $h(x) = \|x\|_{\mathbb{R}^N}$ and obtain the desired conclusion.

(c) For the subsequence obtained in part (a), we see that we can apply Corollary 3.5.42(b) with $h(x) = x_k$ for all $x = (x_1, \ldots, x_N) \in \mathbb{R}^N$, for each $k \in \{1, \ldots, N\}$, and obtain the desired conclusion.
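PROPOSITION 3.5.48 If $\{u_n\}_{n \ge 1} \subseteq L^1(\Omega; \mathbb{R}^N)$ is a sequence, such that
$$u_n \xrightarrow{w} u \quad \text{in } L^1(\Omega; \mathbb{R}^N)$$
and for every subsequence $\{\nu_{n_k}\}_{k \ge 1}$ of the sequence $\{\nu_n\}_{n \ge 1}$ for which we have
$$\nu_{n_k} \xrightarrow{n} \nu \quad \text{as } k \to +\infty \quad \text{in } Y(\Omega, \mathbb{R}^N, \mu),$$
$\nu$ is the Young measure associated to a $\Sigma$-measurable function $w : \Omega \to \mathbb{R}^N$, then $w = u$ and
$$u_n \to u \quad \text{in } L^1(\Omega; \mathbb{R}^N).$$

PROOF We have
$$\int_\Omega \|w(z)\|_{\mathbb{R}^N}\, d\mu = \int_\Omega \left\|\int_{\mathbb{R}^N} x\, \nu(z)(dx)\right\|_{\mathbb{R}^N} d\mu \le \liminf_{k \to +\infty} \int_\Omega \|u_{n_k}(z)\|_{\mathbb{R}^N}\, d\mu < +\infty,$$
so $w \in L^1(\Omega; \mathbb{R}^N)$. If
$$\nu_{n_k} \xrightarrow{n} \nu = \delta_{w(\cdot)},$$
then by Lemma 3.5.47(c), we have
$$u_{n_k} \xrightarrow{w} w \quad \text{in } L^1(\Omega; \mathbb{R}^N).$$
Hence $w = u$. From Proposition 3.5.22, we have that
$$u_{n_k} \xrightarrow{\mu} u$$
and by passing to a further subsequence if necessary, we may assume that $u_{n_k}(z) \to u(z)$ for $\mu$-a.a. $z \in \Omega$. Then from the extended dominated convergence theorem (Vitali's theorem; see Theorem A.2.9), we have that
$$u_n \to u \quad \text{in } L^1(\Omega; \mathbb{R}^N).$$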
Now we can prove a lower semicontinuity result for integral functionals.
THEOREM 3.5.49 If $\{u_n\}_{n \ge 1} \subseteq L^1(\Omega; \mathbb{R}^N)$ is a sequence, such that
$$u_n \xrightarrow{w} u \quad \text{in } L^1(\Omega; \mathbb{R}^N),$$
$\{w_n : \Omega \to E\}_{n \ge 1}$ is a sequence of $\Sigma$-measurable functions, such that
$$w_n \xrightarrow{\mu} w,$$
for some $\Sigma$-measurable function $w : \Omega \to E$, $f \in N(\Omega, \Sigma, E \times \mathbb{R}^N)$ and

(i) $f\big(z, w(z), \cdot\big)$ is convex for $\mu$-almost all $z \in \Omega$;

(ii) $\{f^-(\cdot, w_n(\cdot), u_n(\cdot))\}_{n \ge 1}$ is a sequence of uniformly integrable functions,

then

(a) if $\liminf\limits_{n \to +\infty} \int_\Omega f\big(z, w_n(z), u_n(z)\big)\, d\mu < +\infty$, then $\int_\Omega f^+\big(z, w(z), u(z)\big)\, d\mu < +\infty$;

(b) $\int_\Omega f\big(z, w(z), u(z)\big)\, d\mu \le \liminf\limits_{n \to +\infty} \int_\Omega f\big(z, w_n(z), u_n(z)\big)\, d\mu$.

PROOF By passing to a suitable subsequence, we may assume that the limit
$$\lim_{n \to +\infty} \int_\Omega f\big(z, w_n(z), u_n(z)\big)\, d\mu$$
exists, while by Proposition 3.5.22, we have
$$\delta_{w_n} \xrightarrow{n} \delta_w \quad \text{in } Y(\Omega, \Sigma, E)$$
and by Lemma 3.5.47, we have
$$\delta_{u_n} \xrightarrow{n} \nu \quad \text{in } Y(\Omega, \Sigma, \mathbb{R}^N).$$
It is easy to see that
$$\delta_{w_n} \otimes \delta_{u_n} \xrightarrow{n} \delta_w \otimes \nu \quad \text{in } Y(\Omega, \Sigma, E \times \mathbb{R}^N).$$
Invoking Theorem 3.5.40(a), we obtain the implication
$$\liminf_{n \to +\infty} \int_\Omega f\big(z, w_n(z), u_n(z)\big)\, d\mu < +\infty \ \Longrightarrow\ \int_\Omega \left(\int_{\mathbb{R}^N} f^+\big(z, w(z), x\big)\, \nu(z)(dx)\right) d\mu < +\infty. \tag{3.136}$$
From Theorem 3.5.40(b), we have
$$\int_\Omega \left(\int_{\mathbb{R}^N} f\big(z, w(z), x\big)\, \nu(z)(dx)\right) d\mu \le \liminf_{n \to +\infty} \int_\Omega f\big(z, w_n(z), u_n(z)\big)\, d\mu. \tag{3.137}$$
Since by hypothesis $f\big(z, w(z), \cdot\big)$ is convex for $\mu$-almost all $z \in \Omega$, using Jensen's inequality (see Theorem A.2.26), we obtain
$$\int_\Omega f\big(z, w(z), u(z)\big)\, d\mu = \int_\Omega f\left(z, w(z), \int_{\mathbb{R}^N} x\, \nu(z)(dx)\right) d\mu \le \int_\Omega \left(\int_{\mathbb{R}^N} f\big(z, w(z), x\big)\, \nu(z)(dx)\right) d\mu. \tag{3.138}$$
Then part (a) follows from (3.136) and (3.138), while part (b) follows from estimates (3.137) and (3.138).

There is a version of this result for integrands defined on Banach spaces.

THEOREM 3.5.50 If $(\Omega, \Sigma, \mu)$ is a finite measure space, $X$ is a separable Banach space, $Y$ is a separable reflexive Banach space and $f : \Omega \times X \times Y \to \mathbb{R}$ is a $\Sigma \times B(X) \times B(Y)$-measurable function, such that

(i) the function $(x, y) \mapsto f(\omega, x, y)$ is lower semicontinuous for $\mu$-almost all $\omega \in \Omega$;

(ii) the function $y \mapsto f(\omega, x, y)$ is convex for $\mu$-almost all $\omega \in \Omega$ and all $x \in X$;

(iii) there exist $\beta \in L^1(\Omega)$ and $c > 0$, such that
$$f(\omega, x, y) \ge \beta(\omega) - c\big(\|x\|_X + \|y\|_Y\big) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } (x, y) \in X \times Y,$$

then the functional
$$(u, v) \mapsto I_f(u, v) \overset{df}{=} \int_\Omega f\big(\omega, u(\omega), v(\omega)\big)\, d\mu$$
is sequentially lower semicontinuous from $L^1(\Omega; X) \times L^1(\Omega; Y)_w$ into $\mathbb{R}$.
In Proposition 2.3.39, using an extremality condition, we obtained a result concerning strong convergence in $L^1(\Omega; \mathbb{R}^N)$. There the result was stated without proof. Now that we have at our disposal the tools of the theory of Young measures, we can give a proof of it.

PROPOSITION 3.5.51 If $\{u_n\}_{n \ge 1} \subseteq L^1(\Omega; \mathbb{R}^N)$ is a sequence, such that
$$u_n \xrightarrow{w} u \quad \text{in } L^1(\Omega; \mathbb{R}^N),$$
for some $u \in L^1(\Omega; \mathbb{R}^N)$ and
$$u(z) \in \operatorname{ext} \operatorname{conv} \Big(\limsup_{n \to +\infty} \{u_n(z)\}\Big) \quad \text{for } \mu\text{-a.a. } z \in \Omega,$$
then
$$u_n \to u \quad \text{in } L^1(\Omega; \mathbb{R}^N).$$

PROOF Let
$$S(z) \overset{df}{=} \operatorname{conv} \limsup_{n \to +\infty} \{u_n(z)\} \qquad \forall\, z \in \Omega.$$
Since
$$u(z) \in \operatorname{ext} S(z) \quad \text{for } \mu\text{-a.a. } z \in \Omega,$$
we can find a sequence $\{C_n(z)\}_{n \ge 1}$ of closed, convex sets in $S(z)$, such that
$$\bigcup_{n=1}^{\infty} C_n(z) = S(z) \setminus \{u(z)\} \quad \text{for } \mu\text{-a.a. } z \in \Omega.$$
Let us fix $z \in \Omega \setminus N$ with $\mu(N) = 0$. From Lemma 3.5.47, we have
$$u(z) = \int_{\mathbb{R}^N} x\, \nu(z)(dx),$$
with $\nu(z)$ being supported by $\limsup\limits_{n \to +\infty} \{u_n(z)\}$. Suppose that
$$\nu(z)\big(C_n(z)\big) > 0.$$
If $\nu(z)\big(C_n(z)\big) = 1$, then $u(z) \in C_n(z)$, a contradiction. Therefore
$$0 < \nu(z)\big(C_n(z)\big) < 1$$
and we can define
$$\lambda_1 \overset{df}{=} \frac{\nu(z)|_{C_n(z)}}{\nu(z)(C_n(z))} \quad \text{and} \quad \lambda_2 \overset{df}{=} \frac{\nu(z)|_{\mathbb{R}^N \setminus C_n(z)}}{1 - \nu(z)(C_n(z))}.$$
It follows that
$$u(z) = \int_{\mathbb{R}^N} x\, \nu(z)(dx) = \nu(z)\big(C_n(z)\big) \int_{\mathbb{R}^N} x\, \lambda_1(dx) + \big(1 - \nu(z)\big(C_n(z)\big)\big) \int_{\mathbb{R}^N} x\, \lambda_2(dx),$$
with
$$u(z) \ne \int_{\mathbb{R}^N} x\, \lambda_1(dx),$$
since
$$\int_{\mathbb{R}^N} x\, \lambda_1(dx) \in C_n(z).$$
So we have a contradiction to the hypothesis that $u(z)$ is extremal in $S(z)$. This implies that
$$\nu(z)\big(C_n(z)\big) = 0 \qquad \forall\, n \ge 1,\ z \in \Omega \setminus N,$$
hence
$$\nu(z)\big(\{u(z)\}\big) = 1,$$
i.e.,
$$\nu(z) = \delta_{u(z)} \quad \text{for } \mu\text{-a.a. } z \in \Omega.$$
Then the conclusion of the proposition follows from Proposition 3.5.46.

REMARK 3.5.52 The result is true if $(\Omega, \Sigma, \mu)$ is any finite measure space (see Proposition 2.3.39). However, since the theory in this section was developed for a locally compact, $\sigma$-compact metric space $\Omega$, we have kept this assumption.

We know that a bounded sequence $\{u_n\}_{n \ge 1} \subseteq L^1(\Omega; \mathbb{R}^N)$ is uniformly tight (see Proposition 3.5.31 with $\psi(x) = \|x\|_{\mathbb{R}^N}$) and so we can extract a subsequence $\{u_{n_k}\}_{k \ge 1}$, such that
$$\delta_{u_{n_k}} \xrightarrow{n} \nu \quad \text{as } k \to +\infty \quad \text{in } Y(\Omega, \Sigma, \mathbb{R}^N)$$
(see Theorem 3.5.35). It is natural to ask what is the relation between the sequence $\{u_n\}_{n \ge 1}$ and the function
$$u(z) = \int_{\mathbb{R}^N} x\, \nu(z)(dx).$$
In this respect the Chacon biting lemma (see Theorem 2.3.26) is helpful. An equivalent reformulation of this result (for $X = \mathbb{R}^N$) is the following theorem.
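THEOREM 3.5.53 If $\{u_n\}_{n \ge 1} \subseteq L^1(\Omega; \mathbb{R}^N)$ is a bounded sequence, then we can extract a subsequence $\{u_{n_k}\}_{k \ge 1}$ of $\{u_n\}_{n \ge 1}$, such that for every $\delta > 0$, there exists $A \in \Sigma$, with $\mu(A) < \delta$ and
$$u_{n_k} \xrightarrow{w} u \quad \text{in } L^1(\Omega \setminus A; \mathbb{R}^N), \quad \text{as } k \to +\infty.$$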
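PROPOSITION 3.5.54 If $\{u_n\}_{n \ge 1} \subseteq L^1(\Omega; \mathbb{R}^N)$ is a bounded sequence, then we can extract a subsequence $\{u_{n_k}\}_{k \ge 1}$ and a decreasing sequence of sets $\{A_k\}_{k \ge 1} \subseteq \Sigma$, with $\mu(A_k) \searrow 0$, such that
$$\delta_{u_{n_k}} \xrightarrow{n} \nu \quad \text{in } Y(\Omega, \Sigma, \mathbb{R}^N)$$
and
$$w_k \overset{df}{=} \chi_{\Omega \setminus A_k} u_{n_k} \xrightarrow{w} u \quad \text{in } L^1(\Omega; \mathbb{R}^N),$$
with
$$u(z) \overset{df}{=} \int_{\mathbb{R}^N} x\, \nu(z)(dx) \quad \text{for } \mu\text{-a.a. } z \in \Omega.$$

PROOF By virtue of Theorem 2.3.26, we can find a subsequence $\{u_{n_k}\}_{k \ge 1}$ of $\{u_n\}_{n \ge 1}$ and a decreasing sequence of sets $\{A_k\}_{k \ge 1} \subseteq \Sigma$ with
$$\mu(A_k) \searrow 0,$$
such that $\{w_k = \chi_{\Omega \setminus A_k} u_{n_k}\}_{k \ge 1}$ is a sequence of uniformly integrable functions. Then by virtue of Lemma 3.5.47, we need to show that
$$\delta_{w_k} \xrightarrow{n} \nu \quad \text{in } Y(\Omega, \Sigma, \mathbb{R}^N).$$
To this end let
$$f \in \widehat{K}^b(\Omega, \Sigma, \mathbb{R}^N).$$
If $\eta_k = \psi(\delta_{w_k})$ (see Theorem 3.5.12), then we have
$$\left|\int_{\Omega \times \mathbb{R}^N} f(z, x)\, d\eta_k - \int_{\Omega \times \mathbb{R}^N} f(z, x)\, d\nu_{n_k}\right| = \left|\int_{A_k} \big(f(z, 0) - f\big(z, u_{n_k}(z)\big)\big)\, d\mu\right| \le 2 \|f\|_{L^\infty(\Omega \times \mathbb{R}^N)}\, \mu(A_k) \to 0 \quad \text{as } k \to +\infty,$$
so $\eta_k \xrightarrow{n} \nu$.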
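A standard concentrating sequence shows why the sets $A_k$ have to be removed. The sketch below is an illustration under stated assumptions: the names `spike`, `bitten` and `integral`, the choice $u_n = n \chi_{[0, 1/n]}$ on $\Omega = [0, 1]$ (Example 3.5.36(d) with $\vartheta_n = n$) and the sets $A_n = [0, 1/n]$ are all hypothetical choices made here. The associated Young measures tend to $\mu \otimes \delta_0$, so $u = 0$, yet $\int_\Omega u_n\, d\mu = 1$ for all $n$, so $u_n$ does not converge weakly to $0$ in $L^1$. After biting off $A_n$ the truncated functions vanish.

```python
def spike(n, z):
    """Hypothetical test sequence u_n(z) = n on [0, 1/n] and 0 elsewhere."""
    return float(n) if z <= 1.0 / n else 0.0


def bitten(n, z):
    """w_n equals u_n outside the assumed set A_n = [0, 1/n] and 0 on A_n."""
    return 0.0 if z <= 1.0 / n else spike(n, z)


def integral(u, cells=2**14):
    """Midpoint-rule approximation of the integral of u over [0, 1]."""
    return sum(u((i + 0.5) / cells) for i in range(cells)) / cells


if __name__ == "__main__":
    n = 64
    print(integral(lambda z: spike(n, z)))   # the mass of u_n does not vanish
    print(integral(lambda z: bitten(n, z)))  # the bitten function carries no mass
```

Testing against the constant function $1 \in L^\infty(\Omega)$ is enough here: the first integral stays at 1 while the second is 0, matching $w_k \xrightarrow{w} u = 0$.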
3.6 Remarks
3.1: Compact maps were the first class of operators used to study nonlinear equations in infinite dimensional spaces. Leray & Schauder (1934) used compact perturbations of the identity in order to extend the Brouwer degree to infinite dimensional spaces. However, for linear operators the notion was first introduced by Riesz (1918). Earlier Hilbert (1906) introduced the notion of completely continuous linear operator between Banach spaces. As a property for linear operators, complete continuity actually lies properly between compactness and boundedness. Moreover, when the domain space X is reflexive, then compactness and complete continuity coincide (see Corollary 3.1.8). The basic approximation result for compact maps stated in Theorem 3.1.10 is due to Schauder (1930). Proper maps (see Definition 3.1.13) are discussed in Berger (1977). Theorem 3.1.22 is due to Schauder (1930). The proof of Proposition 3.1.31 can be found in Reed & Simon (1972, p. 191). Theorem 3.1.38 on the spectral properties of compact linear operators is due to Riesz (1918). Compact linear operators and their spectral properties are discussed in detail in Dunford & Schwartz (1958), Kato (1976) and Yosida (1978). The Fredholm alternative (see Theorem 3.1.48) was obtained in the context of linear integral equations by Fredholm (1903). The spectral theory of selfadjoint, compact, linear operators can be found in the books Akhiezer & Glazman (1961, 1963), Gohberg & Goldberg (1981), Halmos (1998) and Kato (1976). For further results on Fredholm operators we refer to Goldberg (1966), Kato (1976) and Schechter (1971). The proof of Proposition 3.1.70 can be found in Schechter (1971, p. 114).

3.2: Monotone operators are rooted in the calculus of variations and were introduced in the early sixties, in order to provide an analytical framework for the study of nonlinear operator equations broader than the one provided by compact operators.
The first mention of monotone operators in a Hilbert space can be traced to the work of Golomb (1935) on nonlinear Hammerstein integral equations. However, the systematic development of the theory of monotone operators started with Kachurovski (1960), who established that the derivative of a convex function is a monotone map and also introduced the term "monotone operator." Then Minty (1962) obtained the first existence result for nonlinear functional equations in Hilbert spaces under monotonicity assumptions. Even simple one dimensional examples reveal that a complete theory of maximal monotone maps requires the use of multivalued maps. Proposition 3.2.11 is due to Rockafellar (1969) who proved that a monotone map is locally bounded at every point in the interior of its domain. Here we have stated a slightly more general version of this result. Concerning Proposition 3.2.14, Kenderov (1974) proved that if $X$ is a separable, reflexive Banach space and $A : X \supseteq D(A) \to 2^{X^*}$ is maximal monotone with $\operatorname{int} D(A) \ne \emptyset$, then there is a dense $G_\delta$ subset $D_0$ of $\operatorname{int} D(A)$, such that $A|_{D_0}$ is single valued and upper semicontinuous for the norm topologies on $X$ and $X^*$. For additional results in this direction, we refer to Phelps (1993). The duality map (see Example 3.2.20(d)) plays a basic role in the study of the geometry of Banach spaces and in the theory of evolution equations. It was first introduced by Beurling & Livingston (1962). Its properties are studied by Browder (1976), Cioranescu (1990) and Zeidler (1990b). Theorem 3.2.29 is due to Minty (1962) for Hilbert spaces and Rockafellar (1970c) for Banach spaces. Its proof can be found in Zeidler (1990b, p. 881). Theorem 3.2.30 is due to Browder (1968) and together with Corollary 3.2.31 explains why maximal monotone operators are a powerful tool in the study of nonlinear operator equations. Theorem 3.2.40 is due to Attouch (1981), while Theorem 3.2.41 is due to Rockafellar (1970c). The proof of Theorem 3.2.41 can be found in Zeidler (1990b, p. 888). The notion of pseudomonotonicity was introduced by Brézis (1968) (using nets) and Browder (1976) (using sequences). The basic works on pseudomonotonicity are those by Browder & Hess (1972) and Kenmochi (1974, 1975). Of course the most important result is Theorem 3.2.52, due to Browder & Hess (1972). Monotone operators and operators of monotone-type are discussed in the books of Brézis (1973), Barbu (1976), Deimling (1985), Hu & Papageorgiou (1997), Morosanu (1988), Pascali & Sburlan (1978) and Showalter (1997). The proof of Theorem 3.2.58 can be found in Hu & Papageorgiou (1997, pp. 311–312).

3.3: Accretive operators were introduced by Kato (1967, 1968), who gave the complete characterization in metric terms involved in Proposition 3.3.4. In the first part of the section, dealing with accretive operators, we have summarized the results of Brézis (1971), Brézis & Pazy (1970), Crandall & Pazy (1969, 1970), Kato (1967, 1968) and Kenmochi (1972, 1973).
Lemma 3.3.26 is due to Kato (1968, 1970), while the Gronwall-type inequality obtained in Lemma 3.3.27 can be found in Brézis (1973, p. 157). Theorem 3.3.28 can be found in Crandall & Pazy (1969). The linear semigroup theory started developing as soon as it was realized that the theory has immediate applications to partial differential equations, Markov processes and ergodic theory. It developed rapidly during the 1940's and 1950's thanks to the seminal contributions of Hille, Phillips and Yosida. The main result of this theory is of course Theorem 3.3.46 (the Hille-Yosida generation theorem), which for contraction semigroups (i.e., $M = 1$, $\omega = 0$) was proved independently by Hille (1942) and Yosida (1948), while the general case (proved in Theorem 3.3.46) is independently due to Feller (1953), Miyadera (1952) and Phillips (1953). The proof of the Phillips theorem (see Theorem 3.3.48) can be found in Hille & Phillips (1957, p. 389). Theorem 3.3.49 is due to Lumer & Phillips (1961), while an early Hilbert space version of it was proved by Phillips (1959). The exponential formula in Theorem 3.3.51 is due to Hille (1942). In fact Hille's proof of the generation theorem was based on it. Another representation formula can be found in Pazy (1983, p. 21).

THEOREM 3.6.1 If $A$ is the infinitesimal generator of a $C_0$-semigroup $\{S(t)\}_{t \ge 0}$ on $X$, then
$$S(t)x = \lim_{\lambda \to +\infty} e^{tA_\lambda} x.$$
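The formula of Theorem 3.6.1 can be tried out in the simplest possible setting. The sketch below is an illustration only: it takes $X = \mathbb{R}$ and the scalar generator $A = -a$ with $a > 0$, so that $S(t) = e^{-at}$, and it assumes that $A_\lambda$ denotes the Yosida approximation $A_\lambda = \lambda A(\lambda I - A)^{-1}$, a notation that is not defined in this passage. The function names are hypothetical.

```python
import math


def yosida(a_gen, lam):
    """Assumed Yosida approximation lam * A * (lam - A)^(-1) of the scalar generator a_gen."""
    return lam * a_gen / (lam - a_gen)


def approx_semigroup(a_gen, t, lam):
    """exp(t * A_lam), the approximation of S(t) = exp(t * A) in the scalar case."""
    return math.exp(t * yosida(a_gen, lam))


if __name__ == "__main__":
    a_gen, t = -2.0, 1.0
    exact = math.exp(t * a_gen)
    for lam in (10.0, 1000.0, 1e6):
        # The error shrinks as lam grows.
        print(lam, abs(approx_semigroup(a_gen, t, lam) - exact))
```

In this scalar case $A_\lambda = -a\lambda/(\lambda + a) \to -a$ as $\lambda \to +\infty$, so the convergence is immediate; the interest of the theorem lies in unbounded generators, where $A_\lambda$ is bounded while $A$ is not.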
A complete list of representation formulae can be found in Hille & Phillips (1957, p. 354). Theorem 3.3.59 (the generation theorem for nonlinear semigroups) for Hilbert spaces was proved by Komura (1967), while the general case is due to Crandall & Liggett (1971). The notion of integral solution (see Definition 3.3.65) is due to Bénilan (1972). In the linear case, Proposition 3.3.71 is due to Lax (see Hille & Phillips (1957, p. 304)). For nonlinear semigroups, Proposition 3.3.71 and Theorem 3.3.72 were proved by Brézis (1974), while Corollary 3.3.73 is due to Pazy (1968). The theory of linear semigroups can be found in the books of Butzer & Berens (1967), Fattorini (1999), Goldstein (1985), Hille & Phillips (1957) and Pazy (1983), while the theory of nonlinear semigroups can be found in the books of Barbu (1976), Miyadera (1992), Pavel (1987) and Vrabie (1987).

3.4: Theorem 3.4.4 for functions defined on $\Omega \times \mathbb{R}$ with values in $\mathbb{R}$ was proved by Krasnoselskii (1964b, 1964a). The general case is due to Lucchetti & Patrone (1980). Moreover, in addition to Theorem 3.4.4, we can also show the continuity in measure of the operator $N_f$, already known when $X$ and $Y$ are Euclidean spaces.

PROPOSITION 3.6.2 If $\mu(\Omega) < +\infty$, $f : \Omega \times X \to Y$ is a Carathéodory function, $\{x_n : \Omega \to X\}_{n \ge 1}$ is a sequence of $\Sigma$-measurable functions and
$$x_n \xrightarrow{\mu} x,$$
then
$$N_f(x_n) \xrightarrow{\mu} N_f(x).$$

The proof of the Scorza-Dragoni theorem (see Theorem 3.4.10) can be found in Denkowski, Migórski & Papageorgiou (2003a, p. 188). Normal integrands were introduced by Rockafellar (1968). The characterization of lower semicontinuity of the integral functional $I_f$ obtained in Theorem 3.4.13 for Euclidean spaces $X$ and $Y$ is due to Poljak (1969), while the general case can be found in Lucchetti & Patrone (1980). Proposition 3.4.16 is due to Ioffe & Levin (1972). The proof of Proposition 3.4.18 can be found in Denkowski, Migórski & Papageorgiou (2003a, p. 460). For $X = \mathbb{R}$, Theorem 3.4.20 can be found in Ekeland & Temam (1976), but their proof is different, being based on results from convex analysis.
Further results on Nemytskii operators and integral functionals can be found in the books of Appell & Zabrejko (1990), Buttazzo (1989), Cesari (1983), Ekeland & Temam (1976), Hu & Papageorgiou (1997), Ioffe & Tihomirov (1979) and Vaĭnberg (1973).

3.5: The theory of Young measures has its roots in the so-called "generalized curves" of Young (1942a, 1942b, 1969) for the study of variational problems which are not inf-compact and consequently do not have a solution. The needs of control theory (relaxation) and of the calculus of variations led to further development of the original ideas of Young. We refer to Berliocchi & Lasry (1973), Ekeland (1972), Ekeland & Temam (1976), Gamkrelidze (1978), Warga (1972). Recently there was a revival of the theory (motivated also by the needs of problems in theoretical mechanics), which can be traced in the works of Alibert & Bouchitté (1997), Balder (1984, 1997), Ball (1989), Ball & Murat (1989), Ball & Zhang (1990), Di Perna (1985), Di Perna & Majda (1987) and Tartar (1979). Our presentation here follows the survey paper of Valadier (1975). Applications of Young measures to control theory and mechanics can be found in the books of Denkowski, Migórski & Papageorgiou (2003a, 2003b), Gamkrelidze (1978), Hu & Papageorgiou (1997, 2000), Pedregal (1997), Roubíček (1997) and Warga (1972). For the proof of the Prohorov theorem (see Theorem 3.5.27) we refer to Parthasarathy (1967, p. 47). Theorem 3.5.49 was obtained by De Giorgi (1968–1969), with $f \ge 0$. A more general version similar to that of Theorem 3.5.49 was proved by Ioffe (1977a, 1977b). His proof, which is also reproduced by Buttazzo (1989, p. 46), does not use Young measures and instead is based on the approximation of $f$ by certain affine functions. Two different proofs of Theorem 3.5.50 can be found in Balder (1987) and Hu & Papageorgiou (2000, p. 31).
Chapter 4 Smooth and Nonsmooth Analysis and Variational Principles
The purpose of this chapter is to outline the basic aspects of the smooth and nonsmooth calculus in Banach spaces. Special emphasis is placed on the nonsmooth theory, which started developing in the 1960's, in order to provide a uniform viewpoint for the treatment of large classes of nonlinear extremal problems. The resulting subdifferential theories also found many other applications and today are part of the so-called nonsmooth analysis, which is one of the most robust and interesting research areas of nonlinear functional analysis.

In Section 4.1 we present the basics of the smooth calculus in Banach spaces. We limit ourselves to the discussion of the Gâteaux and Fréchet derivatives, which are the two most useful derivatives for vector valued functions. In Section 4.2 we consider convex functions defined on Banach spaces. We discuss their continuity and differentiability properties. It turns out that a purely algebraic condition (convexity) has remarkable and powerful topological and differentiability implications. Also differentiability results bring us in contact with the Banach space theory and in particular with the so-called Asplund spaces which have the useful property that every separable subspace has a separable dual. We also show that every convex continuous function is locally Lipschitz. Locally Lipschitz functions between Banach spaces are the objects of investigation in Section 4.3. If the two Banach spaces are finite dimensional, then the locally Lipschitz function is differentiable almost everywhere (for the Lebesgue measure). Here we see how this can be generalized to the case where the two spaces are infinite dimensional. The main difficulty is to produce a suitable notion of negligible sets. This is done using the notion of Haar-null sets. We study them in detail and eventually prove an infinite dimensional version of the Rademacher theorem on the almost everywhere differentiability of locally Lipschitz functions.

In Section 4.4 we pass to the nonsmooth part of this chapter. We examine the duality and subdifferentiability properties of convex functions and the subdifferentiability properties of locally Lipschitz functions. At the end of the section, using the notion of bornology, we briefly consider some more subdifferentials of proper functions. In Section 4.5 we investigate integral functionals defined by convex or nonconvex normal integrands. We determine their duality and subdifferentiability properties. Finally in Section 4.6 we present some variational principles and their applications. Prominent in our discussion is the so-called Ekeland variational principle, which we show to be equivalent to some other powerful results of nonlinear analysis. We also use it to prove some surjectivity results for nonlinear maps, which extend corresponding results from the linear operator theory. This chapter illustrates in a rather convincing manner how methods and results of nonlinear analysis cover a wide area from a theoretical starting point (Banach space theory) to an applied end (optimization theory).
4.1 Differential Calculus in Banach Spaces
In this section we develop the basics of the differential calculus in Banach spaces. The geometric character of the operation of differentiation becomes very apparent in this general setting and leads naturally to generalizations such as the subdifferentials of convex and of locally Lipschitz functions. Moreover, the needs of the infinite dimensional variational problems which dominate the present landscape of nonlinear analysis require a differential calculus in Banach spaces, along the lines of the one existing in $\mathbb{R}^N$. This section shows that such a theory is possible and the analogy with the finite dimensional calculus is indeed remarkable. In what follows $X$ and $Y$ are two Banach spaces. Additional hypotheses will be introduced as needed.

DEFINITION 4.1.1 A map $f : X \to Y$ is said to be Gâteaux differentiable at $x \in X$, if and only if there exists $A(x) \in L(X; Y)$, such that
$$\lim_{\lambda \to 0} \frac{f(x + \lambda h) - f(x)}{\lambda} = A(x)h \qquad \forall\, h \in X.$$
The operator $A(x)$ is said to be the Gâteaux derivative of $f$ at $x$. It is usually denoted by $f'_G(x)$. We say that $f$ is Gâteaux differentiable, if it is Gâteaux differentiable at every $x \in X$.

REMARK 4.1.2 If we set
$$\varphi(\lambda) \overset{df}{=} f(x + \lambda h),$$
then
$$f'_G(x)h = \frac{d}{d\lambda} \varphi(\lambda)\Big|_{\lambda = 0} \qquad \forall\, x, h \in X.$$
So the Gâteaux derivative is essentially a one dimensional concept, since it considers the difference quotients along rays. Clearly then the Gâteaux derivative $f'_G(x)$, if it exists, is unique.
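Let us see some examples that illustrate the notion of Gâteaux derivative.

EXAMPLE 4.1.3 (a) If $f = A \in \mathcal{L}(X;Y)$, then $f'_G(x) = A$ for all $x \in X$.

(b) Let $X = \mathbb{R}^N$, $Y = \mathbb{R}^M$ and $f = (f_1, \ldots, f_M) : \mathbb{R}^N \longrightarrow \mathbb{R}^M$. Suppose that $f$ is Gâteaux differentiable at $x$, let $A = (a_{kj})$ be the $M \times N$ matrix representing $f'_G(x)$ and let $h = e_j = (0, \ldots, 0, 1, 0, \ldots, 0)$ be the $j$-th coordinate vector. Then
\[ \lim_{\lambda \to 0} \left\| \frac{f(x + \lambda h) - f(x) - \lambda A h}{\lambda} \right\|_Y = 0, \]
so
\[ \lim_{\lambda \to 0} \left| \frac{f_k(x + \lambda e_j) - f_k(x) - \lambda a_{kj}}{\lambda} \right| = 0 \qquad \forall\, k \in \{1, \ldots, M\},\ j \in \{1, \ldots, N\} \]
and so
\[ \frac{\partial f_k}{\partial x_j}(x) = a_{kj} \qquad \forall\, k \in \{1, \ldots, M\},\ j \in \{1, \ldots, N\}. \]
Therefore $f'_G(x)$ has the matrix representation
\[ f'_G(x) = \left( \frac{\partial f_k}{\partial x_j}(x) \right)_{k \in \{1,\ldots,M\},\ j \in \{1,\ldots,N\}}. \]
This matrix is called the Jacobian matrix of $f$ at $x \in \mathbb{R}^N$. If $Y = \mathbb{R}$, then
\[ f'_G(x) = \left( \frac{\partial f}{\partial x_j}(x) \right)_{j=1}^{N}, \]
known as the gradient of $f$ at $x \in \mathbb{R}^N$.

(c) Let $X = Y = C\big([0,1]\big)$ and consider the Hammerstein integral operator, defined by
\[ \widehat{f}(x)(t) \stackrel{df}{=} \int_0^1 k(t,s)\, f\big(s, x(s)\big)\, ds \qquad \forall\, t \in [0,1], \]
where
\[ k \in C\big([0,1];[0,1]\big) \quad \text{and} \quad \frac{\partial f}{\partial x} \in C\big([0,1] \times \mathbb{R}\big). \]
An easy calculation reveals that $\widehat{f}$ is Gâteaux differentiable and
\[ \big( \widehat{f}\,'_G(x)h \big)(t) = \int_0^1 k(t,s)\, \frac{\partial f}{\partial x}\big(s, x(s)\big)\, h(s)\, ds \qquad \forall\, h \in C\big([0,1]\big). \]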
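Part (b) of the example can be checked numerically. The following is a minimal sketch, assuming a hypothetical smooth map $f : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$ chosen only for illustration (it does not come from the text): the difference quotients of Definition 4.1.1 along the coordinate vectors $e_j$ should approximate the columns of the Jacobian matrix.

```python
import math

def f(x):
    """Hypothetical smooth map f: R^2 -> R^2, chosen only for illustration."""
    x1, x2 = x
    return [x1 * x1 * x2, math.sin(x1) + x2 ** 3]

def jacobian_exact(x):
    """Hand-computed partial derivatives of the map above."""
    x1, x2 = x
    return [[2.0 * x1 * x2, x1 * x1],
            [math.cos(x1), 3.0 * x2 * x2]]

def gateaux_quotient(func, x, h, lam):
    """Difference quotient (f(x + lam*h) - f(x)) / lam from Definition 4.1.1."""
    shifted = func([xi + lam * hi for xi, hi in zip(x, h)])
    base = func(x)
    return [(s - b) / lam for s, b in zip(shifted, base)]

def jacobian_numeric(func, x, lam=1e-6):
    """Column j is the difference quotient along the coordinate vector e_j."""
    n = len(x)
    cols = []
    for j in range(n):
        e_j = [1.0 if i == j else 0.0 for i in range(n)]
        cols.append(gateaux_quotient(func, x, e_j, lam))
    m = len(cols[0])
    # Transpose the list of columns into an m x n matrix (rows indexed by k).
    return [[cols[j][k] for j in range(n)] for k in range(m)]

x = [0.7, -1.2]
J_num = jacobian_numeric(f, x)
J_ex = jacobian_exact(x)
max_err = max(abs(J_num[k][j] - J_ex[k][j]) for k in range(2) for j in range(2))
print(max_err)  # small, of the order of the step lam
```

The step `lam` is a fixed small number here; the definition itself concerns the limit as $\lambda \to 0$, which a finite computation can only approximate.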
A stronger differentiability notion is given in the next definition.

DEFINITION 4.1.4 A map $f : X \longrightarrow Y$ is said to be Fréchet differentiable at $x \in X$ if there exists $A(x) \in \mathcal{L}(X;Y)$, such that
\[ f(x + h) - f(x) = A(x)h + u(x,h), \]
where
\[ \frac{\|u(x,h)\|_Y}{\|h\|_X} \longrightarrow 0 \quad \text{as } \|h\|_X \to 0. \]
The operator $A(x)$ is said to be the Fréchet derivative of $f$ at $x \in X$ and it is usually denoted by $f'_F(x)$. We say that $f$ is Fréchet differentiable, if it is Fréchet differentiable at every $x \in X$.

REMARK 4.1.5 It is easy to see that the Fréchet derivative $f'_F(x)$, if it exists, is unique. It is clear that if $f$ is Fréchet differentiable at $x$, it is also Gâteaux differentiable at $x$. The converse is not true as the following example shows.

EXAMPLE 4.1.6 Let $X = \mathbb{R}^2$, $Y = \mathbb{R}$ and consider the function $f : \mathbb{R}^2 \longrightarrow \mathbb{R}$, defined by
\[ f(x) \stackrel{df}{=} \begin{cases} \dfrac{x_1^3 x_2}{x_1^4 + x_2^2} & \text{if } x = (x_1, x_2) \neq (0,0), \\[2mm] 0 & \text{if } x = (x_1, x_2) = (0,0). \end{cases} \]
The function $f$ is Gâteaux differentiable at $x = 0$ and $f'_G(0) = 0$. However, it is not Fréchet differentiable at $x = 0$, since on the curve $h_1^2 = h_2$, we have
\[ \frac{|f(h)|}{\|h\|_{\mathbb{R}^2}} = \frac{|h_1^3 h_2|}{h_1^4 + h_2^2} \cdot \frac{1}{\sqrt{h_1^2 + h_2^2}} = \frac{1}{2} \cdot \frac{|h_1|}{\sqrt{h_1^2 + h_2^2}}, \]
so along this curve
\[ \lim_{\|h\|_{\mathbb{R}^2} \to 0} \frac{|f(h)|}{\|h\|_{\mathbb{R}^2}} = \frac{1}{2} \neq 0. \]
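The two behaviours in Example 4.1.6 can be observed numerically. The sketch below is only an illustration (the direction $(1,2)$ and the sample step sizes are arbitrary choices): along a fixed ray the difference quotient at the origin tends to $0$, while on the parabola $h_2 = h_1^2$ the ratio $|f(h)| / \|h\|$ stays close to $1/2$.

```python
import math

def f(x1, x2):
    """The function of Example 4.1.6."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1 ** 3 * x2 / (x1 ** 4 + x2 ** 2)

def ray_quotient(h1, h2, lam):
    """Gateaux difference quotient at the origin along the direction (h1, h2)."""
    return (f(lam * h1, lam * h2) - f(0.0, 0.0)) / lam

def parabola_ratio(t):
    """|f(h)| / ||h|| evaluated at h = (t, t^2), i.e. on the curve h1^2 = h2."""
    return abs(f(t, t * t)) / math.hypot(t, t * t)

for lam in (1e-1, 1e-3, 1e-5):
    # First column shrinks to 0, second column approaches 1/2.
    print(lam, ray_quotient(1.0, 2.0, lam), parabola_ratio(lam))
```

A finite sample of rays does not prove Gâteaux differentiability, of course; it only makes the contrast between the two limits visible.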
The next proposition establishes the exact relation between Gâteaux and Fréchet derivatives.
PROPOSITION 4.1.7 If $f : X \longrightarrow Y$ is a Gâteaux differentiable function at all points of some neighbourhood of $x \in X$ and $f'_G(\cdot)$ is continuous at $x \in X$, then $f$ is also Fréchet differentiable at $x \in X$.

PROOF Let us set
\[ u(x,h) \stackrel{df}{=} f(x + h) - f(x) - f'_G(x)h. \]
Then for every $y^* \in Y^*$, we have
\[ \langle y^*, u(x,h) \rangle_Y = \langle y^*, f(x + h) - f(x) \rangle_Y - \langle y^*, f'_G(x)h \rangle_Y. \]
By virtue of the mean value theorem, we can find $\lambda \in (0,1)$ (depending on $y^*$), such that
\[ \langle y^*, u(x,h) \rangle_Y = \langle y^*, f'_G(x + \lambda h)h - f'_G(x)h \rangle_Y. \]
We can find $y^* \in Y^*$ with $\|y^*\|_{Y^*} = 1$, such that
\[ \|u(x,h)\|_Y = \big| \langle y^*, u(x,h) \rangle_Y \big|. \]
Then we have
\[ \|u(x,h)\|_Y \leq \|f'_G(x + \lambda h) - f'_G(x)\|_{\mathcal{L}}\, \|h\|_X, \]
so
\[ \frac{\|u(x,h)\|_Y}{\|h\|_X} \leq \|f'_G(x + \lambda h) - f'_G(x)\|_{\mathcal{L}}, \]
and so
\[ \frac{\|u(x,h)\|_Y}{\|h\|_X} \longrightarrow 0 \quad \text{as } \|h\|_X \to 0 \]
(since by hypothesis $f'_G(\cdot)$ is continuous at $x \in X$).
Before proceeding further, let us give some examples of Fréchet differentiable maps.

EXAMPLE 4.1.8 (a) Let $X = H$ be a Hilbert space. Let $A \in \mathcal{L}(H)$ and let $f : H \longrightarrow \mathbb{R}$ be defined by
\[ f(x) \stackrel{df}{=} (Ax, x)_H \qquad \forall\, x \in H. \]
Then
\[ f(x + h) - f(x) - \big( Ax + A^* x, h \big)_H = o\big( \|h\|_H \big) \quad \text{as } h \to 0 \]
and so
\[ f'_F(x) = A + A^*. \]
If $A = id_H$, then $f(x) = \|x\|_H^2$ and we have that
\[ f'_F(x) = 2x \qquad \forall\, x \in H. \]
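For $H = \mathbb{R}^2$ the claim of part (a) can be tested directly. This is a minimal sketch under the assumption that $A$ is the hypothetical $2 \times 2$ matrix below (any matrix would do) and that $A^*$ is its transpose; the remainder divided by $\|h\|$ should shrink proportionally to $\|h\|$, since it equals $|(Ah,h)| / \|h\|$.

```python
import math

A = [[1.0, 2.0],
     [-3.0, 0.5]]  # hypothetical non-symmetric matrix, for illustration only

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def transpose(M):
    return [list(row) for row in zip(*M)]

def f(x):
    """Quadratic form f(x) = (Ax, x)."""
    return dot(matvec(A, x), x)

def gradient(x):
    """Candidate Frechet derivative (A + A^T) x, identified with a vector of R^2."""
    Ax = matvec(A, x)
    Atx = matvec(transpose(A), x)
    return [a + b for a, b in zip(Ax, Atx)]

def remainder_ratio(x, h):
    """|f(x + h) - f(x) - ((A + A^T)x, h)| / ||h||."""
    xh = [xi + hi for xi, hi in zip(x, h)]
    rem = f(xh) - f(x) - dot(gradient(x), h)
    return abs(rem) / math.sqrt(dot(h, h))

x = [0.3, -1.1]
for scale in (1e-1, 1e-2, 1e-3):
    h = [scale * 0.6, scale * 0.8]   # direction (0.6, 0.8) has norm 1
    print(scale, remainder_ratio(x, h))  # decreases linearly with scale
```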
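(b) Let $\Omega \subseteq \mathbb{R}^N$ be an open set and let $f : \Omega \times \mathbb{R} \longrightarrow \mathbb{R}$ be a Carathéodory function. Suppose that
\[ |f(z,x)| \leq a(z) + c|x|^{p-1} \quad \text{for a.a. } z \in \Omega \text{ and all } x \in \mathbb{R}, \]
with $p \in [1, +\infty)$, $a \in L^{p'}(\Omega)_+$, $\frac{1}{p} + \frac{1}{p'} = 1$ and $c > 0$. Let $F$ be the potential function corresponding to $f$, i.e.,
\[ F(z,x) \stackrel{df}{=} \int_0^x f(z,r)\, dr. \]
Using the mean value theorem, we can see that
\[ |F(z,x)| \leq \widehat{a}(z) + \widehat{c}\,|x|^p \quad \text{for a.a. } z \in \Omega \text{ and all } x \in \mathbb{R}, \]
with $\widehat{a} \in L^1(\Omega)_+$ and $\widehat{c} > 0$. Then consider the continuous functional $\varphi : L^p(\Omega) \longrightarrow \mathbb{R}$, defined by
\[ \varphi(u) \stackrel{df}{=} \int_\Omega F\big(z, u(z)\big)\, dz. \]
We claim that $\varphi$ is continuously Fréchet differentiable and
\[ \varphi'(u) = N_f(u) \qquad \forall\, u \in L^p(\Omega). \]
To this end let
\[ \xi(h) = \int_\Omega F\big(z, (u+h)(z)\big)\, dz - \int_\Omega F\big(z, u(z)\big)\, dz - \int_\Omega f\big(z, u(z)\big) h(z)\, dz. \]
Note that
\[ F\big(z, (u+h)(z)\big) - F\big(z, u(z)\big) = \int_0^1 \frac{d}{dt} F\big(z, (u+th)(z)\big)\, dt = \int_0^1 f\big(z, (u+th)(z)\big) h(z)\, dt. \]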
Therefore, using Fubini's theorem and Hölder's inequality (see, e.g., Theorem A.2.27), we have
\[ |\xi(h)| \leq \int_\Omega \int_0^1 \big| f\big(z, (u+th)(z)\big) - f\big(z, u(z)\big) \big|\, |h(z)|\, dt\, dz \]
\[ \leq \int_0^1 \int_\Omega \big| f\big(z, (u+th)(z)\big) - f\big(z, u(z)\big) \big|\, |h(z)|\, dz\, dt \]
\[ \leq \|h\|_p \int_0^1 \big\| N_f(u + th) - N_f(u) \big\|_{p'}\, dt. \]
Because $N_f$ is continuous (see Theorem 3.4.4), by the Lebesgue dominated convergence theorem (see Theorem A.2.2), we can conclude that
\[ \frac{\xi(h)}{\|h\|_p} \longrightarrow 0 \quad \text{as } \|h\|_p \to 0. \]
This proves that
\[ \varphi'_F(u) = N_f(u) \]
and so $\varphi \in C^1\big(L^p(\Omega)\big)$. This example is important in the variational methods for the study of boundary value problems.
PROPOSITION 4.1.9 If $f : X \longrightarrow Y$ is a function which is Fréchet differentiable at $x \in X$, then $f$ is continuous at $x \in X$.

PROOF Since $f$ is Fréchet differentiable at $x$, we can find $\delta > 0$, such that
\[ \|f(x + h) - f(x) - f'_F(x)h\|_Y \leq \|h\|_X \qquad \forall\, \|h\|_X \leq \delta, \]
so
\[ \|f(x + h) - f(x)\|_Y \leq \big( 1 + \|f'_F(x)\|_{\mathcal{L}} \big) \|h\|_X \qquad \forall\, \|h\|_X \leq \delta. \]
This proves the continuity of $f$ at $x \in X$.

REMARK 4.1.10 The above proposition is no longer true if Fréchet differentiability is replaced by Gâteaux differentiability. To see this consider the function $f : \mathbb{R}^2 \longrightarrow \mathbb{R}$, defined by
\[ f(x_1, x_2) \stackrel{df}{=} \begin{cases} \dfrac{x_1^4 x_2}{x_1^6 + x_2^3} & \text{if } (x_1, x_2) \neq 0, \\[2mm] 0 & \text{if } (x_1, x_2) = 0, \end{cases} \qquad \forall\, x = (x_1, x_2) \in \mathbb{R}^2. \]
Then $f$ is Gâteaux differentiable at the origin with $f'_G(0,0) = 0$, but $f$ is not continuous at the origin.
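In the case of Gâteaux differentiable maps we can conclude that they are continuous along rays.

PROPOSITION 4.1.11 If $f : X \longrightarrow Y$ is a function, which is Gâteaux differentiable at $x \in X$, then
\[ \lim_{\lambda \to 0} \|f(x + \lambda h) - f(x)\|_Y = 0 \qquad \forall\, h \in X. \]

PROOF Let
\[ \varphi(\lambda) \stackrel{df}{=} f(x + \lambda h) \qquad \forall\, \lambda \in \mathbb{R}. \]
Then $\varphi$ is differentiable at $0$, hence continuous there. So $\varphi(\lambda) \longrightarrow \varphi(0)$ as $\lambda \to 0$, which implies that
\[ f(x + \lambda h) \longrightarrow f(x) \quad \text{in } Y, \text{ as } \lambda \to 0. \]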
We have a chain rule for these derivatives.

PROPOSITION 4.1.12 If $Z$ is a Banach space too, $f : X \longrightarrow Y$ is a function which is Gâteaux differentiable at $x \in X$ and $g : Y \longrightarrow Z$ is a function which is Fréchet differentiable at $f(x)$, then the function $k \stackrel{df}{=} g \circ f : X \longrightarrow Z$ is Gâteaux differentiable at $x \in X$ and
\[ k'_G(x) = g'_F\big(f(x)\big) f'_G(x). \]
Moreover, if $f$ is Fréchet differentiable at $x \in X$, then $k$ is Fréchet differentiable at $x$.

PROOF For $\lambda \neq 0$, we have
\[ \frac{1}{|\lambda|} \big\| k(x + \lambda h) - k(x) - \lambda g'_F\big(f(x)\big) f'_G(x)h \big\|_Z \]
\[ \leq \frac{1}{|\lambda|} \big\| g\big(f(x + \lambda h)\big) - g\big(f(x)\big) - g'_F\big(f(x)\big)\big( f(x + \lambda h) - f(x) \big) \big\|_Z + \frac{1}{|\lambda|} \big\| g'_F\big(f(x)\big)\big( f(x + \lambda h) - f(x) - \lambda f'_G(x)h \big) \big\|_Z. \qquad (4.1) \]
Since $f$ is Gâteaux differentiable at $x \in X$, the second summand in the right hand side of (4.1) goes to zero as $\lambda \to 0$. Also suppose that $f(x + \lambda h) \neq f(x)$. Then since
\[ f(x + \lambda h) \longrightarrow f(x) \quad \text{in } Y \text{ as } \lambda \to 0 \]
(see Proposition 4.1.11) and because $g$ is Fréchet differentiable at $f(x)$, we have that the first summand in the right hand side of (4.1) goes to zero as $\lambda \to 0$. This proves that
\[ k'_G(x) = g'_F\big(f(x)\big) f'_G(x). \]
The proof is similar if $f$ is Fréchet differentiable at $x \in X$.

COROLLARY 4.1.13 If $f : X \longrightarrow Y$ is a function, which is Gâteaux differentiable at every point of the interval
\[ [x, x + h] \stackrel{df}{=} \big\{ u \in X : u = \lambda x + (1 - \lambda)(x + h),\ \lambda \in [0,1] \big\}, \]
then
\[ f(x + h) - f(x) = \int_0^1 f'_G(x + th)h\, dt. \]
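The integral formula of Corollary 4.1.13 can be verified on a simple finite dimensional map. The sketch below assumes the illustrative map $f(x_1,x_2) = (x_1^3, x_2^2)$ on $\mathbb{R}^2$ and approximates the integral with the composite Simpson rule; the function names and the sample points are choices made for this illustration only.

```python
def f(x):
    """Illustrative map f(x1, x2) = (x1^3, x2^2) from R^2 to R^2."""
    return [x[0] ** 3, x[1] ** 2]

def derivative_apply(u, h):
    """Action of the derivative at u on h: the Jacobian diag(3*u1^2, 2*u2) times h."""
    return [3.0 * u[0] ** 2 * h[0], 2.0 * u[1] * h[1]]

def integral_of_derivative(x, h, n=100):
    """Composite Simpson approximation of the integral over [0, 1] of f'(x + t h) h dt."""
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    step = 1.0 / n
    total = [0.0, 0.0]
    for i in range(n + 1):
        t = i * step
        weight = 1.0 if i in (0, n) else (4.0 if i % 2 else 2.0)
        u = [x[0] + t * h[0], x[1] + t * h[1]]
        val = derivative_apply(u, h)
        total = [acc + weight * v for acc, v in zip(total, val)]
    return [step / 3.0 * s for s in total]

x = [0.5, -1.0]
h = [1.5, 2.0]
lhs = [a - b for a, b in zip(f([x[0] + h[0], x[1] + h[1]]), f(x))]
rhs = integral_of_derivative(x, h)
print(lhs, rhs)  # the two vectors agree up to rounding
```

Simpson's rule integrates polynomials of degree at most three exactly, so for this particular map the agreement is limited only by floating point rounding.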
In the next proposition we show that compactness of a map is passed to its Fréchet derivative.

PROPOSITION 4.1.14 If $f : X \longrightarrow Y$ is a function which is compact and Fréchet differentiable at $x \in X$, then $f'_F(x) \in \mathcal{L}_c(X;Y)$.

PROOF Suppose that the proposition is not true. Then we can find $\varepsilon > 0$ and $\{x_n\}_{n \geq 1} \subseteq X$ with $\|x_n\|_X \leq 1$ for all $n \geq 1$, such that
\[ \|f'_F(x)x_n - f'_F(x)x_m\|_Y \geq 3\varepsilon \qquad \forall\, n \neq m. \]
Because $f$ is Fréchet differentiable at $x$, we have
\[ f(x + h) - f(x) = f'_F(x)h + u(x,h) \qquad (4.2) \]
and we can find $\delta > 0$, such that
\[ \|u(x,h)\|_Y \leq \varepsilon \|h\|_X \qquad \forall\, \|h\|_X \leq \delta. \]
Therefore, from (4.2), we have
\[ \|f(x + \delta x_n) - f(x + \delta x_m)\|_Y \geq \delta \|f'_F(x)(x_n - x_m)\|_Y - \|u(x, \delta x_n)\|_Y - \|u(x, \delta x_m)\|_Y \geq 3\varepsilon\delta - \delta\varepsilon - \delta\varepsilon = \delta\varepsilon, \]
a contradiction to the fact that $f$ is compact.
For the next proposition, we need to introduce the following definition.

DEFINITION 4.1.15 (a) A function $f : [a,b] \longrightarrow X$ is said to be right differentiable at $t \in [a,b)$, if the limit
\[ \lim_{h \to 0^+} \frac{1}{h} \big[ f(t + h) - f(t) \big] \]
exists. We denote this limit by $f'_+(t)$ and we call it the right derivative of $f$ at $t$. Evidently $f'_+(t) \in X$.

(b) Similarly a function $f : [a,b] \longrightarrow X$ is said to be left differentiable at $t \in (a,b]$, if the limit
\[ \lim_{h \to 0^-} \frac{1}{h} \big[ f(t + h) - f(t) \big] \]
exists. We denote this limit by $f'_-(t)$ and we call it the left derivative of $f$ at $t$. Evidently $f'_-(t) \in X$.

REMARK 4.1.16 A function $f : [a,b] \longrightarrow X$ is Fréchet differentiable at $t \in (a,b)$ if and only if $f'_-(t) = f'_+(t)$.

PROPOSITION 4.1.17 If $f : [a,b] \longrightarrow X$, $g : [a,b] \longrightarrow \mathbb{R}$ are continuous functions, $f'_+(t)$, $g'_+(t)$ exist at all $t \in (a,b)$ and
\[ \|f'_+(t)\|_X \leq g'_+(t) \qquad \forall\, t \in (a,b), \]
then
\[ \|f(b) - f(a)\|_X \leq g(b) - g(a). \]
PROOF Let $\varepsilon > 0$ be given and consider the set
\[ U \stackrel{df}{=} \big\{ t \in [a,b] : \|f(t) - f(a)\|_X > g(t) - g(a) + \varepsilon(t - a) + \varepsilon \big\}. \]
Clearly $U$ is an open set. Suppose that $U$ is nonempty and let $c \stackrel{df}{=} \inf U$. We can say the following:
(a) $c > a$: this follows from the continuity of $f$ and $g$;
(b) $c \notin U$: since $U$ is open;
(c) $c < b$: otherwise $U = \{b\}$, which is not open.
So we have that $a < c < b$. By hypothesis we have
\[ \|f'_+(c)\|_X \leq g'_+(c). \]
Let $h > 0$ be such that if $t \in [c, c + h]$, we have
\[ \|f'_+(c)\|_X \geq \frac{\|f(t) - f(c)\|_X}{t - c} - \frac{\varepsilon}{2} \quad \text{and} \quad g'_+(c) \leq \frac{g(t) - g(c)}{t - c} + \frac{\varepsilon}{2}. \]
It follows that
\[ \|f(t) - f(c)\|_X \leq g(t) - g(c) + \varepsilon(t - c) \qquad \forall\, t \in [c, c + h]. \qquad (4.3) \]
Also because $c \notin U$, we have
\[ \|f(c) - f(a)\|_X \leq g(c) - g(a) + \varepsilon(c - a) + \varepsilon. \qquad (4.4) \]
From (4.3) and (4.4), we obtain
\[ \|f(t) - f(a)\|_X \leq g(t) - g(a) + \varepsilon(t - a) + \varepsilon \qquad \forall\, t \in [c, c + h]. \]
We infer that $\inf U \geq c + h$, a contradiction. So $U = \emptyset$ and we obtain
\[ \|f(t) - f(a)\|_X \leq g(t) - g(a) + \varepsilon(t - a) + \varepsilon \qquad \forall\, t \in [a,b]. \]
Let $t = b$ and $\varepsilon \searrow 0$ to obtain the desired inequality.

REMARK 4.1.18 We have an analogous result if we replace the right derivatives by the left ones. Moreover, we can weaken the hypotheses of Proposition 4.1.17 and assume that there is a countable set $D \subseteq [a,b]$, such that $f'_+(t)$, $g'_+(t)$ exist for all $t \in [a,b] \setminus D$ and
\[ \|f'_+(t)\|_X \leq g'_+(t) \qquad \forall\, t \in [a,b] \setminus D. \]

COROLLARY 4.1.19 If $f : [a,b] \longrightarrow X$ is continuous, right differentiable at every $t \in (a,b)$ and
\[ \|f'_+(t)\|_X \leq k \qquad \forall\, t \in (a,b), \]
then
\[ \|f(t) - f(s)\|_X \leq k|t - s| \qquad \forall\, t, s \in [a,b]. \]

COROLLARY 4.1.20 If $g : [a,b] \longrightarrow \mathbb{R}$ is a continuous function, which is right differentiable at every $t \in (a,b)$, then $g$ is increasing if and only if
\[ g'_+(t) \geq 0 \qquad \forall\, t \in (a,b). \]
We have the following mean value theorem.

PROPOSITION 4.1.21 (Mean Value Theorem) If $f : X \longrightarrow \mathbb{R}$ is a Gâteaux differentiable function, then we can find $\lambda_0 \in (0,1)$, such that
\[ f(x + h) - f(x) = \langle f'_G(x + \lambda_0 h), h \rangle_X. \]

PROOF Let
\[ \varphi(\lambda) \stackrel{df}{=} f(x + \lambda h). \]
Recall that
\[ \langle f'_G(x + \lambda_0 h), h \rangle_X = \varphi'(\lambda_0) \]
(see Remark 4.1.2). Using the mean value theorem for scalar functions, we can find $\lambda_0 \in (0,1)$, such that $\varphi(1) - \varphi(0) = \varphi'(\lambda_0)$, so
\[ f(x + h) - f(x) = \langle f'_G(x + \lambda_0 h), h \rangle_X. \]
In general for vector valued functions the mean value theorem fails as the next example illustrates.

EXAMPLE 4.1.22 Let $f : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$ be defined by
\[ f(x) \stackrel{df}{=} (x_1^3, x_2^2) \qquad \forall\, x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \in \mathbb{R}^2. \]
We have
\[ f'(x) = \begin{bmatrix} 3x_1^2 & 0 \\ 0 & 2x_2 \end{bmatrix}. \]
If $x = \binom{0}{0}$ and $y = \binom{1}{1}$, then it is clear that there is no $z \in [x,y] = \big\{ \lambda x + (1 - \lambda)y : \lambda \in [0,1] \big\}$, such that
\[ f(y) - f(x) = f'(z)(y - x). \]
For vector valued functions the mean value theorem takes an inequality form.

PROPOSITION 4.1.23 (Mean Value Theorem) If $f : X \longrightarrow Y$ is a Gâteaux differentiable function and $x, h \in X$, $y^* \in Y^*$, then we can find $\lambda_0 \in (0,1)$, such that
\[ \langle y^*, f(x + h) - f(x) \rangle_Y = \langle y^*, f'_G(x + \lambda_0 h)h \rangle_Y \]
and
\[ \|f(x + h) - f(x)\|_Y \leq \|f'_G(x + \lambda_0 h)\|_{\mathcal{L}}\, \|h\|_X. \]
PROOF Let
\[ g(x) \stackrel{df}{=} \langle y^*, f(x) \rangle_Y. \]
Then
\[ \langle g'_G(x), h \rangle_X = \langle y^*, f'_G(x)h \rangle_Y. \]
From Proposition 4.1.21, we know that we can find $\lambda_0 \in (0,1)$, such that
\[ g(x + h) - g(x) = \langle g'_G(x + \lambda_0 h), h \rangle_X, \]
hence
\[ \langle y^*, f(x + h) - f(x) \rangle_Y = \langle y^*, f'_G(x + \lambda_0 h)h \rangle_Y. \]
Since $y^* \in Y^*$ is arbitrary, we choose $\|y^*\|_{Y^*} = 1$, such that
\[ \langle y^*, f(x + h) - f(x) \rangle_Y = \|f(x + h) - f(x)\|_Y. \]
So we can find $\lambda_0 \in (0,1)$, such that
\[ \|f(x + h) - f(x)\|_Y = \langle y^*, f'_G(x + \lambda_0 h)h \rangle_Y \leq \|f'_G(x + \lambda_0 h)\|_{\mathcal{L}}\, \|h\|_X. \]
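Example 4.1.22 and the inequality of Proposition 4.1.23 can be checked numerically for $f(x) = (x_1^3, x_2^2)$, $x = (0,0)$ and $y = (1,1)$. The sketch below is an illustration under stated assumptions: the Euclidean norm is used on $\mathbb{R}^2$, the segment is sampled on a uniform grid, and the functional $y^* = (1,1)/\sqrt{2}$ is the norming functional of $f(y) - f(x) = (1,1)$, for which $\lambda_0$ solves $3\lambda^2 + 2\lambda = 2$.

```python
import math

def f(x):
    return [x[0] ** 3, x[1] ** 2]

def jacobian_diag(z):
    """Diagonal entries of f'(z) = diag(3*z1^2, 2*z2)."""
    return [3.0 * z[0] ** 2, 2.0 * z[1]]

x = [0.0, 0.0]
y = [1.0, 1.0]
h = [y[0] - x[0], y[1] - x[1]]
increment = [a - b for a, b in zip(f(y), f(x))]   # f(y) - f(x) = (1, 1)

def mismatch(t):
    """Euclidean distance between f(y) - f(x) and f'(z)(y - x) at z = x + t*h."""
    z = [x[0] + t * h[0], x[1] + t * h[1]]
    d = jacobian_diag(z)
    return math.hypot(increment[0] - d[0] * h[0], increment[1] - d[1] * h[1])

# Equality in the mean value formula fails at every grid point of the segment.
min_mismatch = min(mismatch(i / 1000.0) for i in range(1001))

# Scalarized version: with y* = (1, 1)/sqrt(2), solve 3*t^2 + 2*t = 2 for t in (0, 1).
lam0 = (-2.0 + math.sqrt(28.0)) / 6.0
z0 = [x[0] + lam0 * h[0], x[1] + lam0 * h[1]]
op_norm = max(abs(v) for v in jacobian_diag(z0))   # operator norm of a diagonal matrix
lhs = math.hypot(*increment)
rhs = op_norm * math.hypot(*h)
print(min_mismatch, lam0, lhs, rhs)
```

The first component would require $3t^2 = 1$ and the second $2t = 1$, which cannot hold for the same $t$; this is why `min_mismatch` stays bounded away from zero while `lhs` does not exceed `rhs`.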
COROLLARY 4.1.24 If $U \subseteq X$ is connected and open, $f : U \longrightarrow Y$ is a Gâteaux differentiable function and
\[ f'_G(x) = 0 \qquad \forall\, x \in U, \]
then $f$ is constant on $U$.

Next we state and prove two major results of differential calculus. These are the implicit function theorem and the inverse function theorem. The implicit function theorem deals with the following situation. Let $f(x,y)$ be given and suppose that $f(x_0, y_0) = c$. Can we find a function $x \longmapsto y = g(x)$, which at least locally satisfies
\[ f\big(x, g(x)\big) = c\ ? \]
We want $g$ to be differentiable provided $f$ is differentiable. Moreover, in the neighbourhood where $f\big(x, g(x)\big) = c$ is valid, $g(x)$ should be the unique solution. To better motivate this consider the following simple example.
EXAMPLE 4.1.25 Let $f : \mathbb{R}^2 \longrightarrow \mathbb{R}$ be defined by
\[ f(x,y) \stackrel{df}{=} x^2 + y^2 - 1. \]
We consider the $0$-level set of $f$, namely the set of those $x, y \in \mathbb{R}$ that satisfy $f(x,y) = 0$, which in our case is of course the unit circle. We look for a function $g(x)$, such that $f\big(x, g(x)\big) = 0$ for all $x$ in the domain of $g$. Evidently
\[ g(x) = \pm\sqrt{1 - x^2} \]
and so $g$ need not be unique unless we restrict its domain. Also near $x_0 = \pm 1$, $g$ could be either square root, so it is not uniquely determined. Note that at $x_0 = \pm 1$, $g$ is not differentiable and
\[ \frac{\partial f}{\partial y} = 0. \]
So to produce a unique differentiable function $g$, such that $f\big(x, g(x)\big) = 0$, we need to look locally and impose some condition like
\[ \frac{\partial f}{\partial y} \neq 0. \]

The proof of the implicit function theorem uses the Banach fixed point theorem, which we state here in the form needed and postpone the proof of the general version until Section 7.1.

PROPOSITION 4.1.26 (Banach Fixed Point Theorem) If $V$ is a Banach space, $C$ is a closed subset of $V$ and $S : C \longrightarrow C$ satisfies
\[ \|S(v_1) - S(v_2)\|_V \leq k \|v_1 - v_2\|_V \qquad \forall\, v_1, v_2 \in C, \]
for some $k \in [0,1)$, then there exists unique $v \in C$, such that $v = S(v)$. Moreover, if we have a parametrized family $\{S(x)\}_{x \in U}$ (with $U$ being an open subset of a Banach space $W$) satisfying the above contraction condition with $k \in [0,1)$ independent of $x$, then the unique solution $v = v(x)$ of $v = S(x)v$ depends continuously on $x$.
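As a rough illustration of how Proposition 4.1.26 can be used on the unit circle of Example 4.1.25, consider the parametrized family $S(x)y = y - \tfrac{1}{2}(x^2 + y^2 - 1)$ around the base point $(x_0, y_0) = (0,1)$, where $\frac{\partial f}{\partial y}(0,1) = 2$. The choice of this family and of the set $C = [\tfrac12, \tfrac32]$ is an assumption made for this sketch: on $C$ the derivative in $y$ is $1 - y$, so $k = \tfrac12$, and $S(x)$ maps $C$ into itself when $|x| \leq \tfrac12$.

```python
import math

def circle(x, y):
    """f(x, y) = x^2 + y^2 - 1 from Example 4.1.25."""
    return x * x + y * y - 1.0

def solve_for_y(x, y0=1.0, tol=1e-12, max_iter=200):
    """Iterate y -> y - f(x, y) / D2f(0, 1), with D2f(0, 1) = 2, starting from y0."""
    d2f = 2.0          # partial derivative of f in y at the base point (0, 1)
    y = y0
    for _ in range(max_iter):
        y_next = y - circle(x, y) / d2f
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    raise RuntimeError("iteration did not converge")

for x in (0.0, 0.1, 0.3, 0.5):
    # The fixed point should match the upper branch g(x) = sqrt(1 - x^2).
    print(x, solve_for_y(x), math.sqrt(1.0 - x * x))
```

The iteration selects the branch $g(x) = +\sqrt{1 - x^2}$ because it starts at $y_0 = 1$ and never leaves $C$; starting near $y_0 = -1$ with the derivative $-2$ would select the other branch.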
Using this proposition we can prove the implicit function theorem. In what follows, for a function $f(x,y)$, by $D_1 f(x,y)$ (respectively $D_2 f(x,y)$) we denote the partial derivative of $f$ with respect to $x$ (respectively $y$).

THEOREM 4.1.27 (Implicit Function Theorem) If $X, Y, Z$ are three Banach spaces, $U \subseteq X \times Y$ is an open set, $(x_0, y_0) \in U$, $f : U \longrightarrow Z$ is a continuously differentiable function, $f(x_0, y_0) = 0$ and $D_2 f(x_0, y_0) \in \mathcal{L}(Y;Z)$ is invertible with a continuous inverse, i.e., $D_2 f(x_0, y_0)$ is an isomorphism, then there exist neighbourhoods $U_1$ of $x_0$ and $U_2$ of $y_0$, such that $U_1 \times U_2 \subseteq U$, and a unique continuously differentiable function $g : U_1 \longrightarrow U_2$, such that
\[ f\big(x, g(x)\big) = 0 \qquad \forall\, x \in U_1 \]
and
\[ Dg(x) = -\big( D_2 f\big(x, g(x)\big) \big)^{-1} D_1 f\big(x, g(x)\big) \qquad \forall\, x \in U_1. \]

PROOF Let
\[ L_0 \stackrel{df}{=} D_2 f(x_0, y_0) \in \mathcal{L}(Y;Z). \]
By hypothesis $L_0$ is an isomorphism. Then the equation $f(x,y) = 0$ can be equivalently rewritten as
\[ y = y - L_0^{-1} f(x,y). \qquad (4.5) \]
The advantage of passing to (4.5) is that we can apply Proposition 4.1.26. Namely, for every $x$, we look for a fixed point of $y \longmapsto y - L_0^{-1} f(x,y)$ and to do this we employ Proposition 4.1.26. Let us set
\[ h(x,y) \stackrel{df}{=} y - L_0^{-1} f(x,y). \]
Since $L_0^{-1} \circ L_0 = id_Y$, we have
\[ h(x, y_1) - h(x, y_2) = L_0^{-1} \big[ L_0(y_1 - y_2) - \big( f(x, y_1) - f(x, y_2) \big) \big]. \]
Because $f$ is $C^1$ at $(x_0, y_0)$ and $L_0$ is an isomorphism, we can find $\delta_1 > 0$ and $\vartheta > 0$, such that if $\|x - x_0\|_X \leq \delta_1$, $\|y_1 - y_0\|_Y \leq \vartheta$, $\|y_2 - y_0\|_Y \leq \vartheta$, then
\[ \|h(x, y_1) - h(x, y_2)\|_Y \leq \frac{1}{2} \|y_1 - y_2\|_Y. \qquad (4.6) \]
Also because of the continuity of $h(\cdot, y_0)$, we can find $\delta_2 > 0$, such that if $\|x - x_0\|_X \leq \delta_2$, then
\[ \|h(x, y_0) - h(x_0, y_0)\|_Y < \frac{\vartheta}{2}. \qquad (4.7) \]
Therefore, from (4.6) and (4.7), if $\delta = \min\{\delta_1, \delta_2\}$ and $\|x - x_0\|_X \leq \delta$, $\|y - y_0\|_Y \leq \vartheta$, we have
\[ \|h(x,y) - y_0\|_Y = \|h(x,y) - h(x_0, y_0)\|_Y \leq \|h(x,y) - h(x, y_0)\|_Y + \|h(x, y_0) - h(x_0, y_0)\|_Y \leq \frac{1}{2}\|y - y_0\|_Y + \frac{\vartheta}{2} \leq \vartheta. \qquad (4.8) \]
So we see that $h(x, \cdot)$ maps $\overline{B}_\vartheta(y_0) = \{ y \in Y : \|y - y_0\|_Y \leq \vartheta \}$ into itself as well as $B_\vartheta(y_0) = \{ y \in Y : \|y - y_0\|_Y < \vartheta \}$ into itself (see (4.7) and (4.8)), for all $x \in \overline{B}_\delta(x_0) = \{ x \in X : \|x - x_0\|_X \leq \delta \}$. We can apply Proposition 4.1.26 to the parametric family $\{ y \longmapsto h(x,y) \}_{x \in \overline{B}_\delta(x_0)}$. So for every $x \in \overline{B}_\delta(x_0)$, we can find unique $y = y(x) \in \overline{B}_\vartheta(y_0)$, such that $h(x,y) = y$, hence $f(x,y) = 0$, and the function $g(x) = y(x)$ is continuous. Let
\[ U_1 \stackrel{df}{=} B_\delta(x_0) \quad \text{and} \quad U_2 \stackrel{df}{=} B_\vartheta(y_0). \]
Evidently by choosing $\delta > 0$ and $\vartheta > 0$ small enough we can have that $U_1 \times U_2 \subseteq U$. We claim that the function $g : B_\delta(x_0) \longrightarrow Y$ is continuously differentiable. To this end let $(x_1, y_1) \in U_1 \times U_2$, $y_1 = g(x_1)$ (recall that $h(x, \cdot)$ maps $U_2$ into itself). Exploiting the differentiability of $f$ at $(x_1, y_1)$, we have
\[ f(x,y) = A(x - x_1) + B(y - y_1) + u(x,y) \qquad \forall\, (x,y) \in U, \]
with
\[ A \stackrel{df}{=} D_1 f(x_1, y_1), \qquad B \stackrel{df}{=} D_2 f(x_1, y_1) \]
and
\[ \lim_{(x,y) \to (x_1, y_1)} \frac{\|u(x,y)\|_Z}{\|(x - x_1, y - y_1)\|_{X \times Y}} = 0. \]
Recall that
\[ f\big(x, g(x)\big) = 0 \qquad \forall\, x \in U_1. \]
Hence
\[ g(x) = -B^{-1}A(x - x_1) + y_1 - B^{-1} u\big(x, g(x)\big). \qquad (4.9) \]
We can find $r_1, r_2 > 0$, such that if $\|x - x_1\|_X \leq r_1$, $\|y - y_1\|_Y \leq r_2$, then
\[ \|u(x,y)\|_Z \leq \frac{1}{2\|B^{-1}\|_{\mathcal{L}}} \big( \|x - x_1\|_X + \|y - y_1\|_Y \big), \]
so
\[ \big\|u\big(x, g(x)\big)\big\|_Z \leq \frac{1}{2\|B^{-1}\|_{\mathcal{L}}} \big( \|x - x_1\|_X + \|g(x) - g(x_1)\|_Y \big). \qquad (4.10) \]
From (4.9) and (4.10), it follows that
\[ \|g(x) - g(x_1)\|_Y \leq \|B^{-1}A\|_{\mathcal{L}} \|x - x_1\|_X + \frac{1}{2}\|x - x_1\|_X + \frac{1}{2}\|g(x) - g(x_1)\|_Y, \]
so
\[ \|g(x) - g(x_1)\|_Y \leq \eta \|x - x_1\|_X, \qquad (4.11) \]
with $\eta \stackrel{df}{=} 2\|B^{-1}A\|_{\mathcal{L}} + 1$. Let
\[ v(x) \stackrel{df}{=} -B^{-1} u\big(x, g(x)\big). \]
From (4.9), we have
\[ g(x) - g(x_1) = -B^{-1}A(x - x_1) + v(x) \qquad (4.12) \]
and since
\[ \|v(x)\|_Y \leq \|B^{-1}\|_{\mathcal{L}}\, \big\|u\big(x, g(x)\big)\big\|_Z \]
and $g$ is continuous, we have
\[ \lim_{x \to x_1} \frac{\|v(x)\|_Y}{\|x - x_1\|_X} = 0. \qquad (4.13) \]
From (4.12) and (4.13), it follows that $g$ is Fréchet differentiable at $x_1 \in U_1$ and
\[ g'_F(x_1) = -B^{-1}A = -\big( D_2 f(x_1, y_1) \big)^{-1} D_1 f(x_1, y_1), \]
which means that $g$ is continuously differentiable.

DEFINITION 4.1.28 Let $Z$ be a Banach space and let $V \subseteq Z$ be a closed subspace of $Z$. We say that $V$ is complemented, if there is a closed subspace $W$ of $Z$, such that $Z = V \oplus W$ (i.e., $Z = V + W$ and $V \cap W = \{0\}$).

REMARK 4.1.29 The subspace $V \subseteq Z$ is complemented if and only if there exists a bounded linear projection of $Z$ onto $V$, i.e., there exists $P_V \in \mathcal{L}(Z)$, such that
\[ P_V|_V = id_V \quad \text{and} \quad P_V(Z) = V. \]
The closed subspace $c_0$ of $l^\infty$ is not complemented. If every closed subspace of a Banach space $Z$ is complemented, then $Z$ is isomorphic to a Hilbert space. Every subspace of the Banach space $Z$ which is either finite dimensional or has finite codimension is complemented. Finally, in a Hilbert space every closed subspace is complemented (take the orthogonal complement).
An interesting consequence of Theorem 4.1.27 is the following corollary.

COROLLARY 4.1.30 If $U \subseteq X$ is open, $f : U \longrightarrow Y$ is a continuously differentiable function, $f'_F(x_0)$ is surjective and $\ker f'_F(x_0)$ is complemented, then $f(U)$ contains a neighbourhood of $f(x_0)$.

PROOF Let $V \stackrel{df}{=} \ker f'_F(x_0)$. Then $X = V \oplus W$ (see Definition 4.1.28) and so for all $x \in X$ we have $x = v + w$, with $v \in V$ and $w \in W$. We write $f(x) = f(v,w)$. Evidently
\[ D_2 f(x_0) \in \mathcal{L}(W;Y) \quad \text{is an isomorphism.} \]
So we can apply Theorem 4.1.27 and conclude that $f(U)$ contains the neighbourhood $U_2$ of $y_0 = f(x_0)$ postulated by Theorem 4.1.27.

In the next example we use Corollary 4.1.30 to prove an existence theorem for differential equations.

EXAMPLE 4.1.31 Let
\[ X = C^1\big([0,1]\big) \quad \text{and} \quad Y = C\big([0,1]\big). \]
Let $f : X \longrightarrow Y$ be defined by
\[ f(x) \stackrel{df}{=} \frac{dx}{dt} + x^3. \]
It is easy to see that $f$ is a $C^1$-map and
\[ f'_F(0) = \frac{d}{dt} \in \mathcal{L}(X;Y). \]
From the fundamental theorem of calculus, we have that $f'_F(0)$ is surjective. Also $\ker f'_F(0)$ is the space of constant functions, hence it is complemented (see Remark 4.1.29). In fact the complement of $\ker f'_F(0)$ is given by
\[ W \stackrel{df}{=} \Big\{ x \in X : \int_0^1 x(t)\, dt = 0 \Big\}. \]
So we can apply Corollary 4.1.30 and conclude the following:
“We can find $\varepsilon > 0$, such that if $y \in Y$ with $\|y\|_Y < \varepsilon$, then the differential equation
\[ \frac{dx(t)}{dt} + x(t)^3 = y(t) \qquad \forall\, t \in [0,1] \]
has a solution $x \in X$.”

Using the implicit function theorem (see Theorem 4.1.27), we can prove the inverse function theorem.

THEOREM 4.1.32 (Inverse Function Theorem) If $U \subseteq Y$ is an open set, $f : U \longrightarrow X$ is a continuously differentiable function, $y_0 \in U$ and $f'_F(y_0) \in \mathcal{L}(Y;X)$ is an isomorphism, then there exists a neighbourhood $U'$ of $y_0$, $U' \subseteq U$, and $V'$ a neighbourhood of $x_0 = f(y_0)$, such that $f : U' \longrightarrow V'$ is a diffeomorphism and
\[ (f^{-1})'_F(x_0) = f'_F(y_0)^{-1}. \]

PROOF Let
\[ h(x,y) \stackrel{df}{=} f(y) - x. \]
Then
\[ D_2 h(x_0, y_0) = f'_F(y_0), \]
which by hypothesis is an isomorphism. So by virtue of Theorem 4.1.27 we can find a neighbourhood $V'$ of $x_0$ and a continuously differentiable map $g : V' \longrightarrow Y$, such that $g(V') \subseteq U_0$ for a neighbourhood $U_0$ of $y_0$,
\[ h\big(x, g(x)\big) = 0 \qquad \forall\, x \in V' \]
(i.e., $f\big(g(x)\big) = x$ for all $x \in V'$) and
\[ g(x_0) = y_0. \]
In the sequel we consider $f$ restricted to $g(V')$. Since $f\big(g(x)\big) = x$, we see that $g$ is injective on $V'$, hence a bijection from $V'$ onto $g(V')$. In addition $g(V') = f^{-1}(V')$ is open because $f$ is continuous. So we set
\[ U' = g(V') \]
and we have that $f : U' \longrightarrow V'$ is a bijection. Finally since
\[ g'_F(x_0) = -\big( D_2 h\big(x_0, g(x_0)\big) \big)^{-1} D_1 h\big(x_0, g(x_0)\big), \]
we have
\[ f'_F(y_0) \circ g'_F(x_0) = id_X, \]
hence
\[ g'_F(x_0) = (f^{-1})'_F(x_0) = f'_F(y_0)^{-1}. \]
In finite dimensions this theorem has the following useful consequences.

COROLLARY 4.1.33 If $V \subseteq \mathbb{R}^N$ is an open set, $x_0 \in V$, $f : V \longrightarrow \mathbb{R}^M$ is a continuously differentiable function and $y_0 = f(x_0)$, then
(a) if $N \leq M$ and $f'_F(x_0)$ is of maximal rank (i.e., of rank $N$), then we can find a neighbourhood $U'$ of $y_0$, $V'$ a neighbourhood of $x_0$ and a continuously differentiable function $g : U' \longrightarrow \mathbb{R}^N$, such that
\[ (g \circ f)(x) = i(x) \qquad \forall\, x \in V', \]
where $i : \mathbb{R}^N \longrightarrow \mathbb{R}^M$ is the canonical injection, i.e., $i(x_1, \ldots, x_N) = (x_1, \ldots, x_N, 0, \ldots, 0)$;
(b) if $N \geq M$ and $f'_F(x_0)$ is of maximal rank (i.e., of rank $M$), then we can find a neighbourhood $\widehat{V}$ of $x_0$ and a continuously differentiable map $\vartheta : \widehat{V} \longrightarrow V$, with $\vartheta(x_0) = x_0$ and
\[ (f \circ \vartheta)(x) = \mathrm{proj}_{\mathbb{R}^M}(x) \qquad \forall\, x \in \widehat{V}, \]
where $\mathrm{proj}_{\mathbb{R}^M} : \mathbb{R}^N \longrightarrow \mathbb{R}^M$ is the canonical projection, i.e., $\mathrm{proj}_{\mathbb{R}^M}(x_1, \ldots, x_N) = (x_1, \ldots, x_M)$.

PROOF (a) By hypothesis
\[ \mathrm{rank} \left( \frac{\partial f_i}{\partial x_j}(x_0) \right)_{i \in \{1,\ldots,M\},\ j \in \{1,\ldots,N\}} = N. \]
By relabelling things if necessary, we may assume that
\[ \det \left( \frac{\partial f_i}{\partial x_j}(x_0) \right)_{i \in \{1,\ldots,N\},\ j \in \{1,\ldots,N\}} \neq 0. \]
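Let $\xi : V \times \mathbb{R}^{M-N} \longrightarrow \mathbb{R}^M$ be defined by
\[ \xi(x_1, \ldots, x_M) \stackrel{df}{=} f(x_1, \ldots, x_N) + (0, \ldots, 0, x_{N+1}, \ldots, x_M). \]
We have
\[ \det \left( \frac{\partial \xi_i}{\partial x_j}(x_0, 0) \right)_{i \in \{1,\ldots,M\},\ j \in \{1,\ldots,M\}} \neq 0. \]
So Theorem 4.1.27 implies that locally there exists an inverse $g$ of $\xi$, such that
\[ i(x) = (g \circ \xi \circ i)(x) = (g \circ f)(x). \]
(b) Again we may assume without any loss of generality that
\[ \det \left( \frac{\partial f_i}{\partial x_j}(x_0) \right)_{i \in \{1,\ldots,M\},\ j \in \{1,\ldots,M\}} \neq 0. \]
We define $\eta : V \longrightarrow \mathbb{R}^N$, by
\[ \eta(x_1, \ldots, x_N) \stackrel{df}{=} \big( f_1(x), \ldots, f_M(x), x_{M+1}, \ldots, x_N \big). \]
Hence
\[ \det \left( \frac{\partial \eta_i}{\partial x_j}(x_0) \right)_{i \in \{1,\ldots,N\},\ j \in \{1,\ldots,N\}} = \det \left( \frac{\partial f_i}{\partial x_j}(x_0) \right)_{i \in \{1,\ldots,M\},\ j \in \{1,\ldots,M\}} \neq 0. \]
So by Theorem 4.1.32, we can find a neighbourhood $\widehat{V}$ of $x_0$ and a continuously differentiable map $\vartheta : \widehat{V} \longrightarrow V$, such that $\vartheta(x_0) = x_0$ and $\vartheta = \eta^{-1}$. Then
\[ \mathrm{proj}_{\mathbb{R}^M}(x) = \big( \mathrm{proj}_{\mathbb{R}^M} \circ \eta \circ \vartheta \big)(x) = \big( f \circ \vartheta \big)(x). \]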
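REMARK 4.1.34 In Corollary 4.1.33, part (a) tells us that $f$ locally looks like the inclusion map. With $f$ acting from $V' \subseteq \mathbb{R}^N$ into $U' \subseteq \mathbb{R}^M$, $g$ defined on $U'$ and $i$ the canonical injection, we have the triangle
\[ g \circ f = i \quad \text{on } V'. \]
On the other hand, part (b) tells us that $f$ locally looks like the projection map. With $\vartheta$ acting from $\widehat{V} \subseteq \mathbb{R}^N$ onto $\vartheta(\widehat{V}) \subseteq V \subseteq \mathbb{R}^N$, $f$ acting into $\mathbb{R}^M$ and $p = \mathrm{proj}_{\mathbb{R}^M}$, we have the triangle
\[ f \circ \vartheta = p \quad \text{on } \widehat{V}. \]
Both diagrams are commutative.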
4.2 Convex Functions
In this section we focus on convex functions and their differentiability properties. We show that the algebraic property of convexity has important topological consequences such as continuity and differentiability. The situation is especially pleasant in the context of separable Banach spaces. Convex functions play a certain role in modern variational analysis and the applications require that we consider extended real valued convex functions, that is functions with values in $\overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$.

DEFINITION 4.2.1 Let $X$ be a Hausdorff topological space and let $\varphi : X \longrightarrow \overline{\mathbb{R}}$ be a function. The effective domain of $\varphi$ is the set
\[ \mathrm{dom}\, \varphi \stackrel{df}{=} \big\{ x \in X : \varphi(x) < +\infty \big\}. \]
We say that $\varphi$ is proper, if $\mathrm{dom}\, \varphi \neq \emptyset$. The epigraph of $\varphi$ is the set
\[ \mathrm{epi}\, \varphi \stackrel{df}{=} \big\{ (x, \lambda) \in X \times \mathbb{R} : \varphi(x) \leq \lambda \big\}. \]
The function $\varphi$ is lower semicontinuous, if for every $\lambda \in \mathbb{R}$, the sublevel set
\[ L^\varphi_\lambda \stackrel{df}{=} \big\{ x \in X : \varphi(x) \leq \lambda \big\} \]
is closed. If $X$ is a Hausdorff linear topological space, we say that $\varphi$ is convex, if for all $x_1, x_2 \in \mathrm{dom}\, \varphi$ and all $\lambda \in [0,1]$, we have
\[ \varphi\big( \lambda x_1 + (1 - \lambda)x_2 \big) \leq \lambda \varphi(x_1) + (1 - \lambda)\varphi(x_2). \]
We say that $\varphi$ is strictly convex if the above inequality is strict when $x_1 \neq x_2$ and $\lambda \in (0,1)$. The cone of proper, convex and lower semicontinuous functions is denoted by $\Gamma_0(X)$.

REMARK 4.2.2 It is well known that $\varphi$ is lower semicontinuous if and only if $\mathrm{epi}\, \varphi \subseteq X \times \mathbb{R}$ is closed or equivalently if $\varphi(x) \leq \liminf_\alpha \varphi(x_\alpha)$ for every net $(x_\alpha)$ converging to $x$. Also $\varphi$ is convex if and only if $\mathrm{epi}\, \varphi \subseteq X \times \mathbb{R}$ is a convex set. This means that certain properties of proper, convex and lower semicontinuous functions can be deduced from these (rather special) closed, convex sets in $X \times \mathbb{R}$. So one can argue that the study of proper, convex, lower semicontinuous functions is a special case of the study of closed, convex sets. On the other hand, if $C$ is a nonempty subset of $X$, we can introduce the indicator function of $C$, by
\[ i_C(x) \stackrel{df}{=} \begin{cases} 0 & \text{if } x \in C, \\ +\infty & \text{otherwise.} \end{cases} \]
Then $i_C \in \Gamma_0(X)$ if and only if $C$ is closed and convex. This example shows that it is possible to deduce certain properties of a closed, convex set from the properties of its indicator function, which belongs to $\Gamma_0(X)$. So one can argue that the study of closed, convex sets is a special case of the study of proper, convex, lower semicontinuous functions. Both points of view are legitimate and it is a matter of which approach is more convenient, the geometric or the analytical. The next theorem summarizes the continuity properties of a proper, convex function.

THEOREM 4.2.3 If $X$ is a Hausdorff linear space and $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, convex function, then the following statements are equivalent:
(a) $\varphi$ is bounded from above on a neighbourhood of $x_0 \in X$;
(b) $\varphi$ is continuous at $x_0 \in X$;
(c) $\mathrm{int}\, \mathrm{epi}\, \varphi \neq \emptyset$;
(d) $\mathrm{int}\, \mathrm{dom}\, \varphi \neq \emptyset$ and $\varphi|_{\mathrm{int}\, \mathrm{dom}\, \varphi}$ is continuous.
Moreover, if the above statements hold, then
\[ \mathrm{int}\, \mathrm{epi}\, \varphi = \big\{ (x, \lambda) \in X \times \mathbb{R} : x \in \mathrm{int}\, \mathrm{dom}\, \varphi,\ \varphi(x) < \lambda \big\}. \]

PROOF “(a)$\Longrightarrow$(b)”: Let $U$ be a neighbourhood of $x_0$, such that $\varphi|_U$ is bounded from above, i.e., there exists $c > 0$, such that
\[ \varphi(x) \leq c \qquad \forall\, x \in U. \]
Replacing, if necessary, $U$ by $U - x_0$ and $\varphi(x)$ by $\varphi(x + x_0) - \varphi(x_0)$, we may assume without any loss of generality that $x_0 = 0$ and that $\varphi(0) = 0$. We will show that $\varphi$ is continuous at $x_0 = 0$. Let $\varepsilon \in (0, c]$ and let us define
\[ V_\varepsilon \stackrel{df}{=} \Big( \frac{\varepsilon}{c} U \Big) \cap \Big( -\frac{\varepsilon}{c} U \Big). \]
Evidently $V_\varepsilon$ is a symmetric neighbourhood of the origin. We shall show that
\[ |\varphi(x)| \leq \varepsilon \qquad \forall\, x \in V_\varepsilon \qquad (4.14) \]
(which implies the continuity of $\varphi$ at $x_0 = 0$). So let $x \in V_\varepsilon$. We have $\frac{c}{\varepsilon} x \in U$ and because $\varphi$ is convex, we have
\[ \varphi(x) \leq \frac{\varepsilon}{c}\, \varphi\Big( \frac{c}{\varepsilon} x \Big) + \Big( 1 - \frac{\varepsilon}{c} \Big) \varphi(0) \leq \frac{\varepsilon}{c}\, c = \varepsilon. \]
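Also $-\frac{c}{\varepsilon} x \in U$ and so
\[ 0 = \varphi(0) = \varphi\left( \frac{1}{1 + \frac{\varepsilon}{c}}\, x + \frac{\frac{\varepsilon}{c}}{1 + \frac{\varepsilon}{c}} \Big( -\frac{c}{\varepsilon} x \Big) \right) \leq \frac{1}{1 + \frac{\varepsilon}{c}}\, \varphi(x) + \frac{\frac{\varepsilon}{c}}{1 + \frac{\varepsilon}{c}}\, \varphi\Big( -\frac{c}{\varepsilon} x \Big) \leq \frac{1}{1 + \frac{\varepsilon}{c}}\, \varphi(x) + \frac{\varepsilon}{1 + \frac{\varepsilon}{c}}, \]
hence $-\varepsilon \leq \varphi(x)$. So finally we obtain (4.14) and this proves the continuity of $\varphi$ at the origin.

“(b)$\Longrightarrow$(a)”: Since $\varphi$ is continuous at $x_0$, it is bounded on a neighbourhood of $x_0$.

“(a)$\Longrightarrow$(c)”: By hypothesis, there exists a neighbourhood $U$ of $x_0$, such that
\[ \varphi(x) \leq c \qquad \forall\, x \in U. \]
So $U \subseteq \mathrm{int}\, \mathrm{dom}\, \varphi$ and
\[ \{ (x, \lambda) \in X \times \mathbb{R} : x \in U,\ c < \lambda \} \subseteq \mathrm{epi}\, \varphi, \]
which implies that $\mathrm{int}\, \mathrm{epi}\, \varphi \neq \emptyset$.

“(c)$\Longrightarrow$(a)”: Let $(x, \lambda) \in \mathrm{int}\, \mathrm{epi}\, \varphi$. We can find a neighbourhood $U$ of $x$ and $r > 0$, such that
\[ U \times [\lambda - r, \lambda + r] \subseteq \mathrm{epi}\, \varphi, \]
hence $U \times \{\lambda\} \subseteq \mathrm{epi}\, \varphi$ and so
\[ \varphi(x) \leq \lambda \qquad \forall\, x \in U, \]
which means that $\varphi$ is bounded from above in a neighbourhood of $x$.

“(a)$\Longrightarrow$(d)”: As before we may assume that $x_0 = 0$. Let $U$ be the neighbourhood of $x_0 = 0$ postulated by part (a). Evidently $U \subseteq \mathrm{dom}\, \varphi$ and so $\mathrm{int}\, \mathrm{dom}\, \varphi \neq \emptyset$. Let $x \in \mathrm{int}\, \mathrm{dom}\, \varphi$. Note that $\mathrm{dom}\, \varphi$ is convex. So we can find $r > 1$, such that $\widehat{x} = rx \in \mathrm{dom}\, \varphi$. Let
\[ V \stackrel{df}{=} x + \Big( 1 - \frac{1}{r} \Big) U, \]
which is a neighbourhood of $x$. Exploiting the convexity of $\varphi$, for all $u \in V$, we have
\[ u = x + \Big( 1 - \frac{1}{r} \Big) z \quad \text{with } z \in U \]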
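and
\[ \varphi(u) = \varphi\Big( \frac{1}{r}\widehat{x} + \Big( 1 - \frac{1}{r} \Big) z \Big) \leq \frac{1}{r}\varphi(\widehat{x}) + \Big( 1 - \frac{1}{r} \Big)\varphi(z) \leq \frac{1}{r}\varphi(\widehat{x}) + \Big( 1 - \frac{1}{r} \Big) c = \widehat{c}. \]
So $\varphi$ is bounded from above in a neighbourhood of $x$, hence continuous at $x \in \mathrm{int}\, \mathrm{dom}\, \varphi$ (recall that (a)$\Longleftrightarrow$(b)).

“(d)$\Longrightarrow$(a)”: Obvious.

Finally let us show that
\[ \mathrm{int}\, \mathrm{epi}\, \varphi = \{ (x, \lambda) \in X \times \mathbb{R} : x \in \mathrm{int}\, \mathrm{dom}\, \varphi,\ \varphi(x) < \lambda \}. \]
Let us denote the right hand side set by $W$. Clearly
\[ \mathrm{int}\, \mathrm{epi}\, \varphi \subseteq W. \]
On the other hand let $x \in \mathrm{int}\, \mathrm{dom}\, \varphi$ and $\varphi(x) < \lambda$. Let
\[ \widehat{\lambda} \in \big( \varphi(x), \lambda \big). \]
Because $\varphi|_{\mathrm{int}\, \mathrm{dom}\, \varphi}$ is continuous, there exists a neighbourhood $U$ of $x$, such that $U \subseteq \mathrm{int}\, \mathrm{dom}\, \varphi$ and
\[ \varphi(x) < \widehat{\lambda} \qquad \forall\, x \in U. \]
Hence $(x, \lambda) \in U \times \big( \widehat{\lambda}, +\infty \big) \subseteq \mathrm{int}\, \mathrm{epi}\, \varphi$, and so
\[ W \subseteq \mathrm{int}\, \mathrm{epi}\, \varphi. \]

REMARK 4.2.4 From the above theorem it follows that
\[ \mathrm{int}\, \mathrm{dom}\, \varphi = \big\{ x \in X : \text{there exists } \lambda \in \mathbb{R} \text{ such that } (x, \lambda) \in \mathrm{int}\, \mathrm{epi}\, \varphi \big\}. \]
Also a convex function $\varphi$ can be continuous at a boundary point $x$ of $\mathrm{dom}\, \varphi$ where $\varphi(x) = +\infty$. To see this consider the convex function $\varphi : \mathbb{R} \longrightarrow \overline{\mathbb{R}}$, defined by
\[ \varphi(x) \stackrel{df}{=} \begin{cases} \frac{1}{x} & \text{if } x \in (0, +\infty), \\ +\infty & \text{if } x \in (-\infty, 0]. \end{cases} \]
Recall that in $\overline{\mathbb{R}}$ the neighbourhoods of $+\infty$ are all the sets $(\lambda, +\infty]$ with $\lambda \in \mathbb{R}$. If $C$ is a nonempty, closed, convex set in $X$, then the indicator function $i_C \in \Gamma_0(X)$ is continuous at $x$ if and only if $x \in \mathrm{int}\, C$. Therefore, if $\mathrm{int}\, C = \emptyset$, then $i_C$ is not continuous at any point of $C = \mathrm{dom}\, i_C$.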
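If $X$ is finite dimensional, the situation is remarkably simple.

PROPOSITION 4.2.5 If $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is convex and $X$ is finite dimensional, then $\varphi$ is continuous on $\mathrm{int}\, \mathrm{dom}\, \varphi$.

PROOF Let $x \in \mathrm{int}\, \mathrm{dom}\, \varphi$. We can find $\{e_k\}_{k=0}^N \subseteq X$ ($N = \dim X$) and $r > 0$, such that
\[ B_r(x) \subseteq \mathrm{conv}\, \{e_k\}_{k=0}^N \subseteq \mathrm{dom}\, \varphi. \]
So if $y \in B_r(x)$, we can find $\{\lambda_k\}_{k=0}^N \subseteq [0,1]$, such that
\[ \sum_{k=0}^N \lambda_k = 1 \quad \text{and} \quad y = \sum_{k=0}^N \lambda_k e_k. \]
Then because $\varphi$ is convex, we have
\[ \varphi(y) \leq \sum_{k=0}^N \lambda_k \varphi(e_k) \leq \Big( \sum_{k=0}^N \lambda_k \Big) \max_{k \in \{0,\ldots,N\}} \varphi(e_k) = c < +\infty. \]
So by virtue of Theorem 4.2.3, $\varphi|_{\mathrm{int}\, \mathrm{dom}\, \varphi}$ is continuous.

To have an infinite dimensional analog of the above proposition, we need an extra condition on the function $\varphi$.

THEOREM 4.2.6 If $X$ is a Banach space and $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is convex and lower semicontinuous, then $\varphi|_{\mathrm{int}\, \mathrm{dom}\, \varphi}$ is continuous.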
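PROOF We have
\[ \mathrm{dom}\, \varphi = \bigcup_{n=1}^\infty \{ \varphi \leq n \}. \]
Let $x \in \mathrm{int}\, \mathrm{dom}\, \varphi$. Since $\varphi$ is lower semicontinuous, the sets $\{ \varphi \leq n \}$ are closed. So by the Baire category theorem (see Theorem A.1.10), we can find $n \geq 1$, such that
\[ \mathrm{int}\, \{ \varphi < n \} \neq \emptyset \quad \text{and} \quad \varphi(x) < n. \]
Let $y \in \mathrm{int}\, \{ \varphi < n \}$ and set
\[ h(\lambda) \stackrel{df}{=} \varphi\big( x + \lambda(y - x) \big) \qquad \forall\, \lambda \in \mathbb{R}. \]
Since $x \in \mathrm{int}\, \mathrm{dom}\, \varphi$, we can find $r > 0$, such that $\overline{B}_{r\|y - x\|_X}(x) \subseteq \mathrm{dom}\, \varphi$. We have $[-r, r] \subseteq \mathrm{dom}\, h$, hence $0 \in \mathrm{int}\, \mathrm{dom}\, h$ and so $h$ is continuous at $0$ (see Proposition 4.2.5). Because $h(0) < n$, we can find $\vartheta > 0$, such that
\[ h(\lambda) < n \qquad \forall\, \lambda \in [-\vartheta, 0]. \]
Let
\[ z \stackrel{df}{=} x - \vartheta(y - x). \]
We have $z \in \{ \varphi < n \}$ and [...] $\delta > 0$ and $c > 0$, such that
\[ \overline{B}_{2\delta}(x_0) \subseteq U \quad \text{and} \quad |\varphi(y)| \leq c \qquad \forall\, y \in \overline{B}_{2\delta}(x_0). \]
Let $x, y \in \overline{B}_\delta(x_0)$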
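with $x \ne y$. Let us set
$$r \stackrel{df}{=} \|x - y\|_X \quad \text{and} \quad z \stackrel{df}{=} y + \frac{\delta}{r}(y - x).$$
Then $z \in \overline{B}_{2\delta}(x_0)$. Also we have
$$y = \frac{r}{r+\delta}\, z + \frac{\delta}{r+\delta}\, x.$$
So from the convexity of $\varphi$, we obtain
$$\varphi(y) \le \frac{r}{r+\delta}\, \varphi(z) + \frac{\delta}{r+\delta}\, \varphi(x),$$
so
$$\varphi(y) - \varphi(x) \le \frac{r}{r+\delta}\big(\varphi(z) - \varphi(x)\big) \le \frac{r}{\delta}\, 2c = \frac{2c}{\delta}\, \|x - y\|_X.$$
Interchanging the roles of $x$ and $y$ in the above argument, we conclude that
$$|\varphi(y) - \varphi(x)| \le \frac{2c}{\delta}\, \|x - y\|_X \quad \forall\, x, y \in \overline{B}_{\delta}(x_0).$$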
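The Lipschitz constant $2c/\delta$ obtained above can be compared with the actual difference quotients of a concrete convex function. The sketch below assumes the one-dimensional example $\varphi(x) = x^2$ with $x_0 = 0$, $\delta = 1$ and $c = 4$ (a bound for $|\varphi|$ on $[-2, 2]$); these values are illustrative and not part of the text.

```python
def phi(x):
    # Convex example function, chosen for illustration.
    return x * x

def lipschitz_bound(c, delta):
    # The constant 2c/delta from the estimate in the proof.
    return 2.0 * c / delta

def max_difference_quotient(f, x0, delta, n=200):
    # Largest |f(y) - f(x)| / |y - x| over a uniform grid of the closed
    # ball [x0 - delta, x0 + delta].
    pts = [x0 - delta + 2.0 * delta * i / n for i in range(n + 1)]
    return max(abs(f(y) - f(x)) / abs(y - x)
               for x in pts for y in pts if x != y)

observed = max_difference_quotient(phi, 0.0, 1.0)
bound = lipschitz_bound(4.0, 1.0)
print(observed, "<=", bound)
```

The observed quotient stays below $2$, well under the bound $8$, which reflects that the estimate in the proof is not sharp.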
“$\Longleftarrow$”: Obvious.
For convex, continuous functions, it is possible to characterize Gâteaux and Fréchet differentiability at $x \in X$ only in terms of $\varphi$, that is without using the linear functionals $\varphi'_F(x)$ and $\varphi'_G(x)$.
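PROPOSITION 4.2.8 If $U \subseteq X$ is an open set and $\varphi : U \to \mathbb{R}$ is a convex, continuous function, then $\varphi$ is Gâteaux differentiable at $x \in U$ if and only if
$$\lim_{\lambda \searrow 0} \frac{\varphi(x + \lambda h) + \varphi(x - \lambda h) - 2\varphi(x)}{\lambda} = 0 \quad \forall\, h \in X. \tag{4.15}$$
PROOF “$\Longrightarrow$”: Since $\varphi$ is Gâteaux differentiable at $x \in U$, we have
$$\lim_{\lambda \searrow 0} \frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} = \varphi'_G(x)h \quad \text{and} \quad \lim_{\lambda \searrow 0} \frac{\varphi(x - \lambda h) - \varphi(x)}{\lambda} = \varphi'_G(x)(-h).$$
From these limits, we obtain immediately (4.15).
“$\Longleftarrow$”: Let
$$\psi(\lambda) \stackrel{df}{=} \varphi(x + \lambda h)$$
for all $\lambda \in \mathbb{R}$ with $x + \lambda h \in U$. Then $\psi$ is convex and by hypothesis we have
$$\psi'_+(0) = \psi'_-(0).$$
Therefore $\psi$ is differentiable at $\lambda = 0$ and so $\varphi$ is Gâteaux differentiable at $x$ (see Remark 4.1.2).
In the case of Fréchet differentiability at $x$, (4.15) holds uniformly with respect to $h$ when $\|h\|_X = 1$.
PROPOSITION 4.2.9 If $U \subseteq X$ is an open set and $\varphi : U \to \mathbb{R}$ is a convex, continuous function, then $\varphi$ is Fréchet differentiable at $x \in U$ if and only if for every $\varepsilon > 0$ there exists $\delta > 0$, such that
$$\varphi(x + \lambda h) + \varphi(x - \lambda h) - 2\varphi(x) < \lambda\varepsilon \quad \forall\, \|h\|_X = 1,\ \lambda \in (0, \delta). \tag{4.16}$$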
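The symmetric quotient appearing in (4.15) and (4.16) is easy to evaluate for one-dimensional examples. The sketch below uses two illustrative functions that are not taken from the text: $|x|$, which is not differentiable at $0$, and $x^2$, which is.

```python
def symmetric_quotient(phi, x, h, lam):
    # (phi(x + lam*h) + phi(x - lam*h) - 2*phi(x)) / lam, as in (4.15).
    return (phi(x + lam * h) + phi(x - lam * h) - 2.0 * phi(x)) / lam

def square(t):
    return t * t

for lam in (1e-1, 1e-2, 1e-3):
    # For abs the quotient stays at 2; for square it equals 2*lam and tends to 0.
    print(lam,
          symmetric_quotient(abs, 0.0, 1.0, lam),
          symmetric_quotient(square, 0.0, 1.0, lam))
```

The quotient for `abs` does not decay, in agreement with the failure of differentiability at the origin, while the quotient for `square` is of order `lam`.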
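PROOF “$\Longrightarrow$”: Since $\varphi$ is Fréchet differentiable at $x \in U$, for a given $\varepsilon > 0$, we can find $\delta > 0$, such that
$$\varphi(x + \lambda h) - \varphi(x) - \langle \varphi'_F(x), \lambda h \rangle_X < \frac{\varepsilon}{2}\, \|\lambda h\|_X \quad \forall\, \|h\|_X = 1,\ \lambda \in (0, \delta). \tag{4.17}$$
Rewriting (4.17) with $h$ replaced by $-h$ and adding, we obtain (4.16).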
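“$\Longleftarrow$”: From Proposition 4.2.8 we know that $\varphi$ is Gâteaux differentiable at $x$. The convexity of $\varphi$ implies that
$$\frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} - \langle \varphi'_G(x), h \rangle_X \ge 0 \quad \forall\, \lambda > 0 \tag{4.18}$$
and
$$\frac{\varphi(x) - \varphi(x - \lambda h)}{\lambda} - \langle \varphi'_G(x), h \rangle_X \le 0 \quad \forall\, \lambda > 0. \tag{4.19}$$
Therefore
$$\lambda\varepsilon > \varphi(x + \lambda h) + \varphi(x - \lambda h) - 2\varphi(x) = \big[\varphi(x + \lambda h) - \varphi(x) - \lambda \langle \varphi'_G(x), h \rangle_X\big] - \big[\varphi(x) - \varphi(x - \lambda h) - \lambda \langle \varphi'_G(x), h \rangle_X\big] \quad \forall\, \|h\|_X = 1,\ \lambda \in (0, \delta). \tag{4.20}$$
From (4.18) and (4.19), we see that the right hand side of (4.20) is the sum of two nonnegative quantities (the first bracket and the negative of the second), both of which have to be less than $\lambda\varepsilon$ for all $\|h\|_X = 1$ and all $\lambda \in (0, \delta)$. This means that $\varphi$ is Fréchet differentiable at $x$.
It is well known from elementary calculus that for a function $\varphi : U \to \mathbb{R}$ with $U \subseteq \mathbb{R}^N$ being an open set, existence of all partial derivatives at $x \in U$ does not imply Fréchet differentiability. However, if $\varphi$ is convex this is true.
PROPOSITION 4.2.10 If $U \subseteq \mathbb{R}^N$ is an open set, $\varphi : U \to \mathbb{R}$ is a convex function and all the partial derivatives of $\varphi$ at $x \in U$ exist, then $\varphi$ is Fréchet differentiable at $x \in U$.
PROOF The obvious candidate for the Fréchet derivative of $\varphi$ at $x$ is the linear transformation determined by the partial derivatives, that is $A(x) \in L(\mathbb{R}^N; \mathbb{R}) = \mathbb{R}^N$, defined by
$$\big(A(x), h\big)_{\mathbb{R}^N} \stackrel{df}{=} \sum_{k=1}^N \frac{\partial \varphi}{\partial x_k}(x)\, h_k \quad \forall\, h = (h_1, \ldots, h_N) \in \mathbb{R}^N.$$
Let $r > 0$ be such that $B_r(x) \subseteq U$. For each $h \in B_r(0)$, let
$$\psi(h) \stackrel{df}{=} \varphi(x + h) - \varphi(x) - \big(A(x), h\big)_{\mathbb{R}^N}.$$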
Evidently $\psi$ is convex on $B_r(0)$. For each $k \in \{1, \ldots, N\}$, let us define a function $\xi_k : B_r(0) \to \mathbb{R}$, by
$$\xi_k(h) \stackrel{df}{=} \begin{cases} \dfrac{\psi(h_k e_k)}{h_k} & \text{if } h_k \ne 0, \\ 0 & \text{if } h_k = 0 \end{cases} \quad \forall\, h \in B_r(0),$$
where $\{e_k\}_{k=1}^N$ is the orthonormal basis of $\mathbb{R}^N$. We have
$$\xi_k(h) \to 0 \quad \text{as } \|h\|_{\mathbb{R}^N} \to 0.$$
For each $h = (h_1, \ldots, h_N)$ with $\|h\|_{\mathbb{R}^N} < \frac{r}{N}$, because of the convexity of $\psi$, we have
$$\psi(h) = \psi\Big(\frac{1}{N} \sum_{k=1}^N N h_k e_k\Big) \le \frac{1}{N} \sum_{k=1}^N \psi(N h_k e_k) = \sum_{k=1}^N h_k\, \xi_k(N h) \le \|h\|_{\mathbb{R}^N} \sum_{k=1}^N |\xi_k(N h)|.$$
Since
$$0 = \psi\Big(\frac{1}{2} h + \frac{1}{2}(-h)\Big) \le \frac{1}{2} \psi(h) + \frac{1}{2} \psi(-h),$$
we have $-\psi(-h) \le \psi(h)$. Therefore, it follows that
$$-\|h\|_{\mathbb{R}^N} \sum_{k=1}^N |\xi_k(-N h)| \le \psi(h) \le \|h\|_{\mathbb{R}^N} \sum_{k=1}^N |\xi_k(N h)|,$$
so
$$\frac{\psi(h)}{\|h\|_{\mathbb{R}^N}} \to 0 \quad \text{as } h \to 0$$
and thus $\varphi$ is Fréchet differentiable at $x \in U$ and $\varphi'_F(x) = A(x)$.
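The role of convexity in this result can be seen by comparing the remainder ratio $\psi(h)/\|h\|$ for a convex and a nonconvex function whose partial derivatives at the origin both vanish. The two functions below are illustrative assumptions, not examples from the text.

```python
import math

def residual_ratio(f, grad, x, h):
    # psi(h) / ||h||, where psi(h) = f(x + h) - f(x) - <grad, h>.
    xh = tuple(xi + hi for xi, hi in zip(x, h))
    lin = sum(g * hi for g, hi in zip(grad, h))
    norm = math.sqrt(sum(hi * hi for hi in h))
    return (f(xh) - f(x) - lin) / norm

def convex_example(p):
    # Maximum of two convex functions, hence convex; partials at 0 are 0.
    return max(p[0] ** 2, p[1] ** 2)

def nonconvex_example(p):
    # Vanishes on both axes, so the partials at 0 are 0, but it is not convex.
    r = math.hypot(p[0], p[1])
    return 0.0 if r == 0.0 else p[0] * p[1] / r

origin, zero_grad = (0.0, 0.0), (0.0, 0.0)
for t in (1e-1, 1e-3, 1e-5):
    h = (t, t)
    print(t,
          residual_ratio(convex_example, zero_grad, origin, h),
          residual_ratio(nonconvex_example, zero_grad, origin, h))
```

Along the diagonal the ratio for the convex function tends to zero, while for the nonconvex function it stays at $1/2$, so the latter is not Fréchet differentiable at the origin although its partial derivatives exist there.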
The above proposition implies that for convex functions on a finite dimensional Banach space the situation is straightforward; namely Gâteaux and Fréchet differentiability are equivalent (compare with Proposition 4.1.7).
COROLLARY 4.2.11 If $X$ is a finite dimensional Banach space, $U \subseteq X$ is an open set and $\varphi : U \to \mathbb{R}$ is a convex function, then $\varphi$ is Gâteaux differentiable at $x \in U$ if and only if it is Fréchet differentiable at $x \in U$.
From elementary convex analysis we know that a convex function $\varphi : (a, b) \to \mathbb{R}$ is differentiable at all except at most a countable number of points of $(a, b)$. We would like to identify those Banach spaces $X$ where the convex continuous functions defined on open convex sets in $X$ have similar Gâteaux differentiability properties. The basic result in this direction is the so-called Mazur's theorem, which implies the automatic generic (i.e., in a dense $G_\delta$-subset) Gâteaux differentiability of convex, continuous functions in separable Banach spaces.
THEOREM 4.2.12 (Mazur Theorem) If $X$ is a separable Banach space, $U \subseteq X$ is an open, convex set and $\varphi : U \to \mathbb{R}$ is a convex, continuous function, then $\varphi$ is Gâteaux differentiable on a dense $G_\delta$ subset of $U$.
PROOF For each $h \in X$ and $m \in \mathbb{N}$, let
$$S\Big(h, \frac{1}{m}\Big) \stackrel{df}{=} \Big\{x \in U : \text{there exists } \delta = \delta(x, m) > 0, \text{ such that } \sup_{\lambda \in (0, \delta)} \frac{\varphi(x + \lambda h) + \varphi(x - \lambda h) - 2\varphi(x)}{\lambda} < \frac{1}{m}\Big\}.$$
Since by hypothesis $\varphi$ is continuous, for a given $m \ge 1$ and every $k \ge 1$, the set
$$T_k(h) \stackrel{df}{=} \Big\{x \in U : \frac{\varphi(x + \frac{1}{k} h) + \varphi(x - \frac{1}{k} h) - 2\varphi(x)}{\frac{1}{k}} < \frac{1}{m}\Big\}$$
is open in $U$. It follows then that
$$S\Big(h, \frac{1}{m}\Big) = \bigcup_{k=1}^{\infty} T_k(h)$$
is open in $U$. We claim that it is also dense in $U$. Suppose that this is not the case. Then we can find $x_0 \in U$ and $r > 0$, such that
$$S\Big(h, \frac{1}{m}\Big) \cap B_r(x_0) = \emptyset.$$
If $\psi(\lambda) = \varphi(x_0 + \lambda h)$, it follows that the convex function $\psi$ is not differentiable at any point of $(-r, r)$, which contradicts the fact that a convex function on an interval is differentiable off a countable set. So $S\big(h, \frac{1}{m}\big)$ is dense in $U$.
Because $X$ is separable, we can find a sequence $\{h_n\}_{n \ge 1}$ which is dense in
$$\partial B_1(0) = \{h \in X : \|h\|_X = 1\}.$$
By virtue of Proposition 4.2.8, $\varphi$ is Gâteaux differentiable at $x$ if the directional derivative exists in the directions $\{h_n\}_{n \ge 1}$. So $\varphi$ is Gâteaux differentiable on the set
$$\bigcap_{n, m = 1}^{\infty} S\Big(h_n, \frac{1}{m}\Big),$$
which is dense in $U$ (Baire's theorem) and of course $G_\delta$.
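There exist nonseparable Banach spaces where the above theorem fails.
EXAMPLE 4.2.13 Let $X = l^\infty$ and for $x = \{x_n\}_{n \ge 1}$, let
$$\varphi(x) \stackrel{df}{=} \limsup_{n \to +\infty} |x_n|.$$
Then $\varphi$ is a seminorm (hence it is convex) and it is also continuous since $\varphi(x) \le \|x\|_\infty$. If $\varphi(x) = 0$, then $x_n \to 0$ and so taking $h = (1, 1, \ldots)$, we have
$$\frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} = \frac{|\lambda|}{\lambda},$$
which shows that $\varphi$ is not Gâteaux differentiable at any $x \in \varphi^{-1}(\{0\})$. If $\varphi(x) > 0$, exploiting the positive homogeneity of $\varphi$, we may assume without any loss of generality that $\varphi(x) = 1$. Let $\{x_{n_k}\}_{k \ge 1}$ be a subsequence of $\{x_n\}_{n \ge 1}$, such that
$$|x_{n_k}| \to 1 \quad \text{as } k \to +\infty.$$
By passing to a further subsequence if necessary, we may assume that all the elements of $\{x_{n_k}\}_{k \ge 1}$ have the same sign. Moreover, since $\varphi(x) = \varphi(-x)$, we can say that $x_{n_k} > 0$ for all $k \ge 1$. Let
$$h_n \stackrel{df}{=} \begin{cases} 0 & \text{if either } n \ne n_k \text{ for all } k \ge 1, \text{ or } n = n_k \text{ with } k \text{ odd}, \\ 1 & \text{if } n = n_k \text{ with } k \text{ even}. \end{cases}$$
Let us set $h = \{h_n\}_{n \ge 1} \in l^\infty$. For all $\lambda \ne 0$ with $|\lambda|$ small, we have
$$\frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} = \begin{cases} 1 & \text{if } \lambda > 0, \\ 0 & \text{if } \lambda < 0. \end{cases}$$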
So $\varphi$ is nowhere Gâteaux differentiable.
REMARK 4.2.14 The above example should not lead to the conclusion that Theorem 4.2.12 fails in every nonseparable Banach space. There are nonseparable Banach spaces in which Theorem 4.2.12 remains valid, for example the class of weakly compactly generated Banach spaces. A Banach space $X$ is weakly compactly generated, if there exists a weakly compact set $C$ (which we can always take to be convex), whose linear span is dense in $X$ (i.e., $X = \overline{\operatorname{span}}\, C$). Separable Banach spaces are weakly compactly generated. To see this let $\{x_n\}_{n \ge 1}$ be dense in $\partial B_1(0)$ and take
$$C \stackrel{df}{=} \Big\{\frac{1}{n} x_n\Big\}_{n \ge 1} \cup \{0\}.$$
Note that $C$ is actually compact. Also reflexive Banach spaces are weakly compactly generated. In this case let $C \stackrel{df}{=} \overline{B}_1(0)$.
DEFINITION 4.2.15 A Banach space $X$ is said to be a weak Asplund space, if every convex, continuous function defined on an open, convex set $U \subseteq X$ is Gâteaux differentiable at each point of a dense $G_\delta$ subset of $U$.
REMARK 4.2.16 In the above definition the term weak has nothing to do with the weak topology. It is used because sometimes Gâteaux differentiability is called weak differentiability in contrast to the Fréchet differentiability, which is called strong differentiability. Theorem 4.2.12 says that every separable Banach space is a weak Asplund space.
What about Fréchet differentiability? In this direction we have the following result due to Asplund (1968) and Lindenstrauss (1963) independently.
THEOREM 4.2.17 If $X$ is a Banach space with separable dual, $U \subseteq X$ is an open, convex set and $\varphi : U \to \mathbb{R}$ is a convex, continuous function, then $\varphi$ is Fréchet differentiable on a dense $G_\delta$ subset of $U$.
Then in analogy to Definition 4.2.15, we make the following definition.
DEFINITION 4.2.18 A Banach space $X$ is said to be an Asplund space, if every convex, continuous function defined on an open, convex set $U \subseteq X$ is Fréchet differentiable on a dense $G_\delta$ subset of $U$.
REMARK 4.2.19 Theorem 4.2.17 implies that every Banach space $X$ with a separable dual $X^*$ is an Asplund space. Note that $X$ is separable too. More generally, we can say that a separable Banach space $X$ is an Asplund space if and only if it has a separable dual. If $X = C([0,1])$ (whose dual $M([0,1])$, the space of Radon measures, is not separable) and $\varphi : X \to \mathbb{R}_+$ is defined by
$$\varphi(x) \stackrel{df}{=} \|x\|_\infty,$$
then it can be shown that $\varphi$ is not Fréchet differentiable at any point.
4.3 Haar Null Sets and Locally Lipschitz Functions
From real analysis we know that every Lipschitz continuous function $f : \mathbb{R} \to \mathbb{R}$ is differentiable almost everywhere and is the integral of its derivative (a consequence of the fundamental theorem of the Lebesgue calculus). We saw that this theorem can be extended to vector valued functions $f : \mathbb{R} \to X$, when $X$ is a Banach space with the RNP (see Theorem 2.2.17 and Remark 2.2.18). We also proved another generalization of Lebesgue's original result, which is due to Rademacher and which says that a locally Lipschitz function $f : \mathbb{R}^N \to \mathbb{R}^M$ is differentiable almost everywhere (see Theorem 1.5.8 and Corollary 1.5.9). The purpose of this section is to combine these two generalizations; that is, we want to prove a version of Rademacher's theorem for functions between Banach spaces.
The problem that we face when dealing with such a generalization is that we do not have a natural measure $\mu$ (such as the Lebesgue measure in $\mathbb{R}^N$), which produces a useful class of $\mu$-null sets. So we need to devise new ways to come up with negligible sets. Our experience from the real line suggests that the approach cannot be purely topological, choosing, say, the sets of first Baire category. There are several distinct ways to define negligible sets. Our goal is not to present all of them. Instead we will focus on the so-called Haar-null sets, which historically produced the first generalization of Rademacher's theorem. As the name suggests, Haar-null sets are defined on topological groups. So let us start with a brief discussion of them.
DEFINITION 4.3.1 A topological group is a group $G$ endowed with a Hausdorff topology, which is compatible with the group structure; that is, the two maps
$$G \times G \ni (x, y) \mapsto xy \in G \quad \text{and} \quad G \ni x \mapsto x^{-1} \in G$$
are continuous ($G \times G$ is furnished with the product topology). An isomorphism of a topological group $G$ onto a topological group $\hat{G}$ is a group isomorphism of $G$ onto $\hat{G}$ which is bicontinuous. We say that $G$ is an Abelian topological group, if $G$ is Abelian.
REMARK 4.3.2 The map $x \mapsto x^{-1}$ is the inverse map, the map $x \mapsto ax$ is left translation by $a$ and the map $x \mapsto xb$ is right translation by $b$. All three maps are homeomorphisms of $G$ onto itself. The fact that translations are homeomorphisms implies that a topological group is topologically homogeneous. Namely for any $a, b \in G$, the map $x \mapsto ba^{-1}x$ is a homeomorphism of $G$ which sends $a$ to $b$. Therefore the topological structure at $a$ is reflected at $b$. In particular then the topology is completely determined by the system of neighbourhoods of the neutral element $e$.
DEFINITION 4.3.3 A metric $d_G$ on a topological group $G$ is said to be left-invariant, if
$$d_G(ax, ay) = d_G(x, y) \quad \forall\, a, x, y \in G$$
and right-invariant, if
$$d_G(xa, ya) = d_G(x, y) \quad \forall\, a, x, y \in G.$$
The metric $d_G$ is invariant, if it is both left-invariant and right-invariant.
REMARK 4.3.4 A celebrated theorem of Birkhoff-Kakutani (see, e.g., Hewitt & Ross (1963, pp. 68–70)) says that a topological group $G$ is metrizable if and only if the neutral element $e$ has a countable fundamental system of neighbourhoods. A metrizable topological group admits a left-invariant (or right-invariant) compatible metric. However, a metrizable topological group need not admit an invariant metric. For Abelian groups clearly left and right invariance are equivalent.
We shall consider separable Abelian topological groups and separable Banach spaces, since otherwise we face serious measurability problems. In addition the Abelian topological group will be Polish. So let $G$ be an Abelian Polish topological group. If $G$ is locally compact, then it is well known that there is a unique (up to scalar multiplication) translation invariant measure $\mu$ on $G$ called the Haar measure (see Dieudonné (1969, p. 244)). For $G$ which is not locally compact, no such invariant measure exists. Nevertheless, it is still possible to define the notion of Haar-null sets. Since $G$ is Abelian, its operation will be denoted by “+.” Also all measures considered in the sequel will be Borel without any explicit mention.
DEFINITION 4.3.5 A Borel set $A \subseteq G$ is said to be Haar-null, if there is a probability measure $\mu$ on $G$, such that $\chi_A \star \mu = 0$, i.e.,
$$\int_G \chi_A(x + y)\, \mu(dx) = 0 \quad \forall\, y \in G$$
(the convolution of the characteristic function $\chi_A$ and the measure $\mu$).
REMARK 4.3.6 So according to this definition the Borel set $A$ is Haar-null if and only if there is a probability measure $\mu$, such that all translates of $A$ are $\mu$-null, i.e., $\mu(A + y) = 0$ for all $y \in G$. The measure $\mu$ is called a test measure for $A$.
The next proposition shows that when $G$ is locally compact, then the notion of Haar-null set coincides with that of a set which is negligible with respect to the Haar measure on $G$.
PROPOSITION 4.3.7 If $G$ is a locally compact, Abelian, Polish topological group and $A \subseteq G$ is a Borel set, then the following two properties are equivalent:
(a) $A$ is Haar-null on $G$;
(b) $A$ is negligible for the Haar measure on $G$.
PROOF “(a)$\Longrightarrow$(b)”: Let $\mu$ be a test measure for $A$ and let $h$ be a Haar measure on $G$. We know that $h$ is $\sigma$-finite. Then by virtue of Fubini's theorem, we have
$$\int_G \Big[\int_G \chi_A(x + y)\, h(dx)\Big] \mu(dy) = \int_G \Big[\int_G \chi_A(x + y)\, \mu(dy)\Big] h(dx) = 0.$$
So for $\mu$-almost all $y \in G$, we have
$$\int_G \chi_A(x + y)\, h(dx) = 0,$$
hence, by the translation invariance of $h$,
$$h(A) = \int_G \chi_A(x)\, h(dx) = 0.$$
“(b)$\Longrightarrow$(a)”: Again let $h$ be a Haar measure. Let $f \in C_c(G)$, $f \ge 0$, be such that
$$\int_G f(x)\, h(dx) = 1.$$
Let us set
$$\mu(B) \stackrel{df}{=} \int_B f(x)\, h(dx) \quad \forall\, B \in \mathcal{B}(G).$$
Evidently $\mu$ is a probability measure on $G$ and we have
$$\int_G \chi_A(x + y)\, \mu(dx) = \int_G \chi_A(x + y) f(x)\, h(dx) \le c \int_G \chi_A(x + y)\, h(dx) = c \int_G \chi_A(x)\, h(dx) = 0,$$
for some $c > 0$ (since $f \in C_c(G)$).
REMARK 4.3.8 Since on $\mathbb{R}^N$, the Haar measures are multiples of the Lebesgue measure, it follows that the Haar-null sets are the Lebesgue-null sets.
PROPOSITION 4.3.9 If $G$ is an Abelian, Polish topological group and $\{A_n\}_{n \ge 1}$ is a sequence of Haar-null sets in $G$, then $\hat{A} \stackrel{df}{=} \bigcup_{n=1}^{\infty} A_n$ is a Haar-null set in $G$.
PROOF Let $M_+^1(G)$ be the space of probability measures on $G$. Furnished with the narrow topology $w\big(M_+^1(G), C_b(G)\big)$, $M_+^1(G)$ becomes a Polish space (see, e.g., Denkowski, Migórski & Papageorgiou (2003a, p. 199)). Let $\varrho$ be a complete metric on $M_+^1(G)$ generating the above narrow topology. By the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that if
$$\mu_n \to \mu \quad \text{in } M_+^1(G),$$
then
$$\mu_n \star \nu \to \mu \star \nu \quad \text{in } M_+^1(G).$$
Note that if a probability measure $\mu$ vanishes on a set $A$ and all its translates, then the same is true for every translate of $\mu$ and every probability measure which is absolutely continuous with respect to $\mu$ (see Definition A.2.22). In particular this is the case for measures of the form
$$\mu_B(A) \stackrel{df}{=} \frac{\mu(A \cap B)}{\mu(B)},$$
with $B \in \mathcal{B}(G)$, such that $\mu(B) > 0$. So let $\mu_0$ be a translation of $\mu$, such that every neighbourhood of the neutral element $e$ of $G$ has strictly positive $\mu_0$-measure. Then for $r > 0$, we set
$$u_r(C) \stackrel{df}{=} \frac{\mu_0(C \cap B_r(e))}{\mu_0(B_r(e))},$$
where
$$B_r(e) \stackrel{df}{=} \{x \in G : d_G(x, e) < r\},$$
with $d_G$ being a complete metric on $G$. Evidently
$$u_r \to \delta_e \quad \text{in } M_+^1(G), \text{ as } r \searrow 0,$$
where $\delta_e$ is the Dirac measure with its mass at $e$. Therefore for every $\varepsilon > 0$, we can find $\nu \in M_+^1(G)$, such that $\chi_A \star \nu = 0$ (i.e., $\nu$ is a test measure on $A$) and $\varrho(\nu, \delta_e) < \varepsilon$.
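These observations imply that by induction we can generate a sequence $\{\mu_n\}_{n \ge 1} \subseteq M_+^1(G)$, such that
$$\chi_{A_n} \star \mu_n = 0 \quad \forall\, n \ge 1$$
(i.e., $\mu_n$ is a test measure on $A_n$) and
$$\varrho\big(u, u \star \mu_n\big) < \frac{1}{2^n} \quad \forall\, n \ge 1,$$
where $u$ is any convolution of different $\mu_k$'s with $1 \le k \le n - 1$. Since the metric $\varrho$ on $M_+^1(G)$ is complete, we see that the measure
$$\mu \stackrel{df}{=} \prod_{k=1}^{\infty} \star\, \mu_k$$
is well defined. Because $\mu = \mu_n \star \nu_n$ for all $n \ge 1$, where
$$\nu_n = \prod_{k \ne n} \star\, \mu_k,$$
it follows that
$$\chi_{A_n} \star \mu = 0 \quad \forall\, n \ge 1,$$
hence
$$\chi_{\bigcup_{n=1}^{\infty} A_n} \star \mu = 0.$$
This proves that $\bigcup_{n=1}^{\infty} A_n$ is a Haar-null set.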
COROLLARY 4.3.10 If $G$ is an Abelian, Polish topological group and $A \subseteq G$ is Haar-null, then $A^c = G \setminus A$ is dense in $G$.
PROOF Let $\{x_n\}_{n \ge 1}$ be a sequence which is dense in $G$ and let $\mu$ be a test measure for $A$. It suffices to show that $\operatorname{int} A = \emptyset$. Suppose that $\operatorname{int} A \ne \emptyset$. Then we can find a neighbourhood $U$ of the neutral element $e$ of $G$ and $a \in A$, such that $a + U \subseteq A$. Since $U + b \subseteq A + (b - a)$ and every translate of $A$ is $\mu$-null, we have
$$\mu(U + b) = 0 \quad \forall\, b \in G,$$
so
$$\mu(U + x_n) = 0 \quad \forall\, n \ge 1.$$
Then because of Proposition 4.3.9, we have that
$$G = \bigcup_{n=1}^{\infty} (U + x_n) \text{ is Haar-null, with test measure } \mu,$$
a contradiction.
EXAMPLE 4.3.11 Let $X$ be a separable Banach space and let $A \subseteq X$ be a Borel set which intersects all the translates of a fixed line $L$ in sets whose one-dimensional Lebesgue measure is zero. For example $A$ can be a proper, Borel linear subspace of $X$. Then $A$ is a Haar-null set. Any probability measure on the line $L$ which is equivalent to the Lebesgue measure on $L$ can be used as a test measure of $A$. Sets like $A$ above are called directionally-null sets.
Recall that a set $A \subseteq \mathbb{R}$ can have positive measure, and, in fact, its complement can be Lebesgue-null, without $A$ including any interval of positive length (consider, e.g., the set of irrational numbers). However, if we take all differences between elements in $A$ (i.e., $A - A$), then this set contains a nontrivial interval around zero. This fact is used in the construction of a nonmeasurable set (see, e.g., Halmos (1974, p. 69)). The same property can be proved if $\mathbb{R}$ is replaced by an Abelian, Polish topological group $G$ and we consider a Borel set $A \subseteq G$ which is not Haar-null.
PROPOSITION 4.3.12 If $G$ is an Abelian, Polish topological group and $A \subseteq G$ is a Borel set which is not Haar-null, then $A - A$ is a neighbourhood of the neutral element $e$ of $G$.
PROOF Let
$$S(A) \stackrel{df}{=} \{x \in G : (A + x) \cap A \text{ is not Haar-null}\}.$$
We claim that $S(A)$ is a neighbourhood of $e$. If we can show this then the proposition follows since $S(A) \subseteq A - A$. Suppose that the claim is not true. Then we can find a sequence $\{x_n\}_{n \ge 1} \subseteq G$, such that
$$d_G(x_n, e) < \frac{1}{2^n} \quad \forall\, n \ge 1$$
($d_G$ being a complete metric on $G$) and
$$(A + x_n) \cap A \text{ is Haar-null} \quad \forall\, n \ge 1.$$
Because of Proposition 4.3.9, the set
$$\bigcup_{n=1}^{\infty} \big[(A + x_n) \cap A\big]$$
is Haar-null and so its complement in $A$,
$$\hat{A} \stackrel{df}{=} A \setminus \bigcup_{n=1}^{\infty} \big[(A + x_n) \cap A\big],$$
is a Borel set which is not Haar-null (see Corollary 4.3.10). Note that
$$\big(\hat{A} + x_n\big) \cap \hat{A} = \emptyset \quad \forall\, n \ge 1.$$
Let $C$ be the Cantor group (i.e., $C = \{0, 1\}^{\mathbb{N}}$). The elements of $C$ are denoted by $\xi = \{\xi_n\}_{n \ge 1}$ with $\xi_n \in \{0, 1\}$ for all $n \ge 1$. Let $\xi^{(n)} = \{\xi_k^{(n)}\}_{k \ge 1}$ be the element of $C$, defined by
$$\xi_k^{(n)} \stackrel{df}{=} \begin{cases} 0 & \text{if } k \ne n, \\ 1 & \text{if } k = n. \end{cases}$$
Consider the map $\vartheta : C \to G$, defined by
$$\vartheta(\xi) \stackrel{df}{=} \sum_{n=1}^{\infty} \xi_n x_n.$$
Note that
$$\vartheta\big(\xi^{(n)}\big) = x_n$$
and because $d_G$ is complete and invariant, we have that $\vartheta$ is continuous. If we consider the Haar probability measure $h$ on $C$ and the image measure
$$h \circ \vartheta^{-1} = \mu \quad \text{on } G,$$
then because $\hat{A}$ is not Haar-null, $\mu$ does not vanish on some translate of $\hat{A}$; that is, we can find $y \in G$, such that
$$\vartheta^{-1}\big(\hat{A} + y\big) \text{ has positive } h\text{-measure}.$$
So we must have that
$$\vartheta^{-1}\big(\hat{A} + y\big) - \vartheta^{-1}\big(\hat{A} + y\big) \text{ is a neighbourhood of } 0 \in C.$$
This means that for all $n \ge 1$ large enough,
$$\xi^{(n)} \in \vartheta^{-1}\big(\hat{A} + y\big) - \vartheta^{-1}\big(\hat{A} + y\big).$$
Thus there are $\sigma, \tau \in \vartheta^{-1}\big(\hat{A} + y\big)$ which differ only in the $n$-th coordinate and $x_n \in \hat{A} - \hat{A}$ for all these $n$. But this contradicts the fact that
$$\big(\hat{A} + x_n\big) \cap \hat{A} = \emptyset \quad \forall\, n \ge 1.$$
REMARK 4.3.13 In the above proof, we have shown something stronger, namely that $S(A)$ is an open set containing $e$. In fact this result can be generalized as follows: “If $A, B$ are two Borel sets in $G$ and
$$S(A, B) \stackrel{df}{=} \{x \in G : (A + x) \cap B \text{ is not Haar-null}\},$$
then $S(A, B)$ is an open (possibly empty) subset of $G$” (see Christensen (1974, p. 118)).
COROLLARY 4.3.14 (a) If $G$ is an Abelian, Polish topological group which is not locally compact, then every compact subset of $G$ is Haar-null.
(b) If $G = X$ is a nonreflexive, separable Banach space, then every weakly compact subset of $X$ is Haar-null.
Using the notion of the Haar-null sets we can extend Rademacher's theorem to Lipschitz continuous functions between certain Banach spaces.
LEMMA 4.3.15 If $X$ and $Y$ are two Banach spaces, $U \subseteq X$ is an open set, $f : U \to Y$ is a Lipschitz continuous function, $G \subseteq X$ is a dense additive subgroup and for some $x_0 \in U$ and all $h \in G$,
$$\lim_{\lambda \to 0} \frac{f(x_0 + \lambda h) - f(x_0)}{\lambda} = f'(x_0; h)$$
exists and $f'(x_0; \cdot)$ is additive, then $f$ is Gâteaux differentiable at $x_0$.
PROOF Consider the following family of functions:
$$u_\lambda(h) \stackrel{df}{=} \frac{f(x_0 + \lambda h) - f(x_0)}{\lambda} \quad \forall\, \lambda \ne 0,\ h \in X.$$
The family $\{u_\lambda\}_{\lambda \ne 0}$ is equicontinuous (since $f$ is Lipschitz continuous on $U$) and since $\lim_{\lambda \to 0} u_\lambda(h)$ exists for all $h \in G$, it also exists for all $h \in \overline{G} = X$. Moreover, from the additivity of $f'(x_0; \cdot)$ on $G$ follows the additivity on $\overline{G} = X$. In addition note that
$$\lim_{\lambda \to 0} u_\lambda(th) = t \lim_{\lambda \to 0} u_\lambda(h) \quad \forall\, t \in \mathbb{R}.$$
So $f'(x_0; \cdot)$ is a linear operator which is bounded by the Lipschitz constant of $f$, i.e., $f'(x_0; \cdot) \in L(X; Y)$. Therefore $f$ is Gâteaux differentiable at $x_0$.
PROPOSITION 4.3.16 If $U \subseteq \mathbb{R}^N$ is an open set, $Y$ is a Banach space with the RNP and $f : U \to Y$ is a Lipschitz continuous function, then $f$ is Gâteaux differentiable almost everywhere on $U$ (on $U$ we consider the $N$-dimensional Lebesgue measure).
PROOF Without any loss of generality, we may assume that $U = \mathbb{R}^N$. Let $\{x_n\}_{n \ge 1} \subseteq \mathbb{R}^N$ be dense and let
$$G \stackrel{df}{=} \operatorname{span}_{\mathbb{Q}} \{x_n\}_{n \ge 1}$$
(i.e., the linear combinations of the $x_n$'s with rational coefficients). Clearly $G$ is a countable dense additive subgroup of $\mathbb{R}^N$. From Theorem 2.2.17, we know that the directional derivatives
$$f'(x; h) = \lim_{\lambda \to 0} \frac{f(x + \lambda h) - f(x)}{\lambda}$$
exist in all directions $h \in G$ for almost all $x \in \mathbb{R}^N$. Then in the light of Lemma 4.3.15, if we can show that these directional derivatives are additive on $G$, then we will have the desired almost everywhere Gâteaux differentiability. To this end let $\varphi \in C_c^1(\mathbb{R}^N)$ be such that
$$\int_{\mathbb{R}^N} \varphi(x)\, dx = 1.$$
For example a standard function with these properties is the function
$$\varphi(x) = \begin{cases} c \exp\Big(\dfrac{1}{\|x\|_{\mathbb{R}^N}^2 - 1}\Big) & \text{if } \|x\|_{\mathbb{R}^N} < 1, \\ 0 & \text{if } \|x\|_{\mathbb{R}^N} \ge 1, \end{cases}$$
with $c > 0$ chosen so that we have the normalization condition
$$\int_{\mathbb{R}^N} \varphi(x)\, dx = 1.$$
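The normalizing constant $c$ of this standard bump function has no elementary closed form, but it is easy to approximate. The sketch below treats only the one-dimensional case $N = 1$ and uses a plain midpoint rule; the function names and the number of subintervals are assumptions made for the illustration.

```python
import math

def bump_unnormalized(x):
    # exp(1 / (x^2 - 1)) inside the open unit interval, 0 outside.
    r2 = x * x
    return math.exp(1.0 / (r2 - 1.0)) if r2 < 1.0 else 0.0

def integrate_midpoint(f, a, b, n=20000):
    # Composite midpoint rule on [a, b] with n subintervals.
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

def normalizing_constant():
    # c such that c * bump_unnormalized integrates to 1 over the real line.
    return 1.0 / integrate_midpoint(bump_unnormalized, -1.0, 1.0)

c = normalizing_constant()
print(c)  # roughly 2.25 in dimension one
```

For $N > 1$ the same idea applies after passing to polar coordinates, with a different value of $c$.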
Let
$$g(x) \stackrel{df}{=} (f \star \varphi)(x) = \int_{\mathbb{R}^N} f(y) \varphi(x - y)\, dy.$$
We know that $g \in C^1(\mathbb{R}^N; Y)$ and so we have that
$$g'_G(x)h = \big(f \star \varphi'_G\big)(x)\, h$$
is linear in $h \in \mathbb{R}^N$ for all $x \in \mathbb{R}^N$. Since
$$\int_{\mathbb{R}^N} f(y) \varphi(x - y)\, dy = \int_{\mathbb{R}^N} f(x - y) \varphi(y)\, dy \quad \forall\, x \in \mathbb{R}^N,$$
using the Lebesgue dominated convergence theorem (see Theorem A.2.2), for every $h \in G$ we have
$$g'_G(x)h = \lim_{\lambda \to 0} \frac{g(x + \lambda h) - g(x)}{\lambda} = \int_{\mathbb{R}^N} \varphi(y) \lim_{\lambda \to 0} \frac{f(x + \lambda h - y) - f(x - y)}{\lambda}\, dy$$
and it follows that
$$g'_G(x)h = \big(\varphi \star \xi_h\big)(x),$$
where
$$\xi_h(x) \stackrel{df}{=} \lim_{\lambda \to 0} \frac{f(x + \lambda h) - f(x)}{\lambda}$$
is a bounded measurable function. Then we have
$$\varphi \star \big(\xi_{h_1 + h_2} - \xi_{h_1} - \xi_{h_2}\big) = 0 \quad \forall\, h_1, h_2 \in G.$$
The same is true if $\varphi$ is replaced by $\varphi_m(x) = m^N \varphi(mx)$. Recall that for every bounded, measurable function $\hat{g} : \mathbb{R}^N \to Y$, we have that
$$\big(\varphi_m \star \hat{g}\big)(x) \to \hat{g}(x) \quad \text{for a.a. } x \in \mathbb{R}^N$$
(see Proposition 2.4.12(c); the result there is stated for $\mathbb{R}$-valued functions, but it can be extended to $Y$-valued functions by scalarization using elements of $Y^*$ and recalling that by virtue of Theorem 2.1.3, we may assume without any loss of generality that $Y$ is separable). So in the limit, we obtain that
$$\xi_{h_1 + h_2}(x) = \xi_{h_1}(x) + \xi_{h_2}(x)$$
for a.a. $x \in \mathbb{R}^N$ and all $h_1, h_2 \in G$. Because $G$ is countable, the exceptional Lebesgue-null set can be chosen independent of $h_1, h_2 \in G$.
Now we are ready for the infinite dimensional generalization of Rademacher's theorem (see Theorem 1.5.8 and Corollary 1.5.9).
THEOREM 4.3.17 If $X$ is a separable Banach space, $U \subseteq X$ is an open set, $Y$ is a Banach space with the RNP and $f : U \to Y$ is a Lipschitz continuous function, then $f$ is a Gâteaux differentiable function on a set $D_f$ with $X \setminus D_f$ being Haar-null in $X$.
PROOF Without any loss of generality we may assume that $U = X$. First we show that the set $D_f$ of points of Gâteaux differentiability of $f$ is a Borel subset of $X$. Indeed let $\{x_n\}_{n \ge 1}$ be dense in $X$ and set
$$G \stackrel{df}{=} \operatorname{span}_{\mathbb{Q}} \{x_n\}_{n \ge 1}.$$
Then by virtue of Lemma 4.3.15, we have
$$D_f = \Big\{x \in X : \Big\|\frac{f(x + \lambda h) - f(x)}{\lambda} - \frac{f(x + rh) - f(x)}{r}\Big\|_Y < \frac{1}{m},\ \lambda, r \in \mathbb{Q},\ |\lambda|, |r| \le \frac{1}{n},\ h \in G,\ m, n \ge 1\Big\}.$$
So we see that $D_f \subseteq X$ is a Borel set. Now let $\{y_n\}_{n \ge 1}$ be a sequence of linearly independent vectors in $X$, such that
$$\overline{\operatorname{span}}\, \{y_n\}_{n \ge 1} = X.$$
Let
$$V_m \stackrel{df}{=} \operatorname{span} \{y_n\}_{n=1}^m.$$
Then
$$V_1 \subseteq V_2 \subseteq \ldots \subseteq V_m \subseteq \ldots \subseteq X \quad \text{and} \quad \overline{\bigcup_{m=1}^{\infty} V_m} = X.$$
Using Proposition 4.3.16, we can find $D_n \subseteq V_n$, such that $f$ is Gâteaux differentiable on $D_n$ and $V_n \setminus D_n$ is Lebesgue-null. By virtue of Lemma 4.3.15,
$$D_f = \bigcap_{n=1}^{\infty} D_n.$$
So $X \setminus D_f$ is a Haar-null set in $X$.
COROLLARY 4.3.18 If $X$ is a separable Banach space, $Y$ is a Banach space with the RNP and $f : X \to Y$ is a locally Lipschitz continuous function, then $f$ is a Gâteaux differentiable function on a set $D_f$ with $X \setminus D_f$ being Haar-null in $X$.
4.4 Duality and Subdifferentials
A basic theme that runs through the whole theory of convex analysis is that of “duality.” Namely almost every mathematical notion is paired with another one, which is in some sense dual to it. So convex cones are associated to their polars (generalizing this way the pairing of tangent and normal spaces in differential geometry), closed convex sets are associated to their support functions (a pairing which permits an interchange between geometric and analytical reasoning), and minimization problems in linear programming are associated to maximization problems, known as the dual problems, which provide valuable information about the solvability and the value of the original problem. In general duality permits us to establish close relations between otherwise disparate properties. The starting point of all these correspondences is a deep duality principle between certain pairs of convex functions (known as conjugate functions), which we study in the first half of this section.
The mathematical framework of our analysis is a Hausdorff locally convex vector space $X$ and its dual $X^*$ (i.e., the set of continuous, linear functionals on $X$). Additional hypotheses will be introduced as needed. We supply $X$ with the $w(X, X^*)$-topology and $X^*$ with the $w(X^*, X)$-topology. So $(X^*, X)$ is a dual pair (or dual system) and we denote by $\langle \cdot, \cdot \rangle_X$ their pairing.
DEFINITION 4.4.1 Let $\varphi : X \to \mathbb{R}^* \stackrel{df}{=} \mathbb{R} \cup \{\pm\infty\}$. The Legendre-Fenchel transform (or the conjugate) of $\varphi$ is the function $\varphi^* : X^* \to \mathbb{R}^*$, defined by
$$\varphi^*(x^*) \stackrel{df}{=} \sup_{x \in X} \big[\langle x^*, x \rangle_X - \varphi(x)\big].$$
The function $(\varphi^*)^* = \varphi^{**} : X \to \mathbb{R}^*$, defined by
$$\varphi^{**}(x) \stackrel{df}{=} \sup_{x^* \in X^*} \big[\langle x^*, x \rangle_X - \varphi^*(x^*)\big],$$
is the second conjugate (or biconjugate) of $\varphi$.
REMARK 4.4.2 If $\varphi$ takes the value $-\infty$, then $\varphi^* \equiv +\infty$. Also if the effective domain $\operatorname{dom}\varphi$ is empty, then $\varphi^* \equiv -\infty$. For this reason of interest is the case where $\varphi : X \to \overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$ and $\operatorname{dom}\varphi \ne \emptyset$ (i.e., $\varphi$ is a proper function; see Definition 4.2.1). In this case $\varphi^* : X^* \to \overline{\mathbb{R}}$ and it is proper too. The significance of $\varphi^*$ is better understood using epigraphs (see Definition 4.2.1). So we have
$$(x^*, \mu) \in \operatorname{epi}\varphi^* \iff \Big[\langle x^*, x \rangle_X - \lambda \le \mu \quad \forall\, (x, \lambda) \in \operatorname{epi}\varphi\Big].$$
If we write the last inequality as
$$\langle x^*, x \rangle_X - \mu \le \lambda \quad \forall\, (x, \lambda) \in \operatorname{epi}\varphi,$$
we see that
$$(x^*, \mu) \in \operatorname{epi}\varphi^* \iff \Big[\langle x^*, x \rangle_X - \mu \le \varphi(x) \quad \forall\, x \in X\Big].$$
So
$$l_{(x^*, \mu)}(x) \stackrel{df}{=} \langle x^*, x \rangle_X - \mu$$
is a continuous affine minorant of $\varphi$. Therefore $\varphi^*$ is proper if and only if $\varphi$ admits a continuous affine minorant. Moreover, $\varphi^*$ describes the family of all continuous affine minorants of $\varphi$. On the other hand also note that
$$\varphi^*(x^*) \le \mu \iff \Big[l_{(x, \lambda)}(x^*) = \langle x^*, x \rangle_X - \lambda \le \mu \quad \forall\, (x, \lambda) \in \operatorname{epi}\varphi\Big].$$
So we see that $\varphi^*$ is the pointwise supremum of the continuous affine functions $\{l_{(x, \lambda)}\}_{(x, \lambda) \in \operatorname{epi}\varphi}$. Therefore $\varphi^{**}$ is the pointwise supremum of all continuous affine functions majorized by $\varphi$. From these observations and recalling that the supremum of continuous affine functions on $X$ is convex and lower semicontinuous, we obtain the following result.
PROPOSITION 4.4.3 If $\varphi : X \to \overline{\mathbb{R}}$ is a proper function, then $\varphi^* \in \Gamma_0(X^*)$.
Also directly from the definition of $\varphi^*$, we obtain the following two results.
PROPOSITION 4.4.4 (Young-Fenchel Inequality) If $\varphi : X \to \mathbb{R}^*$ is a function, then
$$\varphi(x) + \varphi^*(x^*) \ge \langle x^*, x \rangle_X \quad \forall\, x \in X,\ x^* \in X^*.$$
PROPOSITION 4.4.5 If $\varphi, \psi : X \to \mathbb{R}^*$ and
$$\varphi(x) \le \psi(x) \quad \forall\, x \in X,$$
then
$$\varphi^*(x^*) \ge \psi^*(x^*) \quad \forall\, x^* \in X^*.$$
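On the real line the conjugate can be approximated by replacing the supremum over $X$ with a maximum over a finite grid. The sketch below is an illustration under that assumption: it uses $g(x) = x^2/2$, whose conjugate is known in closed form to be $g^*(p) = p^2/2$, and checks the Young-Fenchel inequality on the grid. The grid and the function names are choices made for the example.

```python
def conjugate_on_grid(f, xs, p):
    # Discrete Legendre-Fenchel transform: max over the grid of p*x - f(x).
    return max(p * x - f(x) for x in xs)

def g(x):
    return 0.5 * x * x

# Uniform grid on [-5, 5] with step 0.01.
xs = [i / 100.0 for i in range(-500, 501)]

def young_fenchel_holds(f, xs, p, tol=1e-12):
    # f(x) + f*(p) >= p*x for every grid point x.
    fstar = conjugate_on_grid(f, xs, p)
    return all(f(x) + fstar >= p * x - tol for x in xs)

for p in (-2.0, -0.5, 0.0, 1.25):
    print(p, conjugate_on_grid(g, xs, p), 0.5 * p * p)
```

The grid maximum underestimates the true supremum in general; here the maximizer $x = p$ lies on the grid for the chosen values of $p$, so the two printed columns agree up to rounding.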
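DEFINITION 4.4.6
(a) Let $C \subseteq X$. The support function of the set $C$ is the function $\sigma_C : X^* \to \mathbb{R}^*$, defined by
\[ \sigma_C(x^*) \stackrel{df}{=} \sup_{c\in C} \langle x^*, c\rangle_X \]
(recall that $\sup\emptyset = -\infty$). If $C \ne \emptyset$, then $\sigma_C$ takes values in $\overline{\mathbb{R}}$.
(b) The infimal convolution of functions $\varphi, \psi : X \to \overline{\mathbb{R}}$ is the function $\varphi \oplus \psi : X \to \mathbb{R}^*$, defined by
\[ (\varphi \oplus \psi)(x) \stackrel{df}{=} \inf_{y\in X}\big(\varphi(y) + \psi(x-y)\big) = \inf_{z+y=x}\big(\varphi(z) + \psi(y)\big). \]
We say that $\varphi \oplus \psi$ is exact at $x$, if
\[ (\varphi \oplus \psi)(x) = \min_{y\in X}\big(\varphi(y) + \psi(x-y)\big) \]
(i.e., the infimum is attained). We say that $\varphi \oplus \psi$ is exact, if it is exact at every $x$.

REMARK 4.4.7 Evidently if $C \subseteq X$ is nonempty, then $\sigma_C \in \Gamma_0(X^*)$ and $\sigma_C(0) = 0$. In fact $\sigma_C$ is sublinear (i.e., subadditive and positively homogeneous). Moreover, $\sigma_C = (i_C)^*$, where $i_C$ is the indicator function of the set $C \subseteq X$, i.e.,
\[ i_C(x) \stackrel{df}{=} \begin{cases} 0 & \text{if } x \in C,\\ +\infty & \text{otherwise} \end{cases} \]
(see Remark 4.2.2). If
\[ \operatorname{sepi}\varphi \stackrel{df}{=} \big\{ (x,\lambda) \in X\times\mathbb{R} : \varphi(x) < \lambda \big\} \]
(the strict epigraph of $\varphi$), then it is easy to check that $\operatorname{sepi}(\varphi\oplus\psi) = \operatorname{sepi}\varphi + \operatorname{sepi}\psi$. Also since
\[ (\varphi\oplus\psi)(x) = \inf_{\substack{(z,\lambda)\in\operatorname{epi}\varphi \\ (y,\mu)\in\operatorname{epi}\psi \\ z+y=x}} (\lambda+\mu) = \inf_{(x,\lambda)\in(\operatorname{epi}\varphi+\operatorname{epi}\psi)} \lambda, \]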
4. Smooth and Nonsmooth Analysis and Variational Principles
we see that the infimal convolution of proper, convex functions $\varphi$ and $\psi$ is convex but not necessarily proper. For example, if $\varphi = i_C$ and $\psi = i_D$ and $C, D \subseteq X$ are two nonempty, convex, disjoint sets, then $i_C + i_D \equiv +\infty$. On the other hand, if $\varphi$ and $\psi$ are linear functionals which are not identical, then $\varphi\oplus\psi = -\infty$. In addition note that, if $X$ is a normed space and $C \subseteq X$ is a nonempty set, then for all $x \in X$, we have
\[ d_X(x,C) = \inf_{c\in C}\|x-c\|_X = \inf_{y\in X}\big(\|x-y\|_X + i_C(y)\big) = \big(\|\cdot\|_X \oplus i_C\big)(x). \]
Hence for all nonempty, convex sets $C \subseteq X$, the distance function $d_X(\cdot,C)$ is convex. Moreover, it is easy to see that
\[ \big| d_X(x,C) - d_X(y,C) \big| \le \|x-y\|_X, \]
i.e., $d_X(\cdot,C)$ is nonexpansive. Finally for any index set $I$, we have
\[ \Big(\inf_{i\in I}\varphi_i\Big)^* = \sup_{i\in I}\varphi_i^*. \]
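PROPOSITION 4.4.8 If $\varphi, \psi : X \to \overline{\mathbb{R}}$ are proper, convex functions, then $(\varphi\oplus\psi)^* = \varphi^* + \psi^*$.

PROOF According to Definitions 4.4.1 and 4.4.6(b), and substituting $z = x - y$ in the third line, we have
\begin{align*}
(\varphi\oplus\psi)^*(x^*) &= \sup_{x\in X}\big(\langle x^*, x\rangle_X - (\varphi\oplus\psi)(x)\big) \\
&= \sup_{x\in X}\Big(\langle x^*, x\rangle_X - \inf_{y\in X}\big(\varphi(y) + \psi(x-y)\big)\Big) \\
&= \sup_{y,z\in X}\big(\langle x^*, y\rangle_X - \varphi(y) + \langle x^*, z\rangle_X - \psi(z)\big) \\
&= \varphi^*(x^*) + \psi^*(x^*).
\end{align*}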
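The identity of Proposition 4.4.8 can be checked numerically in one dimension. The following is a minimal sketch and not part of the original text: it assumes $X = \mathbb{R}$, uses $\varphi(x) = x^2/2$ and $\psi(x) = |x|$ as illustrative choices, and replaces every infimum and supremum by a minimum or maximum over a finite grid. The helper names and the grid are hypothetical. Because the grid is bounded, the comparison is only meaningful for $|x^*| \le 1$, where $\psi^*$ is finite.

```python
# Grid check of (phi (+) psi)^* = phi^* + psi^* on X = R (illustrative sketch).

def conjugate_on_grid(values, grid, s):
    """Approximate f^*(s) = sup_x (s*x - f(x)) from tabulated values of f."""
    return max(s * x - fx for x, fx in zip(grid, values))

def inf_convolution_on_grid(f, g, grid):
    """Tabulate (f (+) g)(x) = inf_y (f(y) + g(x - y)) for every x in the grid."""
    return [min(f(y) + g(x - y) for y in grid) for x in grid]

def phi(x):
    return 0.5 * x * x      # assumed example: phi(x) = x^2 / 2

def psi(x):
    return abs(x)           # assumed example: psi(x) = |x|

# Hypothetical grid on [-10, 10] with step 1/40.
grid = [i / 40.0 for i in range(-400, 401)]
phi_values = [phi(x) for x in grid]
psi_values = [psi(x) for x in grid]
conv_values = inf_convolution_on_grid(phi, psi, grid)

def both_sides(s):
    """Return ((phi (+) psi)^*(s), phi^*(s) + psi^*(s)), both computed on the grid."""
    left = conjugate_on_grid(conv_values, grid, s)
    right = conjugate_on_grid(phi_values, grid, s) + conjugate_on_grid(psi_values, grid, s)
    return left, right
```

For these choices the infimal convolution is the Huber function, and for $|x^*| \le 1$ both sides should agree with $(x^*)^2/2$ up to rounding.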
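PROPOSITION 4.4.9 If $X$ and $Y$ are two Hausdorff, locally convex spaces, $A \in \mathcal{L}(X;Y)$ is an isomorphism, $g : Y \to \overline{\mathbb{R}}$ is a proper function and for $y_0 \in Y$, $x_0^* \in X^*$, $\xi_0 \in \mathbb{R}$ and $\lambda_0 > 0$, we set
\[ \varphi(x) \stackrel{df}{=} \lambda_0\, g(Ax + y_0) + \langle x_0^*, x\rangle_X + \xi_0 \quad \forall\, x \in X, \]
then
\[ \varphi^*(x^*) = \lambda_0\, g^*\Big(\tfrac{1}{\lambda_0}(A^{-1})^*(x^* - x_0^*)\Big) - \langle x^* - x_0^*, A^{-1}y_0\rangle_X - \xi_0 \quad \forall\, x^* \in X^*. \]

PROOF Substituting $y = Ax + y_0$, we have
\begin{align*}
\varphi^*(x^*) &= \sup_{x\in X}\big\{ \langle x^*, x\rangle_X - \lambda_0\, g(Ax+y_0) - \langle x_0^*, x\rangle_X - \xi_0 \big\} \\
&= \lambda_0 \sup_{y\in Y}\Big\{ \Big\langle \tfrac{1}{\lambda_0}(A^{-1})^*(x^* - x_0^*), y \Big\rangle_Y - g(y) \Big\} - \langle x^* - x_0^*, A^{-1}y_0\rangle_X - \xi_0 \\
&= \lambda_0\, g^*\Big(\tfrac{1}{\lambda_0}(A^{-1})^*(x^* - x_0^*)\Big) - \langle x^* - x_0^*, A^{-1}y_0\rangle_X - \xi_0.
\end{align*}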
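COROLLARY 4.4.10 If $g : X \to \overline{\mathbb{R}}$ is a proper function and $x_0 \in X$, $x_0^* \in X^*$, $\lambda_0 > 0$, $\vartheta_0 \in \mathbb{R}$, then
(a) for $\varphi(x) = g(x + x_0)$, $\varphi^*(x^*) = g^*(x^*) - \langle x^*, x_0\rangle_X$;
(b) for $\varphi(x) = g(x) + \langle x_0^*, x\rangle_X$, $\varphi^*(x^*) = g^*(x^* - x_0^*)$;
(c) for $\varphi(x) = \lambda_0\, g(x)$, $\varphi^*(x^*) = \lambda_0\, g^*\big(\tfrac{1}{\lambda_0}x^*\big)$;
(d) for $\varphi(x) = \lambda_0\, g\big(\tfrac{1}{\lambda_0}x\big)$, $\varphi^*(x^*) = \lambda_0\, g^*(x^*)$;
(e) for $\varphi(x) = g(\lambda_0 x)$, $\varphi^*(x^*) = g^*\big(\tfrac{1}{\lambda_0}x^*\big)$;
(f) for $\varphi(x) = \lambda_0\, g\big(\tfrac{1}{\lambda_0}x + x_0\big) + \vartheta_0$, $\varphi^*(x^*) = \lambda_0\, g^*(x^*) - \lambda_0\langle x^*, x_0\rangle_X - \vartheta_0$.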
Let us give some examples of conjugate functions.

EXAMPLE 4.4.11
(a) Let $X$ be a normed space and
\[ C = \overline{B}_1 = \big\{ x \in X : \|x\|_X \le 1 \big\}. \]
Then
\[ \sigma_C(x^*) = i_C^*(x^*) = \sup_{c\in\overline{B}_1}\langle x^*, c\rangle_X = \|x^*\|_{X^*}. \]
(b) If $K \subseteq X$ is a cone (i.e., $\lambda K \subseteq K$ for all $\lambda > 0$), then $\sigma_K = i_{-K^*}$, where $K^*$ is the dual cone, i.e.,
\[ K^* \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, x\rangle_X \ge 0 \text{ for all } x \in K \big\}. \]
If $K$ is a linear subspace of $X$, then
\[ K^* = K^\perp \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, x\rangle_X = 0 \text{ for all } x \in K \big\}. \]
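(c) Let $X$ be a normed space and $\varphi(x) = \|x\|_X$. Then we have
\[ \varphi^*(x^*) = \sup_{x\in X}\big[ \langle x^*, x\rangle_X - \|x\|_X \big]. \]
If $\|x^*\|_{X^*} \le 1$, we have $\langle x^*, x\rangle_X \le \|x\|_X$ and so $\varphi^*(x^*) = 0$. On the other hand if $\|x^*\|_{X^*} > 1$, we can find $x \in X$, such that $\|x\|_X < \langle x^*, x\rangle_X$ and so
\[ \langle x^*, \lambda x\rangle_X - \|\lambda x\|_X = \lambda\big[ \langle x^*, x\rangle_X - \|x\|_X \big] > 0 \quad \forall\, \lambda > 0. \]
Letting $\lambda \to +\infty$, we see that $\varphi^*(x^*) = +\infty$. So we conclude that $\varphi^* = i_{\overline{B}_1^*}$, where $\overline{B}_1^* = \big\{ x^* \in X^* : \|x^*\|_{X^*} \le 1 \big\}$.
(d) Let $X$ be a normed space, $C \subseteq X$ a nonempty set and
\[ \varphi(x) \stackrel{df}{=} d_X(x,C) \quad \forall\, x \in X. \]
Then from Remark 4.4.7, we know that $\varphi = \|\cdot\|_X \oplus i_C$. Then by virtue of Proposition 4.4.8, we have that
\[ \varphi^* = \|\cdot\|_X^* + i_C^* = i_{\overline{B}_1^*} + \sigma_C \]
(see (c) and (a) above).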
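(e) If
\[ \varphi(x) \stackrel{df}{=} \langle x_0^*, x\rangle_X + \xi_0 \quad \forall\, x \in X, \]
with $x_0^* \in X^*$ and $\xi_0 \in \mathbb{R}$ (i.e., $\varphi$ is a continuous, affine function), then
\[ \varphi^*(x^*) = \sup_{x\in X}\big[ \langle x^* - x_0^*, x\rangle_X - \xi_0 \big] = \begin{cases} -\xi_0 & \text{if } x^* = x_0^*,\\ +\infty & \text{if } x^* \ne x_0^*. \end{cases} \]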
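(f) If $\varphi : \mathbb{R}^N \to \mathbb{R}$ is defined by
\[ \varphi(x) \stackrel{df}{=} \tfrac{1}{p}\|x\|_{\mathbb{R}^N}^p, \]
with $p \in (1,+\infty)$, then
\[ \varphi^*(x^*) = \tfrac{1}{p'}\|x^*\|_{\mathbb{R}^N}^{p'}, \]
with $\tfrac{1}{p} + \tfrac{1}{p'} = 1$. Indeed, let
\[ g_{x^*}(x) \stackrel{df}{=} (x^*, x)_{\mathbb{R}^N} - \tfrac{1}{p}\|x\|_{\mathbb{R}^N}^p. \]
Then $g_{x^*}$ is concave and
\[ g_{x^*}'(x) = x^* - \|x\|_{\mathbb{R}^N}^{p-2}\, x. \]
We have that $g_{x^*}'(\hat{x}) = 0$ at the unique point $\hat{x}$, such that
\[ (x^*, \hat{x})_{\mathbb{R}^N} = \|\hat{x}\|_{\mathbb{R}^N}^p = \|x^*\|_{\mathbb{R}^N}^{\frac{p}{p-1}}. \]
So if $p' = \frac{p}{p-1}$ and since $\varphi^*(x^*) = \sup_{x\in\mathbb{R}^N} g_{x^*}(x) = g_{x^*}(\hat{x}) = \big(1 - \tfrac{1}{p}\big)\|x^*\|_{\mathbb{R}^N}^{p'}$, we have that
\[ \varphi^*(x^*) = \tfrac{1}{p'}\|x^*\|_{\mathbb{R}^N}^{p'}. \]
More generally, let $X$ be a normed space and let $g : \mathbb{R} \to \mathbb{R}$ be an even, convex function. If
\[ \varphi(x) \stackrel{df}{=} g\big(\|x\|_X\big) \quad \forall\, x \in X, \]
then
\[ \varphi^*(x^*) = g^*\big(\|x^*\|_{X^*}\big) \quad \forall\, x^* \in X^*. \]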
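The closed form in Example 4.4.11(f) can be compared with a brute-force supremum. The sketch below is an illustration under stated assumptions, not part of the original text: it takes $N = 1$, approximates the supremum by a maximum over a bounded grid, and uses hypothetical helper names. The grid must contain the maximizer $\hat{x}$, so only moderate values of $x^*$ are meaningful.

```python
# Compare the grid conjugate of |x|^p / p with |s|^q / q, where 1/p + 1/q = 1 (sketch, N = 1).

def conjugate_on_grid(f, grid, s):
    """Approximate f^*(s) = sup_x (s*x - f(x)) by a maximum over the grid."""
    return max(s * x - f(x) for x in grid)

def power_function(p):
    """Return the function x -> |x|^p / p."""
    return lambda x: abs(x) ** p / p

def conjugate_exponent(p):
    """Return p' = p / (p - 1)."""
    return p / (p - 1.0)

# Hypothetical grid on [-5, 5] with step 0.001.
grid = [i / 1000.0 for i in range(-5000, 5001)]

def compare(p, s):
    """Return (grid approximation of phi^*(s), closed form |s|^q / q)."""
    q = conjugate_exponent(p)
    numeric = conjugate_on_grid(power_function(p), grid, s)
    closed_form = abs(s) ** q / q
    return numeric, closed_form
```

The two values returned by `compare` should agree up to the discretization error of the grid, which is of the order of the squared step size.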
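PROPOSITION 4.4.12 If $\varphi : X \to \overline{\mathbb{R}}$ is a convex and lower semicontinuous function, then $\varphi$ admits a continuous affine minorant, i.e.,
\[ \langle x_0^*, x\rangle_X - \xi_0 \le \varphi(x) \quad \forall\, x \in X, \]
for some $(x_0^*, \xi_0) \in X^*\times\mathbb{R}$.

PROOF Clearly we may assume that $\varphi$ is proper, i.e., $\varphi \in \Gamma_0(X)$. Let $x_0 \in X$ and $\eta \in \mathbb{R}$ be such that $\eta < \varphi(x_0)$. Then $(x_0,\eta) \notin \operatorname{epi}\varphi$ and so by the strong separation theorem (see Theorem A.3.2), we can find $(x_0^*,\vartheta_0) \in X^*\times\mathbb{R}$, $(x_0^*,\vartheta_0) \ne (0,0)$ and $\xi \in \mathbb{R}$, such that
\[ \langle x_0^*, x\rangle_X + \vartheta_0\lambda < \xi < \langle x_0^*, x_0\rangle_X + \vartheta_0\eta \quad \forall\, (x,\lambda) \in \operatorname{epi}\varphi. \]
Let $(x,\lambda) = \big(x,\varphi(x)\big)$. We have
\[ \langle x_0^*, x\rangle_X + \vartheta_0\varphi(x) < \xi < \langle x_0^*, x_0\rangle_X + \vartheta_0\eta, \tag{4.21} \]
so, taking $x = x_0$ with $x_0 \in \operatorname{dom}\varphi$, we get $\vartheta_0 < 0$. Without any loss of generality, we may assume that $\vartheta_0 = -1$. Then from (4.21), we have
\[ \langle x_0^*, x\rangle_X - \varphi(x) < \xi_0 \quad \forall\, x \in X \]
(where $\xi_0$ denotes the constant $\xi$ after this normalization), so
\[ \langle x_0^*, x\rangle_X - \xi_0 < \varphi(x) \quad \forall\, x \in X. \]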
PROPOSITION 4.4.13 For any function $\varphi : X \to \mathbb{R}^*$, we have $\varphi^{**} \le \varphi$.

PROOF From the Young-Fenchel inequality (see Proposition 4.4.4), we have
\[ \varphi^{**}(x) = \sup_{x^*\in X^*}\big[ \langle x^*, x\rangle_X - \varphi^*(x^*) \big] \le \varphi(x) \quad \forall\, x \in X. \]
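The inequality $\varphi^{**} \le \varphi$ can be strict when $\varphi$ is not convex. The following sketch is an illustration and not part of the original text: it assumes $X = \mathbb{R}$, picks a double-well function as a hypothetical example, and computes both conjugations as maxima over finite grids. On grid points the discrete biconjugate stays below $\varphi$ (up to rounding), while at $x = 0$ it drops to the value of the convex envelope.

```python
# Discrete biconjugate of a nonconvex function on X = R (illustrative sketch).

def conjugate_table(values, points, dual_points):
    """Tabulate f^*(s) = max_x (s*x - f(x)) for every s in dual_points."""
    return [max(s * x - fx for x, fx in zip(points, values)) for s in dual_points]

def phi(x):
    # Assumed nonconvex example: a double-well function with minima at x = -1 and x = 1.
    return min((x - 1.0) ** 2, (x + 1.0) ** 2)

points = [i / 20.0 for i in range(-80, 81)]   # hypothetical x-grid on [-4, 4]
slopes = [i / 20.0 for i in range(-80, 81)]   # hypothetical x*-grid on [-4, 4]

phi_values = [phi(x) for x in points]
conj_values = conjugate_table(phi_values, points, slopes)      # phi^* on the slope grid
biconj_values = conjugate_table(conj_values, slopes, points)   # phi^** on the x-grid
```

Here `phi_values` equals 1 at $x = 0$ whereas the corresponding entry of `biconj_values` is 0, which matches the convex envelope of the double well on $[-1,1]$.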
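The next theorem is very important and determines when we have equality in Proposition 4.4.13.

THEOREM 4.4.14 If $\varphi : X \to \overline{\mathbb{R}}$ is a function, then $\varphi^{**} = \varphi$ if and only if $\varphi$ is convex and lower semicontinuous.

PROOF "$\Longrightarrow$": Follows from Remark 4.4.2.

"$\Longleftarrow$": If $\varphi \equiv +\infty$, then $\varphi^* \equiv -\infty$ and so $\varphi = \varphi^{**} \equiv +\infty$. Therefore we may assume that $\varphi$ is proper. We know that we have $\varphi^{**} \le \varphi$. So we need to show that the opposite inequality also holds. To this end let $x \in X$ and $\mu \in \mathbb{R}$ be such that $\mu < \varphi(x)$. Then $(x,\mu) \notin \operatorname{epi}\varphi$ and so we can apply the strong separation theorem (see Theorem A.3.2) and find $(x^*,\beta) \in X^*\times\mathbb{R}$, $(x^*,\beta) \ne (0,0)$ and $\delta > 0$, such that
\[ \langle x^*, y\rangle_X + \beta\lambda \le \langle x^*, x\rangle_X + \beta\mu - \delta \quad \forall\, (y,\lambda) \in \operatorname{epi}\varphi. \]
Since $\lambda$ can increase to $+\infty$, from this inequality it follows that $\beta \le 0$.

First suppose that $\beta < 0$. We have
\[ \langle x^*, y\rangle_X + \beta\varphi(y) < \langle x^*, x\rangle_X + \beta\mu \quad \forall\, y \in X, \]
from which it follows that $(-\beta\varphi)^*(x^*) \le \langle x^*, x\rangle_X + \beta\mu$. Using Corollary 4.4.10(c), we obtain
\[ -\beta\,\varphi^*\Big(\frac{x^*}{-\beta}\Big) \le \langle x^*, x\rangle_X + \beta\mu \]
and thus
\[ \mu \le \Big\langle -\frac{x^*}{\beta}, x \Big\rangle_X - \varphi^*\Big(-\frac{x^*}{\beta}\Big) \le \varphi^{**}(x). \]
Because $\mu < \varphi(x)$ was arbitrary, we infer that $\varphi(x) \le \varphi^{**}(x)$ as desired.

Next assume that $\beta = 0$. We have
\[ \langle x^*, y\rangle_X \le \langle x^*, x\rangle_X - \delta \quad \forall\, y \in \operatorname{dom}\varphi \]
and so, we see that $x \notin \operatorname{dom}\varphi$ and $\varphi(x) = +\infty$. It is enough to show that also $\varphi^{**}(x) = +\infty$. Let $\eta \in \mathbb{R}$ be such that
\[ \langle x^*, y\rangle_X < \eta < \langle x^*, x\rangle_X \quad \forall\, y \in \operatorname{dom}\varphi. \]
As $\varphi$ is bounded below by an affine function (see Proposition 4.4.12), we have that there exist $y^* \in X^*$ and $\vartheta \in \mathbb{R}$, such that
\[ \langle y^*, y\rangle_X - \vartheta \le \varphi(y) \quad \forall\, y \in X. \]
So for all $\gamma > 0$, we have
\[ \langle y^*, y\rangle_X - \vartheta + \gamma\big(\langle x^*, y\rangle_X - \eta\big) \le \varphi(y) \quad \forall\, y \in X. \]
Then
\[ \langle y^* + \gamma x^*, y\rangle_X - \varphi(y) \le \vartheta + \gamma\eta \quad \forall\, y \in X \]
and so $\varphi^*(y^* + \gamma x^*) \le \vartheta + \gamma\eta$. Therefore
\[ \langle y^*, x\rangle_X - \vartheta + \gamma\big(\langle x^*, x\rangle_X - \eta\big) \le \langle y^* + \gamma x^*, x\rangle_X - \varphi^*(y^* + \gamma x^*) \le \varphi^{**}(x). \]
Since $\eta < \langle x^*, x\rangle_X$ and $\gamma > 0$ was arbitrary, we see that the left hand side is arbitrarily large and so $\varphi^{**}(x) = +\infty$. Thus $\varphi(x) \le \varphi^{**}(x)$.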
COROLLARY 4.4.15 If $C \subseteq X$ is a nonempty, closed and convex set, then $x \in C$ if and only if
\[ \langle x^*, x\rangle_X \le \sigma_C(x^*) \quad \forall\, x^* \in X^*. \]
In Proposition 4.4.8, we saw that addition is the dual operation to infimal convolution. The next proposition shows that under some additional conditions, the converse is also true.

PROPOSITION 4.4.16 If $\varphi, \psi : X \to \overline{\mathbb{R}}$ are proper, convex functions and there exists a point $x \in \operatorname{dom}\varphi$, such that $\psi$ is continuous at $x$, then $(\varphi + \psi)^* = \varphi^* \oplus \psi^*$.

Now we pass to the study of the subdifferential starting with convex subdifferentials. The convex subdifferential characterizes the local behaviour of convex functions, in a way which is analogous to that in which derivatives determine the local behaviour of smooth functions (see Section 4.1). In fact we can develop a subdifferential calculus which to a high degree parallels the differential calculus of smooth functions. The mathematical setting remains as before. Namely $X$ is a Hausdorff, locally convex vector space, $X^*$ is its topological dual. The spaces $X$ and $X^*$ are supplied with the $w(X,X^*)$ and $w(X^*,X)$ topologies respectively. Let $\varphi : X \to \overline{\mathbb{R}}$ be a proper, convex function and $x \in \operatorname{dom}\varphi$, $h \in X$. The function
\[ u_x(\lambda) \stackrel{df}{=} \frac{\varphi(x+\lambda h) - \varphi(x)}{\lambda} \]
is increasing on $(0,+\infty)$. So we can make the following definition.

DEFINITION 4.4.17 Let $\varphi : X \to \overline{\mathbb{R}}$ be a proper, convex function and $x_0 \in \operatorname{dom}\varphi$. The directional derivative of $\varphi$ at $x_0$ in the direction $h \in X$ is defined by
\[ \varphi'(x_0;h) \stackrel{df}{=} \inf_{\lambda>0}\frac{\varphi(x_0+\lambda h) - \varphi(x_0)}{\lambda} = \lim_{\lambda\searrow 0}\frac{\varphi(x_0+\lambda h) - \varphi(x_0)}{\lambda}. \]

REMARK 4.4.18 Note that $\varphi'(x_0;h) \in \mathbb{R}^*$ and it is easy to see that $\varphi'(x_0;\cdot)$ is sublinear. Moreover, if $X$ is a Banach space and $\varphi'(x_0;\cdot) \in X^*$, then $\varphi$ is Gâteaux differentiable at $x_0$ and $\varphi'(x_0;\cdot) = \varphi_G'(x_0)$.
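The monotonicity of the difference quotient, which makes the infimum and the limit in Definition 4.4.17 coincide, is easy to observe numerically. The sketch below is an illustration and not part of the original text: it assumes $X = \mathbb{R}$ and the hypothetical example $\varphi(x) = |x| + x^2$ at $x_0 = 0$, for which the quotient equals $|h| + \lambda h^2$ and the directional derivative is $|h|$.

```python
# Difference quotients of a convex function decrease as lam decreases to 0 (sketch on X = R).

def difference_quotient(phi, x0, h, lam):
    """Return (phi(x0 + lam*h) - phi(x0)) / lam for lam > 0."""
    return (phi(x0 + lam * h) - phi(x0)) / lam

def phi(x):
    return abs(x) + x * x    # assumed convex example, not differentiable at 0

def quotients(x0, h, exponents=range(0, 20)):
    """Difference quotients for lam = 1, 1/2, 1/4, ... (decreasing step sizes)."""
    return [difference_quotient(phi, x0, h, 2.0 ** (-k)) for k in exponents]
```

The list returned by `quotients(0.0, h)` is nonincreasing, and its last entries approach $|h|$, a sublinear function of $h$ as stated in Remark 4.4.18.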
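DEFINITION 4.4.19 Let $\varphi : X \to \overline{\mathbb{R}}$ be a proper function and $x_0 \in \operatorname{dom}\varphi$. The subdifferential of $\varphi$ at $x_0$ is the subset $\partial\varphi(x_0)$ (possibly empty) of $X^*$, defined by
\[ \partial\varphi(x_0) \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, y - x_0\rangle_X \le \varphi(y) - \varphi(x_0) \text{ for all } y \in X \big\}. \]

REMARK 4.4.20 From this definition we see that
\[ x^* \in \partial\varphi(x_0) \quad\text{if and only if}\quad x_0 \in \operatorname{argmin}(\varphi - x^*), \]
where
\[ \operatorname{argmin}\psi \stackrel{df}{=} \big\{ x \in X : \psi(x) = \inf_X \psi \big\}. \]
The set $\partial\varphi(x)$ is always a closed and convex subset of $X^*$ and it can be empty (consider for example the subdifferential $\partial\varphi(x)$ when $x \notin \operatorname{dom}\varphi$). The domain of the subdifferential multifunction $\partial\varphi$ is the set
\[ D(\partial\varphi) = \big\{ x \in X : \partial\varphi(x) \ne \emptyset \big\}. \]
The function $\varphi$ is said to be subdifferentiable at $x \in X$, if $x \in D(\partial\varphi)$. The elements of $\partial\varphi(x)$ are called subgradients of $\varphi$ at $x$. Using the epigraph of $\varphi$ we can better understand the geometric meaning of the subdifferential. So $\varphi$ is subdifferentiable at $x \in X$ and $x^* \in X^*$ is a subgradient of $\varphi$ at $x$ if and only if the graph of the continuous function
\[ y \longmapsto \langle x^*, y - x\rangle_X + \varphi(x) \]
is a nonvertical supporting hyperplane to the set $\operatorname{epi}\varphi$ at $\big(x,\varphi(x)\big)$, that is the continuous affine function
\[ l(y) \stackrel{df}{=} \langle x^*, y - x\rangle_X + \varphi(x) \]
is a minorant of $\varphi$ which is exact at $x$, i.e., $l \le \varphi$ and $l(x) = \varphi(x)$. Since $l(x) \le \varphi^{**}(x) \le \varphi(x)$ (see Remark 4.4.2 and Proposition 4.4.13), we infer that if $\partial\varphi(x) \ne \emptyset$, then $\varphi(x) = \varphi^{**}(x)$. Consequently, if $\varphi(x) = \varphi^{**}(x)$, then $\partial\varphi(x) = \partial\varphi^{**}(x)$.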
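PROPOSITION 4.4.21 If $\varphi : X \to \overline{\mathbb{R}}$ is a function, then
\[ x^* \in \partial\varphi(x) \iff \varphi(x) + \varphi^*(x^*) = \langle x^*, x\rangle_X. \]

PROOF "$\Longrightarrow$": From the definition of the subdifferential, we have
\[ \langle x^*, y\rangle_X - \varphi(y) \le \langle x^*, x\rangle_X - \varphi(x) \quad \forall\, y \in X, \]
so $\varphi^*(x^*) + \varphi(x) \le \langle x^*, x\rangle_X$. Since the opposite inequality is always true (see the Young-Fenchel inequality; Proposition 4.4.4), we conclude that $\varphi(x) + \varphi^*(x^*) = \langle x^*, x\rangle_X$.

"$\Longleftarrow$": We have
\[ \langle x^*, x\rangle_X - \varphi(x) = \varphi^*(x^*) \ge \langle x^*, y\rangle_X - \varphi(y) \quad \forall\, y \in X. \]
Therefore
\[ \langle x^*, y - x\rangle_X \le \varphi(y) - \varphi(x) \quad \forall\, y \in X, \]
hence $x^* \in \partial\varphi(x)$.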
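The equivalence in Proposition 4.4.21 can be tested on a simple example. The sketch below is an illustration and not part of the original text: it assumes $X = \mathbb{R}$ and $\varphi(x) = |x|$, approximates $\varphi^*$ by a maximum over a bounded grid, and uses hypothetical helper names. On a bounded grid the value $+\infty$ of $\varphi^*$ shows up only as a large positive number.

```python
# Subgradient test via the definition and via the Fenchel gap (sketch for phi = |x| on R).

def conjugate_on_grid(f, grid, s):
    """Approximate f^*(s) = sup_x (s*x - f(x)) by a maximum over the grid."""
    return max(s * x - f(x) for x in grid)

def fenchel_gap(f, grid, x, s):
    """phi(x) + phi^*(s) - s*x, which is >= 0 by the Young-Fenchel inequality."""
    return f(x) + conjugate_on_grid(f, grid, s) - s * x

def is_subgradient(f, grid, x, s, tol=1e-12):
    """Check s*(y - x) <= f(y) - f(x) for all y in the grid (Definition 4.4.19)."""
    return all(s * (y - x) <= f(y) - f(x) + tol for y in grid)

phi = abs                                      # assumed example: phi(x) = |x|
grid = [i / 10.0 for i in range(-500, 501)]    # hypothetical grid on [-50, 50]
```

For instance, at $x = 0$ the slope $0.5$ passes both tests while the slope $1.5$ fails both, and at $x = 2$ only the slope $1$ has zero gap.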
COROLLARY 4.4.22 If $\varphi : X \to \overline{\mathbb{R}}$ and $x^* \in \partial\varphi(x)$, then $x \in \partial\varphi^*(x^*)$.

PROOF Since $x^* \in \partial\varphi(x)$, we have
\[ \varphi^*(x^*) + \varphi(x) = \langle x^*, x\rangle_X \]
(see Proposition 4.4.21). Then since $\varphi^{**} \le \varphi$ (see Proposition 4.4.13), we obtain
\[ \varphi^*(x^*) + \varphi^{**}(x) \le \langle x^*, x\rangle_X. \]
A new appeal to Proposition 4.4.21 gives that $x \in \partial\varphi^*(x^*)$.

COROLLARY 4.4.23 If $\varphi \in \Gamma_0(X)$, then
\[ x^* \in \partial\varphi(x) \iff x \in \partial\varphi^*(x^*). \]

PROOF Since $\varphi \in \Gamma_0(X)$, we have $\varphi = \varphi^{**}$ (see Theorem 4.4.14). So from Corollary 4.4.22 we conclude the desired equivalence.

Before continuing with the investigation of the subdifferentials in the context of convex functions, let us give some examples of subdifferentials.

EXAMPLE 4.4.24
(a) Let $\varphi : \mathbb{R} \to \overline{\mathbb{R}}$ be a proper, convex function and $x \in \operatorname{int}\operatorname{dom}\varphi$. Then it is easily seen that
\[ \partial\varphi(x) = \big[ \varphi_-'(x), \varphi_+'(x) \big]. \]
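(b) Let $X$ be a Banach space and $\varphi(x) \stackrel{df}{=} \|x\|_X$. If $x \ne 0$, then
\[ \partial\varphi(x) = \big\{ x^* \in X^* : \|x^*\|_{X^*} = 1,\ \langle x^*, x\rangle_X = \|x\|_X \big\}. \]
Indeed, let $x^* \in X^*$ be such that $\|x^*\|_{X^*} = 1$ and $\langle x^*, x\rangle_X = \|x\|_X$. Then
\[ \langle x^*, y\rangle_X \le \|y\|_X \quad \forall\, y \in X \]
and so
\[ \langle x^*, y - x\rangle_X \le \|y\|_X - \|x\|_X, \]
hence $x^* \in \partial\varphi(x)$. On the other hand, let $x^* \in \partial\varphi(x)$. Then (taking $y = 0$ and $y = 2x$ in the definition)
\[ -\|x\|_X \ge -\langle x^*, x\rangle_X \quad\text{and}\quad \|x\|_X = 2\|x\|_X - \|x\|_X \ge \langle x^*, x\rangle_X, \]
from which we infer that $\langle x^*, x\rangle_X = \|x\|_X$. Also
\[ \langle x^*, \lambda y\rangle_X \le \|x + \lambda y\|_X - \|x\|_X \quad \forall\, y \in X,\ \lambda > 0, \]
hence
\[ \langle x^*, y\rangle_X \le \Big\| \tfrac{1}{\lambda}x + y \Big\|_X - \tfrac{1}{\lambda}\|x\|_X. \]
Let $\lambda \to +\infty$, to obtain
\[ \langle x^*, y\rangle_X \le \|y\|_X, \]
from which it follows that $\|x^*\|_{X^*} \le 1$. But since $\langle x^*, x\rangle_X = \|x\|_X$, we conclude that $\|x^*\|_{X^*} = 1$. If $x = 0$, then
\[ \partial\varphi(0) = \overline{B}_1^* = \big\{ x^* \in X^* : \|x^*\|_{X^*} \le 1 \big\}. \]
Indeed note that
\[ x^* \in \partial\varphi(0) \iff \big[ \langle x^*, x\rangle_X \le \|x\|_X \quad \forall\, x \in X \big] \]
and the last inequality is equivalent to saying that $\|x^*\|_{X^*} \le 1$.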
(c) Let $C$ be a closed, convex set in $X$ and $\varphi(x) \stackrel{df}{=} i_C(x)$. Then
\[ \partial\varphi(x) = N_C(x) \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, c - x\rangle_X \le 0 \text{ for all } c \in C \big\} = \big\{ x^* \in X^* : \langle x^*, x\rangle_X = \sigma_C(x^*) \big\}. \]
For $x \in C$, the set $\partial\varphi(x) = N_C(x)$ is a nonempty (because $0 \in \partial\varphi(x) = N_C(x)$), closed and convex cone in $X^*$, known as the normal cone to $C$ at $x$. It generalizes the notion of normal space (see Definition A.1.12(b)) in differential geometry. If $x \notin C$, then $\partial\varphi(x) = N_C(x) = \emptyset$. So $D(\partial\varphi) = C$ and
\[ \partial\varphi(x) = N_C(x) = \{0\} \quad \forall\, x \in \operatorname{int} C. \]
If $C = V$ is a linear subspace of $X$, then
\[ \partial\varphi(x) = N_V(x) = V^\perp = \big\{ x^* \in X^* : \langle x^*, v\rangle_X = 0 \text{ for all } v \in V \big\} \quad \forall\, x \in V. \]
For convex functions we have an easy criterion for subdifferentiability at $x \in X$.

PROPOSITION 4.4.25 If $X$ is a Banach space and $\varphi : X \to \overline{\mathbb{R}}$ is a convex function which is continuous at $x \in X$, then $\partial\varphi(x) \ne \emptyset$ and $\partial\varphi(x)$ is $w^*$-compact and convex in $X^*$.

PROOF From Theorem 4.2.3, we know that $\operatorname{int}\operatorname{epi}\varphi \ne \emptyset$. Since $\big(x,\varphi(x)\big)$ belongs to the boundary of $\operatorname{epi}\varphi$, we can apply the weak separation theorem (see Theorem A.3.1) and find $(x^*,\eta) \in X^*\times\mathbb{R}$, with $(x^*,\eta) \ne (0,0)$, such that
\[ \eta\big(\varphi(x) - \lambda\big) \le \langle -x^*, x - y\rangle_X \quad \forall\, (y,\lambda) \in \operatorname{epi}\varphi. \tag{4.22} \]
Since for fixed $y \in \operatorname{dom}\varphi$, $\lambda$ can increase up to $+\infty$, from (4.22), we infer that $\eta \ge 0$. If $\eta = 0$, then
\[ \langle -x^*, x - y\rangle_X \ge 0 \quad \forall\, y \in \operatorname{dom}\varphi. \]
But $x \in \operatorname{int}\operatorname{dom}\varphi$ (see Theorem 4.2.3). So $x^* = 0$, a contradiction. So $\eta > 0$ and we take $\eta = 1$. Then from (4.22) with $\lambda = \varphi(y)$, we have
\[ \varphi(x) - \varphi(y) \le \langle -x^*, x - y\rangle_X, \]
so $-x^* \in \partial\varphi(x) \ne \emptyset$. From Theorems 4.2.3 and 4.2.7, we know that there exists $r > 0$, such that $\varphi|_{B_r(x)}$ is Lipschitz continuous. So for every $x^* \in \partial\varphi(x)$ we have
\[ \langle x^*, u\rangle_X \le \varphi(x+u) - \varphi(x) \le k\|u\|_X \quad \forall\, u \in \overline{B}_r(0), \]
for some $k > 0$ and so $\|x^*\|_{X^*} \le k$. By Alaoglu's theorem (see Theorem A.3.9) and since $\partial\varphi(x)$ is clearly $w^*$-closed, we conclude that it is $w^*$-compact and convex.

REMARK 4.4.26 The result is actually true in the more general context of dual pairs of locally convex spaces. However, since the material of Section 4.2 was developed in the context of Banach spaces and to avoid introducing additional functional analytic material, we have stated the result in Banach spaces.
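In fact for a continuous, convex function $\varphi$ we can describe the subdifferential completely.

PROPOSITION 4.4.27 If $X$ is a Banach space and $\varphi : X \to \overline{\mathbb{R}}$ is a convex function which is continuous at $x \in X$, then
\[ \sigma_{\partial\varphi(x)}(h) = \varphi'(x;h) \quad \forall\, h \in X. \]

PROOF Let
\[ \psi(h) \stackrel{df}{=} \varphi'(x;h) \quad \forall\, h \in X. \]
Since $\varphi$ is continuous at $x \in X$, we have $\partial\varphi(x) \ne \emptyset$ (see Proposition 4.4.25). So we have
\[ \langle x^*, h\rangle_X \le \psi(h) \le \varphi(x+h) - \varphi(x) \quad \forall\, h \in X,\ x^* \in \partial\varphi(x), \]
so $\psi$ is finite everywhere, hence continuous on $X$. Also using Proposition 4.4.9, we see that the conjugate of the function
\[ \psi_\lambda(h) \stackrel{df}{=} \tfrac{1}{\lambda}\big[ \varphi(x+\lambda h) - \varphi(x) \big], \quad \lambda > 0, \]
is the function
\[ \psi_\lambda^*(x^*) = \tfrac{1}{\lambda}\big[ \varphi^*(x^*) + \varphi(x) - \langle x^*, x\rangle_X \big], \quad \lambda > 0 \]
(substitute $y = x + \lambda h$ in the supremum defining $\psi_\lambda^*$). Since $\psi = \inf_{\lambda>0}\psi_\lambda$, we have that
\[ \psi^* = \sup_{\lambda>0}\psi_\lambda^* \]
(see Remark 4.4.2). Therefore
\[ \psi^*(x^*) = \sup_{\lambda>0}\tfrac{1}{\lambda}\big[ \varphi^*(x^*) + \varphi(x) - \langle x^*, x\rangle_X \big]. \]
The bracket is nonnegative by Proposition 4.4.4 and vanishes exactly when $x^* \in \partial\varphi(x)$ by Proposition 4.4.21, so we have
\[ \psi^*(x^*) = \begin{cases} 0 & \text{if } x^* \in \partial\varphi(x),\\ +\infty & \text{otherwise}, \end{cases} \]
i.e., $\psi^* = i_{\partial\varphi(x)}$ and so $\psi^{**} = \psi = \sigma_{\partial\varphi(x)}$ (see Theorem 4.4.14).

REMARK 4.4.28 Again the result remains valid in the framework of dual pairs of locally convex spaces.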
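Next we show that for convex functions the case of Gâteaux differentiability is essentially the same as that of uniqueness of the subgradient.

PROPOSITION 4.4.29 Let $X$ be a Banach space and let $\varphi : X \to \overline{\mathbb{R}}$ be a proper, convex function.
(a) If $\varphi$ is Gâteaux differentiable at $x$, then $x \in D(\partial\varphi)$ and $\partial\varphi(x) = \big\{ \varphi_G'(x) \big\}$.
(b) If $\varphi$ is continuous at $x$ and $\partial\varphi(x)$ is a singleton, then $\varphi$ is Gâteaux differentiable at $x$ and $\partial\varphi(x) = \big\{ \varphi_G'(x) \big\}$.

PROOF (a) Due to the convexity and Gâteaux differentiability of $\varphi$ at $x$, we have
\[ \langle \varphi_G'(x), h\rangle_X \le \tfrac{1}{\lambda}\big[ \varphi(x+\lambda h) - \varphi(x) \big] \le \varphi(x+h) - \varphi(x) \quad \forall\, \lambda \in (0,1),\ h \in X, \]
so $\varphi_G'(x) \in \partial\varphi(x)$. Let $x^* \in X^*$ be any element of $\partial\varphi(x)$. We have
\[ \langle x^*, h\rangle_X \le \tfrac{1}{\lambda}\big[ \varphi(x+\lambda h) - \varphi(x) \big] \quad \forall\, \lambda > 0,\ h \in X, \]
so, letting $\lambda \searrow 0$,
\[ \langle x^*, h\rangle_X \le \langle \varphi_G'(x), h\rangle_X \quad \forall\, h \in X \]
and thus $x^* = \varphi_G'(x)$, i.e., $\partial\varphi(x) = \big\{ \varphi_G'(x) \big\}$.

(b) Since $\varphi$ is convex, we have
\[ \varphi(x) + \lambda\varphi'(x;h) \le \varphi(x+\lambda h) \quad \forall\, \lambda \in \mathbb{R},\ h \in X. \]
So the straight line
\[ L \stackrel{df}{=} \big\{ \big(x+\lambda h, \varphi(x) + \lambda\varphi'(x;h)\big) : \lambda \in \mathbb{R} \big\} \]
does not intersect $\operatorname{int}\operatorname{epi}\varphi \ne \emptyset$ (see Theorem 4.2.3). Then by the weak separation theorem (see Theorem A.3.1), we can find a closed hyperplane $H$ containing the line $L$, such that $H \cap \operatorname{int}\operatorname{epi}\varphi = \emptyset$. The hyperplane $H$ is the graph of a continuous affine function $l$ on $X$, such that $l(x) = \varphi(x)$. Since by hypothesis $\partial\varphi(x) = \{x^*\}$, the slope of $l$ is $x^*$ and because $L \subseteq H$, we have
\[ \varphi'(x;h) = \langle x^*, h\rangle_X \quad \forall\, h \in X. \]
Thus $\varphi$ is Gâteaux differentiable at $x$ and $\partial\varphi(x) = \big\{ \varphi_G'(x) \big\}$.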
The next proposition explains the central role of the subdifferential in optimization theory. It is a direct consequence of Definition 4.4.19.

PROPOSITION 4.4.30 If $\varphi : X \to \overline{\mathbb{R}}$ is a proper function, then $\varphi$ attains its minimum at $x \in \operatorname{dom}\varphi$ if and only if $0 \in \partial\varphi(x)$.

Next we will establish some basic rules of the subdifferential calculus. We start with two straightforward observations. Here $\varphi, \psi : X \to \overline{\mathbb{R}}$ are proper functions. We have
\[ \partial(\lambda\varphi)(x) = \lambda\,\partial\varphi(x) \quad \forall\, \lambda > 0,\ x \in X \tag{4.23} \]
and
\[ \partial\varphi(x) + \partial\psi(x) \subseteq \partial(\varphi+\psi)(x) \quad \forall\, x \in \operatorname{dom}\varphi \cap \operatorname{dom}\psi. \tag{4.24} \]
The next proposition provides a simple situation where equality in (4.24) is realized.

PROPOSITION 4.4.31 If $\varphi, \psi : X \to \overline{\mathbb{R}}$ are proper, convex functions and there exists $\hat{x} \in \operatorname{dom}\varphi \cap \operatorname{dom}\psi$ where $\varphi$ is continuous, then
\[ \partial\varphi(x) + \partial\psi(x) = \partial(\varphi+\psi)(x) \quad \forall\, x \in X. \]

PROOF Because of (4.24), we need to show that
\[ \partial\varphi(x) + \partial\psi(x) \supseteq \partial(\varphi+\psi)(x) \quad \forall\, x \in X. \tag{4.25} \]
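To this end let $x^* \in \partial(\varphi+\psi)(x)$. Then $x \in \operatorname{dom}\varphi \cap \operatorname{dom}\psi$ and
\[ \psi(x) - \psi(y) \le \varphi(y) - \varphi(x) - \langle x^*, y - x\rangle_X = g(y) \quad \forall\, y \in X. \]
We introduce the following two sets
\[ C_1 \stackrel{df}{=} \operatorname{epi} g \quad\text{and}\quad C_2 \stackrel{df}{=} \big\{ (y,\mu) \in X\times\mathbb{R} : \mu \le \psi(x) - \psi(y) \big\}. \]
Both sets are convex and by virtue of Theorem 4.2.3, $\operatorname{int} C_1 \ne \emptyset$. Also $\operatorname{int} C_1 \cap C_2 = \emptyset$. Indeed,
\[ g(y) \le \mu \le \psi(x) - \psi(y) \quad \forall\, (y,\mu) \in \operatorname{int} C_1 \cap C_2 \]
and so $g(y) = \mu$. Because $(y,\mu) \in \operatorname{int} C_1$, we have that $(y,\mu-\varepsilon) \in C_1$ for $\varepsilon > 0$ small and so $g(y) \le \mu - \varepsilon$, a contradiction. Since $\operatorname{int} C_1 \cap C_2 = \emptyset$, we can apply the weak separation theorem (see Theorem A.3.1) and produce $(z^*,\eta) \in X^*\times\mathbb{R}$, $(z^*,\eta) \ne (0,0)$, such that
\[ \langle z^*, z\rangle_X + \eta\lambda \le \langle z^*, y\rangle_X + \eta\mu \quad \forall\, (z,\lambda) \in C_1,\ (y,\mu) \in C_2 \tag{4.26} \]
and the inequality is strict if $(z,\lambda) \in \operatorname{int} C_1$. Note that $(x,0) \in C_2$. Then from (4.26) and since $\lambda$ can increase to $+\infty$, we obtain $\eta \le 0$. If $\eta = 0$, then
\[ \langle z^*, z\rangle_X \le \langle z^*, x\rangle_X \quad \forall\, z \in \operatorname{dom} g. \]
But since $g$ is continuous at $x$, $\operatorname{dom} g$ is a neighbourhood of $x$, hence $z^* = 0$, a contradiction to the fact that $(z^*,\eta) \ne (0,0)$. So $\eta < 0$ and we may assume that $\eta = -1$. Then from (4.26), we have
\[ \langle z^*, z\rangle_X - g(z) \le \langle z^*, x\rangle_X \le \langle z^*, y\rangle_X - \big(\psi(x) - \psi(y)\big) \quad \forall\, z \in \operatorname{dom} g,\ y \in \operatorname{dom}\psi. \]
From the second inequality we have that $-z^* \in \partial\psi(x)$, while from the first we have that $x^* + z^* \in \partial\varphi(x)$. Then
\[ x^* = x^* + z^* + (-z^*) \in \partial\varphi(x) + \partial\psi(x) \]
and we have proved (4.25).

REMARK 4.4.32 Of course the result is also true for any family $\{\varphi_i\}_{i=1}^n$ of proper, convex functions on $X$, such that there exists $x \in \bigcap_{i=1}^n \operatorname{dom}\varphi_i$, where all but one of the functions are continuous.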
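PROPOSITION 4.4.33 If $A \in \mathcal{L}(X;Y)$ and $\varphi : Y \to \overline{\mathbb{R}}$ is a proper function, then
\[ A^*\partial\varphi(Ax) \subseteq \partial(\varphi\circ A)(x) \quad \forall\, x \in X \]
and equality holds if in addition $\varphi$ is convex and continuous at a point in the range of $A$.

PROOF The inclusion follows at once from the definitions. Let us prove that equality holds when $\varphi$ is convex and continuous at a point in the range of $A$. So let $x^* \in \partial(\varphi\circ A)(x)$. We have
\[ \langle x^*, z - x\rangle_X + (\varphi\circ A)(x) \le (\varphi\circ A)(z) \quad \forall\, z \in X. \tag{4.27} \]
Let
\[ L \stackrel{df}{=} \big\{ \big(Az, \langle x^*, z - x\rangle_X + (\varphi\circ A)(x)\big) \in Y\times\mathbb{R} : z \in X \big\}. \]
This is an affine subspace of $Y\times\mathbb{R}$ and because of (4.27), $L$ and $\operatorname{epi}\varphi$ have only boundary points in common, that is $L \cap \operatorname{int}\operatorname{epi}\varphi = \emptyset$ (note that by Theorem 4.2.3, $\operatorname{int}\operatorname{epi}\varphi \ne \emptyset$). So we can apply the weak separation theorem (see Theorem A.3.1) and find a closed hyperplane $H$ containing $L$, such that $H \cap \operatorname{int}\operatorname{epi}\varphi = \emptyset$. The hyperplane $H$ is the graph of a continuous affine function
\[ l(y) \stackrel{df}{=} \langle y^*, y\rangle_Y + \mu \quad \forall\, y \in Y, \]
with $(y^*,\mu) \in Y^*\times\mathbb{R}$. Since $H \supseteq L$, we have
\[ \langle y^*, Az\rangle_Y + \mu = \langle x^*, z - x\rangle_X + (\varphi\circ A)(x) \quad \forall\, z \in X, \]
so taking $z = 0$, we have
\[ \mu = (\varphi\circ A)(x) - \langle x^*, x\rangle_X \quad\text{and}\quad \langle y^*, Az\rangle_Y = \langle x^*, z\rangle_X \quad \forall\, z \in X. \]
From the second equality, we infer that $x^* = A^*y^*$. Also since $H \cap \operatorname{int}\operatorname{epi}\varphi = \emptyset$, we have
\[ \langle y^*, y\rangle_Y + (\varphi\circ A)(x) - \langle A^*y^*, x\rangle_X \le \varphi(y) \quad \forall\, y \in Y, \]
so
\[ \langle y^*, y - Ax\rangle_Y + (\varphi\circ A)(x) \le \varphi(y) \quad \forall\, y \in Y \]
and thus $y^* \in \partial\varphi\big(A(x)\big)$. We infer that
\[ \partial(\varphi\circ A)(x) \subseteq A^*\partial\varphi(Ax) \quad \forall\, x \in X. \]
Therefore equality must hold.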
Next we study the multifunction $\partial\varphi : X \to 2^{X^*}$. The first result explains the connection between subdifferentials and maximal monotone maps and it generalizes the elementary fact that if $\varphi : \mathbb{R} \to \mathbb{R}$ is a continuous, convex function, then $\varphi'$ is increasing.

THEOREM 4.4.34 If $X$ is a reflexive Banach space and $\varphi \in \Gamma_0(X)$, then $\partial\varphi : X \to 2^{X^*}$ is a maximal monotone map.

PROOF Using Troyanski's renorming theorem (see Theorem A.3.23), we may assume that both $X$ and $X^*$ are locally uniformly convex (see Definition A.3.21). Let $\mathcal{F} : X \to X^*$ be the duality map of $X$. By Proposition 3.2.27, $\mathcal{F}$ is a homeomorphism. Now, directly from the definition, we see that $\partial\varphi$ is monotone. So by virtue of Theorem 3.2.29 to prove the maximal monotonicity of $\partial\varphi$, it suffices to show that
\[ R\big(\partial\varphi + \mathcal{F}\big) = X^*. \tag{4.28} \]
To this end let $x^* \in X^*$ and consider the function $\psi : X \to \overline{\mathbb{R}}$, defined by
\[ \psi(x) \stackrel{df}{=} \tfrac{1}{2}\|x\|_X^2 + \varphi(x) - \langle x^*, x\rangle_X \quad \forall\, x \in X. \]
Evidently $\psi \in \Gamma_0(X)$ and $\psi(x) \to +\infty$ as $\|x\|_X \to +\infty$ (recall that $\varphi$ is bounded below by a continuous affine function; see Proposition 4.4.12). So by the Weierstrass theorem, we can find $x_0 \in \operatorname{dom}\psi$, such that
\[ \psi(x_0) = \inf_X \psi. \]
Then from Proposition 4.4.30, we have that $0 \in \partial\psi(x_0)$. Using Proposition 4.4.31 (see also Remark 4.4.32), we have
\[ \partial\psi(x_0) = \mathcal{F}(x_0) + \partial\varphi(x_0) - x^* \]
(recall that $\partial\big(\tfrac{1}{2}\|\cdot\|_X^2\big)(x_0) = \mathcal{F}(x_0)$; see Example 3.2.20(d)). Hence
\[ 0 \in \partial\varphi(x_0) + \mathcal{F}(x_0) - x^* \]
and so $x^* \in \partial\varphi(x_0) + \mathcal{F}(x_0)$. Because $x^* \in X^*$ was arbitrary, we conclude that (4.28) holds and thus $\partial\varphi$ is maximal monotone.

REMARK 4.4.35 The result is actually true for $X$ being any Banach space. For a proof of the result in this general case we refer to Rockafellar (1970b) (see also Phelps (1993, p. 59)).

Now we obtain some additional properties which characterize the subdifferentials within the class of maximal monotone maps.
DEFINITION 4.4.36 Let $X$ be a Banach space and $A : X \to 2^{X^*}$. We say that $A$ is $n$-cyclically monotone provided that
\[ \sum_{k=0}^{n} \langle x_k^*, x_k - x_{k+1}\rangle_X \ge 0, \]
whenever $n > 1$ and $x_0, x_1, \ldots, x_n \in X$, $x_{n+1} = x_0$ and
\[ x_k^* \in A(x_k) \quad \forall\, k \in \{0, 1, \ldots, n\}. \]
We say that A is cyclically monotone, if it is n-cyclically monotone for every n > 2. The map A is maximal cyclically monotone, if its graph is not properly included in the graph of a cyclically monotone map.
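The cyclic sum in Definition 4.4.36 is straightforward to evaluate for single-valued maps on $\mathbb{R}^N$. The sketch below is an illustration and not part of the original text; the two maps and the helper names are hypothetical choices. The map $x \mapsto x^3$ on $\mathbb{R}$ is the derivative of the convex function $x^4/4$, whereas the rotation by 90 degrees on $\mathbb{R}^2$ satisfies $\langle Rx, x\rangle = 0$, so it is monotone, yet the particular 3-point cycle used here gives a negative cyclic sum.

```python
# Evaluate the cyclic sum  sum_k <x_k^*, x_k - x_{k+1}>  with x_{n+1} = x_0 (illustrative sketch).

def cyclic_sum(points, values):
    """points[k] is x_k and values[k] is x_k^*, both given as tuples of floats."""
    n = len(points)
    total = 0.0
    for k in range(n):
        nxt = points[(k + 1) % n]   # cyclic convention: the successor of the last point is x_0
        total += sum(v * (a - b) for v, a, b in zip(values[k], points[k], nxt))
    return total

def cube(point):
    """Componentwise cube; on R this is the derivative of the convex function x^4 / 4."""
    return tuple(c ** 3 for c in point)

def rotate(point):
    """Rotation by 90 degrees in R^2; it satisfies <R x, x> = 0 for every x."""
    x, y = point
    return (-y, x)
```

With the cycle $(1,0), (0,1), (-1,0)$ the rotation gives the cyclic sum $-2$, while any two-point cycle gives $0$ for it; the cube map gives a nonnegative sum on the sample cycle used in the check.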
REMARK 4.4.37 Clearly a 2-cyclically monotone map is monotone. So every cyclically monotone map is monotone.

PROPOSITION 4.4.38 Every monotone map $f : \mathbb{R} \to 2^{\mathbb{R}}$ is cyclically monotone.

PROOF Let $x_0, x_1, \ldots, x_n \in D(f)$ and
\[ x_k^* \in f(x_k) \quad \forall\, k \in \{0, 1, \ldots, n\}. \]
We may assume that $x_k \le x_{k+1}$ for all $k \in \{0, 1, \ldots, n-1\}$. Then $x_k^* \le x_{k+1}^*$ for all $k \in \{0, 1, \ldots, n-1\}$ and, since $x_n - x_0 = -\sum_{k=0}^{n-1}(x_k - x_{k+1})$, we have
\begin{align*}
\sum_{k=0}^{n} x_k^*(x_k - x_{k+1}) &= \sum_{k=0}^{n-1} x_k^*(x_k - x_{k+1}) + x_n^*(x_n - x_0) \\
&= \sum_{k=0}^{n-1} (x_k^* - x_n^*)(x_k - x_{k+1}) \ge 0
\end{align*}
(recall that $x_{n+1} = x_0$; both factors in each term are nonpositive).

Directly from the definition, we see that if $\varphi : X \to \overline{\mathbb{R}}$ is a proper, convex function, then $\partial\varphi$ is cyclically monotone. Moreover, if $\varphi \in \Gamma_0(X)$, then by virtue of Theorem 4.4.34 and Remark 4.4.35, we see that $\partial\varphi$ is maximal cyclically monotone. It turns out that subdifferentials are the only maximal cyclically monotone maps.

THEOREM 4.4.39 If $X$ is a Banach space, then a map $A : X \to 2^{X^*}$ is maximal cyclically monotone if and only if there exists $\varphi \in \Gamma_0(X)$, such that $A = \partial\varphi$.

PROOF
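"$\Longrightarrow$": Let us fix $(x_0, x_0^*) \in \operatorname{Gr} A$ and for every $x \in X$ we define
\[ \varphi(x) \stackrel{df}{=} \sup\Big\{ \sum_{k=0}^{n-1} \langle x_k^*, x_{k+1} - x_k\rangle_X + \langle x_n^*, x - x_n\rangle_X : (x_k, x_k^*) \in \operatorname{Gr} A,\ k \in \{1, \ldots, n\},\ n \ge 1 \Big\}. \]
Because $\varphi$ is the supremum of continuous affine functions, it follows that $\varphi$ is convex and lower semicontinuous. Moreover, since
\[ \sum_{k=0}^{n} \langle x_k^*, x_{k+1} - x_k\rangle_X \le 0 \quad\text{when } x_{n+1} = x_0 \]
(due to the cyclical monotonicity of $A$), it follows that $\varphi$ is proper, that is $\varphi \in \Gamma_0(X)$. Let $(x, x^*) \in \operatorname{Gr} A$ and $y \in X$. Since in the definition of $\varphi$, $n$ is arbitrary, we have
\[ \varphi(y) \ge \sum_{k=0}^{n-1} \langle x_k^*, x_{k+1} - x_k\rangle_X + \langle x_n^*, x - x_n\rangle_X + \langle x^*, y - x\rangle_X \]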
(i.e., we have added the point $(x, x^*) \in \operatorname{Gr} A$ in the definition of $\varphi$). Hence, taking the supremum over the remaining points, we obtain
\[ \varphi(y) \ge \varphi(x) + \langle x^*, y - x\rangle_X \quad \forall\, y \in X, \]
so $x^* \in \partial\varphi(x)$. Since $(x, x^*) \in \operatorname{Gr} A$ was arbitrary, we infer that $\operatorname{Gr} A \subseteq \operatorname{Gr}\partial\varphi$. Due to the maximality of $A$, we conclude that $\operatorname{Gr} A = \operatorname{Gr}\partial\varphi$, hence $A = \partial\varphi$.

"$\Longleftarrow$": See the remark before the statement of the theorem.

REMARK 4.4.40 In fact it can be shown that $\varphi \in \Gamma_0(X)$ is unique up to an additive constant; see Rockafellar (1970b).

COROLLARY 4.4.41 Any maximal monotone map $f : \mathbb{R} \to 2^{\mathbb{R}}$ has the form
\[ f(x) = \big[ g_-'(x), g_+'(x) \big], \]
with $g \in \Gamma_0(\mathbb{R})$.

We can use Theorem 4.4.39 to characterize self-adjoint positive operators in a Hilbert space.

PROPOSITION 4.4.42 If $H$ is a Hilbert space and $A : H \supseteq D(A) \to H$ is a linear maximal monotone operator, then $A$ is maximal cyclically monotone if and only if $A$ is self-adjoint.

PROOF "$\Longrightarrow$": By virtue of Theorem 4.4.39, we can find $\varphi \in \Gamma_0(H)$, such that $A = \partial\varphi$. Because $A(0) = 0$ and using Remark 4.4.40, we may assume that $\varphi(0) = 0$. For $x \in D(A)$, let
\[ g(t) \stackrel{df}{=} \varphi(tx) \quad \forall\, t \in [0,1]. \]
From Proposition 4.4.33, we have that $\partial g(t) = \big(\partial\varphi(tx), x\big)_H$. Using the definition of subdifferential, we infer that
\[ \big| g(t) - g(s) \big| \le \big(A(x), x\big)_H\, |t - s| \quad \forall\, t, s \in [0,1] \]
and so $g$ is differentiable for almost all $t \in [0,1]$ and
\[ \frac{d}{dt}g(t) = t\,\big(A(x), x\big)_H. \]
Then we have
\[ g(1) - g(0) = \int_0^1 g'(t)\,dt = \Big(\int_0^1 t\,dt\Big)\big(A(x), x\big)_H = \tfrac{1}{2}\big(A(x), x\big)_H, \]
so
\[ \varphi(x) = \tfrac{1}{2}\big(A(x), x\big)_H \quad \forall\, x \in D(A). \]
Then via an easy calculation, we obtain that
\[ \big(\partial\varphi(x), y\big)_H = \tfrac{1}{2}\big[ \big(A(x), y\big)_H + \big(x, A(y)\big)_H \big] \quad \forall\, x, y \in D(A). \]
Since $\big(\partial\varphi(x), y\big)_H = \big(A(x), y\big)_H$, we obtain
\[ \big(A(x), y\big)_H = \big(x, A(y)\big)_H \quad \forall\, x, y \in D(A) \]
and so $A \subseteq A^*$. But $A^*$ is monotone (see Theorem 3.2.58) and $A$ is maximal. Therefore $A = A^*$ and we conclude that $A$ is self-adjoint.

"$\Longleftarrow$": Since $A$ is self-adjoint, maximal monotone, there exists a square root $A^{\frac12}$ of $A$ with the same properties (see Kato (1976, p. 281)). So $A^{\frac12}$ is closed (see Theorem 3.2.58) and if we set
\[ \varphi(x) \stackrel{df}{=} \begin{cases} \tfrac{1}{2}\big\| A^{\frac12}x \big\|_H^2 & \text{if } x \in D\big(A^{\frac12}\big),\\ +\infty & \text{otherwise} \end{cases} \quad \forall\, x \in H, \]
then $\varphi \in \Gamma_0(H)$. Since
\[ \big(A(x), y\big)_H = \big(A^{\frac12}(x), A^{\frac12}(y)\big)_H \quad \forall\, (x,y) \in D(A)\times D\big(A^{\frac12}\big), \]
we have
\[ \big(A(x), y - x\big)_H \le \tfrac{1}{2}\big\| A^{\frac12}(y) \big\|_H^2 - \tfrac{1}{2}\big\| A^{\frac12}(x) \big\|_H^2 \quad \forall\, (x,y) \in D(A)\times D\big(A^{\frac12}\big), \]
so $A(x) \in \partial\varphi(x)$, i.e., $A \subseteq \partial\varphi$. Because $A$ is maximal, we conclude that $A = \partial\varphi$.
Using Proposition 3.2.14, we obtain the following result.

PROPOSITION 4.4.43 If $X$ is a reflexive Banach space and $\varphi : X \to \mathbb{R}$ is a continuous, convex function, then $\partial\varphi : X \to 2^{X^*}\setminus\{\emptyset\}$ is upper semicontinuous from $X$ with norm topology into $X^*$ with the weak topology.

REMARK 4.4.44 The result is true if $X$ is any Banach space. In this case $X^*$ is supplied with the $w^*$-topology. The proof remains the same.

DEFINITION 4.4.45 Let $Y$ and $Z$ be Hausdorff topological spaces and let $S : Y \supseteq D(S) \to 2^Z$ be a multifunction. A selection $f$ of $S$ is a single valued map $f : Y \to Z$, such that
\[ f(y) \in S(y) \quad \forall\, y \in D(S). \]

In the next proposition using selections of the subdifferential map, we characterize the Gâteaux and Fréchet differentiability of the convex function.

PROPOSITION 4.4.46 If $X$ is a Banach space, $U \subseteq X$ is a nonempty, open convex set and $\varphi : U \to \mathbb{R}$ is a continuous, convex function, then $\varphi$ is Gâteaux (respectively Fréchet) differentiable at $x \in U$ if and only if there is a selection $f$ of the subdifferential map $\partial\varphi$ which is norm-to-weak$^*$ (respectively norm-to-norm) continuous at $x$.

An interesting consequence of Proposition 4.4.46 is that Fréchet differentiable, convex functions are necessarily $C^1$-functions.

COROLLARY 4.4.47 If $X$ is a Banach space, $U \subseteq X$ is a nonempty, open, convex set and $\varphi : U \to \mathbb{R}$ is a convex and Fréchet differentiable function, then the function $x \longmapsto \varphi_F'(x)$ is norm-to-norm continuous from $U$ into $X^*$, i.e., $\varphi \in C^1(U)$.
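Another such result is given in the next proposition.

PROPOSITION 4.4.48 If $\varphi \in \Gamma_0\big(\mathbb{R}^N\big)$, $\varphi$ is strictly convex and
\[ \frac{\varphi(x)}{\|x\|_{\mathbb{R}^N}} \longrightarrow +\infty \quad\text{as } \|x\|_{\mathbb{R}^N} \to +\infty, \]
then $\varphi^* \in C^1\big(\mathbb{R}^N\big)$.

PROOF Without any loss of generality, we may assume that $0 \in \operatorname{dom}\varphi$ and $\varphi(0) = 0$. Fix $x^* \in \mathbb{R}^N$ and consider the function
\[ \psi_{x^*}(x) \stackrel{df}{=} (x^*, x)_{\mathbb{R}^N} - \varphi(x). \]
Evidently $-\psi_{x^*} \in \Gamma_0\big(\mathbb{R}^N\big)$, it is strictly convex and
\[ \psi_{x^*}(x) \longrightarrow -\infty \quad\text{as } \|x\|_{\mathbb{R}^N} \to +\infty. \]
So by the Weierstrass theorem $\psi_{x^*}$ attains its maximum on $\mathbb{R}^N$ and the maximizer $x$ is unique. By Proposition 4.4.21 and Corollary 4.4.23, we have $\partial\varphi^*(x^*) = \{x\}$, i.e., $\partial\varphi^*$ is single-valued. The map $x^* \longmapsto \partial\varphi^*(x^*)$ is closed. We show that it maps bounded sets to bounded sets, hence it is continuous. To this end let $\|x^*\|_{\mathbb{R}^N} \le r$ and $x = \partial\varphi^*(x^*)$. We have $x^* \in \partial\varphi(x)$ and so
\[ \frac{\varphi(x)}{\|x\|_{\mathbb{R}^N}} = \frac{\varphi(x) - \varphi(0)}{\|x\|_{\mathbb{R}^N}} \le \frac{(x^*, x)_{\mathbb{R}^N}}{\|x\|_{\mathbb{R}^N}} \le \|x^*\|_{\mathbb{R}^N} \le r; \]
thus, by the growth condition on $\varphi$, the set $\big\{ \partial\varphi^*(x^*) : \|x^*\|_{\mathbb{R}^N} \le r \big\}$ is bounded, i.e., $\partial\varphi^*$ is continuous. Finally let $x = \partial\varphi^*(x^*)$ and $x_{h^*} = \partial\varphi^*(x^* + h^*)$, for some $x^* \in \mathbb{R}^N$, $h^* \in \mathbb{R}^N\setminus\{0\}$. From the definition of the subdifferential, we have
\[ 0 \le \frac{\varphi^*(x^* + h^*) - \varphi^*(x^*) - (h^*, x)_{\mathbb{R}^N}}{\|h^*\|_{\mathbb{R}^N}} \le \frac{(h^*, x_{h^*} - x)_{\mathbb{R}^N}}{\|h^*\|_{\mathbb{R}^N}} \le \|x_{h^*} - x\|_{\mathbb{R}^N}. \]
From the continuity of $\partial\varphi^*$, we have
\[ \|x_{h^*} - x\|_{\mathbb{R}^N} \longrightarrow 0 \quad\text{as } h^* \to 0. \]
So $\varphi^*$ is differentiable at $x^* \in \mathbb{R}^N$ and the derivative is continuous, i.e., $\varphi^* \in C^1\big(\mathbb{R}^N\big)$.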
Before passing to the nonconvex subdifferentials, let us mention a few things about the $\varepsilon$-subdifferential (or approximate subdifferential), which is a useful tool in convex analysis. Its definition results from an innocent looking perturbation of the original subdifferential (see Definition 4.4.19), which however leads to some remarkable properties, that are different in nature from those of the "exact" subdifferential. The mathematical setting remains unchanged with $(X, X^*)$ being a dual system of Hausdorff locally convex spaces.

DEFINITION 4.4.49 Let $\varphi : X \to \overline{\mathbb{R}}$ be a proper function, $\varepsilon \ge 0$ and $x \in \operatorname{dom}\varphi$. The $\varepsilon$-subdifferential of $\varphi$ at $x$ is the set $\partial_\varepsilon\varphi(x)$ (possibly empty), defined by
\[ \partial_\varepsilon\varphi(x) \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, y - x\rangle_X - \varepsilon \le \varphi(y) - \varphi(x) \text{ for all } y \in X \big\}. \]

REMARK 4.4.50 Equivalently we can say that $x^* \in \partial_\varepsilon\varphi(x)$ if and only if
\[ \inf_X\big(\varphi - x^*\big) > -\infty \quad\text{and}\quad x \in \varepsilon\text{-}\operatorname{argmin}\big(\varphi - x^*\big), \]
with
\[ \varepsilon\text{-}\operatorname{argmin}\big(\varphi - x^*\big) = \Big\{ y \in X : \varphi(y) - \langle x^*, y\rangle_X \le \inf_X\big(\varphi - x^*\big) + \varepsilon \Big\}. \]
Also $x^* \in \partial_\varepsilon\varphi(x)$ if and only if
\[ \varphi(x) + \varphi^*(x^*) - \langle x^*, x\rangle_X \le \varepsilon. \]
Geometrically the definition of $\partial_\varepsilon\varphi(x)$ says that the epigraph of the continuous affine function with slope $x^* \in \partial_\varepsilon\varphi(x)$ and passing through $\big(x, \varphi(x) - \varepsilon\big)$ contains the epigraph of $\varphi$. So for $\varepsilon > 0$, $\partial_\varepsilon\varphi(x)$ is a global notion (in contrast to $\partial\varphi(x)$ which is local), i.e., it may be sensitive to variations of $\varphi$ far away from $x$. When $\varepsilon = 0$, we recover Definition 4.4.19. The next proposition establishes the main difference between approximate and exact subdifferentials.

PROPOSITION 4.4.51 If $\varphi \in \Gamma_0(X)$, $\varepsilon > 0$ and $x \in \operatorname{dom}\varphi$, then $\partial_\varepsilon\varphi(x) \ne \emptyset$ and it is $w^*$-closed, convex.
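PROOF From Theorem 4.4.14, we have
\[
-\varphi(x) = \inf_{x^* \in X^*} \bigl[\varphi^*(x^*) - \langle x^*, x\rangle_X\bigr].
\]
Let
\[
\psi(x^*) \stackrel{df}{=} \varphi^*(x^*) - \langle x^*, x\rangle_X \quad \forall\, x^* \in X^*.
\]
Let $x^* \in \varepsilon\text{-}\mathrm{argmin}\,\psi \ne \emptyset$. We have
\[
\varphi^*(x^*) - \langle x^*, x\rangle_X \le \inf_{X^*}\psi + \varepsilon = -\varphi(x) + \varepsilon,
\]
hence
\[
\varphi(x) + \varphi^*(x^*) - \langle x^*, x\rangle_X \le \varepsilon,
\]
which means that $x^* \in \partial_\varepsilon\varphi(x)$ (see Remark 4.4.7). So $\partial_\varepsilon\varphi(x) \ne \emptyset$ and clearly it is $w^*$-closed and convex.

In the study of the ε-subdifferentials (with $\varepsilon > 0$), the directional derivative (see Definition 4.4.17) is replaced by the following quantity.

DEFINITION 4.4.52 Let $\varphi : X \longrightarrow \overline{\mathbb{R}}$ be a proper function, $\varepsilon > 0$ and $x \in \mathrm{dom}\,\varphi$. For every $h \in X$, we define
\[
\varphi'_\varepsilon(x;h) \stackrel{df}{=} \inf_{\lambda > 0} \frac{\varphi(x+\lambda h) - \varphi(x) + \varepsilon}{\lambda}.
\]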
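As a hedged illustration (the quadratic function below is our own choice and does not come from the text), take $X = \mathbb{R}$ and $\varphi(x) = x^2/2$, whose conjugate is $\varphi^*(x^*) = (x^*)^2/2$. Both the ε-subdifferential and the quantity $\varphi'_\varepsilon$ can then be computed in closed form:

```latex
% Worked example. Assumption: X = \mathbb{R}, \varphi(x) = x^2/2, \varepsilon > 0.
\begin{align*}
x^* \in \partial_\varepsilon\varphi(x)
  &\iff \varphi(x) + \varphi^*(x^*) - x^* x \le \varepsilon
   \iff \tfrac12 (x - x^*)^2 \le \varepsilon, \\
\partial_\varepsilon\varphi(x)
  &= \bigl[\, x - \sqrt{2\varepsilon},\; x + \sqrt{2\varepsilon} \,\bigr], \\
\varphi'_\varepsilon(x;h)
  &= \inf_{\lambda > 0} \Bigl( xh + \tfrac{\lambda}{2} h^2 + \tfrac{\varepsilon}{\lambda} \Bigr)
   = xh + \sqrt{2\varepsilon}\,|h| .
\end{align*}
```

For $h \ne 0$ the infimum is attained at $\lambda = \sqrt{2\varepsilon}/|h|$. The resulting expression $xh + \sqrt{2\varepsilon}\,|h|$ is precisely the support function of the interval $[x - \sqrt{2\varepsilon}, x + \sqrt{2\varepsilon}]$, and letting $\varepsilon \downarrow 0$ the interval shrinks to the single point $\{x\} = \{\varphi'(x)\}$.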
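The next result is analogous to Proposition 4.4.27.

PROPOSITION 4.4.53 If $X$ is a Banach space, $\varphi \in \Gamma_0(X)$, $\varepsilon > 0$ and $x \in \mathrm{dom}\,\varphi$, then $\varphi'_\varepsilon(x;\cdot) = \sigma_{\partial_\varepsilon\varphi(x)}(\cdot)$.

Let us mention some basic calculus rules for the ε-subdifferential.

PROPOSITION 4.4.54 If $X$ and $Y$ are two Banach spaces, $A \in \mathcal{L}(X;Y)$, $\varphi \in \Gamma_0(Y)$, $\varepsilon > 0$ and $x \in A^{-1}(\mathrm{dom}\,\varphi)$, then
\[
\partial_\varepsilon(\varphi \circ A)(x) = \overline{A^*\bigl(\partial_\varepsilon\varphi\bigl(A(x)\bigr)\bigr)}^{\,w^*}.
\]

PROOF Since $\varphi \circ A \in \Gamma_0(X)$, from Proposition 4.4.53, we have that
\[
(\varphi \circ A)'_\varepsilon(x;h) = \sigma_{\partial_\varepsilon(\varphi \circ A)(x)}(h) \quad \forall\, h \in X.
\]
We have
\[
(\varphi \circ A)'_\varepsilon(x;h) = \inf_{\lambda > 0} \frac{\varphi\bigl(A(x) + \lambda A(h)\bigr) - \varphi\bigl(A(x)\bigr) + \varepsilon}{\lambda} = \varphi'_\varepsilon\bigl(A(x); A(h)\bigr).
\]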
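On the other hand, using Proposition 4.4.53, for all $h \in X$, we have
\[
\sigma_{A^*(\partial_\varepsilon\varphi(A(x)))}(h) = \sup_{y^* \in \partial_\varepsilon\varphi(A(x))} \langle A^*(y^*), h\rangle_X = \sup_{y^* \in \partial_\varepsilon\varphi(A(x))} \langle y^*, A(h)\rangle_Y = \varphi'_\varepsilon\bigl(A(x); A(h)\bigr).
\]
So we conclude that
\[
\sigma_{\partial_\varepsilon(\varphi \circ A)(x)}(h) = \sigma_{A^*(\partial_\varepsilon\varphi(A(x)))}(h) \quad \forall\, h \in X,
\]
hence we obtain the conclusion of the proposition.

COROLLARY 4.4.55 If $X$ and $Y$ are two Banach spaces, $A \in \mathcal{L}(X;Y)$, $\varphi \in \Gamma_0(Y)$ and $x \in A^{-1}(\mathrm{dom}\,\varphi)$, then
\[
\partial(\varphi \circ A)(x) = \bigcap_{\varepsilon > 0} \overline{A^*\bigl(\partial_\varepsilon\varphi\bigl(A(x)\bigr)\bigr)}^{\,w^*}.
\]
Moreover, if $X$ is reflexive, then we have
\[
\partial(\varphi \circ A)(x) = \bigcap_{\varepsilon > 0} \overline{A^*\bigl(\partial_\varepsilon\varphi\bigl(A(x)\bigr)\bigr)}.
\]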
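PROOF Clearly
\[
\partial(\varphi \circ A)(x) = \bigcap_{\varepsilon > 0} \partial_\varepsilon(\varphi \circ A)(x).
\]
Applying Proposition 4.4.54, we obtain the first equality. For the second equality just note that if $X$ is reflexive, then
\[
\overline{A^*\bigl(\partial_\varepsilon\varphi\bigl(A(x)\bigr)\bigr)} = \overline{A^*\bigl(\partial_\varepsilon\varphi\bigl(A(x)\bigr)\bigr)}^{\,w} = \overline{A^*\bigl(\partial_\varepsilon\varphi\bigl(A(x)\bigr)\bigr)}^{\,w^*}.
\]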
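COROLLARY 4.4.56 If $X$ is a Banach space, $\varphi, \psi \in \Gamma_0(X)$ and $x \in \mathrm{dom}\,\varphi \cap \mathrm{dom}\,\psi$, then
\[
\partial(\varphi + \psi)(x) = \bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*}.
\]
Moreover, if $X$ is reflexive, then
\[
\partial(\varphi + \psi)(x) = \bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}.
\]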
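PROOF Let
\[
u(x,y) \stackrel{df}{=} \varphi(x) + \psi(y) \quad \forall\, (x,y) \in X \times X
\]
and let $A \in \mathcal{L}(X; X \times X)$ be defined by
\[
A(x) \stackrel{df}{=} (x,x) \quad \forall\, x \in X.
\]
Then we see that $u \circ A = \varphi + \psi$ on $X$. We have
\[
A^*(x^*,y^*) = x^* + y^* \quad \forall\, (x^*,y^*) \in X^* \times X^*
\]
and
\[
u^*(x^*,y^*) = \varphi^*(x^*) + \psi^*(y^*) \quad \forall\, (x^*,y^*) \in X^* \times X^*.
\]
Let $(x,y) \in \mathrm{dom}\,u$, $\varepsilon > 0$ and $(x^*,y^*) \in \partial_\varepsilon u(x,y)$. We have
\[
\varphi(x) + \psi(y) + \varphi^*(x^*) + \psi^*(y^*) - \langle x^*, x\rangle_X - \langle y^*, y\rangle_X \le \varepsilon.
\]
So there exist $\varepsilon_1, \varepsilon_2 \ge 0$, such that $\varepsilon_1 + \varepsilon_2 = \varepsilon$ and
\[
x^* \in \partial_{\varepsilon_1}\varphi(x) \subseteq \partial_\varepsilon\varphi(x) \quad \text{and} \quad y^* \in \partial_{\varepsilon_2}\psi(y) \subseteq \partial_\varepsilon\psi(y).
\]
Therefore, we infer that
\[
\partial_\varepsilon u(x,y) \subseteq \partial_\varepsilon\varphi(x) \times \partial_\varepsilon\psi(y).
\]
Using Corollary 4.4.55, we obtain
\[
\partial(\varphi + \psi)(x) = \bigcap_{\varepsilon > 0} \overline{A^*\bigl(\partial_\varepsilon u(x,x)\bigr)}^{\,w^*} \subseteq \bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*}. \tag{4.29}
\]
On the other hand note that
\[
\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x) \subseteq \partial_{2\varepsilon}(\varphi + \psi)(x)
\]
and so, since the right-hand side is $w^*$-closed,
\[
\overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*} \subseteq \partial_{2\varepsilon}(\varphi + \psi)(x).
\]
From this inclusion it follows that
\[
\bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*} \subseteq \bigcap_{\varepsilon > 0} \partial_{2\varepsilon}(\varphi + \psi)(x) = \partial(\varphi + \psi)(x). \tag{4.30}
\]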
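From (4.29) and (4.30), we conclude that
\[
\partial(\varphi + \psi)(x) = \bigcap_{\varepsilon > 0} \overline{\partial_\varepsilon\varphi(x) + \partial_\varepsilon\psi(x)}^{\,w^*}.
\]
Again if $X$ is reflexive, then the norm closure and the weak closure coincide (due to the convexity of the set).

Using the second characterization of the ε-subdifferential, we can easily prove the following rule. We leave the details to the reader.

PROPOSITION 4.4.57 If $\varphi, \psi \in \Gamma_0(X)$, $\varepsilon > 0$ and there exists $x_0 \in \mathrm{dom}\,\varphi \cap \mathrm{dom}\,\psi$, such that $\varphi$ is continuous at $x_0$, then
\[
\partial_\varepsilon(\varphi + \psi)(x) = \bigcup_{\substack{\varepsilon_1, \varepsilon_2 \ge 0 \\ \varepsilon_1 + \varepsilon_2 = \varepsilon}} \bigl[\partial_{\varepsilon_1}\varphi(x) + \partial_{\varepsilon_2}\psi(x)\bigr] \quad \forall\, x \in \mathrm{dom}\,\varphi \cap \mathrm{dom}\,\psi.
\]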
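The sum rule of Proposition 4.4.57 can be checked by hand in a simple case. The following sketch assumes $X = \mathbb{R}$ and $\varphi(x) = \psi(x) = x^2/2$ (our own illustrative choice), and uses the fact, obtained from the conjugate characterization of the ε-subdifferential, that $\partial_\varepsilon\varphi(x) = [x - \sqrt{2\varepsilon}, x + \sqrt{2\varepsilon}]$ for this particular function:

```latex
% Assumption: X = \mathbb{R}, \varphi(x) = \psi(x) = x^2/2,
% so (\varphi+\psi)(x) = x^2 and (\varphi+\psi)^*(s) = s^2/4.
\begin{align*}
\partial_\varepsilon(\varphi+\psi)(x)
  &= \bigl\{ s : x^2 + \tfrac{s^2}{4} - sx \le \varepsilon \bigr\}
   = \bigl\{ s : (x - \tfrac{s}{2})^2 \le \varepsilon \bigr\}
   = \bigl[\, 2x - 2\sqrt{\varepsilon},\; 2x + 2\sqrt{\varepsilon} \,\bigr], \\
\partial_{\varepsilon_1}\varphi(x) + \partial_{\varepsilon_2}\psi(x)
  &= \bigl[\, 2x - r(\varepsilon_1,\varepsilon_2),\; 2x + r(\varepsilon_1,\varepsilon_2) \,\bigr],
   \qquad r(\varepsilon_1,\varepsilon_2) = \sqrt{2\varepsilon_1} + \sqrt{2\varepsilon_2}, \\
\max_{\substack{\varepsilon_1,\varepsilon_2 \ge 0 \\ \varepsilon_1+\varepsilon_2 = \varepsilon}} r(\varepsilon_1,\varepsilon_2)
  &= 2\sqrt{\varepsilon}
   \quad (\text{attained at } \varepsilon_1 = \varepsilon_2 = \varepsilon/2).
\end{align*}
```

All the intervals in the second line are centered at $2x$, so their union over admissible $(\varepsilon_1, \varepsilon_2)$ is the largest one, which coincides with the first line. Note that no single splitting other than $\varepsilon_1 = \varepsilon_2$ recovers the whole set, which is why the union is needed.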
Now we pass to a brief discussion of some nonconvex subdifferentials. Historically the first subdifferential defined for nonconvex functions is that for locally Lipschitz functions. The starting point for the introduction of such a subdifferential was Theorem 4.2.7 (namely, that a continuous, convex function is locally Lipschitz) and, when the underlying space is finite dimensional, Theorem 1.5.8 and Corollary 1.5.9 (Rademacher’s theorem). The mathematical framework is a Banach space $X$ with $X^*$ its topological dual. Let us start by recalling the definition of a locally Lipschitz function, which is central in what follows.

DEFINITION 4.4.58 A function $\varphi : X \longrightarrow \mathbb{R}$ is locally Lipschitz, if every point $x \in X$ admits a neighbourhood $U \subseteq X$ and a constant $k_U$ (depending on $U$), such that
\[
\bigl|\varphi(y) - \varphi(z)\bigr| \le k_U \|y - z\|_X \quad \forall\, y, z \in U.
\]

A locally Lipschitz function need not have directional derivatives in the sense of Definition 4.4.17. However, exploiting the local Lipschitz structure, we can define a generalized directional derivative as follows.

DEFINITION 4.4.59 Let $\varphi : X \longrightarrow \mathbb{R}$ be a locally Lipschitz function. Then the generalized directional derivative of $\varphi$ at $x \in X$ in the direction $h \in X$ is defined by
\[
\varphi^0(x;h) \stackrel{df}{=} \limsup_{\substack{x' \to x \\ \lambda \downarrow 0}} \frac{\varphi(x' + \lambda h) - \varphi(x')}{\lambda}.
\]
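To see how the generalized directional derivative differs from the usual one-sided directional derivative, consider the following small computation. The function $\varphi(x) = -|x|$ on $X = \mathbb{R}$ is our own illustrative choice, not an example from the text:

```latex
% Assumption: X = \mathbb{R}, \varphi(x) = -|x| (globally Lipschitz with constant 1).
\begin{align*}
\varphi^0(0;h)
  &= \limsup_{\substack{x' \to 0 \\ \lambda \downarrow 0}} \frac{|x'| - |x' + \lambda h|}{\lambda}
   = |h|, \\
\lim_{\lambda \downarrow 0} \frac{\varphi(\lambda h) - \varphi(0)}{\lambda}
  &= -|h| .
\end{align*}
```

In the first line the difference quotient is bounded above by $|h|$ because $\varphi$ has Lipschitz constant 1, and the bound is attained along $x' = -\lambda h$. The base point $x'$ is allowed to move, which is what makes $\varphi^0(0;\cdot)$ convex even though $\varphi$ is concave.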
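The utility of $\varphi^0$ follows from some useful properties that it exhibits.

PROPOSITION 4.4.60 If $\varphi : X \longrightarrow \mathbb{R}$ is a locally Lipschitz function, then
(a) the function $h \longmapsto \varphi^0(x;h)$ is sublinear and Lipschitz continuous for all $x \in X$;
(b) the function $(x,h) \longmapsto \varphi^0(x;h)$ is upper semicontinuous on $X \times X$;
(c) $\varphi^0(x;-h) = (-\varphi)^0(x;h)$.

PROOF (a) Clearly $\varphi^0(x;\cdot)$ is positively homogeneous. Also let $h_1, h_2 \in X$. We have
\begin{align*}
\varphi^0(x;h_1+h_2)
 &= \limsup_{\substack{x' \to x \\ \lambda \downarrow 0}} \frac{\varphi(x' + \lambda(h_1+h_2)) - \varphi(x')}{\lambda} \\
 &= \limsup_{\substack{x' \to x \\ \lambda \downarrow 0}} \frac{\varphi(x' + \lambda h_1 + \lambda h_2) - \varphi(x' + \lambda h_2) + \varphi(x' + \lambda h_2) - \varphi(x')}{\lambda} \\
 &\le \limsup_{\substack{x' \to x \\ \lambda \downarrow 0}} \frac{\varphi(x' + \lambda h_1 + \lambda h_2) - \varphi(x' + \lambda h_2)}{\lambda} + \limsup_{\substack{x' \to x \\ \lambda \downarrow 0}} \frac{\varphi(x' + \lambda h_2) - \varphi(x')}{\lambda} \\
 &= \varphi^0(x;h_1) + \varphi^0(x;h_2).
\end{align*}
So we have proved that $\varphi^0(x;\cdot)$ is sublinear. Exploiting the local Lipschitzness of $\varphi$, we see that for all $x' \in X$ near $x \in X$ and for all $\lambda > 0$ near zero, we have
\[
\frac{\varphi(x' + \lambda h) - \varphi(x')}{\lambda} \le k \|h\|_X \quad \forall\, h \in X,
\]
so
\[
\varphi^0(x;h) \le k \|h\|_X \quad \forall\, h \in X
\]
and due to the sublinearity of $\varphi^0(x;\cdot)$, we have
\[
\bigl|\varphi^0(x;h)\bigr| \le k \|h\|_X \quad \forall\, h \in X,
\]
so finally we deduce that $\varphi^0(x;\cdot)$ is Lipschitz continuous.

(b) Let $(x_n, h_n) \longrightarrow (x,h)$ in $X \times X$. From the definition of $\varphi^0(x;h)$, we know that for every $n \ge 1$, we can find $v_n \in X$ and $\lambda_n \in (0,1)$, such that
\[
\|v_n\|_X + \lambda_n \le \frac{1}{n}
\]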
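and
\[
\varphi^0(x_n;h_n) \le \frac{\varphi(x_n + v_n + \lambda_n h_n) - \varphi(x_n + v_n)}{\lambda_n} + \frac{1}{n}.
\]
Since $\varphi$ is Lipschitz near $x$, replacing $h_n$ by $h$ on the right-hand side changes it by at most $k\|h_n - h\|_X \to 0$, so
\[
\limsup_{n \to +\infty} \varphi^0(x_n;h_n) \le \varphi^0(x;h),
\]
i.e., the function $(x,h) \longmapsto \varphi^0(x;h)$ is upper semicontinuous.

(c) By definition, we have
\[
\varphi^0(x;-h) = \limsup_{\substack{x' \to x \\ \lambda \downarrow 0}} \frac{\varphi(x' - \lambda h) - \varphi(x')}{\lambda} = \limsup_{\substack{y \to x \\ \lambda \downarrow 0}} \frac{(-\varphi)(y + \lambda h) - (-\varphi)(y)}{\lambda} = (-\varphi)^0(x;h)
\]
(with $y = x' - \lambda h$).

These properties lead to the following definition.

DEFINITION 4.4.61 Let $\varphi : X \longrightarrow \mathbb{R}$ be a locally Lipschitz function. The generalized subdifferential (or Clarke subdifferential) of $\varphi$ at $x$ is defined by
\[
\partial\varphi(x) \stackrel{df}{=} \bigl\{ x^* \in X^* : \langle x^*, h\rangle_X \le \varphi^0(x;h) \ \text{for all } h \in X \bigr\}.
\]
The elements of $\partial\varphi(x)$ are called generalized gradients.

PROPOSITION 4.4.62 If $\varphi : X \longrightarrow \mathbb{R}$ is a locally Lipschitz function, then for every $x \in X$, the set $\partial\varphi(x) \subseteq X^*$ is nonempty, convex and $w^*$-compact, the multifunction $\partial\varphi : X \longrightarrow 2^{X^*} \setminus \{\emptyset\}$ is upper semicontinuous from $X$ with the norm topology into $X^*$ with the $w^*$-topology (see Definition 3.2.12) and
\[
\varphi^0(x;h) = \sigma_{\partial\varphi(x)}(h) \quad \forall\, (x,h) \in X \times X.
\]

PROOF Since by Proposition 4.4.60(a), $\varphi^0(x;\cdot)$ is sublinear, the Hahn-Banach theorem implies that it has a continuous linear minorant. Therefore $\partial\varphi(x) \ne \emptyset$. Clearly the set is convex and by virtue of Proposition 4.4.60(a), it is also closed and bounded, hence $w^*$-compact (by Alaoglu’s theorem; see Theorem A.3.9). To show the upper semicontinuity, let $C \subseteq X^*$ be a $w^*$-closed set and let
\[
\{x_n\}_{n \ge 1} \subseteq \partial\varphi^-(C) = \bigl\{ x \in X : \partial\varphi(x) \cap C \ne \emptyset \bigr\}
\]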
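be a sequence, such that $x_n \longrightarrow x$ in $X$. Let us take
\[
x_n^* \in \partial\varphi(x_n) \cap C \quad \forall\, n \ge 1.
\]
Because of Proposition 4.4.60(a), the sequence $\{x_n^*\}_{n \ge 1}$ is bounded in $X^*$. So by Alaoglu’s theorem (see Theorem A.3.9), we can find a subnet $\{x_\alpha^*\}_{\alpha \in J}$ of $\{x_n^*\}_{n \ge 1}$, such that
\[
x_\alpha^* \stackrel{w^*}{\longrightarrow} x^*.
\]
We have
\[
\langle x_\alpha^*, h\rangle_X \le \varphi^0(x_\alpha;h) \quad \forall\, h \in X.
\]
Taking the limit with respect to $\alpha \in J$ and using Proposition 4.4.60(b), we obtain
\[
\langle x^*, h\rangle_X \le \varphi^0(x;h) \quad \forall\, h \in X,
\]
so $x^* \in \partial\varphi(x)$. Also $x^* \in C$, since $C \subseteq X^*$ is $w^*$-closed. Therefore $x^* \in \partial\varphi(x) \cap C$, hence $x \in \partial\varphi^-(C)$, which proves the upper semicontinuity of the multifunction. Finally, using once more the Hahn-Banach theorem, for every $h_0 \in X$, we can find $x_0^* \in \partial\varphi(x)$, such that
\[
\langle x_0^*, h\rangle_X \le \varphi^0(x;h) \quad \forall\, h \in X
\]
and $\langle x_0^*, h_0\rangle_X = \varphi^0(x;h_0)$. Therefore $\varphi^0(x;\cdot) = \sigma_{\partial\varphi(x)}(\cdot)$.

PROPOSITION 4.4.63 Let $\varphi : X \longrightarrow \mathbb{R}$ be a locally Lipschitz function.
(a) If $\varphi$ is Gâteaux differentiable at $x \in X$, then $\varphi'_G(x) \in \partial\varphi(x)$.
(b) If $\varphi \in C^1(X)$, then $\partial\varphi(x) = \bigl\{\varphi'_F(x)\bigr\}$ for all $x \in X$.
(c) If $\varphi$ is also convex, then the convex and generalized subdifferentials of $\varphi$ coincide.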
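PROOF (a) From the definition of $\varphi^0(x;\cdot)$, we have
\[
\langle \varphi'_G(x), h\rangle_X \le \varphi^0(x;h) \quad \forall\, h \in X.
\]
So $\varphi'_G(x) \in \partial\varphi(x)$.

(b) If $\varphi \in C^1(X)$, then
\[
\varphi^0(x;h) = \langle \varphi'_F(x), h\rangle_X \quad \forall\, h \in X,
\]
hence $\partial\varphi(x) = \bigl\{\varphi'_F(x)\bigr\}$.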
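(c) From the definition of $\varphi^0(x;h)$, we have
\[
\varphi^0(x;h) = \lim_{\varepsilon \downarrow 0} \ \sup_{\|x' - x\|_X \le \varepsilon\delta} \ \sup_{0 < \lambda < \varepsilon} \frac{\varphi(x' + \lambda h) - \varphi(x')}{\lambda},
\]
with $\delta > 0$ arbitrary. Because $\varphi$ is convex, the map
\[
\lambda \longmapsto \frac{\varphi(x' + \lambda h) - \varphi(x')}{\lambda}
\]
is increasing on $(0,+\infty)$. So we obtain
\[
\varphi^0(x;h) = \lim_{\varepsilon \downarrow 0} \ \sup_{\|x' - x\|_X \le \varepsilon\delta} \frac{\varphi(x' + \varepsilon h) - \varphi(x')}{\varepsilon}.
\]
From the local Lipschitz property of $\varphi$, we have
\[
\left| \frac{\varphi(x' + \varepsilon h) - \varphi(x')}{\varepsilon} - \frac{\varphi(x + \varepsilon h) - \varphi(x)}{\varepsilon} \right| \le 2\delta k \quad \forall\, x' \in x + \varepsilon\delta \overline{B}_1,
\]
with $k > 0$, so
\[
\varphi^0(x;h) \le \lim_{\varepsilon \downarrow 0} \frac{\varphi(x + \varepsilon h) - \varphi(x)}{\varepsilon} + 2\delta k = \varphi'(x;h) + 2\delta k.
\]
Since $\delta > 0$ was arbitrary, we obtain
\[
\varphi^0(x;h) \le \varphi'(x;h) \quad \forall\, h \in X.
\]
Because the opposite inequality is always true, we conclude that $\varphi^0(x;\cdot) = \varphi'(x;\cdot)$. Then by virtue of Proposition 4.4.27 and Definition 4.4.61, we conclude that the two subdifferentials coincide.

EXAMPLE 4.4.64 If $\varphi$ is differentiable but not $C^1$, then $\partial\varphi(x)$ need not be a singleton. To see this consider the function $\varphi : \mathbb{R} \longrightarrow \mathbb{R}$, defined by
\[
\varphi(x) \stackrel{df}{=} \begin{cases} x^2 \sin\bigl(\tfrac{1}{x}\bigr) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}
\]
Then $\varphi$ is Lipschitz continuous on $[-1,1]$, $\varphi'(0) = 0$ but $\varphi'$ is not continuous at $x = 0$. A straightforward calculation shows that $\varphi^0(0;h) = |h|$, hence $\partial\varphi(0) = [-1,1]$ and so it is not a singleton.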
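The "straightforward calculation" in Example 4.4.64 can be sketched as follows. The two sequences $x_n$, $y_n$ below are our own choice; the argument relies only on Proposition 4.4.63(a), the upper semicontinuity and convexity statements of Proposition 4.4.62, and the classical mean value theorem:

```latex
% Function of Example 4.4.64: \varphi(x) = x^2 \sin(1/x), \varphi(0) = 0.
\begin{align*}
\varphi'(x) &= 2x \sin\tfrac{1}{x} - \cos\tfrac{1}{x} \quad (x \ne 0),
  \qquad |\varphi'(x)| \le 1 + 2|x| \quad \forall\, x \in \mathbb{R}, \\
x_n &= \tfrac{1}{2n\pi} \to 0: \ \ \varphi'(x_n) = -1,
  \qquad
y_n = \tfrac{1}{(2n+1)\pi} \to 0: \ \ \varphi'(y_n) = +1, \\
\frac{\varphi(x' + \lambda h) - \varphi(x')}{\lambda}
  &= \varphi'(\xi)\, h \le (1 + 2|\xi|)\,|h|
  \quad \text{for some } \xi \text{ between } x' \text{ and } x' + \lambda h .
\end{align*}
```

Since $\varphi'(x_n) \in \partial\varphi(x_n)$ and $\varphi'(y_n) \in \partial\varphi(y_n)$, upper semicontinuity gives $\pm 1 \in \partial\varphi(0)$, and convexity yields $[-1,1] \subseteq \partial\varphi(0)$. Letting $x' \to 0$ and $\lambda \downarrow 0$ in the last line forces $\xi \to 0$, so $\varphi^0(0;h) \le |h|$, which gives the reverse inclusion.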
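When $X = \mathbb{R}^N$, we can use Corollary 1.5.9 (Rademacher’s theorem) to give a definition which is less abstract and formal than Definition 4.4.61 in terms of the generalized directional derivative. The new definition is more geometric.

THEOREM 4.4.65 If $\varphi : \mathbb{R}^N \longrightarrow \mathbb{R}$ is a locally Lipschitz function and $E$ is any Lebesgue-null set in $\mathbb{R}^N$, then
\[
\partial\varphi(x) = \mathrm{conv}\Bigl\{ \lim_{n \to +\infty} \nabla\varphi(x_n) : x_n \longrightarrow x,\ x_n \notin E \cup D_\varphi^c \Bigr\},
\]
where $D_\varphi^c \subseteq \mathbb{R}^N$ is the Lebesgue-null set where $\varphi$ fails to be differentiable (due to Rademacher’s theorem).

PROOF Since, by Proposition 4.4.63(a), we have
\[
\nabla\varphi(x_n) \in \partial\varphi(x_n) \quad \forall\, n \ge 1
\]
and $\partial\varphi$ is locally bounded (see Proposition 4.4.60(a)), we see that the sequence $\bigl\{\nabla\varphi(x_n)\bigr\}_{n \ge 1}$ has a convergent subsequence. Then Proposition 4.4.62 implies that the limit of the subsequence belongs to $\partial\varphi(x)$. Therefore we have
\[
\mathrm{conv}\Bigl\{ \lim_{n \to +\infty} \nabla\varphi(x_n) : x_n \longrightarrow x,\ x_n \notin E \cup D_\varphi^c \Bigr\} \subseteq \partial\varphi(x). \tag{4.31}
\]
On the other hand let
\[
\xi_h = \limsup_{\substack{y \to x \\ y \notin E \cup D_\varphi^c}} \bigl(\nabla\varphi(y), h\bigr)_{\mathbb{R}^N},
\]
with $h \ne 0$. For a given $\varepsilon > 0$, we can find $\delta = \delta(\varepsilon) > 0$, such that
\[
\bigl(\nabla\varphi(y), h\bigr)_{\mathbb{R}^N} \le \xi_h + \varepsilon \quad \forall\, y \in x + \delta \overline{B}_1(0),\ y \notin E \cup D_\varphi^c.
\]
For $t \in \Bigl(0, \frac{\delta}{2\|h\|_{\mathbb{R}^N}}\Bigr)$, we have
\[
\varphi(y + th) - \varphi(y) \le t(\xi_h + \varepsilon) \quad \text{for a.a. } y \in x + \frac{\delta}{2} \overline{B}_1(0),
\]
so
\[
\varphi^0(x;h) \le \xi_h + \varepsilon.
\]
Therefore the support function of the set
\[
\mathrm{conv}\Bigl\{ \lim_{n \to +\infty} \nabla\varphi(x_n) : x_n \longrightarrow x,\ x_n \notin E \cup D_\varphi^c \Bigr\}
\]
majorizes $\varphi^0(x;\cdot)$. This combined with (4.31) finishes the proof of the theorem.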
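As a hedged illustration of the gradient formula of Theorem 4.4.65, the following sketch assumes $N = 2$, $\varphi(x_1,x_2) = \max\{x_1, x_2\}$ and $E = \emptyset$ (our own choice of function), and uses the identity $\varphi^0(x;\cdot) = \sigma_{\partial\varphi(x)}(\cdot)$ from Proposition 4.4.62 in the last line:

```latex
% Assumption: N = 2, \varphi(x_1,x_2) = \max\{x_1,x_2\}, E = \emptyset.
\begin{align*}
D_\varphi^c &= \{ x \in \mathbb{R}^2 : x_1 = x_2 \}, \qquad
\nabla\varphi(x) = \begin{cases} (1,0) & \text{if } x_1 > x_2, \\ (0,1) & \text{if } x_1 < x_2, \end{cases} \\
\partial\varphi(0) &= \mathrm{conv}\bigl\{ (1,0),\, (0,1) \bigr\}
   = \bigl\{ (t, 1-t) : t \in [0,1] \bigr\}, \\
\varphi^0(0;h) &= \max_{t \in [0,1]} \bigl( t h_1 + (1-t) h_2 \bigr) = \max\{h_1, h_2\}.
\end{align*}
```

Only two limiting gradients occur near the origin, so the convex hull is a segment. Since $\varphi$ is convex, this agrees with the convex subdifferential of the max function at $0$.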
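COROLLARY 4.4.66 If $\varphi : \mathbb{R}^N \longrightarrow \mathbb{R}$ is a locally Lipschitz function, then
\[
\varphi^0(x;h) = \limsup_{\substack{x' \to x \\ x' \notin D_\varphi^c}} \bigl(\nabla\varphi(x'), h\bigr)_{\mathbb{R}^N}.
\]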
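Finally let us state a few basic calculus rules for the generalized subdifferential.

PROPOSITION 4.4.67 If $\varphi : X \longrightarrow \mathbb{R}$ is a locally Lipschitz function and $\lambda \in \mathbb{R}$, then $\partial(\lambda\varphi) = \lambda\partial\varphi$.

PROOF Evidently the result is true if $\lambda \ge 0$, since $(\lambda\varphi)^0 = \lambda\varphi^0$. So we need to consider the case $\lambda < 0$. We may assume that $\lambda = -1$. Then $x^* \in \partial(-\varphi)(x)$ if and only if
\[
\langle x^*, h\rangle_X \le (-\varphi)^0(x;h) \quad \forall\, h \in X.
\]
By Proposition 4.4.60(c), we have $(-\varphi)^0(x;h) = \varphi^0(x;-h)$. So $x^* \in \partial(-\varphi)(x)$ if and only if
\[
\langle x^*, h\rangle_X \le \varphi^0(x;-h) \quad \forall\, h \in X,
\]
hence $-x^* \in \partial\varphi(x)$. Thus finally we have that $x^* \in \partial(-\varphi)(x)$ if and only if $x^* \in -\partial\varphi(x)$.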
An interesting consequence of this proposition is the following extension of Fermat’s rule for local extrema.

PROPOSITION 4.4.68 If $\varphi : X \longrightarrow \mathbb{R}$ is a locally Lipschitz function and has a local maximum or minimum at $x \in X$, then $0 \in \partial\varphi(x)$.

PROOF Since $\partial(-\varphi) = -\partial\varphi$, it suffices to prove the proposition for the case of a local minimum at $x \in X$. Then clearly
\[
\varphi^0(x;h) \ge 0 \quad \forall\, h \in X.
\]
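Hence $0 \in \partial\varphi(x)$.

PROPOSITION 4.4.69 If $\varphi_k : X \longrightarrow \mathbb{R}$ for $k \in \{1, \ldots, n\}$ are locally Lipschitz functions, then
\[
\partial\Bigl(\sum_{k=1}^n \varphi_k\Bigr) \subseteq \sum_{k=1}^n \partial\varphi_k,
\]
i.e., the generalized subdifferential is subadditive.

PROOF It suffices to prove the result for $n = 2$. The general case follows by induction. The support function of $\partial(\varphi_1 + \varphi_2)$ is $(\varphi_1 + \varphi_2)^0$ and the support function of $\partial\varphi_1 + \partial\varphi_2$ is $\varphi_1^0 + \varphi_2^0$. Also note that $\partial\varphi_1(x) + \partial\varphi_2(x)$ is convex and $w^*$-compact. Since
\[
(\varphi_1 + \varphi_2)^0(x;\cdot) \le \varphi_1^0(x;\cdot) + \varphi_2^0(x;\cdot),
\]
we conclude that
\[
\partial(\varphi_1 + \varphi_2)(x) \subseteq \partial\varphi_1(x) + \partial\varphi_2(x) \quad \forall\, x \in X.
\]

COROLLARY 4.4.70 If $\varphi_k : X \longrightarrow \mathbb{R}$ for $k \in \{1, \ldots, n\}$ are locally Lipschitz functions and all but one are $C^1$-functions, then
\[
\partial\Bigl(\sum_{k=1}^n \varphi_k\Bigr) = \sum_{k=1}^n \partial\varphi_k.
\]

COROLLARY 4.4.71 If $\varphi_k : X \longrightarrow \mathbb{R}$ are locally Lipschitz functions and $\lambda_k \in \mathbb{R}$ for $k \in \{1, \ldots, n\}$, then
\[
\partial\Bigl(\sum_{k=1}^n \lambda_k \varphi_k\Bigr) \subseteq \sum_{k=1}^n \lambda_k \partial\varphi_k
\]
and equality holds if all but one of the functions are $C^1$-functions.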
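The inclusion in the sum rule can be strict when more than one summand is nonsmooth. A minimal example, with functions chosen by us for illustration on $X = \mathbb{R}$ and using $\partial(-\varphi) = -\partial\varphi$ from Proposition 4.4.67:

```latex
% Assumption: X = \mathbb{R}, \varphi_1(x) = |x|, \varphi_2(x) = -|x|.
\begin{align*}
\partial\varphi_1(0) &= [-1,1], \qquad
\partial\varphi_2(0) = -\partial\varphi_1(0) = [-1,1], \\
\varphi_1 + \varphi_2 \equiv 0 \ &\Longrightarrow\ \partial(\varphi_1 + \varphi_2)(0) = \{0\}
   \subsetneq [-2,2] = \partial\varphi_1(0) + \partial\varphi_2(0).
\end{align*}
```

Neither function is $C^1$ near the origin, so the equality case of Corollary 4.4.70 does not apply here.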
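The following mean value theorem is a useful tool in many applications.

THEOREM 4.4.72 (Mean Value Theorem) If $U \subseteq X$ is an open set, $x, y \in X$, $[x,y] \subseteq U$, with
\[
[x,y] \stackrel{df}{=} \bigl\{ \lambda x + (1-\lambda) y : \lambda \in [0,1] \bigr\}
\]
and $\varphi : U \longrightarrow \mathbb{R}$ is locally Lipschitz, then there exist $u \in (x,y)$, with
\[
(x,y) \stackrel{df}{=} \bigl\{ \lambda x + (1-\lambda) y : \lambda \in (0,1) \bigr\},
\]
and $u^* \in \partial\varphi(u)$, such that
\[
\varphi(y) - \varphi(x) = \langle u^*, y - x\rangle_X.
\]

PROOF Let
\[
x_\lambda \stackrel{df}{=} x + \lambda(y - x)
\]
and consider the function $f : [0,1] \longrightarrow \mathbb{R}$, defined by
\[
f(\lambda) \stackrel{df}{=} \varphi(x_\lambda) \quad \forall\, \lambda \in [0,1].
\]
Clearly $f$ is Lipschitz continuous on $[0,1]$. We claim that
\[
\partial f(\lambda) \subseteq \bigl\{ \langle u^*, y - x\rangle_X : u^* \in \partial\varphi(x_\lambda) \bigr\} \quad \forall\, \lambda \in (0,1).
\]
Since both sets are closed, convex, it suffices to show that
\[
\sigma_{\partial f(\lambda)}(\pm 1) \le \max_{u^* \in \partial\varphi(x_\lambda)} \pm\langle u^*, y - x\rangle_X. \tag{4.32}
\]
To this end, for $h = \pm 1$, we have
\begin{align*}
\limsup_{\substack{\lambda' \to \lambda \\ t \downarrow 0}} \frac{f(\lambda' + th) - f(\lambda')}{t}
 &= \limsup_{\substack{\lambda' \to \lambda \\ t \downarrow 0}} \frac{\varphi\bigl(x + (\lambda' + th)(y - x)\bigr) - \varphi\bigl(x + \lambda'(y - x)\bigr)}{t} \\
 &\le \limsup_{\substack{z \to x_\lambda \\ t \downarrow 0}} \frac{\varphi\bigl(z + th(y - x)\bigr) - \varphi(z)}{t} \\
 &= \varphi^0\bigl(x_\lambda; h(y - x)\bigr) = \sup_{u^* \in \partial\varphi(x_\lambda)} \langle u^*, h(y - x)\rangle_X. \tag{4.33}
\end{align*}
From (4.33), we obtain (4.32).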
Now let
\[
\xi(\lambda) \stackrel{df}{=} \varphi(x_\lambda) + \lambda\bigl(\varphi(x) - \varphi(y)\bigr) \quad \forall\, \lambda \in [0,1].
\]
We have $\xi(0) = \xi(1) = \varphi(x)$ and so we can find $\lambda \in (0,1)$ at which $\xi$ attains a local extremum. Then by Proposition 4.4.68, we have that $0 \in \partial\xi(\lambda)$, which via Propositions 4.4.67 and 4.4.69 and (4.32) implies that
\[
\varphi(y) - \varphi(x) \in \bigl\langle \partial\varphi(u), y - x \bigr\rangle_X,
\]
with $u = x_\lambda$.

The generalized subdifferential has a remarkable calculus which makes it very useful in applications. We mention only two rules which arise in applications. For their proofs and additional results in this direction we refer to Clarke (1983).

PROPOSITION 4.4.73 (Chain Rule) If $X$ and $Y$ are two Banach spaces, $h \in C^1(X;Y)$ and $\varphi : Y \longrightarrow \mathbb{R}$ is a locally Lipschitz function, then
\[
\partial(\varphi \circ h)(x) \subseteq \partial\varphi\bigl(h(x)\bigr) \circ h'_F(x) \quad \forall\, x \in X.
\]

REMARK 4.4.74 Note that $h'_F(x) \in \mathcal{L}(X;Y)$. Using its adjoint
\[
\bigl(h'_F(x)\bigr)^* \in \mathcal{L}(Y^*;X^*)
\]
we can equivalently rewrite the above chain rule as
\[
\partial(\varphi \circ h)(x) \subseteq \bigl(h'_F(x)\bigr)^* \partial\varphi\bigl(h(x)\bigr) \quad \forall\, x \in X.
\]
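A small finite dimensional instance may help to read the adjoint form of the chain rule. The map $h$ and the function $\varphi$ below are our own illustrative choices:

```latex
% Assumption: X = \mathbb{R}^2, Y = \mathbb{R}, h(x_1,x_2) = x_1 - x_2, \varphi(t) = |t|.
\begin{align*}
h'_F(x) &= (1,\,-1) \in \mathcal{L}(\mathbb{R}^2;\mathbb{R}), \qquad
\bigl(h'_F(x)\bigr)^* s = (s,\,-s) \quad \forall\, s \in \mathbb{R}, \\
\partial\varphi\bigl(h(0)\bigr) &= \partial|\cdot|(0) = [-1,1], \\
\partial(\varphi \circ h)(0) &\subseteq \bigl(h'_F(0)\bigr)^* \partial\varphi\bigl(h(0)\bigr)
   = \bigl\{ (s,\,-s) : s \in [-1,1] \bigr\}.
\end{align*}
```

In this particular case $\varphi \circ h(x) = |x_1 - x_2|$ is convex and its subdifferential at the origin is exactly the segment on the right, so the inclusion holds with equality. In general only the inclusion is guaranteed.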
A useful consequence of the above chain rule is the following result.

COROLLARY 4.4.75 If $X$ and $Y$ are two Banach spaces, $X$ is embedded continuously and densely in $Y$, $\varphi : Y \longrightarrow \mathbb{R}$ is a locally Lipschitz function and $\widehat{\varphi} \stackrel{df}{=} \varphi|_X$, then
\[
\partial\widehat{\varphi}(x) = \partial\varphi(x) \quad \forall\, x \in X,
\]
which means that every element in $\partial\widehat{\varphi}(x)$ admits a unique extension to an element of $\partial\varphi(x)$.
PROPOSITION 4.4.76 (Nonsmooth Lagrange Rule) If $X$ is a Banach space, $\varphi, f : X \longrightarrow \mathbb{R}$ are two locally Lipschitz functions and $x$ is a local solution of the problem
\[
\inf_{f(x) \le 0} \varphi(x),
\]
then there exist $\lambda_0, \lambda_1 \ge 0$, not both zero, such that
\[
0 \in \lambda_0 \partial\varphi(x) + \lambda_1 \partial f(x) \quad \text{and} \quad 0 = \lambda_1 f(x).
\]

REMARK 4.4.77 If
\[
m(b) \stackrel{df}{=} \inf_{f(x) \le b} \varphi(x),
\]
$m(0)$ is finite and
\[
\liminf_{b \to 0} \frac{m(b) - m(0)}{|b|} > -\infty
\]
(calmness condition), then $\lambda_0 > 0$ and so by normalization we can assume that $\lambda_0 = 1$.

Let us conclude this section by mentioning some more nonconvex subdifferentials. This can be done using the following parent notion.

DEFINITION 4.4.78 Let $X$ be a Banach space. A bornology is a collection $\mathcal{B}$ of bounded, symmetric (with respect to the origin) subsets of $X$, whose union is $X$.

REMARK 4.4.79 By taking the collection of all finite symmetric sets, we have the so-called Gâteaux bornology denoted by $\mathcal{B}_G$. Similarly if the collection consists of all bounded, symmetric sets, then we have the Fréchet bornology denoted by $\mathcal{B}_F$. Finally if we consider all symmetric, compact sets, then the resulting bornology is the so-called Hadamard bornology denoted by $\mathcal{B}_H$.

DEFINITION 4.4.80 Let $X$ be a Banach space and let $\mathcal{B}$ be a bornology in $X$.
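(a) The norm of $X$ is said to be $\mathcal{B}$-smooth, if it is Gâteaux differentiable at every $x \in X \setminus \{0\}$ and the defining limit exists uniformly on members of $\mathcal{B}$.

(b) A function $\varphi : X \longrightarrow \mathbb{R}$ is said to be $\mathcal{B}$-differentiable at $x \in X$ with $\mathcal{B}$-derivative $\varphi'_{\mathcal{B}}(x)$, if for every $C \in \mathcal{B}$, we have
\[
\lim_{\lambda \to 0} \ \sup_{h \in C} \frac{\varphi(x + \lambda h) - \varphi(x) - \lambda \langle \varphi'_{\mathcal{B}}(x), h\rangle_X}{\lambda} = 0.
\]

(c) Let $\varphi : X \longrightarrow \mathbb{R}^* = \mathbb{R} \cup \{\pm\infty\}$ be a lower semicontinuous function and suppose that $\varphi(x) \in \mathbb{R}$. We say that $\varphi$ is $\mathcal{B}$-subdifferentiable at $x$, if there exists $x^* \in X^*$, such that for every $\varepsilon > 0$ and every $C \in \mathcal{B}$, there exists $\delta > 0$ for which we have
\[
\langle x^*, h\rangle_X \le \frac{\varphi(x + \lambda h) - \varphi(x)}{\lambda} + \varepsilon \quad \forall\, \lambda \in (0,\delta),\ h \in C.
\]
The elements $x^* \in X^*$ are called $\mathcal{B}$-subderivatives of $\varphi$ at $x$ and the set of all $\mathcal{B}$-subderivatives is the $\mathcal{B}$-subdifferential of $\varphi$ at $x$ and it is denoted by $\partial_{\mathcal{B}}\varphi(x)$.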
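REMARK 4.4.81 If in Definition 4.4.80(b), $\mathcal{B} = \mathcal{B}_G$ (respectively $\mathcal{B} = \mathcal{B}_F$), then $\varphi'_{\mathcal{B}}(x) = \varphi'_G(x)$ (respectively $\varphi'_{\mathcal{B}}(x) = \varphi'_F(x)$). If in Definition 4.4.80(c), $\varphi$ is $\overline{\mathbb{R}}$-valued, proper, convex, then $\partial_{\mathcal{B}}\varphi(x) = \partial\varphi(x)$ (the convex subdifferential; see Definition 4.4.19). If in Definition 4.4.80(c), $\varphi$ is $\mathbb{R}$-valued, locally Lipschitz, then $\partial_{\mathcal{B}}\varphi(x) = \partial\varphi(x)$ (the generalized subdifferential; see Definition 4.4.61). Finally if in Definition 4.4.80(c), we reverse the inequality and replace $\varepsilon$ by $-\varepsilon$, we obtain the $\mathcal{B}$-superdifferential of $\varphi$ at $x$ denoted by $\partial^{\mathcal{B}}\varphi(x)$. Note that $\partial^{\mathcal{B}}(-\varphi)(x) = -\partial_{\mathcal{B}}\varphi(x)$. The elements of $\partial^{\mathcal{B}}\varphi(x)$ are called $\mathcal{B}$-superderivatives of $\varphi$ at $x$.

PROPOSITION 4.4.82 Let $X$ be a Banach space, $\mathcal{B}$ a bornology in $X$ and $\varphi : X \longrightarrow \mathbb{R}^*$.
(a) If $\varphi$ is a lower semicontinuous function, $\varphi(x) \in \mathbb{R}$ and $\partial_{\mathcal{B}}\varphi(x)$ and $\partial^{\mathcal{B}}\varphi(x)$ are nonempty sets, then $\varphi$ is $\mathcal{B}$-differentiable at $x$ and $\partial_{\mathcal{B}}\varphi(x) = \varphi'_G(x)$.
(b) If $\varphi$ is a concave function which is continuous in a neighbourhood of $x$ and $\partial_{\mathcal{B}}\varphi(x) \ne \emptyset$, then $\varphi$ is $\mathcal{B}$-differentiable at $x$ and $\partial_{\mathcal{B}}\varphi(x) = \bigl\{\varphi'_{\mathcal{B}}(x)\bigr\}$.
(c) If the norm of $X$ is $\mathcal{B}$-smooth and
\[
T \stackrel{df}{=} \Bigl\{ s : X \longrightarrow \mathbb{R} : s(x) = \frac{1}{2} \sum_{n=1}^\infty \mu_n \|x - z_n\|_X^2, \ \text{where } \mu_n \ge 0,\ \sum_{n=1}^\infty \mu_n = 1 \ \text{and } z_n \longrightarrow z \in X \Bigr\},
\]
then the elements of $T$ are everywhere $\mathcal{B}$-differentiable.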
PROOF (a) Let $x^* \in \partial_{\mathcal{B}}\varphi(x)$ and $y^* \in \partial^{\mathcal{B}}\varphi(x)$. Then for every $\varepsilon > 0$ and every $h \in X$, we have
\[
\langle y^*, h\rangle_X - \langle x^*, h\rangle_X \le 2\varepsilon,
\]
hence $x^* = y^*$. Denote the common value by $v^*$. Then clearly $v^* = \varphi'_{\mathcal{B}}(x) = \varphi'_G(x)$.

(b) The function $-\varphi$ is convex and continuous at $x \in X$. Then by Proposition 4.4.25 and Remark 4.4.81, we have
\[
\partial(-\varphi)(x) = \partial_{\mathcal{B}}(-\varphi)(x) \ne \emptyset.
\]
Since $\partial^{\mathcal{B}}\varphi(x) = -\partial_{\mathcal{B}}(-\varphi)(x)$, we have that $\partial^{\mathcal{B}}\varphi(x) \ne \emptyset$. Also $\partial_{\mathcal{B}}\varphi(x) = -\partial(-\varphi)(x) \ne \emptyset$. So we can apply part (a) and obtain the claim.

(c) Let $x \in X$. The sequence $\{x - z_n\}_{n \ge 1}$ is bounded in $X$. By hypothesis the norm of $X$ is $\mathcal{B}_G$-smooth and so $X^*$ is strictly convex. Hence by Proposition 3.2.22, the duality map $\mathcal{F}$ is single valued. Then from the definition of $s \in T$, the series
\[
\sum_{n=1}^\infty \mu_n \mathcal{F}(x - z_n)
\]
is norm convergent in $X^*$ to some $x^* \in X^*$. It is straightforward to check that $x^* = s'_{\mathcal{B}}(x)$.

Finally let us give some particular subdifferentials associated with bornologies.
EXAMPLE 4.4.83 Let $X$ be a Banach space and $\mathcal{B}$ a bornology on $X$.

(a) Let $\varphi : X \longrightarrow \overline{\mathbb{R}}$ be a proper, lower semicontinuous function and $x \in \mathrm{dom}\,\varphi$. The viscosity $\mathcal{B}$-subdifferential at $x$ is the set $\partial_{\mathcal{B}}\varphi(x)$ of all $x^* \in X^*$, such that there is a $\mathcal{B}$-differentiable function $f$, such that $\varphi - f$ attains a local minimum at $x$ and $x^* = f'_{\mathcal{B}}(x)$.

(b) If in the above case we specialize the perturbation function $f$, we obtain the so-called proximal subdifferential $\partial_p\varphi(x)$ of $\varphi$ at $x$. Assuming that the norm of $X$ is Fréchet differentiable (off the origin), if
\[
f(y) \stackrel{df}{=} \langle x^*, y - x\rangle_X - k \|y - x\|_X^2,
\]
for some $k > 0$ in (a), then we obtain the proximal subdifferential $\partial_p\varphi(x)$ for a proper, lower semicontinuous function $\varphi : X \longrightarrow \overline{\mathbb{R}}$. So $x^* \in \partial_p\varphi(x)$ if and only if for some $k > 0$ and all $y$ in a neighbourhood of $x$, we have
\[
\varphi(y) - \varphi(x) + k \|y - x\|_X^2 \ge \langle x^*, y - x\rangle_X.
\]
This subdifferential is useful if $X$ is finite dimensional or if $X$ is a Hilbert space.

(c) Let $\varphi : X \longrightarrow \overline{\mathbb{R}}$ be a proper, lower semicontinuous function and $x \in \mathrm{dom}\,\varphi$. The canonical $\mathcal{B}$-subdifferential of $\varphi$ at $x$, denoted by $\partial_{\mathcal{B}}^-\varphi(x)$, is the set of all $x^* \in X^*$, such that
\[
\liminf_{\lambda \to 0} \ \inf_{h \in C} \frac{1}{\lambda} \bigl[\varphi(x + \lambda h) - \varphi(x) - \lambda \langle x^*, h\rangle_X\bigr] \ge 0 \quad \forall\, C \in \mathcal{B}.
\]

REMARK 4.4.84 Of all the subdifferentials the proximal is the smallest. The viscosity $\mathcal{B}$-subdifferential is not greater than the corresponding canonical subdifferential. The Fréchet viscosity and canonical subdifferentials coincide, if there exists a Lipschitz continuous, Fréchet differentiable bump function (see Deville, Godefroy & Zizler (1993); recall that $b : X \longrightarrow \mathbb{R}$ is a bump function if it has a nonempty and bounded support). If $\mathcal{B}_1$ and $\mathcal{B}_2$ are two bornologies in $X$ and $\mathcal{B}_2$ is finer than $\mathcal{B}_1$ (i.e., for every $C_1 \in \mathcal{B}_1$, we can find $C_2 \in \mathcal{B}_2$, such that $C_1 \subseteq C_2$), then any $\mathcal{B}_2$-subdifferential is not greater than the corresponding $\mathcal{B}_1$-subdifferential.
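To illustrate that the proximal subdifferential can be strictly smaller than the canonical Fréchet one, consider the following one dimensional computation. The function $\varphi(y) = -|y|^{3/2}$ is our own choice, evaluated at the base point $x = 0$:

```latex
% Assumption: X = \mathbb{R}, \varphi(y) = -|y|^{3/2}, base point x = 0, s \in \mathbb{R} a candidate subgradient.
\begin{align*}
\text{proximal:}\quad
 & -|y|^{3/2} + k y^2 \ge s\,y \ \text{ for all } y \text{ near } 0
   \ \text{ fails for every } s \in \mathbb{R},\ k > 0
   \ \Longrightarrow\ \partial_p\varphi(0) = \emptyset, \\
\text{canonical Fr\'echet:}\quad
 & \frac{1}{\lambda}\bigl[ \varphi(\lambda h) - \varphi(0) - \lambda s h \bigr]
   = -\operatorname{sgn}(\lambda)\,|\lambda|^{1/2} |h|^{3/2} - s h
   \xrightarrow[\lambda \to 0]{} -s h
   \ \Longrightarrow\ \partial^-_{\mathcal{B}_F}\varphi(0) = \{0\}.
\end{align*}
```

For the proximal case: if $s = 0$ the inequality reads $k|y|^{1/2} \ge 1$, which fails for small $y \ne 0$, and if $s \ne 0$ it fails for small $y$ with $sy > 0$. For the canonical case the convergence is uniform on bounded sets, and the condition $\inf_{h \in C}(-sh) \ge 0$ for every symmetric bounded $C$ forces $s = 0$.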
4.5 Integral Functionals and Subdifferentials
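Let $(\Omega, \Sigma, \mu)$ be a σ-finite measure space and let $X$ be a separable Banach space. In this section we describe the subdifferential theory of the integral functionals
\[
I_\varphi(u) = \int_\Omega \varphi\bigl(\omega, u(\omega)\bigr)\, d\mu,
\]
where $\varphi : \Omega \times X \longrightarrow \overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$ is a normal integrand (see Definition 3.4.8) and $u : \Omega \longrightarrow X$ belongs to some vector space of functions. We adopt the convention that $+\infty + (-\infty) = +\infty$ (i.e., we let $+\infty$ dominate over $-\infty$). Then for a normal integrand $\varphi : \Omega \times X \longrightarrow \overline{\mathbb{R}}$, we define the integral functional $I_\varphi : L^1(\Omega;X) \longrightarrow \mathbb{R}^*$, by
\[
I_\varphi(u) \stackrel{df}{=} \begin{cases} \displaystyle\int_\Omega \varphi\bigl(\omega, u(\omega)\bigr)\, d\mu & \text{if } \displaystyle\int_\Omega \varphi\bigl(\omega, u(\omega)\bigr)^+ d\mu < +\infty, \\[2ex] +\infty & \text{if } \displaystyle\int_\Omega \varphi\bigl(\omega, u(\omega)\bigr)^+ d\mu = +\infty. \end{cases}
\]
Similarly, if $\varphi^*$ is the conjugate of $\varphi(\omega,\cdot)$, we define the integral functional $I_{\varphi^*} : L^\infty\bigl(\Omega; X^*_{w^*}\bigr) \longrightarrow \mathbb{R}^*$, by
\[
I_{\varphi^*}(v) \stackrel{df}{=} \begin{cases} \displaystyle\int_\Omega \varphi^*\bigl(\omega, v(\omega)\bigr)\, d\mu & \text{if } \displaystyle\int_\Omega \varphi^*\bigl(\omega, v(\omega)\bigr)^+ d\mu < +\infty, \\[2ex] +\infty & \text{if } \displaystyle\int_\Omega \varphi^*\bigl(\omega, v(\omega)\bigr)^+ d\mu = +\infty. \end{cases}
\]
Recall that
\[
\bigl(L^1(\Omega;X)\bigr)^* = L^\infty\bigl(\Omega; X^*_{w^*}\bigr)
\]
(see Theorem 2.2.12; for the Banach space $L^\infty\bigl(\Omega; X^*_{w^*}\bigr)$ see Definition 2.2.10; note that Theorem 2.2.12 was stated with $\mu$ being a finite measure, but the result is true for $\mu$ being σ-finite, see Ionescu-Tulcea & Ionescu-Tulcea (1969, p. 95)). The duality brackets for the pair $\bigl(L^1(\Omega;X), L^\infty\bigl(\Omega; X^*_{w^*}\bigr)\bigr)$ are defined by
\[
\langle u, v\rangle_{L^1(\Omega;X)} \stackrel{df}{=} \int_\Omega \bigl\langle v(\omega), u(\omega)\bigr\rangle_X\, d\mu \quad \forall\, u \in L^1(\Omega;X),\ v \in L^\infty\bigl(\Omega; X^*_{w^*}\bigr).
\]
We mention that the theory can be developed for integral functionals $I_\varphi$ defined on $L^p(\Omega;X)$ with $p \in (1,+\infty)$. In this case
\[
\bigl(L^p(\Omega;X)\bigr)^* = L^{p'}\bigl(\Omega; X^*_{w^*}\bigr)
\]
with $\frac{1}{p} + \frac{1}{p'} = 1$ and if $X^*$ is separable, then
\[
\bigl(L^p(\Omega;X)\bigr)^* = L^{p'}\bigl(\Omega; X^*\bigr)
\]
(see Theorem 2.2.9). First we need to know that the integral in the definition of $I_{\varphi^*}$ makes sense.

PROPOSITION 4.5.1 If $\varphi : \Omega \times X \longrightarrow \overline{\mathbb{R}}$ is a normal integrand, then $\varphi^* : \Omega \times X^* \longrightarrow \overline{\mathbb{R}}$ defined by
\[
\varphi^*(\omega, x^*) \stackrel{df}{=} \sup_{x \in X} \bigl(\langle x^*, x\rangle_X - \varphi(\omega, x)\bigr)
\]
is a convex normal integrand on $\Omega \times X^*_{w^*}$.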
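PROOF Consider the multifunction $E : \Omega \longrightarrow 2^{X \times \mathbb{R}} \setminus \{\emptyset\}$, defined by
\[
E(\omega) \stackrel{df}{=} \mathrm{epi}\,\varphi(\omega,\cdot) = \bigl\{ (x,\lambda) \in X \times \mathbb{R} : \varphi(\omega,x) \le \lambda \bigr\}.
\]
Evidently $E$ has nonempty and closed values (by virtue of the normality of $\varphi$). Moreover, we have
\[
\mathrm{Gr}\,E = \bigl\{ (\omega,x,\lambda) \in \Omega \times X \times \mathbb{R} : \varphi(\omega,x) \le \lambda \bigr\} \in \Sigma \times \mathcal{B}(X) \times \mathcal{B}(\mathbb{R}) = \Sigma \times \mathcal{B}(X \times \mathbb{R}),
\]
where $\mathcal{B}(Z)$ is the Borel σ-field of $Z$. So we can find two sequences $\{u_n : \Omega \longrightarrow X\}_{n \ge 1}$ and $\{\lambda_n : \Omega \longrightarrow \mathbb{R}\}_{n \ge 1}$ of Σ-measurable functions, such that
\[
E(\omega) = \overline{\bigl\{ \bigl(u_n(\omega), \lambda_n(\omega)\bigr) \bigr\}_{n \ge 1}} \quad \forall\, \omega \in \Omega
\]
(see Denkowski, Migórski & Papageorgiou (2003a, p. 433)). Note that
\[
\varphi^*(\omega, x^*) = \sup_{n \ge 1} \bigl[\langle x^*, u_n(\omega)\rangle_X - \lambda_n(\omega)\bigr].
\]
Thus we conclude that the function $(\omega, x^*) \longmapsto \varphi^*(\omega, x^*)$ is a convex normal integrand on $\Omega \times X^*_{w^*}$.

THEOREM 4.5.2
(a) If $I_\varphi : L^1(\Omega;X) \longrightarrow \mathbb{R}^*$ is finite at $u_0 \in L^1(\Omega;X)$, then $(I_\varphi)^* = I_{\varphi^*}$.
(b) If $\varphi$ is a convex normal integrand (i.e., $\varphi(\omega,\cdot)$ is convex for µ-almost all $\omega \in \Omega$) and $I_\varphi$ and $I_{\varphi^*}$ are finite at $u_0 \in L^1(\Omega;X)$ and $v_0 \in L^\infty\bigl(\Omega; X^*_{w^*}\bigr)$ respectively, then $I_\varphi$ and $I_{\varphi^*}$ are proper, convex and lower semicontinuous functionals which are conjugate to each other.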
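PROOF (a) Evidently it suffices to show that for all $v \in L^\infty\bigl(\Omega; X^*_{w^*}\bigr)$, we have
\[
\int_\Omega \varphi^*\bigl(\omega, v(\omega)\bigr)\, d\mu \le \sup_{u \in L^1(\Omega;X)} \Bigl( \langle v, u\rangle_{L^1(\Omega;X)} - I_\varphi(u) \Bigr) \tag{4.34}
\]
(see Proposition 4.4.4). Let $\xi \in \mathbb{R}$ be such that $\xi < I_{\varphi^*}(v)$. We can obtain (4.34) if we show that there exists $u \in L^1(\Omega;X)$, such that
\[
\langle v, u\rangle_{L^1(\Omega;X)} - I_\varphi(u) > \xi.
\]
Since by hypothesis $I_\varphi(u_0)$ is finite, we can find $\vartheta_0 \in L^1(\Omega)$, such that
\[
\bigl\langle v(\omega), u_0(\omega)\bigr\rangle_X - \varphi\bigl(\omega, u_0(\omega)\bigr) \ge \vartheta_0(\omega) \quad \text{for µ-a.a. } \omega \in \Omega. \tag{4.35}
\]
We obtain
\[
\varphi^*\bigl(\omega, v(\omega)\bigr) \ge \vartheta_0(\omega) \quad \text{for µ-a.a. } \omega \in \Omega.
\]
We claim that there exists a function $\beta \in L^1(\Omega)$, such that
\[
\xi < \int_\Omega \beta(\omega)\, d\mu \quad \text{and} \quad \beta(\omega) < \varphi^*\bigl(\omega, v(\omega)\bigr) \quad \text{for µ-a.a. } \omega \in \Omega.
\]
Let $g \in L^1(\Omega)$, $g(\omega) > 0$ for µ-almost all $\omega \in \Omega$. If $I_{\varphi^*}(v)$ is finite, then let
\[
\beta(\omega) \stackrel{df}{=} \varphi^*\bigl(\omega, v(\omega)\bigr) - \varepsilon g(\omega),
\]
with $\varepsilon > 0$ sufficiently small (so that $\xi < \int_\Omega \beta(\omega)\, d\mu$). If $I_{\varphi^*}(v) = +\infty$, then we define
\[
f_n(\omega) \stackrel{df}{=} \begin{cases} \min\bigl\{ n g(\omega), \tfrac{1}{2} \varphi^*\bigl(\omega, v(\omega)\bigr) \bigr\} & \text{if } \varphi^*\bigl(\omega, v(\omega)\bigr) > 0, \\[1ex] \varphi^*\bigl(\omega, v(\omega)\bigr) - g(\omega) & \text{if } \varphi^*\bigl(\omega, v(\omega)\bigr) \le 0. \end{cases}
\]
Note that
\[
f_n(\omega) \longrightarrow \tfrac{1}{2} \varphi^*\bigl(\omega, v(\omega)\bigr) \quad \forall\, \omega \in \bigl\{ \omega \in \Omega : \varphi^*\bigl(\omega, v(\omega)\bigr) > 0 \bigr\}.
\]
Hence by the monotone convergence theorem (see Theorem A.2.10), we have that
\[
\lim_{n \to +\infty} \int_\Omega f_n(\omega)\, d\mu = +\infty
\]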
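and so we can find $n_0 \ge 1$ large enough so that
\[
\xi < \int_\Omega f_{n_0}(\omega)\, d\mu.
\]
Therefore if we take $\beta = f_{n_0}$, we have
\[
\beta(\omega) < \varphi^*\bigl(\omega, v(\omega)\bigr) \quad \text{for µ-a.a. } \omega \in \Omega.
\]
Consider the multifunction $S : \Omega \longrightarrow 2^X$, defined by
\[
S(\omega) \stackrel{df}{=} \bigl\{ x \in X : \bigl\langle v(\omega), x\bigr\rangle_X - \varphi(\omega, x) \ge \beta(\omega) \bigr\} \quad \forall\, \omega \in \Omega.
\]
Evidently $S$ has nonempty, closed values and
\[
\mathrm{Gr}\,S = \bigl\{ (\omega, x) \in \Omega \times X : x \in S(\omega) \bigr\} \in \Sigma \times \mathcal{B}(X).
\]
By the Yankov-von Neumann-Aumann selection theorem (see Theorem A.2.32), we can find a Σ-measurable function $s : \Omega \longrightarrow X$, such that
\[
s(\omega) \in S(\omega) \quad \forall\, \omega \in \Omega.
\]
Since $\mu$ is σ-finite, we can find $\Omega_0 \in \Sigma$ with $\mu(\Omega_0) < +\infty$, such that $s|_{\Omega_0}$ is bounded and
\[
\int_{\Omega_0} \beta(\omega)\, d\mu + \int_{\Omega \setminus \Omega_0} \vartheta_0(\omega)\, d\mu > \xi.
\]
We set
\[
u(\omega) \stackrel{df}{=} \begin{cases} s(\omega) & \text{if } \omega \in \Omega_0, \\ u_0(\omega) & \text{if } \omega \in \Omega \setminus \Omega_0. \end{cases}
\]
Evidently $u \in L^1(\Omega;X)$ and we have
\[
\bigl\langle v(\omega), u(\omega)\bigr\rangle_X - \varphi\bigl(\omega, u(\omega)\bigr) \ge \beta(\omega) \quad \forall\, \omega \in \Omega_0
\]
and
\[
\bigl\langle v(\omega), u(\omega)\bigr\rangle_X - \varphi\bigl(\omega, u(\omega)\bigr) \ge \vartheta_0(\omega) \quad \forall\, \omega \in \Omega \setminus \Omega_0
\]
(see (4.35)). Therefore
\[
\int_\Omega \bigl\langle v(\omega), u(\omega)\bigr\rangle_X\, d\mu - \int_\Omega \varphi\bigl(\omega, u(\omega)\bigr)\, d\mu \ge \int_{\Omega_0} \beta(\omega)\, d\mu + \int_{\Omega \setminus \Omega_0} \vartheta_0(\omega)\, d\mu > \xi,
\]
so $I_{\varphi^*}(v) = (I_\varphi)^*(v)$.

(b) Since $\varphi = \varphi^{**}$ (see Theorem 4.4.14), this follows at once from part (a).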
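The conjugacy $(I_\varphi)^* = I_{\varphi^*}$ can be verified directly in a simple scalar setting. The following sketch assumes $\Omega = [0,1]$ with the Lebesgue measure, $X = \mathbb{R}$ and the integrand $\varphi(\omega,x) = x^2/2$ (all our own illustrative choices), so that $\varphi^*(\omega,x^*) = (x^*)^2/2$:

```latex
% Assumption: \Omega = [0,1] with Lebesgue measure \mu, X = \mathbb{R}, \varphi(\omega,x) = x^2/2.
\begin{align*}
I_\varphi(u) &= \tfrac{1}{2} \int_0^1 u(\omega)^2\, d\mu \in [0,+\infty], \qquad u \in L^1(0,1), \\
(I_\varphi)^*(v)
  &= \sup_{u \in L^1(0,1)} \int_0^1 \Bigl[ v(\omega) u(\omega) - \tfrac{1}{2} u(\omega)^2 \Bigr] d\mu
   = \tfrac{1}{2} \int_0^1 v(\omega)^2\, d\mu
   = I_{\varphi^*}(v), \qquad v \in L^\infty(0,1).
\end{align*}
```

The integrand in the supremum is bounded above pointwise by $v(\omega)^2/2$, with equality for $u = v$, and $v \in L^\infty(0,1) \subseteq L^1(0,1)$ is an admissible competitor. In this example the measurable selection step of the proof is trivial because the pointwise maximizer is explicit.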
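Next we consider integral functionals defined on the Lebesgue-Bochner space $L^\infty(\Omega; X^*)$. So as before $(\Omega, \Sigma, \mu)$ is a σ-finite measure space, but now $X$ is a separable Banach space with a separable dual $X^*$. Recall that $X^*_{w^*}$ (i.e., the Banach space $X^*$ supplied with the $w^*$-topology) is a Souslin space (see Definition A.2.29(b)). It follows that $\mathcal{B}(X^*) = \mathcal{B}(X^*_{w^*})$ (see Denkowski, Migórski & Papageorgiou (2003a, p. 211)). Also let $\varphi : \Omega \times X^*_{w^*} \longrightarrow \overline{\mathbb{R}}$ be a convex normal integrand. We consider the following integral functionals
\[
I_\varphi(u) \stackrel{df}{=} \int_\Omega \varphi\bigl(\omega, u(\omega)\bigr)\, d\mu \quad \forall\, u \in L^1(\Omega; X^*) = L^1(\Omega; X^*_{w^*})
\]
and
\[
I_{\varphi^*}(v) \stackrel{df}{=} \int_\Omega \varphi^*\bigl(\omega, v(\omega)\bigr)\, d\mu \quad \forall\, v \in L^\infty(\Omega; X).
\]
From Theorem 4.5.2(b), we know that if $\mathrm{dom}\,I_\varphi \ne \emptyset$ and $\mathrm{dom}\,I_{\varphi^*} \ne \emptyset$, then the functionals $I_\varphi$ and $I_{\varphi^*}$ are conjugate to each other. So we have
\[
I_{\varphi^*}(v) = \sup_{u \in L^1(\Omega; X^*)} \bigl[ \langle v, u\rangle_{L^1(\Omega; X^*)} - I_\varphi(u) \bigr]
\]
and
\[
I_\varphi(u) = \sup_{v \in L^\infty(\Omega; X)} \bigl[ \langle u, v\rangle_{L^1(\Omega; X^*)} - I_{\varphi^*}(v) \bigr].
\]
What about $\bigl(I_{\varphi^*}\bigr)^*$ defined on $\bigl(L^\infty(\Omega; X)\bigr)^*$? To get an expression for this conjugate, first we need to introduce the structure of $\bigl(L^\infty(\Omega; X)\bigr)^*$.

DEFINITION 4.5.3 Let $(\Omega, \Sigma, \mu)$ be a σ-finite measure space and let $Y$ be a Banach space.

(a) A functional $l \in \bigl(L^\infty(\Omega; Y)\bigr)^*$ is said to be absolutely continuous with respect to $\mu$, if
\[
l(v) = \int_\Omega \bigl\langle u(\omega), v(\omega)\bigr\rangle_Y\, d\mu \quad \forall\, v \in L^\infty(\Omega; Y),
\]
with $u \in L^1(\Omega; Y^*_{w^*})$. The function $u$ is said to be the density of $l$ with respect to $\mu$. We have
\[
\|l\| = \|u\|_{L^1(\Omega; Y^*_{w^*})} = \int_\Omega \bigl\|u(\omega)\bigr\|_{Y^*}\, d\mu.
\]
So we can identify an absolutely continuous functional $l \in \bigl(L^\infty(\Omega; Y)\bigr)^*$ with its density with respect to $\mu$.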
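(b) A functional $l \in \bigl(L^\infty(\Omega; Y)\bigr)^*$ is said to be singular with respect to $\mu$, if there exists a decreasing sequence $\{C_n\}_{n \ge 1} \subseteq \Sigma$, such that $\mu(C_n) \downarrow 0$ and $l$ is supported by $C_n$ for $n \ge 1$, that is if $v \in L^\infty(\Omega; Y)$ and $v$ vanishes on some $C_n$, then $l(v) = 0$.

REMARK 4.5.4 If $\mu$ is finite, then $l \in \bigl(L^\infty(\Omega; Y)\bigr)^*$ is singular if and only if for every $\varepsilon > 0$, we can find $A \in \Sigma$, such that $\mu(A) \le \varepsilon$ and $l$ is supported by $A$ (i.e., if $v \in L^\infty(\Omega; Y)$, $v|_A = 0$, then $l(v) = 0$). For a given $l \in \bigl(L^\infty(\Omega; Y)\bigr)^*$ and $A \in \Sigma$, we define $l^A \in \bigl(L^\infty(\Omega; Y)\bigr)^*$, by
\[
l^A(v) \stackrel{df}{=} l\bigl(\chi_A v\bigr) \quad \forall\, v \in L^\infty(\Omega; Y).
\]
It is easy to see that if $A, B \in \Sigma$ and $A \cap B = \emptyset$, then
\[
\bigl\|l^{A \cup B}\bigr\|_{(L^\infty(\Omega;Y))^*} = \bigl\|l^A\bigr\|_{(L^\infty(\Omega;Y))^*} + \bigl\|l^B\bigr\|_{(L^\infty(\Omega;Y))^*}.
\]

PROPOSITION 4.5.5 If $(\Omega, \Sigma, \mu)$ is a σ-finite measure space, $Y$ is a Banach space, $l \in \bigl(L^\infty(\Omega; Y)\bigr)^*$ and for every $\varepsilon > 0$, there exists $A \in \Sigma$, with
\[
\mu(A) \le \varepsilon \quad \text{and} \quad \|l\|_{(L^\infty(\Omega;Y))^*} - \varepsilon \le \bigl\|l^A\bigr\|_{(L^\infty(\Omega;Y))^*},
\]
then $l \in \bigl(L^\infty(\Omega; Y)\bigr)^*$ is singular with respect to $\mu$.

PROOF Let $\{A_n\}_{n \ge 1} \subseteq \Sigma$ be such that
\[
\mu(A_n) \le \frac{1}{2^n} \quad \text{and} \quad \|l\|_{(L^\infty(\Omega;Y))^*} - \frac{1}{2^n} \le \bigl\|l^{A_n}\bigr\|_{(L^\infty(\Omega;Y))^*} \quad \forall\, n \ge 1.
\]
Since $\|l\|_{(L^\infty(\Omega;Y))^*} = \bigl\|l^{A_n}\bigr\|_{(L^\infty(\Omega;Y))^*} + \bigl\|l^{A_n^c}\bigr\|_{(L^\infty(\Omega;Y))^*}$, we have
\[
\bigl\|l^{A_n^c}\bigr\|_{(L^\infty(\Omega;Y))^*} \le \frac{1}{2^n} \quad \forall\, n \ge 1.
\]
Let
\[
C_n \stackrel{df}{=} \bigcup_{k=n+1}^\infty A_k \quad \forall\, n \ge 1.
\]
We have
\[
\mu(C_n) \le \frac{1}{2^n} \quad \forall\, n \ge 1.
\]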
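We claim that $l \in \bigl(L^\infty(\Omega; Y)\bigr)^*$ is supported by $C_n$ for all $n \ge 1$. To this end let $v \in L^\infty(\Omega; Y)$ with $v|_{C_n} = 0$. So for all $k \ge n+1$, we have that
\[
v = \chi_{A_k^c} v
\]
and so
\[
\bigl|l(v)\bigr| = \bigl|l^{A_k^c}(v)\bigr| \le \bigl\|l^{A_k^c}\bigr\|_{(L^\infty(\Omega;Y))^*} \|v\|_{L^\infty(\Omega;Y)} \le \frac{1}{2^k} \|v\|_{L^\infty(\Omega;Y)}.
\]
Let $k \to +\infty$ to obtain that $l(v) = 0$ and so $l \in \bigl(L^\infty(\Omega; Y)\bigr)^*$ is singular.

PROPOSITION 4.5.6 If $(\Omega, \Sigma, \mu)$ is a finite measure space, $Y$ is a Banach space,
\[
L^\infty(\Omega) \otimes Y = \mathrm{span}\bigl(\bigl\{ gy : g \in L^\infty(\Omega),\ y \in Y \bigr\}\bigr),
\]
$l \in \bigl(L^\infty(\Omega; Y)\bigr)^*$ and
\[
l(w) = 0 \quad \forall\, w \in L^\infty(\Omega) \otimes Y,
\]
then $l$ is singular.

PROOF Let $Z$ be the subspace of $L^\infty(\Omega; Y)$ consisting of the equivalence classes of countably valued functions from $\Omega$ into $Y$. From Corollary 2.1.4, we know that $Z$ is dense in $L^\infty(\Omega; Y)$. So for a given $\varepsilon > 0$, we can find $z \in Z$ with $\|z\|_{L^\infty(\Omega;Y)} \le 1$, such that
\[
\|l\|_{(L^\infty(\Omega;Y))^*} - \varepsilon \le l(z).
\]
So there exist a sequence $\{A_m\}_{m \ge 1} \subseteq \Sigma$ of pairwise disjoint sets with $\Omega = \bigcup_{m=1}^\infty A_m$ and a sequence $\{x_m\}_{m \ge 1} \subseteq Y$, such that
\[
z(\omega) = x_m \quad \forall\, \omega \in A_m.
\]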
Let $n \ge 1$ be large enough, such that
\[
\mu\Bigl(\bigcup_{m=n}^\infty A_m\Bigr) \le \varepsilon.
\]
Since
\[
w(\cdot) = \sum_{m=1}^{n-1} \chi_{A_m}(\cdot)\, x_m \in L^\infty(\Omega) \otimes Y,
\]
we have
\[
l(z) = l\Bigl(\chi_{\bigcup_{m=n}^\infty A_m}\, z\Bigr) \ge \|l\|_{(L^\infty(\Omega;Y))^*} - \varepsilon.
\]
Applying Proposition 4.5.5, we obtain that $l \in \bigl(L^\infty(\Omega; Y)\bigr)^*$ is singular with respect to $\mu$.

Let us state the theorem which characterizes the dual space $\bigl(L^\infty(\Omega; Y)\bigr)^*$. The decomposition produced by this theorem is analogous to the Lebesgue decomposition of measures. For a proof of the theorem we refer to Levin (1974).

THEOREM 4.5.7 If $(\Omega, \Sigma, \mu)$ is a σ-finite measure space, $Y$ is a Banach space and $L_s$ is the space of singular continuous linear functionals on $L^\infty(\Omega; Y)$, then $\bigl(L^\infty(\Omega; Y)\bigr)^*$ is isometrically isomorphic to $L^1(\Omega; Y^*_{w^*}) \oplus L_s$ and
\[
\|l\|_{(L^\infty(\Omega;Y))^*} = \|u\|_{L^1(\Omega; Y^*_{w^*})} + \|l_s\|_{(L^\infty(\Omega;Y))^*},
\]
where $l \in \bigl(L^\infty(\Omega; Y)\bigr)^*$, $u \in L^1(\Omega; Y^*_{w^*})$ and $l_s \in L_s$.

Now that we have a complete description of the dual space $\bigl(L^\infty(\Omega; Y)\bigr)^*$, we can return to our initial problem, namely the formula for $\bigl(I_{\varphi^*}\bigr)^*$, defined on $\bigl(L^\infty(\Omega; X)\bigr)^*$. Recall that the mathematical setting is the following: $(\Omega, \Sigma, \mu)$ is a σ-finite measure space, $X$ is a separable Banach space with a separable dual $X^*$ and $\varphi : \Omega \times X^*_{w^*} \longrightarrow \overline{\mathbb{R}}$ is a convex normal integrand. As we already mentioned $\mathcal{B}(X^*) = \mathcal{B}(X^*_{w^*})$.

We want to derive a formula for $\bigl(I_{\varphi^*}\bigr)^* : \bigl(L^\infty(\Omega; X)\bigr)^* \longrightarrow \overline{\mathbb{R}}$, defined by
\[
\bigl(I_{\varphi^*}\bigr)^*(l) \stackrel{df}{=} \sup_{v \in L^\infty(\Omega; X)} \bigl[ l(v) - I_{\varphi^*}(v) \bigr] \quad \forall\, l \in \bigl(L^\infty(\Omega; X)\bigr)^*.
\]
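THEOREM 4.5.8 If the above hypotheses hold and $\mathrm{dom}\, I_\varphi$, $\mathrm{dom}\, I_{\varphi^*}$ are nonempty, then
$$(I_{\varphi^*})^*(l) = I_\varphi(u) + \sigma_{\mathrm{dom}\, I_{\varphi^*}}(l_s),$$
where $l \in (L^\infty(\Omega;X))^*$, $u \in L^1(\Omega;X^*)$ is the density of $l$ with respect to $\mu$ and $l_s \in L_s$ is the singular part of $l$ with respect to $\mu$ (so we have that $l = u + l_s$ and $\|l\|_{(L^\infty(\Omega;X))^*} = \|u\|_{L^1(\Omega;X^*)} + \|l_s\|_{(L^\infty(\Omega;X))^*}$; see Theorem 4.5.7).

PROOF By definition, we have
$$(I_{\varphi^*})^*(l) = \sup_{v \in \mathrm{dom}\, I_{\varphi^*}} \big[ \langle u, v\rangle_{L^\infty(\Omega;X)} + l_s(v) - I_{\varphi^*}(v) \big] \le \sup_{v \in \mathrm{dom}\, I_{\varphi^*}} \big[ \langle u, v\rangle_{L^\infty(\Omega;X)} - I_{\varphi^*}(v) \big] + \sup_{\hat v \in \mathrm{dom}\, I_{\varphi^*}} l_s(\hat v) = I_\varphi(u) + \sigma_{\mathrm{dom}\, I_{\varphi^*}}(l_s). \qquad (4.36)$$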
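Let $\{C_n\}_{n \ge 1} \subseteq \Sigma$ be a decreasing sequence, such that $\mu(C_n) \searrow 0$ and $l_s$ is supported by $C_n$ for every $n \ge 1$. Also let $v_0 \in \mathrm{dom}\, I_{\varphi^*}$, $\varepsilon > 0$ and $\xi \in \mathbb{R}$ be such that $\xi < I_\varphi(u)$. Note that
$$\varphi(\omega, u(\omega)) \ge \langle u(\omega), v_0(\omega)\rangle_X - \varphi^*(\omega, v_0(\omega)) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega$$
and so $I_\varphi(u) > -\infty$. Then for $n \ge 1$ large enough, we have
$$\int_{C_n} \big[ \langle u(\omega), v_0(\omega)\rangle_X - \varphi^*(\omega, v_0(\omega)) \big]\, d\mu \ge -\varepsilon \qquad (4.37)$$
and
$$\int_{C_n^c} \varphi(\omega, u(\omega))\, d\mu \ge \xi. \qquad (4.38)$$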
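We have
$$(I_{\varphi^*})^*(l) = \sup_{v \in L^\infty(\Omega;X)} \Big[ l(v) - \int_\Omega \varphi^*(\omega, v(\omega))\, d\mu \Big]$$
$$= \sup_{v \in L^\infty(\Omega \setminus C_n;X)} \Big[ \langle u, v\rangle_{L^\infty(\Omega;X)} - \int_{C_n^c} \varphi^*(\omega, v(\omega))\, d\mu \Big] + \sup_{\hat v \in L^\infty(C_n;X)} \Big[ \langle u, \hat v\rangle_{L^\infty(\Omega;X)} + l_s(\hat v) - \int_{C_n} \varphi^*(\omega, \hat v(\omega))\, d\mu \Big]$$
$$= \int_{C_n^c} \varphi(\omega, u(\omega))\, d\mu + \sup_{\hat v \in L^\infty(C_n;X)} \Big[ l_s(\hat v) + \int_{C_n} \big[ \langle u(\omega), \hat v(\omega)\rangle_X - \varphi^*(\omega, \hat v(\omega)) \big]\, d\mu \Big].$$
Let $\hat v \overset{df}{=} \chi_{C_n} v_0$. Since $l_s(\chi_{C_n^c} v_0) = 0$, we have $l_s(v_0) = l_s(\chi_{C_n} v_0)$ and so, using also (4.37) and (4.38), we obtain
$$(I_{\varphi^*})^*(l) \ge \int_{C_n^c} \varphi(\omega, u(\omega))\, d\mu + l_s(v_0) + \int_{C_n} \big[ \langle u(\omega), v_0(\omega)\rangle_X - \varphi^*(\omega, v_0(\omega)) \big]\, d\mu \ge \xi + l_s(v_0) - \varepsilon.$$
Since $\varepsilon > 0$ was arbitrary, we let $\varepsilon \searrow 0$ and have
$$(I_{\varphi^*})^*(l) \ge \xi + l_s(v_0) \quad \forall\, v_0 \in \mathrm{dom}\, I_{\varphi^*}.$$
Because $\xi < I_\varphi(u)$ was also arbitrary, taking the supremum over $\xi$ and over $v_0$ gives
$$(I_{\varphi^*})^*(l) \ge I_\varphi(u) + \sigma_{\mathrm{dom}\, I_{\varphi^*}}(l_s). \qquad (4.39)$$
From (4.36) and (4.39), we conclude that
$$(I_{\varphi^*})^*(l) = I_\varphi(u) + \sigma_{\mathrm{dom}\, I_{\varphi^*}}(l_s).$$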
REMARK 4.5.9 If $\mathrm{dom}\, I_{\varphi^*} = L^\infty(\Omega;X)$, then $I_{\varphi^*}$ is continuous, in fact locally Lipschitz on $L^\infty(\Omega;X)$ (see Theorems 4.2.6 and 4.2.7). We will show that we can say more about the continuity of $I_{\varphi^*}$, when $\mathrm{dom}\, I_{\varphi^*} = L^\infty(\Omega;X)$ (see Proposition 4.5.14).

DEFINITION 4.5.10 Let $V$ and $W$ be two linear spaces.
(a) We say that $(V,W)$ form a dual pair (or dual system), if there exists a bilinear functional $b : V \times W \longrightarrow \mathbb{R}$, written $b(v,w) = \langle v, w\rangle$, such that
(i) if $\langle v, w\rangle = 0$ for all $w \in W$, then $v = 0$;
(ii) if $\langle v, w\rangle = 0$ for all $v \in V$, then $w = 0$.
(b) For a given dual pair $(V,W)$ and a topology $\tau_V$ on $V$, we say that $\tau_V$ is compatible with the dual pair $(V,W)$, if $\tau_V$ is a locally convex vector topology and $(V_{\tau_V})^* = W$ (that is, if $v^* \in (V_{\tau_V})^*$, then $v^*(v) = \langle v, w\rangle$ for some $w \in W$ and all $v \in V$ and conversely). So we view $W$ as a subspace of the algebraic dual of $V$. Dually, a compatible topology $\tau_W$ on $W$ is one such that $(W_{\tau_W})^* = V$.
(c) The smallest topology on $V$ compatible with the dual pair $(V,W)$ is the weak topology denoted by $w(V,W)$. The largest topology on $V$ compatible with the dual pair is the Mackey topology denoted by $m(V,W)$.

REMARK 4.5.11 From the properties of the bilinear form $\langle\cdot,\cdot\rangle$, we see easily that a compatible topology is always Hausdorff. All compatible topologies have the same closed, convex sets and the same bounded sets (i.e., these properties are duality invariant). The Mackey topology $m(V,W)$ is the topology of uniform convergence on all balanced, convex, $w$-compact sets in $W$. Recall that $A \subseteq W$ is balanced if $\lambda A \subseteq A$ for all $|\lambda| \le 1$. If $V = X$ is a Banach space and $W = X^*$, then $w(V,W)$ is the usual weak topology on $X$ and $m(V,W)$ is the norm topology on $X$. On the other hand, if $V = X^*$ and $W = X$, then $w(V,W)$ is the weak$^*$-topology on $X^*$ and $m(V,W)$ is strictly smaller than the norm topology, unless $X$ is reflexive.

DEFINITION 4.5.12 Let $(V,W)$ be a dual pair, let $\tau_W$ be a compatible topology on $W$ and let $\varphi : W \longrightarrow \overline{\mathbb{R}}^*$ be a function. We say that $\varphi$ is $\tau_W$-inf-compact for the slope $v_0 \in V$, if for all $\lambda \in \mathbb{R}$, the set
$$\{ w \in W : \varphi(w) - \langle v_0, w\rangle \le \lambda \}$$
is $\tau_W$-compact. If $v_0 = 0$, then we simply say that $\varphi$ is $\tau_W$-inf-compact.

The next proposition establishes an interesting connection between continuity of $\varphi$ and $\tau_W$-inf-compactness of $\varphi^*$.
PROPOSITION 4.5.13 If $(V,W)$ is a dual pair, $w = w(V,W)$ and $\varphi \in \Gamma_0(V_w)$, then $\varphi$ is finite and $m(V,W)$-continuous at $v_0 \in V$ if and only if $\varphi^*$ is $w(W,V)$-inf-compact for the slope $v_0 \in V$.

Using Theorem 4.5.8 and Proposition 4.5.13, we infer the following result for the integral functional $I_{\varphi^*}$.

PROPOSITION 4.5.14 If the hypotheses of Theorem 4.5.8 hold with $\mathrm{dom}\, I_{\varphi^*} = L^\infty(\Omega;X)$, then
(a) $I_{\varphi^*}$ is an $m\big(L^\infty(\Omega;X), L^1(\Omega;X^*)\big)$-continuous function on $L^\infty(\Omega;X)$;
(b) $I_\varphi$ is a $w\big(L^1(\Omega;X^*), L^\infty(\Omega;X)\big)$-inf-compact function for every slope in $L^\infty(\Omega;X)$;
(c) we have
$$(I_{\varphi^*})^*(l) = \begin{cases} I_\varphi(u) & \text{if } l = u \in L^1(\Omega;X^*), \\ +\infty & \text{otherwise.} \end{cases}$$
Another useful continuity result for the integral functional $I_\varphi$ is the following.

PROPOSITION 4.5.15 If $(\Omega,\Sigma,\mu)$ is a nonatomic finite measure space, $X$ is a separable Banach space, $\varphi : \Omega \times X \longrightarrow \overline{\mathbb{R}}$ is a convex normal integrand, $p \in [1,+\infty)$ and $I_\varphi : L^p(\Omega;X) \longrightarrow \overline{\mathbb{R}}$ is continuous at a point, then $I_\varphi$ is continuous everywhere.

PROOF Let $u_0 \in L^p(\Omega;X)$ be the point of continuity of $I_\varphi$. By considering if necessary the functional
$$x \longmapsto \hat I_\varphi(x) \overset{df}{=} I_\varphi(u_0 + x) - I_\varphi(u_0),$$
we may assume without any loss of generality that $u_0 = 0$ and $I_\varphi(0) = 0$. So we can find $\delta > 0$, such that
$$I_\varphi(u) \le 1 \quad \forall\, \|u\|_{L^p(\Omega;X)} \le \delta.$$
Let $u \in L^p(\Omega;X)$ be arbitrary. Exploiting the nonatomicity of $\mu$ and the absolute continuity of the Lebesgue integral, we can find $\delta_1 > 0$ and pairwise disjoint sets $\{A_k\}_{k=1}^N \subseteq \Sigma$ with $\Omega = \bigcup_{k=1}^N A_k$, such that
$$\mu(A_k) \le \delta_1 \quad \text{and} \quad \|\chi_{A_k} u\|_{L^p(\Omega;X)} \le \delta \quad \forall\, k \in \{1,\ldots,N\}.$$
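We have
$$\int_{A_k} \varphi(\omega, u(\omega))\, d\mu = \int_\Omega \varphi(\omega, \chi_{A_k}(\omega) u(\omega))\, d\mu - \int_{A_k^c} \varphi(\omega, 0)\, d\mu \le 1 + \xi,$$
for some $\xi > 0$ independent of $k \in \{1,\ldots,N\}$. Therefore, we conclude that
$$\sum_{k=1}^N \int_{A_k} \varphi(\omega, u(\omega))\, d\mu = \int_\Omega \varphi(\omega, u(\omega))\, d\mu = I_\varphi(u) < +\infty,$$
so $I_\varphi$ is continuous everywhere on $L^p(\Omega;X)$ (see Theorem 4.2.3).

Next we describe the subdifferential of the integral functional $I_\varphi$.

THEOREM 4.5.16 If $(\Omega,\Sigma,\mu)$ is a σ-finite measure space, $X$ is a separable Banach space, $\varphi : \Omega \times X \longrightarrow \overline{\mathbb{R}}$ is a convex normal integrand and $I_\varphi : L^p(\Omega;X) \longrightarrow \overline{\mathbb{R}}$, $p \in [1,+\infty)$, is finite for at least one $u_0 \in L^p(\Omega;X)$, then for every $u \in L^p(\Omega;X)$, we have that $u^* \in \partial I_\varphi(u) \subseteq L^{p'}(\Omega;X^*_{w^*})$ (with $\frac{1}{p} + \frac{1}{p'} = 1$) if and only if
$$u^*(\omega) \in \partial\varphi(\omega, u(\omega)) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega.$$

PROOF According to Proposition 4.4.21, we have that $u^* \in \partial I_\varphi(u)$ if and only if
$$I_\varphi(u) + (I_\varphi)^*(u^*) = \langle u^*, u\rangle_{L^p(\Omega;X)}$$
(see Remark 2.2.13). From Theorem 4.5.2(a), we know that $(I_\varphi)^* = I_{\varphi^*}$. So we have that $u^* \in \partial I_\varphi(u)$ if and only if
$$\int_\Omega \big[ \varphi(\omega, u(\omega)) + \varphi^*(\omega, u^*(\omega)) \big]\, d\mu = \int_\Omega \langle u^*(\omega), u(\omega)\rangle_X\, d\mu.$$
The result now follows from the fact that
$$\varphi(\omega, u(\omega)) + \varphi^*(\omega, u^*(\omega)) \ge \langle u^*(\omega), u(\omega)\rangle_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega.$$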
Before passing to the study of the subdifferentials of nonconvex integral functionals, let us prove a last result for convex subdifferentials. It concerns functionals defined on the space of continuous functions on a compact metric space $K$. So let $K$ be a compact metric space and consider the Banach space $C(K)$ (with the supremum norm). The Riesz-Markov representation theorem (see Theorem 2.3.41) says that $C(K)^* = M(K)$, where $M(K)$ is the Banach space of Radon measures, i.e., the space of all signed Borel measures which are of bounded variation with the norm given by the total variation, namely
$$\|\mu\| = \sup\Big\{ \sum_{k=1}^N |\mu(A_k)| : \bigcup_{k=1}^N A_k \subseteq K,\ A_k \cap A_i = \emptyset \text{ for } k \ne i,\ N \ge 1 \Big\}.$$
In what follows by $\langle\cdot,\cdot\rangle_{C(K)}$ we denote the duality brackets for the pair $\big(C(K), M(K)\big)$, i.e.,
$$\langle \mu, u\rangle_{C(K)} \overset{df}{=} \int_K u(x)\, d\mu(x) \quad \forall\, u \in C(K),\ \mu \in M(K).$$
We say that $\mu \in M(K)$ is positive, denoted by $\mu \ge 0$, if
$$\langle \mu, u\rangle_{C(K)} \ge 0 \quad \forall\, u \in C(K),\ u \ge 0.$$
If $e \in C(K)$ is the function, such that $e(x) = 1$ for all $x \in K$, and $\langle \mu, e\rangle_{C(K)} = 1$, then we say that the Radon measure $\mu \in M(K)$ has total mass one. A Radon measure $\mu \in M(K)$ vanishes in an open set $U \subseteq K$, if
$$\langle \mu, u\rangle_{C(K)} = 0 \quad \forall\, u \in C(K),\ \mathrm{supp}\, u \subseteq U;$$
recall that
$$\mathrm{supp}\, u \overset{df}{=} \overline{\{x \in K : u(x) \ne 0\}}.$$
By using a partition of unity, we can show that if $\mu$ vanishes in a collection of open sets $U_r$, then $\mu$ also vanishes on $\bigcup U_r$. Hence it follows that there exists a largest open set $\hat U$ where $\mu$ vanishes. Then the set $K \setminus \hat U$ is called the support of $\mu$ and is denoted by $\mathrm{supp}\, \mu$.
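LEMMA 4.5.17 If $u \in C(K)$, $\mu \in M(K)$ and $u|_{\mathrm{supp}\, \mu} = 0$, then $\langle \mu, u\rangle_{C(K)} = 0$.

PROOF If $d_K$ is a metric on $K$, for each $\varepsilon > 0$, let
$$(\mathrm{supp}\, \mu)_\varepsilon \overset{df}{=} \{ x \in K : d_K(x, \mathrm{supp}\, \mu) < \varepsilon \}.$$
Using Urysohn's lemma (see Theorem A.1.13), for every $n \ge 1$, we can find $\vartheta_n \in C(K)$, such that
$$\vartheta_n|_{(\mathrm{supp}\, \mu)_{1/n}} \equiv 0 \quad \text{and} \quad \vartheta_n|_{((\mathrm{supp}\, \mu)_{2/n})^c} \equiv 1.$$
Then $\vartheta_n u \longrightarrow u$ in $C(K)$ and so $\langle \mu, \vartheta_n u\rangle_{C(K)} \longrightarrow \langle \mu, u\rangle_{C(K)}$. Note that
$$\mathrm{supp}\, \vartheta_n u \subseteq \hat U = K \setminus \mathrm{supp}\, \mu \quad \forall\, n \ge 1.$$
Hence $\langle \mu, \vartheta_n u\rangle_{C(K)} = 0$ and so we conclude that $\langle \mu, u\rangle_{C(K)} = 0$.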
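PROPOSITION 4.5.18 If $\xi : C(K) \longrightarrow \mathbb{R}$ is defined by
$$\xi(u) \overset{df}{=} \max_{x \in K} u(x),$$
then $\xi$ is continuous, convex and for each $u \in C(K)$, we have that $\mu \in \partial\xi(u)$ if and only if
$$\mu \ge 0, \quad \langle \mu, e\rangle_{C(K)} = 1 \quad \text{and} \quad \mathrm{supp}\, \mu \subseteq \{ x \in K : \xi(u) = u(x) \}.$$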
PROOF The convexity of $\xi$ is clear. To establish the continuity of $\xi$, we argue as follows. Let $u, v \in C(K)$ and $x_0 \in K$ be such that
$$\xi(u) = \max_{x \in K} u(x) = u(x_0).$$
We have
$$\xi(u) - \xi(v) \le u(x_0) - v(x_0) \le \|u - v\|_\infty.$$
Reversing the roles of $u$ and $v$ in the above argument, we conclude that
$$|\xi(u) - \xi(v)| \le \|u - v\|_\infty,$$
i.e., $\xi$ is Lipschitz continuous. Now let us prove the description of $\partial\xi$.

(a) Let $\mu \in \partial\xi(u)$. Then we have
$$\langle \mu, v - u\rangle_{C(K)} \le \xi(v) - \xi(u) \quad \forall\, v \in C(K). \qquad (4.40)$$
Let $g \in C(K)$, $g \ge 0$ and let us set $v \overset{df}{=} u - g$. From (4.40), we have
$$-\langle \mu, g\rangle_{C(K)} \le \max_{x \in K} (u - g)(x) - \max_{x \in K} u(x) \le 0,$$
so $\langle \mu, g\rangle_{C(K)} \ge 0$. Also let $c \in \mathbb{R}$ and let us set $v \overset{df}{=} u + ce$. From (4.40), we have
$$c\, \langle \mu, e\rangle_{C(K)} \le \max_{x \in K} (u + ce)(x) - \max_{x \in K} u(x) = c$$
(recall that $e \equiv 1$). Since $c \in \mathbb{R}$ was arbitrary, we obtain $\langle \mu, e\rangle_{C(K)} = 1$. Next we show that
$$\mathrm{supp}\, \mu \subseteq C \overset{df}{=} \{ x \in K : \xi(u) = u(x) \}.$$
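It suffices to show that $\mu$ vanishes in any open set $U \subseteq K \setminus C$. To this end let $g \in C(K)$ be such that $\mathrm{supp}\, g \subseteq U$. Also let
$$\eta \overset{df}{=} \xi(u) - \max_{x \in \mathrm{supp}\, g} u(x) > 0.$$
We choose $\varepsilon > 0$ so that
$$\pm\varepsilon g(x) < \eta \quad \forall\, x \in K.$$
Then
$$u(x) \pm \varepsilon g(x) < \xi(u) \quad \forall\, x \in \mathrm{supp}\, g$$
and $\xi(u \pm \varepsilon g) = \xi(u)$. So if in (4.40), we set $v \overset{df}{=} u \pm \varepsilon g$, we obtain $\pm\varepsilon \langle \mu, g\rangle_{C(K)} \le 0$, i.e., $\langle \mu, g\rangle_{C(K)} = 0$, so
$$\mathrm{supp}\, \mu \subseteq \{ x \in K : \xi(u) = u(x) \}.$$

(b) Conversely, suppose that $\mu \ge 0$, $\langle \mu, e\rangle_{C(K)} = 1$ and $\mathrm{supp}\, \mu \subseteq \{ x \in K : \xi(u) = u(x) \}$. Note that the function $g = u - \xi(u)e$ satisfies
$$g(x) = 0 \quad \forall\, x \in \mathrm{supp}\, \mu.$$
Using Lemma 4.5.17, we obtain $\langle \mu, g\rangle_{C(K)} = 0$ and so $\langle \mu, u\rangle_{C(K)} = \xi(u)$. Hence if $v \in C(K)$, we have
$$\xi(v) - \xi(u) = \xi(v) - \langle \mu, u\rangle_{C(K)}. \qquad (4.41)$$
Let $g \overset{df}{=} \xi(v)e - v$. Evidently $g \ge 0$ and so $\langle \mu, g\rangle_{C(K)} \ge 0$,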
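hence $\xi(v) \ge \langle \mu, v\rangle_{C(K)}$ (recall that $\langle \mu, e\rangle_{C(K)} = 1$). Using this in (4.41), we obtain
$$\xi(v) - \xi(u) \ge \langle \mu, v - u\rangle_{C(K)} \quad \forall\, v \in C(K),$$
so $\mu \in \partial\xi(u)$.

Now we consider nonconvex locally Lipschitz integral functionals. The mathematical setting is the following: $(\Omega,\Sigma,\mu)$ is a finite measure space, $X$ is a separable Banach space and $\varphi : \Omega \times X \longrightarrow \mathbb{R}$ is a measurable function. We consider the integral functional $I_\varphi : L^p(\Omega;X) \longrightarrow \mathbb{R}$, $p \in [1,+\infty)$, defined by
$$I_\varphi(u) \overset{df}{=} \int_\Omega \varphi(\omega, u(\omega))\, d\mu \quad \forall\, u \in L^p(\Omega;X).$$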
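Our goal is to describe $\partial I_\varphi(u)$ under one of the following two hypotheses:

H(ϕ)1 We have
$$|\varphi(\omega, x) - \varphi(\omega, y)| \le k(\omega)\, \|x - y\|_X,$$
for µ-almost all $\omega \in \Omega$ and all $x, y \in X$, with $k \in L^{p'}(\Omega)$, $\frac{1}{p} + \frac{1}{p'} = 1$.

H(ϕ)2 For µ-almost all $\omega \in \Omega$, the function $\varphi(\omega,\cdot)$ is locally Lipschitz and
$$\|x^*\|_{X^*} \le a(\omega) \big( 1 + \|x\|_X^{p-1} \big),$$
for µ-almost all $\omega \in \Omega$, all $x \in X$ and all $x^* \in \partial\varphi(\omega, x)$, with $a \in L^\infty(\Omega)$.

THEOREM 4.5.19 If $\varphi : \Omega \times X \longrightarrow \mathbb{R}$ is a measurable function and satisfies either hypothesis H(ϕ)1 or H(ϕ)2, then $I_\varphi$ is Lipschitz continuous on bounded sets of $L^p(\Omega;X)$ and if $u^* \in \partial I_\varphi(u)$, then
$$u^*(\omega) \in \partial\varphi(\omega, u(\omega)) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega.$$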
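PROOF Case 1. First suppose that hypothesis H(ϕ)1 holds.

Then for $u, v \in L^p(\Omega;X)$, we have
$$|I_\varphi(u) - I_\varphi(v)| \le \int_\Omega |\varphi(\omega, u(\omega)) - \varphi(\omega, v(\omega))|\, d\mu \le \int_\Omega k(\omega)\, \|u(\omega) - v(\omega)\|_X\, d\mu \le \|k\|_{p'}\, \|u - v\|_{L^p(\Omega;X)},$$
so $I_\varphi$ is Lipschitz continuous (globally) on $L^p(\Omega;X)$.

Case 2. Next suppose that hypothesis H(ϕ)2 holds.

We show that $I_\varphi$ is Lipschitz continuous on bounded sets of $L^p(\Omega;X)$. Let $u, v \in L^p(\Omega;X)$ be such that $\|u\|_{L^p(\Omega;X)} \le r$ and $\|v\|_{L^p(\Omega;X)} \le r$. Using Theorem 4.4.72, we can find
$$w(\omega) \in [u(\omega), v(\omega)] \overset{df}{=} \{ (1-\lambda)u(\omega) + \lambda v(\omega) : \lambda \in [0,1] \}$$
and $w^*(\omega) \in \partial\varphi(\omega, w(\omega))$, such that
$$\varphi(\omega, v(\omega)) - \varphi(\omega, u(\omega)) = \langle w^*(\omega), v(\omega) - u(\omega)\rangle_X \quad \text{for } \mu\text{-a.a. } \omega \in \Omega. \qquad (4.42)$$
By virtue of hypothesis H(ϕ)2, we have that
$$\|w^*(\omega)\|_{X^*} \le a(\omega) \big( 1 + \|w(\omega)\|_X^{p-1} \big) \le \hat a(\omega) \big( 1 + \|u(\omega)\|_X^{p-1} + \|v(\omega)\|_X^{p-1} \big) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega, \qquad (4.43)$$
with $\hat a \in L^\infty(\Omega)$. Let
$$\eta(\omega) \overset{df}{=} \hat a(\omega) \big( 1 + \|u(\omega)\|_X^{p-1} + \|v(\omega)\|_X^{p-1} \big).$$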
Then $\eta \in L^{p'}(\Omega)_+$ and $\|\eta\|_{p'} \le c$, where $c > 0$ depends on $\|\hat a\|_\infty$ and on $r > 0$. From (4.42) and (4.43), it follows that
$$|I_\varphi(v) - I_\varphi(u)| \le c\, \|v - u\|_{L^p(\Omega;X)},$$
so $I_\varphi$ is Lipschitz continuous on bounded sets.

Now let $u^* \in \partial I_\varphi(u) \subseteq L^{p'}(\Omega;X^*_{w^*})$. Then using Fatou's lemma (see Theorem A.2.1), we have
$$\langle u^*, h\rangle_{L^p(\Omega;X)} \le (I_\varphi)^0(u; h) \le \int_\Omega \varphi^0(\omega, u(\omega); h(\omega))\, d\mu \quad \forall\, h \in L^p(\Omega;X),$$
so
$$\int_\Omega \big[ \varphi^0(\omega, u(\omega); h(\omega)) - \langle u^*(\omega), h(\omega)\rangle_X \big]\, d\mu \ge 0 \quad \forall\, h \in L^p(\Omega;X).$$
Let $h \overset{df}{=} \chi_A z$, with $A \in \Sigma$, $z \in X$. We obtain
$$\int_A \big[ \varphi^0(\omega, u(\omega); z) - \langle u^*(\omega), z\rangle_X \big]\, d\mu \ge 0.$$
Since $A \in \Sigma$ is arbitrary, we infer that
$$\langle u^*(\omega), z\rangle_X \le \varphi^0(\omega, u(\omega); z) \quad \text{for a.a. } \omega \in \Omega \qquad (4.44)$$
and the exceptional µ-null set is independent of $z \in X$ since $X$ is separable. Since $z \in X$ is arbitrary, we conclude that
$$u^*(\omega) \in \partial\varphi(\omega, u(\omega)) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega.$$

REMARK 4.5.20 Note that under hypothesis H(ϕ)1 we in fact proved that $I_\varphi$ is Lipschitz continuous (globally).
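The global Lipschitz estimate of Case 1 can be checked numerically on a toy model. The following is only a sketch under assumptions that are not in the text: the measure space is a finite set with positive weights, $X = \mathbb{R}$, $p > 1$, and the integrand is the hypothetical choice $\varphi(\omega, x) = k(\omega)\sin x$, which satisfies H(ϕ)1 with the given $k$. All function names are illustrative.

```python
import math
import random


def integral_functional(phi, weights, u):
    """I_phi(u) = sum_i weights[i] * phi(i, u[i]) on a finite measure space."""
    return sum(w * phi(i, x) for i, (w, x) in enumerate(zip(weights, u)))


def lp_norm(weights, f, p):
    """Weighted L^p norm of a real-valued function on the finite space."""
    return sum(w * abs(x) ** p for w, x in zip(weights, f)) ** (1.0 / p)


def lipschitz_gap(phi, k, weights, u, v, p):
    """Return both sides of |I(u) - I(v)| <= ||k||_{p'} ||u - v||_p (needs p > 1)."""
    p_conj = p / (p - 1.0)  # conjugate exponent p'
    lhs = abs(integral_functional(phi, weights, u)
              - integral_functional(phi, weights, v))
    diff = [a - b for a, b in zip(u, v)]
    rhs = lp_norm(weights, k, p_conj) * lp_norm(weights, diff, p)
    return lhs, rhs


if __name__ == "__main__":
    random.seed(1)
    n = 20
    weights = [random.uniform(0.1, 1.0) for _ in range(n)]
    k = [random.uniform(0.0, 3.0) for _ in range(n)]
    phi = lambda i, x: k[i] * math.sin(x)  # hypothetical integrand, Lipschitz in x
    u = [random.uniform(-5.0, 5.0) for _ in range(n)]
    v = [random.uniform(-5.0, 5.0) for _ in range(n)]
    print(lipschitz_gap(phi, k, weights, u, v, 2.0))
```

The bound is just Hölder's inequality applied to $k(\omega)\,|u(\omega) - v(\omega)|$, which is the last step of Case 1.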
4.6 Variational Principles
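Suppose that $X$ is a Banach space, $C \subseteq X$ is a nonempty, noncompact set and $\varphi : X \longrightarrow \overline{\mathbb{R}} \overset{df}{=} \mathbb{R} \cup \{+\infty\}$ is a proper, lower semicontinuous function, which is bounded below. Then the problem
$$\inf_{x \in C} \varphi(x)$$
need not have a solution. If $X = \mathbb{R}^N$, the situation can be remedied by considering a suitable small perturbation of $\varphi$. More specifically, for simplicity let $C = \mathbb{R}^N$, $m = \inf_{x \in \mathbb{R}^N} \varphi(x)$, $\varepsilon > 0$ and take $x_0 \in \mathbb{R}^N$ to be such that $\varphi(x_0) \le m + \varepsilon$. Consider the function
$$\varphi_\varepsilon(x) \overset{df}{=} \varphi(x) + \varepsilon\, \|x - x_0\|_{\mathbb{R}^N}.$$
Evidently $\varphi_\varepsilon : \mathbb{R}^N \longrightarrow \overline{\mathbb{R}}$ is proper, lower semicontinuous and in addition $\varphi_\varepsilon$ is weakly coercive, i.e.,
$$\varphi_\varepsilon(x) \longrightarrow +\infty \quad \text{as } \|x\|_{\mathbb{R}^N} \to +\infty.$$
So invoking the Weierstrass theorem, we infer that $\varphi_\varepsilon$ attains its infimum at a point $y \in \mathbb{R}^N$. Note that $\|y - x_0\|_{\mathbb{R}^N} \le 1$. Indeed, if $\|y - x_0\|_{\mathbb{R}^N} > 1$, we have
$$\varphi_\varepsilon(y) = \varphi(y) + \varepsilon\, \|y - x_0\|_{\mathbb{R}^N} > \varphi(y) + \varepsilon \ge m + \varepsilon \ge \varphi(x_0) = \varphi_\varepsilon(x_0) \ge \varphi_\varepsilon(y),$$
a contradiction. Also
$$\varphi_\varepsilon(y) \le \varphi_\varepsilon(x) \quad \forall\, x \in \mathbb{R}^N,$$
hence
$$\varphi(y) + \varepsilon\, \|y - x_0\|_{\mathbb{R}^N} \le \varphi(x) + \varepsilon\, \|x - x_0\|_{\mathbb{R}^N},$$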
so $\varphi(y) \le \varphi(x) + \varepsilon\, \|x - y\|_{\mathbb{R}^N}$. So this argument shows that for a given $\varepsilon > 0$ and $x_0 \in \mathbb{R}^N$ satisfying
$$\varphi(x_0) \le \inf_{x \in \mathbb{R}^N} \varphi(x) + \varepsilon$$
(i.e., $x_0 \in \mathbb{R}^N$ is an ε-minimizer), we can find $y \in \mathbb{R}^N$, such that $\|y - x_0\|_{\mathbb{R}^N} \le 1$ and the function $x \longmapsto \varphi(x) + \varepsilon\, \|x - y\|_{\mathbb{R}^N}$ attains its infimum at $y \in \mathbb{R}^N$.

The main analytical tool in this argument was the theorem of Weierstrass, which guarantees a minimizer for a proper, lower semicontinuous, bounded from below function with at least one bounded sublevel set (this is the case if for example the function is weakly coercive). Evidently this is an essentially finite dimensional situation. In an infinite dimensional space for this to work we need extra conditions, such as the reflexivity of $X$ and the weak lower semicontinuity of the function. In general the argument fails. Nevertheless, we can salvage the principle formulated above. Namely if $x_0 \in X$ is an ε-minimizer of $\varphi : X \longrightarrow \overline{\mathbb{R}}$, which is proper, lower semicontinuous, bounded from below, then a small Lipschitz perturbation of $\varphi$ attains a strict minimum at a point $y \in X$, which is relatively close to $x_0$ (i.e., we can find a Lipschitz continuous function $h : X \longrightarrow \mathbb{R}$ with a small Lipschitz constant, such that $\varphi + h$ attains a strict minimum at $y \in X$). In fact this principle can be formulated in any complete metric space. This is the essence of the so-called Ekeland variational principle and its extensions. This result turned out to be an essential tool in many different areas of nonlinear analysis.

THEOREM 4.6.1 If $(X, d_X)$ is a complete metric space, $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function, $\varepsilon > 0$ and $x_0 \in X$ satisfies
$$\varphi(x_0) \le \inf_{x \in X} \varphi(x) + \varepsilon,$$
then for a given $\lambda > 0$, we can find $y_\lambda \in X$, such that
(a) $\varphi(y_\lambda) \le \varphi(x_0)$;
(b) $d_X(y_\lambda, x_0) \le \lambda$;
(c) $\varphi(y_\lambda) < \varphi(x) + \frac{\varepsilon}{\lambda}\, d_X(x, y_\lambda)$ for all $x \ne y_\lambda$.
PROOF By replacing $\varphi$ with
$$\hat\varphi(x) \overset{df}{=} \frac{1}{\varepsilon}\, \varphi(x)$$
and $d_X(\cdot,\cdot)$ with
$$d_\lambda(\cdot,\cdot) \overset{df}{=} \frac{1}{\lambda}\, d_X(\cdot,\cdot),$$
without any loss of generality, we may assume that $\varepsilon = \lambda = 1$. On $X$ we define a relation $\le$ by
$$[\, x \le z \,] \overset{df}{\iff} [\, \varphi(x) \le \varphi(z) - d_X(x, z) \,].$$
Evidently $x \le x$ (i.e., the relation $\le$ is reflexive). Also, if $x \le z$, we have $\varphi(x) \le \varphi(z) - d_X(x, z)$ and if $z \le v$, we have $\varphi(z) \le \varphi(v) - d_X(z, v)$. Thus, by the triangle inequality, it follows that
$$\varphi(x) \le \varphi(v) - \big[ d_X(x, z) + d_X(z, v) \big] \le \varphi(v) - d_X(x, v).$$
Therefore $x \le v$ (i.e., the relation $\le$ is transitive). Finally if $x \le z$ and $z \le x$, we obtain $d_X(x, z) = 0$, hence $x = z$ (i.e., the relation $\le$ is antisymmetric). So we conclude that the relation $\le$ is a partial order.

Inductively we define a sequence $\{S_n\}_{n \ge 1}$ of subsets of $X$ as follows. Let $x_1 = x_0$ and
$$S_1 \overset{df}{=} \{ z \in X : z \le x_1 \},$$
$x_2 \in S_1$ is such that
$$\varphi(x_2) \le \inf_{x \in S_1} \varphi(x) + \frac{1}{2^2}$$
and for the induction step, let
$$S_n \overset{df}{=} \{ z \in X : z \le x_n \},$$
$x_{n+1} \in S_n$ is such that
$$\varphi(x_{n+1}) \le \inf_{x \in S_n} \varphi(x) + \frac{1}{2^{n+1}}.$$
Since $x_{n+1} \le x_n$, we have that $S_{n+1} \subseteq S_n$ for $n \ge 1$ and by virtue of lower semicontinuity of $\varphi$, we have that $S_n$ is closed. If $z \in S_{n+1}$, we have that $z \le x_{n+1} \le x_n$ and so
$$d_X(z, x_{n+1}) \le \varphi(x_{n+1}) - \varphi(z) \le \inf_{x \in S_n} \varphi(x) + \frac{1}{2^{n+1}} - \varphi(z) \le \varphi(z) + \frac{1}{2^{n+1}} - \varphi(z) = \frac{1}{2^{n+1}},$$
so
$$\mathrm{diam}\, S_{n+1} \le \frac{1}{2^n} \quad \forall\, n \ge 1,$$
i.e., $\mathrm{diam}\, S_n \longrightarrow 0$. Because $(X, d_X)$ is complete, by Cantor's theorem (see Theorem A.1.11), we have that
$$\bigcap_{n=1}^{\infty} S_n = \{y\}.$$
Since $y \in S_1$, we have $y \le x_1 = x_0$ and so
$$\varphi(y) \le \varphi(x_0) - d_X(y, x_0) \le \varphi(x_0),$$
i.e., (a) holds. Also we have (recall $\varepsilon = \lambda = 1$)
$$d_X(y, x_0) \le \varphi(x_0) - \varphi(y) \le \inf_{x \in X} \varphi(x) + 1 - \inf_{x \in X} \varphi(x) = 1,$$
i.e., (b) holds. Finally to prove (c), we need to show that $z \le y$ implies $z = y$. Indeed, if $z \le y$, then $z \le x_n$ for all $n \ge 1$, hence
$$z \in \bigcap_{n=1}^{\infty} S_n,$$
which implies that $z = y$.

REMARK 4.6.2 Note that in the above theorem conclusions (b) and (c) are in a sense complementary and the choice of $\lambda > 0$ allows us to strike a balance between them depending on the application we have in mind. If $\lambda > 0$ is large, then (b) provides little information on the whereabouts of $y_\lambda$, while (c) tells us that $y_\lambda$ is close to being a global minimizer of $\varphi$. The opposite situation occurs when $\lambda > 0$ is small. Then (b) implies that $y_\lambda$ is close to $x_0$, but the inequality in (c) gives us little information. Two particular cases are of interest. The first corresponds to $\lambda = 1$, $\varepsilon > 0$ and the second to $\lambda = \sqrt{\varepsilon}$, $\varepsilon > 0$. In the first case we are not interested in conclusion (b) (i.e., we are not interested in how $y_\lambda$ is located with respect to $x_0$). In the second case we are interested in both (b) and (c). We state these two particular cases as corollaries.
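Conclusions (a)-(c) of Theorem 4.6.1 can be observed on a finite metric space, where completeness is automatic. The sketch below is an illustration under our own assumptions, not the construction used in the proof: it simply moves from the current point $y$ to any point $x$ with $\varphi(x) + \frac{\varepsilon}{\lambda} d_X(x,y) \le \varphi(y)$ until no such point exists. On a finite set this stops because $\varphi$ strictly decreases at each move, and transitivity of the partial order gives (a) and (b). The name `ekeland_point` and the sample data are hypothetical.

```python
import math


def ekeland_point(points, phi, dist, x0, eps, lam):
    """Search a finite metric space for a point y with
    phi(y) < phi(x) + (eps/lam) * dist(x, y) for every x != y,
    starting from an eps-minimizer x0 and only moving 'downwards'
    in the order  x <= y  iff  phi(x) + (eps/lam) * dist(x, y) <= phi(y).
    """
    c = eps / lam
    y = x0
    moved = True
    while moved:
        moved = False
        for x in points:
            if x != y and phi(x) + c * dist(x, y) <= phi(y):
                y = x          # x lies strictly below y in the order
                moved = True
                break
    return y


if __name__ == "__main__":
    # Hypothetical data: a grid in [-3, 3] and an oscillating function.
    pts = [i / 100.0 for i in range(-300, 301)]
    phi = lambda x: x * x + 0.5 * math.cos(15.0 * x)
    dist = lambda a, b: abs(a - b)
    x0 = pts[500]
    eps = phi(x0) - min(phi(x) for x in pts) + 0.1   # makes x0 an eps-minimizer
    lam = math.sqrt(eps)                             # the choice of Corollary 4.6.4
    y = ekeland_point(pts, phi, dist, x0, eps, lam)
    print(y, phi(y), dist(y, x0), lam)
```

With $\lambda = \sqrt{\varepsilon}$ the returned point is both within $\sqrt{\varepsilon}$ of $x_0$ and a strict minimizer of $x \mapsto \varphi(x) + \sqrt{\varepsilon}\, d_X(x, y)$ on the grid.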
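COROLLARY 4.6.3 If $(X, d_X)$ is a complete metric space and $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function, then for every $\varepsilon > 0$, we can find $y_\varepsilon \in X$, such that
(a) $\varphi(y_\varepsilon) \le \inf_{x \in X} \varphi(x) + \varepsilon$;
(b) $\varphi(y_\varepsilon) < \varphi(x) + \varepsilon\, d_X(x, y_\varepsilon)$ for all $x \ne y_\varepsilon$.

COROLLARY 4.6.4 If $(X, d_X)$ is a complete metric space, $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function, $\varepsilon > 0$ and $x_\varepsilon \in X$ satisfies
$$\varphi(x_\varepsilon) \le \inf_{x \in X} \varphi(x) + \varepsilon,$$
then we can find $y_\varepsilon \in X$, such that
(a) $\varphi(y_\varepsilon) \le \varphi(x_\varepsilon)$;
(b) $d_X(y_\varepsilon, x_\varepsilon) \le \sqrt{\varepsilon}$;
(c) $\varphi(y_\varepsilon) < \varphi(x) + \sqrt{\varepsilon}\, d_X(x, y_\varepsilon)$ for all $x \ne y_\varepsilon$.

If we put more structure on the space $X$, we can strengthen the conclusion of Theorem 4.6.1.

THEOREM 4.6.5 If $X$ is a Banach space and $\varphi : X \longrightarrow \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable, then for every $\varepsilon > 0$, we can find $x_\varepsilon \in X$, such that
$$\varphi(x_\varepsilon) \le \inf_{x \in X} \varphi(x) + \varepsilon \quad \text{and} \quad \|\varphi'_G(x_\varepsilon)\|_{X^*} \le \varepsilon.$$

PROOF By virtue of Corollary 4.6.3, we can find $x_\varepsilon \in X$, such that
$$\varphi(x_\varepsilon) \le \inf_{x \in X} \varphi(x) + \varepsilon$$
and
$$\varphi(x_\varepsilon) \le \varphi(x) + \varepsilon\, \|x - x_\varepsilon\|_X \quad \forall\, x \in X.$$
Let $h \in X$ and $\lambda > 0$ be arbitrary. Let us set $x \overset{df}{=} x_\varepsilon + \lambda h$. We obtain
$$\frac{\varphi(x_\varepsilon) - \varphi(x_\varepsilon + \lambda h)}{\lambda} \le \varepsilon\, \|h\|_X.$$
Passing to the limit as $\lambda \searrow 0$, we obtain
$$-\langle \varphi'_G(x_\varepsilon), h\rangle_X \le \varepsilon\, \|h\|_X \quad \forall\, h \in X,$$
so $|\langle \varphi'_G(x_\varepsilon), h\rangle_X| \le \varepsilon\, \|h\|_X$ and thus $\|\varphi'_G(x_\varepsilon)\|_{X^*} \le \varepsilon$.

An immediate consequence of this theorem is the following corollary.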
COROLLARY 4.6.6 If $X$ is a Banach space and $\varphi : X \longrightarrow \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable, then there exists a sequence $\{x_n\}_{n \ge 1} \subseteq X$, such that
$$\varphi(x_n) \searrow \inf_{x \in X} \varphi(x) \quad \text{and} \quad \varphi'_G(x_n) \longrightarrow 0.$$
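A simple instance of Corollary 4.6.6, added here as an illustration and not taken from the text, is $X = \mathbb{R}$ with the exponential function. It is bounded below and differentiable, its infimum is not attained, and yet an explicit minimizing sequence with vanishing derivatives exists:

```latex
\[
\varphi(x) = e^{x}, \qquad \inf_{x \in \mathbb{R}} \varphi(x) = 0 \quad \text{(not attained)},
\]
\[
x_n = -n: \qquad \varphi(x_n) = e^{-n} \searrow 0, \qquad
\varphi'_G(x_n) = e^{-n} \longrightarrow 0 .
\]
```

So the corollary gives information even when no minimizer exists.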
REMARK 4.6.7 The above corollary asserts the existence of a minimizing sequence, whose elements satisfy the first order necessary conditions, up to any desired approximation.

COROLLARY 4.6.8 If $X$ is a Banach space and $\varphi : X \longrightarrow \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable, then for each minimizing sequence $\{y_n\}_{n \ge 1}$ of $\varphi$ (i.e., $\varphi(y_n) \searrow \inf_{x \in X} \varphi(x)$), we can find another minimizing sequence $\{x_n\}_{n \ge 1}$ of $\varphi$, such that:
(a) $\varphi(x_n) \le \varphi(y_n)$;
(b) $\|x_n - y_n\|_X \longrightarrow 0$;
(c) $\|\varphi'_G(x_n)\|_{X^*} \longrightarrow 0$.

As we already said, the Ekeland variational principle is a very powerful tool of nonlinear analysis. Below we show how the well known Caristi's fixed point theorem can be derived from the Ekeland variational principle. In fact we show the two results are equivalent, in the sense that the Ekeland variational principle can also be derived from Caristi's fixed point theorem. First we state and prove Caristi's fixed point theorem.

THEOREM 4.6.9 (Caristi Fixed Point Theorem) If $(X, d_X)$ is a complete metric space, $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function and $F : X \longrightarrow 2^X \setminus \{\emptyset\}$ is a multifunction, such that
$$\varphi(y) \le \varphi(x) - d_X(y, x) \quad \text{for some } y \in F(x) \text{ and all } x \in X, \qquad (4.45)$$
then there exists $x_0 \in X$, such that $x_0 \in F(x_0)$.

PROOF By virtue of Corollary 4.6.3 with $\varepsilon = 1$, we can find $x_0 \in X$, such that
$$\varphi(x_0) < \varphi(x) + d_X(x, x_0) \quad \forall\, x \ne x_0. \qquad (4.46)$$
We claim that $x_0 \in F(x_0)$. Suppose that this is not true. Then for all $y \in F(x_0)$, we have that $y \ne x_0$. Let $y \in F(x_0)$ be as in (4.45). We have
$$\varphi(y) \le \varphi(x_0) - d_X(y, x_0) \quad \text{and} \quad \varphi(x_0) < \varphi(y) + d_X(y, x_0)$$
(see (4.46)), a contradiction.
REMARK 4.6.10 We emphasize that on the multifunction $F$ no regularity conditions were imposed except for (4.45), which is a mild restriction. Suppose that $F$ has compact values and
$$h\big(F(x), F(y)\big) \le k\, d_X(x, y) \quad \forall\, x, y \in X,$$
with $k \in (0,1)$. Here $h(\cdot,\cdot)$ stands for the Hausdorff metric on the nonempty and closed subsets of $X$. Then we can apply Theorem 4.6.9 with
$$\varphi(x) = \frac{1}{1-k}\, d_X\big(x, F(x)\big).$$
Indeed let $y \in F(x)$ be such that $d_X\big(x, F(x)\big) = d_X(x, y)$. Such an element exists since $F(x)$ is compact. Then we have
$$(1-k)\, d_X(x, y) = d_X\big(x, F(x)\big) - k\, d_X(x, y) \le d_X\big(x, F(x)\big) - h\big(F(x), F(y)\big) \le d_X\big(x, F(x)\big) - d_X\big(y, F(y)\big),$$
so $\varphi(y) \le \varphi(x) - d_X(x, y)$. So condition (4.45) is satisfied. The resulting fixed point theorem is a particular case of Nadler's fixed point theorem (see Theorem 7.4.3). Of course, if $F$ is single valued, we recover the well known Banach's contraction principle (see Theorem 7.1.2). We should point out that Banach's fixed point theorem contains much more information.

PROPOSITION 4.6.11 The Caristi fixed point theorem (see Theorem 4.6.9) implies the Ekeland variational principle in the form of Corollary 4.6.3.

PROOF Let $\hat d_X = \varepsilon\, d_X$. This is an equivalent metric on $X$. Proceeding by contradiction, suppose that there is no $x_\varepsilon \in X$ satisfying inequality (b) in Corollary 4.6.3. Then for every $x \in X$, we have
$$F(x) = \{ y \in X : \varphi(x) \ge \varphi(y) + \hat d_X(x, y),\ y \ne x \} \ne \emptyset.$$
The multifunction $F$ satisfies (4.45). So by Theorem 4.6.9, we can find $x_0 \in X$, such that $x_0 \in F(x_0)$. But this cannot happen since from the definition of $F$, $x \notin F(x)$ for all $x \in X$.

There is another geometrical result of nonlinear analysis which is equivalent to some form of the Ekeland variational principle.

DEFINITION 4.6.12 Let $X$ be a normed space, $C \subseteq X$ a nonempty, convex set and $x \in X$. The drop associated with the pair $(x, C)$, denoted by $D(x, C)$, is the convex hull of $\{x\} \cup C$, i.e.,
$$D(x, C) = \{ x + \lambda(c - x) : c \in C,\ \lambda \in [0,1] \}.$$
REMARK 4.6.13 The set $D(x, C)$ is called a "drop," a suitable name given its geometry.
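To get a feel for the shape of a drop, one can test membership numerically when $C$ is a closed Euclidean ball. The sketch below is only an illustration with a hypothetical helper `in_drop`; it assumes $X = \mathbb{R}^n$ with the Euclidean norm. It uses the fact that $p \in D(x, \overline{B}_r(c))$ exactly when $\|p - (1-\lambda)x - \lambda c\| \le \lambda r$ for some $\lambda \in [0,1]$, and that the left side minus the right side is convex in $\lambda$, so a ternary search finds its minimum.

```python
import math


def in_drop(p, x, center, r, tol=1e-9):
    """Test whether p lies in the drop D(x, closed ball(center, r)) in R^n."""

    def g(lam):
        # p = (1 - lam) * x + lam * b with |b - center| <= r
        # is equivalent to |p - (1 - lam) * x - lam * center| <= lam * r.
        s = sum((pi - (1.0 - lam) * xi - lam * ci) ** 2
                for pi, xi, ci in zip(p, x, center))
        return math.sqrt(s) - lam * r

    # g is convex on [0, 1], so ternary search locates its minimum.
    lo, hi = 0.0, 1.0
    for _ in range(200):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) <= g(m2):
            hi = m2
        else:
            lo = m1
    return min(g(0.0), g(1.0), g(0.5 * (lo + hi))) <= tol


if __name__ == "__main__":
    x, c, r = (3.0, 0.0), (0.0, 0.0), 1.0
    for p in [(2.0, 0.0), (2.0, 0.3), (2.0, 0.5), (4.0, 0.0)]:
        print(p, in_drop(p, x, c, r))
```

For the vertex $(3,0)$ and the unit disc, points on the segment towards the disc belong to the drop, while points above the tangent lines from the vertex do not.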
The next result is known as the drop theorem.

THEOREM 4.6.14 (Drop Theorem) If $X$ is a normed space, $A \subseteq X$ is a complete set, $y \in X \setminus A$, $R = d_X(y, A)$ and $0 < r < R < \varrho$, then there exists $u \in A$, such that
$$u \in \overline{B}_\varrho(y) \quad \text{and} \quad D\big(u, \overline{B}_r(y)\big) \cap A = \{u\}.$$

PROOF By translating things, if necessary, we may assume that $y = 0$. Let
$$E \overset{df}{=} \overline{B}_\varrho \cap A.$$
This is a closed subset of $A$, hence it is a complete metric space (with the metric induced by the norm of $X$). We introduce the continuous function $\varphi : E \longrightarrow \mathbb{R}_+$, defined by
$$\varphi(x) \overset{df}{=} \frac{\varrho + r}{R - r}\, \|x\|_X.$$
We apply Corollary 4.6.3 with $\varepsilon = 1$ to obtain $u \in E$, such that
$$\varphi(u) < \varphi(x) + \|u - x\|_X \quad \forall\, x \in E,\ x \ne u. \qquad (4.47)$$
We need to show that
$$D\big(u, \overline{B}_r(0)\big) \cap A = \{u\}.$$
Suppose that this is not true and let $v \in D\big(u, \overline{B}_r(0)\big) \cap A$, $v \ne u$. We have $v \in A$ and
$$v = (1-\lambda)u + \lambda z$$
for some $z \in \overline{B}_r(0)$ and some $\lambda \in [0,1]$. Since $v \ne u$ and $r < R$, it follows that $\lambda \in (0,1)$. We have
$$\|v\|_X \le (1-\lambda)\, \|u\|_X + \lambda\, \|z\|_X$$
(in particular $\|v\|_X \le \varrho$, so $v \in E$). Because $u \in A$, we have $\|u\|_X \ge R$ and so it follows that
$$\lambda(R - r) \le \lambda\big( \|u\|_X - \|z\|_X \big) \le \|u\|_X - \|v\|_X. \qquad (4.48)$$
From (4.47) with $x = v$, we have
$$\frac{\varrho + r}{R - r}\, \|u\|_X < \frac{\varrho + r}{R - r}\, \|v\|_X + \|v - u\|_X = \frac{\varrho + r}{R - r}\, \|v\|_X + \lambda\, \|u - z\|_X,$$
so
$$\frac{\varrho + r}{R - r}\, \big( \|u\|_X - \|v\|_X \big) < \lambda\, \|u - z\|_X$$
and thus using also (4.48), we have
$$\varrho + r < \|u - z\|_X.$$
But $\|u\|_X \le \varrho$ (recall that $y = 0$) and $z \in \overline{B}_r(0)$. Hence $\|u - z\|_X \le \varrho + r$, a contradiction.

This geometrical result is in fact equivalent to the Ekeland variational principle stated in the following form, which can be easily deduced from Corollary 4.6.3.

PROPOSITION 4.6.15 If $(X, d_X)$ is a complete metric space and $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous function which is bounded from below, then for any $\beta > 0$ and any $x_0 \in X$, there exists $y \in X$, such that:
(a) $\varphi(y) < \varphi(x) + \beta\, d_X(y, x)$ for all $x \ne y$;
(b) $\varphi(y) \le \varphi(x_0) - \beta\, d_X(y, x_0)$.

In this form the Ekeland variational principle is equivalent to the drop theorem. For the proof of this, see Penot (1986).

PROPOSITION 4.6.16 The drop theorem (see Theorem 4.6.14) is equivalent to the Ekeland variational principle in the form of Proposition 4.6.15.

We continue with the applications of the Ekeland variational principle.

PROPOSITION 4.6.17 If $X$ is a Banach space, $\varphi : X \longrightarrow \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable and there exist $a, c > 0$, such that
$$a\, \|x\|_X - c \le \varphi(x) \quad \forall\, x \in X, \qquad (4.49)$$
then $\varphi'_G(X)$ is dense in $a\overline{B}_1^{X^*}$, where
$$\overline{B}_1^{X^*} = \{ x^* \in X^* : \|x^*\|_{X^*} \le 1 \}.$$
PROOF Let $x^* \in a\overline{B}_1^{X^*}$ and consider the function
$$\psi(x) \overset{df}{=} \varphi(x) - \langle x^*, x\rangle_X.$$
Evidently $\psi$ is lower semicontinuous, bounded from below (see (4.49)) and Gâteaux differentiable. Applying Theorem 4.6.5, we obtain $y_\varepsilon \in X$, such that
$$\|\psi'_G(y_\varepsilon)\|_{X^*} \le \varepsilon.$$
But
$$\psi'_G(y_\varepsilon) = \varphi'_G(y_\varepsilon) - x^*.$$
Hence
$$\|\varphi'_G(y_\varepsilon) - x^*\|_{X^*} \le \varepsilon.$$
Since $x^* \in a\overline{B}_1^{X^*}$ was arbitrary, we conclude that $\varphi'_G(X)$ is dense in $a\overline{B}_1^{X^*}$.
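COROLLARY 4.6.18 If $X$ is a Banach space, $\varphi : X \longrightarrow \mathbb{R}$ is a lower semicontinuous, bounded from below function which is Gâteaux differentiable and there exists a continuous function $\vartheta : \mathbb{R}_+ \longrightarrow \mathbb{R}$, such that
$$\frac{\vartheta(s)}{s} \longrightarrow +\infty \quad \text{as } s \to +\infty$$
and
$$\varphi(x) \ge \vartheta\big( \|x\|_X \big) \quad \forall\, x \in X,$$
then $\varphi'_G(X)$ is dense in $X^*$.

PROOF Let $a > 0$. We can find $s_a > 0$, such that
$$\vartheta(s) \ge as \quad \forall\, s \ge s_a.$$
Hence we have that
$$\varphi(x) \ge a\, \|x\|_X \quad \forall\, x \in X,\ \|x\|_X \ge s_a.$$
On the other hand, if $\|x\|_X < s_a$, then $\varphi(x) \ge m_a$, where
$$m_a \overset{df}{=} \min_{s \in [0, s_a]} \vartheta(s).$$
Therefore, finally we have
$$\varphi(x) \ge a\, \|x\|_X - c_a \quad \forall\, x \in X, \quad \text{with } c_a \overset{df}{=} a s_a + |m_a| > 0.$$
Apply Proposition 4.6.17 to obtain that $\varphi'_G(X)$ is dense in $a\overline{B}_1^{X^*}$. Since $a > 0$ was arbitrary, we conclude that $\varphi'_G(X)$ is dense in $X^*$.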
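Recall that if $X$ is a Banach space and $\varphi \in \Gamma_0(X)$, then $D(\partial\varphi) \subseteq \mathrm{dom}\, \varphi$ (recall that $D(\partial\varphi) = \{ x \in X : \partial\varphi(x) \ne \emptyset \}$ and $\mathrm{dom}\, \varphi = \{ x \in X : \varphi(x) < +\infty \}$). We would like to have a more precise relation between these two sets.

PROPOSITION 4.6.19 If $X$ is a Banach space, $\varphi \in \Gamma_0(X)$, $x_0 \in \mathrm{dom}\, \varphi$, $\varepsilon > 0$ and $x_0^* \in \partial_\varepsilon \varphi(x_0)$, then there exist $x \in X$ and $x^* \in \partial\varphi(x)$, such that
(a) $\|x - x_0\|_X \le \sqrt{\varepsilon}$;
(b) $|\varphi(x) - \varphi(x_0)| \le \varepsilon + \sqrt{\varepsilon}$;
(c) $\|x^* - x_0^*\|_{X^*} \le \sqrt{\varepsilon}\, \big( 1 + \|x_0^*\|_{X^*} \big)$.

PROOF Let
$$\psi(x) \overset{df}{=} \varphi(x) - \langle x_0^*, x\rangle_X.$$
We have $\psi \in \Gamma_0(X)$ and
$$\psi(x) \ge -\varphi^*(x_0^*) \ge \varphi(x_0) - \langle x_0^*, x_0\rangle_X - \varepsilon$$
(see Remark 4.4.50). Moreover, from Remark 4.4.50, we have
$$\psi(x_0) \le \inf_{x \in X} \psi(x) + \varepsilon.$$
On $X$ we use the norm
$$|||x|||_X = \|x\|_X + |\langle x_0^*, x\rangle_X|,$$
which is equivalent to the original norm $\|\cdot\|_X$. Invoking Corollary 4.6.4, we obtain $x \in X$, such that $\psi(x) \le \psi(x_0)$ and
$$|||x - x_0|||_X = \|x - x_0\|_X + |\langle x_0^*, x - x_0\rangle_X| \le \sqrt{\varepsilon} \qquad (4.50)$$
and
$$\varphi(x) - \langle x_0^*, x\rangle_X \le \varphi(y) - \langle x_0^*, y\rangle_X + \sqrt{\varepsilon}\, |||x - y|||_X \quad \forall\, y \in X. \qquad (4.51)$$
From (4.51) we obtain that
$$\varphi(x) - \langle x_0^*, x\rangle_X = \inf_{y \in X} \psi_1(y),$$
where
$$\psi_1(y) \overset{df}{=} \varphi(y) - \langle x_0^*, y\rangle_X + \sqrt{\varepsilon}\, |||x - y|||_X,$$
so, from Proposition 4.4.30, we have $0 \in \partial\psi_1(x)$.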
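Invoking Proposition 4.4.31, we have
$$\partial\psi_1(x) = \partial\varphi(x) - x_0^* + \sqrt{\varepsilon}\, \partial\big( |||\cdot|||_X \big)(0).$$
Note that
$$\partial\big( |||\cdot|||_X \big)(0) = \big\{ u^* + \lambda x_0^* : u^* \in \overline{B}_1^{X^*},\ \lambda \in [-1,1] \big\}$$
(see Example 4.4.24(b)). Therefore, we have
$$0 = x^* - x_0^* + \sqrt{\varepsilon}\, \big( u^* + \lambda x_0^* \big),$$
with $x^* \in \partial\varphi(x)$, so
$$\|x^* - x_0^*\|_{X^*} \le \sqrt{\varepsilon}\, \big( 1 + \|x_0^*\|_{X^*} \big),$$
which proves (c). Also from (4.50) and since
$$\psi(x_0) \le \inf_{x \in X} \psi(x) + \varepsilon,$$
we have
$$|\varphi(x) - \varphi(x_0)| \le \psi(x_0) - \psi(x) + |\langle x_0^*, x - x_0\rangle_X| \le \varepsilon + \sqrt{\varepsilon},$$
which proves (b). Finally again from (4.50), we have $\|x - x_0\|_X \le \sqrt{\varepsilon}$, which proves (a).

THEOREM 4.6.20 If $X$ is a Banach space and $\varphi \in \Gamma_0(X)$, then $D(\partial\varphi)$ is dense in $\mathrm{dom}\, \varphi$.

PROOF Let $x_0 \in \mathrm{dom}\, \varphi$, $n \ge 1$ and $x_{0n}^* \in \partial_{\frac{1}{n}} \varphi(x_0)$ (see Proposition 4.4.51). Invoking Proposition 4.6.19, we obtain $x_n \in D(\partial\varphi)$, such that
$$\|x_n - x_0\|_X \le \frac{1}{\sqrt{n}} \quad \text{and} \quad |\varphi(x_n) - \varphi(x_0)| \le \frac{1}{n} + \frac{1}{\sqrt{n}}.$$

REMARK 4.6.21 Actually in the proof of Theorem 4.6.20, we obtained something stronger. Namely if $x_0 \in \mathrm{dom}\, \varphi$, we can find a sequence $\{x_n\}_{n \ge 1} \subseteq D(\partial\varphi)$, such that
$$x_n \longrightarrow x_0 \text{ in } X \quad \text{and} \quad \varphi(x_n) \longrightarrow \varphi(x_0) \text{ in } \mathbb{R}.$$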
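That the inclusion $D(\partial\varphi) \subseteq \mathrm{dom}\, \varphi$ can be strict, while density still holds, is already visible on the real line. The following standard example is added as an illustration and is not taken from the text:

```latex
\[
\varphi(x) =
\begin{cases}
-\sqrt{x} & \text{if } x \ge 0, \\
+\infty & \text{if } x < 0,
\end{cases}
\qquad \varphi \in \Gamma_0(\mathbb{R}), \qquad \operatorname{dom} \varphi = [0, +\infty).
\]
\[
\partial\varphi(x) = \Big\{ -\tfrac{1}{2\sqrt{x}} \Big\} \ \text{ for } x > 0,
\qquad
\partial\varphi(0) = \emptyset ,
\]
% x^* \in \partial\varphi(0) would require -\sqrt{z} \ge x^* z for all z > 0,
% i.e. x^* \le -1/\sqrt{z} for all z > 0, which is impossible.
\[
D(\partial\varphi) = (0, +\infty), \qquad
x_n = \tfrac{1}{n} \longrightarrow 0, \qquad
\varphi(x_n) = -\tfrac{1}{\sqrt{n}} \longrightarrow 0 = \varphi(0).
\]
```

Here $D(\partial\varphi)$ misses the point $0$ but is dense in $\mathrm{dom}\, \varphi$, and the sequence $x_n = 1/n$ realizes the convergence described in Remark 4.6.21.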
Recall the following theorem from the theory of bounded linear operators between Banach spaces.

THEOREM 4.6.22 If $X$ and $Y$ are two Banach spaces and $A \in L(X;Y)$, then the following statements are equivalent:
(a) $A$ is surjective;
(b) there exists $c > 0$, such that $\|y^*\|_{Y^*} \le c\,\|A^*y^*\|_{X^*}$ for all $y^* \in Y^*$;
(c) $N(A^*) = \{0\}$ and $R(A^*)$ is closed.

REMARK 4.6.23 According to this theorem, if $A$ is surjective, then $A^*$ is injective. If one of the spaces $X$ or $Y$ is finite dimensional, then the converse is also true. We want to produce a nonlinear analog of Theorem 4.6.22.

THEOREM 4.6.24 If $X$ and $Y$ are two Banach spaces, $\varphi : X \longrightarrow Y$ is a Gâteaux differentiable map, $\varphi(X)$ is closed in $Y$, $y \in Y$ and there exist $\varrho > 0$ and $k \in [0,1)$, such that
\[
\varphi^{-1}\bigl(B_\varrho(y)\bigr) \ne \emptyset \tag{4.52}
\]
and
\[
\inf_{z \in R(\varphi_G'(x))} \|y - \varphi(x) - z\|_Y \le k\,\|y - \varphi(x)\|_Y \qquad \forall\, x \in \varphi^{-1}\bigl(B_\varrho(y)\bigr), \tag{4.53}
\]
then $y \in \varphi(X)$.

PROOF Let $A = \varphi(X)$.
By hypothesis $A \subseteq Y$ is closed. We proceed by contradiction. So suppose that $y \notin \varphi(X)$. Let
\[
R \stackrel{df}{=} d_Y(y, A)
\]
and choose $\varrho, r > 0$, such that
\[
r < R < \varrho \quad\text{and}\quad k\varrho < r.
\]
Note that if (4.52) and (4.53) hold for some $\varrho_0 > 0$, then they also hold for any $\varrho \in (R, \varrho_0)$. According to Theorem 4.6.14, we can find $u_0 \in B_\varrho(y)$, such that
\[
D(u_0, C) \cap A = \{u_0\},
\]
where $C \stackrel{df}{=} \overline{B}_r(y)$. Let $x_0 \in X$ be such that $\varphi(x_0) = u_0$. From (4.53) and recalling that $k\varrho < r$, we have
\[
\inf_{z \in R(\varphi_G'(x_0))} \|y - \varphi(x_0) - z\|_Y \le k\,\|y - \varphi(x_0)\|_Y < r.
\]
So we can find $h \in X$, such that
\[
\|y - \varphi(x_0) - \varphi_G'(x_0)h\|_Y < r. \tag{4.54}
\]
From this inequality, it follows that for $\lambda > 0$ small, we have
\[
\Bigl\| y - \varphi(x_0) - \frac{\varphi(x_0 + \lambda h) - \varphi(x_0)}{\lambda} \Bigr\|_Y < r.
\]
Let
\[
v_\lambda \stackrel{df}{=} y - \varphi(x_0) - \frac{\varphi(x_0 + \lambda h) - \varphi(x_0)}{\lambda} \in Y.
\]
From Definition 4.6.12, we see that $y - v_\lambda \in D(u_0, C)$, where $C = \overline{B}_r(y)$, so
\[
(1 - \lambda)u_0 + \lambda(y - v_\lambda) \in D(u_0, C) \qquad \forall\, \lambda \in (0,1)
\]
and thus
\[
\varphi(x_0 + \lambda h) \in D(u_0, C) \qquad \forall\, \lambda > 0 \text{ small enough}.
\]
Because $D(u_0, C) \cap A = \{u_0\}$, it follows that
\[
\varphi(x_0 + \lambda h) = u_0 \qquad \forall\, \lambda > 0 \text{ small enough},
\]
so $\varphi_G'(x_0)h = 0$. Using this in (4.54), we obtain
\[
\|y - \varphi(x_0)\|_Y < r < R,
\]
a contradiction.

REMARK 4.6.25 If in the above theorem conditions (4.52) and (4.53) hold for all $y \in Y$, then we conclude that $\varphi$ is surjective. Note that conditions (4.52) and (4.53) are in a sense complementary. Namely the larger $\varrho > 0$, the more difficult it is to verify (4.53).
COROLLARY 4.6.26 If $X$ and $Y$ are two Banach spaces, $\varphi : X \longrightarrow Y$ is a Gâteaux differentiable map, $\varphi(X)$ is closed in $Y$ and
\[
N\bigl((\varphi_G'(x))^*\bigr) = \{0\} \qquad \forall\, x \in X,
\]
then $\varphi$ is surjective.

PROOF Recall that
\[
\overline{R\bigl(\varphi_G'(x)\bigr)} = {}^{\perp}N\bigl((\varphi_G'(x))^*\bigr)
\]
(see, e.g., Denkowski, Migórski & Papageorgiou (2003a, p. 320)). Since by hypothesis $N\bigl((\varphi_G'(x))^*\bigr) = \{0\}$, it follows that (4.53) is true with $k = 0$. So for a given $y \in Y$, let $\varrho > 0$ be such that $d_Y\bigl(y, \varphi(X)\bigr) < \varrho$ and let $k = 0$. Then we can apply Theorem 4.6.24 and conclude that $y \in \varphi(X)$. This proves that $\varphi$ is surjective.

REMARK 4.6.27 It is clear from the proof of the above theorem that the hypothesis
\[
N\bigl((\varphi_G'(x))^*\bigr) = \{0\} \qquad \forall\, x \in X
\]
can be replaced by the hypothesis that
\[
R\bigl(\varphi_G'(x)\bigr) \text{ is dense in } Y \qquad \forall\, x \in X.
\]
Moreover, if $\varphi_G'(x) \in \Phi(X;Y)$ and $\mathrm{ind}\bigl(\varphi_G'(x)\bigr) = 0$ for all $x \in X$ (see Definition 3.1.60), then the hypothesis
\[
N\bigl((\varphi_G'(x))^*\bigr) = \{0\} \qquad \forall\, x \in X
\]
can be replaced by the hypothesis that
\[
N\bigl(\varphi_G'(x)\bigr) = \{0\} \qquad \forall\, x \in X.
\]
The discussion so far has illustrated the power of the Ekeland variational principle. The only difficulty that we encounter when using this principle is that the perturbation function $\varepsilon\,\|x - x_\varepsilon\|$ is not differentiable at the origin. So it is natural to ask whether it is possible to formulate a similar variational principle but for a different class of perturbations, which would include functions differentiable at points of interest. This was achieved by Borwein & Preiss (1987), who obtained the following theorem.
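THEOREM 4.6.28 If $X$ is a Banach space, $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function, $\varepsilon > 0$ and $x_0 \in X$ is such that
\[
\varphi(x_0) \le \inf_{x\in X}\varphi(x) + \varepsilon,
\]
then for any $\lambda > 0$ and any $p \in [1, +\infty)$, we can find $y_\lambda \in X$, a sequence $\{x_n\}_{n\ge 1} \subseteq X$, such that $x_n \longrightarrow y_\lambda$ in $X$ and a sequence $\{t_n\}_{n\ge 1} \subseteq [0,1]$ satisfying $\sum_{n=1}^{\infty} t_n = 1$, such that:
(a) $\varphi(y_\lambda) \le \varphi(x_0)$;
(b) $\|x_n - x_0\|_X \le \lambda$ for all $n \ge 1$;
(c) the function $\psi(x) = \varphi(x) + \dfrac{\varepsilon}{\lambda^p}\sum_{n=1}^{\infty} t_n \|x - x_n\|_X^p$ attains a strict minimum at $y_\lambda$.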
The next step in the development of variational principles was made by Deville, Godefroy & Zizler (1993). Their starting point was the argument in the Borwein-Preiss variational principle. What matters in that argument is the function
\[
\vartheta(x) \stackrel{df}{=} 1 - \|x\|_X
\]
and in particular its behaviour within the unit ball. That is, the behaviour of $\vartheta$ outside the domain where $\vartheta$ is nonnegative plays no role in the argument. So we may as well replace $\vartheta$ by the function
\[
\widehat{\vartheta}(x) \stackrel{df}{=} \vartheta^+(x) = \max\bigl\{0,\, 1 - \|x\|_X\bigr\}.
\]
Note that $\widehat{\vartheta}$ is a continuous bump function (i.e., a continuous function on $X$ which has a nonempty and bounded support; see Remark 4.4.84). This observation is interesting because there are Banach spaces in which smooth bump functions can be found, but they do not have an equivalent differentiable norm (see Fabian (1997)).

THEOREM 4.6.29 If $X$ is a Banach space which admits a Lipschitz continuous bump function that is Fréchet (respectively Gâteaux) differentiable, $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous, bounded from below function and $\varepsilon > 0$, then there exists a Lipschitz continuous function $g : X \longrightarrow \mathbb{R}$ which is Fréchet (respectively Gâteaux) differentiable, such that
\[
\|g\|_\infty = \sup_{x\in X}\bigl|g(x)\bigr| \le \varepsilon, \qquad
\|g'\|_\infty = \sup_{x\in X}\bigl\|g'(x)\bigr\|_{X^*} \le \varepsilon
\]
and $\varphi - g$ attains its minimum on $X$.
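PROOF We do the proof for the Fréchet differentiable case. The proof of the Gâteaux differentiable case is done similarly. Let $V$ be the linear space of all functions $g : X \longrightarrow \mathbb{R}$ which are Lipschitz continuous and Fréchet differentiable. Evidently
\[
\bigl\|g'(x)\bigr\|_{X^*} \le \mathrm{Lip}(g) \qquad \forall\, g \in V,\ x \in X
\]
(by $\mathrm{Lip}(g)$ we denote the Lipschitz constant of $g$). So the function $x \longmapsto g'(x)$ is bounded and then by the mean value theorem (see Proposition 4.1.21), we have that $x \longmapsto g(x)$ is bounded too. It is easy to see that $V$ supplied with the norm
\[
\|g\|_V = \|g\|_\infty + \|g'\|_\infty
\]
becomes a Banach space. For every $n \ge 1$, let
\[
A_n \stackrel{df}{=} \Bigl\{ g \in V : \text{there exists } x_0 \in X, \text{ such that } (\varphi - g)(x_0) < \inf_{x \in X\setminus B_{\frac{1}{n}}(x_0)} (\varphi - g)(x) \Bigr\},
\]
where
\[
B_{\frac{1}{n}}(x_0) = \Bigl\{ x \in X : \|x - x_0\|_X < \frac{1}{n} \Bigr\}.
\]
We claim that for every $n \ge 1$, the set $A_n$ is open and dense in $V$. To do this note that $\|\cdot\|_\infty \le \|\cdot\|_V$. So it follows that $A_n$ is open. To show the density of $A_n$, let $g \in V$ and $\varepsilon > 0$. We need to find $h \in V$ with $\|h\|_V < \varepsilon$ and $x_0 \in X$, such that
\[
(\varphi - g - h)(x_0) < \inf_{x \in X\setminus B_{\frac{1}{n}}(x_0)} (\varphi - g - h)(x).
\]
Here $b \in V$ denotes a bump function with $b(0) > 0$, $\|b\|_V < \varepsilon$ and $b \equiv 0$ outside $B_{\frac{1}{n}}(0)$; such a function can be obtained from the bump function of the hypothesis by translation and scaling. The function $\varphi - g$ is bounded from below. So we can find $x_0 \in X$, such that
\[
(\varphi - g)(x_0) < \inf_{x\in X}(\varphi - g)(x) + b(0).
\]
Let
\[
h(x) \stackrel{df}{=} b(x - x_0).
\]
Evidently $h \in V$ with $\|h\|_V < \varepsilon$, and we have
\[
(\varphi - g - h)(x_0) = (\varphi - g)(x_0) - b(0) < \inf_{x\in X}(\varphi - g)(x).
\]
Since $h|_{X\setminus B_{\frac{1}{n}}(x_0)} \equiv 0$, we have
\[
(\varphi - g - h)(x) = (\varphi - g)(x) \ge \inf_{x\in X}(\varphi - g)(x) \qquad \forall\, x \in X\setminus B_{\frac{1}{n}}(x_0).
\]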
Hence $g + h \in A_n$ and this proves the density of $A_n$ in $V$. Then by the Baire category theorem (see Theorem A.1.10), we have that
\[
D = \bigcap_{n=1}^{\infty} A_n \subseteq V \ \text{is a dense } G_\delta \text{ set}.
\]
Next we show that if $g \in D$, then $\varphi - g$ attains its minimum on $X$. From the definition of $A_n$, we can find $x_n \in X$, such that
\[
(\varphi - g)(x_n) < \inf_{x \in X\setminus B_{\frac{1}{n}}(x_n)} (\varphi - g)(x).
\]
For $m \ge n$, we have $x_m \in B_{\frac{1}{n}}(x_n)$, or otherwise we would have
\[
(\varphi - g)(x_n) < (\varphi - g)(x_m) \quad\text{and}\quad \|x_n - x_m\|_X \ge \frac{1}{n} \ge \frac{1}{m}. \tag{4.55}
\]
By virtue of the second inequality and the choice of $x_m$, we have
\[
(\varphi - g)(x_m) < (\varphi - g)(x_n),
\]
which contradicts the first inequality in (4.55). Therefore we infer that $\{x_n\}_{n\ge 1} \subseteq X$ is a Cauchy sequence in $X$ and $x_n \longrightarrow \widehat{x}$ in $X$. We claim that $\widehat{x}$ is a minimizer of $\varphi - g$. Because $\varphi - g$ is lower semicontinuous, we have
\[
(\varphi - g)(\widehat{x}) \le \liminf_{n\to+\infty} (\varphi - g)(x_n) \le \liminf_{n\to+\infty} \Bigl( \inf_{x \in X\setminus B_{\frac{1}{n}}(x_n)} (\varphi - g)(x) \Bigr). \tag{4.56}
\]
If $u \in X$, $u \ne \widehat{x}$, then
\[
\|x_n - u\|_X \ge \frac{1}{n} \qquad \forall\, n \ge 1 \text{ large enough}
\]
and so for $n \ge 1$ large enough, we have
\[
\inf_{x \in X\setminus B_{\frac{1}{n}}(x_n)} (\varphi - g)(x) \le (\varphi - g)(u).
\]
Using this in (4.56), we conclude that
\[
(\varphi - g)(\widehat{x}) = \inf_{x\in X}(\varphi - g)(x).
\]
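REMARK 4.6.30 If the norm of $X$ is Fréchet differentiable away from the origin, then the function
\[
x \longmapsto \|x\|_X^2
\]
is a $C^1$-function on $X$. If $\xi : \mathbb{R}_+ \longrightarrow \mathbb{R}$ is a $C^1$-function, such that
\[
\xi(0) = 1 \quad\text{and}\quad \xi(s) = 0 \qquad \forall\, s \ge 1,
\]
then
\[
b(x) \stackrel{df}{=} \xi\bigl(\|x\|_X^2\bigr)
\]
is a $C^1$-function on $X$, such that
\[
b(0) = 1 \quad\text{and}\quad b(x) = 0 \qquad \forall\, x \notin B_1(0).
\]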
This is a $C^1$-bump function. Note that if $X^*$ is separable, then $X$ admits a $C^1$-bump function. Indeed in this case $X$ admits an equivalent Fréchet differentiable norm and so the $C^1$-bump function is constructed as indicated above. Every separable Banach space admits an equivalent Gâteaux differentiable norm and so it admits a Lipschitz continuous, Gâteaux differentiable bump function. The following result is an interesting consequence of Theorem 4.6.29.

PROPOSITION 4.6.31 If the Banach space $X$ admits a Lipschitz continuous and Fréchet (respectively Gâteaux) differentiable bump function, then every continuous convex function defined on $X$ is Fréchet (respectively Gâteaux) differentiable on a dense subset of $X$. In particular, if the Banach space admits a Lipschitz continuous and Fréchet differentiable bump function, then $X$ is an Asplund space (see Definition 4.2.18).

PROOF Again we do the proof for the Fréchet differentiable case. The proof for the Gâteaux differentiable case is similar. Let $\varphi$ be a continuous concave function on $X$ and $b$ be a Lipschitz continuous and Fréchet differentiable function on $X$ with
\[
b(0) \ne 0 \quad\text{and}\quad b(x) = 0 \qquad \forall\, x \notin B_1(0).
\]
Let $x_0 \in X$ and choose $\delta > 0$, such that
\[
\varphi(x_0) - 1 < \varphi(x) \qquad \forall\, x \in B_\delta(x_0).
\]
Choose $m > \frac{1}{\delta}$. If $\|x - x_0\|_X \ge \delta$, we have
\[
\bigl\|m(x - x_0)\bigr\|_X \ge 1
\]
and so
\[
b\bigl(m(x - x_0)\bigr) = 0.
\]
We define the function
\[
f(x) \stackrel{df}{=}
\begin{cases}
\dfrac{1}{b\bigl(m(x - x_0)\bigr)^2} & \text{if } b\bigl(m(x - x_0)\bigr) \ne 0,\\[2mm]
+\infty & \text{otherwise}.
\end{cases}
\]
The function $\varphi + f : X \longrightarrow \overline{\mathbb{R}}$ is proper, lower semicontinuous and bounded from below (by $\varphi(x_0) - 1$). So we can apply Theorem 4.6.29 and obtain a Lipschitz continuous and Fréchet differentiable function $g : X \longrightarrow \mathbb{R}$, such that $\varphi + f - g$ attains its minimum at some point $y_0 \in B_\delta(x_0)$ (since $f \equiv +\infty$ outside $B_\delta(x_0)$). Let $U$ be a neighbourhood of $y_0$ on which the function
\[
x \longmapsto b\bigl(m(x - x_0)\bigr)
\]
is nonzero. Then the function $f$ is Fréchet differentiable on $U$. We have
\[
\varphi(y_0) + f(y_0) - g(y_0) \le \varphi(y) + f(y) - g(y) \qquad \forall\, y \in U,
\]
so
\[
-\varphi(y) \le -\varphi(y_0) - f(y_0) + g(y_0) + f(y) - g(y) \qquad \forall\, y \in U.
\]
Let
\[
v(y) \stackrel{df}{=} -\varphi(y_0) - f(y_0) + g(y_0) + f(y) - g(y) \qquad \forall\, y \in U.
\]
We have
\[
-\varphi(y) \le v(y) \quad \forall\, y \in U \qquad\text{and}\qquad -\varphi(y_0) = v(y_0).
\]
If $\|h\|_X$ is small, from the convexity of $-\varphi$, we have
\[
0 \le (-\varphi)(y_0 + h) + (-\varphi)(y_0 - h) - 2(-\varphi)(y_0) \le v(y_0 + h) + v(y_0 - h) - 2v(y_0). \tag{4.57}
\]
Since $v$ is Fréchet differentiable, we have
\[
v(y_0 + h) + v(y_0 - h) - 2v(y_0) = \bigl\langle v_F'(y_0), h\bigr\rangle_X - \bigl\langle v_F'(y_0), h\bigr\rangle_X + o\bigl(\|h\|_X\bigr). \tag{4.58}
\]
Hence from (4.57) and (4.58), it follows that
\[
(-\varphi)(y_0 + h) + (-\varphi)(y_0 - h) - 2(-\varphi)(y_0) = o\bigl(\|h\|_X\bigr)
\]
and this by virtue of Proposition 4.2.9 implies that $-\varphi$ is Fréchet differentiable at $y_0$. Since $x_0 \in X$ and $\delta > 0$ were arbitrary, we conclude that $-\varphi$ is Fréchet differentiable on a dense subset of $X$. Finally for the last part of the proposition, recall that the set of points of differentiability of $-\varphi$ is a $G_\delta$ set (see the proof of Theorem 4.2.12).

REMARK 4.6.32 It is not known whether every Asplund space admits a Fréchet differentiable bump function.

We conclude this section with a generalization of Theorem 4.6.1 which is useful when we study boundary value problems using variational methods. This generalization is due to Zhong (1997), where the interested reader can find its proof.

THEOREM 4.6.33 If $h : \mathbb{R}_+ \longrightarrow \mathbb{R}_+$ is a continuous, nondecreasing function, such that
\[
\int_0^{+\infty} \frac{1}{1 + h(r)}\,dr = +\infty,
\]
$(X, d_X)$ is a complete metric space, $x_0 \in X$ is fixed, $\varphi : X \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous and bounded below function, $\varepsilon > 0$,
\[
\varphi(y) \le \inf_{x\in X}\varphi(x) + \varepsilon
\]
and $\lambda > 0$, then there exists $x_\lambda \in X$ such that
\[
\varphi(x_\lambda) \le \varphi(y), \qquad d_X(x_\lambda, x_0) \le r_0 + r
\]
and
\[
\varphi(x_\lambda) \le \varphi(x) + \frac{\varepsilon}{\lambda\bigl(1 + h(d_X(x_0, x_\lambda))\bigr)}\,d_X(x_\lambda, x) \qquad \forall\, x \in X,
\]
where $r_0 \stackrel{df}{=} d_X(x_0, y)$ and $r > 0$ is such that
\[
\int_{r_0}^{r_0 + r} \frac{1}{1 + h(r)}\,dr \ge \lambda.
\]

REMARK 4.6.34 If $h \equiv 0$ and $x_0 = y$, then Theorem 4.6.33 reduces to Theorem 4.6.1.
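The reduction stated in Remark 4.6.34 can be checked by direct substitution. The computation below is only a sketch: it assumes that Theorem 4.6.1 is the Ekeland variational principle in its usual form with parameter $\lambda$ (that statement is not reproduced in this section), and it renames the integration variable to $s$ to avoid the clash with the upper limit $r_0 + r$.

```latex
% Setting h \equiv 0 and x_0 = y in Theorem 4.6.33:
\begin{align*}
\int_0^{+\infty}\frac{1}{1+h(s)}\,ds &= \int_0^{+\infty} 1\,ds = +\infty, \\
r_0 &= d_X(x_0,y) = 0, \\
\int_{r_0}^{r_0+r}\frac{1}{1+h(s)}\,ds &= \int_0^{r} 1\,ds = r \ge \lambda
  \quad\text{(so } r=\lambda \text{ is admissible)}, \\
\varphi(x_\lambda) &\le \varphi(y), \qquad d_X(x_\lambda,y)\le\lambda, \\
\varphi(x_\lambda) &\le \varphi(x) + \frac{\varepsilon}{\lambda}\,d_X(x_\lambda,x)
  \qquad \forall\, x\in X.
\end{align*}
```

With the choice $r = \lambda$ the three conclusions are exactly the localization and slope estimates one expects from the Ekeland principle.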
4.7 Remarks
4.1: Gâteaux (1913) gave the definition of directional differentiability when $X$ is simply a linear space and $Y = \mathbb{R}$. Afterwards, Lévy (1920) imposed the requirement that $f'(x;\cdot)$ must be linear. The Fréchet derivative was introduced by Fréchet (1920). Various parts of the calculus in Banach spaces can be found in the books of Abraham & Marsden (1978), Cartan (1967), Denkowski, Migórski & Papageorgiou (2003a, 2003b), Dieudonné (1969), Ioffe & Tihomirov (1979), Vaĭnberg (1973), Zeidler (1985b) and in the survey papers of Averbukh & Smolyanov (1967, 1968) and Nashed (1971). We should also mention Lusternik's theorem, which is useful in variational analysis. For a proof of it see Zeidler (1985b, pp. 287–289). First a definition.

DEFINITION 4.7.1 Let $X$ be a locally convex space, $C \subseteq X$ and $x_0 \in C$.
(a) An admissible curve in $C$ through $x_0$ is a map $u : (-\varepsilon, \varepsilon) \longrightarrow C$ for some $\varepsilon > 0$, such that
\[
u(t) \in C \quad \forall\, t \in (-\varepsilon, \varepsilon), \qquad u(0) = x_0 \quad\text{and}\quad u'(0) \text{ exists}.
\]
(b) $h \in X$ is said to be a tangent vector to $C$ at $x_0$ if and only if there exists an admissible curve in $C$ through $x_0$, such that $u'(0) = h$, that is if there exist an $\varepsilon > 0$ and a map $(-\varepsilon, \varepsilon) \ni \lambda \longmapsto r(\lambda) \in X$, such that
\[
x_0 + \lambda h + r(\lambda) \in C \quad \forall\, \lambda \in (-\varepsilon, \varepsilon) \qquad\text{and}\qquad \frac{\|r(\lambda)\|_X}{\lambda} \longrightarrow 0 \ \text{ as } \lambda \to 0.
\]
(c) The set of all vectors tangent to C at x0 is a closed cone, which is nonempty since the origin belongs to it. This cone is usually called the tangent cone to C at x0 and it is denoted by TC (x0 ). If this cone is a subspace of X, then it is called the tangent space to C at x0 .
THEOREM 4.7.2
(a) If $X$ and $Y$ are two Banach spaces, $U$ is a neighbourhood of $x_0 \in X$, $\varphi : U \longrightarrow Y$ is a Fréchet differentiable function and
\[
R\bigl(\varphi_F'(x_0)\bigr) = Y
\]
(i.e., $x_0 \in U$ is a regular point of $\varphi$), then the tangent space to the set
\[
C \stackrel{df}{=} \bigl\{ x \in X : \varphi(x) = \varphi(x_0) \bigr\}
\]
coincides with the kernel of $\varphi_F'(x_0)$, i.e.,
\[
T_C(x_0) = N\bigl(\varphi_F'(x_0)\bigr).
\]
(b) If $X$ and $Y$ are two Banach spaces, $U$ is a neighbourhood of $x_0 \in X$, $\varphi \in C^1(U;Y)$ and for all $x \in U$, such that $\varphi(x) = \varphi(x_0)$, we have
\[
R\bigl(\varphi_F'(x)\bigr) = Y
\]
and $N\bigl(\varphi_F'(x)\bigr)$ is complemented in $X$ (see Definition 4.1.28), then the set
\[
C \stackrel{df}{=} \bigl\{ x \in U : \varphi(x) = \varphi(x_0) \bigr\}
\]
is a $C^1$-manifold in $X$. Moreover, if $\varphi \in C^r(U;Y)$ with $r \ge 1$, then $C$ is a $C^r$-manifold in $X$.

4.2: Convex functions play a central role in many applications. There are several books dealing with the continuity and differentiability properties of convex functions on finite or infinite dimensional Banach spaces. We mention the books of Rockafellar (1970a), Webster (1994) (convex functions defined on $\mathbb{R}^N$) and Barbu & Precupanu (1986), Ekeland & Temam (1976), Giles (1982), Ioffe & Tihomirov (1979), Laurent (1972), Phelps (1993) and Roberts & Varberg (1973) (convex functions defined on Banach and locally convex spaces). Note that the books of Giles (1982) and Phelps (1993) approach convex functions from the point of view of functional analysis and place special emphasis on the relations with Banach space theory. Theorem 4.2.12 is a classical result of Mazur (1933). For convex, continuous functions which are everywhere Gâteaux differentiable, there are some stronger results on Fréchet differentiability. In particular, Deville, Godefroy, Hare & Zizler (1987) characterize the separable Banach spaces $X$ so that every continuous convex function
$\varphi : X \longrightarrow \mathbb{R}$, which is everywhere Gâteaux differentiable, must be Fréchet differentiable on a dense set. It turns out that $X^*$ can be nonseparable, but $X$ cannot contain a subspace isomorphic to $l^1$.

4.3: Haar-null sets (see Definition 4.3.5) were introduced by Christensen (1972), who also obtained all the results up until Corollary 4.3.14. Theorem 4.3.17 is also due to Christensen (1974). In the paper of Hunt, Sauer & Yorke (1992, 1993) (see also their addendum), we find relations between dynamical systems and Haar-null sets. There are other ways to define negligible sets in an infinite dimensional Banach space (such as Gauss-null sets, Aronszajn-null sets and cube-null sets). A detailed discussion of them and their use in the study of the differentiability properties of Lipschitz continuous functions can be found in the book of Benyamini & Lindenstrauss (1997).

4.4: Duality is at the core of convex analysis. The Legendre-Fenchel transform (see Definition 4.4.1) was first used for convex functions on $\mathbb{R}$ by Mandelbrojt (1939). This motivated Fenchel (1951) to introduce an important and more general definition for convex functions in $\mathbb{R}^N$. The transform introduced by Fenchel is an extension of the Legendre transform (see Legendre (1786)). This is why the transform is called Legendre-Fenchel transform. This notion was extended to dual pairs of locally convex spaces by Brondsted (1964), Moreau (1966–1967) and Rockafellar (1974). A special case of the inequality in Proposition 4.4.3(a) can be found in Young (1912) and for this reason the inequality is called Young-Fenchel inequality. We should point out that some authors (see Ioffe & Tichomirov (1968, 1979)) prefer to name Young-Fenchel transform for what we call here Legendre-Fenchel transform.

The finite-dimensional duality theory can be found in the books of Fenchel (1951), Rockafellar (1970a), while the infinite dimensional duality theory can be found in the books of Barbu & Precupanu (1986), Ekeland & Temam (1976), Ioffe & Tihomirov (1979), Laurent (1972). Theorem 4.4.14 is the main result in the duality theory for convex functions and sometimes it is called the Fenchel-Moreau theorem. First Fenchel (1951) observed that $\varphi \in \Gamma_0\bigl(\mathbb{R}^N\bigr)$ if and only if it is the supremum of all affine continuous functions majorized by $\varphi$. Soon thereafter Hörmander (1955) established the following result.

PROPOSITION 4.7.3 If $X$ is a locally convex space, then there is a bijective correspondence between nonempty, closed, convex sets and sublinear, $w(X^*, X)$-lower semicontinuous functions on $X^*$ with values in $\overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$, which maps $C$ into $\sigma_C$.

Theorem 4.4.14 in conjunction with Proposition 4.4.13 says that $\varphi^{**}$ is the biggest convex and lower semicontinuous function majorized by $\varphi$ (sometimes this is denoted by writing that $\varphi^{**} = \overline{\mathrm{conv}}\,\varphi$, which is a suggestive notation expressing the fact that $\mathrm{epi}\,\varphi^{**} = \overline{\mathrm{conv}}\,\mathrm{epi}\,\varphi$). This fact is important in control
theory in connection with the relaxation method. If the ambient space is $\mathbb{R}^N$, then using Carathéodory's theorem for convex sets in $\mathbb{R}^N$, we can have the following useful expression for $\varphi^{**}$ (see Ioffe & Tihomirov (1979, p. 189)).

PROPOSITION 4.7.4 If $\varphi : \mathbb{R}^N \longrightarrow \overline{\mathbb{R}}$ is a proper, lower semicontinuous function and $\mathrm{dom}\,\varphi^{**} \subseteq \mathbb{R}^N$ is closed, then
\[
\varphi^{**}(x) = \inf\Bigl\{ \sum_{k=1}^{N+1} \lambda_k \varphi(x_k) : x_k \in \mathbb{R}^N,\ \lambda_k \ge 0,\ \sum_{k=1}^{N+1} \lambda_k = 1,\ \sum_{k=1}^{N+1} \lambda_k x_k = x \Bigr\}.
\]
The operation of infimal convolution (see Definition 4.4.6(b)) was introduced by Moreau (1965, 1966–1967) and its duality properties were studied by Ioffe & Tichomirov (1968, 1979). The proof of Proposition 4.4.16 can be found in Ioffe & Tihomirov (1979, p. 178). Although affine continuous supports for convex functions were considered earlier, the first systematic study of the subdifferential multifunction started with the works of Moreau (1965, 1966–1967) and Rockafellar (1966, 1970b). Moreau (1965) limits himself to the framework of Hilbert spaces, while Rockafellar (1970b) passes to general Banach spaces. We should also mention the related work of Pshenichnyi (1971) on quasi-differentiable functions. One of the main results of the convex subdifferential theory is Theorem 4.4.34 (see also Remark 4.4.35). This was first proved by Rockafellar (1966), but it was found that his proof had a gap. This was remedied by Rockafellar (1970b), where we find also the proof of Proposition 4.4.31. The notion of cyclically monotone operators is due to Rockafellar (1966), who proved Theorem 4.4.39 (see also Rockafellar (1970b, Theorem B, p. 210)). Proposition 4.4.42 is due to Brézis (1973). For the proof of Proposition 4.4.46, we refer to Phelps (1993, p. 19). The $\varepsilon$-subdifferential (see Definition 4.4.49) was investigated systematically by Hiriart-Urruty (1980, 1982), and Hiriart-Urruty & Phelps (1993). Convex subdifferentials found widespread applications in optimization, control theory and evolution equations, as seen in the books of Barbu (1976, 1994), Barbu & Precupanu (1986), Dontchev & Zolezzi (1993), Ekeland & Temam (1976), Hiriart-Urruty & Lemaréchal (1993), Hu & Papageorgiou (1997, 2000), Ioffe & Tihomirov (1979), Rockafellar (1970a, 1974), Rockafellar & Wets (1998) and Tiba (1990). The proof of Proposition 4.4.53 can be found in Rockafellar (1970a, pp. 219–220). The subdifferential theory for locally Lipschitz functionals is due to Clarke (1975, 1981, 1983).
Only Theorem 4.4.72 is due to Lebourg (1975). Applications of the generalized subdifferential can be found in the books of Clarke (1983, 1989), Clarke, Ledyaev, Stern & Wolenski (1998) and in Naniewicz & Panagiotopoulos (1995) and Gasiński & Papageorgiou (2005) (which deal with hemivariational inequalities). Of the other subdifferentials, the viscosity
subdifferential was explicitly defined by Deville, Godefroy & Zizler (1993) and studied by Borwein & Zhu (1996). The proximal subdifferential is discussed in Clarke (1989), Clarke, Ledyaev, Stern & Wolenski (1998) and Rockafellar & Wets (1998) and the canonical subdifferential was introduced by Penot (1978).

4.5: Integral functionals determined by (convex) normal integrands were first studied by Rockafellar (1968). Several results for integrands defined on $\Omega \times \mathbb{R}^N$ can be found in Rockafellar (1976) with additional results and extensions in Rockafellar (1971a, 1971c, 1971b). The work was extended by Levin (1973, 1974, 1975, 1980), who removed some restrictive finite dimensionality or reflexivity hypotheses. Theorem 4.5.7, known as the Yosida-Hewitt decomposition theorem, was first proved by Yosida & Hewitt (1952) for $X = \mathbb{R}$ and $\mu$ being a finite measure. Another, more direct proof can be found in Dubovitskii & Miljutin (1968). Ioffe & Levin (1972) extended the result to a separable Banach space $X$ and a finite measure $\mu$. Their proof does not extend to $\mu$ being $\sigma$-finite. The general form (see Theorem 4.5.7) is due to Levin (1974). Theorem 4.5.2 is due to Rockafellar (1971a) (for a reflexive, separable Banach space $X$) and Levin (1975) (for a separable Banach space $X$). Similarly Theorem 4.5.8 is due to Rockafellar (1971b) (for $X = \mathbb{R}^N$), Rockafellar (1971a) (for a reflexive, separable Banach space $X$) and Levin (1975) (for a separable Banach space $X$). Proposition 4.5.13 is due to Moreau (1966–1967) and its proof can be also found in Laurent (1972, p. 348), while Theorem 4.5.16 is due to Rockafellar (1971a). Additional results on convex integral functionals can be found in Bismut (1973), Castaing & Valadier (1977), Papageorgiou (1986) and Valadier (1975). There is a continuous analog of the operation of infimal convolution (see Definition 4.4.6(b)).

DEFINITION 4.7.5 Let $(\Omega, \Sigma, \mu)$ be a finite, complete, nonatomic measure space and let
\[
\varphi : \Omega \times \mathbb{R}^N \longrightarrow \overline{\mathbb{R}}
\]
be a $\Sigma \times \mathcal{B}(\mathbb{R}^N)$-measurable integrand. The inf-convolution integral of $\varphi$ with respect to $\mu$ is the function
\[
\oint_\Omega \varphi_\omega\,d\mu : \mathbb{R}^N \longrightarrow \mathbb{R}^*,
\]
defined by
\[
\Bigl(\oint_\Omega \varphi_\omega\,d\mu\Bigr)(x) \stackrel{df}{=} \inf\Bigl\{ \lambda \in \mathbb{R} : (x, \lambda) \in \int_\Omega \mathrm{epi}\,\varphi(\omega, \cdot)\,d\mu \Bigr\}.
\]
REMARK 4.7.6 In the above definition
\[
\int_\Omega \mathrm{epi}\,\varphi(\omega, \cdot)\,d\mu = \Bigl\{ \int_\Omega \bigl(u(\omega), \lambda(\omega)\bigr)\,d\mu : (u, \lambda) \in S^1_{\mathrm{epi}\,\varphi(\omega,\cdot)} \Bigr\}.
\]
If for all $x \in L^1\bigl(\Omega; \mathbb{R}^N\bigr)$, $I_\varphi(x)$ exists (possibly infinite), then
\[
\Bigl(\oint_\Omega \varphi_\omega\,d\mu\Bigr)(x) = \inf\Bigl\{ I_\varphi(u) : u \in L^1\bigl(\Omega; \mathbb{R}^N\bigr),\ \int_\Omega u(\omega)\,d\mu = x \Bigr\}.
\]
In this form this operation arises in mathematical economics (see Aumann & Shapley (1974)). The next result is due to Ioffe & Tichomirov (1968). It is the continuous analog of Proposition 4.4.8.

PROPOSITION 4.7.7 If $\mathrm{dom}\Bigl(\oint_\Omega \varphi_\omega\,d\mu\Bigr) \ne \emptyset$, then
\[
\Bigl(\oint_\Omega \varphi_\omega\,d\mu\Bigr)^* = \int_\Omega \varphi_\omega^*\,d\mu.
\]

4.6: Theorem 4.6.1 is due to Ekeland (1974). A detailed discussion with various applications can be found in Ekeland (1979, 1989). Theorem 4.6.9 with $F$ being single valued was proved by Caristi (1976) using a different proof based on transfinite induction (see also Caristi & Kirk (1975)). Theorem 4.6.14 is due to Daneš (1972), but his proof used a result of Krasnoselskii & Zabreiko (1984). The proof given here is due to Brondsted (1974). Relations between these and other geometric theorems of nonlinear analysis were proved by Brézis & Browder (1976), Daneš (1972) and Penot (1986). Proposition 4.6.19 and Theorem 4.6.20 are due to Brondsted & Rockafellar (1965). For the proof of Theorem 4.6.22, we refer to Denkowski, Migórski & Papageorgiou (2003a, p. 384). Theorem 4.6.24 and Corollary 4.6.26 are due to Browder (1971a, 1971b). Another nonlinear surjectivity result due to Bates & Ekeland (1980) is the following.

PROPOSITION 4.7.8 If $X$ and $Y$ are two Banach spaces, $\varphi : X \longrightarrow Y$ is continuous, Gâteaux differentiable,
\[
R\bigl(\varphi_G'(x)\bigr) = Y \qquad \forall\, x \in X
\]
and there exists $k > 0$, such that for all $x \in X$ and all $y \in Y$, there exists $z \in \bigl(\varphi_G'(x)\bigr)^{-1}(y)$ satisfying $\|z\|_X \le k\,\|y\|_Y$, then $\varphi(X) = Y$, i.e., $\varphi$ is surjective.
Theorem 4.6.28 is due to Borwein & Preiss (1987) and it is known as the Borwein-Preiss smooth variational principle. Theorem 4.6.29 is due to Deville, Godefroy & Zizler (1993). More applications of the Ekeland variational principle can be found in Barbu (1994), Denkowski, Migórski & Papageorgiou (2003b), Fattorini (1999), Li & Yong (1995) and Willem (1996).
Chapter 5 Critical Point Theory
Variational methods are a valuable tool in the analysis of nonlinear problems. According to these methods, we are trying to find solutions of a given nonlinear equation, by looking for critical (stationary) points of a functional defined on the function space in which we want the solution of our problem to lie. The Euler-Lagrange equation satisfied by a critical point is the nonlinear equation that we are trying to solve. The functional, whose critical points we are trying to determine, in many cases is unbounded (both from above and below – indefinite functional) and so we cannot expect to have global maxima or minima. Instead we look for local extrema or for saddle points using minimax arguments. So critical point theory is the main ingredient in the variational methods. In this chapter we present some aspects of critical point theory which are useful in the study of nonlinear boundary value problems. A fruitful technique in obtaining critical points of a C 1 -functional is based on deformation arguments along the gradient flow or a substitute of it when due to the geometry of the space or the lack of regularity of the functional it is impossible to use the gradient flow. For this reason in Section 5.1, we introduce the so-called pseudogradient vector field and the compactness-type conditions that this vector field must satisfy and we derive various deformation results, which describe the deformations of the sublevel sets of the functionals near a critical point where topologically interesting things may occur. In Section 5.2 we use the deformation results and the geometric notion of linking sets in order to obtain minimax expressions for the critical values of the functionals. We prove a general minimax principle which generates as special cases the classical mountain pass theorem, saddle point theorem and generalized mountain pass theorem. 
In Section 5.3 we prove strong forms of the mountain pass theorem, which, besides an existence statement for critical points, give in addition information about the fine structure of the functional near them. In Section 5.4 we prove results establishing the existence of multiple critical points. For this purpose we introduce the notion of local linking, we impose symmetry conditions on the functional and we use the Krasnoselskii’s genus of a set. The Krasnoselskii’s genus is an example of a topological index. Historically the first topological index was introduced by Lusternik-Schnirelman in order to extend to nonlinear eigenvalue problems the theory of eigenvalues of quadratic forms developed by Courant, Weyl and others. It turns out that
the Lusternik-Schnirelman category is the maximal topological index which is invariant under homeomorphisms and satisfies certain properties (see Proposition 5.5.5). In Section 5.6 we introduce the Lusternik-Schnirelman category and prove a basic multiplicity result for critical points of a $C^1$-functional. Also we develop the method of Lagrange multipliers for infinite dimensional constrained optimization problems, which provides the analytical framework for the study of nonlinear eigenvalue problems.

5.1 Deformation Results
Let $X$ be a Hausdorff topological space and let $K \subseteq X$ be a closed subset. By a homotopy on $X$ we mean a continuous function $h : [0,1] \times X \longrightarrow X$, such that $h(0, \cdot) = \mathrm{id}_X$, i.e.,
\[
h(0, x) = x \qquad \forall\, x \in X.
\]
A homotopy $h$ is sometimes called a deformation of $X$. Let $\varphi : X \longrightarrow \mathbb{R}$ be a continuous function and let $\mathcal{T}$ be a family of subsets of $X$ on which $\varphi$ is bounded. We set
\[
c = c(\varphi, \mathcal{T}) \stackrel{df}{=} \inf_{C \in \mathcal{T}}\ \sup_{x \in C} \varphi(x). \tag{5.1}
\]
Our aim is to determine conditions which will guarantee that $c \in \varphi(K)$. The idea is to use deformations of the sublevels
\[
\varphi^a \stackrel{df}{=} \varphi^{-1}\bigl((-\infty, a]\bigr) = \bigl\{ x \in X : \varphi(x) \le a \bigr\} \qquad \forall\, a \in \mathbb{R}.
\]
To do this we shall use the following hypothesis.

H($\varphi$) There is a homotopy $h : [0,1] \times X \longrightarrow X$, such that for all $a, b \in \mathbb{R}$, $a < b$ with $\varphi^{-1}\bigl([a,b]\bigr) \cap K = \emptyset$, there is $t > 0$ for which we have
\[
h(t, \varphi^b) \subseteq \varphi^a.
\]

REMARK 5.1.1 Hypothesis H($\varphi$) tells us that nothing can happen topologically in between the levels $a$ and $b$, if $[a,b]$ does not contain any elements of $\varphi(K)$.
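Using this hypothesis, we can prove the following generalized minimax principle.

THEOREM 5.1.2 (Generalized Minimax Principle) If $X$, $K$, $\varphi$, $\mathcal{T}$ and $h$ are as above, hypothesis H($\varphi$) holds,
\[
h(t, C) \in \mathcal{T} \qquad \forall\, C \in \mathcal{T},\ t > 0,
\]
(i.e., $\mathcal{T}$ is $h$-invariant), $\varphi(K) \subseteq \mathbb{R}$ is closed and $c(\varphi, \mathcal{T})$ is finite, then $c(\varphi, \mathcal{T}) \in \varphi(K)$.

PROOF We proceed by contradiction. Suppose that $c(\varphi, \mathcal{T}) \in \mathbb{R} \setminus \varphi(K)$. Then because $\varphi(K)$ is closed, we can find $a < c(\varphi, \mathcal{T}) < b$, such that
\[
\varphi^{-1}\bigl([a,b]\bigr) \cap K = \emptyset.
\]
From (5.1) we see that we can find $C \in \mathcal{T}$, such that
\[
\sup_{x \in C} \varphi(x) \le b,
\]
hence $C \subseteq \varphi^b$. By virtue of hypothesis H($\varphi$), we can find $t > 0$, such that $h(t, \varphi^b) \subseteq \varphi^a$. Hence $h(t, C) \subseteq \varphi^a$ and due to the invariance hypothesis $h(t, C) \in \mathcal{T}$. So we have
\[
c(\varphi, \mathcal{T}) \le \sup_{x \in h(t,C)} \varphi(x) \le \sup_{x \in \varphi^a} \varphi(x) \le a,
\]
a contradiction to the choice of $a \in \mathbb{R}$. This proves that $c(\varphi, \mathcal{T}) \in \varphi(K)$.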
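The minimax value (5.1) is easy to compute when everything is finite, which may help to fix ideas before turning to concrete families. The following Python sketch is purely illustrative: the finite set `X`, the functional `phi` and the helper `minimax_value` are hypothetical names introduced here, not objects from the text, and the deformation hypothesis H($\varphi$) plays no role in it. It evaluates $\inf_{C}\sup_{x\in C}\varphi(x)$ for three choices of the family: the single set $X$, all singletons, and all two-point subsets.

```python
from typing import Callable, Hashable, Iterable


def minimax_value(phi: Callable[[Hashable], float],
                  family: Iterable[Iterable[Hashable]]) -> float:
    """Return min over C in family of max over x in C of phi(x) (finite data only)."""
    return min(max(phi(x) for x in C) for C in family)


# Hypothetical finite "space" and functional, chosen only for illustration.
X = [-2, -1, 0, 1, 2]
phi = lambda x: x * x - 1

# Family consisting of the whole set: the minimax value is the maximum of phi.
sup_value = minimax_value(phi, [X])

# Family of all singletons: the minimax value is the minimum of phi.
inf_value = minimax_value(phi, [[x] for x in X])

# Family of all two-point subsets: an intermediate value.
pairs_value = minimax_value(phi, [[a, b] for a in X for b in X if a < b])

print(sup_value, inf_value, pairs_value)  # prints: 3 -1 0
```

The first two families recover the global maximum and minimum of the functional; richer families produce intermediate values, which is the mechanism behind minimax characterizations of critical values.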
REMARK 5.1.3 A careful reading of the above proof reveals that hypothesis H($\varphi$) can be weakened as follows.

H($\varphi$)$'$ There is a homotopy $h : [0,1] \times X \longrightarrow X$, such that for all $a, b \in \mathbb{R}$, $a < b$ with $\varphi^{-1}\bigl([a,b]\bigr) \cap K = \emptyset$ and for all $E \in \mathcal{T}$, there is $t = t(E, a, b) > 0$ for which we have
\[
h(t, E \cap \varphi^b) \subseteq \varphi^a.
\]

EXAMPLE 5.1.4
(a) Let $\mathcal{T} = \{X\}$. Then $c(\varphi, \mathcal{T}) = \sup_{x \in X} \varphi(x)$.
(b) Let $\mathcal{T} = \bigl\{ \{x\} : x \in X \bigr\}$. Then $c(\varphi, \mathcal{T}) = \inf_{x \in X} \varphi(x)$.
(c) Let $S^M \stackrel{df}{=} \bigl\{ x \in \mathbb{R}^M : \|x\|_{\mathbb{R}^M} = 1 \bigr\}$ and let $H$ be a homotopy class of maps $g : S^M \longrightarrow X$. We set
\[
\mathcal{T} \stackrel{df}{=} \bigl\{ C \subseteq X : C = g(S^M),\ g \in H \bigr\}.
\]
Then $\mathcal{T}$ is invariant under all deformations of $X$.

In most applications of interest, $X$ is a Banach space, $\varphi \in C^1(X)$ and $K$ is the set of critical points of $\varphi$, i.e.,
\[
K = K^\varphi = \bigl\{ x \in X : \varphi'(x) = 0 \bigr\}.
\]
This set is closed. Then we can use Theorem 5.1.2 to determine critical points of $\varphi$. Theorem 5.1.2 has two main ingredients.
(a) The homotopies (deformations) $h$, which satisfy property H($\varphi$) with respect to $K$.
(b) The $h$-invariant families $\mathcal{T}$, which provide a measure for the number of critical points of $\varphi$.

In Section 5.6, following the idea of degree theory, we shall construct $\mathbb{Z}$-valued functions $\chi$ on certain subsets of $2^X$, such that the properties of $\chi$ and assumptions like H($\varphi$) yield lower bounds for the number of critical points of $\varphi$ (i.e., the number of elements of $K$). Recall that the degree $\bigl|d(\varphi, U, 0)\bigr|$ provides a lower bound for the number of zeros of $\varphi$ on $U$ in the regular case. In this section we focus on how to construct homotopies exhibiting property H($\varphi$). The idea is to consider homotopies of the underlying space $X$ along the lines of steepest descent associated to $\varphi$, i.e., along the trajectories of
\[
\dot{x}(t) = -\nabla\varphi\bigl(x(t)\bigr).
\]
However, in many cases this differential system or its trajectories are not defined, either due to the nature of $X$ or due to the lack of regularity of $\varphi$. For example if $X$ is a Banach space which is not a Hilbert space, then
\[
\nabla\varphi(x) \in X^* \qquad \forall\, x \in X,
\]
and so the differential system cannot be defined. For this reason we introduce a substitute of the gradient vector field, the so-called pseudogradient vector
field. Then we examine the deformation flow generated by this field. To determine the basic properties of this flow, we need a compactness-type condition. So we start our discussion with a presentation and study of this compactness condition. Before proceeding with this discussion, let us mention that the whole theory can be developed in the more general setting of $C^1$-functionals on complete, regular $C^{1,1}$-Banach manifolds with Finsler structure. However, in many applications the setting used here is sufficient.
DEFINITION 5.1.5 Let $X$ be a Banach space with $X^*$ being its topological dual and let $\varphi \in C^1(X)$.
(a) We say that $\varphi$ satisfies the Palais-Smale condition at level $c \in \mathbb{R}$ (PS$_c$-condition for short), if any sequence $\{x_n\}_{n\ge 1} \subseteq X$, such that
\[ \varphi(x_n) \to c \quad\text{and}\quad \varphi'(x_n) \to 0 \ \text{ in } X^*, \]
has a strongly convergent subsequence. If this is true at every level $c \in \mathbb{R}$, then we simply say that $\varphi$ satisfies the Palais-Smale condition (PS-condition for short).
(b) We say that $\varphi$ satisfies the Cerami condition at level $c \in \mathbb{R}$ (C$_c$-condition for short), if any sequence $\{x_n\}_{n\ge 1} \subseteq X$, such that
\[ \varphi(x_n) \to c \quad\text{and}\quad (1 + \|x_n\|_X)\,\varphi'(x_n) \to 0 \ \text{ in } X^*, \]
has a strongly convergent subsequence. If this is true at every level $c \in \mathbb{R}$, then we simply say that $\varphi$ satisfies the Cerami condition (C-condition for short).
REMARK 5.1.6 Evidently the Cerami condition is weaker than the Palais-Smale condition. However, it can be shown that if $\varphi \in C^1(X)$ is bounded below, then the two conditions are in fact equivalent (see Gasiński & Papageorgiou (2005, p. 127)). These conditions are compactness-type conditions which are rather strong. For example the constant functions and the functions $\cos x$ and $\sin x$ do not satisfy them. In what follows we prove a general result explaining this.
In what follows $X$ is a Banach space. Additional hypotheses will be introduced as needed.
DEFINITION 5.1.7 A functional $\varphi\colon X \to \mathbb{R}$ is said to be weakly coercive, if $\varphi(x) \to +\infty$ as $\|x\|_X \to +\infty$.
REMARK 5.1.8 Clearly weak coercivity of $\varphi$ is equivalent to saying that for every $\lambda \in \mathbb{R}$ the set $\varphi^\lambda$ is bounded, where
\[ \varphi^\lambda = \{x \in X : \varphi(x) \le \lambda\}. \]
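The claim in Remark 5.1.6 that $\cos x$ fails these conditions can be checked by hand on $X = \mathbb{R}$ at the level $c = 1$. The following is a minimal numerical sketch, assuming the sequence $x_n = 2\pi n$; the helper name `candidate_sequence` is hypothetical and introduced only for illustration. Along this sequence $\varphi(x_n) = 1$ and $(1 + |x_n|)\,|\varphi'(x_n)|$ vanishes up to rounding, while consecutive terms stay $2\pi$ apart, so no subsequence can converge.

```python
import math

def phi(x):
    return math.cos(x)

def dphi(x):
    return -math.sin(x)

def candidate_sequence(n_terms):
    # Hypothetical helper: x_n = 2*pi*n for n = 1, ..., n_terms.
    return [2.0 * math.pi * n for n in range(1, n_terms + 1)]

xs = candidate_sequence(50)
# |phi(x_n) - 1| should vanish (level c = 1).
level_errors = [abs(phi(x) - 1.0) for x in xs]
# Cerami-weighted slope (1 + |x_n|) |phi'(x_n)|; zero in exact arithmetic.
weighted_slopes = [(1.0 + abs(x)) * abs(dphi(x)) for x in xs]
# Consecutive terms are 2*pi apart, so the sequence has no Cauchy subsequence.
gaps = [xs[i + 1] - xs[i] for i in range(len(xs) - 1)]
print(max(level_errors), max(weighted_slopes), min(gaps))
```

The same sequence therefore violates both the PS$_1$-condition and the C$_1$-condition.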
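PROPOSITION 5.1.9 If $\varphi \in C^1(X)$ and $c \in \mathbb{R}$ is such that $\varphi^\lambda$ is unbounded for $\lambda > c$ and bounded for $\lambda < c$, then we can find a sequence $\{x_n\}_{n\ge 1} \subseteq X$, such that
\[ \varphi(x_n) \to c, \quad \varphi'(x_n) \to 0 \ \text{ in } X^* \quad\text{and}\quad \|x_n\|_X \to +\infty. \]
PROOF For every $n \ge 1$, we can find $r_n \ge n$, such that
\[ \varphi^{c-\frac{1}{n}} \subseteq \overline{B}_{r_n}(0) = \{x \in X : \|x\|_X \le r_n\}. \]
Let us set
\[ E_n = X \setminus \overline{B}_{r_n}(0) \quad\text{and}\quad \psi_n = \varphi|_{E_n}. \]
Then
\[ c - \frac{1}{n} \le \inf_{x\in E_n}\varphi(x) = c_n. \]
By hypothesis the set $\varphi^{c+\frac{1}{n}}$ is unbounded. So we can find $y_n \in X$, such that
\[ \varphi(y_n) \le c + \frac{1}{n} \quad\text{and}\quad \|y_n\|_X \ge r_n + 1 + \frac{1}{\sqrt{n}}. \]
This means that $y_n \in E_n$ and so we have
\[ \varphi(y_n) \le c + \frac{1}{n} \le c_n + \frac{2}{n}. \]
Applying Theorem 4.6.1 (with $\varepsilon = \frac{2}{n}$ and $\lambda = \frac{1}{\sqrt{n}}$), we obtain $x_n \in E_n$, such that
\[ c - \frac{1}{n} \le c_n \le \varphi(x_n) \le \varphi(y_n) \le c + \frac{1}{n} \le c + \frac{2}{n}, \]
\[ \varphi(x_n) \le \varphi(x) + \frac{2}{\sqrt{n}}\,\|x - x_n\|_X \quad \forall\, x \in E_n, \tag{5.2} \]
\[ \|x_n - y_n\|_X \le \frac{1}{\sqrt{n}}. \]
From the last inequality and since
\[ \|y_n\|_X \ge r_n + 1 + \frac{1}{\sqrt{n}}, \]
we have that $\|x_n\|_X \ge r_n + 1$. Hence $x_n \in \operatorname{int} E_n$ and so from (5.2), we obtain
\[ \|\varphi'(x_n)\|_{X^*} \le \frac{2}{\sqrt{n}}. \]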
Thus finally for the sequence $\{x_n\}_{n\ge 1} \subseteq X$, we have that
\[ \varphi(x_n) \to c, \quad \varphi'(x_n) \to 0 \ \text{ in } X^* \quad\text{and}\quad \|x_n\|_X \to +\infty. \]
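COROLLARY 5.1.10 If $\varphi \in C^1(X)$ satisfies the PS$_c$-condition and $\varphi^\lambda$ is bounded for all $\lambda < c$, then for some $\lambda_0 > 0$ the set $\varphi^{c+\lambda_0}$ is bounded.
REMARK 5.1.11 Clearly in the above corollary, we can assume that instead of the PS$_c$-condition, $\varphi$ satisfies a weaker condition, namely if $\{x_n\}_{n\ge 1} \subseteq X$ is such that
\[ \varphi(x_n) \to c \quad\text{and}\quad \varphi'(x_n) \to 0 \ \text{ in } X^*, \]
then the sequence $\{x_n\}_{n\ge 1} \subseteq X$ has a bounded subsequence.
Another important consequence of Proposition 5.1.9 is the following result.
PROPOSITION 5.1.12 If $\varphi \in C^1(X)$ is bounded from below and it is not weakly coercive, then $\varphi$ does not satisfy the PS$_{c_0}$-condition, where
\[ c_0 = \sup\{\lambda \in \mathbb{R} : \varphi^\lambda \text{ is bounded}\}. \]
PROOF Let
\[ C = \{\lambda \in \mathbb{R} : \varphi^\lambda \text{ is bounded}\} \]
(recall that by definition the empty set is bounded) and
\[ m = \inf_{x\in X}\varphi(x). \]
Since $\varphi$ is bounded from below, $m$ is finite and $(-\infty, m) \subseteq C$, i.e., $C \ne \emptyset$. Because $\varphi$ is not weakly coercive, we have that $c_0 = \sup C < +\infty$ (see Remark 5.1.8). Then $\varphi^\lambda$ is bounded for all $\lambda < c_0$ and unbounded for all $\lambda > c_0$. So Proposition 5.1.9 produces a sequence $\{x_n\}_{n\ge 1}$ with $\varphi(x_n) \to c_0$, $\varphi'(x_n) \to 0$ in $X^*$ and $\|x_n\|_X \to +\infty$. Such a sequence has no convergent subsequence and so $\varphi$ does not satisfy the PS$_{c_0}$-condition.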
This proposition leads to the result relating the PS-condition with coercivity.
THEOREM 5.1.13 If $\varphi \in C^1(X)$ is bounded from below and satisfies the PS-condition, then $\varphi$ is weakly coercive.
REMARK 5.1.14 For any functional $\varphi\colon X \to \mathbb{R}$, the set
\[ C = \{\lambda \in \mathbb{R} : \varphi^\lambda \text{ is bounded}\} \]
is a left half-line which is either open or closed. It can happen that $C = \emptyset$ or $C = \mathbb{R}$, the latter case corresponding to $\varphi$ being weakly coercive. On the other hand, if
\[ D = \{c \in \mathbb{R} : \varphi \text{ satisfies the PS}_c\text{-condition}\}, \]
then $D$ is not necessarily a half-line, even if $\varphi$ is bounded from below.
EXAMPLE 5.1.15 The converse of Theorem 5.1.13 is not in general true. Namely if $\varphi \in C^1(X)$, it is bounded from below and it is weakly coercive, then it need not satisfy the PS-condition. To see this let $X = H$ be a Hilbert space and
\[ \varphi(x) = \begin{cases} \|x\|_H^2 \ln \|x\|_H & \text{if } x \in H \setminus \{0\}, \\ 0 & \text{if } x = 0. \end{cases} \]
Then $\varphi \in C^1(H)$ and it is bounded from below and weakly coercive. However, if $\{x_n\}_{n\ge 1} \subseteq H$ is such that $\|x_n\|_H = \frac{1}{\sqrt{e}}$ and has no strongly convergent subsequence, then
\[ \varphi(x_n) = -\frac{1}{2e}, \quad \varphi'(x_n) = 0 \quad \forall\, n \ge 1 \]
and so $\varphi$ does not satisfy the PS$_{-\frac{1}{2e}}$-condition.
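The numbers in Example 5.1.15 can be verified on the radial profile $g(r) = r^2 \ln r$, since $\varphi(x) = g(\|x\|_H)$ away from the origin. The sketch below is an illustration under that reduction; the function names are hypothetical. It checks that $g'$ vanishes at $r = 1/\sqrt{e}$ with value $-1/(2e)$, and it records the mutual distance $\sqrt{2}\,r$ between the points $r e_m$ and $r e_n$ for orthonormal $e_m, e_n$, which is one way (assuming $H$ is infinite dimensional) to obtain a sequence on that sphere without a convergent subsequence.

```python
import math

def radial_profile(r):
    # g(r) = r^2 ln r for r > 0, so that phi(x) = g(||x||) away from the origin.
    return r * r * math.log(r)

def radial_derivative(r):
    # g'(r) = 2 r ln r + r = r (2 ln r + 1).
    return r * (2.0 * math.log(r) + 1.0)

r_star = 1.0 / math.sqrt(math.e)
critical_value = radial_profile(r_star)
# Distance between r_star*e_m and r_star*e_n for orthonormal e_m, e_n (m != n).
mutual_distance = math.sqrt(2.0) * r_star
print(r_star, critical_value, radial_derivative(r_star), mutual_distance)
```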
Now we can start discussing how to generate the deformation flow. As we already mentioned, in order to guarantee that the differential system which we use to generate the flow is defined in $X$ and has (at least locally) solutions, instead of the gradient vector field, we use the following substitute.
DEFINITION 5.1.16 Let $\varphi \in C^1(X)$. A vector $v \in X$ is a pseudogradient vector for $\varphi$ at $x$, if
\[ \|v\|_X \le 2\,\|\varphi'(x)\|_{X^*} \quad\text{and}\quad \|\varphi'(x)\|_{X^*}^2 \le \langle \varphi'(x), v\rangle_X. \]
We say that $V\colon \{x \in X : \varphi'(x) \ne 0\} \to X$ is a pseudogradient vector field for $\varphi$, if $V$ is locally Lipschitz and for every $x \in \{x \in X : \varphi'(x) \ne 0\}$ (i.e., in the set of regular points of $\varphi$), $V(x)$ is a pseudogradient vector for $\varphi$ at $x$.
REMARK 5.1.17 If $v \in X$ is a pseudogradient vector of $\varphi$ at $x$, we have that
\[ \|\varphi'(x)\|_{X^*}^2 \le \langle \varphi'(x), v\rangle_X \le \|\varphi'(x)\|_{X^*}\,\|v\|_X, \]
hence $\|\varphi'(x)\|_{X^*} \le \|v\|_X$.
LEMMA 5.1.18 If $Y$ is a metric space, $V$ is a normed space, for every $y \in Y$, $F(y)$ is a nonempty, convex subset of $V$ and for every $y \in Y$ we can find a neighbourhood $U$ of $y$, such that
\[ \bigcap_{y'\in U} F(y') \ne \emptyset, \]
then there exists a locally Lipschitz map $f\colon Y \to V$, such that
\[ f(y) \in F(y) \quad \text{for all } y \in Y. \]
PROOF
For every $y \in Y$, let $U(y)$ be a neighbourhood of $y$, such that
\[ \bigcap_{y'\in U(y)} F(y') \ne \emptyset. \]
The collection $\{U(y)\}_{y\in Y}$ is an open cover of $Y$ and $Y$, being a metric space, is paracompact (see Definition A.1.7 and Remark A.1.8). So we can find a locally finite refinement $\{V_i\}_{i\in I}$ of $\{U(y)\}_{y\in Y}$ (see Definition A.1.5). First suppose that $V_i \ne Y$ for all $i \in I$ and set
\[ \vartheta_i(y) = d_Y(y, Y \setminus V_i) \quad\text{and}\quad \xi(y) = \sum_{i\in I}\vartheta_i(y) \quad \forall\, y \in Y. \]
Clearly $\vartheta_i$ is Lipschitz continuous and because $\{V_i\}_{i\in I}$ is a locally finite open cover, $\xi$ is locally Lipschitz and $\xi(y) \ne 0$ for all $y \in Y$. Setting
\[ \psi_i(y) = \frac{\vartheta_i(y)}{\xi(y)} \quad \forall\, i \in I, \]
we see that $\{\psi_i\}_{i\in I}$ is a locally Lipschitz partition of unity subordinate to $\{V_i\}_{i\in I}$. Now suppose that $V_{i_0} = Y$ for some $i_0 \in I$. Let us set
\[ \psi_{i_0} \equiv 1 \quad\text{and}\quad \psi_i \equiv 0 \quad \forall\, i \ne i_0. \]
Again $\{\psi_i\}_{i\in I}$ is a locally Lipschitz partition of unity subordinate to $\{V_i\}_{i\in I}$.
Since the cover $\{V_i\}_{i\in I}$ refines the cover $\{U(y)\}_{y\in Y}$ (see Definition A.1.5), it follows that
\[ \bigcap_{y'\in V_i} F(y') \ne \emptyset \quad \forall\, i \in I. \]
Let
\[ v_i \in \bigcap_{y'\in V_i} F(y') \quad \forall\, i \in I \]
and define $f\colon Y \to V$ by
\[ f(y) = \sum_{i\in I}\psi_i(y)\,v_i. \]
The function $f$ is locally Lipschitz. Also for every $y \in Y$, there are only finitely many elements $V_{i_1}, \ldots, V_{i_k}$ of the cover $\{V_i\}_{i\in I}$, such that $y \in V_{i_m}$ for all $m \in \{1, \ldots, k\}$. Then
\[ f(y) = \sum_{m=1}^{k}\psi_{i_m}(y)\,v_{i_m} \quad\text{and}\quad \sum_{m=1}^{k}\psi_{i_m}(y) = 1. \]
Since $v_{i_m} \in F(y)$ for all $m \in \{1, \ldots, k\}$ and $F(y)$ is convex, it follows that
\[ f(y) \in F(y) \quad \forall\, y \in Y. \]
Using this lemma, we can establish the existence of a pseudogradient vector field.
THEOREM 5.1.19 If $\varphi \in C^1(X)$, then there exists a pseudogradient vector field for $\varphi$.
PROOF Let
\[ Y = \{x \in X : \varphi'(x) \ne 0\}. \]
For every $x \in Y$, let $G(x)$ be the set of all pseudogradient vectors for $\varphi$ at $x$. Clearly $G(x)$ is a convex subset of $X$.
Also for a given $x \in Y$, we can find $u \in X$, such that
\[ \|u\|_X \le 1 \quad\text{and}\quad \frac{4}{5}\,\|\varphi'(x)\|_{X^*} \le \langle \varphi'(x), u\rangle_X. \]
Let us set
\[ v = \frac{5}{3}\,\|\varphi'(x)\|_{X^*}\,u. \]
We have
\[ \|v\|_X \le \frac{5}{3}\,\|\varphi'(x)\|_{X^*} \quad\text{and}\quad \langle \varphi'(x), v\rangle_X \ge \frac{4}{3}\,\|\varphi'(x)\|_{X^*}^2. \]
Because $\varphi \in C^1(X)$, we can find a neighbourhood $U$ of $x$, such that
\[ \|v\|_X < 2\,\|\varphi'(y)\|_{X^*} \quad\text{and}\quad \langle \varphi'(y), v\rangle_X > \|\varphi'(y)\|_{X^*}^2 \quad \forall\, y \in U, \]
hence
\[ v \in \bigcap_{y\in U} G(y). \]
Thus we can apply Lemma 5.1.18 and obtain a locally Lipschitz map $V\colon Y \to X$, such that
\[ V(x) \in G(x) \quad \forall\, x \in Y. \]
Clearly $V$ is a pseudogradient vector field for $\varphi$.
REMARK 5.1.20 If $X = H$ is a Hilbert space and $\varphi \in C^2(X)$, then we can take
\[ V(x) = \varphi'(x). \]
Also if $0 < \alpha < \beta$ and $\varphi \in C^1(X)$, then by setting
\[ V_1(x) = \frac{\alpha+\beta}{2}\,\frac{V(x)}{\|\varphi'(x)\|_{X^*}^2} \quad \forall\, x \in Y, \]
with $V$ being the pseudogradient vector field obtained in Theorem 5.1.19 and
\[ Y = \{x \in X : \varphi'(x) \ne 0\}, \]
we have that $V_1$ is locally Lipschitz from $Y$ into $X$ and
\[ \alpha \le \langle \varphi'(x), V_1(x)\rangle_X \le \|\varphi'(x)\|_{X^*}\,\|V_1(x)\|_X \le \beta \quad \forall\, x \in Y. \]
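In the Hilbert space case of Remark 5.1.20 the two inequalities of Definition 5.1.16 can be checked directly. The following sketch is an illustration only: it assumes $H = \mathbb{R}^2$, the function $\varphi(x,y) = x^2 + xy - y^2$ (chosen here, not taken from the text) and $V = \nabla\varphi$, and it evaluates the normalized field $V_1$ with $\alpha = 1$, $\beta = 2$ at a few regular points.

```python
import math

def grad_phi(x, y):
    # Gradient of the illustrative function phi(x, y) = x^2 + x*y - y^2.
    return (2.0 * x + y, x - 2.0 * y)

def norm(v):
    return math.hypot(v[0], v[1])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def is_pseudogradient(v, g, tol=1e-12):
    # Definition 5.1.16: ||v|| <= 2 ||phi'(x)|| and ||phi'(x)||^2 <= <phi'(x), v>.
    return norm(v) <= 2.0 * norm(g) + tol and norm(g) ** 2 <= dot(g, v) + tol

def normalized_field(x, y, alpha=1.0, beta=2.0):
    # V_1 = ((alpha + beta) / 2) * V / ||phi'||^2 with V = grad phi.
    g = grad_phi(x, y)
    scale = 0.5 * (alpha + beta) / norm(g) ** 2
    return (scale * g[0], scale * g[1])

points = [(1.0, 0.0), (-0.5, 2.0), (3.0, -1.0)]  # regular points of phi
for p in points:
    g = grad_phi(*p)
    v1 = normalized_field(*p)
    print(p, is_pseudogradient(g, g), dot(g, v1), norm(g) * norm(v1))
```

For the gradient itself, both the pairing $\langle\varphi'(x), V_1(x)\rangle$ and the product $\|\varphi'(x)\|\,\|V_1(x)\|$ equal $(\alpha+\beta)/2$, which lies between $\alpha$ and $\beta$.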
The deformation flow will be generated by solving a Cauchy problem in a Banach space. So let us recall the basic existence result concerning such problems.
THEOREM 5.1.21 If $U \subseteq X$ is an open subset, $f\colon U \to X$ is a locally Lipschitz vector field and $x \in U$, then the Cauchy problem
\[ \begin{cases} \dot{u}(t) = f(u(t)), \\ u(0) = x, \end{cases} \tag{5.3} \]
has a unique $C^1$-local solution $u(x)(\cdot) = u(x,\cdot)$ defined on a maximal interval $(l_-(x), l_+(x))$ containing $0$. The set
\[ W = \{(x,t) : x \in U,\ t \in (l_-(x), l_+(x))\} \]
is open and the map $(x,t) \mapsto u(x,t)$ is locally Lipschitz from $W$ into $X$. Moreover, if for some $x \in U$, $u(x, (l_-(x), l_+(x)))$ lies in a complete subset of $U$, then
\[ l_+(x) < +\infty \ \Longrightarrow\ \int_0^{l_+(x)} \|f(u(t))\|_X\,dt = +\infty. \tag{5.4} \]
By imposing a sublinear growth condition on the vector field $f$, we can have a global solution.
THEOREM 5.1.22 If $U \subseteq X$ is an open subset, $f\colon U \to X$ is a locally Lipschitz function,
\[ \|f(x)\|_X \le a + c\,\|x\|_X \quad \forall\, x \in U, \]
for some $a, c > 0$ and the unique $C^1$-solution $(x,t) \mapsto u(x,t)$ of the Cauchy problem (5.3) lies in complete subsets of $U$, then
\[ l_-(x) = -\infty \quad\text{and}\quad l_+(x) = +\infty \quad \forall\, x \in U, \]
for every $t \in \mathbb{R}$ the function $x \mapsto u(x,t)$ is a homeomorphism and the function $X \times \mathbb{R} \ni (x,t) \mapsto u(x,t) \in X$ is locally Lipschitz and bounded (i.e., maps bounded sets into bounded sets).
PROOF Suppose that for some $x \in U$, we have $l_+(x) < +\infty$. Then integrating, we have
\[ \|u(x,t)\|_X \le \|x\|_X + \int_0^t \|\dot{u}(x,s)\|_X\,ds \le \|x\|_X + a\,l_+(x) + c\int_0^t \|u(x,s)\|_X\,ds \quad \forall\, t \in [0, l_+(x)). \tag{5.5} \]
Invoking Gronwall's inequality (see Theorem A.4.7), we can find $\beta_1 = \beta_1(x) > 0$, such that
\[ \|u(x,t)\|_X \le \beta_1 \quad \forall\, t \in [0, l_+(x)). \]
Then by virtue of the sublinear growth condition on $f$ we see that there exists $\beta_2 = \beta_2(x) > 0$, such that
\[ \|f(u(x,t))\|_X \le \beta_2 \quad \forall\, t \in [0, l_+(x)), \]
hence
\[ \int_0^{l_+(x)} \|f(u(x,t))\|_X\,dt \le \beta_2\,l_+(x) < +\infty, \]
a contradiction to (5.4). This proves that $l_+(x) = +\infty$. By reversing the time, in a similar fashion, we show that $l_-(x) = -\infty$. The uniqueness of the solution implies that
\[ u^{-1}(x,t) = u(x,-t) \quad \forall\, t \ge 0,\ x \in U. \]
So for every $t \ge 0$, the function $u(\cdot,t)$ is a homeomorphism. Finally from (5.5) and Gronwall's inequality (see Theorem A.4.7), we have
\[ \|u(x,t)\|_X \le \big(\|x\|_X + a\,t_0\big)\,e^{c t_0} \quad \forall\, t \in [0, t_0], \]
hence we conclude that the map $(x,t) \mapsto u(x,t)$ is bounded and of course locally Lipschitz (see Theorem 5.1.21).
REMARK 5.1.23 The above global existence theorem implies that for every compact set $T \subseteq \mathbb{R}$ and every closed set $C \subseteq U$, the set $u(C,T)$ is closed in $U$. To see this suppose that
\[ u(x_n,t_n) \to v \in U, \]
for some sequence $\{(x_n,t_n)\}_{n\ge 1} \subseteq C \times T$. Since $T$ is compact, we may assume that
\[ t_n \to t \in T. \]
By virtue of Theorem 5.1.22, we have
\[ x_n = u^{-1}\big(u(x_n,t_n), t_n\big) \to u^{-1}(v,t) \in C \]
and so
\[ v = u\big(u(v,-t), t\big) \in u(C \times T). \]
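The Gronwall estimate at the end of the proof of Theorem 5.1.22 can be observed numerically. The sketch below is an illustration under assumptions not in the text: $X = \mathbb{R}$, the field $f(u) = a\cos u + c\,u$ with $a = 1$, $c = 0.5$ (so that $|f(u)| \le a + c|u|$), and an explicit Euler discretization in place of the exact flow. It compares $|u(x,t)|$ with the bound $(|x| + a t)\,e^{ct}$ along the computed path.

```python
import math

A, C = 1.0, 0.5  # growth constants in |f(u)| <= A + C|u| (illustrative values)

def f(u):
    # A scalar field with sublinear growth: |f(u)| <= A + C|u|.
    return A * math.cos(u) + C * u

def euler_flow(x, t0, steps):
    # Explicit Euler approximation of u' = f(u), u(0) = x on [0, t0].
    dt = t0 / steps
    u, path = x, [(0.0, x)]
    for k in range(1, steps + 1):
        u = u + dt * f(u)
        path.append((k * dt, u))
    return path

def gronwall_bound(x, t):
    # Bound (|x| + A t) e^{C t}, evaluated at the running time t.
    return (abs(x) + A * t) * math.exp(C * t)

path = euler_flow(x=2.0, t0=3.0, steps=3000)
worst_ratio = max(abs(u) / gronwall_bound(2.0, t) for t, u in path)
print(worst_ratio)
```

A ratio that never exceeds 1 is consistent with the boundedness claim: on a bounded time interval the flow maps bounded sets into bounded sets.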
Now we can start examining the deformation flow. First a definition to fix our terminology.
DEFINITION 5.1.24 Recall that a homotopy on $X$ is a continuous map $h\colon [0,1] \times X \to X$, such that
\[ h(0,x) = x \quad \forall\, x \in X. \]
We write $h_t$ for the map $X \ni x \mapsto h(t,x) \in X$. We say that $h$ is a homotopy of homeomorphisms, if for every $t \in [0,1]$, $h_t$ is a homeomorphism on $X$. If $\varphi \in C^1(X)$, then we say that the homotopy $h$ is $\varphi$-decreasing, if
\[ \varphi(h(t,x)) \le \varphi(h(s,x)) \quad \forall\, t, s \in [0,1],\ s \le t,\ x \in X. \]
First we establish a quantitative deformation result.
PROPOSITION 5.1.25 If $\varphi \in C^1(X)$, $a, b \in \mathbb{R}$, $a < b$, $\delta > 0$, $S \subseteq X$ is a closed subset and we have
\[ \|\varphi'(x)\|_{X^*} \ge \frac{2(b-a)}{\delta} \quad \forall\, x \in S \cap \varphi^{-1}([a,b]), \]
then for each $\varepsilon > 0$ and for each closed subset $S_0 \subseteq X$ with $S \cap S_0 = \emptyset$, there is a $\varphi$-decreasing and locally Lipschitz homotopy of homeomorphisms $h_t$, such that:
(a) If $x \in \varphi^b$ and $h(t,x) \in S$ for all $t \in [0,1]$, then $h(1,x) \in \varphi^a$. Moreover, if $x \in \varphi^b$ and $h(t,x) \in S \cap \varphi_a$ for all $t \in [0,s]$, where $\varphi_a = \{\varphi \ge a\}$, then
\[ \varphi(h(s,x)) \le \varphi(x) - (b-a)s. \]
(b) If
\[ C = \varphi^{a-\varepsilon} \cup \varphi_{b+\varepsilon} \cup \Big\{x \in X : \|\varphi'(x)\|_{X^*} \le \frac{b-a}{\delta}\Big\} \cup S_0, \]
then $h(t,x) = x$ for all $(t,x) \in [0,1] \times C$.
(c) $\|h(t,x) - x\|_X \le \delta t$ for all $(t,x) \in [0,1] \times X$.
PROOF Recall that
\[ K = K^\varphi = \{x \in X : \varphi'(x) = 0\} \]
(the critical set of $\varphi$) and let
\[ E = S \cap \varphi^{-1}([a,b]). \]
Also let $V_1\colon X \setminus K \to X$ be the normalized pseudogradient vector field given in Remark 5.1.20 with $\alpha = 1$, $\beta = 2$. The sets $C$ and $E$ are disjoint and closed. Let
\[ \widehat{C} = \{x \in X : d_X(x,C) \le d_X(x,E)\}. \]
Then $\widehat{C}$ is closed and
\[ C \subseteq \operatorname{int}\widehat{C} \subseteq \widehat{C} \subseteq X \setminus E. \]
Invoking Urysohn's lemma (see Theorem A.1.13), we can find a locally Lipschitz map $\vartheta\colon X \to [0,1]$, such that
\[ \vartheta|_{\widehat{C}} \equiv 0 \quad\text{and}\quad \vartheta|_E \equiv 1. \]
Then let $f\colon X \to X$ be the locally Lipschitz vector field defined by
\[ f(x) = \begin{cases} \vartheta(x)V_1(x) & \text{if } x \in X \setminus \widehat{C}, \\ 0 & \text{if } x \in \widehat{C}. \end{cases} \]
We consider the following Cauchy problem:
\[ \begin{cases} \dot{u}(t) = -f(u(t)), \\ u(0) = x \in X. \end{cases} \tag{5.6} \]
Note that from the definition of $E$ and $\vartheta$, we have
\[ \frac{2(b-a)}{\delta}\,\|V_1(x)\|_X \le \|\varphi'(x)\|_{X^*}\,\|V_1(x)\|_X \le 2 \]
(see Remark 5.1.20), hence
\[ \|V_1(x)\|_X \le \frac{\delta}{b-a} \quad \forall\, x \in X. \]
So we can apply Theorem 5.1.22 and obtain a locally Lipschitz, bounded map $u\colon X \times \mathbb{R}_+ \to X$, such that the map $u(\cdot,t)$ is a homeomorphism on $X$ for all $t \ge 0$.
We have
\[ \frac{d}{dt}\varphi(u(x,t)) = \langle \varphi'(u(x,t)), \dot{u}(x,t)\rangle_X \le -\vartheta(u(x,t)) \quad \forall\, x \in X \tag{5.7} \]
(see Remark 5.1.20 and recall the definition of $\vartheta$). From (5.7) it follows that the function $t \mapsto \varphi(u(x,t))$ is decreasing on $\mathbb{R}_+$.
Note that because of the uniqueness of the solution of the Cauchy problem (5.6) and the definition of the vector field $f$, we have that $x \in \widehat{C}$ if and only if $u(x,t) = x$ for all $t \ge 0$. So from (5.7), we infer that the function $t \mapsto \varphi(u(x,t))$ is strictly decreasing for all $x \in X \setminus \widehat{C}$.
If $u(x,t) \in E$ for all $t \in [0,s]$, then integrating (5.7) on $[0,s]$, we obtain
\[ \varphi(u(x,s)) \le \varphi(x) - s. \tag{5.8} \]
Finally, we have
\[ \|u(x,t) - x\|_X \le \int_0^t \|\dot{u}(x,s)\|_X\,ds = \int_0^t \|f(u(x,s))\|_X\,ds \le \frac{\delta}{b-a}\,t. \tag{5.9} \]
So if we set
\[ h(t,x) = u(x, (b-a)t), \]
then from (5.8) and (5.9) and the observation that $\widehat{C}$ is invariant with respect to the flow of (5.6), we see that (a), (b) and (c) of the proposition are satisfied.
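The rate estimate (5.8) says that along a normalized descent flow the value of $\varphi$ drops by at least the elapsed time. The sketch below illustrates this in the simplest possible setting, which is an assumption made here and not the construction of the proof: $X = \mathbb{R}$, $\varphi(u) = u^2$ on $u > 0$, cutoff $\vartheta \equiv 1$, and the field $V(u) = \varphi'(u)/|\varphi'(u)|^2$, for which $\langle\varphi'(u), V(u)\rangle = 1$. The flow is approximated by explicit Euler steps, so the drop matches the elapsed time only up to discretization error.

```python
def phi(u):
    return u * u

def normalized_field(u):
    # V(u) = phi'(u) / |phi'(u)|^2 = 1 / (2u), so that phi'(u) * V(u) = 1.
    return 1.0 / (2.0 * u)

def descend(x, s, steps):
    # Explicit Euler approximation of u' = -V(u), u(0) = x on [0, s].
    dt = s / steps
    u, levels = x, [phi(x)]
    for _ in range(steps):
        u = u - dt * normalized_field(u)
        levels.append(phi(u))
    return u, levels

x0, s = 2.0, 1.0
u_end, levels = descend(x0, s, steps=1000)
# Drop in phi over time s, and the displacement |u(s) - x0|.
print(phi(x0) - phi(u_end), abs(u_end - x0))
```

Rescaling time by $(b-a)$, as in the definition $h(t,x) = u(x,(b-a)t)$, turns a drop of $s$ into a drop of $(b-a)s$ over the unit interval.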
Next we present some interesting consequences of the quantitative deformation result established in Proposition 5.1.25.
COROLLARY 5.1.26 If $\varphi \in C^1(X)$, $c \in \mathbb{R}$ and $\varepsilon > 0$, then there exists a $\varphi$-decreasing and locally Lipschitz homotopy of homeomorphisms $h_t$, such that:
(a) If $x \in \varphi^{c+\varepsilon}$ and
\[ \|\varphi'(h(t,x))\|_{X^*} \ge 4\sqrt{\varepsilon} \quad \forall\, t \in [0,1], \]
then $h(1,x) \in \varphi^{c-\varepsilon}$. Moreover, if
\[ c - \varepsilon \le \varphi(h(t,x)) \le c + \varepsilon \quad \forall\, t \in [0,s] \quad\text{and}\quad \|\varphi'(h(t,x))\|_{X^*} \ge 4\sqrt{\varepsilon} \quad \forall\, t \in [0,s], \]
then
\[ \varphi(h(s,x)) \le \varphi(x) - 2\varepsilon s. \]
(b) If
\[ \|\varphi'(x)\|_{X^*} \le 2\sqrt{\varepsilon} \quad\text{or}\quad x \notin \varphi^{-1}([c-2\varepsilon, c+2\varepsilon]), \]
then $h(t,x) = x$ for all $t \in [0,1]$.
(c) $\|h(t,x) - x\|_X \le \sqrt{\varepsilon}\,t$ for all $(t,x) \in [0,1] \times X$.
PROOF Apply Proposition 5.1.25, with
\[ a = c - \varepsilon, \quad b = c + \varepsilon, \quad \delta = \sqrt{\varepsilon} \quad\text{and}\quad S = \{x \in X : \|\varphi'(x)\|_{X^*} \ge 4\sqrt{\varepsilon}\} \]
to obtain the corollary.
We can improve the speed of decrease of the function
\[ t \mapsto \varphi(h(t,x)) \]
(see Corollary 5.1.26(a)) at the expense of the estimate in Corollary 5.1.26(c).
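COROLLARY 5.1.27 If $\varphi \in C^1(X)$, $c \in \mathbb{R}$ and $\varepsilon \in (0, \frac{1}{2})$, then there exists a $\varphi$-decreasing and locally Lipschitz homotopy of homeomorphisms $h_t$, such that:
(a) If $x \in \varphi^{c+\varepsilon}$ and
\[ \|\varphi'(h(t,x))\|_{X^*} \ge 4\sqrt{\varepsilon} \quad \forall\, t \in [0,1], \]
then $h(1,x) \in \varphi^{c-\varepsilon}$. Moreover, if
\[ c - \varepsilon \le \varphi(h(t,x)) \le c + \varepsilon \quad \forall\, t \in [0,s] \quad\text{and}\quad \|\varphi'(h(t,x))\|_{X^*} \ge 4\sqrt{\varepsilon} \quad \forall\, t \in [0,s], \]
then
\[ \varphi(h(s,x)) \le \varphi(x) - s. \]
(b) If
\[ \|\varphi'(x)\|_{X^*} \le 2\sqrt{\varepsilon} \quad\text{or}\quad x \notin \varphi^{-1}([c-2\varepsilon, c+2\varepsilon]), \]
then $h(t,x) = x$ for all $t \in [0,1]$.
(c) $\|h(t,x) - x\|_X \le \min\Big\{\dfrac{t}{2\sqrt{\varepsilon}},\ 4\sqrt{\varepsilon}\Big\}$ for all $(t,x) \in [0,1] \times X$.
PROOF Let $u(x,t)$ be the flow obtained in the proof of Proposition 5.1.25, when $a = c - \varepsilon$, $b = c + \varepsilon$ and $\delta = \sqrt{\varepsilon}$. From (5.9), we have
\[ \|u(x,t) - x\|_X \le \frac{t}{2\sqrt{\varepsilon}}. \]
Also from the proof of Proposition 5.1.25 (see (5.8)), we have that
\[ \|u(x,t) - x\|_X \le \frac{1}{\sqrt{\varepsilon}}\int_0^t \vartheta(u(x,s))\,ds \le \frac{1}{\sqrt{\varepsilon}}\big(\varphi(x) - \varphi(u(x,t))\big) \le 4\sqrt{\varepsilon}. \]
Let us set
\[ h(t,x) = u(x,t) \quad \forall\, (t,x) \in [0,1] \times X. \]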
COROLLARY 5.1.28 If $\varphi \in C^1(X)$, $a, b \in \mathbb{R}$, $a < b$, $\delta > 0$, $C, E \subseteq X$ are two closed subsets, such that $C_\delta \cap E = \emptyset$, where $C_\delta = \{x \in X : d_X(x,C) \le \delta\}$ and
\[ \|\varphi'(x)\|_{X^*} \ge \frac{4(b-a)}{\delta} \quad \forall\, x \in C_\delta \cap \varphi^{-1}([a,b]), \]
then for each $\varepsilon > 0$, there is a $\varphi$-decreasing and locally Lipschitz homotopy of homeomorphisms $h_t$, such that
(a) $h_1(C \cap \varphi^b) \subseteq \varphi^a$.
(b) If $x \in E$ or $x \notin \varphi^{-1}([a-\varepsilon, b+\varepsilon])$, then $h(t,x) = x$ for all $t \in [0,1]$.
(c) $\|h(t,x) - x\|_X \le \delta t$ for all $(t,x) \in [0,1] \times X$.
PROOF Apply Proposition 5.1.25 with $S = C_\delta$ and $S_0 = E$. Then it is clear that we obtain (b) and (c). Moreover, if $x \in C \cap \varphi^b$, then from Proposition 5.1.25(c), we have that
\[ h(t,x) \in S \quad \forall\, t \in [0,1]. \]
Therefore $h(1, C \cap \varphi^b) \subseteq \varphi^a$.
Now we bring the PS-condition (see Definition 5.1.5(a)) into the picture. This condition, combined with the previous results, will give us a complete description of the deformation flow. Let $\varphi \in C^1(X)$ and $c \in \mathbb{R}$. As before
\[ K = K^\varphi = \{x \in X : \varphi'(x) = 0\} \]
(the set of critical points of $\varphi$) and
\[ K_c^\varphi = \{x \in K^\varphi : \varphi(x) = c\} \]
(the set of critical points of $\varphi$ with energy level $c$). If $\varphi$ satisfies the PS-condition, then clearly $K_c^\varphi$ is compact.
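THEOREM 5.1.29 If $\varphi \in C^1(X)$, $\varphi(K^\varphi) \cap [a,b] = \emptyset$ and $\varphi$ satisfies the PS$_c$-condition for every $c \in [a,b]$, then there exist $\varepsilon > 0$ and a $\varphi$-decreasing, locally Lipschitz homotopy of homeomorphisms $h_t$, such that $h(1, \varphi^b) \subseteq \varphi^a$ and
\[ h(t,x) = x \quad \forall\, (t,x) \in [0,1] \times \big(X \setminus \varphi^{-1}([a-\varepsilon, b+\varepsilon])\big). \]
PROOF Since by hypothesis $\varphi(K^\varphi) \cap [a,b] = \emptyset$ and $\varphi$ satisfies the PS$_c$-condition for all $c \in [a,b]$, we can find $\varepsilon > 0$, such that
\[ \|\varphi'(x)\|_{X^*} \ge 2\varepsilon(b-a) \quad \forall\, x \in \varphi^{-1}([a,b]). \]
Then apply Proposition 5.1.25 with $S = X$ and $\delta = \frac{1}{\varepsilon}$, to finish the proof.
THEOREM 5.1.30 If $\varphi \in C^1(X)$, $c \in \mathbb{R}$, $\varphi$ satisfies the PS$_c$-condition and $U$ is a neighbourhood of $K_c^\varphi$ (we take $U = \emptyset$ if $K_c^\varphi = \emptyset$), then there exist $\varepsilon > 0$ and a $\varphi$-decreasing, locally Lipschitz homotopy of homeomorphisms $h_t$, such that
(a) $h(1, \varphi^{c+\varepsilon} \setminus U) \subseteq \varphi^{c-\varepsilon}$;
(b) $h(t,x) = x$ for all $(t,x) \in [0,1] \times \big(X \setminus \varphi^{-1}([c-2\varepsilon, c+2\varepsilon])\big)$;
(c) $\|h(t,x) - x\|_X \le \sqrt{\varepsilon}\,t$ for all $(t,x) \in [0,1] \times X$.
PROOF Let $C = X \setminus U$. Since $\varphi$ satisfies the PS$_c$-condition, we can find $\varepsilon > 0$, such that $C_{\sqrt{\varepsilon}} \cap K_c^\varphi = \emptyset$ and
\[ \|\varphi'(x)\|_{X^*} \ge 8\sqrt{\varepsilon} \quad \forall\, x \in C_{\sqrt{\varepsilon}} \cap \varphi^{-1}([c-\varepsilon, c+\varepsilon]). \]
Apply Corollary 5.1.28, with
\[ a = c - \varepsilon, \quad b = c + \varepsilon, \quad \delta = \sqrt{\varepsilon}, \quad C = X \setminus U \quad\text{and}\quad E = \emptyset. \]
REMARK 5.1.31 Part (a) of the above theorem says that starting a little bit above a critical level $c$, we will either bypass the critical neighbourhood $U$ and reach a harmless level $a < c$ or we will land on $U$, the only place near $K_c^\varphi$ where topologically interesting things may occur.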
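THEOREM 5.1.32 If $\widehat{\varphi} \in C^1(X)$, $\widehat{c} \in \mathbb{R}$, $\widehat{\varphi}$ satisfies the PS$_{\widehat{c}}$-condition, $A, B \subseteq X$ are two disjoint closed subsets, $A \cap K_{\widehat{c}}^{\widehat{\varphi}} = \emptyset$ and
\[ \sup_{x\in A}\widehat{\varphi}(x) \le \widehat{c} \le \inf_{x\in B}\widehat{\varphi}(x), \]
then there exist $\varepsilon > 0$ and a $\widehat{\varphi}$-decreasing, locally Lipschitz homotopy of homeomorphisms $h_t$, such that $h(1,A) \subseteq \widehat{\varphi}^{\,\widehat{c}-\varepsilon}$ and
\[ h(t,x) = x \quad \forall\, (t,x) \in [0,1] \times \Big(B \cup \big(X \setminus \widehat{\varphi}^{-1}([\widehat{c}-2\varepsilon, \widehat{c}+2\varepsilon])\big)\Big). \]
PROOF By virtue of Theorem 5.1.30 (applied with the neighbourhood $U = X \setminus A$ of $K_{\widehat{c}}^{\widehat{\varphi}}$), we can find $\varepsilon > 0$ and a $\widehat{\varphi}$-decreasing, locally Lipschitz homotopy of homeomorphisms $\widehat{h}_t$, such that $\widehat{h}(1,A) \subseteq \widehat{\varphi}^{\,\widehat{c}-\varepsilon}$ and
\[ \widehat{h}(t,x) = x \quad \forall\, (t,x) \in [0,1] \times \big(X \setminus \widehat{\varphi}^{-1}([\widehat{c}-2\varepsilon, \widehat{c}+2\varepsilon])\big). \]
We know that for every $x \in X$, the map $\widehat{h}_x(t) = \widehat{h}(t,x)$ solves a Cauchy problem:
\[ \begin{cases} \dot{\widehat{h}}_x(t) = -f(\widehat{h}_x(t)), \\ \widehat{h}_x(0) = x, \end{cases} \]
where $f$ is a locally Lipschitz, bounded vector field, such that
\[ f|_{X \setminus \widehat{\varphi}^{-1}([\widehat{c}-2\varepsilon, \widehat{c}+2\varepsilon])} \equiv 0. \]
From the proof of Proposition 5.1.25, we know that for every $x \in A$, the map $t \mapsto \widehat{\varphi}(\widehat{h}(t,x))$ is strictly decreasing. Hence
\[ \widehat{h}([0,1] \times A) \cap B = \emptyset. \]
From Remark 5.1.23, we know that $\widehat{h}([0,1] \times A)$ is closed in $X$.
So by Urysohn's lemma (see Theorem A.1.13) we can find a locally Lipschitz function $\vartheta_1\colon X \to [0,1]$, such that
\[ \vartheta_1|_{\widehat{h}([0,1]\times A)} \equiv 1 \quad\text{and}\quad \vartheta_1|_B \equiv 0. \]
The map $x \mapsto \vartheta_1(x) f(x)$ is still locally Lipschitz and bounded. Therefore the Cauchy problem:
\[ \begin{cases} \dot{u}(t) = -\vartheta_1(u(t))\,f(u(t)), \\ u(0) = x, \end{cases} \]
generates a $\widehat{\varphi}$-decreasing, locally Lipschitz homotopy of homeomorphisms $\eta_t$, such that
\[ \eta(t,x) = x \quad \forall\, (t,x) \in [0,1] \times \Big(B \cup \big(X \setminus \widehat{\varphi}^{-1}([\widehat{c}-2\varepsilon, \widehat{c}+2\varepsilon])\big)\Big). \]
From the definition of $\vartheta_1$ and the uniqueness of the solution of the Cauchy problem, we have that
\[ \eta(t,x) = \widehat{h}(t,x) \quad \forall\, (t,x) \in [0,1] \times A, \]
hence $\eta(1,A) = \widehat{h}(1,A) \subseteq \widehat{\varphi}^{\,\widehat{c}-\varepsilon}$. So finally let us set
\[ h(t,x) = \eta(t,x) \quad \forall\, (t,x) \in [0,1] \times X. \]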
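Note that in the next result we allow $b = +\infty$, in which case $\varphi^b \setminus K_b^\varphi = X$.
THEOREM 5.1.33 (Second Deformation Theorem) If $\varphi \in C^1(X)$, $a \in \mathbb{R}$, $a < b \le +\infty$, $\varphi$ satisfies the PS$_c$-condition for every $c \in [a,b)$, $\varphi$ has no critical values in $(a,b)$ and $\varphi^{-1}(a)$ contains at most a finite number of critical points of $\varphi$, then there exists a $\varphi$-decreasing homotopy $h\colon [0,1] \times (\varphi^b \setminus K_b^\varphi) \to \varphi^b$, such that
\[ h(1, \varphi^b \setminus K_b^\varphi) \subseteq \varphi^a \quad\text{and}\quad h(t,x) = x \quad \forall\, (t,x) \in [0,1] \times \varphi^a. \]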
PROOF Let $V_1\colon X \setminus K \to X$ be the normalized pseudogradient vector field determined in Remark 5.1.20 with $\alpha = 1$, $\beta = 2$. For $x \in \varphi^{-1}([a,b]) \setminus K_b^\varphi$, we consider the Cauchy problem
\[ \begin{cases} \dot{u}(t) = -V_1(u(t)), \\ u(0) = x. \end{cases} \tag{5.10} \]
By virtue of Theorem 5.1.21, problem (5.10) has a unique solution $u(x,\cdot)$ defined on a maximal interval $[0, l_+(x))$. On this interval, we have
\[ \frac{d}{dt}\varphi(u(x,t)) \le -1 \tag{5.11} \]
(recall that $\alpha = 1$).
Claim 1. If $\varphi(u(x,t(x))) = a$ for some $t(x) < l_+(x)$, then $t(x)$ is unique and the map $x \mapsto t(x)$ is continuous.
The uniqueness of $t(x)$ is immediate from (5.11). Note that $t(x)$ is characterized by
\[ \varphi(u(x,t)) < a < \varphi(u(x,s)) \quad \text{for } s < t(x) < t < l_+(x). \tag{5.12} \]
Suppose that $x_n \to x$ in $X$. For a given $\varepsilon > 0$ small enough, from (5.12), we have
\[ \varphi(u(x_n, t(x_n)+\varepsilon)) < a < \varphi(u(x_n, t(x_n)-\varepsilon)). \]
Exploiting the continuity of the map $y \mapsto \varphi(u(y,t))$, we can find $n_0 = n_0(\varepsilon) \ge 1$, such that
\[ \varphi(u(x, t(x_n)+\varepsilon)) < a < \varphi(u(x, t(x_n)-\varepsilon)) \quad \forall\, n \ge n_0. \]
From the intermediate value theorem and the uniqueness of $t(x)$, we obtain that
\[ |t(x_n) - t(x)| \le \varepsilon \quad \forall\, n \ge n_0, \]
hence $t(x_n) \to t(x)$, which proves Claim 1.
For $x \in \varphi^{-1}([a,b]) \setminus K_b^\varphi$, we put $t(x) = l_+(x)$ if $\varphi(u(x,t)) > a$ for all $t < l_+(x)$.
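Claim 2. If $\{x_n\}_{n\ge 1} \subseteq \varphi^{-1}((a,b]) \setminus K_b^\varphi$, $v \in \varphi^{-1}(a)$ and
\[ v = \lim_{n\to+\infty} u(x_n, s_n) \]
for some sequence $0 \le s_n < t(x_n)$, then for every sequence $\{t_n\}_{n\ge 1}$ with $s_n \le t_n < t(x_n)$, we have
\[ v = \lim_{n\to+\infty} u(x_n, t_n). \]
Let $\varepsilon > 0$ be such that
\[ K \cap \overline{B}_\varepsilon(v) \cap \varphi^{-1}([a,b]) \subseteq \{v\}, \]
where $\overline{B}_\varepsilon(v) = \{v' \in X : \|v' - v\|_X \le \varepsilon\}$ and
\[ b_1 = \sup \varphi\big(\overline{B}_\varepsilon(v)\big) < b. \]
We show that there exists $n_0 \ge 1$, such that
\[ u(x_n, t_n) \in \overline{B}_\varepsilon(v) \quad \forall\, n \ge n_0. \]
Indeed, if this is not the case, we can find a subsequence $\{(x_{n_k}, t_{n_k})\}_{k\ge 1}$ of $\{(x_n, t_n)\}_{n\ge 1}$, such that
\[ \|u(x_{n_k}, t_{n_k}) - v\|_X > \varepsilon \quad \forall\, k \ge 1. \]
By hypothesis
\[ \|u(x_{n_k}, s_{n_k}) - v\|_X < \frac{\varepsilon}{2} \quad \forall\, k \ge k_0. \]
So exploiting the continuity of $u$, we can find $r_{n_k}, \lambda_{n_k} \in [s_{n_k}, t_{n_k}]$, $r_{n_k} < \lambda_{n_k}$, such that
\[ \|u(x_{n_k}, r_{n_k}) - v\|_X = \frac{\varepsilon}{2}, \quad \|u(x_{n_k}, \lambda_{n_k}) - v\|_X = \varepsilon \quad\text{and}\quad u(x_{n_k}, t) \in A \quad \forall\, t \in [r_{n_k}, \lambda_{n_k}], \]
where $A$ is the annulus, defined by
\[ A = \Big\{y \in X : \frac{\varepsilon}{2} \le \|y - v\|_X \le \varepsilon\Big\}. \]
Since $\varphi$ satisfies the PS$_c$-condition for every $c \in [a,b)$, we have that
\[ \delta = \inf_{x \in A \cap \varphi^{-1}([a,b_1])} \|\varphi'(x)\|_{X^*} > 0. \]
But from (5.10), we have
\[ \frac{\varepsilon}{2} \le \|u(x_{n_k}, \lambda_{n_k}) - u(x_{n_k}, r_{n_k})\|_X \le \int_{r_{n_k}}^{\lambda_{n_k}} \|\dot{u}(x_{n_k}, \tau)\|_X\,d\tau \le 2\int_{r_{n_k}}^{\lambda_{n_k}} \frac{d\tau}{\|\varphi'(u(x_{n_k},\tau))\|_{X^*}} \le 2\,\frac{\lambda_{n_k} - r_{n_k}}{\delta}. \tag{5.13} \]
So we deduce that
\[ \varphi(u(x_{n_k}, \lambda_{n_k})) - \varphi(u(x_{n_k}, r_{n_k})) = \int_{r_{n_k}}^{\lambda_{n_k}} \frac{d}{dt}\varphi(u(x_{n_k}, t))\,dt = \int_{r_{n_k}}^{\lambda_{n_k}} \langle \varphi'(u(x_{n_k}, t)), \dot{u}(x_{n_k}, t)\rangle_X\,dt \le -(\lambda_{n_k} - r_{n_k}); \]
hence, using also (5.13), we obtain
\[ a \le \varphi(u(x_{n_k}, \lambda_{n_k})) \le \varphi(u(x_{n_k}, r_{n_k})) - (\lambda_{n_k} - r_{n_k}) \le \varphi(u(x_{n_k}, s_{n_k})) - \frac{\delta\varepsilon}{4}. \]
By hypothesis
\[ \varphi(u(x_{n_k}, s_{n_k})) \to \varphi(v) = a. \]
Thus in the limit we have
\[ a \le a - \frac{\delta\varepsilon}{4}, \]
a contradiction. This proves Claim 2.
Claim 3. If $x \in \varphi^{-1}((a,b]) \setminus K_b^\varphi$ is such that $t(x) = l_+(x)$, then $v = \lim_{t\to l_+(x)} u(x,t)$ exists and $v \in K_a^\varphi$.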
Suppose that the claim is not true. Because $K_a^\varphi$ is compact, by virtue of Claim 2 (with $x_n = x$ for all $n \ge 1$), we cannot find a sequence $\{s_n\}_{n\ge 1} \subseteq [0, l_+(x))$, such that
\[ d_X\big(u(x,s_n), K_a^\varphi\big) \to 0. \]
Therefore we can find $\varepsilon > 0$ and $\delta \in (0, l_+(x))$, such that
\[ d_X\big(u(x,t), K_a^\varphi\big) \ge \varepsilon \quad \forall\, t \in [\delta, l_+(x)). \]
Note that the set $u(x,[0,\delta])$ is compact and
\[ u(x,[0,\delta]) \cap K_a^\varphi = \emptyset. \]
So by making $\varepsilon > 0$ even smaller if necessary, we can say that
\[ u(x,t) \in \varphi^{-1}\big([a, \varphi(x)]\big) \cap \{y \in X : d_X(y, K_a^\varphi) \ge \varepsilon\} \quad \forall\, t \in [0, l_+(x)). \]
This set is complete and
\[ a < \varphi(u(x,t)) \le \varphi(x) - t \quad \forall\, t \in [0, l_+(x)). \]
So it follows that $l_+(x) \le \varphi(x) - a < +\infty$ and then Theorem 5.1.21 (see (5.4)) implies that
\[ +\infty = \int_0^{l_+(x)} \|V_1(u(x,t))\|_X\,dt \le 2\int_0^{l_+(x)} \frac{dt}{\|\varphi'(u(x,t))\|_{X^*}}. \]
This means that we can find a sequence $t_n \to l_+(x)$, such that
\[ \|\varphi'(u(x,t_n))\|_{X^*} \to 0. \]
Note that
\[ \varphi(u(x,t_n)) \le b_1, \]
for some $b_1 < b$. So the PS-condition is valid and we can find a subsequence $\{s_n\}_{n\ge 1}$ of $\{t_n\}_{n\ge 1}$, such that
\[ u(x,s_n) \to v, \]
for some $v \in \varphi^{-1}([a,b_1]) \cap K^\varphi$. Because of the hypotheses on $\varphi$, we have $\varphi(v) = a$, i.e., $v \in K_a^\varphi$, a contradiction.
Because of Claims 1 and 2, the limit $\lim_{t\to t(x)} u(x,t) = u(x,t(x))$ exists for all $x \in \varphi^{-1}((a,b]) \setminus K_b^\varphi$.
Claim 4. If $\{x_n\}_{n\ge 1} \subseteq \varphi^{-1}((a,b]) \setminus K_b^\varphi$, $x \in \varphi^{-1}(a)$ and $x_n \to x$ in $X$, then $x = \lim_{n\to+\infty} u(x_n,s_n)$ for every sequence $\{s_n\}_{n\ge 1}$, such that $0 \le s_n \le l_+(x_n)$.
Because of Claim 2, it suffices to consider the case $s_n = l_+(x_n)$. We can find $t_n < l_+(x_n)$, such that
\[ \|u(x_n,t_n) - u(x_n,l_+(x_n))\|_X \le \frac{1}{n} \quad \forall\, n \ge 1. \]
Since $x_n \to x$, by virtue of Claim 2, we have that
\[ u(x_n,t_n) \to x \]
and so we conclude that
\[ u(x_n, l_+(x_n)) \to x \quad \text{in } X. \]
Claim 5. If $\{x_n\}_{n\ge 1} \subseteq \varphi^{-1}((a,b]) \setminus K_b^\varphi$, $x_n \to x \in \varphi^{-1}((a,b]) \setminus K_b^\varphi$ and $l_+(x) = t(x)$, then for every sequence $\{t_n\}_{n\ge 1}$ with $0 < t_n < t(x_n)$ and
\[ l_+(x) \le \liminf_{n\to+\infty} t_n, \]
we have
\[ u(x,t(x)) = \lim_{n\to+\infty} u(x_n,t(x_n)) = \lim_{n\to+\infty} u(x_n,t_n). \]
Let
\[ v = u(x,t(x)). \]
Let $s_1 \in (0, l_+(x))$ be such that $u(x,s_1) \in B_{\frac{1}{2}}(v)$. For $n \ge 1$ large enough, we have that $u(x_n,s_1) \in B_1(v)$. Since
\[ l_+(x) \le \liminf_{n\to+\infty} t_n, \]
we can find $n_1 \ge 1$ large enough so that $s_1 < t_{n_1}$. Continuing this way we generate $s_k < t_{n_k}$, such that
\[ u(x_{n_k}, s_k) \in B_{\frac{1}{k}}(v) \quad \forall\, k \ge 1. \]
As before, because of Claim 2, we have that
\[ u(x_{n_k}, s_k) \to v \]
and so, it follows that
\[ u(x_{n_k}, t_{n_k}) \to v. \]
Now let $t_n < t(x_n)$ be such that
\[ \|u(x_n,t(x_n)) - u(x_n,t_n)\|_X \to 0 \quad\text{and}\quad \varphi(u(x_n,t_n)) \to a. \]
It cannot happen that
\[ \liminf_{n\to+\infty} t_n < l_+(x), \]
because then we can have
\[ t_{n_k} \to \tau < l_+(x) \]
and hence
\[ \varphi(u(x,\tau)) = a, \]
which contradicts the assumption that $t(x) = l_+(x)$. Therefore
\[ l_+(x) \le \liminf_{n\to+\infty} t_n \]
and from the first part of the proof of this claim, we have
\[ u(x_n,t_n) \to v \]
and so
\[ u(x_n,t(x_n)) \to v. \]
This proves Claim 5. For each $x \in \varphi^a$, we put $t(x) = 0$. Let
\[ \xi\colon \mathbb{R}_+ \times (\varphi^b \setminus K_b^\varphi) \to \varphi^b \]
be defined by
\[ \xi(t,x) = \begin{cases} x & \text{if } t(x) = 0, \\ u(x,t) & \text{if } 0 \le t < t(x), \\ u(x,t(x)) & \text{if } 0 < t(x) \le t. \end{cases} \]
Claim 6. $\xi$ is continuous.
Let $(t_n,x_n) \to (t,x)$ and assume that $a \le \varphi(x)$. First suppose that $t(x) = 0$. Then since
\[ \xi(t_n,x_n) = u(x_n,s_n), \]
with $s_n \le t(x_n)$, from Claim 4, we have that
\[ \xi(t_n,x_n) \to x = \xi(t,x). \]
Next suppose that $t(x) > 0$. If $t < t(x)$, we have $\varphi(u(x,t)) > a$ and so
\[ \varphi(u(x_n,t_n)) > a \quad \forall\, n \ge n_0, \tag{5.14} \]
for some $n_0 \ge 1$. Hence $t_n < t(x_n)$ and so from (5.14), we have
\[ \xi(t_n,x_n) = u(x_n,t_n) \to u(x,t) = \xi(t,x). \]
Finally let $0 < t(x) \le t$. If $t(x) < l_+(x)$, then Claim 1 implies that $t(x_n) \to t(x)$ and so from the continuity of $u$, we have
\[ \xi(t_n,x_n) \to \xi(t,x). \]
If $t(x) = l_+(x)$, we invoke Claim 5. This proves Claim 6.
By virtue of Claim 3, for every $x \in \varphi^b \setminus K_b^\varphi$, the limit
\[ \lim_{t\to+\infty} \xi(t,x) = \widehat{\xi}(x) \]
exists. Let
\[ h\colon [0,1] \times (\varphi^b \setminus K_b^\varphi) \to \varphi^b \]
be defined by
\[ h(t,x) = \begin{cases} \xi\big(\tfrac{t}{1-t}, x\big) & \text{if } t \in [0,1), \\ \widehat{\xi}(x) & \text{if } t = 1. \end{cases} \]
Evidently $h$ is $\varphi$-decreasing,
\[ h_0 = \mathrm{id}_X, \quad h_t|_{\varphi^a} = \mathrm{id}_X|_{\varphi^a} \quad\text{and}\quad h(1, \varphi^b \setminus K_b^\varphi) \subseteq \varphi^a. \]
So if we show that $h$ is continuous, we are done.
Claim 7. If $x_n \to x$ and $t_n \to +\infty$, then $\xi(t_n,x_n) \to \widehat{\xi}(x)$.
If $t(x) = 0$, we argue as in the corresponding situation in the proof of Claim 6. If $0 < t(x) < l_+(x)$, then
\[ 0 < t(x_n) < 2t(x) < +\infty, \]
for $n \ge 1$ large enough and so
\[ \xi(t_n,x_n) = u(x_n,t(x_n)) \to u(x,t(x)) = \widehat{\xi}(x). \]
If $t(x) = l_+(x)$, then since
\[ \liminf_{n\to+\infty} t_n = +\infty \ge l_+(x), \]
from Claim 5, we have
\[ \xi(t_n,x_n) \to u(x,t(x)) = \widehat{\xi}(x). \]
This proves Claim 7.
From Claims 6 and 7 and the definition of $h$, we conclude that $h$ is continuous. This completes the proof of the theorem.
Next we prove a deformation result under the weaker compactness condition of Definition 5.1.5(b). THEOREM 5.1.34 If ϕ ∈ C 1 (X), c ∈ R, ϕ satisfies the Cc -condition, ε0 > 0, U is a neighbourhood of Kcϕ (if Kcϕ = ∅, then we take U = ∅) and λ > 0, then there exists ε ∈ (0, ε0 ) and a ϕ-decreasing, locally Lipschitz homotopy of homeomorphisms h : [0, 1] × X −→ X, such that for all (t, x) ∈ [0, 1] × X, we have ° ° ¡ ¢ (a) °h(t, x) − x°X 6 λ 1 + kxkX t; ¡ ¢ (b) if h(t, x) 6= x, then ϕ h(t, x) < x; ¯ ¯ (c) if ¯ϕ(x) − c¯ > ε0 , then h(t, x) = x; ¡ ¢ (d) h {1} × ϕc+ε ⊆ ϕc−ε ∪ U . PROOF such that
Due to the Cc -condition, Kcϕ is compact. So we can find % > 0,
int B3% (Kcϕ ) ⊆ U. ¡ ¢ We claim that we can find εb ∈ 1, 21 ε0 and µ > 0, such that ¡ ¢ ¡ ¢ ¡ ¢c 1 + kxkX ϕ0 (x) > µ ∀ x ∈ ϕ−1 [c − 2b ε, c + 2b ε] ∩ int B% (Kcϕ ) . Indeed, if this is not the case, we can find xn ∈ / int B% (Kcϕ ), such that ¡ ¢ ϕ(xn ) −→ c and 1 + kxkX ϕ0 (xn ) −→ 0. Because of the Cc -condition, we can find a subsequence {xnk }k>1 of {xn }n>1 , such that xnk −→ x in X as k → +∞. Then ϕ(x) = c,
x∈ / int B% (Kcϕ ) and
ϕ0 (x) = 0,
a contradiction. We consider the sets ¡ ¢ df C = ϕ−1 [c − εb, c + εb] ∩ int B2% (Kcϕ ) and
df
E =
©
¯ ¯ ª x ∈ X : ¯ϕ(x) − c¯ > 2b ε ∪ B% (Kcϕ )
and let ϑ : X −→ [0, 1] be a locally Lipschitz function, such that ϑ|C ≡ 0
and
ϑ|E ≡ 1.
5. Critical Point Theory
637
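Let $\sigma > 0$ be such that $e^\sigma - 1 \le \lambda$ and let $V_1$ be the normalized pseudogradient vector field of Remark 5.1.20 with $\alpha = 1$, $\beta = 2$. We set
\[ f(x) \stackrel{df}{=} \begin{cases} -\mu\sigma\vartheta(x)V_1(x) & \text{if } x \in \varphi^{-1}\big([c - 2\widehat{\varepsilon}, c + 2\widehat{\varepsilon}]\big) \cap \big(\operatorname{int} B_\varrho(K_c^\varphi)\big)^c, \\ 0 & \text{otherwise.} \end{cases} \]
Evidently $f : X \to X$ is locally Lipschitz and
\[ \|f(x)\|_X \le \sigma\big(1 + \|x\|_X\big) \quad \text{and} \quad \langle \varphi'(x), f(x) \rangle_X \le -\tfrac{1}{4}\mu\sigma\vartheta(x) \quad \forall\, x \in X. \tag{5.15} \]
Consider the Cauchy problem
\[ \begin{cases} \dot{u}(t) = f\big(u(t)\big), \\ u(0) = x. \end{cases} \tag{5.16} \]
We know that (5.16) has a unique solution $u(x)(\cdot)$ and if we set
\[ h(t,x) \stackrel{df}{=} u(x)(t), \]
then $h_t$ is a $\varphi$-decreasing, locally Lipschitz, bounded homotopy of homeomorphisms (see Theorem 5.1.22). Moreover, if $h(t,x) \ne x$, then $\varphi\big(h(t,x)\big) < \varphi(x)$ (see the proof of Proposition 5.1.25). Also from (5.15), we have
\[ \|h(t,x) - x\|_X \le \sigma \int_0^t \big(1 + \|h(s,x)\|_X\big)\,ds \le \sigma \int_0^t \|h(s,x) - x\|_X\,ds + \sigma\big(1 + \|x\|_X\big)t. \]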
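The passage from this integral inequality to a bound that is linear in $t$ can be sketched as follows. The abbreviations $y(t) = \|h(t,x) - x\|_X$ and $A = 1 + \|x\|_X$ are introduced here only for illustration, and the computation assumes the standard integral form of Gronwall's inequality with a constant kernel (the exact statement of Theorem A.4.7 may be phrased differently).

```latex
\begin{align*}
y(t) &\le \sigma A t + \sigma \int_0^t y(s)\,ds, \qquad t \in [0,1], \\
y(t) &\le \sigma A t + \int_0^t \sigma A s \cdot \sigma e^{\sigma(t-s)}\,ds
      = \sigma A t + A\big(e^{\sigma t} - 1 - \sigma t\big)
      = A\big(e^{\sigma t} - 1\big), \\
e^{\sigma t} - 1 &\le t\,\big(e^{\sigma} - 1\big) \le \lambda t
      \qquad \text{(convexity of } t \mapsto e^{\sigma t} \text{ on } [0,1]
      \text{ and } e^{\sigma} - 1 \le \lambda\text{)}.
\end{align*}
```

Combining the last two lines gives $y(t) \le \lambda A t$ for $t \in [0,1]$, which is the estimate claimed in (a).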
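Hence by Gronwall's inequality (see Theorem A.4.7) and the choice of $\sigma > 0$, we have
\[ \|h(t,x) - x\|_X \le \lambda\big(1 + \|x\|_X\big)t. \]
Finally let $R > 0$ be such that $B_{2\varrho}(K_c^\varphi) \subseteq \operatorname{int} B_R(0)$ and let $\varepsilon \in (0, \widehat{\varepsilon}]$ satisfy
\[ 8\varepsilon \le \mu\sigma, \qquad 8\lambda(1 + R)\varepsilon \le \mu\sigma\varrho. \]
Let $x \in \varphi^{c+\varepsilon}$ and suppose by contradiction that
\[ \varphi\big(h(1,x)\big) > c - \varepsilon \quad \text{and} \quad h(1,x) \notin U \]
(see (d)). Then
\[ c - \varepsilon < \varphi\big(h(t,x)\big) \le c + \varepsilon \quad \forall\, t \in [0,1]. \]
Also
\[ h\big([0,1] \times \{x\}\big) \cap \operatorname{int} B_{2\varrho}(K_c^\varphi) \ne \emptyset, \]
or otherwise from the second inequality in (5.15), we would have
\[ \tfrac{1}{4}\mu\sigma \le \varphi(x) - \varphi\big(h(1,x)\big) < 2\varepsilon, \]
a contradiction to the choice of $\varepsilon \in (0, \widehat{\varepsilon}]$. So we can find $0 \le t_1 < t_2 \le 1$, such that
\[ d_X\big(h(t_1,x), K_c^\varphi\big) = 2\varrho, \quad d_X\big(h(t_2,x), K_c^\varphi\big) = 3\varrho \]
and
\[ 2\varrho < d_X\big(h(t,x), K_c^\varphi\big) < 3\varrho \quad \forall\, t \in (t_1, t_2). \]
We have
\[ \tfrac{1}{4}\mu\sigma(t_2 - t_1) \le \varphi\big(h(t_1,x)\big) - \varphi\big(h(t_2,x)\big) < 2\varepsilon. \]
Because $h(t_2,x) = h\big(t_2 - t_1, h(t_1,x)\big)$, we have
\[ \varrho \le \|h(t_2,x) - h(t_1,x)\|_X \le \lambda\big(1 + \|h(t_1,x)\|_X\big)(t_2 - t_1) < \lambda(1 + R)\frac{8\varepsilon}{\mu\sigma}, \]
a contradiction to the choice of $R > 0$ and $\varepsilon \in (0, \widehat{\varepsilon}]$. This completes the proof.

Thus far we have assumed that $\varphi \in C^1(X)$. There is an extension of the theory to the case where
\[ \varphi = \varphi_1 + \varphi_2 \quad \text{with } \varphi_1 \in C^1(X) \text{ and } \varphi_2 \in \Gamma_0(X) \tag{5.17} \]
(see Definition 4.2.1).

DEFINITION 5.1.35 If $\varphi$ is as in (5.17), then we say that $x \in X$ is a critical point of $\varphi$, if $x \in \operatorname{dom}\varphi_2 = \{x \in X : \varphi_2(x) < +\infty\}$ and it satisfies the inequality
\[ 0 \le \langle \varphi_1'(x), h - x \rangle_X + \varphi_2(h) - \varphi_2(x) \quad \forall\, h \in X. \tag{5.18} \]

REMARK 5.1.36 Evidently (5.18) is equivalent to saying that $-\varphi_1'(x) \in \partial\varphi_2(x)$. It is easy to see that every local minimizer of $\varphi$ is a critical point of $\varphi$.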
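The equivalence stated in Remark 5.1.36 can be written out in one line. The sketch below assumes the usual definition of the convex subdifferential, namely $x^* \in \partial\varphi_2(x)$ if and only if $\varphi_2(h) - \varphi_2(x) \ge \langle x^*, h - x \rangle_X$ for all $h \in X$.

```latex
\begin{align*}
-\varphi_1'(x) \in \partial\varphi_2(x)
  &\iff \varphi_2(h) - \varphi_2(x) \ge \langle -\varphi_1'(x), h - x \rangle_X
        \quad \forall\, h \in X \\
  &\iff 0 \le \langle \varphi_1'(x), h - x \rangle_X + \varphi_2(h) - \varphi_2(x)
        \quad \forall\, h \in X .
\end{align*}
```

In particular, when $\varphi_2 \equiv 0$ the condition reduces to $\varphi_1'(x) = 0$, the usual notion of a critical point.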
In this general situation the Palais-Smale compactness condition takes the following form which is motivated by the Ekeland variational principle (see Section 4.6).

DEFINITION 5.1.37 Let $\varphi$ be as in (5.17). We say that $\varphi$ satisfies the generalized Palais-Smale condition (G-PS-condition for short), if every sequence $\{x_n\}_{n \ge 1} \subseteq X$, such that the sequence $\{\varphi(x_n)\}_{n \ge 1}$ is bounded and
\[ -\varepsilon_n\|h - x_n\|_X \le \langle \varphi_1'(x_n), h - x_n \rangle_X + \varphi_2(h) - \varphi_2(x_n) \quad \forall\, h \in X, \]
with $\varepsilon_n \to 0$, has a strongly convergent subsequence.

The next lemma permits an equivalent reformulation of Definition 5.1.37, which in many situations is more appropriate.

LEMMA 5.1.38 If $\psi \in \Gamma_0(X)$, $\psi(0) = 0$ and
\[ \psi(x) \ge -\|x\|_X \quad \forall\, x \in X, \]
then there exists $x^* \in X^*$, such that $\|x^*\|_{X^*} \le 1$ and
\[ \langle x^*, x \rangle_X \le \psi(x) \quad \forall\, x \in X. \]

PROOF Let
\[ \psi_1(x) \stackrel{df}{=} \psi(x) + \|x\|_X \quad \forall\, x \in X. \]
By hypothesis $\psi_1 \in \Gamma_0(X)$, $\psi_1 \ge 0$ and
\[ 0 = \psi_1(0) = \inf_{x \in X}\psi_1(x). \]
So we have $0 \in \partial\psi_1(0)$ (see Proposition 4.4.30). By virtue of Proposition 4.4.31, we have
\[ \partial\psi_1(0) = \partial\psi(0) + \partial\|\cdot\|_X(0). \]
But we know that
\[ \partial\|\cdot\|_X(0) = \overline{B}_1^{X^*} = \big\{x^* \in X^* : \|x^*\|_{X^*} \le 1\big\} \]
(see Example 4.4.24(b)). So
\[ 0 \in \partial\psi(0) + \overline{B}_1^{X^*} \]
and this means that we can find $x^* \in X^*$ with $\|x^*\|_{X^*} \le 1$, such that
\[ \langle x^*, x \rangle_X \le \psi(x) \quad \forall\, x \in X. \]
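A one-dimensional instance may help to see how the proof of Lemma 5.1.38 produces $x^*$. The choice $X = \mathbb{R}$ and $\psi(x) = -x$ below is only an illustrative assumption; it also shows that the bound $\|x^*\|_{X^*} \le 1$ cannot be improved in general.

```latex
\begin{align*}
&X = \mathbb{R}, \qquad \psi(x) = -x, \qquad \psi(0) = 0, \qquad
  \psi(x) = -x \ge -|x| \quad \forall\, x \in \mathbb{R}, \\
&\psi_1(x) = |x| - x \ge 0, \qquad
  \partial\psi_1(0) = \partial\psi(0) + \partial|\cdot|(0) = \{-1\} + [-1,1] = [-2,0] \ni 0, \\
&0 = (-1) + 1 \quad \Longrightarrow \quad x^* = -1, \qquad |x^*| = 1, \qquad
  x^* x = -x \le \psi(x) \quad \forall\, x \in \mathbb{R}.
\end{align*}
```

Here $\partial\psi(0) = \{-1\}$, so $x^* = -1$ is the only admissible choice and its norm equals $1$.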
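Using the above lemma, we can make the following definition.

DEFINITION 5.1.39 Let $\varphi$ be as in (5.17). We say that $\varphi$ satisfies the G-PS$'$-condition, if every sequence $\{x_n\}_{n \ge 1} \subseteq X$, such that $\{\varphi(x_n)\}_{n \ge 1}$ is bounded and
\[ \langle x_n^*, h - x_n \rangle_X \le \langle \varphi_1'(x_n), h - x_n \rangle_X + \varphi_2(h) - \varphi_2(x_n) \quad \forall\, h \in X, \]
with $x_n^* \to 0$ in $X^*$, has a strongly convergent subsequence.

REMARK 5.1.40 The above condition means that
\[ x_n^* \in \varphi_1'(x_n) + \partial\varphi_2(x_n) \]
and in this form the similarity with Definition 5.1.5(a) becomes apparent. Indeed, if $\varphi_2 \equiv 0$, then we recover Definition 5.1.5(a).

PROPOSITION 5.1.41 If $\varphi$ is as in (5.17), then the G-PS-condition and the G-PS$'$-condition are equivalent.

PROOF Clearly the G-PS-condition implies the G-PS$'$-condition, since $\langle x_n^*, h - x_n \rangle_X \ge -\|x_n^*\|_{X^*}\|h - x_n\|_X$. So we need to show the opposite implication. Suppose that $\{x_n\}_{n \ge 1} \subseteq X$ is a sequence, such that the sequence $\{\varphi(x_n)\}_{n \ge 1}$ is bounded and
\[ -\varepsilon_n\|h - x_n\|_X \le \langle \varphi_1'(x_n), h - x_n \rangle_X + \varphi_2(h) - \varphi_2(x_n) \quad \forall\, h \in X, \]
with $\varepsilon_n \to 0$. If $\varepsilon_n \le 0$, take $x_n^* = 0$. If $\varepsilon_n > 0$, let
\[ \psi(v) \stackrel{df}{=} \frac{1}{\varepsilon_n}\Big[\langle \varphi_1'(x_n), v \rangle_X + \varphi_2(v + x_n) - \varphi_2(x_n)\Big] \quad \forall\, v \in X. \]
Evidently $\psi \in \Gamma_0(X)$, $\psi(0) = 0$ and
\[ -\|v\|_X \le \psi(v) \quad \forall\, v \in X. \]
So we can apply Lemma 5.1.38 and obtain $v_n^* \in X^*$ with $\|v_n^*\|_{X^*} \le 1$, such that
\[ \langle v_n^*, v \rangle_X \le \psi(v) \quad \forall\, v \in X. \]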
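Setting $x_n^* \stackrel{df}{=} \varepsilon_n v_n^*$, we have
\[ \langle x_n^*, h - x_n \rangle_X \le \langle \varphi_1'(x_n), h - x_n \rangle_X + \varphi_2(h) - \varphi_2(x_n) \quad \forall\, h \in X, \]
with $x_n^* \to 0$ in $X^*$.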
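The verifications left to the reader in the case $\varepsilon_n > 0$ can be sketched as follows; the only device used is the substitution $h = x_n + v$, which is an assumption about how the authors intend the argument to run.

```latex
\begin{align*}
\text{(i)}\quad & h = x_n + v: \quad
  \varepsilon_n\psi(v) = \langle \varphi_1'(x_n), v \rangle_X
     + \varphi_2(x_n + v) - \varphi_2(x_n) \ge -\varepsilon_n\|v\|_X
  \ \Longrightarrow\ \psi(v) \ge -\|v\|_X, \\
\text{(ii)}\quad & v = h - x_n: \quad
  \langle x_n^*, h - x_n \rangle_X = \varepsilon_n\langle v_n^*, v \rangle_X
  \le \varepsilon_n\psi(v)
  = \langle \varphi_1'(x_n), h - x_n \rangle_X + \varphi_2(h) - \varphi_2(x_n), \\
\text{(iii)}\quad & \|x_n^*\|_{X^*} = \varepsilon_n\|v_n^*\|_{X^*} \le \varepsilon_n \to 0 .
\end{align*}
```

Step (i) is the hypothesis of Lemma 5.1.38, step (ii) is the inequality required in Definition 5.1.39, and step (iii) gives $x_n^* \to 0$ in $X^*$.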
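Let $\varphi$ be as in (5.17). If $c \in \mathbb{R}$, as before we set
\begin{align*}
K^\varphi &\stackrel{df}{=} \big\{x \in X : x \text{ is a critical point of } \varphi\big\}, \\
K_c^\varphi &\stackrel{df}{=} \big\{x \in X : x \in K^\varphi,\ \varphi(x) = c\big\}, \\
\varphi^c &\stackrel{df}{=} \big\{x \in X : \varphi(x) \le c\big\}.
\end{align*}
Using the Ekeland variational principle, we can have the following deformation result.

THEOREM 5.1.42 If $\varphi$ is as in (5.17), $\varphi$ satisfies the G-PS-condition, $c \in \mathbb{R}$, $\varepsilon_0, \delta > 0$ and $U$ is a neighbourhood of $K_c^\varphi$, then there exists $\varepsilon \in (0, \varepsilon_0)$, such that for each compact $G \subseteq \big(\varphi^{c+\varepsilon} \setminus \varphi^{c-\varepsilon}\big) \setminus U$, we can find a closed set $F$, such that $G \subseteq \operatorname{int} F$, $t_0 > 0$ and a homotopy $h_t : X \to X$, $0 \le t \le t_0$, such that
(a) $\|h(t,x) - x\|_X \le t$ for all $x \in X$;
(b) $\varphi\big(h(t,x)\big) \le \varphi(x) - 2\varepsilon t$ for all $x \in F$;
(c) $\varphi\big(h(t,x)\big) \le \varphi(x)$ for all $x \in X$;
(d) $\sup_{x \in F}\varphi\big(h(t,x)\big) - \sup_{x \in F}\varphi(x) \le -2\varepsilon t$.

REMARK 5.1.43 If $\varphi_1$ and $\varphi_2$ are both even and $G$ is symmetric, i.e., $G = -G$, then $h_t$ may be chosen to be odd.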
5.2 Minimax Theorems
In this section we use the deformation results to derive minimax expressions characterizing the critical values of a $C^1$-functional. We start with a geometric notion which plays a central role in what follows.

DEFINITION 5.2.1 Let $Y$ be a Hausdorff topological space and $C_0$, $E$ two distinct subsets of $Y$, such that $E$ is closed and $C_0 \cap E = \emptyset$. We say that the sets $C_0$ and $E$ are linking in $Y$, if there exists a set $C \supseteq C_0$, such that for any $\gamma \in C(C; Y)$ satisfying $\gamma|_{C_0} = \mathrm{id}_{C_0}$, we have $\gamma(C) \cap E \ne \emptyset$.

REMARK 5.2.2 Suppose that $C_0$ and $E$ are linking sets and $\psi : Y \to Y$ is a homeomorphism. Then it is easy to see that $\psi(C_0)$ and $\psi(E)$ are linking too. If $Y$ is locally convex and $C_0$ is the relative boundary of a nonempty, bounded and convex set $C$, then the above definition is equivalent to saying that $C_0 \cap E = \emptyset$ and $C_0$ is not contractible in $Y \setminus E$ (see Gasiński & Papageorgiou (2005, p. 136)).

EXAMPLE 5.2.3 (a) Let $X$ be a Banach space, $x_0, x_1 \in X$ with $\|x_0 - x_1\|_X > r$,
\[ C_0 = \{x_0, x_1\} \quad \text{and} \quad E = \partial B_r(x_0) = \big\{x \in X : \|x - x_0\|_X = r\big\}. \]
The sets $C_0$ and $E$ are linking in $X$. Indeed let
\[ C \stackrel{df}{=} \big\{x \in X : x = (1 - t)x_0 + tx_1,\ t \in [0,1]\big\} \]
and consider $\gamma \in C(C; X)$, such that $\gamma(x_0) = x_0$ and $\gamma(x_1) = x_1$. Evidently the set $\gamma(C)$ is connected. If $\gamma(C) \cap E = \emptyset$, then we have $\gamma(C) = U_1 \cup U_2$, with
\[ U_1 \stackrel{df}{=} \gamma(C) \cap B_r(x_0) \quad \text{and} \quad U_2 \stackrel{df}{=} \gamma(C) \cap \overline{B}_r(x_0)^c. \]
Since both sets $U_1$ and $U_2$ are nonempty ($x_0 \in U_1$, $x_1 \in U_2$), disjoint and open in $\gamma(C)$, we contradict the connectedness of $\gamma(C)$.

(b) Let $X$ be a Banach space and suppose that $X = Y \oplus V$, with $\dim Y < +\infty$. Let
\[ C_0 \stackrel{df}{=} \partial B_r(0) \cap Y = \big\{x \in Y : \|x\|_X = r\big\} \quad \text{and} \quad E \stackrel{df}{=} V. \]
Then the sets $C_0$ and $E$ are linking in $X$. To see this, let $\mathrm{proj}_Y \in \mathcal{L}(X)$ be the projection onto $Y$ (it exists since $Y$ is finite dimensional) and let
\[ C \stackrel{df}{=} \overline{B}_r(0) \cap Y = \big\{x \in Y : \|x\|_X \le r\big\}. \]
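For a given $\gamma \in C(C; X)$, such that $\gamma|_{C_0} = \mathrm{id}_{C_0}$, we claim that $\gamma(C) \cap E \ne \emptyset$. To this end it suffices to show that $0 \in \mathrm{proj}_Y\big(\gamma(C)\big)$. So let $h : [0,1] \times C \to Y$ be defined by
\[ h(t,y) \stackrel{df}{=} t\,\mathrm{proj}_Y\big(\gamma(y)\big) + (1 - t)y. \]
Evidently $h_t$ is a homotopy and $h_1 = \mathrm{proj}_Y \circ \gamma$. Moreover, for every $t \in [0,1]$, we have $h_t|_{C_0} = \mathrm{id}_{C_0}$. Then from the homotopy invariance and normalization of Brouwer's degree, we have
\[ D_B\big(\mathrm{proj}_Y \circ \gamma, \operatorname{int} B_r(0) \cap Y, 0\big) = D_B\big(\mathrm{id}_Y, \operatorname{int} B_r(0) \cap Y, 0\big) = 1, \]
which means that $0 \in \mathrm{proj}_Y\big(\gamma(C)\big)$.

(c) Let $X$ be a Banach space and suppose that $X = Y \oplus V$, with $\dim Y < +\infty$. Let $e \in V$ with $\|e\|_X = 1$ and $0 < \varrho < r_1$, $0 < r_2$ be given. We set
\[ C_0 \stackrel{df}{=} \big\{\lambda e + y : y \in Y \text{ and } \lambda \in \{0, r_1\} \text{ or } \|y\|_X = r_2\big\} \quad \text{and} \quad E \stackrel{df}{=} \partial B_\varrho(0) \cap V. \]
Then we claim that the sets $C_0$ and $E$ are linking in $X$. To show this, as before, let $\mathrm{proj}_Y \in \mathcal{L}(X)$ be the projection on $Y$,
\[ C \stackrel{df}{=} \big\{\lambda e + y : y \in Y,\ \lambda \in [0, r_1],\ \|y\|_X \le r_2\big\} \]
(the cylinder whose boundary is $C_0$) and suppose that $\gamma \in C(C; X)$ with $\gamma|_{C_0} = \mathrm{id}_{C_0}$. We will show that $\gamma(C) \cap E \ne \emptyset$. To have this, we must show that there exists $x \in C$ which satisfies
\[ \|\gamma(x)\|_X = \varrho \quad \text{and} \quad \mathrm{proj}_Y\big(\gamma(x)\big) = 0. \]
So let $h : [0,1] \times \mathbb{R} \times Y \to \mathbb{R} \times Y$ be defined by
\[ h\big(t, (\lambda, y)\big) \stackrel{df}{=} \Big(t\big\|\gamma(x) - \mathrm{proj}_Y\big(\gamma(x)\big)\big\|_X + (1 - t)\lambda - \varrho,\ t\,\mathrm{proj}_Y\big(\gamma(x)\big) + (1 - t)y\Big), \]
where $x = \lambda e + y$. Clearly $h$ is continuous and
\[ h\big(0, (\lambda, y)\big) = (\lambda - \varrho, y). \]
Moreover, if $x = \lambda e + y \in C_0$, we have
\[ h\big(t, (\lambda, y)\big) = \big(t\|x - y\|_X + (1 - t)\lambda - \varrho,\ y\big) = (\lambda - \varrho, y) \ne 0. \]
Identifying $C$ with a subset of $\mathbb{R} \times Y$ by virtue of the decomposition $x = \lambda e + y$ and exploiting the homotopy invariance of Brouwer's degree, we have
\[ D_B\big(h_1, \operatorname{int} C, 0\big) = D_B\big(h_0, \operatorname{int} C, 0\big) = 1. \]
So there exists $x = \lambda e + y \in C$, such that $h(1, x) = 0$. This is equivalent to
\[ \|\gamma(x)\|_X = \varrho \quad \text{and} \quad \mathrm{proj}_Y\big(\gamma(x)\big) = 0. \]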
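The two facts used without comment at the end of Example 5.2.3(c) can be checked directly. The sketch below keeps the identification $x = \lambda e + y \leftrightarrow (\lambda, y)$ from the example and assumes the translation and normalization properties of Brouwer's degree.

```latex
\begin{align*}
&\text{On } C_0: \quad \gamma(x) = x, \quad \mathrm{proj}_Y(x) = y, \quad
  \|x - y\|_X = \|\lambda e\|_X = \lambda
  \ \Longrightarrow\ h\big(t, (\lambda, y)\big) = (\lambda - \varrho, y), \\
&(\lambda - \varrho, y) = 0 \iff \lambda = \varrho \text{ and } y = 0,
  \text{ which is impossible on } C_0,
  \text{ since there } \lambda \in \{0, r_1\} \not\ni \varrho
  \text{ or } \|y\|_X = r_2 > 0, \\
&h_0 = \mathrm{id} - (\varrho, 0), \qquad (\varrho, 0) \in \operatorname{int} C
  \text{ because } 0 < \varrho < r_1,\ 0 < r_2, \\
&D_B\big(h_0, \operatorname{int} C, 0\big)
  = D_B\big(\mathrm{id}, \operatorname{int} C, (\varrho, 0)\big) = 1 .
\end{align*}
```

The first two lines show that $0 \notin h_t(\partial C)$ for every $t \in [0,1]$, so the degree is well defined along the homotopy; the last two lines give its value.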
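Using the notion of linking sets, we can prove the following general minimax principle. In what follows $X$ is a Banach space.

THEOREM 5.2.4 If $C_0$ and $E$ are closed sets which are linking in $X$, $\varphi \in C^1(X)$,
\[ \sup_{x \in C_0}\varphi(x) \le \inf_{x \in E}\varphi(x), \]
$\varphi$ satisfies the PS$_c$-condition, where
\[ c \stackrel{df}{=} \inf_{\gamma \in \Gamma}\sup_{x \in C}\varphi\big(\gamma(x)\big), \]
with $C \supseteq C_0$ as in Definition 5.2.1 and
\[ \Gamma \stackrel{df}{=} \big\{\gamma \in C(C; X) : \gamma|_{C_0} = \mathrm{id}_{C_0}\big\}, \]
then $c \ge \inf_{x \in E}\varphi(x)$ and $c$ is a critical value of $\varphi$. Moreover, if
\[ c = \inf_{x \in E}\varphi(x), \]
then $K_c^\varphi \cap E \ne \emptyset$.

PROOF Since the sets $C_0$ and $E$ are linking, for a given $\gamma \in \Gamma$, we have $\gamma(C) \cap E \ne \emptyset$. Therefore
\[ \sup_{x \in C}\varphi\big(\gamma(x)\big) \ge \inf_{x \in E}\varphi(x) \quad \forall\, \gamma \in \Gamma, \]
hence $c \ge \inf_{x \in E}\varphi(x)$.

First suppose that $c > \inf_{x \in E}\varphi(x)$. Let us set
\[ \varepsilon_0 \stackrel{df}{=} c - \inf_{x \in E}\varphi(x) > 0 \]
and assume that $K_c^\varphi = \emptyset$. Then with $U = \emptyset$, let $\varepsilon > 0$ and $h : [0,1] \times X \to X$ be as postulated by Theorem 5.1.34. The choice of $\varepsilon_0 > 0$ implies that
\[ h_t|_{C_0} = \mathrm{id}_{C_0} \quad \forall\, t \in [0,1]. \]
Choose $\gamma \in \Gamma$, such that
\[ \varphi\big(\gamma(x)\big) \le c + \varepsilon \quad \forall\, x \in C \]
and let us set $\xi = h_1 \circ \gamma \in C(C; X)$. If $x \in C_0$, then
\[ \xi(x) = h_1\big(\gamma(x)\big) = h_1(x) = x, \]
hence $\xi \in \Gamma$. Moreover, note that
\[ \gamma(x) \in \varphi^{c+\varepsilon} \quad \forall\, x \in C. \]
So by virtue of Theorem 5.1.34(d), we have that
\[ \varphi\big(\xi(x)\big) \le c - \varepsilon \quad \forall\, x \in C, \]
which contradicts the definition of $c$ (recall that $\xi \in \Gamma$).

Next suppose that $c = \inf_{x \in E}\varphi(x)$. We will show that $K_c^\varphi \cap E \ne \emptyset$. Suppose that the opposite holds, namely that $K_c^\varphi \cap E = \emptyset$. We apply Theorem 5.1.32, with
\[ \widehat{\varphi} = -\varphi, \quad \widehat{c} = c, \quad A = E \quad \text{and} \quad B = C_0. \]
So we obtain $\varepsilon > 0$ and a continuous map $h : [0,1] \times X \to X$ as postulated by that theorem. Again we choose $\gamma \in \Gamma$, such that
\[ \varphi\big(\gamma(x)\big) < c + \varepsilon \quad \forall\, x \in C. \]
Let
\[ \xi_1 \stackrel{df}{=} h_1^{-1} \circ \gamma \in C(C; X). \]
We have
\[ \xi_1(x) = h_1^{-1}\big(\gamma(x)\big) = h_1^{-1}(x) = x \quad \forall\, x \in C_0, \]
i.e., $\xi_1 \in \Gamma$. Since $C_0$ and $E$ are linking in $X$, we have $\xi_1(C) \cap E \ne \emptyset$ and so we can find $x_0 \in C$, such that $\xi_1(x_0) \in E$. It follows that
\[ \varphi\big(\gamma(x_0)\big) = \varphi\big(h_1\big(\xi_1(x_0)\big)\big) \ge c + \varepsilon, \]
which contradicts the choice of $\gamma \in \Gamma$. This proves that $K_c^\varphi \cap E \ne \emptyset$.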
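In the first case of the proof, the statement that the choice of $\varepsilon_0$ forces $h_t$ to fix $C_0$ pointwise can be spelled out as below. This is a sketch that uses only the hypothesis on $\sup_{C_0}\varphi$ and property (c) of Theorem 5.1.34.

```latex
\begin{align*}
x \in C_0 \ \Longrightarrow\
  \varphi(x) \le \sup_{u \in C_0}\varphi(u) \le \inf_{u \in E}\varphi(u) = c - \varepsilon_0
  \ \Longrightarrow\ |\varphi(x) - c| \ge \varepsilon_0
  \ \Longrightarrow\ h(t, x) = x \quad \forall\, t \in [0,1].
\end{align*}
```

This is exactly what is needed for $\xi = h_1 \circ \gamma$ to remain in $\Gamma$.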
With a slight restriction on the sets $C_0$ and $C$ (which, however, is satisfied in almost all applications of interest), we can have the above theorem with the PS$_c$-condition replaced by the $C_c$-condition.

THEOREM 5.2.5 If $C$ is a nonempty, bounded, closed, convex set in $X$, $C_0$ is the relative boundary of $C$ and $E$ is a nonempty, closed subset in $X$, $C_0$ and $E$ are linking in $X$, $\varphi \in C^1(X)$,
\[ \sup_{x \in C_0}\varphi(x) \le \inf_{x \in E}\varphi(x), \]
$\varphi$ satisfies the $C_c$-condition, where
\[ c \stackrel{df}{=} \inf_{\gamma \in \Gamma}\sup_{x \in C}\varphi\big(\gamma(x)\big), \]
with $C \supseteq C_0$ as in Definition 5.2.1 and
\[ \Gamma \stackrel{df}{=} \big\{\gamma \in C(C; X) : \gamma|_{C_0} = \mathrm{id}_{C_0}\big\}, \]
then $c \ge \inf_{x \in E}\varphi(x)$ and $c$ is a critical value of $\varphi$. Moreover, if
\[ c = \inf_{x \in E}\varphi(x), \]
then $K_c^\varphi \cap E \ne \emptyset$.

PROOF As in the proof of Theorem 5.2.4, we check that $c \ge \inf_{x \in E}\varphi(x)$ and first we prove the theorem when $c = \inf_{x \in E}\varphi(x)$. Without any loss of generality, we may assume that $0 \in C$. Suppose that $K_c^\varphi \cap E = \emptyset$. Since $K_c^\varphi$ is compact, we can find a neighbourhood $U$ of $K_c^\varphi$, such that $U \cap E = \emptyset$. Then let $\varepsilon > 0$ and $h : [0,1] \times X \to X$ be as postulated by Theorem 5.1.34 and choose $\gamma \in \Gamma$, such that
\[ \varphi\big(\gamma(x)\big) \le c + \varepsilon \quad \forall\, x \in C. \]
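We introduce $\eta : [0,1] \times C_0 \to X$, by
\[ \eta(t,x) \stackrel{df}{=} \gamma\big((1 - t)x\big). \]
We have
\[ \eta(0,x) = \gamma(x) = x \quad \text{and} \quad \eta(1,x) = \gamma(0), \]
hence $\eta \in D_{C_0}$, where $D_{C_0}$ is the set of contractions (deformations to a point) of $C_0$. We define $\xi : [0,1] \times C_0 \to X$, by
\[ \xi(t,x) \stackrel{df}{=} \begin{cases} h(2t, x) & \text{if } t \in \big[0, \tfrac{1}{2}\big], \\ h\big(1, \eta(2t - 1, x)\big) & \text{if } t \in \big[\tfrac{1}{2}, 1\big]. \end{cases} \]
Evidently $\xi$ is continuous and
\[ \xi(0,x) = h(0,x) = x \quad \text{and} \quad \xi(1,x) = h\big(1, \eta(1,x)\big) = h\big(1, \gamma(0)\big). \]
Therefore $\xi \in D_{C_0}$. Also note that by Theorem 5.1.34, for $x \in C_0$ and $t \in \big[0, \tfrac{1}{2}\big]$ we have
\[ h(2t, x) = x \quad \text{or} \quad \varphi\big(h(2t, x)\big) < \varphi(x) \le c. \]
Therefore in both cases we have that $h(2t, x) \notin E$. Moreover, from the choice of $\gamma \in \Gamma$, we have
\[ \gamma(x) \in \varphi^{c+\varepsilon} \quad \forall\, x \in C. \]
Since $C$ is convex and we have assumed that $0 \in C$, we have
\[ 2(1 - t)x \in C \quad \forall\, t \in \big[\tfrac{1}{2}, 1\big] \]
and so
\[ \gamma\big(2(1 - t)x\big) \in \varphi^{c+\varepsilon} \quad \forall\, t \in \big[\tfrac{1}{2}, 1\big]. \]
Then Theorem 5.1.34(d) implies
\[ h\big(1, \eta(2t - 1, x)\big) \in \varphi^{c-\varepsilon} \cup U \quad \text{and} \quad \big(\varphi^{c-\varepsilon} \cup U\big) \cap E = \emptyset. \]
So it follows that $\xi$ is a contraction of $C_0$ in $X \setminus E$, a contradiction to the fact that the sets $C_0$ and $E$ are linking in $X$ (see Remark 5.2.2). This proves that $K_c^\varphi \cap E \ne \emptyset$.

When $c > \inf_{x \in E}\varphi(x)$, the proof is similar to that of Theorem 5.2.4, this time using Theorem 5.1.34.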
REMARK 5.2.6 We emphasize that in both Theorems 5.2.4 and 5.2.5, the inequality
\[ \sup_{x \in C_0}\varphi(x) \le \inf_{x \in E}\varphi(x) \]
is not strict. Equality is a possibility and in that case we refer to "relaxed boundary conditions."

With appropriate choices of the sets $C_0$ and $E$, from Theorem 5.2.5 we can recover the classical results of critical point theory, strengthened in the sense that the PS$_c$-condition is replaced by the $C_c$-condition and we have relaxed boundary conditions. The first result that we deduce from Theorem 5.2.5 is the mountain pass theorem.

COROLLARY 5.2.7 (Mountain Pass Theorem) If $\varphi \in C^1(X)$, $x_0, x_1 \in X$ with $\|x_0 - x_1\|_X > r > 0$,
\[ \max\big\{\varphi(x_0), \varphi(x_1)\big\} \le \inf_{\|x\|_X = r}\varphi(x) \]
and $\varphi$ satisfies the $C_c$-condition, where
\[ c \stackrel{df}{=} \inf_{\gamma_0 \in \Gamma_0}\max_{t \in [0,1]}\varphi\big(\gamma_0(t)\big) \]
and
\[ \Gamma_0 \stackrel{df}{=} \big\{\gamma_0 \in C\big([0,1]; X\big) : \gamma_0(0) = x_0,\ \gamma_0(1) = x_1\big\}, \]
then $c \ge \inf_{\|x\|_X = r}\varphi(x)$ and $c$ is a critical value of $\varphi$. Moreover, if
\[ c = \inf_{\|x\|_X = r}\varphi(x), \]
then there exists a critical point $x$ of $\varphi$ with $\varphi(x) = c$ and $\|x\|_X = r$.

PROOF Let
\[ C_0 = \{x_0, x_1\} \quad \text{and} \quad E = \partial B_r(x_0) = \big\{x \in X : \|x - x_0\|_X = r\big\}. \]
From Example 5.2.3(a) we know that $C_0$ and $E$ link in $X$ and
\[ C \stackrel{df}{=} \big\{x \in X : x = (1 - t)x_0 + tx_1,\ t \in [0,1]\big\}. \]
Also note that if
\[ \gamma \in \Gamma \stackrel{df}{=} \big\{\gamma \in C(C; X) : \gamma(x_0) = x_0,\ \gamma(x_1) = x_1\big\} \]
and
\[ \gamma_0(t) = \gamma\big((1 - t)x_0 + tx_1\big) = \gamma(x) \quad \text{for } x = (1 - t)x_0 + tx_1 \in C, \]
then $\gamma_0 \in \Gamma_0$ and conversely. So we can apply Theorem 5.2.5 and obtain the corollary.
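A one-dimensional instance shows the corollary at work, including the borderline case of equality. The choices $X = \mathbb{R}$, $\varphi(x) = \tfrac{1}{2}x^2 - \tfrac{1}{4}x^4$, $x_0 = 0$, $x_1 = 2$ and $r = 1$ are assumptions made here purely for illustration; with $x_0 = 0$ the sphere in the statement and the set $E = \partial B_r(x_0)$ of the proof coincide.

```latex
\begin{align*}
&\varphi(x) = \tfrac{1}{2}x^2 - \tfrac{1}{4}x^4, \qquad \varphi'(x) = x(1 - x^2), \qquad
  |x_0 - x_1| = 2 > 1 = r, \\
&\max\{\varphi(0), \varphi(2)\} = \max\{0, -2\} = 0
  \le \tfrac{1}{4} = \varphi(\pm 1) = \inf_{|x| = 1}\varphi(x), \\
&\gamma_0 \in \Gamma_0 \ \Longrightarrow\ \gamma_0([0,1]) \supseteq [0,2]
  \ \Longrightarrow\ \max_{t \in [0,1]}\varphi\big(\gamma_0(t)\big)
  \ge \max_{x \in [0,2]}\varphi(x) = \varphi(1) = \tfrac{1}{4}, \\
&\gamma_0(t) = 2t \ \text{ attains this bound, so }\ c = \tfrac{1}{4} = \inf_{|x| = 1}\varphi(x), \\
&\varphi'(1) = 0, \qquad \varphi(1) = c, \qquad |1| = r .
\end{align*}
```

The $C_c$-condition holds here because $\varphi(x) \to -\infty$ as $|x| \to +\infty$, so any sequence along which $\varphi$ stays bounded is bounded in $\mathbb{R}$ and has a convergent subsequence. The critical point $x = 1$ lies on the sphere $|x| = r$, as the "Moreover" part of the corollary predicts.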
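COROLLARY 5.2.8 (Saddle Point Theorem) If $\varphi \in C^1(X)$, $X = Y \oplus V$ with $\dim Y < +\infty$, there exists $r > 0$, such that
\[ \sup_{x \in Y,\ \|x\|_X = r}\varphi(x) \le \inf_{x \in V}\varphi(x) \]
and $\varphi$ satisfies the $C_c$-condition, where
\[ c \stackrel{df}{=} \inf_{\gamma \in \Gamma}\max_{x \in Y,\ \|x\|_X \le r}\varphi\big(\gamma(x)\big) \]
and
\[ \Gamma \stackrel{df}{=} \big\{\gamma \in C\big(\overline{B}_r(0) \cap Y; X\big) : \gamma|_{\partial B_r(0) \cap Y} = \mathrm{id}_{\partial B_r(0) \cap Y}\big\}, \]
then $c \ge \inf_{x \in V}\varphi(x)$ and $c$ is a critical value of $\varphi$. Moreover, if
\[ c = \inf_{x \in V}\varphi(x), \]
then there exists a critical point $x$ of $\varphi$ with $\varphi(x) = c$ and $x \in V$.

PROOF Let $C_0 = \partial B_r(0) \cap Y$, $C = \overline{B}_r(0) \cap Y$ and $E = V$. Then by virtue of Example 5.2.3(b) the sets $C_0$ and $E$ are linking in $X$. So we apply Theorem 5.2.5 and have the corollary.

COROLLARY 5.2.9 (Generalized Mountain Pass Theorem) If $\varphi \in C^1(X)$, $X = Y \oplus V$ with $\dim Y < +\infty$, there exist $0 < \varrho < r_1$, $0 < r_2$ and $e \in V$ with $\|e\|_X = 1$, such that for
\begin{align*}
C_0 &\stackrel{df}{=} \big\{\lambda e + y : y \in Y \text{ and } \lambda \in \{0, r_1\} \text{ or } \|y\|_X = r_2\big\}, \\
C &\stackrel{df}{=} \big\{\lambda e + y : y \in Y,\ \lambda \in [0, r_1],\ \|y\|_X \le r_2\big\} \quad \text{and} \quad E \stackrel{df}{=} \partial B_\varrho(0) \cap V,
\end{align*}
we have
\[ \max_{x \in C_0}\varphi(x) \le \inf_{x \in E}\varphi(x) \]
and $\varphi$ satisfies the $C_c$-condition, where $c \stackrel{df}{=} \inf_{\gamma \in \Gamma}\max_{x \in C}\varphi\big(\gamma(x)\big)$ and
\[ \Gamma \stackrel{df}{=} \big\{\gamma \in C(C; X) : \gamma|_{C_0} = \mathrm{id}_{C_0}\big\}, \]
then $c \ge \inf_{x \in E}\varphi(x)$ and $c$ is a critical value of $\varphi$. Moreover, if
\[ c = \inf_{x \in E}\varphi(x), \]
then there exists a critical point $x$ of $\varphi$ with $\varphi(x) = c$ and $x \in V$.

PROOF The sets $C_0$ and $E$ are linking in $X$ (see Example 5.2.3(c)). So we can apply Theorem 5.2.5.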
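Finally a straightforward application of Theorem 4.6.33 gives the following result.

THEOREM 5.2.10 If $\varphi \in C^1(X)$ is bounded from below and satisfies the $C_c$-condition with $c = \inf_{x \in X}\varphi(x)$, then there exists $x \in X$, such that
\[ \varphi(x) = \inf_{u \in X}\varphi(u) \]
(hence $x \in X$ is a critical point of $\varphi$).

REMARK 5.2.11 Recall that if $\varphi \in C^1(X)$ is bounded from below, then the PS-condition and the C-condition are equivalent.

Next we consider functionals $\varphi : X \to \overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$ of the form
\[ \varphi = \varphi_1 + \varphi_2 \quad \text{with } \varphi_1 \in C^1(X) \text{ and } \varphi_2 \in \Gamma_0(X). \tag{5.19} \]

THEOREM 5.2.12 If $\varphi$ is as in (5.19), $\varphi$ satisfies the G-PS-condition (see Definition 5.1.37), $C$ is a compact submanifold of $X$ with relative boundary $\partial C = C_0$, $E \subseteq X$ is a nonempty set, $C_0$ and $E$ are linking in $X$ through $C$ and
\[ \sup_{x \in C_0}\varphi(x) < \inf_{x \in E}\varphi(x), \qquad c \stackrel{df}{=} \inf_{\gamma \in \Gamma}\max_{x \in C}\varphi\big(\gamma(x)\big), \]
with
\[ \Gamma \stackrel{df}{=} \big\{\gamma \in C(C; X) : \gamma|_{C_0} = \mathrm{id}_{C_0}\big\}, \]
then $c \ge \inf_{x \in E}\varphi(x)$ and, if $c < +\infty$, $c$ is a critical value of $\varphi$.

PROOF Clearly $c \ge \inf_{x \in E}\varphi(x)$. Equip $\Gamma$ with the supremum metric
\[ d_\infty(\gamma, \xi) \stackrel{df}{=} \max_{x \in C}\|\gamma(x) - \xi(x)\|_X \quad \forall\, \gamma, \xi \in \Gamma. \]
Then $(\Gamma, d_\infty)$ is a complete metric space. Let $\vartheta : \Gamma \to \overline{\mathbb{R}}$ be defined by
\[ \vartheta(\gamma) \stackrel{df}{=} \max_{x \in C}\varphi\big(\gamma(x)\big). \]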
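It is easy to check that $\vartheta$ is lower semicontinuous. Also we claim that for every $\gamma \in \operatorname{dom}\vartheta = \{\gamma \in \Gamma : \vartheta(\gamma) < +\infty\}$, the function $\varphi \circ \gamma$ is continuous on $C$. To this end note that $\varphi_2$ is bounded on $\gamma(C)$, hence $\varphi_2$ is bounded and lower semicontinuous on $\operatorname{conv}\gamma(C)$. Evidently it suffices to show that $\varphi_2|_{\operatorname{conv}\gamma(C)}$ is continuous. Let $x_0 \in \operatorname{conv}\gamma(C)$ and let $U$ be a neighbourhood of $x_0$, such that
\[ \varphi_2(u) \le \mu < +\infty \quad \forall\, u \in U \cap \operatorname{conv}\gamma(C). \]
Without any loss of generality, we may assume that
\[ x_0 = 0 \quad \text{and} \quad \varphi_2(x_0) = 0. \]
If $\lambda \in (0,1]$ and $u \in \lambda U \cap \operatorname{conv}\gamma(C)$, we have
\[ \varphi_2(u) \le (1 - \lambda)\varphi_2(0) + \lambda\varphi_2\Big(\frac{u}{\lambda}\Big) = \lambda\varphi_2\Big(\frac{u}{\lambda}\Big) \le \lambda\mu, \]
hence
\[ \limsup_{u \to 0,\ u \in \operatorname{conv}\gamma(C)}\varphi_2(u) \le 0. \]
On the other hand from the lower semicontinuity of $\varphi_2$, we have
\[ \liminf_{u \to 0,\ u \in \operatorname{conv}\gamma(C)}\varphi_2(u) \ge 0. \]
Therefore, it follows that $\varphi_2|_{\operatorname{conv}\gamma(C)}$ is continuous and so $\varphi \circ \gamma \in C(C)$.

Suppose that $c$ is not a critical value of $\varphi$ and let
\[ \varepsilon_0 \stackrel{df}{=} \inf_{x \in E}\varphi(x) - \max_{x \in C_0}\varphi(x) > 0. \]
Let $\varepsilon \in (0, \varepsilon_0)$ be as in Theorem 5.1.42 and choose $\varepsilon' < \varepsilon$, such that
\[ \max_{x \in C_0}\varphi(x) < c - \varepsilon'. \]
Applying Theorem 4.6.1, we obtain $\gamma \in \Gamma$, such that
\[ \vartheta(\gamma) \le c + \varepsilon' \quad \text{and} \quad -\varepsilon d_\infty(\gamma, \xi) \le \vartheta(\xi) - \vartheta(\gamma) \quad \forall\, \xi \in \Gamma. \tag{5.20} \]
Let
\[ D \stackrel{df}{=} \gamma(C) \quad \text{and} \quad D_0 \stackrel{df}{=} \big\{\gamma(x) : x \in C,\ \varphi\big(\gamma(x)\big) \in [c - \varepsilon', c + \varepsilon']\big\}. \]
By virtue of the continuity of $\varphi \circ \gamma$, we see that $D_0$ is compact and from the choice of $\varepsilon' > 0$, we have that $C_0 \cap D_0 = \emptyset$. Clearly
\[ D_0 \subseteq \varphi^{c+\varepsilon} \setminus \varphi^{c-\varepsilon}. \]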
Let $h_t : X \to X$, $t \in [0, t_0]$, be the homotopy postulated by Theorem 5.1.42 and let us set $\xi \stackrel{df}{=} h_t \circ \gamma$. Because $h_t|_{C_0} = \mathrm{id}_{C_0}$, we have that $\xi \in \Gamma$. From Theorem 5.1.42(a), we have
\[ d_\infty(\gamma, \xi) \le t. \tag{5.21} \]
Note that
\[ \vartheta(\xi) = \max_{x \in C}\varphi\big(\xi(x)\big) > c - \varepsilon'. \]
Hence
\[ \vartheta(\xi) = \max_{x \in C}\varphi\big(\xi(x)\big) = \max_{x \in C}\varphi\big((h_t \circ \gamma)(x)\big) = \max_{y \in D_0}\varphi\big(h_t(y)\big). \]
Then Theorem 5.1.42(d) implies that
\[ \vartheta(\xi) - \vartheta(\gamma) = \max_{y \in D_0}\varphi\big(h_t(y)\big) - \max_{x \in D_0}\varphi(x) \le -2\varepsilon t. \tag{5.22} \]
Comparing (5.20), (5.21) and (5.22), we reach a contradiction. This proves that $c$ is a critical value of $\varphi$.

With suitable choices of the sets $C$ and $E$, we obtain generalizations of the mountain pass theorem, saddle point theorem and generalized mountain pass theorem.

THEOREM 5.2.13 (Mountain Pass Theorem) If $\varphi = \varphi_1 + \varphi_2$ is as in (5.19), $\varphi$ satisfies the G-PS-condition and there exist $x \in X$ and $r > 0$, such that
\[ \max\big\{\varphi(x), \varphi(0)\big\} < \inf_{\|u\|_X = r}\varphi(u), \qquad c \stackrel{df}{=} \inf_{\gamma_0 \in \Gamma_0}\max_{t \in [0,1]}\varphi\big(\gamma_0(t)\big), \]
where
\[ \Gamma_0 \stackrel{df}{=} \big\{\gamma_0 \in C\big([0,1]; X\big) : \gamma_0(0) = 0,\ \gamma_0(1) = x\big\}, \]
then $c \ge \inf_{\|u\|_X = r}\varphi(u)$ and $c$ is a critical value of $\varphi$.

COROLLARY 5.2.14 If $\varphi = \varphi_1 + \varphi_2$ is as in (5.19), $\varphi$ satisfies the G-PS-condition, $0$ is a local minimizer of $\varphi$ and there exists $x \ne 0$, such that $\varphi(x) \le \varphi(0)$, then $\varphi$ has a critical point different from $x$ and $0$. In particular, if $\varphi$ has two local minima, then $\varphi$ has at least three critical points.
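THEOREM 5.2.15 (Saddle Point Theorem) If $\varphi = \varphi_1 + \varphi_2$ is as in (5.19), $\varphi$ satisfies the G-PS-condition, $X = Y \oplus V$ with $\dim Y < +\infty$, there exists $r > 0$, such that
\[ \max_{x \in \partial B_r(0) \cap Y}\varphi(x) < \inf_{x \in V}\varphi(x), \qquad c \stackrel{df}{=} \inf_{\gamma \in \Gamma}\max_{x \in C}\varphi\big(\gamma(x)\big), \]
where
\[ C \stackrel{df}{=} \overline{B}_r(0) \cap Y \quad \text{and} \quad \Gamma \stackrel{df}{=} \big\{\gamma \in C(C; X) : \gamma|_{\partial B_r(0) \cap Y} = \mathrm{id}_{\partial B_r(0) \cap Y}\big\}, \]
then $c \ge \inf_{x \in V}\varphi(x)$ and $c$ is a critical value of $\varphi$.

THEOREM 5.2.16 (Generalized Mountain Pass Theorem) If $\varphi = \varphi_1 + \varphi_2$ is as in (5.19), $\varphi$ satisfies the G-PS-condition, $X = Y \oplus V$ with $\dim Y < +\infty$, there exist $0 < \varrho < r_1$, $0 < r_2$ and $e \in V$ with $\|e\|_X = 1$, such that for
\begin{align*}
C_0 &\stackrel{df}{=} \big\{\lambda e + y : y \in Y \text{ and } \lambda \in \{0, r_1\} \text{ or } \|y\|_X = r_2\big\}, \\
C &\stackrel{df}{=} \big\{\lambda e + y : y \in Y,\ \lambda \in [0, r_1],\ \|y\|_X \le r_2\big\} \quad \text{and} \quad E \stackrel{df}{=} \partial B_\varrho(0) \cap V,
\end{align*}
we have
\[ \max_{x \in C_0}\varphi(x) < \inf_{x \in E}\varphi(x) \quad \text{and} \quad c \stackrel{df}{=} \inf_{\gamma \in \Gamma}\max_{x \in C}\varphi\big(\gamma(x)\big), \]
where
\[ \Gamma \stackrel{df}{=} \big\{\gamma \in C(C; X) : \gamma|_{C_0} = \mathrm{id}_{C_0}\big\}, \]
then $c \ge \inf_{x \in E}\varphi(x)$ and $c$ is a critical value of $\varphi$.

Finally a straightforward application of Theorem 4.6.1 (with $\varepsilon = \tfrac{1}{n}$, $\lambda = 1$) produces the following result.

THEOREM 5.2.17 If $\varphi = \varphi_1 + \varphi_2$ is as in (5.19), $\varphi$ satisfies the G-PS-condition and it is bounded from below, then $c = \inf_{x \in X}\varphi(x)$ is attained and so it is a critical value of $\varphi$.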
5.3 Structure of the Critical Set
In this section we go beyond the existence of critical points and produce additional information about the fine structure of the functional near them. Different types of critical points of a functional $\varphi$ can sometimes be distinguished by the topological structure of their neighbourhoods in the sublevel sets of $\varphi$. It turns out that in some cases such information about the topological type is already available from the minimax characterization of the corresponding critical value. Note that $\varphi$ need only be $C^1$, in contrast to Morse theory, which has the same goal but requires more regularity on $\varphi$.

So let $X$ be a Banach space and $\varphi \in C^1(X)$. We keep the notation introduced in Section 5.1. So recall that for $c \in \mathbb{R}$, we set
\begin{align*}
K^\varphi &\stackrel{df}{=} \big\{x \in X : \varphi'(x) = 0\big\}, \\
K_c^\varphi &\stackrel{df}{=} \big\{x \in K^\varphi : \varphi(x) = c\big\}, \\
\varphi^c &\stackrel{df}{=} \big\{x \in X : \varphi(x) \le c\big\}, \\
\varphi_c &\stackrel{df}{=} \big\{x \in X : \varphi(x) \ge c\big\}, \\
\mathring{\varphi}^c &\stackrel{df}{=} \big\{x \in X : \varphi(x) < c\big\}.
\end{align*}

DEFINITION 5.3.1 Suppose that $x_0 \in K_c^\varphi$.
(a) We say that $x_0$ is a local minimizer of $\varphi$, if there exists an open neighbourhood $U$ of $x_0$, such that $\varphi(x_0) \le \varphi(x)$ for all $x \in U$.
(b) We say that $x_0$ is of mountain pass type, if for all open neighbourhoods $U$ of $x_0$, $U \cap \mathring{\varphi}^c \ne \emptyset$ and the set $U \cap \mathring{\varphi}^c$ is not path-connected, where $c = \varphi(x_0)$.

REMARK 5.3.2 Note that we do not require that the critical point is isolated.

We start with a topological lemma.

LEMMA 5.3.3 If $(Y, d_Y)$ is a metric space, $C$ and $V$ are nonempty subsets of $Y$, such that $C$ is compact, $V$ is open and $C \subseteq V$ and $\{U(y)\}_{y \in C}$ is an open cover of $C$, such that $y \in U(y)$ and for every $y \in C$, the set $U(y) \cap V$ is path connected, then there exists a finite, disjoint, open cover $\{U_i\}_{i=1}^N$ of $C$, such that for each $i \in \{1, \ldots, N\}$, the set $U_i \cap V$ is contained in a path connected component of $\big(\bigcup_{y \in C}U(y)\big) \cap V$.
PROOF Clearly we may assume that $C$ is a strict subset of $Y$ or otherwise there is nothing to prove. Because $C$ is compact, we can find a finite subcover $\{U(y_i)\}_{i=1}^M$ of the open cover $\{U(y)\}_{y \in C}$. We let
\[ \delta \stackrel{df}{=} \min_{y \in C}\max_{i \in \{1, \ldots, M\}} d_Y\big(y, Y \setminus U(y_i)\big). \]
We claim that $\delta > 0$. Indeed, if $\delta = 0$, then for every $m \ge 1$, we can find $y_m \in C$, such that
\[ d_Y\big(y_m, U(y_i)^c\big) \le \frac{1}{m} \quad \forall\, i \in \{1, \ldots, M\}. \]
Because $C$ is compact, we may assume that
\[ y_m \to y \quad \text{as } m \to +\infty, \]
for some $y \in C$. Then
\[ d_Y\big(y, U(y_i)^c\big) \le 0 \quad \forall\, i \in \{1, \ldots, M\}. \]
Hence $y \in U(y_i)^c$ for all $i \in \{1, \ldots, M\}$ and so
\[ y \in \Big[\bigcup_{i=1}^M U(y_i)\Big]^c \subseteq C^c, \]
a contradiction. Therefore $\delta > 0$ and from the definition of $\delta > 0$, we see that for every $y \in C$,
\[ B_\delta(y) = \big\{y' \in Y : d_Y(y', y) < \delta\big\} \subseteq U(\overline{y}) \]
for some $\overline{y} \in C$. Next on $C$ we define an equivalence relation $\sim$ as follows: $y \sim y'$ if and only if there exist finitely many points $\{y_i\}_{i=0}^{k+1}$, such that
\[ d_Y(y_i, y_{i+1}) < \delta, \quad y_0 = y \quad \text{and} \quad y_{k+1} = y'. \]
Because $C$ is compact, there are only finitely many equivalence classes denoted by $C_1, \ldots, C_N$. Let
\[ W_i \stackrel{df}{=} \Big\{y \in Y : d_Y(y, C_i) < \frac{\delta}{4}\Big\} \quad \forall\, i \in \{1, \ldots, N\}. \]
Evidently $W_i \cap W_j = \emptyset$ for $i \ne j$ and
\[ C \subseteq \bigcup_{i=1}^N W_i. \]
It remains to show that each set $W_i \cap V$ is contained in a path connected component of $\big(\bigcup_{y \in C}U(y)\big) \cap V$.
We define another equivalence relation $\sim_1$ on $C$ as follows: $y \sim_1 y'$ if and only if $y$ and $y'$ belong to the same path connected component of
\[ \Big(\bigcup_{y \in C}U(y)\Big) \cap V. \]
Fix $i \in \{1, \ldots, N\}$ and let $y, y' \in W_i \cap V$. We need to show that $y \sim_1 y'$. From the definition of $W_i$, we can find a finite sequence $\{y_r\}_{r=0}^{k+1} \subseteq C_i$, such that
\[ d_Y(y, y_0) < \frac{\delta}{4}, \quad d_Y(y', y_{k+1}) < \frac{\delta}{4} \quad \text{and} \quad d_Y(y_r, y_{r+1}) < \delta \quad \forall\, r \in \{0, \ldots, k\}. \]
Let
\[ \varepsilon \stackrel{df}{=} \delta - \max_{r \in \{0, \ldots, k\}} d_Y(y_r, y_{r+1}) > 0. \]
Since by hypothesis $C \subseteq V$, we can find a finite sequence $\{x_r\}_{r=0}^{k+1} \subseteq V$, such that
\[ d_Y(y_r, x_r) < \frac{\varepsilon}{2} < \delta \quad \forall\, r \in \{0, \ldots, k+1\} \]
and
\[ d_Y(x_r, x_{r+1}) < \delta \quad \forall\, r \in \{0, \ldots, k\}. \]
From the choice of $\delta > 0$, we have
\[ y, x_0 \in B_\delta(y_0) \cap V \subseteq U(\overline{y}_0) \cap V, \]
for some $\overline{y}_0 \in C$. By virtue of our hypothesis concerning the cover $\{U(y)\}_{y \in C}$, we have that $y \sim_1 x_0$. In a similar fashion we show that $x_{k+1} \sim_1 y'$. Finally fix $r \in \{0, \ldots, k\}$. From the triangle inequality, we have
\[ d_Y(y_r, x_{r+1}) \le d_Y(y_r, y_{r+1}) + d_Y(y_{r+1}, x_{r+1}) < \delta - \varepsilon + \frac{\varepsilon}{2} < \delta. \]
Therefore we can find $\overline{y}_r \in C$, such that
\[ x_r, x_{r+1} \in B_\delta(y_r) \cap V \subseteq U(\overline{y}_r) \cap V \]
and so
\[ x_r \sim_1 x_{r+1} \quad \forall\, r \in \{0, \ldots, k\}. \]
Summing up we have established that
\[ y \sim_1 x_0 \sim_1 x_1 \sim_1 \ldots \sim_1 x_{k+1} \sim_1 y'. \]
This completes the proof of the lemma.
We will also need the following trivial variant of the standard deformation theorem (see Theorem 5.1.30).

LEMMA 5.3.4 If $\varphi \in C^1(X)$ satisfies the PS-condition, $\varepsilon_0 > 0$, $c \in \mathbb{R}$ and $U$ and $V$ are two open neighbourhoods of $K_c^\varphi$, such that
\[ U \subseteq V \quad \text{and} \quad d_Y(\partial V, U) > 0, \]
then there exist $\varepsilon \in (0, \varepsilon_0]$ and a $\varphi$-decreasing, locally Lipschitz homotopy of homeomorphisms $h : [0,1] \times X \to X$, such that
(a) $h_1\big(\varphi^{c+\varepsilon} \setminus U\big) \subseteq \varphi^{c-\varepsilon}$;
(b) $h\big([0,1] \times U\big) \subseteq V$;
(c) $h(t,x) = x$ for all $(t,x) \in [0,1] \times \big(\varphi^{c-\varepsilon_0} \cup \varphi_{c+\varepsilon_0}\big)$.

Now we can have the first structural result for the critical set $K_c^\varphi$.

THEOREM 5.3.5 If $\varphi \in C^1(X)$ satisfies the PS-condition, $x_0$, $x_1$ are two distinct points in $X$,
\[ \Gamma \stackrel{df}{=} \big\{\gamma \in C\big([0,1]; X\big) : \gamma(0) = x_0,\ \gamma(1) = x_1\big\}, \qquad c \stackrel{df}{=} \inf_{\gamma \in \Gamma}\max_{t \in [0,1]}\varphi\big(\gamma(t)\big) \]
and
\[ c > \eta \stackrel{df}{=} \max\big\{\varphi(x_0), \varphi(x_1)\big\}, \]
then $K_c^\varphi \ne \emptyset$ and one of the following conditions holds:
(a) $K_c^\varphi$ contains a local minimizer of $\varphi$; or
(b) $K_c^\varphi$ contains a critical point of mountain pass type.

PROOF Since $c > \eta$, we can find $r > 0$, such that
\[ \inf_{\|x\|_X = r}\varphi(x) > \eta. \]
So by virtue of the mountain pass theorem (see Corollary 5.2.7), we have that $K_c^\varphi \ne \emptyset$. To prove the alternative part of the theorem, we argue by contradiction. Suppose that $K_c^\varphi$ contains neither local minimizers nor critical points of mountain pass type. Then for any $x \in K_c^\varphi$ we can find a neighbourhood $U(x)$ of $x$, such that the set $U(x) \cap \mathring{\varphi}^c$ is path connected. Since $\varphi$ satisfies the PS-condition, we have that $K_c^\varphi$ is compact. Because we do not have local minimizers, every point of $K_c^\varphi$ is a limit of points of $\mathring{\varphi}^c$, that is,
\[ K_c^\varphi \subseteq \overline{\mathring{\varphi}^c}. \]
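The collection $\{U(x)\}_{x \in K_c^\varphi}$ is of course an open cover of $K_c^\varphi$. So we can apply Lemma 5.3.3 and produce open sets $U_1, \ldots, U_N$, such that
\[ K_c^\varphi \subseteq \bigcup_{i=1}^N U_i, \qquad \text{if } U_i \cap U_j \ne \emptyset \text{ then } i = j \]
and $U_i \cap \mathring{\varphi}^c$ is contained in a path connected component of
\[ \Big(\bigcup_{x \in K_c^\varphi}U(x)\Big) \cap \mathring{\varphi}^c. \]
If $V = \bigcup_{i=1}^N U_i$, then $d_Y\big(\partial V, K_c^\varphi\big) > 0$. Also set $\widehat{U} \stackrel{df}{=} \bigcup_{x \in K_c^\varphi}U(x)$ and define
\[ \varepsilon_0 \stackrel{df}{=} \frac{1}{2}(c - \eta), \qquad \delta \stackrel{df}{=} \frac{1}{8}\min\Big\{d_Y\big((\partial\widehat{U}) \cup \{x_0, x_1\}, K_c^\varphi\big),\ d_Y\big(\partial V, K_c^\varphi\big)\Big\} \]
and
\[ U \stackrel{df}{=} \big\{x \in X : d_Y(x, K_c^\varphi) < \delta\big\}. \]
Then with the above data, we can apply Lemma 5.3.4 and obtain $\varepsilon \in (0, \varepsilon_0]$ and a $\varphi$-decreasing, locally Lipschitz homotopy of homeomorphisms $h_t$ as postulated by the lemma. Choose $\gamma \in \Gamma$, such that
\[ \varphi\big(\gamma(t)\big) \le c + \varepsilon \quad \forall\, t \in [0,1] \]
and define
\[ \Sigma \stackrel{df}{=} \big\{t \in [0,1] : \gamma(t) \notin U\big\}, \qquad D \stackrel{df}{=} \big(\widehat{U} \cap K_c^\varphi\big) \cup h\big(1, \gamma(\Sigma)\big). \]
Evidently $x_0, x_1 \in D$. Let $D_0$ be the path connected component of $D$ which contains $x_0$. We will show that $x_1 \in D_0$. If $\Sigma = [0,1]$, then we are done. So suppose that $\Sigma \ne [0,1]$. Note that $\Sigma$ is closed and let
\[ t^* \stackrel{df}{=} \sup\big\{t \in \Sigma : h\big(1, \gamma(t)\big) \in D_0\big\}. \]
Suppose that $t^* < 1$. Note that $0$ belongs to the relative interior of $\Sigma$ in $[0,1]$. Therefore $0 < t^* < 1$. Let $[t_1^*, t_2^*]$ be the component of $\Sigma$ containing $t^*$. If $t_1^* < t^*$, then $t^* = t_2^*$. Suppose that $t^* = t_1^*$. Then $\gamma(t_1^*) \in \partial U$ and so $h\big(1, \gamma(t_1^*)\big) \in \operatorname{int} D$. Hence there exists $\varepsilon > 0$, such that
\[ B_\varepsilon\big(h\big(1, \gamma(t_1^*)\big)\big) \subseteq D \]
and we must have
\[ B_\varepsilon\big(h\big(1, \gamma(t_1^*)\big)\big) \cap D_0 \ne \emptyset, \]
which means that
\[ h\big(1, \gamma(t_1^*)\big) = h\big(1, \gamma(t^*)\big) \in D_0. \]
It follows that
\[ h\big(1, \gamma(t_2^*)\big) \in D_0, \]
which means that
\[ t^* \ge t_2^* > t_1^*, \]
a contradiction. Therefore $t^* = t_2^*$ and
\[ h\big(1, \gamma(t^*)\big) \in D_0. \]
This means that $\gamma(t^*) \in \partial U$. Let $i_0 \in \{1, \ldots, N\}$ be such that
\[ d_Y\big(\gamma(t^*), U_{i_0} \cap K_c^\varphi\big) = \delta. \]
Let
\[ \widehat{t} \stackrel{df}{=} \sup\big\{t \in [0,1] : \gamma(t) \in U \cap U_{i_0}\big\}. \]
Because $t^* = t_2^*$, we must have $\widehat{t} \in (t^*, 1)$. Also
\[ \gamma(\widehat{t}) \in \partial\big(U \cap U_{i_0}\big) \]
and so $\widehat{t} \in \Sigma$. Then
\[ y = h\big(1, \gamma(\widehat{t})\big) \in U_{i_0} \cap \mathring{\varphi}^c \quad \text{and} \quad u = h\big(1, \gamma(t^*)\big) \in U_{i_0} \cap \mathring{\varphi}^c. \]
Since $U_{i_0} \cap \mathring{\varphi}^c$ is contained in a path connected component of
\[ \Big(\bigcup_{x \in K_c^\varphi}U(x)\Big) \cap \mathring{\varphi}^c, \]
we have that $y \sim_1 u$ (see the proof of Lemma 5.3.3). Moreover, $u \in D_0$ and so $x_0 \sim_1 u \sim_1 y$. From this it follows that $t^* \ge \widehat{t} > t^*$, a contradiction. Therefore we infer that $t^* = 1$, which implies that $x_1 \in D_0$. But this contradicts the definition of $c$ since
\[ D_0 \subseteq D \subseteq \mathring{\varphi}^c. \]
This concludes the proof of the theorem.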
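Another structural result for the set of critical points is the following one.

THEOREM 5.3.6 If $\varphi \in C^1(X)$ satisfies the PS-condition, $0$ is a local minimizer of $\varphi$ with $\varphi(0) = 0$ and $\varphi$ has a second local minimizer $x_1 \ne 0$, then one of the following holds:
(a) there is a critical point $x$ of $\varphi$, which is a saddle point (i.e., every neighbourhood of $x$ contains points $y$, $u$, such that $\varphi(y) < \varphi(x) < \varphi(u)$);
(b) the origin and $x_1$ can be path connected in any neighbourhood of the set of local minimizers $x$ of $\varphi$ with $\varphi(x) = 0$ (then necessarily $\varphi(x_1) = 0$).

PROOF Let
\[ \Gamma \stackrel{df}{=} \big\{\gamma \in C\big([0,1]; X\big) : \gamma(0) = 0,\ \gamma(1) = x_1\big\} \]
and let us set
\[ c \stackrel{df}{=} \inf_{\gamma \in \Gamma}\max_{t \in [0,1]}\varphi\big(\gamma(t)\big). \]
Suppose that $K_c^\varphi$ consists entirely of local minimizers of $\varphi$. So for every $x \in K_c^\varphi$ we can find a neighbourhood $U(x)$ of $x$, such that
\[ \varphi(x) = c \le \varphi(y) \quad \forall\, y \in U(x) \]
(see Definition 5.3.1(a)). Let $\widehat{U} \stackrel{df}{=} \bigcup_{x \in K_c^\varphi}U(x)$. For any neighbourhood $\widetilde{U}$ of $K_c^\varphi$ let $\varepsilon > 0$ and $h_t$ be as in Theorem 5.1.34 with $\varepsilon_0 = 1$ and $U = \widehat{U} \cap \widetilde{U}$. Choose $\gamma \in \Gamma$, such that
\[ \varphi\big(\gamma(t)\big) \le c + \varepsilon \quad \forall\, t \in [0,1]. \]
Let us set $\gamma_0 \stackrel{df}{=} h(1, \cdot) \circ \gamma$. Clearly $\gamma_0 \in \Gamma$ and we have
\[ \gamma_0\big([0,1]\big) \subseteq h\big(1, \varphi^{c+\varepsilon}\big) \subseteq \varphi^{c-\varepsilon} \cup U \subseteq \varphi^{c-\varepsilon} \cup \widehat{U}. \]
But the sets $\varphi^{c-\varepsilon}$ and $\widehat{U}$ are disjoint. So we must have
\[ \gamma_0\big([0,1]\big) \subseteq \varphi^{c-\varepsilon} \quad \text{or} \quad \gamma_0\big([0,1]\big) \subseteq U. \]
Since the first inclusion contradicts the definition of $c$, we must have
\[ \gamma_0\big([0,1]\big) \subseteq U \subseteq \widetilde{U}, \]
hence $0$ and $x_1$ can be path connected in any neighbourhood $\widetilde{U}$ of $K_c^\varphi$.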
5.4 Multiple Critical Points
In this section we present some useful results on the existence of multiple critical points for a smooth functional. The most powerful results in this direction can be obtained for functionals which are invariant under a group of symmetries. However, first we will present multiplicity results in which no symmetry conditions are assumed. Instead we employ a splitting condition for the functionals, which is known as “local linking at 0.” Let X be a Banach space and X ∗ its topological dual. DEFINITION 5.4.1 Suppose that X = Y ⊕V . We say that ϕ ∈ C 1 (X) has a local linking at 0, if there exists r > 0, such that ϕ(y) 6 0
∀ y ∈ Y, kykX 6 r
ϕ(v) > 0
∀ v ∈ V, kvkX 6 r.
REMARK 5.4.2 It is clear from the above definition that
\[
\langle \varphi'(0), h \rangle_X = 0 \quad \forall\, h \in Y
\qquad \text{and} \qquad
\langle \varphi'(0), u \rangle_X = 0 \quad \forall\, u \in V.
\]
Hence
\[
\langle \varphi'(0), x \rangle_X = 0 \quad \forall\, x \in X,
\]
i.e., $\varphi'(0) = 0$ in $X^*$. So $0$ is a critical point of $\varphi$. The notion of local linking generalizes the notions of local minimum and of local maximum. If $X = H$ is a Hilbert space, $\varphi \in C^2(H)$, $\varphi(0) = 0$ and $0$ is a nondegenerate critical point (i.e., the self-adjoint operator $\varphi''(0)$ is invertible), then clearly $\varphi$ has a local linking at 0.

We start with a geometric lemma which complements Example 5.2.3(c).

LEMMA 5.4.3 If $X = Y \oplus V$ with $\dim Y < +\infty$, $0 < \dim V \le +\infty$, $e \in V$, $\|e\|_X = 1$,
\[
C \stackrel{df}{=} \big\{ x = \lambda e + y : y \in Y,\ \lambda \ge 0,\ \|x\|_X \le 1 \big\},
\]
$r > 0$ and $\gamma : C \longrightarrow X$ is a continuous map, such that
\[
\gamma(y) = y \quad \forall\, y \in Y,\ \|y\|_X \le 1 \tag{5.23}
\]
and
\[
\|\gamma(x)\|_X \ge r > 0 \quad \forall\, x \in C,\ \|x\|_X = 1, \tag{5.24}
\]
then for any $\varrho \in (0, r)$, there exists $\hat{x} \in C$, such that
\[
\mathrm{proj}_Y\big(\gamma(\hat{x})\big) = 0 \quad \text{and} \quad \big\|(\mathrm{id}_X - \mathrm{proj}_Y)\gamma(\hat{x})\big\|_X = \varrho,
\]
where $\mathrm{proj}_Y \in \mathcal{L}(X)$ is the projection onto $Y$ (it exists since by hypothesis $Y$ is finite dimensional).

PROOF
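Let $Z = Y \oplus \mathbb{R}e$ and consider the map $\xi : C \longrightarrow Z$, defined by
\[
\xi(x) \stackrel{df}{=} \mathrm{proj}_Y\big(\gamma(x)\big) + \big\|(\mathrm{id}_X - \mathrm{proj}_Y)\gamma(x)\big\|_X\, e.
\]
To prove the lemma, it suffices to show that there exists $\hat{x} \in C$, such that $\xi(\hat{x}) = \varrho e$. From (5.23) and (5.24), we see that
\[
\xi(x) \neq \varrho e \quad \forall\, x \in \partial C.
\]
So the Brouwer degree $D_B\big(\xi, \mathrm{int}\, C, \varrho e\big)$ is defined. Let
\[
C_1 \stackrel{df}{=} \big\{ (y, 0) \in Z : y \in Y,\ \|y\|_X \le 1 \big\}
\qquad \text{and} \qquad
C_2 \stackrel{df}{=} \big\{ x \in C : \|x\|_X = 1 \big\}.
\]
Then $\partial C = C_1 \cup C_2$. Recall that Brouwer's degree depends only on the boundary values of the function $\xi$. So we consider $\xi|_{\partial C}$. From the definition of $\gamma$ and $\xi$, we see that
\[
\xi(x) = x \quad \forall\, x \in C_1
\qquad \text{and} \qquad
\|\xi(x)\|_X \ge r > 0 \quad \forall\, x \in C_2.
\]
Also let $\eta : \partial C \longrightarrow X$ be defined by
\[
\eta(x) \stackrel{df}{=}
\begin{cases}
x & \text{if } x \in C_1, \\[2pt]
\dfrac{\xi(x)}{\|\xi(x)\|_X} & \text{if } x \in C_2.
\end{cases}
\]
Evidently $\eta \in C\big(\partial C; X\big)$. Also note that the functions $\xi$ and $\eta$ are homotopic in $Z \setminus \{\varrho e\}$ through the homotopy
\[
h_1(t, x) \stackrel{df}{=} t\,\xi(x) + (1 - t)\,\eta(x) \quad \forall\, t \in [0,1].
\]
Note that
\[
\eta(C_2) \subseteq C_2 \qquad \text{and} \qquad \eta|_{\partial C_2} = \mathrm{id}_{\partial C_2}.
\]
Since $C_2$ is homeomorphic to a ball, there is a continuous deformation $h_2(t, x)$ connecting $\eta|_{C_2}$ and $\mathrm{id}_{C_2}$, such that
\[
h_2(t, x) = x \quad \forall\, t \in [0,1] \text{ and } x \in \partial C_2.
\]
Therefore $\xi|_{\partial C}$ is homotopic to the identity in $Z \setminus \{\varrho e\}$ and so
\[
D_B\big(\xi, \mathrm{int}\, C, \varrho e\big) = D_B\big(\mathrm{id}_C, \mathrm{int}\, C, \varrho e\big) = 1.
\]
Hence $\varrho e \in \xi(C)$ and we are done.

REMARK 5.4.4 The above lemma implies that $\partial C$ and $V \cap \partial B_{\varrho}(0)$ are linking in $X$ (see Definition 5.2.1).

We will need the following lemma. As in Section 5.1, by $V$ we denote a pseudogradient vector field associated to $\varphi \in C^1(X)$ on the set $\{x \in X : \varphi'(x) \neq 0\}$ (see Definition 5.1.16 and Theorem 5.1.19).

LEMMA 5.4.5 If $\varphi \in C^1(X)$ satisfies the PS-condition, $x_0 \in X$ is a unique global minimizer of $\varphi$ (i.e., $\varphi(x_0) < \varphi(x)$ for all $x \neq x_0$), $y \neq x_0$ is such that $\varphi'(y) \neq 0$ and $\varphi$ has no critical value in $\big(\varphi(x_0), \varphi(y)\big)$, then the negative pseudogradient flow, defined by
\[
\dot{u}(t) = -\frac{V(u(t))}{\|V(u(t))\|_X^2}, \qquad u(0) = y \tag{5.25}
\]
exists for a maximal finite time $t \in \big[0, b(y)\big]$ and $u\big(b(y)\big) = x_0$.

PROOF By replacing $\varphi$ by
\[
\varphi_1(x) \stackrel{df}{=} \varphi(x + x_0) - \varphi(x_0),
\]
if necessary, we may assume that
\[
x_0 = 0 \qquad \text{and} \qquad \varphi(x_0) = 0.
\]
From the definition of the pseudogradient vector field $V$ (see Definition 5.1.16) and (5.25), we have
\[
\begin{aligned}
\frac{d}{dt}\varphi\big(u(t)\big) &= \big\langle \varphi'\big(u(t)\big), \dot{u}(t) \big\rangle_X \\
&= \Big\langle \varphi'\big(u(t)\big), -\frac{V(u(t))}{\|V(u(t))\|_X^2} \Big\rangle_X \\
&\le -\frac{\|\varphi'(u(t))\|_{X^*}^2}{\|V(u(t))\|_X^2} < -\frac{1}{4}.
\end{aligned}
\tag{5.26}
\]
Then from Theorem 5.1.21 and (5.26), we have that the flow $u(t)$ exists on a maximal open interval $(0, b)$ and by integrating (5.26) and since $\varphi(x_0) = 0$ is a strict global minimum of $\varphi$, we see that $b \le 4\varphi(y)$. Moreover, we have
\[
0 < \varphi\big(u(t)\big) < \varphi(y) \quad \forall\, t \in (0, b).
\]
We claim that
\[
u(t) \longrightarrow 0 \quad \text{as } t \to b^-.
\]
First assume that $\varphi'$ is bounded away from zero along the flow $u$. So there exists $\delta > 0$, such that
\[
\big\|\varphi'\big(u(t)\big)\big\|_{X^*} \ge \delta \quad \forall\, t \in (0, b).
\]
Then
\[
\delta \le \big\|\varphi'\big(u(t)\big)\big\|_{X^*} \le \big\|V\big(u(t)\big)\big\|_X
\]
and so
\[
\int_0^b \|\dot{u}(t)\|_X \, dt \le \frac{1}{\delta^2} \int_0^b \big\|V\big(u(t)\big)\big\|_X \, dt < +\infty,
\]
a contradiction to Theorem 5.1.21. So $\varphi'$ cannot be bounded away from zero along the flow $u$. Hence we can find $t_n \longrightarrow b^-$, such that
\[
\big\|\varphi'\big(u(t_n)\big)\big\|_{X^*} \longrightarrow 0.
\]
Since the sequence $\big\{\varphi\big(u(t_n)\big)\big\}_{n \ge 1}$ is bounded and $\varphi$ satisfies the PS-condition, a subsequence of $\big\{u(t_n)\big\}_{n \ge 1}$ converges to a critical point of $\varphi$. Since by hypothesis $\varphi$ has no critical value in $\big(0, \varphi(y)\big]$, this critical point is $0$. Therefore
\[
\lim_{t \to b^-} \varphi\big(u(t)\big) = \varphi(x_0) = \varphi(0) = 0
\]
and from the Ekeland variational principle (see Theorem 4.6.1), we conclude that
\[
u(t) \longrightarrow x_0 = 0 \quad \text{as } t \to b^-.
\]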
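The bound $b \le 4\varphi(y)$ used in the proof of Lemma 5.4.5 can be checked by a short computation. The following is only a sketch in the notation of that proof, under the normalization $x_0 = 0$, $\varphi(x_0) = 0$ made there; it uses nothing beyond the estimate (5.26) and the fact that $\varphi \ge 0$ after normalization.

```latex
% Sketch: integrate the estimate (5.26) over [0, t] with 0 < t < b.
\begin{align*}
  \varphi\big(u(t)\big) - \varphi(y)
    &= \int_0^t \frac{d}{ds}\,\varphi\big(u(s)\big)\, ds
     < -\frac{t}{4}
     && \text{(by (5.26) and } u(0) = y\text{)} \\
  \Longrightarrow \quad t
    &< 4\Big( \varphi(y) - \varphi\big(u(t)\big) \Big)
     \le 4\,\varphi(y)
     && \text{(since } \varphi \ge \varphi(x_0) = 0\text{)} .
\end{align*}
% Letting t increase to b gives the bound on the maximal existence time:
\[
  b \le 4\,\varphi(y) .
\]
```

Without the normalization the same computation gives $b \le 4\big(\varphi(y) - \varphi(x_0)\big)$.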
Now we are ready for the first multiplicity result.

THEOREM 5.4.6 If $X = Y \oplus V$ and $\dim Y < +\infty$, $\varphi \in C^1(X)$, $\varphi(0) = 0$, $\varphi$ is bounded from below,
\[
\inf_{x \in X} \varphi(x) < 0,
\]
$\varphi$ satisfies the PS-condition and it has a local linking at 0, then $\varphi$ has at least two nonzero critical points.

PROOF From Theorem 5.2.10, we know that there exists $x_0 \in X$, such that
\[
\varphi(x_0) = \inf_{x \in X} \varphi(x) < 0 = \varphi(0).
\]
So $x_0 \neq 0$ and it is a critical point of $\varphi$. Also because of the local linking condition, we know that $0$ is another critical point of $\varphi$. Suppose that $0$ and $x_0$ are the only critical points of $\varphi$.

Case 1. $0 < \dim Y$, $0 < \dim V$.
Without any loss of generality, we may assume that $r = 1 < \|x_0\|_X$. Then for every $y \in Y$, with $\|y\|_X = 1$, we have $\varphi'(y) \neq 0$ and so by Lemma 5.4.5, the flow $u$ of (5.25) exists on a maximal open interval $\big(0, b(y)\big)$ with $b(y) \le -4\varphi(x_0)$. Also we claim that there exists $\delta > 0$, such that
\[
\overset{\circ}{\varphi}{}^{\varphi(x_0)+\delta} \subseteq \overline{B}_{\frac{\|x_0\|_X}{2}}(x_0). \tag{5.27}
\]
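Indeed, if no such $\delta > 0$ exists, we can find a minimizing sequence $\{y_n\}_{n \ge 1}$ for $\varphi$, such that
\[
\|y_n - x_0\|_X \ge \frac{\|x_0\|_X}{2} \quad \forall\, n \ge 1.
\]
By virtue of Corollary 4.6.8, we can find another minimizing sequence $\{x_n\}_{n \ge 1}$ of $\varphi$, such that
\[
\varphi'(x_n) \longrightarrow 0 \qquad \text{and} \qquad \frac{\|x_0\|_X}{4} \le \|x_n - x_0\|_X \quad \forall\, n \ge 1.
\]
Since $\varphi$ satisfies the PS-condition, we may assume that
\[
x_n \longrightarrow \hat{x} \quad \text{in } X.
\]
Then $\hat{x}$ is a critical point of $\varphi$ different from $0$ and $x_0$, since
\[
\varphi(\hat{x}) = \inf_{x \in X} \varphi(x) < 0 = \varphi(0) \qquad \text{and} \qquad \frac{\|x_0\|_X}{4} \le \|\hat{x} - x_0\|_X,
\]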
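a contradiction to our hypothesis in the beginning of the proof. Then by choosing $\delta > 0$ in (5.27) small enough, we can find a unique $t = t(y) < b(y)$, such that
\[
\varphi\big(u(t(y))\big) = \varphi(x_0) + \delta.
\]
The uniqueness property (which is a consequence of (5.26), since $\varphi$ is strictly decreasing along the flow) implies that the map $y \longmapsto t(y)$ is continuous. Let $e \in V$ be such that $\|e\|_X = 1$. Consider the set
\[
C \stackrel{df}{=} \big\{ x = \lambda e + y : y \in Y,\ \lambda \ge 0,\ \|x\|_X \le 1 \big\}.
\]
We define $\gamma_0 \in C\big(\partial C; X\big)$ in the following way:
\[
\gamma_0(y) \stackrel{df}{=} y \quad \forall\, y \in Y,\ \|y\|_X \le 1,
\qquad
\gamma_0(e) \stackrel{df}{=} x_0.
\]
Also, any $x \in \partial C$, $x \neq e$ with $\|x\|_X = 1$ has a unique representation $x = \lambda e + \mu y$, with $\lambda \in [0,1]$, $y \in Y$, $\|y\|_X = 1$ and $\mu \in (0,1]$ (i.e., the coefficients $\lambda$, $\mu$ and the vector $y$ are unique). Then we define
\[
\gamma_0(\lambda e + \mu y) \stackrel{df}{=} u\big(2\lambda\, t(y)\big) \quad \forall\, \lambda \in \Big[0, \frac{1}{2}\Big]
\]
(recall that $u$ is the pseudogradient flow; see (5.25)). Hence
\[
\gamma_0\Big(\frac{1}{2} e + \mu y\Big) = u\big(t(y)\big).
\]
Recall that
\[
\varphi\big(u(t(y))\big) = \varphi(x_0) + \delta.
\]
So from (5.27), it follows that
\[
\Big\| \gamma_0\Big(\frac{1}{2} e + \mu y\Big) - x_0 \Big\|_X \le \frac{1}{2}\|x_0\|_X.
\]
Also define
\[
\gamma_0(\lambda e + \mu y) \stackrel{df}{=} (2\lambda - 1)\,x_0 + (2 - 2\lambda)\,u\big(t(y)\big) \quad \forall\, \lambda \in \Big[\frac{1}{2}, 1\Big].
\]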
As $\lambda$ moves from $\frac{1}{2}$ to $1$, $\gamma_0(\lambda e + \mu y)$ traverses the line segment from $u\big(t(y)\big)$ to $x_0$ and so we have
\[
\big\|\gamma_0(\lambda e + \mu y) - x_0\big\|_X \le \frac{1}{2}\|x_0\|_X \quad \forall\, \lambda \in \Big[\frac{1}{2}, 1\Big].
\]
Clearly $\gamma_0$ is continuous and $\varphi|_{\gamma_0(\partial C)} \le 0$. Moreover, for some $r \le 1$, we have
\[
\|\gamma_0(x)\|_X \ge r > 0 \quad \forall\, \|x\|_X = 1.
\]
Let us fix $\varrho \in (0, r)$. Then by Lemma 5.4.3, the sets $\partial C$ and $V \cap \partial B_{\varrho}(0)$ are linking in $X$, i.e., for any continuous extension $\gamma$ of $\gamma_0$ to all of $C$, we have
\[
\gamma(C) \cap \big(V \cap \partial B_{\varrho}(0)\big) \neq \emptyset.
\]
Let
\[
\Gamma \stackrel{df}{=} \big\{ \gamma \in C(C; X) : \gamma|_{\partial C} = \gamma_0 \big\}
\qquad \text{and} \qquad
c \stackrel{df}{=} \inf_{\gamma \in \Gamma} \max_{x \in C} \varphi\big(\gamma(x)\big).
\]
From Theorem 5.2.4, we know that $c$ is a critical value of $\varphi$. If $c > 0$, then we have a second nonzero critical point of $\varphi$ (since for the other nonzero critical point $x_0$, we have $\varphi(x_0) < 0$). If $c = 0$, then from Theorem 5.2.4, we know that
\[
K_c^{\varphi} \cap \big(V \cap \partial B_{\varrho}(0)\big) \neq \emptyset
\]
and so again we have a second nonzero critical point of $\varphi$ (distinct from $x_0$).

Case 2. $\dim Y = 0$.
Let $x_0 \neq 0$ be a minimizer of $\varphi$ and suppose that it is the only nonzero critical point of $\varphi$. Then we can find $r > 0$ small enough, such that
\[
\varphi(x) \ge \beta_0 > 0 \quad \forall\, \|x\|_X = r.
\]
Since $\varphi(x_0) < \varphi(0) = 0$ and $\varphi$ satisfies the PS-condition, we can apply the mountain pass theorem (see Corollary 5.2.7) and obtain a second nonzero critical point of $\varphi$.

Case 3. $\dim V = 0$. (In this case we may allow $\dim Y = +\infty$.)
In this case we have
\[
\varphi(x) \le -\beta_0 < 0 \quad \forall\, \|x\|_X = r,
\]
with $r > 0$ small enough. Applying the mountain pass theorem (see Corollary 5.2.7) on $-\varphi$, we obtain a second nonzero critical point of $\varphi$.
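To see the statement of Theorem 5.4.6 at work in the simplest possible setting, the following sketch checks its hypotheses on a one-dimensional example. The example is not from the original text: the choices $X = Y = \mathbb{R}$, $V = \{0\}$ and the quartic $\varphi$ below are assumptions made for illustration, and they fall under Case 3 of the proof.

```latex
% Illustrative example (assumed, not from the text): X = Y = \mathbb{R}, V = \{0\}.
\[
  \varphi(x) = x^4 - x^2 = x^2 (x^2 - 1), \qquad \varphi(0) = 0 .
\]
% Local linking at 0 with r = 1:
\[
  \varphi(y) \le 0 \quad \forall\, |y| \le 1,
  \qquad
  \varphi(v) = \varphi(0) = 0 \ge 0 \quad \forall\, v \in V = \{0\} .
\]
% Critical points:
\[
  \varphi'(x) = 4x^3 - 2x = 2x\,(2x^2 - 1) = 0
  \iff x \in \Big\{ 0,\ \tfrac{1}{\sqrt{2}},\ -\tfrac{1}{\sqrt{2}} \Big\} .
\]
% Boundedness from below and negative infimum:
\[
  \inf_{x \in \mathbb{R}} \varphi(x)
  = \varphi\Big( \pm \tfrac{1}{\sqrt{2}} \Big)
  = \tfrac{1}{4} - \tfrac{1}{2}
  = -\tfrac{1}{4} < 0 .
\]
% \varphi is coercive on the finite dimensional space \mathbb{R}, so every
% PS-sequence is bounded and has a convergent subsequence (PS-condition).
% Conclusion: exactly two nonzero critical points, \pm 1/\sqrt{2}.
```

Here the two nonzero critical points are both global minimizers, which shows that the lower bound of two in the theorem cannot be improved in general.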
From the proof of Theorem 5.4.6, we deduce a somewhat stronger result.

THEOREM 5.4.7 If $X = Y \oplus V$ with $\dim Y < +\infty$, $\dim V > 0$, $\varphi \in C^1(X)$, $\varphi(0) = 0$, $\inf_{x \in X} \varphi(x) < 0$, $\varphi$ is bounded from below and satisfies the PS-condition, has a local linking at 0 and has only a finite number of critical points, each a local minimizer of $\varphi$ and on them we have $\varphi < 0$, then $\varphi$ has another nonzero critical point.

PROOF We argue indirectly. Suppose that $\varphi$ has no other nonzero critical point. Let $\{x_k\}_{k=1}^{N}$ be the critical points of $\varphi$, where $\varphi < 0$. Choose $\varrho > 0$ small enough so that the open balls $B_{\varrho}(x_k)$, $k \in \{1, \ldots, N\}$ are mutually disjoint and they do not contain the origin. In the local linking condition, we can take $r > 0$ small enough so that $\overline{B}_r(0)$ is disjoint from all the open balls $B_{\varrho}(x_k)$. As in the proof of Theorem 5.4.6, we can find $\delta > 0$, small enough, so that the component $D_k$ containing $x_k$ of $\overset{\circ}{\varphi}{}^{\varphi(x_k)+\delta}$ is contained in $B_{\varrho}(x_k)$ for $k \in \{1, \ldots, N\}$. Let $y \in Y$ with $\|y\|_X = r$. Then $\varphi'(y) \neq 0$ and considering the negative pseudogradient flow starting at $y$ (see (5.25)), we know that it exists for a finite time $b(y) \le -4 \min \varphi$. There is a unique time $t = t(y)$, such that
\[
x\big(t(y)\big) \text{ first touches } \bigcup_{k=1}^{N} \partial D_k.
\]
By renumbering things if necessary, we may assume that
\[
x\big(t(y)\big) \in \partial D_0.
\]
Then
\[
x(t) \in D_0 \quad \forall\, t \in \big(t(y), b(y)\big).
\]
So the flow which starts at $y$ ends up in $D_0$. From the continuous dependence of the flow on the initial condition, it follows that for $v \in Y$ close to $y$ on $\partial B_r(0)$, the negative pseudogradient flow starting at $v$ will end up in $D_0$. Since the set $Y \cap \partial B_r(0)$ is connected, starting from any point in that set, the negative pseudogradient flow ends up in $D_0$. Then we can continue as in the proof of Theorem 5.4.6 (follow the argument after (5.27)) and produce a nonzero critical point of $\varphi$ where $\varphi \ge 0$.

COROLLARY 5.4.8 If $X = Y \oplus V$, $\dim Y < +\infty$, $\dim V > 0$ and $\varphi \in C^1(X)$ satisfies all the conditions of Theorem 5.4.6 and in addition $\varphi$ is even, then $\varphi$ has at least two pairs $\{\pm x\}$, $\{\pm w\}$ of nonzero critical points.
To state another consequence of Theorem 5.4.6, we need to introduce some notions and state a result, which is the starting point of the so-called "Morse Theory."

DEFINITION 5.4.9 Let $H$ be a Hilbert space, $\varphi \in C^2(H)$ and let $x$ be a critical point of $\varphi$.
(a) We say that $x$ is nondegenerate, if $\varphi''(x)$ is invertible.
(b) The Morse index of $x$ is defined as the supremum of the dimensions of the vector subspaces of $H$ on which $\varphi''(x)$ is negative definite.

THEOREM 5.4.10 (Morse Lemma) If $H$ is a Hilbert space, $\varphi \in C^2(H)$ and $0$ is a nondegenerate critical point of $\varphi$, then there exists a Lipschitz continuous homeomorphism $h$ of a neighbourhood $W$ of $0$ onto a neighbourhood $U$ of $0$, such that $h(0) = 0$ and
\[
\varphi\big(h(x)\big) = \varphi(0) + \frac{1}{2}\big(\varphi''(0)x, x\big)_H.
\]

REMARK 5.4.11 Let $H_+$ be the subspace of $H$ where $\varphi''(0)$ is positive definite and $H_-$ the subspace of $H$ where $\varphi''(0)$ is negative definite. Since $0$ is a nondegenerate critical point, we have that $H = H_+ \oplus H_-$. Let $P_{H_+}$ be the orthogonal projection on $H_+$. If on $H$ we use the equivalent norm
\[
|x|_H = \big| \langle \varphi''(0)x, x \rangle_H \big| \quad \forall\, x \in H,
\]
then we can equivalently rewrite the conclusion of the Morse lemma as
\[
\varphi\big(h(x)\big) = \varphi(0) + \frac{1}{2}\big|P_{H_+}(x)\big|_H^2 - \frac{1}{2}\big|\big(\mathrm{id}_H - P_{H_+}\big)(x)\big|_H^2. \tag{5.28}
\]
From (5.28), it is clear that if $0$ is a nondegenerate critical point of $\varphi \in C^2(H)$ with a finite Morse index, then the splitting in the local linking condition is assured. So we can state the following theorem.

THEOREM 5.4.12 If $H$ is a Hilbert space, $\varphi \in C^2(H)$ is bounded from below, $0$ is a nondegenerate critical point of $\varphi$ with finite Morse index and
\[
\inf_{x \in H} \varphi(x) < \varphi(0) = 0,
\]
then $\varphi$ has at least two nonzero critical points.
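The notions of nondegeneracy and Morse index can be checked by hand on a small example. The sketch below is an illustration that is not taken from the text: the space $H = \mathbb{R}^2$ and the function $\varphi$ are assumed choices. Since $\varphi$ is coercive on a finite dimensional space, it also satisfies the PS-condition required in Theorem 5.4.6.

```latex
% Illustrative example (assumed, not from the text): H = \mathbb{R}^2.
\[
  \varphi(x, y) = x^4 - x^2 + y^2 , \qquad \varphi(0,0) = 0 .
\]
% Gradient and critical points:
\[
  \nabla \varphi(x, y) = \big( 4x^3 - 2x,\ 2y \big) = (0, 0)
  \iff (x, y) \in \Big\{ (0,0),\ \big( \tfrac{1}{\sqrt{2}}, 0 \big),\ \big( -\tfrac{1}{\sqrt{2}}, 0 \big) \Big\} .
\]
% Hessian at the origin: invertible, one negative eigenvalue.
\[
  \varphi''(0,0) =
  \begin{pmatrix} -2 & 0 \\ 0 & 2 \end{pmatrix},
  \qquad
  H_- = \mathbb{R} \times \{0\}, \quad H_+ = \{0\} \times \mathbb{R} .
\]
% Hence 0 is nondegenerate with Morse index 1.
% Boundedness from below and negative infimum:
\[
  \varphi(x, y) \ge x^4 - x^2 \ge -\tfrac{1}{4},
  \qquad
  \inf_{\mathbb{R}^2} \varphi = \varphi\big( \pm \tfrac{1}{\sqrt{2}}, 0 \big) = -\tfrac{1}{4} < 0 = \varphi(0,0) .
\]
% The two nonzero critical points are (\pm 1/\sqrt{2}, 0).
```

The splitting $H = H_+ \oplus H_-$ read off from the Hessian is exactly the decomposition $X = Y \oplus V$ with $Y = H_-$, $V = H_+$ in the local linking condition.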
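Thus far we have assumed that in the direct sum decomposition $X = Y \oplus V$ we have $\dim Y < +\infty$. Now we drop this requirement. We start with a quantitative deformation lemma, which is an immediate consequence of Corollary 5.1.28.

LEMMA 5.4.13 If $\varphi \in C^1(X)$, $C \subseteq X$, $\varepsilon, \delta > 0$, $c \in \mathbb{R}$ and
\[
\|\varphi'(x)\|_{X^*} \ge \frac{4\varepsilon}{\delta} \quad \forall\, x \in \varphi^{-1}\big([c - 2\varepsilon, c + 2\varepsilon]\big) \cap C_{2\delta},
\]
then there exists a $\varphi$-decreasing and locally Lipschitz homotopy of homeomorphisms $h_t$, such that
(a) $\varphi\big(h(t, x)\big) < c$ for all $(t, x) \in [0,1] \times \big(\varphi^c \cap C\big)$;
(b) $h\big(1, \varphi^{c+\varepsilon} \cap C\big) \subseteq \varphi^{c-\varepsilon}$;
(c) $\|h(t, x) - x\|_X \le \delta$ for all $(t, x) \in [0,1] \times X$.

Now suppose that
\[
X = Y \oplus V.
\]
Consider two sequences $\{Y_n\}_{n \ge 1}$ and $\{V_n\}_{n \ge 1}$ of subsets of $Y$ and $V$ respectively, such that
\[
Y_1 \subseteq Y_2 \subseteq \ldots \subseteq Y_n \subseteq \ldots \subseteq Y, \qquad
V_1 \subseteq V_2 \subseteq \ldots \subseteq V_n \subseteq \ldots \subseteq V
\]
and
\[
\bigcup_{n=1}^{\infty} Y_n = Y \qquad \text{and} \qquad \bigcup_{n=1}^{\infty} V_n = V.
\]
For a given multiindex $\alpha = (\alpha_1, \alpha_2) \in \mathbb{N}^2$, we set $X_{\alpha} = Y_{\alpha_1} \oplus V_{\alpha_2}$. Recall that on the space of multiindices $\mathbb{N}^2$ we use the coordinatewise ordering, defined by
\[
\alpha \le \beta \iff \big[\alpha_1 \le \beta_1,\ \alpha_2 \le \beta_2\big].
\]

DEFINITION 5.4.14 A sequence of multiindices $\{\alpha_n\}_{n \ge 1} \subseteq \mathbb{N}^2$ is admissible, if for every $\alpha \in \mathbb{N}^2$ there is an integer $n_0 \ge 1$ such that $\alpha_n \ge \alpha$, for every $n \ge n_0$. Also for a given function $\psi : X \longrightarrow \mathbb{R}$ and multiindex $\alpha \in \mathbb{N}^2$, by $\psi_{\alpha}$ we denote the restriction of $\psi$ on $X_{\alpha}$.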
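The notion of admissibility is easy to test on concrete sequences. The following sketch is an illustration only: the three sequences are assumed examples, not taken from the text, and the ordering is the coordinatewise one recalled above.

```latex
% Illustrative examples (assumed, not from the text).
% (1) The diagonal sequence is admissible:
\[
  \alpha_n = (n, n): \quad
  \text{given } \alpha = (a_1, a_2), \text{ take } n_0 = \max\{a_1, a_2\};
  \text{ then } \alpha_n \ge \alpha \quad \forall\, n \ge n_0 .
\]
% (2) A sequence with a slower first coordinate is still admissible:
\[
  \alpha_n = \big( \lceil n/2 \rceil,\ n \big): \quad
  \text{take } n_0 = \max\{ 2a_1, a_2 \};
  \text{ then } \lceil n/2 \rceil \ge a_1 \text{ and } n \ge a_2 \quad \forall\, n \ge n_0 .
\]
% (3) A sequence with a frozen coordinate is not admissible:
\[
  \alpha_n = (n, 1): \quad
  \text{for } \alpha = (1, 2) \text{ no } n \text{ satisfies } \alpha_n \ge \alpha,
  \text{ since the second coordinate is always } 1 < 2 .
\]
```

In words, a sequence is admissible exactly when both coordinates tend to infinity, so that the approximating spaces $X_{\alpha_n}$ eventually contain every $X_{\alpha}$.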
We will use the following generalization of the Palais-Smale compactness condition.

DEFINITION 5.4.15 Let $\varphi \in C^1(X)$. We say that $\varphi$ satisfies the PS$^*_c$-condition, if for every sequence $\{x_{\alpha_n}\}_{n \ge 1} \subseteq X$, such that $\{\alpha_n\}_{n \ge 1} \subseteq \mathbb{N}^2$ is admissible and $x_{\alpha_n} \in X_{\alpha_n}$ for $n \ge 1$,
\[
\varphi\big(x_{\alpha_n}\big) \longrightarrow c \qquad \text{and} \qquad \varphi'_{\alpha_n}(x_{\alpha_n}) \longrightarrow 0,
\]
we can extract a strongly convergent subsequence. We say that $\varphi$ satisfies the PS$^*$-condition if it satisfies the PS$^*_c$-condition at every level $c \in \mathbb{R}$.

REMARK 5.4.16 If
\[
Y_n = X \qquad \text{and} \qquad V_n = \{0\} \quad \forall\, n \ge 1,
\]
then the above definition reduces to the usual Palais-Smale condition (local and global). In the previous definitions and in what follows we can replace $\mathbb{N}^2$ by any directed set.

DEFINITION 5.4.17 Let $C$ and $D$ be two closed subsets in $X$. Then $C \prec_{\infty} D$, if there exists $\beta \in \mathbb{N}^2$, such that for every $\alpha \ge \beta$, we can find $h_{\alpha} \in C\big([0,1] \times X_{\alpha}; X_{\alpha}\big)$, such that
(a) $h_{\alpha}(0, x) = x$ for all $x \in X_{\alpha}$;
(b) $h_{\alpha}(1, x) \in D$ for all $x \in X_{\alpha} \cap C$.

This definition with the help of the corresponding Palais-Smale type condition in Definition 5.4.15 can extend the deformation lemma to the present Galerkin-type setting.

LEMMA 5.4.18 If $\varphi \in C^1(X)$, $c \in \mathbb{R}$, $r > 0$, $U$ is a neighbourhood of $K_c^{\varphi}$ and $\varphi$ satisfies the PS$^*_c$-condition, then for all $\varepsilon > 0$ small enough, we have
\[
\varphi^{c+\varepsilon} \setminus U \prec_{\infty} \varphi^{c-\varepsilon}
\]
and the corresponding homotopies $h_{\alpha} \in C\big([0,1] \times X_{\alpha}; X_{\alpha}\big)$ satisfy
\[
\|h_{\alpha}(t, x) - x\|_X \le r \quad \forall\, (t, x) \in [0,1] \times X_{\alpha}
\]
and
\[
\varphi\big(h_{\alpha}(t, x)\big) < c \quad \forall\, (t, x) \in [0,1] \times \big(\varphi_{\alpha}^{c} \setminus U\big).
\]
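PROOF Since $\varphi$ satisfies the PS$^*_c$-condition, we can find $\varrho > 0$ and $\beta \in \mathbb{N}^2$, such that for every $\alpha \ge \beta$, we have
\[
\|\varphi'_{\alpha}(x)\|_{X^*} \ge \varrho \quad \forall\, x \in \varphi_{\alpha}^{-1}\big([c - 2\varrho, c + 2\varrho]\big) \cap \big(X_{\alpha} \setminus U\big)_{2\varrho},
\]
where
\[
A_{\delta} \stackrel{df}{=} \big\{ x \in X : d_X(x, A) \le \delta \big\}.
\]
Choose
\[
\delta \stackrel{df}{=} \min\Big\{ \frac{\varrho}{2}, 4, r \Big\} \qquad \text{and} \qquad \varepsilon \in \Big(0, \frac{\delta \varrho}{4}\Big]
\]
and apply Lemma 5.4.13 with $C = X_{\alpha} \setminus U$.

LEMMA 5.4.19 If $\varphi \in C^1(X)$ is bounded below,
\[
c = \inf_{x \in X} \varphi(x)
\]
and $\varphi$ satisfies the PS$^*_c$-condition, then $c$ is a critical value of $\varphi$.

PROOF Suppose that $c$ is not a critical value of $\varphi$. Then by virtue of Lemma 5.4.18 (with $U = \emptyset$), we can find $\varepsilon > 0$, such that
\[
\varphi^{c+\varepsilon} \prec_{\infty} \varphi^{c-\varepsilon}.
\]
From the definition of $c$, we see that for all $\alpha \in \mathbb{N}^2$, $\varphi_{\alpha}^{c-\varepsilon}$ is empty, while for all $\alpha \in \mathbb{N}^2$ large enough, $\varphi_{\alpha}^{c+\varepsilon}$ is nonempty. This contradicts Definition 5.4.17.

LEMMA 5.4.20 If $\varphi \in C^1(X)$ is bounded from below and satisfies the PS$^*$-condition, then $\varphi$ is weakly coercive.

PROOF We proceed by contradiction. Suppose that $\varphi$ is not weakly coercive. Then
\[
\xi \stackrel{df}{=} \sup\big\{ \lambda \in \mathbb{R} : \varphi^{\lambda} \text{ is bounded} \big\} < +\infty.
\]
Because of the PS$^*$-condition, $K_{\xi}^{\varphi}$ is bounded. Let $U$ be a bounded neighbourhood of $K_{\xi}^{\varphi}$. By Lemma 5.4.18 with $r = 1$, we can find $\varepsilon > 0$, such that
\[
\varphi^{\xi+\varepsilon} \setminus U \prec_{\infty} \varphi^{\xi-\varepsilon}.
\]
Note that the set $\varphi^{\xi+\varepsilon} \setminus U$ is unbounded, while
\[
\varphi^{\xi-\varepsilon} \subseteq \overline{B}_R(0),
\]
for some $R > 0$. Since
\[
\|h_{\alpha}(t, x) - x\|_X \le 1 \quad \forall\, (t, x) \in [0,1] \times X_{\alpha},
\]
for all $\alpha \in \mathbb{N}^2$ large enough, we have
\[
\varphi_{\alpha}^{\xi+\varepsilon} \setminus U \subseteq \overline{B}_{R+1}(0),
\]
hence
\[
\varphi^{\xi+\varepsilon} \setminus U \subseteq \overline{B}_{R+1}(0),
\]
a contradiction.

Now we are ready for the version of Theorem 5.4.7, where we drop the condition that the component space $Y$ is finite dimensional. Note that if $\varphi$ has a local linking at 0 and satisfies the PS$^*$-condition, then we assume that the same direct sum decomposition of $X$ holds for the two properties and that
\[
\dim Y_n < +\infty \qquad \text{and} \qquad \dim V_n < +\infty \quad \forall\, n \ge 1.
\]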
THEOREM 5.4.21 If $\varphi \in C^1(X)$ satisfies the following assumptions:
(i) $\varphi$ has a local linking at 0;
(ii) $\varphi$ satisfies the PS$^*$-condition;
(iii) $\varphi$ is bounded (i.e., maps bounded sets into bounded sets);
(iv) $\varphi$ is bounded from below and
\[
c \stackrel{df}{=} \inf_{x \in X} \varphi(x) < 0,
\]
then $\varphi$ has at least two distinct nontrivial critical points.

PROOF In the decomposition $X = Y \oplus V$ we assume that $\dim Y > 0$ and $\dim V > 0$, since the analysis of the other cases is similar. From Lemma 5.4.19, we know that there exists $x_0 \in X$, such that
\[
\varphi(x_0) = \inf_{x \in X} \varphi(x).
\]
Evidently $x_0 \neq 0$. Suppose that $\{0, x_0\}$ are the only critical points of $\varphi$. Let $\varrho$
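$(m_0, m_0)$ and satisfy
\[
\|h_{\alpha}(t, x) - x\|_X \le \frac{\varrho}{2} \quad \forall\, (t, x) \in [0,1] \times X_{\alpha}. \tag{5.33}
\]
Since $\varphi$ satisfies the PS$^*$-condition, we can find $m_1 > m_0$ and $\delta > 0$, such that
\[
\|\varphi'_{\alpha}(x)\|_{X^*} \ge \delta \quad \forall\, x \in \varphi_{\alpha}^{-1}\big([c + \varepsilon, -\varepsilon]\big). \tag{5.34}
\]
Let us fix $n > m_1$ and let
\[
\alpha \stackrel{df}{=} (n, n).
\]
From (5.29) and (5.32), we have that
\[
\varphi_{\alpha}^{c+\varepsilon} \subseteq X_{\alpha} \cap \overline{B}_{\varrho}(x_0) \subseteq \varphi_{\alpha}^{\frac{c}{2}}. \tag{5.35}
\]
Let
\[
S_n^1 \stackrel{df}{=} \big\{ y \in Y_n : \|y\|_X = \varrho \big\}, \qquad
S_n^2 \stackrel{df}{=} \big\{ v \in V_n : \|v\|_X = \varrho \big\}.
\]
Using (5.30) and (5.34), we can have a homotopy
\[
\xi \in C\big([0,1] \times S_n^2; X_{\alpha}\big),
\]
such that
\[
\varphi\big(\xi(t, x)\big) < 0 \quad \forall\, (t, x) \in (0,1] \times S_n^2 \tag{5.36}
\]
and
\[
\varphi\big(\xi(1, x)\big) = c + \varepsilon \quad \forall\, x \in S_n^2. \tag{5.37}
\]
Also let
\[
D_n^1 \stackrel{df}{=} \big\{ y \in Y_n : \|y\|_X \le \varrho \big\}, \qquad
D_n^2 \stackrel{df}{=} \big\{ v \in V_n : \|v\|_X \le \varrho \big\}.
\]
From (5.35), we infer that there exists $\eta \in C\big(D_n^2; X_{\alpha}\big)$, such that
\[
\eta(x) = \xi(1, x) \quad \forall\, x \in S_n^2 \tag{5.38}
\]
and
\[
\eta\big(D_n^2\big) \subseteq X_{\alpha} \cap \overline{B}_{\varrho}(x_0) \subseteq \varphi_{\alpha}^{\frac{c}{2}}. \tag{5.39}
\]
Let
\[
E \stackrel{df}{=} [0,1] \times D_n^2
\]
and define $\sigma : \partial E \longrightarrow \varphi_{\alpha}^{0}$ as follows
\[
\sigma(t, x) \stackrel{df}{=}
\begin{cases}
x & \text{if } (t, x) \in \{0\} \times D_n^2, \\
\xi(t, x) & \text{if } (t, x) \in (0,1) \times S_n^2, \\
\eta(x) & \text{if } (t, x) \in \{1\} \times D_n^2.
\end{cases}
\]
From Lemma 5.4.20, we know that we can find $R > 0$, such that
\[
\varphi^{0} \subseteq \overline{B}_R(0).
\]
By Dugundji's extension theorem (see Theorem 3.1.11), we can find $\hat{\sigma} \in C(E; X_{\alpha})$, such that
\[
\hat{\sigma}|_{\partial E} = \sigma
\]
and
\[
\sup_{x \in E} \big(\varphi \circ \hat{\sigma}\big)(x) \le c_0 = \sup_{x \in \overline{B}_R(0)} \varphi(x) < +\infty \tag{5.40}
\]
(recall that $\varphi$ maps bounded sets into bounded sets). Let $\vartheta$ (depending on $\alpha$) be the homotopy corresponding to (5.31). We claim that if $S = \vartheta(1, S_n^1)$, for any extension $\overline{\sigma} \in C(E; X_{\alpha})$ of $\sigma$, we have
\[
\overline{\sigma}(E) \cap S \neq \emptyset,
\]
i.e., the sets $\partial E$ and $S$ are linking in $X$ through $E$. Suppose that this is not the case. Then
\[
\vartheta(1, y) \neq \overline{\sigma}(t, v) \quad \forall\, t \in [0,1],\ y \in S_n^1,\ v \in S_n^2. \tag{5.41}
\]
From (5.31), (5.33), (5.36) and (5.37), we have that (5.41) holds for all $t \in [0,1]$, $y \in D_n^1$, $v \in S_n^2$. Also from (5.38) and (5.39), it follows that (5.41) holds for $t = 1$ and for all $y \in D_n^1$, $v \in D_n^2$. Then, if
\[
U_0 \stackrel{df}{=} \mathrm{int}\, D_n^1 \times \mathrm{int}\, D_n^2
\qquad \text{and} \qquad
\gamma(t, x) \stackrel{df}{=} \vartheta(1, y) - \overline{\sigma}(t, v),
\]
from the properties of Brouwer's degree, we have
\[
D_B\big(\gamma(0, \cdot), U_0, 0\big) = D_B\big(\gamma(1, \cdot), U_0, 0\big) = 0. \tag{5.42}
\]
From (5.33), we obtain that
\[
\vartheta(t, y) \neq v \quad \forall\, t \in [0,1],\ y \in D_n^1,\ v \in S_n^2. \tag{5.43}
\]
From (5.31), we see that (5.43) holds for all $t \in [0,1]$, $y \in S_n^1$, $v \in D_n^2$. Let us define
\[
g \in C\big([0,1] \times U_0; X_{\alpha}\big),
\]
by
\[
g(t, x) \stackrel{df}{=} \vartheta(t, y) - v,
\]
with $x = y + v$. Using (5.42) and the homotopy invariance of Brouwer's degree, we have
\[
0 = D_B\big(g(1, \cdot), U_0, 0\big) = D_B\big(g(0, \cdot), U_0, 0\big) = D_B\big(\mathrm{proj}_{Y_n} - \mathrm{proj}_{V_n}, U_0, 0\big) \neq 0,
\]
a contradiction (here $\mathrm{proj}_{Y_n}$ and $\mathrm{proj}_{V_n}$ are the projection operators on $Y_n$ and $V_n$ respectively). So the linking property holds. Let
\[
\Gamma \stackrel{df}{=} \big\{ \overline{\sigma} \in C\big(E; X_{\alpha}\big) : \overline{\sigma}|_{\partial E} = \sigma \big\}
\]
and define
\[
\hat{c} \stackrel{df}{=} \inf_{\overline{\sigma} \in \Gamma} \sup_{w \in E} \varphi\big(\overline{\sigma}(w)\big).
\]
From Theorem 5.2.4, we know that $\hat{c}$ is a critical value of $\varphi$. Also from (5.40) and the previous argument, we have that $\varepsilon \le \hat{c} \le c_0$. Since $\varphi$ satisfies the PS$^*$-condition, we can find $m_2 > m_1$ and $\beta > 0$, such that for $\alpha \ge (m_2, m_2)$, we have
\[
\|\varphi'_{\alpha}(x)\|_{X^*} \ge \beta > 0 \quad \forall\, x \in \varphi_{\alpha}^{-1}\big([\varepsilon, c_0]\big). \tag{5.44}
\]
Then (5.44) contradicts the fact that $\hat{c} \in [\varepsilon, c_0]$ is a critical value of $\varphi$. Therefore we conclude that $\varphi$ has at least one more (nontrivial) critical point.

COROLLARY 5.4.22 If $\varphi \in C^1(X)$ satisfies the PS$^*$-condition, maps bounded sets into bounded sets and has a global minimizer and a local maximizer, then $\varphi$ has a third critical point.

If $\varphi$ is not bounded from below but has a local linking at 0, we can still find a nontrivial critical point.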
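THEOREM 5.4.23 If $\varphi \in C^1(X)$ satisfies the following assumptions:
(i) $\varphi$ has a local linking at 0 and $Y \neq \{0\}$;
(ii) $\varphi$ satisfies the PS$^*$-condition;
(iii) $\varphi$ maps bounded sets into bounded sets;
(iv) for every $m \ge 1$, if $x \in Y_m \oplus V$ and $\|x\|_X \longrightarrow +\infty$, we have that $\varphi(x) \longrightarrow -\infty$ (i.e., $-\varphi|_{Y_m \oplus V}$ is weakly coercive),
then $\varphi$ has at least one nontrivial critical point.

PROOF We suppose that $\dim Y = +\infty$ and $\dim V > 0$. The other cases are similar and simpler. Suppose that $0$ is the only critical point of $\varphi$ (recall that due to the local linking condition the origin is a critical point of $\varphi$). As before using Lemma 5.4.18 on the functions $\varphi$ and $\psi = -\varphi$, we obtain $\varepsilon > 0$, such that for $\varrho > 0$, we have
\[
\varphi^{\varepsilon} \setminus B_{\frac{\varrho}{3}}(0) \prec_{\infty} \varphi^{-\varepsilon} \tag{5.45}
\]
and
\[
\psi^{\varepsilon} \setminus B_{\frac{\varrho}{3}}(0) \prec_{\infty} \psi^{-\varepsilon}. \tag{5.46}
\]
We can assume that the corresponding homotopies exist for $\alpha \ge (m_0, m_0)$ and satisfy Lemma 5.4.18 with $r = \frac{\varrho}{2}$ (see (5.33)). Because $\varphi$ satisfies the PS$^*$-condition, we can find $m_1 > m_0$ and $\delta > 0$, such that for $\alpha \ge (m_1, m_1)$, we have
\[
\|\varphi'_{\alpha}(x)\|_{X^*} \ge \delta \quad \forall\, x \in \varphi_{\alpha}^{-1}\big((-\infty, -\varepsilon]\big). \tag{5.47}
\]
Due to the weak coercivity of $-\varphi|_{Y_{m_1+1} \oplus V}$ (see hypothesis (iv)), we can find $R > 0$, such that
\[
\varphi(x) \le -\varepsilon \quad \forall\, x \in Y_{m_1+1} \oplus V,\ \|x\|_X \ge \frac{R}{\sqrt{2}}. \tag{5.48}
\]
Let
\[
\mu \stackrel{df}{=} \inf_{\|x\|_X \le R} \varphi(x). \tag{5.49}
\]
Since $\varphi$ is bounded, we see that $\mu$ is finite. Without any loss of generality, we may assume that $Y_{m_1} \neq Y_{m_1+1}$ and the norm on $Y_{m_1+1}$ is Euclidean. Let us set $\alpha = (m_1, n)$ with $n > m_1 + 1$ fixed. Using (5.45) and (5.47), we obtain a homotopy
\[
\xi \in C\big([0,1] \times S_n^2; X_{\alpha}\big),
\]
such that
\[
\varphi\big(\xi(t, x)\big) < 0 \quad \forall\, (t, x) \in (0,1] \times S_n^2 \tag{5.50}
\]
and
\[
\varphi\big(\xi(1, x)\big) = \mu - 1 \quad \forall\, x \in S_n^2. \tag{5.51}
\]
The sets $S_n^1$ and $S_n^2$ are as in the proof of Theorem 5.4.21. Also let $D_n^1$, $D_n^2$ and $E$ be as in the proof of Theorem 5.4.21 and define $\sigma \in C\big(\partial E; X_{\alpha}\big)$, by
\[
\sigma(t, x) \stackrel{df}{=}
\begin{cases}
x & \text{if } (t, x) \in \{0\} \times D_n^2, \\
\xi(2t, x) & \text{if } (t, x) \in \big(0, \frac{1}{2}\big] \times S_n^2, \\
2(1 - t)\,\xi(1, x) + (2t - 1)\,u_0 & \text{if } (t, x) \in \big(\frac{1}{2}, 1\big) \times S_n^2, \\
u_0 & \text{if } (t, x) \in \{1\} \times D_n^2,
\end{cases}
\]
where $u_0 \in Y_{m_1+1} \setminus Y_{m_1}$ and $\|u_0\|_X > R$. From (5.48), (5.49), (5.50) and (5.51), we have that $\sigma(\partial E) \subseteq \varphi^{0}$. Moreover, $\sigma(\partial E) \subseteq Y_{m_1+1} \oplus V_n$. So there exists a continuous extension $\hat{\sigma} : E \longrightarrow Y_{m_1+1} \oplus V_n$ of $\sigma$, such that
\[
\sup_{x \in E} \varphi\big(\hat{\sigma}(x)\big) \le c_0 = \sup_{x \in Y_{m_1+1} \oplus V} \varphi(x). \tag{5.52}
\]
From hypotheses (iii) and (iv), we see that $c_0$ is finite. As before, we can verify that $\partial E$ and $S = \vartheta(1, S_n^1)$ are linking in $X$ via $E$. Here $\vartheta$ (depending on $\beta = (n, n)$) is the homotopy determined from (5.46). Finally let
\[
\Gamma \stackrel{df}{=} \big\{ \overline{\sigma} \in C\big(E; X_{\beta}\big) : \overline{\sigma}|_{\partial E} = \sigma \big\}
\]
and define
\[
\hat{c} \stackrel{df}{=} \inf_{\overline{\sigma} \in \Gamma} \sup_{w \in E} \varphi\big(\overline{\sigma}(w)\big).
\]
Again $\varepsilon \le \hat{c} \le c_0$ and arguing as in the last part of the proof of Theorem 5.4.21, we reach a contradiction. Therefore $\varphi$ has at least one nontrivial critical point.

Next we present a generalization of the saddle point theorem (see Corollary 5.2.8).

THEOREM 5.4.24 If $X = Y \oplus V$ with $\dim Y < +\infty$, $Y = W + \mathbb{R}e$ with $e \neq 0$, $\varphi \in C^1(X)$, there exists $r > 0$, such that
\[
-\infty < a \le \inf_{x \in \mathbb{R}_+ e \oplus V} \varphi(x) \le \sup_{x \in \partial B_r(0) \cap Y} \varphi(x) \le \inf_{x \in V} \varphi(x)
\]
and $\varphi$ satisfies the PS-condition, then $\varphi$ admits at least two distinct critical points $x_0$ and $x_1$, such that
\[
-\infty < a \le \varphi(x_0) \le \sup_{x \in \partial B_r(0) \cap Y} \varphi(x) \le \inf_{x \in V} \varphi(x) \le \varphi(x_1). \tag{5.53}
\]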
PROOF Evidently the critical point $x_1$ is given by the saddle point theorem (see Corollary 5.2.8). Next let $C_0 = \emptyset$, $C = \partial B_r(0) \cap Y$, $E = \mathbb{R}_+ e \oplus V$, $D = V$. We check that $C_0$ and $E$ are linking in $X \setminus D$ and so Theorem 5.2.4 provides a second critical point $x_0$, such that (5.53) holds. If $\varphi(x_0) = \varphi(x_1)$, then
\[
\varphi(x_0) = \sup_{x \in \partial B_r(0) \cap Y} \varphi(x) = \inf_{x \in V} \varphi(x) = \varphi(x_1)
\]
and so $x_0 \in \partial B_r(0) \cap Y$, $x_1 \in V$. Hence $x_0 \neq x_1$.

In the last part of this section, we prove some multiplicity results concerning the critical points of functionals possessing a $\mathbb{Z}_2$-symmetry. To treat such functionals we use the so-called Krasnoselskii's genus, which is a particular case of a topological index. A topological index associated with a compact group $G$ acting on a manifold $M$ is a mapping from the closed $G$-invariant subsets of $M$ into $\mathbb{N} \cup \{+\infty\}$, which satisfies certain properties (see Proposition 5.4.29 below). The index helps to distinguish different critical points and so produce multiplicity results. Here we focus on the particular index known as "Krasnoselskii's index," which is suitable in treating functionals which are even (so in this case the compact group is $G = \{\mathrm{id}_M, -\mathrm{id}_M\} \equiv \mathbb{Z}_2$).

DEFINITION 5.4.25 Let
\[
\mathcal{T} \stackrel{df}{=} \big\{ A \subseteq X : 0 \notin A,\ A \text{ is closed and symmetric} \big\}.
\]
The Krasnoselskii genus is the map $\gamma : \mathcal{T} \longrightarrow \mathbb{N} \cup \{+\infty\}$, defined by
\[
\gamma(A) \stackrel{df}{=}
\begin{cases}
0 & \text{if } A = \emptyset, \\
\inf\big\{ k \ge 1 : \exists\, h \in C\big(A; \mathbb{R}^k\big) \text{ odd},\ h(x) \neq 0 \text{ for all } x \in A \big\} & \text{if } A \neq \emptyset.
\end{cases}
\]
If for $A \neq \emptyset$ no such integer $k \ge 1$ exists, we set $\gamma(A) = +\infty$.

REMARK 5.4.26 Note that symmetry with respect to the origin of the set $A \in \mathcal{T}$ and oddness of the map $h$ are nothing but invariance properties with respect to a representation of the group $\mathbb{Z}_2$. The notion of genus generalizes the notion of dimension of a linear space. To see this we need to recall a result from degree theory, known as the Borsuk-Ulam theorem (see, e.g., Denkowski, Migórski & Papageorgiou (2003b, p. 198)).

THEOREM 5.4.27 (Borsuk-Ulam Theorem) If $U \subseteq \mathbb{R}^N$ is an open, bounded, symmetric neighbourhood of the origin, $M < N$ and $h : \partial U \longrightarrow \mathbb{R}^M$ is continuous and odd, then $0 \in h\big(\partial U\big)$.
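Using this theorem, we can prove the following result.

PROPOSITION 5.4.28 If $U$ is any bounded, symmetric neighbourhood of the origin in $X$, then $\gamma\big(\partial U\big) = \dim X$.

PROOF If $0 < \dim X < +\infty$, then by taking $h = \mathrm{id}_X$ in Definition 5.4.25, we see that $\gamma(\partial U) \le \dim X$. Then by virtue of Theorem 5.4.27, we conclude that $\gamma(\partial U) = \dim X$. If $\dim X = +\infty$, let $X_n$ be an $n$-dimensional subspace of $X$. From the definition of $\gamma$, we see that
\[
\gamma\big(\partial U \cap X_n\big) = n \le \gamma(\partial U) \quad \forall\, n \ge 1.
\]
So $\gamma(\partial U) = +\infty = \dim X$. Finally for $X = \{0\}$, we see that $\partial U = \emptyset$ and so by definition $\gamma(\emptyset) = 0$.

In the next proposition we show that $\gamma$ satisfies all the properties of a topological index.

PROPOSITION 5.4.29 If $A, A_1, A_2 \in \mathcal{T}$ and $h \in C(X; X)$ is an odd map, then
(a) $\gamma(A) \ge 0$ and $\gamma(A) = 0$ if and only if $A = \emptyset$;
(b) $\gamma$ is monotone, i.e., if $A_1 \subseteq A_2$, then $\gamma(A_1) \le \gamma(A_2)$;
(c) $\gamma$ is subadditive, i.e., $\gamma(A_1 \cup A_2) \le \gamma(A_1) + \gamma(A_2)$;
(d) $\gamma$ is supervariant, i.e., $\gamma(A) \le \gamma\big(\overline{h(A)}\big)$;
(e) $\gamma$ is continuous, i.e., if $A \in \mathcal{T}$ is compact, then $\gamma(A) < +\infty$ and there exists an open set $U$, such that $A \subseteq U$, $\overline{U} \in \mathcal{T}$ and $\gamma(A) = \gamma\big(\overline{U}\big)$.

PROOF (a) Follows from Definition 5.4.25.

(b) Suppose that $\gamma(A_2) < +\infty$ or otherwise we are done. Let $m \stackrel{df}{=} \gamma(A_2)$. Then we can find $h \in C\big(A_2; \mathbb{R}^m \setminus \{0\}\big)$ which is also odd. Let $h_1 \stackrel{df}{=} h|_{A_1}$. Then $h_1 \in C\big(A_1; \mathbb{R}^m \setminus \{0\}\big)$ and it is odd. Therefore $\gamma(A_1) \le m$.

(c) Again we suppose that $m_1 = \gamma(A_1) < +\infty$ and $m_2 = \gamma(A_2) < +\infty$. By definition we can find odd maps
\[
h_1 \in C\big(A_1; \mathbb{R}^{m_1} \setminus \{0\}\big) \qquad \text{and} \qquad h_2 \in C\big(A_2; \mathbb{R}^{m_2} \setminus \{0\}\big).
\]
Invoking the Tietze extension theorem (see Theorem A.1.14), we can find two odd maps
\[
\hat{h}_1 \in C\big(X; \mathbb{R}^{m_1}\big) \qquad \text{and} \qquad \hat{h}_2 \in C\big(X; \mathbb{R}^{m_2}\big),
\]
such that
\[
\hat{h}_1|_{A_1} = h_1 \qquad \text{and} \qquad \hat{h}_2|_{A_2} = h_2.
\]
Let us set
\[
\hat{h}(x) \stackrel{df}{=} \big(\hat{h}_1(x), \hat{h}_2(x)\big) \quad \forall\, x \in X.
\]
Then $\hat{h} \in C\big(X; \mathbb{R}^{m_1+m_2}\big)$ is odd and $\hat{h}|_{A_1 \cup A_2} \neq 0$. So $\gamma(A_1 \cup A_2) \le m_1 + m_2$ (see Definition 5.4.25).

(d) If $\hat{h} \in C\big(\overline{h(A)}; \mathbb{R}^m \setminus \{0\}\big)$ is an odd map, then $\hat{h} \circ h \in C\big(A; \mathbb{R}^m \setminus \{0\}\big)$ is an odd map too. So it follows that
\[
\gamma(A) \le \gamma\big(\overline{h(A)}\big).
\]

(e) Since $A \in \mathcal{T}$ is compact, we can find $r > 0$, such that $A \cap \overline{B}_r(0) = \emptyset$. Consider the family of open sets $\big\{D_r(x) = B_r(x) \cup B_r(-x)\big\}_{x \in A}$. Evidently this is an open cover of $A$ and so because of the compactness of $A$, we can find a finite subcover $\big\{D_r(x_k)\big\}_{k=1}^{m}$. Let $\{\varphi_k\}_{k=1}^{m}$ be a continuous partition of unity subordinate to the cover $\big\{D_r(x_k)\big\}_{k=1}^{m}$, i.e., $\varphi_k \in C(A)$ with
\[
\mathrm{supp}\, \varphi_k \subseteq D_r(x_k), \qquad 0 \le \varphi_k \le 1 \qquad \text{and} \qquad \sum_{k=1}^{m} \varphi_k(x) = 1 \quad \forall\, x \in A.
\]
We may assume that each $\varphi_k$ is even (otherwise replace $\varphi_k$ by $\hat{\varphi}_k(x) = \frac{1}{2}\big(\varphi_k(x) + \varphi_k(-x)\big)$ for all $x \in A$). Also note that from the choice of $r > 0$, we have
\[
B_r(x_k) \cap B_r(-x_k) = \emptyset \quad \forall\, k \in \{1, \ldots, m\}.
\]
Consider the map $h = (h_k)_{k=1}^{m} \in C(X; \mathbb{R}^m)$, defined by
\[
h_k(x) \stackrel{df}{=}
\begin{cases}
\varphi_k(x) & \text{if } x \in B_r(x_k), \\
-\varphi_k(x) & \text{if } x \in B_r(-x_k).
\end{cases}
\]
Evidently $h$ is odd and $h|_A \neq 0$. So $\gamma(A) \le m < +\infty$. Suppose $\gamma(A) = m_0 < +\infty$. Let $h \in C\big(A; \mathbb{R}^{m_0} \setminus \{0\}\big)$ be odd. We may assume that $h \in C(X; \mathbb{R}^{m_0})$ (by the Tietze extension theorem; see Theorem A.1.14). Because $A$ is compact, $h(A)$ is compact too. We can find a bounded symmetric neighbourhood $\hat{U}$ of $h(A)$, such that $\overline{\hat{U}} \subseteq \mathbb{R}^{m_0} \setminus \{0\}$. Let $U = h^{-1}\big(\hat{U}\big)$. By construction $h\big(\overline{U}\big) \neq 0$ and $\gamma\big(\overline{U}\big) \le m_0$. Since $A \subseteq U$, we have
\[
\gamma\big(\overline{U}\big) = m_0 = \gamma(A)
\]
(see (b)).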
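As a concrete check of the definition of the genus, one can compute it for the unit circle in the plane. The sketch below is an illustration that is not taken from the text: the set $S^1 \subseteq \mathbb{R}^2$ is an assumed example, and the lower bound is obtained directly from connectedness rather than from the Borsuk-Ulam theorem.

```latex
% Illustrative example (assumed, not from the text): A = S^1 \subseteq \mathbb{R}^2.
\[
  S^1 = \big\{ x \in \mathbb{R}^2 : \|x\| = 1 \big\} \in \mathcal{T} .
\]
% Upper bound: the inclusion map is odd, continuous and nowhere zero on S^1.
\[
  h = \mathrm{id}|_{S^1} \in C\big( S^1; \mathbb{R}^2 \setminus \{0\} \big)
  \ \Longrightarrow\ \gamma(S^1) \le 2 .
\]
% Lower bound: suppose h \in C(S^1; \mathbb{R} \setminus \{0\}) is odd.
% Then for any x \in S^1 the values h(x) and h(-x) = -h(x) have opposite signs.
% Since S^1 is connected and h is continuous, h(S^1) is an interval that
% contains both a positive and a negative number, hence contains 0:
\[
  0 \in h(S^1), \quad \text{contradicting } h(S^1) \subseteq \mathbb{R} \setminus \{0\} .
\]
% So no admissible map exists for k = 1, and therefore
\[
  \gamma(S^1) = 2 = \dim \mathbb{R}^2 .
\]
```

This agrees with the value $\gamma(\partial U) = \dim X$ for $X = \mathbb{R}^2$ and $U$ the open unit disc.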
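COROLLARY 5.4.30 If $A, A_1, A_2 \in \mathcal{T}$, then
(a) if $A_1$ and $A_2$ are homeomorphic with respect to an odd homeomorphism, then $\gamma(A_1) = \gamma(A_2)$;
(b) if $\gamma(A_1), \gamma(A_2) < +\infty$, then $\gamma\big(\overline{A_2 \setminus A_1}\big) \ge \gamma(A_2) - \gamma(A_1)$;
(c) $\gamma(A) \le \dim X$;
(d) if $\gamma(A) > m$, $1 \le m < +\infty$ and $\mathrm{proj}_Y \in \mathcal{L}(X; Y)$ is the projection operator on the subspace $Y$ of $X$ with $\dim Y = m$, then $A \cap (\mathrm{id}_X - \mathrm{proj}_Y)(X) \neq \emptyset$.

REMARK 5.4.31 If $A \in \mathcal{T}$ is finite and nonempty, then $\gamma(A) = 1$. Indeed choose
\[
h \in C\big(A; \mathbb{R} \setminus \{0\}\big),
\]
such that $h(\pm x_k) = \pm 1$, where $A = \{\pm x_k\}_{k=1}^{m}$.

On $\mathcal{T}$ we can consider the Hausdorff metric $h_0$, defined by
\[
h_0(A_1, A_2) \stackrel{df}{=} \max\Big\{ \sup_{a_1 \in A_1} d_X(a_1, A_2),\ \sup_{a_2 \in A_2} d_X(a_2, A_1) \Big\}.
\]
It is well known (see, e.g., Hu & Papageorgiou (1997, p. 6)) that $(\mathcal{T}, h_0)$ is a complete metric space. Let $\mathcal{T}^c$ be the subspace of $\mathcal{T}$ consisting of compact sets and let
\[
\mathcal{T}_k^c \stackrel{df}{=} \overline{\big\{ A \in \mathcal{T}^c : \gamma(A) \ge k \big\}},
\]
the closure taken in the metric space $(\mathcal{T}^c, h_0)$. The metric spaces $\big(\mathcal{T}^c, h_0\big)$ and $\big(\mathcal{T}_k^c, h_0\big)$ are complete.

LEMMA 5.4.32 If $A \in \mathcal{T}_k^c$, then $\gamma(A) \ge k$.

PROOF Let $\{A_n\}_{n \ge 1} \subseteq \mathcal{T}_k^c$ be such that
\[
A_n \stackrel{h_0}{\longrightarrow} A.
\]
By Proposition 5.4.29(e), we can find $\delta > 0$, such that
\[
\gamma(A) = \gamma\big(A_{\delta}\big),
\]
where
\[
A_{\delta} \stackrel{df}{=} \big\{ x \in X : d_X(x, A) \le \delta \big\}.
\]
Because
\[
A_n \stackrel{h_0}{\longrightarrow} A,
\]
we can find $n_0 = n_0(\delta) \ge 1$, such that
\[
A_n \subseteq A_{\delta} \quad \forall\, n \ge n_0.
\]
Then from Proposition 5.4.29(b), we have
\[
k \le \gamma(A_n) \le \gamma\big(A_{\delta}\big) = \gamma(A) \quad \forall\, n \ge n_0.
\]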
We will derive the multiplicity result using the general setting in the last part of Section 5.1. So suppose that
\[
\varphi = \varphi_1 + \varphi_2, \quad \text{with } \varphi_1 \in C^1(X) \text{ and } \varphi_2 \in \Gamma_0(X) \tag{5.54}
\]
(see Definition 4.2.1). Observe that if $\varphi_1$ and $\varphi_2$ are both even then $\varphi_1'(0) = 0$ and $\varphi_2(0)$ is the global minimum of $\varphi_2$. So in this case $x = 0$ is necessarily a critical point of $\varphi$ (see Definition 5.1.35).

THEOREM 5.4.33 If $\varphi = \varphi_1 + \varphi_2$ is as in (5.54), $\varphi$ satisfies the G-PS-condition (see Definition 5.1.37), $\varphi(0) = 0$, $\varphi_1$ and $\varphi_2$ are even,
\[
c_k \stackrel{df}{=} \inf_{A \in \mathcal{T}_k^c} \sup_{x \in A} \varphi(x)
\]
and
\[
-\infty < c_k < 0 \quad \forall\, k \in \{1, \ldots, m\},
\]
then $\varphi$ has at least $m$ distinct pairs $\{\pm x_k\}_{k=1}^{m}$ of nontrivial critical points.

PROOF
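Let $1 \le k \le m$ and suppose that
\[
c_k = \ldots = c_{k+i} = c,
\]
for some $i \ge 0$. Because $c < 0$, we see that $0 \notin K_c^{\varphi}$. We claim that
\[
\gamma(K_c^{\varphi}) \ge i + 1.
\]
We proceed by contradiction. So suppose that
\[
\gamma(K_c^{\varphi}) \le i.
\]
Let $r > 0$ be such that
\[
\gamma\big(\big(K_c^{\varphi}\big)_{2r}\big) = \gamma(K_c^{\varphi})
\]
(see Proposition 5.4.29(e)). Define $\xi : \mathcal{T}_k^c \longrightarrow \mathbb{R}$, by
\[
\xi(A) = \sup_{x \in A} \varphi(x).
\]
We claim that $\xi$ is lower semicontinuous on $(\mathcal{T}_k^c, h_0)$. To this end let
\[
A_n \stackrel{h_0}{\longrightarrow} A \quad \text{in } (\mathcal{T}_k^c, h_0).
\]
Let $\hat{x} \in A$ be such that
\[
\xi(A) = \varphi(\hat{x}) = \sup_{x \in A} \varphi(x).
\]
We can find $\hat{x}_n \in A_n$, such that $\hat{x}_n \longrightarrow \hat{x}$. Then
\[
\varphi(\hat{x}_n) \le \xi(A_n) \quad \forall\, n \ge 1
\]
and so
\[
\xi(A) = \varphi(\hat{x}) \le \liminf_{n \to +\infty} \varphi(\hat{x}_n) \le \liminf_{n \to +\infty} \xi(A_n),
\]
which proves the lower semicontinuity of $\xi$. Let
\[
\varepsilon_0 \stackrel{df}{=} \min\{1, r, -c\} \qquad \text{and} \qquad U \stackrel{df}{=} \mathrm{int}\big(K_c^{\varphi}\big)_r.
\]
Apply Theorem 5.1.42 to obtain $\varepsilon \in (0, \varepsilon_0)$ for which the postulates of that theorem hold. Choose $A_1 \in \mathcal{T}_{k+i}^c$, such that
\[
\xi(A_1) \le c + \varepsilon^2.
\]
Since
\[
c + \varepsilon^2 \le c + \varepsilon < 0,
\]
we have that $0 \notin A_1$ and from Lemma 5.4.32, it follows that
\[
\gamma(A_1) \ge k + i.
\]
Let
\[
A_2 \stackrel{df}{=} A_1 \setminus (K_c^{\varphi})_{2r}.
\]
Then
\[
\xi(A_2) \le c + \varepsilon^2
\]
and by virtue of Corollary 5.4.30(b), we have
\[
k = k + i - i \le \gamma(A_1) - \gamma\big((K_c^{\varphi})_{2r}\big) \le \gamma(A_2).
\]
Invoking Theorem 4.6.1 with $\varepsilon^2$ and $\lambda = \frac{1}{\varepsilon}$, we obtain $A \in \mathcal{T}_k^c$, such that
\[
\xi(A) \le c + \varepsilon^2, \qquad h_0(A, A_2) \le \varepsilon \tag{5.55}
\]
and
\[
-\varepsilon h_0(A, B) \le \xi(B) - \xi(A) \quad \forall\, B \in \mathcal{T}_k^c. \tag{5.56}
\]
Because $\varepsilon < r$ and $A \in \mathcal{T}_k^c$, we have
\[
A \cap \big(K_c^{\varphi}\big)_r = \emptyset \qquad \text{and} \qquad \xi(A) \ge c.
\]
Apply Theorem 5.1.42 (see also Remark 5.1.43) to produce an odd homotopy $h_t : A \longrightarrow X$. Since
\[
c + \varepsilon^2 < 0,
\]
we have that
\[
0 \notin A \qquad \text{and} \qquad \gamma(A) \ge k.
\]
Let $B = h_t(A)$, for $t \in [0,1]$ small enough. Then from Proposition 5.4.29(d), we have that
\[
\gamma(B) \ge \gamma(A) \ge k.
\]
So $B \in \mathcal{T}_k^c$ and from Theorem 5.1.42, (5.55) and (5.56), we have
\[
-\varepsilon t \le -\varepsilon h_0(A, B) \le \xi(B) - \xi(A) \le -2\varepsilon t,
\]
which is a contradiction. Therefore $\gamma(K_c^{\varphi}) \ge i + 1$. In particular $\gamma(K_{c_k}^{\varphi}) \ge 1$ and so each $K_{c_k}^{\varphi}$ has at least two points $x_k$ and $-x_k$. This gives the theorem, if all $c_k$ are distinct. If not, then $i > 0$ for some $k \ge 1$. Hence
\[
\gamma\big(K_{c_k}^{\varphi}\big) \ge 2 \qquad \text{and} \qquad K_{c_k}^{\varphi} \text{ is an infinite set}
\]
(see Remark 5.4.31).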
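THEOREM 5.4.34 If $\varphi = \varphi_1 + \varphi_2$ is as in (5.54), $\varphi$ satisfies the G-PS-condition (see Definition 5.1.37), $\varphi(0) = 0$, $\varphi_1$, $\varphi_2$ are even and
(i) there exists a subspace $V$ of $X$ of finite codimension and $\beta, r > 0$, such that $\varphi|_{\partial B_r(0) \cap V} \geq \beta$;
(ii) there exists a finite dimensional subspace $Y$ of $X$ with $\dim Y > \mathrm{codim}\, V$, such that $\varphi|_Y$ is weakly anticoercive (i.e., $\varphi(x) \longrightarrow -\infty$ as $\|x\|_X \to +\infty$, $x \in Y$),
then $\varphi$ has at least $\dim Y - \mathrm{codim}\, V$ distinct pairs of nontrivial critical points.
PROOF Assume that for some $d > 0$, $\varphi^{-d}$ contains no critical points of $\varphi$ (otherwise there are infinitely many critical points and so there is nothing to prove). Let $R > r$ be such that
$$\varphi|_{\partial B_R^Y} \leq -d$$
(see hypothesis (ii)). Let
$$m \overset{df}{=} \mathrm{codim}\, V, \quad k \overset{df}{=} \dim Y, \quad B \overset{df}{=} \overline{B}{}_R^Y = \{x \in Y : \|x\|_X \leq R\}.$$
For $j \in \{1, \ldots, k\}$, we introduce the following objects:
$$F \overset{df}{=} \{\eta \in C(B; X) : \eta \text{ is odd and } \eta|_{\partial B} \text{ is homotopic to } \mathrm{id}|_{\partial B} \text{ in } \varphi^{-d} \text{ by an odd homotopy}\},$$
$$\Gamma_j \overset{df}{=} \{\eta(B \setminus U) : \eta \in F,\ U \text{ is open in } B \text{ and symmetric},\ U \cap \partial B = \emptyset \text{ and for each } Z \subseteq U \text{ with } Z \in T \text{ we have } \gamma(Z) \leq k - j\},$$
$$\Delta_j \overset{df}{=} \{A \subseteq X : A \in T,\ A \text{ is compact and for each open } W, \text{ such that } A \subseteq W, \text{ there exists } A_0 \in \Gamma_j \text{ with } A_0 \subseteq W\}.$$
Note that $B \in \Delta_j$ with $A_0 = B$, $U = \emptyset$ and $\eta = \mathrm{id}|_B$, so that
$$\Delta_j \neq \emptyset \quad \forall\, j \in \{1, \ldots, k\}.$$
We introduce the following numbers:
$$c_j \overset{df}{=} \inf_{A \in \Delta_j} \sup_{x \in A} \varphi(x) \quad \forall\, j \in \{1, \ldots, k\}.$$
Using standard topological arguments (see Szulkin (1986, Lemma 4.5 and Lemma 4.6)), we obtain: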
(a) $c_j \geq \beta$ for all $j \in \{m + 1, \ldots, k\}$.
(b) $\Delta_{j+1} \subseteq \Delta_j$ for all $j \in \{1, \ldots, k - 1\}$.
(c) If $A \in \Delta_j$, $C$ is a closed and symmetric set, such that $A \subseteq \mathrm{int}\, C$ and $\psi : C \longrightarrow X$ is an odd map, such that $\psi|_{C \cap \varphi^{-d}}$ is homotopic to $\mathrm{id}|_{C \cap \varphi^{-d}}$ by an odd homotopy, then $\psi(A) \in \Delta_j$.
(d) If $D \in T$ is compact, $\gamma(D) \leq l$ and $\varphi|_D > -d$, then there exists a number $\delta > 0$, such that for each $A \in \Delta_{j+l}$ we have $A \setminus \mathrm{int}\, D_\delta \in \Delta_j$.
By virtue of (a) and (b), we have
$$\beta \leq c_{m+1} \leq \ldots \leq c_k.$$
Suppose that
$$c_{j+1} = \ldots = c_{j+1+l} = c$$
for some $l \in \{0, \ldots, k - m - 1\}$. Because $\varphi$ is even, we have that $K_c^\varphi$ is symmetric and of course it is compact (since $\varphi$ satisfies the generalized PS-condition). Note that $c > 0$ (since $c \geq \beta > 0$) and so $0 \notin K_c^\varphi$ (recall that $\varphi(0) = 0$). Hence $K_c^\varphi \in T$. We shall prove that
$$l + 1 \leq \gamma(K_c^\varphi).$$
Suppose that this is not the case. So $\gamma(K_c^\varphi) \leq l$. By virtue of Proposition 5.4.29(e), we can find $\delta > 0$, such that
$$\gamma(K_c^\varphi) = \gamma\big((K_c^\varphi)_\delta\big).$$
Let $\varepsilon_0 > 0$ and
$$U \overset{df}{=} \mathrm{int}\,\big((K_c^\varphi)_\delta\big).$$
According to the Deformation Theorem (see Theorem 5.1.42; see also Remark 5.1.43), we can find $\varepsilon \in (0, \varepsilon_0)$ and a continuous map $\eta : [0, 1] \times X \longrightarrow X$ satisfying statements (a)–(h) of that theorem. In particular, we have that $\eta$ has the semigroup property
$$\eta(s, \cdot) \circ \eta(t, \cdot) = \eta(s + t, \cdot) \quad \forall\, s, t \in [0, 1],\ s + t \leq 1$$
and
$$\eta(t, \cdot) \text{ is an odd homeomorphism} \quad \forall\, t \in [0, 1].$$
From Theorem 5.1.42(b) and (c), we have that
$$\eta(1, x) = x \quad \forall\, x \notin \varphi^{c+\varepsilon_0} \setminus \varphi^{c-\varepsilon_0}$$
and
$$\eta(1, \varphi^{c+\varepsilon} \setminus U) \subseteq \varphi^{c-\varepsilon}.$$
We can find $A \in \Delta_{j+l}$, such that
$$\sup_{x \in A} \varphi(x) \leq c + \varepsilon.$$
Because $K_c^\varphi$ is compact, $\gamma(K_c^\varphi) \leq l$ and $\varphi|_{K_c^\varphi} > -d$, by (d) above and if $\delta > 0$ is small enough, we have that $A \setminus U \in \Delta_j$ (recall the definition of $U$). Moreover, $A \setminus U \subseteq \varphi^{c+\varepsilon} \setminus U$ and so
$$\eta(1, A \setminus U) \subseteq \varphi^{c-\varepsilon}.$$
Thus if $\varepsilon_0$ is sufficiently small we have
$$\eta(1, x) = x \quad \forall\, x \in \varphi^{-d}$$
and so by (c) above, $\eta(1, A \setminus U) \in \Delta_j$. We have
$$\sup\{\varphi(x) : x \in \eta(1, A \setminus U)\} \leq c - \varepsilon,$$
which contradicts the fact that
$$\sup\{\varphi(x) : x \in \eta(1, A \setminus U)\} \geq c.$$
Therefore $\gamma(K_c^\varphi) \geq l + 1$. In particular then $\gamma(K_{c_j}^\varphi) \geq 1$ and so by Remark 5.1.43, each $K_{c_j}^\varphi$ has at least two antipodal points $\pm x_j$. This produces the claimed number of critical points of $\varphi$ if all $c_j$ are distinct. If they are not, then $l > 0$ for some $j$, so $\gamma(K_{c_j}^\varphi) > 1$ and by Remark 5.1.43, $\varphi$ has an infinite number of critical points (see Remark 5.4.31).
COROLLARY 5.4.35 If the hypotheses of Theorem 5.4.34 hold with (ii) replaced by:
(ii)' for any positive integer $k$, there is a $k$-dimensional subspace $Y$ of $X$, such that $\varphi|_Y$ is weakly anticoercive,
then $\varphi$ has infinitely many distinct pairs of nontrivial critical points.
5.5 Lusternik-Schnirelman Theory and Abstract Eigenvalue Problems
The Lusternik-Schnirelman theory generalizes to arbitrary smooth functions the theory of eigenvalues of quadratic forms developed by R. Courant. Let us start by recalling the min-max principles which characterize the eigenvalues of a self-adjoint compact operator. Suppose that $H$ is a separable Hilbert space. Let $A \in L_c(H)$ be a self-adjoint operator (i.e., $A$ is linear, compact and self-adjoint). From Theorem 3.1.38, we know that $\sigma(A)$ consists of at most a countably infinite number of real eigenvalues $\{\lambda_k\}_{k \geq 1}$ with $\lambda = 0$ the only possible limit point. Moreover, the multiplicity of $\lambda_k$ is equal to $\dim N(A - \lambda_k\, \mathrm{id}_H) < +\infty$ and
$$Ax = \sum_{k=1}^{\infty} \lambda_k (x, x_k)_H\, x_k,$$
where $\{x_k\}_{k \geq 1}$ is an orthonormal sequence of eigenvectors and $\lambda_k$ is repeated $\dim N(A - \lambda_k\, \mathrm{id}_H)$-times. The eigenvalues of $A$ can be characterized by min-max principles. In particular, if the positive eigenvalues $\lambda_k^+$ are ordered in decreasing order (with multiplicities repeated), then
$$\lambda_k^+ = \sup_{S_m \in L_k}\ \min_{x \in S_m} (Ax, x)_H. \qquad (5.57)$$
Here $S_m$ denotes the boundary of an arbitrary $m$-dimensional unit ball in $H$, i.e.,
$$S_m \overset{df}{=} S \cap H_m, \quad S \overset{df}{=} \partial B_1(0) = \{x \in H : \|x\|_H = 1\}$$
and $H_m$ is an arbitrary $m$-dimensional subspace of $H$. Also $L_k$ is the set of all $S_m$ with $m \geq k$. It is natural to try to extend (5.57) to general smooth functionals $\varphi$ by finding topological analogs of the sets $S_m$ and $L_k$. We already did this in the previous section for functions exhibiting symmetry (even functionals), using the notion of genus (see Definition 5.4.25). Here we deal with general smooth functionals, not necessarily even. First we recall a basic topological notion.
DEFINITION 5.5.1 Let $Y$ be a Hausdorff topological space and $A \subseteq Y$, $A \neq \emptyset$. We say that $A$ is contractible in $Y$, if there exist $h \in C([0, 1] \times A; Y)$ and $y_0 \in Y$, such that
$$h(0, x) = x \quad \text{and} \quad h(1, x) = y_0 \quad \forall\, x \in A.$$
REMARK 5.5.2 According to the above definition $A$ is contractible if and only if the identity map on $A$ is null homotopic (i.e., $\mathrm{id}_A \simeq 0$).
DEFINITION 5.5.3 Let $Y$ be a Hausdorff topological space. The Lusternik-Schnirelman category relative to $Y$ is the map $\mathrm{cat}_Y : 2^Y \longrightarrow \mathbb{N}_0 \cup \{+\infty\}$, defined by
$$\mathrm{cat}_Y(\emptyset) \overset{df}{=} 0,$$
$$\mathrm{cat}_Y(A) \overset{df}{=} \min\Big\{k \in \mathbb{N} : A \subseteq \bigcup_{m=1}^{k} A_m,\ A_m \text{ is closed and contractible for } m = 1, \ldots, k\Big\},$$
$$\mathrm{cat}_Y(A) \overset{df}{=} +\infty \quad \text{if } A \neq \emptyset \text{ and does not admit such a finite cover.}$$
REMARK 5.5.4 Since closedness and contractibility are preserved by homeomorphisms, we get the same value for homeomorphic $Y$ or homeomorphic, closed $A \subseteq Y$. However, in the definition it is important to specify the space with respect to which the Lusternik-Schnirelman category is defined, since a set may be contractible in a larger space $Y_1$ but not in $Y$. So if $Y$ is embedded continuously in $Y_1$, then
$$\mathrm{cat}_{Y_1}(A) \leq \mathrm{cat}_Y(A) \quad \forall\, A \subseteq Y.$$
PROPOSITION 5.5.5 If $Y$ and $Z$ are Hausdorff topological spaces and $A, C \subseteq Y$, then
(a) $\mathrm{cat}_Y(A) \leq \mathrm{cat}_Y(C)$ $\forall\, A \subseteq C$;
(b) $\mathrm{cat}_Y(A \cup C) \leq \mathrm{cat}_Y(A) + \mathrm{cat}_Y(C)$;
(c) if $\xi$ is homotopic to the identity on $A$, then $\mathrm{cat}_Y(A) \leq \mathrm{cat}_Y\big(\xi(A)\big)$;
(d) if $z \in Z$, then $\mathrm{cat}_{Y \times Z}\big(A \times \{z\}\big) = \mathrm{cat}_Y(A)$.
PROOF Statements (a), (b) and (d) follow immediately from Definition 5.5.3. So it remains to show statement (c). Suppose that
$$k \overset{df}{=} \mathrm{cat}_Y\big(\xi(A)\big) < +\infty$$
or otherwise there is nothing to prove. Let $\{C_m\}_{m=1}^k$ be closed, contractible sets in $Y$ which cover $\xi(A)$. Then we can find corresponding deformations $\{h_m\}_{m=1}^k$. Let $h \in C([0, 1] \times A; Y)$ be such that
$$h(0, \cdot) = \mathrm{id}_A \quad \text{and} \quad h(1, \cdot) = \xi$$
(it exists since $\xi$ is homotopic to $\mathrm{id}_A$). If
$$h_1(\cdot) \overset{df}{=} h(1, \cdot),$$
let us set
$$A_m \overset{df}{=} h_1^{-1}(C_m) \quad \forall\, m \in \{1, \ldots, k\}.$$
Evidently the sets $A_m$ are closed and they cover $A$. Also let $\hat{h}_m : [0, 1] \times A_m \longrightarrow Y$ be defined by
$$\hat{h}_m(t, x) \overset{df}{=} \begin{cases} h(2t, x) & \text{if } 0 \leq t \leq \frac{1}{2}, \\ h_m\big(2t - 1, h_1(x)\big) & \text{if } \frac{1}{2} \leq t \leq 1. \end{cases}$$
Clearly $\hat{h}_m \in C([0, 1] \times A_m; Y)$, $\hat{h}_m(0, \cdot) = \mathrm{id}_{A_m}$ and $\hat{h}_m(1, \cdot) = \hat{y}_m$. Then according to Definition 5.5.3, we have
$$\mathrm{cat}_Y(A) \leq k = \mathrm{cat}_Y\big(\xi(A)\big).$$
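The concatenation of homotopies used in the proof of statement (c) can be sketched numerically. The fragment below is only an illustration under assumptions that are not part of the text: $Y = \mathbb{R}^2$, the map `xi` is the projection onto the first axis, `h` is the straight-line homotopy from the identity to `xi`, and `h_m` contracts the first axis to a hypothetical point `Y_M = (1, 0)`.

```python
from typing import Callable, Tuple

Point = Tuple[float, float]
Homotopy = Callable[[float, Point], Point]


def concatenate(h: Homotopy, h_m: Homotopy) -> Homotopy:
    """Return hat_h with hat_h(t, x) = h(2t, x) for t <= 1/2
    and hat_h(t, x) = h_m(2t - 1, h(1, x)) for t >= 1/2."""

    def hat_h(t: float, x: Point) -> Point:
        if t <= 0.5:
            return h(2.0 * t, x)
        return h_m(2.0 * t - 1.0, h(1.0, x))

    return hat_h


# Hypothetical map xi: projection of R^2 onto the first axis.
def xi(x: Point) -> Point:
    return (x[0], 0.0)


# Straight-line homotopy with h(0, .) = identity and h(1, .) = xi.
def h(t: float, x: Point) -> Point:
    y = xi(x)
    return ((1.0 - t) * x[0] + t * y[0], (1.0 - t) * x[1] + t * y[1])


# Hypothetical contraction of the first axis (which contains xi(A)) to Y_M.
Y_M: Point = (1.0, 0.0)


def h_m(t: float, y: Point) -> Point:
    return ((1.0 - t) * y[0] + t * Y_M[0], (1.0 - t) * y[1] + t * Y_M[1])


hat_h = concatenate(h, h_m)
x0: Point = (3.0, 4.0)
start = hat_h(0.0, x0)                 # value at t = 0
middle_left = hat_h(0.5, x0)           # first branch at t = 1/2
middle_right = h_m(0.0, h(1.0, x0))    # second branch at t = 1/2
end = hat_h(1.0, x0)                   # value at t = 1
```

The two branches agree at $t = \frac{1}{2}$ because $h_m(0, \cdot)$ is the identity, which is exactly what makes $\hat{h}_m$ continuous in the proof.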
Let us recall another important topological notion.
DEFINITION 5.5.6 A metric space $Y$ is said to be an absolute neighbourhood retract (ANR for short), if for every metric space $V$, every closed $D \subseteq V$ and every $\varphi \in C(D; Y)$, there exists a continuous extension of $\varphi$ to some neighbourhood $U$ of $D$. If it is always possible to extend to all of $V$, then we say that $Y$ is an absolute retract (AR for short).
REMARK 5.5.7 Evidently an AR is an ANR but the converse need not be true. Finite products of ANRs are an ANR, while for AR this is true for arbitrary products. Also by Dugundji's extension theorem (see Theorem 3.1.11), every convex subset of a normed space is an AR. Moreover, if $Y$ is a Banach space then $\partial B_1(0)$ is an ANR. If in addition $\dim Y = +\infty$, then $\partial B_1(0)$ is an AR. In a Banach space an ANR is an AR if and only if it is contractible (see Palais (1966)).
PROPOSITION 5.5.8 If $Y$ is an ANR and $A$ is a closed subset of $Y$, then there exists a neighbourhood $U$ of $A$, such that
$$\mathrm{cat}_Y(A) = \mathrm{cat}_Y\big(\overline{U}\big).$$
PROOF If
$$\mathrm{cat}_Y(A) = +\infty,$$
then the result is clear (see Proposition 5.5.5(a)). So suppose that
$$k \overset{df}{=} \mathrm{cat}_Y(A) < +\infty.$$
Let $\{A_m\}_{m=1}^k$ be a covering of $A$ by closed sets which are contractible in $Y$. It suffices to show that each $A_m$ has a neighbourhood $U_m$, such that $\overline{U}_m$ is contractible in $Y$. Since $A_m$ is contractible in $Y$, we can find
$$h_m \in C([0, 1] \times A_m; Y) \quad \text{and} \quad \hat{y}_m \in Y,$$
such that
$$h_m(0, x) = x \quad \text{and} \quad h_m(1, x) = \hat{y}_m \quad \forall\, x \in A_m,\ m \in \{1, \ldots, k\}.$$
Let
$$W \overset{df}{=} [0, 1] \times Y \quad \text{and} \quad E_m \overset{df}{=} \big([0, 1] \times A_m\big) \cup \big(\{0\} \times Y\big) \cup \big(\{1\} \times Y\big).$$
The set $E_m$ is closed in $W$. Let $u_m : E_m \longrightarrow Y$ be defined by
$$u_m(t, x) \overset{df}{=} \begin{cases} h_m(t, x) & \text{if } t \in [0, 1],\ x \in A_m, \\ x & \text{if } t = 0,\ x \in Y, \\ \hat{y}_m & \text{if } t = 1,\ x \in Y. \end{cases}$$
Clearly $u_m \in C(E_m; Y)$. Since $Y$ is an ANR, $u_m$ admits a continuous extension $\hat{u}_m$, defined on a neighbourhood $V_m$ of $E_m$. Since $[0, 1] \times Y$ is a metric space (hence normal), we can assume that $V_m$ is closed and we can find a neighbourhood $U_m$ of $A_m$, such that $[0, 1] \times \overline{U}_m \subseteq V_m$. Since
$$\hat{u}_m(0, x) = x \quad \text{and} \quad \hat{u}_m(1, x) = \hat{y}_m \quad \forall\, x \in \overline{U}_m,$$
we infer that $\overline{U}_m$ is contractible in $Y$. Then
$$\overline{U} = \bigcup_{m=1}^{k} \overline{U}_m$$
is a closed neighbourhood of $A$ and
$$k = \mathrm{cat}_Y(A) \leq \mathrm{cat}_Y\big(\overline{U}\big) \leq k.$$
REMARK 5.5.9 It is easy to see that if A is a compact subset of Y , then catY (A) < +∞. Propositions 5.5.5 and 5.5.8 imply that when Y is an ANR, then catY is a topological index.
Let us find the Lusternik-Schnirelman category of some standard sets.
EXAMPLE 5.5.10
(a) Let $X$ be a Banach space and $A = \overline{B}_r(0)$. Then $\mathrm{cat}_A(A) = 1$. To see this note that
$$h(t, x) \overset{df}{=} (1 - t)x$$
deforms $A$ into $\{0\}$.
(b) Let $X$ be a Banach space and $A \overset{df}{=} \partial B_r(0)$. Then if $\dim X = +\infty$, we have $\mathrm{cat}_A(A) = 1$ and if $\dim X < +\infty$, we have $\mathrm{cat}_A(A) = 2$. Indeed if $\dim X = +\infty$, then $A$ is a retract of $\overline{B}_r(0)$. Let $\xi$ be such a retraction and set
$$h(t, x) \overset{df}{=} \xi\big((1 - t)x\big).$$
Then $h \in C([0, 1] \times A; X)$ deforms $A$ into $\{\xi(0)\}$ and so we have $\mathrm{cat}_A(A) = 1$. On the other hand, if $\dim X < +\infty$, then we may assume that $X = \mathbb{R}^N$. A neighbourhood $U$ of the north pole is contractible to it and $A \setminus U$ is contractible to the south pole. So we have $\mathrm{cat}_A(A) \leq 2$. But as a consequence of Borsuk's antipodal theorem, we know that $A$ is not contractible in itself. Therefore $\mathrm{cat}_A(A) = 2$.
(c) Consider the real projective space $P^m \overset{df}{=} S^m / \mathbb{Z}_2$, where
$$S^m \overset{df}{=} \{x \in \mathbb{R}^{m+1} : \|x\|_{\mathbb{R}^{m+1}} = 1\}.$$
Then we have
$$\mathrm{cat}_{P^k}(P^m) = m + 1 \quad \forall\, k \geq m.$$
Also if $X$ is a uniformly convex Banach space and $P^\infty(X)$ is the infinite dimensional projective space obtained by identifying the antipodal points of the unit sphere $S = \{x \in X : \|x\|_X = 1\}$, i.e.,
$$P^\infty(X) \overset{df}{=} S / \mathbb{Z}_2,$$
then
$$\mathrm{cat}_{P^\infty(X)}\big(P^m(X)\big) = m + 1 \quad \text{and} \quad \mathrm{cat}_{P^\infty(X)}\big(P^\infty(X)\big) = +\infty.$$
For details we refer to Schwartz (1969, Chapter V).
(d) If $A = T^m = \mathbb{R}^m / \mathbb{Z}^m$ is the $m$-torus, then $\mathrm{cat}_{T^m}(T^m) = m + 1$. For details see Schwartz (1969, Chapter V).
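The deformation in part (a) of the example is easy to check numerically. The sketch below assumes $X = \mathbb{R}^3$ with the Euclidean norm, radius $r = 2$ and three hand-picked sample points of the closed ball; these choices are illustrative only and do not come from the text.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float, float]


def norm(x: Vec) -> float:
    """Euclidean norm on R^3."""
    return math.sqrt(sum(c * c for c in x))


def h(t: float, x: Vec) -> Vec:
    """Straight-line contraction h(t, x) = (1 - t) x of the ball onto {0}."""
    return (
        (1.0 - t) * x[0],
        (1.0 - t) * x[1],
        (1.0 - t) * x[2],
    )


r = 2.0
# Hypothetical sample points of the closed ball of radius r.
samples: List[Vec] = [(2.0, 0.0, 0.0), (0.0, -1.5, 0.5), (1.0, 1.0, 1.0)]

# h(0, .) is the identity, h(1, .) is the constant map 0,
# and the whole deformation stays inside the ball.
starts_at_identity = all(h(0.0, x) == x for x in samples)
ends_at_origin = all(h(1.0, x) == (0.0, 0.0, 0.0) for x in samples)
stays_in_ball = all(
    norm(h(k / 10.0, x)) <= r for x in samples for k in range(11)
)
```

The third check reflects the convexity of the ball: the same formula does not keep a sphere $\partial B_r(0)$ inside itself, which is why part (b) needs a different argument.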
REMARK 5.5.11 The determination of the Lusternik-Schnirelman category of a subset $A$ of $Y$ is, in general, a rather complicated matter which is based on nontrivial results of cohomology and homotopy theories. That is why it is often preferable to use other topological indices (such as the genus). Nevertheless, we should point out that the Lusternik-Schnirelman category is maximal among all topological indices which are invariant under homeomorphisms and satisfy the properties of Proposition 5.5.5.
Next we show that the Lusternik-Schnirelman category can be employed to find critical levels of min-max type. So let $X$ be a Banach space and for $k \geq 1$, we define
$$\mathcal{A}_k \overset{df}{=} \{A \subseteq X : A \text{ is compact and } \mathrm{cat}_X(A) \geq k\}.$$
Clearly $\{\mathcal{A}_k\}_{k \geq 1}$ is a decreasing sequence, i.e.,
$$\mathcal{A}_k \subseteq \mathcal{A}_{k-1} \quad \forall\, k \geq 2.$$
For a given $\varphi \in C^1(X)$, we set
$$c_k \overset{df}{=} \inf_{A \in \mathcal{A}_k}\ \max_{x \in A} \varphi(x). \qquad (5.58)$$
Evidently $-\infty \leq c_1 \leq c_2 \leq \ldots < +\infty$.
THEOREM 5.5.12 If $\varphi \in C^1(X)$ satisfies the PS-condition and for some $k \geq 1$, we have $c_k > -\infty$, then $c_k$ is a critical value of $\varphi$. Moreover if $c = c_k = c_{k+1} = \ldots = c_{k+m}$, then
$$\mathrm{cat}_X(K_c^\varphi) \geq m + 1. \qquad (5.59)$$
PROOF It is enough to show (5.59) and then by Definition 5.5.3, $K_c^\varphi \neq \emptyset$.
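Note that because of the PS-condition, $K_c^\varphi$ is compact. According to Proposition 5.5.8, we can find a neighbourhood $U$ of $K_c^\varphi$, such that
$$\mathrm{cat}_X\big(\overline{U}\big) = \mathrm{cat}_X\big(K_c^\varphi\big).$$
Applying Theorem 5.1.34, we obtain $\varepsilon \in (0, 1)$ and let $A \in \mathcal{A}_{k+m}$ be such that
$$\max_{x \in A} \varphi(x) \leq c + \varepsilon.$$
Let
$$C \overset{df}{=} A \setminus U.$$
Then using Proposition 5.5.5, we have
$$k + m \leq \mathrm{cat}_X(A) \leq \mathrm{cat}_X\big(C \cup \overline{U}\big) \leq \mathrm{cat}_X(C) + \mathrm{cat}_X\big(\overline{U}\big) = \mathrm{cat}_X(C) + \mathrm{cat}_X(K_c^\varphi). \qquad (5.60)$$
If $h_t$ is the $\varphi$-decreasing, locally Lipschitz homotopy of homeomorphisms postulated by Theorem 5.1.34 and
$$D \overset{df}{=} h(1, C),$$
then
$$D \subseteq \varphi^{c-\varepsilon}.$$
So
$$\max_{x \in D} \varphi(x) \leq c - \varepsilon$$
and by the definition of $c$, we have that $\mathrm{cat}_X(D) \leq k - 1$. Then Proposition 5.5.5(c) implies that
$$\mathrm{cat}_X(C) \leq \mathrm{cat}_X(D) \leq k - 1. \qquad (5.61)$$
Using (5.61) in (5.60), we obtain
$$m + 1 \leq \mathrm{cat}_X\big(K_c^\varphi\big).$$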
Another result which can be deduced in the same way from the properties of the Lusternik-Schnirelman category and Theorem 5.1.34 is the following one. We leave the details to the reader.
PROPOSITION 5.5.13 If $\varphi \in C^1(X)$, $a < b$, $\varphi$ satisfies the PS$_c$-condition for all $c \in [a, b]$, $K_b^\varphi = \emptyset$ and $\mathrm{cat}_X\big(\varphi^a\big) < +\infty$, then $\mathrm{cat}_X\big(\varphi^b\big) < +\infty$.
Before passing to the material related to nonlinear eigenvalue problems, let us state the relation between the Lusternik-Schnirelman category and the genus. Let $X$ be a Banach space and let $P^\infty(X)$ be the infinite dimensional projective space over $X$ furnished with the quotient topology with respect to the canonical projection $\mathrm{proj}_{P^\infty(X)} : X \longrightarrow P^\infty(X)$, defined by
$$\mathrm{proj}_{P^\infty(X)}(x) \overset{df}{=} \{x, -x\} \quad \forall\, x \in X \setminus \{0\}.$$
The next theorem was proved by Rabinowitz (1973).
THEOREM 5.5.14 If $X$ is a Banach space and $A \in T$ (see Definition 5.4.25), then
$$\mathrm{cat}_{P^\infty(X)}\big(\mathrm{proj}_{P^\infty(X)}(A)\big) = \gamma(A).$$
COROLLARY 5.5.15 If $X$ is a Banach space, then
$$\mathrm{cat}_{\mathrm{proj}_{P^\infty(X)}(\partial B_1(0))}\big(\mathrm{proj}_{P^\infty(X)}(\partial B_1(0))\big) = \gamma\big(\partial B_1(0)\big) = \dim X.$$
In the rest of this section we deal with the method of Lagrange multipliers in constrained minimization which leads to nonlinear eigenvalue problems. So the problem under consideration is the following:
$$\inf_{x \in C} f(x), \quad \text{where } C = \{x \in X : g(x) = 0\}. \qquad (5.62)$$
In the next definition we abstract the idea that a tangent vector to a surface is the velocity vector of a trajectory in the surface. DEFINITION 5.5.16
Let X be a Banach space, C ⊆ X and x0 ∈ C.
(a) A curve in $C$ through $x_0$ is a continuous map $c : (-\varepsilon, \varepsilon) \longrightarrow X$, such that
$$c(t) \in C \quad \forall\, t \in (-\varepsilon, \varepsilon),$$
$c(0) = x_0$ and $c'(0)$ exists.
(b) A vector $v \in X$ is a tangent vector to $C$ at $x_0$ if and only if $v = c'(0)$ for some curve $c$ as in (a).
(c) If the set of all tangent vectors to $C$ at $x_0$ forms a vector space, it is denoted by $T_{x_0}(C)$ and called the tangent space to $C$ at $x_0$. Its translation $x_0 + T_{x_0}(C)$ is called the tangent plane to $C$ at $x_0$.
Using this intuitive definition, we can proceed to introduce the notion of a manifold. First we isolate those points of $C$ for which a local parametrization (local coordinates) exists.
DEFINITION 5.5.17 Let $X$ be a Banach space and $C \subseteq X$. A point $x_0 \in C$ is said to be regular, if $T_{x_0}(C)$ exists and it is closed and there exists a neighbourhood $V$ of the origin in $T_{x_0}(C)$ and a homeomorphism $\vartheta$ of $V$ onto a neighbourhood $U$ of $x_0$ in $C$.
REMARK 5.5.18 According to this definition $U = \vartheta(V)$ is a neighbourhood of $x_0$. Then each $x \in U$ can be described as $x = \vartheta(v)$ for some $v \in V$. We call $v \in V$ the local coordinate of $x \in U$ and the pair $(V, \vartheta)$ is a local parametrization (local coordinate system) to $C$ at $x_0$. Suppose that $x_1$ and $x_2$ are two regular points in $C$ with corresponding local coordinate systems $(V_1, \vartheta_1)$ and $(V_2, \vartheta_2)$ respectively. Let
$$U_1 \overset{df}{=} \vartheta_1(V_1) \quad \text{and} \quad U_2 \overset{df}{=} \vartheta_2(V_2).$$
Then every $x \in U_1 \cap U_2$ has local coordinate $v_1 = \vartheta_1^{-1}(x)$ for the local system $(V_1, \vartheta_1)$ and $v_2 = \vartheta_2^{-1}(x)$ for the local system $(V_2, \vartheta_2)$. Then the change of coordinates between the two local systems is described by $v_1 = \vartheta_1^{-1}\big(\vartheta_2(v_2)\big)$ (i.e., by the map $\vartheta_1^{-1} \circ \vartheta_2$).
DEFINITION 5.5.19 Let $X$ be a Banach space and $C \subseteq X$. We say that $C$ is a (topological) manifold, if every point of $C$ is regular. We say that $C$ is a $C^k$-manifold, if $C$ is a manifold and the changes of local coordinates $\vartheta_1^{-1} \circ \vartheta_2$ are $C^k$-maps.
REMARK 5.5.20 For a manifold $C$, the changes of local coordinates $\vartheta_1^{-1} \circ \vartheta_2$ are homeomorphisms. Similarly for a $C^k$-manifold all the maps $\vartheta_1^{-1} \circ \vartheta_2$ are $C^k$-diffeomorphisms.
EXAMPLE 5.5.21 It is routine to check that if $X$ is a Banach space and $\varphi \in C^1(X)$, then $C = \varphi^{-1}(r)$ is a $C^1$-manifold provided that $r$ is not a critical value of $\varphi$. The next theorem gives the generalization of this fact (see also Theorem 4.2.7).
THEOREM 5.5.22 (Lusternik Theorem) If $X$ and $Y$ are two Banach spaces, $\varphi \in C^1(X; Y)$,
$$C \overset{df}{=} \{x \in X : \varphi(x) = 0\}$$
and for every $x \in C$, $\varphi'(x)$ is surjective and has a complemented kernel $N\big(\varphi'(x)\big)$ (i.e., $X = N\big(\varphi'(x)\big) \oplus V$ for some closed subspace $V$ of $X$), then $C$ is a $C^1$-manifold and for every $x \in C$, we have
$$T_x(C) = N\big(\varphi'(x)\big).$$
PROOF Let $x \in C$. By hypothesis, we have
$$X = N\big(\varphi'(x)\big) \oplus V.$$
It follows that there exists a continuous projection $\mathrm{proj}_{N(\varphi'(x))}$ onto $N\big(\varphi'(x)\big)$. Let
$$\psi : N\big(\varphi'(x)\big) \oplus V \longrightarrow Y$$
be defined by
$$\psi(u, v) \overset{df}{=} \varphi(x + u + v) \quad \forall\, u \in N\big(\varphi'(x)\big),\ v \in V.$$
Note that $\psi(0, 0) = \varphi(x)$ and $\psi$ is continuously differentiable with
$$D_1\psi(0, 0) = \varphi'(x)|_{N(\varphi'(x))} = 0 \in L\big(N\big(\varphi'(x)\big); Y\big)$$
and
$$D_2\psi(0, 0) = \varphi'(x)|_V \in L(V; Y).$$
By the inverse function theorem (see Theorem 4.1.32), $D_2\psi(0, 0)$ is an isomorphism between $V$ and $Y$. Consider the equation
$$\psi(u, v) = \varphi(x) = 0.$$
The properties of $\psi$ permit the use of the implicit function theorem (see Theorem 4.1.27). So we can find $\delta > 0$ and a uniquely determined $C^1$-map
$$\xi : N\big(\varphi'(x)\big) \cap B_\delta(0) \longrightarrow V,$$
such that
$$0 = \varphi(x) = \psi\big(u, \xi(u)\big) \quad \forall\, u \in N\big(\varphi'(x)\big) \cap B_\delta(0),$$
$$\xi(0) = 0 \quad \text{and} \quad \xi'(0) = -D_2\psi(0, 0)^{-1} D_1\psi(0, 0).$$
Hence $\xi'(0) = 0$ and so
$$\lim_{u \to 0} \frac{\|\xi(u)\|_X}{\|u\|_X} = 0.$$
Now let
$$\eta : N\big(\varphi'(x)\big) \cap B_\delta(0) \longrightarrow X$$
be defined by
$$\eta(u) \overset{df}{=} x + u + \xi(u).$$
Evidently $\eta$ is continuous. By construction
$$0 = \varphi(x) = \psi\big(u, \xi(u)\big) = \varphi\big(x + u + \xi(u)\big)$$
and so we see that $x + u + \xi(u) \in C$, i.e., $\eta$ maps into $C$. Because $u$ and $\xi(u)$ are in complementary subspaces of $X$, we have that $\eta$ is injective. Hence $\eta$ is invertible on
$$W \overset{df}{=} \big\{x + u + \xi(u) : u \in N\big(\varphi'(x)\big) \cap B_\delta(0)\big\} \subseteq C$$
and so
$$\eta^{-1}\big(x + u + \xi(u)\big) = x + u.$$
Recall that
$$R\big(\mathrm{proj}_{N(\varphi'(x))}\big) = N\big(\varphi'(x)\big) \quad \text{and} \quad N\big(\mathrm{proj}_{N(\varphi'(x))}\big) = V.$$
So we can write that
$$\eta^{-1}\big(x + u + \xi(u)\big) = x + \mathrm{proj}_{N(\varphi'(x))}\big(u + \xi(u)\big), \qquad (5.63)$$
which shows that $\eta^{-1}$ is continuous. Therefore $\eta$ is a homeomorphism of $N\big(\varphi'(x)\big) \cap B_\delta(0)$ onto $W$. In fact from the properties of $\xi$ and (5.63), we conclude that $\eta$ is a diffeomorphism of $N\big(\varphi'(x)\big) \cap B_\delta(0)$ onto $W$. This implies that $C$ is a $C^1$-manifold. Next let $w \in T_x(C)$. Let $c$ be the corresponding curve, such that
$$\varphi\big(c(t)\big) = 0 \quad \forall\, t \in (-\varepsilon, \varepsilon),$$
and
$$c(0) = x, \quad c'(0) = w$$
(see Definition 5.5.17). Then by the chain rule, we have
$$\varphi'(x)w = 0,$$
i.e., $w \in N\big(\varphi'(x)\big)$. On the other hand, if $w \in N\big(\varphi'(x)\big)$, then
$$c(t) = \eta(tw), \quad t \in (-\varepsilon, \varepsilon),$$
is a curve satisfying the requirements of Definition 5.5.17. Hence $w \in T_x(C)$ and we conclude that
$$T_x(C) = N\big(\varphi'(x)\big).$$
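The identity $T_x(C) = N\big(\varphi'(x)\big)$ can be illustrated on the unit sphere of $\mathbb{R}^3$. The sketch below makes several assumptions that are not in the text: the constraint is $g(x) = \|x\|^2 - 1$, the base point is the north pole, and the curve through it is obtained by normalizing $x + tw$ (a convenient explicit choice, not the curve $\eta(tw)$ built in the proof). The velocity at $t = 0$ is approximated by a central difference.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def g(x: Vec) -> float:
    """Constraint whose zero set is the unit sphere (illustrative choice)."""
    return sum(c * c for c in x) - 1.0


def dg(x: Vec, w: Vec) -> float:
    """Derivative g'(x)w = 2 (x, w)."""
    return 2.0 * sum(a * b for a, b in zip(x, w))


def curve(x: Vec, w: Vec, t: float) -> Vec:
    """Curve on the sphere through x: normalize x + t w."""
    y = (x[0] + t * w[0], x[1] + t * w[1], x[2] + t * w[2])
    n = math.sqrt(sum(c * c for c in y))
    return (y[0] / n, y[1] / n, y[2] / n)


x: Vec = (0.0, 0.0, 1.0)   # point of the sphere
w: Vec = (1.0, 2.0, 0.0)   # element of the kernel of g'(x), since (x, w) = 0

step = 1e-5
plus, minus = curve(x, w, step), curve(x, w, -step)
# Central-difference approximation of c'(0).
velocity: Vec = (
    (plus[0] - minus[0]) / (2.0 * step),
    (plus[1] - minus[1]) / (2.0 * step),
    (plus[2] - minus[2]) / (2.0 * step),
)

on_sphere_error = max(abs(g(curve(x, w, t))) for t in (-0.1, -0.01, 0.01, 0.1))
velocity_error = max(abs(v - b) for v, b in zip(velocity, w))
kernel_defect = abs(dg(x, velocity))
```

Here `velocity_error` measures how close the numerical tangent vector is to `w`, and `kernel_defect` checks that this tangent vector lies in the kernel of $g'(x)$, in line with both inclusions of the proof.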
REMARK 5.5.23 Recall that every finite dimensional subspace F of X is complemented. Similarly if a closed subspace Z of X has finite codimension, then Z is complemented. Of course in a Hilbert space every closed subspace is complemented (by the orthogonal complement). In general, however, every Banach space that is not isomorphic to a Hilbert space has closed subspaces that are not complemented. If X and Y are Banach spaces and A ∈ L(X; Y ) is surjective, then it is easy to check that A has a right inverse if and only if N (A) is complemented. The next result establishes the method of Lagrange multipliers for constrained optimization problems in Banach spaces.
THEOREM 5.5.24 If $X$ and $Y$ are two Banach spaces, $f : X \longrightarrow \mathbb{R}$ and $g : X \longrightarrow Y$ are Fréchet differentiable maps, $x_0 \in X$ is a finite local minimizer of the constrained problem (5.62) and $g'(x_0) \in L(X; Y)$ is surjective with $N\big(g'(x_0)\big)$ complemented in $X$, then there exists $y^* \in Y^*$, such that $x_0 \in X$ is a critical point of
$$h(x) \overset{df}{=} f(x) - \langle y^*, g(x) \rangle_Y,$$
i.e., $f'(x_0) = y^* \circ g'(x_0)$.
PROOF By virtue of Theorem 5.5.22, if
$$C \overset{df}{=} \{x \in X : g(x) = 0\},$$
we have
$$T_{x_0}(C) = N\big(g'(x_0)\big).$$
Let $w \in N\big(g'(x_0)\big)$. We can find a curve $c$, such that
$$g\big(c(t)\big) = 0 \quad \forall\, t \in (-\varepsilon, \varepsilon), \quad c(0) = x_0 \quad \text{and} \quad c'(0) = w.$$
Then by the chain rule, we have $g'(x_0)w = 0$. Let
$$h(t) \overset{df}{=} f\big(c(t)\big) \quad \forall\, t \in (-\varepsilon, \varepsilon).$$
By hypothesis $h$ has a local minimum at $t = 0$. Therefore $h'(0) = 0$ and so
$$\langle f'(x_0), w \rangle_X = 0 \quad \forall\, w \in N\big(g'(x_0)\big).$$
So we have that
$$N\big(g'(x_0)\big) \subseteq N\big(f'(x_0)\big). \qquad (5.64)$$
By hypothesis
$$X = N\big(g'(x_0)\big) \oplus V,$$
with a closed subspace $V$ of $X$. Note that $L = g'(x_0)|_V$ is an isomorphism. Also let $\mathrm{proj}_V$ be the projection operator onto $V$. Because of (5.64), we have that
$$f'(x_0) \circ \mathrm{proj}_V = f'(x_0)$$
and
$$f'(x_0) \circ \mathrm{proj}_V : V \longrightarrow \mathbb{R}$$
is a continuous, linear functional (i.e., $f'(x_0) \circ \mathrm{proj}_V \in V^*$). Then $y^* = f'(x_0) \circ \mathrm{proj}_V \circ L^{-1} \in Y^*$. We have
$$\big(y^* \circ g'(x_0)\big)(x) = \big(y^* \circ g'(x_0)\big)\big(\mathrm{proj}_V(x)\big) = \big(y^* \circ L\big)\big(\mathrm{proj}_V(x)\big) = f'(x_0)(x),$$
so
$$f'(x_0) = y^* \circ g'(x_0).$$
We can generalize the above theorem by weakening the assumptions on the constraint function $g$. To do this we need the following partial generalization of Theorem 5.5.22.
THEOREM 5.5.25 If $X$ and $Y$ are two Banach spaces, $\varphi : X \longrightarrow Y$ is Fréchet differentiable, $\varphi'(x_0) \in L(X; Y)$ is surjective (i.e., $\varphi$ is regular at $x_0$) and
$$C \overset{df}{=} \{x \in X : \varphi(x) = \varphi(x_0)\},$$
then
$$T_{x_0}(C) = N\big(\varphi'(x_0)\big)$$
and there exist a neighbourhood $U$ of $x_0$, a number $\beta > 0$ and a mapping $U \ni u \longmapsto \xi(u) \in X$, such that
$$\varphi(x_0) = \varphi\big(u + \xi(u)\big) \quad \text{and} \quad \|\xi(u)\|_X \leq \beta \|\varphi(u) - \varphi(x_0)\|_Y \quad \forall\, u \in U.$$
Using this theorem, we can have the following generalization of Theorem 5.5.24.
THEOREM 5.5.26 If $X$ and $Y$ are two Banach spaces, $f : X \longrightarrow \mathbb{R}$ is Fréchet differentiable at $x_0$, $g : X \longrightarrow Y$ is continuously Fréchet differentiable at $x_0$ with $g'(x_0) \in L(X; Y)$ having closed range and $x_0 \in C \overset{df}{=} \{x \in X : g(x) = 0\}$ is a finite local minimizer of the constrained problem (5.62), then there exist $\lambda_0 \in \mathbb{R}$ and $y^* \in Y^*$ not both equal to zero, such that
$$\lambda_0 f'(x_0) = y^* \circ g'(x_0) \quad \text{in } X^*. \qquad (5.65)$$
Moreover, if $g'(x_0)$ is surjective (i.e., $g$ is regular at $x_0$), then we have that $\lambda_0 \neq 0$ and so we can take $\lambda_0 = 1$.
PROOF In the degenerate case when $g'(x_0) \in L(X; Y)$ is not surjective, by a well known corollary of the Hahn-Banach theorem (see, e.g., Denkowski, Migórski & Papageorgiou (2003a, p. 263)), we can satisfy (5.65) with $\lambda_0 = 0$. Otherwise, using Theorem 5.5.25, we can show that (5.64) holds. Also since now we have that
$$R\big(g'(x_0)\big) = Y,$$
by virtue of Lemma 3.1.42, we have
$$R\big(g'(x_0)^*\big) = N\big(g'(x_0)\big)^\perp = \big\{x^* \in X^* : \langle x^*, x \rangle_X = 0 \text{ for all } x \in N\big(g'(x_0)\big)\big\}.$$
Then from (5.64), we see that
$$f'(x_0) \in N\big(g'(x_0)\big)^\perp = R\big(g'(x_0)^*\big)$$
and so for some $y^* \in Y^*$, we have $f'(x_0) = g'(x_0)^* y^*$. Therefore
$$\langle f'(x_0), x \rangle_X = \langle g'(x_0)^* y^*, x \rangle_X = \langle y^*, g'(x_0)x \rangle_Y \quad \forall\, x \in X,$$
so
$$f'(x_0) = y^* \circ g'(x_0) \quad \text{in } X^*,$$
i.e., (5.65) holds with $\lambda_0 = 1$.
REMARK 5.5.27 In Theorems 5.5.24 and 5.5.26 the quantities $\lambda_0 \in \mathbb{R}$ and $y^* \in Y^*$ are called Lagrange multipliers. If $Y$ is finite dimensional, say $Y = \mathbb{R}^N$, then every $y^* \in Y^*$ is characterized uniquely by an $N$-tuple of numbers $(\lambda_1, \ldots, \lambda_N)$ and so Theorem 5.5.24 (see also Theorem 5.5.26) reduces to the well known theorem from multivariable calculus on the existence of Lagrange multipliers.
THEOREM 5.5.28 If $f : \mathbb{R}^k \longrightarrow \mathbb{R}$ is a differentiable function, $g : \mathbb{R}^k \longrightarrow \mathbb{R}^N$ is continuously differentiable, $f$ attains at $x_0 \in \mathbb{R}^k$ a local minimum on the set
$$C \overset{df}{=} \{x \in \mathbb{R}^k : g(x) = 0\}$$
and $x_0$ is a regular point of $g$ (i.e., $g'(x_0) \in L\big(\mathbb{R}^k; \mathbb{R}^N\big)$ is surjective), then we can find $\hat{\lambda} = \{\lambda_i\}_{i=1}^N \in \mathbb{R}^N$, such that
$$\frac{\partial f}{\partial x_j}(x_0) = \sum_{i=1}^{N} \lambda_i \frac{\partial g_i}{\partial x_j}(x_0) \quad \forall\, j \in \{1, \ldots, k\}.$$
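A small numerical check of the multiplier rule may help fix ideas. The sketch below is not taken from the text: it assumes $k = 2$, $N = 1$, the hypothetical objective $f(x, y) = x + y$ and the constraint $g(x, y) = x^2 + y^2 - 1$, for which the constrained minimizer is $x_0 = (-1/\sqrt{2}, -1/\sqrt{2})$. The multiplier is recovered by projecting $\nabla f(x_0)$ onto $\nabla g(x_0)$.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]


def f(x: Vec) -> float:
    """Hypothetical objective."""
    return x[0] + x[1]


def grad_f(x: Vec) -> Vec:
    return (1.0, 1.0)


def g(x: Vec) -> float:
    """Hypothetical constraint: the unit circle is {g = 0}."""
    return x[0] ** 2 + x[1] ** 2 - 1.0


def grad_g(x: Vec) -> Vec:
    return (2.0 * x[0], 2.0 * x[1])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1]


# Known minimizer of f on the unit circle.
x0: Vec = (-1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0))

gf, gg = grad_f(x0), grad_g(x0)
# x0 is a regular point (grad g(x0) != 0), so the multiplier is well defined.
lam = dot(gf, gg) / dot(gg, gg)
# Residual of the system  df/dx_j = lam * dg/dx_j,  j = 1, 2.
residual = max(abs(gf[j] - lam * gg[j]) for j in range(2))

# Compare f(x0) with f on sampled points of the constraint set.
n = 720
is_minimum = all(
    f((math.cos(2.0 * math.pi * i / n), math.sin(2.0 * math.pi * i / n)))
    >= f(x0) - 1e-12
    for i in range(n)
)
```

In this example the multiplier equals $-1/\sqrt{2}$; the sampled comparison is only a sanity check of minimality on a grid, not a proof.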
REMARK 5.5.29 According to the above theorem, the local extrema of $f$ subject to the constraint $x \in C \overset{df}{=} \{x \in \mathbb{R}^k : g(x) = 0\}$ are included in the solutions $\big(\hat{\lambda} = (\lambda_i)_{i=1}^N, x_0\big) \in \mathbb{R}^N \times \mathbb{R}^k$ of the algebraic system
$$\begin{cases} \dfrac{\partial f}{\partial x_j}(x_0) = \lambda_1 \dfrac{\partial g_1}{\partial x_j}(x_0) + \ldots + \lambda_N \dfrac{\partial g_N}{\partial x_j}(x_0) & \text{for } j \in \{1, \ldots, k\}, \\ g_i(x_0) = 0 & \text{for } i \in \{1, \ldots, N\}. \end{cases}$$
This is a system of $k + N$ equations with $k + N$ unknowns.
Let us see how we can use the method of Lagrange multipliers to establish the existence of eigenvalues for certain linear, bounded, self-adjoint operators in Hilbert spaces.
EXAMPLE 5.5.30 Let $H$ be a Hilbert space and suppose that $A \in L(H)$ is self-adjoint. Let
$$f(x) \overset{df}{=} (Ax, x)_H, \quad g(x) \overset{df}{=} \|x\|_H - 1$$
and
$$C \overset{df}{=} \{x \in H : g(x) = 0\}.$$
Suppose that $x_0 \in H$, $\|x_0\|_H = 1$ is a solution of the minimization problem
$$\min_{x \in C} f(x).$$
Then by virtue of Theorem 5.5.24 with $Y = \mathbb{R}$, we can find $\lambda \in \mathbb{R}$, such that $f'(x_0) = \lambda g'(x_0)$. Since $f'(x_0) = 2Ax_0$ and $g'(x_0) = x_0$ (recall that $\|x_0\|_H = 1$), this gives $Ax_0 = \frac{\lambda}{2} x_0$. So $x_0 \in H$ is an eigenvector of $A$ with corresponding real eigenvalue $\frac{\lambda}{2}$. Here we have used the fact that every $x \in C$ is a regular point of $g$.
Next we extend the notion of critical point to constrained functionals.
DEFINITION 5.5.31 Let $X$ be a Banach space, $C \subseteq X$ and $\varphi : X \longrightarrow \mathbb{R}$. We say that $\varphi$ has a critical point with respect to $C$ at $x_0$, if $x_0 \in C$ and for every curve $c : (-\varepsilon, \varepsilon) \longrightarrow X$, such that $c(t) \in C$ for all $t \in (-\varepsilon, \varepsilon)$, $c(0) = x_0$ and $c'(0)$ exists, we have
$$\frac{d}{dt} \varphi\big(c(t)\big)\Big|_{t=0} = 0. \qquad (5.66)$$
REMARK 5.5.32 If $\mathrm{int}\, C \neq \emptyset$ and $x_0 \in \mathrm{int}\, C$, then $x_0$ is a usual critical point of $\varphi$, often called a free critical point of $\varphi$.
From (5.66) and the chain rule, we have the following criterion.
PROPOSITION 5.5.33 If $X$ is a Banach space, $C \subseteq X$, $\varphi : X \longrightarrow \mathbb{R}$ is a function which is Fréchet differentiable at $x_0 \in C$ and the set $C$ has a tangent space $T_{x_0}(C)$ at $x_0$, then $x_0$ is a critical point of $\varphi$ with respect to $C$ if and only if
$$\varphi'(x_0)|_{T_{x_0}(C)} = 0.$$
We can have another criterion based on Lagrange multipliers.
PROPOSITION 5.5.34 If $X$ and $Y$ are two Banach spaces, $f : X \longrightarrow \mathbb{R}$ is Fréchet differentiable, $g : X \longrightarrow Y$ is a continuously Fréchet differentiable function, $x_0 \in X$,
$$C \overset{df}{=} \{x \in X : g(x) = 0\}$$
and $g'(x_0) \in L(X; Y)$ is surjective with $N\big(g'(x_0)\big)$ complemented in $X$, then $f$ has a critical point with respect to $C$ at $x_0$ if and only if there exists $y^* \in Y^*$, such that $f'(x_0) = y^* \circ g'(x_0)$ in $X^*$.
PROOF "$\Longrightarrow$": From Theorem 5.5.22 (see also Theorem 5.5.25), we know that
$$T_{x_0}(C) = N\big(g'(x_0)\big).$$
So from Proposition 5.5.33, we know that
$$f'(x_0)v = 0 \quad \forall\, v \in N\big(g'(x_0)\big),$$
hence
$$N\big(g'(x_0)\big) \subseteq N\big(f'(x_0)\big).$$
From this inclusion, arguing as in the proof of Theorem 5.5.26, we obtain $y^* \in Y^*$, such that
$$f'(x_0) = y^* \circ g'(x_0) \quad \text{in } X^*. \qquad (5.67)$$
"$\Longleftarrow$": Suppose that (5.67) holds and let $c : (-\varepsilon, \varepsilon) \longrightarrow X$ be a curve, such that $c(t) \in C$ for all $t \in (-\varepsilon, \varepsilon)$, $c(0) = x_0$ and $c'(0)$ exists. Then $g\big(c(t)\big) = 0$ for all $t \in (-\varepsilon, \varepsilon)$ and $g'(x_0)c'(0) = 0$. Because of (5.67), we have that
$$\langle f'(x_0), c'(0) \rangle_X = 0$$
and so
$$\frac{d}{dt} f\big(c(t)\big)\Big|_{t=0} = 0,$$
which means that $x_0$ is a critical point of $f$ with respect to $C$.
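The link between constrained critical points and eigenvalues can be seen on a $2 \times 2$ example. The sketch below is an illustration under assumptions not made in the text: $H = \mathbb{R}^2$, $A$ is the hypothetical symmetric matrix with rows $(2, 1)$ and $(1, 2)$, and the constraint is written as $g(x) = \|x\|^2 - 1$, so that $f'(x_0) = \lambda g'(x_0)$ reads $Ax_0 = \lambda x_0$ directly. The minimizer of $f(x) = (Ax, x)$ on the unit circle is located on a grid of angles that happens to contain the exact minimizer.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]
Mat = Tuple[Vec, Vec]

# Hypothetical symmetric matrix with eigenvalues 1 and 3.
A: Mat = ((2.0, 1.0), (1.0, 2.0))


def apply(M: Mat, x: Vec) -> Vec:
    """Matrix-vector product M x."""
    return (M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1]


def f(x: Vec) -> float:
    """Quadratic form f(x) = (Ax, x)."""
    return dot(apply(A, x), x)


# Sample the constraint set C = {x : |x| = 1} at one-degree steps.
n = 360
grid: List[Vec] = [
    (math.cos(2.0 * math.pi * k / n), math.sin(2.0 * math.pi * k / n))
    for k in range(n)
]

# Constrained minimizer on the grid and the associated multiplier,
# computed as the Rayleigh quotient of x0.
x0 = min(grid, key=f)
lam = f(x0) / dot(x0, x0)

# Residual of the eigenvalue equation A x0 = lam x0.
Ax0 = apply(A, x0)
residual = max(abs(Ax0[i] - lam * x0[i]) for i in range(2))
```

The multiplier found this way is the smallest eigenvalue of the matrix, which matches the variational characterization recalled at the beginning of the section.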
5.6 Remarks
5.1: Deformation techniques along the gradient flow already appeared in the 1960s in the works of Browder (1965a) and Schwartz (1964). In the deformation approach, the Palais-Smale condition (PS-condition; see Definition 5.1.5(a)) and the pseudogradient vector field (see Definition 5.1.16) play a central role and were introduced by Palais & Smale (1964) and Palais (1966) respectively. The slight extension of the PS-condition, the Cerami condition (C-condition; see Definition 5.1.5(b)), was introduced by Cerami (1978). Both conditions replaced the compactness of the domain of the energy functionals. The connection between the PS-condition and coercivity (see Theorem 5.1.13) is due to Costa & Silva (1991). Analogous results can be found in the works of Čaklović, Li & Willem (1990), Goeleven (1993) and Gasiński & Papageorgiou (2005) (for nonsmooth functionals). For Theorem 5.1.21 we refer to Cartan (1967, p. 122). The basic deformation result is due to Clark (1972). Proposition 5.1.25 and its consequences (see Corollaries 5.1.26, 5.1.27, 5.1.28) are quantitative versions of the result of Clark due to Willem (1996). Theorem 5.1.32 is due to Du (1991). The second deformation theorem (see Theorem 5.1.33) is due to Rothe (1973), Marino & Prodi (1975), Chang (1981) and Wang (1987) (see also the book of Chang (1993, p. 23)). The deformation theorem based on the Cerami condition (see Theorem 5.1.34) is due to Bartolo, Benci & Fortunato (1983). A nonsmooth generalization of it can be found in Gasiński & Papageorgiou (2005). The extension of the deformation approach to functionals of the form $\varphi = \varphi_1 + \varphi_2$ with $\varphi_1 \in C^1(X)$ and $\varphi_2 \in \Gamma_0(X)$ (see Theorem 5.1.42) is due to Szulkin (1986). In Gasiński & Papageorgiou (2005) we find a further generalization of Szulkin's theory to functionals $\varphi = \varphi_1 + \varphi_2$ with $\varphi_1$ being locally Lipschitz.
5.2: The notion of linking sets (see Definition 5.2.1) is essentially due to Benci & Rabinowitz (1979). The main minimax results which generate the others by appropriate choice of the sets (see Theorems 5.2.4 and 5.2.5) are generalized versions of a theorem which can be found in Struwe (1990, p. 118). The mountain pass theorem (see Corollary 5.2.7) is due to Ambrosetti & Rabinowitz (1973). The saddle point theorem (see Corollary 5.2.8) is due to Rabinowitz (1978b), while the generalized mountain pass theorem (see Corollary 5.2.9) is due to Rabinowitz (1978a) (see also Rabinowitz (1986)). The extensions to functionals of the form $\varphi = \varphi_1 + \varphi_2$ with $\varphi_1 \in C^1(X)$ and $\varphi_2 \in \Gamma_0(X)$ presented in the last part of Section 5.2 are due to Szulkin (1986). Additional results and developments of the deformation and minimax techniques, with applications to boundary value problems, can be found in Ambrosetti (1992), Du (1991), Fang & Ghoussoub (1992), Ghoussoub (1993a, 1993b), Mawhin & Willem (1989), Nirenberg (1981, 1989), Papageorgiou & Papageorgiou (2004), Ramos & Rebelo (1994) and Silva (1991).
5.3: Theorem 5.3.5 is due to Hofer (1985) (for a generalization see also Hofer (1988)). Theorem 5.3.6 can be found in Struwe (1990, p. 129) and is also related to results of Chang (1993) and Pucci & Serrin (1985, 1987). Similar results can be found in Hofer (1984, 1986) and Manes & Micheletti (1973).
5.4: The notion of local linking (see Definition 5.4.1) was introduced by Liu & Li (1984) under the stronger assumption that dim Y < +∞ and ϕ(v) > r > 0 for all v ∈ V with $\|v\|_V = r$. Theorems 5.4.6 and 5.4.7 are due to Brézis & Nirenberg (1991). For the proof of Theorem 5.4.10 (Morse lemma) we refer to Schwartz (1969, p. 136). Theorems 5.4.21 and 5.4.23 are due to Li & Willem (1995). Theorem 5.4.24 is due to Gasiński & Papageorgiou (2005). The notion of genus (see Definition 5.4.25) was introduced by Krasnoselskii (1964b) (see also Krasnoselskii & Zabreiko (1984, p. 385)). Here we use the definition of genus given by Coffman (1969). The equivalence of the definition of Coffman and of the original definition of Krasnoselskii can be found in Rabinowitz (1973). A presentation of the various properties of the genus can be found in Mawhin & Willem (1989), Rabinowitz (1986) and Struwe (1990). The proof of Theorem 5.5.25 can be found in Ioffe & Tihomirov (1979, p. 30). Theorems 5.4.33 and 5.4.34 are due to Szulkin (1986) and generalize earlier results of Rabinowitz (1986) with ϕ2 = 0.
5.5: The Lusternik-Schnirelman category (see Definition 5.5.3) is the first example of a topological index and was introduced by Lusternik & Schnirelman (1934), who also proved the basic multiplicity result, Theorem 5.5.12. Here we give a restricted definition of a manifold C by requiring that C lies in a fixed Banach space X. A more general definition can be found in Lang (1972). Theorems 5.5.22 and 5.5.24 are due to Lusternik (1934). In this section, the crucial assumption is that $R\big(g'(x_0)\big) = Y$ (regularity of the map $g : X \to Y$ at $x_0$). So let us give a criterion for such surjectivity to hold (see Yosida (1978, p. 208)).

THEOREM 5.6.1 If X and Y are two Banach spaces and $A : X \supseteq D(A) \to Y$ is a closed and densely defined, linear operator, then A is surjective (i.e., $R(A) = Y$) if and only if $A^*$ has a continuous inverse (i.e., there exists c > 0 such that $\|A^* y^*\|_{X^*} \ge c\,\|y^*\|_{Y^*}$ for all $y^* \in D(A^*)$).

EXAMPLE 5.6.2 If X = H is a Hilbert space and $A : H \supseteq D(A) \to H$ is a closed, densely defined, linear operator, such that there exists c > 0 for which we have
\[ (Ax, x)_H \ge c\,\|x\|_H^2 \quad \forall\, x \in D(A) \]
(i.e., A is strongly monotone), then $R(A) = H$.

Additional results on constrained infinite dimensional optimization can be found in Blanchard & Brüning (1992), Ioffe & Tihomirov (1979), Vaĭnberg (1973) and Zeidler (1985b).
Chapter 6 Eigenvalue Problems and Maximum Principles
In this chapter we continue our investigation of the methods and techniques used in the study of linear and nonlinear elliptic partial differential equations. First we analyze the spectrum of linear elliptic differential operators and then we pass to certain nonlinear ones, namely the p-Laplacian. Our analysis involves energy methods which are based on the critical point theory developed in the previous chapter. In the last two sections we develop the other distinct technique in the study of stationary partial differential equations, which is based on maximum principles. This method leads to pointwise deductions and for this reason requires higher smoothness of the functions; it differs from the integral-based energy methods, which are developed in the framework of Sobolev spaces. In Section 6.1 we study linear eigenvalue problems with weight for general linear elliptic differential operators in divergence form. As a particular case we obtain a complete description of the spectrum of the negative Laplacian with Dirichlet or Neumann boundary conditions. We also obtain variational characterizations of the eigenvalues via minimax expressions (Courant's theory). At the end of the section we present the basic maximum principles for linear elliptic partial differential equations (weak and strong (Hopf) maximum principles). In Section 6.2 we pass to nonlinear elliptic partial differential operators and examine the spectrum of the p-Laplacian. The relevant eigenvalue problem is nonlinear and we use energy methods (critical point theory; see Chapter 5) to study it. We determine the beginning of the spectrum of the negative p-Laplacian with Dirichlet or Neumann boundary conditions and establish the existence of a principal eigenvalue. Our investigations also involve some basic nonlinear regularity results. In Section 6.3 we examine the ordinary (one dimensional) p-Laplacian differential operator.
We consider both the scalar and vector ordinary p-Laplacian differential operators under Dirichlet, Neumann or periodic boundary conditions. In Section 6.4 we develop versions of the maximum principle when the differential operator is nonlinear and involves the p-Laplacian. We also state the nonlinear Picone’s identity and illustrate how it can be used in the study of nonlinear eigenvalue problems.
Finally in Section 6.5 we prove comparison results involving the p-Laplacian. In contrast to the linear case, where the comparison principles are an easy consequence of the maximum principle, the nonlinear case is more complicated due to the degeneracy of the operator.
6.1 Linear Elliptic Operators
In this section we develop the theory of linear eigenvalue problems with weights. This theory is an important tool in the study of semilinear boundary value problems. The theory is based on the spectral properties of compact self-adjoint operators on a Hilbert space. Some first results in this direction were obtained in Section 3.1. Here we continue this study and obtain additional results which eventually will be used in the case of linear elliptic differential operators. The mathematical framework of our analysis is a Hilbert space $H$ and a linear compact self-adjoint operator $A : H \to H$ (i.e., $A \in \mathcal{L}_c(H)$ and it is self-adjoint).

PROPOSITION 6.1.1 If
\[ \lambda_1 \stackrel{df}{=} \sup_{\|x\|_H = 1} (Ax, x)_H > 0, \]
then there exists $u_1 \in H$ with $\|u_1\|_H = 1$, such that
\[ Au_1 = \lambda_1 u_1 \quad \text{and} \quad (Au_1, u_1)_H = \lambda_1. \]

PROOF Clearly $\lambda_1 < +\infty$. Let
\[ \{x_n\}_{n \ge 1} \subseteq \partial B_1(0) = \{x \in H : \|x\|_H = 1\} \]
be a sequence, such that $(Ax_n, x_n)_H \nearrow \lambda_1$. By passing to a subsequence if necessary, we may assume that
\[ x_n \xrightarrow{\;w\;} u_1 \quad \text{in } H. \]
Since $A$ is compact, passing to a further subsequence if necessary, we may also assume that $Ax_n \to Au_1$ in $H$. Hence $(Au_1, u_1)_H = \lambda_1$. Since $\|u_1\|_H \le 1$ (weak lower semicontinuity of the norm) and $\lambda_1 = (Au_1, u_1)_H \le \lambda_1 \|u_1\|_H^2$ with $\lambda_1 > 0$, we also have $\|u_1\|_H = 1$. Let
\[ S \stackrel{df}{=} \lambda_1\, \mathrm{id}_H - A. \]
Then
\[ (Su_1, u_1)_H = 0 \quad \text{and} \quad (Sx, x)_H \ge 0 \quad \forall\, x \in H. \]
Let $h \in H$ and let $x = u_1 + th$ with $t \in \mathbb{R}$. We have
\[ 0 \le (Sx, x)_H = (Su_1, u_1)_H + 2t\,(Su_1, h)_H + t^2 (Sh, h)_H = t^2 (Sh, h)_H + 2t\,(Su_1, h)_H. \]
Let $t = r\,(Su_1, h)_H$ with $r \in \mathbb{R}$. We obtain
\[ 0 \le r\,(Su_1, h)_H^2 \big[ r\,(Sh, h)_H + 2 \big]. \]
Taking $r < 0$ with $|r| > 0$ small enough, we see that we must have
\[ (Su_1, h)_H = 0 \quad \forall\, h \in H. \]
So $Au_1 = \lambda_1 u_1$.

In a similar fashion we show the next proposition.

PROPOSITION 6.1.2 If
\[ \lambda_{-1} \stackrel{df}{=} \inf_{\|x\|_H = 1} (Ax, x)_H < 0, \]
then there exists $u_{-1} \in H$ with $\|u_{-1}\|_H = 1$, such that
\[ Au_{-1} = \lambda_{-1} u_{-1} \quad \text{and} \quad (Au_{-1}, u_{-1})_H = \lambda_{-1}. \]

REMARK 6.1.3 From Proposition 3.1.54 we know that $\lambda_1$ is the largest eigenvalue of $A$ and $\lambda_{-1}$ is the smallest one. By passing to the orthogonal complements of the eigenspaces $\mathbb{R}u_1$, $\mathbb{R}u_{-1}$, we can produce the other eigenvalues of $A$ in a similar fashion. The whole process is based on the following elementary fact from the theory of bounded linear operators on a Hilbert space.

LEMMA 6.1.4 If $V$ is an $A$-invariant subspace of $H$ (i.e., $A(V) \subseteq V$), then $A \in \mathcal{L}_c(V^\perp)$ and it is self-adjoint.

PROPOSITION 6.1.5 If
\[ \lambda_n \stackrel{df}{=} \sup \big\{ (Ax, x)_H : \|x\|_H = 1,\ x \perp u_k \text{ for } k \in \{1, \ldots, n-1\} \big\} > 0, \]
then there exists $u_n \in H$ with $\|u_n\|_H = 1$, such that
\[ Au_n = \lambda_n u_n \quad \text{and} \quad (Au_n, u_n)_H = \lambda_n. \]
PROPOSITION 6.1.6 If
\[ \lambda_{-n} \stackrel{df}{=} \inf \big\{ (Ax, x)_H : \|x\|_H = 1,\ x \perp u_{-k} \text{ for } k \in \{1, \ldots, n-1\} \big\} < 0, \]
then there exists $u_{-n} \in H$ with $\|u_{-n}\|_H = 1$, such that
\[ Au_{-n} = \lambda_{-n} u_{-n} \quad \text{and} \quad (Au_{-n}, u_{-n})_H = \lambda_{-n}. \]

REMARK 6.1.7 The expressions for $\lambda_n$ and $\lambda_{-n}$ in the above propositions have the drawback that they require knowledge of all previous positive (respectively negative) eigenvalues. This is remedied in the next two propositions.

PROPOSITION 6.1.8 For every integer $n \ge 1$, we have
\[ \lambda_n = \inf_{Y \in L_{n-1}} \ \sup_{\substack{x \in Y^\perp \\ \|x\|_H = 1}} (Ax, x)_H, \qquad (6.1) \]
where $L_{n-1}$ is the set of all $(n-1)$-dimensional subspaces $Y$ of $H$.

PROOF Let $m_n$ be the right hand side of (6.1). If
\[ Y \stackrel{df}{=} \mathrm{span}\,\{u_k\}_{k=1}^{n-1}, \]
then from Proposition 6.1.5, we infer that $m_n \le \lambda_n$. On the other hand, let $\{v_k\}_{k=1}^{n-1}$ be mutually orthogonal vectors and let us set
\[ Y \stackrel{df}{=} \mathrm{span}\,\{v_k\}_{k=1}^{n-1}. \]
Let
\[ u \stackrel{df}{=} \sum_{i=1}^{n} \beta_i u_i \]
be such that
\[ (u, v_k)_H = 0 \quad \forall\, k \in \{1, \ldots, n-1\} \quad \text{and} \quad \sum_{i=1}^{n} \beta_i^2 = 1 \]
(i.e., $\|u\|_H = 1$, the vector is normalized). We have
\[ (Au, u)_H = \sum_{i=1}^{n} \beta_i^2 \lambda_i \ge \lambda_n, \]
so
\[ \sup_{\substack{x \in Y^\perp \\ \|x\|_H = 1}} (Ax, x)_H \ge \lambda_n \quad \forall\, Y \in L_{n-1} \]
and thus $m_n \ge \lambda_n$. Therefore we conclude that $\lambda_n = m_n$.
In a similar fashion we obtain analogous expressions for the negative eigenvalues.

PROPOSITION 6.1.9 For every integer $n \ge 1$, we have
\[ \lambda_{-n} = \sup_{Y \in L_{n-1}} \ \inf_{\substack{x \in Y^\perp \\ \|x\|_H = 1}} (Ax, x)_H. \qquad (6.2) \]

Still we detect something unsatisfactory in (6.1) and (6.2). Namely in (6.1) (respectively (6.2)), first we perform a maximization (respectively minimization) over an infinite dimensional subspace of $H$. In the next two propositions we fix this.

PROPOSITION 6.1.10 For every integer $n \ge 1$, we have
\[ \lambda_n = \sup_{Y \in L_n} \ \inf_{\substack{x \in Y \\ \|x\|_H = 1}} (Ax, x)_H. \qquad (6.3) \]

PROOF Let $\xi_n$ be the right hand side of (6.3). Let $Y = \mathrm{span}\,\{u_k\}_{k=1}^{n}$. Let $x \in Y$ be such that $\|x\|_H = 1$. We have $x = \sum_{k=1}^{n} \beta_k u_k \in Y$, with $\|x\|_H^2 = \sum_{k=1}^{n} \beta_k^2 = 1$. So
\[ (Ax, x)_H = \sum_{k=1}^{n} \beta_k^2 \lambda_k \ge \lambda_n, \]
and hence $\xi_n \ge \lambda_n$. On the other hand, if $Y \in L_n$, we choose $x \in Y$, $\|x\|_H = 1$, such that
\[ x \perp u_k \quad \forall\, k \in \{1, \ldots, n-1\}. \]
From Proposition 6.1.5, we know that $(Ax, x)_H \le \lambda_n$, so $\xi_n \le \lambda_n$. Therefore we conclude that $\lambda_n = \xi_n$.

Similarly for the negative eigenvalues.

PROPOSITION 6.1.11 For every integer $n \ge 1$, we have
\[ \lambda_{-n} = \inf_{Y \in L_n} \ \sup_{\substack{x \in Y \\ \|x\|_H = 1}} (Ax, x)_H. \qquad (6.4) \]

REMARK 6.1.12 In the above characterizations the inf and sup operations are attained. So they can be replaced by min and max.
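The recursive construction of Propositions 6.1.1 and 6.1.5 can be tried out in finite dimensions. The following is only a minimal sketch, assuming $H = \mathbb{R}^3$ and a symmetric positive definite matrix chosen purely for illustration; the helper names (`next_eigenpair`, `project_out`) are hypothetical. It maximizes $(Ax, x)$ over unit vectors orthogonal to the eigenvectors already found, using power iteration, which is a swapped-in numerical technique and not the argument used in the proofs above. Power iteration finds the eigenvalue of largest modulus, so the sketch relies on all eigenvalues of the sample matrix being positive.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def apply(A, x):
    return [dot(row, x) for row in A]

def project_out(x, vectors):
    # Remove the components of x along each previously found unit eigenvector,
    # i.e. restrict to the orthogonal complement as in Proposition 6.1.5.
    for u in vectors:
        c = dot(x, u)
        x = [a - c * b for a, b in zip(x, u)]
    return x

def normalize(x):
    n = math.sqrt(dot(x, x))
    return [a / n for a in x]

def next_eigenpair(A, found, start, iterations=500):
    # Power iteration on the orthogonal complement of span(found).
    x = normalize(project_out(list(start), found))
    for _ in range(iterations):
        x = normalize(project_out(apply(A, x), found))
    return dot(apply(A, x), x), x  # Rayleigh quotient (Ax, x) and unit vector

# Illustrative matrix (an assumption): its eigenvalues are 3, 1 and 0.5.
A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 0.0],
     [0.0, 0.0, 0.5]]

eigenvalues, eigenvectors = [], []
for _ in range(3):
    lam, u = next_eigenpair(A, eigenvectors, [1.0, 0.5, 0.25])
    eigenvalues.append(lam)
    eigenvectors.append(u)

print(eigenvalues)
```

Each pass only needs the eigenvectors found so far, which is exactly the dependence on earlier eigenpairs that Remark 6.1.7 points out and that the minimax formulas (6.1) to (6.4) avoid.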
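Next we will use the above abstract results to describe the solutions of a linear eigenvalue problem. So let $\Omega \subseteq \mathbb{R}^N$ ($N \ge 2$) be a bounded domain (i.e., a bounded connected open set). We introduce the linear differential operator in divergence form, defined by
\[ Lx \stackrel{df}{=} -\sum_{i,j=1}^{N} \frac{\partial}{\partial z_j} \Big( a_{ij}(z) \frac{\partial x}{\partial z_i} \Big) + a_0(z)\,x, \]
where $a_{ij} \in L^\infty(\Omega)$, $a_{ij} = a_{ji}$, $a_0 \in L^{N/2}(\Omega)$ and $a_0(z) \ge 0$ for almost all $z \in \Omega$. We also assume that the operator $L$ is uniformly elliptic in $\Omega$, namely there exists $c_0 > 0$, such that
\[ \sum_{i,j=1}^{N} a_{ij}(z)\,\xi_i \xi_j \ge c_0 \|\xi\|_{\mathbb{R}^N}^2 \quad \text{for a.a. } z \in \Omega \text{ and all } \xi = (\xi_k)_{k=1}^{N} \in \mathbb{R}^N. \]
Uniform ellipticity of the operator $L$ means that for almost all $z \in \Omega$, the $N \times N$-matrix $B(z) = \big(a_{ij}(z)\big)_{i,j=1}^{N}$ is symmetric, positive definite with the smallest eigenvalue greater than or equal to $c_0 > 0$. If
\[ a_{ij} = \delta_{ij} \stackrel{df}{=} \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j \end{cases} \quad \forall\, i, j \in \{1, \ldots, N\} \]
and $a_0 \equiv 0$, then $L = -\Delta$. We consider the following weighted linear eigenvalue problem:
\[ \begin{cases} Lx = \lambda m x & \text{in } \Omega, \\ x|_{\partial\Omega} = 0. \end{cases} \qquad (6.5) \]
Here $m : \Omega \to \mathbb{R}$ is the weight function and it is assumed to belong to $L^\infty(\Omega)$. The important feature of (6.5), which we want to emphasize, is that the weight function $m$ can change sign in $\Omega$. Consider the bilinear form $b : H_0^1(\Omega) \times H_0^1(\Omega) \to \mathbb{R}$, defined by
\[ b(x, y) \stackrel{df}{=} \int_\Omega \Big[ \sum_{i,j=1}^{N} a_{ij}(z)\, D_i x\, D_j y + a_0(z)\,x y \Big]\, dz. \]
Since by hypothesis $a_0 \in L^{N/2}(\Omega)$, $x, y \in L^{2^*}(\Omega)$, where
\[ 2^* = \begin{cases} \dfrac{2N}{N-2} & \text{if } N \ge 3, \\ +\infty & \text{if } N = 2, \end{cases} \]
and because $\frac{2}{N} + \frac{2}{2^*} = 1$, from the generalized Hölder's inequality (see Theorem A.2.27), we have that $a_0 x y \in L^1(\Omega)$ and so
\[ \int_\Omega a_0(z)\,x y\, dz \]
is well defined.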
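The exponent bookkeeping behind this application of Hölder's inequality is easy to check by exact arithmetic. The snippet below is a small sketch under the assumption $N \ge 3$ (where $2^* = 2N/(N-2)$ is finite); the function name `sobolev_conjugate` is hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

def sobolev_conjugate(N):
    """Critical exponent 2* = 2N/(N-2), defined here only for N >= 3."""
    if N < 3:
        raise ValueError("this sketch covers N >= 3 only")
    return Fraction(2 * N, N - 2)

for N in range(3, 8):
    two_star = sobolev_conjugate(N)
    # a0 lies in L^{N/2}; x and y each lie in L^{2*}.
    # The three reciprocal exponents must add up to 1.
    total = Fraction(2, N) + 2 / two_star
    assert total == 1
    print(N, two_star, total)
```

With exact rationals there is no rounding, so the identity $\frac{2}{N} + \frac{2}{2^*} = 1$ is confirmed for each sampled dimension rather than approximately.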
Note that the bilinear form $b(\cdot, \cdot)$ is:

(a) symmetric, i.e., $b(x, y) = b(y, x)$ for all $x, y \in H_0^1(\Omega)$;

(b) continuous, since $|b(x, y)| \le c_1 \|x\|_{H_0^1(\Omega)} \|y\|_{H_0^1(\Omega)}$ for some $c_1 > 0$;

(c) coercive, since $b(x, x) \ge c_2 \|x\|_{H_0^1(\Omega)}^2$ for some $c_2 > 0$ (by virtue of the uniform ellipticity condition and since $a_0 \ge 0$).

Hence the bilinear form $b$ defines an equivalent inner product on $H_0^1(\Omega)$. Let $h \in L^2(\Omega)$. Since by hypothesis $m \in L^\infty(\Omega)$, the function
\[ y \longmapsto \int_\Omega m h y\, dz \]
is linear and continuous on $H_0^1(\Omega)$. So by the Riesz representation theorem, we can find $Ah \in H_0^1(\Omega)$, such that
\[ b(Ah, y) = \int_\Omega m h y\, dz \quad \forall\, y \in H_0^1(\Omega). \]
The map $A : L^2(\Omega) \to H_0^1(\Omega) \subseteq L^2(\Omega)$ is clearly linear, bounded and self-adjoint. Moreover, exploiting the compactness of the embedding $H_0^1(\Omega) \subseteq L^2(\Omega)$, we can easily check that $A$ is compact. Using the operator $A$, problem (6.5) can be equivalently rewritten as
\[ b(x, y) = \lambda\, b(Ax, y) \quad \forall\, y \in H_0^1(\Omega), \]
hence
\[ Ax = \frac{1}{\lambda}\, x. \]
Since $A$ is compact, self-adjoint on the Hilbert space $H = L^2(\Omega)$, we can apply the earlier abstract results to determine the eigenvalues and eigenfunctions of problem (6.5). First let us give the definition of eigenvalue and eigenfunction for the differential operator $L$.
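DEFINITION 6.1.13 (a) We say that $x \in H_0^1(\Omega)$ is a weak solution of the boundary value problem (6.5), if
\[ b(x, y) = \lambda \int_\Omega m x y\, dz \quad \forall\, y \in H_0^1(\Omega). \]
(b) We say that $\lambda \in \mathbb{R}$ is an eigenvalue of $L$, if problem (6.5) has a nontrivial weak solution $x \in H_0^1(\Omega)$. The nonzero solutions are the eigenfunctions corresponding to the eigenvalue $\lambda$.

REMARK 6.1.14 Since
\[ -\sum_{i,j=1}^{N} \frac{\partial}{\partial z_j} \Big( a_{ij}(\cdot) \frac{\partial x}{\partial z_i} \Big) \in H^{-1}(\Omega) \quad \forall\, x \in H_0^1(\Omega) \]
(see Theorem 2.4.57) and $a_0 x \in L^{\frac{2^*}{2^*-1}}(\Omega) = L^{(2^*)'}(\Omega) \subseteq H^{-1}(\Omega)$ ($\frac{1}{2^*} + \frac{1}{(2^*)'} = 1$; generalized Hölder's inequality; see Theorem A.2.27), from Definition 6.1.13(a), it follows that
\[ Lx(z) = \lambda m(z) x(z) \quad \text{for a.a. } z \in \Omega \]
(strong solution).

From Propositions 6.1.10 and 6.1.11, we have the following one.

PROPOSITION 6.1.15 The linear eigenvalue problem (6.5) has a double sequence of eigenvalues
\[ \ldots \le \lambda_{-n} \le \ldots \le \lambda_{-1} < 0 < \lambda_1 \le \ldots \le \lambda_n \le \ldots, \]
which have the following variational characterizations
\[ \frac{1}{\lambda_n} = \sup_{Y \in L_n} \ \inf_{\substack{x \in Y \\ \|x\|_2 = 1}} \int_\Omega m x^2\, dz, \qquad (6.6) \]
\[ \frac{1}{\lambda_{-n}} = \inf_{Y \in L_n} \ \sup_{\substack{x \in Y \\ \|x\|_2 = 1}} \int_\Omega m x^2\, dz. \qquad (6.7) \]
The corresponding eigenfunctions $u_n$ are such that
\[ b(u_n, y) = \lambda_n \int_\Omega m u_n y\, dz \quad \forall\, y \in H_0^1(\Omega), \qquad \|u_n\|_2 = 1 \quad \text{and} \quad \frac{1}{\lambda_n} = \int_\Omega m u_n^2\, dz \]
and similarly for the eigenfunctions of $\lambda_{-n}$.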
REMARK 6.1.16 These eigenvalues do not have accumulation points, except $+\infty$ and $-\infty$.

In what follows
\[ \Omega_+ \stackrel{df}{=} \{z \in \Omega : m(z) > 0\}, \qquad \Omega_- \stackrel{df}{=} \{z \in \Omega : m(z) < 0\} \]
and by $|\cdot|_N$ we denote the Lebesgue measure on $\mathbb{R}^N$.

PROPOSITION 6.1.17
(a) If $|\Omega_+|_N = 0$, then there is no $\lambda_n$ for $n \ge 1$;
(b) If $|\Omega_-|_N = 0$, then there is no $\lambda_{-n}$ for $n \ge 1$;
(c) If $|\Omega_+|_N > 0$, then $\lambda_n \to +\infty$;
(d) If $|\Omega_-|_N > 0$, then $\lambda_{-n} \to -\infty$.

PROOF
(a) and (b) follow at once from Proposition 6.1.15.
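(c) Let $B_1, \ldots, B_n$ be a family of pairwise disjoint open balls in $\Omega$, such that
\[ |B_i \cap \Omega_+|_N > 0 \quad \forall\, i \in \{1, \ldots, n\}. \]
Let $\{u_i\}_{i=1}^{n} \subseteq C_c^\infty(\Omega)$, with
\[ \mathrm{supp}\, u_i \subseteq B_i \quad \text{and} \quad \int_\Omega m u_i^2\, dz = 1. \]
Then let us set
\[ Y \stackrel{df}{=} \mathrm{span}\,\{u_i\}_{i=1}^{n} \in L_n. \]
For $x = \sum_{i=1}^{n} \beta_i u_i \in Y$, we have
\[ \int_\Omega m x^2\, dz = \sum_{i=1}^{n} \beta_i^2 \int_\Omega m u_i^2\, dz = \sum_{i=1}^{n} \beta_i^2 \]
and
\[ b(x, x) = \sum_{i=1}^{n} \beta_i^2\, b(u_i, u_i) \le c \sum_{i=1}^{n} \beta_i^2, \]
with $c = \max_{i \in \{1, \ldots, n\}} b(u_i, u_i)$. Therefore
\[ \int_\Omega m x^2\, dz \ge \frac{1}{c}\, b(x, x) \quad \forall\, x \in Y \in L_n. \]
Then the result follows from this inequality and (b).

(d) The proof is similar to that of (c) using this time (6.7).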
The eigenvalues $\{\lambda_{\pm n}\}_{n \ge 1}$ depend on the weight function $m \in L^\infty(\Omega)$. Proposition 6.1.18 establishes the basic features of this dependence.

PROPOSITION 6.1.18
(a) If $m_1, m_2 \in L^\infty(\Omega)$, $m_1 \le m_2$ and $\lambda_{\pm n}(m_1)$, $\lambda_{\pm n}(m_2)$ exist, then $\lambda_{\pm n}(m_2) \le \lambda_{\pm n}(m_1)$.
(b) If the inequality $m_1 \le m_2$ is strict on a set of positive measure, then $\lambda_{\pm n}(m_2) < \lambda_{\pm n}(m_1)$.

Next let us examine the eigenfunctions corresponding to these eigenvalues. First, from the spectral theorem for compact, self-adjoint operators (see Theorem 3.1.57), we have the following result.

PROPOSITION 6.1.19 The eigenfunctions $\{u_n\}_{n \ge 1}$ and $\{u_{-n}\}_{n \ge 1}$ form orthonormal bases for $L^2(\Omega)$ and $\big\{\frac{u_n}{\sqrt{\lambda_n}}\big\}_{n \ge 1}$ and $\big\{\frac{u_{-n}}{\sqrt{\lambda_{-n}}}\big\}_{n \ge 1}$ are orthonormal bases of $H_0^1(\Omega)$ equipped with the inner product $b(\cdot, \cdot)$.

The next result is a theorem of the Krein-Rutman-type for the eigenvalue problem (6.5). First let us recall the Krein-Rutman theorem.

THEOREM 6.1.20 (Krein-Rutman Theorem) If $X$ is an ordered Banach space with positive cone $K$, $X = K - K$, $A \in \mathcal{L}_c(X)$, $A(K) \subseteq K$ (i.e., $A$ is a positive operator) and
\[ r(A) \stackrel{df}{=} \max_{\lambda \in \sigma_X(A)} |\lambda| > 0, \]
then $r(A)$ is an eigenvalue with a positive eigenvector.

In the context of the eigenvalue problem (6.5) this theorem takes the following form.

THEOREM 6.1.21
(a) If $|\Omega_+|_N > 0$, then $\lambda_1 > 0$ is simple and $u_1(z) > 0$ for almost all $z \in \Omega$.
(b) If $|\Omega_-|_N > 0$, then $\lambda_{-1} < 0$ is simple and $u_{-1}(z) > 0$ for almost all $z \in \Omega$.

PROOF (a) Let $u \in H_0^1(\Omega)$ be an eigenfunction corresponding to the eigenvalue $\lambda_1 > 0$. At this stage we do not know whether or not some eigenvalues $\lambda_i$ are equal to $\lambda_1$ and so in general $u$ is a linear combination of $u_i$'s. We have
\[ b(u, v) = \lambda_1 \int_\Omega m u v\, dz \quad \forall\, v \in H_0^1(\Omega). \]
Suppose that $u$ does not have a constant sign. Then $u^+, u^- \ne 0$ and from Proposition 2.4.27, we know that $u^+, u^- \in H_0^1(\Omega)$ and
\[ b(u, u) = b(u^+, u^+) + b(u^-, u^-). \]
We have that
\[ b(u^+, u^+) > 0 \quad \text{and} \quad b(u^-, u^-) > 0. \]
Note that
\[ \frac{1}{\lambda_1} = \frac{\int_\Omega m u^2\, dz}{b(u, u)} = \frac{\int_\Omega m (u^+)^2\, dz + \int_\Omega m (u^-)^2\, dz}{b(u^+, u^+) + b(u^-, u^-)}. \]
By virtue of Proposition 6.1.1, it follows that
\[ \frac{1}{\lambda_1} = \frac{\int_\Omega m (u^+)^2\, dz}{b(u^+, u^+)} \quad \text{and} \quad \frac{1}{\lambda_1} = \frac{\int_\Omega m (u^-)^2\, dz}{b(u^-, u^-)}, \]
so $u^+$ and $u^-$ are both eigenfunctions corresponding to $\lambda_1 > 0$. Then from Stampacchia (1966, p. 238) (if $m \ge 0$, $m \ne 0$, from the classical maximum principle), we obtain that
\[ u^+(z) > 0 \quad \text{and} \quad u^-(z) > 0 \quad \text{for a.a. } z \in \Omega, \]
a contradiction. This proves that $u$ has a constant sign and so we may assume that $u(z) \ge 0$ for a.a. $z \in \Omega$. As above from Stampacchia (1966) (or the maximum principle if $m \ge 0$, $m \ne 0$), we infer that $u(z) > 0$ for a.a. $z \in \Omega$.

Now let us show that $\lambda_1 > 0$ is simple, i.e., it has multiplicity 1. To this end let $\hat{u}_1$ and $\hat{u}_2$ be two eigenfunctions corresponding to the eigenvalue $\lambda_1 > 0$. For every $\mu \in \mathbb{R}$, the function $\hat{u}_1 + \mu \hat{u}_2$ has constant sign. Let
\[ C_+ \stackrel{df}{=} \{\mu \in \mathbb{R} : \hat{u}_1 + \mu \hat{u}_2 \ge 0\}, \qquad C_- \stackrel{df}{=} \{\mu \in \mathbb{R} : \hat{u}_1 + \mu \hat{u}_2 \le 0\}. \]
Evidently the sets $C_+$ and $C_-$ are nonempty, closed and $C_+ \cup C_- = \mathbb{R}$. So we can find $\mu_0 \in C_+ \cap C_-$. Then $\hat{u}_1 + \mu_0 \hat{u}_2 = 0$, which shows that $\hat{u}_1$ and $\hat{u}_2$ are linearly dependent, i.e., the eigenspace corresponding to $\lambda_1 > 0$ is one dimensional.

(b) The proof is similar to that of (a).
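A finite dimensional analogue of Theorem 6.1.20 may help fix ideas: take $X = \mathbb{R}^n$ with the cone $K$ of vectors with nonnegative entries, and a matrix with positive entries, which maps $K$ into $K$. The sketch below is illustrative only; the matrix and the name `perron_pair` are assumptions made for this example, and power iteration is used as a convenient numerical device, not as the method of proof.

```python
import math

def perron_pair(A, iterations=200):
    """Approximate r(A) and a unit eigenvector for a symmetric matrix
    with positive entries, starting from a vector inside the cone K."""
    n = len(A)
    x = [1.0] * n
    for _ in range(iterations):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    r = sum(Ax[i] * x[i] for i in range(n))  # Rayleigh quotient of a unit vector
    return r, x

# Illustrative positive matrix (an assumption); its eigenvalues are (5 ± sqrt(5))/2.
A = [[2.0, 1.0],
     [1.0, 3.0]]
r, u = perron_pair(A)
print(r, u)
```

The iterates never leave the cone because the matrix has positive entries, so the limiting eigenvector has positive components, in line with the positivity of $u_1$ in Theorem 6.1.21.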
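REMARK 6.1.22 The simplicity of $\lambda_1 > 0$ and $\lambda_{-1} < 0$ implies that
\[ 0 < \lambda_1 < \lambda_2 \le \lambda_3 \le \ldots \quad \text{and} \quad \ldots \le \lambda_{-3} \le \lambda_{-2} < \lambda_{-1} < 0. \]

This analysis produces as a special case a complete description of the eigenelements of the negative Laplacian with Dirichlet boundary conditions, i.e., of $\big({-\Delta}, H_0^1(\Omega)\big)$. So we consider the following linear eigenvalue problem:
\[ \begin{cases} -\Delta x(z) = \lambda x(z) & \text{for a.a. } z \in \Omega, \\ x|_{\partial\Omega} = 0, \end{cases} \qquad (6.8) \]
with $\lambda \in \mathbb{R}$.

THEOREM 6.1.23 Problem (6.8) has countably many eigenvalues
\[ 0 < \lambda_1 < \lambda_2 \le \lambda_3 \le \ldots, \]
with $\lambda_n \to +\infty$ and an orthonormal basis $\{u_n\}_{n \ge 1}$ of $L^2(\Omega)$ of eigenfunctions, with $\big\{\frac{u_n}{\sqrt{\lambda_n}}\big\}_{n \ge 1}$ being an orthonormal basis of $H_0^1(\Omega)$. The eigenvalues $\{\lambda_n\}_{n \ge 1}$ have the following variational characterizations:
\[ \lambda_1 = \min_{\substack{x \in H_0^1(\Omega) \\ x \ne 0}} \frac{\|Dx\|_2^2}{\|x\|_2^2}, \qquad (6.9) \]
\[ \lambda_n = \min_{\substack{x \in H_0^1(\Omega) \\ x \perp \{u_1, \ldots, u_{n-1}\}}} \frac{\|Dx\|_2^2}{\|x\|_2^2}, \qquad (6.10) \]
\[ \lambda_n = \min_{Y \in L_n} \ \max_{\substack{x \in Y \\ x \ne 0}} \frac{\|Dx\|_2^2}{\|x\|_2^2}. \qquad (6.11) \]
The first eigenvalue is simple and the corresponding eigenfunction does not change sign and so we can choose $u_1$, such that $u_1 > 0$.

PROPOSITION 6.1.24 If $u \in H_0^1(\Omega) \setminus \{0\}$ satisfies the equation
\[ \|Du\|_2^2 = \lambda_1 \|u\|_2^2, \]
then $u$ is an eigenfunction corresponding to $\lambda_1$, i.e., $u = \vartheta u_1$, for some $\vartheta \in \mathbb{R} \setminus \{0\}$.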
PROOF Let $y \in H_0^1(\Omega)$ and $t > 0$ be such that $u + ty \ne 0$. Then $u + ty \in H_0^1(\Omega)$ and by (6.9), we have
\[ \frac{\|Du\|_2^2}{\|u\|_2^2} = \lambda_1 \le \frac{\|D(u + ty)\|_2^2}{\|u + ty\|_2^2}. \]
Without any loss of generality, we may assume that $\|u\|_2 = 1$. So
\[ \lambda_1 \Big( 2t \int_\Omega u y\, dz + t^2 \|y\|_2^2 \Big) \le 2t \int_\Omega (Du, Dy)_{\mathbb{R}^N}\, dz + t^2 \|Dy\|_2^2. \]
Dividing by $2t$ and then letting $t \to 0^+$, we obtain
\[ \lambda_1 \int_\Omega u y\, dz \le \int_\Omega (Du, Dy)_{\mathbb{R}^N}\, dz \quad \forall\, y \in H_0^1(\Omega). \]
Replacing $y$ by $-y$ shows that equality holds, so
\[ \begin{cases} -\Delta u(z) = \lambda_1 u(z) & \text{for a.a. } z \in \Omega, \\ u|_{\partial\Omega} = 0. \end{cases} \]
Therefore $u = \vartheta u_1$ for some $\vartheta \in \mathbb{R} \setminus \{0\}$.

We can say more about the eigenfunctions of $\big({-\Delta}, H_0^1(\Omega)\big)$, but to do this we need to recall the basic regularity results for elliptic equations. Detailed proofs can be found in Evans (1998, Section 6.3) and Gilbarg & Trudinger (2001, Sections 8.3 and 8.4). For a bounded domain $\Omega \subseteq \mathbb{R}^N$, we consider the following boundary value problem:
\[ L_0 x = h \quad \text{in } \Omega, \qquad (6.12) \]
where $L_0$ is the linear differential operator in divergence form, given by
\[ L_0 x \stackrel{df}{=} -\sum_{i,j=1}^{N} \frac{\partial}{\partial z_j} \Big( a_{ij}(z) \frac{\partial x}{\partial z_i} \Big) + \sum_{i=1}^{N} b_i(z) \frac{\partial x}{\partial z_i} + a_0(z)\,x. \]
The first result is about interior (local) regularity of the weak solutions of problem (6.12).

THEOREM 6.1.25 If $a_{ij} \in C(\Omega)$, $b_i, a_0 \in L^\infty(\Omega)$ for $i, j \in \{1, \ldots, N\}$, $h \in L^2(\Omega)$ and $x \in H^1(\Omega)$ is a weak solution of (6.12), then $x \in H^2_{loc}(\Omega)$ (i.e., $x \in H^2(\Omega')$ for each open set $\Omega' \subset\subset \Omega$) and for each open set $\Omega' \subset\subset \Omega$, we have the estimate
\[ \|x\|_{H^2(\Omega')} \le c \big( \|h\|_{L^2(\Omega)} + \|x\|_{L^2(\Omega)} \big) \]
and the constant $c > 0$ depends only on $\Omega' \subset\subset \Omega$ and the coefficients of $L_0$.
REMARK 6.1.26 Note that we do not require $x \in H_0^1(\Omega)$, i.e., (6.12) need not be a Dirichlet problem. Since $x \in H^2_{loc}(\Omega)$, we have $L_0 x(z) = h(z)$ for a.a. $z \in \Omega$. By strengthening the regularity of the coefficients and of the right hand side, we can improve the local regularity of the weak solution of (6.12). Namely, if
\[ a_{ij}, b_i, a_0 \in C^{m+1}(\Omega) \quad \forall\, i, j \in \{1, \ldots, N\} \quad \text{and} \quad h \in H^m(\Omega), \]
then $x \in H^{m+2}_{loc}(\Omega)$ and for each $\Omega' \subset\subset \Omega$, we have
\[ \|x\|_{H^{m+2}(\Omega')} \le c \big( \|h\|_{H^m(\Omega)} + \|x\|_{H^m(\Omega)} \big), \]
with $c > 0$ depending only on $m$, $\Omega'$, $\Omega$ and the coefficients of $L_0$. Finally, if
\[ a_{ij}, b_i, a_0 \in C^\infty(\Omega) \quad \forall\, i, j \in \{1, \ldots, N\} \quad \text{and} \quad h \in C^\infty(\Omega), \]
then the weak solutions $x \in H^1(\Omega)$ of (6.12) are in $C^\infty(\Omega)$. Again we emphasize that no assumptions are made about $\partial\Omega$. So any possible singularities of $x$ on the boundary $\partial\Omega$ do not propagate into the interior.

Now we examine regularity of the weak solutions of (6.12) up to the boundary.

THEOREM 6.1.27 If $a_{ij} \in C^1(\Omega)$, $b_i, a_0 \in L^\infty(\Omega)$ for $i, j \in \{1, \ldots, N\}$, $h \in L^2(\Omega)$, $\partial\Omega$ is a $C^2$-manifold and $x \in H_0^1(\Omega)$ is a weak solution of (6.12) with Dirichlet boundary conditions, then $x \in H^2(\Omega)$ and
\[ \|x\|_{H^2(\Omega)} \le c \big( \|h\|_{L^2(\Omega)} + \|x\|_{L^2(\Omega)} \big), \]
with $c > 0$ depending only on $\Omega$ and the coefficients of $L_0$.

REMARK 6.1.28 If the weak solution $x \in H_0^1(\Omega)$ is unique, then the estimate simplifies and becomes
\[ \|x\|_{H^2(\Omega)} \le c\, \|h\|_{L^2(\Omega)}. \]
Again by strengthening the regularity of the coefficients and of the boundary $\partial\Omega$, we can improve the conclusion of the theorem. So if $a_{ij}, b_i, a_0 \in C^{m+1}(\Omega)$ for $i, j \in \{1, \ldots, N\}$, $h \in H^m(\Omega)$ and $\partial\Omega$ is a $C^{m+2}$-manifold, then $x \in H^{m+2}(\Omega)$ and
\[ \|x\|_{H^{m+2}(\Omega)} \le c \big( \|h\|_{H^m(\Omega)} + \|x\|_{L^2(\Omega)} \big), \]
with $c > 0$ depending only on $m$, $\Omega$ and the coefficients of $L_0$. Moreover, if the weak solution is unique, then
\[ \|x\|_{H^{m+2}(\Omega)} \le c\, \|h\|_{H^m(\Omega)}. \]
Finally, if $a_{ij}, b_i, a_0 \in C^\infty(\overline{\Omega})$ for $i, j \in \{1, \ldots, N\}$, $h \in C^\infty(\overline{\Omega})$ and $\partial\Omega$ is a $C^\infty$-manifold, then $x \in C^\infty(\overline{\Omega})$.

Having these regularity results, we can complete Theorem 6.1.23.

THEOREM 6.1.29 The eigenfunctions $\{u_n\}_{n \ge 1}$ of problem (6.8) obtained in Theorem 6.1.23 belong to $H_0^1(\Omega) \cap C^\infty(\Omega)$. Moreover, if $\partial\Omega$ is a $C^\infty$-manifold, then
\[ u_n \in C^\infty(\overline{\Omega}) \quad \forall\, n \ge 1. \]

REMARK 6.1.30 Any $x \in L^2(\Omega)$ can be expanded in terms of these eigenfunctions,
\[ x = \sum_{k=1}^{\infty} (x, u_k)_2\, u_k, \]
where by $(\cdot, \cdot)_2$ we denote the inner product in $L^2(\Omega)$, i.e.,
\[ (g, h)_2 \stackrel{df}{=} \int_\Omega g(z) h(z)\, dz \quad \forall\, g, h \in L^2(\Omega). \]
Moreover, if $x \in H_0^1(\Omega)$, then
\[ (Dx, Dx)_2 = \int_\Omega \big( Dx(z), Dx(z) \big)_{\mathbb{R}^N}\, dz = \sum_{k=1}^{\infty} \lambda_k (x, u_k)_2^2. \]
So those $x \in L^2(\Omega)$ which do not belong to $H_0^1(\Omega)$ can be characterized by the fact that the series
\[ \sum_{k=1}^{\infty} \lambda_k (x, u_k)_2^2 \]
diverges.

In fact in a similar fashion, we can obtain the spectrum of the negative Laplacian with Neumann boundary conditions (i.e., of $\big({-\Delta}, H^1(\Omega)\big)$). So we consider the following linear eigenvalue problem on the bounded domain $\Omega$ with $C^1$-boundary (so that the divergence theorem holds; see Theorem A.4.1):
\[ \begin{cases} -\Delta x(z) = \lambda x(z) & \text{for a.a. } z \in \Omega, \\ \dfrac{\partial x}{\partial n} = 0 & \text{on } \partial\Omega. \end{cases} \qquad (6.13) \]
THEOREM 6.1.31 If $\Omega \subseteq \mathbb{R}^N$ is a bounded domain with a $C^1$-boundary $\partial\Omega$, then the eigenvalue problem (6.13) has countably many eigenvalues
\[ 0 = \lambda_0 < \lambda_1 \le \ldots \le \lambda_k \le \ldots, \]
with $\lambda_k \to +\infty$ and corresponding eigenfunctions $\{u_k\}_{k \ge 0}$ which form an orthonormal basis of $L^2(\Omega)$, while $\big\{u_0, \frac{u_k}{\sqrt{\lambda_k}}\big\}_{k \ge 1}$ form an orthonormal basis of $H^1(\Omega)$. The elements $\{(\lambda_k, u_k)\}_{k \ge 0}$ have the following variational characterizations:
\[ \lambda_{k-1} = \min_{Y \in L_k} \ \max_{\substack{x \in Y \\ x \ne 0}} \frac{\|Dx\|_2^2}{\|x\|_2^2} = \max_{Y \in L_{k-1}} \ \min_{\substack{x \in Y^\perp \\ x \ne 0}} \frac{\|Dx\|_2^2}{\|x\|_2^2} = \frac{\|Du_{k-1}\|_2^2}{\|u_{k-1}\|_2^2}. \]

REMARK 6.1.32 Any $x \in L^2(\Omega)$ can be expanded in terms of these eigenfunctions, namely
\[ x = \sum_{k=0}^{\infty} (x, u_k)_2\, u_k. \]
Moreover, if $x \in H^1(\Omega)$, then
\[ (Dx, Dx)_2 = \int_\Omega \big( Dx(z), Dx(z) \big)_{\mathbb{R}^N}\, dz = \sum_{k=0}^{\infty} \lambda_k (x, u_k)_2^2. \]
So those $x \in L^2(\Omega)$ which do not belong to $H^1(\Omega)$ can be characterized by the fact that the series
\[ \sum_{k=0}^{\infty} \lambda_k (x, u_k)_2^2 \]
diverges.

For both eigenvalue problems (6.8) and (6.13), the eigenvalues depend monotonically on the domain $\Omega$, i.e., if $\Omega_1 \subseteq \Omega_2$, then
\[ \lambda_k(\Omega_2) \le \lambda_k(\Omega_1) \quad \forall\, k \ge 1. \]
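For the Dirichlet problem (6.8) on a square $\Omega = (0, a)^2$ the eigenpairs are known in closed form by separation of variables: the functions $\sin(k\pi z_1/a)\sin(l\pi z_2/a)$, $k, l \ge 1$, are eigenfunctions with eigenvalues $(\pi/a)^2(k^2 + l^2)$. The sketch below, with the hypothetical helper `dirichlet_eigenvalues_square`, lists the first few eigenvalues with multiplicity and checks the ordering $0 < \lambda_1 < \lambda_2 \le \lambda_3 \le \ldots$ of Theorem 6.1.23 and the domain monotonicity for two nested squares. It is an illustration on one special family of domains, not a verification of the general statements.

```python
import math

def dirichlet_eigenvalues_square(a, count):
    """First `count` Dirichlet eigenvalues of -Laplacian on (0, a)^2,
    listed with multiplicity, from the formula (pi/a)^2 (k^2 + l^2)."""
    # Indices k, l up to `count` suffice: the pairs (k, 1), k = 1..count,
    # already give `count` values below any value with an index > count.
    values = [(math.pi / a) ** 2 * (k * k + l * l)
              for k in range(1, count + 1)
              for l in range(1, count + 1)]
    return sorted(values)[:count]

# Square of side pi: the eigenvalues are the integers k^2 + l^2.
spectrum = dirichlet_eigenvalues_square(math.pi, 4)
print(spectrum)

# Nested squares (0,1)^2 inside (0,2)^2: eigenvalues of the larger
# domain should not exceed those of the smaller one.
small = dirichlet_eigenvalues_square(1.0, 6)
large = dirichlet_eigenvalues_square(2.0, 6)
print(all(l <= s for l, s in zip(large, small)))
```

On the square of side $\pi$ the first eigenvalue $2$ is simple, while the second eigenvalue $5$ has multiplicity two (from $(k, l) = (1, 2)$ and $(2, 1)$), which shows that the inequalities after $\lambda_2$ in Theorem 6.1.23 cannot be made strict in general.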
Next we prove the maximum principles for second order elliptic partial differential operators. The starting point of the maximum principles is the following fact from calculus. If $\Omega$ is an open set, $\varphi \in C^2(\Omega)$ and $\varphi$ attains its maximum on $\Omega$ at $z_0 \in \Omega$, then
\[ \nabla\varphi(z_0) = 0 \quad \text{and} \quad D^2\varphi(z_0) = \Big( \frac{\partial^2 \varphi(z_0)}{\partial z_i \partial z_j} \Big)_{i,j=1}^{N} \le 0. \qquad (6.14) \]
The relations (6.14) are pointwise and so are the deductions based on them. For this reason the methods differ from the integral-based energy methods in the study of elliptic partial differential equations. Hence, we will consider an elliptic differential operator $L$ in nondivergence form and we will assume that the solutions are $C^2$. So the linear differential operator $L$ has the following form:
\[ Lx(z) \stackrel{df}{=} -\sum_{i,j=1}^{N} a_{ij}(z) \frac{\partial^2 x}{\partial z_i \partial z_j}(z) + \sum_{i=1}^{N} b_i(z) \frac{\partial x}{\partial z_i}(z) + a_0(z)\,x(z). \qquad (6.15) \]
We assume the following conditions concerning the coefficients in the operator $L$:

H(L)
(i) $a_{ij} \in C(\Omega) \cap L^\infty(\Omega)$, $a_{ij} = a_{ji}$ for $i, j \in \{1, \ldots, N\}$;
(ii) there exists a constant $c_0 > 0$, such that for almost all $z \in \Omega$ and all $\xi = (\xi_i)_{i=1}^{N} \in \mathbb{R}^N$, we have
\[ \sum_{i,j=1}^{N} a_{ij}(z)\,\xi_i \xi_j \ge c_0 \|\xi\|_{\mathbb{R}^N}^2 \]
(uniform ellipticity condition);
(iii) $b_i, a_0 \in C(\Omega) \cap L^\infty(\Omega)$ for $i \in \{1, \ldots, N\}$.

REMARK 6.1.33 If
\[ a_{ij} \in C^1(\Omega) \quad \forall\, i, j \in \{1, \ldots, N\}, \]
then any linear differential operator in divergence form can be put in the nondivergence form (6.15).

The next two theorems are called "Weak Maximum Principles." In the first we require that $a_0 \equiv 0$ and in the second that $a_0 \ge 0$.
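THEOREM 6.1.34 (Weak Maximum Principle I) If $\Omega \subseteq \mathbb{R}^N$ is a bounded open set, $L$ satisfies hypothesis H(L) with $a_0 \equiv 0$, $x \in C^2(\Omega) \cap C(\overline{\Omega})$ and
\[ Lx(z) \le 0 \quad \forall\, z \in \Omega, \]
then $x$ attains its maximum on $\partial\Omega$, i.e.,
\[ \max_{z \in \overline{\Omega}} x(z) = \max_{z \in \partial\Omega} x(z). \]

PROOF First we consider the case that
\[ Lx(z) < 0 \quad \forall\, z \in \Omega. \]
We argue indirectly. Suppose that at some $z_0 \in \Omega$, the function $x$ attains its maximum. We have
\[ \nabla x(z_0) = 0 \quad \text{and} \quad D^2 x(z_0) \le 0. \]
By hypotheses H(L)(i) and (ii), the matrix $A = \big(a_{ij}(z_0)\big)_{i,j=1}^{N}$ is symmetric and positive definite, hence it is diagonalizable, i.e., there exists an orthogonal $N \times N$-matrix $U = (u_{ij})_{i,j=1}^{N}$, such that
\[ U A U^T = \mathrm{diag}\,(d_k)_{k=1}^{N}, \qquad (6.16) \]
with $d_k > 0$ for all $k \in \{1, \ldots, N\}$. Let
\[ v \stackrel{df}{=} z_0 + U(z - z_0) \in \mathbb{R}^N. \]
Then $z - z_0 = U^{-1}(v - z_0)$ and so
\[ \frac{\partial x}{\partial z_i} = \sum_{k=1}^{N} u_{ki} \frac{\partial x}{\partial v_k} \quad \forall\, i \in \{1, \ldots, N\} \]
and
\[ \frac{\partial^2 x}{\partial z_i \partial z_j} = \sum_{k,m=1}^{N} u_{ki} u_{mj} \frac{\partial^2 x}{\partial v_k \partial v_m} \quad \forall\, i, j \in \{1, \ldots, N\}. \]
So, using also (6.16), at $z_0 \in \Omega$, we have
\[ \sum_{i,j=1}^{N} a_{ij}(z_0) \frac{\partial^2 x}{\partial z_i \partial z_j}(z_0) = \sum_{i,j=1}^{N} a_{ij}(z_0) \sum_{k,m=1}^{N} \frac{\partial^2 x}{\partial v_k \partial v_m}(z_0)\, u_{ki} u_{mj} = \sum_{k=1}^{N} d_k \frac{\partial^2 x}{\partial v_k^2}(z_0) \le 0, \qquad (6.17) \]
since $d_k > 0$ and $\frac{\partial^2 x}{\partial v_k^2}(z_0) \le 0$. Using (6.17) and recalling that
\[ \nabla x(z_0) = \Big( \frac{\partial x}{\partial z_i}(z_0) \Big)_{i=1}^{N} = 0, \]
we have
\[ Lx(z_0) = -\sum_{i,j=1}^{N} a_{ij}(z_0) \frac{\partial^2 x}{\partial z_i \partial z_j}(z_0) + \sum_{i=1}^{N} b_i(z_0) \frac{\partial x}{\partial z_i}(z_0) \ge 0. \qquad (6.18) \]
But by hypothesis $Lx(z_0) < 0$, a contradiction to (6.18).

Now for the general case assume that
\[ Lx(z) \le 0 \quad \forall\, z \in \Omega. \]
Let $\varepsilon > 0$ and $\lambda > 0$ and define
\[ x_\varepsilon(z) \stackrel{df}{=} x(z) + \varepsilon e^{\lambda z_1} \quad \forall\, z \in \Omega. \]
From the uniform ellipticity condition (see hypothesis H(L)(ii)), we have
\[ \sum_{i,j=1}^{N} a_{ij}(z)\,\xi_i \xi_j \ge c_0 \|\xi\|_{\mathbb{R}^N}^2 \quad \forall\, \xi = (\xi_i)_{i=1}^{N} \in \mathbb{R}^N. \]
Taking $\xi = e_i = (0, \ldots, 0, 1, 0, \ldots, 0)$, we obtain that
\[ a_{ii}(z) \ge c_0 \quad \forall\, i \in \{1, \ldots, N\}. \]
Then
\[ Lx_\varepsilon(z) = Lx(z) + \varepsilon L e^{\lambda z_1} \le \varepsilon e^{\lambda z_1} \big[ -\lambda^2 a_{11}(z) + \lambda b_1(z) \big] \le \varepsilon e^{\lambda z_1} \big[ -\lambda^2 c_0 + \lambda \|\hat{b}\|_\infty \big] < 0 \quad \forall\, z \in \Omega, \]
where $\hat{b} = (b_i)_{i=1}^{N}$, provided we choose $\lambda > 0$ large enough. Then from the first part of the proof, we have that
\[ \max_{z \in \overline{\Omega}} x_\varepsilon(z) = \max_{z \in \partial\Omega} x_\varepsilon(z). \]
Letting $\varepsilon \searrow 0$, we conclude that
\[ \max_{z \in \overline{\Omega}} x(z) = \max_{z \in \partial\Omega} x(z). \]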
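The conclusion of the weak maximum principle can be observed on a concrete lower solution. The sketch below assumes $L = -\Delta$ on the unit disc in $\mathbb{R}^2$ and the sample function $x(z) = z_1^2 + z_2^2 + z_1$, for which $-\Delta x = -4 \le 0$; both choices, and the helper `minus_laplacian`, are made only for illustration. It compares the maximum over a grid of interior points with the maximum over sampled boundary points.

```python
import math

def x(z1, z2):
    # Sample lower solution for L = -Laplacian: -Laplacian x = -4 <= 0.
    return z1 * z1 + z2 * z2 + z1

def minus_laplacian(f, z1, z2, h=1e-3):
    # Central second differences; exact for quadratics up to rounding.
    return -(f(z1 + h, z2) + f(z1 - h, z2) + f(z1, z2 + h) + f(z1, z2 - h)
             - 4.0 * f(z1, z2)) / (h * h)

n = 100
# Grid points strictly inside the unit disc (integer test avoids rounding).
interior = [(i / n, j / n)
            for i in range(-n, n + 1)
            for j in range(-n, n + 1)
            if i * i + j * j < n * n]
# Sample points on the unit circle.
boundary = [(math.cos(2 * math.pi * k / 720), math.sin(2 * math.pi * k / 720))
            for k in range(720)]

interior_max = max(x(z1, z2) for z1, z2 in interior)
boundary_max = max(x(z1, z2) for z1, z2 in boundary)
print(interior_max, boundary_max, interior_max <= boundary_max)
```

A grid check of this kind cannot prove anything, but it shows the expected picture: the interior values stay below the boundary maximum, which for this function is attained at the boundary point $(1, 0)$.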
COROLLARY 6.1.35 If $\Omega \subseteq \mathbb{R}^N$ is a bounded open set, $L$ satisfies hypotheses H(L) with $a_0 \equiv 0$, $x \in C^2(\Omega) \cap C(\overline{\Omega})$ and
\[ Lx(z) \ge 0 \quad \forall\, z \in \Omega, \]
then $x$ attains its minimum on $\partial\Omega$, i.e.,
\[ \min_{z \in \overline{\Omega}} x(z) = \min_{z \in \partial\Omega} x(z). \]

COROLLARY 6.1.36 If $\Omega \subseteq \mathbb{R}^N$ is a bounded open set, $L$ satisfies hypotheses H(L) with $a_0 \equiv 0$, $x, y \in C^2(\Omega) \cap C(\overline{\Omega})$ and
\[ Lx(z) \le Ly(z) \quad \forall\, z \in \Omega, \qquad x(z) \le y(z) \quad \forall\, z \in \partial\Omega, \]
then
\[ x(z) \le y(z) \quad \forall\, z \in \overline{\Omega}. \]
Moreover, if $Lx(z) = Ly(z)$ for all $z \in \Omega$ and $x(z) = y(z)$ for all $z \in \partial\Omega$, then
\[ x(z) = y(z) \quad \forall\, z \in \overline{\Omega}. \]

COROLLARY 6.1.37 If $\Omega \subseteq \mathbb{R}^N$ is a bounded open set, $L$ satisfies hypotheses H(L) and we consider the Dirichlet problem
\[ \begin{cases} Lx(z) = h(z) & \forall\, z \in \Omega, \\ x|_{\partial\Omega} = g, \end{cases} \]
with $h \in C(\Omega)$ and $g \in C(\partial\Omega)$, then the Dirichlet problem has a unique solution $x \in C^2(\Omega) \cap C(\overline{\Omega})$ (classical solution).

Motivated by Theorem 6.1.34 and Corollary 6.1.35, we make the following definition.

DEFINITION 6.1.38 Let $\Omega \subseteq \mathbb{R}^N$ be a bounded open set and let $L$ be as in (6.15). A function $y \in C^2(\Omega)$ is said to be a lower solution or subsolution of the equation
\[ Lx(z) = 0 \quad \forall\, z \in \Omega, \]
if $Ly(z) \le 0$ for all $z \in \Omega$ and is said to be an upper solution or supersolution, if $Ly(z) \ge 0$ for all $z \in \Omega$. In the particular case where $L = -\Delta$ (i.e., $a_{ij} = \delta_{ij}$, $b_i = 0$, $a_0 = 0$ for all $i, j \in \{1, \ldots, N\}$), a lower solution is called subharmonic and an upper solution is called superharmonic.
REMARK 6.1.39 The above nomenclature when $L = -\Delta$ is motivated by the following simple observation. Let $x \in C^2(\Omega) \cap C(\overline{\Omega})$ be a harmonic function, i.e.,
\[ -\Delta x(z) = 0 \quad \forall\, z \in \Omega. \]
If $y \in C^2(\Omega) \cap C(\overline{\Omega})$ is a lower solution (i.e., $-\Delta y(z) \le 0$ for all $z \in \Omega$), such that $y = x$ on $\partial\Omega$, then by Corollary 6.1.36, we have
\[ y(z) \le x(z) \quad \forall\, z \in \overline{\Omega}. \]
Therefore a subharmonic function always lies below a harmonic function with the same boundary values. Similarly for superharmonic functions.

In the next theorem, we remove the restriction that $a_0 \equiv 0$.

THEOREM 6.1.40 (Weak Maximum Principle II) If $\Omega \subseteq \mathbb{R}^N$ is a bounded open set, $L$ satisfies hypotheses H(L) with $a_0(z) \ge 0$ for all $z \in \Omega$, $x \in C^2(\Omega) \cap C(\overline{\Omega})$ and
\[ Lx(z) \le 0 \quad \forall\, z \in \Omega \]
(i.e., $x$ is a lower solution for $Lx = 0$), then
\[ \max_{z \in \overline{\Omega}} x(z) \le \max_{z \in \partial\Omega} x^+(z), \]
where $x^+ \stackrel{df}{=} \max\{x, 0\}$.

PROOF Consider the open set
\[ V \stackrel{df}{=} \{z \in \Omega : x(z) > 0\}. \]
If $L_1 x = Lx - a_0 x$, we have
\[ L_1 x(z) \le -a_0(z) x(z) \le 0 \quad \forall\, z \in V. \]
Note that the operator $L_1$ has no zero order term. So if $V \ne \emptyset$, we can apply Theorem 6.1.34 and obtain
\[ \max_{z \in \overline{V}} x(z) = \max_{z \in \partial V} x(z) = \max_{z \in \partial\Omega} x^+(z), \]
hence
\[ \max_{z \in \overline{\Omega}} x(z) = \max_{z \in \partial\Omega} x^+(z). \]
If $V = \emptyset$, then $x \le 0$ on $\Omega$ and so
\[ \max_{z \in \overline{\Omega}} x(z) \le \max_{z \in \partial\Omega} x^+(z). \]
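COROLLARY 6.1.41 If $\Omega \subseteq \mathbb{R}^N$ is a bounded open set, $L$ satisfies hypotheses $H(L)$ with $a_0(z) \ge 0$ for all $z \in \Omega$, $x \in C^2(\Omega) \cap C(\overline{\Omega})$ and $Lx(z) \ge 0$ for all $z \in \Omega$ (i.e., $x$ is an upper solution for $Lx = 0$), then
\[
\min_{z \in \overline{\Omega}} x(z) \ge -\max_{z \in \partial\Omega} x^-(z),
\]
where $x^- \stackrel{df}{=} \max\{-x, 0\}$.

REMARK 6.1.42 In particular, if $Lx(z) = 0$ for all $z \in \Omega$, then
\[
\max_{z \in \overline{\Omega}} |x(z)| = \max_{z \in \partial\Omega} |x(z)|.
\]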
Next we prove the so-called "Strong Maximum Principle." First we establish an interesting auxiliary result known as Hopf's lemma.

LEMMA 6.1.43 (Hopf Lemma)
(a) If $\Omega \subseteq \mathbb{R}^N$ is a bounded open set, $L$ satisfies hypotheses $H(L)$ with $a_0 \equiv 0$, $x \in C^2(\Omega) \cap C(\overline{\Omega})$,
\[
Lx(z) \le 0 \qquad \forall\, z \in \Omega,
\]
$z_0 \in \partial\Omega$,
\[
x(z_0) > x(z) \qquad \forall\, z \in \Omega
\]
and there exists a ball $B \subseteq \Omega$, such that $z_0 \in \partial B$ (interior ball condition at $z_0 \in \partial\Omega$), then $\frac{\partial x}{\partial n}(z_0) > 0$ with $n$ being the outer unit normal to $B$ at $z_0$.
(b) If the hypotheses of (a) hold but now $a_0(z) \ge 0$ for all $z \in \Omega$, then the same conclusion is valid provided $x(z_0) \ge 0$.

PROOF
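Suppose that
\[
a_0(z) \ge 0 \qquad \forall\, z \in \Omega.
\]
Also let $B = B_r(\bar z)$ for some $\bar z \in \Omega$ and $r > 0$. We introduce the auxiliary function
\[
u(z) = e^{-\mu \|z - \bar z\|^2_{\mathbb{R}^N}} - e^{-\mu r^2},
\]
where $\mu > 0$ is a constant to be determined. We have
\[
Lu(z) = e^{-\mu \|z - \bar z\|^2_{\mathbb{R}^N}} \Big[ -4\mu^2 \sum_{i,j=1}^{N} a_{ij}(z)(z_i - \bar z_i)(z_j - \bar z_j) + 2\mu \sum_{i=1}^{N} \big( a_{ii}(z) - b_i(z)(z_i - \bar z_i) \big) \Big] + a_0(z) \Big( e^{-\mu \|z - \bar z\|^2_{\mathbb{R}^N}} - e^{-\mu r^2} \Big). \tag{6.19}
\]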
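From the uniform ellipticity condition (see hypothesis $H(L)(ii)$), we have
\[
\sum_{i,j=1}^{N} a_{ij}(z)(z_i - \bar z_i)(z_j - \bar z_j) \ge c_0 \|z - \bar z\|^2_{\mathbb{R}^N}.
\]
Using this in (6.19), we obtain
\[
Lu(z) \le e^{-\mu \|z - \bar z\|^2_{\mathbb{R}^N}} \Big[ -4\mu^2 c_0 \|z - \bar z\|^2_{\mathbb{R}^N} + 2\mu\, \mathrm{tr}\, A(z) + 2\mu \|\hat b\|_\infty \|z - \bar z\|_{\mathbb{R}^N} + \|a_0\|_\infty \Big], \tag{6.20}
\]
where $A(z) = \big( a_{ij}(z) \big)_{i,j=1}^{N}$ and $\hat b(z) = \big( b_i(z) \big)_{i=1}^{N}$. Note that $\mathrm{tr}\, A(\cdot) \in L^\infty(\Omega)$. We consider the open annulus
\[
D \stackrel{df}{=} B_r(\bar z) \setminus \overline{B}_{r/2}(\bar z).
\]
Since $\frac{r}{2} < \|z - \bar z\|_{\mathbb{R}^N} < r$ on $D$, from (6.20), for all $z \in D$, we have
\[
Lu(z) \le e^{-\mu \|z - \bar z\|^2_{\mathbb{R}^N}} \Big( -\mu^2 c_0 r^2 + 2\mu \|\mathrm{tr}\, A\|_\infty + 2\mu \|\hat b\|_\infty r + \|a_0\|_\infty \Big).
\]
Choosing $\mu > 0$ large enough, we see that
\[
Lu(z) \le 0 \qquad \forall\, z \in D.
\]
Since by hypothesis
\[
x(z_0) > x(z) \qquad \forall\, z \in \Omega,
\]
we can find $\varepsilon > 0$ small enough so that
\[
x(z) + \varepsilon u(z) \le x(z_0) \qquad \forall\, z \in \partial B_{r/2}(\bar z). \tag{6.21}
\]
Also note that because $u \equiv 0$ on $\partial B_r(\bar z)$, we have
\[
x(z) + \varepsilon u(z) \le x(z_0) \qquad \forall\, z \in \partial B_r(\bar z). \tag{6.22}
\]
By hypothesis $Lx(z) \le 0$ for all $z \in \Omega$ and recall that $Lu(z) \le 0$ for all $z \in D$. So
\[
L\big( x + \varepsilon u - x(z_0) \big)(z) \le -a_0(z) x(z_0) \le 0 \qquad \forall\, z \in D.
\]
In addition from (6.21) and (6.22), we have
\[
x(z) + \varepsilon u(z) - x(z_0) \le 0 \qquad \forall\, z \in \partial D.
\]
Applying Theorem 6.1.40, we obtain
\[
x(z) + \varepsilon u(z) - x(z_0) \le 0 \qquad \forall\, z \in D.
\]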
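But $x(z_0) + \varepsilon u(z_0) - x(z_0) = 0$. Hence
\[
\frac{\partial}{\partial n}\big( x + \varepsilon u - x(z_0) \big)(z_0) = \frac{\partial x}{\partial n}(z_0) + \varepsilon \frac{\partial u}{\partial n}(z_0) \ge 0
\]
and so
\[
\frac{\partial x}{\partial n}(z_0) \ge -\varepsilon \frac{\partial u}{\partial n}(z_0) = -\frac{\varepsilon}{r} \big( Du(z_0), z_0 - \bar z \big)_{\mathbb{R}^N} = 2\mu\varepsilon r e^{-\mu r^2} > 0
\]
(recall that $n(z_0) = \frac{z_0 - \bar z}{r}$).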
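The two computations carried out with the auxiliary function $u$ in the proof can be checked numerically in the simplest case. The sketch below assumes $L = -\Delta$ in $\mathbb{R}^2$ (so $a_{ij} = \delta_{ij}$, $b_i = 0$, $a_0 = 0$), centre $\bar z = 0$ and $r = 1$; these are illustrative choices. In that case (6.19) reduces to $Lu = e^{-\mu\|z\|^2}(-4\mu^2\|z\|^2 + 2\mu N)$, which is nonpositive on the annulus as soon as $\mu \ge 2N/r^2$.

```python
import math

N = 2                        # space dimension (illustrative choice)
r = 1.0                      # radius of B_r(0); the centre is taken at the origin
mu = 2.0 * N / r**2 + 1.0    # any mu >= 2N / r^2 works when L = -Laplacian

def u(z1, z2):
    # Auxiliary function from the proof of the Hopf lemma.
    return math.exp(-mu * (z1 * z1 + z2 * z2)) - math.exp(-mu * r * r)

def Lu_closed_form(z1, z2):
    # Formula (6.19) specialised to a_ij = delta_ij, b_i = 0, a_0 = 0.
    rho2 = z1 * z1 + z2 * z2
    return math.exp(-mu * rho2) * (-4.0 * mu**2 * rho2 + 2.0 * mu * N)

def Lu_finite_difference(z1, z2, h=1e-4):
    # Five-point approximation of -Laplacian(u).
    lap = (u(z1 + h, z2) + u(z1 - h, z2) + u(z1, z2 + h) + u(z1, z2 - h)
           - 4.0 * u(z1, z2)) / h**2
    return -lap

# Sample points in the open annulus r/2 < |z| < r.
annulus_points = [(rho * math.cos(k * math.pi / 6), rho * math.sin(k * math.pi / 6))
                  for rho in (0.55, 0.7, 0.85, 0.95) for k in range(12)]

max_Lu = max(Lu_closed_form(a, b) for a, b in annulus_points)
max_fd_error = max(abs(Lu_closed_form(a, b) - Lu_finite_difference(a, b))
                   for a, b in annulus_points)

# Outward normal derivative of u at z0 = (r, 0): exact value and central difference.
normal_derivative_exact = -2.0 * mu * r * math.exp(-mu * r * r)
h = 1e-5
normal_derivative_fd = (u(r + h, 0.0) - u(r - h, 0.0)) / (2.0 * h)

print(max_Lu, max_fd_error, normal_derivative_exact, normal_derivative_fd)
```

The negative sign of `normal_derivative_exact` is what makes $-\varepsilon \frac{\partial u}{\partial n}(z_0) = 2\mu\varepsilon r e^{-\mu r^2}$ strictly positive in the last step of the proof.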
Using the above lemma, we can prove the strong maximum principles. In the first $a_0 \equiv 0$ and in the second $a_0 \ge 0$. Recall that a domain in $\mathbb{R}^N$ is an open and connected set in $\mathbb{R}^N$.

THEOREM 6.1.44 (Strong Maximum Principle I) If $\Omega \subseteq \mathbb{R}^N$ is a bounded domain, $L$ satisfies hypotheses $H(L)$ with $a_0 \equiv 0$, $x \in C^2(\Omega) \cap C(\overline{\Omega})$,
\[
Lx(z) \le 0 \qquad \forall\, z \in \Omega
\]
and $x$ attains its maximum over $\overline{\Omega}$ at an interior point (i.e., in $\Omega$), then $x$ is constant on $\Omega$.

PROOF Suppose that $x$ is not constant. Let us set
\[
m \stackrel{df}{=} \max_{z \in \overline{\Omega}} x(z)
\]
(note that $m \ge 0$ if $a_0 \ne 0$). Then
\[
V \stackrel{df}{=} \{ z \in \Omega : x(z) < m \} \ne \emptyset \quad \text{and} \quad \partial V \cap \Omega \ne \emptyset.
\]
Let $\bar z \in V$ be such that
\[
d_X(\bar z, \partial V) < d_X(\bar z, \partial\Omega)
\]
and let $B_r(\bar z)$ ($r > 0$) be the largest ball, such that $B_r(\bar z) \subseteq V$. Then we can find $z_0 \in \partial B_r(\bar z)$, such that $x(z_0) = m$. Applying Lemma 6.1.43(a), we obtain $\frac{\partial x}{\partial n}(z_0) > 0$. On the other hand, since $x$ attains its maximum at $z_0 \in \Omega$, we have $\nabla x(z_0) = 0$, hence $\frac{\partial x}{\partial n}(z_0) = 0$, a contradiction.
COROLLARY 6.1.45 If $\Omega \subseteq \mathbb{R}^N$ is a bounded domain, $L$ satisfies hypotheses $H(L)$ with $a_0 \equiv 0$, $x \in C^2(\Omega) \cap C(\overline{\Omega})$,
\[
Lx(z) \ge 0 \qquad \forall\, z \in \Omega
\]
and $x$ attains its minimum over $\overline{\Omega}$ at an interior point (i.e., in $\Omega$), then $x$ is constant on $\Omega$.

Similarly, using Lemma 6.1.43(b), we obtain the strong maximum principle when $a_0 \ge 0$.

THEOREM 6.1.46 (Strong Maximum Principle II) If $\Omega \subseteq \mathbb{R}^N$ is a bounded domain, $L$ satisfies hypotheses $H(L)$ with $a_0 \ge 0$, $x \in C^2(\Omega) \cap C(\overline{\Omega})$,
\[
Lx(z) \le 0 \qquad \forall\, z \in \Omega
\]
and $x$ attains a nonnegative maximum over $\overline{\Omega}$ at an interior point (i.e., in $\Omega$), then $x$ is constant on $\Omega$.

COROLLARY 6.1.47 If $\Omega \subseteq \mathbb{R}^N$ is a bounded domain, $L$ satisfies hypotheses $H(L)$ with $a_0 \ge 0$, $x \in C^2(\Omega) \cap C(\overline{\Omega})$,
\[
Lx(z) \ge 0 \qquad \forall\, z \in \Omega
\]
and $x$ attains a nonpositive minimum over $\overline{\Omega}$ at an interior point (i.e., in $\Omega$), then $x$ is constant on $\Omega$.

REMARK 6.1.48 In Theorem 6.1.46 and Corollary 6.1.47, the assumption that $a_0 \ge 0$ is essential. If it is not satisfied, then the result fails. To see this consider the equation
\[
x''(t) + x(t) = 0 \qquad \forall\, t \in (0, \pi),
\]
which corresponds to $Lx = -x'' - x$, i.e., $a_0 \equiv -1 < 0$. Then the solution $x(t) = \sin t$ for $t \in (0, \pi)$ attains a positive maximum at $t = \frac{\pi}{2} \in (0, \pi)$ without being constant.

We conclude this section with Harnack's inequality, which states that the values of the nonnegative solutions of $Lx = 0$ are comparable. For a proof of this theorem we refer to the book of Gilbarg & Trudinger (2001, Section 8.8).

THEOREM 6.1.49 (Harnack Inequality) If $\Omega \subseteq \mathbb{R}^N$ is a bounded open set, $L$ satisfies hypotheses $H(L)$, $x \in C^2(\Omega)$, $x \ge 0$, $Lx(z) = 0$ for all $z \in \Omega$ and $U \subset\subset \Omega$ is a connected set, then
\[
\sup_{z \in U} x(z) \le c \inf_{z \in U} x(z),
\]
for some constant $c > 0$ dependent only on $U$ and the coefficients of $L$.
6.2 The Partial p-Laplacian
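In this section we study the spectral properties of the partial $p$-Laplacian differential operator. So let $\Omega \subseteq \mathbb{R}^N$ be a bounded domain with a $C^2$-boundary $\partial\Omega$ and let $m \in L^\infty(\Omega)$ be such that
\[
\big| \{ z \in \Omega : m(z) > 0 \} \big|_N > 0,
\]
where by $|\cdot|_N$ we denote the Lebesgue measure on $\mathbb{R}^N$. We emphasize that in general $m$ may change sign. We consider the following weighted nonlinear eigenvalue problem:
\[
\begin{cases} -\mathrm{div}\big( \|Dx(z)\|^{p-2}_{\mathbb{R}^N} Dx(z) \big) = \lambda m(z) |x(z)|^{p-2} x(z) & \text{for a.a. } z \in \Omega, \\ x|_{\partial\Omega} = 0. \end{cases} \tag{6.23}
\]

DEFINITION 6.2.1 We say that a number $\lambda \in \mathbb{R}$ is an eigenvalue of $\big( -\Delta_p, W_0^{1,p}(\Omega) \big)$, if (6.23) admits a nontrivial solution $x \in W_0^{1,p}(\Omega)$, which is known as an eigenfunction corresponding to $\lambda$.

PROPOSITION 6.2.2 There exists the least (first or principal) eigenvalue $\lambda_1 > 0$ with a corresponding eigenfunction $u_1 \in W_0^{1,p}(\Omega)$, $u_1 \ge 0$.

PROOF Let
\[
\lambda_1 \stackrel{df}{=} \inf \Big\{ \|Dx\|_p^p : x \in W_0^{1,p}(\Omega), \ \int_\Omega m|x|^p\, dz = 1 \Big\}.
\]
Evidently $\lambda_1 \ge 0$. Let $\{x_n\}_{n \ge 1} \subseteq W_0^{1,p}(\Omega)$ be a minimizing sequence, i.e.,
\[
\|Dx_n\|_p^p \searrow \lambda_1 \quad \text{and} \quad \int_\Omega m|x_n|^p\, dz = 1 \qquad \forall\, n \ge 1.
\]
By virtue of Poincaré's inequality (see Theorem 2.5.4), the sequence $\{x_n\}_{n \ge 1} \subseteq W_0^{1,p}(\Omega)$ is bounded. So we may assume that
\[
\begin{array}{ll}
x_n \stackrel{w}{\longrightarrow} u_1 & \text{in } W_0^{1,p}(\Omega), \\
x_n \longrightarrow u_1 & \text{in } L^p(\Omega), \\
x_n(z) \longrightarrow u_1(z) & \text{for a.a. } z \in \Omega, \\
|x_n(z)| \le k(z) & \text{for a.a. } z \in \Omega \text{ and all } n \ge 1,
\end{array}
\]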
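with $k \in L^p(\Omega)$. So
\[
\int_\Omega m|x_n|^p\, dz \longrightarrow \int_\Omega m|u_1|^p\, dz,
\]
hence
\[
\int_\Omega m|u_1|^p\, dz = 1. \tag{6.24}
\]
Also from the weak lower semicontinuity of the norm in $L^p(\Omega; \mathbb{R}^N)$, we have
\[
\|Du_1\|_p^p \le \liminf_{n \to +\infty} \|Dx_n\|_p^p,
\]
i.e., $\|Du_1\|_p^p = \lambda_1$. From (6.24), we have $u_1 \ne 0$ and so $\lambda_1 > 0$. Invoking the Lagrange multiplier rule (see Theorem 5.5.24), we can find $\lambda \in \mathbb{R}$, such that
\[
A(u_1) = \lambda m|u_1|^{p-2} u_1 \quad \text{in } W^{-1,p'}(\Omega), \tag{6.25}
\]
where $A : W_0^{1,p}(\Omega) \longrightarrow W^{-1,p'}(\Omega) = \big( W_0^{1,p}(\Omega) \big)^*$ (with $\frac{1}{p} + \frac{1}{p'} = 1$) is the nonlinear operator defined by
\[
\big\langle A(x), y \big\rangle_{W_0^{1,p}(\Omega)} \stackrel{df}{=} \int_\Omega \|Dx(z)\|^{p-2}_{\mathbb{R}^N} \big( Dx(z), Dy(z) \big)_{\mathbb{R}^N}\, dz.
\]
Note that
\[
\mathrm{div}\big( \|Du_1(\cdot)\|^{p-2}_{\mathbb{R}^N} Du_1(\cdot) \big) \in W^{-1,p'}(\Omega)
\]
(see Theorem 2.4.57). So from (6.25), we have that
\[
\big\langle -\mathrm{div}\big( \|Du_1\|^{p-2}_{\mathbb{R}^N} Du_1 \big), v \big\rangle_{W_0^{1,p}(\Omega)} = \lambda \int_\Omega m|u_1|^{p-2} u_1 v\, dz \qquad \forall\, v \in W_0^{1,p}(\Omega), \tag{6.26}
\]
hence
\[
\begin{cases} -\mathrm{div}\big( \|Du_1(z)\|^{p-2}_{\mathbb{R}^N} Du_1(z) \big) = \lambda m(z) |u_1(z)|^{p-2} u_1(z) & \text{for a.a. } z \in \Omega, \\ u_1|_{\partial\Omega} = 0. \end{cases}
\]
Moreover, by using in (6.26) as a test function $u_1 \in W_0^{1,p}(\Omega)$, we obtain
\[
\lambda_1 = \|Du_1\|_p^p = \lambda \int_\Omega m|u_1|^p\, dz = \lambda.
\]
Therefore $u_1$ is an eigenfunction corresponding to $\lambda_1 > 0$. Finally note that if we replace $u_1$ by $|u_1|$, it is still a solution of the minimization problem and so it is an eigenfunction corresponding to $\lambda_1 > 0$. Thus we may assume that $u_1(z) \ge 0$ for almost all $z \in \Omega$.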
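The variational characterization of $\lambda_1$ used in this proof can be illustrated in one dimension. The sketch below assumes $\Omega = (0,1)$ and $m \equiv 1$, and uses the trial function $x(t) = t(1-t)$; all of these are illustrative choices. The closed form $\lambda_1 = (p-1)\,\pi_p^p$ with $\pi_p = \frac{2\pi}{p \sin(\pi/p)}$ is quoted from the standard literature on the one-dimensional $p$-Laplacian and is not derived in this section. Since $\lambda_1$ is an infimum of Rayleigh quotients, any admissible trial function should give a value that is at least $\lambda_1$.

```python
import math

def rayleigh_quotient(x, dx, p, n=20000):
    """Midpoint-rule approximation of (int_0^1 |x'|^p dt) / (int_0^1 |x|^p dt)."""
    h = 1.0 / n
    numerator = sum(abs(dx((i + 0.5) * h)) ** p for i in range(n)) * h
    denominator = sum(abs(x((i + 0.5) * h)) ** p for i in range(n)) * h
    return numerator / denominator

def lambda1_interval(p):
    """First Dirichlet eigenvalue of the 1D p-Laplacian on (0, 1), weight m = 1.

    Closed form quoted from the literature (assumption of this sketch).
    """
    pi_p = 2.0 * math.pi / (p * math.sin(math.pi / p))
    return (p - 1.0) * pi_p ** p

def trial(t):
    return t * (1.0 - t)          # vanishes at t = 0 and t = 1

def trial_derivative(t):
    return 1.0 - 2.0 * t

quotient_p2 = rayleigh_quotient(trial, trial_derivative, 2.0)
quotient_p3 = rayleigh_quotient(trial, trial_derivative, 3.0)

print(quotient_p2, lambda1_interval(2.0))   # upper bound vs. pi^2
print(quotient_p3, lambda1_interval(3.0))
```

For $p = 2$ the quotient of the trial function is exactly $10$, slightly above $\lambda_1 = \pi^2$.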
Next we will determine some regularity properties of the eigenfunction $u_1$. For this reason we need to prove some nonlinear regularity results. We start with a result on Sobolev functions and distributions due to Brézis & Browder (1982). For the proof we refer to that paper.

LEMMA 6.2.3 If $x \in W_0^{1,p}(\Omega)$ (with $p \in (1, +\infty)$), $T \in W^{-1,p'}(\Omega) \cap L^1_{loc}(\Omega)$ (where $\frac{1}{p} + \frac{1}{p'} = 1$) and for some $g \in L^1(\Omega)$, we have
\[
g(z) \le T(z) x(z) \qquad \text{for a.a. } z \in \Omega,
\]
then $T(\cdot) x(\cdot) \in L^1(\Omega)$ and
\[
\langle T, x \rangle_{W_0^{1,p}(\Omega)} = \int_\Omega T(z) x(z)\, dz.
\]
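LEMMA 6.2.4 If $x \in W_0^{1,p}(\Omega)$ is such that
\[
\Delta_p x = \mathrm{div}\big( \|Dx\|^{p-2}_{\mathbb{R}^N} Dx \big) \in L^1_{loc}(\Omega),
\]
there exist $\sigma \in [1, p^*)$, where
\[
p^* \stackrel{df}{=} \begin{cases} \frac{Np}{N-p} & \text{if } p < N, \\ +\infty & \text{if } p \ge N, \end{cases}
\]
$r \in \big[ 1, \frac{p^*}{p} \big)$, $c > 0$ and $a \in L^{r'}(\Omega)_+$ ($\frac{1}{r} + \frac{1}{r'} = 1$), such that
\[
-x(z) \Delta_p x(z) \le c|x(z)|^\sigma + a(z)|x(z)| \qquad \text{for a.a. } z \in \Omega, \tag{6.27}
\]
then $x \in L^{p_n}(\Omega)$ for all $n \ge 0$, where
\[
p_0 \stackrel{df}{=} \begin{cases} p^* & \text{if } p < N, \\ 2\max\{pr, \sigma\} & \text{if } p \ge N \end{cases}
\]
and
\[
p_{n+1} \stackrel{df}{=} p_0 + \frac{p_0}{p} \min\Big\{ p_n - \sigma, \ \frac{p_n}{r} - 1 \Big\}.
\]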
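PROOF Clearly $x \in L^{p_0}(\Omega)$. Suppose that $x \in L^{p_n}(\Omega)$ for some $n \ge 0$. We will show that $x \in L^{p_{n+1}}(\Omega)$. We consider the following sequence of truncations of $x$:
\[
x_k(z) \stackrel{df}{=} \begin{cases} -k & \text{if } x(z) \le -k, \\ x(z) & \text{if } -k \le x(z) \le k, \\ k & \text{if } k \le x(z), \end{cases} \qquad \forall\, k \ge 1.
\]
Also let us set
\[
\beta \stackrel{df}{=} p\, \frac{p_{n+1} - p_0}{p_0}.
\]
Since $p_0 \le p_n$ for all $n \ge 0$, we have $\beta > 0$. From (6.27), we have
\[
-|x_k(z)|^\beta x_k(z) \Delta_p x(z) \le c|x(z)|^{\sigma+\beta} + a(z)|x(z)|^{\beta+1} \qquad \text{for a.a. } z \in \Omega. \tag{6.28}
\]
Note that by Hölder's inequality (see Theorem A.2.27), we have
\[
\int_\Omega \big( c|x|^{\sigma+\beta} + a|x|^{\beta+1} \big)\, dz \le c\|x\|_{\sigma+\beta}^{\sigma+\beta} + \|a\|_{r'} \|x\|_{r(\beta+1)}^{\beta+1}. \tag{6.29}
\]
Since $p_n = \max\big( \sigma + \beta, r(\beta+1) \big)$, from (6.29), it follows that
\[
\int_\Omega \big( c|x|^{\sigma+\beta} + a|x|^{\beta+1} \big)\, dz \le \zeta \big( \|x\|_{p_n}^{p_n} + 1 \big), \tag{6.30}
\]
with
\[
\zeta \stackrel{df}{=} \big( c + \|a\|_{r'} \big)\big( |\Omega|_N + 1 \big) > 0.
\]
Moreover, since
\[
|x_k|^\beta x_k \in W_0^{1,p}(\Omega) \quad \text{and} \quad \Delta_p x \in W^{-1,p'}(\Omega) \cap L^1_{loc}(\Omega)
\]
(by hypothesis) and from (6.28), we see that $|x_k|^\beta x_k \Delta_p x$ is minorized by an $L^1(\Omega)$-function, we can use Lemma 6.2.3 and infer that $|x_k|^\beta x_k \Delta_p x \in L^1(\Omega)$ and
\[
-\int_\Omega |x_k|^\beta x_k \Delta_p x\, dz = \big\langle -\Delta_p x, |x_k|^\beta x_k \big\rangle_{W_0^{1,p}(\Omega)}.
\]
We have
\[
\begin{aligned}
-\int_\Omega |x_k|^\beta x_k \Delta_p x\, dz
&= \int_\Omega \big( D(|x_k|^\beta x_k), Dx \big)_{\mathbb{R}^N} \|Dx\|^{p-2}_{\mathbb{R}^N}\, dz \\
&\ge (\beta + 1) \int_\Omega \|Dx_k\|^p_{\mathbb{R}^N} |x_k|^\beta\, dz \\
&= (\beta + 1) \Big( \frac{p}{\beta + p} \Big)^p \int_\Omega \big\| D\big( |x_k|^{\frac{\beta}{p}} x_k \big) \big\|^p_{\mathbb{R}^N}\, dz \\
&\ge \frac{1}{c_0^p} \Big( \frac{p}{\beta + p} \Big)^p \|x_k\|_{\frac{p_0(\beta+p)}{p}}^{\beta+p}
= \frac{1}{c_0^p} \Big( \frac{p_0}{p_{n+1}} \Big)^p \|x_k\|_{p_{n+1}}^{\frac{p\, p_{n+1}}{p_0}},
\end{aligned} \tag{6.31}
\]
where $c_0 > 0$ is the constant for the embedding $W_0^{1,p}(\Omega) \subseteq L^{p_0}(\Omega)$, i.e.,
\[
\|x\|_{L^{p_0}(\Omega)} \le c_0 \|x\|_{W_0^{1,p}(\Omega)} \qquad \forall\, x \in W_0^{1,p}(\Omega).
\]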
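Then from (6.28), (6.30) and (6.31), we obtain
\[
\|x_k\|_{p_{n+1}}^{\frac{p\, p_{n+1}}{p_0}} \le \hat c_0\, p_{n+1}^p \big( \|x\|_{p_n}^{p_n} + 1 \big),
\]
with $\hat c_0 \stackrel{df}{=} \zeta \big( \frac{c_0}{p_0} \big)^p$. Because
\[
x_k(z) \longrightarrow x(z) \qquad \text{for a.a. } z \in \Omega \text{ as } k \to +\infty
\]
and by the induction hypothesis $x \in L^{p_n}(\Omega)$, we have that
\[
x_k \stackrel{w}{\longrightarrow} x \quad \text{in } L^{p_{n+1}}(\Omega)
\]
and so $x \in L^{p_{n+1}}(\Omega)$.

Now we consider the sequence $\{r_n\}_{n \ge 0}$, defined by
\[
r_0 \stackrel{df}{=} p_0, \qquad r_{n+1} \stackrel{df}{=} \big( r_n + r(p-1) \big)\delta \qquad \forall\, n \ge 0,
\]
with $\delta \stackrel{df}{=} \frac{p_0}{rp}$.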
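LEMMA 6.2.5 If the hypotheses of Lemma 6.2.4 hold, then $x \in L^{r_n}(\Omega)$ for all integers $n \ge 0$ and
\[
\|x\|_{r_{n+1}}^{\frac{r_{n+1}}{\delta r}} \le \hat c\, r_{n+1}^p\, \|x\|_{r_n}^{\frac{r_n}{r}},
\]
where $\hat c > 0$ is a constant depending on $N$, $p$, $\sigma$, $r$, $\|a\|_{r'}$, $c$ and $\|x\|_{p_0}$.

PROOF For every $n \ge 1$, we have
\[
p_n \ge p_0 + s_0 \sum_{k=1}^{n} \delta^k = p_0 + s_0 \delta\, \frac{\delta^n - 1}{\delta - 1},
\]
where $s_0 = \min\{ r(p_0 - \sigma), p_0 - r \}$. Since $s_0 > 0$ and $\delta > 1$, we see that $p_n \longrightarrow +\infty$. So Lemma 6.2.4 implies that $x \in L^\vartheta(\Omega)$ for all $\vartheta \in [1, +\infty)$. Hence $|x|^{\sigma-1} \in L^{r'}(\Omega)$ and $\big\| |x|^{\sigma-1} \big\|_{r'} \le \eta$ with $\eta > 0$ depending only on $N$, $p$, $\sigma$, $r$, $\|a\|_{r'}$, $c$ and $\|x\|_{p_0}$ (see the proof of Lemma 6.2.4). We have
\[
\int_\Omega \big( c|x|^{\sigma-1} + a \big) |x|^{\frac{r_n}{r}}\, dz \le \big( c\eta + \|a\|_{r'} \big) \|x\|_{r_n}^{\frac{r_n}{r}}.
\]
If in (6.28) we replace $\beta$ by $\frac{r_n}{r} - 1$ and argue as in the proof of Lemma 6.2.4, we obtain the desired inequality.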
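Now we are ready for the first regularity result.

THEOREM 6.2.6 If all the hypotheses of Lemma 6.2.4 hold, then $x \in L^\infty(\Omega)$ and $\|x\|_\infty \le \eta_1$ with the constant $\eta_1 > 0$ depending on $N$, $p$, $\sigma$, $r$, $\|a\|_{r'}$, $c$ and $\|x\|_{p_0}$.

PROOF Let
\[
h_n \stackrel{df}{=} r_n \ln\big( \|x\|_{r_n} \big) \quad \text{and} \quad \mu_n \stackrel{df}{=} r \ln\big( \hat c\, r_{n+1}^p \big) \qquad \forall\, n \ge 0,
\]
where $\hat c > 0$ denotes the constant from Lemma 6.2.5. Then from Lemma 6.2.5, we have
\[
h_{n+1} \le \delta (h_n + \mu_n),
\]
so
\[
h_n \le \delta^n h_0 + \sum_{k=1}^{n} \delta^k \mu_{n-k} \qquad \forall\, n \ge 1.
\]
Since
\[
r_n = \delta^n p_0 + \delta r(p-1)\, \frac{\delta^n - 1}{\delta - 1},
\]
we have
\[
\delta^n p_0 \le r_n \le \delta^n \bar c,
\]
with $\bar c \stackrel{df}{=} p_0 + \delta r\, \frac{p-1}{\delta - 1}$. Hence
\[
\mu_n \le \tau_1 + (n+1)\tau_2,
\]
with $\tau_1 \stackrel{df}{=} r \ln\big( \hat c\, \bar c^{\,p} \big)$ and $\tau_2 \stackrel{df}{=} pr \ln \delta$. It follows that
\[
\sum_{k=1}^{n} \delta^k \mu_{n-k} \le \zeta \delta^n,
\]
with $\zeta \stackrel{df}{=} \big( \tau_1 + \tau_2 \frac{\delta}{\delta - 1} \big) \frac{\delta}{\delta - 1}$. Then we have
\[
\|x\|_{r_n} \le \exp\Big( \frac{h_n}{\delta^n p_0} \Big) \le \exp\Big( \frac{h_0 + \zeta}{p_0} \Big),
\]
so
\[
\|x\|_\infty = \lim_{n \to +\infty} \|x\|_{r_n} \le \exp\Big( \frac{h_0 + \zeta}{p_0} \Big) < +\infty.
\]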
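The closed form for $r_n$ and the two-sided bound $\delta^n p_0 \le r_n \le \delta^n \bar c$ used in this proof follow from the recursion $r_{n+1} = (r_n + r(p-1))\delta$ by summing a geometric series. The short sketch below checks them numerically; the sample values $N = 3$, $p = 2$ (so $p_0 = p^* = 6$) and $r = 1$ are illustrative assumptions.

```python
def r_sequence(p0, p, r, n_terms):
    """Iterate r_0 = p0, r_{n+1} = (r_n + r (p - 1)) * delta with delta = p0 / (r p)."""
    delta = p0 / (r * p)
    values = [p0]
    for _ in range(n_terms - 1):
        values.append((values[-1] + r * (p - 1.0)) * delta)
    return delta, values

def r_closed_form(p0, p, r, n):
    """Closed form r_n = delta^n p0 + delta r (p - 1) (delta^n - 1) / (delta - 1)."""
    delta = p0 / (r * p)
    return delta**n * p0 + delta * r * (p - 1.0) * (delta**n - 1.0) / (delta - 1.0)

# Illustrative data: N = 3, p = 2 gives p* = Np / (N - p) = 6; r = 1 lies in [1, p*/p).
p0, p, r = 6.0, 2.0, 1.0
delta, values = r_sequence(p0, p, r, 8)
c_bar = p0 + delta * r * (p - 1.0) / (delta - 1.0)

for n, r_n in enumerate(values):
    # Each line shows: n, lower bound, recursive value, closed form, upper bound.
    print(n, delta**n * p0, r_n, r_closed_form(p0, p, r, n), delta**n * c_bar)
```

Because $\delta > 1$, the exponents $r_n$ grow geometrically, which is what allows the passage to the $L^\infty$-norm in the limit.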
Using the above theorem, we can obtain a stronger regularity result. This result is a particular case of a theorem of Lieberman (1988) (cf. also Di Benedetto (1983) and Tolksdorf (1984)).

THEOREM 6.2.7 If $x \in W^{1,p}(\Omega) \cap L^\infty(\Omega)$ and $\Delta_p x \in L^r(\Omega)$ with $r \in [1, +\infty]$, then
(a) if $r = +\infty$, we have $x \in C_0^{1,\beta}(\overline{\Omega})$ with $\beta \in (0,1)$ and
\[
\|x\|_{C_0^{1,\beta}(\overline{\Omega})} \le c
\]
with $c > 0$; both $\beta$ and $c$ depend only on $N$, $p$, $\|x\|_\infty$ and $\|\Delta_p x\|_\infty$;
(b) if $r > Np'$ (with $\frac{1}{p} + \frac{1}{p'} = 1$), then for every $\Omega_0 \subset\subset \Omega$, we have $x \in C_0^{1,\beta}(\overline{\Omega}_0)$ and
\[
\|x\|_{C_0^{1,\beta}(\overline{\Omega}_0)} \le \hat c,
\]
for some $\beta \in (0,1)$ and $\hat c > 0$; both $\beta$ and $\hat c$ depend only on $N$, $p$, $\|x\|_\infty$, $\|\Delta_p x\|_r$ and $\Omega_0$.

To fully describe the first eigenfunction $u_1$ associated to $\lambda_1 > 0$ (see Proposition 6.2.2), we will need a nonlinear version of the Strong (Hopf) Maximum Principle (see Theorem 6.1.44). The result is due to Vázquez (1984, Theorem 5), where the interested reader can find its proof.

THEOREM 6.2.8 If $x \in C^1(\Omega)$, $x(z) \ge 0$ for all $z \in \Omega$, $x \ne 0$, $\Delta_p x \in L^2_{loc}(\Omega)$ and
\[
\Delta_p x(z) \le \xi\big( x(z) \big) \qquad \text{for a.a. } z \in \Omega,
\]
where $\xi : \mathbb{R}_+ \longrightarrow \mathbb{R}$ is a continuous, increasing function, such that $\xi(0) = 0$ and
\[
\text{either } \xi(r_0) = 0 \text{ for some } r_0 > 0 \quad \text{or} \quad \int_0^1 \frac{1}{(r\xi(r))^{\frac{1}{p}}}\, dr = +\infty,
\]
then
\[
x(z) > 0 \qquad \forall\, z \in \Omega.
\]
Moreover, if $x \in C^1\big( \Omega \cup \{z_0\} \big)$, $z_0 \in \partial\Omega$ and $x(z_0) = 0$, then
\[
\frac{\partial x}{\partial n}(z_0) < 0
\]
(with $n$ being the exterior unit normal on $\partial\Omega$ at $z_0$).
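Having these general results we can establish the properties of $u_1$.

THEOREM 6.2.9 If $u \in W_0^{1,p}(\Omega)$ is an eigenfunction corresponding to an eigenvalue $\lambda \ge \lambda_1$ in (6.25), then $u \in C^{1,\beta}(\overline{\Omega})$ for some $\beta \in (0,1)$. In particular, if $u_1$ is an eigenfunction corresponding to the principal eigenvalue $\lambda_1 > 0$, then
\[
u_1(z) > 0 \quad \forall\, z \in \Omega \qquad \text{and} \qquad \frac{\partial u_1}{\partial n}(z) < 0 \quad \forall\, z \in \partial\Omega.
\]

PROOF The regularity of $u$ follows from Theorem 6.2.7(a). Also from Proposition 6.2.2, we know that $u_1 \ge 0$, $u_1 \ne 0$. We have
\[
\Delta_p u_1(z) \le \lambda_1 \|m\|_\infty u_1(z)^{p-1} \qquad \text{for a.a. } z \in \Omega.
\]
So we can apply Theorem 6.2.8 (with $\xi(r) = \lambda_1 \|m\|_\infty r^{p-1}$) and conclude that the thesis of the theorem holds.

REMARK 6.2.10 If on $C_0^1(\overline{\Omega})$ we consider the pointwise partial ordering (i.e., $x \le y$ if and only if $x(z) \le y(z)$ for all $z \in \overline{\Omega}$), which is induced by the positive cone
\[
C_0^1(\overline{\Omega})_+ \stackrel{df}{=} \big\{ x \in C_0^1(\overline{\Omega}) : x(z) \ge 0 \text{ for all } z \in \overline{\Omega} \big\},
\]
then
\[
\mathrm{int}\, C_0^1(\overline{\Omega})_+ = \Big\{ x \in C_0^1(\overline{\Omega}) : x(z) > 0 \text{ for all } z \in \Omega \text{ and } \frac{\partial x}{\partial n}(z) < 0 \text{ for all } z \in \partial\Omega \Big\}.
\]
So Theorem 6.2.9 implies that $u_1 \in \mathrm{int}\, C_0^1(\overline{\Omega})_+$.

The function $v \longmapsto \|v\|^p_{\mathbb{R}^N}$ (with $p > 1$) is strictly convex and so for every $v_1, v_2 \in \mathbb{R}^N$, $v_1 \ne v_2$, we have
\[
p \|v_1\|^{p-2}_{\mathbb{R}^N} \big( v_1, v_2 - v_1 \big)_{\mathbb{R}^N} < \|v_2\|^p_{\mathbb{R}^N} - \|v_1\|^p_{\mathbb{R}^N}. \tag{6.32}
\]
Sometimes it is useful to express the strictness in inequality (6.32) in a more precise way. This is done in the next lemma, the proof of which can be found in Lindqvist (1990, 1992).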
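LEMMA 6.2.11
(a) If $p \in (1,2)$ and $v_1, v_2 \in \mathbb{R}^N$, then
\[
p \|v_1\|^{p-2}_{\mathbb{R}^N} (v_1, v_2 - v_1)_{\mathbb{R}^N} + \frac{3p(p-1)}{16}\, \frac{\|v_2 - v_1\|^2_{\mathbb{R}^N}}{\big( \|v_1\|_{\mathbb{R}^N} + \|v_2\|_{\mathbb{R}^N} \big)^{2-p}} \le \|v_2\|^p_{\mathbb{R}^N} - \|v_1\|^p_{\mathbb{R}^N}.
\]
(b) If $p \ge 2$ and $v_1, v_2 \in \mathbb{R}^N$, then
\[
p \|v_1\|^{p-2}_{\mathbb{R}^N} (v_1, v_2 - v_1)_{\mathbb{R}^N} + \frac{\|v_2 - v_1\|^p_{\mathbb{R}^N}}{2^{p-1} - 1} \le \|v_2\|^p_{\mathbb{R}^N} - \|v_1\|^p_{\mathbb{R}^N}.
\]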
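The two inequalities of Lemma 6.2.11 can be spot-checked on random vectors. The sketch below is only a numerical sanity check under illustrative choices (dimension $3$, Gaussian samples, exponents $p = 1.5, 3, 4$), not a proof; the function names are hypothetical. It evaluates the right-hand side minus the left-hand side and reports the smallest value found, which should be nonnegative up to rounding.

```python
import math
import random

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lindqvist_gap(v1, v2, p):
    """Right-hand side minus left-hand side of Lemma 6.2.11 (>= 0 when it holds)."""
    diff = [b - a for a, b in zip(v1, v2)]
    linear = p * norm(v1) ** (p - 2) * dot(v1, diff)
    if p < 2:
        # Part (a): quadratic remainder weighted by (|v1| + |v2|)^(p - 2).
        extra = 3 * p * (p - 1) / 16 * norm(diff) ** 2 / (norm(v1) + norm(v2)) ** (2 - p)
    else:
        # Part (b): remainder |v2 - v1|^p / (2^(p-1) - 1).
        extra = norm(diff) ** p / (2 ** (p - 1) - 1)
    return norm(v2) ** p - norm(v1) ** p - linear - extra

def smallest_gap(p, samples=2000, dim=3, seed=0):
    rng = random.Random(seed)      # fixed seed so the check is reproducible
    worst = float("inf")
    for _ in range(samples):
        v1 = [rng.gauss(0, 1) for _ in range(dim)]
        v2 = [rng.gauss(0, 1) for _ in range(dim)]
        worst = min(worst, lindqvist_gap(v1, v2, p))
    return worst

for p in (1.5, 3.0, 4.0):
    print(p, smallest_gap(p))
```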
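REMARK 6.2.12 In the proof of Lemma 6.2.11(b) the central role is played by the so-called Clarkson's inequality, which says that for $p \ge 2$ and for all $v_1, v_2 \in \mathbb{R}^N$, we have
\[
\frac{1}{2^{p-1}} \Big[ \|v_1 + v_2\|^p_{\mathbb{R}^N} + \|v_1 - v_2\|^p_{\mathbb{R}^N} \Big] \le \|v_1\|^p_{\mathbb{R}^N} + \|v_2\|^p_{\mathbb{R}^N}.
\]

Another related useful inequality is the following one. For the proof see Cheng (1998).

LEMMA 6.2.13
(a) If $p \in (1,2)$ and $v_1, v_2 \in \mathbb{R}^N$, then
\[
(p-1) \|v_1 - v_2\|^2_{\mathbb{R}^N} \big( \|v_1\|_{\mathbb{R}^N} + \|v_2\|_{\mathbb{R}^N} \big)^{p-2} \le \Big( \|v_1\|^{p-2}_{\mathbb{R}^N} v_1 - \|v_2\|^{p-2}_{\mathbb{R}^N} v_2, \ v_1 - v_2 \Big)_{\mathbb{R}^N} \le 2^{2-p} \|v_1 - v_2\|^p_{\mathbb{R}^N}.
\]
(b) If $p \ge 2$ and $v_1, v_2 \in \mathbb{R}^N$, then
\[
c_1 \|v_1 - v_2\|^p_{\mathbb{R}^N} \le \Big( \|v_1\|^{p-2}_{\mathbb{R}^N} v_1 - \|v_2\|^{p-2}_{\mathbb{R}^N} v_2, \ v_1 - v_2 \Big)_{\mathbb{R}^N} \le c_2 \|v_1 - v_2\|^2_{\mathbb{R}^N} \big( \|v_1\|_{\mathbb{R}^N} + \|v_2\|_{\mathbb{R}^N} \big)^{p-2},
\]
for some constants $c_1, c_2 > 0$ depending only on $p \ge 2$.

Using these inequalities, we can establish the simplicity of the principal eigenvalue $\lambda_1 > 0$.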
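PROPOSITION 6.2.14 The first (principal) eigenvalue $\lambda_1 > 0$ of $\big( -\Delta_p, W_0^{1,p}(\Omega) \big)$ is simple, i.e., if $u, v \ge 0$ are eigenfunctions corresponding to the eigenvalue $\lambda_1 > 0$, then $u$ and $v$ are proportional.

PROOF Let $u, v \in W_0^{1,p}(\Omega)$ be two eigenfunctions corresponding to the eigenvalue $\lambda_1 > 0$. Then from Theorem 6.2.9, we know that $u, v \in C_0^1(\overline{\Omega})$ and
\[
u(z) > 0 \quad \text{and} \quad v(z) > 0 \qquad \forall\, z \in \Omega.
\]
Also we have
\[
\int_\Omega \|Du\|^{p-2}_{\mathbb{R}^N} \big( Du, D\vartheta \big)_{\mathbb{R}^N}\, dz = \lambda_1 \int_\Omega m|u|^{p-2} u \vartheta\, dz \qquad \forall\, \vartheta \in W_0^{1,p}(\Omega) \tag{6.33}
\]
and
\[
\int_\Omega \|Dv\|^{p-2}_{\mathbb{R}^N} \big( Dv, D\eta \big)_{\mathbb{R}^N}\, dz = \lambda_1 \int_\Omega m|v|^{p-2} v \eta\, dz \qquad \forall\, \eta \in W_0^{1,p}(\Omega). \tag{6.34}
\]
Let
\[
\vartheta \stackrel{df}{=} \frac{u^p - v^p}{u^{p-1}} \quad \text{and} \quad \eta \stackrel{df}{=} \frac{v^p - u^p}{v^{p-1}}.
\]
Clearly $\vartheta, \eta \in L^\infty(\Omega)$ and
\[
D\vartheta = \Big[ 1 + (p-1) \Big( \frac{v}{u} \Big)^p \Big] Du - p \Big( \frac{v}{u} \Big)^{p-1} Dv
\]
and
\[
D\eta = \Big[ 1 + (p-1) \Big( \frac{u}{v} \Big)^p \Big] Dv - p \Big( \frac{u}{v} \Big)^{p-1} Du
\]
and so $\vartheta, \eta \in W_0^{1,p}(\Omega)$. Therefore we can use them as test functions in (6.33) and (6.34). We use $\vartheta$ in (6.33) and $\eta$ in (6.34) and then add the resulting equalities. Since $u^{p-1}\vartheta + v^{p-1}\eta = 0$, the right-hand sides cancel and we obtain
\[
\begin{aligned}
&\int_\Omega \Big( \Big[ 1 + (p-1) \Big( \frac{v}{u} \Big)^p \Big] \|Du\|^p_{\mathbb{R}^N} + \Big[ 1 + (p-1) \Big( \frac{u}{v} \Big)^p \Big] \|Dv\|^p_{\mathbb{R}^N} \Big)\, dz \\
&\qquad = \int_\Omega \Big( p \Big( \frac{v}{u} \Big)^{p-1} \|Du\|^{p-2}_{\mathbb{R}^N} \big( Du, Dv \big)_{\mathbb{R}^N} + p \Big( \frac{u}{v} \Big)^{p-1} \|Dv\|^{p-2}_{\mathbb{R}^N} \big( Dv, Du \big)_{\mathbb{R}^N} \Big)\, dz.
\end{aligned}
\]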
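Note that
\[
D \ln u = \frac{1}{u} Du.
\]
So the last equality becomes
\[
\begin{aligned}
\int_\Omega (u^p - v^p) \big( \|D \ln u\|^p_{\mathbb{R}^N} - \|D \ln v\|^p_{\mathbb{R}^N} \big)\, dz
&= \int_\Omega p v^p \|D \ln u\|^{p-2}_{\mathbb{R}^N} \big( D \ln u, D \ln v - D \ln u \big)_{\mathbb{R}^N}\, dz \\
&\quad + \int_\Omega p u^p \|D \ln v\|^{p-2}_{\mathbb{R}^N} \big( D \ln v, D \ln u - D \ln v \big)_{\mathbb{R}^N}\, dz.
\end{aligned}
\]
If $p \in (1,2)$, we apply Lemma 6.2.11(a) and obtain
\[
-\frac{3p(p-1)}{16} \int_\Omega \big( v^p + u^p \big) \frac{\|D \ln u - D \ln v\|^2_{\mathbb{R}^N}}{\big( \|D \ln u\|_{\mathbb{R}^N} + \|D \ln v\|_{\mathbb{R}^N} \big)^{2-p}}\, dz = 0,
\]
so
\[
-\frac{3p(p-1)}{16} \int_\Omega \Big( \frac{1}{u^p} + \frac{1}{v^p} \Big) \frac{\|v Du - u Dv\|^2_{\mathbb{R}^N}}{\big( v \|Du\|_{\mathbb{R}^N} + u \|Dv\|_{\mathbb{R}^N} \big)^{2-p}}\, dz = 0
\]
and thus $v(z) Du(z) = u(z) Dv(z)$ for a.a. $z \in \Omega$. If $p \ge 2$, we apply Lemma 6.2.11(b) and obtain
\[
-\frac{1}{2^{p-1} - 1} \int_\Omega \big( u^p + v^p \big) \|D \ln v - D \ln u\|^p_{\mathbb{R}^N}\, dz = 0,
\]
so
\[
-\frac{1}{2^{p-1} - 1} \int_\Omega \Big( \frac{1}{v^p} + \frac{1}{u^p} \Big) \|v Du - u Dv\|^p_{\mathbb{R}^N}\, dz = 0
\]
and thus $v(z) Du(z) = u(z) Dv(z)$ for a.a. $z \in \Omega$. Therefore for any $p \in (1, +\infty)$, we have
\[
v(z) Du(z) = u(z) Dv(z) \qquad \text{for a.a. } z \in \Omega,
\]
which implies that $u = \beta v$, for some $\beta > 0$.

Next we show that any eigenfunction $u \in C_0^1(\overline{\Omega})$ (see Theorem 6.2.9) associated to an eigenvalue $\lambda \ne \lambda_1$ must change sign.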
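PROPOSITION 6.2.15 If $u \in C_0^1(\overline{\Omega})$ is an eigenfunction corresponding to an eigenvalue $\lambda > \lambda_1$, then $u$ changes sign (i.e., $u^+ \ne 0$, $u^- \ne 0$) and if
\[
\Omega_+ \stackrel{df}{=} \{ z \in \Omega : u(z) > 0 \}, \qquad \Omega_- \stackrel{df}{=} \{ z \in \Omega : u(z) < 0 \},
\]
then we have
\[
\min\big\{ |\Omega_+|_N, |\Omega_-|_N \big\} \ge \big( \lambda c^p \|m\|_\infty \big)^\sigma,
\]
with a constant $c > 0$ independent of $u$ and $\lambda$ and
\[
\sigma \stackrel{df}{=} \begin{cases} -\frac{N}{p} & \text{if } p < N, \\ -2 & \text{if } p \ge N. \end{cases}
\]

PROOF Let $u_1$ be the eigenfunction corresponding to $\lambda_1 > 0$ with
\[
\|Du_1\|_p = 1 \quad \text{and} \quad u_1(z) > 0 \qquad \forall\, z \in \Omega.
\]
We proceed indirectly. Suppose that $u$ has constant sign. Without any loss of generality, we may assume that $u \ge 0$ (the analysis is similar if we assume that $u \le 0$) and that $\|Du\|_p = 1$. From Theorem 6.2.9, we have that $u \in C_0^1(\overline{\Omega})$. So invoking Theorem 6.2.8, we infer that
\[
u(z) > 0 \qquad \forall\, z \in \Omega.
\]
Then
\[
\frac{u_1^p - u^p}{u_1^{p-1}} \in W_0^{1,p}(\Omega) \quad \text{and} \quad \frac{u^p - u_1^p}{u^{p-1}} \in W_0^{1,p}(\Omega)
\]
and using them as test functions, we obtain
\[
\begin{aligned}
&\Big\langle -\Delta_p u_1, \frac{u_1^p - u^p}{u_1^{p-1}} \Big\rangle_{W_0^{1,p}(\Omega)} + \Big\langle -\Delta_p u, \frac{u^p - u_1^p}{u^{p-1}} \Big\rangle_{W_0^{1,p}(\Omega)} \\
&\qquad = \Big\langle \lambda_1 m u_1^{p-1}, \frac{u_1^p - u^p}{u_1^{p-1}} \Big\rangle_{W_0^{1,p}(\Omega)} + \Big\langle \lambda m u^{p-1}, \frac{u^p - u_1^p}{u^{p-1}} \Big\rangle_{W_0^{1,p}(\Omega)} \\
&\qquad = (\lambda_1 - \lambda) \int_\Omega m \big( u_1^p - u^p \big)\, dz \ge 0
\end{aligned} \tag{6.35}
\]
(see the proof of Proposition 6.2.14).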
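On the other hand, since
\[
1 = \|Du_1\|_p^p = \lambda_1 \int_\Omega m u_1^p\, dz \quad \text{and} \quad 1 = \|Du\|_p^p = \lambda \int_\Omega m u^p\, dz,
\]
we have
\[
(\lambda_1 - \lambda) \int_\Omega m \big( u_1^p - u^p \big)\, dz = (\lambda_1 - \lambda) \Big( \frac{1}{\lambda_1} - \frac{1}{\lambda} \Big) < 0. \tag{6.36}
\]
Comparing (6.35) and (6.36), we reach a contradiction. So $u$ changes sign. Using as a test function $u^+ \in W_0^{1,p}(\Omega)$, from (6.25), we have
\[
\|Du^+\|_p^p = \lambda \int_\Omega m|u^+|^p\, dz \le \lambda \|m\|_\infty \|u^+\|_p^p. \tag{6.37}
\]
If
\[
p_0^* \stackrel{df}{=} \begin{cases} p^* = \frac{Np}{N-p} & \text{if } p < N, \\ 2p & \text{if } p \ge N, \end{cases}
\]
then, since $1 < p < p_0^*$, we have
\[
\|u^+\|_p \le \|u^+\|_{p_0^*} |\Omega_+|_N^{\frac{1}{p} - \frac{1}{p_0^*}},
\]
so
\[
\|u^+\|_p^p \le \|u^+\|_{p_0^*}^p |\Omega_+|_N^{1 - \frac{p}{p_0^*}}. \tag{6.38}
\]
Using (6.38) in (6.37), we obtain
\[
\|Du^+\|_p^p \le \lambda \|m\|_\infty \|u^+\|_{p_0^*}^p |\Omega_+|_N^{1 - \frac{p}{p_0^*}}. \tag{6.39}
\]
From Theorem 2.5.3, we know that
\[
\|u^+\|_{p_0^*}^p \le c^p \|Du^+\|_p^p,
\]
for some constant $c(p, N) > 0$. Then from (6.39), we have
\[
|\Omega_+|_N \ge \big( \lambda c^p \|m\|_\infty \big)^\sigma,
\]
where
\[
\sigma \stackrel{df}{=} \begin{cases} -\frac{N}{p} & \text{if } p < N, \\ -2 & \text{if } p \ge N. \end{cases}
\]
Similarly for $|\Omega_-|_N$. This completes the proof of the proposition.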
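REMARK 6.2.16 In fact an estimate analogous to the one obtained in Proposition 6.2.15 holds at every connected component of $\{ z \in \Omega : u(z) \ne 0 \}$. So if $\Omega_0$ is such a connected component, then
\[
|\Omega_0|_N \ge \big( \lambda \|m\|_\infty c_0^p \big)^\beta,
\]
for some constant $c_0(p, N) > 0$ and with
\[
\beta \stackrel{df}{=} \begin{cases} -\frac{N}{p} & \text{if } p < N, \\ -1 & \text{if } p = N, \\ 1 - \frac{N}{p} & \text{if } p > N. \end{cases}
\]
If $\lambda = \lambda_1$, then there is only one connected component, namely the domain $\Omega$ itself, and so we have an estimate (from below) of $\lambda_1$ in terms of $|\Omega|_N$ and $\|m\|_\infty$, namely
\[
\lambda_1 \ge \frac{|\Omega|_N^{\frac{1}{\beta}}}{\|m\|_\infty c_0^p}.
\]

We can use Proposition 6.2.15 to show that $\lambda_1 > 0$ is isolated.

PROPOSITION 6.2.17 The first eigenvalue $\lambda_1 > 0$ of $\big( -\Delta_p, W_0^{1,p}(\Omega), m \big)$ is isolated, i.e., there exists $\varepsilon > 0$, such that there is no eigenvalue in the open interval $(\lambda_1, \lambda_1 + \varepsilon)$.

PROOF We argue by contradiction. Suppose that the proposition is not true. Then we can find eigenvalues $\hat\lambda_n$ for $n \ge 1$, such that
\[
\hat\lambda_n \searrow \lambda_1.
\]
Let $\{\hat u_n\}_{n \ge 1} \subseteq W_0^{1,p}(\Omega)$ be associated normalized (i.e., with $\|\hat u_n\|_{W_0^{1,p}(\Omega)} = 1$ for $n \ge 1$) eigenfunctions. Let $A : W_0^{1,p}(\Omega) \longrightarrow W^{-1,p'}(\Omega)$ be the nonlinear operator, defined by
\[
\big\langle A(x), y \big\rangle_{W_0^{1,p}(\Omega)} = \int_\Omega \|Dx\|^{p-2}_{\mathbb{R}^N} \big( Dx, Dy \big)_{\mathbb{R}^N}\, dz \qquad \forall\, x, y \in W_0^{1,p}(\Omega).
\]
It is easy to see that $A$ is monotone (see Lemma 6.2.13) and demicontinuous. Hence $A$ is maximal monotone (see Proposition 3.2.19). For all $n \ge 1$, we have
\[
A(\hat u_n) = \hat\lambda_n m|\hat u_n|^{p-2} \hat u_n \quad \text{in } W^{-1,p'}(\Omega). \tag{6.40}
\]
By passing to a subsequence, if necessary, we may assume that
\[
\begin{array}{ll}
\hat u_n \stackrel{w}{\longrightarrow} u & \text{in } W_0^{1,p}(\Omega), \\
\hat u_n \longrightarrow u & \text{in } L^p(\Omega), \\
\hat u_n(z) \longrightarrow u(z) & \text{for a.a. } z \in \Omega, \\
|\hat u_n(z)| \le k(z) & \text{for a.a. } z \in \Omega \text{ and all } n \ge 1,
\end{array}
\]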
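with $k \in L^p(\Omega)_+$. Then the sequence $\big\{ |\hat u_n|^{p-2} \hat u_n \big\}_{n \ge 1} \subseteq L^{p'}(\Omega)$ is bounded and
\[
|\hat u_n(z)|^{p-2} \hat u_n(z) \longrightarrow |u(z)|^{p-2} u(z) \qquad \text{for a.a. } z \in \Omega.
\]
So we infer that
\[
|\hat u_n(\cdot)|^{p-2} \hat u_n(\cdot) \longrightarrow |u(\cdot)|^{p-2} u(\cdot) \quad \text{in } L^{p'}(\Omega).
\]
We have
\[
\big\langle A(\hat u_n), \hat u_n - u \big\rangle_{W_0^{1,p}(\Omega)} = \hat\lambda_n \int_\Omega m|\hat u_n|^{p-2} \hat u_n (\hat u_n - u)\, dz.
\]
Since
\[
\hat\lambda_n \int_\Omega m|\hat u_n|^{p-2} \hat u_n (\hat u_n - u)\, dz \longrightarrow 0,
\]
we obtain
\[
\lim_{n \to +\infty} \big\langle A(\hat u_n), \hat u_n - u \big\rangle_{W_0^{1,p}(\Omega)} = 0.
\]
But $A$ being maximal monotone, it is generalized pseudomonotone (see Proposition 3.2.47) and so
\[
\big\langle A(\hat u_n), \hat u_n \big\rangle_{W_0^{1,p}(\Omega)} \longrightarrow \big\langle A(u), u \big\rangle_{W_0^{1,p}(\Omega)},
\]
hence
\[
\|D\hat u_n\|_p \longrightarrow \|Du\|_p.
\]
Since
\[
D\hat u_n \stackrel{w}{\longrightarrow} Du \quad \text{in } L^p(\Omega; \mathbb{R}^N)
\]
and the Lebesgue space $L^p(\Omega; \mathbb{R}^N)$ is uniformly convex, from the Kadec-Klee property (see Remark A.3.22), we have
\[
D\hat u_n \longrightarrow Du \quad \text{in } L^p(\Omega; \mathbb{R}^N)
\]
and so
\[
\hat u_n \longrightarrow u \quad \text{in } W_0^{1,p}(\Omega).
\]
It follows that
\[
A(\hat u_n) \stackrel{w}{\longrightarrow} A(u) \quad \text{in } W^{-1,p'}(\Omega).
\]
Passing to the limit as $n \to +\infty$ in (6.40), we obtain
\[
A(u) = \lambda_1 m|u|^{p-2} u \quad \text{in } W^{-1,p'}(\Omega) \quad \text{and} \quad \|u\|_{W_0^{1,p}(\Omega)} = 1,
\]
which implies that $u$ is an eigenfunction corresponding to the eigenvalue $\lambda_1 > 0$. By Theorem 6.2.9, we have that $u \in C_0^1(\overline{\Omega})$ and that
\[
|u|(z) > 0 \qquad \forall\, z \in \Omega.
\]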
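Without any loss of generality we may assume that
\[
u(z) > 0 \qquad \forall\, z \in \Omega.
\]
Then for a given $\varepsilon > 0$, we can find $\xi_\varepsilon > 0$ and a compact set $C_\varepsilon \subseteq \Omega$ with $|\Omega \setminus C_\varepsilon|_N \le \frac{\varepsilon}{2}$, such that
\[
u(z) \ge 2\xi_\varepsilon > 0 \qquad \forall\, z \in C_\varepsilon.
\]
Also, since
\[
\hat u_n(z) \longrightarrow u(z) \qquad \text{for a.a. } z \in \Omega,
\]
by Egorov's theorem (see Theorem A.2.10), we can find a compact set $D_\varepsilon \subseteq \Omega$ with $|\Omega \setminus D_\varepsilon|_N \le \frac{\varepsilon}{2}$, such that $\hat u_n(z) \longrightarrow u(z)$ uniformly on $D_\varepsilon$. So we can find $n_0 = n_0(\varepsilon) \ge 1$, such that
\[
|\hat u_n(z) - u(z)| \le \xi_\varepsilon \qquad \forall\, n \ge n_0, \ z \in D_\varepsilon.
\]
It follows that
\[
\hat u_n(z) \ge \xi_\varepsilon \qquad \forall\, n \ge n_0, \ z \in C_\varepsilon \cap D_\varepsilon.
\]
Note that, if $\varepsilon < |\Omega|_N$, then
\[
|C_\varepsilon \cap D_\varepsilon|_N \ge |\Omega|_N - \varepsilon > 0. \tag{6.41}
\]
Fix $n \ge n_0$ and let
\[
\Omega_+^n \stackrel{df}{=} \{ z \in \Omega : \hat u_n(z) > 0 \}, \qquad \Omega_-^n \stackrel{df}{=} \{ z \in \Omega : \hat u_n(z) < 0 \}.
\]
Then from Proposition 6.2.15, and since $\hat\lambda_n \le \hat\lambda_1$ and $\sigma < 0$, we have
\[
|\Omega_-^n|_N \ge \big( \hat\lambda_n c^p \|m\|_\infty \big)^\sigma \ge \big( \hat\lambda_1 c^p \|m\|_\infty \big)^\sigma > 0. \tag{6.42}
\]
So, if we choose $\varepsilon \stackrel{df}{=} \frac{1}{2} \big( \hat\lambda_1 c^p \|m\|_\infty \big)^\sigma$, then from (6.41) and (6.42), we have
\[
|\Omega|_N \ge |\Omega_+^n \cup \Omega_-^n|_N = |\Omega_+^n|_N + |\Omega_-^n|_N > |\Omega|_N,
\]
a contradiction. This proves that $\lambda_1 > 0$ is isolated.

Therefore, summarizing the situation about the first element in the spectrum of $\big( -\Delta_p, W_0^{1,p}(\Omega), m \big)$, we can state the following theorem.

THEOREM 6.2.18 The nonlinear eigenvalue problem (6.23) has a least (first or principal) eigenvalue $\lambda_1 > 0$, which is isolated and simple (i.e., the associated eigenspace is one dimensional). Moreover, a corresponding eigenfunction $u$ belongs to $C_0^1(\overline{\Omega})$, does not change sign (i.e., $|u|(z) > 0$ for all $z \in \Omega$) and so we can always assume that $u(z) > 0$ for all $z \in \Omega$.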
The next question is what can be said about the higher part of the spectrum of $\big( -\Delta_p, W_0^{1,p}(\Omega), m \big)$. In this direction we use the Lusternik-Schnirelman theory discussed in Section 5.5. In fact we apply this theory using as topological index the Krasnoselskii genus $\gamma$ (see Definition 5.4.25). So we consider the functionals $\varphi_1, \varphi_2 : W_0^{1,p}(\Omega) \longrightarrow \mathbb{R}$, defined by
\[
\varphi_1(x) \stackrel{df}{=} \frac{1}{p} \|Dx\|_p^p \quad \text{and} \quad \varphi_2(x) \stackrel{df}{=} \frac{1}{p} \int_\Omega m|x|^p\, dz \qquad \forall\, x \in W_0^{1,p}(\Omega).
\]
Evidently $\varphi_1, \varphi_2 \in C^1\big( W_0^{1,p}(\Omega) \big)$. Let us set
\[
\varphi(x) \stackrel{df}{=} \varphi_1(x)^2 - \varphi_2(x) \qquad \forall\, x \in W_0^{1,p}(\Omega).
\]
Then $\varphi \in C^1\big( W_0^{1,p}(\Omega) \big)$. If $x$ is a nontrivial critical point of $\varphi$, i.e., $\varphi'(x) = 0$, then
\[
\frac{2}{p} \|Dx\|_p^p A(x) = m|x|^{p-2} x \quad \text{in } W^{-1,p'}(\Omega) \tag{6.43}
\]
and so using as a test function $\frac{1}{p} x$, we obtain
\[
\frac{2}{p^2} \|Dx\|_p^{2p} = \frac{1}{p} \int_\Omega m|x|^p\, dz.
\]
So the critical value is given by
\[
c = -\frac{1}{p^2} \|Dx\|_p^{2p} < 0.
\]
Then from (6.43), we see that $x$ is an eigenfunction of $\big( -\Delta_p, W_0^{1,p}(\Omega) \big)$ associated to the eigenvalue
\[
\lambda = \frac{1}{2\sqrt{-c}}.
\]
Conversely, if $x \in W_0^{1,p}(\Omega)$ is an eigenfunction of $\big( -\Delta_p, W_0^{1,p}(\Omega) \big)$ corresponding to an eigenvalue $\lambda > 0$, then if we set
\[
u \stackrel{df}{=} \frac{x}{\big( 2\lambda \varphi_1(x) \big)^{\frac{1}{p}}},
\]
we can check that $u$ is an eigenfunction corresponding to the eigenvalue $\lambda = \frac{1}{2\varphi_1(u)}$ and $u$ is also a critical point of $\varphi$ with critical value $c = -\frac{1}{4\lambda^2}$. Let
\[
T_n^c \stackrel{df}{=} \big\{ C \subseteq W_0^{1,p}(\Omega) : 0 \notin C, \ C \text{ is compact, symmetric and } \gamma(C) \ge n \big\},
\]
with $\gamma$ being the Krasnoselskii genus (see Definition 5.4.25). We consider the sequence
\[
c_n \stackrel{df}{=} \inf_{C \in T_n^c} \max_{x \in C} \varphi(x). \tag{6.44}
\]
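LEMMA 6.2.19 For every $n \ge 1$, $c_n$ defined by (6.44) is a critical value of $\varphi$ and we have $-\infty < c_n < 0$.

PROOF It suffices to show that for every $n \ge 1$, there exists a compact, symmetric set $C \subseteq W_0^{1,p}(\Omega)$ with $\gamma(C) = n$, such that
\[
\sup_{x \in C} \varphi(x) < 0.
\]
Note that
\[
\begin{aligned}
\varphi(x) &= \frac{1}{p^2} \|Dx\|_p^{2p} - \frac{1}{p} \int_\Omega m|x|^p\, dz \\
&\ge \frac{1}{p^2} \|Dx\|_p^{2p} - \frac{1}{p} \|m\|_\infty \|x\|_p^p \\
&\ge \frac{1}{p^2} \|Dx\|_p^{2p} - \frac{1}{p\lambda_1} \|m\|_\infty \|Dx\|_p^p \qquad \forall\, x \in W_0^{1,p}(\Omega).
\end{aligned} \tag{6.45}
\]
Since $\|Dx\|_p$ is an equivalent norm on $W_0^{1,p}(\Omega)$, from (6.45) we infer that $\varphi$ is coercive. Hence it is bounded from below and satisfies the PS-condition. For a given $n \ge 1$, let $v_1, \ldots, v_n$ be functions in $C_c^1(\Omega)$, such that
\[
\mathrm{supp}\, v_i \cap \mathrm{supp}\, v_j = \emptyset \quad \forall\, i \ne j \qquad \text{and} \qquad \int_\Omega m|v_i|^p\, dz > 0.
\]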
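We can generate such a family by taking disjoint, open balls $\{B_i\}_{i=1}^n$ in $\Omega$, such that
\[
\big| B_i \cap \{ z \in \Omega : m(z) > 0 \} \big|_N > 0 \qquad \forall\, i \in \{1, \ldots, n\}
\]
and then approximating the $L^p(\Omega)$ function $\chi_{B_i \cap \{m > 0\}}$. Let
\[
Y_n \stackrel{df}{=} \mathrm{span}\, \{v_i\}_{i=1}^n.
\]
For every $y \in Y_n$, $y = \sum_{i=1}^{n} \beta_i v_i$, $\beta_i \in \mathbb{R}$, we have (after rescaling each $v_i$ so that $\varphi_2(v_i) = 1$)
\[
\varphi_2(y) = \sum_{i=1}^{n} |\beta_i|^p.
\]
Hence the map
\[
y \longmapsto \varphi_2(y)^{\frac{1}{p}}
\]
is a norm on $Y_n$ and because $Y_n$ is finite dimensional all norms are equivalent and so we can find $c > 0$, such that
\[
c \varphi_1(y) \le \varphi_2(y) \le \frac{1}{c} \varphi_1(y) \qquad \forall\, y \in Y_n.
\]
Consider the compact set $C \subseteq W_0^{1,p}(\Omega)$, defined by
\[
C \stackrel{df}{=} \Big\{ y \in Y_n : \frac{c^2}{4} \le \varphi_2(y) \le \frac{c^2}{3} \Big\}.
\]
We have
\[
\sup_{y \in C} \varphi(y) \le -\frac{c^2}{8} < 0.
\]
Moreover, since $Y_n$ is isomorphic to $\mathbb{R}^n$, we can identify $C$ with an annulus $C'$ in $\mathbb{R}^n$, such that $\partial B_1 \subseteq C' \subseteq \mathbb{R}^n \setminus \{0\}$. Therefore $\gamma(C) = n$ and we have proved the lemma.

REMARK 6.2.20 The functional $\varphi$ has an infinity of critical points. Indeed, if $i < j$ and $c_i = c_j = c$, then $\gamma(K_c^\varphi) \ge j - i + 2$, where
\[
K_c^\varphi \stackrel{df}{=} \big\{ x \in W_0^{1,p}(\Omega) : \varphi(x) = c, \ \varphi'(x) = 0 \big\}
\]
and so $K_c^\varphi$ is infinite (see Remark 5.4.31). However, this does not mean that the sequence $\{c_n\}_{n \ge 1}$ is infinite.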
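LEMMA 6.2.21 If the sequence $\{c_n\}_{n \ge 1}$ is defined by (6.44), then $c_n \longrightarrow 0$.

PROOF Let
\[
M \stackrel{df}{=} \big\{ x \in W_0^{1,p}(\Omega) : \varphi(x) \le 0 \big\}.
\]
We need to show that for every $\varepsilon > 0$, we can find an integer $n_\varepsilon \ge 1$, such that
\[
-\varepsilon \le \sup_{x \in C} \varphi(x) \qquad \forall\, C \in T_{n_\varepsilon}^c,
\]
with $C \subseteq M$. From the proof of Lemma 6.2.19, we know that
\[
\varphi(x) \longrightarrow +\infty \qquad \text{as } \|x\|_{W_0^{1,p}(\Omega)} \to +\infty.
\]
So $M$ is bounded in $W_0^{1,p}(\Omega)$. Since the canonical injection $i : W_0^{1,p}(\Omega) \longrightarrow L^p(\Omega)$ is compact, for a given $\vartheta > 0$, we can find a finite dimensional subspace $Y_\vartheta$ of $L^p(\Omega)$ and a map $h_\vartheta : M \longrightarrow Y_\vartheta$, such that
\[
\sup_{y \in M} \|y - h_\vartheta(y)\|_p \le \vartheta.
\]
Let us set
\[
\hat h_\vartheta(y) = \frac{1}{2} \big( h_\vartheta(y) - h_\vartheta(-y) \big).
\]
Because $M$ is symmetric, we see that $\hat h_\vartheta : M \longrightarrow Y_\vartheta$ and it is continuous, odd and satisfies
\[
\sup_{y \in M} \|y - \hat h_\vartheta(y)\|_p \le \vartheta.
\]
The set $M$ being bounded in $W_0^{1,p}(\Omega)$, it is relatively compact in $L^p(\Omega)$ and so for a given $\varepsilon > 0$, we can find $\vartheta_\varepsilon > 0$, such that
\[
\big| \varphi_2(y) - \varphi_2\big( \hat h_{\vartheta_\varepsilon}(y) \big) \big| \le \frac{\varepsilon}{2} \qquad \forall\, y \in M.
\]
Let $\delta_\varepsilon > 0$ be such that
\[
\varphi_2(y) \le \frac{\varepsilon}{2} \qquad \forall\, \|y\|_p \le \delta_\varepsilon.
\]
So if $y \in M$, with
\[
\big\| \hat h_{\vartheta_\varepsilon}(y) \big\|_p \le \delta_\varepsilon,
\]
we have
\[
\varphi_2(y) \le \big| \varphi_2(y) - \varphi_2\big( \hat h_{\vartheta_\varepsilon}(y) \big) \big| + \varphi_2\big( \hat h_{\vartheta_\varepsilon}(y) \big) \le \varepsilon.
\]
Hence, if the set
\[
C \subseteq M \cap \big\{ y \in W_0^{1,p}(\Omega) : \varphi_2(y) > \varepsilon \big\}
\]
is compact and symmetric, we have
\[
\hat h_{\vartheta_\varepsilon}(C) \subseteq \big\{ y \in Y_{\vartheta_\varepsilon} : \|y\|_p > \delta_\varepsilon \big\}.
\]
Since the set $\hat h_{\vartheta_\varepsilon}(C) \subseteq L^p(\Omega)$ is compact, symmetric and contained in $Y_{\vartheta_\varepsilon} \setminus \{0\}$, we have
\[
\gamma_p\big( \hat h_{\vartheta_\varepsilon}(C) \big) \le \dim Y_{\vartheta_\varepsilon},
\]
with $\gamma_p$ being the Krasnoselskii genus in $L^p(\Omega)$ (see Proposition 5.4.28). Note that
\[
\gamma(C) \le \gamma_p\big( \hat h_{\vartheta_\varepsilon}(C) \big)
\]
(see Proposition 5.4.29(d)). So
\[
\gamma(C) \le \dim Y_{\vartheta_\varepsilon}.
\]
Hence for all compact and symmetric sets $C \subseteq M$, such that $\gamma(C) \ge \dim Y_{\vartheta_\varepsilon} + 1$, we can find $y_0 \in C$, such that
\[
\inf_{y \in C} \varphi_2(y) \le \varphi_2(y_0) \le \varepsilon
\]
and since $\varphi(y) \ge -\varphi_2(y)$, we have
\[
\sup_{y \in C} \varphi(y) \ge -\inf_{y \in C} \varphi_2(y) \ge -\varepsilon,
\]
which proves the lemma.

Using Lemma 6.2.19 and Lemma 6.2.21 and recalling the discussion before Lemma 6.2.19, we can state the following theorem concerning the spectrum of $\big( -\Delta_p, W_0^{1,p}(\Omega), m \big)$.

THEOREM 6.2.22 The nonlinear eigenvalue problem (6.23) has a sequence $\{\lambda_n\}_{n \ge 1}$ of positive eigenvalues, such that $\lambda_n \longrightarrow +\infty$.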
REMARK 6.2.23 If $\big| \{ z \in \Omega : m(z) < 0 \} \big|_N > 0$, then if in the above analysis we replace $m$ by $-m$, we see that (6.23) has a sequence $\{\lambda_{-n}\}_{n \ge 1}$ of negative eigenvalues, such that $\lambda_{-n} \longrightarrow -\infty$. From (6.44) and since $\lambda_n = \frac{1}{2\sqrt{-c_n}}$, we see that
\[
\lambda_n = \inf_{C \in T_n^c} \ \sup_{\substack{y \in C \\ \varphi_2(y) = 1}} \varphi_1(y).
\]
Therefore
\[
\lambda_1 = \inf_{\substack{y \in W_0^{1,p}(\Omega) \\ \varphi_2(y) = 1}} \varphi_1(y),
\]
which is what we obtained in Proposition 6.2.2. It is easy to check that the eigenvalues of (6.23) form a closed set in $\mathbb{R}$ and the set of the corresponding normalized eigenfunctions forms a compact set in $W_0^{1,p}(\Omega)$. However, if $p \ne 2$ (nonlinear case), we do not know if $\{\lambda_n\}_{n \ge 1}$ are the only eigenvalues of $\big( -\Delta_p, W_0^{1,p}(\Omega) \big)$. The eigenvalues $\{\lambda_n\}_{n \ge 1}$ are known as the variational or Lusternik-Schnirelman eigenvalues of $\big( -\Delta_p, W_0^{1,p}(\Omega) \big)$. If $(\lambda, u)$ is an eigenelement of (6.23) and
\[
\Omega_0(u) \stackrel{df}{=} \{ z \in \Omega : u(z) = 0 \},
\]
then by $N(u)$ we denote the number of connected components of the open set $\Omega \setminus \Omega_0(u)$. Let
\[
N(\lambda) \stackrel{df}{=} \max\big\{ N(u) : (\lambda, u) \text{ is an eigenelement of (6.23)} \big\}.
\]
Using Remark 6.2.16, we can easily verify that $\lambda_{N(\lambda)} \le \lambda$. So if $\lambda_n < \lambda_{n+1}$, then $N(u_{\lambda_n}) \le n$, which is the nonlinear analog of Courant's nodal theorem (see also Theorem 6.6.1). Because $\lambda_1 > 0$ is isolated, we have
\[
\hat\lambda_2 \stackrel{df}{=} \inf\big\{ \lambda : \lambda \text{ is an eigenvalue of } \big( -\Delta_p, W_0^{1,p}(\Omega), m \big), \ \lambda > \lambda_1 \big\} > \lambda_1.
\]

THEOREM 6.2.24 $\lambda_2 = \hat\lambda_2$.

PROOF Since the set of eigenvalues of $\big( -\Delta_p, W_0^{1,p}(\Omega), m \big)$ is closed, we have that $\hat\lambda_2 > 0$ is an eigenvalue of (6.23). If $u$ is an eigenfunction corresponding to $\hat\lambda_2$, then $u$ must change sign (see Proposition 6.2.15). Hence $2 \le N(u)$ and so $\lambda_2 \le \lambda_{N(\hat\lambda_2)}$. But from Remark 6.2.23, we know that $\lambda_{N(\hat\lambda_2)} \le \hat\lambda_2$ and so we conclude that $\lambda_2 = \hat\lambda_2$.
REMARK 6.2.25 If we have two weight functions $m_1, m_2 \in L^\infty(\Omega)$, $m_1 \le m_2$, $m_1 \ne m_2$ and $\big| \{ m_1 > 0 \} \big|_N > 0$, then if by $\lambda_1(m_1)$ (respectively $\lambda_1(m_2)$) we denote the first positive eigenvalue of (6.23) with weight $m_1$ (respectively $m_2$), we have
\[
\lambda_1(m_1) \ge \lambda_1(m_2)
\]
(see Proposition 6.2.2). Using the variational characterization of $\lambda_2 > 0$ established in Theorem 6.2.24, we can show that if $m_1(z) < m_2(z)$ for almost all $z \in \{ m_1 > 0 \}$, then $\lambda_2(m_2) < \lambda_2(m_1)$ (see Anane & Tsouli (1996)).

In the last part of this section we establish a few basic facts about the spectrum of $-\Delta_p$ with Neumann boundary conditions (i.e., of $\big( -\Delta_p, W^{1,p}(\Omega) \big)$). So the nonlinear eigenvalue problem under consideration is the following:
\[
\begin{cases} -\mathrm{div}\big( \|Dx(z)\|^{p-2}_{\mathbb{R}^N} Dx(z) \big) = \lambda |x(z)|^{p-2} x(z) & \text{for a.a. } z \in \Omega, \\ \frac{\partial x}{\partial n_p} = 0 & \text{on } \partial\Omega. \end{cases} \tag{6.46}
\]
Here $\frac{\partial x}{\partial n_p} = \|Dx\|^{p-2}_{\mathbb{R}^N} \big( Dx, n \big)_{\mathbb{R}^N}$ with $n$ being the outward unit normal on $\partial\Omega$. Clearly $\lambda = 0$ is an eigenvalue of (6.46) with the constant functions as eigenfunctions. More precisely, we have the following result.

PROPOSITION 6.2.26 $\lambda = 0$ is the first eigenvalue of $\big( -\Delta_p, W^{1,p}(\Omega) \big)$ and it is isolated and simple.

PROOF First note that (6.46) cannot have negative eigenvalues. Indeed, if $\lambda < 0$ is an eigenvalue of (6.46), then by multiplying with $x(z)$, integrating over $\Omega$ and using the nonlinear Green's identity (see Theorem 2.4.53), we obtain
\[
\|Dx\|_p^p = \lambda \|x\|_p^p,
\]
a contradiction since $\lambda < 0$. The simplicity of $\lambda = 0$ is a direct consequence of the fact that
\[
0 = \inf_{\substack{x \in W^{1,p}(\Omega) \\ x \ne 0}} \frac{\|Dx\|_p^p}{\|x\|_p^p}.
\]
Finally suppose that $\lambda = 0$ is not isolated. We can find a sequence $\{\lambda_n\}_{n \ge 1}$ of positive eigenvalues of (6.46), such that $\lambda_n \searrow 0$. Consider a sequence $\{u_n\}_{n \ge 1}$ of associated eigenfunctions. From Theorems 6.2.6 and 6.2.7(a), we have that $u_n \in C^1(\overline{\Omega})$. Assume without any loss of generality that $\|u_n\|_p = 1$ for $n \ge 1$. We have
\[
\lambda_n = \frac{\|Du_n\|_p^p}{\|u_n\|_p^p} = \|Du_n\|_p^p \searrow 0.
\]
So the sequence $\{u_n\}_{n \ge 1} \subseteq W^{1,p}(\Omega)$ is bounded. Therefore by passing to a suitable subsequence if necessary, we may assume that
$$u_n \stackrel{w}{\longrightarrow} u \ \text{ in } W^{1,p}(\Omega), \qquad u_n \longrightarrow u \ \text{ in } L^p(\Omega)$$
(recall that the embedding $W^{1,p}(\Omega) \subseteq L^p(\Omega)$ is compact; see Theorem 2.5.17). We have $\|u\|_p = 1$ and $\|Du\|_p = 0$ and so
$$u = \pm \frac{1}{|\Omega|_N^{1/p}}.$$
Using as a test function $y \equiv 1 \in W^{1,p}(\Omega)$ in (6.46), we obtain
$$\int_\Omega |u_n(z)|^{p-2} u_n(z)\,dz = 0 \qquad \forall\, n \ge 1.$$
Passing to the limit as $n \to +\infty$, we have
$$\int_\Omega |u(z)|^{p-2} u(z)\,dz = 0,$$
a contradiction, since $u$ is a nonzero constant.

We will characterize the first nonzero element of the spectrum of $\big(-\Delta_p, W^{1,p}(\Omega)\big)$. Suppose that $\lambda > 0$ is an eigenvalue of (6.46) with a corresponding eigenfunction $u \in C^1(\overline{\Omega})$. Integrating (6.46) and using the nonlinear Green's identity (see Theorem 2.4.53), we obtain
$$\int_\Omega |u(z)|^{p-2} u(z)\,dz = 0.$$
So we are led to the consideration of the following nonempty, closed, symmetric and pointed cone:
$$C(p) \stackrel{df}{=} \Big\{x \in W^{1,p}(\Omega) : \int_\Omega |x(z)|^{p-2} x(z)\,dz = 0\Big\}.$$
Note that if $p = 2$, then $C(p)$ is the orthogonal complement of the kernel of the operator $-\Delta$ with Neumann boundary conditions (which is $\mathbb{R}$). We intersect $C(p)$ with the unit sphere of $L^p(\Omega)$ and obtain
$$C_1(p) \stackrel{df}{=} \Big\{x \in W^{1,p}(\Omega) : \|x\|_p = 1,\ \int_\Omega |x(z)|^{p-2} x(z)\,dz = 0\Big\}.$$
Let $\vartheta_p : W^{1,p}(\Omega) \longrightarrow \mathbb{R}$ be the strictly convex, $C^1$-map, defined by
$$\vartheta_p(x) = \|Dx\|_p^p \qquad \forall\, x \in W^{1,p}(\Omega).$$
We consider the following minimization problem:
$$\lambda_1(p) = \inf_{x \in C_1(p)} \vartheta_p(x). \tag{6.47}$$

PROPOSITION 6.2.27
$\lambda_1(p) > 0$ and we can find $x \in C_1(p)$, such that $\vartheta_p(x) = \lambda_1(p)$.

PROOF Let $\{x_n\}_{n \ge 1} \subseteq C_1(p)$ be a minimizing sequence for problem (6.47). Evidently the sequence $\{x_n\}_{n \ge 1} \subseteq W^{1,p}(\Omega)$ is bounded and so we may assume that
$$\begin{aligned} x_n &\stackrel{w}{\longrightarrow} x && \text{in } W^{1,p}(\Omega),\\ x_n &\longrightarrow x && \text{in } L^p(\Omega),\\ x_n(z) &\longrightarrow x(z) && \text{for a.a. } z \in \Omega,\\ |x_n(z)| &\le k(z) && \text{for a.a. } z \in \Omega, \end{aligned}$$
with $k \in L^p(\Omega)_+$. Via the Lebesgue dominated convergence theorem (see Theorem A.2.2), we have that
$$\int_\Omega |x(z)|^{p-2} x(z)\,dz = 0 \quad \text{and} \quad \|x\|_p = 1,$$
i.e., $x \in C_1(p)$. Moreover, exploiting the weak lower semicontinuity of the norm functional, we have
$$\|Dx\|_p^p \le \lambda_1(p), \quad \text{hence} \quad \|Dx\|_p^p = \lambda_1(p).$$
Note that since $x \in C_1(p)$, $x$ is nonconstant and so
$$\lambda_1(p) = \|Dx\|_p^p > 0.$$

The above proposition leads to the following Poincaré-Wirtinger inequality.

COROLLARY 6.2.28 (Poincaré-Wirtinger Inequality)
If $x \in C(p)$ and $\lambda_1 = \lambda_1(p) > 0$, then $\lambda_1 \|x\|_p^p \le \|Dx\|_p^p$.

Next we show that for $p \ge 2$, $\lambda_1 = \lambda_1(p) > 0$ is the first nonzero eigenvalue of $\big(-\Delta_p, W^{1,p}(\Omega)\big)$.
THEOREM 6.2.29
If $p \ge 2$, then $\lambda_1 > 0$ is the first nonzero eigenvalue of $\big(-\Delta_p, W^{1,p}(\Omega)\big)$.

PROOF Let $x \in C_1(p)$ be a solution of problem (6.47). By the Lagrange multiplier rule, we know that we can find $a, b, c \in \mathbb{R}$, not all of them equal to zero, such that
$$ap \int_\Omega \|Dx(z)\|_{\mathbb{R}^N}^{p-2} \big(Dx(z), Dy(z)\big)_{\mathbb{R}^N} dz + bp \int_\Omega |x(z)|^{p-2} x(z) y(z)\,dz + c(p-1) \int_\Omega |x(z)|^{p-2} y(z)\,dz = 0 \qquad \forall\, y \in W^{1,p}(\Omega). \tag{6.48}$$
Let $y = c$ and recall that
$$\int_\Omega |x(z)|^{p-2} x(z)\,dz = 0$$
(since $x \in C_1(p)$). So we obtain
$$c^2 (p-1) \int_\Omega |x(z)|^{p-2}\,dz = 0$$
and this implies that $c = 0$. Using this in (6.48), we have
$$ap \int_\Omega \|Dx(z)\|_{\mathbb{R}^N}^{p-2} \big(Dx(z), Dy(z)\big)_{\mathbb{R}^N} dz + bp \int_\Omega |x(z)|^{p-2} x(z) y(z)\,dz = 0 \qquad \forall\, y \in W^{1,p}(\Omega).$$
If $a = 0$, then
$$bp \int_\Omega |x(z)|^{p-2} x(z) y(z)\,dz = 0 \qquad \forall\, y \in W^{1,p}(\Omega).$$
Let $y = x$. We obtain
$$bp \|x\|_p^p = 0,$$
i.e., $bp = 0$ and so $b = 0$, a contradiction, since $a, b, c$ are not all equal to zero. So $a \neq 0$ and without any loss of generality, we can assume that $a = 1$. Then
$$\int_\Omega \|Dx(z)\|_{\mathbb{R}^N}^{p-2} \big(Dx(z), Dy(z)\big)_{\mathbb{R}^N} dz + b \int_\Omega |x(z)|^{p-2} x(z) y(z)\,dz = 0 \qquad \forall\, y \in W^{1,p}(\Omega).$$
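As before, we let $y = x$ and obtain
$$\|Dx\|_p^p + b \|x\|_p^p = 0,$$
hence $b = -\lambda_1 = -\lambda_1(p)$. Using the nonlinear Green's identity (see Theorem 2.4.53), we see that $x \in C_1(p)$ solves (6.46) with $\lambda = \lambda_1$. Therefore $\lambda_1 > 0$ is an eigenvalue of $\big(-\Delta_p, W^{1,p}(\Omega)\big)$. Since every eigenfunction corresponding to a positive eigenvalue belongs to $C(p)$, Corollary 6.2.28 shows that there is no eigenvalue in the open interval $(0, \lambda_1)$.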
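The Poincaré-Wirtinger inequality of Corollary 6.2.28 can be checked numerically in the simplest setting. The sketch below assumes the one-dimensional linear case $\Omega = (0, b)$, $p = 2$, where $C(2)$ consists of the mean-zero functions and the first nonzero Neumann eigenvalue is $(\pi/b)^2$; the helper names `integrate` and `wirtinger_gap` are illustrative only and do not come from the text.

```python
import math

def integrate(f, a, b, steps=20000):
    """Midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def wirtinger_gap(x, dx, b):
    """Return (mean of x, ||x'||_2^2 - lambda_1 ||x||_2^2) on (0, b) for p = 2.

    Assumption: lambda_1 = (pi / b)^2, the first nonzero Neumann eigenvalue
    of the one-dimensional Laplacian on (0, b).
    """
    lam1 = (math.pi / b) ** 2
    mean = integrate(x, 0.0, b)  # vanishes when x belongs to C(2)
    lhs = lam1 * integrate(lambda t: x(t) ** 2, 0.0, b)
    rhs = integrate(lambda t: dx(t) ** 2, 0.0, b)
    return mean, rhs - lhs

b = 2.0
# A mean-zero function that is not an eigenfunction: strict inequality.
mean_lin, gap_lin = wirtinger_gap(lambda t: t - b / 2, lambda t: 1.0, b)
# The first nonconstant Neumann eigenfunction: equality up to quadrature error.
mean_cos, gap_cos = wirtinger_gap(
    lambda t: math.cos(math.pi * t / b),
    lambda t: -(math.pi / b) * math.sin(math.pi * t / b),
    b,
)
```

For the linear function the gap is strictly positive, while for the cosine it vanishes up to quadrature error, which is consistent with the infimum in (6.47) being attained.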
Finally we consider the weighted eigenvalue problem:
$$\begin{cases} -\mathrm{div}\big(\|Dx(z)\|_{\mathbb{R}^N}^{p-2} Dx(z)\big) = \lambda m(z) |x(z)|^{p-2} x(z) & \text{for a.a. } z \in \Omega,\\[2pt] \dfrac{\partial x}{\partial n_p} = 0 & \text{on } \partial\Omega, \end{cases} \tag{6.49}$$
with $m \in L^\infty(\Omega)$. We define
$$\widehat{\lambda}(m) \stackrel{df}{=} \inf\Big\{\|Dx\|_p^p : x \in W^{1,p}(\Omega),\ \int_\Omega m(z) |x(z)|^p\,dz = 1\Big\}.$$
The following theorem is due to Huang (1990) and Godoy, Gossez & Paczka (2002).

THEOREM 6.2.30
(a) If $\int_\Omega m(z)\,dz < 0$, then $\widehat{\lambda}(m) > 0$ and $\widehat{\lambda}(m)$ is the unique nonzero eigenvalue which is simple and has a positive eigenfunction (i.e., $\widehat{\lambda}(m)$ is principal), and there is no eigenvalue in the interval $\big(0, \widehat{\lambda}(m)\big)$.
(b) If $\int_\Omega m(z)\,dz \ge 0$, then $\widehat{\lambda}(m) = 0$, and when $\int_\Omega m(z)\,dz = 0$, we have that $0$ is the unique eigenvalue with a positive eigenfunction.

REMARK 6.2.31 If $\int_\Omega m(z)\,dz > 0$, then there is no eigenvalue $\lambda > 0$ with positive eigenfunctions.
6.3 The Ordinary p-Laplacian
In this section we examine the spectrum of the ordinary $p$-Laplacian (both the scalar and the vector ordinary $p$-Laplacian) under Dirichlet, Neumann and periodic boundary conditions. We start with the scalar operator and Dirichlet boundary conditions. So for $p > 1$ we consider the following scalar eigenvalue problem:
$$\begin{cases} -\big(|x'(t)|^{p-2} x'(t)\big)' = \lambda |x(t)|^{p-2} x(t) & \text{for a.a. } t \in T,\\ x(0) = x(b) = 0, \end{cases} \tag{6.50}$$
with $T = [0, b]$, $b > 0$, $\lambda \in \mathbb{R}$, and where $x'$ stands for $\frac{dx}{dt}$. A simple integration argument shows that a necessary condition for (6.50) to have a nontrivial solution is that $\lambda > 0$. So throughout the analysis of problem (6.50), we will assume that $\lambda > 0$.

DEFINITION 6.3.1
We say that $\lambda > 0$ is an eigenvalue of the negative scalar ordinary $p$-Laplacian with Dirichlet boundary conditions (still denoted by $\big(-\Delta_p, W_0^{1,p}((0, b))\big)$), if problem (6.50) admits a nontrivial solution $x \in W_0^{1,p}\big((0, b)\big)$, which is known as an eigenfunction corresponding to the eigenvalue $\lambda > 0$.

In what follows $\mathbb{N}_0 \stackrel{df}{=} \mathbb{N} \cup \{0\}$ (the set of nonnegative integers) and for $r \in \mathbb{R}$
$$\langle r \rangle \stackrel{df}{=} \begin{cases} +\infty & \text{if } r \in 2\mathbb{N}_0,\\ \min\{n \in \mathbb{N}_0 : n \ge r\} & \text{otherwise.} \end{cases} \tag{6.51}$$
We start by establishing the complete regularity of the eigenfunctions in problem (6.50).

PROPOSITION 6.3.2
If $x \in W_0^{1,p}\big((0, b)\big)$ is an eigenfunction for problem (6.50), then $x \in C^\alpha(T) \cap C^{\langle p \rangle}\big(T \setminus Z(x)\big)$, where
$$Z(x) \stackrel{df}{=} \{t \in T : x'(t) = 0\} \quad \text{and} \quad \alpha \stackrel{df}{=} \min\Big\{\Big\langle \frac{2-p}{p-1} \Big\rangle + 1,\ \langle p \rangle\Big\}.$$
Moreover, if $p > 2$, then $x$ is not everywhere twice differentiable in $(0, b)$.

PROOF Let $y \in C^1(T)$ be defined by
$$y(t) \stackrel{df}{=} \int_0^t |x(s)|^{p-2} x(s)\,ds.$$
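Then for all $\vartheta \in C_c^\infty\big((0, b)\big)$, we have
$$\big\langle -\big(|x'|^{p-2} x'\big)', \vartheta \big\rangle_{W_0^{1,p}((0,b))} = \lambda \big\langle |x|^{p-2} x, \vartheta \big\rangle_{W_0^{1,p}((0,b))} = \lambda \langle y', \vartheta \rangle_{W_0^{1,p}((0,b))},$$
so by integration by parts, we have
$$\big\langle |x'|^{p-2} x', \vartheta' \big\rangle_{W_0^{1,p}((0,b))} = -\lambda \langle y, \vartheta' \rangle_{W_0^{1,p}((0,b))}. \tag{6.52}$$
Since $\vartheta \in C_c^\infty\big((0, b)\big)$ is arbitrary, from (6.52), it follows that
$$|x'(t)|^{p-2} x'(t) = -\lambda y(t) + c \qquad \text{for a.a. } t \in T,$$
with $c \in \mathbb{R}$, so
$$|x'|^{p-2} x' \in C^1(T). \tag{6.53}$$
The map $\psi_p : \mathbb{R} \longrightarrow \mathbb{R}$, defined by
$$\psi_p(r) \stackrel{df}{=} \begin{cases} |r|^{p-2} r & \text{if } r \neq 0,\\ 0 & \text{if } r = 0, \end{cases}$$
is a homeomorphism for $p \in (1, +\infty)$ and so from (6.53), it follows that $x' \in C(T)$, hence $x \in C^1(T)$.

Suppose that $x \in C^r(T)$ (respectively $x \in C^r\big(T \setminus Z(x)\big)$) for $r \ge 1$. Then
$$|x(\cdot)|^{p-2} x(\cdot) \in C^{r_0}(T)$$
(respectively $|x(\cdot)|^{p-2} x(\cdot) \in C^{r_0}\big(T \setminus Z(x)\big)$), where
$$r_0 \stackrel{df}{=} \min\{r, \langle p \rangle - 2\},$$
and so $y \in C^{r_0+1}(T)$ (respectively $y \in C^{r_0+1}\big(T \setminus Z(x)\big)$). If
$$u(t) \stackrel{df}{=} |x'(t)|^{p-2} x'(t),$$
then, since $\psi_p^{-1}(r) = |r|^{p_0} r$ with $p_0 = \frac{2-p}{p-1} = p' - 2$ (where $\frac{1}{p} + \frac{1}{p'} = 1$), we have that
$$x'(t) = |u(t)|^{p_0} u(t) \qquad \forall\, t \in T$$
and so
$$x'(t) = 0 \iff u(t) = 0.$$
Hence $x \in C^m(T)$ (respectively $x \in C^{r_0+2}\big(T \setminus Z(x)\big)$), where
$$m = \min\{\langle p_0 \rangle + 1,\ r_0 + 2\}.$$
Continuing this procedure as far as $r \le m$ (respectively $r \le r_0 + 2$), we have the first result of the proposition.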
To prove the second part of the proposition, assume that $p > 2$ and $x \in C^2\big((0, b)\big)$. Then we have
$$\big(|x'|^{p-2} x'\big)' = (p-2) |x'|^{p-2} x'' + |x'|^{p-2} x'' = (p-1) |x'|^{p-2} x''.$$
So from (6.50) we obtain
$$-(p-1) |x'(t)|^{p-2} x''(t) = \lambda |x(t)|^{p-2} x(t) \qquad \text{for a.a. } t \in T. \tag{6.54}$$
But there exists a point $t_0 \in (0, b)$, where $x$ attains its positive maximum (or negative minimum), i.e.,
$$x'(t_0) = 0 \quad \text{and} \quad x(t_0) > 0 \quad (\text{or } x(t_0) < 0).$$
At $t_0$ the left-hand side of (6.54) vanishes while the right-hand side does not, a contradiction.

REMARK 6.3.3 If $p > 2$, then $\alpha = 1$ and if $p \in (1, 2]$, then $\alpha \ge 2$. In particular we have $x \in C^1(T)$ for $p > 2$ and $x \in C^2(T)$ for $p \in (1, 2]$.

For $n \ge 1$, let
$$\begin{aligned} S_n(T) &\stackrel{df}{=} \big\{x \in C_0^1(T) : x \text{ has exactly } n-1 \text{ simple zeros in } (0, b)\big\},\\ S_n^+(T) &\stackrel{df}{=} \big\{x \in S_n(T) : x'(0) > 0\big\},\\ S_n^-(T) &\stackrel{df}{=} -S_n^+(T). \end{aligned}$$
From (6.50), we see that the first eigenvalue $\lambda_1 > 0$ is given by
$$\lambda_1 = \inf_{\substack{x \in W_0^{1,p}((0,b))\\ \|x\|_p = 1}} \|x'\|_p^p = \inf_{\substack{x \in W_0^{1,p}((0,b))\\ x \neq 0}} \frac{\|x'\|_p^p}{\|x\|_p^p}. \tag{6.55}$$
THEOREM 6.3.4
There is a unique sequence of functions $u_k \in S_k^+$, $k \ge 1$, with maximum value $1$ on $(0, b)$, such that $\{(\lambda_n, \mu u_n)\}_{n \ge 1}$ are eigenelements of (6.50), where $\mu \in \mathbb{R} \setminus \{0\}$ and
$$\lambda_n = n^p \lambda_1 = \Big(\frac{n}{b}\Big)^p (p-1) \bigg[2 \int_0^1 \frac{dt}{(1 - t^p)^{\frac{1}{p}}}\bigg]^p. \tag{6.56}$$
Moreover, for $m \in \{0, 1, \ldots, n-1\}$, we have
$$u_n(t) = (-1)^m u_1(nt - mb) \qquad \forall\, t \in \Big[\frac{bm}{n}, \frac{b(m+1)}{n}\Big]. \tag{6.57}$$
PROOF From Remark 6.3.3, we know that every eigenfunction $u$ satisfies $u \in C^1(T)$ if $p > 2$ and $u \in C^2(T)$ if $p \in (1, 2]$. Suppose that $(\lambda, u)$ is an eigenelement of (6.50). Then multiplying (6.50) with $u(t)$ and then integrating over $T$, we obtain
$$\|u'\|_p^p = \lambda \|u\|_p^p,$$
i.e., $\lambda > 0$. Also multiplying (6.50) with $u'(t)$ and integrating over $[0, t]$, $t \in (0, b)$, after integration by parts, we have
$$(p-1) |u'(t)|^p + \lambda |u(t)|^p = (p-1) |u'(0)|^p. \tag{6.58}$$
Since $t \in (0, b)$ was arbitrary and $u \neq 0$, we must have $u'(0) \neq 0$.

Now we will construct $u$ explicitly. Since $u'(0) \neq 0$, we may assume that $u'(0) = \vartheta > 0$. Then we can find $t_1 \in (0, b)$, such that
$$u'(t) > 0 \qquad \forall\, t \in (0, t_1)$$
and from (6.58) solving for $u'(t)$, we have
$$u'(t) = \Big(\vartheta^p - \frac{\lambda}{p-1} u(t)^p\Big)^{\frac{1}{p}} \qquad \forall\, t \in [0, t_1]. \tag{6.59}$$
Then from the inverse function theorem, it follows that
$$t = \int_0^{u(t)} \frac{ds}{\big(\vartheta^p - \frac{\lambda s^p}{p-1}\big)^{\frac{1}{p}}}.$$
This formula remains valid as long as $u(t)$ remains smaller than the first positive zero of the function
$$s \longmapsto \varphi(\vartheta, s) \stackrel{df}{=} \vartheta^p - \frac{\lambda s^p}{p-1}.$$
This zero is given by
$$z(\vartheta) = \Big(\frac{p-1}{\lambda}\Big)^{\frac{1}{p}} \vartheta.$$
Because $z(\vartheta)$ is simple, we can define $\xi(\vartheta)$ by
$$\xi(\vartheta) \stackrel{df}{=} \int_0^{z(\vartheta)} \frac{ds}{\big(\vartheta^p - \frac{\lambda s^p}{p-1}\big)^{\frac{1}{p}}}.$$
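We have
$$u\big(\xi(\vartheta)\big) = z(\vartheta) \quad \text{and} \quad u'\big(\xi(\vartheta)\big) = 0.$$
Because
$$\vartheta^p = \frac{\lambda z(\vartheta)^p}{p-1},$$
we can write
$$\xi(\vartheta) = \xi_\lambda \stackrel{df}{=} \Big(\frac{p-1}{\lambda}\Big)^{\frac{1}{p}} c \quad \text{with} \quad c = \int_0^1 \frac{ds}{(1 - s^p)^{\frac{1}{p}}}. \tag{6.60}$$
From (6.58), we see that $u$ is decreasing on some interval $[\xi_\lambda, \eta)$ and so
$$t - \xi_\lambda = -\int_{z(\vartheta)}^{u(t)} \frac{ds}{\big(\vartheta^p - \frac{\lambda s^p}{p-1}\big)^{\frac{1}{p}}}$$
and this formula remains valid as long as $u$ is decreasing, in particular as long as $u$ is positive. If
$$t_2 \in (0, \xi_\lambda) \quad \text{and} \quad t_3 = 2\xi_\lambda - t_2,$$
then
$$t_2 = \int_0^{u(t_2)} \frac{ds}{\varphi(\vartheta, s)^{\frac{1}{p}}}, \qquad \xi_\lambda - t_2 = -\int_{z(\vartheta)}^{u(t_3)} \frac{ds}{\varphi(\vartheta, s)^{\frac{1}{p}}}$$
and $u(t_2) = u(t_3)$. So $t = \xi_\lambda$ is an axis of symmetry for $u|_{[0, 2\xi_\lambda]}$ and $t = 2\xi_\lambda$ is a center of symmetry for $u|_{[0, 4\xi_\lambda]}$. It follows that $u$ is $4\xi_\lambda$-periodic on $\mathbb{R}_+$. The necessary and sufficient condition for $u|_T$ to be a solution of (6.50) is that $\frac{b}{2\xi_\lambda} = n \in \mathbb{N}$, so $2n\xi_\lambda = b$ and thus
$$\lambda = \Big(\frac{2n}{b}\Big)^p (p-1) \bigg[\int_0^1 \frac{ds}{(1 - s^p)^{\frac{1}{p}}}\bigg]^p$$
(see (6.60)). So we have produced the sequence given by (6.56). The number of zeros of $u$ in $(0, b)$ is given by $\frac{b}{2\xi_\lambda} - 1$ and since (6.50) is homogeneous, it follows that (6.57) holds. The uniqueness of the sequence $\{(\lambda_n, u_n)\}_{n \ge 1}$ is a consequence of the above construction.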
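The eigenvalue condition $2n\xi_\lambda = b$ can be verified by direct arithmetic. The sketch below assumes the closed form $c = \frac{\pi}{p \sin(\pi/p)}$ for the integral in (6.60), which is the value implied by the expression for $\pi_p$ quoted in Remark 6.3.5; the function names are illustrative only.

```python
import math

def c_const(p):
    """Closed form of c in (6.60): integral of (1 - s^p)^(-1/p) over [0, 1]."""
    return (math.pi / p) / math.sin(math.pi / p)

def xi(lam, p):
    """Time xi_lambda at which the solution reaches its first maximum."""
    return ((p - 1.0) / lam) ** (1.0 / p) * c_const(p)

def dirichlet_eigenvalue(n, p, b):
    """Eigenvalue obtained by solving 2 * n * xi_lambda = b for lambda."""
    return (2.0 * n / b) ** p * (p - 1.0) * c_const(p) ** p

# Residual of the condition 2 * n * xi_lambda = b over a few sample cases.
residuals = [
    abs(2 * n * xi(dirichlet_eigenvalue(n, p, b), p) - b)
    for p in (1.5, 2.0, 3.0)
    for n in (1, 2, 5)
    for b in (1.0, 2.5)
]
```

For $p = 2$ the function `dirichlet_eigenvalue` reduces to $(n\pi/b)^2$, the classical Dirichlet eigenvalues.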
REMARK 6.3.5 In the variational expression for $\lambda_1 > 0$ (see (6.55)), the infimum is attained at $x = u_1$. Some authors write
$$\pi_p \stackrel{df}{=} 2 \int_0^{(p-1)^{\frac{1}{p}}} \frac{ds}{\big(1 - \frac{s^p}{p-1}\big)^{\frac{1}{p}}} = \frac{2\pi (p-1)^{\frac{1}{p}}}{p \sin\big(\frac{\pi}{p}\big)}$$
and then the sequence of eigenvalues $\lambda_n$ produced in Theorem 6.3.4 is given by
$$\lambda_n = \Big(\frac{n \pi_p}{b}\Big)^p \qquad \forall\, n \ge 1.$$
Note that for $p = 2$, $\pi_p = \pi$ and so we recover the eigenvalues of the negative scalar ordinary Laplacian with Dirichlet boundary conditions, which are
$$\lambda_n = \Big(\frac{n\pi}{b}\Big)^2 \qquad \forall\, n \ge 1.$$
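The two expressions for $\pi_p$ in Remark 6.3.5 can be compared numerically. The sketch below evaluates the closed form and a midpoint-rule approximation of $2(p-1)^{1/p} \int_0^1 (1 - t^p)^{-1/p}\,dt$, which equals the integral in the remark after the substitution $s = (p-1)^{1/p} t$. The further substitution $t = 1 - w^{p'}$, used to remove the endpoint singularity, is a choice made for this illustration and not part of the text.

```python
import math

def pi_p_closed(p):
    """Closed form 2*pi*(p-1)^(1/p) / (p*sin(pi/p)) from Remark 6.3.5."""
    return 2.0 * math.pi * (p - 1.0) ** (1.0 / p) / (p * math.sin(math.pi / p))

def pi_p_quad(p, steps=20000):
    """Midpoint-rule value of 2*(p-1)^(1/p) * int_0^1 (1 - t^p)^(-1/p) dt.

    With t = 1 - w^k and k = p/(p-1), the integrand in w stays bounded
    on (0, 1), so a plain midpoint rule converges.
    """
    k = p / (p - 1.0)
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        w = (i + 0.5) * h
        # 1 - t^p computed stably as -expm1(p * log1p(-w^k)).
        one_minus_tp = -math.expm1(p * math.log1p(-(w ** k)))
        total += k * w ** (k - 1.0) * one_minus_tp ** (-1.0 / p)
    return 2.0 * (p - 1.0) ** (1.0 / p) * total * h

def dirichlet_eigenvalue(n, p, b):
    """lambda_n = (n * pi_p / b)^p."""
    return (n * pi_p_closed(p) / b) ** p
```

Calling `pi_p_closed(2.0)` returns $\pi$ up to rounding, in agreement with the last sentence of the remark.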
To derive the eigenvalues of the Neumann and periodic problems, we need to study the following initial value problem:
$$\begin{cases} -\big(|x'(t)|^{p-2} x'(t)\big)' = \lambda |x(t)|^{p-2} x(t) & \text{for a.a. } t \in \mathbb{R},\\ x(t_0) = a, \quad x'(t_0) = c, \end{cases} \tag{6.61}$$
for some $t_0 \in \mathbb{R}$. Here $a, c \in \mathbb{R}$. We have the following existence result for problem (6.61).

PROPOSITION 6.3.6
For any $\lambda \ge 0$, problem (6.61) has a unique solution defined on $\mathbb{R}$.

PROOF For $\lambda = 0$, the result follows from direct integration. So we may assume that $\lambda > 0$. Also since the equation is autonomous, without any loss of generality, we may assume that $t_0 = 0$. The existence of a local solution on an interval $(-\varepsilon, \varepsilon)$ (for $\varepsilon > 0$) follows from a simple application of Schauder's fixed point theorem on the compact operator $K : W^{1,p}\big((0, b)\big) \longrightarrow \mathbb{R}$, defined by
$$K(x)(t) \stackrel{df}{=} ct - \int_0^t \psi_p^{-1}\Big(\lambda \int_0^s \psi_p\big(x(\tau)\big)\,d\tau\Big)\,ds \qquad \forall\, t \in T,$$
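where as before $\psi_p : \mathbb{R} \longrightarrow \mathbb{R}$ is the homeomorphism, defined by
$$\psi_p(r) \stackrel{df}{=} \begin{cases} |r|^{p-2} r & \text{if } r \neq 0,\\ 0 & \text{if } r = 0; \end{cases}$$
recall that
$$\psi_p^{-1} = \psi_{p'} \quad \text{with} \quad \frac{1}{p} + \frac{1}{p'} = 1.$$
We now show that this local solution is unique. We work only on $\mathbb{R}_+$.

Case 1. $a = c = 0$.
The solution $x$ satisfies in its domain of definition
$$\frac{|x'(t)|^p}{p'} + \lambda \frac{|x(t)|^p}{p} = \frac{|c|^p}{p'} + \lambda \frac{|a|^p}{p} = 0,$$
with $\frac{1}{p} + \frac{1}{p'} = 1$. From this we deduce that $x \equiv 0$.

Case 2. $a = 0$, $c \neq 0$.
Let $x, y$ be two local solutions of (6.61) with the above initial conditions. Then
$$\psi_p\big(x'(t)\big) - \psi_p\big(y'(t)\big) = \lambda \int_0^t \big[\psi_p\big(y(s)\big) - \psi_p\big(x(s)\big)\big]\,ds = \lambda \int_0^t s^{p-1} \Big[\psi_p\Big(\frac{y(s)}{s}\Big) - \psi_p\Big(\frac{x(s)}{s}\Big)\Big]\,ds. \tag{6.62}$$
Note that
$$\frac{x(s)}{s},\ \frac{y(s)}{s} \longrightarrow c \quad \text{as } s \to 0.$$
Since $\psi_p$ is a diffeomorphism on $\mathbb{R} \setminus \{0\}$, we can find $\xi_1, \xi_2 > 0$, such that
$$\xi_1 |x'(t) - y'(t)| \le \lambda \xi_2 \int_0^t s^{p-1} \Big|\frac{y(s)}{s} - \frac{x(s)}{s}\Big|\,ds,$$
so, with $v \stackrel{df}{=} x - y$,
$$\xi_1 \|v'\|_{C([0,\varepsilon])} \le \lambda \xi_2 \varepsilon^{p-1} \|v\|_{C([0,\varepsilon])}. \tag{6.63}$$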
Since
$$v(0) = 0, \quad v'(0) = 0 \quad \text{and} \quad v(t) = \int_0^t v'(s)\,ds,$$
from (6.63) and choosing $\varepsilon > 0$ small enough, we have
$$\big(\xi_1 - \lambda \xi_2 \varepsilon^p\big) \|v'\|_{C([0,\varepsilon])} \le 0,$$
from which we conclude that $v = 0$, hence $x = y$.

Case 3. $a \neq 0$, $c = 0$.
We reduce this case to Case 2. Namely, we rewrite (6.61) as the following equivalent system
$$\begin{cases} x' = \psi_{p'}(y),\\ y' = -\lambda \psi_p(x). \end{cases} \tag{6.64}$$
From the second equation in (6.64), we have
$$-x = \psi_{p'}\Big(\frac{y'}{\lambda}\Big).$$
Using this in the first equation of (6.64), we see that $y$ satisfies
$$\begin{cases} -\big(\psi_{p'}(y')\big)' = \lambda^{p'-1} \psi_{p'}(y) & \text{on } T,\\ y(0) = 0, \quad y'(0) = -\lambda \psi_p(a) \neq 0. \end{cases}$$
So we have reduced the problem to Case 2.

Case 4. $a, c \neq 0$.
Consider system (6.64). Since
$$x(0) \neq 0 \quad \text{and} \quad y(0) \neq 0,$$
we have $x(t), y(t) \neq 0$ for $t > 0$ small enough and so $\psi_p$ and $\psi_{p'}$ are $C^1$-functions on that subinterval. Then the uniqueness follows from the standard ordinary differential equations theory.

So the solution of (6.61) is locally unique. Since for any solution $x$, we have
$$\frac{|x'(t)|^p}{p'} + \lambda \frac{|x(t)|^p}{p} = \frac{|c|^p}{p'} + \lambda \frac{|a|^p}{p},$$
we see that the solution $x$ extends to all of $\mathbb{R}$ and so local uniqueness becomes global uniqueness.
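The first-order system (6.64) lends itself to a quick numerical experiment. The sketch below integrates it with a classical fourth-order Runge-Kutta scheme (a choice made for this illustration, not a method from the text) and monitors the conserved quantity $\frac{|y|^{p'}}{p'} + \lambda \frac{|x|^p}{p}$, where $y = \psi_p(x')$, so that $|y|^{p'} = |x'|^p$. The function names and step counts are assumptions.

```python
import math

def psi(r, q):
    """psi_q(r) = |r|^(q-2) * r, written so that psi_q(0) = 0 is well defined."""
    return math.copysign(abs(r) ** (q - 1.0), r)

def solve_ivp_rk4(p, lam, a, c, t_end, steps=20000):
    """Integrate x' = psi_{p'}(y), y' = -lam * psi_p(x) from t = 0 to t_end."""
    q = p / (p - 1.0)  # conjugate exponent p'
    f = lambda x, y: (psi(y, q), -lam * psi(x, p))
    h = t_end / steps
    x, y = a, psi(c, p)  # initial data x(0) = a, y(0) = psi_p(x'(0))
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = f(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = f(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, y

def energy(x, y, p, lam):
    """Conserved quantity |y|^{p'}/p' + lam * |x|^p / p."""
    q = p / (p - 1.0)
    return abs(y) ** q / q + lam * abs(x) ** p / p

# Shooting check for the Dirichlet problem on (0, 1) with p = 3:
# lambda_1 = pi_p^p, so the solution with x(0) = 0 should vanish again at t = 1.
p, b = 3.0, 1.0
pi_p = 2.0 * math.pi * (p - 1.0) ** (1.0 / p) / (p * math.sin(math.pi / p))
lam1 = (pi_p / b) ** p
x_end, y_end = solve_ivp_rk4(p, lam1, a=0.0, c=1.0, t_end=b)
drift = abs(energy(x_end, y_end, p, lam1) - energy(0.0, psi(1.0, p), p, lam1))
```

Because $\psi_{p'}$ is only Hölder continuous at the origin when $p > 2$, the scheme loses its formal order near the points where $x' = 0$; the step count above is chosen generously for that reason.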
Using the above proposition, we can determine the spectrum of the negative scalar $p$-Laplacian with Neumann and periodic boundary conditions. We start with the Neumann problem. So we consider the following nonlinear eigenvalue problem:
$$\begin{cases} -\big(|x'(t)|^{p-2} x'(t)\big)' = \lambda |x(t)|^{p-2} x(t) & \text{for a.a. } t \in T,\\ x'(0) = x'(b) = 0. \end{cases} \tag{6.65}$$
Observe that $\lambda = 0$ is an eigenvalue with corresponding eigenspace $\mathbb{R}$ (the constant functions). Moreover, any nonconstant eigenfunction $u \in C^1(T)$ satisfies
$$\int_0^b |u(t)|^{p-2} u(t)\,dt = 0.$$
Therefore $u$ must change sign. Since (6.65) is autonomous, Proposition 6.3.6 implies that the positive eigenvalues of (6.65) are the same as those of the Dirichlet problem and the corresponding eigenfunctions are translations of the Dirichlet eigenfunctions.

THEOREM 6.3.7
The eigenvalues of problem (6.65) are given by
$$\mu_0 = 0, \qquad \mu_n = \Big(\frac{n}{b}\Big)^p (p-1) \bigg[2 \int_0^1 \frac{dt}{(1 - t^p)^{\frac{1}{p}}}\bigg]^p \qquad \forall\, n \ge 1$$
and the corresponding eigenfunctions are
$$v_0(t) = c, \quad c \in \mathbb{R} \setminus \{0\}, \qquad v_n(t) = u_n\Big(t - \frac{b}{2n}\Big) \qquad \forall\, n \ge 1,$$
where $u_n$ are Dirichlet eigenfunctions.

Finally let us consider the periodic problem:
$$\begin{cases} -\big(|x'(t)|^{p-2} x'(t)\big)' = \lambda |x(t)|^{p-2} x(t) & \text{for a.a. } t \in T,\\ x(0) = x(b), \quad x'(0) = x'(b). \end{cases} \tag{6.66}$$
A similar reasoning as for the Neumann problem yields the following result.

THEOREM 6.3.8
The eigenvalues of problem (6.66) are given by
$$\nu_0 = 0, \qquad \nu_n = \Big(\frac{2n}{b}\Big)^p (p-1) \bigg[2 \int_0^1 \frac{dt}{(1 - t^p)^{\frac{1}{p}}}\bigg]^p \qquad \forall\, n \ge 1$$
and the corresponding eigenfunctions are
$$w_0(t) = c, \quad c \in \mathbb{R} \setminus \{0\}, \qquad w_n(t) = u_n(t - t_n) \qquad \forall\, n \ge 1,$$
with $t_n \in \mathbb{R}$ arbitrary and $u_n$ being Dirichlet eigenfunctions.

REMARK 6.3.9 If $p = 2$ (linear case), then the eigenvalues for the Neumann problem are given by
$$\mu_n = \Big(\frac{n\pi}{b}\Big)^2 \qquad \forall\, n \ge 0$$
and the eigenvalues for the periodic problem are given by
$$\nu_n = \Big(\frac{2\pi n}{b}\Big)^2 \qquad \forall\, n \ge 0.$$
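The formulas of Theorems 6.3.7 and 6.3.8 can be tabulated directly and compared with the linear case of Remark 6.3.9. The sketch below assumes the closed form $\int_0^1 (1 - t^p)^{-1/p}\,dt = \frac{\pi}{p \sin(\pi/p)}$, which is the value implied by Remark 6.3.5; the function names are illustrative.

```python
import math

def base_integral(p):
    """Closed form of the integral of (1 - t^p)^(-1/p) over [0, 1]."""
    return (math.pi / p) / math.sin(math.pi / p)

def neumann_eigenvalue(n, p, b):
    """mu_n of Theorem 6.3.7 (n = 0 gives 0)."""
    return (n / b) ** p * (p - 1.0) * (2.0 * base_integral(p)) ** p

def periodic_eigenvalue(n, p, b):
    """nu_n of Theorem 6.3.8 (n = 0 gives 0)."""
    return (2.0 * n / b) ** p * (p - 1.0) * (2.0 * base_integral(p)) ** p

b = 3.0
neumann_p2 = [neumann_eigenvalue(n, 2.0, b) for n in range(4)]
periodic_p2 = [periodic_eigenvalue(n, 2.0, b) for n in range(4)]
```

In this normalization the periodic eigenvalue $\nu_n$ coincides with the Neumann eigenvalue $\mu_{2n}$ for every $p$, which reflects the fact that a periodic eigenfunction must complete full periods on $[0, b]$.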
Since the eigenvalue problems (6.50), (6.65) and (6.66) are variational, we can produce a sequence of eigenvalues and eigenfunctions using the Lusternik-Schnirelman theory (see Section 5.5). It can be shown that the Lusternik-Schnirelman eigenelements coincide with those obtained in Theorems 6.3.4, 6.3.7 and 6.3.8. For details we refer to Drábek & Manásevich (1999).

In the last part of this section we consider the vector ordinary $p$-Laplacian. So now the function $x$ is $\mathbb{R}^N$-valued. We consider the following nonlinear vector eigenvalue problem:
$$\begin{cases} -\big(\|x'(t)\|_{\mathbb{R}^N}^{p-2} x'(t)\big)' = \lambda \|x(t)\|_{\mathbb{R}^N}^{p-2} x(t) & \text{for a.a. } t \in T,\\ x(0) = x(b) = 0. \end{cases} \tag{6.67}$$
Exploiting the variational structure of (6.67) and using the Lusternik-Schnirelman theory (see Section 5.5), we can produce the eigenvalues of (6.67).

THEOREM 6.3.10
The eigenvalues of problem (6.67) are
$$\lambda_n = \Big(\frac{n}{b}\Big)^p (p-1) \bigg[2 \int_0^1 \frac{dt}{(1 - t^p)^{\frac{1}{p}}}\bigg]^p \qquad \forall\, n \ge 1$$
and the corresponding eigenfunctions are
$$\widehat{u}_n(t) = a u_n(t) \qquad \forall\, n \ge 1,$$
with $a \in \mathbb{R}^N$ and with $u_n$ being scalar Dirichlet eigenfunctions.
For the periodic problem, the situation is more involved. So consider the following vector eigenvalue problem:
$$\begin{cases} -\big(\|x'(t)\|_{\mathbb{R}^N}^{p-2} x'(t)\big)' = \lambda \|x(t)\|_{\mathbb{R}^N}^{p-2} x(t) & \text{for a.a. } t \in T,\\ x(0) = x(b), \quad x'(0) = x'(b). \end{cases} \tag{6.68}$$
As before we see that $\lambda = 0$ is an eigenvalue of (6.68) with corresponding eigenspace $\mathbb{R}^N$ (the constant functions). In addition to (6.68), we also consider the following problem:
$$\begin{cases} -\big(\|x'(t)\|_{\mathbb{R}^N}^{p-2} x'(t)\big)' - \mu \|x(t)\|_{\mathbb{R}^N}^{p-2} x(t) = h(t) & \text{for a.a. } t \in T,\\ x(0) = x(b), \quad x'(0) = x'(b), \end{cases} \tag{6.69}$$
with $h \in L^1\big(T; \mathbb{R}^N\big)$. We want to determine those $\mu \in \mathbb{R}$ for which problem (6.69) has at least one solution for each $h \in L^1\big(T; \mathbb{R}^N\big)$. In analogy to the linear theory, we call this set the resolvent set of the negative vector ordinary $p$-Laplacian with periodic boundary conditions and denote it by $\varrho(p, N)$. Also denote by $\sigma(p, N)$ the set of eigenvalues $\lambda$ of problem (6.68). We have the following result relating the two sets $\varrho(p, N)$ and $\sigma(p, N)$.

PROPOSITION 6.3.11
$\varrho(p, N)^c \subseteq \sigma(p, N)$.

PROOF First we show that (6.69) has a solution for every $h \in L^{p'}\big(T; \mathbb{R}^N\big)$. To this end note that $\sigma(p, N)$ is closed. So if $\lambda \notin \sigma(p, N)$, then we can find $\varepsilon > 0$ small enough, such that
$$[\lambda - \varepsilon, \lambda + \varepsilon] \cap \sigma(p, N) = \emptyset.$$
For a given $g \in L^{p'}\big(T; \mathbb{R}^N\big)$, we consider the following auxiliary periodic problem:
$$\begin{cases} -\big(\|x'(t)\|_{\mathbb{R}^N}^{p-2} x'(t)\big)' + \varepsilon \|x(t)\|_{\mathbb{R}^N}^{p-2} x(t) = g(t) & \text{for a.a. } t \in T,\\ x(0) = x(b), \quad x'(0) = x'(b). \end{cases} \tag{6.70}$$
If
$$K : W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big) \longrightarrow L^{p'}\big(T; \mathbb{R}^N\big)$$
is defined by
$$K(x)(\cdot) \stackrel{df}{=} \|x(\cdot)\|_{\mathbb{R}^N}^{p-2} x(\cdot) \qquad \forall\, x \in W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big),$$
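then the operator
$$W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big) \ni x \longmapsto A(x) + \varepsilon K(x) \in \Big(W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big)\Big)^*$$
is maximal monotone, strictly monotone, coercive; hence it is surjective. So problem (6.70) has a unique solution. Consider the operator
$$V_\varepsilon : L^{p'}\big(T; \mathbb{R}^N\big) \longrightarrow W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big)$$
which to each $g \in L^{p'}\big(T; \mathbb{R}^N\big)$ assigns the unique solution of (6.70). We will show that $V_\varepsilon$ is completely continuous (hence compact too since $L^{p'}\big(T; \mathbb{R}^N\big)$ is reflexive). To this end suppose that $\{g_n\}_{n \ge 1} \subseteq L^{p'}\big(T; \mathbb{R}^N\big)$ is a sequence, such that
$$g_n \stackrel{w}{\longrightarrow} g \quad \text{in } L^{p'}\big(T; \mathbb{R}^N\big)$$
for some $g \in L^{p'}\big(T; \mathbb{R}^N\big)$ and let
$$x_n \stackrel{df}{=} V_\varepsilon(g_n) \qquad \forall\, n \ge 1.$$
Clearly the sequence $\{x_n\}_{n \ge 1} \subseteq W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big)$ is bounded and so we may assume that
$$x_n \stackrel{w}{\longrightarrow} x \ \text{ in } W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big), \qquad x_n \longrightarrow x \ \text{ in } C\big(T; \mathbb{R}^N\big).$$
We have
$$\big\langle A(x_n), x_n - x \big\rangle_{W_{per}^{1,p}((0,b);\mathbb{R}^N)} + \varepsilon \int_0^b \|x_n(t)\|_{\mathbb{R}^N}^{p-2} \big(x_n(t), (x_n - x)(t)\big)_{\mathbb{R}^N} dt = \int_0^b \big(g_n(t), (x_n - x)(t)\big)_{\mathbb{R}^N} dt.$$
Since
$$\int_0^b \|x_n(t)\|_{\mathbb{R}^N}^{p-2} \big(x_n(t), (x_n - x)(t)\big)_{\mathbb{R}^N} dt \longrightarrow 0$$
and
$$\int_0^b \big(g_n(t), (x_n - x)(t)\big)_{\mathbb{R}^N} dt \longrightarrow 0,$$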
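we obtain
$$\lim_{n \to +\infty} \big\langle A(x_n), x_n - x \big\rangle_{W_{per}^{1,p}((0,b);\mathbb{R}^N)} = 0.$$
From this as in the proof of Proposition 6.2.17, we infer that
$$x_n \longrightarrow x \quad \text{in } W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big).$$
Then in the limit, we have
$$A(x) + \varepsilon K(x) = g,$$
i.e., $x = V_\varepsilon(g)$. Hence
$$V_\varepsilon(g_n) \longrightarrow V_\varepsilon(g) \quad \text{in } W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big)$$
and we have proved the complete continuity of $V_\varepsilon$.

Now we consider the compact homotopy
$$H : [0, 1] \times W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big) \longrightarrow W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big),$$
defined by
$$H(\beta, x) \stackrel{df}{=} V_\varepsilon\big(\lambda K(x) + \beta\big(h + \varepsilon K(x)\big)\big).$$
We will show that there exists $r > 0$, such that
$$x - H(\beta, x) \neq 0 \qquad \forall\, \beta \in (0, 1),\ \|x\|_{W^{1,p}((0,b);\mathbb{R}^N)} = r.$$
Suppose that this is not true. Then we can find a sequence
$$\{(\beta_n, x_n)\}_{n \ge 1} \subseteq (0, 1) \times W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big),$$
such that
$$x_n = H(\beta_n, x_n) \qquad \forall\, n \ge 1$$
and $\|x_n\|_{W^{1,p}((0,b);\mathbb{R}^N)} \longrightarrow +\infty$. Let
$$y_n \stackrel{df}{=} \frac{x_n}{\|x_n\|_{W^{1,p}((0,b);\mathbb{R}^N)}} \qquad \forall\, n \ge 1.$$
Passing to a subsequence if necessary, we may assume that $\beta_n \longrightarrow \beta \in [0, 1]$ and
$$y_n \stackrel{w}{\longrightarrow} y \ \text{ in } W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big), \qquad y_n \longrightarrow y \ \text{ in } C\big(T; \mathbb{R}^N\big),$$
for some $y \in W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big)$. Dividing the equation for $x_n$ by $\|x_n\|_{W^{1,p}((0,b);\mathbb{R}^N)}^{p-1}$ and using the $(p-1)$-homogeneity of the operators involved, for every $n \ge 1$, we have
$$\begin{cases} -\big(\|y_n'(t)\|_{\mathbb{R}^N}^{p-2} y_n'(t)\big)' + \varepsilon \|y_n(t)\|_{\mathbb{R}^N}^{p-2} y_n(t)\\ \quad = \lambda \|y_n(t)\|_{\mathbb{R}^N}^{p-2} y_n(t) + \beta_n \Big(\dfrac{h(t)}{\|x_n\|_{W^{1,p}((0,b);\mathbb{R}^N)}^{p-1}} + \varepsilon \|y_n(t)\|_{\mathbb{R}^N}^{p-2} y_n(t)\Big) & \text{for a.a. } t \in T,\\ y_n(0) = y_n(b), \quad y_n'(0) = y_n'(b). \end{cases} \tag{6.71}$$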
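As we did earlier (cf., e.g., the proof of Proposition 6.2.17), we can show that
$$\lim_{n \to +\infty} \big\langle A(y_n), y_n - y \big\rangle_{W_{per}^{1,p}((0,b);\mathbb{R}^N)} = 0,$$
from which we infer that
$$y_n \longrightarrow y \quad \text{in } W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big).$$
So if we pass to the limit as $n \to +\infty$ in (6.71), we obtain
$$\begin{cases} -\big(\|y'(t)\|_{\mathbb{R}^N}^{p-2} y'(t)\big)' + \varepsilon \|y(t)\|_{\mathbb{R}^N}^{p-2} y(t)\\ \quad = \lambda \|y(t)\|_{\mathbb{R}^N}^{p-2} y(t) + \beta \varepsilon \|y(t)\|_{\mathbb{R}^N}^{p-2} y(t) & \text{for a.a. } t \in T,\\ y(0) = y(b), \quad y'(0) = y'(b). \end{cases} \tag{6.72}$$
Since $\|y\|_{W^{1,p}((0,b);\mathbb{R}^N)} = 1$, we have $y \neq 0$ and so from (6.72), it follows that $y \in C_{per}^1\big(T; \mathbb{R}^N\big)$ is an eigenfunction of the negative vector ordinary $p$-Laplacian with periodic boundary conditions with corresponding eigenvalue $\lambda - (1 - \beta)\varepsilon \in \sigma(p, N)$, which contradicts the choice of $\varepsilon > 0$. Therefore, we can find $r > 0$ large enough, so that
$$x - H(\beta, x) \neq 0 \qquad \forall\, \beta \in [0, 1],\ x \in \partial B_r(0),$$
where
$$\partial B_r(0) = \big\{x \in W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big) : \|x\|_{W^{1,p}((0,b);\mathbb{R}^N)} = r\big\}.$$
From the homotopy invariance of the Leray-Schauder degree, we have
$$\pm 1 = d_{LS}\big(\mathrm{id} - H(0, \cdot), B_r(0), 0\big) = d_{LS}\big(\mathrm{id} - H(1, \cdot), B_r(0), 0\big),$$
since $\lambda \notin \sigma(p, N)$. So we can find $x \in W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big)$, such that
$$x = H(1, x) \quad \text{and} \quad \|x\|_{W^{1,p}((0,b);\mathbb{R}^N)} < r.$$
Evidently this is a solution of (6.69) when $h \in L^{p'}\big(T; \mathbb{R}^N\big)$.

Finally let $h \in L^1\big(T; \mathbb{R}^N\big)$ and choose a sequence $\{h_n\}_{n \ge 1} \subseteq L^{p'}\big(T; \mathbb{R}^N\big)$, such that
$$h_n \longrightarrow h \quad \text{in } L^1\big(T; \mathbb{R}^N\big).$$
Let $x_n \in C_{per}^1\big(T; \mathbb{R}^N\big)$ for $n \ge 1$ be solutions of (6.69), when $h$ is replaced by $h_n$. Then as before we can show that
$$x_n \longrightarrow x \quad \text{in } W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big).$$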
Therefore in the limit, we have that
$$\begin{cases} -\big(\|x'(t)\|_{\mathbb{R}^N}^{p-2} x'(t)\big)' = \lambda \|x(t)\|_{\mathbb{R}^N}^{p-2} x(t) + h(t) & \text{for a.a. } t \in T,\\ x(0) = x(b), \quad x'(0) = x'(b), \end{cases}$$
i.e., problem (6.69) has a solution $x \in C_{per}^1\big(T; \mathbb{R}^N\big)$. This proves that $\lambda \in \varrho(p, N)$ and so we have proved the proposition.

Let $u \in C_{per}^1\big(T; \mathbb{R}^N\big)$ be an eigenfunction for problem (6.68). Taking inner product with $u(t)$, integrating over $T$ and using integration by parts, we obtain
$$\lambda = \frac{\|u'\|_p^p}{\|u\|_p^p} \ge 0.$$
So all the eigenvalues of (6.68) are nonnegative and since $0 \in \sigma(p, N)$, we infer that $0$ is the smallest eigenvalue. Moreover, it is clear from (6.68) that all the eigenvalues of the scalar problem ($N = 1$) are also eigenvalues of the vector problem ($N > 1$). So we obtain the eigenvalues:
$$\nu_0 = 0, \qquad \nu_n = \Big(\frac{2n}{b}\Big)^p (p-1) \bigg[2 \int_0^1 \frac{dt}{(1 - t^p)^{\frac{1}{p}}}\bigg]^p \qquad \forall\, n \ge 1.$$
But in contrast to the Dirichlet problem, the periodic problem has more eigenvalues when $N > 1$ (vector case). For example consider
$$\xi_n \stackrel{df}{=} \Big(\frac{2\pi n}{b}\Big)^p \qquad \forall\, n \ge 1$$
with corresponding eigenfunctions
$$v_n(t) \stackrel{df}{=} \Big(0, \ldots, 0, \cos\frac{2\pi n}{b} t, 0, \ldots, 0, \sin\frac{2\pi n}{b} t, 0, \ldots, 0\Big).$$
More generally, if $n \ge 1$, every nontrivial solution $v$ of
$$v''(t) + \Big(\frac{2\pi n}{b}\Big)^2 v(t) = 0 \qquad \text{for a.a. } t \in T,$$
such that
$$\big(v'(t), v(t)\big)_{\mathbb{R}^N} = 0 \qquad \forall\, t \in T,$$
is an eigenfunction corresponding to the eigenvalue $\xi_n$.
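That the rotating functions $v_n$ are eigenfunctions for $\xi_n$ can be confirmed with a finite-difference check. The sketch below works in $\mathbb{R}^2$ (the two nonzero components of $v_n$), differentiates the flux $\|v'\|^{p-2} v'$ by central differences, and compares $-(\|v'\|^{p-2} v')'$ with $\xi_n \|v\|^{p-2} v$; the function name and the step size `h` are choices made for this illustration.

```python
import math

def relative_residual(p, n, b, t, h=1e-4):
    """Relative residual of -(|v'|^(p-2) v')' = xi_n |v|^(p-2) v at time t.

    Here v(s) = (cos(w s), sin(w s)) with w = 2*pi*n/b and xi_n = w^p.
    """
    w = 2.0 * math.pi * n / b
    v = lambda s: (math.cos(w * s), math.sin(w * s))
    dv = lambda s: (-w * math.sin(w * s), w * math.cos(w * s))

    def flux(s):
        # |v'(s)|^(p-2) v'(s), componentwise
        d = dv(s)
        norm = math.hypot(d[0], d[1])
        return tuple(norm ** (p - 2.0) * di for di in d)

    f_plus, f_minus = flux(t + h), flux(t - h)
    # Central difference approximation of -(flux)'.
    lhs = tuple(-(u - d) / (2.0 * h) for u, d in zip(f_plus, f_minus))
    x = v(t)
    nx = math.hypot(x[0], x[1])
    rhs = tuple(w ** p * nx ** (p - 2.0) * xi for xi in x)
    return max(abs(l - r) for l, r in zip(lhs, rhs)) / w ** p

worst = max(
    relative_residual(p, n, b, t)
    for p in (1.5, 2.0, 3.0, 4.0)
    for n in (1, 2)
    for b in (1.0, 2.0)
    for t in (0.1, 0.37, 0.9)
)
```

The residual is of the size of the central-difference truncation error, because $\|v_n'\|$ is constant and the flux is then just a constant multiple of $v_n'$.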
Integrating (6.68) over $T$, we see that, if $\lambda \in \sigma(p, N) \setminus \{0\}$ and $u$ is an associated eigenfunction, we have
$$\int_0^b \|u(t)\|_{\mathbb{R}^N}^{p-2} u(t)\,dt = 0.$$
So following the analysis of the Neumann problem for the partial $p$-Laplacian (see Section 6.2), we define
$$\begin{aligned} C(p, N) &\stackrel{df}{=} \Big\{x \in W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big) : \int_0^b \|x(t)\|_{\mathbb{R}^N}^{p-2} x(t)\,dt = 0\Big\},\\ C_1(p, N) &\stackrel{df}{=} \big\{x \in C(p, N) : \|x\|_p = 1\big\},\\ \vartheta_{p,N}(x) &\stackrel{df}{=} \|x'\|_p^p \qquad \forall\, x \in W_{per}^{1,p}\big((0, b); \mathbb{R}^N\big). \end{aligned}$$
Evidently $\vartheta_{p,N}$ is strictly convex and $C^1$. We consider the minimization problem
$$\lambda_1(p, N) = \inf_{x \in C_1(p, N)} \vartheta_{p,N}(x).$$
As in Proposition 6.2.27, we have the following result.

PROPOSITION 6.3.12
$\lambda_1(p, N) > 0$ and we can find $x \in C_1(p, N)$, such that $\vartheta_{p,N}(x) = \lambda_1(p, N)$.

This proposition leads to the following Poincaré-Wirtinger-type inequality.

COROLLARY 6.3.13
If $x \in C(p, N)$ and $\lambda_1 = \lambda_1(p, N) > 0$, then $\lambda_1 \|x\|_p^p \le \|x'\|_p^p$.

If $p \ge 2$, then we have the following characterization of $\lambda_1 = \lambda_1(p, N) > 0$. The proof is similar to that of Theorem 6.2.29.

THEOREM 6.3.14
If $p \ge 2$, then $\lambda_1 > 0$ is the first nonzero eigenvalue of the negative vector ordinary $p$-Laplacian with periodic boundary conditions.

REMARK 6.3.15 The complete structure of $\sigma(p, N)$ when $N > 1$ (vector case) is far from being understood and more research is needed in this direction.
6.4 Maximum Principles
In the last part of Section 6.1, we proved maximum principles for second order, linear, uniformly elliptic differential operators. In this section we derive analogous results for nonlinear differential operators involving the partial $p$-Laplacian. Let $\Omega \subseteq \mathbb{R}^N$ be a bounded domain with $C^2$-boundary $\partial\Omega$ and let $a \in L^\infty(\Omega)$ be a weight function. Using the notation introduced in the previous section, by $\psi_p : \mathbb{R} \longrightarrow \mathbb{R}$ we denote the homeomorphism, defined by
$$\psi_p(r) \stackrel{df}{=} \begin{cases} |r|^{p-2} r & \text{if } r \neq 0,\\ 0 & \text{if } r = 0. \end{cases}$$
In this section we study the following nonlinear $(p-1)$-homogeneous differential operator:
$$V_p(x) \stackrel{df}{=} -\Delta_p x + a(z) \psi_p(x).$$
Let $h \in W^{-1,p'}(\Omega) = \big(W_0^{1,p}(\Omega)\big)^*$ (with $\frac{1}{p} + \frac{1}{p'} = 1$) and consider the equation:
$$V_p(x) = h. \tag{6.73}$$
We start by introducing the notions of weak solutions, weak upper solutions and weak lower solutions for problem (6.73).

DEFINITION 6.4.1
(a) A function $x \in W^{1,p}(\Omega)$ is said to be a weak solution of (6.73), if
$$\int_\Omega \|Dx\|_{\mathbb{R}^N}^{p-2} (Dx, Dy)_{\mathbb{R}^N}\,dz + \int_\Omega a \psi_p(x) y\,dz = \langle h, y \rangle_{W_0^{1,p}(\Omega)} \qquad \forall\, y \in W_0^{1,p}(\Omega).$$
(b) A function $x \in W^{1,p}(\Omega)$ is said to be a weak upper solution of (6.73), if
$$\int_\Omega \|Dx\|_{\mathbb{R}^N}^{p-2} (Dx, Dy)_{\mathbb{R}^N}\,dz + \int_\Omega a \psi_p(x) y\,dz \ge \langle h, y \rangle_{W_0^{1,p}(\Omega)} \qquad \forall\, y \in W_0^{1,p}(\Omega),\ y \ge 0.$$
(c) A function $x \in W^{1,p}(\Omega)$ is said to be a weak lower solution of (6.73), if
$$\int_\Omega \|Dx\|_{\mathbb{R}^N}^{p-2} (Dx, Dy)_{\mathbb{R}^N}\,dz + \int_\Omega a \psi_p(x) y\,dz \le \langle h, y \rangle_{W_0^{1,p}(\Omega)} \qquad \forall\, y \in W_0^{1,p}(\Omega),\ y \ge 0.$$
DEFINITION 6.4.2
We say that the differential operator $V_p$ satisfies the maximum principle, if every weak solution $x \in W^{1,p}(\Omega)$ of
$$\begin{cases} V_p(x) = f,\\ x|_{\partial\Omega} \ge 0, \end{cases}$$
with $f \in L^\infty(\Omega)$, $f \ge 0$, satisfies
$$x(z) \ge 0 \qquad \text{for a.a. } z \in \Omega.$$
We say that $V_p$ satisfies the strong maximum principle, if in addition
$$x(z) > 0 \qquad \text{for a.a. } z \in \Omega,$$
when $f \neq 0$.

We examine the first eigenvalue of the differential operator $V_p$ with Dirichlet boundary conditions (i.e., of $\big(V_p, W_0^{1,p}(\Omega)\big)$). So we consider the following nonlinear eigenvalue problem:
$$\begin{cases} -\mathrm{div}\big(\|Dx(z)\|_{\mathbb{R}^N}^{p-2} Dx(z)\big) + a(z) |x(z)|^{p-2} x(z) = \lambda |x(z)|^{p-2} x(z) & \text{for a.a. } z \in \Omega,\\ x|_{\partial\Omega} = 0, \end{cases} \tag{6.74}$$
with $\lambda \in \mathbb{R}$. To establish the existence of a principal eigenvalue for (6.74), we will need the following lemma. Let
$$D(I) \stackrel{df}{=} \Big\{(x, y) \in W_0^{1,p}(\Omega) \times W_0^{1,p}(\Omega) : x \ge 0,\ y \ge 0,\ \frac{x}{y}, \frac{y}{x} \in L^\infty(\Omega)\Big\}$$
and consider the function
$$I(x, y) \stackrel{df}{=} \Big\langle A(x), \frac{x^p - y^p}{x^{p-1}} \Big\rangle_{W_0^{1,p}(\Omega)} - \Big\langle A(y), \frac{x^p - y^p}{y^{p-1}} \Big\rangle_{W_0^{1,p}(\Omega)} \qquad \forall\, (x, y) \in D(I).$$
Here $A : W_0^{1,p}(\Omega) \longrightarrow W^{-1,p'}(\Omega)$ is the nonlinear maximal monotone, strictly monotone operator, defined by
$$\big\langle A(x), y \big\rangle_{W_0^{1,p}(\Omega)} \stackrel{df}{=} \int_\Omega \|Dx(z)\|_{\mathbb{R}^N}^{p-2} \big(Dx(z), Dy(z)\big)_{\mathbb{R}^N} dz \qquad \forall\, x, y \in W_0^{1,p}(\Omega)$$
(see the proof of Proposition 6.2.2). Evidently $A(x) = -\Delta_p x$ and note that because of Theorem 2.4.57, we have
$$-\Delta_p x = -\mathrm{div}\big(\|Dx\|_{\mathbb{R}^N}^{p-2} Dx\big) \in W^{-1,p'}(\Omega).$$
Concerning the functional $I(x, y)$ defined on $D(I)$, we have the following result (see also the proof of Proposition 6.2.15).
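LEMMA 6.4.3
For all $(x, y) \in D(I)$, $I(x, y) \ge 0$ and $I(x, y) = 0$ if and only if $x = \mu y$ for some $\mu > 0$.

PROOF First we prove the nonnegativity of the functional $I(x, y)$ on the domain $D(I)$. So let $(x, y) \in D(I)$. We have
$$D\Big(\frac{x^p}{y^{p-1}}\Big) = p \Big(\frac{x}{y}\Big)^{p-1} Dx - (p-1) \Big(\frac{x}{y}\Big)^p Dy$$
and
$$D\Big(\frac{y^p}{x^{p-1}}\Big) = p \Big(\frac{y}{x}\Big)^{p-1} Dy - (p-1) \Big(\frac{y}{x}\Big)^p Dx.$$
So we obtain
$$\begin{aligned} \Big\langle A(x), \frac{x^p - y^p}{x^{p-1}} \Big\rangle_{W_0^{1,p}(\Omega)} &= \int_\Omega \Big[-p \Big(\frac{y}{x}\Big)^{p-1} \|Dx\|_{\mathbb{R}^N}^{p-2} (Dx, Dy)_{\mathbb{R}^N} + \Big(1 + (p-1) \Big(\frac{y}{x}\Big)^p\Big) \|Dx\|_{\mathbb{R}^N}^p\Big]\,dz\\ &= \int_\Omega \Big[p \Big(\frac{y}{x}\Big)^{p-1} \|Dx\|_{\mathbb{R}^N}^{p-2} \big(\|Dx\|_{\mathbb{R}^N} \|Dy\|_{\mathbb{R}^N} - (Dx, Dy)_{\mathbb{R}^N}\big)\\ &\qquad\quad + \Big(1 + (p-1) \Big(\frac{y}{x}\Big)^p\Big) \|Dx\|_{\mathbb{R}^N}^p - p \Big(\frac{y}{x}\Big)^{p-1} \|Dx\|_{\mathbb{R}^N}^{p-1} \|Dy\|_{\mathbb{R}^N}\Big]\,dz. \end{aligned}$$
An analogous expression is obtained for $-\big\langle A(y), \frac{x^p - y^p}{y^{p-1}} \big\rangle_{W_0^{1,p}(\Omega)}$ by interchanging the roles of $x$ and $y$. So finally exploiting the symmetry of the expression, we can write that
$$I(x, y) = \int_\Omega h_1\Big(\frac{y}{x}, Dx, Dy\Big)\,dz + \int_\Omega h_2\Big(\frac{y}{x}, \|Dx\|_{\mathbb{R}^N}, \|Dy\|_{\mathbb{R}^N}\Big)\,dz,$$
where
$$h_1(r, \xi, \eta) \stackrel{df}{=} p \big(r^{p-1} \|\xi\|_{\mathbb{R}^N}^{p-2} + r^{1-p} \|\eta\|_{\mathbb{R}^N}^{p-2}\big) \big(\|\xi\|_{\mathbb{R}^N} \|\eta\|_{\mathbb{R}^N} - (\xi, \eta)_{\mathbb{R}^N}\big) \qquad \forall\, r > 0,\ \xi, \eta \in \mathbb{R}^N,$$
$$h_2(r, \tau, s) \stackrel{df}{=} \big(1 + (p-1) r^p\big) \tau^p + \big(1 + (p-1) r^{-p}\big) s^p - p r^{p-1} \tau^{p-1} s - p r^{1-p} s^{p-1} \tau \qquad \forall\, r > 0,\ \tau, s \ge 0.$$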
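By virtue of the Cauchy-Schwarz inequality $h_1 \ge 0$. We will show that $h_2 \ge 0$ too. Note that
$$h_2(r, 0, s) \ge 0 \quad \text{and} \quad \text{if } h_2(r, 0, s) = 0, \text{ then } s = 0.$$
If $\tau \neq 0$, setting $z \stackrel{df}{=} \frac{s}{r\tau}$, we obtain
$$h_2(r, \tau, s) = \tau^p \big(r^p \gamma_1(z) + \gamma_2(z)\big),$$
with
$$\gamma_1(z) \stackrel{df}{=} z^p - pz + p - 1 \quad \text{and} \quad \gamma_2(z) \stackrel{df}{=} (p-1) z^p - p z^{p-1} + 1.$$
Evidently $\gamma_1, \gamma_2 \ge 0$ and so
$$I(x, y) \ge 0 \qquad \forall\, (x, y) \in D(I).$$
The functions $\gamma_1$ and $\gamma_2$ vanish only at $z = 1$. So
$$h_2(r, \tau, s) = 0 \quad \text{if and only if} \quad s = r\tau.$$
Then if $I(x, y) = 0$, we have
$$h_1\Big(\frac{y}{x}, Dx, Dy\Big) = 0 \quad \text{for a.a. } z \in \Omega \qquad \text{and} \qquad h_2\Big(\frac{y}{x}, \|Dx\|_{\mathbb{R}^N}, \|Dy\|_{\mathbb{R}^N}\Big) = 0 \quad \text{for a.a. } z \in \Omega$$
and this is equivalent to saying that
$$x(z) \|Dy(z)\|_{\mathbb{R}^N} = y(z) \|Dx(z)\|_{\mathbb{R}^N} \qquad \text{for a.a. } z \in \Omega$$
and
$$\big(Dx(z), Dy(z)\big)_{\mathbb{R}^N} = \|Dx(z)\|_{\mathbb{R}^N} \|Dy(z)\|_{\mathbb{R}^N} \qquad \text{for a.a. } z \in \Omega.$$
It follows that
$$\|x Dy - y Dx\|_{\mathbb{R}^N}^2 = 0 \qquad \text{for a.a. } z \in \Omega,$$
from which it follows that $x = cy$ with $c > 0$.

REMARK 6.4.4 The nonnegativity of $I(x, y)$ on $D(I)$ can also be deduced by considering the functional $\psi : L^1(\Omega) \longrightarrow \overline{\mathbb{R}}_+ = \mathbb{R}_+ \cup \{+\infty\}$, defined by
$$\psi(x) \stackrel{df}{=} \begin{cases} \frac{1}{p} \big\|D x^{\frac{1}{p}}\big\|_p^p & \text{if } x^{\frac{1}{p}} \in W_0^{1,p}(\Omega),\ x \ge 0,\\ +\infty & \text{otherwise.} \end{cases}$$
Then it is easy to see that $\psi \in \Gamma_0\big(L^1(\Omega)\big)$. Moreover, if by $\psi_G'$ we denote the Gâteaux derivative of $\psi$, we have
$$0 \le \big\langle \psi_G'(x^p) - \psi_G'(y^p), x^p - y^p \big\rangle_{W_0^{1,p}(\Omega)} = I(x, y) \qquad \forall\, (x, y) \in D(I).$$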
Using Lemma 6.4.3, we can have an analog of Theorem 6.2.18 for the eigenvalue problem (6.74).

THEOREM 6.4.5
The nonlinear eigenvalue problem (6.74) has a unique eigenvalue $\lambda = \lambda_{1,p}(a)$, defined by
$$\lambda_{1,p}(a) \stackrel{df}{=} \inf_{\substack{x \in W_0^{1,p}(\Omega)\\ \|x\|_p = 1}} \Big[\|Dx\|_p^p + \int_\Omega a(z) |x(z)|^p\,dz\Big], \tag{6.75}$$
with the property of exhibiting a positive eigenfunction $\widehat{u}_1 \in W_0^{1,p}(\Omega)$. Moreover, $\lambda_{1,p}(a)$ is simple, isolated and $\widehat{u}_1 \in C_0^1(\overline{\Omega})$ with $\frac{\partial \widehat{u}_1}{\partial n} < 0$ on $\partial\Omega$ (where $n$ is the outer unit normal on $\partial\Omega$).

PROOF Since $a \in L^\infty(\Omega)$, we may assume without any loss of generality that $a \ge 0$. Indeed, if this is not the case we replace $a$ and $\lambda$ in (6.74) by
$$a_M(z) \stackrel{df}{=} a(z) + M \quad \text{and} \quad \lambda_M \stackrel{df}{=} \lambda + M$$
respectively, with $M \stackrel{df}{=} \|a\|_\infty$. Since $a \ge 0$, the function $\eta : W_0^{1,p}(\Omega) \longrightarrow \mathbb{R}_+$, defined by
$$\eta(x) \stackrel{df}{=} \|Dx\|_p^p + \int_\Omega a |x|^p\,dz,$$
is convex, lower semicontinuous (hence weakly lower semicontinuous too) and
$$\eta(x) \ge \|Dx\|_p^p,$$
which by Poincaré's inequality implies that $\eta$ is coercive. So we can find $\widehat{u}_1 \in W_0^{1,p}(\Omega)$ with $\|\widehat{u}_1\|_p = 1$, such that
$$\lambda_{1,p}(a) = \eta(\widehat{u}_1)$$
(see (6.75)). Then by the Lagrange multiplier rule (see Theorem 5.5.24), we have that $\widehat{u}_1 \in W_0^{1,p}(\Omega)$ is a solution of (6.74). Using Theorems 6.2.6 and 6.2.7(a) (nonlinear regularity theory), we have that $\widehat{u}_1 \in C_0^1(\overline{\Omega})$. Moreover, it is clear from (6.75) that $\widehat{u}_1$ does not change sign and so we may assume that
$$\widehat{u}_1(z) \ge 0 \qquad \forall\, z \in \Omega$$
and $\widehat{u}_1 \neq 0$. Invoking Theorem 6.2.8, we have that
$$\widehat{u}_1(z) > 0 \quad \forall\, z \in \Omega \qquad \text{and} \qquad \frac{\partial \widehat{u}_1}{\partial n}(z) < 0 \quad \forall\, z \in \partial\Omega.$$
For the simplicity of $\lambda_{1,p}(a)$, note that, if $u, v \in W_0^{1,p}(\Omega)$ is any pair of positive eigenfunctions corresponding to $\lambda_{1,p}(a)$, then $I(u, v) = 0$ and so by Lemma 6.4.3, we have $u = cv$, which establishes that the corresponding eigenspace is one dimensional.

For the uniqueness of $\lambda_{1,p}(a)$, let $\lambda \neq \lambda_{1,p}(a)$ be an eigenvalue of (6.74) (hence $\lambda > \lambda_{1,p}(a)$), which has a positive eigenfunction $u \in W_0^{1,p}(\Omega)$, i.e.,
$$V_p(u) = \lambda\psi_p(u) \quad\text{and}\quad u(z) \geq 0 \quad \text{for a.a. } z \in \Omega.$$
As for $\widehat{u}_1$, we have $u \in C_0^1(\overline{\Omega})$ and
$$u(z) > 0 \qquad \forall\ z \in \Omega.$$
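Since positive multiples of eigenfunctions are again eigenfunctions, we may assume that $\|u\|_p < 1 = \|\widehat{u}_1\|_p$. Then
$$0 \leq I(\widehat{u}_1, u) = \int_\Omega \big(\lambda_{1,p}(a) - \lambda\big)\big(\widehat{u}_1^{\,p} - u^p\big)\,dz = \big(\lambda_{1,p}(a) - \lambda\big)\big(1 - \|u\|_p^p\big) < 0,$$
a contradiction. This proves the uniqueness of $\lambda_{1,p}(a)$.

Finally we show that $\lambda_{1,p}(a)$ is isolated. If this is not the case, then we can find eigenelements $(\lambda_n, u_n)$ of (6.74), such that $\lambda_n \searrow \lambda_{1,p}(a)$. We may assume that $\|u_n\|_{W_0^{1,p}(\Omega)} = 1$ and so, at least for a subsequence, we have
$$u_n \xrightarrow{\ w\ } u \ \text{ in } W_0^{1,p}(\Omega), \qquad u_n \longrightarrow u \ \text{ in } L^p(\Omega).$$
Then since
$$A(u_n) + a\psi_p(u_n) = \lambda_n\psi_p(u_n) \qquad \forall\ n \geq 1,$$
we have
$$\int_\Omega a\psi_p(u_n)(u_n - u)\,dz \longrightarrow 0 \quad\text{and}\quad \int_\Omega \psi_p(u_n)(u_n - u)\,dz \longrightarrow 0$$
and so
$$\big\langle A(u_n), u_n - u\big\rangle_{W_0^{1,p}(\Omega)} \longrightarrow 0,$$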
which by virtue of the maximal monotonicity of $A$ implies that
$$\big\langle A(u_n), u_n\big\rangle_{W_0^{1,p}(\Omega)} \longrightarrow \big\langle A(u), u\big\rangle_{W_0^{1,p}(\Omega)},$$
hence $\|Du_n\|_p \longrightarrow \|Du\|_p$. Since
$$Du_n \xrightarrow{\ w\ } Du \quad \text{in } L^p\big(\Omega; \mathbb{R}^N\big),$$
we infer that
$$Du_n \longrightarrow Du \quad \text{in } L^p\big(\Omega; \mathbb{R}^N\big)$$
and so
$$u_n \longrightarrow u \quad \text{in } W_0^{1,p}(\Omega).$$
In the limit as $n \to +\infty$, we have
$$A(u) + a\psi_p(u) = \lambda_{1,p}(a)\psi_p(u)$$
and so $u = \widehat{u}_1$. Arguing as in the last part of the proof of Proposition 6.2.17, we can show that
$$\big|\{u_n < 0\}\big|_N \geq \vartheta > 0,$$
for some $\vartheta > 0$ and all $n \geq n_0$. This contradicts the fact that $\widehat{u}_1 > 0$.

We can use the above theorem to obtain characterizations of the maximum principle and strong maximum principle for the operator $V_p$.

THEOREM 6.4.6 Let $V_p(x) \stackrel{df}{=} -\Delta_p x + a\psi_p(x)$. The following statements are equivalent:
(a) the operator $V_p$ satisfies the maximum principle;
(b) the operator $V_p$ satisfies the strong maximum principle;
(c) $\lambda_{1,p}(a) > 0$ (see (6.75));
(d) there exists $x \in W_0^{1,p}(\Omega)$, $x(z) > 0$ for almost all $z \in \Omega$, such that $V_p(x) = f \in L^\infty(\Omega)$, $f \geq 0$, $f \neq 0$ (i.e., $x$ is a positive strict upper solution of the equation $V_p(x) = 0$);
(e) for every $f \in L^\infty(\Omega)_+$, there exists a unique weak solution of the equation $V_p(x) = f$, which is nonnegative.
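PROOF “(a)$\Longrightarrow$(b)”: Let $x \in W^{1,p}(\Omega)$ be a weak solution of
$$-\Delta_p x + a\psi_p(x) = f \quad \text{in } \Omega,$$
with $f \in L^\infty(\Omega)$, $f \geq 0$, $f \neq 0$, such that $x \geq 0$ (see Definition 6.4.2). From Theorems 6.2.6 and 6.2.7(a), we have that $x \in C^1(\overline{\Omega})$ and because $x(z) \geq 0$ for all $z \in \Omega$, we have
$$-\Delta_p x(z) + \|a\|_\infty\psi_p\big(x(z)\big) \geq 0 \quad \text{for a.a. } z \in \Omega.$$
Invoking Theorem 6.2.8, we obtain
$$x(z) > 0 \qquad \forall\ z \in \Omega.$$

“(b)$\Longrightarrow$(c)”: By Theorem 6.4.5, we have a positive eigenfunction $\widehat{u}_1 \in C^1(\overline{\Omega})$ corresponding to the eigenvalue $\lambda_{1,p}(a)$ of the operator $V_p$. If $\lambda_{1,p}(a) \leq 0$, we have
$$-\Delta_p\big(-\widehat{u}_1\big)(z) + a(z)\psi_p\big(-\widehat{u}_1(z)\big) = -\lambda_{1,p}(a)\psi_p\big(\widehat{u}_1(z)\big) \geq 0 \quad \text{for a.a. } z \in \Omega.$$
So by hypothesis, we have
$$-\widehat{u}_1(z) > 0 \qquad \forall\ z \in \Omega,$$
a contradiction. Therefore $\lambda_{1,p}(a) > 0$.

“(c)$\Longrightarrow$(d)”: Just take $x = \widehat{u}_1$, the eigenfunction corresponding to the eigenvalue $\lambda_{1,p}(a)$.

“(d)$\Longrightarrow$(c)”: Let $\widehat{u}_1 \in C^1(\overline{\Omega})$ be the positive eigenfunction corresponding to $\lambda_{1,p}(a)$ (see Theorem 6.4.5) and let $x \in W_0^{1,p}(\Omega)$ be the positive strict upper solution existing by hypothesis. We have $(\widehat{u}_1, x) \in D(I)$ and in particular $\frac{x}{\widehat{u}_1} \in L^\infty(\Omega)$. Let $v = c\widehat{u}_1$ with $c > \big\|\frac{x}{\widehat{u}_1}\big\|_\infty$. Suppose that $\lambda_{1,p}(a) \leq 0$. From the choice of $c > 0$ and since $f \geq 0$, we have
$$\Big\langle -\Delta_p x, \frac{v^p - x^p}{x^{p-1}}\Big\rangle_{W_0^{1,p}(\Omega)} + \int_\Omega a\big(v^p - x^p\big)\,dz = \int_\Omega f\,\frac{v^p - x^p}{x^{p-1}}\,dz \geq 0. \tag{6.76}$$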
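Also
$$\Big\langle -\Delta_p v, \frac{v^p - x^p}{v^{p-1}}\Big\rangle_{W_0^{1,p}(\Omega)} = \int_\Omega -a\big(v^p - x^p\big)\,dz + \lambda_{1,p}(a)\int_\Omega \big(v^p - x^p\big)\,dz. \tag{6.77}$$
Therefore
$$I(v, x) = \Big\langle -\Delta_p v, \frac{v^p - x^p}{v^{p-1}}\Big\rangle_{W_0^{1,p}(\Omega)} - \Big\langle -\Delta_p x, \frac{v^p - x^p}{x^{p-1}}\Big\rangle_{W_0^{1,p}(\Omega)} \leq \lambda_{1,p}(a)\int_\Omega \big(v^p - x^p\big)\,dz \leq 0$$
(see (6.76) and (6.77), note that $v \geq x$ by the choice of $c$, and recall that we have assumed $\lambda_{1,p}(a) \leq 0$). From Lemma 6.4.3, it follows that
$$I(v, x) = 0 \quad\text{and}\quad v = c_0x,$$
for some $c_0 > 0$. Then
$$\lambda_{1,p}(a)\psi_p\big(v(z)\big) = c_0^{p-1}f(z) \quad \text{for a.a. } z \in \Omega,$$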
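a contradiction since $\lambda_{1,p}(a) \leq 0$, $c_0 > 0$ and $f \geq 0$, $f \neq 0$.

“(c)$\Longrightarrow$(a)”: Suppose that $x \in W^{1,p}(\Omega)$ is a weak solution of
$$-\Delta_p x + a\psi_p(x) = f,$$
with $f \in L^\infty(\Omega)$, $f \geq 0$ and $x|_{\partial\Omega} \geq 0$. Using as a test function $-x^- \in W_0^{1,p}(\Omega)$ (see Proposition 2.4.27), we obtain
$$\big\|Dx^-\big\|_p^p + \int_\Omega a|x^-|^p\,dz = \int_\Omega f(-x^-)\,dz \leq 0,$$
so, if $x^- \neq 0$, then $\lambda_{1,p}(a) \leq 0$, a contradiction to the hypothesis (see Theorem 6.4.5). Hence $x^- = 0$, i.e., $x \geq 0$.

“(c)$\Longrightarrow$(e)”: Consider the $C^1$-functional $\vartheta\colon W_0^{1,p}(\Omega) \longrightarrow \mathbb{R}$, defined by
$$\vartheta(x) \stackrel{df}{=} \|Dx\|_p^p + \int_\Omega a|x|^p\,dz - p\int_\Omega fx\,dz \qquad \forall\ x \in W_0^{1,p}(\Omega),$$
with $f \in L^\infty(\Omega)$, $f \geq 0$. Since $\lambda_{1,p}(a) > 0$, we have that there exists $\xi > 0$, such that
$$\|Dx\|_p^p + \int_\Omega a|x|^p\,dz \geq \xi\|x\|^p_{W^{1,p}(\Omega)}.$$
So we have
$$\vartheta(x) \geq \xi\|x\|^p_{W^{1,p}(\Omega)} - p\int_\Omega fx\,dz,$$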
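so $\vartheta$ is coercive. From the compactness of the embedding $W_0^{1,p}(\Omega) \subseteq L^p(\Omega)$, we infer that $\vartheta$ is weakly lower semicontinuous. So by the Weierstrass theorem, we can find $x \in W_0^{1,p}(\Omega)$, such that
$$\vartheta(x) = \min_{y \in W_0^{1,p}(\Omega)} \vartheta(y),$$
so
$$\vartheta'(x) = 0. \tag{6.78}$$
From (6.78), it follows that $x \in W_0^{1,p}(\Omega)$ is a solution of
$$-\Delta_p x(z) + a(z)|x(z)|^{p-2}x(z) = f(z) \quad \text{for a.a. } z \in \Omega.$$
Using as a test function $-x^- \in W_0^{1,p}(\Omega)$, we obtain
$$\lambda_{1,p}(a)\big\|x^-\big\|_p^p \leq \big\|Dx^-\big\|_p^p + \int_\Omega a|x^-|^p\,dz = \int_\Omega f(-x^-)\,dz \leq 0,$$
so $x^- = 0$, i.e., $x \geq 0$.

Now we will show that this nonnegative solution is unique. Let $x, y \in W_0^{1,p}(\Omega)$ be two such solutions. As before $x, y \in C^1(\overline{\Omega})$ and from Theorem 6.2.8, we have
$$x(z) > 0 \quad\text{and}\quad y(z) > 0 \qquad \forall\ z \in \Omega$$
and
$$\frac{\partial x}{\partial n}(z) < 0 \quad\text{and}\quad \frac{\partial y}{\partial n}(z) < 0 \qquad \forall\ z \in \partial\Omega.$$
Then $(x, y) \in D(I)$ and
$$0 \leq I(x, y) = \Big\langle -\Delta_p x, \frac{x^p - y^p}{x^{p-1}}\Big\rangle_{W_0^{1,p}(\Omega)} - \Big\langle -\Delta_p y, \frac{x^p - y^p}{y^{p-1}}\Big\rangle_{W_0^{1,p}(\Omega)} = \int_\Omega f\,\frac{y^{p-1} - x^{p-1}}{x^{p-1}y^{p-1}}\big(x^p - y^p\big)\,dz \leq 0,$$
so $I(x, y) = 0$, hence $x = cy$ for some $c > 0$ (see Lemma 6.4.3). From this we obtain $c^{p-1}f = f$. If $f \neq 0$, then $c = 1$ and so $x = y$. If $f = 0$, then because $\lambda_{1,p}(a) > 0$, $x = 0$ is the only solution.

“(e)$\Longrightarrow$(d)”: Obvious.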
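The implication “(c)$\Longrightarrow$(e)” can be illustrated numerically in the simplest linear case. The sketch below is not taken from the text: it assumes $p = 2$, $N = 1$, $\Omega = (0, 1)$, a standard three-point finite-difference discretization, and a weight for which the smallest eigenvalue is also the one of least modulus (so that inverse iteration finds it). All function names are hypothetical.

```python
import math

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are unused."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / m
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def discretize(a, n):
    """Finite-difference matrix of -u'' + a u on (0, 1) with u(0) = u(1) = 0."""
    h = 1.0 / (n + 1)
    off = -1.0 / h**2
    diag = [2.0 / h**2 + a((i + 1) * h) for i in range(n)]
    return [off] * n, diag, [off] * n

def first_eigenvalue(a, n=200, iters=200):
    """Inverse iteration followed by a Rayleigh quotient."""
    lower, diag, upper = discretize(a, n)
    u = [1.0] * n
    for _ in range(iters):
        v = thomas(lower, diag, upper, u)
        norm = math.sqrt(sum(t * t for t in v))
        u = [t / norm for t in v]
    total = 0.0
    for i in range(n):
        au = diag[i] * u[i]
        if i > 0:
            au += lower[i] * u[i - 1]
        if i < n - 1:
            au += upper[i] * u[i + 1]
        total += au * u[i]
    return total

def solve_dirichlet(a, f, n=200):
    """Discrete solution of -u'' + a u = f with zero boundary values."""
    lower, diag, upper = discretize(a, n)
    h = 1.0 / (n + 1)
    return thomas(lower, diag, upper, [f((i + 1) * h) for i in range(n)])
```

For instance, with the sign-indefinite choice $a \equiv -5$ the first eigenvalue is close to $\pi^2 - 5 > 0$, and `solve_dirichlet(lambda z: -5.0, lambda z: 1.0)` returns a grid function with nonnegative entries, as statement (e) predicts for this special case.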
For the next theorem, we will need the so-called Picone's identity. For $p = 2$ this identity says the following.

THEOREM 6.4.7 (Picone Identity for p = 2) If $x > 0$, $y \geq 0$ are differentiable functions, then
$$\|Dy\|^2_{\mathbb{R}^N} + \frac{y^2}{x^2}\|Dx\|^2_{\mathbb{R}^N} - 2\frac{y}{x}(Dx, Dy)_{\mathbb{R}^N} = \|Dy\|^2_{\mathbb{R}^N} - \Big(D\Big(\frac{y^2}{x}\Big), Dx\Big)_{\mathbb{R}^N} \geq 0 \quad \text{a.e. on } \Omega. \tag{6.79}$$
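Identity (6.79) can be checked by a direct computation. The following derivation is a sketch under the assumption that $x > 0$ and $y$ are differentiable at the point considered; it only uses the quotient rule.

```latex
\begin{align*}
D\Big(\frac{y^2}{x}\Big) &= \frac{2y}{x}\,Dy - \frac{y^2}{x^2}\,Dx, \\
\Big(D\Big(\frac{y^2}{x}\Big), Dx\Big)_{\mathbb{R}^N}
  &= \frac{2y}{x}\,(Dx, Dy)_{\mathbb{R}^N} - \frac{y^2}{x^2}\,\|Dx\|^2_{\mathbb{R}^N}, \\
\|Dy\|^2_{\mathbb{R}^N} - \Big(D\Big(\frac{y^2}{x}\Big), Dx\Big)_{\mathbb{R}^N}
  &= \|Dy\|^2_{\mathbb{R}^N} + \frac{y^2}{x^2}\,\|Dx\|^2_{\mathbb{R}^N}
     - \frac{2y}{x}\,(Dx, Dy)_{\mathbb{R}^N} \\
  &= \Big\|Dy - \frac{y}{x}\,Dx\Big\|^2_{\mathbb{R}^N} \geq 0.
\end{align*}
```

The last expression vanishes exactly when $Dy = \frac{y}{x}Dx$, that is, when $D\big(\frac{y}{x}\big) = \frac{1}{x}\big(Dy - \frac{y}{x}Dx\big) = 0$.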
This identity was extended recently for $p > 1$ not necessarily equal to 2 by Allegretto & Huang (1998).

THEOREM 6.4.8 (Picone Identity for p > 1) If $x > 0$, $y \geq 0$ are differentiable functions,
$$L_1(x, y) \stackrel{df}{=} \|Dy\|^p_{\mathbb{R}^N} + (p-1)\frac{y^p}{x^p}\|Dx\|^p_{\mathbb{R}^N} - p\frac{y^{p-1}}{x^{p-1}}\big(Dy, \|Dx\|^{p-2}_{\mathbb{R}^N}Dx\big)_{\mathbb{R}^N},$$
$$R_1(x, y) \stackrel{df}{=} \|Dy\|^p_{\mathbb{R}^N} - \Big(D\Big(\frac{y^p}{x^{p-1}}\Big), \|Dx\|^{p-2}_{\mathbb{R}^N}Dx\Big)_{\mathbb{R}^N},$$
then
$$L_1(x, y) = R_1(x, y) \geq 0 \quad \text{a.e. on } \Omega$$
and
$$L_1(x, y) = 0 \quad \text{for a.a. } z \in \Omega \qquad\text{if and only if}\qquad D\Big(\frac{y}{x}\Big) = 0 \quad \text{for a.a. } z \in \Omega,$$
i.e., $y = cx$ for some $c \geq 0$.

The next theorem is an interesting consequence of this identity.

THEOREM 6.4.9 If $u \in C^1(\overline{\Omega})$ is a positive supersolution for the operator $V_p$ and $a_0 \in L^\infty(\Omega)$ satisfies $a_0(z) \geq a(z)$ for a.a. $z \in \Omega$, then the equation
$$-\mathrm{div}\big(\|Dx(z)\|^{p-2}_{\mathbb{R}^N}Dx(z)\big) + a_0(z)|x(z)|^{p-2}x(z) = 0 \quad \text{for a.a. } z \in \Omega$$
has a positive solution too.
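PROOF Let $\vartheta \in C_c^\infty(\Omega)$. From Picone's identity for $p > 1$ (see Theorem 6.4.8), we have
$$0 \leq \int_\Omega L_1(\vartheta^\pm, u)\,dz = \int_\Omega \big\|D\vartheta^\pm\big\|^p_{\mathbb{R}^N}\,dz - \int_\Omega \frac{(\vartheta^\pm)^p}{u^{p-1}}\big(-\Delta_p u\big)\,dz \leq \int_\Omega \Big(\big\|D\vartheta^\pm\big\|^p_{\mathbb{R}^N} + a(\vartheta^\pm)^p\Big)\,dz. \tag{6.80}$$
In (6.80) equalities hold only if
$$\vartheta^\pm = 0 \quad\text{or}\quad \vartheta^\pm = cu, \ \text{with } c > 0.$$
Evidently this last option cannot happen. Therefore we have
$$0 \leq \int_\Omega \big(\|D\vartheta\|^p_{\mathbb{R}^N} + a|\vartheta|^p\big)\,dz \leq \int_\Omega \big(\|D\vartheta\|^p_{\mathbb{R}^N} + a_0|\vartheta|^p\big)\,dz \qquad \forall\ \vartheta \in C_c^\infty(\Omega).$$
We claim that there exists $\xi > 0$, such that
$$\xi\|\vartheta\|^p \leq \int_\Omega \big(\|D\vartheta\|^p_{\mathbb{R}^N} + a_0|\vartheta|^p\big)\,dz \qquad \forall\ \vartheta \in C_c^\infty(\Omega). \tag{6.81}$$
Suppose that (6.81) is not true. Then for every $n \geq 1$, we can find $\vartheta_n \in C_c^\infty(\Omega)$, such that $\|\vartheta_n\| = 1$ and
$$\int_\Omega \big(\|D\vartheta_n\|^p_{\mathbb{R}^N} + a_0|\vartheta_n|^p\big)\,dz \leq \frac{1}{n},$$
so because of (6.80) and since $a_0 \geq a$, we have
$$0 \leq \int_\Omega \big(\|D\vartheta_n\|^p_{\mathbb{R}^N} + a|\vartheta_n|^p\big)\,dz \leq \frac{1}{n}. \tag{6.82}$$
By passing to a subsequence if necessary, we may assume that
$$\vartheta_n \xrightarrow{\ w\ } \vartheta \ \text{ in } W^{1,p}(\Omega), \qquad \vartheta_n \longrightarrow \vartheta \ \text{ in } L^p(\Omega).$$
Passing to the limit as $n \to +\infty$ in (6.82), we obtain
$$\int_\Omega \big(\|D\vartheta\|^p_{\mathbb{R}^N} + a|\vartheta|^p\big)\,dz = 0,$$
so $\vartheta = 0$, a contradiction to the fact that $\|\vartheta_n\| = 1$ for $n \geq 1$.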
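This proves that (6.81) must hold. Hence $\lambda_{1,p}(a_0) > 0$. Consider functions
$$f_n \in C_c^1(\Omega), \quad f_n \geq 0, \quad f_n \neq 0, \quad \|f_n\|_\infty \longrightarrow 0$$
and the Dirichlet problem:
$$\begin{cases} -\mathrm{div}\big(\|Dx(z)\|^{p-2}_{\mathbb{R}^N}Dx(z)\big) + a_0(z)|x(z)|^{p-2}x(z) = f_n(z) & \text{for a.a. } z \in \Omega, \\ x|_{\partial\Omega} = 0. \end{cases}$$
By Theorem 6.4.6, for every $n \geq 1$, this problem has a solution $x_n \in C(\overline{\Omega})$, such that
$$x_n(z) > 0 \qquad \forall\ z \in \Omega.$$
Fix $z_0 \in \Omega$. Then by Harnack's inequality (see Theorem 6.1.49 and note that it is also valid for the p-Laplacian; see Trudinger (1967)), we can find $\xi > 0$, such that
$$x_n(z_0) \geq \xi > 0.$$
From Theorem 6.4.8, we see that the sequence $\{x_n\}_{n \geq 1} \subseteq C_0^{1,\beta}(\overline{\Omega})$ is bounded ($\beta \in (0, 1)$). Since the embedding $C_0^{1,\beta}(\overline{\Omega}) \subseteq C_0^1(\overline{\Omega})$ is compact, we can assume that $x_n \longrightarrow x$ in $C_0^1(\overline{\Omega})$. Note that $x(z_0) \geq \xi$ and so $x \neq 0$. Also
$$\begin{cases} -\mathrm{div}\big(\|Dx(z)\|^{p-2}_{\mathbb{R}^N}Dx(z)\big) + a_0(z)|x(z)|^{p-2}x(z) = 0 & \text{for a.a. } z \in \Omega, \\ x|_{\partial\Omega} = 0, \quad x \geq 0, \quad x \neq 0. \end{cases}$$
Invoking Theorem 6.4.9, we conclude that
$$x(z) > 0 \qquad \forall\ z \in \Omega.$$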
6.5 Comparison Principles
If $x, y \in C^2(\Omega) \cap C^1(\overline{\Omega})$ and
$$-\Delta x \leq -\Delta y \quad \text{with } x|_{\partial\Omega} \leq y|_{\partial\Omega},$$
then using the strong maximum principle (see Theorem 6.1.44 and Corollary 6.1.45), we can show that $x \leq y$ in $\Omega$. The situation is more involved with the p-Laplacian due to the degeneracy of the operator. In this section we derive such comparison principles for the p-Laplacian.

The mathematical setting remains unchanged. Let $\Omega \subseteq \mathbb{R}^N$ be a bounded domain with a $C^2$-boundary and let $a \in L^\infty(\Omega)$ be a weight function. We deal with the differential operator
$$V_p(x) \stackrel{df}{=} -\Delta_p(x) + a\psi_p(x),$$
where $\psi_p\colon \mathbb{R} \longrightarrow \mathbb{R}$ is the homeomorphism, defined by
$$\psi_p(r) \stackrel{df}{=} \begin{cases} |r|^{p-2}r & \text{if } r \neq 0, \\ 0 & \text{if } r = 0, \end{cases}$$
with $p \in (1, +\infty)$. If $a \equiv 0$, then we have the p-Laplacian.

DEFINITION 6.5.1 We say that the operator $V_p$ satisfies the weak comparison principle, if
$$V_p(x) \leq V_p(y) \ \text{ in } W^{-1,p'}(\Omega) \quad\text{and}\quad x|_{\partial\Omega} \leq y|_{\partial\Omega} \qquad \text{for some } x, y \in W^{1,p}(\Omega)$$
implies
$$x \leq y \quad \text{in } \Omega.$$
A first easy situation where the weak comparison principle holds is when $a \geq 0$. This is an easy consequence of the following proposition.

PROPOSITION 6.5.2 If $f\colon \Omega \times \mathbb{R} \longrightarrow \mathbb{R}$ is a function, such that:
(i) for all $\zeta \in \mathbb{R}$, the function $z \longmapsto f(z, \zeta)$ is measurable;
(ii) for almost all $z \in \Omega$, the function $\zeta \longmapsto f(z, \zeta)$ is continuous, nondecreasing;
(iii) for almost all $z \in \Omega$ and all $\zeta \in \mathbb{R}$, we have
$$|f(z, \zeta)| \leq a(z) + c|\zeta|^{p-1},$$
with $a \in L^{p'}(\Omega)_+$ (with $\frac{1}{p} + \frac{1}{p'} = 1$), $c > 0$
and $x, y \in W^{1,p}(\Omega)$ satisfy
$$-\Delta_p x + f\big(\cdot, x(\cdot)\big) \leq -\Delta_p y + f\big(\cdot, y(\cdot)\big) \quad \text{in } W^{-1,p'}(\Omega),$$
with $x|_{\partial\Omega} \leq y|_{\partial\Omega}$, then $x \leq y$ in $\Omega$.

PROOF Note that
$$(x - y)^+ \in W_0^{1,p}(\Omega).$$
Using this as a test function, we obtain
$$\int_{\{x > y\}} \Big(\|Dx\|^{p-2}_{\mathbb{R}^N}Dx - \|Dy\|^{p-2}_{\mathbb{R}^N}Dy, Dx - Dy\Big)_{\mathbb{R}^N}\,dz \leq \int_{\{x > y\}} \Big(f\big(z, y(z)\big) - f\big(z, x(z)\big)\Big)(x - y)(z)\,dz.$$
By hypothesis (ii)
$$\int_{\{x > y\}} \Big(f\big(z, y(z)\big) - f\big(z, x(z)\big)\Big)(x - y)(z)\,dz \leq 0.$$
So using Lemma 6.2.13, we obtain $(x - y)^+ = 0$ and so $x \leq y$.

COROLLARY 6.5.3 If $a \geq 0$, then $V_p$ satisfies the weak comparison principle.

REMARK 6.5.4 In fact the condition $a \geq 0$ is both necessary and sufficient for the weak comparison principle to hold. Moreover, in this case the operator
$$V_p^{-1}\colon W^{-1,p'}(\Omega) \longrightarrow L^p(\Omega)$$
is well defined, compact and order preserving (for details see García-Melián & Sabina de Lis (1998)).

Another situation where the weak comparison principle holds is given in the next theorem.
THEOREM 6.5.5 If $\lambda_{1,p}(a) > 0$, $x, y \in W^{1,p}(\Omega) \cap L^\infty(\Omega)$,
$$V_p(x), V_p(y) \in L^\infty(\Omega), \qquad x|_{\partial\Omega}, y|_{\partial\Omega} \in C^2(\partial\Omega),$$
$V_p(x) \leq V_p(y)$ almost everywhere on $\Omega$, $x|_{\partial\Omega} \leq y|_{\partial\Omega}$ and
$$V_p(y) \geq 0 \quad \text{almost everywhere on } \Omega,$$
with $y|_{\partial\Omega} \geq 0$, then $x \leq y$. Moreover, if $x|_{\partial\Omega} = y|_{\partial\Omega} = 0$, then the same conclusion is valid under the less restrictive hypotheses that $x, y \in W^{1,p}(\Omega)$, $V_p(x), V_p(y) \in L^\infty(\Omega)$,
$$V_p(x) \leq V_p(y) \quad \text{almost everywhere on } \Omega$$
and
$$V_p(y) \geq 0 \quad \text{almost everywhere on } \Omega.$$

PROOF If $y = 0$, then using as test function $x^+ \in W_0^{1,p}(\Omega)$, we obtain $x^+ = 0$ and so $x \leq 0$. Therefore, we may assume that $y \neq 0$. In fact since by hypothesis $V_p(y) \geq 0$ with $y|_{\partial\Omega} \geq 0$, using as a test function $-y^- \in W_0^{1,p}(\Omega)$, we obtain
$$\big\|Dy^-\big\|_p^p + \int_\Omega a|y^-|^p\,dz \leq 0,$$
hence $y^- = 0$ and so
$$y \geq 0, \quad y \neq 0.$$
From Theorem 6.2.7(a), we know that $x, y \in C^{1,\beta}(\overline{\Omega})$ for some $\beta \in (0, 1)$. Moreover, Theorem 6.2.8 implies that
$$y(z) > 0 \quad \forall\ z \in \Omega \qquad\text{and}\qquad \frac{\partial y}{\partial n}(z) < 0 \quad \forall\ z \in \partial\Omega,\ y(z) = 0.$$
This means that $y \in \mathrm{int}\, C^1(\overline{\Omega})_+$, where
$$C^1(\overline{\Omega})_+ \stackrel{df}{=} \big\{y \in C^1(\overline{\Omega}) : y(z) \geq 0 \text{ for all } z \in \overline{\Omega}\big\}$$
and so $y$ is an order unit in the ordered Banach space $C^1(\overline{\Omega})$. Hence we can find $c > 1$, such that $x \leq cy$. Let
$$h = V_p(y) \in L^\infty(\Omega) \quad\text{and}\quad g = y|_{\partial\Omega} \in C^1(\partial\Omega)$$
and consider the following boundary value problem:
$$\begin{cases} V_p(w) = -\Delta_p w + a\psi_p(w) = h & \text{in } \Omega, \\ w|_{\partial\Omega} = g & \text{on } \partial\Omega. \end{cases} \tag{6.83}$$
It is easy to prove that $v = x$ is a lower solution of (6.83) and $u = cy$ is an upper solution of (6.83). Then using standard truncation and penalization techniques (see, e.g., Gasiński & Papageorgiou (2005, Section 4.5)), we obtain a solution $w \in W^{1,p}(\Omega) \cap L^\infty(\Omega)$ of (6.83), such that
$$x(z) \leq w(z) \leq cy(z) \quad \text{for a.a. } z \in \Omega.$$
Moreover, since by hypothesis $\lambda_{1,p}(a) > 0$ and $h \geq 0$, from Theorem 6.4.6, we have that $w \geq 0$.

Claim. Problem (6.83) has a unique nonnegative solution in $W^{1,p}(\Omega) \cap L^\infty(\Omega)$.

Suppose that $w_1, w_2 \in W^{1,p}(\Omega) \cap L^\infty(\Omega)$ are two nonnegative solutions of (6.83). Then from Theorems 6.2.7 and 6.2.8, we have that
$$\frac{w_1}{w_2}, \frac{w_2}{w_1} \in L^\infty(\Omega).$$
A careful reading of the proof of Lemma 6.4.3 reveals that its conclusions remain valid even if the two functions do not vanish on $\partial\Omega$ and only have the same value. Then
$$0 \leq I(w_1, w_2) = \int_\Omega h\,\frac{w_2^{p-1} - w_1^{p-1}}{w_1^{p-1}w_2^{p-1}}\big(w_1^p - w_2^p\big)\,dz \leq 0$$
and so $I(w_1, w_2) = 0$, which, by Lemma 6.4.3, implies that $w_2 = \gamma w_1$ for some $\gamma > 0$. Hence $\gamma^{p-1}h = h$, which means that $\gamma = 1$ and so $w_1 = w_2$.

Using the Claim, we infer that $w = y$ and so we conclude that $x \leq y$. The last statement of the theorem is a consequence of Theorem 6.2.6.
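Next we derive strong comparison principles (i.e., we show that $x < y$) for the p-Laplacian differential operator (i.e., we assume that $a \equiv 0$).

THEOREM 6.5.6 (Strong Comparison Principle I) If $x, y \in W^{1,p}(\Omega) \cap L^\infty(\Omega)$, $h, g \in L^\infty(\Omega)$ are such that
$$-\mathrm{div}\big(\|Dx(z)\|^{p-2}_{\mathbb{R}^N}Dx(z)\big) = h(z) \quad \text{for a.a. } z \in \Omega, \tag{6.84}$$
$$-\mathrm{div}\big(\|Dy(z)\|^{p-2}_{\mathbb{R}^N}Dy(z)\big) = g(z) \quad \text{for a.a. } z \in \Omega, \tag{6.85}$$
$x \leq y$ and $h \leq g$ almost everywhere on $\Omega$ and the set
$$K \stackrel{df}{=} \big\{z \in \Omega : x(z) = y(z)\big\}$$
is compact, then $K = \emptyset$.

PROOF From nonlinear regularity theory (see Theorem 6.2.7), we know that $x, y \in C^{1,\beta}(\Omega)$ for some $\beta \in (0, 1)$. We argue indirectly. Suppose that $K$ is nonempty and compact. We can find an open set $U$ with $C^2$-boundary $\partial U$, such that
$$K \subseteq U \subseteq \overline{U} \subseteq \Omega \quad\text{and}\quad x(z) < y(z) \quad \forall\ z \in \overline{U} \setminus K.$$
For a given $\varepsilon > 0$, let $x_\varepsilon, y_\varepsilon \in W^{1,p}(U)$ be the unique solutions of
$$-\mathrm{div}\Big(\big(\varepsilon + \|Dx_\varepsilon(z)\|^2_{\mathbb{R}^N}\big)^{\frac{p-2}{2}}Dx_\varepsilon(z)\Big) = h(z) \quad \text{for a.a. } z \in U, \qquad x_\varepsilon|_{\partial U} = x|_{\partial U}$$
and
$$-\mathrm{div}\Big(\big(\varepsilon + \|Dy_\varepsilon(z)\|^2_{\mathbb{R}^N}\big)^{\frac{p-2}{2}}Dy_\varepsilon(z)\Big) = g(z) \quad \text{for a.a. } z \in U, \qquad y_\varepsilon|_{\partial U} = y|_{\partial U}$$
respectively. The result of Lieberman (1988) that gave us Theorem 6.2.7 also implies that $x_\varepsilon, y_\varepsilon \in C^{1,\gamma}(\overline{U})$ for some $\gamma \in (0, 1)$ independent of $\varepsilon \in (0, 1]$ and are bounded. Similarly $\{Dx_\varepsilon\}_{\varepsilon \in (0,1]}, \{Dy_\varepsilon\}_{\varepsilon \in (0,1]} \subseteq L^p\big(U; \mathbb{R}^N\big)$ are bounded. Therefore we may assume that
$$\lim_{\varepsilon \searrow 0} x_\varepsilon = x \quad\text{and}\quad \lim_{\varepsilon \searrow 0} y_\varepsilon = y$$
weakly in $W^{1,p}(U)$ and strongly in $C^{1,\beta}(\overline{U})$ for $\beta \in (0, \gamma)$ (recall that the embedding $C^{1,\gamma}(\overline{U}) \subseteq C^{1,\beta}(\overline{U})$ is compact for $\beta \in (0, \gamma)$). Let $V$ be an open subset of $\Omega$, such that
$$K \subseteq V \subseteq \overline{V} \subseteq U$$
and let
$$m_V \stackrel{df}{=} \min_{z \in \partial V}\big(y - x\big)(z) > 0 \quad\text{and}\quad w_\varepsilon \stackrel{df}{=} y_\varepsilon - x_\varepsilon.$$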
Choose ε > 0 small enough, such that mV 4
and
kx − xε kL∞ (V )
µ°b ξ °RN
N ∀ ξb = (ξi )N i=1 ∈ R ,
(6.87)
i,j=1
for some µ > 0. We choose an open neighbourhood Wr of the compact set Kr , such that Wr ⊆ V, yε − xε > r
on ∂Wr
and (6.87) is still valid with $\mu > 0$ replaced by $\frac{\mu}{2}$ (recall that $Dy_\varepsilon$ and $Dx_\varepsilon$ are continuous). Then by the strong maximum principle (see Theorem 6.1.44), we conclude that $w_\varepsilon = y_\varepsilon - x_\varepsilon$ is constant on $W_r$, a contradiction. This proves that $K = \emptyset$.

The next strong comparison theorem gives additional information about the normal derivatives on the boundary, reminiscent of the Hopf boundary lemma for linear differential operators (see Lemma 6.1.43).

THEOREM 6.5.7 (Strong Comparison Principle II) If $x, y \in C_0^1(\overline{\Omega})$, $h, g \in L^\infty(\Omega)$ are such that
$$-\mathrm{div}\big(\|Dx(z)\|^{p-2}_{\mathbb{R}^N}Dx(z)\big) = h(z) \quad \text{for a.a. } z \in \Omega,$$
$$-\mathrm{div}\big(\|Dy(z)\|^{p-2}_{\mathbb{R}^N}Dy(z)\big) = g(z) \quad \text{for a.a. } z \in \Omega,$$
$$0 \leq h(z) \leq g(z) \quad \text{for a.a. } z \in \Omega$$
and the set
$$C \stackrel{df}{=} \big\{z \in \Omega : h(z) = g(z)\big\}$$
has empty interior, then
$$0 \leq x(z) < y(z) \qquad \forall\ z \in \Omega \tag{6.88}$$
and
$$\frac{\partial y}{\partial n}(z) < \frac{\partial x}{\partial n}(z) \qquad \forall\ z \in \partial\Omega. \tag{6.89}$$
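PROOF Note that the set
$$\big\{z \in \Omega : g(z) > 0\big\}$$
has positive measure and so $y \neq 0$. Therefore we can apply Theorem 6.2.8 and obtain
$$y(z) > 0 \quad \forall\ z \in \Omega \qquad\text{and}\qquad \frac{\partial y}{\partial n}(z) < 0 \quad \forall\ z \in \partial\Omega.$$
Moreover, from Theorem 6.5.5, we have that
$$x \leq y \quad \text{in } \Omega$$
and we can easily check that $x \geq 0$. Consider the coincidence set
$$K \stackrel{df}{=} \big\{z \in \Omega : x(z) = y(z)\big\}$$
and suppose that it is not empty. Since, by Theorem 6.5.6, $K$ cannot be compact, we can find a sequence $\{z_n\}_{n \geq 1} \subseteq K$ and $z_0 \in \partial\Omega$, such that $z_n \longrightarrow z_0$ (recall that $x|_{\partial\Omega} = y|_{\partial\Omega} = 0$). Since $x, y \in C^1(\overline{\Omega})$, we have that
$$\frac{\partial x}{\partial n}(z_0) = \frac{\partial y}{\partial n}(z_0) = -\mu^2 < 0. \tag{6.90}$$
The function $w = y - x$ satisfies
$$-\sum_{i,j=1}^N \frac{\partial}{\partial z_i}\Big(a_{ij}\frac{\partial w}{\partial z_j}\Big) = g - h \geq 0,$$
with $a_{ij} = a_{ij}^0$ (see (6.86) with $\varepsilon = 0$). Then
$$a_{ij}(z_0) = \Big|\frac{\partial y}{\partial n}(z_0)\Big|^{p-4}\Big[\delta_{ij}\Big|\frac{\partial y}{\partial n}(z_0)\Big|^2 + (p-2)\frac{\partial y}{\partial z_i}(z_0)\frac{\partial y}{\partial z_j}(z_0)\Big].$$
Exploiting the continuity of $Dx$ and $Dy$, we can find a ball $B \subseteq \Omega$, such that $z_0 \in \partial B$ and the elliptic operator, defined by the $a_{ij}$, is uniformly elliptic in $B$. Hence once again by the strong maximum principle (see Theorem 6.1.44), we infer that either $w = 0$, which cannot happen since by hypothesis $\mathrm{int}\, C = \emptyset$, or
$$w > 0 \ \text{ in } B \quad\text{and}\quad \frac{\partial w}{\partial n}(z_0) < 0,$$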
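which contradicts (6.90). Therefore
$$x(z) < y(z) \qquad \forall\ z \in \Omega.$$
Finally since
$$\frac{\partial y}{\partial n} \leq \frac{\partial x}{\partial n} \leq 0 \quad \text{on } \partial\Omega \qquad\text{and}\qquad \frac{\partial y}{\partial n}(z_0) < 0,$$
as above we conclude that
$$\frac{\partial y}{\partial n}(z) < \frac{\partial x}{\partial n}(z) \qquad \forall\ z \in \partial\Omega.$$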
A related problem deals with the oscillation/nonoscillation of the solutions (Sturmian theory). Picone's identity (see Theorem 6.4.8) can be useful in this direction.

THEOREM 6.5.8 If $a_1, a_2 \in L^\infty(\Omega)$, $a_2 \leq a_1 \leq 0$ almost everywhere on $\Omega$, $a_1 \neq a_2$ and $x \in C_0^1(\overline{\Omega})$ is a positive solution of
$$\begin{cases} -\mathrm{div}\big(\|Dx(z)\|^{p-2}_{\mathbb{R}^N}Dx(z)\big) + a_1(z)|x(z)|^{p-2}x(z) = 0 & \text{for a.a. } z \in \Omega, \\ x|_{\partial\Omega} = 0, \end{cases}$$
then any solution $y \in C_0^1(\overline{\Omega})$ of
$$\begin{cases} -\mathrm{div}\big(\|Dy(z)\|^{p-2}_{\mathbb{R}^N}Dy(z)\big) + a_2(z)|y(z)|^{p-2}y(z) = 0 & \text{for a.a. } z \in \Omega, \\ y|_{\partial\Omega} = 0 \end{cases}$$
must change sign.

PROOF We argue indirectly. So assume without any loss of generality that
$$y(z) > 0 \qquad \forall\ z \in \Omega$$
(see Theorem 6.4.9). We have
$$0 \leq \int_\Omega L(x, y)\,dz = \int_\Omega \big(a_2 - a_1\big)x^p\,dz \leq 0,$$
so $x = cy$, for some $c > 0$. But this contradicts the hypothesis that $a_1 \neq a_2$. So we conclude that $y$ must change sign.
6.6 Remarks
6.1: The eigenvalue theory of symmetric elliptic differential operators is standard and can be found in many books on partial differential equations. Indicatively we mention the books of Courant & Hilbert (1953, 1989) (a classic), Di Benedetto (1995), Evans (1998), Jost (2002), Protter & Weinberger (1967) and Renardy & Rogers (1993). The minimax expressions for the eigenvalues (see Theorem 6.1.23) can be traced back to Courant (see Courant & Hilbert (1953, 1989)). Our analysis of the weighted eigenvalue problem with an indefinite weight follows the work of de Figueiredo (1982). The results on the spectrum of self-adjoint elliptic operators can be viewed as an infinite dimensional generalization of the well known result from linear algebra which says that a real, symmetric matrix has real eigenvalues and an orthonormal basis of eigenvectors. For nonsymmetric, elliptic operators in nondivergence form
$$Lx \stackrel{df}{=} -\sum_{i,j=1}^N a_{ij}\frac{\partial^2 x}{\partial z_i\partial z_j} + \sum_{i=1}^N b_i\frac{\partial x}{\partial z_i} + a_0x,$$
with $a_{ij} = a_{ji}$, $a_{ij}, b_i, a_0 \in C^\infty(\overline{\Omega})$, the principal eigenvalue of $L$ is real, simple and the corresponding eigenfunction does not change sign in $\Omega$. Another important result of the linear eigenvalue theory is the so-called Courant's nodal domain theorem (see Courant & Hilbert (1953)).

THEOREM 6.6.1 (Courant Nodal Domain Theorem) If $0 < \lambda_1 < \lambda_2 \leq \ldots \leq \lambda_n \leq \ldots$ are the eigenvalues of $\big(-\Delta, H_0^1(\Omega)\big)$, $\{u_n\}_{n \geq 1} \subseteq H_0^1(\Omega)$ are the corresponding eigenfunctions and
$$S_n \stackrel{df}{=} \big\{z \in \Omega : u_n(z) = 0\big\}$$
(the nodal set of $u_n$), then $\Omega \setminus S_n$ has at most $n$ components.

We should point out that the eigenfunctions of $-\Delta$ (Dirichlet or Neumann) have the so-called unique continuation property, namely if $u$ is an eigenfunction, then the set
$$\big\{z \in \Omega : u(z) = 0\big\}$$
has an empty interior.

The regularity theory of elliptic differential equations is a very broad subject. Here we limited ourselves to the case where the coefficients $a_{ij}$ are regular. The general case of bounded measurable coefficients $a_{ij}$ was resolved independently by De Giorgi (1957) and Nash (1958), who employed different methods. Later Moser (1960) produced a more compact proof based on a version of the Harnack inequality. The result of De Giorgi-Nash for elliptic equations with discontinuous coefficients says the following.
THEOREM 6.6.2 If $x \in H^1(\Omega)$ is a weak solution of
$$-\sum_{i,j=1}^N \frac{\partial}{\partial z_i}\Big(a_{ij}(z)\frac{\partial x}{\partial z_j}(z)\Big) = 0 \quad \text{in } \Omega,$$
where the coefficients $a_{ij}$ are bounded, measurable and satisfy
$$c\|\xi\|^2_{\mathbb{R}^N} \leq \sum_{i,j=1}^N a_{ij}(z)\xi_i\xi_j \qquad \forall\ z \in \Omega,\ \xi = (\xi_i)_{i=1}^N \in \mathbb{R}^N,$$
$$|a_{ij}(z)| \leq M \qquad \forall\ z \in \Omega,$$
with $0 < c < M < +\infty$, then $x$ is Hölder continuous.

Additional results in this direction can be found in Stampacchia (1965).

The maximum principle for harmonic and subharmonic functions goes back to Gauss. Many books contain expositions of the maximum principle. We mention those by Courant & Hilbert (1953, 1989), Evans (1998), Gilbarg & Trudinger (2001), Nečas (1967), Protter & Weinberger (1967) and Renardy & Rogers (1993). The strong maximum principle (see Lemma 6.1.43 and Theorem 6.1.44) is essentially due to Hopf (1927). See also Hopf (1952), Oleinik (1952) and Pucci (1952). The maximum principle can be used to prove a result on the removability of isolated singularities of the Laplace operator.

PROPOSITION 6.6.3 If $z_0 \in \mathbb{R}^N$ ($N \geq 2$), $r > 0$ and $u\colon B_r(z_0) \setminus \{z_0\} \longrightarrow \mathbb{R}$ is bounded and harmonic, then there exists a harmonic function $\widehat{u}\colon B_r(z_0) \longrightarrow \mathbb{R}$, such that $\widehat{u}|_{B_r(z_0) \setminus \{z_0\}} = u$.

REMARK 6.6.4 Consider the problem
$$\begin{cases} \Delta u(z) = 0 & \text{in } B_1(0) \setminus \{0\}, \\ u|_{\partial B_1(0)} = 0, \quad u(0) = 1. \end{cases}$$
Then by virtue of Proposition 6.6.3, this problem has no solution. Indeed if $u$ is such a solution, then from the above proposition $u$ admits a harmonic extension $\widehat{u}\colon B_1(0) \longrightarrow \mathbb{R}$, such that $\widehat{u}|_{\partial B_1(0)} = 0$. Then $\widehat{u} \equiv 0$ and so $\widehat{u}(0) = 0 \neq 1$.
6.2: The systematic study of the p-Laplacian differential operator started in the mid-eighties. First de Thélin (1986) proved that there exists only one first eigenfunction among the radial functions in a ball. Then the study was extended to more general domains and variational results were obtained by Anane (1987, 1987–1988), Bhattacharya (1988), Guedda & Véron (1988), Lindqvist (1990, 1992). Theorem 6.2.6 is a particular case of a more general result due to Ladyzhenskaya & Uraltseva (1968, p. 286). The variational characterization of $\lambda_2$ given in Theorem 6.2.24 is due to Anane & Tsouli (1996). More on the nonlinear weighted eigenvalue problem can be found in Binding, Drábek & Huang (1997a), Godoy, Gossez & Paczka (2002) and Huang (1990). The eigenfunctions of $-\Delta_p$ in general do not have the unique continuation property. See Martio (1988) for a counterexample. However, if we assume that the unique continuation property holds, then we can have a generalization of Courant's nodal domain theorem (see Theorem 6.6.1) to the case of the p-Laplacian. Namely we have the following result due to Drábek & Robinson (2002).

THEOREM 6.6.5 If $-\Delta_p$ satisfies the unique continuation property and $u_{\lambda_n}$ is an eigenfunction associated with the eigenvalue $\lambda_n > 0$, then $u_{\lambda_n}$ has at most $n$ nodal domains (recall that a nodal domain is defined as a maximal connected, open subset (connected component) of $\big\{z \in \Omega : u_{\lambda_n}(z) \neq 0\big\}$).

As we already mentioned, the Fredholm alternative may not hold for $-\Delta_p$, with $p \neq 2$ (see Binding, Drábek & Huang (1997b), Drábek & Holubová (2001) for the partial p-Laplacian and del Pino, Drábek & Manásevich (1999) for the ordinary p-Laplacian).

6.3: The regularity properties of the eigenfunctions of the nonnegative scalar ordinary p-Laplacian with Dirichlet boundary conditions were established by Ôtani (1984). The complete sequence $\{\lambda_n\}_{n \geq 1}$ of eigenvalues for problem (6.50) was obtained by Guedda & Véron (1988) and del Pino, Elgueta & Manásevich (1989).

The eigenvalues for the Neumann problem (see (6.65)) and for the periodic problem (see (6.66)) can be found in Drábek & Manásevich (1999), where a more general nonlinear eigenvalue problem is studied, namely
$$-\big(|x'(t)|^{p-2}x'(t)\big)' = \lambda|x(t)|^{r-2}x(t) \quad \text{for a.a. } t \in T = [0, b],$$
with $p, r \in (1, +\infty)$. The weighted eigenvalue problems with Dirichlet and periodic boundary conditions were studied by Zhang (2001). He examined the following nonlinear eigenvalue problem:
$$-\big(|x'(t)|^{p-2}x'(t)\big)' = \big(\lambda + a(t)\big)|x(t)|^{p-2}x(t) \quad \text{for a.a. } t \in T = [0, b],$$
with $a \in L^1(T)$, $p \in (1, +\infty)$ and with Dirichlet or periodic boundary conditions. Zhang proved that the Dirichlet eigenvalue problem has a sequence of eigenvalues
$$-\infty < \lambda_1(a) < \lambda_2(a) < \ldots < \lambda_n(a) < \ldots,$$
with $\lambda_n(a) \longrightarrow +\infty$. All these eigenvalues are simple and to the first (principal) eigenvalue $\lambda_1(a)$ corresponds an eigenfunction $\widehat{u}_1$ which satisfies $\widehat{u}_1(t) > 0$ for all $t \in (0, b)$. For the periodic eigenvalue problem, there is a sequence of eigenvalues
$$-\infty < \lambda_0(a) < \underline{\lambda}_2(a) \leq \overline{\lambda}_2(a) < \ldots < \underline{\lambda}_{2n}(a) \leq \overline{\lambda}_{2n}(a) < \ldots,$$
with $\underline{\lambda}_{2n} \longrightarrow +\infty$ (hence $\overline{\lambda}_{2n} \longrightarrow +\infty$). If $p = 2$, then $\big\{\lambda_0(a), \underline{\lambda}_{2n}(a), \overline{\lambda}_{2n}(a)\big\}_{n \geq 1}$ cover all the eigenvalues of the weighted periodic eigenvalue problem. If $p \neq 2$, we do not know if this is the case. The eigenvalues $\lambda_0(a), \underline{\lambda}_{2n}(a), \overline{\lambda}_{2n}(a)$ have the following monotonicity property with respect to the weight function $a \in L^1(T)$. Namely, if $a, c \in L^1(T)$, $a(t) \leq c(t)$ for almost all $t \in T$ and the inequality is strict on a set of positive measure, then
$$\lambda_0(a) > \lambda_0(c), \quad \underline{\lambda}_{2n}(a) > \underline{\lambda}_{2n}(c) \quad\text{and}\quad \overline{\lambda}_{2n}(a) > \overline{\lambda}_{2n}(c) \qquad \forall\ n \geq 1.$$
Moreover, the functions $a \longmapsto \lambda_0(a)$, $a \longmapsto \underline{\lambda}_{2n}(a)$ and $a \longmapsto \overline{\lambda}_{2n}(a)$ are continuous on $L^1(T)$ with the norm topology.

The spectrum of the vector ordinary p-Laplacian was studied by Manásevich & Mawhin (1998, 2000), Mawhin (2001) and Zhang (2000). Proposition 6.3.11 was established by Manásevich & Mawhin (1998), as a corollary of a more general existence theorem. Here we give a direct proof.

6.4: Maximum principle-type results for the p-Laplacian were obtained by Allegretto & Huang (1999) (see Theorem 6.4.9) and García-Melián & Sabina de Lis (1998) (see Theorems 6.4.5 and 6.4.6). Lemma 6.4.3 is due to Anane (1987). Maximum principles for more general quasilinear operators can be found in Damascelli (1998). Nonlinear analogs of Harnack's inequality in terms of the p-Laplacian can be found in Damascelli (1998), Tolksdorf (1983) and Trudinger (1967). There is also the so-called antimaximum principle. Namely, consider the following problem:
$$\begin{cases} -\mathrm{div}\big(\|Dx(z)\|^{p-2}_{\mathbb{R}^N}Dx(z)\big) = \lambda|x(z)|^{p-2}x(z) + h(z) & \text{for a.a. } z \in \Omega, \\ Bx = 0. \end{cases} \tag{6.91}$$
Here $Bx$ represents either Dirichlet or Neumann boundary conditions, $p \in (1, +\infty)$ and $h \in L^\infty(\Omega)$.
THEOREM 6.6.6 (Antimaximum Principle) If $h \in L^\infty(\Omega)$, $h(z) > 0$ for almost all $z \in \Omega$, then there exists $\delta = \delta(h) > 0$, such that for all $\lambda \in (\lambda_1, \lambda_1 + \delta)$ any solution $x \in C^1(\overline{\Omega})$ of (6.91) satisfies
$$x(z) < 0 \qquad \forall\ z \in \Omega.$$

For $p = 2$, the antimaximum principle was proved by Clément & Peletier (1979), Hess (1981) and Godoy, Gossez & Paczka (1999) (the last two works with a weight function). For $p \neq 2$, the antimaximum principle was proved by Fleckinger, Gossez, Takáč & de Thélin (1995) and Godoy, Gossez & Paczka (2002) (problems with a weight).

6.5: Comparison results involving the p-Laplacian were proved by García-Melián & Sabina de Lis (1998) (see Theorem 6.5.5), Guedda & Véron (1989) (see Theorems 6.5.6 and 6.5.7) and Allegretto & Huang (1999) (see Theorem 6.5.8). Additional results can be found in Allegretto & Huang (1998) and for more general quasilinear degenerate operators in Damascelli (1998).
Chapter 7 Fixed Point Theory

Fixed point theorems are the basic mathematical tools for showing the existence of solutions of various kinds of equations and inclusions. Fixed point theory is at the heart of nonlinear analysis, since it provides the necessary tools to obtain existence theorems in many different nonlinear problems. It draws its tools from both analysis and topology and for this reason we have the informal (and imprecise) classification into “metric fixed point theory” and “topological fixed point theory.”

Section 7.1 deals with the metric fixed point theory. Roughly speaking this includes all results for which the metric structure of the underlying space and/or the metric properties of the map involved play a crucial role. The most characteristic example of such a result is the celebrated Banach contraction principle. In fact many of the results in this group are outgrowths of this theorem. Essentially Banach's fixed point theorem is a clever abstraction of the method of successive approximations, well known from the theory of differential equations. We present several generalizations of Banach's theorem, we examine nonexpansive maps and we also present Caristi's theorem, which is a remarkably deep generalization of the contraction principle.

In Section 7.2, we examine the other group of fixed point theorems, which constitute the so-called “topological fixed point theory.” Here the topological properties of the space and/or of the map involved are crucial. In particular the notion of compactness is basic in our considerations. The archetypical topological fixed point theorems are Brouwer's fixed point theorem and its infinite dimensional generalization, the Schauder-Tychonoff fixed point theorem. Our presentation is centered around these two fundamental results.

In many applications, order induced by a particular binary relation plays a central role. So in Section 7.3 we examine how order structures enter in fixed point theory. Special emphasis is given to ordered Banach spaces and the different types of order cones which characterize them. Also we introduce the notion of fixed point index, which is the main tool in obtaining fixed point theorems of the cone expansion and compression type.

Finally in Section 7.4 we show that most of the fixed point theorems for single valued maps have their counterparts for set valued maps (multifunctions). Such results are important in many applications, such as optimization and mathematical economics.
7.1 Metric Fixed Point Theory
In this section we present some fixed point theorems in which geometric conditions on the underlying space and/or mappings play a crucial role. So the results presented in this section are within the framework of at least a metric space, usually of a Banach space, and the analysis involves both the topological and the geometrical structure of the space combined with metric constraints on the map. We start with the celebrated “Banach Contraction Principle,” which, since its appearance in Banach's thesis in 1922, found many applications in different parts of mathematical analysis. First a definition.

DEFINITION 7.1.1 Let $(X, d_X)$ be a metric space and let $\varphi\colon X \longrightarrow X$. We say that $\varphi$ is:
(a) a contraction, if there is $\beta \in (0, 1)$, such that
$$d_X\big(\varphi(x), \varphi(y)\big) \leq \beta d_X(x, y) \qquad \forall\ x, y \in X;$$
(b) a nonexpansive map, if
$$d_X\big(\varphi(x), \varphi(y)\big) \leq d_X(x, y) \qquad \forall\ x, y \in X.$$
THEOREM 7.1.2 (Banach Fixed Point Theorem) If $(X, d_X)$ is a complete metric space and $\varphi : X \to X$ is a contraction, then $\varphi$ has a unique fixed point $y \in X$ and
\[ \varphi^{(n)}(x) \to y \quad \forall\, x \in X, \]
where the sequence $\{\varphi^{(n)}(x)\}_{n \ge 1}$ is defined inductively by
\[ \varphi^{(0)}(x) = x \quad \text{and} \quad \varphi^{(n+1)}(x) = \varphi\big(\varphi^{(n)}(x)\big). \]

PROOF Let $x \in X$ and consider the sequence of iterates $\{\varphi^{(n)}(x)\}_{n \ge 1}$. Then for any $n, m \ge 1$, we have
\[ d_X\big(\varphi^{(n)}(x), \varphi^{(n+m)}(x)\big) \le \sum_{k=n}^{n+m-1} d_X\big(\varphi^{(k)}(x), \varphi^{(k+1)}(x)\big) \le \sum_{k=n}^{n+m-1} \beta^k d_X\big(x, \varphi(x)\big) \le \frac{\beta^n}{1-\beta}\, d_X\big(x, \varphi(x)\big) \]
(since $\beta \in (0,1)$). Because
\[ \frac{\beta^n}{1-\beta} \to 0, \]
it follows that $\{\varphi^{(n)}(x)\}_{n \ge 1}$ is a Cauchy sequence in $X$ and so, exploiting the completeness of $(X, d_X)$, we have that
\[ \varphi^{(n)}(x) \to y \quad \text{in } X. \]
Then
\[ \varphi^{(n+1)}(x) = \varphi\big(\varphi^{(n)}(x)\big) \to \varphi(y) \]
and so $\varphi(y) = y$, i.e., $y \in X$ is a fixed point of $\varphi$. If $y$ and $y'$ are both fixed points of $\varphi$, then
\[ d_X(y, y') = d_X\big(\varphi(y), \varphi(y')\big) \le \beta\, d_X(y, y'), \]
a contradiction unless $y = y'$. So the fixed point of $\varphi$ is unique.

REMARK 7.1.3 From the above proof we have that
\[ d_X\big(\varphi^{(n)}(x), \varphi^{(n+m)}(x)\big) \le \frac{\beta^n}{1-\beta}\, d_X\big(x, \varphi(x)\big) \quad \forall\, m \ge 1. \]
Hence
\[ d_X\big(\varphi^{(n)}(x), y\big) = \lim_{m \to +\infty} d_X\big(\varphi^{(n)}(x), \varphi^{(n+m)}(x)\big) \le \frac{\beta^n}{1-\beta}\, d_X\big(x, \varphi(x)\big). \]
This inequality gives an estimate for the error in the $n$-th iteration starting from any $x \in X$. Also for any $x \in X$, we have
\[ d_X(x, y) \le \frac{1}{1-\beta}\, d_X\big(x, \varphi(x)\big). \]
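The error estimate of Remark 7.1.3 can be used as a stopping rule for the iteration in Theorem 7.1.2. The following is a minimal sketch under stated assumptions: the map $\varphi = \cos$ on $X = [0,1]$ with contraction constant $\beta = \sin 1$ is an illustrative choice that does not come from the text, and the function name `banach_iterate` is hypothetical.

```python
import math

def banach_iterate(phi, x0, beta, tol):
    """Iterate x_{n+1} = phi(x_n) until the a priori bound of Remark 7.1.3
    guarantees that the distance to the fixed point is at most tol."""
    d0 = abs(x0 - phi(x0))      # d(x, phi(x)), the first displacement
    x, n = x0, 0
    # a priori bound: d(phi^(n)(x), y) <= beta**n / (1 - beta) * d(x, phi(x))
    while beta ** n / (1.0 - beta) * d0 > tol:
        x = phi(x)
        n += 1
    return x, n

# Assumed example: cos maps [0, 1] into itself and |cos'| <= sin(1) < 1 there.
beta = math.sin(1.0)
y, steps = banach_iterate(math.cos, 1.0, beta, 1e-10)
```

The number of steps is fixed before the iteration starts, since the bound depends only on $\beta$ and on the first displacement $d_X(x, \varphi(x))$; in practice the true error is often much smaller than this bound.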
EXAMPLE 7.1.4 (a) In Theorem 7.1.2 we cannot assume that the map is nonexpansive instead of a contraction. Let $X = c_0$, the space of all $\mathbb{R}$-valued sequences $\widehat{x} = \{x_n\}_{n \ge 1}$, such that $x_n \to 0$. Furnished with the supremum norm
\[ \|\widehat{x}\|_\infty = \sup_{n \ge 1} |x_n|, \]
$X = c_0$ becomes a Banach space. The map $\varphi : X \to X$ defined by
\[ \varphi(\widehat{x}) \stackrel{df}{=} (1, x_1, x_2, \ldots) \quad \forall\, \widehat{x} \in X \]
is an isometry but clearly it is fixed point free.

(b) Let $T = [0, b]$, let $Y$ be a separable Banach space and let $f : T \times Y \to Y$ be a function, such that for all $y \in Y$, the function $t \mapsto f(t, y)$ is measurable and
\[ \|f(t, y) - f(t, z)\|_Y \le k(t)\, \|y - z\|_Y \quad \text{for a.a. } t \in T \text{ and all } y, z \in Y, \]
where $k \in L^1(T)_+$. Consider the Cauchy problem
\[ \begin{cases} x'(t) = f\big(t, x(t)\big) & \text{for a.a. } t \in T, \\ x(0) = y_0 \in Y. \end{cases} \]
We claim that this problem has a unique solution. Indeed note that this existence problem is equivalent to the fixed point problem
\[ x = \varphi(x), \quad x \in X = C(T; Y), \]
where $\varphi : X \to X$ is defined by
\[ \varphi(x)(t) \stackrel{df}{=} y_0 + \int_0^t f\big(s, x(s)\big)\, ds \quad \forall\, t \in T,\ x \in X. \]
On $X$ we consider the equivalent norm
\[ |x| \stackrel{df}{=} \max_{t \in T} \Big[ \|x(t)\|_Y\, e^{-\lambda t} \Big], \]
with $\lambda > 0$. Then we obtain
\[ \big\|\varphi(x)(t) - \varphi(z)(t)\big\|_Y \le \frac{\|k\|_1}{\lambda}\, e^{\lambda t}\, |x - z| \quad \forall\, t \in T,\ x, z \in X, \]
so
\[ \big|\varphi(x) - \varphi(z)\big| \le \frac{\|k\|_1}{\lambda}\, |x - z| \quad \forall\, x, z \in X. \]
If we choose $\lambda > \|k\|_1$, we see that $\varphi$ is a contraction for the $|\cdot|$-norm and so we can apply Theorem 7.1.2. This remetrization trick is often used in applying Theorem 7.1.2.
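The fixed point formulation in Example 7.1.4(b) is also a practical scheme: iterating $\varphi$ is the method of successive approximations. The sketch below is only an illustration under assumptions that are not in the text: the scalar problem $x' = x$, $x(0) = 1$ on $T = [0,1]$, a uniform grid, and the trapezoidal rule for the integral. The names `picard_step` and `picard` are hypothetical.

```python
import math

def picard_step(f, t, x, y0):
    """Return phi(x) on the grid t, where
    phi(x)(t) = y0 + int_0^t f(s, x(s)) ds,
    with the integral approximated by the composite trapezoidal rule."""
    out = [y0]
    for i in range(1, len(t)):
        h = t[i] - t[i - 1]
        area = 0.5 * h * (f(t[i - 1], x[i - 1]) + f(t[i], x[i]))
        out.append(out[-1] + area)
    return out

def picard(f, y0, b, n_grid=1000, n_iter=30):
    """Apply phi repeatedly, starting from the constant function y0."""
    t = [b * i / n_grid for i in range(n_grid + 1)]
    x = [y0] * len(t)
    for _ in range(n_iter):
        x = picard_step(f, t, x, y0)
    return t, x

# Assumed test problem: x' = x, x(0) = 1 on [0, 1], with exact solution exp(t).
t, x = picard(lambda s, y: y, 1.0, 1.0)
```

For this test problem the value `x[-1]` approximates $e$; the remaining discrepancy comes from the quadrature rule, not from the fixed point iteration.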
We can use Theorem 7.1.2 to establish the following invariance of domain result.

PROPOSITION 7.1.5 If $X$ is a Banach space, $U \subseteq X$ is an open set and $\varphi : U \to X$ is a contraction, then
(a) $\psi = \mathrm{id}_X - \varphi : U \to X$ is an open map;
(b) $\psi : U \to \psi(U)$ is a homeomorphism.

PROOF (a) Let $x \in U$ and $r > 0$ be such that
\[ B_r(x) = \big\{ y \in X : \|y - x\|_X < r \big\} \subseteq U. \]
Choose any $y \in B_{(1-\beta)r}\big(\psi(x)\big)$, where $\beta \in (0,1)$ is the contraction constant of the map $\varphi$. We will show that $y \in \psi\big(B_r(x)\big)$. To this end let
\[ h(z) \stackrel{df}{=} \varphi(z) + y \quad \forall\, z \in \overline{B}_r(x). \]
We have
\[ \begin{aligned} \|h(z) - x\|_X &= \|\varphi(z) + y - x\|_X \\ &\le \|\varphi(z) - \varphi(x)\|_X + \|\varphi(x) + y - x\|_X \\ &\le \beta\, \|z - x\|_X + \|\psi(x) - y\|_X \\ &< \beta r + (1-\beta) r = r. \end{aligned} \]
Therefore $h : \overline{B}_r(x) \to B_r(x)$ and since $h$ is a contraction (with constant $\beta$), we can apply Theorem 7.1.2 and obtain $u \in B_r(x)$, such that $u = h(u)$, hence $\psi(u) = y$, which proves that $\psi$ is open.

(b) Since $\psi : U \to \psi(U)$ is a continuous, open bijection, it is a homeomorphism.

REMARK 7.1.6 If $\varphi : X \to X$ is a contraction, then $\psi = \mathrm{id}_X - \varphi : X \to X$ is a homeomorphism of $X$ onto itself.
There are many generalizations of Theorem 7.1.2, which are obtained by weakening the contraction property of the map $\varphi$.

THEOREM 7.1.7 If $(X, d_X)$ is a complete metric space and $\varphi : X \to X$, then $\varphi$ has a unique fixed point $y \in X$ and
\[ \varphi^{(n)}(x) \to y \quad \forall\, x \in X, \]
provided that one of the following conditions is satisfied:

(a) $d_X\big(\varphi^{(m)}(x), \varphi^{(m)}(z)\big) \le \beta\, d_X(x, z)$ for some $m \ge 1$, some $\beta \in (0,1)$ and all $x, z \in X$;

(b) $d_X\big(\varphi^{(m)}(x), \varphi^{(m)}(z)\big) \le \beta_m\, d_X(x, z)$ for all $x, z \in X$ and with $\sum_{m=1}^{\infty} \beta_m < +\infty$;

(c) $X$ is compact and $d_X\big(\varphi(x), \varphi(z)\big) < d_X(x, z)$ for all $x, z \in X$, $x \ne z$;

(d) $d_X\big(\varphi(x), \varphi(z)\big) \le \beta \big[ d_X\big(x, \varphi(x)\big) + d_X\big(z, \varphi(z)\big) \big]$ for some $\beta \in \big(0, \tfrac{1}{2}\big)$ and all $x, z \in X$.

PROOF
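(a) Let
\[ \vartheta(x) \stackrel{df}{=} d_X\big(x, \varphi^{(m)}(x)\big). \]
Then $\big\{\vartheta\big(\varphi^{(n)}(x)\big)\big\}_{n \ge 1}$ is a Cauchy sequence in $\mathbb{R}$ and so
\[ \lim_{n \to +\infty} \vartheta\big(\varphi^{(n)}(x)\big) = u, \]
for some $u \in \mathbb{R}_+$. We have
\[ \vartheta\big(\varphi^{(n)}(x)\big) \le \beta^{r_n} \max_{k \le m-1} \vartheta\big(\varphi^{(k)}(x)\big) \quad \forall\, n \ge 1, \]
with $r_n$ being the largest nonnegative integer less than or equal to $\frac{n}{m}$. Therefore $u = 0$. Also $\{\varphi^{(n)}(x)\}_{n \ge 1}$ is a Cauchy sequence, so
\[ \varphi^{(n)}(x) \to y. \]
We have
\[ d_X\big(y, \varphi^{(m)}(y)\big) = 0, \]
hence
\[ \varphi(y) = \varphi^{(m)}\big(\varphi(y)\big) \]
and since $\varphi^{(m)}$ has only one fixed point, we conclude that $y = \varphi(y)$.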
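(b) Now let $\vartheta(x) = d_X\big(x, \varphi(x)\big)$. We have
\[ \vartheta\big(\varphi^{(n)}(x)\big) \le \beta_n\, \vartheta(x) \]
and so
\[ \vartheta\big(\varphi^{(n)}(x)\big) \to 0. \]
Also
\[ \mathrm{diam}\Big( \big\{\varphi^{(n)}(x)\big\}_{n \ge m} \Big) \le c \sum_{n=m}^{\infty} \beta_n \]
for some $c > 0$, so
\[ \mathrm{diam}\Big( \big\{\varphi^{(n)}(x)\big\}_{n \ge m} \Big) \to 0 \]
and thus we infer that $\{\varphi^{(n)}(x)\}_{n \ge 1}$ is a Cauchy sequence. Therefore $\varphi^{(n)}(x) \to y$. We obtain
\[ \vartheta(y) = d_X\big(y, \varphi(y)\big) = 0 \]
and so $y = \varphi(y)$.

(c) Again let
\[ \vartheta(x) \stackrel{df}{=} d_X\big(x, \varphi(x)\big). \]
Evidently the sequence $\big\{\vartheta\big(\varphi^{(n)}(x)\big)\big\}_{n \ge 1}$ is decreasing and so we have
\[ \vartheta\big(\varphi^{(n)}(x)\big) \to u. \]
Also due to the compactness of $X$, we may assume that
\[ \varphi^{(n)}(x) \to y \quad \text{in } X. \]
Therefore
\[ \vartheta(y) = d_X\big(y, \varphi(y)\big) = 0 \]
and so $y = \varphi(y)$.

(d) As before let
\[ \vartheta(x) \stackrel{df}{=} d_X\big(x, \varphi(x)\big). \]
We have that $\big\{\vartheta\big(\varphi^{(n)}(x)\big)\big\}_{n \ge 1}$ is a Cauchy sequence and so
\[ \vartheta\big(\varphi^{(n)}(x)\big) \to u. \]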
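Since
\[ \vartheta\big(\varphi^{(n+1)}(x)\big) \le \beta\, \vartheta\big(\varphi^{(n)}(x)\big) + \beta\, \vartheta\big(\varphi^{(n+1)}(x)\big), \]
we have
\[ \vartheta\big(\varphi^{(n+1)}(x)\big) \le \gamma\, \vartheta\big(\varphi^{(n)}(x)\big), \]
with $\gamma = \frac{\beta}{1-\beta} < 1$ and thus
\[ \vartheta\big(\varphi^{(n+1)}(x)\big) \le \gamma^{n+1}\, \vartheta(x), \]
so $u = 0$. Moreover, $\{\varphi^{(n)}(x)\}_{n \ge 1}$ is a Cauchy sequence and so $\varphi^{(n)}(x) \to y$. Note that in this case $\varphi$ need not be continuous, but we have
\[ \vartheta(y) = \lim_{n \to +\infty} d_X\big(\varphi^{(n)}(x), \varphi(y)\big) \le \beta \lim_{n \to +\infty} \vartheta\big(\varphi^{(n-1)}(x)\big) + \beta\, \vartheta(y), \]
so $\vartheta(y) = 0$ (since $\beta < \tfrac{1}{2}$). Hence $y = \varphi(y)$.

Clearly in all four cases the fixed point $y \in X$ is unique.

EXAMPLE 7.1.8 (a) A function $\varphi$ satisfying the condition in Theorem 7.1.7(a) (or 7.1.7(b)) need not be continuous. Indeed suppose that $X = \mathbb{R}$ and let
\[ \varphi(x) \stackrel{df}{=} \begin{cases} 1 & \text{if } x \in \mathbb{Q}, \\ 0 & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases} \]
Evidently $\varphi$ is not continuous and we have $\varphi^{(2)}(x) = 1$ for all $x \in \mathbb{R}$.

(b) We give an example of a continuous function which is not a contraction but for which $\varphi^{(n)}$ is a contraction for some sufficiently large $n \ge 1$. Let $T = [0, b]$, $X = C(T)$ and consider the map $\varphi : X \to X$, defined by
\[ \varphi(x)(t) \stackrel{df}{=} \int_0^t x(s)\, ds. \]
Then $\varphi$ is not a contraction if $b \ge 1$ and
\[ \varphi^{(n)}(x)(t) = \frac{1}{(n-1)!} \int_0^t (t-s)^{n-1} x(s)\, ds \quad \forall\, n \ge 1. \]
Hence for $n \ge 1$ sufficiently large $\varphi^{(n)}$ is a contraction.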
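(c) In Theorem 7.1.7(c) the compactness of $X$ is important. Consider $X = [1, +\infty)$ and
\[ \varphi(x) = x + \frac{1}{x}. \]
Then
\[ \big|\varphi(x) - \varphi(y)\big| < |x - y| \quad \forall\, x, y \in X,\ x \ne y, \]
$\varphi$ is not a contraction and $\varphi$ does not have a fixed point.

Another generalization of Theorem 7.1.2 is given below.

THEOREM 7.1.9 If $(X, d_X)$ is a complete metric space, $\varphi : X \to X$, $\gamma : \mathbb{R}_+ \to \mathbb{R}_+$ satisfies
\[ \gamma(t) < t \quad \forall\, t > 0 \quad \text{and} \quad \limsup_{t \to t_0^+} \gamma(t) \le \gamma(t_0) \]
and
\[ d_X\big(\varphi(x), \varphi(y)\big) \le \gamma\big(d_X(x, y)\big) \quad \forall\, x, y \in X, \]
then $\varphi$ has a unique fixed point $y \in X$ and for every $x \in X$, we have $\varphi^{(n)}(x) \to y$.

PROOF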
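Let
\[ x_n \stackrel{df}{=} \varphi^{(n)}(x) \]
and
\[ \vartheta_n \stackrel{df}{=} d_X(x_n, x_{n+1}) = d_X\big(\varphi^{(n)}(x), \varphi^{(n+1)}(x)\big). \]
We may assume that $\vartheta_n > 0$ for all $n \ge 1$. For every $n \ge 1$, we have
\[ \vartheta_n = d_X\big(\varphi(x_{n-1}), \varphi(x_n)\big) \le \gamma(\vartheta_{n-1}) < \vartheta_{n-1}. \tag{7.1} \]
So the sequence $\{\vartheta_n\}_{n \ge 1} \subseteq \mathbb{R}_+$ is decreasing, hence the limit $\vartheta = \lim_{n \to +\infty} \vartheta_n$ exists. From (7.1) and the properties of $\gamma$, we have that $\vartheta \le \gamma(\vartheta)$ and so $\vartheta = 0$.

Now we prove that $\{x_n\}_{n \ge 1}$ is a Cauchy sequence. Suppose that this is not the case. We can find $\varepsilon > 0$, such that for each $k \ge 1$, there exist $n_k$ and $m_k$, such that
\[ k \le m_k \le n_k \quad \text{and} \quad d_X\big(x_{m_k}, x_{n_k}\big) \ge \varepsilon. \tag{7.2} \]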
We may assume that $n_k$ is the smallest integer satisfying (7.2). We set
\[ \xi_k = d_X\big(x_{m_k}, x_{n_k}\big) \quad \forall\, k \ge 1. \]
Then
\[ \varepsilon \le \xi_k \le d_X\big(x_{m_k}, x_{n_k - 1}\big) + d_X\big(x_{n_k - 1}, x_{n_k}\big) \le \varepsilon + \vartheta_{n_k - 1} \]
and so $\xi_k \to \varepsilon$ as $k \to +\infty$. Also, we have
\[ \varepsilon \le \xi_k \le d_X\big(x_{m_k}, x_{m_k + 1}\big) + d_X\big(x_{m_k + 1}, x_{n_k + 1}\big) + d_X\big(x_{n_k + 1}, x_{n_k}\big) \le \vartheta_{m_k} + \gamma(\xi_k) + \vartheta_{n_k}, \]
so
\[ \varepsilon \le \limsup_{k \to +\infty} \gamma(\xi_k) \le \gamma(\varepsilon) < \varepsilon, \]
a contradiction. Therefore $x_n = \varphi^{(n)}(x)$ is a Cauchy sequence and $x_n \to y$, for some $y \in X$. We have
\[ d_X\big(x_n, \varphi(y)\big) \le \gamma\big(d_X(x_{n-1}, y)\big), \]
so
\[ d_X\big(y, \varphi(y)\big) = 0, \]
i.e., $y = \varphi(y)$. By virtue of the properties of $\gamma$, this fixed point is unique.

Now we turn our attention to nonexpansive maps (see Definition 7.1.1(b)). A nonexpansive map on a complete metric space need not have a fixed point and, if it has a fixed point, it need not be unique.

EXAMPLE 7.1.10 (a) Let $X$ be a Banach space, $x_0 \in X \setminus \{0\}$ and let $\varphi : X \to X$ be defined by $\varphi(x) = x + x_0$. Then $\varphi$ is clearly nonexpansive but it cannot have a fixed point (see also Example 7.1.4(a)).

(b) Let $X$ be a Banach space and $\varphi = \mathrm{id}_X$. Then $\varphi$ is nonexpansive and of course every $x \in X$ is a fixed point of $\varphi$.
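The absence of a fixed point for a map that shrinks distances without being a contraction can be observed numerically on the map $\varphi(x) = x + \frac{1}{x}$ on $[1, +\infty)$ from Example 7.1.8(c). The sketch below is illustrative only; the starting point and the number of steps are arbitrary assumptions and the helper name `orbit` is hypothetical.

```python
import math

def phi(x):
    """The map of Example 7.1.8(c) on X = [1, +infinity)."""
    return x + 1.0 / x

def orbit(x0, n):
    """Return [x0, phi(x0), ..., phi^(n)(x0)]."""
    xs = [x0]
    for _ in range(n):
        xs.append(phi(xs[-1]))
    return xs

xs = orbit(1.0, 10_000)
# displacement d(x, phi(x)) along the orbit; it equals 1/x
displacements = [phi(x) - x for x in xs]
```

Since $\varphi(x)^2 = x^2 + 2 + x^{-2}$, the $n$-th iterate is at least $\sqrt{1 + 2n}$, so the orbit is unbounded, while the displacement $d(x_n, \varphi(x_n)) = 1/x_n$ tends to zero without ever vanishing.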
PROPOSITION 7.1.11 If $X$ is a Banach space, $C \subseteq X$ is a nonempty, bounded, closed and convex set and $\varphi : C \to C$ is nonexpansive, then
\[ \inf_{x \in C} \|x - \varphi(x)\|_X = 0. \]

PROOF Fix $u \in C$ and $\varepsilon \in (0,1)$ and consider the map $\varphi_\varepsilon : C \to C$ defined by
\[ \varphi_\varepsilon(x) = \varepsilon u + (1 - \varepsilon)\varphi(x). \]
Then
\[ \|\varphi_\varepsilon(x) - \varphi_\varepsilon(y)\|_X \le (1 - \varepsilon)\, \|\varphi(x) - \varphi(y)\|_X \le (1 - \varepsilon)\, \|x - y\|_X, \]
i.e., $\varphi_\varepsilon$ is a contraction. Applying Theorem 7.1.2, we obtain $x_\varepsilon \in C$, such that $x_\varepsilon = \varphi_\varepsilon(x_\varepsilon)$. We have
\[ \|x_\varepsilon - \varphi(x_\varepsilon)\|_X = \|\varepsilon u + (1 - \varepsilon)\varphi(x_\varepsilon) - \varphi(x_\varepsilon)\|_X = \varepsilon\, \|u - \varphi(x_\varepsilon)\|_X \le \varepsilon\, \mathrm{diam}\, C. \]
Let $\varepsilon \searrow 0$ to finish the proof.

COROLLARY 7.1.12 If $X$ is a Banach space, $C \subseteq X$ is a nonempty, compact, convex set and $\varphi : C \to C$ is nonexpansive, then $\varphi$ has a fixed point.

Next we introduce some special sets which are important in the search for fixed points of nonexpansive maps.

DEFINITION 7.1.13 Let $X$ be a Banach space, $C \subseteq X$ a nonempty, closed, convex set and $\varphi : C \to C$.

(a) We say that $K \subseteq C$ is invariant under $\varphi$ (or $\varphi$-invariant), if $\varphi(K) \subseteq K$.

(b) A nonempty, closed, convex set $K \subseteq C$ is said to be minimal invariant under $\varphi$, if $K$ is invariant under $\varphi$ and $K$ has no nonempty, closed and convex proper subsets which are $\varphi$-invariant.
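The approximation scheme used in the proof of Proposition 7.1.11 can be tried out directly. The following sketch makes assumptions that are not in the text: $X = \mathbb{R}^2$ with the Euclidean norm, $C$ the closed unit disc (so $\mathrm{diam}\, C = 2$), and $\varphi$ the rotation by $90^\circ$, which is an isometry of $C$. The function names are hypothetical.

```python
import math

def rotate(p):
    """Rotation of the plane by 90 degrees: an isometry of the closed unit disc."""
    return (-p[1], p[0])

def approx_fixed_point(phi, u, eps, n_iter=2000):
    """Fixed point of the contraction phi_eps(x) = eps*u + (1 - eps)*phi(x),
    obtained by iterating phi_eps as in Theorem 7.1.2."""
    x = u
    for _ in range(n_iter):
        px = phi(x)
        x = (eps * u[0] + (1 - eps) * px[0],
             eps * u[1] + (1 - eps) * px[1])
    return x

def residual(phi, x):
    """The displacement ||x - phi(x)||."""
    return math.dist(x, phi(x))

u = (1.0, 0.0)
results = {eps: residual(rotate, approx_fixed_point(rotate, u, eps))
           for eps in (0.5, 0.1, 0.01)}
```

Each residual stays below $\varepsilon\, \mathrm{diam}\, C = 2\varepsilon$, in line with the estimate in the proof, and it decreases as $\varepsilon$ decreases.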
REMARK 7.1.14 We can modify Definition 7.1.13(b) in obvious ways so that it applies to smaller families of sets (for example minimal $\varphi$-invariant weakly (or weakly$^*$) compact sets). A decreasing sequence of nonempty, closed, convex, $\varphi$-invariant sets may be obtained by setting
\[ K_0 = C \quad \text{and} \quad K_{n+1} = \overline{\mathrm{conv}}\, \varphi(K_n) \quad \forall\, n \ge 0. \]
We set
\[ \widehat{K} \stackrel{df}{=} \bigcap_{n=1}^{\infty} K_n. \]
The set $\widehat{K}$ is closed, convex and $\varphi$-invariant. But it may be empty. Of course this situation cannot occur if $C$ is weakly compact.

PROPOSITION 7.1.15 If $X$ is a Banach space, $C \subseteq X$ is a nonempty, weakly compact, convex set and $\varphi : C \to C$, then there exists a nonempty, closed, convex set $\widehat{K} \subseteq C$ which is minimal $\varphi$-invariant.

PROOF Let $\mathcal{T}$ be the family of all nonempty, closed, convex subsets of $C$ which are $\varphi$-invariant. We order $\mathcal{T}$ by reverse inclusion, namely if $K_1, K_2 \in \mathcal{T}$, then
\[ K_1 \le K_2 \iff K_2 \subseteq K_1. \]
By the finite intersection property for the weak topology, every chain in $\mathcal{T}$ has an upper bound (namely the intersection of the elements in the chain). So by the Kuratowski-Zorn lemma, $\mathcal{T}$ has a maximal element $\widehat{K} \in \mathcal{T}$. Evidently $\widehat{K}$ is minimal $\varphi$-invariant.

REMARK 7.1.16 An analogous result also holds for nonempty, $w^*$-closed, convex subsets of a $w^*$-compact set $C$ in a dual Banach space. Note that if $\widehat{K} \subseteq C$ is a nonempty, closed, convex and minimal $\varphi$-invariant set, then
\[ \widehat{K} = \overline{\mathrm{conv}}\, \varphi(\widehat{K}). \]
If $\widehat{K} \in \mathcal{T}$ in Proposition 7.1.15 is a singleton, i.e., $\widehat{K} = \{y\}$, then $\varphi(y) = y$, i.e., $y$ is a fixed point of $\varphi$. For many years it remained an open question whether a nonexpansive map $\varphi : C \to C$ on a nonempty, $w$-compact, convex set $C$ has a singleton minimal $\varphi$-invariant set $\widehat{K} \subseteq C$. Alspach (1981) settled this question negatively by furnishing the following example.
EXAMPLE 7.1.17 Let $X = L^1[0,1]$,
\[ C \stackrel{df}{=} \Big\{ h \in X : 0 \le h(t) \le 2 \text{ for a.a. } t \in [0,1],\ \int_0^1 h(s)\, ds = 1 \Big\} \]
and let $\varphi : C \to C$ be defined by
\[ \varphi(h)(t) = \begin{cases} \min\big\{2h(2t),\, 2\big\} & \text{if } 0 \le t \le \tfrac{1}{2}, \\ \max\big\{0,\, 2h(2t-1) - 2\big\} & \text{if } \tfrac{1}{2} < t \le 1. \end{cases} \]
It can be shown that $C$ is nonempty, weakly compact, convex in $L^1[0,1]$ and $\varphi$ is an isometry (in particular it is nonexpansive). However $\varphi$ is fixed point free.

So more structure is needed to guarantee fixed points for nonexpansive maps. For this reason we introduce the following definition.

DEFINITION 7.1.18 Let $X$ be a Banach space and let $C \subseteq X$ be a nonempty, bounded, closed, convex set. A point $x \in C$ is said to be diametral, if
\[ \mathrm{diam}\, C = \sup_{c \in C} \|c - x\|_X. \]
We say that $C$ has normal structure, if for any given bounded, closed, convex set $K \subseteq C$ containing more than one point, there exists a point $x \in K$ which is nondiametral for $K$.

THEOREM 7.1.19 Every nonempty, compact, convex set $C$ in a Banach space $X$ has normal structure.

PROOF If $C$ (with $\mathrm{diam}\, C > 0$) does not have normal structure, then for any $x_1 \in C$ we can find $x_2 \in C$, such that
\[ \|x_1 - x_2\|_X = \mathrm{diam}\, C. \]
Since $C$ is convex, we have
\[ \frac{x_1 + x_2}{2} \in C. \]
So there exists $x_3 \in C$, such that
\[ \Big\| \frac{x_1 + x_2}{2} - x_3 \Big\|_X = \mathrm{diam}\, C. \]
This way we produce a sequence $\{x_n\}_{n \ge 1} \subseteq C$, such that
\[ \Big\| \frac{1}{n} \sum_{k=1}^{n} x_k - x_{n+1} \Big\|_X = \mathrm{diam}\, C. \]
Therefore
\[ \begin{aligned} \mathrm{diam}\, C &= \Big\| \frac{1}{n} \sum_{k=1}^{n} x_k - x_{n+1} \Big\|_X \\ &= \Big\| \frac{x_1 - x_{n+1}}{n} + \frac{x_2 - x_{n+1}}{n} + \ldots + \frac{x_n - x_{n+1}}{n} \Big\|_X \\ &\le \frac{1}{n} \sum_{k=1}^{n} \|x_k - x_{n+1}\|_X \le \mathrm{diam}\, C, \end{aligned} \]
so $\|x_k - x_{n+1}\|_X = \mathrm{diam}\, C > 0$ for all $k \in \{1, \ldots, n\}$. Therefore $\{x_n\}_{n \ge 1} \subseteq C$ is a sequence with no convergent subsequence, a contradiction to the fact that $C$ is compact.

THEOREM 7.1.20 Every nonempty, bounded, closed, convex set $C$ of a uniformly convex Banach space $X$ has normal structure.

PROOF Without any loss of generality we may assume that $C \subseteq \overline{B}_1$, where
\[ \overline{B}_1 = \big\{ x \in X : \|x\|_X \le 1 \big\}. \]
Let $C_1$ be a nonempty, closed, convex subset of $C$ with $\mathrm{diam}\, C_1 > 0$ and let $x_1 \in C_1$ and $\varepsilon = \tfrac{1}{2}$. We can find $x_2 \in C_1$, such that
\[ \frac{1}{2}\, \mathrm{diam}\, C_1 \le \|x_2 - x_1\|_X. \]
Then for any $x \in C_1$, we have
\[ \Big\| x - \frac{x_1 + x_2}{2} \Big\|_X = \Big\| \frac{x - x_1}{2} + \frac{x - x_2}{2} \Big\|_X \le \mathrm{diam}\, C_1 \Big( 1 - \delta\Big(\frac{1}{2}\Big) \Big), \]
where $\delta(\varepsilon)$ is the modulus of uniform convexity of $X$. Since $\delta\big(\tfrac{1}{2}\big) > 0$, we conclude that $C$ has normal structure.
To prove the main fixed point theorem for nonexpansive maps, we need to introduce some geometric quantities associated with a set $C \subseteq X$.

DEFINITION 7.1.21 Let $X$ be a Banach space and $D$, $E$ nonempty subsets of $X$.

(a) For a given $x \in X$, the radius of $D$ relative to $x$ is given by
\[ r_x(D) \stackrel{df}{=} \sup_{y \in D} \|x - y\|_X. \]

(b) The Chebyshev radius of $D$ relative to $E$ is given by
\[ r_E(D) \stackrel{df}{=} \inf_{x \in E} r_x(D). \]

(c) The Chebyshev center of $D$ relative to $E$ is given by
\[ C_E(D) \stackrel{df}{=} \big\{ x \in E : r_x(D) = r_E(D) \big\}. \]
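For finite sets the three quantities of Definition 7.1.21 can be computed by brute force, since the supremum and the infimum become a maximum and a minimum. The sketch below assumes finite sets $D$ and $E$ in the Euclidean plane; the function names and the tolerance parameter `tol` are hypothetical and do not come from the text.

```python
import math

def radius_at(x, D):
    """r_x(D) = sup over y in D of ||x - y|| (a maximum, since D is finite)."""
    return max(math.dist(x, y) for y in D)

def chebyshev_radius(D, E):
    """r_E(D) = inf over x in E of r_x(D) (a minimum, since E is finite)."""
    return min(radius_at(x, D) for x in E)

def chebyshev_center(D, E, tol=1e-12):
    """C_E(D): the points of E at which r_x(D) attains r_E(D), up to tol."""
    r = chebyshev_radius(D, E)
    return [x for x in E if radius_at(x, D) <= r + tol]

# Two endpoints of a segment, and three candidate centers on that segment.
D = [(0.0, 0.0), (2.0, 0.0)]
E = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
```

Here the midpoint $(1, 0)$ is the only point of $E$ realizing the Chebyshev radius $1$, while the two endpoints of the segment have radius $2 = \mathrm{diam}\, D$.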
REMARK 7.1.22 If $D = E$, then for (b) and (c) in the above definition we use the notation $r(D)$ and $C(D)$ and the phrase “relative to $E$” is dropped. If $y \in E$ belongs to $C_E(D)$, then
\[ \overline{B}_{r_y(D)}(y) \supseteq D \]
and no ball centered at any point of $E$ with smaller radius has this property. Clearly for any $x \in D$, we have
\[ r(D) \le r_x(D) \le \mathrm{diam}\, D. \]
Moreover, the map $x \mapsto r_x(D)$ is continuous on $D$ and $C(D)$ is closed, convex.

The main fixed point theorem for nonexpansive maps is the following.

THEOREM 7.1.23 If $X$ is a Banach space, $C \subseteq X$ is a nonempty, $w$-compact, convex set with normal structure and $\varphi : C \to C$ is nonexpansive, then $\varphi$ has a fixed point, i.e., there exists $x \in C$, such that $\varphi(x) = x$.

PROOF By virtue of Proposition 7.1.15, we may assume that $C$ is minimal $\varphi$-invariant. Then $C = \overline{\mathrm{conv}}\, \varphi(C)$ (see Remark 7.1.16). Let $x \in C(C)$, i.e., $r_x(C) = r(C)$.
Because
\[ \|\varphi(z) - \varphi(x)\|_X \le \|z - x\|_X \le r(C) \quad \forall\, z \in C, \]
we have
\[ \varphi(C) \subseteq \overline{B}_{r(C)}\big(\varphi(x)\big), \]
so
\[ C = \overline{\mathrm{conv}}\, \varphi(C) \subseteq \overline{B}_{r(C)}\big(\varphi(x)\big) \]
and thus $r_{\varphi(x)}(C) = r(C)$. This means that $\varphi(x) \in C(C)$ and so $C(C)$ is $\varphi$-invariant, which in view of the normal structure contradicts the minimality of $C$ (see Remark 7.1.22), unless $\mathrm{diam}\, C = 0$. So $C = \{x_0\}$ and $x_0$ is a fixed point of $\varphi$.

In the above theorem the hypothesis that $C$ has normal structure cannot be dispensed with, as the following example illustrates.

EXAMPLE 7.1.24 Let $X = C\big([0,1]\big)$ and
\[ C \stackrel{df}{=} \Big\{ x \in C\big([0,1]\big) : x(0) = 0,\ x(1) = 1,\ 0 \le x(t) \le 1 \text{ for all } t \in [0,1] \Big\}. \]
Then
\[ r(C) = \mathrm{diam}\, C = 1 \quad \text{and} \quad C(C) = C. \]
So $C$ does not have normal structure. Let $\varphi : C \to C$ be defined by
\[ \varphi(x)(t) = t\, x(t) \quad \forall\, t \in [0,1]. \]
Then $\varphi$ is nonexpansive but does not have a fixed point.

Recall that, if
\[ C \stackrel{df}{=} \overline{B}_c = \big\{ x \in X : \|x\|_X \le c \big\}, \]
the $c$-radial retraction is the map $r_c : X \to C$, defined by
\[ r_c(x) \stackrel{df}{=} \begin{cases} x & \text{if } x \in C, \\ \dfrac{c\, x}{\|x\|_X} & \text{if } x \notin C. \end{cases} \]
From a well known result of de Figueiredo & Karlovitz (1967), if $\dim X \ge 3$, then $X$ is a Hilbert space if and only if $r_c$ is nonexpansive. Of course when $\dim X = 1$ or $\dim X = 2$, then $r_c$ is always nonexpansive. Using this fact we can prove the following alternative theorem for nonexpansive maps on a Hilbert space.
THEOREM 7.1.25 If $H$ is a Hilbert space,
\[ C \stackrel{df}{=} \overline{B}_c = \big\{ x \in H : \|x\|_H \le c \big\} \]
and $\varphi : C \to H$ is nonexpansive, then at least one of the following properties holds:
(a) $\varphi$ has a fixed point;
(b) there is an $x \in \partial C$ and $\lambda \in (0,1)$, such that $x = \lambda \varphi(x)$.

PROOF The map $r_c \circ \varphi : C \to C$ is nonexpansive. Since a Hilbert space is uniformly convex, because of Theorem 7.1.20, $C$ has normal structure. So we can apply Theorem 7.1.23 and obtain $x \in C$, such that $x = (r_c \circ \varphi)(x)$. If $\varphi(x) \in C$, then $x = \varphi(x)$. If $\varphi(x)$ does not belong to $C$, then
\[ x = \frac{c\, \varphi(x)}{\|\varphi(x)\|_H} \in \partial C \]
and so for
\[ \lambda = \frac{c}{\|\varphi(x)\|_H} < 1, \]
we have $x = \lambda \varphi(x)$.

By virtue of Theorem 7.1.25, we see that by imposing conditions on $\varphi$ which prevent the second possibility from occurring, we can have fixed points for nonexpansive maps on Hilbert spaces.

COROLLARY 7.1.26 If $H$ is a Hilbert space,
\[ C \stackrel{df}{=} \overline{B}_c = \big\{ x \in H : \|x\|_H \le c \big\}, \]
$\varphi : C \to H$ is nonexpansive and for all $x \in \partial C$ one of the following conditions holds:
(i) $\|\varphi(x)\|_H \le \|x\|_H$;
(ii) $\|\varphi(x)\|_H \le \|x - \varphi(x)\|_H$;
(iii) $\|\varphi(x)\|_H^2 \le \|x\|_H^2 + \|x - \varphi(x)\|_H^2$;
(iv) $\big(x, \varphi(x)\big)_H \le \|x\|_H^2$,
then $\varphi$ has a fixed point.
PROOF Let us prove the corollary when (i) is in effect. The other cases are proved similarly. Suppose that for some $x \in \partial C$ and some $\lambda \in (0,1)$, we have $x = \lambda \varphi(x)$. Then
\[ \|x\|_H = \lambda\, \|\varphi(x)\|_H \le \lambda\, \|x\|_H < \|x\|_H, \]
a contradiction. Hence (b) in Theorem 7.1.25 cannot occur and so $\varphi$ must have a fixed point.

REMARK 7.1.27 In Corollary 7.1.26, case (i) is known as Rothe’s fixed point theorem, while case (iii) is known as Altman’s fixed point theorem.

Recall that in Theorem 4.6.9, we proved a deep generalization of Theorem 7.1.2 (the Banach contraction principle), known as Caristi’s fixed point theorem. Let us recall the statement of that result.

THEOREM 7.1.28 (Caristi Fixed Point Theorem) If $(X, d_X)$ is a complete metric space,
\[ \varphi : X \to \overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\} \]
is a proper, lower semicontinuous, bounded below function, $F : X \to 2^X \setminus \{\emptyset\}$ is a multifunction and for all $x \in X$, there exists $y \in F(x)$, such that
\[ \varphi(y) \le \varphi(x) - d_X(y, x), \]
then $F$ has a fixed point, i.e., there exists $x \in X$, such that $x \in F(x)$.

REMARK 7.1.29 Theorem 7.1.28 is equivalent to the Ekeland variational principle in the form of Corollary 4.6.3 (see Proposition 4.6.11).
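One way to see why Theorem 7.1.28 generalizes Theorem 7.1.2 is the following standard observation: if $f$ is a contraction with constant $\beta$, then the function $\psi(x) = \frac{1}{1-\beta}\, d_X\big(x, f(x)\big)$ satisfies the inequality of Theorem 7.1.28 with $F(x) = \{f(x)\}$ and with $\psi$ in the role of $\varphi$. The sketch below checks this numerically; the choice $f = \cos$ on $[0,1]$ with $\beta = \sin 1$, the grid, and the names `potential` and `caristi_gap` are illustrative assumptions that are not in the text.

```python
import math

BETA = math.sin(1.0)    # contraction constant of cos on [0, 1]

def f(x):
    return math.cos(x)

def potential(x):
    """psi(x) = d(x, f(x)) / (1 - beta)."""
    return abs(x - f(x)) / (1.0 - BETA)

def caristi_gap(x):
    """psi(x) - d(x, f(x)) - psi(f(x)); this is nonnegative exactly when
    psi(f(x)) <= psi(x) - d(f(x), x), the inequality of Theorem 7.1.28."""
    return potential(x) - abs(x - f(x)) - potential(f(x))

grid = [i / 1000 for i in range(1001)]
gaps = [caristi_gap(x) for x in grid]
```

The gap is nonnegative because $d_X\big(f(x), f(f(x))\big) \le \beta\, d_X\big(x, f(x)\big)$ and $\frac{\beta}{1-\beta} = \frac{1}{1-\beta} - 1$.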
7.2 Topological Fixed Point Theory
In this section we prove fixed point theorems in which the topological structure of the problem plays a central role.

DEFINITION 7.2.1 A Hausdorff topological space $X$ is said to have the fixed point property (FPP for short), if every continuous map $\varphi : X \to X$ has a fixed point.

The following result is an easy consequence of the above definition.

PROPOSITION 7.2.2 If $X$ and $Y$ are homeomorphic Hausdorff topological spaces and $X$ has the FPP, then $Y$ has the FPP too.

DEFINITION 7.2.3 Let $X$ be a Hausdorff topological space and $C \subseteq X$. We say that $C$ is a retract of $X$, if there is a continuous map $r : X \to C$, such that $r|_C = \mathrm{id}_C$. The map $r$ is called a retraction.

REMARK 7.2.4 Every nonempty, closed and convex subset of a normed space $X$ is a retract. This is an immediate consequence of Dugundji’s extension theorem (see Theorem 3.1.11). Each retract is closed.

PROPOSITION 7.2.5 If $X$ is a Hausdorff topological space with the FPP and $C$ is a retract of $X$, then $C$ has the FPP.

PROOF Let $r : X \to C$ be the retraction map and let $\varphi : C \to C$ be a continuous map. Then $\varphi \circ r : X \to X$ is continuous and since $X$ has the FPP we can find $x \in X$, such that
\[ x = \varphi\big(r(x)\big). \]
Because
\[ (\varphi \circ r)(X) \subseteq C, \]
we have $x \in C$ and so $r(x) = x$ and $x = \varphi(x)$.
REMARK 7.2.6 Recall that if $X$ is a normed space and
\[ \overline{B}_r \stackrel{df}{=} \big\{ x \in X : \|x\|_X \le r \big\}, \quad r > 0, \]
then $\partial B_r$ is a retract of $\overline{B}_r$ if and only if $\dim X = +\infty$.

Now we are ready for the first major topological fixed point theorem.

THEOREM 7.2.7 (Brouwer Fixed Point Theorem) If $X$ is a finite dimensional normed space, then
(a) $\overline{B}_1 = \big\{ x \in X : \|x\|_X \le 1 \big\}$ has the FPP;
(b) every compact, convex subset $C$ of $X$ has the FPP.

PROOF (a) We proceed by contradiction. Suppose that there exists a continuous map $\varphi : \overline{B}_1 \to \overline{B}_1$, which is fixed point free. Let $r : \overline{B}_1 \to \partial B_1$ be a map defined as follows: for each $x \in \overline{B}_1$ extend the line segment from $\varphi(x)$ through $x$ to the point of intersection with $\partial B_1$. Call this point $r(x)$. Clearly $r$ is continuous and
\[ r|_{\partial B_1} = \mathrm{id}_{\partial B_1}, \]
i.e., $r$ is a retraction. But this is a contradiction (see Remark 7.2.6).

(b) Let $r > 0$ be large enough so that $C \subseteq \overline{B}_r$. We know that $C$ is a retract of $\overline{B}_r$ and $\overline{B}_r$ is homeomorphic to $\overline{B}_1$. Then from (a) and Proposition 7.2.2 and Proposition 7.2.5, we conclude that $C$ has the FPP.

Can we extend this result to infinite dimensional normed spaces? The answer is negative. To show this we need the following result due to Klee (1956).

PROPOSITION 7.2.8 If $X$ is an infinite dimensional normed space and $C \subseteq X$ is compact, then $X \setminus C$ and $X$ are homeomorphic.
THEOREM 7.2.9 If $X$ is a normed space and $\overline{B}_1 = \big\{ x \in X : \|x\|_X \le 1 \big\}$, then $\overline{B}_1$ has the FPP if and only if $\dim X < +\infty$.

PROOF “$\Longrightarrow$”: We argue indirectly. Suppose that $X$ is infinite dimensional. Then by virtue of Proposition 7.2.8 there exists a homeomorphism
\[ \vartheta : X \to X \setminus \{0\}. \]
For each $u_0 \in \partial B_1$ map the line segment $[0, u_0]$ linearly onto the line segment $\big[0, \vartheta^{-1}(u_0)\big]$; this way we obtain a continuous map $g : \overline{B}_1 \to X$, such that $g(0) = 0$. Now let $r : \overline{B}_1 \to \partial B_1$ be defined by
\[ r(x) \stackrel{df}{=} \frac{(\vartheta \circ g)(x)}{\|(\vartheta \circ g)(x)\|_X} \quad \forall\, x \in \overline{B}_1. \]
Note that this map is well defined since $\vartheta(X) = X \setminus \{0\}$ and clearly it is continuous. If $u_0 \in \partial B_1$, then
\[ (\vartheta \circ g)(u_0) = \vartheta\big(\vartheta^{-1}(u_0)\big) = u_0 \]
and so
\[ r|_{\partial B_1} = \mathrm{id}_{\partial B_1}, \]
i.e., $r$ is a retraction. Hence $x \mapsto -r(x)$ is a continuous, fixed point free map from $\overline{B}_1$ into itself, a contradiction. So we conclude that $\dim X < +\infty$.

“$\Longleftarrow$”: This is Theorem 7.2.7(a).

In the next example we produce a fixed point free homeomorphism of the closed unit ball of an infinite dimensional Hilbert space onto itself.
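EXAMPLE 7.2.10 Let $H$ be a separable Hilbert space and
\[ \overline{B}_1 = \big\{ x \in H : \|x\|_H \le 1 \big\}. \]
Since $H$ is separable, we can find a countable orthonormal basis $\{e_n\}_{n \ge 1}$. Consider the transformation $R : H \to H$, defined by
\[ R(e_n) = e_{n+1} \quad \forall\, n \ge 1 \]
and extended linearly to all of $H$, i.e.,
\[ R(x) \stackrel{df}{=} \sum_{n=1}^{\infty} \lambda_n R(e_n) = \sum_{n=1}^{\infty} \lambda_n e_{n+1} \quad \forall\, x = \sum_{n=1}^{\infty} \lambda_n e_n. \]
Evidently $R$ is an isomorphism and a homeomorphism of $\partial B_1$ onto itself. We consider the map $\varphi : H \to H$, defined by
\[ \varphi(x) \stackrel{df}{=} \frac{1}{2}\big(1 - \|x\|_H\big)\, e_1 + R(x) \quad \forall\, x \in H. \]
It is easy to check that $\varphi$ is a homeomorphism which maps $\overline{B}_1$ into itself. We claim that $\varphi$ has no fixed points in $\overline{B}_1$. Suppose that there exists $x_0 \in \overline{B}_1$, such that $\varphi(x_0) = x_0$. We have
\[ x_0 - R(x_0) = \frac{1}{2}\big(1 - \|x_0\|_H\big)\, e_1. \]
If $x_0 = 0$, then
\[ \frac{1}{2}\big(1 - \|x_0\|_H\big)\, e_1 = 0 \]
and so $\|x_0\|_H = 1$, a contradiction. If $\|x_0\|_H = 1$, then
\[ x_0 = R(x_0), \]
a contradiction since from its definition $R$ cannot have a fixed point on $\partial B_1$. Finally suppose that $0 < \|x_0\|_H < 1$. We have
\[ x_0 = \sum_{n=1}^{\infty} \lambda_n e_n, \quad \text{with} \quad \sum_{n=1}^{\infty} |\lambda_n|^2 = \|x_0\|_H^2 < 1. \]
Then, comparing coefficients in $x_0 - R(x_0) = \frac{1}{2}\big(1 - \|x_0\|_H\big) e_1$, we obtain
\[ \lambda_1 = \lambda_2 = \ldots = \lambda_n = \ldots = \frac{1}{2}\big(1 - \|x_0\|_H\big) > 0 \]
and so
\[ \sum_{n=1}^{\infty} |\lambda_n|^2 = +\infty, \]
a contradiction.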
Theorem 7.2.9 and Example 7.2.10 illustrate that in order to extend Theorem 7.2.7 to infinite dimensional normed spaces, we need to restrict the maps $\varphi : C \to C$ that we consider. The next theorem is the infinite dimensional version of Brouwer’s theorem (see Theorem 7.2.7).

THEOREM 7.2.11 (Schauder Fixed Point Theorem) If $X$ is a normed space, $C \subseteq X$ is a nonempty, convex (not necessarily closed) set and $\varphi : C \to C$ is a continuous map, such that
\[ \overline{\varphi(C)} \subseteq X \text{ is compact}, \]
then $\varphi$ has at least one fixed point.

PROOF By virtue of Theorem 3.1.10, for a given $\varepsilon > 0$, we can find a continuous map $\varphi_\varepsilon : C \to C$, such that
\[ \|\varphi_\varepsilon(x) - \varphi(x)\|_X < \varepsilon \quad \forall\, x \in C \]
and $\varphi_\varepsilon(C) \subseteq D \subseteq C$, where $D$ is a finite dimensional, closed ball. Since $\varphi_\varepsilon(D) \subseteq D$, we can apply Theorem 7.2.7(b), to obtain $x_\varepsilon \in D$, such that $\varphi_\varepsilon(x_\varepsilon) = x_\varepsilon$. Now let $\varepsilon_n = \frac{1}{n} \searrow 0$ and set
\[ x_n \stackrel{df}{=} x_{\varepsilon_n} \quad \forall\, n \ge 1. \]
Then
\[ \|x_n - \varphi(x_n)\|_X \le \frac{1}{n} \]
and because of our hypothesis on $\varphi$ we may assume that
\[ \varphi(x_n) \to y \quad \text{in } X, \]
hence
\[ x_n \to y \quad \text{in } X. \]
So in the limit as $n \to +\infty$, we obtain
\[ \|y - \varphi(y)\|_X = 0, \]
hence $\varphi(y) = y$.
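Using the same approximation technique, we can prove the so-called Borsuk’s fixed point theorem. First let us recall the finite dimensional version of the result (see Dugundji & Granas (1982, p. 46)).

THEOREM 7.2.12 If $U$ is a bounded, convex, symmetric neighbourhood of the origin in $\mathbb{R}^N$ and $\varphi : \overline{U} \to \mathbb{R}^N$ is a continuous map, such that $\varphi|_{\partial U}$ is odd, then $\varphi$ has at least one fixed point.

THEOREM 7.2.13 (Borsuk Fixed Point Theorem) If $X$ is a normed space, $U \subseteq X$ is a bounded, convex, symmetric neighbourhood of the origin and $\varphi : \overline{U} \to X$ is a compact and odd map, then $\varphi$ has at least one fixed point.

PROOF Because of Theorem 3.1.10, for a given $\varepsilon > 0$ we can find a finite, symmetric set $F \subseteq X$, such that $\varphi\big(\overline{U}\big) \subseteq F_{(\varepsilon)}$, where
\[ F = \{u_i\}_{i=1}^{m} \quad \text{and} \quad F_{(\varepsilon)} \stackrel{df}{=} \bigcup_{i=1}^{m} B_\varepsilon(u_i). \]
Let $\eta_i : F_{(\varepsilon)} \to \mathbb{R}$ be the map
\[ \eta_i(x) \stackrel{df}{=} \max\big\{ 0,\ \varepsilon - \|x - u_i\|_X \big\} \quad \forall\, i \in \{1, \ldots, m\} \]
and let us set
\[ p_\varepsilon(x) \stackrel{df}{=} \frac{1}{\sum_{i=1}^{m} \eta_i(x)} \sum_{i=1}^{m} \eta_i(x)\, u_i \quad \forall\, x \in F_{(\varepsilon)}. \]
As in the proof of Theorem 3.1.10(a), we can check that the map
\[ p_\varepsilon : F_{(\varepsilon)} \to \mathrm{conv}\, F \subseteq U \]
is compact and odd and
\[ \|x - p_\varepsilon(x)\|_X < \varepsilon \quad \forall\, x \in F_{(\varepsilon)}. \]
Then $p_\varepsilon \circ \varphi|_{\partial U}$ is odd. Let $X_\varepsilon$ be a finite dimensional subspace of $X$, such that
\[ (p_\varepsilon \circ \varphi)\big(\overline{U}\big) \subseteq X_\varepsilon. \]
Let
\[ \widehat{\varphi}_\varepsilon \stackrel{df}{=} p_\varepsilon \circ \varphi|_{\overline{U} \cap X_\varepsilon}. \]
Applying Theorem 7.2.12, we obtain $x_\varepsilon$, such that $\widehat{\varphi}_\varepsilon(x_\varepsilon) = x_\varepsilon$. Then
\[ \|x_\varepsilon - \varphi(x_\varepsilon)\|_X < \varepsilon \]
and as in the proof of Theorem 7.2.11, by considering $\varepsilon_n \searrow 0$ and using the compactness of $\varphi$, we can produce a fixed point of $\varphi$.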
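The map $p_\varepsilon$ used in the proof of Theorem 7.2.13 is easy to implement when $F$ is a finite set in a Euclidean space. The following sketch assumes the Euclidean plane and a small hand-picked symmetric set $F$; the sample data and the function name `schauder_projection` are hypothetical and serve only to illustrate the two properties used in the proof.

```python
import math

def schauder_projection(x, net, eps):
    """p_eps(x) = sum_i eta_i(x) u_i / sum_i eta_i(x), where
    eta_i(x) = max(0, eps - ||x - u_i||) and net = [u_1, ..., u_m]."""
    weights = [max(0.0, eps - math.dist(x, u)) for u in net]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("x is not within eps of any point of the net")
    return tuple(sum(w * u[k] for w, u in zip(weights, net)) / total
                 for k in range(len(x)))

# A symmetric finite set F in the plane (hypothetical sample data).
net = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0),
       (0.5, 0.5), (-0.5, -0.5)]
eps = 0.8
x = (0.6, 0.3)
p = schauder_projection(x, net, eps)
neg = schauder_projection((-x[0], -x[1]), net, eps)
```

Since $p_\varepsilon(x)$ is a convex combination of those $u_i$ with $\|x - u_i\| < \varepsilon$, it stays within $\varepsilon$ of $x$, and the symmetry of $F$ gives $p_\varepsilon(-x) = -p_\varepsilon(x)$ up to rounding.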
There is the following extension of Theorem 7.2.11 to locally convex spaces.

THEOREM 7.2.14 (Schauder-Tychonoff Fixed Point Theorem) Any nonempty, compact, convex set in a locally convex space has the FPP.

REMARK 7.2.15 Another equivalent formulation of this theorem is the following: “Let $X$ be a locally convex space, $C \subseteq X$ a nonempty, closed, convex set and $\varphi : C \to C$ a continuous map, such that $\overline{\varphi(C)}$ is compact. Then $\varphi$ has at least one fixed point.”

In analogy to the metric nonlinear alternative theorem (see Theorem 7.1.25), we can have a topological alternative theorem, known as the Leray-Schauder alternative principle or Schaefer fixed point theorem.

THEOREM 7.2.16 (Leray-Schauder Alternative Principle) If $X$ is a normed space, $C \subseteq X$ a nonempty, convex set and $\varphi : C \to C$ a compact map, then at least one of the following properties holds:
(a) $\varphi$ has a fixed point;
(b) the set $S \stackrel{df}{=} \big\{ x \in C : x = \lambda \varphi(x) \text{ for some } \lambda \in (0,1) \big\}$ is unbounded.

PROOF Suppose that $S$ is bounded. Let $M > 0$ be such that
\[ \|x\|_X < M \quad \text{if } x = \lambda \varphi(x) \text{ for some } \lambda \in (0,1). \tag{7.3} \]
Let us set $\widehat{\varphi} = r_M \circ \varphi$, where $r_M$ is the $M$-radial retraction. Observe that $\widehat{\varphi} : \overline{B}_M \to \overline{B}_M$, where
\[ \overline{B}_M = \big\{ x \in X : \|x\|_X \le M \big\}. \]
Let
\[ K = \overline{\mathrm{conv}}\, \widehat{\varphi}\big(\overline{B}_M\big). \]
Since $\varphi$ is compact and $r_M$ is continuous, $\widehat{\varphi}$ is compact too and so $K \subseteq X$ is compact and convex. Moreover, we have $\widehat{\varphi} : K \to K$. Applying Theorem 7.2.11, we obtain $x \in K$, such that $\widehat{\varphi}(x) = x$. If $x \in K$ is not a fixed point of $\varphi$, then $\|\varphi(x)\|_X > M$ and so
\[ x = \lambda \varphi(x) \quad \text{with } \lambda = \frac{M}{\|\varphi(x)\|_X} < 1. \tag{7.4} \]
But $\|x\|_X = \|\widehat{\varphi}(x)\|_X = M$. So comparing (7.3) with (7.4), we reach a contradiction. Thus $\varphi(x) = x$.
REMARK 7.2.17 The above theorem is the basis of the informal principle, which says that, “if we can produce a priori bounds for the solutions of nonlinear partial differential equations, under the assumption that such solutions exist, then indeed these solutions exist (briefly, a priori estimates imply existence).” This is the so-called “method of a priori estimates.” Compared to Theorem 7.2.11, Theorem 7.2.16 has the advantage that we do not have to assume that $C$ is compact (for example $C$ can be the whole space $X$).

As in the metric case (see Corollary 7.1.26), using Theorem 7.2.16 we can derive many known fixed point theorems by imposing conditions on the map $\varphi$ which prevent the occurrence of property (b) in Theorem 7.2.16.

COROLLARY 7.2.18 If $X$ is a normed space, $C \subseteq X$ is a nonempty, convex set, $U \subseteq C$ is open, $\varphi : \overline{U} \to C$ is compact and for all $x \in \partial U$ one of the following conditions holds:
(i) $\|\varphi(x)\|_X \le \|x\|_X$;
(ii) $\|\varphi(x)\|_X^2 \le \|\varphi(x) - x\|_X^2 + \|x\|_X^2$,
then $\varphi$ has a fixed point.

REMARK 7.2.19 In the above corollary, case (i) is known as Rothe’s fixed point theorem and case (ii) as Altman’s fixed point theorem.

Using the measure of noncompactness, we can have a generalization of Theorem 7.2.11 in which the compactness condition on the map $\varphi$ is relaxed. First let us recall the definitions of the two measures of noncompactness which we will use.

DEFINITION 7.2.20 Let $X$ be a Banach space and $\mathcal{B}$ the collection of all bounded subsets of $X$.

(a) The Kuratowski measure of noncompactness $\alpha : \mathcal{B} \to \mathbb{R}_+$ is defined by
\[ \alpha(C) = \inf\big\{ d > 0 : C \text{ admits a finite cover by sets of diameter not bigger than } d \big\}. \]

(b) The ball (or Hausdorff) measure of noncompactness $\beta : \mathcal{B} \to \mathbb{R}_+$ is defined by
\[ \beta(C) = \inf\big\{ r > 0 : C \text{ can be covered by finitely many balls of radius } r \big\}. \]
In the next proposition we have gathered some basic properties of the above two measures of noncompactness. For details we refer to Denkowski, Migórski & Papageorgiou (2003b, pp. 14–15).

PROPOSITION 7.2.21 If $X$ is an infinite dimensional Banach space, $\mathcal{B}$ is the collection of all bounded subsets of $X$, $\gamma : \mathcal{B} \to \mathbb{R}_+$ is either $\alpha$ or $\beta$, then
(a) $\gamma(C) = 0$ if and only if $\overline{C}$ is compact;
(b) $\gamma(\lambda C) = |\lambda|\, \gamma(C)$ for all $\lambda \in \mathbb{R}$ and $\gamma(C_1 + C_2) \le \gamma(C_1) + \gamma(C_2)$ ($\gamma$ is a seminorm);
(c) if $C_1 \subseteq C_2$, then $\gamma(C_1) \le \gamma(C_2)$ (monotonicity);
(d) $\gamma(C_1 \cup C_2) \le \max\big\{ \gamma(C_1), \gamma(C_2) \big\}$ (semi-additivity);
(e) $\big|\gamma(C_1) - \gamma(C_2)\big| \le k\, h(C_1, C_2)$, with
\[ k = \begin{cases} 2 & \text{if } \gamma = \alpha, \\ 1 & \text{if } \gamma = \beta \end{cases} \]
and $h$ being the Hausdorff semimetric on $\mathcal{B}$; in particular $\gamma\big(\overline{C}\big) = \gamma(C)$;
(f) $\gamma(C) = \gamma(\mathrm{conv}\, C)$.

REMARK 7.2.22 It is clear from Definition 7.2.20 that we have
\[ \beta(C) \le \alpha(C) \le 2\beta(C) \quad \forall\, C \in \mathcal{B}. \]
Also
\[ \alpha(B_r) = \alpha\big(\partial B_r\big) = 2r \quad \forall\, r > 0 \]
and
\[ \beta(B_r) = \beta\big(\partial B_r\big) = r \quad \forall\, r > 0, \]
provided that $X$ is infinite dimensional. Here
\[ B_r = \big\{ x \in X : \|x\|_X < r \big\}. \]
Also by virtue of Proposition 7.2.21(a) and (b), the same is true for
\[ B_r(x) = x + B_r, \quad x \in X. \]
Using the measure of noncompactness, we can extend Cantor’s theorem (see Theorem A.1.11) as follows.
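PROPOSITION 7.2.23 If $X$ is a Banach space, $\mathcal{B}$ is the collection of all bounded subsets of $X$, $\gamma : \mathcal{B} \to \mathbb{R}_+$ is either $\alpha$ or $\beta$, $\{C_n\}_{n \ge 1} \subseteq \mathcal{B}$ is a decreasing sequence of nonempty closed sets, such that $\gamma(C_n) \searrow 0$, then
\[ \widehat{C} \stackrel{df}{=} \bigcap_{n=1}^{\infty} C_n \text{ is a nonempty and compact set.} \]

PROOF Choose $x_n \in C_n$ for $n \ge 1$. Then from Proposition 7.2.21(a) and (b) we have
\[ \gamma\big(\{x_n\}_{n \ge 1}\big) = \gamma\big(\{x_n\}_{n \ge k}\big) \quad \forall\, k \ge 1 \]
and
\[ \gamma\big(\{x_n\}_{n \ge k}\big) \le \gamma(C_k). \]
So
\[ \gamma\big(\{x_n\}_{n \ge 1}\big) = 0 \]
and this implies that the sequence $\{x_n\}_{n \ge 1} \subseteq X$ is relatively compact (see Proposition 7.2.21(a)). We may assume that
\[ x_n \to x \quad \text{in } X. \]
Evidently $x \in \widehat{C}$ and so $\widehat{C} \ne \emptyset$. Also
\[ \gamma(\widehat{C}) \le \gamma(C_n) \quad \forall\, n \ge 1, \]
hence $\gamma(\widehat{C}) = 0$. Since $\widehat{C}$ is closed, we conclude that $\widehat{C}$ is compact.

DEFINITION 7.2.24 Let $X$ be a Banach space, $\mathcal{B}$ the collection of all bounded subsets of $X$, $\gamma : \mathcal{B} \to \mathbb{R}_+$ either $\alpha$ or $\beta$, $C \subseteq X$ and $\varphi : C \to X$ a continuous map.

(a) $\varphi$ is $\gamma$-Lipschitz, if
\[ \gamma\big(\varphi(C)\big) \le k\, \gamma(C) \quad \forall\, C \in \mathcal{B}, \]
for some $k > 0$;

(b) $\varphi$ is a $\gamma$-contraction, if it is $\gamma$-Lipschitz with constant $k < 1$;

(c) $\varphi$ is $\gamma$-condensing, if
\[ \gamma\big(\varphi(C)\big) < \gamma(C) \quad \forall\, C \in \mathcal{B}, \text{ with } \gamma(C) > 0. \]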
REMARK 7.2.25 Recall that in a Hilbert space $H$, the $c$-radial retraction
\[ r_c : H \to \overline{B}_c \]
is nonexpansive. In particular then it is $\alpha$-Lipschitz with constant $k = 1$. The same is true for the metric projection map on a closed, convex set $C \subseteq H$.

The next result extends Proposition 3.1.17.

PROPOSITION 7.2.26 If $X$ is a Banach space, $C \subseteq X$ is a nonempty, closed, bounded set and $\varphi : C \to X$ is $\gamma$-condensing with $\gamma$ being either $\alpha$ or $\beta$, then $\mathrm{id}_X - \varphi$ is proper (see Definition 3.1.13) and maps closed subsets of $C$ onto closed sets in $X$.

PROOF Let $K \subseteq X$ be a nonempty compact set and let
\[ D \stackrel{df}{=} (\mathrm{id}_X - \varphi)^{-1}(K). \]
Since $\mathrm{id}_X - \varphi$ is continuous, $D$ is closed. Moreover, since $D \subseteq K + \varphi(D)$, we have
\[ \gamma(D) \le \gamma(K) + \gamma\bigl(\varphi(D)\bigr) = \gamma\bigl(\varphi(D)\bigr) \]
(see Proposition 7.2.21(a)). Since $\varphi$ is $\gamma$-condensing, it follows that $\gamma(D) = 0$ and so $D$ is compact.

Next let $B \subseteq C$ be a closed set and consider $x_n \in (\mathrm{id}_X - \varphi)(B)$, such that $x_n \to x$ in $X$. We have
\[ x_n = (\mathrm{id}_X - \varphi)(u_n) \qquad \forall\, n \ge 1, \]
with $u_n \in B$. The set $K = \overline{\{x_n\}_{n \ge 1}}$ is compact and so $(\mathrm{id}_X - \varphi)^{-1}(K)$ is compact. Therefore, we may assume that $u_n \to u$, for some $u \in B$. Due to the continuity of $\mathrm{id}_X - \varphi$, we have $x = (\mathrm{id}_X - \varphi)(u)$ and so
\[ x \in (\mathrm{id}_X - \varphi)(B), \]
i.e., $(\mathrm{id}_X - \varphi)(B)$ is closed.
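Now we are ready to have another generalization of Theorem 7.2.11.

THEOREM 7.2.27 (Sadovskii Fixed Point Theorem) If $X$ is a Banach space, $C \subseteq X$ is a nonempty, bounded, closed, convex set and $\varphi : C \to C$ is a $\gamma$-condensing map, then $\varphi$ has at least one fixed point.

PROOF First suppose that $\varphi$ is a $\gamma$-contraction with constant $k \in (0, 1)$. Define a decreasing sequence $\{C_n\}_{n \ge 1}$ of closed subsets of $C$ as follows:
\[ C_1 \stackrel{df}{=} C \quad \text{and} \quad C_{n+1} \stackrel{df}{=} \overline{\operatorname{conv}}\, \varphi(C_n) \qquad \forall\, n \ge 1. \]
Using Proposition 7.2.21(e) and (f) and the fact that $\varphi$ is $\gamma$-Lipschitz with constant $k < 1$, we have
\[ \gamma(C_{n+1}) = \gamma\bigl(\overline{\operatorname{conv}}\, \varphi(C_n)\bigr) = \gamma\bigl(\varphi(C_n)\bigr) \le k \gamma(C_n) \le \ldots \le k^{n} \gamma(C_1) \to 0. \]
Invoking Proposition 7.2.23, we obtain that
\[ \widehat{C} = \bigcap_{n=1}^{\infty} C_n \]
is a nonempty and compact set. Moreover, $\widehat{C}$ is also convex and $\varphi : \widehat{C} \to \widehat{C}$. Applying Theorem 7.2.11, we obtain $x \in \widehat{C}$, such that $\varphi(x) = x$.

Now we remove the additional hypothesis that $\varphi$ is a $\gamma$-contraction. So suppose that $\varphi$ is only $\gamma$-condensing. Without any loss of generality, we may assume that $0 \in C$. Let $\{k_n\}_{n \ge 1} \subseteq (0, 1)$ and assume that $k_n \to 1$. Since $0 \in C$ and $C$ is convex, each map $k_n \varphi$ sends $C$ into itself and is a $\gamma$-contraction with constant $k_n$ (see Proposition 7.2.21(b)). So from the first part of the proof, we can find $x_n \in C$, such that
\[ k_n \varphi(x_n) = x_n \qquad \forall\, n \ge 1. \]
Hence, because $\varphi(C) \subseteq C$ is bounded,
\[ x_n - \varphi(x_n) = (k_n - 1) \varphi(x_n) \to 0. \]
But from Proposition 7.2.26, we know that $(\mathrm{id}_X - \varphi)(C)$ is closed and so $0 \in (\mathrm{id}_X - \varphi)(C)$. Therefore there exists $x \in C$, such that $\varphi(x) = x$.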
7.3 Partial Order and Fixed Points
The purpose of this section is to establish the importance of order structures in fixed point theory.

DEFINITION 7.3.1 Let $X$ be a nonempty set. A binary relation $\le$ in $X$ is said to be a partial order on $X$, if it satisfies the following properties:
(a) $x \le x$ for all $x \in X$ (i.e., $\le$ is reflexive);
(b) if $x \le y$ and $y \le z$ then $x \le z$ for all $x, y, z \in X$ (i.e., $\le$ is transitive);
(c) if $x \le y$ and $y \le x$ then $x = y$ for all $x, y \in X$ (i.e., $\le$ is antisymmetric).
The set $X$ equipped with the partial order $\le$ is said to be a partially ordered set and is denoted by $(X, \le)$ or simply by $X$ when the partial order $\le$ is clearly understood.

REMARK 7.3.2 Some authors drop from the definition of partial order the property of reflexivity. We will call such an order a strict partial order and denote it by $<$. We write $x \ge y$ instead of $y \le x$. Also $y < x$ or $x > y$ means that $y \le x$ and $x \ne y$ (it is a strict partial order).

DEFINITION 7.3.3 Let $(X, \le)$ be a partially ordered set and let $C \subseteq X$ be a nonempty set. An upper bound for $C$ is an element $x \in X$, such that $c \le x$ for all $c \in C$. A maximal element in $X$ is an element $x \in X$, such that if $x' \in X$ and $x \le x'$, then $x = x'$. A lower bound for $C$ and a minimal element of $X$ are defined by reversing the above inequalities. A set $C \subseteq X$ is said to be a bounded above set (respectively a bounded below set), if there exists an upper (respectively lower) bound for $C$. If it is both bounded below and above, then we say that $C$ is an order bounded set. An upper bound $x \in X$ for $C$ is said to be a least upper bound for $C$ (or supremum of $C$), if every upper bound $y$ for $C$ has the property $x \le y$. The greatest lower bound for $C$ (or infimum of $C$) is defined similarly. When they exist, they are necessarily unique and are denoted by $\sup C$ and $\inf C$ respectively. We say that $(X, \le)$ is a lattice, if for all $x, y \in X$, $\inf\{x, y\}$ and $\sup\{x, y\}$ exist and we say that $(X, \le)$ is a complete lattice, if $\sup C$ and $\inf C$ exist for every nonempty $C \subseteq X$.
Finally a total (or linear) order $\le$ on $X$ is a partial order with the property that if $x, y \in X$, $x \ne y$, then either $x \le y$ or $y \le x$. A chain in the partially ordered set $(X, \le)$ is a nonempty set $C \subseteq X$ on which $\le$ is a total order.

REMARK 7.3.4 The maximal elements need not be unique.
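A small computational illustration of Definition 7.3.3 and Remark 7.3.4 may help. The following minimal sketch assumes a concrete partially ordered set that is not taken from the text, namely $\{2, \ldots, 10\}$ ordered by divisibility; the helper names `leq` and `maximal_elements` are hypothetical.

```python
def leq(x, y):
    """Partial order on positive integers: x <= y iff x divides y."""
    return y % x == 0


def maximal_elements(X, leq):
    """Return all x in X such that x <= y with y in X forces y == x."""
    return [x for x in X if all(y == x for y in X if leq(x, y))]


X = range(2, 11)
maxima = maximal_elements(X, leq)

# A chain is a subset on which any two elements are comparable.
chain = [2, 4, 8]
is_chain = all(leq(a, b) or leq(b, a) for a in chain for b in chain)

print(maxima, is_chain)
```

In this example there are several maximal elements and none of them is an upper bound of the whole set, which is the situation Remark 7.3.4 points to.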
The basic existence theorem for maximal elements is the Kuratowski-Zorn lemma.

THEOREM 7.3.5 (Kuratowski-Zorn Lemma) If $(X, \le)$ is a partially ordered set and every chain of $X$ has an upper bound, then $X$ has at least one maximal element.

DEFINITION 7.3.6 Let $(X, \le)$ be a partially ordered set. For every $x \in X$ we set
\[ S_+(x) \stackrel{df}{=} \{ y \in X : x \le y \} \quad \text{and} \quad S_-(x) \stackrel{df}{=} \{ y \in X : y \le x \}. \]
The sets $S_+(x)$ and $S_-(x)$ are called respectively the right and the left section of $x$. The set
\[ [x, y] \stackrel{df}{=} S_+(x) \cap S_-(y) = \{ z \in X : x \le z \le y \} \]
is the order interval determined by $x$ and $y$. In particular
\[ [x, y] \ne \emptyset \iff x \le y. \]
Let $(X, \le_X)$ and $(Y, \le_Y)$ be two partially ordered sets and let $\varphi : X \to Y$. We say that $\varphi$ is increasing (respectively decreasing), if $x_1 \le_X x_2$ implies that $\varphi(x_1) \le_Y \varphi(x_2)$ (respectively $\varphi(x_2) \le_Y \varphi(x_1)$). We say that $\varphi$ is strictly increasing (respectively strictly decreasing), if $x_1 <_X x_2$ implies that $\varphi(x_1)$ [...] $0$.
Evidently every order cone is a cone but the converse is not true in general.
We introduce particular types of order cones which are useful in fixed point theory.

DEFINITION 7.3.15 Let $X$ be a Banach space and let $K \subseteq X$ be an order cone.
(a) We say that $K$ is reproducing (or generating) if $X = K - K$ and $K$ is total if $X = \overline{K - K}$.
(b) We say that $K$ is normal, if there exists $\delta > 0$, such that
\[ \|x + y\|_X \ge \delta \qquad \forall\, x, y \in K,\ \|x\|_X = \|y\|_X = 1, \]
i.e.,
\[ \inf_{x, y \in K \cap \partial B_1} \|x + y\|_X > 0. \]
(c) We say that $K$ is regular if every increasing sequence $\{x_n\}_{n \ge 1} \subseteq X$ which is order bounded converges, i.e., if
\[ x_1 \le x_2 \le \ldots \le x_n \le \ldots \le b \qquad \forall\, n \ge 1, \]
with $b \in X$, then $x_n \to x$ for some $x \in X$.
(d) We say that $K$ is fully regular, if every increasing sequence $\{x_n\}_{n \ge 1} \subseteq X$ which is norm bounded converges, i.e., if
\[ x_1 \le x_2 \le \ldots \le x_n \le \ldots \quad \text{and} \quad \|x_n\|_X \le M \qquad \forall\, n \ge 1, \]
for some $M > 0$, then $x_n \to x$ for some $x \in X$.
(e) We say that $K$ is minihedral, if for any $x, y \in X$, $\sup\{x, y\}$ exists.
(f) We say that $K$ is strongly minihedral, if every $C \subseteq X$, which is bounded from above, has a supremum.

REMARK 7.3.16 If $K$ is reproducing, every element $u \in X$ can be written as $u = x - y$ with $x, y \in K$. Geometrically normality means that the angle between two positive unit vectors is bounded away from $\pi$, i.e., a normal cone cannot be too large. Clearly an order cone $K$ is regular (respectively fully regular) if and only if every decreasing and order bounded (respectively norm bounded) sequence is convergent. Similarly an order cone $K$ is minihedral if and only if for all $x, y \in X$, $\inf\{x, y\}$ exists and $K$ is strongly minihedral if and only if every $C \subseteq X$ which is bounded below has an infimum. Finally note that $K$ is minihedral if and only if for every nonempty finite $C \subseteq X$, both $\sup C$ and $\inf C$ exist.
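The geometric reading of normality in Remark 7.3.16 can be checked numerically in the plane. The sketch below is only an illustration under stated assumptions: it takes $X = \mathbb{R}^2$ with the Euclidean norm and the hypothetical cone $K_\theta = \{(r\cos a, r\sin a) : r \ge 0,\ |a| \le \theta\}$ with $0 < \theta < \pi/2$, and approximates the constant $\delta$ of Definition 7.3.15(b) on a grid of angles. For this cone one expects $\inf \|x + y\| = 2\cos\theta$, which tends to $0$ as the opening angle $2\theta$ approaches $\pi$.

```python
import math


def normality_constant(theta, samples=200):
    """Grid approximation of inf ||x + y|| over unit vectors x, y in K_theta."""
    angles = [-theta + 2 * theta * i / samples for i in range(samples + 1)]
    best = float("inf")
    for a in angles:
        for b in angles:
            # x = (cos a, sin a), y = (cos b, sin b) are unit vectors of the cone
            s = math.hypot(math.cos(a) + math.cos(b), math.sin(a) + math.sin(b))
            best = min(best, s)
    return best


for theta in (math.pi / 6, math.pi / 4, math.pi / 2 - 0.01):
    print(theta, normality_constant(theta), 2 * math.cos(theta))
```

The minimum is attained at the two extreme rays of the cone, so the grid (which contains both endpoints) reproduces $2\cos\theta$ up to rounding.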
Next we prove some results which elaborate further on the notions introduced in Definition 7.3.15.

PROPOSITION 7.3.17 If $X$ is a Banach space and $K \subseteq X$ is an order cone which is solid, then $K$ is reproducing.

PROOF Let $x_0 \in \operatorname{int} K$ and choose $r > 0$, such that
\[ B_r(x_0) = \{ x \in X : \|x - x_0\|_X < r \} \subseteq K. \]
So if $x \in X \setminus \{0\}$, then (recall that $K$ is closed)
\[ y = x_0 + r \frac{x}{\|x\|_X} \in K \quad \text{and} \quad x = \frac{\|x\|_X}{r} (y - x_0) \in K - K. \]
Therefore $K$ is reproducing.

The next proposition provides useful criteria for normality of an order cone.

PROPOSITION 7.3.18 If $X$ is a Banach space and $K \subseteq X$ is an order cone, then the following properties are equivalent:
(a) $K$ is normal;
(b) there exists $\beta > 0$, such that
\[ \|x + y\|_X \ge \beta \max\{\|x\|_X, \|y\|_X\} \qquad \forall\, x, y \in K; \]
(c) the norm of $X$ is semimonotone, i.e., there exists $\xi > 0$, such that if $0 \le x \le y$, then $\|x\|_X \le \xi \|y\|_X$;
(d) there exists an equivalent norm $\|\cdot\|_1$ on $X$ which is monotone, i.e., if $0 \le x \le y$, then $\|x\|_1 \le \|y\|_1$;
(e) if $x_n \le u_n \le y_n$ for all $n \ge 1$ and $x_n, y_n \to x$, then $u_n \to x$;
(f) the set $(\overline{B}_1 + K) \cap (\overline{B}_1 - K)$ is norm bounded;
(g) the order intervals
\[ [x, y] = \{ u \in X : x \le u \le y \} \]
are bounded for all $x, y \in X$.
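PROOF "(a)$\Longrightarrow$(b)": We may assume that $\|x\|_X = 1$ and $\|y\|_X \le 1$ (the case $y = 0$ being trivial). First
\[ 1 = \|x\|_X \le \|x + y\|_X + \|y\|_X. \]
So using the fact that $K$ is normal and the above estimate, we have
\[ \|x + y\|_X = \Bigl\| x + \frac{y}{\|y\|_X} - \frac{1 - \|y\|_X}{\|y\|_X}\, y \Bigr\|_X \ge \Bigl\| x + \frac{y}{\|y\|_X} \Bigr\|_X - 1 + \|y\|_X \ge \delta - 1 + \|y\|_X \ge \delta - \|x + y\|_X. \]
Therefore it follows that $\|x + y\|_X \ge \frac{\delta}{2}$. So (b) holds with $\beta = \frac{\delta}{2}$.

"(b)$\Longrightarrow$(c)": Suppose that the implication is not true. Then we can find two sequences $\{x_n\}_{n \ge 1}, \{y_n\}_{n \ge 1} \subseteq K$, such that
\[ 0 \le x_n \le y_n \qquad \forall\, n \ge 1 \]
and
\[ 0 < n \|y_n\|_X < \|x_n\|_X \qquad \forall\, n \ge 1. \]
Let us set
\[ u_n \stackrel{df}{=} \frac{x_n}{\|x_n\|_X} + \frac{y_n}{n \|y_n\|_X} \quad \text{and} \quad v_n \stackrel{df}{=} -\frac{x_n}{\|x_n\|_X} + \frac{y_n}{n \|y_n\|_X} \qquad \forall\, n \ge 1. \]
Evidently $u_n, v_n \in K$ and
\[ \|u_n\|_X \ge 1 - \frac{1}{n}, \qquad \|v_n\|_X \ge 1 - \frac{1}{n} \qquad \forall\, n \ge 1. \]
Because of (b), we have
\[ \frac{2}{n} = \|u_n + v_n\|_X \ge \beta \Bigl( 1 - \frac{1}{n} \Bigr). \]
Let $n \to +\infty$ to obtain that $0 \ge \beta > 0$, a contradiction.

"(c)$\Longrightarrow$(d)": For every $x \in X$, let
\[ \|x\|_1 = \inf_{y \le x} \|y\|_X + \inf_{z \ge x} \|z\|_X. \]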
We will show that $\|\cdot\|_1$ is a norm on $X$. First note that $\|0\|_1 = 0$. Conversely, if $\|x\|_1 = 0$, then for a given $\varepsilon > 0$, we can find $y, z \in X$, such that
\[ y \le x \le z \quad \text{and} \quad \|y\|_X, \|z\|_X \le \varepsilon. \]
Because of (c) and the fact that $0 \le x - y \le z - y$, we have
\[ \|x\|_X \le \|x - y\|_X + \|y\|_X \le \xi \|z - y\|_X + \|y\|_X \le (2\xi + 1)\varepsilon. \]
Since $\varepsilon > 0$ was arbitrary, we infer that $\|x\|_X = 0$, hence $x = 0$. Clearly for all $\lambda \in \mathbb{R}$ and all $x \in X$, we have $\|\lambda x\|_1 = |\lambda| \|x\|_1$. Next let $x, u \in X$. Then for a given $\varepsilon > 0$, we can find $y_1, z_1, y_2, z_2 \in X$, such that $y_1 \le x \le z_1$ and $y_2 \le u \le z_2$ and
\[ \|y_1\|_X + \|z_1\|_X \le \|x\|_1 + \varepsilon \quad \text{and} \quad \|y_2\|_X + \|z_2\|_X \le \|u\|_1 + \varepsilon. \]
Since $y_1 + y_2 \le x + u \le z_1 + z_2$, we have
\[ \|x + u\|_1 \le \|y_1 + y_2\|_X + \|z_1 + z_2\|_X \]
and so
\[ \|x + u\|_1 \le \|y_1\|_X + \|z_1\|_X + \|y_2\|_X + \|z_2\|_X \le \|x\|_1 + \|u\|_1 + 2\varepsilon. \]
Let $\varepsilon \searrow 0$ to conclude that
\[ \|x + u\|_1 \le \|x\|_1 + \|u\|_1 \]
and we have proved that $\|\cdot\|_1$ is a norm on $X$. In fact $\|\cdot\|_1$ is monotone. Indeed, if $0 \le x \le u$, then
\[ \inf_{y \le x} \|y\|_X = \inf_{y \le u} \|y\|_X = 0 \]
and so
\[ \|x\|_1 = \inf_{z \ge x} \|z\|_X \le \inf_{z \ge u} \|z\|_X = \|u\|_1. \]
It remains to show that $\|\cdot\|_X$ and $\|\cdot\|_1$ are equivalent norms. Clearly
\[ \|x\|_1 \le 2 \|x\|_X \qquad \forall\, x \in X. \]
On the other hand for every $y \le x \le z$, we have
\[ \|x\|_X \le \|x - y\|_X + \|y\|_X \le \xi \|z - y\|_X + \|y\|_X \le (\xi + 1)\bigl( \|y\|_X + \|z\|_X \bigr), \]
so
\[ \|x\|_X \le (\xi + 1) \|x\|_1. \]
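This proves that $\|\cdot\|_X$ and $\|\cdot\|_1$ are equivalent norms on $X$.

"(d)$\Longrightarrow$(e)": Since
\[ 0 \le u_n - x_n \le y_n - x_n \qquad \forall\, n \ge 1, \]
we have, for the monotone norm $\|\cdot\|_1$ given by (d),
\[ \|u_n - x_n\|_1 \le \|y_n - x_n\|_1 \qquad \forall\, n \ge 1. \]
Because $\|\cdot\|_1$ is equivalent to $\|\cdot\|_X$, it follows that $u_n \to x$ in $X$.

"(e)$\Longrightarrow$(f)": Suppose that the set $(\overline{B}_1 + K) \cap (\overline{B}_1 - K)$ is unbounded. We can find a sequence $\{u_n\}_{n \ge 1} \subseteq (\overline{B}_1 + K) \cap (\overline{B}_1 - K)$, such that $\|u_n\|_X \to +\infty$. Evidently $x_n \le u_n \le y_n$, with some $x_n, y_n \in \overline{B}_1$. Let
\[ v_n \stackrel{df}{=} \frac{x_n}{\|u_n\|_X}, \qquad z_n \stackrel{df}{=} \frac{u_n}{\|u_n\|_X}, \qquad w_n \stackrel{df}{=} \frac{y_n}{\|u_n\|_X} \qquad \forall\, n \ge 1. \]
We have $v_n \le z_n \le w_n$ and $v_n, w_n \to 0$ in $X$. So by (e), we have that $z_n \to 0$, a contradiction to the fact that $\|z_n\|_X = 1$ for all $n \ge 1$.

"(f)$\Longrightarrow$(g)": Since by hypothesis the set $(\overline{B}_1 + K) \cap (\overline{B}_1 - K)$ is bounded, we can find $r > 0$, such that
\[ (\overline{B}_1 + K) \cap (\overline{B}_1 - K) \subseteq r \overline{B}_1. \]
Let $x, y \in X$, $x \le y$ and $u \in [x, y]$. Let us set
\[ \eta \stackrel{df}{=} \max\{\|x\|_X, \|y\|_X\}. \]
If $\eta = 0$, then $u = 0$. Otherwise
\[ \frac{1}{\eta} u \in (\overline{B}_1 + K) \cap (\overline{B}_1 - K) \subseteq r \overline{B}_1 \]
and so $\|u\|_X \le r\eta$. This proves the norm boundedness of the order interval $[x, y]$.

"(g)$\Longrightarrow$(a)": Suppose that the implication is not true. We can find two sequences $\{x_n\}_{n \ge 1}, \{y_n\}_{n \ge 1} \subseteq K \cap \partial B_1$, such that
\[ \|x_n + y_n\|_X < \frac{1}{4^n} \qquad \forall\, n \ge 1. \]
Let us set
\[ u_n \stackrel{df}{=} \frac{x_n}{\|x_n + y_n\|_X^{1/2}} \quad \text{and} \quad v_n \stackrel{df}{=} \frac{x_n + y_n}{\|x_n + y_n\|_X^{1/2}} \qquad \forall\, n \ge 1. \]
Then $0 \le u_n \le v_n$ and
\[ \sum_{n=1}^{\infty} \|v_n\|_X = \sum_{n=1}^{\infty} \|x_n + y_n\|_X^{1/2} \le \sum_{n=1}^{\infty} \frac{1}{2^n} < +\infty. \]
It follows that
\[ \sum_{n=1}^{\infty} v_n = v \in X, \]
$0 \le u_n \le v_n \le v$ and $\|u_n\|_X > 2^n$. So the order interval $[0, v]$ is unbounded, which contradicts (g).

PROPOSITION 7.3.19 If $X$ is a Banach space and $K \subseteq X$ is an order cone, then
\[ \bigl[ K \text{ is fully regular} \bigr] \Longrightarrow \bigl[ K \text{ is regular} \bigr] \Longrightarrow \bigl[ K \text{ is normal} \bigr]. \]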
PROOF First suppose that $K$ is regular and we will show that $K$ is normal. Suppose that the implication is not true. Then we can find two sequences $\{x_n\}_{n \ge 1}, \{y_n\}_{n \ge 1} \subseteq K \cap \partial B_1$, such that
\[ \|x_n + y_n\|_X \le \frac{1}{2^n}. \]
Let us set
\[ u_m \stackrel{df}{=} \sum_{n=1}^{m} x_n \le \sum_{n=1}^{m} (x_n + y_n) \]
and since
\[ \sum_{n=1}^{\infty} \|x_n + y_n\|_X \le \sum_{n=1}^{\infty} \frac{1}{2^n} < +\infty, \]
we have that
\[ \sum_{n=1}^{m} (x_n + y_n) \nearrow \sum_{n=1}^{\infty} (x_n + y_n) \in K. \]
Then the sequence $\{u_m\}_{m \ge 1}$ is increasing, order bounded and so by virtue of the regularity of $K$ we must have that
\[ u_m \to u \quad \text{in } X \text{ as } m \to +\infty, \]
for some $u \in X$. But
\[ \|u_{m+1} - u_m\|_X = \|x_{m+1}\|_X = 1 \qquad \forall\, m \ge 1, \]
a contradiction. This proves the normality of $K$.

Next assume that $K$ is fully regular and we will show that $K$ is normal. Suppose that the implication is not true. As before, we can find two sequences $\{x_n\}_{n \ge 1}, \{y_n\}_{n \ge 1} \subseteq K \cap \partial B_1$, such that
\[ \|x_n + y_n\|_X \le \frac{1}{2^n}. \]
Let us set
\[ u_{2m} = \sum_{n=1}^{2m} (x_n + y_n) \qquad \forall\, m \ge 1 \]
and
\[ u_{2m+1} = u_{2m} + x_{2m+1} \qquad \forall\, m \ge 1. \]
Clearly the sequence $\{u_m\}_{m \ge 2} \subseteq K$ is increasing and norm bounded and so because $K$ is fully regular, we have that
\[ u_m \to x \quad \text{in } X \text{ as } m \to +\infty, \]
for some $x \in X$. But note that
\[ \|u_{2m+1} - u_{2m}\|_X = \|x_{2m+1}\|_X = 1, \]
a contradiction. This proves the normality of $K$.

Next assume that $K$ is fully regular and we will show that $K$ is regular. Consider an increasing sequence $\{x_n\}_{n \ge 1} \subseteq X$ which is order bounded, i.e., there exists $y \in X$, such that
\[ x_1 \le x_2 \le \ldots \le x_n \le \ldots \le y \qquad \forall\, n \ge 1. \]
Then $0 \le y - x_n \le y - x_1$ and so, because of the previous implication and Proposition 7.3.18(c), we have
\[ \|y - x_n\|_X \le \xi \|y - x_1\|_X \qquad \forall\, n \ge 1. \]
So the sequence $\{x_n\}_{n \ge 1} \subseteq X$ is increasing and norm bounded and because of the full regularity of $K$ we have
\[ x_n \to x \quad \text{in } X, \]
for some $x \in X$. This proves the regularity of $K$.
In ordered Banach spaces the positive linear functionals (i.e., $x^* \in X^*$, such that $\langle x^*, x \rangle_X \ge 0$ for all $x \in K$) play a role similar in importance to that of the linear functionals on subspaces. For this reason, we introduce the following definition.

DEFINITION 7.3.20 Let $X$ be a Banach space and $K \subseteq X$ an order cone. The set
\[ K^* \stackrel{df}{=} \{ x^* \in X^* : \langle x^*, x \rangle_X \ge 0 \text{ for all } x \in K \} \]
is called the dual cone of $K$.

REMARK 7.3.21 Clearly $K^*$ is a cone but it may not be an order cone, because the condition
\[ K^* \cap (-K^*) = \{0\} \]
may be violated (see Definition 7.3.13). It is easy to check that $K^*$ is an order cone $\iff$ $K$ is total.

We will show that the notions of normality and reproducing are dual for the cones $K$ and $K^*$. First an auxiliary result.

LEMMA 7.3.22 If $X$ is a Banach space, $K$ is an order cone and
\[ C \stackrel{df}{=} K \cap \overline{B}_1, \]
then
\[ \bigl[ K \text{ is reproducing} \bigr] \iff \bigl[ r \overline{B}_1 \subseteq \overline{C - C} \text{ for some } r > 0 \bigr] \iff \bigl[ \eta \overline{B}_1 \subseteq C - C \text{ for some } \eta > 0 \bigr]. \]

PROOF First suppose that $K$ is reproducing and we will show that $r \overline{B}_1 \subseteq \overline{C - C}$, for some $r > 0$. Since $X = K - K$, we have
\[ X = \bigcup_{n=1}^{\infty} n (C - C); \]
hence by Baire category theorem (see Theorem A.1.10), we have that
\[ \operatorname{int} \overline{C - C} \ne \emptyset. \]
So we can find $r > 0$, such that $\overline{B}_r = r \overline{B}_1 \subseteq \overline{C - C}$.

Next suppose that $r \overline{B}_1 \subseteq \overline{C - C}$ for some $r > 0$ and we will show that $\eta \overline{B}_1 \subseteq C - C$, for some $\eta > 0$. Let $\eta = \frac{r}{2}$ and choose $x_n \in \frac{1}{2^n} (C - C)$, such that
\[ \Bigl\| x - \sum_{k=1}^{n} x_k \Bigr\|_X \]
[...] $0$, then obviously $K$ is reproducing.

Using the above lemma we can establish the duality between the notions of normality and reproducing.

THEOREM 7.3.23 (Krein Theorem) If $X$ is a Banach space, $K \subseteq X$ is an order cone and $K^* \subseteq X^*$ the dual cone, then
(a) $K$ is reproducing if and only if $K^*$ is normal;
(b) $K$ is normal if and only if $K^*$ is reproducing.

PROOF (a) First suppose that $K$ is reproducing. By virtue of Lemma 7.3.22, there exists $\eta > 0$, such that
\[ \eta \overline{B}_1 \subseteq C - C, \]
where
\[ C \stackrel{df}{=} K \cap \overline{B}_1. \]
Let $0 \le x^* \le y^*$ (see Remark 7.3.21) and $x \in X \setminus \{0\}$. Then
\[ \frac{\eta x}{\|x\|_X} \in \eta \overline{B}_1, \]
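so
\[ \frac{\eta x}{\|x\|_X} = u - v, \]
for some $u, v \in C$ and thus
\[ \langle x^*, x \rangle_X = \frac{1}{\eta} \|x\|_X \langle x^*, u - v \rangle_X. \]
Hence
\[ \langle x^*, x \rangle_X \le \frac{1}{\eta} \|x\|_X \langle x^*, u \rangle_X \le \frac{1}{\eta} \|x\|_X \langle y^*, u \rangle_X \le \frac{1}{\eta} \|x\|_X \|y^*\|_{X^*}. \]
Since $x \in X$ was arbitrary, we obtain that
\[ \|x^*\|_{X^*} \le \frac{1}{\eta} \|y^*\|_{X^*} \]
and this by virtue of Proposition 7.3.18 implies that $K^*$ is normal.

Next suppose that $K^*$ is normal. We argue indirectly. Suppose that $K$ is not reproducing. Then $\overline{C - C}$ is not a neighbourhood of the origin (see Lemma 7.3.22). So for every $n \ge 1$, we can find $x \in X$, such that $\|x\|_X < \frac{1}{n}$ and $x \notin \overline{C - C}$, and then, by the separation theorem for convex sets, $x^* \in X^*$, such that
\[ \langle x^*, x \rangle_X > 1 \quad \text{and} \quad \langle x^*, u \rangle_X < 1 \qquad \forall\, u \in C - C. \]
In particular then $\|x^*\|_{X^*} > n$. We claim that
\[ x^* \in \bigl( \overline{B}{}^*_1 + K^* \bigr) \cap \bigl( \overline{B}{}^*_1 - K^* \bigr), \tag{7.5} \]
where
\[ \overline{B}{}^*_1 = \{ x^* \in X^* : \|x^*\|_{X^*} \le 1 \}. \]
Suppose that
\[ x^* \notin \overline{B}{}^*_1 + K^*. \]
Since $\overline{B}{}^*_1$ is $w^*$-compact and $K^*$ is $w^*$-closed, the set $\overline{B}{}^*_1 + K^*$ is $w^*$-closed. So by the separation theorem for convex sets (see Theorem A.3.1), we can find $u \in X$, such that
\[ \langle x^*, u \rangle_X > 1 > \langle y^*, u \rangle_X \qquad \forall\, y^* \in \overline{B}{}^*_1 + K^*. \]
Note that
\[ 0 \in K^* \cap \overline{B}{}^*_1. \]
So
\[ \langle y^*, u \rangle_X \le 1 \qquad \forall\, y^* \in \overline{B}{}^*_1 \]
and
\[ \langle y^*, u \rangle_X \le 1 \qquad \forall\, y^* \in K^*. \]
Hence $\|u\|_X \le 1$ and $u \in -K$, i.e.,
\[ u \in \overline{B}_1 \cap (-K) = -C. \]
But then
\[ \langle x^*, u \rangle_X < 1, \]
a contradiction. So
\[ x^* \in \overline{B}{}^*_1 + K^*. \]
If
\[ x^* \notin \overline{B}{}^*_1 - K^*, \]
then $-x^* \notin \overline{B}{}^*_1 + K^*$ and so arguing as above, we conclude that $x^* \in \overline{B}{}^*_1 - K^*$. This proves (7.5). But then because of Proposition 7.3.18, we reach a contradiction, since $\|x^*\|_{X^*} > n$. This proves that $K$ is reproducing.

(b) First suppose that $K$ is normal. We argue indirectly. So suppose that $K^*$ is not reproducing. Let
\[ C^* \stackrel{df}{=} K^* \cap \overline{B}{}^*_1. \]
Since $C^* - C^*$ is $w^*$-closed, we have that $C^* - C^*$ is not a neighbourhood of the origin and so for any given $n \ge 1$, we can find $u \in X$, such that $\|u\|_X > n$ and
\[ \langle y^*, u \rangle_X < 1 \qquad \forall\, y^* \in C^* - C^*. \]
Note that
\[ \operatorname{int} \bigl( \overline{B}_1 + K \bigr) \ne \emptyset \quad \text{and} \quad \operatorname{int} \bigl( \overline{B}_1 - K \bigr) \ne \emptyset. \]
So as above we can use the separation theorem (see Theorem A.3.1) and obtain
\[ u \in \bigl( \overline{B}_1 + K \bigr) \cap \bigl( \overline{B}_1 - K \bigr), \]
a contradiction (see Proposition 7.3.18(f)).

Next suppose that $K^*$ is reproducing. Then from part (a), we have that
\[ K^{**} = \{ x^{**} \in X^{**} : \langle x^{**}, x^* \rangle_{X^*} \ge 0 \text{ for all } x^* \in K^* \} \]
is normal. Because $K \subseteq K^{**}$, we conclude that $K$ is normal too.

Using the above theorem we can improve the first implication of Proposition 7.3.19, provided we strengthen the structure of the space $X$.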
PROPOSITION 7.3.24 If $X$ is a weakly complete Banach space and $K \subseteq X$ is an order cone, then $K$ is fully regular if and only if $K$ is regular.

PROOF "$\Longrightarrow$": This follows from Proposition 7.3.19.

"$\Longleftarrow$": Let $\{x_n\}_{n \ge 1} \subseteq X$ be an increasing sequence which is norm bounded. Then for every $x^* \in K^*$, the real sequence $\{\langle x^*, x_n \rangle_X\}_{n \ge 1}$ is increasing and bounded, hence converges. Since $K$ is normal (see Proposition 7.3.19), we have that $K^*$ is reproducing (see Theorem 7.3.23(b)). So it follows that for every $x^* \in X^*$, the real sequence $\{\langle x^*, x_n \rangle_X\}_{n \ge 1}$ converges. Then by virtue of the weak completeness of $X$, we have that
\[ x_n \xrightarrow{w} x \quad \text{in } X, \]
for some $x \in X$. Since for every $x^* \in K^*$, the sequence $\{\langle x^*, x_n \rangle_X\}_{n \ge 1} \subseteq \mathbb{R}$ is increasing, we infer that
\[ \langle x^*, x_n \rangle_X \le \langle x^*, x \rangle_X \qquad \forall\, n \ge 1,\ x^* \in K^*, \]
so
\[ x_n \le x \qquad \forall\, n \ge 1. \]
Therefore the sequence $\{x_n\}_{n \ge 1}$ is order bounded and because of the regularity of $K$, we have that $x_n \to x$ in $X$.
REMARK 7.3.25 Recall that every reflexive Banach space is weakly complete. In fact for a reflexive Banach space, we can show that:
\[ \bigl[ K \text{ is fully regular} \bigr] \iff \bigl[ K \text{ is regular} \bigr] \iff \bigl[ K \text{ is normal} \bigr]. \]
For a proof we refer to Guo & Lakshmikantham (1988, pp. 10–12).

PROPOSITION 7.3.26 If $X$ is a separable Banach space and $K \subseteq X$ is a regular, minihedral order cone, then $K$ is strongly minihedral.

PROOF Let $C \subseteq X$ be a nonempty set which is bounded above. Let $\{x_n\}_{n \ge 1} \subseteq C$ be a sequence which is dense in $C$ and let us set
\[ u_n \stackrel{df}{=} \sup \{x_k\}_{k=1}^{n} \qquad \forall\, n \ge 1. \]
Since $K$ is minihedral, $u_n$ exists. As $C$ is bounded above, we have
\[ u_1 \le u_2 \le \ldots \le u_n \le \ldots \le y \qquad \forall\, n \ge 1, \]
for some $y \in X$. Because $K$ is regular, we have that
\[ u_n \to u \quad \text{in } X, \]
for some $u \in X$. Note that
\[ x_n \le u \qquad \forall\, n \ge 1. \]
If $x \in C$, we can find a subsequence of $\{x_n\}_{n \ge 1}$, which converges strongly to $x$. Therefore $x \le u$ and so $u$ is an upper bound for the set $C$. If $z$ is any other upper bound of $C$, then
\[ u_n \le z \qquad \forall\, n \ge 1 \]
and so $u \le z$. Therefore $u = \sup C$.

Combining Proposition 7.3.26 and Remark 7.3.25, we obtain the following result.

COROLLARY 7.3.27 If $X$ is a separable reflexive Banach space and $K \subseteq X$ is a normal, minihedral order cone, then $K$ is strongly minihedral.

Let us give some examples of cones illustrating the notions introduced in Definition 7.3.15.

EXAMPLE 7.3.28 (a) Let $\Omega \subseteq \mathbb{R}^N$ be a measurable set with finite Lebesgue measure and let $X = L^p(\Omega)$ with $p \in [1, +\infty]$. We consider
\[ K \stackrel{df}{=} \{ u \in L^p(\Omega) : u(z) \ge 0 \text{ for a.a. } z \in \Omega \}. \]
Then $K$ is an order cone. This order cone is reproducing (recall that $u = u^+ - u^-$) and normal (since the norm $\|\cdot\|_p$ is monotone; see Proposition 7.3.18). If $p = +\infty$, then $K$ is solid, but for $p \in [1, +\infty)$, we have $\operatorname{int} K = \emptyset$. Using the Lebesgue dominated convergence theorem (see Theorem A.2.2), we could see that $K$ is fully regular for every $p \in [1, +\infty)$ (this is trivial if $p \in (1, +\infty)$, since $L^p(\Omega)$ is reflexive and $K$ is normal; see Remark 7.3.25). Clearly $K$ is also minihedral for all $p \in [1, +\infty]$ and moreover, if $p \in [1, +\infty)$, then because $L^p(\Omega)$ is separable, $K$ is also strongly minihedral (see Proposition 7.3.26).

(b) Let $D \subseteq \mathbb{R}^N$ be compact and let $X = C(D)$. We consider
\[ K \stackrel{df}{=} \{ u \in C(D) : u(z) \ge 0 \text{ for all } z \in D \}. \]
Then $K$ is an order cone which is solid, generating and normal. $K$ is not regular. To see this let $D = [0, 1]$ and let
\[ x_n(t) \stackrel{df}{=} 1 - t^n \qquad \forall\, n \ge 1,\ t \in [0, 1]. \]
Then
\[ x_1 \le x_2 \le \ldots \le x_n \le \ldots \le y \qquad \forall\, n \ge 1, \]
with
\[ y(t) = 1 \qquad \forall\, t \in [0, 1]. \]
But the sequence $\{x_n\}_{n \ge 1}$ does not converge in $C([0, 1])$, since
\[ x_n(t) \to x(t) \qquad \forall\, t \in [0, 1], \]
where
\[ x(t) \stackrel{df}{=} \begin{cases} 1 & \text{if } t \ne 1, \\ 0 & \text{if } t = 1, \end{cases} \]
and the pointwise limit $x$ is not continuous. Also the order cone $K$ is minihedral but not strongly minihedral. To see this let $D = [0, 2]$ and let
\[ C \stackrel{df}{=} \bigl\{ u \in C([0, 2]) : u(t) < 1 \text{ for } t \in (0, 1) \text{ and } u(t) < 2 \text{ for } t \in (1, 2) \bigr\}. \]
Then $C$ is bounded above but $\sup C$ does not exist.

(c) Let $X = C^1([0, 2\pi])$ equipped with the norm
\[ \|x\|_{C^1} \stackrel{df}{=} \|x\|_\infty + \|x'\|_\infty \]
and let
\[ K \stackrel{df}{=} \bigl\{ u \in C^1([0, 2\pi]) : u(t) \ge 0 \text{ for all } t \in [0, 2\pi] \bigr\}. \]
It is easy to see that $K$ is an order cone, which is solid and generating (indeed $u = y - x$, where $y(t) = M > \|u\|_\infty$ for all $t \in [0, 2\pi]$ and $x(t) = M - u(t)$ for all $t \in [0, 2\pi]$). But $K$ is not normal. To see this note that if $K$ was normal, then by Proposition 7.3.18, we can find $\beta > 0$, such that, if $0 \le u \le v$, then $\|u\|_{C^1} \le \beta \|v\|_{C^1}$. Let
\[ u_n(t) \stackrel{df}{=} 1 - \cos(nt) \quad \text{and} \quad v_n(t) \stackrel{df}{=} 2 \qquad \forall\, t \in [0, 2\pi],\ n \ge 1. \]
Then
\[ 0 \le u_n \le v_n, \qquad \|u_n\|_{C^1} = 2 + n \quad \text{and} \quad \|v_n\|_{C^1} = 2 \]
and so we cannot have
\[ \|u_n\|_{C^1} \le \beta \|v_n\|_{C^1} \qquad \forall\, n \ge 1. \]
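The two counterexamples in parts (b) and (c) of Example 7.3.28 can be checked numerically. The sketch below is only a sanity check under stated assumptions: sup norms are approximated on a uniform grid (the helper `sup_norm` and the grid size are choices made here, not part of the text). For (b) it uses the fact that $x_{2n}(t) - x_n(t) = t^n - t^{2n}$ equals $1/4$ at $t = 2^{-1/n}$, so $\{x_n\}_{n \ge 1}$ is not Cauchy in the sup norm. For (c) it evaluates $\|u_n\|_{C^1}$ for $u_n(t) = 1 - \cos(nt)$.

```python
import math


def sup_norm(f, a, b, samples=20001):
    """Grid approximation of the sup norm of f on [a, b]."""
    return max(abs(f(a + (b - a) * i / (samples - 1))) for i in range(samples))


def gap(n):
    """Value of x_{2n} - x_n = t**n - t**(2n) at t = 2**(-1/n), for x_n(t) = 1 - t**n."""
    t = 2.0 ** (-1.0 / n)
    return t**n - t ** (2 * n)


def c1_norm_u(n):
    """Approximate C^1 norm of u_n(t) = 1 - cos(n t) on [0, 2*pi]."""
    value = sup_norm(lambda t: 1.0 - math.cos(n * t), 0.0, 2 * math.pi)
    slope = sup_norm(lambda t: n * math.sin(n * t), 0.0, 2 * math.pi)
    return value + slope


for n in (1, 5, 25):
    print(n, gap(n), c1_norm_u(n))
```

The first column stays near $1/4$ for every $n$, while the second grows like $2 + n$ although $\|v_n\|_{C^1} = 2$ for all $n$.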
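Before passing to some fixed point theorems, let us state a proposition, which contains some useful properties of the elements of $K$ and of $K^*$.

PROPOSITION 7.3.29 If $X$ is a Banach space, $K \subseteq X$ is an order cone and $K^* \subseteq X^*$ is the dual cone, then
(a) if $K^*$ is nontrivial (i.e., $K^* \ne \{0\}$), then $x \in K$ if and only if
\[ \langle x^*, x \rangle_X \ge 0 \qquad \forall\, x^* \in K^*; \]
if $x \in K \setminus \{0\}$, then there exists $\widehat{x}^* \in K^*$, such that $\langle \widehat{x}^*, x \rangle_X > 0$;
(b) if $K$ is solid, then $x \in \operatorname{int} K$ if and only if
\[ \langle x^*, x \rangle_X > 0 \qquad \forall\, x^* \in K^* \setminus \{0\}; \]
(c) if $X$ is separable, then there exists $\widehat{x}^* \in K^*$, such that
\[ \langle \widehat{x}^*, x \rangle_X > 0 \qquad \forall\, x \in K \setminus \{0\}. \]

PROOF (a) Let $z \notin K$. Then by the strong separation theorem for convex sets (see Theorem A.3.2), we can find $x^* \in X^*$, such that
\[ \langle x^*, z \rangle_X < \eta \le \langle x^*, x \rangle_X \qquad \forall\, x \in K. \]
Since $0 \in K$, we see that $\eta \le 0$ and so $\langle x^*, z \rangle_X < 0$. Moreover, for all $x \in K$ and all $n \ge 1$, we have $\eta \le \langle x^*, nx \rangle_X$, hence
\[ \frac{\eta}{n} \le \langle x^*, x \rangle_X. \]
Let $n \to +\infty$ to obtain that
\[ \langle x^*, x \rangle_X \ge 0 \qquad \forall\, x \in K. \]
Therefore $x^* \in K^* \setminus \{0\}$. This proves (a).

(b) First suppose that $x \in \operatorname{int} K$. We will show that
\[ \langle x^*, x \rangle_X > 0 \qquad \forall\, x^* \in K^* \setminus \{0\}. \]
Since $x \in \operatorname{int} K$, we can find $r > 0$, such that
\[ B_r(x) = \{ y \in X : \|y - x\|_X < r \} \subseteq K. \]
So
\[ x \pm rz \ge 0 \qquad \forall\, z \in \overline{B}_1 = \{ z \in X : \|z\|_X \le 1 \}. \]
Thus from part (a), we have that
\[ \langle x^*, x \pm rz \rangle_X \ge 0 \qquad \forall\, x^* \in K^* \setminus \{0\}, \]
so
\[ \langle x^*, x \rangle_X \ge r \|x^*\|_{X^*} > 0. \]
Next suppose that
\[ \langle x^*, x \rangle_X > 0 \qquad \forall\, x^* \in K^* \setminus \{0\} \]
and we will show that $x \in \operatorname{int} K$. To proceed indirectly, suppose that $x \notin \operatorname{int} K$. Let
\[ K_1 \stackrel{df}{=} \{ y \in X : y = \lambda x,\ \lambda \ge 0 \}. \]
Evidently $K_1$ is also an order cone in $X$ and $K_1 \cap \operatorname{int} K = \emptyset$. So by virtue of the weak separation theorem for convex sets (see Theorem A.3.1), we can find $x^* \in X^* \setminus \{0\}$, such that
\[ \langle x^*, y \rangle_X \le \eta \le \langle x^*, z \rangle_X \qquad \forall\, y \in K_1,\ z \in K. \]
Since $0 \in K_1 \cap K$, we see that $\eta = 0$ and so $x^* \in K^* \setminus \{0\}$ and $\langle x^*, x \rangle_X \le 0$, a contradiction.

(c) Since $X$ is separable, the dual unit ball $\overline{B}{}^*_1 = \{ x^* \in X^* : \|x^*\|_{X^*} \le 1 \}$ furnished with the $w^*$-topology is compact metrizable. So we can find a sequence $\{x^*_n\}_{n \ge 1}$ dense in $K^* \cap \overline{B}{}^*_1$. Let
\[ u^* \stackrel{df}{=} \sum_{n=1}^{\infty} \frac{1}{n^2} x^*_n. \]
Evidently $u^* \in K^*$. We show that
\[ \langle u^*, x \rangle_X > 0 \qquad \forall\, x \in K \setminus \{0\}. \]
If $\langle u^*, x \rangle_X = 0$ for some $x \in K \setminus \{0\}$, then
\[ \langle x^*, x \rangle_X = 0 \qquad \forall\, x^* \in K^*, \]
which contradicts part (a).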
Now we state some fixed point theorems for maps defined on ordered Banach spaces. The results that we prove are in the spirit of Theorem 7.3.10 and reveal that there is a balance between the conditions on the order cone $K \subseteq X$ and on increasing maps $\varphi$. The stronger the hypotheses on $K$, the more general $\varphi$ can be and vice versa.

THEOREM 7.3.30 If $X$ is a Banach space, $K \subseteq X$ is an order cone, $x_0 < y_0$, $\varphi : [x_0, y_0] \to X$ is an increasing map, such that
\[ x_0 \le \varphi(x_0), \qquad \varphi(y_0) \le y_0 \]
and one of the following hypotheses holds:
(i) $K$ is normal and $\varphi$ is condensing; or
(ii) $K$ is regular and $\varphi$ is demicontinuous, i.e., if $x_n \to x$, then $\varphi(x_n) \xrightarrow{w} \varphi(x)$,
then $\varphi$ has a minimal fixed point $\underline{x}$ and a maximal fixed point $\overline{x}$ in $[x_0, y_0]$ and
\[ \underline{x} = \lim_{n \to +\infty} x_n \quad \text{and} \quad \overline{x} = \lim_{n \to +\infty} y_n, \]
where
\[ x_n \stackrel{df}{=} \varphi(x_{n-1}), \qquad y_n \stackrel{df}{=} \varphi(y_{n-1}) \qquad \forall\, n \ge 1 \]
and
\[ x_0 \le x_1 \le \ldots \le x_n \le \ldots \le \underline{x} \le \overline{x} \le \ldots \le y_n \le \ldots \le y_1 \le y_0. \]

PROOF If $x_n = \varphi(x_{n-1})$, $y_n = \varphi(y_{n-1})$ for $n \ge 1$, from the properties of $\varphi$, we see that
\[ x_0 \le x_1 \le \ldots \le x_n \le \ldots \le y_n \le \ldots \le y_1 \le y_0. \tag{7.6} \]
First suppose that hypothesis (i) is in effect. The sequence $C = \{x_n\}_{n \ge 0}$ is bounded (see Proposition 7.3.18) and
\[ C = \varphi(C) \cup \{x_0\}, \]
hence
\[ \gamma(C) = \gamma\bigl(\varphi(C)\bigr), \]
which by virtue of the fact that $\varphi$ is condensing implies that $\gamma(C) = 0$ (see Definition 7.2.24(c)), i.e., $C$ is relatively compact in $X$ (see Proposition 7.2.21). So we can find a subsequence $\{x_{n_k}\}_{k \ge 1}$ of $\{x_n\}_{n \ge 1}$, such that
\[ x_{n_k} \to \underline{x}. \]
From (7.6), it is clear that
\[ x_n \le \underline{x} \le y_n \qquad \forall\, n \ge 1. \]
For $m > n_k$, we have
\[ 0 \le \underline{x} - x_m \le \underline{x} - x_{n_k} \]
and because $K$ is by hypothesis normal, there is $\beta > 0$, such that
\[ \|\underline{x} - x_m\|_X \le \beta \|\underline{x} - x_{n_k}\|_X \]
(see Proposition 7.3.18), so $x_m \to \underline{x}$ as $m \to +\infty$. From the equality $\varphi(x_{n-1}) = x_n$, if we pass to the limit as $n \to +\infty$ and use the continuity of $\varphi$, we obtain
\[ \varphi(\underline{x}) = \underline{x}. \]
Next suppose that hypothesis (ii) is in effect. Because $K$ is regular, we have $x_n \to \underline{x}$ in $X$. Then from the equality
\[ \varphi(x_{n-1}) = x_n \qquad \forall\, n \ge 1 \]
and the demicontinuity of $\varphi$, we obtain $\varphi(\underline{x}) = \underline{x}$.

In a similar fashion, in both cases, we show that
\[ y_n \to \overline{x} \text{ in } X \quad \text{and} \quad \varphi(\overline{x}) = \overline{x}. \]
Finally we show that $\underline{x}$ and $\overline{x}$ are the minimal and maximal fixed points of $\varphi$ on $[x_0, y_0]$ respectively. Suppose that $u \in [x_0, y_0]$ is such that $\varphi(u) = u$. Then since $\varphi$ is increasing, we have
\[ x_1 \le \varphi(u) = u \le y_1 \]
and continuing this way, we obtain
\[ x_n \le u \le y_n \qquad \forall\, n \ge 1. \]
Passing to the limit as $n \to +\infty$, we obtain $\underline{x} \le u \le \overline{x}$, which proves the theorem.
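The monotone iteration in Theorem 7.3.30 is easy to try out in a finite dimensional setting. The following minimal sketch assumes $X = \mathbb{R}^2$ with the cone $K = \mathbb{R}^2_+$ (componentwise order, which is normal and regular) and a hypothetical increasing map $\varphi(x)_i = x_i + 0.05 \sin(4\pi x_i)$; neither the map nor the function names come from the text. On the order interval $[x_0, y_0]$ with $x_0 = (0.1, 0.1)$ and $y_0 = (0.9, 0.9)$ the scalar map has the fixed points $0.25$, $0.5$, $0.75$, so the iteration from $x_0$ should climb to $(0.25, 0.25)$ and the iteration from $y_0$ should descend to $(0.75, 0.75)$.

```python
import math


def phi(x):
    """Increasing map for the componentwise order (derivative is at least 1 - 0.2*pi > 0)."""
    return [t + 0.05 * math.sin(4 * math.pi * t) for t in x]


def monotone_iteration(phi, start, steps=200):
    """Iterate x_n = phi(x_{n-1}) and return the final iterate."""
    x = list(start)
    for _ in range(steps):
        x = phi(x)
    return x


x0 = [0.1, 0.1]  # lower solution: x0 <= phi(x0) componentwise
y0 = [0.9, 0.9]  # upper solution: phi(y0) <= y0 componentwise

x_min = monotone_iteration(phi, x0)  # approximates the minimal fixed point
x_max = monotone_iteration(phi, y0)  # approximates the maximal fixed point
print(x_min, x_max)
```

Every fixed point of this map in $[x_0, y_0]$, for example $(0.5, 0.5)$, lies between the two limits, in line with the conclusion of the theorem.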
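THEOREM 7.3.31 If $X$ is a Banach space, $K \subseteq X$ is an order cone, which is strongly minihedral, $x_0 < y_0$ and $\varphi : [x_0, y_0] \to X$ is an increasing map, such that
\[ x_0 \le \varphi(x_0) \quad \text{and} \quad \varphi(y_0) \le y_0, \]
then $\varphi$ has a minimal fixed point $\underline{x}$ and a maximal fixed point $\overline{x}$ in $[x_0, y_0]$.

PROOF Let
\[ C \stackrel{df}{=} \bigl\{ x \in [x_0, y_0] : x \le \varphi(x) \bigr\}. \]
Clearly $x_0 \in C$ and $y_0$ is an upper bound of $C$. Because $K$ is strongly minihedral, we have that $\sup C = \overline{x}$ exists. We show that $\overline{x}$ is the maximal fixed point of $\varphi$ on $[x_0, y_0]$. We have
\[ x_0 \le x \le \overline{x} \le y_0 \qquad \forall\, x \in C \]
and so since $\varphi$ is increasing, we have
\[ x_0 \le \varphi(x_0) \le \varphi(x) \le \varphi(\overline{x}) \le \varphi(y_0) \le y_0, \]
hence
\[ x \le \varphi(x) \le \varphi(\overline{x}) \qquad \forall\, x \in C \]
and thus
\[ \sup C = \overline{x} \le \varphi(\overline{x}). \tag{7.7} \]
On the other hand, since $\overline{x} \le \varphi(\overline{x})$ and $\varphi$ is increasing, we have
\[ \varphi(\overline{x}) \le \varphi\bigl(\varphi(\overline{x})\bigr), \]
i.e., $\varphi(\overline{x}) \in C$. So we have
\[ \varphi(\overline{x}) \le \sup C = \overline{x} \]
and it follows that $\varphi(\overline{x}) = \overline{x}$ (see (7.7)). If $u \in [x_0, y_0]$ is a fixed point of $\varphi$, then $u \in C$ and so $u \le \overline{x}$. This proves the maximality of $\overline{x}$. A similar argument using the set
\[ C_1 \stackrel{df}{=} \bigl\{ x \in [x_0, y_0] : \varphi(x) \le x \bigr\} \]
gives us $\underline{x} = \inf C_1$ (see Remark 7.3.16), which is the minimal fixed point of $\varphi$ on $[x_0, y_0]$.

REMARK 7.3.32 We emphasize that in Theorem 7.3.31, $\varphi$ need not be continuous. For this reason we do not claim that $\underline{x}$ and $\overline{x}$ are obtained via a monotone iteration process. If $\varphi$ is continuous, then the iteration processes converge.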
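The construction in the proof of Theorem 7.3.31 uses only order properties, so it can be illustrated on a finite stand-in. The sketch below assumes a setting that is not from the text: the finite lattice $\{0, \ldots, 4\}^2$ with the componentwise order (in place of an order interval in a Banach space) and a hypothetical increasing map `phi` that is a step function, so no continuity is involved. It computes $\sup C$ for $C = \{x : x \le \varphi(x)\}$ and $\inf C_1$ for $C_1 = \{x : \varphi(x) \le x\}$ and compares them with the fixed points found by brute force.

```python
from itertools import product

N = 4
grid = list(product(range(N + 1), repeat=2))  # plays the role of [(0, 0), (N, N)]


def leq(x, y):
    """Componentwise partial order on pairs."""
    return all(a <= b for a, b in zip(x, y))


def phi(x):
    """Hypothetical increasing step map on the grid."""
    a, b = x
    return (2 if a + b >= 3 else 1, min(a, 3))


def sup(points):
    """Componentwise supremum of a nonempty finite set of pairs."""
    return tuple(max(c) for c in zip(*points))


def inf(points):
    """Componentwise infimum of a nonempty finite set of pairs."""
    return tuple(min(c) for c in zip(*points))


# Check that phi is increasing on the whole grid.
increasing = all(leq(phi(x), phi(y)) for x in grid for y in grid if leq(x, y))

C = [x for x in grid if leq(x, phi(x))]
C1 = [x for x in grid if leq(phi(x), x)]
x_max, x_min = sup(C), inf(C1)
fixed = [x for x in grid if phi(x) == x]
print(increasing, x_min, x_max, fixed)
```

Here $\sup C$ and $\inf C_1$ are themselves fixed points and bracket every other fixed point, exactly as in the proof.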
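THEOREM 7.3.33 If $X$ is a Banach space, $K \subseteq X$ is an order cone which is minihedral, $x_0 < y_0$ and $\varphi : [x_0, y_0] \to X$ is an increasing map, such that
\[ x_0 \le \varphi(x_0), \qquad \varphi(y_0) \le y_0 \]
and $\varphi([x_0, y_0])$ is relatively compact in $X$, then $\varphi$ has a minimal fixed point $\underline{x}$ and a maximal fixed point $\overline{x}$ in $[x_0, y_0]$.

PROOF Let
\[ C \stackrel{df}{=} \bigl\{ x \in \varphi([x_0, y_0]) : x \le \varphi(x) \bigr\}. \]
Since
\[ \varphi(x_0) \le \varphi\bigl(\varphi(x_0)\bigr), \]
we see that $\varphi(x_0) \in C$ and so $C \ne \emptyset$. Let $D$ be a chain in the partially ordered set $C$. Because
\[ D \subseteq \varphi([x_0, y_0]), \]
we have that $D$ is relatively compact and so separable. Let $\{x_n\}_{n \ge 1}$ be a sequence dense in $D$ and let us set
\[ y_n \stackrel{df}{=} \sup \{x_k\}_{k=1}^{n} \in D \qquad \forall\, n \ge 1 \]
(in fact $y_n$ equals one of the $x_k$ since $D$ is a chain). By virtue of the relative compactness of $D$, we can find a subsequence $\{y_{n_k}\}_{k \ge 1}$ of $\{y_n\}_{n \ge 1}$, such that
\[ y_{n_k} \to y \quad \text{in } X. \]
Because $y_1 \le y_2 \le \ldots \le y_n \le \ldots$, we have that
\[ x_n \le y_n \le y \qquad \forall\, n \ge 1 \tag{7.8} \]
and
\[ y \in \overline{D} \subseteq \overline{C} \subseteq \overline{\varphi([x_0, y_0])} \subseteq [x_0, y_0]. \]
From (7.8) and the density of $\{x_n\}_{n \ge 1}$ in $D$, we have that
\[ x \le y \qquad \forall\, x \in D \]
and so
\[ x \le \varphi(x) \le \varphi(y) \qquad \forall\, x \in D. \]
So $\varphi(y)$ is an upper bound of the set $D$. On the other hand, since
\[ y_n \le \varphi(y) \qquad \forall\, n \ge 1, \]
we have $y \le \varphi(y)$ and so
\[ \varphi(y) \le \varphi\bigl(\varphi(y)\bigr), \]
which proves that $\varphi(y)$ is an upper bound of $D$ in $C$. Invoking the Kuratowski-Zorn lemma (see Theorem 7.3.5), we can find a maximal element $\overline{x} \in C$ of $C$. Then $\overline{x} \le \varphi(\overline{x})$ and so
\[ \varphi(\overline{x}) \le \varphi\bigl(\varphi(\overline{x})\bigr), \]
i.e., $\varphi(\overline{x}) \in C$, which by virtue of the maximality of $\overline{x}$ implies that $\varphi(\overline{x}) = \overline{x}$.

We claim that $\overline{x}$ is the maximal fixed point of $\varphi$ in $[x_0, y_0]$. To this end let $u$ be a fixed point of $\varphi$ in $[x_0, y_0]$. Since $K$ is by hypothesis minihedral, $v = \sup\{\overline{x}, u\}$ exists. From the facts that
\[ \overline{x} \le v \quad \text{and} \quad u \le v, \]
we have
\[ \overline{x} = \varphi(\overline{x}) \le \varphi(v) \quad \text{and} \quad u = \varphi(u) \le \varphi(v). \]
Therefore $v \le \varphi(v)$ and so
\[ \varphi(v) \le \varphi\bigl(\varphi(v)\bigr), \]
which implies that $\varphi(v) \in C$. By the maximality of $\overline{x}$ in $C$ we must have $\varphi(v) = \overline{x}$ and so $u \le v \le \overline{x}$, i.e., $\overline{x}$ is the maximal fixed point of $\varphi$ in $[x_0, y_0]$. In a similar fashion using the set
\[ C_1 \stackrel{df}{=} \bigl\{ x \in \varphi([x_0, y_0]) : \varphi(x) \le x \bigr\}, \]
we obtain a minimal fixed point of $\varphi$ in $[x_0, y_0]$.

REMARK 7.3.34 Again we emphasize that the map $\varphi$ need not be continuous.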
COROLLARY 7.3.35 If $X$ is a Banach space, $K \subseteq X$ is an order cone, which is minihedral and normal, $x_0 < y_0$ and $\varphi : [x_0, y_0] \to X$ is an increasing map, which maps bounded sets to relatively compact sets and
\[ x_0 \le \varphi(x_0) \quad \text{and} \quad \varphi(y_0) \le y_0, \]
then $\varphi$ has a minimal fixed point $\underline{x}$ and a maximal fixed point $\overline{x}$ in $[x_0, y_0]$.

To produce additional fixed point theorems for maps on ordered Banach spaces as well as theorems on the existence of multiple fixed points, we need to introduce some tools from degree theory. Recall that if $X$ is a Hausdorff topological space and $C \subseteq X$ is a nonempty set, then $C$ is a retract of $X$, if there is a continuous map $r : X \to C$, such that $r|_C = \mathrm{id}_C$ (see Definition 7.2.3). Every nonempty, closed, convex subset of a Banach space is a retract and every retract is a closed set but not necessarily convex. The set $\partial B_1 = \{x \in X : \|x\|_X = 1\}$ is a retract of $\overline{B}_1 = \{x \in X : \|x\|_X \le 1\}$ if $\dim X = +\infty$. Also if $C$ is a retract of $X$ and $D$ is a retract of $C$, then $D$ is a retract of $X$.

DEFINITION 7.3.36 Let $X$ be a Banach space and consider the family
\[ M \stackrel{df}{=} \bigl\{ (\varphi, U, K) : K \subseteq X \text{ is a retract, } U \subseteq K \text{ is a bounded, relatively open set, } \varphi : \overline{U} \to K \text{ is compact and } \operatorname{Fix}(\varphi) \cap \partial U = \emptyset \bigr\}, \]
where
\[ \operatorname{Fix}(\varphi) \stackrel{df}{=} \{ x \in \overline{U} : \varphi(x) = x \}. \]
If $r : X \to K$ is a retraction, then for every $(\varphi, U, K) \in M$, we can define the $\mathbb{Z}$-valued fixed point index of $\varphi$ over $U$ with respect to $K$, by
\[ i(\varphi, U, K) \stackrel{df}{=} d_{LS}\bigl( \mathrm{id}_X - \varphi \circ r,\ r^{-1}(U),\ 0 \bigr), \]
where $d_{LS}$ denotes the Leray-Schauder degree.

REMARK 7.3.37 Using the properties of the Leray-Schauder degree (namely the homotopy invariance and excision properties), we can check that the above definition is independent of the retraction used. It extends the Leray-Schauder degree (just consider the case where $K = X$).

The fixed point index inherits the properties of the Leray-Schauder degree. So we can state the following theorem. We omit its straightforward proof.
THEOREM 7.3.38
On the set $M$ of admissible triples, there is a unique $\mathbb{Z}$-valued function $i(\varphi, U, K)$ called the fixed point index of $\varphi$ on $U$ with respect to $K$, which satisfies:

(a) Normalization: if $\varphi(x) = u_0 \in U$ for all $x \in \overline{U}$, then $i(\varphi, U, K) = 1$.

(b) Additivity: if $U_1$ and $U_2$ are disjoint open subsets of $U$ and $\mathrm{Fix}(\varphi) \cap \big( \overline{U} \setminus (U_1 \cup U_2) \big) = \emptyset$, then
\[ i(\varphi, U, K) = i(\varphi, U_1, K) + i(\varphi, U_2, K). \]

(c) Homotopy invariance: if $h : [0,1] \times \overline{U} \to K$ is compact and $h(t, x) \ne x$ for all $(t, x) \in [0,1] \times \partial U$, then $i\big( h(t, \cdot), U, K \big)$ is independent of $t \in [0,1]$.

(d) Reduction property: if $K_1$ is a retract of $K$ and $\varphi(\overline{U}) \subseteq K_1$, then $i(\varphi, U, K) = i(\varphi, U \cap K_1, K_1)$.

(e) Dependence on boundary values: if $\varphi = \psi$ on $\partial U$, then $i(\varphi, U, K) = i(\psi, U, K)$.

(f) Excision property: if $U_0$ is an open subset of $U$ and $\mathrm{Fix}(\varphi) \cap \big( \overline{U} \setminus U_0 \big) = \emptyset$, then $i(\varphi, U, K) = i(\varphi, U_0, K)$.

(g) Solution property: if $i(\varphi, U, K) \ne 0$, then there exists $x \in U$, such that $\varphi(x) = x$.
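REMARK 7.3.39 As in the case of the Leray-Schauder degree, the fixed point index can be extended to all $\gamma$-condensing maps (see Definition 7.2.24(c)). For details we refer to Nussbaum (1971).

In what follows all topological notions are understood with respect to the relative norm topology on $K$.

PROPOSITION 7.3.40
If $(\varphi, U, K) \in M$ (see Definition 7.3.36) and there exists $u \in U$, such that
\[ \varphi(x) - x \ne t(x - u) \quad \forall \; t \ge 0, \; x \in \partial U, \]
then $i(\varphi, U, K) = 1$.

PROOF Let
\[ h(t, x) \stackrel{df}{=} (1 - t)\varphi(x) + t u \quad \forall \; t \in [0,1], \; x \in \overline{U}. \]
We claim that
\[ h(t, x) \ne x \quad \forall \; (t, x) \in [0,1] \times \partial U. \]
Indeed, if there exist $t_1 \in [0,1]$ and $x_1 \in \partial U$, such that $(1 - t_1)\varphi(x_1) + t_1 u = x_1$, then $t_1 \ne 1$ (since $u \in U$ and $x_1 \in \partial U$) and so
\[ \varphi(x_1) - x_1 = \frac{t_1}{1 - t_1}(x_1 - u) \quad \text{with} \quad \frac{t_1}{1 - t_1} \ge 0, \]
which contradicts our hypothesis. So
\[ (1 - t)\varphi(x) + t u \ne x \quad \forall \; t \in [0,1], \; x \in \partial U. \]
Hence by the homotopy invariance property (see Theorem 7.3.38(c)), we have
\[ i(\varphi, U, K) = i(u, U, K) = 1 \]
(see Theorem 7.3.38(a)), where $u$ also denotes the constant map with value $u$.

PROPOSITION 7.3.41
If $(\varphi, U, K) \in M$ (see Definition 7.3.36) and there exists $u \in K \setminus \overline{U}$, such that
\[ \varphi(x) - x \ne t(x - u) \quad \forall \; t \ge 0, \; x \in \partial U, \]
then $i(\varphi, U, K) = 0$.

PROOF As before let
\[ h(t, x) \stackrel{df}{=} (1 - t)\varphi(x) + t u \quad \forall \; t \in [0,1], \; x \in \overline{U} \]
and we have
\[ h(t, x) \ne x \quad \forall \; (t, x) \in [0,1] \times \partial U. \]
So from the homotopy invariance of the fixed point index, we have
\[ i(\varphi, U, K) = i(u, U, K) = 0, \]
the last equality holding because the constant map with value $u \notin \overline{U}$ has no fixed point in $U$ (see Theorem 7.3.38(g)).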
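PROPOSITION 7.3.42
If $X$ is a Banach space, $K \subseteq X$ is an order cone, $U \subseteq X$ is a bounded open set, $\varphi : K \cap \overline{U} \to K$ and $\psi : K \cap \partial U \to K$ are compact maps and

(i) $\inf\big\{ \|\psi(x)\|_X : x \in \partial(K \cap U) = K \cap \partial U \big\} > 0$;

(ii) $x - \varphi(x) \ne t\psi(x)$ for all $t \ge 0$ and all $x \in \partial(K \cap U) = K \cap \partial U$,

then $i(\varphi, K \cap U, K) = 0$.

PROOF By virtue of Theorem 3.1.12, we can find a compact extension $\widehat{\psi} : K \cap \overline{U} \to K$ of $\psi$, such that
\[ \widehat{\psi}(K \cap \overline{U}) \subseteq \mathrm{conv}\, \psi(K \cap \partial U). \tag{7.9} \]
Let us set
\[ C \stackrel{df}{=} \psi(K \cap \partial U). \]
Then $\mathrm{conv}\, C = D$, where
\[ D \stackrel{df}{=} \Big\{ y = \sum_{k=1}^{n} \lambda_k y_k : y_k \in C, \; \lambda_k \ge 0, \; \sum_{k=1}^{n} \lambda_k = 1, \; n \ge 1 \Big\}. \]
We claim that
\[ \inf_{y \in D} \|y\|_X > 0. \tag{7.10} \]
Let
\[ X_0 \stackrel{df}{=} \overline{\mathrm{span}}\, C. \]
Since $C$ is relatively compact, $X_0$ is a separable Banach subspace of $X$. Also $K_0 = K \cap X_0$ is an order cone in $X_0$ and $\mathrm{conv}\, C \subseteq K_0$. From Proposition 7.3.29(a), we know that there exists $u_0^* \in K_0^*$, such that
\[ \langle u_0^*, y \rangle_{X_0} > 0 \quad \forall \; y \in K_0 \setminus \{0\}. \]
We will show that
\[ \inf_{y \in C} \langle u_0^*, y \rangle_{X_0} = m > 0. \tag{7.11} \]
If $m = 0$, then we can find a sequence $\{y_k\}_{k \ge 1} \subseteq C$, such that
\[ \langle u_0^*, y_k \rangle_{X_0} \searrow m = 0 \quad \text{as } k \to +\infty. \]
Since $C$ is relatively compact (recall that $\psi$ is compact), we may assume that $y_k \to y \in K_0$ as $k \to +\infty$ and so
\[ \langle u_0^*, y_k \rangle_{X_0} \to \langle u_0^*, y \rangle_{X_0} = 0, \]
which implies that $y = 0$ and so $\|y_k\|_X \to 0$, a contradiction to hypothesis (i). This proves that (7.11) is true.

For any $y = \sum_{k=1}^{n} \lambda_k y_k \in D$, we have
\[ \langle u_0^*, y \rangle_{X_0} = \sum_{k=1}^{n} \lambda_k \langle u_0^*, y_k \rangle_{X_0} \ge m \sum_{k=1}^{n} \lambda_k = m \]
(see (7.11)), so
\[ \langle u_0^*, y \rangle_{X_0} \ge m \quad \forall \; y \in D. \tag{7.12} \]
Because $\overline{D} = \overline{\mathrm{conv}}\, C$, the set $\overline{D}$ is compact in $X$ and so we can find $y_0 \in \overline{D}$, such that
\[ \inf_{y \in D} \|y\|_X = \|y_0\|_X \ge \frac{1}{\|u_0^*\|_{X_0^*}} \langle u_0^*, y_0 \rangle_{X_0} \ge \frac{m}{\|u_0^*\|_{X_0^*}} > 0 \]
(see (7.12), which extends to $\overline{D}$ by continuity), so $y_0 \ne 0$. This proves that (7.10) is true. From (7.9) and (7.10), we obtain
\[ \inf_{x \in K \cap \overline{U}} \big\| \widehat{\psi}(x) \big\|_X = \eta > 0. \tag{7.13} \]
Suppose that $i(\varphi, K \cap U, K) \ne 0$. Then because of hypothesis (ii) and the homotopy invariance property, we have
\[ i\big( \varphi + t\widehat{\psi}, K \cap U, K \big) = i\big( \varphi, K \cap U, K \big) \ne 0 \quad \forall \; t \ge 0. \]
In particular then for $t_0 > \frac{\vartheta + \mu}{\eta}$, where
\[ \vartheta = \sup_{x \in K \cap \overline{U}} \|x\|_X \quad \text{and} \quad \mu = \sup_{x \in K \cap \overline{U}} \big\| \varphi(x) \big\|_X, \]
we have
\[ i\big( \varphi + t_0 \widehat{\psi}, K \cap U, K \big) \ne 0 \]
and so by the solution property (see Theorem 7.3.38(g)), there exists $x_0 \in K \cap U$, such that
\[ \varphi(x_0) + t_0 \widehat{\psi}(x_0) = x_0. \]
Then
\[ t_0 = \frac{\|x_0 - \varphi(x_0)\|_X}{\|\widehat{\psi}(x_0)\|_X} \le \frac{\vartheta + \mu}{\eta}, \]
a contradiction.

COROLLARY 7.3.43
If $X$ is a Banach space, $K \subseteq X$ is an order cone, $U \subseteq X$ is a bounded open set, $\varphi : K \cap \overline{U} \to K$ is compact and there exists $x_0 > 0$, such that
\[ x - \varphi(x) \ne t x_0 \quad \forall \; t \ge 0, \; x \in K \cap \partial U, \]
then $i\big( \varphi, K \cap U, K \big) = 0$.

PROOF Let
\[ \psi(x) \stackrel{df}{=} x_0 \quad \forall \; x \in K \cap \partial U \]
and apply Proposition 7.3.42.

PROPOSITION 7.3.44
If $X$ is a Banach space, $K \subseteq X$ is an order cone, $U \subseteq X$ is a bounded open set, $\varphi : K \cap \overline{U} \to K$ is compact and

(i) $\inf\big\{ \|\varphi(x)\|_X : x \in K \cap \partial U \big\} > 0$;

(ii) $\varphi(x) \ne \lambda x$ for all $\lambda \in [0,1]$ and $x \in K \cap \partial U$,

then $i(\varphi, K \cap U, K) = 0$.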
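PROOF Suppose that there exist $t_0 \ge 0$ and $x_0 \in K \cap \partial U$, such that
\[ x_0 - \varphi(x_0) = t_0 \varphi(x_0). \]
Then $\varphi(x_0) = \lambda_0 x_0$, with $\lambda_0 = \frac{1}{1 + t_0} \in [0,1]$, a contradiction to hypothesis (ii). Therefore
\[ x - \varphi(x) \ne t\varphi(x) \quad \forall \; t \ge 0, \; x \in K \cap \partial U. \]
So we can apply Proposition 7.3.42 with $\psi = \varphi|_{K \cap \partial U}$ and finish the proof.

DEFINITION 7.3.45 Let $X$ be a Banach space, $C \subseteq X$ a nonempty, closed and convex set, $\xi : C \to \mathbb{R}$ a continuous, convex function and $\vartheta : C \to \mathbb{R}$ a continuous, concave function. For given $\lambda, \mu \in \mathbb{R}$, we define
\[ C(\xi, \lambda) \stackrel{df}{=} \big\{ x \in C : \xi(x) \le \lambda \big\}, \]
\[ C(\vartheta, \mu) \stackrel{df}{=} \big\{ x \in C : \vartheta(x) \ge \mu \big\}, \]
\[ C(\xi, \vartheta, \lambda, \mu) \stackrel{df}{=} C(\xi, \lambda) \cap C(\vartheta, \mu). \]

PROPOSITION 7.3.46
If $X$ is a Banach space, $C \subseteq X$ is a nonempty, closed, convex set, $\xi : C \to \mathbb{R}$ is a continuous, convex function, $\vartheta : C \to \mathbb{R}$ is a continuous, concave function, $\lambda, \mu \in \mathbb{R}$, the set $\{x \in C : \vartheta(x) < \mu\}$ is nonempty and bounded, $\{x \in C(\xi, \vartheta, \lambda, \mu) : \vartheta(x) > \mu\} \ne \emptyset$, the function
\[ \varphi : \overline{\{x \in C : \vartheta(x) < \mu\}} \to C \]
is compact,
\[ \vartheta\big( \varphi(x) \big) > \mu \quad \forall \; x \in C(\xi, \vartheta, \lambda, \mu) \]
and
\[ \vartheta\big( \varphi(x) \big) > \mu \quad \forall \; x \in C(\vartheta, \mu), \text{ with } \xi\big( \varphi(x) \big) > \lambda, \]
then $i(\varphi, U, C) = 0$, where
\[ U \stackrel{df}{=} \big\{ x \in C : \vartheta(x) < \mu \big\}. \]

PROOF By hypothesis, the set $U$ is nonempty, bounded and open in $C$. Let
\[ u \in \big\{ u \in C(\xi, \vartheta, \lambda, \mu) : \vartheta(u) > \mu \big\}. \]
We have
\[ \xi(u) \le \lambda \quad \text{and} \quad \vartheta(u) > \mu. \]
So $u \in C \setminus \overline{U}$. We claim that
\[ \varphi(x) - x \ne t(x - u) \quad \forall \; t \ge 0, \; x \in \partial U. \]
Indeed, if there exist $t_1 \ge 0$ and $x_1 \in \partial U$, such that $\varphi(x_1) - x_1 = t_1(x_1 - u)$, we have $\vartheta(x_1) = \mu$ and
\[ x_1 = \frac{1}{1 + t_1}\varphi(x_1) + \frac{t_1}{1 + t_1} u. \]
If $\xi\big( \varphi(x_1) \big) \le \lambda$, then exploiting the convexity of $\xi$, we have
\[ \xi(x_1) \le \frac{1}{1 + t_1}\xi\big( \varphi(x_1) \big) + \frac{t_1}{1 + t_1}\xi(u) \le \lambda. \]
Hence by virtue of the hypotheses, we have $\vartheta\big( \varphi(x_1) \big) > \mu$. Using the concavity of $\vartheta$, we obtain
\[ \mu = \vartheta(x_1) \ge \frac{1}{1 + t_1}\vartheta\big( \varphi(x_1) \big) + \frac{t_1}{1 + t_1}\vartheta(u) > \mu, \]
a contradiction. If $\xi\big( \varphi(x_1) \big) > \lambda$, then from the hypotheses, we have $\vartheta\big( \varphi(x_1) \big) > \mu$. So
\[ \mu = \vartheta(x_1) \ge \frac{1}{1 + t_1}\vartheta\big( \varphi(x_1) \big) + \frac{t_1}{1 + t_1}\vartheta(u) > \mu, \]
again a contradiction. So the hypotheses of Proposition 7.3.41 are satisfied and we have $i(\varphi, U, C) = 0$.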
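PROPOSITION 7.3.47
If $X$ is a Banach space, $C \subseteq X$ is a nonempty, closed, convex set, $\xi : C \to \mathbb{R}$ is a continuous, convex function, $\vartheta : C \to \mathbb{R}$ is a continuous, concave function, $\lambda, \mu \in \mathbb{R}$, the set $\{x \in C : \xi(x) < \lambda\}$ is nonempty and bounded, $\{x \in C(\xi, \vartheta, \lambda, \mu) : \xi(x) < \lambda\} \ne \emptyset$, the map $\varphi : \overline{\{x \in C : \xi(x) < \lambda\}} \to C$ is compact,
\[ \xi\big( \varphi(x) \big) < \lambda \quad \forall \; x \in C(\xi, \vartheta, \lambda, \mu) \]
and
\[ \xi\big( \varphi(x) \big) < \lambda \quad \forall \; x \in C(\xi, \lambda), \text{ with } \vartheta\big( \varphi(x) \big) < \mu, \]
then $i(\varphi, U, C) = 1$, where
\[ U \stackrel{df}{=} \big\{ x \in C : \xi(x) < \lambda \big\}. \]

PROOF By hypothesis, the set $U$ is nonempty, bounded and open in $C$. Let
\[ u \in \big\{ u \in C(\xi, \vartheta, \lambda, \mu) : \xi(u) < \lambda \big\}. \]
We have
\[ \xi(u) < \lambda \quad \text{and} \quad \vartheta(u) \ge \mu. \]
So $u \in U$. We claim that
\[ \varphi(x) - x \ne t(x - u) \quad \forall \; t \ge 0, \; x \in \partial U. \]
Indeed, if there exist $t_1 \ge 0$ and $x_1 \in \partial U$, such that $\varphi(x_1) - x_1 = t_1(x_1 - u)$, we have $\xi(x_1) = \lambda$ and
\[ x_1 = \frac{1}{1 + t_1}\varphi(x_1) + \frac{t_1}{1 + t_1} u. \]
If $\vartheta\big( \varphi(x_1) \big) \ge \mu$, then exploiting the concavity of $\vartheta$, we have
\[ \vartheta(x_1) \ge \frac{1}{1 + t_1}\vartheta\big( \varphi(x_1) \big) + \frac{t_1}{1 + t_1}\vartheta(u) \ge \mu. \]
Hence by virtue of the hypotheses, we have $\xi\big( \varphi(x_1) \big) < \lambda$. Using the convexity of $\xi$, we obtain
\[ \lambda = \xi(x_1) \le \frac{1}{1 + t_1}\xi\big( \varphi(x_1) \big) + \frac{t_1}{1 + t_1}\xi(u) < \lambda, \]
a contradiction. If $\vartheta\big( \varphi(x_1) \big) < \mu$, then from the hypotheses, we have $\xi\big( \varphi(x_1) \big) < \lambda$. So
\[ \lambda = \xi(x_1) \le \frac{1}{1 + t_1}\xi\big( \varphi(x_1) \big) + \frac{t_1}{1 + t_1}\xi(u) < \lambda, \]
again a contradiction. So the hypotheses of Proposition 7.3.40 are satisfied and we have $i(\varphi, U, C) = 1$.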
Using these degree theoretical tools, we can prove some more existence and multiplicity results for fixed points, based on the order structure of $X$. We start with the existence results which employ conditions of cone expansion and compression.

THEOREM 7.3.48
If $X$ is a Banach space, $K \subseteq X$ is an order cone, $U_1, U_2 \subseteq X$ are two nonempty, bounded open sets, such that
\[ 0 \in U_1 \quad \text{and} \quad \overline{U}_1 \subseteq U_2, \]
$\varphi : K \cap \big( \overline{U}_2 \setminus U_1 \big) \to K$ is a compact map and one of the following two conditions holds:

(i) $\varphi(x) \not\ge x$ for all $x \in K \cap \partial U_1$ and $\varphi(x) \not\le x$ for all $x \in K \cap \partial U_2$; or

(ii) $\varphi(x) \not\le x$ for all $x \in K \cap \partial U_1$ and $\varphi(x) \not\ge x$ for all $x \in K \cap \partial U_2$,

then $\varphi$ has at least one fixed point in $K \cap \big( U_2 \setminus \overline{U}_1 \big)$.

PROOF Invoking Theorem 3.1.12, we can find a compact map $\widehat{\varphi} : K \cap \overline{U}_2 \to K$, such that
\[ \widehat{\varphi}\big|_{K \cap (\overline{U}_2 \setminus U_1)} = \varphi. \]
First suppose that hypothesis (i) is in effect (i.e., we have cone expansion). Then
\[ \varphi(x) \ne \lambda x \quad \forall \; \lambda \ge 1, \; x \in K \cap \partial U_1. \tag{7.14} \]
Indeed, if there exist $\lambda_0 \ge 1$ and $x_0 \in K \cap \partial U_1$, such that $\varphi(x_0) = \lambda_0 x_0$, then $\varphi(x_0) \ge x_0$, which contradicts hypothesis (i). So (7.14) is true. Considering the homotopy
\[ h(t, x) \stackrel{df}{=} t\widehat{\varphi}(x) \quad \forall \; (t, x) \in [0,1] \times \big( K \cap \overline{U}_1 \big), \]
we see that
\[ h(t, x) \ne x \quad \forall \; (t, x) \in [0,1] \times \big( K \cap \partial U_1 \big) \]
(for $t = 0$ recall that $0 \in U_1$) and so by the homotopy invariance and normalization properties, we obtain
\[ i\big( \widehat{\varphi}, K \cap U_1, K \big) = 1. \tag{7.15} \]
On the other hand let $u_0 > 0$ be arbitrary. We have
\[ x - \varphi(x) \ne t u_0 \quad \forall \; t \ge 0, \; x \in K \cap \partial U_2. \tag{7.16} \]
Indeed, if there exist $t_1 \ge 0$ and $x_1 \in K \cap \partial U_2$, such that $x_1 - \varphi(x_1) = t_1 u_0$, then $\varphi(x_1) \le x_1$, which contradicts hypothesis (i). So (7.16) is true and we can use Corollary 7.3.43 and infer that
\[ i\big( \widehat{\varphi}, K \cap U_2, K \big) = 0. \tag{7.17} \]
From (7.15), (7.17) and the additivity property of the fixed point index, we have
\[ i\big( \widehat{\varphi}, K \cap \big( U_2 \setminus \overline{U}_1 \big), K \big) = i\big( \widehat{\varphi}, K \cap U_2, K \big) - i\big( \widehat{\varphi}, K \cap U_1, K \big) = -1. \tag{7.18} \]
Because of (7.18) and the solution property of the fixed point index, we can find $x \in K \cap \big( U_2 \setminus \overline{U}_1 \big)$, such that $\varphi(x) = x$.

If hypothesis (ii) is in effect, a similar reasoning gives
\[ i\big( \widehat{\varphi}, K \cap U_1, K \big) = 0 \quad \text{and} \quad i\big( \widehat{\varphi}, K \cap U_2, K \big) = 1 \]
and so from the additivity property of the fixed point index, it follows that
\[ i\big( \widehat{\varphi}, K \cap \big( U_2 \setminus \overline{U}_1 \big), K \big) = 1. \tag{7.19} \]
Therefore because of the solution property of the fixed point index, from (7.19) we can assert the existence of $x \in K \cap \big( U_2 \setminus \overline{U}_1 \big)$, such that $\varphi(x) = x$.
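THEOREM 7.3.49
If $X$ is a Banach space, $K \subseteq X$ is an order cone, $U_1, U_2 \subseteq X$ are two nonempty, bounded, open sets, such that
\[ 0 \in U_1 \quad \text{and} \quad \overline{U}_1 \subseteq U_2, \]
$\varphi : K \cap \big( \overline{U}_2 \setminus U_1 \big) \to K$ is a compact map and one of the following two conditions holds:

(i) $\|\varphi(x)\|_X \le \|x\|_X$ for all $x \in K \cap \partial U_1$ and $\|\varphi(x)\|_X \ge \|x\|_X$ for all $x \in K \cap \partial U_2$; or

(ii) $\|\varphi(x)\|_X \ge \|x\|_X$ for all $x \in K \cap \partial U_1$ and $\|\varphi(x)\|_X \le \|x\|_X$ for all $x \in K \cap \partial U_2$,

then $\varphi$ has at least one fixed point in $K \cap \big( \overline{U}_2 \setminus U_1 \big)$.

PROOF We will do the proof when condition (i) is in effect. The argument is similar if condition (ii) holds. Invoking Theorem 3.1.12, we can find a compact map $\widehat{\varphi} : K \cap \overline{U}_2 \to K$, such that
\[ \widehat{\varphi}\big|_{K \cap (\overline{U}_2 \setminus U_1)} = \varphi. \]
We may assume that $\varphi$ has no fixed points on $K \cap \partial U_2$ and $K \cap \partial U_1$ (otherwise there is nothing to prove). We claim that
\[ \varphi(x) \ne \lambda x \quad \forall \; \lambda \ge 1, \; x \in K \cap \partial U_1. \]
Indeed, suppose that there exist $\lambda_0 > 1$ and $x_0 \in K \cap \partial U_1$, such that $\varphi(x_0) = \lambda_0 x_0$. Then
\[ \|\varphi(x_0)\|_X = \lambda_0 \|x_0\|_X > \|x_0\|_X, \]
which contradicts (i). So from the homotopy invariance of the fixed point index, we have
\[ i\big( \widehat{\varphi}, K \cap U_1, K \big) = 1. \tag{7.20} \]
Also we claim that
\[ \varphi(x) \ne \lambda x \quad \forall \; x \in K \cap \partial U_2, \; \lambda \in (0,1]. \]
Indeed, suppose that there exist $\lambda_1 \in (0,1)$ and $x_1 \in K \cap \partial U_2$, such that $\varphi(x_1) = \lambda_1 x_1$. Then
\[ \|\varphi(x_1)\|_X = \lambda_1 \|x_1\|_X < \|x_1\|_X, \]
contradicting condition (i). Moreover, because of (i), we have
\[ \inf_{x \in K \cap \partial U_2} \|\varphi(x)\|_X \ge \inf_{x \in K \cap \partial U_2} \|x\|_X > 0. \]
Invoking Proposition 7.3.44, we obtain
\[ i\big( \widehat{\varphi}, K \cap U_2, K \big) = 0. \tag{7.21} \]
Then from (7.20) and (7.21) and the additivity property of the fixed point index, we have
\[ i\big( \widehat{\varphi}, K \cap \big( U_2 \setminus \overline{U}_1 \big), K \big) = i\big( \widehat{\varphi}, K \cap U_2, K \big) - i\big( \widehat{\varphi}, K \cap U_1, K \big) = -1 \ne 0. \]
The solution property of the fixed point index implies the existence of at least one fixed point in $K \cap \big( \overline{U}_2 \setminus U_1 \big)$.

THEOREM 7.3.50
If $X$ is a Banach space, $C \subseteq X$ is a nonempty, bounded, closed, convex set, $U_1, U_2 \subseteq C$ are both relatively open, $\overline{U}_1 \cap \overline{U}_2 = \emptyset$, $\varphi : C \to C$ is a compact map, such that

(i) there exists $u_1 \in U_1$, such that $\varphi(x) - x \ne \lambda(x - u_1)$ for all $\lambda \ge 0$, $x \in \partial U_1$;

(ii) there exists $u_2 \in U_2$, such that $\varphi(x) - x \ne \lambda(x - u_2)$ for all $\lambda \ge 0$, $x \in \partial U_2$,

then $\varphi$ has at least three fixed points $x_1, x_2, x_3 \in C$, such that $x_1 \in U_1$, $x_2 \in U_2$ and $x_3 \in C \setminus \overline{\big( U_1 \cup U_2 \big)}$.

PROOF We have $i(\varphi, C, C) = 1$. Also from Proposition 7.3.40, it follows that
\[ i(\varphi, U_1, C) = i(\varphi, U_2, C) = 1. \]
So exploiting the additivity of the fixed point index, we have
\[ i\big( \varphi, C \setminus \overline{\big( U_1 \cup U_2 \big)}, C \big) = i(\varphi, C, C) - i(\varphi, U_1, C) - i(\varphi, U_2, C) = 1 - 1 - 1 = -1 \ne 0. \]
Therefore there exist $x_1 \in U_1$, $x_2 \in U_2$ and $x_3 \in C \setminus \overline{\big( U_1 \cup U_2 \big)}$, such that
\[ \varphi(x_k) = x_k \quad \forall \; k \in \{1, 2, 3\}. \]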
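COROLLARY 7.3.51
If $X$ is a Banach space, $C \subseteq X$ is a nonempty, bounded, closed, convex set, $\xi : C \to \mathbb{R}$ is a continuous, convex function, $\vartheta : C \to \mathbb{R}$ is a continuous, concave function, $\lambda, \mu \in \mathbb{R}$,
\[ U_1 \stackrel{df}{=} \big\{ x \in C : \xi(x) < \lambda \big\} \ne \emptyset, \qquad U_2 \stackrel{df}{=} \big\{ x \in C : \vartheta(x) > \mu \big\} \ne \emptyset, \]
$\overline{U}_1 \cap \overline{U}_2 = \emptyset$ and $\varphi : C \to C$ is compact and

(i) $\xi\big( \varphi(x) \big) < \lambda$ for all $x \in C$ with $\xi(x) = \lambda$;

(ii) $\vartheta\big( \varphi(x) \big) > \mu$ for all $x \in C$ with $\vartheta(x) = \mu$,

then $\varphi$ has at least three distinct fixed points $x_1, x_2, x_3 \in C$, such that
\[ \xi(x_1) < \lambda, \quad \mu < \vartheta(x_2), \quad \lambda < \xi(x_3) \quad \text{and} \quad \vartheta(x_3) < \mu. \]

PROOF Choose $u_1 \in U_1$. We have $\xi(u_1) < \lambda$. Suppose that there exist $t \ge 0$ and $x \in \partial U_1$, such that $\varphi(x) - x = t(x - u_1)$. Then
\[ \xi(x) = \lambda \quad \text{and} \quad x = \frac{1}{1 + t}\varphi(x) + \frac{t}{1 + t} u_1. \]
Exploiting the convexity of $\xi$ and hypothesis (i), we obtain that
\[ \lambda = \xi(x) \le \frac{1}{1 + t}\xi\big( \varphi(x) \big) + \frac{t}{1 + t}\xi(u_1) < \lambda, \]
a contradiction. So condition (i) in Theorem 7.3.50 holds. Similarly, choosing $u_2 \in U_2$, we can show that condition (ii) in Theorem 7.3.50 holds. Applying Theorem 7.3.50, we obtain at least three distinct fixed points for $\varphi$.

THEOREM 7.3.52
If $X$ is a Banach space, $K \subseteq X$ is an order cone, $\varphi : K \to K$ is a compact map, $\xi : K \to \mathbb{R}$ is a continuous, convex map, $\vartheta : K \to \mathbb{R}$ is a continuous, concave map, $\lambda, \mu \in \mathbb{R}$ and

(i) $0 \in \{x \in K : \vartheta(x) < \mu\}$ and the set $\{x \in K : \vartheta(x) < \mu\}$ is bounded;

(ii) $\{x \in K(\xi, \vartheta, \lambda, \mu) : \vartheta(x) > \mu\} \ne \emptyset$ and $\vartheta\big( \varphi(x) \big) > \mu$ for all $x \in K(\xi, \vartheta, \lambda, \mu)$;

(iii) $\vartheta\big( \varphi(x) \big) > \mu$ for all $x \in K(\vartheta, \mu)$ with $\xi\big( \varphi(x) \big) > \lambda$;

(iv) $i(\varphi, K_r, K) = 1$ for $r > 0$ small enough and $i(\varphi, K_R, K) = 1$ for $R > 0$ large enough, where for any $\eta > 0$, $K_\eta \stackrel{df}{=} K \cap B_\eta$,

then $\varphi$ has at least three fixed points in $K$.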
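PROOF Let
\[ U \stackrel{df}{=} \big\{ x \in K : \vartheta(x) < \mu \big\}. \]
Then from Proposition 7.3.46, we have that
\[ i(\varphi, U, K) = 0. \]
Because $0 \in \{x \in K : \vartheta(x) < \mu\}$ and the set $\{x \in K : \vartheta(x) < \mu\}$ is bounded (see hypothesis (i)), we can find $r > 0$ small enough and $R > 0$ large enough, such that
\[ \overline{K}_r \subseteq U \subseteq \overline{U} \subseteq K_R. \]
Using the additivity property of the fixed point index, we have
\[ i\big( \varphi, U \setminus \overline{K}_r, K \big) = i(\varphi, U, K) - i(\varphi, K_r, K) = 0 - 1 = -1 \ne 0 \]
and
\[ i\big( \varphi, K_R \setminus \overline{U}, K \big) = i(\varphi, K_R, K) - i(\varphi, U, K) = 1 - 0 = 1 \ne 0. \]
So $\varphi$ has at least three fixed points $x_1, x_2, x_3 \in K$, such that
\[ x_1 \in K_r, \quad x_2 \in U \setminus \overline{K}_r \quad \text{and} \quad x_3 \in K_R \setminus \overline{U}. \]

Similarly using Proposition 7.3.47, we obtain the following multiplicity result.

THEOREM 7.3.53
If $X$ is a Banach space, $K \subseteq X$ is an order cone, $\varphi : K \to K$ is a compact map, $\xi : K \to \mathbb{R}$ is a continuous, convex function, $\vartheta : K \to \mathbb{R}$ is a continuous, concave function, $\lambda, \mu \in \mathbb{R}$ and

(i) $0 \in \{x \in K : \xi(x) < \lambda\}$ and the set $\{x \in K : \xi(x) < \lambda\}$ is bounded;

(ii) $\{x \in K(\xi, \vartheta, \lambda, \mu) : \xi(x) < \lambda\} \ne \emptyset$ and $\xi\big( \varphi(x) \big) < \lambda$ for all $x \in K(\xi, \vartheta, \lambda, \mu)$;

(iii) $\xi\big( \varphi(x) \big) < \lambda$ for all $x \in K(\xi, \lambda)$ with $\vartheta\big( \varphi(x) \big) < \mu$;

(iv) $i(\varphi, K_r, K) = 0$ for $r > 0$ small enough and $i(\varphi, K_R, K) = 0$ for $R > 0$ large enough, where for any $\eta > 0$, $K_\eta \stackrel{df}{=} K \cap B_\eta$,

then $\varphi$ has at least three fixed points in $K$.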
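THEOREM 7.3.54
If $X$ is a Banach space, $C \subseteq X$ is a nonempty, bounded, closed, convex set, $\gamma, \xi : C \to \mathbb{R}$ are continuous, convex functions, $\vartheta, \eta : C \to \mathbb{R}$ are continuous, concave functions, $\vartheta(x) \le \xi(x)$ for all $x \in C$, $\varphi : C \to C$ is a compact map, $\lambda, \mu, \nu, \varrho \in \mathbb{R}$ with $\nu < \lambda$ and the following conditions hold:

(i) $\{x \in C(\gamma, \vartheta, \lambda, \mu) : \vartheta(x) > \mu\} \ne \emptyset$ and $\vartheta\big( \varphi(x) \big) > \mu$ for all $x \in C(\gamma, \vartheta, \lambda, \mu)$;

(ii) $\{x \in C(\xi, \eta, \lambda, \mu) : \xi(x) < \lambda\} \ne \emptyset$ and $\xi\big( \varphi(x) \big) < \lambda$ for all $x \in C(\xi, \eta, \nu, \varrho)$;

(iii) $\vartheta\big( \varphi(x) \big) > \mu$ for all $x \in C(\vartheta, \mu)$ with $\gamma\big( \varphi(x) \big) > \lambda$;

(iv) $\xi\big( \varphi(x) \big) < \lambda$ for all $x \in C(\xi, \nu)$ with $\eta\big( \varphi(x) \big) < \varrho$,

then $\varphi$ has at least three fixed points $x_1, x_2, x_3 \in C$ with
\[ \xi(x_1) < \nu, \quad \mu < \vartheta(x_2), \quad \nu < \xi(x_3) \quad \text{and} \quad \vartheta(x_3) < \mu. \]

PROOF Let
\[ U_1 \stackrel{df}{=} \big\{ x \in C : \xi(x) < \nu \big\}. \]
Then from hypotheses (ii) and (iv) and Proposition 7.3.47, we have that
\[ i(\varphi, U_1, C) = 1. \]
Also let
\[ U_2 \stackrel{df}{=} \big\{ x \in C : \vartheta(x) > \mu \big\}. \]
Then from hypotheses (i) and (iii) and Proposition 7.3.47, we have that
\[ i(\varphi, U_2, C) = 1. \]
Since $\vartheta \le \xi$ and $\nu < \lambda$, we have $\overline{U}_1 \cap \overline{U}_2 = \emptyset$. Therefore
\[ i\big( \varphi, C \setminus \overline{\big( U_1 \cup U_2 \big)}, C \big) = i(\varphi, C, C) - i(\varphi, U_1, C) - i(\varphi, U_2, C) = 1 - 1 - 1 = -1 \ne 0. \]
So we can find $x_1 \in U_1$, $x_2 \in U_2$ and $x_3 \in C \setminus \overline{\big( U_1 \cup U_2 \big)}$, such that
\[ \varphi(x_k) = x_k \quad \forall \; k \in \{1, 2, 3\}. \]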
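THEOREM 7.3.55 (Amann Three Fixed Points Theorem)
If $X$ is a Banach space, $K \subseteq X$ is an order cone which is normal and solid, $y_1, u_1, y_2, u_2 \in X$ are such that $y_1 < u_1 < y_2 < u_2$ and $\varphi : [y_1, u_2] \to X$ is compact and strongly increasing (i.e., if $x < y$, then $\varphi(x) \ll \varphi(y)$; see Remark 7.3.14) and
\[ y_1 \le \varphi(y_1), \quad \varphi(u_1) < u_1, \quad y_2 < \varphi(y_2), \quad \varphi(u_2) \le u_2, \]
then $\varphi$ has at least three fixed points $x_1, x_2, x_3 \in [y_1, u_2]$, such that
\[ y_1 \le x_1 \ll u_1, \quad y_2 \ll x_2 \le u_2, \quad y_2 \not\le x_3 \not\le u_1. \]

PROOF Let
\[ C \stackrel{df}{=} [y_1, u_2]. \]
Then $C$ is nonempty, closed, convex and bounded (due to the normality of $K$; see Proposition 7.3.18). Let
\[ u_1' \stackrel{df}{=} \varphi(u_1), \qquad y_2' \stackrel{df}{=} \varphi(y_2). \]
Because $\varphi$ is strongly increasing, we have
\[ y_1 \ll u_1' \ll y_2' \ll u_2. \]
Let $\xi, \vartheta : C \to \mathbb{R}$ be defined by
\[ \xi(x) \stackrel{df}{=} \inf\big\{ t \ge 0 : x - y_1 \le t(u_1' - y_1) \big\}, \]
\[ \vartheta(x) \stackrel{df}{=} \sup\big\{ t \ge 0 : t(y_2' - y_1) \le x - y_1 \big\}. \]
Clearly
\[ \vartheta(x) < +\infty \quad \forall \; x \in C \]
and $\xi$ is convex, while $\vartheta$ is concave.

Claim 1. $\xi$ and $\vartheta$ are continuous functions.

We show that $\xi$ is continuous. The proof of the continuity of $\vartheta$ is similar. Suppose that $\{x_n\}_{n \ge 1} \subseteq C$ is a sequence, such that $x_n \to x$, for some $x \in C$. Let
\[ e \stackrel{df}{=} u_1' - y_1. \]
Then $x - y_1 \le \xi(x) e$. For a given $\varepsilon > 0$, since $\varepsilon e \gg 0$, we can find $\delta > 0$, such that
\[ B_\delta(\varepsilon e) = \big\{ x \in X : \|x - \varepsilon e\|_X < \delta \big\} \subseteq K. \]
We can find $N \ge 1$, such that
\[ \varepsilon e + (x - x_n) \in B_\delta(\varepsilon e) \subseteq K \quad \forall \; n \ge N. \]
So it follows that
\[ x_n - y_1 = x_n - x + x - y_1 - \varepsilon e + \varepsilon e \le \big( \xi(x) + \varepsilon \big) e \quad \forall \; n \ge N, \]
hence
\[ \xi(x_n) \le \xi(x) + \varepsilon \quad \forall \; n \ge N \]
and thus
\[ \limsup_{n \to +\infty} \xi(x_n) \le \xi(x). \tag{7.22} \]
On the other hand let
\[ t = \liminf_{n \to +\infty} \xi(x_n). \]
We can find a subsequence $\{n_k\}_{k \ge 1} \subseteq \{n\}$, such that $\xi(x_{n_k}) \to t$. Since $0 \le x_{n_k} - y_1 \le \xi(x_{n_k}) e$ (recall that $e = u_1' - y_1 \gg 0$), we obtain that $t > -\infty$ (indeed $t \ge 0$). Also $x_{n_k} - y_1 \le \xi(x_{n_k}) e$ implies that $x - y_1 \le t e$, hence $\xi(x) \le t$, i.e.,
\[ \xi(x) \le \liminf_{n \to +\infty} \xi(x_n). \tag{7.23} \]
From (7.22) and (7.23), we infer the continuity of $\xi$. So the Claim is proved.

Next let
\[ U_1 \stackrel{df}{=} \big\{ x \in C : \xi(x) < 1 \big\} \quad \text{and} \quad U_2 \stackrel{df}{=} \big\{ x \in C : \vartheta(x) > 1 \big\}. \]
Both sets are open and nonempty, since $y_1 \in U_1$ and $u_2 \in U_2$. We will show that $\overline{U}_1 \cap \overline{U}_2 = \emptyset$. Indeed, for any $x \in \overline{U}_1$, we have $\xi(x) \le 1$, hence
\[ x - y_1 \le \xi(x)(u_1' - y_1) \le u_1' - y_1 \]
and so $x \le u_1'$, i.e., $x \in [y_1, u_1']$. Then $\overline{U}_1 \subseteq [y_1, u_1']$. On the other hand, for any $x \in [y_1, u_1']$, we have that $x - y_1 \le u_1' - y_1$ and so $\xi(x) \le 1$. Also
\[ \xi\Big( \frac{1}{n} y_1 + \Big( 1 - \frac{1}{n} \Big) x \Big) \le \frac{1}{n}\xi(y_1) + \Big( 1 - \frac{1}{n} \Big)\xi(x) < 1 \quad \forall \; n \ge 1, \]
so $\frac{1}{n} y_1 + \big( 1 - \frac{1}{n} \big) x \in U_1$ for all $n \ge 1$ and, letting $n \to +\infty$, we get $x \in \overline{U}_1$. Finally $[y_1, u_1'] \subseteq \overline{U}_1$, i.e., $\overline{U}_1 = [y_1, u_1']$. Similarly, we have that $\overline{U}_2 = [y_2', u_2]$. Note that $[y_1, u_1'] \cap [y_2', u_2] = \emptyset$.

Finally we will show that hypotheses (i) and (ii) of Corollary 7.3.51 hold. For $x \in C$ with $\xi(x) = 1$, we have $x \le u_1'$. Then
\[ \varphi(x) \le \varphi(u_1') \ll u_1', \]
hence $\varphi(x) - y_1 \ll u_1' - y_1$ and so there exists $\delta_1 > 0$, such that
\[ \varphi(x) - y_1 \le (1 - \delta_1)(u_1' - y_1). \]
This implies that
\[ \xi\big( \varphi(x) \big) \le 1 - \delta_1 < 1. \]
Similarly for $x \in C$ with $\vartheta(x) = 1$, we have $y_2' \le x$. Then
\[ y_2' \ll \varphi(y_2') \le \varphi(x), \]
hence $y_2' - y_1 \ll \varphi(x) - y_1$ and so there exists $\delta_2 > 0$, such that
\[ (1 + \delta_2)(y_2' - y_1) \le \varphi(x) - y_1. \]
This implies that
\[ 1 < 1 + \delta_2 \le \vartheta\big( \varphi(x) \big). \]
Therefore all the hypotheses of Corollary 7.3.51 are satisfied. Accordingly we obtain at least three fixed points $x_1, x_2, x_3$ of $\varphi$, such that
\[ y_1 \le x_1 \ll u_1, \quad y_2 \ll x_2 \le u_2 \quad \text{and} \quad y_2 \not\le x_3 \not\le u_1. \]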
REMARK 7.3.56 In the above proof we have expressed the boundary of the order interval $[y_1, y_2]$ in terms of a convex function and a concave function, both continuous.
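As a toy illustration of the norm-type cone expansion condition (i) of Theorem 7.3.49, the following sketch works in the one-dimensional space $X = \mathbb{R}$ with the cone $K = [0, +\infty)$, $U_1 = (-1/2, 1/2)$, $U_2 = (-2, 2)$ and the map $\varphi(x) = x^2$. These choices, and the helper names `phi`, `condition_i` and `bisect_fixed_point`, are assumptions made only for this example. In one dimension the theorem reduces to the intermediate value theorem, so a plain bisection on $\varphi(x) - x$ is enough to locate the fixed point in $K \cap (\overline{U}_2 \setminus U_1) = [1/2, 2]$.

```python
def phi(x: float) -> float:
    """Hypothetical compact map on the cone K = [0, +inf) of X = R."""
    return x * x


def condition_i(r1: float, r2: float) -> bool:
    """Check |phi(x)| <= |x| on K ∩ ∂U1 = {r1} and |phi(x)| >= |x| on K ∩ ∂U2 = {r2}."""
    return abs(phi(r1)) <= r1 and abs(phi(r2)) >= r2


def bisect_fixed_point(lo: float, hi: float, tol: float = 1e-12) -> float:
    """Bisection on g(x) = phi(x) - x over [lo, hi]; assumes g changes sign."""
    g = lambda x: phi(x) - x
    g_lo, g_hi = g(lo), g(hi)
    if g_lo == 0.0:
        return lo
    if g_hi == 0.0:
        return hi
    if g_lo * g_hi > 0.0:
        raise ValueError("g has the same sign at both end points")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0.0:
            return mid
        if g_lo * g_mid < 0.0:
            hi = mid              # sign change is in [lo, mid]
        else:
            lo, g_lo = mid, g_mid  # sign change is in [mid, hi]
    return 0.5 * (lo + hi)


r1, r2 = 0.5, 2.0
holds = condition_i(r1, r2)
x_star = bisect_fixed_point(r1, r2)
```

Here `holds` records that the boundary inequalities are satisfied and `x_star` approximates the fixed point $x = 1$. In infinite dimensions no such elementary search is available, which is why the index argument above is needed.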
7.4 Fixed Points of Multifunctions

Most of the known metric and topological fixed point results for single valued maps can be extended to the case of set valued maps (multifunctions). In this section we present a few such characteristic generalizations. We will make an effort to have this section self contained. For further details on the theory of multifunctions we refer to Hu & Papageorgiou (1997, 2000).

First let us fix our notation concerning hyperspaces (sets of sets). If $X$ is a Hausdorff topological space, we set
\[ P_f(X) \stackrel{df}{=} \big\{ A \subseteq X : A \text{ is nonempty, closed} \big\}, \]
\[ \widehat{P}_f(X) \stackrel{df}{=} P_f(X) \cup \{\emptyset\}, \]
\[ P_k(X) \stackrel{df}{=} \big\{ A \subseteq X : A \text{ is nonempty, compact} \big\}. \]
If $X$ is a normed space, we also have
\[ P_{fc}(X) \stackrel{df}{=} \big\{ A \subseteq X : A \text{ is nonempty, closed, convex} \big\}, \]
\[ P_{bf}(X) \stackrel{df}{=} \big\{ A \subseteq X : A \text{ is nonempty, bounded, closed} \big\}, \]
\[ P_{bf(c)}(X) \stackrel{df}{=} \big\{ A \subseteq X : A \text{ is nonempty, bounded, closed (and convex)} \big\}, \]
\[ P_{(w)kc}(X) \stackrel{df}{=} \big\{ A \subseteq X : A \text{ is nonempty, (weakly-)compact, convex} \big\}. \]
In what follows, for a Hausdorff topological space $X$, if $x \in X$, then by $N(x)$ we denote the filter of neighbourhoods of $x$. Also if $(X, d_X)$ is a metric space, $r > 0$ and $x \in X$, then
\[ B_r(x) \stackrel{df}{=} \big\{ y \in X : d_X(x, y) < r \big\}. \]
Moreover, if $X$ is a normed space, as before
\[ B_r \stackrel{df}{=} \big\{ y \in X : \|y\|_X < r \big\} \quad \text{and} \quad B_r(x) \stackrel{df}{=} \big\{ y \in X : \|y - x\|_X < r \big\}. \]
Finally, we will use the convention that in a metric space $(X, d_X)$,
\[ d_X(x, \emptyset) = +\infty \quad \forall \; x \in X. \]
DEFINITION 7.4.1 Let $(X, d_X)$ be a metric space and $A, C \in 2^X \setminus \{\emptyset\}$. We set
\[ h(A, C) \stackrel{df}{=} \max\Big\{ \sup_{a \in A} d_X(a, C), \; \sup_{c \in C} d_X(c, A) \Big\}, \]
where $h(A, C) = +\infty$ is allowed. The extended real number $h(A, C)$ is called the Hausdorff distance between $A$ and $C$ relative to the metric $d_X$.

REMARK 7.4.2 It is easy to see that $h$ satisfies the triangle inequality and that, for $A, C \in P_f(X)$,
\[ h(A, C) = 0 \iff A = C. \]
So $h$ is a metric on $P_f(X)$, known as the Hausdorff metric. Moreover, $\big( \widehat{P}_f(X), h \big)$ is also a metric space and $\emptyset$ is an isolated point of it. If $d_X$ is bounded, then so is $h$ on $P_f(X)$. It can be shown that if $(X, d_X)$ is a complete metric space, then so is $\big( P_f(X), h \big)$ and $P_k(X)$ is a closed subspace of it. Moreover, if $X$ is a Banach space, then $P_{fc}(X)$, $P_{bf(c)}(X)$ and $P_{kc}(X)$ are also closed subspaces of $\big( P_f(X), h \big)$ (hence they are complete for the Hausdorff metric). If $(X, d_X)$ is separable, then so is $\big( P_k(X), h \big)$. Therefore, if $(X, d_X)$ is a Polish space (see Definition A.2.29(a)), then so is $\big( P_k(X), h \big)$. From Definition 7.4.1, it is clear that
\[ h(A, C) = \inf\big\{ \varepsilon > 0 : A \subseteq C_\varepsilon \text{ and } C \subseteq A_\varepsilon \big\}, \]
where for any $D \in 2^X \setminus \{\emptyset\}$ and any $\varepsilon > 0$,
\[ D_\varepsilon \stackrel{df}{=} \big\{ y \in X : d_X(y, D) < \varepsilon \big\} = \bigcup_{y \in D} B_\varepsilon(y). \]
Therefore
\[ h(A, C) = \sup\big\{ \big| d_X(y, A) - d_X(y, C) \big| : y \in X \big\} \quad \forall \; A, C \in 2^X \setminus \{\emptyset\}. \]
Finally, if $X$ is a normed space and $A, C \in P_{bfc}(X)$, then the following Hörmander's formula holds
\[ h(A, C) = \sup\big\{ \big| \sigma_X(y^*, A) - \sigma_X(y^*, C) \big| : \|y^*\|_{X^*} \le 1 \big\}, \]
where $\sigma_X$ is the so-called support function, defined by
\[ \sigma_X(y^*, D) \stackrel{df}{=} \sup\big\{ \langle y^*, y \rangle_X : y \in D \big\} \quad \forall \; D \in 2^X \setminus \{\emptyset\}. \]

We start with a set valued generalization of Banach's contraction principle (see Theorem 7.1.2).

THEOREM 7.4.3 (Nadler Fixed Point Theorem)
If $(X, d_X)$ is a complete metric space and $F : X \to P_f(X)$ is an $h$-contraction (i.e., $h\big( F(x), F(y) \big) \le k\, d_X(x, y)$ for all $x, y \in X$ with $k \in [0, 1)$), then $F$ has a fixed point, i.e., there exists $x \in X$, such that $x \in F(x)$.
PROOF Choose $k_1 \in (k, 1)$ and $x_0 \in X$. Then we pick $x_1 \in F(x_0)$ with
\[ d_X(x_0, x_1) > 0. \]
If no such $x_1$ exists, then $x_0$ is a fixed point of $F$ and we are done. We have
\[ d_X\big( x_1, F(x_1) \big) \le h\big( F(x_0), F(x_1) \big) < k_1 d_X(x_0, x_1) \]
and so we can find $x_2 \in F(x_1)$, such that
\[ d_X(x_1, x_2) < k_1 d_X(x_0, x_1). \]
Inductively, we produce a sequence $\{x_n\}_{n \ge 1}$, such that
\[ x_{n+1} \in F(x_n) \quad \forall \; n \ge 1 \]
and
\[ d_X(x_n, x_{n+1}) < k_1^n d_X(x_0, x_1) \quad \forall \; n \ge 1. \tag{7.24} \]
From the inequality in (7.24), it follows that $\{x_n\}_{n \ge 1} \subseteq X$ is a Cauchy sequence and so $x_n \to x$ in $X$, for some $x \in X$. Then
\[ d_X\big( x_{n+1}, F(x) \big) \le h\big( F(x_n), F(x) \big) \le k\, d_X(x_n, x) \to 0. \]
So
\[ d_X\big( x, F(x) \big) = 0 \]
and because $F(x)$ is closed, we have $x \in F(x)$.

REMARK 7.4.4 In contrast to Theorem 7.1.2, the fixed point $x \in X$ in Theorem 7.4.3 is not unique. Indeed, if
\[ F(x) = X \quad \forall \; x \in X, \]
then every $x \in X$ is a fixed point of $F$. Evidently the set of fixed points of $F$ (denoted by $\mathrm{Fix}(F)$) is closed.

We have a stability result for multivalued contractions with bounded values.

PROPOSITION 7.4.5
If $(X, d_X)$ is a complete metric space, $F_1, F_2 : X \to P_{bf}(X)$ are $h$-contractions with the same constant $k \in [0, 1)$ and $\mathrm{Fix}(F_i)$ denotes the set of fixed points of $F_i$ (for $i = 1, 2$), then
\[ h\big( \mathrm{Fix}(F_1), \mathrm{Fix}(F_2) \big) \le \frac{1}{1 - k} \sup_{x \in X} h\big( F_1(x), F_2(x) \big). \]
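PROOF Let $\varepsilon > 0$ and choose $\xi > 0$, such that
\[ \xi \sum_{n=1}^{\infty} n k^n < 1. \]
Let us set
\[ \varepsilon_1 = \frac{\xi \varepsilon}{1 - k}. \]
Pick $x_0 \in \mathrm{Fix}(F_1)$ and then choose $x_1 \in F_2(x_0)$, such that
\[ d_X(x_0, x_1) \le h\big( F_1(x_0), F_2(x_0) \big) + \varepsilon. \tag{7.25} \]
Because
\[ h\big( F_2(x_1), F_2(x_0) \big) \le k\, d_X(x_1, x_0), \]
we can find $x_2 \in F_2(x_1)$, such that
\[ d_X(x_2, x_1) \le k\, d_X(x_1, x_0) + k \varepsilon_1. \]
Inductively, we construct a sequence $\{x_n\}_{n \ge 1}$, such that
\[ x_{n+1} \in F_2(x_n) \quad \forall \; n \ge 1 \tag{7.26} \]
and
\[ d_X(x_{n+1}, x_n) \le k\, d_X(x_n, x_{n-1}) + k^n \varepsilon_1 \quad \forall \; n \ge 1. \tag{7.27} \]
From the inequality (7.27), we obtain
\[ d_X(x_{n+1}, x_n) \le k^n d_X(x_1, x_0) + n k^n \varepsilon_1, \]
so
\[ \sum_{n=m}^{\infty} d_X(x_{n+1}, x_n) \le \frac{k^m}{1 - k} d_X(x_1, x_0) + \varepsilon_1 \sum_{n=m}^{\infty} n k^n \tag{7.28} \]
and thus $\{x_n\}_{n \ge 1}$ is a Cauchy sequence. So we have
\[ x_n \to x \quad \text{in } X, \]
for some $x \in X$ and from (7.26), it follows that $x \in \mathrm{Fix}(F_2)$. Moreover, from (7.28) and (7.25), we have
\[ d_X(x_0, x) \le \sum_{n=0}^{\infty} d_X(x_n, x_{n+1}) \le \frac{1}{1 - k} d_X(x_0, x_1) + \varepsilon_1 \sum_{n=1}^{\infty} n k^n \le \frac{1}{1 - k}\Big( h\big( F_1(x_0), F_2(x_0) \big) + 2\varepsilon \Big). \]
Reversing the roles of $F_1$ and $F_2$ in the above argument and letting $\varepsilon \searrow 0$, we conclude that
\[ h\big( \mathrm{Fix}(F_1), \mathrm{Fix}(F_2) \big) \le \frac{1}{1 - k} \sup_{x \in X} h\big( F_1(x), F_2(x) \big). \]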
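The following sketch illustrates Theorem 7.4.3 and Proposition 7.4.5 on the real line. It is only a numerical toy under stated assumptions: the multifunctions $F_s(x) = [x/2 + s, \; x/2 + s + 1]$ (which are $h$-contractions with $k = 1/2$), the closed form $\mathrm{Fix}(F_s) = [2s, 2s + 2]$ obtained by hand, and all function names are hypothetical choices for this example. For compact intervals the Hausdorff distance reduces to the larger of the two end point gaps, and the iteration picks in each step the point of $F(x_n)$ nearest to $x_n$, which is one admissible selection in Nadler's construction.

```python
from typing import Callable, Tuple

Interval = Tuple[float, float]  # a compact interval [lo, hi] of the real line


def hausdorff(a: Interval, c: Interval) -> float:
    """Hausdorff distance between two compact intervals."""
    return max(abs(a[0] - c[0]), abs(a[1] - c[1]))


def make_F(shift: float) -> Callable[[float], Interval]:
    """Hypothetical multifunction F(x) = [x/2 + shift, x/2 + shift + 1], k = 1/2."""
    def F(x: float) -> Interval:
        return (0.5 * x + shift, 0.5 * x + shift + 1.0)
    return F


def fixed_point_set(shift: float) -> Interval:
    """Fix(F) = {x : x in F(x)} = [2*shift, 2*shift + 2], computed by hand."""
    return (2.0 * shift, 2.0 * shift + 2.0)


def nadler_iterate(F: Callable[[float], Interval], x0: float,
                   tol: float = 1e-12, max_iter: int = 200) -> float:
    """Choose x_{n+1} in F(x_n) nearest to x_n until the step is at most tol."""
    x = x0
    for _ in range(max_iter):
        lo, hi = F(x)
        nxt = min(max(x, lo), hi)  # nearest point of the interval F(x) to x
        if abs(nxt - x) <= tol:
            return nxt
        x = nxt
    return x


k, c = 0.5, 0.25
F1, F2 = make_F(0.0), make_F(c)
x_star = nadler_iterate(F1, 10.0)  # approaches the end point 2 of Fix(F1) = [0, 2]

# Left and right sides of the estimate in Proposition 7.4.5.
lhs = hausdorff(fixed_point_set(0.0), fixed_point_set(c))
sup_h = max(hausdorff(F1(float(x)), F2(float(x))) for x in range(-5, 6))
rhs = sup_h / (1.0 - k)
```

In this example $h(F_1(x), F_2(x)) = c$ for every $x$, so the sampled supremum `sup_h` coincides with the true one, and the two sides `lhs` and `rhs` agree. This suggests that the constant $1/(1 - k)$ in the proposition cannot be improved in general.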
COROLLARY 7.4.6
If $(X, d_X)$ is a complete metric space, $F_n, F : X \to P_{bf}(X)$ for $n \ge 1$ are $h$-contractions with the same constant $k \in [0, 1)$ and
\[ \sup_{x \in X} h\big( F_n(x), F(x) \big) \to 0, \]
then
\[ h\big( \mathrm{Fix}(F_n), \mathrm{Fix}(F) \big) \to 0. \]

Next we prove some topological fixed point theorems. We start with a definition.

DEFINITION 7.4.7 Let $X$ be a vector space.

(a) A subset $C \subseteq X$ is said to be finitely closed, if its intersection with every finite dimensional flat $Y \subseteq X$ (i.e., $Y = x + L$ for some $x \in X$ and some finite dimensional subspace $L$ of $X$) is closed in the Euclidean topology of $Y$.

(b) A family $\{C_i\}_{i \ge 1}$ of sets of $X$ is said to have the finite intersection property, if the intersection of each finite subfamily is not empty.

(c) Let $C \subseteq X$ be a nonempty set and let $F : C \to 2^X$ be a multifunction. We say that $F$ is a Knaster-Kuratowski-Mazurkiewicz multifunction (a KKM-multifunction for short), if for every finite set $\{x_k\}_{k=1}^m \subseteq C$, we have
\[ \mathrm{conv}\, \{x_k\}_{k=1}^m \subseteq \bigcup_{k=1}^{m} F(x_k). \]

The basic property of KKM-multifunctions is given in the next theorem.

THEOREM 7.4.8
If $X$ is a vector space, $C \subseteq X$ is nonempty and $F : C \to 2^X$ is a KKM-multifunction, such that for each $x \in C$, $F(x)$ is finitely closed, then the family $\{F(x) : x \in C\}$ has the finite intersection property.

PROOF We proceed indirectly. Suppose that
\[ \bigcap_{k=1}^{n} F(x_k) = \emptyset, \]
for some $n \ge 1$ and some $x_1, \ldots, x_n \in C$. Let
\[ Y = \mathrm{span}\, \{x_k\}_{k=1}^n, \]
let $d_Y$ be the Euclidean metric in $Y$ and
\[ D = \mathrm{conv}\, \{x_k\}_{k=1}^n \subseteq Y. \]
By hypothesis, each set $Y \cap F(x_k)$ is closed. Because
\[ \bigcap_{k=1}^{n} \big( Y \cap F(x_k) \big) = \emptyset, \]
we see that the function
\[ \xi(x) \stackrel{df}{=} \sum_{k=1}^{n} d_Y\big( x, Y \cap F(x_k) \big) \]
is not zero for any $x \in D$. So we can define the continuous function $\vartheta : D \to D$ by
\[ \vartheta(x) \stackrel{df}{=} \frac{1}{\xi(x)} \sum_{k=1}^{n} d_Y\big( x, Y \cap F(x_k) \big)\, x_k. \]
By Brouwer's fixed point theorem (see Theorem 7.2.7), we can find $x_0 \in D$, such that $\vartheta(x_0) = x_0$. Let
\[ J \stackrel{df}{=} \big\{ k : d_Y\big( x_0, Y \cap F(x_k) \big) \ne 0 \big\}. \]
Then
\[ x_0 \notin \bigcup_{k \in J} F(x_k). \]
But
\[ x_0 = \vartheta(x_0) \in \mathrm{conv}\, \{x_k\}_{k \in J} \subseteq \bigcup_{k \in J} F(x_k) \]
(since $F$ is a KKM-multifunction), a contradiction.

An immediate consequence of the above theorem is the following result.

THEOREM 7.4.9
If $X$ is a Hausdorff topological vector space, $C \subseteq X$ is a nonempty set, $F : C \to 2^X$ is a KKM-multifunction, such that for each $x \in C$, $F(x)$ is closed and for at least one $x_0 \in C$, $F(x_0)$ is compact, then
\[ \bigcap_{x \in C} F(x) \ne \emptyset. \]
We can have the same conclusion in a different setting, which avoids imposing a compactness condition on the sets F (x) and instead uses an auxiliary multifunction. THEOREM 7.4.10 If X is a vector space, C ⊆ X is a nonempty set, F : C −→ 2X is a KKM-multifunction, G : C −→ 2X is another multifunction, such that G(x) ⊆ F (x) and
\
G(x) =
x∈C
∀x∈C \
F (x)
x∈C
and there is another topology on X, such that G(x) is compact for every x ∈ C, then \ F (x) 6= ∅. x∈C
REMARK 7.4.11 KKM-multifunctions can be used to produce an alternative proof of the Schauder-Tychonoff fixed point theorem (see Theorem 7.2.14). For details we refer to Dugundji & Granas (1982, pp. 74–75). LEMMA 7.4.12 If X is a vector space, C ⊆ X is a nonempty set and F : C −→ 2X is a multifunction, such that x 6∈ conv F (x)
∀ x ∈ C,
then x 7−→ G(x) = X \ F −1 (x) is a KKM-multifunction. PROOF
Let {xk }nk=1 ⊆ X. We have X\
n [
G(xk ) =
k=1
n \ ¡
X \ G(xk )
¢
=
k=1
n \
F −1 (xk ).
k=1
It follows that y ∈ X satisfies y 6∈
n [
G(xk )
⇐⇒
k=1
y∈
n \
F −1 (xk )
k=1
and this is equivalent to y∈C
and
xk ∈ F (y)
© ª ∀ k ∈ 1, . . . , n .
884
Nonlinear Analysis
Now, let
y ∈ conv {xk }nk=1 .
We claim that y ∈
n [
G(xk ).
k=1
If this is not the case, then from the previous argument, we know that © ª y ∈ C and xk ∈ F (y) ∀ k ∈ 1, . . . , n . Therefore
conv {xk }nk=1 ⊆ conv F (y),
so, from the choice of y, y ∈ conv F (y), a contradiction. Therefore, we have conv {xk }nk=1 ⊆
n [
G(xk )
k=1
and so we conclude that G is a KKM-multifunction. This lemma can be used to establish the existence of maximal elements for binary relations, which are irreflexive but not necessarily transitive. The result is useful in mathematical economics. PROPOSITION 7.4.13 If X is a Hausdorff topological vector space, C ⊆ X is a nonempty, compact, convex set, ≺ is an irreflexive binary relation on C and © ª (i) for every x ∈ C, x ∈ / conv y ∈ C : x ≺ y ; © ª (ii) for every x ∈ C, the lower section y ∈ C : y ≺ x is open in C, then the set of ≺-maximal elements in C is nonempty and compact. PROOF
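Let
$$F(x) \overset{\mathrm{df}}{=} \{y \in C : x \prec y\} \quad \forall\, x \in C.$$
Because of hypothesis (i) and Lemma 7.4.12, $x \mapsto G(x) = X \setminus F^{-1}(x)$ is a KKM-multifunction. Hence $S : C \to 2^C$, defined by
$$S(x) \overset{\mathrm{df}}{=} C \cap G(x) = C \setminus F^{-1}(x),$$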
is a KKM-multifunction too. Moreover, because of hypothesis (ii), the set $F^{-1}(x)$ is open in $C$ and so $S$ has compact values. Applying Theorem 7.4.9, we infer that
$$\bigcap_{x \in C} \big( C \setminus F^{-1}(x) \big) = \bigcap_{x \in C} S(x) \neq \emptyset$$
and it is compact. But the set of $\prec$-maximal elements in $C$ is just $\bigcap_{x \in C} \big( C \setminus F^{-1}(x) \big)$.
The above proposition leads to the following existence result, which is important in the theory of variational inequalities.

THEOREM 7.4.14 If $X$ is a locally convex space, $C \subseteq X$ is a nonempty, compact and convex set and $u : C \to X^*$ is a map, such that the function
$$(x, y) \mapsto \langle u(x), y \rangle_X$$
is jointly continuous on $C \times C$, then there exists $x \in C$, such that
$$\langle u(x), y - x \rangle_X \geq 0 \quad \forall\, y \in C.$$

PROOF On $C$ we define the irreflexive relation $\prec$ by
$$x \prec y \iff \langle u(x), x - y \rangle_X > 0.$$
Note that for every $x \in C$, we have
$$x \notin \operatorname{conv}\{y \in C : x \prec y\} = \{y \in C : x \prec y\} = \{y \in C : \langle u(x), x - y \rangle_X > 0\}.$$
Moreover, the function
$$(x, y) \mapsto \langle u(y), x - y \rangle_X$$
is jointly continuous. So
$$\{y \in C : y \prec x\} = \{y \in C : \langle u(y), x - y \rangle_X < 0\}$$
is open. Therefore we can apply Proposition 7.4.13 and obtain a $\prec$-maximal element $x \in C$. We have
$$\langle u(x), y - x \rangle_X \geq 0 \quad \forall\, y \in C.$$
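The conclusion of Theorem 7.4.14 can be checked by hand in the simplest setting. The sketch below is only an illustration under stated assumptions: it takes $X = \mathbb{R}$, $C = [0, 1]$, identifies $X^*$ with $\mathbb{R}$ so that $\langle u(x), y \rangle = u(x)\,y$, and searches a uniform grid by brute force. The helper names `solves_vi` and `find_vi_solutions` and the map $u(x) = x - 0.5$ are hypothetical and do not come from the text.

```python
def solves_vi(u, x, grid, tol=1e-12):
    """Check the inequality u(x) * (y - x) >= 0 for every grid point y."""
    return all(u(x) * (y - x) >= -tol for y in grid)


def find_vi_solutions(u, n=100):
    """Return all grid points of [0, 1] that solve the variational inequality."""
    grid = [k / n for k in range(n + 1)]
    return [x for x in grid if solves_vi(u, x, grid)]


# Illustrative choice of u (an assumption): u(x) = x - 0.5 vanishes at 0.5.
solutions = find_vi_solutions(lambda x: x - 0.5, n=100)
print(solutions)
```

For an interior solution the inequality forces $u(x) = 0$, while at an endpoint only the sign of $u$ matters; for instance, the constant map $u \equiv 1$ is solved by $x = 0$ alone.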
We can use the above theorem to obtain topological fixed point theorems for multifunctions. First a definition.

DEFINITION 7.4.15 Let $X$ be a vector space and $C \subseteq X$ be a nonempty set. We say that the multifunction $F : C \to 2^X \setminus \{\emptyset\}$ is weakly inward, if
$$F(x) \cap \overline{I_C(x)} \neq \emptyset \quad \forall\, x \in C,$$
where $I_C(x)$ is the inward set of $x \in C$ with respect to $C$, defined by
$$I_C(x) \overset{\mathrm{df}}{=} \{x + \lambda(y - x) : \lambda \geq 0 \text{ and } y \in C\}.$$

REMARK 7.4.16 If $F$ maps $C$ into itself, then $F$ is weakly inward, since for any $y \in F(x)$ and for $\lambda = 1$, we have $x + \lambda(y - x) = x + y - x = y \in C$.

Using the above notion we can prove a topological fixed point theorem for multifunctions.

THEOREM 7.4.17 If $X$ is a locally convex space, $C \subseteq X$ is nonempty, compact, convex and $F : C \to P_{fc}(X)$ is a weakly inward multifunction which is upper semicontinuous (see Definition 3.2.12), then we can find $x \in C$, such that $x \in F(x)$.

PROOF We proceed by contradiction. Suppose that $F$ has no fixed point. For a given $x \in C$, $0 \notin x - F(x)$ and so by the strong separation theorem for convex sets (see Theorem A.3.2), we can find $x^* \in X^* \setminus \{0\}$, such that
$$\langle x^*, x - y \rangle_X < 0 \quad \forall\, y \in F(x),$$
i.e.,
$$\sigma_X\big(x^*; x - F(x)\big) < 0.$$
Let us set
$$U(x^*) \overset{\mathrm{df}}{=} \big\{x \in C : \sigma_X\big(x^*; x - F(x)\big) < 0\big\} \quad \forall\, x^* \in X^* \setminus \{0\}.$$
Since $F$ is upper semicontinuous, $U(x^*)$ is open and $\{U(x^*)\}_{x^* \in X^* \setminus \{0\}}$ is a cover of $C$.
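So we can find a finite subcover $\{U(x_k^*)\}_{k=1}^n$ and a corresponding continuous partition of unity $\{\xi_k\}_{k=1}^n$. Let us set
$$u(x) = \sum_{k=1}^n \xi_k(x)\, x_k^*.$$
Evidently the function $(x, y) \mapsto \langle u(x), y \rangle_X$ is jointly continuous on $C \times C$. So we can apply Theorem 7.4.14 (to $-u$) and obtain $x_0 \in C$, such that
$$\langle u(x_0), y - x_0 \rangle_X \leq 0 \quad \forall\, y \in C.$$
By hypothesis there is $y \in F(x_0) \cap \overline{I_C(x_0)}$. So
$$y = x_0 + \lim_{n \to +\infty} \lambda_n (y_n - x_0),$$
with $\lambda_n \geq 0$ and $y_n \in C$. Then
$$y - x_0 = \lim_{n \to +\infty} \lambda_n (y_n - x_0)$$
and so
$$\langle u(x_0), y - x_0 \rangle_X = \lim_{n \to +\infty} \lambda_n \langle u(x_0), y_n - x_0 \rangle_X \leq 0,$$
a contradiction to the fact that
$$\sigma_X\big(u(x); x - F(x)\big) < 0 \quad \forall\, x \in C,$$
which gives $\langle u(x_0), y - x_0 \rangle_X > 0$ for every $y \in F(x_0)$.

REMARK 7.4.18 It is easy to see that in the above theorem the set $\operatorname{Fix}(F)$ is compact.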
Combining Theorem 7.4.17 with Remarks 7.4.16 and 7.4.18 and recalling that for locally compact multifunctions, upper semicontinuity and closedness of the graph are equivalent (see Remark 3.2.13), we obtain the following topological fixed point theorem, which is a multivalued generalization of Theorem 7.2.14.

THEOREM 7.4.19 (Kakutani-Ky Fan Fixed Point Theorem) If $X$ is a locally convex space, $C \subseteq X$ is a nonempty, compact and convex set and $F : C \to P_{fc}(C)$ is a multifunction with closed graph, then the set of fixed points of $F$ is nonempty and compact.
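A one-dimensional example may help to see why closed convex values matter in the Kakutani-Ky Fan theorem. The sketch below is a hypothetical illustration, not taken from the text: on $C = [0, 1]$ it uses the interval-valued multifunction $F(x) = \{1\}$ for $x < 1/2$, $F(1/2) = [0, 1]$ and $F(x) = \{0\}$ for $x > 1/2$, which has a closed graph and nonempty, closed, convex values. Intervals are encoded as pairs `(lo, hi)`, and exact rationals are used so that the grid point $1/2$ is hit exactly.

```python
from fractions import Fraction


def F(x):
    """Return the interval F(x) = [lo, hi], encoded as the pair (lo, hi)."""
    half = Fraction(1, 2)
    if x < half:
        return (Fraction(1), Fraction(1))   # F(x) = {1}
    if x > half:
        return (Fraction(0), Fraction(0))   # F(x) = {0}
    return (Fraction(0), Fraction(1))       # F(1/2) = [0, 1]


def fixed_points_on_grid(multifunction, n):
    """Return the grid points x = k/n of [0, 1] with x in F(x)."""
    grid = [Fraction(k, n) for k in range(n + 1)]
    return [x for x in grid if multifunction(x)[0] <= x <= multifunction(x)[1]]


fixed = fixed_points_on_grid(F, 10)
print(fixed)
```

Any single-valued selection of this $F$ jumps across the diagonal at $x = 1/2$ and has a fixed point only if it takes the value $1/2$ there; it is the full interval $F(1/2) = [0, 1]$ that guarantees the fixed point.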
Using ideas from the proof of Theorem 7.4.17, we can prove a third topological fixed point theorem for multifunctions.

THEOREM 7.4.20 If $X$ is a locally convex space, $C \subseteq X$ is a nonempty, compact, convex set and $F : C \to 2^C \setminus \{\emptyset\}$ is a multifunction with convex values, such that for each $y \in C$, the set
$$F^+(\{y\}) = \{x \in C : y \in F(x)\}$$
is open, then there exists $x \in C$, such that $x \in F(x)$.

PROOF The family $\{F^+(\{y\})\}_{y \in C}$ is an open cover of $C$. So we can find a finite subcover $\{F^+(\{y_k\})\}_{k=1}^n$ and a corresponding continuous partition of unity $\{\xi_k\}_{k=1}^n$. If we set
$$u(x) \overset{\mathrm{df}}{=} \sum_{k=1}^n \xi_k(x)\, y_k \quad \forall\, x \in C,$$
then $u : C \to C$ is a continuous selector of $F$. Applying Theorem 7.2.14, we obtain $x \in C$, such that $x = u(x) \in F(x)$.
Next let $X$, $Y$ be two Banach spaces and let $C \subseteq X$, $D \subseteq Y$ be nonempty, closed and convex sets. In what follows by $(D, w)$ we denote the set $D$ furnished with the relative weak topology of $Y$. We consider multifunctions $G : C \to 2^C \setminus \{\emptyset\}$, which admit the following decomposition
$$G = K \circ N, \tag{7.29}$$
where $N : C \to 2^D \setminus \{\emptyset\}$ is an upper semicontinuous multifunction into $(D, w)$ and has weakly compact and convex values and $K : (D, w) \to C$ is sequentially continuous, i.e.,
$$\text{if } y_n \xrightarrow{w} y \text{ in } D, \text{ then } K(y_n) \to K(y) \text{ in } C.$$
We also assume that $G$ is compact, i.e., maps bounded sets into relatively compact sets. If by $\mathcal{D}$ we denote the family of triples $(G, U, C)$, where $G$ and $C$ are as above and $U \subseteq C$ is a nonempty, bounded, relatively open set, such that $\operatorname{Fix}(G) \cap \partial U = \emptyset$, then on $\mathcal{D}$ we can define a fixed point index similar to the one introduced in Definition 7.3.36. For details we refer to Bader (2001). In particular we have the following result.
THEOREM 7.4.21 There exists a map $i : \mathcal{D} \to \mathbb{Z}$, such that

(a) Normalization: If $K$ in (7.29) is a constant map, i.e., $K(y) = v_0 \notin \partial U$ for all $y \in Y$, then
$$i(G, U, C) = \begin{cases} 1 & \text{if } v_0 \in U, \\ 0 & \text{if } v_0 \notin U. \end{cases}$$

(b) Additivity: If $\operatorname{Fix}(G) \cap U \subseteq U_1 \cup U_2$, where $U_1$ and $U_2$ are disjoint open subsets of $U$, then
$$i(G, U, C) = i(G, U_1, C) + i(G, U_2, C).$$

(c) Homotopy invariance: Let $F : C \to 2^C \setminus \{\emptyset\}$ be a multifunction with a decomposition $F = S \circ L$ as in (7.29) and assume that $G$ and $F$ are homotopic in the following sense: "There exists an upper semicontinuous multifunction $H : [0, 1] \times C \to 2^D \setminus \{\emptyset\}$ ($D$ with the relative weak topology) with weakly compact, convex values, such that $H(0, \cdot) = N$ and $H(1, \cdot) = L$ and a sequentially continuous map $u : [0, 1] \times (D, w) \to C$, such that $u(0, \cdot) = K$ and $u(1, \cdot) = S$." Let us set
$$\Psi(t, x) \overset{\mathrm{df}}{=} u\big(t, H(t, x)\big)$$
and assume that $\Psi$ is compact and
$$x \notin \Psi(t, x) \quad \forall\, (t, x) \in [0, 1] \times \partial U.$$
Then $i(G, U, C) = i(F, U, C)$.

(d) Decomposition property: For any other decomposition $G' = K' \circ N'$ with intermediate set $D'$ being a nonempty, closed, convex set of a Banach space $Y'$, such that there exists a sequentially continuous map $p : (D, w) \to (D', w)$ with $N' = p \circ N$ and $K' \circ p = K$, we have $i(G, U, C) = i(G', U, C)$.

(e) Solution property: If $i(G, U, C) \neq 0$, then $\operatorname{Fix}(G) \cap U \neq \emptyset$.

REMARK 7.4.22 We emphasize that in the above theorem as well as in the ones that follow, the multifunction $G$ need not have convex values.

Using Theorem 7.4.21, we can derive fixed point principles for multifunctions that need not have convex values.
THEOREM 7.4.23 If $X$ is a Banach space, $G : \overline{B}_R \to 2^X \setminus \{\emptyset\}$ is a compact multifunction with a decomposition as in (7.29), then at least one of the following statements holds:
(a) there exist $x_0 \in \partial B_R$ and $\lambda \in (0, 1)$, such that $x_0 \in \lambda G(x_0)$; or
(b) $\operatorname{Fix}(G) \neq \emptyset$.

PROOF Let $r : X \to \overline{B}_R$ be a retraction and obtain a decomposition
$$G \circ r = K \circ (N \circ r).$$
Suppose that $\operatorname{Fix}(G) \cap \partial B_R = \emptyset$ (or otherwise we are done). Then $i(G \circ r, B_R, X)$ is defined. Let $u : [0, 1] \times (D, w) \to X$ be defined by
$$u(t, y) \overset{\mathrm{df}}{=} tK(y).$$
Then using $u$ we see that $G \circ r$ and $0 \circ (N \circ r) = 0$ are homotopic in the sense of Theorem 7.4.21 provided (a) is not valid. Then by the homotopy invariance and normalization properties, we have
$$i(G \circ r, B_R, X) = i(0, B_R, X) = 1.$$
Then by virtue of the solution property, we can find $x \in B_R$, such that $x \in (G \circ r)(x) = G(x)$.

Similarly we can show the following nonlinear alternative theorem.

THEOREM 7.4.24 If $X$ is a Banach space, $G : C \to 2^C \setminus \{\emptyset\}$ is a compact multifunction with a decomposition as in (7.29) and $0 \in C$, then at least one of the following statements holds:
(a) $G$ has a fixed point; or
(b) the set $S \overset{\mathrm{df}}{=} \{x \in C : x \in \lambda G(x) \text{ for some } 0 < \lambda < 1\}$ is unbounded.

REMARK 7.4.25 Theorem 7.4.24 is a multivalued generalization of the Leray-Schauder alternative principle (see Theorem 7.2.16). The remarkable feature of Theorem 7.4.24 is that the multifunction $G$ need not have convex values.
7.5 Remarks
7.1: Theorem 7.1.2 is of course due to Banach (1922) and it is an abstraction of the classical method of successive approximation due to Liouville and Picard. The renorming technique used in Example 7.1.4(b) was introduced by Bielecki (1956). Proposition 7.1.5 is essentially due to Schauder (1933) (he had some additional unnecessary hypotheses). There have been many variants and extensions of Banach's contraction principle. Most of them are of limited interest. In Theorem 7.1.7 we have collected the most useful descendants of Banach's fixed point theorem. Theorem 7.1.7(a) is an observation resulting from the proof of Theorem 7.1.2. Theorem 7.1.7(b) is due to Weissinger (1952), Theorem 7.1.7(c) is due to Edelstein (1962) and Theorem 7.1.7(d) is due to Kannan (1969). Theorem 7.1.9 is due to Boyd & Wong (1969).

The first fixed point theorem for nonexpansive maps in a noncompact setting was proved by Browder (1965b) and Göhde (1965) independently. They proved that a nonexpansive self-map of a nonempty, bounded, closed, convex set of a uniformly convex Banach space has a fixed point. Soon thereafter Kirk (1965) proved the same result under the slightly weaker assumption that the Banach space $X$ is reflexive and the set $C$, on which the nonexpansive map is defined, is nonempty, bounded, closed, convex and has normal structure (see Definition 7.1.18). This geometric property was introduced by Brodskii & Milman (1948) to study fixed points of isometries and it is a property shared by all uniformly convex Banach spaces. Theorem 7.1.23 is essentially due to Kirk (1965). Theorem 7.1.28 is due to Caristi (1976) and it is equivalent to Ekeland's variational principle (see Corollary 4.6.3) and to the following theorem of Takahashi (1991).

THEOREM 7.5.1 If $(X, d_X)$ is a complete metric space, $\varphi : X \to \overline{\mathbb{R}} \overset{\mathrm{df}}{=} \mathbb{R} \cup \{+\infty\}$ is a proper, lower semicontinuous, bounded below function and for every $y \in X$, such that
$$\inf_{x \in X} \varphi(x) < \varphi(y),$$
we can find $u \in X$, such that
$$u \neq y \quad \text{and} \quad \varphi(u) + d_X(y, u) \leq \varphi(y),$$
then there exists $\widehat{x} \in X$, such that $\varphi(\widehat{x}) = \inf_{x \in X} \varphi(x)$.

A detailed treatment of metric fixed point theory can be found in the books of Dugundji & Granas (1982), Goebel & Kirk (1990), Istratescu (1981), Smart (1980) and Zeidler (1985a).
7.2: Proposition 7.2.5 was first observed by Borsuk (1931). The product of two compact fixed point spaces (i.e., spaces with FFP) need not be a fixed point space (see Bredon (1971)). In contrast, an infinite product of nonempty, compact fixed point spaces will be a fixed point space, provided every finite product of these spaces is a fixed point space. So by virtue of Brouwer's theorem (see Theorem 7.2.7(b)), the Hilbert cube $[0, 1]^{\mathbb{N}}$ is a fixed point space. Theorem 7.2.7 is one of the oldest and best known results in topology. For $n = 3$ it was proved by Brouwer (1909) and for arbitrary $n \geq 1$ by Hadamard (1910). We should also mention similar results obtained earlier by Poincaré (1886) and Bohl (1904). Unlike Banach's contraction principle (see Theorem 7.1.2), Brouwer's theorem does not give any computational scheme for obtaining a fixed point. However, Scarf (1967), under some additional conditions, produced an algorithm for computing a fixed point of a continuous map. In Knaster, Kuratowski & Mazurkiewicz (1929) we can find a simple and short proof of Brouwer's theorem based on combinatorial arguments. They also note that for a continuous map $\varphi : \overline{B}_1 \to \mathbb{R}^N$ ($\overline{B}_1 = \{x \in \mathbb{R}^N : \|x\|_{\mathbb{R}^N} \leq 1\}$), the condition $\varphi(\partial B_1) \subseteq \overline{B}_1$ suffices for the existence of a fixed point. Example 7.2.10 is due to Kakutani (1943). Theorem 7.2.11 is due to Schauder (1930) and Theorem 7.2.13 is due to Borsuk (1931). Theorem 7.2.14 extends Theorem 7.2.11 to locally convex spaces and is due to Tychonoff (1935). Its proof can be found in Dunford & Schwartz (1958, p. 456). There is a weak sequential version of it due to Arino, Gautier & Penot (1984).

THEOREM 7.5.2 If $X$ is a metrizable, locally convex space, $C \subseteq X$ is a nonempty, weakly compact, convex set and $\varphi : C \to C$ is weakly sequentially continuous, then there exists $x \in C$, such that $\varphi(x) = x$.

Theorem 7.2.16 was first proved for Banach spaces by Leray & Schauder (1934) using degree theory. A direct elementary proof for the general case was provided by Schaefer (1955). Corollary 7.2.18(a) is due to Rothe (1938), and Corollary 7.2.18(b) is due to Altman (1955). Theorem 7.2.27 is due to Sadovskii (1972) and historical forerunners of it are the fixed point theorems of Darbo (1955) and Krasnoselskii (1964b, 1964a). Finally we mention an interesting consequence of the Schauder-Tychonoff fixed point theorem (see Theorem 7.2.14), which is important in functional analysis.

THEOREM 7.5.3 If $X$ is a locally convex space, $C \subseteq X$ is a nonempty, compact, convex set and $\mathcal{F}$ is a commuting family of continuous affine maps $h : C \to C$, then $\mathcal{F}$ has a common fixed point, i.e., there exists $x \in C$, such that $h(x) = x$ for all $h \in \mathcal{F}$.
The above theorem was first proved by Markov (1936) using Theorem 7.2.14. Kakutani (1938) produced a direct elementary proof and demonstrated its importance by furnishing several applications of it. Topological fixed point theory is treated in the books of Brown (1993), Dugundji & Granas (1982), Istratescu (1981), Smart (1980) and Zeidler (1985a).

7.3: Theorem 7.3.8 was proved by Bourbaki (1940–1949) and Kneser (1950) independently, while Theorem 7.3.10 is due to Amann (1977). Its forerunner is Theorem 7.3.11, proved by Tarski (1955). The theory of ordered Banach spaces and the properties of the various kinds of order cones (see Definition 7.3.15) can be found in the classical book of Krasnoselskii (1964a). The same topic is also treated in detail in Deimling (1985) and Guo & Lakshmikantham (1988). Theorems 7.3.30 and 7.3.31 can be found in Krasnoselskii (1964a), Krasnoselskii & Zabreiko (1984) and Amann (1977), while Theorem 7.3.33 is due to Guo & Sun (1988). A discussion of the notion of fixed point index (see Definition 7.3.36) can be found in Deimling (1985), Guo & Lakshmikantham (1988), Lloyd (1978), Nussbaum (1971) and Zeidler (1985a), while fixed point theorems which employ conditions of cone expansion and compression were first obtained by Krasnoselskii (1964a, 1964b). Additional results in this direction were obtained by Amann (1976), Guo (1986, 1987), Guo & Sun (1988), Heikkilä & Lakshmikantham (1994), Leggett & Williams (1979), Sun (1991) and Sun & Sun (1986). The theory has important applications in the existence and multiplicity of positive solutions for boundary value problems.

7.4: Theorem 7.4.3 was proved by Nadler (1969) for multivalued contractions with nonempty, bounded, closed values. Soon thereafter Covitz & Nadler (1970) extended the result to multivalued contractions whose values need not be compact. The stability results established in Proposition 7.4.5 and Corollary 7.4.6 are due to Lim (1985). Concerning the structure of the set of fixed points of a multivalued contraction, Ricceri (1987) proved the following result.

THEOREM 7.5.4 If $X$ is a Banach space, $C \subseteq X$ is a nonempty, closed, convex set and $F : C \to P_{fc}(C)$ is an $h$-contraction, then $\operatorname{Fix}(F)$ is a nonempty set and it is a retract of $X$. In particular $\operatorname{Fix}(F)$ is path-connected.

Analogous results can be found also in Bressan, Cellina & Fryszkowski (1991) and Górniewicz, Marano & Slosarski (1996). Concerning $h$-nonexpansive multifunctions, we have the following result due to Lim (1974).
THEOREM 7.5.5 If $X$ is a uniformly convex Banach space, $C \subseteq X$ is a nonempty, bounded, closed, convex set and $F : C \to P_k(C)$ is an $h$-nonexpansive multifunction (i.e., $h\big(F(x), F(y)\big) \leq \|x - y\|_X$ for all $x, y \in C$), then $\operatorname{Fix}(F) \neq \emptyset$.

Theorem 7.4.8 was first proved by Knaster, Kuratowski & Mazurkiewicz (1929) in the special case where $C$ is the set of vertices of a simplex in $\mathbb{R}^N$. Their approach is combinatorial, based on Sperner's lemma. The infinite dimensional version presented in Theorem 7.4.8 is due to Fan (1952). Theorem 7.4.14 is due to Browder (1976, p. 72), but the proof there is different. The finite dimensional forerunner of the result was proved by Hartman & Stampacchia (1966). Weakly inward maps (see Definition 7.4.15) were introduced by Halpern & Bergman (1968) and Theorem 7.4.17 is due to Halpern (1970). Earlier, Browder (1967) had proved a similar result under the stronger condition that $F(x) \subseteq I_C(x)$ for all $x \in C$ (strongly inward multifunction). Theorem 7.4.19 is due to Kakutani (1941) (when $X = \mathbb{R}^N$) and Fan (1952, 1960–1961) (when $X$ is a general locally convex space). Theorem 7.4.20 is due to Bader (2001). The fixed point index for multifunctions introduced in Theorem 7.4.21 and the multivalued fixed point principles in Theorems 7.4.23 and 7.4.24 are due to Bader (2001). Additional multivalued formulations of the Leray-Schauder alternative principle can be found in Frigon & Granas (1994) and O'Regan & Precup (2001). Further fixed point results for multifunctions can be found in the books of Border (1985), Górniewicz (1999) and Hu & Papageorgiou (1997, 2000).
Appendix
A.1 Topology
DEFINITION A.1.1 Let $X$ be a Hausdorff topological space.
(a) We say that $X$ is a locally compact space, if every point $x \in X$ has a relatively compact neighbourhood.
(b) We say that $X$ is a $\sigma$-compact space, if
$$X = \bigcup_{n=1}^{\infty} K_n,$$
with $K_n$ compact.

REMARK A.1.2 Every finite dimensional space is locally compact and $\sigma$-compact. In a locally compact space every point has a local base consisting of relatively compact sets. Also if $X$ is locally compact and $\sigma$-compact, then
$$X = \bigcup_{n=1}^{\infty} K_n,$$
with $K_n$ compact and $K_n \subseteq \operatorname{int} K_{n+1}$ for all $n \geq 1$. Hence
$$X = \bigcup_{n=1}^{\infty} \operatorname{int} K_n.$$
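THEOREM A.1.3 If $(X, \tau)$ is a noncompact, locally compact space ($\tau$ is the topology of $X$) and
$$X_\infty \overset{\mathrm{df}}{=} X \cup \{\infty\}, \text{ with } \infty \notin X, \qquad \tau_\infty \overset{\mathrm{df}}{=} \tau \cup \{X_\infty \setminus K : K \subseteq X \text{ is compact}\},$$
then $\tau_\infty$ is a Hausdorff topology on $X_\infty$, $(X_\infty, \tau_\infty)$ is a compact space and $X$ is an open dense set in $X_\infty$.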
REMARK A.1.4 The space $(X_\infty, \tau_\infty)$ is called the Alexandrov one-point compactification of $X$.

DEFINITION A.1.5 Let $(X, \tau)$ be a Hausdorff topological space and let $\{U_i\}_{i \in I}$ and $\{V_j\}_{j \in J}$ be two covers of $X$. We say that $\{U_i\}_{i \in I}$ is a refinement of $\{V_j\}_{j \in J}$, if for every $i \in I$, we can find $j \in J$, such that $U_i \subseteq V_j$. In this case we write $\{U_i\}_{i \in I} \prec \{V_j\}_{j \in J}$. A refinement $\{U_i\}_{i \in I}$ of $\{V_j\}_{j \in J}$ is said to be precise, if $I = J$ and
$$U_i \subseteq V_i \quad \forall\, i \in I = J.$$
A cover $\{U_i\}_{i \in I}$ of $X$ is said to be locally finite, if every $x \in X$ has a neighbourhood which intersects only finitely many sets $U_i$.

PROPOSITION A.1.6 If $X$ is a Hausdorff topological space and the cover $\{V_j\}_{j \in J}$ has a locally finite refinement $\{U_i\}_{i \in I}$, then it also has a precise locally finite refinement $\{W_k\}_{k \in K}$. Moreover, if each $U_i$ is open, then each $W_k$ can be chosen to be open too.

DEFINITION A.1.7 (a) A Hausdorff topological space $X$ is said to be paracompact, if every open cover of $X$ has a locally finite refinement.
(b) A partition of unity on a Hausdorff topological space $X$ is a family $\{\psi_i\}_{i \in I}$ of continuous functions $\psi_i : X \to [0, 1]$, such that $\{\operatorname{supp} \psi_i\}_{i \in I}$ is a locally finite closed cover of $X$ and
$$\sum_{i \in I} \psi_i(x) = 1 \quad \forall\, x \in X.$$
If $\{U_i\}_{i \in I}$ is a given open cover of $X$, then we say that the partition of unity $\{\psi_i\}_{i \in I}$ is subordinated to $\{U_i\}_{i \in I}$, if
$$\operatorname{supp} \psi_i \subseteq U_i \quad \forall\, i \in I.$$

REMARK A.1.8 Compact and metrizable spaces are paracompact. Moreover, paracompact spaces are normal.

THEOREM A.1.9 If $X$ is a Hausdorff topological space, then $X$ is paracompact if and only if every open cover of $X$ has a locally finite partition of unity subordinate to it.
THEOREM A.1.10 (Baire Category Theorem) If $X$ is a complete metric space and
$$X = \bigcup_{n=1}^{\infty} C_n,$$
with $C_n$ being closed for every $n \geq 1$, then there exists $n_0 \geq 1$, such that $\operatorname{int} C_{n_0} \neq \emptyset$.

THEOREM A.1.11 (Cantor Intersection Theorem) A metric space $X$ is complete if and only if for every decreasing sequence $\{C_n\}_{n \geq 1}$ of closed sets with $\operatorname{diam} C_n \to 0$, we have that $\bigcap_{n=1}^{\infty} C_n$ is a singleton.

DEFINITION A.1.12 Let $X$ be a Hausdorff topological space.
(a) We say that $X$ is regular, if for every nonempty, closed set $C \subseteq X$ and $x \notin C$, we can find open sets $U, V \subseteq X$, such that
$$C \subseteq U, \quad x \in V \quad \text{and} \quad U \cap V = \emptyset.$$
(b) We say that $X$ is normal, if for every pair of nonempty, closed sets $C_1, C_2 \subseteq X$, such that $C_1 \cap C_2 = \emptyset$, we can find open sets $U_1$ and $U_2$, such that
$$C_1 \subseteq U_1, \quad C_2 \subseteq U_2 \quad \text{and} \quad U_1 \cap U_2 = \emptyset.$$

THEOREM A.1.13 (Urysohn Lemma) A Hausdorff topological space $X$ is normal if and only if for every pair of nonempty disjoint closed sets $C_1$ and $C_2$, we can find a continuous function $\varphi : X \to [0, 1]$, such that
$$\varphi|_{C_1} \equiv 0 \quad \text{and} \quad \varphi|_{C_2} \equiv 1.$$
THEOREM A.1.14 (Tietze Extension Theorem) A Hausdorff topological space $X$ is normal if and only if for every nonempty closed $A \subseteq X$ and every continuous function $\varphi : A \to [a, b]$, we can find a continuous function $\widehat{\varphi} : X \to [a, b]$, such that $\widehat{\varphi}|_A = \varphi$.

DEFINITION A.1.15 Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces and let $\mathcal{F}$ be a family of continuous maps $f : X \to Y$. We say that $\mathcal{F}$ is equicontinuous, if for any given $x \in X$ and $\varepsilon > 0$, we can find $\delta = \delta(\varepsilon, x) > 0$, such that for all $z \in X$ satisfying $d_X(x, z) < \delta$, we have that
$$d_Y\big(f(x), f(z)\big) < \varepsilon \quad \forall\, f \in \mathcal{F}.$$
If $\delta > 0$ can be chosen to be independent of $x \in X$, then we say that $\mathcal{F}$ is uniformly equicontinuous.

REMARK A.1.16 If $X$ is compact, then every equicontinuous family $\mathcal{F}$ is uniformly equicontinuous.

THEOREM A.1.17 (Arzela-Ascoli Theorem) If $(X, d_X)$ is a compact metric space and $K \subseteq C(X)$, then $K$ is relatively compact for the $d_\infty$-metric on $C(X)$, if and only if $K$ is equicontinuous and uniformly bounded, where
$$d_\infty(f, g) \overset{\mathrm{df}}{=} \sup_{x \in X} \big| f(x) - g(x) \big| \quad \forall\, f, g \in C(X).$$
A.2 Measure Theory
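THEOREM A.2.1 (Fatou Lemma) Let $(\Omega, \Sigma, \mu)$ be a measure space and let $\{f_n : \Omega \to \mathbb{R}\}_{n \geq 1}$ be a sequence of $\Sigma$-measurable functions, then
(a) if $f_n(\omega) \geq f(\omega)$ for $\mu$-a.a. $\omega \in \Omega$ and $-\infty$ […] $\}_{n \geq 1}$ is a sequence of $\Sigma$-measurable functions and $h : \Omega \to \mathbb{R}$ is a $\mu$-integrable function,
$$f_n(\omega) \to f(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega$$
and
$$\big| f_n(\omega) \big| \leq h(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } n \geq 1,$$
then $f$ is $\mu$-integrable and
$$\int_\Omega f_n(\omega)\, d\mu \to \int_\Omega f(\omega)\, d\mu.$$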
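Dominated convergence can be observed numerically on a standard example. The sketch below is an illustration under stated assumptions, not part of the text: it takes $\Omega = [0, 1]$ with the Lebesgue measure, $f_n(x) = x^n$, the dominating function $h \equiv 1$ and the a.e. limit $f \equiv 0$, and approximates the integrals with a midpoint rule. The helper `integral` and the chosen exponents are hypothetical.

```python
def integral(f, a=0.0, b=1.0, m=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / m
    return h * sum(f(a + (i + 0.5) * h) for i in range(m))


def power(n):
    """Return the function x -> x**n, dominated by 1 on [0, 1]."""
    return lambda x: x ** n


DEGREES = (1, 10, 100, 1000)
values = [integral(power(n)) for n in DEGREES]

for n, v in zip(DEGREES, values):
    # The exact value of the integral of x**n over [0, 1] is 1 / (n + 1).
    print(n, v, 1.0 / (n + 1))
```

The computed integrals decrease towards $0$, the integral of the limit function, in agreement with the exact values $1/(n+1)$.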
The above two theorems can be extended using the notion of "uniform integrability."

DEFINITION A.2.3 Let $(\Omega, \Sigma, \mu)$ be a measure space and $K \subseteq L^1(\Omega)$. We say that $K$ is uniformly integrable, if
$$\lim_{\lambda \to +\infty} \sup_{f \in K} \int_{\{|f| > \lambda\}} \big| f(\omega) \big|\, d\mu = 0.$$

REMARK A.2.4 If we have
$$\big| f(\omega) \big| \leq h(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } f \in K,$$
for some $h \in L^1(\Omega)_+$, then $K \subseteq L^1(\Omega)$ is uniformly integrable.

THEOREM A.2.5 If $(\Omega, \Sigma, \mu)$ is a finite measure space and $K \subseteq L^1(\Omega)$, then $K$ is uniformly integrable if and only if
(a) $K \subseteq L^1(\Omega)$ is bounded; and
(b) for a given $\varepsilon > 0$, we can find $\delta > 0$, such that
$$\sup_{f \in K} \int_A \big| f(\omega) \big|\, d\mu \leq \varepsilon \quad \forall\, A \in \Sigma,\ \mu(A) \leq \delta.$$
REMARK A.2.6 If µ is nonatomic, then (a) follows from (b). Recall that a set A ∈ Σ is a µ-atom, if µ(A) > 0 and for every B ∈ Σ with B ⊆ A, µ(B) = 0 or µ(B) = µ(A). A µ-atom is finite, if µ(A) < +∞. A µ-atom is infinite, if µ(A) = +∞. If µ possesses no atoms, then µ is said to be nonatomic. PROPOSITION A.2.7 If (Ω, Σ, µ) is a finite measure space, K ⊆ L1 (Ω) is uniformly integrable and K0 is the closure of K in the topology of almost everywhere convergence, then K0 is uniformly integrable.
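THEOREM A.2.8 (Extended Fatou Lemma) If $(\Omega, \Sigma, \mu)$ is a finite measure space and $\{f_n\}_{n \geq 1} \subseteq L^1(\Omega)$ is a uniformly integrable sequence, then
$$\int_\Omega \liminf_{n \to +\infty} f_n(\omega)\, d\mu \leq \liminf_{n \to +\infty} \int_\Omega f_n(\omega)\, d\mu \leq \limsup_{n \to +\infty} \int_\Omega f_n(\omega)\, d\mu \leq \int_\Omega \limsup_{n \to +\infty} f_n(\omega)\, d\mu.$$

THEOREM A.2.9 (Vitali Theorem; Extended Dominated Convergence Theorem) If $(\Omega, \Sigma, \mu)$ is a finite measure space, $\{f_n\}_{n \geq 1} \subseteq L^1(\Omega)$ is uniformly integrable and $f_n \xrightarrow{\mu} f$, then $f \in L^1(\Omega)$ and
$$\int_\Omega f_n(\omega)\, d\mu \to \int_\Omega f(\omega)\, d\mu.$$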
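THEOREM A.2.10 (Monotone Convergence Theorem) If $(\Omega, \Sigma, \mu)$ is a measure space, $\{f_n : \Omega \to \mathbb{R}\}_{n \geq 1}$ is a sequence of $\Sigma$-measurable functions, such that
$$h(\omega) \leq f_n(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega \text{ and all } n \geq 1,$$
with $-\infty$ […] $0$, we can find a compact set $K_\varepsilon \subseteq X$, such that
$$\mu(K_\varepsilon^c) \leq \varepsilon \quad \text{and} \quad f|_{K_\varepsilon} \text{ is continuous}.$$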
THEOREM A.2.12 (Egorov Theorem) If $(\Omega, \Sigma, \mu)$ is a finite measure space and $\{f_n : \Omega \to \mathbb{R}\}_{n \geq 1}$ is a sequence of $\Sigma$-measurable functions, $f : \Omega \to \mathbb{R}$ is a $\Sigma$-measurable function and
$$f_n(\omega) \to f(\omega) \quad \text{for } \mu\text{-a.a. } \omega \in \Omega,$$
then for a given $\varepsilon > 0$, we can find $A \in \Sigma$ with $\mu(A^c) < \varepsilon$, such that
$$f_n \to f \quad \text{uniformly on } A.$$

THEOREM A.2.13 (Saks Theorem) If $(\Omega, \Sigma, \mu)$ is a finite measure space and $\varepsilon > 0$, then $\Omega$ is the union of a finite sequence of pairwise disjoint sets $\{A_k\}_{k=1}^m \subseteq \Sigma$, such that $A_k$ is an atom or $\mu(A_k) \leq \varepsilon$.

REMARK A.2.14 According to Theorem A.2.13, every finite measure space can have at most a countable family of pairwise disjoint atoms.

DEFINITION A.2.15 Let $f : [a, b] \to \mathbb{R}$ be a function.
(a) The total variation of $f$ over $[a, b]$ is defined by
$$V_a^b(f) \overset{\mathrm{df}}{=} \sup \Big\{ \sum_{k=1}^{n} \big| f(x_k) - f(x_{k-1}) \big| : a = x_0 < x_1 < \ldots < x_n = b \Big\}.$$
If $V_a^b(f) < +\infty$, then we say that $f$ is of bounded variation over $[a, b]$. By $BV\big([a, b]\big)$, we denote the space of functions of bounded variation on $[a, b]$. If $T \subseteq \mathbb{R}$ is any interval, then $f : T \to \mathbb{R}$ is locally of bounded variation, denoted by $f \in BV_{loc}(T)$, if for all $a, b \in T$, we have that $f \in BV\big([a, b]\big)$.
(b) We say that $f$ is absolutely continuous, if for every $\varepsilon > 0$, we can find $\delta > 0$, such that if $\{[c_n, d_n]\}_n$ is any finite or countable collection of pairwise disjoint, closed intervals in $[a, b]$ with
$$\sum_n (d_n - c_n) < \delta,$$
then
$$\sum_n \big| f(d_n) - f(c_n) \big| < \varepsilon.$$
By $AC\big([a, b]\big)$ we denote the space of functions which are absolutely continuous on $[a, b]$. If $T \subseteq \mathbb{R}$ is any interval, then $f : T \to \mathbb{R}$ is locally absolutely continuous, denoted by $f \in AC_{loc}(T)$, if for all $a, b \in T$, we have that $f \in AC\big([a, b]\big)$.
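The supremum in the definition of the total variation can be explored one partition at a time: every partition yields a lower bound for $V_a^b(f)$. The following sketch is only an illustration; the helper names `variation` and `uniform_partition` and the two test functions are assumptions, not part of the text.

```python
import math


def variation(f, partition):
    """Sum of |f(x_k) - f(x_{k-1})| over one partition a = x_0 < ... < x_n = b."""
    return sum(abs(f(right) - f(left))
               for left, right in zip(partition, partition[1:]))


def uniform_partition(a, b, n):
    """Return the n + 1 equally spaced points of [a, b]."""
    return [a + (b - a) * k / n for k in range(n + 1)]


# sin on [0, 2*pi]: the partition through pi/2 and 3*pi/2 captures every turn.
sine_variation = variation(math.sin, uniform_partition(0.0, 2 * math.pi, 4))

# For an increasing function the sum telescopes to f(b) - f(a).
cubic_variation = variation(lambda x: x ** 3, uniform_partition(0.0, 2.0, 50))

print(sine_variation, cubic_variation)
```

For a monotone function every partition gives the same value $|f(b) - f(a)|$, whereas for an oscillating function the partition has to contain the turning points in order to reach the supremum.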
REMARK A.2.16 Evidently, if $f \in AC\big([a, b]\big)$, then $f \in BV\big([a, b]\big)$ and it is uniformly continuous.

PROPOSITION A.2.17 $f \in BV\big([a, b]\big)$ if and only if $f$ is the difference of two increasing functions. Moreover, if $f$ is also continuous, then both increasing functions can be chosen to be continuous.

PROPOSITION A.2.18 If $f \in BV_{loc}(\mathbb{R})$, then $f'$ exists almost everywhere, $f' \in L^1_{loc}(\mathbb{R})$ and
$$\int_a^b f'(t)\, dt \leq f(b) - f(a) \quad \forall\, a, b \in \mathbb{R}.$$

REMARK A.2.19 If $h \in L^1_{loc}(\mathbb{R})$, $c \in \mathbb{R}$ and
$$f(t) = c + \int_0^t h(s)\, ds \quad \forall\, t \in \mathbb{R},$$
then $f \in BV_{loc}(\mathbb{R})$ and $f'(t) = h(t)$ for almost all $t \in \mathbb{R}$. Moreover,
$$V_a^b(f) = \int_a^b \big| h(s) \big|\, ds = \int_a^b \big| f'(s) \big|\, ds.$$

THEOREM A.2.20 (Fundamental Theorem of Lebesgue Calculus) If $f \in AC_{loc}(\mathbb{R})$, then $f'$ exists almost everywhere, belongs to $L^1_{loc}(\mathbb{R})$ and
$$f(t) = f(a) + \int_a^t f'(s)\, ds \quad \forall\, t \in \mathbb{R}.$$

THEOREM A.2.21 $f \in AC\big([a, b]\big)$ if and only if $f \in BV\big([a, b]\big) \cap C\big([a, b]\big)$ and for every Lebesgue-null set $C \subseteq [a, b]$, we have
$$\lambda^1\big(f(C)\big) = 0$$
($\lambda^1$ being the Lebesgue measure on $\mathbb{R}$).
DEFINITION A.2.22 Let $(\Omega, \Sigma, \mu)$ be a measure space and let $\nu$ be a signed measure on $\Sigma$. We say that $\nu$ is absolutely continuous with respect to $\mu$ (denoted by $\nu \ll \mu$), if $\mu(A) = 0$ implies that $\nu(A) = 0$.

REMARK A.2.23 If $\nu$ is a finite signed measure on $\Sigma$, then $\nu \ll \mu$ if and only if for every $\varepsilon > 0$, we can find $\delta > 0$, such that
$$\big| \nu(A) \big| < \varepsilon \quad \forall\, A \in \Sigma,\ \mu(A) < \delta.$$

THEOREM A.2.24 (Radon-Nikodym Theorem) If $(\Omega, \Sigma, \mu)$ is a $\sigma$-finite measure space, $\nu$ is a $\sigma$-finite signed measure on $\Sigma$ and $\nu \ll \mu$, then there exists a $\Sigma$-measurable function $f : \Omega \to \mathbb{R}$, such that
$$\nu(A) = \int_A f(\omega)\, d\mu \quad \forall\, A \in \Sigma.$$

REMARK A.2.25 The function $f$ is called the Radon-Nikodym derivative of $\nu$ with respect to $\mu$ and usually is denoted by $\frac{d\nu}{d\mu}$. The Radon-Nikodym derivative is unique modulo a $\mu$-null set. Moreover, if $\mu$ and $\nu$ are finite, then $f \in L^1(\Omega; \mu)$.

THEOREM A.2.26 (Jensen Inequality) If $(\Omega, \Sigma, \mu)$ is a finite measure space, $J \subseteq \mathbb{R}$ is an interval, $f \in L^1(\Omega)$, $f(\Omega) \subseteq J$, $\varphi : J \to \mathbb{R}$ is a convex function and $\varphi \circ f \in L^1(\Omega)$, then
$$\varphi\Big( \frac{1}{\mu(\Omega)} \int_\Omega f(\omega)\, d\mu \Big) \leq \frac{1}{\mu(\Omega)} \int_\Omega (\varphi \circ f)(\omega)\, d\mu.$$
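The Jensen inequality is easy to test on a finite measure space, where the integrals reduce to weighted sums. The sketch below assumes $\Omega = \{0, 1, 2\}$ with point masses `mu`; the function name `jensen_sides` and the sample data are hypothetical and serve only as an illustration.

```python
import math


def jensen_sides(f, mu, phi):
    """Return both sides of the Jensen inequality on a finite measure space.

    f and mu are lists: f[i] is the value of f at the i-th point and mu[i] > 0
    is the mass of that point; phi is a convex function.
    """
    total = sum(mu)
    mean_f = sum(fi * mi for fi, mi in zip(f, mu)) / total
    lhs = phi(mean_f)
    rhs = sum(phi(fi) * mi for fi, mi in zip(f, mu)) / total
    return lhs, rhs


lhs, rhs = jensen_sides([0.0, 1.0, 2.0], [1.0, 2.0, 1.0], math.exp)
print(lhs, rhs)
```

Equality holds when $f$ is constant: with $f \equiv 1$ and $\varphi(x) = x^2$, both sides are equal to $1$.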
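THEOREM A.2.27 (Hölder Inequality) If $(\Omega, \Sigma, \mu)$ is a measure space, $p \in [1, +\infty]$, $\frac{1}{p} + \frac{1}{p'} = 1$, $f \in L^p(\Omega)$ and $g \in L^{p'}(\Omega)$, then $fg \in L^1(\Omega)$ and
$$\|fg\|_1 \leq \|f\|_p \|g\|_{p'}.$$
More generally, if $f_k \in L^{p_k}(\Omega)$, $k \in \{1, \ldots, m\}$ with
$$\frac{1}{p} = \sum_{k=1}^{m} \frac{1}{p_k} \leq 1,$$
then $f_1 \cdot \ldots \cdot f_m \in L^p(\Omega)$ and
$$\|f_1 \cdot \ldots \cdot f_m\|_p \leq \|f_1\|_{p_1} \ldots \|f_m\|_{p_m}.$$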
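On a finite measure space the Hölder inequality becomes an inequality between weighted sums, which can be checked directly. The sketch below is an illustration under stated assumptions: it handles only exponents $1 < p < +\infty$, and the helper names `lp_norm` and `holder_sides` as well as the sample vectors are hypothetical.

```python
def lp_norm(f, mu, p):
    """Weighted L^p norm of f on a finite measure space, for finite p >= 1."""
    return sum(abs(fi) ** p * mi for fi, mi in zip(f, mu)) ** (1.0 / p)


def holder_sides(f, g, mu, p):
    """Return (||fg||_1, ||f||_p * ||g||_p') for 1 < p < +infinity."""
    p_conj = p / (p - 1.0)  # conjugate exponent: 1/p + 1/p' = 1
    lhs = sum(abs(fi * gi) * mi for fi, gi, mi in zip(f, g, mu))
    rhs = lp_norm(f, mu, p) * lp_norm(g, mu, p_conj)
    return lhs, rhs


lhs, rhs = holder_sides([1.0, -2.0, 3.0], [4.0, 0.0, -1.0], [0.5, 1.0, 2.0], 3.0)
print(lhs, rhs)
```

Here the left-hand side is $|1 \cdot 4| \cdot 0.5 + 0 + |3 \cdot (-1)| \cdot 2 = 8$, which stays below the product of the two norms.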
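THEOREM A.2.28 (Interpolation Inequality) If $(\Omega, \Sigma, \mu)$ is a measure space, $f \in L^p(\Omega) \cap L^q(\Omega)$ with $p, q \in [1, +\infty]$, $p \leq q$, then $f \in L^r(\Omega)$ for all $r \in [p, q]$ and
$$\|f\|_r \leq \|f\|_p^{t}\, \|f\|_q^{1-t} \quad \text{with} \quad \frac{1}{r} = \frac{t}{p} + \frac{1-t}{q} \quad (t \in [0, 1]).$$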
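The interpolation inequality can also be checked on a finite measure space. The sketch below assumes finite exponents $1 \leq p \leq q < +\infty$ and point masses `mu`; the names `lp_norm` and `interpolation_sides` and the sample data are hypothetical, introduced only for illustration.

```python
def lp_norm(f, mu, p):
    """Weighted L^p norm of f on a finite measure space, for finite p >= 1."""
    return sum(abs(fi) ** p * mi for fi, mi in zip(f, mu)) ** (1.0 / p)


def interpolation_sides(f, mu, p, q, t):
    """Return (||f||_r, ||f||_p**t * ||f||_q**(1 - t)) with 1/r = t/p + (1-t)/q."""
    r = 1.0 / (t / p + (1.0 - t) / q)
    lhs = lp_norm(f, mu, r)
    rhs = lp_norm(f, mu, p) ** t * lp_norm(f, mu, q) ** (1.0 - t)
    return lhs, rhs


lhs, rhs = interpolation_sides([1.0, -2.0, 3.0], [0.5, 1.0, 2.0], p=1.0, q=4.0, t=0.5)
print(lhs, rhs)
```

With $p = 1$, $q = 4$ and $t = 1/2$ the intermediate exponent is $r = 1.6$; the endpoint choices $t = 1$ and $t = 0$ return $r = p$ and $r = q$, for which the two sides coincide up to rounding.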
DEFINITION A.2.29 (a) A topological space $(X, \tau)$ is said to be a Polish space, if it is separable and there exists a complete metric on $X$ which generates the topology $\tau$.
(b) A Hausdorff topological space $X$ is said to be a Souslin space, if there exists a Polish space $Y$ and a continuous surjection from $Y$ to $X$.

REMARK A.2.30 Closed and open subsets of Polish spaces are Polish and so are countable products and countable intersections. Every locally compact, $\sigma$-compact metrizable space is Polish and a subspace of a Polish space is itself Polish if and only if it is a $G_\delta$-set of its completion (Alexandrov-Mazurkiewicz theorem). Closed and open subsets of Souslin spaces are Souslin and so are countable products, countable intersections and countable unions of Souslin spaces. Let $X$ be a separable Banach space furnished with the weak topology (denoted by $X_w$). If $\dim X_w = +\infty$, then $X_w$ is a nonmetrizable Souslin space. Similarly, let $X^*_{w^*}$ be the dual of a separable Banach space $X$, equipped with the $w^*$-topology. Then $X^*_{w^*}$ is a Souslin space. The Souslin subspaces of a Polish space are called analytic sets.

THEOREM A.2.31 If $Z$, $X$ are two Polish spaces, $S \in \mathcal{B}(Z \times X)$ and for all $z \in Z$, the set $S(z) \overset{\mathrm{df}}{=} \{x \in X : (z, x) \in S\}$ is $\sigma$-compact, then $\operatorname{proj}_Z S \in \mathcal{B}(Z)$.

THEOREM A.2.32 (Yankov-von Neumann-Aumann Projection Theorem) If $(\Omega, \Sigma)$ is a measurable space, $X$ is a Souslin space and $S \in \Sigma \times \mathcal{B}(X)$, then
$$\operatorname{proj}_\Omega S \in \widehat{\Sigma},$$
with $\widehat{\Sigma}$ being the universal $\sigma$-field corresponding to $\Sigma$. If $\Sigma$ is complete, then $\operatorname{proj}_\Omega S \in \Sigma$.
THEOREM A.2.33 (Yankov-von Neumann-Aumann Selection Theorem) If $(\Omega, \Sigma)$ is a complete measurable space, $X$ is a Souslin space and $F : \Omega \to 2^X \setminus \{\emptyset\}$ is a multifunction, such that
$$\operatorname{Gr} F = \{(\omega, x) \in \Omega \times X : x \in F(\omega)\} \in \Sigma \times \mathcal{B}(X),$$
then there exists a $\Sigma$-measurable function $f : \Omega \to X$, such that
$$f(\omega) \in F(\omega) \quad \forall\, \omega \in \Omega.$$

PROPOSITION A.2.34 (a) If $\tau_1$ and $\tau_2$ are two comparable Souslin topologies on a set $X$, then $\mathcal{B}(X_{\tau_1}) = \mathcal{B}(X_{\tau_2})$.
(b) If $X$ and $Y$ are two Souslin spaces, then $\mathcal{B}(X \times Y) = \mathcal{B}(X) \times \mathcal{B}(Y)$.

PROPOSITION A.2.35 (Double Limit Lemma) Let $X$ be a metric space and let $\{x_{m,n}\}_{m,n \geq 1} \subseteq X$. Suppose that the limit
$$\lim_{m \to +\infty} \lim_{n \to +\infty} x_{m,n}$$
exists. Then there exist subsequences $\{m(n)\}_{n \geq 1}$ and $\{n(m)\}_{m \geq 1}$, such that
$$\lim_{m \to +\infty} \lim_{n \to +\infty} x_{m,n} = \lim_{m \to +\infty} x_{m,n(m)} = \lim_{n \to +\infty} x_{m(n),n}.$$

THEOREM A.2.36 (Portmanteau Theorem) If $X$ is a metric space and $\{\mu_\alpha\}_{\alpha \in J}$ is a net in $M_+^1(X)$ (see (3.130)), then the following statements are equivalent:
(a) $\mu_\alpha \xrightarrow{w} \mu$ in $M_+^1(X)$;
(b) $\int_X f(x)\, d\mu_\alpha \to \int_X f(x)\, d\mu$ for all bounded, uniformly continuous functions $f : X \to \mathbb{R}$;
(c) $\limsup_\alpha \mu_\alpha(C) \leq \mu(C)$ for all closed sets $C \subseteq X$;
(d) $\liminf_\alpha \mu_\alpha(U) \geq \mu(U)$ for all open sets $U \subseteq X$;
(e) $\lim_\alpha \mu_\alpha(A) = \mu(A)$ for all $A \in \mathcal{B}(X)$ with $\mu(\partial A) = 0$ (such sets are called $\mu$-continuity sets).
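DEFINITION A.2.37 Let $\Omega \ne \emptyset$, $S \subseteq 2^\Omega$ with $\emptyset \in S$ and $S$ being stable under finite unions and finite intersections. A function $m : 2^\Omega \longrightarrow \mathbb{R}^* = \mathbb{R} \cup \{\pm\infty\}$ is said to be a (Choquet) S-capacity, if
(i) $m$ is monotone (i.e., if $A \subseteq B \subseteq \Omega$, then $m(A) \le m(B)$);
(ii) if $A_n \nearrow A$ (i.e., $A_n \subseteq A_{n+1}$ and $A = \bigcup_{n=1}^\infty A_n$), then $m(A_n) \nearrow m(A)$;
(iii) if $A_n \searrow A$ (i.e., $A_n \supseteq A_{n+1}$ and $A = \bigcap_{n=1}^\infty A_n$), then $m(A_n) \searrow m(A)$.
We say that a set $A \subseteq \Omega$ is S-m-capacitable, if
$$m(A) = \sup \{ m(C) : C \in S_\delta,\ C \subseteq A \},$$
where $S_\delta = \big\{ \bigcap_{k=1}^\infty C_k : C_k \in S \big\}$.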
DEFINITION A.2.38 An S-Souslin set is a set which belongs to
$$A(S) \stackrel{df}{=} \big\{ A(F) : F \in S^{\mathbb{N}^*} \big\},$$
where $\mathbb{N}^* = \bigcup_{n=1}^\infty \mathbb{N}^n$ and
$$A(F) = \bigcup_{\{n_k\}_{k \ge 1} \in \mathbb{N}^{\mathbb{N}}} \ \bigcap_{m=1}^\infty F(n_1, \ldots, n_m)$$
(this is the Souslin operation corresponding to $F$).

THEOREM A.2.39 (Choquet Capacitability Theorem) If $\Omega \ne \emptyset$, $S \subseteq 2^\Omega$ with $\emptyset \in S$, $S$ is stable under finite unions and finite intersections and $m : 2^\Omega \longrightarrow \mathbb{R}^*$ is an S-capacity, then every S-Souslin set is S-m-capacitable.
A.3 Functional Analysis
THEOREM A.3.1 (Weak Separation Theorem) If $X$ is a locally convex vector space, $A, C \subseteq X$ are two nonempty, convex sets, such that $\mathrm{int}\, A \ne \emptyset$ and $\mathrm{int}\, A \cap C = \emptyset$, then there exists $x^* \in X^* \setminus \{0\}$, such that
$$\langle x^*, a \rangle_X \le \langle x^*, c \rangle_X \quad \forall\ a \in A,\ c \in C.$$

THEOREM A.3.2 (Strong Separation Theorem) If $X$ is a locally convex space, $A, C \subseteq X$ are two nonempty, disjoint, closed, convex sets and $A$ is compact, then there exist $x^* \in X^* \setminus \{0\}$ and $\varepsilon > 0$, such that
$$\sup_{a \in A} \langle x^*, a \rangle_X \le \inf_{c \in C} \langle x^*, c \rangle_X - \varepsilon.$$
REMARK A.3.3 Usually in applications of Theorem A.3.2, $X$ is a Banach space endowed with the weak topology.

THEOREM A.3.4 (Uniform Boundedness Principle) If $X$ is a Banach space, $Y$ is a normed space, $F \subseteq \mathcal{L}(X; Y)$ and
$$\sup_{A \in F} \|Ax\|_Y < +\infty \quad \forall\ x \in X,$$
then
$$\sup_{A \in F} \|A\|_{\mathcal{L}} < +\infty.$$
THEOREM A.3.5 (Open Mapping Theorem) If $X$, $Y$ are two Banach spaces and $A \in \mathcal{L}(X; Y)$ is surjective, then $A$ is an open map.

Recall that a bijective map from one topological space onto another is a homeomorphism if and only if it is both continuous and open. Then Theorem A.3.5 yields the following result.

THEOREM A.3.6 (Banach Theorem) If $X$, $Y$ are two Banach spaces and $A \in \mathcal{L}(X; Y)$ is bijective, then $A$ is an isomorphism (i.e., $A^{-1} \in \mathcal{L}(Y; X)$).

THEOREM A.3.7 (Closed Graph Theorem) If $X$, $Y$ are two Banach spaces and $A : X \longrightarrow Y$ is a linear operator, then $A \in \mathcal{L}(X; Y)$ if and only if $\mathrm{Gr}\, A$ is closed.
THEOREM A.3.8 (Eberlein-Smulian Theorem) If $X$ is a normed space and $C \subseteq X$, then
(a) $C$ is weakly compact if and only if $C$ is weakly sequentially compact;
(b) $C$ is relatively weakly compact if and only if $C$ is relatively weakly sequentially compact.

THEOREM A.3.9 (Alaoglu Theorem) If $X$ is a normed space and $\overline{B}_1^{X^*} = \{ x^* \in X^* : \|x^*\|_{X^*} \le 1 \}$, then $\overline{B}_1^{X^*}$ is a $w^*$-compact set.

REMARK A.3.10 So every norm-bounded set in $X^*$ is relatively $w^*$-compact.
A normed linear space is finite dimensional if and only if the convex hull of every compact set is compact. For infinite dimensional spaces the situation is as follows.

THEOREM A.3.11 (Krein-Smulian Theorem) If $X$ is a Banach space and $C \subseteq X$ is weakly compact, then $\overline{\mathrm{conv}}\, C$ is weakly compact too.

THEOREM A.3.12 (Mazur Theorem) If $X$ is a Banach space and $C \subseteq X$ is norm-compact, then $\overline{\mathrm{conv}}\, C$ is norm-compact too.

In an infinite dimensional Banach space $X$ the weak topology is never metrizable. However, on certain particular subsets the relative weak topology is metrizable. The next theorem summarizes those cases.

THEOREM A.3.13 Let $X$ be a Banach space.
(a) If $X$ is a separable space and $A \subseteq X^*$ is bounded, then the relative $w^*$-topology on $A$ is metrizable.
(b) If $X^*$ is separable and $A \subseteq X$ is bounded, then the relative weak topology on $A$ is metrizable.
(c) If $X$ is separable and $A \subseteq X$ is weakly compact, then the relative weak topology on $A$ is metrizable.

REMARK A.3.14 Recall that, if $X^*$ is separable, then so is $X$. However, the converse is not true (consider $L^1[0, 1]$ and $L^\infty[0, 1] = L^1[0, 1]^*$). But if $X$ is reflexive, then $X$ is separable if and only if $X^*$ is separable.
PROPOSITION A.3.15 (Riesz Lemma) If $X$ is a normed space, $Y \subseteq X$ is a proper closed subspace and $\lambda \in (0, 1)$, then we can find $x_\lambda \in X$, such that
$$\|x_\lambda\|_X = 1 \quad \text{and} \quad d_X(x_\lambda, Y) \ge \lambda.$$
If $X$ is a normed space and $Z$ is a subspace of $X$, it is useful to know the duals of $Z$ and of $X/Z$. The next two propositions identify these two duals.

PROPOSITION A.3.16 If $X$ is a normed space and $Z \subseteq X$ is a subspace endowed with the induced norm, then
(a) the dual $Z^*$ of $Z$ can be isometrically identified with $X^*/Z^\perp$;
(b) $w\big(Z, X^*/Z^\perp\big) = w(X, X^*)|_Z$.

PROPOSITION A.3.17 If $X$ is a normed space and $Z \subseteq X$ is a subspace endowed with the induced norm, then
(a) $\big(X/Z\big)^*$ can be isometrically identified with $Z^\perp$;
(b) $\sigma\big(X/Z, Z^\perp\big) = w(X, X^*)|_Z$.

PROPOSITION A.3.18 If $X$ is a normed space,
$$x_\alpha \stackrel{w}{\longrightarrow} x \ \text{ in } X \quad \text{and} \quad x^*_\alpha \stackrel{w^*}{\longrightarrow} x^* \ \text{ in } X^*,$$
then
$$\|x\|_X \le \liminf_\alpha \|x_\alpha\|_X \quad \text{and} \quad \|x^*\|_{X^*} \le \liminf_\alpha \|x^*_\alpha\|_{X^*}$$
(i.e., the norm functional on $X$ is weakly lower semicontinuous and the norm functional on $X^*$ is weakly$^*$ lower semicontinuous).

PROPOSITION A.3.19 If $X$ is a normed space and $A \subseteq X$ is convex, then $\overline{A} = \overline{A}^{\,w}$. Hence a convex set is closed if and only if it is weakly closed.

PROPOSITION A.3.20 If $X$ is a reflexive Banach space, then a convex set in $X^*$ is weakly closed if and only if it is weakly sequentially closed.
DEFINITION A.3.21 A Banach space $X$ is said to be locally uniformly convex, if for every $\varepsilon > 0$ and $x \in X$ with $\|x\|_X = 1$, we can find $\delta = \delta(\varepsilon, x) > 0$, such that
$$\text{if } \|x - y\|_X \ge \varepsilon, \text{ then } \Big\| \frac{x + y}{2} \Big\|_X \le 1 - \delta \quad \forall\ y \in X,\ \|y\|_X = 1.$$
If $\delta > 0$ can be chosen independent of $x$, then we say that $X$ is uniformly convex.

REMARK A.3.22 A uniformly convex Banach space is reflexive (Milman-Pettis theorem). Locally uniformly convex spaces have the Kadec-Klee (or Radon-Riesz) property, namely if
$$x_n \stackrel{w}{\longrightarrow} x \ \text{ in } X \quad \text{and} \quad \|x_n\|_X \longrightarrow \|x\|_X,$$
then $x_n \longrightarrow x$ in $X$.
Recall that Hilbert spaces are uniformly convex (a consequence of the parallelogram law).

THEOREM A.3.23 (Troyanski Renorming Theorem) Every reflexive Banach space $X$ can be given an equivalent norm so that $X$ and $X^*$ are both locally uniformly convex and have Fréchet differentiable norms.

In the next two theorems we identify the duals of certain well known spaces.

THEOREM A.3.24 (Riesz Representation Theorem) If $(\Omega, \Sigma, \mu)$ is a σ-finite measure space, $p \in [1, +\infty)$ and $\frac{1}{p} + \frac{1}{p'} = 1$ (if $p = 1$, then $p' = +\infty$), then $L^{p'}(\Omega)$ is isometrically isomorphic to $L^p(\Omega)^*$ via the isometry
$$u(g)(f) = \int_\Omega f(\omega) g(\omega)\, d\mu \quad \forall\ g \in L^{p'}(\Omega),\ f \in L^p(\Omega).$$
So we can write that $L^p(\Omega)^* = L^{p'}(\Omega)$.
THEOREM A.3.25 (Riesz-Markov Theorem) If $X$ is a locally compact, σ-compact metrizable space, then $C_0(X)^* = M(X)$.

PROPOSITION A.3.26 If $X$, $Y$ are two separable, metrizable spaces and $\vartheta : X \longrightarrow Y$ is a homeomorphism, then $\widehat{\vartheta} : M_+^1(X) \longrightarrow M_+^1(Y)$, defined by $\widehat{\vartheta}(\mu) = \mu \vartheta^{-1}$, is a homeomorphism too, if on $M_+^1(X)$ and $M_+^1(Y)$ we consider the respective weak topologies.
A.4 Calculus and Nonlinear Analysis
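THEOREM A.4.1 (Gauss-Green Theorem; Divergence Theorem) If $\Omega \subseteq \mathbb{R}^N$ is a nonempty, bounded open set with $C^1$-boundary $\partial\Omega$, then
$$\int_\Omega \mathrm{div}\, u \, dz = \int_{\partial\Omega} (u, n)_{\mathbb{R}^N}\, d\sigma \quad \forall\ u \in C^1\big(\overline{\Omega}; \mathbb{R}^N\big).$$
Hence
$$\int_\Omega \frac{\partial u}{\partial z_i}\, dz = \int_{\partial\Omega} u\, n_i \, d\sigma \quad \forall\ u \in C^1\big(\overline{\Omega}; \mathbb{R}^N\big),\ i \in \{1, \ldots, N\}.$$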
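The identity in Theorem A.4.1 can be checked numerically on a simple domain. The sketch below is only an illustration under assumptions not made in the text: $\Omega$ is taken to be the unit disc in $\mathbb{R}^2$, the field is the hypothetical choice $u(z) = (z_1^3, z_2)$, and both integrals are approximated by a midpoint rule. For this choice both sides equal $7\pi/4$.

```python
import math


def div_u(z1: float, z2: float) -> float:
    """Divergence of the sample field u(z) = (z1**3, z2)."""
    return 3.0 * z1 ** 2 + 1.0


def u_dot_n(theta: float) -> float:
    """(u, n) on the unit circle, with outward normal n = (cos(theta), sin(theta))."""
    c, s = math.cos(theta), math.sin(theta)
    return (c ** 3) * c + s * s


def volume_integral(nr: int = 400, nt: int = 400) -> float:
    """Midpoint rule in polar coordinates for the integral of div u over the unit disc."""
    dr, dt = 1.0 / nr, 2.0 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nt):
            t = (j + 0.5) * dt
            # r * dr * dt is the area element in polar coordinates
            total += div_u(r * math.cos(t), r * math.sin(t)) * r * dr * dt
    return total


def boundary_integral(nt: int = 400) -> float:
    """Midpoint rule for the integral of (u, n) over the unit circle (arc length = d theta)."""
    dt = 2.0 * math.pi / nt
    return sum(u_dot_n((j + 0.5) * dt) for j in range(nt)) * dt


lhs = volume_integral()
rhs = boundary_integral()
print(lhs, rhs, 7.0 * math.pi / 4.0)
```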
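REMARK A.4.2 Recall that
$$\mathrm{div}\, u(z) = \sum_{k=1}^N \frac{\partial u_k}{\partial z_k}(z)$$
and $n(z) = \big(n_k(z)\big)_{k=1}^N$ is the outward unit normal on $\partial\Omega$. If $v : \Omega \longrightarrow \mathbb{R}$ is twice partially differentiable, then
$$\nabla v = \Big( \frac{\partial v}{\partial z_1}, \ldots, \frac{\partial v}{\partial z_N} \Big)$$
is a partially differentiable vector field and $\mathrm{div}\,(\nabla v) = \Delta v$. Also, if $f : \Omega \longrightarrow \mathbb{R}$ is partially differentiable and $u \in C^1\big(\overline{\Omega}; \mathbb{R}^N\big)$, then
$$\mathrm{div}\,(f u) = f\, \mathrm{div}\, u + (\nabla f, u)_{\mathbb{R}^N}.$$
Theorem A.4.1 leads to the following “integration by parts formula.”

COROLLARY A.4.3 (Integration by Parts Formula) If $\Omega \subseteq \mathbb{R}^N$ is a bounded, open set with a $C^1$-boundary $\partial\Omega$ and $u, v \in C^1\big(\overline{\Omega}\big)$, then
$$\int_\Omega \frac{\partial u}{\partial z_k}\, v \, dz + \int_\Omega u\, \frac{\partial v}{\partial z_k}\, dz = \int_{\partial\Omega} u v n_k \, d\sigma \quad \forall\ k \in \{1, \ldots, N\}.$$

Using Corollary A.4.3, we derive the so-called “Green’s formulas.”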
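COROLLARY A.4.4 (Green Formula) If $\Omega \subseteq \mathbb{R}^N$ is a bounded, open set with $C^1$-boundary $\partial\Omega$ and $u, v \in C^2\big(\overline{\Omega}\big)$, then
(a) $\displaystyle \int_\Omega \Delta u \, dz = \int_{\partial\Omega} \frac{\partial u}{\partial n}\, d\sigma$;
(b) $\displaystyle \int_\Omega (\nabla u, \nabla v)_{\mathbb{R}^N}\, dz + \int_\Omega u \Delta v \, dz = \int_{\partial\Omega} u\, \frac{\partial v}{\partial n}\, d\sigma$;
(c) $\displaystyle \int_\Omega \big( u \Delta v - v \Delta u \big)\, dz = \int_{\partial\Omega} \Big( u\, \frac{\partial v}{\partial n} - v\, \frac{\partial u}{\partial n} \Big)\, d\sigma$.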
PROPOSITION A.4.5 (Young Inequality) If $p \in (1, +\infty)$, $\frac{1}{p} + \frac{1}{p'} = 1$, $a, b \ge 0$ and $\varepsilon > 0$, then
$$ab \le \frac{a^p}{p} + \frac{b^{p'}}{p'} \quad \text{and} \quad ab \le \frac{\varepsilon}{p}\, a^p + \frac{1}{p' \varepsilon^{p'/p}}\, b^{p'}.$$

REMARK A.4.6 If $p = p' = 2$, then the first inequality in Proposition A.4.5 is the well known Cauchy-Bunyakowski-Schwarz inequality.

THEOREM A.4.7 (Gronwall Inequality) If $T = [0, b]$, $u \in C(T)$, $k \in L^1(T)_+$, $h \in L^\infty(T)$, and
$$u(t) \le h(t) + \int_0^t k(s) u(s)\, ds \quad \forall\ t \in T,$$
then
$$u(t) \le h(t) + \int_0^t \exp\Big( \int_s^t k(r)\, dr \Big) k(s) h(s)\, ds \quad \forall\ t \in T.$$

REMARK A.4.8 If $h(t) \equiv h_0$ for all $t \in T$, then
$$u(t) \le h_0 \exp\Big( \int_0^t k(s)\, ds \Big) \quad \forall\ t \in T$$
(Bellman’s inequality).

More details on all the subjects covered in this Appendix and additional references can be found in Denkowski, Migórski & Papageorgiou (2003a, 2003b).
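Returning to the Gronwall inequality of Theorem A.4.7, a quick numerical sanity check can make the bound concrete. The sketch below rests on hypothetical sample data that are not from the text: the interval $[0, 2]$, $k \equiv 1$, $h(t) = 1 + t$ and $u(t) = e^t$. For these choices the hypothesis holds, since $e^t \le t + e^t$, and the Gronwall bound works out in closed form to $2e^t - 1$. Integrals are approximated by a composite midpoint rule.

```python
import math


def k(s: float) -> float:
    """Sample nonnegative kernel k = 1 (an assumption for this sketch)."""
    return 1.0


def h(t: float) -> float:
    """Sample bounded function h(t) = 1 + t on [0, 2]."""
    return 1.0 + t


def u(t: float) -> float:
    """Sample continuous function u(t) = exp(t)."""
    return math.exp(t)


def integrate(f, a: float, b: float, n: int = 2000) -> float:
    """Composite midpoint rule on [a, b]; returns 0 for an empty interval."""
    if b <= a:
        return 0.0
    step = (b - a) / n
    return sum(f(a + (i + 0.5) * step) for i in range(n)) * step


def hypothesis_rhs(t: float) -> float:
    """h(t) + integral from 0 to t of k(s) u(s) ds."""
    return h(t) + integrate(lambda s: k(s) * u(s), 0.0, t)


def gronwall_bound(t: float) -> float:
    """h(t) + integral from 0 to t of exp(integral from s to t of k) k(s) h(s) ds."""
    def integrand(s: float) -> float:
        return math.exp(integrate(k, s, t, n=50)) * k(s) * h(s)
    return h(t) + integrate(integrand, 0.0, t)


ts = [0.25 * j for j in range(9)]  # grid on [0, 2]
for t in ts:
    print(t, u(t), hypothesis_rhs(t), gronwall_bound(t))
```

With a constant $h \equiv h_0$ and $k \equiv 1$ the same computation gives $h_0 + h_0(e^t - 1) = h_0 e^t$, which is the Bellman form of Remark A.4.8.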
List of Symbols
Symbol
Page
Meaning
λN
p. 1
Lebesgue measure in RN
µ⌊A
p. 3
the restriction of µ on A, defined by $(\mu \lfloor A)(B) \stackrel{df}{=} \mu(A \cap B)$ for all $B \in 2^X$
µ|A
p. 3
the restriction of µ on $2^A$, defined by $\big(\mu|_A\big)(B) \stackrel{df}{=} \mu(B)$ for all $B \in 2^A \subseteq 2^X$
Σµ
p. 4
the collection of all µ-measurable sets
B(X)
p. 4
the Borel σ-field of X
δ(A), diam (A)
p. 7
the diameter of A
λN
p. 9
the N -dimensional Lebesgue outer measure
R∗
p. 12
R∗ = R ∪ {±∞}
λ∗
p. 12
the Lebesgue outer measure on R
dX (A, B)
p. 22
$d_X(A, B) \stackrel{df}{=} \inf_{a \in A,\, b \in B} d_X(a, b) \ge 0$
Ac
p. 22
$A^c \stackrel{df}{=} X \setminus A$
Tδ (C)
p. 25
the family of all δ-covers of the set C
$\mu_\delta^{(s)}(C)$
p. 25
$\mu_\delta^{(s)}(C) \stackrel{df}{=} \inf_{\{A_n\}_{n \ge 1} \in T_\delta(C)} \sum_{n=1}^\infty \delta(A_n)^s$
$\mu^{(s)}$
p. 25
the Hausdorff s-dimensional outer measure
Sϑ
p. 30
the Cantor-like set
λN
p. 31
Lebesgue N-dimensional outer measure
L(a, b)
p. 32
the line in $\mathbb{R}^N$ passing through b in the direction of a
P (a)
p. 32
the plane in $\mathbb{R}^N$ passing through the origin, perpendicular to a
S(a, A)
p. 33
Steiner symmetrization
Γ
p. 39
the gamma Euler function
Lip(f )
p. 52
Lipschitz constant of a Lipschitz continuous function f : RN 3 C −→ RM
Gr (f |A )
p. 52
the graph of f : RN −→ RM over A
L(X; Y )
p. 55
the vector space of all bounded linear operators from X into Y
Cc∞ (Z)
p. 57
the space of all C ∞ (Z)-functions with compact supports
jac L
p. 64
the Jacobian of a linear operator L
Jf
p. 64
the Jacobian of a Lipschitz function f
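$W^{1,p}(U)$
p. 81
the Sobolev space $W^{1,p}(U) \stackrel{df}{=} \big\{ f \in L^p(U) : Df \in L^p\big(U; \mathbb{R}^N\big) \big\}$
$W^{1,p}_{loc}(U)$
p. 81
the Sobolev space $W^{1,p}_{loc}(U) \stackrel{df}{=} \big\{ f : U \longrightarrow \mathbb{R} : f|_V \in W^{1,p}(V) \text{ for all } V \subset\subset U \big\}$
V ⊂⊂ U
p. 81
V is a bounded open subset of U such that $\overline{V} \subseteq U$
p∗
p. 82
Sobolev critical exponent
$K^p$
p. 82
$K^p \stackrel{df}{=} \big\{ f \in L^{p^*}(\mathbb{R}^N) : f \ge 0,\ Df \in L^p\big(\mathbb{R}^N; \mathbb{R}^N\big) \big\}$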
capp (C)
p. 82
p-capacity of C
kf k1,p
p. 82
the norm of W 1,p (U )
kf k1,∞
p. 82
the norm of W 1,∞ (U )
X∗
p. 108
the topological dual of the Banach space X
h·, ·iX
p. 108
the duality brackets for the pair of spaces (X ∗ , X)
$\|f\|_p$
p. 127
the norm of a strongly measurable function
$W^{1,p}(T; X)$
p. 138
$W^{1,p}(T; X) \stackrel{df}{=} \big\{ f \in L^p(T; X) : Df \in L^p(T; X) \big\}$
$AC^{1,p}(T, X)$
p. 138
the space of absolutely continuous functions, differentiable almost everywhere, with the derivative in $L^p(T; X)$
$W_{pr}(T)$
p. 144
$W_{pr}(T) \stackrel{df}{=} \big\{ u \in L^p(T; X) : u' = Du \in L^r(T; Z) \big\}$
$u_n \stackrel{b}{\longrightarrow} u$
p. 163
biting convergence in $L^1(\Omega; X)$
$w\text{-}\limsup_{n \to +\infty} A_n$
p. 165
the set of all $x \in X$, such that $x = w\text{-}\lim_{k \to +\infty} x_{n_k}$, with $x_{n_k} \in A_{n_k}$ and $n_1 < n_2 < \ldots$
Cc (Z)
p. 171
the space of continuous functions on Z with compact support
C0 (Z)
p. 171
the space of continuous functions on Z which vanish at infinity
Cb (Z)
p. 171
the space of bounded, continuous functions on Z
M (Z)
p. 172
the space of all signed measures m : B(Z) −→ R which have bounded variation
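$\mu_n \stackrel{w}{\longrightarrow} \mu$
p. 173
weak convergence of measures
$\alpha = (\alpha_k)_{k=1}^N$
p. 179
multi-index
|α|
p. 179
the length of the multi-index α
$C_c^\infty(Z)$
p. 179
the space of $C^\infty(Z)$-functions with compact supports
D(Z)
p. 179
the space of test functions
D(Z)∗
p. 179
the space of distributions
$\|u\|_{W^{m,p}(Z)}$
p. 181
the norm of $W^{m,p}(Z)$
$W_0^{m,p}(Z)$
p. 181
$W_0^{m,p}(Z) \stackrel{df}{=} \overline{D(Z)}^{\,\|\cdot\|_{W^{m,p}(Z)}}$
$H^m(Z)$
p. 182
$H^m(Z) \stackrel{df}{=} W^{m,2}(Z)$
$H_0^m(Z)$
p. 182
$H_0^m(Z) \stackrel{df}{=} W_0^{m,2}(Z)$
ϕε
p. 183
mollifier
uε
p. 183
mollification or regularization of u
$V^q(Z, \mathrm{div})$
p. 210
$V^q(Z, \mathrm{div}) \stackrel{df}{=} \big\{ v \in L^q\big(Z; \mathbb{R}^N\big) : \mathrm{div}\, v \in L^q(Z) \big\}$
$\|v\|_{V^q(Z, \mathrm{div})}$
p. 210
$\|v\|_{V^q(Z, \mathrm{div})} \stackrel{df}{=} \Big[ \|v\|^q_{L^q(Z; \mathbb{R}^N)} + \|\mathrm{div}\, v\|^q_{L^q(Z)} \Big]^{1/q}$
$\Delta_p u$
p. 211
the p-Laplacian, $\Delta_p u \stackrel{df}{=} \mathrm{div}\, \big( \|Du\|^{p-2}_{\mathbb{R}^N} Du \big)$
$C_b^k(Z)$
p. 222
the space of all functions $u \in C^k(Z)$, such that $D^\alpha u$ is bounded on Z for all multi-indices $\alpha \in \mathbb{N}^N$ with $|\alpha| \le k$
$C_0\big(\overline{Z}\big)$
p. 227
$C_0\big(\overline{Z}\big) \stackrel{df}{=} \big\{ u \in C\big(\overline{Z}\big) : u|_{\partial Z} = 0 \big\}$
$D^{1,p}(\mathbb{R}^N)$
p. 231
$D^{1,p}(\mathbb{R}^N) \stackrel{df}{=} \big\{ u \in L^{p^*}(\mathbb{R}^N) : Du \in L^p\big(\mathbb{R}^N; \mathbb{R}^N\big) \big\}$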
$W^{1,p}_{per}\big(T; \mathbb{R}^N\big)$
p. 237
$W^{1,p}_{per}\big(T; \mathbb{R}^N\big) \stackrel{df}{=} \big\{ u \in W^{1,p}\big(T; \mathbb{R}^N\big) : u(0) = u(b) \big\}$
K(D; Y )
p. 266
the set of compact maps f : D −→ Y
Lc (X; Y )
p. 266
Lc (X; Y ) = K(X; Y ) ∩ L(X; Y )
idX
p. 273
identity operator on X
Lf (X; Y )
p. 276
the space of all finite dimensional operators from X into Y equipped with the norm inherited from L(X; Y )
rank L
p. 276
rank L = dim L(X)
%(L)
p. 278
the resolvent set of L ∈ L(X)
R
p. 278
the resolvent operator of L ∈ L(X)
σ(L)
p. 278
the spectrum of L ∈ L(X)
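σp (L)
p. 278
the point spectrum of L ∈ L(X)
$A^\perp$
p. 285
$A^\perp \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, a \rangle_X = 0 \text{ for all } a \in A \big\}$
$^\perp C$
p. 285
$^\perp C \stackrel{df}{=} \big\{ x \in X : \langle c^*, x \rangle_X = 0 \text{ for all } c^* \in C \big\}$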
E(λ)
p. 295
the eigenspace corresponding to the eigenvalue λ
Φ(X; Y )
p. 298
the class of Fredholm operators L ∈ L(X; Y )
α(L)
p. 298
the kernel index of an operator L
β(L)
p. 298
the deficiency index of an operator L
ind L
p. 298
the index of an operator L
Φ+ (X; Y )
p. 298
the class of all semi-Fredholm operators L ∈ L(X; Y )
Φ(X)
p. 298
the class of Fredholm operators L ∈ L(X)
Φ+ (X)
p. 298
the class of all semi-Fredholm operators L ∈ L(X)
γ(L)
p. 299
the minimum modulus of an operator L
D(A)
p. 303
the domain of an operator A
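Gr A
p. 303
the graph of an operator A
$F^-(C)$
p. 307
$F^-(C) \stackrel{df}{=} \big\{ y \in Y : F(y) \cap C \ne \emptyset \big\}$, where $F : Y \longrightarrow 2^Z \setminus \{\emptyset\}$ is a multifunction
$F^+(U)$
p. 307
$F^+(U) \stackrel{df}{=} \big\{ y \in Y : F(y) \subseteq U \big\}$, where $F : Y \longrightarrow 2^Z \setminus \{\emptyset\}$ is a multifunction
Zw
p. 308
the space Z furnished with the weak topology
$\overline{\mathbb{R}}$
p. 311
$\overline{\mathbb{R}} \stackrel{df}{=} \mathbb{R} \cup \{+\infty\}$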
∂ϕ
p. 311
the subdifferential of ϕ
F
p. 311
the duality map of X, defined by $\mathcal{F}(x) \stackrel{df}{=} \big\{ x^* \in X^* : \langle x^*, x \rangle_X = \|x\|_X^2 = \|x^*\|_{X^*}^2 \big\}$
dB (f, U, 0)
p. 321
the Brouwer degree of f on U with respect to 0
Jλ
p. 322
the resolvent operator of A
Aλ
p. 322
the Yosida approximation of A
$\langle \cdot, \cdot \rangle_{pp'}$
p. 341
the duality brackets for $\big( L^{p'}(T; X^*), L^p(T; X) \big)$
Cb (R)
p. 369
the space of bounded continuous functions f : R −→ R equipped with the supremum norm
N BV (R)
p. 375
the space of all normalized functions of bounded variation with the total variation norm $\|\vartheta\|_{TV(\mathbb{R})} = (\mathrm{Var}\, \vartheta)(\mathbb{R}) = \int_{\mathbb{R}} |d\vartheta(t)|$
%(A)
p. 378
the resolvent set of an operator A : X ⊇ D(A) −→ X
Rλ
p. 378
the resolvent operator
$S_G^p$
p. 423
the set of $L^p$-selections of the multifunction $G : \Omega \longrightarrow 2^Y \setminus \{\emptyset\}$, i.e., $S_G^p \stackrel{df}{=} \big\{ g \in L^p(\Omega; Y) : g(z) \in G(z) \text{ for µ-a.a. } z \in \Omega \big\}$
1 M+ (E)
p. 427
the probability measures on E
1 M+ (E)
p. 427
the subprobability measures on E
b R(Ω, E)
p. 427
the space of transition probabilities on Ω
c SR(Ω, E)
p. 427
the space of transition subprobabilities on Ω
1 M+ (E)n
p. 427
1 the space M+ (E) equipped with the topology of narrow convergence
Y(Ω, E, µ)
p. 429
the space of Young measures with respect to µ
SY(Ω, E, µ)
p. 429
the space of Young submeasures with respect to µ
δu
p. 436
Dirac transition probability associated to u
N (Ω, Σ, E)
p. 438
the set of all normal integrands
N+ (Ω, Σ, E)
p. 438
the set of positive normal integrands
b b (Ω, Σ, E) K
p. 438
the set of all Cb-Carathéodory integrands
b 0 (Ω, Σ, E) K
p. 438
the set of all C0-Carathéodory integrands
Bar(u)
p. 454
the barycenter of a function u
$f'_G(x)$
p. 468
the Gâteaux derivative of f at x
$f'_F(x)$
p. 470
the Fréchet derivative of f at x
$f'_+(t)$
p. 476
right derivative of f at t
$f'_-(t)$
p. 476
left derivative of f at t
dom ϕ
p. 488
the effective domain of ϕ
epi ϕ
p. 488
the epigraph of ϕ
Lϕ λ
p. 488
the sublevel set of ϕ
Γ0 (X)
p. 488
the cone of proper, convex and lower semicontinuous functions
ϕ∗
p. 512
the Legendre-Fenchel transform (or the conjugate) of ϕ
ϕ∗∗ (x)
p. 512
the second conjugate (or biconjugate) of ϕ
σC
p. 514
the support function of the set C
ϕ⊕ψ
p. 514
the infimal convolution of functions ϕ, ψ
sepi ϕ
p. 514
the strict epigraph of ϕ
$K^*$
p. 517
the dual cone to the cone K
$\varphi'(x_0; h)$
p. 522
the directional derivative of ϕ at $x_0$ in the direction $h \in X$
∂ϕ(x)
p. 523
the subdifferential of ϕ at x
∂ε ϕ(x)
p. 540
the ε-subdifferential of ϕ at x
$\varphi'_\varepsilon(x; h)$
p. 541
$\varphi'_\varepsilon(x; h) \stackrel{df}{=} \inf_{\lambda > 0} \dfrac{\varphi(x + \lambda h) - \varphi(x) + \varepsilon}{\lambda}$
$\varphi^0(x; h)$
p. 544
the generalized directional derivative of ϕ at x in the direction h
∂ϕ(x)
p. 546
the generalized (or Clarke) subdifferential of ϕ at x
B
p. 554
bornology
BG
p. 554
Gâteaux bornology
BF
p. 554
Fréchet bornology
BH
p. 554
Hadamard bornology
ϕ0B (x)
p. 554
B-derivative of ϕ (derivative with respect to bornology B)
∂B ϕ(x)
p. 555
B-subdifferential of ϕ (subdifferential with respect to bornology B)
∂ B ϕ(x)
p. 555
the set of B-superderivatives of ϕ at x (with respect to bornology B)
∂p ϕ(x)
p. 557
the proximal subdifferential of ϕ at x
− ∂B ϕ(x)
p. 557
the canonical B-subdifferential of ϕ at x
w(V, W )
p. 568
the weak topology; the smallest topology on V compatible with the dual pair (V,W)
m(V, W )
p. 568
Mackey topology; the largest topology on V compatible with the dual pair (V,W)
D(x, C)
p. 584
the drop associated with the pair (x, C)
$T_C(x_0)$
p. 599
the tangent cone to C at $x_0$
$\oint_\Omega \varphi_\omega \, d\mu$
p. 603
inf-convolution integral of ϕ with respect to µ
$\varphi^a$
p. 608
the sublevel set of ϕ at the level a
$K^\varphi$
p. 625
the set of critical points of ϕ
$K_c^\varphi$
p. 625
the set of critical points of ϕ with energy level c
$B_r^Y$
p. 686
$B_r^Y \stackrel{df}{=} \big\{ x \in Y : \|x\|_X \le r \big\}$, where Y is a linear subspace of X
catY (A)
p. 690
the Lusternik-Schnirelman category
$A_k$
p. 694
$A_k \stackrel{df}{=} \big\{ A \subseteq X : A \text{ is compact and } \mathrm{cat}_X(A) \ge k \big\}$
$T_{x_0}(C)$
p. 696
the tangent space to C at $x_0$
$L_{n-1}$
p. 710
the set of all (n − 1)-dimensional subspaces Y of H
$(\cdot, \cdot)_2$
p. 721
the inner product in $L^2(\Omega)$
$\dfrac{\partial x}{\partial n_p}$
p. 754
$\dfrac{\partial x}{\partial n_p} \stackrel{df}{=} \|Dx\|^{p-2}_{\mathbb{R}^N} (Dx, n)_{\mathbb{R}^N}$, with n being the outward unit normal on ∂Ω
%(p, N )
p. 769
the resolvent set of the negative vector ordinary p-Laplacian with periodic boundary conditions
σ(p, N )
p. 769
the set of eigenvalues of the negative vector ordinary p-Laplacian with periodic boundary conditions
C(p, N )
p. 774
the set of functions $x \in W^{1,p}_{per}\big((0, b); \mathbb{R}^N\big)$ such that $\int_\Omega \|x(t)\|^{p-2}_{\mathbb{R}^N} x(t)\, dt = 0$
C1 (p, N )
p. 774
$C_1(p, N) \stackrel{df}{=} \big\{ x \in C(p, N) : \|x\|_p = 1 \big\}$
$\vartheta_{p,N}$
p. 774
$\vartheta_{p,N}(x) \stackrel{df}{=} \|x'\|_p^p$ for all $x \in W^{1,p}_{per}\big((0, b); \mathbb{R}^N\big)$
rx (D)
p. 817
the radius of D relative to x
rE (D)
p. 817
the Chebyshev radius of D relative to E
CE (D)
p. 817
the Chebyshev center of D relative to E
α(C)
p. 828
the Kuratowski measure of noncompactness of the set C
β(C)
p. 828
the ball (Hausdorff) measure of noncompactness of the set C
(X, 6)
p. 833
partially ordered set
S+ (x)
p. 834
the right section of x in a partially ordered set (X, 6)
S− (x)
p. 834
the left section of x in a partially ordered set (X, 6)
[x, y]
p. 834
the order interval in a partially ordered set (X, 6)
i(ϕ, U, K)
p. 858
fixed point index of ϕ over U with respect to K
$P_{fc}(X)$
p. 877
the class of nonempty, closed and convex subsets of X
$P_{bf}(X)$
p. 877
the class of nonempty, bounded and closed subsets of X
$P_{bf(c)}(X)$
p. 877
the class of nonempty, bounded, closed (and convex) subsets of X
$P_{(w)kc}(X)$
p. 877
the class of nonempty, (weakly-)compact and convex subsets of X
N (x)
p. 877
the filter of neighbourhoods of x
Fix (F )
p. 879
the set of fixed points of F : X −→ 2X
IC (x)
p. 886
the inward set of x ∈ C with respect to C
(X∞ , τ∞ )
p. 896
Alexandrov one-point compactification of X
{Ui }i ≺ {Vj }j
p. 896
{Ui }i∈I is a refinement of {Vj }j∈J
$V_a^b(f)$
p. 902
the total variation of f : [a, b] −→ R
$BV\big([a, b]\big)$
p. 902
the space of functions of bounded variation on [a, b]
BVloc (T )
p. 902
the space of functions f : T −→ R which are locally of bounded variation
$AC\big([a, b]\big)$
p. 902
the space of absolutely continuous functions from [a, b] into R
ACloc (T )
p. 902
the space of locally absolutely continuous function f : T −→ R
ν ≺≺ µ
p. 904
measure ν is absolutely continuous with respect to measure µ
dν dµ
p. 904
the Radon-Nikodym derivative of measure ν with respect to measure µ
References
Abraham, R. & Marsden, J. (1978), Foundations of Mechanics, Benjamin/Cummings Publishing Co., London. Adams, R. (1975), Sobolev Spaces, Vol. 65 of Pure and Applied Mathematics, Academic Press, New York/London. Agmon, S., Douglis, A. & Nirenberg, L. (1959), ‘Estimates near the boundary for solutions of elliptic partial differential equations satisfying general boundary conditions I’, Comm. Pure Appl. Math. 12, 623–727. Akhiezer, N. & Glazman, I. (1961), Theory of Linear Operators in Hilbert Spaces. Volume I, Frederick Ungar Publishing Co., New York. Akhiezer, N. & Glazman, I. (1963), Theory of Linear Operators in Hilbert Spaces. Volume II, Frederick Ungar Publisher Co., New York. Alibert, J. & Bouchitt´e, G. (1997), ‘Non-uniform integrability and generalized Young measures’, J. Convex Anal. 4, 129–147. Allegretto, W. & Huang, Y.-X. (1998), ‘A Picone’s identity for the p-Laplacian and applications’, Nonlinear Anal. 32, 819–830. Allegretto, W. & Huang, Y.-X. (1999), ‘Principal eigenvalues and Sturm comparison via Picone’s identity’, J. Differential Equations 156, 427–438. Alspach, D. (1981), ‘A fixed point free nonexpansive map’, Proc. Amer. Math. Soc. 82, 423–424. Altman, M. (1955), ‘A fixed point theorem for completely continuous operators in Banach spaces’, Bull. Acad. Polon. Sci. S´er. Sci. Math. 3, 409– 413. Amann, H. (1976), ‘Fixed point equations and nonlinear eigenvalue problems in ordered Banach spaces’, SIAM Rev. 18, 620–709. Amann, H. (1977), Order structure and fixed points, in ‘Atti 2 Sem. Anal. Funz. Appl.’, Univ. di Cosenza, Cosenza, Italy, pp. 1–50. Ambrosetti, A. (1992), ‘Critical points and nonlinear variational problems’, M´em. Soc. Math. France (N.S.) 49. Ambrosetti, A. & Rabinowitz, P. (1973), ‘Dual variational methods in critical point theory and applications’, J. Funct. Anal. 14, 349–381.
Anane, A. (1987), ‘Simplicit´e et isolation de la premi`ere valeur propre du plaplacien avec poids’, C. R. Acad. Sci. Paris S´er. I Math. 305, 725–728. Anane, A. (1987–1988), Etude des Valeurs et de Resonance Pour l’Operateur p-Laplacien, Universit´e Libre de Bruxelles, Belgium. Ph. D. Thesis. Anane, A. & Tsouli, N. (1996), On the second eigenvalue of the p-Laplacian, in A. Benkirane & J.-P. Gossez, eds, ‘Nonlinear Partial Differential Equations (F`es, 1994)’, Vol. 343 of Pitman Res. Notes in Math. Ser., Logman, Harlow, pp. 1–9. Appell, J. & Zabrejko, P. (1990), Nonlinear Superposition Operators, Vol. 95 of Cambridge Tracts in Mathematics, Cambridge University Press, Cambridge. Arino, O., Gautier, S. & Penot, J.-P. (1984), ‘A fixed point theory for sequentially continuous mappings and applications to ordinary differential equations’, Funkcial. Ekvac. 27, 273–279. Arzela, C. (1889), ‘Funcioni di linee’, Rom. Acc. L. Rend. (4) 5, 342–348. Ascoli, G. (1883–1884), ‘Le curve limite di une varieta data di curve’, Rom. Acc. L. Mem. (3) 18, 521–586. Ash, R. (1972), Real Analysis and Probability, Academic Press, New York. Asplund, E. (1968), ‘Fr´echet differentiability of convex functions’, Acta Math. 121, 31–47. Attouch, H. (1981), ‘On the maximality of the sum of two maximal monotone operators’, Nonlinear Anal. 5, 143–147. Aubin, J.-P. (1963), ‘Un th´eor`eme de compacit´e’, C. R. Acad. Sci. Paris S´er. I Math. 256, 5042–5044. Aubin, T. (1976), ‘Probl´eme isop´erim´etriques at espaces de Sobolev’, J. Differential Geom. 11, 573–598. Aumann, R. & Shapley, L. (1974), Values of Non-Atomic Games, Princeton University Press, Princeton, NJ. Averbukh, V. & Smolyanov, O. (1967), ‘The theory of differentiation in linear topological spaces’, Russian Math. Surveys 22, 201–258. Averbukh, V. & Smolyanov, O. (1968), ‘The various definitions of the derivatives in linear topological spaces’, Russian Math. Surveys 23, 67–113. Bader, R. 
(2001), ‘A topological fixed-point index theory for evolution inclusions’, Z. Anal. Anwendungen 20, 3–15. Bagby, T. & Ziemer, W. (1974), ‘Pointwise differentiability and absolute continuity’, Trans. Amer. Math. Soc. 191, 129–148.
Balder, E. (1984), ‘A general denseness result for relaxed control theory’, Bull. Austral. Math. Soc. 30, 463–475. Balder, E. (1987), ‘Necessary and sufficient conditions for L1 -strong-weak lower semicontinuity’, Nonlinear Anal. 11, 1399–1404. Balder, E. (1997), ‘Consequences of denseness of Dirac Young measures’, J. Math. Anal. Appl. 207, 536–540. Ball, J. (1977), ‘Strongly continuous semigroups, weak solutions and the variation of constants formula’, Proc. Amer. Math. Soc. 63, 370–373. Ball, J. (1989), A version of the fundamental theorem for Young measures, in ‘PDEs and Continuum Models of Phase Transitions (Nice, 1988)’, Vol. 344 of Lecture Notes in Physics, Springer-Verlag, Berlin, pp. 207–215. Ball, J. & Murat, F. (1989), ‘Remarks on Chacon’s biting lemma’, Proc. Amer. Math. Soc. 107, 655–663. Ball, J. & Zhang, K.-W. (1990), ‘Lower semicontinuity of multiple integrals and the biting lemma’, Proc. Roy. Soc. Edinburgh Sect. A 114, 367–379. Banach, S. (1922), ‘Sur les op´erations dans les ensembles abstraits et leur application aux ´equations int´egrales’, Fund. Math. 3, 133–181. Barbu, V. (1976), Nonlinear Semigroups and Differential Equations in Banach Spaces, Noordhoff International Publishing, Leiden. Barbu, V. (1994), Mathematical Methods in Optimization of Differential Systems, Vol. 310 of Mathematics and Its Applications, Kluwer, Dordrecht. Barbu, V. & Precupanu, T. (1986), Convexity and Optimization in Banach Spaces, Vol. 10 of Mathematics and Its Applications (East European Series), D. Reidel Publishing Co., Dordrecht. Bartolo, P., Benci, V. & Fortunato, D. (1983), ‘Abstract critical point theorems and applications to some nonlinear problems with “strong” resonance at infinity’, Nonlinear Anal. 7, 981–1012. Bates, P. & Ekeland, I. (1980), A saddle point theorem, in ‘Differential Equations (Proc. Eight Fall Conf., Oklahoma State Univ., Stillwater, Okla., 1979)’, Academic Press, New York, pp. 123–126. Beauzamy, B. 
(1982), Introduction to Banach Spaces and Their Geometry, Vol. 86 of Notas de Matem´ atica, North Holland Publishing Co., Amsterdam. Ben Naoum, A., Troestler, C. & Willem, M. (1996), ‘Extremal problems with critical Sobolev exponents on unbounded domains’, Nonlinear Anal. 26, 623–833. Benci, V. & Rabinowitz, P. (1979), ‘Critical point theorems for indefinite functionals’, Invent. Math. 52, 241–273.
B´enilan, P. (1972), ‘Solutions int´egrales d’´equations d’´evolution dans un espace de Banach’, C. R. Acad. Sci. Paris S´er. A-B 274, 47–50. Benyamini, Y. & Lindenstrauss, J. (1997), Geometric Nonlinear Functional Analysis I, AMS, Providence, RI. Berger, M. (1977), Nonlinearity and Functional Analysis, Lectures on Nonlinear Problems in Mathematical Analysis. Pure and Applied Mathematics, Academic Press, New York. Berliocchi, H. & Lasry, J.-M. (1973), ‘Integrandes normales et mesures param´etr`ees en calcul des variations’, Bull. Soc. Math. France 101, 129– 184. Besicovitch, A. (1928), ‘On the fundamental geometrical properties of linearly measurable plane sets of points’, Math. Ann. 98, 422–464. Besicovitch, A. (1945), ‘A general form of the covering principle and relative differentiation of additive functions’, Math. Proc. Cambridge Philos. Soc. 41, 103–110. Besicovitch, A. (1946), ‘A general form of the covering principle and relative differentiation of additive functions II’, Math. Proc. Cambridge Philos. Soc. 42, 1–10. Beurling, A. & Livingston, A. (1962), ‘A theorem on duality mapping in Banach spaces’, Ark. Mat. 4, 405–411. Bhattacharya, T. (1988), ‘Radial symmetry of the first eigenfunction for the p-Laplacian in the ball’, Proc. Amer. Math. Soc. 104, 169–174. Bianchi, G., Chabrowski, J. & Szulkin, A. (1995), ‘On symmetric solutions of an elliptic equation with a nonlinearity involving critical Sobolev exponent’, Nonlinear Anal. 25, 41–59. Bielecki, A. (1956), ‘Une remarque sur l’application de la m´ethode de BanachCaccioppoli-Tikchonov dans la th´eorie de l’´equation s = f (x, y, z, p, q)’, Bull. Acad. Polon. Sci. S´er. Sci. Math. 4, 265–268. Binding, P., Dr´abek, P. & Huang, Y.-X. (1997a), ‘On Neumann boundary value problems for some quasilinear elliptic equations’, Electron. J. Differential Equations 5, 1–11. Binding, P., Dr´abek, P. & Huang, Y.-X. (1997b), ‘On the Fredholm alternative for the p-Laplacian’, Proc. Amer. Math. Soc. 125, 3555–3559. 
Bismut, J. (1973), ‘Int´egrales convexes et probabilit´es’, J. Math. Anal. Appl. 42, 639–673. Blanchard, P. & Br¨ uning, E. (1992), Variational Methods in Mathematical Physics, Springer-Verlag, Berlin.
Bochner, S. (1933), ‘Integration von Funktionen deren Werte die Elemente eines Vectorraumes sind’, Fund. Math. 20, 262–276. Bochner, S. & Taylor, A. (1938), ‘Linear functionals on certain spaces of abstractly-valued functions’, Ann. of Math. (2) 39, 913–944. ¨ Bohl, P. (1904), ‘Uber die Bewegung eines mechanischen System in der N¨ahe einer Gleichgewichtslage’, J. Reine Angew. Math. 127, 179–276. Border, K. (1985), Fixed Point Theorems with Applications to Economics and Game Theory, Cambridge University Press, Cambridge. Borsuk, K. (1931), ‘Sur les retractes’, Fund. Math. 17, 152–170. Borwein, J. & Preiss, D. (1987), ‘A smooth variational principle with applications to subdifferentiability and to differentiability of convex functions’, Trans. Amer. Math. Soc. 303, 517–527. Borwein, J. & Zhu, Q.-J. (1996), ‘Viscosity solutions and viscosity subderivatives in smooth Banach spaces with applications to metric regularity’, SIAM J. Control Optim. 34, 1568–1591. Bourbaki, N. (1940–1949), Topologie G´en´erale, Hermann & Cie., Paris. Bourbaki, N. (1969), Integration, Hermann & Cie., Paris. Bourgain, J. (1979), ‘An averaging result for l1 -sequences and applications to weakly conditionally compact sets in L1X ’, Israel J. Math. 32, 289–298. Boyd, D. & Wong, J.-S. (1969), ‘On nonlinear contractions’, Proc. Amer. Math. Soc. 20, 458–464. Bredon, G. (1971), ‘Some examples for the fixed point property’, Pacific J. Math. 38, 571–575. Bressan, A., Cellina, A. & Fryszkowski, A. (1991), ‘A class of absolute retracts in space of integrable functions’, Proc. Amer. Math. Soc. 112, 413–418. ´ Br´ezis, H. (1968), ‘Equations et in´equations non lin´eaires dans les espaces vectoriels en dualit´e’, Ann. Inst. Fourier (Grenoble) 18, 115–175. Br´ezis, H. (1971), ‘On a problem of T. Kato’, Comm. Pure Appl. Math. 24, 1– 6. Br´ezis, H. (1973), Op´erateurs Maximaux Monotones et Semi-Groupes de Contractions dans les Espaces de Hilbert, Vol. 
5 of North-Holland Mathematics Studies, North Holland Publishing Co., Amsterdam. Br´ezis, H. (1974), ‘New results concerning monotone operators and nonlinear semigroups’, S¯ urikaisekikenky¯ usho K¯ oky¯ uroku 258, 2–27. Br´ezis, H. (1983), Analyse Fonctionnelle. Th´eorie et Applications, Masson, Paris.
Brézis, H. & Browder, F. (1976), ‘A general principle on ordered sets in nonlinear functional analysis’, Advances in Math. 21, 355–364.
Brézis, H. & Browder, F. (1982), ‘Some properties of higher order Sobolev spaces’, J. Math. Pures Appl. (9) 61, 245–259.
Brézis, H. & Lieb, E. (1983), ‘A relation between pointwise convergence of functions and convergence of functionals’, Proc. Amer. Math. Soc. 88, 486–490.
Brézis, H. & Nirenberg, L. (1991), ‘Remarks on finding critical points’, Comm. Pure Appl. Math. 44, 939–963.
Brézis, H. & Pazy, A. (1970), ‘Accretive sets and differential equations in Banach spaces’, Israel J. Math. 8, 367–383.
Brodskii, M. & Milman, D. (1948), ‘On the center of a convex set’, Dokl. Akad. Nauk SSSR 59, 837–840.
Brondsted, A. (1964), ‘Conjugate convex functions in topological vector spaces’, Mat.-Fys. Medd. Danske Vid. Selsk. 34, 1–26.
Brondsted, A. (1974), ‘On a lemma of Bishop and Phelps’, Pacific J. Math. 55, 335–341.
Brondsted, A. & Rockafellar, R. (1965), ‘On the subdifferentiability of convex functions’, Proc. Amer. Math. Soc. 16, 605–611.
Brooks, J. & Chacon, R. (1980), ‘Continuity and compactness of measures’, Adv. in Math. 37, 16–26.
Brouwer, L. (1909), ‘On continuous one-to-one transformations of surfaces into themselves’, Proc. Konink. Nederl. Akad. Wetensch. 11, 788–798.
Browder, F. (1965a), ‘Infinite dimensional manifolds and non-linear elliptic eigenvalue problems’, Ann. of Math. (2) 82, 459–477.
Browder, F. (1965b), ‘Nonexpansive nonlinear operators in a Banach space’, Proc. Nat. Acad. Sci. U.S.A. 54, 1041–1044.
Browder, F. (1967), ‘A new generalization of the Schauder fixed point theorem’, Math. Ann. 174, 285–290.
Browder, F. (1968), ‘Nonlinear maximal monotone operators in Banach spaces’, Math. Ann. 175, 89–113.
Browder, F. (1971a), ‘Normal solvability and Fredholm alternative for mappings in infinite dimensional manifolds’, J. Funct. Anal. 8, 250–274.
Browder, F. (1971b), ‘Normal solvability for nonlinear mappings into Banach spaces’, Bull. Amer. Math. Soc. 77, 73–77.
Browder, F. (1976), Nonlinear Operators and Nonlinear Equations of Evolution in Banach Spaces, AMS, Providence, RI.
Browder, F. & Hess, P. (1972), ‘Nonlinear mappings of monotone type in Banach spaces’, J. Funct. Anal. 11, 251–294.
Brown, R. (1993), A Topological Introduction to Nonlinear Analysis, Birkhäuser Verlag, Boston, MA.
Buttazzo, G. (1989), Semicontinuity, Relaxation and Integral Representation in the Calculus of Variations, Vol. 207 of Pitman Res. Notes in Math. Ser., Longman Scientific & Technical, Harlow.
Butzer, P. & Berens, H. (1967), Semi-Groups of Operators and Approximation, Vol. 145 of Die Grundlehren der mathematischen Wissenschaften, Springer-Verlag, New York.
Caccioppoli, R. (1953), ‘Misura e integrazione sugli insiemi dimensionalmente orientati’, Atti Accad. Naz. Lincei Rend. Cl. Sci. Fis. Mat. Natur. (8) 12, 3–11 and 137–146.
Čaklović, L., Li, S. & Willem, M. (1990), ‘A note on Palais-Smale condition and coercivity’, Differential Integral Equations 3, 799–800.
Carathéodory, C. (1914), ‘Über das lineare Mass von Punktmengen eine Verallgemeinerung des Längenbegriffs’, Gött. Nachr. pp. 404–426.
Caristi, J. (1976), ‘Fixed point theorems for mapping satisfying inwardness conditions’, Trans. Amer. Math. Soc. 215, 241–251.
Caristi, J. & Kirk, W. (1975), Geometric fixed point theory and inwardness conditions, in ‘The Geometry of Metric and Linear Spaces (Proc. Conf. on Geometry of Metric and Linear Spaces, Michigan, 1974)’, Vol. 490 of Lecture Notes in Mathematics, Springer-Verlag, Berlin, pp. 75–83.
Cartan, H. (1967), Calcul Différentiel, Hermann & Cie., Paris.
Casas, E. & Fernández, L. (1989), ‘A Green’s formula for quasilinear elliptic operators’, J. Math. Anal. Appl. 142, 62–73.
Castaing, C. & Valadier, M. (1977), Convex Analysis and Measurable Multifunctions, Vol. 580 of Lecture Notes in Mathematics, Springer-Verlag, Berlin.
Cerami, G. (1978), ‘Un criterio di esistenza per i punti critici su varieta’ illimitate’, Ist. Lombardo Accad. Sci. Lett. Rend. A 112, 332–336.
Cesari, L. (1936), ‘Sulle funzioni a variazione limitata’, Ann. Scuola Norm. Sup. Pisa Cl. Sci. 5, 299–313.
Cesari, L. (1983), Optimization – Theory and Applications, Vol. 17 of Applications of Mathematics, Springer-Verlag, New York.
Chang, K.-C. (1981), ‘Solutions of asymptotically linear operator equations via Morse theory’, Comm. Pure Appl. Math. 34, 693–712.
Chang, K.-C. (1993), Infinite-Dimensional Morse Theory and Multiple Solution Problems, Vol. 6 of Progress in Nonlinear Differential Equations and Their Applications, Birkhäuser Verlag, Boston, MA.
Cheng, Y. (1998), ‘Hölder continuity of the inverse of p-Laplacian’, J. Math. Anal. Appl. 221, 734–748.
Choquet, G. (1955), ‘Theory of capacities’, Ann. Inst. Fourier (Grenoble) 5, 131–295.
Christensen, J. (1972), ‘On sets of Haar measure zero in abelian Polish groups’, Israel J. Math. 13, 255–260. Proceedings of the International Symposium on Partial Differential Equations and the Geometry of Normed Linear Spaces (Jerusalem, 1972).
Christensen, J. (1974), Topology and Borel Structure, Vol. 10 of North-Holland Mathematics Studies, North-Holland Publishing Co., Amsterdam.
Cioranescu, I. (1990), Geometry of Banach Spaces, Duality Mappings and Nonlinear Problems, Vol. 62 of Mathematics and Its Applications, Kluwer, Dordrecht.
Clark, D. (1972), ‘A variant of the Lusternik-Schnirelmann theory’, Indiana Univ. Math. J. 22, 65–74.
Clarke, F. (1975), ‘Generalized gradients and applications’, Trans. Amer. Math. Soc. 205, 247–262.
Clarke, F. (1981), ‘Generalized gradients of Lipschitz functionals’, Adv. Math. 40, 52–67.
Clarke, F. (1983), Optimization and Nonsmooth Analysis, Wiley, New York.
Clarke, F. (1989), Methods of Dynamics and Nonsmooth Optimization, Vol. 57 of Regional Conference Series in Applied Mathematics, SIAM, Philadelphia, PA.
Clarke, F., Ledyaev, Y., Stern, R. & Wolenski, P. (1998), Nonsmooth Analysis and Control Problems, Vol. 178 of Graduate Texts in Mathematics, Springer-Verlag, New York.
Clément, P. & Peletier, L. (1979), ‘An anti-maximum principle for second-order elliptic operators’, J. Differential Equations 34, 218–229.
Coffman, C. (1969), ‘A minimum-maximum principle for a class of non-linear integral equations’, J. Analyse Math. 22, 391–419.
Cohn, D. (1980), Measure Theory, Birkhäuser Verlag, Boston, MA.
Costa, D. & Silva, E. (1991), ‘The Palais-Smale condition versus coercivity’, Nonlinear Anal. 16, 371–381.
Courant, R. & Hilbert, D. (1953), Methods of Mathematical Physics I, Interscience Publisher, New York.
Courant, R. & Hilbert, D. (1989), Methods of Mathematical Physics II, Wiley, New York.
Covitz, H. & Nadler, S. (1970), ‘Multi-valued contraction mappings in generalized metric spaces’, Israel J. Math. 8, 5–11.
Crandall, M. & Liggett, T. (1971), ‘Generations of semi-groups of nonlinear transformations on general Banach spaces’, Amer. J. Math. 93, 265–298.
Crandall, M. & Pazy, A. (1969), ‘Semi-groups of nonlinear contractions and dissipative sets’, J. Funct. Anal. 3, 376–418.
Crandall, M. & Pazy, A. (1970), ‘On accretive sets in Banach spaces’, J. Funct. Anal. 5, 204–217.
Dal Maso, G. (1985), ‘Some necessary and sufficient conditions for the convergence of sequences of unilateral convex sets’, J. Funct. Anal. 62, 119–159.
Damascelli, L. (1998), ‘Comparison theorems for some quasilinear degenerate elliptic operators and applications to symmetry and monotonicity results’, Ann. Inst. H. Poincaré Anal. Non Linéaire 15, 493–516.
Daneš, J. (1972), ‘A geometric theorem useful in nonlinear functional analysis’, Boll. Un. Mat. Ital. 6, 369–375.
Darbo, G. (1955), ‘Punti uniti in trasformazioni a codominio non compatto’, Rend. Sem. Mat. Univ. Padova 24, 84–92.
Davies, R. (1970), ‘Increasing sequences of sets and Hausdorff measure’, Proc. London Math. Soc. 20, 222–236.
Davies, R. & Samuels, P. (1974), ‘Density theorems for measures of Hausdorff type’, Bull. London Math. Soc. 6, 31–36.
Day, M. (1955), ‘Strict convexity and smoothness of normed spaces’, Trans. Amer. Math. Soc. 78, 516–528.
Day, M. (1973), Normed Linear Spaces, Vol. 21 of Ergebnisse der Mathematik und ihrer Grenzgebiete, Springer-Verlag, Berlin.
de Figueiredo, D. (1982), Positive solutions of semilinear elliptic problems, in ‘Differential Equations (São Paulo, 1981)’, Vol. 957 of Lecture Notes in Mathematics, Springer-Verlag, New York, pp. 34–85.
de Figueiredo, D. & Karlovitz, L. (1967), ‘On the radial projection in normed spaces’, Bull. Amer. Math. Soc. 73, 364–368.
De Giorgi, E. (1954), ‘Su una teoria generale della misura (r − 1)-dimensionale in uno spazio ad r dimensioni’, Ann. Mat. Pura Appl. (4) 36, 191–213.
De Giorgi, E. (1955), ‘Nuovi teoremi relativi alle misure (r − 1)-dimensionali in uno spazio ad r dimensioni’, Ricerche Mat. 4, 95–113.
De Giorgi, E. (1957), ‘Sulla differenziabilità e l’analiticità delle estremali degli integrali multipli regolari’, Mem. Accad. Sci. Torino Cl. Sci. Fis. Mat. Natur. 3, 25–43.
De Giorgi, E. (1968–1969), Teoremi di semicontinuità nel calcolo delle variazioni, Notes, Istituto Nazionale di Alta Matematica, Roma.
de Guzman, M. (1975), Differentiation of Integrals in $\mathbb{R}^N$, Vol. 481 of Lecture Notes in Mathematics, Springer-Verlag, Berlin.
de Rham, G. (1955), Variétés Différentiables, Hermann & Cie., Paris.
de Thélin, F. (1986), ‘Sur l’espace propre associé à la première valeur propre du pseudo-laplacien’, C. R. Acad. Sci. Paris Sér. I Math. 303, 355–358.
Deimling, K. (1985), Nonlinear Functional Analysis, Springer-Verlag, New York.
del Pino, M., Drábek, P. & Manásevich, R. (1999), ‘The Fredholm alternative at the first eigenvalue for the one-dimensional p-Laplacian’, J. Differential Equations 151, 386–419.
del Pino, M., Elgueta, M. & Manásevich, R. (1989), ‘A homotopic deformation along p of a Leray-Schauder degree result and existence for $(|u'|^{p-2}u')' + f(t,u) = 0$, $u(0) = u(I) = 0$, $1 < p$’, J. Differential Equations 80, 1–13.
Denkowski, Z., Migórski, S. & Papageorgiou, N. (2003a), An Introduction to Nonlinear Analysis: Theory, Kluwer/Plenum, New York.
Denkowski, Z., Migórski, S. & Papageorgiou, N. (2003b), An Introduction to Nonlinear Analysis: Applications, Kluwer/Plenum, New York.
Deny, J. & Lions, J. (1953–1954), ‘Les espaces du type de Beppo Levi’, Ann. Inst. Fourier (Grenoble) 5, 305–370.
Deville, R., Godefroy, G., Hare, D. & Zizler, V. (1987), ‘Differentiability of convex functionals and the convex point of continuity property in Banach spaces’, Israel J. Math. 59, 245–255.
Deville, R., Godefroy, G. & Zizler, V. (1993), Smoothness and Renorming in Banach Spaces, Vol. 64 of Pitman Monographs and Surveys in Pure and Applied Mathematics, Longman Scientific & Technical, Harlow.
Di Benedetto, E. (1983), ‘$C^{1+\alpha}$ local regularity of weak solutions of degenerate elliptic equations’, Nonlinear Anal. 7, 827–850.
Di Benedetto, E. (1995), Partial Differential Equations, Birkhäuser Verlag, Boston, MA.
Di Perna, R. (1985), ‘Measure-valued solutions to conservation laws’, Arch. Rational Mech. Anal. 88, 223–270.
Di Perna, R. & Majda, A. (1987), ‘Oscillations and concentrations in weak solutions of the incompressible fluid equations’, Comm. Math. Phys. 108, 667–689.
Diestel, J. (1984), Sequences and Series in Banach Spaces, Vol. 92 of Graduate Texts in Mathematics, Springer-Verlag, New York.
Diestel, J. & Uhl, J. (1977), Vector Measures, Vol. 15 of Mathematical Surveys and Monographs, AMS, Providence, RI.
Dieudonné, J. (1969), Foundations of Modern Analysis, Vol. 10-I of Pure and Applied Mathematics, Academic Press, New York.
Dinculeanu, N. & Foias, C. (1961), ‘Sur la représentation intégrale des certaines opérations linéaires IV’, Canad. J. Math. 13, 529–556.
Dontchev, A. & Zolezzi, T. (1993), Well-Posed Optimization Problems, Vol. 1543 of Lecture Notes in Mathematics, Springer-Verlag, Berlin.
Drábek, P. & Holubová, G. (2001), ‘Fredholm alternative for the p-Laplacian in higher dimensions’, J. Math. Anal. Appl. 263, 182–194.
Drábek, P. & Manásevich, R. (1999), ‘On the closed solution to some nonhomogeneous eigenvalue problems with p-Laplacian’, Differential Integral Equations 12, 773–788.
Drábek, P. & Robinson, S. (2002), ‘On the generalization of the Courant nodal domain theorem’, J. Differential Equations 181, 58–71.
Du, Y. (1991), ‘A deformation lemma and some critical point theorems’, Bull. Austral. Math. Soc. 43, 161–168.
Dubovitskii, A. & Miljutin, A. (1968), ‘Necessary conditions for a weak extremum in optimal control problems with mixed constraints of the inequality type’, U.S.S.R. Comput. Math. and Math. Phys. 8, 24–98.
Dudley, R. (1989), Real Analysis and Probability, Wadsworth & Brooks/Cole, Pacific Grove, CA.
Dugundji, J. & Granas, A. (1982), Fixed Point Theory. I, Państwowe Wydawnictwo Naukowe, Warszawa.
Dunford, N. (1935), ‘Integration in general analysis’, Trans. Amer. Math. Soc. 37, 441–453.
Dunford, N. & Pettis, B. (1940), ‘Linear operators on summable functions’, Trans. Amer. Math. Soc. 47, 323–392.
Dunford, N. & Schwartz, J. (1958), Linear Operators. I. General Theory, Vol. 7 of Pure and Applied Mathematics, Wiley, New York.
Edelstein, M. (1962), ‘On fixed and periodic points under contractive mappings’, J. London Math. Soc. 37, 74–79.
Edgar, G. (1977), ‘Measurability in a Banach space’, Indiana Univ. Math. J. 26, 663–677.
Ekeland, I. (1972), ‘Sur le contrôle optimal de systèmes gouvernés par des équations elliptiques’, J. Funct. Anal. 9, 1–62.
Ekeland, I. (1974), ‘On the variational principle’, J. Math. Anal. Appl. 47, 324–353.
Ekeland, I. (1979), ‘Nonconvex minimization problems’, Bull. Amer. Math. Soc. (N.S.) 1, 443–474.
Ekeland, I. (1989), The ε-variational principle revisited, in A. Cellina, ed., ‘Methods of Nonconvex Analysis (Varenna 1989)’, Vol. 1446 of Lecture Notes in Mathematics, Springer-Verlag, Berlin, pp. 1–15.
Ekeland, I. & Temam, R. (1976), Convex Analysis and Variational Problems, Vol. 1 of Studies in Mathematics and Its Applications, North-Holland Publishing Co., Amsterdam-Oxford.
Enflo, P. (1973), ‘A counterexample to the approximation problem in Banach spaces’, Acta Math. 130, 309–317.
Evans, L. (1990), Weak Convergence Methods for Nonlinear Partial Differential Equations, Vol. 74 of CBMS Regional Conference Series in Mathematics, AMS, Providence, RI.
Evans, L. (1998), Partial Differential Equations, Vol. 9 of Graduate Studies in Math, AMS, Providence, RI.
Evans, L. & Gariepy, R. (1992), Measure Theory and Fine Properties of Functions, CRC Press, Boca Raton, FL.
Fabian, M. (1997), Gâteaux Differentiability of Convex Functions and Topology, Wiley, New York.
Falconer, K. (1985), The Geometry of Fractal Sets, Vol. 85 of Cambridge Tracts in Mathematics, Cambridge University Press, Cambridge.
Fan, K. (1952), ‘Fixed-point and minimax theorems in locally convex topological linear spaces’, Proc. Nat. Acad. Sci. U.S.A. 38, 121–126.
Fan, K. (1960–1961), ‘A generalization of Tychonov’s fixed point theorem’, Math. Ann. 142, 305–310.
Fang, G. & Ghoussoub, N. (1992), ‘Second order information on Palais-Smale sequences in the mountain pass theorem’, Manuscripta Math. 75, 81–95.
Fattorini, H. (1999), Infinite-Dimensional Optimization and Control Theory, Vol. 62 of Encyclopedia of Mathematics and Its Applications, Cambridge University Press, Cambridge.
Federer, H. (1958), ‘A note on the Gauss-Green theorem’, Proc. Amer. Math. Soc. 9, 447–451.
Federer, H. (1969), Geometric Measure Theory, Vol. 153 of Die Grundlehren der Mathematischen Wissenschaften, Springer-Verlag, New York.
Federer, H. & Ziemer, W. (1972), ‘The Lebesgue set of a function whose distribution derivatives are p-th power summable’, Indiana Univ. Math. J. 22, 139–158.
Feller, W. (1953), ‘On the generation of unbounded semi-groups of bounded linear operators’, Ann. of Math. 58, 166–174.
Fenchel, W. (1951), Convex Cones, Sets and Functions, Princeton University Press, Princeton, NJ.
Fleckinger, J., Gossez, J.-P., Takač, P. & de Thélin, F. (1995), ‘Existence, nonexistence et principe de l’antimaximum pour le p-Laplacien’, C. R. Acad. Sci. Paris Sér. I Math. 321, 731–734.
Fleming, W. (1960), ‘Functions whose partial derivatives are measures’, Illinois J. Math. 4, 452–478.
Fowler, P. (1973), ‘Capacity theory in Banach spaces’, Pacific J. Math. 48, 365–385.
Fréchet, M. (1920), ‘La notion de différentielle dans l’analyse générale’, C. R. Acad. Sci. Paris Sér. A-B 180, 806–809.
Fredholm, I. (1903), ‘Sur une classe d’équations fonctionnelles’, Acta Math. 27, 365–390.
Frehse, J. (1984), ‘A refinement of Rellich’s theorem’, Rend. Mat. (7) 5, 229–242.
Frigon, M. & Granas, A. (1994), ‘Résultats du type de Leray-Schauder pour des contractions multivoques’, Topol. Methods Nonlinear Anal. 4, 197–208.
Gagliardo, E. (1958), ‘Proprietà di alcune classi di funzioni in più variabili’, Ricerche Mat. 7, 102–137.
Gamkrelidze, R. (1978), Principles of Optimal Control Theory, Vol. 7 of Methods in Science and Engineering, Plenum Press, New York.
García-Melián, J. & Sabina de Lis, J. (1998), ‘Maximum and comparison principles for operators involving the p-Laplacian’, J. Math. Anal. Appl. 218, 49–65.
Gasiński, L. & Papageorgiou, N. (2005), Nonsmooth Critical Point Theory and Nonlinear Boundary Value Problems, Chapman and Hall/CRC Press, Boca Raton, FL.
Gâteaux, R. (1913), ‘Sur les fonctionnelles continues et les fonctionnelles analytiques’, C. R. Acad. Sci. Paris Sér. A-B 157, 325–327.
Gelfand, I. & Shilov, G. (1977), Generalized Functions I. Properties and Operations, Academic Press, New York.
Ghoussoub, N. (1993a), Duality and Perturbation Methods in Critical Point Theory, Vol. 107 of Cambridge Tracts in Mathematics, Cambridge University Press, Cambridge.
Ghoussoub, N. (1993b), ‘A min-max principle with a relaxed boundary condition’, Proc. Amer. Math. Soc. 117, 439–447.
Gilbarg, D. & Trudinger, N. (2001), Elliptic Partial Differential Equations of Second Order, Classics in Mathematics, Springer-Verlag, Berlin.
Giles, J. (1982), Convex Analysis with Applications in Differentiation of Convex Functions, Vol. 58 of Research Notes in Mathematics, Pitman, Boston, MA.
Giusti, E. (1984), Minimal Surfaces and Functions of Bounded Variation, Vol. 80 of Monographs in Mathematics, Birkhäuser Verlag, Basel.
Godoy, T., Gossez, J.-P. & Paczka, S. (1999), ‘Antimaximum principle for elliptic problems with weight’, Electron. J. Differential Equations 1999, 1–15.
Godoy, T., Gossez, J.-P. & Paczka, S. (2002), ‘On the antimaximum principle for the p-Laplacian with indefinite weight’, Nonlinear Anal. 51, 449–467.
Goebel, K. & Kirk, W. (1990), Topics in Metric Fixed Point Theory, Vol. 28 of Cambridge Studies in Advanced Mathematics, Cambridge University Press, Cambridge.
Goeleven, D. (1993), ‘A note on Palais-Smale condition in the sense of Szulkin’, Differential Integral Equations 6, 1041–1043.
Gohberg, I. & Goldberg, S. (1981), Basic Operator Theory, Birkhäuser Verlag, Boston, MA.
Göhde, D. (1965), ‘Zum Prinzip der kontraktiven Abbildung’, Math. Nachr. 30, 251–258.
Goldberg, S. (1966), Unbounded Linear Operators: Theory and Applications, McGraw-Hill Book Co., New York.
Goldstein, J. (1985), Semigroups of Linear Operators and Applications, Oxford Mathematical Monographs, Oxford University Press, New York.
Golomb, M. (1935), ‘Zur Theorie der nichtlinearen Integralgleichungen, Integralgleichungssysteme und Funktionalgleichungen’, Math. Z. 39, 45–75.
Górniewicz, L. (1999), Topological Fixed Point Theory of Multivalued Mappings, Vol. 495 of Mathematics and Its Applications, Kluwer, Dordrecht.
Górniewicz, L., Marano, S. & Slosarski, M. (1996), ‘Fixed points of contractive multivalued maps’, Proc. Amer. Math. Soc. 124, 2675–2683.
Guedda, M. & Véron, L. (1988), ‘Bifurcation phenomena associated to the p-Laplacian operator’, Trans. Amer. Math. Soc. 310, 419–431.
Guedda, M. & Véron, L. (1989), ‘Quasilinear elliptic equations involving critical Sobolev exponents’, Nonlinear Anal. 13, 879–902.
Guo, D. (1986), ‘Some fixed point theorems and applications’, Nonlinear Anal. 10, 1293–1302.
Guo, D. (1987), Some fixed point theorems of expansion and compression type with applications, in V. Lakshmikantham, ed., ‘Nonlinear Analysis and Applications (Arlington, Tex., 1986)’, Vol. 109 of Lecture Notes in Pure and Applied Mathematics, Marcel Dekker, New York, pp. 213–221.
Guo, D. & Lakshmikantham, V. (1988), Nonlinear Problems in Abstract Cones, Vol. 5 of Notes and Reports in Mathematics in Science and Engineering, Academic Press, Boston, MA.
Guo, D. & Sun, J. (1988), ‘Some global generalizations of the Birkhoff-Kellogg theorem and applications’, J. Math. Anal. Appl. 129, 231–242.
Gutman, S. (1985), ‘Topological equivalence in the space of integrable vector-valued functions’, Proc. Amer. Math. Soc. 93, 40–42.
Hadamard, J. (1910), Sur quelques applications de l’indice de Kronecker, in ‘Introduction à la Théorie des Fonctions d’une Variable’, Hermann & Cie., Paris, pp. 437–477.
Halmos, P. (1974), Measure Theory, Vol. 18 of Graduate Texts in Mathematics, Springer-Verlag, New York.
Halmos, P. (1998), Introduction to Hilbert Space and the Theory of Spectral Multiplicity, AMS Chelsea Publishing, Providence, RI.
Halpern, B. (1970), ‘Fixed-point theorems for set-valued maps in infinite dimensional spaces’, Math. Ann. 189, 87–98.
Halpern, B. & Bergman, G. (1968), ‘A fixed-point theorem for inward and outward maps’, Trans. Amer. Math. Soc. 130, 353–358.
Hardt, R. (1979), An Introduction to Geometric Measure Theory, Melbourne University, Melbourne.
Hartman, P. & Stampacchia, G. (1966), ‘On some non-linear elliptic differential-functional equations’, Acta Math. 115, 271–310.
Hausdorff, F. (1919), ‘Dimension und äusseres Mass’, Math. Ann. 79, 157–179.
Heikkilä, S. & Lakshmikantham, V. (1994), Monotone Iterative Techniques for Discontinuous Nonlinear Differential Equations, Vol. 181 of Monographs and Textbooks in Pure and Applied Mathematics, Marcel Dekker, New York.
Hermes, H. & LaSalle, J. (1969), Functional Analysis and Time Optimal Control, Vol. 56 of Mathematics in Science and Engineering, Academic Press, New York.
Hess, P. (1981), ‘An antimaximum principle for linear elliptic equations with an indefinite weight function’, J. Differential Equations 41, 369–374.
Hewitt, E. & Ross, K. (1963), Abstract Harmonic Analysis, Springer-Verlag, Berlin.
Hewitt, E. & Stromberg, K. (1975), Real and Abstract Analysis, Vol. 25 of Graduate Texts in Mathematics, Springer-Verlag, New York.
Hilbert, D. (1906), ‘Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen IV’, Gött. Nachr. pp. 157–227.
Hille, E. (1942), ‘Representation of one-parameter semigroup of linear transformations’, Proc. Nat. Acad. Sci. U.S.A. 28, 175–178.
Hille, E. & Phillips, R. (1957), Functional Analysis and Semigroups, Vol. 31 of American Mathematical Society Colloquium Publications, AMS, Providence, RI.
Hiriart-Urruty, J.-B. (1980), ‘Lipschitz r-continuity of the approximate subdifferential of a convex function’, Math. Scand. 47, 123–134.
Hiriart-Urruty, J.-B. & Lemaréchal, C. (1993), Convex Analysis and Minimization Algorithms II, Vol. 306 of Die Grundlehren der Mathematischen Wissenschaften, Springer-Verlag, Berlin.
Hiriart-Urruty, J.-B. & Phelps, R. (1993), ‘Subdifferential calculus using ε-subdifferentials’, J. Funct. Anal. 118, 154–166.
Hiriat-Urruty, J.-B. (1982), ε-subdifferential calculus, in ‘Convex Analysis and Optimization (London, 1980)’, Vol. 57 of Research Notes in Mathematics, Pitman, London, pp. 43–92.
Hofer, H. (1984), ‘A note on the topological degree at a critical point of mountain pass-type’, Proc. Amer. Math. Soc. 90, 309–315.
Hofer, H. (1985), ‘A geometric description of the neighbourhood of a critical point given by the mountain pass theorem’, J. London Math. Soc. 31, 566–570.
Hofer, H. (1986), The topological degree at a critical point of mountain pass-type, in ‘Nonlinear Diffusion Equations and Their Equilibrium States
(Berkeley, CA, 1983)’, Vol. 45 of Proc. Sympos. Pure Math., AMS, Providence, RI, pp. 501–509.
Hofer, H. (1988), A strong form of the mountain pass theorem and application, in ‘Nonlinear Diffusion Equations and Their Equilibrium States I (Berkeley, CA, 1986)’, Vol. 12 of Math. Sci. Res. Inst. Publ., Springer-Verlag, New York, pp. 341–350.
Holmes, R. (1975), Geometric Functional Analysis and Applications, Vol. 24 of Graduate Texts in Mathematics, Springer-Verlag, New York.
Hopf, E. (1927), ‘Elementare Bemerkungen über die Lösung partieller Differentialgleichungen zweiter Ordnung vom elliptischen Typus’, Sitzungsber. Acad. Berlin 19, 147–152.
Hopf, E. (1952), ‘A remark on linear elliptic differential equations of the second order’, Proc. Amer. Math. Soc. 3, 791–793.
Hörmander, L. (1955), ‘Sur la fonction d’appui des ensembles convexes dans un espaces localement convexe’, Ark. Mat. 3, 180–186.
Hu, S. & Papageorgiou, N. (1997), Handbook of Multivalued Analysis. Volume I: Theory, Vol. 419 of Mathematics and Its Applications, Kluwer, Dordrecht.
Hu, S. & Papageorgiou, N. (2000), Handbook of Multivalued Analysis. Volume II: Applications, Vol. 500 of Mathematics and Its Applications, Kluwer, Dordrecht.
Huang, Y.-X. (1990), ‘On eigenvalue problems of p-Laplacian with Neumann boundary conditions’, Proc. Amer. Math. Soc. 109, 177–184.
Hunt, B., Sauer, T. & Yorke, J. (1992), ‘Prevalence: A translation-invariant “almost every” on infinite-dimensional spaces’, Bull. Amer. Math. Soc. 27, 217–238.
Hunt, B., Sauer, T. & Yorke, J. (1993), ‘Prevalence: An addendum to: “A translation-invariant “almost every” on infinite-dimensional spaces.”’, Bull. Amer. Math. Soc. 28, 306–307.
Ioffe, A. (1977a), ‘On lower semicontinuity of integral functionals I’, SIAM J. Control Optim. 15, 521–538.
Ioffe, A. (1977b), ‘On lower semicontinuity of integral functionals II’, SIAM J. Control Optim. 15, 991–1000.
Ioffe, A. & Levin, V. (1972), ‘Subdifferentials of convex functions’, Trans. Moscow Math. Soc. 26, 3–73.
Ioffe, A. & Tichomirov, V. (1968), ‘Duality of convex functions and extremal problems’, Russian Math. Surveys 23, 53–124.
Ioffe, A. & Tihomirov, V. (1979), Theory of Extremal Problems, Vol. 6 of Studies in Mathematics and Its Applications, North-Holland Publishing Co., Amsterdam.
Ionescu-Tulcea, A. & Ionescu-Tulcea, C. (1969), Topics in the Theory of Lifting, Vol. 48 of Ergebnisse der Mathematik und ihrer Grenzgebiete, Springer-Verlag, Berlin.
Istratescu, V. (1981), Fixed Point Theory, Vol. 7 of Mathematics and Its Applications, D. Reidel Publishing Co., Dordrecht.
James, R. (1964), ‘Weakly compact sets’, Trans. Amer. Math. Soc. 113, 129–140.
Jordan, C. (1881), ‘Sur la serie de Fourier’, C. R. Acad. Sci. Paris Sér. A-B 92, 228–230.
Jost, J. (2002), Partial Differential Equations, Vol. 214 of Graduate Texts in Mathematics, Springer-Verlag, New York.
Kachurovski, R. (1960), ‘On monotone operators and convex functionals’, Uspehi Mat. Nauk 15, 213–215.
Kakutani, S. (1938), ‘Two-fixed point theorems concerning bicompact convex sets’, Proc. Imp. Acad. Jap. 14, 242–245.
Kakutani, S. (1941), ‘A generalization of Brouwer’s fixed point theorem’, Duke Math. J. 8, 457–459.
Kakutani, S. (1943), ‘Topological properties of the unit sphere of a Hilbert space’, Proc. Imp. Acad. Tokyo 19, 269–271.
Kannan, R. (1969), ‘Some results on fixed points II’, Amer. Math. Monthly 76, 405–408.
Kato, T. (1967), ‘Nonlinear semigroups and evolution equations’, J. Math. Soc. Japan 19, 508–520.
Kato, T. (1968), Accretive operators and nonlinear evolution equations in Banach spaces, in F. Browder, ed., ‘Nonlinear Functional Analysis’, Vol. 18 of Proc. Sympos. Pure Math., AMS, Providence, RI, pp. 138–161.
Kato, T. (1970), Note on differentiability of nonlinear semigroups, in F. Browder, ed., ‘Global Analysis’, Vol. 16 of Proc. Sympos. Pure Math., AMS, Providence, RI, pp. 91–94.
Kato, T. (1976), Perturbation Theory for Linear Operators, Vol. 132 of Die Grundlehren der Mathematischen Wissenschaften, Springer-Verlag, Berlin.
Kenderov, P. (1974), ‘The set-valued monotone mappings are almost everywhere single-valued’, C. R. Acad. Bulgare Sci. 27, 1173–1175.
Kenmochi, N. (1972), ‘Accretive mappings in Banach spaces’, Hiroshima Math. J. 2, 163–177.
Kenmochi, N. (1973), ‘Remarks on the m-accretiveness of nonlinear operators’, Hiroshima Math. J. 3, 61–68.
Kenmochi, N. (1974), ‘Nonlinear operators of monotone type in reflexive Banach spaces and nonlinear perturbations’, Hiroshima Math. J. 4, 229–263.
Kenmochi, N. (1975), ‘Pseudomonotone operators and nonlinear elliptic boundary value problems’, J. Math. Soc. Japan 27, 121–149.
Kirk, W. (1965), ‘A fixed point theorem for mappings which do not increase distances’, Amer. Math. Monthly 72, 1004–1006.
Kirszbraun, M. (1934), ‘Über die zusammenziehenden und Lipschitzschen Transformationen’, Fund. Math. 22, 77–108.
Klee, V. (1956), ‘A note on topological properties of normed linear spaces’, Proc. Amer. Math. Soc. 7, 673–674.
Knaster, B., Kuratowski, K. & Mazurkiewicz, S. (1929), ‘Ein Beweis des Fixpunktsatzes für n-dimensionale Simplexe’, Fund. Math. 14, 132–137.
Kneser, H. (1950), ‘Eine direkte Ableitung des Zornschen Lemmas aus dem Auswahlaxiom’, Math. Z. 53, 110–113.
Kolmogorov, A. (1931), ‘Über Kompaktheit der Funktionenmengen bei der Konvergenz im Mittel’, Nachr. Akad. Wiss. Göttingen Math.-Phys. Kl. I pp. 60–63.
Komura, Y. (1967), ‘Nonlinear semigroups in Hilbert spaces’, J. Math. Soc. Japan 19, 493–507.
Kondrachov, W. (1945), ‘Sur certaines propriétés des fonctions dans l’espace $L^p$’, Dokl. Akad. Nauk SSSR 48, 535–538.
Krasnoselskii, M. (1964a), Positive Solutions of Operator Equations, Noordhoff International Publishing, Groningen.
Krasnoselskii, M. (1964b), Topological Methods in the Theory of Nonlinear Integral Equations, The Macmillan Co., New York.
Krasnoselskii, M. & Zabreiko, P. (1984), Geometrical Methods in Nonlinear Analysis, Vol. 263 of Die Grundlehren der Mathematischen Wissenschaften, Springer-Verlag, New York.
Krickerberg, K. (1957), ‘Distributionen, Funktionen beschränkter Variation und Lebesguescher Inhalt nichtparametrischer Flächen’, Ann. Mat. Pura Appl. (4) 44, 105–134.
Kufner, A., John, O. & Fučik, S. (1977), Function Spaces, Monographs and Textbooks on Mechanics of Solids and Fluids; Mechanics: Analysis, Noordhoff International Publishing, Leyden.
Ladyzhenskaya, O. & Uraltseva, N. (1968), Linear and Quasilinear Elliptic Equations, Vol. 46 of Mathematics in Science and Engineering, Academic Press, New York.
Lang, S. (1972), Differential Manifolds, Addison-Wesley Publishing Co., Reading, MA.
Laurent, P. (1972), Approximation et Optimisation, Vol. 13 of Collection Enseignement des Sciences, Hermann, Paris.
Lebesgue, H. (1904), Leçons sur l’Intégration et la Recherche des Fonctions Primitives, Gauthier-Villars, Paris.
Lebesgue, H. (1910), ‘Sur l’intégration des fonctions discontinues’, Ann. École Nat. Sup. Méc. Nantes 27, 361–450.
Lebourg, G. (1975), ‘Valeur moyenne pour gradient généralisé’, C. R. Acad. Sci. Paris Sér. A-B 281, 795–797.
Legendre, A. (1786), Memoire sur la manière de distinguer les maxima des minima dans le calcul variations, Memoire de l’Academie des Sciences, Paris.
Leggett, L. & Williams, L. (1979), ‘Multiple positive fixed points of nonlinear operators on ordered Banach spaces’, Indiana Univ. Math. J. 28, 673–688.
Leray, J. & Schauder, J. (1934), ‘Topologie et équations fonctionnelles’, Ann. Sci. École Norm. Sup. 51, 45–78.
Levin, V. (1973), ‘Subdifferentials of convex integral functionals and lifting that are the identity on subspaces in $L^\infty$’, Soviet Math. Dokl. 14, 1163–1166.
Levin, V. (1974), ‘The Lebesgue decomposition for functions on the vector function space $L^\infty_X$’, Funct. Anal. Appl. 8, 314–317.
Levin, V. (1975), ‘Convex integral functionals and lifting theory’, Russian Math. Surveys 30, 119–184.
Levin, V. (1980), ‘Measurable selections of multivalued mappings into topological spaces and upper envelopes of Carathéodory integrands’, Soviet Math. Dokl. 21, 771–775.
Lévy, P. (1920), ‘Sur les fonctions de lignes implicites’, Bull. Soc. Math. France 48, 13–27.
Li, S. & Willem, M. (1995), ‘Applications of local linking to critical point theory’, J. Math. Anal. Appl. 189, 6–32.
Li, X. & Yong, J. (1995), Optimal Control Theory for Infinite-Dimensional Systems, Systems & Control: Foundations & Applications, Birkhäuser Verlag, Boston, MA.
Lieberman, G. (1988), ‘Boundary regularity for solutions of degenerate elliptic equations’, Nonlinear Anal. 12, 1203–1219.
Lim, T. (1974), ‘A fixed point theorem for multivalued nonexpansive mappings in a uniformly convex Banach space’, Bull. Amer. Math. Soc. 80, 1123–1126.
Lim, T. (1985), ‘On fixed point stability for set-valued contractive mappings with applications to generalized differential equations’, J. Math. Anal. Appl. 110, 436–441.
Lindenstrauss, J. (1963), ‘On operators which attain their norms’, Israel J. Math. 1, 139–148.
Lindqvist, P. (1990), ‘On the equation $\operatorname{div}(\|\nabla x\|^{p-2}\nabla x) + \lambda|x|^{p-2}x = 0$’, Proc. Amer. Math. Soc. 109, 157–164.
Lindqvist, P. (1992), ‘Addendum to “On the equation $\operatorname{div}(\|\nabla x\|^{p-2}\nabla x) + \lambda|x|^{p-2}x = 0$”’, Proc. Amer. Math. Soc. 116, 583–584.
Lions, J.-L. (1969), Quelques Méthodes de Résolution des Problèmes aux Limites Non Linéaires, Dunod, Paris.
Lions, J.-L. & Magenes, E. (1972), Non-Homogeneous Boundary Value Problems and Applications, Vol. 182 of Die Grundlehren der Mathematischen Wissenschaften, Springer-Verlag, New York.
Lions, P.-L. (1985a), ‘The concentration-compactness principle in the calculus of variations. The limit case’, Rev. Mat. Iberoamericana 1, 45–121.
Lions, P.-L. (1985b), ‘The concentration-compactness principle in the calculus of variations. The limit case II’, Rev. Mat. Iberoamericana 1, 145–201.
Liu, F. (1977), ‘A Luzin-type property of Sobolev functions’, Indiana Univ. Math. J. 26, 645–651.
Liu, J. & Li, S. (1984), ‘An existence theorem of multiple critical points and its applications’, Kexue Tongbao (Chinese) 29, 1025–1027.
Lloyd, N. (1978), Degree Theory, Vol. 73 of Cambridge Tracts in Mathematics, Cambridge University Press, Cambridge.
Lucchetti, R. & Patrone, F. (1980), ‘On Nemytskii’s operator and its applications to the lower semicontinuity of integral functionals’, Indiana Univ. Math. J. 29, 703–713.
Lumer, G. & Phillips, R. (1961), ‘Dissipative operators in a Banach space’, Pacific J. Math. 11, 679–698.
Lusternik, L. (1934), ‘On constrained extrema of functionals’, Mat. Sb. 41, 390–401.
Lusternik, L. & Schnirelman, L. (1934), Méthodes Topologiques dans les Problèmes Variationnels, Hermann & Cie., Paris.
Manásevich, R. & Mawhin, J. (1998), ‘Periodic solutions for nonlinear systems with p-Laplacian-like operators’, J. Differential Equations 145, 367–393.
Manásevich, R. & Mawhin, J. (2000), ‘Boundary value problems for nonlinear perturbations of vector p-Laplacian-like operators’, J. Korean Math. Soc. 37, 665–685.
Mandelbrojt, S. (1939), ‘Sur les fonctions convexes’, C. R. Acad. Sci. Paris 209, 977–978.
Manes, A. & Micheletti, A. (1973), ‘Un’estensione della teoria variazionale classica degli autovalori per operatori ellittici del secondo ordine’, Boll. Un. Mat. Ital. 7, 285–301.
Marcus, M. & Mizel, V. (1972), ‘Absolute continuity on tracks and mappings of Sobolev spaces’, Arch. Rational Mech. Anal. 45, 294–320.
Marcus, M. & Mizel, V. (1979), ‘Every superposition operator mapping one Sobolev space into another is continuous’, J. Funct. Anal. 33, 217–229.
Marino, A. & Prodi, G. (1975), ‘Metodi perturbativi nella teoria di Morse’, Boll. Un. Mat. Ital. 11, 1–32.
Markov, A. (1936), ‘Quelques théorèmes sur les ensembles abéliens’, C.R. Acad. Sci. URSS 1, 311–313.
Martio, O. (1988), ‘Counterexample for the unique continuation’, Manuscripta Math. 60, 21–47.
Mawhin, J. (2001), Periodic solutions of systems with p-Laplacian-like operators, in ‘Nonlinear Analysis and Its Applications to Differential Equations (Lisbon 1998)’, Vol. 43 of Progress in Nonlinear Differential Equations and Their Applications, Birkhäuser Verlag, Boston, MA, pp. 37–63.
Mawhin, J. & Willem, M. (1989), Critical Point Theory and Hamiltonian Systems, Vol. 74 of Applied Mathematical Sciences, Springer-Verlag, New York.
Maz’ja, V. (1985), Sobolev Spaces, Springer Series in Soviet Mathematics, Springer-Verlag, New York.
Mazur, S. (1933), ‘Über konvexe Mengen in linearen normierten Räumen’, Studia Math. 4, 70–84.
McShane, E. (1934), ‘Extension of range of functions’, Bull. Amer. Math. Soc. 40, 837–842.
Meyers, N. (1978), ‘Integral inequalities of Poincaré and Wirtinger type’, Arch. Rational Mech. Anal. 68, 113–120.
Meyers, N. & Serrin, J. (1964), ‘H = W’, Proc. Nat. Acad. Sci. U.S.A. 51, 1055–1056.
Migórski, S. (1994), ‘A counterexample to a compact embedding theorem for functions with values in a Hilbert space’, Proc. Amer. Math. Soc. 123, 2447–2449.
Minty, G. (1962), ‘Monotone (nonlinear) operators in a Hilbert space’, Duke Math. J. 29, 341–346.
Miyadera, I. (1952), ‘Generation of a strongly continuous semi-group of operators’, Tôhoku Math. J. (2) 4, 109–114.
Miyadera, I. (1992), Nonlinear Semigroups, Vol. 109 of Translations of Mathematical Monographs, AMS, Providence, RI.
Moreau, J.-J. (1965), ‘Proximité et dualité dans un espace hilbertien’, Bull. Soc. Math. France 93, 273–299.
Moreau, J.-J. (1966–1967), Fonctionnelles Convexes, Séminaire sur les équations aux dérivées partielles II, Collège de France, Paris.
Morosanu, G. (1988), Nonlinear Evolution Equations and Applications, Vol. 26 of Mathematics and Its Applications (East European Series), Reidel Publishing Co., Dordrecht.
Morrey, C. (1940), ‘Functions of several variables and absolute continuity II’, Duke Math. J. 6, 187–215.
Morrey, C. (1966), Multiple Integrals in the Calculus of Variations, Vol. 130 of Die Grundlehren der Mathematischen Wissenschaften, Springer-Verlag, New York.
Moser, J. (1960), ‘A new proof of De Giorgi’s theorem concerning the regularity problem for elliptic differential equations’, Comm. Pure Appl. Math. 13, 457–468.
Motreanu, D. & Rădulescu, V. (2003), Variational and Non-Variational Methods in Nonlinear Analysis and Boundary Value Problems, Vol. 67 of Nonconvex Optimization and Its Applications, Kluwer, Dordrecht.
Nadler, S. (1969), ‘Multivalued contraction mappings’, Pacific J. Math. 30, 475–488.
Naniewicz, Z. & Panagiotopoulos, P. (1995), Mathematical Theory of Hemivariational Inequalities and Applications, Vol. 188 of Monographs and Textbooks in Pure and Applied Mathematics, Marcel Dekker, New York.
Nash, J. (1958), ‘Continuity of solutions of parabolic and elliptic equations’, Amer. J. Math. 80, 931–954.
Nashed, M. (1971), Differentiability and related properties of nonlinear operators: Some aspects of the role of differentials in nonlinear functional analysis, in ‘Nonlinear Functional Analysis and Applications (Proc. Advanced Sem., Math. Res. Center, Univ. of Wisconsin, Madison, 1970)’, Academic Press, New York, pp. 103–309.
Nečas, J. (1967), Les Méthodes Directes en Théorie des Équations Elliptiques, Masson, Paris.
Nirenberg, L. (1959), ‘On elliptic partial differential equations’, Ann. Scuola Norm. Sup. Pisa Cl. Sci. 13, 1–48.
Nirenberg, L. (1981), ‘Variational and topological methods in nonlinear problems’, Bull. Amer. Math. Soc. (N.S.) 4, 267–302.
Nirenberg, L. (1989), Variational methods in nonlinear problems, in ‘Topics in Calculus of Variations (Montecatini Terme, 1987)’, Vol. 1365 of Lecture Notes in Mathematics, Springer-Verlag, Berlin, pp. 100–119.
Nussbaum, R. (1971), ‘The fixed point index for local condensing maps’, Ann. Mat. Pura Appl. 89, 217–258.
Oleinik, O. (1952), ‘On properties of certain boundary value problems for equations of elliptic type’, Mat. Sb. (N.S.) 30, 695–702.
O’Regan, D. & Precup, R. (2001), Theorems of Leray-Schauder Type and Applications, Vol. 3 of Series in Mathematical Analysis and Applications, Gordon and Breach Sciences Publishers, Amsterdam.
Ôtani, M. (1984), ‘On certain second order ordinary differential equations associated with Sobolev-Poincaré-type inequalities’, Nonlinear Anal. 8, 1255–1270.
Palais, R. (1966), ‘Lusternik-Schnirelman theory on Banach manifolds’, Topology 5, 115–132.
Palais, R. & Smale, S. (1964), ‘A generalized Morse theory’, Bull. Amer. Math. Soc. 70, 165–172.
Papageorgiou, E. & Papageorgiou, N. (2004), ‘Two nontrivial solutions for quasilinear periodic problems’, Proc. Amer. Math. Soc. 132, 429–434.
Papageorgiou, N. (1985), ‘On the theory of Banach spaces valued multifunctions, part I: Integration and conditional expectation’, J. Multivariate Anal. 17, 185–206.
Papageorgiou, N. (1986), ‘Integral functionals on Souslin locally convex spaces’, J. Math. Anal. Appl. 113, 148–162.
Parthasarathy, K. (1967), Probability Measures on Metric Spaces, Vol. 3 of Probability and Mathematical Statistics, Academic Press, New York.
Pascali, D. & Sburlan, S. (1978), Nonlinear Mappings of Monotone Type, Sijthoff and Noordhoff International Publishers, Alphen aan den Rijn.
Pavel, N. (1987), Nonlinear Evolution Operators and Semigroups, Vol. 1260 of Lecture Notes in Mathematics, Springer-Verlag, Berlin.
Pazy, A. (1968), ‘On the differentiability and compactness of semigroups of linear operators’, J. Math. Mech. 17, 1131–1141.
Pazy, A. (1983), Semigroups of Linear Operators and Applications to Partial Differential Equations, Vol. 44 of Applied Mathematical Sciences, Springer-Verlag, New York.
Pedregal, P. (1997), Parametrized Measures and Variational Principles, Vol. 30 of Progress in Nonlinear Differential Equations and Their Applications, Birkhäuser Verlag, Basel.
Penot, J.-P. (1978), ‘Calcul sous-différentiel et optimisation’, J. Funct. Anal. 27, 248–276.
Penot, J.-P. (1986), ‘The drop theorem, the petal theorem and Ekeland’s variational principle’, Nonlinear Anal. 10, 813–822.
Pettis, B. (1938a), ‘Linear functionals and completely additive set functions’, Duke Math. J. 4, 552–565.
Pettis, B. (1938b), ‘On integration in vector spaces’, Trans. Amer. Math. Soc. 44, 277–304.
Phelps, R. (1993), Convex Functions, Monotone Operators and Differentiability, Vol. 1364 of Lecture Notes in Mathematics, Springer-Verlag, Berlin.
Phillips, R. (1940), ‘On linear transformations’, Trans. Amer. Math. Soc. 48, 516–541.
Phillips, R. (1953), ‘Perturbation theory for semigroups of linear operators’, Trans. Amer. Math. Soc. 74, 199–221.
Phillips, R. (1959), ‘Dissipative operators and hyperbolic systems of partial differential equations’, Trans. Amer. Math. Soc. 90, 193–254.
Plessner, A. (1929), ‘Eine Kennzeichnung der totalstetigen Funktionen’, J. für Math. 1960, 26–32.
Poincaré, H. (1886), ‘Sur les courbes définies par les équations différentielles IV’, Jordan J. (4) 2, 151–217.
Poljak, B. (1969), ‘Semicontinuity of integral functionals and existence theorems on extremal problems’, Math. USSR-Sb. 7, 59–77.
Protter, M. & Weinberger, H. (1967), Maximum Principles in Differential Equations, Prentice Hall, New York.
Pshenichnyi, B. (1971), Necessary Conditions for an Extremum, Vol. 4 of Pure and Applied Mathematics, Marcel Dekker, New York.
Pucci, C. (1952), ‘Maggiorazione della soluzione di un problema al contorno di tipo misto, relativo a una equazione a derivate parziali, lineare, del secondo ordine’, Atti Accad. Naz. Lincei Rend. Cl. Sci. Fis. Mat. Natur. (8) 13, 360–366.
Pucci, P. & Serrin, J. (1985), ‘A mountain pass theorem’, J. Differential Equations 60, 142–149.
Pucci, P. & Serrin, J. (1987), ‘The structure of the critical set in the mountain pass theorem’, Trans. Amer. Math. Soc. 299, 115–132.
Rabinowitz, P. (1973), ‘Some aspects of nonlinear eigenvalue problems’, Rocky Mountain J. Math. 3, 161–202.
Rabinowitz, P. (1978a), ‘Some critical points theorems and applications to semilinear elliptic partial differential equations’, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 5, 215–223.
Rabinowitz, P. (1978b), Some minimax theorems and applications to nonlinear partial differential equations, in L. Cesari, R. Kannan & H. Weinberg, eds, ‘Nonlinear Analysis (selection of papers in honor of Erich H. Rothe)’, Academic Press, New York, pp. 161–177.
Rabinowitz, P. (1986), Minimax Methods in Critical Point Theory with Applications to Differential Equations, Vol. 65 of CBMS Regional Conference Series in Mathematics, AMS, Providence, RI.
Rademacher, H. (1919), ‘Über partielle und totale Differenzierbarkeit von Funktionen mehrerer Variablen und über die Transformation der Doppelintegrale I’, Math. Ann. 79, 340–359.
Ramos, M. & Rebelo, C. (1994), ‘A unified approach to min-max critical point theorems’, Portugal. Math. 51, 489–516.
Reed, M. & Simon, B. (1972), Functional Analysis, Academic Press, New York.
Rellich, R. (1930), ‘Ein Satz über mittlere Konvergenz’, Nachr. Akad. Wiss. Göttingen Math.-Phys. Kl. I 1930, 30–35.
Renardy, M. & Rogers, R. (1993), A First Graduate Course in Partial Differential Equations, Springer-Verlag, New York.
Resetnjak, J. (1969), ‘On the concept of capacity in the theory of functions with generalized derivatives’, Sibirsk. Mat. Ž. 10, 1109–1138.
Ricceri, B. (1987), ‘Une propriété topologique de l’ensemble des points fixes d’une contraction multivoque à valeurs convexes’, Atti Accad. Naz. Lincei Rend. Cl. Sci. Fis. Mat. Natur. (8) 81, 283–286.
Riesz, F. (1918), ‘Über lineare Funktionalgleichungen’, Acta Math. 41, 71–98.
Riesz, F. & Nagy, B. (1955), Functional Analysis, Frederick Ungar Publishing Co., New York.
Riesz, M. (1933), ‘Sur les ensembles compacts de fonctions sommables’, Acta Sci. Math. (Szeged) 6, 136–142.
Roberts, A. & Varberg, D. (1973), Convex Functions, Vol. 57 of Pure and Applied Mathematics, Academic Press, New York.
Rockafellar, R. (1966), ‘Characterization of the subdifferentials of convex functions’, Pacific J. Math. 17, 497–510.
Rockafellar, R. (1968), ‘Integrals which are convex functionals’, Pacific J. Math. 24, 525–539.
Rockafellar, R. (1969), ‘Local boundedness of nonlinear monotone operators’, Michigan Math. J. 16, 397–407.
Rockafellar, R. (1970a), Convex Analysis, Vol. 28 of Princeton Mathematical Series, Princeton University Press, Princeton, NJ.
Rockafellar, R. (1970b), ‘On the maximal monotonicity of subdifferential mappings’, Pacific J. Math. 33, 209–216.
Rockafellar, R. (1970c), ‘On the maximality of sums of nonlinear monotone operators’, Trans. Amer. Math. Soc. 149, 75–88.
Rockafellar, R. (1971a), Convex integral functionals and duality, in ‘Contributions to Nonlinear Functional Analysis (Proc. Sympos., Math. Res. Center, Univ. Wisconsin, Madison, Wis., 1971)’, Academic Press, New York, pp. 215–236.
Rockafellar, R. (1971b), ‘Integrals which are convex functionals. II’, Pacific J. Math. 39, 439–469.
Rockafellar, R. (1971c), Weak compactness of level sets of integral functionals, in H. Garnir, ed., ‘Troisième Colloque sur l’Analyse Fonctionnelle’, Vander, Louvain, pp. 85–98.
Rockafellar, R. (1974), Conjugate Duality and Optimization, Vol. 16 of CBMS Regional Conference Series in Mathematics, SIAM, Philadelphia, PA.
Rockafellar, R. (1976), Integral functionals, normal integrands and measurable selections, in ‘Nonlinear Operators and the Calculus of Variations (Summer School, Univ. Libre Bruxelles, Brussels, 1975)’, Vol. 543 of Lecture Notes in Mathematics, Springer-Verlag, New York, pp. 157–207.
Rockafellar, R. & Wets, R. (1998), Variational Analysis, Vol. 317 of Die Grundlehren der Mathematischen Wissenschaften, Springer-Verlag, Berlin.
Rothe, E. (1938), ‘Zur Theorie der topologischen Ordnung und der Vektorfelder in Banachschen Räumen’, Compositio Math. 5, 177–197.
Rothe, E. (1973), ‘Morse theory in Hilbert spaces’, Rocky Mountain J. Math. 3, 251–274. Rocky Mountain Consortium Symposium on Nonlinear Eigenvalue Problems (Santa Fe, NM, 1971).
Roubíček, T. (1997), Relaxation in Optimization Theory and Variational Calculus, Vol. 4 of de Gruyter Series in Nonlinear Analysis and Applications, De Gruyter, Berlin.
Royden, H. (1968), Real Analysis, MacMillan Publishing Co., New York.
Rzezuchowski, T. (1989), ‘Strong convergence of selectors implied by weak’, Bull. Austral. Math. Soc. 39, 201–214.
Sadovskii, B. (1972), ‘Limit-compact and condensing operators’, Russian Math. Surveys 27, 85–155.
Scarf, H. (1967), ‘The approximation of fixed points of a continuous mapping’, SIAM J. Appl. Math. 15, 1328–1343.
Schaefer, H. (1955), ‘Über die Methode der a priori Schranken’, Math. Ann. 129, 415–416.
Schauder, J. (1930), ‘Der Fixpunktsatz in Funktionalräumen’, Studia Math. 2, 171–180.
Schauder, J. (1933), ‘Über das Dirichletische Problem in Grössen für nichtlineare elliptische Differentialgleichungen’, Math. Z. 37, 623–634.
Schechter, M. (1971), Principles of Functional Analysis, Academic Press, New York.
Schwartz, J. (1964), ‘Generalizing the Lusternik-Schnirelman theory of critical points’, Comm. Pure Appl. Math. 17, 307–315.
Schwartz, J. (1969), Nonlinear Functional Analysis, Gordon and Breach Sciences Publishers, New York.
Showalter, R. (1997), Monotone Operators in Banach Spaces and Nonlinear Partial Differential Equations, Vol. 49 of Mathematical Surveys and Monographs, AMS, Providence, RI.
Silva, E. (1991), ‘Linking theorems and applications to semilinear elliptic problems at resonance’, Nonlinear Anal. 16, 455–477.
Simon, J. (1987), ‘Compact sets in the space $L^p(0,T;B)$’, Ann. Mat. Pura Appl. (4) 146, 65–96.
Simon, L. (1983), Lectures on Geometric Measure Theory, Vol. 3 of Proceedings of the Center for Mathematical Analysis, Australian National University, Australian National University, Canberra.
Smart, D. (1980), Fixed Point Theorems, Vol. 66 of Cambridge Tracts in Mathematics, Cambridge University Press, Cambridge.
Sobolev, S. (1963a), Applications of Functional Analysis in Mathematical Physics, Vol. 7 of Translations of Mathematical Monographs, AMS, Providence, RI.
Sobolev, S. (1963b), ‘On a theorem of functional analysis’, Amer. Math. Soc. Transl. 34, 39–68.
Stampacchia, G. (1965), ‘Le problème de Dirichlet pour les équations elliptiques du second ordre à coefficients discontinus’, Ann. Inst. Fourier (Grenoble) 15, 189–258.
Stampacchia, G. (1966), Équations Elliptiques de Second Ordre à Coefficients Discontinus, Vol. 16 of Séminaire de Mathématiques Supérieures, Les Presses de l’Université de Montréal, Montreal.
Stein, E. (1970), Singular Integrals and Differentiability Properties of Functions, Vol. 30 of Princeton Mathematical Series, Princeton University Press, Princeton, NJ.
Struwe, M. (1990), Variational Methods. Applications to Nonlinear Partial Differential Equations and Hamiltonian Systems, Springer-Verlag, Berlin.
Sun, J. & Sun, Y. (1986), ‘Some fixed point theorems of increasing operators’, Appl. Anal. 23, 23–27.
Sun, Y. (1991), ‘A fixed point theorem for mixed monotone operators with applications’, J. Math. Anal. Appl. 156, 240–252.
Szulkin, A. (1986), ‘Minimax principles for lower semicontinuous functions and applications to nonlinear boundary value problems’, Ann. Inst. H. Poincaré Anal. Non Linéaire 3, 77–109.
Takahashi, W. (1991), Existence theorems generalizing fixed point theorems of multivalued mappings, in J. Baillon & M. Thera, eds, ‘Fixed Point Theory and Applications (Marseille, 1989)’, Vol. 252 of Pitman Res. Notes in Math. Ser., Longman Scientific & Technical, Harlow, pp. 397–406.
Talagrand, M. (1984a), ‘Pettis integral and measure theory’, Mem. Amer. Math. Soc. 51.
Talagrand, M. (1984b), ‘Weak Cauchy sequences in $L^1(E)$’, Amer. J. Math. 106, 703–724.
Talenti, G. (1976), ‘Best constant in Sobolev inequality’, Ann. Mat. Pura Appl. 110, 353–372.
Tarski, A. (1955), ‘A lattice-theoretical fixed point theorem and applications’, Pacific J. Math. 5, 285–309.
Tartar, L. (1979), Compensated compactness and applications to partial differential equations, in R. Knops, ed., ‘Nonlinear Analysis and Mechanics, Heriot-Watt Symposium, Vol. IV’, Vol. 39 of Research Notes in Mathematics, Pitman, London, pp. 139–212.
Tiba, D. (1990), Optimal Control of Nonsmooth Distributed Parameter Systems, Vol. 1459 of Lecture Notes in Mathematics, Springer-Verlag, Berlin.
Tolksdorf, P. (1983), ‘On the Dirichlet problem for quasilinear equations in domains with conical boundary points’, Comm. Partial Differential Equations 8, 773–817.
Tolksdorf, P. (1984), ‘Regularity for a more general class of quasilinear elliptic equations’, J. Differential Equations 51, 126–150.
Tonelli, L. (1926), ‘Sulla quadratura delle superficie’, Atti R. Accad. Lincei 6, 633–638.
Trudinger, N. (1967), ‘On Harnack-type inequalities and their applications to quasilinear elliptic equations’, Comm. Pure Appl. Math. 20, 721–747.
Tychonoff, A. (1935), ‘Ein Fixpunktsatz’, Math. Ann. 111, 767–776.
Valadier, M. (1975), ‘Convex integrands on Souslin locally convex spaces’, Pacific J. Math. 59, 267–276.
Vaĭnberg, M. (1973), Variational Method and Method of Monotone Operators in the Theory of Nonlinear Equations, Halsted Press, New York.
Vázquez, J. (1984), ‘A strong maximum principle for some quasilinear elliptic equations’, Appl. Math. Optim. 12, 191–202.
Visintin, A. (1984), ‘Strong convergence results related to strict convexity’, Comm. Partial Differential Equations 9, 439–466.
Vitali, G. (1908), ‘Sui gruppi di punti e sulle funzioni di variabili reali’, Atti Accad. Sci. Torino Cl. Sci. Fis. Mat. Natur. 43, 75–92.
Vrabie, I. (1987), Compactness Methods for Nonlinear Evolutions, Vol. 32 of Pitman Monographs and Surveys in Pure and Applied Mathematics, Longman Scientific & Technical, Harlow.
Wang, Z.-Q. (1987), ‘A note on the second variational theorem’, Acta Mech. Sinica (Beijing) 30, 106–110.
Warga, J. (1972), Optimal Control of Differential and Functional Equations, Academic Press, New York.
Webster, R. (1994), Convexity, Oxford Science Publications, Oxford University Press, New York.
Weissinger, J. (1952), ‘Zur Theorie und Anwendung des Iterationsverfahrens’, Math. Nachr. 8, 193–212.
Widom, H. (1969), Lectures on Measure and Integration, Van Nostrand Reinhold Co., New York.
Willem, M. (1996), Minimax Theorems, Vol. 24 of Progress in Nonlinear Differential Equations and Their Applications, Birkhäuser Verlag, Boston, MA.
Wloka, J. (1987), Partial Differential Equations, Cambridge University Press, Cambridge.
Yosida, K. (1948), ‘On the differentiability and representation of one-parameter semi-groups of linear operators’, J. Math. Soc. Japan 1, 15–21.
Yosida, K. (1978), Functional Analysis, Vol. 123 of Die Grundlehren der Mathematischen Wissenschaften, Springer-Verlag, Berlin.
Yosida, K. & Hewitt, E. (1952), ‘Finitely additive measures’, Trans. Amer. Math. Soc. 72, 46–66.
Young, L. (1942a), ‘Generalized surfaces in the calculus of variations’, Ann. of Math. (2) 43, 84–103.
Young, L. (1942b), ‘Generalized surfaces in the calculus of variations. II’, Ann. of Math. (2) 43, 530–544.
Young, L. (1969), Lectures on the Calculus of Variations and Optimal Control Theory, W.B. Saunders Co., Philadelphia, PA.
Young, W. (1912), ‘On class of summable functions and their Fourier series’, Proc. Roy. Soc. Ser. A 87, 225–229.
Zeidler, E. (1985a), Nonlinear Functional Analysis and Its Applications I. Fixed Point Theorems, Springer-Verlag, New York.
Zeidler, E. (1985b), Nonlinear Functional Analysis and Its Applications III. Variational Methods and Optimization, Springer-Verlag, New York.
Zeidler, E. (1990a), Nonlinear Functional Analysis and Its Applications II/A. Linear Monotone Operators, Springer-Verlag, New York.
Zeidler, E. (1990b), Nonlinear Functional Analysis and Its Applications II/B. Nonlinear Monotone Operators, Springer-Verlag, New York.
Zhang, M. (2000), ‘Nonuniform nonresonance of semilinear differential equations’, J. Differential Equations 166, 33–50.
Zhang, M. (2001), ‘The rotation number approach to eigenvalues of the one-dimensional p-Laplacian with periodic potentials’, J. London Math. Soc. 64, 125–143.
Zhong, C.-K. (1997), ‘On Ekeland’s variational principle and a minimax theorem’, J. Math. Anal. Appl. 205, 239–250.
Ziemer, W. (1989), Weakly Differentiable Functions. Sobolev Spaces and Functions of Bounded Variation, Vol. 120 of Graduate Texts in Mathematics, Springer-Verlag, New York.