Notes on Mathematics and Its Applications
NONLINEAR FUNCTIONAL ANALYSIS
Jacob T. Schwartz
GORDON AND BREACH SCIENCE PUBLISHERS
Nonlinear Functional Analysis
J. T. SCHWARTZ
Courant Institute of Mathematical Sciences
New York University
Notes by
H. Fattorini, R. Nirenberg and H. Porta
with an additional chapter by
Hermann Karcher
GORDON AND BREACH SCIENCE PUBLISHERS NEW YORK LONDON PARIS
Copyright © 1969 by GORDON AND BREACH SCIENCE PUBLISHERS INC.
150 Fifth Avenue, New York, N. Y. 10011
Library of Congress Catalog Card Number: 68-25643
Editorial office for the United Kingdom:
Gordon and Breach Science Publishers Ltd.
12 Bloomsbury Way
London W.C.1.
Editorial office for France:
Gordon & Breach
79 rue Emile Dubois
Paris 14e
Distributed in Canada by:
The Ryerson Press
299 Queen Street West
Toronto 2b, Ontario
All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing
from the publishers. Printed in East Germany.
Editors' Preface
A large number of mathematical books begin as lecture notes; but, since mathematicians are busy, and since the labor required to bring lecture notes up to the level of perfection which authors and the public demand of formally published books is very considerable, it follows that an even larger number of lecture notes make the transition to book form only after great delay or not at all. The present lecture note series aims to fill the resulting gap. It will consist of reprinted lecture notes, edited at least to a satisfactory level of completeness and intelligibility, though not necessarily to the perfection which is expected of a book. In addition to lecture notes, the series will include volumes of collected reprints of journal articles as current developments indicate, and mixed volumes including both notes and reprints.
JACOB T. SCHWARTZ
MAURICE LEVI
Contents
Introduction
Chapter I: Basic Calculus
Chapter II: Hard Implicit Functional Theorems
Chapter III: Degree Theory and Applications
Chapter IV: Morse Theory on Hilbert Manifolds
Chapter V: Category
Chapter VI: Applications of Morse Theory to Calculus of Variations in the Large
Chapter VII: Applications
Chapter VIII: Closed Geodesics on Topological Spheres
Index
Introduction
Nonlinear functional analysis is of course not so much a subject, as the complement of another subject, namely, linear functional analysis. In studying our negatively defined field, we will, however, find a certain unity; partly because we shall exclude from functional analysis those analytic theories which do not make use of the characteristic procedure of functional analysis. This characteristic procedure is, of course, the treatment of a given problem, or the construction or study of a desired function, by imbedding the problem or function into a space (generally infinite dimensional) of related problems
or functions. In accordance with this distinction we shall, for example, regard much of the asymptotic study (by topological methods) of solutions of nonlinear differential equations as belonging to nonlinear analysis but not to nonlinear functional analysis, while, for instance, the Morse theory of geodesics, or the construction of solutions of partial differential equations by application of the Schauder fixed point theorem, will definitely be considered to belong to nonlinear functional analysis. The distinction suggested is not always clear-cut, however.
We may orient ourselves toward our subject of study as follows. Nonlinear functional analysis is nonlinear analysis in the context of infinite dimensional topological spaces, manifolds, etc. Naturally, our knowledge of nonlinear analysis in this case cannot be more complete than our knowledge of nonlinear analysis in the finite dimensional case. Therefore the finite-dimensional case can serve as a model for the infinite dimensional case. We can formulate our aim as follows: to extend known theorems of nonlinear analysis from the finite to the infinite dimensional case; to analyze any particular difficulties, not present in the finite dimensional case, which arise in the infinite dimensional case.
Now, what are the main branches of nonlinear analysis in finitely many dimensions? They may be listed under five general headings:
1. Elementary calculus.
2. The implicit function theorem and related results.
3. Topological principles for establishing the existence of solutions to systems of equations: the Brouwer fixed point theorem, the theory of degree, the Jordan separation theorem, and, more generally, the Lefschetz fixed point theorem and the general topological intersection theory.
4. Topological theories for establishing the existence of critical points: the Morse critical point theory, and the Lusternik-Schnirelman "category" theory.
5. Theorems following by the powerful special methods of complex function theory.
We shall find infinite dimensional generalizations of theorems belonging to each of these five categories.
1. Elementary calculus goes over to B-spaces (and even slightly more general spaces) in a routine way. Integration theory is developed for vector-valued functions defined on a measure space in Linear Operators, Chapter 3, and contains no surprises. The proper notion of derivative (as already in two-dimensional spaces) is that of directional derivative or Gateaux derivative, which may be defined as follows. Let $\phi$ be a function mapping one B-space $X$ into another $Y$. Then if, for each $x, y \in X$, the function $\phi(x + ty)$ of the real variable $t$ is differentiable at $t = 0$, we say that $\phi$ is (Gateaux) differentiable, and write
$$d\phi(x; y) = \frac{d}{dt}\,\phi(x + ty)\Big|_{t=0}.$$
A certain amount of basically elementary and unsurprising real variable theory is connected with this notion. Thus, for instance, under suitable hypotheses $d\phi(x; y)$ is linear in $y$; so that we may, if we like, speak of the derivative $d\phi(x)$ as a linear operator mapping $X$ into $Y$. Rather than study this elementary calculus for its own sake, we will develop results which belong to it as needed for other purposes.
2. The implicit function theorem in B-spaces exists in two versions. On the
one hand, we have the classical "soft" implicit function theorem, which states that if $\phi$ is a mapping of a B-space $X$ into a space $Y$, if $\phi(0) = 0$, and if $\phi$ is continuously differentiable and $\phi'(0)$ is a bounded operator with a bounded inverse, then $\phi$ maps a neighborhood of zero (in $X$) homeomorphically onto a neighborhood of zero (in $Y$). This basic version of the theorem has several interesting variants, one of which is the so-called theory of "monotone" mappings. Another class of theorems closely related to this implicit function theorem form the so-called "bifurcation theory". The main
idea of this latter theory may be explained as follows. Suppose that the solutions of a functional equation $\phi(x) = 0$ are to be studied in the vicinity of a given solution $x = 0$. (Here, $\phi$ is taken to be a differentiable mapping of a B-space $X$ into itself.) We may write $\phi(x) = x + Kx + \psi(x)$, where $|\psi(x)| = O(|x|^2)$ for $x$ near 0, and where $K$ is a linear transformation. If $(I + K)^{-1}$ exists as a bounded operator, then, by the implicit function theorem, $x = 0$ is an isolated zero. In the bifurcation theory, we suppose only that $K$ is compact, and wish to consider the case in which $(I + K)^{-1}$ does not exist. In this case, it follows by the Riesz theory of compact operators that $X$ decomposes as a direct sum $X = Y \oplus Z$ of two subspaces, both invariant under $K$, the second being finite dimensional, such that $(I + K)$ is a bounded mapping of $Y$ onto itself having a bounded inverse. Correspondingly, we may write $x = [y, z]$, and write the equation $\phi(x) = 0$ as a pair of equations:
$$\phi_1(y, z) = (I + K)y + \psi_1(y, z) = 0,$$
$$\phi_2(y, z) = (I + K)z + \psi_2(y, z) = 0.$$
By the implicit function theorem, the first equation may be solved for $y$ in terms of $z$: $y = Y(z)$. Substituting this solution into the second equation, we find that the solutions of the original equation $\phi(x) = 0$ are in one-to-one correspondence with the solutions of the equation $(I + K)z + \psi_2(Y(z), z) = 0$. This last equation, however, may be regarded as a finite system of equations in a finite number of variables, upon which all the resources of finite-dimensional analysis may be brought to bear.
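The reduction just described can be tried out on a toy problem in the plane. The following is only a sketch under assumptions of our own, not an example from the notes: we take $X = R^2$, $K = \mathrm{diag}(0, -1)$ (so that $I + K$ is invertible on the $y$-axis and vanishes on the $z$-axis), and hypothetical quadratic terms `psi1`, `psi2`. The helper names `solve_Y` and `reduced` are likewise invented for the illustration.

```python
def psi1(y, z):
    # Hypothetical higher-order term in the first equation, O(|x|^2).
    return y * z - z * z

def psi2(y, z):
    # Hypothetical higher-order term in the second equation, O(|x|^2).
    return z * z - y * y

def phi(y, z):
    """phi(x) = x + Kx + psi(x) on R^2 with K = diag(0, -1),
    so that (I + K)[y, z] = [y, 0]."""
    return (y + psi1(y, z), 0.0 * z + psi2(y, z))

def solve_Y(z, iters=200):
    """Solve the first equation y + psi1(y, z) = 0 for y by iterating
    y <- -psi1(y, z); for |z| < 1 this iteration is a contraction in y."""
    y = 0.0
    for _ in range(iters):
        y = -psi1(y, z)
    return y

def reduced(z):
    """Left side of the reduced equation (I + K)z + psi2(Y(z), z) = 0.
    Here (I + K)z = 0, so only the psi2 term remains."""
    return psi2(solve_Y(z), z)

for z in (0.0, 0.1, 0.3):
    y = solve_Y(z)
    print(z, y, phi(y, z), reduced(z))
```

For this particular choice the first equation gives $Y(z) = z^2/(1+z)$, and the reduced function is a single real function of the single real variable $z$. Its zeros correspond exactly to the zeros of `phi` lying on the curve $y = Y(z)$.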
An introduction to the theory of bifurcation as outlined above may be found in Graves' article Remarks on singular points of functional equations, Trans. Amer. Math. Soc., V. 79, 150-157 (1955).
In addition to the "soft" version of the implicit function theorem described above, there exists in the functional-analytic case a separate "hard" version of the theorem. The precise statement of this second version of the implicit function theorem will be given in a later lecture. At present we shall only remark that this theorem applies even in cases where the Gateaux derivative of $\phi$ is unbounded as a linear operator, and has an unbounded linear inverse. The theorem is due to J. Nash: The imbedding problem for Riemannian manifolds, Ann. Math. 63, pp. 20-63 (1956). J. Moser (A new technique for the construction of solutions of nonlinear differential equations, Proc. Nat. Acad. Sci. U.S.A., V. 47, 1961, pp. 1824-1831) made the useful observation that Nash's "hard" version of the implicit function theorem could be proved by an appropriate modification of the "Newton's Method" of finite dimensional analysis, the superrapid convergence of Newton's method compensating, in an appropriate sense, for the unboundedness of the Fréchet derivative and its inverse. Moser has subsequently made interesting applications and extensions of this basic idea, cf. Moser: On invariant curves of area preserving mappings of an annulus, Göttinger Nachrichten, 1962, pp. 1-20, and subsequent publications. Cf. also a lecture by Serge Lang in the 1962 Séminaire Bourbaki.
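The "superrapid convergence" of Newton's method referred to above can already be observed in one real variable. The sketch below is an illustration only and is not taken from the notes: the helper `newton_errors` and the test equation $x^2 - 2 = 0$ are assumptions chosen for the example. It records the error after each step, so that one can watch the error being roughly squared from one step to the next.

```python
import math

def newton_errors(f, df, x0, root, steps):
    """Run Newton's iteration x <- x - f(x)/df(x) from x0 and
    record |x - root| before the first step and after each step."""
    x = x0
    errors = [abs(x - root)]
    for _ in range(steps):
        x = x - f(x) / df(x)
        errors.append(abs(x - root))
    return errors

# Hypothetical test problem: solve x^2 - 2 = 0 starting from x0 = 2.
errors = newton_errors(lambda x: x * x - 2.0,
                       lambda x: 2.0 * x,
                       2.0, math.sqrt(2.0), 5)
for n, e in enumerate(errors):
    print(n, e)
```

For this equation one has exactly $e_{n+1} = e_n^2 / (2x_n)$, so each error is bounded by the square of the previous one; it is this quadratic gain that is played off against the loss caused by unbounded derivatives in the "hard" theorem.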
3. Finite codimensional topology. The attentive reader will have observed that in our listing above of theorems of finite dimensional topology we have separated these theorems into two groups. This separation, somewhat unnatural in the finite dimensional case, is essential in the infinite dimensional case. Consider, for example, the Brouwer fixed point theorem. As is well-known, this theorem is equivalent to the statement that the boundary of the unit sphere is not continuously deformable to a point on itself. In infinite dimensions, however, this statement is false. E.g., if we examine the boundary $\partial S$ of the unit sphere in the Hilbert space $L_2(0, 1)$, and follow the homotopy $f(x) \to f_t(x)$, $1 \ge t \ge 1/2$, where
$$f_t(x) = \begin{cases} t^{-1/2} f(x/t), & 0 \le x \le t, \\ 0, & t < x \le 1, \end{cases}$$
by the homotopy $f_{1/2}(x) \to t f_{1/2}(x) + \sqrt{1 - t^2}\,\sigma(x)$, $1 \ge t \ge 0$, where $\sigma \in \partial S$ and $\sigma(x) = 0$ for $0 \le x \le 1/2$, we obtain a continuous deformation of $\partial S$ along itself, to the single point $\sigma$. This implies a set of topological consequences rather different from the corresponding results in finite dimensions.
We owe to Schauder and Leray the important observation that the most familiar results of finite dimensional topology can be carried over to infinitely many dimensions if attention is restricted to the special category of maps $\phi$ having the form $\phi = I + \psi$, where $I$ is the identity, and $\psi$ is a mapping whose range is compact. Thus, for instance, if we confine our attention to this special category of maps, the boundary of the unit sphere is not continuously deformable to a single point along itself. Moreover, again for maps of this category, a straightforward generalization of the finite dimensional theory of degree can be established, and infinite dimensional generalizations of many of the basic theorems of finite dimensional topology obtained. As a basic reference, see Schauder and Leray: Topologie et équations fonctionnelles, Ann. Sci. École Norm. Sup. (3) 51 (1934) pp. 45-78. The infinite dimensional theory of degree is based upon the finite dimensional theory, which we shall develop by a simplified method patterned after the procedure of Heinz: An elementary analytic theory of degree in n-dimensional space, J. Math. Mech. 8 (1959) pp. 231-247.
An especially useful theorem belonging to this circle of ideas is the fixed point theorem of Schauder: any continuous mapping into itself of a compact convex set in a locally convex linear topological space possesses a fixed point. Cf. Schauder: Der Fixpunktsatz in Funktionalräumen, Studia Math. 2 (1936) pp. 171-180. Krein and Rutman (Uspekhi Math. Nauk 3, No. 1, pp. 3-95) give an interesting application of fixed point theory to the "projective space" of a B-space. A connected account of many of the principal results in the type of functional topology discussed above is given by A. Granas: The theory of compact vector fields and some of its applications to topology of functional spaces (I). Rozprawy Mat. XXX, Warsaw 1962. Granas lays stress on the homotopy theory of compact maps and on the Borsuk antipodal point theorem, but avoids the theory of degree.
4. Finite dimensional topology. The second category of topological results available in functional spaces is distinguished by the fact that it makes reference to the ordinary singular homology and cohomology groups, defined similarly in the functional case and in the finite dimensional case. The Morse theory and the Lusternik-Schnirelman theory both begin with the same construction. A manifold is defined to be a topological space locally homeomorphic to a given B-space in such a way that the "transition mappings" between the various "local coordinate patches" which cover the manifold are infinitely often differentiable. On such a manifold, all the ordinary local notions of analysis such as directional derivative, differentiable function, etc., are available. Let $M$ be such a manifold, and let $f$ be a smooth real-valued function defined on $M$. A critical point of $f$ is by definition a point in $M$ at which the directional derivative of $f$ in every direction vanishes. If $M$ has a Riemannian metric, we may in the usual way define the gradient $\nabla f$ of $f$, which is a field of vectors tangent to $M$; in this case, the critical points of $f$ are the points $p$ where $\nabla f(p) = 0$. If there exists no critical point $p$ of $f$ such that $a \le f(p) \le b$ (and assuming certain additional, technical hypotheses), then the subsets $M_a = \{q \in M \mid f(q) \le a\}$ and $M_b = \{q \in M \mid f(q) \le b\}$ are diffeomorphic. To see this, we have only to note that if each point $q$ such that $a \le f(q) \le b$ is pushed down in the direction of the gradient field $\nabla f$ until $M_a$ is reached, we obtain the desired diffeomorphism. This statement is the first main lemma of the Morse theory. The second observation on which the Morse theory is built gives a corresponding result for the case in which $\{q \in M \mid a \le f(q) \le b\}$ contains an isolated set of critical points. In this case, and under the further assumption that the critical points are all nondegenerate
in an appropriate sense, a closer analysis shows that the space $M_b$ is diffeomorphic to a space $M_a \cup H$ obtained from $M_a$ by affixing a certain collection of "handles". Thus the sequence of critical points of $f$ describes the construction of $M$ by the successive addition of "handles" to a "ball". This connection may be exploited in either of two directions: to conclude from the known topology of $M$ that any function $f$ defined on $M$ must admit critical points of certain numbers and types, or, conversely, to deduce information about the topology of $M$ from a knowledge of the critical points of some particular function on $M$.
If we let $M$ be the space of all smooth curves on a finite dimensional manifold $N$, and regard $M$ in an appropriate way as being an infinite dimensional manifold, then the general Morse theory outlined above reduces to the special Morse theory of geodesics.
A lucid account of the Morse theory, especially in the finite-dimensional case, is to be found in Milnor: Morse theory, Ann. of Math. Studies 51, Princeton 1963. The generalization to infinite dimensional manifolds is developed in Palais: Lectures on Morse theory, Notes, Harvard, 1963, to be republished in 1964 in the journal Topology. Palais gives the application of the general theory to the Morse geodesic theory, developing in detail an account of the necessary compactness properties of the infinite dimensional manifold $M$ and the function $f$ on it. Further applications of the general theory to establish the existence of higher type critical structures in theories of minimal surfaces, etc., are to be hoped for. We may also refer to a set of notes, entitled Lectures of Smale on Differential Topology (Columbia, 1963). These notes give extensions of various qualitative theorems of finite-dimensional differential topology to the infinite dimensional case.
The Lusternik-Schnirelman theory of critical points agrees with the Morse theory in making use of the deformations along gradient curves on a manifold $M$. However, the methods of Lusternik-Schnirelman are more point set theoretic than those of Morse, and lead to more general but less precise results. If $A$ and $M$ are topological spaces, and $\phi$ maps $A$ into $M$ and is continuous, call $\phi$ a map of category 1 if it is homotopic to a constant map, and call $\phi$ a map of category $k$ if $A$ can be divided into $k$ sets $A_1, \ldots, A_k$, but no fewer, such that each $\phi \mid A_i$ is of category 1. If $A \subset M$, the category $\mathrm{cat}(A)$ is the category of the identity map of $A$ into $M$. It is not hard to establish that $\mathrm{cat}(A) - 1$ is a lower bound for the topological dimension of $A$. If $f$ is a real valued function defined on $A$, and $m \le \mathrm{cat}(A)$, put
$$c_m(f) = \inf_{\mathrm{cat}(B) \ge m} \, \sup\,\{f(x) \mid x \in B\}.$$
Then $c_1(f) \le c_2(f) \le \cdots$. It may be shown, under suitable compactness hypotheses, that for each $m \le \mathrm{cat}(A)$ there exists a set $B_m \subset A$ such that $\mathrm{cat}(B_m) = m$, and $\sup\,\{f(x) \mid x \in B_m\} = c_m(f)$. Were it the case that $A$ contained no critical point $q$ with $f(q) = c_m(f)$, we could push all the points $p \in B_m$ down in the direction of the gradient field $\nabla f$, obtaining a set $\tilde{B}_m$ of category $m$ such that $\sup\,\{f(x) \mid x \in \tilde{B}_m\} < c_m(f)$, a contradiction. Thus we see that each value $c_m(f)$ must be a critical value of $f$. A refinement of this argument shows that if $c_n(f) = c_m(f) = c$, then $\{x \in A \mid f(x) = c, \text{ and } \nabla f(x) = 0\}$ must be of category at least $m - n + 1$. Thus any smooth function on $A$ must admit at least $\mathrm{cat}(A)$ critical points. This last result makes it important to be able to establish lower bounds for the category of a space. We will see in a subsequent lecture that such results follow from an analysis of the singular cohomology ring of a topological space. For an introductory account of the theory of category and some of its applications, cf. Lusternik and Schnirelman, Méthodes topologiques dans les problèmes variationnels, Gauthier-Villars, Paris, 1934.
Chapter VIII of the present notes, generously contributed by Dr. Hermann Karcher*, gives an account of some of the Morse Theory of closed geodesics on manifolds which are topological spheres, according to methods stemming from Klingenberg.
* The work of H. Karcher was supported at the Courant Institute of Mathematical Sciences, New York University, by the National Science Foundation under Grant NSFGR8114.
5. The complex analytic case. A few results applying specifically to complex analytic functional mappings between complex linear spaces are known. In the first place, one has the usual elementary results guaranteeing the power series expansion of complex analytic mappings, etc. The bifurcation theory, where applicable, shows that the set of zeroes of an analytic functional equation $\phi(x) = 0$ is in one-to-one bianalytic correspondence with the set of zeroes of a similar set of analytic equations in a finite number of complex variables. A good deal is known about the structure of such analytic varieties, and the bifurcation theory enables one to carry all this information over to the functional case.
It follows readily from the definition of degree, in the cases where this definition is applicable, that the degree of a complex analytic map $x \to \phi(x)$ near any isolated zero is nonnegative. According to an interesting theorem of Jane Cronin (cf. Cronin: Analytic Functional Mappings, Ann. Math. 58 (1953) pp. 175-181) the degree of such a zero is actually positive. This result, combined with the results available from the general theory of degree, leads to a principle of permanence of zeroes that generalizes the well-known theorem of Rouché to the functional case.
6. Miscellany: In addition to the five principal categories of results outlined above, a variety of miscellaneous special results must be included in our subject. These will be noted as they arise in our subsequent lectures. In the present introduction, we shall note the work of Hammerstein (cf. Nichtlineare Integralgleichungen nebst Anwendungen, Acta Math., V. 54 (1930) pp. 117-176) on integral equations, in which the order properties of the integral operators studied are exploited. This work is related to the theory of monotone operators alluded to above. We may also mention the existence of various investigations, notably those of E. Rothe, devoted to the variational method in functional analysis, i.e., to the possibility of solving functional equations $\phi(x) = 0$ by casting them into the form $\psi(x) = \min$, where $\psi$ is an appropriately selected functional.
While the literature on the subject of the present course of lectures is somewhat scattered, a number of useful books have appeared. We mention in the first place the book of Krasnoselskii: Topological methods in the theory of nonlinear integral equations, Moscow, 1956, 392 pp. An English translation of a related survey article by Krasnoselskii appears in the AMS Translations, Ser. 2, No. 10, pp. 345-409. Krasnoselskii gives a good account of the available information on continuity and compactness of nonlinear integral operators of various forms, a good summary account of a number of other important topics in nonlinear theory, as well as an extensive bibliography. A less closely related, but still relevant work is the article Functional analysis and applied mathematics by Kantorovic in Uspekhi Math. Nauk 3 (No. 6) (1948), pp. 89-185, as well as this author's treatise Approximate methods of higher analysis. An account of the differential calculus in B-spaces is to be found in the book of Michal: Le calcul différentiel dans les espaces de Banach (V. I, Fonctions analytiques - Équations intégrales), Gauthier-Villars, 1958 (150 pp.), in the well-known treatise by Hille and Phillips on semigroups, and in the texts of advanced calculus by Dieudonné and by Serge Lang.
The reader wishing to extend his knowledge of nonlinear functional analysis beyond the necessarily limited material contained in the present notes will find it useful to consult the comprehensive survey article by James Eells: A Setting for Global Analysis, Bull. Amer. Math. Soc. v. 72, 1966, pp. 751-809. This excellent review may also serve as a guide to the literature of the subject.
CHAPTER I
Basic Calculus
A. Some Definitions and a Lemma on Topological Linear Spaces
B. Elementary Calculus
C. The "Soft" Implicit Function Theorem
D. The Hilbert Space Case
E. Compact Mappings
F. Higher Differentials and Taylor's Theorem
G. Complex Analyticity
H. Derivatives of Quadratic Forms
A. Some Definitions and a Lemma on Topological Linear Spaces
1.1. Definition: We say that $E$ is a topological linear space (T.L.S.) if $E$ is a linear space which is given a topology such that addition and multiplication by scalars are continuous functions, i.e.: $+ : E \times E \to E$ and $\cdot : E \times R \to E$ are continuous functions, where $E \times E$ and $E \times R$ have the product topology.
1.2. Definition: Let $E$ be a T.L.S. We say that $E$ is locally convex if there exists a family of convex sets $\{U\}$ which is a basis for the family of neighborhoods of 0.
(A set $K$ is called convex iff $x, y \in K$ implies $tx + (1 - t)y \in K$ for every $t \in [0, 1]$.)
1.3. Definition: A T.L.S. will be called an F-space, or Fréchet space, if, as a topological space, it is metric and complete, with a topology given by a "norm" function $|x|$ which satisfies: (i) $|x|$ real, $\ge 0$; (ii) $|x| = 0$ iff $x = 0$; (iii) $|x + y| \le |x| + |y|$. (See Linear Operators*, Chapter 2.) We shall write L.C.F.-space for locally convex F-spaces.
* Linear Operators, Nelson Dunford and Jacob T. Schwartz, Wiley-Interscience, Vol. I, 1958, Vol. II, 1963.
1.4. Definition: A T.L.S. will be called a Banach space, or a B-space, iff it is complete and its topology is given by a norm, which in addition to conditions (i), (ii) and (iii) of the above definition, satisfies (iv) $|\lambda x| = |\lambda|\,|x|$. The spaces with which we shall ordinarily deal are L.C.F.-spaces.
1.5. Definition: Let $E$ be an F-space. We say that $K \subset E$ is bounded if for any neighborhood $U$ of 0 there exists $\varepsilon > 0$ such that $\varepsilon K \subset U$. This condition is easily seen to be equivalent to the following one: $\varepsilon_n \to 0$ and $k_n \in K$ implies $\varepsilon_n k_n \to 0$. We shall now prove a lemma relating L.C.F.-spaces to B-spaces.
1.6. Lemma: An L.C.F.-space is a B-space if it contains a bounded open set.
First, we note that boundedness is unaffected by translations, so we can assume that $0 \in U$, where $U$ is bounded and open. By definition of an L.C.F.-space, $U$ will contain a convex neighborhood $U'$ of 0, which a fortiori is bounded. Now, we can replace $U'$ by $V = U' \cap (-U')$, which is also convex, bounded, and a symmetric neighborhood of 0, i.e., $V = -V$. By the definition of a bounded set, the family $\{\varepsilon V\}$, $\varepsilon$ real $> 0$, is a neighborhood basis at 0. We consider now the support function of $V$, $p(x)$, defined by
$$p(x) = \Bigl[\,\sup_{tx \in V} |t|\,\Bigr]^{-1}.$$
(Obviously if in a B-space $V$ is the unit sphere, $p(x) = |x|$.) The function $p(x)$ has the four properties of a norm function:
(i) $p(x)$ real and $\ge 0$. Obvious. It is a finite number because $V$ is absorbing.
(ii) $p(x) = 0 \Leftrightarrow x = 0$. If $p(x) = 0$, $\sup_{tx \in V} |t| = \infty$, and this means that $tx \in V$ for all $t$, because $V$ is convex. Hence $x \in \frac{1}{t} V$ for all $t > 0$, whence $x$ is in every neighborhood of 0, since $\{\frac{1}{t} V\}$ is a neighborhood basis. Therefore $x = 0$, because the space is Hausdorff.
(iii) $p(x + y) \le p(x) + p(y)$. It is apparent that $p(x)$ can be defined by $p(x) = \inf\,\{t : t > 0,\ x \in tV\}$. Let $\alpha, \beta > 0$ be such that $x \in \alpha V$ and $y \in \beta V$. Then $x + y \in \alpha V + \beta V$; since $V$ is convex, $\alpha V + \beta V = (\alpha + \beta)V$, whence $x + y \in (\alpha + \beta)V$. Therefore
$$\inf_{t > 0,\ x + y \in tV} t \;\le\; \inf_{t > 0,\ x \in tV} t \;+\; \inf_{t > 0,\ y \in tV} t,$$
which is (iii).
(iv) $p(\alpha x) = |\alpha|\,p(x)$. If $\alpha > 0$, it is easy to see that $p(\alpha x) = \alpha\,p(x)$. But since $V$ is symmetric, (iv) holds for any $\alpha$, since $\alpha V = -\alpha V$.
Next we note that $V = \{x \mid p(x) < 1\}$ if we assume $V$ to be open. For then clearly $x \in V$ implies $p(x) < 1$. Also $p(x) < 1$ implies $x \in tV$ for some $t < 1$, i.e., $x = tv$, $v \in V$; since $V$ is convex, $x \in V$. We see at once then that $\varepsilon V = \{x \mid p(x) < \varepsilon\}$. Therefore $p(x)$ is continuous at 0, and consequently at every $x$.
Conversely, for any $\varepsilon > 0$, there exists $\delta > 0$ such that $p(x) < \delta$ implies $|x| < \varepsilon$. Simply choose $\delta$ so that $\delta V \subset S_\varepsilon$, where $S_\varepsilon = \{x \mid |x| < \varepsilon\}$. This is possible since $\{\varepsilon V\}$ is a neighborhood basis at 0. Thus we have shown that $|\ |$ and $p(x)$ determine the same topology; hence $p(x)$ is the required norm. Q.E.D.
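The function $p(x)$ constructed in the proof of Lemma 1.6 can be computed numerically in a simple case. The sketch below is an illustration under our own assumptions and is not part of the notes: $V$ is taken to be an open ellipse in the plane (convex, symmetric, bounded, containing 0), and the names `in_V` and `gauge` are hypothetical. The value $p(x) = \inf\{t > 0 : x \in tV\}$ is found by bisection on $t$.

```python
import math

def in_V(x, a=2.0, b=0.5):
    """Membership test for a hypothetical convex, symmetric, bounded open set V:
    the open ellipse (x1/a)^2 + (x2/b)^2 < 1 in the plane."""
    return (x[0] / a) ** 2 + (x[1] / b) ** 2 < 1.0

def gauge(x, member=in_V, hi=1e6, iters=200):
    """Approximate p(x) = inf{t > 0 : x in tV} by bisection on t.
    This relies on V being convex with 0 in V, so that x lies in tV
    for every t above p(x) and for no t below it."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid > 0 and member((x[0] / mid, x[1] / mid)):
            hi = mid   # x lies in mid*V, so p(x) <= mid
        else:
            lo = mid   # x is outside mid*V, so p(x) >= mid
    return hi

x, y = (1.0, 0.25), (-0.5, 0.4)
s = (x[0] + y[0], x[1] + y[1])
print(gauge(x), gauge(y), gauge(s))
```

For the ellipse the gauge is known in closed form, $p(x) = \sqrt{(x_1/a)^2 + (x_2/b)^2}$, which gives a check on the bisection; the triangle inequality (iii) and the homogeneity (iv) can be checked on sample points in the same way.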
B. Elementary Calculus
1.7. Definition: Let $X$ and $Y$ be T.L.S. Let $U$ be an open subset of $X$ and $f : U \to Y$. We say that $f$ has a Gateaux derivative $df(x, y)$ at $x \in U$ iff
$$df(x, y) = \frac{d}{dt}\, f(x + ty)\Big|_{t=0}$$
exists for every $y \in X$.
We call this derivative the derivative of $f$ at $x$ in the direction $y$, and shall write it often as $(df(x))(y)$ or $(f'(x))(y)$.
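In finite dimensions the Gateaux derivative of Definition 1.7 is the ordinary directional derivative, and it can be approximated by a difference quotient in $t$. The following sketch is illustrative only; the map `f`, the helper `gateaux`, and the hand-computed `jacobian_times` are assumptions introduced for the example, not objects from the notes.

```python
import math

def f(x):
    """Hypothetical smooth map from R^2 to R^2, used only for illustration."""
    return (x[0] ** 2 * x[1], math.sin(x[0]) + x[1])

def gateaux(f, x, y, h=1e-6):
    """Central-difference approximation to d/dt f(x + t*y) at t = 0."""
    plus = f(tuple(xi + h * yi for xi, yi in zip(x, y)))
    minus = f(tuple(xi - h * yi for xi, yi in zip(x, y)))
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))

def jacobian_times(x, y):
    """Exact derivative of f at x applied to the direction y, computed by hand."""
    return (2.0 * x[0] * x[1] * y[0] + x[0] ** 2 * y[1],
            math.cos(x[0]) * y[0] + y[1])

x, y = (1.0, 2.0), (0.3, -0.7)
print(gateaux(f, x, y))
print(jacobian_times(x, y))
```

Since this `f` is continuously differentiable, the two printed vectors should agree up to the discretization error of the difference quotient, and the result is linear in the direction `y`.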
1.8. Definition: Let $X$ and $Y$ be T.L.S. and let $\phi : U \to Y$, where $U$ is a neighborhood of 0 in $X$. We say that $\phi$ is horizontal at 0 if for each neighborhood $V$ of 0 in $Y$ there exists a neighborhood $U'$ of 0 in $X$, and a function $o(t)$ such that $\phi(tU') \subset o(t)V$. (Here $o(t)$ denotes a real function with $o(t)/t \to 0$ as $t \to 0$.)
1.9. Definition: Let $X$, $Y$ be T.L.S. and $U$ open in $X$. Let $f : U \to Y$ and $x_0 \in U$. We say that $f$ is Fréchet differentiable, or F-differentiable, at $x_0$, if there exists a continuous linear map $A : X \to Y$ such that if we write
$$f(x_0 + y) = f(x_0) + Ay + \phi(y)$$
then $\phi$ is horizontal at 0. We call $A$ the derivative of $f$ at $x_0$, and we write it $df(x_0, y)$ as in Definition 1.7.
1.10. Remark: If the spaces are B-spaces, then the definition of a function horizontal at 0 is equivalent to
$$|\phi(x)| \le |x|\,\psi(x)$$
where $\psi$ is real valued and $\lim_{x \to 0} \psi(x) = 0$. Thus, in a B-space, the condition of F-differentiability can be expressed as follows:
$$f(x_0 + y) = f(x_0) + Ay + o(|y|).$$
1.11. Remark: If a linear function $A$ is horizontal at 0, then $A = 0$, as follows at once from the definition. Thus we see that the F-derivative of a function is unique, because if
$$f(x_0 + y) = f(x_0) + Ay + \phi(y) \quad\text{and}\quad f(x_0 + y) = f(x_0) + By + \phi'(y)$$
where $A$ and $B$ are continuous and linear, and $\phi$ and $\phi'$ are horizontal at 0, then $A - B = \phi' - \phi$ is horizontal at 0 (a sum or difference of functions horizontal at 0 is still horizontal at 0), whence $A = B$.
1.12. Remark: The domain of $f$ in the definition of G-differentiability can be assumed to be a "finitely open set in $X$", where $X$ is simply a linear space.
Also, for complex spaces, it is easy to see that the Gateaux derivative is always linear in $y$, and that the hypothesis of linearity is also unnecessary for F-derivatives. (Cf. Hille and Phillips [1], Sections 3.13 and 26.3.) In the case $X$ is a B-space, it is easy to show that F-differentiability implies Gateaux differentiability.
1.13. Lemma: Let $f : U \to Y$, where $U$ is open in a B-space $X$ and $Y$ is a T.L.S. Then if $f$ has an F-derivative at $x_0$, it also has a Gateaux derivative at $x_0$, and they are equal.
Proof: We write
$$f(x_0 + ty) - f(x_0) = tAy + o(|ty|),$$
where $A$ is linear and continuous. But $o(|ty|) = o(|t|\,|y|)$, and hence
$$\lim_{t \to 0} \frac{1}{t}\bigl(f(x_0 + ty) - f(x_0)\bigr) = Ay.$$
Q.E.D.
The next lemma gives the chain rule for F-derivatives.
1.14. Lemma: If $f : U \to V$ is F-differentiable at $x_0$, and $g : V \to W$ is F-differentiable at $f(x_0)$, then $g(f(x))$ is F-differentiable at $x_0$, and its derivative is given by:
$$(d(g \circ f))(x_0, y) = dg\bigl(f(x_0), df(x_0, y)\bigr).$$
Here, $U$, $V$ and $W$ are open sets contained in $X$, $Y$, $Z$ which are T.L.S.
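The formula of Lemma 1.14 can be checked numerically in finite dimensions, where both sides reduce to directional derivatives. The sketch below is an illustration under our own assumptions: the maps `f` and `g` and the helper `directional` are hypothetical, and derivatives are approximated by central differences rather than computed exactly.

```python
import math

def directional(F, x, y, h=1e-6):
    """Central-difference approximation to d/dt F(x + t*y) at t = 0.
    F takes and returns tuples of floats."""
    plus = F(tuple(a + h * b for a, b in zip(x, y)))
    minus = F(tuple(a - h * b for a, b in zip(x, y)))
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))

def f(x):
    # Hypothetical inner map R^2 -> R^2.
    return (x[0] * x[1], x[0] + math.exp(x[1]))

def g(u):
    # Hypothetical outer map R^2 -> R^1, returned as a 1-tuple.
    return (math.sin(u[0]) * u[1],)

def gf(x):
    return g(f(x))

x0, y = (0.5, -0.2), (1.0, 2.0)
lhs = directional(gf, x0, y)                         # d(g o f)(x0, y)
rhs = directional(g, f(x0), directional(f, x0, y))   # dg(f(x0), df(x0, y))
print(lhs, rhs)
```

The two printed values should agree up to the error of the difference quotients, which is what the chain rule asserts for this pair of maps.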
Proof: We have only to write:
$$g[f(x_0 + y)] = g[f(x_0) + df(x_0, y) + \phi(y)] = g[f(x_0)] + dg[f(x_0), df(x_0, y)] + dg[f(x_0), \phi(y)] + \psi[df(x_0, y) + \phi(y)],$$
where $\phi$ and $\psi$ are horizontal at 0. It is easy to see that the last term is horizontal at 0 as a function from $X$ to $Z$. We note that if $\phi$ is horizontal at 0, and if $A$ is linear and continuous, then $A \circ \phi$ is also horizontal at 0. This follows immediately from the definition of horizontality. Thus $dg[f(x_0), \phi(y)]$ is horizontal at 0, and the derivative of $g(f(x))$ at $x_0$ is $dg[f(x_0), df(x_0, y)]$. Q.E.D.
We next prove another lemma relating Gateaux and F-differentiability.
1.15. Lemma: Let $X$ and $Y$ be B-spaces, $U$ open in $X$ and $f : U \to Y$. If $f$ has a Gateaux derivative $f'(x, y)$ in $U$, which is linear in the variable $y$, and if, when regarded as a linear operator, $f'(x)$ is bounded for $x \in U$ and depends continuously on $x$ in the uniform topology, then $f$ is F-differentiable in $U$.
Proof: Our point of departure is the formula
$$\frac{d}{dt}\, f(x + ty) = f'(x + ty)(y)$$
which one can prove easily. It follows that
$$f(x + y) = f(x) + \int_0^1 f'(x + ty)(y)\,dt = f(x) + f'(x)(y) + \int_0^1 [f'(x + ty) - f'(x)](y)\,dt.$$
But now:
$$\Bigl|\int_0^1 [f'(x + ty) - f'(x)](y)\,dt\Bigr| \le |y| \int_0^1 |f'(x + ty) - f'(x)|\,dt = |y|\,o(1) = o(|y|).$$
Q.E.D.
1.16. Remark: In the last lemma we integrated functions of a real (or complex) variable with values in a B-space (cf., for example, Hille and Phillips [1], Chapter III). The basic fact we used was that
\[ \left| \int_S f(s)\,d\mu(s) \right| \le \int_S |f(s)|\,d\mu(s). \]
This is not true in general for F-spaces, because its proof depends upon the inequality $|\sum_i a_i f_i| \le \sum_i |a_i|\,|f_i|$. In L.C.F. spaces, however, one can define weak integration by appropriate use of linear functionals. (Cf. loc. cit.) Local convexity implies separation theorems which assure the uniqueness of the integral.
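1.17. Lemma: (Contracting Mapping Principle.) Let $X$ be a complete metric space and $\phi : U \to X$, $U$ open in $X$, and assume $\rho(\phi(x), \phi(y)) \le \alpha\,\rho(x, y)$ with $0 \le \alpha < 1$, where $\rho(x, y)$ is the distance between $x$ and $y$. Moreover, suppose there exists $z_0 \in U$ such that $\rho(z_0, X - U) > M$, and $\rho(z_0, \phi(z_0)) < M(1 - \alpha)$. Then there exists a fixed point $z_\infty = \phi(z_\infty)$ such that $\rho(z_0, z_\infty) < M$.
Proof: $\rho(z_0, \phi(z_0)) < M(1 - \alpha) < M < \rho(z_0, X - U)$, so $\phi(z_0)$ is also in $U$, and inductively $\phi^2(z_0), \dots, \phi^n(z_0), \dots$ are all in $U$, where $\phi^n(z_0) = \phi(\phi^{n-1}(z_0))$. The sequence $z_0, \phi(z_0), \dots, \phi^n(z_0), \dots$ is Cauchy, as follows from the contracting hypothesis. Hence we can set $z_\infty = \lim_{n \to \infty} \phi^n(z_0)$. By the continuity of $\phi$, $\phi(z_\infty) = z_\infty$. The formula
\[ \rho(z_0, \phi^n(z_0)) < M(1 - \alpha^n) \]
is easily proved by induction on $n$; more precisely, $\rho(z_0, \phi^n(z_0)) \le (1 - \alpha^n)\,\rho(z_0, \phi(z_0))/(1 - \alpha)$. Then $\rho(z_0, z_\infty) = \rho(z_0, \lim \phi^n(z_0)) = \lim \rho(z_0, \phi^n(z_0)) \le \rho(z_0, \phi(z_0))/(1 - \alpha) < M$. Q.E.D.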
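The iteration in this proof can be tried numerically. The following is a minimal sketch on the real line, not part of the original notes: the function name `fixed_point`, the tolerance, and the sample map $\phi(z) = \tfrac12 \cos z$ (a contraction with $\alpha = \tfrac12$) are all illustrative assumptions.

```python
import math

def fixed_point(phi, z0, alpha, tol=1e-12, max_iter=1000):
    """Iterate z_{n+1} = phi(z_n) until successive iterates differ by less than tol.

    alpha is the (assumed known) contraction constant; it is used only to
    report the a-priori bound rho(z0, z_inf) <= rho(z0, phi(z0)) / (1 - alpha).
    """
    z = z0
    first_step = abs(phi(z0) - z0)
    for _ in range(max_iter):
        z_next = phi(z)
        if abs(z_next - z) < tol:
            return z_next, first_step / (1.0 - alpha)
        z = z_next
    raise RuntimeError("no convergence; is phi really a contraction?")

# Hypothetical example: phi(z) = cos(z)/2 satisfies |phi'(z)| <= 1/2 on all of R.
phi = lambda z: 0.5 * math.cos(z)
z_inf, bound = fixed_point(phi, z0=0.0, alpha=0.5)
```

Here `bound` plays the role of a number just below $M$ in the lemma: the limit stays within that distance of the starting point.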
C. The "soft" Implicit Function Theorem
1.18. Lemma: Let $X$ be an F-space, $U$ the sphere $\{x : |x| < r\}$, and $\phi : U \to X$ such that $\phi(x) = x + \psi(x)$, where $\psi(x)$ satisfies:
\[ |\psi(x) - \psi(y)| \le \alpha\,|x - y| \quad \text{with } 0 \le \alpha < 1, \]
and
\[ \psi(0) = 0. \]
Then: (i) $\phi(U)$ covers a sphere of radius $r(1 - \alpha)$ about 0. (ii) $\phi$ is one-to-one and the inverse $\phi^{-1}$ satisfies a Lipschitz condition with constant $1/(1 - \alpha)$.
Proof: (i) We apply the last lemma to the function $f(x) = -\psi(x) + p$, where $p \in X$ and $|p| < r(1 - \alpha)$. If we put $z_0 = 0$, this inequality implies that $|z_0 - f(z_0)| = |p| < r(1 - \alpha)$. Hence there exists a point $z_\infty$ in $U$ such that $z_\infty = -\psi(z_\infty) + p$, i.e., $\phi(z_\infty) = p$. (ii) Suppose $\phi(x) = x + \psi(x) = p$ and $\phi(y) = y + \psi(y) = q$. Then $x - y + \psi(x) - \psi(y) = p - q$, and $|x - y| - |\psi(x) - \psi(y)| \le |p - q|$, so $(1 - \alpha)\,|x - y| \le |p - q|$, and we are done. Q.E.D.
1.19. Corollary: If $X$ is a B-space and if, in the notation of the above lemma, $\psi'(x)$ exists and $|\psi'(x)| \le \alpha < 1$ in $U$, and $\psi(0) = 0$, then (i) and (ii) are true.
Proof: We only have to note that
\[ |\psi(x) - \psi(y)| \le \int_0^1 |\psi'(y + t(x - y))|\,|x - y|\,dt \le \alpha\,|x - y|. \]
Q.E.D.
Now we can prove the following important theorem:
1.20. Theorem: (Implicit function theorem.) Let $X$, $Y$ be B-spaces and $\phi : U \to Y$, where $U$ is an open neighborhood of 0 in $X$ and $\phi(0) = 0$. Assume: (a) $\phi$ is F-differentiable in $U$. (b) $\phi'(x)$ depends continuously on $x$ in the uniform operator topology. (c) $\phi'(0)$ is a bounded linear map with a bounded linear inverse. Then $\phi$ maps a sufficiently small neighborhood of zero homeomorphically onto a neighborhood of zero.
Proof: Let $A = \phi'(0)$. We put $\eta = A^{-1} \circ \phi$. Then $\eta : U \to X$, $\eta$ has an F-derivative $\eta'(x)$ which is continuous in $x$ in the uniform operator topology, and $\eta'(0) = I$, the identity operator. Let $\psi = \eta - I$. Then $\psi'(0) = 0$, and
\[ \psi(x) - \psi(y) = \eta(x) - \eta(y) - (x - y) = \int_0^1 \eta'(y + t(x - y))(x - y)\,dt - (x - y) = \int_0^1 \bigl(\eta'(y + t(x - y)) - I\bigr)(x - y)\,dt. \]
Thus
\[ |\psi(x) - \psi(y)| \le |x - y| \int_0^1 |\eta'(y + t(x - y)) - I|\,dt. \]
But we can make the integral on the right of the last formula less than a fixed constant $\alpha < 1$, by taking $x$ and $y$ in a sufficiently small neighborhood $V \subset U$ of 0. Then the preceding lemma applies to $\eta$, and, a fortiori, $\phi = A\eta$ maps $V$ homeomorphically onto a neighborhood of 0 in $Y$. Q.E.D.
1.21. Corollary: Given the conditions of 1.20, the inverse map $\phi^{-1} : \phi(V) \to V$ is F-differentiable. Setting $\psi = \phi^{-1}$, we have, for $y \in \phi(V)$, the formula
\[ \psi'(y) = \bigl(\phi'(\phi^{-1}(y))\bigr)^{-1}. \]
Proof: If $\phi(x_1) = y_1$, $\phi(x_2) = y_2$, then
\[ |\psi(y_2) - \psi(y_1) - (\phi'(x_1))^{-1}(y_2 - y_1)| = |(\phi'(x_1))^{-1}\bigl(\phi'(x_1)(x_2 - x_1) - (y_2 - y_1)\bigr)| \le |(\phi'(x_1))^{-1}|\;|\phi'(x_1)(x_2 - x_1) - \phi(x_2) + \phi(x_1)|. \]
This last expression is $o(|x_2 - x_1|)$, whence the first expression is $o(|y_2 - y_1|)$, and the result follows. Q.E.D.
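The proofs of 1.20 and 1.21 are constructive enough to try on a scalar example. The sketch below is an illustration under stated assumptions, not the authors' procedure: the name `local_inverse`, the sample map $\phi(x) = 2x + x^2$, and the step sizes are hypothetical. It solves $\phi(x) = p$ by the fixed-point form $x = A^{-1}p - \psi(x)$ with $\psi = A^{-1}\phi - I$, then compares a difference quotient of the inverse with $(\phi'(\phi^{-1}(y)))^{-1}$.

```python
def local_inverse(phi, A, p, tol=1e-13, max_iter=200):
    """Solve phi(x) = p near 0 by iterating x <- x - (phi(x) - p) / A.

    This is the fixed-point form x = A^{-1} p - psi(x), psi = A^{-1} phi - I,
    from the proof of 1.20; A stands for phi'(0) (a nonzero scalar here).
    """
    x = 0.0
    for _ in range(max_iter):
        x_next = x - (phi(x) - p) / A
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("p is probably outside the region where the iteration contracts")

# Hypothetical scalar example: phi(x) = 2x + x^2, so phi(0) = 0 and phi'(0) = 2.
phi = lambda x: 2.0 * x + x * x
dphi = lambda x: 2.0 + 2.0 * x

y = 0.1
x = local_inverse(phi, A=2.0, p=y)

# Formula of 1.21: the derivative of the inverse at y is (phi'(phi^{-1}(y)))^{-1}.
h = 1e-6
numeric = (local_inverse(phi, 2.0, y + h) - local_inverse(phi, 2.0, y - h)) / (2 * h)
exact = 1.0 / dphi(x)
```

In this example the iteration map has derivative $-x$, so it contracts strongly near the origin, which is the situation the proof arranges by shrinking $V$.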
An induction argument easily yields the fact that if $\phi$ has derivatives of higher order (definition given later), then so does $\phi^{-1}$. Similarly, if $\phi$ depends continuously on some parameter, so does $\phi^{-1}$. The following theorem is a global version of the "local" implicit function theorem:
1.22. Theorem: Let $X$ and $Y$ be B-spaces, and $\phi : X \to Y$ a continuously F-differentiable function, and suppose $\phi'$ is invertible (as a linear operator) at every $x \in X$, and moreover, that $|[\phi'(x)]^{-1}| \le K < \infty$ uniformly in $x$. Then $\phi$ is a homeomorphism of $X$ onto $Y$.
The proof will depend on the following lemma:
1.23. Lemma: Under the same hypothesis as the theorem, if $\Delta$ is the square $0 \le s \le 1$, $0 \le t \le 1$, and if $F(s, t)$ satisfies the conditions:
(i) $F(s, t) : \Delta \to Y$.
(ii) $F(s, t)$ is continuous in $(s, t)$ and for every fixed $s$, $0 \le s \le 1$, $F(s, t)$ is F-differentiable in $t$.
(iii) $F(s, t)$ has fixed endpoints, i.e. there exist $y_0, y_1 \in Y$ such that $F(s, 0) = y_0$, $F(s, 1) = y_1$ for all $0 \le s \le 1$.
Then there exists a function $G(s, t)$ from $\Delta$ to $X$ which also satisfies (ii) and in addition $\phi[G(s, t)] = F(s, t)$ for all $(s, t) \in \Delta$.
Proof of the lemma: By the local implicit function theorem, there exist neighborhoods $V$ of $y_0$ and $U$ of $x_0$ (where $\phi(x_0) = y_0$) such that $\phi$ is a homeomorphism of $U$ onto $V$. Then, for sufficiently small $\varepsilon$, we can define $G(s, t)$ as $\phi^{-1}(F(s, t))$ if $0 \le t \le \varepsilon$ and for $0 \le s \le 1$. We call $a$ the largest of the values such that $G(s, t)$ can be defined in the rectangle $0 \le t < a$, $0 \le s \le 1$. Assume $a < 1$. If $G(s, t)$ is defined for $t = a$, consider the curve $G(s, a)$ and its image $\phi(G(s, a)) = F(s, a)$. For each $s$, $0 \le s \le 1$, we can select a neighborhood $U_s$ of $G(s, a)$ and a neighborhood $V_s$ of $F(s, a)$
such that $\phi$ is a homeomorphism of $U_s$ onto $V_s$. But $G(s, a)$, $0 \le s \le 1$, is compact, and therefore there exists a finite subcovering of the curve $G(s, a)$ with neighborhoods $U_{s_i}$, $i = 1, \dots, n$. In each of these neighborhoods, we can define the function $G(s, t)$ for all $s$ and $0 \le t < a + \varepsilon_{s_i}$ by the local implicit function theorem. So $G(s, t)$ can be defined for the rectangle $0 \le s \le 1$, $0 \le t < a + \min_i \varepsilon_{s_i}$, contradicting the fact that $a$ was the largest of such numbers. Now $G(s, t)$ must be F-differentiable in $t$, for $F(s, t)$ satisfies (ii) and $\phi^{-1}$ is locally F-differentiable. By the chain rule, we have for all $0 \le s \le 1$:
\[ \phi'[G(s, t)]\,G'(s, t) = F'(s, t), \]
where the prime denotes differentiation with respect to $t$. So:
\[ G'(s, t) = [\phi'(G(s, t))]^{-1} F'(s, t) \]
and
$|G'(s, t)| \le |[\phi'(G(s, t))]^{-1}|\;|F'(s, t)|$ for all $0 \le s$ ... 0.
It is easy to see that a differentiable $\phi$ is strongly monotone if and only if $(\phi'(x)y, y) \ge \alpha\,(y, y)$ for every $x, y \in H$. In fact, suppose $\phi$ is strongly monotone. Then for any real $t > 0$: $(\phi(y + tz) - \phi(y), z) \ge t\alpha\,|z|^2$, where $y, z \in H$, and dividing by $t$ and taking the limit, we get $(\phi'(y)z, z) \ge \alpha\,(z, z)$. Conversely, we get the condition of strong monotonicity by integrating the condition involving the derivative. The following definition will be useful in the sequel:
1.27. Definition: $\phi : H \to H$ will be called monotone if for every $x, y \in H$, $(\phi(x) - \phi(y), x - y) \ge 0$. If the sign $>$ holds for $x - y \ne 0$, $\phi$ will be called strictly monotone.
1.28. Remark: Obviously strongly monotone implies strictly monotone implies monotone. Furthermore, $\phi$ is strongly monotone with a constant $\alpha$ if $\frac{1}{\alpha}\phi - I$ is monotone, and if $\phi$ is strictly monotone (a fortiori, if $\phi$ is strongly monotone), $\phi$ is 1-1. We prove now a useful lemma on Euclidean space:
1.29. Lemma: (Kirszbraun) Suppose $\{x_1, \dots, x_n\}$ and $\{x_1', \dots, x_n'\}$ are two sets of points in $E^n$, and let $p$ be also in $E^n$. Assume that for every $i, j$, $1 \le i, j \le n$, we have $|x_i' - x_j'| \le |x_i - x_j|$ ($|\cdot|$ is the standard norm $(\sum_i |x_i|^2)^{1/2}$). Then there exists $p' \in E^n$ such that
\[ |p' - x_j'| \le |p - x_j| \quad \text{for every } j,\ 1 \le j \le n. \]
Proof: Let
\[ \lambda = \inf_{p' \in E^n}\ \max_{1 \le i \le n} \frac{|p' - x_i'|}{|p - x_i|}. \]
This infimum is assumed at some point $p^+ \in E^n$, for $\max_{1 \le i \le n} |p' - x_i'| / |p - x_i|$ becomes large when $p'$ is large. Hence we can put
\[ \max_{1 \le i \le n} \frac{|p^+ - x_i'|}{|p - x_i|} = \lambda. \]
Now, suppose that for $1 \le i \le k$ we have $|p^+ - x_i'| = \lambda\,|p - x_i|$, and that for $k + 1 \le i \le n$ we have $|p^+ - x_i'| < \lambda\,|p - x_i|$. We shall show that $p^+ \in \mathrm{co}(x_1', \dots, x_k')$ (the convex hull of $\{x_1', \dots, x_k'\}$). Suppose that $p^+ \notin \mathrm{co}(x_1', \dots, x_k')$. Then we can separate $p^+$ from $\mathrm{co}(x_1', \dots, x_k')$ with a hyperplane $\Pi$. If we move the point $p^+$ toward $\Pi$ perpendicularly, it is obvious that the distance from $p^+$ to every point in the half-space not containing $p^+$
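decreases. We can move $p^+$ by so little as to preserve the inequalities $|p^+ - x_i'| < \lambda\,|p - x_i|$, $k + 1 \le i \le n$, and now we get $|p^+ - x_i'| < \lambda\,|p - x_i|$ for every $1 \le i \le n$, which is impossible, because $p^+$ realizes the infimum. Thus we can express $p^+$ as $\sum_{i=1}^k c_i x_i'$, where $c_i \ge 0$ and $\sum_{i=1}^k c_i = 1$. Let $R_i = p - x_i$ and $R_i' = p^+ - x_i'$. Now suppose $\lambda$ is greater than 1. Then
\[ R_i'^2 > R_i^2 \quad \text{for } 1 \le i \le k. \tag{1} \]
On the other hand, we have by the hypothesis:
\[ (R_i' - R_j')^2 \le (R_i - R_j)^2, \]
and after expanding and using (1):
\[ R_i' R_j' > R_i R_j, \quad 1 \le i, j \le k. \tag{2} \]
Now, as $\sum_{i=1}^k c_i = 1$, we have $\bigl(\sum_{i=1}^k c_i\bigr)\,p^+ = \sum_{i=1}^k c_i x_i'$, and therefore $\sum_{i=1}^k c_i R_i' = 0$, and, by (1) and (2), $0 = \bigl(\sum_i c_i R_i'\bigr)^2 > \bigl(\sum_i c_i R_i\bigr)^2$, a contradiction. We have proved that $\lambda \le 1$. Q.E.D.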
Now, it is easy to generalize this result for Hilbert space:
1.30. Corollary: Let $\{x_\alpha\}$ and $\{x_\alpha'\}$ be two sets of points in the Hilbert space $H$, and $p \in H$. Suppose $|x_\alpha' - x_\beta'| \le |x_\alpha - x_\beta|$. Then there exists $p' \in H$ such that $|x_\alpha' - p'| \le |x_\alpha - p|$ for all $\alpha$.
Proof: We want to prove that the intersection of the infinite family of spheres with center $x_\alpha'$ and radius $|x_\alpha - p|$ is nonvoid. But spheres are compact in the weak topology for $H$, so it is sufficient to prove that every finite subfamily of spheres has a nonvoid intersection. If, then, there are only finitely many $x_\alpha$'s, the set $\{x_\alpha\} \cup \{x_\alpha'\}$ generates a finite dimensional Euclidean space, and we have only to apply the lemma. Q.E.D.
1.30A. As the following counterexample (due to Charles McCarthy) shows, the obvious generalization of Kirszbraun's lemma to Banach spaces that are not Hilbert spaces is not true in general. We give the following
Lemma: Let $l_p^n$, $1 < p < \infty$, $n \ge 1$, be $n$-dimensional Euclidean space with the norm $|x|_p = \bigl(\sum_{i=1}^n |x_i|^p\bigr)^{1/p}$. Then if $n > 1$, $p \ne 2$, the generalization of Kirszbraun's lemma does not hold.
Proof: Take in $l_p^n$, $p > 2$, $n > 1$, the points $x_1 = (0, 0, \dots, 0)$, $x_2 = (0, 1, 0, \dots, 0)$, $x_3 = (1, 0, \dots, 0)$. Evidently
\[ |x_1 - x_2|_p = |x_1 - x_3|_p = 1, \quad |x_2 - x_3|_p = 2^{1/p}. \]
Choose now spheres $S_1, S_2, S_3$ around $x_1, x_2, x_3$ of radii $2^{(1-p)/p}$. Then
\[ S_1 \cap S_2 \cap S_3 = \{(\tfrac12, \tfrac12, 0, \dots, 0)\}. \]
Now let
\[ x_1' = (0, 0, \dots, 0), \quad x_2' = \bigl((1 - 2^{1-p})^{1/p},\ 2^{(1-p)/p},\ 0, \dots, 0\bigr), \quad x_3' = \bigl((1 - 2^{1-p})^{1/p},\ -2^{(1-p)/p},\ 0, \dots, 0\bigr). \]
Again we have
\[ |x_1' - x_2'|_p = |x_1' - x_3'|_p = 1, \quad |x_2' - x_3'|_p = 2^{1/p}. \]
But if we take spheres $S_1', S_2', S_3'$ of radii $2^{(1-p)/p}$ around $x_1', x_2', x_3'$, their intersection will be void. In fact, by uniform convexity,
\[ S_2' \cap S_3' = \{((1 - 2^{1-p})^{1/p}, 0, \dots, 0)\} = \{y\}. \]
Since $(1 - 2^{1-p})^{1/p} > 2^{(1-p)/p}$ if $p > 2$, $S_1' \cap S_2' \cap S_3' = \emptyset$. The case $l_p^n$, $1 < p < 2$, $n > 1$ may be handled in a similar way; the points $x_1, x_2, x_3$ are replaced by $x_1', x_2', x_3'$ and vice versa, and the radii of $S_1, S_2, S_3$ become $(1 - 2^{1-p})^{1/p}$, $2^{(1-p)/p}$, $2^{(1-p)/p}$ respectively.
Q.E.D.
1.31. Theorem: Let $H$ be a Hilbert space, $S$ any subset of $H$, and $\phi : S \to H$. Suppose $|\phi(x) - \phi(y)| \le K\,|x - y|$ for all $x, y \in S$. Then $\phi$ can be extended to all of $H$ in such a way that the extension satisfies the same Lipschitz condition.
Proof: Without loss of generality we can suppose that $K = 1$. By Zorn's lemma, there exists a maximal extension $\tilde\phi$ subject to the same Lipschitz condition. Suppose $p \notin$ domain of $\tilde\phi$. We have $|\tilde\phi(x) - \tilde\phi(y)| \le |x - y|$ for $x, y \in$ domain $\tilde\phi$. Therefore, by the last corollary, we can find $p' \in H$ such that $|\tilde\phi(x) - p'| \le |x - p|$ for all $x \in$ domain $\tilde\phi$. If we define $\tilde\phi(p) = p'$, we have extended $\tilde\phi$ to one more point preserving the Lipschitz condition, and thus contradicting the maximality of $\tilde\phi$. Hence domain of $\tilde\phi = H$. Q.E.D.
We now make some definitions preparatory for the next theorem.
1.32. Definition: We say that $\phi : X \to Y$ ($X$, $Y$ are B-spaces) is feebly continuous if the mapping $t \to \phi(x + ty)$ is continuous from $R$ to $Y$ with the weak topology for every pair $x, y \in X$.
1.33. Definition: $\phi : X \to Y$ ($X$, $Y$ are B-spaces) is slightly continuous if $x_n \to x$ strongly in $X$ implies $\phi(x_n) \to \phi(x)$ weakly in $Y$.
1.34. Remark: As is easily seen, continuity implies slight continuity implies feeble continuity.
1.35. Theorem (Minty): (a) Let $\phi : S \to H$ ($H$ is a Hilbert space) be defined in an open set $S \subset H$, and suppose $\phi$ is feebly continuous and strongly monotone. Then $\phi$ is an open mapping. (b) Let $\phi : H \to H$ be defined everywhere, and suppose $\phi$ is slightly continuous and strongly monotone. Then $\phi$ maps $H$ onto $H$.
Proof: (a) As we remarked earlier, we can assume without loss of generality that $\phi = \mathrm{id} + T$, where $T$ is monotone. Now consider the Hilbert direct sum $H \oplus H$. We introduce the relations:
\[ [x, y]\ M\ [x', y'] \quad \text{iff} \quad (x - x', y - y') \ge 0 \tag{1} \]
and
\[ [x, y]\ L\ [x', y'] \quad \text{iff} \quad |y - y'| \le |x - x'|. \tag{2} \]
(Note that neither is transitive.) Let $\Phi : H \oplus H \to H \oplus H$ (Cayley transformation) be defined by
\[ \Phi([x, y]) = \frac{1}{\sqrt 2}\,[x + y,\ x - y]. \tag{3} \]
It is easy to see that $\Phi$ is an isometry (of course $|[x, y]|^2 = |x|^2 + |y|^2$), and that $\Phi^2 = \mathrm{id}$. Now let $p = [x, y]$ and $q = [x', y']$. We have:
\[ p\,M\,q \quad \text{iff} \quad \Phi(p)\ L\ \Phi(q). \tag{4} \]
For $\Phi(p)\ L\ \Phi(q)$ iff $|x - y - x' + y'|^2 \le |x + y - x' - y'|^2$ iff $-2\,(x - x', y - y') \le 2\,(x - x', y - y')$ iff $(x - x', y - y') \ge 0$ iff $p\,M\,q$.
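The equivalence (4) can be checked numerically in a finite dimensional model. The sketch below is an illustration only: it takes $H = R^3$, and the helper names `cayley`, `rel_M`, `rel_L` and the random sampling are assumptions made for the example.

```python
import math
import random

def inner(a, b):
    """Standard inner product on R^n."""
    return sum(s * t for s, t in zip(a, b))

def cayley(p):
    """Phi([x, y]) = (1/sqrt 2) [x + y, x - y] on pairs of vectors in R^n."""
    x, y = p
    c = 1.0 / math.sqrt(2.0)
    return ([c * (s + t) for s, t in zip(x, y)],
            [c * (s - t) for s, t in zip(x, y)])

def rel_M(p, q):
    """[x, y] M [x', y'] iff (x - x', y - y') >= 0."""
    dx = [s - t for s, t in zip(p[0], q[0])]
    dy = [s - t for s, t in zip(p[1], q[1])]
    return inner(dx, dy) >= 0.0

def rel_L(p, q):
    """[x, y] L [x', y'] iff |y - y'| <= |x - x'|."""
    dx = [s - t for s, t in zip(p[0], q[0])]
    dy = [s - t for s, t in zip(p[1], q[1])]
    return inner(dy, dy) <= inner(dx, dx)

random.seed(0)

def rand_pair(n=3):
    """A random element [x, y] of R^n (+) R^n."""
    return ([random.uniform(-1, 1) for _ in range(n)],
            [random.uniform(-1, 1) for _ in range(n)])

samples = [(rand_pair(), rand_pair()) for _ in range(1000)]
# Relation (4): p M q holds exactly when Phi(p) L Phi(q) holds.
agree = all(rel_M(p, q) == rel_L(cayley(p), cayley(q)) for p, q in samples)
# Phi is an involution: applying it twice returns the original pair.
p0 = samples[0][0]
back = cayley(cayley(p0))
```

A monotone graph therefore becomes, after the transformation, the graph of a map with Lipschitz constant 1, which is what lets the extension theorem 1.31 enter the proof.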
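Call $\Gamma \subset H \oplus H$ the graph of $T$. Since $T$ is monotone,
\[ p\,M\,q \quad \text{for all } p, q \in \Gamma. \tag{5} \]
By (4), if we put $\Gamma_1 = \Phi(\Gamma)$, we get:
\[ p\,L\,q \quad \text{for all } p, q \in \Gamma_1. \tag{6} \]
This means that $\Gamma_1$ is the graph of some function $S_1$ satisfying a Lipschitz condition with $K = 1$. Obviously the domain of $S_1$ is the set of points $\frac{1}{\sqrt 2}(x + y) \in H$ such that $[x, y] \in \Gamma$, i.e.,
\[ \mathrm{domain}(S_1) = \frac{1}{\sqrt 2}\,\mathrm{range}(\mathrm{id} + T). \tag{7} \]
By the previous theorem, we can extend $S_1$ to a function $S_2$ defined on all of $H$ and satisfying the same Lipschitz condition. Let $\Gamma_2$ = graph of $S_2$ and $\Gamma_3 = \Phi(\Gamma_2)$. Then $\Gamma_3 \supset \Gamma$, because $\Gamma_2 \supset \Gamma_1$. $S_2$ satisfies the Lipschitz condition, whence
\[ p, q \in \Gamma_2 \text{ implies } p\,L\,q \tag{8} \]
and
\[ p, q \in \Gamma_3 \text{ implies } p\,M\,q \tag{9} \]
(apply (4) and recall that $\Phi^2 = \mathrm{id}$). Now, by (3) and (7) we have:
\[ \Gamma = \Bigl\{ \tfrac{1}{\sqrt 2}\,[(\mathrm{id} + S_2)x,\ (\mathrm{id} - S_2)x]\ ;\ x \in \tfrac{1}{\sqrt 2}\,\mathrm{range}(\mathrm{id} + T) \Bigr\} \tag{10} \]
and
\[ \Gamma_3 = \Bigl\{ \tfrac{1}{\sqrt 2}\,[(\mathrm{id} + S_2)x,\ (\mathrm{id} - S_2)x]\ ;\ x \in H \Bigr\}. \tag{11} \]
Suppose now that the range of $\mathrm{id} + T$ = range of $\phi$ is not open; then there exists a point in this range which is a limit of a sequence of points not in the range, and by (10) and (11), this means that there exists a point $[y, z] \in \Gamma$ and $\{[y_n, z_n]\} \subset \Gamma_3 - \Gamma$ such that $[y, z] = \lim\,[y_n, z_n]$. But, by hypothesis, the domain of $T$ is open; hence for some $n_0$, $y_{n_0} \in$ domain of $T$. Putting $y^* = y_{n_0}$, $z^* = z_{n_0}$, we arrive at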
the following conclusion: There exists a pair $[y^*, z^*]$ such that:
(i) $y^* \in$ domain of $T$,
(ii) $[y, z]\ M\ [y^*, z^*]$ for every pair $[y, z] \in \Gamma$,
(iii) $z^* \ne Ty^*$.
Now we show that this leads to a contradiction. By (i), for small $\varepsilon > 0$, $y_\varepsilon = y^* + \varepsilon\,(z^* - Ty^*)$ belongs to the domain of $T$, so, by (ii), we have:
\[ (y^* - y_\varepsilon,\ z^* - Ty_\varepsilon) \ge 0, \]
and using the definition of $y_\varepsilon$:
\[ -\varepsilon\,\bigl(z^* - Ty^*,\ z^* - T(y^* + \varepsilon\,(z^* - Ty^*))\bigr) \ge 0 \]
or
\[ \bigl(z^* - Ty^*,\ z^* - T(y^* + \varepsilon\,(z^* - Ty^*))\bigr) \le 0. \]
As $\varepsilon \to 0$, we have, using the feeble continuity of $T$:
\[ |z^* - Ty^*|^2 \le 0. \]
Thus $z^* = Ty^*$, which contradicts (iii). This proves that the range of $\phi$ is open.
(b) As we remarked before, $\phi$ is 1-1, because it is strongly monotone, and moreover $\phi^{-1}$ satisfies a Lipschitz condition with constant $1/\alpha$, as follows from Definition 1.26 and Schwarz's inequality. Now to prove that $\phi$ is onto we use the same argument as was used in Theorem 1.22, in proving that if $x(t)$ is defined for $t < a$, it is defined for $t = a$; as before, we use the Lipschitz condition on $\phi^{-1}$ to prove that $x(t)$ has a limit as $t \to a$. Then we use the slight continuity of $\phi$ to prove that $\phi(x(a)) = y(a)$. Q.E.D.
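We now establish an additional theorem for monotone functions:
1.36. Theorem: Suppose $\phi : H \to H$ ($H$ a Hilbert space) is monotone and continuous. If $p \in H$ is such that $(x, \phi(x) - p) \ge 0$ for $|x| \ge R_p$ (where $R_p$ is a number depending on $p$), then $p$ belongs to the range of $\phi$.
Proof: We can assume that $p = 0$. Let $\phi_\varepsilon(x) = \varepsilon x + \phi(x)$ with $\varepsilon > 0$. Then $\phi_\varepsilon$ is strongly monotone, and by Minty's theorem there exists $x_\varepsilon \in H$ such that
\[ \varepsilon x_\varepsilon + \phi(x_\varepsilon) = 0, \tag{1} \]
so that multiplying by $x_\varepsilon$, we have
\[ \varepsilon\,|x_\varepsilon|^2 + (\phi(x_\varepsilon), x_\varepsilon) = 0. \]
Therefore, for every $\varepsilon > 0$, $|x_\varepsilon|$ must be smaller than $R_p$, so that the $x_\varepsilon$ form a bounded set. Hence there exists a sequence $\varepsilon_n \to 0$ such that the sequence $\{x_{\varepsilon_n}\}$ tends weakly to a point $x_\infty$, and we can suppose that $|x_{\varepsilon_n}|$ also converges. Now, by the monotonicity of $\phi$ and (1), we get:
\[ (x_\delta - x_\varepsilon,\ \delta x_\delta - \varepsilon x_\varepsilon) \le 0 \quad \text{for every } \delta > 0 \text{ and } \varepsilon > 0. \]
Then, if we put $\delta = \varepsilon_n$ and let $n \to \infty$,
\[ (x_\infty - x_\varepsilon,\ -\varepsilon x_\varepsilon) \le 0 \quad \text{for every } \varepsilon > 0, \]
or
\[ (x_\infty - x_\varepsilon,\ x_\varepsilon) \ge 0. \]
Therefore
\[ |x_\infty|^2 \ge \lim |x_{\varepsilon_n}|^2. \tag{2} \]
On the other hand, spheres in B-spaces are weakly closed, and this means that:
\[ |x_\infty| \le \lim |x_{\varepsilon_n}|. \tag{3} \]
Hence, by (2) and (3), $|x_\infty| = \lim |x_{\varepsilon_n}|$, and so $x_{\varepsilon_n} \to x_\infty$ strongly. By the continuity of $\phi$ we get $\phi(x_\infty) = 0$. Q.E.D.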
1.37. Example: Suppose $(S, \Sigma, \mu)$ is a finite measure space, and $V$ a finite dimensional linear space. Let $f : S \times V \to V$ be continuous in $V$ for every $s$, and such that $|f(s, u)| \le K\,|u| + 1$ for some constant $K \ge 1$ and every $s \in S$, $u \in V$. Then $\phi(s) \to f(s, \phi(s))$ maps $L_2(S, V)$ into $L_2(S, V)$ because
\[ \int_S |f(s, \phi(s))|^2\,d\mu \le K^2 \int_S (|\phi(s)| + 1)^2\,d\mu. \]
We call this map $F(\phi)$. If a sequence $\{\phi_n(s)\}$ converges in $L_2$ to $\phi(s)$, then there exists a subsequence $\{\phi_{n_i}(s)\}$ converging to $\phi(s)$ a.e. Hence $F(\phi_{n_i})(s) \to F(\phi)(s)$ a.e. But $|F(\phi_{n_i})(s)| \le K\,|\phi_{n_i}(s)| + 1$, whence there exists a subsequence $\{\phi_{m_j}\}$ such that $|\phi_{m_j}(s)| \le \psi(s)$ for all $j$, where $\psi^2(s)$ is a summable function. (Simply take $\{\phi_{m_j}\}$ such that $\sum_{j=1}^\infty |\phi_{m_{j+1}} - \phi_{m_j}|_{L_2} < \infty$.) By the Lebesgue theorem, $F(\phi_{m_j}) \to F(\phi)$ in $L_2$. This proves that $F$ is continuous.
Now let $D$ be a linear operator in $L_2$ with bounded inverse, and suppose we want to solve the equation $D^*FDx = y$. By Minty's theorem, if $D^*FD$ is strongly monotone then there exists a solution for any $y \in L_2$. The condition for strong monotonicity is
\[ (D^*FDx_1 - D^*FDx_2,\ x_1 - x_2) \ge \varepsilon\,|x_1 - x_2|^2, \quad \text{where } \varepsilon > 0,\ x_1, x_2 \in L_2, \]
or
\[ (Fx_1' - Fx_2',\ x_1' - x_2') \ge \varepsilon'\,|x_1' - x_2'|^2, \quad \text{where } \varepsilon' > 0,\ x_1', x_2' \in L_2. \]
(Calling $x_i' = Dx_i$, $i = 1, 2$, and remembering that $D$ is bounded and has a bounded inverse.) We therefore see that for solvability it is sufficient to have
\[ (f(s, v) - f(s, v'),\ v - v') \ge \varepsilon'\,|v - v'|^2 \quad \text{for all } s \in S,\ v, v' \in V. \]
(Note that here the scalar product is that in $R^n$, whereas before it was the one in $L_2$.) This condition is implied by:
\[ (df(s, v)\,v',\ v') \ge \varepsilon'\,|v'|^2 \quad \text{for all } s \in S,\ v, v' \in V, \]
which is equivalent to the following condition: There exists an $\varepsilon > 0$ such that the symmetric matrix ${}^tJ + J - \varepsilon I$ is positive definite, where $J$ is the Jacobian matrix of $f$. Thus for the existence of solutions at every point of the above equation, it is sufficient to require that the matrix
\[ A = \Bigl( \frac{\partial f_i}{\partial x_j} + \frac{\partial f_j}{\partial x_i} \Bigr)_{i, j} \]
has smallest eigenvalue $\ge \varepsilon > 0$ at every point.
E. Compact Mappings
1.38. Definition: Let $E$, $F$ be two T.L.S. $\phi : E \to F$ is compact iff it is continuous and maps bounded sets into relatively compact sets, i.e., if $B \subset E$ is bounded, then $\phi(B)$ is relatively compact.
1.39. Definition: $\phi : E \to F$ is called locally compact at a point $p \in E$ iff $\phi$ is continuous in a neighborhood $V$ of $p$, and maps $V$ into a relatively compact set.
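1.40. Theorem: Let $E$, $F$ be T.L.S., $F$ complete. Let $\phi : E \to F$ be F-differentiable at $p \in E$ and locally compact at $p$. Then $d\phi(p)$ is a compact linear operator.
Proof: We may suppose that $p = 0$ and that $\phi(p) = 0$. Let $A = d\phi(p)$ and suppose $A$ is not compact. Let $S$ be a bounded subset of $E$, with noncompact $A(S)$. By the completeness of $F$, we can find a family $\{x_\alpha\} \subset A(S)$ and a neighborhood $U$ of 0 in $F$ such that $x_\alpha - x_\beta \notin U$ whenever $\alpha \ne \beta$. Now let $\{y_\alpha\} \subset S$ be such that $Ay_\alpha = x_\alpha$, and for any $\delta > 0$, let us define:
\[ \eta_\delta(x_\alpha) = \phi(\delta y_\alpha). \]
Then we have:
\[ \eta_\delta(x_\alpha) - \delta x_\alpha = \phi(\delta y_\alpha) - \delta x_\alpha = A(\delta y_\alpha) - \delta x_\alpha + \psi(\delta y_\alpha) = \psi(\delta y_\alpha), \tag{1} \]
where $\psi$ is a function horizontal at 0, i.e. for every neighborhood $V$ of 0 in $F$, there exists a neighborhood $W$ of 0 in $E$ such that $\psi(\delta W) \subset o(\delta)\,V$. Now
\[ \eta_\delta(x_\alpha) - \eta_\delta(x_\beta) = (\delta x_\alpha - \delta x_\beta) + (\eta_\delta(x_\alpha) - \delta x_\alpha) + (\delta x_\beta - \eta_\delta(x_\beta)). \tag{2} \]
Choose a symmetric and circled neighborhood $V$ of 0 in $F$ such that $V + V + V \subset U$. For such a neighborhood there exists another one $W$ in $E$ such that $\psi(\delta W) \subset o(\delta)\,V$. Since $S$ is bounded, there exists $\lambda > 0$ such that $\lambda S \subset W$. Then for every $\alpha$,
\[ \eta_\delta(x_\alpha) - \delta x_\alpha = \psi(\delta y_\alpha) \in \psi\Bigl(\frac{\delta}{\lambda}\,W\Bigr) \subset o\Bigl(\frac{\delta}{\lambda}\Bigr)\,V. \]
But as $o(\delta/\lambda)/\delta \to 0$ as $\delta \to 0$, it follows that for sufficiently small $\delta$, $o(\delta/\lambda)\,V \subset \delta V$. Therefore, by (2), $\eta_\delta(x_\alpha) - \eta_\delta(x_\beta) \notin \delta V$ whenever $\alpha \ne \beta$, because if $\eta_\delta(x_\alpha) - \eta_\delta(x_\beta) \in \delta V$, then $\delta x_\alpha - \delta x_\beta \in \delta V + \delta V + \delta V \subset \delta U$, contrary to our assumption. Hence $\phi(\delta y_\alpha) - \phi(\delta y_\beta) \notin \delta V$ for $\alpha \ne \beta$, and for sufficiently small $\delta$, this contradicts the local compactness of $\phi$. Q.E.D.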
F. Higher Differentials and Taylor's Theorem
We recall that if $X_1, X_2, \dots, X_n, Z$ are linear spaces over the same scalar field, a function $M : (X_1 \times X_2 \times \dots \times X_n) \to Z$ is multilinear or $n$-linear if it is linear in each of the variables separately. If $X_1, \dots, X_n, Z$ are B-spaces, $M$ is continuous iff there exists a constant $K$ such that $|M(x_1, \dots, x_n)| \le K\,|x_1|\,|x_2| \cdots |x_n|$ for all $x_i$ in $X_i$, i.e. if it is bounded. The minimum of the numbers $K$ satisfying this inequality will be called the norm of $M$, $|M|$. The set of all $n$-linear bounded maps $M$ from $X_1 \times X_2 \times \dots \times X_n$ to $Z$ will be denoted by $B(X_1, \dots, X_n; Z)$, and it is easy to verify that if $X_i$ and $Z$ are B-spaces then $B(X_1, \dots, X_n; Z)$ is a B-space with the usual addition and scalar multiplication, and with the norm defined above. In the case $X_1 = X_2 = \dots = X_n = X$, $B(X_1, \dots, X_n; Z)$ will be written $B^n(X, Z)$.
1.41. Lemma: Let $X_1, \dots, X_n, Z$ be B-spaces over the same scalar field. Then there is an isometric isomorphism between $B(X_1, \dots, X_n; Z)$ and $B(X_1, B(X_2, \dots, B(X_n, Z)) \dots)$.
The proof is left as an exercise for the reader.
Suppose that $f$ is F-differentiable on a set $D_0$ in $X$ with range of $f$ in $Z$. Then the function $f_1$, defined for $x \in D_0$ by $f_1(x) = df(x)$, has its values in the B-space $B(X, Z)$. It makes sense to ask if $f_1$ is differentiable. If it is, then the differential of $f_1 = df$ will have its values in the space $B^2(X, Z)$ of bounded bilinear functions of $X$ to $Z$, where, by the lemma above, we have identified $B^2(X, Z)$ and $B(X, B(X, Z))$. We define the differential of $f_1 = df$ at a point $c$ to be the second differential of $f$ at $c$ and we denote this second differential by $d^2f(c)$. Hence $d^2f(c)$ is a bounded bilinear function on $X$ to $Z$. Higher order differentials are defined by induction.
1.42. Definition: A function $f$ on $D \subset X$ to $Z$ is said to be in class $C^n$ on $D$, written $f \in C^n$, iff the $n$th differential $d^nf$ exists at every point of $D$ and the mapping $x \to d^nf(x)$ of $D$ into $B^n(X; Z)$ is continuous. If $f \in C^n$ for all $n$, we say that $f \in C^\infty$.
Observe that if $f : X \to Y$ is in $C^n$ on a neighborhood of a point $c \in X$ and if $g : Y \to Z$ is in $C^n$ on a neighborhood of the point $b = f(c)$, then $h = g \circ f : X \to Z$ is in $C^n$ on a neighborhood of $c$. We now prove Taylor's theorem for B-spaces. We shall write $x^{(k)}$ for the $k$-tuple $(x, x, \dots, x)$.
1.43. Theorem: Suppose that $f \in C^n$ on an open set $D$ which contains the line segment joining $c$ to $c + x$. Then
\[ f(c + x) = f(c) + \frac{1}{1!}\,df(c; x) + \frac{1}{2!}\,d^2f(c; x^{(2)}) + \dots + \frac{1}{(n-1)!}\,d^{n-1}f(c; x^{(n-1)}) + \frac{1}{(n-1)!} \int_0^1 (1 - t)^{n-1}\,d^nf(c + tx; x^{(n)})\,dt. \]
Proof: Since the map $t \to d^nf(c + tx; x^{(n)})$ is continuous on $[0, 1]$ to $Z$, it is clear that both sides of the equation have a meaning. To establish the equality let $z^*$ be a continuous linear functional on $Z$ and let $F$ be defined on $[0, 1]$ to the scalar field by $F(t) = z^*f(c + tx)$. Then $F^{(k)}(t) = z^*[d^kf(c + tx; x^{(k)})]$ for $0 \le k \le n$ and we can apply the scalar-valued form of Taylor's theorem to $F$. If we observe that $z^*$ commutes with integration, we can apply the Hahn-Banach theorem and obtain the result. Q.E.D.
1.44. Corollary: Under the hypotheses of Taylor's theorem, there exists a bounded $n$-linear function $R_n$ from $X$ to $Z$ such that
\[ f(c + x) = f(c) + \frac{1}{1!}\,df(c; x) + \dots + \frac{1}{(n-1)!}\,d^{n-1}f(c; x^{(n-1)}) + R_n(x^{(n)}). \]
Proof: Let
\[ R_n = \frac{1}{(n-1)!} \int_0^1 (1 - t)^{n-1}\,d^nf(c + tx)\,dt. \]
Q.E.D.
1.45. Corollary: Under the hypotheses of Taylor's theorem, there exists a function $Q$ on a neighborhood of the origin in $X$ to $Z$ such that
\[ f(c + x) = f(c) + \frac{1}{1!}\,df(c; x) + \dots + \frac{1}{n!}\,d^nf(c; x^{(n)}) + Q(x), \]
where $Q(x) = o(|x|^n)$.
Proof: Observe that
\[ \frac{1}{(n-1)!} \int_0^1 (1 - t)^{n-1}\,dt = \frac{1}{n!}, \]
and define $Q(x)$ to be
\[ Q(x) = \frac{1}{(n-1)!} \int_0^1 (1 - t)^{n-1}\,[d^nf(c + tx; x^{(n)}) - d^nf(c; x^{(n)})]\,dt. \]
Q.E.D.
Note: The reader may easily generalize the definitions and results of this section to the case of locally convex T.L.S.
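For a scalar function the formula of Theorem 1.43 can be checked directly. The following sketch is an illustration under assumptions: the helper name `taylor_with_remainder`, the midpoint quadrature, and the choice $f = \exp$ with $n = 3$ are not from the notes.

```python
import math

def taylor_with_remainder(derivs, c, x, n, panels=2000):
    """Right-hand side of Taylor's formula for a scalar function.

    derivs[k] is a callable returning the k-th derivative, k = 0..n.
    The remainder (1/(n-1)!) * int_0^1 (1-t)^(n-1) f^(n)(c + t x) x^n dt
    is approximated by the composite midpoint rule.
    """
    poly = sum(derivs[k](c) * x**k / math.factorial(k) for k in range(n))
    dt = 1.0 / panels
    integral = sum(
        (1.0 - (i + 0.5) * dt) ** (n - 1) * derivs[n](c + (i + 0.5) * dt * x)
        for i in range(panels)
    ) * dt
    return poly + integral * x**n / math.factorial(n - 1)

# Hypothetical example: f = exp, all of whose derivatives are exp.
n, c, x = 3, 0.2, 0.7
rhs = taylor_with_remainder([math.exp] * (n + 1), c, x, n)
lhs = math.exp(c + x)
```

The two sides agree up to the quadrature error, which shrinks as `panels` grows.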
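G. Complex Analyticity
Let $X$ and $Y$ be complex B-spaces.
1.46. Definition: We say that $\phi : X \to Y$ is complex analytic on an open subset $D$ of $X$ iff $\phi(z_1x_1 + z_2x_2 + \dots + z_kx_k)$ is analytic in $z_1, \dots, z_k$ for every $x_1, \dots, x_k$ in $D$, $z_i$ complex. If $\phi$ is complex analytic, we have immediately, for small $r > 0$,
\[ \phi(x) = \frac{1}{2\pi i} \oint_{|\zeta| = r} \frac{\phi(x + \zeta y)}{\zeta}\,d\zeta \quad \text{(Cauchy's formula)} \]
and
\[ d^n\phi(x; y^{(n)}) = \frac{n!}{2\pi i} \oint_{|\zeta| = r} \frac{\phi(x + \zeta y)}{\zeta^{n+1}}\,d\zeta. \]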
1.47. Lemma: If $v(x + iy)$ is a $C^1$ vector-valued function of $x$ and $y$, and if $\frac{\partial v}{\partial x} = \frac{1}{i}\,\frac{\partial v}{\partial y}$, then $v$ is complex analytic.
Proof: Let $v^*$ be a continuous linear functional. $v^*(v(x + iy))$ is a complex valued function which satisfies the Cauchy-Riemann equations and is therefore analytic. Thus
\[ v^*(v(z)) = \frac{1}{2\pi i} \oint \frac{v^*(v(\zeta))}{\zeta - z}\,d\zeta, \]
and by the Hahn-Banach theorem, the Cauchy formula holds, and $v$ is analytic. Q.E.D.
Suppose $\phi$ is analytic in $D$. Assume that $0 \in D$ and $\phi'(0)$ is invertible. By the implicit function theorem, $\phi$ has a local inverse, $\psi$. If $v$ and $v + zu$ are sufficiently near 0,
\[ \phi(\psi(v)) = v \quad \text{and} \quad \phi(\psi(v + zu)) = v + zu. \]
Then
\[ \phi'(\psi(v))\ \frac{d}{dt}\,\psi(v + tu)\Big|_{t=0} = u \]
and
\[ \phi'(\psi(v))\ \frac{d}{dt}\,\psi(v + itu)\Big|_{t=0} = iu. \]
Since $\phi'(\psi(v))$ is invertible,
\[ \frac{d}{dt}\,\psi(v + itu)\Big|_{t=0} = i\,\frac{d}{dt}\,\psi(v + tu)\Big|_{t=0}. \]
By the preceding lemma, $\psi(v + zu)$ is analytic in $z$, and consequently $\psi$ is analytic. We have therefore proved the implicit function theorem in the analytic case:
1.48. Theorem: Under the hypotheses of the implicit function theorem 1.20, and if $\phi$ is analytic, $\phi^{-1}$ is also complex analytic.
H. Derivatives of Quadratic Forms
1.49. Definition: If $B$, $V$ are linear spaces and $f : B \to V$, we say that $f$ is a quadratic form if the expression
\[ \beta(x, y) = f(x + y) - f(x) - f(y) \]
is bilinear in $x$ and $y$, and $f(\lambda x) = \lambda^2 f(x)$ for every $x \in B$ and every scalar $\lambda$. It is clear that in such a case $f(-x) = f(x)$ and $f(0) = 0$. $\beta$ is called the bilinear form associated with $f$. From the definition it follows that
\[ f(x) = \tfrac12\,\beta(x, x), \tag{3} \]
and therefore $f$ and $\beta$ determine each other. Suppose now that $B$ and $V$ are Banach spaces. It is clear that $f$ is continuous if and only if $\beta$ is continuous. If $f$ is continuous, then
\[ |f(x)| \le \tfrac12\,\|\beta\|\,|x|^2, \tag{4} \]
where $\|\beta\|$ stands for the norm of $\beta$ as a bilinear function. If we define $\|f\| = \tfrac12\,\|\beta\|$, (4) may be written $|f(x)| \le \|f\|\,|x|^2$.
The main theorem on derivatives of quadratic forms is the following:
1.50. Theorem: Every continuous quadratic form has F-derivatives of all orders. Denote by $\beta$ the bilinear form associated with $f$ and by $\|f\|$ the number $\tfrac12 \sup |\beta(x, y)|$, $|x| \le 1$, $|y| \le 1$. The first and second derivatives of $f$ are:
\[ f'(z)\,h = \beta(z, h), \quad f''(z) = \beta, \]
and the higher derivatives vanish identically. From the equations above it follows that:
\[ |f'(z)| \le 2\,\|f\|\,|z|, \quad \|f''(z)\| = 2\,\|f\|. \]
Proof: From the definition of $\beta$ it follows that
\[ f(z + h) - f(z) = f(h) + \beta(z, h), \]
and (4) implies that $f(h) = o(|h|)$. Since $\beta(z, h)$ is linear in $h$, it follows that $f$ has an F-derivative at every $z$ and the equality $f'(z)\,h = \beta(z, h)$ holds. Now $f' : B \to \mathrm{Hom}(B, V)$ (being equal to $\beta$) is obviously linear. Then from the general fact that the derivative of a linear mapping at any point is that linear mapping, we conclude that $f''(z) = f'$, or $f''(z) = \beta$. Q.E.D.
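The identity $f(z + h) - f(z) = f(h) + \beta(z, h)$ used in this proof is easy to check on a concrete form. The sketch below is an illustration only: it takes $B = R^2$, $V = R$, and a hypothetical coefficient matrix `Q`; none of these choices come from the notes.

```python
def f(x, Q):
    """Quadratic form f(x) = x^T Q x on R^n (Q given as a list of rows)."""
    n = len(x)
    return sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))

def beta(x, y, Q):
    """Associated bilinear form beta(x, y) = f(x + y) - f(x) - f(y)."""
    s = [a + b for a, b in zip(x, y)]
    return f(s, Q) - f(x, Q) - f(y, Q)

Q = [[2.0, 1.0], [0.0, 3.0]]          # hypothetical coefficients
z, h = [1.0, -2.0], [0.5, 0.25]

# f(z + t h) - f(z) = t * beta(z, h) + t^2 * f(h), so the difference quotient
# differs from f'(z) h = beta(z, h) by exactly t * f(h).
t = 1e-3
zt = [a + t * b for a, b in zip(z, h)]
quotient = (f(zt, Q) - f(z, Q)) / t
gap = quotient - beta(z, h, Q)
```

The gap between the difference quotient and $\beta(z, h)$ is linear in `t`, which is the statement $f(th) = o(|th|)$ in this setting.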
CHAPTER II
Hard Implicit Functional Theorems
A. Newton's Method and the Nash Implicit Functional Theorem
B. A Partial Differential Equation
C. Embedding of Riemannian Manifolds
A. Newton's Method and the Nash Implicit Functional Theorem
The following theorem will be proved by the so-called Newton's method.
2.1. Theorem: Let $B$ be a Banach space, and let $f$ be a mapping whose domain $D(f)$ is the unit sphere of $B$. Suppose that:
(i) $f$ has two continuous Fréchet derivatives in $D(f)$, both bounded above by a constant $M$, which we assume to exceed 2.
(ii) There exists a map $L(u)$ with domain $D(L) = D(f)$ and range in the space $\mathcal{B}(B)$ of bounded linear maps of $B$ into itself, such that
(iia) $|L(u)h| \le M\,|h|$, $h \in B$, $u \in D(L)$;
(iib) $df(u)\,L(u)\,h = h$, $h \in B$, $u \in D(L)$.
Then, if $|f(0)| < M^{-3}$, it follows that $f(D(f))$ contains the origin.
Proof: Let $\kappa = 3/2$, and let $\beta > 0$ be a real number to be specified later. Put $u_0 = 0$ and, proceeding inductively, put
\[ u_{n+1} = u_n - L(u_n)\,f(u_n). \tag{2.1} \]
We will prove inductively that
(2.2; n) $u_n \in D(f)$,
(2.3; n) $|u_n - u_{n-1}| \le e^{-\beta\kappa^n}$, $n \ge 1$.
We proceed as follows. Suppose that statements (2.2; j) and (2.3; j) are true for $j \le n$. Then
\[ |u_n| \le \sum_{j=1}^n e^{-\beta\kappa^j} \le \sum_{j=1}^\infty e^{-\beta(\kappa - 1)j}, \tag{2.4} \]
so that if $\beta$ is sufficiently large, (2.2; n) follows. Therefore Definition (2.1) makes sense. Observe now that if $g$ is any function twice continuously F-differentiable, the mean-value theorem with Lagrange remainder applied to $g(u + th)$ yields
\[ g(u + h) = g(u) + dg(u)\,h + \int_0^1 (1 - t)\,d^2g(u + th, h, h)\,dt. \]
Combining this with (ii) and our induction hypothesis yields
\[ |u_{n+1} - u_n| = |L(u_n)\,f(u_n)| \le M\,|f(u_n)| \le M^2\,|u_n - u_{n-1}|^2 \le M^2 e^{-2\beta\kappa^n}. \tag{2.5} \]
Thus we have only to choose $\beta$ so that
\[ M^2 e^{-2\beta\kappa^n} \le e^{-\beta\kappa^{n+1}}, \]
or
\[ M^2 \le e^{\beta\kappa^n(2 - \kappa)} \tag{2.6} \]
..., (2.7) follows. Then $u_n$ converges to some element $u$ in $D(f)$. By (iib) and (2.1), $f(u_n) = df(u_n)(u_n - u_{n+1})$, so
\[ |f(u_n)| \le M\,|u_{n+1} - u_n| \le M e^{-\beta\kappa^{n+1}}, \]
so $f(u) = 0$. Q.E.D.
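The iteration (2.1) and its quadratic error estimate can be watched on a scalar example. This is a minimal sketch under assumptions, not the setting of the theorem: the function name `newton`, the sample $f(u) = u^2 + 4u - 0.4$, and the choice $L(u) = 1/f'(u)$ are illustrative, and the sketch does not check the constant $M$ or the smallness condition on $f(0)$.

```python
def newton(f, L, u0=0.0, steps=6):
    """Iterate u_{n+1} = u_n - L(u_n) f(u_n) and record |f(u_n)| at each step."""
    u, residuals = u0, []
    for _ in range(steps):
        residuals.append(abs(f(u)))
        u = u - L(u) * f(u)
    return u, residuals

# Hypothetical scalar example on |u| < 1:
# f(u) = u^2 + 4u - 0.4, df(u) = 2u + 4, and L(u) = 1 / df(u) is a right inverse.
f = lambda u: u * u + 4.0 * u - 0.4
L = lambda u: 1.0 / (2.0 * u + 4.0)

root, residuals = newton(f, L)
```

Each residual is bounded by the square of the previous one, mirroring the step $|f(u_n)| \le M\,|u_n - u_{n-1}|^2$ in the proof.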
The following theorem, which gives an important generalization of Theorem 2.1, is proved by a modified Newton's method. We weaken the hypotheses of Theorem 2.1, requiring not that the "inverting" operator $L(u)$ be bounded, but only that it be an unbounded operator acting somewhat like a differential operator of order $\alpha$.
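Given a compact $n$-dimensional manifold $K$ we introduce the space $C^r(K) = C^r$ of (possibly vector-valued), $r$ times continuously differentiable functions with the norm
\[ |u|_r = \max_{|\alpha| \le r}\ \max_{x \in K} |D^\alpha u(x)|, \]
where $\alpha = (\alpha_1, \dots, \alpha_n)$, $\alpha_i$ nonnegative integers, $|\alpha| = \alpha_1 + \dots + \alpha_n$, and
\[ D^\alpha = \Bigl(\frac{\partial}{\partial x_1}\Bigr)^{\alpha_1} \cdots \Bigl(\frac{\partial}{\partial x_n}\Bigr)^{\alpha_n}. \]
Note that $C^r \supset C^{r+1}$ and that, if $u \in C^{r+1}$, $|u|_r \le |u|_{r+1}$; we write $|u|_r = \infty$ if $u \notin C^r$. In the sequel we shall refer to a certain range $m - \alpha \le r \le m + 10\alpha$ of spaces $C^r$ and to a certain constant $M \ge 1$. We suppose that $M$ is sufficiently large so that there exist smoothing operators $S(t)$, $t \ge 1$, such that
\[ |S(t)u|_\rho \le M t^{\rho - r}\,|u|_r, \quad u \in C^r, \tag{S1} \]
\[ |(I - S(t))u|_r \le M t^{r - \rho}\,|u|_\rho, \quad u \in C^\rho, \tag{S2} \]
\[ \Bigl|\frac{d}{dt} S(t)u\Bigr|_r \le M t^{r - \rho - 1}\,|u|_\rho, \quad u \in C^\rho, \tag{S3} \]
\[ \lim_{t \to \infty} |(I - S(t))u|_r = 0, \quad u \in C^r, \tag{S4} \]
for $m - \alpha \le r \le \rho \le m + 10\alpha$. (We will show later how to construct these operators for any compact manifold.) We proceed now to the statement of the main result of this chapter: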
2.2. Nash implicit functional theorem: Let $f$ be a mapping whose domain $D(f)$ is the unit sphere of $C^m$ with range in $C^{m-\alpha}$. Suppose that
(i) $f$ has two continuous F-derivatives, both bounded by $M$.
(ii) There exists a map $L(u)$ with domain $D(L) = D(f)$ and range in the space $\mathcal{B}(C^m, C^{m-\alpha})$ of bounded linear operators on $C^m$ to $C^{m-\alpha}$, such that:
(iia) $|L(u)h|_{m-\alpha} \le M\,|h|_m$, $u \in D(L)$, $h \in C^m$;
(iib) $df(u)\,L(u)\,h = h$, $u \in D(L)$, $h \in C^{m+\alpha}$;
(iic) $|L(u)f(u)|_{m+9\alpha} \le M\,(1 + |u|_{m+10\alpha})$, $u \in C^{m+10\alpha}$.
Then, if $|f(0)|_{m+9\alpha} \le 2^{-40} M^{-202}$, $f(D(f))$ contains the origin.
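Proof: Let $\kappa = 3/2$ and $\beta, \mu, \nu > 0$ be real numbers to be specified later. Put $u_0 = 0$, and proceeding inductively, put
\[ u_{n+1} = u_n - S_n L(u_n)\,f(u_n), \tag{2.8; n} \]
where $S_n = S(e^{\beta\kappa^n})$. We will prove inductively that
(2.9; n) $u_n \in D(f)$,
(2.10; n) $|u_n - u_{n-1}|_m \le e^{-\mu\alpha\beta\kappa^n}$,
(2.11; n) ...
(2.12; n) $1 + |u_n|_{m+10\alpha} \le e^{\nu\alpha\beta\kappa^n}$.
Suppose (2.10; j) is true for $j \le n$. Then
\[ |u_n|_m \le \sum_{j=1}^n e^{-\mu\alpha\beta\kappa^j} \le \sum_{j=1}^\infty e^{-\mu\alpha\beta(\kappa - 1)j} = \frac{e^{-\mu\alpha\beta(\kappa - 1)}}{1 - e^{-\mu\alpha\beta(\kappa - 1)}}, \]
which implies (2.9; n) if $\beta$, $\mu$ are sufficiently large. Suppose now that (2.9; j), (2.10; j), (2.11; j), (2.12; j) are true for $j \le n$. Then
\[ |u_{n+1} - u_n|_m = |S_n L(u_n) f(u_n)|_m \le M e^{\alpha\beta\kappa^n}\,|L(u_n) f(u_n)|_{m-\alpha} \le M^2 e^{\alpha\beta\kappa^n}\,|f(u_n)|_m \]
\[ \le M^2 e^{\alpha\beta\kappa^n}\,|f(u_{n-1}) - df(u_{n-1})\,S_{n-1} L(u_{n-1}) f(u_{n-1})|_m + M^3 e^{\alpha\beta\kappa^n}\,|u_n - u_{n-1}|_m^2 \]
\[ \le M^2 e^{\alpha\beta\kappa^n}\,|df(u_{n-1})\,(I - S_{n-1})\,L(u_{n-1}) f(u_{n-1})|_m + M^3 e^{\alpha\beta\kappa^n} e^{-2\mu\alpha\beta\kappa^n} \]
\[ \le M^3 e^{\alpha\beta\kappa^n}\,\bigl[ M e^{-9\alpha\beta\kappa^{n-1}}\,|L(u_{n-1}) f(u_{n-1})|_{m+9\alpha} + e^{-2\mu\alpha\beta\kappa^n} \bigr] \]
\[ \le M^4 e^{\alpha\beta\kappa^n}\,\bigl[ e^{-9\alpha\beta\kappa^{n-1}}\,M\,(1 + |u_{n-1}|_{m+10\alpha}) + e^{-2\mu\alpha\beta\kappa^n} \bigr] \]
\[ \le M^5 e^{\alpha\beta\kappa^n}\,\bigl[ e^{-9\alpha\beta\kappa^{n-1}} e^{\nu\alpha\beta\kappa^{n-1}} + e^{-2\mu\alpha\beta\kappa^n} \bigr] \]
\[ \le M^5\,\bigl\{ \exp[\alpha\beta\kappa^{n-1}(\nu - 9 + \kappa)] + \exp[\alpha\beta\kappa^n(1 - 2\mu)] \bigr\}. \]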
HARD IMPLICIT FUNCTIONAL THEOREMS
37
The desired inequality will be then implied by (2.13; n)
M5 {exp [ape' (v  9 + x)] + exp (c flu" (1  2µ)])
which (noting that x = 4) will follow for P sufficiently large if we choose (2.14)
µ>2, 41u+v< s.
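Thus (2.10; n + 1) follows. Next we note that
$$\begin{aligned}
1 + |u_{n+1}|_{m+10\alpha} &\le 1 + \sum_{j=0}^{n} |S_j L(u_j) f(u_j)|_{m+10\alpha} \\
&\le 1 + M \sum_{j=0}^{n} e^{\alpha\beta\kappa^j} |L(u_j)f(u_j)|_{m+9\alpha} \\
&\le 1 + M^2 \sum_{j=0}^{n} e^{\alpha\beta\kappa^j} (1 + |u_j|_{m+10\alpha}) \\
&\le 1 + M^2 \sum_{j=0}^{n} e^{\alpha\beta(1+\nu)\kappa^j}.
\end{aligned}$$
Thus
(2.16) $(1 + |u_{n+1}|_{m+10\alpha})\, e^{-\nu\alpha\beta\kappa^{n+1}} \le e^{-\nu\alpha\beta\kappa^{n+1}} + M^2 \sum_{j=0}^{n} e^{\alpha\beta[(1+\nu)\kappa^j - \nu\kappa^{n+1}]}$.
If $\nu > 2$ the right side of (2.16) will be less than 1 for sufficiently large $\beta$, and so statement (2.12; n + 1) will follow from (2.16), completing our induction.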
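The sign conditions that close the induction can be checked by direct arithmetic. The sketch below assumes $\kappa = 3/2$ and uses the hypothetical sample values $\mu = \nu = 9/4$ (an illustration only, not values taken from the text). It computes the three quantities that must be negative: the two exponents of (2.13; n) compared with $e^{-\mu\alpha\beta\kappa^{n+1}}$, and the exponent of the term $j = n$ in (2.16).

```python
from fractions import Fraction as Fr

def induction_exponents(kappa, mu, nu):
    """Quantities whose negativity lets the induction close for large beta.

    Each value is an exponent divided by a positive factor
    (alpha*beta*kappa**(n-1) or alpha*beta*kappa**n).
    """
    # first term of (2.13; n) against exp(-mu*alpha*beta*kappa**(n+1))
    first = nu - 9 + kappa + mu * kappa**2
    # second term of (2.13; n) against the same right-hand side
    second = 1 - 2 * mu + mu * kappa
    # term j = n of the sum in (2.16)
    third = (1 + nu) - nu * kappa
    return first, second, third

# kappa = 3/2 is assumed; mu = nu = 9/4 are hypothetical sample values
exps = induction_exponents(Fr(3, 2), Fr(9, 4), Fr(9, 4))
print(exps)
print(all(e < 0 for e in exps))
```

With $\mu = 2$ exactly the second quantity vanishes, which is why the strict inequality $\mu > 2$ is needed.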
Taking $\nu$ and $\mu$ suitably (both slightly larger than 2), condition (2.14) is satisfied, and so we have only to verify the correctness of statements (2.10; 1) and (2.12; 1) and our proof will be complete. These statements, however, are simply the inequalities
(2.17) $|S_0 L(0) f(0)|_m \le e^{-\mu\alpha\beta\kappa}$
and
(2.18) $1 + |S_0 L(0) f(0)|_{m+10\alpha} \le e^{\nu\alpha\beta\kappa}$,
and they in turn follow from the bound for $|f(0)|_{m+9\alpha}$ and (iic). The conclusion of the proof is now just as in Theorem 2.1.
We proceed now to the construction of the family of smoothing operators whose existence was assumed, for the special case K = n-dimensional torus;
then $C^k(K)$ is simply the space of all k-times differentiable functions u(x), defined in $E^n$ and periodic with period $2\pi$ in each variable. Take a sufficiently large constant M and a function $\hat a \in C^\infty(E^n)$, vanishing outside a compact set and identically equal to 1 in a neighborhood of 0, and let a be its Fourier transform. It is well known that for any $\alpha$, N,
$$|D^\alpha a(x)| \le A_{\alpha,N} (1 + |x|)^{-N}.$$
Moreover
$$\int_{E^n} a(x)\,dx = 1, \qquad \int_{E^n} x^\alpha a(x)\,dx = 0, \quad x^\alpha = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}, \quad |\alpha| > 0.$$
Now we set
$$(S(t)u)(x) = t^n \int_{E^n} a(t(x - y))\,u(y)\,dy.$$
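On the one-dimensional torus this convolution acts on Fourier coefficients as a multiplier. The following is a minimal sketch under stated assumptions: the normalization of the Fourier transform is taken so that $S(t)$ multiplies the k-th coefficient by $\hat a(k/t)$, the cutoff `a_hat` is one hypothetical choice (equal to 1 for $|\xi| \le 1/2$ and to 0 for $|\xi| \ge 1$), and a trigonometric polynomial is stored as a dictionary from frequencies to coefficients.

```python
import math

def smooth_step(s):
    """C-infinity function equal to 0 for s <= 0 and to 1 for s >= 1."""
    def g(r):
        return math.exp(-1.0 / r) if r > 0 else 0.0
    return g(s) / (g(s) + g(1.0 - s))

def a_hat(xi):
    """Hypothetical smooth cutoff: 1 for |xi| <= 1/2, 0 for |xi| >= 1."""
    return smooth_step(2.0 * (1.0 - abs(xi)))

def S(t, coeffs):
    """Apply the smoothing operator to u = sum_k coeffs[k] * exp(i k x)."""
    return {k: a_hat(k / t) * c for k, c in coeffs.items()}

u = {1: 1.0, -3: 0.5, 40: 0.25}
print(S(8.0, u))    # frequencies with |k| <= 4 kept, |k| >= 8 removed
print(S(100.0, u))  # for large t nothing is removed
```

For a fixed trigonometric polynomial, $S(t)u = u$ as soon as t exceeds twice the highest frequency, which is the mechanism behind (S4) in this simple setting.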
It is clear that $S(t)u \in C^\infty$ and, since S(t) commutes with partial differentiation operators, we have to prove statements (S1), (S2) above only for r = 0. In fact, suppose (S1) is true for r = 0. Then
$$|S(t)u|_\rho \le M t^\rho |u|_0.$$
Taking any $\alpha$, $|\alpha| \le r$,
$$|D^\alpha S(t)u|_{\rho - r} = |S(t) D^\alpha u|_{\rho - r} \le M t^{\rho - r} |D^\alpha u|_0 \le M t^{\rho - r} |u|_r.$$
But then
$$|S(t)u|_\rho \le M t^{\rho - r} |u|_r.$$
One deals similarly with (S2). Suppose then that r = 0. (S1) reduces to
$$|S(t)u|_\rho \le M t^\rho |u|_0.$$
Let $|\alpha| \le \rho$. We have
$$|D^\alpha S(t)u|_0 = t^{n + |\alpha|} \left| \int_{E^n} (D^\alpha a)(t(x - y))\,u(y)\,dy \right|_0 \le M t^{|\alpha|} |u|_0 \le M t^\rho |u|_0$$
if $|\alpha| \le \rho$, so (S1) is established. (S2) reduces to
$$|(I - S(t))u|_0 \le M t^{-\rho} |u|_\rho.$$
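To prove this, apply Taylor's theorem with integral remainder:
$$f(1) = \sum_{k=0}^{\rho-1} \frac{f^{(k)}(0)}{k!} + \frac{1}{(\rho-1)!} \int_0^1 (1 - \mu)^{\rho-1} f^{(\rho)}(\mu)\,d\mu$$
to the function f(t) = u(x + ty). We obtain
$$u(x + y) = \sum_{k=0}^{\rho-1} \sum_{|\alpha| = k} \frac{1}{\alpha!}\, y^\alpha D^\alpha u(x) + \rho \sum_{|\alpha| = \rho} \frac{1}{\alpha!}\, y^\alpha \int_0^1 (1 - \mu)^{\rho-1} D^\alpha u(x + \mu y)\,d\mu,$$
where $\alpha! = \alpha_1! \cdots \alpha_n!$. Thus, since the integral of a is 1 and its higher moments vanish,
$$\begin{aligned}
u(x) - (S(t)u)(x) &= t^n \int_{E^n} a(t(x - y)) (u(x) - u(y))\,dy \\
&= -\rho \sum_{|\alpha| = \rho} \frac{1}{\alpha!}\, t^n \int_{E^n} \int_0^1 a(t(x - y)) (1 - \mu)^{\rho-1} (y - x)^\alpha D^\alpha u(x + \mu(y - x))\,d\mu\,dy.
\end{aligned}$$
Making the change of variable $t(y - x) = z$, we obtain
$$t^n \int_{E^n} \int_0^1 a(t(x - y)) (1 - \mu)^{\rho-1} (y - x)^\alpha D^\alpha u(x + \mu(y - x))\,d\mu\,dy = t^{-|\alpha|} \int_{E^n} \int_0^1 a(-z) (1 - \mu)^{\rho-1} z^\alpha D^\alpha u(x + \mu t^{-1} z)\,d\mu\,dz.$$
But then it is easy to conclude
$$|(I - S(t))u|_0 \le M t^{-\rho} |u|_\rho,$$
which proves (S2).
Let us pass now to (S3). As before, we can suppose without loss of generality that r = 0, in which case (S3) reduces to
$$\left| \frac{d}{dt} S(t)u \right|_\rho \le M t^{\rho - 1} |u|_0.$$
But
$$\frac{d}{dt} S(t)u = \frac{d}{dt}\, t^n \int_{E^n} a(t(x - y))\,u(y)\,dy = t^{n-1} \int_{E^n} \Big( n\,a(t(x - y)) + \sum_{i=1}^{n} t(x_i - y_i)\,(D_i a)(t(x - y)) \Big) u(y)\,dy.$$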
Reasoning entirely analogous to that used in the proof of (S2) yields the desired result. As for (S4), it is a well-known result in the theory of singular integrals, and therefore we omit the proof. We note that the construction of the smoothing operators could be carried out for any compact manifold, and not only for the torus. For the proof, we refer the reader to J. Schwartz, On Nash's Implicit Functional Theorem, Comm. Pure Appl. Math., vol. 13 (1960), pp. 509-530. We note also that the use of the spaces $C^r$ and the norms $|\cdot|_r$ is by no means essential in the proof of Nash's implicit functional theorem; indeed, these spaces can be replaced, for example, by spaces like $L_p^r(K) = L_p^r$ = space of all (possibly vector-valued) functions f for which
$$\int_K |D^\alpha f|^p\,dx < \infty, \qquad |\alpha| \le r,$$
with the norm given by
$$|f|_r^p = \sum_{|\alpha| \le r} \int_K |D^\alpha f|^p\,dx.$$
We present now a useful corollary of Nash's Implicit Functional Theorem.
2.3. 2nd implicit functional theorem: Let T = n-dimensional torus, let $f : C^k \to C^{k-\alpha}$ be defined on the unit sphere of $C^k$, and suppose that
(i) f has infinitely many continuous F-derivatives.
(ii) f is translation invariant, i.e. if $u \in C^k$, $|u|_k < 1$, then
$$f(u(\cdot + h))(x) = \{f(u)\}(x + h).$$
(iii) There exists a mapping L(u) defined in the unit sphere of $C^k$ with values in $B(C^k, C^{k-\alpha})$ such that L(u) is translation invariant in the same sense as f, such that L(u) has infinitely many continuous F-derivatives, and such that
(iiia) $|L(u)h|_{k-\alpha} \le M|h|_k$, $u \in C^k$, $h \in C^k$,
(iiib) $df(u)L(u)h = h$, $u, h \in C^{k+\alpha}$.
Then, if f(0) = 0, f(D(f)) contains a $C^\infty$-neighborhood of zero.
Proof: Note that, since f is translation invariant, it commutes with derivatives, so if we apply f to a function in $C^{k'}$, $k' > k$, we obtain a function in $C^{k'-\alpha}$; similarly L(u) can be considered as a function whose domain is the unit sphere of $C^{k'}$ and range $C^{k'-\alpha}$. The inequalities and identities
(iiia)' $|L(u)h|_{k'-\alpha} \le M|h|_{k'}$, $u \in C^{k'}$, $h \in C^{k'}$,
(iiib)' $df(u)L(u)h = h$, $u, h \in C^{k'+\alpha}$,
also hold. We now have only to apply Nash's implicit functional theorem, for which we need (iic) of its statement. But this is a consequence of the translation invariance of f and L together with inequality (iiia) and the boundedness of the derivatives of f. Applying Nash's implicit functional theorem, it follows that if a point v is sufficiently near to the origin in $C^{k_1}$, there is a point in $C^k$ whose image is v. Therefore, f(D(f)) contains a $C^{k_1}$-neighborhood of the origin, and thus a $C^\infty$-neighborhood also.
We show now how the implicit functional theorem can be applied, first to an artificial example and then to a natural one.
B. A Partial Differential Equation

Consider functions of n variables, of period $2\pi$ in each variable, i.e. functions on the n-dimensional torus. The partial differential operator $\Box$, a polynomial with constant coefficients in $\partial/\partial x_1, \dots, \partial/\partial x_n$ involving derivatives of order up to 4, has, by deliberate choice, an extremely unfortunate "mixed" character from the point of view of the theory of partial differential operators. But it is easy to see that $\Box$ admits the complete orthogonal set of functions $\exp(i(m_1 x_1 + \dots + m_n x_n))$ as eigenfunctions, and that the eigenvalues of $\Box$ are Gaussian integers. Therefore, the equation
(B1) $(\Box + \tfrac12)u = v$
is invertible in the following sense: if v is a function in $L_2(K)$, then there is a function $u \in L_2(K)$ such that (B1) is valid in the $L_2$-sense. Our aim is to show that the equation
$$f(u) = \Box u + \tfrac12 u + u^3 \exp(\Box u) = v$$
has a solution $u \in C^\infty$ for sufficiently small v in $C^\infty$. Observe first that f maps $C^k$ into $C^{k-4}$ for any k and that it has infinitely many continuous F-derivatives, the first of which is
$$df(u)h = (1 + u^3 \exp(\Box u))\,\Box h + (\tfrac12 + 3u^2 \exp(\Box u))\,h.$$
To find L(u) we have to invert
(B2) $\Box h + \dfrac{\tfrac12 + 3u^2 \exp(\Box u)}{1 + u^3 \exp(\Box u)}\,h = \dfrac{g}{1 + u^3 \exp(\Box u)}$.
For u = 0, (B2) reduces to (B1), so by a standard perturbation theory argument, (B2) will be invertible for any u sufficiently close to zero in $C^k$. The operator $L(u)g = h$ will then be defined and certainly continuous as an operator from $C^k$ to $C^{k-4}$; thus inequality (iiia) follows for L, and (iiib) is an immediate consequence of the definition of L(u), as is the fact that L(u) has infinitely many derivatives and is translation invariant. But we have now verified all the hypotheses of Theorem 2.3, so our result follows at once. We note next that our "translation invariance" requirements on f do not prevent us from treating some apparently unmanageable cases, such as
$$f(u) = \Box u(x) + c_1(x)u(x) + c_2(x)u^3(x) \exp(\Box u(x)) = v(x),$$
where $c_1(x)$ and $c_2(x)$ are $C^\infty$ functions on the n-dimensional torus. In fact, we have only to look at this problem as if it were that of solving the system of equations
$$\Box u + d_1 u + d_2 u^3 \exp(\Box u) = v, \qquad d_1 = c_1, \qquad d_2 = c_2.$$
If we suppose that the operator $u \to (\Box + c_1)u$ has a bounded inverse in $L_2$ (as was the case for $c_1 = \tfrac12$), then the first Frechet derivative of the infinitely differentiable mapping
$$F : [d_1, d_2, u] \to [\Box u + d_1 u + d_2 u^3 \exp(\Box u),\ d_1,\ d_2]$$
will have an inverse; in fact
$$dF[d_1, d_2, u][s_1, s_2, h] = [\Box h + s_1 u + d_1 h + s_2 u^3 \exp(\Box u) + 3 d_2 u^2 \exp(\Box u)\,h + d_2 u^3 \exp(\Box u)\,\Box h,\ s_1,\ s_2],$$
which, for $d_1$ and $d_2$ near $c_1$ and $c_2$ and u sufficiently close to zero in suitable senses, may be solved as in the previous case. Observing that F is translation invariant, and reasoning as before, our result follows.
C. Embedding of Riemannian Manifolds

Bibliography
1. N. Bourbaki, Espaces vectoriels topologiques.
2. S. Helgason, Differential Geometry and Symmetric Spaces.
3. S. Lang, "Fonctions implicites et plongements riemanniens", Sem. Bourbaki, E.N.S., expose 237 (1961-62).
4. J. Nash, "The imbedding problem for Riemannian manifolds", Ann. of Math., vol. 63, pp. 20-63 (1956).

Now we shall consider the problem of isometric embeddings of Riemannian manifolds in Euclidean spaces. This problem was successfully treated for the first time by John Nash (see [4]), and it provides a natural application for theorems such as Theorem 2.2 above. The problem can be stated as follows. Is every Riemannian manifold (say of class $C^k$) isometrically embeddable in $R^m$? (Throughout this section, embedding means diffeomorphic mapping with injective differential at each point (= regular at each point).) Nash's answer is in the affirmative (technically, when $k \ge 3$), and he also asserts that m may be chosen less than or equal to an explicit function of the dimension n of the manifold (namely $m \le \tfrac12(3n^3 + 14n^2 + 11n)$ for the general case and $m \le \tfrac12 n(3n + 11)$ if M is compact). Actually we shall here prove only a weak result: our final statement will deal only with compact $C^\infty$ Riemannian manifolds and no bounds for m will be determined. M will henceforth denote a compact Riemannian manifold of dimension n. The manifold itself and its metric will be supposed of class $C^\infty$.

I. Remark: Without loss of generality the manifold M may be supposed to be a torus ([3], No. 1). In fact, by Whitney's theorem (cf. G. de Rham, Varietes differentiables, or Milnor, Notes on Differential Topology, Princeton, 1959) M can be represented as a closed smooth bounded submanifold of some Euclidean space $E^N$. But then by properly choosing everything we can assume that the projection of $E^N$ on some torus is 1-1 on the manifold M. This represents M as a closed smooth submanifold of a torus. Now we have some Riemannian metric defined on the submanifold M of the torus. By a standard procedure using partitions of unity it is possible to extend this metric to a metric on all of the torus. If we now isometrically embed the torus equipped with this metric we obtain by restriction an isometric embedding of M. Thus we may always suppose that our manifold is a torus; this will simplify some constructions. Nevertheless we begin by discussing an arbitrary compact manifold, because as far as VI below the assumption that M is a torus makes no difference in the proofs.

II. Consider the Banach space $C^r(M, R^m)$ of r-times differentiable functions on M with values in $R^m$, and, more generally, the Banach space $S^r$ of symmetric, doubly covariant, r times continuously differentiable tensor fields on M, defined by dealing locally with matrices instead of with real numbers (see above). Such tensor fields are metrics and $R^m$ has a canonical metric, namely the Euclidean metric. Each $z \in C^r(M, R^m)$ induces "by devolution" of this metric an element of $S^{r-1}$, and therefore we have a mapping $f : C^r(M, R^m) \to S^{r-1}$ (for a more explicit definition see below). We shall show that for m large enough, the image of f covers an open set of tensor fields. To do so we shall verify the hypotheses of Theorem 2.2, and then establish our claim easily.
III. First of all we want to know the Frechet derivatives of f (and f itself). Suppose that $z_1, \dots, z_m$ are the canonical coordinates in $R^m$, and that $x_1, \dots, x_n$ is a coordinate system defined on some open set U of M. Then if $z \in C^r(M, R^m)$ and f(z) denotes, as above, the tensor on M induced by z, we have the following expression in coordinates:
(1) $(f(z))_{i,j} = \sum_{\alpha=1}^{m} \dfrac{\partial z_\alpha}{\partial x_i} \dfrac{\partial z_\alpha}{\partial x_j}$.
This formula is standard and may be taken as the starting point, but (at the risk of being more boring than necessary) we add the following exposition.
If X is a manifold and $p \in X$, denote by $TX_p$ the tangent space to X at p. Now if $z : M \to R^m$ is smooth, $p \in M$, $q = z(p)$, z has a differential at p, i.e., z induces a map $z_* : TM_p \to T(R^m)_q$ which is linear (see [2], § 3, No. 1). But the metric on $R^m$ induces an isomorphism
$$u : T(R^m)_q \to (T(R^m)_q)^* \qquad (* = \text{dual space}).$$
Consider now the linear mapping A obtained by composition of the mappings:
$$TM_p \xrightarrow{\ z_*\ } T(R^m)_q \xrightarrow{\ u\ } (T(R^m)_q)^* \xrightarrow{\ {}^t z_*\ } (TM_p)^*,$$
where ${}^t z_*$ stands for the transpose of $z_*$. Clearly $A : TM_p \to (TM_p)^*$. As there exists a canonical identification
$$\mathrm{Hom}_R(TM_p, (TM_p)^*) = (TM_p)^* \otimes_R (TM_p)^*,$$
A may be considered as a doubly covariant tensor at p. The correspondence $z \to A$ is what we called f, i.e. we define f by $(f(z))_p = A$. Observe that the fact that the values of z are in $R^m$ has not played any special role, and any Riemannian manifold could replace $R^m$. But in our case, we know that $T(R^m)_q$ and $(T(R^m)_q)^*$ may be identified with $R^m$ itself, and that the isomorphism u is the identity. Finally we get
(1a) $(f(z))_p = z_* \cdot {}^t z_*$.
In terms of coordinates, $z_*$ is the Jacobian matrix $z_* = J_z = (\partial z_\alpha / \partial x_i)$, and from (1a) it follows that
(1b) $f(z) = J_z \cdot {}^t J_z$.
But then
$$(f(z))_{i,j} = \sum_\alpha \frac{\partial z_\alpha}{\partial x_i} \frac{\partial z_\alpha}{\partial x_j},$$
which is (1) above. From formula (1a) or (1b) it follows at once that $f : C^{r+1} \to S^r$ is a quadratic form (see 2, Chap. I); the bilinear form $\beta$ associated with f is
(2) $\beta(x, y) = x_* \cdot {}^t y_* + y_* \cdot {}^t x_*$.
Another consequence of (1b) is the continuity of f (as a function from $C^{r+1}$ into $S^r$). This is clear. We may therefore apply Theorem 1.50 and conclude that f has derivatives of all orders:
(3) $f'(z)h = z_* \cdot {}^t h_* + h_* \cdot {}^t z_*$, $\quad f''(z)(h, k) = h_* \cdot {}^t k_* + k_* \cdot {}^t h_*$, $\quad f^{(n)}(z) = 0$ if $n \ge 3$,
and that the norms satisfy
(3') $\|f'(z)\| \le 2\|\beta\|\,\|z\|$, $\quad \|f''(z)\| = 2\|\beta\|$.
In terms of coordinates, (3) may be written as:
(3a) $f'(z)h = J_z \cdot {}^t J_h + J_h \cdot {}^t J_z$,
(3b) $(f'(z)h)_{i,j} = \sum_\alpha \left( \dfrac{\partial z_\alpha}{\partial x_i} \dfrac{\partial h_\alpha}{\partial x_j} + \dfrac{\partial z_\alpha}{\partial x_j} \dfrac{\partial h_\alpha}{\partial x_i} \right)$.
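The quadratic structure of f can be checked at a single point with small matrices. The sketch below is an illustration under stated assumptions: a Jacobian is stored as an n x m list of lists with entries $\partial z_\alpha / \partial x_i$ (row index i, column index $\alpha$), the helper names are hypothetical, and the sample matrices are arbitrary. It verifies the exact expansion $f(z + h) = f(z) + f'(z)h + f(h)$, which is what makes all derivatives of order three and higher vanish.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(A, B)]

def f(J):
    """Induced tensor at a point: J * tJ, as in (1b)."""
    return matmul(J, transpose(J))

def df(Jz, Jh):
    """First derivative at a point: Jz * tJh + Jh * tJz, as in (3a)."""
    return add(matmul(Jz, transpose(Jh)), matmul(Jh, transpose(Jz)))

# arbitrary sample Jacobians with n = 2, m = 3
Jz = [[1, 0, 2], [0, 1, 1]]
Jh = [[0, 1, -1], [2, 0, 1]]

lhs = f(add(Jz, Jh))
rhs = add(add(f(Jz), df(Jz, Jh)), f(Jh))
print(lhs == rhs)   # the expansion is exact, with no remainder term
print(f(Jz))        # a symmetric n x n matrix
```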
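Naturally we plan to show that f'(z)h is invertible (as a function of h) in a very smooth way, i.e.,
(4) given $g \in S^r$ and $z \in C^{r+1}$ we want to find h as differentiable as possible such that $g = f'(z)h$.

IV. To achieve this we use a trick invented by Nash ([4], p. 31), which adds some new conditions to the problem of solving (4). In other words, we require that the solution h have an additional property given a priori, namely
(5) ${}^t z_*(h(p)) = 0$ for every $p \in M$.
Since $(T(R^m)_q)^*$ was identified with $R^m$ and $h(p) \in R^m$,
$${}^t z_* : (T(R^m)_q)^* \to (TM_p)^*,$$
it is clear that (5) makes sense. Of course (4) and (5) may be written in coordinates as:
(4a) $g_{ij} = \sum_\alpha \left( \dfrac{\partial z_\alpha}{\partial x_i} \dfrac{\partial h_\alpha}{\partial x_j} + \dfrac{\partial z_\alpha}{\partial x_j} \dfrac{\partial h_\alpha}{\partial x_i} \right)$,
(5a) $\sum_\alpha \dfrac{\partial z_\alpha}{\partial x_i}\, h_\alpha = 0$, $\quad i = 1, \dots, n$.
We now prove that the conditions (4) and (5) may be satisfied simultaneously by a suitable h. Differentiating (5a) with respect to $x_j$, we conclude that
$$\sum_\alpha \frac{\partial z_\alpha}{\partial x_i} \frac{\partial h_\alpha}{\partial x_j} + \sum_\alpha \frac{\partial^2 z_\alpha}{\partial x_i\,\partial x_j}\, h_\alpha = 0$$
or
$$\sum_\alpha \frac{\partial z_\alpha}{\partial x_i} \frac{\partial h_\alpha}{\partial x_j} = -\sum_\alpha \frac{\partial^2 z_\alpha}{\partial x_i\,\partial x_j}\, h_\alpha,$$
and then (4a) becomes
(4b) $g_{ij} = -2 \sum_\alpha \dfrac{\partial^2 z_\alpha}{\partial x_i\,\partial x_j}\, h_\alpha$.
This shows the point of adding the condition (5): now (4b) and (5) give a system of algebraic linear equations, equivalent to (4) and (5), which are a system of partial differential equations of first order.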
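The reduction from a differential to an algebraic system can be seen on the simplest example, which is an illustration chosen here and not taken from the text: the unit circle $z(x) = (\cos x, \sin x)$ in $R^2$ (n = 1, m = 2) with a normal perturbation $h = a(x)\,z(x)$, where the amplitude a is a hypothetical smooth function. The sketch evaluates the constraint (5a) and both forms of the equation for $g_{11}$ at a point; the two values agree and equal $2a(x)$.

```python
import math

def a(x):
    return 1.0 + x * x      # hypothetical amplitude of the normal perturbation

def da(x):
    return 2.0 * x          # its derivative

def check_circle(x):
    """Return (constraint (5a), g from (4a), g from (4b)) at the parameter x."""
    z = (math.cos(x), math.sin(x))
    dz = (-math.sin(x), math.cos(x))
    d2z = (-math.cos(x), -math.sin(x))
    h = (a(x) * z[0], a(x) * z[1])
    dh = (da(x) * z[0] + a(x) * dz[0], da(x) * z[1] + a(x) * dz[1])
    dot = lambda p, q: p[0] * q[0] + p[1] * q[1]
    constraint = dot(dz, h)      # (5a): h is normal to the curve
    g_4a = 2.0 * dot(dz, dh)     # (4a) with i = j = 1: first order in h
    g_4b = -2.0 * dot(d2z, h)    # (4b): no derivatives of h
    return constraint, g_4a, g_4b

print(check_circle(0.7))
```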
Equations (4) and (5) (or (4b) and (5a)) can be written for every $z \in C^{r+1}(M, R^m)$. Nevertheless, we can assure the existence of a solution only for a nonvoid open set of z's in $C^3(M, R^m)$, where m is large enough (m may be chosen to be $m = 2n^2 + 3n$, see [4], p. 53, but remember that we do not care about bounds for m).

V. Choose a mapping of M into $R^s$ by functions $v_1, \dots, v_s$. Now define a mapping $\hat z$ of M into $R^{s + (1/2)s(s+1)}$ by means of the functions
$$v_1, \dots, v_s,\ v_1^2,\ v_1 v_2, \dots, v_1 v_s,\ v_2^2, \dots, v_s^2.$$
Write
$$\hat z_1 = v_1,\ \hat z_2 = v_2, \dots, \hat z_s = v_s, \qquad \hat z_{i,j} = v_i v_j, \quad i \le j.$$
If $v = (v_1, \dots, v_s)$ is a regular $C^\infty$ embedding of M into $R^s$ (the existence of such is guaranteed by Whitney's Theorem), we claim that f'(z)h is invertible as a function of h for every z in a neighborhood of $\hat z$. In fact, if $v = (v_1, \dots, v_s)$ is a regular embedding, one can take as local coordinates for M some appropriate subset of the $v_i$'s. Suppose that M has been covered by the open sets $U_1, \dots, U_N$ in such a way that on each $U_k$ one such subset works as a coordinate system. We consider the linear systems (4b) and (5a) in this particular case. In order to simplify the notation, order the $\hat z$'s once and for all by
$$(\hat z_1, \dots, \hat z_s, \hat z_{1,1}, \hat z_{1,2}, \dots, \hat z_{s,s})$$
and write $\alpha$ as a general index for them. Let us fix one particular $U_k$ (call it simply U) and suppose that $x_1 = v_1, x_2 = v_2, \dots, x_n = v_n$ is a system of coordinates on U. Consider $\hat z$. The coefficients of (4b) and (5a) are first or second derivatives of the $\hat z_\alpha$'s with respect to the $x_i$'s, and the matrix B of the system of linear equations has the following form at every point of U:

[Figure: block form of the matrix B, with identity blocks $I_n$ and $I_k$ and shaded blocks of arbitrary coefficients.]
In the above, shading indicates arbitrary coefficients and $k = \tfrac12 n(n+1)$. $I_n$, $I_k$ are the identity matrices of dimension n and k respectively. This shows that the matrix B has maximal rank $n + \tfrac12 n(n+1)$ at every point of U (its rows being linearly independent). The same is true (for obvious reasons) for every z which is sufficiently near $\hat z$ in the $C^2$ sense. This remark has two strong consequences. First, it clearly shows that the systems (4b) and (5a) have solutions at every point of U for all z that are $C^2$ near $\hat z$. This is basic in finding h. Second, it will follow from this that there exists a solution of (4b) and (5a) defined by a mapping that is as smooth as g and the second derivatives of z are. This needs some explanation. At each point of U, we know that (4b) and (5a) can be solved. Among the solutions of this system of linear equations we pick out one by the condition
(6) $\sum_\alpha (h_\alpha)^2 = \text{minimum}$.
It is easy to conclude from the fact that the solutions form a convex set in Euclidean space, that one well defined solution is thereby selected.
This defines a mapping $h : U \to R^l$, $l = s + \tfrac12 s(s+1)$, and two problems arise. (a) Is h differentiable? (b) Is h defined independently of U?
(a) Differentiability of h. Fix a point p in U. Since B has maximal rank, $B\,{}^tB$ is nonsingular (this follows from the fact that $\det(B\,{}^tB)$ is the Gram determinant of the rows of B, and consequently different from zero if these rows are linearly independent). But then the equation
$$B\,{}^tB\,D = G$$
has a unique solution $D = (B\,{}^tB)^{-1} G$, for all G. Define
(7) $H = {}^tB\,D = {}^tB\,(B\,{}^tB)^{-1} G$.
Clearly H is a solution of (4b) and (5a) at the given point of U. We claim that this is the solution satisfying (6). In fact, if R is another solution, BR = G, we have (writing ( , ) for the scalar product in $R^l$):
$$(R, R) = (H, H) + (R - H, R - H) + 2(H, R - H)$$
and
$$(H, R - H) = ({}^tB\,D, R - H) = (D, BR - BH) = (D, G - G) = 0.$$
Then
$$(R, R) - (H, H) = (R - H, R - H),$$
and this proves that among all solutions R of BR = G, (R, R) is minimum when R = H.
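Formula (7) and the minimality argument can be traced on a small example. The sketch below uses exact rational arithmetic; the matrix B (of maximal rank 2), the right-hand side G and the null vector are arbitrary sample data, and the helper functions are written out only to keep the example self-contained.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def solve(A, b):
    """Solve the square nonsingular system A x = b by Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)   # pivot row
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]

def min_norm_solution(B, G):
    """H = tB (B tB)^(-1) G, as in formula (7)."""
    Bt = transpose(B)
    D = solve(matmul(B, Bt), G)    # B tB D = G
    return matvec(Bt, D)           # H = tB D

B = [[1, 0, 1], [0, 1, 1]]         # sample matrix of rank 2
G = [1, 2]
H = min_norm_solution(B, G)
null = [1, 1, -1]                  # B applied to this vector gives 0
R = [h + v for h, v in zip(H, null)]   # another solution of B R = G
norm2 = lambda x: sum(v * v for v in x)
print(H, matvec(B, H), norm2(H), norm2(R))
```

Here (R, R) - (H, H) equals the squared length of the null vector, in agreement with the identity (R, R) - (H, H) = (R - H, R - H).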
If the point p in U is now permitted to vary, formula (7) shows that H varies as smoothly as B and G do. Hence it follows that h is r times differentiable provided that z is r + 2 times differentiable and g is r times differentiable.
(b) It remains to show that the solution H is independent of the coordinates chosen in U. But that is easy. If $h_1$, $h_2$ are solutions constructed on $U_1$ and $U_2$ respectively, since property (6) is coordinate independent, $h_1$ and $h_2$ both must possess it at every point of $U_1 \cap U_2$; by uniqueness $h_1$ and $h_2$ agree there. This proves that h may be defined everywhere, and thus we are through. The expression
(7) $H = {}^tB\,(B\,{}^tB)^{-1} G$
for H at p in terms of G at p (and z locally) assures that the correspondence $g \to h$ (z supposed fixed) is linear.
But it tells us still more. In fact, the matrix B involves first and second derivatives of z (B is the matrix of (4b) and (5a)). Then from (7) it is apparent that H is a smooth function of g and of the first and second derivatives of z. Let L(z)g denote the function H of (7). The smoothness of H as a function of z and g may be stated as follows: the mapping L(z)g is continuous in both variables z and g simultaneously for the topologies: $z \in C^{r+2}(M, R^m)$, $g \in S^r$, $L(z)g \in C^r(M, R^m)$. Naturally the equivalence between (4) and (5) and (4b) and (5) can be proved so long as h has at least first derivatives (otherwise (4) does not make sense). For that we need L(z)g to belong at least to $C^1(M, R^m)$. This requires r to be 3 or more. We sum up as follows:
(8) For every $r \ge 3$ there exists an open set O in $C^{r+2}(M, R^m)$ such that a function L(z)g is defined on $O \times S^r$, is continuous (in both variables simultaneously), and has values in $C^r(M, R^m)$ satisfying:
(8i) $f'(z) \circ L(z)g = g$, $(z, g) \in O \times S^r$;
(8ii) L(z)g is linear in g for every fixed z in O;
(8iii) the elements in O are 1-1 and regular at every point.
(8iii) follows from the fact that we may choose O to be a small neighborhood of the embedding $\hat z$ above, and that the set of embeddings is open in $C^r$, $r \ge 1$.
VI. We now assume that M is a torus (cf. I). Hence there exists a global set of (local) coordinates (the angular parameters $x_1, \dots, x_n$) and consequently there is a standard way of expressing the doubly covariant tensor fields as n x n matrices of functions, just by taking the components of such tensors in coordinates $x_1, \dots, x_n$ at each point. This means that it is possible to identify $S^r$ with $(C^r(M, R))^{n^2}$. But then the mapping $f : C^{r+1}(M, R^m) \to S^r$ may be assumed to have range in $(C^r(M, R))^{n^2}$ and hence to split in $n^2$ components each with range in $C^r(M, R)$. Thus each component inherits from f all properties vis-a-vis derivatives. We leave to the reader now the verification that these components satisfy the hypotheses of Theorem 2.2. We can then apply Theorem 2.2 and conclude that:
(9) The image under f of the set of infinitely often continuously differentiable embeddings of M in some Euclidean space covers an open set of $S^\infty$.
Remark: We may restrict our embeddings to be embeddings of M in some fixed $R^m$, and the conclusion should remain the same. But our next step will consist of adding directly two such embeddings and the bound m will vanish. For that reason (9) is stated without any reference to the ranges of the embeddings considered.

VII. Let $K^r$ = the set of all tensor fields in $S^r$ that are metric, that is to say positive definite at each point of M. Clearly $K^r$ is a convex cone, open for the $S^r$ topology. For every $C^{r+1}$ embedding z of M in some Euclidean space, f(z) belongs to $K^r$. Let $E^r \subset K^r$ be the set of all such f(z).
Lemma: $E^\infty$ is a convex cone dense in $K^\infty$ for the $S^\infty$ topology.
Proof: (i) $E^\infty$ is a convex cone. For every $\lambda > 0$ we have $\lambda f(z) = f(\sqrt{\lambda}\,z)$ (f is a quadratic form). If $z : M \to R^m$, $u : M \to R^s$, then the embedding $t = z \oplus u : M \to R^m \oplus R^s$ (defined in the obvious way) satisfies f(t) = f(z) + f(u). Both properties together define a convex cone.
(ii) $E^\infty$ is dense in $K^\infty$.
Proof: Suppose the contrary, and let $\bar E^\infty$ be the closure of $E^\infty$. Then there exists a point $g \in K^\infty$ such that $g \notin \bar E^\infty$. By the separating hyperplane property of the locally convex F-space $S^\infty$ of all $C^\infty$ tensors on the manifold M, there exists a continuous linear functional $\phi$ on $S^\infty$ such that $\phi(\bar E^\infty) \le 0$ and $\phi(g) > 0$. Let z be any arbitrary embedding of M into a Euclidean space, and let u be any arbitrary smooth mapping of M into a Euclidean space. Then, for any positive $\varepsilon$, $\varepsilon z \oplus u$ is an embedding. Since $\phi(f(\varepsilon z \oplus u)) \le 0$
for all $\varepsilon > 0$, it follows on letting $\varepsilon \to 0$ that $\phi(f(u)) \le 0$ for every smooth mapping u of M into a Euclidean space (cf. formula (1) above). By formula (1) above, this is equivalent to the statement that $\phi(f(u)) \le 0$ for each smooth mapping of M into 1-dimensional Euclidean space, that is, that $\phi(f(u)) \le 0$ for each smooth real-valued function u on M.
We now let $V \subseteq M$ be a coordinate patch, introduce coordinates $[x_1, x_2, \dots, x_n] = [x_1, y] = x$ in V mapping V onto the unit sphere in Euclidean space, and restrict the functional $\phi$ to the set $S_V$ of tensors in $S^\infty$ vanishing outside V. For $h \in S_V$, $\phi(h)$ may be written as
$$\phi(h) = \sum_{i,j=1}^{n} D^{ij}(h_{ij}(\cdot)),$$
where $D^{ij} = D^{ji}$ is a distribution defined in the unit sphere of n-dimensional Euclidean space, and where $h_{ij}(x)$ is the coordinate expression of the tensor $h \in S_V$. The above condition $\phi(f(u)) \le 0$ evidently implies that
$$\sum_{i,j=1}^{n} D^{ij}\left( \frac{\partial u}{\partial x_i} \frac{\partial u}{\partial x_j} \right) \le 0$$
for all smooth functions u vanishing outside a compact subset of the unit sphere. Let $\eta$ be a smooth nonnegative function in $R^n$, of total integral 1, vanishing outside the unit circle. We know from the general theory of distributions that the "convolutions"
$$D_\varepsilon^{ij}(y) = D^{ij}\left( \varepsilon^{-n}\, \eta\left( \frac{x - y}{\varepsilon} \right) \right)$$
(the distribution acting in the variable x) are a family of $C^\infty$ functions, defined in the sphere of radius $1 - \varepsilon$, and converging as $\varepsilon \to 0$ to $D^{ij}$ in the sense of the theory of distributions. From the statement
$$\sum_{i,j=1}^{n} D^{ij}\left( \frac{\partial u}{\partial x_i} \frac{\partial u}{\partial x_j} \right) \le 0$$
it is easily verified that
[*] $\displaystyle \int \sum_{i,j=1}^{n} D_\varepsilon^{ij}(x)\, \frac{\partial u}{\partial x_i}(x)\, \frac{\partial u}{\partial x_j}(x)\,dx \le 0$
for each smooth function u vanishing outside the sphere of radius $1 - \varepsilon$. We shall show that [*] implies that
[**] $\displaystyle \sum_{i,j=1}^{n} D_\varepsilon^{ij}(x)\, \xi_i \xi_j \le 0$ for each $|x| < 1 - \varepsilon$ and each vector $\xi \in R^n$.
We proceed as follows. First note that, by the rotational symmetry and the homogeneity of the condition [*], it is sufficient to prove [**] in the special, notationally simpler case $\xi = [1, 0, \dots, 0]$ and $x = [c, y]$, i.e., to prove that $D_\varepsilon^{11}(c, y) \le 0$ for each small c and $|y| < 1 - \varepsilon$. To establish this last inequality, let $\psi$ be a smooth function of a real variable t, equal to a constant in a small neighborhood of t = 0 and vanishing for $|t| > \tfrac12(1 - \varepsilon)$, and put
$$u_\delta(x) = \delta^{1/2}\, \varphi(\delta^{-1}(x_1 - c))\, \psi(|y|).$$
Then
$$\frac{\partial u_\delta}{\partial x_1}(x)\, \frac{\partial u_\delta}{\partial x_1}(x) = \delta^{-1} \left( \varphi'(\delta^{-1}(x_1 - c)) \right)^2 \psi(|y|)^2,$$
so that
$$\int \left( \frac{\partial u_\delta}{\partial x_1}(x) \right)^2 dx = \int\!\!\int \left( \varphi'(x_1 - c) \right)^2 \psi(|y|)^2\,dx$$
is independent of $\delta$, while all the other products of partial derivatives of $u_\delta(x)$ have integrals which go to zero as $\delta \to 0$. Thus, choosing $\varphi$ so that
$$\int\!\!\int |\varphi'(x_1 - c)|^2\, |\psi(|y|)|^2\,dx > 0,$$
putting $u_\delta$ for u in [*], and letting $\delta \to 0$, we find that
$$\int D_\varepsilon^{11}(c, y)\, (\psi(|y|))^2\,dy \le 0.$$
Therefore, letting $\psi$ vary through a sequence of functions approaching a $\delta$-function, we find that $D_\varepsilon^{11}(c, y) \le 0$ for $|y| < 1 - \varepsilon$. Therefore, as already observed, [**] follows. Now note that if A is a negative symmetric matrix and B is a positive symmetric matrix, then $\mathrm{tr}(AB) \le 0$. Indeed, we have
$$\mathrm{tr}(AB) = \mathrm{tr}(A B^{1/2} B^{1/2}) = \mathrm{tr}(B^{1/2} A B^{1/2}) = -\mathrm{tr}\left( B^{1/2} (-A)^{1/2} (-A)^{1/2} B^{1/2} \right) = -\mathrm{tr}(CC^*),$$
where $C = B^{1/2} (-A)^{1/2}$. Since $CC^* \ge 0$ we have $\mathrm{tr}(CC^*) \ge 0$ and our conclusion follows. Therefore it is a consequence of [**] that
[***] $\displaystyle \sum_{i,j=1}^{n} D_\varepsilon^{ij}(x)\, h_{ij}(x) \le 0$
for every smooth positive symmetric tensor $h_{ij}(x)$ vanishing outside $|x| < 1 - \varepsilon$.
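The trace inequality used to pass from [**] to [***] can be checked on a small integer example. The sketch below makes one substitution, stated plainly: instead of the square roots $(-A)^{1/2}$ and $B^{1/2}$ of the text it uses arbitrary factorizations $-A = {}^tX X$ and $B = {}^tY Y$, for which $\mathrm{tr}(AB) = -\|Y\,{}^tX\|^2$ (sum of squares of the entries). The sample matrices X and Y are arbitrary.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def trace_check(X, Y):
    """Return (tr(AB), -||Y tX||^2) for A = -tX X and B = tY Y."""
    A = [[-v for v in row] for row in matmul(transpose(X), X)]   # negative semidefinite
    B = matmul(transpose(Y), Y)                                  # positive semidefinite
    C = matmul(Y, transpose(X))
    frob2 = sum(v * v for row in C for v in row)
    return trace(matmul(A, B)), -frob2

X = [[1, 2], [0, 1]]
Y = [[1, 0], [3, 1]]
print(trace_check(X, Y))   # both entries agree and are nonpositive
```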
Integrating the inequality [***] and letting $\varepsilon \to 0$ we conclude that $\sum_{i,j} D^{ij}(h_{ij}) \le 0$ for every positive symmetric tensor vanishing outside a compact subset of the unit sphere, i.e. that $\phi(h) \le 0$ for each positive $h \in S^\infty$ vanishing outside the coordinate neighborhood V. Since, by use of an appropriate partition of unity, any positive definite $g \in S^\infty$ can be written as a sum of positive elements $h_k \in S^\infty$, each vanishing outside a certain coordinate patch of M, it follows that $\phi(g) \le 0$ for each $g \in K^\infty$. But this contradicts our original statement $\phi(g) > 0$, and thus completes the proof of assertion (ii). Q.E.D.
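Part (i) of the lemma rests on two pointwise identities, which can be checked with small matrices. In this sketch a Jacobian at a point is an n x m list of lists, f(J) = J tJ is the induced tensor, the direct sum $z \oplus u$ corresponds to placing the two Jacobians side by side, and the scaling uses $\lambda = 4$ so that $\sqrt{\lambda} = 2$ stays an integer. The helper names and sample data are hypothetical.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def f(J):
    """Induced tensor at a point: J * tJ."""
    return matmul(J, transpose(J))

def direct_sum(Jz, Ju):
    """Jacobian of z (+) u: the columns of Jz followed by the columns of Ju."""
    return [rz + ru for rz, ru in zip(Jz, Ju)]

def scale(c, J):
    return [[c * v for v in row] for row in J]

def add(A, B):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(A, B)]

Jz = [[1, 0, 2], [0, 1, 1]]   # sample Jacobian, n = 2, m = 3
Ju = [[3], [-1]]              # sample Jacobian of a real-valued function
print(f(direct_sum(Jz, Ju)) == add(f(Jz), f(Ju)))   # f(z (+) u) = f(z) + f(u)
print(f(scale(2, Jz)) == scale(4, f(Jz)))           # f(sqrt(lambda) z) = lambda f(z)
```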
2.4. Theorem: (Nash, [4]). For every compact Riemannian manifold M with a $C^\infty$ metric, there exists a $C^\infty$ isometric embedding of M in some Euclidean space.
Proof: If $E^\infty$, $K^\infty$ are the cones defined above, our theorem states simply that $E^\infty = K^\infty$. By the lemma above we know that $E^\infty$ is dense in $K^\infty$; and from (9) we also know that $E^\infty$ has interior points.
Let $g \in K^\infty$, and let $g_0$ be an interior point of $E^\infty$. As $K^\infty$ is open, there exists an element $\hat g \ne g$ in $K^\infty$ such that g is a convex combination of $g_0$ and $\hat g$. But then $\hat g$ is also a cluster point of $E^\infty$ because it belongs to $K^\infty$. Moreover, $g_0$ is interior to $E^\infty$ and $E^\infty$ is convex. This implies that all points in the open segment joining $g_0$ and $\hat g$ (in particular g) are interior points of $E^\infty$ (see [1], E.V.T., Chap. II, § 1, prop. 15), and we are done: $E^\infty = K^\infty$.
CHAPTER III

Degree Theory and Applications

A. A Form of Sard's Lemma
B. Definition of the Degree of a $C^1$ Mapping in $R^n$
C. Some Functions are Divergences
D. Back to the Definition
E. The Continuous Case
F. The Multiplicative Property and Consequences
G. Borsuk's Theorem
H. Preliminaries: Degree Theory in an Arbitrary Finite Dimensional Space
I. Preliminaries: Restriction to a Subspace
J. Degree of Finite Dimensional Perturbations of the Identity
K. Properties
L. Limits
M. Compact Perturbations
N. Multiplication Property and Generalized Jordan's Theorem for Banach Spaces
O. Fixed Point Theorems in Banach Spaces

A. A Form of Sard's Lemma

Our aim is to prove the following Theorem 3.1, which is related to Sard's lemma (cf. de Rham, "Varietes differentiables", Sec. 3, Th. 4).

3.1. Theorem: Let D be an open set in $R^n$, let f be a continuously differentiable mapping of D into $R^n$, and let J(x) be the Jacobian determinant of f at x. Then for any measurable subset E of D the set f(E) is measurable and
$$m(f(E)) \le \int_E |J(x)|\,dx.$$
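A one-dimensional example shows that the inequality can be strict when f is not one-to-one. The sketch below is an illustration with a hypothetical helper: for $f(x) = x^2$ on $E = [-1, 1]$ the image is $[0, 1]$, of measure 1, while the integral of $|J| = |2x|$ over E equals 2. The image measure is approximated by sampling (the image of an interval under a continuous function is an interval) and the integral by the midpoint rule.

```python
def image_measure_and_integral(f, df, a, b, n=1000):
    """Approximate m(f([a, b])) and the integral of |f'| over [a, b]."""
    h = (b - a) / n
    values = [f(a + i * h) for i in range(n + 1)]
    image_measure = max(values) - min(values)          # length of the image interval
    integral = sum(abs(df(a + (i + 0.5) * h)) for i in range(n)) * h
    return image_measure, integral

measure, integral = image_measure_and_integral(lambda x: x * x, lambda x: 2 * x, -1.0, 1.0)
print(measure, integral)
```

The factor 2 reflects the fact that each point of the image is covered twice.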
We begin by recalling a few definitions. For any point $v_0$ of $R^n$ and any set of n linearly independent vectors $a_1, \dots, a_n$ in $R^n$, the parallelotope P with initial vertex $v_0$ and edge-vectors $a_1, \dots, a_n$ is the set of all points of $R^n$ of the form
$$x = v_0 + \sum_{i=1}^{n} \lambda_i a_i,$$
where $\lambda_1, \dots, \lambda_n$ are real numbers such that $0 \le \lambda_i \le 1$, $i = 1, \dots, n$. The point $v_0 + \tfrac12 \sum_{i=1}^{n} a_i$ is called the center of the parallelotope. For fixed k the set of those points of P for which $\lambda_k$ has a fixed value equal to either 0 or 1 is called an (n - 1)-dimensional face (or, briefly, face) of P, so that the number of faces of P is 2n. The point $v_0 + \lambda_k a_k + \tfrac12 \sum_{i \ne k} a_i$ is called the center of the face.
It is immediate that the parallelotope P with initial vertex at the origin and edge-vectors $a_1, \dots, a_n$ is the image of the unit cube $0 \le x^i \le 1$ under the nonsingular linear transformation $h : R^n \to R^n$ given by
$$h(x) = h(x^1, \dots, x^n) = \sum_{i=1}^{n} x^i a_i;$$
moreover the image of the unit cube by any nonsingular linear transformation of $R^n$ onto itself is a parallelotope of this form. It follows that P is compact, and that the frontier of P is the union of the 2n faces of P. Moreover (see, for example, Zaanen [4], p. 160), the n-dimensional measure of P is equal to $|\det(h)| = |\det(a_i^j)|$, where $a_i^j$ is the j-th coordinate of $a_i$; and obviously these last results extend to a parallelotope with any initial vertex.
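In the plane the determinant formula can be compared with an independent area computation. The sketch below, with hypothetical helper names and arbitrary sample vectors, computes the area of a parallelogram from its four vertices by the shoelace formula and compares it with the absolute value of the determinant of the edge-vectors; the initial vertex plays no role in the result.

```python
def det(A):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def parallelogram_area(v0, a1, a2):
    """Shoelace area of the parallelogram with initial vertex v0 and edges a1, a2."""
    P = [v0,
         (v0[0] + a1[0], v0[1] + a1[1]),
         (v0[0] + a1[0] + a2[0], v0[1] + a1[1] + a2[1]),
         (v0[0] + a2[0], v0[1] + a2[1])]
    s = sum(P[i][0] * P[(i + 1) % 4][1] - P[(i + 1) % 4][0] * P[i][1] for i in range(4))
    return abs(s) / 2

a1, a2 = [2, 1], [1, 3]
print(parallelogram_area((5, -7), a1, a2), abs(det([a1, a2])))
```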
Throughout the following discussion we use the ordinary Euclidean norm for points of $R^n$ and the corresponding norm for linear transformations of $R^n$ into itself, and we denote the inner product of x and y by $x \cdot y$. We use A(n) to denote a positive constant depending only on n, not necessarily the same on any two occurrences.
For our proof of Theorem 3.1 we require two simple geometrical inequalities, which we state below.
1. Let F be a set in $R^n$ contained in a hyperplane H, let $x_0$ be a fixed point of F, and let $\|x - x_0\| \le d$ whenever $x \in F$. Let also G be the set of points of $R^n$ whose distance from F is less than $\delta$. Then G is measurable (since it is open) and
(a.1) $m(G) \le 2^n (d + \delta)^{n-1} \delta$.
It is evident that $G$ lies between the two hyperplanes parallel to $H$ and at distance $\delta$ from it, and to prove ($\alpha$.1) we construct a parallelotope containing $G$ with two of its faces in these hyperplanes.
By a suitable translation we may suppose that $H$ contains the origin, so that $H$ is an $(n-1)$-dimensional vector subspace of $R^n$. We can therefore find a unit vector $a_1$ such that $x \cdot a_1 = 0$ for all $x \in H$ (i.e. such that $a_1$ is orthogonal to $H$), and then we can find vectors $a_2, \dots, a_n$ such that $\{a_1, a_2, \dots, a_n\}$ is a complete orthogonal set in $R^n$. Let now $y \in G$. Since every vector in $R^n$ can be expressed as a linear combination of the $a_i$, there exist real numbers $\lambda_1, \dots, \lambda_n$ such that
$$y - x_0 = \sum_{i=1}^{n} \lambda_i a_i.$$
Further, since the distance of $y$ from $F$ is less than $\delta$, there exists $x \in F$ (possibly identical with $y$) such that $\|y - x\| < \delta$, and then writing $y - x = (y - x_0) - (x - x_0)$, we obtain
$$\lambda_1 = (y - x_0) \cdot a_1 = (y - x) \cdot a_1 + (x - x_0) \cdot a_1 = (y - x) \cdot a_1,$$
whence
$$|\lambda_1| \le \|y - x\| \, \|a_1\| = \|y - x\|$$
0 is essential to such proofs.
By applying 3 to the derivative of a differentiable mapping, we obtain the following result; in this we use the definition of derivative given in 1.9.
4. Let $C$ be a closed cube in $R^n$ with center $x_0$ and with sides parallel to the axes, let $f$ be a differentiable mapping of $C$ into $R^n$, and let $J(x)$ be the Jacobian determinant of $f$ at $x$. Then
$$m^*(f(C)) \le m(C) \left\{ |J(x_0)| + A(n) \left( \|f'(x_0)\| + \eta \right)^{n-1} \eta \right\}, \tag{$\alpha$.4}$$
where $\eta = \sup_{x \in C} \|f'(x) - f'(x_0)\|$ and $m^*$ denotes outer Lebesgue measure.
To prove ($\alpha$.4) let $a$ be the length of the sides of $C$, and let $P$ be the image of $C$ by the linear transformation $f'(x_0) : R^n \to R^n$. By the mean value theorem applied to the function $f - f'(x_0)$ (cf. the proof of Corollary 1.45), we have for each $x$ of $C$
$$\|f(x) - f(x_0) - f'(x_0)(x - x_0)\| \le \eta \|x - x_0\| < \eta a \sqrt{n},$$
and this inequality expresses the fact that the point $f(x) - f(x_0) + f'(x_0)(x_0)$ of the translate $f(C) - f(x_0) + f'(x_0)(x_0)$ of $f(C)$ is at a distance less than $\eta a \sqrt{n}$ from the point $f'(x_0)(x)$ of $P$. It follows that this translate of $f(C)$ is contained in the set of points of $R^n$ whose distance from $P$ is less than $\eta a \sqrt{n}$, and applying 3 (and noting that $\det(f'(x_0)) = J(x_0)$) we immediately obtain the inequality ($\alpha$.4).
5. Let $D$ be an open set in $R^n$, let $f$ be a continuously differentiable mapping of $D$ into $R^n$, and let $J(x)$ be the Jacobian determinant of $f$ at $x$. Then for any measurable subset $E$ of $D$
$$m^*(f(E)) \le \int_E |J(x)| \, dx, \tag{$\beta$.1}$$
where $m^*$ denotes outer Lebesgue measure.
Suppose first that $E$ is a closed cube $C$ with sides parallel to the axes. Since $f'$ is continuous, and hence uniformly continuous, on $C$, given $\varepsilon > 0$ we can divide $C$ into a finite number of nonoverlapping closed cubes $C_1, \dots, C_N$ with centers $x_1, \dots, x_N$ and with sides parallel to the axes such that $\|f'(x) - f'(x_k)\| \le \varepsilon$ whenever $x \in C_k$ ($k = 1, \dots, N$). By 4, for each cube $C_k$ we have
$$m^*(f(C_k)) \le m(C_k) \{ |J(x_k)| + A\varepsilon \},$$
where $A$ is independent of $k$, so that also
$$m^*(f(C)) \le \sum m^*(f(C_k)) \le \sum |J(x_k)| \, m(C_k) + A \varepsilon \, m(C),$$
the summations being extended over all cubes $C_k$. When the maximum diameter of the cubes $C_k$ tends to 0 the sum $\sum |J(x_k)| \, m(C_k)$ tends to the Riemann integral of $|J(x)|$ over $C$, and since $\varepsilon$ is arbitrary we therefore obtain
$$m^*(f(C)) \le \int_C |J(x)| \, dx, \tag{$\beta$.2}$$
which is ($\beta$.1) for $E = C$.
Suppose next that $E$ is a measurable subset of $D$. Then we can find a set $E_1$ containing $E$ and with measure equal to that of $E$ such that $E_1$ is the intersection of a contracting sequence of open sets $O_n \subset D$. If now $C$ is a closed cube contained in $D$ with sides parallel to the axes, then for each fixed $n$ the set $C \cap O_n$ is a countable union of nonoverlapping closed cubes with sides parallel to the axes, and applying ($\beta$.2) to each such cube and summing we obtain
$$m^*(f(C \cap O_n)) \le \int_{C \cap O_n} |J(x)| \, dx,$$
whence also
$$m^*(f(C \cap E)) \le \int_{C \cap O_n} |J(x)| \, dx \tag{$\beta$.3}$$
(since $E \subset O_n$). Since $|J|$ is bounded above on $C$, the integral on the right of ($\beta$.3) is finite, and so tends to $\int_{C \cap E_1} |J(x)| \, dx$ as $n$ tends to $+\infty$, whence
$$m^*(f(C \cap E)) \le \int_{C \cap E_1} |J(x)| \, dx = \int_{C \cap E} |J(x)| \, dx.$$
Since $D$ is a countable union of nonoverlapping cubes such as $C$, the general result ($\beta$.1) follows.
6. Let $D$ be an open set contained in $R^n$, and let $f$ be a continuously differentiable mapping of $D$ into $R^n$. Then $f(E)$ is measurable for every measurable set $E \subset D$.
(For a proof of this under more general hypotheses see Rado and Reichelderfer [1], pp. 337, 214.) Let $J(x)$ be the Jacobian determinant of $f$ at $x$. It follows immediately from 5 that if $E_0$ is the subset of $D$ where $J(x) = 0$, then $f(E_0)$ has measure 0, so that $m(f(E \cap E_0)) = 0$ for every measurable $E \subset D$. Since $D - E_0$ is open, it is therefore enough to prove the above result when $J(x) \ne 0$ on $D$.
Suppose then that $J(x) \ne 0$ on $D$, so that $f$ is locally a homeomorphism. The open set $D$ is a countable union of closed cubes, and, by the Heine-Borel theorem, we can cover each of these cubes with a finite number of closed cubes on each of which $f$ is a homeomorphism. Hence $D$ is a countable union of closed cubes $C_j$ on each of which $f$ is a homeomorphism, and since $f(E) = \bigcup_j f(E \cap C_j)$, it is enough to prove that $f(E \cap C)$ is measurable whenever $E$ is measurable and $C$ is a closed cube in $D$ on which $f$ is a homeomorphism.
If $E$ is closed, so are $E \cap C$ and $f(E \cap C)$, and hence if $E$ is a countable union of closed sets, then $f(E \cap C)$ is measurable. Since any measurable set is the union of a set of measure zero and a set which is a countable union of closed sets, it is now enough to prove that $f(E \cap C)$ is measurable when $E$ is of measure zero, and this follows immediately from 5. This completes the proof of 6, and hence also of Theorem 3.1.
References
1. T. Rado and P. V. Reichelderfer, Continuous transformations in analysis (Berlin, 1955).
2. G. de Rham, Variétés différentiables (Paris, 1955).
3. J. Schwartz, "The formula for change in variables in a multiple integral", Amer. Math. Monthly 61 (1954), 81-85.
4. A. C. Zaanen, An introduction to the theory of integration (Amsterdam, 1958).
B. Definition of the Degree of a $C^1$ Mapping in $R^n$
Notation: Until further comment, $D$ will denote an open bounded set of $R^n$, the Euclidean space whose coordinates are $x = (x^1, \dots, x^n)$. Let $\partial D$ denote the boundary of $D$. Most of the mappings appearing below are continuous on $\bar D$. We shall write $C$ for the space $C(\bar D, R^n)$ of continuous mappings defined on $\bar D$ and having values in $R^n$. By a $C^r$ function on $\bar D$ we mean a function having derivatives on a neighborhood of $\bar D$ up to order $r$ which coincide with restrictions of continuous functions on $D$. Cf. Chapter I for the topology of $C^r$.
Suppose that $\phi \in C$ is $C^1$ on $\bar D$ and that $p \in R^n$ is a point not belonging to $\phi(\partial D)$. We shall define the degree of $\phi$ with respect to $p$ and $D$; it will be an integer denoted by $\deg(p, \phi, D)$.
If $Z \subset \bar D$ is the set of critical points of $\phi$, i.e. points at which the Jacobian of $\phi$ vanishes, and $\phi^{-1}(p) \cap Z = \emptyset$, then the set $\phi^{-1}(p)$ is discrete, by the implicit function theorem; since $\bar D$ is compact, this set is finite.
At each $x \in \phi^{-1}(p)$, $J_\phi$ does not vanish. Then its sign is unambiguously defined and we define
$$\deg(p, \phi, D) = \sum_{x \in \phi^{-1}(p)} \operatorname{sign} J_\phi(x). \tag{11.1}$$
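As an illustration of this definition, the following minimal sketch (not from the original notes) evaluates the sum of Jacobian signs for two concrete planar maps on the open unit disc. The function names, the choice of maps and the point $p$ are illustrative assumptions, and the preimages are supplied by hand rather than computed.

```python
def sign(t):
    """Sign of a real number as +1, -1, or 0."""
    return (t > 0) - (t < 0)


def degree(jacobian, preimages):
    """Sum of sign J(x) over the finitely many points x of phi^{-1}(p)."""
    return sum(sign(jacobian(x, y)) for (x, y) in preimages)


def phi(x, y):
    """The map z -> z^2 written in real coordinates."""
    return (x * x - y * y, 2 * x * y)


def J_phi(x, y):
    """Jacobian determinant of phi: det [[2x, -2y], [2y, 2x]]."""
    return 4 * (x * x + y * y)


# p lies off phi(boundary of the disc), which is the unit circle.
p = (0.25, 0.0)
preimages = [(0.5, 0.0), (-0.5, 0.0)]          # the two square roots of p
assert all(phi(x, y) == p for (x, y) in preimages)

deg_square = degree(J_phi, preimages)

# The reflection (x, y) -> (x, -y) has Jacobian -1 and a single preimage of p.
deg_reflection = degree(lambda x, y: -1, [(0.25, 0.0)])
```

Here `deg_square` is 2 (both preimages are counted with sign $+1$), while `deg_reflection` is $-1$, reflecting the reversal of orientation.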
Suppose now that $\phi^{-1}(p) \cap Z \ne \emptyset$. By Sard's Lemma 3.1, $\phi(Z)$ has measure zero in $R^n$ and, in particular, has empty interior. This implies that the point $p$ may be approximated as closely as desired by points $q$ for which $\phi^{-1}(q) \cap Z = \emptyset$. For each such $q$, the degree is defined as above. Then, by definition, the degree of $p$ is
$$\deg(p, \phi, D) = \lim_{q \to p} \deg(q, \phi, D). \tag{11.2}$$
[…] is defined, the limit
$$\lim_{n \to \infty} \deg(p, \phi_n, D)$$
exists and does not depend on the approximating sequence $\{\phi_n\}$, and we then define
$$\deg(p, \phi, D) = \lim_{n \to \infty} \deg(p, \phi_n, D).$$
Justification of the definition. Let $d$ equal the distance between $p$ and the compact set $\phi(\partial D)$. Choose $N$ so large that if $n \ge N$, then $|\phi_n - \phi| < \tfrac12 d$ ($|\ |$ stands here for the uniform norm, i.e. convergence in the $C^0$ sense).
Since $p$ does not belong to any ball of radius $\tfrac12 d$ and center at $\phi(x)$, $x \in \partial D$, it follows that $p$ is not a convex combination of the form $t\phi_n(x) + (1 - t)\phi_m(x)$, $0 \le t \le 1$, $x \in \partial D$, $n, m > N$, because $\phi_n(x)$ and $\phi_m(x)$ belong to one such ball. But then we may fix $n, m > N$ and apply Corollary 3.13 to the family $t\phi_n + (1 - t)\phi_m$ […]
[…] $\varepsilon > 0$ choose $\psi$ to be $C^1$ and such that $|\phi(x) - \psi(x)| < \varepsilon$ for $x \in K$. $\psi(D)$ has measure zero for every $D \subset R^n$ by Sard's lemma, and so it is possible to pick a point $y_0$ such that $x \mapsto \psi(x) + y_0$ never assumes the value 0. Suppose then that $\psi$ itself is never vanishing.
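Let $c = \inf |\phi(x)|$, $x \in K$, and choose a continuous function $\eta$ defined for $t \ge 0$ with values in $R$ such that
$$\eta(t) = 1 \ \text{ if } t \ge \frac{c}{2}, \qquad \eta(t) = \frac{2t}{c} \ \text{ if } t \le \frac{c}{2}.$$
If we define $\Phi$ as the mapping
$$\Phi(x) = \frac{\psi(x)}{\eta(|\psi(x)|)},$$
then $|\Phi(x)| \ge c/2$ for all $x$ and $|\Phi(x) - \phi(x)| < \varepsilon$ on $K$. Suppose that $\varepsilon$ has been chosen so that $\varepsilon < c/2$.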
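By the Tietze extension theorem there exists a function $\delta : C \to R^m$ such that $\delta(x) = \Phi(x) - \phi(x)$ if $x \in K$, and $|\delta(x)| \le \varepsilon$ for $x \in C$. Define $\phi_1(x) = \Phi(x) - \delta(x)$, $x \in C$. Then
$$\phi_1(x) = \phi(x) \ \text{ if } x \in K,$$
$$|\phi_1(x)| = |\Phi(x) - \delta(x)| \ge |\Phi(x)| - |\delta(x)| \ge \frac{c}{2} - \varepsilon > 0,$$
and $\phi_1$ is a solution of our problem.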
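The cutoff used in this proof, dividing a nonvanishing vector by $\eta$ of its norm, pushes every value out to norm at least $c/2$ while leaving values of norm at least $c/2$ untouched. The following is a minimal numerical sketch of that step only; the names `eta` and `Phi`, the value of `c` and the sample vectors are illustrative assumptions, and `Phi` here acts directly on a value $\psi(x)$ rather than on $x$.

```python
import math


def eta(t, c):
    """eta(t) = 1 for t >= c/2 and 2t/c for t <= c/2 (continuous at t = c/2)."""
    return 1.0 if t >= c / 2 else 2.0 * t / c


def Phi(v, c):
    """Rescale a nonzero vector v to v / eta(|v|); requires v != 0."""
    r = math.hypot(*v)
    s = eta(r, c)
    return tuple(x / s for x in v)


c = 1.0
# Hypothetical sample values of psi(x), all nonzero.
samples = [(0.01, 0.0), (0.1, -0.2), (0.3, 0.4), (2.0, 1.0)]
norms = [math.hypot(*Phi(v, c)) for v in samples]
```

Every entry of `norms` is at least $c/2$ (up to rounding), and the last two samples, whose norms are already at least $c/2$, are returned unchanged.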
3.25. Lemma: Let $D \subset R^n$ be a symmetric open bounded set such that $0 \notin \bar D$. Let $\phi$ be a mapping of $\partial D$ in $R^m$, $m > n$, which is odd and nonvanishing. Then $\phi$ can be extended to $\bar D$ to be odd and nonvanishing.
Proof: We shall use induction on the dimension of $R^n$. For $n = 1$, $D$ looks like:
[Figure: the set $D$ in the case $n = 1$.]
By the previous lemma we can extend the function $\psi = \phi$ restricted to $[\varepsilon, \infty] \cap \partial D$ to a nonvanishing function $\tilde\psi$ defined on some interval $[\varepsilon, N]$. By symmetry, we may define a function extending $\phi$ and never vanishing.
Suppose now that the lemma is established for $n - 1 < n$. Let $x \in R^n$, $\tilde x \in R^{n-1}$ (suppose furthermore that $R^{n-1}$ has been identified with the hyperplane $x^1 = 0$ in $R^n$). Considering $R^{n-1} \cap D$, $\phi$ can be extended to $\partial D \cup (R^{n-1} \cap D)$ to be odd and nonvanishing (this is our inductive step): call the extension again $\phi$.
Now split $R^n$ into $R^{n-1}$, $R^n_+$, $R^n_-$, where $x^1 = 0$, $x^1 > 0$, $x^1 < 0$ respectively, and let $D_+ = D \cap R^n_+$, $D_- = D \cap R^n_-$. By the previous lemma $\phi$ has a further extension to $\partial D \cup (R^{n-1} \cap D) \cup D_+$, continuous and nonvanishing. Now, by symmetry, the final extension can be defined. Q.E.D.
3.26. Lemma: Let $D \subset R^n$ be a bounded open symmetric set such that $0 \notin \bar D$, and $\phi : \partial D \to R^n$ a continuous, odd and never vanishing mapping. Then $\phi$ can be extended to $\bar D$ to be continuous and odd, and furthermore, nonvanishing on $\bar D \cap R^{n-1}$ (again with the identification $R^{n-1} \subset R^n$).
It follows from the previous lemma applied to $\phi$ restricted to $\partial(D \cap R^{n-1}) = \partial D \cap R^{n-1}$ that we can obtain a never-vanishing extension to $D \cap R^{n-1}$. Such an extension can be extended at once to the desired map on $\bar D$ by symmetry. Q.E.D.
3.27. Lemma: If $D \subset R^n$ is a bounded open symmetric set and $0 \notin \bar D$, then for every $\phi : \partial D \to R^n$ continuous and odd such that $0 \notin \phi(\partial D)$, $\deg(0, \phi, D)$ is an even integer.
Proof: Extend $\phi$ to $\bar D$ so as to be a continuous odd mapping never vanishing on $\bar D \cap R^{n-1}$. The lemma above assures the existence of such an extension. Call the extended mapping also $\phi$. Approximate $\phi$ by a mapping $\psi$ of class $C^1$ and odd (replace, if necessary, an approximating $\psi$ by its odd part $\tfrac12[\psi(x) - \psi(-x)]$). If $\psi$ is close enough to $\phi$, it follows that
$$0 \notin \psi(\partial D), \qquad 0 \notin \psi(\bar D \cap R^{n-1}), \qquad \deg(0, \psi, D) = \deg(0, \phi, D).$$
We want to compute $\deg(0, \psi, D)$. Consider the sets $D_+ = R^n_+ \cap D$, $D_- = R^n_- \cap D$ (where $R^n_+ = \{x \mid x^1 > 0\}$, $R^n_- = \{x \mid x^1 < 0\}$). By construction $\psi$ never vanishes on $\bar D \cap R^{n-1}$, so we can avoid this set and obtain:
$$\deg(0, \psi, D) = \deg(0, \psi, D_+ \cup D_-) = \deg(0, \psi, D_+) + \deg(0, \psi, D_-). \tag{1}$$
Choose $p$ close to 0 and such that $p$ is not the image under $\psi$ of a critical point of $\psi$. Observe now that since $\psi$ is odd, each partial derivative $\partial\psi/\partial x^i$ is even. But then $J_\psi$ is also an even mapping. This implies in particular that $-p$ is not the image of a critical point either. Compute
$$\deg(0, \psi, D_+) = \sum_{\psi(y) = p,\ y \in D_+} \operatorname{sign} J_\psi(y),$$
$$\deg(0, \psi, D_-) = \sum_{\psi(z) = -p,\ z \in D_-} \operatorname{sign} J_\psi(z).$$
Since $\psi$ is odd, the set $\{z \mid \psi(z) = -p,\ z \in D_-\}$ can be obtained by taking the opposite of the elements in $\{y \mid \psi(y) = p,\ y \in D_+\}$. But $J_\psi(-z) = J_\psi(z)$ and we conclude that
$$\deg(0, \psi, D_+) = \deg(0, \psi, D_-).$$
Then (1) implies that $\deg(0, \phi, D) = \deg(0, \psi, D)$ is an even number. Q.E.D.
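The pairing argument in this proof can be seen in the simplest case $n = 1$. The sketch below is an illustrative assumption rather than an example from the notes: it takes $D = (-2, -1) \cup (1, 2)$, so that $0 \notin \bar D$, and the odd map $\psi(x) = x^3 - 2.25x$, for which 0 is already a regular value, so one may take $p = 0$.

```python
def sign(t):
    """Sign of a real number as +1, -1, or 0."""
    return (t > 0) - (t < 0)


def psi(x):
    """An odd C^1 map; its zeros are 0 and +-1.5."""
    return x ** 3 - 2.25 * x


def dpsi(x):
    """Derivative of psi, i.e. the one-dimensional Jacobian (an even function)."""
    return 3 * x ** 2 - 2.25


# psi does not vanish on the boundary {-2, -1, 1, 2} of D.
boundary_values = [psi(x) for x in (-2.0, -1.0, 1.0, 2.0)]

zeros_plus = [1.5]                        # zeros of psi in D_+ = (1, 2)
zeros_minus = [-z for z in zeros_plus]    # by oddness, zeros in D_- = (-2, -1)

deg_plus = sum(sign(dpsi(z)) for z in zeros_plus)
deg_minus = sum(sign(dpsi(z)) for z in zeros_minus)
deg_total = deg_plus + deg_minus
```

Because the derivative is even, each zero in $D_+$ and its opposite in $D_-$ contribute the same sign, so the total is twice the contribution of $D_+$.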
Now we are ready to prove Borsuk's theorem.
Consider a small open ball $U$ with center at 0, and a mapping $f : \bar D \to R^n$ such that
(a) $f$ is odd, (b) $f|_{\partial D} = \phi|_{\partial D}$, (c) $f|_U = $ identity.
The existence of such a function follows from the observation that if $g$ is an extension satisfying (b) and (c) (such an extension certainly exists) then $f = \tfrac12[g(x) - g(-x)]$ satisfies (a), (b) and (c). We know that $f|_{\partial D} = \phi|_{\partial D}$ implies $\deg(0, f, D) = \deg(0, \phi, D)$ (as follows from 3.16; 2).
But if $f = \mathrm{id}$ on $U$, it is clear that $f \ne 0$ on $\partial U$ and then
$$\deg(0, f, D) = \deg(0, f, U \cup (D - \bar U)) = \deg(0, f, U) + \deg(0, f, D - \bar U) = 1 + \deg(0, f, D - \bar U).$$
But the second term is known to be even by the last lemma. This proves that
$$\deg(0, \phi, D) = \deg(0, f, D)$$
is odd. Q.E.D.
We now draw some consequences from Borsuk's theorem.
3.28. Corollary: Let $D$ be as in Borsuk's theorem and $\psi : \partial D \to R^n$ a continuous odd mapping. Then there exists no homotopy $\psi_t$ of $\psi$ into a constant mapping such that $\psi_t(x) \ne 0$ for all $t$, $x \in \partial D$.
Proof: First extend $\psi$ to some $\phi$ defined on $\bar D$. Replacing $\phi$ by $\tfrac12(\phi(x) - \phi(-x))$ we may suppose that $\phi$ itself is odd. But then Borsuk's theorem implies that $\deg(0, \phi, D) \ne 0$ and the impossibility of the existence of the homotopy described is then apparent. Q.E.D.
3.29. Corollary: Let $D$ be as above. If $\psi : \partial D \to R^n$ is an odd continuous mapping whose image is contained in a subspace $E \ne R^n$, then $\psi$ assumes the value 0 at some point of $\partial D$.
Proof: Extend $\psi$ to a continuous odd mapping $\phi : \bar D \to E$. If $0 \notin \psi(\partial D)$, then by Borsuk's theorem, $\deg(0, \phi, D)$, being odd, is different from zero.
But this implies that $\deg(p, \phi, D)$ differs from zero on the component $\Delta$ of $R^n - \psi(\partial D)$ containing 0. Now (3.16; 4) implies that $\phi(\bar D)$ contains such a component, and, a fortiori, $E$ does also. But this is impossible, since $\Delta$ is open, nonvoid, and $E \ne R^n$. Q.E.D.
3.30. Corollary: Let $D$ be as above, and let $\psi$ be any continuous mapping $\psi : \partial D \to R^n$ whose image is contained in a subspace $E \ne R^n$. Then there exists $p \in \partial D$ such that $\psi(p) = \psi(-p)$.
Proof: Apply the corollary above to the mapping $\tfrac12(\psi(x) - \psi(-x))$. Q.E.D.
3.31. Corollary: Let $D$ be as above, and $\phi : \bar D \to R^n$ a continuous mapping never vanishing on $\partial D$, such that for every $x \in \partial D$,
$$\alpha \phi(x) \ne (1 - \alpha) \phi(-x) \tag{1}$$
for all $\alpha$, $\tfrac12 \le \alpha \le 1$. Then $\phi(D)$ contains a neighborhood of the origin.
Proof: Observe that the conclusion follows from the statement
$$\deg(0, \phi, D) \ne 0.$$
This property is an immediate consequence of the fact that if $\psi$ is the mapping $\psi(x) = \tfrac12(\phi(x) - \phi(-x))$, then by Borsuk's theorem, $\deg(0, \psi, D) \ne 0$. Under condition (1), the family
$$t\phi(x) - (1 - t)\phi(-x), \qquad \tfrac12 \le t \le 1,$$
is a homotopy between $\phi$ and $\psi$, which implies $\deg(0, \phi, D) = \deg(0, \psi, D)$. Q.E.D.
Degree Theory: General Case
H. Preliminaries: Degree Theory in an Arbitrary Finite Dimensional Vector Space
Suppose $E$ is a real vector space of dimension $n$. By choosing a basis in $E$ we can identify $E$ with $R^n$. This should allow us to define $\deg(p, \phi, D)$ as it was done for $R^n$, and of course the only important thing is to see what happens after a change of basis. The answer is that the degree is basis-independent. More precisely, given a basis $B = \{b_1, \dots, b_n\}$, we shall for the moment denote by $\deg_B(p, \phi, D)$ the degree computed with respect to $B$; then we have
3.32. Proposition: For every pair of bases $B$, $B'$,
$$\deg_B(p, \phi, D) = \deg_{B'}(p, \phi, D),$$
whenever the expressions make sense.
Proof: It suffices to prove this for $C^1$ mappings. But then we only need to know what happens to the sign of the Jacobian of a mapping when the basis is changed. This is easily seen to be invariant, whence the result. Q.E.D.
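The invariance invoked here comes down to the identity $\det(P^{-1} J P) = \det J$ for the Jacobian matrix $J$ expressed in a new basis with change-of-basis matrix $P$. The following minimal sketch checks it for $2 \times 2$ matrices with exact rational arithmetic; the helper names and the sample matrices are illustrative assumptions, not taken from the notes.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def inv2(A):
    """Inverse of a nonsingular 2x2 matrix, computed exactly."""
    d = Fraction(det2(A))
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]


J = [[1, 2], [3, -1]]          # a Jacobian matrix in the basis B (hypothetical)
P_pos = [[2, 1], [1, 3]]       # change of basis with det > 0
P_neg = [[0, 1], [1, 0]]       # change of basis with det < 0 (swaps the axes)

J_pos = matmul(inv2(P_pos), matmul(J, P_pos))
J_neg = matmul(inv2(P_neg), matmul(J, P_neg))
```

Both `det2(J_pos)` and `det2(J_neg)` equal `det2(J)`, so the sign of the Jacobian is unchanged even when the change of basis reverses orientation, because $P$ and $P^{-1}$ enter together.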
I. Preliminaries: Restriction to a Subspace
Suppose that $D \subset R^n$ is an open and bounded set, and that $R^m \subset R^n$, where the inclusion is made by identifying $R^m$ with the subspace of $R^n$ whose points are the $x$ such that $x^{m+1} = x^{m+2} = \dots = x^n = 0$.
3.33. Proposition: If $\phi : \bar D \to R^m$ is continuous and $\psi : \bar D \to R^n$ is the mapping $\psi = \mathrm{id} + \phi$, then for every $p \in R^m$ not belonging to $\psi(\partial D)$:
$$\deg(p, \psi, D) = \deg(p, \psi|_{R^m \cap \bar D}, D \cap R^m).$$
Proof: Let us begin by noting that $\psi(R^m \cap \bar D) \subset R^m$, as can be verified easily; thus the expression $\deg(p, \psi|_{R^m \cap \bar D}, R^m \cap D)$ makes sense. Suppose that $\phi$ is $C^1$. By definition it suffices to prove the statement for this case, and under the assumption that $p$ is not the image of a critical point
of $\psi$. As the degree is then computed by counting zeros, it is necessary to look for the points $y$ in $\psi^{-1}(p)$. If $\psi(y) = y + \phi(y) = p$, then $y = p - \phi(y) \in R^m$. Hence $\psi^{-1}(p) \subset R^m \cap D$.
This implies that the points to be counted for $\psi : \bar D \to R^n$ and for $F = \psi|_{R^m \cap \bar D}$, $F : R^m \cap \bar D \to R^m$, are the same, and the only possible difference in degree lies in the signs assigned to them. Our proposition will follow from the fact that at each such point $y$, we have
$$\operatorname{sign} J_\psi(y) = \operatorname{sign} J_F(y). \tag{1}$$
To prove (1), first observe that the Jacobian matrix of $\psi$ has the form
$$\begin{pmatrix} \delta_{ij} + \dfrac{\partial \phi_i}{\partial x_j} & * \\ 0 & I_{n-m} \end{pmatrix}$$
($i$, $j$ run from 1 to $m$). This implies immediately that $J_\psi(x) = J_F(x)$ for every $x \in R^m \cap D$, which clearly implies (1) above. Hence our proposition has been proved. Q.E.D.
J. Degree of Finite Dimensional Perturbations of the Identity
Let $X$ be a real Hausdorff locally convex T.L.S., and $D \subset X$ an open subset of $X$ such that $E \cap D$ is bounded for every finite dimensional subspace $E$ of $X$ (in that case we say that $D$ is "finitely bounded"). This is the most general case we shall consider. We now give some definitions.
3.34. Definition: If $T$ is a topological space and $\phi : T \to X$ is a continuous mapping, we shall say that $\phi$ is finite dimensional if $\phi(T)$ is contained in some finite dimensional subspace of $X$. If $T$ is also a subset of $X$, we define a finite dimensional perturbation of the identity to be a mapping $\psi : T \to X$ of the form $\psi = 1 + \phi$, where 1 is the identity $1 : T \to X$ and $\phi$ is finite dimensional. $\phi = \psi - 1$ is called the perturbation of $\psi$.
Our aim is to define the degree $\deg(p, \psi, D)$ for every finite dimensional perturbation of the identity $\psi = 1 + \phi$ (defined on $T = \bar D$, $D$ as above). Let
$p \in X$ be a point not in $\psi(\partial D)$ and choose a finite dimensional subspace $E \subset X$ such that
$$p \in E, \qquad \phi(\bar D) \subset E. \tag{a}$$
Letting $f$ denote the restriction $f = \psi|_{\bar D \cap E} : \bar D \cap E \to E$, we have
3.35. Definition:
$$\deg(p, \psi, D) = \deg(p, f, D \cap E),$$
where the second member is computed in $E$ according to the theory for finite dimensional spaces.
We must justify 3.35. Suppose $F$ is a subspace of $X$ satisfying properties (a) above and that $F \subset E$. Then Proposition 3.33 applies and we conclude
$$\deg(p, f|_{F \cap \bar D}, D \cap F) = \deg(p, f, D \cap E).$$
If $F$ satisfies (a) but $F$ and $E$ are not nested, we reduce to that case by considering $E \subset E + F$ and $F \subset E + F$ separately. Thus 3.35 is justified.
3.36. Remark: We know that in the finite dimensional case the degree $\deg(p, \phi, D)$ depends only on the restriction of $\phi$ to $\partial D$ (see 3.16; 2). Let us suppose that we have a finite dimensional mapping defined only on $\partial D$, $\tilde\phi : \partial D \to X$. The degree of all finite dimensional perturbations of the identity $1 + \phi$ by means of a finite dimensional extension $\phi$ of $\tilde\phi$ defined on all of $\bar D$ will be the same, as follows from Definition 3.35 and the finite dimensional theory. But we cannot assure the existence of such extensions unless we assume $X$ to have additional properties (for instance, to be normal). Nevertheless a notion of degree of $1 + \tilde\phi$ may be defined. Suppose $p \notin (1 + \tilde\phi)(\partial D)$. Choose a finite dimensional subspace $E$ containing both $p$ and $\tilde\phi(\partial D)$. $E$ is finite dimensional and so there are extensions $\phi$ of $\tilde\phi|_{\partial D \cap E}$ to all of $\bar D \cap E$. Thus it is possible to define $\deg(p, \tilde\phi + 1, D)$ by
$$\deg(p, \tilde\phi + 1, D) = \deg(p, \phi + 1, D \cap E),$$
where the second member is computed in $E$. Hence whenever we have a finite dimensional mapping $\tilde\phi : \partial D \to X$ we can define the degree $\deg(p, \tilde\phi + 1, D)$, and this coincides with the degree of every finite dimensional perturbation of 1 by means of an extension of $\tilde\phi$ to all of $\bar D$.
K. Properties
We have shown that the definition of degree can be generalized to obtain a notion of degree for finite dimensional perturbations of the identity with respect to domains "finitely bounded" in an arbitrary locally convex T.L.S. Now we shall list the properties of degree that remain valid in this situation.
From Definition 3.35 and 3.16; 4, 5, 6 and 7 we obtain:
3.37. Proposition: For every finite dimensional perturbation of the identity $\psi = 1 + \phi : \bar D \to X$, and every $p \notin \psi(\partial D)$, the following results hold:
1. If $p \notin \psi(\bar D)$, then $\deg(p, \psi, D) = 0$.
2. If $p$ and $q$ belong to the same component of $X - \psi(\partial D)$, then $\deg(p, \psi, D) = \deg(q, \psi, D)$.
3. If $D = \bigcup D_i$, where the family $\{D_i\}$ is disjoint and $\partial D_i \subset \partial D$, then
$$\deg(p, \psi, D) = \sum_i \deg(p, \psi, D_i).$$
4. If $K \subset \bar D$, $K$ is closed, $p \notin \psi(K)$, then $\deg(p, \psi, D) = \deg(p, \psi, D - K)$.
5. If $f : \bar D_1 \to X_1$ is a mapping satisfying the same conditions as $\phi$, then
$$\deg((p \times q), 1 + (\phi \times f), D \times D_1) = \deg(p, 1 + \phi, D) \, \deg(q, 1 + f, D_1)$$
whenever these expressions make sense.
L. Limits
The family of finite dimensional mappings is closed under addition and product by scalars. Hence to proceed we consider limits of such mappings. From this point of view the important thing is that the compact mappings (definition below) are such limits, and so we will be able to define degrees of compact perturbations of the identity.
Let us recall and introduce some notations: $D$ is an open set in $X$ such that $D \cap E$ is bounded for every finite dimensional subspace $E \subset X$. $C(\bar D)$ will denote the (linear) space of all continuous mappings $\phi : \bar D \to X$; likewise, $C(\partial D)$ will be the space of continuous mappings $\phi : \partial D \to X$. There exists a natural mapping $Q : C(\bar D) \to C(\partial D)$ defined by restriction, $Q(\phi) = \phi|_{\partial D}$.
Similarly, we denote by $\mathcal{F}(\bar D)$ and $\mathcal{F}(\partial D)$ the subspaces of $C(\bar D)$ and $C(\partial D)$ whose elements are the finite dimensional mappings. Of course, $Q : \mathcal{F}(\bar D) \to \mathcal{F}(\partial D)$. Let us give $C(\bar D)$ and $C(\partial D)$ the topologies of uniform convergence. The open sets of, say, $C(\bar D)$, are those defined by
$$\{\phi \mid \phi(\bar D) \subset G\},$$
where $G$ runs over all the open sets of $X$.
Warning: These topologies are not linear space topologies, but merely group topologies (i.e., the mapping $(\phi, \psi) \to \phi + \psi$ is continuous, while the mapping $(\lambda, \phi) \to \lambda\phi$, $\lambda \in R$, $\phi \in C(\bar D)$, is not necessarily continuous).
By 3.36, for every $\phi \in \mathcal{F}(\partial D)$ and every $p \notin (1 + \phi)(\partial D)$, the degree $\deg(p, 1 + \phi, D)$ is defined, and for every $g \in \mathcal{F}(\bar D)$ such that $Qg = \phi$,
$$\deg(p, 1 + g, D) = \deg(p, 1 + \phi, D).$$
3.38. Lemma: Let $\phi \in C(\partial D)$, let $p$ be a point of $X$ and let $V$ be a convex symmetric neighborhood of $0 \in X$ such that:
$$(p + V) \cap (1 + \phi)(\partial D) = \emptyset. \tag{a}$$
Then, if $f \in \mathcal{F}(\partial D)$ satisfies $f(x) - \phi(x) \in V$ for every $x \in \partial D$, the degrees $\deg(p, 1 + f, D)$ are defined; moreover, these degrees are equal for all such $f$.
Proof: Suppose $p = x + f(x)$, $x \in \partial D$. Then $x + \phi(x) = p + (\phi(x) - f(x)) \in p + V$, which contradicts (a). Hence $p \notin (1 + f)(\partial D)$ and the degree is defined.
Suppose that $g \in \mathcal{F}(\partial D)$ also satisfies $g(x) - \phi(x) \in V$ for every $x \in \partial D$, and call $F = 1 + f$, $G = 1 + g$, $\psi = 1 + \phi$.
For every $x \in \partial D$, $F(x)$ and $G(x)$ both belong to $\psi(x) + V$, and this set is convex. Hence any convex combination $(1 - t)G(x) + tF(x)$, $0 \le t \le 1$, also belongs to $\psi(x) + V$. Using (a) and the symmetry of $V$, this implies that for every $t$, $0 \le t \le 1$, and every $x \in \partial D$,
$$p \ne (1 - t)G(x) + tF(x).$$
Let $E$ be a finite dimensional subspace of $X$ such that
$$p \in E, \qquad g(\partial D) \subset E, \qquad f(\partial D) \subset E.$$
Considering the domain $D \cap E$ and the homotopy $(1 - t)G + tF$ between $G$ and $F$, we conclude by 3.17 that $\deg(p, G, D \cap E) = \deg(p, F, D \cap E)$, or, according to Definition 3.35, $\deg(p, G, D) = \deg(p, F, D)$ as desired. Q.E.D.
3.39. Proposition: Let $\phi \in C(\partial D)$ and $p$ a point of $X$ not in the closure of $(1 + \phi)(\partial D)$. Then there exists a neighborhood $U$ of $\phi$ in $C(\partial D)$ such that the degree
$$\deg(p, 1 + f, D)$$
takes on but a single value for all $f$ belonging to $U \cap \mathcal{F}(\partial D)$.
Proof: Follows from the lemma above. Q.E.D.
This proposition will permit us to define the degree for perturbations of the identity by limits of finite dimensional mappings.
Call $\mathcal{L}(\partial D)$, $\mathcal{L}(\bar D)$ the closure of $\mathcal{F}(\partial D)$ (respectively $\mathcal{F}(\bar D)$) in $C(\partial D)$ (respectively $C(\bar D)$). Plainly $Q : \mathcal{L}(\bar D) \to \mathcal{L}(\partial D)$.
3.40. Definition: Let $\phi = \psi - 1$ be a mapping in $\mathcal{L}(\partial D)$. Let $p$ be a point of $X$ not in the closure of $\psi(\partial D)$. The common value of the degrees $\deg(p, 1 + f, D)$ when $f \in \mathcal{F}(\partial D)$ is near $\phi$ is defined to be $\deg(p, \psi, D)$. If $\phi = \psi - 1$ is a mapping defined on all of $\bar D$ and such that $Q(\phi) \in \mathcal{L}(\partial D)$, we shall write simply $\deg(p, \psi, D)$ instead of $\deg(p, Q\psi, D)$. In particular, for every $\phi \in \mathcal{L}(\bar D)$, the degree is defined.
From 3.37 the next proposition follows easily.
3.41. Proposition: If $\phi = \psi - 1 \in \mathcal{L}(\bar D)$ and $p$ and $q$ do not belong to the closure of $\psi(\partial D)$, then:
1. If $p \notin \psi(\bar D)$, then $\deg(p, \psi, D) = 0$.
2. If $p$ and $q$ belong to the same component of $X - \psi(\partial D)$, then $\deg(p, \psi, D) = \deg(q, \psi, D)$.
3. If $D = \bigcup D_i$, where the family $\{D_i\}$ is disjoint and $\partial D_i \subset \partial D$, then
$$\deg(p, \psi, D) = \sum_i \deg(p, \psi, D_i).$$
4. If $K \subset \bar D$, $K$ is closed and $p \notin \psi(K)$, then
$$\deg(p, \psi, D) = \deg(p, \psi, D - K).$$
5. If $f : \bar D_1 \to X_1$ is a mapping of $\mathcal{L}(\bar D_1)$, then
$$\deg((p, p_1), 1 + (\phi, f), D \times D_1) = \deg(p, 1 + \phi, D) \, \deg(p_1, 1 + f, D_1).$$
Proof: Left to the reader. (Hint: Approximate and use Proposition 3.16.)
In the same way it is possible to prove the following generalization of 3.33:
3.42. Proposition: Let $Y$ be a closed subspace of $X$ and $p$ be a point of $Y$ not in $\psi(\partial D)$, where $\psi - 1 \in \mathcal{L}(\partial D)$. Then
$$\deg(p, \psi, D) = \deg(p, \psi|_{\partial D \cap Y}, D \cap Y).$$
Finally, we have a generalization of Borsuk's theorem:
3.43. Proposition: If $D \subset X$ is an open set which is finitely bounded, symmetric, and contains the origin, then for every odd $\psi$ such that $\psi - 1 \in \mathcal{L}(\partial D)$ and $0 \notin \psi(\partial D)$, $\deg(0, \psi, D)$ is an odd number.
The proof follows at once from Definition 3.40 and the Borsuk theorem 3.23.
3.44. Corollary: If $D$ is a domain as in the proposition above, if $\psi - 1$ is odd and belongs to $\mathcal{L}(\bar D)$, and if $0 \notin \psi(\partial D)$, then $\psi(D)$ covers a neighborhood of 0.
M. Compact Perturbations
Here we shall show that the compact mappings are in $\mathcal{L}(\partial D)$ and $\mathcal{L}(\bar D)$ and obtain some additional properties of the degree of compact perturbations of the identity. We begin with some purely topological results.
Let $X$ be a locally convex T.L.S., $T$ a topological space. Let $C(T)$ denote the set of continuous mappings $\phi : T \to X$. $C(T)$ has a natural structure as a topological space (see the beginning of section L). Consider the subspace $\mathcal{F}(T)$ of $C(T)$ whose elements are the mappings $\phi$ such that $\phi(T)$ is contained in a finite dimensional subspace of $X$, and the subspace $K(T)$ of the mappings $\phi$ for which the set $\phi(T)$ is precompact.
3.45. Proposition:
$$K(T) \subset \overline{\mathcal{F}(T) \cap K(T)}.$$
Proof: Choose an open, convex, symmetric neighborhood $V$ of 0 in $X$, and let $\phi \in K(T)$. Suppose the points $y_1, \dots, y_n \in X$ have the property
$$\phi(T) \subset \bigcup_{i=1}^{n} \{y_i + V\}.$$
Letting $\varrho$ be the gauge induced by $V$,
$$\varrho(x) = \inf_{x \in \lambda V} |\lambda|, \qquad x \in X,$$
we define mappings $\mu_i : T \to R$ by
$$\mu_i(x) = \max(0, 1 - \varrho(\phi(x) - y_i)).$$
Each $\mu_i$ is continuous (since $\varrho$ is continuous). Since $\phi(T) \subset \bigcup \{y_i + V\}$, for each $x$ there exists at least one $\mu_i(x)$ different from zero. Thus the function $\mu(x) = \sum_i \mu_i(x)$ never vanishes on $T$, and we can define
$$\lambda_i(x) = \frac{\mu_i(x)}{\mu(x)}.$$
These mappings satisfy $1 \ge \lambda_i \ge 0$, $\sum_i \lambda_i(x) = 1$. Now define
$$\phi_1(x) = \sum_i \lambda_i(x) \, y_i.$$
This mapping belongs to $\mathcal{F}(T) \cap K(T)$ and
$$\phi(x) - \phi_1(x) = \sum_i \lambda_i(x) (\phi(x) - y_i).$$
We see that if $\phi(x) - y_i \notin V$, then $\varrho(\phi(x) - y_i) \ge 1$, and consequently $\mu_i(x) = 0$, which implies $\lambda_i(x) = 0$. This means that $\phi(x) - \phi_1(x)$ is a convex combination of elements of $V$, which belongs to $V$ since $V$ is convex. Thus $\phi$ is a limit point of $\mathcal{F}(T) \cap K(T)$, as desired. Q.E.D.
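The mechanics of this partition-of-unity construction can be tried out numerically. The sketch below makes several simplifying assumptions that are not in the text: $X = R^2$ with the Euclidean norm, $V$ the open ball of radius `eps` (so the gauge is the norm divided by `eps`), $T = [0, 1]$, and $\phi$ a parametrization of the unit circle; all names are illustrative. Since $R^2$ is itself finite dimensional, the example only illustrates why $\phi - \phi_1$ takes values in $V$, not the reduction of dimension.

```python
import math


def gauge(v, eps):
    """Gauge of V = open Euclidean ball of radius eps."""
    return math.hypot(v[0], v[1]) / eps


def finite_dim_approx(value, centers, eps):
    """Return sum_i lambda_i * y_i for the point value = phi(x)."""
    mus = [max(0.0, 1.0 - gauge((value[0] - y[0], value[1] - y[1]), eps))
           for y in centers]
    mu = sum(mus)                      # positive: value lies in some y_i + V
    lams = [m / mu for m in mus]       # lambda_i >= 0 and they sum to 1
    return (sum(l * y[0] for l, y in zip(lams, centers)),
            sum(l * y[1] for l, y in zip(lams, centers)))


def phi(t):
    """A continuous map [0, 1] -> R^2 with compact image (the unit circle)."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))


eps = 0.3
centers = [phi(i / 16) for i in range(16)]   # the balls y_i + V cover phi(T)

max_err = 0.0
for k in range(1000):
    v = phi(k / 1000)
    w = finite_dim_approx(v, centers, eps)
    max_err = max(max_err, math.hypot(v[0] - w[0], v[1] - w[1]))
```

On the sampled points the error `max_err` stays below `eps`, as the convexity argument predicts: only centers within distance `eps` of $\phi(x)$ receive positive weight.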
Using this proposition we may return to our initial situation $D \subset X$, $D$ an open finitely bounded set. We shall say that a mapping $\phi \in C(\bar D)$ (or $\phi \in C(\partial D)$) is compact iff $\phi(\bar D)$ is a compact set (respectively: $\phi(\partial D)$). (Not to be confused with the mappings of Definition 1.38.)
3.46. Proposition: Any compact mapping $\phi \in C(\bar D)$ (respectively $\phi \in C(\partial D)$) belongs to $\mathcal{L}(\bar D)$ (respectively to $\mathcal{L}(\partial D)$).
Proof: Immediate from 3.45. Q.E.D.
The perturbations of the identity by elements of $\mathcal{L}(\bar D)$ do not behave "nicely" topologically and in the preceding it was necessary to consider such
artificial sets as the closure of $(1 + \phi)(\partial D)$. The compact ones however look like finite dimensional mappings, and we have
3.47. Proposition: If $D$ is any domain $D \subset X$, and $\phi \in C(\partial D)$ (respectively $\phi \in C(\bar D)$) is compact, then $\psi = 1 + \phi$ is proper and closed.
Proof: To say that $\psi$ is proper means that the inverse image of a compact set is also compact. Suppose $K \subset X$ is compact, and let $A = \psi^{-1}(K)$. Suppose $\{x_\alpha\}$ is an indexed family in $A$. Then $\{x_\alpha + \phi(x_\alpha)\}$, being contained in $K$, has a convergent subfamily $x_\beta + \phi(x_\beta) \to y$. But $\phi$ being a compact mapping, there exists a third subfamily such that $\phi(x_\gamma) \to z$. This implies that $x_\gamma \to y - z$, and so $A$ is compact.
Suppose now that $F \subset \bar D$ is closed and that $x_\alpha + \phi(x_\alpha) \to z$, $x_\alpha \in F$. $\phi$ being compact, there exists $\{x_\beta\}$ such that $\phi(x_\beta) \to y$. Then $x_\beta \to z - y$; by continuity, $z - y + \phi(z - y) = z$. $F$ is closed, so $z - y = \lim x_\beta$ belongs to $F$, which implies that $z \in (1 + \phi)(F)$. This means that $(1 + \phi)(F)$ is closed as desired. Q.E.D.
3.48. Corollary: $\psi(\partial D)$ is closed.
This property makes the statement "$p$ is not in the closure of $\psi(\partial D)$", in most of the statements above, equivalent to "$p$ is not in $\psi(\partial D)$", the same statement which appears in the finite dimensional case.
We leave to the reader the work (and the delight thereby engendered) of rewriting Propositions 3.41 and 3.43 for the case $\psi - 1$ compact, with the assumption $p \notin \psi(\partial D)$.
3.49. Corollary: Suppose $D$ is a domain as in 3.43 and $\psi \in C(\bar D)$ a map such that $\psi - 1$ is compact. If $\psi$ maps $\bar D$ into a proper linear subspace of $X$, then $\psi(x) = \psi(-x)$ for some $x \in \partial D$.
Proof: Consider the map $\tilde\psi(x) = \psi(x) - \psi(-x)$. If $\tilde\psi(x) \ne 0$ when $x \in \partial D$, then by 3.44 $\tilde\psi(D)$ would cover a neighborhood of 0. But $\tilde\psi(D)$ is contained in every subspace containing $\psi(D)$; by hypothesis there is a proper one, without interior points. Q.E.D.
3.50. Corollary: Let $D$ be as above and $\psi \in C(\bar D)$ such that $\psi - 1$ is compact. If $\psi(x)$ is never in the positive direction of $\psi(-x)$, $x \in \partial D$, then $\psi(x) = 0$ for some $x \in D$.
We shall define a notion of homotopy for compact mappings:
3.51. Definition: Two compact mappings $\phi_0, \phi_1 \in C(T)$ ($T$ is any topological space) are said to be compact homotopic if there exists a compact mapping $F : I \times T \to X$, where $I = [0, 1]$, such that $F(0, x) = \phi_0(x)$, $F(1, x) = \phi_1(x)$.
3.52. Proposition: Let $D$ be as above, and $\phi_0 = \psi_0 - 1$, $\phi_1 = \psi_1 - 1$ two compact mappings $\phi_i \in C(\partial D)$. If $\phi_0$ and $\phi_1$ are compact homotopic under $F(t, x) = \phi_t(x) = \psi_t(x) - x$ and $p$ is a point in $X$ such that $p \ne \psi_t(x)$ for every $t$ and every $x \in \partial D$, then
$$\deg(p, \psi_0, D) = \deg(p, \psi_1, D).$$
Proof: If $F$ is compact, it may be approximated by finite dimensional mappings. The restrictions of such mappings also provide close approximations of $\phi_0$ and $\phi_1$, and then the proposition follows from the finite dimensional case. Q.E.D.
N. Multiplicative Property and Generalized Jordan's Theorem for Banach Spaces
$X$ is now a Banach space. Let $D \subset X$ be a bounded domain and $\psi : \bar D \to X$, where $\psi - 1$ is compact.
3.53. Lemma: $\psi(\bar D)$ is bounded.
Of course $\psi(\bar D) \subset \bar D + \phi(\bar D)$, and $\phi(\bar D)$ being compact, both $\bar D$ and $\phi(\bar D)$ are bounded. Then $\psi(\bar D)$ is bounded. Q.E.D.
Since $\psi(\partial D)$ is closed (see 3.48), the set $\Delta = X - \psi(\partial D)$ is open and therefore has the form
$$\Delta = \bigcup_i \Delta_i,$$
where the $\Delta_i$ are the components of $\Delta$. Among these there is one and only one unbounded component, $\Delta_\infty$, because $\psi(\partial D)$ is bounded. Let
$$G = \bigcup_{i \ne \infty} \Delta_i,$$
and suppose furthermore that $g : \bar G \to X$ is a mapping such that $g - 1$ is compact.
3.54. Theorem: (multiplicative property) Under the hypotheses above, and if $p \notin g\psi(\partial D)$, then:
$$\deg(p, g\psi, D) = \sum_{i \ne \infty} \deg(p, g, \Delta_i)\, \deg(\Delta_i, \psi, D). \tag{I}$$
Remark: We have used the notation $\deg(\Delta_i, \psi, D)$ as in section F (cf. 3.20): the justification for this comes from 3.41; 2.
Proof: First of all, as $g$ is proper (3.47), it follows that $K = g^{-1}(p) \subset G$ is compact. Therefore from the covering $K \subset \bigcup_{i \ne \infty} \Delta_i$ we can select a finite family satisfying $K \subset \bigcup \Delta_i$. Thus in the expression (I) all but a finite number of terms vanish, so that the sum is meaningful. Moreover, if $g - I$ is approximated very closely (uniformly over $G$) by a finite dimensional mapping $g' - I$,
$$\deg(p, g, \Delta_i) = \deg(p, g', \Delta_i), \qquad i \ne \infty,$$
as follows from the Definition 3.40. Hence we can prove (I) assuming $g - I$ itself finite dimensional, and the general case will follow immediately.
Observe that if $\psi - I$ is also finite dimensional we are done, because we then have just the finite dimensional result proved in 3.20. Thus the only thing to be proved is that $\phi$ may be approximated by finite dimensional mappings, for which (I) is already known. When $\psi - I$ is approximated closely, the composition $g\psi$ is also approximated, and the left member of (I) remains unchanged:
$$\deg(p, g\psi, D) = \deg(p, g\psi', D),$$
where $\psi'$ is the mapping corresponding to a suitable approximation to $\phi = \psi - I$.
Of course the terms $\deg(q, \psi, D)$ don't change either after the substitution of $\psi'$ for $\psi$. The only difficulty arises when we consider the sets $\Delta_i$, which obviously do change. But $K = g^{-1}(p)$ is compact, so the new sets $\Delta_i'$ will differ from the old ones by some (closed) sets, disjoint from $K$. By 3.41; 4 applied to these closed sets, the desired equality follows. Q.E.D.
Suppose again that $D$ is open and bounded and that $\psi : \bar D \to X$ with $\phi = \psi - I$ compact.
3.55. Lemma: If $\psi : \bar D \to X$ is one-to-one, then $\psi^{-1} - I : \psi(\bar D) \to X$ is compact.
Proof: It is easy to see that $\psi^{-1} - I = -\phi \circ \psi^{-1}$, which yields the lemma. Q.E.D.
3.56. Lemma: $\psi$ can be extended to $\Psi : X \to X$ in such a way that $\Psi - I$ is still compact.
The proof will follow immediately from Proposition 3.58 below.
3.57. Theorem: (generalized Jordan's theorem) If $D$ and $D^*$ are bounded open sets in a Banach space $X$ and there exists a homeomorphism $\psi : D \to D^*$ such that $(\psi - I)(D)$ is compact, then the number of components of $X - D$ and $X - D^*$ is the same.
Proof: By Lemma 3.55, the inverse mapping $\psi^{-1}$ is also of the form (identity) + (compact); thus the hypotheses are symmetric. But by Lemma 3.56 it is also possible to assume that $\psi$ and $\psi^{-1}$ are restrictions of globally defined compact perturbations of the identity. The proof now is the same as that of 3.21, except for the fact that the appeals to 3.20 are replaced by references to 3.54. Q.E.D.
We now give a proof of Lemma 3.56. This lemma is an immediate consequence of the following generalization of the Tietze theorem due to J. Dugundji (An extension of Tietze's theorem, Pacific J. Math., Vol. 1, pp. 353-367 (1951)).
3.58. Proposition: Let $A$ be a closed subset of a metric space $X$, and $C$ a convex set in a locally convex T.L.S. $E$ over the real or the complex field. Then any continuous $f : A \to C$ has a continuous extension $F : X \to C$.
Proof: For each $x \in X - A$ choose an open $V_x$ containing $x$ such that $\operatorname{diam} V_x \le \varrho(V_x, A)$. Then $\{V_x\}$ is an open covering of $X - A$ and since $X - A$ is paracompact there exists a locally finite refinement $\{U\}$, i.e., the $U$'s are open and cover $X - A$, each $U \subset$ some $V_x$, and for each $x \in X - A$ there exists an open $O_x$ containing $x$ and disjoint from all but a finite number of the $U$'s. Let $U_0 \in \{U\}$ and define for $x \in X - A$
$$\lambda_{U_0}(x) = \varrho(x, X - U_0) \Big/ \sum_U \varrho(x, X - U).$$
Since $\varrho(x, X - U) > 0$ if $x \in U$, and since each $x \in X - A$ is contained in some $U$, we have $0 \le \lambda_{U_0}(x) \le 1$. For any $x \in X - A$, $\lambda_{U_0} | O_x$ has the form
$$\varrho(x, X - U_0) \Big/ \sum_{\text{finite no. of } U\text{'s}} \varrho(x, X - U),$$
and since each $\varrho(x, X - U)$ is continuous (because $|\varrho(x, X - U) - \varrho(y, X - U)| \le \varrho(x, y)$), $\lambda_{U_0} | O_x$ is continuous. Therefore $\lambda_{U_0}$ is continuous on $X - A$ and $\lambda_{U_0}(x) = 0$ iff $x \notin U_0$.
Now for each $U$ choose $a_U \in A$ such that $\varrho(a_U, U) < 2\varrho(A, U)$, and let the extension $F$ be given by
$$F(x) = \sum_U \lambda_U(x) f(a_U) \quad \text{for } x \in X - A,$$
and
$$F(x) = f(x) \quad \text{for } x \in A.$$
For each $x \in X - A$, $\lambda_U(x) = 0$ except for finitely many $U$'s, and since $\sum_U \lambda_U(x) = 1$ and $f(a_U) \in C$ it follows that $F(x) \in C$. If $x \in X - A$, $F | O_x$ is a finite sum of continuous functions and hence $F$ is continuous on $X - A$. Since $F$ is continuous on the interior of $A$ by assumption, it only remains to show the continuity of $F$ on the boundary of $A$. Let $x_0 \in$ boundary $A$, and let $W \subset E$ be any convex open set containing the origin. Since $f$ is continuous on $A$ there exists an $\alpha > 0$ such that if $a \in A$ and $\varrho(x_0, a) < \alpha$ then $f(a) - f(x_0) \in W$. Let $O = \{x \in X : \varrho(x, x_0) < \alpha/6\}$. We will show that if $x \in O$, then $F(x) - F(x_0) \in W$. Assume $x \in X - A$, $\varrho(x, x_0) < \alpha/6$ and $\varrho(x, a_U) < \alpha/2$. Then $\varrho(x_0, a_U) \le \varrho(x_0, x) + \varrho(x, a_U) < \alpha/6 + \alpha/2$ ...

... $> 0$ such that $\sigma$ is defined on $U \times (-\varepsilon, +\varepsilon)$. Let $t_0$ be a real number such that $t_+(p) - \tfrac12\varepsilon < t_0 < t_+(p)$ and such that $\sigma(t_0, p) \in U$. Such a point exists. Then define a curve $\rho$ by $\rho(t) = \sigma(t, p)$ if $t_- < t < t_+$, $\rho(t) = \sigma(t - t_0, \sigma(t_0, p))$ if $t_+ \le t < t_0 + \varepsilon$. Clearly (from 4.42) $\rho(t)$ is well defined for $t_- < t < t_+ + \tfrac12\varepsilon$, is smooth and satisfies the differential equation. This contradicts the maximality of $\sigma(t, p)$ as it is described in 4.41(d).
J. Submanifolds
4.47. Definition: Let $M$ be a smooth $E$-manifold. A subspace $N \subset M$ is called a regularly imbedded submanifold if there exist a covering of $M$ by domains of charts ($U_i$ = domain of $\phi_i$) and a closed linear subspace $F$ of $E$ such that, for every $i$, $\phi_i(N \cap U_i) = \phi_i(U_i) \cap F$.
The covering $\{U_i \cap N\}$ and the restrictions $\phi_i | U_i \cap N$ provide a smooth $F$-structure for $N$. It is easy to see that this structure does not depend on the particular choice of the original covering $\{U_i\}$ of $M$. The following statement can be easily proved.
4.48. Proposition: Every regularly imbedded submanifold of $M$ is a closed subset of $M$.
Examples
1. $M$ and every point in $M$ are regularly imbedded submanifolds.
2. If $M$ is an open set of the Banach space $E$, then for every closed linear subspace $F$ of $E$ the set $F \cap M$ is a regularly imbedded submanifold of $M$.
The following proposition provides less trivial examples.
4.49. Proposition: Let $M$ be a smooth $E$-manifold and $f : M \to \mathbb{R}$ a smooth mapping. If $c \in \mathbb{R}$ is not a critical level for $f$, the set $N = f^{-1}(\{c\})$ is a regularly imbedded submanifold of $M$ (modelled on a hyperplane of $E$).
The reader will see that the lemma below implies the above proposition.
4.50. Lemma: If $x \in M$, $f$ is defined near $x$, smooth, and nonhorizontal at $x$, then there exists a chart $y \to \psi(y)$ around $x$ such that $f(y) = x'(\psi(y)) + c$, where $x' \in E'$.
Proof: Considering $f - f(x)$ we may suppose that $f(x) = 0$. Choose a chart $\phi$ around $x$ such that $\phi(x) = 0$. We know that $x' = (f\phi^{-1})'(0)$ does not vanish. Let $e \in E$ be a vector such that $x'(e) = 1$, and let $F$ be the kernel of $x'$. Clearly $E = F \oplus \mathbb{R}e$. Define a mapping $\theta$ by:
$$\theta(y \oplus te) = y \oplus [f\phi^{-1}(y \oplus te)]\,e \qquad (y \in F,\ y \text{ and } t \text{ small}).$$
The derivative at the origin of $\theta$ is
$$\theta'(0)(y \oplus te) = y \oplus x'(y \oplus te)\,e = y \oplus te,$$
or $\theta'(0)$ = identity. Therefore near the origin $\theta$ is a smooth mapping with smooth inverse. Consider the chart $\psi = \theta\phi$. We have
$$f\psi^{-1}(y \oplus te) = f\phi^{-1}\theta^{-1}(y \oplus te) = f\phi^{-1}(y \oplus ue), \tag{1}$$
where $\theta(y \oplus ue) = y \oplus te$. But by the definition of $\theta$, this last equality implies that $t = f\phi^{-1}(y \oplus ue)$. From (1) we conclude that $f\psi^{-1}(y \oplus te) = t = x'(y \oplus te)$. Q.E.D.
4.51. More Examples
(a) Let $\mathcal{H}$ be a Hilbert space, $f : \mathcal{H} \to \mathbb{R}$ the mapping $f(x) = (x, x)$. $f$ is a quadratic form. Its derivative is $f'(x)y = 2(x, y)$, and is horizontal only at $x = 0$. Therefore the unit sphere $S = \{x \mid f(x) = 1\}$ is a regularly imbedded submanifold of $\mathcal{H}$ (modelled on any hyperplane).
(b) Let $(\Omega, \Gamma, \mu)$ be a measure space, and consider the space $X = L^p(\Omega, \Gamma, \mu)$, $p > 1$. The mapping $f : X \to \mathbb{R}$ defined by
$$f(x) = \int |x(s)|^p\, \mu(ds)$$
has as many continuous derivatives as the integer part $n$ of $p$ ("greatest integer less than or equal to $p$"). Considering $X$ as a manifold of class $C^n$, a form of Proposition 4.49 applies and we conclude that the unit sphere $S = \{x \mid |x| = 1\} = \{x \mid f(x) = 1\}$ is a regularly imbedded submanifold of $X$, of class $C^n$.
(c) Consider again the case of a Hilbert space $\mathcal{H}$, and let $S$ be the unit sphere of the Hilbert space $\mathcal{H} \oplus \mathbb{R}$. According to example (a), $S$ is a smooth manifold modelled on any hyperplane of $\mathcal{H} \oplus \mathbb{R}$, in particular on $\mathcal{H}$. Define now on $S$ the equivalence relation $x \sim y$ if $x = \pm y$. The quotient $S/\!\sim$ has a natural structure as a smooth manifold modelled on $\mathcal{H}$, called the projective space on $\mathcal{H}$ and denoted $P(\mathcal{H})$.
K. Riemannian Manifolds
Some Preliminary Remarks on Bilinear Forms
If $E$ is a real Hilbert space we have denoted by $B(E, E)$ the linear topological space of bilinear continuous forms on $E$ (see 4.15). The following is clear.
4.52. Lemma: The topology of $B(E, E)$ may be defined by means of the norm $|\beta| = \sup\{\beta(x, y);\ |x| \le$ ...

... $> 0$ (the same for all $p \in N$). Then if $W_1 = \{x \in M;\ a \le f(x) \le b\}$, there exists a diffeomorphism between $W_1$ and $N \times [a, b]$ (as long as they are manifolds, i.e. when $N$ is a manifold without boundary).
Proof: From the proposition above we obtain the existence of a mapping
$$\Lambda : N \times (a - \eta/2,\ b + \eta/2) \to W_2, \tag{1}$$
where $W_2 = \{x \in M;\ a - \eta/2 < f(x) < b + \eta/2\}$.
This mapping sends $N \times \{z\}$ onto $f^{-1}(z)$ for every $a - \eta/2 < z < b + \eta/2$. Then the restriction of $\Lambda$ to $N \times [a, b]$ is the desired diffeomorphism.
4.69. Corollary: Under the hypothesis of 4.68, there exists a homotopy $H : M \times I \to M$, where $I = [0, 1]$, such that if $H_s(m) = H(m, s)$ then
1. for every $s \in I$, $H_s : M \to M$ is a diffeomorphism;
2. if $m \in M$ does not satisfy $a - \eta/4 \le f(m) \le b + \eta/4$, then $H_s(m) = m$ for all $s$;
3. $H_0$ = identity;
4. $H_1(\{x;\ f(x) \le a\}) = \{x;\ f(x) \le b\}$.
Proof: Let $h$ be a smooth function as shown below such that $h'(x) > 0$ for all $x$.
Let $F = \{x;\ a - \eta/4 \le f(x) \le b + \eta/4\}$ and let $G = M - F$. Clearly $G$ is open and $M = G \cup W_2$. Now if $\Lambda$ is the mapping defined in (1) of the proof of 4.68, then for every $s$ between 0 and 1 we define $H_s$ as follows:
if $m \in G$, then $H_s(m) = m$;
if $m = \Lambda(n, t) \in W_2$, then $H_s(m) = \Lambda(n, (1 - s)t + s\,h(t))$.
Observe that $H_s$ is well defined (and equal to the identity) on $W_2 \cap G$. Indeed, the reader can now verify that the $H_s$ have the properties 1), ..., 4).
Remark: Part (4) of this corollary says in particular that $\{x;\ f(x) \le a\}$ and $\{x;\ f(x) \le b\}$ are diffeomorphic. This fact will be used very often. An important generalization appears in 4.72.
B. The Palais-Smale Condition
Let us assume now that $(M, g)$ is a Riemannian manifold.
4.70. Definition: If $f \in C^\infty(M)$, we shall say that $f$ satisfies the Palais-Smale condition ("PS condition") if whenever $S$ is a set in $M$ on which $f$ is bounded and $\|\nabla f\|$ is not bounded away from zero, then there exists a critical point of $f$ adherent to $S$. Of course this is equivalent to: if $\{x_n\}$ is a sequence in $M$ such that $\{f(x_n)\}$ is bounded and $\|(\nabla f)_{x_n}\| \to 0$, then there exists a convergent subsequence of $\{x_n\}$ (the limit being necessarily a critical point).
Remark: This condition appears in (2) and (3) of the bibliography.
4.71. Theorem (noncritical neck principle for Riemannian manifolds): Let $M$ be a complete Riemannian manifold, $f \in C^\infty(M)$ and consider the sets $N \subset W \subset W_1 \subset M_1 \subset M$ defined by
$W = \{x;\ a$ ...

... $> 0$ is given. Then $T < t_+$ may be chosen so that
$$\int_T^{t_+} \left\| \frac{d\sigma}{dt}(t, p) \right\| dt < \eta.$$
Consider now two points $\sigma(x, p)$, $\sigma(y, p)$ with $T \le x \le y < t_+$. They may be joined by a curve $\gamma(t) = \sigma(t, p)$, $x \le t \le y$, whose length by the last formula is less than $\eta$. Therefore the distance $\varrho(\sigma(x, p), \sigma(y, p)) < \eta$ and we have proved thereby that the net $\sigma(t, p)$, $0 < t < t_+$, is a Cauchy net (cf. Kelley, General Topology). Since $M$ is complete, $\lim_{t \to t_+} \sigma(t, p)$ exists, which together with $t_+ < +\infty$ contradicts Prop. 4.45. Thus hypothesis (b') of 4.68 is satisfied (for $\eta = \varepsilon/2$) and then 4.68 applies. Q.E.D.
Remark: Observe that hypotheses ($\alpha$) and ($\beta$) are independent of $c$. Therefore, the conclusion is true for any $c$ between $a$ and $b$.
4.72. Corollary: Under the hypotheses of Theorem 4.71, there exists a homotopy $H : M \times I \to M$ ($I = [0, 1]$) having the properties:
1. for every $s \in I$, $H_s : M \to M$ is a diffeomorphism;
2. if $m \in M$ does not satisfy $a - \varepsilon/8 \le f(m) \le b + \varepsilon/8$, then $H_s(m) = m$ for all $s$;
3. $H_0$ = identity;
4. $H_1(\{x;\ f(x) \le a\}) = \{x;\ f(x) \le b\}$.
Proof: We have shown that the hypotheses of 4.71 imply those of 4.68, and hence of its corollary.
C. Local Study of Critical Points
Let $E$ be a Hilbert space and $f$ a smooth real function defined on a neighborhood of $0 \in E$. Using Taylor's expansion we write
$$f(x) = f(0) + f'(0)(x) + \tfrac12 f''(0)(x, x) + R(x),$$
where $R(x)$ is a function of order 3. Assume that 0 is a critical point of $f$. Then:
$$f(x) = f(0) + \tfrac12 f''(0)(x, x) + R(x). \tag{*}$$
Since the bilinear form $f''(0)$ is continuous and symmetric, there exists (see 4.54) a (unique) symmetric operator $A \in \operatorname{End}(E)$ such that
$$f''(0)(x, y) = (Ax, y) = (x, Ay).$$
Formula [*] then becomes
$$f(x) = f(0) + \tfrac12 (Ax, x) + R(x). \tag{**}$$
Let us consider a smooth change of coordinates $y \to x(y)$, such that $x(0) = 0$. Then $f(y) = f_1(x(y))$ and using the chain rule we obtain:
$$f''(y)(z_1, z_2) = f_1''(x(y))\,(x'(y) z_1, x'(y) z_2) + f_1'(x(y))\,(x''(y)(z_1, z_2)).$$
Since 0 is a critical point, we get:
$$f''(0)(z_1, z_2) = f_1''(0)\,(x'(0) z_1, x'(0) z_2).$$
This formula shows that the operator $A$ transforms according to:
$$A = u^{*} A_1 u, \quad \text{where } u = x'(0). \tag{***}$$
4.73. Definition: Let $f$ be a smooth function defined on some Riemannian manifold, $x$ a critical point of $f$. We shall say that $x$ is a nondegenerate critical point of $f$ if in any chart, the operator $A$ defined in [**] is invertible.
Remark: The formula [***] shows this notion to be coordinate independent. We are led to the same concept as follows. If $x$ is a critical point and $\phi$ is a chart around $x$, the bilinear form $H(f)_x$ defined on $TM_x$ by
$$H(f)_x(\mu, \lambda) = [(f \circ \phi^{-1})''(\phi(x))]\,(\phi_*(x)\mu,\ \phi_*(x)\lambda)$$
does not depend on $\phi$. Hence we may make the following definition.
4.74. Definition: The bilinear form $H(f)_x$ is called the Hessian of $f$ at $x$.
According to our definitions, the Hessian of a smooth function is a smooth section of the bundle $B(T(M), T(M)) = T_2(M)$ defined on the set of critical points of $f$.
4.75. Definition: A critical point $x$ is called nondegenerate if $H(f)_x$ is a scalar product defining the given topology of $TM_x$.
It is obvious that 4.73 and 4.75 are equivalent.
Remark: A very elegant definition of $H(f)_x$ is given in Milnor, "Morse Theory" (Ann. of Math. Studies, No. 51, Princeton 1963).
Let $F$ be a complex Hilbert space. Denote by $\operatorname{End}(F)$ the space of all the continuous linear operators $T : F \to F$ and by $\operatorname{Aut}(F)$ (respectively $H(F)$) the subset of invertible operators (respectively, the subspace of the Hermitean operators).
4.76. Lemma: If $A \in \operatorname{Aut}(F) \cap H(F)$, then the mapping $\psi : \operatorname{End}(F) \to H(F) \times H(F)$ defined by
$$\psi(B) = (B^*A + AB,\ i(B^*A - AB))$$
is one-to-one onto and both $\psi$ and $\psi^{-1}$ are continuous.
Proof: In fact, given $S$ and $T$ symmetric, define
$$B = \tfrac12 A^{-1}(S + iT).$$
Then $B^*A + AB = S$ and $i(B^*A - AB) = T$, and hence
$$(S, T) \to \tfrac12 A^{-1}(S + iT)$$
is the inverse of $\psi$. Clearly both are continuous.
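The inverse formula in the proof of Lemma 4.76 can be checked numerically in finite dimensions. The following is only an illustrative sketch: it assumes $F = \mathbb{C}^4$, and the helper names `random_hermitian`, `psi` and `psi_inverse` are hypothetical, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # assumed finite dimension standing in for the Hilbert space F


def random_hermitian(size):
    """Return a random size x size Hermitian matrix."""
    Z = rng.standard_normal((size, size)) + 1j * rng.standard_normal((size, size))
    return (Z + Z.conj().T) / 2


def psi(B, A):
    """psi(B) = (B*A + AB, i(B*A - AB)) as in Lemma 4.76."""
    Bs = B.conj().T
    return Bs @ A + A @ B, 1j * (Bs @ A - A @ B)


def psi_inverse(S, T, A):
    """Candidate inverse (S, T) -> (1/2) A^{-1} (S + iT)."""
    return 0.5 * np.linalg.solve(A, S + 1j * T)


# A Hermitian and invertible: a random Hermitian matrix shifted away from 0.
A = random_hermitian(n) + 10 * np.eye(n)
S, T = random_hermitian(n), random_hermitian(n)

# psi(psi_inverse(S, T)) should return (S, T).
S2, T2 = psi(psi_inverse(S, T, A), A)
roundtrip_error = max(np.abs(S2 - S).max(), np.abs(T2 - T).max())

# psi_inverse(psi(B0)) should return an arbitrary operator B0.
B0 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
inverse_error = np.abs(psi_inverse(*psi(B0, A), A) - B0).max()
```

Both errors should be at the level of floating-point rounding, which matches the two identities $B^*A + AB = S$ and $i(B^*A - AB) = T$ used in the proof.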
4.77. Lemma: Assume that $A \in \operatorname{Aut}(F) \cap H(F)$. Then the mapping $\phi : \operatorname{Aut}(F) \to H(F) \times H(F)$ defined by
$$\phi(B) = (B^*AB,\ i(B^*AB^{-1} - A))$$
is differentiable and its derivative at $B = I$ is $\delta\phi(I, B) = \psi(B)$.
Proof: Observe that $\delta(B^{-1}) = -\delta B$ at $B = I$ and compute.
4.78. Corollary: $\phi$ maps a neighborhood of $I \in \operatorname{Aut}(F)$ diffeomorphically onto a neighborhood of $(A, 0) = \phi(I)$.
Proof: Use the implicit function theorem.
Assume now that $x \to (A(x), D(x))$ is a smooth mapping from an open set of $F$ into $H(F) \times H(F)$ such that $A(0) = A$ and $D(0) = 0$. Then
4.79. Lemma: $x \to B(x) = \phi^{-1}(A(x), D(x))$ is a smooth mapping such that $B(0) = I$ and
$$A(x) = B^*(x) A(0) B(x) \quad \text{and} \quad D(x) = i\,(B^*(x) A(0) B^{-1}(x) - A(0)).$$
Proof: Follows from the corollary above.
Assume now that $E$ is a real Hilbert space.
4.80. Proposition: Let $x \to A(x)$ be a smooth mapping from a neighborhood of $0 \in E$ into $\operatorname{End}(E)$ such that $A(x)$ is symmetric and $A(0)$ is also invertible. Then there exists a smooth mapping $x \to B(x)$ of some neighborhood of $0 \in E$ into $\operatorname{Aut}(E)$ such that:
$$A(x) = B^*(x) A(0) B(x).$$
Proof: Let $F$ be the complexification of $E$: $F = E \otimes \mathbb{C} = E \oplus iE$. Define the mappings $x + iy = z \to A(z)$ by $A(z)(u + iv) = A(x)u + iA(x)v$ and $z \to D(z)$ by $D(z) = 0$. Apply Lemma 4.79 to prove that there exists a map $z \to B(z)$ such that:
$$B^*(z) A(0) B(z) = A(z), \tag{1}$$
$$B^*(z) A(0) (B(z))^{-1} = A(0). \tag{2}$$
From (1) and (2) we conclude that:
$$(B(z))^2 = (A(0))^{-1} A(z), \tag{3}$$
$$(B^*(z))^2 = A^*(z) (A^*(0))^{-1}. \tag{4}$$
Now observe that for every $x \in E$, $A(x)$ and $A(0)$ leave $E$ invariant: $A(x)E \subset E$, $A(0)E = E$. Then (3) and (4) imply that $E$ is also invariant under $(B(x))^2$ and $(B^*(x))^2$. Now we shall use the following statement (see below for a justification):
(5) If an operator $T$ satisfies $\|I - T^2\| < 1$, then the invariant (closed) subspaces of $T$ and $T^2$ are the same.
From (5), it follows that $E$ is also invariant under $B(x)$ and $B^*(x)$, provided that $x \in E$ is near 0. Hence, by restriction to $E$ (and calling $B(x) = B(x)|E$) we obtain from (1)
$$B^*(x) A(0) B(x) = A(x), \qquad x \in E,\ x \text{ near } 0,$$
as desired. Justification of (5): observe that
$$T = (I - (I - T^2))^{1/2} = I - \tfrac12 (I - T^2) - \tfrac18 (I - T^2)^2 - \cdots - \frac{(2(n-1))!}{2^{2n-1}\, n\, ((n-1)!)^2}\,(I - T^2)^n - \cdots,$$
where the series converges in the uniform topology of operators if $\|I - T^2\| < 1$.
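The series above can be summed numerically as a sanity check. The sketch below is only an illustration under stated assumptions: $T$ is taken to be a symmetric $5 \times 5$ matrix with spectrum in $[0.8, 1.2]$, so that $\|I - T^2\| < 1$ and $T$ is the positive square root of $T^2$. The function name `sqrt_series` is hypothetical. The coefficients $c_n = (2(n-1))!/(2^{2n-1}\, n\, ((n-1)!)^2)$ are generated by the equivalent recurrence $c_1 = 1/2$, $c_{n+1} = c_n (2n-1)/(2n+2)$.

```python
import numpy as np


def sqrt_series(T2, terms=60):
    """Partial sum of I - sum_{n>=1} c_n (I - T2)^n."""
    I = np.eye(T2.shape[0])
    X = I - T2
    total = I.copy()
    power = I.copy()
    c = 0.5  # c_1
    for n in range(1, terms + 1):
        power = power @ X          # (I - T2)^n
        total -= c * power
        c *= (2 * n - 1) / (2 * n + 2)  # c_{n+1} from c_n
    return total


rng = np.random.default_rng(1)
E = rng.standard_normal((5, 5))
E = (E + E.T) / 2
E /= np.linalg.norm(E, 2)        # symmetric, spectral norm 1
T = np.eye(5) + 0.2 * E          # eigenvalues of T lie in [0.8, 1.2]
T2 = T @ T

gap = np.linalg.norm(np.eye(5) - T2, 2)               # must be < 1 for convergence
series_error = np.linalg.norm(sqrt_series(T2) - T, 2)  # distance of the sum from T
```

Since the partial sums are polynomials in $T^2$, every subspace invariant under $T^2$ is invariant under each partial sum, which is the mechanism behind statement (5).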
4.81. Proposition: Let $A$ be a symmetric invertible operator in $\operatorname{End}(E)$ ($E$ = real Hilbert space). Then there exist $T \in \operatorname{Aut}(E)$ and a projector $P \in \operatorname{End}(E)$ such that
$$(Ax, x) = \|PTx\|^2 - \|(I - P)Tx\|^2, \qquad x \in E.$$
Proof: Let $h$ be the characteristic function of $[0, \infty)$ and $g$ the function $g(\lambda) = |\lambda|^{-1/2}$, $\lambda$ real, $\lambda \ne 0$. Since $A$ is invertible, $g$ is continuous on the spectrum of $A$. Then $S = g(A)$ is defined. Clearly $S$ (being a function of $A$) is symmetric and commutes with $A$. Moreover $S$ is invertible (because $g \ne 0$ on Spectrum($A$)). Call $T = S^{-1}$; $T$ is symmetric and invertible. Now define $P = h(A)$. $P$ is clearly a projector (because $h^2 = h$) commuting with $A$, hence also with $T$. Since we have
$$\lambda\,(g(\lambda))^2 = h(\lambda) - (1 - h(\lambda)),$$
we conclude that $AT^{-2} = P - (I - P)$, and hence $A = PT^2 - (I - P)T^2$. But then
$$(Ax, x) = (PT^2x, x) - ((I - P)T^2x, x) = \|PTx\|^2 - \|(I - P)Tx\|^2,$$
as desired.
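In finite dimensions the construction of $T$ and $P$ in this proof reduces to an eigendecomposition. The following sketch assumes $E = \mathbb{R}^6$ and a symmetric matrix with prescribed nonzero eigenvalues, chosen only for illustration; it builds $T = |A|^{1/2}$ and $P = h(A)$ through the functional calculus and compares both sides of the identity.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6

# A symmetric invertible matrix with three positive and three negative eigenvalues.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = np.array([3.0, 1.5, 0.5, -0.25, -2.0, -4.0])
A = Q @ np.diag(eigs) @ Q.T
A = (A + A.T) / 2  # remove rounding asymmetry

# Functional calculus through the eigendecomposition A = V diag(lam) V^T.
lam, V = np.linalg.eigh(A)
T = V @ np.diag(np.sqrt(np.abs(lam))) @ V.T        # T = g(A)^{-1} = |A|^{1/2}
P = V @ np.diag((lam >= 0).astype(float)) @ V.T    # P = h(A), spectral projector

x = rng.standard_normal(n)
lhs = x @ A @ x
rhs = np.linalg.norm(P @ T @ x) ** 2 - np.linalg.norm((np.eye(n) - P) @ T @ x) ** 2

identity_error = abs(lhs - rhs)
projector_error = np.abs(P @ P - P).max()
k = int(round(np.trace(P)))  # dim(PE): number of positive eigenvalues of A
```

Here `k` is the dimension of $PE$ and $n - k$ the dimension of $(I - P)E$, the two numbers that make up the index in Definition 4.86.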
4.82. Proposition (Morse Lemma): Let $f$ be a smooth function defined on a Riemannian manifold $M$ (modelled on $E$). If $x$ is a nondegenerate critical point of $f$, then there exists a chart $\phi$ around $x$ (sending $x$ into 0) and a projector $P$ in $E$ such that
$$f(y) = f(x) + |P\phi y|^2 - |(I - P)\phi y|^2 \tag{1}$$
when $y$ belongs to the domain of $\phi$.
Proof: Let $\psi$ be any chart around $x$ (sending $x$ into 0) and put $g(y) = (f\psi^{-1})(y) - f(x)$, where $y \in E$ and $y$ is near 0. Then
$$g(y) = g(0) + g'(0)y + \left[\int_0^1 (1 - s)\, g''(sy)\, ds\right](y, y) = \left[\int_0^1 (1 - s)\, g''(sy)\, ds\right](y, y). \tag{2}$$
Now, since $\alpha_y = \int_0^1 (1 - s)\, g''(sy)\, ds$ is a symmetric bilinear form on $E$, there exists a mapping $y \to A(y) \in \operatorname{End}(E)$ defined by
$$(A(y)u, v) = \alpha_y(u, v). \tag{3}$$
Clearly every $A(y)$ is symmetric and
$$(A(0)x, y) = \alpha_0(x, y) = \tfrac12 g''(0)(x, y).$$
Together with the fact that $x$ is a nondegenerate critical point of $f$ this implies that $A(0)$ is invertible. Now we apply 4.80 and obtain $B(x)$ satisfying:
$$A(x) = B^*(x) A(0) B(x).$$
This implies that
$$(A(y)y, y) = (A(0)B(y)y,\ B(y)y), \tag{4}$$
and from (2), (3) and (4) we obtain:
$$g(y) = (A(0)B(y)y,\ B(y)y).$$
Using 4.81, we conclude that there exist $T$ and $P$ such that
$$g(y) = |PTB(y)y|^2 - |(I - P)TB(y)y|^2. \tag{5}$$
Define $\phi$ by
$$\phi(y) = T(B(\psi(y))\,\psi(y)).$$
By (5), we have:
$$f(y) = f(x) + g(\psi(y)) = f(x) + |PTB(\psi(y))\psi(y)|^2 - |(I - P)TB(\psi(y))\psi(y)|^2 = f(x) + |P\phi y|^2 - |(I - P)\phi y|^2,$$
as desired. We must observe that $\phi$ is actually a chart; indeed it is the composition of
$$y \to \psi(y), \qquad e \to B(e)e, \qquad z \to T(z),$$
and clearly the first and third mappings are diffeomorphisms while $y \to B(y)y$, having the identity as derivative at the origin (compute!), is also a local diffeomorphism. Hence $\phi$ is a chart on some domain around $x$. Q.E.D.
D. Global Study of Critical Points
We begin by defining handles and the attaching of handles.
4.83. Definition: For every cardinal number $k$ we shall denote by $D^k$ the closed unit ball around 0 in a Hilbert space having an orthonormal basis of cardinality $k$. $\partial D^k$ will denote its boundary.
4.84. Definition: Let $M$ and $\bar M$ be two smooth manifolds (possibly with boundary). We shall say that $\bar M$ has been obtained from $M$ by attaching a handle of type $(k, l)$ if the following conditions are satisfied:
1. $M$ is a regularly embedded submanifold of $\bar M$;
2. There exists a closed subset $H \subset \bar M$ and a mapping $h : D^k \times D^l \to H$, such that:
2a. $M \cup H = \bar M$,
2b. $h$ is a homeomorphism,
2c. $h(\partial D^k \times D^l) = H \cap M \subset \partial M$,
2d. $H - M$ is a submanifold of $\bar M$ with boundary,
2e. the restriction $h | ((D^k - \partial D^k) \times D^l)$ is a diffeomorphism onto $H - M$, and
2f. the restriction $h | (\partial D^k \times D^l)$ is a regular embedding of $\partial D^k \times D^l$ into $M$.
In this situation, we use the following notation for $\bar M$: $\bar M = M \cup_h H(k, l)$.
Remark: Obviously $\dim \bar M = k + l$. We shall say that $H$ is a handle. As a generalization, we state the following definition:
4.85. Definition: We shall say that $\bar M$ has been obtained from $M$ by attaching $n$ handles of types $(k_1, l_1), \ldots, (k_n, l_n)$ if, separately attaching $n$ such handles to $M$ in such a way that $h_i(H_i) \cap h_j(H_j) = \emptyset$, the manifold obtained is $\bar M$.
In this situation, we use the following notation for $\bar M$: $\bar M = M \cup H_1 \cup \cdots \cup H_n$. (See next figure.)
Assume now that $f$ is a smooth real-valued function defined on some Riemannian manifold $M$ (modelled on $E$) and that $x \in M$ is a nondegenerate critical point of $f$. By 4.82, $f$ may be represented, in some chart $\phi$ around $x$, as
$$f(y) = f(x) + |P\phi y|^2 - |(I - P)\phi y|^2,$$
where $P$ is a projector in $E$.
4.86. Definition: The index of $f$ at $x$ (or the index of $x$, if there is no confusion) is the pair $(k, l)$, where $k = \dim(PE)$, $l = \dim((I - P)E)$.
Of course both $k$ and $l$ may be infinite cardinal numbers. We arrive now at the most important theorems of this section. The symbol $\approx$ will mean "is diffeomorphic to". Let $(M, g)$ be a complete Riemannian manifold. Given $f \in C^\infty(M)$ then for every $s, t \in \mathbb{R}$, define
$$[f \le s] = \{x \in M;\ f(x) \le s\},$$
$[s \le f$ ...

... and $x^2 > 1$, $x^2 - y^2 - \tfrac32\lambda(x^2)$; then
$$x^2 - y^2 - \tfrac32\lambda(x^2)\,\eta(y^2) = x^2 - y^2 - \tfrac32\lambda(x^2) = x^2 - y^2;$$
finally, if $y^2 > 2$ and $x^2 < 1$, then necessarily $x^2 - y^2 \le -1$ and consequently
$$x^2 - y^2 - \tfrac32\lambda(x^2)\,\eta(y^2) < -1, \qquad x^2 - y^2 - \tfrac32\lambda(x^2) < -1.$$
This proves (9.5).
Define $K \subset B$ by
$$K = \{e \in B;\ x^2 + 1 - \tfrac32\lambda(x^2) \le y^2 \le x^2 + 1\}, \tag{9.6}$$
so that $K$ is the set of elements $e \in B$ satisfying
$$x^2 + 1 - \tfrac32\lambda(x^2) \le y^2, \tag{9.6.1}$$
and
$$y^2 \le x^2 + 1. \tag{9.6.2}$$
Let $D^k$ and $D^l$ be the unit balls of $Y$ and $X$ respectively and define $h : D^k \times D^l \to K$ by
$$h(y, x) = (\sigma(y^2))^{1/2}\, x + (1 + \sigma(y^2)\, x^2)^{1/2}\, y, \tag{9.7}$$
where $\sigma$ is the smooth function defined as follows: if $0 \le t \le 1$, $\sigma(t)$ is the unique solution in $[0, 1]$ of
$$\frac32\, \frac{\lambda(\sigma(t))}{1 + \sigma(t)} = 1 - t. \tag{9.8}$$
This mapping is smooth and has the form shown in the following graph.
[Figure: graph of $\sigma$.]
Since $\sigma$ is smooth, $h$ is also smooth. The reader will check that the image $H = h(D^k \times D^l)$ of $h$ is contained in $K$; this follows from the inequality
$$\sigma(y^2)\,x^2 + 1 - \tfrac32\lambda(\sigma(y^2)\,x^2) \le (1 + \sigma(y^2)\,x^2)\, y^2,$$
which is a consequence of
$$1 - y^2 = \tfrac32\lambda(\sigma(y^2))\,(1 + \sigma(y^2))^{-1} \le \tfrac32\lambda(\sigma(y^2)\,x^2)\,(1 + \sigma(y^2)\,x^2)^{-1};$$
it follows from this inequality that $H \subset W$; the other condition ($y^2 \le 1 + x^2$) is even easier and is left to the reader.
Now consider the function $S : H \to X \times Y$ defined by:
$$S(e) = \left( (1 + x^2)^{-1/2}\, y,\ \left[\sigma\!\left(\frac{y^2}{1 + x^2}\right)\right]^{-1/2} x \right). \tag{9.9}$$
$S$ is smooth and clearly $Sh$ = identity, $hS$ = identity. In order to finish the proof it suffices to show that $H$ is a $(k, l)$ handle and that $V \cup H = W$. To do this, we first show that
$$H = K \cap [x^2 \le 1].$$
In fact, it is obvious that $H \subset K \cap [x^2 \le 1]$. Assume now that $e = x + y \in K$ and $x^2 \le 1$. If $x^2 \le \tfrac12$, then $x^2 \le \sigma(y^2/(1 + x^2))$ is trivial. If $x^2 \ge \tfrac12$, then there exists $0 \le t \le 1$ such that $\sigma(t) = x^2$. From the formula
$$x^2 + 1 - \tfrac32\lambda(x^2) = x^2 + 1 - \tfrac32\lambda(\sigma(t)) \le y^2$$
it follows (use (9.8)) that:
$$x^2 + 1 - (1 + \sigma(t))(1 - t) \le y^2, \qquad (1 + x^2) - (1 + x^2)(1 - t) \le y^2,$$
so that $(1 + x^2)\,t$ ...

... $1$. (b)
(a) and (b) together imply that $\lambda(x^2) > 0$, whence
$$x^2 < 1. \tag{c}$$
Now (a), (b) and (c) imply that $e$ belongs to $K \cap [x^2$ ...

... $0$,
where $P = (P, \emptyset)$ is a space consisting of a single point.
(f) There exist maps
$$\partial_k : H_k(X, Y) \to H_{k-1}(Y)$$
(which we write simply as $\partial$, omitting the subindex) such that, if $\phi : (X, Y) \to (\tilde X, \tilde Y)$, then
$$\partial \phi_* = (\phi | Y)_*\, \partial$$
(here we have designated by $\phi | Y$ the restriction of $\phi$ to $Y$, and by $(\phi | Y)_*$ the induced map on $H_k(Y) = H_k(Y, \emptyset)$).
(g) Exactness principle of Euler
Let $X \supseteq Y \supseteq \emptyset$; let
$$j : (X, \emptyset) \to (X, Y), \qquad k : (Y, \emptyset) \to (X, \emptyset)$$
be inclusion maps, $j_*$, $k_*$ the induced maps on the homology groups. We can construct the sequence
$$\cdots \to H_k(Y) \xrightarrow{k_*} H_k(X) \xrightarrow{j_*} H_k(X, Y) \xrightarrow{\partial} H_{k-1}(Y) \xrightarrow{k_*} H_{k-1}(X) \to \cdots \to H_0(Y) \xrightarrow{k_*} H_0(X) \xrightarrow{j_*} H_0(X, Y) \to 0$$
(the homology sequence of the pair $X$, $Y$). The exactness principle tells us that the homology sequence of any pair $(X, Y)$ is exact (i.e. the image of any group in the sequence under the corresponding homomorphism is equal to the kernel of the next homomorphism). We note the following results for future use.
1. Let $S^n$ be the $n$-sphere, $G = \mathbb{Z}$, $n \ne 0$. Then
$$H_k(S^n) = 0 \ \text{ if } k > 0,\ k \ne n, \qquad H_n(S^n) = \mathbb{Z}, \qquad H_0(S^n) = \mathbb{Z}.$$
2. Let $G$ be as before, $D^n$ be the $n$-disk. Then
$$H_k(D^n, S^{n-1}) = H_k(D^n, \partial D^n) = \{0\} \ \text{ if } k \ne n, \qquad H_n(D^n, S^{n-1}) = \mathbb{Z}.$$
3. Suppose $(X, Y) = \bigcup_{i=1}^{l} (X_i, Y_i)$, all $X_i$ disjoint. Then
$$H_k(X, Y) = \bigoplus_{i=1}^{l} H_k(X_i, Y_i).$$
Example (i). To illustrate the use of the exactness principle we will deduce 2) from 1) and (e). Consider
$$H_k(S^{n-1}) \xrightarrow{k_*} H_k(D^n) \xrightarrow{j_*} H_k(D^n, S^{n-1}) \xrightarrow{\partial} H_{k-1}(S^{n-1}) \xrightarrow{k_*} H_{k-1}(D^n).$$
Since $D^n$ is homotopic to a point for any $n$, we have $H_k(D^n) = \{0\}$ for any $k \ne 0$. Therefore, if $k > 0$, $k \ne n$, we get the sequence
$$\{0\} \to H_k(D^n, S^{n-1}) \xrightarrow{\partial} \{0\} \to \{0\},$$
whose exactness implies readily that $H_k(D^n, S^{n-1}) = \{0\}$. On the other hand, if $k = n$, the sequence is
$$\{0\} \to H_n(D^n, S^{n-1}) \to \mathbb{Z} \to \{0\}.$$
This time, exactness implies that $H_n(D^n, S^{n-1}) = \mathbb{Z}$.
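Example (ii). We now note a result more general than the exactness principle. Let $X \supseteq Y \supseteq Z$; consider the inclusion maps
$$j : (X, Z) \to (X, Y), \qquad k : (Y, Z) \to (X, Z), \qquad l : (Y, \emptyset) \to (Y, Z).$$
Using the induced mappings $j_*$, $k_*$, $l_*$ we can form the sequence:
$$\cdots \to H_k(X, Z) \xrightarrow{j_*} H_k(X, Y) \xrightarrow{\partial'} H_{k-1}(Y, Z) \xrightarrow{k_*} H_{k-1}(X, Z) \xrightarrow{j_*} H_{k-1}(X, Y) \xrightarrow{\partial'} \cdots \to H_0(Y, Z) \xrightarrow{k_*} H_0(X, Z) \xrightarrow{j_*} H_0(X, Y) \to \{0\}, \tag{1.2}$$
where $\partial'$ is the composition of $l_*$ and $\partial$, i.e.
$$\partial' = l_*\partial : \quad H_k(X, Y) \xrightarrow{\partial} H_{k-1}(Y) \xrightarrow{l_*} H_{k-1}(Y, Z). \tag{1.3}$$
As an exercise the reader should prove, using the exactness principle, that (1.2) is an exact sequence.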
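Consider now the case $G = \mathbb{R}$; the homology groups are then vector spaces. Let $K_k(X, Y)$ be the subspace of $H_k(X, Y)$ which is either the image of the preceding homomorphism or the kernel of the next (similarly define $K_k(X, Z)$, ... etc.) and set $\varepsilon_k(X, Y) = \dim K_k(X, Y)$ ... etc. From the exactness of (1.2) we easily see that
$$\beta_k(X, Y) = \varepsilon_k(X, Y) + \varepsilon_{k-1}(Y, Z), \tag{1.4a}$$
$$\beta_k(Y, Z) = \varepsilon_k(Y, Z) + \varepsilon_k(X, Z), \tag{1.4b}$$
$$\beta_k(X, Z) = \varepsilon_k(X, Z) + \varepsilon_k(X, Y). \tag{1.4c}$$
Now,
$$\sum_{j \le m} (-1)^j \beta_j(X, Z) - \sum_{j \le m} (-1)^j \beta_j(X, Y) - \sum_{j \le m} (-1)^j \beta_j(Y, Z)$$
$$= \sum_{j \le m} (-1)^j \{\varepsilon_j(X, Z) + \varepsilon_j(X, Y) - \varepsilon_j(X, Y) - \varepsilon_{j-1}(Y, Z) - \varepsilon_j(Y, Z) - \varepsilon_j(X, Z)\}$$
$$= -\sum_{j \le m} (-1)^j (\varepsilon_j(Y, Z) + \varepsilon_{j-1}(Y, Z)) = (-1)^{m+1} \varepsilon_m(Y, Z). \tag{1.5}$$
Now define
$$\eta_m(X, Z) = (-1)^m \sum_{j \le m} (-1)^j \beta_j(X, Z). \tag{1.6}$$
Clearly it follows that
$$\eta_m(X, Z) = \eta_m(X, Y) + \eta_m(Y, Z) - \text{nonnegative integer},$$
i.e.
$$\eta_m(X, Z) \le \eta_m(X, Y) + \eta_m(Y, Z). \tag{1.8}$$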
We now apply these results to Morse theory. Let $M$ be a Hilbert manifold, $f$ a smooth function on $M$ satisfying the PS condition on $a \le f \le b$; let $c$, $a < c < b$, be its only critical level, and suppose that the critical points of $f$, $p_1, \ldots, p_n$, are nondegenerate, their indices being $(n_i, m_i)$, $i = 1, \ldots, n$.
We know, by Theorem 4.87, that $[f$ ...

... $k$.
5.17. Definition:
$$c_m(f) = \inf_{A \in \Gamma_m(M)} \Big\{ \sup_{p \in A} f(p) \Big\},$$
where we put $c_m(f) = \infty$ if $\Gamma_m(M) = \emptyset$.
Let $m < m'$. Then $c_m(f)$ is an infimum over more elements than $c_{m'}(f)$, therefore
$$-\infty \le c_1(f) \le c_2(f) \le \cdots \tag{5.8}$$

... $> 0$ we set
$$N_\varepsilon = \{p \in M \mid |f(p) - c| < \varepsilon \text{ and } |(\nabla f)(\eta_t(p))| < \varepsilon \text{ for some } t \text{ such that } 0 \le t \le 1\}, \tag{5.9}$$
then any neighborhood $U$ of $K_c$ contains one of the neighborhoods $N_\varepsilon$ of $K_c$.
Proof: The assertion that $K_c$ is compact follows immediately from Condition PS; thus only our second assertion requires proof. Suppose that this second assertion is false. Then there exists a neighborhood $U$ of $K_c$ not containing any of the sets $N_\varepsilon$, and hence there exists a sequence $\{p_n\}$ of points and a sequence $\{t_n\}$ of numbers such that $0 \le t_n \le 1$, such that $p_n \notin U$, $f(p_n) \to c$ and $(\nabla f)(\eta_{t_n}(p_n)) \to 0$ as $n \to \infty$. Passing to a subsequence, we may suppose without loss of generality that $t_n \to t^*$ as $n \to \infty$. Now
$$\frac{d}{dt} f(\eta_t(p)) = V(\eta_t(p)) f = -\alpha(|\nabla f(\eta_t(p))|)\, \{(\nabla f(\eta_t(p))) f\}, \tag{5.10}$$
and thus, from the definition of the gradient, we have
$$\frac{d}{dt} f(\eta_t(p)) = -\alpha(|\nabla f(\eta_t(p))|)\, |\nabla f(\eta_t(p))|^2. \tag{5.11}$$
It is clear from (5.11) and from the definition of $\alpha$ that $|df(\eta_t(p))/dt|$ is uniformly bounded for all $p \in M$ and real $t$; thus there exists a finite constant $K$ such that $|f(\eta_t(p)) - f(p)| \le K|t|$. Since $f(p_n) \to c$, it follows from this last that $f(\eta_{t_n}(p_n))$ is uniformly bounded. Thus, by Condition PS, $\{\eta_{t_n}(p_n)\}$ has a convergent subsequence, and we may suppose without loss of generality that $\eta_{t_n}(p_n)$ converges to a point $q \in M$. Since $(\nabla f)(\eta_{t_n}(p_n)) \to 0$, $q$ is evidently a critical point of $f$. We have
$$p_n = \eta_{-t_n}(\eta_{t_n}(p_n)) \to \eta_{-t^*}(q) = q \tag{5.12}$$
by Lemma 4.45 and by the fact that $q$ is a critical point. Thus $q \in K_c$ is the limit of $p_n$, and since $p_n \notin U$ we have a contradiction which completes our proof.
5.20 Corollary: Let $0 < \varepsilon < 1$. Let $\eta_t$ and $N_\varepsilon$ be as in the preceding Lemma. Then if $f(p) \le c + \varepsilon^2/2$ and $p \notin N_\varepsilon$, we have $f(\eta_1(p)) \le c - \varepsilon^2/2$.
Proof: It is plain from (5.11) of the preceding proof that $f(\eta_t(p))$ decreases as $t$ increases, and, in fact, that
$$f(\eta_1(p)) - f(p) = -\int_0^1 \alpha(|\nabla f(\eta_t(p))|)\, |\nabla f(\eta_t(p))|^2\, dt. \tag{5.13}$$
If $f(p) < c - \varepsilon$ we have nothing to prove; thus we may suppose without loss of generality that $|f(p) - c| < \varepsilon$. Then $p \notin N_\varepsilon$ implies that $|\nabla f(\eta_t(p))| \ge \varepsilon$ for $0 \le t \le 1$, so that, since $t^2\alpha(t)$ is monotone increasing (cf. the first paragraph of the preceding proof) we may conclude, using (b), that $f(\eta_1(p)) - f(p) \le -\varepsilon^2$. But then
$$f(\eta_1(p)) \le f(p) - \varepsilon^2 \le c - \frac{\varepsilon^2}{2},$$
and the present corollary is proved.
We are now in a position to prove the principal theorem of Lusternik and Schnirelman in a generalized form.
5.21 Theorem: Let $(M, f)$ satisfy Condition PS, and let $\{c_m(f)\}$ be as in Definition 5.17. Suppose that $m \le n$, and that $-\infty < c = c_m(f) = c_n(f) < \infty$. Then the set $K_c$ of critical points (cf. (1)) is of category $n - m + 1$ at least; moreover, even if $m = n$, the set $K_c$ is nonempty.
5.22 Corollary: Under the hypotheses of the preceding theorem the set $K_c$ is of dimension $n - m$ at least.
Proof: The corollary follows immediately from the theorem and Theorem 5.5. To prove the theorem, first suppose that $n > m$ and that $\operatorname{cat}(K_c) \le n - m$, and then use the Corollary of Lemma 5.6 to find a neighborhood $U$ of $K_c$ such that $\operatorname{cat}(\bar U) \le n - m$. Using Lemma 5.19 and Lemma 5.3.2, we may suppose without loss of generality that $U$ is one of the neighborhoods $N_\varepsilon$ described by (2). By Definition 5.17, there exists a closed subset $A$ of $M$ such that $\operatorname{cat}(A) \ge n$ and such that $\sup\{f(p) \mid p \in A\} \le c + \varepsilon^2/2$. Put $A_0 = A - N_\varepsilon$. Then, by Lemma 5.3.1, $\operatorname{cat}(A_0) \ge m$. Thus, if $\eta_t$ is as in Lemma 5.19 and Corollary 5.20, it follows from Lemma 5.3.3 that $\operatorname{cat}(\eta_1(A_0)) \ge m$. On the other hand, by Corollary 5.20, $f(\eta_1(p)) \le c - \varepsilon^2/2$ for $p \in A_0$. This contradicts the Definition 5.17 of $c_m$ and thus completes the proof of Theorem 5.21 in case $n > m$. In case $n = m$ and $K_c$ is void, we may let $U$ be the null set, and arrive by the same argument at the same contradiction. Thus Theorem 5.21 follows in every case. Q.E.D.
Reference
L. Liusternik and L. Schnirelman, Méthodes topologiques dans les problèmes variationnels (Hermann & Cie, Éditeurs, Paris, 1934).
CHAPTER VI
Applications of Morse Theory to Calculus of Variations in the Large
Bibliography
1. R. S. Palais, "Morse theory on Hilbert manifolds", Topology, Vol. 2, pp. 299-340.
2. J. Milnor, Morse theory (Ann. of Math. Studies, Princeton, 1963).
3. I. M. Singer, Notes on Differential Geometry (Mimeographed, M.I.T., 1962).
4. S. S. Chern, Differentiable manifolds (Mimeographed, Chicago Univ., 1959).
We consider now the set of all suitably smooth paths in a finite-dimensional compact Riemannian manifold $M$. We shall see that a natural (infinite-dimensional) Riemannian structure can be introduced into this set, allowing us to apply our infinite-dimensional Morse theory. Extremals of a conveniently chosen $f$ on this set will correspond to geodesics in $M$, so that our results will relate to the geodesics of $M$.
6.1. Definition: Let $R^n$ be $n$-dimensional Euclidean space. Define $H_0(I, R^n) = L_2(I, R^n)$, i.e. the space of all functions $\sigma, \rho, \ldots$ such that
$$\int_0^1 |\sigma(t)|^2\,dt < \infty,$$
with the scalar product
$$(\sigma, \rho)_0 = \int_0^1 (\sigma(t), \rho(t))\,dt.$$
6.2. Definition: Let $H_1(I, R^n)$ be the set of all absolutely continuous maps $\sigma : I \to R^n$ such that $\sigma' \in H_0(I, R^n)$. $H_1(I, R^n)$ is a Hilbert space under the inner product $(\sigma, \rho)_1 = (\sigma(0), \rho(0)) + (\sigma', \rho')_0$. In fact, if $(p, g) \in R^n \oplus H_0(I, R^n)$, the map
$$(p, g) \mapsto p + \int_0^t g(s)\,ds \in H_1(I, R^n)$$
is an isometry onto.
6.3. Definition: We define $L : H_1(I, R^n) \to H_0(I, R^n)$ by $L\sigma = \sigma'$, and we define $H_1^*(I, R^n) = \{\sigma \in H_1(I, R^n) \mid \sigma(0) = \sigma(1) = 0\}$. Then the following is immediate:
6.4. Theorem: $L$ is a bounded linear transformation of norm 1. $H_1^*(I, R^n)$ is a closed linear subspace of codimension $2n$ in $H_1(I, R^n)$, and $L$ maps $H_1^*(I, R^n)$ isometrically onto the set of $g \in H_0(I, R^n)$ such that
$$\int_0^1 g(t)\,dt = 0,$$
i.e. onto the orthogonal complement in $H_0(I, R^n)$ of the set of constant maps of $I$ into $R^n$.
6.5. Theorem: If $\rho \in H_1^*(I, R^n)$ and $\lambda$ is absolutely continuous from $I$ into $R^n$, then
$$\int_0^1 (\lambda'(t), \rho(t))\,dt = -(\lambda, L\rho)_0.$$
(Integrate by parts; the boundary terms vanish because $\rho(0) = \rho(1) = 0$.)
6.6. Definition: $C^0(I, R^n)$ = set of all continuous maps of $I$ into $R^n$. $C^0(I, R^n)$ is a Banach space with the usual norm $\| \cdot \|_\infty$. The inclusion of $C^0(I, R^n)$ into $H_0(I, R^n)$ is evidently bounded.
6.7. Theorem: Let $\sigma \in H_1(I, R^n)$. Then
$$|\sigma(t) - \sigma(s)| \le |t - s|^{1/2}\,|L\sigma|_0.$$
Proof: Apply Schwarz's inequality.
Corollary 1: If $\sigma \in H_1(I, R^n)$ then $\|\sigma\|_\infty \le 2\,|\sigma|_1$.
Corollary 2: The inclusion maps of $H_1(I, R^n)$ into $C^0(I, R^n)$ and $H_0(I, R^n)$ are completely continuous.
The proof of Corollary 1 is trivial. For Corollary 2 apply the Arzelà-Ascoli Theorem.
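The estimate of Theorem 6.7 can be checked numerically. The following is a minimal sketch, not part of the original notes: the sample path $\sigma(t) = (t^2, \sin t)$ in $R^2$ and the function names are assumptions chosen purely for illustration. It compares the largest observed ratio $|\sigma(t) - \sigma(s)| / |t - s|^{1/2}$ on a grid with a quadrature value of $|L\sigma|_0$.

```python
import math

def sigma(t):
    # Hypothetical sample path in R^2 (illustration only).
    return (t * t, math.sin(t))

def sigma_prime(t):
    # Derivative of the sample path above.
    return (2.0 * t, math.cos(t))

def l2_norm_of_derivative(n=20000):
    # Midpoint rule for |L sigma|_0 = (int_0^1 |sigma'(t)|^2 dt)^(1/2).
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        dx, dy = sigma_prime((i + 0.5) * h)
        total += (dx * dx + dy * dy) * h
    return math.sqrt(total)

def max_holder_ratio(grid=200):
    # Largest value of |sigma(t) - sigma(s)| / |t - s|^(1/2) over grid pairs s < t.
    worst = 0.0
    for i in range(grid + 1):
        for j in range(i + 1, grid + 1):
            s, t = i / grid, j / grid
            (x0, y0), (x1, y1) = sigma(s), sigma(t)
            dist = math.hypot(x1 - x0, y1 - y0)
            worst = max(worst, dist / math.sqrt(t - s))
    return worst

norm_L_sigma = l2_norm_of_derivative()
ratio = max_holder_ratio()
print(ratio <= norm_L_sigma)
```

A grid check of this kind only illustrates the inequality for one path; the proof is the Schwarz inequality applied to $\sigma(t) - \sigma(s) = \int_s^t \sigma'(u)\,du$.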
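6.8. Lemma: Let $\varphi : R^n \to R^p$ be a smooth map, and let $\tilde\varphi : H_1(I, R^n) \to H_1(I, R^p)$ be defined by $\tilde\varphi(\sigma) = \varphi \circ \sigma$. Then $\tilde\varphi$ is smooth. Moreover, if $1 \le m \le k$, then
$$d^m\tilde\varphi_\sigma(\lambda_1, \ldots, \lambda_m)(t) = d^m\varphi_{\sigma(t)}(\lambda_1(t), \ldots, \lambda_m(t)).$$
This follows from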
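6.9. Lemma: Let $F$ be a $C^1$ map of $R^n$ into $L^s(R^n, R^p)$, the space of all $s$-linear maps from $R^n$ to $R^p$. Then the map $\tilde F$ of $H_1(I, R^n)$ into $L^s(H_1(I, R^n), H_1(I, R^p))$ defined by
$$\tilde F(\sigma)(\lambda_1, \ldots, \lambda_s)(t) = F(\sigma(t))(\lambda_1(t), \ldots, \lambda_s(t))$$
is continuous. Moreover, if $F$ is $C^3$ then $\tilde F$ is $C^1$ and $d\tilde F = \widetilde{dF}$.
Proof: Observe that
$$\bigl(\tilde F(\sigma)(\lambda_1, \ldots, \lambda_s)\bigr)'(t) = \frac{d}{dt} F(\sigma(t))(\lambda_1(t), \ldots, \lambda_s(t)) = dF_{\sigma(t)}(\sigma'(t))(\lambda_1(t), \ldots, \lambda_s(t)) + \sum_{i=1}^{s} F(\sigma(t))(\lambda_1(t), \ldots, \lambda_i'(t), \ldots, \lambda_s(t)),$$
which implies
$$\bigl|\bigl(\tilde F(\sigma)(\lambda_1, \ldots, \lambda_s)\bigr)'(t)\bigr| \le |dF_{\sigma(t)}|\,|\sigma'(t)|\,|\lambda_1(t)| \cdots |\lambda_s(t)| + \sum_{i=1}^{s} |F(\sigma(t))|\,|\lambda_1(t)| \cdots |\lambda_i'(t)| \cdots |\lambda_s(t)|.$$
Since $\|\lambda_i\|_\infty \le 2|\lambda_i|_1$, and putting $k = \sup_t |dF_{\sigma(t)}|$, we have
$$|dF_{\sigma(t)}(\sigma'(t))(\lambda_1, \ldots, \lambda_s)|_0 \le k\,2^s\,|L\sigma|_0\,|\lambda_1|_1 \cdots |\lambda_s|_1.$$
Since also
$$|F(\sigma(t))(\lambda_1, \ldots, \lambda_i'(t), \ldots, \lambda_s)|_0 \le 2^{s-1} \sup_t |F(\sigma(t))|\,|\lambda_1|_1 \cdots |\lambda_i'|_0 \cdots |\lambda_s|_1,$$
if we recall that $|\rho|_1^2 = |\rho(0)|^2 + |\rho'|_0^2$ we see that
$$|\tilde F(\sigma)(\lambda_1, \ldots, \lambda_s)|_1 \le k(\sigma)\,|\lambda_1|_1 \cdots |\lambda_s|_1,$$
where $k(\sigma)$ is a constant depending on $\sigma$. It follows that (since $\tilde F(\sigma)$ is plainly multilinear) $\tilde F(\sigma) \in L^s(H_1(I, R^n), H_1(I, R^p))$. If $\rho \in H_1(I, R^n)$ then
$$\|(\tilde F(\sigma) - \tilde F(\rho))(\lambda_1, \ldots, \lambda_s)\|_\infty \le 2^s \sup_t |F(\sigma(t)) - F(\rho(t))|\,|\lambda_1|_1 \cdots |\lambda_s|_1,$$
and it is plain that
$$\bigl|\bigl((\tilde F(\sigma) - \tilde F(\rho))(\lambda_1, \ldots, \lambda_s)\bigr)'\bigr|_0 \le 2^s M(\sigma, \rho)\,|\lambda_1|_1 \cdots |\lambda_s|_1,$$
where
$$M(\sigma, \rho) = \sup_t |dF_{\sigma(t)}|\,|\sigma' - \rho'|_0 + \sup_t |dF_{\sigma(t)} - dF_{\rho(t)}|\,|\rho'|_0 + s \sup_t |F(\sigma(t)) - F(\rho(t))|.$$
Hence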
$|\tilde F(\sigma) - \tilde F(\rho)|$ … is given by
$$d\tilde\varphi_\sigma(\lambda)(t) = d\varphi_{\sigma(t)}(\lambda(t)).$$
Observe now that every manifold can, by Whitney's Theorem, be imbedded in a Euclidean space. Hence
6.13. Theorem: $H_1(I, V)$ and $\Omega(V; p, q)$ are Hilbert manifolds, and, by Theorem 6.12, their manifold structure does not depend on the particular imbedding of $V$ used.
The function to which general Morse Theory will be applied in what follows will be the action integral $J^V(\sigma)$, defined for a Riemannian manifold $V$ as follows:
6.14. Definition: For $\sigma \in H_1(I, V)$,
$$J^V(\sigma) = \tfrac{1}{2} \int_0^1 |\sigma'(t)|^2\,dt.$$
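As an illustration of Definition 6.14, here is a minimal sketch that evaluates the action integral for polygonal paths in the simplest case $V = R^2$. This is an assumption made only for the example (the definition itself concerns an arbitrary Riemannian manifold), and the function name `action` and the two sample paths are hypothetical. For a polygonal path sampled at equally spaced times, $\tfrac12 \int_0^1 |\sigma'|^2\,dt$ is exactly $\tfrac12 \sum_i |\sigma(t_{i+1}) - \sigma(t_i)|^2 / h$.

```python
import math

def action(points):
    # Action J = 1/2 * int_0^1 |sigma'(t)|^2 dt of the polygonal path through
    # `points`, sampled at equally spaced times t_i = i/N in [0, 1].
    n_steps = len(points) - 1
    h = 1.0 / n_steps
    total = 0.0
    for a, b in zip(points, points[1:]):
        total += sum((bi - ai) ** 2 for ai, bi in zip(a, b)) / h
    return 0.5 * total

N = 1000
# Straight segment from (0, 0) to (1, 2), parametrized proportionally to arc length.
straight = [(i / N, 2.0 * i / N) for i in range(N + 1)]
# A perturbed path with the same endpoints.
wiggly = [(i / N, 2.0 * i / N + 0.1 * math.sin(math.pi * i / N)) for i in range(N + 1)]

print(action(straight), action(wiggly))
```

In this flat case the straight segment, which is the geodesic joining the endpoints, has action $\tfrac12 |q - p|^2$, and the perturbed path with the same endpoints has strictly larger action.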
We leave to the reader proofs of the following properties of $J^V(\sigma)$.
6.15. Lemma: Let $V$, $W$ be smooth manifolds, $\varphi : V \to W$ an isometry. Then $J^V = J^W \circ \tilde\varphi$.
6.16. Lemma: Let $V$ be a smooth submanifold of $W$. Then $J^V = J^W \mid H_1(I, V)$.
6.17. Lemma: $J^V$ is a smooth functional.
Advice: Prove Lemma 6.17 first for smooth submanifolds of $R^n$ and then for general manifolds using Nash's imbedding theorem for Riemannian manifolds, which was proved as Theorem 2.4.
Next observe that $H_1(I, R^n)$, as a Hilbert space, has a natural Riemannian structure. Hence for all manifolds $V$, $H_1(I, V)$ will also have a Riemannian structure. But the situation here is not so pleasant as in connection with the differentiable structure of $H_1(I, V)$, since now, in general, this Riemannian structure will depend on the imbedding $V \to R^n$. However, this will not bother us at all.
A second observation is the following. Let $W$ be a complete Riemannian manifold, $W_1$ a closed submanifold of $W$ inheriting from $W$ its Riemannian structure; let $\varrho_{W_1}$, $\varrho_W$ be the respective Riemannian metrics. It is clear that if $p, q \in W_1$, then $\varrho_{W_1}(p, q) \ge \varrho_W(p, q)$, since, by definition, the right side is an infimum over a larger set than the infimum on the left. Hence the Riemannian structure of $W_1$ is also complete. Putting our observations together, we get
6.18. Theorem: Let $V$ be a smooth submanifold of $R^n$. Then $H_1(I, V)$ is a complete smooth Riemannian manifold with the Riemannian structure inherited from $H_1(I, R^n)$, where $R^n$ is any Euclidean space in which $V$ is isometrically imbedded.
The same reasoning shows that $\Omega(V; p, q)$ is also a complete Riemannian manifold. Regarded as a submanifold of $H_1(I, R^n)$, the scalar product in it is simply $(\rho, \lambda)_\sigma = (L\rho, L\lambda)_0$.
One more preliminary needed for the application of Morse Theory is the verification of the Palais-Smale condition for the action integral. To remind the reader of the nature of this condition we write it down again:
PS condition: Suppose that, for a sequence $\sigma_n$, $\nabla J^V(\sigma_n) \to 0$ and $J^V(\sigma_n)$ is bounded. Then there exists a subsequence $\sigma_{n_i}$ convergent to an element $\sigma \in H_1(I, V)$.
We proceed to establish this condition for $J^V$ in a series of substeps. We suppose throughout that $V$ has been isometrically imbedded in a Euclidean space $R^n$. In what follows, $L$ is the operator of Definition 6.3.
6.19. Lemma: Let $\{\sigma_n\}$ be a sequence in $\Omega(V; p, q)$ such that $|L(\sigma_n - \sigma_m)|_0 \to 0$ as $n, m \to \infty$. Then $\sigma_n$ converges in $\Omega(V; p, q)$.
Proof: Evidently $\sigma_n \in H_1(I, R^n)$. Since $\sigma_n(0) = p$ for every $n$, we have $|\sigma_n - \sigma_m|_1 = |L(\sigma_n - \sigma_m)|_0$, so $\{\sigma_n\}$ is Cauchy in $H_1(I, R^n)$ and hence convergent. But $\Omega(V; p, q)$ is closed in $H_1(I, R^n)$. Q.E.D.
6.20. Definition: Let $p, q \in V$. If $\sigma \in \Omega(V; p, q)$ we define $h(\sigma)$ to be the orthogonal projection of $L\sigma$ onto the orthogonal complement of $L(\Omega(V; p, q)_\sigma)$ in $H_0(I, R^n)$.
6.21. Theorem: Let $J = J^V \mid \Omega(V; p, q)$. If we consider $\Omega(V; p, q)$ as a Riemannian manifold with the structure induced on it as a closed submanifold of $H_1(I, R^n)$, then for each $\sigma \in \Omega(V; p, q)$, $(\nabla J)(\sigma)$ can be characterized as the unique element of $\Omega(V; p, q)_\sigma$ mapped by $L$ onto $L\sigma - h(\sigma)$. Moreover $|\nabla J(\sigma)|_\sigma = |L\sigma - h(\sigma)|_0$.
Proof: Note that $\Omega(V; p, q)_\sigma$ is a closed subspace of $H_1(I, R^n)$ and is contained in $H_1^*(I, R^n)$. It follows from Theorem 6.4 that $L$ maps $\Omega(V; p, q)_\sigma$ isometrically onto a closed subspace of $H_0(I, R^n)$. Since $L\sigma - h(\sigma)$ is orthogonal to $L(\Omega(V; p, q)_\sigma)^\perp$, $L\sigma - h(\sigma) = L\lambda$, $\lambda \in \Omega(V; p, q)_\sigma$, with $\lambda$ unique and $|\lambda|_\sigma = |L\lambda|_0 = |L\sigma - h(\sigma)|_0$. It will suffice to prove that $dJ_\sigma(\rho) = (\lambda, \rho)_\sigma$ for $\rho \in \Omega(V; p, q)_\sigma$, i.e., that $dJ_\sigma(\rho) = (L\lambda, L\rho)_0 = (L\sigma - h(\sigma), L\rho)_0$ for $\rho \in \Omega(V; p, q)_\sigma$. Since $(h(\sigma), L\rho)_0 = 0$ for $\rho \in \Omega(V; p, q)_\sigma$, we must prove that $dJ_\sigma(\rho) = (L\sigma, L\rho)_0$ for $\rho \in \Omega(V; p, q)_\sigma$. But $J^{R^n}(\sigma) = \tfrac12 |L\sigma|_0^2$, so $dJ^{R^n}_\sigma(\rho) = (L\sigma, L\rho)_0$ for $\rho \in H_1(I, R^n)$. Since $J = J^{R^n} \mid \Omega(V; p, q)$, it follows that
$$dJ_\sigma = dJ^{R^n}_\sigma \mid \Omega(V; p, q)_\sigma.$$
Q.E.D.
6.22. Definition: Let $\overline{\Omega(V; p, q)_\sigma}$ be the closure of $\Omega(V; p, q)_\sigma$ in $H_0(I, R^n)$, and let $P_\sigma$ be the orthogonal projection of $H_0(I, R^n)$ on $\overline{\Omega(V; p, q)_\sigma}$. For each point $r \in V$, let $\Omega(r)$ denote the orthogonal projection of $R^n$ onto the tangent space $V_r$ to $V$ at $r$.
6.23. Theorem: The functional $J$ of Theorem 6.21 satisfies the Palais-Smale condition.
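Proof: Let $\{\sigma_n\}$ be a sequence in $\Omega(V; p, q)$ such that $|J(\sigma_n)| < M$, $\nabla J(\sigma_n) \to 0$. Since, by Theorem 6.21,
$$|\nabla J(\sigma_n)|_{\sigma_n} = |L\sigma_n - h(\sigma_n)|_0,$$
we have $|L\sigma_n - h(\sigma_n)|_0 \to 0$. Since each $P_\sigma$ is a projection, hence norm-decreasing, and since $P_{\sigma_n} L\sigma_n = L\sigma_n$ (Corollary 2 of Lemma 6.24 below), it follows that $|L\sigma_n - P_{\sigma_n} h(\sigma_n)|_0 \to 0$, and by Corollary 2 of 6.7 we can assume, on passing to a subsequence, that $\|\sigma_n - \sigma_m\|_\infty \to 0$ as $m, n \to \infty$. We need only prove that $|L(\sigma_n - \sigma_m)|_0 \to 0$ for $m, n \to \infty$, for then it follows from Lemma 6.19 that $\sigma_n$ will converge in $\Omega(V; p, q)$ to a $\sigma$ in $\Omega(V; p, q)$. But
$$|L(\sigma_n - \sigma_m)|_0^2 = (L\sigma_n, L(\sigma_n - \sigma_m))_0 - (L\sigma_m, L(\sigma_n - \sigma_m))_0.$$
Thus it suffices to prove that $(L\sigma_n, L(\sigma_n - \sigma_m))_0 \to 0$ as $m, n \to \infty$. Since $|L\sigma_n|_0^2 = 2J(\sigma_n)$ is bounded, $|L(\sigma_n - \sigma_m)|_0$ is bounded also and, since $L\sigma_n - P_{\sigma_n} h(\sigma_n) \to 0$ in $H_0(I, R^n)$, it suffices to prove that
$$(P_{\sigma_n} h(\sigma_n), L(\sigma_n - \sigma_m))_0 \to 0 \quad \text{as } m, n \to \infty.$$
We now refer to Lemma 6.24 below and note that it follows from this Lemma that if $\sigma \in \Omega(V; p, q)$ then $P_\sigma f$ belongs to $\Omega(V; p, q)_\sigma$ if $f$ is smooth and vanishes for $t = 0$ and $t = 1$. Since $h(\sigma)$ is orthogonal to $LP_\sigma f$ in this case, we have $(h(\sigma), LP_\sigma f) = 0$ for all such $\sigma$ and $f$. Thus
$$(P_\sigma h(\sigma), Lf) = (h(\sigma), (P_\sigma L - LP_\sigma) f) \qquad (*)$$
for $\sigma \in \Omega(V; p, q)$ and smooth $f$ vanishing at $t = 0, 1$. If we put $Q_\sigma(t) = (d/dt)\,\Omega(\sigma(t))$, it follows by differentiation from $(*)$ that
$$(P_\sigma h(\sigma), Lf) = -(h(\sigma), Q_\sigma f) = -(Q_\sigma h(\sigma), f)$$
for smooth $f$ vanishing at $t = 0, 1$, and hence, by a limit argument, for all $f \in H_1^*(I, R^n)$. Since $\sigma_n - \sigma_m \in H_1^*(I, R^n)$ it follows that
$$|(P_{\sigma_n} h(\sigma_n), L(\sigma_n - \sigma_m))_0| = \left| \int_0^1 (Q_{\sigma_n}(t)\,h(\sigma_n)(t), (\sigma_n - \sigma_m)(t))\,dt \right| \le \|\sigma_n - \sigma_m\|_\infty \int_0^1 |Q_{\sigma_n}(t)\,h(\sigma_n)(t)|\,dt,$$
so it suffices to show that the last integral is bounded. Let $A$ be a compact set such that $\sigma_n(I) \subset A$. Then there exists $K$ such that
$$\int_0^1 |Q_{\sigma_n}(t)\,h(\sigma_n)(t)|\,dt \le K\,|L\sigma_n|_0\,|h(\sigma_n)|_0.$$
Now, since $|L\sigma_n|_0$ is bounded and since $|L\sigma_n - h(\sigma_n)|_0 \to 0$, $|h(\sigma_n)|_0$ is bounded and the theorem follows. Q.E.D.
Finally, we relate critical points of the action integral $J$ with geodesics, and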
find conditions under which these critical points are nondegenerate. We will not discuss the geometry of geodesics of a finite-dimensional manifold in detail, but refer the reader instead to (3) or (4) of the Bibliography.
6.24. Lemma: Let $\sigma \in \Omega(V; p, q)$. Then $\overline{\Omega(V; p, q)_\sigma} = \{\lambda \in H_0(I, R^n) \mid \lambda(t) \in V_{\sigma(t)} \text{ for almost all } t \in I\}$. If $\lambda \in H_0(I, R^n)$ then $(P_\sigma \lambda)(t) = \Omega(\sigma(t))\,\lambda(t)$.
Proof: Let $\pi_\sigma \in L(H_0(I, R^n), H_0(I, R^n))$ be defined by $(\pi_\sigma \lambda)(t) = \Omega(\sigma(t))\,\lambda(t)$. Since $\Omega(\sigma(t))$ is an orthogonal projection in $R^n$ for each $t \in I$, it follows from the definition of the inner product in $H_0(I, R^n)$ that $\pi_\sigma$ is an orthogonal projection. From the characterization of $\Omega(V; p, q)_\sigma$ it is clear that $\pi_\sigma$ maps $H_1^*(I, R^n)$ onto $\Omega(V; p, q)_\sigma$. Since $H_1^*(I, R^n)$ is dense in $H_0(I, R^n)$ it follows that the range of $\pi_\sigma$ is $\overline{\Omega(V; p, q)_\sigma}$, so $\pi_\sigma = P_\sigma$. On the other hand, $\lambda \in H_0(I, R^n)$ is fixed under $\pi_\sigma$ if and only if $\lambda(t) \in V_{\sigma(t)}$ for almost all $t \in I$. Since the range of a projection is its set of fixed points, this proves our lemma. Q.E.D.
The following are obvious consequences of the lemma.
Corollary 1: If $\sigma \in \Omega(V; p, q)$, then $P_\sigma(H_1(I, R^n)) = H_1(I, V)_\sigma$ and $P_\sigma(H_1^*(I, R^n)) = \Omega(V; p, q)_\sigma$.
Corollary 2: If $\sigma \in \Omega(V; p, q)$ then $P_\sigma L\sigma = L\sigma$.
Another simple result, whose proof is left as an exercise, is
6.25. Lemma: Let $T \in H_0(I, L(R^n, R^p))$ and define for each $\lambda \in H_0(I, R^n)$ a measurable function $T(\lambda) : I \to R^p$ by $T(\lambda)(t) = T(t)\,\lambda(t)$. Then
(1) $T$ is bounded from $H_0(I, R^n)$ to $L_1(I, R^p)$;
(2) If $T$ and $\lambda$ are absolutely continuous then so is $T(\lambda)$, and $(T\lambda)'(t) = T'(t)\,\lambda(t) + T(t)\,\lambda'(t)$;
(3) If $T \in H_1(I, L(R^n, R^p))$, $\lambda \in H_1(I, R^n)$, then $T(\lambda) \in H_1(I, R^p)$.
6.26. Definition: Let $\sigma \in \Omega(V; p, q)$. Define $G_\sigma \in H_1(I, L(R^n, R^n))$ by $G_\sigma = \Omega \circ \sigma$ and $Q_\sigma \in H_0(I, L(R^n, R^n))$ by $Q_\sigma = G_\sigma'$.
6.27. Theorem: Let $\sigma \in \Omega(V; p, q)$. Let $P_\sigma$ be as in Definition 6.22. If $\rho \in H_1(I, R^n)$, then $(LP_\sigma - P_\sigma L)\rho(t) = Q_\sigma(t)\,\rho(t)$. Given $f \in H_0(I, R^n)$, define an absolutely continuous map $g : I \to R^n$ by
$$g(t) = \int_0^t Q_\sigma(s)\,f(s)\,ds.$$
Then, if $\rho \in H_1^*(I, R^n)$,
$$(f, (LP_\sigma - P_\sigma L)\rho)_0 = -(g, L\rho)_0.$$
Proof: Since $P_\sigma \rho(t) = G_\sigma(t)\,\rho(t)$ and $P_\sigma(L\rho)(t) = G_\sigma(t)\,\rho'(t)$ by 6.24, $(LP_\sigma - P_\sigma L)\rho(t) = Q_\sigma(t)\,\rho(t)$ follows immediately by differentiation. By (1) of Lemma 6.25, $s \mapsto Q_\sigma(s)\,f(s)$ is summable, so $g$ is absolutely continuous. Next note that, since $G_\sigma(t) = \Omega(\sigma(t))$ is self-adjoint for all $t$, $Q_\sigma(t) = G_\sigma'(t)$ is self-adjoint wherever defined, and hence
$$(f, (LP_\sigma - P_\sigma L)\rho)_0 = \int_0^1 (f(t), Q_\sigma(t)\,\rho(t))\,dt = \int_0^1 (Q_\sigma(t)\,f(t), \rho(t))\,dt = \int_0^1 (g'(t), \rho(t))\,dt.$$
Then if $\rho \in H_1^*(I, R^n)$, integration by parts as in Theorem 6.5 gives
$$(f, (LP_\sigma - P_\sigma L)\rho)_0 = -(g, L\rho)_0.$$
Q.E.D.
6.28. Theorem: Let $h(\sigma)$ be as in Definition 6.20. If $\sigma \in \Omega(V; p, q)$ then $P_\sigma h(\sigma)$ is absolutely continuous and $(P_\sigma h(\sigma))'(t) = Q_\sigma(t)\,h(\sigma)(t)$.
Proof: If $\rho \in H_1^*(I, R^n)$ then $(P_\sigma h(\sigma), L\rho)_0 = (h(\sigma), P_\sigma L\rho)_0 = (h(\sigma), (P_\sigma L - LP_\sigma)\rho)_0$, since $(h(\sigma), LP_\sigma \rho) = 0$. Hence $(P_\sigma h(\sigma), L\rho)_0 = (g, L\rho)_0$ if we define $g$ to be
$$g(t) = \int_0^t Q_\sigma(s)\,h(\sigma)(s)\,ds.$$
Then $P_\sigma h(\sigma) - g \perp L(H_1^*(I, R^n))$, whence $P_\sigma h(\sigma) - g$ = constant. Since $g$ is absolutely continuous so is $P_\sigma h(\sigma)$, and they have the same derivative. But $g'(t) = Q_\sigma(t)\,h(\sigma)(t)$. Q.E.D.
6.29. Theorem: Let $\sigma$ be a critical point of $J$. Then $\sigma$ is smooth and, moreover, $\sigma'' \perp V$ everywhere. Conversely, if $\sigma \in \Omega(V; p, q)$, $\sigma'$ is absolutely continuous and $\sigma'' \perp V$ a.e., then $\sigma$ is a critical point of $J$.
Proof: By Theorem 6.21, if $\sigma$ is a critical point of $J$, then $L\sigma = h(\sigma)$. Since $P_\sigma L\sigma = L\sigma$, it follows that $P_\sigma h(\sigma) = h(\sigma)$, so by Theorem 6.28 $\sigma'$ is absolutely continuous (so that $\sigma$ is $C^1$) and
$$\sigma''(t) = Q_\sigma(t)\,\sigma'(t). \qquad (*)$$
Now since $\Omega : V \to L(R^n, R^n)$ is smooth, using 6.26 we have
$$Q_\sigma(t) = \frac{d}{dt}\,\Omega(\sigma(t)).$$
It follows that if $\sigma$ is $C^m$, then $Q_\sigma$ is $C^{m-1}$, so by $(*)$ $\sigma''$ is $C^{m-1}$, which implies that $\sigma$ is $C^{m+1}$. Since we already know $\sigma$ is $C^1$, it follows that $\sigma$ is smooth. If $\rho \in \Omega(V; p, q)_\sigma$ then $L\sigma = h(\sigma)$ is orthogonal to $L\rho$, so that $\sigma''$ is orthogonal to $\rho$. Since $\sigma''$ and $\rho$ are continuous, it follows that $(\sigma''(t), \rho(t)) = 0$, $t \in I$. If $t \in I$ is not an endpoint and $v_0 \in V_{\sigma(t)}$, then there exists $\rho \in \Omega(V; p, q)_\sigma$ such that $\rho(t) = v_0$; hence $\sigma''(t)$ is orthogonal to $V_{\sigma(t)}$ and, by continuity, this holds also at the endpoints. Conversely, if $\sigma \in \Omega(V; p, q)$ is such that $\sigma'$ is absolutely continuous and $\sigma'' \perp V_{\sigma(t)}$ for almost all $t \in I$, then $L\sigma \perp L(\Omega(V; p, q)_\sigma)$, so $L\sigma = h(\sigma)$ and $\sigma$ is a critical point of $J$. Q.E.D.
The last step in the characterization of critical points of $J$ is supplied by the well-known result of classical differential geometry (see (3) and (4)) that, if $\sigma \in C^2(I, V)$, $\sigma$ is a geodesic of $V$ parametrized proportionally to arc length if and only if $\sigma'' \perp V$ everywhere. We obtain the following conclusion.
6.30. Theorem: If $\sigma \in \Omega(V; p, q)$, then $\sigma$ is a critical point of $J$ if and only if $\sigma$ is a geodesic of $V$ parametrized proportionally to arc length.
We must now determine when an extremal point of $J$ will be degenerate. We limit ourselves to a brief exposition and to suggesting that the reader consult (3).
Let $E$ denote the exponential map of $V_p$ into $V$; i.e. if $v \in V_p$, then $E(v) = \sigma(|v|)$, where $\sigma$ is the geodesic starting from $p$ with tangent vector $v/|v|$. Then $E$ is smooth. Given $v \in V_p$ we define $\lambda(v)$ = dimension of the null-space of $dE_v$. If $\lambda(v) > 0$, we call $v$ a conjugate vector at $p$. A point of $V$ is called a conjugate point of $p$ if it is in the image under $E$ of the set of conjugate vectors at $p$. By Sard's theorem the set of conjugate points of $p$ has measure zero in $V$. Given $v \in E^{-1}(q)$, define $\tilde v \in \Omega(V; p, q)$ by $\tilde v(t) = E(tv)$. Then $\tilde v$ is a geodesic parametrized proportionally to arc length (factor: $|v|$) and hence a critical point of $J$. Conversely, any critical point of $J$ is of the form $\tilde v$ for a unique $v \in E^{-1}(q)$. We may now state the following two theorems:
6.30. Nondegeneracy theorem: If $v \in E^{-1}(q)$ then $\tilde v$ is a degenerate critical point of $J$ if and only if $v$ is a conjugate vector at $p$. Hence $J$ has only nondegenerate critical points if and only if $q$ is not a conjugate point of $p$. This condition is satisfied if $q$ lies outside of a set of measure zero in $V$.
6.31. Morse index theorem: Let $v \in E^{-1}(q)$. Then there are only a finite number of $t$ satisfying $0 < t < 1$ such that $tv$ is a conjugate vector at $p$. The index of $\tilde v$ is $\sum_{0 < t < 1} \lambda(tv)$. In particular each critical point of $J$ has finite index.
… the number of negative
eigenvalues is precisely equal to the number of eigenvalues which have crossed from positive to negative as a has increased from zero to its given value.
The above arguments establish the following lemma:
6.32. Lemma:
(i) The Hessian matrix $\delta^2 J^V(\sigma; \rho, \eta)$ is singular, i.e., the critical point $\sigma$ of the functional $J^V$ is degenerate, if and only if the equation (*) has a nonzero solution $\rho = (\rho^i)$ satisfying $\rho^i(0) = \rho^i(1) = 0$.
(ii) Let $0 < a_1 < a_2 < \cdots < a_r < 1$ be the values of $a$ for which the differential equation (*) has a nonzero solution $\rho = (\rho^i)$ satisfying $\rho^i(0) = \rho^i(a) = 0$; and let $n(a)$ be the number of such linearly independent solutions attaching to the value $a$. Then the Morse index of the critical point $\sigma$, i.e., the number of negative eigenvalues of the Hessian matrix $\delta^2 J^V$, is equal to the sum $n(a_1) + \cdots + n(a_r)$.
Next we shall need the following Lemma.
6.33. Lemma: Let $n(a)$ be defined for each $a$ as in (ii) of the preceding Lemma. Let $v \to E(v)$ be the exponential transformation, which sends each tangent vector $v$ at the point $p$ of the manifold $V$ into the point $\sigma_v(|v|)$, where $\sigma_v$ is the geodesic starting from $p$ with tangent vector $v/|v|$. Let $v_0$ be such that $\sigma = \sigma_{v_0}$. Let $dE_{v_0}$ be the gradient of the map $E$ at the point $v_0$. Then $n(1)$ is equal to the dimension of the null-space of the linear transformation $dE_{v_0}$.
Before giving the proof of this Lemma, let us note that the Nondegeneracy Theorem and the Morse Index Theorem follow readily from the two preceding lemmas. Indeed, since $n(1) = 0$ is the criterion for nondegeneracy according to the first of our two lemmas, the Nondegeneracy Theorem follows immediately from the second lemma. As to the Morse Index Theorem, we note that, applying the second of our Lemmas to each of the geodesic segments $\sigma(t)$, $0 \le t \le a$, with $a \le 1$, we find that $n(a) = \lambda(a v_0)$ for $0 \le a \le 1$. Thus the Morse Index Theorem follows at once from part (ii) of the first of our Lemmas. Let us now give the proof of the second lemma:
Proof: Put $\sigma_v(t) = E(vt)$, so that $\sigma_v$ is a geodesic curve parametrized proportionally to arc length, whose tangent vector at $t = 0$ is $v$. Since for any $v$, $\sigma_v$ is a critical point of the functional $J^V(\sigma)$, we have
$$\frac{d}{d\varepsilon} J^V(\sigma_v + \varepsilon\rho)\Big|_{\varepsilon=0} = 0$$
for every function $\rho = (\rho^i)$ vanishing for $t = 0$ and $t = 1$. Thus, if $\bar v$ is any vector tangent to $V$ at the same point $p$ as $v$, we have
$$\frac{\partial^2}{\partial\tau\,\partial\varepsilon} J^V(\sigma_{v + \tau\bar v} + \varepsilon\rho)\Big|_{\varepsilon=0,\ \tau=0} = 0$$
for all $\rho$. That is,
$$\delta^2 J^V\Bigl(\sigma_v;\ \frac{d}{d\tau}\sigma_{v+\tau\bar v}\Big|_{\tau=0},\ \rho\Bigr) = 0$$
for all $\rho$ vanishing for $t = 0$ and $t = 1$. If we note that the differential equation [*] is derived from the variational condition [†] on integrating by parts, we see at once from this last equation that the function
$$\Lambda_{\bar v}(t) = \frac{d}{d\tau}\sigma_{v+\tau\bar v}(t)\Big|_{\tau=0}$$
satisfies the linear differential equation [*]. We have
$$\frac{\partial^2}{\partial\tau\,\partial t}\sigma_{v+\tau\bar v}(t)\Big|_{t=\tau=0} = \frac{d}{d\tau}(v + \tau\bar v)\Big|_{\tau=0} = \bar v;$$
thus $\Lambda_{\bar v}$ satisfies the initial conditions $\Lambda_{\bar v}(0) = 0$, $\Lambda_{\bar v}'(0) = \bar v$. On the other hand, taking $t = 1$, we find that
$$\Lambda_{\bar v}(1) = \frac{d}{d\tau}\sigma_{v+\tau\bar v}(1)\Big|_{\tau=0} = \frac{d}{d\tau}E(v + \tau\bar v)\Big|_{\tau=0} = dE_v(\bar v).$$
Thus the dimension of the null-space of $dE_v$ is at the same time the dimension of the null-space of the map $\bar v \to \Lambda_{\bar v}(1)$; and hence equals the dimension of the space of vectors $\bar v$ such that $\Lambda_{\bar v}$ satisfies both the boundary conditions $\Lambda_{\bar v}(0) = 0$ and $\Lambda_{\bar v}(1) = 0$. That is, the dimension of the null-space of $dE_v$ is the integer $n(1)$ of Lemma 6.33, and thus the proof of Lemma 6.33 is complete. Q.E.D.
CHAPTER VII
Applications
A. Applications to Homotopy Theory
B. A Proof of Theorem 5.16
C. The Homotopy of Some Lie Groups
A. Applications to Homotopy Theory
We first recall the definition of the homotopy groups of a space. Let $X$ be a topological space, $A$ a subspace of $X$ and $p \in A$. Let $I^n$ denote the $n$-dimensional cube, $I^{n-1} \subset I^n$ the bottom face, and $J^{n-1}$ the union of all the other faces, so that $J^{n-1} = \partial I^n - I^{n-1}$. We shall write
$$f : (I^n, I^{n-1}, J^{n-1}) \to (X, A, p)$$
for any continuous function $f : I^n \to X$ which maps $I^{n-1}$ into $A$ and $J^{n-1}$ on $p$. We denote by $\Omega_n(X, A, p)$ the space of all such functions, and by $\pi_n(X, A, p)$ the set of all components of $\Omega_n(X, A, p)$. $\pi_n(X, A, p)$ has a well-known group structure (by taking representatives of two elements of $\pi_n$, reparametrizing and then "joining" them). We call it the $n$-dimensional homotopy group of $X$ relative to $A$ with base point $p$. The following are easily proved properties of the groups $\pi_n$:
(i) By reparametrizing we get for $n \ge 2$
$$\Omega_n(X, A, p) \cong \Omega_1(\Omega_{n-1}(X, A, p), \varphi_0, \varphi_0),$$
where $\varphi_0$ is the constant map sending $I^{n-1}$ to $p$. Hence $\pi_n(X, A, p) = \pi_1(\Omega_{n-1}(X, A, p), \varphi_0, \varphi_0)$.
(ii) In the same way we prove that for $n \ge 2$, $\pi_n(X, A, p) = \pi_{n-1}(\Omega_1(X, A, p), \varphi_0, \varphi_0)$, where $\varphi_0$ sends $I^1$ onto $p$.
(iii) The identity of $\pi_n(X, A, p)$ is the class of maps homotopic to the constant map $\varphi_0 : I^n \to p$.
(iv) If $\varphi(I^n) \subset A$, then $\varphi$ is homotopic to the identity. (Shrink $I^n$ by means of a function $m_t$, so that $\varphi \circ m_t$ is a homotopy between $\varphi$ and $\varphi_0$.) In other words, $\pi_n(X, X, p)$ is trivial for any $p \in X$.
In the sequel we shall write $\pi_n(X, p)$ for $\pi_n(X, p, p)$. It is easy to see that $\pi_n(X, p)$ is, in fact, the usual "absolute" $n$-dimensional homotopy group of $X$ with base point $p$. Also, $\pi_n(X, A, p)$ will often be written $\pi_n(X, A)$, when no confusion can arise. (Of course if $A$ is arcwise connected, $\pi_n(X, A, p)$ does not depend on $p$.)
Suppose we have a map $\psi : (X, A, p) \to (Y, B, q)$. Then $\psi$ induces a map $\psi_* : \pi_n(X, A, p) \to \pi_n(Y, B, q)$. (Just send $\varphi \in \Omega_n(X, A, p)$ into $\psi \circ \varphi \in \Omega_n(Y, B, q)$.)
7.1. Definition: The boundary homomorphism $\partial : \pi_n(X, A, p) \to \pi_{n-1}(A, p, p)$, or briefly $\partial : \pi_n(X, A) \to \pi_{n-1}(A)$, is defined as follows. Given $q \in \pi_n(X, A)$, take $\varphi \in q$; then $\varphi \mid I^{n-1}$ belongs to $\Omega_{n-1}(A, p, p)$ and so determines a class $\partial q \in \pi_{n-1}(A)$.
We state the following without proof.
7.2. Theorem: Let $i$ be the injection $(A, p) \to (X, p)$ and $j$ the injection $(X, p) \to (X, A)$. Then the following sequence is exact:
$$\cdots \to \pi_n(X, A) \xrightarrow{\ \partial\ } \pi_{n-1}(A, p) \xrightarrow{\ i_*\ } \pi_{n-1}(X, p) \xrightarrow{\ j_*\ } \pi_{n-1}(X, A) \to \cdots$$
This is the analog for homotopy groups of the Exactness Principle given in Chapter IV, Part 2, § E, of these Notes.
Now, suppose we have a manifold $M$, $f \in C^\infty(M)$, satisfying the PS condition, and let as usual $M^a = \{x \in M;\ f(x) \le a\}$, $M^b = \{x \in M;\ f(x) \le b\}$. If there are only nondegenerate critical levels between $a$ and $b$, $M^b$ is deformable to $M^a$ with handles attached:
$$M^b \sim M^a \cup_{h_1} (D^{k_1} \times D^{l_1}) \cup_{h_2} (D^{k_2} \times D^{l_2}) \cup_{h_3} \cdots \qquad (1)$$
Let $A$ be another manifold, and $\varphi$ a mapping $\varphi : A \to M^b$. Assume that $\dim(A)$ is less than the index $k_i$ of any critical point in (1) and that $A$ is compact.
Next, note that $\varphi$ can be deformed to a smooth map, and set $A_1 = \varphi^{-1}(h_1(D^{k_1} \times D^{l_1}))$. Consider $h_1^{-1}\varphi : A_1 \to D^{k_1} \times D^{l_1}$. Let $p_1$ be the projection map of $D^{k_1} \times D^{l_1}$ onto $D^{k_1}$. Then $p_1 h_1^{-1}\varphi$ is smooth and maps $A_1$ into $D^{k_1}$. But $\dim(A_1) < k_1$, whence some point in $D^{k_1}$ does not belong to the range of $p_1 h_1^{-1}\varphi$, and the same holds for the other indices $k_2$, etc.
Now a manifold of the type
$$M^a \cup_{h_1} ((D^{k_1} - q_1) \times D^{l_1}) \cup_{h_2} ((D^{k_2} - q_2) \times D^{l_2}) \cup_{h_3} \cdots$$
can be deformed into $M^a$ (see drawing). Hence, $\varphi$ can be deformed to a map $A \to M^a$.
As a special case we get:
7.3. Theorem: $\pi_n(M^b, M^a) = 0$ if $n$ < index of any critical point between $a$ and $b$.
7.4. Corollary: If Morse theory applies to $(M, f)$ and if above some noncritical level $c$ all critical points have indices greater than $n$, then $\pi_n(M, M^c) = 0$.
We will now apply our results to the topology of spheres, in order to obtain the so-called Freudenthal suspension relation between homotopy groups.
First we recall that in relation to $H_1(S^j; p, q)$ and the function $J$, the geodesics joining $p$ and $q$ are critical points whose indices depend on the length of the geodesic: if length$(\gamma) = \pi - \varepsilon$ for any $0 < \varepsilon < \pi$, then index$(\gamma) = 0$; if length$(\gamma) = \pi + \varepsilon$, then index$(\gamma) = j - 1$; if length$(\gamma) = 3\pi - \varepsilon$, index$(\gamma) = 2(j - 1)$, and so on. This follows from the Morse Index Theorem 6.31. Suppose that we have, as before, a map $\varphi : A \to H_1(S^j; p, q)$, where $p \ne q$ and $p \ne q'$, the conjugate of $q$, and that $\dim(A) < 2(j - 1)$. Then by 7.3 $\varphi$ is homotopic to a map whose range contains curves of length at most $\pi + 2\varepsilon$. Now assume that length$(\sigma) < \pi + 2\varepsilon$. Let $m$ be the midpoint of $\sigma$: $m = \sigma(\tfrac12)$. It is easy to see that $m : H_1(S^j; p, q) \to S^j$ is a smooth map. We have $d(p, m) < \tfrac12\pi + \varepsilon$ and $d(q, m) < \tfrac12\pi + \varepsilon$. This implies that there are unique geodesics $\sigma_1$ joining $p$ and $m$ and $\sigma_2$ joining $q$ and $m$ (see drawing below), if $\varepsilon$ is small enough.
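The index pattern quoted above for geodesics on $S^j$ can be tabulated with a small helper. The following is a sketch under stated assumptions, not a formula taken from these notes: it uses the standard fact that along a geodesic of the round unit sphere the conjugate points to the initial point occur at distances $\pi, 2\pi, \ldots$, each with multiplicity $j - 1$, and it assumes the length is not an integer multiple of $\pi$. The name `geodesic_index_on_sphere` is hypothetical.

```python
import math

def geodesic_index_on_sphere(j, length):
    # Morse index of a geodesic of the given length on the unit sphere S^j,
    # assuming `length` is not an integer multiple of pi.
    # Interior conjugate points sit at distances pi, 2*pi, ... < length,
    # and each contributes multiplicity j - 1 (Morse Index Theorem).
    conjugate_count = math.floor(length / math.pi)
    return (j - 1) * conjugate_count

eps = 0.1
j = 3
indices = [
    geodesic_index_on_sphere(j, math.pi - eps),
    geodesic_index_on_sphere(j, math.pi + eps),
    geodesic_index_on_sphere(j, 3 * math.pi - eps),
]
print(indices)
```

With $j = 3$ the three lengths $\pi - \varepsilon$, $\pi + \varepsilon$, $3\pi - \varepsilon$ give the indices $0$, $j - 1$ and $2(j - 1)$ stated in the text.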
Then d (a(t), al(t)) 5 1(2c + 2e) + j (n + 2E) < n if e is small enough (e being the distance between p' and q). Hence in this case a(t) and al(t) are connected by a unique shortest geodesic varying continuously with t, whence a can be deformed through these geodesics into a1. The same holds for a2. Thus
any map $\varphi : A \to H_1(S^j; p, q)$ is homotopic to a map $\bar\varphi : A \to H_1(S^j; p, q)$ such that each value $\bar\varphi(a)$ is a broken geodesic of two segments and total length less than $\pi + 2\varepsilon$. It follows that the space of maps $A \to H_1$ is of the same homotopy type as the space of maps with values in a "belt", and hence of the same homotopy type as the space of maps $A \to S^{j-1}$ (see figure below).
Now, it can readily be proved that $H_1(S^j; p, q)$ is of the same homotopy type as $H_1(S^j; \bar p, \bar q)$; that is, the homotopy type does not depend on the points $p$ and $q$. Thus our result is independent of the relative position of $p$ and $q$. In particular, we obtain:
7.5. Theorem: If $\dim(A) < 2(j - 1)$, the space of maps $A \to \Omega_1(S^j; p, q)$ is of the same homotopy type as the space of maps $A \to S^{j-1}$.
7.6. Corollary: For $n < 2(j - 1)$, $\pi_n(\Omega_1(S^j; p, q)) \cong \pi_n(S^{j-1})$.
By property (ii) of the homotopy groups, we obtain
7.7. Corollary: $\pi_{n+1}(S^{j+1}) \cong \pi_n(S^j)$, for $n < 2j$.
Corollary 7.7 is known as the Freudenthal suspension relation.
7.8. Corollary: $\pi_n(S^n) \cong \pi_{n+1}(S^{n+1})$ if $n > 0$, whence $\pi_n(S^n) = Z$ if $n > 0$.
B. A Proof of Theorem 5.16
Let $X \xrightarrow{\ p\ } B$ be a fiber space. Also let $\varphi : A \to X$ and $\psi = p\varphi : A \to B$. We say that the homotopy $\psi_t$ of $\psi$ has the "lifting property" if there exists a homotopy $\varphi_t$ of $\varphi$ such that $\psi_t = p\varphi_t$.
Example: If $X = B \times C$ and $p_1 : X \to B$ is the natural projection on $B$ and $p_2 : X \to C$ that on $C$, and if $\varphi$ and $\psi$ are two functions as above, then given a homotopy $\psi_t$ the map $\varphi_t = (\psi_t, p_2\varphi)$ has the required properties. We state without proof the following
7.9. Theorem (Künneth) [Cf., for example, Hilton and Wylie, Homology Theory.]
$$H_n(B \times C; G) = \sum_{k+l=n} \oplus\, H_k(B; H_l(C; G)).$$
7.10. Corollary: Suppose $G$ = real numbers. Then
$$b_n(B \times C) = \sum_{k+l=n} b_k(B)\,b_l(C),$$
where $b_n$ are the Betti numbers. If we form the Betti polynomials
$$b(B, z) = \sum_{n \ge 0} z^n b_n(B),$$
Corollary 7.10 implies that
$$b(B \times C, z) = b(B, z)\,b(C, z).$$
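The product rule for Betti polynomials is just polynomial multiplication of coefficient lists. The following minimal sketch illustrates Corollary 7.10; the function name `betti_product` is hypothetical, and the Betti numbers of the circle and of the spheres used as inputs are standard facts rather than results of these notes.

```python
def betti_product(b_B, b_C):
    # b_B[k] = b_k(B) and b_C[l] = b_l(C) (real coefficients).
    # Returns the list whose n-th entry is sum_{k+l=n} b_k(B) * b_l(C),
    # i.e. the coefficients of b(B, z) * b(C, z).
    result = [0] * (len(b_B) + len(b_C) - 1)
    for k, bk in enumerate(b_B):
        for l, bl in enumerate(b_C):
            result[k + l] += bk * bl
    return result

circle = [1, 1]                      # b(S^1, z) = 1 + z
torus = betti_product(circle, circle)
s2_x_s3 = betti_product([1, 0, 1], [1, 0, 0, 1])
print(torus, s2_x_s3)
```

For the torus $S^1 \times S^1$ this gives $(1 + z)^2 = 1 + 2z + z^2$, and for $S^2 \times S^3$ it gives $(1 + z^2)(1 + z^3) = 1 + z^2 + z^3 + z^5$.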
Consider now the following fiber space: take a topological space $B$, a point $b \in B$ and let $X$ be the space of all curves in $B$ starting at $b$, with the usual topology. Of course $p : X \to B$ assigns its end point to each curve. Let $\varphi : A \to X$, and $\psi = p\varphi$. For a given homotopy $\psi_t$ of $\psi$, put $\varphi_t(a)$ = curve $\varphi(a)$ followed by $\psi_t(a)$. This provides a lifting. So $X \xrightarrow{\ p\ } B$ has the lifting homotopy property. Furthermore, $X$ has the homotopy type of a point (just shrink each curve to the point $b$). Therefore $H_n(X) = 0$ for $n > 0$.
Returning to Theorem 7.9, set $Z^{k,l} = H_k(B, H_l(C, G))$ for $k, l \ge 0$, and denote by $Z$ the whole double sequence $\{Z^{k,l};\ k, l \ge 0\}$. More generally, assume we have two arbitrary double sequences of Abelian groups, $Z$ and $\tilde Z$. Then we make the following
7.11. Definition: We say that $\tilde Z$ is derived from $Z$ by an $r$-boundary operation if there exists a "boundary" operator $d : \sum \oplus Z^{k,l} \to \sum \oplus Z^{k,l}$ such that $d^2 = 0$,
$$d : Z^{k,l} \to Z^{k-r,\ l+r-1} \qquad (\#)$$
and
$$\tilde Z^{k,l} \cong \frac{\{dz = 0\} \cap Z^{k,l}}{dZ \cap Z^{k,l}}.$$
(It should be understood that $Z^{k,l}$ is the trivial group for $k$ or $l < 0$.) $d$ is called an operator of type $r$. In this case we shall write $\tilde Z = \mathscr{H}_r(Z)$.
Observe that for $r$ large and $k + l$ small, $\mathscr{H}_r(Z)^{k,l} = Z^{k,l}$, because in this case $\{dz = 0\} \cap Z^{k,l} = Z^{k,l}$ and $dZ \cap Z^{k,l} = \{0\}$.
7.12. Lemma: If we have operators $d_i$ of type $i = 2, 3, \ldots$ and, starting with $Z$, sequences $\mathscr{H}_2(Z)$, $\mathscr{H}_3(\mathscr{H}_2(Z))$ of groups, etc., the limit
$$\mathscr{H}_\infty(Z) = \lim_{r \to \infty} \mathscr{H}_r(\cdots(\mathscr{H}_2(Z))\cdots)$$
exists.
This follows from the above observation. Next we quote the following fundamental theorem on the homology of fiber spaces, but without giving its proof.
7.13. Theorem (Leray-Serre): [Cf. Serre, "Homologie Singulière des Espaces Fibrés", Ann. Math. 54 (1951).] Let $X \xrightarrow{\ p\ } B$ be a fiber space, with $B$ connected and simply connected, and connected fiber $F = p^{-1}(b)$. Put $Z^{k,l} = H_k(B, H_l(F))$. Then $H_n(X)$ has a composition series with factors $\tilde Z^{k,l}$, $k + l = n$, such that $\tilde Z = \mathscr{H}_\infty(Z)$. (A composition series for $G$ is a sequence of subgroups $G_i$ of $G$ such that $G = G_0 \supseteq G_1 \supseteq \cdots \supseteq 0$, and the factors are the groups $G_i/G_{i+1}$.)
Let us consider once more the fiber space $X \xrightarrow{\ p\ } B$ of curves beginning at $b \in B$, with fiber $F = p^{-1}(b) = \Omega(B)$. As we said before, all the homology groups $H_n(X)$ of $X$ are zero for $n > 0$. As in Theorem 7.13, put $Z^{k,l} = H_k(B, H_l(\Omega(B)))$, where $Z^{k,0} = H_k(B)$. Suppose that $H_n(B)$ is the first nonvanishing homology group of $B$ of positive dimension (see diagram below). If $n > 2$, by Theorem 7.13, $Z^{0,1}$ must be zero, because this does not change when homology with respect to an $r$-boundary operation, $r \ge 2$, is taken; since the final result must be 0, all $\tilde Z^{k,l}$ being zero, $H_1(\Omega)$ itself must vanish. This implies that all the groups in the column of $H_1(\Omega)$ are 0. Similarly, if $n > 3$, all the groups in the column of $H_2(\Omega)$ are zero. Using these remarks, we may prove the following theorem.
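[Diagram: the array of groups $Z^{k,l}$, with entries $H_0(B)$, $H_0(\Omega)$, $H_1(\Omega)$, $H_2(\Omega)$, ..., $H_{n-1}(\Omega)$, $H_n(\Omega)$ along its edges and the operator $d_2$ indicated.]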
7.14. Theorem: If $B$ is connected and simply connected, the first nonvanishing homology group $H_n(B)$ of positive dimension is isomorphic to the first nonvanishing homology group of positive dimension of $\Omega(B)$, which is $H_{n-1}(\Omega(B))$. Thus
$$H_n(B) \cong H_{n-1}(\Omega(B)).$$
Proof: Suppose, for example, that $n = 3$. After homology with respect to the 2-boundary operation is taken, $Z^{3,0}$ remains the same, for $Z^{1,1}$ is zero by the above remark. The same is true of $Z^{0,2}$. Taking homology with respect to the 3-boundary operation may change both groups, but all the other homologies leave invariant the groups in the places $(3, 0)$ and $(0, 2)$. But the limit groups $\mathscr{H}_\infty(Z)^{3,0}$ and $\mathscr{H}_\infty(Z)^{0,2}$ must be zero, so the 3-boundary homology gives us zero in both places. In other words, the sequence
$$0 \to H_3(B) \xrightarrow{\ d_3\ } H_2(\Omega) \to 0$$
is exact, which proves the theorem. Q.E.D.
7.15. Corollary (Hurewicz): If $B$ is connected and simply connected, the first nonvanishing homology group of positive dimension, $H_n(B)$, is isomorphic to the first nonvanishing homotopy group of positive dimension, $\pi_n(B)$.
Proof:
$$H_n(B) = H_{n-1}(\Omega(B)) = H_1(\Omega^{n-1}(B)) = \pi_1(\Omega^{n-1}(B)) = \pi_n(B).$$
Q.E.D.
Now assume that $B$ is a finite dimensional space. Consider homology groups with real coefficients and let $D^{k,l} = \dim Z^{k,l} = b_k(B)\,b_l(\Omega(B))$, where the $b_k$ are Betti numbers. Suppose that $\Omega(B)$ has only finitely many nonvanishing Betti numbers; let $b_r(\Omega(B))$ be distinct from zero, and $b_l(\Omega(B)) = 0$ for $l > r$. Similarly, let $b_n(B) \ne 0$ and $b_k(B) = 0$ for $k > n$. Then $D^{n,r}$ is different from zero, and remains fixed throughout the sequence of homologies of Lemma 7.12 and Theorem 7.13 (same argument as before). But this is a contradiction, for the final result gives the trivial homology of the path-space $X$ and hence must be 0. Thus $\Omega(B)$ always has infinitely many nonvanishing homology groups.
Suppose next that one of the numbers, say $b_s(\Omega(B))$, is infinite, and that for $l < s$, $b_l(\Omega(B))$ is finite. Then the number at the node $(s, 0)$ of the above diagram remains infinite throughout the sequence of homologies, which is again a contradiction. We have thus proved
7.16. Theorem: If $B$ is connected, simply connected and finite dimensional, $\Omega(B)$ has infinitely many nonvanishing real homology groups and all of them have finite dimensions. Q.E.D.
The space $\Omega(B)$ is an example of the more general concept of a "group-like space".
7.17. Definition: Let $X$ be a topological space. Then $X$ is called a group-like space if there is a binary operation defined on it, a distinguished element called the identity, and a mapping $x \to x^{-1}$ such that all the properties defining a group are satisfied up to homotopy (e.g. $m \cdot m^{-1} \simeq e$ = identity).
For group-like spaces we have the following theorem of Hopf, which we quote from the cited paper of Serre but shall not prove.
7.18. Theorem: The cohomology ring with real coefficients of a group-like space with finite dimensional homology groups is the direct product of a polynomial algebra and an exterior algebra.
7.19. Corollary: Under the above hypotheses, if the group-like space $X$ has infinitely many nonvanishing Betti numbers, the cup-length of $X$ equals $\infty$. (See § 2 of Chapter 5 for the definition of cup-length.)
7.20. Corollary: If $B$ is connected, simply connected and finite dimensional, then cup-length$(\Omega(B)) = \infty$. [Compare with Theorem 5.16.]
7.21. Corollary: Under the above hypotheses, any two points of a Riemannian manifold $B$ are connected by indefinitely long geodesics.
Remark: It can be proved that if $B$ is compact, simple connectedness is not necessary.
C. The Homotopy of Some Lie Groups
We first recall the definition and some properties of the unitary group. For more details, see Milnor's book on Morse theory. The unitary group U(n) is the group of all n × n complex matrices preserving the inner product in $\mathbb{C}^n$, or equivalently, the group of all n × n complex matrices U such that $UU^* = I$, where $U^*$ is the conjugate transpose of U. This is a Lie group, and the tangent space at the identity I is the space of matrices {iH}, where H is hermitian, i.e. $H = H^*$. Analogously, the tangent space at $U_0$ is the space of matrices $\{iU_0H\} = \{iHU_0\}$. The matrix exponential function defined by
$$\exp A = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots$$
coincides with the exponential function defined on the Lie algebra {iH} with values in the Lie group U(n). The scalar product
$$\langle A, B\rangle = \operatorname{Re}\,\operatorname{trace}(AB^*)$$
defines a Riemannian structure on U(n).
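As a quick numerical illustration of the statements above, the following sketch sums the exponential series for A = iH with one sample hermitian matrix H and checks that the result is unitary up to rounding. The helper names (`mat_mul`, `conj_transpose`, `mat_exp`) and the sample matrix are assumptions made for this example only; a truncated power series is not a robust way to compute matrix exponentials in general.

```python
def identity(n):
    """n x n identity matrix with complex entries."""
    return [[1.0 + 0j if r == c else 0j for c in range(n)] for r in range(n)]

def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def conj_transpose(a):
    """Conjugate transpose U* of a square matrix."""
    n = len(a)
    return [[a[c][r].conjugate() for c in range(n)] for r in range(n)]

def mat_exp(a, terms=40):
    """Truncated series I + A + A^2/2! + ... (adequate for small matrices)."""
    n = len(a)
    result = identity(n)
    term = identity(n)
    for j in range(1, terms):
        # term becomes A^j / j!
        term = [[x / j for x in row] for row in mat_mul(term, a)]
        result = [[result[r][c] + term[r][c] for c in range(n)] for r in range(n)]
    return result

def max_abs_diff(a, b):
    """Largest entrywise distance between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Sample hermitian matrix (hypothetical data): H equals its conjugate transpose.
H = [[1.0 + 0j, 2.0 - 1.0j],
     [2.0 + 1.0j, -0.5 + 0j]]
iH = [[1j * x for x in row] for row in H]

U = mat_exp(iH)
defect = max_abs_diff(mat_mul(U, conj_transpose(U)), identity(2))
print("deviation of U U* from I:", defect)
```

The printed deviation should be at the level of floating-point rounding, consistent with exp(iH) lying in U(2).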
The geodesics beginning at I are the curves of the form $\sigma(t) = \exp(itH)$, with H hermitian. We say that $\sigma(t)$ has H as initial velocity. Our aim now is to determine at which points of the tangent space at the origin, that is, at which hermitian matrices H, the exponential function has a vanishing Jacobian. The image of these points under the exponential is the set of conjugate points to the identity I. In general, let f be any analytic function of a matrix. Then f has a Cauchy integral representation,
$$f(M) = \frac{1}{2\pi i}\oint \frac{f(z)}{z - M}\,dz.$$
If $\delta f(M, N)$ denotes the first variation of f at the point M applied to N, we have:
$$\delta f(M, N) = \frac{1}{2\pi i}\oint f(z)\,\delta[(z - M)^{-1}, N]\,dz. \tag{1}$$
On the other hand,
$$\delta (z - M)^{-1} = (z - M)^{-1}\,\delta M\,(z - M)^{-1}. \tag{2}$$
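Now consider the operators $\rho(A)$ and $\lambda(A)$ on matrices, defined as right and left multiplication by A respectively. Since the mapping $A \to \rho(A)$ is a homomorphism from the group of nonsingular matrices to the group of nonsingular linear operators on the linear space of all matrices, and similarly for $\lambda$, we obtain:
$$\rho((z - A)^{-1}) = (z - \rho(A))^{-1} \quad\text{and}\quad \lambda((z - A)^{-1}) = (z - \lambda(A))^{-1}.$$
Hence from formula (2) we obtain
$$\delta((z - M)^{-1}) = (z - \rho(M))^{-1}\,(z - \lambda(M))^{-1}\,\delta M,$$
and therefore using (1) it follows that
$$\delta f(M, N) = \Big\{\frac{1}{2\pi i}\oint f(z)\,(z - \rho(M))^{-1}(z - \lambda(M))^{-1}\,dz\Big\}(N). \tag{3}$$
Set
$$\Phi(\xi_1, \xi_2) = \frac{1}{2\pi i}\oint \frac{f(z)}{(z - \xi_1)(z - \xi_2)}\,dz.$$
Then
$$\delta f(M, N) = \Phi(\rho(M), \lambda(M))\,(N).$$
Moreover,
$$\frac{1}{(z - \xi_1)(z - \xi_2)} = \frac{1}{\xi_1 - \xi_2}\Big(\frac{1}{z - \xi_1} - \frac{1}{z - \xi_2}\Big),$$
so
$$\Phi(\xi_1, \xi_2) = \frac{f(\xi_1) - f(\xi_2)}{\xi_1 - \xi_2}.$$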
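We now return to the function f(H) = exp(iH). In this case, we have $\delta \exp(iH) = \Phi(\rho(H), \lambda(H))\,\delta H$, where
$$\Phi(\xi_1, \xi_2) = \frac{\exp(i\xi_1) - \exp(i\xi_2)}{\xi_1 - \xi_2} = \exp(i\xi_1)\,\frac{1 - \exp(-i(\xi_1 - \xi_2))}{\xi_1 - \xi_2}.$$
But $\rho(A) - \lambda(A) = \mathrm{Ad}(A)$. So finally we get the formula $\delta \exp(iH) = \exp(iH)\,\psi(\mathrm{Ad}(H))\,\delta H$, where
$$\psi(z) = \frac{1 - \exp(-iz)}{z}.$$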
Furthermore, the eigenvalues of $\psi(\mathrm{Ad}(H))$ are equal to $\psi$(eigenvalues of Ad(H)). The zeros of $\psi$ are $z = 2\pi n$, $n = \pm 1, \pm 2, \ldots$ (the value z = 0 is not a zero, since $\psi(z)$ tends to a nonzero limit as $z \to 0$). Hence the matrices H which give rise to conjugate points to the identity are those whose Ad(H) has an eigenvalue of the form $2\pi n$, $n = \pm 1, \pm 2, \ldots$ Now, the eigenvalues of Ad(H) are differences of the eigenvalues of H, hence the matrices we are looking for are those having eigenvalues differing by $2\pi n$, $n = \pm 1, \pm 2, \ldots$
All these calculations apply to the Special Unitary group SU(n) also, but in this case the tangent space at the identity, that is, the Lie algebra, is the space of matrices {iH} where H is hermitian and has trace 0.
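The criterion just stated can be checked numerically. The sketch below assumes the convention $\psi(z) = (1 - e^{-iz})/z$ (the sign in the exponent does not affect the location of the zeros) and uses hypothetical helper names `psi` and `has_critical_difference`; it confirms that $\psi$ vanishes at nonzero multiples of $2\pi$ but not at 0, and tests a list of eigenvalues of H for differences of that form.

```python
import cmath
import math

def psi(z):
    """psi(z) = (1 - exp(-i z)) / z, extended by its limit i at z = 0."""
    if z == 0:
        return 1j
    return (1 - cmath.exp(-1j * z)) / z

def has_critical_difference(eigenvalues, tol=1e-9):
    """True if two eigenvalues of H differ by a nonzero integer multiple of 2*pi."""
    for a in eigenvalues:
        for b in eigenvalues:
            n = round((a - b) / (2 * math.pi))
            if n != 0 and abs((a - b) - 2 * math.pi * n) < tol:
                return True
    return False

# |psi| at a few nonzero multiples of 2*pi (should be numerically zero).
zero_values = [abs(psi(2 * math.pi * n)) for n in (1, -1, 2, -3)]

# Sample spectra (hypothetical data).
regular = has_critical_difference([0.3, 1.0, -2.0])
critical = has_critical_difference([0.5, 0.5 + 2 * math.pi])

print(zero_values, psi(0), regular, critical)
```

In this sketch a spectrum is flagged only when some difference is a nonzero multiple of $2\pi$; equal eigenvalues (difference 0) do not produce a conjugate point.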
The following considerations apply to the group U(2m) or the group SU(2m).
We choose an element E near $-I$ having the form
$$E = \operatorname{diag}\big(e^{i(\pi + \varepsilon_1)},\ e^{i(\pi - \varepsilon_1)},\ e^{i(\pi + \varepsilon_2)},\ e^{i(\pi - \varepsilon_2)},\ \ldots\big).$$
We wish to study the geodesics joining I and E. Therefore we have to find the solutions of exp(iH) = E. But such an element H commutes with E, and since E is diagonal with distinct entries, H too must be diagonal. Set
$$H = \operatorname{diag}(h_1, h_2, \ldots, h_{2m}).$$
Then $\exp(ih_1) = \exp(i(\pi + \varepsilon_1))$; $\exp(ih_2) = \exp(i(\pi - \varepsilon_1))$; etc. Hence,
$$h_1 = \pi + \varepsilon_1 + 2\pi n_1 = \varepsilon_1 + (2n_1 + 1)\pi,$$
$$h_2 = \pi - \varepsilon_1 + 2\pi n_2 = -\varepsilon_1 + (2n_2 + 1)\pi,$$
etc. This means that the $h_j$ are of the form $(2k + 1)\pi \pm \varepsilon$.
The length of the geodesic with initial velocity H is
$$L = \Big|\frac{d}{dt}\exp(itH)\Big| = (\operatorname{tr}(HH^*))^{1/2} = (\operatorname{tr}(H^2))^{1/2} = \Big\{\sum_j \big((2n_j + 1)\pi \pm \varepsilon_j\big)^2\Big\}^{1/2}.$$
Choosing ±1 for the coefficients of $\pi$, we obtain $2^{2m}$ geodesics of minimal length $\approx \pi\sqrt{2m}$. The next shortest geodesics are obtained when all coefficients but one are ±1, and one of them is ±3; then the length is $\approx \pi\sqrt{2m + 8}$. Conjugate points to the identity appear along a geodesic when $t(h_i - h_j) = 2\pi n$ for some t, 0 < t < 1, and for some $h_i \neq h_j$. Given $h_i$ and $h_j$ there are $\big[\,|h_i - h_j|/2\pi\,\big]$ ([ ] = integer part) conjugate points. The total number of conjugate points along the geodesic is therefore
$$\sum_{h_i \neq h_j}\Big[\frac{|h_i - h_j|}{2\pi}\Big] = 2\sum_{h_i > h_j}\Big[\frac{h_i - h_j}{2\pi}\Big].$$
But $h_j = \pi(2n_j + 1) \pm \varepsilon$. Hence the total number of conjugate points is
$$\ge 2\sum_{n_j > n_i}(n_j - n_i - 1)$$
for $\varepsilon$ small enough.
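The length and conjugate-point formulas can be tried out on small examples. The following sketch assumes eigenvalues of the form $h_j = (2n_j + 1)\pi + \varepsilon_j$ with signed perturbations $\varepsilon_j$, and counts conjugate points by summing the integer part of $|h_i - h_j|/2\pi$ over ordered pairs $i \neq j$; the function names and the sample values of $n_j$ and $\varepsilon_j$ are assumptions made for illustration.

```python
import math

def eigenvalues(ns, eps):
    """h_j = (2*n_j + 1)*pi + eps_j, where eps_j carries its own sign."""
    return [(2 * n + 1) * math.pi + e for n, e in zip(ns, eps)]

def geodesic_length(hs):
    """Length (tr H^2)^(1/2) of the geodesic exp(itH), 0 <= t <= 1."""
    return math.sqrt(sum(h * h for h in hs))

def conjugate_point_count(hs):
    """Sum of the integer parts of |h_i - h_j| / (2*pi) over ordered pairs i != j."""
    total = 0
    for i, a in enumerate(hs):
        for j, b in enumerate(hs):
            if i != j:
                total += math.floor(abs(a - b) / (2 * math.pi))
    return total

def lower_bound(ns):
    """2 * sum of (n_j - n_i - 1) over pairs with n_j > n_i."""
    return 2 * sum(nj - ni - 1 for ni in ns for nj in ns if nj > ni)

m = 2
eps = [0.01, -0.01, 0.02, -0.02]              # small perturbations (sample data)
minimal = eigenvalues([-1, 0, -1, 0], eps)    # all coefficients of pi are +1 or -1
longer = eigenvalues([1, -1, -1, 0], eps)     # one coefficient of pi equals 3

print(geodesic_length(minimal), math.pi * math.sqrt(2 * m))
print(geodesic_length(longer), math.pi * math.sqrt(2 * m + 8))
print(conjugate_point_count(minimal), conjugate_point_count(longer))
```

In this example the short geodesic has no conjugate points, while the longer one has a count that is at least the lower bound computed from the $n_j$.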
Consider now the special case of the Special Unitary group SU(2m). Then we have
7.22. Lemma: Unless m of the $n_j$'s in the formula for the $h_j$'s are 0 and the m others are $-1$, the geodesic with initial velocity H passes through at least 2(m + 1) conjugate points to the identity.
Proof: Since the trace of H is zero, E hj = n (E 2nj + 2m) = 0, so Y, nj < m, or E nj = m. Thus there are two possibilities: either nJ2 Y (nj  ni  1)z2nj2 eJni
In the second case E nj = m there are no positive nj's. If our hypothesis AJ 2 (m + 1). nk=0
nJ= n 2m + 8. For the geodesics of minimal length, the fact that trace (H)
= 0 implies that there are m eigenvalues equal to $+\pi$ and m eigenvalues equal to $-\pi$. In this case, the matrix H is completely determined by giving the subspace of eigenvectors of the eigenvalue $\pi$, the other subspace corresponding to $-\pi$ being orthogonal to the first one. Therefore we have a homeomorphism between the manifold of minimal geodesics joining I to $-I$ in SU(2m) and the Grassmann manifold $G_m(2m)$ of m-dimensional linear subspaces of $\mathbb{C}^{2m}$.
We shall now prove the following theorem due to Bott (compare to
Lemma 22.5 of Milnor's Morse Theory):
7.25. Theorem (Bott): Let M be a complete Hilbert manifold, f a smooth function satisfying the PS condition. Suppose that the set $\{f \le a\}$ has only one critical level c, with critical set $K = f^{-1}(c)$, and assume that K is a finite dimensional submanifold of M. Then
n:k({f l/nj, whence the last expression is less than tnl II Vf (a (xn,, t))112 att .
nj
(*)
J
Now,
dt
0
f(a(t)) _ IIVf(a(t))112 So (*) equals tn,
nj 0
a (xn,, t)) dt = nj (f(xn,) f(yn)) d f(
which tends to zero. Thus {xn,} also tends to y,,,, e K.
Q.E.D.
Proof of Theorem 7.25: By our hypotheses, K has a neighborhood N in M homeomorphic to K × disc. [Cf. O. Hanner, "Some theorems on absolute neighborhood retracts", Arkiv för Matematik, Vol. 1 (1950), pp. 389-408.]
Assume that S is a topological space and $\varphi : S \to \{f \le a\}$ is continuous. Using Morse Theory it follows that for any $\varepsilon > 0$, $\varphi$ is homotopic to a $\varphi_\varepsilon$ such that $c - \varepsilon \le f(\varphi_\varepsilon(S)) < c + \varepsilon$ [because $\{f \le a\}$ and $\{f < c + \varepsilon\}$ have the same homotopy type]. Therefore there is a homotopic $\varphi_1$ such that $f(\varphi_1(S)) \le c + \varepsilon$ and $\varphi_1(S) \subset N$. If this were not the case, we would construct a sequence $\{x_n\}$ such that $f(x_n) \to c$ and $x_n \notin N$ for all n, contradicting Lemma 7.26. Since N is homeomorphic to K × disc it follows by squeezing the disc that $\varphi_1$ is homotopic to a function $\varphi_2$ with values in K. Q.E.D.
We return to our study of the groups U(2m) and SU(2m). By Corollary 7.24, any map from a space X of dimension < 2m + 1 into $H_1(SU(2m), I, E)$ can be pushed down homotopically into a map of X with values in curves whose length is $\le \pi\sqrt{2m} + \varepsilon$, for any $\varepsilon > 0$. The same result is true if we consider instead the space of paths $H_1(SU(2m), I, -I)$; the points $-I$ and E being joined by a unique minimal geodesic, one can prove that $H_1(SU(2m), I, E)$ and $H_1(SU(2m), I, -I)$ have the same homotopy type.
7.27. Lemma: For k < 2m + 1, $\pi_k(\Omega(SU(2m), I, -I), K) = 0$, where K is the manifold of minimal geodesics joining I and $-I$.
Proof: By the remark above, a map from the k-cube into $\Omega(SU(2m), I, -I)$ can be pushed down to a map into the space $\Omega_0$ of curves of length at most $\pi\sqrt{2m} + \varepsilon$. But by Bott's theorem, the space $\Omega_0$ has vanishing homotopy groups relative to K. Q.E.D.
7.28. Corollary: For k < 2m + 1, $\pi_k(\Omega(SU(2m), I, -I)) \cong \pi_k(G_m(2m))$.
Proof: By Theorem 7.2, the sequence
$$\cdots \to \pi_k(X, A) \to \pi_{k-1}(A) \to \pi_{k-1}(X) \to \pi_{k-1}(X, A) \to \cdots$$
is exact. The first and the last written terms are zero by Lemma 7.27, whence the middle terms are isomorphic, i.e. $\pi_k(\Omega(SU(2m), I, -I)) = \pi_k(K)$. But, as noted preceding Theorem 7.25, K is homeomorphic to the Grassmann manifold $G_m(2m)$. Q.E.D.
7.29. Corollary (Bott's isomorphisms): $\pi_{k+1}(SU(2m)) \cong \pi_k(G_m(2m))$ for $k \le 2m$.
We now proceed to obtain corresponding results for the group U(n). Let $p : X \to B$ be a fiber space, X and B connected. Let $F = p^{-1}(b)$ be the fiber. Then p induces an isomorphism $p_* : \pi_k(X, F) \to \pi_k(B)$ [cf. Steenrod, The Topology of Fiber Bundles, or Hilton and Wylie, Homology Theory, pp. 288-289]. Using the exact sequence for homotopy we see that
$$\cdots \to \pi_k(B) \to \pi_{k-1}(F) \to \pi_{k-1}(X) \to \pi_{k-1}(B) \to \cdots$$
is exact [this is the so-called exact bundle sequence].
If G is a connected Lie group, and H a subgroup of G, then G is a fiber bundle with base space G/H (the factor space of G by H). The projection p is the natural one. [This uses the existence of local cross sections of G over G/H. See Steenrod's book.]
Consider now the inclusion $SU(n) \subset U(n)$. The factor space U(n)/SU(n) is the unit circle C. Then: $\pi_k(U(n)) = \pi_k(SU(n))$, $k \ge 2$.
Next, consider the inclusion $U(n) \subset U(n + 1)$. The factor space U(n + 1)/U(n) is the sphere $S^{2n+1}$ [cf. Steenrod's book]. Hence,
$$\pi_k(S^{2n+1}) \to \pi_{k-1}(U(n)) \to \pi_{k-1}(U(n + 1)) \to \pi_{k-1}(S^{2n+1})$$
is exact. So, for k < 2n + 2 we get: $\pi_k(U(n)) = \pi_k(U(n + 1))$ (Stable homotopy groups). It is natural to write $\pi_k(U(\infty)) = \lim_{n \to \infty} \pi_k(U(n))$.
The space U(m) × U(m) is included in U(2m), since the matrix
$$\begin{pmatrix} A(m) & 0 \\ 0 & A'(m) \end{pmatrix}$$
is in U(2m) for any $A(m), A'(m) \in U(m)$.
It is easy to verify that the factor space U(2m)/U(m) × U(m) is homeomorphic to the Grassmann manifold $G_m(2m)$. Therefore
$$\pi_k(G_m(2m)) \to \pi_{k-1}(U(m) \times U(m)) \to \pi_{k-1}(U(2m)) \to \pi_{k-1}(G_m(2m))$$
is exact. This geometric fact justifies the following
Remark: Given a Lie group G and subgroups $H_1 \subset H_2$, $G/H_1$ has a bundle structure over $G/H_2$, with natural projection and fiber $H_2/H_1$. [Same proof as in the somewhat less general case considered before.] If we then consider the fiber space $U(2m)/U(m) \to U(2m)/U(m) \times U(m)$, and use the exact bundle sequence we find that
$$\pi_k(U(2m)/U(m)) \to \pi_k(G_m(2m)) \to \pi_{k-1}(U(m)) \to \pi_{k-1}(U(2m)/U(m))$$
is exact.
7.30. Theorem (Bott's Periodicity Theorem): $\pi_{k-1}(U(\infty)) \cong \pi_{k+1}(U(\infty))$ for $k \ge 1$.
Proof: First we prove that in the exact sequence noted above the first and the last groups are zero provided m is large enough. Indeed, we have seen already that the space U(2m)/U(2m − 1) is the sphere $S^{4m-1}$. Thus $\pi_k(U(2m)/U(2m - 1)) = 0$ for m large. Similarly $\pi_k(U(2m - 1)/U(2m - 2)) = 0$ for m large, etc., and we get $\pi_k(U(2m)/U(m)) = 0$ for m large, by the above remark.
Hence, by the exactness of the above sequence of homotopy groups, we get
$$\pi_k(G_m(2m)) \cong \pi_{k-1}(U(m)) \quad\text{for m large}.$$
Now, $\pi_{k-1}(U(m)) \cong \pi_{k-1}(U(2m))$ by stability, for m large. Therefore
$$\pi_k(G_m(2m)) \cong \pi_{k-1}(U(2m)) \quad\text{for m large}.$$
Using Corollary 7.29 (Bott's isomorphisms), we may assert that
$$\pi_{k+1}(SU(2m)) \cong \pi_k(G_m(2m)) \quad\text{for m large}.$$
But we have seen that $\pi_k(U(n)) \cong \pi_k(SU(n))$ for $k \ge 2$. Thus
$$\pi_{k+1}(U(2m)) = \pi_k(G_m(2m)) \quad\text{for m large, and for } k \ge 1.$$
Hence, finally,
$$\pi_{k+1}(U(2m)) \cong \pi_{k-1}(U(2m)) \quad\text{if m is large}.$$
Q.E.D.
7.31. Corollary: The homotopy groups $\pi_k(U(\infty))$ are zero if k is even, and isomorphic to Z if k is odd.
Proof: Observe that
$$\pi_2(U(\infty)) = \pi_2(SU(2)) = 0, \qquad \pi_3(U(\infty)) = \pi_3(SU(2)) = Z,$$
as SU(2) is nothing but $S^3$. Q.E.D.
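The way the periodicity theorem and the two computed groups determine all the stable groups can be spelled out mechanically. The sketch below is only a bookkeeping illustration: it assumes the two values $\pi_2 = 0$ and $\pi_3 = Z$ from the proof above and the shift by 2 from Theorem 7.30, and the function name `stable_unitary_homotopy` is hypothetical.

```python
def stable_unitary_homotopy(k):
    """Return "0" or "Z" for pi_k(U(infinity)), k >= 0, using period 2."""
    if k < 0:
        raise ValueError("k must be non-negative")
    base = {2: "0", 3: "Z"}  # the two values computed from SU(2) = S^3
    # Periodicity pi_{k-1} = pi_{k+1} lets us shift k by 2 until it is 2 or 3.
    while k > 3:
        k -= 2
    while k < 2:
        k += 2
    return base[k]

table = [stable_unitary_homotopy(k) for k in range(8)]
print(table)
```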
CHAPTER VIII
Closed Geodesics on Compact Riemannian Manifolds (Chapter by Hermann Karcher)
In this chapter closed geodesics on a Riemannian manifold M will be studied using infinite dimensional Morse Theory in much the same way as in Chapter IV where geodesics joining two fixed points were treated. We shall study a Hilbert manifold $H_1(S^1, M)$ of closed, sufficiently regular curves ($H_1$-curves) on M. The coordinate spaces for $H_1(S^1, M)$ are (as in Chapter IV) Hilbert spaces whose elements are $H_1$-vector fields along curves on M. In Chapter IV one defined scalar products for the coordinate spaces (following Palais) with the aid of Nash's embedding theorem. In this chapter (cf. Theorem 8.6) we use instead Klingenberg's intrinsic scalar product (first introduced in a lecture given in Bonn) which depends only on the Riemannian structure of M. In Theorem 8.9 we prove that this scalar product and various other possible scalar products on the Hilbert spaces of $H_1$-vector fields lead to equivalent norms. The differentiable structure of $H_1(S^1, M)$ and useful coordinate systems are discussed in 8.11 to 8.18. Theorem 8.19 states the differentiability of the energy function and in 8.20 we introduce a Riemannian metric for $H_1(S^1, M)$ based on Klingenberg's intrinsic scalar products. These developments are somewhat more complicated than the corresponding ones in Chapter IV since it does not seem possible to obtain the intrinsic Riemannian metric of $H_1(S^1, M)$ via an embedding $M \subset R^N$. In 8.22 to 8.26 we carry out an auxiliary discussion of differentiable curves on $H_1(S^1, M)$ and their representation on M.
In the second half of the present chapter our use of the intrinsic scalar product allows simpler proofs than in Chapter IV; it also seems that notions such as the gradient vector field of J on $H_1(S^1, M)$ can be more readily interpreted in terms of M. We prove a few geometric results concerning the Riemannian structure of $H_1(S^1, M)$ in 8.27 to 8.33.
Lemma 8.34 contains basic estimates which we need to derive the explicit formula for the first derivative of the energy J in 8.35 and to prove the validity of the Palais-Smale condition for J in 8.41. In 8.36 to 8.39 we introduce the gradient of J as a vector field on $H_1(S^1, M)$ and identify the critical points of J as the closed geodesics on M. In 8.41 to 8.50 standard arguments from infinite dimensional Morse Theory are used to show the existence of at least one nontrivial closed geodesic on every compact $C^6$ Riemannian manifold. In 8.48 we prove that those flow lines of the gradient deformation (cf. 8.43) which start at points f with sufficiently small energy $J(f) < \varepsilon$ have uniformly bounded length. As an immediate consequence we obtain the important result 8.47 that $J^{-1}(0)$ is a deformation retract of $J^{-1}([0, \varepsilon])$. We conclude with a summary of recent developments.
Preliminaries. M will be a compact Riemannian manifold of class $C^k$ ($k \ge 6$). (Metric completeness rather than compactness is sufficient for most of our general developments but not for the desired application to closed geodesics.) $M_p$ denotes the tangent space to M at p, TM the tangent bundle of M. The scalar product in $M_p$ is denoted by g(p)(v, w) or more briefly by (v, w); in local coordinates on M the metric is written $g_{ik}(p)\,v^i w^k$. The distance on M induced by this infinitesimal metric (cf. Chapter VI) is called $d_M(p, q)$. Absolutely continuous curves (resp. vector fields) with locally square integrable derivatives will be called $H_1$-curves (resp. $H_1$-vector fields). For $H_1$-curves we may define an energy integral J and a length $L_M$ as follows.
$$J(f) = \frac{1}{2}\int_0^1 (\dot f(t), \dot f(t))\,dt, \qquad L_M(f) = \int_0^1 (\dot f(t), \dot f(t))^{1/2}\,dt.$$
We shall be interested in closed $H_1$-curves parametrized by the interval [0, 1] (not necessarily proportionally to arc length). Hence we find it useful to define the following space:
$$H_1(S^1, M) = \{f \mid f : [0, 1]/\{0, 1\} \to M \text{ and } J(f) < \infty\}.$$
(We always identify the circle $S^1$ with the factor space [0, 1]/{0, 1}.) The covariant derivative of a vector field v(t) along f(t) will be written Dv/dt; this derivative is given in local coordinates on M by the formula
$$\Big(\frac{Dv}{dt}\Big)^i = \frac{dv^i}{dt} + \Gamma^i_{jk}(f(t))\,\dot f^j(t)\,v^k(t)$$
(where $\Gamma^i_{jk} \in C^{k-2}$ are the Christoffel symbols). Differential equations with square-integrable coefficients can be treated by the Picard-Lindelöf iteration scheme. Using this fact and the last formula we see that Levi-Civita parallel transport is well defined along any curve $f \in H_1(S^1, M)$. In particular a parallel vector field along such a curve is absolutely continuous, and for any continuous vector field along f the following statement holds: dv/dt is locally square integrable if and only if Dv/dt is locally square integrable.
The exponential map available on the manifold M may be described as follows. For $0 \neq v \in M_p$, let $c : [0, \infty) \to M$ be the geodesic ray starting at p with tangent vector v, such that its parameter t is proportional to arc length s and $ds/dt = |v|_{M_p}$. Define exp(v) = c(1). Then exp: TM → M. We denote the restriction $\exp|_{M_p} = \exp_p$. If convenient, we write $\exp_p = \exp$. It follows immediately from the differential equation for geodesics (i.e., from $\frac{D}{dt}\dot c = 0$) that $\exp_p(tv) = c(t)$, in other words that $\exp_p$ is radial isometric. Since $\Gamma^i_{jk} \in C^{k-2}$, it follows that exp: TM → M is a $C^{k-2}$ map, and the differential equation for geodesics also shows that the linear map induced by $\exp_p$ at the origin of $M_p$ is the identity map.
Geodesic parallel coordinates on M are easily defined in terms of the exponential map exp. Given a geodesic arc c: [0, T] → M (arc length t as parameter), choose an orthonormal n-frame $F_0 = \{\dot c(0), v_2(0), \ldots, v_n(0)\}$ in $M_{c(0)}$ and define n-frames $F_t$ in $M_{c(t)}$ by parallel transport of $F_0$ along c. The map $[0, T] \times R^{n-1} \to M$ given by $(t, u^2, \ldots, u^n) \to \exp_{c(t)}\big(\sum_{i=2}^n u^i v_i(t)\big)$ is $C^{k-2}$ and the induced linear map at (t, 0, ..., 0) is the identity. Therefore a neighborhood of [0, T] × {0} is mapped $C^{k-2}$-diffeomorphically onto a tubular neighborhood of c in M. The inverse map gives the desired coordinates. We have $g_{ik}(c(t)) = \delta_{ik}$ and $\Gamma^i_{jk}(c(t)) = 0$. Since M is compact these remarks prove the following lemma.
8.1. Lemma: There exists $\varepsilon_p > 0$ such that the geodesic parallel coordinates just defined are valid in the $\varepsilon_p$-tubular neighborhood of any geodesic arc which is sufficiently short so that its $\varepsilon_p$-tube does not cover any point twice. Moreover there exist constants C > 0 and $0 < m_1 \le 1 \le m_2 < \infty$ such that for parallel coordinates in any $\varepsilon_p$-tube we have
$$|\Gamma^i_{jk}| \le C \quad\text{and}\quad m_1 \sum_{i=1}^n (v^i)^2 \le g_{ik}v^i v^k \le m_2 \sum_{i=1}^n (v^i)^2. \tag{1}$$
A consequence of the second inequality in (1) is given in
8.2. Lemma: If ( , ) and (( , )) are two scalar products such that $m_1((v, v)) \le (v, v) \le m_2((v, v))$ for all v and if $m_1 \le 1 \le m_2$, then
$$|(v, w) - ((v, w))| \le 16\,(m_2 - m_1)\,((v, v))^{1/2}\,((w, w))^{1/2}.$$
Proof: Using $(v, w) = \frac{1}{4}\big((v + w, v + w) - (v - w, v - w)\big)$ one first gets
$$|(v, w) - ((v, w))| \le 8\,(m_2 - m_1)\,\big(((v, v)) + ((w, w))\big).$$
Now $(v, w) - ((v, w)) = b(v, w)$ is bilinear and $|b(v, w)| \le c\,(|v|^2 + |w|^2)$ implies $|b(v, w)| \le 2c\,|v|\,|w|$, since for $\lambda \neq 0$ we have
$$|b(v, w)| = |b(\lambda v, \lambda^{-1} w)| \le c\,\big(\lambda^2 |v|^2 + \lambda^{-2} |w|^2\big),$$
and the right side equals $2c\,|v|\,|w|$ for $\lambda^2 = |w|/|v|$.
Q.E.D.
The following result is well known for $H_1$-curves in $R^N$.
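8.3. Lemma: Every subset of $H_1(S^1, M)$ on which the energy integral is bounded consists of equicontinuous curves on M.
Proof:
$$d_M(f(t), f(t_0)) \le L_M\big(f|_{[t_0, t]}\big) = \int_{t_0}^{t} (\dot f(\tau), \dot f(\tau))^{1/2}\,d\tau \le |t - t_0|^{1/2}\Big(\int_{t_0}^{t} (\dot f(\tau), \dot f(\tau))\,d\tau\Big)^{1/2} \le (2J(f))^{1/2}\,|t - t_0|^{1/2}$$
(by the Cauchy-Schwarz inequality).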
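The Cauchy-Schwarz step in the estimate above can be seen on a discretized example. The sketch below assumes the flat case M = R² and a closed curve running once around the unit circle, with the conventions $J = \frac{1}{2}\int |\dot f|^2\,dt$ and $L = \int |\dot f|\,dt$ approximated by chord lengths; the function names and the two sample parametrizations are illustrative assumptions. It compares $L^2$ with $2J$ for a constant-speed and a variable-speed parametrization.

```python
import math

def closed_curve(phase, n=2000):
    """Sample f(t) = (cos 2*pi*phase(t), sin 2*pi*phase(t)) at t = i/n, i = 0..n."""
    pts = []
    for i in range(n + 1):
        angle = 2 * math.pi * phase(i / n)
        pts.append((math.cos(angle), math.sin(angle)))
    return pts

def length_and_energy(pts):
    """Chord-length approximations of L = int |f'| dt and J = (1/2) int |f'|^2 dt."""
    n = len(pts) - 1
    dt = 1.0 / n
    chords = [math.hypot(pts[i + 1][0] - pts[i][0], pts[i + 1][1] - pts[i][1])
              for i in range(n)]
    length = sum(chords)
    energy = 0.5 * sum(c * c for c in chords) / dt
    return length, energy

# Constant speed: phase(t) = t.  Variable speed: phase(t) = t^2 (same image curve).
L_const, J_const = length_and_energy(closed_curve(lambda t: t))
L_var, J_var = length_and_energy(closed_curve(lambda t: t * t))

print(L_const ** 2, 2 * J_const)
print(L_var ** 2, 2 * J_var)
```

For the constant-speed parametrization the two printed numbers agree up to rounding, while for the variable-speed one $2J$ is strictly larger than $L^2$, which is the behaviour the Cauchy-Schwarz inequality predicts.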
8.4. Corollary: $L_M^2(f) \le 2J(f)$, and $L_M^2(f) = 2J(f)$ if and only if f is parametrized proportional to arc length, i.e., $(\dot f(\tau), \dot f(\tau)) = \text{const}$.
We next wish to define a differentiable structure on $H_1(S^1, M)$ with the aid of coordinate spaces which have a geometric interpretation on M.
8.5. Definition: For any $f \in H_1(S^1, M)$ consider the set of $H_1$-vector fields along f:
$$H_1(S^1, TM_f) = \{v \mid v : [0, 1]/\{0, 1\} \to TM\} \tag{2}$$
such that $v(t) \in M_{f(t)}$ and v is absolutely continuous and has locally square integrable derivatives.
8.6. Theorem: $H_1(S^1, TM_f)$ is a separable Hilbert space with the scalar product defined below.
Proof: $H_1(S^1, TM_f)$ is obviously a vector space. For $v, w \in H_1(S^1, TM_f)$ define
$$\langle v, w\rangle = \int_0^1 \Big\{(v(t), w(t)) + \Big(\frac{Dv}{dt}, \frac{Dw}{dt}\Big)\Big\}\,dt. \tag{3}$$
(Of course, D/dt denotes the covariant derivative along f.) It is clear from (2) of Definition 8.5 that we have $\langle v, v\rangle < \infty$, $\langle w, w\rangle < \infty$. Moreover using $|(v(t), w(t))| \le |v(t)|\,|w(t)|$ (in $M_{f(t)}$) and the Cauchy-Schwarz inequality, it follows that $\langle v, w\rangle^2 \le \langle v, v\rangle\,\langle w, w\rangle$. Hence $\langle v, w\rangle$ is defined for all $v, w \in H_1(S^1, TM_f)$. Clearly $\langle\ ,\ \rangle$ is bilinear and positive definite, and therefore a scalar product.
For the proof of completeness of the space $H_1(S^1, TM_f)$ we need the following definition and lemma.
8.7. Definition: For $v \in H_1(S^1, TM_f)$ put
$$\|v\|_\infty = \max_{t \in [0, 1]} (v(t), v(t))^{1/2}.$$
8.8. Lemma: $\|v\|_\infty^2 \le 2\,\langle v, v\rangle$.
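Proof: The formula
$$\frac{d}{dt}\big(g(f(t))(v(t), w(t))\big) = \Big(\frac{Dv}{dt}, w\Big) + \Big(v, \frac{Dw}{dt}\Big) \tag{4}$$
is well known for $f, v, w \in C^1$ and generalizes by an easy limit argument to all $f \in H_1(S^1, M)$ and $v, w \in H_1(S^1, TM_f)$. Now choose $t_m$ such that $\|v\|_\infty^2 = (v(t_m), v(t_m))$ and note that
$$\|v\|_\infty^2 = (v(t), v(t)) + \int_t^{t_m} \frac{d}{d\tau}(v(\tau), v(\tau))\,d\tau \le (v(t), v(t)) + 2\int_0^1 |v(\tau)|\,\Big|\frac{Dv}{d\tau}\Big|\,d\tau.$$
Since the left side of this formula is independent of t and since $2ab \le a^2 + b^2$ we get, integrating over $t \in [0, 1]$,
$$\|v\|_\infty^2 \le \int_0^1 (v(t), v(t))\,dt + \int_0^1 \Big\{(v(\tau), v(\tau)) + \Big(\frac{Dv}{d\tau}, \frac{Dv}{d\tau}\Big)\Big\}\,d\tau \le 2\,\langle v, v\rangle.$$
Q.E.D.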
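The inequality of Lemma 8.8 is easy to test in the simplest situation. The sketch below assumes the flat one-dimensional case, where a vector field along the curve is just a periodic real function v on [0, 1] and D/dt is the ordinary derivative, so that $\langle v, v\rangle = \int_0^1 (v^2 + \dot v^2)\,dt$; the function names and sample fields are hypothetical, and the integral is approximated by a midpoint rule.

```python
import math

def sup_norm_sq(v, n=4000):
    """Square of the maximum of |v| over the grid t = i/n, i = 0..n."""
    return max(v(i / n) ** 2 for i in range(n + 1))

def h1_norm_sq(v, dv, n=4000):
    """Midpoint-rule approximation of <v, v> = int_0^1 (v^2 + v'^2) dt."""
    h = 1.0 / n
    return sum((v((i + 0.5) * h) ** 2 + dv((i + 0.5) * h) ** 2) * h
               for i in range(n))

# Sample periodic fields with their derivatives (hypothetical data).
fields = [
    (lambda t: 1.0, lambda t: 0.0),
    (lambda t: 0.5 + math.sin(2 * math.pi * t),
     lambda t: 2 * math.pi * math.cos(2 * math.pi * t)),
    (lambda t: math.cos(6 * math.pi * t),
     lambda t: -6 * math.pi * math.sin(6 * math.pi * t)),
]

results = [(sup_norm_sq(v), 2 * h1_norm_sq(v, dv)) for v, dv in fields]
for sup_sq, bound in results:
    print(sup_sq, "<=", bound)
```

For the constant field the left side is 1 and the right side is 2, which shows that the factor 2 leaves room even in the case with vanishing derivative.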
We may now complete the proof of Theorem 8.6 in regard to completeness and separability.
Since f is absolutely continuous we can subdivide f into finitely many subarcs $f_r$ such that each of them is representable in some geodesic parallel coordinates by functions $f_r^i(t)$ (i = 1, ..., n = dim M and $t \in I_r$) with $|f_r(t)| < \varepsilon_p$ (cf. Lemma 8.1) and $\bigcup I_r = [0, 1]$, $\sum |I_r| = 1$. All the $f_r^i$ will be $H_1$ functions if and only if $f \in H_1(S^1, M)$. Since $J(f) < \infty$ we can also assume that these subarcs are so short that
(f(t), f(t)) dt < e r,
ml 8C m2n 3
where C, m1, m2 are the constants of 8.1 (1), and n is the dimension of M. We now consider the restriction of any v e H1(S1, TM,) to I i.e. to a vector field along f . The coordinates of v will be called vi. Then by 8.1
r,
I
(v(t ), v(t)) +
Z ml
rv
D dt dt /j
,
dt
` ((v `(t))2 + {v`(t) + I' (f,(t)) fr(t) Vk(t))2) . dt.
We use (A + B)2 ? +A2  B2 to obtain
ft ((v`)2 +
mi
)J`
fyv
11v
(1'ik frvk)2) dt
!
? ml
U1)2 + "Z (D1)2)
 C2n (E11vk)2} dt .
Note that by 8.1 (1) and 8.8
vk)2 S n > (vk)2 5
and note also that < v, v>
jr
Z
rnl
2 Jr i
( `
n m1
(v(s), v(t)) 5
m1
((v')2 + (n1)2) dt 
2n rY
f (t)) di, so that f (f(t), ,
2C2n3 m1
(j(1), j(1)) dt.
iv
By the choice of the I,, we have
+
2C2n3
m1
J f (f(t), f(t)) dt < 1 +
t
4
which implies
17V _
Y ((v1(t))2 + (v'(t))2) di.
_m_ 3 f1v
Similarly, we may deduce 0. Then if <w, w>h12 = IIwII < 8 we have IIwtI,, < 26 (cf. Lemma 8.8 above). Het.ce 11v + wll W < e, so that the 6ball around v is in 0(f). Q.E.D. 8.13. Definition: For f, h E H, (S1, M) put d. (f, h) = max du(f(t), h(t));
(1)
tES,
given.f e H1(S1, M) and e < l e, (cf. Lemma 8.1). Put
U(f) = {h I h E Hl (S1, M) and d. (f, h) < e).
(2)
This set is introduced as a standard coordinate neighborhood of f, a definition justified by the two following lemmas. Note that U(f) is an open subset of $H_1(S^1, M)$ since the original Riemannian metric of M and the metric induced by an embedding $M \subset R^N$ (as used in Definition 8.11) are equivalent and $V(f) = \{h \in H_1(S^1, R^N);\ \max d_{R^N}(h(t), f(t)) < \delta\}$ is open in $H_1(S^1, R^N)$ by the proof of Lemma 8.12.
8.14. Lemma: The following formula defines a 1-1 correspondence $\eta$ between U(f) and O(f): $h(t) = \exp_{f(t)}(v(t))$.
Proof: We have $|v(t)| = d_M(f(t), h(t))$ by the radial isometry of the map exp, and hence $\|v\|_\infty = d_\infty(f, h)$. Since $\varepsilon < \varepsilon_p$, and assuming $h \in U(f)$, the inverse $\exp_{f(t)}^{-1}$ is well defined at every point h(t) and hence the equation displayed in the statement of the lemma is inverted by $v(t) = \exp_{f(t)}^{-1}(h(t))$. Finally, the maps $\exp_p^{-1}(\ )$ depend differentiably on p, which implies h is an $H_1$-curve if and only if v is an $H_1$-vector field. Q.E.D.
8.15. Lemma: The 1-1 mapping $\eta$ of Lemma 8.14, given by $\eta(h) = v$, is a $C^{k-5}$ diffeomorphism between the open subset U(f) of the manifold $H_1(S^1, M)$ and the open set O(f) of the Hilbert space $H_1(S^1, TM_f)$.
8.16. Corollary: The mappings $\eta : U(f) \to O(f)$ of Lemma 8.14 and 8.15 define valid $C^{k-5}$ coordinates on the manifold $H_1(S^1, M)$. We refer to them as standard coordinates near f.
Remark: It is possible to define the differentiable structure of $H_1(S^1, M)$ directly with the aid of Lemma 8.14 independently of Definition 8.11; in this case one has to show that the change-of-coordinates maps $\eta_2 \eta_1^{-1}$ are of class $C^{k-5}$.
The proof of Lemma 8.15 is based on Theorem 8.10 and Whitney's embedding theorem. We start with some considerations which we shall need again.
8.17. Using Whitney's embedding theorem, we may take M as a $C^k$-submanifold of some $R^N$. The Whitney sum $TM \oplus \nu M$ of the tangent bundle TM and the normal bundle $\nu M$ is then the trivial bundle $M \times R^N$. Using $M \subset R^N$ we embed the trivial bundle $M \times R^N$ in $R^N \times R^N$. Since the tangent bundle TM is a $C^{k-1}$-subbundle of the trivial bundle we thus get TM as a $C^{k-1}$-submanifold of $R^N \times R^N$. This embedding has the following properties. If we identify M with the zero section of TM, then this submanifold of TM is embedded in $R^N \times \{0\}$. Moreover the tangent space $M_p$ of M at p is embedded as a linear subspace of $\{p\} \times R^N$ in such a way that the linear structure of $M_p$ is preserved. In view of Theorem 8.10 we have then $H_1(S^1, TM)$ as a $C^{k-5}$-submanifold of $H_1(S^1, R^N \times R^N)$, and for fixed $f \in H_1(S^1, M)$, $H_1(S^1, TM_f)$ is a linear subspace (and therefore a $C^\infty$-submanifold) of $H_1(S^1, R^N \times R^N)$ and consequently $H_1(S^1, TM_f)$ is a $C^{k-5}$-submanifold of $H_1(S^1, TM)$. In the same way the Whitney sum $TM \oplus TM$ is a $C^{k-1}$-submanifold of $R^N \times R^N \times R^N$ so that the linear structure of the fibers is preserved. Consequently $H_1(S^1, TM \oplus TM)$ is a $C^{k-5}$-submanifold of $H_1(S^1, R^N \times R^N \times R^N)$ (Theorem 8.10) and $H_1(S^1, TM_f) \times H_1(S^1, TM_f)$ is a linear subspace of $H_1(S^1, R^N \times R^N \times R^N)$ and a $C^{k-5}$-submanifold of $H_1(S^1, TM \oplus TM)$.
Proof of Lemma 8.15: We assume 8.17. Consider the map $\varphi : TM \to M$ defined by $\varphi(v) = \exp_{p(v)}(v)$ (where p(v) is the base point of v). Then $\varphi \in C^{k-2}$ and by Theorem 8.10 the induced map $\tilde\varphi : H_1(S^1, TM) \to H_1(S^1, M)$ belongs to $C^{k-5}$ (not $C^{k-4}$ since TM is only $C^{k-1}$). But the restriction of $\tilde\varphi$ to the $C^{k-5}$-submanifold $H_1(S^1, TM_f)$ (cf. 8.17) is the map $\eta^{-1}$ of our Lemma, proving that $\eta^{-1} \in C^{k-5}$. To show that we also have $\eta \in C^{k-5}$, we argue as follows. On the open subset $U = \{(p, q) \in M \times M;\ d(p, q) < \varepsilon_p\}$ the map $\gamma : U \to TM$ given by $\gamma(p, q) = \exp_p^{-1}(q)$ is well defined and $C^{k-2}$. By Theorem 8.10 $\gamma$ induces a $C^{k-5}$ map $\tilde\gamma$ of an open subset of $H_1(S^1, M) \times H_1(S^1, M)$ into $H_1(S^1, TM)$. The domain of $\tilde\gamma$ contains the $C^{k-5}$ submanifold $\{f\} \times U(f)$ (cf. 8.13) and the restriction of $\tilde\gamma$ to $\{f\} \times U(f)$ coincides with $\eta$ by the proof of Lemma 8.14. Q.E.D.
The next Lemma shows that the Hilbert manifold $H_1(S^1, TM)$ may be identified with the tangent manifold $TH_1(S^1, M)$.
8.18. Lemma: Let 0: TM + M be the map given by 45(v) = exp (v), and let 0 be the induced map (Theorem 8.10) of H, (S,, TM)  H, (S,, M).
Let 0 be the map defined by 0(v) = ds 0(sv) Is_ o, so that 0: H, (S,, TM)  TH, (S,, M). Then 0 is a Ck ` 6 diffeomorphism of H, (S,, TM) onto TH, (S,, M). S Proof: From the proof of Lemma 8.15 we have that 0 e Ck so that 0 e C". Write p(v) for the base point of the vector v e TM, or for the base, point curve of the H,vector field v E H, (S,, TM), as the case may be. Let v e H, (S, , TM). The point 0(v) E H, (S,, M) has by Corollary 8.16 the coordinate v in the standard coordinate neighborhood U(p(v)) near p(v).
=
Thus ds ; FP (sv)
(v)is that tangent vector of H, (S,, M) atp(v), which,
s= o
in the coordinate system of TH, (S, , M) corresponding to U (p(v)), has the coordinate (0, v). This shows that 0 is a 11 C16 mapping of H, (S,, TM) onto TH, (S,, M).
To prove that 0' is also of class Ck6 consider the coordinate system of TH, (S,, M) corresponding to U(p(v)) and denote the C"`6coordinate map by ^, i.e. : C(v) + 0 (p(v)) x H, (S, , TM,(,,)). We shall find a C'5 map a such that 0' = a o rj, which proves 01 a Ck6. Now use 8.17. Let p = p(v,) and q = exp,(v,). By Lemma 8.1 there exists e, > 0 such that if V1, v2 a M, and Iv21 < e, then P (VI, v2) = expq' (exp,(v, + v2)) defines a C1,2 function whose value is a vector
tangent to M at q. Thus a (v,, v2) = s e (v,, sv2)
defines a Ck3
I
Ss0
map a: TM ® TM  TM. By Theorem 8.10 a induces a Cks map H, (S,, TM (D TM) , H, (S,, TM) and therefore by restriction a
Ck s
map (cf. 8.17)
v: H, (S, , TiVf f) x H, (S,, TMf)  H, (S,, TM). For (v,, v2) e H, (S,, TM r) x H, (S,, TMf) and h = 0(v,) we have ds
(eXp n
=
` (exp.r (vl + sv2)))
(v1, v2)
s=o
(cf. 8.10 (ii), more explicitly we have R (s, v, , v2) = P (v,, sv2)and a (v,, v2) (1, 0, 0). Apply 8.10 (ii) to R and obtain = a (v1, v2) (:) = dR(o,a,(t).V2(t)) (1, 0, 0)
= ds (exp+,a) (eXpf(,) (v,(t) + sv2(:)))
LO)
Thus
exph ' (exp f (v, + sv2)) = s ' a (v, , v2) + o(s) E H, (Si , TMh). Since rh (exph ' (exp f (v, + sv2))) = 0 (v1 + s v2) and since ?7' = u11(S1.TMI)
(by the proof of 8.15) it follows that
d ds
d 
= ds 0 (v, + sv2)
0 (sa (L ] , v2)) S=o
ds
S=Q
is that tangent vector of H, (S,, M) at the point h = 0(v,) whose coordinates in the coordinate neighborhood rl(v,) of TH, (S,, M) are which (v, , v2). In other words i3 (0 (a (VI , v2))) = (v, , v2) or a o completes the proof. Q.E.D. Our next aim is to prove the differentiability of the energy integral and to introduce the intrinsic Riemannian metric for HI (SI , M), more precisely : i.e.
(a (VI , v2))
8.19. Theorem: The energy J is a $C^{k-5}$ function on $H_1(S^1, M)$.
8.20. Theorem: Suppose that we represent the tangent space to $H_1(S^1, M)$ at f by $H_1(S^1, TM_f)$ (cf. Corollary 8.16) and take as scalar product in the tangent space the intrinsic scalar product for $H_1(S^1, TM_f)$ defined in 8.6 (3). Then this scalar product defines a $C^{k-6}$ Riemannian metric for $H_1(S^1, M)$.
Before we prove these two theorems, we make the following observations.
8.21. Let M be embedded in $R^N$ and TM in $R^N \times R^N$ as described in 8.17. The $C^{k-1}$ Riemannian metric g of M can be extended to a $C^{k-2}$ Riemannian metric of $R^N$ in such a way as to make M a totally geodesic submanifold (using the normal bundle of M in $R^N$ and partitions of unity). This implies that the covariant derivative along a curve $f \in H_1(S^1, M)$ is the same if f is considered as a curve in M or as a curve in $R^N$. We write the scalar product as g(p)(v, w) for (p, v) and (p, w) $\in R^N \times R^N$ (this is of course bilinear in v, w). The following statements are very similar to Lemma 6.9 and are proved in the same way. (1) Let b(p)(v, w) be bilinear in $v, w \in R^N$ and of class $C^k$ in $p \in R^N$, and define a $C^{k-2}$ function
b: H,(S,, R") x H,(S,, R") x H,(S,, R"), R by
b(f) (v, it) = Lb(f1) (i'(t ), w(t )) dt . 14a Schwartz. Nonlinear
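(2) Let $b(\cdot)(\cdot,\cdot) : \mathbb{R}^N \to L^2(\mathbb{R}^M)$ be a $C^k$ map from $\mathbb{R}^N$ into the bilinear forms on $\mathbb{R}^M$. Then we define a $C^{k-2}$ map
\[
b(\cdot)(\cdot,\cdot) : H_1(S_1, \mathbb{R}^N) \to L^2\bigl(H_1(S_1, \mathbb{R}^M)\bigr)
\]
by
\[
b(f)(v, w) = \int_0^1 b(f(t))\bigl(v(t), \dot w(t)\bigr)\,dt .
\]

(3) Let $c(\cdot,\cdot)(\cdot,\cdot) : \mathbb{R}^N \times \mathbb{R}^N \to L^2(\mathbb{R}^M)$ be a $C^k$ map from $\mathbb{R}^N \times \mathbb{R}^N$, which is linear in the second argument, into $L^2(\mathbb{R}^M)$. Then we define a $C^{k-1}$ map
\[
c(\cdot,\cdot)(\cdot,\cdot) : H_1(S_1, \mathbb{R}^N) \times H_1(S_1, \mathbb{R}^N) \to L^2\bigl(H_1(S_1, \mathbb{R}^M)\bigr)
\]
by
\[
c(f, h)(v, w) = \int_0^1 c\bigl(f(t), \dot h(t)\bigr)\bigl(v(t), \dot w(t)\bigr)\,dt .
\]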
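We show $c(f, h) \in L^2(H_1(S_1, \mathbb{R}^M))$ to indicate the kind of changes which have to be made to adapt the proof of Lemma 6.9. We use Lemma 8.8 and Schwarz' inequality, and we denote by $e_i$ the $i$th unit vector in $\mathbb{R}^N$, so that $\dot h(t) = \sum_i e_i \dot h_i(t)$. Then
\[
\bigl|c(f(t), \dot h(t))(v(t), \dot w(t))\bigr| \le \|c(f(t), \dot h(t))\|_{L^2(\mathbb{R}^M)} \cdot |v(t)|_{\mathbb{R}^M} \cdot |\dot w(t)|_{\mathbb{R}^M},
\qquad
\|c(f(t), \dot h(t))\|_{L^2(\mathbb{R}^M)} \le \sum_i |\dot h_i(t)| \, \|c(f(t), e_i)\|_{L^2(\mathbb{R}^M)},
\]
and therefore
\[
|c(f, h)(v, w)| \le \max_{t \in [0,1]} \Bigl(\sum_i \|c(f(t), e_i)\|^2_{L^2(\mathbb{R}^M)}\Bigr)^{1/2} \int_0^1 |\dot h(t)| \, \max_{t \in [0,1]} |v(t)| \, |\dot w(t)|\,dt
\]
\[
\le \max_{t \in [0,1]} \Bigl(\sum_i \|c(f(t), e_i)\|^2_{L^2(\mathbb{R}^M)}\Bigr)^{1/2} \cdot \|h\|_{H_1(S_1, \mathbb{R}^N)} \cdot \sqrt{2}\, \|v\|_{H_1(S_1, \mathbb{R}^M)} \cdot \|w\|_{H_1(S_1, \mathbb{R}^M)} .
\]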
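Proof of Theorem 8.19: With the notations of 8.21,
\[
2J(f) = \int_0^1 g(f(t))\bigl(\dot f(t), \dot f(t)\bigr)\,dt
\]
is the restriction of the $C^{k-4}$ function
\[
(f, v, w) \mapsto \int_0^1 g(f(t))\bigl(\dot v(t), \dot w(t)\bigr)\,dt \qquad \text{(cf. 8.21 (1))}
\]
on $H_1(S_1, \mathbb{R}^N) \times H_1(S_1, \mathbb{R}^N) \times H_1(S_1, \mathbb{R}^N)$ to the diagonal of the $C^{k-5}$ submanifold $H_1(S_1, M) \times H_1(S_1, M) \times H_1(S_1, M)$.
Q.E.D.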
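As a hedged illustration of the formula for $2J(f)$ used in this proof, the following worked computation treats the simplest case. It rests on assumptions that are not taken from the text: the ambient metric $g$ is the Euclidean scalar product on $\mathbb{R}^2$, and $f$ is the unit circle traversed once at constant speed.

```latex
% Assumptions (illustrative only): g is the Euclidean scalar product on R^2,
% and f is the unit circle parametrized over [0,1].
\begin{align*}
f(t) &= (\cos 2\pi t,\ \sin 2\pi t), \qquad 0 \le t \le 1,\\
\dot f(t) &= 2\pi\,(-\sin 2\pi t,\ \cos 2\pi t), \qquad |\dot f(t)|^2 = 4\pi^2,\\
2J(f) &= \int_0^1 g(f(t))\bigl(\dot f(t), \dot f(t)\bigr)\,dt
       = \int_0^1 4\pi^2\,dt = 4\pi^2 .
\end{align*}
```

In this example $J(f) = 2\pi^2$, and the length $2\pi$ of the circle satisfies $(2\pi)^2 = 2J(f)$, which is the equality case of the Cauchy-Schwarz inequality for a curve of constant speed.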
Proof of 8.20: We identify $TH_1(S_1, M)$ and $H_1(S_1, TM)$ using Lemma 8.18, i.e. we represent tangent vectors of $H_1(S_1, M)$ by elements of $H_1(S_1, TM)$. For $v \in H_1(S_1, TM)$ we denote the covariant derivative along
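the base point curve $p(v)$ by $Dv/dt$ (cf. 8.21). Then the scalar product according to 8.6 (3) is
\[
\langle v, w \rangle_1 = \int_0^1 \Bigl\{ g\bigl(p(v(t))\bigr)\bigl(v(t), w(t)\bigr) + g\bigl(p(v(t))\bigr)\Bigl(\frac{Dv}{dt}, \frac{Dw}{dt}\Bigr) \Bigr\}\,dt,
\]
where $v$ and $w$ have the same base point curve $p(v)$. We have to show that this formula defines a $C^{k-6}$ section from $H_1(S_1, M)$ into the positive-definite, symmetric, continuous bilinear forms on $TH_1(S_1, M) = H_1(S_1, TM)$, cf. Lemma 8.18. Taking, for example, Cartesian coordinates for $\mathbb{R}^N$ we may write the covariant derivative along $f$ as
\[
\Bigl(\frac{Dv}{dt}\Bigr)^i = \frac{dv^i}{dt} + \sum_{j,k=1}^{N} \Gamma^i_{jk}(f(t))\, v^j(t)\, \dot f^k(t), \qquad i = 1, \dots, N,
\]
where the $\Gamma^i_{jk}$ are $C^{k-3}$ functions defined on $\mathbb{R}^N$ (cf. 8.21). We write $\Gamma(f(t))[v(t), \dot f(t)]$ for the vector $\bigl\{\sum_{j,k} \Gamma^i_{jk}(f(t))\, v^j(t)\, \dot f^k(t)\bigr\} \in \mathbb{R}^N$. By 8.21 (2) and (3) we may define a $C^{k-5}$ map
\[
F : H_1(S_1, \mathbb{R}^N) \times H_1(S_1, \mathbb{R}^N) \to L^2\bigl(H_1(S_1, \mathbb{R}^N)\bigr)
\]
by
\[
F(f, h)(v, w) = \int_0^1 \Bigl\{ g(f(t))\bigl(v(t), w(t)\bigr) + g(f(t))\bigl(\dot v(t) + \Gamma(f(t))[v(t), \dot h(t)],\ \dot w(t) + \Gamma(f(t))[w(t), \dot h(t)]\bigr) \Bigr\}\,dt .
\]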
The restriction of $F$ first to the diagonal of $H_1(S_1, \mathbb{R}^N) \times H_1(S_1, \mathbb{R}^N)$, which we identify with $H_1(S_1, \mathbb{R}^N)$, and then to the submanifold $H_1(S_1, M)$, is again $C^{k-5}$. In other words, we have a $C^{k-5}$ Riemannian metric on the trivial vector bundle $H_1(S_1, M) \times H_1(S_1, \mathbb{R}^N)$ and consequently also a $C^{k-5}$ Riemannian metric on the $C^{k-5}$ subbundle $H_1(S_1, TM)$, cf. 8.17. We lose one more order of differentiability since the identification of $H_1(S_1, TM)$ with the tangent bundle $TH_1(S_1, M)$ is only $C^{k-6}$ (cf. Lemma 8.18). This Riemannian metric induces the right topology by Theorem 8.9. Q.E.D.

We continue with some observations concerning a useful family of differentiable curves on $H_1(S_1, M)$.

8.22. Notation: An element $f \in H_1(S_1, M)$ will be called a curve on $M$ or a point of $H_1(S_1, M)$ depending on the situation. A curve $\kappa$ on $H_1(S_1, M)$ will always be a map $\kappa : [a, b] \to H_1(S_1, M)$.
8.23. Definition: Let $f_0, f_1 \in H_1(S_1, M)$ be such that $d_\infty(f_0, f_1) < \varepsilon$, which implies that the shortest geodesics on $M$ joining $f_0(t)$ and $f_1(t)$ are unique. Then put:
\[
\gamma(s, t) = \exp_{f_0(t)}\bigl(s \cdot \exp^{-1}_{f_0(t)}(f_1(t))\bigr).
\]
From this function of two variables we obtain a curve $\gamma : s \to \gamma(s) \in H_1(S_1, M)$ by writing $\gamma(s)(t) = \gamma(s, t)$.

8.24. Lemma: The $\gamma$-curves of Definition 8.23 are $C^{k-5}$ differentiable. The tangent vector $\frac{\partial \gamma}{\partial s}(0) \in H_1(S_1, TM_{f_0})$ is the coordinate of $f_1$ in the standard coordinate system near $f_0$ (cf. Lemma 8.14 and Corollary 8.16). We have $\bigl|\frac{\partial \gamma}{\partial s}(s, t)\bigr| = d_M(f_0(t), f_1(t))$ (for all $s$, $0 \le s \le 1$); hence $\bigl\|\frac{\partial \gamma}{\partial s}\bigr\|_\infty = d_\infty(f_0, f_1)$.
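To make Definition 8.23 and the statement of Lemma 8.24 concrete, here is a sketch of the flat model case, writing $\gamma$ for the curve of Definition 8.23. It assumes, purely for illustration and beyond what the text states, that the metric is Euclidean near the curves, so that the exponential map is a translation.

```latex
% Assumption (illustrative only): locally Euclidean metric, so exp_p(v) = p + v.
\begin{align*}
\exp_{p}(v) &= p + v, \qquad \exp_{p}^{-1}(q) = q - p,\\
\gamma(s,t) &= f_0(t) + s\,\bigl(f_1(t) - f_0(t)\bigr),\\
\frac{\partial \gamma}{\partial s}(s,t) &= f_1(t) - f_0(t) \quad \text{(independent of } s\text{)},\\
\Bigl|\frac{\partial \gamma}{\partial s}(s,t)\Bigr| &= |f_1(t) - f_0(t)| = d_M\bigl(f_0(t), f_1(t)\bigr).
\end{align*}
```

In this model the tangent vector at $s = 0$ is simply the difference $f_1 - f_0$, and taking the maximum over $t$ gives $d_\infty(f_0, f_1)$, in agreement with the lemma.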
Proof: In the coordinate system centered at $f_0$ the $\gamma$-curves have the following representation:
\[
v(s)(t) = s \cdot \exp^{-1}_{f_0(t)}(f_1(t)),
\]
where $v(s) \in H_1(S_1, TM_{f_0})$. This implies that the $\gamma$-curves are as often differentiable as the change of coordinates map, i.e. are $C^{k-5}$. Moreover, $\frac{d}{ds} v(s)\big|_{s=0}$ is the coordinate of $f_1$. Finally, $\bigl|\frac{\partial \gamma}{\partial s}(s, t)\bigr|$ is the length of the tangent vector to the geodesic $\gamma(s, t)$, $t = \text{const.}$, and therefore equals $d_M(f_0(t), f_1(t))$.

8.25. Lemma: A $C^1$ curve $\kappa : [0, 1] \to H_1(S_1, M)$ considered as map $[0, 1] \times [0, 1] \to M$ by putting $\kappa(s, t) = \kappa(s)(t)$ is a homotopy between the end points $\kappa(0), \kappa(1) \in H_1(S_1, M)$. Moreover $\frac{\partial \kappa}{\partial s}(s, t)$ is a continuous vector field on $M$ along $\kappa$. This implies that the deformation paths $\kappa(s, t_0)$, $t_0 = \text{constant}$, are rectifiable curves on $M$ and that their length depends continuously on $t_0$.
Proof: Since $J$ is continuous and $\kappa[0, 1]$ is compact (in $H_1(S_1, M)$) we have $\max_{s \in [0,1]} J(\kappa(s)) = A < \infty$. Therefore, by Lemma 8.3, the $\kappa(s)$ are an equicontinuous family of curves on $M$. Equicontinuity in one variable and continuity in the other implies continuity in both variables. In this case $\kappa(s)(t_0)$ is also equicontinuous in $s$ for $t_0 \in [0, 1]$. To see this let $v(s)$ be the coordinate of $\kappa(s)$ in some standard coordinate system on $H_1(S_1, M)$ (cf. 8.13 (2)), i.e. for some $f \in H_1(S_1, M)$ and $\delta > 0$ let $\kappa(s) \in U(f)$ and $v(s) \in H_1(S_1, TM_f)$ for $|s - s_0| < \delta$. By Lemma 8.8 we have
\[
\|v(s) - v(s_0)\|_\infty^2 = \max_{t} |v(s)(t) - v(s_0)(t)|^2 \le 2\, \|v(s) - v(s_0)\|_1^2,
\]
and therefore the continuity of $v(s)$ implies the stated equicontinuity. To prove the second part of the Lemma observe that the derivative of a $C^1$ curve in $H_1(S_1, M)$ represents a tangent vector, so that $\frac{\partial \kappa}{\partial s} \in H_1(S_1, TM_{\kappa(s)})$. Hence $\frac{\partial \kappa}{\partial s}(s, t)$ is continuous in $t$ for fixed $s$. Using coordinates we see as before that $\frac{\partial \kappa}{\partial s}(s, t_0)$ is equicontinuous in $s$ for fixed $t_0$, $t_0 \in [0, 1]$. This implies continuity of $\frac{\partial \kappa}{\partial s}(s, t)$ in both variables.
Q.E.D.
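8.26. Lemma: Let $s \to \kappa(s)$ be a $C^1$ curve on $H_1(S_1, M)$ and $s \to w(s)$ be a $C^1$ vector field along $\kappa(s)$. Define $\varphi(s, t) = \kappa(s)(t)$ and $v(s, t) = w(s)(t)$. Then
\[
\frac{D}{\partial t} \frac{\partial \varphi}{\partial s} = \frac{D}{\partial s} \frac{\partial \varphi}{\partial t} \qquad \text{(almost everywhere)},
\]
and
\[
\Bigl(\frac{D}{\partial t} \frac{D}{\partial s} - \frac{D}{\partial s} \frac{D}{\partial t}\Bigr) v(s, t) = R\Bigl(\frac{\partial \varphi}{\partial t}, \frac{\partial \varphi}{\partial s}\Bigr) v(s, t) \qquad \text{almost everywhere},
\]
where $R$ is the $C^{k-3}$ curvature tensor of $M$ and $D/\partial t$ (resp. $D/\partial s$) is the covariant derivative along the curves $\varphi(s_0, t)$ (resp. $\varphi(s, t_0)$) on $M$.

Proof: The formulae are well known for $\varphi, v \in C^2$ and extend by the usual limit arguments to the above situation. Q.E.D.

8.27. Definition: Let $\kappa : [0, 1] \to H_1(S_1, M)$ be a $C^1$ curve; then
\[
L(\kappa) = \int_0^1 \Bigl\|\frac{\partial \kappa}{\partial s}\Bigr\|_1 ds = \int_0^1 \Bigl( \int_0^1 \Bigl\{ \Bigl\langle \frac{\partial \kappa}{\partial s}, \frac{\partial \kappa}{\partial s} \Bigr\rangle + \Bigl\langle \frac{D}{\partial t} \frac{\partial \kappa}{\partial s}, \frac{D}{\partial t} \frac{\partial \kappa}{\partial s} \Bigr\rangle \Bigr\}\,dt \Bigr)^{1/2} ds
\]
is the length of $\kappa$ and
\[
E(\kappa) = \frac{1}{2} \int_0^1 \Bigl\|\frac{\partial \kappa}{\partial s}\Bigr\|_1^2\,ds
\]
is the energy of $\kappa$.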
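8.28. Remark: $L^2(\kappa) \le 2E(\kappa)$ may be proved along precisely the lines of the proof of Lemma 8.3.

8.29. Definition: In view of Lemma 8.25, define the Riemannian distance between $f_0$ and $f_1 \in H_1(S_1, M)$ by
\[
d(f_0, f_1) =
\begin{cases}
\inf\limits_{\kappa(0) = f_0,\ \kappa(1) = f_1} L(\kappa) & \text{if } f_0 \text{ and } f_1 \text{ are homotopic as curves on } M,\\[4pt]
\infty & \text{if } f_0, f_1 \text{ are not homotopic.}
\end{cases}
\]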
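The inequality of Remark 8.28 can be sketched directly from Definition 8.27 with the Cauchy-Schwarz inequality. The following is one possible derivation and is not necessarily the argument of Lemma 8.3; it writes $\kappa$ for the curve and $\|\cdot\|_1$ for the norm appearing in Definition 8.27.

```latex
% Sketch: Cauchy-Schwarz on [0,1] applied to the integrand of L and the
% constant function 1.
\begin{align*}
L(\kappa)
  &= \int_0^1 \Bigl\|\frac{\partial \kappa}{\partial s}\Bigr\|_1 \cdot 1 \, ds
   \le \Bigl(\int_0^1 1^2 \, ds\Bigr)^{1/2}
       \Bigl(\int_0^1 \Bigl\|\frac{\partial \kappa}{\partial s}\Bigr\|_1^2 ds\Bigr)^{1/2}
   = \bigl(2E(\kappa)\bigr)^{1/2},\\
L(\kappa)^2 &\le 2E(\kappa).
\end{align*}
```

Equality holds exactly when $\|\partial \kappa / \partial s\|_1$ is constant in $s$ almost everywhere, i.e. when $\kappa$ is parametrized proportionally to arc length.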
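8.30. Theorem: $\bigl|\sqrt{2J(f_0)} - \sqrt{2J(f_1)}\bigr| \le d(f_0, f_1)$.

Proof: Either $d(f_0, f_1) = \infty$ and nothing has to be proved or there is a $C^1$ curve $\kappa$ joining $f_0$ and $f_1$. In this latter case we have
\[
\frac{d}{ds} \sqrt{2J(\kappa(s))} = \frac{1}{2\sqrt{2J(\kappa(s))}} \cdot \frac{d}{ds} \int_0^1 \Bigl\langle \frac{\partial \kappa}{\partial t}(s, t), \frac{\partial \kappa}{\partial t}(s, t) \Bigr\rangle dt = \frac{1}{\sqrt{2J(\kappa(s))}} \int_0^1 \Bigl\langle \frac{D}{\partial s} \frac{\partial \kappa}{\partial t}, \frac{\partial \kappa}{\partial t} \Bigr\rangle dt .
\]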
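Hence, using Lemma 8.26 to change the order of differentiation and noting the identity of Definition 8.27,
\[
\Bigl|\frac{d}{ds} \sqrt{2J(\kappa(s))}\Bigr| \le \frac{1}{\sqrt{2J(\kappa(s))}} \Bigl(\int_0^1 \Bigl|\frac{\partial \kappa}{\partial t}\Bigr|^2 dt\Bigr)^{1/2} \Bigl(\int_0^1 \Bigl|\frac{D}{\partial t} \frac{\partial \kappa}{\partial s}\Bigr|^2 dt\Bigr)^{1/2} = \Bigl(\int_0^1 \Bigl|\frac{D}{\partial t} \frac{\partial \kappa}{\partial s}\Bigr|^2 dt\Bigr)^{1/2} \le \Bigl\|\frac{\partial \kappa}{\partial s}\Bigr\|_1 .
\]

… on $M$ by putting $\kappa_\tau(s)(t) = \kappa(s, \tau)$, $0 \le t \le 1$. We have
\[
L(\kappa_\tau) = \int_0^1 \Bigl\|\frac{\partial \kappa_\tau}{\partial s}\Bigr\|_1 ds = \int_0^1 ds \Bigl( \int_0^1 dt \Bigl\{ \Bigl|\frac{\partial \kappa_\tau}{\partial s}(s)(t)\Bigr|^2 + \Bigl|\frac{D}{\partial t} \frac{\partial \kappa_\tau}{\partial s}\Bigr|^2 \Bigr\} \Bigr)^{1/2} = \int_0^1 ds\, \Bigl|\frac{\partial \kappa}{\partial s}(s, \tau)\Bigr| = L_M(\kappa_\tau),
\]
since $\partial \kappa_\tau / \partial s$ is independent of $t$ and $\frac{D}{\partial t} \frac{\partial \kappa_\tau}{\partial s} = 0$. Therefore the curve $\kappa_\tau$ on $H_1(S_1, M)$ and the curve $\kappa_\tau$ on $M$ have the same length. Moreover
\[
\inf_\tau L_M(\kappa_\tau) = \inf_\tau \int_0^1 ds\, \Bigl|\frac{\partial \kappa}{\partial s}(s, \tau)\Bigr| \le \int_0^1 d\tau \int_0^1 ds\, \Bigl|\frac{\partial \kappa}{\partial s}(s, \tau)\Bigr|
\]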
f1ds(f'f._(s,T)
inf LM(x,)
2 IIRII do max
(4)
J2J(y(s)).
as (0)
Remarks: (1) gives a bound for the energy integral along a $\gamma$-curve in terms of the energy of the one end point, $J(f_0)$, and in terms of the coordinate of the other end point, $\frac{\partial \gamma}{\partial s}(0)$.
(2) shows that a $\gamma$-curve is parametrized almost proportionally to arc length.
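Proof: (3) and (4) are immediate consequences of (2). To prove (2), we note that, by 8.6 (3) and 8.8 (4),
\[
\frac{d}{ds}\, \frac{1}{2} \Bigl\|\frac{\partial \gamma}{\partial s}(s)\Bigr\|_1^2 = \int_0^1 \Bigl\{ \Bigl\langle \frac{D}{\partial s} \frac{\partial \gamma}{\partial s}, \frac{\partial \gamma}{\partial s} \Bigr\rangle + \Bigl\langle \frac{D}{\partial s} \frac{D}{\partial t} \frac{\partial \gamma}{\partial s}, \frac{D}{\partial t} \frac{\partial \gamma}{\partial s} \Bigr\rangle \Bigr\}\,dt .
\]
By integrating this equation and using $\frac{D}{\partial s} \frac{\partial \gamma}{\partial s} = 0$, 8.26, $\bigl|\frac{\partial \gamma}{\partial s}(s, t)\bigr| \le d_\infty$, and the Cauchy-Schwarz inequality we obtain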
ay _y D ay
'a