Central European Science Journals (www.cesj.com)
Central European Journal of Mathematics
DOI: 10.1007/s11533-005-0001-6 Research article CEJM 4(1) 2006 1–4
Exact laws for sums of ratios of order statistics from the Pareto distribution

André Adler∗
Department of Mathematics, Illinois Institute of Technology, Chicago, Illinois, 60616, USA
Received 25 January 2005; accepted 29 September 2005

Abstract: Consider independent and identically distributed random variables $\{X_{nk},\ 1 \le k \le m,\ n \ge 1\}$ from the Pareto distribution. We select two order statistics from each row, $X_{n(i)} \le X_{n(j)}$, for $1 \le i < j \le m$. Then we test whether or not Laws of Large Numbers with nonzero limits exist for weighted sums of the random variables $R_{ij} = X_{n(j)}/X_{n(i)}$.
© Central European Science Journals Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.
Keywords: Almost sure convergence, weak law of large numbers, strong law of large numbers, order statistics, exact laws MSC (2000): 60F05, 60F15
In this paper we observe weighted sums of ratios of order statistics taken from small samples. We look at $m$ observations from the Pareto distribution, i.e., $f(x) = p x^{-p-1} I(x \ge 1)$, where $p > 0$. Then we observe two order statistics from our sample, i.e., $X_{(i)} \le X_{(j)}$ for $i < j$. Next we obtain the random variable $R = R_{ij} = X_{(j)}/X_{(i)}$. The density of $R$ is

$$f_R(r) = \frac{p\,(m-i)!}{(m-j)!\,(j-i-1)!}\,(1 - r^{-p})^{j-i-1}\, r^{-p(m-j+1)-1}\, I(r \ge 1).$$
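The density above can be sanity-checked by simulation. The following is a minimal sketch, not part of the original paper: the function names `sample_ratio` and `empirical_tail` and all parameter values are illustrative assumptions. It uses the special case $j = i + 1$, where the density reduces to $p(m-i)\, r^{-p(m-i)-1}$, so that $P(R > r) = r^{-p(m-i)}$.

```python
import random

def sample_ratio(m, i, j, p, rng):
    """Draw one R_ij = X_(j) / X_(i) from m i.i.d. Pareto(p) variables."""
    # Inverse transform: if U is uniform on (0, 1], then U**(-1/p) has
    # density p * x**(-p-1) on x >= 1.
    xs = sorted((1.0 - rng.random()) ** (-1.0 / p) for _ in range(m))
    return xs[j - 1] / xs[i - 1]

def empirical_tail(m, i, j, p, r, n_samples=100_000, seed=0):
    """Monte Carlo estimate of P(R_ij > r) (hypothetical helper)."""
    rng = random.Random(seed)
    hits = sum(sample_ratio(m, i, j, p, rng) > r for _ in range(n_samples))
    return hits / n_samples

# Special case j = i + 1 with m = 4, i = 1, p = 1: P(R > 2) = 2**(-3).
estimate = empirical_tail(4, 1, 2, 1.0, 2.0)
exact = 2.0 ** (-1.0 * (4 - 1))
print(estimate, exact)
```

The estimate should lie close to the exact tail probability $1/8$; agreement in this one case is only a plausibility check, not a verification of the general formula.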
Our goal is to determine whether or not there exist positive constants $a_n$ and $b_N$ such that $\sum_{n=1}^{N} a_n R_n / b_N$ converges to a nonzero constant in some sense, where $\{R_n,\ n \ge 1\}$ are i.i.d. copies of $R$. These are called Exact Laws of Large Numbers since they create a fair game situation where $a_n R_n$ represents the amount a player wins on the $n$th play of some game and $b_N - b_{N-1}$ represents the corresponding fair entrance fee for the participant.

∗ E-mail: [email protected]
A. Adler / Central European Journal of Mathematics 4(1) 2006 1–4
The results in this paper are quite similar to those that can be found in Adler [2]. Thus all the proofs will be omitted. In that paper the random variable of interest was just the $k$th order statistic from a finite sample. If we have i.i.d. Pareto random variables $\{X_{nk},\ 1 \le k \le s,\ n \ge 1\}$, then the $k$th order statistic from our $n$th sample, $X_{n(k)}$, has the density

$$f_{X_{n(k)}}(x) = \frac{p\, s!}{(s-k)!\,(k-1)!}\,(1 - x^{-p})^{k-1}\, x^{-p(s-k+1)-1}\, I(x \ge 1).$$

Notice that if we set $m - i = s$ and $m - j = s - k$, which ensures that $j - i = k$, then we can utilize all the results from Adler [2]. As usual we define $\lg x = \log(\max\{e, x\})$ so that we avoid dividing by zero.

The first result shows that if $p(m-j+1) < 1$, then under an extremely mild condition an Exact Weak Law cannot exist.

Theorem 1. If $p(m-j+1) < 1$ and $\max_{1 \le n \le N} a_n = o(b_N)$, then an Exact Weak Law cannot hold, i.e., the normalized partial sum $\sum_{n=1}^{N} a_n R_n / b_N$ can only converge in probability to zero.

However, if we let $p(m-j+1) = 1$, then we can obtain unusual strong laws. But in order to obtain an Exact Strong Law we must select our coefficients and norming sequences properly.

Theorem 2. If $p(m-j+1) = 1$, then for all $\beta > 0$ we have

$$\lim_{N\to\infty} \frac{\sum_{n=1}^{N} \frac{(\lg n)^{\beta-2}}{n}\, R_n}{(\lg N)^{\beta}} = \frac{1}{\beta}\binom{m-i}{j-i-1} \quad \text{almost surely.}$$

When we allow $p(m-j+1) > 1$, then typical strong laws exist, since the first moment of our random variable is finite. The mean of that random variable can be found in the following lemma.

Lemma. If $p(m-j+1) > 1$, then

$$ER = \frac{(m-i)!\;\Gamma(m-j+1-1/p)}{(m-j)!\;\Gamma(m-i+1-1/p)}.$$
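As an illustration of the Lemma, the sketch below evaluates the closed form for $ER$ and compares it with a Monte Carlo average. It is a hedged example rather than anything from the paper: the helper names `mean_ratio` and `simulated_mean` are assumptions, and the parameters $m = 5$, $i = 2$, $j = 4$, $p = 3$ are chosen so that $p(m-j+1) = 6 > 2$ and the sample mean is well behaved. For these values the Lemma gives $ER = 3!\,\Gamma(5/3) / (1!\,\Gamma(11/3)) = 1.35$.

```python
import math
import random

def mean_ratio(m, i, j, p):
    """Closed form for E R from the Lemma; needs p * (m - j + 1) > 1."""
    if p * (m - j + 1) <= 1:
        raise ValueError("E R is infinite unless p(m - j + 1) > 1")
    return (math.factorial(m - i) * math.gamma(m - j + 1 - 1 / p)
            / (math.factorial(m - j) * math.gamma(m - i + 1 - 1 / p)))

def simulated_mean(m, i, j, p, n_samples=200_000, seed=1):
    """Monte Carlo average of R_ij = X_(j) / X_(i) (hypothetical helper)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Inverse transform sampling of m Pareto(p) variables, then sort.
        xs = sorted((1.0 - rng.random()) ** (-1.0 / p) for _ in range(m))
        total += xs[j - 1] / xs[i - 1]
    return total / n_samples

print(mean_ratio(5, 2, 4, 3.0), simulated_mean(5, 2, 4, 3.0))
```

With $a_n = 1$ and $b_n = n$ the simulated average is exactly the normalized partial sum $\sum_{n=1}^{N} R_n / N$, so the comparison also illustrates the strong law in the finite-mean regime.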
In order to obtain these strong laws we can define $\{a_n,\ n \ge 1\}$ and $\{b_n,\ n \ge 1\}$ as any pair of positive sequences as long as $b_n \uparrow \infty$, $\sum_{n=1}^{N} a_n / b_N \to L$, where $L \ne 0$, and the condition involving $c_n = b_n/a_n$ in each theorem is satisfied. If $L = 0$, then these limit theorems still hold; however, the limit is zero, which is not terribly interesting.

Theorem 3. If $1 < p(m-j+1) < 2$ and $\sum_{n=1}^{\infty} c_n^{-p(m-j+1)} < \infty$, then

$$\lim_{N\to\infty} \frac{\sum_{n=1}^{N} a_n R_n}{b_N} = \frac{L\,(m-i)!\;\Gamma(m-j+1-1/p)}{(m-j)!\;\Gamma(m-i+1-1/p)} \quad \text{almost surely.}$$

Theorem 4. If $p(m-j+1) = 2$ and $\sum_{n=1}^{\infty} \lg(c_n)/c_n^{2} < \infty$, then

$$\lim_{N\to\infty} \frac{\sum_{n=1}^{N} a_n R_n}{b_N} = \frac{L\,(m-i)!\;\Gamma(m-j+1-1/p)}{(m-j)!\;\Gamma(m-i+1-1/p)} \quad \text{almost surely.}$$

Theorem 5. If $p(m-j+1) > 2$ and $\sum_{n=1}^{\infty} c_n^{-2} < \infty$, then

$$\lim_{N\to\infty} \frac{\sum_{n=1}^{N} a_n R_n}{b_N} = \frac{L\,(m-i)!\;\Gamma(m-j+1-1/p)}{(m-j)!\;\Gamma(m-i+1-1/p)} \quad \text{almost surely.}$$
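The theorems of this paper are organized by the value of $p(m-j+1)$. The following sketch simply encodes that case distinction; the function name `regime` and the returned labels are illustrative assumptions, not notation from the paper. The exponent $p$ is converted to an exact fraction so that the boundary cases $p(m-j+1) = 1$ and $p(m-j+1) = 2$ are detected without rounding problems.

```python
from fractions import Fraction

def regime(m, i, j, p):
    """Name the result of the paper that covers R_ij for exponent p.

    p may be an int, a Fraction, or a string such as "1/2".
    """
    if not (1 <= i < j <= m):
        raise ValueError("need 1 <= i < j <= m")
    t = Fraction(p) * (m - j + 1)  # the quantity p(m - j + 1)
    if t < 1:
        return "Theorem 1: no Exact Weak Law"
    if t == 1:
        return "Theorems 2, 6 and 7: boundary case, infinite mean"
    if t < 2:
        return "Theorem 3"
    if t == 2:
        return "Theorem 4"
    return "Theorem 5"

for args in [(3, 1, 3, "1/2"), (3, 1, 2, "1/2"), (4, 1, 3, "3/4"),
             (2, 1, 2, 2), (5, 2, 4, 3)]:
    print(args, "->", regime(*args))
```

Passing `p` as a float also works when the float is exactly representable (for example 0.5), but a string or `Fraction` is the safer choice near the boundary values.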
Clearly, in all of these last three theorems the situation of $a_n = 1$ and $b_n = n = c_n$ is easily satisfied. Whenever $p(m-j+1) > 1$ we have tremendous freedom in selecting our constants. That is certainly not true when $p(m-j+1) = 1$.

Next we return to $p(m-j+1) = 1$ since there are more unanswered questions. We saw in Theorem 2 that in order to establish an Exact Strong Law we are forced to set $a_n$ to be a slowly varying function divided by $n$, while $b_n$ must also be slowly varying. If one wants to try more conventional constants such as $a_n = 1$ and $b_n = n$ we will have to set our sights a bit lower and settle for an Exact Weak Law. The Weak Law can be found in Theorem 6. We then use that Weak Law to obtain the almost sure behavior of our normalized partial sums. It turns out that the weak limit $\binom{m-i}{j-i-1}/(\alpha+1)$ is the almost sure lower limit of these normalized partial sums. This result is known as a Generalized Law of the Iterated Logarithm.

Theorem 6. If $p(m-j+1) = 1$ and $\alpha > -1$, then

$$\frac{\sum_{n=1}^{N} n^{\alpha} R_n}{N^{\alpha+1} \lg N} \xrightarrow{P} \frac{1}{\alpha+1}\binom{m-i}{j-i-1}.$$

What is interesting about these types of weak laws is that they give a false impression as to what the fair game's entrance fee should be. For example, in the famous St. Petersburg game, where $P\{X_n = 2^r\} = 2^{-r}$, $r = 1, 2, \ldots$, we have from page 252 of Feller [3]

$$\frac{\sum_{n=1}^{N} X_n}{N \lg_2 N} \xrightarrow{P} 1,$$

where $\lg_2 N$ is the logarithm to the base two. However, if one were to use $N \lg_2 N$ as the cumulative entrance fee after $N$ plays of that game, then the player would have a decided advantage. We see that via the almost sure behavior of the upper and lower limits of these normalized partial sums:

$$\liminf_{N\to\infty} \frac{\sum_{n=1}^{N} X_n}{N \lg_2 N} = 1 \quad \text{almost surely}$$

while

$$\limsup_{N\to\infty} \frac{\sum_{n=1}^{N} X_n}{N \lg_2 N} = \infty \quad \text{almost surely.}$$

Thus the player should always play this game. This result can be obtained from Klass and Teicher [4]. Similarly, we have our Generalized Law of the Iterated Logarithm.

Theorem 7. If $p(m-j+1) = 1$ and $\alpha > -1$, then

$$\liminf_{N\to\infty} \frac{\sum_{n=1}^{N} n^{\alpha} R_n}{N^{\alpha+1} \lg N} = \frac{1}{\alpha+1}\binom{m-i}{j-i-1} \quad \text{almost surely}$$

and

$$\limsup_{N\to\infty} \frac{\sum_{n=1}^{N} n^{\alpha} R_n}{N^{\alpha+1} \lg N} = \infty \quad \text{almost surely.}$$
The only way to make any of these games fair for both the house and the player is to obtain an Exact Strong Law which can only be achieved by weighting the random variables as we did in Theorem 2, see Adler [1].
Acknowledgment I would like to thank the referee for his/her helpful comments which greatly improved upon the presentation of this article and for pointing out an error in the density of Xn(k) in Adler [2].
References
[1] A. Adler: “Exact Strong Laws”, Bulletin Institute Mathematics Academia Sinica, Vol. 28(3), (2000), pp. 141–166.
[2] A. Adler: “Exact Laws for Sums of Order Statistics from the Pareto Distribution”, Bulletin Institute Mathematics Academia Sinica, Vol. 31(3), (2003), pp. 181–193.
[3] W. Feller: An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd ed., John Wiley, New York, 1968.
[4] M. Klass and H. Teicher: “Iterated Logarithm Laws for Asymmetric Random Variables Barely With or Without Finite Mean”, Annals Probab., Vol. 5(6), (1977), pp. 861–874.
DOI: 10.1007/s11533-005-0002-5 Research article CEJM 4(1) 2006 5–33
Category with a natural cone

Francisco J. Díaz∗, Sergio Rodríguez-Machín
Departamento de Matemática Fundamental, Universidad de La Laguna, C/ Astrofísico Francisco Sanchez s/n, 38271, La Laguna, España

Received 2 March 2005; accepted 21 September 2005

Abstract: Generally, in homotopy theory a cylinder object (or, its dual, a path object) is used to define homotopy between morphisms, and a cone object is used to build exact sequences of homotopy groups. Here, an axiomatic theory based on a cone functor is given. Suspension objects are associated to based objects and cofibrations, obtaining homotopy groups referred to an object and relative to a cofibration, respectively. Exact sequences of these groups are built. Algebraic and particular examples are given. We point out that the main results of this paper were already stated in [3], and the purpose of this article is to give full details of the foregoing.
© Central European Science Journals Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.
Keywords: Category, algebraic homotopy theory, cone construction MSC (2000): 55U35, 18C, 55P05, 55P40, 55Q05
1 Introduction

In this paper, basic properties of the topological cone are generalized to create a homotopy theory on arbitrary categories. The cone of a topological space is obtained by collapsing the base of its cylinder to a single point. Multiplication of real numbers on the unit interval gives rise to a projection of the double cone onto the simple cone. The natural inclusions in the double cone, together with this projection, make the double cone behave like a cylinder of the simple cone.

Dual standard constructions in the sense of P.J. Huber [9] generalize properties of the topological cone. In the present paper, some axioms about cofibrations given by H.J. Baues [1] in categories with a natural cylinder are adapted to the point of view described above: collapsing the base of the cylinder to a single point. The axiomatic theory obtained in this way allows one to obtain homotopy groups through suspensions of based objects and cofibrations. Also, exact homotopy sequences of these groups can be created.

The main axiomatic homotopy theories based on a cone are particular cases of this theory: Huber [9] obtained homotopy groups and exact sequences of them on projective and on injective homotopy theories and pointed topological spaces. However, in general, the associated semisimplicial complexes do not satisfy the Kan extension property, so the existence of homotopy groups cannot be guaranteed. Kleisli [12] defined homotopy theory on additive categories with additional properties. Seebach [18] created injective homotopy theory. Finally, Rodríguez-Machín [16] worked on additive categories.

Objects in categories with a natural cylinder in the sense of Baues [1] are cofibrant, and the model categories defined by D.G. Quillen [17] give a self-dual theory. In general, these facts are not true in categories with a natural cone. Hence, there are categories with a natural cone that are not derived from categories with a natural cylinder (it suffices that the category does not have an initial object). On the other hand, it is possible to develop a dual theory (category with a natural cocone). If the category is additive with finite limits and colimits, cofibrations and fibrations are suitably defined, and there is compatibility between the cone and cocone structures, then a proper closed model category structure is induced (see [5]). However, this result is not available for general categories, since weak equivalences cannot be defined.

The well-known homotopy theories are examples of this axiomatic theory or of its dual: the classical homotopy of topological spaces, pointed topological spaces and chain complexes [10]; projective and injective homotopy theories of R-modules [8]. Other, less well-known theories are examples too: some tensorial homotopy theories and the proper homotopy theory of exterior spaces [6].

Dedicated to Sergio, wherever he is.

∗ E-mail: [email protected]
2 Notation

The following categorical notation will be used in this paper. Given functors $E : \mathcal{B} \to \mathcal{C}$, $F, G : \mathcal{C} \to \mathcal{D}$ and $H : \mathcal{D} \to \mathcal{E}$, and a transformation $t : F \to G$, the transformations $t * E : FE \to GE$ and $H * t : HF \to HG$ will be denoted by $tE$ and $Ht$, respectively. When there is no possibility of confusion, the morphism $t_X : FX \to GX$ will be simply denoted by $t$, for every object $X$.

The pushout object of two morphisms $f$ and $g$ will be denoted by $P\{f, g\}$. The induced morphisms will be denoted by $\bar f : \operatorname{codom} g \to P\{f, g\}$ and $\bar g : \operatorname{codom} f \to P\{f, g\}$. Given a morphism $f$, if the notation $\bar f$ is already in use, a second decoration of $f$ will denote the other morphism induced by $f$ in a pushout. In particular, if $f = g$, the two induced morphisms ($\bar f$ and $\bar g$ above) are distinguished in this way.

Given morphisms $r$ and $s$ verifying $rf = sg$, the unique morphism $h$ such that $h\bar g = r$ and $h\bar f = s$ will be denoted by $\{r, s\}$. If $\operatorname{codom} f$ (resp. $\operatorname{codom} g$) is a pushout object, the component $r$ (resp. $s$) has an expression of the form $\{r_1, r_2\}$ (resp. $\{s_1, s_2\}$). In this case, the morphism $\{r, s\} = \{\{r_1, r_2\}, s\}$ (resp. $\{r, \{s_1, s_2\}\}$) will be frequently denoted by $\{r_1, r_2, s\}$ (resp. $\{r, s_1, s_2\}$). In this way, expressions of the type $\{h_0, h_1, \ldots, h_n\}$ can appear.

Given two pushout objects $P\{f, g\}$ and $P\{f', g'\}$, and three morphisms $r : \operatorname{codom} f \to \operatorname{codom} f'$, $s : \operatorname{codom} g \to \operatorname{codom} g'$ and $t : \operatorname{dom} f = \operatorname{dom} g \to \operatorname{dom} f' = \operatorname{dom} g'$ verifying $rf = f't$ and $sg = g't$, we will denote the unique morphism $\{\bar g' r, \bar f' s\}$ by $r \cup s$. If there is no possibility of confusion, expressions of the type $h_0 \cup h_1 \cup \ldots \cup h_n$ will be used.

The pullback object of two morphisms $f$ and $g$ will be denoted by $P\langle f, g\rangle$, with induced morphisms $\bar f$ and $\bar g$, respectively. The morphisms induced by the universal property will be denoted by $\langle r, s\rangle$ or $r \cap s$.

Finally, the set of extensions of a morphism $u : B \to X$ relative to another morphism $i : B \to A$ is defined by

$$\operatorname{Hom}(A, X)^{u(i)} = \{f : A \to X \mid fi = u\}.$$
3 Category with a natural cone

Here, we state the minimal structure that a category must have in order to obtain a homotopy theory via a cone functor together with a class of distinguished morphisms, called cofibrations, that satisfy properties obtained by generalizing certain homotopy properties of topological cofibrations. The concepts of nullhomotopic continuous function and contractible topological space are generalized. Also, the concepts of contractible cofibration and pointed C-category are introduced. Finally, relative homotopy theory is defined for cofibrations with codomain a cone object, and its basic properties are studied.

Definition 3.1. A category with a natural cone, or C-category, is a category $\mathcal{C}$ together with a class “cof” of morphisms in $\mathcal{C}$, called cofibrations and denoted by $\rightarrowtail$, a functor $C : \mathcal{C} \to \mathcal{C}$, which will be called the cone functor, and natural transformations $\kappa : 1 \to C$ and $\rho : CC \to C$, called inclusion and projection respectively, satisfying the following axioms:

C1. Cone axiom. $\rho\,\kappa C = \rho\, C\kappa = 1_C$ and $\rho\,\rho C = \rho\, C\rho$.

C2. Pushout axiom. For any pair of morphisms $A \overset{i}{\leftarrowtail} B \overset{f}{\to} X$, where $i$ is a cofibration, there exists the pushout square

$$\begin{array}{ccc} B & \xrightarrow{\;f\;} & X \\ {\scriptstyle i}\big\downarrow & & \big\downarrow{\scriptstyle \bar i} \\ A & \xrightarrow{\;\bar f\;} & P\{i, f\} \end{array}$$

and $\bar i$ is also a cofibration. The cone functor carries this pushout diagram (called a cofibrated pushout) to a pushout diagram, that is, $C(P\{i, f\}) = P\{Ci, Cf\}$.

C3. Cofibration axiom. For each object $X$ the morphisms $1_X$ and $\kappa_X$ are cofibrations. The composition of two cofibrations is a cofibration. Moreover, there is a retraction for the cone of each cofibration (that is, if $i : B \rightarrowtail A$ is a cofibration, there is a morphism $r : CA \to CB$ such that $r(Ci) = 1$). This last property is called the nullhomotopy extension property (NEP).

C4. Relative cone axiom. Given a cofibration $i : B \rightarrowtail A$, the morphism $i_1 = \{Ci, \kappa\} : \Sigma i = P\{\kappa, i\} \rightarrowtail CA$ is also a cofibration. The object $\Sigma i$ is called the relative cone of $i$.

The following properties are consequences of Definition 3.1. Every isomorphism $f : X \to Y$ is a cofibration, since $Y$ is a pushout object $P\{1_X, 1_X\}$ of the cofibration $1_X$ along itself, with $f$ as both induced morphisms $X \to P\{1_X, 1_X\} = Y$. The cone of every cofibration $i$ is a cofibration, since $\bar i$ and $i_1 = \{Ci, \kappa\}$ are cofibrations and $Ci = i_1 \bar i$. Therefore, the cone functor carries cofibrated pushouts into cofibrated pushouts. Hence, the natural transformations $\kappa$ and $\rho$ induce unions of themselves, $\kappa \cup \kappa$ and $\rho \cup \rho$, between cofibrated pushouts. Moreover, given a morphism $\{f, g\}$ whose domain is a cofibrated pushout, $C\{f, g\} = \{Cf, Cg\}$.

On the other hand, by the inductive use of the relative cone axiom, every cofibration $i : B \rightarrowtail A$ generates for each natural number $n$ a cofibration $i_n = (i_{n-1})_1 : \Sigma i_{n-1} \rightarrowtail C^n A$, with $i_0 = i$. Every commutative square $f i = i' g$ relating cofibrations $i : B \rightarrowtail A$ and $i' : B' \rightarrowtail A'$ induces, for each $n$, a morphism

$$\Sigma^n(f, g) = C^n g \cup C^{n-1} f \cup \overset{(n)}{\cdots} \cup C^{n-1} f : \Sigma i_{n-1} \to \Sigma i'_{n-1}.$$

[Diagram: the commutative diagrams that define $\Sigma^1(f, g) : \Sigma i \to \Sigma i'$ from the square $f i = i' g$, and $\Sigma^2(f, g) : \Sigma i_1 \to \Sigma i'_1$ from the square $Cf\, i_1 = i'_1\, \Sigma^1(f, g)$.]
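The cone axiom C1 can be illustrated on the motivating topological example from the Introduction. The sketch below is only an assumption-laden model, not part of the paper: a point of the cone $CX$ is stored as a pair $(x, s)$ with height $s \in [0, 1]$, all points of height $0$ are identified with a single vertex, $\kappa$ places a point at height $1$, and $\rho$ multiplies heights. The names `cone_point`, `kappa`, `cone_map`, `rho` and `check_cone_axioms` are hypothetical, and heights are exact fractions so that both sides of each identity can be compared with `==`.

```python
import random
from fractions import Fraction

VERTEX = ("*", Fraction(0))  # the point obtained by collapsing the base

def cone_point(x, s):
    """Point over x at height s; height 0 is always the vertex."""
    return VERTEX if s == 0 else (x, s)

def kappa(y):
    """Inclusion kappa_Y : Y -> CY, placing y at height 1."""
    return cone_point(y, Fraction(1))

def cone_map(g):
    """The cone functor on morphisms: (Cg)(y, s) = (g(y), s)."""
    def Cg(pt):
        if pt == VERTEX:
            return VERTEX
        y, s = pt
        return cone_point(g(y), s)
    return Cg

def rho(pt):
    """Projection rho : CCY -> CY, ((y, s), t) -> (y, s * t)."""
    if pt == VERTEX:
        return VERTEX
    inner, t = pt
    if inner == VERTEX:
        return VERTEX
    y, s = inner
    return cone_point(y, s * t)

def check_cone_axioms(n_trials=1000, seed=0):
    """Test rho kappaC = rho Ckappa = 1 and rho rhoC = rho Crho on samples."""
    rng = random.Random(seed)
    height = lambda: Fraction(rng.randint(0, 10), 10)
    for _ in range(n_trials):
        p1 = cone_point(rng.randint(0, 5), height())  # point of CX
        p2 = cone_point(p1, height())                 # point of CCX
        p3 = cone_point(p2, height())                 # point of CCCX
        assert rho(kappa(p1)) == p1
        assert rho(cone_map(kappa)(p1)) == p1
        assert rho(rho(p3)) == rho(cone_map(rho)(p3))
    return True

print(check_cone_axioms())
```

In this model the two identities of C1 reduce to $s \cdot 1 = 1 \cdot s = s$ and to the associativity of multiplication on the unit interval; a finite sample check of this kind is of course not a proof.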
Remark 3.2. Given a cofibration $i$, the expression $\{f_n, f_{n-1}, \ldots, f_0\}$ symbolizes a morphism with domain the object $\Sigma i_n$ if and only if the expression $\{f_n, f_{n-1}, \ldots, f_1\}$ symbolizes a morphism with domain $C\Sigma i_{n-1}$, the domain of $f_0$ is $C^n A$ and $f_0 i_n = \{f_n, f_{n-1}, \ldots, f_1\}\kappa_{\Sigma i_{n-1}}$. By applying this fact to the expressions $\{f_n, \ldots, f_m\}$, for $0 \le m \le n$, and observing that the cone functor carries cofibrated pushouts into cofibrated pushouts, it is concluded that an expression of the type $\{f_{n+1}, f_n, \ldots, f_0\}$ symbolizes a morphism with domain $\Sigma i_n$ if and only if $f_r C^n i = f_{n+1} C^r \kappa$ for $0 \le r \le n$ and $f_r C^s \kappa = f_{s+1} C^r \kappa$ for $0 \le r \le s \le n - 1$.

Remark 3.3. The following result gives a useful tool to know when a morphism between pushout objects is a cofibration. Consider a commutative cube whose top face is the pushout square of $f : X \to Y$ and $g : X \to Z$, with pushout object $P\{f, g\}$, whose bottom face is the pushout square of $f' : X' \to Y'$ and $g' : X' \to Z'$, with pushout object $P\{f', g'\}$, and whose vertical morphisms are $\gamma : X \to X'$, $\alpha : Y \to Y'$, $\beta : Z \to Z'$ and $\alpha \cup \beta : P\{f, g\} \to P\{f', g'\}$, where $\alpha$, $\beta$, $\gamma$ are cofibrations. If $\{g', \beta\} : P\{\gamma, g\} \to Z'$ or $\{f', \alpha\} : P\{\gamma, f\} \to Y'$ is a cofibration, then so is $\alpha \cup \beta$.

Definition 3.4. A morphism $f : X \to Y$ is said to be nullhomotopic (in symbols $f \simeq 0$) if there exists an extension $F$ of $f$ relative to $\kappa$, that is, $F : CX \to Y$ such that $F\kappa = f$. The morphism $F$ is called a nullhomotopy for $f$ (in symbols $F : f \simeq 0$). An object $X$ is said to be contractible (in symbols $X \simeq 0$) when $1_X \simeq 0$.

Observe that zero objects are contractible. Moreover, by C1 the cone of any object is contractible. By the naturality of $\kappa$, it is easily seen that a morphism is nullhomotopic if and only if it can be factored through a contractible object. Consequently, the composite of a morphism with a nullhomotopic morphism is nullhomotopic. The above properties justify the use of “nullhomotopy” in the term NEP.

Theorem 3.5. Given a morphism $i : B \to A$, the following statements are equivalent:
a) The morphism $i$ verifies the NEP.
b) Every nullhomotopic morphism $f : B \to X$ has a nullhomotopic extension rel. $i$.
c) Every nullhomotopic morphism $f : B \to X$ has an extension rel. $i$.
d) The inclusion $\kappa : B \to CB$ has an extension rel. $i$.

Proof. a) implies b) since by the NEP there exists a retraction $r$ for $Ci$, and $\tilde f = F r \kappa$ is an extension of $f$ rel. $i$, where $F : f \simeq 0$. Clearly b) implies c), and c) implies d) since the inclusion $\kappa$ is nullhomotopic. d) implies a) since $r = \rho(C\tilde\kappa)$ is a retraction for $Ci$, where $\tilde\kappa$ extends $\kappa$ rel. $i$.

We point out that if $i : X \to Z$ verifies the NEP and $X \simeq 0$ or $Y \simeq 0$, then every morphism $f : X \to Y$ has an extension relative to $i$.

The notion “contractible” can be extended to cofibrations. Such contractible cofibrations will be the contractible objects in the category of pairs.

Definition 3.6. A cofibration $i : B \rightarrowtail A$ is said to be contractible when $B$ and $A$ are contractible objects.
Theorem 3.7. Given two contractible cofibrations $A \overset{i}{\leftarrowtail} B \overset{i'}{\rightarrowtail} A'$, the pushout object $P\{i, i'\}$ is contractible.

Proof. By the NEP, there are extensions $r : CB \to B$, $q : CA \to A$ and $q' : CA' \to A'$ of the morphisms $1_B$, $\{ir, 1_A\}$ and $\{i'r, 1_{A'}\}$ relative to $\kappa_B$, $i_1$ and $i'_1$, respectively. The morphism $q \cup q' : P\{Ci, Ci'\} \to P\{i, i'\}$ is a retraction of $\kappa_{P\{i, i'\}} = \kappa_A \cup \kappa_{A'}$.

Observe that, by Theorem 3.7, if $i : B \rightarrowtail A$ is a contractible cofibration, then the object $\Sigma i_n$ and the cofibration $i_{n+1} : \Sigma i_n \rightarrowtail C^{n+1} A$ are also contractible, for every $n \ge 0$.

In homotopy theory the notion of point is generally used to obtain homotopy groups of pointed objects through suspension objects. Next, we define such a notion in a C-category.

Definition 3.8. Let $\emptyset$ be the initial object of a C-category. An object $A$ is said to be cofibrant when the initial morphism $\emptyset_A : \emptyset \to A$ is a cofibration.

In C-categories where each object is cofibrant, we can omit in C3 the requirement that the morphisms $1_X$ and $\kappa_X$ are cofibrations. Note that $1_X = \overline{1_\emptyset} : X \to X = P\{\emptyset_X, 1_\emptyset\}$ is a cofibration by C2. On the other hand, as a consequence of C2, C3 and C4, the inclusion $\kappa_X = (\emptyset_X)_1\, \overline{\kappa_\emptyset} = (\emptyset_X)_1\, \overline{\emptyset_{C\emptyset}}$ is also a cofibration.

Definition 3.9. A C-category is said to be pointed if every object is cofibrant and the cone of the initial object is the initial object.

In pointed categories the initial object is denoted by $*$ and is termed a point. We remark that $* \simeq 0$ (since $* = C*$) and $(*_X)_1 = \kappa_X$. Given two objects $X$ and $Y$, we will denote by $X \vee Y$ the object $P\{*_X, *_Y\}$. By C2 it is clear that $C(X \vee Y) = CX \vee CY$. By Remark 3.3, if $i : B \rightarrowtail A$ and $i' : B' \rightarrowtail A'$ are cofibrations, then so is $i \vee i' : B \vee B' \rightarrowtail A \vee A'$. Finally, if $X \simeq 0$ and $Y \simeq 0$, then $X \vee Y \simeq 0$, by Theorem 3.7.

Next, we introduce the notion of relative homotopy in a (not necessarily pointed) C-category, and prove that it is an equivalence relation compatible with the composition of morphisms.

Definition 3.10. Given a cofibration $i : B \rightarrowtail CA$ and two morphisms $f_0, f_1 : CA \to X$, the morphism $f_0$ is said to be homotopic to the morphism $f_1$ relative to $i$ if there exists an extension $F : C^2 A \to X$ of the morphism $\{f_0 \rho\, Ci, f_1\}$ relative to $i_1$. The morphism $F$ is called a homotopy from $f_0$ to $f_1$ relative to $i$, in symbols $F : f_0 \simeq f_1$ rel. $i$.

The existence of the morphism $\{f_0 \rho\, Ci, f_1\}$ implies that $f_0 i = f_1 i$.

Theorem 3.11. The relative homotopy relation is an equivalence relation on the set $\operatorname{Hom}(CA, X)^{u(i)}$.

Proof. By the NEP there exists an extension $\mu : C^2 A \to P\{Ci, Ci\} = C(P\{i, i\})$ of the morphism $\kappa\rho\, Ci \cup \kappa : \Sigma i = P\{\kappa, i\} \to P\{Ci, Ci\}$ relative to $i_1$, that is, $\mu i_1 = \kappa\rho\, Ci \cup \kappa$. This extension will be fundamental in this proof and, later, in the construction of the homotopy groups. Clearly $f\rho : f \simeq f$ rel. $i$. If $F : f_0 \simeq f_1$ rel. $i$, then $\bar F = \{F, f_0\rho\}\mu : f_1 \simeq f_0$ rel. $i$. Finally, if $F : f_0 \simeq f_1$ and $G : f_1 \simeq f_2$ rel. $i$, then $F \cdot G = \{\bar F, G\}\mu : f_0 \simeq f_2$ rel. $i$.

The quotient set $\operatorname{Hom}(CA, X)^{u(i)}/\!\simeq$ will be denoted by $[CA, X]^{u(i)}$.

Remark 3.12. By the NEP, if $X \simeq 0$, then the set $[CA, X]^{u(i)}$ is always a singleton. By Theorem 3.7, given $i : B \rightarrowtail A$ with $B \simeq 0$, the cofibration $i_1$ is contractible and $[CA, X]^{u(i)}$ is always a singleton. In sum, if $X$ or $i$ is contractible, then $f_0 \simeq f_1$ rel. $i$ if and only if $f_0 i = f_1 i$.

Theorem 3.13. Given a cofibration $i : B \rightarrowtail CA$, every morphism $f : X \to Y$ induces a map $f_* : [CA, X]^{u(i)} \to [CA, Y]^{fu(i)}$ defined by $f_*([h]) = [fh]$. Moreover, every commutative square $Cf\, i = i' g$ relating cofibrations $i : B \rightarrowtail CA$ and $i' : B' \rightarrowtail CA'$ induces a map $(Cf)^* : [CA', X]^{u(i')} \to [CA, X]^{ug(i)}$ defined by $(Cf)^*([h']) = [h'\, Cf]$. If the square is a pushout, then $(Cf)^*$ is a bijection.

Proof. If $H : h_0 \simeq h_1$ rel. $i$, then $fH : fh_0 \simeq fh_1$ rel. $i$. If $H' : h'_0 \simeq h'_1$ rel. $i'$, then $H' C^2 f : h'_0\, Cf \simeq h'_1\, Cf$ rel. $i$. If the square is a pushout and $H : h_0 \simeq h_1$ rel. $i$, then $\{H, \{h_0, u\}\rho\, Ci'\} : C^2 A' = P\{Ci, Cf\} \to X$ makes $\{h_0, u\} \simeq \{h_1, u\}$ rel. $i'$.

Finally, we will see the behaviour of the homotopy relative to a coproduct of cofibrations.

Proposition 3.14. Given cofibrations $i : B \rightarrowtail CA$, $i' : B' \rightarrowtail CA'$ and a morphism $\{u, u'\} : B \vee B' \to X$, there exists a bijection

$$[CA \vee CA', X]^{\{u, u'\}(i \vee i')} \cong [CA, X]^{u(i)} \times [CA', X]^{u'(i')}.$$

Proof. It suffices to observe that $F : f_0 \simeq f_1$ rel. $i$ and $F' : f'_0 \simeq f'_1$ rel. $i'$ if and only if $\{F, F'\} : \{f_0, f'_0\} \simeq \{f_1, f'_1\}$ rel. $i \vee i'$.
4 Homotopy groups

In this section, homotopy groups relative to a cofibration or referred to an object will be constructed through suspension functors. Also, the usual properties of these homotopy groups will be studied. The relative homotopy relation will be extended to arbitrary cofibrations, whose codomain is not necessarily a cone object. Throughout this section $\mathcal{C}$ will be a pointed C-category. The symbols $\bar{\ }$ and $\cdot$ are used here, with the same meaning as in the proof of Theorem 3.11, to define the inverse and composition operations, respectively, in the homotopy groups.

Firstly, we define the category of based objects, where it is possible to define zero morphisms. These concepts permit the definition of homotopy groups.

Definition 4.1. The category of based objects of $\mathcal{C}$ is defined as follows. Objects are pairs $(X, \alpha)$, where $X$ is an object and $\alpha : X \to *$ is a morphism of $\mathcal{C}$. A morphism $f : (X, \alpha) \to (Y, \beta)$ is a morphism $f : X \to Y$ such that $\beta f = \alpha$. If $f : X \rightarrowtail Y$ is a cofibration in $\mathcal{C}$, then $f : (X, \alpha) \rightarrowtail (Y, \beta)$ will be called a based cofibration.

Every object of $\mathcal{C}$ has at least one base morphism, since it is cofibrant and $* \simeq 0$.

Definition 4.2. Given a based object $(X, \alpha)$, the zero morphism $0 : X \to Y$ is defined by $0 = *_Y \alpha$.

The cone functor and the natural transformations $\kappa$, $\rho$ can be extended to the category of based objects: $C(X, \alpha) = (CX, C\alpha)$ and $Cf : (CX, C\alpha) \to (CY, C\beta)$, with $\kappa : (X, \alpha) \rightarrowtail (CX, C\alpha)$ and $\rho : (C^2 X, C^2\alpha) \to (CX, C\alpha)$.

On the other hand, given two based morphisms $(Y, \beta) \overset{f}{\leftarrow} (X, \alpha) \overset{g}{\to} (Z, \gamma)$, if there exists the pushout object $P\{f, g\}$ in $\mathcal{C}$, then $(P\{f, g\}, \{\beta, \gamma\})$ is the pushout object in the category of based objects. In particular, for every based cofibration $i : (B, \beta) \rightarrowtail (A, \alpha)$ the object $\Sigma i_n$ is based on $\{C^{n+1}\beta, C^n\alpha, \overset{(n+1)}{\ldots}, C^n\alpha\}$, for each natural number $n$. In this way, the category of based objects verifies the axioms of a pointed C-category except the NEP. Note that $(*, 1)$ is the zero object.

The following result is an important tool to define the product on the homotopy groups and prove Theorem 4.7.

Theorem 4.3. Given a based cofibration $i : (B, \beta) \rightarrowtail (CA, C\alpha)$, each extension $\mu$ of $\kappa\rho\, Ci \cup \kappa$ relative to $i_1$ in $\mathcal{C}$ induces a bijection $\mu^* : [P\{Ci, Ci\}, X]^{0(\kappa)} \to [C^2 A, X]^{0(i_1)}$.

Proof. If $H : F_0 \simeq F_1$ rel. $\kappa$, then $H\nu : F_0\mu \simeq F_1\mu$ rel. $i_1$, where $\nu$ is an extension of $\{C\kappa\,\mu\rho\, Ci_1, \kappa\mu\} : \Sigma i_1 = P\{\kappa, i_1\} \to C^2 P\{i, i\}$ rel. $i_2$. We point out that $\{C\kappa\,\mu\rho\, Ci_1, \kappa\mu\}$ is nullhomotopic since its codomain is a cone object. Therefore, $\mu^*$ is well defined.

The inverse map is defined by $(\mu^*)^{-1}([F]) = [\{0, F\}]$. If $H : F_0 \simeq F_1$ rel. $i_1$, then $\{0, H\} : \{0, F_0\} \simeq \{0, F_1\}$ rel. $\kappa$. So, $(\mu^*)^{-1}$ is well defined. $(\mu^*)^{-1}\mu^* = 1$, since $\{F\, C\rho, \{F\rho, G\rho\}C\mu\} : \{F, G\} \simeq \{0, \{F, G\}\mu\}$ rel. $\kappa$. $\mu^*(\mu^*)^{-1} = 1$, since $\{0, F\rho\}C\mu : F \simeq \{0, F\}\mu$ rel. $i_1$.
Remark 4.4. If $F : F_0 \simeq F_1$ rel. $i_1$ and $G : G_0 \simeq G_1$ rel. $i_1$, then $\{F, G\} : \{F_0, G_0\} \simeq \{F_1, G_1\}$ rel. $\kappa$.

Next, given a based cofibration $i : B \rightarrowtail CA$ and an object $X$, we define a group structure on $[C^2 A, X]^{0(i_1)}$.

Lemma 4.5. For every based cofibration $i : B \rightarrowtail CA$ and object $X$, there are a symmetric map $[C^2 A, X]^{0(i_1)} \to [C^2 A, X]^{0(i_1)}$ and a product map $[C^2 A, X]^{0(i_1)} \times [C^2 A, X]^{0(i_1)} \to [C^2 A, X]^{0(i_1)}$ defined by $[F]^{-1} = [\bar F]$ and $[F] \cdot [G] = [F \cdot G]$, respectively.

Proof. If $F : F_0 \simeq F_1$ rel. $i_1$, then, by Remark 4.4, $\{F, 0\} : \{F_0, 0\} \simeq \{F_1, 0\}$ rel. $\kappa$. Hence, by Theorem 4.3, $\bar F_0 = \{F_0, 0\}\mu \simeq \{F_1, 0\}\mu = \bar F_1$ rel. $i_1$. Similarly, if $G : G_0 \simeq G_1$ rel. $i_1$ and $\bar H : \bar F_0 \simeq \bar F_1$ rel. $i_1$, then $\{\bar H, G\} : \{\bar F_0, G_0\} \simeq \{\bar F_1, G_1\}$ rel. $\kappa$. Hence, $F_0 \cdot G_0 = \{\bar F_0, G_0\}\mu \simeq \{\bar F_1, G_1\}\mu = F_1 \cdot G_1$ rel. $i_1$.

Remark 4.6. $[0]^{-1} = [\bar 0] = [\{0, 0\}\mu] = [0]$, by Lemma 4.5 and Theorem 4.3.

Theorem 4.7. If $i : B \rightarrowtail CA$ is a based cofibration, then $[C^2 A, X]^{0(i_1)}$ is a group.

Proof.
1. If $[F] \in [C^2 A, X]^{0(i_1)}$, then:
(a) $[F] \cdot [0] = [F]$. It is a consequence of Theorem 4.3 and Lemma 4.5, since $\{\{F\rho, 0\}C\mu, F\, C\rho\} : \{0, F\} \simeq \{\{F, 0\}\mu, 0\}$ rel. $\kappa$.
(b) $[0] \cdot [F] = [F]$. It is a consequence of Theorem 4.3, Lemma 4.5 and Remarks 4.4 and 4.6, since $[0] \cdot [F] = [0 \cdot F] = [\{\bar 0, F\}\mu] = [\{0, F\}\mu] = [F]$.
2. If $[F] \in [C^2 A, X]^{0(i_1)}$, then:
(a) $[F] \cdot [F]^{-1} = [0]$. It is a consequence of Theorem 4.3 and Lemma 4.5, since $\{\bar F\, C\rho, \bar F\, C\rho\} : \{\bar F, \bar F\} \simeq \{0, 0\}$ rel. $\kappa$.
(b) $[F]^{-1} \cdot [F] = [0]$. It is a consequence of Lemma 4.5, 2(a) and 1(a), observing that $([F]^{-1})^{-1} = [\bar F]^{-1} = [\{\bar F, 0\}\mu] = [F \cdot 0] = [F] \cdot [0] = [F]$. Hence, $[F]^{-1} \cdot [F] = [F]^{-1} \cdot ([F]^{-1})^{-1} = [\bar F] \cdot [\bar F]^{-1} = [0]$.
3. Finally, if $[F], [G], [H] \in [C^2 A, X]^{0(i_1)}$, then $([F] \cdot [G]) \cdot [H] = [F] \cdot ([G] \cdot [H])$. It is a consequence of Lemma 4.5, observing that every $[F_0], [F_1], [F_2] \in [C^2 A, X]^{0(i_1)}$ verify:
(i) $[F_0]^{-1} \cdot [F_1]^{-1} = ([F_1] \cdot [F_0])^{-1}$ by Theorem 4.3, Remark 4.4 and the observation made in 2(b), since $\{\{\bar F_1\rho, F_0\rho\}C\mu, \bar F_1\, C\rho\} : \{F_0, \bar F_1\} \simeq \{\{\bar F_1, F_0\}\mu, 0\}$ rel. $\kappa$.
(ii) $[F_0] \cdot [F_2] = ([F_0] \cdot [F_1]) \cdot ([F_1]^{-1} \cdot [F_2])$ as a consequence of (i), Theorem 4.3, Remark 4.4, and the observation made in 2(b), since $\{\{F_1\rho, \bar F_0\rho\}C\mu, \{F_1\rho, F_2\rho\}C\mu\} : \{\bar F_0, F_2\} \simeq \{\{F_1, \bar F_0\}\mu, \{F_1, F_2\}\mu\}$ rel. $\kappa$.
(iii) $[F_0] = [F_1]^{-1} \cdot ([F_1] \cdot [F_0])$ by 1(b) and (ii), since $[F_0] = [0] \cdot [F_0] = ([0] \cdot [F_1]^{-1}) \cdot ([F_1] \cdot [F_0]) = [F_1]^{-1} \cdot ([F_1] \cdot [F_0])$.
Hence, $([F] \cdot [G]) \cdot [H] = ([F] \cdot [G]) \cdot ([G]^{-1} \cdot ([G] \cdot [H])) = [F] \cdot ([G] \cdot [H])$.

Remark 4.8. The group structure of $[C^2 A, X]^{0(i_1)}$ does not depend on the choice of the extension $\mu$, since $P\{Ci, Ci\} = CP\{i, i\} \simeq 0$ and two arbitrary extensions of $\kappa\rho\, Ci \cup \kappa$ relative to $i_1$ are homotopic relative to the cofibration, by Remark 3.12.

Definition 4.9. The first homotopy group of an object $X$ relative to a based cofibration $i : B \rightarrowtail CA$ is defined by $\pi_1^i(X) = [C^2 A, X]^{0(i_1)}$. The $n$th homotopy group of $X$ relative to $i$ is defined by $\pi_n^i(X) = \pi_1^{i_{n-1}}(X)$. Given a based object $A$, the group $\pi_n^{*_A}(X)$ is also denoted by $\pi_n^A(X)$ and is called the $n$th homotopy group of $X$ referred to $A$.

Remark 4.10. It is obvious, by Definition 4.9, that $\pi_n^i(X) = \pi_{n-s}^{i_s}(X)$. In this way, it is possible to define homotopy groups relative to arbitrary cofibrations, whose codomain is not a cone object, for $n \ge 2$.
Remark 4.11. In order to obtain generalized homotopy groups, in the research announcement [4] we give a generalization of the above construction in a not necesarily pointed C-category. Given a cofibration i : B CA and an object X, we define a groupoid whose objects are the elements of Hom(CA, X). Morphisms from f0 to f1 are homotopy classes relative to i1 of homotopies from f0 to f1 . Identities are 1f = [f p], inverse morphisms are [F ]−1 = [F ] and composite morphisms are [F ].[G] = [F ∗ G]. We believe that this groupoid allows one to reconstruct a track category in the sense of H.J. Baues (see [1]). These ideas will be developed in full detail in a later paper. The following theorem shows the functorial character of the homotopy groups. First, we prove a preliminary result. Lemma 4.12. Every based commutative square f i = i g relating based cofibrations i : B A and i : B A generates, for each natural number n, a based commutative square C n f in = in Σn (f, g) relating the based cofibrations in : Σin−1 C n A and in : Σin−1 C n A . If the first square f i = i g is a pushout, then the generated squares C n f in = in Σn (f, g) are pushouts. Proof. Clearly i1 Σ1 (f, g) = Cf i1 . On the other hand, if the square f i = i g is a pushout and u : CA → X, v = {v0 , v1 } : Σi → X verify ui1 = vΣ1 (f, g), then uCi = v0 Cg, and by C2 there exists {u, v0 } : CA → X. Hence, {u, v0 } = {u, v} : CA = P {i1 , Σ1 (f, g)} → X. By induction, since in = (in−1 )1 , the same proof is valid. Theorem 4.13. Given a based cofibration i : B A, every morphism f : X → Y
induces homomorphisms of groups $f_* : \pi_n^i(X) \to \pi_n^i(Y)$. Moreover, every based commutative square $f i = i' g$ relating based cofibrations $i : B \rightarrowtail A$ and $i' : B' \rightarrowtail A'$ induces, for each object $X$ and $n \geq 2$, a homomorphism of groups $(C^nf)^* : \pi_n^{i'}(X) \to \pi_n^i(X)$. If the square is a pushout then $(C^nf)^*$ is an isomorphism.

Proof. By the first part of Theorem 3.13, $f_*$ is a map. Moreover, $f_*$ is a homomorphism of groups since $f_*([F] \cdot [G]) = f_*([\{\{F, 0\}\mu, G\}\mu]) = [\{\{fF, 0\}\mu, fG\}\mu] = [fF] \cdot [fG] = f_*([F]) \cdot f_*([G])$. By the first part of Lemma 4.12 and the second part of Theorem 3.13, $(C^nf)^*$ is a map. If $\mu_n$ and $\mu'_n$ are extensions used to obtain the operations in $\pi_n^i(X)$ and $\pi_n^{i'}(X)$, respectively, then $\mu'_n C^nf\, i_n = (C^nf \cup C^nf)\mu_n i_n$. Hence, $\mu'_n C^nf \simeq (C^nf \cup C^nf)\mu_n$ rel. $i_n$ by Remark 3.12, and $(C^nf)^*([F] \cdot [G]) = (C^nf)^*([F]) \cdot (C^nf)^*([G])$. The proof is concluded by the second part of Lemma 4.12 and the third part of Theorem 3.13.

Remark 4.14. Isomorphisms of groups induced by $f$ in the second part of Theorem 4.13 allow us to define homotopy relative to cofibrations whose codomain is not just a cone object. If $A$ or $A'$ is a cone object, then there is homotopy relative to $i$ or $i'$, respectively. Since the sets $[CA, X]_{0(i_1)}$ and $[CA', X]_{0(i'_1)}$ are bijective and the groups $\pi_n^i(X)$ and $\pi_n^{i'}(X)$ are isomorphic for $n \geq 2$, it seems natural to translate the group structure via the bijection, obtaining in this way the isomorphism of groups for $n = 1$. Moreover, we can assume $[A, X]_{ug(i)} \cong [A', X]_{u(i')}$, that is, $f_0 \simeq f_1$ rel. $i'$ if and only if $f_0 f \simeq f_1 f$ rel. $i$. Repeating this process, one can obtain homotopy relative to cofibrations whose codomain is not a cone object, through another such cofibration. When the cofibration is the initial morphism, then $[A, X]_{*_X(*_A)}$ is simply denoted by $[A, X]$. As a consequence of this last convention, the homotopy groups relative to a cofibration or referred to an object can be obtained through suspension objects.

Definition 4.15.
Given a based cofibration $i : (B, \beta) \rightarrowtail (CA, C\alpha)$, the suspension object of $i$ is defined by $S_i = P\{i, \beta\}$. Given a based object $(A, \alpha)$, the $n$-suspension object of $(A, \alpha)$ is inductively defined by $S^nA = S_{\kappa_{S^{n-1}A}}$, with $S^0A = A$. The notation $S_i^n$ will designate the object $S^{n-1}S_i$.

Lemma 4.16. If $(A, \alpha)$ is a based object such that $A = P\{j, s\}$, where $j$ is a cofibration and the codomain of the morphism $s$ is the object $*$, then $SA = P\{j_1, \{Cs, \alpha s\}\}$.
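Proof. Since $\alpha\{jCs, s\} = \{Cs, \alpha s\}$, it suffices to observe the following pushouts:

$$\begin{array}{ccccc}
\Sigma_j & \xrightarrow{\ \{jCs,\,s\}\ } & A & \xrightarrow{\ \alpha\ } & * \\
{\scriptstyle j_1}\downarrow & & {\scriptstyle \kappa_A}\downarrow & & \downarrow \\
CT & \xrightarrow{\ Cs\ } & CA & \longrightarrow & SA
\end{array}$$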
Theorem 4.17. $S_i^{n+1} = P\{i_n, \{C^n\beta, C^{n-1}\alpha, \overset{(n}{\ldots}, C^n\alpha\}\}$ for every based cofibration $i : (B, \beta) \rightarrowtail (CA, C\alpha)$ and natural number $n$.

Proof. It is a consequence of the inductive use of the previous Lemma 4.16, since $S_i = P\{i, \beta\}$, $i_n = (i_{n-1})_1$ and $S_i^{n+1} = SS_i^n$.

Corollary 4.18. There are bijections $\pi_n^i(X) \cong [S_i^{n+1}, X]$ and $\pi_{n+1}^D(X) \cong [S^{n+1}D, X]$ for every based cofibration $i : B \rightarrowtail CA$, based object $D$ and natural number $n$.
The homotopy groups of this algebraic homotopy theory verify the usual properties of homotopy groups. In this sense, by Remark 3.12, we have:

Proposition 4.19. Given a based cofibration $i : B \rightarrowtail CA$, a based object $D$, an object $X$ and a natural number $n$:
a) If $i$ is a contractible cofibration, then $\pi_n^i(X) \cong \{0\}$.
b) If $D$ is a contractible object, then $\pi_{n+1}^D(X) \cong \{0\}$.
c) If $X$ is a contractible object, then $\pi_n^i(X) \cong \pi_{n+1}^D(X) \cong \{0\}$.

The following properties emerge as particular cases of Remark 4.10 and Theorem 4.13. We point out that, by Lemma 4.16, the suspensions of every based object $A$ verify $S^nA = S^{n-r}S^rA$, for $0 \leq r \leq n$.

Proposition 4.20. Given a based object $D$, a morphism $f : X \to Y$ and a based morphism $g : A \to B$:
a) The homotopy groups $\pi_n^A(X)$ and $\pi_{n-r}^{S^rA}(X)$ are isomorphic.
b) The map $f_* : \pi_n^A(X) \to \pi_n^A(Y)$ is a homomorphism of groups.
c) The map $(C^ng)^* : \pi_n^B(X) \to \pi_n^A(X)$ is a homomorphism of groups.
d) If $g$ is an isomorphism, then $(C^ng)^*$ is an isomorphism of groups.

Homotopy groups relative to a coproduct are products of homotopy groups.

Proposition 4.21. Given two based cofibrations $i : B \rightarrowtail CA$ and $i' : B' \rightarrowtail CA'$, two based objects $D$ and $D'$, and an object $X$:
a) The groups $\pi_n^{i \vee i'}(X)$ and $\pi_n^i(X) \times \pi_n^{i'}(X)$ are isomorphic.
b) The groups $\pi_n^{D \vee D'}(X)$ and $\pi_n^D(X) \times \pi_n^{D'}(X)$ are isomorphic.
Proof. It is easily seen by Proposition 3.14 and Remark 4.8, since the coproduct preserves pushouts and $\mu_n \vee \mu'_n$ extends $(\kappa\rho Ci_{n-1} \vee \kappa\rho Ci'_{n-1}) \cup (\kappa \vee \kappa)$ rel. $i_n \vee i'_n$.
5 Exact sequences of homotopy groups
In homotopy theory, homotopy groups are usually related through exact sequences. In C-categories, this relationship is also possible. In the classical sense, a pair $(X, Y)$ is a cofibration $f : Y \rightarrowtail X$. The category of such pairs in a C-category is also a C-category. Therefore, the development made for a C-category is also available for its category of pairs.

Definition 5.1. Given a C-category $\mathbf{C}$, the category $\mathbf{cof}\,\mathbf{C}$ is the full subcategory of $\mathbf{Pair}\,\mathbf{C}$ whose objects are the cofibrations of $\mathbf{C}$. The pairs $(X, Y), (X', Y'), \ldots$ will denote, respectively, the objects $f : Y \rightarrowtail X$, $f' : Y' \rightarrowtail X'$, ... in $\mathbf{cof}\,\mathbf{C}$. Given an object $A$ in $\mathbf{C}$, $1_A$ and $\kappa_A$ are the cofibrations associated to the pairs $(A, A)$ and $(CA, A)$, respectively.

Definition 5.2. A morphism $(u, v) : (X, Y) \to (X', Y')$ in $\mathbf{cof}\,\mathbf{C}$ is a cofibration of pairs if $v : Y \rightarrowtail Y'$ and $\{f', u\} : P\{v, f\} \rightarrowtail X'$ are cofibrations in $\mathbf{C}$. We point out that if $(u, v)$ is a cofibration, then so is $u = \{f', u\}\bar{v}$.

Theorem 5.3. If $\mathbf{C}$ is a C-category, then so is $\mathbf{cof}\,\mathbf{C}$, with the cofibrations of pairs.

Proof. The cone functor is defined for objects by $C(X, Y) = (CX, CY)$, with associated cofibration $Cf$, and for morphisms by $C(g, h) = (Cg, Ch)$. The natural transformations $\kappa$ and $\rho$ are defined by $(\kappa, \kappa)$ and $(\rho, \rho)$, respectively.
- The cone axiom is clearly satisfied.
- Pushout axiom: Given $(X', Y') \overset{(u,v)}{\leftarrowtail} (X, Y) \xrightarrow{(g,h)} (X'', Y'')$, then $P\{(u, v), (g, h)\} = (P\{u, g\}, P\{v, h\})$ with associated cofibration $f' \cup f''$. The pushouts $P\{u, g\}$ and $P\{v, h\}$ exist by C2 in $\mathbf{C}$, and $f' \cup f''$ is a cofibration in $\mathbf{C}$ by Remark 3.3. On the other hand, given the following diagram

$$\begin{array}{ccccc}
X & \overset{\bar{v}}{\rightarrowtail} & P\{v, f\} & \overset{\{f', u\}}{\rightarrowtail} & X' \\
{\scriptstyle g}\downarrow & D1 & \downarrow{\scriptstyle h \cup g} & D2 & \downarrow{\scriptstyle \bar{g}} \\
X'' & \overset{\bar{\bar{v}}}{\rightarrowtail} & P\{\bar{v}, f''\} & \xrightarrow{\{f' \cup f'',\, \bar{u}\}} & P\{u, g\}
\end{array}$$

where $\bar{v} : Y'' \rightarrowtail P\{v, h\}$. The commutative square D1 is composable with the pushout of $P\{v, f\}$, and the composite diagram is the pushout of $P\{v, f''h\}$. Therefore, D1 is a pushout, and so is D2, since the composite diagram is the pushout of $P\{u, g\}$. So that, $\{f' \cup f'', \bar{u}\} = \overline{\{f', u\}} : P\{\bar{v}, f''\} \rightarrowtail P\{u, g\}$ is a cofibration, and
$(\bar{u}, \bar{v})$ is a cofibration of pairs. Clearly, the functor $C$ carries cofibrated pushouts to pushouts.
- Cofibration axiom: The morphisms $1_{(X,Y)}$ and $\kappa_{(X,Y)}$ are always cofibrations, since $\{f, 1_X\} = 1_X : P\{1_Y, f\} = X \rightarrowtail X$ and $\{Cf, \kappa_X\} = f_1 : P\{\kappa_Y, f\} = \Sigma_f \rightarrowtail CX$ are cofibrations in $\mathbf{C}$, by C3 and C4, respectively.
Given $(X, Y) \overset{(u,v)}{\rightarrowtail} (X', Y') \overset{(u',v')}{\rightarrowtail} (X'', Y'')$, the composite $(u'u, v'v)$ is a cofibration, since $\{f'', u'u\} = \{f'', u'\}(1 \cup u) : P\{v'v, f\} \rightarrowtail P\{v', f'\} \rightarrowtail X''$ is a cofibration in $\mathbf{C}$. Observe that $1 \cup u$ is a cofibration by Remark 3.3.
Every cofibration $(u, v) : (X, Y) \rightarrowtail (X', Y')$ verifies the nullhomotopy extension property. By the NEP in $\mathbf{C}$, there is a retraction $r_v$ for $Cv$. By Theorem 3.5, there is an extension $r_u$ of the morphism $\{Cf\, r_v, 1_{CX}\}$ relative to the cofibration $\{Cf', Cu\} = C\{f', u\}$. The morphism of pairs $(r_u, r_v)$ is a retraction for $(Cu, Cv) = C(u, v)$.
- Relative cone axiom: If $(u, v) : (X, Y) \rightarrowtail (X', Y')$ is a cofibration, then so is $(u, v)_1 = (u_1, v_1) : \Sigma_{(u,v)} = (\Sigma_u, \Sigma_v) \rightarrowtail C(X', Y') = (CX', CY')$, since $P\{v_1, \Sigma_1(f', f)\} = \Sigma_{\{f',u\}}$ with induced morphisms $\Sigma_1(f', f) = \{f', u\}Cf'$ and $v_1 = \{\{f', u\}Cv, \kappa\}$. Hence, $\{Cf', \{Cu, \kappa\}\} = \{f', u\}_1 : P\{v_1, \Sigma_1(f', f)\} = \Sigma_{\{f',u\}} \rightarrowtail CX'$. By Remark 3.3, $(\Sigma_u, \Sigma_v)$ is an object of $\mathbf{cof}\,\mathbf{C}$ with associated cofibration $\Sigma_1(f', f)$, since $f_1 = \{Cf, \kappa_X\}$ is a cofibration.

The cone structure on $\mathbf{cof}\,\mathbf{C}$ arises from the respective structure on $\mathbf{C}$. Therefore, we can relate the homotopy concepts induced by the cone structure in both categories. Clearly, if $(\tilde{g}, \tilde{h})$ is an extension of $(g, h) : (X, Y) \to (X'', Y'')$ relative to $(u, v) : (X, Y) \rightarrowtail (X', Y')$, then $\tilde{g}$ and $\tilde{h}$ are extensions of $g$ and $h$ relative to $u$ and $v$, respectively. In particular, every nullhomotopy of pairs is a pair of nullhomotopies. It is easily seen that the following equality holds:

$$\mathrm{Hom}((X', Y'), (X'', Y''))_{(g,h)((u,v))} = \bigcup_{\tilde{h} \in \mathrm{Hom}(Y', Y'')_{h(v)}} \mathrm{Hom}(X', X'')_{\{f''\tilde{h},\, g\}(\{f', u\})} \times \{\tilde{h}\}$$
Remark 5.4. If $(u, v) : (X, Y) \rightarrowtail (CX', CY')$ is a cofibration and $(G, H) : (g_0, h_0) \simeq (g_1, h_1)$ rel. $(u, v)$, then $G : g_0 \simeq g_1$ rel. $u$ and $H : h_0 \simeq h_1$ rel. $v$, since $(u, v)_1 = (u_1, v_1)$. On the other hand, $[(F, G)] \cdot [(F', G')] = [(F \cdot F', G \cdot G')]$ for every $[(F, G)], [(F', G')] \in \pi_n^{(u,v)}((X'', Y''))$, since an extension of pairs is a pair of extensions, and the operation of the groups in $\mathbf{C}$ does not depend on the choice of the extension (Remark 4.8). In this way, if the cofibration of pairs $(u, v)$ is based, then the projections $p_1 : \pi_n^{(u,v)}((X'', Y'')) \to \pi_n^v(Y'')$ and $p_2 : \pi_n^{(u,v)}((X'', Y'')) \to \pi_n^u(X'')$, defined by $p_1([(F, G)]) = [G]$ and $p_2([(F, G)]) = [F]$, are homomorphisms of groups for every pair $(X'', Y'')$.

Some curious properties appear when the codomain of a pair is a contractible object.

Proposition 5.5. Given $(X', Y') \overset{(u,v)}{\leftarrowtail} (X, Y) \xrightarrow{(g,h)} (X'', Y'')$, if $X''$ is contractible, then $\mathrm{Hom}(Y', Y'')_{h(v)} = \emptyset$ if and only if $\mathrm{Hom}((X', Y'), (X'', Y''))_{(g,h)((u,v))} = \emptyset$.
Proof. By the equality seen before Remark 5.4 and the NEP and using the fact that $X'' \simeq 0$, it is seen that for every morphism $\tilde{h} \in \mathrm{Hom}(Y', Y'')_{h(v)}$ there is a morphism $\tilde{g} \in \mathrm{Hom}(X', X'')_{\{f''\tilde{h},\, g\}(\{f', u\})}$.

Corollary 5.6. In the C-category $\mathbf{cof}\,\mathbf{C}$ the following equivalences are verified.
a) Given $(g, h) : (X, Y) \to (X', Y')$ with $X' \simeq 0$, then $(g, h) \simeq 0$ if and only if $h \simeq 0$.
b) $(X, Y) \simeq 0$ if and only if $X \simeq 0$ and $Y \simeq 0$.
c) Given $(CX', CY') \overset{(u,v)}{\leftarrowtail} (X, Y) \xrightarrow{(g_0,h_0),(g_1,h_1)} (X'', Y'')$, if $X'' \simeq 0$, then $(g_0, h_0) \simeq (g_1, h_1)$ rel. $(u, v)$ if and only if $g_0 \simeq g_1$ rel. $u$ and $h_0 \simeq h_1$ rel. $v$.
d) Given $(CX', CY') \overset{(u,v)}{\leftarrowtail} (X, Y) \xrightarrow{(g,h)} (X'', Y'')$, if $X'' \simeq 0$, then $[(CX', CY'), (X'', Y'')]_{(g,h)((u,v))} \cong [CY', Y'']_{h(v)}$.

A pointed C-category $\mathbf{C}$ induces a pointed structure on $\mathbf{cof}\,\mathbf{C}$.

Proposition 5.7. The category $\mathbf{cof}\,\mathbf{C}$ has an initial object if and only if $\mathbf{C}$ does.

Proof. If $\emptyset$ is initial in $\mathbf{C}$, then $(\emptyset, \emptyset)$ is initial in $\mathbf{cof}\,\mathbf{C}$. Conversely, if $(X, Y)$ is initial in $\mathbf{cof}\,\mathbf{C}$, then there is a unique morphism $(g, h) : (X, Y) \to (Y, Y)$. Hence, $(fg, h) : (X, Y) \to (X, Y)$ is the initial morphism $(1_X, 1_Y)$. Therefore, $fg = 1_X$ and $1_Y = h = gf$, so that $(X, X) \cong (X, Y)$ is an initial object. One readily checks that $X$ is initial in $\mathbf{C}$, since this category can be considered as a subcategory of $\mathbf{cof}\,\mathbf{C}$ by identifying, in a natural way, objects $Z$ with $(Z, Z)$.

Using the previous Proposition 5.7, it is clear that $(X, Y)$ is cofibrant if and only if so are $X$ and $Y$. Moreover, $\mathbf{C}$ is pointed if and only if so is $\mathbf{cof}\,\mathbf{C}$. The pair $(X, Y)$ is based on $(\alpha, \beta)$ if and only if $f : (Y, \beta) \rightarrowtail (X, \alpha)$ is a based cofibration. Moreover, $(u, v) : (X, Y) \rightarrowtail (X', Y')$ is based if and only if so are $u$ and $v$. Part (d) of Corollary 5.6 induces the following isomorphisms of groups.

Corollary 5.8. Given a based cofibration $(u, v) : (X, Y) \rightarrowtail (X', Y')$, a based object $(X'', Y'')$ and an object $(X''', Y''')$, if $X''' \simeq 0$, then $\pi_n^{(u,v)}(X''', Y''') \cong \pi_n^v(Y''')$ and $\pi_n^{(X'',Y'')}(X''', Y''') \cong \pi_n^{Y''}(Y''')$.

Homotopy groups of pairs will be used to build exact homotopy sequences. Next, we introduce important tools to prove this fact.

Proposition 5.9. Given a based cofibration $i : B \rightarrowtail A$, for every $n \in \mathbb{N}$ there is an isomorphism of groups $\theta : \pi_{n+1}^{(i_1,1)}((X, Y)) \cong \pi_{n+2}^i(X)$, where $(i_1, 1) : (\Sigma_i, A) \rightarrowtail (CA, A)$.

Proof. If $[(F, G)] \in \pi_{n+1}^{(i_1,1)}((X, Y))$, then $(F, G)(i_{n+2}, 1) = (0, 0)$, so $G = 0$. Therefore, by Remark 5.4, $[(F, 0)] \leftrightarrow [F]$ is an isomorphism of groups.
Definition 5.10. The $(n+1)$-th homotopy group of a pair $(X, Y)$ relative to a based cofibration $i : B \rightarrowtail A$ is defined by $\pi_{n+1}^i((X, Y)) = \pi_n^{(Ci,i)}((X, Y))$.

If $i : B \rightarrowtail A$ is a based cofibration, then so are $Ci : CB \rightarrowtail CA$ and $(Ci, i) : (CB, B) \rightarrowtail (CA, A)$. On the other hand, if $i : B \rightarrowtail A$ is a contractible cofibration, then so is $(Ci, i) : (CB, B) \rightarrowtail (CA, A)$, and $\pi_{n+1}^i((X, Y)) \cong \{0\}$. By Corollary 5.8, if $X \simeq 0$, then $\pi_{n+1}^i((X, Y)) \cong \pi_n^i(Y)$.

Lemma 5.11. For every based cofibration $i : B \rightarrowtail A$ and morphism $F : C^{n+1}A \to X$ verifying $F i_{n+1} = 0$, there is an extension of the morphism $\{0, F, 0, \ldots, 0, F, 0\}$ (resp. $\{0, F, 0, \ldots, 0, F\}$) relative to $i_{n+2}$, for every odd (resp. even) natural number $n$.

Proof. If $n$ is odd, then $HC^{\frac{n+3}{2}}\kappa$ extends $\{0, F, 0, \ldots, 0, F, 0\}$ rel. $i_{n+2}$, where $H$ is an extension of $\{0, FC^{\frac{n+1}{2}}\rho, FC^{\frac{n-1}{2}}\rho, FC^{\frac{n-3}{2}}\rho, \ldots, FC\rho, FC^n\rho, FC^{n-1}\rho, \ldots, FC^{\frac{n+1}{2}}\rho, 0\}$ rel. $(Ci_{\frac{n+1}{2}})_{\frac{n+3}{2}}$.

If $n$ is even, then $HC^{\frac{n+2}{2}}\kappa$ extends $\{0, F, 0, \ldots, 0, F\}$ rel. $i_{n+2}$, where $H$ is an extension of $\{0, FC^{\frac{n}{2}}\rho, FC^{\frac{n-2}{2}}\rho, FC^{\frac{n-4}{2}}\rho, \ldots, F\rho, FC^n\rho, FC^{n-1}\rho, \ldots, FC^{\frac{n}{2}}\rho\}$ rel. $(Ci_{\frac{n+2}{2}})_{\frac{n+2}{2}}$. The existence of the morphisms extended by $H$ follows from Remark 3.2.

Lemma 5.12. Given a based cofibration $i : B \rightarrowtail A$ and $[F], [G] \in \pi_{n+2}^i(X)$, then $[F] = [G]$ if and only if there exists an extension of the morphism $\{0, G, 0, \ldots, 0, F, 0\}$ (resp. $\{0, G, 0, \ldots, 0, F\}$) relative to $i_{n+3}$, for every even (resp. odd) natural number $n$.
Proof. By Lemma 5.11 there is an extension $H$ of the morphism $\{0, F, 0, \ldots, 0, F, 0\}$ (resp. $\{0, F, 0, \ldots, 0, F\}$) rel. $i_{n+3}$, for every even (resp. odd) natural number $n$. If $[F] = [G]$, then there is $K : G \simeq F$ (resp. $K : F \simeq G$) rel. $i_{n+2}$. Therefore, $EC\kappa$ (resp. $E\kappa$) is an extension of $\{0, G, 0, \ldots, 0, F, 0\}$ (resp. $\{0, G, 0, \ldots, 0, F\}$) rel. $i_{n+3}$, where $E$ is an extension of $\{0, K, 0, \ldots, 0, F\rho, H\}$ (resp. $\{0, K, 0, \ldots, 0, H\}$) rel. $(Ci_{n+2})_1$ (resp. $Ci_{n+3}$).

Conversely, if $n$ is odd and $H_0$ is an extension of $\{0, G, 0, \ldots, 0, F\}$ rel. $i_{n+3}$, then $G_1C^{n+3}\kappa$ is an extension of $\{0, 0, G, 0, \ldots, 0, F, 0\}$ rel. $i_{n+3}$, where $G_1$ is an extension of $\{0, F\rho, H_0, 0, \ldots, 0, FC^{n+1}\rho, FC^n\rho\}$ rel. $(Ci)_{n+3}$. If $H_1 = G_1C^{n+3}\kappa$, then $G_2C^{n+2}\kappa$ is an extension of $\{0, 0, 0, G, 0, \ldots, 0, F\}$ rel. $i_{n+3}$, where $G_2$ is an extension of $\{0, 0, F\rho, H_1, 0, \ldots, 0, FC^{n-1}\rho, FC^n\rho\}$ rel. $(Ci_1)_{n+2}$. If $H_2 = G_2C^{n+2}\kappa$, then $G_3C^{n+1}\kappa$ is an extension of $\{0, 0, 0, 0, G, 0, \ldots, 0, F, 0\}$ rel. $i_{n+3}$, where $G_3$ is an extension of $\{0, 0, 0, F\rho, H_2, 0, \ldots, 0, FC^{n-1}\rho, FC^{n-2}\rho\}$ rel. $(Ci_2)_{n+1}$. This process can be iterated to obtain an extension $H_n$ of $\{0, \ldots, 0, G, F, 0\}$ rel. $i_{n+3}$. The morphism $H_{n+1} = G_{n+1}C^3\kappa$ extends $\{0, \ldots, 0, G, F\}$ rel. $i_{n+3}$, where $G_{n+1}$ is an extension of $\{0, \ldots, 0, F\rho, H_n, FC\rho\}$ rel. $(Ci_n)_3$. Hence, $H_{n+1} : G \simeq F$ rel. $i_{n+2}$. Observe that the morphisms extended by $E, G_1, G_2, \ldots, G_{n+1}$ exist by Remark 3.2. An analogous process for an extension $H_0$ of the morphism $\{0, G, 0, \ldots, 0, F, 0\}$ rel. $i_{n+3}$ gives $H_{n+1}C^3\kappa : G \simeq F$ rel. $i_{n+2}$, when $n$ is even.

Finally, we define the exact sequence of homotopy groups associated with a based
cofibration and a pair.

Theorem 5.13. For every based cofibration $i : B \rightarrowtail A$ and pair $(X, Y)$, there is an exact sequence of homotopy groups:

$$\cdots \to \pi_3^i(Y) \xrightarrow{f_*} \pi_3^i(X) \xrightarrow{j} \pi_3^i((X, Y)) \xrightarrow{\delta} \pi_2^i(Y) \xrightarrow{f_*} \pi_2^i(X)$$

Proof. The maps $f_*$ and $\delta = p_1 : \pi_{n+2}^i((X, Y)) = \pi_{n+1}^{(Ci,i)}((X, Y)) \to \pi_{n+1}^i(Y)$ are homomorphisms of groups, by the first part of Theorem 4.13 and Remark 5.4 respectively. The maps $j = (1, 1)^*\theta^{-1} : \pi_{n+2}^i(X) \to \pi_{n+1}^{(i_1,1)}((X, Y)) \to \pi_{n+2}^i((X, Y)) = \pi_{n+1}^{(Ci,i)}((X, Y))$ are homomorphisms of groups, by Proposition 5.9 and the second part of Theorem 4.13 applied to the commutative square $(1, 1)(Ci, i) = (i_1, 1)(\bar{i}, i)$ in $\mathbf{cof}\,\mathbf{C}$, where $\bar{i}$ is the induced cofibration in the pushout of the relative cone $\Sigma_i$.

If $[(F, G)] \in \pi_{n+2}^i((X, Y))$, then $f_*\delta([(F, G)]) = f_*([G]) = [fG] = [0]$, since $HC^{n+2}\kappa : 0 \simeq fG$ rel. $i_{n+1}$, where $H$ is an extension of $\{0, \ldots, 0, F\}$ rel. $(Ci)_{n+2}$.

If $[F] \in \pi_{n+2}^i(X)$, then $\delta j([F]) = \delta([(F, 0)]) = [0]$.

If $[F] \in \pi_{n+2}^i(Y)$, then $jf_*([F]) = j([fF]) = [(fF, 0)] = [(0, 0)]$, since $(H, F) : (fF, 0) \simeq (0, 0)$ (resp. $(H, F) : (0, 0) \simeq (fF, 0)$) rel. $(Ci, i)_{n+1}$ where, by Lemma 5.11, $H : C^{n+2}A \to X$ is an extension of $\{0, F, 0, \ldots, 0, F, 0\}$ (resp. $\{0, F, 0, \ldots, 0, F\}$) rel. $i_{n+3}$, when $n$ is even (resp. odd).

If $[G] \in \pi_{n+1}^i(Y)$ and $f_*([G]) = [fG] = [0]$, then there is $H : 0 \simeq fG$ rel. $i_{n+1}$. Hence, $[(F\kappa, G)] \in \pi_{n+2}^i((X, Y))$, where $F$ is an extension of $\{0, H, 0, \ldots, 0\}$ rel. $Ci_{n+2}$. So $\delta([(F\kappa, G)]) = [G]$.

If $[(F, G)] \in \pi_{n+2}^i((X, Y))$ and $[G] = [0] \in \pi_{n+1}^i(Y)$, then there is $H : 0 \simeq G$ rel. $i_{n+1}$. Hence, $(H', H) : (H'C\kappa, 0) \simeq (F, G)$ rel. $(Ci, i)_{n+1}$, where $H'$ is an extension of $\{0, fH, 0, \ldots, 0, F\}$ rel. $(Ci_{n+1})_1$. So $j([H'C\kappa]) = [(F, G)]$.

If $[F] \in \pi_{n+2}^i(X)$ and $j([F]) = [(F, 0)] = [(0, 0)]$, then there is $(H, G) : (F, 0) \simeq (0, 0)$ (resp. $(H, G) : (0, 0) \simeq (F, 0)$) rel. $(Ci, i)_{n+1}$, for $n$ even (resp. odd). Hence, $H$ is an extension of $\{0, fG, 0, \ldots, 0, F, 0\}$ (resp. $\{0, fG, 0, \ldots, 0, F\}$) rel. $i_{n+3}$. By Lemma 5.12 it is concluded that $f_*([G]) = [fG] = [F]$.
Corollary 5.14. For every based object $A$ and pair $(X, Y)$, the following sequence of groups is exact:

$$\cdots \to \pi_3^A(Y) \xrightarrow{f_*} \pi_3^A(X) \xrightarrow{j} \pi_3^A((X, Y)) \xrightarrow{\delta} \pi_2^A(Y) \xrightarrow{f_*} \pi_2^A(X)$$
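For reference, exactness in Theorem 5.13 can be restated in the usual image/kernel form. The notation $\operatorname{im}$ and $\ker$ is not used in the paper itself and is introduced here only as a summary; each equality corresponds to two of the six steps in the proof of Theorem 5.13 (one inclusion each), and the same holds with $i$ replaced by a based object $A$.

```latex
\begin{align*}
\operatorname{im}\bigl(\delta : \pi_{n+2}^{i}((X,Y)) \to \pi_{n+1}^{i}(Y)\bigr)
  &= \ker\bigl(f_* : \pi_{n+1}^{i}(Y) \to \pi_{n+1}^{i}(X)\bigr), \\
\operatorname{im}\bigl(j : \pi_{n+2}^{i}(X) \to \pi_{n+2}^{i}((X,Y))\bigr)
  &= \ker\bigl(\delta : \pi_{n+2}^{i}((X,Y)) \to \pi_{n+1}^{i}(Y)\bigr), \\
\operatorname{im}\bigl(f_* : \pi_{n+2}^{i}(Y) \to \pi_{n+2}^{i}(X)\bigr)
  &= \ker\bigl(j : \pi_{n+2}^{i}(X) \to \pi_{n+2}^{i}((X,Y))\bigr).
\end{align*}
```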
6 Algebraic examples
The homotopy theory developed in the previous sections has a dual version based on the notions of cocone, fibration, nullhomotopy lifting property (NLP) and cosuspension. Cone or cocone structures can be obtained by applying algebraic methods to other homotopy structures. Also, adjointness relations between functors originate cones or cocones. A cone in a category is a triple (C, κ, ρ), in the sense of Definition 3.1, verifying the
cone axiom. Huber [9] proved that every pair of adjoint functors $(\mathbf{A} \xrightarrow{U} \mathbf{C}, \mathbf{C} \xrightarrow{V} \mathbf{A})$ carries cone structures in $\mathbf{C}$ (resp. cocone structures in $\mathbf{A}$) into cone structures in $\mathbf{A}$ (resp. cocone structures in $\mathbf{C}$). In this sense, using the trivial cone (resp. trivial cocone) $(1, id, id)$ we shall always obtain a cone (resp. cocone) for every pair of adjoint functors. The converse of this fact was proved by Kleisli [13]. Given a cone $(C, \kappa, \rho)$, every right adjoint functor $C'$ of $C$ induces a cocone $(C', \kappa', \rho')$ with adjunction isomorphism identifying $\kappa^* \cong \kappa'_*$ and $\rho^* \cong \rho'_*$. This fact was proved by Huber [9] too.

A morphism $i$ is said to be a cofibration generated by the cone $(C, \kappa, \rho)$ if, for each non-negative integer $n$, the morphism $i_n$ verifies the NEP and there exists the pushout object $P\{i_n, f\}$, for any morphism $f$ with the same domain as $i_n$.
6.1 Category with natural spheres (and co-spheres)

The $n$th suspension $S^nA$ of a topological space $A$ is obtained by identifying the base of the topological cone $CS^{n-1}A$ to a single point. If $A = S^0$ is the 0-sphere, then $S^nA = S^n$ is the $n$-sphere. In this way, the homotopy groups referred to an object $A$ can be considered as spherical homotopy groups.

Definition 6.1. A category with natural spheres, or S-category, is a category $\mathbf{C}$ with a cone $(C, \kappa, \rho)$ such that $C$ carries pushouts to pushouts and $\kappa_A$ is a cofibration generated by the cone, for every object $A$.

Theorem 6.2. Every S-category is a C-category with the cofibrations generated by the cone.

Proof.
- Pushout axiom: By hypothesis, if $i : B \rightarrowtail A$ is a generated cofibration, then there is the pushout object $P\{i, f\}$, for every morphism $f : B \to X$. Hence, for any morphism $g : X \to Y$ there exists $P\{\bar{i}, g\} = P\{i, gf\}$. Moreover, $\{Cf\, r, 1\} : P\{Ci, Cf\} \to CX$ is a retraction for $C\bar{i}$, where $r$ is a retraction for $Ci$. Observing that Lemma 4.12 is also true for diagrams without base morphism, the morphism $\bar{i}$ is a generated cofibration since $(\bar{i})_n = \overline{i_n}$.
- Cofibration axiom: The development made in the proof of the relative cone axiom in Theorem 5.3, when applied to the cofibration of pairs $\kappa_{(X,Y)}$, proves that $\Sigma_{f_m} = P\{\kappa_m, \Sigma_m(Cf, f)\}$, for every non-negative integer $m$.

Given two generated cofibrations $C \overset{j}{\rightarrowtail} B \overset{i}{\rightarrowtail} A$, the commutative square $1_A(ij) = ij$ induces, by Lemma 4.12, the commutative square $1_{C^n(A)}(ij)_n = i_n\Sigma_n(1_A, j)$. In this way, $\Sigma_n(1_A, j) = C^nj \cup 1 : \Sigma_{(ij)_{n-1}} = P\{\kappa_{n-1}, \Sigma_{n-1}(C(ij), ij)\} \to \Sigma_{i_{n-1}} = P\{\kappa_{n-1}, \Sigma_{n-1}(Ci, i)\}$. Hence, $(ij)_n = i_n(C^nj \cup 1)$. Given a morphism $f = \{f_0, f_1\} : \Sigma_{(ij)_{n-1}} = P\{\kappa_{n-1}, \Sigma_{n-1}(C(ij), ij)\} \to X$, the following diagram is a pushout:

$$\begin{array}{ccc}
\Sigma_{(ij)_{n-1}} = P\{\kappa_{n-1}, \Sigma_{n-1}(C(ij), ij)\} & \xrightarrow{\ f = \{f_0, f_1\}\ } & X \\
{\scriptstyle C^nj \cup 1}\downarrow & & \downarrow{\scriptstyle j_n} \\
\Sigma_{i_{n-1}} = P\{\kappa_{n-1}, \Sigma_{n-1}(Ci, i)\} & \xrightarrow{\ 1 \cup f_1\ } & P\{j_n, \{f_0, f_1\Sigma_{n-1}(Ci, i)\}\}
\end{array}$$

Hence, the pushout $P\{(ij)_n, f\}$ exists too.
Finally, the morphism $\{C\{\Sigma_{n-1}(Ci, i), \kappa_{n-1}\Sigma_{n-1}(Ci, i)\}r, C\kappa_{n-1}\} : C\Sigma_{i_{n-1}} \to C\Sigma_{(ij)_{n-1}}$ is a retraction for $C(C^nj \cup 1)$, where $r\,Cj_n = 1$. So that $C(ij)_n$ is a section.

Remark 6.3. Every S-category has homotopy groups relative to generated cofibrations and exact sequences of these groups. In particular, this fact guarantees the existence of the spherical homotopy groups.

The dual concept of an S-category is called a category with natural co-spheres or S′-category.

Definition 6.4. An SS′-category is a category with natural spheres and co-spheres, where $(C, C')$ is a pair of adjoint functors such that the natural adjunction isomorphism makes $\kappa^* \cong \kappa'_*$ and $\rho^* \cong \rho'_*$.

Remark 6.5. If $(C, C')$ is a pair of adjoint functors, then $C$ preserves colimits and $C'$ preserves limits. Hence, $C$ carries pushouts to pushouts and $C'$ carries pullbacks to pullbacks.

Proposition 6.6. In an SS′-category, the natural adjunction isomorphism $\gamma$ induces an isomorphism $\pi_n^{\kappa_A}(X) \cong \pi_n^{\kappa'_X}(A)$, for every pair of objects $A$, $X$.
Proof. The isomorphism γ : πnκA (X) → πnX (A) is defined by γ ([F ]) = [γ n+1 (F )], where ..... γ : Hom(C n+1 A, X) → Hom(A, C n+1 X). In effect, if γ n+1 denote the composite γ (n+1 F : F0 F1 rel. κn then C n+1 κ F : γ n+1 (F0 ) γ n+1 (F1 ) rel. κn , where F is a lifting of the morphism < ρ γ n+1 (F0 ), C n ρ C n κ ρ γ n+1 (F0 ), ....., C n ρ C 2 κ ρ γ n+1 (F0 ), C n ρ γ n+1 (F0 ), γ n+2 (F ) > relative to the fibration (κC )n+1 . The inverse isomorphism is defined using γ −1 . The natural adjunction isomorphism γ induces, in a natural way, adjunctions between suspensions or relative cones and the respective dual concepts, whose natural isomorphisms will be also denoted by γ. Moreover, the isomorphism γ between S and S preserves the homotopy relation, and so that the classical relation between loops and suspensions can be generalized for SS -categories. Proposition 6.7. Given two objects A and X in a SS -category, [S n−i A, S m X] ∼ = [S n A, S m+i X] Theorem 6.8. If C is a S-category with pullbacks and C is a right adjoint functor to
the cone C, then C is a SS -category. Proof. It suffices to prove that κn verify the NLP. If f : A → Σκn is nullhomotopic, then F κ = f , for some F : CA → Σκn . So that γ −1 (f ) = γ −1 (F κ) = γ −1 (F )Σn (Cκ, κ) : Σκn−1 → Σ(κC )n−1 → X. Hence, γ −1 (f ) is nullhomotopic, and by the NEP there is a morphism F : C n+2 A → X such that F κn+1 = γ −1 (f ). (∗)
Therefore, f = γ(F κn+1 ) = (Σn+1 (F, C F ))γ(κn+1 ) = (Σn+1 (F, C F ))κn+1 γ n+2 (1C n+2 A ) = κn+1 C n+2 F γ n+2 (1C n+2 A ) = κn+1 γ n+2 (F ). (∗) Note that γ n+1 (C n+1−i κ) = γ n+1 (1C n+2 A C n+1−i κ) = γ i (γ n+1−i (1C n+2 A )κ) γ i (κ γ n+2−i (1C n+2 A )) = C i κ γ n+2 (1C n+2 A )
(κ∗ ∼ =κ∗ )
=
6.2 Additive categories

In [5] we proved that if $\mathbf{C}$ is an additive category with finite limits and colimits, and with structures of C-category and C′-category such that $(C, C')$ is a pair of adjoint functors with the natural adjunction isomorphism making $\kappa^* \cong \kappa'_*$, then $\mathbf{C}$ is a proper model category. Moreover, if cofibrations and fibrations are defined by the NEP and the NLP respectively, the model category is also closed. As a consequence of Theorem 6.2 (and its dual) these results can be applied to SS′-additive categories.

On the other hand, in additive categories the axioms of a C-category can be simplified. In this sense, if a category $\mathbf{C}$ has cokernels and a cone $(C, \kappa, \rho)$ carrying pushouts to pushouts, then every morphism verifying the NEP is a cofibration generated by the cone: if $r\,Ci = 1$, then $(C\bar{\kappa}\,\rho + C\bar{i}\,Cr - C\bar{i}\,Cr\,C\kappa\,\rho)\{C^2i, C\kappa\} = 1$, where $\bar{i}$ and $\bar{\kappa}$ are the induced morphisms in the pushout of $\Sigma_i$. Observe that $\mathbf{C}$ has pushouts since it is additive with cokernels, and $\kappa$ is a generated cofibration. Therefore, by Theorem 6.2, taking the morphisms verifying the NEP as cofibrations, $\mathbf{C}$ is a C-category.

Proposition 6.9. Given $CA \overset{i}{\leftarrowtail} B \xrightarrow{f_0, f_1} X$ in a C-additive category, then $f_0 \simeq f_1$ rel. $i$ if and only if there is a nullhomotopy $F : f_1 - f_0 \simeq 0$ such that $F\,Ci = 0$.

Proof. If $H : f_0 \simeq f_1$ rel. $i$, then $H - f_0\rho : f_1 - f_0 \simeq 0$. Conversely, if $F : f_1 - f_0 \simeq 0$, then $F + f_0\rho : f_0 \simeq f_1$ rel. $i$.

Remark 6.10. By the previous proposition, $f_0 \simeq f_1$ rel. $i$ if and only if $f_1 - f_0 \simeq 0$ rel. $i$. On the other hand, $\mathrm{Hom}(A, X)_{0(i)}$ is a subgroup of $\mathrm{Hom}(A, X)$, for all pairs of objects $A$, $X$ and morphisms $i$ with codomain $A$. Moreover, if $f_0 \simeq f_1$ rel. $i$ and $g_0 \simeq g_1$ rel. $i$, then $f_0 - g_0 \simeq f_1 - g_1$ rel. $i$, since if $F : f_1 - f_0 \simeq 0$ rel. $i$ and $G : g_1 - g_0 \simeq 0$ rel. $i$, then $F - G : (f_1 - g_1) - (f_0 - g_0) \simeq 0$ rel. $i$. Hence, given a cofibration $i : B \rightarrowtail A$, the set $[C^nA, X]_{0(i_n)} = \mathrm{Hom}(C^nA, X)_{0(i_n)} / {\simeq}$ rel. $i_n$ is an abelian group with the operation induced by the additivity of the category.

Theorem 6.11. In a C-additive category, the identity map $id : \pi_n^i(X) \to [C^nA, X]_{0(i_n)}$
is an isomorphism of groups.

Proof. $[F] + [G] = [\{F, 0\}\mu] + [\{0, G\}\mu] = [\{F, 0\}\mu + \{0, G\}\mu] = [(\{F, 0\} + \{0, G\})\mu] = [\{F, G\}\mu] = [F \cdot G] = [F] \cdot [G]$.

Remark 6.12. Consequently, homotopy groups in a C-additive category are abelian. Thus, we can define the abelian homotopy groups $\pi_n^i(X) = [C^nA, X]_{0(i_n)}$ and $\pi_n^A(X) = \pi_n^{0_A}(X)$, even for $n = 0, 1$. This definition of homotopy groups in a C-additive category renders the condition "C carries pushouts to pushouts" unnecessary. A C-additive category without this condition will be called a $C_*$-additive category.

A result similar to Theorem 6.2 is given for $C_*$-additive categories.

Theorem 6.13. Given a cone in an additive category $\mathbf{C}$ with cokernels, then $\mathbf{C}$ is a $C_*$-additive category with cofibrations defined by the NEP.

Proof. By the properties seen at the beginning of this section it suffices to prove that, given a morphism $i : B \to A$ verifying the NEP:
The induced morphism $\bar{i} : X \to P\{i, f\}$ verifies the NEP for every morphism $f : B \to X$, since $\{g, \kappa\} : P\{i, f\} \to CX$ extends $\kappa$ rel. $\bar{i}$, where $g$ is an extension of $\kappa f$ rel. $i$.
The morphism $i_1 : \Sigma_i \to CA$ verifies the NEP, since $(C\bar{\kappa} + \kappa\bar{i}r - C(\bar{i}r\kappa))$ extends $\kappa$ rel. $i_1$, where $\bar{i}$ and $\bar{\kappa}$ are the induced morphisms in the pushout of $\Sigma_i$ and $r\,Ci = 1$.

In this paper, homotopy theory seems to be characterized by contractible objects. This characterisation can be proved in additive categories.

Theorem 6.14. Given an additive category with cokernels, if two cones $(C, \kappa, \rho)$ and $(\tilde{C}, \tilde{\kappa}, \tilde{\rho})$ generate the same contractible objects, then the $C_*$ and $\tilde{C}_*$ structures are equivalent.

Proof. Given a morphism $i : B \to A$, by hypothesis $CB$ and $\tilde{C}B$ are contractible objects in both theories. Hence, $i$ verifies the NEP with respect to $C$ if and only if $i$ verifies the NEP with respect to $\tilde{C}$, and the generated cofibrations are coincident. Finally, if $F : 0 \simeq f_1 - f_0$ rel. $i$ with respect to $C$, then $F\beta : 0 \simeq f_1 - f_0$ rel. $i$ with respect to $\tilde{C}$, where $\beta$ is an extension of $i_1(\alpha \cup 1)$ relative to $\tilde{i}_1$, with $\alpha$ an extension of $\kappa$ relative to $\tilde{\kappa}$. The converse is proved in a similar way.
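The second claim in the proof of Theorem 6.13 can be checked by a short computation. The sketch below assumes that $\bar{i} : CB \to \Sigma_i$ and $\bar{\kappa} : A \to \Sigma_i$ denote the morphisms induced in the pushout $\Sigma_i = P\{\kappa_B, i\}$ (so $\bar{\kappa}\, i = \bar{i}\,\kappa_B$), that $i_1 = \{Ci, \kappa_A\}$, and that $\kappa$ is natural ($Cg\,\kappa = \kappa\, g$); the name $E$ is introduced here only for the candidate extension.

```latex
\begin{align*}
E &:= C\bar{\kappa} + \kappa_{\Sigma_i}\,\bar{i}\,r - C(\bar{i}\,r\,\kappa_A) \;:\; CA \to C\Sigma_i, \\[4pt]
E\,\kappa_A &= C\bar{\kappa}\,\kappa_A + \kappa_{\Sigma_i}\bar{i}\,r\,\kappa_A - C(\bar{i}\,r\,\kappa_A)\,\kappa_A \\
  &= \kappa_{\Sigma_i}\bar{\kappa} + \kappa_{\Sigma_i}\bar{i}\,r\,\kappa_A - \kappa_{\Sigma_i}\bar{i}\,r\,\kappa_A
   = \kappa_{\Sigma_i}\bar{\kappa}, \\[4pt]
E\,Ci &= C(\bar{\kappa}\,i) + \kappa_{\Sigma_i}\bar{i}\,(r\,Ci) - C(\bar{i}\,r\,\kappa_A\,i) \\
  &= C(\bar{i}\,\kappa_B) + \kappa_{\Sigma_i}\bar{i} - C(\bar{i}\,(r\,Ci)\,\kappa_B)
   = \kappa_{\Sigma_i}\bar{i}, \\[4pt]
E\,i_1 &= \{E\,Ci,\; E\,\kappa_A\} = \kappa_{\Sigma_i}\{\bar{i}, \bar{\kappa}\} = \kappa_{\Sigma_i}.
\end{align*}
```

The middle step uses $\kappa_A\, i = Ci\,\kappa_B$ (naturality of $\kappa$) together with $r\,Ci = 1$.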
6.3 Categories with a natural cylinder

In the Introduction section of this paper, a process to obtain a topological cone starting from a cylinder is described. This fact can be generalized to obtain a C-category starting from an I-category (C, cof, I, ι0 , ι1 , , ∅) in the sense of Baues [1]. The interchange map T
used by Baues in the axiom (I5) lets us generalize the natural product of the topological cylinder: Given an I-category, the homotopy extension property (HEP) applied to the homotopy square ι0 {ι0 , ι1 } = {ι0 , 1}ι0 induces an extension F verifying F {Iι0 , Iι1 } = {ι0 , 1} and F ι0 = ι0 . Similarly, the square ι0 2 {I{ι0 , ι1 }, ι0 , ι1 } = {ι0 2 , F T, ι0 2 , F T }ι0 has an extension H verifying Hι0 = ι0 2 and HI{I{ι0 , ι1 }, ι0 , ι1 } = {ι0 2 , F T, ι0 2 , F T }. Hence, Ψ = Hι1 generalizes the fundamental properties of the natural product, that is, ΨIι0 = ι0 , ΨIι1 = 1, Ψι0 = ι0 and Ψι1 = 1. This fact permits the replacement of the interchange axiom by the following one, without significant changes in the homotopy theory. (I’5) Product axiom: There exists a natural transformation Ψ : I 2 → I verifying ΨIι1 = Ψι1 = 1, ΨIι0 = Ψι0 = ι0 and ΨIΨ = Ψ2 . An I-category where the axiom (I5) is replaced by the axiom (I 5) is called an Icategory with a natural product. Theorem 6.15. Every I-category with a natural product and final object ε is a Ccategory, with the same cofibrations. Moreover, the homotopies induced by both theories agree. Proof. The cone functor C : C → C is defined for objects by C = P {ι0 , ε} and for morphisms by I ∪ 1ε . In this sense, ι0 = C∅ : ε = C∅ → C. The natural inclusion is defined by κ = ει1 : 1 → C, where ε is the induced morphism in the pushout of the cone. The natural projection is induced by the natural product of the cylinder: ρ = {ε Ψ, C∅ εIε , C∅} : C 2 → C, where {ε Ψ, C∅ εIε } : IC = P {Iι0 , Iε} → C. - It is easy to check that (C, κ, ρ) is a cone. - C carries cofibrated pushouts to pushouts by definition, since the cylinder functor I has the required property. - Observing that objects of an I-category are cofibrants, κA = {C∅A , κA }∅ε : A ε A CA is a cofibration since the following diagram is a pushout:
{ι0 ,ι1 }
ε 1
/ε A
AA IA
ε
{C∅,κ}
/ CA
On the other hand, given a cofibration i : B A, the homotopy square (C∅B ε)i = C∅B ε = ε ι0 has an extension F verifying F Ii = ε and F ι0 = C∅B ε. The morphism ρ CF Cι1 is a retraction for Ci. - The morphism i1 is a cofibration since the following square is a pushout. - Recall that f0 f1 relative to i : B A respect to a cylinder if and only if there is a morphism F : IA → X such that F {Ii, ι0 , ι1 } = {f0 i , f0 , f1 }. In particular, if i : B CA is a cofibration: If F : f0 f1 rel. i respect to the cone, then {f0 ρ, F } κω : f0 f1 rel. i respect to
{Ii,ι0 ,ι1 }
{iε,C∅ε∪1}
/ Σi
IB ∪ A ∪ A IA
ε
i1 ={Ci,κ}
/ CA
the cylinder, where κ is any extension of κ = κ ∪ κ : P {i, i} → C(P {i, i}) = P {Ci, Ci} relative to {Ii, ι0 , ι1 }, with {Ii, ι0 , ι1 } and ω defined by the following pushout:
{Ii,ι0 ,ι1 }
{ii ,1∪1}
/ P {i, i}
IB ∪ A ∪ A ICA
ω
{Ii,ι0 ,ι1 }
/ Zi
Conversely, given F : f0 f1 rel. i respect to the cylinder one can consider the pushout P {i1 , ρCi ∪ 1}, where ρCi ∪ 1 : Σi → P {i, i}. If H is an extension of the homotopy square {f0 ρ, f0 , f0 }i1 = {f0 , F }ι0 , then Hι1 ρCi ∪ 1 : f0 f1 rel. i respect to the cone.
7 Examples
7.1 Tensorial homotopies Homotopy theories through SS -structures are built using the adjunction between the tensorial and Hom functors: - Every unitary ring R gives rise to a SS structure in the category of abelian groups. The cone and the cocone of an abelian group X are defined as follows: CX = X ⊗Z R, κX (x) = x ⊗ 1 and ρX (x ⊗ r ⊗ s) = x ⊗ rs; C X = HomZ (R, X). κX (α) = α(1) and ρX (α)(r ⊗ s) = α(rs). The adjunction isomorphism γ is defined, for f : X ⊗ R → Y , by γ(f )(x)(r) = f (x ⊗ r). Observe that, for g : X → Y R , the inverse is defined by γ −1 (g)(x ⊗ r) = g(x)(r). An abelian group X is contractible if and only if there is a homomorphism of groups F : X ⊗ R → X such that F (x ⊗ 1) = x. This condition is equivalent to the existence of an external operation G : X × R → X, denoted by G(x, r) = r · x, verifying: 1) r · (x + x ) = r · x + r · x 2) (r + r ) · x = r · x + r · x 3) 1 · x = x These contractible abelian groups are called R-quasi-modules. We point out that not every R-quasi-module is a R-module. It suffices to consider R = Q the ring of quaternions and X = O the abelian group of octonions (addition of octonions is defined by adding coefficients of the unit octonions {1, i, j, k, l, li, lj, lk}, as with the complex numbers and quaternions). The external operation is defined by considering the inclusion Q ⊂ O and the product of octonions (see the multiplication
table in http://en.wikipedia.org/wiki/Octonion, for example). This product verifies that (ij)l = −i(jl) ≠ i(jl).

- The category of R-quasi-modules is also a SS -category. A homomorphism of R-quasi-modules f : X → Y is a homomorphism of groups such that f(r · x) = r · f(x). If X is an R-quasi-module, then so is X ⊗_Z R, with the operation generated by r′ · (x ⊗ r) = x ⊗ r′r. Also, Hom_Z(R, X) is an R-quasi-module with the operation r · α(x) = α(r · x). The structure of SS -category defined above is also valid in this case. An R-quasi-module X is contractible if and only if X is an R-module. Observe that X ≃ 0 if and only if there is a homomorphism of R-quasi-modules F : X ⊗ R → X such that F(x ⊗ 1) = x. Hence, r′ · (r · x) = r′ · F(x ⊗ r) = F(r′ · (x ⊗ r)) = F(x ⊗ r′r) = r′r · x.

- Every homomorphism of unitary rings f : R → S, with f(1) = 1 and R an abelian ring, induces a SS structure in the category of R-modules. The ring S is a left R-module with the operation rs = f(r)s. Hence, if X is an R-module, then so are X ⊗_R S and Hom_R(S, X). A structure of SS -category similar to the above is also valid in this case. An R-module X is contractible if and only if X is an S-quasi-module. Observe that the first example, viz. abelian groups, is a particular case of this, with R = Z and f : Z → S defined by f(n) = n1.
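The failure of associativity used in the octonion example can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the Cayley–Dickson convention (a, b)(c, d) = (ac − d*b, da + bc*) on pairs of quaternions, with l = (0, 1); the basis labelling of the multiplication table cited in the text may differ in signs, and all function names are hypothetical.

```python
def q_mul(p, q):
    """Hamilton product of quaternions stored as 4-tuples (1, i, j, k coefficients)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def q_conj(p):
    return (p[0], -p[1], -p[2], -p[3])

def q_add(p, q):
    return tuple(x + y for x, y in zip(p, q))

def q_sub(p, q):
    return tuple(x - y for x, y in zip(p, q))

def o_mul(x, y):
    """Assumed Cayley-Dickson product on pairs of quaternions."""
    (a, b), (c, d) = x, y
    return (q_sub(q_mul(a, c), q_mul(q_conj(d), b)),
            q_add(q_mul(d, a), q_mul(b, q_conj(c))))

def o_add(x, y):
    return (q_add(x[0], y[0]), q_add(x[1], y[1]))

def o_neg(x):
    return tuple(tuple(-t for t in part) for part in x)

ZERO, ONE = (0, 0, 0, 0), (1, 0, 0, 0)
I = ((0, 1, 0, 0), ZERO)   # the quaternion unit i, embedded in the octonions
J = ((0, 0, 1, 0), ZERO)   # the quaternion unit j, embedded in the octonions
L = (ZERO, ONE)            # the new unit l

left = o_mul(o_mul(I, J), L)    # (ij)l
right = o_mul(I, o_mul(J, L))   # i(jl)

# Axioms 1) and 3) of a quasi-module on integer samples.
r = ((1, 2, 0, -1), ZERO)
x1 = ((3, 0, 1, 2), (0, -1, 4, 1))
x2 = ((-2, 5, 0, 1), (1, 1, 0, -3))
axiom1 = o_mul(r, o_add(x1, x2)) == o_add(o_mul(r, x1), o_mul(r, x2))
axiom3 = o_mul((ONE, ZERO), x1) == x1

print(left == o_neg(right), left == right, axiom1, axiom3)
```

Under this convention the external operation r · x satisfies the sampled quasi-module axioms, while i · (j · l) differs from (ij) · l, so the action is not a module action.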
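7.2 Chain complexes

Given an abelian category A, the category of chain complexes over A is defined as follows. Objects are pairs (X, δ), where $X = \{X_n\}_{n\in\mathbb{Z}}$ and $\delta = \{\delta_n\}_{n\in\mathbb{Z}}$, with $X_n$ objects of A and $\delta_n : X_n \to X_{n-1}$ morphisms verifying $\delta_{n-1}\delta_n = 0$. If there is no possibility of confusion, we will use the notation $\delta : X_n \to X_{n-1}$. Morphisms from (X, δ) to (Y, δ′) are $f = \{f_n : X_n \to Y_n\}_{n\in\mathbb{Z}}$ such that $\delta' f_n = f_{n-1}\delta$.

K. H. Kamps [10] defined a homotopy theory on the category of chain complexes over A by using the cylinder defined by I(X, δ) = (IX, Iδ), with $(IX)_n = X_n \oplus X_n \oplus X_{n-1}$ and
\[
(I\delta)_n = \begin{pmatrix} \delta & 0 & 1 \\ 0 & \delta & -1 \\ 0 & 0 & -\delta \end{pmatrix},
\]
and given f : (X, δ) → (Y, δ′) then
\[
(If)_n = \begin{pmatrix} f_n & 0 & 0 \\ 0 & f_n & 0 \\ 0 & 0 & f_{n-1} \end{pmatrix}.
\]
The natural transformations are defined by
\[
\iota_0 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad \iota_1 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix},
\]
and the third natural transformation, from IX to X, by the row matrix $(1\ \ 1\ \ 0)$.

This cylinder has a natural product
\[
\Psi_n : X_n \oplus X_n \oplus X_{n-1} \oplus X_n \oplus X_n \oplus X_{n-1} \oplus X_{n-1} \oplus X_{n-1} \oplus X_{n-2} \to X_n \oplus X_n \oplus X_{n-1}
\]
defined by the matrix
\[
\Psi_n = \begin{pmatrix} 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \end{pmatrix}.
\]
Therefore, by Theorem 6.15 there exists a cone defined by $(CX)_n = X_n \oplus X_{n-1}$,
\[
(Cf)_n = \begin{pmatrix} f_n & 0 \\ 0 & f_{n-1} \end{pmatrix}, \qquad
(C\delta)_n = \begin{pmatrix} \delta & -1 \\ 0 & -\delta \end{pmatrix}, \qquad
\kappa = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad\text{and}\quad
\rho = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \end{pmatrix}.
\]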
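A short consistency check, offered as a sketch rather than as part of the original text: assuming the cylinder differential is the block matrix with rows (δ, 0, 1), (0, δ, −1), (0, 0, −δ) and the cone differential the one with rows (δ, −1), (0, −δ), both square to zero whenever δδ = 0, so that (IX, Iδ) and (CX, Cδ) are again chain complexes.

```latex
\[
(I\delta)_{n-1}(I\delta)_n =
\begin{pmatrix} \delta & 0 & 1 \\ 0 & \delta & -1 \\ 0 & 0 & -\delta \end{pmatrix}
\begin{pmatrix} \delta & 0 & 1 \\ 0 & \delta & -1 \\ 0 & 0 & -\delta \end{pmatrix}
=
\begin{pmatrix} \delta\delta & 0 & \delta - \delta \\ 0 & \delta\delta & -\delta + \delta \\ 0 & 0 & \delta\delta \end{pmatrix}
= 0,
\]
\[
(C\delta)_{n-1}(C\delta)_n =
\begin{pmatrix} \delta & -1 \\ 0 & -\delta \end{pmatrix}
\begin{pmatrix} \delta & -1 \\ 0 & -\delta \end{pmatrix}
=
\begin{pmatrix} \delta\delta & -\delta + \delta \\ 0 & \delta\delta \end{pmatrix}
= 0.
\]
```

The off-diagonal entries cancel precisely because of the opposite signs on the last diagonal entry, which is why the sign −δ appears there.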
On the other hand, the homotopy theory defined by Kamps can be obtained via a cocylinder defined by $I'(X, \delta) = (I'X, I'\delta)$, with $(I'X)_n = X_n \oplus X_n \oplus X_{n+1}$, and given f : (X, δ) → (Y, δ′) then
\[
(I'f)_n = \begin{pmatrix} f_n & 0 & 0 \\ 0 & f_n & 0 \\ 0 & 0 & f_{n+1} \end{pmatrix},
\]
where $(I'\delta)_n$, $\iota'_0$, $\iota'_1$ and the third natural transformation are the transposes of $(I\delta)_n$, $\iota_0$, $\iota_1$ and the corresponding transformation of the cylinder, respectively. Moreover, this cocylinder has a natural coproduct $\Psi'$, where $\Psi'_n$ is given by the transpose of $\Psi_n$. Hence, there exists a cocone defined by $(C'X)_n = X_n \oplus X_{n+1}$,
\[
(C'f)_n = \begin{pmatrix} f_n & 0 \\ 0 & f_{n+1} \end{pmatrix},
\]
and $(C'\delta)_n$, $\kappa'$ and $\rho'$ are the transposes of $(C\delta)_n$, $\kappa$ and $\rho$, respectively.

We conclude that this homotopy theory can be obtained from a structure of SS -category. The natural isomorphism of adjunction $\gamma : \mathrm{Hom}(C-, \sim) \to \mathrm{Hom}(-, C'\sim)$ is defined by
\[
\gamma\begin{pmatrix} f_n & g_{n-1} \end{pmatrix} = \begin{pmatrix} f_n \\ g_n \end{pmatrix}.
\]
Contractible objects are those (X, δ) that have associated morphisms $g_n : X_n \to X_{n+1}$ verifying $\delta g_n + g_{n-1}\delta = -1$, for every integer n.

Generated cofibrations and fibrations are, respectively, the cofibrations and fibrations used by Kamps [10] and defined by the HEP and the HLP, that is, the normal monomorphisms and epimorphisms, respectively. Observe that any extension $\begin{pmatrix} r_n \\ r'_{n-1} \end{pmatrix}$ of the morphism κ relative to a morphism i : B → A verifies
\[
\begin{pmatrix} r_n \\ r'_{n-1} \end{pmatrix} (i_n) = \begin{pmatrix} r_n i_n \\ r'_{n-1} i_n \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \kappa_n,
\]
for every integer n. Hence, $r_n i_n = 1$, and therefore $i_n$ is a section, for every integer n. Conversely, if $r_n$ is a retraction for $i_n$, for every integer n, then
\[
\begin{pmatrix} r_n \\ \delta r_n - r_{n-1}\delta \end{pmatrix}
\]
gives an extension of the morphism κ relative to the morphism i.
7.3 Pointed topological spaces

The classical homotopy theory on pointed topological spaces is also generated by a SS structure. The cone is defined by C(X, x0) = (X × I)/((X × {0}) ∪ ({x0} × I)), with κ(x) = [x, 1] and ρ([x, t, s]) = [x, ts]. The cocone is defined by C′(X, x0) = Hom((I, 0), (X, x0)), with κ′(α) = α(1) and (ρ′(α))(t)(s) = α(ts). The natural adjunction isomorphism γ is defined by (γ(f))(x)(t) = f([x, t]). Observe that γ⁻¹(f)([x, t]) = (f(x))(t).

The product on the cylinder of a topological space mentioned in the Introduction of this paper gives a cone structure on the category of topological spaces. It induces a product on the cylinder of any pointed topological space that originates the above cone. Hence the classical homotopy theory of pointed topological spaces is obtained.

Two cone structures can be associated to topological spaces using HEP cofibrations or closed cofibrations. Contractible topological spaces are the contractible objects of these cone structures. A pointed topological space is contractible if and only if the point is a strong deformation retract of the topological space.
7.4 Projective and injective homotopies

Projective and injective homotopy theories created by Eckmann and Hilton [8] are generated by C∗ and C∗ structures, respectively. The cone and cocone structures were defined by Huber [9] using the free, forgetful and Hom functors.

7.4.1 Projective homotopy theory

The cocone C′M of an R-module M is the free R-module generated by the pointed set (M, 0). Given a homomorphism of R-modules f : M → N, the cocone C′f : C′M → C′N is the unique homomorphism of R-modules verifying C′f i = jf, where i : (M, 0) → (C′M, 0) and j : (N, 0) → (C′N, 0) are the inclusions of pointed sets.

The natural transformations κ′ and ρ′ are defined as follows: κ′_M : C′M → M and ρ′_M : C′M → C′²M are the unique homomorphisms of R-modules such that κ′_M i = 1 and ρ′_M i = ji, where i : (M, 0) → (C′M, 0) and j : (C′M, 0) → (C′²M, 0) are the inclusions of pointed sets.

Contractible objects are projective R-modules: If M ≃ 0, then there is F : M → C′M such that κ′F = 1. Given a homomorphism of R-modules f : M → N and an epimorphism of R-modules α : L → N, let s : (N, 0) → (L, 0) be a section of α in the category of pointed sets. There is a unique homomorphism of R-modules H : C′M → L such that Hi = sf. Hence, αHi = αsf = f = fκ′i, and therefore αH = fκ′. So αHF = fκ′F = f, and M is a projective R-module. The converse is obvious, by observing that κ′ : C′M → M is an epimorphism of R-modules.

7.4.2 Injective homotopy theory

The cone of an R-module M is defined by CM = Hom(C′ Hom(M, Q1), Q1), where Q1 is the additive group of the rational numbers modulo the integers and C′ is the cocone functor of the above projective homotopy theory. So that

C²M = Hom(C′ Hom(Hom(C′ Hom(M, Q1), Q1), Q1), Q1).

Observe that the functor Hom(−, Q1) carries right (resp. left) modules into left (resp. right) modules. Here we will work in the category of left R-modules. A similar development can be achieved for right R-modules.
The cone of a homomorphism of R-modules f : M → N is defined by Cf = (C′f*)*. The natural transformations κ and ρ are defined as follows:

The homomorphism κ_M : M → CM is such that
\[
\kappa_M(m)\Big(\sum_{i\in M_1} \alpha_i r_i\Big) = \sum_{i\in M_1} \alpha_i r_i(m) = \sum_{i\in M_1} \alpha_i(r_i m)
\]
for every finite subset M1 ⊆ Hom(M, Q1), m ∈ M, ri ∈ R, and homomorphism of groups αi : M → Q1.

The homomorphism ρ_M : C²M → CM is such that
\[
\rho_M(\alpha)\Big(\sum_{i\in M_1} \alpha_i r_i\Big) = \alpha\Big(\sum_{i\in M_1} \tilde{\alpha}_i r_i\Big)
\]
for every finite subset M1 ⊆ Hom(M, Q1), ri ∈ R
and homomorphisms of groups α : C′ Hom(Hom(C′ Hom(M, Q1), Q1), Q1) → Q1 and αi : M → Q1. The homomorphism of groups α̃i : Hom(C′ Hom(M, Q1), Q1) → Q1 is defined by α̃i(β) = β(αi), for every homomorphism of groups β : C′ Hom(M, Q1) → Q1.

Contractible objects are injective R-modules: To prove that every contractible R-module is injective, first we will see that CM is an injective module, for every R-module M. Since the tensor product is distributive with respect to direct sums, the tensor product of a module with its ring is isomorphic to the module, and the modules C′M and ⊕_M R are isomorphic, for every R-module M, hence:
\[
- \otimes_R C'M \;\cong\; - \otimes_R \Big(\bigoplus_M R\Big) \;\cong\; \bigoplus_M (- \otimes_R R) \;\cong\; \bigoplus_M -
\]
On the other hand, − ⊗_R N is a left adjoint functor of Hom_Z(N, −), for every R-module N. In particular ⊕_M − is a left adjoint functor of Hom_Z(C′M, −).

Since Q1 is an injective abelian group, using the adjunction isomorphism, the R-module Hom_Z(C′M, Q1) is injective, for every R-module M. In particular, so is CM = Hom(C′ Hom(M, Q1), Q1).

If the R-module M is contractible, then there is a homomorphism of R-modules F : CM → M such that Fκ_M = 1_M. Given a homomorphism of R-modules f : N → M and a monomorphism of R-modules α : N → L, then, since CM is an injective module, there is a homomorphism of R-modules f̄ : L → CM such that f̄α = κ_M f. Hence, F f̄ α = Fκ_M f = f, so M is injective.

Conversely, since M is injective and κ_M : M → CM is an injective homomorphism of R-modules, there is F : CM → M such that Fκ_M = 1.

Observe that the homomorphism of R-modules κ_M : M → CM is injective. Since Q1 is an injective abelian group, for every abelian group M and 0 ≠ m ∈ M there is a homomorphism of groups αm : M → Q1 such that αm(m) ≠ 0. If κ(m) = 0, then m = 0, since if m ≠ 0, then there is αm : M → Q1 such that κ(m)(αm) = αm(m) ≠ 0.
7.5 Exterior spaces

In general, the category P of topological spaces and proper maps does not have limits and colimits. In particular, pushouts do not always exist. However, P is a full subcategory of the category E of exterior spaces, and the pushout of every pair of morphisms exists in E. This fact allows the construction of a structure of natural cone through a cylinder with natural product. Since both homotopy theories agree (via the cylinder and via the cone) on the full subcategory P, we obtain the proper homotopy of topological spaces (see [6]).

The category of exterior spaces is defined as follows. An externology for a topological space (X, τ) is a subset ε ⊂ τ verifying the properties E1 and E2:
E1. If E, E′ ∈ ε, then E ∩ E′ ∈ ε.
E2. If A ∈ τ, E ∈ ε and E ⊂ A, then A ∈ ε.

An exterior space (X, τ, ε) is a topological space (X, τ) with an externology ε. An exterior morphism f : (X, τ, ε) → (X′, τ′, ε′) is a continuous map f : (X, τ) → (X′, τ′) such that f⁻¹(E′) ∈ ε, for every E′ ∈ ε′.

The category of exterior spaces has pushout objects: Given two exterior morphisms f : (X, τ, ε) → (X′, τ′, ε′) and g : (X, τ, ε) → (X″, τ″, ε″), the pushout object P{f, g} is the topological space P{f, g} with the externology formed by the sets E ⊂ P{f, g} whose preimages under the two induced maps X′ → P{f, g} and X″ → P{f, g} belong to ε′ and ε″, respectively.
On the other hand, given two exterior spaces (X, τ, ε) and (X′, τ′, ε′), the exterior product space is the topological product space (X × X′, τ″) with the externology ε″ = {E″ ∈ τ″ / there are E ∈ ε and E′ ∈ ε′ such that E × E′ ⊂ E″}.

The cylinder of an exterior space (X, τ, ε) is the exterior product space X × I, where the externology of the unit interval is {I}. In this way, the natural transformations of the cylinder given in the category of topological spaces (ι0, ι1, the projection and the product Ψ) are exterior morphisms. Hence, a cylinder structure with a natural product is obtained in the category of exterior spaces. A cone structure is obtained similarly as in the category of topological spaces. Observe that the cone of an exterior space (X, τ, ε) is the topological cone (CX, τ′) with the externology {E′ ∈ τ′ / there is E ∈ ε such that CE ⊂ E′}.
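On a finite topological space, the two axioms of an externology (closure under pairwise intersection, and closure under open supersets) can be tested directly. The sketch below is illustrative only and is not part of the paper; the function name `is_externology` and the encoding of open sets as Python sets are assumptions made for the example.

```python
from itertools import combinations

def is_externology(tau, eps):
    """Check E1 and E2 for a family eps of open sets of the finite topology tau."""
    tau = {frozenset(a) for a in tau}
    eps = {frozenset(e) for e in eps}
    if not eps <= tau:                      # an externology consists of open sets
        return False
    e1 = all((e & f) in eps for e in eps for f in eps)        # E1: intersections
    e2 = all(a in eps for a in tau for e in eps if e <= a)    # E2: open supersets
    return e1 and e2

# Discrete topology on X = {0, 1, 2}: every subset is open.
points = [0, 1, 2]
tau = [set(c) for k in range(len(points) + 1) for c in combinations(points, k)]

eps_good = [a for a in tau if 2 in a]   # all open sets containing the point 2
eps_bad = [{2}]                         # fails E2: {1, 2} is an open superset of {2}

print(is_externology(tau, eps_good), is_externology(tau, eps_bad))
```

The second family fails only because of E2, which illustrates that an externology is determined by its "small" members together with upward closure in τ.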
References
[1] H.J. Baues: Algebraic homotopy, Cambridge Studies in Advanced Mathematics 15, Cambridge University Press, Cambridge-New York, 1989.
[2] H.J. Baues and A. Quintero: “On the locally finite chain algebra of a proper homotopy type”, Bull. Belg. Math. Soc. Simon Stevin, Vol. 3(2), (1996), pp. 161–175.
[3] F.J. Díaz and S. Rodríguez-Machín: “Homotopy theory induced by cones”, Extracta Math., Vol. 16(2), (2001), pp. 287–292.
[4] F.J. Díaz, J. Remedios and S. Rodríguez-Machín: “Generalized homotopy in C-categories”, Extracta Math., Vol. 16(3), (2001), pp. 393–403.
[5] F.J. Díaz, J. García-Calcines and S. Rodríguez-Machín: Homotopía algebraica: descripción e interrelación de las principales teorías, Monografías de la Academia de Ciencias Exactas, Físicas, Químicas y Naturales de Zaragoza 5, 1994.
[6] J. García-Calcines, M. García-Pinillos and L.J. Hernández-Paricio: “A closed simplicial model category for proper homotopy and shape theories”, B. Aust. Math. Soc., Vol. 57, (1998), pp. 221–242.
[7] L.J. Hernández: Un ejemplo de Teoría de homotopía en los grupos abelianos, Departamento de Geometría y Topología, Universidad de Zaragoza, 1980.
[8] P.J. Hilton: Homotopy theory and duality, Gordon and Breach Science Publishers, New York-London-Paris, 1965.
[9] P.J. Huber: “Homotopy theory in general categories”, Math. Ann., Vol. 144, (1961), pp. 361–385.
[10] K.H. Kamps: “Note on normal sequences of chain complexes”, Colloq. Math., Vol. 39(2), (1978), pp. 225–227.
[11] K.H. Kamps and T. Porter: Abstract Homotopy and Simple Homotopy, World Scientific Publishing Co., Inc., River Edge, NJ, 1997.
[12] H. Kleisli: “Homotopy theory in Abelian Categories”, Canad. J. Math., Vol. 14, (1962), pp. 139–169.
[13] H. Kleisli: “Every Standard construction is induced by a pair of Adjoint Functors”, Proc. Am. Math. Soc., Vol. 16(3), (1965), pp. 544–546.
[14] E.G. Minian: “Generalized cofibration categories and global actions”, Special issues dedicated to Daniel Quillen on the occasion of his sixtieth birthday, Part I. K-Theory, Vol. 20(1), (2000), pp. 37–95.
[15] T. Porter: “Abstract Homotopy Theory: The Interaction of Category Theory and Homotopy”, Cubo Mat. Educ., Vol. 5(1), (2003), pp. 115–165.
[16] E. Padrón and S. Rodríguez-Machín: “Model additive categories”, Rend. Circ. Mat. Palermo, Suppl., Vol. 24, (1990), pp. 465–474.
[17] D.G. Quillen: Homotopical Algebra, Lecture Notes in Math, Vol. 43, Springer-Verlag, Berlin-New York, 1967.
[18] J.A. Seebach Jr.: “Injectives and homotopy”, Illinois J. Math., Vol. 16, (1972), pp. 446–453.
Central European Science Journals
www.cesj.com
Central European Journal of Mathematics
DOI: 10.1007/s11533-005-0003-4 Research article CEJM 4(1) 2006 34–45
On the doubly connected domination number of a graph
Joanna Cyman∗, Magdalena Lemańska†, Joanna Raczek‡
Department of Discrete Mathematics, Faculty of Applied Physics and Mathematics, Gdańsk University of Technology, Narutowicza 11/12, 80-952 Gdańsk, Poland
Received 3 January 2005; accepted 19 September 2005

Abstract: For a given connected graph G = (V, E), a set D ⊆ V(G) is a doubly connected dominating set if it is dominating and both D and V(G) − D are connected. The cardinality of the minimum doubly connected dominating set in G is the doubly connected domination number. We investigate several properties of doubly connected dominating sets and give some bounds on the doubly connected domination number.
© Central European Science Journals Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.
Keywords: Doubly connected domination number, connected domination number
MSC (2000): 05C69
1 Introduction
Let G = (V, E) be a simple connected graph with |V(G)| = n(G) and |E(G)| = m(G). The neighbourhood NG(v) of a vertex v is the set of all vertices adjacent to v in G, and the closed neighbourhood is NG[v] = NG(v) ∪ {v}. The degree dG(v) = |NG(v)| of a vertex v is the number of edges incident to v in G. The minimum and maximum degrees of vertices of V(G) are denoted by δ(G) and Δ(G), respectively. A vertex x such that dG(x) = Δ(G) = n(G) − 1 is called a universal vertex. Let Ω(G) be the set of all end-vertices of G, that is, the set of vertices of degree 1, and let n1(G) be the cardinality of Ω(G).

∗ E-mail: [email protected]
† E-mail: [email protected]
‡ E-mail: [email protected]

J. Cyman et al. / Central European Journal of Mathematics 4(1) 2006 34–45
A vertex that is a neighbour of an end-vertex is called a support. Let S(G) be the set of supports in G. The corona G = H ◦ K1 is the graph constructed from a copy of H, where for each vertex v ∈ V(H), a new vertex v′ and a pendant edge vv′ are added. For disjoint graphs G1 and G2, the join G = G1 + G2 is the graph G with V(G) = V(G1) ∪ V(G2) and E(G) = E(G1) ∪ E(G2) ∪ {uv : u ∈ V(G1) ∧ v ∈ V(G2)}. Let us denote by G − v the graph obtained from G by removing the vertex v ∈ V(G) and all edges incident to v.

For any connected graph G, a vertex x ∈ V(G) is called a cut-vertex of G if G − x is no longer connected. The vertex-connectivity, or simply connectivity, κ(G) is the minimum number of vertices whose removal from G results in a disconnected graph or a graph with only one vertex.

A set D ⊆ V(G) is a dominating set of G if for every vertex v ∈ V(G) − D, there exists a vertex u ∈ D such that v is adjacent to u. The minimum cardinality of a dominating set in G is the domination number γ(G). Sampathkumar and Walikar [7] defined a connected dominating set D to be a dominating set whose induced subgraph ⟨D⟩ is connected. The minimum cardinality of a connected dominating set in G is the connected domination number of G and is denoted by γc(G).

In this paper we introduce a new type of domination: a set D ⊆ V(G) is a doubly connected dominating set of G if it is dominating and both ⟨D⟩ and ⟨V(G) − D⟩ are connected. The cardinality of a minimum doubly connected dominating set of G is the doubly connected domination number of G and is denoted by γcc(G). We adopt the convention that, for each connected graph G, the set of all vertices of G is a doubly connected dominating set of G.

For unexplained terms and symbols see [1, 5].
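The definitions above can be checked mechanically on small graphs. The following is a minimal brute-force sketch in Python, not part of the original paper: the graph is assumed to be given as a dictionary mapping each vertex to the set of its neighbours, and the helper names (`is_connected`, `is_dcds`, `gamma_cc`) are hypothetical. The empty set is treated as connected, in line with the convention that V(G) itself is a doubly connected dominating set.

```python
from itertools import combinations

def is_connected(adj, vertices):
    """True if `vertices` induces a connected subgraph; the empty set counts as connected."""
    vertices = set(vertices)
    if not vertices:
        return True
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & vertices:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices

def is_dominating(adj, d):
    """True if every vertex outside d has a neighbour in d."""
    d = set(d)
    return all(v in d or adj[v] & d for v in adj)

def is_dcds(adj, d):
    """True if d is a doubly connected dominating set of the graph adj."""
    d = set(d)
    rest = set(adj) - d
    return (bool(d) and is_dominating(adj, d)
            and is_connected(adj, d) and is_connected(adj, rest))

def gamma_cc(adj):
    """Smallest size of a doubly connected dominating set (exhaustive search)."""
    for size in range(1, len(adj) + 1):
        for d in combinations(adj, size):
            if is_dcds(adj, d):
                return size

# Star K_{1,4}: centre 0, end-vertices 1..4.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
# Cycle C_5.
cycle = {v: {(v - 1) % 5, (v + 1) % 5} for v in range(5)}

print(gamma_cc(star), gamma_cc(cycle))
```

The search is exponential in the number of vertices, so it is only meant for experimenting with graphs on a handful of vertices.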
2 Preliminary results
We begin with some basic properties of doubly connected dominating sets.

Proposition 2.1. Let D be a minimum doubly connected dominating set of a connected graph G on n ≥ 3 vertices. Then
(i) every cut-vertex is in D;
(ii) every support is in D;
(iii) at least n1(G) − 1 end-vertices are in D;
(iv) γcc(G) ≥ n1(G), with equality if and only if G is a star K1,n−1;
(v) γcc(G) ≥ n1(G) + |S(G)| − 1, with equality if and only if each vertex v ∈ V(G) is either an end-vertex or a support.

Proof. (i) Assume v is a cut-vertex of G that does not belong to a minimum doubly connected dominating set D. As G − v is disconnected, it is not possible to choose a connected dominating set D ⊆ V(G) − {v}, a contradiction.
(ii) As every support is a cut-vertex, by (i) our claim follows.
(iii) If not, assume there are two end-vertices not belonging to D. As every support is in D, it follows that V(G) − D is not connected, a contradiction.
(iv) By (iii), at least n1(G) − 1 end-vertices are in D. If Ω(G) ⊆ D our claim follows. Similarly, if there exists a vertex x ∈ Ω(G) such that x ∉ D, then γcc(G) = n(G) − 1 and since n ≥ 3 we have n(G) − 1 ≥ n1(G), which completes the proof of the bound. It is easy to see that γcc(K1,n−1) = n1(G). Conversely, assume that γcc(G) = n1(G). In this case, by (ii) and (iii), each support and at least all end-vertices except one are in a minimum doubly connected dominating set D. Thus |S(G)| = 1, |V(G) − D| = 1 and |D| = n − 1 = n1(G). We conclude that G is a star K1,n−1.
(v) By (ii) and (iii), the inequality is straightforward. If Ω(G) ∪ S(G) = V(G) then obviously n1(G) + |S(G)| − 1 = γcc(G). Conversely, assume that γcc(G) = n1(G) + |S(G)| − 1. In this case, the minimum doubly connected dominating set D consists of all vertices of the set S(G) and all end-vertices except one. Thus γcc(G) = n − 1 and V(G) = S(G) ∪ Ω(G).

As an immediate consequence of Proposition 2.1 we have

Corollary 2.2. If G = H ◦ K1, then γcc(G) = n(G) − 1.

Corollary 2.3. For a tree T on n ≥ 3 vertices, γcc(T) = n − 1.

Proof. In a tree T each vertex is either a cut-vertex or an end-vertex. By Proposition 2.1, we conclude that γcc(T) ≥ n − 1. On the other hand, if x is an end-vertex of a tree T, then D = V(T) − {x} is a doubly connected dominating set. Thus, γcc(T) = n − 1.

Since every doubly connected dominating set is a connected dominating set and every connected dominating set is dominating, we have the following inequality chain for every connected graph G: γ(G) ≤ γc(G) ≤ γcc(G). We now characterize some graphs for which the numbers γcc(G) and γc(G) are the same.

Proposition 2.4. Let G be a connected graph on n ≥ 3 vertices.
(i) If G is a cycle, then γcc(G) = γc(G) = n − 2.
(ii) If γcc(G) = γc(G), then γcc(G) ≤ n − 2.
(iii) If γcc(G) = γc(G), then δ(G) ≥ 2.
(iv) For any unicyclic graph G we have γc(G) = γcc(G) if and only if G is a cycle.

Proof. (i) It is obvious.
(ii) It is known [7] that for every connected graph G with n ≥ 3 we have γc(G) ≤ n − 2. Thus, for the equality γcc(G) = γc(G) we conclude that γcc(G) ≤ n − 2.
(iii) Suppose γcc(G) = γc(G) and x is an end-vertex in G. By (ii), γcc(G) ≤ n − 2. Let D
be a minimum doubly connected dominating set of G of cardinality |D| ≤ n − 2. If x ∈ D, then D − {x} is also a connected dominating set of G, a contradiction with the equality γcc(G) = γc(G). If x ∉ D, then x is the unique vertex in V(G) − D, because V(G) − D is connected. Thus γcc(G) = n − 1, a contradiction.
(iv) If G is a cycle on n vertices, then by (i) γcc(G) = γc(G) = n − 2. Suppose now G is unicyclic with γc(G) = γcc(G) and G is not a cycle. By (ii), γcc(G) ≤ n − 2. Moreover, there exists a vertex x ∈ V(G) such that dG(x) = 1. Let D be a minimum doubly connected dominating set of G. If x ∉ D, then γcc(G) = n − 1, a contradiction. On the other hand, x ∈ D implies that D − {x} is a connected dominating set of G, a contradiction, as γc(G) = γcc(G).

We have shown that there exist graphs G for which the equality γc(G) = γcc(G) holds. However, the difference between γcc(G) and γc(G) can be arbitrarily large.

Lemma 2.5. The difference γcc − γc can be arbitrarily large.

Proof. Consider a star K1,n−1 with n − 1 end-vertices. Of course, γc(K1,n−1) = 1. By Proposition 2.1, γcc(K1,n−1) = n − 1. Thus γcc(K1,n−1) − γc(K1,n−1) = n − 2.

Observation 2.6. Let G = Km1,m2,...,mk be the complete k-partite graph, k ≥ 3, with m1 ≤ m2 ≤ · · · ≤ mk.
• If m1 = 1, then γcc(G) = 1;
• If m1 ≥ 2, then γcc(G) = 2.

Observation 2.7. If G1 and G2 are disjoint connected graphs, then γcc(G1 + G2) = 1 if γcc(G1) = 1 or γcc(G2) = 1, and γcc(G1 + G2) = 2 otherwise.

A connected subgraph B of G is called a block if B has no cut-vertex and every subgraph B′ ⊆ G with B ⊆ B′ and B ≠ B′ has at least one cut-vertex. A connected graph G is called a block graph if every block in G is complete. A vertex v of a graph G is called a simplicial vertex if every two vertices of NG(v) are adjacent in G.

Theorem 2.8. If G is a block graph, then γcc(G) = n(G) − t, where t is the maximal number of simplicial vertices in a block with a largest number of simplicial vertices.

Proof.
Let D be a minimum doubly connected dominating set of a block graph G. By Proposition 2.1, each cut-vertex belongs to D. Hence γcc(G) ≥ n(G) − t, where t is the maximal number of simplicial vertices in a block with a largest number of simplicial vertices. Conversely, let B be a block with a largest number of simplicial vertices. Denote by F the set of all simplicial vertices belonging to B and let |F| = t. Then V(G) − F is a doubly
connected dominating set of G and we have γcc (G) ≤ n(G) − t. Thus γcc (G) = n(G) − t.
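The values stated for stars (Lemma 2.5) and cycles (Proposition 2.4 (i)) can be reproduced by exhaustive search on small instances. This is an illustrative sketch only; the adjacency-dictionary encoding and all function names are assumptions made for the example, and the search is exponential in n.

```python
from itertools import combinations

def connected(adj, s):
    """True if the vertex set s induces a connected subgraph (empty set allowed)."""
    s = set(s)
    if not s:
        return True
    seen, stack = set(), [next(iter(s))]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend((adj[u] & s) - seen)
    return seen == s

def dominating(adj, d):
    return all(v in d or adj[v] & d for v in adj)

def is_cds(adj, d):
    """Connected dominating set."""
    return dominating(adj, d) and connected(adj, d)

def is_dcds(adj, d):
    """Doubly connected dominating set: the complement must be connected too."""
    return is_cds(adj, d) and connected(adj, set(adj) - d)

def smallest(adj, ok):
    """Minimum size of a nonempty vertex set accepted by the predicate ok."""
    return next(k for k in range(1, len(adj) + 1)
                for d in combinations(adj, k) if ok(adj, set(d)))

def star(n):
    """K_{1,n-1} with centre 0."""
    adj = {0: set(range(1, n))}
    adj.update({v: {0} for v in range(1, n)})
    return adj

def cycle(n):
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}

for n in range(4, 8):
    s, c = star(n), cycle(n)
    print(n,
          smallest(s, is_cds), smallest(s, is_dcds),   # expected 1 and n - 1
          smallest(c, is_cds), smallest(c, is_dcds))   # expected n - 2 twice
```

For each n the star gives the gap γcc − γc = n − 2, while for the cycle the two parameters coincide.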
3 Bounds
Now we find some bounds on the doubly connected domination number. For this purpose, denote by A a family of graphs such that K2 ∈ A and G belongs to A if and only if for each pair of adjacent non-cut-vertices u, v ∈ V (G), V (G) − {u, v} is disconnected.
Fig. 1 A graph G ∈ A.

Theorem 3.1. For every connected graph G on n ≥ 2 vertices, 1 ≤ γcc(G) ≤ n − 1, with equality for the lower bound if and only if there exists a connected graph H such that G = H + K1, and equality for the upper bound if and only if G ∈ A.

Proof. The inequality 1 ≤ γcc(G) is obvious. If G = H + K1 and H is connected, then obviously γcc(G) = 1. Assume now that γcc(G) = 1 and let D = {x} be a minimum doubly connected dominating set of G. Since D is dominating, x must be a universal vertex. Moreover, V(G) − D = V(G) − {x} is connected, so x is a non-cut-vertex. We conclude that G = H + K1, where H is connected.

Now we prove that γcc(G) ≤ n − 1. The inequality and the equality are straightforward when G = K2. Suppose n ≥ 3. Then there exist in G at least two non-cut-vertices, for example two leaves of a spanning tree of G. Let x be a non-cut-vertex. Then D = V(G) − {x} is a doubly connected dominating set of G. If G ∈ A, then γcc(G) = n − 1, because every support of G is in D and for each pair of adjacent non-cut-vertices u, v ∈ V(G), the induced subgraph ⟨V(G) − {u, v}⟩ is disconnected. Now let G ∉ A. It suffices to show that γcc(G) ≤ n − 2. If G ∉ A, then there exist adjacent non-cut-vertices v, u ∈ V(G) such that V(G) − {u, v} is connected. In this case D = V(G) − {u, v} is a doubly connected dominating set of G, as n ≥ 3, G is connected and neither of u, v is a support.

Proposition 3.2. Let G be a connected graph on n ≥ 2 vertices. Then γcc(G) ≤ n − κ(G) + 1.
Proof. If κ(G) ≤ 2, then by Theorem 3.1 our claim follows. Thus assume now κ(G) ≥ 3. It is obvious that κ(G) ≤ δ(G). Let A be a set consisting of an arbitrary vertex x ∈ V(G) and κ(G) − 2 of its neighbours. Obviously, V(G) − A is connected. Observe that D = V(G) − A is dominating in G. Thus D is a doubly connected dominating set in G with |D| = n − κ(G) + 1.

In [7] Sampathkumar and Walikar showed that for every connected graph G with n ≥ 3 vertices and m edges we have the inequalities n/(Δ(G) + 1) ≤ γc(G) ≤ 2m − n. Now we present similar inequalities for the number γcc.

Theorem 3.3. For any connected graph G with n ≥ 2 vertices and m edges,

n/(Δ(G) + 1) ≤ γcc(G) ≤ 2m − n + 1,

with equality for the lower bound if and only if γcc(G) = 1 and equality for the upper bound if and only if G is a tree.

Proof. Since n/(Δ(G) + 1) ≤ γc(G) ≤ γcc(G), the lower bound follows. If γcc(G) = 1, then by Theorem 3.1 there exists a vertex v ∈ V(G) such that dG(v) = n − 1. Thus n/(Δ(G) + 1) = 1 = γcc(G).

Conversely, let G be a graph such that γcc(G) = n/(Δ(G) + 1) and γcc(G) > 1. Let D be a minimum doubly connected dominating set of G. Since D is connected, for each v ∈ D we have |NG(v) ∩ (V(G) − D)| ≤ Δ(G) − 1. Hence |V(G) − D| ≤ (Δ(G) − 1)|D| and n − γcc(G) ≤ (Δ(G) − 1)γcc(G), which gives γcc(G) ≥ n/Δ(G), a contradiction.

By Theorem 3.1, γcc(G) ≤ n − 1 = 2(n − 1) − n + 1 and since G is connected, m ≥ n − 1. Thus γcc(G) ≤ 2m − n + 1. We now show that γcc(G) = 2m − n + 1 if and only if G is a tree. If G is a tree, then m = n − 1 and γcc(G) = n − 1 = 2m − n + 1. Conversely, let γcc(G) = 2m − n + 1. By Theorem 3.1 we have 2m − n + 1 ≤ n − 1, which implies m ≤ n − 1, so G must be a tree with m = n − 1.
As an immediate consequence of the second paragraph of the proof of Theorem 3.3 we have what follows.

Corollary 3.4. For each connected graph G with γcc(G) > 1 we have γcc(G) ≥ n/Δ(G).
Now we introduce the following notation: if T1 and T2 are vertex disjoint trees, then by P(T1 , T2 ) we denote the set of all graphs G that can be obtained from T1 and T2 by adding n(T2 ) edges, one edge joining each vertex of T2 to one arbitrarily chosen vertex of T1 . We say that a graph G belongs to the family U if there exist trees T1 and T2 such that G ∈ P(T1 , T2 ). Theorem 3.5. For any connected graph G on n ≥ 2 vertices and with m edges, 2n − m − 2 ≤ γcc (G)
with equality for the bound if and only if G belongs to the family U.

Fig. 2 A graph G ∈ P(T1, T2), with the vertex sets V(T1) and V(T2) marked.

Proof. Let D be a minimum doubly connected dominating set in G. Since D and V(G) − D are connected and D is dominating, we have the following inequalities:

m(D) ≥ γcc(G) − 1,
m(V(G) − D) ≥ n − γcc(G) − 1,
mγcc ≥ n − γcc(G),

where mγcc is the number of the edges connecting vertices of V(G) − D to vertices of D. By summing the inequalities we obtain m = m(D) + m(V − D) + mγcc ≥ 2n − γcc(G) − 2 and thus 2n − m − 2 ≤ γcc(G).

We now show that γcc(G) = 2n − m − 2 if and only if G belongs to the family U. Let G ∈ U. Then there exist trees T1 and T2 such that G ∈ P(T1, T2). In such a graph G the set V(T1) is a doubly connected dominating set. Thus γcc(G) ≤ n(T1). Of course n = n(T1) + n(T2) and m = m(T1) + m(T2) + n(T2) = n(T1) − 1 + n(T2) − 1 + n(T2) = n(T1) + 2n(T2) − 2. It follows that 2n − m − 2 = 2(n(T1) + n(T2)) − (n(T1) + 2n(T2) − 2) − 2 = n(T1). Consequently γcc(G) ≥ n(T1), which together with γcc(G) ≤ n(T1) gives γcc(G) = n(T1) = 2n − m − 2.

Conversely, suppose γcc(G) = 2n − m − 2. This implies that

m(D) = γcc(G) − 1 = n(D) − 1,
m(V(G) − D) = n − γcc(G) − 1 = n(V(G) − D) − 1,
mγcc = n − γcc(G).

It follows that D and V(G) − D are trees and each vertex of V(G) − D has exactly one neighbour in D. Thus G is a graph obtained from two trees T1 and T2 by adding n(T2) edges, one edge joining each vertex of T2 to one arbitrarily chosen vertex of T1.
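The equality case can be illustrated on one small member of P(T1, T2). The sketch below is hypothetical and not from the paper: T1 is taken to be the path b–a–c, T2 the path x–y–z, and the three joining edges xa, ya, zb are an arbitrary choice; the brute-force routine then confirms γcc(G) = 2n − m − 2 = n(T1) = 3.

```python
from itertools import combinations

def graph(edges):
    """Build an adjacency dictionary from a list of undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def connected(adj, s):
    s = set(s)
    if not s:
        return True
    seen, stack = set(), [next(iter(s))]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend((adj[u] & s) - seen)
    return seen == s

def is_dcds(adj, d):
    d = set(d)
    dominated = all(v in d or adj[v] & d for v in adj)
    return dominated and connected(adj, d) and connected(adj, set(adj) - d)

def gamma_cc(adj):
    return next(k for k in range(1, len(adj) + 1)
                for d in combinations(adj, k) if is_dcds(adj, d))

t1 = [("a", "b"), ("a", "c")]               # tree T1 on {a, b, c}
t2 = [("x", "y"), ("y", "z")]               # tree T2 on {x, y, z}
joins = [("x", "a"), ("y", "a"), ("z", "b")]  # one edge from each vertex of T2 to T1
edges = t1 + t2 + joins
g = graph(edges)

n, m = len(g), len(edges)
print(n, m, 2 * n - m - 2, gamma_cc(g))
```

Here n = 6 and m = 7, so the lower bound 2n − m − 2 equals 3, and V(T1) = {a, b, c} attains it.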
Duchet and Meyniel [3] have shown that for any connected graph G we have γc(G) ≤ 2β0(G) − 1 and γc(G) ≤ 2Γ(G) − 1, where Γ(G) is the maximum cardinality of a minimal dominating set of G and β0 is the maximum cardinality of an independent set of G. The next theorem shows that there is no similar result for the doubly connected domination number of a graph.

Theorem 3.6. Each of the differences γcc − β0 and γcc − Γ can be arbitrarily large.

Proof. We exhibit a graph G for which γcc(G) − β0(G) = γcc(G) − Γ(G) = k for any positive integer k. Let G be a corona Kk+1 ◦ K1. It is easy to observe that the set of end-vertices Ω(G) is a maximum independent set of G and thus β0(G) = k + 1. The set Ω(G) is also a minimal dominating set of maximum cardinality, so Γ(G) = k + 1. Since G is a corona, from Corollary 2.2 we have γcc(G) = |V(G)| − 1 = 2(k + 1) − 1 = 2k + 1. It follows that γcc(G) − β0(G) = γcc(G) − Γ(G) = k.
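For small k the corona construction can be verified by exhaustive search. The following sketch is illustrative and not part of the paper; it only checks γcc and β0 (the parameter Γ is not computed), and the vertex labelling and function names are assumptions.

```python
from itertools import combinations

def corona_complete(k):
    """Corona K_{k+1} o K_1: clique on 0..k, vertex v gets the pendant neighbour v + k + 1."""
    size = k + 1
    adj = {}
    for v in range(size):
        adj[v] = (set(range(size)) - {v}) | {v + size}
        adj[v + size] = {v}
    return adj

def connected(adj, s):
    s = set(s)
    if not s:
        return True
    seen, stack = set(), [next(iter(s))]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend((adj[u] & s) - seen)
    return seen == s

def is_dcds(adj, d):
    d = set(d)
    dominated = all(v in d or adj[v] & d for v in adj)
    return dominated and connected(adj, d) and connected(adj, set(adj) - d)

def gamma_cc(adj):
    return next(k for k in range(1, len(adj) + 1)
                for d in combinations(adj, k) if is_dcds(adj, d))

def independence_number(adj):
    """Largest size of a set of pairwise non-adjacent vertices (exhaustive search)."""
    for k in range(len(adj), 0, -1):
        for s in combinations(adj, k):
            if all(adj[u].isdisjoint(s) for u in s):
                return k

for k in (1, 2, 3):
    g = corona_complete(k)
    print(k, gamma_cc(g), independence_number(g))
```

For k = 1, 2, 3 the two computed values are (3, 2), (5, 3) and (7, 4), so the difference γcc − β0 equals k in each case.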
4 Edge subdivision and vertex removing
Now we examine the effects on γcc(G) when G is modified by an edge subdivision. An edge subdivision in a nonempty graph G is the operation of removing an edge e = uv and adding a new vertex w and edges uw and vw. A graph obtained from G by subdividing the edge e = uv is denoted by G ⊕ wuv.

Theorem 4.1. For every connected graph G we have γcc(G) ≤ γcc(G ⊕ wuv).

Proof. Let e = uv be the subdivided edge and let D0 be a minimum doubly connected dominating set of G ⊕ wuv. We consider two cases:
a) w ∈ D0. Then, since D0 is connected, u or v belongs to D0. If both of these vertices belong to D0, then D0 − {w} is a doubly connected dominating set of G and thus γcc(G) < |D0| = γcc(G ⊕ wuv). If u ∈ D0 and v ∉ D0, then D0 − {w} is a doubly connected dominating set of G and we have the required inequality.
b) w ∉ D0. Then, since D0 is dominating, u or v belongs to D0. Then, similarly as in case a), we have γcc(G) ≤ |D0| = γcc(G ⊕ wuv).

Theorem 4.2. The difference γcc(G ⊕ wuv) − γcc(G) can be arbitrarily large.

Proof. We construct graphs G and G ⊕ wuv for which γcc(G ⊕ wuv) − γcc(G) = k for any integer k ≥ 2. We begin with two stars K1,k−1, k ≥ 2, and denote their centers by u and v. Next we add a vertex x and edges joining x with all vertices of the stars. Finally, to obtain a graph G, we add an edge e = uv and a pendant edge xx′ (see Fig. 3). It is easy to observe that the set D = {x, x′} is a minimum doubly connected dominating set of G and thus γcc(G) = 2.
J. Cyman et al. / Central European Journal of Mathematics 4(1) 2006 34–45
For the graph G ⊕ wuv notice that the set Du = N[v] ∪ {x′} − {w} is a minimum doubly connected dominating set and the size of this set is k + 2. Thus γcc(G ⊕ wuv) − γcc(G) = k + 2 − 2 = k.
Fig. 3 Graphs G and G ⊕ wuv for k = 5.
Theorem 4.3. The difference γcc(G) − γcc(G − x) can be arbitrarily large.
Proof. Let H be the join K1,k + K1, k ≥ 2, and let G be the graph that results if we add two pendant edges and two end-vertices x and y to the vertices of degree k + 1 of the graph H (see Fig. 4). It is easy to observe that V(G) − {x} is a minimum doubly connected dominating set of G. Thus, γcc(G) = k + 3. The set NG[y] is a minimum doubly connected dominating set of G − x. Thus γcc(G − x) = 2 and finally we have γcc(G) − γcc(G − x) = k + 1.
Fig. 4 Graphs G and G − x for k = 4.
Theorem 4.4. The difference γcc(G − x) − γcc(G) can be arbitrarily large.
Proof. Let G be the join of a path P on n vertices and K1. Let x be the vertex of K1. Clearly we have γcc(G) = 1. As G − x is a tree, by Corollary 2.3 we have γcc(G − x) = n − 1. Thus γcc(G − x) − γcc(G) = n − 2.
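The construction in the proof of Theorem 4.4 is small enough to verify by brute force. The sketch below is illustrative only; the helper names are hypothetical, and it assumes that a doubly connected dominating set must leave a nonempty complement.

```python
from itertools import combinations


def is_connected(vertices, adj):
    """True if the subgraph induced by `vertices` is nonempty and connected."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in vertices and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


def gamma_cc(adj):
    """Brute-force doubly connected domination number."""
    for size in range(1, len(adj)):  # assumption: V - D must be nonempty
        for D in map(set, combinations(adj, size)):
            dominating = all(v in D or adj[v] & D for v in adj)
            if (dominating and is_connected(D, adj)
                    and is_connected(set(adj) - D, adj)):
                return size
    return None


def path_join_k1(n):
    """Join of the path x1, ..., xn (labelled 1..n) with a single vertex "x"."""
    adj = {i: set() for i in range(1, n + 1)}
    adj["x"] = set(range(1, n + 1))
    for i in range(1, n + 1):
        adj[i].add("x")
        if i < n:
            adj[i].add(i + 1)
            adj[i + 1].add(i)
    return adj


def remove_vertex(adj, x):
    """Adjacency dictionary of G - x."""
    return {v: nbrs - {x} for v, nbrs in adj.items() if v != x}


for n in range(2, 7):
    G = path_join_k1(n)
    print(n, gamma_cc(G), gamma_cc(remove_vertex(G, "x")))
```

For each n the second printed value should be 1 and the third n − 1, matching the difference n − 2 stated in the proof.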
Fig. 5 Graph G.
5 Complexity issues for γcc
In this section we consider the decision problem DOUBLY CONNECTED DOMINATING SET, stated as follows.
DOUBLY CONNECTED DOMINATING SET (DCDS)
INSTANCE: A connected graph G = (V, E) and a positive integer k.
QUESTION: Does G have a doubly connected dominating set of size at most k?
We show that the decision problem DCDS is NP-complete, even when restricted to connected bipartite graphs. We will use a well-known NP-complete problem, DOMINATING SET, which is defined as follows.
DOMINATING SET (DS)
INSTANCE: A graph G = (V, E) and a positive integer k.
QUESTION: Does G have a dominating set of size at most k?
Garey and Johnson [4] proved that DS is NP-complete.
Theorem 5.1. DCDS for bipartite graphs is NP-complete.
Proof. The DCDS problem for bipartite graphs is in the class NP of decision problems, as it is easy to verify in polynomial time whether D is a doubly connected dominating set. For any given instance of DS, which is a graph G = (V, E) and an integer k, we construct a graph H and an integer q as follows:
V(H) = V(G) × {1, 2, 3} ∪ {x, y},
E(H) = {(v, 1)(v, 2) : v ∈ V(G)} ∪ {(v, 2)(v, 3) : v ∈ V(G)} ∪ {(v, 1)x : v ∈ V(G)} ∪ {(v, 1)y : v ∈ V(G)} ∪ {(v, 3)x : v ∈ V(G)} ∪ {(v, 3)y : v ∈ V(G)} ∪ {(v, 1)(w, 2) : vw ∈ E(G)},
q = k + 1.
The graph H is connected and bipartite, as every cycle in H has even length (see Fig. 6).
Fig. 6 Reduction from DS to DCDS for bipartite graphs.
Assume first that G has a dominating set D = {v1, v2, . . . , vk′}, k′ ≤ k, of size at most k. Let F = {(v1, 1), (v2, 1), . . . , (vk′, 1), x}. Since x dominates all vertices in (V, 1) ∪ (V, 3) and D is a dominating set in G, the set F is dominating in H. Moreover, from the construction of H we see that the induced subgraphs ⟨F⟩ and ⟨V(H) − F⟩ are connected. Thus F is a doubly connected dominating set of H of size at most q = k + 1. Conversely, assume that F is a doubly connected dominating set of cardinality at most q in H. We shall show that G contains a dominating set D of size at most k = q − 1. It is easy to see that if q > n(G), the answers for both problems DCDS and DS are "yes". So assume q ≤ n(G). We claim that either vertex x or y is in every doubly connected dominating set of size q ≤ n(G), because there is no connected dominating set of size at most n(G) that dominates all vertices of (V, 3) and contains neither x nor y. (Observe
that in V × {1, 2, 3} the subset (V, 3) is a set of vertices of degree 1.) Thus assume x ∈ F. Moreover, every doubly connected dominating set F of size q1 ≤ n(G) can be transformed into a doubly connected dominating set F′ ⊆ (V, 1) ∪ {x} of size q ≤ q1 as follows:
• x ∈ F′;
• if (vi, 1) ∈ F, then (vi, 1) ∈ F′;
• if (vi, 3) ∈ F, then (vi, 1) ∈ F′;
• if (vi, 2) ∈ F, then (vi, 1) ∈ F′.
Now, if F′ = {(v1, 1), (v2, 1), . . . , (vq−1, 1), x} is a doubly connected dominating set of size q, then D = {v1, v2, . . . , vq−1} is a dominating set in G of size k = q − 1. It is obvious that the transformation used is polynomial, as H has 3n(G) + 2 vertices and, counting the seven edge families in the definition of E(H), 6n(G) + 2m(G) edges.
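The reduction can be made concrete with a short sketch that builds H from an adjacency dictionary of G and checks the forward direction of the proof on one small instance. The function names and the example graph (a path a, b, c) are illustrative assumptions, not part of the paper.

```python
def build_H(adj_G):
    """Build H from G following the definition of V(H) and E(H) in the proof."""
    H = {(v, i): set() for v in adj_G for i in (1, 2, 3)}
    H["x"], H["y"] = set(), set()

    def add_edge(a, b):
        H[a].add(b)
        H[b].add(a)

    for v in adj_G:
        add_edge((v, 1), (v, 2))
        add_edge((v, 2), (v, 3))
        for hub in ("x", "y"):
            add_edge((v, 1), hub)
            add_edge((v, 3), hub)
        for w in adj_G[v]:  # vw in E(G) gives the edge (v, 1)(w, 2)
            add_edge((v, 1), (w, 2))
    return H


def is_connected(vertices, adj):
    """True if the subgraph induced by `vertices` is nonempty and connected."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in vertices and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


def is_dcds(D, adj):
    """D dominates the graph and both D and V - D induce connected subgraphs."""
    D = set(D)
    dominating = all(v in D or adj[v] & D for v in adj)
    return dominating and is_connected(D, adj) and is_connected(set(adj) - D, adj)


def is_bipartite(adj):
    """Two-colour the graph by depth-first search; False on an odd cycle."""
    color = {}
    for s in adj:
        if s in color:
            continue
        color[s] = 0
        stack = [s]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in color:
                    color[w] = 1 - color[u]
                    stack.append(w)
                elif color[w] == color[u]:
                    return False
    return True


# Example instance (an assumption for illustration): the path a - b - c,
# whose middle vertex is a dominating set of size k = 1.
G = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
H = build_H(G)
F = {("b", 1), "x"}  # F = {(v, 1) : v in D} together with x, so |F| = k + 1
print(len(H), is_bipartite(H), is_dcds(F, H))
```

Replacing the dominating set {b} by the non-dominating set {a} makes the check fail, because (c, 2) is then not dominated in H.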
References
[1] J.A. Bondy and U.S.R. Murty: Graph Theory with Applications, Macmillan, London, 1976.
[2] C. Bo and B. Liu: "Some inequalities about connected domination number", Disc. Math., Vol. 159, (1996), pp. 241–245.
[3] P. Duchet and H. Meyniel: "On Hadwiger's number and the stability number", Ann. Disc. Math., Vol. 13, (1982), pp. 71–74.
[4] M.R. Garey and D.S. Johnson: Computers and Intractability: A Guide to the Theory of NP-completeness, Freeman, San Francisco, 1979.
[5] T.W. Haynes, S.T. Hedetniemi and P.J. Slater: Fundamentals of Domination in Graphs, Marcel Dekker, New York, 1998.
[6] S.T. Hedetniemi and R. Laskar: Connected domination in graphs, Graph Theory and Combinatorics, Academic Press, London, 1984, pp. 209–217.
[7] E. Sampathkumar and H.B. Walikar: "The connected domination number of a graph", J. Math. Phys. Sci., Vol. 13, (1979), pp. 607–613.
Central European Journal of Mathematics
DOI: 10.1007/s11533-005-0004-3 Research article CEJM 4(1) 2006 46–63
On solutions of third order nonlinear differential equations
Ivan Mojsej∗, Ján Ohriska†
Institute of Mathematics, Faculty of Science, P. J. Šafárik University, Jesenná 5, 041 54 Košice, Slovak Republic
Received 24 April 2005; accepted 15 November 2005
Abstract: The aim of our paper is to study oscillatory and asymptotic properties of solutions of nonlinear differential equations of the third order with deviating argument. In particular, we prove a comparison theorem for properties A and B as well as a comparison result on property A between nonlinear equations with and without deviating arguments. Our assumptions on nonlinearity f are related to its behavior only in a neighbourhood of zero and/or of infinity.
© Central European Science Journals Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.
Keywords: Oscillation theory, nonlinear equation, deviating argument, quasiderivatives MSC (2000): 34K11
1 Introduction
We consider third-order nonlinear differential equations with deviating argument of the form
\[ \left(\frac{1}{p(t)}\left(\frac{1}{r(t)}\,x'(t)\right)'\right)' + q(t)\,f(x(h(t))) = 0, \quad t \ge 0, \tag{$N, h$} \]
or its special case
\[ \left(\frac{1}{p(t)}\left(\frac{1}{r(t)}\,x'(t)\right)'\right)' + q(t)\,f(x(t)) = 0, \quad t \ge 0, \tag{$N$} \]
∗ E-mail: [email protected]
† E-mail: [email protected]
I. Mojsej and J. Ohriska / Central European Journal of Mathematics 4(1) 2006 46–63
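and
\[ \left(\frac{1}{r(t)}\left(\frac{1}{p(t)}\,z'(t)\right)'\right)' - q(t)\,f(z(h(t))) = 0, \quad t \ge 0, \tag{$N^A, h$} \]
or its special case
\[ \left(\frac{1}{r(t)}\left(\frac{1}{p(t)}\,z'(t)\right)'\right)' - q(t)\,f(z(t)) = 0, \quad t \ge 0, \tag{$N^A$} \]
where
\[ r, p, q, h \in C([0,\infty), \mathbb{R}), \quad r(t) > 0,\ p(t) > 0,\ q(t) > 0 \ \text{on } [0,\infty), \tag{H1} \]
\[ f \in C(\mathbb{R}, \mathbb{R}), \quad f(u)u > 0 \ \text{for } u \neq 0, \tag{H2} \]
\[ \int_0^\infty r(t)\,dt = \int_0^\infty p(t)\,dt = \infty, \tag{H3} \]
\[ \lim_{t\to\infty} h(t) = \infty. \tag{H4} \]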
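Without mentioning them again, we shall assume the validity of conditions (H1)–(H4) throughout the paper. The notation $(N^A, h)$ is suggested by the fact that for the linear equation without deviating arguments, i.e., for the equation
\[ \left(\frac{1}{p(t)}\left(\frac{1}{r(t)}\,x'(t)\right)'\right)' + q(t)\,x(t) = 0, \tag{$L$} \]
the adjoint equation is
\[ \left(\frac{1}{r(t)}\left(\frac{1}{p(t)}\,z'(t)\right)'\right)' - q(t)\,z(t) = 0. \tag{$L^A$} \]
If $x$ is a solution of $(N, h)$, then the functions
\[ x^{[0]} = x, \quad x^{[1]} = \frac{1}{r}\,x', \quad x^{[2]} = \frac{1}{p}\left(\frac{1}{r}\,x'\right)' = \frac{1}{p}\left(x^{[1]}\right)', \quad x^{[3]} = \frac{1}{q}\left(\frac{1}{p}\left(\frac{1}{r}\,x'\right)'\right)' = \frac{1}{q}\left(x^{[2]}\right)' \]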
are called the quasiderivatives of $x$. For $(N^A, h)$ we can proceed in a similar way. The linear cases of equations $(N, h)$, $(N^A, h)$ are denoted by $(L, h)$, $(L^A, h)$, respectively. In addition to (H1)–(H4), we sometimes assume
\[ \liminf_{|u|\to\infty} \frac{f(u)}{u} > 0 \tag{H5} \]
or
\[ \limsup_{u\to 0} \frac{f(u)}{u} < \infty. \tag{H6} \]
By a solution of an equation of the form (N, h) [(N A , h)] we mean a function w ∈ C 1 ([0, ∞), R) such that w[1] (t), w[2] (t) ∈ C 1 ([0, ∞), R) satisfying equation (N, h) [(N A , h)] for all t ≥ 0. Any solution of (N, h) or (N A , h) is said to be proper if it is defined on the interval [0, ∞) and is nontrivial in any neighborhood of infinity. A proper solution is said
to be oscillatory (nonoscillatory) if it has (has not) a sequence of zeros converging to ∞. In addition, (N, h) [(N A , h)] is called oscillatory if it has at least one nontrivial oscillatory solution and nonoscillatory if all its solutions are nonoscillatory. The study of asymptotic behavior of solutions, in the ordinary case as well as in the case with deviating argument, is often connected by introducing the concepts of equation with property A and equation with property B. Equation (N, h) is said to have property A if any proper solution x of (N, h) is either oscillatory or satisfies |x[i] (t)| ↓ 0 as t → ∞ for i = 0, 1, 2 and equation (N A , h) is said to have property B if any proper solution z of (N A , h) is either oscillatory or satisfies |z [i] (t)| ↑ ∞ as t → ∞ for i = 0, 1, 2. The notations u(t) ↓ 0 and u(t) ↑ ∞ mean that function u monotonically decreases to zero as t → ∞ or monotonically increases to infinity as t → ∞, respectively. Denote by N [(N, h)], N [(N A , h)], N [(L, h)], N [(LA , h)] the sets of all proper solutions of (N, h), (N A , h), (L, h), (LA , h), respectively. 
From a slight modification of the well-known lemma of Kiguradze (see, e.g., [6]) it follows that nonoscillatory solutions $x$ of $(N, h)$ and $(L, h)$ can be divided into the following two classes in the same way as in [3]:
\[ N_0 = \{x \in \mathcal{N}[(N,h)]\ (x \in \mathcal{N}[(L,h)]),\ \exists\, T_x : x(t)x^{[1]}(t) < 0,\ x(t)x^{[2]}(t) > 0 \text{ for } t \ge T_x\}, \]
\[ N_2 = \{x \in \mathcal{N}[(N,h)]\ (x \in \mathcal{N}[(L,h)]),\ \exists\, T_x : x(t)x^{[1]}(t) > 0,\ x(t)x^{[2]}(t) > 0 \text{ for } t \ge T_x\}. \]
Similarly, nonoscillatory solutions $z$ of $(N^A, h)$ and $(L^A, h)$ can be divided into the following two classes:
\[ M_1 = \{z \in \mathcal{N}[(N^A,h)]\ (z \in \mathcal{N}[(L^A,h)]),\ \exists\, T_z : z(t)z^{[1]}(t) > 0,\ z(t)z^{[2]}(t) < 0 \text{ for } t \ge T_z\}, \]
\[ M_3 = \{z \in \mathcal{N}[(N^A,h)]\ (z \in \mathcal{N}[(L^A,h)]),\ \exists\, T_z : z(t)z^{[1]}(t) > 0,\ z(t)z^{[2]}(t) > 0 \text{ for } t \ge T_z\}. \]
It is clear that $(N, h)$ [$(L, h)$] has property A if and only if all nonoscillatory solutions $x$ of $(N, h)$ [$(L, h)$] belong to the class $N_0$ and $\lim_{t\to\infty} x^{[i]}(t) = 0$ for $i = 0, 1, 2$. Similarly, $(N^A, h)$ [$(L^A, h)$] has property B if and only if all nonoscillatory solutions $z$ of $(N^A, h)$ [$(L^A, h)$] belong to the class $M_3$ and $\lim_{t\to\infty} |z^{[i]}(t)| = \infty$ for $i = 0, 1, 2$. In addition, if $x \in N_0$, then its quasiderivatives satisfy the inequality $x^{[i]}(t)x^{[i+1]}(t) < 0$ for $i = 0, 1, 2$ and all sufficiently large $t$; in the literature such solutions are called Kneser solutions. If $z \in M_3$, then its quasiderivatives satisfy the inequality $z^{[i]}(t)z^{[i+1]}(t) > 0$ for $i = 0, 1, 2$ and all sufficiently large $t$; such solutions are called strongly monotone solutions.
The oscillatory and asymptotic properties of solutions of differential equations of the third order with quasiderivatives (linear, nonlinear and with delay) have been widely investigated in [1–5]. The aim of this paper is to continue the study of such equations with deviating argument and with advanced argument. Our research is based on a study of the asymptotic behavior of nonoscillatory solutions of $(N, h)$ and $(N^A, h)$, on a linearization device, and on a comparison result between equations with different deviating arguments.
Such a comparison criterion, in the form used here, is quoted in Section 2. The paper is organized as follows: Section 2 summarizes results which will be useful in the sequel. In Section 3 we give a comparison theorem for properties A and B, which is more suitable for application than others existing in the literature. This theorem extends Theorem 4 in [5]. As a consequence we obtain sufficient conditions ensuring property A for $(N, h)$ and property B for $(N^A, h)$ as well as a comparison result on property A between nonlinear equations with and without deviating argument. Some results on the asymptotic behavior of nonoscillatory solutions of $(N, h)$ [$(N^A, h)$] which belong to the class $N_0$ [$M_3$] are considered in Section 4. Section 5 gives new integral criteria for $(N, h)$ [$(N^A, h)$] to have property A [B]. We point out that our assumptions on the nonlinearity $f$ are related to its behavior only in a neighbourhood of zero and/or of infinity. No monotonicity conditions are required, and no assumptions involving the behavior of $f$ on the whole of $\mathbb{R}$ are made.
2 Preliminary results
We introduce the following notation:
\[ I(u_i) = \int_0^\infty u_i(t)\,dt, \qquad I(u_i, u_j) = \int_0^\infty u_i(t) \int_0^t u_j(s)\,ds\,dt, \quad i, j = 1, 2, \]
\[ I(u_i, u_j, u_k) = \int_0^\infty u_i(t) \int_0^t u_j(s) \int_0^s u_k(b)\,db\,ds\,dt, \quad i, j, k = 1, 2, 3, \]
where ui , i = 1, 2, 3 are continuous positive functions on [0, ∞). For simplicity, sometimes we will write u(∞) instead of limt→∞ u(t). In the recent papers [1, 2, 5] authors have studied relationships between properties A and B and both the oscillation and the asymptotic behavior of nonoscillatory solutions for linear equations without deviating argument. We recall some of these results which will be useful in the sequel. Theorem 2.1. ([1], Theorem 2.2) The following assertions are equivalent: (i) (L) has property A. (i’) (LA ) has property B. (ii) (L) is oscillatory and I(q, p, r) = ∞. (ii’) (LA ) is oscillatory and I(q, p, r) = ∞. Lemma 2.2. ([1], Lemma 2.1) If there exists a Kneser solution x of equation (L) such that limt→∞ x[i] (t) = 0 for i = 0, 1, 2, then I(q, p, r) = ∞. Remark 2.3. Theorem 2.1 and Lemma 2.2 hold even if I(r) < ∞ and/or I(p) < ∞. We will use the following comparison theorem and a result on Kneser solutions in our consideration.
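Theorem 2.4. ([2], Theorem 1) Let the following condition be satisfied:
\[ \text{either } I(q, r) = \infty \ \text{ or } \ \limsup_{t\to\infty} \int_0^t p(s)\,ds \int_t^\infty q(s)\, \frac{\int_0^s r(u) \int_0^u p(v)\,dv\,du}{\int_0^s p(u)\,du}\,ds = \infty. \tag{H$^*$} \]
If for some $K > 0$ the equation
\[ \left(\frac{1}{p(t)}\left(\frac{1}{r(t)}\,x'(t)\right)'\right)' + K q(t)\,x(t) = 0 \tag{$L_K$} \]
has property A, then the equation
\[ \left(\frac{1}{p(t)}\left(\frac{1}{r(t)}\,x'(t)\right)'\right)' + k q(t)\,x(t) = 0 \tag{$L_k$} \]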
has property A for every $k > 0$.
Proposition 2.5. ([2], Proposition 6) Every Kneser solution of $(L)$ tends to zero as $t \to \infty$ if and only if $I(q, p, r) = \infty$.
Remark 2.6. From Proposition 2.5 it follows that if $(L)$ is oscillatory and does not have property A, then $(L)$ has a Kneser solution tending to a nonzero limit and $I(q, p, r) < \infty$.
To extend known results to differential equations with deviating argument we will use the following comparison criterion. It is a particular case of a more general theorem which is stated in [6] for functional differential equations of higher order.
Theorem 2.7. ([6], Theorem 1) Consider the differential equations ($i = 1, 2$)
\[ \left(\frac{1}{p(t)}\left(\frac{1}{r(t)}\,x'(t)\right)'\right)' + q_i(t)\,x(h_i(t)) = 0, \tag{$(L, h_i)_i$} \]
\[ \left(\frac{1}{r(t)}\left(\frac{1}{p(t)}\,z'(t)\right)'\right)' - q_i(t)\,z(h_i(t)) = 0, \tag{$(L^A, h_i)_i$} \]
where $q_i, h_i \in C([0,\infty), \mathbb{R})$, $q_i(t) > 0$, $\lim_{t\to\infty} h_i(t) = \infty$ and
\[ h_1(t) \le h_2(t), \quad q_1(t) \le q_2(t) \quad \text{for } t > t_0 \ge 0. \]
If $(L, h_1)_1$ has property A then $(L, h_2)_2$ has property A. If $(L^A, h_1)_1$ has property B then $(L^A, h_2)_2$ has property B.
Independently of properties A and B, it is easy to show the following:
Lemma 2.8. ([3], Lemma 1.1) The following assertions hold:
i) Any solution $x$ of $(L, h)$ [$(N, h)$] from $N_0$ satisfies $\lim_{t\to\infty} x^{[i]}(t) = 0$, $i = 1, 2$.
ii) Any solution $z$ of $(L^A, h)$ [$(N^A, h)$] from $M_3$ satisfies $\lim_{t\to\infty} |z^{[i]}(t)| = \infty$, $i = 0, 1$.
3 Comparison results
We begin our consideration with the following comparison theorem.
Theorem 3.1. Assume (H5), (H$^*$) and $h(t) \ge t$. If $(L_K)$ has property A for some $K > 0$, then $(N, h)$ has property A and $(N^A, h)$ has property B.
Proof. a) Let us prove that $(N, h)$ has property A. Let $x$ be a proper nonoscillatory solution of $(N, h)$. We may assume that there exists $T \ge 0$ such that $x(t) > 0$ for all $t \ge T$. The case $x(t) < 0$ for all $t \ge T^*$ may be proved by using similar arguments. We know that $x \in N_0 \cup N_2$. Now we assume that $(N, h)$ does not have property A. By Lemma 2.8 there are two possibilities:
I. $x \in N_2$,
II. $x \in N_0$ such that $\lim_{t\to\infty} x(t) = l > 0$.
Case I. Let $x \in N_2$. We consider a linearized differential equation with deviating argument
\[ \left(\frac{1}{p(t)}\left(\frac{1}{r(t)}\,w'(t)\right)'\right)' + q(t)F_1(t)\,w(h(t)) = 0, \tag{$L_{F_1}, h$} \]
where $F_1(t) = \frac{f(x(h(t)))}{x(h(t))}$. Then $w \equiv x$ is its nonoscillatory solution. In view of the fact that $x \in N_2$, $(L_{F_1}, h)$ does not have property A. As $x^{[1]}$ is a positive increasing function, there exists $T \ge 0$ such that $x^{[1]}(t) \ge x^{[1]}(T)$ for all $t \ge T$. Integrating this inequality in $(T, t)$ we get
\[ x(t) \ge x(T) + x^{[1]}(T) \int_T^t r(s)\,ds. \]
As $t \to \infty$ we get that the function $x(t)$ is unbounded. In view of $x(\infty) = \infty$ and assumption (H5), there exist a positive constant $k_1$ and $T_1 \ge 0$ such that $F_1(t) > k_1$ for all $t \ge T_1$. Hence by Theorem 2.7 for $q_1(t) = q(t)k_1$, $q_2(t) = q(t)F_1(t)$, $h_1(t) = t$, $h_2(t) = h(t)$ we obtain that the linear differential equation
\[ \left(\frac{1}{p(t)}\left(\frac{1}{r(t)}\,w'(t)\right)'\right)' + k_1 q(t)\,w(t) = 0 \tag{$L_{k_1}$} \]
does not have property A. On the other hand, by Theorem 2.4 equation $(L_k)$ has property A for all $k > 0$, which is a contradiction.
Case II. Let $x \in N_0$ and $\lim_{t\to\infty} x(t) = l > 0$. Hence, there exists a positive constant $c$ such that
\[ x(t) \ge c > 0 \quad \text{for } t \text{ sufficiently large.} \tag{1} \]
We consider the linearized differential equation
\[ \left(\frac{1}{p(t)}\left(\frac{1}{r(t)}\,w'(t)\right)'\right)' + q(t)F_2(t)\,w(t) = 0, \tag{$L_{F_2}$} \]
where $F_2(t) = \frac{f(x(h(t)))}{x(t)}$. As $w \equiv x$ is a nonoscillatory solution such that $x \in N_0$ and $x(\infty) > 0$, $(L_{F_2})$ does not have property A. In view of the continuity of the function $f$ and (1), there exist a positive constant $k_2$ and $T_2 \ge 0$ such that $F_2(t) > k_2$ for all $t \ge T_2$. Hence by Theorem 2.7 for $q_1(t) = q(t)k_2$, $q_2(t) = q(t)F_2(t)$, $h_1(t) = h_2(t) = t$ we have that the linear differential equation
\[ \left(\frac{1}{p(t)}\left(\frac{1}{r(t)}\,w'(t)\right)'\right)' + k_2 q(t)\,w(t) = 0 \tag{$L_{k_2}$} \]
does not have property A. On the other hand, by Theorem 2.4 equation $(L_k)$ has property A for all $k > 0$, which is a contradiction.
b) Let us prove that $(N^A, h)$ has property B. Let $z$ be a proper nonoscillatory solution of $(N^A, h)$. We may assume that there exists $T \ge 0$ such that $z(t) > 0$ for all $t \ge T$. The case $z(t) < 0$ for all $t \ge T^*$ may be proved by using similar arguments. We know that $z \in M_1 \cup M_3$. Now we assume that $(N^A, h)$ does not have property B. By Lemma 2.8 there are two possibilities:
I. $z \in M_3$ such that $\lim_{t\to\infty} z^{[2]}(t) \neq \infty$,
II. $z \in M_1$.
Case I. We consider, for sufficiently large $t$, the linearized differential equation with deviating argument
\[ \left(\frac{1}{r(t)}\left(\frac{1}{p(t)}\,w'(t)\right)'\right)' - q(t)F_3(t)\,w(h(t)) = 0, \tag{$L^A_{F_3}, h$} \]
where $F_3(t) = \frac{f(z(h(t)))}{z(h(t))}$. As $w \equiv z$ is a nonoscillatory solution such that $z \in M_3$ and $\lim_{t\to\infty} z^{[2]}(t) \neq \infty$, $(L^A_{F_3}, h)$ does not have property B. Taking into account that $z(\infty) = \infty$ and assumption (H5), there exist a positive constant $k_3$ and $T_3 \ge 0$ such that $F_3(t) > k_3$ for all $t \ge T_3$. Hence by Theorem 2.7 for $q_1(t) = q(t)k_3$, $q_2(t) = q(t)F_3(t)$, $h_1(t) = t$, $h_2(t) = h(t)$ we obtain that the linear differential equation
\[ \left(\frac{1}{r(t)}\left(\frac{1}{p(t)}\,w'(t)\right)'\right)' - k_3 q(t)\,w(t) = 0 \tag{$L^A_{k_3}$} \]
does not have property B. On the other hand, by Theorem 2.4 equation $(L_k)$ has property A for all $k > 0$ and thus by Theorem 2.1 equation $(L^A_k)$ has property B for all $k > 0$, which is a contradiction.
Case II. Let $z \in M_1$. As $z$ is a positive increasing function, there are two possibilities: $z(\infty) = \infty$ or $z(\infty) < \infty$. If $z(\infty) = \infty$, the proof proceeds as in Case I and hence is omitted. Now, we suppose that $z(\infty) < \infty$ and consider the linearized differential equation
\[ \left(\frac{1}{r(t)}\left(\frac{1}{p(t)}\,w'(t)\right)'\right)' - q(t)F_4(t)\,w(t) = 0, \tag{$L^A_{F_4}$} \]
where $F_4(t) = \frac{f(z(h(t)))}{z(t)}$. As $w \equiv z$ is a nonoscillatory solution such that $z \in M_1$, $(L^A_{F_4})$ does not have property B. In view of the continuity of the function $f$ and $z(\infty) < \infty$, there exist a positive constant $k_4$ and $T_4 \ge 0$ such that $F_4(t) > k_4$ for all $t \ge T_4$. Hence by Theorem 2.7 for $q_1(t) = q(t)k_4$, $q_2(t) = q(t)F_4(t)$, $h_1(t) = h_2(t) = t$ we obtain that the linear differential equation
\[ \left(\frac{1}{r(t)}\left(\frac{1}{p(t)}\,w'(t)\right)'\right)' - k_4 q(t)\,w(t) = 0 \tag{$L^A_{k_4}$} \]
does not have property B. On the other hand, by Theorem 2.4 equation $(L_k)$ has property A for all $k > 0$ and thus by Theorem 2.1 equation $(L^A_k)$ has property B for all $k > 0$, which is a contradiction.
Remark 3.2. Unlike other comparison results (see e.g., Theorem 1 in [6]), Theorem 3.1 requires neither monotonicity assumptions on the nonlinearity in $\mathbb{R}$ nor the domination of the nonlinearity $|f(u)|$ over the linear term $|u|$ in $\mathbb{R}$. Theorem 3.1 remains valid if the assumptions "(H$^*$) holds and $(L_K)$ has property A for some $K > 0$" are replaced by the assumption "$(L_k)$ has property A for all $k > 0$". Thus, for $h(t) \equiv t$, Theorem 3.1 both yields Theorem 4 in [5] and extends Theorem 3 in [2].
Theorem 3.1 together with integral criteria ensuring property A for $(L_K)$ gives the following result.
Corollary 3.3. Assume $h(t) \ge t$, (H5) and that one of the following conditions holds:
(i) $I(q, r) = I(q, p) = \infty$,
(ii) $I(q) = \infty$,
(iii) $I(q, p) < \infty$ and $\int_0^\infty r(t) \int_t^\infty q(s)\,ds \int_t^\infty p(s) \int_s^\infty q(a)\,da\,ds\,dt = \infty$.
Then $(N, h)$ has property A and $(N^A, h)$ has property B.
Proof. From Theorems 4 and 5 in [4] and Proposition 1 in [4] it follows that $(L_k)$ has property A for all $k > 0$. Now we get the assertion from Theorem 3.1 (see Remark 3.2).
The following result also holds:
Corollary 3.4. Assume (H5) and $h(t) \ge t$. If every nonoscillatory solution of $(L_k)$ is a Kneser solution for any $k > 0$ and $I(q, p, r) = \infty$, then $(N, h)$ has property A and $(N^A, h)$ has property B.
Proof. First let us remark that if $I(q, p, r) = \infty$, then $I(kq, p, r) = \infty$ for any positive constant $k$. By Proposition 2.5 and Lemma 2.8, every Kneser solution $x$ of $(L_k)$ satisfies $\lim_{t\to\infty} x^{[i]}(t) = 0$ for $i = 0, 1, 2$. Taking into account that every nonoscillatory solution
of $(L_k)$ is a Kneser one, we get that $(L_k)$ has property A for any $k > 0$. Now, Theorem 3.1 yields the assertion (see Remark 3.2).
Theorem 3.1 yields the following comparison result between nonlinear equations with and without deviating argument.
Theorem 3.5. Assume (H5), (H6), $h(t) \ge t$ and that $(L_k)$ is oscillatory for all $k > 0$. If $(N)$ has property A, then $(N, h)$ has property A and $(N^A, h)$ has property B.
Proof. To prove this assertion we will show that a) if $(N)$ has property A, then $(L_k)$ has property A for all $k > 0$, and b) if $(L_k)$ has property A for all $k > 0$, then $(N, h)$ has property A and $(N^A, h)$ has property B.
a) Assumption (H6) implies $\int_0^1 \frac{1}{f(u)}\,du = \infty$. Hence by Proposition 1.1 in [1], there exists at least one Kneser solution $x$ of $(N)$. Because $(N)$ has property A, $\lim_{t\to\infty} x^{[i]}(t) = 0$ for $i = 0, 1, 2$. Let $F$ be the function given by $F(t) = \frac{f(x(t))}{x(t)}$ and consider, for $t$ sufficiently large, the linearized differential equation
\[ \left(\frac{1}{p(t)}\left(\frac{1}{r(t)}\,w'(t)\right)'\right)' + q(t)F(t)\,w(t) = 0. \tag{$L_F$} \]
Since $w \equiv x$ is a Kneser solution of $(L_F)$ such that $w^{[i]}(\infty) = 0$ for $i = 0, 1, 2$, Lemma 2.2 implies that
\[ I(qF, p, r) = \infty. \tag{2} \]
Because (H6) holds, there exists a positive constant $M$ such that
\[ 0 < F(t) = \frac{f(x(t))}{x(t)} < M \quad \text{for all sufficiently large } t. \tag{3} \]
Because (3) implies $I(qF, p, r) \le M\, I(q, p, r)$, from (2) we have that $I(q, p, r) = \infty$. Now we assume that there exists a positive constant $k_0$ such that $(L_{k_0})$ does not have property A. Because $(L_k)$ is oscillatory for all $k > 0$, from Theorem 2.1 we obtain that $k_0 I(q, p, r) = I(k_0 q, p, r) < \infty$, which is a contradiction. Now part a) is proved.
b) Let $(L_k)$ have property A for all $k > 0$. From Theorem 3.1 we immediately get that $(N, h)$ has property A and $(N^A, h)$ has property B (see Remark 3.2). Now part b) is proved.
Remark 3.6. If $h(t) \equiv t$ in Theorem 3.5, we obtain the known result concerning property A for $(N)$ and property B for $(N^A)$, see Theorem 4.1 in [1].
4 Properties of Kneser and strongly monotone solutions
The following results establish some asymptotic properties for Kneser and strongly monotone solutions of (N, h) and (N A , h), respectively.
Theorem 4.1. If $I(q, p, r) = \infty$, then every Kneser solution $x$ of equation $(N, h)$ satisfies $\lim_{t\to\infty} x^{[i]}(t) = 0$ for $i = 0, 1, 2$.
Proof. By Lemma 2.8 every Kneser solution $x$ of $(N, h)$ satisfies $x^{[i]}(\infty) = 0$ for $i = 1, 2$. Suppose that there exists a positive Kneser solution $x$ of $(N, h)$ such that
\[ \lim_{t\to\infty} x(t) = c > 0. \tag{4} \]
We consider the linearized differential equation $(L_{F_2})$. As $w \equiv x$ is a nonoscillatory solution, $(L_{F_2})$ has a Kneser solution such that (4) holds. From Proposition 2.5, we obtain $I(qF_2, p, r) < \infty$. Since $x$ is a positive decreasing function, taking into account (4) and the continuity of the function $f$, there exists a positive constant $k_1$ such that $F_2(t) > k_1 > 0$ for all sufficiently large $t$. Hence, we have that $k_1 I(q, p, r) < I(qF_2, p, r) < \infty$, which is a contradiction. The case $x(t) < 0$ for all $t \ge T^*$ may be proved by using similar arguments.
Theorem 4.2. Assume (H5), $h(t) \ge t$ and that $(L_k)$ is oscillatory for all $k > 0$. If $I(q, p, r) = \infty$, then every strongly monotone solution $z$ of $(N^A, h)$ satisfies $\lim_{t\to\infty} |z^{[i]}(t)| = \infty$ for $i = 0, 1, 2$.
Proof. By Lemma 2.8 every strongly monotone solution $z$ of $(N^A, h)$ satisfies $|z^{[i]}(\infty)| = \infty$ for $i = 0, 1$. Suppose that there exists a positive strongly monotone solution $z$ of $(N^A, h)$ such that $\lim_{t\to\infty} z^{[2]}(t) < \infty$. Hence, $(N^A, h)$ does not have property B. We consider, for sufficiently large $t$, the linearized differential equation with deviating argument $(L^A_{F_3}, h)$. As $w \equiv z$ is a nonoscillatory solution, $(L^A_{F_3}, h)$ does not have property B either. Taking into account that $z(\infty) = \infty$ and assumption (H5), there exists a positive constant $k_2$ such that $F_3(t) > k_2 > 0$ for all sufficiently large $t$. Hence by Theorem 2.7 for $q_1(t) = q(t)k_2$, $q_2(t) = q(t)F_3(t)$, $h_1(t) = t$, $h_2(t) = h(t)$ we obtain that the linear differential equation
\[ \left(\frac{1}{r(t)}\left(\frac{1}{p(t)}\,w'(t)\right)'\right)' - k_2 q(t)\,w(t) = 0 \tag{$L^A_{k_2}$} \]
does not have property B. Since $(L_{k_2})$ is oscillatory and $(L^A_{k_2})$ does not have property B, by Theorem 2.1 we have that $k_2 I(q, p, r) = I(qk_2, p, r) < \infty$, which is a contradiction. The case $z(t) < 0$ for all $t \ge T^*$ may be proved by using similar arguments.
Theorem 4.3. Assume (H6), $h(t) \ge t$. If there exists a Kneser solution $x$ of $(N, h)$ such that $\lim_{t\to\infty} x^{[i]}(t) = 0$ for $i = 0, 1, 2$, then $I(q, p, r) = \infty$.
Proof. Suppose that $I(q, p, r) < \infty$.
Let x be a positive Kneser solution of (N, h). Thus there exists T ≥ 0 such that x(t) > 0, x[1] (t) < 0, x[2] (t) > 0 for all t ≥ T , and satisfies limt→∞ x[i] (t) = 0 for i=0, 1, 2. The case x(t) < 0 for all t ≥ T ∗ may be proved by using similar arguments. Let T1 > T be such that h(t) > T for all t ≥ T1 . Integrating (N, h)
three times in $(t, \infty)$ we obtain
\[ x(t) = \int_t^\infty r(s) \int_s^\infty p(u) \int_u^\infty q(b)\,f(x(h(b)))\,db\,du\,ds. \tag{5} \]
In view of the continuity of the function $f$ and (H6) there exists a positive constant $k_3$ such that
\[ 0 < \frac{f(x(h(t)))}{x(h(t))} < k_3 \quad \text{for all sufficiently large } t. \tag{6} \]
Taking into account that $x$ is a positive decreasing function and (6) holds, from (5) we have
\[ x(t) < k_3 \int_t^\infty r(s) \int_s^\infty p(u) \int_u^\infty q(b)\,x(h(b))\,db\,du\,ds \le k_3\, x(h(t)) \int_t^\infty r(s) \int_s^\infty p(u) \int_u^\infty q(b)\,db\,du\,ds. \]
Thus
\[ 0 < \frac{1}{k_3} \le \frac{x(t)}{k_3\, x(h(t))} < \int_t^\infty r(s) \int_s^\infty p(u) \int_u^\infty q(b)\,db\,du\,ds, \]
and, by interchanging the order of integration, we get a contradiction. 0
0, $x^{[1]}(t) > 0$, $x^{[2]}(t) > 0$ for all $t \ge T$. The case $x(t) < 0$, $x^{[1]}(t) < 0$, $x^{[2]}(t) < 0$ for all $t \ge T^*$ may be proved by using similar arguments.
a) As $x^{[1]}$ is a positive increasing function, we have $x^{[1]}(t) \ge x^{[1]}(T)$ for all $t \ge T$. By integrating we obtain
\[ x(t) \ge x(T) + x^{[1]}(T) \int_T^t r(s)\,ds. \]
As $t \to \infty$ we get the first assertion.
b) Since $x^{[2]}(t) = \frac{1}{p(t)}\left(x^{[1]}(t)\right)'$, integrating in $(T, t)$ we obtain
\[ x^{[1]}(t) = x^{[1]}(T) + \int_T^t x^{[2]}(s)\,p(s)\,ds. \]
Taking into account that $x^{[2]}(t)$ is a positive decreasing function, we get
\[ x^{[1]}(t) \ge x^{[1]}(T) + x^{[2]}(t) \int_T^t p(s)\,ds. \]
As $t \to \infty$, the assumption implies the second assertion.
Remark 5.2. It is easy to prove that for any fixed $t \ge 0$,
\[ \int_t^\infty r_1(s) \int_t^s r_2(u) \int_t^u r_3(a)\,da\,du\,ds < \infty \quad \text{if and only if} \quad I(r_1, r_2, r_3) < \infty. \tag{9} \]
To prove (9), the following auxiliary result will be needed: If $\int_t^\infty r_1(s) \int_t^s r_2(u) \int_t^u r_3(a)\,da\,du\,ds < \infty$, then
\[ \int_t^\infty r_1(s) \int_t^s r_2(u)\,du\,ds < \infty \quad \text{and} \quad \int_t^\infty r_1(s)\,ds < \infty. \tag{10} \]
This assertion follows immediately from the fact that a ∞ s u r1 (s) r2 (u) r3 (a) da du ds ≥ r3 (b) db t
t
t
t
≥
a
t
r3 (b) db
t
r2 (c) dc
t
r1 (s)
= 0
0
s
r2 (u)
t
+ 0
r3 (a) da
t
0
u
r3 (a) da du ds +
0
∞
r1 (s)
r2 (u) du ds +
t
r2 (u) du ds
≥
r1 (s) ds .
r3 (a) da du ds =
0
r2 (u)
s
t
0
t
u
t
s
∞
t
Now, we prove (9). By easy computation we obtain ∞ s I(r1 , r2 , r3 ) = r1 (s) r2 (u) 0
r1 (s)
t
u
∞
u
0
r3 (a) da du
∞
r1 (s)
t
s
r2 (u)
t
t
∞
r1 (s) ds +
u
r3 (a) da du ds .
And thus (10) immediately implies the assertion (9). Analogous results hold also for two-dimensional integrals.
Lemma 5.3. Let $z$ be a solution of $(N^A, h)$ in the class $M_1$. Then the following assertions hold:
a) If $\lim_{t\to\infty} z^{[1]}(t) \neq 0$, then $\lim_{t\to\infty} |z(t)| = \infty$,
b) $\lim_{t\to\infty} z^{[2]}(t) = 0$,
c) If $I(q, r, p) = \infty$, then $\lim_{t\to\infty} |z(t)| = \infty$.
Proof. Because $z$ is a nonoscillatory solution of $(N^A, h)$ in the class $M_1$, there exists $T \ge 0$ such that $z(t) > 0$, $z^{[1]}(t) > 0$, $z^{[2]}(t) < 0$ for all $t \ge T$. The case $z(t) < 0$, $z^{[1]}(t) < 0$, $z^{[2]}(t) > 0$ for all $t \ge T^*$ may be proved by using similar arguments.
a) As $z^{[1]}(t)$ is a positive decreasing function, we have $0 < z^{[1]}(\infty) \le z^{[1]}(t)$ for all $t \ge T$. Since $z'(t) = p(t)\,z^{[1]}(t)$, integrating this inequality in $(T, t)$, we obtain
\[ z^{[1]}(\infty) \int_T^t p(s)\,ds + z(T) \le z(t). \]
As $t \to \infty$, the assumption implies the first assertion.
b) We assume $\lim_{t\to\infty} z^{[2]}(t) < 0$. This limit exists, because $(N^A, h)$ implies $\left(z^{[2]}(t)\right)' > 0$ for all $t \ge T$, so $z^{[2]}(t)$ is an increasing function. Since $z^{[2]}(t)$ is also negative, we have $0 < -z^{[2]}(\infty) \le -z^{[2]}(t)$ for all $t \ge T$. Integrating this inequality in $(T, t)$ we obtain
\[ -z^{[2]}(\infty) \int_T^t r(s)\,ds \le z^{[1]}(T) - z^{[1]}(t), \]
and as $t \to \infty$ we get a contradiction; thus $\lim_{t\to\infty} z^{[2]}(t) = 0$.
c) We assume $\lim_{t\to\infty} z(t) < \infty$. This limit exists, because $z^{[1]}(t) > 0$ for all $t \ge T$, so $z(t)$ is an increasing function. Integrating $(N^A, h)$ three times in $(t, \infty)$ and using assertions a), b) of Lemma 5.3, we obtain
\[ z(\infty) = z(t) + \int_t^\infty p(s) \int_s^\infty r(u) \int_u^\infty q(b)\,f(z(h(b)))\,db\,du\,ds. \]
Since $0 < z(\infty) < \infty$, in view of the fact that $f$ is a continuous function, there exists a positive constant $K$ such that $f(z(h(t))) > K$ for all $t$ sufficiently large, and so we get
\[ z(\infty) > z(t) + K \int_t^\infty p(s) \int_s^\infty r(u) \int_u^\infty q(b)\,db\,du\,ds = z(t) + K \int_t^\infty q(s) \int_t^s r(u) \int_t^u p(b)\,db\,du\,ds, \]
which contradicts $I(q, r, p) = \infty$ (see Remark 5.2), and thus $\lim_{t\to\infty} z(t) = \infty$.
Now we state some integral criteria ensuring that $(N, h)$ has property A and $(N^A, h)$ has property B.
Theorem 5.4. Assume (H5) and $I(q) = \infty$. Then $(N, h)$ has property A and $(N^A, h)$ has property B.
Proof. a) Let us prove that $(N, h)$ has property A. Let $x$ be a proper nonoscillatory solution of $(N, h)$. We may assume that there exists $T \ge 0$ such that $x(t) > 0$ for all $t \ge T$. The case $x(t) < 0$ for all $t \ge T^*$ may be proved by using similar arguments. We know that $x \in N_0 \cup N_2$. Now we assume that $(N, h)$ does not have property A. By Lemma 2.8 there are two possibilities:
I. $x \in N_2$,
II. $x \in N_0$ such that $\lim_{t\to\infty} x(t) = l > 0$.
Case I. Since $x$ is a positive nonoscillatory solution of $(N, h)$ in the class $N_2$, there exists $T_1 \ge T$ such that $x(t) > 0$, $x^{[1]}(t) > 0$, $x^{[2]}(t) > 0$ for all $t \ge T_1$. Because $\left(x^{[2]}(t)\right)' = -q(t)\,f(x(h(t))) < 0$ for all $t \ge T_1$, $x^{[2]}(t)$ is a positive decreasing function and thus $0 \le x^{[2]}(\infty) < \infty$. Let $T_2 > T_1$ be such that $h(t) > T_1$ for all $t \ge T_2$. Integrating $(N, h)$ in $(T_2, \infty)$, we obtain
\[ x^{[2]}(T_2) - x^{[2]}(\infty) = \int_{T_2}^\infty q(t)\,f(x(h(t)))\,dt. \]
In view of the fact that $x^{[2]}(\infty) < \infty$, there exists a positive constant $c$ such that
\[ c = \int_{T_2}^\infty q(t)\,f(x(h(t)))\,dt. \tag{11} \]
Assertion a) of Lemma 5.1 allows us to use assumption (H5), which implies that there exists a positive constant $K_1$ such that $f(x(h(t))) > K_1 x(h(t))$ for all $t \ge T_2$, and thus from (11) we get
\[ c > K_1 \int_{T_2}^\infty q(t)\,x(h(t))\,dt. \tag{12} \]
As $x$ is a positive increasing function, from (12) we obtain
\[ c > K_1\, x(T_1) \int_{T_2}^\infty q(t)\,dt, \]
60
I. Mojsej and J. Ohriska / Central European Journal of Mathematics 4(1) 2006 46–63
which is a contradiction, because $I(q) = \infty$ by assumption.

Case II. Because $x$ is a positive nonoscillatory solution of $(N, h)$ in the class $N_0$, there exists $T_1 \ge T$ such that $x(t) > 0$, $x^{[1]}(t) < 0$, $x^{[2]}(t) > 0$ for all $t \ge T_1$. Integrating $(N, h)$ in $(T_1, t)$, we obtain
\[ x^{[2]}(t) = x^{[2]}(T_1) - \int_{T_1}^t q(s) f(x(h(s)))\,ds. \]
Since $0 < x(\infty) < \infty$ and $f$ is a continuous function, there exists a positive constant $K$ such that $f(x(h(t))) > K$ for all sufficiently large $t$, and so we get
\[ x^{[2]}(t) < x^{[2]}(T_1) - K \int_{T_1}^t q(s)\,ds, \]
which gives a contradiction as $t \to \infty$, because $x^{[2]}(t)$ is positive.

b) Let us prove that $(N^A, h)$ has property B. Let $z$ be a proper nonoscillatory solution of $(N^A, h)$. We may assume that there exists $T \ge 0$ such that $z(t) > 0$ for all $t \ge T$. The case $z(t) < 0$ for all $t \ge T^*$ may be proved by using similar arguments. We know that $z \in M_1 \cup M_3$. Now we assume that $(N^A, h)$ does not have property B. By Lemma 2.8 there are two possibilities:
I. $z \in M_3$ such that $\lim_{t\to\infty} z^{[2]}(t) < \infty$,
II. $z \in M_1$.

Case I. Since $z$ is a positive nonoscillatory solution of $(N^A, h)$ in the class $M_3$, there exists $T_1 \ge T$ such that $z(t) > 0$, $z^{[1]}(t) > 0$, $z^{[2]}(t) > 0$ for all $t \ge T_1$. Since $z^{[2]}(\infty) < \infty$, Lemma 2.8 allows us to use assumption (H5); as $z(t)$ is a positive increasing function, the proof proceeds as in Case I of part a) and is hence omitted.

Case II. Since $z$ is a positive nonoscillatory solution of $(N^A, h)$ in the class $M_1$, there exists $T_1 \ge T$ such that $z(t) > 0$, $z^{[1]}(t) > 0$, $z^{[2]}(t) < 0$ for all $t \ge T_1$. Because $(z^{[2]})'(t) = q(t) f(z(h(t))) > 0$ for all $t \ge T_1$, $z^{[2]}(t)$ is a negative increasing function and thus $-\infty < z^{[2]}(\infty) \le 0$. Consequently, assertion c) of Lemma 5.3 allows us to use assumption (H5); as $z(t)$ is a positive increasing function, the proof proceeds as in Case I of part a) and is hence omitted.

Example 5.5. We consider the differential equation
\[ \left( \frac{1}{t^3} \left( \frac{1}{t^2}\, x'(t) \right)' \right)' + 90 t^2 x^3(t^2) = 0, \qquad t \ge 1. \tag{13} \]
This is an equation of the form $(N, h)$, where $r(t) = t^2$, $p(t) = t^3$, $q(t) = 90t^2$, $h(t) = t^2$ and $f(u) = u^3$. The assumptions of Theorem 5.4 hold, and so we know that equation (13) has property A. One nonoscillatory solution of equation (13) such that $|x^{[i]}(t)| \downarrow 0$ as $t \to \infty$, $i = 0, 1, 2$, is the function $x(t) = 1/t^2$.

Theorem 5.6. Assume (H5).
a) If $I(q, p, r) = \infty$ and
\[ \int_T^\infty q(t) \int_T^{h(t)} r(s)\,ds\,dt = \infty, \]
then $(N, h)$ has property A.
b) If $I(q, r) = \infty$ and
\[ \int_T^\infty q(t) \int_T^{h(t)} p(s)\,ds\,dt = \infty, \]
then $(N^A, h)$ has property B.
Proof. a) Let $x$ be a proper nonoscillatory solution of $(N, h)$. We may assume that there exists $T \ge 0$ such that $x(t) > 0$ for all $t \ge T$. The case $x(t) < 0$ for all $t \ge T^*$ may be proved by using similar arguments. We know that $x \in N_0 \cup N_2$. Now we assume that $(N, h)$ does not have property A. By Lemma 2.8 there are two possibilities:
I. $x \in N_2$,
II. $x \in N_0$ such that $\lim_{t\to\infty} x(t) = l > 0$.
Case I. Since $x$ is a positive nonoscillatory solution of $(N, h)$ in the class $N_2$, there exists $T_1 \ge T$ such that $x(t) > 0$, $x^{[1]}(t) > 0$, $x^{[2]}(t) > 0$ for all $t \ge T_1$. Because $(x^{[2]})'(t) = -q(t) f(x(h(t))) < 0$ for all $t \ge T_1$, $x^{[2]}(t)$ is a positive decreasing function and thus $0 < x^{[2]}(\infty) < \infty$. Let $T_2 > T_1$ be such that $h(t) > T_1$ for all $t \ge T_2$. Integrating $(N, h)$ in $(T_2, \infty)$, we obtain
\[ x^{[2]}(T_2) - x^{[2]}(\infty) = \int_{T_2}^\infty q(t) f(x(h(t)))\,dt. \]
In view of the fact that $x^{[2]}(\infty) < \infty$, there exists a positive constant $c$ such that
\[ c = \int_{T_2}^\infty q(t) f(x(h(t)))\,dt. \tag{14} \]
Assertion a) of Lemma 5.1 allows us to use assumption (H5), which implies that there exists a positive constant $K_1$ such that $f(x(h(t))) > K_1 x(h(t))$ for all $t \ge T_2$, and thus from (14) we get
\[ c > K_1 \int_{T_2}^\infty q(t) x(h(t))\,dt. \tag{15} \]
As $x^{[1]}(t)$ is a positive increasing function, we have $x^{[1]}(t) \ge x^{[1]}(T_1)$ for all $t \ge T_1$. Integrating this inequality in $(T_1, t)$, we get
\[ x(t) \ge x(T_1) + x^{[1]}(T_1) \int_{T_1}^t r(s)\,ds > x^{[1]}(T_1) \int_{T_1}^t r(s)\,ds \quad \text{for all } t \ge T_1, \]
or
\[ x(h(t)) > x^{[1]}(T_1) \int_{T_1}^{h(t)} r(s)\,ds > x^{[1]}(T_1) \int_{T_2}^{h(t)} r(s)\,ds \quad \text{for all } t \ge T_2. \]
Substituting into (15) we obtain
\[ c > K_1 x^{[1]}(T_1) \int_{T_2}^\infty q(t) \int_{T_2}^{h(t)} r(s)\,ds\,dt, \]
which is a contradiction.

Case II. Because $x$ is a positive nonoscillatory solution of $(N, h)$ in the class $N_0$, there exists $T_1 \ge T$ such that $x(t) > 0$, $x^{[1]}(t) < 0$, $x^{[2]}(t) > 0$ for all $t \ge T_1$. Integrating $(N, h)$ three times in $(t, \infty)$, we obtain
\[ x(t) = x(\infty) + \int_t^\infty r(s) \int_s^\infty p(u) \int_u^\infty q(a) f(x(h(a)))\,da\,du\,ds. \]
Since $0 < x(\infty) < \infty$ and $f$ is a continuous function, there exists a positive constant $K_2$ such that $f(x(h(t))) > K_2$ for all sufficiently large $t$, and so we get
\[ x(t) > x(\infty) + K_2 \int_t^\infty r(s) \int_s^\infty p(u) \int_u^\infty q(a)\,da\,du\,ds = x(\infty) + K_2 \int_t^\infty q(s) \int_t^s p(u) \int_t^u r(a)\,da\,du\,ds, \]
which is a contradiction with $I(q, p, r) = \infty$ (see Remark 5.2).

b) Let $z$ be a proper nonoscillatory solution of $(N^A, h)$. We may assume that there exists $T \ge 0$ such that $z(t) > 0$ for all $t \ge T$. The case $z(t) < 0$ for all $t \ge T^*$ may be proved by using similar arguments. We know that $z \in M_1 \cup M_3$. Now we assume that $(N^A, h)$ does not have property B. By Lemma 2.8 there are two possibilities:
I. $z \in M_3$ such that $\lim_{t\to\infty} z^{[2]}(t) < \infty$,
II. $z \in M_1$.

Case I. Since $z$ is a positive nonoscillatory solution of $(N^A, h)$ in the class $M_3$, there exists $T_1 \ge T$ such that $z(t) > 0$, $z^{[1]}(t) > 0$, $z^{[2]}(t) > 0$ for all $t \ge T_1$. Since $z^{[2]}(\infty) < \infty$, Lemma 2.8 allows us to use assumption (H5); as $z^{[1]}(t)$ is a positive increasing function, we get a contradiction in the same way as in the proof of Case I of part a).

Case II. Since $z$ is a positive nonoscillatory solution of $(N^A, h)$ in the class $M_1$, there exists $T_1 \ge T$ such that $z(t) > 0$, $z^{[1]}(t) > 0$, $z^{[2]}(t) < 0$ for all $t \ge T_1$. Since $z^{[1]}(t)$ is positive and decreasing and $z^{[2]}(t)$ is negative and increasing, we have
\[ 0 \le z^{[1]}(\infty) < \infty \quad \text{and} \quad 0 \le -z^{[2]}(\infty) < \infty. \tag{16} \]
Integrating $(N^A, h)$ twice in $(t, \infty)$ and using (16) we obtain
\[ z^{[1]}(t) = z^{[1]}(\infty) + \int_t^\infty r(s) \int_s^\infty q(u) f(z(h(u)))\,du\,ds \ge \int_t^\infty r(s) \int_s^\infty q(u) f(z(h(u)))\,du\,ds. \]
Assertion c) of Lemma 5.3 allows us to use assumption (H5), which implies that there exists a positive constant $K_3$ such that $f(z(h(t))) > K_3 z(h(t))$ for sufficiently large $t$, and thus we have
\[ z^{[1]}(t) > K_3 \int_t^\infty r(s) \int_s^\infty q(u) z(h(u))\,du\,ds > K_3 z(h(t)) \int_t^\infty r(s) \int_s^\infty q(u)\,du\,ds = K_3 z(h(t)) \int_t^\infty q(s) \int_t^s r(u)\,du\,ds, \]
which gives a contradiction with $I(q, r) = \infty$ (see Remark 5.2).
The following example illustrates the meaning of Theorem 5.6.

Example 5.7. We consider the differential equation
\[ x'''(t) + \frac{6}{t^2}\, x(t^2) = 0, \qquad t \ge 1. \tag{17} \]
This is an equation of the form $(N, h)$, where $r(t) = p(t) = 1$, $q(t) = 6/t^2$, $h(t) = t^2$ and $f(u) = u$. In this case $I(q) < \infty$ and thus Theorem 5.4 is not applicable. But it is easy to verify that the conditions of Theorem 5.6 a) are fulfilled, and so equation (17) has property A. One nonoscillatory solution of equation (17) such that $|x^{[i]}(t)| \downarrow 0$ as $t \to \infty$, $i = 0, 1, 2$, is the function $x(t) = 1/t$.
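The explicit solutions quoted in Examples 5.5 and 5.7 can be checked by direct substitution. The following is a minimal sketch of such a check, assuming that $(N, h)$ has the form $\big(\tfrac{1}{p}(\tfrac{1}{r}x')'\big)' + q\,f(x(h)) = 0$ with $f(u) = u^n$ and $h(t) = t^j$, which is how the two examples read. Every function here is a single monomial $c\,t^k$, so exact rational arithmetic suffices; the helper names (`diff`, `mul`, `solves`) are illustrative and not part of the paper.

```python
from fractions import Fraction

# A monomial c * t**k is stored as the pair (c, k) with an exact rational c.

def diff(mono):
    """Derivative d/dt of c * t**k."""
    c, k = mono
    return (c * k, k - 1)

def mul(a, b):
    """Product of two monomials."""
    return (a[0] * b[0], a[1] + b[1])

def power(mono, n):
    """n-th power of a monomial (models f(u) = u**n)."""
    c, k = mono
    return (c ** n, k * n)

def compose_with_power(mono, j):
    """Replace t by t**j (models the deviating argument h(t) = t**j)."""
    c, k = mono
    return (c, k * j)

def solves(x, inv_r, inv_p, q, h_exp, f_exp):
    """Check ((1/p) ((1/r) x')')' + q * x(h(t))**f_exp == 0 for monomial data."""
    x1 = mul(inv_r, diff(x))      # first quasi-derivative  x^[1] = x'/r
    x2 = mul(inv_p, diff(x1))     # second quasi-derivative x^[2] = (x^[1])'/p
    lhs = diff(x2)                # (x^[2])'
    rhs = mul(q, power(compose_with_power(x, h_exp), f_exp))
    return lhs[1] == rhs[1] and lhs[0] + rhs[0] == 0

one = Fraction(1)

# Example 5.5: r = t^2, p = t^3, q = 90 t^2, h = t^2, f(u) = u^3, x = t^(-2)
example_5_5 = solves((one, -2), (one, -2), (one, -3), (Fraction(90), 2), 2, 3)

# Example 5.7: r = p = 1, q = 6 t^(-2), h = t^2, f(u) = u, x = t^(-1)
example_5_7 = solves((one, -1), (one, 0), (one, 0), (Fraction(6), -2), 2, 1)

print(example_5_5, example_5_7)
```

The check covers only the algebraic identity; the integral conditions of Theorems 5.4 and 5.6 still have to be verified separately.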
References
[1] M. Cecchi, Z. Došlá and M. Marini: “On nonlinear oscillations for equations associated to disconjugate operators”, Nonlinear Anal.-Theor., Vol. 30(3), (1997), pp. 1583–1594.
[2] M. Cecchi, Z. Došlá and M. Marini: “Comparison theorems for third order differential equations”, Proc. Dynam. Systems Appl., Vol. 2, (1996), pp. 99–106.
[3] M. Cecchi, Z. Došlá and M. Marini: “Asymptotic behavior of solutions of third order delay differential equations”, Arch. Math. (Brno), Vol. 33, (1997), pp. 99–108.
[4] M. Cecchi, Z. Došlá and M. Marini: “Some properties of third order differential operators”, Czech. Math. J., Vol. 47(122), (1997), pp. 729–748.
[5] M. Cecchi, Z. Došlá and M. Marini: “An Equivalence Theorem on Properties A, B for Third Order Differential Equations”, Ann. Mat. Pura Appl. (IV), Vol. CLXXIII, (1997), pp. 373–389.
[6] T. Kusano and M. Naito: “Comparison theorems for functional differential equations with deviating arguments”, J. Math. Soc. Japan, Vol. 33(3), (1981), pp. 509–532.
Central European Journal of Mathematics (Central European Science Journals, www.cesj.com)
DOI: 10.1007/s11533-005-0005-2 Research article CEJM 4(1) 2006 64–81
Ordinary differential equations and their exponentials
Anders Kock1∗, Gonzalo E. Reyes2†
1 Department of Mathematical Sciences, University of Aarhus, DK 8000 Aarhus C, Denmark
2 Department of Mathematics, Université de Montréal, H3C 3J7 Montreal, Quebec, Canada
Received 13 April 2005; accepted 4 November 2005
Abstract: In the context of Synthetic Differential Geometry, we discuss vector fields/ordinary differential equations as actions; in particular, we exploit function space formation (exponential spaces) in the category of actions.
© Central European Science Journals Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.
Keywords: Synthetic Differential Geometry, vector field, action, exponential object MSC (2000): 34A99, 51K10, 18B25
Vector fields or, equivalently, (autonomous, first order) ordinary differential equations, have long been considered, heuristically, to be the same as “infinitesimal (pointed) actions” or “infinitesimal flows”, but it is only with the development of Synthetic Differential Geometry (SDG) that we have the tools to formulate these notions and prove their equivalence in a rigorous mathematical way. We exploit this fact to define the exponential of two ordinary differential equations as the exponential of the corresponding infinitesimal actions. The resulting action is seen to be the same as a partial differential equation whose solutions may be obtained by conjugation from the solutions of the differential equations that make up the exponential. Furthermore, we show that this method of conjugation, under some conditions, is an application of the method of change of variables, widely used to solve differential equations.
∗ E-mail: [email protected]
† E-mail: [email protected]
A. Kock, G.E. Reyes / Central European Journal of Mathematics 4(1) 2006 64–81
Our paper has three parts. In the first, we give a brief introduction to the method of Synthetic Differential Geometry (SDG). In the second, we study generalities on actions, and in the third, we describe the exponential of two such actions to obtain the above mentioned result. Some examples illustrate the general method.
1 A synthetic method in differential geometry
What today is called Synthetic Differential Geometry is a method of reasoning which has as one of its origins 20th Century French Algebraic Geometry (Grothendieck et al.), the other origin being certain aspects of modern category/topos theory. The school of algebraic geometers alluded to insisted on the consideration of nilpotent elements in function rings of the geometric objects (schemes). Recall that an element $d$ of a commutative ring $R$ is called nilpotent (or infinitesimal) of order $k$ if $d^{k+1} = 0$. For such infinitesimals as increments, Taylor series and $k$th degree Taylor polynomials become the same thing, since we do have (to the extent it makes sense)
\[ f(x + d) = f(x) + d \cdot f'(x) + \frac{d^2}{2!} f''(x) + \ldots + \frac{d^k}{k!} f^{(k)}(x) \tag{1} \]
exactly, since the remaining terms in the series are zero.

A viewpoint related to the consideration of nilpotent elements is the consideration of the “$k$th neighbourhood of the diagonal” of a scheme, or manifold, $M$,
\[ M_{(k)} \subseteq M \times M, \]
also a classical concept in 20th Century algebraic, and even differential, geometry, cf. [6, 9]. For instance, if $M$ is the affine line, $(x, y) \in M_{(k)}$ precisely when $y - x$ is a $k$th order infinitesimal, in the above sense.

The potential which this viewpoint evidently has for a new foundation for differential calculus and differential geometry could not be fully realized until certain category theoretic notions, and the categorical logic inherent in these, became more crystallized, notably through the notion of topos. Through the work of Lawvere and his collaborators in the 1960's and 1970's, it gradually became clear that any topos behaves so much like the familiar category of sets that most set theoretic reasoning and construction can immediately be carried out in any topos (and toposes have been constructed which contain the category of smooth manifolds, say, in a suitable way).

The exceptions to set theoretic reasoning in general toposes are the law of excluded middle and the axiom of choice. These exceptions certainly are to be expected, since they are responsible for constructions of non-smooth entities out of smooth ones; e.g.
\[ f(x) = \begin{cases} 1 & \text{if } x \le 0; \\ 0 & \text{if not.} \end{cases} \]
It has been amply documented (by construction of models in suitable toposes) that at the cost of not employing these two “non-smooth” logical principles, a stronger axiomatic theory of differential calculus becomes consistent, namely one where the affine “number” line has a sufficient supply of nilpotent elements $d$ such that (1) not only holds for all $k$th order infinitesimals $d$, but that this equation characterizes $f'(x), f''(x), \ldots, f^{(k)}(x)$. A fair amount of differential calculus can be carried out on such an axiomatic basis, meaning in particular that it can be carried out without utilizing (or even knowing) any topos theory. Such an endeavour is sometimes called “naive” synthetic differential geometry, and the present note is entirely conceived in this naive style.

The only category theoretic notion that is essential for the considerations which we present here is that of cartesian closed category. Any topos, in contrast to the category of schemes or smooth manifolds, is cartesian closed, which means that function spaces or exponential objects are immediately available: for any pair of “spaces” (objects) $A$ and $B$, there is a “space” $A^B$ (to be thought of as the “space of maps from $B$ to $A$”) with the property that, for any $X$, there are natural bijective correspondences between the set of maps $X \to A^B$, the set of maps $X \times B \to A$, and the set of maps $B \to A^X$. (These correspondences are often called λ-conversions, a term borrowed from logic; in category theory, they are called “exponential adjointness”.)

Lawvere's essential contribution from 1967 (see [8]) was to realize that if $D$ is the set of 1st order infinitesimals on the affine line $R$ (i.e. $D = \{x \in R \mid x^2 = 0\}$), then $A^D$ is the tangent bundle $T(A)$ of the “space” $A$. Thus he observed that a vector field on $A$ comes by exponential adjointness in three equivalent disguises:
\[ A \to A^D, \qquad A \times D \to A, \qquad D \to A^A \]
(in each of the three disguises, there is an equational condition involving 0 ∈ D). This is the viewpoint we exploit in the following two sections. Hopefully, the text there will illustrate how the synthetic/axiomatic theory works. (Further illustrations may be found in any of the three standard treatises on the subject, [3, 7, 10].) To make the character of the axiomatics explicit, we are dealing with an (unspecified) cartesian closed category (whose objects we just call “spaces” or “sets”), and a given particular object R in it, which carries the structure of a commutative ring, to be thought of as the affine number line. This ring object R is assumed to be an algebra over the rationals, i.e. we assume that 1 + 1, 1 + 1 + 1, etc. are multiplicatively invertible in R. The main thing is that the Taylor formula holds and characterizes derivatives (“Kock-Lawvere axiom”). The text should be reasonably self-contained, except that we need to consider “spaces” M which have the further technical property of being microlinear, cf. e.g. [7] for a lucid exposition of this notion. It essentially means that maps from infinitesimal objects D into M can be constructed by patching, – as illustrated in the text.
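The exactness of the Taylor formula (1) for nilpotent increments can be illustrated in ordinary commutative algebra, outside any topos. The sketch below is an illustration under stated assumptions, not the axiomatic setting of the paper: it models the number line by the rationals, adjoins a formal element $d$ with $d^{k+1} = 0$ (the quotient ring $\mathbb{Q}[d]/(d^{k+1})$), and checks (1) for one sample polynomial. The class name `Nilpotent` and the choice of $f$ are hypothetical.

```python
from fractions import Fraction
from math import factorial

class Nilpotent:
    """Element a0 + a1*d + ... + ak*d**k of Q[d]/(d**(k+1)), with k = order."""

    def __init__(self, coeffs, order):
        coeffs = [Fraction(c) for c in coeffs][: order + 1]
        self.order = order
        self.coeffs = coeffs + [Fraction(0)] * (order + 1 - len(coeffs))

    def _lift(self, other):
        # Treat plain numbers as constants of the ring.
        return other if isinstance(other, Nilpotent) else Nilpotent([other], self.order)

    def __add__(self, other):
        other = self._lift(other)
        return Nilpotent([a + b for a, b in zip(self.coeffs, other.coeffs)], self.order)

    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        out = [Fraction(0)] * (self.order + 1)
        for i, a in enumerate(self.coeffs):
            for j, b in enumerate(other.coeffs):
                if i + j <= self.order:      # powers d**(k+1) and higher vanish
                    out[i + j] += a * b
        return Nilpotent(out, self.order)

    __rmul__ = __mul__

    def __eq__(self, other):
        return self.coeffs == self._lift(other).coeffs

def f(x):
    """Sample polynomial f(x) = x**5 - 3*x**2 + 2 (an arbitrary choice)."""
    return x * x * x * x * x + (-3) * x * x + 2

k = 3
x0 = Fraction(2)
d = Nilpotent([0, 1], k)                     # a k-th order infinitesimal

lhs = f(x0 + d)                              # f evaluated at x0 + d

# f, f', f'', f''' at x0, computed by hand for the sample polynomial
derivs = [x0**5 - 3 * x0**2 + 2, 5 * x0**4 - 6 * x0, 20 * x0**3 - 6, 60 * x0**2]
taylor = Nilpotent([derivs[j] / factorial(j) for j in range(k + 1)], k)

print(lhs == taylor)
```

In this model the coefficients of `lhs` are the quantities $f^{(j)}(x_0)/j!$, so reading off coefficients recovers the derivatives, which is the “characterizes” half of the axiom in miniature.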
2 Generalities on actions
Recall that an action of a set (object) $D$ on a set (object) $M$ is a map $X : D \times M \to M$, and a homomorphism of actions $(M, X) \to (N, Y)$ is a map $f : M \to N$ with $f(X(d, m)) = Y(d, f(m))$ for all $m \in M$ and $d \in D$. The category of actions by a set $D$ forms a cartesian closed category (a topos, in fact). We shall be interested in particular in the exponential formation in this category (cf. Section 3).

We place ourselves in the context of Synthetic Differential Geometry, as described briefly in Section 1. As there, we take $D$ to be the usual set of square zero elements in the number line $R$. It is a pointed object, pointed by $0 \in D$, and the actions $X : D \times M \to M$ we consider are pointed actions in the sense that $X(0, m) = m$ for all $m \in M$. A pointed action, in this situation, is the same thing as a vector field on $M$, cf. [8]; namely, for $m \in M$, $X(m)$ is the tangent vector at $m$ given by $d \mapsto X(d, m)$. The pointed actions likewise form a cartesian closed category (again, a topos) and the exponential to be described in Section 3 is the same as the exponential in the category of actions (cf. [4]).

For the case of vector fields seen as actions by $D$, we want to describe the “streamlines” generated by a vector field in abstract action-theoretic terms; this is going to involve a certain “universal” action $(\tilde R, \partial/\partial t)$: $\tilde R$ is an “infinitesimally open subset” of $R$, i.e., whenever $t \in \tilde R$ then $d + t \in \tilde R$ for every $d \in D$; we also assume $0 \in \tilde R$. The simplest examples of such subsets are $R$ itself, the non-negative numbers $R_{\ge 0}$, open intervals around 0, and the set $D_\infty$ of all nilpotent elements of the number line. (Another important example is the object $\Delta$ described and utilized in [1].) The universal action is the vector field $\partial/\partial t : D \times \tilde R \to \tilde R$ given by $(d, t) \mapsto d + t$.

If $(M, X)$ is a set with an action, a homomorphism of actions $f : (\tilde R, \partial/\partial t) \to (M, X)$ is to be thought of as a particular solution of the differential equation given by $X$, with initial value $f(0)$, or as a “streamline” for the vector field $X$, starting at $f(0)$. One wants, however, also to include dependence on initial value into the notion of solution, and so one is led to consider maps
\[ F : \tilde R \times M \to M, \]
satisfying at least $F(d, m) = X(d, m)$ for all $d \in D$ and all $m \in M$.

For any $t \in \tilde R$, $d \in D$ and $m \in M$ we shall consider the following three points in $M$:
\[ F(d + t, m), \qquad X(d, F(t, m)), \qquad F(t, X(d, m)). \]
We shall consider and compare the three conditions one gets by pairwise equating these (universally quantified over all $t$, $d$, $m$); the first, (2), is the fundamental one, expressing that each $F(-, m)$ is a particular solution (streamline) of the ODE given by the vector field $X$.
\[ F(d + t, m) = X(d, F(t, m)), \tag{2} \]
\[ F(d + t, m) = F(t, X(d, m)), \tag{3} \]
\[ F(t, X(d, m)) = X(d, F(t, m)). \tag{4} \]
Writing $X_d$ for the map $X(d, -) : M \to M$, and similarly $F_t$ for $F(t, -) : M \to M$, these three conditions may be rewritten as
\[ F_{d+t} = X_d \circ F_t, \tag{5} \]
\[ F_{d+t} = F_t \circ X_d, \tag{6} \]
\[ F_t \circ X_d = X_d \circ F_t, \tag{7} \]
(these three equations universally quantified over all $t$, $d$).

The three equations can be reformulated in “classical” terms, i.e. without reference to $d \in D$, using the notion of the differential $d(g)$ of a map $g : M \to N$; we return later to the exact meaning of the terms occurring here, but include the equations now for systematic reasons:
\[ \frac{\partial F}{\partial t}(t, m) = X(F(t, m)) \tag{8} \]
\[ \frac{\partial F}{\partial t}(t, m) = d(F_t)(X(m)) \tag{9} \]
\[ d(F_t)(X(m)) = X(F_t(m)) \tag{10} \]
(these three equations are universally quantified over all $t$, $m$). If $M$ is a suitable vector space ($R$-module), or a suitable subset thereof, these last three equations have a simpler appearance, via the notion of the principal part $\xi$ of the vector field $X$; see (13), (14), (15) below.

Finally, one may consider the following equation, under the condition that $\tilde R$ is a submonoid of $(R, +)$:
\[ F(t + s, m) = F(t, F(s, m)). \tag{11} \]
This is the usual condition for an action of a monoid on a set $M$. Clearly, it implies (2), (3) and (4). But note that we do not in general assume that $\tilde R$ is a submonoid of $(R, +)$.

Let $X$ be a vector field on $M$, thought of as a first-order differential equation, and let $\tilde R$ be an infinitesimally open subset of $R$, containing 0. We say that a map $f : \tilde R \to M$ is a particular solution of $(M, X, \tilde R)$ if $f$ satisfies $f(t + d) = X(d, f(t))$ (or equivalently, if $f$ is a homomorphism of actions).

We say that a map $F : \tilde R \times M \to M$ is a complete solution of $(M, X, \tilde R)$ if $F_d = X_d$ for all $d \in D$ and $F$ satisfies (2).

From now on, we shall usually omit reference to $M$ and $\tilde R$, which will be presupposed, and talk of solutions ‘of a vector field $X$’. A complete solution does not automatically satisfy the other conditions (3)-(11), but it does, provided that $X$ satisfies a certain axiom (reflecting, synthetically, validity of the uniqueness assertion for solutions of differential equations on $M$).
The axiom in question is the following:

Uniqueness property for particular solutions of $X$: Let $X$ be a pointed $D$-action on $M$. If $f, g : \tilde R \to M$ are homomorphisms of actions, with $f(0) = g(0)$, then $f = g$.

Note that the validity of the axiom, for a given $X$, depends on $M$ and the choice of $(\tilde R, \partial/\partial t)$. For instance, we shall prove below that it holds for any microlinear $M$ if $\tilde R$ is taken to be $D_\infty$ (and $\partial/\partial t$ given by $(d, t) \mapsto d + t$). This axiom has the following simple consequence:

Uniqueness property for complete solutions of $X$:

Proposition 2.1. Assume that $X$ has the uniqueness property for particular solutions. Then there is at most one complete solution $F : \tilde R \times M \to M$.

The converse does not seem to be true, but a weaker result is true, see Proposition 2.8 below.

Proposition 2.2. Let $X$ be a vector field on $M$ and assume that $X$ satisfies the uniqueness property for particular solutions. Then if $F : \tilde R \times M \to M$ is a complete solution of the differential equation $X$, it satisfies properties (3) and (4). Furthermore, if $\tilde R$ is a monoid (under $+$), then $F$ also satisfies (11).

Proof. Since the proofs are quite similar, we shall do only (3). Fix $m \in M$ and $d_0 \in D$, and define the pair of functions $f, g : \tilde R \to M$ by the formulas
\[ \begin{cases} f(t) = F_{d_0 + t}(m) \\ g(t) = F_t(X_{d_0}(m)) \end{cases} \]
Clearly, $f$ and $g$ have the same initial value $f(0) = F(d_0, m) = X(d_0, m) = g(0)$. We have to check that $f$ and $g$ are homomorphisms of $D$-actions, i.e., that they satisfy (2). For $g$, this is clear. For $f$,
\[ f(d + t) = F_{d_0 + (d + t)}(m) = F_{d + (d_0 + t)}(m) = X_d(F_{d_0 + t}(m)) = X_d(f(t)). \]
Thus, the equality of the two expressions follows from the uniqueness property for particular solutions assumed for $X$.

We recall the notion of the differential of a map. Recall that the set $M^D$ is the tangent bundle $TM$ of $M$; it comes with a map $TM \to M$ (the base point map), namely the one
which takes $\tau : D \to M$ to $\tau(0)$. The fibre of $TM$ over $x \in M$ is denoted $T_x M$. If $f : M \to N$ is any map, it induces for each $x \in M$ a map $T_x M \to T_{f(x)} N$, called the differential $d_x(f)$ of $f$ at $x$; it is given by $\tau \mapsto f \circ \tau$. If $M$ and $N$ are microlinear, $d_x(f)$ will be a linear map. (The differentials of $f$ jointly define a map $M^D \to N^D$, which is nothing but the functor $(-)^D$ applied to $f$.)

We shall also recall some notions that apply to any “Euclidean $R$-module” $M = V$, or to an “infinitesimally open” subset $M = U \subseteq V$ thereof. These notions are standard in SDG, but let us briefly review them. For any $R$-module $V$, we have the map $V \times V \to V^D$ given by $(a, b) \mapsto [d \mapsto a + d \cdot b]$. To say that $V$ is Euclidean is to say that this map is a bijection; in other words, every tangent vector $\tau : D \to V$ is uniquely of the form $d \mapsto a + d \cdot b$. The element (“vector”) $b \in V$ is called the principal part of the tangent vector $\tau$ (and $a$ is of course the base point of $\tau$). To say that $U \subseteq V$ is infinitesimally open is to say that if a tangent vector $\tau$ to $V$, as above, has its base point $\tau(0)$ in $U$, then $\tau(d)$ is in $U$ for all $d \in D$. Thus $U \times V \cong U^D$ $(= T(U))$. For such $U$ and $x \in U$, we may, via the notion of principal part, identify $T_x(U)$ with $V$.

Recall also that if $\beta : M \to V$ is any map into a Euclidean $R$-module, and $X$ is a vector field on $M$, then the directional derivative $D_X(\beta)$ of $\beta$ along $X$ is the composite
\[ M \xrightarrow{\;X\;} M^D \xrightarrow{\;\beta^D\;} V^D \to V, \]
where the last map is the principal part formation. Thus, $D_X(\beta)$ is characterized by validity of the equation
\[ \beta(X(d, m)) = \beta(m) + d \cdot D_X(\beta)(m), \]
for all $d \in D$, $m \in M$. Equivalently, $D_X(\beta)(m)$ is the principal part of $d_m(\beta)(X(m))$. (Recall that $d_m \beta$ is the differential of $\beta$ at $m$.)

Proposition 2.3. Assume that $X_1$, $X_2$ are vector fields on $M_1$, $M_2$, respectively, and that $H : M_1 \to M_2$ is a homomorphism (i.e., it preserves the $D$-action). Let $V$ be a Euclidean $R$-module. Then for any $u : M_2 \to V$,
\[ D_{X_1}(u \circ H) = D_{X_2}(u) \circ H. \]
Proof. This is a straightforward computation:
\[ u(X_2(d, H(m))) = u(H(m)) + d \cdot D_{X_2}(u)(H(m)); \]
on the other hand,
\[ u(X_2(d, H(m))) = u(H(X_1(d, m))) = u(H(m)) + d \cdot D_{X_1}(u \circ H)(m). \]
By comparing these two expressions we obtain the conclusion of the Proposition.

If $X$ is a vector field on such a Euclidean $V$, we get, by principal part formation, a map $\xi : V \to V$ (i.e. the principal part of the field vector $X(v)$ for $v \in V$ is $\xi(v)$). This map is often called the principal part of the vector field. Similarly, a vector field $X$ on an infinitesimally open $U \subseteq V$ may be identified with a map $\xi : U \to V$. In this case, one often writes $D_\xi(\beta)$ instead of $D_X(\beta)$.

The notion of differential discussed above has a variant in the case of a map $f : U \to V$ between Euclidean $R$-modules, due to the identification (via principal part formation) of tangent vectors to $V$ with vectors in $V$. For $x$ and $u \in U$, write $df(x; u) \in V$ for the principal part of $d_x(f)$ applied to the tangent vector at $x$ whose principal part is $u$. It depends in a linear way on $u \in U$. (It makes sense also when $U$ is an infinitesimally open subset of a Euclidean $R$-module.) In this case $D_\xi(\beta)(x)$ can also be described as $d\beta(x; \xi(x))$. In the 1-dimensional case where $U \subseteq R = V$,
\[ D_\xi(\beta)(x) = d\beta(x; \xi(x)) = \beta'(x) \cdot \xi(x). \tag{12} \]
If $g : \tilde R \times M \to N$ is a map, $\partial g/\partial t(t, x)$ is by definition the tangent vector at $g(t, x) \in N$ given by $d \mapsto g(d + t, x)$ (this is the usage in equations (8) and (9)), or its principal part when this makes sense. It may also be written $\dot g(t, x)$. With these notations, (8), (9) and (10) may be reformulated as
\[ \frac{\partial F}{\partial t}(t, m) = \xi(F(t, m)) \tag{13} \]
\[ \frac{\partial F}{\partial t}(t, m) = d(F_t)(m; \xi(m)) \tag{14} \]
\[ d(F_t)(m; \xi(m)) = \xi(F_t(m)) \tag{15} \]
Although this latter equation does not tell us how the solution of the vector field varies with the initial value $m$ (varying $m$ in arbitrary directions), it does tell us the variation of the solution when $m$ varies in the direction prescribed by the vector field. In the one-dimensional case, there is only one direction anyway; so here we get (using (12) with $\beta = F_t$) the following version of (15) for the complete solution $F(t, x)$ of $\dot x = \xi(x)$ (i.e. of the vector field with principal part $\xi$):
\[ \frac{\partial F}{\partial x}(t, x) \cdot \xi(x) = \xi(F(t, x)), \tag{16} \]
or equivalently
\[ \frac{\partial F}{\partial x}(t, x) \cdot \xi(x) = \frac{\partial F}{\partial t}(t, x). \tag{17} \]
(An elementary proof, for the case of nonvanishing $\xi$, goes as follows. Let $\ln_\xi$ denote a primitive of $1/\xi$. Then we have
\[ \ln_\xi(F(t, x)) = t + \ln_\xi(x); \]
for, both sides yield $\ln_\xi(x)$ for $t = 0$ and have the same $t$-derivative, namely 1 (using that $F(-, x)$ is a solution of the differential equation). Differentiating this equation with respect to $x$, we get $(\xi(F(t, x)))^{-1}\, \partial F/\partial x(t, x) = 1/\xi(x)$, which is a rewriting of (16).)
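Equations (16) and (17) can be tested on a concrete one-dimensional field. The sketch below is a hedged illustration, not taken from the paper: it assumes the sample field $\xi(x) = x^2$, whose complete solution for small $t$ is $F(t, x) = x/(1 - tx)$, and computes the partial derivatives with dual numbers $a + b\,d$, $d^2 = 0$, over the rationals. This mirrors the usage “$\partial g/\partial t(t, x)$ is the principal part of $d \mapsto g(d + t, x)$”, but only as a classical algebraic model. The class `Dual` and the sample point are assumptions.

```python
from fractions import Fraction

class Dual:
    """a + b*d with d*d = 0 and rational a, b; b plays the role of the principal part."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    @staticmethod
    def lift(v):
        return v if isinstance(v, Dual) else Dual(v)

    def __add__(self, o):
        o = Dual.lift(o)
        return Dual(self.a + o.a, self.b + o.b)

    __radd__ = __add__

    def __sub__(self, o):
        o = Dual.lift(o)
        return Dual(self.a - o.a, self.b - o.b)

    def __rsub__(self, o):
        return Dual.lift(o) - self

    def __mul__(self, o):
        o = Dual.lift(o)
        return Dual(self.a * o.a, self.a * o.b + self.b * o.a)

    __rmul__ = __mul__

    def __truediv__(self, o):
        o = Dual.lift(o)                      # requires an invertible base point o.a
        return Dual(self.a / o.a, (self.b * o.a - self.a * o.b) / (o.a * o.a))

    def __rtruediv__(self, o):
        return Dual.lift(o) / self

def xi(x):
    """Principal part of the sample vector field: xi(x) = x**2."""
    return x * x

def F(t, x):
    """Complete solution of x' = x**2 with F(0, x) = x (valid while 1 - t*x != 0)."""
    return x / (1 - t * x)

t0, x0 = Fraction(1, 3), Fraction(1, 2)

dF_dx = F(t0, Dual(x0, 1)).b                  # principal part of d -> F(t0, x0 + d)
dF_dt = F(Dual(t0, 1), x0).b                  # principal part of d -> F(t0 + d, x0)

side_16_left = dF_dx * xi(x0)                 # left-hand side of (16) and (17)
side_16_right = xi(F(t0, x0))                 # right-hand side of (16)

print(side_16_left, side_16_right, dF_dt)
```

Here a primitive of $1/\xi$ is $\ln_\xi(x) = -1/x$, and the identity $\ln_\xi(F(t, x)) = t + \ln_\xi(x)$ reads $-(1 - tx)/x = t - 1/x$, which is how the formula for $F$ was obtained.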
Recall that a vector field $X$ on $M$ is called integrable if there exists a complete solution $F : \tilde R \times M \to M$. If we assume the uniqueness property for particular solutions, the equation (11) holds; if furthermore the commutative monoid structure $+$ on $\tilde R$ actually is a group structure, then (11) implies that the action is invertible, with $X_{-d}$ as $X_d^{-1}$ (in fact $F_{-t} = F_t^{-1}$). Of course, both the uniqueness property and the question whether or not the vector field $X$ is integrable depend on which $\tilde R$ is considered. In particular, we shall say that $X$ is formally integrable or has a formal solution if $X$ is integrable for $\tilde R = D_\infty$ (which is a group under addition). For the case of $M = R^n$, this amounts to integration by formal power series, whence the terminology.

Theorem 2.4. The uniqueness property for particular solutions holds for any vector field on a microlinear object (for $\tilde R = D_\infty$). Furthermore, every vector field on a microlinear object is formally integrable. Thus, every vector field on a microlinear object has a unique complete (formal) solution.

Proof. This theorem was stated in [1] and a sketch of the proof by induction was indicated. We give here a proof in detail which does not use induction.

We need to recall some infinitesimal objects from the literature on SDG, cf. e.g. [7]. Besides $D \subseteq R$, consisting of $d \in R$ with $d^2 = 0$, we have $D^n \subseteq R^n$, the $n$-fold product of $D$ with itself. It has the subobject $D(n) \subseteq D^n$ consisting of those $n$-tuples $(d_1, \ldots, d_n)$ where $d_i \cdot d_j = 0$ for all $i, j$. There is also the object $D_n \subseteq R$ consisting of $\delta \in R$ with $\delta^{n+1} = 0$; $D_\infty$ is the union of all the $D_n$'s. If $(d_1, \ldots, d_n) \in D^n$, then $d_1 + \ldots + d_n \in D_n$.

Now, let $M$ be a microlinear object, and $X$ a vector field on it. We first recall that if $d_1, d_2 \in D$ have the property that $d_1 + d_2 \in D$, then
\[ X_{d_1} \circ X_{d_2} = X_{d_1 + d_2}. \]
(For microlinear objects perceive $D(2)$ to be a pushout over $\{0\}$ of the two inclusions $D \to D(2)$, and clearly both expressions given agree if either $d_1 = 0$ or $d_2 = 0$.) In particular, $X_{d_1}$ and $X_{d_2}$ commute. But more generally, we have the following lemma.

Lemma 2.5. If $X$ is a vector field on a microlinear object and $d_1, d_2 \in D$, the maps $X_{d_1}$ and $X_{d_2}$ commute.

Proof. This is a consequence of the theory of Lie brackets, cf. e.g. [7] 3.2.2, namely $[X, X] = 0$.

Similarly,

Lemma 2.6. If $X$ is a vector field on a microlinear object and $d_1, \ldots, d_n \in D$ are such that $d_1 + \cdots + d_n = 0$, then
\[ X_{d_1} \circ \cdots \circ X_{d_n} = 1_M \]
(= the identity map on $M$). In particular, $(X_d)^{-1} = X_{-d}$.

Proof. We first prove that $R$, and hence any microlinear object, perceives $D_n$ to be the orbit space of $D^n$ under the action of the symmetric group $S_n$ in $n$ letters. Assume that $p : D^n \to R$ coequalizes the action, i.e. is symmetric in the $n$ arguments. By the basic axiom of SDG, $p$ may be written in the form
\[ p(d_1, \ldots, d_n) = \sum_{Q \subseteq \{1, \ldots, n\}} a_Q d^Q \]
for unique $a_Q$'s in $R$ (where $d^Q$ denotes $\prod_{i \in Q} d_i$). We claim that $a_Q = a_{\pi(Q)}$ for every $\pi \in S_n$. Indeed,
\[ \sum_Q a_Q d^Q = \Big( \sum_Q a_Q d^Q \Big) \circ \pi \]
since $p$ is symmetric. But
\[ \Big( \sum_Q a_Q d^Q \Big) \circ \pi = \sum_Q a_Q d^{\pi(Q)} = \sum_Q a_{\pi^{-1}(Q)} d^Q. \]
By comparing coefficients and using uniqueness of coefficients, we conclude $a_Q = a_{\pi(Q)}$, and this shows that $p$ is (the restriction to $D^n$ of) a symmetric polynomial $R^n \to R$. By Newton's theorem (which holds internally), $p$ is a polynomial in the elementary symmetric polynomials $\sigma_i$. Recall that $\sigma_1(d_1, \ldots, d_n) = d_1 + \cdots + d_n$, and each $\sigma_i$, when restricted to $D^n$, is a function of $\sigma_1$, since $d_i^2 = 0$; e.g.
\[ \sigma_2(d_1, \ldots, d_n) = \sum_{i < j} d_i d_j = \frac{1}{2}(d_1 + \cdots + d_n)^2 = \frac{1}{2}(\sigma_1(d_1, \ldots, d_n))^2. \]
Now consider, for fixed $m \in M$, the map $p : D^n \to M$ given by $(d_1, \ldots, d_n) \mapsto X_{d_1} \circ \cdots \circ X_{d_n}(m)$. By Lemma 2.5, this map is invariant under the symmetric group $S_n$ (recall that this group is generated by transpositions), so there is a unique $\varphi : D_n \to M$ such that $\varphi(d_1 + \cdots + d_n) = X_{d_1} \circ \cdots \circ X_{d_n}(m)$. So if $d_1 + \cdots + d_n = 0$,
\[ X_{d_1} \circ \cdots \circ X_{d_n}(m) = \varphi(0) = \varphi(0 + \cdots + 0) = X_0 \circ \cdots \circ X_0(m) = m. \]
This proves the Lemma.

We can now prove the Theorem. We need to define $F_t : M \to M$ when $t \in D_\infty$. Assume for instance that $t \in D_n$. By microlinearity of $M$, $M$ perceives $D_n$ to be the orbit space of $D^n$ under the action of $S_n$ (see the proof of Lemma 2.6), via the map $(d_1, \ldots, d_n) \mapsto d_1 + \cdots + d_n$, so we are forced to define $F_t = X_{d_1} \circ \cdots \circ X_{d_n}$ if $F$ is to extend $X$ and to satisfy (11). The fact that this is well defined independently of the choice of $n$ and the choice of $d_1, \ldots, d_n$ that add up to $t$ follows from Lemma 2.6.

As a Corollary of the proof, we may note the following general “analytic induction principle”:

Proposition 2.7. Let $f$ and $g$ be maps $D_\infty \to M$, where $M$ is a microlinear object. If $f(0) = g(0)$, and if for all $t \in D_\infty$ and $d \in D$, $f(t) = g(t)$ implies $f(d + t) = g(d + t)$, then $f = g$.
Proof. Since $D_\infty$ is the union of the $D_n$'s, it suffices to prove that $f$ and $g$ agree on any $D_n$. But, as in the proof above, microlinear objects perceive the addition map $D^n \to D_n$ to be epic, and clearly the assumptions on $f$ and $g$ imply that $f(d_1 + \dots + d_n) = g(d_1 + \dots + d_n)$ for any $(d_1, \dots, d_n) \in D^n$.
Proposition 2.8. Assume that there is a complete solution $F : D_\infty \times M \to M$ for a vector field $X$ on $M$ (where $M$ is microlinear). Then $X$ has the uniqueness property for particular solutions.

Proof. Let $f : D_\infty \to M$ be a particular solution of $X$. We shall prove that $f(t) = F(t, f(0))$, from which the uniqueness of such $f$ clearly follows. The proof of this equation proceeds 'by analytic induction', i.e. using Proposition 2.7: the equation is obviously true for $t = 0$. Assume that it is true for $t$. We prove that it is true for $d + t$. In fact,
\[ f(d + t) = X(d, f(t)) = X(d, F(t, f(0))) = F(d + t, f(0)) . \]
The first and last equality hold because $f$ and $F$ are solutions, whereas the middle one holds by the induction assumption.

Notice the following consequence of this proposition: if there is a complete solution for $X$, then it is unique.

As a particular case of special importance, we consider a linear vector field on a microlinear and Euclidean $R$-module $V$. To say that the vector field is linear is to say that its principal-part formation $V \to V$ is a linear map, $\Delta$, say. We have then the following version of a classical result:

Proposition 2.9. Let a linear vector field on a microlinear Euclidean $R$-module $V$ be given by the linear map $\Delta : V \to V$. Then the unique formal solution of the corresponding differential equation, i.e., the equation $\dot{F}(t) = \Delta(F(t))$ with initial position $v$, is the map $D_\infty \times V \to V$ given by
\[ (t, v) \mapsto e^{t \cdot \Delta}(v), \tag{18} \]
where the right hand side here means the sum of the following "series" (which has only finitely many non-vanishing terms, since $t$ is assumed nilpotent):
\[ v + t\,\Delta(v) + \frac{t^2}{2!}\,\Delta^2(v) + \frac{t^3}{3!}\,\Delta^3(v) + \dots \]
Here of course $\Delta^2(v)$ means $\Delta(\Delta(v))$, etc.
Proof. We have to prove that $\dot{F}(t) = \Delta(F(t))$. We calculate the left hand side by differentiating the series term by term (there are only finitely many non-zero terms):
\[ \Delta(v) + \frac{2t}{2!}\,\Delta^2(v) + \frac{3t^2}{3!}\,\Delta^3(v) + \dots = \Delta\Big( v + t \cdot \Delta(v) + \frac{t^2}{2!} \cdot \Delta^2(v) + \dots \Big) \]
using linearity of $\Delta$. But this is just $\Delta$ applied to $F(t)$.
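The term-by-term differentiation in this proof can be replayed on coefficients. The following is a minimal sketch, not part of the original argument: it assumes $V = \mathbb{Q}^2$, represents $\Delta$ by a matrix, and treats $F(t)$ as a truncated formal series $\sum_k t^k c_k$ with $c_k = \Delta^k(v)/k!$. The helper names (`mat_vec`, `exp_series_coeffs`, `satisfies_linear_ode`) and the sample $\Delta$ are illustrative choices only.

```python
from fractions import Fraction
from math import factorial


def mat_vec(matrix, vector):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def exp_series_coeffs(delta, v, order):
    """Coefficients c_k = Delta^k(v) / k! of F(t) = sum_k t^k c_k, k = 0..order."""
    coeffs = []
    power = [Fraction(x) for x in v]  # Delta^0(v) = v
    for k in range(order + 1):
        coeffs.append([x / factorial(k) for x in power])
        power = mat_vec(delta, power)  # next power Delta^(k+1)(v)
    return coeffs


def satisfies_linear_ode(delta, coeffs):
    """Check F'(t) = Delta(F(t)) coefficientwise: (k+1) c_{k+1} = Delta(c_k)."""
    for k in range(len(coeffs) - 1):
        derivative_k = [(k + 1) * x for x in coeffs[k + 1]]
        if derivative_k != mat_vec(delta, coeffs[k]):
            return False
    return True


# Illustrative data: Delta is the generator of plane rotations, v = (1, 0).
delta = [[0, 1], [-1, 0]]
coeffs = exp_series_coeffs(delta, [1, 0], order=6)
print(satisfies_linear_ode(delta, coeffs))
```

Exact rational arithmetic is used so that the comparison of coefficients is an equality test rather than a tolerance check.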
There is an analogous result (which we utilized in [5]) for second order differential equations of the form $\ddot{F}(t) = \Delta(F(t))$ (with $\Delta$ linear); the proof is similar and we omit it:

Proposition 2.10. The formal solution of this second order differential equation $\ddot{F} = \Delta F$, with initial position $v$ and initial speed $w$, is given by
\[ F(t) = v + t \cdot w + \frac{t^2}{2!}\,\Delta(v) + \frac{t^3}{3!}\,\Delta(w) + \frac{t^4}{4!}\,\Delta^2(v) + \frac{t^5}{5!}\,\Delta^2(w) + \dots \]

3
Exponential of vector fields
We shall describe the exponential of two $D$-actions. We do this when the action in the exponent is invertible. An action $X : D \times M \to M$ is called invertible if, for each $d \in D$, $X(d, -) : M \to M$ is invertible. In this case, the exponential $(N, Y)^{(M, X)}$ may be described as $N^M$ equipped with the following action by $D$: an element $d \in D$ acts on $\beta : M \to N$ by "conjugation":
\[ \beta \mapsto Y_d \circ \beta \circ (X_d)^{-1}, \]
where $Y_d$ denotes $Y(d, -) : N \to N$, and similarly for $X_d$. It is easy to check that if both actions are pointed, so is the above exponential. This means that exponentials in the category of pointed objects are formed by taking exponentials in the topos of actions.

In this section, we show that solutions of an exponential vector field may be obtained by conjugating solutions of the vector fields that make up the exponential. Furthermore, this method of conjugation is equivalent (under some conditions) to the method of change of variables, widely used to solve differential equations.

Theorem 3.1. Assume that $(M, X)$ and $(N, Y)$ are vector fields having (complete) solutions $F : \tilde{R} \times M \to M$ and $G : \tilde{R} \times N \to N$, respectively, and assume that all $F_t$ are invertible. Then a (complete) solution $H : \tilde{R} \times N^M \to N^M$ of the exponential $(N, Y)^{(M, X)}$ is obtained as the map
\[ H : \tilde{R} \times N^M \to N^M \]
given by conjugation: $H_t(\beta) = G_t \circ \beta \circ F_t^{-1}$.

Proof. This is purely formal. For $\beta \in N^M$, we have
\[
\begin{aligned}
(Y^X)_d(H_t(\beta)) &= Y_d \circ H_t(\beta) \circ X_d^{-1} \\
&= Y_d \circ G_t \circ \beta \circ F_t^{-1} \circ X_d^{-1} \\
&= G_{d+t} \circ \beta \circ F_{d+t}^{-1} \\
&= H_{d+t}(\beta),
\end{aligned}
\]
where in the third step we used (5) for $F$ as well as for $G$, together with invertibility of $F_s$ for all $s$ and invertibility of $X_d$. A similar argument gives that if each of (3)-(11) holds for both $F$ and $G$, then the corresponding property holds for $H$.

In the applications we have considered, the invertibility of the $F_t$ will be secured by subtraction on $\tilde{R}$, with $F_t^{-1} = F_{-t}$.

Using directional derivatives, we can give a more familiar expression to the vector field (1ODE) $Y^X$ considered above on the object $N^M$, when the base $N$ is a microlinear Euclidean $R$-module $V$ (hence also $V^M$ is Euclidean), and the exponent $M$ is microlinear. In fact, letting $\eta$ be the principal part of the vector field $Y$ on $N = V$, we have, for $u \in V^M$, $m \in M$, $d \in D$ (recall that $(X_d)^{-1} = X_{-d}$),
\[
\begin{aligned}
(Y^X)_d(u)(m) &= Y_d \circ u \circ X_{-d}(m) \\
&= u(X_{-d}(m)) + d \cdot \eta(u(X_{-d}(m))) \\
&= u(m) - d \cdot D_X(u)(m) + d \cdot \eta(u(m)) \\
&= u(m) + d \cdot [-D_X(u)(m) + \eta(u(m))]
\end{aligned}
\]
(at the third equality sign, a cancellation of $d \cdot d$ took place in the last term). In other words, the principal part of $Y^X$ is $\theta : V^M \to V^M$ given by
\[ \theta(u)(m) = \eta(u(m)) - D_X(u)(m). \tag{19} \]
In these terms, the differential equation given by the vector field $Y^X$ may be rewritten (leaving out the $m$, and modulo some obvious abuse of notation) as $\dot{u} = \eta(u) - D_X(u)$, or
\[ \frac{\partial u}{\partial t} + D_X(u) = \eta(u). \tag{20} \]
This is a PDE of first order. Thus, the exponential of two 1ODE's is a 1PDE. Our Theorem then translates into the following result, formulated entirely in standard terms:

Theorem 3.2. The complete solution of the PDE (20) is given by
\[ u(t, v)(m) = G(t, v(F(-t, m))), \]
where $F$ and $G$ are complete solutions of $\dot{x} = X(x)$ and $\dot{y} = \eta(y)$, respectively, and where $v$ is any function (initial value) $M \to V$.

In the particular case that $M = \tilde{R}$ and $N = R$, we can give this PDE a more familiar presentation, using (12); the equation (20) becomes the following PDE for a function $u(t, x)$:
\[ \frac{\partial u}{\partial t} + \xi(x)\,\frac{\partial u}{\partial x} = \eta(u). \]
The Theorem provides the following solution $u(t, x)$ of it (with initial condition $u(0, -)$ an arbitrary initial value function $v = v(x)$):
\[ u(t, x) = G(t, v(F(-t, x))), \]
where $F$ and $G$ are solutions of $\dot{x} = \xi(x)$, $\dot{y} = \eta(y)$, i.e. satisfy
\[ \frac{\partial}{\partial t} F(t, x) = \xi(F(t, x)) \quad \text{and} \quad \frac{\partial}{\partial t} G(t, y) = \eta(G(t, y)). \]
This can also be verified by plain calculus, using (17).
We shall finish by giving a reformulation (and alternative proof) of the main Theorem; it may be considered as a "change of variable" method, changing variable in the exponent space $M$. We need some preliminaries. For any object $N$, let us consider its "zero vector field" $Z$, i.e., $Z_d$ is the identity map on $N$, for all $d$. For a vector field $X$ on an object $M$, we then also have the "vertical" vector field $Z \times X$ on $N \times M$, given by $(Z \times X)(d, (n, m)) = (n, X(d, m))$. If we have a complete solution $F : \tilde{R} \times M \to M$ of a vector field $X$ on $M$, we may consider the map $\overline{F} : \tilde{R} \times M \to \tilde{R} \times M$ given by $(t, m) \mapsto (t, F(t, m))$.

Proposition 3.3. The map $\overline{F}$ thus described is an automorphism of the vector field $Z \times X$ on $\tilde{R} \times M$.

Proof. By a straightforward diagram chase, one sees that this is a restatement of condition (4).

The following is a form of the chain rule. We consider a vector field $X$ on $M$, with solution $F : \tilde{R} \times M \to M$. Let $U : \tilde{R} \times M \to V$ be any function with values in a Euclidean $R$-module.
Proposition 3.4. Under these circumstances, we have
\[ \frac{\partial}{\partial t} U(t, F_t(m)) = \frac{\partial U}{\partial t}(t, F_t(m)) + (D_{Z \times X} U)(t, F_t(m)) \]
for all $t \in \tilde{R}$, $m \in M$.

Proof. Since $F$ is a solution of $X$, $F_{d+t} = X_d \circ F_t$, and so for any $t, t' \in \tilde{R}$, $(Z \times X)_d(t', F_t(m)) = (t', F_{d+t}(m))$. Therefore, by definition of directional derivative,
\[ U(t', F_{d+t}(m)) = U(t', F_t(m)) + d \cdot (D_{Z \times X} U)(t', F_t(m)). \]
Putting $t' = d + t$, we thus have
\[
\begin{aligned}
U(d + t, F_{d+t}(m)) &= U(d + t, F_t(m)) + d \cdot (D_{Z \times X} U)(d + t, F_t(m)) \\
&= U(d + t, F_t(m)) + d \cdot (D_{Z \times X} U)(t, F_t(m))
\end{aligned}
\]
by a standard cancellation of two $d$'s, after Taylor expansion. Expanding the first term, we may continue:
\[ = U(t, F_t(m)) + d \cdot \frac{\partial U}{\partial t}(t, F_t(m)) + d \cdot (D_{Z \times X} U)(t, F_t(m)). \]
On the other hand,
\[ U(d + t, F_{d+t}(m)) = U(t, F_t(m)) + d \cdot \frac{\partial}{\partial t}\big(U(t, F_t(m))\big); \]
comparing these two expressions gives the result.
The method of change of variables has been used extensively to solve differential equations. We shall prove that our method for solving the exponential differential equation $Y^X$, where $X$ is an integrable vector field on $M$, $Y$ an integrable vector field on a Euclidean $R$-module, and where $\tilde{R}$ is symmetric with respect to the origin (if $t \in \tilde{R}$, then $-t \in \tilde{R}$), may be seen as an application of the method of change of variables. We let $\eta : V \to V$ denote the principal part of $Y$, as before. Let $F : \tilde{R} \times M \to M$ be the assumed solution of $X$. From $F_{-t} = F_t^{-1}$ it follows that the map $\overline{F}$ considered in Proposition 3.3 is invertible, with $\overline{F}^{-1}(t, m) = (t, F(-t, m))$. The map $\overline{F}^{-1}(t, m)$ represents the change of variables $\tau = t$, $\mu = F(-t, m)$.

Theorem 3.5 ("Change of variables"). If $u : \tilde{R} \times M \to V$ is a particular solution of $Y^X$, or, equivalently by (20), of
\[ \frac{\partial u}{\partial t} + D_X(u) = \eta(u), \]
then the unique map $U : \tilde{R} \times M \to V$ given as the composite
\[ \tilde{R} \times M \xrightarrow{\ \overline{F}\ } \tilde{R} \times M \xrightarrow{\ u\ } V \tag{21} \]
is a particular solution of $Y^Z$, or, equivalently, of
\[ \frac{\partial U}{\partial t} = \eta(U), \tag{22} \]
and vice versa.

Proof. Since $u(t, m) = U(t, F_{-t}(m))$, we have
\[ \frac{\partial u}{\partial t}(t, m) = \frac{\partial}{\partial t} U(t, F_{-t}(m)) = \frac{\partial U}{\partial t}(t, F_{-t}(m)) - (D_{Z \times X} U)(t, F_{-t}(m)), \]
by the chain rule, Proposition 3.4. Writing $\mu$ for $F(-t, m)$ (so $\overline{F}(t, \mu) = (t, m)$), we thus have
\[ \frac{\partial u}{\partial t}(t, m) = \frac{\partial U}{\partial t}(t, \mu) - (D_{Z \times X} U)(t, \mu). \]
Now $\overline{F}$ is an automorphism of the vector field $Z \times X$, by Proposition 3.3. Also $U = u \circ \overline{F}$. From Proposition 2.3, we therefore get that the second term on the right hand side equals $-(D_{Z \times X} u)(t, m)$, so
\[ \frac{\partial u}{\partial t}(t, m) = \frac{\partial U}{\partial t}(t, \mu) - (D_{Z \times X} u)(t, m). \]
On the other hand $u$ is assumed to be a solution of $Y^X$, which means that
\[ \frac{\partial u}{\partial t}(t, m) = \eta(u(t, m)) - (D_{Z \times X} u)(t, m). \]
Comparing these two equations gives us the first equality sign in
\[ \frac{\partial U}{\partial t}(t, \mu) = \eta(u(t, m)) = \eta(U(t, \mu)), \]
and since this holds for all $\mu$, $\frac{\partial U}{\partial t} = \eta(U)$, as claimed. (The vice versa part follows because $\overline{F}$ is invertible.)
Note that when the exponent vector field is the zero field $Z$, particular solutions $h(t)$ of $Y^Z$ have the property that for each fixed $m \in M$, $h(t)(m)$ is a particular solution of $Y$; therefore, if $Y$ has the uniqueness property for particular solutions, then so does $Y^Z$.

Corollary 3.6 (Uniqueness of solutions of the exponential). Let $X$ be a vector field on $M$ and let $Y$ be a vector field on a Euclidean module $V$. Assume that $F : \tilde{R} \times M \to M$ is a complete solution of $X$. If $Y$ has the uniqueness property for particular solutions, then so does $Y^X$.

Proof. We let $\eta$ be the principal part of $Y$. Let $u, w : \tilde{R} \times M \to V$ be (exponential adjoints of) particular solutions of $Y^X$ such that $u(0, m) = w(0, m)$. For each $m \in M$, define $U(t) = u(t, F(t, m))$ and $W(t) = w(t, F(t, m))$. By the change of variables theorem, both $U$ and $W$ satisfy $\dot{y} = \eta(y)$, i.e. they are particular solutions of $Y$ with the same initial value. By the uniqueness of particular solutions of $Y$, $U = W$. But $m$ is arbitrary,
i.e., $u(t, F(t, m)) = w(t, F(t, m))$ for every $t$ and every $m$. Since $F(t, -)$ is bijective, this shows that $u = w$.

Corollary 3.7 (Uniqueness of the solution by conjugation). Assume the hypothesis of the previous corollary. If $Y$ has a complete solution $G$, then the solution of $Y^X$ obtained by conjugation from $F$ and $G$ is the only complete solution of this exponential vector field.

Some examples. The first two are immediate applications of the formula derived in Theorem 3.2. The third is concerned with the tangent bundle $T(M)$ seen as an exponential object $M^D$.

Example 3.8 ('Simple transport equation'). $\partial u/\partial t + \partial u/\partial x = 0$. Here, $\xi(x) = 1$ and $\eta(y) = 0$. The complete solution of $\dot{x} = 1$ is clearly $F(t, x) = x + t$, which is globally defined. Clearly, the complete solution of $\dot{y} = 0$ is $G(t, x) = x$ (also globally defined). Hence $H(t, x) = G(t, v(F(-t, x))) = v(x - t)$ is the only (globally defined) complete solution of the PDE (with initial value the function $v$).
Example 3.9. $\partial u/\partial t + x\,\partial u/\partial x = u$. In this case, $\xi(x) = x$ and $\eta(y) = y$, and their complete solutions are the same, namely $F(t, x) = G(t, x) = x e^{t}$ (globally defined). Therefore, $H(t, x) = G(t, v(F(-t, x))) = v(x e^{-t})\, e^{t}$ is the (globally defined) complete solution of the PDE (with initial value the function $v$).
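The closed forms in Examples 3.8 and 3.9 can be checked numerically against the PDE $\partial u/\partial t + \xi(x)\,\partial u/\partial x = \eta(u)$. The sketch below is only a sanity check in classical real-variable terms, not the synthetic argument of the text: derivatives are approximated by central differences, and the initial value $v = \sin$, the sample point, and the helper names `conjugation_solution` and `pde_residual` are arbitrary illustrative choices.

```python
import math


def conjugation_solution(F, G, v):
    """Build u(t, x) = G(t, v(F(-t, x))), the formula of Theorem 3.2."""
    return lambda t, x: G(t, v(F(-t, x)))


def pde_residual(u, xi, eta, t, x, h=1e-5):
    """Central-difference estimate of u_t + xi(x) * u_x - eta(u) at (t, x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2 * h)
    return u_t + xi(x) * u_x - eta(u(t, x))


v = math.sin  # assumed initial value function, chosen only for illustration

# Example 3.8: xi = 1, eta = 0, F(t, x) = x + t, G(t, y) = y.
transport = conjugation_solution(lambda t, x: x + t, lambda t, y: y, v)

# Example 3.9: xi(x) = x, eta(y) = y, F(t, x) = x e^t, G(t, y) = y e^t.
scaling = conjugation_solution(
    lambda t, x: x * math.exp(t), lambda t, y: y * math.exp(t), v
)

print(abs(pde_residual(transport, lambda x: 1.0, lambda y: 0.0, 0.3, 0.7)))
print(abs(pde_residual(scaling, lambda x: x, lambda y: y, 0.3, 0.7)))
```

Both printed residuals should be at the level of the finite-difference error, and both solutions reduce to $v$ at $t = 0$.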
Example 3.10. Let $D$ be the set of elements of square zero in $R$, as usual. It carries a vector field, namely the map $e : D \times D \to D$ given by $(d, \delta) \mapsto (1 + d) \cdot \delta$. It is easy to see that this vector field is integrable, with complete solution $E : R \times D \to D$ given by $(t, \delta) \mapsto e^{t} \cdot \delta$. Now consider the tangent vector bundle $M^D$ on $M$. The zero vector field $Z$ on $M$ is certainly integrable, and so we have by the theorem a complete integral for the vector field $Z^e$ on the tangent bundle. We describe the integral explicitly (this then also describes the vector field, by restriction): it is the map $R \times M^D \to M^D$ given by
\[ (t, \beta) \mapsto [\, d \mapsto \beta(e^{-t} \cdot d) \,]. \]
The vector field on $M^D$ obtained this way is, except for the sign, the Liouville vector field, cf. [2], IX.2.
References
[1] M. Bunge and E. Dubuc: "Local Concepts in Synthetic Differential Geometry and Germ Representability", In: D.W. Kueker, E.G.K. Lopez-Escobar and C.H. Smith (Eds.): Mathematical Logic and Theoretical Computer Science, Marcel Dekker Inc., 1987, pp. 93–159.
[2] C. Godbillon: Géométrie Différentielle et Mécanique Analytique, Hermann, Paris, 1969.
[3] A. Kock: Synthetic Differential Geometry, Cambridge University Press, 1981.
[4] A. Kock and G.E. Reyes: "Aspects of Fractional Exponents", Theor. Appl. Categories, Vol. 5(10), (1999).
[5] A. Kock and G.E. Reyes: "Some calculus with extensive quantities; wave equation", Theor. Appl. Categories, Vol. 11(14), (2003).
[6] A. Kumpera and D. Spencer: Lie equations, Vol. 1, Ann. of Math Studies, Vol. 73, Princeton University Press, 1972.
[7] R. Lavendhomme: Basic Concepts Of Synthetic Differential Geometry, Kluwer Academic Publishers, 1996.
[8] F.W. Lawvere: "Categorical Dynamics", In: A. Kock (Ed.): Topos Theoretic Methods in Geometry, Series 30, Aarhus Various Publ., (1979).
[9] B. Malgrange: "Equations de Lie, I", J. Diff. Geom., Vol. 6, (1972), pp. 503–522.
[10] I. Moerdijk and G.E. Reyes: Models for Smooth Infinitesimal Analysis, Springer-Verlag, 1991.
Central European Science Journals, www.cesj.com
Central European Journal of Mathematics
DOI: 10.1007/s11533-005-0006-1 Research article CEJM 4(1) 2006 82–109
Blow-up of regular submanifolds in Heisenberg groups and applications
Valentino Magnani
Department of Mathematics, Pisa University, Largo Bruno Pontecorvo 5, I-56127, Pisa, Italy
Received 5 July 2005; accepted 30 November 2005
Abstract: We obtain a blow-up theorem for regular submanifolds in the Heisenberg group, where intrinsic dilations are used. The main consequence of this result is an explicit formula for the density of the (p+1)-dimensional spherical Hausdorff measure restricted to a p-dimensional submanifold with respect to the Riemannian surface measure. We explicitly compute this formula in some simple examples and we present a lower semicontinuity result for the spherical Hausdorff measure with respect to the weak convergence of currents. Another application is the proof of an intrinsic coarea formula for vector-valued mappings on the Heisenberg group.
© Central European Science Journals Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.
Keywords: Heisenberg group, submanifolds, Hausdorff measure, coarea formula
MSC (2000): 28A75, 22E25
1
Introduction
In recent years, considerable effort has been devoted to developing Analysis and Geometry in stratified groups and in more general Carnot–Carathéodory spaces, and several monographs and surveys on this subject have appeared. Among them we mention [3, 6, 13, 16, 23, 28], but this list could surely be enlarged. Our study fits into the recent project of developing Geometric Measure Theory in these spaces. The setting of our investigation is the (2n+1)-dimensional Heisenberg group $\mathbb{H}^n$, which represents the simplest model of a non-Abelian stratified group, [6, 27]. The aim of this paper is to present an intrinsic blow-up theorem for $C^1$ submanifolds in the geometry
of the Heisenberg group along with its applications. The main feature of this procedure is the use of natural dilations of the group, namely, a one-parameter family of group homomorphisms that are homogeneous with respect to the distance of the group. Recall that dilations in $\mathbb{H}^n$ are anisotropic, hence they act differently on different directions of the submanifold. The foremost directions are the so-called horizontal directions, which determine the "sub-Riemannian geometry" of the Heisenberg group: at any point $x \in \mathbb{H}^n$ a 2n-dimensional subspace $H_x\mathbb{H}^n \subset T_x\mathbb{H}^n$ is given, and the family of all horizontal spaces $H_x\mathbb{H}^n$ forms the so-called horizontal subbundle $H\mathbb{H}^n$. We will defer full definitions to Section 2.

The blow-up procedure consists in enlarging the submanifold $\Sigma$ at some point $x \in \Sigma$ by intrinsic dilations and taking the intersection of the magnified submanifold with a bounded set centered at $x$. We are interested in studying the case when $T_x\Sigma \not\subset H_x\mathbb{H}^n$, namely, $x$ is a transverse point. The effect of rescaling the submanifold at a transverse point $x$ can be obtained by considering the behavior of $\mathrm{vol}_p(B_{x,r} \cap \Sigma)/r^{p+1}$ as $r \to 0^+$, which heuristically is
\[ \frac{\mathrm{vol}_p(B_{x,r} \cap \Sigma)}{r^{p+1}} = \frac{\mathrm{vol}_p(l_x \delta_r(B_1 \cap \Sigma_{x,r}))}{r^{p+1}} = \frac{\mathrm{vol}_p(\delta_r(B_1 \cap \Sigma_{x,r}))}{r^{p+1}} \approx \alpha(x)\, \mathrm{vol}_p(B_1 \cap \Sigma_{x,r}). \]
Here $\mathrm{vol}_p$ denotes the p-dimensional Riemannian measure restricted to $\Sigma$, the left translation $l_x : \mathbb{H}^n \to \mathbb{H}^n$ is given by $l_x(y) = x \cdot y$, the dilation of factor $r > 0$ is $\delta_r : \mathbb{H}^n \to \mathbb{H}^n$, the dilated submanifold at $x$ is $\Sigma_{x,r} = \delta_{1/r}(l_{x^{-1}} \Sigma)$ and $B_{x,r}$ is the open ball of center $x$ and radius $r$ with respect to a fixed homogeneous distance. The meaning of $\alpha(x)$ will be clear in the following theorem, which makes our previous consideration rigorous and represents our first main result.

Theorem 1.1 (Blow-up). Let $\Sigma$ be a p-dimensional $C^1$ submanifold of $\Omega$, where $\Omega$ is an open subset of $\mathbb{H}^n$, and let $x$ be a transverse point. Then the following limit holds:
\[ \lim_{r \to 0^+} \frac{\mathrm{vol}_p(\Sigma \cap B_{x,r})}{r^{p+1}} = \frac{\theta_p^{\rho}(\tau_{\Sigma,V}(x))}{|\tau_{\Sigma,V}(x)|}. \tag{1} \]
A novel object appearing in this limit is the vertical tangent p-vector $\tau_{\Sigma,V}(x)$, introduced in Definition 2.13. Its associated p-dimensional subspace of $\mathfrak{h}_n$ is a subalgebra whose image through the exponential map represents the blow-up limit of the rescaled submanifold $\Sigma_{x,r}$ as $r \to 0^+$. The p-vector $\tau_{\Sigma,V}(x)$ in higher codimension plays the same role that the well known horizontal normal $\nu_H$ plays in codimension one (compare for instance with [19]). The metric factor $\theta(\tau_{\Sigma,V}(x))$, introduced in [19], corresponds to the measure of the intersection of $B_1$ with the vertical subspace associated to the vertical tangent p-vector $\tau_{\Sigma,V}(x)$. A first consequence of Theorem 1.1 is an explicit formula to compute the (p+1)-dimensional spherical Hausdorff measure of p-dimensional $C^1$ submanifolds in the Heisenberg group. In fact, thanks to the $\mathcal{S}^{p+1}$-negligibility of characteristic points proved in [22], Theorem 1.1 along with standard theorems on differentiation of measures immediately gives the following result.
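Theorem 1.2. Let $\rho$ be a homogeneous distance with constant metric factor $\alpha > 0$ and let $\mathcal{S}^{p+1}_{\mathbb{H}^n} = \alpha\, \mathcal{S}^{p+1}_{\rho}$. Then we have
\[ \mathcal{S}^{p+1}_{\mathbb{H}^n}(\Sigma) = \int_{\Sigma} |\tau_{\Sigma,V}(x)| \, d\mathrm{vol}_p(x). \tag{2} \]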
Note that in codimension one, the integral formula (2) fits into the results of [19] in stratified groups. The connection between these results is shown in Proposition 4.18. There are several examples of homogeneous distances satisfying the hypotheses of Theorem 1.2, as we show in Example 4.6. Proposition 4.5 shows a class of homogeneous distances having constant metric factor. Proposition 4.10 shows how the computation of the (p+1)-dimensional spherical Hausdorff measure of a submanifold can be easily performed in several examples, which will appear in Section 4. Another consequence of Theorem 1.1 is the validity of an intrinsic coarea formula for vector-valued Lipschitz mappings defined on the Heisenberg group. By the Sard theorem and the classical Whitney approximation theorem we can assume that a.e. level set is a submanifold of class $C^1$; then we apply the representation formula (2). The core of the proof lies in the key relation
\[ |\tau_{\Sigma,V}(x)| = \frac{J_H f(x)}{J_g f(x)}, \tag{3} \]
which surprisingly connects the vertical tangent p-vector with the horizontal jacobian $J_H f$. The proof of (3) is given in Theorem 3.3. Thus, we can establish the following result.

Theorem 1.3 (Coarea formula). Let $f : A \to \mathbb{R}^k$ be a Riemannian Lipschitz map, where $A \subset \mathbb{H}^n$ is a measurable subset and $1 \le k < 2n + 1$. Let $\rho$ be a homogeneous distance with constant metric factor $\alpha > 0$. Then for every measurable function $u : A \to [0, +\infty]$ the formula
\[ \int_A u(x)\, J_H f(x)\, dx = \int_{\mathbb{R}^k} \left( \int_{f^{-1}(t) \cap A} u(y)\, d\mathcal{S}^{p+1}_{\mathbb{H}^n}(y) \right) dt \tag{4} \]
holds, where $p = 2n + 1 - k$ and $\mathcal{S}^{p+1}_{\mathbb{H}^n} = \alpha\, \mathcal{S}^{p+1}_{\rho}$.

This coarea formula, along with that of [21], which is a particular case, represents the first examples of intrinsic coarea formulae for vector-valued mappings defined on non-Abelian Carnot groups. The extension of the coarea formula to mappings that are Lipschitz with respect to a homogeneous distance remains an interesting open question. This problem has been settled only in the case of real-valued mappings, in [20]. This question is intimately related to a blow-up theorem for "intrinsically regular" submanifolds. In this connection, we mention a recent work by Franchi, Serapioni and Serra Cassano [10], where a notion of intrinsic submanifold in $\mathbb{H}^n$ has been introduced in arbitrary codimension. According to their terminology, a k-codimensional H-regular submanifold must, for algebraic reasons, satisfy $1 \le k \le n$. With this restriction it might be highly irregular, even unrectifiable
in the Euclidean sense, [14]. Nevertheless they show that an area-type formula for its (p+1)-dimensional spherical Hausdorff measure still holds. Here we wish to emphasize the difference in our approach, where we consider $C^1$ submanifolds, but with no restriction on their codimension. Let us summarize the contents of the present paper. Section 2 recalls some notions. Section 3 is devoted to the proof of Theorem 1.1. In Section 4 we show the validity of Theorem 1.2, along with its applications. Precisely, in Theorem 4.9 we show how a suitable rescaling of the spherical Hausdorff measure yields an intrinsic surface measure only depending on the sub-Riemannian metric, namely, the restriction of the Riemannian metric to the horizontal subbundle. In Proposition 4.5, we single out a privileged class of homogeneous distances having constant metric factor. We present several explicit computations of the (p+1)-dimensional spherical Hausdorff measure in concrete examples. As another application of Theorem 1.2, we show a lower semicontinuity result for the spherical Hausdorff measure with respect to weak convergence of regular currents. Section 5 establishes an intrinsic coarea formula for vector-valued Riemannian Lipschitz mappings on the Heisenberg group.
Acknowledgment I wish to thank Bruno Franchi, Raul Serapioni and Francesco Serra Cassano for pleasant discussions on intrinsic surface area in Heisenberg groups.
2
Some basic notions
The (2n+1)-dimensional Heisenberg group $\mathbb{H}^n$ is a simply connected Lie group whose Lie algebra $\mathfrak{h}_n$ is equipped with a basis $(X_1, \dots, X_{2n}, Z)$ satisfying the bracket relations
\[ [X_k, X_{k+n}] = 2Z \tag{5} \]
for every $k = 1, \dots, n$. We will identify the Lie algebra $\mathfrak{h}_n$ with the isomorphic Lie algebra of left invariant vector fields on $\mathbb{H}^n$, so that any $X_j$ also denotes a left invariant vector field of $\mathbb{H}^n$. In the terminology of Differential Geometry, the basis $(X_1, \dots, X_{2n}, Z)$ forms a moving frame in $\mathbb{H}^n$. We will say that $(X_1, \dots, X_{2n}, Z)$ is our standard frame. In particular, $(X_1, \dots, X_{2n})$ is a horizontal frame and it spans a smooth distribution of 2n-dimensional hyperplanes, called horizontal hyperplanes and denoted by $H_x\mathbb{H}^n$ for every $x \in \mathbb{H}^n$. The collection of all horizontal hyperplanes forms the so-called horizontal subbundle, denoted by $H\mathbb{H}^n$. In the sequel, we will fix the unique left invariant Riemannian metric $g$ such that the standard frame $(X_1, X_2, \dots, X_{2n}, Z)$ forms an orthonormal basis at each point.
Definition 2.1. Every set of left invariant vector fields $(Y_1, \dots, Y_{2n})$ spanning the horizontal hyperplane at the unit element of $\mathbb{H}^n$ will be called a horizontal frame.

Recall that the exponential map $\exp : \mathfrak{h}_n \to \mathbb{H}^n$ is a diffeomorphism, so it is possible to introduce a system of coordinates in all of $\mathbb{H}^n$.

Definition 2.2 (Graded coordinates). Let $(Y_1, \dots, Y_{2n})$ be a horizontal frame and let $W$ be a non-horizontal left invariant vector field. The frame $(Y_1, \dots, Y_{2n}, W)$ defines a coordinate chart $F : \mathbb{R}^{2n+1} \to \mathbb{H}^n$ given by
\[ F(y) = \exp\Big( y_{2n+1} W + \sum_{j=1}^{2n} y_j Y_j \Big). \tag{6} \]
Coordinates defined by (6) are called graded coordinates in the case $W = Z$ and standard coordinates in the case the standard frame $(X_1, \dots, X_{2n}, Z)$ is used. In general we will say that the coordinates are associated to the frame $(Y_1, \dots, Y_{2n}, W)$. We will assume throughout that a system of standard coordinates is fixed, if not stated otherwise.

Remark 2.3. Note that the horizontal frame $(Y_1, \dots, Y_{2n})$ of Definition 2.2 may not satisfy relations (5), where the $X_i$ are replaced by the $Y_i$.

The standard frame with respect to standard coordinates reads as follows
\[ \tilde{X}_k = \partial_{x_k} - x_{k+n}\, \partial_{x_{2n+1}}, \qquad \tilde{X}_{k+n} = \partial_{x_{k+n}} + x_k\, \partial_{x_{2n+1}} \qquad \text{and} \qquad \tilde{Z} = \partial_{x_{2n+1}} \tag{7} \]
and the group operation is given by the following formula
\[ x \cdot y = \Big( x_1 + y_1, \dots, x_{2n} + y_{2n},\; x_{2n+1} + y_{2n+1} + \sum_{k=1}^{n} (x_k y_{k+n} - x_{k+n} y_k) \Big). \tag{8} \]
A natural family of dilations which respects the group operation (8) can be defined as follows
\[ \delta_r(x) = (r x_1, r x_2, \dots, r x_{2n}, r^2 x_{2n+1}) \tag{9} \]
for every $r > 0$. In fact, the map $\delta_r : \mathbb{H}^n \to \mathbb{H}^n$ defined above is a group homomorphism with respect to the operation (8). In contrast with Analysis in Euclidean spaces, where the Euclidean distance is the most natural choice, in the Heisenberg group several distances have been introduced for different purposes. However, all of them are homogeneous in the following sense. If $\rho : \mathbb{H}^n \times \mathbb{H}^n \to [0, +\infty[$ is a homogeneous distance, then
(1) $\rho$ is continuous with respect to the topology of $\mathbb{H}^n$,
(2) $\rho(xy, xz) = \rho(y, z)$ for every $x, y, z \in \mathbb{H}^n$,
(3) $\rho(\delta_r y, \delta_r z) = r\, \rho(y, z)$ for every $y, z \in \mathbb{H}^n$ and every $r > 0$.
To simplify notation we write $\rho(x, 0) = \rho(x)$, where 0 denotes either the origin of $\mathbb{R}^{2n+1}$ or the unit element of $\mathbb{H}^n$. The open ball of center $x$ and radius $r > 0$ with respect to a homogeneous distance is denoted by $B_{x,r}$. The Carnot–Carathéodory distance is an important example of homogeneous distance, [11]. However, all of our computations hold for a general homogeneous distance, therefore in the sequel $\rho$ will denote a homogeneous distance, if not stated otherwise. Note that the Hausdorff dimension of $\mathbb{H}^n$ with respect to any homogeneous distance is $2n + 2$. Next, we recall the notion of Riemannian jacobian.

Definition 2.4 (Riemannian jacobian). Let $f : M \to N$ be a $C^1$ smooth mapping of Riemannian manifolds and let $x \in M$, where $M$ and $N$ have dimension $d$ and $k$, respectively. The Riemannian jacobian of $f$ at $x$ is given by
\[ J_g f(x) = \| \Lambda_k(df(x)) \|, \tag{10} \]
where $\Lambda_k(df(x)) : \Lambda_k(T_x M) \to \Lambda_k(T_{f(x)} N)$ is the canonical linear map associated to $df(x) : T_x M \to T_{f(x)} N$. The norm of $\Lambda_k(df(x))$ is understood with respect to the induced scalar products on $\Lambda_k(T_x M)$ and $\Lambda_k(T_{f(x)} N)$. We recall scalar products of p-vectors in (17).

To compute the Riemannian jacobian, we fix two orthonormal bases $(X_1, \dots, X_d)$ and $(E_1, \dots, E_k)$ of $T_x M$ and $T_{f(x)} N$, respectively, and we represent $df(x)$ with respect to these bases by the matrix
\[
\nabla_{X,E} f(x) =
\begin{bmatrix}
\langle E_1, df(x)(X_1) \rangle & \langle E_1, df(x)(X_2) \rangle & \dots & \langle E_1, df(x)(X_d) \rangle \\
\langle E_2, df(x)(X_1) \rangle & \langle E_2, df(x)(X_2) \rangle & \dots & \langle E_2, df(x)(X_d) \rangle \\
\vdots & \vdots & \ddots & \vdots \\
\langle E_k, df(x)(X_1) \rangle & \langle E_k, df(x)(X_2) \rangle & \dots & \langle E_k, df(x)(X_d) \rangle
\end{bmatrix}.
\tag{11}
\]
Then the jacobian of the matrix $\nabla_{X,E} f(x)$ coincides with $J_g f(x)$. In the sequel, it will be useful to fix the following notation to indicate minors of a matrix.

Definition 2.5. Let $G$ be an $m \times n$ matrix with $m \le n$. We denote by $G_{i_1 i_2 \dots i_m}$ the $m \times m$ submatrix with columns $(i_1, i_2, \dots, i_m)$. We define the minor
\[ M_{i_1 i_2 \dots i_m}(G) = \det\big( G_{i_1 i_2 \dots i_m} \big). \tag{12} \]
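The minors of Definition 2.5 give a concrete way to evaluate the jacobian of a $k \times d$ matrix. The sketch below assumes the standard Cauchy–Binet identity, by which the jacobian of a matrix $G$ with $k \le d$ equals both $\sqrt{\det(G G^T)}$ and the square root of the sum of the squared $k \times k$ minors. The function names and the sample matrix are illustrative and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        sub = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(sub)
    return total


def minor(G, cols):
    """M_{i1...im}(G): determinant of the square submatrix on the given columns (0-based)."""
    return det([[row[c] for c in cols] for row in G])


def jacobian_from_minors(G):
    """Square root of the sum of squares of all maximal minors of G."""
    m, n = len(G), len(G[0])
    return sqrt(sum(minor(G, cols) ** 2 for cols in combinations(range(n), m)))


def jacobian_from_gram(G):
    """Square root of det(G G^T), the Gram-determinant form of the same quantity."""
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in G] for r1 in G]
    return sqrt(det(gram))


# Illustrative 2 x 3 matrix: its maximal minors are 1, 3 and 6.
G = [[1, 2, 0], [0, 1, 3]]
print(jacobian_from_minors(G), jacobian_from_gram(G))
```

For the sample matrix both routes return $\sqrt{46}$, since $1^2 + 3^2 + 6^2 = 46 = \det(G G^T)$.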
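Definition 2.6 (Horizontal jacobian). Let $\Omega$ be an open subset of $\mathbb{H}^n$ and let $x \in \Omega$. The horizontal jacobian of a $C^1$ mapping $f : \Omega \to \mathbb{R}^k$ at $x$ is given by
\[ J_H f(x) = \| \Lambda_k(df(x)|_{H_x\mathbb{H}^n}) \|, \tag{13} \]
where $\Lambda_k(df(x)|_{H_x\mathbb{H}^n}) : \Lambda_k(H_x\mathbb{H}^n) \to \Lambda_k(\mathbb{R}^k)$.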
From the definition of horizontal jacobian, it follows that it only depends on the restriction of $g$ to the horizontal subbundle, namely, on the "sub-Riemannian metric". Let us consider a horizontal frame $(Y_1, Y_2, \dots, Y_{2n})$; then $J_H f(x)$ is given by the jacobian of
\[
\nabla_Y f(x) =
\begin{bmatrix}
Y_1 f^1(x) & Y_2 f^1(x) & \dots & Y_{2n} f^1(x) \\
Y_1 f^2(x) & Y_2 f^2(x) & \dots & Y_{2n} f^2(x) \\
\vdots & \vdots & \ddots & \vdots \\
Y_1 f^k(x) & Y_2 f^k(x) & \dots & Y_{2n} f^k(x)
\end{bmatrix}.
\tag{14}
\]
As a consequence, we have the formula
\[ J_H f(x) = \sqrt{ \sum_{1 \le i_1 < i_2 < \dots < i_k \le 2n} \big[ M_{i_1 i_2 \cdots i_k}(\nabla_Y f(x)) \big]^2 }. \tag{15} \]
[…]
where the diameter is considered with respect to a homogeneous distance $\rho$ of $\mathbb{H}^n$ and we do not consider any dimensional factor. The k-dimensional Hausdorff measure built with respect to the Riemannian distance is denoted by $\mathrm{vol}_k$ and it corresponds to the classical Riemannian volume measure with respect to the graded metric $g$, see for instance 3.2.46 of [5].

Definition 2.9 (Horizontal p-vectors). For each $x \in \mathbb{H}^n$, we say that any linear combination of wedge products $X_{j_1}(x) \wedge X_{j_2}(x) \wedge \dots \wedge X_{j_p}(x)$, where $1 \le j_s \le 2n$ and $s = 1, \dots, p$, is a horizontal p-vector. The space of horizontal p-vectors is denoted by $\Lambda_p(H_x\mathbb{H}^n)$.

Definition 2.10 (Vertical p-vectors). For each $x \in \mathbb{H}^n$, we say that any linear combination of wedge products $X_{j_1}(x) \wedge X_{j_2}(x) \wedge \dots \wedge X_{j_{p-1}}(x) \wedge Z(x)$, where $1 \le j_s \le 2n$ and $s = 1, \dots, p-1$, is a vertical p-vector. The space of vertical p-vectors is denoted by $V_p(T_x\mathbb{H}^n)$.

For every couple of simple p-vectors $v_1 \wedge \dots \wedge v_p$, $w_1 \wedge \dots \wedge w_p \in \Lambda_p(T_x\mathbb{H}^n)$, we define the scalar product induced by the left invariant Riemannian metric $g$ on $T_x\mathbb{H}^n$ as
\[ \langle v_1 \wedge \dots \wedge v_p,\; w_1 \wedge \dots \wedge w_p \rangle = \det\big( g(x)(v_i, w_j) \big), \tag{17} \]
see for instance 1.7.5 of [5] for more details. This allows us to regard the space of vertical p-vectors $V_p(T_x\mathbb{H}^n)$ as the orthogonal complement of the horizontal subspace $\Lambda_p(H_x\mathbb{H}^n)$. We have the orthogonal decomposition
\[ \Lambda_p(T_x\mathbb{H}^n) = \Lambda_p(H_x\mathbb{H}^n) \oplus V_p(T_x\mathbb{H}^n), \tag{18} \]
which generalizes the case $p = 1$, corresponding to $T_x\mathbb{H}^n = H_x\mathbb{H}^n \oplus \operatorname{span}\{Z(x)\}$.

Definition 2.11 (Vertical projection). Let $x \in \mathbb{H}^n$ and let $\xi \in \Lambda_p(T_x\mathbb{H}^n)$. The orthogonal decomposition $\xi = \xi_H + \xi_V$ associated to (18) uniquely defines the vertical p-vector $\xi_V \in V_p(T_x\mathbb{H}^n)$. We say that $\xi_V$ is the vertical projection of $\xi$ and that the mapping $\pi_V : \Lambda_p(T_x\mathbb{H}^n) \to V_p(T_x\mathbb{H}^n)$, which associates $\xi_V$ to $\xi$, is the vertical projection. We have omitted $x$ in the definition of the vertical projection $\pi_V$.

Definition 2.12 (Characteristic points and transverse points). Let $\Sigma \subset \Omega$ be a $C^1$ submanifold and let $x \in \Sigma$. We say that $x \in \Sigma$ is a characteristic point if $T_x\Sigma \subset H_x\mathbb{H}^n$ and that it is a transverse point otherwise. The characteristic set of $\Sigma$ is the subset of all characteristic points and it is denoted by $C(\Sigma)$.

Recall that a tangent p-vector to a p-dimensional submanifold $\Sigma$ of class $C^1$ at $x \in \Sigma$ is defined by the wedge product $t_1 \wedge t_2 \wedge \dots \wedge t_p$, where $(t_1, \dots, t_p)$ is an orthonormal basis of $T_x\Sigma$. We denote this simple p-vector by $\tau_\Sigma(x)$. Notice that the tangent p-vector (which belongs to a one-dimensional space) cannot be continuously defined on all of $\Sigma$, unless the submanifold is oriented.
Definition 2.13 (Vertical tangent p-vector). Let Σ ⊂ Ω be a p-dimensional submanifold of class C 1 and let x ∈ Σ. A vertical tangent p-vector to Σ at x is defined by πV (τΣ ), where τΣ is a tangent p-vector and πV is the vertical projection. The vertical tangent p-vector will be denoted by τΣ,V (x).
3 Blow-up at transverse points
This section is devoted to the proof of Theorem 1.1. In the following proposition, we give a simple characterization of characteristic points using vertical tangent p-vectors.

Proposition 3.1. Let $\Sigma \subset \Omega$ be a submanifold of class $C^1$ and let $x \in \Sigma$. Then $x \in C(\Sigma)$ if and only if $\tau_{\Sigma,V}(x) = 0$.

Proof. Let $x \in \Sigma$ and let $(t_1, t_2, \dots, t_p)$ be an orthonormal basis of $T_x\Sigma$. We have the unique decomposition $t_j = V_j + \gamma_j Z$, where $V_j \in H_x\mathbb{H}^n$ for every $j = 1, \dots, p$. It follows that
\[ \tau = t_1 \wedge t_2 \wedge \cdots \wedge t_p = (V_1 + \gamma_1 Z) \wedge (V_2 + \gamma_2 Z) \wedge \cdots \wedge (V_p + \gamma_p Z) = V_1 \wedge V_2 \wedge \cdots \wedge V_p + \sum_{j=1}^{p} \gamma_j\, V_1 \wedge \cdots \wedge V_{j-1} \wedge Z \wedge V_{j+1} \wedge \cdots \wedge V_p . \]
Assume that $x \notin C(\Sigma)$. If $V_1, V_2, \dots, V_p$ are linearly dependent, then we get
\[ t_1 \wedge t_2 \wedge \cdots \wedge t_p = \sum_{j=1}^{p} \gamma_j\, V_1 \wedge \cdots \wedge V_{j-1} \wedge Z \wedge V_{j+1} \wedge \cdots \wedge V_p . \]
As a result, $\pi_V(\tau) = \tau$, hence it is not vanishing. If $V_1, \dots, V_p$ are linearly independent, then all wedge products of the form
\[ V_1 \wedge \cdots \wedge V_{j-1} \wedge Z \wedge V_{j+1} \wedge \cdots \wedge V_p \tag{19} \]
are non-vanishing for every $j = 1, \dots, p$. The fact that $x$ is transverse implies that there exists $\gamma_{j_0} \neq 0$, then the projection
\[ \pi_V(\tau) = \sum_{j=1}^{p} \gamma_j\, V_1 \wedge \cdots \wedge V_{j-1} \wedge Z \wedge V_{j+1} \wedge \cdots \wedge V_p \tag{20} \]
is non-vanishing. Conversely, if $\pi_V(\tau) \neq 0$, then (20) yields some $\gamma_{j_1} \neq 0$, therefore $t_{j_1} \notin H_x\mathbb{H}^n$.

Proposition 3.2. Let $f : \Omega \to \mathbb{R}^k$ be of class $C^1$, with surjective differential at each point of $\Omega$. Let $\Sigma$ denote the submanifold $f^{-1}(0)$ of $\Omega$ and let $x \in \Sigma$. Then $x \in C(\Sigma)$ if and only if $df(x)|_{H_x\mathbb{H}^n}$ is not surjective.
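Proof. We first notice that $\operatorname{Ker}\, df(x)|_{H_x\mathbb{H}^n} = T_x\Sigma \cap H_x\mathbb{H}^n$, then we have
\[ \dim(H_x\mathbb{H}^n \cap T_x\Sigma) = 2n - \dim \operatorname{Im}\bigl( df(x)|_{H_x\mathbb{H}^n} \bigr). \tag{21} \]
This last formula allows us to get our claim as follows. Assume that $x \in C(\Sigma)$. Then $T_x\Sigma \subset H_x\mathbb{H}^n$ and (21) gives
\[ 2n + 1 - k = 2n - \dim \operatorname{Im}\bigl( df(x)|_{H_x\mathbb{H}^n} \bigr). \]
From this equation we get $\dim \operatorname{Im}( df(x)|_{H_x\mathbb{H}^n}) = k - 1$, so $df(x)|_{H_x\mathbb{H}^n}$ is not surjective. Conversely, if $df(x)|_{H_x\mathbb{H}^n}$ is not surjective, then (21) implies
\[ \dim(H_x\mathbb{H}^n \cap T_x\Sigma) \ge 2n - k + 1 = \dim(T_x\Sigma), \]
therefore $T_x\Sigma \subset H_x\mathbb{H}^n$, namely, $x \in C(\Sigma)$.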
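For a hypersurface in $\mathbb{H}^1$ (so $k = 1$), Proposition 3.2 says that a point of a level set is characteristic exactly when the horizontal gradient vanishes there. The following is a minimal sketch of that test, not part of the original argument. It assumes the standard frame $X_1 = \partial_1 - x_2\partial_3$, $X_2 = \partial_2 + x_1\partial_3$ (the sign convention compatible with the contact form (64)), and the names `horizontal_gradient`, `is_characteristic`, `plane` and `paraboloid` are hypothetical.

```python
def horizontal_gradient(f, x, h=1e-5):
    """Approximate (X1 f, X2 f) at x in H^1 by central differences.

    Assumed frame: X1 = d/dx1 - x2 d/dx3, X2 = d/dx2 + x1 d/dx3.
    """
    x1, x2, x3 = x

    def partial(i):
        # Central difference quotient in the i-th coordinate direction.
        e = [0.0, 0.0, 0.0]
        e[i] = h
        plus = f(x1 + e[0], x2 + e[1], x3 + e[2])
        minus = f(x1 - e[0], x2 - e[1], x3 - e[2])
        return (plus - minus) / (2.0 * h)

    d1, d2, d3 = partial(0), partial(1), partial(2)
    return (d1 - x2 * d3, d2 + x1 * d3)


def is_characteristic(f, x, tol=1e-6):
    """For k = 1: x (assumed to lie on the level set) is characteristic
    iff the horizontal gradient of f vanishes at x."""
    g1, g2 = horizontal_gradient(f, x)
    return tol > max(abs(g1), abs(g2))


def plane(x1, x2, x3):
    # Zero level set: the horizontal plane x3 = 0.
    return x3


def paraboloid(x1, x2, x3):
    # Zero level set: x3 = (x1^2 + x2^2) / 2.
    return x3 - 0.5 * (x1 * x1 + x2 * x2)


if __name__ == "__main__":
    print(is_characteristic(plane, (0.0, 0.0, 0.0)))
    print(is_characteristic(plane, (1.0, 0.0, 0.0)))
    print(is_characteristic(paraboloid, (1.0, 1.0, 1.0)))
```

In both sample surfaces the origin is the only characteristic point, which is consistent with the negligibility of the characteristic set used later in the paper.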
Theorem 3.3. Let $f : \Omega \to \mathbb{R}^k$ be of class $C^1$, with surjective differential at each point of $\Omega$. Let $\Sigma$ denote the submanifold $f^{-1}(0)$ of $\Omega$ and let $x \in \Sigma$. Then we have
\[ |\tau_{\Sigma,V}(x)| = \frac{J_H f(x)}{J_g f(x)} . \tag{22} \]
Proof. Left invariance of the Riemannian metric allows us to consider the left translated submanifold $l_{x^{-1}}\Sigma$. Replacing $f$ with $f \circ l_x$ and $\Omega$ with $l_{x^{-1}}\Omega$ we can assume that $x$ is the unit element $0$ of $\mathbb{H}^n$. Recall that $l_x : \mathbb{H}^n \to \mathbb{H}^n$ is the left translation $l_x(y) = x \cdot y$. If $x \in C(\Sigma)$, then Proposition 3.1 and Proposition 3.2 make (22) the trivial identity $0 = 0$. Assume that $x \in \Sigma \setminus C(\Sigma)$. Then Proposition 3.2 implies that the horizontal gradients
\[ \nabla_H f^i = \bigl( X_1 f^i(0), X_2 f^i(0), \dots, X_{2n} f^i(0) \bigr), \qquad i = 1, 2, \dots, k, \]
span a $k$-dimensional subspace of $\mathbb{R}^{2n}$. Let $c_1, c_2, \dots, c_k \in \mathbb{R}^{2n}$ be orthogonal unit vectors generating this vector space and choose $c_{k+1}, \dots, c_{2n} \in \mathbb{R}^{2n}$ such that $(c_1, c_2, \dots, c_{2n})$ is an orthonormal basis of $\mathbb{R}^{2n}$. These vectors allow us to define a new horizontal frame
\[ Y_j = \sum_{k=1}^{2n} c_j^k X_k \qquad \text{for every } j = 1, \dots, 2n. \tag{23} \]
We denote by $C$ the $2n \times 2n$ orthogonal matrix whose $i$-th column corresponds to the vector $c_i$, then by our choice of vectors $c_j$, we obtain $\nabla_Y f(x) = \nabla_X f(x)\, C$ and
\[ \nabla_Y f(x) = \begin{bmatrix}
\langle \nabla_H f^1, c_1 \rangle & \langle \nabla_H f^1, c_2 \rangle & \cdots & \langle \nabla_H f^1, c_k \rangle & 0 & \cdots & 0 \\
\langle \nabla_H f^2, c_1 \rangle & \langle \nabla_H f^2, c_2 \rangle & \cdots & \langle \nabla_H f^2, c_k \rangle & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
\langle \nabla_H f^k, c_1 \rangle & \langle \nabla_H f^k, c_2 \rangle & \cdots & \langle \nabla_H f^k, c_k \rangle & 0 & \cdots & 0
\end{bmatrix}, \tag{24} \]
where the symbol $\langle \cdot , \cdot \rangle$ denotes the standard scalar product of $\mathbb{R}^{2n}$. Let us consider $F : \mathbb{R}^{2n+1} \to \mathbb{H}^n$, defining graded coordinates $(y_1, \dots, y_{2n+1})$ associated to the frame $(Y_1, \dots, Y_{2n}, Z)$, according to Definition 2.2. Then the differential of $f$ at $0$ with respect to $(y_1, \dots, y_{2n+1})$ can be represented by the matrix
\[ \nabla_y f(0) = \begin{bmatrix}
f^1_{y_1}(0) & f^1_{y_2}(0) & \cdots & f^1_{y_k}(0) & 0 & \cdots & 0 & f^1_{y_{2n+1}}(0) \\
f^2_{y_1}(0) & f^2_{y_2}(0) & \cdots & f^2_{y_k}(0) & 0 & \cdots & 0 & f^2_{y_{2n+1}}(0) \\
\vdots & \vdots & & \vdots & \vdots & & \vdots & \vdots \\
f^k_{y_1}(0) & f^k_{y_2}(0) & \cdots & f^k_{y_k}(0) & 0 & \cdots & 0 & f^k_{y_{2n+1}}(0)
\end{bmatrix}. \tag{25} \]
It follows that $f^i_{y_j}(0) = \langle \nabla_H f^i, c_j \rangle$ for every $i = 1, \dots, k$ and $j = 1, \dots, 2n$. The implicit function theorem gives us a $C^1$ map $\varphi : A \to \mathbb{R}^k$ such that $A \subset \mathbb{R}^p$ is an open neighbourhood of the origin and
\[ f\bigl( \varphi^1(\tilde y), \dots, \varphi^k(\tilde y), y_{k+1}, \dots, y_{2n+1} \bigr) = 0 \tag{26} \]
for every $\tilde y = (y_{k+1}, \dots, y_{2n+1}) \in A$. Then we define the mapping $\phi : A \to \mathbb{R}^{2n+1}$ as
\[ \phi(\tilde y) = \bigl( \varphi^1(\tilde y), \dots, \varphi^k(\tilde y), y_{k+1}, \dots, y_{2n+1} \bigr), \tag{27} \]
so that differentiating (26) we get
\[ 0 = \partial_{y_j} (f \circ \phi)^i = \sum_{l=1}^{k} f^i_{y_l}\, \varphi^l_{y_j} + f^i_{y_j} \tag{28} \]
for every $i = 1, \dots, k$ and $j = k+1, \dots, 2n+1$. Equations (28) can be more concisely written in matrix form as follows
\[ \nabla_z f\, \varphi_{y_j} = - f_{y_j}, \tag{29} \]
where $z = (y_1, \dots, y_k)$, the $k \times k$ matrix $\nabla_z f$ has coefficients $f^i_{y_l}$, where $i, l = 1, \dots, k$ and $j = k+1, \dots, 2n+1$. In order to achieve a more explicit formula for the differential of the implicit map, we explicitly write the inverse matrix of $\nabla_z f$ as
\[ (\nabla_z f)^{-1} = \frac{1}{M_{12\cdots k}(\nabla_z f)} \begin{bmatrix}
C_{11}(\nabla_z f) & C_{21}(\nabla_z f) & \cdots & C_{k1}(\nabla_z f) \\
C_{12}(\nabla_z f) & C_{22}(\nabla_z f) & \cdots & C_{k2}(\nabla_z f) \\
\vdots & \vdots & & \vdots \\
C_{1k}(\nabla_z f) & C_{2k}(\nabla_z f) & \cdots & C_{kk}(\nabla_z f)
\end{bmatrix}, \]
where $C_{ij}(\nabla_z f)$ denotes the cofactor of $\nabla_z f$, which is equal to $(-1)^{i+j} \det(\hat D_{ij} f)$, and $\hat D_{ij} f$ is the $(k-1) \times (k-1)$ square matrix obtained by removing the $i$-th row and the $j$-th column from $\nabla_z f$. In view of (29) we have
\[ \varphi_{y_j} = - (\nabla_z f)^{-1} f_{y_j} = - \frac{1}{M_{12\cdots k}(\nabla_z f)} \begin{bmatrix}
\sum_{i=1}^{k} C_{i1}(\nabla_z f)\, f^i_{y_j} \\
\sum_{i=1}^{k} C_{i2}(\nabla_z f)\, f^i_{y_j} \\
\vdots \\
\sum_{i=1}^{k} C_{ik}(\nabla_z f)\, f^i_{y_j}
\end{bmatrix}. \]
An elementary formula for computing the determinant of a matrix implies
\[ \sum_{i=1}^{k} C_{is}(\nabla_z f)\, f^i_{y_j} = M_{12\cdots s-1\; j\; s+1\cdots k}(\nabla_y f) \]
for every $j = k+1, \dots, 2n+1$. As a consequence, we get
\[ \varphi^s_{y_j} = - \frac{M_{12\cdots s-1\; j\; s+1\cdots k}(\nabla_y f)}{M_{1\cdots k}(\nabla_z f)} . \tag{30} \]
Note that $M_{1\cdots k}(\nabla_z f)$ corresponds to the determinant of the matrix $\nabla_z f$. As a consequence of (30) and of (25), we conclude that $\varphi^s_{y_j}(0) = 0$ for every $j = k+1, \dots, 2n$. Previous considerations and expression (27) lead us to the formula
\[ \nabla_{\tilde y}\phi(0) = \begin{bmatrix}
0 & 0 & \cdots & 0 & \varphi^1_{y_{2n+1}}(0) \\
0 & 0 & \cdots & 0 & \varphi^2_{y_{2n+1}}(0) \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & 0 & \varphi^k_{y_{2n+1}}(0) \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & 0 \\
0 & 0 & \cdots & 0 & 1
\end{bmatrix}, \tag{31} \]
where $\nabla_{\tilde y}\phi(0)$ is a $(2n+1) \times p$ matrix whose $p \times p$ lower block is the identity matrix. Notice that columns of (31) represent a basis of the tangent space $T_0\Sigma$ with respect to coordinates $(y_{k+1}, \dots, y_{2n+1})$. More precisely, the set of vectors
\[ \left( Y_{k+1}(0),\ Y_{k+2}(0),\ \dots,\ Y_{2n}(0),\ \frac{Z(0) + \sum_{j=1}^{k} v_j Y_j(0)}{\bigl( 1 + \sum_{j=1}^{k} v_j^2 \bigr)^{1/2}} \right) \]
forms an orthonormal basis of $T_0\Sigma$, where we have defined $v_j = \varphi^j_{y_{2n+1}}(0)$. Then the tangent p-vector $\tau_\Sigma$ to $\Sigma$ at $0$ is given by the wedge product
\[ \tau_\Sigma(0) = \frac{Y_{k+1}(0) \wedge Y_{k+2}(0) \wedge \cdots \wedge Y_{2n}(0) \wedge \bigl( Z(0) + \sum_{j=1}^{k} v_j Y_j(0) \bigr)}{\bigl( 1 + \sum_{j=1}^{k} v_j^2 \bigr)^{1/2}} . \]
Obviously, the p-vectors $Y_{k+1}(0) \wedge Y_{k+2}(0) \wedge \cdots \wedge Y_{2n}(0) \wedge Y_j(0)$ are horizontal, hence they disappear in the vertical projection. It follows that
\[ \tau_{\Sigma,V}(0) = \pi_V(\tau_\Sigma(0)) = \frac{Y_{k+1}(0) \wedge Y_{k+2}(0) \wedge \cdots \wedge Y_{2n}(0) \wedge Z(0)}{\bigl( 1 + \sum_{j=1}^{k} v_j^2 \bigr)^{1/2}} , \]
therefore we clearly obtain
\[ |\tau_{\Sigma,V}(0)| = \Bigl( 1 + \sum_{l=1}^{k} v_l^2 \Bigr)^{-1/2} . \tag{32} \]
Due to formula (30) in the case $j = 2n+1$ and to (25), we obtain
\[ 1 + \sum_{l=1}^{k} v_l^2 = \frac{\bigl( M_{1\cdots k}(\nabla_z f) \bigr)^2 + \sum_{s=1}^{k} \bigl( M_{12\cdots s-1\; 2n+1\; s+1\cdots k}(\nabla_y f) \bigr)^2}{\bigl( M_{1\cdots k}(\nabla_z f) \bigr)^2} = \left( \frac{J_g f(0)}{J_H f(0)} \right)^2 , \]
then (32) shows the validity of (22) in the case $x = 0$. Left invariance of the Riemannian metric $g$ leads us to the conclusion.

Definition 3.4 (Metric factor). Let $\tau$ be a vertical simple p-vector of $\Lambda_p(\mathfrak{h}_n)$ and let $\mathcal{L}(\tau)$ be the unique associated subspace, with $L = \exp \mathcal{L}(\tau)$. The metric factor of a homogeneous distance $\rho$ with respect to $\tau$ is defined by
\[ \theta^\rho_p(\tau) = \mathcal{H}^p_{|\cdot|}\bigl( F^{-1}(L \cap B_1) \bigr), \]
where $F : \mathbb{R}^{2n+1} \to \mathbb{H}^n$ defines a system of graded coordinates, $\mathcal{H}^p_{|\cdot|}$ denotes the p-dimensional Hausdorff measure with respect to the Euclidean distance of $\mathbb{R}^{2n+1}$ and $B_1$ is the unit ball of $\mathbb{H}^n$ with respect to the distance $\rho$. Recall that the subspace associated to a simple p-vector $\tau$ is defined as $\{ v \in \mathfrak{h}_n \mid v \wedge \tau = 0 \}$.
Remark 3.5. In the case of subspaces L of codimension one, the notion of metric factor fits into the one introduced in [19]. It is easy to observe that the notion of metric factor does not depend on the system of coordinates we are using. In fact, F1−1 ◦ F2 : R2n+1 −→ R2n+1 is an Euclidean isometry whenever F1 , F2 : R2n+1 −→ Hn represent systems of graded coordinates with respect to the same left invariant Riemannian metric.
Proof (of Theorem 1.1). As in the proof of Theorem 3.3, left invariance of the Riemannian metric $g$ allows us to assume that $x = 0$. For $r_0 > 0$ sufficiently small, we can suppose the existence of a function $f : B_{r_0} \to \mathbb{R}^k$ such that $\Sigma \cap B_{r_0} = f^{-1}(0)$ and whose differential is surjective at every point of $B_{r_0}$. By Proposition 3.2, the horizontal gradients
\[ \nabla_H f^i = \bigl( X_1 f^i(0), X_2 f^i(0), \dots, X_{2n} f^i(0) \bigr), \qquad i = 1, 2, \dots, k, \]
span a $k$-dimensional subspace of $\mathbb{R}^{2n}$. Now, repeating the argument in the proof of Theorem 3.3, we define the system of graded coordinates $(y_1, \dots, y_{2n+1})$ associated to the frame $(Y_1, \dots, Y_{2n}, Z)$, where $Y_j$ are given by (23). The differential of $f$ at $0$ can be represented by the matrix
\[ \nabla_y f(0) = \begin{bmatrix}
f^1_{y_1}(0) & f^1_{y_2}(0) & \cdots & f^1_{y_k}(0) & 0 & \cdots & 0 & f^1_{y_{2n+1}}(0) \\
f^2_{y_1}(0) & f^2_{y_2}(0) & \cdots & f^2_{y_k}(0) & 0 & \cdots & 0 & f^2_{y_{2n+1}}(0) \\
\vdots & \vdots & & \vdots & \vdots & & \vdots & \vdots \\
f^k_{y_1}(0) & f^k_{y_2}(0) & \cdots & f^k_{y_k}(0) & 0 & \cdots & 0 & f^k_{y_{2n+1}}(0)
\end{bmatrix}, \tag{33} \]
whose first $k$ columns are linearly independent. By the implicit function theorem there exists a $C^1$ mapping $\varphi : A \to \mathbb{R}^k$ such that $A \subset \mathbb{R}^p$ is an open neighbourhood of the origin and
\[ f\bigl( \varphi^1(\tilde y), \dots, \varphi^k(\tilde y), y_{k+1}, \dots, y_{2n+1} \bigr) = 0 \tag{34} \]
for every $\tilde y = (y_{k+1}, \dots, y_{2n+1}) \in A$. Proceeding as in the proof of Theorem 3.3, we define the mapping $\phi : A \to \mathbb{R}^{2n+1}$ as
\[ \phi(\tilde y) = \bigl( \varphi^1(\tilde y), \dots, \varphi^k(\tilde y), y_{k+1}, \dots, y_{2n+1} \bigr), \tag{35} \]
and by the same computations, differentiating (34) we obtain
\[ \nabla_{\tilde y}\phi(0) = \begin{bmatrix}
0 & 0 & \cdots & 0 & \varphi^1_{y_{2n+1}}(0) \\
0 & 0 & \cdots & 0 & \varphi^2_{y_{2n+1}}(0) \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & 0 & \varphi^k_{y_{2n+1}}(0) \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & 0 \\
0 & 0 & \cdots & 0 & 1
\end{bmatrix}, \tag{36} \]
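where $\nabla_{\tilde y}\phi(0)$ is a $(2n+1) \times p$ matrix whose $p \times p$ lower block is the identity matrix. For each $r < r_0$, write the ball $B_r$ in terms of graded coordinates, defining $\tilde B_r = F^{-1}(B_r) \subset \mathbb{R}^{2n+1}$. The surface $\Sigma$ read in graded coordinates can be seen as the image of $\phi$. Then we have established
\[ \frac{\mathrm{vol}_p(\Sigma \cap B_r)}{r^{p+1}} = r^{-1-p} \int_{\phi^{-1}(\tilde B_r)} J_g\phi(\tilde y)\, d\tilde y . \tag{37} \]
The dilation $\delta_r$ restricted to coordinates $(y_{k+1}, \dots, y_{2n+1})$ gives
\[ \delta_r \tilde y = \delta_r\bigl( (y_{k+1}, \dots, y_{2n+1}) \bigr) = (r y_{k+1}, r y_{k+2}, \dots, r y_{2n}, r^2 y_{2n+1}), \tag{38} \]
therefore, performing a change of variable in (37) we get
\[ \frac{\mathrm{vol}_p(\Sigma \cap B_{x,r})}{r^{p+1}} = \int_{\delta_{1/r}(\phi^{-1}(\tilde B_r))} J_g\phi(\delta_r \tilde y)\, d\tilde y . \tag{39} \]
The set $\delta_{1/r}(\phi^{-1}(\tilde B_r))$ can be written as follows
\[ (\delta_{1/r} \circ \phi \circ \delta_r)^{-1}(\tilde B_1) = \left\{ \tilde y \in \mathbb{R}^p \ \Big|\ \Bigl( \frac{\varphi^1(\delta_r \tilde y)}{r}, \dots, \frac{\varphi^k(\delta_r \tilde y)}{r}, y_{k+1}, \dots, y_{2n+1} \Bigr) \in \tilde B_1 \right\}. \tag{40} \]
From expressions (36) and (38) one easily gets that
\[ \lim_{r \to 0^+} \frac{\varphi^j(\delta_r \tilde y)}{r} = 0 \tag{41} \]
for every $j = 1, \dots, k$. As a result, the limit
\[ \mathbf{1}_{\delta_{1/r}(\phi^{-1}(\tilde B_r))} \longrightarrow \mathbf{1}_{\tilde B_1 \cap \Pi} \quad \text{as } r \to 0^+ \tag{42} \]
holds a.e. in $\mathbb{R}^p$, where we have defined
\[ \Pi = \bigl\{ (0, \dots, 0, y_{k+1}, \dots, y_{2n+1}) \in \mathbb{R}^{2n+1} \mid y_j \in \mathbb{R},\ j = k+1, \dots, 2n+1 \bigr\}. \]
From (39), we conclude that
\[ \lim_{r \to 0^+} \frac{\mathrm{vol}_p(\Sigma \cap B_{x,r})}{r^{p+1}} = J_g\phi(0)\, \mathcal{H}^p(\Pi \cap \tilde B_1). \tag{43} \]
To compute $J_g\phi(0)$, we use both the canonical form of the tangent space $T_0\Sigma$ given by (36) and the fact that our frame $(Y_1, \dots, Y_{2n}, Z)$ is orthonormal. Thus, according to Definition 2.4 the Riemannian jacobian of $\phi$ at zero is given by
\[ J_g\phi(0) = \Bigl( 1 + \sum_{j=1}^{k} v_j^2 \Bigr)^{1/2}, \tag{44} \]
where we have defined $v_j = \varphi^j_{y_{2n+1}}(0)$ for every $j = 1, \dots, k$. Again, following the same steps of the proof of Theorem 3.3, we get
\[ |\tau_{\Sigma,V}(0)| = \Bigl( 1 + \sum_{l=1}^{k} v_l^2 \Bigr)^{-1/2} = \bigl( J_g\phi(0) \bigr)^{-1} . \]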
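Then (43) yields
\[ \lim_{r \to 0^+} \frac{\mathrm{vol}_p(\Sigma \cap B_r)}{r^{p+1}} = \frac{\mathcal{H}^p(\Pi \cap \tilde B_1)}{|\tau_{\Sigma,V}(0)|} . \tag{45} \]
The subspace $\mathcal{L}(\tau_{\Sigma,V}(x))$ associated to the p-vector $\tau_{\Sigma,V}(x)$ satisfies the relation
\[ \exp \mathcal{L}(\tau_{\Sigma,V}(x)) = F(\Pi), \]
therefore the metric factor of $\rho$ with respect to $\tau_{\Sigma,V}(x)$ is $\mathcal{H}^p(\Pi \cap \tilde B_1)$. This fact along with (45) implies the validity of (1) and ends the proof.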
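The key analytic step in the blow-up is the limit (41): under the anisotropic dilation (38), an implicit map that vanishes at the origin together with its horizontal derivatives is flattened out after division by $r$. The sketch below illustrates this on a made-up example with $n = 1$, $k = 1$, $p = 2$; the function `phi` is a hypothetical implicit map chosen only so that it satisfies these two conditions, and is not taken from the paper.

```python
def dilate(y, r):
    """Anisotropic dilation: horizontal coordinates scale by r,
    the last (vertical) coordinate scales by r^2, as in (38)."""
    *horizontal, vertical = y
    return [r * c for c in horizontal] + [r * r * vertical]


def phi(y):
    """Hypothetical implicit map of (y2, y3): phi(0) = 0 and the derivative
    in the horizontal variable y2 vanishes at 0; the derivative in the
    vertical variable y3 (here 0.7) is allowed to be nonzero."""
    y2, y3 = y
    return 0.7 * y3 + y2 * y2 - 0.3 * y2 * y3


def rescaled(y, r):
    """The quantity phi(delta_r y) / r whose limit as r -> 0+ is taken in (41)."""
    return phi(dilate(y, r)) / r


if __name__ == "__main__":
    for r in (1e-1, 1e-2, 1e-3):
        print(r, rescaled([1.0, -2.0], r))
```

The vertical derivative does not need to vanish because the vertical variable is scaled by $r^2$, which is what makes the limit set in (42) the full plane $\Pi$ intersected with the unit ball.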
4 Spherical Hausdorff measure of submanifolds
This section deals with various applications of Theorem 1.2. A key result to obtain this theorem is the $\mathcal{S}^{Q-k}$-negligibility of characteristic points of a $k$-codimensional submanifold of a Carnot group of Hausdorff dimension $Q$, see [22]. This result in the case of Heisenberg groups reads as follows.

Theorem 4.1. Let $\Sigma \subset \Omega$ be a $C^1$ submanifold of dimension $p$. Then the set of characteristic points $C(\Sigma)$ is $\mathcal{S}^{p+1}$-negligible.

Remark 4.2. In order to apply the negligibility result of [22] one has to check that the notion of characteristic point in arbitrary stratified groups coincides with our definition stated in the Heisenberg group. According to [22] a point $x \in \Sigma$ is characteristic if
\[ \dim(H_x\mathbb{H}^n) - \dim(T_x\Sigma \cap H_x\mathbb{H}^n) \le k - 1 . \tag{46} \]
If $x$ is characteristic according to Definition 2.12, then $\dim(T_x\Sigma \cap H_x\mathbb{H}^n) = p = 2n + 1 - k$ and (46) holds. Conversely, if (46) holds, then $p = \dim(T_x\Sigma) = 2n - k + 1 \le \dim(T_x\Sigma \cap H_x\mathbb{H}^n)$, hence $T_x\Sigma \subset H_x\mathbb{H}^n$.

Corollary 4.3. Let $\Sigma \subset \Omega$ be a $C^1$ submanifold of dimension $p$. Then we have
\[ \int_\Sigma \theta(\tau_{\Sigma,V}(x))\, d\mathcal{S}^{p+1}(x) = \int_\Sigma |\tau_{\Sigma,V}(x)|\, d\mathrm{vol}_p(x) . \tag{47} \]

Proof. We apply Theorem 2.10.17(2) and Theorem 2.10.18(1) of [5], hence from limit (1) and Theorem 4.1 the proof follows by a standard argument.
Remark 4.4. The proof of Theorem 1.2 immediately follows from (47).

Next, we present a class of homogeneous distances in the Heisenberg group which possess constant metric factor. The standard system of graded coordinates $F : \mathbb{R}^{2n+1} \to \mathbb{H}^n$ induced by $(X_1, \dots, X_{2n}, Z)$ will be understood in the sequel. To simplify notation we will write $x = F(\tilde x, x_{2n+1}) \in \mathbb{H}^n$, with $\tilde x = (x_1, \dots, x_{2n}) \in \mathbb{R}^{2n}$.

Proposition 4.5. Let $F : \mathbb{R}^{2n+1} \to \mathbb{H}^n$ define standard coordinates and let $\rho$ be a homogeneous distance of $\mathbb{H}^n$ such that $\rho(0, F(\cdot)) : \mathbb{R}^{2n+1} \to \mathbb{R}$ only depends on $(|\tilde x|, x_{2n+1})$. Then $\theta^\rho_p(\tau) = \theta^\rho_p(\tilde\tau)$ whenever $\tau, \tilde\tau$ are vertical simple p-vectors.

Proof. Let $\tau = U_1 \wedge \cdots \wedge U_{p-1} \wedge Z$ and $\tilde\tau = W_1 \wedge \cdots \wedge W_{p-1} \wedge Z$ be vertical simple p-vectors, where it is not restrictive to assume that both $(U_1, \dots, U_{p-1}, Z)$ and $(W_1, \dots, W_{p-1}, Z)$ are orthonormal systems of $\mathfrak{h}_{2n+1}$. Then we easily find an isometry $J : \mathfrak{h}_{2n+1} \to \mathfrak{h}_{2n+1}$ such that $J(\mathcal{L}(\tau)) = \mathcal{L}(\tilde\tau)$ and $J(Z) = Z$. Recall that our graded coordinates are defined by $F = \exp \circ I$, where $I : \mathbb{R}^{2n+1} \to \mathfrak{h}_{2n+1}$ is an isometry such that
\[ I(x_1, \dots, x_{2n+1}) = x_{2n+1} Z + \sum_{j=1}^{2n} x_j X_j \]
for every $(x_1, \dots, x_{2n+1}) \in \mathbb{R}^{2n+1}$. Thus, defining $\tilde B_1 = F^{-1}(B_1) \subset \mathbb{R}^{2n+1}$, we have
\[ F^{-1}\bigl( \exp \mathcal{L}(\tilde\tau) \cap B_1 \bigr) = I^{-1} \circ J(\mathcal{L}(\tau)) \cap \tilde B_1 = \varphi\bigl( I^{-1}(\mathcal{L}(\tau)) \bigr) \cap \tilde B_1 , \tag{48} \]
where $\varphi = I^{-1} \circ J \circ I : \mathbb{R}^{2n+1} \to \mathbb{R}^{2n+1}$ is a Euclidean isometry such that $\varphi(e_{2n+1}) = e_{2n+1}$ and $e_{2n+1}$ is the $(2n+1)$-th vector of the canonical basis of $\mathbb{R}^{2n+1}$. Then $|\tilde x| = |\tilde y|$ whenever $\varphi(\tilde x, t) = (\tilde y, t)$. As a result, the fact that $\rho(0, F(\tilde x, t))$ only depends on $(|\tilde x|, t)$ easily implies that $\varphi(\tilde B_1) = \tilde B_1$. Thus, due to (48), it follows that
\[ \theta^\rho_p(\tau) = \mathcal{H}^p_{|\cdot|}\bigl( I^{-1}(\mathcal{L}(\tau)) \cap \tilde B_1 \bigr) = \mathcal{H}^p_{|\cdot|}\bigl( \varphi( I^{-1}(\mathcal{L}(\tau)) ) \cap \tilde B_1 \bigr) = \theta^\rho_p(\tilde\tau) . \tag{49} \]
This ends the proof.
Example 4.6. An example of a homogeneous distance satisfying the hypotheses of Proposition 4.5 is the gauge distance, also called Korányi distance, [15]. The gauge distance from $x$ to the origin is given by
\[ d(x, 0) = \bigl( |\tilde x|^4 + 16\, x_{2n+1}^2 \bigr)^{1/4} , \]
where $x = (\tilde x, x_{2n+1})$. Then we define $d(x, y) = d(0, x^{-1} y)$, for any $x, y \in \mathbb{H}^n$. Another example of a homogeneous distance with this property is the “maximum distance”, defined by
\[ d_\infty(x, 0) = \max\bigl\{ |\tilde x|,\ |x_{2n+1}|^{1/2} \bigr\} . \]
Due to Proposition 4.5, both of these distances have constant metric factor.
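Both distances of Example 4.6 are homogeneous of degree one under the dilations $\delta_r(\tilde x, x_{2n+1}) = (r\tilde x, r^2 x_{2n+1})$ and depend on the horizontal part only through $|\tilde x|$, which is the hypothesis of Proposition 4.5. A minimal numerical sketch of these two properties follows; the function names `dilate`, `gauge` and `d_max` are illustrative choices, not notation from the paper.

```python
import math


def dilate(x, r):
    """Heisenberg dilation in standard coordinates: (r * x~, r^2 * x_{2n+1})."""
    *horizontal, vertical = x
    return [r * c for c in horizontal] + [r * r * vertical]


def gauge(x):
    """Gauge (Koranyi) distance from x to the origin."""
    *horizontal, vertical = x
    norm_sq = sum(c * c for c in horizontal)
    return (norm_sq ** 2 + 16.0 * vertical ** 2) ** 0.25


def d_max(x):
    """'Maximum distance' from x to the origin."""
    *horizontal, vertical = x
    norm = math.sqrt(sum(c * c for c in horizontal))
    return max(norm, math.sqrt(abs(vertical)))


if __name__ == "__main__":
    x = [1.0, 2.0, 0.5]  # a point of H^1 in standard coordinates
    r = 3.0
    print(gauge(dilate(x, r)), r * gauge(x))
    print(d_max(dilate(x, r)), r * d_max(x))
```

Swapping or rotating the horizontal coordinates leaves both values unchanged, since only $|\tilde x|$ enters the formulas.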
Remark 4.7. The metric factor depends on the Riemannian metric $g$ we have fixed. Furthermore, if we divide it by the volume of the unit ball $B_1$ (or any other fixed subset of positive measure), then we obtain a number only depending on the restriction of the Riemannian metric to $H\mathbb{H}^n$.

Lemma 4.8. Let $\tilde g$ be a left invariant Riemannian metric such that $\tilde g|_{H\mathbb{H}^n} = g|_{H\mathbb{H}^n}$. Then
\[ \theta^\rho_p(\tau) / \mathrm{vol}_{2n+1}(B_1) = \tilde\theta^\rho_p(\tau) / \widetilde{\mathrm{vol}}_{2n+1}(B_1) \]
for any simple vertical p-vector, where $\widetilde{\mathrm{vol}}_{2n+1}$ and $\tilde\theta^\rho_p(\tau)$ are defined with respect to the metric $\tilde g$.

Proof. By hypothesis, we can choose an orthonormal frame $(X_1, X_2, \dots, X_{2n}, W)$ with respect to $\tilde g$, where $W = \lambda Z + \sum_{j=1}^{2n} a_j X_j$ and $\lambda \neq 0$. Let $F, \tilde F : \mathbb{R}^{2n+1} \to \mathbb{H}^n$ represent systems of coordinates with respect to the standard basis $(X_1, X_2, \dots, X_{2n}, Z)$ and $(X_1, X_2, \dots, X_{2n}, W)$, respectively. We have $\tilde F = F \circ T$, where $T : \mathbb{R}^{2n+1} \to \mathbb{R}^{2n+1}$ is given by the matrix
\[ A = \begin{bmatrix}
1 & 0 & \cdots & 0 & a_1 \\
0 & 1 & \cdots & 0 & a_2 \\
\vdots & & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & a_{2n} \\
0 & 0 & \cdots & 0 & \lambda
\end{bmatrix}. \tag{50} \]
By Proposition 2.7, it follows that
\[ \widetilde{\mathrm{vol}}_{2n+1} = \tilde F_\sharp \mathcal{L}^{2n+1} = |\det A|^{-1} F_\sharp \mathcal{L}^{2n+1} = |\lambda|^{-1}\, \mathrm{vol}_{2n+1} . \tag{51} \]
Let $\tau = U_1 \wedge \cdots \wedge U_{p-1} \wedge Z$, where it is not restrictive to assume that the horizontal vectors $U_1, \dots, U_{p-1}$ are orthonormal with respect to both $g$ and $\tilde g$. We denote by $L$ the subspace $\mathrm{span}\{U_1, \dots, U_{p-1}, Z\}$ of $\mathfrak{h}_n$. Recall that
\[ \tilde\theta^\rho_p(\tau) = \mathcal{H}^p_{|\cdot|}\bigl( \tilde F^{-1}(\exp(L) \cap B_1) \bigr) = \mathcal{H}^p_{|\cdot|}\bigl( T^{-1}( F^{-1}(\exp(L) \cap B_1) ) \bigr) . \tag{52} \]
Now we wish to determine a basis of the subspace $F^{-1}(\exp(L)) \subset \mathbb{R}^{2n+1}$. The relations $U_i = \sum_{j=1}^{2n} c_i^j X_j$ give rise to $p - 1$ orthonormal vectors $c_1, \dots, c_{p-1} \in \mathbb{R}^{2n} \times \{0\}$ with respect to the Euclidean scalar product such that
\[ \mathrm{span}\{c_1, \dots, c_{p-1}, e_{2n+1}\} = F^{-1}(\exp(L)) \subset \mathbb{R}^{2n+1} . \tag{53} \]
In order to compute the Euclidean jacobian of $T^{-1}$ restricted to the p-dimensional subspace $\mathrm{span}\{c_1, \dots, c_{p-1}, e_{2n+1}\} \subset \mathbb{R}^{2n+1}$ we write the matrix
\[ A^{-1} = \begin{bmatrix}
1 & 0 & \cdots & 0 & -\lambda^{-1} a_1 \\
0 & 1 & \cdots & 0 & -\lambda^{-1} a_2 \\
\vdots & & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & -\lambda^{-1} a_{2n} \\
0 & 0 & \cdots & 0 & \lambda^{-1}
\end{bmatrix}, \tag{54} \]
noting that $T^{-1} c_j = c_j$ for every $j = 1, \dots, p-1$ and $T^{-1}(e_{2n+1}) = \lambda^{-1} e_{2n+1}$. As a result, the jacobian of $(T^{-1})| : \mathrm{span}\{c_1, \dots, c_{p-1}, e_{2n+1}\} \to \mathbb{R}^{2n+1}$ is $|\lambda|^{-1}$, hence
\[ \tilde\theta^\rho_p(\tau) = \mathcal{H}^p_{|\cdot|}\bigl( T^{-1}( F^{-1}(\exp(L) \cap B_1) ) \bigr) = |\lambda|^{-1}\, \mathcal{H}^p_{|\cdot|}\bigl( F^{-1}(\exp(L) \cap B_1) \bigr) = |\lambda|^{-1}\, \theta^\rho_p(\tau) . \]
Joining (51) and the previous equalities, our claim follows.
Theorem 4.9. Let $\Sigma$ be a p-dimensional submanifold of $\Omega$ and let $\tilde g$ be a left invariant metric such that $\tilde g|_{H\mathbb{H}^n} = g|_{H\mathbb{H}^n}$. Then
\[ \frac{1}{\widetilde{\mathrm{vol}}_{2n+1}(B_1)} \int_\Sigma |\tilde\tau_{\Sigma,V}(x)|\, d\widetilde{\mathrm{vol}}_p(x) = \frac{1}{\mathrm{vol}_{2n+1}(B_1)} \int_\Sigma |\tau_{\Sigma,V}(x)|\, d\mathrm{vol}_p(x) . \tag{55} \]
The previous theorem is an immediate consequence of Corollary 4.3 and Lemma 4.8. Next, we apply (2) to compute the spherical Hausdorff measure of some submanifolds. We will use the following proposition.

Proposition 4.10. Let $\phi : U \to \mathbb{R}^{2n+1}$ be a $C^1$ embedding, where $U \subset \mathbb{R}^p$ is a bounded open set. Let $F : \mathbb{R}^{2n+1} \to \mathbb{H}^n$ define standard coordinates and set $\Phi = F \circ \phi : U \to \mathbb{H}^n$, where $\Sigma = \Phi(U)$. Let $\rho$ be a homogeneous distance with constant metric factor $\alpha > 0$. Then we have
\[ \mathcal{S}^{p+1}_{\mathbb{H}^n}(\Sigma) = \int_U \bigl| \pi_V\bigl( \Phi_{u_1}(u) \wedge \Phi_{u_2}(u) \wedge \cdots \wedge \Phi_{u_p}(u) \bigr) \bigr|\, du , \tag{56} \]
for every measurable set $A \subset \mathbb{H}^n$, where the norm $|\cdot|$ is induced by the scalar product (17) on p-vectors.
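Proof. By definition of Riemannian volume, formula (2) can be written with respect to $\phi$ as
\[ \mathcal{S}^{p+1}_{\mathbb{H}^n}(\Sigma) = \int_U |\tau_{\Sigma,V}(\Phi(u))|\, \sqrt{ \det\bigl( g(\Phi(u))(\Phi_{u_i}(u), \Phi_{u_j}(u)) \bigr) }\; du , \tag{57} \]
where we have
\[ \bigl| \Phi_{u_1}(u) \wedge \Phi_{u_2}(u) \wedge \cdots \wedge \Phi_{u_p}(u) \bigr| = \sqrt{ \det\bigl( g(\Phi(u))(\Phi_{u_i}(u), \Phi_{u_j}(u)) \bigr) } . \tag{58} \]
Therefore, taking into account the formula
\[ \tau_\Sigma(\Phi(u)) = \frac{\Phi_{u_1}(u) \wedge \Phi_{u_2}(u) \wedge \cdots \wedge \Phi_{u_p}(u)}{\bigl| \Phi_{u_1}(u) \wedge \Phi_{u_2}(u) \wedge \cdots \wedge \Phi_{u_p}(u) \bigr|} , \tag{59} \]
the definition of vertical tangent p-vector $\pi_V(\tau_\Sigma) = \tau_{\Sigma,V}$ and joining (57), (58) and (59), formula (56) follows.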
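Example 4.11. Let $\phi : \mathbb{R}^3 \to \mathbb{R}^5$ be defined by $\phi(u) = \bigl( u_1, u_2, u_3, 0, \tfrac{u_1^2 + u_2^2 + u_3^2}{2} \bigr)$. The mapping $\phi$ parametrizes a 3-dimensional paraboloid $\Sigma = \Phi(U)$, where $U$ is an open bounded set of $\mathbb{R}^3$, $\Phi = F \circ \phi$ and $F : \mathbb{R}^5 \to \mathbb{H}^2$ represents standard coordinates, according to Definition 2.2. Using expressions (7), we have
\[ \phi_{u_1}(u) = \tilde X_1(\phi(u)) + (\phi^3(u) + u_1)\, \tilde T(\phi(u)), \]
\[ \phi_{u_2}(u) = \tilde X_2(\phi(u)) + (\phi^4(u) + u_2)\, \tilde T(\phi(u)), \]
\[ \phi_{u_3}(u) = \tilde X_3(\phi(u)) + (u_3 - \phi^1(u))\, \tilde T(\phi(u)). \]
Observing that for every $j = 1, \dots, 2n$, we have
\[ dF(\phi(u))\, \tilde X_j(\phi(u)) = X_j(\Phi(u)) \in H_{\Phi(u)}\mathbb{H}^n \quad \text{and} \quad dF(\phi(u))\, \tilde Z(\phi(u)) = Z(\Phi(u)) \in T_{\Phi(u)}\mathbb{H}^n , \]
hence we obtain
\[ \Phi_{u_1}(u) = X_1(\Phi(u)) + (u_3 + u_1)\, T(\Phi(u)), \]
\[ \Phi_{u_2}(u) = X_2(\Phi(u)) + u_2\, T(\Phi(u)), \]
\[ \Phi_{u_3}(u) = X_3(\Phi(u)) + (u_3 - u_1)\, T(\Phi(u)). \]
Thus, we can compute
\[ \pi_V(\Phi_{u_1} \wedge \Phi_{u_2} \wedge \Phi_{u_3}) = (u_3 - u_1)\, X_1 \wedge X_2 \wedge T - u_2\, X_1 \wedge X_3 \wedge T + (u_3 + u_1)\, X_2 \wedge X_3 \wedge T , \]
hence formula (56) yields
\[ \mathcal{S}^4_{\mathbb{H}^2}(\Sigma) = \int_U \sqrt{ u_2^2 + 2 (u_3^2 + u_1^2) }\; du . \]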
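The vertical projection used in Example 4.11 can be checked mechanically. With respect to the orthonormal frame $(X_1, \dots, X_{2n}, T)$, the coefficients of $v_1 \wedge \cdots \wedge v_p$ on the basis p-vectors are the $p \times p$ minors of the coordinate matrix, and the vertical part collects exactly the minors that involve the last column. The sketch below relies on that standard fact and on the scalar product (17); `det` and `vertical_norm` are hypothetical helper names.

```python
import math
from itertools import combinations


def det(matrix):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for j, entry in enumerate(matrix[0]):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def vertical_norm(vectors):
    """Norm of the vertical projection of v1 ^ ... ^ vp.

    Each vector is given by its 2n+1 coordinates in the orthonormal frame
    (X1, ..., X2n, T); the last coordinate is the T-component.
    """
    p = len(vectors)
    dim = len(vectors[0])
    last = dim - 1
    total = 0.0
    for cols in combinations(range(dim), p):
        if last in cols:  # basis p-vectors containing T are the vertical ones
            minor = det([[v[c] for c in cols] for v in vectors])
            total += minor * minor
    return math.sqrt(total)


def example_4_11_density(u1, u2, u3):
    """Integrand of Example 4.11 computed from the tangent vectors Phi_{u_i}."""
    tangent = [
        [1.0, 0.0, 0.0, 0.0, u3 + u1],
        [0.0, 1.0, 0.0, 0.0, u2],
        [0.0, 0.0, 1.0, 0.0, u3 - u1],
    ]
    return vertical_norm(tangent)


if __name__ == "__main__":
    u = (0.3, -1.2, 2.0)
    print(example_4_11_density(*u), math.sqrt(u[1] ** 2 + 2.0 * (u[2] ** 2 + u[0] ** 2)))
```

The same helper applies to the two-dimensional examples that follow by passing two tangent vectors with three coordinates each.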
Example 4.12. Let $\phi : \mathbb{R}^2 \to \mathbb{R}^3$, $\phi(u_1, u_2) = (a_1 u_1, a_2 u_2, b u_1 + c u_2)$, define a hyperplane in $\mathbb{R}^3$, where $a_1, a_2, b, c \in \mathbb{R}$ and $(J\phi)^2 = a_1^2 a_2^2 + a_1^2 c^2 + a_2^2 b^2 > 0$. Embedding the hyperplane in $\mathbb{H}^1$ through standard coordinates $F : \mathbb{R}^3 \to \mathbb{H}^1$, we obtain
\[ \Phi_{u_1}(u) = a_1 X_1(\Phi(u)) + (a_1 a_2 u_2 + b)\, T(\Phi(u)), \qquad \Phi_{u_2}(u) = a_2 X_2(\Phi(u)) + (c - a_1 a_2 u_1)\, T(\Phi(u)), \]
where $\Phi = F \circ \phi$. Then we get
\[ \pi_V(\Phi_{u_1}(u) \wedge \Phi_{u_2}(u)) = a_1 (c - a_1 a_2 u_1)\, X_1 \wedge T - (a_1 a_2 u_2 + b)\, a_2\, X_2 \wedge T \]
and formula (56) yields
\[ \mathcal{S}^3_{\mathbb{H}^1}(\Pi) = \int_U \sqrt{ a_1^2 (c - a_1 a_2 u_1)^2 + a_2^2 (a_1 a_2 u_2 + b)^2 }\; du , \tag{60} \]
where $\Pi = \Phi(U)$ and $U$ is an open bounded set of $\mathbb{R}^2$.
Example 4.13. Let $\phi : \mathbb{R}^2 \to \mathbb{R}^3$, $\phi(u_1, u_2) = \bigl( u_1, u_2, \tfrac{u_1^2 + u_2^2}{2} \bigr)$, define a paraboloid in $\mathbb{R}^3$. By standard coordinates $F : \mathbb{R}^3 \to \mathbb{H}^1$ and arguing as in the previous examples, we have
\[ \Phi_{u_1}(u) = X_1(\Phi(u)) + (u_2 + u_1)\, T(\Phi(u)) \quad \text{and} \quad \Phi_{u_2}(u) = X_2(\Phi(u)) + (u_2 - u_1)\, T(\Phi(u)) , \]
where $\Phi = F \circ \phi$. It follows that
\[ \pi_V(\Phi_{u_1}(u) \wedge \Phi_{u_2}(u)) = (u_2 - u_1)\, X_1 \wedge T - (u_2 + u_1)\, X_2 \wedge T \]
and formula (56) yields
\[ \mathcal{S}^3_{\mathbb{H}^1}(P) = \int_U \sqrt{ 2 u_1^2 + 2 u_2^2 }\; du , \tag{61} \]
where $P = \Phi(U)$ and $U$ is an open bounded set of $\mathbb{R}^2$.

Remark 4.14. It is curious to notice that the density of $\mathcal{S}^3_{\mathbb{H}^1}$ restricted to the paraboloid $P$, computed in (61), is proportional to the density of $\mathcal{S}^3_{\mathbb{H}^1}$ restricted to the horizontal projection of $P$ onto the plane $F(\{(x_1, x_2, x_3) \mid x_3 = 0\}) \subset \mathbb{H}^1$, whose density is given by (60) in the case $a_1 = a_2 = 1$ and $b = c = 0$.

Example 4.15. From the computations of Example 4.12, one can get the 2-dimensional spherical Hausdorff measure of the line $\Phi(t) = F(at, 0, bt)$ defined on an interval $[\alpha, \beta]$. We have
\[ \Phi'(t) = a X_1(\Phi(t)) + b\, T(\Phi(t)) \quad \text{and} \quad \pi_V(\Phi'(t)) = b\, T(\Phi(t)) , \tag{62} \]
then defining the submanifold $L = \Phi([\alpha, \beta])$, the formula $\mathcal{S}^2_{\mathbb{H}^1}(L) = |b|\, (\beta - \alpha)$ holds.
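The paraboloid of Example 4.13 also gives a concrete check of formula (22). It is the zero level set of $f(x) = x_3 - (x_1^2 + x_2^2)/2$, so the ratio $|\pi_V(\Phi_{u_1} \wedge \Phi_{u_2})| / |\Phi_{u_1} \wedge \Phi_{u_2}|$ should agree with $J_H f / J_g f$. The sketch below assumes the frame $X_1 = \partial_1 - x_2 \partial_3$, $X_2 = \partial_2 + x_1 \partial_3$, $T = \partial_3$ and, for $k = 1$, takes $J_H f$ and $J_g f$ to be the norms of the horizontal and of the full gradient in this orthonormal frame. All function names are hypothetical.

```python
import math


def wedge2(v, w):
    """Coefficients of v ^ w on (X1^X2, X1^T, X2^T) for v, w in the frame (X1, X2, T)."""
    return (
        v[0] * w[1] - v[1] * w[0],
        v[0] * w[2] - v[2] * w[0],
        v[1] * w[2] - v[2] * w[1],
    )


def tangent_ratio(u1, u2):
    """|tau_{Sigma,V}| computed from the tangent vectors of Example 4.13."""
    phi_u1 = (1.0, 0.0, u2 + u1)
    phi_u2 = (0.0, 1.0, u2 - u1)
    horizontal, vert1, vert2 = wedge2(phi_u1, phi_u2)
    vertical = math.hypot(vert1, vert2)
    full = math.sqrt(horizontal ** 2 + vert1 ** 2 + vert2 ** 2)
    return vertical / full


def jacobian_ratio(u1, u2):
    """J_H f / J_g f for f(x) = x3 - (x1^2 + x2^2)/2 at the point (u1, u2, |u|^2/2)."""
    x1_f = -u1 - u2   # X1 f = d1 f - x2 d3 f
    x2_f = -u2 + u1   # X2 f = d2 f + x1 d3 f
    t_f = 1.0         # T f  = d3 f
    j_h = math.hypot(x1_f, x2_f)
    j_g = math.sqrt(x1_f ** 2 + x2_f ** 2 + t_f ** 2)
    return j_h / j_g


if __name__ == "__main__":
    print(tangent_ratio(0.4, -1.3), jacobian_ratio(0.4, -1.3))
```

Both ratios vanish at $u = 0$, the only characteristic point of the paraboloid, in agreement with Proposition 3.1.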
Another consequence of (2) is the lower semicontinuity of the spherical Hausdorff measure with respect to weak convergence of regular currents. To see this, it suffices to establish the following formula
\[ \mathcal{S}^{p+1}_{\mathbb{H}^n}(\Sigma) = \sup_{\omega \in \mathcal{F}^p_c(\Omega)} \int_\Sigma \langle \tau_{\Sigma,V}, \omega \rangle\, d\mathrm{vol}_p , \tag{63} \]
where $\mathcal{F}^p_c(\Omega)$ is the space of smooth p-forms with compact support in $\Omega$ with $|\omega| \le 1$. The norm of $\omega$ is defined by making the standard frame of forms $(dx_1, dx_2, \dots, dx_{2n}, \tilde\theta)$ orthonormal and extending this scalar product to p-forms exactly as we have seen in formula (17). The 1-form $\tilde\theta$ is the so-called contact form
\[ \tilde\theta = dx_{2n+1} + \sum_{j=1}^{n} \bigl( x_{j+n}\, dx_j - x_j\, dx_{j+n} \bigr) \tag{64} \]
written in standard coordinates. Note that $(dx_1, dx_2, \dots, dx_{2n}, \tilde\theta)$ is the dual basis of $(\tilde X_1, \dots, \tilde X_{2n}, \tilde Z)$. Formula (63) follows from (2) observing that
\[ \int_\Sigma |\tau_{\Sigma,V}|\, d\mathrm{vol}_p = \sup_{\omega \in \mathcal{F}^p_c(\Omega)} \int_\Sigma \langle \tau_{\Sigma,V}, \omega \rangle\, d\mathrm{vol}_p , \tag{65} \]
as one can check by standard arguments. As a consequence of these observations, we can establish the following proposition.

Proposition 4.16. Let $(\Sigma_m)$ be a sequence of $C^1$ submanifolds of $\Omega$ which weakly converges in the sense of currents to the $C^1$ submanifold $\Sigma$. Then
\[ \liminf_{m \to \infty} \mathcal{S}^{p+1}_{\mathbb{H}^n}(\Sigma_m) \ge \mathcal{S}^{p+1}_{\mathbb{H}^n}(\Sigma) . \tag{66} \]

Proof. By hypothesis
\[ \int_{\Sigma_m} \langle \tau_{\Sigma_m,V}, \omega \rangle\, d\mathrm{vol}_p \longrightarrow \int_\Sigma \langle \tau_{\Sigma,V}, \omega \rangle\, d\mathrm{vol}_p \tag{67} \]
for every $\omega \in \mathcal{F}^p_c(\Omega)$. Then (63) ends the proof.

Remark 4.17. The importance of (66) in studying versions of the Plateau problem with respect to the geometry of Heisenberg groups is clear.

Recall that the horizontal normal is the orthogonal projection of the normal $\nu(x)$ to $T_x\Sigma$ onto the horizontal subspace $H_x\mathbb{H}^n$. In the next proposition we show that in codimension one an explicit relationship can be established between the vertical tangent 2n-vector and the horizontal normal $\nu_H$.
Proposition 4.18. Let $\Sigma$ be a 2n-dimensional submanifold of class $C^1$ and let $\nu_H(x)$ be a horizontal normal at $x \in \Sigma$. Then we have
\[ \nu_H^j = (-1)^j\, \tau^j_{\Sigma,V} , \]
where $\nu_H = \sum_{j=1}^{2n} \nu_H^j X_j$ and $\tau_{\Sigma,V} = \sum_{j=1}^{2n} \tau^j_{\Sigma,V}\, X_1 \wedge \cdots \wedge X_{j-1} \wedge X_{j+1} \wedge \cdots \wedge X_{2n} \wedge Z$. In particular, the equality $|\tau_{\Sigma,V}| = |\nu_H|$ holds.

Proof. Let $(t_1, t_2, \dots, t_{2n})$ be an orthonormal basis of $T_x\Sigma$, where $x$ is a transverse point. Then
\[ t_j = \sum_{i=1}^{2n} c_j^i X_i(x) + c_j^{2n+1} Z(x) , \]
where $C = (c_j^i)$ is a $(2n+1) \times 2n$ matrix, whose columns are orthonormal vectors of $\mathbb{R}^{2n+1}$. Then we have
\[ \tau_\Sigma(x) = t_1 \wedge t_2 \wedge \cdots \wedge t_{2n} = \sum_{j=1}^{2n+1} \det(\hat C^j)\, X_1 \wedge X_2 \wedge \cdots \wedge X_{j-1} \wedge X_{j+1} \wedge \cdots \wedge Z , \]
where $\hat C^j$ is the $2n \times 2n$ matrix obtained by removing the $j$-th row from $C$. The vertical projection yields
\[ \tau_{\Sigma,V}(x) = \pi_V(\tau_\Sigma(x)) = \sum_{j=1}^{2n} \det(\hat C^j)\, X_1 \wedge X_2 \wedge \cdots \wedge X_{j-1} \wedge X_{j+1} \wedge \cdots \wedge Z \tag{68} \]
and by elementary linear algebra one can deduce that
\[ \sum_{j=1}^{2n+1} (-1)^j \det(\hat C^j)\, c_k^j = \det\bigl[\, C \;\; c_k \,\bigr] = 0 \tag{69} \]
for every $k = 1, \dots, 2n$. Then the vector
\[ \nu = \sum_{j=1}^{2n} (-1)^j \det(\hat C^j)\, X_j + (-1)^{2n+1} \det(\hat C^{2n+1})\, Z \]
yields a unit normal to $\Sigma$ at $x$. Its horizontal projection is
\[ \nu_H = \sum_{j=1}^{2n} (-1)^j \det(\hat C^j)\, X_j . \tag{70} \]
Formulae (68) and (70) yield the thesis.
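For $n = 1$ the construction in this proof is the familiar cross product: the signed $2 \times 2$ minors of the $3 \times 2$ matrix $C$ give a unit vector orthogonal to both columns. The sketch below, under the assumption that the tangent plane is spanned by $X_1 + c_1 Z$ and $X_2 + c_2 Z$, builds an orthonormal basis by Gram-Schmidt and forms $\nu$ from the signed minors. The helper names are hypothetical.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def orthonormal_tangent_basis(c1, c2):
    """Gram-Schmidt applied to X1 + c1 Z and X2 + c2 Z, in the frame (X1, X2, Z)."""
    t1 = normalize([1.0, 0.0, c1])
    w = [0.0, 1.0, c2]
    proj = sum(a * b for a, b in zip(w, t1))
    t2 = normalize([a - proj * b for a, b in zip(w, t1)])
    return t1, t2


def cofactor_normal(t1, t2):
    """nu_j = (-1)^j det(C^j) for j = 1, 2, 3, where C has columns t1, t2
    and C^j is C with its j-th row removed."""
    C = [[t1[i], t2[i]] for i in range(3)]
    nu = []
    for j in range(3):  # 0-based row index, so the sign is (-1)^(j+1)
        rows = [C[i] for i in range(3) if i != j]
        minor = rows[0][0] * rows[1][1] - rows[0][1] * rows[1][0]
        nu.append((-1) ** (j + 1) * minor)
    return nu


if __name__ == "__main__":
    t1, t2 = orthonormal_tangent_basis(0.7, -0.2)
    nu = cofactor_normal(t1, t2)
    print(nu, math.hypot(nu[0], nu[1]))
```

The first two components of `nu` are the coefficients of $\nu_H$, and up to sign they are the minors appearing in (68), so their Euclidean norm is $|\tau_{\Sigma,V}|$.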
5 Coarea formula
This section is devoted to the proof of Theorem 1.3. Next, we recall the Riemannian coarea formula, see Section 13.4 of [4].

Theorem 5.1. Let $f : \mathbb{H}^n \to \mathbb{R}^k$ be a Riemannian Lipschitz function, with $1 \le k < 2n+1$. Then for any summable map $u : \mathbb{H}^n \to \mathbb{R}$, the following formula holds
\[ \int_{\mathbb{H}^n} u(x)\, J_g f(x)\, d\mathrm{vol}_{2n+1}(x) = \int_{\mathbb{R}^k} \left( \int_{f^{-1}(t)} u(y)\, d\mathrm{vol}_p(y) \right) dt , \tag{71} \]
where $p = 2n + 1 - k$.

In the previous theorem the Heisenberg group $\mathbb{H}^n$ is equipped with its left invariant Riemannian metric $g$. The terminology “Riemannian Lipschitz map” means that the map is Lipschitz with respect to the Riemannian distance.

Proof (of Theorem 1.3). We first prove (4) in the case $f$ is defined on all of $\mathbb{H}^n$ and is of class $C^1$. Let $\Omega$ be an open subset of $\mathbb{H}^n$. In view of the Riemannian coarea formula (71), we have
\[ \int_\Omega u(x)\, J_g f(x)\, dx = \int_{\mathbb{R}^k} \left( \int_{f^{-1}(t) \cap \Omega} u(y)\, d\mathrm{vol}_p(y) \right) dt , \tag{72} \]
where $u : \Omega \to [0, +\infty]$ is a measurable function. Note that in the left hand side of (72) we have used the Lebesgue measure in that, by Proposition 2.7, it coincides with the volume measure expressed in terms of standard coordinates, namely $F_\sharp(\mathcal{L}^{2n+1}) = \mathrm{vol}_{2n+1}$. Now we define $u(x) = J_H f(x)\, \mathbf{1}_{\{J_g f \neq 0\} \cap \Omega}(x) / J_g f(x)$ and use (72), obtaining
\[ \int_\Omega J_H f(x)\, dx = \int_{\mathbb{R}^k} \left( \int_{f^{-1}(t) \cap \Omega} \frac{J_H f(y)\, \mathbf{1}_{\{J_g f \neq 0\}}(y)}{J_g f(y)}\, d\mathrm{vol}_p(y) \right) dt . \tag{73} \]
The validity of (72) also implies that for a.e. $t \in \mathbb{R}^k$ the set of points of $f^{-1}(t)$ where $J_g f$ vanishes is $\mathrm{vol}_p$-negligible, then the previous formula becomes
\[ \int_\Omega J_H f(x)\, dx = \int_{\mathbb{R}^k} \left( \int_{f^{-1}(t) \cap \Omega} \frac{J_H f(y)}{J_g f(y)}\, d\mathrm{vol}_p(y) \right) dt . \tag{74} \]
By the classical Sard theorem and Theorem 4.1, for a.e. $t \in \mathbb{R}^k$ the $C^1$ submanifold $f^{-1}(t)$ has $\mathcal{S}^{p+1}_{\mathbb{H}^n}$-negligible characteristic points, hence Proposition 3.2 implies that
\[ C_t = \{ y \in f^{-1}(t) \cap \Omega \mid J_H f(y) = 0 \} \]
is $\mathcal{S}^{p+1}_{\mathbb{H}^n}$-negligible. As a result, from formulae (22) and (2) we have proved that for a.e. $t \in \mathbb{R}^k$ the equalities
\[ \int_{f^{-1}(t) \cap \Omega} \frac{J_H f(y)}{J_g f(y)}\, d\mathrm{vol}_p(y) = \mathcal{S}^{p+1}_{\mathbb{H}^n}\bigl( f^{-1}(t) \cap \Omega \setminus C_t \bigr) = \mathcal{S}^{p+1}_{\mathbb{H}^n}\bigl( f^{-1}(t) \cap \Omega \bigr) \]
hold, therefore (74) yields
\[ \int_\Omega J_H f(x)\, dx = \int_{\mathbb{R}^k} \mathcal{S}^{p+1}_{\mathbb{H}^n}\bigl( f^{-1}(t) \cap \Omega \bigr)\, dt . \tag{75} \]
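The arbitrary choice of $\Omega$ yields the validity of (75) also for arbitrary closed sets. Then, approximation of measurable sets by closed ones, Borel regularity of $\mathcal{S}^{p+1}_{\mathbb{H}^n}$ and the coarea estimate 2.10.25 of [5] extend the validity of (75) to the following one
\[ \int_A J_H f(x)\, dx = \int_{\mathbb{R}^k} \mathcal{S}^{p+1}_{\mathbb{H}^n}\bigl( f^{-1}(t) \cap A \bigr)\, dt , \tag{76} \]
where $A$ is a measurable subset of $\mathbb{H}^n$. Now we consider the general case, where $f : A \to \mathbb{R}^k$ is a Lipschitz map defined on a measurable bounded subset $A$ of $\mathbb{H}^n$. Let $f_1 : \mathbb{H}^n \to \mathbb{R}^k$ be a Lipschitz extension of $f$, namely, $f_1|_A = f$ holds. Due to the Whitney extension theorem (see for instance 3.1.15 of [5]), for every arbitrarily fixed $\varepsilon > 0$ there exists a $C^1$ function $f_2 : \mathbb{H}^n \to \mathbb{R}^k$ such that the open subset $O = \{ z \in \mathbb{H}^n \mid f_1(z) \neq f_2(z) \}$ has Lebesgue measure less than or equal to $\varepsilon$. We wish to prove
\[ \left| \int_A J_H f(x)\, dx - \int_{\mathbb{R}^k} \mathcal{S}^{p+1}_{\mathbb{H}^n}\bigl( f^{-1}(t) \cap A \bigr)\, dt \right| \le \int_{A \cap O} J_H f(x)\, dx + \int_{\mathbb{R}^k} \mathcal{S}^{p+1}_{\mathbb{H}^n}\bigl( f^{-1}(t) \cap A \cap O \bigr)\, dt . \tag{77} \]
In fact, due to the validity of (76) for $C^1$ mappings, we have
\[ \int_{A \setminus O} J_H f_2(x)\, dx = \int_{\mathbb{R}^k} \mathcal{S}^{p+1}_{\mathbb{H}^n}\bigl( f_2^{-1}(t) \cap A \setminus O \bigr)\, dt . \]
Note here that the horizontal jacobian $J_H f$ is well defined on $A$, in that $df$ is well defined at density points of the domain, see for instance Definition 7 and Proposition 2.2 of [17]. The equality $f_2|_{A \setminus O} = f|_{A \setminus O}$ implies that $J_H f_2 = J_H f$ a.e. on $A \setminus O$, therefore
\[ \int_{A \setminus O} J_H f(x)\, dx = \int_{\mathbb{R}^k} \mathcal{S}^{p+1}_{\mathbb{H}^n}\bigl( f^{-1}(t) \cap A \setminus O \bigr)\, dt \]
holds and inequality (77) is proved. Now we observe that for a.e. $x \in A$, we have
\[ J_H f(x) \le \prod_{i=1}^{k} \Bigl( \sum_{j=1}^{2n} \bigl( X_j f^i(x) \bigr)^2 \Bigr)^{1/2} \le \bigl\| df(x)|_{H_x\mathbb{H}^n} \bigr\|^k , \]
therefore the estimate
\[ J_H f(x) \le \mathrm{Lip}(f)^k \tag{78} \]
holds for a.e. $x \in A$. By virtue of the general coarea inequality 2.10.25 of [5] there exists a dimensional constant $c_1 > 0$ such that
\[ \int_{\mathbb{R}^k} \mathcal{S}^{p+1}_{\mathbb{H}^n}\bigl( f^{-1}(t) \cap A \cap O \bigr)\, dt \le c_1\, \mathrm{Lip}(f)^k\, \mathcal{H}^{2n+2}(O) . \tag{79} \]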
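The fact that the $(2n+2)$-dimensional Hausdorff measure $\mathcal{H}^{2n+2}$ with respect to the homogeneous distance $\rho$ is proportional to the Lebesgue measure gives us a constant $c_2 > 0$ such that
\[ \int_{\mathbb{R}^k} \mathcal{S}^{p+1}_{\mathbb{H}^n}\bigl( f^{-1}(t) \cap A \cap O \bigr)\, dt \le c_2\, \mathrm{Lip}(f)^k\, \mathcal{L}^{2n+1}(O) \le c_2\, \mathrm{Lip}(f)^k\, \varepsilon . \tag{80} \]
Thus, estimates (78) and (80) joined with inequality (77) yield
\[ \left| \int_A J_H f(x)\, dx - \int_{\mathbb{R}^k} \mathcal{S}^{p+1}_{\mathbb{H}^n}\bigl( f^{-1}(t) \cap A \bigr)\, dt \right| \le (1 + c_2)\, \mathrm{Lip}(f)^k\, \varepsilon . \]
Letting $\varepsilon \to 0^+$, we have proved that
\[ \int_A J_H f(x)\, dx = \int_{\mathbb{R}^k} \mathcal{S}^{p+1}_{\mathbb{H}^n}\bigl( f^{-1}(t) \cap A \bigr)\, dt . \tag{81} \]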
Finally, utilizing increasing sequences of step functions pointwise converging to u and applying Beppo Levi convergence theorem the proof of (4) is achieved in the case A is bounded. If A is not bounded, then one can take the limit of (4) where A is replaced by Ak and {Ak } is an increasing sequence of measurable bounded sets whose union yields A. Then the Beppo Levi convergence theorem concludes the proof. Remark 5.2. Notice that once f : A −→ Rk in the previous theorem is considered with respect to standard coordinates it is easy to check that the locally Lipschitz property with respect to the Euclidean distance of R2n+1 is equivalent to the locally Lipschitz property with respect to the Riemannian distance.
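Identity (81) can be illustrated numerically in $\mathbb{H}^1$ for $f(x) = x_3 - (x_1^2 + x_2^2)/2$ on the cylinder $A = \{ |\tilde x| \le 1,\ 0 \le x_3 \le 1 \}$. The sketch below assumes the frame $X_1 = \partial_1 - x_2\partial_3$, $X_2 = \partial_2 + x_1\partial_3$, which gives $J_H f = \sqrt{2}\,|\tilde x|$, and it takes the density of each level set from (61), since every level set is a vertical translate of the paraboloid of Example 4.13. Both sides are approximated by midpoint sums in polar coordinates; the function names and grid sizes are arbitrary choices.

```python
import math


def horizontal_jacobian_integral(n_r=400):
    """Left-hand side of (81): integral of J_H f = sqrt(2) |x~| over A.
    The x3-integration over [0, 1] contributes a factor 1."""
    dr = 1.0 / n_r
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        # J_H f times the area element 2 pi r dr of the ring of radius r
        total += math.sqrt(2.0) * r * (2.0 * math.pi * r) * dr
    return total


def level_set_integral(n_r=400, n_t=600):
    """Right-hand side of (81): integrate over t the measure of the level set
    {x3 = t + |x~|^2 / 2} inside A, using the density sqrt(2) |u| from (61)."""
    dr = 1.0 / n_r
    t_min, t_max = -0.5, 1.0  # range of f on A
    dt = (t_max - t_min) / n_t
    total = 0.0
    for k in range(n_t):
        t = t_min + (k + 0.5) * dt
        measure = 0.0
        for i in range(n_r):
            r = (i + 0.5) * dr
            x3 = t + 0.5 * r * r
            if x3 >= 0.0 and 1.0 >= x3:  # the point of the level set lies in A
                measure += math.sqrt(2.0) * r * (2.0 * math.pi * r) * dr
        total += measure * dt
    return total


if __name__ == "__main__":
    print(horizontal_jacobian_integral(), level_set_integral())
```

Both sums approximate $2\sqrt{2}\,\pi/3$; the second one carries a small discretization error coming from the indicator of $A$ along the $t$-direction.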
References
[1] L. Ambrosio: “Some fine properties of sets of finite perimeter in Ahlfors regular metric measure spaces”, Adv. Math., Vol. 159, (2001), pp. 51–67.
[2] Z.M. Balogh: “Size of characteristic sets and functions with prescribed gradients”, J. Reine Angew. Math., Vol. 564, (2003), pp. 63–83.
[3] A. Bellaïche and J.J. Risler (Eds.): Sub-Riemannian geometry, Progress in Mathematics, Vol. 144, Birkhäuser Verlag, Basel, 1996.
[4] Y.D. Burago and V.A. Zalgaller: Geometric inequalities, Grundlehren Math. Springer, Berlin.
[5] H. Federer: Geometric Measure Theory, Springer, 1969.
[6] G.B. Folland and E.M. Stein: Hardy Spaces on Homogeneous groups, Princeton University Press, 1982.
[7] B. Franchi, R. Serapioni and F. Serra Cassano: “Meyers-Serrin type theorems and relaxation of variational integrals depending on vector fields”, Houston Jour. Math., Vol. 22, (1996), pp. 859–889.
[8] B. Franchi, R. Serapioni and F. Serra Cassano: “Rectifiability and Perimeter in the Heisenberg group”, Math. Ann., Vol. 321(3), (2001).
[9] B. Franchi, R. Serapioni and F. Serra Cassano: “Regular hypersurfaces, intrinsic perimeter and implicit function theorem in Carnot groups”, Comm. Anal. Geom., Vol. 11(5), (2003), pp. 909–944.
[10] B. Franchi, R. Serapioni and F. Serra Cassano: Regular submanifolds, graphs and area formula in Heisenberg groups, preprint, (2004).
[11] M. Gromov: “Carnot-Carathéodory spaces seen from within”, In: A. Bellaïche and J. Risler (Eds.): Subriemannian Geometry, Progress in Mathematics, Vol. 144, Birkhäuser Verlag, Basel, 1996.
[12] N. Garofalo and D.M. Nhieu: “Isoperimetric and Sobolev Inequalities for Carnot-Carathéodory Spaces and the Existence of Minimal Surfaces”, Comm. Pure Appl. Math., Vol. 49, (1996), pp. 1081–1144.
[13] P. Hajlasz and P. Koskela: “Sobolev met Poincaré”, Mem. Amer. Math. Soc., Vol. 145, (2000).
[14] B. Kirchheim and F. Serra Cassano: “Rectifiability and parametrization of intrinsic regular surfaces in the Heisenberg group”, Ann. Sc. Norm. Super. Pisa Cl. Sci. (5), Vol. 3(4), (2004), pp. 871–896.
[15] A. Korányi: “Geometric properties of Heisenberg-type groups”, Adv. Math., Vol. 56(1), (1985), pp. 28–38.
[16] I. Kupka: “Géométrie sous-riemannienne”, Astérisque, Vol. 241(817,5), (1997), pp. 351–380.
[17] V. Magnani: “Differentiability and Area formula on stratified Lie groups”, Houston Jour. Math., Vol. 27(2), (2001), pp. 297–323.
[18] V. Magnani: “On a general coarea inequality and applications”, Ann. Acad. Sci. Fenn. Math., Vol. 27, (2002), pp. 121–140.
[19] V. Magnani: “A Blow-up Theorem for regular hypersurfaces on nilpotent groups”, Manuscripta Math., Vol. 110(1), (2003), pp. 55–76.
[20] V. Magnani: “The coarea formula for real-valued Lipschitz maps on stratified groups”, Math. Nachr., Vol. 278(14), (2005), pp. 1–17.
[21] V. Magnani: “Note on coarea formulae in the Heisenberg group”, Publ. Mat., Vol. 48(2), (2004), pp. 409–422.
[22] V. Magnani: “Characteristic points, rectifiability and perimeter measure on stratified groups”, J. Eur. Math. Soc., to appear.
[23] R. Montgomery: A Tour of Subriemannian Geometries, Their Geodesics and Applications, Mathematical Surveys and Monographs, Vol. 91, American Mathematical Society, Providence, 2002.
[24] P. Pansu: Geometrie du Group d’Heisenberg, Thesis (PhD), 3rd ed., Université Paris VII, 1982.
[25] P. Pansu: “Une inégalité isoperimetrique sur le groupe de Heisenberg”, C.R. Acad. Sc. Paris, Série I, Vol. 295, (1982), pp. 127–130.
[26] P. Pansu: “Métriques de Carnot-Carathéodory quasiisométries des espaces symétriques de rang un”, Ann. Math., Vol. 129, (1989), pp. 1–60.
[27] E.M. Stein: Harmonic Analysis, Princeton University Press, 1993.
[28] N.Th. Varopoulos, L. Saloff-Coste and T. Coulhon: Analysis and Geometry on Groups, Cambridge University Press, Cambridge, 1992.
DOI: 10.1007/s11533-005-0007-0 Research article CEJM 4(1) 2006 110–122
On differences of two squares
Manfred Kühleitner∗, Werner Georg Nowak†
Institute of Mathematics, Department of Integrative Biology, Universität für Bodenkultur Wien, Gregor Mendel-Straße 33, 1180 Wien, Austria

Received 14 November 2005; accepted 25 November 2005

Abstract: The arithmetic function $\rho(n)$ counts the number of ways to write a positive integer n as a difference of two squares. Its average size is described by the Dirichlet summatory function $\sum_{n \le x} \rho(n)$, and in particular by the error term $R(x)$ in the corresponding asymptotics. This article provides a sharp lower bound as well as two mean-square results for $R(x)$, which illustrates the close connection between $\rho(n)$ and the number-of-divisors function $d(n)$.
© Central European Science Journals Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.
Keywords: Arithmetic functions, divisor problems, lattice points MSC (2000): 11N37
∗ E-mail: [email protected]
† E-mail: [email protected]

M. Kühleitner and W.G. Nowak / Central European Journal of Mathematics 4(1) 2006 110–122

1 Introduction

In this note we shall be concerned with the arithmetic function
\[ \rho(n) := \#\{(u, v) \in \mathbb{Z}_+ \times \mathbb{Z} : u^2 - v^2 = n\} \qquad (n \in \mathbb{Z}_+). \tag{1} \]
It is well known that this is closely related to the number-of-divisors function $d(n)$. In fact, it is easy to see (cf. [10]) that
\[ \rho(n) = \begin{cases} d(n) & \text{for } n \text{ odd}, \\ d(n/4) & \text{if } 4 \mid n, \\ 0 & \text{else}. \end{cases} \tag{2} \]
In other words,
\[ \rho(n) = d(n) - 2\,d\!\left(\frac{n}{2}\right) + 2\,d\!\left(\frac{n}{4}\right), \tag{3} \]
if we define $d(w) = 0$ for $w \notin \mathbb{Z}_+$. The average size of $d(n)$ is usually described by the formula
\[ D(x) = \sum_{1 \le n \le x} d(n) = x(\log x + 2\gamma - 1) + \frac{1}{4} + \Delta(x), \tag{4} \]
where x is large, $\gamma$ denotes the Euler-Mascheroni constant, and the main term is the sum of the residues of $\zeta^2(s)\,x^s/s$ at $s = 0$ and $s = 1$. On the error term $\Delta(x)$ a wealth of deep results has been established: see the monographs of A. Ivić [5], E.C. Titchmarsh [19], E. Krätzel [8, 9], and the recent survey article by Ivić, Krätzel and the authors [6]. In view of (3) and (4), it is natural to study the remainder term $R(x)$ defined by
\[ \sum_{1 \le n \le x} \rho(n) = \frac{x}{2}(\log x + 2\gamma - 1) + \frac{1}{4} + R(x). \tag{5} \]
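The relations (1) to (3) are easy to test by machine. The following is a minimal sketch, not taken from the paper: it counts the pairs in (1) by brute force and compares the result with the case distinction (2) and with the closed form (3). All function names are illustrative choices.

```python
from math import isqrt

def num_divisors(n):
    """d(n) for a positive integer n, by trial division."""
    count = 0
    for k in range(1, isqrt(n) + 1):
        if n % k == 0:
            count += 1 if k * k == n else 2
    return count

def d_ext(numerator, denominator):
    """d(w) for w = numerator/denominator, set to 0 when w is not a positive integer."""
    if numerator % denominator != 0:
        return 0
    return num_divisors(numerator // denominator)

def rho_bruteforce(n):
    """Count pairs (u, v), u >= 1, v any integer, with u^2 - v^2 = n, as in (1)."""
    count = 0
    # (u - |v|)(u + |v|) = n with u - |v| >= 1 forces u <= (n + 1) / 2.
    for u in range(1, (n + 1) // 2 + 1):
        v_squared = u * u - n
        if v_squared < 0:
            continue
        v = isqrt(v_squared)
        if v * v == v_squared:
            count += 1 if v == 0 else 2  # v and -v are distinct unless v = 0
    return count

def rho_cases(n):
    """Right-hand side of (2)."""
    if n % 2 == 1:
        return num_divisors(n)
    if n % 4 == 0:
        return num_divisors(n // 4)
    return 0

def rho_formula(n):
    """Right-hand side of (3): d(n) - 2 d(n/2) + 2 d(n/4)."""
    return num_divisors(n) - 2 * d_ext(n, 2) + 2 * d_ext(n, 4)
```

Comparing the three functions over an initial range of n is a cheap sanity check before using (2) in longer computations.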
Obviously,
\[ R(x) = \Delta(x) - 2\Delta\!\left(\frac{x}{2}\right) + 2\Delta\!\left(\frac{x}{4}\right). \tag{6} \]
Evidently, every O-estimate for $\Delta(x)$ trivially implies a corresponding upper bound for $R(x)$. In particular,
\[ R(x) = O\!\left(x^{131/416} (\log x)^{26957/8320}\right) \tag{7} \]
is immediate from M. Huxley’s hitherto strongest result on the Dirichlet divisor problem [4]. Because of the alternating sign in (6), it is less straightforward to transfer lower estimates from $\Delta(x)$ to $R(x)$. Nevertheless, M. Kühleitner [10, 11] managed to adapt the methods due to J.L. Hafner [2, 3], resp., K. Corrádi and I. Kátai [1], to derive (for certain constants $C, C' > 0$)
\[ R(x) = \Omega_+\!\left((x \log x)^{1/4} (\log\log x)^{(3 + 4\log 4)/4} \exp\!\left(-C \sqrt{\log\log\log x}\right)\right) \tag{8} \]
and
\[ R(x) = \Omega_-\!\left(x^{1/4} \exp\!\left(C' (\log\log x)^{1/4} (\log\log\log x)^{-3/4}\right)\right). \tag{9} \]

2 Statement of results
Our first aim will be to show how a recent and deep estimate of K. Soundararajan [18] can be used to deduce a lower bound for $R(x)$ which is slightly sharper, apart from the lack of information concerning the sign.

Theorem 2.1. For large x, the error term $R(x)$ defined by (5) satisfies
\[ R(x) = \Omega\!\left(x^{1/4} L^*(x)\right), \]
where
\[ L^*(x) := (\log x)^{1/4} (\log\log x)^{(3/4)(2^{4/3} - 1)} (\log\log\log x)^{-5/8}. \]
More generally, for any function $L(x)$ of the shape
\[ L(x) = \prod_{k=1}^{K} (\log_k x)^{\alpha_k}, \]
where $\alpha_k$ are arbitrary real constants, and $\log_k$ denotes the k-fold iterated logarithm, each of the lower estimates
\[ \Delta(x) = \Omega\!\left(x^{\alpha} L(x)\right), \qquad R(x) = \Omega\!\left(x^{\alpha} L(x)\right), \]
where $\frac{1}{4} \le \alpha < \frac{1}{2}$, implies the other one.
It is a common conjecture in problems of this kind that (up to logarithmic factors) $x^{1/4}$ should give just the "true" order of magnitude of the error term involved. In fact, $R(x) \asymp x^{1/4}$ in mean-square, as will be immediate from the following much more precise asymptotics.

Theorem 2.2. For large X,
\[ \int_1^X R^2(x)\,dx = C_\rho X^{3/2} + O\!\left(X \log^4 X\right), \]
where the constant $C_\rho$ is given by
\[ C_\rho = \frac{15 - 9\sqrt{2}}{21\pi^2}\, \frac{\zeta^4(3/2)}{\zeta(3)} = 0.424738\ldots. \]
In the classic case of the divisor problem, a mean-square asymptotics for $\Delta(x)$, of the same accuracy, has been established by E. Preissmann [17]. He improved upon a previous result by K.C. Tong [20] who had an error term of $O(X \log^5 X)$. It is natural to try to determine a localized form of the last theorem, i.e., to ask the following question: How small can an interval $[X - L, X + L]$ be such that
\[ \int_{X-L}^{X+L} R^2(x)\,dx \sim C_\rho \left((X + L)^{3/2} - (X - L)^{3/2}\right) \tag{10} \]
remains true? From Theorem 2.2 it is immediate that (10) holds provided that $L = L(X)$ satisfies
\[ \lim_{X \to \infty} \frac{X^{1/2} \log^4 X}{L(X)} = 0. \]
By an argument more specialized for the short interval case, we are able to obtain the following refinement.

Theorem 2.3. The asymptotics (10) holds true for any $L = L(X) < X - 1$ which satisfies
\[ \lim_{X \to \infty} \frac{X^{1/2} \log^3 X}{L(X)} = 0. \tag{11} \]
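The numerical value of $C_\rho$ quoted in Theorem 2.2 can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions rather than part of the paper: the helper `zeta` is a hypothetical stand-in that evaluates the Riemann zeta-function for real $s > 1$ by a partial sum plus the first Euler-Maclaurin correction terms. It also checks the elementary identity $(1 - 2^{-3/2})^3/(1 + 2^{-3/2}) + 1/8 = (15 - 9\sqrt{2})/7$ that underlies the closed form.

```python
from math import pi, sqrt

def zeta(s, cutoff=1000):
    """Approximate zeta(s) for real s > 1: partial sum plus Euler-Maclaurin tail terms."""
    head = sum(n ** (-s) for n in range(1, cutoff))
    tail = (cutoff ** (1 - s) / (s - 1)      # integral of x^(-s) from cutoff to infinity
            + 0.5 * cutoff ** (-s)           # half of the boundary term
            + s * cutoff ** (-s - 1) / 12)   # first Bernoulli correction
    return head + tail

def c_rho():
    """The constant of Theorem 2.2 in closed form."""
    return (15 - 9 * sqrt(2)) / (21 * pi ** 2) * zeta(1.5) ** 4 / zeta(3.0)

def euler_factor_bracket():
    """Contribution of the prime 2: (1 - 2^(-3/2))^3 / (1 + 2^(-3/2)) + 1/8."""
    t = 2 ** (-1.5)
    return (1 - t) ** 3 / (1 + t) + 0.125
```

With the default cutoff the neglected Euler-Maclaurin terms are far below double precision noise, so the only design choice that matters here is keeping the example free of external libraries.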
3 Proof of Theorem 2.1
There are various ways to infer this result on the basis of Soundararajan’s estimate [18]
\[ \Delta(x) = \Omega\!\left(x^{1/4} L^*(x)\right). \tag{12} \]
One alternative would be to start from formulae (22), (23) below and to mimic Soundararajan’s argument. We prefer an elementary reasoning which has the advantage of yielding the more general statement of Theorem 2.1. This is a variant of what Y.-K. Lau and K.M. Tsang [12] used when considering the mean-square of the Riemann zeta-function along the critical line. It is convenient to consider $R^*(x) := -\frac{1}{2} R(4x)$ instead of $R(x)$. We take for granted that $\Delta(x) = \Omega(x^{\alpha} L(x))$ and assume that, for $L(x)$ and $\alpha$ as specified in Theorem 2.1, and every $\varepsilon_0 > 0$,
\[ |R^*(x)| \le \varepsilon_0\, x^{\alpha} L(x) \tag{13} \]
for all $x \ge x_0(\varepsilon_0)$. We proceed to deduce a contradiction. By (6),
\[ R^*(x) = -\Delta(x) + \Delta(2x) - \tfrac{1}{2}\Delta(4x). \tag{14} \]
Now let $a := \frac{1}{2}(1 + i)$, $b := \frac{1}{2}(1 - i)$, then
\[ a + b = 1, \qquad ab = \frac{1}{2}, \qquad |a| = |b| = 2^{-1/2}. \tag{15} \]
We further put
\[ W(x) := -\Delta(x) + a\,\Delta(2x), \tag{16} \]
then (14) reads
\[ R^*(x) = W(x) - b\,W(2x). \tag{17} \]
With $J := \left[\frac{\log x}{\log 2}\right] + 1$ we iterate (16) J times to obtain
\[ \Delta(x) = a\,\Delta(2x) - W(x) = \cdots = a^J \Delta(2^J x) - \sum_{j=0}^{J-1} a^j\, W(2^j x). \tag{18} \]
Accordingly, iterating (17) J times (with x replaced by y), we get
\[ W(y) = R^*(y) + b\,W(2y) = \cdots = b^J W(2^J y) + \sum_{m=0}^{J-1} b^m R^*(2^m y). \tag{19} \]
Using (19) in (18), we arrive at
\[ \Delta(x) = a^J \Delta(2^J x) - \sum_{j=0}^{J-1} a^j b^J\, W(2^{j+J} x) - \sum_{j=0}^{J-1} \sum_{m=0}^{J-1} a^j b^m R^*(2^{j+m} x). \tag{20} \]
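The identities (17), (18) and (20) are purely algebraic: they hold for any function in place of $\Delta$, since they use only $a + b = 1$ and $ab = \frac{1}{2}$. The following sketch illustrates this numerically. The function `delta_stub` is an arbitrary stand-in chosen for the test and is not the actual error term of the divisor problem.

```python
import math

A = 0.5 * (1 + 1j)   # the constant a of (15)
B = 0.5 * (1 - 1j)   # the constant b of (15)

def delta_stub(t):
    """Arbitrary test function standing in for Delta(t); any function would do."""
    return math.sin(3.0 * math.sqrt(t)) * t ** 0.25 + math.cos(t)

def w(x):
    """W(x) as defined in (16)."""
    return -delta_stub(x) + A * delta_stub(2 * x)

def r_star(x):
    """R*(x) as given by (14)."""
    return -delta_stub(x) + delta_stub(2 * x) - 0.5 * delta_stub(4 * x)

def defect_17(x):
    """Absolute difference of the two sides of (17)."""
    return abs(r_star(x) - (w(x) - B * w(2 * x)))

def defect_18(x, J):
    """Absolute difference of the two sides of (18)."""
    rhs = A ** J * delta_stub(2 ** J * x) - sum(A ** j * w(2 ** j * x) for j in range(J))
    return abs(delta_stub(x) - rhs)

def defect_20(x, J):
    """Absolute difference of the two sides of (20)."""
    rhs = A ** J * delta_stub(2 ** J * x)
    rhs -= sum(A ** j * B ** J * w(2 ** (j + J) * x) for j in range(J))
    rhs -= sum(A ** j * B ** m * r_star(2 ** (j + m) * x)
               for j in range(J) for m in range(J))
    return abs(delta_stub(x) - rhs)
```

All three defects vanish up to rounding, which confirms that the only analytic input in the proof is the size of the individual terms, treated next.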
We use only the crude bound $\Delta(t) \ll t^{1/3} \log t$, which by (16) also implies that $W(t) \ll t^{1/3} \log t$. Recalling (15), we see that
\[ a^J \Delta(2^J x) \ll 2^{-J/6}\, x^{1/3} \log x \ll x^{1/6} \log x \]
and
\[ \sum_{j=0}^{J-1} a^j b^J\, W(2^{j+J} x) \ll \sum_{j=0}^{J-1} 2^{-(j+J)/6}\, x^{1/3} \log x \ll x^{1/6} \log x. \]
To bound the double sum in (20), we use (13), assuming that $x \ge x_0(\varepsilon_0)$. For $j, m < J$, obviously $L(2^{j+m} x) \ll L(x)$. Hence, recalling (15) again,
\[ \sum_{j=0}^{J-1} \sum_{m=0}^{J-1} a^j b^m R^*(2^{j+m} x) \ll \sum_{j=0}^{J-1} \sum_{m=0}^{J-1} 2^{-(j+m)/2}\, \varepsilon_0 \left(2^{j+m} x\right)^{\alpha} L(x) \ll \varepsilon_0\, x^{\alpha} L(x). \]
Using the last three estimates in (20), we obtain that $\Delta(x) \ll \varepsilon_0\, x^{\alpha} L(x)$ for x sufficiently large. Since $\varepsilon_0$ can be chosen arbitrarily small, this contradicts the $\Omega$-bound for $\Delta(x)$ assumed in Theorem 2.1. The other direction of the conclusion is immediate by (6) and hence the result follows.
4 Proof of Theorem 2.2

When considering the mean-square of $R(x)$, we will need a sharp Voronoï type approximation. For $\Delta(x)$, this is provided by the well-known expression
\[ S_\Delta(x, M) := \frac{x^{1/4}}{\pi\sqrt{2}} \sum_{1 \le n \le M} \frac{d(n)}{n^{3/4}} \cos\!\left(4\pi\sqrt{nx} - \frac{\pi}{4}\right), \tag{21} \]
where x and M are large real parameters. The following precise estimate has been established by T. Meurman [14].

Lemma 4.1 (T. Meurman [14, Lemma 3]). For $x \ge 1$ and $x \ll M \ll x^A$, where $A \ge 1$ is an arbitrary constant,
\[ \Delta(x) - S_\Delta(x, M) \ll \begin{cases} x^{-1/4} & \text{if } \|x\| \ge x^{5/2} M^{-1/2}, \\ x^{\varepsilon} & \text{always}, \end{cases} \]
where $\|x\|$ denotes the distance of x from the nearest integer, and $\varepsilon > 0$ is arbitrary.

Our strategy will be to approximate $R(x)$ by an analogous sum
\[ S_\rho(x, M) := \frac{x^{1/4}}{\pi} \sum_{1 \le n \le M} \frac{\rho(n)}{n^{3/4}} \cos\!\left(2\pi\sqrt{nx} - \frac{\pi}{4}\right). \tag{22} \]
This is connected with the divisor problem by the following identity.
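Lemma 4.2. For arbitrary $x, M \ge 1$,
\[ S_\rho(x, 4M) = S_\Delta(x, M) - 2 S_\Delta\!\left(\frac{x}{2}, 2M\right) + 2 S_\Delta\!\left(\frac{x}{4}, 4M\right). \]

Proof (of Lemma 4.2). Using (2), we easily conclude that
\[ S_\rho(x, 4M) = \frac{x^{1/4}}{\pi} \sum_{\substack{1 \le n \le 4M \\ n \equiv 1 \bmod 2}} \frac{d(n)}{n^{3/4}} \cos\!\left(2\pi\sqrt{nx} - \frac{\pi}{4}\right) + \frac{x^{1/4}}{\pi} \sum_{1 \le m \le M} \frac{d(m)}{(4m)^{3/4}} \cos\!\left(4\pi\sqrt{mx} - \frac{\pi}{4}\right). \]
Since the last term clearly equals $\frac{1}{2} S_\Delta(x, M)$, we obtain
\[ S_\rho(x, 4M) - \tfrac{1}{2} S_\Delta(x, M) = \frac{\sqrt{2}}{\pi} \left(\frac{x}{4}\right)^{1/4} \sum_{1 \le n \le 4M} \frac{d(n)}{n^{3/4}} \cos\!\left(4\pi\sqrt{n\,\frac{x}{4}} - \frac{\pi}{4}\right) - \frac{x^{1/4}}{\pi} \sum_{1 \le m \le 2M} \frac{d(2m)}{(2m)^{3/4}} \cos\!\left(4\pi\sqrt{m\,\frac{x}{2}} - \frac{\pi}{4}\right). \]
Here the first term on the right hand side is equal to $2 S_\Delta(\frac{x}{4}, 4M)$. In the last sum we use the identity $d(2m) = 2d(m) - d(\frac{m}{2})$, to infer that
\[ \begin{aligned} S_\rho(x, 4M) - \tfrac{1}{2} S_\Delta(x, M) - 2 S_\Delta\!\left(\frac{x}{4}, 4M\right) &= -\frac{\sqrt{2}}{\pi} \left(\frac{x}{2}\right)^{1/4} \sum_{1 \le m \le 2M} \frac{d(m)}{m^{3/4}} \cos\!\left(4\pi\sqrt{m\,\frac{x}{2}} - \frac{\pi}{4}\right) \\ &\quad + \frac{x^{1/4}}{\pi} \sum_{1 \le k \le M} \frac{d(k)}{(4k)^{3/4}} \cos\!\left(4\pi\sqrt{kx} - \frac{\pi}{4}\right) \\ &= -2 S_\Delta\!\left(\frac{x}{2}, 2M\right) + \tfrac{1}{2} S_\Delta(x, M). \end{aligned} \]
This is just the assertion of Lemma 4.2.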
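Since Lemma 4.2 is an exact identity between finite sums, it can be checked numerically for concrete values of x and M. The following sketch, with illustrative function names that are not from the paper, implements (21) and (22) directly for integer M and returns the difference of the two sides.

```python
from math import cos, isqrt, pi, sqrt

def num_divisors(n):
    """d(n) for a positive integer n, by trial division."""
    count = 0
    for k in range(1, isqrt(n) + 1):
        if n % k == 0:
            count += 1 if k * k == n else 2
    return count

def rho(n):
    """rho(n) via the case distinction (2)."""
    if n % 2 == 1:
        return num_divisors(n)
    if n % 4 == 0:
        return num_divisors(n // 4)
    return 0

def s_delta(x, M):
    """The Voronoi-type sum (21), with n running over 1..M."""
    total = sum(num_divisors(n) / n ** 0.75 * cos(4 * pi * sqrt(n * x) - pi / 4)
                for n in range(1, M + 1))
    return x ** 0.25 / (pi * sqrt(2)) * total

def s_rho(x, M):
    """The analogous sum (22), with n running over 1..M."""
    total = sum(rho(n) / n ** 0.75 * cos(2 * pi * sqrt(n * x) - pi / 4)
                for n in range(1, M + 1))
    return x ** 0.25 / pi * total

def lemma_4_2_defect(x, M):
    """Left side minus right side of Lemma 4.2 for integer M."""
    rhs = s_delta(x, M) - 2 * s_delta(x / 2, 2 * M) + 2 * s_delta(x / 4, 4 * M)
    return s_rho(x, 4 * M) - rhs
```

The defect is zero up to floating-point rounding for every admissible pair (x, M), in line with the term-by-term matching of frequencies in the proof.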
Combining (6) with Lemmas 4.1 and 4.2 yields the following approximation.

Lemma 4.3. For $x \ge 1$ and $x \ll M \ll x^A$, with $A \ge 1$ an arbitrary constant, the expression $S_\rho(x, 4M)$ defined by (22) satisfies
\[ R(x) = S_\rho(x, 4M) + E(x, M), \]
where
\[ E(x, M) \ll \begin{cases} x^{-1/4} & \text{if } \min\!\left(\|x\|, \left\|\frac{x}{2}\right\|, \left\|\frac{x}{4}\right\|\right) \ge x^{5/2} (4M)^{-1/2}, \\ x^{\varepsilon} & \text{always}. \end{cases} \tag{23} \]
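We shall make reference to this Lemma in the concluding remarks at the end of the paper. For the present purpose of proving Theorem 2.2, we proceed to integrate $R^2(x)$ over an interval $[Y, 2Y]$, Y sufficiently large, using (23) with $M = Y^7$. We first claim that
\[ \int_Y^{2Y} E^2(x, Y^7)\,dx \ll Y^{1/2}. \tag{24} \]
In fact, the set
\[ \mathcal{M}(Y) = \left\{ x \in [Y, 2Y] : \min\!\left(\|x\|, \left\|\frac{x}{2}\right\|, \left\|\frac{x}{4}\right\|\right) < x^{5/2} (4Y^7)^{-1/2} \right\} \]
has a measure of $O(1)$, hence
\[ \int_{\mathcal{M}(Y)} E^2(x, Y^7)\,dx \ll Y^{2\varepsilon}. \]
On the other hand,
\[ \int_{[Y, 2Y] \setminus \mathcal{M}(Y)} E^2(x, Y^7)\,dx \ll \int_Y^{2Y} x^{-1/2}\,dx \ll Y^{1/2}, \]
thus (24) is true. To evaluate the corresponding integral over $S_\rho^2(x, 4Y^7)$, we first set, for $u \ge \sqrt{Y}$,
\[ G(Y, u) := \int_{\sqrt{Y}}^{u} \left( \sum_{1 \le n \le 4Y^7} \frac{\rho(n)}{n^{3/4}} \cos\!\left(2\pi\sqrt{n}\,t - \frac{\pi}{4}\right) \right)^{2} dt. \tag{25} \]
Using the definition (22), we get
\[ \int_Y^{2Y} S_\rho^2(x, 4Y^7)\,dx = \frac{2}{\pi^2} \int_{\sqrt{Y}}^{\sqrt{2Y}} u^2\, \frac{\partial G}{\partial u}(Y, u)\,du = \frac{4}{\pi^2}\, Y\, G(Y, \sqrt{2Y}) - \frac{4}{\pi^2} \int_{\sqrt{Y}}^{\sqrt{2Y}} u\, G(Y, u)\,du \tag{26} \]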
by the change of variable $x = u^2$ and an integration by parts. To evaluate $G(Y, u)$, we employ the well-known result due to H.L. Montgomery and R.C. Vaughan [15].

Lemma 4.4 (Montgomery and Vaughan [15, Corollary 2]). For an arbitrary finite index set $\mathcal{J}$, let $(a_j)_{j \in \mathcal{J}}$ be a complex sequence and let $(\lambda_j)_{j \in \mathcal{J}}$ be a sequence of pairwise distinct reals. Write
\[ \delta_j := \min_{k \in \mathcal{J},\, k \ne j} |\lambda_k - \lambda_j|. \]
Then, for arbitrary real $T_0$ and $T > 0$,
\[ \int_{T_0}^{T_0 + T} \Big| \sum_{j \in \mathcal{J}} a_j \exp(i \lambda_j t) \Big|^2 dt = T \sum_{j \in \mathcal{J}} |a_j|^2 + O\!\Big( \sum_{j \in \mathcal{J}} \frac{|a_j|^2}{\delta_j} \Big), \]
where the O-constant is absolute.
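To apply this to (25), we use the identity $\cos(\alpha) = \frac{1}{2}(\exp(i\alpha) + \exp(-i\alpha))$. Further, we choose $\mathcal{J}$ as the set of all nonzero integers of modulus $\le 4Y^7$, and, for all $j \in \mathcal{J}$,
\[ a_j = \frac{1}{2}\, \frac{\rho(|j|)}{|j|^{3/4}} \exp\!\left(-\operatorname{sgn}(j)\, i\, \frac{\pi}{4}\right), \qquad \lambda_j = 2\pi \operatorname{sgn}(j) \sqrt{|j|}. \]
It is clear that $\delta_j \gg |j|^{-1/2}$, thus for the error term it follows that
\[ \sum_{j \in \mathcal{J}} \frac{|a_j|^2}{\delta_j} \ll \sum_{0 < n \le 4Y^7} \frac{\rho^2(n)}{n} \ll \log^4 Y. \]
Hence Lemma 4.4 gives
\[ G(Y, u) = \frac{1}{2} \left(u - \sqrt{Y}\right) \sum_{1 \le n \le 4Y^7} \frac{\rho^2(n)}{n^{3/2}} + O\!\left(\log^4 Y\right). \tag{27} \]
The tail of the series is negligible, since $\sum_{n > 4Y^7} \rho^2(n)\, n^{-3/2} \ll Y^{-3}$, and
\[ \begin{aligned} \sum_{n=1}^{\infty} \frac{\rho^2(n)}{n^{3/2}} &= \sum_{\substack{n=1 \\ n \equiv 1 \bmod 2}}^{\infty} \frac{d^2(n)}{n^{3/2}} + \sum_{m=1}^{\infty} \frac{d^2(m)}{(4m)^{3/2}} \\ &= \left( \frac{(1 - 2^{-3/2})^3}{1 + 2^{-3/2}} + \frac{1}{8} \right) \frac{\zeta^4(3/2)}{\zeta(3)} \\ &= \frac{1}{7} \left(15 - 9\sqrt{2}\right) \frac{\zeta^4(3/2)}{\zeta(3)} = 3\pi^2 C_\rho, \end{aligned} \tag{28} \]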
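where $C_\rho$ is the constant defined in Theorem 2.2. Using (27) in (26), we immediately get
\[ \int_Y^{2Y} S_\rho^2(x, 4Y^7)\,dx = C_\rho \left((2Y)^{3/2} - Y^{3/2}\right) + O\!\left(Y \log^4 Y\right). \]
Applying this formula with
\[ Y = \frac{X}{2},\ \frac{X}{4},\ \frac{X}{8},\ \ldots \]
and summing up, we arrive at
\[ \int_1^X S_\rho^2(x, 4Y^7)\,dx = C_\rho X^{3/2} + O\!\left(X \log^4 X\right), \]
where on each dyadic block $[Y, 2Y]$ the parameter Y is the left endpoint of the block. Finally, appealing to (23), (24), and Cauchy’s inequality, we complete the proof of Theorem 2.2.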
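Theorem 2.2 can also be explored empirically. The sketch below is an illustration and not part of the proof: it tabulates $\rho(n)$ from (2), evaluates $R(x)$ from the definition (5), and integrates $R^2$ over $[1, X]$ with a midpoint rule on each unit interval, where the summatory function is constant. The cutoff $X = 2000$, the step count, the truncated value of the Euler-Mascheroni constant and all names are assumptions made for the example. Since the theorem is asymptotic, only rough agreement with $C_\rho X^{3/2}$ should be expected at this range.

```python
from math import log

EULER_GAMMA = 0.5772156649015329
C_RHO = 0.424738  # value quoted in Theorem 2.2

def rho_table(N):
    """List with rho(n) at index n for 0 <= n <= N, built from a divisor-count sieve and (2)."""
    d = [0] * (N + 1)
    for k in range(1, N + 1):
        for multiple in range(k, N + 1, k):
            d[multiple] += 1
    table = [0] * (N + 1)
    for n in range(1, N + 1):
        if n % 2 == 1:
            table[n] = d[n]
        elif n % 4 == 0:
            table[n] = d[n // 4]
    return table

def main_term(x):
    """Main term of (5): (x/2)(log x + 2*gamma - 1) + 1/4."""
    return 0.5 * x * (log(x) + 2 * EULER_GAMMA - 1) + 0.25

def mean_square_of_R(X, steps=40):
    """Midpoint-rule approximation of the integral of R(x)^2 over [1, X], X an integer."""
    table = rho_table(X)
    h = 1.0 / steps
    partial_sum = 0
    integral = 0.0
    for n in range(1, X):
        partial_sum += table[n]          # sum of rho(k) for k <= x, valid on [n, n+1)
        for i in range(steps):
            x = n + (i + 0.5) * h
            r = partial_sum - main_term(x)
            integral += r * r * h
    return integral

X = 2000
ratio = mean_square_of_R(X) / (C_RHO * X ** 1.5)
```

The quantity `ratio` compares the numerical integral with the main term of Theorem 2.2; its deviation from 1 reflects the lower-order term of size $O(X \log^4 X)$.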
5 Proof of Theorem 2.3

We may suppose that
\[ X^{1/2} \log^3 X < L = L(X) \le X^{1/2} \log^5 X, \tag{29} \]
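else the result follows by Theorem 2.2. In a way, our argument will be simpler than the proof of Theorem 2.2, because it will avoid the Hilbert type inequality of Montgomery and Vaughan (Lemma 4.4), and in fact be similar to the method used in Nowak [16]. For $x \in [X - L(X), X + L(X)]$, we infer from formulas (22), (23) that
\[ R(x) = \frac{x^{1/4}}{\pi} \sum_{1 \le n \le X} \frac{\rho(n)}{n^{3/4}} \cos\!\left(2\pi\sqrt{nx} - \frac{\pi}{4}\right) + O\!\left(X^{\varepsilon}\right) \qquad (\varepsilon > 0). \tag{30} \]
From this it is immediate that
\[ R(x) = \frac{x^{1/4}}{\pi} S_1(x) + O\!\left(X^{1/4} |S_2(x)|\right) + O\!\left(X^{\varepsilon}\right), \tag{31} \]
with
\[ S_1(x) := \sum_{1 \le n \le M} \frac{\rho(n)}{n^{3/4}} \cos\!\left(2\pi\sqrt{nx} - \frac{\pi}{4}\right) = \frac{1}{2} \sum_{0 < |m| \le M} \frac{\rho(|m|)}{|m|^{3/4}}\, e\!\left(\operatorname{sgn}(m) \left(\sqrt{|m| x} - \frac{1}{8}\right)\right), \]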
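But this is just the right hand side of (33), as asserted.

Looking back at (31), we proceed to evaluate the mean-square of $\frac{x^{1/4}}{\pi} S_1(x)$. Let $S_1(x)^2 = S_3 + S_4(x)$ with
\[ S_3 := \frac{1}{2} \sum_{1 \le n \le M} \frac{\rho^2(n)}{n^{3/2}}, \]
\[ S_4(x) := \frac{1}{4} \sum_{\substack{1 \le |m|, |n| \le M \\ m + n \ne 0}} \frac{\rho(|m|)\rho(|n|)}{|mn|^{3/4}}\; e\!\left( \left(\operatorname{sgn}(m)\sqrt{|m|} + \operatorname{sgn}(n)\sqrt{|n|}\right) \sqrt{x} - \frac{1}{8} \left(\operatorname{sgn}(m) + \operatorname{sgn}(n)\right) \right). \]
Using the elementary fact that, for an arbitrary nonzero real number A,
\[ \int_{X-L}^{X+L} x^{1/2}\, e\!\left(A\sqrt{x}\right) dx = 2 \int_{\sqrt{X-L}}^{\sqrt{X+L}} t^2\, e(At)\,dt \ll \frac{X}{|A|}, \]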
we obtain
\[ \int_{X-L}^{X+L} x^{1/2} S_4(x)\,dx \ll X \sum_{\substack{1 \le |m|, |n| \le M \\ m + n \ne 0}} \frac{\rho(|m|)\rho(|n|)\, |mn|^{-3/4}}{\left| \operatorname{sgn}(m)\sqrt{|m|} + \operatorname{sgn}(n)\sqrt{|n|} \right|} \ll B(M)\, X. \tag{35} \]
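Here $B(M)$ denotes a positive bound depending only on M. On the other hand,
\[ \begin{aligned} \int_{X-L}^{X+L} \left(\frac{x^{1/4}}{\pi}\right)^{2} S_3\,dx &= \frac{1}{3\pi^2} \left((X + L)^{3/2} - (X - L)^{3/2}\right) \sum_{1 \le n \le M} \frac{\rho^2(n)}{n^{3/2}} \\ &= \frac{1}{3\pi^2} \left((X + L)^{3/2} - (X - L)^{3/2}\right) \sum_{n=1}^{\infty} \frac{\rho^2(n)}{n^{3/2}} + O\!\left(L X^{1/2} M^{-1/2+\varepsilon}\right) \\ &= C_\rho \left((X + L)^{3/2} - (X - L)^{3/2}\right) + O\!\left(L X^{1/2} M^{-1/2+\varepsilon}\right), \end{aligned} \tag{36} \]
appealing again to (28). Combining (35) and (36), we arrive at
\[ \int_{X-L}^{X+L} \left(\frac{x^{1/4}}{\pi} S_1(x)\right)^{2} dx = C_\rho \left((X + L)^{3/2} - (X - L)^{3/2}\right) + O\!\left(L X^{1/2} M^{-1/2+\varepsilon}\right) + O\!\left(B(M) X\right). \]
Using this along with (33) in (31), and applying (32), we conclude that
\[ \begin{aligned} \int_{X-L}^{X+L} R^2(x)\,dx &= C_\rho \left((X + L)^{3/2} - (X - L)^{3/2}\right) + O\!\left(L X^{1/2} M^{-1/2+\varepsilon}\right) + O\!\left(B(M) X\right) \\ &\quad + O\!\left( \left(L^{1/2} X^{1/4} + B^{1/2}(M) X^{1/2}\right) \left(X^{1/2} \log^{3/2} X + L^{1/2} X^{1/4} M^{-1/12+\varepsilon} + L^{1/2} X^{\varepsilon}\right) \right) \\ &\quad + O\!\left(X \log^3 X + L X^{1/2} M^{-1/6+\varepsilon} + L X^{2\varepsilon}\right). \end{aligned} \]
Hence, recalling the condition (11) and the fact that $(X + L)^{3/2} - (X - L)^{3/2} \asymp L X^{1/2}$,
\[ \limsup_{X \to \infty} \left| \frac{\int_{X-L}^{X+L} R^2(x)\,dx}{(X + L)^{3/2} - (X - L)^{3/2}} - C_\rho \right| \ll M^{-1/12+\varepsilon}. \]
Since M can be chosen arbitrarily large, this completes the proof of Theorem 2.3.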
6 Concluding remarks

The results established show a very close analogy between the error terms $R(x)$ and $\Delta(x)$. This fact is explained by the great similarity of the Voronoï type approximations provided by Lemmas 4.1 and 4.3, respectively. On the basis of Lemma 4.3, one could deduce more results on $R(x)$ which have direct counterparts in the theory of $\Delta(x)$. For instance, there is a series of papers concerned with asymptotics for higher power moments of $\Delta(x)$, including also the discussion of the short interval case. See D.R. Heath-Brown [7], K.-M. Tsang [21], Y.-K. Lau and K.-M. Tsang [13], and W. Zhai [22–24]. Starting from Lemma 4.3 and carrying over the techniques employed in the papers cited, it is straightforward to deduce asymptotics
\[ \int_1^X R^k(x)\,dx \sim C_\rho^{(k)} X^{1+k/4}, \qquad \int_{X-L}^{X+L} R^k(x)\,dx \sim C_\rho^{(k)} \left((X + L)^{1+k/4} - (X - L)^{1+k/4}\right), \]
for a certain range of integers $k > 2$, even with fairly precise error terms.
Acknowledgment The authors gratefully acknowledge support from the Austrian Science Fund (FWF) under project Nr. P18079-N12.
References
[1] K. Corrádi and I. Kátai: “A comment on K.S. Gangadharan’s paper “Two classical lattice point problems”” (in Hungarian), Magyar Tud. Akad. Mat. Fiz. Oszt. Közl., Vol. 17, (1967), pp. 89–97.
[2] J.L. Hafner: “New omega theorems for two classical lattice point problems”, Invent. Math., Vol. 63, (1981), pp. 181–186.
[3] J.L. Hafner: “On the average order of a class of arithmetical functions”, J. Number Theory, Vol. 15, (1982), pp. 36–76.
[4] M. Huxley: “Exponential sums and lattice points III”, Proc. London Math. Soc., Vol. 87(3), (2003), pp. 591–609.
[5] A. Ivić: The Riemann zeta-function, Wiley & Sons, New York, 1985.
[6] A. Ivić, E. Krätzel, M. Kühleitner and W.G. Nowak: “Lattice points in large regions and related arithmetic functions: Recent developments in a very classic topic”, In: W. Schwarz (Ed.): Proc. Conf. on Elementary and Analytic Number Theory ELAZ’04, held in Mainz, to appear; Available in electronic form at http://arXiv.org/pdf/math.NT/0410522.
[7] D.R. Heath-Brown: “The distribution and moments of the error term in the Dirichlet divisor problems”, Acta Arith., Vol. 60, (1992), pp. 389–415.
[8] E. Krätzel: Lattice points, Kluwer, Dordrecht-Boston-London, 1988.
[9] E. Krätzel: Analytische Funktionen in der Zahlentheorie, Teubner, Stuttgart-Leipzig-Wiesbaden, 2000.
[10] M. Kühleitner: “An Omega theorem on differences of two squares”, Acta Math. Univ. Comen., New Ser., Vol. 61, (1992), pp. 117–123.
[11] M. Kühleitner: “An Omega theorem on differences of two squares, II”, Acta Math. Univ. Comen., New Ser., Vol. 68, (1999), pp. 27–35.
[12] Y.-K. Lau and K.-M. Tsang: “Omega result for the mean square of the Riemann zeta function”, Manuscr. Math., Vol. 117, (2005), pp. 373–381.
[13] Y.-K. Lau and K.-M. Tsang: “Moments over short intervals”, Arch. Math., Vol. 84, (2005), pp. 249–257.
[14] T. Meurman: “On the mean square of the Riemann zeta-function”, Quart. J. Math. Oxford, Vol. 38(2), (1987), pp. 337–343.
[15] H.L. Montgomery and R.C. Vaughan: “Hilbert’s inequality”, J. London Math. Soc., Vol. 8(2), (1974), pp. 73–82.
[16] W.G. Nowak: “On the divisor problem: Moments of Δ(x) over short intervals”, Acta Arithm., Vol. 109, (2003), pp. 329–341.
[17] E. Preissmann: “Sur la moyenne quadratique du terme de reste du problème du cercle”, C. R. Acad. Sci., Paris, Sér. I, Vol. 306, (1988), pp. 151–154.
[18] K. Soundararajan: “Omega results for the divisor and circle problems”, Int. Math. Res. Not., Vol. 36, (2003), pp. 1987–1998.
[19] E.C. Titchmarsh: The theory of the Riemann zeta function, Clarendon Press, Oxford, 1951.
[20] K.C. Tong: “On divisor problems”, Acta Math. Sinica, Vol. 6, (1956), pp. 515–541.
[21] K.-M. Tsang: “Higher power moments of Δ(x), E(t), and P(x)”, Proc. London Math. Soc., III. Ser., Vol. 65, (1992), pp. 65–84.
[22] W. Zhai: “On higher-power moments of Δ(x)”, Acta Arith., Vol. 112, (2004), pp. 367–395.
[23] W. Zhai: “On higher-power moments of Δ(x), II”, Acta Arith., Vol. 114, (2004), pp. 35–54.
[24] W. Zhai: “On higher-power moments of Δ(x), III”, Acta Arith., Vol. 118, (2005), pp. 263–281.
DOI: 10.1007/s11533-005-0008-z Research article CEJM 4(1) 2006 123–137
On some new spectral estimates for Schrödinger-like operators∗
Daniel Levin†
Department of Mathematics, Technion - Israel Institute of Technology, Haifa 32000, Israel

Received 22 August 2005; accepted 22 November 2005

Abstract: We prove the analog of the Cwikel-Lieb-Rozenblum estimate for a wide class of second-order elliptic operators by two different tools: Lieb-Thirring inequalities for Schrödinger operators with matrix-valued potentials and Sobolev inequalities for warped product spaces.
© Central European Science Journals Warsaw and Springer-Verlag Berlin Heidelberg. All rights reserved.

Keywords: CLR estimate, Lieb-Thirring inequality, warped product, Schrödinger operator
MSC (2000): 35P15, 47F05
∗ Supported by the Israel Science Foundation founded by the Israeli Academy of Sciences and Humanities.
† E-mail: [email protected]

D. Levin / Central European Journal of Mathematics 4(1) 2006 123–137

1 Introduction

Schrödinger-like operators arise in a number of applications. If A is a non-negative self-adjoint operator in a Hilbert space and V is relatively compact, then it is natural to call $A - V$ a Schrödinger-like operator. The leading example here is the usual Schrödinger operator, where $A = -\Delta$ acts in the space $L^2(\mathbb{R}^d)$ and V is the operator of multiplication by a real-valued function $V(x)$. One of the problems arising in relation to Schrödinger-like operators is the study of negative eigenvalues. On the qualitative level one can investigate whether the number of such eigenvalues is finite or infinite. On the quantitative level, the question is about studying the number of the eigenvalues, their asymptotic behaviour and their sum.
An important role in this field is played by the CLR-estimate. In its most special case, for the Schrödinger operator in $\mathbb{R}^d$, $d \ge 3$, it has the form
\[ N_-(-\Delta - V) \le C_d \int_{\mathbb{R}^d} V^{d/2}\,dx, \tag{1} \]
where $V \ge 0$ and $N_-(A)$ denotes the number of negative eigenvalues of the operator A. The inequalities
\[ \operatorname{Tr}(-\Delta - V)_-^{\gamma} \le C_{\gamma,d} \int_{\mathbb{R}^d} V^{\gamma + \frac{d}{2}}\,dx \tag{2} \]
are known as Lieb-Thirring bounds and hold true with finite constants $C_{\gamma,d}$ if and only if $\gamma \ge 1/2$ for $d = 1$, $\gamma > 0$ for $d = 2$ and $\gamma \ge 0$ for $d \ge 3$. Here and in the sequel, $A_- = \frac{1}{2}(|A| - A)$ denotes the negative part of a self-adjoint operator A and $\operatorname{Tr} A_-^{\gamma} = \sum_{\lambda_j < 0} |\lambda_j|^{\gamma}$ […] 1, and $\nu = \nu(x)$ is the ellipticity constant at the point $x \in \mathbb{R}^d$, i.e. the lowest eigenvalue of the coefficient matrix $a(x) = (a_{ij}(x))_{i,j=1,\ldots,d}$ (for more details see [8, Section 3]). On the other hand, under certain conditions, the invariant formula for asymptotics of eigenvalues holds
\[ N_-(A - \alpha V) \sim C \alpha^{\frac{d}{2}} \int_{\mathbb{R}^d} V^{\frac{d}{2}} (\det a(x))^{-\frac{1}{2}}\,dx \tag{6} \]
(see [3, 10]) under the condition that the right-hand side in (5) is finite. One can observe a serious shortcoming in these latter results when comparing the situation with the standard Schrödinger operator. For the latter, the estimate (1) involves the same quantity which is present in the asymptotics. Thus the asymptotics is justified for any V for which it makes sense. On the contrary, the estimate and asymptotics for variable coefficients contain different expressions. Thus, on the one hand, the estimate
(5) does not have a quasi-classical character, and on the other hand, the asymptotics requires considerably more than just finiteness of the coefficients. This latter statement is justified by the following two observations. The estimate (5) involves the lowest eigenvalue of $a(x)$ while the asymptotics contains the determinant of the matrix. These two quantities may have quite different behaviour, thus affecting the results. However, even if $a(x) = \nu(x) I$, i.e. when $\det a(x) = \nu^d$, the estimate (5) cannot reflect possible cancellation of singularities in $V^{\frac{d}{2}} (\det a)^{-\frac{1}{2}}$, which makes the expression in (6) finite.

Of special interest here are the cases when the metric has some singularities, so that the existing estimate is useless but the asymptotic term is still finite due to possible cancellation of singularities in the metric and in the potential. We establish estimates of this type for a wide class of manifolds; they involve the same quantity as the asymptotic formulas, and therefore they are invariant. The results of the paper became possible due to a recent advancement by Hundertmark (see [5]) which is described in Section 2. This results in a series of CLR-type estimates on manifolds having topologically product-like structure and having a metric of warped product. In particular, we describe several classes of second order elliptic operators A where the number of negative eigenvalues can be estimated via the coefficient in the asymptotic formula, in the same way as it takes place for the Laplacian in $\mathbb{R}^d$. The special feature of these estimates is that they allow singularities and degeneracies of ellipticity for the operator A so that they can be compensated by singularities and degeneracies of V, unlike the previously known general estimates in [3, 8, 10] where such cancellation was not taken care of. Only for very special examples were such estimates found by G. Tashchiyan (see [15, 16]); under rather restrictive conditions some estimates of this kind were established by K. Tachizawa [13, 14].

One more example of our approach is presented in Section 6, where degenerate second-order operators of Grushin type assume the role of A. Another approach via Sobolev inequalities for warped products, which is based on the main result of [6], also gives the same spectral estimates, but it works only if the dimensions of the spaces in the warped product are greater than two. There exist operators for which our first approach does not apply and Sobolev’s inequality seems to be the only way to obtain the desired result (see Section 5).
2 Auxiliary results

We start this section with the Hundertmark result [5]. Let G be a Hilbert space with norm $\|\cdot\|_G$, scalar product $\langle \cdot, \cdot \rangle_G$ and let $1_G$ be the identity operator on G. We denote by $S^q(G)$ the ideal of compact operators A whose singular values are q-summable, $\sum_n \mu_n(A)^q < \infty$. By definition, $A \in S^q(G)$ iff $\operatorname{Tr}(|A|^q) = \operatorname{Tr}((A^* A)^{q/2}) < \infty$.

Let $L^q(\mathbb{R}^d, S^r(G))$ be the space of operator-valued functions f whose norm
\[ \|f\|_{q,r}^q = \|f\|^q_{L^q(\mathbb{R}^d, S^r(G))} := \int_{\mathbb{R}^d} \operatorname{Tr}_G\!\left(|f(x)|^r\right)^{q/r} dx \]
is finite.

Theorem 2.1. Let G be some auxiliary Hilbert space and V a non-negative potential in $L^{\frac{d}{2}}(\mathbb{R}^d, S^{\frac{d}{2}}(G))$. Then the operator $Q = -\Delta \otimes 1_G - V$ has a finite number of negative eigenvalues and the following estimate holds
\[ N_-(Q) \le C_d \int_{\mathbb{R}^d} \operatorname{Tr}_G\!\left(V^{\frac{d}{2}}\right) dx. \tag{7} \]
This auxiliary result will be used in Section 4.
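3 Differential operators and Riemannian manifolds

The computations below show how to reduce the spectral problem for an elliptic operator A on $L^2(\mathbb{R}^d)$, $d \ge 3$, to the Laplace-Beltrami operator on some manifold with a chosen metric. If a second-order differential operator A on $L^2(\mathbb{R}^d)$ is given by
\[ A = -\sum_{i,j=1}^{d} \frac{\partial}{\partial x_i}\, a_{ij}\, \frac{\partial}{\partial x_j}, \]
then we choose $g^{ij} = a_{ij} (\det a)^{\alpha}$, and we calculate explicitly the value of $\alpha$. Here $a = a(x)$ is the coefficient matrix $(a_{ij}(x))_{i,j=1,\ldots,d}$. By definition, $d\,\mathrm{vol} = \sqrt{\det g_{ij}}\; dx_1 \ldots dx_d$, and $\det g_{ij} = (\det g^{ij})^{-1} = (\det a)^{-(\alpha d + 1)}$.

In differential geometry, the squared gradient is given by $|\nabla u|^2 = g^{ij} \frac{\partial u}{\partial x_i} \frac{\partial u}{\partial x_j}$, and integrating it with respect to the volume element gives
\[ \int_M g^{ij}\, \frac{\partial u}{\partial x_i} \frac{\partial u}{\partial x_j}\; d\,\mathrm{vol} = \int a_{ij} (\det a)^{\alpha} (\det a)^{-\frac{\alpha d + 1}{2}}\, \frac{\partial u}{\partial x_i} \frac{\partial u}{\partial x_j}\; dx, \]
which must be equal to the Dirichlet integral for $\langle Au, u \rangle_{L^2}$. Finally we obtain $\alpha - \frac{\alpha d + 1}{2} = 0$, or equivalently, $\alpha = -1/(d - 2)$.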
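The exponent bookkeeping in this computation can be verified with exact rational arithmetic. The sketch below uses illustrative names and is only a check of the algebra above: it confirms that $\alpha = -1/(d-2)$ makes the total power of $\det a$ in the integrand vanish, and that the volume density then becomes $(\det a)^{1/(d-2)}$.

```python
from fractions import Fraction

def alpha(d):
    """The exponent alpha = -1/(d - 2), defined for d >= 3."""
    return Fraction(-1, d - 2)

def integrand_exponent(alpha_value, d):
    """Total power of det a in g^{ij} d vol relative to a_ij dx: alpha - (alpha*d + 1)/2."""
    return alpha_value - (alpha_value * d + 1) / 2

def volume_exponent(alpha_value, d):
    """Power of det a in d vol = sqrt(det g_ij) dx, i.e. -(alpha*d + 1)/2."""
    return -(alpha_value * d + 1) / 2
```

For $d = 2$ the equation $\alpha - \frac{\alpha d + 1}{2} = 0$ has no solution, which is one way to see why the reduction is stated for $d \ge 3$.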
4 Main result

As explained in the Introduction, the ’correct’ CLR estimates for $A - \mathcal{V}$ must look like
\[ N_-(A - \mathcal{V}) \le c \int_M \mathcal{V}^{\frac{d}{2}}\; d\,\mathrm{vol}. \tag{8} \]
We consider a wide class of manifolds for which we can prove estimate (8). An example of such a manifold would be the warped product of two Euclidean spaces $\mathbb{R}^m$ and $\mathbb{R}^k$. Note that the spectral properties for such manifolds may significantly differ from those of their direct products. E.g. the heat kernel for the product space is the product of the heat kernels of its factors. For the warped product of $\mathbb{R}^m$ and $\mathbb{R}^k$ it is a very difficult problem to find an estimate from above for the heat kernel.

We restrict ourselves to the following class of manifolds which topologically look like $\mathbb{R}^d = \mathbb{R}^m \times \mathbb{R}^k$, $m + k = d$, endowed with the following Riemannian metric $ds^2 = g^{ij} \xi_i \xi_j$:
\[ ds^2 = f^{-m/(d-2)} (\det a)^{-\frac{1}{d-2}} \sum_{i,j=1}^{k} a_{ij}(y)\, \xi_i \xi_j + f^{\frac{k-2}{d-2}} (\det a)^{-\frac{1}{d-2}} \left( \xi_{k+1}^2 + \cdots + \xi_d^2 \right). \tag{9} \]
We assume that $f = f(y) \ge 0$ and $f > 0$ almost everywhere, $(a_{ij}(y))_{i,j=1,\ldots,k}$ is some positive definite matrix and $\det a$ stands for its determinant. The volume element on M is $d\,\mathrm{vol} = f^{m/(d-2)} (\det a)^{1/(d-2)}\, dy\,dz$. The quadratic form $a[u]$ corresponding to the Laplace-Beltrami operator on M endowed with the Riemannian metric (9) has the form:
\[ a[u] = \int_{\mathbb{R}^d} \left( f(y)\, |\nabla_y u|^2 + \sum_{i,j=1}^{k} a_{ij}(y)\, \frac{\partial u}{\partial z_i} \frac{\partial u}{\partial z_j} \right) dy\,dz. \tag{10} \]
The Schrödinger operator $-\Delta_M - \mathcal{V}$ is similarly defined by the quadratic form:
\[ a[u] - \int_M \mathcal{V} |u|^2\; d\,\mathrm{vol} = a[u] - \int_{\mathbb{R}^d} V |u|^2\, dy\,dz, \tag{11} \]
i.e. $V = \mathcal{V} f^{m/(d-2)} (\det a)^{1/(d-2)}$. The sharp CLR-estimate for (10) is given below:

Theorem 4.1. For the operator A defined by (10), the following estimate holds
\[ N_-(A - \mathcal{V}) \le C \int_M \mathcal{V}^{d/2}\; d\,\mathrm{vol} = C \int_{\mathbb{R}^k} \int_{\mathbb{R}^m} V^{d/2}(y, z)\, (\det a)^{-1/2} f^{-m/2}\, dy\,dz. \tag{12} \]
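The equality of the two integrals in (12) is a matter of collecting powers of $f$ and $\det a$. The following sketch, with illustrative names, tracks these powers with exact fractions: it starts from the volume density $f^{m/(d-2)} (\det a)^{1/(d-2)}$ and from the relation $V = \mathcal{V} f^{m/(d-2)} (\det a)^{1/(d-2)}$ in (11), and returns the exponents that multiply $V^{d/2}$ in $\mathcal{V}^{d/2}\, d\,\mathrm{vol}$.

```python
from fractions import Fraction

def exponents_in_theorem_4_1(m, k):
    """Powers of f and det a multiplying V^{d/2} in calV^{d/2} d vol, with d = m + k >= 3."""
    d = m + k
    # Volume density: d vol = f^{m/(d-2)} (det a)^{1/(d-2)} dy dz.
    vol_f = Fraction(m, d - 2)
    vol_det = Fraction(1, d - 2)
    # From (11): calV = V * f^{-m/(d-2)} * (det a)^{-1/(d-2)}.
    pot_f = -vol_f
    pot_det = -vol_det
    half_d = Fraction(d, 2)
    # calV^{d/2} d vol = V^{d/2} * f^{(d/2) pot_f + vol_f} * (det a)^{(d/2) pot_det + vol_det}.
    return (half_d * pot_f + vol_f, half_d * pot_det + vol_det)
```

For every admissible pair (m, k) the result is $(-m/2, -1/2)$, matching the factor $(\det a)^{-1/2} f^{-m/2}$ on the right-hand side of (12).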
Proof. At first, we make a change of variables $z \to \tilde{z}$ so that $d\tilde{z} = f(y)\,dz$ (i.e. each component changes respectively as $\tilde{z}^l = f^{1/k} z^l$, $l = 1, \ldots, k$). Naturally, the quadratic form (10) transforms into
\[ \tilde{a}[u] = \int_{\mathbb{R}^d} \left( |\nabla_y u|^2 + f^{-1} f^{2/k} \sum_{i,j=1}^{k} a_{ij}(y)\, \frac{\partial u}{\partial \tilde{z}_i} \frac{\partial u}{\partial \tilde{z}_j} \right) dy\,d\tilde{z}. \]
The new potential function $\tilde{V} = \tilde{V}(y, \tilde{z})$ becomes
\[ \tilde{V}(y, \tilde{z}) = f(y)^{-1} V(y, z), \qquad \text{or} \qquad V(y, z) = f(y)\, \tilde{V}(y, \tilde{z}). \]
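We will need an auxiliary lemma which is given below:
Lemma 4.2. Let B be an operator defined on $L^2(\mathbb{R}^d)$ by its quadratic form b[u] as follows:
\[
b[u] = \int_{\mathbb{R}^d} \Bigl( |\nabla_y u|^2 + \sum_{i,j=1}^{k} a_{ij}(y)\,\frac{\partial u}{\partial z_i}\frac{\partial u}{\partial z_j} \Bigr)\, dy\,dz, \quad y \in \mathbb{R}^m,\ z \in \mathbb{R}^k. \tag{13}
\]
For the operator B described above, the following estimate holds
\[
N_-(B - V) \le C \int_{\mathbb{R}^k}\int_{\mathbb{R}^m} V^{d/2}(y, z)\, (\det a)^{-1/2} \, dy\,dz, \tag{14}
\]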
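where $\det a$ stands for the determinant of the matrix $a(x) = (a_{ij}(x))_{i,j=1,\dots,k}$.
Proof. We apply Theorem 2.1 to the following operator $B - V = -\Delta_y + W(y)$, where
\[
W(y) = -\sum_{i,j=1}^{k} \frac{\partial}{\partial z_j}\Bigl( a_{ij}(y)\,\frac{\partial}{\partial z_i} \Bigr) - V(y, z).
\]
Theorem 2.1 implies
\[
N_-(A - V) \le C \int_{\mathbb{R}^m} \operatorname{Tr}\,[W(y)]_-^{m/2}\, dy.
\]
By the definition,
\[
\operatorname{Tr}\bigl([W(y)]_-^{m/2}\bigr) = \sum_j |\lambda_j(W(y))|^{m/2}.
\]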
λj (W (y))= ki,j=1 aij (y)ξ i ξ j by changing z-coordinates (y remains fixed). After a linear transformation w = U z, dz becomes dz = (det U )−1 dw. It also implies
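\[
\int |u(\cdot, z)|^2 \, dz = (\det U)^{-1} \int |u(\cdot, w)|^2 \, dw
\]
and
\[
\int \langle A\nabla_z u, \nabla_z u \rangle \, dz
 = (\det U)^{-1} \int \langle AU\nabla_w u, U\nabla_w u \rangle \, dw
 = (\det U)^{-1} \int \langle UAU\nabla_w u, \nabla_w u \rangle \, dw. \tag{15}
\]
Now we choose $U = A^{-1/2}$. Then (15) transforms into
\[
(\det A)^{1/2} \int \langle \nabla_w u, \nabla_w u \rangle \, dw.
\]
Making a change of variables for $\int V(\cdot, z)|u(\cdot, z)|^2 \, dz$, we have
\[
\int V(\cdot, z)|u(\cdot, z)|^2 \, dz = (\det U)^{-1} \int \tilde V(\cdot, w)|u(\cdot, w)|^2 \, dw,
\]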
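where $\tilde V$ is defined by $V(\cdot, z) = \tilde V(\cdot, w)$. Therefore,
\[
\lambda_j\Bigl( -\sum_{i,j=1}^{k} \frac{\partial}{\partial z_j}\Bigl( a_{ij}(y)\,\frac{\partial}{\partial z_i} \Bigr) - V(y, z) \Bigr) = \lambda_j\bigl( -\Delta_w - \tilde V(y, w) \bigr).
\]
Finally, making the inverse change of variables $w \to z$, we get
\[
N_-(A - V) \le c \int_{\mathbb{R}^d} V^{d/2}(y, z)\, (\det a)^{-1/2} \, dz\,dy.
\]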
Remark 4.3. Lemma 4.2 can be proven directly using the approach of Sobolev inequalities provided m ≥ 3, k ≥ 3. An alternative proof of Lemma 4.2 is given below.
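Proof. Denote by $F(y) = \bigl( \int_{\mathbb{R}^k} |u|^2 \, dz \bigr)^{1/2}$. Then by the simple reasoning (see [4]),
\[
\int_{\mathbb{R}^m}\int_{\mathbb{R}^k} |\nabla_y u|^2 \, dy\,dz \ge \int_{\mathbb{R}^m} |\nabla_y F|^2 \, dy.
\]
We apply the Sobolev inequality on $\mathbb{R}^m$ for $F = F(y)$:
\[
\int_{\mathbb{R}^m} |\nabla_y F|^2 \, dy \ge c \Bigl( \int_{\mathbb{R}^m} |F|^{\frac{2m}{m-2}} \, dy \Bigr)^{\frac{m-2}{m}}.
\]
It yields
\[
\int_{\mathbb{R}^d} |\nabla_y u|^2 \, dy\,dz \ge c \Bigl( \int_{\mathbb{R}^m} \Bigl( \int_{\mathbb{R}^k} |u|^2 \, dz \Bigr)^{\frac{m}{m-2}} dy \Bigr)^{\frac{m-2}{m}}. \tag{16}
\]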
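To estimate the second term of (13) from below we make a change of variables, $z \to \tilde z$, diagonalizing the quadratic form $(A\xi, \xi) = (\eta, \eta)$, where $(A\xi, \xi) = \sum_{i,j=1}^{k} a_{ij}\xi_i\xi_j$. Such a change of variables respectively yields $dz = (\det a)^{1/2} \, d\tilde z$. The left term of (13) becomes in new coordinates $(y, \tilde z)$
\[
\int_{\mathbb{R}^m}\int_{\mathbb{R}^k} \sum_{i,j=1}^{k} a_{ij}(y)\,\frac{\partial u}{\partial z_i}\frac{\partial u}{\partial z_j} \, dz\,dy
 = \int_{\mathbb{R}^m}\int_{\mathbb{R}^k} |\nabla_{\tilde z} u|^2 (\det a)^{1/2} \, d\tilde z\,dy.
\]
The right-hand side of (16) becomes
\[
c \Bigl( \int_{\mathbb{R}^m} (\det a)^{\frac{m}{2(m-2)}} \Bigl( \int_{\mathbb{R}^k} |u|^2 \, d\tilde z \Bigr)^{\frac{m}{m-2}} dy \Bigr)^{\frac{m-2}{m}}.
\]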
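Again, applying the Sobolev inequality on $\mathbb{R}^k$ for a new Dirichlet form $\int |\nabla_{\tilde z} u|^2 \, d\tilde z$, we have
\[
\int_{\mathbb{R}^m}\int_{\mathbb{R}^k} \sum_{i,j=1}^{k} a_{ij}(y)\,\frac{\partial u}{\partial z_i}\frac{\partial u}{\partial z_j} \, dz\,dy
 \ge c \int_{\mathbb{R}^m} \Bigl( \int_{\mathbb{R}^k} \bigl( |u|^2 (\det a)^{1/2} \bigr)^{\frac{k}{k-2}} d\tilde z \Bigr)^{\frac{k-2}{k}} dy.
\]
Finally, the interpolation inequality (see [2]) gives
\[
\Bigl( \int_{\mathbb{R}^m} (\det a)^{\frac{m}{2(m-2)}} \Bigl( \int_{\mathbb{R}^k} |u|^2 \, d\tilde z \Bigr)^{\frac{m}{m-2}} dy \Bigr)^{\frac{m-2}{m}}
 + \int_{\mathbb{R}^m} \Bigl( \int_{\mathbb{R}^k} \bigl( |u|^2 (\det a)^{1/2} \bigr)^{\frac{k}{k-2}} d\tilde z \Bigr)^{\frac{k-2}{k}} dy
\]
\[
\ge C(m, k) \Bigl( \int_{\mathbb{R}^d} \bigl( |u|^2 (\det a)^{1/2} \bigr)^{\frac{d}{d-2}} dy\,d\tilde z \Bigr)^{\frac{d-2}{d}}.
\]
Further, making the inverse change of variables, $\tilde z \to z$, we come to
\[
a[u] \ge c \Bigl( \int_{\mathbb{R}^d} |u|^{\frac{2d}{d-2}} (\det a)^{\frac{1}{d-2}} \, dy\,dz \Bigr)^{\frac{d-2}{d}}.
\]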
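This is the correct Sobolev inequality with respect to the Riemannian measure which guarantees the validity of (14). Then one applies the main result of [6]. Now we apply Lemma 2.1 to
\[
-f^{\frac{2}{k}-1} \sum_{i,j=1}^{k} \frac{\partial}{\partial \tilde z_j}\Bigl( a_{ij}(y)\,\frac{\partial}{\partial \tilde z_i} \Bigr) - \tilde V(y, \tilde z),
\]
or, equivalently,
\[
-\sum_{i,j=1}^{k} \frac{\partial}{\partial \tilde z_j}\Bigl( b_{ij}(y)\,\frac{\partial}{\partial \tilde z_i} \Bigr) - \tilde V(y, \tilde z),
\]
where $b_{ij}(y) = a_{ij}(y) f^{\frac{2}{k}-1}$. Trivially, $\det(b_{ij})_{i,j=1,\dots,k} = f^{2-k} \det(a_{ij})_{i,j=1,\dots,k}$. Using Lemma 4.2, one has
\[
N_-(B - V) \le c \int_{\mathbb{R}^d} \tilde V^{d/2} (\det b)^{-1/2} \, d\tilde z\,dy,
\]
where $B = -\sum_{i,j=1}^{k} \frac{\partial}{\partial \tilde z_j}\bigl( b_{ij}(y)\,\frac{\partial}{\partial \tilde z_i} \bigr)$.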
In (z, y)-variables it becomes
\[
N_-(A - V) \le c \int_{\mathbb{R}^d} \Bigl( \frac{V}{f} \Bigr)^{d/2} (\det a)^{-1/2} f^{(k-2)/2} f \, dz\,dy
 = c \int_{\mathbb{R}^d} V^{d/2} (\det a)^{-1/2} f^{-m/2} \, dz\,dy
 = c \int_M \mathcal{V}^{d/2} \, d\,\mathrm{vol}.
\]
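The exponent of f in the last step can be checked directly. The following bookkeeping is only a sketch; it assumes the relations $\tilde V = V/f$, $\det b = f^{2-k}\det a$ and $d\tilde z = f\,dz$ used in the proof.

```latex
\tilde V^{d/2} (\det b)^{-1/2} \, d\tilde z
  = \Bigl(\frac{V}{f}\Bigr)^{d/2} f^{\frac{k-2}{2}} (\det a)^{-1/2} \, f \, dz
  = V^{d/2} (\det a)^{-1/2} f^{-\frac{d}{2} + \frac{k-2}{2} + 1} \, dz,
\qquad
-\frac{d}{2} + \frac{k-2}{2} + 1 = \frac{k-d}{2} = -\frac{m}{2}.
```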
Corollary 4.4. Let us consider the warped product of two Euclidean spaces. Our space topologically looks like Rd = Rm × Rk , m + k = d and the quadratic form corresponding to A = −Δy − f (y)Δz on L2 (Rd ) is
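\[
a[u] = \int_{\mathbb{R}^d} \bigl( |\nabla_y u|^2 + f(y)|\nabla_z u|^2 \bigr) \, dy\,dz, \quad y \in \mathbb{R}^m,\ z \in \mathbb{R}^k. \tag{17}
\]
Assume also that $f = f(y) \ge 0$ and $f > 0$ almost everywhere. Let $m \ge 3$, $k \ge 1$. For the operator A described above, the following estimate holds
\[
N_-(A - V) \le C \int_{\mathbb{R}^k}\int_{\mathbb{R}^m} V^{d/2}(y, z)\, f(y)^{-k/2} \, dy\,dz. \tag{18}
\]
Proof. It easily follows from Lemma 4.2 with $a_{ij}(y) = f(y)\mathbf{1}_z$.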
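For illustration, the weight in (18) can be read off from (14). The computation below is a sketch which assumes that $\mathbf{1}_z$ denotes the identity matrix on $\mathbb{R}^k$, so that $a(y)$ is a scalar multiple of the identity.

```latex
a(y) = f(y)\,\mathbf{1}_z
\;\Longrightarrow\;
\det a = f(y)^{k},
\qquad
(\det a)^{-1/2} = f(y)^{-k/2}.
```

Substituting this into (14) gives the weight $f(y)^{-k/2}$ appearing in (18).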
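Remark 4.5. The previous example can be easily modified. Consider now a warped product of three Euclidean spaces,
\[
\mathbb{R}^d = \mathbb{R}^p \times \mathbb{R}^m \times \mathbb{R}^k, \quad t \in \mathbb{R}^p,\ y \in \mathbb{R}^m,\ z \in \mathbb{R}^k, \quad p + m + k = d, \quad p \ge 3, \quad m, k \ge 1,
\]
and an operator A defined on $L^2(\mathbb{R}^p \times \mathbb{R}^m \times \mathbb{R}^k)$ by its associated quadratic form
\[
a[u] = \int_{\mathbb{R}^d} \Bigl( |\nabla_t u|^2 + f(t)\bigl( |\nabla_y u|^2 + g(y)|\nabla_z u|^2 \bigr) \Bigr) \, dt\,dy\,dz. \tag{19}
\]
Then
\[
N_-(A - V) \le c \int_{\mathbb{R}^p}\int_{\mathbb{R}^m}\int_{\mathbb{R}^k} g(y)^{-k/2} f(t)^{-(k/2+m/2)} V^{d/2} \, dy\,dz\,dt.
\]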
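The weight in this estimate is consistent with a formal use of the determinant factor $(\det a)^{-1/2}$ from (14). As a sketch, assume that t plays the role of the first variable and that the remaining $m + k$ directions carry the block-diagonal matrix below (a notation introduced here only for illustration):

```latex
a = f(t) \begin{pmatrix} \mathbf{1}_m & 0 \\ 0 & g(y)\,\mathbf{1}_k \end{pmatrix},
\qquad
\det a = f(t)^{m+k}\, g(y)^{k},
\qquad
(\det a)^{-1/2} = f(t)^{-(k/2 + m/2)}\, g(y)^{-k/2}.
```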
An alternative route through Sobolev inequalities gives the same estimate if we repeat the same procedure as in Lemma 4.2 twice. Moreover, one can consider a warped product of l Euclidean spaces, where l ≥ 2. In the next section we consider an operator (as we call it g − f operator) for which we can obtain the correct CLR estimate only by Sobolev inequalities’ approach.
5
CLR estimate for g − f operator
Let us consider the product of two Euclidean spaces. Our space topologically looks like $\mathbb{R}^d = \mathbb{R}^m \times \mathbb{R}^k$, $m + k = d$, and the quadratic form corresponding to A is
\[
a[u] = \int_{\mathbb{R}^d} \bigl( g(z)|\nabla_y u|^2 + f(y)|\nabla_z u|^2 \bigr) \, dy\,dz, \quad y \in \mathbb{R}^m,\ z \in \mathbb{R}^k. \tag{20}
\]
Naturally, the matrix a(x) corresponding to (20) looks like
\[
a(x) = \operatorname{diag}\bigl( \underbrace{g(z), \dots, g(z)}_{m},\ \underbrace{f(y), \dots, f(y)}_{k} \bigr).
\]
Assume also that $f = f(y) \ge 0$, $g = g(z) \ge 0$ and $f > 0$, $g > 0$ almost everywhere. This quadratic form corresponds to the operator $A = -g(z)\Delta_y - f(y)\Delta_z$.
Theorem 5.1. Let $m \ge 3$, $k \ge 3$. For the operator A described above, the following estimate holds
\[
N_-(A - V) \le C \int_{\mathbb{R}^k}\int_{\mathbb{R}^m} V^{d/2}(y, z)\, g(z)^{-m/2} f(y)^{-k/2} \, dy\,dz. \tag{21}
\]
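Proof. The proof goes via Sobolev inequalities, together with the connection between the global Sobolev inequality and the CLR-estimate (see [6]) provided $m \ge 3$, $k \ge 3$. At first, let us apply the Sobolev inequality for $u = u(\cdot, z)$ on $\mathbb{R}^k$:
\[
\int_{\mathbb{R}^k} |\nabla_z u|^2 \, dz \ge c \Bigl( \int_{\mathbb{R}^k} |u|^{\frac{2k}{k-2}} \, dz \Bigr)^{\frac{k-2}{k}}.
\]
Secondly, multiplying it by f(y) and integrating it over $\mathbb{R}^m$, we have
\[
\int_{\mathbb{R}^m}\int_{\mathbb{R}^k} f(y)|\nabla_z u|^2 \, dz\,dy \ge c \int_{\mathbb{R}^m} f(y) \Bigl( \int_{\mathbb{R}^k} |u|^{\frac{2k}{k-2}} \, dz \Bigr)^{\frac{k-2}{k}} dy.
\]
Next, we introduce a new function
\[
F(y) = \Bigl( \int_{\mathbb{R}^k} |u|^2 g(z) \, dz \Bigr)^{\frac{1}{2}}.
\]
It is a simple fact (see e.g. [4]) that
\[
\int_{\mathbb{R}^m}\int_{\mathbb{R}^k} |\nabla_y u|^2 g(z) \, dy\,dz \ge \int_{\mathbb{R}^m} |\nabla_y F|^2 \, dy. \tag{22}
\]
Now applying the Sobolev inequality for $F = F(y)$ on $\mathbb{R}^m$, we obtain
\[
\int_{\mathbb{R}^m} |\nabla_y F|^2 \, dy \ge c \Bigl( \int_{\mathbb{R}^m} |F|^{\frac{2m}{m-2}} \, dy \Bigr)^{\frac{m-2}{m}}.
\]
Taking into account (22) and the above inequality, we have an estimate
\[
\int_{\mathbb{R}^m}\int_{\mathbb{R}^k} |\nabla_y u|^2 g(z) \, dy\,dz \ge c \Bigl( \int_{\mathbb{R}^m} \Bigl( \int_{\mathbb{R}^k} |u|^2 g(z) \, dz \Bigr)^{\frac{m}{m-2}} dy \Bigr)^{\frac{m-2}{m}}.
\]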
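Using the interpolation inequality for mixed norms (see [2]), we have
\[
\int_{\mathbb{R}^m}\int_{\mathbb{R}^k} \bigl( g(z)|\nabla_y u|^2 + f(y)|\nabla_z u|^2 \bigr) \, dz\,dy
 \ge c \Bigl( \int_{\mathbb{R}^m}\int_{\mathbb{R}^k} |u|^{\frac{2d}{d-2}} f^{\frac{k}{d-2}} g^{\frac{m}{d-2}} \, dy\,dz \Bigr)^{\frac{d-2}{d}}.
\]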
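Now considering the warped product as a Riemannian manifold, we evaluate the volume element and a new potential $\tilde V = \tilde V(y, z)$ as
\[
d\,\mathrm{vol} = f^{\frac{k}{d-2}} g^{\frac{m}{d-2}} \, dy\,dz, \qquad V(y, z) = \tilde V(y, z)\, f^{\frac{k}{d-2}} g^{\frac{m}{d-2}}.
\]
Further,
\[
N_-(A - V) \le c \int_{\mathbb{R}^k}\int_{\mathbb{R}^m} \Bigl( f(y)^{-\frac{k}{d-2}} g(z)^{-\frac{m}{d-2}} V(y, z) \Bigr)^{\frac{d}{2}} f^{\frac{k}{d-2}} g^{\frac{m}{d-2}} \, dz\,dy
 = c \int_{\mathbb{R}^k}\int_{\mathbb{R}^m} f(y)^{-\frac{k}{2}} g(z)^{-\frac{m}{2}} V(y, z)^{\frac{d}{2}} \, dz\,dy.
\]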
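The final equality above is pure exponent arithmetic. A short check, written only as a sketch of the bookkeeping for the powers of f and g:

```latex
-\frac{k}{d-2}\cdot\frac{d}{2} + \frac{k}{d-2}
  = \frac{k}{d-2}\cdot\frac{2-d}{2} = -\frac{k}{2},
\qquad
-\frac{m}{d-2}\cdot\frac{d}{2} + \frac{m}{d-2}
  = \frac{m}{d-2}\cdot\frac{2-d}{2} = -\frac{m}{2}.
```

These are the exponents of $f(y)$ and $g(z)$ in the last integral, in agreement with (21).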
6
Lieb-Thirring and CLR inequalities for operators of Grushin type
We study Lieb-Thirring estimates for the operator $A - V$ on $L^2(\mathbb{R}^2)$, where $A = -(D_x^2 + x^2 D_y^2)$. The corresponding quadratic form a[u] is given by
\[
a[u] = \int_{\mathbb{R}^2} \bigl( |u_x|^2 + x^2 |u_y|^2 \bigr) \, dx\,dy.
\]
Proposition 6.1. For any γ > 0, γ+3/2 ˜ LTγ ≤ Cγ V dxdy + Cγ V |x|2 0: ∞ dt LTγ ≤ Cγ P (t; x, x)G(tV (x))dx, (24) t1+γ M 0 where G(s) = (s − 1)+ . Substituting (23) into (24), we obtain LTγ ≤ Cγ dxdy t−1−γ G(tV )(|x|t + t3/2 )−1 dt.
(25)
We recall that $G(s) = (s - 1)_+$, so $\frac{G(tV)}{t} \le V$. Therefore,
\[
LT_\gamma \le C_\gamma \int dx\,dy\, V \int_{1/V}^{\infty} t^{-\gamma} (|x|t + t^{3/2})^{-1} \, dt.
\]
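Denote by I the following integral:
\[
I = \int_{1/V}^{\infty} t^{-\gamma} (|x|t + t^{3/2})^{-1} \, dt.
\]
Evaluating the integral by changing the variables $t = sV^{-1}$, we come to
\[
I = \int_1^{\infty} \Bigl( \frac{s}{V} \Bigr)^{-\gamma} \Bigl( |x|\frac{s}{V} + \Bigl( \frac{s}{V} \Bigr)^{3/2} \Bigr)^{-1} \frac{ds}{V}.
\]
An elementary analysis of the last integral shows that if $V|x|^2 < 1$, then $|x|\frac{s}{V} < (\frac{s}{V})^{3/2}$ (here we recall that $1 < s < \infty$), and therefore
\[
\Bigl( |x|\frac{s}{V} + \Bigl( \frac{s}{V} \Bigr)^{3/2} \Bigr)^{-1} \le \Bigl( \frac{V}{s} \Bigr)^{3/2}.
\]
So, if $V|x|^2 < 1$, then
\[
I = \int_1^{\infty} \Bigl( \frac{s}{V} \Bigr)^{-\gamma} \Bigl( |x|\frac{s}{V} + \Bigl( \frac{s}{V} \Bigr)^{3/2} \Bigr)^{-1} \frac{ds}{V}
 \le V^{1/2+\gamma} \int_1^{\infty} s^{-\gamma-3/2} \, ds.
\]
Suppose that $\gamma > -1/2$ so that the last integral converges. It yields
\[
I \le c_1(\gamma)\, \frac{V^{\gamma}}{|x|}. \tag{26}
\]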
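One way to pass from the integral bound to (26) is sketched below. It relies only on the pointwise bound $(|x|s/V + (s/V)^{3/2})^{-1} \le (V/s)^{3/2}$, which holds because both summands are nonnegative, and it produces one admissible value of $c_1(\gamma)$ rather than a sharp one.

```latex
I \le V^{\gamma + 1/2} \int_1^{\infty} s^{-\gamma - 3/2} \, ds
  = \frac{V^{\gamma + 1/2}}{\gamma + 1/2},
\qquad
V|x|^2 < 1 \;\Longrightarrow\; V^{1/2} < \frac{1}{|x|}
\;\Longrightarrow\;
I \le \frac{1}{\gamma + 1/2}\,\frac{V^{\gamma}}{|x|}.
```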
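Suppose now that $V|x|^2 \ge 1$. Note that
\[
\Bigl( |x|\frac{s}{V} + \Bigl( \frac{s}{V} \Bigr)^{3/2} \Bigr)^{-1} \le
\begin{cases}
s^{-1} V |x|^{-1}, & \text{if } s < V|x|^2, \\
s^{-3/2} V^{3/2}, & \text{if } s > V|x|^2.
\end{cases}
\]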
Therefore, the last integral can be estimated as follows: I=
γ
V |x|
1
V |x|2
∞
= 1
s−3/2−γ ds + V
= c1 (γ)
∞
+ 1
γ
V |x|2
V |x|2 ∞ γ+1/2 V |x|2 1/2
s−γ−1 ds
V V + c2 (γ) 2γ . |x| |x|
The integral in the formula above converges as soon as γ > 0. Taking into account that V |x|2 ≥ 1, we obtain Vγ I ≤ c3 (γ) . (27) |x|
Finally, if we use estimates (26) and (27), the Lieb-Thirring inequality for the two dimensional Grushin operator reads as LTγ ≤ dxdy V · I = dxdy V · I + dxdy V · I V |x|2 0. If now κ depends on ∇u, then we are led to consider more general free energies of the form ψ(u, ∇u) = Φ(∇u) + F (u)
(4)
and we end up with an equation of the form (1) with $\beta(s) = d_s\Phi(s)$ (d denoting the differential). For instance, models of anisotropic Allen-Cahn equations of the form (1), for a free energy of the form (4) (with $\Phi$ strictly convex and homogeneous of degree two, i.e., $\Phi(\lambda s) = \lambda^2\Phi(s)$, $\forall s \in \mathbb{R}^N$, $\forall\lambda > 0$) and $\alpha = I$, are considered and studied in [14] and [26]. The study of equations of the form (1) can be found in [1, 4, 9, 15, 21] and [23] (actually, these works also consider the more general case of differential inclusions); we also mention [2, 7, 8] and [22] for the study of equations of the form
\[
\alpha\Bigl( \frac{\partial u}{\partial t} \Bigr) - \operatorname{div}(\beta(\nabla u)) + f(u) = g. \tag{5}
\]
Equations (1) and (5) have been extensively studied when α and/or β are linear ; we mention, for instance, the monographs [5, 6, 18] and [24]. The existence of attractors for (1) has been proven in [11, 12] (for β = I) and [23]. Furthermore, the only known result on the existence of finite dimensional attractors is due to [11] (see also [12]), again for β = I. We also mention [22], where attractors for (5) are constructed. Our aim in this paper is to extend the results of [11] and [12] to the more general equation (1). One of the main difficulties is that we need to prove that the attractors are regular enough, namely, that they are bounded in H 2 (Ω). As a consequence, we restrict ourselves to one and two space dimensions for a general function α ; for α = cI, c > 0, the results also hold in three space dimensions. This article is organized as follows. In Section 1, we prove the existence of the global attractor, which is a compact and invariant by the flow set which attracts all the bounded sets of initial data as time goes to infinity. Then, in Section 2, we study the regularity of the global attractor, which allows us to prove, in Section 3, that it has finite fractal dimension ; this result is obtained by using the method of l-trajectories (see [19]) and actually allows to prove the existence of an exponential attractor, which is a compact and positively invariant by the flow set which attracts the bounded sets of initial data exponentially fast and has finite fractal dimension. Finally, in Section 4, we study the regularity of the solution of an elliptic equation. To do so, we essentially follow [25], where the regularity of the solution of the p-Laplacian (which, for p = 2, reduces to the
A. Miranville / Central European Journal of Mathematics 4(1) 2006 163–182
usual Laplacian) is investigated. Since we have not been able to find this regularity result in the literature, we chose to detail it here, for the sake of completeness. Throughout this article, the same letter c (and, sometimes, $c'$, $c''$ or $c'''$) denotes constants which may vary from line to line.
1
Existence of the global attractor
We consider the following equation in a bounded regular (at least $C^3$) domain $\Omega \subset \mathbb{R}^N$, $N \ge 1$:
\[
\frac{\partial\alpha(u)}{\partial t} - \operatorname{div}(\beta(\nabla u)) + f(u) = g, \tag{6}
\]
\[
u = 0 \ \text{on } \partial\Omega, \tag{7}
\]
\[
u|_{t=0} = u_0. \tag{8}
\]
We assume that $g \in L^\infty(\Omega)$. (The condition $g \in L^2(\Omega)$ would be sufficient to construct the global attractor; the condition $g \in L^\infty(\Omega)$ will be needed for the regularity of the global attractor. Similarly, not all the conditions listed below are necessary for the well-posedness and the existence of the global attractor.) Furthermore, we make the following assumptions:
\[
\alpha \in C^2(\mathbb{R}), \quad \alpha(0) = 0, \tag{9}
\]
\[
\alpha'(s) \ge c_1, \quad c_1 > 0, \ s \in \mathbb{R}, \tag{10}
\]
c2 s − c3 ≤ sα (s) ≤ c4 s + c5 , c2 , c4 > 0, c3 , c5 ≥ 0, s ∈ R ;
(11)
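\[
\beta \in C^1(\mathbb{R}^N)^N, \quad \beta(0) = 0, \quad d_s\beta \ \text{is bounded}, \tag{12}
\]
\[
c_6|s|^2 - c_7 \le B(s) \le c_8|s|^2 + c_9, \quad c_6, c_8 > 0, \ c_7, c_9 \ge 0, \ s \in \mathbb{R}^N, \tag{13}
\]
where $d_sB(s).v = \beta(s).v$, $s, v \in \mathbb{R}^N$ (d denoting the differential), $B(0) = 0$,
\[
\beta(s).s \ge c_{10}|s|^2, \quad c_{10} > 0, \ s \in \mathbb{R}^N, \tag{14}
\]
\[
d_s\beta(s).v.v \ge c_{11}|v|^2, \quad c_{11} > 0, \ s, v \in \mathbb{R}^N. \tag{15}
\]
Here, . denotes the usual Euclidean scalar product and |.| the associated norm. We finally assume that $f \in C^1(\mathbb{R})$ and that
\[
\operatorname{sgn}(s)f(s) \ge c_{12}|s|^{p+1} - c_{13}, \quad c_{12} > 0, \ c_{13} \ge 0, \ s \in \mathbb{R}, \tag{16}
\]
\[
|f(s)| \le c_{14}|s|^{p+1} + c_{15}, \quad c_{14} > 0, \ c_{15} \ge 0, \ s \in \mathbb{R}, \tag{17}
\]
\[
f'(s) \ge -c_{16}, \quad c_{16} \ge 0, \ s \in \mathbb{R}, \tag{18}
\]
where $p > 0$. We note that it follows from (10) and (18) that there exists a constant $c_{17} > 0$ such that
\[
f + c_{17}\alpha \ \text{is increasing}. \tag{19}
\]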
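The constant in (19) can be made explicit. A minimal sketch, assuming that (10) and (18) are read as $\alpha' \ge c_1$ and $f' \ge -c_{16}$:

```latex
(f + c_{17}\alpha)'(s) = f'(s) + c_{17}\,\alpha'(s) \ge -c_{16} + c_{17}\,c_1 \ge 0
\quad \text{whenever } c_{17} \ge \frac{c_{16}}{c_1}.
```

Taking the inequality strict, $c_{17} > c_{16}/c_1$, makes $f + c_{17}\alpha$ strictly increasing.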
We set $H = L^2(\Omega)$ and $V = H_0^1(\Omega)$, which we endow with their usual scalar products and associated norms. In particular, we denote by (., .) the usual $L^2$-scalar product and by |.| the associated norm. Under the above assumptions, we can prove, by using standard techniques (see, e.g., [7, 8, 11] and [28]), that (6)-(8) is well-posed and that we can define the semigroup
\[
S(t) : H \to H, \quad u_0 \mapsto u(t), \quad t \ge 0,
\]
where u(t) denotes the solution of (6)-(8) at time t (i.e., $S(0) = I$ and $S(t + s) = S(t) \circ S(s)$, $t, s \ge 0$). Furthermore, proceeding as in [11] and using (10) and (19), we can prove that the mapping $x \mapsto S(t)x$ is Lipschitz continuous for the norm of $L^1(\Omega)$, $\forall t \in \mathbb{R}$. We then have the
Theorem 1.1. The semigroup S(t) possesses the global attractor A in H such that A is bounded in V.
Proof. We will proceed formally here. However, all these formal calculations can be easily justified by proper regularization techniques (see, e.g., [11]). We first multiply (6) by u, integrate over $\Omega$ and obtain, setting
\[
A(s) = \int_0^s \tau\alpha'(\tau) \, d\tau, \quad s \in \mathbb{R},
\]
\[
\frac{d}{dt}\int_\Omega A(u) \, dx + (\beta(\nabla u), \nabla u) + (f(u), u) = (g, u).
\]
We note that it follows from (14) that $(\beta(\nabla u), \nabla u) \ge c_{10}|\nabla u|^2$. Furthermore, we deduce from (16) that $(f(u), u) \ge c|u|^{p+2} - c'$, $c > 0$, $c' \ge 0$. Therefore, we have
\[
\frac{d}{dt}\int_\Omega A(u) \, dx + c|\nabla u|^2 + c'|u|^{p+2} \le c'', \tag{20}
\]
where $c, c' > 0$. In particular, we deduce from (20) that
\[
\frac{d}{dt}\int_\Omega A(u) \, dx + c|u|^2 \le c', \tag{21}
\]
which yields, using Gronwall's lemma and noting that it follows from (11) that
\[
c|u|^2 - c' \le \int_\Omega A(u) \, dx \le c''|u|^2 + c''', \quad c, c'' > 0, \ c', c''' \ge 0, \tag{22}
\]
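the existence of a bounded absorbing set $B_0$ in H (i.e., $\forall B \subset H$ bounded, $\exists t_0 = t_0(B)$ such that $t \ge t_0$ implies $S(t)B \subset B_0$). It also follows from (20) and the existence of a bounded absorbing set in H that, for $r > 0$,
\[
\int_t^{t+r} |\nabla u|^2 \, d\tau \le c(r), \tag{23}
\]
\[
\int_t^{t+r} |u|^{p+2} \, d\tau \le c'(r), \tag{24}
\]
$t \ge t_0(|u_0|)$, where c and $c'$ are independent of $u_0$. We then multiply (6) by $\frac{\partial u}{\partial t}$ and integrate over $\Omega$ to obtain
\[
\Bigl( \beta(\nabla u), \nabla\frac{\partial u}{\partial t} \Bigr) + \Bigl( \alpha'(u)\frac{\partial u}{\partial t}, \frac{\partial u}{\partial t} \Bigr) + \frac{d}{dt}\int_\Omega F(u) \, dx = \Bigl( g, \frac{\partial u}{\partial t} \Bigr),
\]
where $F(s) = \int_0^s f(\tau) \, d\tau$, $s \in \mathbb{R}$. We note that, owing to (10),
\[
\Bigl( \alpha'(u)\frac{\partial u}{\partial t}, \frac{\partial u}{\partial t} \Bigr) \ge c_1\Bigl| \frac{\partial u}{\partial t} \Bigr|^2.
\]
Furthermore,
\[
\Bigl( \beta(\nabla u), \nabla\frac{\partial u}{\partial t} \Bigr) = \frac{d}{dt}\int_\Omega B(\nabla u) \, dx. \tag{25}
\]
Thus,
\[
\frac{d}{dt}\Bigl( \int_\Omega B(\nabla u) \, dx + \int_\Omega F(u) \, dx \Bigr) + c\Bigl| \frac{\partial u}{\partial t} \Bigr|^2 \le c', \quad c > 0. \tag{26}
\]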
We then note that it follows from (13) that
\[
c_6|\nabla u|^2 - c \le \int_\Omega B(\nabla u) \, dx \le c_8|\nabla u|^2 + c', \quad c, c' \ge 0, \tag{27}
\]
and it follows from (16)-(17) that
\[
c|u|^{p+2} - c' \le \int_\Omega F(u) \, dx \le c''|u|^{p+2} + c''', \quad c, c'' > 0, \ c', c''' \ge 0. \tag{28}
\]
We finally deduce from (23), (24), (26), (27), (28) and the uniform Gronwall's lemma (see [27]) that, for $r > 0$, $|\nabla u|^2 \le c(r)$, $t \ge t_0(|u_0|) + r$, where c is independent of $u_0$, hence the existence of a bounded absorbing set $B_1$ in V and, noting that V is compactly embedded into H, of a relatively compact absorbing set in H. This finishes the proof of the theorem (see [27] for more details).
2
Regularity of the global attractor
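We have the
Proposition 2.1. The global attractor A is bounded in $L^\infty(\Omega)$.
Proof. Proceeding as in [11], we multiply (6) by $\alpha(u)|\alpha(u)|^k$, k integer, and integrate over $\Omega$ to obtain
\[
\frac{1}{k+2}\frac{d}{dt}\int_\Omega |\alpha(u)|^{k+2} \, dx + (k+1)\int_\Omega \alpha'(u)|\alpha(u)|^k \beta(\nabla u).\nabla u \, dx + \int_\Omega f(u)\alpha(u)|\alpha(u)|^k \, dx = (g, \alpha(u)|\alpha(u)|^k).
\]
We note that
\[
\int_\Omega \alpha'(u)|\alpha(u)|^k \beta(\nabla u).\nabla u \, dx \ge 0, \qquad |(g, \alpha(u)|\alpha(u)|^k)| \le c\int_\Omega |\alpha(u)|^{k+1} \, dx.
\]
Furthermore, since $\alpha$ is increasing and $\alpha(0) = 0$, then $\operatorname{sgn}(\alpha(u)) = \operatorname{sgn}(u)$ and
\begin{align*}
\int_\Omega f(u)\alpha(u)|\alpha(u)|^k \, dx &= \int_\Omega \operatorname{sgn}(u)f(u)|\alpha(u)|^{k+1} \, dx \\
&\ge c_{12}\int_\Omega |u|^{p+1}|\alpha(u)|^{k+1} \, dx - c_{13}\int_\Omega |\alpha(u)|^{k+1} \, dx \quad \text{(thanks to (16))} \\
&\ge c\int_\Omega |\alpha(u)|^{k+p+2} \, dx - c'\int_\Omega |\alpha(u)|^{k+1} \, dx \quad \text{(thanks to (10)-(11))},
\end{align*}
$c > 0$, $c' \ge 0$. We thus have
\[
\frac{1}{k+2}\frac{d}{dt}\int_\Omega |\alpha(u)|^{k+2} \, dx + c\int_\Omega |\alpha(u)|^{k+p+2} \, dx \le c'\int_\Omega |\alpha(u)|^{k+1} \, dx,
\]
which yields, using Hölder's inequality (on both sides; see [11])
\[
\frac{1}{k+2}\frac{d}{dt}\int_\Omega |\alpha(u)|^{k+2} \, dx + c\Bigl( \int_\Omega |\alpha(u)|^{k+2} \, dx \Bigr)^{\frac{k+p+2}{k+2}} \le c'\Bigl( \int_\Omega |\alpha(u)|^{k+2} \, dx \Bigr)^{\frac{k+1}{k+2}},
\]
so that
\[
\frac{d}{dt}\|\alpha(u)\|_{L^{k+2}(\Omega)} + c\|\alpha(u)\|^{p+1}_{L^{k+2}(\Omega)} \le c', \tag{29}
\]
where $c > 0$ and $c' \ge 0$ are independent of k. It follows from (29) that (see [11] and [27])
\[
\|\alpha(u)\|_{L^{k+2}(\Omega)} \le c' + \frac{c''}{(pt)^{1/p}}, \quad t \ge r, \ r > 0,
\]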
for every k, where the constants only depend on those appearing in (29) and on r (in particular, they are independent of $u_0$). Letting $k \to +\infty$, this yields
\[
\|\alpha(u)\|_{L^\infty(\Omega)} \le c' + \frac{c''}{(pt)^{1/p}}, \quad t \ge r, \ r > 0,
\]
hence
\[
\|u\|_{L^\infty(\Omega)} \le c' + \frac{c''}{(pt)^{1/p}}, \quad t \ge r, \ r > 0,
\]
and the result follows, owing to the invariance of A.
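We shall henceforth assume that N = 1, 2 or 3 when $\alpha = cI$, $c > 0$, and that N = 1 or 2 otherwise. We then have the
Proposition 2.2. We assume that $u_0 \in A$. Then, for every $t > 0$, $\frac{\partial u}{\partial t} \in H$.
Proof. We differentiate (6) with respect to time and have, setting $v = \frac{\partial u}{\partial t}$,
\[
\alpha'(u)\frac{\partial v}{\partial t} - \operatorname{div}(d_s\beta(\nabla u).\nabla v) + \alpha''(u)|v|^2 + f'(u)v = 0. \tag{30}
\]
Multiplying (30) by v and integrating over $\Omega$, we obtain, thanks to (15),
\[
\int_\Omega \alpha'(u)\frac{\partial v}{\partial t}v \, dx + c_{11}|\nabla v|^2 + \int_\Omega \alpha''(u)|v|^2 v \, dx + \int_\Omega f'(u)|v|^2 \, dx \le 0. \tag{31}
\]
We have
\[
\int_\Omega \alpha'(u)\frac{\partial v}{\partial t}v \, dx = \frac{1}{2}\frac{d}{dt}\int_\Omega \alpha'(u)|v|^2 \, dx - \frac{1}{2}\int_\Omega \alpha''(u)|v|^2 v \, dx,
\]
\[
\int_\Omega f'(u)|v|^2 \, dx \ge -c_{16}|v|^2,
\]
so that (31) yields
\[
\frac{d}{dt}\int_\Omega \alpha'(u)|v|^2 \, dx + c|\nabla v|^2 + \int_\Omega \alpha''(u)|v|^2 v \, dx \le c'|v|^2. \tag{32}
\]
Noting that $A \subset L^\infty(\Omega)$, we have, for N = 1 or 2 (when N = 3 and $\alpha = cI$, the quantity below vanishes),
\begin{align*}
\Bigl| \int_\Omega \alpha''(u)|v|^2 v \, dx \Bigr| &\le c\|v\|^3_{L^3(\Omega)} \\
&\le c\|v\|^3_{H^{1/3}(\Omega)} \\
&\le c|v|^2|\nabla v| \\
&\le \epsilon|\nabla v|^2 + c(\epsilon)|v|^4, \quad \epsilon > 0.
\end{align*}
We thus find
\[
\frac{d}{dt}\int_\Omega \alpha'(u)|v|^2 \, dx \le c|v|^4 + c'|v|^2. \tag{33}
\]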
Setting $y = \int_\Omega \alpha'(u)|v|^2 \, dx$, we finally obtain (noting again that $A \subset L^\infty(\Omega)$)
\[
\frac{dy}{dt} \le cy^2 + c'y \le c''y^2 + c'''. \tag{34}
\]
We have, owing to (26)-(28),
\[
\int_t^{t+r} y \, d\tau \le c(r), \quad t \ge 0, \ r > 0,
\]
so that an application of the uniform Gronwall's lemma yields $y(t) \le c(r)$, $t \ge r$, $r > 0$, and the result follows, owing to (10).
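For reference, the uniform Gronwall lemma used here is commonly stated as follows (see [27]). The constants $a_1$, $a_2$, $a_3$ are notation introduced only for this sketch, and y, g, h are assumed nonnegative and locally integrable.

```latex
y' \le g\,y + h,
\quad
\int_t^{t+r} g \, d\tau \le a_1,
\quad
\int_t^{t+r} h \, d\tau \le a_2,
\quad
\int_t^{t+r} y \, d\tau \le a_3
\;\Longrightarrow\;
y(t + r) \le \Bigl( \frac{a_3}{r} + a_2 \Bigr) e^{a_1}.
```

In the situation of (34) one may take $g = c''y$ and h equal to the constant $c'''$, so that $a_1$ and $a_3$ are controlled by the integral bound on y and $a_2 = c'''r$.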
We are now in position to prove
Theorem 2.3. The global attractor A is bounded in $H^2(\Omega)$.
Proof. We rewrite (6) in the form
\[
-\operatorname{div}(\beta(\nabla u)) = \varphi(x, t), \tag{35}
\]
where
\[
\varphi(x, t) = g - f(u) - \alpha'(u)\frac{\partial u}{\partial t}. \tag{36}
\]
It follows from Propositions 2.1 and 2.2 that, if $u_0 \in A$, then, for every $t > 0$, $\varphi(., t) \in L^2(\Omega)$ (we can also note that the norm of $\varphi(., t)$ in this space only depends on A). The theorem follows from the invariance of A and
Theorem 2.4. For every $\varphi \in L^2(\Omega)$, the problem
\[
-\operatorname{div}(\beta(\nabla u)) = \varphi, \quad u = 0 \ \text{on } \partial\Omega,
\]
possesses a unique solution u such that $u \in H_0^1(\Omega) \cap H^2(\Omega)$.
This theorem will be proven in Section 4 below (for $N \ge 2$, the case N = 1 being straightforward).
Remark 2.5. The results obtained in this section also allow for more regularity on the solutions. Indeed, it follows from Proposition 2.1 that the solutions belong to $L^\infty(\eta, +\infty; L^\infty(\Omega))$, $\forall\eta > 0$. Furthermore, we have, owing to Proposition 2.2 and Theorem 2.3, $u \in L^\infty(\eta, +\infty; H^2(\Omega))$ and $\frac{\partial u}{\partial t} \in L^\infty(\eta, +\infty; H) \cap L^2(\eta, T; V)$, $\forall\eta > 0$, $T > \eta$, for N = 1, 2 or 3 when $\alpha = cI$ and N = 1 or 2 otherwise. We finally note that, since $\frac{\partial u}{\partial t} \in L^2(\eta, T; H)$, $\forall\eta > 0$, $T > \eta$ (see (26)), we deduce from Theorem 2.4 that $u \in L^2(\eta, T; H^2(\Omega))$, $\forall\eta > 0$, $T > \eta$, without any restriction on the space dimension N.
3
Dimension of the global attractor
We assume in this section that N = 1, 2 or 3 when α = cI, c > 0, and that N = 1 or 2 otherwise. We first note that it follows from the results obtained in the previous section that the semigroup associated with (6)-(7) possesses a bounded and positively invariant absorbing set B2 in H 2 (Ω) ∩ H01 (Ω) (we note that H 2 (Ω) ⊂ L∞ (Ω) with continuous injection for N ≤ 3). More precisely, we will take B2 of the form B2 = ∪t≥t0 S(t)B2 , where B2 is a bounded absorbing set in H 2 (Ω) and t0 is such that t ≥ t0 implies S(t)B2 ⊂ B2 and where the closure is taken in the weak topology of H 2 (Ω). (We note that, if u0 ∈ B2 , then u(t) = S(t)u0 ∈ H 2 (Ω), ∀t ≥ 0. Furthermore, u0 is the limit, for the weak topology of H 2 (Ω), of a sequence (u0n ), where u0n ∈ ∪t≥t0 S(t)B2 , for every n. It is then not difficult to show, passing to the limit in the equation, that, at least for a subsequence, (un (t) = S(t)u0n ) converges to u(t) for the weak topology of H 2 (Ω), ∀t ≥ 0, hence the positive ∈ L∞ (0, +∞; H) and the norm of ∂u invariance of B2 .) Furthermore, if u0 ∈ B2 , then ∂u ∂t ∂t in this space only depends on B2 (in particular, it is bounded independently of u0 ∈ B2 ). The same holds for the norm of u in L∞ (0, +∞; H 2 (Ω)) and in L∞ (0, +∞; L∞ (Ω)). We then have Proposition 3.1. Let u1 and u2 be two solutions of (6)-(7) starting from B2 . Then, d 2 2 α (u1 )|u1 − u2 | dx + c|∇(u1 − u2 )| ≤ c α (u1 )|u1 − u2 |2 dx, (37) dt Ω Ω where the positive constants c and c are independent of u1 (0) and u2 (0). Proof. We set u = u1 − u2 . We have ∂ (α(u1 ) − α(u2 )) − div(β(∇u1 ) − β(∇u2 )) + f (u1 ) − f (u2 ) = 0. ∂t We multiply (38) by u and integrate over Ω to obtain ∂ (α(u1 ) − α(u2 ))udx + (β(∇u1 ) − β(∇u2 ), ∇u) + (f (u1 ) − f (u2 ), u) = 0. 
Ω ∂t We note that ∂ ∂u ∂u2 (α(u1 ) − α(u2 ))udx = α (u1 ) udx + (α (u1 ) − α (u2 )) udx ∂t ∂t Ω ∂t Ω Ω 1d ∂u2 udx = α (u1 )u2 dx + (α (u1 ) − α (u2 )) 2 dt Ω ∂t Ω 1 ∂u1 2 u dx − α (u1 ) 2 Ω ∂t and, due to (15), (β(∇u1 ) − β(∇u2 ), ∇u) ≥ c11 |∇u|2 .
(38)
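Therefore, we have
\[
\frac{1}{2}\frac{d}{dt}\int_\Omega \alpha'(u_1)u^2 \, dx + c_{11}|\nabla u|^2 \le \frac{1}{2}\int_\Omega \alpha''(u_1)\frac{\partial u_1}{\partial t}u^2 \, dx - \int_\Omega (\alpha'(u_1) - \alpha'(u_2))\frac{\partial u_2}{\partial t}u \, dx - (f(u_1) - f(u_2), u). \tag{39}
\]
Here, it is not difficult to show that (we assume that N = 2; the case N = 1 can be treated analogously and, when N = 3 and $\alpha = cI$, the quantity below vanishes)
\begin{align*}
\Bigl| \frac{1}{2}\int_\Omega \alpha''(u_1)\frac{\partial u_1}{\partial t}u^2 \, dx - \int_\Omega (\alpha'(u_1) - \alpha'(u_2))\frac{\partial u_2}{\partial t}u \, dx \Bigr|
&\le c\int_\Omega \Bigl( \Bigl| \frac{\partial u_1}{\partial t} \Bigr| + \Bigl| \frac{\partial u_2}{\partial t} \Bigr| \Bigr)u^2 \, dx \\
&\le c\Bigl( \Bigl| \frac{\partial u_1}{\partial t} \Bigr| + \Bigl| \frac{\partial u_2}{\partial t} \Bigr| \Bigr)\|u\|^2_{L^4(\Omega)} \\
&\le c|u||\nabla u| \\
&\le \frac{c_{11}}{2}|\nabla u|^2 + c|u|^2. \tag{40}
\end{align*}
Furthermore, we have
\[
|(f(u_1) - f(u_2), u)| \le c|u|^2. \tag{41}
\]
We finally deduce from (39)-(41) that
\[
\frac{d}{dt}\int_\Omega \alpha'(u_1)u^2 \, dx + c|\nabla u|^2 \le c'|u|^2, \tag{42}
\]
hence the result.
We now set $l = \frac{1}{c'}$, where $c'$ is the constant appearing in (37). Then, we deduce from (37) and Gronwall's lemma that, setting $w = \alpha'(u_1)^{\frac{1}{2}}u$,
\[
|w(t_2)|^2 \le e^{\frac{t_2 - t_1}{l}}|w(t_1)|^2 \le e^2|w(t_1)|^2, \tag{43}
\]
for $0 \le t_2 - t_1 \le 2l$. We fix s in (0, l) and integrate (37) over $t \in (s, 2l)$ to obtain, owing to (43),
\[
|w(2l)|^2 + c\int_s^{2l} |\nabla u|^2 \, dt \le c'\int_s^{2l} |w|^2 \, dt + |w(s)|^2 \le c''|w(s)|^2 \le c'''|u(s)|^2,
\]
which yields
\[
\int_l^{2l} |\nabla u|^2 \, dt \le c|u(s)|^2.
\]
We finally obtain, integrating the above inequality over $s \in (0, l)$,
\[
\int_l^{2l} |\nabla(u_1 - u_2)|^2 \, dt \le c\int_0^l |u_1 - u_2|^2 \, dt. \tag{44}
\]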
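The first inequality in (43) is the differential form of Gronwall's lemma. A sketch, assuming that (37) is used in the weakened form $\frac{d}{dt}|w|^2 \le c'|w|^2$ obtained by dropping the nonnegative gradient term, with $|w|^2 = \int_\Omega \alpha'(u_1)|u_1 - u_2|^2 \, dx$ and $l = 1/c'$:

```latex
\frac{d}{dt}\Bigl( e^{-c't}\,|w(t)|^2 \Bigr) \le 0
\;\Longrightarrow\;
|w(t_2)|^2 \le e^{c'(t_2 - t_1)}\,|w(t_1)|^2
  = e^{\frac{t_2 - t_1}{l}}\,|w(t_1)|^2
  \le e^{2}\,|w(t_1)|^2,
\qquad 0 \le t_2 - t_1 \le 2l.
```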
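We now have
Proposition 3.2. We assume that the assumptions of Proposition 3.1 hold and that l is as above. Then,
\[
\Bigl\| \frac{\partial}{\partial t}(u_1 - u_2) \Bigr\|_{L^2(l, 2l; H^{-1}(\Omega))} \le c\|u_1 - u_2\|_{L^2(0, l; H)}. \tag{45}
\]
Proof. We have, again setting $u = u_1 - u_2$,
\[
\Bigl\| \alpha'(u_1)\frac{\partial u}{\partial t} \Bigr\|_{L^2(l, 2l; H^{-1}(\Omega))} = \sup_\varphi \Bigl| \int_l^{2l} \Bigl\langle \alpha'(u_1)\frac{\partial u}{\partial t}, \varphi \Bigr\rangle \, dt \Bigr|,
\]
where $\varphi \in L^2(l, 2l; V)$, $\|\varphi\|_{L^2(l, 2l; V)} = 1$, and $\langle ., . \rangle$ denotes the duality product between V and $H^{-1}(\Omega)$. We then have, noting that
\[
\alpha'(u_1)\frac{\partial u}{\partial t} = \operatorname{div}(\beta(\nabla u_1) - \beta(\nabla u_2)) - (f(u_1) - f(u_2)) - (\alpha'(u_1) - \alpha'(u_2))\frac{\partial u_2}{\partial t},
\]
\begin{align*}
\Bigl\| \alpha'(u_1)\frac{\partial u}{\partial t} \Bigr\|_{L^2(l, 2l; H^{-1}(\Omega))} \le \sup_\varphi \Bigl( &\int_l^{2l} |\beta(\nabla u_1) - \beta(\nabla u_2)||\nabla\varphi| \, dt + \int_l^{2l} |f(u_1) - f(u_2)||\varphi| \, dt \\
&+ \int_l^{2l} dt \int_\Omega |\alpha'(u_1) - \alpha'(u_2)|\Bigl| \frac{\partial u_2}{\partial t} \Bigr||\varphi| \, dx \Bigr). \tag{46}
\end{align*}
Furthermore,
\[
\int_l^{2l} |\beta(\nabla u_1) - \beta(\nabla u_2)||\nabla\varphi| \, dt \le c\|u\|_{L^2(l, 2l; V)}, \tag{47}
\]
\[
\int_l^{2l} |f(u_1) - f(u_2)||\varphi| \, dt \le c\|u\|_{L^2(l, 2l; H)} \le c\|u\|_{L^2(l, 2l; V)} \tag{48}
\]
and (noting that, when N = 3 and $\alpha = cI$, the quantity below vanishes)
\begin{align*}
\int_l^{2l} dt \int_\Omega |\alpha'(u_1) - \alpha'(u_2)|\Bigl| \frac{\partial u_2}{\partial t} \Bigr||\varphi| \, dx
&\le c\int_l^{2l} dt \int_\Omega |u|\Bigl| \frac{\partial u_2}{\partial t} \Bigr||\varphi| \, dx \\
&\le c\Bigl\| \frac{\partial u_2}{\partial t} \Bigr\|_{L^\infty(l, 2l; H)}\|u\|_{L^2(l, 2l; V)} \\
&\le c\|u\|_{L^2(l, 2l; V)}. \tag{49}
\end{align*}
We thus deduce from (46)-(49) that
\[
\Bigl\| \alpha'(u_1)\frac{\partial u}{\partial t} \Bigr\|_{L^2(l, 2l; H^{-1}(\Omega))} \le c\|u\|_{L^2(l, 2l; V)},
\]
which yields, owing to (44),
\[
\Bigl\| \alpha'(u_1)\frac{\partial u}{\partial t} \Bigr\|_{L^2(l, 2l; H^{-1}(\Omega))} \le c\|u\|_{L^2(0, l; H)}. \tag{50}
\]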
The result follows easily.
In view of the above results, we now prove that the global attractor A constructed in Section 1 has finite (fractal) dimension. To do so, we use the method of l-trajectories (see [19] for a detailed presentation). We introduce the space of trajectories Xl = {v : (0, l) → H, v is a solution of (6) − (7) on (0, l)}, where l is as above, which we endow with the topology of L2 (0, l; H). We then set Bl = {v ∈ Xl , v(0) ∈ B2 }. We note that Bl is closed in the topology of L2 (0, l; H). Indeed, let (vn ) be a sequence of trajectories belonging to Bl which converges to some v. Then we can prove that, for every n, vn satisfies all the estimates derived in the previous sections (in particular, vn (t) ∈ B2 , for every t ∈ [0, l] and for every n) and we can pass to the limit in the equation to prove that, at least for a subsequence, (vn ) converges to a solution v of the equation on [0, l] which is weakly continuous from [0, l] onto H 2 (Ω) and that v(0) ∈ B2 (we note that (vn (0)) is bounded in H 2 (Ω) independently of n and, by construction, B2 is weakly closed in H 2 (Ω)). In particular, this yields that Bl is a complete metric space (we note that this is not necessarily the case for Xl , see [19] and [20]). We then define the operators L(t) : Xl → Xl , t ≥ 0, by (L(t)v)(s) = u(t + s), s ∈ [0, l], where u is the unique solution of (6)-(7) such that u[0,l] = v. We finally set L = L(l). Then, it follows from (44) that, if v1 , v2 ∈ Bl , (51) Lv1 − Lv2 L2 (0,l;V ) ≤ cv1 − v2 Xl and it follows from (45) that
∂ ≤ cv1 − v2 Xl . (Lv1 − Lv2 ) ∂t L2 (0,l;H −1 (Ω))
(52)
Furthermore, we easily prove that L(t) is Lipschitz from Bl onto Xl , ∀t ≥ 0, and that the mapping t → L(t)v is Lipschitz, ∀v ∈ Xl . Thanks to (51) and (52), it can be proved (see [19] for the details of the proof) that the semigroup L(t) possesses an exponential attractor Ml on Bl , that is, Ml is compact for the topology of Xl , is positively invariant (i.e., L(t)Ml ⊂ Ml , ∀t ≥ 0), has finite fractal dimension and attracts exponentially fast Bl (again, for the topology of Xl ). We now introduce the mapping e : Xl → H defined by e(v) = v(l) (i.e., e maps an l-trajectory onto its endpoint). Noting that e is Lipschitz, it follows that M = e(Ml ) is an exponential attractor for S(t) on H (see [19]). Noting finally that an exponential
attractor has, by definition, finite fractal dimension and that it always contains the global attractor, we have Theorem 3.3. The global attractor A associated with (6)-(7) has finite fractal dimension. Remark 3.4. In [11], the authors prove, in the particular case β = I, the finite dimensionality of the global attractor by using the classical method, based on the volume contraction method and the Lyapunov exponents (see, e.g., [27]). We note that this method requires some differentiability of the semigroup S(t), which we are not able to prove for our problem. We also note that the result of [11] is obtained under stronger assumptions on α (which needs to be of class C 3 ). Furthermore, in [12], the authors prove the existence of an exponential attractor, again for β = I. This also requires additional assumptions. We note that the result in [12] is obtained by using the classical construction given in [10], based on the squeezing property. However, the estimates derived in [11] already imply the existence of an exponential attractor, owing to a construction given in [13], based on a smoothing property for the difference of two solutions. We finally note that, for our problem, we would need more regularity on the solutions in order to apply the results of [13], so that the method of l-trajectories seems optimal (as far as the regularity of the solutions is concerned) in order to prove the finite dimensionality of the global attractor (and the existence of an exponential attractor). Remark 3.5. We assume that β = I and we set w = α(u). Then, w is solution of ∂w − Δα−1 (w) + f (α−1 (w)) = g, ∂t
(53)
w = 0 on ∂Ω.
(54)
∂w − (α−1 ) (w)Δw − (α−1 ) (w)|∇w|2 + f ◦ α−1 (w) = g. ∂t
(55)
We rewrite (53) in the form
This equation is of the form considered in [3, Chapter 1, Section 7]. Thus, assuming, in addition to the assumptions already made, that α is of class C 3+δ , α (0) = 0, f is of class C 1+δ and g is of class C δ , for some δ > 0, we deduce, from the results of [3], the existence of the global (C(Ω), C 2+δ (Ω))-attractor A˜ for (53)-(54), i.e., A˜ is bounded in C 2+δ (Ω), is compact in C(Ω), is invariant and attracts the bounded sets of C 2+δ (Ω) in the topology of C(Ω). Furthermore, it is not difficult to prove that A˜ has finite (fractal) dimension (we note that A˜ ⊂ A). We note that these results are obtained without any restriction on the space dimension N . However, in view, e.g., of applications to problems in phase separations, it is desirable to work in less regular phase spaces (typically, in L2 (Ω)). It would thus be interesting to prove that one can extend the above dynamical system to less regular initial data, and also to extend this approach to the more general equation (6). This will be addressed in a forthcoming paper.
4
Proof of Theorem 2.4
We will actually prove a more general result which can have an interest of its own. Let Ω be a smooth (at least C 3 ) bounded domain of RN , N ≥ 2. We consider the following elliptic boundary value problem in Ω : − div(a(x, u, ∇u)) = f,
(56)
u = 0 on ∂Ω.
(57)
f ∈ L2 (Ω),
(58)
a is of class C 1 with respect to each argument,
(59)
a(x, 0, 0) = 0, ∀x ∈ Ω,
(60)
We make the following assumptions :
N
the operator v → a(., v, ∇v) is Lipschitz from H01 (Ω) onto L2 (Ω) , and, if a = (a1 , ..., aN ) and aij = N
∂ai , ∂sj
(61)
i, j = 1, ..., N, a = a(x, u, s),
aij (x, u, s)ξi ξj ≥ c18 |ξ|2 , ∀x ∈ Ω, u ∈ R, s ∈ RN , ξ ∈ RN , ξ = (ξ1 , ..., ξN ),
(62)
i,j=1
c18 > 0, c18 − c19
N i=1
Sup|
∂ai | > 0, ∂u
(63)
where c19 denotes the constant in Poincar´e’s inequality. We then have the Proposition 4.1. There exists a constant c20 > 0 such that 2
(a(x, v 1 , ∇v 1 ) −a(x, v 2 , ∇v 2 ), ∇(v 1 − v 2 )) ≥ c20 |∇(v 1 − v 2 )| , ∀v 1 , v 2 ∈ H01 (Ω).
(64)
Proof. We have du a(x, u, s).v = (
∂aN ∂a1 (x, u, s)v, ..., (x, u, s)v), ∂u ∂u
N N ds a(x, u, s).w = ( a1j (x, u, s)wj , ..., aN j (x, u, s)wj ). j=1
j=1
Therefore, a(x, v 1 , ∇v 1 ) − a(x, v 2 , ∇v 2 ) = (ϕ1 , ..., ϕN ),
(65)
(66)
where, for i = 1, ..., N , 1 ∂ai 2 1 2 1 ϕi = (x, v + t(v − v ), ∇v )dt (v 1 − v 2 ) ∂u 0 N 1 ∂ 2 2 1 2 aij (x, v , ∇v + t∇(v − v ))dt (v 1 − v 2 ). + ∂x j 0 j=1 Furthermore, (a(x, v 1 , ∇v 1 ) − a(x, v 2 , ∇v 2 ), ∇(v 1 − v 2 )) = 1 N ∂ 1 ∂ai dx [ (v − v 2 ) (x, v 2 + t(v 1 − v 2 ), ∇v 1 )(v 1 − v 2 ) ∂u ∂x i 0 i=1 Ω +
N
aij (x, v 2 , ∇v 2 + t∇(v 1 − v 2 ))
i,j=1 1
2
2
≥ c18 |∇(v − v )| −
N
Sup|
i=1
∂ 1 ∂ (v − v 2 ) (v 1 − v 2 )]dt ∂xi ∂xj
∂ai 1 ∂ 1 ||v − v 2 || (v − v 2 )|, ∂u ∂xi
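so that (64) immediately follows from (63). Indeed, writing Poincaré's inequality in the form $|v^1 - v^2| \leq c_{19}|\nabla(v^1 - v^2)|$ and noting that $\big|\frac{\partial}{\partial x_i}(v^1 - v^2)\big| \leq |\nabla(v^1 - v^2)|$, the right-hand side is bounded from below by $\big(c_{18} - c_{19}\sum_{i=1}^{N}\sup\big|\frac{\partial a_i}{\partial u}\big|\big)|\nabla(v^1 - v^2)|^2$.

Remark 4.2. (i) If, for instance, $\dfrac{\partial a_i}{\partial u}$ and $a_{ij}$ are bounded, $i, j = 1, \dots, N$, then (61) is satisfied. (ii) It follows from (60) and (64), taking $v^2 = 0$, that

$$(a(x, v, \nabla v), \nabla v) \geq c_{20}\,|\nabla v|^2, \ \forall v \in H_0^1(\Omega).$$

It follows from the above assumptions and Proposition 4.1 that (56)-(57) possesses a unique solution $u \in H_0^1(\Omega)$ (see, e.g., [17]). Our aim is to prove that $u \in H_0^1(\Omega) \cap H^2(\Omega)$. We first recall that, $\forall x \in \Omega$, there exists an open neighborhood $V$ of $x$, an open set $W$ of $\mathbb{R}^N$ and a $C^3$-diffeomorphism $T : V \to W$ such that

$$T(\Omega \cap V) = \mathbb{R}^N_+ \cap W,$$

where $\mathbb{R}^N_+ = \{y = (y_1, \dots, y_N) \in \mathbb{R}^N, \ y_N > 0\}$; indeed, $\partial\Omega$ is of class $C^3$. Then, for a function $v$ defined on $\Omega$, we set

$${}^T v = (v|_{\Omega \cap V}) \circ T^{-1}$$

and we introduce the tangent spaces, for $m = 1$ or $2$,

$$H^m_{\mathrm{tan}}(\Omega) = \Big\{ v : \ {}^T v, \ \frac{\partial\,{}^T v}{\partial y_1}, \dots, \frac{\partial\,{}^T v}{\partial y_{N-1}}, \ y_N \frac{\partial\,{}^T v}{\partial y_N} \in H^{m-1}(\mathbb{R}^N_+ \cap W), \text{ for every } T \text{ defined as above}\Big\}.$$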
It follows from the Lipschitz continuity (61) and the monotonicity property (64) that the solution $u$ of (56)-(57) satisfies

$$u \in H^2_{\mathrm{tan}}(\Omega) \tag{67}$$

(see [25, Theorem 2.1] for details). We finally assume that

$$a(x, \cdot, \nabla \cdot) \text{ maps } H^2_{\mathrm{tan}}(\Omega) \text{ onto } H^1_{\mathrm{tan}}(\Omega)^N \tag{68}$$

(for instance, it is not difficult to show that this property is satisfied for the operator $\beta$ considered in the previous sections). Let then $\theta = (\theta_1, \dots, \theta_N)$ be a $C^2$-vector field tangent to $\partial\Omega$, i.e.,

$$\theta \in C^2(\Omega)^N, \quad \sum_{i=1}^{N} n_i \theta_i = 0 \ \text{on } \partial\Omega,$$

$n = (n_1, \dots, n_N)$ being a unit normal vector to $\partial\Omega$. We set

$$\frac{\partial v}{\partial \theta} = \sum_{i=1}^{N} \theta_i \frac{\partial v}{\partial x_i}$$

and, if $v \in H^2_{\mathrm{tan}}(\Omega)$, then (see [25])

$$\frac{\partial v}{\partial \theta} \in H^1(\Omega).$$

We now define the fields $\theta^j$, $j = 1, \dots, N-1$, as follows (see [25]). For $x_0 \in \partial\Omega$, we choose a coordinate frame with origin $x_0$ and base vectors $e^1, \dots, e^N$ such that $e^N$ is the inner normal to $\partial\Omega$ at $x_0$. Since $\partial\Omega$ is of class $C^3$, there exist an open neighborhood $V$ of $0 \,(= x_0)$ in $\mathbb{R}^N$ such that the boundary of $\Omega \cap V$ is of class $C^3$ and a function $\eta \in C^3(\mathbb{R}^{N-1})$ such that $\Omega \cap V = \{x \in V, \ x_N \geq \eta(x_1, \dots, x_{N-1})\}$. We then set, for $j = 1, \dots, N-1$,

$$\theta^j = e^j + \frac{\partial \eta}{\partial x_j}\,e^N \ \text{in } \Omega \cap V. \tag{69}$$
These vector fields are of class $C^2$ and are tangent to $\partial\Omega$ in $\partial\Omega \cap V$. We note that we can extend these fields, and we still denote such extensions by $\theta^j$, $j = 1, \dots, N-1$, such that

$$\theta^j \in C^2(\Omega)^N, \quad \sum_{i=1}^{N} n_i \theta_i^j = 0 \ \text{on } \partial\Omega, \ j = 1, \dots, N-1.$$
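The tangency of the fields defined in (69) can be checked directly from the local graph description of the boundary. The short computation below is a sketch; the symbol $\nu$ for a (non-normalized) normal vector is introduced here for illustration only.

```latex
% Near x_0 the boundary is the graph x_N = \eta(x_1,\dots,x_{N-1}) and \Omega lies above it,
% so a (non-normalized) normal vector to \partial\Omega \cap V is
\[
\nu = \Big(\frac{\partial\eta}{\partial x_1}, \dots, \frac{\partial\eta}{\partial x_{N-1}}, -1\Big).
\]
% With \theta^j = e^j + \frac{\partial\eta}{\partial x_j}\, e^N from (69), only the j-th and
% N-th components of \theta^j are nonzero, hence
\[
\theta^j \cdot \nu
= 1 \cdot \frac{\partial\eta}{\partial x_j} + \frac{\partial\eta}{\partial x_j} \cdot (-1)
= 0, \qquad j = 1, \dots, N-1.
\]
```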
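We have, in view of (69), for $j = 1, \dots, N-1$,

$$\frac{\partial v}{\partial x_j} = \frac{\partial v}{\partial \theta^j} - \frac{\partial \eta}{\partial x_j}\frac{\partial v}{\partial x_N}, \ \forall v \in L^2(\Omega \cap V). \tag{70}$$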
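For the reader's convenience, relation (70) can be obtained in one line from the definition of the derivative along a vector field and from (69); the following is a sketch of that step.

```latex
% Derivative along \theta^j, using \theta^j = e^j + \frac{\partial\eta}{\partial x_j} e^N:
\[
\frac{\partial v}{\partial\theta^j}
= \sum_{i=1}^{N} \theta^j_i\,\frac{\partial v}{\partial x_i}
= \frac{\partial v}{\partial x_j} + \frac{\partial\eta}{\partial x_j}\,\frac{\partial v}{\partial x_N},
\qquad j = 1, \dots, N-1,
\]
% and solving for \partial v / \partial x_j gives (70).
```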
We set

$$\psi = a(x, u, \nabla u), \quad \psi = (\psi_1, \dots, \psi_N). \tag{71}$$

Therefore, (56) reads

$$\sum_{j=1}^{N} \frac{\partial \psi_j}{\partial x_j} = -f,$$

which, thanks to (70), is equivalent to

$$\sum_{j=1}^{N-1} \frac{\partial \psi_j}{\partial \theta^j} + \frac{\partial}{\partial x_N}\Big(\psi_N - \sum_{j=1}^{N-1} \frac{\partial \eta}{\partial x_j}\,\psi_j\Big) = -f. \tag{72}$$
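The passage from the divergence form of (56) to (72) only uses (70) and the fact that $\eta$ does not depend on $x_N$; a sketch of the computation is as follows.

```latex
\begin{align*}
\sum_{j=1}^{N} \frac{\partial\psi_j}{\partial x_j}
&= \sum_{j=1}^{N-1}\Big(\frac{\partial\psi_j}{\partial\theta^j}
   - \frac{\partial\eta}{\partial x_j}\,\frac{\partial\psi_j}{\partial x_N}\Big)
   + \frac{\partial\psi_N}{\partial x_N}
   && \text{by (70) applied to } v = \psi_j,\ j \le N-1,\\
&= \sum_{j=1}^{N-1}\frac{\partial\psi_j}{\partial\theta^j}
   + \frac{\partial}{\partial x_N}\Big(\psi_N - \sum_{j=1}^{N-1}\frac{\partial\eta}{\partial x_j}\,\psi_j\Big)
   && \text{since } \frac{\partial}{\partial x_N}\Big(\frac{\partial\eta}{\partial x_j}\Big) = 0.
\end{align*}
```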
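Since, owing to (68), $\psi \in H^1_{\mathrm{tan}}(\Omega)^N$, we have, for $j = 1, \dots, N-1$,

$$\frac{\partial \psi}{\partial \theta^j} \in L^2(\Omega)^N.$$

It thus follows from (72) that

$$\frac{\partial}{\partial x_N}\Big(\psi_N - \sum_{j=1}^{N-1} \frac{\partial \eta}{\partial x_j}\,\psi_j\Big) \in L^2(\Omega \cap V). \tag{73}$$

Moreover, for $i = 1, \dots, N-1$,

$$\frac{\partial}{\partial \theta^i}\Big(\psi_N - \sum_{j=1}^{N-1} \frac{\partial \eta}{\partial x_j}\,\psi_j\Big) = \frac{\partial \psi_N}{\partial \theta^i} - \sum_{j=1}^{N-1}\Big(\frac{\partial^2 \eta}{\partial \theta^i \partial x_j}\,\psi_j + \frac{\partial \eta}{\partial x_j}\frac{\partial \psi_j}{\partial \theta^i}\Big) \in L^2(\Omega \cap V).$$

This, together with (70), yields that, for $i = 1, \dots, N-1$,

$$\frac{\partial}{\partial x_i}\Big(\psi_N - \sum_{j=1}^{N-1} \frac{\partial \eta}{\partial x_j}\,\psi_j\Big) \in L^2(\Omega \cap V)$$

and, in view of (73), we deduce that

$$\psi_N - \sum_{j=1}^{N-1} \frac{\partial \eta}{\partial x_j}\,\psi_j \in H^1(\Omega \cap V). \tag{74}$$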
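Recalling that $\psi = a(x, u, \nabla u)$, we can rewrite (74) as

$$a_N\Big(x, u, \frac{\partial u}{\partial x_1}, \dots, \frac{\partial u}{\partial x_{N-1}}, \frac{\partial u}{\partial x_N}\Big) - \sum_{j=1}^{N-1} \frac{\partial \eta}{\partial x_j}\,a_j\Big(x, u, \frac{\partial u}{\partial x_1}, \dots, \frac{\partial u}{\partial x_{N-1}}, \frac{\partial u}{\partial x_N}\Big) = k,$$

where $k \in H^1(\Omega \cap V)$, and, recalling (70), we finally have to solve an equation of the form

$$F(z) = k,$$

where

$$z = \frac{\partial u}{\partial x_N}$$

and

$$F(z) = a_N\Big(x, u, \frac{\partial u}{\partial \theta^1} - \frac{\partial \eta}{\partial x_1}z, \dots, \frac{\partial u}{\partial \theta^{N-1}} - \frac{\partial \eta}{\partial x_{N-1}}z, z\Big) - \sum_{j=1}^{N-1} \frac{\partial \eta}{\partial x_j}\,a_j\Big(x, u, \frac{\partial u}{\partial \theta^1} - \frac{\partial \eta}{\partial x_1}z, \dots, \frac{\partial u}{\partial \theta^{N-1}} - \frac{\partial \eta}{\partial x_{N-1}}z, z\Big).$$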
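We thus have, setting $\phi = \Big(x, u, \dfrac{\partial u}{\partial \theta^1} - \dfrac{\partial \eta}{\partial x_1}z, \dots, \dfrac{\partial u}{\partial \theta^{N-1}} - \dfrac{\partial \eta}{\partial x_{N-1}}z, z\Big)$,

$$F'(z) = -\sum_{i=1}^{N-1} \frac{\partial \eta}{\partial x_i}\,a_{Ni}(\phi) + a_{NN}(\phi) + \sum_{i,j=1}^{N-1} \frac{\partial \eta}{\partial x_i}\frac{\partial \eta}{\partial x_j}\,a_{ij}(\phi) - \sum_{j=1}^{N-1} \frac{\partial \eta}{\partial x_j}\,a_{jN}(\phi).$$

Therefore, setting

$$\xi = (\xi_1, \dots, \xi_N) = \Big(\frac{\partial \eta}{\partial x_1}, \dots, \frac{\partial \eta}{\partial x_{N-1}}, -1\Big),$$

we have

$$F'(z) = \sum_{i,j=1}^{N} a_{ij}(\phi)\,\xi_i \xi_j,$$

and we finally obtain, in view of the ellipticity condition (62),

$$F'(z) \geq c_{18}|\xi|^2,$$

hence, noting that $|\xi| \geq 1$,

$$F'(z) \geq c_{18}. \tag{75}$$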
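Reading (75) as the lower bound $F'(z) \geq c_{18}$ on the derivative of $F$ with respect to $z$, one gets a quantitative monotonicity estimate. The sketch below assumes that $x$, $u$ and the tangential derivatives are frozen, so that $F$ is a $C^1$ function of the single real variable $z$; the names $z_1, z_2, k_1, k_2$ are introduced here for illustration. It does not track the dependence of $F$ on $x$, which the full $H^1$ argument also has to handle.

```latex
% Sketch: F : \mathbb{R} \to \mathbb{R} is C^1 with F'(z) \ge c_{18} > 0 (other arguments frozen).
\begin{gather*}
F(z_1) - F(z_2) = \Big(\int_0^1 F'\big(z_2 + t(z_1 - z_2)\big)\,dt\Big)(z_1 - z_2),\\
\big(F(z_1) - F(z_2)\big)(z_1 - z_2) \ge c_{18}\,|z_1 - z_2|^2,\\
|F^{-1}(k_1) - F^{-1}(k_2)| \le \frac{1}{c_{18}}\,|k_1 - k_2|.
\end{gather*}
```

The second line shows that $F$ is strictly increasing with $F(z) \to \pm\infty$ as $z \to \pm\infty$, hence a bijection of $\mathbb{R}$; the third line says that its inverse is Lipschitz with constant $1/c_{18}$.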
It thus follows from (75) that $F$ is invertible and

$$\frac{\partial u}{\partial x_N} \in H^1(\Omega \cap V).$$

Now, recalling (70), we have, for $i = 1, \dots, N$,

$$\frac{\partial u}{\partial x_i} \in H^1(\Omega \cap V),$$

so that

$$u \in H^2(\Omega \cap V),$$

hence, by gluing,

$$u \in H^2(\Omega)$$

(the interior regularity follows from (67) and the definition of the tangent space $H^2_{\mathrm{tan}}(\Omega)$). We have thus proven the

Theorem 4.3. The solution $u$ of (56)-(57) satisfies $u \in H_0^1(\Omega) \cap H^2(\Omega)$.
Acknowledgment

Part of this article was written while the author was visiting the University of Trento. He wishes to thank Professor Augusto Visintin for having suggested this problem to him and for many stimulating discussions, as well as for his warm hospitality. He also wishes to thank Professor Jacques Simon for discussions on the regularity of elliptic problems and Doctor Sergey Zelik for several interesting discussions.
References

[1] H.W. Alt and S. Luckhaus: “Quasilinear elliptic-parabolic differential equations”, Math. Z., Vol. 183, (1983), pp. 311–341.
[2] T. Arai: “On the existence of the solution for $\partial\varphi(u'(t)) + \partial\psi(u(t)) \ni f(t)$”, J. Fac. Sci. Univ. Tokyo Sect. IA Math., Vol. 26, (1979), pp. 75–96.
[3] A.V. Babin and M.I. Vishik: Attractors of evolution equations, North-Holland, Amsterdam, 1992.
[4] A. Bamberger: “Étude d’une équation doublement non linéaire”, J. Funct. Anal., Vol. 24, (1977), pp. 148–155.
[5] V. Barbu: Nonlinear semigroups and differential equations in Banach spaces, Noordhoff, Leiden, 1976.
[6] H. Brezis: Opérateurs maximaux monotones et semi-groupes de contractions dans les espaces de Hilbert, North-Holland, Amsterdam, 1973.
[7] P. Colli: “On some doubly nonlinear evolution equations in Banach spaces”, Japan J. Indust. Appl. Math., Vol. 9, (1992), pp. 181–203.
[8] P. Colli and A. Visintin: “On a class of doubly nonlinear evolution problems”, Comm. Partial Diff. Eqns., Vol. 15, (1990), pp. 737–756.
[9] E. DiBenedetto and R.E. Showalter: “Implicit degenerate evolution equations and applications”, S.I.A.M. J. Math. Anal., Vol. 12, (1981), pp. 731–751.
[10] A. Eden, C. Foias, B. Nicolaenko and R. Temam: Exponential attractors for dissipative evolution equations, Research in Applied Mathematics, Vol. 37, John-Wiley, New York, 1994.
[11] A. Eden, B. Michaux and J.-M. Rakotoson: “Doubly nonlinear parabolic-type equations as dynamical systems”, J. Dyn. Diff. Eqns., Vol. 3, (1991), pp. 87–131.
[12] A. Eden and J.-M. Rakotoson: “Exponential attractors for a doubly nonlinear equation”, J. Math. Anal. Appl., Vol. 185(2), (1994), pp. 321–339.
[13] M. Efendiev, A. Miranville and S. Zelik: “Exponential attractors for a nonlinear reaction-diffusion system in $\mathbb{R}^3$”, C. R. Acad. Sci. Paris Sér. I, Vol. 330, (2000), pp. 713–718.
[14] C.M. Elliott and R. Schätzle: “The limit of the anisotropic double-obstacle Allen-Cahn equation”, Proc. Royal Soc. Edin. A, Vol. 126, (1996), pp. 1217–1234.
[15] O. Grange and F. Mignot: “Sur la résolution d’une équation et d’une inéquation paraboliques non linéaires”, J. Funct. Anal., Vol. 11, (1972), pp. 77–92.
[16] M. Gurtin: “Generalized Ginzburg-Landau and Cahn-Hilliard equations based on a microforce balance”, Physica D, Vol. 92, (1996), pp. 178–192.
[17] O.A. Ladyzhenskaya and N.N. Ural’ceva: Equations aux dérivées partielles de type elliptique, Monographies universitaires de Mathématiques, Vol. 31, Dunod, 1968.
[18] J.-L. Lions: Quelques méthodes de résolution des problèmes aux limites non linéaires, Dunod, Paris, 1969.
[19] J. Malek and D. Prazak: “Long time behavior via the method of l-trajectories”, J. Diff. Eqns., Vol. 18(2), (2002), pp. 243–279.
[20] D. Prazak: “A necessary and sufficient condition for the existence of an exponential attractor”, Cent. Eur. J. Math., Vol. 1(3), (2003), pp. 411–417.
[21] P.-A. Raviart: “Sur la résolution de certaines équations paraboliques non linéaires”, J. Funct. Anal., Vol. 5, (1970), pp. 299–328.
[22] A. Segatti: “Global attractor for a class of doubly nonlinear abstract evolution equations”, Discrete Cont. Dyn. Systems, To appear.
[23] K. Shirakawa: “Large time behavior for doubly nonlinear systems generated by subdifferentials”, Adv. Math. Sci. Appl., Vol. 10, (2000), pp. 77–92.
[24] R.E. Showalter: Monotone operators in Banach spaces and nonlinear partial differential equations, Amer. Math. Soc., Providence, R.I., 1997.
[25] J. Simon: “Régularité de la solution d’un problème aux limites non linéaire”, Ann. Fac. Sci. Toulouse, Vol. 3(3-4), (1981), pp. 247–274.
[26] J.E. Taylor and J.W. Cahn: “Linking anisotropic sharp and diffuse surface motion laws via gradient flows”, J. Statist. Phys., Vol. 77(1-2), (1993), pp. 183–197.
[27] R. Temam: Infinite dimensional dynamical systems in mechanics and physics, 2nd ed., Springer-Verlag, 1997.
[28] A. Visintin: Models of phase transitions, Birkhäuser, Boston, 1996.