ON A PAPER OF A. FELDZAMEN
BY
S. R. FOGUEL*
ABSTRACT
Results of A. Feldzamen on semi-similarity of operators are proved here using matrix methods. The use of these methods yields simpler proofs, and the formulations of the theorems assume a more transparent form.

The purpose of this note is to give shorter and more transparent proofs of results given in [3]. This will be done by using the methods developed in [4]. It should be mentioned that we assume separability while Feldzamen does not.

1. Preliminary notions. Let $S$ be a normal operator, on a separable Hilbert space $H$, of uniform multiplicity $n < \infty$. Thus $H$ can be taken as the direct sum of $n$ equal spaces $L_2(\Omega, \Sigma, \mu)$, where $\Omega$ is a Borel subset of the plane, $\Sigma$ the collection of Borel subsets of $\Omega$, and $\mu$ a finite positive measure. Also:
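See [3], [4], or [5]. The spectral measure $E(\cdot)$ of $S$ is given by
\[ E(\delta)\begin{pmatrix} f_1(\lambda) \\ \vdots \\ f_n(\lambda) \end{pmatrix} = \begin{pmatrix} \chi(\delta) f_1(\lambda) \\ \vdots \\ \chi(\delta) f_n(\lambda) \end{pmatrix}, \]
where $\chi(\delta)$ is the characteristic function of $\delta$. Every operator $A$ that commutes with $S$ is given by a matrix of bounded and measurable functions $a_{ij}(\lambda)$, where
\[ A\begin{pmatrix} f_1(\lambda) \\ \vdots \\ f_n(\lambda) \end{pmatrix} = \bigl(a_{ij}(\lambda)\bigr)\begin{pmatrix} f_1(\lambda) \\ \vdots \\ f_n(\lambda) \end{pmatrix}. \]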
See Theorem 2.1 of [4].

Received January 18, 1963.
* The research reported in this document has been sponsored in part by the Air Force Office of Scientific Research of the Air Research and Development Command, United States Air Force, through its European Office.
DEFINITION. The vectors $y_i \in H$, $i = 1, \dots, k$, will be called dependent over $\delta$ if $y_i(\lambda)$ are dependent for almost every $\lambda \in \delta$.

LEMMA 1.1. The vectors $y_i$ are dependent over $\delta$ if and only if there exist $k$ measurable functions $g_i$, and a sequence of Borel sets $\delta_m$ increasing to $\delta$, such that
a. The functions $g_i$ are bounded on $\delta_m$ and not all zero.
b. If $g_{i,m}$ is the restriction of $g_i$ to $\delta_m$ then
\[ \sum_{i=1}^{k} g_{i,m}(S)\, y_i = 0. \]

Proof. It is clear that a. and b. imply dependence. Conversely, let $y_i(\lambda)$ be dependent for $\lambda \in \delta$. For each $\lambda \in \delta$ there exist constants $g_i(\lambda)$ such that
\[ \sum_{i=1}^{k} g_i(\lambda)\, y_i(\lambda) = 0. \]
It is enough to show that one can choose the $g_i$ to be measurable. Let us consider the matrix $(y_{i,r}(\lambda))$ where $y_{i,r}(\lambda)$ is the $r$-th component of $y_i(\lambda)$. The set $\delta$ can be decomposed into finitely many disjoint measurable sets, on each of which a certain determinant of $(y_{i,r}(\lambda))$ is the largest nonvanishing one. On each set the $g_i(\lambda)$ can be chosen by Cramer's Rule, and are thus measurable.

COROLLARY. If $k > n$ then the vectors $y_i$ are dependent over every set $\delta$.
LEMMA 1.2. Let $y_i$, $1 \le i \le n$, be independent over $\Omega$. Let $x$ be any vector in $H$. There exist $n$ measurable functions $f_i(\lambda)$ and a sequence of Borel sets $\delta_m$ increasing to $\Omega$ such that
\[ x = \lim_{m \to \infty} \sum_{i=1}^{n} f_{i,m}(S)\, y_i, \]
where $f_{i,m}$ is the restriction of $f_i$ to $\delta_m$ and is bounded. The functions $f_i$ are uniquely defined.
Proof. The vectors $x(\lambda)$, $y_i(\lambda)$ are dependent by the previous Corollary. Thus $x(\lambda)$ can be represented by a linear combination of the $y_i(\lambda)$. Since these vectors are independent the representation is unique.

The same result could be proved for the case that the $y_i$ are independent over some set $\delta \subset \Omega$.

2. Canonical form for nilpotents. In this section we will follow [1] to bring a nilpotent matrix with measurable elements to canonical form. It was proved in [4] that if $N$ is quasi nilpotent and commuting with $S$, then $N(\lambda)^n = 0$ a.e.
Let $A(\lambda; x)$ be an $n$ by $n$ matrix whose elements are polynomials in $x$ with coefficients that are measurable functions of $\lambda$. Let $\Omega_k$ be the set on which the minimal order of the polynomials $a_{ij}(\lambda; x)$ is equal to $k$. This is a measurable set. Let $\Omega_1 = \bigcup_{i,j} \Omega_1^{i,j}$ where $\Omega_1^{i,j} = \{\lambda \mid \text{order of } a_{ij}(\lambda; x) = 1\}$. Again $\Omega_1^{i,j}$ is measurable. An elementary transformation will bring $a_{ij}$ to the upper left corner and by more elementary transformations $A(\lambda; x)$ can be brought to the form
\[ \begin{pmatrix} a(\lambda) & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & A_1(\lambda; x) & \\ 0 & & & \end{pmatrix} \]
where the order of $a(\lambda)$ is one and $A_1(\lambda; x)$ has the same form as $A(\lambda; x)$. Let us split $\Omega_k$ into the sets $\Omega_k^{i,j} = \{\lambda \mid \lambda \in \Omega_k \text{ and order of } a_{ij}(\lambda; x) = k\}$. On $\Omega_k^{i,j}$ we apply to $A(\lambda; x)$ an elementary transformation to bring $a_{ij}$ to the upper left corner. Using the Euclidean Algorithm we see that there are two possibilities:
1. By an elementary transformation (using measurable coefficients) we can bring $A(\lambda; x)$ on $\Omega_k^{i,j}$ to the form
\[ \begin{pmatrix} a(\lambda; x) & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & A_1(\lambda; x) & \\ 0 & & & \end{pmatrix} \]
where $A_1(\lambda; x)$ has the same form as $A(\lambda; x)$ and $a(\lambda; x)$ divides every element of $A_1(\lambda; x)$.
2. $A(\lambda; x)$ can be transformed to a matrix whose minimal order is less than $k$, on $\Omega_k^{i,j}$.
These considerations prove:

LEMMA 2.1. There exists a matrix $B(\lambda; x)$ such that both $B(\lambda; x)$ and $B(\lambda; x)^{-1}$ have polynomial elements with coefficients that are measurable functions of $\lambda$ and:
\[ B(\lambda; x)\, A(\lambda; x)\, B(\lambda; x)^{-1} = \operatorname{diag}\{f_1(\lambda; x), f_2(\lambda; x), \dots, f_n(\lambda; x)\}, \]
where the $f_i(\lambda; x)$ are polynomials in $x$ with measurable coefficients and $f_i(\lambda; x) \mid f_{i+1}(\lambda; x)$.
Let now $A(\lambda; x) = xI - N(\lambda)$ where $N(\lambda)$ represents the nilpotent operator $N$. Then $f_i(\lambda; x) = x^{i(\lambda)}$ (or 0), for they divide the minimal polynomial of $N(\lambda)$ (Theorem 8, Chapter V, of [1]). Thus $i(\lambda)$ is a measurable function of $\lambda$ and $0 \le i(\lambda) \le n$. Let $\Omega$ be the union of the disjoint sets $\Omega_\alpha$, where on $\Omega_\alpha$, $i(\lambda)$ is equal to a given fixed integer $1 \le i \le n$. The sets $\Omega_\alpha$ are measurable. By chapter V of [1],
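Theorem 6.10, the matrix $\operatorname{diag}(f_i(\lambda; x))$ is equivalent, on $\Omega_\alpha$, to a canonical Jordan matrix, $\operatorname{diag}(f_i(\lambda; x)) \sim xI - Q_\alpha$, where
\[ Q_\alpha = \begin{pmatrix} 0 & \varepsilon_1 & & \cdots & 0 \\ & 0 & \varepsilon_2 & & \vdots \\ & & \ddots & \ddots & \\ & & & 0 & \varepsilon_{n-1} \\ 0 & & \cdots & & 0 \end{pmatrix} \]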
and $\varepsilon_i$ is either 1 or zero. Using Lemma 2.1 again one can find a matrix $C(\lambda; x)$, with the same properties as $B(\lambda; x)$ of Lemma 2.1, such that $C(\lambda; x)\,(xI - N(\lambda))\,C(\lambda; x)^{-1} = xI - Q_\alpha$ for $\lambda \in \Omega_\alpha$. Finally by chapter V, Theorem 5.10, of [1]:
\[ C_\alpha(Q_\alpha; \lambda)\, N(\lambda)\, C_\alpha(Q_\alpha; \lambda)^{-1} = Q_\alpha \quad \text{when } \lambda \in \Omega_\alpha. \]
To summarize:

THEOREM 2.2. Let $N$ be a nilpotent operator commuting with $S$ and let $N(\lambda)$ be its matrix representation. Let $Q_\alpha$ be the Jordan forms of a nilpotent matrix. There exists a matrix $D(\lambda)$ of measurable functions such that $D^{-1}(\lambda)$ exists, and measurable sets $\Omega_\alpha$ whose union is $\Omega$ such that
\[ D(\lambda)\, N(\lambda)\, D^{-1}(\lambda) = Q_\alpha \quad \text{on } \Omega_\alpha. \]

For the matrix $Q_\alpha$ there exist vectors $y_1, \dots, y_r$ such that
\[ y_1, Q_\alpha y_1, \dots, Q_\alpha^{j_1 - 1} y_1, \ \dots, \ y_r, Q_\alpha y_r, \dots, Q_\alpha^{j_r - 1} y_r \]
are independent, $j_1 + \dots + j_r = n$, and $Q_\alpha^{j_i} y_i = 0$. Let $x_i(\lambda) = D^{-1}(\lambda) y_i$, and let $\Omega_{\alpha,m} \subset \Omega_\alpha$ be such that $x_i(\lambda)$ is bounded on $\Omega_{\alpha,m}$ and $\Omega_{\alpha,m}$ increases to $\Omega_\alpha$. Then on $\Omega_{\alpha,m}$ (on $E(\Omega_{\alpha,m})H$)
\[ x_1, N x_1, \dots, N^{j_1 - 1} x_1, \ \dots, \ x_r, N x_r, \dots, N^{j_r - 1} x_r \]
are independent, and $N^{j_i} x_i = 0$. This shows that the sets $\Omega_\alpha$ do not depend on the representation of $H$ as a direct sum of $L_2$ spaces (Spectral Multiplicity Theorem). The sets $\Omega_\alpha$ will be called the canonical sets of $S + N$.

3. Semi similarity. Let $T = S + N$ and $T_1 = S_1 + N_1$ be two spectral operators (see [2]) and let $S$ have uniform multiplicity $n$ (equivalently $S$ is similar to a normal operator with uniform multiplicity). In [3] the notion of semi similarity is defined by:
DEFINITION. $T$ and $T_1$ are semi similar if there is a sequence of Borel sets $\delta_m$ increasing to $\Omega$ such that, if $E(\cdot)$ and $E_1(\cdot)$ are the spectral measures of $T$ and $T_1$, there are bounded maps $L_m$, from $E_1(\delta_m)H$ to $E(\delta_m)H$, with
\[ L_m T^m L_m^{-1} = T_1^m, \]
where $T^m$ ($T_1^m$) is the restriction of $T$ ($T_1$) to $E(\delta_m)H$ ($E_1(\delta_m)H$).

It was shown in [3], Theorem 27, that if $T$ and $T_1$ are semi similar, then $S$ and $S_1$ are similar. If $T$ is semi similar to $T_1$ and $T = K T_2 K^{-1}$ for a bounded operator $K$ where $T_2$ is again spectral then
\[ L_m K T_2 K^{-1} L_m^{-1} = T_1^m, \]
or $T_2$ is semi similar to $T_1$. Also by the remark following Theorem 2.2 the operators $T_2$ and $T$ have the same canonical sets.

THEOREM 3.1. The spectral operators $T$ and $T_1$ are semi similar if and only if $S$ and $S_1$ are similar and $T$ and $T_1$ have the same canonical sets.

Proof. Without loss of generality we may assume that $S = S_1$. If $S + N$ is semi similar to $S + N_1$ then
\[ L_m N^m L_m^{-1} = N_1^m, \]
where $N^m$ and $N_1^m$ are the restrictions of $N$ and $N_1$ to $E(\delta_m)H$. But then
\[ L_m(\lambda)\, N^m(\lambda)\, L_m^{-1}(\lambda) = N_1^m(\lambda), \]
which proves that $N(\lambda)$ and $N_1(\lambda)$ have the same canonical sets. Conversely, if $N$ and $N_1$ have the same canonical sets, then on $\Omega_\alpha$
\[ N = D^{-1}(\lambda)\, Q_\alpha\, D(\lambda), \qquad N_1 = D_1^{-1}(\lambda)\, Q_\alpha\, D_1(\lambda); \]
hence
\[ N_1 = D_1^{-1}(\lambda)\, D(\lambda)\, N(\lambda)\, D^{-1}(\lambda)\, D_1(\lambda). \]
Define $\delta_m$ so that $D^{-1}(\lambda) D_1(\lambda)$ and $D_1^{-1}(\lambda) D(\lambda)$ are bounded on $\delta_m$, and
\[ L_m(\lambda) = \{ D_1^{-1}(\lambda) D(\lambda) \text{ restricted to } E(\delta_m)H \}. \]
COROLLARY. Semi similarity is a transitive relation.
This is Theorem 26 of [3]. Theorem 3.1 is essentially equivalent to Theorems 29 and 30 of [3].
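Pointwise in $\lambda$, Theorem 3.1 compares the Jordan types of two nilpotent matrices. The following is a minimal sketch of that comparison, not the method of the paper: it assumes a constant (not $\lambda$-dependent) nilpotent matrix with rational entries and reads the block sizes off the ranks of successive powers. The function names `rank`, `matmul` and `jordan_block_sizes` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction


def rank(mat):
    """Rank of a matrix, by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in mat]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def jordan_block_sizes(nil):
    """Sizes of the Jordan blocks of a nilpotent matrix, largest first."""
    n = len(nil)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # N^0 = I
    ranks = [n]
    for _ in range(n + 1):
        power = matmul(power, nil)
        ranks.append(rank(power))
    if ranks[n] != 0:
        raise ValueError("matrix is not nilpotent")
    # The number of blocks of size >= k equals rank(N^(k-1)) - rank(N^k).
    at_least = [ranks[k - 1] - ranks[k] for k in range(1, n + 2)]
    sizes = []
    for k in range(1, n + 1):
        sizes.extend([k] * (at_least[k - 1] - at_least[k]))
    return sorted(sizes, reverse=True)


# A canonical matrix with (eps_1, eps_2, eps_3) = (1, 0, 1) on the superdiagonal.
Q = [[0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
# Same Jordan type as Q, written in a different basis.
N1 = [[0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
# Same rank as Q but a different Jordan type.
N2 = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```

In this sketch `Q` and `N1` share a block structure while `N2` does not, even though all three have rank 2, which is why the rank alone cannot serve as the invariant.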
BIBLIOGRAPHY
1. Albert, A. A., 1941, Introduction to algebraic theories, The University of Chicago Press, Chicago.
2. Dunford, N., 1950, Spectral operators, Pac. J. Math., 4, 213-227.
3. Feldzamen, A. N., 1961, Semi similarity invariants for spectral operators on Hilbert space, Trans. Amer. Math. Soc., 100, 277-323.
4. Foguel, S. R., 1958, Normal operators of finite multiplicity, Comm. Pure Appl. Math., 11, 297-313.
5. Mackey, G. W., 1959, Commutative Banach algebras, Notas de Matematica No. 17, Rio De Janeiro.

THE HEBREW UNIVERSITY OF JERUSALEM
ON OPERATORS WHICH ATTAIN THEIR NORM
BY
JORAM LINDENSTRAUSS(1)
ABSTRACT
The following problem is considered. Let $X$ and $Y$ be Banach spaces. Are those operators from $X$ to $Y$ which attain their norm on the unit cell of $X$ norm dense in the space of all operators from $X$ to $Y$? It is proved that this is always the case if $X$ is reflexive. In general the answer is negative and it depends on some convexity and smoothness properties of the unit cells in $X$ and $Y$. As an application a refinement of the Krein-Milman theorem and Mazur's theorem concerning the density of smooth points, in the case of weakly compact sets in a separable space, is obtained.

1. Introduction. Let $B(X, Y)$ be the Banach space of all bounded linear operators from the Banach space $X$ into the Banach space $Y$. (The norm in $B(X, Y)$ is the usual operator norm.) Let $P(X, Y)$ be the subset of $B(X, Y)$ consisting of all the operators which attain their norm on the unit cell of $X$, that is all those $T$ for which there is an $x \in X$ satisfying $\|x\| = 1$ and $\|Tx\| = \|T\|$. Bishop and Phelps [1] (cf. also [2]) proved that if $\dim Y = 1$ then $P(X, Y)$ is norm dense in $B(X, Y)$ for every Banach space $X$. In [1] they also raised the general question: for which Banach spaces $X$ and $Y$ is $P(X, Y)$ norm dense in $B(X, Y)$? This question is the subject of the present note.

Rather simple examples show that in general $P(X, Y)$ is not dense in $B(X, Y)$. The simplest examples, perhaps, are based on the fact that if a 1-1 operator $T$ from $X$ into a strictly convex space $Y$ attains its norm at a point $x$, then $x$ is an extreme point of the unit cell of $X$. However, if we consider instead of $P(X, Y)$ the larger set $P_0(X, Y)$, consisting of all the operators $T$ such that $T^{**}$ attains its norm on the unit cell of $X^{**}$, then it can be shown (Theorem 1) that this set is always norm dense in $B(X, Y)$.

The question of Bishop and Phelps, as it stands, is very general. In fact, it seems to be too general to have a reasonably complete solution.
We therefore restrict ourselves here to the study of those spaces $X$ which have either one of the following properties.
A. For every Banach space $Y$, $P(X, Y)$ is norm dense in $B(X, Y)$.
B. For every Banach space $Y$, $P(Y, X)$ is norm dense in $B(Y, X)$.

An immediate consequence of the density of $P_0(X, Y)$ in $B(X, Y)$ is that every reflexive space has property A. In Theorem 2 we show that if a Banach space $X$ has property A then, under certain circumstances (for example if it is separable), its unit cell must have many strongly exposed points(2). A dual result concerning property B and strongly smooth points is also given (Theorem 3). As a consequence of the results mentioned above we obtain (for weakly compact sets in a separable space) refinements of the Krein-Milman Theorem and Mazur's Theorem [10] concerning the density of smooth points (cf. Theorem 4). In section 3 we prove some results of a more special nature and discuss a few simple examples. It is shown in particular that a finite-dimensional space whose unit cell is a polyhedron has property A and that there are Banach spaces $X$ such that $P(X, X)$ is not norm dense in $B(X, X)$.

I am indebted to Professor R. R. Phelps for helpful conversations concerning the subject of this note.

Notations. By "operator" we always mean a bounded linear operator. Our results hold for real and for complex Banach spaces; however, for convenience of notation, we shall assume that the Banach spaces are real. The unit cell $\{x : x \in X, \|x\| \le 1\}$ of $X$ is denoted by $S_X$. Let $C$ be a convex set in the Banach space $X$. A point $x \in C$ is called an exposed point of $C$ if there is an $f \in X^*$ such that $f(y) < f(x)$ for every $y \ne x$ in $C$. A point $x \in C$ is called a strongly exposed point of $C$ if there is an $f \in X^*$ such that (i) $f(y) < f(x)$ for $y \ne x$ in $C$; and (ii) $f(x_n) \to f(x)$ and $\{x_n\}_{n=1}^{\infty} \subset C$ imply $\|x_n - x\| \to 0$.

The dual notions are those of a smooth and strongly smooth point. We shall use these notions only for the unit cell and so we define them only in this case. A point $x$ with $\|x\| = 1$ is called a smooth point of $S_X$ if there exists only one $f \in X^*$ satisfying $f(x) = \|f\| = 1$. A point $x \in X$ with $\|x\| = 1$ is called a strongly smooth point of $S_X$ if $f_n(x) \to 1$ and $\{f_n\}_{n=1}^{\infty} \subset S_{X^*}$ imply that $\|f_n - f\| \to 0$ (where $f$ is, necessarily, the unique element of $S_{X^*}$ satisfying $f(x) = 1$). A point $x$ with $\|x\| = 1$ is a smooth point (resp. strongly smooth point) of $S_X$ if and only if the norm is Gateaux (resp. Fréchet) differentiable at $x$ (cf. Šmul'yan [11]).

A Banach space $X$ is called strictly convex (resp. smooth) if every point on the boundary of $S_X$ is an exposed (resp. smooth) point of $S_X$. $X$ is called locally uniformly convex [9] if $\|x_n + x\| \to 2$ and $\|x_n\| = \|x\| = 1$ imply $\|x_n - x\| \to 0$. If $X$ is locally uniformly convex then every point on the boundary of $S_X$ is a strongly exposed point. If every point on the boundary of $S_X$ is strongly smooth we say that the norm in $X$ is Fréchet differentiable (cf. Day [5, pp. 112-113] for these and related notions).

Received August 20, 1963.
(1) Research supported by the National Science Foundation, U.S.A. (NSF-GP-378).
2. The main results. We begin by giving a simple characterization of the set $P_0(X, Y)$.

LEMMA 1. An operator $T$ from $X$ to $Y$ belongs to $P_0(X, Y)$ if and only if there are $\{x_k\}_{k=1}^{\infty}$ in $X$ and $\{f_k\}_{k=1}^{\infty}$ in $Y^*$ such that
\[ \|x_k\| = \|f_k\| = 1, \qquad k = 1, 2, \dots \tag{1} \]
\[ |f_j(T x_k)| \ge \|T\| - 1/j \tag{2} \]

(2) This notion is defined below.
... $\ge \|T\| - 1/j$ and hence $\|T^{**} x^{**}\| = \|T^{**}\|$. The proof of the converse is also immediate (using the weak* density of $S_X$ in $S_{X^{**}}$).

THEOREM 1. For every $X$ and $Y$, $P_0(X, Y)$ is norm dense in $B(X, Y)$. Hence every reflexive space has property A.

Proof. Let $T \in B(X, Y)$ with $\|T\| = 1$ and an $\varepsilon$ with $0 < \varepsilon < 1/3$ be given. We choose first a monotonically decreasing sequence $\{\varepsilon_k\}$ of positive numbers such that
\[ 2\sum_{i=1}^{\infty} \varepsilon_i < \varepsilon, \qquad 2\sum_{i=k+1}^{\infty} \varepsilon_i < \varepsilon_k^2, \qquad \varepsilon_k < 1/10k, \qquad k = 1, 2, \dots \tag{3} \]
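That a sequence with the three properties in (3) exists is easy to check on an explicit candidate. The sketch below assumes that (3) reads $2\sum_{i \ge 1} \varepsilon_i < \varepsilon$, $2\sum_{i > k} \varepsilon_i < \varepsilon_k^2$ and $\varepsilon_k < 1/(10k)$, and it tries the hypothetical choice $\varepsilon_k = (\varepsilon/10)^{3^k}$, which is not taken from the paper. Since $\varepsilon_{i+1} = \varepsilon_i^3$, every tail is dominated by a geometric series, and exact rational arithmetic avoids floating-point underflow.

```python
from fractions import Fraction


def epsilons(eps, count):
    """First `count` terms of the trial sequence eps_k = (eps/10)**(3**k)."""
    base = Fraction(eps) / 10
    return [base ** (3 ** k) for k in range(1, count + 1)]


def tail_bound(first_term):
    """Upper bound for t + t**3 + t**9 + ... with t = first_term < 1.

    Each term t**(3**j) is at most t**(j + 1), so the sum is bounded by the
    geometric series t / (1 - t).
    """
    return first_term / (1 - first_term)


def satisfies_condition_3(eps, count=4):
    """Check the three inequalities of (3) for k = 1, ..., count."""
    e = epsilons(eps, count + 1)
    ok = 2 * tail_bound(e[0]) < Fraction(eps)
    for k in range(1, count + 1):
        eps_k, eps_next = e[k - 1], e[k]
        ok = ok and 2 * tail_bound(eps_next) < eps_k ** 2
        ok = ok and eps_k < Fraction(1, 10 * k)
    return ok
```

The second inequality forces faster than geometric decay, which is why a plain geometric sequence would not do here.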
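We next choose inductively sequences $\{T_k\}_{k=1}^{\infty}$, $\{x_k\}_{k=1}^{\infty}$, and $\{f_k\}_{k=1}^{\infty}$ satisfying
\[ T_1 = T \tag{4} \]
\[ \|T_k x_k\| \ge \|T_k\| - \varepsilon_k^2, \quad \|x_k\| = 1, \qquad k = 1, 2, \dots \tag{5} \]
\[ f_k(T_k x_k) = \|T_k x_k\|, \quad \|f_k\| = 1, \qquad k = 1, 2, \dots \tag{6} \]
\[ T_{k+1} x = T_k x + \varepsilon_k f_k(T_k x) \cdot T_k x_k, \qquad x \in X, \quad k = 1, 2, \dots \tag{7} \]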
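Having chosen these sequences we verify that the following hold.
\[ \|T_j - T_k\| \le 2\sum_{i=j}^{k-1} \varepsilon_i, \qquad j < k, \tag{8} \]
\[ \|T_{k+1}\| \ge \|T_k\| + \varepsilon_k \|T_k\|^2 - 4\varepsilon_k^2, \tag{9} \]
\[ \|T_k\| \ge 1, \tag{10} \]
\[ |f_j(T_j x_k)| \ge \|T_j\| - 6\varepsilon_j, \qquad j < k, \quad k = 1, 2, \dots \tag{11} \]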
Assertion (8) is easily proved by using induction on $k$. By (5), (6) and (7)
\[ \|T_{k+1}\| \ge \|T_{k+1} x_k\| = \|T_k x_k\,(1 + \varepsilon_k f_k(T_k x_k))\| = \|T_k x_k\|\,(1 + \varepsilon_k \|T_k x_k\|) \ge (\|T_k\| - \varepsilon_k^2)(1 + \varepsilon_k \|T_k\| - \varepsilon_k^3). \]
Relation (9) follows easily from this inequality, since $\|T_k\| \le 4/3$ and $\varepsilon_k < 1/10k$, while (10) is an immediate consequence of (4) and (9). Finally we verify (11). By the triangle inequality, (5), (8), (10) and (3) we have, for $j < k$,
\[ \|T_{j+1} x_k\| \ge \|T_k x_k\| - \|T_k - T_{j+1}\| \ge \|T_k\| - \varepsilon_k^2 - 2\sum_{i=j+1}^{k-1} \varepsilon_i \ge \|T_{j+1}\| - 2\varepsilon_j^2. \]
Hence, by (7) and (9),
\[ \varepsilon_j\, |f_j(T_j x_k)|\, \|T_j\| + \|T_j\| \ge \|T_{j+1} x_k\| \ge \|T_j\| + \varepsilon_j \|T_j\|^2 - 6\varepsilon_j^2, \]
so that
\[ |f_j(T_j x_k)| \ge \|T_j\| - 6\varepsilon_j, \]
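and this proves (11). The sequence $T_k$ converges in norm to an operator $\hat T$ satisfying $\|\hat T - T\| \le \varepsilon$ and $\|\hat T - T_j\| \le \varepsilon_{j-1}^2$. Hence, for $j < k$,
\[ |f_j(\hat T x_k)| \ge |f_j(T_j x_k)| - \|\hat T - T_j\| \ge \|T_j\| - 6\varepsilon_j - \varepsilon_{j-1}^2 \ge \|\hat T\| - 6\varepsilon_j - 2\varepsilon_{j-1}^2 \ge \|\hat T\| - 1/j, \]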
and the desired conclusion follows from Lemma 1. For reflexive $X$ it is obvious that $P(X, Y) = P_0(X, Y)$ and thus every reflexive $X$ has property A.

Remark. The operator $\hat T$ constructed in the proof of Theorem 1 has also the property that $\hat T - T$ is compact.

THEOREM 2. Let the Banach space $X$ have property A. Then
(i) If $X$ is isomorphic to a strictly convex space, then $S_X$ is the closed convex hull of its exposed points.
(ii) If $X$ is isomorphic to a locally uniformly convex space, then $S_X$ is the closed convex hull of its strongly exposed points.

Proof. The proofs of (i) and (ii) are almost identical so we prove here only (ii). Let $C$ be the closed convex hull of the strongly exposed points of $S_X$. Suppose that $C \ne S_X$. Then there is an $f \in X^*$ with $\|f\| = 1$ and a $\delta > 0$ such that $|f(x)| < 1 - \delta$ for $x \in C$. Let $|||\cdot|||$ be a locally uniformly convex norm in $X$ which is equivalent to the given norm $\|\cdot\|$ and such that $|||x||| \le \|x\|$ for every $x$. Let $Y$ be the space $X \oplus R$(3) with the norm $\|(x, r)\| = (|||x|||^2 + r^2)^{1/2}$. Then $Y$ is locally uniformly convex. Let $V$ be the operator from $X$ into $Y$ defined by $Vx = (x, M f(x))$ where $M > 2/\delta$. Then $V$ is an isomorphism (into) and the same is true for every operator sufficiently close to $V$. We have
\[ \|V\| \ge M; \qquad \|Vx\| \le (1 + (M - 2)^2)^{1/2} \quad \text{for } x \in C. \]
0, and a set $\{f_\alpha\}$ of elements of norm 1 in $X^*$ such that for every $\alpha$, $f_\alpha(x_\alpha) = 1$, and for any $x$, $\|x\| \le 1$ and $f_\alpha(x) \ge 1 - \delta(\varepsilon)$ imply $\|x - x_\alpha\| \le \varepsilon$.

In a uniformly convex space the set of all the boundary points of the unit cell is u.s.e. The set of all the extreme points of the unit cell of $l_1$ is also u.s.e.

PROPOSITION 1. Suppose $S_X$ is the closed convex hull of a set of uniformly strongly exposed points. Then $X$ has property A.

Proof. The proof is similar to that of Theorem 1 and we indicate here only the necessary modifications. Let $\{x_\alpha\}$ be a set of u.s.e. points whose closed convex hull is $S_X$ and let $\{f_\alpha\}$ be the corresponding set in $X^*$ (appearing in the definition of a u.s.e. set). We choose the $\varepsilon_k$ as in the proof of Theorem 1 and define a sequence of operators $T_k$ as follows. $T_1 = T$ and
\[ T_{k+1} x = T_k x + \varepsilon_k f_{\alpha_k}(x) \cdot T_k x_{\alpha_k}, \qquad k = 1, 2, \dots \]
where $x_{\alpha_k}$ is an element of $\{x_\alpha\}$ satisfying $\|T_k x_{\alpha_k}\| \ge \|T_k\| - \varepsilon_k^2$, and $f_{\alpha_k}$ is the corresponding element of $\{f_\alpha\}$. As in the proof of Theorem 1 it can be shown that the sequence $T_k$ converges in the norm topology to an operator $\hat T$ satisfying $\|\hat T - T\| < \varepsilon$, and that $|f_{\alpha_j}(x_{\alpha_k})| > 1 - 1/j$ for $j < k$. By the definition of a u.s.e. set it follows that the sequence $x_{\alpha_k}$ converges in the norm topology to a point $x$, say, and we have $\|\hat T x\| = \|\hat T\|$. This concludes the proof.

REMARK. There exist even finite-dimensional spaces whose unit cells cannot be obtained as the closed convex hull of a u.s.e. set. Indeed, in a finite-dimensional space the closure of a u.s.e. set is again u.s.e., and simple 2-dimensional examples show that in general a convex set cannot be obtained as the closed convex hull of a closed subset of its set of exposed points.

Our next result is concerned with some smoothness properties.
THEOREM 3. Let $X$ have property B. Then
(i) If $X$ is isomorphic to a quotient space of a smooth space then the smooth points of $S_X$ are norm dense in the boundary of $S_X$.
(ii) If $X$ is isomorphic to a quotient space of a space which has a Fréchet differentiable norm, then the strongly smooth points of $S_X$ are norm dense in the boundary of $S_X$.

Proof. Again the proofs of parts (i) and (ii) are almost identical so we prove only (ii). Let $Z$ be a space whose norm is Fréchet differentiable and let $T_0$ be an operator from $Z$ onto $X$ with $\|T_0\|$ ... 0. There exists a non-void open set $G$ in $K$ such that ... and this contradicts the definition of a strongly exposed point.

Our next result is a consequence of the theorem of Bishop and Phelps [1]. It may be regarded as the dual of Proposition 1.

PROPOSITION 3. Let $X$ be a Banach space such that there exist two sets $\{x_\alpha\}$ in $X$ and $\{f_\alpha\}$ in $X^*$(5) and $\lambda < 1$ such that
1. $\|f_\alpha\| = 1$ for every $\alpha$ and $\|x\| = \sup_\alpha |f_\alpha(x)|$ for every $x \in X$.
2. $\|x_\alpha\| = f_\alpha(x_\alpha) = 1$ for every $\alpha$ and $|f_\alpha(x_\beta)| \le \lambda$ for $\alpha \ne \beta$.
Then $X$ has property B.

Proof. Let $T \in B(Y, X)$ with $\|T\| = 1$ and $\varepsilon$, $0 < \varepsilon < 1$, be given. Clearly $1 = \|T\| = \sup_\alpha \|T^* f_\alpha\|$. Let $\alpha_0$ be such that $\|T^* f_{\alpha_0}\| \ge 1 - \varepsilon(1 - \lambda)/4$. Choose a $g \in Y^*$ which attains its norm on $S_Y$ and satisfies
\[ \|g - T^* f_{\alpha_0}\| \ \dots \]
... $1 + \varepsilon(1 + \lambda)/2$. ... $T^* f_{\alpha_0}$ attains its norm on $S_Y$ the same is ...
The assumptions in Proposition 3 are satisfied if, for example, $X$ is finite-dimensional and its unit cell is a polyhedron or if $X = C(K)$ with $K$ having a dense set of isolated points.

Many examples of spaces which do not have property B can be obtained by using Theorem 2. For example, if $X$ is strictly convex and if there is a Banach space $Y$ such that $S_Y$ is not the closed convex hull of its exposed points and such that $Y$ is isomorphic to a proper subspace of $X$ then $X$ does not have property B. In some special cases it is easy to obtain somewhat stronger results. We have for example

PROPOSITION 4. If $X$ is strictly convex and if there is a non-compact operator from $c_0$ into $X$ then $X$ does not have property B.

Proof. Suppose $T \in P(c_0, X)$ and let $y \in c_0$ satisfy $\|y\| = 1$ and $\|Ty\| = \|T\|$. Denote by $\{e_i\}_{i=1}^{\infty}$ the natural basis of $c_0$. There is an integer $n$ such that for $i > n$, $\|y \pm e_i/2\| = 1$. It follows that for these $i$, $\|Ty \pm Te_i/2\| \le \|Ty\|$ and hence, by the strict convexity of $X$, $Te_i = 0$ for $i > n$. Thus every operator belonging to $P(c_0, X)$ has a finite-dimensional range and, as a consequence, every operator in the closure of $P(c_0, X)$ is compact.

Finally we observe the following.

PROPOSITION 5.
There exist Banach spaces $X$ for which $P(X, X)$ is not dense in $B(X, X)$.

Proof. Let $Y = c_0$ with the usual norm and let $Z$ be a strictly convex space isomorphic to $c_0$. Put $X = Y \oplus Z$ with $\|(y, z)\| = \max(\|y\|, \|z\|)$(6). $X$ has the required property. Indeed, let $T_0$ be an isomorphism from $Y$ onto $Z$ with $\|T_0\| \le 1$ and define $T$ in $B(X, X)$ by $T(y, z) = (0, T_0 y)$. We have $\|T_0 y\| \ge 2\varepsilon \|y\|$ for every $y \in Y$ and some $\varepsilon > 0$. Suppose there were a $\hat T \in B(X, X)$ with $\|\hat T - T\| < \varepsilon$ and $\|\hat T\| = \|\hat T(y_0, z_0)\|$ for some $(y_0, z_0)$ in $X$ of norm 1. Put $\hat T(y_0, z_0) = (u, v)$. Clearly $\|u\|$ ... it follows that $\|u\| < \|\hat T\| = \|v\|$. Since $S_Y$ has no extreme point there is a $y_1 \ne 0$ in $Y$ such that
\[ \|y_1 + y_0\| = \|-y_1 + y_0\| \le 1. \]

(6) For every vector it will be clear to which space it belongs. Therefore we use the same notation, $\|\cdot\|$, for the norms in $X$, $Y$ and $Z$.
Hence II T(yo, Zo) + f'(Yx,O)[I - II z 2~11 y, II
and this is a contradiction.
REFERENCES
1. Bishop, E. and Phelps, R. R., 1961, A proof that every Banach space is subreflexive, Bull. Amer. Math. Soc., 67, 97-98.
2. Bishop, E. and Phelps, R. R., 1963, The support functionals of a convex set, Proc. Symp. Pure Math., 7, (Convexity) 27-35.
3. Day, M. M., 1955, Strict convexity and smoothness, Trans. Amer. Math. Soc., 78, 516-528.
4. Day, M. M., 1957, Every L space is isomorphic to a strictly convex space, Proc. Amer. Math. Soc., 8, 415-417.
5. Day, M. M., 1958, Normed linear spaces, Springer, Berlin.
6. Kadec, M. I., 1959, On spaces which are isomorphic to locally uniformly convex spaces, Izvestia Vysshikh Uchebnykh Zavedenii (Mathematics) 6, 51-57.
7. Kadec, M. I., 1963, Some problems in the geometry of Banach spaces, Dissertation, Moscow University.
8. Klee, V., 1959, Some new results on smoothness and rotundity in normed linear spaces, Math. Ann., 139, 51-63.
9. Lovaglia, A. R., 1955, Locally uniformly convex Banach spaces, Trans. Amer. Math. Soc., 78, 225-238.
10. Mazur, S., 1933, Über konvexe Mengen in linearen normierten Räumen, Studia Math., 4, 70-84.
11. Šmul'yan, V. L., 1941, Sur la structure de la sphère unitaire dans l'espace de Banach, Math. Sb. (N. S.), 9, 545-561.

YALE UNIVERSITY, NEW HAVEN AND UNIVERSITY OF WASHINGTON, SEATTLE
MINIMAL UNIVERSAL COVERS IN $E^n$
BY
H. G. EGGLESTON*
ABSTRACT
It is shown that any plane set of constant unit width contains a semi-circle of radius ½, and using this a minimal universal plane cover is explicitly constructed. It is also shown that in an n-dimensional space with $n > 2$ there are minimal universal covers of arbitrarily large diameter.
* This paper was written while the author was a National Science Foundation Visiting Senior Fellow at the University of Washington, Seattle, Washington, U.S.A.
Received September 12, 1963.

Introduction. We shall consider subsets of real n-dimensional Euclidean space $E^n$. Denote by $\mathcal{W}_n$ that class of subsets of $E^n$ which have a width in every direction equal to 1. If two subsets $X$ and $Y$ of $E^n$ are congruent we write $X \sim Y$. A subset $C$ of $E^n$ is called a universal cover if it is closed, convex and such that for every subset $A$ of $E^n$ whose diameter is less than or equal to 1 we can find a subset $B$ of $C$ such that $B \sim A$. Since every such set $A$ is contained in a member of $\mathcal{W}_n$, it is sufficient in proving that a set $C$ is a universal cover to verify that the congruent subset $B$ of $C$ can be found corresponding to every set $A$ that belongs to $\mathcal{W}_n$. By a minimal universal cover is meant a universal cover of which no proper subset is also a universal cover.

In the plane the diameter of any minimal cover is less than 3 and the question has been asked (by V. Klee, see [2]) as to whether there is a finite upper bound of the diameters of compact minimal universal covers in $E^n$, depending possibly on $n$. We show that this is not the case by proving that for any given positive number $K$ and any integer $n$ with $n \ge 3$, there is a compact minimal universal cover in $E^n$ whose diameter is greater than $K$. We first prove that a certain plane set is a minimal universal plane cover by means of a lemma, which incidentally also shows that any set in $\mathcal{W}_2$ contains a semicircle of radius ½ (see [1]).

§1. An explicit example of a plane minimal universal cover. In this section it is shown that a certain set $Y$, defined explicitly, is a plane minimal universal cover. $Y$ is the union of a disc and of a Reuleaux triangle, both of unit diameter, so placed that two of the vertices of the triangle are diametrically opposite points on the disc. See Figure 1.
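The Reuleaux triangle used here is the standard example of a plane set of constant unit width. As a hedged numerical illustration (not part of the argument), the sketch below samples the three unit arcs bounding a Reuleaux triangle built on an equilateral triangle of side 1 and measures its width in many directions. The coordinates of the vertices and the helper names `reuleaux_boundary`, `width` and `widths` are choices made only for this example.

```python
import math


def reuleaux_boundary(samples_per_arc=2000):
    """Sample the boundary of a Reuleaux triangle of unit width.

    The vertices are (0, 0), (1, 0) and (1/2, sqrt(3)/2); each boundary arc
    has radius 1 and is centred at the vertex opposite to it.
    """
    a, b, c = (0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)
    # (centre, start angle, end angle) in degrees for the three arcs.
    arcs = [(a, 0.0, 60.0), (b, 120.0, 180.0), (c, 240.0, 300.0)]
    points = []
    for (cx, cy), start, end in arcs:
        for i in range(samples_per_arc + 1):
            t = math.radians(start + (end - start) * i / samples_per_arc)
            points.append((cx + math.cos(t), cy + math.sin(t)))
    return points


def width(points, theta):
    """Width of the point set in the direction making angle theta with the x-axis."""
    ux, uy = math.cos(theta), math.sin(theta)
    projections = [x * ux + y * uy for x, y in points]
    return max(projections) - min(projections)


def widths(points, directions=360):
    """Widths in `directions` equally spaced directions covering a half turn."""
    return [width(points, math.pi * j / directions) for j in range(directions)]
```

Up to the sampling error, every entry of `widths(reuleaux_boundary())` should be close to 1, the same value a disc of unit diameter gives, although the two sets are not congruent.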
Figure 1

If $Y$ is a universal cover at all it must be a minimal one: indeed no proper subset of $Y$ can contain both a disc and a Reuleaux triangle each of unit diameter. We shall deduce that $Y$ is a universal cover from the following lemma.

LEMMA 1. If $Z$ is a plane set of unit constant width then there are two points on the frontier of $Z$, say $s, t$, which are at unit distance apart and such that of the two semicircles of radius ½ that pass through both $s$ and $t$, one at least does not meet the interior of $Z$.

For if this lemma were true then, since every point of $Z$ is distant at most 1 from both $s$ and $t$, it would follow that $Z$ lies in a figure bounded by a semicircle on $st$ and (on the opposite side of $st$) two arcs of unit radius and centers $s$ and $t$. This figure is congruent to $Y$. Thus $Z$ is congruent to a subset of $Y$. But any set whose diameter does not exceed 1 is contained in a set of unit constant width. Hence $Y$ is a universal cover. It remains to prove the lemma.

Proof of the lemma. By a standard approximation argument it is sufficient to establish the lemma when $Z$ is a Reuleaux polygon and in what follows we consider this case only. Two vertices of $Z$, say $a, b$, are said to be opposite if and only if they are at unit distance apart. In the frontier of $Z$ lie two circular arcs whose centers are $a$ and $b$. They lie on one and the same side of the line $ab$; we call this the positive side and the other side will be called the negative side. The circumcircle of the part of $Z$ on the negative side of $ab$ will be denoted by $\gamma_{ab}$, its radius by $r_{ab}$ ($\gamma_{ab}$ is a disc). In any case $r_{ab} \ge ½$ and we wish to establish the existence of two opposite vertices $a, b$ for which $r_{ab} = ½$. We assume that no such vertices exist and show that this leads to a contradiction.

Any two vertices $p, q$ on the frontier of $Z$ divide this frontier into two arcs.
Moreover, if $p$ and $q$ are not opposite, then one of these two arcs contains no points opposite to $p$ or $q$. The vertices of $Z$ on this arc other than $p$ or $q$ are said to lie between $p$ and $q$. If $p$ and $q$ are opposite then by the vertices between $p$ and $q$ we mean those that lie on the negative side of the line $pq$. Since $r_{pq} > ½$, there must, for any pair of opposite vertices $p, q$, be vertices between $p$ and $q$ which lie on $\gamma_{pq}$. Define the vertices $v, w$ such that they lie on $\gamma_{pq}$ and no other vertices between $p$ and $v$ or between $q$ and $w$ lie on $\gamma_{pq}$. Let there be $f(p)$, $g(q)$ vertices between $p$ and $v$ and between $q$ and $w$ respectively. Let $h(pq) = \min(f(p), g(q))$ and $z = \min_{p,q} h(pq)$. Choose opposite vertices $a, b$ so that $z = h(ab)$ and suppose for definiteness that $f(a) = h(ab)$. Then let $b_1$ be the vertex opposite $a$ adjacent to $b$ and $a_1$ be the vertex opposite to $b_1$ adjacent to $a$.
We consider two cases.

CASE (i) $z = 0$. $a_1$ lies on $\gamma_{ab}$ (see Figure 2). $b_1$ lies outside $\gamma_{ab}$ and the line joining $b_1$ to the center of $\gamma_{ab}$ bisects internally the angle $a b_1 a_1$. Hence $\gamma_{ab}$ cuts the segment $b_1 a_1$ ...

Figure 2

... 0. Select one such point and let $Y^*$ be the set (or one of the sets) belonging to $\mathcal{U}^*$ such that $F(Y^*) = (½, x_2^*, x_3^*, \dots, x_{n-1}^*, 0)$. It follows from Lemma 3 that $Y^*$ is the point $(½, 0, 0, \dots, 0)$ if and only if $Y$ is a ball.
Next let $g$ be a large positive number so that $2(R - ½) \cdot g > K$ ($K$ is the preassigned positive number, $R$ the minimal circumradius of any projection of $Z$), and let $P_g$ be the intersection of $P$ with the cone
\[ x_1^2 + x_2^2 + \dots + x_{n-1}^2 \le \frac{(x_n + g)^2}{(2g + 1)^2 - 1}. \tag{1} \]
This cone intersects $x_n = h$ in the $(n-1)$-dimensional ball with center $(0, 0, \dots, 0, h)$ and radius $r_h = (h + g)/((2g + 1)^2 - 1)^{1/2}$. Since this radius is large for large values of $h$ it follows that $P_g$ is a universal cover. Also the set $P_g$ contains the n-dimensional unit ball with center at $x_n = ½$, $x_i = 0$, $i = 1, \dots, n - 1$. Denote this ball by $B^*$.

For each set $Y$ of $\mathcal{W}_n$ in $P$ select $Y^*$ as described above and translate $Y^*$ parallel to the $x_n$ axis until it lies inside $P_g$ and, subject to this condition, is as close to $x_n = 0$ as possible. Let the translated set be $Y^{**}$. Let $Q$ be the union of all these sets and let $S$ be the convex cover of the closure of $Q$. $Q$ and (therefore) $S$ are bounded. $S$ is a subset of $P_g$ that meets the hyperplane $x_n = 0$ (since $Q \supset B^*$) and $S$ also meets the hyperplane $x_n = K$. For consider where $Z^{**}$ can lie. If $Z^{**}$ lies in $x_n < K$ then $R < r_K$, i.e. $R \le (K + g)/((2g + 1)^2 - 1)^{1/2}$, which implies
\[ R \le \frac{K + g}{2\,(g(g + 1))^{1/2}} \]
... $n_0(p, k)$, $G(n; m(n, p))$ contains a $p$ chromatic subgraph $K(k, \dots, k)$ and one further edge (i.e., a $K_e(k, \dots, k)$); for $p = 2$ this is a weakened form of Theorem 1.

Now we prove Theorem 1. First we need two Lemmas.

LEMMA 1. Every $G(n; m)$ contains a subgraph $G(N; M)$ every vertex of which has valency greater than $[m/n]$. Further
\[ M \ge m - (n - N)\left[\frac{m}{n}\right]. \tag{1} \]
P. ERDOS
[September
(The Lemma of course means that every vertex of G(N; M) has valency in G(N; M) greater than [m/n].) If every vertex of G(n; m) has valency > [m/n], there is nothing to prove. Hence we can assume that G(n; m) has a vertex $x_1$ of valency $\le$ [m/n]. If $G(n; m) - x_1$ has a vertex $x_2$ with $v(x_2) \le [m/n]$ we consider $G(n; m) - x_1 - x_2$. We repeat this process and obtain a sequence of vertices $x_1, \dots, x_k$ so that the valency of $x_i$ in $(G(n; m) - x_1 - \dots - x_{i-1})$ is $\le [m/n]$ for every $1 \le i \le k$, but every vertex of
(2) $G(n; m) - x_1 - \dots - x_k = G(N; M)$
has valency > [m/n]. Clearly M > 0 for otherwise, since $(G(n; m) - x_1 - \dots - x_{n-1})$ has only one vertex and thus no edges, we can put in (2) $k \le n - 1$ and by our construction we would have
$m \le (n-1)\left[\dfrac{m}{n}\right] < m,$
an evident contradiction. Further by our construction (k = n - N)
$M \ge m - (n - N)\left[\dfrac{m}{n}\right],$
which proves (1), and the proof of Lemma 1 is complete.
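The deletion process used in this proof can be tried out on a small graph. The following is a minimal sketch, assuming a simple graph given as a list of vertex pairs; the function name `prune_low_valency` and the sample graph are illustrative choices and do not come from the paper.

```python
def prune_low_valency(n, edges):
    """Delete vertices of valency <= [m/n] until none is left.

    n     -- number of vertices, labelled 0 .. n-1
    edges -- list of distinct unordered pairs (u, v) with u != v
    Returns the surviving vertex set and the surviving edge set.
    """
    threshold = len(edges) // n          # [m/n], computed once from the original graph
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    while True:
        # look for a vertex whose current valency is at most [m/n]
        low = next((v for v in adj if len(adj[v]) <= threshold), None)
        if low is None:
            break
        for w in adj.pop(low):           # remove it together with its edges
            adj[w].discard(low)
    remaining = {frozenset((u, v)) for u in adj for v in adj[u]}
    return set(adj), remaining


# Hypothetical example: a complete graph on {0,...,4} with a path 4-5-6-7 attached.
edges = [(i, j) for i in range(5) for j in range(i + 1, 5)] + [(4, 5), (5, 6), (6, 7)]
vertices, kept = prune_low_valency(8, edges)
N, M = len(vertices), len(kept)
print(N, M, M >= len(edges) - (8 - N) * (len(edges) // 8))
```

In this example n = 8, m = 13 and [m/n] = 1; the three path vertices are deleted in turn and the complete graph on five vertices remains, so N = 5 and M = 10 = m - (n - N)[m/n], in agreement with (1).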
LEMMA 2. Let $m > [n^2/4]$. Then every G(n; m) contains a $K_e(2, k)$ where $k = [c_5 n]$.
Lemma 2 is known [6]. Now we can prove Theorem 1. In fact we shall prove the stronger statement: To every $\varepsilon > 0$ there is a $c_1 = c_1(\varepsilon)$ so that every $G(n; [n^2/4] + 1)$ contains a $K_e([c_1 \log n], [n^{1-\varepsilon}])$. By Lemma 1 our $G(n; [n^2/4] + 1)$ contains a subgraph G(N; M) every vertex of which has valency $> \left[\dfrac{[n^2/4] + 1}{n}\right] = [n/4]$. Further (1) implies by a simple computation
(2) $M \ge \left[\dfrac{n^2}{4}\right] + 1 - (n - N)\left[\dfrac{n}{4}\right] > \dfrac{N^2}{4}.$
Further since every vertex of G(N; M) has valency > [n/4] we have
(3) $N > \dfrac{n}{4}.$
By (2) Lemma 2 can be applied to G(N; M) and by Lemma 2 and (3) we obtain that G(N; M) contains a $K_e(2, k)$ with $k = [c_5 n/4]$. Let the vertices of our $K_e(2, k)$ be (we choose $c_5 < 1/3$)
ON THE STRUCTURE OF LINEAR GRAPHS
(4) $x_1, x_2;\ y_1, \dots, y_k, \qquad k = [c_5 n/4].$
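n/8, be the z's adjacent to $y_i$. Form all the $(u_n - 2)$-tuples ($u_n = [c_1 \log n]$ of Theorem 1) of these vertices for each i, $1 \le i \le k = [c_5 n/4]$. By a simple computation we obtain (we use $\binom{a}{b} \ge (a/b)^b$)
(6) $\sum_{i=1}^{k} \binom{t_i}{u_n - 2} \ge \left[\dfrac{c_5 n}{4}\right] \binom{[n/8] + 1}{u_n - 2} > \left[\dfrac{c_5 n}{4}\right] \left(\dfrac{n}{8(u_n - 2)}\right)^{u_n - 2}.$
Further trivially
(7) $\binom{n}{u_n - 2} < \dfrac{n^{u_n - 2}}{(u_n - 2)!}.$
From (6) and (7) we obtain
(8) $\sum_{i=1}^{k} \binom{t_i}{u_n - 2} > n^{1-\varepsilon} \binom{n}{u_n - 2}$
for every $\varepsilon > 0$ if $c_1 = c_1(\varepsilon)$ is sufficiently small. The number of the z's is clearly less than n, hence the number of the $(u_n - 2)$-tuples formed from z's is less than $\binom{n}{u_n - 2}$. Thus from (8) there is a $(u_n - 2)$-tuple which occurs more than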
$n^{1-\varepsilon}$ times; in other words there is a set of $u_n - 2$ z's which are adjacent to the same $[n^{1-\varepsilon}]$ y's. If we adjoin to these z's $x_1$ and $x_2$ (which are adjacent and are adjacent to all y's) we obtain that G(N; M) and hence our $G(n; [n^2/4] + 1)$ contains a $K_e(u_n, [n^{1-\varepsilon}])$ for every $\varepsilon > 0$ if $c_1 = c_1(\varepsilon)$ is sufficiently small. This completes the proof of our assertion and hence Theorem 1 is proved.
Proof of Theorem 2. As in the proof of Theorem 1 our $G(n; [n^2/4] + 1)$ contains a $K_e(2, [c_5 n/4])$, $c_5 < 1/3$, having the vertices $x_1, x_2, y_1, \dots, y_k$, $k = [c_5 n/4]$. Each of the k vertices $y_1, \dots, y_k$ is adjacent to more than n/8 z's (we use the notations of Theorem 1). Consider now the bipartite graph whose
vertices are $y_1, \dots, y_k$; $z_1, z_2, \dots$ and whose edges are the edges $(y_i, z_j)$ of G(n; m). This bipartite graph has fewer than n vertices and more than $\dfrac{kn}{8} > c_6 n^2$ edges. Hence by a theorem of Gallai and myself [7] it has a path of length $c_2 n$ (the length of a path is the number of its edges). Since our graph is bipartite every second of its vertices is a y. Now since $x_1$ and $x_2$ are adjacent and they are adjacent to each of the y's we immediately obtain that our $G(n; [n^2/4] + 1)$ contains a $C_l$ for each $3 \le l \le [c_2 n]$, which proves Theorem 2.
Proof of Theorem 3. By Lemma 1 $G(n; [t n^{3/2}])$ contains a subgraph G(N; M) every vertex of which has valency $\ge [t n^{1/2}]$. Let x be one such vertex and let $y_1, \dots, y_k$, $k = \tfrac12 [t n^{1/2}]$, be some of the vertices adjacent to x and denote by $z_1, \dots$ the other vertices of G(N; M). Every y has valency $\ge [t n^{1/2}]$, thus since the number of y's is $\tfrac12 [t n^{1/2}]$ there are at least $\tfrac12 [t n^{1/2}]$ z's adjacent to each y. Hence the bipartite graph whose vertices are $y_1, \dots, y_k$; $z_1, \dots$ and whose edges are the edges $(y_i, z_j)$ of G(n; m) has at least $k \cdot \tfrac12 [t n^{1/2}] = \tfrac14 [t n^{1/2}]^2$ > ~-n edges. The number of its vertices is clearly < n. Thus by the theorem of Gallai and myself [7] it has a path of length $> 2 c_3 t^2$ and as in the proof of Theorem 2 every second vertex of this graph is a y. Since x is adjacent to every y this path together with the vertex x gives the required circuits $C_{2l}$, $2 \le l < c_3 t^2$, which proves Theorem 3.
REFERENCES
1. Turán, P., 1941, Mat. Lapok, 48, 436-452 (Hungarian); see also Turán, P., 1955, On the theory of graphs, Colloq. Math., 3, 19-30.
2. Dirac, G., will appear in Acta Math. Acad. Sci. Hungar.
3. Erdős, P., 1947, Some remarks on the theory of graphs, Bull. Amer. Math. Soc., 53, 292-294; see also Erdős, P. and Rényi, A., 1960, On the evolution of random graphs, Publ. Math. Inst. Hung. Acad., 5, 17-61.
4. Erdős, P. and Stone, A. H., 1946, On the structure of linear graphs, Bull. Amer. Math. Soc., 52, 1087-1091.
5. Erdős, P., 1938, On sequences of integers no one of which divides the product of two others and on some related problems, Izv. Nauk. Inst. Mat. Mech. Tomsk., 2, 74-82.
6. Erdős, P., 1962, On the theorem of Rademacher-Turán, Illinois J. of Math., 6, 122-127.
7. Erdős, P. and Gallai, T., 1959, On maximal paths and circuits of graphs, Acta Math. Acad. Sci. Hungar., 10, 337-356.
UNIVERSITY OF MICHIGAN, ANN ARBOR, MICHIGAN, U.S.A.
A k-EXTREME POINT IS THE LIMIT OF k-EXPOSED POINTS BY
EDGAR ASPLUND ABSTRACT
It is proved that the relative boundary of a k-dimensional intersection of a hyperplane and a compact convex body is contained in the closure of the union of all intersections of dimension lower than p that the same convex body makes with different hyperplanes.
We prove in this note a theorem that generalizes a theorem of Straszewicz [3]. Let C be a compact convex body in $R^n$.
DEFINITION 1. A point $p \in C$ is called k-extreme if it is not the centroid of some non-degenerate (k + 1)-simplex in C. In the terminology of Bourbaki [2, §1, exerc. 2] one would say that a k-extreme point is of order at most k.
DEFINITION 2. A point $p \in C$ is called k-exposed if it is contained in a closed half-space K such that $K \cap C$ is at most k-dimensional.
We collect some immediate consequences of the definitions.
COROLLARY. A k-exposed point is k-extreme. A k-exposed (extreme) point is h-exposed (extreme) for all h > k. If $\varphi$ is a supporting hyperplane of C and $p \in \varphi \cap C$ is k-extreme with respect to $\varphi \cap C$, then p is k-extreme with respect to C.
We will call 0-exposed (extreme) points exposed (extreme) to conform with earlier usage. The following theorem coincides for k = 0 with the theorem of Straszewicz.
THEOREM. For $k \ge 0$, the closure of the set of k-exposed points contains the set of k-extreme points.
For $k \ge n - 1$, the theorem is true trivially. For k = n - 1 the set of k-exposed points and the set of k-extreme points both coincide with bd C and for $k \ge n$ with C.
Received September 30, 1963
This theorem is applied in [1] to the problem of determining which subsets of a general finite-dimensional Banach space have unique farthest points.
We proceed to prove the theorem by induction on k. The proof for k = 0 may be found in Straszewicz [3] and in Bourbaki [2, §4, exerc. 15c], and we will use this "ordinary Straszewicz theorem" both as a starting point for the induction and in the proof itself. Suppose that the theorem has been proved for k < h. Let p be a point in the complement of the closure of the set of h-exposed points. In order to obtain a contradiction we also add the hypothesis that p is h-extreme. Let A be an open neighborhood of p such that no $q \in A$ is h-exposed. By the induction hypothesis, p is the centroid of some non-degenerate h-simplex S. Choose a linear manifold M that passes through p and that is supplementary to the one generated by S. Since by hypothesis the point p is h-extreme, it must be extreme relative to $M \cap C$ and hence, by the ordinary Straszewicz theorem, some point $q \in A \cap M \cap C$ is exposed relative to $M \cap C$. We will now show that the face of q is at most (h - 1)-dimensional, thereby arriving at the contradiction, since by hypothesis q is not (h - 1)-extreme.
Let N be a hyperplane of M that separates q from $M \cap C$. As an affine manifold of $R^n$, N has codimension h + 1 and $N \cap C = \{q\}$. By the Hahn-Banach theorem, there is a hyperplane $\varphi$ containing N that supports C at q. In this hyperplane N has codimension h and separates q from the rest of $\varphi \cap C$. But q is not h-exposed, hence $\varphi \cap C$ has at least dimension h + 1. Let F be the affine space generated by $\varphi \cap C$. Then $N \cap F$ has codimension at most h in F, and $(N \cap F) \cap C = \{q\}$ so that there exists a hyperplane $\psi$ of F supporting $\varphi \cap C = F \cap C$ at q and containing $N \cap F$. In $\psi$, $N \cap F$ has codimension at most h - 1. But $\psi$ also contains the face of q (in C), which therefore can have at most dimension h - 1. The proof is now complete.
REFERENCES
1. Asplund, E., Sets with unique farthest points (to appear).
2. Bourbaki, N., 1956, Espaces vectoriels topologiques, chapitre IV. Paris.
3. Straszewicz, S., 1935, Über exponierte Punkte abgeschlossener Punktmengen, Fund. Math., 24, 139-143.
UNIVERSITY OF CALIFORNIA, BERKELEY, CALIFORNIA
ON HAMILTONIAN BIPARTITE GRAPHS
BY
J. MOON AND L. MOSER
ABSTRACT
Various sufficient conditions for the existence of Hamiltonian circuits in ordinary graphs are known. In this paper the analogous results for bipartite graphs are obtained.
Various sufficient conditions for an ordinary graph (without loops or multiple edges) to be Hamiltonian have been given by Dirac, Erdős, Ore, Pósa, and others. The object of this note is to point out some corresponding results for bipartite graphs which can be obtained by similar methods. A bipartite graph B(n, n) consists of the vertices $p_1, p_2, \dots, p_n$ and $q_1, q_2, \dots, q_n$ and some of the edges $(p_i, q_j)$. We assume throughout that $n \ge 2$. The degree d(x) of the vertex x is the number of vertices with which it is adjacent, or joined by an edge. A graph is Hamiltonian if it contains a complete cycle, i.e., a cycle which contains every vertex of the graph exactly once. The following result is proved by an argument very similar to one used by Pósa [7].
THEOREM. A bipartite graph B(n, n) is Hamiltonian if it has the following property: For any nonempty subset F of $k < \tfrac12 n$ vertices $p_i$ each of degree $d(p_i) < k$, every vertex $q_j$ with degree $d(q_j) < n - k$ is adjacent to at least one vertex in F, and similarly with $p_i$ and $q_j$ interchanged.
Proof. Assume that B = B(n, n) satisfies the hypothesis of the theorem but is not Hamiltonian. We may suppose that B becomes Hamiltonian whenever any new edge (p, q) is added, joining vertices not already adjacent. For if B did not originally have this property we could add suitable edges until it did and the graph so obtained would still satisfy the hypothesis of the theorem. Suppose that the vertices $p_1$ and $q_1$ are not adjacent. Then there exists a complete path, say $(p_1, q_2, p_2, \dots, q_n, p_n, q_1)$, from $p_1$ to $q_1$. If $q_j$ is any vertex adjacent to $p_1$ then it must be that $p_{j-1}$ is not adjacent to $q_1$, for otherwise $(p_1, q_j, p_j, \dots, q_1, p_{j-1}, q_{j-1}, p_{j-2}, \dots, q_2, p_1)$ would be a complete cycle of B.
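The rerouting step at the end of this argument can be checked mechanically. The sketch below is only an illustration: vertices are encoded as labelled pairs such as ("p", 3), and the helper names `complete_path` and `complete_cycle` are hypothetical, not notation from the paper. It builds the cycle that would exist if $q_j$ were adjacent to $p_1$ and $p_{j-1}$ were adjacent to $q_1$.

```python
def complete_path(n):
    """The complete path (p1, q2, p2, ..., qn, pn, q1) as a list of labels."""
    path = [("p", 1)]
    for i in range(2, n + 1):
        path += [("q", i), ("p", i)]
    path.append(("q", 1))
    return path


def complete_cycle(n, j):
    """Cycle obtained when q_j is adjacent to p_1 and p_{j-1} is adjacent to q_1.

    Assumes 2 <= j <= n.  The returned list is read cyclically, so the last
    vertex is joined back to the first.
    """
    path = complete_path(n)
    start = path.index(("q", j))
    # p1, then q_j ... q_1 along the path, then p_{j-1} ... q_2 against the path
    return [path[0]] + path[start:] + path[1:start][::-1]


print(complete_cycle(5, 3))
```

Every consecutive pair in the returned list is either an edge of the complete path or one of the two extra edges $(p_1, q_j)$ and $(q_1, p_{j-1})$, and each of the 2n vertices occurs exactly once.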
From this it follows that d(p) + d(q) $d(p_1) + d(q_1)$, contradicting the choice of $p_1$ and $q_1$. The fact that $q_1$, where $d(q_1) < n - k$, is not adjacent to any of the vertices $p_{j-1}$ violates the original hypothesis. This contradiction suffices to complete the proof of the theorem.
The following corollary is analogous to the theorem proved by Pósa [7] for ordinary graphs.
COROLLARY 1. If B(n, n) is such that for every k, where 1 .3
0 and that all matrices in N satisfy $H_3$.
4) If N is a finite set of doubly stochastic matrices satisfying $H_3$, then the products of the form $\prod_{i=1}^{m} A_i$, $A_i \in N$, have a limit or, more precisely, the matrix Q all of whose terms equal 1/n satisfies the equation
*) For proof, see [4, p. 22].
A. PAZ
$\lim_{m \to \infty} \prod_{i=1}^{m} A_i = Q.$
5) With regard to long products of stochastic matrices it is of interest to inquire whether there exists a procedure for approximating such a product. Suppose, for example, that A is a stochastic matrix which can be written in the form $A = I + \varepsilon$ where terms in $\varepsilon$ are small. The usual approximation $A^n = I + n\varepsilon$ ceases to be stochastic for sufficiently large n. However, it is seen from Theorems 8, 10 and 11 that a good stochastic approximation to a long product of stochastic matrices (all satisfying $H_3$ and all belonging to a finite set N of matrices) is obtained by omitting the first k matrices in the product, where k is an easily computable function of the error allowed. For, as is seen from the above theorems, the first k matrices in the product contribute to the terms in the r-th row of the product of m matrices at most
$(1 - \delta)^{m-1} + (1 - \delta)^{m-2} + \dots + (1 - \delta)^{m-k} = \dfrac{(1 - \delta)^{m-k} - (1 - \delta)^{m}}{\delta},$
while those in the other rows differ from their counterparts in the r-th row by at most $(1 - \delta)^{m}$, where $\delta$ is defined as in Theorem 9 and $(1 - \delta)$ is the maximal term in all kernels of the matrices in the finite set N.
Acknowledgements. The author is indebted to Dr. M. Yoeli for guidance, advice and a number of suggestions followed up in the paper; to Prof. H. Hanani, for guidance and advice; to Prof. M. O. Rabin of the Hebrew University for encouragement in undertaking the present study; and to Mr. E. Goldberg for editorial assistance.
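The effect of omitting the first k factors can be observed numerically. The sketch below is an illustration under stated assumptions, not the paper's procedure: condition $H_3$ is not defined in this excerpt, so strictly positive stochastic matrices are used as a stand-in, and the two matrices, the product length 40 and the choice k = 10 are arbitrary.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def product(mats):
    """Left-to-right product mats[0] * mats[1] * ... of square matrices."""
    result = mats[0]
    for m in mats[1:]:
        result = matmul(result, m)
    return result


def max_abs_diff(a, b):
    """Largest entrywise difference between two matrices of equal size."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Two strictly positive stochastic matrices (assumed stand-ins for the set N).
A = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]]
B = [[0.4, 0.4, 0.2], [0.1, 0.7, 0.2], [0.25, 0.25, 0.5]]

factors = [B if i % 3 == 0 else A for i in range(40)]
k = 10
full = product(factors)            # A_1 A_2 ... A_m
shortened = product(factors[k:])   # the same product without its first k factors

error = max_abs_diff(full, shortened)
# how far the rows of the shortened product are from being identical
column_spread = max(max(col) - min(col) for col in zip(*shortened))
print(error, column_spread)
```

Since the full product is a stochastic matrix times the shortened one, each of its entries is a weighted average of a column of the shortened product, so the error cannot exceed the column spread. For these positive matrices that spread shrinks geometrically with the number of remaining factors.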
BIBLIOGRAPHY
1. Berge, C., 1962, The Theory of Graphs and its Applications, Wiley, New York.
2. Doob, J. L., 1953, Stochastic Processes, Wiley, New York.
3. Feller, W., 1950, Probability Theory and Applications, Wiley, New York.
4. Kemeny, J. G. and Snell, J. L., 1960, Finite Markov Chains, Van Nostrand, Princeton.
TECHNION-ISRAEL INSTITUTE OF TECHNOLOGY, HAIFA
THE GEOMETRY OF SOLVABILITY AND DUALITY IN LINEAR PROGRAMMING BY
ADI BEN-ISRAEL* ABSTRACT
Solvability and boundedness criteria for dual linear programming problems are given in terms of the problem data and the intersections of the nonnegative orthant with certain complementary orthogonal subspaces.
Introduction. The duality theorem of linear programming(1) relates two linear extremum problems in terms of solvability, boundedness and equality of functional values. The classical theory of Lagrange multipliers admits extensions to some special nonlinear cases(2) as well as interpretations of duality in the context of applications(3). Tucker, in [16], showed duality, in the linear case, to follow from elementary geometric considerations of complementary orthogonality of manifolds corresponding to the dual problems(4). In this paper we follow Tucker's approach and supplement his results [16] by an alternative theorem for dual programs (Theorem 4 below), and by a characterization of all duality situations in terms of the geometrical configurations of certain manifolds associated with the data and the data itself (Theorem 6 below). None of our results seem to be essentially new; yet our efforts may be justified for pedagogical reasons.
NOTATIONS. In this note we use the same notations as in [1]. In particular: $\mathcal{F}$ is an arbitrary ordered field; $E^n$ is the n-dimensional vector space over $\mathcal{F}$; $C\{f_1, \dots, f_k\}$ is the cone spanned by the vectors $f_1, \dots, f_k$ in $E^n$;
Received August 22, 1963
*This research was partly supported by the Office of Naval Research, contract Nonr-1228(10), project NR 047-021, and by the National Science Foundation, project G-14102.
(1) Conjectured by von Neumann (e.g. [7], p. 23) and proved by Gale, Kuhn and Tucker [8]. For the extended form discussed here, see Charnes and Cooper [3].
(2) Notably Kuhn and Tucker [11].
(3) E.g. [7], pp. 19-22.
(4) A similar approach was used in [1] to develop, in a unified manner, the main theorems of linear inequalities.
A is an m × n matrix over $\mathcal{F}$; N(A) is the null space of A in $E^n$; $R(A^T)$ is the range space of $A^T$ in $E^n$. In addition, let $A^+$ denote the generalized inverse of A (see, e.g., [14] and [2]). The following two theorems, which are Corollaries 3' and 5 respectively of [1], are used in the sequel and quoted here for ease of reference:
0.1 THEOREM. Let L, $L^\perp$ be complementary orthogonal subspaces in $E^n$. Then the following are equivalent:
(i) $L \cap E^n_+ = \{0\}$;
(ii) $L^\perp \cap \operatorname{int} E^n_+ \ne \emptyset$.
0.2 THEOREM. Let L be a subspace of $E^n$ of dimension k, k = 1, 2, ..., n. Then the following are equivalent: (i) $L \cap \operatorname{bd} E^n_+ = C\{e_1, \dots, e_p\}$, l max x'----L-I I xt 0 is solvable for all $b \in R(A)$; (ii) Ax = 0, x > 0 is solvable; (iii) Ax = b, x > 0 is solvable for all $b \in R(A)$.
Proof. The solutions of Ax = b, when solvable, form the manifold $A^+ b + N(A)$, where $A^+$ denotes the generalized inverse of A, [14]. Now (i), (ii) and (iii) are the corresponding parts in Lemma 1 with $x = A^+ b$ and L = N(A).
3. COROLLARY. Let A be an arbitrary m × n matrix over $\mathcal{F}$. Then the following are equivalent: (i) $A^T w > c$ is solvable for all $c \in E^n$; (ii) $A^T w > 0$ is solvable; (iii) $A^T w > c$ is solvable for all $c \in E^n$.
Proof. In Lemma 1 let $L = R(A^T)$, x = -c.
4. THEOREM. Let A be an arbitrary m × n matrix over $\mathcal{F}$. Consider the system of equations and inequalities:
I) Ax = b, $x \ge 0$
I') Ax = 0, $x \ge 0$
II) $A^T w \ge c$
II') $A^T w \ge 0$
Then:
a) I is solvable for all $b \in R(A)$ if and only if II' does not have solutions with nonnegative nonzero vectors $A^T w$.
b) II is solvable for all $c \in E^n$ if and only if I' does not have nonnegative nonzero solutions.
c) If II' has solutions w with $A^T w > 0$ then I is solvable if and only if $A^T w \ge 0 \Rightarrow (b, w) \ge 0$.
d) If I' has solutions x with $x \ge 0$ then II is solvable if and only if Ax = 0, x > 0 $\Rightarrow$ (c, x) 0 then I is solvable if and only if $A^T w > 0 \Rightarrow (A^T w, A^+ b) > 0$. This follows from the fact that $A A^+$ is the perpendicular projection on R(A), e.g. [2].
5. Let A be an arbitrary m × n matrix over $\mathcal{F}$, $b \in E^m$ and $c \in E^n$. Let
$S = \{x \in E^n : Ax = b,\ x \ge 0\}, \qquad I_1 = \sup_{x \in S}\,(c, x),$
$T = \{w \in E^m : A^T w \ge c\}, \qquad I_2 = \inf_{w \in T}\,(b, w).$
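As a small illustration of these definitions, the following worked example computes both values for one hypothetical choice of data. The matrix A, the vectors b and c, and the reading of the constraints as $x \ge 0$ and $A^T w \ge c$ are assumptions made for the example and are not taken from the paper.

```latex
% Assumed data: m = 1, n = 2, A = (1 \ 1), b = 1, c = (1, 2)^T.
\begin{align*}
S   &= \{x \in E^2 : x_1 + x_2 = 1,\ x_1 \ge 0,\ x_2 \ge 0\}, \\
(c, x) &= x_1 + 2x_2 = 1 + x_2 \le 2 \quad \text{on } S,
        \qquad\text{so } I_1 = 2, \text{ attained at } x = (0, 1); \\
T   &= \{w \in E^1 : w \ge 1,\ w \ge 2\} = \{w : w \ge 2\}, \\
(b, w) &= w \ge 2 \quad \text{on } T,
        \qquad\text{so } I_2 = 2, \text{ attained at } w = 2.
\end{align*}
```

Here both sets are non-empty and the two extreme values coincide.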
The duality theorem of linear programming relates the problem of solving for $I_1$, the primal problem, to that of solving for $I_2$, the dual problem. The duality theorem states indeed that there are four mutually exclusive cases:
Case A: $S \ne \emptyset$, $T \ne \emptyset$, $I_1 = I_2$
Case B: $S = \emptyset$, $T \ne \emptyset$, $I_2 = -\infty$
Case C: $S \ne \emptyset$, $T = \emptyset$, $I_1 = \infty$
Case D: $S = \emptyset$, $T = \emptyset$
Conjectured by von Neumann (see [7], p. 23) and proved by Gale, Kuhn and Tucker [8], this theorem was extended to some nonlinear situations, the most general being that of Charnes, Cooper and Kortanek [4]. We will now elaborate on the four cases given above. In terms of the data {A, b, c}, and more specifically of the configurations of N(A) and $R(A^T)$ with respect to $E^n_+$, we give below conditions for the attainment of each of the above cases.
6. THEOREM. Let A be an arbitrary m × n matrix over $\mathcal{F}$, $b \in R(A)$ in $E^m$, $c \in E^n$ and let S, T, $I_1$ and $I_2$ be as above. Then there are eight mutually exclusive cases, tabulated below.
Proof. The cases 1, ..., 8 are clearly mutually exclusive. In each case Theorem 4 is used to draw the conclusions regarding the sets S and T. Then the duality theorem of linear programming is used to obtain $I_1$ and $I_2$.
REMARKS. a) The above 8 cases can be visualized geometrically in a manner which helps to clarify the concept of duality. Thus in the 2-dimensional case where A is a 1 × 2 matrix, $\dim R(A^T) = \dim N(A) = 1$, the first case appears as follows:
u
7
6
5
4
3
1
Case
R(A r) N i n t C{ep+l, ...,e,} ¢ ¢
N ( A ) n b d E ~ = C{ei "" ep} 1 __
_ 0 and (c, x) > 0 for some x
A r w >_ 0 and ( A r w , A + b ) < 0 f o r some w
A r w >_ 0 and (Arw, A + b) < 0 for some w but A x = O, x >_ 0 =>(c, x) < 0
A r w > 0 =>( A r w , A +b) > 0 but A x = O, and (c,x) > 0 for some x > 0
A r w > 0 =>(Arw, A+b) > 0 A x = O, x > O =>(c,x) < O
A x = O, x > 0 and (c,x) > 0 for some x
A x = O, x > 0 =>(c,x) 5 0
ATw > 0 and ( A r w , A + b) < for some w
AT w >_ 0 =~(Arw, A+b) > 0
Conditions on b,c
Empty
Non-empty
Non-empty forall b G R(A)
Empty
Non-empty
Empty
Non-empty
Empty
Non-empty
Non-empty for all c ~ E "
Non-empty
Empty
T
S
Conclusions
O0
12 = - - 0 0
I 1 = 12
I:=oo
11 = / 2
12=
11 = 12
[t,I2
The other cases are drawn in a similar manner. Furthermore, by the "complementary slackness" property, it is now easy to identify optimal points. Thus $x_0$ is the optimal solution of the primal problem and $a = A^T w_0 - c$, where $w_0$ is the optimal solution of the dual problem ([16], p. 15).
b) Theorem 6 combines well-known solvability theorems (Tucker [16], Charnes-Cooper [3], p. 214), and the duality theorem of linear programming to characterize the duality situations in terms of the data {A, b, c}.
Acknowledgment. The help and encouragement of Professor A. Charnes are hereby gratefully acknowledged.
REFERENCES
1. Ben-Israel, A., Notes on linear inequalities, I: The intersection of the nonnegative orthant with complementary orthogonal subspaces, ONR Research Memorandum, No. 78 (to appear in J. Math. Anal. Appl.).
2. Ben-Israel, A. and Charnes, A., 1963, Contributions to the theory of generalized inverses, J. Soc. Ind. Appl. Math. (to appear).
3. Charnes, A. and Cooper, W. W., 1961, Management Models and Industrial Applications of Linear Programming, Vols. I, II, J. Wiley and Sons, New York.
4. Charnes, A., Cooper, W. W. and Kortanek, K., 1962, Proc. Nat. Acad. Sci. U.S.A., 48, 783-786.
5. Courant, R. and Hilbert, D., 1953, Methods of Mathematical Physics, Vol. I, Interscience Publishers, Inc., New York.
6. Fan, Ky., pp. 99-156 in [12].
7. Gale, D., 1960, The Theory of Linear Economic Models, McGraw-Hill Book Co., New York.
8. Gale, D., Kuhn, H. W. and Tucker, A. W., pp. 317-319 in [9].
9. Koopmans, T. C. (editor), 1951, Activity Analysis of Production and Allocation, Cowles Monograph No. 13, J. Wiley and Sons, New York.
10. Kuhn, H. W., 1959, Amer. Math. Monthly, 63, 217-232.
11. Kuhn, H. W. and Tucker, A. W., 1951, Nonlinear Programming, Proc. Second Berkeley Symposium on Mathematical Statistics and Probability, pp. 481-492.
12. Kuhn, H. W. and Tucker, A. W. (editors), 1956, Linear Inequalities and Related Systems, Annals of Mathematics Studies, No. 38, Princeton University Press, Princeton.
13. Motzkin, T. S., 1936, Beiträge zur Theorie der linearen Ungleichungen, Inaugural Dissertation, Basel, 1933, Azriel, Jerusalem.
14. Penrose, R., 1955, A generalized inverse for matrices, Proc. Camb. Philos. Soc., 51, 3, 406-413.
15. Tucker, A. W., pp. 3-18 in [12].
16. Tucker, A. W., June 1962, Simplex Method and Theory, Notes on Linear Programming and Extensions, Part 62, The Rand Corporation, Santa Monica, Memorandum RM-3199-PR.
TECHNION-ISRAEL INSTITUTE OF TECHNOLOGY, HAIFA, AND NORTHWESTERN UNIVERSITY, EVANSTON, ILLINOIS
THE SPHERE IN THE IMAGE
BY
H. HANANI, E. NETANYAHU AND M. REICHAW-REICHBACH
Introduction. If f(x) is an analytic and univalent function of the complex variable x in the unit circle $S = \{x : |x| < 1\}$ and $f(0) = y_0$, then the image f(S) of S contains a circle with radius $r_0 = \tfrac14 |f'(0)|$. Even when f(x) is not univalent, the image f(S) of S contains a circle with radius $k\,|f'(0)|$, where k is the constant of Bloch(1). The aim of this paper is to prove theorems, similar to the last statement, for some mappings of n-dimensional Euclidean spaces and general Banach spaces into themselves. The idea of the proofs consists in the following use of fixed point theorems(2): Suppose that $S = \{x : \|x\| \le 1\}$ is the unit sphere in a Banach space X and that for a given $y \in X$ the mapping $x - \beta[f(x) - y]$ of S into X has a fixed point $x \in S$ for some $\beta \ne 0$. Then f(x) = y and so y is contained in the image f(S) of S. Considering all such points y, we are looking for conditions under which the set of these points contains a sphere with radius as large as possible.
By $S = S(x_0, r) = \{x : \rho(x_0, x) \le r\}$ we denote the sphere with center $x_0$ and radius r in a metric space with metric $\rho$ and by $\mathrm{Bd}(S) = \{x : \rho(x_0, x) = r\}$ the boundary of S. By (x, y) we denote the scalar product of x and y.
1. In this section a simple generalization to Hilbert spaces of the fixed-point theorem of Schauder (Theorem 1) and its applications are given. Before proceeding with the proof of Theorem 1, let us first note the following
LEMMA 1. If $X_1$ and $X_2$ are closed subsets of a metric space X and $f_1 : X_1 \to Y$ and $f_2 : X_2 \to Y$ are mappings(3) of $X_1$ and $X_2$ into a metric space Y, then
$f(x) = f_1(x)$ for $x \in X_1$, $\quad f(x) = f_2(x)$ for $x \in X_2$
is continuous on $X_1 \cup X_2$, provided that $f_1(x) = f_2(x)$ on $X_1 \cap X_2$.
The proof is trivial.
REMARK 1. Easy examples show that the assumption of closedness of the sets $X_1$ and $X_2$ in Lemma 1 is essential.
(1) Bloch's Theorem has been generalized to mappings of n-dimensional spaces by S. Bochner in [2].
(2) A particular case of this idea has been used in [9], p. 734.
(3) By "mapping" we always understand a continuous mapping.
Received August 12, 1963.
THEOREM 1. Let $f : S \to X$ be a completely continuous mapping of the sphere $S = \{x : \|x - x_0\| \le r\}$ in the Hilbert space X into X, such that
(1) $(x - x_0, f(x) - x_0) \le r^2$ for $x \in \mathrm{Bd}(S)$.
Then there exists a point $\bar x \in S$ such that $f(\bar x) = \bar x$(4).
Proof. Define for $x \in S$ the function
$g(x) = f_1(x) = f(x)$ for $x \in X_1 = \{x : \|f(x) - x_0\| \le r\}$,
$g(x) = f_2(x) = x_0 + r\,\dfrac{f(x) - x_0}{\|f(x) - x_0\|}$ for $x \in X_2 = \{x : \|f(x) - x_0\| \ge r\}$.
Then, by the continuity of f, the sets $X_1$ and $X_2$ are closed subsets of S, and for $x \in X_1 \cap X_2 = \{x : \|f(x) - x_0\| = r\}$ obviously $f_1(x) = f_2(x)$. Hence, by Lemma 1, g(x) is a continuous function on S. Moreover, since $S = X_1 \cup X_2$ and f is completely continuous on S, it follows that g is also completely continuous on S. Thus by $\|g(x) - x_0\| \le r$ and by the theorem of Schauder [10, Theorem 2] there exists a point $\bar x$ such that $\bar x = g(\bar x)$. Supposing $g(\bar x) = f_2(\bar x) = \bar x$, we get
(2) $r\,\dfrac{f(\bar x) - x_0}{\|f(\bar x) - x_0\|} = \bar x - x_0.$
Hence $\|\bar x - x_0\| = r$, i.e. $\bar x \in \mathrm{Bd}(S)$. On multiplying both sides of (2) scalarly by $f(\bar x) - x_0$, it follows from (1) that $r\,\|f(\bar x) - x_0\| \le r^2$, so that $\|f(\bar x) - x_0\| \le r$ and $\bar x \in X_1$. But then, by the definition of g(x), we have $\bar x = g(\bar x) = f(\bar x)$.
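The auxiliary map g of this proof is easy to write down in a finite-dimensional setting. The sketch below is an illustration only: it assumes vectors in Euclidean space represented as Python lists, and the name `truncate_to_ball` and the sample map f are hypothetical choices, not part of the paper.

```python
import math


def truncate_to_ball(f, x0, r):
    """Return g with g(x) = f(x) if |f(x) - x0| <= r, and otherwise
    g(x) = x0 + r * (f(x) - x0) / |f(x) - x0| (radial pull-back onto the sphere)."""
    def g(x):
        y = f(x)
        d = [yi - ci for yi, ci in zip(y, x0)]
        norm = math.sqrt(sum(t * t for t in d))
        if norm <= r:
            return list(y)                      # x lies in X_1: keep f(x)
        return [ci + r * t / norm for ci, t in zip(x0, d)]  # x lies in X_2
    return g


# Hypothetical map of the plane into itself, used only to exercise g.
def f(x):
    return [3.0 * x[0] + 1.0, -2.0 * x[1]]


g = truncate_to_ball(f, [0.0, 0.0], 1.0)
print(g([1.0, 1.0]), g([-0.2, 0.1]))
```

Whatever f does, the values of g stay in the closed ball of radius r about $x_0$, which is the property $\|g(x) - x_0\| \le r$ that allows Schauder's theorem to be applied.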
LEMMA 2. Let $f : S \to X$ be a mapping of a sphere $S = \{x : \|x - x_0\| \le r\}$ in the Hilbert space X into X, such that for some $\beta \ne 0$, the mapping $x + \beta f(x)$ is completely continuous on S. Suppose further that for some given $\bar y \in X$ we have
(3) $(x - x_0, \beta[f(x) - \bar y]) \le 0$ for every $x \in \mathrm{Bd}(S)$.
Then there exists a point $\bar x \in S$, such that $f(\bar x) = \bar y$.
Proof. We show that Theorem 1 applies to the mapping $h(x) = x + \beta[f(x) - \bar y]$. In fact, since $x + \beta f(x)$ is completely continuous, it follows that h(x) is completely continuous, and it remains to verify that (1) holds with f replaced by h. Indeed, for $x \in \mathrm{Bd}(S)$ we have $(x - x_0, x - x_0 + \beta[f(x) - \bar y]) = \|x - x_0\|^2 + (x - x_0, \beta[f(x) - \bar y]) = r^2 + (x - x_0, \beta[f(x) - \bar y])$. Hence by (3), the assumption (1) holds with f replaced by h. By Theorem 1 there exists a point $\bar x = h(\bar x)$, and since $\beta \ne 0$ it follows that $f(\bar x) = \bar y$.
Putting, in Lemma 2, $x_0 = 0$ and either $\beta = 1$ or $\beta = -1$ we obtain the following
(4) For mappings of finite dimensional spaces, Theorem 1 can be derived from a result of A. Abian and A. B. Brown [1].
THEOREM 2. If $f : S \to X$ is a mapping of the sphere $S = \{x : \|x\| \le r\}$ in the Hilbert space X into X such that for some given $\bar y$ either
(a) $(x, f(x)) \le (x, \bar y)$ for $\|x\| = r$ and $x + f(x)$ is completely continuous in S
or
(b) $(x, f(x)) \ge (x, \bar y)$ for $\|x\| = r$ and $x - f(x)$ is completely continuous in S,
then there exists a point $\bar x \in S$, such that $f(\bar x) = \bar y$.
The following two examples illustrate the use of Theorem 2.
EXAMPLE 1. Let X be the Euclidean 2n-dimensional space, $\{a_j\}_{j=1,2,\dots,n}$ a sequence of n real numbers, k a positive integer and consider the mapping $f : X \to X$ defined by
(4) $y_{2j-1} = -x_{2j-1}^{2k-1} - a_j x_{2j}, \qquad y_{2j} = -x_{2j}^{2k-1} + a_j x_{2j-1} \qquad (j = 1, 2, \dots, n)$
where $x = (x_1, x_2, \dots, x_{2n})$ and $y = (y_1, y_2, \dots, y_{2n})$ are points of X. Then the image of the unit sphere $S = \{x : \|x\| \le 1\}$ contains a sphere $S' = \{y : \|y\| \le r_0\}$ with radius $r_0 = (1/2n)^{k-1}$.
Proof. We have $\varphi(x) = (x, f(x)) = -\sum_{j=1}^{2n} x_j^{2k}$, and for $\|x\| = 1$ (i.e. for $x \in \mathrm{Bd}(S)$), $\varphi(x)$ has a maximum for $x_1 = x_2 = \dots = x_{2n}$. This maximum equals $\max_{\|x\|=1} \varphi(x) = -(1/2n)^{k-1} = -r_0$. Now, if ~ p II ---__1 / ( 2 n - 1)g- 1.
EXAMPLE 2. Let $X = L^2[0, 1]$ be the Hilbert space of all square integrable functions on the interval [0, 1] and consider the mapping $f : S \to X$ of the unit sphere $S = \{x = x(t) : \|x\| \le 1\}$ into X defined by $f(x) = x + \int_0^1 x^2(u)\,du$. Then $x - f(x) = -\int_0^1 x^2(u)\,du$ is a completely continuous mapping on S and for x with $\|x\| = 1$ we have
$|(x, 1 - \bar y)| \le \|x\| \cdot \|1 - \bar y\| = \|1 - \bar y\|.$
Hence, for x with $\|x\| = 1$ we obtain
$(x, f(x)) - (x, \bar y) = 1 + \int_0^1 x(t)\,[1 - \bar y(t)]\,dt \ge 1 - \|1 - \bar y\|.$
It follows that for $\bar y = \bar y(t)$ satisfying $1 > \|1 - \bar y\|$ the inequality $(x, f(x)) > (x, \bar y)$ holds for $x \in \mathrm{Bd}(S)$. Thus the assumption (b) of Theorem 2 is satisfied. By Theorem 2, we obtain that the image f(S) contains a sphere $S' = \{y : \|y - 1\|$ - H be a mapping of an open set G contained in a Banach space X into a subset H of a Banach space Y. Let $x_0 \in G$, and suppose that there exists a linear mapping $A : X \to Y$ such that for every $x \in X$ we have $\lim_{t \to 0}\,[f(x_0 + tx) - f(x_0)]/t = A(x)$. The mapping A is called the derivative of f at the point $x_0$ and denoted by $f'(x_0)$ or $f'$. It can be shown(6) that: If $[y, x] \subset G$ is an interval and $f : G \to H$ has a derivative at each point of [y, x], then for every linear mapping $U : X \to Y$ we have $\|f(x) - f(y) - U(Ay)\| < \sup_0$ $((1 - \gamma)/2K)$, then the image f(S) of the sphere $S = \{x : \|x\| \le r\}$ contains a sphere of radius $r_0 = ((1 - \gamma)/2)^2 / K$.
Proof. We have $\|F'(x)\| = \|F'(0) + F'(x) - F'(0)\| \le \gamma + \|F'(x) - F'(0)\| \le \gamma + K\|x\|$(8). Hence for x satisfying $\|x\| \le ((1 - \gamma)/2K)$ we obtain $\|F'(x)\|$