A PRE-HILBERT SPACE CONSISTING OF CLASSES OF CONVEX SETS BY
G. C. SHEPHARD ABSTRACT
An equivalence relation is defined in the set of all bounded closed convex sets in Euclidean space $E^n$. The equivalence classes are shown to be elements of a pre-Hilbert space $A^n$, and geometrical relationships between $A^n$ and $E^n$ are investigated.

Received August 14, 1965.

1. Introduction. With the operations of vector addition and scalar multiplication, the class $\mathcal{K}^n$ of all bounded convex sets in $E^n$ forms a topological semigroup with scalar operators. If $K_1, K_2 \in \mathcal{K}^n$, and we write $K_1 \sim K_2$ when there exist centrally symmetric convex sets $S_1, S_2 \in \mathcal{K}^n$ such that

(1) $K_1 + S_1 = K_2 + S_2,$

then $\sim$ is an equivalence relation in $\mathcal{K}^n$. The equivalence class containing a given set $K$ is denoted by $[K]$ and is called an asymmetry class. In [3] it was shown that with the operations

(2) $\lambda[K] = [\lambda K], \qquad [K_1] + [K_2] = [K_1 + K_2],$

the set $A^n$ of asymmetry classes forms a normed vector space. The purpose of this note is to show that an inner product can be defined in $A^n$ so that it becomes a pre-Hilbert space (i.e. an infinite-dimensional inner-product space over the real numbers which is not complete), and to investigate briefly some geometrical properties of $K_1, K_2 \in \mathcal{K}^n$ which imply that the corresponding classes $[K_1]$, $[K_2]$ are orthogonal in $A^n$. The treatment follows similar lines to that of G. Ewald [2] except that he considers a different equivalence relation: his relation is defined by (1) with $S_1, S_2$ representing convex sets which are centrally symmetric in the origin. Consequently the corresponding equivalence classes are not invariant under translation.

2. Asymmetry functions. The Steiner point or curvature centroid $s(K)$ of a convex set $K \in \mathcal{K}^n$ may be defined (see [6]) by the relation
where $i$ is any fixed unit vector, $u$ is a typical unit vector, $\Omega_n$ is the n-dimensional unit sphere in $E^n$, centred on the origin, $d\omega$ is an element of surface area of $\Omega_n$, and $H(K,u)$ is the supporting function of the set $K$, that is,

(4) $H(K,u) = \sup_{x \in K} x \cdot u.$

We shall require the following properties of $s(K)$:

(5) $s(\lambda K) = \lambda s(K), \qquad s(K_1 + K_2) = s(K_1) + s(K_2)$

for all $\lambda \in R$ and $K, K_1, K_2 \in \mathcal{K}^n$.

(6) If $\mathcal{K}^n$ is topologised by the Hausdorff metric [1, p. 34], then $s(K)$ is a continuous function of $K$.

Both (5) and (6) follow immediately from the definition (3). Now let $[K] \in A^n$ be any asymmetry class. Since $[K]$ contains all the translates of each of its members, we may choose a representative $K \in [K]$ with the property that $s(K) = o$ (the origin). For such a representative, write

(7) $a(K,u) = H(K,u) - H(K,-u).$

Then $a(K,u)$ will be called an asymmetry function.

(8) $a(K_1,u) = a(K_2,u)$ if and only if $K_1 \sim K_2$.

For suppose (1) holds. Since $s(K_1) = s(K_2) = o$ by choice of $K_1$ and $K_2$, (5) implies $s(S_1) = s(S_2) = s$ (say), and then by (4),

$H(S_i,u) = \bar{H}(S_i,u) + s \cdot u \qquad (i = 1, 2)$

where $\bar{H}(S_i,u)$ is the supporting function of $S_i$ relative to its centre. Hence

(9) $H(K_1,u) + \bar{H}(S_1,u) + s \cdot u = H(K_1,u) + H(S_1,u) = H(K_1 + S_1,u) = H(K_2 + S_2,u) = H(K_2,u) + H(S_2,u) = H(K_2,u) + \bar{H}(S_2,u) + s \cdot u,$
and similarly,

$H(K_1,-u) + \bar{H}(S_1,-u) - s \cdot u = H(K_2,-u) + \bar{H}(S_2,-u) - s \cdot u.$

Subtracting this latter relation from (9) and using the fact that, by central symmetry,

$\bar{H}(S_i,u) = \bar{H}(S_i,-u) \qquad (i = 1, 2)$

we obtain $a(K_1,u) = a(K_2,u)$ as was to be shown. Conversely, if $a(K_1,u) = a(K_2,u)$ then it is simple to verify that (1) holds with $S_1, S_2$ defined by

$H(S_1,u) = \tfrac{1}{2}\,(H(K_2,u) + H(K_2,-u)), \qquad H(S_2,u) = \tfrac{1}{2}\,(H(K_1,u) + H(K_1,-u)),$

and so $K_1 \sim K_2$. The main consequence of (8) is that $a(K,u)$ is an asymmetry class invariant so that $a([K],u)$ may be properly defined by $a([K],u) = a(K,u)$ for any $K \in [K]$. Thus we have established the existence of a 1-1 mapping between $A^n$ and the class of all asymmetry functions $a([K],u)$. The next statement shows that this mapping is an isomorphism:

(10) For all real $\lambda$ and $[K], [K_1], [K_2] \in A^n$,

$a(\lambda[K],u) = \lambda\, a([K],u)$ and $a([K_1] + [K_2],u) = a([K_1],u) + a([K_2],u).$
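Statement (8) can be checked numerically in the plane. The following is a minimal sketch, not part of the original argument: it assumes polygons given by vertex lists, and it assumes the standard planar normalisation $s(K) = \pi^{-1}\int u\,H(K,u)\,d\theta$ for the Steiner point (definition (3) is not legible in this copy). The helper names `support`, `steiner_point`, `asymmetry` and `minkowski_sum` are hypothetical.

```python
import math

def support(points, theta):
    """Supporting function H(K, u) of the convex hull of `points` at u = (cos t, sin t)."""
    c, s = math.cos(theta), math.sin(theta)
    return max(x * c + y * s for x, y in points)

def steiner_point(points, n=4000):
    """Planar Steiner point, assumed here to be (1/pi) * integral of u H(K, u) d(theta)."""
    sx = sy = 0.0
    for k in range(n):
        t = 2.0 * math.pi * k / n
        h = support(points, t)
        sx += h * math.cos(t)
        sy += h * math.sin(t)
    step = 2.0 * math.pi / n
    return (sx * step / math.pi, sy * step / math.pi)

def asymmetry(points, theta, centre):
    """a(K, u) = H(K, u) - H(K, -u) for the translate of K whose Steiner point is the origin."""
    shifted = [(x - centre[0], y - centre[1]) for x, y in points]
    return support(shifted, theta) - support(shifted, theta + math.pi)

def minkowski_sum(p, q):
    """All pairwise vertex sums; their convex hull is P + Q, which is all `support` needs."""
    return [(a + c, b + d) for a, b in p for c, d in q]

triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
# A square centred at (3, -1): centrally symmetric, but not about the origin.
square = [(2.5, -1.5), (3.5, -1.5), (3.5, -0.5), (2.5, -0.5)]
summed = minkowski_sum(triangle, square)

s_tri = steiner_point(triangle)
s_sum = steiner_point(summed)
angles = [0.1 + 0.37 * k for k in range(17)]
# Largest difference between a(K, u) and a(K + S, u) over the sample directions.
max_gap = max(abs(asymmetry(triangle, t, s_tri) - asymmetry(summed, t, s_sum))
              for t in angles)
```

Since the triangle and the sum of the triangle with a centrally symmetric square lie in the same asymmetry class, `max_gap` should be zero up to the quadrature error of the Steiner point.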
The proof follows immediately from (2) and (5). It depends essentially upon the additivity property of $s(K)$, which explains why asymmetry functions must be defined relative to this point as origin. The rest of this section is concerned with characterising asymmetry functions. This we can do completely in the case $n = 2$.

(11) For each $[K] \in A^n$, the asymmetry function $a([K],u)$
(a) is homogeneous in $u$, that is, $a([K],\lambda u) = \lambda\, a([K],u)$ for all real $\lambda$,
(b) is a continuous function of $u$ in $E^n$,
(c) satisfies $\int_{\Omega_n} u\, a([K],u)\, d\omega = o,$
(d) (in the case $n = 2$) is a differentiable function of $u$ on the circle $\Omega_2 : |u| = 1$, except possibly at an enumerable number of points, and the derivative is of bounded variation on $\Omega_2$.

Assertions (a) and (b) follow from the properties of support functions, (d) follows from the properties of convex functions given in [4], and (c) is a consequence of (3) and (7). We now come to the main theorem:

(12) THEOREM. Let $f(u)$ be a function defined for $u \in E^2$. Then $f(u)$ is an asymmetry function (i.e. can be expressed in the form (7) for some $K \in \mathcal{K}^2$) if and only if it has the properties (a), (b), (c) and (d) of (11).

A slight modification of the proof of Theorem 3 in [2] enables us to see that $f(u)$ is an asymmetry function if it has a continuous second derivative except possibly at the origin. In theorem (12), the conditions are necessary as well as being sufficient. The analogue of (12) for $n > 2$ dimensions is not known; it is difficult to see what would be the appropriate condition corresponding to (d). The necessity of the conditions in (12) is clear; to prove the sufficiency we need three lemmas, the first two of which establish conditions for a given function of a real variable to be expressible as the difference of two convex functions.

(13) LEMMA. Let $\phi(x)$ be a continuous function of a real variable $x$ in an interval $[a,b]$ with the property that $\phi'(x)$ exists at all points of $[a,b]$ with the possible exception of an enumerable set $G \subset [a,b]$. Suppose further, that $\phi'(x)$ is an increasing function of $x$ where it is defined, i.e. $\phi'(x_1) \ge \phi'(x_2)$ for $x_1 \ge x_2$, and $x_1, x_2 \in [a,b] \setminus G$. Then $\phi(x)$ is a convex function of $x$ in $[a,b]$.

Proof. Let $x_0$ be any point of $[a,b]$. If $x_0 \notin G$, let $m = \phi'(x_0)$, and if $x_0 \in G$, let $m$ be chosen so that

$\phi'(x) \le m$ for $x < x_0$ $(x \in [a,b] \setminus G)$, $\qquad \phi'(x) \ge m$ for $x > x_0$ $(x \in [a,b] \setminus G)$.

In either case $\frac{d}{dx}(\phi(x) - mx) \ge 0$ for all $x \in [a,b] \setminus G$ with $x > x_0$ and so, by lemma* of Hobson [4, p. 365],

$\phi(x) \ge \phi(x_0) + (x - x_0)m$ for $x \ge x_0$.
* I am indebted to Dr. B. Kuttner for drawing my attention to this lemma.
In a similar manner we can show that this same inequality holds also in the case $x < x_0$. We deduce that $y = \phi(x_0) + (x - x_0)m$ is a supporting line to the graph of $y = \phi(x)$ at $x_0$. Since $x_0$ is any point of $[a,b]$ this is sufficient to show that $\phi(x)$ is convex in $[a,b]$ and the lemma is proved.

(14) LEMMA. Let $\phi(x)$ be a continuous function of a real variable $x$ in an interval $[a,b]$ with $\phi(a) = 0$. Suppose that $\phi'(x)$ exists at all points of $[a,b]$ with the possible exception of an enumerable set $G$, and is of bounded variation. Then $\phi(x)$ can be written as a difference

(15) $\phi(x) = \psi_1(x) - \psi_2(x)$

of two convex functions $\psi_1(x)$, $\psi_2(x)$ in $[a,b]$.

Proof. Since $\phi'(x)$ is of bounded variation, it can be written in the form

$\phi'(x) = \chi_1(x) - \chi_2(x)$

where $\chi_1(x)$ and $\chi_2(x)$ are increasing functions of $x$ defined in $[a,b] \setminus G$. Hence the integrals

$\psi_i(x) = \int_a^x \chi_i(t)\, dt \qquad (i = 1, 2)$

are defined and then (15) follows. Further $\psi_1(x)$, $\psi_2(x)$ are convex by lemma (13), since $\psi_i'(x) = \chi_i(x)$ is an increasing function of $x$ in $[a,b] \setminus G$.

The remainder of the proof of the theorem is concerned with modifying the above procedure to apply to functions of a vector variable $u \in E^2$, instead of functions of a real variable $x$. We shall adopt the following notation. Let $f(u)$ be a function of $u \in E^2$, and let $r, \theta$ be the polar coordinates of $u$, i.e. $u = (r\cos\theta, r\sin\theta)$. Then we shall write $f(u) = f(r,\theta)$ when we wish to display the coordinates of $u$ explicitly. In this notation, the conditions of (11) become,
(a') $f(r,\theta) = r f(1,\theta)$ for all $r, \theta$,
(b') $f(1,\theta)$ is a continuous function of $\theta$,
(c') $\int_0^{2\pi} f(1,\theta)\cos\theta\, d\theta = \int_0^{2\pi} f(1,\theta)\sin\theta\, d\theta = 0,$
(d') $f(1,\theta)$ is a differentiable function of $\theta$ for $0 \le \theta \le 2\pi$, except possibly on an enumerable set $G \subset [0, 2\pi]$, and this derivative is of bounded variation.

(16) Suppose $f(1,\theta) > 0$ is a continuous convex function of $\theta$ for ~ - 0, say. A simple calculation shows that this implies that $\lambda/\mu = r_2/r_1$, so we may put $\lambda = r_2$, $\mu = r_1$ and then $r_3 = 2 r_1 r_2 \cos\phi$. Hence

$\lambda f(u_1) + \mu f(u_2) - f(u_3) = r_1 r_2 f(1,\theta_1) + r_1 r_2 f(1,\theta_2) - 2 r_1 r_2 \cos\phi\, f(1,\theta_3) \ge r_1 r_2\, (f(1,\theta_1) + f(1,\theta_2) - 2 f(1,\theta_3))$

since $0 \le \phi \le \tfrac{1}{2}\pi$, so $0 \le \cos\phi \le 1$, and $f(1,\theta_3) > 0$,

$\ge 0$

since $f(1,\theta)$ is convex. Hence (17) is true in this special case and so (16) is proved.

The next statement is the "local form" of (16), and follows easily from it. Details of the proof are omitted.

(18) If $f(1,\theta) > 0$ is locally convex at $\theta = \theta_0$, then $f(u)$ is locally convex at each point of the line $\theta = \theta_0$, $r > 0$.

We now proceed to the proof of the theorem. Since $f(1,\theta)$ is continuous and $f(1,\theta) = -f(1,\theta + \pi)$, we can choose the coordinate system in such a way that $f(1,0) = 0$. By (d'), $f'(1,\theta)$ (the derivative with respect to $\theta$) exists in $[0,\pi]$ except possibly for an enumerable set $G$, and is of bounded variation. By lemma (14) we can write

$f(1,\theta) = g_1(\theta) - g_2(\theta), \qquad 0 \le \theta \le \pi,$

where $g_i(\theta)$ are convex functions of $\theta$. For any integer $n$, define
$k(\theta + 2\pi n) = \begin{cases} g_1(\theta) - \theta\, g_1(\pi)/\pi + K, & 0 \le \theta \le \pi, \\ g_2(\theta - \pi) - (\theta - \pi)\, g_2(\pi)/\pi + K, & \pi \le \theta \le 2\pi, \end{cases}$

where $K$ is chosen sufficiently large that $k(\theta) > 0$ for all $\theta$. It is easily verified that $k(\theta)$ is continuous for all $\theta$, is convex in each of the ranges $n\pi \le \theta \le (n+1)\pi$, and that $f(1,\theta) = k(\theta) - k(\theta + \pi)$. Thus if we write $h(u) = r k(\theta)$ where $u = (r\cos\theta, r\sin\theta)$, we have

$f(u) = h(u) - h(-u)$

where $h(u)$ is continuous in $E^2$, and is convex in each of the half-planes $0 \le \theta \le \pi$, $\pi \le \theta \le 2\pi$ by (16).
To complete the proof we must express $f(u)$ as the difference of two functions which are convex in the whole plane. To do this, let $k^0_L$, $k^0_R$, $k^\pi_L$, $k^\pi_R$ be the left and right derivatives of $k(\theta)$ at $\theta = 0, \pi$ as indicated, and put

$R = \max(|k^0_L| + |k^0_R|,\ |k^\pi_L| + |k^\pi_R|).$

Let $T$ be the line segment joining the points $(R, \pi/2)$, $(R, 3\pi/2)$ so that the supporting function of $T$ is

$H(T,u) = \begin{cases} rR\sin\theta, & 0 \le \theta \le \pi, \\ -rR\sin\theta, & \pi \le \theta \le 2\pi. \end{cases}$

0 0, and we deduce that $p(u)$ is locally convex everywhere except possibly at the origin. But local convexity at the origin follows from the homogeneity of $p(u)$ and so $p(u)$ is convex in the whole plane. It may therefore be written as a supporting function $H(K,u)$ for a suitable convex set $K$. Further, from the definition of $p(u)$,

$f(u) = H(K,u) - H(K,-u).$

Finally,

$o = \int_{\Omega_2} u f(u)\, d\omega = \int_{\Omega_2} u H(K,u)\, d\omega - \int_{\Omega_2} u H(K,-u)\, d\omega,$

and so

$o = s(K) - s(-K) = 2 s(K).$
Thus the Steiner point of $K$ is at the origin, and Theorem (12) is proved.

3. An Inner Product on $A^n$. Let $[K_1], [K_2] \in A^n$, then define the inner product

(19) $\langle [K_1] \mid [K_2] \rangle_n = \int_{\Omega_n} a([K_1],u)\, a([K_2],u)\, d\omega.$

This is, within a scalar factor, the usual inner product defined on the set $C$ of all
continuous functions on $\Omega_n$. By the remark following theorem (12) it is clear that the set of all asymmetry functions is dense in the set

$C^* = \{\, f \mid f \in C,\ \int_0^{2\pi} \sin\theta\, f(\theta)\, d\theta = \int_0^{2\pi} \cos\theta\, f(\theta)\, d\theta = 0 \,\}$

so that $A^n$ is isomorphic to a dense subset of $C^*$. We can now show that

(20) $A^n$ is not complete.
If $n = 2$, this result is an immediate consequence of theorem (12) for it is easy to construct a Cauchy sequence of asymmetry functions which converges to a function whose first derivative is not of bounded variation. For general $n$ the statement follows from:

(21) Let $\mathcal{H}$ be a hyperplane ($(n-1)$-dimensional subspace of $E^n$) passing through the origin, and $K_1, K_2$ be convex sets in $\mathcal{H}$, then the inner products $\langle [K_1] \mid [K_2] \rangle_{n-1}$ of the corresponding asymmetry classes in $\mathcal{H}$, and $\langle [K_1] \mid [K_2] \rangle_n$ of the corresponding asymmetry classes in $E^n$, are equal.

Hence $A^{n-1}$ can be isometrically embedded in $A^n$, or alternatively, after identification, $A^{n-1}$ may be regarded as a subspace of $A^n$. Thus $A^2$ is a subspace of $A^n$ $(n > 2)$ and since $A^2$ is not complete, (20) follows immediately.

To prove (21), let $u \in E^n$ be any unit vector, and $u' \in \mathcal{H}$ be the unit vector whose direction is parallel to the perpendicular projection of $u$ on $\mathcal{H}$. Then for any $K \subset \mathcal{H}$,

$H(K,u) = |u \cdot u'|\, H(K,u').$

This implies that the Steiner point of $K$, regarded as lying in $\mathcal{H}$, coincides with the Steiner point of $K$ regarded as lying in $E^n$. Taking this point to be the origin,

$a(K,u) = |u \cdot u'|\, a(K,u')$

and so
fn a ([K,],u)a([Kz],u)do~= |r ~,/2), dO|r ,d - ( 1 / 2 ) n
=~
dfln-
a([K,],u)a([K2],u)cos2Odo9 I
f. ,,- a( [K,],u)a([Ke],u)dco
where $\Omega_{n-1} = \Omega_n \cap \mathcal{H}$. From this, and the definition (19), statement (21) follows.

It is, of course, possible to define many norms in $A^n$ other than that derived from the inner product (19). For example, if we put

(22) $\|[K]\| = \sup |a([K],u)|$
then $C$ is complete, but theorem (12) enables us to construct a Cauchy sequence $\sigma$ of asymmetry functions (in the norm (22)) which does not converge to an asymmetry function, and so $A^n$ is still not complete. This statement answers a question raised in [3], for $\sigma$ is a Cauchy sequence in the norm defined there, and so again $A^n$ is not complete. If $\mathcal{K}^n$ is topologised by the Hausdorff metric, and either of the norms (19) or (22) is defined on $A^n$, the natural mapping $\mathcal{K}^n \to A^n$ is continuous. This follows from the continuity of the supporting function and (6).

4. Orthogonality. We now consider some special geometric relationships between two convex sets $K_1, K_2 \in \mathcal{K}^n$ such that the classes $[K_1]$, $[K_2]$ are orthogonal in $A^n$ with respect to the inner product (19).

(22) Let $T$ be any orthogonal transformation in $E^n$ with the property that $T^2 = -I$. (Here $-I$ is the mapping which sends each $x \in E^n$ into $-x$.) Then, for any $K$, the classes $[K]$ and $[TK]$ are orthogonal in $A^n$.

Proof.
$\langle [K] \mid [TK] \rangle = \int a([K],u)\, a([TK],u)\, d\omega = \tfrac{1}{2}\left\{ \int a([K],u)\, a([TK],u)\, d\omega + \int a([K],Tu)\, a([TK],Tu)\, d\omega \right\} = 0,$

since the sum of the integrands is zero. We deduce that the inner product is zero, and so (22) is established.

In the case $n = 2$, the only transformation $T$ satisfying (22) is rotation through a right angle, so that orthogonality in $A^2$ is closely related to the concept of orthogonality in $E^2$.

(24) Let $R$, $S$ be absolutely perpendicular subspaces in $E^n$ (so that $\dim R + \dim S = n$). Let $K_R$ be any convex set symmetric in $R$ (i.e. unchanged by reflection in $R$), and $K_S$ be any convex set symmetric in $S$. Then $[K_R]$ and $[K_S]$ are orthogonal in $A^n$.

We omit the proof since it is very similar to that of (22). If $A^n(R)$ represents the subset of $A^n$ consisting of those classes $[K]$ possessing a representative symmetric in the subspace $R$, then by [3, p. 15], $A^n(R)$ and $A^n(S)$ are directly complementary subspaces in $A^n$. With the inner product (19), (24) shows that these are orthogonal complements in $A^n$.
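The orthogonality of $[K]$ and $[TK]$ in the case $n = 2$ can be illustrated numerically. The sketch below is an illustration only: it assumes a polygon given by its vertices, takes $T$ to be the rotation through a right angle, approximates the integral in (19) by a Riemann sum, and again assumes the standard planar normalisation $1/\pi$ for the Steiner point. All function names are hypothetical.

```python
import math

def support(points, theta):
    """Supporting function of the convex hull of `points` in direction (cos t, sin t)."""
    c, s = math.cos(theta), math.sin(theta)
    return max(x * c + y * s for x, y in points)

def centred(points, n=4000):
    """Translate the polygon so that its numerically integrated Steiner point is the origin."""
    step = 2.0 * math.pi / n
    sx = sum(support(points, k * step) * math.cos(k * step) for k in range(n)) * step / math.pi
    sy = sum(support(points, k * step) * math.sin(k * step) for k in range(n)) * step / math.pi
    return [(x - sx, y - sy) for x, y in points]

def asymmetry(points, theta):
    """a(K, u) = H(K, u) - H(K, -u); `points` is assumed to be centred already."""
    return support(points, theta) - support(points, theta + math.pi)

def inner(p, q, n=4000):
    """Riemann sum for the integral of a(P, u) a(Q, u) over the unit circle."""
    step = 2.0 * math.pi / n
    return sum(asymmetry(p, k * step) * asymmetry(q, k * step) for k in range(n)) * step

def rotate_quarter(points):
    """Rotation through a right angle, (x, y) -> (-y, x); its square is -I."""
    return [(-y, x) for x, y in points]

K = centred([(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)])
TK = centred(rotate_quarter(K))
ip_K_TK = inner(K, TK)    # expected to vanish up to quadrature error
norm_sq_K = inner(K, K)   # strictly positive, since a triangle is not centrally symmetric
```

The number of sample points is chosen divisible by 4 so that the quarter-turn maps the sample directions onto themselves and the cancellation of the integrands is reproduced term by term.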
Let $K^*$ be the reflection of $K$ in $R$, so that $-K^*$ is the reflection of $K$ in $S$. Then the identity $K = \tfrac{1}{2}(K + K^*) + \tfrac{1}{2}(K - K^*)$ enables us to express each $[K] \in A^n$ as the sum of the elements $[\tfrac{1}{2}(K + K^*)] \in A^n(R)$ and $[\tfrac{1}{2}(K - K^*)] \in A^n(S)$ in these subspaces. In fact the mapping $[K] \to [\tfrac{1}{2}(K + K^*)]$ is the orthogonal projection of $[K]$ on to the subspace $A^n(R)$, and similarly for the subspace $S$.
REFERENCES
1. T. Bonnesen and W. Fenchel, Theorie der konvexen Körper, Berlin 1934, reprint New York, 1948.
2. G. Ewald, Von Klassen konvexer Körper erzeugte Hilberträume, Math. Annalen 162 (1965), 140-146.
3. G. Ewald and G. C. Shephard, Normed vector spaces consisting of classes of convex sets, Math. Zeitschrift 91 (1966), 1-19.
4. G. H. Hardy, J. E. Littlewood and G. Pólya, Inequalities, Cambridge 1934.
5. E. W. Hobson, The Theory of Functions of a Real Variable, and the Theory of Fourier's Series, Cambridge 1907, reprint New York 1957.
6. G. C. Shephard, Approximation problems for convex polyhedra, Mathematika 11 (1964), 9-18.
UNIVERSITY OF BIRMINGHAM, ENGLAND.
THE ERGODIC THEOREM FOR MARKOV PROCESSES(*) BY
S. R. FOGUEL ABSTRACT
Most of the material in Sections 4-5-6-8-11 has been published in [4]-[10]. We shall deal with the asymptotic behavior of the iterates of a Markov transition function. Our aim is to generalize the results about the "cyclic" convergence of the iterates of a Markov matrix. Throughout the paper functional analytic methods are used and not probabilistic arguments. The report is self-contained, modulo standard results from functional analysis, except for the decomposition into conservative and dissipative parts. Also we assume the existence of an invariant $\sigma$-finite measure on the conservative part. This has been proved, under some restrictions, by several authors using probabilistic methods.
1. Definitions and notation. Let $(X, \Sigma, \nu)$ be a measure space. By a measure we shall mean a finite positive measure unless otherwise mentioned (e.g. signed measure, $\sigma$-finite positive measures and finitely additive measures). Let $P(x,A)$ be a Markov subtransition function on it, i.e., a function on $X \times \Sigma$ which is, for each $x \in X$, a measure of total mass $\le 1$ and, for each $A \in \Sigma$, a measurable function. The subtransition function induces an operator on bounded measurable functions and on signed measures by

(1.1) $(Pf)(x) = \int f(y)\, P(x,dy)$

(1.2) $(\mu P)(A) = \int P(x,A)\, \mu(dx)$

Thus if $1_{A_0}$ denotes the characteristic function of $A_0 \in \Sigma$ and $\delta_{x_0}$ the Dirac measure at $x_0$ then

$(P 1_{A_0})(x) = P(x, A_0), \qquad (\delta_{x_0} P)(A) = P(x_0, A).$

Equation 1.2 will be occasionally used for $\sigma$-finite positive measures and for finitely additive measures too.
Received June 21, 1966.
(*) The research reported in this document has been sponsored by the Air Force Office of Scientific Research under Grant AF EOAR 66-18, through the European Office of Aerospace Research (OAR), United States Air Force.
The two operators are related by

(1.3) $\int (Pf)(x)\, \mu(dx) = \int f(x)\, (\mu P)(dx).$
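On a finite state space the integrals in (1.1), (1.2) and (1.3) reduce to finite sums, which makes the duality easy to check. The following is a minimal sketch under that assumption; the 3-state kernel `P`, the function `f` and the measure `mu` are hypothetical values chosen for illustration, and exact rational arithmetic is used so that the two sides of (1.3) can be compared exactly.

```python
from fractions import Fraction as F

# Hypothetical 3-state subtransition kernel: row x lists P(x, {y}); every row sum is <= 1.
P = [
    [F(1, 2), F(1, 4), F(0)],
    [F(0), F(1, 3), F(2, 3)],
    [F(1, 5), F(0), F(4, 5)],
]

def apply_to_function(P, f):
    """(Pf)(x) = sum over y of f(y) P(x, {y}), the discrete form of (1.1)."""
    return [sum(P[x][y] * f[y] for y in range(len(f))) for x in range(len(P))]

def apply_to_measure(mu, P):
    """(mu P)({y}) = sum over x of P(x, {y}) mu({x}), the discrete form of (1.2)."""
    return [sum(mu[x] * P[x][y] for x in range(len(mu))) for y in range(len(P[0]))]

def integrate(f, mu):
    """Integral of f with respect to mu on a finite space."""
    return sum(fx * mx for fx, mx in zip(f, mu))

f = [F(1), F(-2), F(3)]
mu = [F(1, 4), F(1, 4), F(1, 2)]

lhs = integrate(apply_to_function(P, f), mu)   # integral of (Pf) with respect to mu
rhs = integrate(f, apply_to_measure(mu, P))    # integral of f with respect to (mu P)
```

Both sides are the same double sum over pairs of states taken in a different order, which is all that (1.3) asserts in this setting.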
The measure $\nu$ is assumed to satisfy

(1.4) $\nu P \prec \nu$

($\nu P$ is absolutely continuous with respect to $\nu$). Hence if $\nu(A) = 0$ then $P(x,A) = 0$ a.e. $\nu$. Equation 1.4 can always be achieved if one replaces $\nu$ by $\sum 2^{-n} \nu P^n$. The iterates of $P$ are defined inductively by

(1.5) $P^n(x,A) = \int P^{n-k}(x,dy)\, P^k(y,A), \qquad 0 < k < n.$

k. Then

$(\mu_k P)(A) = \int P(x,A)\, \mu_k(dx) \le k \int P(x,A)\, d\nu.$

Thus if $\nu(A) = 0$ then $(\mu_k P)(A) = 0$. Since $\mu_k \to \mu$ in the norm of signed measures (total variation) it follows that

$\int P(x,A)\, \mu_k(dx) \to \int P(x,A)\, \mu(dx).$

Therefore if $\mu \prec \nu$ then $\mu P \prec \nu$, or $P$ leaves the subspace, consisting of signed measures that are weaker than $\nu$, invariant. For this section only let us denote

(2.1)
$fP = g$ iff whenever $d\mu = f\, d\nu$ then $g = \dfrac{d(\mu P)}{d\nu}$.

This can be written as:

(2.2) $fP = g$ iff $\int_A g(x)\, \nu(dx) = \int P(x,A)\, f(x)\, \nu(dx).$

Note that $P$ on $L_1(\nu)$ is the restriction of 1.2, not of 1.1. Now the operator $P$ on signed measures is a contraction operator (of norm $\le 1$) and maps positive measures to positive measures. Thus $P$ on $L_1(\nu)$ is a contraction and if $f \ge 0$ a.e. $\nu$ then $fP \ge 0$ a.e. $\nu$. On the other hand we cannot apply to $P$ the classical
Ergodic Theorem since $P1 \ne 1$ usually. Equality would mean that $\nu$ is an invariant measure. This situation has been studied by the Chacon-Ornstein Theorem and related results. We shall cite only one result that will be used later: The space $X$ is the disjoint union of its conservative part $C$ and its dissipative part $D$. These sets satisfy:

(2.3) If $\mu \prec \nu$ then $\sum \mu P^n$ is $\sigma$-finite on $D$.

(2.4) If $\mu \prec \nu$ then $(\sum \mu P^n)(A) = \infty$, unless $\sum (\mu P^n)(A) = 0$, whenever $A \subset C$ and $\nu(A) > 0$.

(2.5) $P^n 1_C = P^n(x,C) = 1_C$ a.e.

See [12, Proposition V.5.2].

3. Convergence on D. Let $D_j$ be disjoint sets whose union is $D$ such that $\sum_n (\nu P^n)(D_j) < \infty$. Such sets exist by Equation 2.3.

THEOREM 1. If $\mu \prec \nu$ then $\lim_{n \to \infty} (\mu P^n)(D_j) = 0$.

Proof. Let
d#=fdv where O < f e L l ( v ) . Now j" ~,,P"(x,D~)v(dx)=f - min(f, c)
p k p , k ( f _ rain(f, c)) _>-f - rain(f, c)
p,,Pkll---
eke* ll---
but II 1 and II 1 while inequality would mean that these operators have norm greater than one.

(e) If $f \in K$ then the characteristic function of $\{x : f(x) > c > 0\}$ belongs to $K$. Let $f^+ = \max(f, 0) \in K$. Let $g = c^{-1}\min(f^+, c) \in K$. Then $0 \le g \le 1$ and put $h_\varepsilon = \varepsilon^{-1}\min(\varepsilon g,\ f^+ - \min(f^+, c))$ for every $\varepsilon > 0$. Now $0 \le h_\varepsilon \le 1$ and $h_\varepsilon \in K$. Also if $f^+(x) > c + \varepsilon$ then $\varepsilon g(x) < f^+(x) - c$ and $h_\varepsilon(x) = g(x)$ but $f^+(x) > c$ implies that $g(x) = 1$. Thus if $f^+(x) > c + \varepsilon$ then $h_\varepsilon(x) = 1$. On the other hand if $f^+(x) \le c$ then $f^+ - \min(f^+, c) = 0$ and $h_\varepsilon(x) = 0$.
Therefore as $\varepsilon \to 0$, $h_\varepsilon$ tends to $1_{\{x : f(x) > c\}}$.

(f) Let us now prove the theorem by contradiction: If $f \in K$ and is orthogonal to $\Sigma_1$ then for every positive $c$, $\int_{\{x : f(x) > c\}} f\, d\lambda = 0$, by part e, hence $\lambda\{x : f(x) > c\} = 0$ and $f^+ = 0$ a.e. Apply this to $-f$ to get that $f = 0$ a.e.

THEOREM 5. If $A \in \Sigma_1$ and is of finite $\lambda$ measure then $P1_A$ and $P^*1_A$ ($= P^{-1}1_A$ since on $K$ $P$ is unitary) are both characteristic functions of sets in $\Sigma_1$.

Proof. Let us prove the theorem for $P1_A$ since the proof for $P^*1_A$ is identical. Put $f = P1_A$ then $0 \le f \le 1$ (see part d of the preceding proof). Let $B \in \Sigma_1$ be such that on $B$ $0 < f(x) < 1$ ($f$ is $\Sigma_1$-measurable) put $g = (1 - f)1_B$. Thus 0 0} = 0 and v{Rk(x,A) = 1} # 0 for every m and k. Another way of putting it is to say that the process does not contain a deterministic subprocess, where a subprocess is obtained by taking a subfield of $\Sigma$ and is called deterministic if the transition function and its iterates assume the values zero-one a.e.

From Theorem 4 it follows that $\Sigma_1$ is generated by sets of finite measure and since $\lambda$ is $\sigma$-finite we have $\Sigma_1 = \{W_i\}_{i=1}^{\infty}$. For each $i$ the sets whose characteristic functions are $P^i 1_{W_j}$ are atoms of $\Sigma_1$. Let us denote these atoms by $P^i W_j$.

THEOREM 7.
For every atom $W$ there exists an integer $k$ such that $P^k W = W$.

Proof. If $P^n W$ are not disjoint then $P^n W = P^m W$ for some $m < n$ but since $P$ is a unitary operator on $L_2(C, \Sigma_1, \lambda)$, $P^{n-m} W = W$. If $P^n W$ are all disjoint and $\mu$ = restriction of $\lambda$ to $W$ then $\sum (\mu P^n)(W) = \mu(W) = \lambda(W)$ which contradicts 2.4. $\{W \cup PW \cup \ldots \cup P^{k-1}W\}$ is called a cycle. The integer $k$ is called the order of $W$.
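A toy case may help fix the terminology of cycles and orders. The sketch below assumes a finite state space and a deterministic kernel, $P(x, \cdot)$ being the point mass at `sigma[x]` for a permutation `sigma`; both the permutation and the function names are hypothetical. In that setting $P 1_W$ is the characteristic function of the preimage of $W$ under `sigma`, each single state is an atom, and the order of an atom is the length of its cycle.

```python
def preimage(sigma, W):
    """States x with sigma[x] in W; P 1_W is the indicator of this set for the deterministic kernel."""
    return frozenset(x for x in sigma if sigma[x] in W)

def order(sigma, W):
    """Smallest k >= 1 with P^k W = W (terminates because sigma is a permutation)."""
    current, k = preimage(sigma, W), 1
    while current != W:
        current, k = preimage(sigma, current), k + 1
    return k

# Hypothetical permutation with one 3-cycle (0 -> 1 -> 2 -> 0) and one 2-cycle (3 <-> 4).
sigma = {0: 1, 1: 2, 2: 0, 3: 4, 4: 3}
orders = {x: order(sigma, frozenset([x])) for x in sigma}
```

This only illustrates the definitions; the theorem itself concerns general conservative processes with an invariant measure.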
Also define

(8.1) $C_1$ = union of all atoms.

(8.2) $C_2 = C - C_1 = C - \bigcup \Sigma_1.$
9. The limit theorem for measures weaker than $\nu$. Let us study $\mu P^n$ where $\mu$ is a measure on $C$ weaker than $\nu$ and hence $\mu \prec \lambda$.

This procedure can also be carried out for x~-2,'",x2 and x~+1,.--,x~, if they exist. Therefore if Y = max U ll e S
then
(6)
Y= AY+B
This equation has the solution yO =
2
But this solution is unique. For let y1 be any other solution of (6). _ y t . Then W = A W and so
Iw2]
= a231wal
Let W-- yo
Summing these inequalities we get

$\sum_{j=2}^{k} c_j |w_j| \le 0 \qquad (c_j = 1 - \delta_j > 0,\ j = 2, \ldots, k).$

Hence $w_j = 0$, $j = 2, \ldots, k$. In the case $N = 2k$ we can eliminate the variable $y_k$ by substituting the last relation of (4) into the preceding one. The new system $Y \le A'Y + B'$ can be treated as (5) and the result is that
yj --< sln ~ - j
sin
, j = 2 , . . . , k - 1.
But
Yk= < l+yk_l
< l+ =
sin
k-1)/sin
7g
-sin 2 ~--~(k - 1) + sin 2 2k
sin2 ~ 2k
=
•
2
1
sin k~-~
sin2 2-k
sin2 - 2k
which completes the proof of Theorem 1. (see Remark 1). LEMMA 6. Let the functions f ( t ) and g(t) be defined in [ 0 , c ] ( c > 0 ) and let us assume that f ( t ) = f ( c - t), g(t) = g(c - t) for every 0 then the points of nN+ ~ are defined by the relations AN+IAN+I -~l ~',+1
=
L 5N+1, i = 1 , ' " , 5 N, nN-----TrN+x.
Suppose that C is not collinear. In this case nN has the same property if N is sufficiently large: N > No. From (12), (13) and Lemma 4 7rN = 'rN(n~)
(N = No, No+ 1,'")
But evidently ,CNo -~ ,rNo+l ----- ,CNo+ 2 -~ ... de..~,~ .
Letting N approach infinity, we get C = z(K) i.e. C is an ellipse. The curve C cannot differ from K as in this case the aifme mapping z would change the length of K , which contradicts (15). If points of C are collinear, then ~r~ is also collinear. (N ---1,2, 3, .-.). Therefore from (12), and (13) ~ N ---- P N ( R N ) ( N - -
1,2, ...),
PN is a projection on the line containing ~qv. Evidently we have P1 = P 2 . . . . . PN . . . . d~ p , so C = P(K). But this contradicts (15) by the above argument. Hence C cannot be collinear. As we have
Sr = -£1 f ~ 3r(u)du, our theorem is proved.

REMARK. 1) Let us assume that $Y \le AY + B$ with the notation of Lemma 5. It is easy to show that the iteration $Y_0 = Y$, $Y_k = AY_{k-1} + B$, $k = 1, 2, \ldots$ is increasing and converges. This fact makes the proof of this lemma more elementary.

2) Let $d$ be the transfinite diameter of $C$. Then we have
d=exp(n~x~2
fcfcPePQl°greQdsedsQ )
where O = fc pedse I-5]. To examine the minimum of the integral
, fcf
1
logreQdsedSQ
for convex curves is perhaps easier than to examine the minimum of $d$, and in this way we could get a good lower bound for $d/L$.

3) The classical isoperimetric inequality follows from Theorem 1 using the formula of area

16a2 = f c f c r2~cos~eedsedse

due to L. Rédei and B. Sz. Nagy [4].

4) The inequality of Wirtinger can be proved completely by a slight modification of our proof of Theorem 1.
LITERATURE
1. W. Blaschke, Eine isoperimetrische Eigenschaft des Kreises, Math. Zeitschrift 1 (1918), 52-58.
2. T. Carleman, Über eine isoperimetrische Aufgabe und ihre physikalischen Anwendungen, Math. Zeitschrift 3 (1919).
3. K. Fan, O. Taussky and J. Todd, Discrete analogues of inequalities of Wirtinger, Monatsh. Math. Physik 59 (1955), 73-90.
4. L. Rédei and B. Sz. Nagy, Eine Verallgemeinerung der Heronischen Formel, Publ. Math. Debrecen (1950).
5. G. Pólya and G. Szegő, Isoperimetric inequalities in Mathematical Physics, Princeton University Press, (1959).
JOZSEF ATTILA UNIVERSITY OF SZEGED, HUNGARY
MATRICES WITH ZERO TRACE BY
R. C. THOMPSON* ABSTRACT
Let $M_n(F)$ denote the algebra of n-square matrices with elements in a field $F$. In this paper we show that if $M \in M_n(F)$ has zero trace then $M = AB - BA$ for certain $A, B \in M_n(F)$, with $A$ nilpotent and trace $B = 0$, apart from some exceptional cases when $n = 2$ or 3. We also determine when $M = MB - BM$ for some $B \in M_n(F)$.

Let $F$ be a field of characteristic $p$, $p$ zero or prime. Let $M_n(F)$ denote the algebra of n-square matrices with elements in $F$. Let $(A,B) = AB - BA$ denote the commutator of matrices $A, B \in M_n(F)$. It is well known that trace $(A,B) = 0$. In 1936, Shoda [2] proved for $p = 0$ that if $M \in M_n(F)$ has zero trace then $M = (A,B)$ within $M_n(F)$. In 1957, Albert and Muckenhaupt [1] removed the restriction on $p$. It is of interest to ask whether in $M = (A,B)$ it is possible to choose $A, B \in M_n(F)$ so that trace $A$ = trace $B$ = 0. If $n \not\equiv 0 \pmod p$ it is trivial to see that this is always possible. For let $\alpha = n^{-1}$ trace $A$, $\beta = n^{-1}$ trace $B$. Then $M = (A,B) = (A - \alpha I_n, B - \beta I_n)$ where $I_n$ is the n-square identity matrix. Here $A - \alpha I_n$ and $B - \beta I_n$ each have zero trace. However, this argument fails if $n \equiv 0 \pmod p$. It is still easy to see that we can always choose $A$ to have trace zero. For if trace $B = 0$ then $M = (-B, A)$ and $-B$ has zero trace. If trace $B \ne 0$, let $\alpha = -(\text{trace } A)(\text{trace } B)^{-1}$. Then $M = (A + \alpha B, B)$ and here trace $(A + \alpha B) = 0$. No simple argument of this kind can show that it is always possible to choose both $A$ and $B$ to have zero trace, since we shall exhibit below an example where this is impossible. We are now ready to state our main result.

THEOREM 1. If $p \ne 3$ let $n > 2$ and if $p = 3$ let $n > 3$. Let $M \in M_n(F)$ have zero trace. Then $A, B \in M_n(F)$ exist such that $M = (A,B)$, $A$ is nilpotent, and $B$ has zero trace.

In Theorems 2, 3, 4, we supply a discussion of the cases $n = 2$ and $n = p = 3$. In Theorem 5 we obtain some consequences of Theorem 1. In Theorem 6 we determine when $M = (M,B)$ within $M_n(F)$.
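The trace-shifting argument for $n \not\equiv 0 \pmod p$ can be checked on a small example. The sketch below assumes the rational field (characteristic zero, so the condition on $n$ holds automatically); the matrices `A` and `B` are arbitrary illustrative values and the helper names are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(A, B):
    """(A, B) = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def shift(A, c):
    """Return A - c*I."""
    n = len(A)
    return [[A[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]

# Arbitrary 3-square matrices over the rationals, chosen only for illustration.
A = [[1, 2, 0], [0, 3, 1], [4, 0, 5]]
B = [[2, 0, 1], [1, 1, 0], [0, 3, -1]]
n = len(A)

M = commutator(A, B)
alpha = Fraction(trace(A), n)   # alpha = n^{-1} trace A
beta = Fraction(trace(B), n)    # beta  = n^{-1} trace B
A0, B0 = shift(A, alpha), shift(B, beta)
```

Here `commutator(A0, B0)` reproduces `M` because scalar matrices commute with everything, while `A0` and `B0` both have zero trace. Dividing by $n$ is exactly the step that is unavailable when $n \equiv 0 \pmod p$.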
We first require a Lemma that extends somewhat a Lemma proved in [1].

Received December 22, 1965.
* The preparation of this paper was supported in part by the U.S. Air Force under contract AFOSR 698-65.

LEMMA. Let $M = (m_{ij}) \in M_n(F)$, where $n > 2$. Suppose
(1)
~E m ~ j + ~ = 0 ,
0 1, the element in position $(\alpha, \beta)$ of $KB$ is $b_{\alpha-1,\beta}$. In $-BK$ the last column is a zero column, and for $\beta < n$, the element in position $(\alpha, \beta)$ is $-b_{\alpha,\beta+1}$. Thus in column $\beta$ of $KB - BK$, for $\beta < n$, we see new unknowns $b_{1,\beta+1}, b_{2,\beta+1}, \ldots, b_{n,\beta+1}$ that do not appear in any column of $KB - BK$ to the left of column $\beta$. We may therefore choose $B$ such that $M - (KB - BK)$ has all columns zero, except perhaps for column $n$.

We now introduce some additional terminology. In an n-square matrix let diagonal $\alpha$ denote the diagonal of positions $(i, i + \alpha - 1)$, $1 \le i \le n - \alpha + 1$; $1 \le \alpha \le n$. In $KB - BK$ diagonal $n$ has a single element, zero, and this is also true of $M$. For $1 \le \alpha \le n - 1$, the sum down diagonal $\alpha$ in $KB - BK$ is

$\sum_{i=2}^{n-\alpha+1} b_{i-1,i+\alpha-1} - \sum_{i=1}^{n-\alpha} b_{i,i+\alpha} = 0.$

Hence the sum down diagonal $\alpha$ in $M - (K,B)$ is zero, $1 \le \alpha \le n$. Since $M - (K,B)$ can have nonzero elements only in column $n$, we must have $M - (K,B) = 0$. The elements $b_{11} = 0, b_{22}, \ldots, b_{nn}$ in $B$ satisfy the equations:

$m_{21} = -b_{22}, \qquad m_{i,i-1} = b_{i-1,i-1} - b_{ii}, \quad 3 \le i \le n.$

Hence

(3) $b_{ii} = -(m_{21} + m_{32} + \cdots + m_{i,i-1}),$

for $2 \le i \le n$. From (3) it is easy to get (2).

We now give the proof of Theorem 1. First observe that, given $M = (m_{ij}) \in M_n(F)$ with trace $M = 0$, it suffices to prove Theorem 1 for some similarity transform $SMS^{-1}$ by a nonsingular element $S$ of $M_n(F)$. Next observe that if $D = \mathrm{diag}(d_1, d_2, \ldots, d_n) \in M_n(F)$ and is nonsingular, then the second diagonal of $D^{-1}MD$ is $d_1^{-1}m_{12}d_2,\ d_2^{-1}m_{23}d_3,\ \ldots,\ d_{n-1}^{-1}m_{n-1,n}d_n$. From this it follows that for appropriate nonzero $d_1, d_2, \ldots, d_n \in F$, we can in $D^{-1}MD$ replace the nonzero elements on the second diagonal of $M$ with any given nonzero values from $F$. Moreover, the positions in $M$ which are zero still are zero in $D^{-1}MD$. We let $C(p(\lambda))$ denote the companion matrix of polynomial $p(\lambda)$. We take
our companion matrices so that (when degree $p(\lambda) > 1$) the stripe of ones is on the second diagonal of $C(p(\lambda))$. Now let

(4) $M = C(p_1(\lambda)) \dotplus C(p_2(\lambda)) \dotplus \cdots \dotplus C(p_r(\lambda)) \in M_n(F),$

where $\dotplus$ denotes direct sum. We arrange matters so that

(5) $p_{i+1}(\lambda)$ divides $p_i(\lambda)$, $1 \le i < r$,

(possibly $r = 1$). Let $d$ be the number of ones on the second diagonal of $M$. We now suppose $F \ne GF(2)$, the two element field. A separate proof will be given later when $F = GF(2)$.

We break our discussion into cases. First let degree $p_1(\lambda) \ge 4$ or degree $p_1(\lambda) = 3$, $r > 1$, degree $p_2(\lambda) > 1$. Then $d \ge 3$. Select $y \in F$ such that $y \ne 0$, $y \ne -(d - 3)$. This is possible if $F$ has at least three elements. Set $x = -y - (d - 3)$. Then $x \ne 0$. Find a diagonal matrix $D \in M_n(F)$ so that the nonzero elements on the second diagonal of $D^{-1}MD$ are $1, x, y$ together with $d - 3$ ones. Let

$D^{-1}MD = \begin{pmatrix} 0 & u \\ v & M_1 \end{pmatrix}$

where $M_1 \in M_{n-1}(F)$; $u = (1, 0, 0, \ldots, 0)$ is a row $(n-1)$-tuple; $v$ is a column $(n-1)$-tuple for which the transpose, $v^T$, has the form $v^T = (0, v_3, v_4, \ldots, v_n)$. Owing to the choice of $x$ and $y$, the sum down the second diagonal of $M_1$ is zero. Hence, by the Lemma, $M_1 = (K_1, B_1)$ for a certain $(n-1)$-square $K_1$ given by the lemma and for some $B_1 \in M_{n-1}(F)$. Set

(6) $A = \begin{pmatrix} 0 & 0 \\ 0 & K_1 \end{pmatrix}, \qquad B = \begin{pmatrix} -\mathrm{tr}\, B_1 & u_1 \\ v_1 & B_1 \end{pmatrix}.$

Here $u_1 = (0, -1, 0, 0, \ldots, 0)$, $v_1^T = (v_3, v_4, \ldots, v_n, 0)$. Then $-u_1 K_1 = u$, $K_1 v_1 = v$, and hence $D^{-1}MD = (A,B)$. Moreover $A$ is nilpotent and trace $B = 0$.

We now have to examine the following cases: (i) degree $p_1(\lambda) = 3$, degree $p_2(\lambda) = \cdots =$ degree $p_r(\lambda) = 1$ (perhaps $r = 1$); (ii) degree $p_1(\lambda) = 2$; (iii) degree $p_1(\lambda) = 1$.

Case (i). If $n \not\equiv 0 \pmod p$, set $x = 0$. If $n \equiv 0 \pmod p$ but $p \ne 3$, let $x$ be the solution in $F$ of $3x = 2a_2$, where $p_1(\lambda) = \lambda^3 - a_3\lambda^2 - a_2\lambda - a_1$. Defer for a moment the possibility $p = 3$, $n \equiv 0 \pmod 3$. Let
-1 A=
0 -x
0
0
1
0
0
1
Jr I , - 3 .
Then the sum down the second diagonal of $AMA^{-1}$ is zero. We can apply the Lemma to $AMA^{-1}$ to get $AMA^{-1} = (K, B)$. If $n \not\equiv 0 \pmod p$ we also have $AMA^{-1} = (K, B - \beta I_n)$. If we put $\beta = n^{-1}\,\mathrm{trace}\,B$ then we have $K$ nilpotent and trace $(B - \beta I_n) = 0$. If $n \equiv 0 \pmod p$ then the formula (2) together with the choice of $x$ shows trace $B = 0$. This finishes case (i), except when $p = 3$ and $n \equiv 0 \pmod 3$. When $p = 3$ and $n \equiv 0 \pmod 3$, the conditions in the theorem show that $n > 3$. Moreover (5) and degree $p_2(\lambda) = 1$ show that $M = C(p_1(\lambda)) \dotplus \gamma I_{n-3}$ for some $\gamma \in F$. But then $M$ is similar to $M_1 = \gamma I_1 \dotplus C(p_1(\lambda)) \dotplus \gamma I_{n-4}$. Let $D = (1) \dotplus (-1) \dotplus I_{n-2}$. Then the sum down the second diagonal of $D^{-1}M_1D$ is zero. If we apply the Lemma to $D^{-1}M_1D$ we get $D^{-1}M_1D = (K, B)$. The formula (2) for trace $B$ (use $n = 0$ in $F$) shows that trace $B = 0$. This completes case (i). To handle the case in which degree $p_1(\lambda) = 2$, we let
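\[ T_m = \begin{bmatrix} \alpha_1 & \beta_1 \\ \gamma_1 & \delta_1 \end{bmatrix} \dotplus \begin{bmatrix} \alpha_2 & \beta_2 \\ \gamma_2 & \delta_2 \end{bmatrix} \dotplus \cdots \dotplus \begin{bmatrix} \alpha_m & \beta_m \\ \gamma_m & \delta_m \end{bmatrix}. \]
$T_m$ is $2m$-square. We permute the rows and columns of $T_m$ in the same way; this is a similarity transformation $P^{-1}T_mP$ of $T_m$ by a permutation matrix $P$. We take the rows (and columns) of $T_m$ in the order $1, 3, 5, \dots, 2m-1, 2, 4, 6, \dots, 2m$. The result of this similarity is (in partitioned form)
\[ T' = P^{-1}T_mP = \begin{bmatrix} \mathrm{diag}(\alpha_1, \alpha_2, \dots, \alpha_m) & \mathrm{diag}(\beta_1, \beta_2, \dots, \beta_m) \\ \mathrm{diag}(\gamma_1, \gamma_2, \dots, \gamma_m) & \mathrm{diag}(\delta_1, \delta_2, \dots, \delta_m) \end{bmatrix}. \]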
We now consider case (ii). If degree $p_1(\lambda) =$ degree $p_2(\lambda) = 2$, we may find a diagonal $D$ such that the second diagonal of $D^{-1}MD$ sums to zero. But $D^{-1}MD = T_m$ or $D^{-1}MD = T_m \dotplus (\gamma)$ according as $n$ is even or odd. So we find a nonsingular $Q \in M_n(F)$ such that $Q^{-1}MQ = T'$ or $Q^{-1}MQ = T' \dotplus (\gamma)$, as $n$ is even or odd. By the Lemma, $Q^{-1}MQ = (K, B)$. Here (2) and $m > 1$ show that trace $B = 0$. If degree $p_2(\lambda) = 1$ then $p_2(\lambda) = \cdots = p_r(\lambda) = \lambda - \gamma$, so $p_1(\lambda) = (\lambda - \gamma)(\lambda - \delta)$, for certain $\gamma, \delta \in F$. But then $M$ is similar to $M_1 = (\delta I_1 \dotplus \gamma I_{n-1}) + E_{n1}$, where $E_{n1}$ is $n$-square with all entries zero except for a single one at the $(n, 1)$ position. Since $n > 2$, the Lemma shows $M_1 = (K, B)$ where, by (2), trace $B = 0$. This completes case (ii). In case (iii), $M$ is diagonal and by the Lemma $M = (K, B)$ with trace $B = 0$. This completes the proof of Theorem 1 when $F \ne GF(2)$.
Now assume $F = GF(2)$. Let $M$ be given by (4) and (5). First suppose degree $p_1(\lambda) \ge 3$. Let $M = (m_{ij})$ and consider first the case in which the number of ones on the second diagonal of $M$ is even. Let
\[ \delta = \sum_{i=1}^{n-1} m_{i+1,i}\,(n-i). \]
Let $s =$ degree $p_1(\lambda)$, so that $C(p_1(\lambda))$ is $s$-square. Let $E_{s,s-2}$ be $s$-square with all entries zero except for a single one at position $(s, s-2)$. Let $A = I_s + \delta E_{s,s-2}$. Then $M' = AC(p_1(\lambda))A^{-1} \dotplus C(p_2(\lambda)) \dotplus \cdots \dotplus C(p_r(\lambda))$ still has an even number of ones on the second diagonal. By the Lemma $M' = (K, B)$ and by (2), trace $B = \delta + (n-s+1)\delta + (n-s+2)\delta = 2(n-s+2)\delta = 0$. Now let the number of ones on the second diagonal of $M$ be odd. Let
\[ M = \begin{bmatrix} 0 & u \\ v & M_1 \end{bmatrix}, \tag{7} \]
where $u = (1, 0, 0, \dots, 0)$, $v^T = (0, v_3, \dots, v_n)$, and $M_1$ has an even number of ones on the second diagonal. Then, by the Lemma, $M_1 = (K_1, B_1)$. Define $A, B$ by (6). Then $M = (A, B)$, $A$ is nilpotent, trace $B = 0$.
We may now assume that degree $p_1(\lambda)$ is two or one. If degree $p_1(\lambda)$ is one, then $M$ is diagonal and the Lemma applies to $M$ to give the result. So let degree $p_1(\lambda)$ be two. Then $p_1(\lambda)$ is one of $\lambda^2$, $\lambda^2 + \lambda$, $\lambda^2 + 1$, $\lambda^2 + \lambda + 1$. If $p_1(\lambda) = \lambda^2$, then if there are an even number of ones on the second diagonal of $M$ the Lemma immediately gives the result. If there are an odd number of ones on the second diagonal then $M$ is given by (7) with $v = 0$. Then, by the Lemma, $M_1 = (K_1, B_1)$. ($M_1$ has at least two rows since $M$ has at least three rows.) Let $A, B$ be given by (6), with $v_1 = 0$. Then $M = (A, B)$ with $A$ nilpotent and trace $B = 0$. If $p_1(\lambda) = \lambda^2 + \lambda$ then (because of (5)) $M$ is diagonable and the result is at hand. If $p_1(\lambda) = \lambda^2 + 1$ then $M$ is similar to
\[ M_1 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \dotplus \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \dotplus \cdots \dotplus \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \dotplus I_{n-2s}, \]
where there are $s$ copies of $\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$. If $s > 1$, then $M_1$ has the form $M_1 = T_m$ or $M_1 = T_m \dotplus (1)$, according as $n$ is even or odd, with $\beta_1 = \cdots = \beta_m = 0$. But then there exists $Q$ such that $Q^{-1}M_1Q = T'$ or $Q^{-1}M_1Q = T' \dotplus (1)$. By the Lemma, (2) and $m > 1$, $Q^{-1}M_1Q = (K, B)$ with trace $B = 0$. If $s = 1$, $M$ is similar to $I_n + E_{n1}$, and by the Lemma, $I_n + E_{n1} = (K, B)$, with trace $B = 0$.
We now have to consider the case $p_1(\lambda) = \lambda^2 + \lambda + 1$. Then, as $p_1(\lambda)$ is irreducible, $p_1(\lambda) = p_2(\lambda) = \cdots = p_r(\lambda)$ and trace $M = r$. Thus $r$ is even. But then the sum down the second diagonal of $M$ is zero. Moreover $M = T_r$. So $M$ is similar to $T'$ and by the Lemma $T' = (K, B)$ with trace $B = 0$. This completes the proof of Theorem 1.
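The bordering step (6), used twice in the proof above, is a purely formal block identity and can be checked numerically. The following is a minimal sketch over the integers, assuming that $K_1$ is the matrix with ones just below the main diagonal (the Lemma's exact $K_1$ is not restated in this part of the paper) and using arbitrary illustrative values for $B_1$, $v_3$, $v_4$; the helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def commutator(X, Y):
    """(X, Y) = XY - YX for square matrices of equal size."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

def border(corner, row, col, block):
    """Assemble the partitioned matrix [[corner, row], [col, block]]."""
    return [[corner] + list(row)] + [[c] + list(r) for c, r in zip(col, block)]

# Assumption: K1 has ones just below the main diagonal (n - 1 = 3 here).
K1 = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
B1 = [[2, -1, 4], [3, 5, -2], [1, 7, 6]]   # arbitrary illustrative choice
M1 = commutator(K1, B1)                     # so M1 = (K1, B1) by construction

v3, v4 = 5, -3                              # arbitrary illustrative entries
u, v = [1, 0, 0], [0, v3, v4]               # border of M as in the text
u1, v1 = [0, -1, 0], [v3, v4, 0]            # border of B as in (6)

M = border(0, u, v, M1)
A = border(0, [0, 0, 0], [0, 0, 0], K1)
B = border(-sum(B1[i][i] for i in range(3)), u1, v1, B1)

commutator_matches = commutator(A, B) == M
trace_B = sum(B[i][i] for i in range(4))
A2 = matmul(A, A)
A_is_nilpotent = all(x == 0 for row in matmul(A2, A2) for x in row)
```

Under these assumptions `commutator_matches` and `A_is_nilpotent` hold and `trace_B` vanishes, which is the content of the sentence "Then $-u_1K_1 = u$, $K_1v_1 = v$, and hence $D^{-1}MD = (A, B)$".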
THEOREM 2. Let $p = 3$. Let $M \in M_3(F)$, with trace $M = 0$. Then: (i) $M = (A, B)$ within $M_3(F)$ with $A$ nilpotent and trace $B = 0$ if and only if the characteristic polynomial $p(\lambda)$ of $M$ has the form
\[ p(\lambda) = \lambda^3 - \alpha^2\lambda - \beta, \qquad \alpha, \beta \in F; \tag{8} \]
(ii) $M = (A, B)$ within $M_3(F)$ with $A$ nilpotent; (iii) $M = (A, B)$ within $M_3(F)$ with trace $A =$ trace $B = 0$.
Proof. Suppose $M = (A, B)$ within $M_3(F)$ with $A$ nilpotent and trace $B = 0$. After a similarity transformation of $M = (A, B)$ by a nonsingular element of $M_3(F)$, we may assume $A$ is one of the following three matrices:
\[ A = 0; \qquad A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}; \qquad A = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}. \tag{9} \]
If $A = 0$ then $M = 0$ and the characteristic polynomial of $M$ has the form (8). From $M = (A, B)$ we get $M = (A, B - \beta I_3)$ and trace $(B - \beta I_3) =$ trace $B$ for any $\beta \in F$. So in $M = (A, B)$ we may assume that the $(3, 3)$ element of $B$ is zero. Hence let
\[ B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & -b_{11} & b_{23} \\ b_{31} & b_{32} & 0 \end{bmatrix}. \tag{10} \]
If we compute the characteristic polynomial of $(A, B)$ where $A$ is the second matrix (9) and $B$ is given by (10), we get that the coefficient of $\lambda$ is $-b_{13}^2$. If we compute the characteristic polynomial of $(A, B)$ where $A$ is the third matrix (9) and $B$ is given by (10), we get (using $2 = -1$ in $F$) that the coefficient of $\lambda$ is $-(b_{12} + b_{23})^2$. Hence the characteristic polynomial of $(A, B)$ has the form (8). Suppose now the characteristic polynomial $p(\lambda)$ of $M$ is given by (8). If $M$ is nonderogatory then $M$ is similar to $C(p(\lambda))$. But $C(p(\lambda)) = (U, V)$ where
0
0 0 0 U =
-10
0
x
1 0
,V=
6
-x x2
-1 x
-t~x - x a - x z
Here $U$ is nilpotent and $V$ has trace zero. Suppose $M$ is derogatory. Then $p(\lambda)$ must have a repeated root. Let $\gamma, \gamma, \alpha$ be the roots of $p(\lambda)$. Then $\gamma + \gamma + \alpha = 0$ and $\gamma + \gamma + \gamma = 0$ (since $F$ has characteristic 3). Thus $\alpha = \gamma$. Hence $p(\lambda) = (\lambda - \gamma)^3$. As $M$ is derogatory the minimal polynomial of $M$ must be $\lambda - \gamma$ or $(\lambda - \gamma)^2$ and, of course, the minimal polynomial has coefficients in $F$. Thus $\gamma \in F$ and $M$ is similar within $M_3(F)$ to
\[ \begin{bmatrix} \gamma & 0 & 0 \\ 0 & \gamma & 0 \\ \varepsilon & 0 & \gamma \end{bmatrix}, \tag{11} \]
where $\varepsilon$ is 0 or 1. But by the Lemma, for $M$ given by (11), $M = (K, B)$ where, using (2), trace $B = 0$. This proves (i).
To prove (ii), first let $M$ be nonderogatory, similar to $C(g(\lambda))$ for some polynomial $g(\lambda)$. Choose diagonal $D$ such that the second diagonal of $D^{-1}C(g(\lambda))D$ sums to zero. Then by the Lemma, $D^{-1}C(g(\lambda))D = (K, B)$ where $K$ is nilpotent. If $M$ is derogatory then the argument given above shows $M$ is similar within $M_3(F)$ to the matrix (11). Hence always $M = (A, B)$ where $A$ is nilpotent. And in fact we have proved that if $M$ is derogatory then $M = (A, B)$ with $A$ nilpotent and trace $B = 0$, within $M_3(F)$.
To prove (iii) therefore we may assume $M = C(g(\lambda))$. Let $g(\lambda) = \lambda^3 - \alpha\lambda - \beta$. Let now
\[ U = \mathrm{diag}(0, 1, -1), \qquad V = \begin{bmatrix} 0 & -1 & 0 \\ 0 & 0 & -1 \\ -\beta & \alpha & 0 \end{bmatrix}. \]
Then $M = (U, V)$ and trace $U =$ trace $V = 0$. (Use $2 = -1$ in $F$.) This completes the proof of Theorem 2.
THEOREM 3. Let $p \ne 2$ and let $M \in M_2(F)$ with trace $M = 0$. (i) If $M = (A, B)$ within $M_2(F)$ with $A$ nilpotent then the eigenvalues of $M$ are in $F$. If the eigenvalues of $M$ are in $F$ then $M = (A, B)$ within $M_2(F)$ with $A$ nilpotent and trace $B = 0$. (ii) $M = (A, B)$ within $M_2(F)$ with trace $A =$ trace $B = 0$ can always be achieved.
THEOREM 4. Let $p = 2$ and let $M \in M_2(F)$ with trace $M = 0$. (i) $M = (A, B)$ within $M_2(F)$ with $A$ nilpotent if and only if the eigenvalues of $M$ are in $F$. (ii) If $M = (A, B)$ within $M_2(F)$ with trace $A =$ trace $B = 0$ then $M$ is scalar. If $M$ is scalar then $M = (A, B)$ within $M_2(F)$ with both $A, B$ nilpotent. (iii) $M = (A, B)$ within $M_2(F)$ with trace $A = 0$ can always be achieved.
Proofs. Let $M = (A, B)$ with $A$ nilpotent. Either $A = 0$ (and then $M = 0$) or, after a similarity transformation by a nonsingular element of $M_2(F)$, we may assume
\[ A = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}. \]
Let $B = (b_{ij})_{1 \le i, j \le 2}$. Then
\[ (A, B) = \begin{bmatrix} -b_{12} & 0 \\ b_{11} - b_{22} & b_{12} \end{bmatrix}. \]
Hence the eigenvalues of $(A, B)$ are in $F$. Conversely if the eigenvalues of $M = (m_{ij})$ are in $F$, after a similarity transformation we may assume $m_{12} = 0$. Then $M = (A, B)$ where $A = E_{21}$ and
\[ B = \begin{bmatrix} m_{21} & -m_{11} \\ 0 & 0 \end{bmatrix}. \]
If $p \ne 2$, we also have $M = (A, B - 2^{-1}m_{21}I_2)$ and trace $(B - 2^{-1}m_{21}I_2) = 0$. This proves part (i) of each theorem.
Let $p = 2$ and let $M = (A, B)$ with trace $A =$ trace $B = 0$. Then also $M = (A - \alpha I, B - \beta I)$ with $\alpha$ equal to the $(2, 2)$ element of $A$, and $\beta$ equal to the $(2, 2)$ element of $B$, and trace $(A - \alpha I) =$ trace $(B - \beta I) = 0$. So in $M = (A, B)$ we may assume the main diagonals of $A$ and $B$ are zero. Then
\[ M = \left( \begin{bmatrix} 0 & a_{12} \\ a_{21} & 0 \end{bmatrix}, \begin{bmatrix} 0 & b_{12} \\ b_{21} & 0 \end{bmatrix} \right) \]
is scalar. On the other hand if $M = mI_2$, then
\[ M = \left( \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & m \\ 0 & 0 \end{bmatrix} \right) \]
is the commutator of two nilpotent matrices. This proves Theorem 4(ii). To prove Theorem 4(iii) we may assume $M$ is not scalar. In Theorem 3(ii) a nonzero $M$ with trace zero cannot be scalar. So to complete these proofs let $M = C(\lambda^2 - a)$. Then $M = (A, B)$ where $A =$
\[ \begin{bmatrix} 0 & -1 \\ a & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}. \]
If $p \ne 2$ then also $M = (A, B - 2^{-1}I_2)$ and trace $(B - 2^{-1}I_2) = 0$. This finishes the proofs of Theorems 3 and 4.
THEOREM 5. Let $M \in M_n(F)$, $n > 2$, with trace $M = 0$. Then $M$ is an arbitrary word in commutators within $M_n(F)$.
Thus, for example, $M = ((A_1, A_2), ((A_3, A_4), A_5))$ within $M_n(F)$ if and only if trace $M = 0$.
We now require additional terminology. Let $L$ be the algebraic closure of the field $F$. The invariant factors of $M \in M_n(F)$ are by definition the nonconstant polynomials on the main diagonal of the Smith canonical form of the polynomial matrix $\lambda I_n - M$. Over $F$, each invariant factor of $M$ can be split into a product of powers of irreducible polynomials over $F$. We call these powers of irreducible polynomials over $F$ the elementary divisors of $M$ over $F$. Over $L$, each elementary divisor has the form $(\lambda - \lambda_0)^m$.
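For the smallest fields the $2 \times 2$ statements of Theorems 3 and 4 can be confirmed by exhaustive search. The sketch below is only an illustration, assuming $F = GF(3)$ for Theorem 3(ii) and $F = GF(2)$ for Theorem 4(ii) and 4(iii); all function names are hypothetical, and matrices are stored as tuples of rows with entries reduced modulo $p$.

```python
from itertools import product

def matrices(p):
    """All 2x2 matrices over GF(p), as tuples of rows."""
    return [((a, b), (c, d)) for a, b, c, d in product(range(p), repeat=4)]

def trace(M, p):
    return (M[0][0] + M[1][1]) % p

def commutator(A, B, p):
    """(A, B) = AB - BA, reduced modulo p."""
    def mul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]
    AB, BA = mul(A, B), mul(B, A)
    return tuple(tuple((AB[i][j] - BA[i][j]) % p for j in range(2))
                 for i in range(2))

def commutators(p, trace_zero_A, trace_zero_B):
    """All (A, B) over GF(p), optionally restricted to trace-zero A and/or B."""
    As = [A for A in matrices(p) if not trace_zero_A or trace(A, p) == 0]
    Bs = [B for B in matrices(p) if not trace_zero_B or trace(B, p) == 0]
    return {commutator(A, B, p) for A in As for B in Bs}

def trace_zero_matrices(p):
    return {M for M in matrices(p) if trace(M, p) == 0}

def is_scalar(M):
    return M[0][1] == 0 and M[1][0] == 0 and M[0][0] == M[1][1]

# Theorem 3(ii) over GF(3): every trace-zero M is (A, B) with trace A = trace B = 0.
thm3_ii = commutators(3, True, True) == trace_zero_matrices(3)
# Theorem 4(ii) over GF(2): such commutators are always scalar.
thm4_ii = all(is_scalar(M) for M in commutators(2, True, True))
# Theorem 4(iii) over GF(2): trace A = 0 alone can always be achieved.
thm4_iii = commutators(2, True, False) == trace_zero_matrices(2)
```

The search space is tiny (at most 27 by 27 pairs), so the check runs instantly; it says nothing about larger fields, where the proofs above are needed.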
THEOREM 6. Let $M \in M_n(F)$. Then $B \in M_n(F)$ exists such that $M = (M, B)$ if and only if each elementary divisor $(\lambda - \lambda_0)^m$ of $M$ over $L$ has $m \equiv 0 \pmod p$ whenever $\lambda_0 \ne 0$. If this condition is satisfied then it is always possible to choose $B$ such that trace $B = 0$, except in one situation: if $p = 2$ and if each elementary divisor of $M$ over $L$ has even degree, then for all choices of $B$ we have trace $B = n/2$.
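A small numerical instance may help make the identity $M = (M, B)$ of Theorem 6 concrete. The sketch below assumes $p = 3$, $m = 6$, the polynomial $g(\lambda) = \lambda^6 - \lambda^3 - 2$ (a polynomial in $\lambda^3$, chosen only for illustration), and a companion matrix with ones on the superdiagonal and the coefficients in the last row; this layout is an assumption inferred from the text, not a convention stated in it. The choice $B_1 = \mathrm{diag}(1, 2, \dots, m)$ is the one used in the proof of the theorem.

```python
p, m = 3, 6

# Coefficients a_0, ..., a_{m-1} of g(x) = x^m - a_{m-1} x^{m-1} - ... - a_0.
# Assumption for illustration: a_j = 0 unless j is a multiple of p.
a = [2, 0, 0, 1, 0, 0]

# Assumed companion layout: ones on the superdiagonal, coefficients in the last row.
C = [[0] * m for _ in range(m)]
for i in range(m - 1):
    C[i][i + 1] = 1
C[m - 1] = list(a)

B1 = [[(i + 1) if i == j else 0 for j in range(m)] for i in range(m)]

def matmul_mod(X, Y):
    """Matrix product with entries reduced modulo p."""
    return [[sum(X[i][k] * Y[k][j] for k in range(m)) % p for j in range(m)]
            for i in range(m)]

CB, BC = matmul_mod(C, B1), matmul_mod(B1, C)
comm = [[(CB[i][j] - BC[i][j]) % p for j in range(m)] for i in range(m)]

identity_holds = comm == C                         # C = (C, B1) over GF(3)
trace_B1 = sum(B1[i][i] for i in range(m)) % p     # m(m+1)/2 reduced mod p
```

With these choices `identity_holds` is true and `trace_B1` is zero, matching the claim that for odd $p$ and $m \equiv 0 \pmod p$ the matrix $B_1$ already has trace zero.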
An equivalent form of the condition of Theorem 6 is that each elementary divisor of $M$ over $F$ not of the form $\lambda^m$ be a polynomial over $F$ in $\lambda^p$.
Proof. Suppose $M = (M, B)$. After a similarity transformation by a nonsingular element of $M_n(L)$, we may suppose $M = M_1 \dotplus \cdots \dotplus M_r$, where $M_i$ is $m_i$-square, of the form $M_i = (\lambda_i)$ if $m_i = 1$, or
\[ M_i = \lambda_i I_{m_i} + C(\lambda^{m_i}) \]
if $m_i > 1$. (Jordan canonical form.) Here the $\lambda_i$ are not necessarily different. Partition $B = (B_{ij})_{1 \le i, j \le r}$, where $B_{ii}$ is $m_i$-square. Then $M = (M, B)$ implies $M_i = (M_i, B_{ii})$, $1 \le i \le r$. Hence trace $M_i = 0$. This implies that $m_i \equiv 0 \pmod p$ whenever $\lambda_i \ne 0$. Hence the condition of Theorem 6 is satisfied. Suppose now that $p = 2$ and that each $m_i$ is even. Fix $i$, and let $B_{ii} = (b_{\alpha\beta})$. Then $M_i = (M_i, B_{ii})$ yields $b_{\alpha+1,\alpha+1} - b_{\alpha\alpha} = 1$, for $\alpha = 1, \dots, m_i - 1$. Hence $b_{\alpha+1,\alpha+1} = \alpha + b_{11}$, and hence trace $B_{ii} = m_i(m_i - 1)/2 + m_ib_{11} = m_i/2$ because $m_i = 0$ in $L$. Therefore
\[ \mathrm{trace}\,B = (m_1 + \cdots + m_r)/2 = n/2. \]
To complete the proof of Theorem 6, we suppose $M \in M_n(F)$ satisfies the condition of Theorem 6. We have to find $B \in M_n(F)$ such that $M = (M, B)$, with trace $B = 0$, apart from the exceptional case. Let $\varphi(\lambda)^e$ be an elementary divisor of $M$ over $F$, with $\varphi(\lambda) \ne \lambda$. Let $\lambda_0$ be a root of $\varphi(\lambda)$ of multiplicity $\nu$, where $\lambda_0 \in L$. Then $(\lambda - \lambda_0)^{\nu e}$ is an elementary divisor of $M$ over $L$. Hence either $\nu \equiv 0 \pmod p$ or $e \equiv 0 \pmod p$. In either event $\varphi(\lambda)^e$ must be a polynomial in $\lambda^p$. Now let $g(\lambda) = -a_0 - a_1\lambda - \cdots - a_{m-1}\lambda^{m-1} + \lambda^m$ be a polynomial in $\lambda^p$: $a_j = 0$ if $j \not\equiv 0 \pmod p$, and $m \equiv 0 \pmod p$. Let $B_1 = \mathrm{diag}(1, 2, 3, \dots, m)$. Then $C(g(\lambda)) = (C(g(\lambda)), B_1)$, since $(j+1)a_j = 0 = a_j$ if $j \not\equiv 0 \pmod p$, and $(j+1)a_j = a_j$ if $j \equiv 0 \pmod p$. Moreover, for odd $p$, trace $B_1 = m(m+1)/2 = 0$ because $m \equiv 0 \pmod p$. Next note that $C(\lambda^m) = (C(\lambda^m), B_1 - \alpha I_m)$ for any $m$ and any $\alpha \in F$. If $p$ is odd and $m \equiv 0 \pmod p$, put $\alpha = 0$. Then trace $B_1 = 0$. If $m \not\equiv 0 \pmod p$, $\alpha$ may be chosen from $F$ so that trace $(B_1 - \alpha I_m)$ achieves any desired value in $F$. By taking direct sums, we can get $M = (M, B)$ within $M_n(F)$, with trace $B = 0$ in all cases but the indicated one. This completes the proof of Theorem 6.
THEOREM 7.
Let $M \in M_n(F)$, $n > 2$ ($n > 3$ if $p = 3$), with trace $M = 0$. Then
\[ M = (((\cdots((A, C), C), \cdots), C), X) \tag{12} \]
for certain $A, C, X \in M_n(F)$ with trace $X = 0$, $A$ nilpotent, and (for $p \ne 2$) trace $C = 0$.
Proof. By Theorem 1, $M = (A, X)$ with $A$ nilpotent and trace $X = 0$. By Theorem 6, $A = (A, C)$, with trace $C = 0$ for $p \ne 2$. By iteration we get (12).
REFERENCES
1. A. A. Albert and B. Muckenhoupt, On matrices of trace zero, Mich. Math. J. 4 (1957), 1-3.
2. K. Shoda, Einige Sätze über Matrizen, Jap. J. Math. 13 (1936), 361-365.
UNIVERSITY OF CALIFORNIA, SANTA BARBARA, CALIFORNIA
NON-EUCLIDEAN INCIDENCE PLANES
BY R. ARTZY*
ABSTRACT
The paper discusses finite or infinite incidence planes in which an oval plays the role of a metric conic. The points of the oval are used as coordinates, and ordered couples of these coordinates give rise to a coordinatization of the whole plane by means of ternary structures. These ternaries are studied, and a few specializations and their geometric analogues are studied.
Introduction. Projective and affine incidence planes have been studied extensively. It is, therefore, reasonable also to consider non-euclidean planes based on incidence axioms alone. One way of doing this would be a restriction to the Bolyai-Lobachevsky plane, to the exclusion of the points on the metric conic and those exterior to it. Some aspects of such planes were discussed by L. M. Graves [2] and T. G. Ostrom [6]. Another approach replaced the metric conic and its polarity in the classical Cayley-Klein model by an oval and a polarity with respect to it in a projective incidence plane. This was done, for instance, by T. G. Ostrom [5] and in a series of papers by R. Baer culminating in [1], where the author proved that under his restrictive axioms no Bolyai-Lobachevsky plane can be finite. Most of the investigations in these papers were directed at finite planes, and many of the combinatorial and number-theoretic arguments employed in them do not carry over to the infinite case. This paper deals with non-euclidean planes $\pi_C$, that is, finite or infinite projective incidence planes $\pi$ in which the role of the metric conic is played by an oval $C$ defined by incidence properties alone. The lines of $\pi_C$ are shown to correspond bijectively to the ordered pairs of points of $C$. The incidence relation is expressed in terms of a ternary operation on the points of $C$, and the result is an algebraic structure called a ternary, somewhat resembling M. Hall's ternary ring [3], but possessing also quadratic properties. This enables us to obtain a coordinatization of $\pi_C$ starting from $C$, thus generalizing the classical theory which proceeded as follows: In the real projective plane a nonsingular conic was designated as the metric conic. An addition and a multiplication of the points on the conic were defined as described, for instance, in [9, p. 232].
The points of the conic were the "ends" of the lines in the Cayley-Klein model of the Bolyai-Lobachevsky plane consisting of the interior of the conic. D. Hilbert [4, Appendix III] developed an "end calculus" which was the exact analogue of the field of the points on the metric conic under the addition and multiplication mentioned above. In our treatment we define addition and multiplication of points of $C$ in terms of the ternary. These operations closely resemble the classical operations. The points of $C$ are shown to form commutative loops under this addition and multiplication. Associativity of the additive or the multiplicative loop, respectively, is proved to correspond to the validity of two special Pascal properties with $C$ substituted for the conic. Even with these two properties and with linearity [3], the ternary is not necessarily right- or left-distributive. This is significant because, with the ternary a finite field, $\pi$ would become desarguesian and $C$ a conic, in view of B. Segre's result [8]. We would then obtain the classical Cayley-Klein model. If the ternary is a field of characteristic $\ne 2$, the coordinatization developed in this paper turns out to be essentially the same as that used in Hilbert's "end calculus". A question arising naturally concerns non-euclidean collineations, that is, those collineations of $\pi_C$ which preserve $C$. They will be discussed in another paper.
Received December 13, 1965. *Supported in part by Grant GP-2068 of the National Science Foundation and by the Rutgers Research Council.
1. Definitions. We define a set of points and lines to be a non-euclidean plane $\pi_C$ if it is a projective plane $\pi$, that is, satisfies the axioms I, $\delta$I, II, and if it contains a subset $C$ of points ("oval") satisfying the axioms III.
I. For any two distinct points there is a unique line through both.
$\delta$I. For any two distinct lines there is a unique point lying on both.
II. There are 4 points no 3 of which colline.
$\delta$I is the dual of I. It is well known that the dual of II is a consequence of I, $\delta$I, and II.
III. There is a nonempty set $C$ of points such that III1, III2, $\delta$III1 and $\delta$III2 hold.
III1. Each point $P$ of $C$ is on just one line which contains no other point of $C$. This line is called $PP$, the tangent at $P$.
We denote $\delta C = \{PP \mid P \in C\}$.
III2. No 3 points of $C$ colline.
$\delta$III1. Each line of $\delta C$ contains just one point which is not also on another tangent.
$\delta$III2. No 3 lines of $\delta C$ concur.
Obviously the axioms of the non-euclidean plane are self-dual. The set of all the points mentioned in $\delta$III1 is $C$. For, by III1, each tangent has one point which is not on any other tangent, namely the point in $C$, and by $\delta$III1 this is the only point. Thus, $\delta\delta C = C$. It follows from III1 and $\delta$III1 that a point is in $C$ if and only if it lies on just one tangent. If it lies on no tangent, it is called interior, and if it lies on two distinct
tangents, it is exterior. Dually, each line either belongs to $\delta C$, or it contains two points of $C$ and is called a secant, or none of its points belongs to $C$ and it is a stray. The set of all interior points may be considered as a generalized Bolyai-Lobachevsky plane. R. Baer [1] discussed non-euclidean planes for which he also postulated that every line through an interior point be a secant and that every point on a stray be exterior. These requirements are not a consequence of our axioms I, II and III. As a counter-example we observe that in $PG(2, 3)$ the conic $x_0^2 - x_1^2 - x_2^2 = 0$ obviously satisfies all requirements of $C$. The point $(1, 0, 0)$ is interior, but $x_1 = x_2$ through $(1, 0, 0)$ is not a secant. Moreover, $x_1 = x_2$ is a stray, but $(1, 0, 0)$ on it is not exterior. Baer showed that his planes were necessarily infinite. Our planes may be finite or infinite.
2. Coordinates for $\pi_C$.
PROPOSITION 1. $C$ has at least 3 non-collinear points.
Proof. Axiom II implies that the number of lines through each point of $\pi$ is at least 3. At least one point $P$ of $\pi$ is in $C$. Only one of the $\ge 3$ lines through $P$ is a tangent, and hence, by III1, each of the remaining lines through $P$ has to contain another point of $C$. These two additional distinct points cannot colline with $P$, in view of III2.
We label all points of $C$. Three of them will be called $0, 1, \infty$ ($0 \ne 1 \ne \infty \ne 0$), which is possible in view of Proposition 1. We denote by $(p)(q)$ the line joining the points $p$ and $q$, and by $j \times k$ the point of intersection of the lines $j$ and $k$. In order to shorten the notation we will use the following definition. If $p \in C$ and $j$ is a line through $p$, then $j \times C$ will be $p$ if $j = (p)(p)$, and it will be the second point of intersection of $j$ and $C$ if $j$ is a secant. Now consider a line $j$ not through $\infty$. Let (Figure 1) $j \times (0)(\infty) = X$, $j \times (\infty)(\infty) = Y$, $X(1) \times C = x$, and $Y(0) \times C = y$. Then the points $x$ and $y$ of $C$ are uniquely determined by $j$.
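The $PG(2, 3)$ counter-example of Section 1 is small enough to be enumerated by machine. The sketch below is an illustration only: it represents points and lines of $PG(2, 3)$ by normalized coordinate triples (a standard model, not notation from the paper), and the function names are hypothetical.

```python
from itertools import product

q = 3  # order of the field GF(3)

def normalize(v):
    """Scale a nonzero triple so that its first nonzero coordinate is 1."""
    lead = next(c for c in v if c % q != 0)
    inv = pow(lead, q - 2, q)              # inverse in GF(q) by Fermat's theorem
    return tuple((inv * c) % q for c in v)

points = sorted({normalize(v) for v in product(range(q), repeat=3) if any(v)})
lines = points                             # a line is given by its coefficient triple

def incident(line, point):
    return sum(a * x for a, x in zip(line, point)) % q == 0

# The conic x0^2 - x1^2 - x2^2 = 0 plays the role of the oval C.
oval = [P for P in points if (P[0] ** 2 - P[1] ** 2 - P[2] ** 2) % q == 0]

def points_of_oval_on(line):
    return sum(incident(line, P) for P in oval)

tangents = [l for l in lines if points_of_oval_on(l) == 1]

def classify_point(P):
    count = sum(incident(t, P) for t in tangents)
    return {0: "interior", 1: "on C", 2: "exterior"}[count]

def classify_line(l):
    return {0: "stray", 1: "tangent", 2: "secant"}[points_of_oval_on(l)]

P0 = (1, 0, 0)
line_x1_eq_x2 = normalize((0, 1, -1))      # the line x1 - x2 = 0

point_kind = classify_point(P0)
line_kind = classify_line(line_x1_eq_x2)
P0_on_line = incident(line_x1_eq_x2, P0)
```

Running it gives an oval of 4 points with 4 tangents, `point_kind == "interior"` and `line_kind == "stray"` with `P0_on_line` true, so both of Baer's extra postulates fail in this plane, as stated.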
Conversely, if $x \ne \infty$ and $y \ne \infty$ are points of $C$, then by $(x)(1) \times (0)(\infty) = X$, $(y)(0) \times (\infty)(\infty) = Y$, a line $j = XY$ is uniquely determined by $x$ and $y$. If $\Sigma = C - \{\infty\}$, we have, therefore, a bijectivity between $\Sigma \times \Sigma$ and all the lines of $\pi$ other than those through $\infty$. In particular, we write $j = [x, y]$ and call $x$ and $y$ the line coordinates of $j$. We have yet to take care of the lines through $\infty$. Let $k$ be such a line, and let $k = (n)(\infty)$, $n \in C$. If $Q = (1)(1) \times (0)(\infty)$, and if $Q(n) \times C = m$, we will write $k = [m]$. Obviously this is a bijectivity between all the points of $C$ and the set of all lines through $\infty$. In particular, this makes $(\infty)(\infty) = [0]$ and $(0)(\infty) = [\infty]$. If $P$ is a point of $\pi$ not on $(0)(\infty)$ and not in $C$, then for each $x \in \Sigma$ there is just one line through $P$ having $x$ as its first coordinate, because for
[FIGURE 1]
$X = (x)(1) \times (0)(\infty)$ the line $XP$ is uniquely determined. If $y$ is the second coordinate of this line, then $y$ depends on $x$ and on the choice of $P$. Let (Figure 2)
[FIGURE 2]
$P(0) \times C = c$, $P(\infty) \times C = r$, $(0)(\infty) \times (1)(1) = Q$, $Q(r) \times C = m$. Then $P$ determines $m$ and $c$ uniquely, and conversely for each $m$ and $c$ from $\Sigma$ there is a unique $P$. Thus $y$ is a function of $x$, $m$, and $c$, and we write $y = T(x, m, c)$. This is the equation of the point $P$. $T$ may be considered as a ternary operation on $\Sigma$.
Now let $P = p \in C$, but not on $(0)(\infty)$. Then we proceed as before, now putting $r = p = c$. Furthermore, if $P \ne 0$ lies on $(0)(\infty)$, and if $P(1) \times C = b$, then all the lines through $P$, except $P(\infty)$, will have the first coordinate $b$, and the equation of $P$ will be $x = b$. Thus we obtained equations for all points of $\pi$, except $\infty$. So, finally, we define $x = \infty$ to be the equation of the point $\infty$.
3. The ternary. We now introduce an algebraic structure called a ternary, $(S, T)$. The set $S$ contains at least two distinct elements, 0 and 1, and is closed under the ternary operation $T$. Moreover, the axioms T1 through T12 hold in $(S, T)$ for all $a$, $b$, $c$, and $d$ in $S$. No claim is made as to the independence of the axioms.
T1. $T(0, b, c) = c$.
T2. $T(a, 0, c) = c$.
T3. The equation $T(x, b_1, c_1) = T(x, b_2, c_2)$, with $b_1 \ne b_2$, has a unique solution $x$ in $S$.
T4. The equation $T(a, b, x) = d$ has a unique solution $x$ in $S$.
T5. If $a_1 \ne a_2$, the simultaneous equations $T(a_1, x, y) = d_1$ and $T(a_2, x, y) = d_2$ have a solution for $x$ and $y$ in $S$.
PROPOSITION 2. The solution in T5 is unique.
Proof. Suppose there are two solutions $x_1, y_1$ and $x_2, y_2$. If $x_1 \ne x_2$, $T(a_1, x_1, y_1) = d_1 = T(a_1, x_2, y_2)$; then, by T3, $a_1$ is uniquely defined, and $a_1 = a_2$, a contradiction. If $x_1 = x_2$, then, by T4, also $y_1 = y_2$.
PROPOSITION 3. If $b \ne 0$, $T(x, b, c) = d$ has a unique solution $x$.
Proof. By T2, $d = T(x, 0, d)$, and by T3, $T(x, b, c) = T(x, 0, d)$ has a unique solution $x$.
PROPOSITION 4. If $a \ne 0$, $T(a, x, c) = d$ has a unique solution $x$.
Proof. By T1, $T(0, x, c) = c$. This equation, together with that of our statement, has a unique solution $x, c$ by Proposition 2.
DEFINITION. For every $a \ne 0$, define $a^{-1}$ by $T(a, 1, 1) = T(a, a^{-1}, a)$. The unique existence of $a^{-1}$ follows from Proposition 4.
PROPOSITION 5. $1^{-1} = 1$.
T6. For all $a \ne 0$, $T(a, 1, 1) = T(a, x, a)$ implies $T(x, 1, 1) = T(x, a, x)$. This is equivalent to the statement $(a^{-1})^{-1} = a$.
PROPOSITION 6. If $a \ne 0$, then $a^{-1} \ne 0$.
Proof. If $a^{-1} = 0$, then, by T6, $T(0, 1, 1) = T(0, a, 0)$, which, by T1, means $1 = 0$, a contradiction.
PROPOSITION 7. $a^{-1} = b^{-1}$ implies $a = b$.
Proof. By T6, we have $T(b^{-1}, 1, 1) = T(b^{-1}, a, b^{-1})$ and $T(b^{-1}, 1, 1) = T(b^{-1}, b, b^{-1})$. By Proposition 4, $a = b$.
DEFINITION. For $b \ne a \ne 0 \ne b$, we define $ab = ba$ and $a + b = b + a$ by
\[ T(ab, a^{-1}, a) = T(ab, b^{-1}, b) = a + b. \]
Unique existence of $ab = ba$ follows from T3 and Proposition 7.
PROPOSITION 8. $a1 = 1a = a$ if $1 \ne a \ne 0$.
T7. $T(1, a^{-1}, a) = T(1, a, a^{-1})$ if $a \ne 0$.
PROPOSITION 9. $aa^{-1} = 1$ if $1 \ne a \ne 0$ and if $a \ne a^{-1}$.
T8. The equation $T(a, x^{-1}, x) = b$ has at most two solutions $x$ in $S$.
PROPOSITION 10. If $T(a, x^{-1}, x) = b$ has 2 distinct nonzero solutions $x$ and $y$, then $x + y = b$ and $xy = a$.
T9. For every nonzero $a$ there are unique $p \ne 0$ and $q$ in $S$ such that $T(p, a^{-1}, a) = q$ and such that $T(p, b^{-1}, b) = q$ implies $a = b$.
DEFINITION. Under the assumptions of T9, $p = aa$ and $q = a + a$.
PROPOSITION 11. $1 \cdot 1 = 1$.
Proof. It is claimed that, in the terms of T9, $a = 1$ implies $p = 1$. Then from $T(1, 1, 1) = T(1, b^{-1}, b)$ it would have to follow that $b = 1$. Suppose $b \ne 1$, then $b1 = 1$, by the definition of multiplication. But, by Proposition 8, $b1 = b$, a contradiction.
PROPOSITION 12. If $a \ne 0 \ne d$, then there exists a unique $x$ satisfying $ax = d$.
Proof. By T8, $T(d, a^{-1}, a) = T(d, x^{-1}, x)$ has at most 2 solutions. One solution is $x = a$. If it is single, then by T9, $aa = d$. If there is a solution $x$ other than $a$, then $ax = d$.
PROPOSITION 13. If $e \ne a \ne 0$, then $a + x = e$ has a unique solution $x$.
Proof. $T(ax, a^{-1}, a) = e$ has a unique solution $ax$, by Proposition 3. By Proposition 12 this yields a unique $x$.
DEFINITIONS. $a0 = 0a = 0$, $a + 0 = 0 + a = a$ for all $a$ in $S$.
PROPOSITION 14. If $a \ne 0$ and $ax = 0$, then $x = 0$.
Proof. Suppose $x \ne 0$. If $x \ne a$, then $T(0, a^{-1}, a) = T(0, x^{-1}, x)$, that is, $x = a$, a contradiction. If $x = a$, then $ax = aa = p = 0$, which is impossible. Hence $x = 0$ is the only solution.
PROPOSITION 15. $a + x = a$ has the only solution $x = 0$.
Proof. Suppose $x \ne 0 \ne a \ne x$. Then $T(ax, a^{-1}, a) = a$. By Proposition 3, $ax$ then must be 0, and by Proposition 14, $x = 0$, a contradiction. Now suppose $x = a \ne 0$. Then $T(p, a^{-1}, a) = a$, which yields the impossible value $p = 0$. Finally, let $a = 0$. Then $0 + x = 0$. But $0 + x = x$, and hence $x = 0$.
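The definitions of $a^{-1}$, $ab$ and $a + b$ in terms of $T$ can be sanity-checked in the simplest model, the linear ternary $T(a, b, c) = ab + c$ over a prime field of odd order (the situation of Theorem 6 of Section 5). The sketch below assumes $GF(7)$ and solves the defining equations by brute force; the function names are hypothetical and the code is an illustration, not part of the paper's argument.

```python
P = 7  # assumed odd prime; the field GF(7) stands in for S

def T(a, b, c):
    """Linear ternary operation T(a, b, c) = ab + c over GF(P)."""
    return (a * b + c) % P

def ternary_inverse(a):
    """Solve T(a, 1, 1) = T(a, x, a) for x, as in the definition of a^{-1}."""
    solutions = [x for x in range(P) if T(a, 1, 1) == T(a, x, a)]
    assert len(solutions) == 1             # unique by Proposition 4
    return solutions[0]

def ternary_product_and_sum(a, b):
    """For distinct nonzero a, b solve T(p, a^{-1}, a) = T(p, b^{-1}, b) for p = ab."""
    ia, ib = ternary_inverse(a), ternary_inverse(b)
    solutions = [p for p in range(P) if T(p, ia, a) == T(p, ib, b)]
    assert len(solutions) == 1             # unique by T3 and Proposition 7
    p = solutions[0]
    return p, T(p, ia, a)                  # the common value is a + b

nonzero = range(1, P)

inverse_agrees = all((a * ternary_inverse(a)) % P == 1 for a in nonzero)
operations_agree = all(
    ternary_product_and_sum(a, b) == ((a * b) % P, (a + b) % P)
    for a in nonzero for b in nonzero if a != b
)
# T9 in this model: p = aa and q = a + a make b = a the only solution.
squares_agree = all(
    [b for b in nonzero
     if T((a * a) % P, ternary_inverse(b), b) == (2 * a) % P] == [a]
    for a in nonzero
)
```

In this model the ternary inverse, product and sum coincide with the field operations, which is what one expects from the classical end calculus.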
PROPOSITION 16. $(S, +)$ and $(S - \{0\}, \cdot)$ are commutative loops.
T10. $x + x = a$ has exactly one solution $x$.
PROPOSITION 17. $b + b = 0$ implies $b = 0$.
T11. $xx = a$ has the single solution $x = 0$ if $a = 0$, and no solutions or 2 distinct solutions in $S$ otherwise.
T12. $T(xx, m, c) = x + x$ has the single solution $x = c$ if $m = c^{-1}$, and no solution or 2 distinct solutions in $S$ if $0 \ne m \ne c^{-1}$.
4. The coordinate structure as ternary.
THEOREM 1. $(\Sigma, T)$, as defined in section 2, is a ternary satisfying T1 through T12.
Proof. T1, T2, T4 and T5 are obviously satisfied. For T3 a unique point $x$ will always be obtained unless $b_1 = b_2$, in which case $x$ would be $\infty$. But $\infty$ is not in $\Sigma$. The construction of $a^{-1}$ turns out to be (Figure 3):
\[ ((1)(1) \times (0)(\infty))(a) \times C = a^{-1}, \]
[FIGURE 3]
and then T6 and T7 follow immediately. The points of $C$ except $\infty$ and 0 have exactly all the equations $y = T(x, m^{-1}, m)$ for all $m$ in $\Sigma$. The construction of $ab$ and $a + b$ is the following:
\[ ((a)(b) \times (0)(\infty))(1) \times C = ab, \]
\[ ((a)(b) \times (\infty)(\infty))(0) \times C = a + b, \]
and these constructions also hold if $a = b$. T8 follows from III2; T9 from III1; T10, T11 and T12 from $\delta$III1 and $\delta$III2.
The converse of Theorem 1 is
THEOREM 2. Every ternary $(S, T)$ satisfying T1 through T12 coordinatizes a non-euclidean plane.
Proof. To show the validity of I we have to consider 3 types of points whose equations are, respectively, $x = a$, $x = \infty$, $y = T(x, m, c)$, for $a, m, c$ in $S$. Now, $x = a$ and $x = b$ ($a \ne b$) are joined by the line $[\infty]$ only, and so are $x = a$ and $x = \infty$. The points $x = a$ and $y = T(x, m, c)$ are joined only by $[a, T(a, m, c)]$, and $x = \infty$ and $y = T(x, m, c)$ lie only on $[m]$. The points $y = T(x, m, c)$ and $y = T(x, n, d)$ ($m \ne n$) are on a unique line, by T3. Finally $y = T(x, m, c)$ and $y = T(x, m, d)$ ($c \ne d$) lie on $[m]$, and by T4 there can be no other join.
For $\delta$I we have to consider 3 types of lines: $[a, b]$, $[m]$, and $[\infty]$. The lines $[m]$ and $[n]$ ($m \ne n$) and the lines $[m]$ and $[\infty]$ intersect at $x = \infty$ only. The lines $[m]$ and $[a, b]$ intersect at $y = T(x, m, c)$ with $b = T(a, m, c)$, which, in view of T4, determines $c$ uniquely. The lines $[\infty]$ and $[a, b]$ meet at $x = a$, and so do $[a, b]$ and $[a, b']$ when $b \ne b'$. The lines $[a, b]$ and $[a', b']$, with $a \ne a'$, intersect at the unique point $y = T(x, m, c)$, where $m$ and $c$ are uniquely determined by T5 and Proposition 2 applied to the simultaneous equations $b = T(a, m, c)$ and $b' = T(a', m, c)$.
II is satisfied because the 4 points $x = 0$, $y = 0$, $x = 1$, $y = 1$ are all distinct and no 3 of them colline. Collinearity of any three of them would lead to the contradiction $0 = 1$.
Concerning III1: The tangent at $x = 0$ is $[0, 0]$. Every other line $[0, p]$ also goes through $y = T(x, p^{-1}, p)$, and $[\infty]$ also passes through $x = \infty$. No other lines contain $x = 0$. The tangent at $x = \infty$ is $[0]$. The only other lines through $x = \infty$ are of the form $[p]$ with $p \ne 0$, passing through $y = T(x, p, p^{-1})$, and $[\infty]$ passing through $x = 0$. Finally, the tangent at $y = T(x, n^{-1}, n)$, with $n \ne 0$, is $[nn, n + n]$ by T9. The line $[n^{-1}]$ is no tangent because it contains also $x = \infty$.
Concerning III2: The line $[0, b]$, $b \ne 0$, passes through 2 points of $C$, $x = 0$ and $y = T(x, b^{-1}, b)$. The line $[0, 0]$ passes through the single point $x = 0$. The line $[a, b]$, with $a \ne 0$, goes through $y = T(x, m^{-1}, m)$ with $b = T(a, m^{-1}, m)$, which in view of T8 yields at most 2 solutions. The line $[\infty]$ contains 2 points of $C$, $x = 0$ and $x = \infty$; the line $[m]$, $m \ne 0$, has 2 points, $x = \infty$ and $y = T(x, m, m^{-1})$, and the line $[0]$ only one point, $x = \infty$.
Concerning $\delta$III1: Tangents can be only of the types $[0]$ or $[nn, n + n]$. The tangent $[0]$ passes through $\infty$, and no other tangent passes through $\infty$. Every other point on $[0]$ has an equation of the form $y = q$. Through the point $y = q$ there is exactly one tangent other than $[0]$, namely $[aa, a + a]$ with $a + a = q$. Since, in view of T10, there is exactly one such $a$, each point $y = q$ lies on just
2 tangents. The tangent $[nn, n + n]$ with $n \ne 0$ passes through the point $y = T(x, n^{-1}, n)$ in $C$. This point does not lie on $[0]$, nor can it lie on $[mm, m + m]$ with $m \ne n$, in view of T12. Every point not in $C$, lying on $[nn, n + n]$, is either $x = nn$ or $y = n + n$ or of the type $y = T(x, m, c)$ with $n + n = T(nn, m, c)$ and $m^{-1} \ne c$. Through $x = nn$ there is exactly one other tangent $[nn, n' + n']$ because the equation $zz = nn$ ($n \ne 0$) has, by T11, a solution $z = n' \ne n$. Through $y = n + n$ there is the second tangent $[0]$. The point $y = T(x, m, c)$ ($m \ne 0$) lies on a tangent $[n'n', n' + n']$ other than $[nn, n + n]$ if the equation $z + z = T(zz, m, c)$ has the solution $z = n'$ besides $z = n$. But, since $m^{-1} \ne c$, this is exactly the case in view of T12. Finally, the tangent $[0, 0]$ passes through $x = 0$. Every other point on it has an equation $y = T(x, m, 0)$ with $m \ne 0$. Again $z + z = T(zz, m, 0)$, by T12, has a solution $z$ in addition to $z = 0$, and hence another tangent exists through this point.
Concerning $\delta$III2: The point $x = p$ can lie only on the tangents $[aa, a + a]$ with $aa = p$. By T11, there are at most 2 values of $a$, and therefore at most 2 tangents through $x = p$. Through $x = \infty$ there is only the tangent $[0]$. Through $y = q$ the only tangents are $[0]$ and $[aa, a + a]$, with $a + a = q$. Since, in view of T10, there is exactly one $a$, there are just 2 tangents. Finally, the point $y = T(x, m, c)$ ($m \ne 0$) lies on the only tangents $[aa, T(aa, m, c)]$ with $T(aa, m, c) = a + a$. By T12, there are at most 2 values $a$, and hence at most 2 tangents. This completes the proof.
Now that the distinction between $\Sigma$ and $S$ has lost its significance, we will use $S$ for both.
5. Specializations of the ternary. We will study a few instances of noneuclidean planes with special restrictions on the ternary.

DEFINITION. We say that $\pi_C$ has the Pascal property with respect to the line j as axis if for every 6 points $P_k$, $Q_k$ (k = 1, 2, 3) of C the following holds: If $P_1Q_2 \times P_2Q_1$ and $P_2Q_3 \times P_3Q_2$ lie on j, so does $P_3Q_1 \times P_1Q_3$.

THEOREM 3. (S, +) is an abelian group if and only if $\pi_C$ has the Pascal property with respect to the axis $(\infty)(\infty)$.

Proof. According to the construction described in the proof of Theorem 1, $(a)(b) \times (0)(a + b)$ and $(b)(c) \times (0)(b + c)$ lie on $(\infty)(\infty)$. The Pascal property for the axis $(\infty)(\infty)$ holds if and only if $(a)(b + c) \times (a + b)(c)$ also lies on this axis for all choices of a, b, c in S. But this means additive associativity in S. By Proposition 16, (S, +) then is an abelian group.

THEOREM 4. $(S - \{0\}, \cdot)$ is an abelian group if and only if in $\pi_C$ the Pascal property with axis $(0)(\infty)$ holds.
Proof. Again, let $a, b, c \in S$, all nonzero. The points $(a)(b) \times (1)(ab)$ and $(b)(c) \times (1)(bc)$ lie on $(0)(\infty)$. The Pascal property for the axis $(0)(\infty)$ holds if and only if $(a)(bc) \times (ab)(c)$ also lies on $(0)(\infty)$, that is, (ab)c = a(bc).

THEOREM 5. $(S, +, \cdot)$ is not necessarily left- or right-distributive, even if (S, +) and $(S - \{0\}, \cdot)$ are groups.

Proof. We employ a counter-example which we borrow from G. Pickert [7, p. 93]. Let $(S, +, \cdot)$ be the real field, and use the usual addition, but a multiplication $\cdot$ such that
$$a \cdot b = \begin{cases} 2ab & \text{if } a \text{ and } b \text{ are negative,} \\ ab & \text{otherwise.} \end{cases}$$

Let $T(a, b, c) = a \cdot b + c$. Then we assert that T1 through T12 are satisfied. T1, 2, 4, 6, 7, 10 and 11 are trivially fulfilled. For a proof of T3 and T5 see [7]. A verification for T8, 9 and 12 is more complicated and can be done by straightforward, though cumbersome, computation. In this example addition and multiplication are associative. However, as shown in [7], the distributive law does not hold. On the other hand, we have obviously

THEOREM 6. If (S, T) is linear, that is, T(a, b, c) = ab + c for all a, b, c in S, and if $(S, +, \cdot)$ is a field of characteristic $\neq 2$, then T1 through T12 are valid.

In this case Hilbert's arguments [4, appendix III] apply, and $\pi$ is a projective plane over a field, in the classical sense.

THEOREM 7. If (S, T) is linear and $(S, +, \cdot)$ a field of characteristic $\neq 2$, then for the lines [x, y] of 6C the equation $y^2 = 4x$ holds.

Proof. The lines are of the form [mm, m + m]. Thus y = 2m and $x = (y/2)^2$.

Theorem 7, in a sense, may be considered a generalization of B. Segre's theorem [8]. A final remark concerns the connection between the ternaries and M. Hall's ternary rings [3]. Comparison of their axioms yields the fact that a ternary becomes a ternary ring if T(s, 1, 0) = T(1, s, 0) = s for all s in S. This requirement corresponds to Pascal properties with axes through the points y = T(x, s, 0), and it assures the coordinatization of the plane dual to $\pi$ by means of (S, T).

REFERENCES
1. R. Baer, The infinity of generalized hyperbolic planes, Studies and Essays presented to Richard Courant on his 60th Birthday, pp. 21-27. New York, 1948.
2. L. M. Graves, A finite Bolyai-Lobachevsky plane, Amer. Math. Monthly 69 (1962), 130-132.
3. M. Hall, Projective planes, Trans. Amer. Math. Soc. 54 (1943), 229-277.
4. D. Hilbert, Grundlagen der Geometrie, 9th ed., Stuttgart, 1962.
5. T. G. Ostrom, Ovals, dualities, and Desargues's theorem, Canad. J. Math. 7 (1955), 417-431.
6. T. G. Ostrom, Ovals and finite Bolyai-Lobachevsky planes, Amer. Math. Monthly 69 (1962), 889-891.
7. G. Pickert, Projektive Ebenen, Berlin, 1955.
8. B. Segre, Ovals in a finite projective plane, Canad. J. Math. 7 (1955), 414-416.
9. O. Veblen and J. W. Young, Projective Geometry, Vol. 1. Boston, 1910.

STATE UNIVERSITY OF NEW YORK AT BUFFALO
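The counter-example in the proof of Theorem 5 can be spot-checked numerically. The following is only a sketch under stated assumptions: the function names (`pickert_mul`, `ternary`, `is_associative`, `distributivity_failures`) and the integer sample range are hypothetical and not taken from the paper, and a finite sample can illustrate but not prove associativity over the reals. It implements the modified multiplication (2ab when a and b are both negative, ab otherwise) and the ternary T(a, b, c) built from it.

```python
from itertools import product


def pickert_mul(a, b):
    """Modified product: 2ab if a and b are both negative, ab otherwise."""
    return 2 * a * b if (a < 0 and b < 0) else a * b


def ternary(a, b, c):
    """T(a, b, c) = (modified product of a and b) + c, with the usual addition."""
    return pickert_mul(a, b) + c


def is_associative(values):
    """Check (a.b).c == a.(b.c) on every triple drawn from the sample."""
    return all(
        pickert_mul(pickert_mul(a, b), c) == pickert_mul(a, pickert_mul(b, c))
        for a, b, c in product(values, repeat=3)
    )


def distributivity_failures(values):
    """Return the sample triples violating left and right distributivity."""
    left = [
        (a, b, c)
        for a, b, c in product(values, repeat=3)
        if pickert_mul(a, b + c) != pickert_mul(a, b) + pickert_mul(a, c)
    ]
    right = [
        (a, b, c)
        for a, b, c in product(values, repeat=3)
        if pickert_mul(a + b, c) != pickert_mul(a, c) + pickert_mul(b, c)
    ]
    return left, right


if __name__ == "__main__":
    sample = range(-3, 4)  # assumed integer sample, chosen only for exact arithmetic
    print("associative on sample:", is_associative(sample))
    left, right = distributivity_failures(sample)
    print("a left-distributivity failure:", left[0])
    print("a right-distributivity failure:", right[0])
```

For instance, with a = -1, b = -1, c = 2 the left side a.(b + c) equals -1 while a.b + a.c equals 0, so the left distributive law fails, in line with the statement of Theorem 5.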
VALUES OF GAMES WITH A CONTINUUM OF PLAYERS* BY
YAKAR KANNAI ABSTRACT
A definition of the "Shapley value" of games with a continuum of players and a formula for this value are given for a certain class of games, regarding them as limits of games with a finite number of players.
Introduction. The notion of "the Shapley value" of an n-person game is well known [3]. If the game is given in the characteristic function form, it is possible to write down a formula for the value. Here we shall treat the value problem for games with a continuum of players. A definition and a formula were given for certain such games by R. J. Aumann and by L. S. Shapley in an unpublished work, using invertible strongly mixing measure-preserving transformations. In this paper, we shall show that the same formula can be obtained by regarding the continuous game as a "limit", in a suitable sense, of a sequence of finite games. The characteristic function of an n-person game is a real-valued function v, defined on the subsets S of the set N = {1, ..., n} of the players. The Shapley value w(i) of the player i is given by

$$w(i) = \sum_{i \in S \subseteq N} \frac{(n - s)!\,(s - 1)!}{n!}\,\bigl[v(S) - v(S - \{i\})\bigr] \tag{1.1}$$
where s is the cardinality of S. The characteristic function of a game with a continuum of players is a real-valued function v, defined on the measurable subsets of [0, 1]. In this case, there is no immediate analogue of (1.1). Let $f(x_1, \dots, x_k)$ be a $C^1$ function of the k real variables $x_1, \dots, x_k$ in the rectangle 0 < x~ ,
,~,~_N
(
G - Hi' L ( . . . ) - L
s-i)}
a~,_ l'""ak~--z-i-
=o,
uniformly in i. But
p,(jUsAj)
0p,(A,)=
-
• •i(A,)+(I-O)#,(A,). J~s-{0
Hence, we may reformulate our problem as follows: Given a continuous function g on the rectangle 0 - t)< = =
P r ( [ "et ~i) -ai-~---_-__l[ s - 1, > t ) < C t ;
28a______~and if we set e = t 3 we find that t 2'
C is a constant which does not depend on s or n. The function g is continuous on a closed rectangle and therefore is also bounded and uniformly continuous. Since we can make t arbitrarily small, our assertion follows.
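For a finite game, formula (1.1) can be evaluated by direct enumeration of coalitions. The sketch below is illustrative only: the function name `shapley_value`, the representation of coalitions as `frozenset` objects, and the majority-game example are assumptions made here, not part of the paper. It uses exact rational arithmetic and runs in time exponential in n, so it is meant for small n.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def shapley_value(n, v):
    """Return [w(1), ..., w(n)] computed from formula (1.1).

    n -- number of players, labelled 1..n
    v -- characteristic function taking a frozenset of players to a number
    """
    players = range(1, n + 1)
    values = []
    for i in players:
        others = [j for j in players if j != i]
        total = Fraction(0)
        # Enumerate every coalition S that contains player i.
        for r in range(len(others) + 1):
            for rest in combinations(others, r):
                S = frozenset(rest) | {i}
                s = len(S)
                weight = Fraction(factorial(n - s) * factorial(s - 1), factorial(n))
                total += weight * (v(S) - v(S - {i}))
        values.append(total)
    return values


if __name__ == "__main__":
    # Assumed example: 3-player simple majority game.
    majority = lambda S: 1 if len(S) >= 2 else 0
    print(shapley_value(3, majority))
```

In the symmetric majority game each player receives the same value and the values sum to v(N), which gives a quick consistency check on the weights in (1.1).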
REFERENCES 1. W. Feller, Probability Theory and its Applications, Vol. 1, John Wiley, 1950 (First edition). 2. P. R. Halmos, Measure Theory, Van Nostrand, 1950. 3. L. S. Shapley, A value for n-person games, Contributions to the theory of games 2, Ann. Math. Studies 28, pp. 307-318.
THE HEBREW UNIVERSITY OF JERUSALEM AND THE ISRAEL INSTITUTE FOR BIOLOGICAL RESEARCH
ON EXTREME POINTS IN $l_1$ BY
JORAM LINDENSTRAUSS*
ABSTRACT
It is proved that every bounded closed and convex subset of $l_1$ is the closed convex hull of its extreme points.

The well known Krein Milman theorem states that in a locally convex Hausdorff linear topological space X every compact convex set K is the closed convex hull of its extreme points. There are many known examples of non compact convex sets which are the closed convex hulls of their extreme points, and hence, in the formulation given above, the Krein Milman theorem does not characterize compact convex sets. If A is a compact set in a locally convex Hausdorff space X then the Krein Milman theorem implies that every closed convex subset K of A is the closed convex hull of its extreme points. It has been asked by several mathematicians how far this property characterizes compact sets A. In particular, Ky Fan asked, in a symposium on linear spaces held in Jerusalem in 1964, whether in every non-quasireflexive Banach space there is a bounded closed convex set which does not have an extreme point. The recent important results of James [2] concerning the characterization of w-compact sets may suggest that the answer to the question above is positive. However it turns out that the answer is negative. We prove here

THEOREM 1. In the space $l_1$ every closed bounded and convex set is the closed convex hull of its extreme points.

The question whether Theorem 1 in its present formulation is true was raised by M. A. Rieffel and was communicated to the author by R. R. Phelps. In [3] we showed that $l_1$ has a closed subspace which is not isomorphic to a conjugate Banach space. Hence Theorem 1 implies

COROLLARY 1. There is a Banach space X which is not isomorphic to a conjugate space such that every closed bounded and convex subset of X is the closed convex hull of its extreme points.
* The research reported in this document has been sponsored by the Air Force Office of Scientific Research under Grant AF EOAR 66-18 through the European Office of Aerospace Research (OAR) United States Air Force.
This corollary shows that the existence of many extreme points in every closed and convex subset of a given convex set A does not imply even a weak form of compactness of A. The proof of Theorem 1 is based on the results of Bishop and Phelps [1].

LEMMA 1. Let X be a Banach space. Then the following two statements are equivalent. (i) Every bounded closed and convex subset of X has an extreme point. (ii) Every bounded closed and convex subset of X is the closed convex hull of its extreme points.

Proof. Clearly (ii) $\Rightarrow$ (i). Assume that (i) holds and let K be a bounded closed and convex subset of X. Let $K_0$ be the closed convex hull of the extreme points of K. If $K_0 \neq K$ then there is by [1] an $f \in X^*$ and a $y \in K$ such that

$$f(y) = \sup_{x \in K} f(x) > \sup_{x \in K_0} f(x).$$

The set $K_1 = \{x;\ x \in K,\ f(x) = f(y)\}$ is a closed face of K which is disjoint from $K_0$. Since $K_0 \supset \operatorname{ext} K \supset \operatorname{ext} K_1 \neq \emptyset$ we get a contradiction ($\operatorname{ext} K$ denotes the set of extreme points of K).

LEMMA 2. Let K be a bounded closed and convex subset of $l_1$ and let $\varepsilon > 0$. Then there is a closed face F of K and an integer n such that $x = (x_1, x_2, \dots) \in F$ implies $\sum_{i > n} |x_i| < \varepsilon$.

Proof. Let $M = \sup_{x \in K} \|x\|$ and choose a $y = (y_1, y_2, \dots)$ in K such that $\|y\| > M - \varepsilon/4$. Let n be such that $\sum_{i=1}^{n} |y_i| > M - \varepsilon/2$, and let $f \in l_1^*$ be defined by
$$f(x_1, x_2, \dots) = \sum_{i=1}^{n} \operatorname{sgn}(y_i)\, x_i$$

(here $\operatorname{sgn} t = 1$ if $t > 0$ and $= -1$ if $t < 0$). By [1] there is a $g \in l_1^*$ such that $\|f - g\| < \varepsilon/4M$ and $F = \{x;\ x \in K,\ g(x) = \sup_{u \in K} g(u)\}$ is not empty. We claim that F has the required properties. Indeed, let $x \in F$, then
$$\sum_{i=1}^{n} |x_i| \geq f(x) \geq g(x) - \|f - g\|\,\|x\| \geq g(y) - \|f - g\|\,\|x\| \geq f(y) - \|f - g\|\,(\|x\| + \|y\|) \geq M - \varepsilon/2 - \varepsilon/2.$$
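The individual steps of this chain can be unpacked as follows. This is a sketch of the reasoning the text leaves implicit, using only the definitions of f, g, F, M and n given in the proof:

$$
\begin{aligned}
g(x) &\geq g(y) && \text{since } x \in F \text{ maximizes } g \text{ on } K \text{ and } y \in K,\\
g(y) &\geq f(y) - \|f - g\|\,\|y\| && \text{since } |f(y) - g(y)| \leq \|f - g\|\,\|y\|,\\
f(y) &= \sum_{i=1}^{n} \operatorname{sgn}(y_i)\, y_i = \sum_{i=1}^{n} |y_i| > M - \varepsilon/2,\\
\|f - g\|\,(\|x\| + \|y\|) &< \frac{\varepsilon}{4M} \cdot 2M = \varepsilon/2 && \text{since } \|x\|, \|y\| \leq M.
\end{aligned}
$$

Together these give $\sum_{i=1}^{n} |x_i| > M - \varepsilon$ for every $x \in F$.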
Hence, since $\|x\| \leq M$, $\sum_{i > n} |x_i| < \varepsilon$.

We are now ready to prove Theorem 1. Let K be a closed bounded and convex subset of $l_1$. By Lemma 1 we have only to show that $\operatorname{ext} K \neq \emptyset$. By Lemma 2 there is a sequence $\{F_i\}_{i=1}^{\infty}$ of closed faces of K and a sequence of integers $\{n_i\}_{i=1}^{\infty}$ such that $F_{i+1}$ is a face of $F_i$ and $x = (x_1, x_2, \dots) \in F_i$ implies that $\sum_{j > n_i} |x_j| \leq 1/i$. Let $\{y_i\}$ be a sequence of points in $l_1$ such that $y_i \in F_i$. From the properties of the $F_i$ it follows immediately that the set $\{y_i\}_{i=1}^{\infty}$ is totally bounded. Hence since
K is closed, $F = \bigcap_{i=1}^{\infty} F_i$ is a non empty compact face of K. By the Krein Milman theorem $\operatorname{ext} F \neq \emptyset$ and hence $\operatorname{ext} K \neq \emptyset$. This concludes the proof.

REMARK. The proof presented here can be used to prove the following more general result. Let $\{X_\alpha\}_{\alpha \in A}$ be a set of reflexive Banach spaces. Then every closed convex subset of $(\sum \oplus X_\alpha)_1$ is the closed convex hull of its extreme points. In this case the set F appearing in the proof of the theorem will be w-compact (and not norm compact in general) but this suffices for the proof.

REFERENCES
1. E. Bishop and R. R. Phelps, The support functionals of a convex set, Proc. Symposia in Pure Math. VII (Convexity), Amer. Math. Soc. (1963), 27-37.
2. R. C. James, Weakly compact sets, Trans. Amer. Math. Soc. 113 (1964), 129-140.
3. J. Lindenstrauss, On a subspace of the space $l_1$, Bull. Acad. Polon. Sci. Math. Astr. et Phys. 12 (1964), 539-542.

THE HEBREW UNIVERSITY OF JERUSALEM
ON PERTURBATION THEORY FOR SPECTRAL OPERATORS BY
L. TZAFRIRI ABSTRACT
Conditions are given under which Rellich's perturbation theorem for normal operators on Hilbert spaces may be generalized to spectral operators on Banach spaces.

In a previous paper, [6] Theorem 8, we proved that the strong limit of a sequence of spectral operators commuting with a Boolean algebra of projections of finite multiplicity is a spectral operator provided that their resolutions of the identity are uniformly bounded. In connection with this theorem, Foguel has raised the question whether a theorem similar to Rellich's theorem [5] p. 678 (see also [2] Theorem X-7-2) is true in the above mentioned case. In this note we consider a sequence of spectral operators of scalar type $\{S_n\}$, converging strongly to a scalar operator S, and we give a necessary and sufficient condition for the sequence $\{f(S_n)\}$ to converge strongly to f(S) for every bounded Borel function f for which the resolution of the identity of S vanishes on a closed set containing the discontinuities of f. The required perturbation theorem for spectral operators commuting with a Boolean algebra of projections of finite multiplicity will be an immediate consequence of the preceding theorem. Foguel has introduced in [3] p. 59 the notions of real and imaginary parts of a scalar operator. In fact, he has proved that every scalar operator S whose resolution of the identity is $E(\cdot)$ has a unique decomposition

$$S = R + iJ$$
where R ---- ~R¢ S = j" (~Re2) E(d2) is called the real part of S and J = ~3m S = j" (~m 2) E(d2) the imaginary part (R and J have real spectrum). Using these concepts and some ideas from the proof of Rellich's theorem for normal operators we get the next basic result. TI-mOREM. Let {Sn} be a sequence of scalar operators on a Banach space X converging strongly to a scalar operator S and such that their resolutions of the identity are uniformly bounded i.e. [] E(S,, ") [[ 1, a continuous function um with 0 < Urn(2) < 1 ; Urn(2) = 0 for 2 ~ 6 and Urn(2) = 1 when minu~ p {2 - #1 > 1/m. One can easily see that lim f DI---~ OO
Um(2) E(S, d,;t)xo = Xo
,J
Since f . um is a continuous function it follows from the first part of the proof that
$$\lim_{n \to \infty} \int f(\lambda)\, u_m(\lambda)\, E(S_n, d\lambda)\, x = \int f(\lambda)\, u_m(\lambda)\, E(S, d\lambda)\, x; \qquad x \in X;\ m = 1, 2, \dots$$
and in view of the uniform boundedness of $\|E(S_n, \cdot)\|$; n = 1, 2, ..., we have
f m)E(Sn,dX)xo f
limn~®
=
f