Thoughts on the Riemann Hypothesis
G. J. Chaitin

The simultaneous appearance in May 2003 of four books on the Riemann hypothesis (RH) provoked these reflections. I briefly discuss whether the RH should be added as a new axiom, and whether a proof of the RH might involve the notion of randomness.

The Opinion column offers mathematicians the opportunity to write about any issue of interest to the international mathematical community. Disagreement and controversy are welcome. The views and opinions expressed here, however, are exclusively those of the author, and neither the publisher nor the editor-in-chief endorses or accepts responsibility for them. An Opinion should be submitted to the editor-in-chief, Chandler Davis.

New Pragmatically Justified Mathematical Axioms that Are Not at All Self-evident

A pragmatically justified principle is one that is justified by its many important consequences, which is precisely the opposite of normal mathematical practice.¹ However, this is standard operating procedure in physics. Are there mathematical propositions for which there is a considerable amount of computational evidence, evidence that is so persuasive that a physicist would regard them as experimentally verified? And are these propositions fruitful? Do they yield many other significant results?

Yes, I think so. At present, the two best candidates² for useful new axioms of the kind that Gödel and I propose [1] that are justified pragmatically as in physics are:

• the P ≠ NP hypothesis in theoretical computer science that conjectures that many problems require an exponential amount of work to resolve, and
• the Riemann hypothesis concerning the location of the complex zeroes of the Riemann zeta function

\[ \zeta(s) = \sum_{n} \frac{1}{n^{s}} = \prod_{p} \frac{1}{1 - \frac{1}{p^{s}}}. \]

(Here $n$ ranges over positive integers and $p$ ranges over the primes.)³

Knowing the zeroes of the zeta function, i.e., the values of $s$ for which $\zeta(s) = 0$, tells us a lot about the smoothness of the distribution of prime numbers, as is explained in these four books:

• Marcus du Sautoy, The Music of the Primes, Harper Collins, 2003.
• John Derbyshire, Prime Obsession, Joseph Henry Press, 2003.
• Karl Sabbagh, The Riemann Hypothesis, Farrar, Strauss and Giroux, 2003.
• Julian Havil, Gamma, Princeton University Press, 2003.⁴

The Riemann zeta function is like my Ω number: it captures a lot of information about the primes in one tidy package. Ω is a single real number that contains a lot of information about the halting problem.⁵ And the RH is useful because it contains a lot of number-theoretic information: many number-theoretic results follow from it.

Of the authors of the above four books on the RH, the one who takes Gödel most seriously is du Sautoy, who has an entire chapter on Gödel and Turing in his book. In that chapter, on p. 181, du Sautoy raises the issue of
whether the RH might require new axioms. On p. 182 he quotes Gödel,⁶ who specifically mentions that this might be the case for the RH. And on p. 202 of that chapter, du Sautoy points out that if the RH is undecidable this implies that it's true, because if the RH were false it would be easy to confirm that a particular zero of the zeta function is in the wrong place.

Later in his book, on pp. 256-257, du Sautoy again touches on the issue of whether the RH might require a new axiom. He relates how Hugh Montgomery sought reassurance from Gödel that a famous number-theoretic conjecture (it was the twin prime conjecture, which asserts that there are infinitely many pairs $p$, $p + 2$ that are both prime) does not require new axioms. Gödel, however, was not sure. In du Sautoy's words, sometimes one needs "a new foundation stone to extend the base of the edifice" of mathematics, and this might conceivably be the case both for the twin prime conjecture and for the RH.

On the other hand, on pp. 128-131 du Sautoy tells the story of the Skewes number, an enormous number $10^{10^{10^{34}}}$ that turned up in a proof that an important conjecture must fail for extremely large cases. The conjecture in question was Gauss's conjecture that the logarithmic integral

\[ \mathrm{Li}(x) = \int_{2}^{x} \frac{du}{\ln u} \]

is always greater than the number $\pi(x)$ of primes less than or equal to $x$. This was verified by direct computation for all $x$ up to very large values. It was then refuted by Littlewood without exhibiting a counter-example, and finally by Skewes with his enormous upper bound on a counter-example. This raised the horrendous possibility that even though Gauss's conjecture is wrong, we might never ever see a specific counter-example. In other words, we might never ever know a specific value of $x$ for which $\mathrm{Li}(x)$ is less than $\pi(x)$. This would seem to pull the rug out from under all mathematical experimentation and computational evidence! However, I don't believe that it actually does.

The traditional view held by most mathematicians is that these two assertions, P ≠ NP and the RH, cannot be taken as new axioms, and cannot require new axioms; we simply must work much harder to prove them. According to the received view, we're not clever enough, we haven't come up with the right approach yet. This is very much the current consensus. However, this majority view completely ignores⁷ the incompleteness phenomenon discovered by Gödel and Turing, and extended by my own work [2] on information-theoretic incompleteness. What if there is no proof?

In fact, new axioms can never be proved; if they can, they're theorems, not axioms. So they must either be justified by direct, primordial mathematical intuition, or pragmatically, because of their rich and important consequences, as is done in physics. And in line with du Sautoy's observation, one cannot demand a proof that the RH is undecidable before being willing to add it as a new axiom, because such a proof would in fact yield the immediate corollary that the RH is true. So proving that the RH is undecidable is no easier than proving the RH, and the need to add the RH as a new axiom must remain a matter of faith. The mathematical community will never be convinced.⁸

Someone recently asked me, "What's wrong with calling the RH a hypothesis? Why does it have to be called an axiom? What do you gain by doing that?" Yes, but that's beside the point; that's not the real issue. The real question is, Where does new mathematical knowledge come from? By "new knowledge" I mean something that cannot be deduced from our previous knowledge, from what we already know.

As I have been insinuating, I believe that the answer to this fundamental question is that new mathematical knowledge comes from these three sources:

a. mathematical intuition and imagination ($\sqrt{-1}$),
b. conjectures based on computational evidence (explains calculations), and
c. principles with pragmatic justification, i.e., rich in consequences (explains other theorems).⁹

And items (b) and (c) are much like physics, if you replace "computational evidence" by "experimental evidence." In other words, our computations are our experiments; the empirical basis of science is in the lab, the empirical basis of math is in the computer.

Yes, I agree, mathematics and physics are different, but perhaps they are not as different as most people think; perhaps it's a continuum of possibilities. At one end, rigorous proofs; at the other end, heuristic plausibility arguments, with absolute certainty as an unattainable limit point.

I've been publishing papers defending this thesis for more than a quarter of a century,¹⁰ but few are convinced by my arguments. So in a recent paper [1] I've tried a new

¹However, new mathematical concepts such as $\sqrt{-1}$ and Turing's definition of computability certainly are judged by their fruitfulness. (Françoise Chaitin-Chatelin, private communication.)
²Yet another class of pragmatically justified axioms are the large cardinal axioms and the axiom of determinacy used in set theory, as discussed in Mary Tiles, The Philosophy of Set Theory, Chapters 8 and 9. For the latest developments, see Hugh Woodin, "The continuum hypothesis," AMS Notices 48 (2001), 567-576, 681-690.
³You start with this formula and then you get the full zeta function by analytic continuation.
⁴Supposedly Havil's book is on Euler's constant $\gamma$, not the RH, but ignore that. Sections 15.6, 16.8, and 16.13 of his book are particularly relevant to this paper.
⁵$\Omega = \sum_{p \text{ halts}} 2^{-|p|}$ is the halting probability of a suitably chosen universal Turing machine. Ω is "incompressible" or "algorithmically random." Given the first $N$ bits of the base-two expansion of Ω, one can determine whether each binary program $p$ of size $|p| \le N$ halts. This information cannot be packaged more concisely. See [2], Sections 2.5 through 2.11.

THE MATHEMATICAL INTELLIGENCER © 2004 SPRINGER-VERLAG NEW YORK
⁶Unfortunately du Sautoy does not identify the source of his Gödel quote. I have been unable to find it in Gödel's Collected Works.
⁷As du Sautoy puts it, p. 181, "mathematicians consoled themselves with the belief that anything that is really important should be provable, that it is only tortuous statements with no valuable mathematical content that will end up being one of Gödel's unprovable statements."
⁸The situation with respect to P ≠ NP may be different. In a paper "Consequences of an exotic definition for P = NP," Applied Mathematics and Computation 145 (2003), pp. 655-665, N. C. A. da Costa and F. A. Doria show that if ZFC (Zermelo-Fraenkel set theory + the axiom of choice) is consistent, then a version of P = NP is consistent with ZFC, so a version of P ≠ NP cannot be demonstrated within ZFC. See also T. Okamoto, R. Kashima, "Resource bounded unprovability of computational lower bounds," http://eprint.iacr.org/2003/187/.
⁹A possible fourth source of mathematical knowledge is (d) probabilistic or statistical evidence: A mathematical assertion may be deemed to be true because the probability that it's false is immensely small, say [...]

[...]

\[ \mu(n) = \begin{cases} 0 & \text{if } n \text{ is divisible by a square greater than } 1, \\ (-1)^{\text{number of different prime divisors of } n} & \text{if } n \text{ is square-free.} \end{cases} \]
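For readers who want to experiment, here is a minimal sketch of the function μ: it returns 0 when n has a repeated prime factor and (−1)^k when n is a product of k distinct primes. The function name `mobius` and the use of plain trial division are illustrative choices of this sketch, not something taken from the article.

```python
def mobius(n: int) -> int:
    """Return mu(n): 0 if a square > 1 divides n, else (-1)**(number of distinct primes)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    sign = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                # p divides the original n at least twice: n is not square-free.
                return 0
            sign = -sign  # one more distinct prime divisor
        p += 1
    if n > 1:
        # Whatever is left is a single prime factor larger than sqrt(n).
        sign = -sign
    return sign


if __name__ == "__main__":
    print([mobius(k) for k in range(1, 11)])
```

Trial division is only practical for small n; it is used here because it mirrors the definition directly.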
The RH is equivalent to the assertion that as $k$ goes from 1 to $n$, $\mu(k)$ is positive as often as negative. More precisely, the RH is closely related to the assertion that the difference between

• the number of $k$ from 1 to $n$ for which $\mu(k) = -1$, and
• the number of $k$ from 1 to $n$ for which $\mu(k) = +1$

is $O(\sqrt{n})$, of the order of square root of $n$, i.e., is bounded by a constant times the square root of $n$. This is roughly the kind of behavior that one would expect if the sign of the μ function were chosen at random using independent tosses of a fair coin.¹¹

This is usually formulated in terms of the Mertens function $M(n)$:¹²

\[ M(n) = \sum_{k=1}^{n} \mu(k). \]

According to Derbyshire, pp. 249-251,

\[ M(n) = O(\sqrt{n}) \]

implies the RH, but is actually stronger than the RH. The RH is equivalent to the assertion that for any $\varepsilon > 0$,

\[ M(n) = O(n^{\frac{1}{2} + \varepsilon}). \]

Could this formula be the door to the RH?!

This probabilistic approach caught my eye while I was reading this May's crop of RH books. I have always had an interest in probabilistic methods in elementary number theory. This was one of the things that inspired me to come up with my definition of algorithmic randomness and to find algorithmic randomness in arithmetic [6] in connection with diophantine equations.

• Mark Kac, Statistical Independence in Probability, Analysis and Number Theory, Carus Mathematical Monographs, vol. 12, Mathematical Association of America, 1959.
• George Pólya, "Heuristic reasoning in the theory of numbers," 1959, reprinted in Gerald W. Alexanderson, The Random Walks of George Pólya, Mathematical Association of America, 2000.

I think that anyone contemplating a probabilistic attack on the RH via the μ function should read these two publications. There is also some interesting work on random sieves, which are probabilistic versions of the sieve of Eratosthenes:

• D. Hawkins, "Mathematical sieves," Scientific American, December 1958, pp. 105-112.

As Pólya shows in the above paper (originally American Mathematical Monthly 66, pp. 375-384), probabilistic heuristic reasoning can do rather well with the distribution of twin primes. By the way, this involves Euler's γ constant. Can a refinement of Pólya's technique shed new light on μ and on the RH? I don't know, but I think that this is an interesting possibility.

By the way, P ≠ NP also involves randomness, for as Charles Bennett and John Gill showed in 1981 (SIAM Journal on Computing 10, pp. 96-113), with respect (relative) to a random oracle $A$, $P^A \ne NP^A$ with probability one [7].

Further Reading: Four "Subversive" Books

• On experimental mathematics: Borwein, Bailey, and Girgensohn, Mathematics by Experiment, Experimentation in Mathematics, A. K. Peters, 2003. (See [8]. There is a chapter on zeta functions in volume two.)
• On a quasi-empirical view of mathematics: Tymoczko, New Directions in the Philosophy of Mathematics, Princeton University Press, 1998.
• On pragmatically justified new axioms and information-theoretic incompleteness: Chaitin, From Philosophy to Program Size, Tallinn Cybernetics Institute, 2003. (There is also an electronic version of this book [2].)

And regarding the adverse reaction of the mathematics community to the ideas in the above books, I think that it is interesting to recall Gödel's difficulties at the Princeton Institute for Advanced Study, as recounted in:

• John L. Casti, The One True Platonic Heaven, Joseph Henry Press, 2003.

According to Casti, one of the reasons that it took so long for Gödel's appointment at the IAS to be converted from temporary to permanent is that some of Gödel's colleagues dismissed his incompleteness theorem. Now, of course, Gödel has become a cultural icon¹³ and mathematicians take incompleteness more seriously, but perhaps not seriously enough.

Mathematicians shouldn't be cautious lawyers; I much prefer the bold Eulerian way of doing mathematics. Instead of endlessly polishing, how about some adventurous pioneer spirit? Truth can be reached through successive approximations; insistence on instant absolute rigor is sterile. That's what I've learned from incompleteness.¹⁴

¹¹For a more precise idea of what to expect if the sign of the μ function were chosen at random, see the chapter on the law of the iterated logarithm in Feller, An Introduction to Probability Theory and Its Applications, vol. 1, VIII.5 through VIII.7.
¹²See [4, 5].
¹³In this connection, I should mention Incompleteness, a play and a theorem by Apostolos Doxiadis, which is a play about Gödel. For more information, see [9].
¹⁴In this connection, see da Costa and French, Science and Partial Truth, Oxford University Press, 2003.

IBM Research
Yorktown Heights, NY 10598
USA
e-mail: [email protected]

WEB REFERENCES

[1] Two philosophical applications of algorithmic information theory. http://www.cs.auckland.ac.nz/CDMTCS/chaitin/dijon.html
[2] From philosophy to program size. http://www.cs.auckland.ac.nz/CDMTCS/chaitin/ewscs.html
[3] Information-theoretic limitations of formal systems. http://www.cs.auckland.ac.nz/CDMTCS/chaitin/acm74.pdf
[4] Mertens function. http://mathworld.wolfram.com/MertensFunction.html
[5] Mertens conjecture. http://mathworld.wolfram.com/MertensConjecture.html
[6] Randomness in arithmetic. http://www.cs.auckland.ac.nz/CDMTCS/chaitin/sciamer2.html
[7] Relative to a random oracle A, P^A ≠ NP^A ≠ co-NP^A with probability 1. http://www.research.ibm.com/people/b/bennetc/bennettc1981497f3f4a.pdf
[8] Experimental mathematics website. http://www.expmath.info
[9] Apostolos Doxiadis home page. http://www.apostolosdoxiadis.com
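In the spirit of treating computations as experiments, the following sketch tabulates the Mertens function M(n), the running sum of μ(k), and compares |M(n)| with √n over a small range. The names `mobius_sieve` and `mertens_table` and the cutoff 10,000 are assumptions of this sketch. Agreement on such a short range is only suggestive, as the story of the Skewes number warns.

```python
import math


def mobius_sieve(limit: int) -> list[int]:
    """Return a list mu with mu[k] = mu(k) for 0 <= k <= limit (mu[0] is set to 0)."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            # Every multiple of the prime p gains one prime divisor: flip the sign.
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]
            # Multiples of p*p are not square-free.
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0
    return mu


def mertens_table(limit: int) -> list[int]:
    """Return M with M[n] = mu(1) + ... + mu(n) for 0 <= n <= limit."""
    mu = mobius_sieve(limit)
    table = [0] * (limit + 1)
    for k in range(1, limit + 1):
        table[k] = table[k - 1] + mu[k]
    return table


if __name__ == "__main__":
    M = mertens_table(10_000)
    worst = max(range(1, 10_001), key=lambda n: abs(M[n]) / math.sqrt(n))
    print("M(10), M(100), M(1000), M(10000):", M[10], M[100], M[1000], M[10_000])
    print("largest |M(n)|/sqrt(n) for n <= 10000 occurs at n =", worst)
```

The sieve is used instead of factoring each k separately so that the whole table costs roughly the same as a sieve of Eratosthenes.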
Solution Kept Secret
VOLUME 26, NUMBER 1 , 2004
EUGENE GUTKIN
The Toeplitz-Hausdorff Theorem Revisited: Relating Linear Algebra and Geometry
Genesis

In the beautiful paper [24] O. Toeplitz associated with any $n \times n$ matrix a compact set in the complex plane. As his title suggests, he was inspired by a theorem of L. Fejér [6] concerning a relationship between planar curves and Fourier series. Apart from this, the paper [24] is self-contained. Let $\mathbb{C}^n$ be the standard vector space with the scalar product $\langle \cdot, \cdot \rangle$. I will not distinguish between the complex $n \times n$ matrices and operators on $\mathbb{C}^n$. Let $C$ be one such. It is determined by its "bilinear" form $\langle Cu, v \rangle$. The compact set that Toeplitz introduces is the image, $W = W(C) \subset \mathbb{C}$, of the unit sphere in $\mathbb{C}^n$, under the quadratic map $u \mapsto \langle Cu, u \rangle$. He conjectures that $W(C)$ is a convex set, and proves that the outer boundary of $W(C)$ is a convex curve.

A year later F. Hausdorff proved Toeplitz's conjecture [12]. The Toeplitz-Hausdorff theorem was born. For several reasons, it continues to attract the attention of researchers. Extensions of Toeplitz's setting came up in robust control; hence the thriving engineering literature on the subject. See [20, 21, 5]. My own preoccupation with the Toeplitz-Hausdorff theorem has its genesis in a joint project with electrical engineers [15, 10].

Despite (or because of) the simplicity of the Toeplitz-Hausdorff framework, basic questions in the subject remain open [14]. For instance, it is not known what domains are realizable as $W(C)$ for $C$ on $\mathbb{C}^n$. The present article aspires to attract attention of the general mathematical readership to the fascinating interplay of linear algebra, geometry, and analysis that the papers [24, 12] initiated.

My plan is as follows. I analyze in some detail the original papers of Toeplitz and Hausdorff. Then, following the viewpoint of [24], I associate with an arbitrary $C$ a linear pencil of hermitian operators $H(\cdot)$. This allows me to cast the analysis into the language of convex geometry: Support lines and support functions come in. The crucial observation is that the support function of $W(C)$ is the highest eigenvalue, $\lambda(\cdot)$, of $H(\cdot)$. This brings in both the algebraic geometry and the convex geometry. R. Kippenhahn was the first to exploit this observation. In his Dissertation [16] he introduces and develops this point of view. To illustrate this approach, I immediately derive rough bounds on the size of $W(C)$ in terms of the spectral attributes of $C$. I also reproduce without proof the much more sophisticated estimates of Kippenhahn [16].

Then I bring in the differential geometry by calculating the curvature of the boundary curve $\partial W(C)$. To show the usefulness of this viewpoint, I apply it to obtain new bounds on the size of $W(C)$ in terms of the standard attributes of $C$. These estimates, although still very crude, are sharper than those I got out of the support function. The differential geometry viewpoint turned out to be especially suitable to study the multidimensional version of $W(C)$, the joint numerical range [10].¹ I conclude with a brief survey of the literature and a personal remark. I thank the referees for helpful comments.

Historical Remarks

Toeplitz coined the name "Wertvorrat" for $W(C)$. A literal English translation is the value supply or the stock of values. Variations of "Wertvorrat" dominate the German literature on the subject. For instance, A. Wintner, during the Leipzig period of his prolific career, used the expressions "Wertevorrat" (values supply) and "Wertbereich" (value domain) [26].² The modern literature intermittently uses field of values
¹There are many generalizations of the numerical range of an operator in the modern literature. It would take several pages just to give the relevant definitions. The concept of the joint numerical range and the awareness that it is the natural multi-dimensional extension of the numerical range is already in the founding papers [24, 12].
²Wintner emigrated to America shortly after the University of Leipzig refused to award him the Habilitation. The book [26] is apparently his Habilitationsschrift.
³ and numerical range. I don't like either expression. The former adds one more item to the litany of mathematical "fields"; the latter is plain awkward. The original name is better in every respect except one: It is German and therefore unacceptable in the English literature.⁴ Some proposed alternatives (template, form range, contracted graph) did not fly. I find the expression numerical range the lesser of two evils, and I will use it in what follows.⁵

Toeplitz proves several propositions relating $W(C)$ and the spectrum of $C$. For instance, he shows that $W(C)$ contains the spectrum, and if $C$ is a normal operator, then $W(C)$ is the convex hull of the spectrum. But the centerpiece of [24] is "Satz 8," the convexity of the outer boundary.

The penultimate §5 of [24] offers several informal comments, and points out the difference between convexity of the outer boundary and convexity of the set. Then Toeplitz says: "I will now discuss a generalization of the entire setting, which . . . also shows the difficulties that stem in the general case from the possibility of holes." He goes on to introduce what is now called the joint numerical range of any number $q$ of hermitian operators $A_1, \ldots, A_q$. The set in question, $W(A_1, \ldots, A_q) \subset \mathbb{R}^q$, is the image of the unit sphere in $\mathbb{C}^n$, under the map $u \mapsto (\langle A_1 u, u \rangle, \ldots, \langle A_q u, u \rangle)$. The decomposition $C = A_1 + iA_2$ implies $W(C) = W(A_1, A_2)$.⁶ Toeplitz demonstrates that $W(A_1, \ldots, A_q)$ is not convex, in general. He concludes: "Whether this can already happen for $q = 2$ remains possible, although unlikely."

Toeplitz missed that he actually proved the desideratum! Indeed, to a modern reader, it seems that Toeplitz essentially settled the convexity conjecture. To us, it suffices to prove it for $n = 2$; for, if $\langle Cu, u \rangle$ and $\langle Cv, v \rangle$ belong to $W(C)$, and the numerical range of the restriction of the form $C$ to $\mathbb{C}u + \mathbb{C}v$ is convex, then the claim holds. And in §5 Toeplitz shows that the numerical range of an operator on $\mathbb{C}^2$ is either an elliptic disc, or a segment, or a point: in each case, it is convex! In fact, this is how the Toeplitz-Hausdorff theorem is proved in modern textbooks [9, 11, 14].⁷ Amazingly, in the 80-some years since [24], nobody, including Hausdorff, noticed that the Toeplitz-Hausdorff theorem is implicitly proved in [24].

In the 3-page-long, focused, beautiful paper [12], Hausdorff proves Toeplitz's conjecture. On the one hand, he proves it from scratch, without using Satz 8 of [24]. On the other, he goes just a step further than Toeplitz to show that the intersection of $W(C)$ with any straight line is the image of a connected subset of the unit sphere under a continuous mapping, and hence is connected.⁸ In a one-sentence remark Hausdorff points out that his results and the Toeplitz argument combine to prove the convexity of the outer boundary of the surface $W(A_1, A_2, A_3)$ for any triple of hermitian operators.

A natural generalization of the Toeplitz-Hausdorff theorem would have been the convexity of $W(A_1, \ldots, A_q)$ for all hermitian operators on any $\mathbb{C}^n$. Although this claim is "very false" [11], $W(A_1, A_2, A_3)$ for any triple $A_1, A_2, A_3$ on $\mathbb{C}^n$ is convex if $n \ge 3$. Remarkably, it was established 60 years after the papers [24, 12]! There are several proofs of this in the literature [10], and some are based on the Hausdorff connectedness idea [5]. The convexity claim for $W(A_1, A_2, A_3, A_4)$ for operators on $\mathbb{C}^n$ fails for any $n$ [5]. Although this is unfortunate from the engineering viewpoint [21], there are nontrivial interpretations of this "phase transition" [10].

But let us return to the subject. How could it be that neither Toeplitz nor Hausdorff realized that [24] contained a proof of the convexity of the numerical range? It is quite likely that Hausdorff overlooked the relevant part of [24]. However, the Commentary by S. D. Chatterji in Hausdorff's Collected Works [13] reveals a curious fact in this respect. The Hausdorff Archives in Greifswald contain two handwritten notes for [12], dated September 19 and October 12 of 1918. In one of them Hausdorff works out the numerical range of any two-by-two matrix. He shows, as Toeplitz had already done, that it is a (possibly degenerate) ellipse.

Bringing in the Geometry
My interpretation of the approach of [24] is as follows. Let $C$ be an $n \times n$ matrix, and let $W(C)$ be the numerical range. Toeplitz associates with $C$ a linear pencil of hermitian operators $H(\cdot)$, parametrized by the circle of directions. The highest eigenvalue, $\lambda(\cdot)$, of $H(\cdot)$ is the support function of $W(C)$. I will now explain this in detail.

Let $\langle \cdot, \cdot \rangle$ denote the standard scalar product on $\mathbb{C}^n$, linear (resp. antilinear) in the second (resp. first) argument. As usual, $\|u\| = \sqrt{\langle u, u \rangle}$. Let $C$ be an operator on $\mathbb{C}^n$ with the adjoint $C^*$, and let

\[ C = A + iB: \quad A^* = A, \; B^* = B \tag{1} \]

be the decomposition into hermitian operators. For $0 \le t < 2\pi$ set

\[ H(t) = \tfrac{1}{2}\left[ e^{-it} C + e^{it} C^* \right] = (\cos t) A + (\sin t) B. \tag{2} \]

The space of rays (i.e., oriented lines) in $\mathbb{R}^2$ is parametrized by $S^1 \times \mathbb{R}$ [22]. Namely, the ray $r(t, p)$ has direction $t$, and the signed distance $p$ from the origin. The notion of support lines is well known [1, 22]. I will associate with any compact set, $X \subset \mathbb{R}^2$, the family, $\sigma(t)$, $0 \le t < 2\pi$, of its support rays. For any $0 \le t < 2\pi$ the set of $p \in \mathbb{R}$ such that
³See [9] for historical comments on this terminology. The name "numerical range" is due to M. H. Stone [23].
⁴The German-English hybrids "eigenvalue, eigenvector" are the fortunate exceptions. . . . I don't know who coined them or how, but I am happy that I don't have to use the awkward "proper value, proper vector, characteristic number," etc.
⁵It could have been worse. F. D. Murnaghan refers to $W(C)$ as ". . . the region of the complex plane covered by these values under the hypothesis that . . ." [18].
⁶Thus, the patent on the joint numerical range belongs to Toeplitz and not to Hausdorff [5].
⁷A proof of the Toeplitz-Hausdorff theorem based on this idea is due to W. F. Donoghue [4]. He explicitly calculates the ellipse in question. An elegant calculation of $\partial W(C)$ if $n \le 3$ is due to Murnaghan [18]. Although he points out that $\partial W(C)$ is an ellipse when $n = 2$, Murnaghan is not concerned with the region $W(C)$ itself.
⁸Hausdorff's elegant argument is limited to finite dimensions, because he diagonalizes hermitian operators. The extension of the Toeplitz-Hausdorff theorem to infinite dimensions is due to Stone [23]. See [11] for a proof of N. P. Dekker [3] that combines Hausdorff's idea with the reduction to $\mathbb{C}^2$.
Figure 1. Support rays and the eigenvalues.
$r(t, p)$ intersects $X$ is compact; let $p(t)$ be the maximal such $p$. Then $\sigma(t) = r(t, p(t))$ is the support ray of $X$ in direction $t$. The following proposition is essentially Satz 8 of [24].

Proposition 1. Let $C = A + iB$ be an operator on $\mathbb{C}^n$ and let $H(t) = (\cos t) A + (\sin t) B$, $0 \le t < 2\pi$, be the associated pencil of hermitian operators. Let

\[ \lambda_n(t) \le \cdots \le \lambda_1(t) \tag{3} \]

be the eigenvalues of $H(t)$, and let $E_i(t) \subset \mathbb{C}^n$ be the eigenspace⁹ corresponding to $\lambda_i(t)$. Let $\sigma(t)$, $0 \le t < 2\pi$, be the support rays of $W(C)$. Then the intersection point of $\sigma(t + \pi/2)$ with $r(t, 0)$ is $\lambda_1(t)(\cos t, \sin t)$. Using this point as the origin in $\sigma(t + \pi/2)$, identify $\sigma(t + \pi/2)$ with $\mathbb{R}$. Then $\sigma(t + \pi/2) \cap W(C) \subset \mathbb{R}$ is the convex hull of the spectrum of the form $H(t + \pi/2)$ restricted to $E_1(t)$.

Proof. The unit circle acts on operators, $C \mapsto e^{-i\alpha} C$, and on $\mathbb{C}$, by rotations. The statement is equivariant with respect to these actions. Therefore, it suffices to verify the claims for the direction $t = 0$. We have $H(0) = A$, $H(\pi/2) = B$, the ray $r(0, 0)$ is the $x$-axis, and $\sigma(\pi/2)$ is the vertical ray supporting $W$ from the right. See Figure 1.

The points $z = x + iy$ of the numerical range have the form $z = \langle Cu, u \rangle$, $\|u\| = 1$. By (1), $x = \langle Au, u \rangle$, $y = \langle Bu, u \rangle$. Therefore, the projection of $W$ on the horizontal axis is the interval $[\lambda_n(A), \lambda_1(A)]$. The right extremity of this interval is the intersection point with the ray $\sigma(\pi/2)$. This proves one claim. The intersection of $\sigma(\pi/2)$ with $W$ is given by

\[ \{ \langle Au, u \rangle + i \langle Bu, u \rangle : \|u\| = 1, \; \langle Au, u \rangle = \lambda_1(A) \}. \]
9Another fortunate hybrid!
In view of the above, our subset of $\mathbb{R}$ is formed by $\langle Bu, u \rangle$, where $u$ runs through the unit sphere in $E_1(A)$. The numerical range of a hermitian operator is the convex hull of its spectrum. This proves the other claim. ∎

Proposition 1 has several far-reaching consequences. First of all, it implies that the outer boundary $\partial W(C)$ is convex [24]. Second, it describes the support rays of $W(C)$ via the eigenvalues of the hermitian pencil $H(\cdot)$ of (2). These support rays determine the convex hull of $\partial W(C)$. Since $W(C)$ is convex, as we now know, they determine the set $W(C)$ itself. Thus, Proposition 1 yields a description of the numerical range of $C$ in terms of the spectrum of the associated pencil $H(\cdot)$.

Since the publication of [24], many authors have developed this observation in several directions. One of these directions may be called algebro-geometric. Its starting point is the algebraic curve

\[ \det(xA + yB + zI) = 0. \tag{4} \]

This paper exploits another direction, which may be called "proper geometric." It takes off with an immediate corollary of Proposition 1. To formulate it, I will recall the notions of the support function and the width function of a convex set [1, 22]. Let $X \subset \mathbb{R}^2$ be convex and compact, and let $\sigma(t)$, $0 \le t \le 2\pi$, be the support rays of $X$. The distance between the parallel lines $\sigma(t)$, $\sigma(t + \pi)$ is the width of $X$ in direction $t$. The support function is the signed distance of $\sigma(t)$ to the origin. Denote the support and the width functions by $h(t)$ and $w(t)$, respectively. Then $w(t) = h(t) + h(t + \pi)$.

Corollary 1. Let $C$ be an operator on $\mathbb{C}^n$, let $H(\cdot)$ be the associated pencil of hermitian operators, and let $\lambda_n(\cdot) \le \cdots \le \lambda_1(\cdot)$ be the eigenvalues of $H(\cdot)$. Then the support and the width of the numerical range of $C$ are

\[ h(t) = \lambda_1(t - \pi/2), \qquad w(t) = \lambda_1(t - \pi/2) - \lambda_n(t - \pi/2). \tag{5} \]

Proof. Proposition 1 yields the first claim. The second follows from the first and the identity $H(t + \pi) = -H(t)$. ∎
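The formulas of Corollary 1 are easy to try out numerically. The sketch below is restricted to 2 × 2 matrices so that the eigenvalues of the hermitian pencil H(t) = (cos t)A + (sin t)B have a closed form and no linear-algebra library is needed. The function names (`hermitian_parts`, `pencil_eigenvalues`, `support`, `width`) and the two sample matrices are illustrative assumptions, not taken from the article.

```python
import math


def hermitian_parts(c):
    """Split a 2x2 matrix C into hermitian A, B with C = A + iB."""
    a = [[(c[i][j] + c[j][i].conjugate()) / 2 for j in range(2)] for i in range(2)]
    b = [[(c[i][j] - c[j][i].conjugate()) / 2j for j in range(2)] for i in range(2)]
    return a, b


def pencil_eigenvalues(c, t):
    """Return (lambda_2(t), lambda_1(t)), the eigenvalues of H(t) in increasing order."""
    a, b = hermitian_parts(c)
    h = [[math.cos(t) * a[i][j] + math.sin(t) * b[i][j] for j in range(2)]
         for i in range(2)]
    # Closed form for a 2x2 hermitian matrix [[p, q], [conj(q), r]].
    mean = (h[0][0].real + h[1][1].real) / 2
    radius = math.sqrt(((h[0][0].real - h[1][1].real) / 2) ** 2 + abs(h[0][1]) ** 2)
    return mean - radius, mean + radius


def support(c, t):
    """Support function h(t) = lambda_1(t - pi/2), as in formula (5)."""
    return pencil_eigenvalues(c, t - math.pi / 2)[1]


def width(c, t):
    """Width function w(t) = lambda_1(t - pi/2) - lambda_n(t - pi/2), as in formula (5)."""
    low, high = pencil_eigenvalues(c, t - math.pi / 2)
    return high - low


if __name__ == "__main__":
    nilpotent = [[0, 2], [0, 0]]   # every H(t) has eigenvalues -1 and +1
    sample = [[1, 2], [0, -1]]
    print(support(nilpotent, 0.0), width(nilpotent, 0.0))
    print(width(sample, 0.3), support(sample, 0.3) + support(sample, 0.3 + math.pi))
```

For the nilpotent sample the support function is constant, which is consistent with a circular disc; the second printed pair illustrates the identity w(t) = h(t) + h(t + π).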
Although the Toeplitz paper [24] is the precursor of both geometric directions, it was the work of R. Kippenhahn [16] that explicitly gave birth to them.¹⁰ From now on I will concentrate on the proper geometric direction, referring the reader to the literature on the algebro-geometric direction. See, for instance [19].¹¹
I will now use Corollary 1 to estimate the size of the numerical range of $C$ in terms of the standard attributes of the operator $C$. The size of a planar convex compact set $X$ is expressed via its area, diameter, breadth, and perimeter. Let $w(\cdot)$ be the width of $X$. The breadth and the diameter of $X$ are the minimum and the maximum of $w$, respectively. The perimeter and the area of $X$ are also controlled by the width function [1]. If $X = W(C)$, then $w(\cdot)$ is determined by the spectrum of the hermitian pencil $H(\cdot)$ which, in turn, is determined by the operator $C$. Among the standard attributes of $C$ are its spectrum $\sigma(C)$ and the operator norm $|C|$. The number $w(C) = \max_{\lambda_i, \lambda_j \in \sigma(C)} |\lambda_i - \lambda_j|$ is the diameter of the spectrum. For any $a, b \in \mathbb{C}$

\[ W(aC + b) = aW(C) + b. \]

Hence the size of the numerical range does not change under the transformations $C \mapsto C + tI$. Denote by $\mathcal{M}_n$ the linear space of operators on $\mathbb{C}^n$, and let $\mathcal{M}_n^0 \subset \mathcal{M}_n$ be the subspace of traceless operators. The function $|C|_0 = \min_{t \in \cdots}$ [...]

[...] $0.$ (15)
Denote by $\mathcal{L}$ the ray family $\{\sigma(t), 0 \le t < 2\pi\}$, where $\sigma(t)$ has direction $t + \pi/2$ and intersects $r(t, 0)$ at the point $\lambda(t)(\cos t, \sin t)$. The positivity condition (15) implies that the envelope, $\Lambda(\mathcal{L}) \subset \mathbb{C}$, is a strictly convex curve, with the parametric equations

\[ x(t) = \lambda(t) \cos t - \lambda'(t) \sin t, \qquad y(t) = \lambda(t) \sin t + \lambda'(t) \cos t. \tag{16} \]

Moreover, $\Lambda(\mathcal{L})$ is twice differentiable, and its radius of curvature is given by (10) [22, 1]. Since, by Proposition 1, $\mathcal{L}$ is the family of support rays of $W$, we have $\Lambda(\mathcal{L}) = \partial W$. ∎
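The parametric equations (16) can be checked on a small example. The sketch below assumes the sample matrix C = [[1, 2], [0, −1]], for which the top eigenvalue of the pencil works out to λ(t) = √(cos²t + 1), and approximates λ′(t) by a central difference; the function names and the step size `eps` are choices of this sketch. By the classical description of the numerical range of a 2 × 2 matrix as an ellipse with foci at the eigenvalues, the boundary here should be the ellipse x²/2 + y² = 1 (foci ±1, semi-axes √2 and 1), and the computed points are compared against that equation.

```python
import math


def top_eigenvalue(t: float) -> float:
    """lambda(t) for the sample C = [[1, 2], [0, -1]].

    Here A = [[1, 1], [1, -1]] and B = [[0, -i], [i, 0]], so
    H(t) = [[cos t, e^{-it}], [e^{it}, -cos t]] has eigenvalues +-sqrt(cos^2 t + 1).
    """
    return math.sqrt(math.cos(t) ** 2 + 1)


def boundary_point(t: float, eps: float = 1e-5) -> tuple[float, float]:
    """Point (x(t), y(t)) of the envelope given by the parametric equations (16)."""
    lam = top_eigenvalue(t)
    # Central difference approximation of lambda'(t).
    dlam = (top_eigenvalue(t + eps) - top_eigenvalue(t - eps)) / (2 * eps)
    x = lam * math.cos(t) - dlam * math.sin(t)
    y = lam * math.sin(t) + dlam * math.cos(t)
    return x, y


if __name__ == "__main__":
    for k in range(6):
        t = k * math.pi / 3
        x, y = boundary_point(t)
        print(f"t = {t:.3f}: x^2/2 + y^2 = {x * x / 2 + y * y:.8f}")
```

The numerical derivative is used only to keep the sketch short; differentiating λ(t) by hand would serve equally well.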
Not every operator C on en is Toeplitz regular. If C is normal, then W(C) is a polygon, hence it is not strictly con vex. By Theorem 3, normal matrices are not Toeplitz reg ular. In fact, by Theorem 2, the non-regularity of W(C) al ways has to do with a partial normality of C. Fortunately, there are plenty of Toeplitz regular operators.
Proposition 2. The complement to the set of Toeplitz regular operators in $\mathcal{M}_n$ is contained in a closed hypersurface.
Proof. Let $\mathcal{H}_n$ denote the space of n × n hermitian operators. By (1), $\mathcal{M}_n = \mathcal{H}_n \oplus i\mathcal{H}_n$. Replacing cos t, sin t in (2) by independent variables, we obtain an algebraic mapping, … → C + ti.

Concluding Remarks
Although the bounds of Theorem 4 improve those of Corollary 2 by the factor of $4/\pi$, they are still very rough. The same or better bounds on the size of the numerical range W(C) can be obtained using elementary geometry. Let $X \subset \mathbb{C}$ be compact. Denote by r(X) the numerical radius of X, i.e., the radius of the smallest disc D(X), centered at (0,0) and containing X. Toeplitz proved in [24] that
$$\frac{|C|}{2} \le r(W(C)) \le |C|. \tag{21}$$
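The two-sided bound (21) is easy to probe numerically. The sketch below does so for two sample 2×2 matrices, assuming that |C| is the operator (spectral) norm and that the numerical radius can be computed as the maximum over t of the largest eigenvalue of the hermitian part of e^{-it}C. The function names, the sample matrices, and the sampling density are illustrative assumptions; sampling can only underestimate the radius slightly.

```python
import math

def top_eigenvalue(C, t):
    """Largest eigenvalue of the hermitian part of exp(-it) * C (2x2 case)."""
    (c11, c12), (c21, c22) = C
    w = complex(math.cos(t), -math.sin(t))
    h11 = (w * c11).real
    h22 = (w * c22).real
    h12 = (w * c12 + (w * c21).conjugate()) / 2
    return (h11 + h22) / 2 + math.hypot((h11 - h22) / 2, abs(h12))

def operator_norm(C):
    """Spectral norm of a 2x2 matrix: square root of the top eigenvalue of C*C."""
    (c11, c12), (c21, c22) = C
    g11 = abs(c11) ** 2 + abs(c21) ** 2
    g22 = abs(c12) ** 2 + abs(c22) ** 2
    g12 = complex(c11).conjugate() * c12 + complex(c21).conjugate() * c22
    return math.sqrt((g11 + g22) / 2 + math.hypot((g11 - g22) / 2, abs(g12)))

def numerical_radius(C, samples=720):
    """Sampled maximum of the support function over directions t in [0, 2*pi)."""
    return max(top_eigenvalue(C, 2 * math.pi * k / samples) for k in range(samples))

# Hypothetical examples: a generic triangular matrix and a nilpotent one.
for C in (((1, 2), (0, -1)), ((0, 2), (0, 0))):
    print(operator_norm(C) / 2, numerical_radius(C), operator_norm(C))
```

The nilpotent example is an extreme case: its numerical range is the unit disc while its norm is 2, so the lower bound in (21) is attained.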
Since $W(C) \subset D(W(C))$, (21) implies (20) and the inequality $\mathrm{Diameter}(W) \le 2|C|$. Invoking the invariance principle, we obtain (17) and the upper bound of (7). Set $W_1(C) = \{z_1 - z_2 : z_1, z_2 \in W(C)\}$. The set $W_1(C) \subset \mathbb{C}$ is symmetric about the origin and convex and satisfies [25]

$$W_1(C) = \{\langle Cu, v\rangle + \langle Cv, u\rangle : \|u\| = \|v\| = 1,\ \langle u, v\rangle = 0\}. \tag{22}$$

This implies

$$\mathrm{Diameter}(W(C)) = \max_{\|u\| = \|v\| = 1,\ \langle u, v\rangle = 0} |\langle Cu, v\rangle + \langle Cv, u\rangle|. \tag{23}$$

This in turn yields the bounds

$$\mathrm{Diameter}(W(C)) \le \max_{\|u\| = \|v\| = 1,\ \langle u, v\rangle = 0} 2|\langle Cu, v\rangle| \le 2|C|. \tag{24}$$
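The bound Diameter(W(C)) ≤ 2|C| can be checked against the width function, since the diameter and the breadth of a convex set are the maximum and the minimum of its width. The sketch below assumes that the width of W(C) in direction t is the gap between the largest and smallest eigenvalues of the hermitian part of e^{-it}C, and that |C| is the spectral norm. The names `width` and `operator_norm` and the sample matrix are hypothetical, and the 2×2 restriction is only for closed-form eigenvalues.

```python
import math

def width(C, t):
    """Width of W(C) in direction t: spread of the eigenvalues of H(t) (2x2 case)."""
    (c11, c12), (c21, c22) = C
    w = complex(math.cos(t), -math.sin(t))
    h11 = (w * c11).real
    h22 = (w * c22).real
    h12 = (w * c12 + (w * c21).conjugate()) / 2
    # The two eigenvalues are mean +/- radius, so the spread is twice the radius.
    return 2 * math.hypot((h11 - h22) / 2, abs(h12))

def operator_norm(C):
    """Spectral norm of a 2x2 matrix via the top eigenvalue of C*C."""
    (c11, c12), (c21, c22) = C
    g11 = abs(c11) ** 2 + abs(c21) ** 2
    g22 = abs(c12) ** 2 + abs(c22) ** 2
    g12 = complex(c11).conjugate() * c12 + complex(c21).conjugate() * c22
    return math.sqrt((g11 + g22) / 2 + math.hypot((g11 - g22) / 2, abs(g12)))

# Hypothetical test matrix; directions t in [0, pi) cover every width once.
C = ((1, 2), (0, -1))
widths = [width(C, math.pi * k / 360) for k in range(360)]
diameter, breadth = max(widths), min(widths)
print(diameter, breadth, 2 * operator_norm(C))
```

For this matrix the numerical range is an ellipse with semi-axes √2 and 1, so the sampled diameter and breadth should be close to 2√2 and 2, comfortably below 2|C| = 2 + 2√2.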
Invoking the same invariance principle, we obtain from (24) the upper bound of (7). There are other approaches to estimating the size of W(C). For instance, [2] employs the Gershgorin disc theorem to obtain quadratic bounds on the area of W(C) for certain nilpotent matrices. In view of these results and those of [16], of course, the main justification of Theorem 3 is not in the bounds on the size of the numerical range that it yields. The justification is the elegant formula (10) for the curvature of the boundary of the numerical range. The estimates (17) follow from it by very crude estimates. The formula (10) seems to be novel. My only "precursor" M. Fiedler identified in spectral terms the boundary curvature of numerical range in special cases [7, 8]. There is no immediate relationship between his formulas and (10). I hope that (10) will find other applications to the remarkable subject that grew out of the Toeplitz-Hausdorff theorem.

It goes without saying that geometric considerations pervade the literature on numerical range. Several researchers have used the ideas above for purposes other than estimating the size of W(C). For instance, in [17] (16) helps to uncover new examples of domains satisfying the famous "porism of Poncelet."15

Before stopping, I will give unsolicited advice to the reader. There is a pervasive custom of concentrating on the latest literature while doing research. I am no exception to this rule. However, my experience with the study of numerical range brought me to the conclusion:

It is useful to read the work of "founding fathers"!

15 A related way of using the numerical range to construct such examples is presented in [27].

REFERENCES

[1] T. Bonnesen and W. Fenchel, Theorie der konvexen Körper, Springer-Verlag, Berlin, 1974.
[2] M.-T. Chien, Y.-H. Lin, On the area of numerical range, Soochow J. Math. 26 (2000), 255-269.
[3] N. P. Dekker, Joint numerical range and joint spectrum of Hilbert space operators, Dissertation, Free University of Amsterdam, 1969.
[4] W. F. Donoghue, Jr., On the numerical range of a bounded operator, Mich. Math. J. 4 (1957), 261-263.
[5] A. Feintuch and A. Markus, The Toeplitz-Hausdorff theorem and robust stability theory, Math. Intelligencer 21 (1999), 33-36.
[6] L. Fejér, Über gewisse durch die Fouriersche und Laplacesche Reihe definierten Mittelkurven und Mittelflächen, Rend. Circ. Matem. Palermo 38 (1914), 79-97.
[7] M. Fiedler, Geometry of the numerical range of matrices, Lin. Alg. Appl. 37 (1981), 81-96.
[8] M. Fiedler, Numerical range of matrices and Levinger's theorem, Lin. Alg. Appl. 220 (1995), 171-180.
[9] K. E. Gustafson and D. K. M. Rao, Numerical Range, Springer-Verlag, Berlin, 1997.
[10] E. Gutkin, E. Jonckheere, and M. Karow, Convexity of the joint numerical range: Topological and differential geometric viewpoints, preprint, 2002.
[11] P. Halmos, A Hilbert space problem book, Springer-Verlag, New York, 1982.
[12] F. Hausdorff, Der Wertvorrat einer Bilinearform, Math. Zeitschrift 3 (1919), 314-316.
[13] F. Hausdorff, Gesammelte Werke, Band IV: Analysis, Algebra und Zahlentheorie, Springer-Verlag, Berlin, 2001.
[14] R. Horn and C. Johnson, Topics in Matrix Analysis, Cambridge University Press, Cambridge, 1991.
[15] E. Jonckheere, F. Ahmad, and E. Gutkin, Differential topology of numerical range, Lin. Alg. Appl. 279 (1998), 227-254.
[16] R. Kippenhahn, Über den Wertevorrat einer Matrix, Math. Nachr. 6 (1951), 193-228.
[17] B. Mirman, V. Borovikov, L. Ladyzhensky, and R. Vinograd, Numerical ranges, Poncelet curves, invariant measures, Lin. Alg. Appl. 329 (2001), 61-75.
[18] F. D. Murnaghan, On the field of values of a square matrix, Proc. Natl. Acad. Sci. USA 18 (1932), 246-248.
[19] H. Nakazato and P. Psarrakos, On the shape of numerical range of matrix polynomials, Lin. Alg. Appl. 338 (2001), 105-123.
[20] D. H. Owens, The numerical range: a tool for robust stability studies?, Sys. Control Lett. 5 (1984), 153-158.
[21] M. G. Safonov, Stability robustness of multivariable feedback systems, MIT Press, Cambridge, MA, 1980.
[22] L. A. Santaló, Integral geometry and geometric probability, Addison-Wesley, London, 1976.
[23] M. H. Stone, Linear transformations in Hilbert space and their applications to analysis, A.M.S., New York, 1932.
[24] O. Toeplitz, Das algebraische Analogon zu einem Satze von Fejér, Math. Zeitschrift 2 (1918), 187-197.
[25] N.-K. Tsing, Diameter and minimal width of the numerical range, Lin. Mult. Alg. 14 (1983), 179-185.
[26] A. Wintner, Spektraltheorie der unendlichen Matrizen, Verlag S. Hirzel, Leipzig, 1931.
[27] P. Y. Wu, Polygons and numerical ranges, Amer. Math. Monthly 107 (2000), 528-540.

AUTHOR
EUGENE GUTKIN

THE MATHEMATICAL INTELLIGENCER, VOLUME 26, NUMBER 1, 2004
Theories of Vision Enul