Meromorphic functions and linear algebra


Author: Olavi Nevanlinna


FIELDS INSTITUTE MONOGRAPHS
THE FIELDS INSTITUTE FOR RESEARCH IN MATHEMATICAL SCIENCES

Meromorphic Functions and Linear Algebra Olavi Nevanlinna

American Mathematical Society
Providence, Rhode Island

The Fields Institute for Research in Mathematical Sciences

The Fields Institute is named in honour of the Canadian mathematician John Charles Fields (1863-1932). Fields was a visionary who received many honours for his scientific work, including election to the Royal Society of Canada in 1909 and to the Royal Society of London in 1913. Among other accomplishments in the service of the international mathematics community, Fields was responsible for establishing the world's most prestigious prize for mathematics research, the Fields Medal.

The Fields Institute for Research in Mathematical Sciences is supported by grants from the Ontario Ministry of Education and Training and the Natural Sciences and Engineering Research Council of Canada. The Institute is sponsored by McMaster University, the University of Toronto, the University of Waterloo, and York University, and has affiliated universities in Ontario and across Canada.

2000 Mathematics Subject Classification. Primary 30G30, 47A10, 47B10, 65F10.

Library of Congress Cataloging-in-Publication Data

Nevanlinna, Olavi, 1948-
Meromorphic functions and linear algebra / Olavi Nevanlinna.
p. cm. - (Fields Institute monographs, ISSN 1069-5273 ; 18)
Includes bibliographical references.
ISBN 0-8218-3247-6 (acid-free paper)
1. Functions, Meromorphic. 2. Algebras, Linear. I. Title. II. Series.

QA331.N456 2003
515'.982-dc21
2002041519

Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy a chapter for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for such permission should be addressed to the Acquisitions Department, American Mathematical Society, 201 Charles Street, Providence, Rhode Island 02904-2294, USA. Requests can also be made by e-mail to reprint-permission@ams.org.

© 2003 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America.

The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.
This publication was prepared by The Fields Institute.
Visit the AMS home page at http://www.ams.org/

10 9 8 7 6 5 4 3 2 1    08 07 06 05 04 03

CONTENTS

PREFACE

PROLOGUE
  Tapping away in an evening at Djursholm
  What does an epsilon weigh?
  Red wine at the Stock Exchange Club
  Ice-cream in Madison
  Exact equality
  The deficiency of values
  Zurich
  Beautiful to look at, but ...
  The unbearable ease of using norms
  Centenary Colloquium in Joensuu
  Two basic tasks, stability first
  And then accelerating the iteration
  Factoring the resolvent
  In the Hermann Weyl lecture hall
  A quiet life in Warsaw
  Finally, in Kirkkonummi

FIRST CHAPTER
  Resolvent
  Cauchy-integral

SECOND CHAPTER
  Entire functions
  Taylor coefficients
  Meromorphic functions
  The first main theorem
  Cartan's identity
  Order and type for meromorphic functions
  Boutroux-Cartan lemma
  Bound along a circle
  Representation theorems

THIRD CHAPTER
  Analytic vector valued functions
  Subharmonic functions
  Meromorphic vector valued functions
  Rational functions
  When is the inverse also meromorphic
  A simple estimate for matrices

FOURTH CHAPTER
  A product form for matrices
  Singular value decomposition
  Basic inequalities for singular values and eigenvalues
  The total logarithmic size of a matrix
  Some basic properties of the total logarithmic size
  Direct sum, Kronecker product and Hadamard product

FIFTH CHAPTER
  The total logarithmic size is subharmonic
  Behavior near poles
  Introducing T_1 for matrix valued functions
  Basic identity for inversion
  Extension to trace class
  How to work outside the trace class

SIXTH CHAPTER
  Perturbation results
  Special results for resolvents
  Powers and their resolvents
  Bounded characteristics
  What if small perturbation means small in norm

SEVENTH CHAPTER
  Combining a scalar function with an operator
  Representing F as G/g
  Representations for the resolvent
  Decay of spectral polynomials
  Robust bounds for Krylov solvers
  A bound for spectral projectors

EIGHTH CHAPTER
  Approximate polynomial degree of an analytic function
  Some properties of the approximate polynomial degree
  Approximate rational degree of a meromorphic function
  Spijker's lemma
  Power bounded operators and bounds for the Laurent coefficients

NINTH CHAPTER
  Growth of associated scalar functions
  Locally algebraic and locally almost algebraic operators

TENTH CHAPTER
  Exceptional values
  Simple asymptotics for resolvents of matrices
  Eigenvalues and exceptional values
  Deficient operators

EPILOGUE
  Lecturing and typing in Toronto
  Fishing and finishing in Karjalohja

BIBLIOGRAPHY

PREFACE

This monograph is based on lectures which were given in two phases. In the fall of 1995 I gave a series of lectures at the Helsinki University of Technology and in October 2001 at the Fields Institute in Toronto.

With this monograph I hope to demonstrate that viewing the resolvent of a matrix as a meromorphic function, rather than just as analytic outside the eigenvalues, gives a lot of new insight. In low rank perturbations the eigenvalues - and pseudospectra - may move dramatically but underneath there is still much which is preserved. Since this has practical implications, e.g. for preconditioning, I am trying to present the ideas in a simple and self-contained form, accessible to researchers in the numerical linear algebra community. However, some of the results are more natural to set up in infinite dimensional spaces as the asymptotics are then richer.

The monograph is organized as follows. In the first chapter the resolvent is explicitly written down. The second chapter gives a summary of elementary value distribution theory - without going into the second main theorem. The third chapter then discusses vector valued analytic and meromorphic functions. The main new "tool", the total logarithmic size of a matrix, is introduced in chapter four. It is a nonlinear tool for linear algebra and it allows one to generalize the first main theorem from the scalar valued case to matrix valued functions. This is done in chapter five. In chapter six we discuss some applications and show in particular that the growth of the resolvent as a meromorphic function is robust under low rank perturbations. The seventh chapter first discusses operators of the form

$$z \mapsto f(zA)$$

where $f(z)$ is a scalar meromorphic function and $A$ a bounded operator whose resolvent is a meromorphic function. Another topic discussed is bounds for Krylov methods for solving $x = Ax + b$. We connect the decay of the bounds to the growth of the resolvent as a meromorphic function, and as this growth is robust under low rank perturbations, so are our bounds. Chapter eight brings a new tool into approximation theory. The growth of a meromorphic function is studied by approximating it by rational functions. The results are then applied to the Kreiss matrix theorem, power boundedness and other related questions. In the ninth chapter we associate with a given operator valued meromorphic function $F$ the scalar functions $f_{x,y^*}$:

$$z \mapsto y^*(F(z)x),$$

and ask whether there are unit vectors $x$, $y^*$ such that the growth of $F$ as a meromorphic function can be seen from the growth of $f_{x,y^*}$. The last chapter gives a link between the defects in value distribution theory and defective eigenvalues of a matrix.

In addition I have included an epilogue and a prologue to explain how I got the ideas in the first place. I can be reached via e-mail [email protected]. Some software is available at URL http://www.math.hut.fi/annex/.

Olavi Nevanlinna
Kirkkonummi, Finland
September 10, 2002

PROLOGUE

Key places: Stockholm, Helsinki, Madison, Zurich, Joensuu, Palo Alto, Warsaw, Kirkkonummi.

Tapping away in an evening in Djursholm

This has been a project of some sort, although I never consciously thought of it as such. Even with hindsight, I don't know how I would express it: write in an application form, converted into monthly salaries, plumped up with some overhead expenses. Submitted to the authorities in twelve copies, as now required in Finland. Or was it perhaps a desire to do something in the realm of Nevanlinna theory, a small message to the past and the future? Or stated like this: I'm trying to prove that the matrix functions

$$z \mapsto 1 - zA \tag{0.1}$$

and

$$z \mapsto (1 - zA)^{-1} \tag{0.2}$$

are equally large. Clear symptoms of the onset of value distribution! I could stick to mathematics. But, on the other hand ... One can only get rid of a story once it is fully constructed.

What does an epsilon weigh?

"Perhaps you could write about whether you became a mathematician because of your surname." That was how the editorial secretary of the journal Tiede-2000 asked me for an autobiographical article as the eighties gave way to the nineties. I did as I was asked, but the whole thing gained a hold on me. Later I noticed that in the same series of articles, an astronomer had speculated about the effect of her name - Tähtinen (little star in Finnish) - on her choice of career. So I wrote about my family, particularly my grandfather [N02]: "I studied at Helsinki University of Technology, like my father. There I acquired both a respectable profession and the opportunity to do mathematics. Further, into the bargain, I put distance between myself and the mathematical direction of my grandfather and his brother ... "

The editor picked as the title of the article the question of what weight epsilon has. Now, writing for my own pleasure, I have played with the idea of using as title the question: What does 2/c weigh?

At that stage, I wrote no more about great-uncle Rolf. He was a distant figure, rather stooped, smaller in stature than his older brother, my grandfather. They clearly enjoyed each other's company and respected each other. I never heard my grandfather lecture - that is, not counting the innumerable times he held forth at the head of the dinner table. In contrast, Rolf gave a series of public lectures in the early sixties on relativity theory, and my father took me to listen to them. The atmosphere was as exciting as at a concert.

Red wine at the Stock Exchange Club

The last time I met the brothers was before Christmas 1975. In those days my family and I lived throughout the academic year at the Mittag-Leffler Institute. Now, exactly twenty-two years later,¹ in the same place, as I type these notes, the meeting comes vividly to mind. One evening the telephone rang and my grandfather said in his clear, friendly but commanding way that when I had defended my dissertation in spring 1974, I had spoken so quietly that neither he nor Rolf had heard anything. They had just been discussing this, and wanted me to give a lecture when I came to Helsinki so that they too could have the opportunity to check me out. There was nothing for me to do but to telephone Olli Lehto, introduce myself and explain the situation.

I was soon in Helsinki, in a lecture hall at Helsinki University, nervously awaiting the beginning of the meeting of the Finnish Mathematical Society. During my post-doc time, I hadn't accumulated very much of great import to say. The previous evening I had agreed with my grandfather on the telephone that he, great-uncle Rolf and I would go after the talk to the Stock Exchange Club. What more pleasant way could there be to end the evening? The lecture hall was far from empty, but Rolf was nowhere to be seen and my grandfather's only presence was his portrait on the wall. After the lecture I walked the two blocks and there the brothers were, at the Stock Exchange Club: "Ah Olavi, there you are! Nice you could join us, have some wine. Rolf and I came along to the Club beforehand."

It was the nineties before I could talk about this incident. My grandfather had been the head of Pörssiklubi, the Stock Exchange Club. At one time the Club was visited frequently by the staff of Helsinki University. It was like an exclusive university club (but with a strict rule: no women allowed).

Ice-cream in Madison

Eighteen months later I was turning an ice-cream maker in the heat of the mid-west. David Drasin arrived, introduced himself and with very little preamble, asked about the relationship between my grandfather and his brother, their working relationship. I said I didn't know much about it, but that they evidently tried their thoughts out on each other fairly intensively. Drasin replied to the effect that he thought their relationship was considerably more that of equals than had seemed to be the case afterwards. Drasin had just solved the inverse defect problem. Ten years later he published an article almost a hundred pages long [Dr], with the following dedication:

In memoriam Frithiof Nevanlinna (1894-1977), Rolf Nevanlinna (1895-1980).

¹ This was originally written in 1997.

The main result of the paper was

Theorem 0.1 F. Nevanlinna's conjecture is correct.

Almost sixty years before, my grandfather had written [NF]: Es ist nicht unwahrscheinlich, dass dieses Resultat noch bestehen bleibt, auch wenn man mehrfache Werte zulässt, wenn nur die Summe der Verzweigungsindizes gleich Null oder, was auf dasselbe hinauskommt, die Defektsumme der Funktion gleich 2 ist. (It is not unlikely that this result still holds, even if one allows multiple values, if the sum of the ramification indices is 0 or, what amounts to the same thing, if the sum of the defects of the function is 2.) So what were those defects?

Exact equality

Polynomials do not have defects. Given a polynomial $p$ of degree $d$ the equation

$$p(z) = a \tag{0.3}$$

always has exactly $d$ solutions, independently of the complex number $a$, provided we count the solutions according to their multiplicities.

The opening moments of the value distribution theory of analytic functions are to be found in the work of Weierstrass and Picard in the 1870s. Picard showed that an analytic function takes, in the neighbourhood of an essential singularity, all values with at most two exceptions. For example, around infinity the exponential function takes all values except 0 and $\infty$. These are called Picard exceptional values. When dealing with analytic functions $f$, the growth function

$$M(r, f) := \max_{|z| \le r} |f(z)| \tag{0.4}$$

is crucial. This, however, is not suitable for handling meromorphic functions, because $|f(z)|$ becomes arbitrarily large near any pole. In 1925, Rolf Nevanlinna published a hundred page article [NR1] in the journal Acta Mathematica, founded by Mittag-Leffler, in which he established at one stroke the basis of the value distribution theory of meromorphic functions, the Nevanlinna theory. There the growth function $M(r, f)$, or rather $\log M(r, f)$, is replaced by the Nevanlinna characteristic function $T(r, f)$.

Let us examine with the help of a simple example why $T$ satisfies an identity. Consider the function $f(z) := 1 - z$. Now the following holds

[...] $r \to \infty$ (0.15)

It follows from the second main theorem that there can exist only countably many defective values $a$ and that the sum of deficiencies $\delta(a)$ is bounded from above by 2, i.e.

$$\sum_a \delta(a) \le 2. \tag{0.16}$$

If a function has two Picard exceptional values, then at those points $\delta(a) = 1$ and from (0.16) we conclude that all other values are nondefective. So that's this defect.

But is there something significant about this, generally that is, not just in Finland? Hermann Weyl wrote that Rolf Nevanlinna's creation was one of our century's greatest mathematical achievements. That's a strong statement, even though only two fifths of that century had passed at the time it was made. In Spring 1996, I was enjoying the Sunday evening peace of a book shop in Palo Alto, when I found in my hands a new, posthumously published book by Lee A. Rubel, [Ru]. There, Rubel states that his favorite theorem in all of mathematics is a theorem of Rolf Nevanlinna:

Theorem 0.2 If two functions, meromorphic in the whole complex plane, share five distinct values, then the two functions must be equal.

Note that $e^z$ and $e^{-z}$ share 0, $\infty$, 1 and -1, so the number five is sharp. This theorem is a consequence of the second main theorem.

Zurich

Rolf Nevanlinna worked on several different occasions in Zurich. The first time was at ETH and later at the university right next to it. When he died in 1980, a Memorial Colloquium was arranged in Zurich the following year, and I was able to travel to it from New York. At that time, I was working on numerical methods for initial value problems and, with my friend, Rolf Jeltsch, had used function theory to show that the accurate numerical solution of initial value problems (within certain rules) necessarily required a lot of work. This was based on viewing the numerical solution of a simple test equation as an approximation problem for the exponential, using algebraic functions. The values of an algebraic function are distributed very differently from those of the exponential function. In particular, they don't bend easily to look like the exponential near the origin if required at the same time to be small in the whole left half plane.

Jeltsch and I took part in the Memorial Colloquium as backbenchers. All who shared the Nevanlinna surname were exempted from the registration fee; this meant some relatives had traveled from Finland. Rolf Jeltsch tried in vain for exemption on the basis of his first name. The restrained and dignified respect which shone through several of the Colloquium's speeches made an impression. It seemed inappropriate that I didn't understand very much of the subject. The defect had to be corrected.

Beautiful to look at, but ...

I studied function theory for domestic purposes by giving a series of lectures on the subject at Helsinki University of Technology in 1982. The theory of value distribution seemed very beautiful and I dreamed that it would be romantically pleasant to allude to it in some work. The years passed but no opportunity presented itself. The trail laid by Rolf Nevanlinna, Lars Ahlfors and several others very quickly led to value distribution theory being virtually complete; its further development was mainly in the direction of extensions. Such, for example, is Hermann and Joachim Weyl's theory of meromorphic curves and Seppo Rickman's results for quasi-regular mappings. Some applications were made, particularly for differential equations in the complex plane, but even so, as a whole, the impression remained that the characteristic function $T$ has been less often used as a mathematical tool than the beauty of related results would have led one to expect.

What is beautiful is useful, functional. That is the rule; and Alvar Aalto's Paimio chair is an exception. That's how one has to be able to think. One uses only those instruments which one has learned to "play" during one's studies. And when one's own tools are similar to those of others in the field, everything is all right, and meets the common norms.

The unbearable ease of using norms

Usually a natural task for applications is to try to answer how large a given function is at a given point, what are its extreme values etc. In modern analysis one often uses functional analysis as a basic tool to formulate and to get a geometric feeling of the problem. This leads us to emphasize linearity and tools which are effective for linear problems - the unbearable ease of using norms. For example, in numerical linear algebra, much attention is paid to the fact that the algorithms written are invariant relative to the scaling of the task. This is often important in itself; but on the other hand one then finds non-scalable mathematical tools difficult and unnatural. For example, at first glance $T(r, f)$ looks like a complicated tool. It speaks precisely and beautifully about a certain logarithmic average, when in practice one would like to know, even imprecisely, about the maximum value. Browsing through this, it may remain unnoticed that if $f$ is analytic then the subharmonicity of $\log |f|$ leads immediately to

$$T(r, f) \le \log^+ M(r, f) \le \frac{\theta + 1}{\theta - 1}\, T(\theta r, f). \tag{0.17}$$

This is indeed to be found in all textbooks on the subject, but as soon as one looks for slightly more advanced estimates, it is really difficult to find them. The careful arrangement of a studio aiming to ensure functionality is not the same thing as just setting up a museum. Peter Henrici took on such a task. He wrote an extensive series of books on "Computational Complex Analysis", but unfortunately he did not deal with the Nevanlinna theory at all. On the other hand, Henrici expressed great admiration at the Memorial Colloquium in Zurich for Rolf Nevanlinna's doctoral dissertation: the simple elegance of the construction found within it.

The Pick-Nevanlinna interpolation is nowadays an important tool in modern control theory. Sometime in the mid-eighties, I gave a seminar talk on this subject at the University of Helsinki. I explained how and why the interpolation task arises in control theory. I suggested that we in Finland would be the right ones to pick up this problem area. I drew a table on the board, complete with the legs required, with even a vase on the table. All that was needed was a flower arrangement. I suggested that, with the development of computers, the tools used in scientific computing had relied too much on the results gained from real and functional analysis.
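The two-sided estimate (0.17), in the form $T(r, f) \le \log^+ M(r, f) \le \frac{\theta+1}{\theta-1} T(\theta r, f)$ for $\theta > 1$, can be tried out numerically on the simplest entire function. The sketch below is only an illustration under stated assumptions: the choice $f = \exp$, the midpoint quadrature, and all function names are hypothetical and not taken from the book. For $\exp$ there are no poles, so $T$ reduces to the mean of $\log^+|f|$ over the circle, and $\log|e^z| = \mathrm{Re}\, z$.

```python
import math


def characteristic_of_exp(r, samples=20000):
    """Midpoint-rule estimate of T(r, exp), the mean of log^+|exp(r e^{i phi})|.

    exp has no poles, so T consists of the proximity term only,
    and log|exp(z)| = Re z = r cos(phi).
    """
    total = 0.0
    for k in range(samples):
        phi = 2.0 * math.pi * (k + 0.5) / samples
        total += max(r * math.cos(phi), 0.0)
    return total / samples


def log_plus_max_modulus_of_exp(r):
    """log^+ M(r, exp): the maximum of |exp(z)| on |z| <= r is exp(r)."""
    return max(r, 0.0)


def check_estimate(r, theta):
    """Return the three quantities compared in the two-sided estimate."""
    lower = characteristic_of_exp(r)
    middle = log_plus_max_modulus_of_exp(r)
    upper = (theta + 1.0) / (theta - 1.0) * characteristic_of_exp(theta * r)
    return lower, middle, upper


if __name__ == "__main__":
    for theta in (1.5, 2.0, 4.0):
        lower, middle, upper = check_estimate(3.0, theta)
        print(theta, lower, middle, upper, lower <= middle <= upper)
```

For this function the exact value is $T(r, \exp) = r/\pi$, so the logarithmic average underestimates $\log M(r, \exp) = r$ by the fixed factor $\pi$, while the right-hand side of the estimate overshoots by a factor depending only on $\theta$.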

Centenary Colloquium in Joensuu

The year 1994 marked the hundredth anniversary of my grandfather's birth. Since Rolf was one year younger I noticed that I had one year in which to write up a small application of value distribution theory for the XVIth Rolf Nevanlinna Colloquium to be held in Joensuu. I had already thought of the subject. Let us look at the following inequality.

Lemma 0.1 (Spijker [Sp]) If $f$ is a rational function of degree $d$, and $\Gamma$ is a circle, then

$$\frac{1}{2\pi} \int_\Gamma |f'(z)|\,|dz| \le d\, \sup_\Gamma |f(z)|. \tag{0.18}$$

Here the degree of a rational function is just the maximum of the degrees of the numerator and the denominator. This is an extremely useful tool as the values of the rational function on both sides of the inequality enter on the same circle. But what if in place of "degree", we were to write a characteristic to depict the fact that "the meromorphic function looks just like a rational function of degree $d$". I formulated a precise definition for this.
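Spijker's lemma, read as $\frac{1}{2\pi}\int_\Gamma |f'(z)|\,|dz| \le d \sup_\Gamma |f(z)|$ with $d$ the larger of the numerator and denominator degrees, lends itself to a quick numerical check. The following is a minimal sketch, assuming the circle is centred at the origin; the helper names and the two sample functions are hypothetical choices for illustration. Polynomials are given as coefficient lists, highest degree first, without leading zeros.

```python
import cmath
import math


def polyval(coeffs, z):
    """Evaluate a polynomial (coefficients highest degree first) by Horner's rule."""
    acc = 0j
    for c in coeffs:
        acc = acc * z + c
    return acc


def polyder(coeffs):
    """Coefficients of the derivative polynomial."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def spijker_sides(num, den, radius=1.0, samples=20000):
    """Return (lhs, rhs) of Spijker's inequality for f = num/den on |z| = radius.

    lhs approximates (1/2pi) * integral of |f'(z)| |dz| by the midpoint rule,
    rhs is degree * (sampled supremum of |f|).
    """
    degree = max(len(num), len(den)) - 1
    dnum, dden = polyder(num), polyder(den)
    total = 0.0
    sup = 0.0
    for k in range(samples):
        z = radius * cmath.exp(2j * math.pi * (k + 0.5) / samples)
        p, q = polyval(num, z), polyval(den, z)
        fprime = (polyval(dnum, z) * q - p * polyval(dden, z)) / (q * q)
        total += abs(fprime)
        sup = max(sup, abs(p / q))
    lhs = radius * total / samples      # |dz| = radius * d(phi)
    return lhs, degree * sup


if __name__ == "__main__":
    print(spijker_sides([1, 0, 0, 0], [1]))      # f(z) = z^3: both sides equal 3
    print(spijker_sides([1, 0, 1], [1, -2]))     # f(z) = (z^2 + 1)/(z - 2)
```

The monomial $z^3$ shows that the inequality can hold with equality, while for $(z^2+1)/(z-2)$ the left side stays well below the bound.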

Definition 0.1 If $f$ is meromorphic for $|z| < R$, and $r < R$, then

$$d(r, f) := \min\{\deg(q) \mid M(r, f - q) \le 1\}, \tag{0.19}$$

where the minimization is taken over rational functions $q$.

We can now very easily obtain an analogue of Spijker's lemma for meromorphic functions by just approximating the meromorphic function by a rational one, but where can we obtain the value for $d(r, f)$? Fortunately, it turned out that $d$ and $T$ could be estimated in terms of each other.

Lemma 0.2 [N03] There exist functions $C_1$ and $C_2$ with the following property. If $f$ is meromorphic for $|z| < R$, $f(0) \ne \infty$, and $r < \theta r < R$, then

$$d(r, f) \le C_1(\theta)\, T(\theta r, f) + C_2(\theta) \tag{0.20}$$

and

$$T(\theta r, f) \le T(r, f) + d(\theta r, f) \log \theta + 2 \log 2. \tag{0.21}$$

With such a tool I decided to brave the Joensuu Colloquium. Before my lecture, I found myself nervous and wondering whether the dead brothers would be listening.

Two basic tasks, stability first

I wanted to demonstrate, at least to myself, that the characteristic function $T$ can be used effectively in estimating functions of matrices and of linear operators. A meromorphic function arises naturally as follows. Let $A$ be a bounded linear operator in a Hilbert space and $x$ and $y$ be two vectors in that space. Then to the resolvent $(\lambda - A)^{-1}$ we can associate a scalar function

$$f_{x,y}: \lambda \mapsto ((\lambda - A)^{-1} x, y) \tag{0.22}$$

and use it for example to estimate the powers of $A$. In fact

$$\|A^n\| = \sup_{\|x\| = \|y\| = 1} |(A^n x, y)| = \sup_{\|x\| = \|y\| = 1} \left| \frac{1}{2\pi i} \int_\gamma \lambda^n f_{x,y}(\lambda)\, d\lambda \right| \tag{0.23}$$

where the path $\gamma$ surrounds the spectrum of $A$. Estimating the powers of an operator $A$ is one of the most crucial tasks in numerical analysis. Let us look at boundedness of the powers as an example. If there is a $C$ such that

$$\|A^n\| \le C \quad \text{for } n = 0, 1, 2, \dots, \tag{0.24}$$

then it follows immediately that

$$\|(\lambda - A)^{-1}\| \le \frac{C}{|\lambda| - 1} \tag{0.25}$$

when $|\lambda| > 1$. Stability tasks in numerical analysis often lead to a situation where an estimate (0.25) is known, for example, so that the constant $C$ is given for the whole family of operators $A$. Are the powers of these operators then bounded, perhaps with a constant depending only on the $C$ in (0.25)? It is not difficult to see that the estimate (0.25) gives with the aid of equation (0.23)

$$\|A^n\| \le C e (n + 1). \tag{0.26}$$

Here there is a possible linear growth present, controlled, however, with the constant $C$. In a $d$-dimensional space the growth, however, saturates so

$$\|A^n\| \le C e d. \tag{0.27}$$

This is related to Spijker's Lemma in such a way that now $f_{x,y}$ in (0.23) is a rational function of (at most) degree $d$.

At the beginning of the sixties, Heinz-Otto Kreiss presented a theorem in which it was shown that from the estimate (0.25) in finite dimensional spaces the uniform boundedness of powers follows by a constant which depends only on the dimension of the space and the given constant $C$. Finding the lowest possible dimensional dependence became almost a race; Spijker brought it to an end by giving the answer (0.27). I relate this as an example of a phenomenon that is not unusual. Namely: in the excitement of the chase one tends gradually to lose sight of the obvious fact that the dimension may not be any "correct" parameter in a theorem like this. Particularly since it was thought of from the start as dealing with matrices (or their inverses) generated mainly from discretising partial differential equations. In such a situation the "family" of matrices for which a resolvent condition would be established would not operate in a fixed dimensional space since the refinement of the discretisation would cause the dimension to grow. So what would be the "correct", practically important parameter to replace the dimension in the Kreiss Matrix Theorem?

Definition 0.2 If $A$ is a bounded linear operator in a Hilbert space, then its singular values $\sigma_j(A)$ are given by

$$\sigma_j(A) := \inf_{\deg(A_j) < j} \|A - A_j\|. \tag{0.28}$$

There is an equivalent definition for singular values which is often used in particular for matrices. They can also be defined as the square roots of eigenvalues of the operator $A^*A$. Let me recall that an operator is compact if and only if its singular values tend to 0 as the index $j \to \infty$. Also, let me introduce, whenever finite,

$$|||A|||_p := \Big( \sum_j \sigma_j(A)^p \Big)^{1/p} \tag{0.29}$$

so that $p = 1$ gives the trace norm, $p = 2$ the Hilbert-Schmidt norm, and so on. Here is a simple version of the Kreiss matrix theorem for Hilbert spaces.

Theorem 0.3 For each $p > 0$ there exists a constant $C_p$ with the following property. If we know that the resolvent condition (0.25) holds and that $|||A|||_p < \infty$, then the powers of $A$ are bounded and for all $n = 0, 1, 2, \dots$

[...] (0.30)

This can be proved [N03] by applying the modification of Spijker's lemma and Lemma 0.2 to the function $f_{x,y}$.
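The interplay between power boundedness and the resolvent condition can be explored on a small example. The sketch below assumes the standard forms of the two statements, namely that $\|A^n\| \le C$ for all $n$ implies $(|\lambda| - 1)\|(\lambda - A)^{-1}\| \le C$ for $|\lambda| > 1$, and that in dimension $d$ a resolvent constant $K$ gives $\|A^n\| \le e\,d\,K$. The $2 \times 2$ matrix, the sampling grid and all function names are hypothetical, and the sampled constant is only a lower estimate of the true supremum.

```python
import cmath
import math


def norm2(m):
    """Spectral norm of a 2x2 matrix from its Frobenius norm and determinant."""
    (a, b), (c, d) = m
    frob2 = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det2 = abs(a * d - b * c) ** 2
    disc = max(frob2 * frob2 - 4.0 * det2, 0.0)
    return math.sqrt((frob2 + math.sqrt(disc)) / 2.0)


def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return [[x[i][0] * y[0][j] + x[i][1] * y[1][j] for j in range(2)] for i in range(2)]


def resolvent(m, lam):
    """(lam - m)^{-1} for a 2x2 matrix m, via the adjugate formula."""
    (a, b), (c, d) = m
    p, q, r, s = lam - a, -b, -c, lam - d
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def sup_power_norm(m, nmax=200):
    """max of ||m^n|| over n = 0, ..., nmax."""
    power = [[1.0, 0.0], [0.0, 1.0]]
    best = norm2(power)
    for _ in range(nmax):
        power = matmul2(power, m)
        best = max(best, norm2(power))
    return best


def sampled_kreiss_constant(m, radii, angles=360):
    """Largest sampled value of (|lam| - 1) * ||(lam - m)^{-1}|| over |lam| in radii."""
    best = 0.0
    for rho in radii:
        for k in range(angles):
            lam = rho * cmath.exp(2j * math.pi * k / angles)
            best = max(best, (rho - 1.0) * norm2(resolvent(m, lam)))
    return best


# Hypothetical test matrix: a Jordan-type block with spectral radius 1/2
A = [[0.5, 1.0], [0.0, 0.5]]
C = sup_power_norm(A)
K = sampled_kreiss_constant(A, [1.01, 1.1, 1.5, 2.0, 5.0, 20.0])

if __name__ == "__main__":
    print("sup ||A^n|| =", C)
    print("sampled resolvent constant =", K)
    print("e * d * K =", math.e * 2 * K)
```

For this matrix the largest power norm is attained at $n = 1$, and the bound $e\,d\,K$ is far from tight; the point of the example is only to make the three quantities concrete.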

And then accelerating the iteration

Another basic task, which has the same elements as above, relates to the speed of the iterative solution of linear equations. I investigated the subject at the beginning of the nineties and wrote a book about my research [N01]. Here it suffices to understand that optimally accelerated iteration may behave in different ways at different stages. At first we can observe fast but decelerating sublinear behaviour. I have compared this to the behaviour of an analytic semigroup. Next one typically might encounter the linear stage, and this is quite well known - to explain it, one can make use of potential theory, outside the spectrum. At the end of the iteration, the spectrum of the operator begins to appear in detail and then it is natural to change to consider the resolvent as a meromorphic function.

Let the model problem be given thus: if $p_k(\lambda) = \lambda^k + a_{k,1} \lambda^{k-1} + \dots + a_{k,k}$ is a monic polynomial such that its norm $\|p_k(A)\|$ at the operator $A$ is as small as possible, then how quickly does the norm decrease when the degree $k$ increases? The next example, which I calculated for my book, made me think about possible connections with value distribution theory. Let $A$ be the solution operator for the simplest possible second order differential equation

$$u'' = f, \quad u(0) = u(1) = 0. \tag{0.31}$$

This is a self-adjoint Fredholm operator with eigenvalues $\{-(\frac{1}{k\pi})^2\}$ tending to zero. If we substitute the initial condition $u'(0) = 0$ for the boundary condition $u(1) = 0$, we obtain a Volterra integral operator $V^2$, the spectrum of which has collapsed to the origin. Polynomials that realize optimal acceleration for $A$ and $V^2$ are very different in nature, but the speed is essentially the same in both. If we calculate the characteristic functions $T$ for the resolvents, we again obtain essentially the same speed of growth. However, for the Fredholm operator, the whole growth accumulates from counting the poles, whereas the resolvent of the Volterra operator is an entire function and all the growth becomes measured by the function $m$. Applying the boundary conditions here has an effect which is quite similar to setting the value $a$ in the equation $f(z) = a$, and in our example the self-adjoint Fredholm operator $A$ corresponds to the "regular" values $a$ while the Volterra operator $V^2$ corresponds to the defective, exceptional values. Note that as a result of the change of the boundary condition, the operators are, in the sense of norms, far from each other, but still $A = V^2 + R$, where $R$ is one-dimensional: $\deg(R) = 1$. In this sense, it is a small perturbation.

In numerical analysis, it is customary to estimate operator valued functions with the aid of holomorphic functional calculus. In holomorphic functional calculus, resolvents are examined as analytic functions outside the spectrum. In particular the path of integration in the Cauchy-integral has to be chosen in such a way that it goes around the whole spectrum. If we make a small-dimensional correction to the operator, as in our example, the spectrum can change radically and then it is difficult to construct a perturbation theory relying on such a tool.

Factoring the resolvent

The Joensuu Colloquium was held in August 1995 and in the autumn, I had selected, again in honour of the anniversary, the area between function theory and functional analysis as the subject of my lectures. After I had presented the theory of scalar valued meromorphic functions we looked to see what the results were like if $F$ is an operator valued meromorphic function, when the absolute value in the scalar theory is replaced by the operator norm. I also presented the contribution of Rolf Nevanlinna to the Pick-Nevanlinna interpolation, and connections between Nevanlinna theory and the theory of "composition operators".

My dream of a value distribution theory for operator valued functions had of course got stuck at a seemingly insurmountable threshold. If the resolvent could be thought of as a meromorphic function, shouldn't it then be "just as large" as its inverse, which is even so a polynomial of first degree? So, as I stated at the beginning, shouldn't (for tradition's sake I write $1/z$ instead of $\lambda$)

$$z \mapsto 1 - zA \tag{0.1}$$

and

$$z \mapsto (1 - zA)^{-1} \tag{0.2}$$

really be equally large if measured as meromorphic functions? If we look at matrices in a $d$-dimensional space, the former is a first degree polynomial and the latter generally a $d$-degree rational function, and they cannot in any way generally be equally large.

When I said in the course that a polynomial can be decomposed as a product using its roots, I already in fact knew that I would get over that threshold. One just had to ask in what way $1 - zA$ could be decomposed as a product of its first degree factors, so that it could be seen as a $d$-degree polynomial.

Theorem 0.4 Suppose we are given a linear mapping $A$ in a $d$-dimensional complex space. Then there exist vectors $a_1, a_2, \dots, a_d$ and $u_1, u_2, \dots, u_d$ such that

[...] (0.33)

[...] for $j > k$ (0.34)

and for all complex $z$

[...] (0.35)

where $ua^*$ denotes the following 1-dimensional mapping: $x \mapsto (x, a)u$.

What's fun about this form is that when we invert $I - zA$, the order of the product is reversed and all the terms generally become "visible":

$$(I - zA)^{-1} = \Bigl(I + \frac{z}{1 - z a_1^* u_1}\, u_1 a_1^*\Bigr) \cdots \Bigl(I + \frac{z}{1 - z a_d^* u_d}\, u_d a_d^*\Bigr). \tag{0.36}$$

Vectors $a_j$ and $u_k$ can be obtained numerically by carrying out the Schur decomposition for $A$, after which the resolvent can be written in the form (0.36) without further work. If we then examine only one term of the product,

$$z \mapsto I - z a b^*,$$

then the first main theorem is valid as such. The value distribution theory of scalar functions is not built on factoring the functions into first order factors. Likewise, we should not try to build the concepts in the operator valued case on such decompositions. However, the possibility of such a theory becomes evident at once. And to get the "right" definition it was enough to remember that the absolute value of the product of eigenvalues is the product of singular values.
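The factorization behind (0.36) can be tried out on a small case. The sketch below is an illustration only, not taken from the text: it assumes the product has the form $I - zA = (I - z u_d a_d^*) \cdots (I - z u_1 a_1^*)$ with $a_j^* u_k = 0$ for $j > k$, which is the form that inverts to (0.36). For an upper triangular matrix $T$ no Schur step is needed, so we may take $u_j = e_j$ and $a_j^*$ equal to the $j$-th row of $T$. The helper names and the test matrix are hypothetical.

```python
from fractions import Fraction

def identity(d):
    return [[Fraction(int(i == j)) for j in range(d)] for i in range(d)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def rank_one_factor(T, j, z):
    """Return I - z * e_j * (row j of T), i.e. I - z u_j a_j^* with u_j = e_j."""
    d = len(T)
    F = identity(d)
    for col in range(d):
        F[j][col] -= z * T[j][col]
    return F

def factored(T, z):
    """Product of the rank-one factors with the highest index leftmost."""
    d = len(T)
    P = identity(d)
    for j in range(d - 1, -1, -1):
        P = matmul(P, rank_one_factor(T, j, z))
    return P

# An arbitrary upper triangular example and an arbitrary rational point z.
T = [[Fraction(2), Fraction(1), Fraction(3)],
     [Fraction(0), Fraction(-1), Fraction(4)],
     [Fraction(0), Fraction(0), Fraction(5)]]
z = Fraction(1, 7)

# I - zT computed directly, for comparison with the product of factors.
direct = [[identity(3)[i][j] - z * T[i][j] for j in range(3)] for i in range(3)]
print(factored(T, z) == direct)
```

Exact rational arithmetic is used so that the comparison is an equality rather than a tolerance check. Reversing the order of the factors breaks the identity for this matrix, which is the sense in which the order matters.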

In the Hermann Weyl lecture hall

In October 1995, the hundredth anniversary of the birth of Rolf Nevanlinna, I presented my new work in the Hermann Weyl lecture hall at ETH during the week before the actual birthday. Zurich was already a familiar place. Rolf Jeltsch had returned from Aachen to take up the chair of Peter Henrici, his teacher. I had been there every year and spent the whole summer term there in 1992. Visits to the United States seemed to become more stressful from year to year due to the time difference, whereas Zurich was only a convenient couple of hours away by direct flight. I presented the following identity and the two applications described above. Let $F$ be a matrix valued function with meromorphic elements. In the following I denote by $T_\infty$ the characteristic function which is obtained when the subharmonic function $\log^+ |f|$ is replaced by (the subharmonic function) $\log^+ \|F\|$. No such identity as in the first theorem can be valid for this. On the other hand, we take the function $\sum_j \log^+ \sigma_j(F)$ in place of $\log^+ |f|$. The starting point, the identity

$$\log |f| = \log^+ |f| - \log^+ \frac{1}{|f|}, \tag{0.37}$$

changes into

$$\log |\det F| = \sum_j \log^+ \sigma_j(F) - \sum_j \log^+ \sigma_j(F^{-1}). \tag{0.38}$$

The characteristic function obtained is now denoted by $T_1$.

Theorem 0.5 Let $F$ be a matrix valued function, with elements meromorphic in the disc $|z| < R \le \infty$. Then $T_1(r, F)$ is for $r < R$ a well-defined non-negative function, such that it is increasing and convex in the variable $\log r$. We always have

$$T_\infty(r, F) \le T_1(r, F). \tag{0.39}$$

If $G$ is another such function, then

$$T_1(r, FG) \le T_1(r, F) + T_1(r, G). \tag{0.40}$$

If additionally $F(0) = I$, then

$$T_1(r, F) = T_1(r, F^{-1}). \tag{0.41}$$

At that stage, I did not in truth have such a theorem. The other parts of the theorem are easy, but the behaviour of $T_1(r, F)$ as $r$ grows still required time. I wrote my regards in the book of the Joensuu Centenary Colloquium [N05], still without the dependence on radius $r$.

A quiet life in Warsaw

In spring of 1996 I was at the Banach Center in Warsaw, without a strict timetable, and I again took a look at the function

$$z \mapsto s(z) := \sum_j \log^+ \sigma_j(F). \tag{0.42}$$

Was it subharmonic away from the poles? I learned that B. Aupetit had shown that if we form geometric averages of the absolute values of eigenvalues

$$\Bigl(\prod_{1}^{k} |\lambda_j(z)|\Bigr)^{1/k} \tag{0.43}$$

(where the eigenvalues are ordered decreasingly in absolute value), then these are always subharmonic, if the matrix is analytic. I contacted him and he wrote a short article in which the corresponding result was modified for singular values with the help of the polar decomposition. This tool, $T_1$, extends by an approximation technique to operator valued functions in Hilbert spaces of the form $I - K$, where $K$ is analytic away from the poles, the sum of whose singular values is finite and furthermore, at the poles $b$ in the expansions

$$K(z) = \sum_{j=-m}^{\infty} A_j (z - b)^j \tag{0.44}$$

the coefficients $A_j$ are finite dimensional for $j < 0$, [N08].

The perturbation estimation can now be done as follows. Let $F^{-1}$ be a given operator valued meromorphic function, so that in particular $T_\infty(r, F^{-1})$ is finite and the task is to estimate the perturbation $(F + G)^{-1}$:

$$T_\infty(r, (F + G)^{-1}) = T_\infty(r, (I + F^{-1}G)^{-1} F^{-1}) \le T_\infty(r, (I + F^{-1}G)^{-1}) + T_\infty(r, F^{-1}). \tag{0.45}$$

If $G$ is now, for example, a finite dimensional function, then $I + F^{-1}G$ is always such that the inversion identity for $T_1$ holds and we can estimate

(0.46)

where the constant $C$ depends on the behaviour of $\log |\det(I + F^{-1}G)|$ at the origin.

Finally, in Kirkkonummi

Hermann Weyl knew Nevanlinna theory well, and developed it himself, particularly the theory of meromorphic curves. Weyl also investigated the singular values of matrices and showed especially that between the absolute values of the eigenvalues $|\lambda_j|$ and the singular values $\sigma_j$ there always holds an inequality

$$\prod_{1}^{k} |\lambda_j| \le \prod_{1}^{k} \sigma_j. \tag{0.47}$$

Nowadays these even carry the name Weyl inequalities. Weyl also studied the behaviour of the spectrum of linear operators when the operator is perturbed by a compact operator. If $K$ is compact, then the spectrum of the operator $A + K$ satisfies

$$\sigma(A + K) \subset \sigma(A) \cup \sigma_p(A + K), \tag{0.48}$$

where $\sigma_p$ now denotes the point spectrum. The proof of this is essentially the same as the steps in the inequality (0.45), as applied to the resolvent. I was left wondering why Weyl had not taken the logarithms of the determinants and tied those strings together. Perhaps this was important only to me. Perhaps a molehill had accidentally been allowed to become a mountain.

Comment This Prologue is essentially a translation of [N06], written during winter 1997-98 at Mittag-Leffler Institute.

FIRST CHAPTER

Keywords: Resolvent, meromorphic, characteristic polynomial, minimal polynomial, algebraic operator, almost algebraic, degree of rational function, degree of an operator, nilpotent, quasinilpotent.

Resolvent

We are given a matrix $A$ which we view as an operator in the complex Euclidean space $\mathbb{C}^d$. We usually write $I$ for the identity operator. The most important function associated with the operator $A$ is the resolvent

$$\lambda \mapsto R(\lambda, A) := (\lambda I - A)^{-1},$$

which we shall view as a meromorphic function defined for all $\lambda \in \mathbb{C}$, rather than analytic outside the spectrum $\sigma(A)$. At an eigenvalue $\mu \in \sigma(A)$ the resolvent has a pole and the following expansion holds

$$(\lambda I - A)^{-1} = \sum_{k=-m}^{\infty} A_k (\lambda - \mu)^k \tag{1.1}$$

in some neighborhood of $\mu$. We assume that $A_{-m} \ne 0$ and say that $m$ is the multiplicity of the pole $\mu$. The resolvent is analytic in particular for $|\lambda| > \rho(A)$, the spectral radius, and the following power series expansion converges there:

$$(\lambda I - A)^{-1} = \sum_{j=0}^{\infty} A^j \lambda^{-j-1}. \tag{1.2}$$

In order to have a nice representation which is valid for all $\lambda$ we recall the characteristic and minimal polynomial of $A$. The characteristic polynomial $\pi_A$ is defined by

$$\pi_A(\lambda) := \det(\lambda I - A), \tag{1.3}$$

and by the Cayley-Hamilton theorem it vanishes at the matrix $A$: $\pi_A(A) = 0$. If all the eigenvalues are distinct, then there is no monic polynomial of smaller degree which could vanish at $A$, but with multiple eigenvalues this can happen. In general, the monic polynomial of smallest degree vanishing at $A$ is called the minimal polynomial and we shall denote it by $q_A$. Let $p$ be any monic polynomial vanishing at $A$, and suppose that

$$p(\lambda) := \lambda^n + a_1 \lambda^{n-1} + \dots + a_n. \tag{1.4}$$

Together with $p$ we associate polynomials $p_j$ by Horner's rule. We initialize

$$p_0(\lambda) := 1 \tag{1.5a}$$

and then for $j \le n$ set

$$p_j(\lambda) := \lambda\, p_{j-1}(\lambda) + a_j, \tag{1.5b}$$

so that in particular $p_n = p$. Finally, given a monic polynomial $p$ we denote

$$\hat p(\lambda) := \lambda^{-n} p(\lambda) = 1 + a_1 \lambda^{-1} + \dots + a_n \lambda^{-n}. \tag{1.6}$$

Proposition 1.1 If $p$ is a monic polynomial of degree $n$ such that $p(A) = 0$ and $p_j$, $\hat p$ are as above, then we can write the resolvent in the form

$$(\lambda I - A)^{-1} = \frac{1}{\hat p(\lambda)} \sum_{j=0}^{n-1} p_j(A) \lambda^{-j-1}. \tag{1.7}$$

Proof Expand the resolvent into series in $1/\lambda$, multiply by $\hat p$ and identify the terms. $\Box$

Example 1.1 If $A$ is nilpotent of degree $n$ so that $A^n = 0$, then we can take $p(\lambda) = \lambda^n$ and (1.7) takes the form

$$(\lambda I - A)^{-1} = \sum_{j=0}^{n-1} A^j \lambda^{-j-1},$$

natural when compared with (1.2).

Example 1.2 A rank-1 matrix $ab^*$ has (at most) one nonzero eigenvalue. Applying $ab^*$ to the vector $a$ we see that the eigenvalue is $b^* a$. In the orthogonal complement of $b$ the matrix vanishes and thus the minimal polynomial is simply $q_{ab^*}(\lambda) = \lambda(\lambda - b^* a)$. Then, (1.7) takes the form

$$(\lambda I - ab^*)^{-1} = \frac{1}{\lambda}\Bigl(I + \frac{1}{\lambda - b^* a}\, ab^*\Bigr) \tag{1.8}$$

which is easy to derive also e.g. using the power series expansion of the resolvent.
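The closed form for the resolvent of a rank-1 matrix can be checked directly. The following sketch is an illustration, not part of the text: it assumes real vectors (so that $b^*$ is the transpose) and multiplies $\lambda I - ab^*$ by $\frac{1}{\lambda}\bigl(I + \frac{1}{\lambda - b^* a} ab^*\bigr)$ in exact rational arithmetic. The vectors and the point $\lambda = 7/2$ are arbitrary choices, and all function names are hypothetical.

```python
from fractions import Fraction

def rank_one(a, b):
    """The matrix a b^T for real vectors a, b."""
    return [[x * y for y in b] for x in a]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def shifted(a, b, lam):
    """lam * I - a b^T."""
    M = rank_one(a, b)
    d = len(a)
    return [[lam * int(i == j) - M[i][j] for j in range(d)] for i in range(d)]

def resolvent_closed_form(a, b, lam):
    """(1/lam) * (I + a b^T / (lam - b^T a)), valid for lam != 0 and lam != b^T a."""
    d = len(a)
    mu = sum(x * y for x, y in zip(a, b))   # the only possibly nonzero eigenvalue
    M = rank_one(a, b)
    return [[(int(i == j) + M[i][j] / (lam - mu)) / lam for j in range(d)]
            for i in range(d)]

a = [Fraction(1), Fraction(2), Fraction(3)]
b = [Fraction(2), Fraction(0), Fraction(1)]
lam = Fraction(7, 2)

product = matmul(shifted(a, b, lam), resolvent_closed_form(a, b, lam))
identity = [[int(i == j) for j in range(3)] for i in range(3)]
print(product == identity)
```

The check relies only on $(ab^*)^2 = (b^* a)\, ab^*$, which is why the minimal polynomial has degree two.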

Remark 1.1 Multiplying by $\lambda^n$ we can modify (1.7) into the form

$$(\lambda I - A)^{-1} = \frac{1}{p(\lambda)} \sum_{j=0}^{n-1} p_j(A) \lambda^{n-j-1} \tag{1.9}$$

which shows the zeros of $p$ as poles of the resolvent. If we use the minimal polynomial $q_A$ as $p$ then the above expressions contain no common factors.

Remark 1.2 If we substitute $\lambda = 0$ in these expressions we obtain a formula for the inverse. In fact, if we for example use the characteristic polynomial $\pi_A$, then $\pi_A(0) = (-1)^d \det A$ so that by (1.9) we have, whenever $A$ is invertible,

$$A^{-1} = \frac{(-1)^{d-1}}{\det A}\, \pi_{A,d-1}(A). \tag{1.10}$$

There are several algorithms to compute the characteristic polynomial. Here we give the Leverrier-Faddeev algorithm, which uses matrix multiplications and traces; it is computationally quite heavy but gives insight into the structure. Set $A_1 := A$ and compute $a_1 := -\operatorname{tr} A_1$. Form $B_1 := A_1 + a_1 I$ and continue as follows:

$$A_j := A B_{j-1}, \qquad a_j := -\frac{1}{j} \operatorname{tr} A_j, \qquad B_j := A_j + a_j I. \tag{1.11}$$

Proposition 1.2 The Leverrier-Faddeev algorithm produces the characteristic polynomial $\pi_A(\lambda) = \lambda^d + a_1 \lambda^{d-1} + \dots + a_d$, and the associated polynomials $\pi_{A,j}(\lambda) = \lambda^j + a_1 \lambda^{j-1} + \dots + a_j$ satisfy for $j = 1, \dots, d$

$$B_j = \pi_{A,j}(A). \tag{1.12}$$

Proof The proof is by induction, utilizing what is called Newton's formula: if $\lambda_i$ denote the eigenvalues and

$$\mu_k := \sum_{i=1}^{d} \lambda_i^k,$$

then $-a_1 = \mu_1$ and for $j = 2, \dots, d$

$$-j a_j = \mu_j + a_1 \mu_{j-1} + \dots + a_{j-1} \mu_1. \tag{1.13}$$

Clearly, (1.12) is true for $j = 1$ as $B_1 = A - (\operatorname{tr} A) I$. Suppose (1.12) holds up to $j - 1$, so that

$$B_{j-1} = A^{j-1} + a_1 A^{j-2} + \dots + a_{j-1} I.$$

Then (1.11), $\operatorname{tr} A^k = \mu_k$ and (1.13) give

$$\operatorname{tr} A_j = \operatorname{tr} A^j + a_1 \operatorname{tr} A^{j-1} + \dots + a_{j-1} \operatorname{tr} A = \mu_j + a_1 \mu_{j-1} + \dots + a_{j-1} \mu_1 = -j a_j.$$

Thus (1.12) holds with $j$, completing the proof. $\Box$
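The recursion (1.11) translates almost line by line into code. The sketch below is a minimal illustration in exact rational arithmetic, not an implementation from the text; the function names are hypothetical, and the recursion is started from $B_0 := I$, which gives $A_1 = A$ as required. It also returns the matrices $B_j$, so that the inverse formula of Remark 1.2 can be read off as $A^{-1} = -B_{d-1}/a_d$ (using $a_d = \pi_A(0) = (-1)^d \det A$).

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def leverrier_faddeev(A):
    """Return ([a_1, ..., a_d], [B_0, ..., B_d]) for a square matrix A."""
    d = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    B = [[Fraction(int(i == k)) for k in range(d)] for i in range(d)]  # B_0 = I
    coeffs, Bs = [], [B]
    for j in range(1, d + 1):
        Aj = matmul(A, B)                                  # A_j = A B_{j-1}
        aj = -sum(Aj[i][i] for i in range(d)) / j          # a_j = -(1/j) tr A_j
        B = [[Aj[i][k] + (aj if i == k else 0) for k in range(d)]
             for i in range(d)]                            # B_j = A_j + a_j I
        coeffs.append(aj)
        Bs.append(B)
    return coeffs, Bs

def inverse_from_lf(A):
    """Inverse via A^{-1} = -B_{d-1} / a_d; requires a_d != 0."""
    coeffs, Bs = leverrier_faddeev(A)
    return [[-x / coeffs[-1] for x in row] for row in Bs[-2]]

A = [[2, 1], [1, 3]]
coeffs, Bs = leverrier_faddeev(A)
print(coeffs)           # coefficients a_1, a_2 of lambda^2 + a_1 lambda + a_2
print(Bs[-1])           # B_d, which vanishes by the Cayley-Hamilton theorem
print(inverse_from_lf(A))
```

For the $2 \times 2$ example the coefficients are $-5$ and $5$, matching $\det(\lambda I - A) = \lambda^2 - 5\lambda + 5$.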

Our theme is to view the resolvent as a meromorphic function, and in fact, as we are in the finite dimensional space $\mathbb{C}^d$, the resolvent is actually rational, a polynomial with matrix valued coefficients divided by a scalar polynomial. With scalar rational functions $r = p/q$ we define the degree by $\deg r := \max\{\deg p, \deg q\}$, which for example is invariant in the inversion

$$r \mapsto 1/r.$$

Our perturbation theory shall be of this form. Let us perturb a rational function $r$ with a constant $a \in \mathbb{C}$ as follows:

$$r = \frac{p}{q} \mapsto r_+ := \frac{p}{q - a}. \tag{1.14}$$

Then the values of $r$, and in particular the location of its poles, change a lot, and $r$ and $r_+$ are not close if considered as analytic functions, but when measured as meromorphic functions the perturbation is small. Now, consider updating a matrix $A$ by a low rank matrix $B$:

$$A \mapsto A_+ := A + B.$$

We are interested in the corresponding transformation between the resolvents,

$$R(\lambda, A) \mapsto R(\lambda, A_+),$$

which can be written explicitly as follows:

$$R(\lambda, A_+) = (I - R(\lambda, A) B)^{-1} R(\lambda, A). \tag{1.15}$$
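As a sanity check on (1.15), the following sketch verifies the identity in exact rational arithmetic and confirms that the multiplicative factor $(I - R(\lambda, A)B)^{-1}$ differs from the identity by a matrix of rank one when $B$ has rank one. It is an illustration only: the matrices $A$, $B$ and the point $\lambda = 10$ are arbitrary choices, and the helper names are hypothetical.

```python
from fractions import Fraction

def eye(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inverse(M):
    """Gauss-Jordan inverse in exact arithmetic (raises StopIteration if singular)."""
    n = len(M)
    aug = [list(row) + eye(n)[i] for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)   # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def resolvent(lam, A):
    n = len(A)
    return inverse([[lam * int(i == j) - A[i][j] for j in range(n)]
                    for i in range(n)])

def rank_at_most_one(M):
    """True when every 2x2 minor of M vanishes."""
    n = len(M)
    return all(M[i][j] * M[k][l] == M[i][l] * M[k][j]
               for i in range(n) for k in range(n)
               for j in range(n) for l in range(n))

A = [[Fraction(x) for x in row] for row in [[1, 2, 0], [0, 1, 3], [4, 0, 1]]]
B = [[Fraction(x) for x in row] for row in [[0, 1, 1], [0, 0, 0], [0, 2, 2]]]  # rank one
lam = Fraction(10)

R = resolvent(lam, A)
A_plus = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]
factor = inverse(sub(eye(3), matmul(R, B)))     # the multiplicative factor in (1.15)
lhs = resolvent(lam, A_plus)
rhs = matmul(factor, R)
C = sub(factor, eye(3))                         # factor minus identity
print(lhs == rhs, rank_at_most_one(C))
```

Note that the check inverts $I - R(\lambda, A)B$ explicitly; the point of the theory developed in the text is precisely to estimate such factors without doing so.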

Notice that (1.15) shows the perturbation to appear multiplicatively and in fact we multiply by $I + C(\lambda) := (I - R(\lambda, A)B)^{-1}$ where $C(\lambda)$ is of low rank whenever $B$ is. In our matrix valued theory we need two concepts, one measuring the growth of the resolvent and the other one measuring the perturbation. The former is not invariant under inversion while the latter is. The key is thus to be able to estimate terms of the form $I + C(\lambda)$ without having to invert anything, without the need of knowing the location of the new poles. For a rational function $r$ its degree determines the asymptotic growth of its growth function. We are interested in measuring the growth of the resolvent and again its asymptotic growth speed is determined by the degree of the resolvent as a rational function. In fact, every resolvent is analytic and behaves like $O(1/\lambda)$ near infinity and it is therefore natural to think of it as a function of $1/\lambda$. We see from (1.7) that we should think of the resolvent as being of degree $n$, where $n$ is the degree of the minimal polynomial of the matrix.

Definition 1.1 A matrix $A$ is said to be algebraic of degree $\deg A$, where the degree is obtained from the minimal polynomial $q_A$: $\deg A = \deg q_A$.

Notice that according to this definition the scalars, including 0, are of degree 1 and that the resolvent is automatically of the same degree as the matrix.

Example 1.3 The degree of a matrix does not behave well in forming products or sums. Let $A$ map the even coordinates forward, $A e_j = e_{j+1}$, and the odd ones to 0, while $B$ does the same, odd and even reversed, and both map the last one, $e_d$, to 0. Then the following holds:

$$\deg A = \deg B = 2$$

while $A + B = S_d$, the truncated shift, and $\deg S_d = d$. Further, $I + A$ and $I + B$ are also of degree 2 but

$$(I + A)(I + B) = I + S_d$$

is again of degree $d$.

Proposition 1.3 We have

$$\deg(A + B) \le \deg(A)\,(\operatorname{rank}(B) + 1). \tag{1.16}$$

Proof Denote by $\mathcal{R}(B)$ the range of $B$. Then the dimension of the subspace $\operatorname{span}_{j \ge 0}\{A^j \mathcal{R}(B)\}$ is at most $\deg(A)\operatorname{rank}(B)$ and so the dimension of the subspace $\operatorname{span}_{j \ge 0}\{(A + B)^j b\}$ is at most $\deg(A)\operatorname{rank}(B) + \deg(A)$ for every vector $b$. Then the claim follows from Kaplansky's theorem (Theorem 2.8.11 in [N01]). $\Box$

The inequality (1.16) shows in a nutshell the "boundary conditions" we have in the perturbation theory. The resolvent shall be estimated using a characteristic function ($T_\infty$) which is based on the norm, showing the degree in an asymptotic sense, but the perturbation must be linked to the rank and we do this by using not only the norm, the largest singular value, but actually all singular values which are larger than 1. This tool shall be denoted by $T_1$.

We give two illuminating examples in which the "phenomenon" of the matrix changes drastically under a rank-1 update while the size of the resolvent measured using $T_\infty$ stays essentially unchanged.

Example 1.4 Let $S_d$ be the truncated shift as in Example 1.3, i.e. the matrix which has 1's in the first lower diagonal:

$$S_d = \begin{pmatrix} 0 & & & \\ 1 & 0 & & \\ & \ddots & \ddots & \\ & & 1 & 0 \end{pmatrix}.$$

Adding the rank-1 matrix $e_1 e_d^*$, that is, adding a 1 in the upper right hand corner, gives us a unitary matrix, call it $U := S_d + e_1 e_d^*$:

$$U = \begin{pmatrix} 0 & & & 1 \\ 1 & 0 & & \\ & \ddots & \ddots & \\ & & 1 & 0 \end{pmatrix}.$$

Thus we move from a nilpotent to a unitary matrix by a rank-1 update. Following the eigenvalues along the path $A(\alpha) := (1 - \alpha) S_d + \alpha U$ is easy since the characteristic polynomial is simply

$$\pi_{A(\alpha)}(\lambda) = \lambda^d - \alpha.$$

Example 1.5 Our other example is simpler to state for operators (and then think of approximating these operators with finite rank discretizations, if we want to stay within matrices). Let us denote by $V^2$ the integral operator giving the solution to the initial value problem

$$u''(t) = f(t), \quad 0 \le t \le 1, \quad u(0) = 0, \quad u'(0) = 0. \tag{1.17}$$

Thus

$$(V^2 f)(t) = \int_0^t (t - s) f(s)\, ds.$$

This is a quasinilpotent operator, $\sigma(V^2) = \{0\}$, when considered e.g. in $L_2[0, 1]$, and the resolvent is thus an entire function in $1/\lambda$. A straightforward summation of

$$R(\lambda, V^2) = \sum_{j=0}^{\infty} V^{2j} \lambda^{-j-1}$$

gives

$$(R(\lambda, V^2) f)(t) = \frac{1}{\lambda} f(t) + \frac{1}{\lambda^2} \int_0^t \sqrt{\lambda}\, \sinh\bigl((t - s)/\sqrt{\lambda}\bigr) f(s)\, ds. \tag{1.18}$$

We see that $R(\lambda, V^2)$ grows particularly fast when $\lambda \to 0$ from the right, in fact

$$\max_{|\lambda| = r} \|R(\lambda, V^2)\| \sim e^{1/\sqrt{r}} \tag{1.19}$$

which means that the resolvent is an entire function in $1/\lambda$ of order $\omega = 1/2$ and of type $\tau = 1$. We then change our boundary conditions, so that the solution operator becomes self-adjoint. Let $A$ be the solution operator to the problem

$$u''(t) = f(t), \quad 0 \le t \le 1, \quad u(0) = 0, \quad u(1) = 0. \tag{1.20}$$

Clearly, we can solve this by looking for a candidate in the form

$$u(t) = V^2 f(t) + ct$$

and choosing $c$ so that the boundary condition $u(1) = 0$ is satisfied. This gives

$$(Au)(t) = \int_0^1 a(t, s) u(s)\, ds$$

where the kernel is symmetric, $a(t, s) = a(s, t)$, and is given for $0 \le s \le t \le 1$ by

$$a(t, s) = s(t - 1).$$

Notice that we can write

$$A = V^2 + R,$$

which means that the updating needed, corresponding to changing the boundary condition, is a rank-1 operator

$$(Rf)(t) = -t \int_0^1 (1 - s) f(s)\, ds.$$

Since $A$ is self-adjoint all the growth of the resolvent is seen through its spectral behavior and in fact $A$ has a spectrum of eigenvalues $\lambda_j = -(1/(\pi j))^2$ together with their accumulation point 0. Now we cannot measure the growth using the maximum modulus as in (1.19) but we need to measure it as a meromorphic function instead. And then it turns out that the resolvents of $V^2$ and $A$ exhibit the same growth speed. If we follow the path from $A$ to $V^2$ along $A(\alpha) = (1 - \alpha)A + \alpha V^2$, then we start from a self-adjoint operator having negative eigenvalues. The eigenvalues start to form pairs, bifurcating into symmetric complex pairs which travel around the origin and disappear into the origin from the right hand side where $R(\lambda, V^2)$ grows fastest,

[Hy].

Cauchy-integral

The Cauchy-integral represents an analytic function inside a domain as an integral over its boundary. If we denote a contour by $\Gamma$, then inside $\Gamma$ an analytic function $f$ can be written as

$$f(z) = \frac{1}{2\pi i} \int_\Gamma \frac{f(\zeta)}{\zeta - z}\, d\zeta. \tag{1.21}$$

If we "replace" $1/(\zeta - z)$ in the integral by a resolvent then we obtain the value of the analytic function at the matrix provided that the contour surrounds the whole spectrum:

$$f(A) = \frac{1}{2\pi i} \int_\Gamma f(\zeta) (\zeta I - A)^{-1}\, d\zeta. \tag{1.22}$$

This holds as such for bounded operators in Banach spaces and is sometimes called the Dunford-Taylor integral. If the contour fails to surround all the eigenvalues, then it produces the value at the matrix when projected to the invariant eigenspaces associated with all the eigenvalues surrounded. In fact, if $\Gamma_j$ surrounds a single eigenvalue $\lambda_j$, then setting

$$P_j := \frac{1}{2\pi i} \int_{\Gamma_j} (\zeta I - A)^{-1}\, d\zeta \tag{1.23}$$

we obtain the Riesz projection:

Proposition 1.4 Under the assumptions above we have

(i) $\sum_j P_j = I$,
(ii) $P_j^2 = P_j$,
(iii) $P_j P_k = 0$ for $j \ne k$,
(iv) $P_j A = A P_j$.

Each invariant subspace $P_j \mathbb{C}^d$ contains an eigenvector $v_j$ such that $A v_j = \lambda_j v_j$ and

$$P_j f(A) = f(A) P_j = \frac{1}{2\pi i} \int_{\Gamma_j} f(\zeta) (\zeta I - A)^{-1}\, d\zeta. \tag{1.24}$$

Above we "substituted" the matrix $A$ as a variable in the analytic function $f$. Sometimes we want to do this with a matrix valued function, say $F$:

$$F : z \mapsto F(z) = (f_{i,j}(z)).$$

Here each element $f_{i,j}$ is analytic in a common domain $\Omega$ and the matrix need not be square. If $A$ is a $d \times d$-matrix as before, with eigenvalues in $\Omega$, then we have

$$F(A) = \frac{1}{2\pi i} \int_\Gamma F(\zeta) \otimes (\zeta I - A)^{-1}\, d\zeta, \tag{1.25}$$

provided that the contour $\Gamma$ stays inside $\Omega$ and surrounds every eigenvalue as for scalar functions $f$. The symbol $\otimes$ denotes the tensor or Kronecker product. If $C = (c_{i,j})$ and $D = (d_{k,l})$ are two matrices then

$$C \otimes D := (c_{i,j} D).$$

We shall discuss this product in detail later. Formula (1.25) is compatible with the obvious representation obtained from power series representations. In fact, if

$$F(z) = \sum_j A_j z^j$$

and the spectral radius of $A$ is smaller than the convergence radius of this representation, then

$$F(A) = \sum_j A_j \otimes A^j.$$

Comment 1.1 One calls a bounded operator $A$ in a Banach space algebraic if it has a minimal polynomial, and its degree is defined again as the degree of the minimal polynomial. The formula (1.7) holds as such. Furthermore, in [N01] we defined an almost algebraic operator as one for which the following holds: There exists a sequence $\{a_k\}_{k \ge 1}$ of complex numbers such that if

$$p_j(\lambda) := \lambda^j + a_1 \lambda^{j-1} + \dots + a_j$$

then as $j \to \infty$

$$\|p_j(A)\|^{1/j} \to 0.$$

While algebraic operators are exactly those for which the resolvent is rational, almost algebraic operators can be characterized as those which have meromorphic resolvents for $\lambda \ne 0$. Also, representation (1.7) holds in the form

$$(\lambda I - A)^{-1} = \frac{1}{\hat p(\lambda)} \sum_{j=0}^{\infty} p_j(A) \lambda^{-j-1}, \tag{1.26}$$

where $\hat p(\lambda) = 1 + a_1/\lambda + a_2/\lambda^2 + \dots$ is entire in $1/\lambda$, [N01]. We shall return to this class later, see in particular Definition 7.1 and Theorem 7.3.

Comment 1.2 The Leverrier-Faddeev method is given in more detail in [F], where it is called Leverrier's method in Faddeev's modification. U. J. J. Leverrier's original article appeared in J. Math. 1840.

Comment 1.3 Proposition 1.3 is from [Hy-N] where it was given for operators.

Comment 1.4 The connection between $V^2$ and $A$ in Example 1.5 was presented in [N01] (Examples 5.2.7 and 5.2.8) and it was one of the starting points for this work. It was further studied in [Hy].

SECOND CHAPTER

Keywords: Entire function, meromorphic, Poisson-Jensen Theorem, Nevanlinna characteristic.

Entire functions

Weierstrass showed that every entire function (i.e. regular in the whole plane and so an everywhere convergent power series)

$$f(z) = \sum_{k=0}^{\infty} a_k z^k \tag{2.1}$$

can be expressed as a product in terms of its zeroes by means of the Weierstrass factors

$$E(z, 0) = 1 - z, \qquad E(z, q) = (1 - z) \exp\Bigl\{z + \frac{z^2}{2} + \dots + \frac{z^q}{q}\Bigr\}. \tag{2.2}$$

Hadamard (1893) used the maximum modulus

$$M(r, f) := \sup_{|z| \le r} |f(z)| \tag{2.3}$$

to define the order

$$\omega = \limsup_{r \to \infty} \frac{\log \log M(r, f)}{\log r}. \tag{2.4}$$

Let $\lfloor x \rfloor$ denote the integer part of $x \in \mathbb{R}$. If $\omega < \infty$ and $q = \lfloor \omega \rfloor$, then Hadamard showed that

$$f(z) = e^{P(z)} z^n \prod_{k=1}^{\infty} E(z/z_k, q) \tag{2.5}$$

where $P$ is a polynomial of degree at most $q$. In particular, if $\omega$ is not an integer, $f$ must have infinitely many zeroes. If $0 \le \omega < 1$, then $q = 0$, so (2.5) takes in this case the form

$$f(z) = A z^n \prod_{k=1}^{\infty} (1 - z/z_k). \tag{2.6}$$

Applying these ideas to Riemann's $\zeta$-function Hadamard and de la Vallée Poussin were later able to prove the prime number theorem. As an example, the function

$$\frac{\sin \sqrt{z}}{\sqrt{z}} = 1 - \frac{z}{3!} + \frac{z^2}{5!} - \dots$$

is of order 1/2 and therefore (2.6) gives

$$\sin z = z \prod_{k=1}^{\infty} \Bigl(1 - \frac{z^2}{k^2 \pi^2}\Bigr).$$

In addition to the order one often talks about the type $\tau$ of $f$. If $f$ is of order $\omega$ with $0 < \omega < \infty$, suppose there exists a constant $C < \infty$ such that

$$M(r, f) \le \exp\{C r^\omega\} \tag{2.7}$$

holds for all large enough $r$. Then $f$ is said to be of finite type and the greatest lower bound $\tau := \inf C \ge 0$ of the values of $C$ for which (2.7) holds (for all $r > r(C)$) is called the type of $f$. For example $e^z$ is of order $\omega = 1$ and type $\tau = 1$. Alternatively, the type is sometimes defined by setting

$$\tau := \limsup_{r \to \infty} \frac{\log M(r, f)}{r^\omega}.$$

Thus, the order and type measure the growth of $f$ as $|z| = r \to \infty$. However, they are also related with the decay of the Taylor coefficients.

Taylor coefficients

Let for a given $f$, $0 < R_0 \le \infty$ denote the radius of convergence of the Taylor series

$$f(z) = \sum_{k=0}^{\infty} a_k z^k, \tag{2.8}$$

so that

$$\frac{1}{R_0} = \limsup_{k \to \infty} |a_k|^{1/k}.$$

Since, for $r < R_0$,

$$a_k = \frac{1}{2\pi i} \int_{|z| = r} z^{-k-1} f(z)\, dz, \tag{2.9}$$

we obtain

$$|a_k| \le M(r, f)\, r^{-k}, \tag{2.10}$$

true for all $r < R_0$ with $k$ independent of $r$. This is a basic inequality connecting the growth of $f$ to its Taylor coefficients. A slightly sharper result is obtained if we use Parseval's identity. Consider the function

$$\varphi \mapsto f(re^{i\varphi}).$$

Its Fourier coefficients are, by (2.9),

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ik\varphi} f(re^{i\varphi})\, d\varphi = a_k r^k,$$

which are well defined because $f$ is analytic, and thus

$$\sum_{k=0}^{\infty} |a_k|^2 r^{2k} = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(re^{i\varphi})|^2\, d\varphi \le M(r, f)^2.$$

In the other direction (2.8) gives trivially

$$M(r, f) \le \sum_{k=0}^{\infty} |a_k| r^k. \tag{2.11}$$

Suppose now that $f$ is entire so that $R_0 = \infty$. If $f$ is of order $\omega$ and type $\tau$, then $M(r, f) \sim \exp\{\tau r^\omega\}$ and one concludes from (2.10) by substituting $r^\omega := \frac{k}{\tau\omega}$ that $a_k$ decays at least like $\bigl(\frac{e\tau\omega}{k}\bigr)^{k/\omega}$. On the other hand, if $|a_k|$ decayed faster than this, then (2.11) would imply that $M(r, f)$ actually grows slower. Here is a precise statement.

Theorem 2.1 If $f$ is entire of order $\omega$, then

$$\omega = \limsup_{k \to \infty} \frac{\log k}{\log \bigl(1/|a_k|\bigr)^{1/k}}.$$

If $f$ is of finite positive order $\omega$ and of finite type $\tau$, then

$$\tau = \frac{1}{e\omega} \limsup_{k \to \infty} k |a_k|^{\omega/k}.$$

Proof For a proof look at standard books on this topic, e.g. [Bo]. $\Box$
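The relation between growth and coefficient decay is easy to observe for $f(z) = e^z$, where $a_k = 1/k!$, $\omega = 1$ and $\tau = 1$. The sketch below is an illustration only, with hypothetical function names: it compares $|a_k|$ with the bound $(e\tau\omega/k)^{k/\omega}$ obtained by optimizing $M(r, f)\, r^{-k}$ over $r$, and evaluates the quotient $k \log k / \log(1/|a_k|)$ whose upper limit gives the order. The convergence of that quotient to 1 is slow, of the rate $1/\log k$.

```python
import math

def cauchy_bound(k, tau=1.0, omega=1.0):
    """(e*tau*omega/k)^(k/omega): the value of exp(tau*r^omega) * r^(-k)
    at the minimizing radius r^omega = k/(tau*omega)."""
    return (math.e * tau * omega / k) ** (k / omega)

def order_quotient(k):
    """k log k / log(1/|a_k|) for a_k = 1/k!, via lgamma(k+1) = log(k!)."""
    return k * math.log(k) / math.lgamma(k + 1)

# Coefficients of exp(z) never exceed the optimized Cauchy bound.
bound_holds = all(1.0 / math.factorial(k) <= cauchy_bound(k) for k in range(1, 40))
print(bound_holds)

# The quotient decreases slowly towards the order omega = 1.
for k in (10, 1000, 10**6):
    print(k, order_quotient(k))
```

For $e^z$ the bound is never attained: by Stirling's formula the ratio between the bound and $1/k!$ grows like $\sqrt{2\pi k}$.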

In Theorem 5.3.4 and Lemma 5.3.5 of [N01], the following quantitative version is proved.

Theorem 2.2 If $f$ satisfies

$$M(r, f) \le C \exp\{\tau r^\omega\}, \quad r > 0, \tag{2.12}$$

then

$$|a_0| \le C, \qquad |a_k| \le C \Bigl(\frac{\tau e \omega}{k}\Bigr)^{k/\omega} \quad \text{for } k \ge 1. \tag{2.13}$$

Conversely, if (2.13) holds, then for $0 < \varepsilon \le 1/2$ and for all $r > 0$

$$M(r, f) \le C + C\, \frac{13\omega}{\varepsilon} \exp\{(1 + \varepsilon)\tau r^\omega\}.$$

Meromorphic functions

We say that $f$ is meromorphic in a domain $\Omega$ if it is analytic except for possible poles. Thus at every $z_0 \in \Omega$ there exists a unique smallest nonnegative integer $m(z_0)$ such that $(z - z_0)^{m(z_0)} f(z)$ can be expanded into a convergent series around $z_0$:

$$(z - z_0)^{m(z_0)} f(z) = \sum_{j=0}^{\infty} c_j (z - z_0)^j. \tag{2.14}$$

We say that $m(z_0)$ is the multiplicity of the pole at $z_0$. Clearly the multiplicity satisfies the following:

$$m(z_0) = \lim_{z \to z_0} \frac{\log^+ |f(z)|}{\log \frac{1}{|z - z_0|}}. \tag{2.15}$$

Analytic functions can be estimated using the maximum modulus. Since meromorphic functions have poles, the maximum modulus (and maximum principle) no longer works. In order to introduce a related tool we start with the Poisson-Jensen integral formula, which can be given as follows. Let

$$P(\rho, t) := \sum_{k=-\infty}^{\infty} \rho^{|k|} e^{ikt} = \frac{1 - \rho^2}{1 - 2\rho \cos t + \rho^2} \tag{2.16}$$

denote the "Poisson kernel". Suppose now that $f$ is analytic for $|z| < R_0$ and take $\rho < r < R_0$. Then the harmonic function $u := \Re f$ can be represented at $z = \rho e^{i\theta}$ using its values on the larger circle as follows:

o
1 with $\theta r < R_0$ we have

$$T(r, f) \le \log^+ M(r, f) \le \frac{\theta + 1}{\theta - 1}\, T(\theta r, f). \tag{2.31}$$

Proof Here the first inequality follows from $N(r, f) = 0$ and

$$m(r, f) \le \log^+ M(r, f).$$

If $M(r, f) \le 1$, then the second inequality holds automatically. Suppose $M(r, f) > 1$, and let $z_0 = re^{i\varphi}$ be a point such that $|f(z_0)| = M(r, f)$. Applying the Poisson-Jensen formula (2.21) we have

$$\log M(r, f) \le \frac{1}{2\pi} \int_{-\pi}^{\pi} P\Bigl(\frac{1}{\theta}, t - \varphi\Bigr) \log |f(\theta r e^{it})|\, dt,$$

since the terms corresponding to zeros are negative:

$$|\theta r (re^{i\varphi} - a_j)| < |\theta^2 r^2 - \bar a_j r e^{i\varphi}|.$$

But

$$P\Bigl(\frac{1}{\theta}, t\Bigr) \le \frac{1 - 1/\theta^2}{(1 - 1/\theta)^2} = \frac{\theta + 1}{\theta - 1},$$

and (2.31) follows. $\Box$

The first main theorem

We can now prove R. Nevanlinna's first main theorem.

Theorem 2.6 Let $f$ be meromorphic in $|z| < R \le \infty$. If $a$ is an arbitrary complex number and $r < R$, then

$$T(r, f) = T\Bigl(r, \frac{1}{f - a}\Bigr) + \log |c_k(a)| + \varepsilon(a, r) \tag{2.32}$$

where $c_k(a)$ is the first nonzero coefficient in the expansion $f(z) - a = c_k(a) z^k + c_{k+1}(a) z^{k+1} + \dots$ and

$$|\varepsilon(a, r)| \le \log^+ |a| + \log 2. \tag{2.33}$$

Proof Applying (2.28) to $f - a$ gives

$$T(r, f - a) = T\Bigl(r, \frac{1}{f - a}\Bigr) + \log |c_k(a)|. \tag{2.34}$$

But (2.30) implies

$$|T(r, f) - T(r, f - a)| \le \log^+ |a| + \log 2,$$

as $T(r, a) \equiv \log^+ |a|$, and substituting this into (2.34) yields the claim. $\Box$

Example 2.1. If $f$ is a rational function of exact degree $d$ (i.e. $f = p/q$, $d = \max\{\deg(p), \deg(q)\}$ and $p$, $q$ contain no common factors), then

$$T(r, f) = d \log r + O(1), \quad \text{as } r \to \infty.$$
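For the simplest rational function of degree one, $f(z) = 1/(z - 1)$, the characteristic can be evaluated numerically. The sketch below is an illustration under stated assumptions: it uses the standard definitions $T = m + N$, with $m(r, f)$ the mean of $\log^+ |f|$ over the circle $|z| = r$ and $N(r, f) = \log^+(r/|b|)$ for a single simple pole $b \ne 0$. The midpoint quadrature, the sample count and the function names are hypothetical choices.

```python
import cmath
import math

def log_plus(x):
    """log^+ x = max(log x, 0)."""
    return math.log(x) if x > 1.0 else 0.0

def proximity(f, r, n=20000):
    """m(r, f): mean of log^+|f| over |z| = r, by the midpoint rule."""
    total = 0.0
    for k in range(n):
        phi = 2.0 * math.pi * (k + 0.5) / n
        total += log_plus(abs(f(r * cmath.exp(1j * phi))))
    return total / n

def counting_single_pole(r, pole_modulus):
    """N(r, f) for one simple pole away from the origin."""
    return log_plus(r / pole_modulus)

def f(z):
    return 1.0 / (z - 1.0)

def characteristic(r):
    """T(r, f) = m(r, f) + N(r, f) for f(z) = 1/(z - 1)."""
    return proximity(f, r) + counting_single_pole(r, 1.0)

for r in (2.0, 10.0, 100.0):
    print(r, characteristic(r), math.log(r))
```

For $r \ge 2$ we have $|f| \le 1$ on the whole circle, so the proximity term vanishes and $T(r, f) = \log r$ exactly, in agreement with $T(r, f) = d \log r + O(1)$ for $d = 1$.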

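Example 2.2. Consider $f(z) := e^z$. Then $N(r, f) \equiv 0$, while

$$m(r, f) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \log^+ |\exp(re^{i\varphi})|\, d\varphi = \frac{1}{2\pi} \int_{-\pi/2}^{\pi/2} r \cos\varphi\, d\varphi = \frac{r}{\pi}.$$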
0 any fixed number. Let

$$E := \Bigl\{ z \in \mathbb{C} \;\Big|\; |p(z)| \le \Bigl(\frac{h}{e}\Bigr)^n \Bigr\}.$$

Then there are disks $B_1, \dots, B_n$, $B_j := \{ z \mid |z - z_j| \le r_j \}$ such that

and

Bound along a circle

With the help of the Boutroux-Cartan lemma we consider bounding $\log^+ |f(z)|$ pointwise in terms of $T(r, f)$.

Theorem 2.10 Let $f$ be meromorphic in $|z| < R$ and choose $r$ such that $\theta r < R$. Then there exists a radius $\rho$ such that

1

r

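- 1 and $\theta r < R$ we have

$$n(r, f) \le \frac{1}{\log \theta}\, N(\theta r, f). \tag{2.40}$$

If $f$ has a pole at the origin, then the inequality holds in the form

$$n(r, f) + n(0, f) \frac{\log r}{\log \theta} \le \frac{1}{\log \theta}\, N(\theta r, f). \tag{2.41}$$

Proof We have

$$N(\theta r, f) = \int_0^{\theta r} \frac{n(t, f) - n(0, f)}{t}\, dt + n(0, f) \log(\theta r) \ge [n(r, f) - n(0, f)] \int_r^{\theta r} \frac{dt}{t} + n(0, f) \log(\theta r),$$

which gives (2.41). $\Box$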

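Corollary 2.2 Let $f$ be meromorphic in $|z| < R$ such that $f(0) \ne \infty$. Choose $\theta > 1$, $0 < r < R$ such that $\theta r < R$. Then there exist a constant $C(\theta)$, depending only on $\theta$, and a radius $\rho$ depending on $f$ and satisfying $r/\sqrt{\theta} \le \rho \le r$, such that for all $\varphi$

(2.42)

Proof We estimate

$$\frac{\theta + 1}{\theta - 1}\, m(\theta r, f) \le \frac{\theta + 1}{\theta - 1}\, T(\theta r, f) \le \frac{\theta + 1}{\theta - 1}\, T(\theta^2 r, f)$$

and, by Lemma 2.2,

$$n(\theta r, f) \le \frac{1}{\log \theta}\, T(\theta^2 r, f).$$

Replacing $\theta$ by $\theta^{1/2}$, Theorem 2.10 implies (2.42) for some $\rho$, $r/\sqrt{\theta} \le \rho \le r$, with

$$C(\theta) := \frac{\sqrt{\theta} + 1}{\sqrt{\theta} - 1} + \log \frac{4\sqrt{\theta}(\sqrt{\theta} + 1)e}{\sqrt{\theta} - 1} \cdot \frac{1}{\log \sqrt{\theta}}.$$

$\Box$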

Representation theorems

We formulate two theorems concerning the possibility of representing a meromorphic function $f$ as a quotient of analytic functions $f_1$, $f_2$ such that the growth of the $f_i$'s is controlled.

Definition 2.3 A function $f$, meromorphic in $|z| < R$, is said to be of bounded characteristic in $|z| < R$, if

$$\sup_{r < R} T(r, f) < \infty.$$

1 there is a constant $B(\theta)$ with the following property. If $f$ is meromorphic for $|z| < \infty$, then there are entire functions $f_1$, $f_2$ such that $f = f_1/f_2$ and for $i = 1, 2$ we have for all $r > 0$

$$T(r, f_i) \le B(\theta)\, T(\theta r, f).$$

Proof This theorem is in [Mi], see [Ru] for an exposition. $\Box$

Comment 2.1 Theorem 2.10 is taken from [Ya] (Lemma 4.2) where it is used in a discussion on an inequality of Chuang Chi-tai bounding $T(r, f)$ in terms of $T(r, f')$.

Comment 2.2 In addition to the Nevanlinna characteristic function $T(r, f)$ there are other related characteristic functions in the literature. In particular when the values are considered as points in the Riemann sphere and the distances are measured accordingly, the theory gets a different, more geometric flavor.

THIRD CHAPTER

Keywords: Subharmonic functions, vector valued analytic and meromorphic functions, matrix and operator valued meromorphic functions, finitely meromorphic.

Analytic vector valued functions

We shall next generalize the characteristic function $T$ for operator valued meromorphic functions. The first concept is going to be denoted by $T_\infty$ and it is defined as such for Banach space valued functions; for operator valued functions we just use the operator norms. The discussion shall touch the properties of subharmonic functions, some of which we present below. Before that, however, let us recall what we mean by vector valued analytic and meromorphic functions. If $f$ is defined in a domain $\Omega \subset \mathbb{C}$ taking values in a Banach space $X$, then it is analytic if

$$\lim_{z \to z_0} \frac{1}{z - z_0} [f(z) - f(z_0)] \tag{3.1}$$

exists for all $z_0 \in \Omega$. The limit of the difference quotient is in the norm topology. Furthermore, $f$ is called meromorphic if apart from poles it is analytic and around any pole $b$ there is a smallest positive integer $m = m(b)$ such that

$$z \mapsto (z - b)^m f(z) \tag{3.2}$$

is analytic at $b$. It is a well known and important result that if the limits are assumed in the weak topology only, they actually exist in the norm topology as well, and so "weakly analytic are analytic".

Subharmonic functions

It is an important starting point for our discussion that if $f$ is analytic taking values in a Banach space, then the mapping

$$u : z \mapsto u(z) := \log^+ \|f(z)\| \tag{3.3}$$

is subharmonic.

Definition 3.1 Let $\Omega$ be a domain of $\mathbb{C}$. A function $u$ from $\Omega$ to $\mathbb{R} \cup \{-\infty\}$ is said to be subharmonic on $\Omega$ if it is upper semicontinuous and satisfies the mean inequality

$$u(z_0) \le \frac{1}{2\pi} \int_{-\pi}^{\pi} u(z_0 + re^{i\varphi})\, d\varphi \tag{3.4}$$

whenever the closed disc $\bar B(z_0, r)$ is contained in $\Omega$. Furthermore, it is harmonic if both $u$ and $-u$ are subharmonic. We recall that $u$ is upper semicontinuous if for all $z_0 \in \Omega$

$$\limsup_{z \to z_0} u(z) \le u(z_0).$$

We can now state the following result.

Theorem 3.1 Suppose $f$ is analytic from a domain $\Omega$ to a Banach space $X$. Then the functions $\|f\|$ and $\log \|f\|$ are subharmonic in $\Omega$.

Proof Clearly $\|f\|$ is continuous when formula we have $f(z_0) = \frac{1}{2\pi i}$

$d$. Let $a$ be a unit vector in this intersection. But then $a \in \ker B$ and writing $a = \sum_{j=1}^{k+1} (v_j^* a)\, v_j$ we obtain

$$\|A - B\|^2 \ge \|(A - B)a\|^2 = \|Aa\|^2 \ge \sum_{j=1}^{k+1} \sigma_j^2 |v_j^* a|^2 \ge \sigma_{k+1}^2,$$

completing the proof. $\Box$

Remark If $A$ is not a square matrix, then it can be augmented to be a square matrix by adding a suitable number of columns or rows consisting of zeros.

Basic inequalities for singular values and eigenvalues

In the following we shall denote by $M_{m,n}$ the space of complex matrices consisting of $n$ columns of length $m$. The following basic and, as such, simple lemma can be proved using unitary invariance and the so called interlacing property for the singular values of submatrices.

Lemma 4.1 Let $C \in M_{m,n}$, $V_k \in M_{m,k}$, $W_k \in M_{n,k}$ be given, where $k \le \min\{m, n\}$, and $V_k$, $W_k$ have orthonormal columns. Then

(a) $\sigma_j(V_k^* C W_k) \le \sigma_j(C)$, $j = 1, 2, \dots, k$, and

(b) $|\det V_k^* C W_k| \le \sigma_1(C) \cdots \sigma_k(C)$.

FOURTH CHAPTER

Proof See [Ho-J2], Lemma 3.3.1. □

Let $\lambda_j = \lambda_j(A)$ denote the eigenvalues of $A$, $\sigma_j = \sigma_j(A)$ the singular values, and recall that we number them in the order of decreasing absolute values.

Theorem 4.5 (H. Weyl, 1949) If $A \in M_d$, then

$$\prod_{j=1}^{k} |\lambda_j| \le \prod_{j=1}^{k} \sigma_j \quad \text{for } k = 1, 2, \dots, d, \tag{4.7}$$

with equality for $k = d$.

Proof Let $\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_d)$. By the Schur Decomposition Theorem there exists a unitary $U$ and a strictly upper triangular $N$ such that

$$A = U(\Lambda + N)U^*.$$

Let $U_k \in M_{d,k}$ denote the $k$ first columns of $U$. Then we have

$$\Lambda + N = U^* A U = \begin{pmatrix} U_k^* A U_k & E \\ F & G \end{pmatrix}$$

with some matrices $E$, $F$, $G$. Since $\Lambda + N$ is upper triangular, $F = 0$ and $U_k^* A U_k$ is upper triangular. Now we apply Lemma 4.1 with $C := A$, $V_k = W_k := U_k$ and conclude

$$\prod_{j=1}^{k} |\lambda_j| = |\det U_k^* A U_k| \le \prod_{j=1}^{k} \sigma_j(U_k^* A U_k) \le \prod_{j=1}^{k} \sigma_j.$$

When $k = d$ the SVD gives us

$$|\det A| = |\det U \det \Sigma \det V^*| = \det \Sigma$$

and the equality in (4.7) follows. □
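As a quick sanity check of (4.7), here is a hand-computed $2 \times 2$ instance. The matrix is chosen here purely for illustration and does not appear in the text.

```latex
A = \begin{pmatrix} 1 & 4 \\ 0 & 1 \end{pmatrix}, \qquad
\lambda_1 = \lambda_2 = 1, \qquad
A^*A = \begin{pmatrix} 1 & 4 \\ 4 & 17 \end{pmatrix}.
% The eigenvalues of A^*A solve t^2 - 18t + 1 = 0, so t = 9 \pm 4\sqrt{5} = (\sqrt{5} \pm 2)^2.
\sigma_1 = \sqrt{5} + 2 \approx 4.236, \qquad \sigma_2 = \sqrt{5} - 2 \approx 0.236.
% k = 1: |\lambda_1| = 1 \le \sigma_1.
% k = 2: |\lambda_1 \lambda_2| = 1 = \sigma_1 \sigma_2 = |\det A|.
```

The strict gap at $k = 1$ comes from the non-normality of $A$; for a normal matrix the two sides of (4.7) coincide for every $k$.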

If the singular values of $A$ and $B$ are known, what can be said about the singular values of $AB$? We formulate the answer in the square matrix case. For the general case, see [Ho-J2], Theorem 3.3.4.

Theorem 4.6 (A. Horn, 1950) If $A, B \in M_d$, then for $k = 1, 2, \dots, d$

$$\prod_{j=1}^{k} \sigma_j(AB) \le \prod_{j=1}^{k} \sigma_j(A)\,\sigma_j(B) \tag{4.8}$$

with equality for $k = d$.

Proof Let $AB = V \Sigma W^*$ be the SVD of $AB$ and put $V_k$ for the $k$ first columns of $V$, and $W_k$ for those of $W$. Then

$$V_k^* AB W_k = \operatorname{diag}(\sigma_1(AB), \dots, \sigma_k(AB)). \tag{4.9}$$

Consider $BW_k \in M_{d,k}$. It can be written, using polar decomposition, as

$$BW_k = U_k R$$

where $U_k \in M_{d,k}$ has orthonormal columns and $R \in M_k$ is positive semidefinite satisfying

$$R^2 = W_k^* B^* B W_k.$$

Then Lemma 4.1 gives

$$\det R^2 = \det(W_k^* B^* B W_k) \le \prod_{j=1}^{k} \sigma_j(B^* B).$$

But $\sigma_j(B^* B) = \sigma_j(B)^2$ and thus $\det R^2 \le \prod_{j=1}^{k} \sigma_j(B)^2$. From (4.9) we obtain

$$\prod_{j=1}^{k} \sigma_j(AB) = |\det(V_k^* AB W_k)| = |\det(V_k^* A U_k R)| = |\det(V_k^* A U_k)|\,|\det R|.$$

But by Lemma 4.1 $|\det V_k^* A U_k| \le \prod_{j=1}^{k} \sigma_j(A)$ and since $\det R \le \prod_{j=1}^{k} \sigma_j(B)$, (4.8) follows. For $k = d$ there must be equality as $|\det(AB)| = |\det A|\,|\det B|$ and

$$\prod_{j=1}^{d} \sigma_j = \Big|\prod_{j=1}^{d} \lambda_j\Big|.$$

□

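The singular values of a sum of two square matrices can be easily estimated with the help of Theorem 4.4.

Lemma 4.2 If $A, B \in M_d$, then for $1 \le j, k \le d$, $j + k \le d + 1$ we have

$$\sigma_{j+k-1}(A + B) \le \sigma_j(A) + \sigma_k(B).$$

Proof Let $A_{j-1}$, $B_{k-1}$ be as in Theorem 4.4. Then, since

$$\operatorname{rank}(A_{j-1} + B_{k-1}) \le \operatorname{rank}(A_{j-1}) + \operatorname{rank}(B_{k-1}) = j - 1 + k - 1 = j + k - 2,$$

we have

$$\sigma_{j+k-1}(A + B) \le \|A + B - A_{j-1} - B_{k-1}\| \le \|A - A_{j-1}\| + \|B - B_{k-1}\| = \sigma_j(A) + \sigma_k(B).$$

□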

Definition 4.1 For $A \in M_d$ put for $k = 1, 2, \dots, d$

$$|||A|||_k := \sum_{j=1}^{k} \sigma_j(A).$$

These are sometimes called Ky Fan norms. For $k = 1$ we have the induced operator norm $\|A\|$ (spectral norm), and with $k = d$ we have the trace norm, also denoted $\|A\|_1$ or $\|A\|_{tr}$.

Theorem 4.7 For $k = 1, \dots, d$, $|||\cdot|||_k$ is a submultiplicative norm in $M_d$, i.e., $|||AB|||_k \le |||A|||_k\,|||B|||_k$.

Proof See e.g. [Ho-J2], section 3.4. □

The total logarithmic size of a matrix

In value distribution theory $\log^+|f|$ separates the large values of $|f|$ from the small ones. When looking at meromorphic functions $F : z \mapsto F(z) \in M_d$ we need to be able to do the same thing.

Definition 4.2 For $A \in M_d$, put

$$s(A) := \sum_{j=1}^{d} \log^+ \sigma_j(A).$$

We may call it the total logarithmic size of $A$.
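For $2 \times 2$ matrices the total logarithmic size can be evaluated without a linear algebra library, because the two singular values are determined by the Frobenius norm and the determinant. The sketch below is illustrative only; the helper names and the sample matrices are assumptions, not notation from the text.

```python
import math

def singular_values_2x2(a, b, c, d):
    """Singular values of [[a, b], [c, d]] (real or complex), largest first.

    Uses sigma1^2 + sigma2^2 = ||A||_F^2 and sigma1 * sigma2 = |det A|.
    """
    frob2 = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det = abs(a * d - b * c)
    disc = math.sqrt(max(frob2 * frob2 - 4.0 * det * det, 0.0))
    s1 = math.sqrt((frob2 + disc) / 2.0)
    s2 = math.sqrt(max((frob2 - disc) / 2.0, 0.0))
    return s1, s2

def log_plus(x):
    return math.log(x) if x > 1.0 else 0.0

def total_log_size(a, b, c, d):
    """s(A) = sum_j log^+ sigma_j(A) for a 2x2 matrix."""
    return sum(log_plus(s) for s in singular_values_2x2(a, b, c, d))

s_shear = total_log_size(1.0, 4.0, 0.0, 1.0)     # [[1, 4], [0, 1]]
s_identity = total_log_size(1.0, 0.0, 0.0, 1.0)  # I
s_double = total_log_size(2.0, 0.0, 0.0, 2.0)    # 2I
print(s_shear, s_identity, s_double)
```

The shear matrix has both eigenvalues equal to 1 yet $s > 0$, since only its larger singular value $\sqrt{5} + 2$ exceeds 1; for $2I$ one gets $d \log 2$ with $d = 2$.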

In order to study simple properties of $s(A)$ we need the following simple technical tool.

Lemma 4.3 Let $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_d \ge 0$, $\beta_1 \ge \beta_2 \ge \cdots \ge \beta_d \ge 0$ be given such that for $k = 1, 2, \dots, d$ we have

$$\prod_{j=1}^{k} \alpha_j \le \prod_{j=1}^{k} \beta_j.$$

Then

$$\sum_{j=1}^{d} \log^+(\alpha_j) \le \sum_{j=1}^{d} \log^+(\beta_j). \tag{4.10}$$

Proof If $\alpha_1 \le 1$, then (4.10) holds. Otherwise, if we put $\alpha_{d+1} := 0$, then let $1 \le m \le d$ be such that $\alpha_m \ge 1$ but $\alpha_{m+1} < 1$. Then, with $k := m$,

$$\sum_{j=1}^{d} \log^+ \alpha_j = \sum_{j=1}^{m} \log \alpha_j \le \sum_{j=1}^{m} \log \beta_j.$$

But

$$\sum_{j=1}^{m} \log \beta_j \le \sum_{j=1}^{m} \log^+ \beta_j \le \sum_{j=1}^{d} \log^+ \beta_j,$$

and (4.10) follows. □
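Lemma 4.3 is easy to try out on concrete numbers. In the sketch below the two sequences are hypothetical test data chosen to satisfy the product hypothesis; the code first verifies the hypothesis and then compares the two $\log^+$ sums of (4.10).

```python
import math

def log_plus(x):
    return math.log(x) if x > 1.0 else 0.0

def partial_products(seq):
    out, p = [], 1.0
    for x in seq:
        p *= x
        out.append(p)
    return out

def hypothesis_holds(alpha, beta, tol=1e-12):
    """Check prod_{j<=k} alpha_j <= prod_{j<=k} beta_j for every k."""
    pairs = zip(partial_products(alpha), partial_products(beta))
    return all(pa <= pb + tol for pa, pb in pairs)

# Hypothetical decreasing, nonnegative sequences (d = 4).
alpha = [3.0, 2.0, 0.5, 0.1]
beta = [4.0, 2.0, 0.6, 0.2]

lhs = sum(log_plus(a) for a in alpha)  # log 3 + log 2
rhs = sum(log_plus(b) for b in beta)   # log 4 + log 2
print(hypothesis_holds(alpha, beta), lhs <= rhs)
```

Only the terms exceeding 1 contribute to either side, which is exactly why the proof cuts the sums at the index $m$.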

Theorem 4.8 For $A, B \in M_d$ we have

$$s(AB) \le s(A) + s(B), \tag{4.11}$$

$$s(A + B) \le 2\,(s(2A) + s(2B)). \tag{4.12}$$

Proof Put $\alpha_j := \sigma_j(AB)$ and $\beta_j := \sigma_j(A)\,\sigma_j(B)$. Then (4.8) allows us to apply Lemma 4.3 to conclude

$$s(AB) \le \sum_{j=1}^{d} \log^+(\sigma_j(A)\,\sigma_j(B)).$$

But $\log^+(\sigma_j(A)\,\sigma_j(B)) \le \log^+ \sigma_j(A) + \log^+ \sigma_j(B)$ and (4.11) follows. To obtain (4.12) notice that for any nonnegative $a$, $b$ we have

$$\log^+ \tfrac{1}{2}(a + b) \le \log^+ a + \log^+ b.$$

Since $\sigma_j(A) = \tfrac{1}{2}\sigma_j(2A)$ we have by Lemma 4.2

$$\sigma_{2j-1}(A + B) \le \tfrac{1}{2}\,(\sigma_j(2A) + \sigma_j(2B)).$$

Therefore $\log^+ \sigma_{2j-1}(A + B) \le \log^+ \sigma_j(2A) + \log^+ \sigma_j(2B)$ holds for $2j - 1 \le d$, and since $\sigma_{2j}(A + B) \le \sigma_{2j-1}(A + B)$ we obtain (4.12). □
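Both inequalities of Theorem 4.8 can be spot-checked on $2 \times 2$ matrices with standard-library Python only. This is a sketch under stated assumptions: the closed-form singular value helper and the two test matrices are illustrative choices, not taken from the text.

```python
import math

def singular_values_2x2(m):
    """Singular values of a 2x2 matrix from its Frobenius norm and determinant."""
    (a, b), (c, d) = m
    frob2 = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det = abs(a * d - b * c)
    disc = math.sqrt(max(frob2 * frob2 - 4.0 * det * det, 0.0))
    return (math.sqrt((frob2 + disc) / 2.0),
            math.sqrt(max((frob2 - disc) / 2.0, 0.0)))

def s(m):
    """Total logarithmic size: sum of log^+ of the singular values."""
    return sum(math.log(x) for x in singular_values_2x2(m) if x > 1.0)

def matmul(p, q):
    return tuple(tuple(sum(p[i][k] * q[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def add(p, q):
    return tuple(tuple(p[i][j] + q[i][j] for j in range(2)) for i in range(2))

def scale(t, p):
    return tuple(tuple(t * x for x in row) for row in p)

# Hypothetical test matrices.
A = ((2.0, 1.0), (0.0, 0.5))
B = ((0.5, 0.0), (3.0, 2.0))

product_ok = s(matmul(A, B)) <= s(A) + s(B) + 1e-12                       # (4.11)
sum_ok = s(add(A, B)) <= 2.0 * (s(scale(2.0, A)) + s(scale(2.0, B))) + 1e-12  # (4.12)
print(product_ok, sum_ok)
```

A passing check on one pair of matrices is of course no proof; it only guards against misreading the direction or the constants of the two bounds.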

When we deal with large dimensional matrices or with operators in trace class we often want to write them as $I + A$.

Corollary 4.4 $s(I + A + B) \le 2\,(s(I + 2A) + s(I + 2B))$.

Proof Write $I + A + B = (\tfrac{1}{2}I + A) + (\tfrac{1}{2}I + B)$ and use (4.12). □

Theorem 4.9 Let $A$ be invertible. Then

$$s(A) = s(A^{-1}) + \log|\det A|.$$

Proof We have by Theorem 4.5

$$\prod_{j=1}^{d} \sigma_j = |\det A|$$

and thus

$$\log \prod_{j=1}^{d} \sigma_j = \sum_{j=1}^{d} \log \sigma_j = \sum_{j=1}^{d} \log^+ \sigma_j - \sum_{j=1}^{d} \log^+ \frac{1}{\sigma_j} = \log|\det A|.$$

But the $\frac{1}{\sigma_j}$'s are the singular values of $A^{-1}$ and so substituting

$$s(A^{-1}) = \sum_{j=1}^{d} \log^+ \frac{1}{\sigma_j}$$

gives the result. It is convenient to put $s(A^{-1}) = \infty$ if $A$ is not invertible. □
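A diagonal example makes the identity of Theorem 4.9 concrete. The matrix below is an illustrative choice, not one used in the text.

```latex
A = \operatorname{diag}(2, \tfrac{1}{3}): \qquad
\sigma_1(A) = 2,\ \sigma_2(A) = \tfrac{1}{3}, \qquad
s(A) = \log^+ 2 + \log^+ \tfrac{1}{3} = \log 2,
\\
A^{-1} = \operatorname{diag}(\tfrac{1}{2}, 3): \qquad
s(A^{-1}) = \log 3, \qquad
\log|\det A| = \log \tfrac{2}{3} = \log 2 - \log 3,
\\
s(A^{-1}) + \log|\det A| = \log 3 + \log 2 - \log 3 = \log 2 = s(A).
```

Here $s(A)$ records the singular values above 1, $s(A^{-1})$ those below 1, and $\log|\det A|$ is the signed difference of the two.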

Theorem 4.10 $A$ is unitary if and only if $s(A) + s(A^{-1}) = 0$.

Proof If $A$ is unitary, then the SVD is $A = U \Sigma V^*$ with $\Sigma = I$, and $s(A) = s(A^{-1}) = 0$. Conversely, if $s(A) + s(A^{-1}) = 0$ holds then $\sigma_j(A) = 1$ for all $j$ and the SVD is $A = U \Sigma V^* = U V^*$ as $\Sigma = I$. But $U V^*$ is unitary and we are done. □

Some basic properties of the total logarithmic size

We start by studying how the total logarithmic size behaves in similarity transformations. Let $A = SBS^{-1}$. Then by Theorem 4.6

$$\prod_{j=1}^{k} \sigma_j(A) \le \prod_{j=1}^{k} \sigma_j(S)\,\sigma_j(B)\,\sigma_j(S^{-1}). \tag{4.13}$$

But $\sigma_j(S^{-1}) = 1/\sigma_{d-j+1}(S)$ and if we define

$$\kappa_j(S) := \sigma_j(S)/\sigma_{d-j+1}(S),$$

then we obtain from (4.13)

$$\prod_{j=1}^{k} \sigma_j(A) \le \prod_{j=1}^{k} \sigma_j(B)\,\kappa_j(S).$$

Notice that $\kappa_1(S) = \|S\|\,\|S^{-1}\|$ is the condition number of $S$. Since $\kappa_j(S) \ge \kappa_{j+1}(S)$ we obtain using Lemma 4.3

$$\sum_{j=1}^{d} \log^+ \sigma_j(A) \le \sum_{j=1}^{d} \log^+(\sigma_j(B)\,\kappa_j(S)) \le \sum_{j=1}^{d} \log^+ \sigma_j(B) + \sum_{j=1}^{d} \log^+ \kappa_j(S),$$

so that

$$s(A) \le s(B) + c(S), \tag{4.14}$$

where $c(S)$ is defined in (4.15).

Definition 4.3 For invertible $S$ put

$$c(S) := \sum_{j=1}^{d} \log^+ \kappa_j(S), \tag{4.15}$$

the total (logarithmic) conditioning of $S$.

Theorem 4.11 Let $S$, $R$ and $T$ be invertible matrices. Then

(a) $c(SR^{-1}) = 0$ if and only if $SR^{-1}$ is unitary,

(b) $c(SR^{-1}) = c(RS^{-1})$,

(c) $c(ST^{-1}) \le c(SR^{-1}) + c(RT^{-1})$.

Proof (a) $c(SR^{-1}) \ge 0$ always, but if $c(SR^{-1}) = 0$ then in particular $\kappa_1(SR^{-1}) = 1$ and $SR^{-1}$ is unitary. Conversely, for a unitary $SR^{-1}$, $\sigma_j(SR^{-1}) = 1$ for all $j$ and $c(SR^{-1}) = 0$. (b) is trivial from the definition. (c) follows from writing $ST^{-1} = SR^{-1}RT^{-1} = (SR^{-1})(RT^{-1})$ and using

$$c(AB) \le c(A) + c(B), \tag{4.16}$$

which holds for any invertible matrices $A$, $B$. To obtain (4.16) notice that

$$\prod_{j=1}^{k} \kappa_j(S) = \prod_{j=1}^{k} \frac{\sigma_j(S)}{\sigma_{d-j+1}(S)}.$$

We have

$$\prod_{j=1}^{k} \sigma_j(AB) \le \prod_{j=1}^{k} \sigma_j(A) \prod_{j=1}^{k} \sigma_j(B).$$

But in a symmetric way, with $m = d - k + 1$,

$$\prod_{j=m}^{d} \frac{1}{\sigma_j(AB)} \le \prod_{j=m}^{d} \frac{1}{\sigma_j(A)} \prod_{j=m}^{d} \frac{1}{\sigma_j(B)}.$$

Thus we obtain by multiplying both sides

$$\prod_{j=1}^{k} \kappa_j(AB) \le \prod_{j=1}^{k} \kappa_j(A)\,\kappa_j(B),$$

which implies (4.16) with the help of Lemma 4.3:

$$\sum_{j=1}^{d} \log^+ \kappa_j(AB) \le \sum_{j=1}^{d} \log^+(\kappa_j(A)\,\kappa_j(B)) \le \sum_{j=1}^{d} \log^+ \kappa_j(A) + \sum_{j=1}^{d} \log^+ \kappa_j(B).$$

□

Corollary 4.5 If $A$, $B$ are invertible, then (4.16) holds.

When we think of $SR^{-1}$ as the similarity transformation which takes $RAR^{-1}$ into $SAS^{-1} = (SR^{-1})(RAR^{-1})(RS^{-1})$ then we may think of $c(SR^{-1})$ as the "distance" between the similarity transformations $S$ and $R$. In particular, the following shows the continuity of the total logarithmic size $s$ in similarity transformations.

Theorem 4.12 If $S$ and $R$ are invertible, then for all $A$

$$|s(SAS^{-1}) - s(RAR^{-1})| \le c(SR^{-1}).$$

Proof From (4.14) we have

$$s(SAS^{-1}) \le s(RAR^{-1}) + c(SR^{-1})$$

and likewise

$$s(RAR^{-1}) \le s(SAS^{-1}) + c(RS^{-1}).$$

Since $c(RS^{-1}) = c(SR^{-1})$, the claim follows. □

We conclude that $s(A)$ behaves in a natural and controlled way under similarity transformations. Next we ask in what ways the norm $\|A\|$ and $s(A)$ can be estimated in terms of each other. Since $\sigma_1(A) = \|A\|$, we have trivially

$$\|A\| \le \exp(s(A))$$

and

$$s(A) \le d \log^+ \|A\|.$$

However, if we know the function $s(zA)$, then the norm can be obtained accurately. In fact,

$$\frac{1}{\|A\|} = \sup\{|z| \;:\; s(zA) = 0\}. \tag{4.17}$$

Let us now look at the power $A^n$ and the exponential $e^{zA}$. The behavior of $s(A^n)$ is related to the spectral radius formula:

$$\lim_{n \to \infty} \|A^n\|^{1/n} = \rho(A) = \max\{|\lambda| \;:\; \lambda \in \sigma(A)\}. \tag{4.18}$$

Let $\Lambda = \operatorname{diag}(\lambda_1(A), \lambda_2(A), \dots)$.

Theorem 4.13 We have

$$\lim_{n \to \infty} \frac{1}{n}\, s(A^n) = s(\Lambda). \tag{4.19}$$

Proof The claim follows from the following generalization of (4.18): $\sigma_j(A^n)^{1/n} \to |\lambda_j(A)|$, which is due to Yamamoto (1967), and generalizes to compact operators, see [Ro], Proposition 2.d.6. □

Theorem 4.14 For $z \in \mathbb{C}$, $|z| = r$, $A \in M_d$,

$$s(e^{zA}) \le r\,\|A\|_1, \tag{4.20}$$

where $\|A\|_1 = \sum_{j=1}^{d} \sigma_j(A)$.

Proof Since $e^{zA} = \lim (I + \frac{z}{n}A)^n$, we have $s(e^{zA}) \le \liminf n\, s(I + \frac{z}{n}A)$. But $\sigma_j(I + \frac{z}{n}A) \le 1 + \frac{r}{n}\sigma_j(A)$ by Theorem 4.4 and thus $\log^+ \sigma_j(I + \frac{z}{n}A) \le \frac{r}{n}\sigma_j(A)$, and the claim follows. □

We have observed that $s$ behaves nicely when two matrices are multiplied together, but estimates for the sum are necessarily somewhat more complicated. Consider the sum of two matrices. If $A = B = I$, then $s(A) = s(B) = 0$, but $s(2I) = d \log 2$. By Lemma 4.2 we have

$$\sigma_{2j-1}(A + B) \le \sigma_j(A) + \sigma_j(B)$$

and since $\sigma_{2k} \le \sigma_{2k-1}$ we have

$$\sum_{j=1}^{d} \log^+ \sigma_j(A + B) \le 2 \sum_{j=1}^{\lfloor d/2 \rfloor} \log^+(\sigma_j(A) + \sigma_j(B)) \le 2\,(s(A) + s(B)) + \operatorname{rank}(B) \log 2. \tag{4.21}$$

Another grouping of the indices in Lemma 4.2 is also useful.

Theorem 4.15 For $A, B \in M_d$ we have

$$s(A + B) \le 2\,(s(A) + s(B)) + \operatorname{rank}(B) \log 2 \tag{4.22}$$

and

$$s(A + B) \le s(A) + s(B) + \operatorname{rank}(B)\,(\log^+ \|A\| + \log 2). \tag{4.23}$$

Proof Inequality (4.22) is in (4.21) while (4.23) follows from

$$\sigma_j(A + B) \le \|A\| + \sigma_j(B), \quad j \le \operatorname{rank}(B),$$

and from

$$\sigma_{j+k-1}(A + B) \le \sigma_j(A) + \sigma_k(B) = \sigma_j(A),$$

where $k > \operatorname{rank}(B)$. Thus

$$\sum \log^+ \sigma_j(A + B) \le \operatorname{rank}(B)\,(\log^+ \|A\| + \log 2) + \sum \log^+ \sigma_j(B) + \sum \log^+ \sigma_j(A).$$

□

Finally, if we add something small into $A$ we do have an estimate without additional terms, so that we see the natural continuity.

Continuity Lemma 4.4 If $A, B \in M_d$, then

$$|s(A) - s(B)| \le \|A - B\|_1. \tag{4.24}$$

Proof If $a$ and $b \ge 0$, then $|\log^+(a) - \log^+(b)| \le |a - b|$. So, we have

$$|s(A) - s(B)| \le \sum |\log^+ \sigma_j(A) - \log^+ \sigma_j(B)| \le \sum |\sigma_j(A) - \sigma_j(B)|.$$

The trace norm $\|\cdot\|_1$ has the following property. Form from the singular values two diagonal matrices $\Sigma(A)$ and $\Sigma(B)$ respectively, arranging the diagonals in the usual decreasing order. Then

$$\|\Sigma(A) - \Sigma(B)\|_1 \le \|A - B\|_1,$$

see [Ho-J1], p. 448. This completes the proof. □

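Direct sum, Kronecker product and Hadamard product

Given two matrices $A \in M_{d_1}$, $B \in M_{d_2}$, operating in $\mathbb{C}^{d_1}$, $\mathbb{C}^{d_2}$ respectively, their direct sum $A \oplus B$ is the linear mapping in $\mathbb{C}^{d_1} \oplus \mathbb{C}^{d_2}$ which maps $(x, y) \in \mathbb{C}^{d_1} \oplus \mathbb{C}^{d_2}$ to $(Ax, By)$ and can be represented with a block diagonal matrix

$$A \oplus B = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}.$$

The singular values of $A \oplus B \in M_{d_1 + d_2}$ are clearly $\sigma_j(A)$, $\sigma_k(B)$, $j = 1, 2, \dots, d_1$, $k = 1, 2, \dots, d_2$. Therefore we have the following result:

$$s(A \oplus B) = s(A) + s(B). \tag{4.25}$$

The Kronecker product of two matrices $A \in M_{m,n}$ and $B \in M_{p,q}$ is denoted by $A \otimes B$ and is given by

$$A \otimes B = \begin{pmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{pmatrix} \in M_{mp,nq}$$

where

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}.$$

Remark In this notation the rank-1 matrix $xy^*$, with $x, y \in \mathbb{C}^d$, becomes $xy^* = x \otimes y^*$.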

In multilinear algebra it is customary to write $x \otimes y$ for the bilinear mapping. Our notation here follows the matrix analysis tradition where $x$ is thought of as a column vector and $x^*$ as a row vector.

Lemma 4.5 $(A \otimes B)(C \otimes D) = AC \otimes BD$.

Proof This is of course under the assumption that the dimensions match so that the ordinary products make sense. To prove this, split into blocks and multiply. □

Corollary 4.6 If $A \in M_{d_1}$ and $B \in M_{d_2}$ are invertible, then so is $A \otimes B$ and $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$.

Remark In general $A \otimes B \ne B \otimes A$ so observe the order: $A^{-1} \otimes B^{-1}$. To prove it:

$$(A \otimes B)(A^{-1} \otimes B^{-1}) = AA^{-1} \otimes BB^{-1} = I_{d_1} \otimes I_{d_2} = I_{d_1 d_2} = (A^{-1} \otimes B^{-1})(A \otimes B).$$

We can now compute the spectrum of $A \otimes B$ for $A \in M_{d_1}$, $B \in M_{d_2}$. Let $Ax = \lambda x$ and $By = \mu y$, where $x$, $y$ are eigenvectors, and $\lambda$, $\mu$ eigenvalues. We have by Lemma 4.5

$$(A \otimes B)(x \otimes y) = (Ax \otimes By) = \lambda\mu\, x \otimes y$$

so that $\lambda\mu \in \sigma(A \otimes B)$, and $x \otimes y \in \mathbb{C}^{d_1 d_2}$ is a corresponding eigenvector. If we take all products $\lambda_j \mu_k$ with multiplicities when needed, we have obtained all eigenvalues of $A \otimes B$. This is easiest to check using the Schur decomposition. In fact let $L$, $M$ be upper triangular and $U$, $V$ unitary so that

$$A = ULU^*, \qquad B = VMV^*.$$

Then $U \otimes V$ is unitary (by Corollary 4.6) and $L \otimes M$ is upper triangular, with the $\lambda_j \mu_k$'s on the diagonal. The conclusion follows as by Lemma 4.5

$$(U \otimes V)^* (A \otimes B)(U \otimes V) = L \otimes M.$$

In order to obtain the singular values of $A \otimes B$ observe that since $(A \otimes B)^* = A^* \otimes B^*$, we have

$$(A \otimes B)^*(A \otimes B) = (A^* A) \otimes (B^* B)$$

and the previous result on eigenvalues implies that the singular values of $A \otimes B$ are obtained as products of singular values of $A$ and $B$:

$$\sigma_j(A)\,\sigma_k(B),$$

where $j = 1, \dots, d_1$, $k = 1, \dots, d_2$. We obtain the following result.

Theorem 4.16 If $A \in M_{d_1}$, $B \in M_{d_2}$, then

$$s(A \otimes B) \le d_2\, s(A) + d_1\, s(B). \tag{4.26}$$

Proof Consider the product of all singular values of $A \otimes B$ and replace $\sigma_i(A \otimes B)$ with $\hat{\sigma}_i(A \otimes B)$ where $\hat{\alpha} := \max\{\alpha, 1\}$. Then $s(A \otimes B)$ is the logarithm of the new product. But $\hat{\sigma}_i(A \otimes B) = \widehat{\sigma_j(A)\sigma_k(B)} \le \hat{\sigma}_j(A)\,\hat{\sigma}_k(B)$ and doing this for every $\sigma_i(A \otimes B)$ yields the following product:

$$\prod_{j=1}^{d_1} \hat{\sigma}_j(A)^{d_2} \prod_{k=1}^{d_2} \hat{\sigma}_k(B)^{d_1}.$$

The logarithm of this is $d_2\, s(A) + d_1\, s(B)$. □
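Since the singular values of $A \otimes B$ are the products $\sigma_j(A)\sigma_k(B)$, the bound (4.26) can be checked directly from two lists of singular values, without forming any matrix. The numbers below are hypothetical test data and the helper names are illustrative.

```python
import math

def log_plus(x):
    return math.log(x) if x > 1.0 else 0.0

def s_from_singular_values(sigmas):
    """Total logarithmic size, given the singular values."""
    return sum(log_plus(x) for x in sigmas)

def kron_singular_values(sigma_a, sigma_b):
    """All products sigma_j(A) * sigma_k(B), i.e. the singular values of A (x) B."""
    return [x * y for x in sigma_a for y in sigma_b]

# Hypothetical singular values: d1 = 2 and d2 = 3.
sigma_A = [3.0, 0.5]
sigma_B = [2.0, 1.5, 0.25]
d1, d2 = len(sigma_A), len(sigma_B)

lhs = s_from_singular_values(kron_singular_values(sigma_A, sigma_B))
rhs = d2 * s_from_singular_values(sigma_A) + d1 * s_from_singular_values(sigma_B)
print(lhs, rhs, lhs <= rhs)
```

The inequality is usually strict, because a product such as $3 \cdot 0.25$ drops below 1 and contributes nothing on the left, while the factor 3 is still counted $d_2$ times on the right.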

Given two matrices $A, B \in M_{m,n}$ with elements $(a_{ij})$, $(b_{ij})$ respectively, their Hadamard product $A \circ B \in M_{m,n}$ is defined by

$$(A \circ B)_{ij} = a_{ij} b_{ij}.$$

This is sometimes called the Schur product or the entrywise product. An important and simple result related to this product is that the Hadamard product of positive semidefinite matrices is always positive semidefinite. If $A, B \in M_d$ are positive definite and the eigenvalues are ordered decreasingly then

k

j=1

j=1

IT Aj(A)Aj(B) ~ IT Aj(A

0

B),

for k = 1,2, ... , d

(see [Ho-J1], p. 316). For our purposes the reverse inequalities are of interest. To that end, for any $A, B \in M_{m,n}$, already Schur showed that

$$\sigma_1(A \circ B) \le \sigma_1(A)\,\sigma_1(B).$$

Let $A \in M_d$. Put $r_1(A) \ge r_2(A) \ge \cdots \ge r_d(A)$ for row sums as follows: $(r_k(A))^2$ is the $k$th largest number among $\sum_{j=1}^{d} |a_{ij}|^2$, $i = 1, \dots, d$. Let $c_k(A)$ be defined similarly for column sums. Then the following holds for $A, B \in M_d$:

$$\prod_{j=1}^{k} \sigma_j(A \circ B) \le \prod_{j=1}^{k} c_j(A)\,r_j(B), \quad k = 1, 2, \dots, d$$

([Ho-J1], p. 355). This allows an easy upper estimate for $s(A \circ B)$. In fact, by Lemma 4.3

$$s(A \circ B) \le \sum_{j=1}^{d} \log^+[c_j(A)\,r_j(B)]. \tag{4.27}$$

Comment 4.1 In the discussion above we have used [Ho-J2] as a basic reference.

Comment 4.2 Notice that $s(A)$ is not of the form $\log^+ \|A\|$ in any operator norm. In that sense it is really a different "tool". Many of its properties appear here for the first time.

Comment 4.3 The total logarithmic size generalizes to bounded operators $A$ in Hilbert spaces. In fact, let

$$\sigma_j(A) := \inf_{\operatorname{rank}(B) < j} \|A - B\|,$$

and then set

$$s(A) := \sum_{j=1}^{\infty} \log^+ \sigma_j(A).$$

For example, with compact operators $K$ we always have $s(K) < \infty$, and if $K$ is in the trace class, that is, $\|K\|_1 := \sum_{j=1}^{\infty} \sigma_j(K) < \infty$, then also $s(I - K) < \infty$. Many properties of the total logarithmic size can be proved simply by approximating techniques. We refer to [N08]. See also the subsection "Extension to trace class" in the next chapter.

FIFTH CHAPTER

Keywords: Inversion identity, trace class, finitely trace class meromorphic, Schatten class.

The total logarithmic size is subharmonic

We shall consider here the subharmonicity of the total logarithmic size of an analytic $M_d$-valued function. To that end we shall first consider a problem related to the eigenvalues instead of the singular values. We follow closely [A2]. Let $\{\lambda_j(A)\}_1^d$ denote the eigenvalues of $A \in M_d$, indexed so that

$$|\lambda_1(A)| \ge |\lambda_2(A)| \ge \cdots \ge |\lambda_d(A)|.$$

If $F$ is now an analytic function in a domain $\Omega$ we know from Theorem 3.2 that $\log^+ |\lambda_1(F(z))|$ is subharmonic as $|\lambda_1(A)|$ gives the spectral radius of $A$. However, the corresponding function with the other eigenvalues need not be subharmonic.

Example 5.1 Let

$$F(z) = (\;\cdots\;)$$

so that the eigenvalues are $\{1 + z, 1 - z\}$. Thus $|\lambda_1(F(z))| = \max\{|1 + z|, |1 - z|\}$ while $|\lambda_2(F(z))| = \min\{|1 + z|, |1 - z|\}$ and we see that $|\lambda_2(F(z))|$ is not subharmonic as it violates the mean value property at the origin with a small enough radius.

Lemma 5.1 Let $F$ be analytic from a domain $\Omega$ into $M_d$. Then the functions

$$u_k(z) := \sum_{j=1}^{k} \log |\lambda_j(F(z))| \tag{5.1}$$

are subharmonic for $k = 1, 2, \dots, d$.

Proof Fix $z_0 \in \Omega$ and choose an eigenvalue $\mu_0 \in \sigma(F(z_0))$ and take a small enough radius $s > 0$ such that the closed disc $\bar{B}(\mu_0, s)$ contains no other eigenvalue of $F(z_0)$. Then we can fix a small $\delta > 0$ such that for $|z - z_0| < \delta$ no eigenvalue of $F(z)$ touches the circle $\partial B(\mu_0, s)$, which is possible as the eigenvalues are continuous and there is only a finite number of them. When counted with multiplicities, let $m_0$ denote the multiplicity of $\mu_0$, so that $d - m_0$ eigenvalues of $F(z)$ stay outside of $B(\mu_0, s)$. If $m_0 = 1$ then $\mu_0(z)$ is analytic in $z$, but for $m_0 > 1$ it may happen that the eigenvalue is not analytic. In such a case the eigenvalue splits into several eigenvalues, say, into $\mu_1(z), \dots, \mu_{m_0}(z)$, each of which is analytic in a small punctured neighborhood of $z_0$. Notice that some of these eigenvalues can be multiple copies of each other, but then they stay as copies and each one is separately analytic. In any case, if one defines a function $h$ around $z_0$ by setting

and for 0
0 and an integer mj such that

bj(z) = cj(l + o(1))r2m; as Izl = r ---+ O. Consider now bl decreasingly we have

= L~ Aj. As the eigenvalues are numbered

which further implies Cl

-d

~

l'

1m

. f Al(Z)

III

-2- ~

z-+o r

ml

l'

1m sup

Al(Z)

- 2 - ~ Cl'

z-+o r

ml


For the coefficient b2 we have in the same way

),1),2::; b2::; (~),1),2 This implies

< l'lIDm . f

C2 -C1 (~) -

< lim sup

),2(Z)

z->O

r 2 (m2- m l) -

z->O

< -c2 d

),2(Z)

r 2 (m2- m d -

C1

Continuing this way we see that if ),j is not identically 0, then there exists constants aj > 0 and an integer kj such that

a~ < lim inf ),j(z) < lim sup ),j(z) < ~. z->O r 2k j

J -

-

z->o r2kj

-

a~ J

Taking the logarithm and dividing by 2 gives log aj ::; lim inf (log Uj (F(z)) z->O

1

+ k j log -r )

::; lim sup (loguj(F(z)) + k j z-+O

log~) r

::; log ~ aj

Since the eigenvalues were ordered decreasingly there is a largest J such that k j < 0 for j ::; J. Summing over j then gives J

0: ::;

lim inf ( " log+ uj(F(z)) + " k j log ~) z->O

~

~

r

j=1 J

::; lim sup (Llog+Uj(F(z)) 0:

- 'E.;=1

:=

r

j=1

Z-+O.

where

+ Lkjlog~)::;(3

'E.;=11ogaj and (3 := 'E.;=11og

;j'

Thus, in particular, J.L(O) .-

0

k j is an integer.

The proof actually gave somewhat more. Namely that limsup can be replaced by lim and that the limit process is controlled with bounds. Lemma 5.5 If F is as above, then .

1 ) p. (zo=lm

Z-+Zo

and there are constants 0: ::;

0:

s(F(z)) 1 log p-=r IZ-ZOI

(5.8)

and (3 such that

lim inf (s(I - F(z)) - J.L(zo) log I 1 I) z-+zo z - Zo

::; lim sup (s(I - F(z)) - J.L(zo) log I 1 Z-+Zo

z - Zo

I)::; (3.

Proof The inequalities are explicitly available in the previous proof and the limit in (5.8) is obtained by dividing the estimates by log(l/Iz - zol). 0 We shall need an auxiliary function.


Izl
0 we have

Too(r, (I - ZA)-1) $ crP + O(logr). Since A E Sp there exists an m, large enough so that 1

00

- L aj(A)P < c. p j=m+1 Then, however, we can proceed as follows:

Too(r, (1 - ZA)-1) $T1(r,I - zA) $ logM1 (r, I - zA) m

$ Llog(l + raj (A)) j=1 $O(logr) + crP •

1

00

p

j=m+1

+ -rP L

aj(A)P

Here we used the inequality 10g(1 + x) $ ~xP, valid for x > 0 and 0 < p $ 1. In the general case, let k be a positive integer such that k < p $ k + 1. Then in particular Ak+1 E S1, and in fact 00

00

Laj(Ak+1)m $ Laj(A)P, j=1 j=1 see e.g. Corollary II.4.2 in [Go-K]. We have, compare with Theorem 5.7,

Too(r, (I - ZA)-1) $ T 1(r, 1- Zk+1 Ak+1) + k 10g(1 + riIAII). Here we proceed as above and in particular use 10g(1 + r k+1aj(A k+1)) $ k + 1 rPaj(Ak+1)m p to split the sum at a proper place in order to have the growth again bounded by crP + O(logr). 0 Recall that in Example 1.5 we had a self adjoint operator A such that its eigenvalues were

Aj = -

(:jr

Thus w = 1/2, but A E Sp only for p > 1/2. Since V 2 is a rank-1 perturbation of A, the same applies to V 2 .

Powers and their resolvents

The proof of Theorem 6.5 was based on

$$T_1(r, I - z^{k+1}A^{k+1}) = T_1(r, (I - z^{k+1}A^{k+1})^{-1}),$$

valid for $k + 1 \ge p$. We shall next study the asymptotic behavior of $\frac{1}{k} T_1(r, (I - z^k A^k)^{-1})$ as $k$ grows. Given a compact $A$, we denote by $\{\lambda_j(A)\}$ the sequence of its eigenvalues, indexed so that $|\lambda_1(A)| \ge |\lambda_2(A)| \ge \cdots$ and each eigenvalue repeated according

SIXTH CHAPTER

to the dimension of the corresponding eigenspace. If the operator has only a finite number of eigenvalues, then the sequence is continued by setting $\lambda_j(A) = 0$ for the larger indices.

Lemma 6.2 If $A \in S_1$, then

$$N_1(r, (I - zA)^{-1}) = \sum_{j=1}^{\infty} \log^+ |\lambda_j(A)\, r|. \tag{6.12}$$

Proof Choose r and take the Riesz spectral projection of A including all eigenvalues which are larger than, say, in modulus. This gives a finite rank operator A r . Then A - Ar can be approximated arbitrary well with another finite rank operator and this shows that N l (r, (I - zA)-I) only depends on A r . (Compare with the Continuity Lemma 4.4.) But since this is of finite rank, it is unitarily similar to a finite dimensional upper triangular (that is, a sum of diagonal and nilpotent) operator. But then the nilpotent part can be made arbitrarily small by another suitable similarity transformation and we conclude that Nl only depends on the eigenvalues. In fact, if S denotes a similarity transformation, and if B = SCS- l with d =dimB, then

r!1

s(B) :$ d log (lISIIIIS- l ll) + s(C) and therefore the multiplicity J.L{ Aj ~A)) is not affected by the similarity transforma&a 0 Observe that the right hand side of (6.12) makes sense for all compact operators as it is always a finite sum for any fixed r. We introduce the following notation. Given a sequence {Aj} converging to zero we set N(r,{Aj}):= I)og+ IAjrl· j

Now the following holds.

Theorem 6.6 Assume $A \in S_p$ with some $p$. Then

$$\lim_{k \to \infty} \frac{1}{k}\, T_1(r, (I - z^k A^k)^{-1}) = N(r, \{\lambda_j(A)\}). \tag{6.13}$$

Proof Recall that if A E Sp then A k E SI for k ~ p. Then for such k T l (r, (I - zk Ak)-I) and T l (r,! - zk Ak) are both well defined and equal. The proof is given by several simple lemmas, some of which have some independent interest. Lemma 6.3 If A is compact, then (6.14)

Proof of Lemma 6.3 We formulated this as Theorem 4.13 for Md. This version can be found as Proposition 2.d.6 in [8]. 0 0 The aim is to show that 1 k k kml(r,I-z A )-+N{r,{Aj(A)}).

(6.15)


This would imply (6.13) as Tl(r, (I - zk Ak)-I) = T l (r, I - zkA k )

and, trivially, Nl(r, 1- zk Ak) = o. We shall first reduce the claim to a finite dimensional problem. Since our basic claim is about a limit with a fixed r, we can without lack of generality set r = 1 in the following. Choose a small 0 < 0 < 1. Then take a spectral decomposition of A = Al EEl A2 as follows:

1. ( A2 := -2 'In

)"(>./ - A)-ld)".

11>\1=1-6

By the spectral radius formula we have for large enough n

~ 1- ~.

IIA2'II!.

Lemma 6.4 Assume that A E Sp and p(A) < 1. Then we have lim ml(1,I - zk Ak) = 0

k-+co

as k -

00.

Proof of Lemma 6.4 If p(A) < p < 1 then for large enough n we have IIAnl1 ~ pn. If also n ~ k where k such that Ak E Sl, then we can estimate as Izl = 1, which shows that

o

The claim follows.

Lemma 6.5 If $A \in S_1$ and $B$ is of finite rank and they operate in invariant subspaces $H_A$, $H_B$ respectively with $H_A \cap H_B = \{0\}$, then

$$s(I + (A \oplus B)) \le s(I + A) + s(I + B) + \operatorname{rank}(B)\,(\log(1 + \|A\|) + \log 2). \tag{6.16}$$

Proof of Lemma 6.5 This is clear by (4.23) and (4.25). □

If A = Al EEl A2 as above and rankA I = d then Lemma 6.5 gives ml(1,I _znAn) ~ml(1, 1-

+d(log(1

zn Ai) + ml(1, 1- zn A2')

(6.17)

+ IIA2'11) + log 2).

This follows because An = Af EEl A2 allows US to apply Lemma 6.5 with _zn An in place of A. By Lemma 6.4 we have limn-+ co ml(1, 1- zn A 2 ) = 0 and since IIA211- 0, then inequality (6.17) implies lim sup .!.ml(1,I - znAn ) n-+co

n

~ limsup .!.ml(1,I n-+co

n

znAi).


What we need still to prove is the reverse inequality liminf !ml(l,I - zn Ar) n-+oo n

~ liminf !ml(l,I n-+oo n

zn An).

(6.18)

and that the limit exists and satisfies (6.19) Consider first (6.18). Let P denote the spectral projection: Al = PA. Then for ~ d we have aj(I + A 1 ) ~ 1lPllaj(I + A)

j

while for j > d we have aj(1 + A 1 ) = 1. Thus s(I + A 1 ) ~ s(1 + A) + dlog IIPII.

Applying this to _zn An in place of A gives (6.18). In order to prove (6.19) observe first that by construction Nl (1, (I - zAd- 1 = N(I, {Aj(A)}). And recall that we have set r = 1. For Izl = 1 we have -1

+ aj(An)

~

aj(1 - zn Ar)

~

1 + aj(An)

which implies, as Al is of rank d,

By Lemma 6.3 we know that

which proves (6.19). The proof of Theorem 6.6 is now completed.

D

We shall close this topic with similar results for Too. Here it is natural to look at general bounded operators in a Banach space X. Definition 6.1 Suppose A E B(X). We denote by Poo(A) the smallest radius such that (I - zA)-l is meromorphic for Izl < 1/ Poo(A). Theorem 6.1 If A E B(X), then (1 - ZA)-l and (I - zk Ak)-l are meromorphic in the same discs: Poo(A) = Poo(Ak)-k and

(6.20) while


Proof Write, with,pj := 2rrj/k,

(I - zk Ak) = (I - zA)(I - ei 1, for all 0 :s: c < "(. Clearly, we have C = 0 only when A = O. We close this topic by a consequence of resolvent being bounded in the unit disc, as in (6.28).

Theorem 6.11 If (6.29) then 00

B

:s: L IIAnl1 :s: 4B(1 + B).

(6.30)

i=1 Proof The idea ofthe proof is simple. Knowing the value of Moo(r, (I -zA)-1) allows us to use the estimate (6.27) with r > 1 such that B(r - 1) < 1. In fact, we obtain from 1- zA = z(I - A) - (z - 1)1 that -1 1 +B

Moo(r, (I - zA)

If we choose r := 1 +

2k we obtain

00

B =

):s: 1- (r-1 )B.

II L

00

Ai ll-1:S: L IIAili i=O i=1

00

:s: LMoo(r, (I -

zA)-1)r- i

:s: 4B(1 + B).

i=1

□

What if small perturbation means small in norm

One application of knowing the growth function $T_\infty(r, (I - zA)^{-1})$ is given in the following chapter: we show that there exists a sequence of monic polynomials $\{p_j\}$ such that the decay of $\|p_j(A)\|$ is related to the growth of the resolvent. Above we saw that the speed of growth of $T_\infty(r, (I - zA)^{-1})$ is robust in low rank perturbations of $A$. In practical computations we would however not use $A$ itself, say an integral operator representing the inverse of some differential operator, but rather a discretization of it, say $A_h$. In such a case it is of interest to know what happens to the growth function, under the assumption that $E := A - A_h$ is small in norm. The first observation is that knowing $\|A - A_h\|$ and $T_\infty(r, (I - zA)^{-1})$ alone does not imply much. In fact, $T_\infty$ carries no information on the dimensions of invariant subspaces related to the poles, and thus an arbitrarily small perturbation of $A$ can split the pole into arbitrarily many poles. Thus in these terms, all we can say is that, if $T_\infty(r, (I - zA)^{-1})$ stays small for $|z| \le R_0$, so that we can conclude that $(I - zA)^{-1}$ is actually analytic in that disc, then a simple perturbation result is possible, as a corollary of Theorem 6.9.

Corollary 6.2 If for r

:s: Ro,

then (I - ZA)-1 is analytic for r

:s:

(6.31) Ro and the following estimate holds (6.32)


for r ::; Ro where ~ is given in Theorem 6.10. If now

11(1 -

z(A + E))-lll

RoIIEII < ~,

then

::; ~ _ £IIEII'

(6.33)

In order to be able to estimate Too(r, (I - z(A + E))-l for larger r we must pose further restrictions on either A or on E. We shall assume that A is in the trace class.

Theorem 6.12 Assume that A E 8 1 and E E B(H). Then for

1

Too(r, (1 - z(A + E))- ) ::;

zA.

r11E11 < 1

rllEl1 rllAlh + (1 + rllAll1) 1 _ rlIEIl'

(6.34)

Proof This follows from Theorem 6.2 by choosing F(z) = 1- zE and G(z) = 0

Comment 6.1 Much of the material of this chapter is from [N08] and [N09]. About Theorem 6.5 there are early related results in the Russian literature, see e.g. [Ma] and the references given there.

Comment 6.2 Theorem 6.11 is from [N09]. It would be interesting to know the exact constant(s) in (6.30), say in the form

$$\sum_{i=1}^{\infty} \|A^i\| \le aB + bB^2.$$

We know that this requires $a \ge 2$, $b \ge 4/9$.

SEVENTH CHAPTER

Keywords: Infinite products, quotient representations, spectral polynomials, Krylov solver, robust error bounds.

Combining a scalar function with an operator

In the following we consider functions $f_A$ which are obtained by combining a bounded operator $A \in B(X)$ with a scalar meromorphic function $f$ as follows

$$f_A : z \mapsto f(zA) \in B(X). \tag{7.1}$$

For simplicity, we shall assume that $f$ is meromorphic in the whole plane and grows at most with finite order $\rho$, that is,

(7.2)

for all $\varepsilon > 0$. Likewise, we assume that $A$ is almost algebraic (equivalent with assuming that the resolvent $(I - zA)^{-1}$ is meromorphic in the whole plane) and such that the resolvent grows at most with finite order $\omega$, i.e.

(7.3)

for all $\varepsilon > 0$. For example, operators in Schatten class $S_p$ grow at most with order $p$ by Theorem 6.5. Now, assuming additionally that $f$ is analytic at the origin, $f_A$ is analytic for small $z$. If either $f$ is entire or $A$ is quasinilpotent, then $f_A$ is actually entire. Otherwise singularities can occur but these are all poles. In general we can define $f_A$ as follows. We assume for simplicity that $f$ is analytic at the origin. Thus it has an expansion

$$f(z) = \sum_{j=0}^{\infty} a_j z^j$$

for $|z| < R_0$ with some $R_0 \le \infty$. If $\rho(A)$ denotes the spectral radius of $A$, then

$$f_A(z) = \sum_{j=0}^{\infty} a_j A^j z^j \tag{7.4}$$

converges for $|z| < R_0/\rho(A)$. Outside of this disc $f_A$ is then extended by meromorphic continuation.
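The simplest instance of (7.4) is $f(z) = 1/(1 - z)$, for which $a_j = 1$, $R_0 = 1$ and $f_A(z) = (I - zA)^{-1}$. The sketch below compares a partial sum of the series with the closed-form $2 \times 2$ inverse at a point inside the disc $|z| < 1/\rho(A)$. The matrix, the evaluation point and the helper names are assumptions made for illustration.

```python
def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def resolvent_closed_form(A, z):
    """(I - zA)^{-1} for a 2x2 matrix via the adjugate formula."""
    a, b = 1 - z * A[0][0], -z * A[0][1]
    c, d = -z * A[1][0], 1 - z * A[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def resolvent_series(A, z, terms):
    """Partial sum of sum_j z^j A^j, i.e. (7.4) with a_j = 1."""
    total = [[1, 0], [0, 1]]
    power = [[1, 0], [0, 1]]
    for _ in range(terms):
        # After this step power equals z^j A^j for the next index j.
        power = [[z * x for x in row] for row in matmul(power, A)]
        total = [[total[i][j] + power[i][j] for j in range(2)] for i in range(2)]
    return total

# Hypothetical upper triangular matrix with spectral radius 0.5,
# so the series converges for |z| < 2.
A = [[0.5, 1.0], [0.0, 0.25]]
z = 1.0 + 0.5j
series = resolvent_series(A, z, 200)
exact = resolvent_closed_form(A, z)
max_err = max(abs(series[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(max_err)
```

Outside the disc the partial sums diverge, while the closed form remains defined away from the poles $z = 1/\lambda_j(A)$; this is the meromorphic continuation referred to above.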


Theorem 7.1 Let $f$ be a meromorphic function in the whole plane such that $f(0) \ne \infty$, and such that it grows at most with finite order $\rho$. Let $A \in B(X)$ be an almost algebraic operator such that its resolvent grows at most with finite order $\omega$. Then $f_A$ in (7.4) is a well defined $B(X)$-valued meromorphic function in the whole plane such that it grows at most with order $\max\{\rho, \omega\}$, that is,

(7.5)

for all $\beta > \max\{\rho, \omega\}$.

Proof Let us consider first the cases in which $f_A$ is entire. Recall, compare with Theorem 2.1, that if

$$G(z) = \sum_{j=0}^{\infty} B_j z^j$$

then $G$ is entire of order at most $\omega$ if and only if for all $\varepsilon > 0$

$$\lim_{j \to \infty} j^{1/(\omega + \varepsilon)}\, \|B_j\|^{1/j} = 0. \tag{7.6}$$

If $f$ is entire, then $f_A$ is entire, too, and it follows from the inequality

$$\|a_j A^j\|^{1/j} \le \|A\|\, |a_j|^{1/j}$$

that the Taylor coefficients of $f_A$ cannot decay slower than those of $f$. Consequently the order cannot increase. Likewise, if $A$ is quasinilpotent then $\lim_{j \to \infty} \|A^j\|^{1/j} = 0$ and the fact that $f$ is analytic near the origin guarantees that for some $C$ we have $|a_j|^{1/j} \le C$. Thus now the coefficients can be estimated by

$$\|a_j A^j\|^{1/j} \le C\, \|A^j\|^{1/j}$$

and the decay is now dominated by the decay of the coefficients in the resolvent. Again we have an entire function with order at most that of the resolvent.

Fix now $\beta > \max\{\rho, \omega\}$. If $f$ has only a finite number of poles, then form a polynomial

$$p(z) := \prod_{j=1}^{n} (1 - z/b_j)$$

so that $pf$ is entire and hence of order less than $\beta$. Estimating the inverse of $p(zA)$ is easy:

$$T_\infty(r, p(zA)^{-1}) \le \sum_{j=1}^{n} T_\infty\big(r, (I - \tfrac{z}{b_j}A)^{-1}\big) = \sum_{j=1}^{n} O((r/|b_j|)^\beta) = O(r^\beta). \tag{7.7}$$

By construction $pf$ is entire and of order less than $\beta$. Thus

$$T_\infty(r, f_A) \le T_\infty(r, p(zA)^{-1}) + T_\infty(r, p(zA) f(zA)) \le O(r^\beta) + O(r^\beta).$$

What remains is the general case of $f$ having infinitely many poles. Let us denote these by $\{b_j\}$, ordered so that $|b_j| \le |b_{j+1}|$ and each pole is repeated as many times as the multiplicity requires. Without loss of generality we can assume that $\beta$ is not

SEVENTH CHAPTER

89

an integer and that it is close enough to max{p, w} so that they have a common integer part: (7.8) m ~ max{p,w} < f3 < m+ 1. We shall form an entire function
"1 = 211AII with II(>..! - A)-III ~ 1/IIAII.

0

Lemma 7.2 For $m < \beta < m + 1$ there exists $C_\beta$ such that

$$\log\|E(zA, m)\| \le C_\beta\,\|A\|^{\beta} r^{\beta} \tag{7.10}$$

holds for all $A \in B(X)$ and all $r > 0$.

Proof of Lemma 7.2 The proof is divided into two parts, depending on whether $2\|A\|r$ is smaller or larger than 1. Assume that $2\|A\|r \le 1$. Then we can denote by $F$ the function

$$F(z) = \log\bigl(E(zA, m)\bigr) = -\sum_{j=m+1}^{\infty} \frac{1}{j} A^j z^j$$

which is analytic in this disc. Clearly for these values

$$\|F(z)\| \le \sum_{j=m+1}^{\infty} \frac{1}{j}\|A\|^j r^j \le 2\|A\|^{m+1} r^{m+1} \le 2\|A\|^{\beta} r^{\beta}.$$

But $E = e^{F}$ so that $\|E\| \le e^{\|F\|}$ and thus

$$\log\|E(zA, m)\| \le \|F(z)\| \le 2\|A\|^{\beta} r^{\beta}$$

which is of the form required. Assume then that $2\|A\|r \ge 1$. Here we base the estimation on Lemma 7.1 and on the fact that the claim holds in the scalar case. In fact, we have

$$\log|E(z, m)| \le c_\beta r^{\beta} \tag{7.11}$$

where c(3 = ~ for m = 0 and c(3 ::; e (log(f3 + 1) + 1) otherwise, see (5.6.13) and (5.6.16) in [NOl].

We now apply Lemma 7.1 to the function $E(z, m)$ and obtain

Since $2\|A\|r \ge 1$ we have

$$\log\|E(zA, m)\| \le c_\beta 2^{\beta}\|A\|^{\beta} r^{\beta} + \log 2 \le (c_\beta + \log 2)\,2^{\beta}\|A\|^{\beta} r^{\beta}. \qquad \Box$$

The proof of Lemma 7.2 is thus completed.
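The small-radius half of this argument rests only on a geometric-series tail estimate. As a sketch, with the shorthand $x := \|A\|r$ (introduced here for illustration, not part of the text) and assuming $0 \le x \le 1/2$, the chain of inequalities can be spelled out as follows:

```latex
\begin{align*}
\sum_{j=m+1}^{\infty} \frac{x^{j}}{j}
  &\le \sum_{j=m+1}^{\infty} x^{j}
   = \frac{x^{m+1}}{1-x} \\
  &\le 2\,x^{m+1}   && \text{since } 1 - x \ge \tfrac12 \\
  &\le 2\,x^{\beta} && \text{since } 0 \le x \le 1 \text{ and } \beta < m+1 .
\end{align*}
```

The last step is where the restriction $\beta < m + 1$ is used: lowering the exponent can only increase a power of a number in $[0, 1]$.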

We shall now start estimating
1 we set

$$C_2 := \frac{2(\sqrt{C_1} + 1)}{(\sqrt{C_1} - 1)^2}.$$

If $A \in S_1$ and $E \in B(H)$ are given, then there are monic polynomials $\{p_j\}$ depending on $A$, $E$ and $C_1$, satisfying for $j \le C_2\|A\|_1/\|E\|$

$$\|p_j(A + E)\| \le e^{C_2}\Bigl(\frac{C_1 C_2 \|A\|_1 e}{j}\Bigr)^{j} \tag{7.34}$$

and for $j \ge C_2\|A\|_1/\|E\|$

$$\|p_j(A + E)\| \le e^{C_2(1 + \|A\|_1/\|E\|)}\,(C_1\|E\|)^{j}. \tag{7.35}$$
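As a consistency check on the two regimes, one can compare the bounds (7.34) and (7.35) at the index where their ranges meet. The sketch below treats $j$ as a real variable and uses the shorthand $x := C_2\|A\|_1/\|E\|$, which is introduced here only for illustration:

```latex
% Bound (7.34) evaluated at j = x = C_2 ||A||_1 / ||E||:
\[
  e^{C_2}\Bigl(\frac{C_1 C_2 \|A\|_1\, e}{x}\Bigr)^{x}
  = e^{C_2}\bigl(C_1 \|E\|\, e\bigr)^{x}
  = e^{C_2 + x}\,\bigl(C_1\|E\|\bigr)^{x}.
\]
% Bound (7.35) evaluated at the same index:
\[
  e^{C_2(1+\|A\|_1/\|E\|)}\,\bigl(C_1\|E\|\bigr)^{x}
  = e^{C_2 + x}\,\bigl(C_1\|E\|\bigr)^{x}.
\]
```

Under these assumptions the two expressions coincide at the crossover, so the estimate changes form continuously from the superlinear regime to the linear one.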

Proof Here the polynomials $p_j$ are not obtained from just one function $\chi$ but rather from a sequence of such functions, $\chi_\eta$, with help of Lemma 7.5 as in the proof of Corollary 7.1. By Theorem 6.12 we have for $r\|E\| < 1$

$$T_\infty\bigl(r, (I - z(A + E))^{-1}\bigr) \le r\|A\|_1 + (1 + r\|A\|_1)\frac{r\|E\|}{1 - r\|E\|}. \tag{6.34}$$

Let $C_1 > 1$ be given and choose $\theta > 1$ such that $\theta^2 = C_1$. Then for $\eta \le 1/(\theta\|E\|)$ we may assume $\chi_\eta$ given so that (7.22) holds. But then for $r \le \eta$ we have

$$T_\infty\bigl(r, \chi_\eta(z)(I - z(A + E))^{-1}\bigr) \le 2\,T_\infty\bigl(\eta, (I - z(A + E))^{-1}\bigr) \tag{7.36}$$

and further

$$\log^+ M_\infty\bigl(\eta/\theta, \chi_\eta(z)(I - z(A + E))^{-1}\bigr) \le \frac{\theta + 1}{\theta - 1}\,2\,T_\infty\bigl(\eta, (I - z(A + E))^{-1}\bigr) \le \frac{2(\theta + 1)}{(\theta - 1)^2}\,(1 + \theta\eta\|A\|_1).$$

Choosing $\eta = 1/(\theta\|E\|)$ we obtain

$$\|p_j(A + E)\| \le \exp\Bigl(\frac{2(\theta + 1)}{(\theta - 1)^2}\bigl(1 + \|A\|_1/\|E\|\bigr)\Bigr)\,(\theta^2\|E\|)^{j}$$

which holds for all $j$. For short, put $c := 2(\theta + 1)/(\theta - 1)^2$. Then with $j < c\|A\|_1/\|E\|$ we have $\eta_j := j/(c\theta\|A\|_1) \le 1/(\theta\|E\|)$ and we obtain

$$\|p_j(A + E)\| \le e^{c}\Bigl(\frac{c\,\theta^2\|A\|_1 e}{j}\Bigr)^{j}.$$

$\Box$

Robust bounds for Krylov solvers

Krylov subspace methods form a class of iterative methods for solving linear systems of equations. Among them the conjugate gradient method is widely used for positive definite problems, while GMRES and QMR are examples of methods suitable for general nonsingular problems. A typical step of such an iterative method involves applying a matrix to a vector and doing linear algebra operations in the low dimensional subspace created. The methods are often used with preconditioning. For example, suppose we have a nonsingular problem in the form

$$Bx = c.$$

If we additionally have an approximate inverse for $B$, i.e. we have an $M$ such that $MB = I - A$ with $I - A$ invertible and $A$ "small", then we can write the equation equivalently in the form

$$x = Ax + b \tag{7.37}$$

where $b = Mc$. Often the preconditioner is not given explicitly but requires running a short subroutine.

One special property of good Krylov methods is the following. If $A$ is small except possibly in a low dimensional subspace, then the methods converge rapidly. Traditionally the convergence analysis in the case of the conjugate gradient method has been based on approximation theory on the spectrum - a technique which cannot be used for highly nonnormal problems. Our analysis covers both cases simultaneously. In fact, a low rank perturbation of $A$ may change the operator from self-adjoint to highly nonnormal and in such a case one would have to change the method, e.g. from the conjugate gradient method to GMRES, but the error bounds remain essentially unchanged.

Practical computations in the low dimensional subspaces created assume inner product structure. Our bounds, however, are based on spectral polynomials: they are upper bounds for the best polynomials and they can be formulated in general Banach spaces.

We outline now our setting. Given a bounded operator $A$ and a vector $b$ we may create the sequence $\{A^j b\}_{j=0}^{\infty}$. If $1 \notin \sigma(A)$, we can ask for approximations to the solution of (7.37) from the subspaces

$$K_k(A, b) := \operatorname{span}\{A^j b\}_{j=0}^{k-1}.$$

There are a lot of different methods for different kinds of problems which associate an approximation $x_k \in K_k(A, b)$ for (7.37). These typically aim to satisfy

$$\|x_k - Ax_k - b\| \le \|y - Ay - b\| \tag{7.38}$$

for all $y \in K_k(A, b)$. We shall give a bound assuming that (7.38) holds exactly.

Lemma 7.6 If $x$ satisfies (7.37), then

$$\|x - y\| \le \|(I - A)^{-1}\|\,\|y - Ay - b\|. \tag{7.39}$$

Proof The claim follows from

$$(x - Ax - b) - (y - Ay - b) = (I - A)(x - y) = -(y - Ay - b).$$

$\Box$

Note that any vector $y$ in $K_k(A, b)$ can be written in the form $y = q_{k-1}(A)b$ with some polynomial $q_{k-1}$ and that all vectors of this form are in $K_k(A, b)$. It then follows from (7.38) that if $p_{k-1}$ is any polynomial of degree $k - 1$ and we set $y_k := p_{k-1}(A)b$ then necessarily

$$\begin{aligned}
\|x - x_k\| &\le \|(I - A)^{-1}\|\,\|y_k - Ay_k - b\| \\
&\le \|(I - A)^{-1}\|\,\|I - A\|\,\|(I - A)^{-1}b - p_{k-1}(A)b\| \\
&\le \|(I - A)^{-1}\|\,\|I - A\|\,\|(I - A)^{-1} - p_{k-1}(A)\|\,\|b\|.
\end{aligned}$$

We conclude that if we can give an estimate for

$$E_k := \inf \|(I - A)^{-1} - p(A)\|$$

where the infimum (actually minimum) is over all polynomials $p$ of degree less than $k$, then

$$\|x - x_k\| \le E_k\,\|(I - A)^{-1}\|\,\|I - A\|\,\|b\| \tag{7.40}$$

where $x_k$ satisfies (7.38).
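To see how (7.40) is used, it may help to look at the simplest admissible polynomial. The following is only an illustrative sketch under the extra assumption $\|A\| < 1$ (not required in the text): taking $p$ to be the truncated Neumann series, which has degree $k - 1$, gives

```latex
\[
  E_k \;\le\; \Bigl\| (I-A)^{-1} - \sum_{j=0}^{k-1} A^{j} \Bigr\|
      \;=\; \Bigl\| \sum_{j=k}^{\infty} A^{j} \Bigr\|
      \;\le\; \frac{\|A\|^{k}}{1-\|A\|},
\]
% and therefore, by (7.40),
\[
  \|x - x_k\| \;\le\; \frac{\|A\|^{k}}{1-\|A\|}\,
      \|(I-A)^{-1}\|\,\|I-A\|\,\|b\| .
\]
```

This recovers the familiar linear convergence rate of a contraction; the point of the spectral polynomials discussed next is to do better than this crude choice.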

Theorem 7.4 Assume $A$ is almost algebraic and $\{a_j\}$ is a sequence such that for all $j = 1, 2, \ldots$

$$\|p_j(A)\| \le C_0\Bigl(\frac{C e \omega}{j}\Bigr)^{j/\omega} \tag{7.41}$$

holds where $p_j(A) = A^j + a_1 A^{j-1} + \cdots + a_j$. Then $\chi(z) = 1 + a_1 z + a_2 z^2 + \cdots$ is entire. Assume also that $I - A$ is nonsingular and that $\chi(1) \ne 0$. Then

(7.42)

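Proof We have

$$(I - zA)^{-1} = \frac{1}{\chi(z)}\sum_{j=0}^{\infty} p_j(A) z^j$$

and so

$$E_k \le \Bigl\|(I - A)^{-1} - \frac{1}{\chi(1)}\sum_{j=0}^{k-1} p_j(A)\Bigr\| \le \frac{1}{|\chi(1)|}\sum_{j=k}^{\infty}\|p_j(A)\|$$

which implies the estimate (7.42). $\Box$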

We call this error bound "robust" as it has the following property: we have shown above that there are spectral sequences whose decay is bounded by the growth of $T_\infty(r, (I - zA)^{-1})$. Then we have shown that this growth is insensitive to low rank updates. Thus the only part of the error bound obtained by combining (7.40) and (7.42) which is not robust is the term $\|(I - A)^{-1}\|/|\chi(1)|$. Of course it may happen that some low rank update makes a problem nearly singular, and then this term would be large.

A bound for spectral projectors

In the first chapter we discussed shortly Riesz projections:

$$P = \frac{1}{2\pi i}\int_{\Gamma} (\lambda I - A)^{-1}\,d\lambda \tag{7.43}$$

where $\Gamma$ surrounds an eigenvalue of $A$. Here we consider the following situation. We ask whether it is possible to give a bound for such a projection in terms of the growth function of the resolvent. To that end, let $A$ be a bounded operator in a Banach space $X$ and assume that the resolvent $(I - zA)^{-1}$ is meromorphic for $|z| < R \le \infty$. Choose any radius $r < R$ and $\theta > 1$ such that $\theta r < R$. Then we take

(7.44)

where $\rho = 1/s$ satisfies

$$\frac{r}{\sqrt{\theta}} \le s \le r \tag{7.45}$$

and is such that $\|(\lambda I - A)^{-1}\|$ can be controlled along $\Gamma$ in terms of its characteristic function $T_\infty(\theta r, (I - zA)^{-1})$.

Theorem 7.5 Given $\theta > 1$ there is

$$C(\theta) \le \frac{\sqrt{\theta} + 1}{\sqrt{\theta} - 1} + \log\frac{4e\sqrt{\theta}(\sqrt{\theta} + 1)}{\sqrt{\theta} - 1} \tag{7.46}$$

such that the following holds. Let $A$ be a bounded linear operator in a Banach space such that the resolvent $(I - zA)^{-1}$ is meromorphic for $|z| < R \le \infty$. Then for any $r$ such that $\theta r < R$ there exists an $s$ satisfying (7.45) so that for $\varphi \in (-\pi, \pi]$

(7.47)

Proof Observe that the claim is essentially the same as in Corollary 2.2. So is the proof, too. However, we have here an operator valued function and therefore

$$u : z \mapsto \log\bigl(|p(z)|\,\|(I - zA)^{-1}\|\bigr)$$

is only subharmonic; here again we denote by $p$ the monic polynomial vanishing at $z_j = 1/b_j$ for poles $b_j$ with $|b_j| \ge 1/(\theta r)$. Thus we cannot use the Poisson-Jensen formula as in the proof of Theorem 2.10 but we get exactly the same inequality by arguing as follows. Since $u$ is subharmonic it stays below the harmonic function $h$,

$$h(se^{i\varphi}) := \frac{1}{2\pi}\int_{-\pi}^{\pi} P\Bigl(\frac{s}{\theta r}, t - \varphi\Bigr)\,u(\theta r e^{it})\,dt,$$

and so we obtain

$$\log\|(I - se^{i\varphi}A)^{-1}\| \le \frac{1}{2\pi}\int_{-\pi}^{\pi} P\Bigl(\frac{s}{\theta r}, t - \varphi\Bigr)\log\|(I - \theta r e^{it}A)^{-1}\|\,dt + \sum_{k=1}^{n}\log\left|\frac{(\theta r)^2 - \bar{b}_k s e^{i\varphi}}{\theta r\,(s e^{i\varphi} - b_k)}\right|.$$

The rest is then identical to that of the scalar case. $\Box$

This is related to projections onto invariant subspaces as follows. Assume $\rho$ is such that

$$\sigma(A) \cap \Gamma_\rho = \emptyset \tag{7.48}$$

and denote

$$P_\rho = \frac{1}{2\pi i}\int_{\Gamma_\rho} (\lambda I - A)^{-1}\,d\lambda. \tag{7.49}$$

Then $P_\rho$ projects onto the invariant subspace corresponding to the part of the spectrum which is smaller than $\rho$ in modulus.

Corollary 7.5 Given $\theta > 1$ there is $C(\theta)$ satisfying (7.46) such that the following holds. Let $A$ be a bounded linear operator in a Banach space such that the resolvent $(I - zA)^{-1}$ is meromorphic for $|z| < R \le \infty$. Then for any $r > 0$ such that $\theta r < R$ there exists $\rho$ such that

$$\frac{1}{r} \le \rho \le \frac{\sqrt{\theta}}{r},$$

(7.48) holds and

$$\log\|P_\rho\| \le C(\theta)\,T_\infty(\theta r, (I - zA)^{-1}). \tag{7.50}$$

Furthermore, the number of eigenvalues outside $\Gamma_\rho$ is bounded by

$$n_\infty(r, (I - zA)^{-1}) \le \frac{1}{\log\theta}\,T_\infty(\theta r, (I - zA)^{-1}).$$

Proof This is clear by Theorem 7.5. $\Box$

Comment 7.1 Almost algebraic operators were discussed in [NO1]. Bounds for Krylov solvers, based directly on $T_\infty(r, (I - zA)^{-1})$, were discussed in [Hy-N], without the help of the theorem of Miles.

EIGHTH CHAPTER

Keywords: Approximate polynomial degree, approximate rational degree.

Approximate polynomial degree of an analytic function

If $p$ is a polynomial of degree $d$, then

$$\log^+ M(r, p) = (1 + o(1))\,d\log r \quad \text{as } r \to \infty$$

and reversely, if $\log^+ M(r, f)/\log r$ is bounded as $r \to \infty$ then $f$ is a polynomial. Suppose we look at a given analytic $f$ in a tiny neighborhood $|z| \le r$. Then obviously, just one evaluation of $f$, e.g. at the origin, is sufficient to approximately represent $f$. In a larger disc one needs more evaluations. Likewise, we may want to know how the work increases with increasing accuracy. This is achieved simply by looking at $f/\varepsilon$ in place of $f$. We shall make this precise by introducing the following notation and terminology.

Definition 8.1 Let $f$ be analytic for $|z| < R_0 \le \infty$. Put for $r < R_0$

$$d_0(r, f) := \min\{\deg p \mid p \text{ is a polynomial and such that } M(r, f - p) \le 1\}.$$

We shall call $d_0$ the approximate polynomial degree. We can relate $d_0$ to $M$ as follows.

Theorem 8.1 Suppose $f$ is analytic for $|z| < R_0 \le \infty$. Then for $\theta > 1$ and $\theta r < R_0$ we have

$$d_0(r, f) < \frac{1}{\log\theta}\log^+\Bigl(\frac{M(\theta r, f)}{\theta - 1}\Bigr) + 1, \tag{8.1}$$

and

$$\log^+ M(\theta r, f) \le \log^+ M(r, f) + d_0(\theta r, f)\log\theta + \log 2. \tag{8.2}$$

Proof Let $d$ be an integer such that

$$\frac{1}{\log\theta}\log^+\frac{M}{\theta - 1} \le d < \frac{1}{\log\theta}\log^+\frac{M}{\theta - 1} + 1, \tag{8.3}$$

where for short $M = M(\theta r, f)$. Then

$$\theta^{d} = \exp(d\log\theta) \ge \exp\Bigl(\log^+\frac{M}{\theta - 1}\Bigr) = \max\Bigl\{\frac{M}{\theta - 1}, 1\Bigr\}.$$

Let us now put $p(z) := \sum_{k=0}^{d} a_k z^k$ where the $a_k$'s are the Taylor coefficients of $f$. Since the coefficients satisfy

$$|a_k| \le M\,(\theta r)^{-k},$$

see (2.10), we obtain with the help of (8.3)

$$M(r, f - p) \le \sum_{k=d+1}^{\infty} |a_k| r^k \le M\sum_{k=d+1}^{\infty}\theta^{-k} = M\,\frac{\theta^{-d}}{\theta - 1} \le \frac{M}{\theta - 1}\min\Bigl\{\frac{\theta - 1}{M}, 1\Bigr\} \le \min\Bigl\{1, \frac{M}{\theta - 1}\Bigr\}$$

which implies the first claim. In order to prove (8.2) observe that if $M(\theta r, f - p) \le 1$ with $\deg p = d$, then

$$M(\theta r, f) \le M(\theta r, p) + 1 \le \theta^{d} M(r, p) + 1 \le \theta^{d}\bigl(M(r, f) + 1\bigr) + 1$$

which implies the second claim. Here we used the inequality

$$M(\theta r, p) \le \theta^{d} M(r, p)$$

which is a special case of Bernstein's lemma and can in this form be concluded as follows. The function $g(z) := z^{-d}p(z)$ is analytic and bounded for $|z| \ge r$. By the maximum principle we have

$$(\theta r)^{-d} M(\theta r, p) = \sup_{|z| \ge \theta r}|g(z)| \le \sup_{|z| \ge r}|g(z)| = r^{-d} M(r, p).$$

$\Box$

Our main interest in formulating Theorem 8.1 is the fact that we shall later be able to formulate an analogue of it for meromorphic functions, approximated by rational functions. However, in that case we cannot in general code the growth in terms of the Taylor coefficients. The following examples illustrate the inequalities (8.1) and (8.2).

Example 8.1 If $p$ is a polynomial of degree $d$, then clearly $d_0(r, p) \le d$ for all $r$ with equality for all $r$ large enough. Consider first (8.1) with a fixed $r$:

$$d_0(r, p) < \frac{1}{\log\theta}\log^+\Bigl(\frac{M(\theta r, p)}{\theta - 1}\Bigr) + 1 = \frac{1}{\log\theta}\log^+\Bigl(\frac{(\theta r)^{d}(1 + o(1))}{\theta - 1}\Bigr) + 1.$$

Letting here $\theta \to \infty$ gives

$$d_0(r, p) < d + 1$$

or, as both are integers,

$$d_0(r, p) \le d.$$

On the other hand, from (8.2) we obtain, for $r > 1$,

$$d_0(r, p)\log r \ge \log^+ M(r, p) - C$$

with $C = \log^+ M(1, p) + \log 2$. $\Box$

While log+ M(r, f) is bounded for an entire f only when f is constant, do(r, f) is bounded for polynomials and thus for very slowly growing functions do(r, f) is essentially slower than log+ M(r, f). It is then not without interest that for entire functions of positive order, do(r, f) grows with the same speed as log+ M(r, f) (without any logr term) and that also the type can be correctly recovered from do(r,f).
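A concrete instance may make this growth behaviour more tangible. The following sketch applies the upper bound (8.1) to the function $f(z) = e^z$ with the particular choice $\theta = e$; both the function and the value of $\theta$ are chosen here for illustration only:

```latex
% For f(z) = e^z we have M(\theta r, f) = e^{\theta r}. With \theta = e,
% so that \log\theta = 1, inequality (8.1) reads
\[
  d_0(r, e^{z})
  \;<\; \log^{+}\!\Bigl(\frac{e^{\,e r}}{e-1}\Bigr) + 1
  \;=\; e\,r - \log(e-1) + 1
  \qquad \text{whenever } e\,r \ge \log(e-1).
\]
```

So the approximate polynomial degree of the exponential grows at most linearly in $r$, with slope $e$. This matches the order $\omega = 1$, and the slope equals $\tau e \omega$ for type $\tau = 1$.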

Example 8.2 In order to see that $d_0(r, f)$ codes both the order and type faithfully, it essentially suffices to consider the function

We show the following:

$$d_0(r, F) = (1 + o(1))\,\tau e \omega r^{\omega}, \quad \text{as } r \to \infty.$$

Let $r$ be fixed and put for short $d := d_0(r, F)$. If $P(z) = \sum_{j=0}^{d} c_j z^j$ is the corresponding approximating polynomial, then Parseval's identity gives us

$$1 \ge M(r, F - P)^2 \ge \sum_{j=0}^{d}|a_j - c_j|^2 r^{2j} + \sum_{k=d+1}^{\infty}|a_k|^2 r^{2k} \ge |a_{d+1}|^2 r^{2(d+1)}.$$

This gives us immediately

$$d_0(r, F) \ge \tau e \omega r^{\omega} - 1, \quad \text{for all } r > 0.$$

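To bound $d_0(r, F)$ from above, we use inequality (8.1). By Theorem 2.2 we have for $1/2 \ge \varepsilon > 0$ and $r > 0$

$$\log^+ M(r, F) \le (1 + \varepsilon)\,\tau r^{\omega} + \log^+\Bigl(\frac{13}{\varepsilon}\,\omega\Bigr).$$

Inequality (8.1) now implies with $\theta := \exp(1/\omega)$ that there exists $C_\varepsilon$ such that

$$d_0(r, F) \le (1 + \varepsilon)\,\tau e \omega r^{\omega} + C_\varepsilon$$

holds for all $r > 0$. $\Box$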

Theorem 8.2 If $f$ is entire of order $\omega$, then

$$\omega = \limsup_{r\to\infty}\frac{\log d_0(r, f)}{\log r}. \tag{8.4}$$

If $f$ is of finite positive order $\omega$, and of type $\tau$, then

$$\tau = \frac{1}{e\omega}\limsup_{r\to\infty}\frac{d_0(r, f)}{r^{\omega}}. \tag{8.5}$$

Proof We leave this as an exercise: try to modify the discussion in Example 8.2. $\Box$

Some properties of the approximate polynomial degree

If we want to approximate $f$ within tolerance $\varepsilon$, i.e. so that $M(r, f - p) \le \varepsilon$, then the minimum degree possible is given by $d_0(r, f/\varepsilon)$. From (8.1) we obtain with $\theta > 1$, $r$ fixed such that $\theta r < R_0$,

$$d_0\Bigl(r, \frac{1}{\varepsilon}f\Bigr) \le \frac{1}{\log\theta}\log^+\frac{1}{\varepsilon} + C \tag{8.6}$$

where $C = C(r, \theta, f)$ is independent of $\varepsilon$. Observe that, apart from $C$, the right hand side of (8.6) depends on $f$ only through $\theta$. We can relate these notions to the standard setting in approximation theory. To that end, let

$$E_d(r, f) := \inf_{\deg(p) \le d} M(r, f - p).$$
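The step from (8.1) to (8.6) is short, and the following sketch spells it out. The explicit constant $C$ shown here is one admissible choice that falls out of the computation; it is an illustration rather than a value stated in the text:

```latex
\begin{align*}
d_0\bigl(r, \tfrac{1}{\varepsilon} f\bigr)
  &< \frac{1}{\log\theta}\,
     \log^{+}\!\Bigl(\frac{M(\theta r, f)}{\varepsilon(\theta-1)}\Bigr) + 1
     && \text{by (8.1), as } M(\theta r, f/\varepsilon) = M(\theta r, f)/\varepsilon \\
  &\le \frac{1}{\log\theta}\log^{+}\frac{1}{\varepsilon}
     + \underbrace{\frac{1}{\log\theta}
       \log^{+}\!\Bigl(\frac{M(\theta r, f)}{\theta-1}\Bigr) + 1}_{=:\,C}
     && \text{since } \log^{+}(ab) \le \log^{+} a + \log^{+} b .
\end{align*}
```

In particular, each additional decimal digit of accuracy costs about $\log 10/\log\theta$ extra degrees, independently of $f$.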

It is well known that if $f$ is analytic in a slightly larger disc, then $E_d(r, f)$ decays fast with increasing $d$, see e.g. [Wa], p. 75. Here is a simple version with explicit constants.

Theorem 8.3 Assume $f$ is analytic in $|z| < R_0 \le \infty$. Choose $r$ and $\theta > 1$ such that $\theta r < R_0$. Then for $d = 0, 1, 2, \ldots$ we have

$$E_d(r, f) \le \frac{M(\theta r, f)}{\theta - 1}\,\theta^{-d}. \tag{8.7}$$

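Proof Let $c := (\theta - 1)/M(\theta r, f)$, so that by Theorem 8.1 for $\varepsilon \in (0, 1]$ we have

$$d_0\Bigl(r, \frac{c}{\varepsilon}f\Bigr) < \frac{1}{\log\theta}\log\frac{1}{\varepsilon} + 1. \tag{8.8}$$

Denote $\varepsilon_d := \theta^{-d}$. Then by (8.8)

$$d_0\Bigl(r, \frac{c}{\varepsilon_d}f\Bigr) < d + 1$$

and

$$E_d(r, f) \le \frac{\varepsilon_d}{c} = \frac{M(\theta r, f)}{\theta - 1}\,\theta^{-d}. \qquad \Box$$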

In the following we formulate some simple inequalities for $d_0(r, f)$. Expressions of the form $d_0(r, c(r)f)$ are to be understood as follows: we look at functions $z \mapsto c(r)f(z)$ with fixed $r$ for $|z| \le r$ and consider $c(r)$ as a constant in the approximation process.

Theorem 8.4 Let $f$ and $g$ be analytic for $|z| < R_0 \le \infty$ and $\theta r < R_0$ with $\theta > 1$. Then

$$d_0(r, f + g) \le \max\{d_0(r, 2f), d_0(r, 2g)\}, \tag{8.9}$$

$$d_0(r, fg) \le d_0(r, c_g f) + d_0(r, c_f g), \tag{8.10}$$

where $c_f := \max\{3M(r, f), \sqrt{3}\}$, $c_g := \max\{3M(r, g), \sqrt{3}\}$.

Proof If $M(r, 2f - p) \le 1$, $M(r, 2g - q) \le 1$, then $M(r, f + g - \tfrac12(p + q)) \le 1$, while

$$\deg\tfrac12(p + q) \le \max\{\deg(p), \deg(q)\}.$$

To prove (8.10), suppose $M(r, f - p) \le 1/c_g$ and $M(r, g - q) \le 1/c_f$. Then

$$\begin{aligned}
M(r, fg - pq) &\le M(r, fg - fq) + M(r, fq - pq) \\
&\le M(r, f)M(r, g - q) + M(r, g)M(r, f - p) + M(r, f - p)M(r, g - q) \\
&\le \tfrac13 + \tfrac13 + \tfrac13 = 1,
\end{aligned}$$

while $\deg(pq) \le \deg(p) + \deg(q)$. $\Box$
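The choice of the constants $c_f$ and $c_g$ is what makes each of the three terms at most $1/3$. As a sketch, using only the definitions $c_f = \max\{3M(r, f), \sqrt{3}\}$ and $c_g = \max\{3M(r, g), \sqrt{3}\}$ together with the assumptions $M(r, f - p) \le 1/c_g$ and $M(r, g - q) \le 1/c_f$:

```latex
\begin{align*}
M(r,f)\,M(r,g-q) &\le \frac{M(r,f)}{c_f} \le \frac13
   && \text{since } c_f \ge 3M(r,f), \\
M(r,g)\,M(r,f-p) &\le \frac{M(r,g)}{c_g} \le \frac13
   && \text{since } c_g \ge 3M(r,g), \\
M(r,f-p)\,M(r,g-q) &\le \frac{1}{c_g\,c_f} \le \frac{1}{\sqrt3\cdot\sqrt3} = \frac13
   && \text{since } c_f, c_g \ge \sqrt3 .
\end{align*}
```

The middle step of the proof also uses $M(r, q) \le M(r, g) + M(r, g - q)$, which is where the cross term $M(r, f - p)M(r, g - q)$ comes from.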

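Theorem 8.5 Let $f$ be analytic for $|z| < R_0 \le \infty$ and $\theta > 1$ such that $\theta r < R_0$. Then

$$d_0(r, f') \le \max\Bigl\{d_0\Bigl(\theta r, \frac{1}{(\theta - 1)r}f\Bigr) - 1,\ 0\Bigr\} \tag{8.11}$$

and

$$d_0(r, f) \le d_0(r, rf') + 1. \tag{8.12}$$

Proof Differentiating the Cauchy integral

$$f(z) = \frac{1}{2\pi i}\int\frac{f(\zeta)}{\zeta - z}\,d\zeta$$

[...] 0 while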

$$T\bigl(r, (z - a)^{-n}\bigr) = n\log\frac{1}{|a|}$$

for $0 < r \le 1 - |a|$. Thus, in general we cannot bound $d$ in terms of $T$ alone. Since trivially $d(r, f) \ge n(r, f)$ we may try to bound $d$ in terms of both $T$ and $n$.

Theorem 8.8 For $\theta > 1$ let $\theta_i > 1$ be such that $\theta_1\theta_2\theta_3 = \theta$. Suppose $f$ is meromorphic for $|z| < R \le \infty$. Then for $\theta r < R$ we have

$$d(r, f) < C_1(\theta)\,T(\theta r, f) + 2n(\theta r, f) + C_2(\theta) \tag{8.21}$$

where

$$C_1(\theta) = \frac{1}{\log\theta_1}\,\frac{\theta_2 + 1}{\theta_2 - 1}\Bigl[2 + \frac{\theta_3 + 1}{\theta_3 - 1}\Bigr]$$

and

$$C_2(\theta) = \frac{1}{\log\theta_1}\Bigl(\log^+\frac{1}{\theta_1 - 1} + \frac{\theta_2 + 1}{\theta_2 - 1}\log 2\Bigr) + 1.$$

Proof Let $\theta > 1$ be given. Choose $\theta_i > 1$ such that $\theta_1\theta_2\theta_3 = \theta$ and assume that $r$ is such that $\eta := \theta r < R$. To start, let $q$ be a finite Blaschke product such that it is analytic in $|z| < \eta$, vanishes at the poles $b_j$ of $f$ in that disc so that

$$g := qf$$

is analytic there and $|q| = 1$ along $|z| = \eta$ and of minimal degree. Thus $\deg(q) \le n(\eta, f)$ and $T(\eta, q) = 0$. By Theorem 2.4 we have

$$T\Bigl(\eta, \frac{1}{q}\Bigr) = \log\frac{1}{|c_k|} = N(\eta, f). \tag{8.22}$$

In fact, if $f$ is regular at the origin, then this is part of Lemma 7.5, since then

$$|q(0)| = \prod_{j=1}^{n}\frac{|b_j|}{\eta}$$

so that

$$\log\frac{1}{|q(0)|} = \sum_{j=1}^{n}\log\frac{\eta}{|b_j|} = N(\eta, f).$$

If, on the other hand, $f$ has a pole at the origin of degree $k$, then $q$ is of the form $q(z) = (z/\eta)^k\,\tilde q(z)$. So, writing

$$q(z) = \frac{\tilde q(0)}{\eta^k}z^k + c_{k+1}z^{k+1} + \cdots$$

and using (2.28) we again get (8.22). Put $\rho := \theta_1\theta_2 r$ and consider the Nevanlinna-Pick interpolation problem: Find a $w$, analytic in $|z| \le \rho$ such that

$$w(b_j) = g(b_j) \quad \text{for } j = 1, \ldots, n(\rho, f)$$

and such that $M(\rho, w)$ is minimal (with natural modifications if some poles are multiple). It is well known that the solution is unique and that the solution is a rational function of degree at most $n(\rho, f)$. Furthermore

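$$M(\rho, w) \le M(\rho, g) \tag{8.23}$$

since $g$ itself is a feasible function. By construction

$$f - \frac{w}{q} = \frac{1}{q}(g - w)$$

is analytic for $|z| \le \rho = \theta_1\theta_2 r$ and we can estimate it pointwise as follows:

$$\begin{aligned}
\log^+ M\Bigl(\theta_1 r, f - \frac{w}{q}\Bigr) &\le \frac{\theta_2 + 1}{\theta_2 - 1}\,T\Bigl(\rho, f - \frac{w}{q}\Bigr) \\
&\le \frac{\theta_2 + 1}{\theta_2 - 1}\Bigl(T\Bigl(\rho, \frac{1}{q}\Bigr) + T(\rho, g - w)\Bigr) \\
&\le \frac{\theta_2 + 1}{\theta_2 - 1}\bigl(N(\eta, f) + T(\rho, g) + T(\rho, w) + \log 2\bigr).
\end{aligned} \tag{8.24}$$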

Since $T(\eta, q) = 0$ we have

$$T(\rho, g) \le T(\eta, g) \le T(\eta, f),$$

while by (8.23) we obtain

$$T(\rho, w) \le \log^+ M(\rho, w) \le \log^+ M(\rho, g) \le \frac{\theta_3 + 1}{\theta_3 - 1}\,T(\eta, f).$$

Substituting these into (8.24) gives

$$\log^+ M\Bigl(\theta_1 r, f - \frac{w}{q}\Bigr) \le \frac{\theta_2 + 1}{\theta_2 - 1}\Bigl(\Bigl(2 + \frac{\theta_3 + 1}{\theta_3 - 1}\Bigr)T(\eta, f) + \log 2\Bigr). \tag{8.25}$$

We have now an analytic function $f - w/q$ which we shall still approximate with a polynomial $p$. By Theorem 8.1 we have

$$d_0\Bigl(r, f - \frac{w}{q}\Bigr) < \frac{1}{\log\theta_1}\log^+ M\Bigl(\theta_1 r, f - \frac{w}{q}\Bigr) + \frac{1}{\log\theta_1}\log^+\frac{1}{\theta_1 - 1} + 1. \tag{8.26}$$

Thus we have approximated $f$ by a rational function $w/q + p$ and we have

$$d(r, f) \le \deg q + \deg w + \deg p \le 2n(\eta, f) + d_0\Bigl(r, f - \frac{w}{q}\Bigr).$$

This completes the proof. $\Box$

Corollary 8.1 For $\theta > 1$ let $\theta_i > 1$ be such that $\theta_1\theta_2\theta_3\theta_4 = \theta$. Assume $f$ is meromorphic for $|z| < R \le \infty$ and $f(0) \ne \infty$. Then for $\theta r < R$ inequality (8.20) holds with

and

$$C_2(\theta) = \frac{1}{\log\theta_1}\Bigl(\log^+\frac{1}{\theta_1 - 1} + \frac{\theta_2 + 1}{\theta_2 - 1}\log 2\Bigr) + 1.$$

Proof In (8.21) we can now estimate

$\Box$

Example 8.6 Let us apply these bounds for rational functions. First, if $T(r, q)$ satisfies a lower bound

$$T(r, q) \ge d\log^+ r - C \tag{8.27}$$

then we have from Theorem 8.7

$$d\log r \le T(1, q) + d(r, q)\log r + 2\log 2 + C$$

which gives immediately

$$\liminf_{r\to\infty} d(r, q) \ge d. \tag{8.28}$$

Reversely, suppose that $T(r, q)$ satisfies an upper bound

$$T(r, q) \le d\log^+ r + C. \tag{8.29}$$

Of course, we can conclude from this that $q$ is a rational function of degree at most $d$, but we look at what the bound in Theorem 8.8 gives. First, the bound contains the term $2n(\theta r, q)$. For $r \ge 1$ we have, see Lemma 2.2,

$$n(r, q) \le \frac{1}{\log\theta}N(\theta r, q)$$

and so we obtain, by using (8.29) and letting $\theta \to \infty$,

$$n(r, q) \le d.$$

We then show that

$$\inf_{\theta > 1}\{C_1(\theta)\,T(\theta r, q) + C_2(\theta)\} \le 3d + 1. \tag{8.30}$$

Thus, combined we have

$$d(r, q) \le 5d + 1. \tag{8.31}$$

To obtain (8.30) choose $\varepsilon > 0$ and take $\theta_2$ and $\theta_3$ large enough so that

$$C_1(\theta) \le \frac{3 + \varepsilon}{\log\theta_1}.$$

Then (8.29) gives

$$\limsup_{\theta_1\to\infty} C_1(\theta)\,T(\theta r, q) \le (3 + \varepsilon)d.$$

But

$$\limsup_{\theta_1\to\infty} C_2(\theta) \le 1$$

and (8.30) follows.
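The two limits used in this example can be checked directly. The sketch below keeps $\theta_2$, $\theta_3$ and $r$ fixed, uses the expression for $C_2(\theta)$ from Theorem 8.8, and relies only on (8.29) and on $\log^+(ab) \le \log^+ a + \log^+ b$:

```latex
\begin{align*}
C_1(\theta)\,T(\theta r, q)
  &\le \frac{3+\varepsilon}{\log\theta_1}
       \bigl(d\log^{+}(\theta_1\theta_2\theta_3 r) + C\bigr) \\
  &\le \frac{3+\varepsilon}{\log\theta_1}
       \bigl(d\log\theta_1 + d\log^{+}(\theta_2\theta_3 r) + C\bigr)
   \;\longrightarrow\; (3+\varepsilon)\,d
   \qquad (\theta_1 \to \infty), \\
C_2(\theta)
  &= \frac{1}{\log\theta_1}\Bigl(\log^{+}\frac{1}{\theta_1-1}
     + \frac{\theta_2+1}{\theta_2-1}\log 2\Bigr) + 1
   \;\longrightarrow\; 1
   \qquad (\theta_1 \to \infty).
\end{align*}
```

Since $\varepsilon > 0$ is arbitrary, the infimum in (8.30) is at most $3d + 1$, and adding $2n(\theta r, q) \le 2d$ gives (8.31).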

Example 8.7 We saw earlier that $d_0$ codes both the order and type of entire functions accurately. For meromorphic functions the approximate rational degree codes the order accurately but leaves a gap for the type. In fact, suppose $T(r, f)$ grows with a positive order $\omega$ and with a positive type $\sigma$. Then we obtain from Theorem 8.7

$$\limsup_{r\to\infty}\frac{d(r, f)}{r^{\omega}} \ge \sigma e \omega, \tag{8.32}$$

analogously to (8.5). To get an upper bound for $\limsup_{r\to\infty} d(r, f)/r^{\omega}$ we use Theorem 8.8. We take $\theta_1 = e$, $\theta_2 = \theta_3 = 2$ so that $\theta = 4e$. Thus

$$C_1(\theta)\,T(\theta r, f) \le 15\,(4e)^{\omega}\sigma r^{\omega} + o(r^{\omega}).$$

Further, since $n(\theta r, f) \le T(\theta e r, f)$ for $r > 1$, see Lemma 2.2, we conclude

$$\limsup_{r\to\infty}\frac{d(r, f)}{r^{\omega}} \le (15 + 2e^{\omega})(4e)^{\omega}\sigma.$$
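The constants in this example come from plugging $\theta_1 = e$, $\theta_2 = \theta_3 = 2$ into the formula for $C_1(\theta)$ in Theorem 8.8. As a sketch of the arithmetic, writing the type as $\sigma$ and using $T(r, f) \le (\sigma + o(1))\,r^{\omega}$:

```latex
\begin{align*}
C_1(\theta) &= \frac{1}{\log e}\cdot\frac{2+1}{2-1}\cdot
               \Bigl[\,2 + \frac{2+1}{2-1}\Bigr] = 1\cdot 3\cdot 5 = 15, \\
C_1(\theta)\,T(\theta r, f) &\le 15\,\sigma\,(4e\,r)^{\omega} + o(r^{\omega}), \\
2\,n(\theta r, f) \le 2\,T(\theta e r, f)
   &\le 2\,\sigma\,(4e^{2} r)^{\omega} + o(r^{\omega})
    = 2e^{\omega}(4e)^{\omega}\sigma\, r^{\omega} + o(r^{\omega}).
\end{align*}
```

Adding the two contributions, and noting that $C_2(\theta)$ is a constant and hence $o(r^{\omega})$, gives the factor $(15 + 2e^{\omega})(4e)^{\omega}\sigma$. Comparing with the lower bound $\sigma e \omega$ in (8.32) shows the size of the gap left for the type.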

Spijker's lemma

Polynomials satisfy Bernstein's inequality

$$rM(r, p') \le \deg(p)\,M(r, p)$$

which we used to get Theorem 8.6

$$rM(r, f') \le \bigl(d_0(r, rf') + 1\bigr)\bigl(M(r, f) + 1\bigr) + 1. \tag{8.33}$$

For meromorphic $f$ an analogous result can be obtained from the following lemma by M. Spijker.

Lemma 8.1 If $w$ is a rational function, then

$$\frac{r}{2\pi}\int_{-\pi}^{\pi}|w'(re^{i\varphi})|\,d\varphi \le \deg(w)\,\sup_{\varphi}|w(re^{i\varphi})|. \tag{8.34}$$

Proof The original is in [Sp]. $\Box$

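Theorem 8.9 If $f$ is meromorphic in $|z| < R$ and $\theta > 1$ is such that $\theta r < R$, then

$$\frac{r}{2\pi}\int_{-\pi}^{\pi}|f'(re^{i\varphi})|\,d\varphi \le d(\theta r, f)\Bigl(\sup_{\varphi}|f(re^{i\varphi})| + 1 + \frac{1}{\theta - 1}\Bigr). \tag{8.35}$$

Proof Suppose $\deg(w) = d(\theta r, f)$ and $M(\theta r, f - w) \le 1$. Then using (8.34) we obtain

$$\begin{aligned}
\frac{r}{2\pi}\int_{-\pi}^{\pi}|f'(re^{i\varphi})|\,d\varphi &\le \frac{r}{2\pi}\int_{-\pi}^{\pi}|w'(re^{i\varphi})|\,d\varphi + \frac{r}{2\pi}\int_{-\pi}^{\pi}|f'(re^{i\varphi}) - w'(re^{i\varphi})|\,d\varphi \\
&\le d(\theta r, f)\,\sup_{\varphi}|w(re^{i\varphi})| + rM(r, f' - w') \\
&\le d(\theta r, f)\Bigl(\sup_{\varphi}|f(re^{i\varphi})| + 1 + \frac{1}{\theta - 1}M(\theta r, f - w)\Bigr).
\end{aligned}$$

To have the last inequality we used the Cauchy inequality

$$rM(r, f' - w') \le \frac{1}{\theta - 1}M(\theta r, f - w). \qquad \Box$$

Remark 8.1 Observe that if $f$ is rational then for $\varepsilon > 0$ we can apply (8.35) to $\frac{1}{\varepsilon}f$ and recover (8.34) as $d(\theta r, \frac{1}{\varepsilon}f) \le \deg(f)$. This scaling technique can also be used in the following result.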

Theorem 8.10 Let $f$ be meromorphic in $|z| < R$ such that it is analytic in $|z| < R_0$, and has there the expansion

$$f(z) = \sum_{k=0}^{\infty} c_k z^k.$$

Then for $r < R_0$ we have

(8.36)

Proof Let $w$ be rational, of degree $d(r, f)$ such that $M(r, f - w) \le 1$. Then

$$c_k = \frac{1}{2\pi i}\int_{|\zeta| = r}\zeta^{-k-1}w(\zeta)\,d\zeta + \frac{1}{2\pi i}\int_{|\zeta| = r}\zeta^{-k-1}\bigl(f(\zeta) - w(\zeta)\bigr)\,d\zeta$$

where by partial integration and Spijker's Lemma

$$\Bigl|\frac{1}{2\pi i}\int_{|\zeta| = r}\zeta^{-k-1}w(\zeta)\,d\zeta\Bigr| \le \frac{1}{k}\deg(w)\,r^{-k}M(r, w)$$

while

$$\Bigl|\frac{1}{2\pi i}\int_{|\zeta| = r}\zeta^{-k-1}\bigl(f(\zeta) - w(\zeta)\bigr)\,d\zeta\Bigr| \le r^{-k}.$$

The claim then follows as $M(r, w) \le M(r, f) + 1$. $\Box$
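Putting the two integral estimates of the proof together gives an explicit coefficient bound. The display below is a sketch of what the argument yields for $k \ge 1$ (the partial integration step divides by $k$); it is derived here from the steps of the proof and is offered as an illustration of the mechanism rather than as a verbatim restatement of (8.36):

```latex
\begin{align*}
|c_k|
  &\le \Bigl|\frac{1}{2\pi i}\int_{|\zeta|=r}\zeta^{-k-1}w(\zeta)\,d\zeta\Bigr|
     + \Bigl|\frac{1}{2\pi i}\int_{|\zeta|=r}\zeta^{-k-1}
         \bigl(f(\zeta)-w(\zeta)\bigr)\,d\zeta\Bigr| \\
  &\le \frac{1}{k}\deg(w)\,r^{-k}M(r,w) + r^{-k} \\
  &\le \Bigl(\frac{1}{k}\,d(r,f)\,\bigl(M(r,f)+1\bigr) + 1\Bigr)\,r^{-k},
  \qquad k \ge 1 .
\end{align*}
```

Compared with the plain Cauchy estimate $|c_k| \le M(r, f)\,r^{-k}$, the gain is the factor $1/k$ in front of $M(r, f)$, paid for by the approximate rational degree $d(r, f)$.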

Theorem 8.12 below contains a different variant of this mechanism.

Power bounded operators and bounds for the Laurent coefficients

Let $A$ be a bounded operator in a Banach space. It is a very basic task to give conditions on the resolvent that guarantee power boundedness

$$\|A^n\| \le K \quad \text{for } n = 0, 1, 2, \ldots \tag{8.37}$$

(see Prologue). A necessary condition is obtained easily from (8.37). In fact, for $|z| < 1$ we obtain

This is often called the Kreiss resolvent condition and we may write it here as follows:

$$M\bigl(r, (I - zA)^{-1}\bigr) \le \frac{K}{1 - r} \quad \text{for } r < 1. \tag{8.38}$$

This does allow a linear growth $\|A^n\| = O(n)$. We shall assume that the resolvent is additionally meromorphic in some neighborhood of the unit disc. Together with this the Kreiss condition is sufficient. We make this quantitative by assuming for some $\theta > 1$ and $L = L(\theta) < \infty$

(8.39)

Theorem 8.11 For each $\theta > 1$ there are constants $C_i(\theta)$, $i = 1, 2, 3$ such that if the resolvent is meromorphic for $|z| \le \theta$ and the conditions (8.38) and (8.39) hold, then for $n = 1, 2, \ldots$

(8.40)

Proof This is a special case of the following result on Laurent coefficients of meromorphic operator valued functions. $\Box$
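The necessity of the Kreiss resolvent condition mentioned above follows from the Neumann series. As a sketch, assuming only the power bound (8.37), which in particular makes the series converge for $|z| < 1$:

```latex
\[
  \|(I - zA)^{-1}\|
  = \Bigl\|\sum_{n=0}^{\infty} A^{n} z^{n}\Bigr\|
  \le \sum_{n=0}^{\infty} \|A^{n}\|\,|z|^{n}
  \le K\sum_{n=0}^{\infty}|z|^{n}
  = \frac{K}{1-|z|}, \qquad |z| < 1 .
\]
```

Taking the supremum over $|z| = r$ gives (8.38) with the same constant $K$.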

Theorem 8.12 For each $\theta > 1$ there are constants $C_i(\theta)$, $i = 1, 2, 3$ such that the following holds. Assume that $F$ is a $B(X)$-valued function, meromorphic for $|z| < \theta$ and analytic for $0 < |z| < 1$, satisfying the quantitative estimates

$$\limsup_{|z|\to 1-}(1 - |z|)\,\|F(z)\| \le K \tag{8.41}$$

and

$$\sup_{r < \theta} T_\infty(r, F) \le L. \tag{8.42}$$

[...] 0 for some $a$, then we call $A$ defective, otherwise it is nondefective.

Theorem 10.11 Let $0 \ne A \in M_d$. Then for $a \ne 1$

$$\gamma_\infty(a) = \gamma_\infty(0).$$

TENTH CHAPTER

Proof Write $F(z) - aI = (1 - a)\bigl(I - \frac{z}{1 - a}A\bigr)$, so that

$$m_\infty\bigl(r, (F(z) - aI)^{-1}\bigr) = m_\infty\Bigl(\frac{r}{|1 - a|}, F^{-1}\Bigr) + O(1). \tag{10.27}$$

From the proof of Theorem 10.4 we know that with some integer $k$ we have

$$m_\infty(r, F^{-1}) = k\log^+ r + O(1)$$

which substituted into (10.27) gives

$$m_\infty\bigl(r, (F(z) - aI)^{-1}\bigr) = m_\infty\bigl(r, F(z)^{-1}\bigr) + O(1).$$

Hence $\gamma_\infty(a)$ is independent of $a$. $\Box$

Corollary 10.1 $A \in M_d$ is defective, in the sense of Definition 10.4, if and only if 0 is a defective eigenvalue.

Proof This follows from $\gamma_\infty(0) = \delta_\infty(0)$. $\Box$

Example 10.5 The operator $V^2$ is quasinilpotent and thus $(I - zV^2)^{-1}$ is entire. Clearly all quasinilpotent operators are defective as $\gamma_\infty(0) = 1$. Recall that $V^2$ is a rank-1 perturbation of a self-adjoint operator $A$ (see Example 1.5).

Theorem 10.12 If $A$ is almost algebraic and defective, then the set of values $a$ for which $\gamma_\infty(a) > 0$ contains a circle.

Proof If $a \ne 1$ is such a value, then set $\rho := |1 - a|$. Writing as in (10.27) we see that all $b$'s on the circle $\rho = |1 - b|$ satisfy $\gamma_\infty(b) = \gamma_\infty(a) > 0$. $\Box$

We end this with a natural statement on diagonalizable operators.

Theorem 10.13 Let $A$ be an almost algebraic operator of at most finite order in a Hilbert space. If it is similar to a normal operator, then it is nondefective.

Proof Clearly defectiveness is preserved under similarity transformations. Assume thus that $A$ is normal. Now the statement is essentially that of Theorem 6.4, except that we need to check all $a \ne 1$. Comparing with the proof of Theorem 6.4 we see that $m_\infty(r, F^{-1}) = o\bigl(\log T_\infty(er, F^{-1})\bigr)$. Using again (10.27) yields the result. $\Box$

Comment 10.1 Some recent developments in the value distribution theory, in particular for holomorphic curves and quasiregular maps, are summarized in [Er].

EPILOGUE

Keyplaces: Toronto, Karjalohja.

Lecturing and typing in Toronto

During October 2001 I gave ten lectures at the Fields Institute in Toronto. Each lecture formed the basis of a chapter in this book. My former book [NO1] was intended to be an easy-to-read textbook on waveform relaxation - but it transformed into a difficult-to-read research monograph on convergence theory for iterative methods in an abstract setting. Likewise, this book was intended to be an easy-to-read textbook on matrix valued meromorphic functions - but it transformed into an extended version of [NO8] instead, where we alternate between matrix- and operator-valued functions. To be exact, one chapter contains material from two lectures, and the tenth chapter is an exceptional chapter, or simply a defective one, written later as a partial and simple-minded answer to a natural question: the word defective appears both in the value distribution theory and in linear algebra, so, are they related?

Fishing and finishing in Karjalohja

A year later I am finishing this monograph at our summer home in Karjalohja. My grandfather worked winters in an insurance company in Helsinki but spent his summers here doing mathematics. During the winters he and Rolf Nevanlinna had their offices close by in Helsinki - at times they even shared an office. Rolf's summer home was across the lake - the ride on a motor boat, at six knots, took twenty minutes. Olli Lehto has written (in Finnish) a biography of Rolf Nevanlinna which appeared in the fall of 2001 when I got back from Toronto. There is an interesting section on the birth of Nevanlinna theory with a discussion on the mutual relations between the two brothers. My grandfather kept a diary all his adult life. Unfortunately, the diaries from the years around 1925 are missing, but Lehto's book includes diary quotations from later years. I decided to include the Prologue in this book for several reasons. One is this: if a Nevanlinna writes about Nevanlinna theory three quarters of a century after its birth, some explanation is wanted, and I had already published a version of the Prologue in Finnish.

During the seven or so years on this project many people have been of great help. I want to thank them all, but especially my hosts and the personnel at the Fields Institute and Bob, Carl, Jarmo, Marja, Marko, Nikolai, Olli-Pekka, Saara, Ulla, Timo and Xiaoushu.

It would be only natural to dedicate this book to the brothers Frithiof and Rolf. However, I dedicate this to my father. He lost his elder brother in the war and was wounded himself. I grew up in the independent Finland. When we go fishing we use a rowboat, and the lake remains quiet.

BIBLIOGRAPHY [AI] [A2] [Bo] [Dr] [Du-S] [Er]

[F] [Go-K]

[H] [Ha-K] [Ho-J1] [Ho-J2] [Hu-N] [Hy] [Hy-N] [Ko] [Ma] [Mi] [Mii] [NF] [N01] [N02] [N03] [N04] [N05]

[N06] [N07] [N08]

Aupetit, B. [1991] A Primer on Spectral Theory, Springer-Verlag, New York. Aupetit, B. [1997] On log-subharmonicity of singular values of matrices, Studia Mathematica 122(2), 195-200. Boas Jr., R. P. [1054] Entire Functions, Academic Press. Drasin, D. [1987] ProoLoJ..aS!!!}jecture of F. Nevanlinna concerning junctions which have deficiency sum ~~Mat:h:~-Q4. Dunford, N an«SchwiiJ:tz', 'J:;.'t'z.[19p3j zan"/lr Operators, Part II: Spectral Theory, Interscience. ." ~. . Eremenko,~. [2002) Va(~~.·R¥.~butio_n'~nd ~otential Theory, Proceedings of ICM 2002, Vol II, Hig~ lM!Jcation ,PresSi.:i>p",~681-:69Q; Faddeeva, V. ~959tComp~~ti"na!.~ods of Linear Algebra, Dover. Gohberg, 1. and KrlJioB;.;~LU.~.l1J,..1ntt"oduction to the Theory of Linear Nonselfadjoint Operators, AMS Translations of Mathematical Monographs, Vol. 18. Halmos, P. R. [1971] Capacity in Banach algebras, Indiana Univ. Math. 20, 255-863. Hayman, W. K. and Kennedy, P. B. [1976] Subharmonic Functions, Vol. I., Academic Press. Horn, R. A. and Johnson, C. R. [1985] Matrix Analysis, Cambridge Univ. Press. Horn, R. A. and Johnson, C. R. [1991] Topics in Matrix Analysis, Cambridge Univ. Press. Huhtanen, M. and Nevanlinna, O. [2000] Minimal decompositions and iterative methods, Numer. Math. 86(2), 257-282. Hyvonen, S. [1997/98] Case studies on growth properties of meromorphic resolvents, Insitute Mittag-Leffler Report No. 18. Hyvonen, S. and Nevanlinna, O. [2000] Robust bounds for Krylov methods, BIT 40(2), 267-290. Konig, H. [1986] Eigenvalue Distribution of Compact Operators, Birkhiiuser. Matsaev, V. 1. [1964] Doklady Akademii Nauk SSSR 154/5, 1034-1037. Miles, J. [1972] Quotient representations of meromorphic junctions, J. Analyse Math. 25, 371-388. Miiller, V. [1987] On quasialgebraic operators in Banach spaces, Operator Theory 17, 291-300. Nevanlinna, F. [1930] Uber eine Klasse meromorpher Funktionen, Den syvende skandinaviske matematikerkongress i Oslo 19-22 August 1929, A. W. Broggers, Oslo. Nevanlinna, O. 
[1993] Convergence of Iterations for Linear Equations, Birkhiiuser. Nevanlinna, O. [1991] Tiede 2000, vol. 3, s. 51. Nevanlinna, O. [1996] Meromorphic resolvents and power bounded operators, BIT 36(3), 531-54l. Nevanlinna, O. [1996] Convergence of Krylov methods for sums of two operators, BIT 36(4), 775-785. Nevanlinna, O. [1996] A characteristic junction for matrix valued meromorphic junctions, XVIth Rolf Nevanlinna Colloquium, Eds. Laine/Martio, Walter de Gruyter & Co, Berlin, pp. 171-179. Nevanlinna, O. [1998] Juhlien jalkeen, Arkhimedes 3, 12-19. Nevanlinna, O. [1997] On the growth of the resolvent operators for power bounded operators, Banach Center Publications 38,247-264. Nevanlinna, O. [2000] Growth of operator valued meromorphic junctions, Ann. Acad. Sci. Fenn. Math. 25, 3-30. 135

[No9] Nevanlinna, O. [2001] Resolvent conditions and powers of operators, Studia Mathematica 145(2), 113-134.
[NR1] Nevanlinna, R. [1925] Zur Theorie der meromorphen Funktionen, Acta Math. 46, 1-99.
[NR2] Nevanlinna, R. [1970] Analytic Functions, Springer-Verlag.
[Ri-V] Ribarič, M. and Vidav, I. [1969] Analytic properties of the inverse A(z)^{-1} of an analytic linear operator valued function A(z), Arch. Rational Mech. Anal. 32, 298-310.
[Ru] Rubel, L. A. [1996] Entire and Meromorphic Functions, Springer-Verlag, Universitext.
[Se] Segal, S. [1996] Nine Introductions in Complex Analysis, North-Holland Publ. Co.
[Sp] Spijker, M. N. [1991] On a conjecture by LeVeque and Trefethen related to the Kreiss matrix theorem, BIT 31, 551-555.
[Wa] Walsh, J. L. [1935] Interpolation and approximation by rational functions in the complex domain, AMS Colloquium Publications, vol. XX.
[Ya] Yang, L. Value Distribution Theory, Springer-Verlag, Berlin, and Science Press, Beijing.

Titles in This Series

18 Olavi Nevanlinna, Meromorphic functions and linear algebra, 2003
17 Vitaly I. Voloshin, Coloring mixed hypergraphs: theory, algorithms and applications, 2002
16 Neal Madras, Lectures on Monte Carlo Methods, 2002
15 Bradd Hart and Matthew Valeriote, Editors, Lectures on algebraic model theory, 2002
14 Frank den Hollander, Large deviations, 2000
13 B. V. Rajarama Bhat, George A. Elliott, and Peter A. Fillmore, Editors, Lectures in operator theory, 2000
12 Salma Kuhlmann, Ordered exponential fields, 2000
11 Tibor Krisztin, Hans-Otto Walther, and Jianhong Wu, Shape, smoothness and invariant stratification of an attracting set for delayed monotone positive feedback, 1999
10 Jiří Patera, Editor, Quasicrystals and discrete geometry, 1998
9 Paul Selick, Introduction to homotopy theory, 1997
8 Terry A. Loring, Lifting solutions to perturbing problems in C*-algebras, 1997
7 S. O. Kochman, Bordism, stable homotopy and Adams spectral sequences, 1996
6 Kenneth R. Davidson, C*-Algebras by example, 1996
5 A. Weiss, Multiplicative Galois module structure, 1996
4 Gérard Besson, Joachim Lohkamp, Pierre Pansu, and Peter Petersen; Miroslav Lovric, Maung Min-Oo, and McKenzie Y.-K. Wang, Editors, Riemannian geometry, 1996
3 Albrecht Böttcher, Aad Dijksma and Heinz Langer, Michael A. Dritschel and James Rovnyak, and M. A. Kaashoek; Peter Lancaster, Editor, Lectures on operator theory and its applications, 1996
2 Victor P. Snaith, Galois module structure, 1994
1 Stephen Wiggins, Global dynamics, phase space transport, orbits homoclinic to resonances, and applications, 1993
