This book generalizes the classical theory of orthogonal polynomials on the complex unit circle or on the real line to orthogonal rational functions whose poles are among a prescribed set of complex numbers. The first part treats the case where these poles are all outside the unit disk or in the lower half plane. Classical topics such as recurrence relations, numerical quadrature, interpolation properties, Favard theorems, convergence, asymptotics, and moment problems are generalized and treated in detail. The same topics are discussed for the different situation where the poles are located on the unit circle or on the extended real line. In the last chapter, several applications are mentioned including linear prediction, Pisarenko modeling, lossless inverse scattering, and network synthesis. This theory has many applications in both theoretical real and complex analysis, approximation theory, numerical analysis, system theory, and in electrical engineering. Adhemar Bultheel is a professor in the Computer Science Department of Katholieke Universiteit Leuven. In addition to coauthoring several books, he teaches introductory courses in analysis and numerical analysis for engineering students and an advanced course in signal processing for computer science and mathematics. Pablo Gonzalez-Vera is a professor in the Faculty of Mathematics at La Laguna University, Canary Islands. He teaches numerical analysis in mathematics, introductory courses in calculus for engineering, as well as advanced courses in numerical integration for physics and mathematics. Erik Hendriksen is currently a researcher with the Department of Mathematics at the University of Amsterdam. He teaches introductory courses in analysis and linear algebra for students in mathematics and physics and advanced courses in functional analysis. Olav Njastad is a professor in the Department of Mathematical Sciences of the Norwegian University of Science and Technology. 
He is currently teaching introductory and advanced courses in analysis.
CAMBRIDGE MONOGRAPHS ON APPLIED AND COMPUTATIONAL MATHEMATICS Series Editors P. G. CIARLET, A. ISERLES, R. V. KOHN, M. H. WRIGHT
5
Orthogonal Rational Functions
The Cambridge Monographs on Applied and Computational Mathematics reflects the crucial role of mathematical and computational techniques in contemporary science. The series publishes expositions on all aspects of applicable and numerical mathematics, with an emphasis on new developments in this fast-moving area of research. State-of-the-art methods and algorithms as well as modern mathematical descriptions of physical and mechanical ideas are presented in a manner suited to graduate research students and professionals alike. Sound pedagogical presentation is a prerequisite. It is intended that books in the series will serve to inform a new generation of researchers.
Also in this series:
A Practical Guide to Pseudospectral Methods, Bengt Fornberg
Dynamical Systems and Numerical Analysis, A. M. Stuart and A. R. Humphries
Level Set Methods, J. A. Sethian
The Numerical Solution of Integral Equations of the Second Kind, Kendall E. Atkinson
Orthogonal Rational Functions
ADHEMAR BULTHEEL
PABLO GONZALEZ-VERA
ERIK HENDRIKSEN
OLAV NJASTAD
CAMBRIDGE UNIVERSITY PRESS
CAMBRIDGE UNIVERSITY PRESS Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, Sao Paulo Cambridge University Press The Edinburgh Building, Cambridge CB2 8RU, UK Published in the United States of America by Cambridge University Press, New York www.cambridge.org Information on this title: www.cambridge.org/9780521650069 © Cambridge University Press 1999 This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. First published 1999 A catalogue record for this publication is available from the British Library Library of Congress Cataloguing in Publication data Orthogonal rational functions / Adhemar Bultheel ... [et al.]. p. cm. - (Cambridge monographs on applied and computational mathematics ; 4) Includes bibliographical references. ISBN 0-521-65006-2 (hb) 1. Functions, Orthogonal. 2. Functions of complex variables. I. Bultheel, Adhemar. II. Series. QA404.5.075 1999 515'.55-dc21 98-11646 CIP ISBN 978-0-521-65006-9 hardback Transferred to digital printing 2007
Contents

List of symbols
Introduction
1 Preliminaries
  1.1 Hardy classes
  1.2 The classes $\mathcal{C}$ and $\mathcal{B}$
  1.3 Factorizations
  1.4 Reproducing kernel spaces
  1.5 J-unitary and J-contractive matrices
2 The fundamental spaces
  2.1 The spaces $\mathcal{L}_n$
  2.2 Calculus in $\mathcal{L}_n$
  2.3 Extremal problems in $\mathcal{L}_n$
3 The kernel functions
  3.1 Christoffel-Darboux relations
  3.2 Recurrence relations for the kernels
  3.3 Normalized recursions for the kernels
4 Recurrence and second kind functions
  4.1 Recurrence for the orthogonal functions
  4.2 Functions of the second kind
  4.3 General solutions
  4.4 Continued fractions and three-term recurrence
  4.5 Points not on the boundary
5 Para-orthogonality and quadrature
  5.1 Interpolatory quadrature
  5.2 Para-orthogonal functions
  5.3 Quadrature
  5.4 The weights
  5.5 An alternative approach
6 Interpolation
  6.1 Interpolation properties for orthogonal functions
  6.2 Measures and interpolation
  6.3 Interpolation properties for the kernels
  6.4 The interpolation algorithm of Nevanlinna-Pick
  6.5 Interpolation algorithm for the orthonormal functions
7 Density of the rational functions
  7.1 Density in $L_p$ and $H_p$
  7.2 Density in $L_2(\mu)$ and $H_2(\mu)$
8 Favard theorems
  8.1 Orthogonal functions
  8.2 Kernels
9 Convergence
  9.1 Generalization of the Szego problem
  9.2 Further convergence results and asymptotic behavior
  9.3 Convergence of $\phi_n^*$
  9.4 Equivalence of conditions
  9.5 Varying measures
  9.6 Stronger results
  9.7 Weak convergence
  9.8 Erdos-Turan class and ratio asymptotics
  9.9 Root asymptotics
  9.10 Rates of convergence
10 Moment problems
  10.1 Motivation and formulation of the problem
  10.2 Nested disks
  10.3 The moment problem
11 The boundary case
  11.1 Recurrence for points on the boundary
  11.2 Functions of the second kind
  11.3 Christoffel-Darboux relation
  11.4 Green's formula
  11.5 Quasi-orthogonal functions
  11.6 Quadrature formulas
  11.7 Nested disks
  11.8 Moment problem
  11.9 Favard type theorem
  11.10 Interpolation
  11.11 Convergence
12 Some applications
  12.1 Linear prediction
  12.2 Pisarenko modeling problem
  12.3 Lossless inverse scattering
  12.4 Network synthesis
  12.5 $H_\infty$ problems
    12.5.1 The standard $H_\infty$ control problem
    12.5.2 Hankel operators
    12.5.3 Hankel norm approximation
Conclusion
Bibliography
Index
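List of symbols

$\mathbb{C}$: the complex plane, $\mathbb{C} = \{z = \operatorname{Re} z + i \operatorname{Im} z\}$
$\hat{\mathbb{C}}$: the Riemann sphere, $\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$
$\mathbb{R}$, $\hat{\mathbb{R}}$: the real line, $\mathbb{R} = \{z \in \mathbb{C} : \operatorname{Im} z = 0\}$, $\hat{\mathbb{R}} = \mathbb{R} \cup \{\infty\}$
$\mathbb{Z}$: the integers
$\mathbb{U}$, $\mathbb{L}$, $\mathbb{H}$: $\mathbb{U} = \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$, $\mathbb{L} = \{z \in \mathbb{C} : \operatorname{Im} z < 0\}$, $\mathbb{H} = \{z \in \mathbb{C} : \operatorname{Re} z > 0\}$
$\mathbb{D}$, $\mathbb{T}$, $\mathbb{E}$: $\mathbb{D} = \{z \in \mathbb{C} : |z| < 1\}$, $\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$, $\mathbb{E} = \{z \in \mathbb{C} : |z| > 1\}$
$\mathbb{O}$, $\partial\mathbb{O}$, $\mathbb{O}^e$: $\mathbb{O}$ is the region in $\mathbb{C}$: $\mathbb{D}$ or $\mathbb{U}$; $\partial\mathbb{O}$ is the boundary of $\mathbb{O}$: $\mathbb{T}$ or $\mathbb{R}$; $\mathbb{O}^e$ is the exterior of $\mathbb{O}$: $\mathbb{E}$ or $\mathbb{L}$
$\mathring{\mu}$: normalized measure on $\partial\mathbb{O}$: $d\mathring{\mu}(t) = d\mu(t)/(1 + t^2)$ on $\mathbb{R}$, $d\mathring{\mu}(t) = d\mu(t)$ on $\mathbb{T}$
$\lambda$, $\mathring{\lambda}$: normalized Lebesgue measures: $d\lambda(t) = \pi^{-1}\,dt$ for $\mathbb{R}$, $d\lambda(t) = (2\pi)^{-1}\,d\theta$, $t = e^{i\theta}$, for $\mathbb{T}$; $d\mathring{\lambda}(t) = d\lambda(t)/(1 + t^2)$ on $\mathbb{R}$, $d\mathring{\lambda}(t) = d\lambda(t)$ on $\mathbb{T}$
$H(X)$: functions holomorphic in $X$
$H_p$, $N$: Hardy and Nevanlinna classes
$\mathring{H}_p$: $\mathring{H}_p = H_p(\mathring{\lambda})$ for $\mathbb{R}$, $\mathring{H}_p = H_p(\lambda)$ for $\mathbb{T}$
$\langle f, g \rangle_{\mathring{\mu}}$: inner products: $\langle f, g \rangle_{\mathring{\mu}} = \int_{\mathbb{R}} f(t)\overline{g(t)}\,d\mathring{\mu}(t)$ for $\mathbb{U}$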
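$\hat{z}$: reflection in the boundary: $\hat{z} = 1/\bar{z}$ for $\mathbb{D}$, $\hat{z} = \bar{z}$ for $\mathbb{U}$
$f_*(z)$: substar conjugate: $f_*(z) = \overline{f(\hat{z})}$
$\alpha_k$, $k \ge 0$: basic interpolation points: $\alpha_k \in \partial\mathbb{O}$ for the boundary case, $\alpha_k \in \mathbb{O}$ otherwise
$A_n$, $\hat{A}_n$: $A_n = \{\alpha_1, \ldots, \alpha_n\}$, $\hat{A}_n = \{\hat{\alpha}_1, \ldots, \hat{\alpha}_n\}$
$A$, $\hat{A}$: $A = \{\alpha_1, \alpha_2, \ldots\}$, $\hat{A} = \{\hat{\alpha}_1, \hat{\alpha}_2, \ldots\}$
$A_n^w$, $A_n^0$: $A_n^w = \{w, \alpha_1, \ldots, \alpha_n\}$, $A_n^0 = \{\alpha_0, \alpha_1, \ldots, \alpha_n\}$
$\mathbb{O}_0$, $\mathbb{O}^e_0$: $\mathbb{O}_0 = \mathbb{O} \setminus A_n$, $\mathbb{O}^e_0 = \mathbb{O}^e \setminus \hat{A}_n$
$\alpha_0$: special point: $\alpha_0 = 0$ for $\mathbb{D}$, $\alpha_0 = i$ for $\mathbb{U}$; in the boundary case: $\alpha_0 = -1$ for $\mathbb{T}$, $\alpha_0 = \infty$ for $\mathbb{R}$
$\varpi_w(z)$, $\varpi_i(z)$: $\varpi_w(z) = 1 - \bar{w}z$ for $\mathbb{D}$, $\varpi_w(z) = z - \bar{w}$ for $\mathbb{U}$; $\varpi_i = \varpi_{\alpha_i}$
$\pi_n(z)$: $\pi_n(z) = \prod_{k=1}^{n} \varpi_k(z)$
$D(t,z)$, $E(t,z)$: Riesz-Herglotz-Nevanlinna kernel: $D(t,z) = (t + z)/(t - z)$ for $\mathbb{D}$, $D(t,z) = -i(1 + tz)/(t - z)$ for $\mathbb{U}$; $E(t,z) = 1 + D(t,z)$
$\mathcal{C}$: positive real functions: $\mathcal{C} = \{f \in H(\mathbb{O}) : f(\mathbb{O}) \subset \mathbb{H}\}$
$\mathcal{B}$: bounded analytic functions: $\mathcal{B} = \{f \in H(\mathbb{O}) : f(\mathbb{O}) \subset \mathbb{D}\}$
$\mathcal{A}$: $\mathcal{A} = \{[\Delta_1\ \Delta_2] : \Delta_1, \Delta_2 \in H(\mathbb{O}),\ \Delta_2(z) \ne 0,\ z \in \mathbb{O},\ \Delta_1/\Delta_2 \in \mathcal{B}\}$
$\Omega_\mu(z)$: Riesz-Herglotz-Nevanlinna transform: $\Omega_\mu(z) = ic + \int D(t,z)\,d\mathring{\mu}(t)$, $c \in \mathbb{R}$
$C(t,z)$: Cauchy kernel: $C(t,z) = [\varpi_0(\alpha_0)\,\varpi_{z*}(t)]^{-1}$ for $\mathbb{O}$; $C(t,z) = 1/[2i(t - z)]$ for $\mathbb{U}$
$P(t,z)$: Poisson kernel: $P(t,z) = [\varpi_z(z)/\varpi_0(\alpha_0)]/[\varpi_z(t)\,\varpi_{z*}(t)]$ for $\mathbb{O}$; $P(t,z) = (1 - |z|^2)/|t - z|^2$ for $\mathbb{D}$ if $t \in \mathbb{T}$; $P(t,z) = \operatorname{Im} z/|t - z|^2$ for $\mathbb{U}$ if $t \in \mathbb{R}$
$M_w$: Mobius transform: $M_w(z) = (z - w)/(1 - \bar{w}z)$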
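$\sigma(z)$: spectral factor: $\sigma(z) = c \exp\{\tfrac{1}{2}\int D(t,z) \log \mu'(t)\,d\mathring{\lambda}(t)\}$, $c \in \mathbb{T}$
$\mathcal{P}_n$, $n \ge 0$, $\mathcal{P}_\infty$, $\mathcal{P}$: polynomials of degree $\le n$; $\mathcal{P}_{-n} = \mathcal{P}_{n*} = \{p : p_* \in \mathcal{P}_n\}$; $\mathcal{P}_\infty = \bigcup_{n=0}^{\infty} \mathcal{P}_n$, $\mathcal{P} = \operatorname{closure}(\mathcal{P}_\infty)$
$p^*$, $p \in \mathcal{P}_n$: [description not legible in the source]
$\zeta_i$, $B_n$: [description not legible in the source]
$Z_i$, $i > 0$: basic factors for boundary case: $Z_i = i(1 - z)(1 - \alpha_i)/(z - \alpha_i)$ on $\mathbb{T}$, $Z_i = z/(1 - z/\alpha_i)$ on $\mathbb{R}$ [general form on $\partial\mathbb{O}$ not legible in the source]
$b(z)$: numerator of basic factors in boundary case: $b(z) = i(1 - z)$ on $\mathbb{T}$, $b(z) = z$ on $\mathbb{R}$ [general form on $\partial\mathbb{O}$ not legible in the source]
$b_n$: basis functions for boundary case: $b_0 = 1$, $b_n(z) = \prod_{i=1}^{n} Z_i(z)$, $n \ge 1$
$\mathcal{L}_n$, $\mathcal{L}_\infty$, $\mathcal{L}$: fundamental spaces: $\mathcal{L}_n = \operatorname{span}\{b_k : k = 0, \ldots, n\}$ for boundary case, $\mathcal{L}_n = \operatorname{span}\{B_k : k = 0, \ldots, n\}$ otherwise; $\mathcal{L}_\infty = \bigcup_{n=0}^{\infty} \mathcal{L}_n$, $\mathcal{L} = \operatorname{closure}(\mathcal{L}_\infty)$
$\mathcal{L}_n(w)$: $\mathcal{L}_n(w) = \{f \in \mathcal{L}_n : f(w) = 0\}$
$f^*$, $f \in \mathcal{L}_n$: superstar conjugate in $\mathcal{L}_n$: $f^*(z) = B_n(z) f_*(z)$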
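$\kappa_n$: leading coefficient of $\phi_n$: $\kappa_n = \phi_n^*(\alpha_n)$; boundary case: $\phi_n(z) = \cdots + \kappa_n' b_{n-1}(z) + \kappa_n b_n(z)$
$\psi_n$, $\Psi_n$: functions of the second kind [defining formulas not legible in the source]
$k_n(z,w)$: reproducing kernels for $\mathcal{L}_n$: $k_n(z,w) = \sum_{i=0}^{n} \phi_i(z)\overline{\phi_i(w)}$
$K_n(z,w)$: normalized reproducing kernels for $\mathcal{L}_n$: $K_n(z,w) = k_n(z,w)/k_n(w,w)^{1/2}$
$Q_n(z)$: para-orthogonal functions: $Q_n(z) = Q_n(z,\tau) = \phi_n(z) + \tau\phi_n^*(z)$, $\tau \in \mathbb{T}$; boundary case: quasi-orthogonal functions, a combination of $\phi_n(z)$ and $\phi_{n-1}(z)$ with parameter $\tau \in \hat{\mathbb{R}}$ [formula not legible in the source]
$P_n(z)$: associated functions of the second kind: $P_n(z) = P_n(z,\tau) = \psi_n(z) - \tau\psi_n^*(z)$; boundary case: a combination of $\psi_n(z)$ and $\psi_{n-1}(z)$ [formula not legible in the source]
$s_w(z) = s(z,w)$: Szego kernel: $s(z,w) = 1/[\overline{\sigma(w)}(1 - \bar{w}z)\sigma(z)]$ for $\mathbb{D}$; $s(z,w) = [(z + i)(i - \bar{w})]/[2i\,\overline{\sigma(w)}(z - \bar{w})\sigma(z)]$ for $\mathbb{U}$ [general form for $\mathbb{O}$ not legible in the source]
$S_w(z) = S(z,w)$: $S_w(z) = s_w(z)/\sqrt{s_w(w)}$
$\mathcal{L}_{p,q}$, $p, q \ge 0$: $\mathcal{L}_{p,q} = \mathcal{L}_q \cdot \mathcal{L}_{p*}$, $\mathcal{L}_{0,n} = \mathcal{L}_n$, $\mathcal{L}_{n,0} = \mathcal{L}_{-n} = \mathcal{L}_{n*}$
$\mathcal{R}_n$, $\mathcal{R}_\infty$, $\mathcal{R}$: $\mathcal{R}_n = \mathcal{L}_{n,n}$; $\mathcal{R}_\infty = \bigcup_{n=0}^{\infty} \mathcal{R}_n$, $\mathcal{R} = \operatorname{closure}(\mathcal{R}_\infty)$
$\mu_n \overset{*}{\to} \mu$: convergence in weak star topology
$\lim f(z)$: orthogonal limit to $\alpha \in \partial\mathbb{O}$: $\lim_{r \uparrow 1} f(r\alpha)$ for $\mathbb{T}$; $\lim_{y \downarrow 0} f(\alpha + iy)$ for $\mathbb{R}$ [form for $\alpha = \infty$ not legible in the source]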
Introduction
This monograph forms an introduction to the theory of orthogonal rational functions. The simplest way to see what we mean by orthogonal rational functions is to consider them as generalizations of orthogonal polynomials. There is not much confusion about the meaning of an orthogonal polynomial sequence. One says that $\{\phi_n\}_{n=0}^{\infty}$ is an orthogonal polynomial sequence if $\phi_n$ is a polynomial of degree $n$ and it is orthogonal to all polynomials of lower degree. Thus given some finite positive measure $\mu$ (with possibly complex support), one considers the Hilbert space $L_2(\mu)$ of square integrable functions that contains the polynomial subspaces $\mathcal{P}_n$, $n = 0, 1, \ldots$. Then $\{\phi_n\}_{n=0}^{\infty}$ is an orthogonal polynomial sequence if $\phi_n \in \mathcal{P}_n \setminus \mathcal{P}_{n-1}$ and $\phi_n \perp \mathcal{P}_{n-1}$. In particular, when the support of the measure is (part of) the real line or of the complex unit circle, one gets the most widely studied cases of such general orthogonal polynomials. Such orthogonal polynomials appear of course in many different aspects of theoretical analysis and applications. The topics that are central in our generalization to rational functions are moment problems, quadrature formulas, and classical problems of complex approximation in the complex plane.

Polynomials can be seen as rational functions whose poles are all fixed at infinity. For the orthogonal rational functions, we shall fix a sequence of poles $\{\gamma_k\}_{k=1}^{\infty}$, which, in principle, can be taken anywhere in the extended complex plane. Some of these $\gamma_k$ can be repeated, possibly an infinite number of times, or they could be infinite. However, the sequence is fixed once and for all and the order in which the $\gamma_k$ occur (possible repetitions included) is also given. This will then define the $n$-dimensional spaces of rational functions $\mathcal{L}_n$ that consist of all the rational functions of degree $n$ whose poles are among $\gamma_1, \ldots, \gamma_n$ (including possible repetitions). We then consider $\{\phi_n\}_{n=0}^{\infty}$ to be a sequence of orthogonal rational functions if $\phi_n \in \mathcal{L}_n \setminus \mathcal{L}_{n-1}$ and $\phi_n \perp \mathcal{L}_{n-1}$.
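The definition above can be tried out numerically. The following is a minimal sketch, not the construction used in the monograph: it assumes a hypothetical discrete measure on the unit circle (the names `nodes`, `weights`) and three hypothetical poles outside the closed disk (`poles`), takes the nested basis $1, 1/(z-\gamma_1), 1/[(z-\gamma_1)(z-\gamma_2)], \ldots$, and applies Gram-Schmidt. Because the inner product of a discrete measure only sees values at the nodes, each function is stored as its vector of node values.

```python
import cmath

# Hypothetical example data (assumptions, not taken from the text):
# a discrete positive measure on the unit circle and three finite poles
# placed outside the closed unit disk.
N = 64
nodes = [cmath.exp(2j * cmath.pi * j / N) for j in range(N)]
weights = [(2.0 + t.real) / N for t in nodes]  # all weights are positive
poles = [2.0 + 0j, -3.0j, 1.5 + 1.5j]

def inner(u, v):
    """Inner product <u, v> = sum_j w_j u(t_j) conj(v(t_j)) on node values."""
    return sum(w * a * b.conjugate() for w, a, b in zip(weights, u, v))

def basis_values(k):
    """Node values of 1 / prod_{i <= k} (z - gamma_i); k = 0 gives the constant 1."""
    vals = []
    for t in nodes:
        denom = 1.0 + 0j
        for gamma in poles[:k]:
            denom *= t - gamma
        vals.append(1.0 / denom)
    return vals

def orthogonalize(n):
    """Modified Gram-Schmidt on the nested rational basis of degree 0..n."""
    phis = []
    for k in range(n + 1):
        v = basis_values(k)
        for p in phis:
            c = inner(v, p) / inner(p, p)
            v = [a - c * b for a, b in zip(v, p)]
        phis.append(v)
    return phis

phis = orthogonalize(len(poles))
# Largest inner product between two distinct functions in the sequence.
max_defect = max(
    abs(inner(phis[i], phis[j])) for i in range(len(phis)) for j in range(i)
)
```

Here `phis[n]` plays the role of $\phi_n$: it lies in the span of the first $n+1$ basis functions and `max_defect` measures how far the computed sequence is from being mutually orthogonal, which should be at rounding level.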
There are two possible generalizations, depending on whether one generalizes the polynomials orthogonal on the real line or the polynomials orthogonal on the unit circle. The difference lies in the location of the finite poles that are introduced in the rational case. In the case of the circle, the pole at infinity is outside the closed unit disk. There, it is the most natural choice to introduce finite poles that are all outside the closed unit disk. This guarantees that the rational functions are analytic at least inside the unit disk, which allows us to transfer many properties from the polynomial to the rational case. Moreover, if the poles are not on the circle, then we avoid difficulties that could arise from singularities of the integrand in the support of the measure. If the support of the measure is, however, contained in the real line, then the pole at infinity may be in the (closure of) the support of the measure. The most natural generalization is here to choose finite poles that are on the real line itself, that is, possibly in the support of the measure for which orthogonality is considered. Of course one can by a Cayley transform map the unit circle to the (extended) real line and the open unit disk to the upper half plane. Since this transform maps rational functions to rational functions, it makes sense to consider the analog of the orthogonal rational functions on the unit circle with poles outside the closed disk, which are the orthogonal rational functions orthogonal on the real line with poles in the lower half plane. Conversely, one can consider the orthogonal rational functions with poles on the unit circle and that are orthogonal with respect to a measure supported on the unit circle as the analog of orthogonal rational functions on the real line with poles on the real line.
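The mapping properties used in this argument are easy to check numerically. The sketch below assumes one particular Cayley map, $z \mapsto i(1-z)/(1+z)$, which sends $0$ to $i$ and $-1$ to $\infty$; the text does not fix a formula at this point, so this choice and the sample points are assumptions made for illustration only.

```python
import cmath

def cayley(z):
    """An assumed Cayley map: i(1 - z)/(1 + z), sending 0 to i and -1 to infinity."""
    return 1j * (1 - z) / (1 + z)

# Hypothetical sample points on the circle, inside the disk, and outside it.
circle_points = [cmath.exp(1j * th) for th in (0.3, 1.1, 2.0, -2.5)]
disk_points = [0.0, 0.5, -0.2 + 0.6j]
outside_points = [2.0, -1.5j, 1.2 + 1.2j]

# The unit circle goes to the real line (imaginary part zero up to rounding).
on_real_line = all(abs(cayley(t).imag) < 1e-12 for t in circle_points)
# The open disk goes to the upper half plane.
in_upper_half = all(cayley(z).imag > 0 for z in disk_points)
# Points outside the closed disk (candidate poles) go to the lower half plane.
in_lower_half = all(cayley(z).imag < 0 for z in outside_points)
```

For this map the imaginary part of the image is $(1-|z|^2)/|1+z|^2$, which explains all three checks at once: it is positive inside the disk, zero on the circle, and negative outside.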
The cases of the real line and the unit circle, which are linked by such a Cayley transform, are essentially the same and can be easily treated in parallel, which we shall do in this monograph. The distinction between the case where the poles are outside or inside the support of the measure is, however, substantial. We have chosen to give a detailed and extensive treatment in several chapters of the case where the poles are outside the support. The case where the poles are in the support (which we call the boundary case) is treated more compactly in a separate chapter. This brief sketch should have made clear in what sense these orthogonal rational functions generalize orthogonal polynomials. Now, what are the results of the polynomial case that have been generalized to the rational case? As we suggested above, we do not go into the details of all kinds of special orthogonal polynomials by imposing a specific measure or weight function. We do keep generality by considering arbitrary measures, but we restrict ourselves to measures supported on the real line or the unit circle. In that sense we are not
as general as the "general orthogonal polynomials" in the book of Stahl and Totik [193]. Orthogonal polynomials have now been studied so intensely that many different and many detailed results are available. It would be impossible to give in one volume the generalizations of all these to the rational case. We have opted for an introduction to the topic and we give only generalizations of classical interpolation problems of Schur and Caratheodory type, of quadrature formulas, and of moment problems. There is a certain logic in this because interpolation problems are intimately related to quadrature formulas and these quadrature formulas are an essential tool for solving the moment problems. These connections were made clear and were used explicitly in the book by Akhiezer [2], which treats "the classical moment problem." To some extent we have followed a similar path for the rational case. First, we derive a recurrence relation for the orthogonal rational functions. In our setup, this is mainly based on a Christoffel-Darboux type relation. In the boundary case, this recurrence generalizes the three-term recurrence relation of orthogonal polynomials; in the case where the poles are outside the support of the measure, this is a generalization of the Szego recurrence relation. To describe all the solutions of the recurrence relation, a second, independent solution is considered, which is given by the sequence of associated functions of the second kind. These functions of the second kind appear as numerators and the orthogonal rational functions as denominators in the approximants of a continued fraction that is associated with the recurrence relation. The continued fraction converges to the Riesz-Herglotz-Nevanlinna transform of the measure and the approximants interpolate this function in Hermite sense. This is the interpolation problem that we alluded to. 
It is directly related to the algorithm of Nevanlinna-Pick, which is a (rational) multipoint generalization of the Schur algorithm that relates to the polynomial case. A combination of the orthogonal rational functions and the associated functions of the second kind give another solution of the recurrence relation called the quasi-orthogonal or para-orthogonal functions in the boundary or nonboundary case respectively. It can be arranged that these functions have simple zeros that are on the real line or on the unit circle. These zeros are used as the nodes of quadrature formulas. In the nonboundary case, such n-point quadrature formulas are optimal in the sense that corresponding weights can be chosen in such a way that the quadrature formulas have the largest possible domain of validity. For the boundary case, these quadrature formulas are "nearly optimal" in general. Their domain of validity has a dimension one less than the optimal
one. However, when the zeros of the orthogonal rational functions themselves happen to be a good choice, then the quadrature formula is really optimal. In the polynomial case, this corresponds to Gaussian quadrature formulas on the real line or Szego quadrature formulas for the circle. These quadrature formulas are an essential tool in the construction of a solution of the moment problems. These moment problems are rational generalizations of the polynomial case that correspond to the Hamburger moment problem in the case of the real line and the trigonometric moment problem in the case of the circle. Two other aspects are important or are at least closely connected to the solution of these moment problems. First, there is the well-known fact that, as $n$ goes to $\infty$, the polynomial spaces $\mathcal{P}_n$ become dense in the Hardy spaces $H_p$. A similar result will only hold for the spaces $\mathcal{L}_n$ under certain conditions for the poles. Second, there is the general question of asymptotics for the orthogonal rational functions, for the interpolants, for the quadrature formulas, etc., when $n$ tends to infinity. Such results were extensively studied in the polynomial case. We shall devote a large chapter to their generalizations.

After this general introduction, let us have a look at the roots of this theory, at the applications in which it was used, and let us have a closer look at the technical difficulties that arise by lifting the polynomial to the rational case. Since the central theme up to Chapter 10 is the generalization of results related to Szego polynomials, orthogonal on the unit circle, let us take these as a starting point. The particularly rich and fascinating theory of polynomials orthogonal on the unit circle needs no advertising. These polynomials are named after Szego since his pioneering work on them. His book on orthogonal polynomials [196] was first published in 1939, but the ideas were already published in several papers in the 1920s.
The Szego polynomials were studied by several authors. For example, they play an important role in books by Geronimus [94], Freud [87], Grenander and Szego [102], and several more recent books on orthogonal polynomials. It is also in Szego's book that the notion of a reproducing kernel is clearly introduced. Later on these became a studied object of their own. The book by Meschkowski [148] is a classic. In our exposition, reproducing kernels take a rather important place and the Christoffel-Darboux summation formula, which expresses the nth reproducing kernel in terms of the nth or (n + l)st orthogonal polynomials (in our case rational functions), is used again and again in many places throughout our monograph. Szego's interest in polynomials orthogonal on the unit circle was inspired by the investigation of the eigenvalue distribution of Toeplitz forms, an even older subject related to coefficient problems as initiated by Caratheodory [48, 49] and Caratheodory and Fejer [50] and further discussed by F. Riesz [184, 185],
Gronwall [103], Schur [189, 190, 191], Hamel [107], and many others. This problem is closely related to the trigonometric moment problem. Indeed, the matrices of the Toeplitz forms are Gram matrices because for the inner product

$$\langle f, g \rangle = \int f(t)\,\overline{g(t)}\,d\mu(t),$$

where $\mu$ is a measure on the unit circle, we have $\langle z^k, z^l \rangle = \int t^{k-l}\,d\mu(t) = \mu_{l-k}$, where $\mu_k = \int t^{-k}\,d\mu(t)$ are the moments of the measure $\mu$. Thus if the measure $\mu$ is positive, then the Gram matrix should be positive definite, and because it is Hermitian, this means that its eigenvalues should be positive. The converse is also true: Given a Hermitian Toeplitz matrix, then there will be a positive measure for which the entries of this matrix are the moments if the matrix is positive definite. Another way of putting this is to say that the function $\Omega_\mu(z) = \sum_{k=0}^{\infty} \mu_k z^k$ is a Caratheodory function. This means that it is analytic in the unit disk and it has a positive real part there. Since this is an infinite-dimensional problem, it can not be checked by a finite number of computations. A computational procedure consists basically in approximating $\Omega_\mu$ by some rational $\Omega_n$ that fits the first $n + 1$ Fourier coefficients of $\Omega_\mu$ and letting $n$ range over the natural numbers $n = 0, 1, 2, \ldots$. These $\Omega_n$ turn out to be related to $\psi_n/\phi_n$, where $\phi_n$ is the $n$th Szego polynomial and $\psi_n$ is the associated polynomial of the second kind. Both of these are solutions of the recurrence relation for the orthogonal polynomials and they appear as successive approximants in continued fractions. Thus checking the positivity of the infinite Toeplitz matrix comes down to checking the positivity of all its leading principal submatrices, or, equivalently, checking whether $\Omega_\mu$ is a Caratheodory function reduces to checking that $\Omega_n$ are Caratheodory functions for all $n = 0, 1, \ldots$.

The Caratheodory coefficient problem is in a sense an inverse of this: Given the coefficients $\mu_0, \mu_1, \ldots, \mu_n$, can these be extended to a sequence such that $\sum_{k=0}^{\infty} \mu_k z^k$ is a Caratheodory function, or, equivalently, such that the corresponding Toeplitz matrix with entries $\mu_{i-j}$ ($\mu_{-k} = \bar{\mu}_k$) is positive definite, or, equivalently, such that there is a positive measure $\mu$ on the unit circle for which these $\mu_k$ are the moments $\mu_k = \int t^{-k}\,d\mu(t)$, $k = 0, 1, 2, \ldots$? Schur solved an equivalent problem [189, 190, 191]. By mapping the right half plane to the unit disk, functions with a positive real part are mapped to functions bounded by 1 or Schur functions. So the problem is reduced to checking whether a given function is a Schur function. This can be done recursively and the papers by Schur contain a continued fraction-like algorithm to actually check if the given coefficients (moments) correspond to a bounded analytic function. The algorithm produces some coefficients (Schur coefficients) that turned out later to be exactly the
complex conjugates of the coefficients that appeared in the recurrence relation for the orthogonal polynomials as formulated by Szego. It was Pick who first considered an interpolation problem as a generalization of the coefficient problems of Caratheodory [176, 177, 178]. Nevanlinna was not aware of Pick's work when he developed the same theory in a long memoir in 1919 [153]. See also his later work [154, 155, 156]. Nevanlinna also gave an algorithm that directly generalized the algorithm given by Schur. Since then, these problems and a myriad of generalizations played an important role in several books, such as in Akhiezer [2], Krein and Nudel'man [131], and Walsh [200], and more recently in Donoghue [71], Garnett [92], Rosenblum and Rovnyak [186], Ball, Gohberg, and Rodman [22], Bakonyi and Constantinescu [19], etc. Some of the more recent interest in this subject was stimulated by the work of Adamyan, Arov, and Krein [4,5] and most of all by their fundamental papers [6, 7]. We should also mention Sarason's paper [188], which had great influence on some developments made in later publications. These results relate the theory to operator theoretic methods for Hankel and Toeplitz operators. Besides this, there is also a long history where the same theory is approached from several application fields. Grenander and Szego themselves discussed the application in the theory of probability and statistics [102]. But one finds also the applications in the prediction theory of stationary stochastic processes in work by Kolmogorov [129] and Wiener [201]. Some benchmark papers on this topic are collected in [125]. The book by Wiener contained a reprint from Levinson's celebrated paper [134], which is in fact a reformulation of the Szego recursions. 
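The Szego recursion mentioned in this passage can be sketched for a small example. The code below is an illustration under stated assumptions: the discrete measure (`thetas`, `masses`) is hypothetical, and the normalization $\Phi_{n+1}(z) = z\Phi_n(z) - c_n\Phi_n^*(z)$ for monic polynomials is one common convention that need not match the sign and conjugation conventions of the monograph. The coefficient $c_n$ is fixed by requiring orthogonality to the constants; orthogonality to $z, \ldots, z^n$ then follows from the structure of the recursion.

```python
import cmath

# Hypothetical discrete positive measure on the unit circle (an assumption).
thetas = [0.4, 1.3, 2.2, 3.9, 5.1, 5.8]
masses = [0.5, 1.0, 0.25, 2.0, 0.75, 1.5]
nodes = [cmath.exp(1j * th) for th in thetas]

def evaluate(coeffs, z):
    """Horner evaluation; coeffs[j] multiplies z**j."""
    result = 0j
    for c in reversed(coeffs):
        result = result * z + c
    return result

def inner(p, q):
    """<p, q> = sum_j m_j p(t_j) conj(q(t_j)) for polynomials given by coefficients."""
    return sum(
        m * evaluate(p, t) * evaluate(q, t).conjugate()
        for m, t in zip(masses, nodes)
    )

def reverse(p):
    """Phi^*(z) = z^n conj(Phi(1/conj(z))): reversed and conjugated coefficients."""
    return [c.conjugate() for c in reversed(p)]

def szego_step(phi):
    """One recursion step: Phi_{n+1} = z Phi_n - c Phi_n^*, with c chosen so that
    Phi_{n+1} is orthogonal to the constant polynomial 1."""
    shifted = [0j] + phi        # coefficients of z * Phi_n(z)
    star = reverse(phi) + [0j]  # Phi_n^*(z), padded to the same length
    one = [1 + 0j]
    c = inner(shifted, one) / inner(star, one)
    return [a - c * b for a, b in zip(shifted, star)], c

phi = [1 + 0j]  # Phi_0 = 1
coefficients = []
for _ in range(3):
    phi, c = szego_step(phi)
    coefficients.append(c)
```

After three steps `phi` holds the coefficients of a monic cubic that is orthogonal to $1$, $z$, and $z^2$ with respect to the chosen measure, and every entry of `coefficients` has modulus less than one, as expected when the measure has more support points than the degree.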
Other engineering applications include network theory (see, e.g., Belevitch [23] and Youla and Saito [204]), spectral estimation (see Papoulis [174] for an excellent survey), maximum entropy analysis as formulated by Burg [47] (see the survey paper [133]), transmission lines and scattering theory as studied by Arov*, Redheffer [183], and Dewilde and Dym [60, 62], digital filtering (see the survey of Kailath [124]), and speech processing (see [144] or the tutorial paper by Makhoul [143]). It is from these engineering applications that methods for inverting and factorizing Toeplitz or related matrices also emerged (see [190]), and people are now even using these ideas for designing systolic arrays for the solution of a number of linear algebra problems [24]. The linear algebra literature in this connection has a complete history of its own, which we shall not mention here. Most of it was devoted to Toeplitz and Hankel matrices or related matrices that appear in connection with a theory of Schur-Szego. However, only recently have people started looking at matrices that are related to interpolation problems. We could go on like this and probably never be complete in summing up all the application fields and this is without ever touching all the related generalizations of this theory that were obtained recently or the analog theory that has been developed for the complex half plane instead of the unit circle or the continuous analog of Wiener-Hopf factorization. We just stop here by referring to a survey paper on the applications of Nevanlinna-Pick theory by Delsarte, Genin, and Kamp [57].

* D. Z. Arov. Darlington's method in the study of dissipative systems. Dokl. Akad. Nauk SSSR, 201 (1971), 559-562. (In Russian). English translation: Soviet Physics-Doklady 16 (1972), 954-956.

In all these papers on theory and applications, the approach of the Nevanlinna-Pick theory from the point of view of the orthogonal functions has not been fully put forward. In this monograph, we try to give an approach to the theory that is an immediate generalization of the theory of Szego for orthogonal polynomials. This theory is related to the interpolation theory of Pick and Nevanlinna like the Szego theory was related to the Schur and Caratheodory-Fejer coefficient problems. By a Cayley transform, the complex unit circle is mapped to the extended real line and its interior to the upper half plane. Thus there is a natural analog of this theory for the real line. There are, however, technical differences that make the transformation not always trivial. Therefore we shall treat both cases in parallel.

A central role in the theory is played by moment problems. We give some details because it illustrates very well the difference between the case of the circle and the case of the line. For the unit circle, we have the classical trigonometric moment problem. In that case one defines the moments $\mu_k$ for $k \in \mathbb{Z}$ as

$$\mu_k = \int t^{-k}\,d\mu(t), \qquad k \in \mathbb{Z}, \quad t = e^{i\theta}.$$
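For a measure with finitely many mass points these moments, and the Toeplitz Gram matrix of $1, z, \ldots, z^n$ with entries $\langle z^k, z^l \rangle = \mu_{l-k}$, can be computed directly. The sketch below uses a hypothetical discrete measure (`thetas`, `masses`) and tests positive definiteness through the leading principal minors; the helper `leading_minors` is an illustrative name, not something taken from the text.

```python
import cmath

# Hypothetical discrete positive measure on the unit circle (an assumption).
thetas = [0.4, 1.3, 2.2, 3.9, 5.1]
masses = [0.5, 1.0, 0.25, 2.0, 0.75]

def moment(k):
    """mu_k = integral of t^(-k) d mu(t), t = exp(i theta), for the measure above."""
    return sum(m * cmath.exp(-1j * k * th) for m, th in zip(masses, thetas))

n = 3
# Gram matrix of 1, z, ..., z^n: entry (k, l) is <z^k, z^l> = mu_{l-k},
# so it is constant along diagonals (Toeplitz).
G = [[moment(l - k) for l in range(n + 1)] for k in range(n + 1)]

def leading_minors(A):
    """Leading principal minors via Gaussian elimination without pivoting."""
    M = [row[:] for row in A]
    size = len(M)
    minors = []
    det = 1.0 + 0j
    for i in range(size):
        pivot = M[i][i]
        det *= pivot
        minors.append(det)
        for r in range(i + 1, size):
            factor = M[r][i] / pivot
            for c in range(i, size):
                M[r][c] -= factor * M[i][c]
    return minors

minors = leading_minors(G)
is_hermitian = all(
    abs(G[k][l] - G[l][k].conjugate()) < 1e-12
    for k in range(n + 1)
    for l in range(n + 1)
)
is_positive_definite = all(d.real > 0 and abs(d.imag) < 1e-9 for d in minors)
```

With five mass points and a $4 \times 4$ matrix, both flags come out true, in line with the statement that a positive measure yields a Hermitian positive definite Toeplitz matrix. The symmetry $\mu_{-k} = \bar{\mu}_k$ can be seen by comparing `moment(-2)` with `moment(2).conjugate()`.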
By giving $\mu_{-k}$, $k = 0, 1, 2, \ldots$, one has defined a linear functional on the set of polynomials and the problem is to find the measure $\mu$ on the unit circle that represents this functional. Note that when $\mu$ is a real positive measure, then $\bar{\mu}_k = \mu_{-k}$, so if the Toeplitz moment matrix $G = [\mu_{k-l}]$ is Hermitian and positive definite, it is completely characterized by half of these moments, namely by the moments $\mu_k$, $k = 0, 1, \ldots$ with a nonnegative index. Thus the previous moment problem where only the moments $\mu_{-k}$, $k = 0, 1, \ldots$, or equivalently only the moments $\mu_k$ for $k = 0, 1, 2, \ldots$, are defined, is equivalent to the moment problem where all the $\mu_k$, $k \in \mathbb{Z}$ are prescribed. The working instrument for solving this trigonometric moment problem is the set of orthogonal Szego polynomials obtained by orthogonalizing the basis $\{1, t, t^2, \ldots\}$. The orthogonality is with respect to the inner product given by $\langle P, Q \rangle = M\{P(z)\,\overline{Q(1/\bar{z})}\}$, where the linear functional $M$ is defined by $M\{t^k\} = \mu_{-k}$, $k = 0, 1, 2, \ldots$. Note that the inner product is well defined for arbitrary polynomials $P$ and $Q$ because the argument $P(z)\,\overline{Q(1/\bar{z})}$ is a Laurent polynomial and, as we remarked
a moment ago, if the linear functional is defined on the set of polynomials, then it is automatically defined on the set of Laurent polynomials. For the rational generalization of this problem we consider a sequence of points ai, « 2 , . . . all inside the unit disk and define the moments (Ok(t)7T?(t)
where for k > 1 k
k
= z~
1=1
i=\
while COQ = TVQ = 1. The rational generalization of the trigonometric moment problem is to define a linear functional on the space CQQ = span(a)^"1, (D[X , oo^1, ...} by prescribing the moments /Xfco> k = 0, 1, 2, Note that if all a, = 0, then • / •
= ei0.
Thus, in that case the moment problem reduces to the classical trigonometric moment problem. Moreover, if G = [fjiki] is Hermitian positive definite, then it is obvious that
$d\mu(t)$

Thus, by partial fraction decomposition, it is seen that the knowledge of $\mu_{k0}$, $k = 0, 1, 2, \ldots$ also gives the moments $\mu_{kl}$, $k, l = 0, 1, \ldots$. Again, it is sufficient to prescribe only the moments $\mu_{k0}$, $k = 0, 1, \ldots$. To solve this rational generalization, the role of the orthogonal polynomials in the classical trigonometric moment problem is taken over by orthogonal functions obtained by orthogonalization of the basis $\{1, \omega_1^{-1}, \omega_2^{-1}, \ldots\}$. These are the orthogonal rational functions that are studied in this monograph. This also explains why it is called a rational generalization. Note that at any point in the discussion we can replace all the interpolation points $\alpha_k$ by zero and recover the corresponding result of the polynomial case. In this respect it is a natural generalization of the Szego theory. For the unit circle, both in the polynomial and the rational case, we are in a convenient situation where it is sufficient to have a functional defined on $\mathcal{L}_\infty = \operatorname{span}\{1, \omega_1^{-1}, \omega_2^{-1}, \ldots\}$, and we immediately have defined the functional on
a larger set, which allows us to define orthogonality. Indeed, in the rational case we should be able to evaluate integrals of the form $\int f(t)\overline{g(t)}\,d\mu(t)$, $t = e^{i\theta}$, for $f, g \in \mathcal{L}_\infty$. This requires the existence of moments $\mu_{kl}$, which is guaranteed because the $\mu_{k0}$ also define the $\mu_{kl}$.

Let us now consider the Hamburger moment problem. Here we define the moments

$$\mu_k = \int t^k\,d\mu(t), \qquad t \in \mathbb{R}, \quad k \in \mathbb{Z}.$$
Again, prescribing the $\mu_k$ for $k = 0, 1, 2, \ldots$ defines a linear functional on the set of polynomials and the problem is to find a positive measure $\mu$ on the real line such that it matches these prescribed moments for $k = 0, 1, 2, \ldots$. The instruments in this case are again the orthogonal polynomials obtained by orthogonalization of the basis $\{1, t, t^2, \ldots\}$. However, the inner product now generates a moment matrix $G = [\mu_{k+l}]$, which is a Hankel matrix. It is no longer true that the $\mu_k$ for $k = 0, 1, \ldots$ also give the values of the moments $\mu_{-k}$, $k = 1, 2, \ldots$. The problems where the moments $\mu_k$ are prescribed for $k = 0, 1, 2, \ldots$ (the moment problem for the set of polynomials) and where all the moments $\mu_k$, $k \in \mathbb{Z}$ are given (the moment problem in the set of Laurent polynomials) are not equivalent anymore. The first problem is called the (classical) Hamburger moment problem, whereas the second is the strong Hamburger moment problem. To solve the strong Hamburger moment problem, one has to deal with orthogonal Laurent polynomials instead of the usual polynomials [119, 51]. We may note that in the classical Hamburger moment problem, there is no problem when talking about orthogonality because the inner product is defined as $\langle f, g\rangle = M\{f(x)g(x)\}$ with the linear functional defined by $M\{x^k\} = \mu_k$, $k = 0, 1, 2, \ldots$, and since the product of two polynomials is again a polynomial, it is sufficient to define the functional on the set of polynomials. A similar observation concerning the Laurent polynomials holds for the strong Hamburger moment problem.

The rational generalization of the Hamburger moment problem is to consider the moments

$$\mu_{k0} = \int \frac{d\mu(t)}{\omega_k(t)}, \qquad\text{where}\qquad \omega_k(z) = z^{-k}\prod_{i=1}^{k}(1 - z/\alpha_i), \quad k \ge 1, \quad \omega_0 = 1,$$
and the $\alpha_i$ are points on the (extended) real line. Note that when all $\alpha_i = \infty$, this reduces to the classical Hamburger moment problem. The orthogonal polynomials are replaced by the orthogonal rational functions obtained by orthogonalization of $\{1, \omega_1^{-1}, \omega_2^{-1}, \ldots\}$. There is a considerable complication that did not occur in any of the previous situations. Indeed, the product of two rational functions from $\mathcal{L}_\infty = \operatorname{span}\{1, \omega_1^{-1}, \omega_2^{-1}, \ldots\}$ is not in $\mathcal{L}_\infty$ anymore (unless there is a special choice of the $\alpha_i$). Thus, to be able to consider orthogonality here, we should also have a functional defined on $\mathcal{L}_\infty \cdot \mathcal{L}_\infty$ and this requires moments that are in general not obtainable from the $\mu_{k0}$ alone. Therefore, we need moments of the form

$$\int \frac{d\mu(t)}{\omega_k(t)\,\omega_l(t)}.$$
Note that when there is only a finite number of different $\alpha_i$ that are cyclically repeated (the so-called cyclic case), then it does hold that the product of rational functions in $\mathcal{L}_\infty$ is again in $\mathcal{L}_\infty$ and the previously sketched difficulties do not occur. For the strong Hamburger moment problem, the rational equivalent is to prescribe the moments

$$\mu_{kl} = \int \frac{d\mu(t)}{\omega_k(t)\,\pi_l(t)}, \qquad k, l = 0, 1, \ldots,$$

with $\omega_k$ as before and
1=1
This is possible, if we are willing to introduce a more complicated notation, to formulate this as a problem where the special case of the strong Hamburger moment problem appears by choosing alternately $\alpha_{2k} = 0$ and $\alpha_{2k+1} = \infty$. The problem of orthogonality for the functions in these spaces is even more difficult since this requires even more complicated moments to be defined. In this monograph, we shall not consider this rational generalization of the strong Hamburger moment problem.

Even for the classical Hamburger moment problem, the rational generalization gives yet other complications with regard to the point at $\infty$ that do not occur in the polynomial case. Indeed, if $\mu$ is a solution of a polynomial moment problem, then, since all polynomials, except the constants, tend to $\infty$ at $\infty$, there cannot be a mass at $\infty$ if the moments are finite. However, in the rational case, all the rational functions can have finite values at $\infty$ and hence a mass at
infinity is perfectly possible for a solution of the moment problem. Therefore, the integrals should be taken over the extended real line $\overline{\mathbb{R}} = \mathbb{R} \cup \{\infty\}$, rather than over $\mathbb{R}$. Note also that the situation of the real line and the complex unit circle are considerably different in the respect that the natural generalization of the polynomial case requires the points $\alpha_i$ to be inside the disk (to be able to recover the polynomial case by setting them all equal to 0) whereas for the real line case, these points are all on the extended real line (to be able to set them all equal to $\infty$). Also, integrals of the form $\int f(t)\,d\mu(t)$, where $f$ has poles at $\alpha_1, \alpha_2, \ldots$, are definitely more complicated when the $\alpha_i$ are in the support of the measure than when they are not.

Rational generalizations of the Hamburger moment problem (called extended moment problems or multipoint moment problems) were considered in Refs. [157], [158], [160], [164], and [165] where only a finite number of different $\alpha_i$ is used. See also Refs. [137] and [138]. Orthogonal rational functions occur first in the work of M. M. Djrbashian [66, 65, 67, 68, 69, 70] (see also Ref. [150]) and in Ref. [30].

This brief discussion of the moment problem shows that there is a distinction to be made between the case of the real line and the case of the unit circle. In fact when the Cayley transform is put into play, there are four cases to be considered:

1. supp($\mu$) is in $\mathbb{T}$, all $\alpha_i \in \mathbb{D}$.
2. supp($\mu$) is in $\mathbb{T}$, all $\alpha_i \in \mathbb{T}$.
3. supp($\mu$) is in $\overline{\mathbb{R}}$, all $\alpha_i \in \mathbb{U}$.
4. supp($\mu$) is in $\overline{\mathbb{R}}$, all $\alpha_i \in \overline{\mathbb{R}}$.
Cases 1 and 3 are related by a Cayley transform and so are 2 and 4. Chapters 2-10 will be devoted to cases 1 and 3. Chapter 11 discusses similar results for cases 2 and 4, to which we shall refer as the boundary case. It is perfectly possible to combine the two cases, as for example in Ref. [62], where the $\alpha_i$ are allowed to be inside the closed unit disk, thus combining cases 1 and 2.

Let us conclude this introduction with an outline of the text. Since the previous more technical introduction introduced some terminology, we can give it in some detail. In Chapter 1, we start with preliminaries from complex analysis and some properties of reproducing kernels. It also includes some generalities on positive real functions, that is, analytic functions in the unit disk with positive real part, also known as Caratheodory functions since they appeared in the Caratheodory-Fejer problem. We give also the relation with the analytic functions bounded
by 1. The latter are also known as Schur functions because it was Schur who used these functions in his algorithm to solve the Caratheodory-Fejer problem. A last important tool in our analysis are the J-contractive matrices studied by Potapov. They are introduced in Section 1.5. It is then time to be more specific and we introduce the fundamental spaces of rational functions that will generalize the spaces of polynomials as they appear in the Szego theory. They are defined and discussed in Chapter 2 in some detail. Rather than starting with the recurrence for the orthogonal functions themselves, it will turn out to be easier to start with the reproducing kernels, which we do in Chapter 3. This chapter also contains the important Christoffel-Darboux relations. It will become clear in Chapter 4, when we give the recurrence for the orthogonal functions, that, compared with the reproducing kernels, they are somewhat less simple to handle, since the recurrence can not be easily described in terms of a J-unitary recursion. It is however possible to get some recurrence that generalizes the Szego relations, and one can also define functions of the second kind (Section 4.2). As in Szego's theory, these functions of the second kind appear as another independent solution of the recurrence for the orthogonal functions, exactly like the Szego polynomials of the second kind. We also mention in this chapter how the coupled Szego type recursions can be transformed into three-term recurrence relations, which are associated with continued fractions. The relation to quadrature on the unit circle is given in Chapter 5. It gives quadrature formulas that, like the Gaussian quadrature on the real line, have a maximal domain of validity. However, the zeros of the orthogonal functions on the circle are all inside the unit disk and they are thus not well suited to be used as abscissas. 
One can construct para-orthogonal functions that do have unimodular zeros and are missing the usual orthogonality conditions just by not being orthogonal to the constants. This deficient orthogonality causes the dimension of the domain of validity for the quadrature formulas to be one lower than the dimension in the classical case of Gaussian formulas. Also, the domain of validity is not a polynomial space but a space of rational functions. Of course, similar results hold for the half plane and they are formulated in parallel. In Chapter 6, we relate several results to rational interpolation problems in the class of Caratheodory and Schur functions. We also give the algorithm of Nevanlinna-Pick, which is related to the recurrence for the reproducing kernels, and also a similar algorithm related to the recurrence of the orthogonal functions. Some density problems of the functions in $\mathcal{L}_\infty$, that is, the completeness problem of the rational basis functions of $\mathcal{L}_\infty$ in the spaces $L_p$ and $H_p$ or in $L_2(\mu)$ and $H_2(\mu)$, are discussed in Chapter 7. This forms an introduction to the chapter on convergence but is also used in the Favard theorem formulated in the next chapter.

Favard theorems for the spaces of rational functions are possible, exactly as in the polynomial case. In the polynomial case, these theorems state that if one has polynomials satisfying a three-term recurrence relation, then they are orthogonal with respect to some measure. The same holds for the orthogonal rational functions where we may also replace the three-term recurrence by a recurrence as discussed in Chapters 3 and 4. We give constructive proofs for such theorems. Also, the kernels satisfy a recurrence relation and one may be interested in a Favard type theorem for the kernels, that is, if some kernel functions satisfy a typical recurrence relation, will they be reproducing kernels for nested spaces with respect to some inner product? Even in the polynomial case this problem has not been solved. We do not succeed in finding an easy characterization of the type of recurrence that will guarantee this Favard type result, but we give some indications of what can be obtained.

Chapter 9 starts by giving a generalization of the Szego problem, which can be solved by a limiting approximation process in our rational function spaces. This is done in Section 9.1. We could formulate it as finding the projection of $z^{-1}$ onto the space $H_2(\mu)$, that is, the space of polynomials closed in the $L_2(\mu)$ metric. A crucial fact will then be to find out when the space $H_2(\mu)$ is not only spanned by the polynomials, which is the original Szego approach, but also when it is spanned by the rational functions under certain conditions. Some further convergence results are discussed in Section 9.2. We give asymptotics for $\phi_n^*$ in Section 9.3. We have to require that the points $\alpha_i$ stay away from the boundary.
However, such results can be obtained under weaker conditions, for which we introduce in Section 9.5 the orthogonal polynomials with respect to a varying measure as studied by Lopez [140], Stahl and Totik [193], and others. The convergence results are given in Section 9.6 and subsequent sections. There are strong and weak convergence results in norm, locally uniform convergence, and results about ratio and root asymptotics. They are obtained under various conditions that hold for the measure [Szego ($\int \log \mu' > -\infty$) or Erdos-Turan ($\mu' > 0$ a.e.)] and under several conditions on the point set $\alpha_1, \alpha_2, \ldots$: It can be assumed to be compactly included in the disk or the half plane, or its Blaschke product can be assumed to diverge, or a Carleman type of condition can be assumed. In Chapter 10 the moment problem is discussed; in particular, the classical theory of nested disks is generalized.
In Chapter 11, we treat the boundary situation where the $\alpha_i$ are on the boundary, that is, on the unit circle or on the extended real line. The complication there is that the $\alpha_i$ can be in the support of the measure. At a somewhat quicker pace, the orthogonal rational functions are introduced and their recurrence relations, Christoffel-Darboux type relations, quadrature formulas, the moment problem, interpolation properties, and convergence results are discussed. Finally, in Chapter 12, we give some of the direct applications of the theory that was developed in the previous chapters.
Preliminaries
In this chapter we shall collect the necessary preliminaries from complex analysis that we shall use frequently. Most of these results are classical and we shall give them mostly without proof. We start with some elements from Hardy functions in the disk and the half plane in Section 1.1. The important classes of analytic functions in the unit disk and half plane with positive real part are called positive real for short and are often named after Caratheodory. By a Cayley transform, they can be mapped onto the class of analytic functions of the disk or half plane, bounded by one. This is the so-called Schur class. These classes are briefly discussed in Section 1.2. Inner-outer factorizations and spectral factors are discussed in Section 1.3. The reproducing kernels are, since the work of Szego, intimately related to the theory of orthogonal polynomials and they will be even more important for the case of orthogonal rational functions. Some of their elementary properties will be recalled in Section 1.4. The 2 x 2 J-unitary and J-contractive matrix functions with entries in the Nevanlinna class will be important when we develop the recurrence relations for the kernels and the orthogonal rational functions. Some of their properties are introduced in Section 1.5.

1.1. Hardy classes

We shall be concerned with complex function theory on the unit circle and the upper half plane. The complex number field is denoted by $\mathbb{C}$. We use the following notation for the unit circle, the open unit disk, and the complement
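of the closed unit disk:

$$\mathbb{T} = \{z : |z| = 1\}, \qquad \mathbb{D} = \{z : |z| < 1\}, \qquad \mathbb{E} = \{z : |z| > 1\}.$$

The upper bar denotes complex conjugation when appropriate or closures when it concerns sets (e.g., $\overline{\mathbb{D}} = \mathbb{D} \cup \mathbb{T}$ is the closed unit disk and $\overline{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$ is the Riemann sphere). The real axis is denoted as $\mathbb{R}$. Real and imaginary parts of a complex number $z$ are indicated by $\operatorname{Re} z$ and $\operatorname{Im} z$ respectively: $z = \operatorname{Re} z + i \operatorname{Im} z$. The symbol $i$ is reserved for the unit on the imaginary axis, and the open right half plane is denoted as $\mathbb{H} = \{z : \operatorname{Re} z > 0\}$. We also need the upper and lower half plane, which we denote as

$$\mathbb{U} = \{z \in \mathbb{C} : \operatorname{Im} z > 0\} \qquad\text{and}\qquad \mathbb{L} = \{z \in \mathbb{C} : \operatorname{Im} z < 0\}.$$

Furthermore, we denote the real line as $\mathbb{R} = (-\infty, \infty)$, its closure as $\overline{\mathbb{R}} = \mathbb{R} \cup \{\infty\}$, and $\overline{\mathbb{U}} = \mathbb{U} \cup \overline{\mathbb{R}}$ and $\overline{\mathbb{C}} = \mathbb{U} \cup \mathbb{L} \cup \overline{\mathbb{R}}$. We shall give a uniform treatment for the disk case ($\mathbb{D}$) or the half plane case ($\mathbb{U}$) as much as possible. Therefore we shall use the notation

$$\mathbb{O} = \mathbb{D} \text{ or } \mathbb{U}, \qquad \partial\mathbb{O} = \mathbb{T} \text{ or } \mathbb{R}, \qquad \mathbb{O}^e = \mathbb{E} \text{ or } \mathbb{L}.$$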
We first mention that the disk $\mathbb{D}$ and the half plane $\mathbb{U}$ can be conformally mapped onto each other by the Cayley transform $\tau$. We set

$$z = \tau(w) = i\,\frac{1 - w}{1 + w}; \qquad w = \tau^{-1}(z) = \frac{i - z}{i + z}, \qquad w \in \mathbb{D}, \quad z \in \mathbb{U}.$$

The mapping $\tau$ is a one-to-one map of the unit disk $\mathbb{D}$ onto the upper half plane $\mathbb{U}$, $-1$ onto $\infty$, and $\mathbb{T} \setminus \{-1\}$ onto $\mathbb{R} = (-\infty, \infty)$. Thus the boundary $\mathbb{T}$ is mapped onto the extended real line $\overline{\mathbb{R}}$. We note the slight discrepancy in notation for our two cases: $\overline{\partial\mathbb{O}} = \partial\mathbb{O} = \mathbb{T}$ for the disk, but for the half plane $\partial\mathbb{O} = \mathbb{R}$ (not including $\infty$) whereas $\overline{\partial\mathbb{O}} = \overline{\mathbb{R}}$.

By $\mathcal{P}_n$ we mean the set of polynomials of degree at most $n$. The set of complex functions holomorphic on $X$ is denoted by $H(X)$. Let $\mu$ be a positive measure on $\mathbb{T}$, whose support is an infinite set. Sometimes it is characterized by a distribution function $\psi(t) = \int^{t} d\mu$, which has an infinite number of points of increase. If $t = e^{i\theta} \in \mathbb{T}$ is a point of discontinuity of the distribution function, then $\mu(\{t\})$ is the concentrated mass or point mass at $t$. A similar thing can be said about a measure supported on the real line or on any measure space with (positive) measure $\mu$.
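The mapping properties just listed can be checked numerically. The following is a minimal sketch, assuming the transform has the form $\tau(w) = i(1-w)/(1+w)$ with inverse $\tau^{-1}(z) = (i-z)/(i+z)$, which is the form consistent with $-1 \mapsto \infty$ and $0 \mapsto i$; the function names and sample points are hypothetical.

```python
import cmath

def tau(w):
    """Cayley transform from the unit disk to the upper half plane."""
    return 1j * (1 - w) / (1 + w)

def tau_inv(z):
    """Inverse Cayley transform from the upper half plane to the unit disk."""
    return (1j - z) / (1j + z)

# Points inside the unit disk should land in the upper half plane.
disk_points = [0, 0.5, -0.3 + 0.4j, 0.9j, -0.8]
images = [tau(w) for w in disk_points]

# Points on the unit circle (other than -1) should land on the real line.
circle_points = [cmath.exp(1j * th) for th in (0.1, 1.0, 2.5, -2.0)]
boundary_images = [tau(t) for t in circle_points]

for w, z in zip(disk_points, images):
    print(w, "->", z)
```

Composing `tau_inv` with `tau` returns each sample point up to rounding, which is a quick consistency check on the pair of formulas.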
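The metric spaces $L_p(X, \mu)$, $0 < p \le \infty$ are well known [187, Chap. 3]. In our case $X = \partial\mathbb{O}$. The normalized Lebesgue measure for $\mathbb{T}$ is denoted by $\lambda$: $d\lambda(\theta) = (2\pi)^{-1}d\theta$. For the real line $\mathbb{R}$, this becomes $d\lambda(x) = \pi^{-1}dx$. If $\mu = \lambda$, we just write $L_p(\partial\mathbb{O})$ instead of $L_p(\partial\mathbb{O}, \lambda)$. Also, on other occasions, we shall drop $\mu$ from the notation when $\mu = \lambda$. When we want to stress the difference between the cases $\partial\mathbb{O} = \mathbb{T}$ and $\partial\mathbb{O} = \mathbb{R}$, we write this explicitly (e.g., $L_p(\mathbb{T})$ or $L_p(\mathbb{R})$), but when something is true for both, regardless of whether $\partial\mathbb{O} = \mathbb{T}$ or $\mathbb{R}$, we also drop $\partial\mathbb{O}$ from the notation. For example, it is well known that $L_p(\mu)$ are complete metric spaces for $1 \le p \le \infty$ and that $L_2(\mu)$ is a Hilbert space. The inner product in $L_2(\mu)$ is denoted by

$$\langle f, g\rangle_\mu = \int_{t \in \mathbb{T}} f(t)\overline{g(t)}\,d\mu(t) \qquad\text{or}\qquad \langle f, g\rangle_\mu = \int_{x \in \mathbb{R}} f(x)\overline{g(x)}\,d\mu(x).$$

Depending on the case, our integrals will be over $\mathbb{T}$ or $\mathbb{R}$ in some form or another and we shall not mention this explicitly. So, with an obvious abuse of notation, we shall take the freedom to write the previous integrals as

$$\langle f, g\rangle_\mu = \int f(t)\overline{g(t)}\,d\mu(t).$$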
We shall need many results from $H_p$ theory. These Hardy classes $H_p$ are the next thing we introduce. We consider the disk first. Different equivalent definitions of the Hardy classes for the disk can be given. We use the following: $H_p(\mathbb{D})$ is the set of functions $f \in H(\mathbb{D})$ for which the subharmonic functions $|f|^p$ have a harmonic majorant in $\mathbb{D}$ [92, p. 51; 76, Chap. 10]. This means that

$$\sup_{r < 1} \int |f(rt)|^p\,d\lambda(t) = \|f\|_p^p < \infty,$$
whereas $H_\infty(\mathbb{D})$ is just the set of bounded analytic functions in $\mathbb{D}$. This definition of $H_p$ classes is conformally invariant. Conformally invariant means that if we replace in that definition $\mathbb{D}$ by $\mathbb{U}$, we get some corresponding $H_p$ classes for the upper half plane. We shall assume that $H_p(\mathbb{U})$ is defined as such. Thus $H_p(\mathbb{U})$ is the set of analytic functions in the upper half plane such that for $0 < p < \infty$

$$\sup_{y > 0} \int_{-\infty}^{\infty} |f(x + iy)|^p\,dx = \|f\|_p^p < \infty,$$

whereas $f \in H(\mathbb{U})$ is in $H_\infty(\mathbb{U})$ if and only if $\sup_{z \in \mathbb{U}} |f(z)| = \|f\|_\infty < \infty$.
More generally, these definitions could be used to define $H_p$ classes in any plane domain or Riemann sphere, but we shall not need this. The term conformally invariant is, however, misleading, since one might think that whenever $f \in H_p(\mathbb{D})$ and $\tau$ is the Cayley transform as given above, then $f \circ \tau$ is in $H_p(\mathbb{U})$. Unfortunately, this is only true for $p = \infty$, but not for a general $p < \infty$. The classes thus obtained turn out to be too big, as one can simply check. For example, $H_1(\mathbb{D})$ contains the constant functions, while obviously $H_1(\mathbb{U})$ does not. To get most of the $H_p$ properties we need, the more restrictive definition we gave above, in terms of subharmonic functions having a harmonic majorant, should be adopted for $H_p(\mathbb{U})$, $p < \infty$. The following alternative definitions are equivalent to the previous ones. They are classical and can be found, for example, in [92, p. 51], [130, p. 159], and [76, p. 197]. They do use the Cayley transform and the $H_p(\mathbb{D})$ spaces but with a twist:

$$H_p(\mathbb{U}) = \{(z + i)^{-2/p} f(z) : f \circ \tau \in H_p(\mathbb{D})\}, \qquad 0 < p < \infty.$$
The presence of the extra factor $(z + i)^{-2/p}$ in the definition of $H_p(\mathbb{U})$ can be explained as follows. Let us temporarily use an extra argument for a measure to indicate where it is supported. For example, $\lambda(\theta, P) = (2\pi)^{-1}\theta$ is the normalized Lebesgue measure for the interval $P = (-\pi, \pi]$. We have identified this previously with the measure $\lambda(t, \mathbb{T})$ on the unit circle where $t = e^{i\theta}$ by setting $\lambda(e^{i\theta}, \mathbb{T}) = \lambda(\theta, P)$. The normalized Lebesgue measure on the line $\mathbb{R}$ is indicated as $\lambda(x, \mathbb{R}) = \pi^{-1}x$. Similar notational conventions are used for other measures. Later on we shall drop the extra indication, and it should be clear from the context which particular measure is meant. Transforming the Lebesgue measure $\lambda(\theta, P) = \lambda(t, \mathbb{T})$ from $(-\pi, \pi]$ or $\mathbb{T}$ to $\mathbb{R}$ using the relation

$$t = e^{i\theta} = \frac{i - x}{i + x}, \qquad x \in \mathbb{R}, \quad t \in \mathbb{T}, \quad \theta \in (-\pi, \pi), \tag{1.1}$$

results in [130, p. 143]

$$d\lambda(\theta, P) = d\lambda(t, \mathbb{T}) = \frac{d\theta}{2\pi} = \frac{dx}{\pi(1 + x^2)} = \frac{d\lambda(x, \mathbb{R})}{1 + x^2}.$$

Thus the conformal map of the Lebesgue measure on $\mathbb{T}$ corresponds to the measure

$$d\mathring{\lambda}(x, \mathbb{R}) = \frac{d\lambda(x, \mathbb{R})}{1 + x^2}$$
on $\mathbb{R}$. Note that $1 + x^2 = |x + i|^2$ if $x \in \mathbb{R}$, which explains the extra factor in the definition of $H_p(\mathbb{U})$. We shall use the circle to symbolize the unit disk. For the unit disk, the objects with a circle are the same as those without a circle, but for the half plane, the objects with a circle will refer to objects that are obtained by conformally transforming the corresponding objects for the disk. So we shall also use the notation $\mathring{H}_p(\mathbb{U})$ to mean

$$\mathring{H}_p(\mathbb{U}) = \{f = g \circ \tau^{-1} : g \in H_p(\mathbb{D})\} = (z + i)^{2/p} H_p(\mathbb{U}),$$

or more explicitly

$$f \in \mathring{H}_p(\mathbb{U}) \iff (z + i)^{-2/p} f(z) \in H_p(\mathbb{U}) \iff f \circ \tau \in H_p(\mathbb{D}).$$
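The change of measure behind these definitions can be verified numerically. The sketch below assumes the relation $e^{i\theta} = (i - x)/(i + x)$ of (1.1) and compares a finite-difference approximation of $(2\pi)^{-1}\,d\theta/dx$ with the density $1/(\pi(1 + x^2))$; the helper names, the step size, and the sample points are illustrative choices only.

```python
import cmath
import math

def theta_of_x(x):
    """Angle theta in (-pi, pi) with exp(i theta) = (i - x)/(i + x)."""
    return cmath.phase((1j - x) / (1j + x))

def ring_density(x):
    """Density of the transformed measure with respect to dx."""
    return 1.0 / (math.pi * (1.0 + x * x))

def pulled_back_density(x, h=1e-5):
    """Central difference approximation of (1 / (2 pi)) * d theta / dx."""
    return (theta_of_x(x + h) - theta_of_x(x - h)) / (2.0 * h) / (2.0 * math.pi)

samples = [-3.0, -0.5, 0.0, 0.7, 2.0]
errors = [abs(pulled_back_density(x) - ring_density(x)) for x in samples]

def mass_on_interval(radius, steps):
    """Midpoint rule for the integral of ring_density over [-radius, radius]."""
    h = 2.0 * radius / steps
    return h * sum(ring_density(-radius + (j + 0.5) * h) for j in range(steps))

radius = 200.0
mass = mass_on_interval(radius, 40000)
exact = 2.0 / math.pi * math.atan(radius)   # tends to 1 as radius grows
print(max(errors), mass, exact)
```

The mass on $[-R, R]$ tends to 1 as $R \to \infty$, in agreement with the fact that the transformed measure comes from the normalized Lebesgue measure on the circle.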
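This will suffice thus far for the definition of the Hardy classes. Later on, the following expressions will play an important role. We define

$$\varpi_w(z) = 1 - \overline{w}z \ \text{ for } \mathbb{D} \qquad\text{and}\qquad \varpi_w(z) = z - \overline{w} \ \text{ for } \mathbb{U}. \tag{1.2}$$

Furthermore, for a fixed sequence of points $\alpha_i$, we shall use the abbreviation $\varpi_i(z) = \varpi_{\alpha_i}(z)$. The point $\alpha_0$ is special. In the first part (Chapters 2-10), it is always defined as

$$\alpha_0 = 0 \ \text{ for } \mathbb{D} \qquad\text{and}\qquad \alpha_0 = i \ \text{ for } \mathbb{U}.$$

Hence we have

$$\varpi_0(z) = 1 - \overline{\alpha}_0 z = 1 \ \text{ for } \mathbb{D} \qquad\text{and}\qquad \varpi_0(z) = z - \overline{\alpha}_0 = z + i \ \text{ for } \mathbb{U}.$$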
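This notation will be essential in the rest of the text. Notice that for the disk

$$\varpi_z(z) = \varpi_z(z)/\varpi_0(\alpha_0) = \overline{\varpi_0(\alpha_0)}\,\varpi_z(z) = 1 - |z|^2,$$

whereas for $\mathbb{U}$

$$\varpi_z(z) = 2i \operatorname{Im} z, \qquad \varpi_z(z)/\varpi_0(\alpha_0) = \operatorname{Im} z, \qquad \overline{\varpi_0(\alpha_0)}\,\varpi_z(z) = 4 \operatorname{Im} z.$$

Hence we can characterize the sets $\mathbb{O}$, $\partial\mathbb{O}$, and $\mathbb{O}^e$ as follows:

$$\partial\mathbb{O} = \{z \in \mathbb{C} : \varpi_z(z) = 0\},$$
$$\mathbb{O} = \{z \in \mathbb{C} : \varpi_z(z)/\varpi_0(\alpha_0) > 0\} = \{z \in \mathbb{C} : \overline{\varpi_0(\alpha_0)}\,\varpi_z(z) > 0\},$$
$$\mathbb{O}^e = \{z \in \mathbb{C} : \varpi_z(z)/\varpi_0(\alpha_0) < 0\} = \{z \in \mathbb{C} : \overline{\varpi_0(\alpha_0)}\,\varpi_z(z) < 0\}.$$

With this notation, we can define $d\mathring{\lambda}$ by

$$d\mathring{\lambda}(t) = d\mathring{\lambda}(t, \partial\mathbb{O}) = \frac{d\lambda(t, \partial\mathbb{O})}{|\varpi_0(t)|^2}, \tag{1.3}$$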
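which puts $d\mathring{\lambda} = d\lambda$ for $\partial\mathbb{O} = \mathbb{T}$ and gives the previous definition with $1 + x^2$ in the denominator for $\partial\mathbb{O} = \mathbb{R}$. Similarly, we shall use $\mathring{H}_p = \mathring{H}_p(\mathbb{O})$ to mean $H_p(\mathbb{D})$ for $\mathbb{O} = \mathbb{D}$ and $\mathring{H}_p(\mathbb{U})$ when $\mathbb{O} = \mathbb{U}$. We note that $H_p(\mathbb{O}) \subset \mathring{H}_p(\mathbb{O})$ with equality for $\mathbb{O} = \mathbb{D}$. It is well known that $H_p(\mathbb{O})$ is a Banach space for $1 \le p \le \infty$. The Nevanlinna class $N(\mathbb{O})$ is the class of functions $f$ for which the subharmonic function $\log^+ |f(z)| = \max(\log |f(z)|, 0)$ has a harmonic majorant. This class $N(\mathbb{O})$ contains all spaces $H_p(\mathbb{O})$ for $0 < p \le \infty$. It can be characterized as the class of functions that are the ratio of bounded analytic functions: $f \in N$ $\ldots$ $< \infty$ iff (if and only if)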
$$f(z) = \frac{1}{2i} \int \frac{\psi(t)}{t - z}\,d\lambda(t), \qquad \psi \in L_p(\mathbb{R}), \quad z \in \mathbb{U},$$

and the integral is zero for $z \in \mathbb{L}$. The function $\psi$ is the boundary function of $f$.
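For example, if $f \in \mathring{H}_2(\mathbb{U})$, then it follows that $\tilde{f}(z) = f(z)/(z + i) \in H_2(\mathbb{U})$, so that

$$\int f(t)\,d\mathring{\lambda}(t) = \int \frac{\tilde{f}(t)}{t - i}\,d\lambda(t) = f(i). \tag{1.7}$$

Hence this is zero iff $f$ vanishes in $i$. Putting (1.6) and (1.7) together, we get: If $f \in \mathring{H}_2(\mathbb{O})$ and $f(\alpha_0) = 0$, then

$$\int f(t)\,d\mathring{\lambda}(t) = 0. \tag{1.8}$$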
In order to have a uniform notation for the disk and the half plane, we define the Cauchy kernel as ^f-^
(1.9)
Hence we have for / e Hx (©) [t,z)f(t)dX(t) =
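1.2. The classes C and B

The class of positive real functions, also known as the class of Caratheodory functions, will be introduced now, as well as the closed unit ball in $H_\infty$, which corresponds to the class of Schur functions. We shall use the notation $\mathbb{H} = \{z \in \mathbb{C} : \operatorname{Re} z > 0\}$ for the (open) right half plane. The class $\overline{\mathcal{C}}$ of Caratheodory functions is defined as follows:

$$\overline{\mathcal{C}} = \overline{\mathcal{C}}(\mathbb{O}) = \{f \in H(\mathbb{O}) : f(\mathbb{O}) \subset \overline{\mathbb{H}} = \mathbb{H} \cup i\mathbb{R}\}. \tag{1.10}$$

The class $\overline{\mathcal{B}}$ of bounded analytic functions or Schur functions is defined as

$$\overline{\mathcal{B}} = \overline{\mathcal{B}}(\mathbb{O}) = \{f \in H(\mathbb{O}) : f(\mathbb{O}) \subset \overline{\mathbb{D}} = \mathbb{D} \cup \mathbb{T}\}. \tag{1.11}$$

Since $\overline{\mathcal{B}}$ can be regarded as the closed unit ball in $H_\infty(\mathbb{O})$, we have chosen the notation $\overline{\mathcal{B}}$ for it. However, in the sequel, we shall work most of the time with slightly smaller classes:

$$\mathcal{C}(\mathbb{O}) = \{f \in H(\mathbb{O}) : f(\mathbb{O}) \subset \mathbb{H}\} \tag{1.12}$$

and

$$\mathcal{B}(\mathbb{O}) = \{f \in H(\mathbb{O}) : f(\mathbb{O}) \subset \mathbb{D}\}. \tag{1.13}$$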
Note that because $\overline{\mathcal{B}}(\mathbb{O})$ consists of analytic functions, it follows by the maximum modulus principle that a function $f \in \overline{\mathcal{B}}(\mathbb{O})$ can only take a value that is 1 in modulus when it is evaluated on the boundary $\partial\mathbb{O}$, unless it is a constant function of modulus 1. Thus $\mathcal{B}(\mathbb{O})$ merely excludes the unimodular constant functions from $\overline{\mathcal{B}}(\mathbb{O})$. Similarly, $\mathcal{C}(\mathbb{O})$ merely excludes the constant functions with values on the imaginary axis from $\overline{\mathcal{C}}(\mathbb{O})$. The classical Schwarz lemma for functions in $\mathcal{B}(\mathbb{D})$ reads

Lemma 1.2.1 (Schwarz's lemma). Suppose $f \in \mathcal{B}(\mathbb{D})$ and $f(0) = 0$. Then $|f'(0)| \le 1$ and

$$|f(z)| \le |z|, \qquad z \in \mathbb{D}. \tag{1.14}$$
Equality holds if and only if $f(z) = cz$ with $|c| = 1$.

Proof. This is a classical result and we are not going to prove it here. See, for example, Ref. [46, p. 191] or [92, p. 1]. $\Box$

Note that for $f \in \overline{\mathcal{B}}(\mathbb{D}) \setminus \mathcal{B}(\mathbb{D})$, it can never be true that $f(0) = 0$; thus the lemma actually gives a statement about functions in $\mathcal{B}(\mathbb{D})$. A Mobius transform is a linear fractional transform that conformally maps the unit circle/disk onto itself. It has the general form

$$z \mapsto \frac{az + b}{\overline{b}z + \overline{a}}, \qquad |b| < |a|, \tag{1.15}$$

or, equivalently,

$$M_\alpha : z \mapsto \eta\,\frac{z - \alpha}{1 - \overline{\alpha}z}, \qquad \alpha \in \mathbb{D}, \quad |\eta| = 1. \tag{1.16}$$
Note that $M_\alpha$ is the most general conformal map of this type that transforms $\alpha$ into the origin. The unit circle $\mathbb{T}$ is transformed into itself. The inverse transformation is given by

$$M_\alpha^{-1} : w \mapsto \overline{\eta}\,\frac{w + \eta\alpha}{1 + \overline{\eta\alpha}\,w}. \tag{1.17}$$

Clearly $M_\alpha$ itself is a function from the class $\mathcal{B}(\mathbb{D})$. Usually, since $\eta$ is not relevant, it is put equal to 1.
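The stated properties of $M_\alpha$ are easy to test numerically. The sketch below assumes the form $M_\alpha(z) = \eta(z - \alpha)/(1 - \overline{\alpha}z)$ of (1.16); the inverse is obtained by solving $w = M_\alpha(z)$ for $z$, and the function names and sample values are hypothetical.

```python
import cmath

def mobius(alpha, z, eta=1.0):
    """M_alpha(z) = eta * (z - alpha) / (1 - conj(alpha) * z)."""
    alpha = complex(alpha)
    return eta * (z - alpha) / (1 - alpha.conjugate() * z)

def mobius_inverse(alpha, w, eta=1.0):
    """Solve w = M_alpha(z) for z."""
    alpha = complex(alpha)
    return (w + eta * alpha) / (eta + alpha.conjugate() * w)

alpha = 0.4 - 0.3j
eta = cmath.exp(0.8j)               # any unimodular constant

circle = [cmath.exp(1j * th) for th in (0.0, 0.9, 2.2, 4.0, 5.5)]
disk = [0.0, 0.5j, -0.7 + 0.1j, 0.2 - 0.6j]

boundary_moduli = [abs(mobius(alpha, t, eta)) for t in circle]   # all equal to 1
interior_moduli = [abs(mobius(alpha, z, eta)) for z in disk]     # all below 1
round_trip = [abs(mobius_inverse(alpha, mobius(alpha, z, eta), eta) - z) for z in disk]

print(boundary_moduli)
print(interior_moduli)
print(abs(mobius(alpha, alpha, eta)))   # alpha is sent to the origin
```

With $\eta = 1$ the inverse reduces to $M_{-\alpha}$, which is the simplification used in the proof of Theorem 1.2.3.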
To give an invariant form of the Schwarz lemma, we recall our notation (1.2) and we set =
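w. The following property forms the basis of the Nevanlinna-Pick algorithm.

Theorem 1.2.3. Let $M_\alpha$ be a Mobius transform as defined in (1.16) and $\zeta_w$ be as defined in (1.18). $\mathcal{B} = \mathcal{B}(\mathbb{O})$.

1. Let $f \in \mathcal{B}$ and $\alpha \in \mathbb{D}$. Then $M_\alpha(f) \in \mathcal{B}$. More precisely:

$$M_\alpha(\mathcal{B}) = \mathcal{B}. \tag{1.21}$$

2. If $f \in \mathcal{B}$ and $w \in \mathbb{O}$, then

$$M_{f(w)}(f)/\zeta_w \in \mathcal{B}. \tag{1.22}$$

3. If $f \in \mathcal{B}$ and $f(w) = 0$ for some $w \in \mathbb{O}$, then $f/\zeta_w \in \mathcal{B}$.

Proof. 1. Since $M_\alpha \in \mathcal{B}$ and the composition of functions in $\mathcal{B}$ is also in $\mathcal{B}$, we find that $M_\alpha(\mathcal{B}) \subset \mathcal{B}$. Hence $\mathcal{B} \subset M_\alpha^{-1}(\mathcal{B})$. But since $M_\alpha^{-1} = M_{-\alpha}$ (take $\eta = 1$ in (1.16), without loss of generality) we also have $M_{-\alpha}(\mathcal{B}) \subset \mathcal{B}$. Thus

$$\mathcal{B} \subset M_\alpha^{-1}(\mathcal{B}) = M_{-\alpha}(\mathcal{B}) \subset \mathcal{B},$$

so that equality holds.
2. This is a rewriting of the invariant form of Schwarz's lemma.
3. This is a special case of 2 because $f(w) = 0$. $\Box$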
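Part 2 of the theorem is the step that the Nevanlinna-Pick algorithm iterates. The following sketch illustrates it on the disk for one hypothetical Schur function. It assumes that $\zeta_w(z) = (z - w)/(1 - \overline{w}z)$, the usual Blaschke factor; the definition (1.18) is not legible in this copy, so this choice and all names below are assumptions.

```python
import cmath
import math

def mobius(alpha, z):
    """M_alpha(z) with eta = 1."""
    return (z - alpha) / (1 - alpha.conjugate() * z)

def zeta(w, z):
    """Assumed Blaschke factor (z - w) / (1 - conj(w) z) for the disk."""
    return (z - w) / (1 - w.conjugate() * z)

def f(z):
    """Hypothetical Schur function: |f(z)| < 0.8 on the unit disk."""
    return 0.5 * z + 0.3

def schur_step(func, w):
    """Return g = M_{func(w)}(func) / zeta_w as a Python function."""
    fw = complex(func(w))
    return lambda z: mobius(fw, func(z)) / zeta(w, z)

w = 0.4 + 0.2j
g = schur_step(f, w)

# Sample g on a polar grid inside the disk; w itself is not a grid point.
radii = [0.1 * k for k in range(1, 10)] + [0.95]
angles = [2.0 * math.pi * j / 24 for j in range(24)]
grid = [r * cmath.exp(1j * a) for r in radii for a in angles]
max_modulus = max(abs(g(z)) for z in grid)
print(max_modulus)   # stays below 1, as the theorem predicts
```

The quotient has a removable singularity at $z = w$, which is why the sample grid avoids that point rather than evaluating there.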
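The link with class $\mathcal{C}$ functions can be made as follows. The Cayley transform

$$c : z \mapsto \frac{1 - z}{1 + z} \tag{1.23}$$

is a one-to-one map of $\mathbb{D}$ onto $\mathbb{H}$ and of $\mathbb{T}$ onto $i\overline{\mathbb{R}}$ ($-1$ is mapped onto $\infty$). The following result is now simple to see.

Theorem 1.2.4. The following relations between class $\mathcal{C}$ and class $\mathcal{B}$ exist.

1. The Cayley transform $c$ of (1.23) is a one-to-one map of $\mathcal{C}$ onto $\mathcal{B}$. That is,

$$c(\mathcal{B}) = \mathcal{C} \qquad\text{and}\qquad c(\mathcal{C}) = \mathcal{B}. \tag{1.24}$$

2. For the extended classes, define $\overline{\mathcal{B}}' = \overline{\mathcal{B}} \setminus \{-1\}$, that is, we exclude the constant function $f = -1$. Then

$$c(\overline{\mathcal{B}}') = \overline{\mathcal{C}} \qquad\text{and}\qquad c(\overline{\mathcal{C}}) = \overline{\mathcal{B}}'.$$

3. More generally, let $\gamma \in \mathbb{H}$, $\eta \in \mathbb{T}$, and $f, g \in H(\mathbb{O})$ be related by

$$g = \eta\,\frac{f - \gamma}{f + \overline{\gamma}}. \tag{1.25}$$

Then $f \in \mathcal{C}$ iff $g \in \mathcal{B}$.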
Proof. 1. If $f \in \mathcal{B}$, then $|f| < 1$ in $\mathbb{O}$ so that $|1 + f| > 0$ in $\mathbb{O}$. Hence $c(f) \in H(\mathbb{O})$. Conversely, if $f \in \mathcal{C}$, then $1 + f$ has strictly positive real part in $\mathbb{O}$ and therefore $1 + f$ does not vanish. Thus again, $c(f) \in H(\mathbb{O})$. The rest follows from the one-to-one map given by the Cayley transform.
2. Here we have to exclude $f(z) = -1$ because then the transform would fail.
3. This is proved along the same lines as 1.
This concludes our proof. $\Box$
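A small numerical check of part 1 may help fix ideas. The sketch assumes $c(z) = (1 - z)/(1 + z)$, the standard involutive form that sends $-1$ to $\infty$ and the disk to the right half plane, and applies it to one hypothetical Schur function; none of the names below come from the text.

```python
import cmath
import math

def c(z):
    """Cayley transform z -> (1 - z)/(1 + z); it is its own inverse."""
    return (1 - z) / (1 + z)

def schur_f(z):
    """Hypothetical Schur function with |schur_f(z)| < 0.8 on the unit disk."""
    return 0.5 * z + 0.3

def caratheodory_from_schur(z):
    """c applied to a Schur function gives a function with positive real part."""
    return c(schur_f(z))

radii = [0.0, 0.3, 0.6, 0.9, 0.99]
angles = [2.0 * math.pi * j / 16 for j in range(16)]
disk = [r * cmath.exp(1j * a) for r in radii for a in angles]

real_parts = [caratheodory_from_schur(z).real for z in disk]
involution_error = max(abs(c(c(z)) - z) for z in disk)

# The unit circle (minus the point -1) is mapped to the imaginary axis.
circle = [cmath.exp(1j * th) for th in (0.2, 1.3, 2.4, -1.0)]
boundary_real_parts = [abs(c(t).real) for t in circle]

print(min(real_parts), involution_error, max(boundary_real_parts))
```

Because $c$ is an involution, the same function serves for both directions of (1.24).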
We give now integral expressions for functions in $\mathcal{C}$, the well-known Riesz-Herglotz-Nevanlinna representation of class $\mathcal{C}$ functions. We start with the case of the disk. Therefore we introduce the Riesz-Herglotz kernel

$$D(t, z) = \frac{t + z}{t - z}. \tag{1.26}$$

To a positive measure $\mu$ on $[0, 2\pi]$ (i.e., on $\mathbb{T}$), we associate the $\mathcal{C}$ function $\Omega_\mu(z)$ in $\mathbb{D}$:

$$\Omega_\mu(z) = ic + \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\,d\mu(\theta), \qquad c \in \mathbb{R}, \quad z \in \mathbb{D}. \tag{1.27}$$

This function is analytic in $\mathbb{D}$ and belongs to $H_p$ for all $p < 1$ [76, p. 34] and hence it has a nontangential limit a.e. The constant $c$ is the imaginary part of $\Omega_\mu(0) = 1 + ic$. This integral representation of $\mathcal{C}$ functions is called the Riesz-Herglotz representation for functions of class $\mathcal{C}$. Conversely, every $\mathcal{C}$ function can be represented in this form. The relation between $\mu$ and $\Omega_\mu$ is one to one except for the constant $c$. Since $\mu$ is uniquely defined by $\Omega_\mu$, we shall refer to it as the Riesz-Herglotz measure for $\Omega_\mu$. Note that the real part of the kernel $D(t, z)$, $t \in \mathbb{T}$, $z \in \mathbb{D}$ is given by

$$P(t, z) = \operatorname{Re} D(t, z) = \frac{1 - |z|^2}{|t - z|^2}, \qquad t \in \mathbb{T}. \tag{1.28}$$
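It is the Poisson kernel for $\mathbb{D}$. It features in the Poisson-Stieltjes integral, which represents the (positive) real part of $\Omega_\mu$:

$$\operatorname{Re} \Omega_\mu(z) = \int \operatorname{Re} D(t, z)\,d\mu(t) = \int \operatorname{Re} \frac{t + z}{t - z}\,d\mu(t) = \int P(t, z)\,d\mu(t) = \int \frac{1 - |z|^2}{|t - z|^2}\,d\mu(t), \qquad z \in \mathbb{D}.$$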
This is obviously positive since the integrand on the right is positive. By Fatou's theorem [116, p. 34], this also has a radial limit given by

$$\lim_{r \uparrow 1} \operatorname{Re} \Omega_\mu(re^{i\theta}) = \mu'(\theta) \quad \text{a.e.} \tag{1.29}$$

Here $\mu'$ is the density of the absolutely continuous part of $\mu$ in its Lebesgue decomposition, and at the discontinuous points, it can be replaced by the symmetric derivative, that is,

$$\mu'(\theta) = \lim_{h \to 0} \frac{\mu(\theta + h) - \mu(\theta - h)}{2h}.$$

See Ref. [76, p. 4]. The relation $\operatorname{Re} D(t, z) = P(t, z)$ for $t \in \mathbb{T}$ can be generalized to a relation for $t \in \mathbb{C}$ by defining

$$P(t, z) = \frac{1}{2}\left[D(t, z) + D(t, z)_*\right] = \frac{t(1 - |z|^2)}{(t - z)(1 - \overline{z}t)}, \tag{1.30}$$
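where the substar is with respect to the variable $t$. Using a conformal mapping from $\mathbb{D}$ to $\mathbb{U}$, we see that the Riesz-Herglotz kernel for $\mathbb{D}$ transforms into the Nevanlinna kernel for $\mathbb{U}$:

$$D(t, z) = \frac{1}{i}\,\frac{1 + tz}{t - z}. \tag{1.31}$$

The Poisson kernel for $\mathbb{D}$ transforms into $(1 + t^2)P(t, z)$, where now $P(t, z)$ is the Poisson kernel for $\mathbb{U}$, which is defined as

$$P(t, z) = \frac{\operatorname{Im} z}{|t - z|^2}. \tag{1.32}$$

Now we do not have $P(t, z) = \operatorname{Re} D(t, z)$, but instead

$$\frac{1}{2}\left[D(t, z) + D(t, z)_*\right] = P(t, z)\,|t + i|^2, \qquad t \in \mathbb{R}.$$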
Note that with our previous definitions, we can give an invariant expression for the Poisson kernel that catches both the disk and the half plane case: m
f\
'
M,
(1.33)
whereas $\frac{1}{2}[D(t, z) + D(t, z)_*] = P(t, z)\,\varpi_0(t)\,\varpi_{0*}(t)$ is also invariant. For functions in $\mathcal{C}(\mathbb{U})$, we have the following Nevanlinna representation:

$$\Omega_\mu(z) = ic + \int D(t, z)\,d\mathring{\mu}(t), \qquad c \in \mathbb{R}, \quad z \in \mathbb{U}, \tag{1.34}$$

where $\mathring{\mu}$ is a finite positive measure for $\overline{\mathbb{R}}$. Recall that our integral is over the extended real line $\overline{\mathbb{R}}$. If there is a point mass $b > 0$ at infinity, we could split it off explicitly and write

$$\Omega_\mu(z) = ic - ibz + \int_{\mathbb{R}} D(t, z)\,d\mathring{\mu}(t), \qquad b \ge 0$$

[115, p. 588; 130, p. 144]. This is called its Nevanlinna representation. If $\sup |y\,\Omega_\mu(iy)|$ is bounded for $y > 1$, then $\mu$ will indeed be a finite measure and there will be no mass at infinity and the representation can be simplified even more to a Hamburger representation; see Refs. [115, p. 590] and [2, p. 92]. The correspondence between $\Omega_\mu$ and $\mu$ is one to one except for the constant term. The measure $\mu$ will be called the Nevanlinna measure of $\Omega_\mu$. The real part of $\Omega_\mu$ is

$$\operatorname{Re} \Omega_\mu(z) = by + \int \operatorname{Re} D(t, z)\,d\mathring{\mu}(t) = by + \int P(t, z)\,d\mu(t) > 0, \qquad z = x + iy \in \mathbb{U}.$$

Consequently, $\Omega_\mu$ is analytic in $\mathbb{U}$ with values in $\mathbb{H}$, which confirms that it is a positive real function. Again by Fatou's theorem for the half plane [116, p. 123], we know that the nontangential limit of $\operatorname{Re} \Omega_\mu$ on the boundary converges a.e. to the density of the absolutely continuous part of the measure $\mu$: $\lim_{y \downarrow 0} \operatorname{Re} \Omega_\mu(x + iy) = \mu'(x)$ [92, p. 29], [130, p. 146].
We shall in the rest of the text use Riesz-Herglotz representation, Riesz-Herglotz measure, etc. in the case of the unit circle $\mathbb{T}$, and Nevanlinna representation, Nevanlinna measure, etc. for the case of the real line $\mathbb{R}$. For the general case $\partial\mathbb{O}$, we use the adjective Riesz-Herglotz-Nevanlinna instead. As said before, in the sequel we assume that the measure is normalized by $\int d\mathring{\mu} = 1$ and we shall normalize the $\mathcal{C}$ function $\Omega_\mu$ by $\Omega_\mu(\alpha_0) = 1$. This avoids the extra constant $c$ and we get a strict one-to-one relation for $z \in \mathbb{O}$ between the positive measure $\mu$ and the $\mathcal{C}$ function $\Omega_\mu$:
\[
\Omega_\mu(z) = \int D(t,z)\,d\mathring{\mu}(t), \qquad \operatorname{Re}\Omega_\mu(z) = \int \operatorname{Re} D(t,z)\,d\mathring{\mu}(t) = \int P(t,z)\,d\mu(t).
\]
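When $\Omega_\mu \in H_1(\mathbb{D})$, the analysis simplifies considerably, because then $\mu$ is absolutely continuous so that the Fourier coefficients of $\mu$ are equal to the Taylor coefficients of $\Omega_\mu$, since indeed writing
\[
D(t,z) = \frac{t+z}{t-z} = 1 + 2\sum_{k=1}^{\infty} t^{-k}z^k
\]
gives
\[
\Omega_\mu(z) = c_0 + 2\sum_{k=1}^{\infty} c_kz^k, \quad c_k = \int t^{-k}\,d\mu(t), \quad c_0 = 1, \tag{1.35}
\]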
which converges locally uniformly in $\mathbb{D}$. Any positive real function $\Omega$ of $H_1(\mathbb{O})$ with $\Omega(\alpha_0) > 0$ can be characterized by
\[
\Omega(z) = \int D(t,z)\operatorname{Re}\Omega(t)\,d\mathring{\lambda}(t). \tag{1.36}
\]
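The power series behind (1.35) only uses the geometric expansion of the kernel $(t+z)/(t-z)$, so it can be checked on any probability measure on the circle. The sketch below is an illustration under stated assumptions: it uses a hypothetical three-point discrete measure (which is of course not absolutely continuous) and takes the moments to be $c_k = \int t^{-k}\,d\mu(t)$.

```python
import cmath


def omega(z, masses, nodes):
    """Riesz-Herglotz transform of a discrete measure on the unit circle."""
    return sum(m * (t + z) / (t - z) for m, t in zip(masses, nodes))


def moment(k, masses, nodes):
    """Moment c_k = integral of t^(-k) dmu(t) for the discrete measure."""
    return sum(m * t ** (-k) for m, t in zip(masses, nodes))


def omega_series(z, masses, nodes, terms=200):
    """Partial sum c_0 + 2 * sum_{k>=1} c_k z^k of the Taylor expansion."""
    total = moment(0, masses, nodes)
    for k in range(1, terms):
        total += 2 * moment(k, masses, nodes) * z ** k
    return total


# Hypothetical probability measure: three point masses on the circle.
masses = [0.5, 0.3, 0.2]
nodes = [cmath.exp(1j * theta) for theta in (0.4, 2.1, 4.0)]
z = 0.3 - 0.2j

closed_form = omega(z, masses, nodes)
series = omega_series(z, masses, nodes)
print(abs(closed_form - series))   # difference between the two evaluations
print(closed_form.real > 0)        # positive real part inside the disk
```

Since $|z| < 1$, the truncated series agrees with the closed form up to rounding, and the real part is positive as the Poisson-Stieltjes integral predicts.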
Note that the converse is not true: the measure $\mu$ can be absolutely continuous without $\Omega_\mu$ being an $H_1$ function. The relation (1.36) holds for $\mathbb{D}$ and $\mathbb{U}$ simultaneously since we wrote $d\mathring{\lambda}$ instead of $d\lambda$. However, the relation (note $\lambda$ and not $\mathring{\lambda}$)
holds for both $\mathbb{D}$ and $\mathbb{U}$. This is a special case of the more general theorem [92, p. 61] that says that any $f \in H_1(\mathbb{O})$ can be recovered from its boundary function by a Poisson integral
\[
f(z) = \int P(t,z)f(t)\,d\lambda(t), \quad z \in \mathbb{O}.
\]
This formula also holds when $f$ is replaced by its real part, $\operatorname{Re} f$. Conversely, if $\mu$ is a finite complex measure such that the Poisson-Stieltjes integral
\[
f(z) = \int P(t,z)\,d\mu(t)
\]
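is analytic in $\mathbb{O}$, then $\mu$ is absolutely continuous: $d\mu = f(t)\,d\lambda(t)$ with $f(t)$ the boundary function of $f(z)$.

1.3. Factorizations

It is a classical result [76, pp. 24, 193] that every $f \in H_p(\mathbb{O})$, $0 < p < \infty$, has a canonical inner-outer factorization. This means that there exists an essentially unique factorization
\[
f \in H_p(\mathbb{O}) \Leftrightarrow f(z) = U(z)F(z) \tag{1.37}
\]
with $U$ an inner function and $F$ an outer function in $H_p(\mathbb{O})$. An inner function $U$ is a function $U \in S(\mathbb{O})$ with $|U(t)| = 1$ a.e. on $\partial\mathbb{O}$. An outer function $F \in H_p(\mathbb{O})$ has the form
\[
F(z) = e^{i\gamma}\exp\left\{\int D(t,z)\log\psi(t)\,d\mathring{\lambda}(t)\right\}, \quad \gamma \in \mathbb{R}, \tag{1.38}
\]
with $\log\psi \in L_1(\mathring{\lambda})$ and $\psi \in L_p(\lambda)$. In (1.37), $F$ is of the form (1.38) with $\psi = |f|$. The inner-outer factorization holds also for a function $f \in H_p(\mathbb{O})$ when in the definition of the outer function, the condition $(|f| =)\,\psi \in L_p(\lambda)$ is replaced
Since an outer function has no zeros in $\mathbb{O}$, its inverse will be in $H(\mathbb{O})$. An example of an inner function is a Blaschke product. It is defined as
\[
B(z) = \prod_{n} \zeta_n(z) \tag{1.39}
\]
with
\[
\zeta_n(z) = z_n\frac{z-\alpha_n}{1-\bar\alpha_nz},\ \alpha_n \in \mathbb{D} \text{ for } \mathbb{D}, \qquad
\zeta_n(z) = z_n\frac{z-\alpha_n}{z-\bar\alpha_n},\ \alpha_n \in \mathbb{U} \text{ for } \mathbb{U}. \tag{1.40}
\]
The convergence factors $z_n \in \mathbb{T}$ are defined as
\[
z_n = -\frac{\bar\alpha_n}{|\alpha_n|} \text{ for } \mathbb{D}, \qquad
z_n = \frac{|\alpha_n^2+1|}{\alpha_n^2+1} \text{ for } \mathbb{U}. \tag{1.41}
\]
For $\alpha_n = \alpha_0$, we set $z_n = 1$ by convention. A Blaschke product converges iff
\[
\sum_n (1-|\alpha_n|) < \infty \text{ for } \mathbb{D}, \qquad
\sum_n \frac{\operatorname{Im}\alpha_n}{1+|\alpha_n|^2} < \infty \text{ for } \mathbb{U}.
\]
In the case of the half plane, we may replace the convergence condition by $\sum_n \operatorname{Im}\alpha_n < \infty$ if we know that the moduli $|\alpha_n|$ are bounded [92, p. 56]. Any inner function has the form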
\[
U(z) = e^{i\gamma}B(z)S(z), \quad \gamma \in \mathbb{R},
\]
where $B$ is a Blaschke product and $S$ is a singular inner function, which is of the form
\[
S(z) = \exp\left\{-\int D(t,z)\,d\nu(t)\right\},
\]
where $\nu$ is a bounded, positive, singular ($\nu' = 0$ a.e.) measure. In (1.37), the Blaschke factor of the inner factor $U$ catches all the zeros $\alpha_n$ of $f$. Inner functions in $\mathbb{O}$ have a pseudo-meromorphic extension across the boundary $\partial\mathbb{O}$ [73]. This means the following: Because $U \in H_\infty$, it is an analytic function in $\mathbb{O}$ and therefore $U_* \in H'_\infty$, and thus it is analytic in $\mathbb{O}^e$. Moreover, since on the boundary $\partial\mathbb{O}$ we have almost everywhere for any inner function $|U|^2 = 1$ or $UU_* = 1$, we can write $U = 1/U_*$ on $\partial\mathbb{O}$. In this way, $U$ has an analytic extension to the whole complex Riemann sphere, where we have to exclude the poles $\hat\alpha_j$, $j = 1, 2, \ldots$ of course as well as the points of $\partial\mathbb{O}$ that are in the support of the singular measure $\nu$ generating the singular part
of $U$. The nontangential limits from outside or inside $\mathbb{O}$ to the boundary $\partial\mathbb{O}$ coincide wherever they exist. See also Ref. [92, p. 75 ff]. Douglas, Shapiro, and Shields [73] showed that a general function $f \in H_2$ has a pseudo-meromorphic extension across $\partial\mathbb{O}$ if there exists an inner function $U \in H_2$ such that on $\partial\mathbb{O}$ we have $\overline{U}f \in H'_2$ or, equivalently, if $f$ can be factored as $f = h_*/U_*$ on $\partial\mathbb{O}$ with $h \in H_2$ and $U$ inner in $H_2$. Again, the left-hand side has an extension to $\mathbb{O}$ and the right-hand side to $\mathbb{O}^e$, which defines $f$ in the sphere $\overline{\mathbb{C}}$. Let $\mu'$ be the density of the absolutely continuous part of the positive measure $\mu$. Suppose the Szegő condition $\log\mu' \in L_1(d\mathring{\lambda})$, that is,
\[
\int \log\mu'(t)\,d\mathring{\lambda}(t) > -\infty \tag{1.42}
\]
is satisfied, then we can define a spectral factor of $\mu'$ as
\[
\sigma(z) = c\exp\left\{\frac12\int D(t,z)\log\mu'(t)\,d\mathring{\lambda}(t)\right\}, \quad z \in \mathbb{O}, \quad c \in \mathbb{T}. \tag{1.43}
\]
It is defined up to an arbitrary unimodular constant factor $c$. We shall refer to the spectral factor when we set $c = 1$. Note that then $\sigma(\alpha_0) > 0$. The spectral factor is an outer function in $H_2$. Outer implies that $\sigma$ as well as $1/\sigma$ are both in $H_2$. See, for example, Ref. [187]. Since $\sigma$ is in $H_2$, it has a nontangential limit that satisfies
\[
|\sigma(t)|^2 = \mu'(t) \quad \text{a.e. } t \in \partial\mathbb{O}. \tag{1.44}
\]
Note also that we have
\[
|\sigma(z)|^2 = \exp\left\{\int P(t,z)\log\mu'(t)\,d\lambda(t)\right\}, \quad z \in \mathbb{O}.
\]
As one can see from its definition, the spectral factor $\sigma$ does not depend on the singular part of the measure. It is completely defined in terms of the absolutely continuous part. Recall that $d\mu_s = d\mu - \mu'\,d\lambda = d\mu - d\mu_a$. From the Szegő theory of orthogonal polynomials, we know that in the circle case $1/\sigma$ vanishes $d\mu_s$ a.e. if $\log\mu' \in L_1$ [87, p. 202]. The same is true for the real line: $1/\sigma = 0$ $d\mu_s$ a.e. on $\partial\mathbb{O}$. The condition $\log\mu' \in L_1$ is fundamental in the theory of Szegő for orthogonal polynomials on the unit circle. Szegő's theory has been extended beyond this condition if $\mu' > 0$ a.e. on $\mathbb{T}$ [152]. Suppose that the spectral factor $\sigma$ has a pseudo-meromorphic extension across $\partial\mathbb{O}$. Then the relations $\mu'(z) = \sigma(z)$
0 is the shift operator. However, it is shown in Ref. [116, p. 107] that a subspace of $H_2(\mathbb{U})$ is invariant under multiplication with $e^{iyz}$ iff it is invariant under multiplication with the canonical shift $(z - i)/(z + i)$ [186, p. 93]. Thus $\zeta_0(z)$ is the canonical shift operator for $\mathbb{O}$. The classical theorem of Beurling-Lax [116, Chap. 7] says that $M$ is a shift invariant subspace of $H_2 = H_2(\mathbb{O})$ iff there exists an essentially (up to a constant factor) unique inner function $U$ of $H_2$ such that $M = UH_2$. An outer function $F \in H_2$ can also be characterized by the fact that the set $\{S^kF\}_{k \ge 0}$ is dense in $H_2$.

1.4. Reproducing kernel spaces

In this section we recall some definitions and properties of reproducing kernel spaces. The necessary background can be found in Ref. [148].

Definition 1.4.1 (Reproducing kernel). Let $H$ be a Hilbert space of functions defined on $X$ with inner product $\langle\cdot,\cdot\rangle$. Then we call $k_w(z) = k(z,w)$ a reproducing kernel if
1. $k_w(z) \in H$ for all $w \in X$,
2. $\langle f, k_w\rangle = f(w)$ for all $w \in X$ and $f \in H$.
For example, $(1 - \bar wz)^{-1}$ is a reproducing kernel for $H_2(\mathbb{D})$ since we have [186, p. 15]
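\[
\left\langle f(t), \frac{1}{1-\bar wt}\right\rangle = f(w), \quad f \in H_2(\mathbb{D}), \quad w \in \mathbb{D},
\]
and $(z - \bar w)^{-1}$ is a reproducing kernel for $H_2(\mathbb{U})$ because [186, p. 92]
\[
\left\langle f(t), \frac{1}{t-\bar w}\right\rangle = f(w), \quad f \in H_2(\mathbb{U}), \quad w \in \mathbb{U}.
\]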
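The reproducing property for the disk can be observed numerically. The following sketch is only an illustration: it assumes the inner product $\langle f, g\rangle = \int f\,\bar g\,d\lambda$ with normalized Lebesgue measure on the unit circle, approximates it by an equispaced quadrature rule, and uses a hypothetical polynomial test function.

```python
import cmath
import math


def inner_product_circle(f, g, samples=256):
    """<f, g> = integral of f(t) * conj(g(t)) over the unit circle with
    respect to normalized Lebesgue measure (equispaced quadrature)."""
    total = 0j
    for j in range(samples):
        t = cmath.exp(2j * math.pi * j / samples)
        total += f(t) * g(t).conjugate()
    return total / samples


def szego_kernel(w):
    """k_w(z) = 1 / (1 - conj(w) z), the kernel quoted for H_2 of the disk."""
    return lambda z: 1 / (1 - w.conjugate() * z)


def f(z):
    """A hypothetical test function in H_2 of the disk (a polynomial)."""
    return 1 + 2 * z + 0.5j * z ** 2 - z ** 3


w = 0.4 + 0.3j
reproduced = inner_product_circle(f, szego_kernel(w))
print(abs(reproduced - f(w)))  # error of the reproducing property at w
```

For analytic integrands the equispaced rule converges geometrically, so 256 nodes are far more than needed for $|w| = 0.5$.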
In both cases the inner product represents the Cauchy integral for the appropriate space $H_2(\mathbb{O})$. It is a well known property [148] that if the Hilbert space is separable and $\{\phi_k\}_{k \in \Gamma}$ is an orthonormal basis, then the unique reproducing kernel is given by
\[
k(z,w) = \sum_{k \in \Gamma} \phi_k(z)\overline{\phi_k(w)}.
\]
These reproducing kernels can also be used to find best approximants in subspaces as the following property shows.

Theorem 1.4.1. Let $H$ be a separable Hilbert space and $K$ a closed subspace with reproducing kernel $k_w(z) = k(z,w)$. Then the best approximant (w.r.t. the norm $\|\cdot\| = \langle\cdot,\cdot\rangle^{1/2}$) of $f \in H$ from $K$ is given by
\[
h(w) = \langle f, k_w\rangle.
\]
This $h$ is the orthogonal projection of $f$ onto $K$.

Proof. Suppose $\{\phi_k : k \in \Gamma'\}$ is an orthonormal basis for $K$. Extend this with $\{\phi_k : k \in \Gamma''\}$ such that $\{\phi_k : k \in \Gamma = \Gamma' \cup \Gamma''\}$ is an orthonormal basis for $H$. Then the kernel of $K$ is given by $k_w = \sum_{k \in \Gamma'} \phi_k\overline{\phi_k(w)}$. Any element $f \in H$ can be expanded as
\[
f = \sum_{k \in \Gamma} a_k\phi_k \quad\text{with}\quad a_k = \langle f, \phi_k\rangle.
\]
The best approximant from $K$ is given by $h = \sum_{k \in \Gamma'} a_k\phi_k$, whereas
\[
\langle f, k_w\rangle = \sum_{k \in \Gamma'} \langle f, \phi_k\rangle\phi_k(w) = \sum_{k \in \Gamma'} a_k\phi_k(w) = h(w).
\]
This proves the theorem. □
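Theorem 1.4.1 can be illustrated in the simplest setting. The sketch below is an assumption-laden example, not the book's construction: it takes $H = H_2(\mathbb{D})$ with normalized Lebesgue measure, $K$ the polynomials of degree at most $n$ (so that $k_w(z) = \sum_{k=0}^n (z\bar w)^k$), and a hypothetical function $f(z) = 1/(1 - z/2)$. The projection evaluated through the kernel should coincide with the truncated Taylor series of $f$.

```python
import cmath
import math


def inner_product_circle(f, g, samples=512):
    """<f, g> with normalized Lebesgue measure on the unit circle."""
    total = 0j
    for j in range(samples):
        t = cmath.exp(2j * math.pi * j / samples)
        total += f(t) * g(t).conjugate()
    return total / samples


def poly_kernel(n, w):
    """Reproducing kernel of K = polynomials of degree <= n inside H_2(D)."""
    return lambda z: sum((z * w.conjugate()) ** k for k in range(n + 1))


def f(z):
    """Test function in H_2(D) with Taylor coefficients 2^(-k)."""
    return 1 / (1 - z / 2)


def projection_at(n, w):
    """h(w) = <f, k_w>: the projection of f onto K, evaluated at w."""
    return inner_product_circle(f, poly_kernel(n, w))


def truncated_taylor(n, w):
    """Orthogonal projection computed directly from the Taylor series."""
    return sum((w / 2) ** k for k in range(n + 1))


n = 3
w = 0.4 + 0.3j
print(abs(projection_at(n, w) - truncated_taylor(n, w)))
```

Because the monomials are orthonormal in this setting, the orthogonal projection simply discards the Taylor coefficients of index larger than $n$, which is what the kernel formula reproduces.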
With these kernels, it is also possible to solve a number of classical extremal problems in Hilbert spaces. We find in Ref. [148, p. 44] the following theorem.

Theorem 1.4.2. Let $H$ be a Hilbert space with reproducing kernel $k(z,w)$. Then all the solutions of the following problem:
\[
P^1(a,w):\quad \sup\{|f(w)|^2 : \|f\| = |a|,\ w \in X\}
\]
are parametrized by
\[
f = \eta\,a\,k(z,w)[k(w,w)]^{-1/2}, \quad |\eta| = 1.
\]
The supremum is $|a|^2k(w,w)$. The problem
\[
P^2(a,w):\quad \inf\{\|f\|^2 : f(w) = a,\ w \in X\}
\]
reaches an infimum for
\[
f = a\,k(z,w)[k(w,w)]^{-1}
\]
and this solution is unique. The minimum reached is $|a|^2[k(w,w)]^{-1}$.

Proof. This theorem was given in Ref. [148] for $a = 1$, but the introduction of $a$ is trivial. □

The problems $P^1(a,w)$ and $P^2(a,w)$ are related as dual extremal problems as can be found in Ref. [72, p. 133] in a much more general context of Banach spaces. Problem $P^2(a,w)$ can be understood as the problem of finding the orthogonal projection in $H$ of 0 onto the space $V = \{f \in H : f(w) = a\}$.

1.5. J-unitary and J-contractive matrices

We shall consider $2 \times 2$ matrices $\theta$ whose entries are functions in the Nevanlinna class $N$: $\theta = [\theta_{ij}] \in N^{2\times2}$. We consider such matrices that are unitary with respect to the indefinite metric
\[
J = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} = 1 \oplus -1.
\]
We mean that they satisfy
\[
\theta^HJ\theta = J \quad\text{a.e. on } \partial\mathbb{O}, \tag{1.46}
\]
where the superscript $H$ denotes complex conjugate transpose. If we define the substar conjugate for matrices as the elementwise substar conjugate of the transposed matrix:
\[
\begin{bmatrix} \theta_{11} & \theta_{12} \\ \theta_{21} & \theta_{22} \end{bmatrix}_* = \begin{bmatrix} \theta_{11*} & \theta_{21*} \\ \theta_{12*} & \theta_{22*} \end{bmatrix},
\]
then we can write (1.46) as
\[
\theta_*J\theta = J \quad\text{on } \partial\mathbb{O}. \tag{1.47}
\]
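As we did for inner functions in $S$, we can define a pseudo-meromorphic extension for such a $\theta$-matrix. Indeed, it follows from (1.47) that $|\det\theta| = 1$ a.e. on $\partial\mathbb{O}$. Hence, $\theta$ is invertible on $\partial\mathbb{O}$ a.e. and therefore also in $\mathbb{O}$ a.e. From the relation
\[
\theta_* = J\theta^{-1}J \quad\text{a.e. on } \partial\mathbb{O},
\]
we can extend the right-hand side to $\mathbb{O}$ and hence we define also $\theta_*(z) = [\theta(\hat z)]^H$ for $z \in \mathbb{O}$, which is equivalent with defining $\theta(\hat z) = [\theta_*(z)]^H$ for $\hat z \in \mathbb{O}^e$. Thus $\theta$ is defined on the sphere $\overline{\mathbb{C}}$. We shall call the matrix functions satisfying
\[
\theta_*J\theta = J \quad\text{a.e. in } \overline{\mathbb{C}}
\]
J-unitary matrices and denote the set of these matrices as $\mathcal{T}_J = \{\theta \in N^{2\times2} : \theta_*J\theta = J \text{ a.e.}\}$. We have the following properties for J-unitary matrices:

Theorem 1.5.1. For elements of $\mathcal{T}_J$ the following relations hold:
2. $\theta \in \mathcal{T}_J \Rightarrow |\det\theta| = 1$ on $\partial\mathbb{O}$.
3. $\theta \in \mathcal{T}_J \Rightarrow \theta^{-1} = J\theta_*J$.
4. $\theta \in \mathcal{T}_J \Rightarrow \theta J\theta_* = J$.
5. If $\theta = [\theta_{ij}] \in \mathcal{T}_J$, then
(a) $\theta_{11*}\theta_{11} - \theta_{21*}\theta_{21} = \theta_{22*}\theta_{22} - \theta_{12*}\theta_{12} = 1$,
(b) $\theta_{11*}\theta_{12} - \theta_{21*}\theta_{22} = \theta_{11*}\theta_{21} - \theta_{12*}\theta_{22} = 0$,
(c) $\theta_{12*}\theta_{12} - \theta_{21*}\theta_{21} = \theta_{11*}\theta_{11} - \theta_{22*}\theta_{22} = 0$,
(d) $(\theta_{11} + \theta_{12})_*^{-1}(\theta_{11} - \theta_{12})_* = (\theta_{22} + \theta_{21})^{-1}(\theta_{22} - \theta_{21})$.
6. Let $\theta = [\theta_{ij}] \in \mathcal{T}_J$ and set $a = \theta_{11} - \theta_{12}$; $b = \theta_{11} + \theta_{12}$; $c = \theta_{22} - \theta_{21}$; and $d = \theta_{22} + \theta_{21}$. Then the following holds true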
if*
^]j_
2 [^
K\
=
i r£
c*i = j_
bh
2 [d
d*\
dd*
$ab_* + a_*b = cd_* + c_*d = 2$.

Proof. Parts 1-3 are trivial to check. Part 4 follows from $\theta_*J\theta = J$ so that $J\theta_* = \theta^{-1}J$ and by multiplying with $\theta$, we get $\theta J\theta_* = J$. Part 5 is just $\theta_*J\theta = J = \theta J\theta_*$ written out elementwise. The equalities of part 5 then give an easy proof for part 6. □
An important example of a constant J-unitary matrix is
\[
U_\rho = (1 - |\rho|^2)^{-1/2}\begin{bmatrix} 1 & -\bar\rho \\ -\rho & 1 \end{bmatrix}. \tag{1.48}
\]
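A constant matrix of this type is easy to test for J-unitarity by direct multiplication. The sketch below is illustrative only: it assumes the form $U_\rho = (1-|\rho|^2)^{-1/2}\begin{bmatrix}1 & -\bar\rho\\ -\rho & 1\end{bmatrix}$ with $|\rho| < 1$, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def conj_transpose(a):
    """Complex conjugate transpose of a 2 x 2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)]
            for i in range(2)]


def max_abs_diff(a, b):
    """Largest entrywise deviation between two 2 x 2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


J = [[1, 0], [0, -1]]


def u_rho(rho):
    """Assumed form of the constant matrix U_rho, for |rho| < 1."""
    scale = (1 - abs(rho) ** 2) ** -0.5
    return [[scale, -scale * rho.conjugate()], [-scale * rho, scale]]


rho = 0.3 - 0.5j  # hypothetical parameter inside the unit disk
U = u_rho(rho)
left = matmul(matmul(conj_transpose(U), J), U)    # U^H J U
right = matmul(matmul(U, J), conj_transpose(U))   # U J U^H
print(max_abs_diff(left, J), max_abs_diff(right, J))
```

Both products should reproduce $J$ up to rounding, in line with parts 3 and 4 of Theorem 1.5.1 for a constant matrix.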
In fact, this example turns out to be almost the most general constant J-unitary matrix.

Theorem 1.5.2. The most general constant $\theta \in \mathcal{T}_J$ is given by
°k
(1.49)
772
with $|\eta_i| = 1$, $i = 1, 2$, and $U_\rho$, $|\rho| \ne 1$ as given in (1.48).

Proof. This is a matter of simple algebra. One can make use of the properties given in Theorem 1.5.1. □

A simple nonconstant matrix from the class $\mathcal{T}_J$ is given by the Blaschke-Potapov factor with a zero in $\alpha \in \mathbb{O}$:
\[
B_\alpha = \begin{bmatrix} \zeta_\alpha & 0 \\ 0 & 1 \end{bmatrix}. \tag{1.50}
\]
The matrices in $N^{2\times2}$ that are also J-contractive in $\mathbb{O}$ form an important class we shall often need in this paper. J-contractive in $\mathbb{O}$ means
\[
\theta^HJ\theta \le J \quad\text{a.e. in } \mathbb{O}.
\]
By the inequality, we mean that J — 0H JO > 0, that is, this is positive semidefinite. The class of strictly J-contractive matrices is denoted by = {0 e T/; 0HJ0 < J a.e. in
0.
For these matrices a number of additional properties can be proved. The following theorem is due to Dewilde and Dym [60, p. 448]. Theorem 1.5.3. For 0 = [%] e By the following hold: 7. 2. 3. 4. 5. 6. 7.
0H e D j . 6HJ6 > J a.e. inOe. (0u+en)-x eH2. (0n+0l2);l(0n (O22 + O21)-1 eH2. (022 + 02l)-l(022 (flu 4- 0X2)-\e2X - #22)* is inner.
Proof. Part 1 was shown in Potapov [180, p. 171]. We shall, however, give an explicit proof using an idea of Dym [77, p. 14]. Because J -0HJ0 > 0 a.e. in O, we find for the (2, 2) element -I-[On
022]J[0l2
022f>0.
Hence
Thus 022 is analytic in O and we can define E l l = 011 — 012022 ^ -1 = —021022 ^22
=
3 - l ^22
which are all analytic in O. Set E = [£; 7 ] and define P = (7 + J)/2 and Q = (7 - J)/2. Then we can check easily that (a) P ± 2 S and P ± E Q are invertible in O. (b) 0 = (PE + e)(QE + P)" 1 = (P - VQrHUP - Q) in O. 1 / / (c) J - 0 ( z i ) J 9 ( z 2 ) H = ( P forzi,z2 € O. (d) 7 - 0 ( z 2 ) H / 0 ( z i ) forzi,Z2 € O. From (d), we find that in O, S is contractive iff 0 is J-contractive, while we see similarly from (c) that I!7* is contractive iff 0H is J-contractive. Hence, to
show that $\theta \in \mathcal{B}_J \Leftrightarrow \theta^H \in \mathcal{B}_J$, we only have to show that $\Sigma$ is contractive iff $\Sigma^H$ is contractive. This is a classical result. See, for example, Ref. [77, Lemma 0.1]. In fact $\Sigma\Sigma^H \le I$ as well as $\Sigma^H\Sigma \le I$ are equivalent with the singular values of $\Sigma$ being bounded by 1. (Note: The matrix $\Sigma$ is called a scattering matrix and $\theta$ is called a chain scattering matrix. They describe the same scattering phenomenon by a rearrangement of the inputs and outputs. See Ref. [62] and Section 12.3.) Part 2 follows from part 1 and the definition of $\mathcal{B}_J$, which imply that for $z \in \mathbb{O}$
it holds that a.e. 0(z)-lJ0(z)-H
> J. Now make use of 6(z)~l = J0(z)HJ
to get 6(w)HJ0(w)>J
withw = zeOe.
(1.53)
Using part 1, we also have 0(z)J0(z)H > 7,
zeO.
Then it follows from writing out the (1,1) element that |0n*| 2 -|6>i2*| 2 > 1 a.e.inO.
(1.54)
Therefore, (#n* + #12*) is not zero and its inverse is analytic in O. Computing the real part of the expression of part 4, we get, using (1.54), Re
(Oiu-OnA
=
I£ii*l2-lfli2*l2
>
1
>
From this, part 4 follows. Since the left-hand side in the previous expression is a harmonic majorant in O for the analytic function \6\\* + #i2*l~25 it follows from Ref. [76, Theorem 2.12, p. 28] that part 3 is true. Part 5 and 6 follow from the (2,2) element in much the same way as 3 and 4 followed from the (1,1) element in (1.53). To prove the last part, note that we had for z e © that 0*/0,f > / . So we get in ©:
with equality on 3©. Working this out gives 1 - da > 0, where a — (0\u + 012*)"1 (#21* + 022*) and with equality on 9©. This identifies a as an inner function. •
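The following theorem describes a simple matrix from the class $\mathcal{B}_J$.

Theorem 1.5.4. The most general first degree matrix in $\mathcal{B}_J$ with a zero at $z = \alpha \in \mathbb{O}$ is given by
\[
\begin{bmatrix} \eta_1 & 0 \\ 0 & \eta_2 \end{bmatrix}U_\rho B_\alpha U_\gamma,
\]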
with rfi, r]2 G T, Up and Uy constant J-unitary matrices as defined in (1.48) for p and y $ T, and Ba a Blaschke-Potapov factor as in (1.50). Proof. This is a classical result that can be found, for example, in Potapov [180, pp. 187-188]. • A matrix in N2x2 that is J-contractive in O and J-unitary will be called J-inner, a terminology used by Dym [77]. The set of J-inner matrix functions is denoted as Bj = By n Ty while we set Bj = By HTy. The matrix of Theorem 1.5.4 is a J-inner matrix. Much more on J-unitary matrix functions and the work of Potapov can be found in the V. P. Potapov memorial volume [100].
The fundamental spaces
This chapter serves to introduce the spaces of rational functions with fixed poles in $\mathbb{O}^e$. In the case of the disk, these spaces generalize the spaces of polynomials. The latter are a special case if all poles are at infinity. For the half plane, the situation is similar. Also here, the polynomial case is recovered by letting the poles tend to infinity, although the fact that now $\infty \in \overline{\mathbb{R}}$, with $\overline{\mathbb{R}} = \partial\mathbb{O}$, makes the situation less trivial. For $\partial\mathbb{O} = \overline{\mathbb{R}}$, the polynomials appear in a much more natural way when the interpolation points are located on the boundary. This will be discussed later in Chapter 11. We shall first discuss several equivalent characterizations of the spaces in Section 2.1, and in Section 2.2 we give several rules for doing calculus in these spaces. The results of the latter section are frequently used in the rest of the text. It requires some skill to perform the computations fluently and the reader is warmly recommended to have a close look at this section because these results are used in practically every subsequent section of this book. The last section of this chapter reconsiders, for the spaces of rational functions, several extremal problems that are related to the extremal problem that was given in general reproducing kernel spaces in Section 1.4. Their solutions can be described in terms of kernels or orthogonal functions.

2.1. The spaces $\mathcal{L}_n$

In this section we shall introduce the spaces $\mathcal{L}_n$, which are the fundamental spaces dealt with in Chapters 2-10. We already defined the Blaschke factors $\zeta_n(z)$ in Section 1.3 (1.40-1.41). We recall the definitions
\[
\zeta_i(z) = z_i\frac{\varpi_i^*(z)}{\varpi_i(z)}, \tag{2.1}
\]
where depending on $\mathbb{O} = \mathbb{D}$ or $\mathbb{O} = \mathbb{U}$, the definitions of the factors are
\[
\varpi_i(z) = 1 - \bar\alpha_iz, \quad \varpi_i^*(z) = z - \alpha_i \quad\text{for } \mathbb{D},
\]
\[
\varpi_i(z) = z - \bar\alpha_i, \quad \varpi_i^*(z) = z - \alpha_i \quad\text{for } \mathbb{U}.
\]
There is a special point $\alpha_0 = 0$ for $\mathbb{D}$ and $\alpha_0 = i$ for $\mathbb{U}$. For $\alpha_i = \alpha_0$, we always set $z_i = 1$ and for $\alpha_i \ne \alpha_0$, the normalizing factors are
\[
z_i = -\frac{\bar\alpha_i}{|\alpha_i|} \text{ for } \mathbb{D}, \qquad z_i = \frac{|\alpha_i^2+1|}{\alpha_i^2+1} \text{ for } \mathbb{U}.
\]
Thus
\[
\zeta_i(z) = -\frac{\bar\alpha_i}{|\alpha_i|}\,\frac{z-\alpha_i}{1-\bar\alpha_iz} \text{ for } \mathbb{D}, \qquad
\zeta_i(z) = \frac{|\alpha_i^2+1|}{\alpha_i^2+1}\,\frac{z-\alpha_i}{z-\bar\alpha_i} \text{ for } \mathbb{U}.
\]
In what follows we shall always assume that an expression such as $z_i$, even if it does not appear in a Blaschke factor, will be 1 when $\alpha_i = \alpha_0$. The factor $\zeta_0(z) = z$ for $\mathbb{D}$ and $\zeta_0(z) = (z - i)/(z + i)$ for $\mathbb{U}$ is important since
\[
|\zeta_0(z)| < 1 \text{ for } z \in \mathbb{O}, \qquad |\zeta_0(z)| = 1 \text{ for } z \in \partial\mathbb{O}, \qquad |\zeta_0(z)| > 1 \text{ for } z \in \mathbb{O}^e,
\]
and thus it is fairly easy to characterize the sets $\mathbb{O}$, $\partial\mathbb{O}$, and $\mathbb{O}^e$ as
\[
\mathbb{O} = \{z : |\zeta_0(z)| < 1\}, \quad \partial\mathbb{O} = \{z : |\zeta_0(z)| = 1\}, \quad \mathbb{O}^e = \{z : |\zeta_0(z)| > 1\}.
\]
Recall also (1.3), which gave an equivalent but more complicated characterization of these three parts of $\overline{\mathbb{C}}$. Next we define finite Blaschke products as
\[
B_0 = 1 \quad\text{and}\quad B_n = \zeta_1 \cdots \zeta_n, \quad n \ge 1. \tag{2.2}
\]
We then consider the spaces
\[
\mathcal{L}_n = \operatorname{span}\{B_k : k = 0, 1, \ldots, n\}. \tag{2.3}
\]
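For the disk, the Blaschke factors and finite Blaschke products are straightforward to evaluate. The sketch below is a minimal illustration under the assumptions $\zeta_i(z) = z_i(z-\alpha_i)/(1-\bar\alpha_iz)$, $z_i = -\bar\alpha_i/|\alpha_i|$, and $z_i = 1$ when $\alpha_i = 0$; the function names and the sample points are hypothetical.

```python
import cmath


def blaschke_factor(z, alpha):
    """Disk Blaschke factor zeta(z) with the normalization z_alpha."""
    if alpha == 0:
        return z  # convention: z_alpha = 1 when alpha equals alpha_0 = 0
    z_alpha = -alpha.conjugate() / abs(alpha)
    return z_alpha * (z - alpha) / (1 - alpha.conjugate() * z)


def blaschke_product(z, alphas):
    """Finite Blaschke product B_n(z) = zeta_1(z) ... zeta_n(z)."""
    result = 1
    for alpha in alphas:
        result *= blaschke_factor(z, alpha)
    return result


# Hypothetical interpolation points in the open unit disk.
alphas = [0.5 + 0.2j, -0.3j, 0, 0.7]

on_circle = abs(blaschke_product(cmath.exp(1.1j), alphas))
inside = abs(blaschke_product(0.2 - 0.4j, alphas))
at_origin = blaschke_factor(0, alphas[0])
at_zero_of_last = blaschke_product(alphas[-1], alphas)

print(on_circle)        # modulus on the unit circle
print(inside)           # modulus at an interior point
print(at_origin)        # value of the first factor at alpha_0 = 0
print(at_zero_of_last)  # B_n evaluated at alpha_n
```

The normalization makes each factor positive at $\alpha_0 = 0$, namely $\zeta_i(0) = |\alpha_i|$, while the product has modulus one on the circle and vanishes at each $\alpha_i$.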
They will often be considered as subspaces of $L_2(\mu)$ but from time to time we shall also consider them as subspaces of $L_2(\lambda)$ or some other space.

There are of course many equivalent ways to describe the spaces $\mathcal{L}_n$. One of them is to say that $\mathcal{L}_n$ is a space of rational functions whose poles are all in the prescribed set $\{\hat\alpha_i : i = 1, \ldots, n\} \subset \mathbb{O}^e$. Recall that $\hat\alpha = 1/\bar\alpha$ for $\mathbb{D}$ whereas $\hat\alpha = \bar\alpha$ for $\mathbb{U}$. Thus we may write
\[
\mathcal{L}_n = \left\{\frac{p_n(z)}{\pi_n(z)} : \pi_n(z) = \prod_{i=1}^n \varpi_i(z),\ p_n \in \mathcal{P}_n\right\}.
\]
Note that in the case of the disk, we may choose all $\alpha_i = 0$ and then $\mathcal{L}_n$ is just the space of polynomials of degree at most $n$. Thus in that case $\mathcal{L}_n = \mathcal{P}_n$. For the half plane though, the polynomials are less simple to recover. First suppose that all $\alpha_k = \alpha$ for some $\alpha \in \mathbb{U}$. Next we replace the basis functions $[(z - \alpha)/(z - \bar\alpha)]^k$ for the spaces $\mathcal{L}_n$ by $[(1 - \alpha z)/(z - \bar\alpha)]^k$, which describes of course the same spaces. We can now let $\alpha$ tend to $\infty$ or 0 (which are both on the boundary $\overline{\mathbb{R}}$), and we find that $\mathcal{L}_n$ becomes $\mathcal{P}_n$ in the first case and the set of polynomials in $1/z$ in the second case. The spaces $\mathcal{L}_n$ depend upon the point sets $A_n = \{\alpha_i : \alpha_i \in \mathbb{O},\ i = 1, \ldots, n\}$. By $\hat A_n$ we shall denote the set $\hat A_n = \{\hat\alpha_i : \alpha_i \in A_n\}$.
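Some of the $\alpha_i$ can be repeated a number of times. So we could rearrange them and make the repetition explicit by setting
\[
A_n^0 = \{\alpha_0\} \cup A_n = \{\underbrace{\beta_0, \ldots, \beta_0}_{\nu_0}, \underbrace{\beta_1, \ldots, \beta_1}_{\nu_1}, \ldots, \underbrace{\beta_m, \ldots, \beta_m}_{\nu_m}\}. \tag{2.4}
\]
We fix $\beta_0$ to be $\alpha_0$, so that $\nu_i$, $i = 0, \ldots, m$ are positive integers and $\sum_{i=0}^m \nu_i = n + 1$. The basis $\{B_k : k = 0, \ldots, n\}$ is not the only possible choice to span $\mathcal{L}_n$ of course. With $A_n^0$ as described in (2.4), we can use as a possible basis in the case of the disk:
\[
\{w_k\}_{k=0}^n = \{1, z, \ldots, z^{\nu_0-1}, (1-\bar\beta_1z)^{-1}, \ldots, (1-\bar\beta_1z)^{-\nu_1}, \ldots, (1-\bar\beta_mz)^{-1}, \ldots, (1-\bar\beta_mz)^{-\nu_m}\}. \tag{2.5}
\]
For the real line, we should replace this by
\[
\{w_k\}_{k=0}^n = \{1, (z+i)^{-1}, \ldots, (z+i)^{-(\nu_0-1)}, (z-\bar\beta_1)^{-1}, \ldots, (z-\bar\beta_1)^{-\nu_1}, \ldots, (z-\bar\beta_m)^{-1}, \ldots, (z-\bar\beta_m)^{-\nu_m}\}
\]
so that an invariant notation would be
\[
\{w_k\}_{k=0}^n = \{w_0, \ldots, w_{\nu_0-1}, \varpi_1^{-1}, \ldots, \varpi_1^{-\nu_1}, \ldots, \varpi_m^{-1}, \ldots, \varpi_m^{-\nu_m}\}, \tag{2.6}
\]
where $\varpi_j(z) = 1 - \bar\beta_jz$ for the disk and $\varpi_j(z) = z - \bar\beta_j$ for the half plane. As always $\beta_0 = \alpha_0$ is 0 for the disk and $i$ for the half plane. The first $\nu_0$ basis functions are different for both:
\[
\{w_k : k = 0, \ldots, \nu_0 - 1\} = \begin{cases} \{z^{k-1} : k = 1, \ldots, \nu_0\} & \text{for } \mathbb{D}, \\ \{(z+i)^{1-k} : k = 1, \ldots, \nu_0\} & \text{for } \mathbb{U}. \end{cases}
\]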
The advantage of working with the basis $\{B_k : k = 0, \ldots, n\}$ is that repetition of points and distinction between $\alpha_i = 0$ or $\alpha_i \ne 0$ need no special notation as in some other choices such as, for instance, (2.5). Here is yet another way to characterize the spaces $\mathcal{L}_n$. Define
\[
\mathcal{M}_n = \zeta_0B_nH_2, \tag{2.7}
\]
with $B_n$ the finite Blaschke product associated with $\alpha_1, \ldots, \alpha_n$. Clearly, by Beurling's theorem, $\mathcal{M}_n$ is a shift invariant subspace of $H_2$ since $\zeta_0B_n$ is an inner function. The sequence $\{\mathcal{M}_n : n = 0, 1, \ldots\}$ contains shrinking subspaces, that is, $\mathcal{M}_{n+1} \subset \mathcal{M}_n \subset \cdots \subset \mathcal{M}_0 = \zeta_0H_2$
0, $B_n \in \mathcal{L}_n \setminus \mathcal{L}_{n-1}$, which implies that the Blaschke products indeed form a basis. First
we show that $B_n \in \mathcal{L}_n$. Choose some $f(z) = zB_n(z)g(z) \in zB_n(z)H_2$ ($g \in H_2(\mathbb{D})$). Then
\[
\langle f, B_n\rangle = \int tB_n(t)g(t)B_{n*}(t)\,d\lambda(t) = \int t\,g(t)\,d\lambda(t) = 0
\]
since $B_{n*} = 1/B_n$ and $g$ has vanishing negative Fourier coefficients. Hence $B_n \perp zB_nH_2$ and therefore $B_n \in \mathcal{L}_n$. However, $B_n \notin \mathcal{L}_{n-1}$ since for $f \in \mathcal{M}_{n-1}$:
\[
\langle f, B_n\rangle = \int tB_{n-1}(t)g(t)B_{n*}(t)\,d\lambda(t) = \int t\,g(t)/\zeta_n(t)\,d\lambda(t)
\]
with $1/\zeta_n(z) = \alpha_n/|\alpha_n| \cdot (1 - \bar\alpha_nz)/(\alpha_n - z)$, which gives by Cauchy's formula
\[
\langle f, B_n\rangle = -g(\alpha_n)\frac{\alpha_n}{|\alpha_n|}(1 - |\alpha_n|^2) \text{ for } \alpha_n \ne 0 \quad\text{or}\quad g(0) \text{ for } \alpha_n = 0,
\]
which is not zero for all $g \in H_2$. Hence $B_n$ is not orthogonal to $\mathcal{M}_{n-1}$, and thus it is not in $\mathcal{L}_{n-1}$. □

The previous theorem shows that we can identify $\mathcal{L}_n$ as defined in (2.8) with the originally introduced space $\mathcal{L}_n$ of (2.3):
\[
\mathcal{L}_n = H_2 \ominus \zeta_0B_nH_2 = \mathcal{M}_n^\perp = \operatorname{span}\{B_k : k = 0, \ldots, n\}.
\]
The previous result says, for example, that a function in $H_2$ is orthogonal to $\mathcal{L}_n$ if and only if it vanishes in the point set $A_n^0 = \{\alpha_0, \alpha_1, \ldots, \alpha_n\}$; thus the difference between a function $f \in H_2$ and its orthogonal projection onto $\mathcal{L}_n$ should vanish in $A_n^0$. In other words, the orthogonal projection of $f \in H_2$ onto $\mathcal{L}_n$ should interpolate $f$ in the points $A_n^0$. We shall come back to this property in Section 7.1. Consider the special case of the disk where we put all $\alpha_i = 0$. Then the spaces $\mathcal{L}_n$ reduce to the spaces $\mathcal{P}_n$ of polynomials. It is well known that in that case the Gram matrix of the basis $Z_n = [1\ z\ z^2\ \ldots\ z^n]$ in $L_2(\mu)$ is given by
\[
G_n = \langle Z_n, Z_n\rangle_\mu = [\langle z^i, z^j\rangle_\mu] = [c_{j-i}],
\]
which is a positive definite Toeplitz matrix containing the moments of $\mu$. If, however, all the $\alpha_i$ are distinct, then the basis $w_k$ we mentioned previously in (2.6) reduces to
\[
W_n = [w_0\ \ w_1\ \ \cdots\ \ w_n] = \left[1,\ \frac{1}{\varpi_1(z)},\ \ldots,\ \frac{1}{\varpi_n(z)}\right].
\]
Using the definition of $\Omega_\mu$, we easily obtain the Gram matrix
This is a so-called Pick matrix, named after G. Pick who used the positive definiteness of this matrix as a criterion to characterize the solvability of the Nevanlinna-Pick interpolation problem. In the more general case where some of the $\alpha_k$ do coincide, the Gram matrix looks more complicated and involves derivatives of $\Omega_\mu$ evaluated at the $\alpha_i$. To see this, we give a technical lemma first.

Lemma 2.1.2. Consider the Riesz-Herglotz kernel $D(z,w) = (z+w)(z-w)^{-1}$ for the disk. Then
\[
\partial_w^kD(z,w) = 2(k!)\,z(z-w)^{-(k+1)}, \quad k \ge 1,
\]
where $\partial_w^k$ denotes the $k$th derivative with respect to $w$. We also have
\[
[\partial_w^kD(z,w)]_* = 2(k!)\,z^k(1-\bar wz)^{-(k+1)}, \quad k \ge 1,
\]
where the substar transform is with respect to $z$. Furthermore, if
\[
\Omega_\mu(w) = i\operatorname{Im}\Omega_\mu(0) + \int D(t,w)\,d\mu(t),
\]
then for $k \ge 1$
\[
\Omega_\mu^{(k)}(w) = \partial_w^k\Omega_\mu(w) = \int \partial_w^kD(t,w)\,d\mu(t) = 2(k!)\int \frac{t\,d\mu(t)}{(t-w)^{k+1}}
\]
and
(substar with respect to t). Similarly, we may consider the Nevanlinna kernel D(z, w) = (1 + zw)/[i(z - w)] for the half plane. We then get 3*D(z, w) = - i ( - l and = JdkwD(t, w)djJL(t) = -\(-\)k(k\)
J-^ w) k+\ '
Proof. This is a matter of simple algebra and we leave this to the reader.
•
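The "simple algebra" for the disk kernel can be cross-checked numerically. The sketch below is an illustration with hypothetical names and sample points: it approximates the derivatives of $w \mapsto D(z,w) = (z+w)/(z-w)$ by Cauchy's integral formula on a small circle around $w$ and compares them with the closed form $2(k!)\,z(z-w)^{-(k+1)}$.

```python
import cmath
import math


def D(z, w):
    """Riesz-Herglotz kernel for the disk as a function of (z, w)."""
    return (z + w) / (z - w)


def derivative_in_w(z, w, k, radius=0.5, samples=64):
    """k-th derivative of w -> D(z, w), from Cauchy's integral formula
    discretized on a circle of the given radius around w.
    The circle must not enclose the pole at z."""
    total = 0j
    for j in range(samples):
        phase = cmath.exp(2j * math.pi * j / samples)
        total += D(z, w + radius * phase) / phase ** k
    return math.factorial(k) * total / (samples * radius ** k)


def closed_form(z, w, k):
    """Closed form 2 k! z (z - w)^(-(k+1)) for k >= 1."""
    return 2 * math.factorial(k) * z / (z - w) ** (k + 1)


z = 2 + 1j          # hypothetical sample point, well away from w
w = 0.3 - 0.2j
for k in range(1, 5):
    numeric = derivative_in_w(z, w, k)
    exact = closed_form(z, w, k)
    print(k, abs(numeric - exact) / abs(exact))
```

The relative errors should sit at rounding level, since the discretized Cauchy integral converges geometrically when the pole at $z$ lies well outside the circle.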
With this lemma, we can now prove the following theorem.

Theorem 2.1.3. If we choose the basis (2.6) for the space $\mathcal{L}_n$, then the Gram matrix $G_n = [\langle w_i, w_j\rangle_\mu]$ will only depend upon $\Omega_\mu^{(k)}(\beta_i)$, $k = 0, 1, \ldots, \nu_i - 1$, $i = 0, 1, \ldots, m$. The superscript $(k)$ denotes the $k$th derivative and $\Omega_\mu$ is the Riesz-Herglotz-Nevanlinna transform of $\mu$.

Proof. Suppose we consider the case of the disk. One possible form of the elements in $G_n$ involves an integral like
/o^F
, i = 1, 2 , . . . are different and nonzero. Following Walsh [200, p. 224], and more precisely in Ref. [199], it is called the Malmquist basis in Refs. [67] and [70]. For the half plane, the corresponding basis, orthonormal with respect to dk, is
with again Bk =
2i Hence, an invariant formulation of an orthonormal basis for Cn in H2(O) is , Bo
-—, Bi(z)
——,..., , . . . , BBn-i(z) , n - i ( z ) > UJ2(Z) T&niz) J
If the points at are renamed as in (2.5) as fij, then, in the case of the disk, this orthogonal basis takes the form
1-PiZ
\l-PiZ
.j.^0
m
This form has been used in, for example, Ref. [111, p. 149 ff]. See also Ref. [186, p. 27]. Let us elaborate a bit further on the orthonormal basis in $H_2$. Let us define (compare with (2.13)) $k = 1, \ldots, n$. (2.14) Then this forms, together with $v_0 = 1$, an orthonormal basis in $H_2$ for $\mathcal{L}_n$ if all the $\alpha_i$ are mutually different and different from $\alpha_0$. However, it holds in general
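also if some points coincide or are equal to $\alpha_0$, that $\mathcal{L}_n = \operatorname{span}\{1, v_1, \ldots, v_n\}$. The interesting thing about this basis is that we can now write
\[
\mathcal{L}_n(w) = \operatorname{span}\left\{\frac{z-w}{\varpi_0(z)}v_1(z), \ldots, \frac{z-w}{\varpi_0(z)}v_n(z)\right\} = \{f \in \mathcal{L}_n : f(w) = 0\}.
\]
More precisely
\[
\mathcal{L}_n(\alpha_0) = \operatorname{span}\{\zeta_0v_1, \ldots, \zeta_0v_n\} \quad\text{and}\quad \mathcal{L}_n(\hat\alpha_0) = \operatorname{span}\{v_1, \ldots, v_n\}.
\]
Define (compare with (2.7))
\[
\mathcal{M}_n(0) = B_nH_2 = \zeta_0^{-1}\mathcal{M}_n.
\]
Then we can prove the following property (for the disk see also Walsh [200, p. 225]).

Theorem 2.1.6. With the spaces as defined above we have that $\mathcal{L}_n(\hat\alpha_0) = \{f \in \mathcal{L}_n : f(\hat\alpha_0) = 0\}$ is the orthogonal complement (in $H_2(\mathbb{O})$) of $\mathcal{M}_n(0) = B_nH_2$. Thus
\[
\mathcal{L}_n(\hat\alpha_0) = H_2 \ominus B_nH_2 = \mathcal{M}_n(0)^\perp.
\]
Proof. Take a function from $B_nH_2$ of the form $B_nf$ with $f \in H_2$ and a function from $\mathcal{L}_n(\hat\alpha_0)$ that has the form $\varpi_0(z)p_{n-1}(z)/\pi_n(z)$, where $p_{n-1} \in \mathcal{P}_{n-1}$, a polynomial of degree at most $n - 1$ and as before $\pi_n = \prod_{i=1}^n \varpi_i$. The inner product of these functions equals
where $\eta_n = \prod_{i=1}^n z_i$, $q_{n-1}(z) \in \mathcal{P}_{n-1}$ and $\varpi_0^*(z) = z$ for the disk and $\varpi_0^*(z) = z - i$ for the half plane. Since the integrand is in $H_2(\mathbb{O})$ and vanishes in $\alpha_0$, this integral is zero by (1.8). □

The next result says that we can also find a basis for $\mathcal{L}_n$ from its reproducing kernel.

Theorem 2.1.7. Let $k_n(z,w)$ be the reproducing kernel for $\mathcal{L}_n$. Then for a set $\{\xi_0, \ldots, \xi_n\}$ of distinct points in $\mathbb{C}$, the functions $\{k_n(z,\xi_j)\}$, $j = 0, \ldots, n$ form a basis for $\mathcal{L}_n$.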
Proof. Certainly, the functions are all in $\mathcal{L}_n$. They are also linearly independent. Since
\[
[k_n(z,\xi_0), \ldots, k_n(z,\xi_n)] = [\phi_0(z), \ldots, \phi_n(z)]V^H \quad\text{with}\quad V = [\phi_j(\xi_i)]_{i,j=0}^n
\]
and because $\{\phi_0, \ldots, \phi_n\}$ is a basis, $\{k_n(\cdot,\xi_0), \ldots, k_n(\cdot,\xi_n)\}$ will also be a basis iff $V$ is regular. Suppose it were not, then there is a nonzero vector $A = [a_0, \ldots, a_n]^T \in \mathbb{C}^{n+1}$ such that
\[
VA = 0.
\]
Thus the function $\phi = \sum_{j=0}^n a_j\phi_j \in \mathcal{L}_n$ is not identically zero (because $A \ne 0$), and yet it has $n + 1$ zeros $\xi_0, \ldots, \xi_n$, which is impossible. □

Note that this theorem implies that the Gram matrix
\[
VV^H = [k_n(\xi_i,\xi_j)] = [\langle k_n(\cdot,\xi_j), k_n(\cdot,\xi_i)\rangle]
\]
is a positive definite matrix. In fact, a positive (semi-)definite (Gram) matrix is basically equivalent to a reproducing kernel. See Ref. [12, p. 344] or [71, Chap. 10].

2.2. Calculus in $\mathcal{L}_n$

Recall that we already defined the substar transform $f_*(z) = \overline{f(\hat z)}$. Now if $f \in \mathcal{L}_n$, we shall also define the superstar transform as $f^*(z) = B_n(z)f_*(z)$, where $B_n$ is the finite Blaschke product with zeros from $A_n = \{\alpha_i : i = 1, \ldots, n\}$. Thus, if $f = \sum_{i=0}^n a_iB_i \in \mathcal{L}_n$, then
\[
f^*(z) = \sum_{i=0}^n \bar a_i\frac{B_n(z)}{B_i(z)} = \bar a_n + \bar a_{n-1}\zeta_n(z) + \cdots + \bar a_0B_n(z).
\]
Note that the definition of the superstar transform depends on $n$. We could have used a notation that shows this dependence explicitly, such as, for example, $f^{[n]}$, but we prefer not to for simplicity of notation and it will always be clear from the context what $n$ we shall mean. We call $a_n$ the leading coefficient of $f \in \mathcal{L}_n$ (w.r.t. the basis $\{B_k\}$). Note that the leading coefficient of $f \in \mathcal{L}_n$ is given by $\overline{f^*(\alpha_n)}$. If the leading coefficient is 1, we say that $f$ is monic. In the case of the disk, we find the polynomials as a special case by setting $\alpha_i = 0$ for all $i$, so that $\mathcal{L}_n = \mathcal{P}_n$. Using the definition of superstar in this special case, it is natural to define for a polynomial $p \in \mathcal{P}_n$ the superstar as $p^*(z) = z^np_*(z)$; thus
\[
\left(\sum_{i=0}^n a_iz^i\right)^* = \sum_{i=0}^n \bar a_iz^{n-i} = \bar a_n + \bar a_{n-1}z + \cdots + \bar a_0z^n. \tag{2.15}
\]
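For polynomials on the disk, the superstar amounts to reversing and conjugating the coefficient list. The following sketch is an illustration with hypothetical helper names: it takes $n$ to be the length of the coefficient list minus one (the degree bound of $\mathcal{P}_n$) and compares the coefficient rule with the definition $p^*(z) = z^n\,\overline{p(1/\bar z)}$.

```python
def evaluate(coeffs, z):
    """Evaluate sum_i coeffs[i] * z**i by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * z + c
    return result


def superstar(coeffs):
    """Coefficients of p^*(z) = z^n p_*(z): reverse and conjugate."""
    return [complex(c).conjugate() for c in reversed(coeffs)]


def superstar_by_definition(coeffs, z):
    """z^n * conj(p(1 / conj(z))), the disk definition, with n = len - 1."""
    n = len(coeffs) - 1
    return z ** n * evaluate(coeffs, 1 / z.conjugate()).conjugate()


# Hypothetical polynomial of degree 3: a_0 + a_1 z + a_2 z^2 + a_3 z^3.
coeffs = [1 + 2j, -3, 0.5j, 2]
z = 0.7 + 0.4j

via_coefficients = evaluate(superstar(coeffs), z)
via_definition = superstar_by_definition(coeffs, z)
print(abs(via_coefficients - via_definition))
```

Applying `superstar` twice returns the original coefficients, mirroring the identity $(f^*)^* = f$.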
Thus if we define $\pi_n(z) = \prod_{i=1}^n \varpi_i(z)$, then we can write $B_n(z)$ as
\[
B_n(z) = \eta_n\frac{\pi_n^*(z)}{\pi_n(z)} \quad\text{with}\quad \eta_n = \prod_{i=1}^n z_i \tag{2.16}
\]
and
\[
\left(\frac{p_n(z)}{\pi_n(z)}\right)^* = \eta_n\frac{p_n^*(z)}{\pi_n(z)}. \tag{2.17}
\]
In the case of the half plane, however, the polynomials are not obtained for a special choice of the $\alpha_i$. Moreover, for the half plane, the relations (2.16) and (2.17) should be replaced by
\[
B_n(z) = \eta_n\frac{\pi_{n*}(z)}{\pi_n(z)} \quad\text{and}\quad \left(\frac{p_n(z)}{\pi_n(z)}\right)^* = \eta_n\frac{p_{n*}(z)}{\pi_n(z)}.
\]
If we want to keep a uniform notation as in (2.16) and (2.17) for both the disk and the half plane, then we should consider defining the superstar conjugate for polynomials in the case of the half plane not as in (2.15) but we should use $p^*(z) = p_*(z)$ instead. With this definition for the half plane and (2.15) for the disk, we can use (2.16) and (2.17) for both cases. Note that our notation $\varpi_n^*(z) = z - \alpha_n$ conforms to this convention for the superstar for polynomials since it equals $(1 - \bar\alpha_nz)^*$ for the disk and it is equal to $(z - \bar\alpha_n)^*$ for the half plane. The following technical properties can be trivially verified but they are very useful if one wants to do computations in $\mathcal{L}_n$.
Theorem 2.2.1. In $\mathcal{L}_n$, the following relations hold:
1. If $f \in \mathcal{L}_n$, then (a) $(f_*)_* = (f^*)^* = f$, (b) $(f_*)^* = fB_n$, (c) $(f^*)_* = fB_{n*} = f/B_n$.
2. For the finite Blaschke products we have (a) $B_n^* = 1$, (b) $B_{n*} = 1/B_n$.
3. Define $\pi_n(z) = \prod_{i=1}^n \varpi_i(z)$, and let $f \in \mathcal{L}_n$ be given by $f = p_n/\pi_n$, with $p_n$ a polynomial. Then (a) $B_n = \eta_n\pi_n^*/\pi_n$ with $\eta_n = \prod_{i=1}^n z_i$, (b) $f_* = p_{n*}/\pi_{n*} = p_n^*/\pi_n^*$, (c) $f^* = \eta_np_n^*/\pi_n$.
4. If $f, g \in \mathcal{L}_n$, then $\langle f, g\rangle_\mu = \langle g_*, f_*\rangle_\mu = \langle g^*, f^*\rangle_\mu$.
5. If $\phi_n \in \mathcal{L}_n$ and $\langle\phi_n, \mathcal{L}_{n-1}\rangle_\mu = 0$, i.e.,
From the problem $P^1(a,w)$ that we considered in Section 1.4, we can now derive

Theorem 2.3.1. All the solutions of the following optimization problem:
\[
P_n^1(1,\alpha_n):\quad \sup\{|f(\alpha_n)|^2 : \|f\|_\mu = 1,\ f \in \mathcal{L}_n\}
\]
are given by
\[
f = \eta\phi_n^*, \quad |\eta| = 1,
\]
where $\phi_n$ is the $n$th orthonormal basis function of $\mathcal{L}_n$ with leading coefficient $\kappa_n = \phi_n^*(\alpha_n)$. The maximum is equal to $\kappa_n^2$.

Proof. This follows immediately from Theorem 1.4.2 and the properties given in Theorem 2.2.3. □

Furthermore, for the second optimization problem of Theorem 1.4.2 we formulate a special case.

Theorem 2.3.2. The optimization problem
\[
P_n^2(1,\alpha_n):\quad \inf\{\|f\|_\mu^2 : f(\alpha_n) = 1,\ f \in \mathcal{L}_n\}
\]
has a unique solution given by
\[
f = \varphi_n^* = \kappa_n^{-1}\phi_n^*,
\]
where $\varphi_n$ is the $n$th monic orthogonal basis function in $\mathcal{L}_n$, and $\phi_n$ is the orthonormal one with $\phi_n^*(\alpha_n) = \kappa_n > 0$. The minimal value is $\kappa_n^{-2}$.

Proof. This also follows from Theorem 1.4.2 and the properties in Theorem 2.2.3. □

Since in $\mathcal{L}_n$ it holds that
\[
\|f\|_\mu = \|f^*\|_\mu,
\]
we also have solved the following problem.

Corollary 2.3.3. The unique solution of the problem
\[
\inf\{\|f\|_\mu^2 : f \in \mathcal{L}_n^M\},
\]
where C^f denotes the set of all monic elements ofCn, is given by the nth monic orthogonal basis function (pn = K~l(pn of Cn and the minimum is K~2, with Kn > 0 the leading coefficient of the orthonormal one. Recall the definition of Cn(w) given in Section 2.1:
$$\mathcal{L}_n(w) = \{f \in \mathcal{L}_n : f(w) = 0\}.$$
The problem of finding the orthogonal projection of 1 onto $\mathcal{L}_n(w)$ is related to a classical Szegő problem.
Theorem 2.3.4. Define the following problem in $\mathcal{L}_n(w)$, which is defined as above with $\mathcal{L}_n$ the rational function space based on the point set $A_n = \{\alpha_1, \ldots, \alpha_n\}$:
$$P_n(w):\quad \inf\{\|1 - f\|_{\mathring{\mu}}^2 : f \in \mathcal{L}_n(w)\}.$$
Then $P_n(w)$ has the unique solution $f = 1 - k_n(z, w)/k_n(w, w)$, where $k_n(z, w)$ is the reproducing kernel of $\mathcal{L}_n$. The minimum is given by $[k_n(w, w)]^{-1}$.
Proof. This problem can be reduced to problem $P_n^2(1, w)$ by noting that $\mathcal{L}_n(w) = \{f = 1 - g : g \in \mathcal{L}_n,\ g(w) = 1\}$ and thus, for $f = 1 - g$,
$$\inf\{\|1 - f\|_{\mathring{\mu}}^2 : f \in \mathcal{L}_n(w)\} = \inf\{\|g\|_{\mathring{\mu}}^2 : g \in \mathcal{L}_n,\ g(w) = 1\}.$$
From this the result follows easily. □
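The reduction above rests on the standard extremal property of a reproducing kernel. As a brief sketch, assuming only the reproducing property $g(w) = \langle g, k_n(\cdot, w)\rangle_{\mathring{\mu}}$ for $g \in \mathcal{L}_n$ and $k_n(w, w) > 0$, the Cauchy-Schwarz inequality gives the minimum directly for any $g \in \mathcal{L}_n$ with $g(w) = 1$:

```latex
\begin{aligned}
1 = |g(w)|^2 &= |\langle g, k_n(\cdot, w)\rangle_{\mathring{\mu}}|^2
  \le \|g\|_{\mathring{\mu}}^2\, \|k_n(\cdot, w)\|_{\mathring{\mu}}^2
  = \|g\|_{\mathring{\mu}}^2\, k_n(w, w), \\
\text{so}\quad \|g\|_{\mathring{\mu}}^2 &\ge \frac{1}{k_n(w, w)},
  \quad\text{with equality iff}\quad g(z) = \frac{k_n(z, w)}{k_n(w, w)} .
\end{aligned}
```

Equality in Cauchy-Schwarz forces $g$ to be a multiple of $k_n(\cdot, w)$, and the constraint $g(w) = 1$ fixes the multiple, which is why the minimizer is unique.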
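This theorem has a simple corollary.
Corollary 2.3.5. If $k_n(z, w)$ is the reproducing kernel for $\mathcal{L}_n$ w.r.t. $\langle\cdot,\cdot\rangle_{\mathring{\mu}}$, then $k_n(w, w)$ is nondecreasing with $n$ if $w \in \mathbb{O}$.
Proof. Since the $\mathcal{L}_n(w)$ are nested as $\mathcal{L}_n(w) \subset \mathcal{L}_{n+1}(w)$, the minimum $[k_n(w, w)]^{-1}$ cannot increase with $n$. Of course this is also obvious from $k_n(w, w) = \sum_{k=0}^{n} |\phi_k(w)|^2$. □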
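The monotonicity of $k_n(w, w)$ can be illustrated numerically. The following is a minimal sketch for the disk case only, under several assumptions that are not part of the text: the measure has a density `weight(theta)` with respect to normalized Lebesgue measure on the unit circle, integrals are replaced by an equispaced quadrature rule, and the starting basis consists of Blaschke products without unimodular normalizing constants. The names `blaschke_product` and `kernel_diagonal` are hypothetical helpers.

```python
import cmath
import math


def blaschke_product(alphas, z):
    """Product of (z - a) / (1 - conj(a) z) over the given points (disk case)."""
    out = 1.0 + 0j
    for a in alphas:
        out *= (z - a) / (1 - a.conjugate() * z)
    return out


def kernel_diagonal(alphas, w, weight, num_nodes=512):
    """Return [k_0(w,w), ..., k_n(w,w)] for the measure weight(theta) dtheta / (2 pi).

    The orthonormal functions are built by Gram-Schmidt on sampled values;
    the value at w is carried through the same linear operations.
    """
    thetas = [2 * math.pi * j / num_nodes for j in range(num_nodes)]
    nodes = [cmath.exp(1j * th) for th in thetas]
    wts = [weight(th) / num_nodes for th in thetas]

    def inner(f, g):
        # Discrete approximation of <f, g> = integral of f * conj(g) d(mu)
        return sum(x * y.conjugate() * c for x, y, c in zip(f, g, wts))

    ortho = []            # pairs (values on the circle, value at w)
    diag, total = [], 0.0
    for k in range(len(alphas) + 1):
        vals = [blaschke_product(alphas[:k], t) for t in nodes]
        at_w = blaschke_product(alphas[:k], w)
        for q_vals, q_w in ortho:
            c = inner(vals, q_vals)
            vals = [v - c * q for v, q in zip(vals, q_vals)]
            at_w -= c * q_w
        norm = math.sqrt(inner(vals, vals).real)
        vals = [v / norm for v in vals]
        at_w /= norm
        ortho.append((vals, at_w))
        total += abs(at_w) ** 2   # k_n(w,w) = sum of |phi_k(w)|^2
        diag.append(total)
    return diag


if __name__ == "__main__":
    points = [0.5 + 0j, -0.3j, 0.2 + 0.4j]
    values = kernel_diagonal(points, 0.3 + 0.2j, lambda th: 1.0 + 0.5 * math.cos(th))
    print(values)
```

When all points are taken at the origin and the weight is constant, the orthonormal functions reduce to the monomials $z^k$, so the routine should return the partial sums of $\sum_k |w|^{2k}$; this gives a simple sanity check of the sketch.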
Because there are so many optimization problems whose solutions can be expressed in terms of the reproducing kernel, one can ask whether there is an optimization problem that has this kernel for its solution. It turns out that it gives an approximation to some kernel $s_w(z)$, known as the Szegő kernel associated with $\mathring{\mu}$. See Ref. [102, pp. 51-52]. It is the reproducing kernel for $H_2(\mathring{\mu})$ and it is related to the spectral factor $\sigma$ for the measure $\mu$ as explained in Section 1.3. More precisely, the kernel $k_n(z, w)$ approximates $s_w(z)$ in $L_2(\mathring{\mu})$-norm. Let $\sigma$ be the outer spectral factor associated with the measure $\mu$, satisfying the Szegő condition $\log \mu' \in L_1(d\mathring{\lambda})$. Then define for $w \in \mathbb{O}$ the Szegő kernel
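$s(z, w) = s_w(z)$ by
$$s_w(z) = \frac{1}{[1 - \zeta_0(z)\overline{\zeta_0(w)}]\,\sigma(z)\overline{\sigma(w)}} = \frac{\varpi_0(z)\overline{\varpi_0(w)}}{\overline{\varpi_0(\alpha_0)}\,\varpi_w(z)\,\sigma(z)\overline{\sigma(w)}}. \tag{2.18}$$
That is,
$$s_w(z) = \frac{1}{\sigma(z)(1 - \overline{w}z)\overline{\sigma(w)}} \quad\text{for } \mathbb{D}, \qquad s_w(z) = \frac{(z + i)(\overline{w} - i)}{-2i\,\sigma(z)(z - \overline{w})\overline{\sigma(w)}} \quad\text{for } \mathbb{U}.$$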
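To see how a single expression can cover both geometries, here is a worked check for the half plane. It assumes $\alpha_0 = i$ and the Blaschke factor $\zeta_0(z) = (z - i)/(z + i)$ (for the disk, $\alpha_0 = 0$ and $\zeta_0(z) = z$); these normalizations are assumptions of this sketch rather than statements quoted from the surrounding text.

```latex
\begin{aligned}
1 - \zeta_0(z)\overline{\zeta_0(w)}
 &= 1 - \frac{(z - i)(\overline{w} + i)}{(z + i)(\overline{w} - i)}
  = \frac{(z + i)(\overline{w} - i) - (z - i)(\overline{w} + i)}{(z + i)(\overline{w} - i)}
  = \frac{-2i\,(z - \overline{w})}{(z + i)(\overline{w} - i)}, \\[4pt]
\frac{1}{1 - \zeta_0(z)\overline{\zeta_0(w)}}
 &= \frac{(z + i)(\overline{w} - i)}{-2i\,(z - \overline{w})},
 \qquad\text{and at } z = w:\quad \frac{|w + i|^2}{4\,\operatorname{Im} w} > 0 .
\end{aligned}
```

Dividing by $\sigma(z)\overline{\sigma(w)}$ then gives a kernel that is positive on the diagonal for $\operatorname{Im} w > 0$, as a reproducing kernel must be. In the disk the same expression reduces to $1/(1 - \overline{w}z)$.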
Before we give the approximation in Theorem 2.3.8 below, we first give some properties of $s_w(z)$ in the following lemmas.
Lemma 2.3.6. The Szegő kernel is the reproducing kernel for $H_2(\mathring{\mu})$. This means that for any $f \in H_2(\mathring{\mu})$ we have
$$\langle f, s_w\rangle_{\mathring{\mu}} = f(w).$$
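Proof. First note that $s_w \in H_2(\mathring{\mu})$. It is a well-known property that a function $f$ will be in $H_2(\mathring{\mu})$ if and only if $f\sigma \in H_2$ (see Ref. [87, Theorem 3.4, p. 215]). We now have that the $L_2(\mathring{\lambda})$ norm of $s_w\sigma$ satisfies
$$\|s_w\sigma\|^2 = \frac{|\varpi_0(w)|^2}{|\sigma(w)|^2\,|\varpi_0(\alpha_0)|^2}\int \frac{|\varpi_0(t)|^2}{|\varpi_w(t)|^2}\,d\mathring{\lambda}(t) = \frac{|\varpi_0(w)|^2}{|\sigma(w)|^2\,\overline{\varpi_0(\alpha_0)}\,\varpi_w(w)}\int P(t, w)\,d\mathring{\lambda}(t) = \frac{|\varpi_0(w)|^2}{|\sigma(w)|^2\,\overline{\varpi_0(\alpha_0)}\,\varpi_w(w)} = s_w(w) < \infty$$
as long as $w$ is in a compact subset of $\mathbb{O}$ (recall that $\overline{\varpi_0(\alpha_0)}\,\varpi_w(w) = 1 - |w|^2$ in the case of the disk, whereas it is equal to $4\operatorname{Im} w$ in the case of the half plane, and hence is strictly positive in both cases). This implies $s_w \in L_2(\mathring{\mu})$, and because $s_w$ is analytic in $\mathbb{O}$, it is in $H_2(\mathring{\mu})$. Now for any $f \in H_2(\mathring{\mu})$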
… and the minimum is given by $s_w(w) - k_n(w, w)$.
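Proof. We can also reduce this problem $S_n(w)$ to the problem $P_n^2(1, w)$ by observing that
$$\|s_w - f\|_{\mathring{\mu}}^2 = \|s_w\|_{\mathring{\mu}}^2 + \|f\|_{\mathring{\mu}}^2 - 2\operatorname{Re}\langle f, s_w\rangle_{\mathring{\mu}}.$$
The last term is equal to $-2\operatorname{Re} f(w)$ by Lemma 2.3.6. Thus we have to solve
$$\inf_{a}\,\inf\{\|f\|_{\mathring{\mu}}^2 - 2\operatorname{Re} a : f \in \mathcal{L}_n,\ f(w) = a\} = \inf_{a}\left\{\frac{|a|^2}{k_n(w, w)} - 2\operatorname{Re} a\right\}.$$
The solution of the latter problem is easily seen to be found for $a = k_n(w, w)$. Since $\|s_w\|_{\mathring{\mu}}^2 = s_w(w)$ by Corollary 2.3.7, we get the solution as given in the theorem. □
This theorem can also be reformulated as follows:
Corollary 2.3.9. Let the measure $\mu$ of the previous theorem satisfy $d\mathring{\mu}(t) = P(t, w)|\varpi_0(t)|^2\,d\mathring{\nu}$ with $P(z, w)$ the Poisson kernel. Denote by $\sigma_\mu$ and $\sigma_\nu$ the spectral factors of $\mu$ and $\nu$ respectively. Then the problem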
$$\inf\{\|f - [\sigma_\nu\overline{\sigma_\nu(w)}]^{-1}\|_{\mathring{\mu}}^2 : f \in \mathcal{L}_n\}, \qquad w \in \mathbb{O},$$
reaches a minimum $|\sigma_\nu(w)|^{-2} - k_n(w, w)$ for $f(z) = k_n(z, w)$, where $k_n$ is the reproducing kernel for $\mathcal{L}_n$ w.r.t. $d\mathring{\mu}$.
Proof. Note that the outer spectral factors are related by … Substitute this relation into the expression for $s_w$ of the previous theorem to find that it is equal to $[\sigma_\nu(z)\overline{\sigma_\nu(w)}]^{-1}$. The result then follows easily. □
The kernel functions
The reproducing kernels have played an important role in the theory of orthogonal polynomials. We shall study them for our spaces of rational functions. First of all we derive the Christoffel-Darboux relations, which generalize the corresponding formulas for the Szegő polynomials. Next, in Section 3.2, we shall derive a recurrence relation for these kernels in the style of the recurrence for the Szegő orthogonal polynomials. This is not too difficult to find once the previous relations have been found. The transition matrices that describe the recurrence are almost J-unitary matrices. We can normalize them to make them precisely J-unitary, which produces correspondingly normalized kernels. The latter are discussed in Section 3.3.
3.1. Christoffel-Darboux relations
We now prove the Christoffel-Darboux relations. We start with some technical lemmas.
Lemma 3.1.1. Let $f \in \mathcal{L}_n$.
1. If $g$ and $h$ are defined by the relations
$$f(z) - f(w) = (z - w)\,g(z) = \frac{z - w}{\varpi_n(z)}\,h(z),$$
then (a) $p_1(z)g(z) \in \mathcal{L}_n$ for $p_1 \in \mathcal{P}_1$, an arbitrary polynomial of degree at most 1, especially $g(z) \in \mathcal{L}_n$; (b) $h \in \mathcal{L}_{n-1}$.
2. If $f(w) = 0$, then $\varpi_n(z)f(z)/(z - w) \in \mathcal{L}_{n-1}$.
Proof. Clearly $g(z)$ can be written as $p_{n-1}(z)/\pi_n(z)$ with $\pi_n(z) = \prod_{k=1}^{n}\varpi_k(z)$ and $p_{n-1}(z) \in \mathcal{P}_{n-1}$. This implies (a). Furthermore, $h(z) = \varpi_n(z)g(z)$, which gives $h(z) = p_{n-1}(z)/\pi_{n-1}(z) \in \mathcal{L}_{n-1}$, and this is (b). The second result is a special case of (1b) for $f(w) = 0$. □
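Lemma 3.1.2. Let $\{\phi_k\}_{k=0}^{n}$ denote as before the orthonormal basis functions for $\mathcal{L}_n$ and $\zeta_k$ the Blaschke factor based on $\alpha_k$. As functions of $z$, with $w$ some parameter, we have
$$r_n(z, w) = \ldots$$
and for $k = 1, \ldots, n$
$$\ldots$$
Proof. A straightforward computation gives
$$1 - \zeta_k(z)\overline{\zeta_k(w)} = 1 - \frac{(\alpha_k - z)(\overline{\alpha}_k - \overline{w})}{\varpi_k(z)\overline{\varpi_k(w)}} = \frac{\overline{\varpi_k(\alpha_k)}\,\varpi_w(z)}{\varpi_k(z)\overline{\varpi_k(w)}}.$$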
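The "straightforward computation" can be spot-checked numerically. The sketch below treats the disk case only and assumes $\varpi_a(z) = 1 - \overline{a}z$ and a Blaschke factor proportional to $(a - z)/\varpi_a(z)$; the unimodular normalizing constant is omitted because it cancels in $\zeta_k(z)\overline{\zeta_k(w)}$. The function names are hypothetical.

```python
def varpi(a, z):
    """Disk-case factor varpi_a(z) = 1 - conj(a) * z (assumed convention)."""
    return 1 - a.conjugate() * z


def zeta(a, z):
    """Blaschke factor based on a, without its unimodular constant."""
    return (a - z) / varpi(a, z)


def lhs(a, z, w):
    """Left-hand side: 1 - zeta_a(z) * conj(zeta_a(w))."""
    return 1 - zeta(a, z) * zeta(a, w).conjugate()


def rhs(a, z, w):
    """Right-hand side: varpi_a(a) * varpi_w(z) / (varpi_a(z) * conj(varpi_a(w))).

    In the disk varpi_a(a) = 1 - |a|^2 is real, so conjugating it changes nothing.
    """
    return varpi(a, a) * varpi(w, z) / (varpi(a, z) * varpi(a, w).conjugate())


if __name__ == "__main__":
    a, z, w = 0.3 + 0.4j, -0.2 + 0.5j, 0.6 - 0.1j
    # The difference should vanish up to rounding error.
    print(abs(lhs(a, z, w) - rhs(a, z, w)))
```

Taking $z = w$ shows that the quantity equals $(1 - |a|^2)(1 - |w|^2)/|1 - \overline{a}w|^2$, which is strictly positive inside the disk.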
According to part (2) of the previous lemma, we only have to prove that the numerator of $r_n(z, w)$ is zero for $z = w$. Call this numerator $N(z, w)$. Thus we have to prove that $N(w, w) = 0$. Now, $N(w, w) = \ldots = 0$, which proves the first part. For the second part, we have to prove, according to part (1a) of the previous lemma and as in part one of this theorem, that the numerator of the second expression is zero for $z = w$. Let us call this numerator again $N(z, w)$. Then
$N(w, w) = \ldots$
2- X:/€ = XnJ4>n* € C, 3. 1/0* and hence also l/4>n* € Hi,
4. n/rn e B.
Proof. Use Theorems 3.3.3 and 3.2.2 and some properties from Section 2.2.
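□
It will be useful to write an inverse form of the recursion formulas as in the next theorem.
Theorem 4.1.6. Given the orthonormal function $\phi_n$ with $\phi_n^*(\alpha_n) = \kappa_n > 0$, all the previous orthonormal functions $\phi_k$, $k < n$, are uniquely defined if they are normalized by $\phi_k^*(\alpha_k) = \kappa_k > 0$. They can be found with the recursions
$$\begin{bmatrix}\phi_{n-1}(z)\\ \phi_{n-1}^*(z)\end{bmatrix} = V_n(z)\begin{bmatrix}\phi_n(z)\\ \phi_n^*(z)\end{bmatrix} \tag{4.18}$$
with
$$V_n(z) = \frac{\varpi_n(z)}{\varpi_{n-1}(z)}\begin{bmatrix}1/\zeta_{n-1}(z) & 0\\ 0 & 1\end{bmatrix}\frac{1}{1 - |\lambda_n|^2}\begin{bmatrix}1 & -\overline{\lambda}_n\\ -\lambda_n & 1\end{bmatrix}N_n^{-1}$$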
and with all the quantities appearing in this formula as in Theorem 4.1.1.
Proof. The formula (4.18) is evidently the inverse of the recurrence formula (4.1). Since the coefficients $\lambda_n$ and the matrix $N_n$ are completely defined in terms of $\phi_n$, the $\phi_{n-1}$ are uniquely defined. By induction, all the previous $\phi_k$ are uniquely defined. □
In fact this is a simple consequence of the note given at the end of Section 3.2. The kernels are uniquely defined in terms of the last one. The orthonormal functions will also be unique if they have the normalization mentioned.
4.2. Functions of the second kind
In this section we shall define some functions $\psi_n$ that are the rational analogues of the polynomials of the second kind that appear in the Szegő theory. We shall call them functions of the second kind. They are defined first in terms of the orthogonal functions $\phi_n$. We then show that they satisfy the same recurrence relation as the orthogonal functions and that they can be used to get rational approximants for the positive real function $\Omega_\mu$. Later, in Section 6.2, it will be shown that these functions of the second kind are also orthogonal rational functions with respect to a measure that is related to the given measure $\mu$. Define the following kernel:
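$$E(t, z) = D(t, z) + 1,$$
with $D(t, z)$ the Riesz-Herglotz-Nevanlinna kernel. It is easily checked that
$$E(t, z) = \frac{2t}{t - z} \quad\text{for } \mathbb{D}, \qquad E(t, z) = \frac{(t - i)(z + i)}{i(t - z)} \quad\text{for } \mathbb{U}.$$
Note that taking the substar w.r.t. $z$ implies for $t \in \partial\mathbb{O}$ that $D(t, z)_* = -D(t, z) = D(z, t)$. Consequently it also is true that
$$E(t, z)_* = 1 - D(t, z).$$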
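Here are some equivalent definitions:
$$\psi_n(z) = \int [E(t, z)\phi_n(t) - D(t, z)\phi_n(z)]\,d\mathring{\mu}(t) \tag{4.19}$$
$$\phantom{\psi_n(z)} = \int D(t, z)[\phi_n(t) - \phi_n(z)]\,d\mathring{\mu}(t) + \int \phi_n(t)\,d\mathring{\mu}(t) \tag{4.20}$$
$$\phantom{\psi_n(z)} = \begin{cases} 1 & \text{if } n = 0,\\[2pt] \displaystyle\int D(t, z)[\phi_n(t) - \phi_n(z)]\,d\mathring{\mu}(t) & \text{if } n \ge 1.\end{cases} \tag{4.21}$$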
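The last equality follows from the fact that $\langle 1, \phi_n\rangle_{\mathring{\mu}} = \delta_{0n}$. We shall first show that these are functions from $\mathcal{L}_n$.
Lemma 4.2.1. The functions $\psi_n$ of the second kind belong to $\mathcal{L}_n$.
Proof. This is trivially true for $n = 0$. For $n \ge 1$, write $\phi_n = p_n/\pi_n$ with $p_n$ a polynomial of degree at most $n$, and note that the integrand in (4.21) has the form
$$D(t, z)\,\frac{[p_n(t)\pi_n(z) - p_n(z)\pi_n(t)]}{\pi_n(t)\pi_n(z)}.$$
The term in square brackets vanishes for $t = z$, so it is divisible by $t - z$, which cancels the denominator of $D(t, z)$. The integral can therefore be written as a polynomial in $z$ of degree at most $n$ divided by $\pi_n(z)$, and this is clearly an element in $\mathcal{L}_n$. □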
We can obtain more general expressions for these functions of the second kind as shown below.
Lemma 4.2.2. To define the functions of the second kind for $n > 0$, we may replace (4.21) by
$$\psi_n(z)f(z) = \int D(t, z)[\phi_n(t)f(t) - \phi_n(z)f(z)]\,d\mathring{\mu}(t) = \int [E(t, z)\phi_n(t)f(t) - D(t, z)\phi_n(z)f(z)]\,d\mathring{\mu}(t) \tag{4.22}$$
… with $f \in \mathcal{L}_{n-1}$. Since $\phi_n \perp \mathcal{L}_{n-1}$, this is zero. □
We show next an expression for $\psi_n^*$.
Lemma 4.2.3. The superstar conjugates of the functions of the second kind satisfy
$$\psi_n^*(z)g(z) = \int D(z, t)[\phi_n^*(t)g(t) - \phi_n^*(z)g(z)]\,d\mathring{\mu}(t)\ \ldots, \qquad n > 0. \tag{4.23}$$
… The relations (4.23) then follow immediately from the corresponding ones in (4.22) by taking the superstar conjugate. In fact $g(z) = f_*(z)/B_n(z) \in \mathcal{L}_{n*}(\alpha_n)$. This proves the lemma. □
Note that as in (4.20), we can give an equivalent form of (4.23) as follows:
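$$-\psi_n^*(z)g(z) = \int D(t, z)[\phi_n^*(t)g(t) - \phi_n^*(z)g(z)]\,d\mathring{\mu}(t) - \int \phi_n^*(t)g(t)\,d\mathring{\mu}(t),$$
where, as we know, the last term is $\delta_{0n}$. For $g = 1/B_n$, this takes the even simpler form
$$-\psi_{n*}(z) = \int D(t, z)[\phi_{n*}(t) - \phi_{n*}(z)]\,d\mathring{\mu}(t) - \delta_{0n}.$$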
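The simpler form for $g = 1/B_n$ can also be reached directly from (4.20). A short sketch, assuming that $\overline{\phi_n(t)} = \phi_{n*}(t)$ for $t \in \partial\mathbb{O}$, that the measure is real, and using $D(t, z)_* = -D(t, z)$ together with $\langle 1, \phi_n\rangle_{\mathring{\mu}} = \delta_{0n}$ as noted earlier in this section:

```latex
\begin{aligned}
\psi_{n*}(z)
 &= \int D(t, z)_*\,[\phi_{n*}(t) - \phi_{n*}(z)]\,d\mathring{\mu}(t)
    + \overline{\int \phi_n(t)\,d\mathring{\mu}(t)} \\
 &= -\int D(t, z)\,[\phi_{n*}(t) - \phi_{n*}(z)]\,d\mathring{\mu}(t) + \delta_{0n}, \\
\text{hence}\quad -\psi_{n*}(z)
 &= \int D(t, z)\,[\phi_{n*}(t) - \phi_{n*}(z)]\,d\mathring{\mu}(t) - \delta_{0n} .
\end{aligned}
```

The substar is taken with respect to $z$ only, so it passes through the integral and acts on the kernel and on the bracket separately.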
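As in the polynomial case, these functions of the second kind satisfy the same recurrence relations as the orthogonal functions but with opposite signs for the parameters $\lambda_n$. Taking this sign outside the transition matrix of the recurrence gives the formula (4.24) as proved in the next theorem.
Theorem 4.2.4. For the functions of the second kind a recursion of the following form exists:
$$\begin{bmatrix}\psi_n(z)\\ -\psi_n^*(z)\end{bmatrix} = N_n\,\frac{\varpi_{n-1}(z)}{\varpi_n(z)}\begin{bmatrix}1 & \overline{\lambda}_n\\ \lambda_n & 1\end{bmatrix}\begin{bmatrix}\zeta_{n-1}(z) & 0\\ 0 & 1\end{bmatrix}\begin{bmatrix}\psi_{n-1}(z)\\ -\psi_{n-1}^*(z)\end{bmatrix} \tag{4.24}$$
where the recurrence matrix is exactly as in Theorem 4.1.1.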
where the recurrence matrix is exactly as in Theorem 4.1.1. Proof. As in the case of Theorem 4.1.1, it is sufficient to prove only one of the two associated recursions. The other one follows by applying the superstar
86
4. Recurrence and second kind functions
conjugate. We shall prove the second one. First note that by our previous lemmas we can write for n > 1 t,z)
Multiply from the left with
? 1
1].
Then the right-hand side becomes