
MATRIX METHODS: THEORY, ALGORITHMS, APPLICATIONS

Edited by

Vadim Olshevsky
University of Connecticut, Storrs, USA

Eugene Tyrtyshnikov
Institute of Numerical Mathematics, Russian Academy of Sciences, Moscow, Russia

World Scientific Publishing • 2008


To the memory of Gene Golub

PREFACE

Among books devoted to matrices, this one is unique in covering the whole of a triptych consisting of algebraic theory, algorithmic problems and numerical applications, all united by the essential use of, and the urge for further development of, matrix methods. This was the spirit of the 2nd International Conference on Matrix Methods and Operator Equations (23–27 July 2007, Moscow), hosted by the Institute of Numerical Mathematics of the Russian Academy of Sciences and organized by Dario Bini, Gene Golub, Alexander Guterman, Vadim Olshevsky, Stefano Serra-Capizzano, Gilbert Strang and Eugene Tyrtyshnikov. Matrix methods provide the key to many problems in pure and applied mathematics. However, it is more usual that linear algebra theory, numerical algorithms and matrices in FEM/BEM applications live as if in three separate worlds. In this book, maybe for the first time, they are put together as one entity, as they were at the Moscow meeting, where the algebraic part was personified by Hans Schneider, algorithms by Gene Golub, and applications by Guri Marchuk. All the topics came together in plenary sessions and were specialized in three sections, which give their names to the three chapters of this book. Among the authors of this book are several top-class experts in numerical mathematics, matrix analysis and linear algebra applications, including Dario Bini, Walter Gander, Alexander Guterman, Wolfgang Hackbusch, Khakim Ikramov, Valery Il'in, Igor Kaporin, Boris Khoromskij, Victor Pan, Stefano Serra-Capizzano, Reinhold Schneider, Vladimir Sergeichuk, Harald Wimmer and others. The book assumes a good basic knowledge of linear algebra and a general mathematical background. Besides professionals, it appeals as well to a wider audience, in academia and industry, of all those who consider using matrix methods in their work or who major in other fields of mathematics, engineering and the sciences.
We are pleased to acknowledge that Alexander Guterman engaged in the thorough editing of the "Algebra and Matrices" papers, Maxim Olshanskii and Yuri Vassilevski invested their time and expertise in the "Matrices and Applications" part, and Sergei Goreinov committed himself to the enormous technical necessities of setting the texts into pages. It is much appreciated that the Moscow meeting that gave a base to this book was supported by the Russian Foundation for Basic Research, the Russian Academy of Sciences, the International Foundation for Technology and Investments, Neurok Techsoft, and the University of Insubria (Como, Italy).


The soul of the meeting was Gene Golub, who lent a charming "Golub dimension" to the three main axes of the conference topics. This book now comes out in his everlasting and eminently bright memory, with immense gratitude.

Vadim Olshevsky
Eugene Tyrtyshnikov

Table of Contents

Preface . . . . . . . . . . v

Algebra and Matrices

Operators Preserving Primitivity for Matrix Pairs . . . . . . . . . . 2
L. B. Beasley (Utah State University), A. E. Guterman (Moscow State University)

Decompositions of quaternions and their matrix equivalents . . . . . . . . . . 20
D. Janovska (Institute of Chemical Technology), G. Opfer (University of Hamburg)

Sensitivity analysis of Hamiltonian and reversible systems prone to dissipation-induced instabilities . . . . . . . . . . 31
O. N. Kirillov (Moscow State University)

Block triangular miniversal deformations of matrices and matrix pencils . . . . . . . . . . 69
L. Klimenko (Computing Centre of Ministry of Labour and Social Policy of Ukraine), V. V. Sergeichuk (Kiev Institute of Mathematics)

Determining the Schein rank of Boolean matrices . . . . . . . . . . 85
E. E. Marenich (Murmansk State Pedagogical University)

Lattices of matrix rows and matrix columns. Lattices of invariant column eigenvectors . . . . . . . . . . 104
V. Marenich (Murmansk State Pedagogical University)

Matrix algebras and their length . . . . . . . . . . 116
O. V. Markova (Moscow State University)

On a New Class of Singular Nonsymmetric Matrices with Nonnegative Integer Spectra . . . . . . . . . . 140
T. Nahtman (University of Tartu), D. von Rosen (Swedish University of Agricultural Sciences)

Reduction of a set of matrices over a principal ideal domain to the Smith normal forms by means of the same one-sided transformations . . . . . . . . . . 166
V. M. Prokip (Institute for Applied Problems of Mechanics and Mathematics)


Matrices and Algorithms

Nonsymmetric algebraic Riccati equations associated with an M-matrix: recent advances and algorithms . . . . . . . . . . 176
D. Bini (University of Pisa), B. Iannazzo (University of Insubria), B. Meini (University of Pisa), F. Poloni (Scuola Normale Superiore of Pisa)

A generalized conjugate direction method for nonsymmetric large ill-conditioned linear systems . . . . . . . . . . 210
E. R. Boudinov (FORTIS Bank, Brussels), A. I. Manevich (Dnepropetrovsk National University)

There exist normal Hankel (φ, ψ)-circulants of any order n . . . . . . . . . . 222
V. Chugunov (Institute of Numerical Math. RAS), Kh. Ikramov (Moscow State University)

On the Treatment of Boundary Artifacts in Image Restoration by reflection and/or anti-reflection . . . . . . . . . . 227
M. Donatelli (University of Insubria), S. Serra-Capizzano (University of Insubria)

Zeros of Determinants of λ-Matrices . . . . . . . . . . 238
W. Gander (ETH, Zurich)

How to find a good submatrix . . . . . . . . . . 247
S. Goreinov (INM RAS), I. Oseledets (INM RAS), D. Savostyanov (INM RAS), E. Tyrtyshnikov (INM RAS), N. Zamarashkin (INM RAS)

Conjugate and Semi-Conjugate Direction Methods with Preconditioning Projectors . . . . . . . . . . 257
V. Il'in (Novosibirsk Institute of Comp. Math.)

Some Relationships between Optimal Preconditioner and Superoptimal Preconditioner . . . . . . . . . . 266
J.-B. Chen (Shanghai Maritime University), X.-Q. Jin (University of Macau), Y.-M. Wei (Fudan University), Zh.-L. Xu (Shanghai Maritime University)

Scaling, Preconditioning, and Superlinear Convergence in GMRES-type iterations . . . . . . . . . . 273
I. Kaporin (Computing Center of Russian Academy of Sciences)


Toeplitz and Toeplitz-block-Toeplitz matrices and their correlation with syzygies of polynomials . . . . . . . . . . 296
H. Khalil (Institute Camille Jordan), B. Mourrain (INRIA), M. Schatzman (Institute Camille Jordan)

Concepts of Data-Sparse Tensor-Product Approximation in Many-Particle Modelling . . . . . . . . . . 313
H.-J. Flad (TU Berlin), W. Hackbusch (Max-Planck-Institute, Leipzig), B. Khoromskij (Max-Planck-Institute, Leipzig), R. Schneider (TU Berlin)

Separation of variables in nonlinear Fermi equation . . . . . . . . . . 348
Yu. I. Kuznetsov (Novosibirsk Institute of Comp. Math.)

Faster Multipoint Polynomial Evaluation via Structured Matrices . . . . . . . . . . 354
B. Murphy (Lehman College), R. E. Rosholt (Lehman College)

Testing Pivoting Policies in Gaussian Elimination . . . . . . . . . . 357
B. Murphy (Lehman College), G. Qian (University of New York), R. E. Rosholt (Lehman College), A.-L. Zheng (University of New York), S. Ngnosse (University of New York), I. Taj-Eddin (University of New York)

Newton's Iteration for Matrix Inversion, Advances and Extensions . . . . . . . . . . 364
V. Y. Pan (Lehman College)

Truncated decompositions and filtering methods with Reflective/Anti-Reflective boundary conditions: a comparison . . . . . . . . . . 382
C. Tablino Possio (University of Milano Bicocca)

Discrete-time stability of a class of hermitian polynomial matrices with positive semidefinite coefficients . . . . . . . . . . 409
H. Wimmer (University of Würzburg)

Matrices and Applications

Splitting algorithm for solving mixed variational inequalities with inversely strongly monotone operators . . . . . . . . . . 416
I. Badriev (Kazan State University), O. Zadvornov (Kazan State University)

Multilevel Algorithm for Graph Partitioning . . . . . . . . . . 434
N. Bochkarev (Neurok), O. Diyankov (Neurok), V. Pravilnikov (Neurok)


2D-extension of Singular Spectrum Analysis: algorithm and elements of theory . . . . . . . . . . 450
N. E. Golyandina (St. Petersburg State University), K. D. Usevich (St. Petersburg State University)

Application of Radon transform for fast solution of boundary value problems for elliptic PDE in domains with complicated geometry . . . . . . . . . . 475
A. I. Grebennikov (Autonomous University of Puebla)

Application of a multigrid method to solving diffusion-type equations . . . . . . . . . . 483
M. E. Ladonkina (Institute for Math. Modelling RAS), O. Yu. Milukova (Institute for Math. Modelling RAS), V. F. Tishkin (Institute for Math. Modelling RAS)

Monotone matrices and finite volume schemes for diffusion problems preserving non-negativity of solution . . . . . . . . . . 501
I. Kapyrin (Institute of Numerical Math. RAS)

Sparse Approximation of FEM Matrix for Sheet Current Integro-Differential Equation . . . . . . . . . . 511
M. Khapaev (Moscow State University), M. Kupriyanov (Nuclear Physics Institute)

The method of magnetic field computation in presence of an ideal conductive multiconnected surface by using the integro-differential equation of the first kind . . . . . . . . . . 524
T. Kochubey (Southern Scientific Centre RAS), V. I. Astakhov (Southern Scientific Centre RAS)

Spectral model order reduction preserving passivity for large multiport RCLM networks . . . . . . . . . . 534
Yu. M. Nechepurenko (Institute of Numerical Math. RAS), A. S. Potyagalova (Cadence), I. A. Karaseva (Moscow Institute of Physics and Technology)

New Smoothers in Multigrid Methods for Strongly Nonsymmetric Linear Systems . . . . . . . . . . 540
G. Muratova (Southern Federal University), E. Andreeva (Southern Federal University)

Operator equations for eddy currents on singular carriers . . . . . . . . . . 547
J. Naumenko (Southern Scientific Centre RAS)

Matrix approach to modelling of polarized radiation transfer in heterogeneous systems . . . . . . . . . . 558


T. A. Sushkevich (Keldysh Institute for Applied Mathematics), S. A. Strelkov (Keldysh Institute for Applied Mathematics), S. V. Maksakova (Keldysh Institute for Applied Mathematics)

The Method of Regularization of Tikhonov Based on Augmented Systems . . . . . . . . . . 580
A. I. Zhdanov (Samara State Aerospace University), T. G. Parchaikina (Samara State Aerospace University)

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587


ALGEBRA AND MATRICES

Operators Preserving Primitivity for Matrix Pairs

LeRoy B. Beasley(1) and Alexander E. Guterman(2),⋆

(1) Department of Mathematics and Statistics, Utah State University, Logan, Utah 84322-4125, USA, [email protected]
(2) Department of Algebra, Faculty of Mathematics and Mechanics, Moscow State University, Moscow, 119991, GSP-1, Russia, [email protected]

1 Introduction

A nonnegative matrix is called primitive if some power of it has only positive entries; equivalently, if it is irreducible and its spectral radius is the only eigenvalue of maximal modulus; equivalently, if the greatest common divisor of the lengths of all circuits in the associated directed graph is equal to 1. An alternative definition of primitivity arises in the asymptotic analysis of homogeneous discrete-time positive systems of the form

x(t + 1) = Ax(t),   t = 0, 1, . . . ,   (1)

where a non-negative vector x(0) represents the initial state. In this context the primitivity of A can be equivalently restated as the property that any positive initial condition x(0) produces a state evolution which becomes strictly positive within a finite number of steps. Positive two-dimensional (2D) systems are described by the following equation, see [11]:

x(h + 1, k + 1) = Ax(h, k + 1) + Bx(h + 1, k),   h, k ∈ Z, h + k > 0,   (2)

where A and B are n × n nonnegative matrices and the initial conditions x(h, −h), h ∈ Z, are nonnegative n × 1 vectors. Positive discrete homogeneous 2D dynamical systems are used to model diffusion processes, water pollution, etc., see [6, 7]. An entry of the vector x(h, k) typically represents a quantity, such as pressure, concentration or density, at a particular site along a stream. It can be seen that at each time-step the conditions of a site are determined by its previous conditions and the conditions of the site directly upstream from it, see [7, 11]. To investigate systems of type (2), we need the following concept:

Definition 1. Let A, B ∈ Mn(Z), and let h, k be non-negative integers. The (h, k)-Hurwitz product, denoted by (A, B)^(h,k), is the sum of all matrices which are products of h copies of A and k copies of B.

⋆ The second author wishes to thank the grants RFBR 05-01-01048, NSh-5666.2006.1 and MK-2718.2007.1 for partial financial support.

Example 1. (A, B)^(1,0) = A and

(A, B)^(2,2) = A^2B^2 + ABAB + AB^2A + BA^2B + BABA + B^2A^2.

In general the Hurwitz product satisfies the following recurrence relations:

(A, B)^(h,0) = A^h,   (A, B)^(0,k) = B^k,
(A, B)^(h,k) = A (A, B)^(h−1,k) + B (A, B)^(h,k−1)   for h, k ≥ 1.

It can be directly checked, see [11], that the solution of (2) can be represented in the following way:

x(h, k) = Σ_{s=0}^{h+k} (A, B)^(s,h+k−s) x(h − s, s − h) = Σ_{s=0}^{h+k} (A, B)^(h+k−s,s) x(s − k, k − s).

Thus the Hurwitz products (A, B)^(h,k) with h + k = t together with the initial conditions determine the state after t time-steps. It is natural to ask for necessary and sufficient conditions on the matrix pair (A, B) in order that the solutions of (2) are eventually (i.e., for all (h, k) with h + k sufficiently large) strictly positive for each appropriate sequence of initial values. As for the system (1), where the analogous question is answered in terms of primitivity, in this case primitivity for matrix pairs is needed, which means the existence of integers h, k, h + k > 0, such that the Hurwitz product (A, B)^(h,k) is a positive matrix.

Definition 2. The exponent of the primitive pair (A, B) is the minimum value of h + k taken over all pairs (h, k) such that (A, B)^(h,k) is positive.

An important issue in dealing with primitive matrices or matrix pairs is to find the complete list of matrix operators which map primitive matrices to primitive matrices, or primitive matrix pairs to primitive matrix pairs. If such transformations exist, then they allow us to simplify the system without losing its main property, namely, primitivity. In this paper we deal with such transformations. Following Frobenius, Schur and Dieudonné, many authors have studied the problem of determining the maps on the n × n matrix algebra Mn(F) over a field F that leave certain matrix relations, subsets, or properties invariant. For a survey of problems and results of this type see [9, 10]. The notion of primitivity is related to nonnegative matrices, i.e., matrices with entries in the semiring of nonnegative real numbers. In the last decades much attention has been paid to preserver problems for matrices over various semirings, where completely different techniques are necessary to obtain a classification of operators with certain preserving properties; see [10, Section 9.1] and the references therein for more details. The notion of a semiring can be introduced as follows.


Definition 3. A semiring S consists of a set S and two binary operations, addition and multiplication, such that:
– S is an Abelian monoid under addition (identity denoted by 0);
– S is a semigroup under multiplication (identity, if any, denoted by 1);
– multiplication is distributive over addition on both sides;
– s0 = 0s = 0 for all s ∈ S.

In this paper we will always assume that there is a multiplicative identity 1 in S which is different from 0.

We need the following special class of semirings:

Definition 4. A semiring is called antinegative if the zero element is the only element with an additive inverse.

Standard examples of semirings which are not rings are antinegative; these include the non-negative reals and integers, max-algebras, and Boolean algebras.

Definition 5. The binary Boolean semiring, B, is the set {0, 1} with the operations:

0 + 0 = 0        0 · 0 = 0
0 + 1 = 1 + 0 = 1        0 · 1 = 1 · 0 = 0
1 + 1 = 1        1 · 1 = 1.

We will not use the term "binary" in the sequel.
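As a concrete illustration (a minimal sketch in Python; the class name is ours), B can be modelled with bitwise operations, and the antinegativity of Definition 4 is visible directly: 1 has no additive inverse, since 1 + x = 1 for every x.

```python
class B:
    """The binary Boolean semiring {0, 1}: a + b = a OR b, a * b = a AND b."""
    def __init__(self, v):
        self.v = int(bool(v))
    def __add__(self, other):
        return B(self.v | other.v)   # in particular 1 + 1 = 1
    def __mul__(self, other):
        return B(self.v & other.v)
    def __eq__(self, other):
        return self.v == other.v
    def __repr__(self):
        return "B(%d)" % self.v
```

In particular `B(1) + B(1) == B(1)`, and no x satisfies `B(1) + x == B(0)`, so the only element with an additive inverse is 0.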

Linear operators over certain antinegative semirings without zero divisors that strongly preserve primitivity were characterized by L. B. Beasley and N. J. Pullman in [3, 4]. Let us note that linear transformations T : M(S) → M(S) preserving primitive matrix pairs obviously preserve primitivity, and so are classified in [3, 4]. To see this it is sufficient to consider primitive matrix pairs of the form (A, O). Their images are primitive matrix pairs of the form (T(A), O); hence T(A) is primitive. However, if we consider operators on Mn^2(B) = Mn(B) × Mn(B), then there is no easy way to reduce the problem of characterizing the operators preserving primitive matrix pairs to the characterization of certain transformations in each component. In this paper we investigate the structure of surjective additive transformations on the Cartesian product Mn^2(S) preserving primitive matrix pairs. It turns out that for the characterization of these transformations we have to apply different and more involved techniques and ideas, such as primitive assignments, cycle matrices, etc. Our paper is organized as follows: in Section 2 we collect some basic facts, definitions and notation; in Section 3 we characterize the surjective additive transformations T : Mn^2(B) → Mn^2(B) preserving the set of primitive matrix pairs; in Section 4 we extend this result to matrices over an arbitrary antinegative semiring without zero divisors. Here Mm,n(B) denotes the set of m × n matrices with entries from the Boolean semiring B.

2 Preliminaries

In this paper, unless otherwise stated, S will denote any antinegative semiring without zero divisors and Mn(S) will denote the n × n matrices with entries from S. Further, we denote by Mn^2(S) the Cartesian product of Mn(S) with itself, Mn(S) × Mn(S). The notions of primitivity and exponent for square matrices are classical.

Definition 6. A matrix A ∈ Mn(S) is primitive if there is an integer k > 0 such that all entries of A^k are non-zero. In the case A is primitive, the exponent of A is the smallest such k.

A classical example of a primitive matrix is the so-called Wielandt matrix.

Definition 7. The Wielandt matrix Wn ∈ Mn(S) is the matrix with ones on the superdiagonal, i.e., in the positions (i, i + 1) for i = 1, . . . , n − 1, and with ones in the positions (n − 1, 1) and (n, 1). We also consider the following primitive matrix W′n ∈ Mn(S), with ones in the positions (i, i + 1) for i = 1, . . . , n − 1 and in the positions (n, 1) and (1, 1), i.e., the full cycle 1 → 2 → · · · → n → 1 together with a loop at the vertex 1.

These matrices are primitive, and the Wielandt matrix Wn is the matrix with the maximal possible exponent, see [8, Chapter 8.5].

Definition 8.

An operator T : Mm,n(S) → Mm,n(S) is called linear if it is additive and T(αX) = αT(X) for all X ∈ Mm,n(S), α ∈ S.
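Both matrices, as reconstructed above from their cell structure, can be checked numerically. The helper names below are ours: Wn and W′n are built from their nonzero positions, and the exponent of Definition 6 is computed by repeated Boolean powering. For the Wielandt matrix the exponent attains the classical maximum n² − 2n + 2.

```python
def wielandt(n):
    # W_n: ones on the superdiagonal plus ones in positions (n-1, 1) and
    # (n, 1) (1-based), following the cell structure described above.
    W = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        W[i][i + 1] = 1
    W[n - 2][0] = 1
    W[n - 1][0] = 1
    return W

def wielandt_prime(n):
    # W'_n: the full cycle 1 -> 2 -> ... -> n -> 1 plus a loop at vertex 1.
    W = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        W[i][i + 1] = 1
    W[n - 1][0] = 1
    W[0][0] = 1
    return W

def exponent(A, max_power=500):
    # Smallest k with A^k entrywise positive (Boolean arithmetic); None if
    # no such k <= max_power exists (i.e. A is not detected as primitive).
    n = len(A)
    P = A
    for k in range(1, max_power + 1):
        if all(all(row) for row in P):
            return k
        P = [[int(any(P[i][q] and A[q][j] for q in range(n)))
              for j in range(n)] for i in range(n)]
    return None
```

For n = 3 this gives exponent 5 = 3² − 2·3 + 2 for W3, while W′3 already has a smaller exponent.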

Definition 9. We say that an operator T : Mn(S) → Mn(S) preserves (strongly preserves) primitivity if for a primitive matrix A the matrix T(A) is also primitive (A is primitive if and only if T(A) is primitive).

Definition 10. A pair (A, B) ∈ Mn^2(S) is called primitive if there exist nonnegative integers h, k such that the matrix (A, B)^(h,k) is positive. In this case we say that the exponent of (A, B) is (h, k), where h + k is the smallest integer such that (A, B)^(h,k) is positive, and if there is (a, b) such that a + b = h + k and (A, B)^(a,b) is positive then h ≥ a.


Example 2. The notion of primitive pairs generalizes the notion of primitivity. Indeed, pairs (A, B) with k = 0 and pairs (A, O) are primitive if and only if A is primitive. In particular, for any primitive matrix A ∈ Mn(S) the matrix pairs (A, O), (O, A), (A, A) are also primitive. For example, (Wn, O) and (O, Wn) are primitive. We note that there are primitive pairs (A, B) such that neither A nor B is primitive, for example

A := En,1,   B := ( 1 1 . . . 1 )
                  ( 0 1 . . . 1 )
                  ( . . ˙.˙  . )
                  ( 0 0 . . . 1 ),

the upper triangular matrix of all ones.

We will use the notion of irreducible matrices, and below we present the following two equivalent definitions of irreducibility; see [5] for details:

Definition 11. A matrix A ∈ Mn(S) is called irreducible if n = 1 or the sum of the first n powers of A has no zero entries. A is reducible if it is not irreducible. Equivalently, a matrix A is reducible if there is a permutation matrix P such that

P^t A P = ( A1   Os,n−s )
          ( A2   A3     ).

If A is not reducible, it is irreducible.
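Definition 11 translates directly into code. A small sketch (our helper name; Boolean 0/1 arithmetic as elsewhere in these notes): sum the first n powers of A and check that no entry is zero.

```python
def irreducible(A):
    """Definition 11 over the Boolean semiring: n = 1, or the sum
    A + A^2 + ... + A^n has no zero entries."""
    n = len(A)
    if n == 1:
        return True
    mul = lambda X, Y: [[int(any(X[i][q] and Y[q][j] for q in range(n)))
                         for j in range(n)] for i in range(n)]
    P = A                                  # current power A^k
    S = [row[:] for row in A]              # running Boolean sum
    for _ in range(n - 1):
        P = mul(P, A)
        S = [[s | p for s, p in zip(rs, rp)] for rs, rp in zip(S, P)]
    return all(all(row) for row in S)
```

The 2 × 2 matrix [[0, 1], [1, 0]] is irreducible but not primitive (its powers alternate between itself and the identity), which separates Definition 11 from Definition 6.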

Definition 12. An operator T : Mn^2(S) → Mn^2(S) preserves primitive pairs if for any primitive pair (A1, A2) we have that T(A1, A2) is also primitive.

In order to describe the final form of our operators we need the following notions.

Definition 13. The matrix X ◦ Y denotes the Hadamard, or Schur, product, i.e., the (i, j) entry of X ◦ Y is xi,j yi,j.

Definition 14. An operator T : Mm,n(S) → Mm,n(S) is called a (U, V)-operator if there exist invertible matrices U and V of appropriate orders such that T(X) = UXV for all X ∈ Mm,n(S) or, if m = n, T(X) = UX^tV for all X ∈ Mm,n(S), where X^t denotes the transpose of X.

Definition 15. An operator T is called a (P, Q, B)-operator if there exist permutation matrices P and Q, and a matrix B with no zero entries, such that T(X) = P(X ◦ B)Q for all X ∈ Mm,n(S) or, if m = n, T(X) = P(X ◦ B)^tQ for all X ∈ Mm,n(S). A (P, Q, B)-operator is called a (P, Q)-operator if B = J, the matrix of all ones.

Definition 16. A line of a matrix A is a row or a column of A.

Definition 17. We say that the matrix A dominates the matrix B if and only if bi,j ≠ 0 implies ai,j ≠ 0, and we write A ≥ B or B ≤ A.


The matrix In is the n × n identity matrix, Jm,n is the m × n matrix of all ones, and Om,n is the m × n zero matrix. We omit the subscripts when the order is obvious from the context, and write I, J, and O, respectively. The matrix Ei,j, called a cell, denotes the matrix with exactly one nonzero entry, that being a one in the (i, j) position. Let Ri denote the matrix whose ith row is all ones and which is zero elsewhere, and let Cj denote the matrix whose jth column is all ones and which is zero elsewhere. We let |A| denote the number of nonzero entries in the matrix A. We denote by A[i, j|k, l] the 2 × 2 submatrix of A which lies on the intersection of the ith and jth rows with the kth and lth columns. A monomial matrix is a matrix which has exactly one non-zero entry in each row and each column.

3 Matrices over the Binary Boolean Semiring

The following lemma allows us to construct non-primitive matrix pairs:

Lemma 1. Let S be an antinegative semiring without zero divisors, let (A, B) ∈ Mn^2(S), and assume that at least one of the following two conditions is satisfied:
1) |A| + |B| < n + 1;
2) A and B together contain at most n − 1 off-diagonal cells.
Then the pair (A, B) is not primitive.

Proof. 1. Let K be an irreducible matrix. We write K = D + P, where D is a certain diagonal matrix and P is a matrix with zero diagonal. Let Pi,j denote the permutation matrix which corresponds to the transposition (i, j), i.e., Pi,j = I − Ei,i − Ej,j + Ei,j + Ej,i. If K has a row or column with no nonzero off-diagonal entry, say the ith row, then

P1,i K P1,i = ( α    O1,n−1 )
              ( A2   A3     ),

so that K is reducible. Thus K must have a nonzero off-diagonal entry in each row and each column. Hence |P| ≥ n. Further, if K is irreducible and |P| = n, then P is a monomial matrix.

2. Note that the expansion of (A + B)^(h+k) contains all the terms found in the (h, k)-Hurwitz product of (A, B). So, if (A, B) is a primitive pair in Mn^2(S) with exponent (h, k), then due to the antinegativity of S we have that (A + B)^(h+k) has all entries nonzero, that is, A + B is primitive.

3. Assume to the contrary that (A, B) is a primitive pair. Then by Item 2 the matrix A + B is primitive. Thus A + B is irreducible. Hence by Item 1 the matrix A + B has at least n nonzero off-diagonal entries, and if A + B has exactly n nonzero off-diagonal entries then (A + B) ◦ (J \ I) is a monomial matrix. Since any power of a monomial matrix is a monomial matrix, we must have that A + B has a nonzero diagonal entry. Since |A| + |B| ≥ |A + B|, we have that |A| + |B| ≥ n + 1, and together A and B have at least n nonzero off-diagonal entries. This concludes the proof. ⊓⊔
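Lemma 1 can be sanity-checked exhaustively for small n over B. The test harness below is ours, not part of the proof: it enumerates every pair (A, B) of 2 × 2 Boolean matrices with |A| + |B| < n + 1 = 3 and confirms that none admits a positive Hurwitz product up to a cutoff degree. This is only a bounded numerical check of the lemma's claim, of course, not a proof.

```python
from itertools import product

def primitive_pair_upto(A, B, max_total):
    # True iff some (h,k)-Hurwitz product with h + k <= max_total is positive.
    n = len(A)
    mul = lambda X, Y: tuple(tuple(int(any(X[i][q] and Y[q][j] for q in range(n)))
                                   for j in range(n)) for i in range(n))
    add = lambda X, Y: tuple(tuple(x | y for x, y in zip(rx, ry))
                             for rx, ry in zip(X, Y))
    I = tuple(tuple(int(i == j) for j in range(n)) for i in range(n))
    layer = {(0, 0): I}
    for t in range(1, max_total + 1):
        nxt = {}
        for h in range(t + 1):
            k = t - h
            terms = ([mul(A, layer[(h - 1, k)])] if h else []) + \
                    ([mul(B, layer[(h, k - 1)])] if k else [])
            M = terms[0]
            for T in terms[1:]:
                M = add(M, T)
            if all(all(row) for row in M):
                return True
            nxt[(h, k)] = M
        layer = nxt
    return False

def check_lemma1(n=2, max_total=8):
    # No pair with fewer than n + 1 nonzero cells in A and B together
    # should admit a positive Hurwitz product.
    for bits in product((0, 1), repeat=2 * n * n):
        if sum(bits) >= n + 1:
            continue
        A = tuple(tuple(bits[i * n:(i + 1) * n]) for i in range(n))
        B = tuple(tuple(bits[n * n + i * n: n * n + (i + 1) * n]) for i in range(n))
        if primitive_pair_upto(A, B, max_total):
            return False
    return True
```

`check_lemma1()` returns True, consistent with condition 1) of the lemma for n = 2.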


Definition 18. A graph is a full-cycle graph if it is a vertex permutation of the cycle 1 → 2 → · · · → (n − 1) → n → 1. A (0, 1) full-cycle matrix is the adjacency matrix of a full-cycle graph. If a matrix A with exactly n nonzero entries dominates a full-cycle (0, 1)-matrix, we also say that A is a full-cycle matrix.

Corollary 1. Any primitive matrix A ∈ Mn(B) with exactly n + 1 non-zero cells, one of which is a diagonal cell, dominates a full-cycle matrix.

Proof. It follows from the proof of Lemma 1, item 1, that A dominates a permutation matrix P. Assume that P is not a full-cycle matrix. Since |P| = n, it follows that the graph of P is disconnected. Thus the graph of A is disconnected. Hence A is not primitive, a contradiction. ⊓⊔

Lemma 2. Let T : Mm,n(B) → Mm,n(B) be a surjective additive operator. Then T(O) = O and, hence, T is a bijective linear operator.

Proof. By additivity we have T(A) = T(A + O) = T(A) + T(O) for any A. By the definition of addition in B it follows that T(O) ≤ T(A) for any A. Since T is surjective, for any i, 1 ≤ i ≤ m, and j, 1 ≤ j ≤ n, there exists Ai,j ∈ Mm,n(B) such that T(Ai,j) = Ei,j. Thus for all i, j we have that T(O) ≤ T(Ai,j) = Ei,j, i.e., T(O) = O. Let us check the linearity of T now. Let λ ∈ B, X ∈ Mm,n(B). If λ = 1 then T(λX) = T(X) = λT(X). If λ = 0 then T(λX) = T(O) = O = λT(X). The bijectivity of T follows from the fact that any surjective operator on a finite set is injective, and Mm,n(B) is finite. ⊓⊔

Definition 19. For matrices A = [ai,j], B = [bi,j] ∈ Mn(B) we denote by [A|B] ∈ Mn,2n(B) the concatenation of the matrices A and B, i.e., the matrix whose ith row is (ai,1, . . . , ai,n, bi,1, . . . , bi,n) for all i = 1, . . . , n.

Definition 20. Let T : Mn^2(B) → Mn^2(B) be a surjective additive operator. Define the operator T* : Mn,2n → Mn,2n by T*([A|B]) = [C|D] if T(A, B) = (C, D).

Lemma 3. Let T : Mn^2(B) → Mn^2(B) be a surjective additive operator; then the operator T* is surjective and additive.

Proof. This follows from the bijection between the B-semimodules Mn^2(B) and Mn,2n(B). ⊓⊔

Definition 21. Let D = {D | D is a diagonal matrix in Mn(B)}. Define the set D^2 = D × D = {(A, B) | A, B ∈ D}.


Definition 22. Let σ : {1, 2, · · · , n} → {1, 2, · · · , n} be a bijection (permutation). We define the permutation matrix Pσ corresponding to σ by the formula Pσ = Σ_{i=1}^{n} Ei,σ(i).

We note that in this case Pσ^t Ei,j Pσ = Eσ(i),σ(j) for all i, j ∈ {1, 2, · · · , n}. In the next lemma we show how to complete pairs of cells to a matrix which is permutation-similar to either Wn or W′n.

Lemma 4. For any two pairs of distinct indices (i, j), (k, l) such that (i, j) ≠ (l, k) and either i ≠ j or k ≠ l or both, there exist a permutation matrix P and n − 1 cells F1, . . . , Fn−1 such that Ei,j + Ek,l + F1 + · · · + Fn−1 = P Wn P^t or P W′n P^t.

Proof. Let i, j, k, l be four distinct integers in {1, 2, · · · , n}. There are five cases to consider:

1. (i, i), (i, l). Let σ be any permutation such that σ(i) = n and σ(l) = 1, and let Fq = E_{σ^{−1}(q),σ^{−1}(q+1)}, q = 1, · · · , n − 1. Then Pσ^t (Ei,i + Ei,l + Σ_{q=1}^{n−1} Fq) Pσ = W′n.
2. (i, i), (k, l). In this case, let σ be any permutation such that σ(i) = 2, σ(k) = n, and σ(l) = 1, and let Fq = E_{σ^{−1}(q),σ^{−1}(q+1)}, q = 1, · · · , n − 1. Then Pσ^t (Ei,i + Ek,l + Σ_{q=1}^{n−1} Fq) Pσ = W′n.
3. (i, j), (i, l). In this case, let σ be any permutation such that σ(i) = n − 1, σ(j) = n, and σ(l) = 1. Let F1 = Ej,l and Fq = E_{σ^{−1}(q−1),σ^{−1}(q)} for 2 ≤ q ≤ n − 1. Then Pσ^t (Ei,j + Ek,l + Σ_{q=1}^{n−1} Fq) Pσ = Wn.
4. (i, j), (k, j). In this case, let σ be any permutation such that σ(i) = n − 1, σ(k) = n, and σ(j) = 1. Let Fq = E_{σ^{−1}(q),σ^{−1}(q+1)} for 2 ≤ q ≤ n − 1. Then Pσ^t (Ei,j + Ek,l + Σ_{q=1}^{n−1} Fq) Pσ = Wn.
5. (i, j), (k, l). In this case, let σ be any permutation such that σ(i) = 1, σ(j) = 2, σ(k) = 3, and σ(l) = 4. Let F1 = Ej,k, Fq = E_{σ^{−1}(q+2),σ^{−1}(q+3)} for 2 ≤ q ≤ n − 3, Fn−2 = E_{σ^{−1}(n),i}, and Fn−1 = E_{σ^{−1}(n−1),i}. Then Pσ^t (Ei,j + Ek,l + Σ_{q=1}^{n−1} Fq) Pσ = Wn. ⊓⊔

Definition 23. Let E = {Ei,j | 1 ≤ i, j ≤ n} be the set of all cells. An assignment on E is a mapping η : E → {0, 1}.

Definition 24. We say that η is nontrivial if η is onto.

Definition 25. Let A ∈ Mn(B); we say that η is A-nontrivial if η restricted to the set of cells {Ei,j | A ≥ Ei,j} is onto. That is, η is A-nontrivial if the restriction of η to the cells of A is onto.


Definition 26. Further, if A is primitive we say that η is A-primitive if

Σ_{Ei,j ≤ A, η(Ei,j)=0} (Ei,j, O)  +  Σ_{Ei,j ≤ A, η(Ei,j)=1} (O, Ei,j)

is a primitive pair.

Definition 27. If an assignment η is both A-nontrivial and A-primitive then we say that η is A-nontrivial-primitive.
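Definitions 26 and 27 can be made concrete over B. In the sketch below (our helper names; the pair-primitivity test is a bounded Hurwitz search), an assignment on the cells of W′3 splits the matrix into a pair, and the chosen coloring turns out to be W′3-nontrivial-primitive; coloring the loop alone against the whole cycle, by contrast, gives a non-primitive pair.

```python
def split_by_assignment(A, eta):
    """Split the cells of A into (A0, A1) by the assignment eta: (i,j) -> 0/1."""
    n = len(A)
    A0 = [[0] * n for _ in range(n)]
    A1 = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if A[i][j]:
                (A1 if eta[(i, j)] else A0)[i][j] = 1
    return A0, A1

def primitive_pair(A, B, max_total=20):
    # Bounded search for an all-positive (h,k)-Hurwitz product.
    n = len(A)
    mul = lambda X, Y: [[int(any(X[i][q] and Y[q][j] for q in range(n)))
                         for j in range(n)] for i in range(n)]
    add = lambda X, Y: [[x | y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    layer = {(0, 0): [[int(i == j) for j in range(n)] for i in range(n)]}
    for t in range(1, max_total + 1):
        nxt = {}
        for h in range(t + 1):
            k = t - h
            terms = ([mul(A, layer[(h - 1, k)])] if h else []) + \
                    ([mul(B, layer[(h, k - 1)])] if k else [])
            M = terms[0]
            for T in terms[1:]:
                M = add(M, T)
            if all(all(row) for row in M):
                return True
            nxt[(h, k)] = M
        layer = nxt
    return False

# W'_3: loop at vertex 1 plus the full 3-cycle (0-based cell coordinates).
W3p = [[1, 1, 0], [0, 0, 1], [1, 0, 0]]
# Color the cycle cell (2, 0) with 1 and the remaining cells with 0.
eta = {(0, 0): 0, (0, 1): 0, (1, 2): 0, (2, 0): 1}
A0, A1 = split_by_assignment(W3p, eta)
```

Here both colors occur on cells of W′3 and `primitive_pair(A0, A1)` returns True, so this η is W′3-nontrivial-primitive.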

Remark 1. An assignment is a coloring of the edges of the full graph in two colors. An assignment is nontrivial if both colors are used; it is A-nontrivial if both colors are used on the graph of the matrix A. An assignment is A-primitive if, taking the sums of the matrix units corresponding to the edges of A in the two different colors, we get a primitive matrix pair.

olors we get a primitive matrix pair. Lemma 5. Let (i, j, α), (k, l, β) be two triples su h that 1 6 i, j, k, l 6 n, k 6= l, α, β ∈ {0, 1}, and (i, j) 6= (k, l). Let S = {η|η(Ei,j ) = α, η(Ek,l ) = β}. Then, S ontains a Wn -nontrivial-primitive assignment and S ontains a Wn′ -

nontrivial-primitive assignment.

Proof. Sin e every primitive matrix has a primitive assignment [2, Theorem 2.1℄,

the matri es Wn and Wn′ have primitive assignments. Hen e the lemma is trivial if Wn 6> Ei,j + Ek,l and Wn′ 6> Ei,j + Ek,l . Thus we assume that Wn > Ei,j + Ek,l or Wn′ > Ei,j + Ek,l . We shall de ne η to ful ll the requirements in ea h ase. Case 1. Wn′ > Ei,j + Ek,l . Let us show that in this ase there exists a Wn′ nontrivial-primitive assignment η su h that η(Ei,j ) = α and η(Ek,l ) = β. If i = j = 1 and l ≡ k + 1 mod n and η(E1,1 ) 6= η(Ek,k+1 ), de ne η(Ep,q ) = η(E1,1 ) for all (p, q) 6= (k, k + 1). If η(E1,1 ) = η(Ek,k+1 ), de ne η(Ek−1,k ) = η(E1,1 ) and η(Ep,q ) = η(Ek,k+1 ) for all (p, q) 6= (1, 1), (k − 1, k). This de nes a Wn′ -nontrivial-primitive assignment in S. Note that here Wn 6> Ei,j + Ek,l , and hen e there is a Wn -nontrivial-primitive assignment in S. If i 6= j and k 6= l, then j ≡ i + 1 mod n and l ≡ k + 1 mod n. If η(Ei,i+1 ) = η(Ek,k+1 ) x s, 1 6 s 6 n, s 6= i, k, and let η(E1,1 ) = η(Es,s+1 ) and η(Ep,q ) = η(Ei,i+1 ) for all (p, q) 6= (1, 1), (s, s + 1). If η(Ei,i+1 ) 6= η(Ek,k+1 ), let η(E1,1 ) = η(Ei,i+1 ) and η(Ep,q ) = η(Ek,k+1 ) for all (p, q) 6= (1, 1), (i, i + 1). In all ases, we have de ned a Wn′ -nontrivial-primitive assignment in S. Case 2 will deal with this ase for a Wn -nontrivial-primitive assignment in S. Case 2. Wn > Ei,j + Ek,l . Let us show that in this ase there exists a Wn nontrivial-primitive assignment η su h that η(Ei,j ) = α and η(Ek,l ) = β. We have the following sub ases: Sub ase 1. i, j, k, l ∈ {1, n − 1, n}. That is (i, j) = (n, 1) and (k, l) = (n − 1, 1), or vi e versa, or (i, j) = (n, 1) and (k, l) = (n − 1, n), or vi e versa. If

Operators preserving primitivity for matrix pairs

11

η(Ei,j) = η(Ek,l), let η(E1,2) ≠ η(Ei,j) and η(Ep,q) = η(Ei,j) for all (p, q) ≠ (1, 2), (k, l). If η(Ei,j) ≠ η(Ek,l) then, since Ei,j and Ek,l are two of the cells En−1,n, En,1, En−1,1, let Er,s be the other of the three. If (r, s) = (n − 1, 1), let η(Er,s) = η(Ek,l) and η(Ep,q) = η(Ei,j) for all (p, q) ≠ (r, s), (k, l). If (r, s) ≠ (n − 1, 1), let η(Er,s) = η(En−1,1) and η(Ep,p+1) ≠ η(En−1,1) for all p, 1 ≤ p ≤ n − 2.

Subcase 2. i ∈ {n, n − 1}, k ∉ {n − 1, n}. (Equivalently, k ∈ {n, n − 1}, i ∉ {n − 1, n}.) If η(Ei,j) ≠ η(Ek,l), let η(Ep,q) = η(Ei,j) for all (p, q) ≠ (k, l). If η(Ei,j) = η(Ek,l), let η(Es,s+1) ≠ η(Ei,j) for some s ≠ k, s < n − 1, and let η(Ep,q) = η(Ei,j) for all (p, q) ≠ (s, s + 1). Here, unless n = 3, the choice of s is always possible. The case n = 3 is an easy exercise.

Subcase 3. i, k ∉ {n − 1, n}. If η(Ei,j) = η(Ek,l), let η(En−1,n) = η(En−1,1) ≠ η(Ei,j) and η(Ep,q) = η(Ei,j) for all other (p, q). If η(Ei,j) ≠ η(Ek,l), let η(Ep,q) = η(Ei,j) for all (p, q) ≠ (k, l). In all cases and subcases a Wn-nontrivial-primitive assignment in S has been defined. ⊓⊔

Lemma 6. Let T : M2n(B) → M2n(B) be a surjective additive operator which preserves primitive pairs. Then T(D2) = D2.

Proof. Let us show that there are no elements from M2n(B) \ D2 which are mapped by T to D2. Assume the converse, i.e., there is a matrix pair (X, Y) ∈ M2n(B) \ D2 such that T(X, Y) ∈ D2. Note that by Lemma 2 the operator T is bijective. Thus by [1, Theorem 1.2] the image of a cell must be a cell. If n = 1 then all matrices are diagonal, so we can assume that n ≥ 2 till the end of this proof. Without loss of generality we may assume that X is non-diagonal. Thus there is Ei,j ≤ X, i ≠ j. By Lemma 2 the operator T is bijective and T(O, O) = (O, O). Hence T(Ei,j, O) ≠ (O, O). Thus T(Ei,j, O) ∈ D2, since otherwise T(X, O) ∉ D2 by antinegativity of B. Since n ≥ 2 we have that |D2 \ {(O, O)}| ≥ 15 > 2. Thus by the surjectivity of T there are also some other pairs of matrices whose image lies in D2, say T(X′, Y′) ∈ D2. Thus, similarly to the above, there is a pair (r, s) such that either T(Er,s, O) ∈ D2 (if X′ ≠ O) or T(O, Er,s) ∈ D2 (if Y′ ≠ O), with (r, s) ≠ (j, i) and (r, s) ≠ (i, j). We consider the first possibility now, i.e., there exists (r, s) such that T(Er,s, O) ∈ D2.

Case 1. If r = s, by a permutational similarity of M2n(B) we can assume that (r, r) = (1, 1) and that j ≡ (i + 1) mod n. By Lemma 5 there are n − 1 cells F1, F2, · · · , Fn−1 and a Wn′-nontrivial-primitive assignment η such that Wn′ ≥ Ei,j + Er,r + F1 + · · · + Fn−1 and for A = {Ei,j, Er,r, F1, · · · , Fn−1} the pair (A, B) =

Σ{Ek,l ∈ A : η(Ek,l)=0} (Ek,l, O)  +  Σ{Ek,l ∈ A : η(Ek,l)=1} (O, Ek,l)

is a primitive pair. But T(A, B) dominates two elements of D2 by the choice of i, j, r, s and hence cannot be primitive by Lemma 1, a contradiction.


Case 2. If r ≠ s, by a permutational similarity of M2n(B) we can assume that Wn ≥ Ei,j + Er,s. By Lemma 5 there are n − 1 cells F1, F2, · · · , Fn−1 and a Wn-nontrivial-primitive assignment η such that Wn ≥ Ei,j + Er,s + F1 + · · · + Fn−1 and for A = {Ei,j, Er,s, F1, · · · , Fn−1} the pair

(A, B) = Σ{Ek,l ∈ A : η(Ek,l)=0} (Ek,l, O)  +  Σ{Ek,l ∈ A : η(Ek,l)=1} (O, Ek,l)

is primitive. But T(A, B) dominates two elements of D2 and hence cannot be primitive by Lemma 1, a contradiction.

The cases T(Ei,j, O), T(O, Er,s) ∈ D2 and X is diagonal, Y is non-diagonal can be considered in a similar way. Thus T(M2n(B) \ D2) ⊆ M2n(B) \ D2. Since T is bijective by Lemma 2 and the set M2n(B) is finite, it follows that T(M2n(B) \ D2) = M2n(B) \ D2 and thus we have that T(D2) = D2. ⊓⊔

Remark 2. Note that the sum of any three (or fewer) off-diagonal cells, no two of which are collinear, is dominated by a full-cycle permutation matrix unless one is the transpose of another. That is, if i ≠ p ≠ r ≠ i and j ≠ q ≠ s ≠ j, and (Ei,j + Ep,q + Er,s) ◦ (Ei,j + Ep,q + Er,s)t = O, then there is a full-cycle permutation matrix P such that P ≥ Ei,j + Ep,q + Er,s.

Let (A, B) be a matrix pair. For our purposes we will assume that if ai,j ≠ 0 then bi,j = 0. Let G be the digraph whose adjacency matrix is A and let H be the digraph whose adjacency matrix is B. We color all the arcs in G color one and all the arcs in H color two, and then consider G ∪ H, the two-colored digraph with the same vertex set.

Definition 28. We call this two-colored digraph the digraph associated with the matrix pair (A, B).

A useful tool in determining when a matrix pair is primitive is called the cycle matrix.

Definition 29. If the digraph associated with the pair (A, B) has cycles C1, C2, · · · , Ck, the cycle matrix M is the 2 × k matrix of integers whose (1, i) entry is the number of arcs in cycle Ci that correspond to the part of the digraph associated with A, i.e., the arcs colored color 1, and whose (2, i) entry is the number of arcs in cycle Ci that correspond to the part of the digraph associated with B, i.e., the arcs colored color 2.

The usefulness of this matrix is contained in the following result of Shader and Suwilo, see [11].

Theorem 1. [11] Let (A, B) be a matrix pair with cycle matrix M. Then (A, B) is a primitive pair if and only if the greatest common divisor of all 2 × 2 minors of M is equal to 1.
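Theorem 1 turns primitivity of a pair into an arithmetic check on its cycle matrix. The following sketch (the function name and the worked cycle matrices are our own illustration, assuming the cycles have already been enumerated as in Definition 29) computes the gcd of all 2 × 2 minors:

```python
from itertools import combinations
from math import gcd

def is_primitive_pair(M):
    """Shader-Suwilo criterion (Theorem 1): a pair is primitive iff the
    gcd of all 2 x 2 minors of its 2 x k cycle matrix M equals 1."""
    top, bot = M
    g = 0
    for i, j in combinations(range(len(top)), 2):
        g = gcd(g, abs(top[i] * bot[j] - top[j] * bot[i]))
    return g == 1

# For n = 4: a loop plus a 4-cycle, with B holding one off-diagonal arc,
# gives a minor equal to 1 (primitive); if instead B holds the loop and
# one off-diagonal arc, the only 2 x 2 minor is -3 (not primitive).
print(is_primitive_pair([[1, 3], [0, 1]]))   # True
print(is_primitive_pair([[0, 3], [1, 1]]))   # False
```

The same check reappears in the proof of Lemma 7, where each case reduces to a 2 × 2 cycle matrix of exactly this shape.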


Lemma 7. Let (A, B) be a matrix pair with A + B = Wn′, |A| + |B| = n + 1 and |A| ≥ |B|. Then (A, B) is a primitive pair if and only if B = O or B is an off-diagonal cell.

Proof. Let M be the cycle matrix of the pair (A, B). If B = O then A = Wn′ and (Wn′, O) is a primitive pair. If B is an off-diagonal cell then

M = [ 1  n−1 ]
    [ 0   1  ]

and det M = 1, hence (A, B) is a primitive pair by Theorem 1. Now assume that (A, B) is a primitive pair. We must show that B = O or that B is an off-diagonal cell. If B = O then we are done, so assume that B ≠ O. By Lemma 1 either A or B or both contain a diagonal cell.

Case 1. Assume that B ≠ O and B dominates a diagonal cell. Then, since A + B = Wn′ and |A| + |B| = n + 1, it follows that the non-zero cells of A and B are complementary. Thus

M = [ 0  n−α ]
    [ 1   α  ]

where α is the number of off-diagonal cells dominated by B. Since (A, B) is a primitive pair, we must have det M = ±1. Here det M = −(n − α), so we have that α = n − 1 and hence |A| = 1, a contradiction, since |A| ≥ |B| so that |A| ≥ (n + 1)/2 > 1.

Case 2. Assume that B ≠ O and A has a nonzero diagonal entry. Here the cycle matrix for (A, B) is

M = [ 1  n−α ]
    [ 0   α  ]

where α is the number of nonzero entries in B. Since, by Theorem 1, the determinant of M must be 1 or −1, we must have that α = 1. That is, B is an off-diagonal cell. ⊓⊔

Lemma 8. Let n ≥ 3 and let T : M2n(B) → M2n(B) be a surjective additive operator which preserves primitive pairs. Then either T(D, O) = (D, O) or T(D, O) = (O, D).

Proof. By Lemma 2, T is a bijective linear operator. Suppose that T(Ei,i, O) = (Ek,k, O) and T(Ej,j, O) = (O, El,l). Let C = E1,2 + E2,3 + · · · + En−1,n + En,1, a full-cycle matrix, and let T(C, O) = (X, Y). Then (C + Ei,i, O) and (C + Ej,j, O) are both primitive pairs, and hence their images must be primitive pairs. Since by Lemma 6 T(D2) = D2, we must have that T(M2n(B) \ D2) = M2n(B) \ D2, so that T(C + Ei,i, O) = (X + Ek,k, Y) and T(C + Ej,j, O) = (X, Y + El,l) must both be primitive pairs. It was pointed out in the proof of Lemma 6 that T is bijective on the set of cells. Thus T(C + Ei,i, O) = (X + Ek,k, Y) and T(C + Ej,j, O) = (X, Y + El,l) are primitive pairs which dominate exactly n + 1 cells. Since by Corollary 1 the only primitive matrices which dominate exactly n + 1 cells, one of which is a diagonal cell, dominate a full-cycle matrix, we must have that X + Y + Ek,k and X + Y + El,l dominate full-cycle matrices. It now follows that X + Y is a full cycle. Since (X + Ek,k, Y) is a primitive pair we have by Lemma 7 that Y is an off-diagonal cell. Since (X, Y + El,l) is a primitive pair we have by Lemma 7 that X is an off-diagonal cell. Since X + Y is a full-cycle matrix, it


follows that n = 2, a contradiction. Thus T(D, O) = (D, O) or T(D, O) = (O, D). ⊓⊔

Henceforth, we let K denote the matrix with zero main diagonal and ones everywhere else. That is, K is the adjacency matrix of the complete loopless digraph. Let us show that T acts on M2n(B) componentwise.

Lemma 9. Let n ≥ 3 and let T : M2n(B) → M2n(B) be a surjective additive operator which preserves primitive pairs. Then there are bijective linear operators L : Mn(B) \ D → Mn(B) \ D and S : D → D such that either T(X, O) = (L(X ◦ K), O) + (S(X ◦ I), O) for all X, or T(X, O) = (O, L(X ◦ K)) + (O, S(X ◦ I)) for all X.

Proof. By Lemma 2, T is a bijective linear operator. Thus, by [1, Theorem 2.1], all cells in M2n(B) are mapped to cells. By virtue of Lemma 8 we may assume without loss of generality that for all l we have T(El,l, O) = (El,l, O) and T(O, El,l) = (O, Eσ(l),σ(l)) for some permutation σ. Suppose that for some pairs (p, q), (x, y) with p ≠ q we have T(Ep,q, O) = (O, Ex,y). Here, by Lemma 6, x ≠ y. Let F1, F2, · · · , Fn−1 be any cells such that Ep,q + F1 + F2 + · · · + Fn−1 is a full cycle. For an arbitrary k, let

(A, B) = (Ek,k + Ep,q + F1 + F2 + · · · + Fn−2, Fn−1).    (3)

Then (A, B) is a primitive pair by Lemma 7. Thus the image must be a primitive pair. As was pointed out, T maps cells to cells, thus |T(A, B)| = |(A, B)| = n + 1. Since T(Ek,k, O) = (Ek,k, O) ∈ (D, O), it follows that the sum of the two components of T(A, B) is not a matrix which is similar to the Wielandt matrix by a permutational transformation. Thus it is similar to Wn′ and Lemma 7 can be applied. Therefore, T(Ek,k + F1 + F2 + · · · + Fn−2, Fn−1) must be a pair of the form (C, O), since T(Ep,q, O) = (O, Ex,y) and the component of T(A, B) which is without diagonal cells can possess no more than one non-zero cell. By varying the choice of the Fi's we get that if F is an off-diagonal cell not in row p or column q, then T(F, O) ≤ (J, O). That is, there are n² − 3n + 3 off-diagonal cells F such that T(F, O) ≤ (J, O). Note however that in the expression (A, B) = (Ek,k + Ep,q + F1 + F2 + · · · + Fn−2, Fn−1), see formula (3), the matrix Fn−1 could be replaced by any of the other off-diagonal cells not in row p or column q. That is, there are also n² − 3n + 3 off-diagonal cells F such that T(O, F) ≤ (J, O). Further, if T(Ep,r, O) ≤ (O, J) then as above T(Ei,q, O) ≤ (J, O), so that the number of off-diagonal cells F such that T(F, O) ≤ (J, O) is at least n² − 3n + 4. If T(Ep,r, O) ≤ (J, O) then again the number of off-diagonal cells F such that T(F, O) ≤ (J, O) is at least n² − 3n + 4. It follows that 2[n² − 3n + 3] + 1 ≤ n² − n, since T is bijective by Lemma 2 and,


therefore, T is bijective on the set of cells. But this inequality never holds, so we have arrived at a contradiction. Define L : Mn(B) \ D → Mn(B) \ D by T(X ◦ K, O) = (L(X ◦ K), O) and S : D → D by T(X ◦ I, O) = (S(X ◦ I), O). The lemma now follows. ⊓⊔

Since the action of T is defined on M2n(B) independently in each component, the following definition is correct and makes sense.

Definition 30. Let T : M2n(B) → M2n(B) be a linear operator such that T(X, O) ∈ Mn(B) × O and T(O, X) ∈ O × Mn(B) for all X ∈ Mn(B). Define the linear operators T1 and T2 on Mn(B) by T(X, Y) = (T1(X), T2(Y)).

Corollary 2. Let T : M2n(B) → M2n(B) be a surjective additive operator which preserves primitive pairs. Then there are bijective linear operators L1 : Mn(B) → Mn(B) and L2 : Mn(B) → Mn(B) which preserve primitivity such that T(X, Y) = (L1(X), L2(Y)) for all (X, Y) ∈ M2n(B), or T(X, Y) = (L2(Y), L1(X)) for all (X, Y) ∈ M2n(B).

Proof. By Lemma 9, T(X, O) = (L(X ◦ K), O) + (S(X ◦ I), O) for all X, or T(X, O) = (O, L(X ◦ K)) + (O, S(X ◦ I)) for all X. If T(X, O) = (L(X ◦ K), O) + (S(X ◦ I), O), then by the bijectivity of T and Lemma 9, T(O, X) = (O, L′(X ◦ K)) + (O, S′(X ◦ I)). Here, define L1(X) = L(X ◦ K) + S(X ◦ I) and L2(X) = L′(X ◦ K) + S′(X ◦ I), so that T(X, Y) = (L1(X), L2(Y)). If T(X, O) = (O, L(X ◦ K)) + (O, S(X ◦ I)), then by the bijectivity of T and Lemma 9, T(O, X) = (L′(X ◦ K), O) + (S′(X ◦ I), O). In this case T(X, Y) = (L2(Y), L1(X)). ⊓⊔

Lemma 10. If L : Mn(B) → Mn(B) is a bijective linear operator that preserves primitive matrices then L strongly preserves primitive matrices.

Proof. Since the set Mn(B) is finite, the set of primitive matrices and the set of non-primitive matrices partition Mn(B). Since L is bijective and the image of the set of primitive matrices is contained in the set of primitive matrices, the image of the set of primitive matrices must be equal to the set of primitive matrices, and consequently the image of the set of non-primitive matrices must be the set of non-primitive matrices. That is, L strongly preserves primitive matrices. ⊓⊔

We now define a special operator that we need for Theorem 2 below.

Definition 31. An operator D : Mn(B) → Mn(B) is a diagonal replacement operator if D(Ei,j) = Ei,j whenever i ≠ j, and D(D) ⊆ D. It is nonsingular if D(Ei,i) ≠ O for all i. If D is bijective then there is a permutation σ of {1, · · · , n} such that D(Ei,i) = Eσ(i),σ(i) for all i. In such a case we use the notation Dσ to denote the operator.


Theorem 2. [4, Theorem 3.1] The semigroup of linear operators on Mn(B) that strongly preserve primitive matrices is generated by transposition, the similarity operators and nonsingular diagonal replacement when n ≠ 2. When n = 2 it is generated by those operators and the special operator defined by

[ a  b ]      [ b  a + d ]
[ c  d ]  →   [ c    0   ]

for all a, b, c, d ∈ B.

Let us now formulate our main theorem for matrix pairs.

Theorem 3. Let n ≥ 3 and let T : M2n(B) → M2n(B) be a surjective additive operator which preserves primitive pairs. Then there are permutation matrices P, Q, and R such that:
T(X, Y) = (P(X ◦ K)Pt, P(Y ◦ K)Pt) + (Q(X ◦ I)Qt, R(Y ◦ I)Rt) for all (X, Y) ∈ M2n(B);
T(X, Y) = (P(Y ◦ K)Pt, P(X ◦ K)Pt) + (Q(Y ◦ I)Qt, R(X ◦ I)Rt) for all (X, Y) ∈ M2n(B);
T(X, Y) = (P(Xt ◦ K)Pt, P(Yt ◦ K)Pt) + (Q(X ◦ I)Qt, R(Y ◦ I)Rt) for all (X, Y) ∈ M2n(B); or
T(X, Y) = (P(Yt ◦ K)Pt, P(Xt ◦ K)Pt) + (Q(Y ◦ I)Qt, R(X ◦ I)Rt) for all (X, Y) ∈ M2n(B).

Proof. By Corollary 2, induced actions of T on (Mn(B), O) and (O, Mn(B)) arise. According to the same corollary these actions are linear and correctly defined. By Lemma 10 these induced operators strongly preserve primitivity. Applying Theorem 2 now, we have that for some permutation matrices P and Q, and permutations σ and τ of {1, · · · , n}, T(X, Y) = (PDσ(X)Pt, QDτ(Y)Qt) for all (X, Y) ∈ M2n(B), or the similar transformations in the other three cases. Thus we only need to show that P = Q and that it is impossible that there is a transposition in the first coordinate and no transposition in the second one. We start with the transposition transformation. Without loss of generality assume that

T(X, O) = (PDσ(X)Pt, O)    and    T(O, Y) = (O, QDτ(Y)Qt).

Also without loss of generality we may assume that P = I, that is, T(X, O) = (Dσ(X), O). Now, it is impossible that T(O, Ei,i+1) = (O, Ei,i+1) for all i = 1, . . . , n, since there is no permutation matrix Q such that Q(E1,2 + E2,3 + · · · + En−1,n + En,1)tQt = E1,2 + E2,3 + · · · + En−1,n + En,1. Therefore, there is some i such that T(O, Ei,i+1) ≠ (O, Ei,i+1) (subscripts taken modulo n). Say, without loss of generality, that T(O, En,1) ≠ (O, En,1). Let A1 = E1,1 + E1,2 + E2,3 + · · · + En−1,n and A2 = En,1.


Then (A1, A2) is primitive, whereas

T(A1, A2) = (Ei1,i1 + E1,2 + E2,3 + · · · + En−1,n, Ep,q),

where (p, q) ≠ (n, 1). This matrix pair cannot be primitive since it has exactly n off-diagonal entries and they do not form a full cycle, a contradiction. Thus either X is transposed in both components or X is not transposed in both components.

Suppose that P ≠ Q. Then there is some Ei,j with i ≠ j such that PEi,j and QEi,j are cells in different rows. Let k1, k2, · · · , kn−2 be distinct positive integers less than n such that i, j ∉ {k1, k2, · · · , kn−2}. Let A = E1,1 + Ej,k1 + Ek1,k2 + · · · + Ekn−3,kn−2 + Ekn−2,i. Then (A, Ei,j) is a primitive pair, but T(A, Ei,j) = (X, Y) cannot be primitive as it has a row with no off-diagonal entry in either X or Y, a contradiction. Thus P = Q. Now, by splitting any matrix into its diagonal and off-diagonal parts, we obtain the form as in the statement of the theorem. Note that the special operator for n = 2 in Theorem 2 is not surjective. ⊓⊔

4 Matrices over Antinegative Semirings Without Zero Divisors

Definition 32. The pattern, denoted Ā, of a matrix A ∈ Mn(S) is the (0,1)-matrix whose (i, j)-th entry is 0 if ai,j = 0 and 1 if ai,j ≠ 0.

Remark 3. For a given matrix A ∈ Mn(S) we consider its pattern Ā as a matrix in Mn(B). If S is antinegative and without zero divisors then the mapping Mn(S) → Mn(B), A ↦ Ā, is a homomorphism of semirings.

Remark 4. Let S be antinegative and without zero divisors. Then direct computations show that (A, B) ∈ Mn(S) is primitive if and only if (Ā, B̄) ∈ Mn(B) is primitive.
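As a small illustration of Definition 32 and Remark 3, the pattern map over an antinegative semiring such as the nonnegative integers can be sketched as follows (the helper name `pattern` is ours):

```python
def pattern(A):
    """Pattern map M_n(S) -> M_n(B): the (i, j) entry is 1 iff a_{i,j} != 0
    (Definition 32)."""
    return [[0 if x == 0 else 1 for x in row] for row in A]

# Over the nonnegative integers (antinegative, no zero divisors) the map is
# additive: a sum of entries vanishes only when both summands vanish, so the
# pattern of A + B equals the Boolean sum (entrywise max) of the patterns.
A = [[2, 0], [1, 3]]
B = [[0, 5], [0, 2]]
S = [[x + y for x, y in zip(r, s)] for r, s in zip(A, B)]
boolean_sum = [[max(x, y) for x, y in zip(r, s)]
               for r, s in zip(pattern(A), pattern(B))]
assert pattern(S) == boolean_sum
print(pattern(A))  # [[1, 0], [1, 1]]
```

Antinegativity is exactly what makes the additivity step work: over a ring with subtraction, 2 + (−2) = 0 would break it.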

Definition 33. Let T be an additive operator on Mn(S). Its pattern T̄ is the additive operator on Mn(B) defined by the rule that T̄(Ei,j) is the pattern of T(Ei,j) and T̄(O) is the pattern of T(O).

Remark 5. It is easy to see that if S is antinegative and zero-divisor-free, then for any A ∈ Mn(S) the pattern of T(A) coincides with T̄(Ā). Moreover, the following statement is true:


Lemma 11. Let S be an antinegative semiring without zero divisors. Then the transformation which maps each additive operator T on Mn(S) to the operator T̄ on Mn(B) is a homomorphism from the semiring of additive operators on Mn(S) to the semiring of additive operators on Mn(B).

Proof. It is straightforward to see that if T is the zero operator, then T̄ is the zero operator. The rest follows from [4, Lemma 2.1]. ⊓⊔

Let us apply the above lemma and Theorem 3 to obtain the characterization result over any antinegative semiring without zero divisors.

Corollary 3. Let T : M2n(S) → M2n(S) be a surjective additive operator which preserves primitive pairs. Then there is a permutation matrix P ∈ Mn(S), additive functions φ, ψ : S → S with zero kernels (i.e., φ(x) = 0 implies x = 0 and ψ(y) = 0 implies y = 0), and permutations σ and τ of {1, · · · , n} such that:
T(X, Y) = (PDσ(Xφ)Pt, PDτ(Yψ)Pt) for all (X, Y) ∈ M2n(S), where Xφ denotes the element-wise action of φ on the entries of X;
T(X, Y) = (PDτ(Yψ)Pt, PDσ(Xφ)Pt) for all (X, Y) ∈ M2n(S);
T(X, Y) = (PDσ((Xφ)t)Pt, PDτ((Yψ)t)Pt) for all (X, Y) ∈ M2n(S); or
T(X, Y) = (PDτ((Yψ)t)Pt, PDσ((Xφ)t)Pt) for all (X, Y) ∈ M2n(S).

References

1. L. B. Beasley, A. E. Guterman, Linear preservers of extremes of rank inequalities over semirings: Factor rank, Journal of Mathematical Sciences (New York) 131, no. 5 (2005) 5919–5938.
2. L. B. Beasley, S. J. Kirkland, A note on k-primitive directed graphs, Linear Algebra and its Appl. 373 (2003) 67–74.
3. L. B. Beasley, N. J. Pullman, Linear operators that strongly preserve the index of imprimitivity, Linear and Multilinear Algebra 31 (1992) 267–283.
4. L. B. Beasley, N. J. Pullman, Linear operators that strongly preserve primitivity, Linear and Multilinear Algebra 25 (1989) 205–213.
5. R. Brualdi, H. Ryser, Combinatorial Matrix Theory, Cambridge University Press, New York, 1991.
6. E. Fornasini, A 2D systems approach to river pollution modelling, Multidimensional System Signal Process. 2 (1991) 233–265.
7. E. Fornasini, M. Valcher, Primitivity of positive matrix pairs: algebraic characterization, graph theoretic description and 2D systems interpretation, SIAM J. Matrix Anal. Appl. 19 (1998) 71–88.
8. R. A. Horn, C. R. Johnson, Matrix Analysis, Cambridge University Press, New York.
9. C.-K. Li, N.-K. Tsing, Linear preserver problems: a brief introduction and some special techniques, Directions in matrix theory (Auburn, AL, 1990), Linear Algebra Appl. 162/164 (1992) 217–235.


10. P. Pierce et al., A Survey of Linear Preserver Problems, Linear and Multilinear Algebra 33 (1992) 1–119.
11. B. Shader, S. Suwilo, Exponents of nonnegative matrix pairs, Linear Algebra and its Appl. 363 (2003) 275–293.
12. H. Wielandt, Unzerlegbare, nicht negative Matrizen, Math. Z. 52 (1950) 642–648.

Decompositions of quaternions and their matrix equivalents

Drahoslava Janovská¹ and Gerhard Opfer²

¹ Institute of Chemical Technology, Prague, Department of Mathematics, Technická 5, 166 28 Prague 6, Czech Republic, [email protected]
² University of Hamburg, Faculty for Mathematics, Informatics, and Natural Sciences [MIN], Bundesstraße 55, 20146 Hamburg, Germany, [email protected]

Dedicated to the memory of Gene Golub

Abstract. Since quaternions have isomorphic representations in matrix form, we investigate various well known matrix decompositions for quaternions.

Keywords: decompositions of quaternions, Schur, polar, SVD, Jordan, QR, LU.

1 Introduction

We will study various decompositions of quaternions, where we will employ the isomorphic matrix images of quaternions. The matrix decompositions allow in many cases analogous decompositions of the underlying quaternion. Let us denote the skew field of quaternions by H. It is well known that quaternions have an isomorphic representation either by certain complex (2 × 2)-matrices or by certain real (4 × 4)-matrices. Let a := (a1, a2, a3, a4) ∈ H. Then the two isomorphisms ω : H → C2×2, ω1 : H → R4×4 are defined as follows:

ω(a)  := [  α   β ]  ∈ C2×2,   α := a1 + a2 i,  β := a3 + a4 i,        (1)
         [ −β̄   ᾱ ]

ω1(a) := [ a1  −a2  −a3  −a4 ]
         [ a2   a1  −a4   a3 ]  ∈ R4×4.                                (2)
         [ a3   a4   a1  −a2 ]
         [ a4  −a3   a2   a1 ]

There is another very similar, but nevertheless different mapping, ω2 : H → R4×4, the meaning of which will be explained immediately:

ω2(a) := [ a1  −a2  −a3  −a4 ]
         [ a2   a1   a4  −a3 ]  ∈ R4×4.                                (3)
         [ a3  −a4   a1   a2 ]
         [ a4   a3  −a2   a1 ]


In the first equation (1) the overlined quantities ᾱ, β̄ denote the complex conjugates of the non-overlined quantities α, β, respectively. Let b ∈ H be another quaternion. Then the isomorphisms imply ω(ab) = ω(a)ω(b), ω1(ab) = ω1(a)ω1(b). The third map, ω2, has the interesting property that it reverses the order of the multiplication:

ω2(ab) = ω2(b)ω2(a)  ∀ a, b ∈ H,    ω1(a)ω2(b) = ω2(b)ω1(a)  ∀ a, b ∈ H.    (4)

The mapping ω2 plays a central role in the investigation of linear maps H → H. There is a formal similarity to the Kronecker product of two arbitrary matrices. See [16] for the mentioned linear maps and [11, Lemma 4.3.1] for the Kronecker product.
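The multiplication rules (1)–(4) are easy to confirm numerically. In the sketch below the helper names (`omega`, `omega1`, `omega2`, `qmul`) are our own; the component order (a1, a2, a3, a4) follows the text:

```python
import numpy as np

def qmul(a, b):
    """Quaternion product ab in components (a1, a2, a3, a4)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (a1*b1 - a2*b2 - a3*b3 - a4*b4,
            a1*b2 + a2*b1 + a3*b4 - a4*b3,
            a1*b3 - a2*b4 + a3*b1 + a4*b2,
            a1*b4 + a2*b3 - a3*b2 + a4*b1)

def omega(a):
    """Complex q-matrix (1)."""
    a1, a2, a3, a4 = a
    al, be = a1 + 1j*a2, a3 + 1j*a4
    return np.array([[al, be], [-np.conj(be), np.conj(al)]])

def omega1(a):
    """Real q-matrix (2): left multiplication by a."""
    a1, a2, a3, a4 = a
    return np.array([[a1, -a2, -a3, -a4], [a2, a1, -a4, a3],
                     [a3, a4, a1, -a2], [a4, -a3, a2, a1]], dtype=float)

def omega2(a):
    """Real pseudo q-matrix (3): right multiplication by a."""
    a1, a2, a3, a4 = a
    return np.array([[a1, -a2, -a3, -a4], [a2, a1, a4, -a3],
                     [a3, -a4, a1, a2], [a4, a3, -a2, a1]], dtype=float)

a, b = (1.0, 2.0, -1.0, 0.5), (0.5, -1.0, 2.0, 3.0)
ab = qmul(a, b)
assert np.allclose(omega(ab), omega(a) @ omega(b))     # isomorphism
assert np.allclose(omega1(ab), omega1(a) @ omega1(b))  # isomorphism
assert np.allclose(omega2(ab), omega2(b) @ omega2(a))  # order reversed, (4)
assert np.allclose(omega1(a) @ omega2(b), omega2(b) @ omega1(a))  # (4)
```

The order reversal in the third assertion is just associativity in disguise: ω1 acts by left multiplication and ω2 by right multiplication, and (ax)b = a(xb).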

Definition 1. A complex (2 × 2)-matrix of the form introduced in (1) will be called a complex q-matrix. A real (4 × 4)-matrix of the form introduced in (2) will be called a real q-matrix. A real (4 × 4)-matrix of the form introduced in (3) will be called a real pseudo q-matrix. The set of all complex q-matrices will be denoted by HC. The set of all real q-matrices will be denoted by HR. The set of all real pseudo q-matrices will be denoted by HP.

We introduce some common notation. Let C be a matrix of any size with real or complex entries. By D := CT we denote the transposed matrix of C, where rows and columns are interchanged. By E := C̄ we denote the conjugate matrix of C, where all entries of C are changed to their complex conjugates. Finally, C∗ := (C̄)T. Let a := (a1, a2, a3, a4) ∈ H. The first component, a1, is called the real part of a, denoted by ℜa. The quaternion av := (0, a2, a3, a4) will be called the vector part of a. From the above representations it is clear how to recover a quaternion from the corresponding matrix. Thus, it is also possible to introduce the inverse mappings

ω−1 : HC → H,   ω1−1 : HR → H,   ω2−1 : HP → H,

where ω−1, ω1−1 as well define isomorphisms. If we define a new algebra H̃ where a new multiplication, denoted by ⋆, is introduced by a ⋆ b := ba, then ω2 is an isomorphism between H̃ and HP. This particularly implies that ω2(ab) = ω2(b)ω2(a) ∈ HP and ω2(a−1) = ω2(a)−1 = ω2(a)T/|a|2 ∈ HP for all a ∈ H\{0}. Because of these isomorphisms it is possible to associate notions known from matrix theory with quaternions. Simple examples are:


det(a) := det(ω(a)) = |a|2,    det(ω1(a)) = det(ω2(a)) = |a|4,                  (5)
tr(a)  := tr(ω(a)) = 2a1,      tr(ω1(a)) = tr(ω2(a)) = 4a1,                     (6)
eig(a) := eig(ω(a)) = [σ+, σ−],                                                 (7)
eig(ω1(a)) = eig(ω2(a)) = [σ+, σ+, σ−, σ−],
    where σ+ = a1 + √(a2² + a3² + a4²) i = a1 + |av| i,  σ− = σ̄+,              (8)
|a| = ||ω(a)||2 = ||ω1(a)||2 = ||ω2(a)||2,                                      (9)
cond(a) := cond(ω(a)) = cond(ω1(a)) = cond(ω2(a)) = 1,                         (10)
ω(aā) = ω(a)ω(a)∗ = |a|2 ω(1) = |a|2 I2,                                       (11)
ω1(aā) = ω1(a)ω1(a)T = ω2(aā) = ω2(a)Tω2(a) = |a|2 I4,                         (12)

where det, tr, eig, cond refer to determinant, trace, collection of eigenvalues,

condition number, respectively. By I2, I4 we denote the identity matrices of order 2 and 4, respectively. We note that a general theory for determinants of quaternion valued matrices is not available. See [1]. We will review the classical matrix decompositions and investigate their applicability to quaternions. For the classical theory we usually refer to one of the books of Horn & Johnson, [10], [11]. In this connection it is useful to introduce another notion, namely that of equivalence between two quaternions. Such an equivalence may already be regarded as one of the important decompositions, namely the Schur decomposition, as we will see.

Definition 2. Two quaternions a, b ∈ H will be called equivalent, if there is an h ∈ H\{0} such that b = h−1ah.

Equivalent quaternions a, b will be denoted by a ∼ b. The set

[a] := {s : s := h−1ah, h ∈ H\{0}}

will be called the equivalence class of a. It is the set of all quaternions which are equivalent to a.

The above defined notion of equivalence defines an equivalence relation.

Lemma 1. Two quaternions a, b are equivalent if and only if

ℜa = ℜb,   |a| = |b|.                                            (13)

Furthermore, a ∈ R ⇔ {a} = [a]. Let a ∈ C. Then {a, ā} ⊂ [a]. Let a = (a1, a2, a3, a4) ∈ H. Then

σ+ := a1 + √(a2² + a3² + a4²) i ∈ [a].

Proof. See [13]. ⊓⊔

The complex number σ+ occurring in the last lemma will be called the complex representative of [a]. The equivalence a ∼ b can also be expressed in the form ah − hb = 0, with an h ≠ 0. This is the homogeneous form of Sylvester's equation. This equation was investigated by Janovská & Opfer [16]. It should be noted that algebraists usually refer to equivalent elements as conjugate elements. See [18, p. 35].

2 Decompositions of quaternions

A matrix decomposition of the form ω(a) = ω(b)ω(c) or ω(a) = ω(b)ω(c)ω(d) with a, b, c, d ∈ H, and the same with ω1, also represents a direct decomposition of the involved quaternions, namely a = bc or a = bcd, because of the isomorphy of the involved mappings ω, ω1. The same applies to ω2, only the multiplication order has to be reversed. We will study the possibility of decomposing quaternions with respect to various well known matrix decompositions. A survey paper on decompositions of quaternionic matrices was given by [19].

2.1 Schur decompositions

Let U be an arbitrary real or complex square matrix. If UU∗ = I (identity matrix) then U will be called unitary. If U is real, then U∗ = UT. A real, unitary matrix will also be called orthogonal.

Theorem 1 (Schur 1). Let A be an arbitrary real or complex square matrix. Then there exists a unitary matrix U of the same size as A such that

D := U∗AU                                                        (14)

is an upper triangular matrix and as such contains the eigenvalues of A on its diagonal.

Proof. See Horn & Johnson [10, p. 79]. ⊓⊔

Theorem 2 (Schur 2). Let A be an arbitrary real square matrix of order n. Then there exists a real, orthogonal matrix V of order n such that

H := VTAV                                                        (15)

is an upper Hessenberg matrix with k ≤ n block entries on the diagonal which are either real (1 × 1) matrices or real (2 × 2) matrices which have a pair of non-real complex conjugate eigenvalues which are also eigenvalues of A.


Proof. See Horn & Johnson [10, p. 82]. ⊓⊔

The representation A = UDU∗ implied by (14) is usually referred to as the complex Schur decomposition of A, whereas A = VHVT implied by (15) is usually referred to as the real Schur decomposition of A. Let a be a quaternion; then we might ask whether there is a Schur decomposition of the matrices ω(a), ω1(a), ω2(a) in terms of quaternions. The (affirmative) answer was already given by Janovská & Opfer [15, 2007].

Theorem 3. Let a ∈ H\R and let σ+ be the complex representative of [a]. There exists h ∈ H with |h| = 1 such that σ+ = h−1ah and

ω(a) = ω(h)ω(σ+)ω(h−1),  ω1(a) = ω1(h)ω1(σ+)ω1(h−1),  ω2(a) = ω2(h−1)ω2(σ+)ω2(h)    (16)

are the Schur decompositions of ω(a), ω1(a), ω2(a), respectively, which includes that ω(h), ω1(h), ω2(h) are unitary and ω(h−1) = ω(h)∗, ω1(h−1) = ω1(h)T, ω2(h−1) = ω2(h)T. The first decomposition is complex, the other two are real.

Proof. The first two decompositions given in (16) follow immediately from Lemma 1 and the fact that ω, ω1 are isomorphisms. See [15]. The last equation can be written as ω2(h)ω2(a) = ω2(σ+)ω2(h). Applying (4) one obtains ah = hσ+, which coincides with the equation for σ+ given in the beginning of the theorem. Matrix ω(σ+) is complex and diagonal: ω(σ+) = diag(σ+, σ−). The other matrices ω1(σ+), ω2(σ+) are upper Hessenberg with two real (2 × 2) blocks each:

ω1(σ+) = [  a1  −|av|    0     0  ]      ω2(σ+) = [  a1  −|av|    0     0  ]
          [ |av|   a1     0     0  ]               [ |av|   a1     0     0  ]
          [   0     0    a1  −|av| ]               [   0     0    a1   |av| ]
          [   0     0   |av|   a1  ]               [   0     0  −|av|   a1  ]

⊓⊔

If we have a look at the forms of ω1 and ω2, defined in (2), (3), respectively, we see that an upper (and lower) triangular matrix reduces immediately to a multiple of the identity matrix. This corresponds to the case where a is a real quaternion. In other words, it is not possible to find a complex Schur decomposition of ω1(a), ω2(a) in HR, HP, respectively, if a ∉ R. In the mentioned paper [15, Section 8] we can also find how to construct the h which occurs in Theorem 3. One possibility is to put h := h̃/|h̃|, where

        ( (|av| + a2, |av| + a2, a3 − a4, a3 + a4)   if |a3| + |a4| > 0,
h̃  :=  ( (1, 0, 0, 0)                               if a3 = a4 = 0 and a2 > 0,      (17)
        ( (0, 1, 0, 0)                               if a3 = a4 = 0 and a2 < 0.

Quaternionic decompositions

Let σ+ ∼ a and multiply the defining equation σ+ = h⁻¹ah from the left by h; then hσ+ − ah = 0 is the homogeneous form of Sylvester's equation, and it was shown in [16] that under the condition stated in (13) the homogeneous equation has a solution space (null space) which is a two-dimensional subspace of H over R.

2.2 The polar decomposition

The aim is to generalize the polar representation of a complex number. Let z ∈ C\{0} be a complex number. Then z = |z|(z/|z|), and this representation of z is unique in the class of all two-factor representations z = pu, where the first factor p is positive and the second, u, has modulus one. For matrices A one could correspondingly ask for a representation of the form A = PU, where the first factor P is positive semidefinite and the second, U, is unitary. This is indeed possible, even for non-square matrices A ∈ C^{m×n}, m ≤ n. The matrix P is always uniquely defined as P = (AA∗)^{1/2}, and U is uniquely defined if A has maximal rank m. If A is square and non-singular, then U = P⁻¹A. See Horn & Johnson [10, Theorem 7.3.2 and Corollary 7.3.3, pp. 412/413]. Let a ∈ H\{0} be a non-vanishing quaternion a := (a1, a2, a3, a4). The quantity av := (0, a2, a3, a4) was called the vector part of a, as previously explained. The matrices ı(a), ı1(a), ı2(a) are non-singular square matrices whose columns are orthogonal to each other; see (11), (12). The polar representation of a (in terms of quaternions) is obviously

a = |a| (a/|a|).    (18)

The corresponding matrix representations in HC, HR, HP can be easily deduced by using (1) to (3) and the properties listed in (11), (12). We obtain

ı(a) = diag(|a|, |a|) ı(a/|a|),    (19)
ı1(a) = diag(|a|, |a|, |a|, |a|) ı1(a/|a|),    (20)
ı2(a) = diag(|a|, |a|, |a|, |a|) ı2(a/|a|).    (21)

In all cases the first factor is positive definite and the second is unitary or orthogonal, respectively. From a purely algebraic standpoint this representation of a is complete. However, already the name polar representation means more. In the complex

case we have

z/|z| = exp(αi),   z ≠ 0,

where α := arg z is the angle between the x-axis and an arrow representing z emanating from the origin of the z-plane. As a formula: α = arctan(ℑz/ℜz). In

D. Janovská, G. Opfer

the quaternionic case one finds (cf. [2, p. 11])

a/|a| = exp(αu),   a ≠ 0,

with u := av/|av|, α := arctan(|av|/a1), and exp is defined by its Taylor series using u² = −1.

2.3 The singular value decomposition (SVD)

We start with the following well-known theorem on a singular value decomposition of a given matrix A. We restrict ourselves here to square matrices. The singular values of A are the square roots of the (non-negative) eigenvalues of the positive semidefinite matrix AA∗.

Theorem 4. Let A be an arbitrary square matrix with real or complex entries. Then there are two unitary matrices U, V of the same size as A such that

D := UAV∗

is a diagonal matrix with the singular values of A in decreasing order on the diagonal, and the number of positive diagonal entries is the rank of A.

Proof. See Horn & Johnson [10, 1991, p. 414]. ⊓⊔

Let a be a quaternion. The eigenvalues of ı(a) are σ+, σ−, defined in (8), and

ı(a) ı(a)∗ =
  [ |a|²    0   ]
  [  0    |a|²  ] .

Thus, the singular values of ı(a) are |a|, |a|. The wanted decomposition must be of the form

  [ |a|   0  ]       [ α    β ]
  [  0   |a| ] = U · [ −β̄   ᾱ ] · V∗,

and the main question is whether U, V ∈ HC. In order to solve this problem, we write it directly in terms of quaternions, namely

|a| = uav̄,   |u| = |v| = 1.    (22)

Theorem 5. Let a ∈ H\R. Choose u ∈ H with |u| = 1 and define v := ua/|a| or, equivalently, choose v with |v| = 1 and define u := vā/|a|. Then (22) defines a singular value decomposition of a, and

ı(|a|) = ı(u) ı(a) ı(v)∗

defines a corresponding SVD in HC. An SVD with u = v is impossible. The corresponding SVDs in HR and in HP are

ı1(|a|) = ı1(u) ı1(a) ı1(v)ᵀ,   ı2(|a|) = ı2(v)ᵀ ı2(a) ı2(u).


Proof. It is easy to see that (22) is valid if we choose u, v according to the given rules. If u = v, then a = |a| ∈ R follows, which was excluded. ⊓⊔

One very easy realization of (22) is to choose u := 1 and v := a/|a|, or to choose v := 1 and u := ā/|a|.
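This realization is easy to spot-check numerically. The sketch below (our own helper functions, with the conjugate entering as in (22)) verifies |a| = uav̄ for u := 1, v := a/|a| on the example a = (1, 2, 2, 4):

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as 4-tuples (1, i, j, k)."""
    p1, p2, p3, p4 = p
    q1, q2, q3, q4 = q
    return (p1*q1 - p2*q2 - p3*q3 - p4*q4,
            p1*q2 + p2*q1 + p3*q4 - p4*q3,
            p1*q3 - p2*q4 + p3*q1 + p4*q2,
            p1*q4 + p2*q3 - p3*q2 + p4*q1)

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def qabs(q):
    return math.sqrt(sum(x*x for x in q))

a = (1.0, 2.0, 2.0, 4.0)                 # |a| = 5
u = (1.0, 0.0, 0.0, 0.0)                 # u := 1
v = tuple(x/qabs(a) for x in a)          # v := a/|a|, so |v| = 1
res = qmul(qmul(u, a), qconj(v))         # u a v-bar
print(res)                               # ~ (5.0, 0.0, 0.0, 0.0) = |a|
```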

Example 1. Let a := (1, 2, 2, 4), so that |a| = 5; take u := 1 and v := a/|a|. Then the three SVDs are:

  [ 5  0 ]   [ 1  0 ] [ 1+2i    2+4i ] [ 1−2i   −2−4i ]
  [ 0  5 ] = [ 0  1 ] [ −2+4i   1−2i ] [ 2−4i    1+2i ] / 5 ,

  [ 5 0 0 0 ]       [ 1  −2  −2  −4 ]   [  1   2   2   4 ]
  [ 0 5 0 0 ]       [ 2   1  −4   2 ]   [ −2   1   4  −2 ]
  [ 0 0 5 0 ] = I · [ 2   4   1  −2 ] · [ −2  −4   1   2 ] / 5 ,
  [ 0 0 0 5 ]       [ 4  −2   2   1 ]   [ −4   2  −2   1 ]

  [ 5 0 0 0 ]   [  1   2   2   4 ]       [ 1  −2  −2  −4 ]
  [ 0 5 0 0 ]   [ −2   1  −4   2 ]       [ 2   1   4  −2 ]
  [ 0 0 5 0 ] = [ −2   4   1  −2 ] / 5 · [ 2  −4   1   2 ] · I .
  [ 0 0 0 5 ]   [ −4  −2   2   1 ]       [ 4   2  −2   1 ]

2.4 The Jordan decomposition

Let a := (a1, a2, a3, a4) ∈ H\R. Since the two eigenvalues σ± of ı(a), defined in (8), are different, there is an s ∈ H\{0} such that a = s⁻¹σ+s, which implies ı(a) = ı(s⁻¹) ı(σ+) ı(s). This representation is the Jordan decomposition of ı(a), and

J := ı(σ+) =
  [ σ+   0  ]
  [ 0    σ− ]

is the Jordan canonical form of ı(a) [10, p. 126]. In this context this representation is almost the same as the Schur decomposition; only we do not require that |s| = 1. For the computation of s we could use formula (17). In HR, HP this decomposition reads

ı1(a) = ı1(s⁻¹) ı1(σ+) ı1(s),   ı2(a) = ı2(s) ı2(σ+) ı2(s⁻¹),

where the explicit forms of ı1(σ+), ı2(σ+) are given in the proof of Theorem 3.

2.5 The QR decomposition

Let A be an arbitrary complex square matrix. Then there is a unitary matrix U and an upper triangular matrix R of the same size as A such that A = UR.

This well-known theorem can be found in [10, p. 112], and this decomposition is referred to as the QR-decomposition of A. All triangular matrices in HC, in HR,


and in HP reduce to diagonal matrices. Therefore, the QR-decompositions of a quaternion a ≠ 0 have the trivial form

a = (a/|a|) |a|  ⇔  ı(a) = ı(a/|a|) ı(|a|),  ı1(a) = ı1(a/|a|) ı1(|a|),  ı2(a) = ı2(a/|a|) ı2(|a|),

which is identical with the polar decomposition (18).

2.6 The LU decomposition

Let A ∈ C^{n×n} be given with entries ajk, j, k = 1, 2, ..., n. Define the n submatrices Aℓ := (ajk), j, k = 1, 2, ..., ℓ, ℓ = 1, 2, ..., n. Then, following Horn & Johnson [10, p. 160], there is a lower triangular matrix L and an upper triangular matrix U such that A = LU

if and only if all n submatrices Aℓ, ℓ = 1, 2, ..., n, are non-singular. The above representation is called the LU-decomposition of A. Since triangular matrices in HC, in HR, and in HP reduce to diagonal matrices, and since a product of two diagonal matrices is again diagonal, an LU-decomposition of a quaternion a will in general not exist, since ı(a), ı1(a), ı2(a) are in general not diagonal. So we may ask for the ordinary LU-decomposition of ı(a), ı1(a), ı2(a). In order that such a decomposition exist, we must require that the mentioned submatrices are not singular. Let a = (a1, a2, a3, a4). Then the two mentioned submatrices of ı(a) are non-singular if and only if the first (1 × 1) submatrix α := a1 + a2 i ≠ 0, since this implies that also the second (2 × 2) submatrix, which is ı(a), is non-singular, because its determinant is |a|² = |α|² + a3² + a4² > 0.

Theorem 6. Let a = (a1, a2, a3, a4) ∈ H. Put α := a1 + a2 i and β := a3 + a4 i. An LU-decomposition of ı(a) exists if and only if α ≠ 0. If this condition is valid, then

ı(a) =
  [ α    β ]     [ 1     0 ] [ α    β  ]
  [ −β̄   ᾱ ]  =  [ l21   1 ] [ 0   u22 ] ,

where

l21 = −β̄/α,   u22 = (|α|² + |β|²)/α = |a|²/α.

Proof. The if-and-only-if part follows from the general theory. The above formula is easy to check. ⊓⊔

Theorem 7. Let a = (a1, a2, a3, a4) ∈ H. The four submatrices Aℓ of ı1(a) and of ı2(a) are non-singular if and only if a1 ≠ 0. If this condition is valid, then

ı1(a) :=
  [ a1   −a2   −a3   −a4 ]     [ 1    0    0    0 ] [ a1  −a2  −a3  −a4 ]
  [ a2    a1   −a4    a3 ]  =  [ l21  1    0    0 ] [ 0   u22  u23  u24 ]
  [ a3    a4    a1   −a2 ]     [ l31  l32  1    0 ] [ 0    0   u33  u34 ]
  [ a4   −a3    a2    a1 ]     [ l41  l42  l43  1 ] [ 0    0    0   u44 ] ,


where [results for ı2(a) are in parentheses]

lj1 := aj/a1, j = 2, 3, 4,                  (no change for ı2(a)),
l32 := (a1 a4 + a2 a3)/(a1² + a2²),         (l32 := (−a1 a4 + a2 a3)/(a1² + a2²)),
l42 := (a2 a4 − a1 a3)/(a1² + a2²),         (l42 := (a2 a4 + a1 a3)/(a1² + a2²)),
u22 := (a1² + a2²)/a1,                      (no change for ı2(a)),
u23 := (−a1 a4 + a2 a3)/a1,                 (u23 := (a1 a4 + a2 a3)/a1),
u24 := (a1 a3 + a2 a4)/a1,                  (u24 := (−a1 a3 + a2 a4)/a1),
u33 := a1 + l31 a3 − l32 u23,               (no change for ı2(a)),
l43 := (a2 + l41 a3 − l42 u23)/u33,         (l43 := (−a2 + l41 a3 − l42 u23)/u33),
u34 := −a2 + l31 a4 − l32 u24,              (u34 := a2 + l31 a4 − l32 u24),
u44 := a1 + l41 a4 − l42 u24 − l43 u34,     (no change for ı2(a)).
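The 2 × 2 factorization of Theorem 6 is easy to verify numerically. The following Python/NumPy sketch (our own, for an arbitrarily chosen a = (1, 2, 3, 4)) checks the factors against the complex representation of a:

```python
import numpy as np

a1, a2, a3, a4 = 1.0, 2.0, 3.0, 4.0
alpha = a1 + a2*1j                         # alpha = a1 + a2 i
beta = a3 + a4*1j                          # beta  = a3 + a4 i
na2 = abs(alpha)**2 + abs(beta)**2         # |a|^2 = 30

# complex 2x2 representation of the quaternion a
A = np.array([[alpha, beta],
              [-beta.conjugate(), alpha.conjugate()]])

l21 = -beta.conjugate()/alpha              # Theorem 6
u22 = na2/alpha
L = np.array([[1.0, 0.0], [l21, 1.0]])
U = np.array([[alpha, beta], [0.0, u22]])

assert np.allclose(L @ U, A)               # LU reproduces the matrix
```

Note that det A = |α|² + |β|² = |a|², which is the source of the formula u22 = |a|²/α.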

A Cholesky decomposition cannot be achieved, since all three matrices ı(a), ı1(a), ı2(a) are missing symmetry.

Acknowledgment. The authors acknowledge with pleasure the support of the Grant Agency of the Czech Republic (grant No. 201/06/0356). The work is a part of the research project MSM 6046137306 financed by MSMT, Ministry of Education, Youth and Sports, Czech Republic.

References

1. J. Fan, Determinants and multiplicative functionals on quaternionic matrices, Linear Algebra Appl. 369 (2003), 193–201.
2. P. R. Girard, Quaternions, Clifford Algebras and Relativistic Physics, Birkhäuser, Basel, Boston, Berlin, 2007, 179 p.
3. R. A. Horn & C. R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, New York, 1992, 561 p.
4. R. A. Horn & C. R. Johnson, Topics in Matrix Analysis, Cambridge University Press, Cambridge, New York, 1991, 607 p.
5. D. Janovská & G. Opfer, Givens' transformation applied to quaternion valued vectors, BIT 43 (2003), Suppl., 991–1002.
6. D. Janovská & G. Opfer, Fast Givens Transformation for Quaternionic Valued Matrices Applied to Hessenberg Reductions, ETNA 20 (2005), 1–26.
7. D. Janovská & G. Opfer, Linear equations in quaternions, In: Numerical Mathematics and Advanced Applications, Proceedings of ENUMATH 2005, A. B. de Castro, D. Gómez, P. Quintela, and P. Salgado, eds., Springer Verlag, New York, 2006, pp. 946–953.
8. D. Janovská & G. Opfer, Computing quaternionic roots by Newton's method, ETNA 26 (2007), pp. 82–102.
9. D. Janovská & G. Opfer, On one linear equation in one quaternionic unknown,
10. R. A. Horn & C. R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, New York, 1992, 561 p.


11. R. A. Horn & C. R. Johnson, Topics in Matrix Analysis, Cambridge University Press, Cambridge, New York, 1991, 607 p.
12. D. Janovská & G. Opfer, Givens' transformation applied to quaternion valued vectors, BIT 43 (2003), Suppl., 991–1002.
13. D. Janovská & G. Opfer, Fast Givens Transformation for Quaternionic Valued Matrices Applied to Hessenberg Reductions, ETNA 20 (2005), 1–26.
14. D. Janovská & G. Opfer, Linear equations in quaternions, In: Numerical Mathematics and Advanced Applications, Proceedings of ENUMATH 2005, A. B. de Castro, D. Gómez, P. Quintela, and P. Salgado, eds., Springer Verlag, New York, 2006, pp. 946–953.
15. D. Janovská & G. Opfer, Computing quaternionic roots by Newton's method, ETNA 26 (2007), pp. 82–102.
16. D. Janovská & G. Opfer, On one linear equation in one quaternionic unknown, Hamburger Beiträge zur Angewandten Mathematik, Nr. 2007-14, September 2007, 34 p., dedicated to Bernd Fischer on the occasion of his 50th birthday.
17. D. Janovská & G. Opfer, Linear equations in quaternionic variables, Mitt. Math. Ges. Hamburg 27 (2008), 223–234.
18. B. L. van der Waerden, Algebra I, 5th ed., Springer, Berlin, Göttingen, Heidelberg, 1960, 292 p.
19. F. Zhang, Quaternions and matrices of quaternions, Linear Algebra Appl. 251 (1997), 21–57.

Sensitivity analysis of Hamiltonian and reversible systems prone to dissipation-induced instabilities

Oleg N. Kirillov⋆

Institute of Mechanics, Moscow State Lomonosov University, Michurinskii pr. 1, 119192 Moscow, Russia, [email protected]; Department of Mechanical Engineering, Technische Universität Darmstadt, Hochschulstr. 1, 64289 Darmstadt, Germany, [email protected]

Abstract. Stability of a linear autonomous non-conservative system in the presence of potential, gyroscopic, dissipative, and non-conservative positional forces is studied. The cases when the non-conservative system is close either to a gyroscopic system or to a circulatory one are examined. It is known that marginal stability of gyroscopic and circulatory systems can be destroyed or improved up to asymptotic stability due to the action of small non-conservative positional and velocity-dependent forces. We show that in both cases the boundary of the asymptotic stability domain of the perturbed system possesses singularities such as "Dihedral angle", "Break of an edge" and "Whitney's umbrella" that govern stabilization and destabilization as well as are responsible for the imperfect merging of modes. Sensitivity analysis of the critical parameters is performed with the use of the perturbation theory for eigenvalues and eigenvectors of non-self-adjoint operators. In the case of two degrees of freedom, the stability boundary is found in terms of the invariants of the matrices of the system. Bifurcation of the stability domain due to change of the structure of the damping matrix is described. As a mechanical example, the Hauger gyropendulum is analyzed in detail; an instability mechanism in a general mechanical system with two degrees of freedom, which originates after discretization of models of a rotating disc in frictional contact and possesses the spectral mesh in the plane 'frequency' versus 'angular velocity', is analytically described, and its role in the excitation of vibrations in the squealing disc brake and in the singing wine glass is discussed.

Keywords: matrix polynomial, Hamiltonian system, reversible system, Lyapunov stability, indefinite damping, perturbation, dissipation-induced instabilities, destabilization paradox, multiple eigenvalue, singularity.

⋆ The work has been partly supported by the Alexander von Humboldt Foundation and by the German Research Foundation, Grant DFG HA 1060/43-1.

O. N. Kirillov

1 Introduction

Consider an autonomous non-conservative system

ẍ + (ΩG + δD)ẋ + (K + νN)x = 0,    (1)

where a dot stands for time differentiation, x ∈ Rᵐ, and the real matrix K = Kᵀ corresponds to potential forces. The real matrices D = Dᵀ, G = −Gᵀ, and N = −Nᵀ are related to dissipative (damping), gyroscopic, and non-conservative positional (circulatory) forces, with magnitudes controlled by scaling factors δ, Ω, and ν, respectively. A circulatory system is obtained from (1) by neglecting velocity-dependent forces,

ẍ + (K + νN)x = 0,    (2)

while a gyroscopic one has no damping and non-conservative positional forces,

ẍ + ΩGẋ + Kx = 0.    (3)

Circulatory and gyroscopic systems (2) and (3) possess fundamental symmetries that become evident after transformation of equation (1) to the form ẏ = Ay with

A =
  [ −(1/2)ΩG                             I             ]
  [ (1/2)δΩDG + (1/4)Ω²G² − K − νN      −δD − (1/2)ΩG  ] ,

y =
  [ x             ]
  [ ẋ + (1/2)ΩGx ] ,    (4)

where I is the identity matrix. In the absence of damping and gyroscopic forces (δ = Ω = 0), RAR = −A with

R = R⁻¹ =
  [ I    0  ]
  [ 0   −I  ] .    (5)

This means that the matrix A has a reversible symmetry, and equation (2) describes a reversible dynamical system [16, 19, 33]. Due to this property,

det(A − λI) = det(R(A − λI)R) = det(A + λI),    (6)

and the eigenvalues of circulatory system (2) appear in pairs (−λ, λ). Without damping and non-conservative positional forces (δ = ν = 0), the matrix A possesses the Hamiltonian symmetry JAJ = Aᵀ, where J is a unit symplectic matrix [17, 23, 28]

J = −J⁻¹ =
  [ 0    I ]
  [ −I   0 ] .    (7)

As a consequence,

det(A − λI) = det(J(A − λI)J) = det(Aᵀ + λI) = det(A + λI),    (8)

which implies that if λ is an eigenvalue of A then so is −λ, similarly to the reversible case. Therefore, an equilibrium of a circulatory or of a gyroscopic

Sensitivity analysis of Hamiltonian and reversible systems

system is either unstable or all its eigenvalues lie on the imaginary axis of the complex plane, implying marginal stability if they are semi-simple. In the presence of all four forces, the Hamiltonian and reversible symmetries are broken, and the marginal stability is generally destroyed. Instead, system (1) can be asymptotically stable if its characteristic polynomial

omplex plane implying marginal stability if they are semi-simple. In the presen e of all the four for es, the Hamiltonian and reversible symmetries are broken and the marginal stability is generally destroyed. Instead, system (1) an be asymptoti ally stable if its hara teristi polynomial P(λ) = det(Iλ2 + (ΩG + δD)λ + K + νN),

(9)

satisfies the criterion of Routh and Hurwitz. The most interesting situation for many applications, ranging from rotor dynamics [3–5, 14, 25, 27, 30, 31, 48, 49, 59, 62] to physics of the atmosphere [9, 29, 62, 66] and from stability and optimization of structures [8, 10, 11, 15, 22, 26, 33, 39, 54, 55, 65, 69] to friction-induced instabilities and acoustics of friction [40, 42, 61, 67, 71–73, 75, 76], is when system (1) is close either to circulatory system (2) with δ, Ω ≪ ν (a near-reversible system) or to gyroscopic system (3) with δ, ν ≪ Ω (a near-Hamiltonian system).

The effect of small damping and gyroscopic forces on the stability of circulatory systems, as well as the effect of small damping and non-conservative positional forces on the stability of gyroscopic systems, is regarded as paradoxical, since the stability properties are extremely sensitive to the choice of the perturbation, and the balance of forces resulting in asymptotic stability is not evident, as happens in such phenomena as "tippe top inversion", "rising egg", and the onset of friction-induced oscillations in the squealing brake and in the singing wine glass [31, 48, 49, 59, 61, 62, 67, 71–73, 75–77]. Historically, Thomson and Tait in 1879 were the first to find that dissipation destroys gyroscopic stabilization (dissipation-induced instability) [1, 28, 62, 66]. A similar effect of non-conservative positional forces on the stability of gyroscopic systems was established almost a century later by Lakhadanov and Karapetyan [12, 13]. A more sophisticated manifestation of the dissipation-induced instabilities was discovered by Ziegler on the example of a double pendulum loaded by a follower force, with the damping non-uniformly distributed among the natural modes [8]. Without dissipation, the Ziegler pendulum is a reversible system, which is marginally stable for loads not exceeding some critical value. Small dissipation of order o(1) makes the pendulum either unstable or asymptotically stable with a critical load which generically is lower than that of the undamped system by a quantity of order O(1) (the destabilization paradox). A similar discontinuous change in the stability domain for near-Hamiltonian systems has been observed by Holopainen [9, 66] in his study of the effect of dissipation on the stability of baroclinic waves in the Earth's atmosphere, by Hoveijn and Ruijgrok on the example of a rotating shaft on an elastic foundation [30], and by Crandall, who investigated a gyroscopic pendulum with stationary and rotating damping [31].
Contrary to the Ziegler pendulum, the undamped gyropendulum is a gyroscopic system that is marginally stable when its spin exceeds a critical value. Although the stationary damping, corresponding to a dissipative velocity-dependent force, destroys the gyroscopic stabilization [1], the Crandall gyropendulum with stationary and rotating damping, where the latter is related to a non-conservative positional force, can be asymptotically stable for rotation rates exceeding considerably the critical spin of the undamped system. This is an example of the destabilization paradox in the Hamiltonian system. As was understood during the last decade, the reason underlying the destabilization paradox is that the multiparameter family of non-normal matrix operators of system (1) generically possesses multiple eigenvalues related to singularities of the boundary of the asymptotic stability domain, which were described and classified by Arnold already in the 1970s [17]. Hoveijn and Ruijgrok were, apparently, the first who associated the discontinuous change in the critical load in their example to the singularity Whitney umbrella, existing on the stability boundary [30]. The same singularity on the boundary of the asymptotic stability has been identified for the Ziegler pendulum [47], for the models of disc brakes [72, 76], of rods loaded by a follower force [54, 55], and of gyropendulums and spinning tops [63, 70]. These examples reflect the general fact that the codimension-1 Hamiltonian (or reversible) Hopf bifurcation can be viewed as a singular limit of the codimension-3 dissipative resonant 1 : 1 normal form, and the essential singularity in which these two cases meet is topologically equivalent to Whitney's umbrella (Hamilton meets Hopf under Whitney's umbrella) [45, 66].
Despite the achieved qualitative understanding, the development of the sensitivity analysis for the critical parameters near the singularities, which is essential for controlling the stabilization and destabilization, is only beginning and involves such modern disciplines as the multiparameter perturbation theory of analytical matrix functions [7, 18, 20, 23, 24, 28, 29, 37, 41, 57, 58] and of non-self-adjoint boundary eigenvalue problems [51, 53–55], the theory of the structured pseudospectra of matrix polynomials [56, 73], and the theory of versal deformations of matrix families [30, 45, 47, 60]. The growing number of physical and mechanical applications demonstrating the destabilization paradox due to an interplay of non-conservative effects, and the need for a justification for the use of Hamiltonian or reversible models to describe real-world systems that are in fact only near-Hamiltonian or near-reversible, require a unified treatment of this phenomenon. The goal of the present paper is to find and to analyze the domain of asymptotic stability of system (1) in the space of the parameters δ, Ω, and ν, with special attention to the near-reversible and near-Hamiltonian cases. In the subsequent sections we will combine the study of the two-dimensional system, analyzing the Routh-Hurwitz stability conditions, with the perturbative approach to the case of arbitrarily large m. Typical singularities of the stability boundary will be identified. Bifurcation of the domain of asymptotic stability due to change of


the structure of the matrix D of dissipative forces will be thoroughly analyzed, and the effect of gyroscopic stabilization of a dissipative system with indefinite damping and non-conservative positional forces will be described. The estimates of the critical parameters and explicit expressions approximating the boundary of the asymptotic stability domain will be extended to the case of m > 2 degrees of freedom with the use of the perturbation theory of multiple eigenvalues of non-self-adjoint operators. In the last section the general theory will be applied to the study of the onset of stabilization and destabilization in the models of gyropendulums and disc brakes.

2 A circulatory system with small velocity-dependent forces

We begin with the near-reversible case (δ, Ω ≪ ν), which covers Ziegler's and Nikolai's pendulums loaded by the follower force [8, 10, 11, 33, 47, 43, 44, 53, 66] (their continuous analogue is the viscoelastic Beck column [10, 39, 54, 55]), the Reut-Sugiyama pendulum [50], the low-dimensional models of disc brakes by North [67, 73], Popp [40], and Sinou and Jezequel [72], the model of a mass sliding over a conveyor belt by Hoffmann and Gaul [42], the models of rotors with internal and external damping by Kimball and Smith [3, 4] and Kapitsa [5, 66], and finds applications even in the modeling of two-legged walking and of the dynamics of space tethers [32].

2.1 Stability of a circulatory system

Stability of system (1) is determined by its characteristic polynomial (9), which in the case of two degrees of freedom has a convenient form provided by the Leverrier–Barnett algorithm [21]:

P(λ, δ, ν, Ω) = λ⁴ + δ trD λ³ + (trK + δ² det D + Ω²) λ² + (δ(trK trD − tr KD) + 2Ων) λ + det K + ν²,    (10)

where without loss of generality we assume that det G = 1 and det N = 1. In the absence of damping and gyroscopic forces (δ = Ω = 0) the system (1) is circulatory, and the polynomial (10) has the four roots −λ+, −λ−, λ−, and λ+, where

λ± = √( −(1/2) trK ± (1/2) √( (trK)² − 4(det K + ν²) ) ).    (11)

The eigenvalues (11) can be real, complex, or purely imaginary, implying instability or marginal stability in accordance with the following statement.

Proposition 1. If trK > 0 and det K ≤ 0, circulatory system (2) with two degrees of freedom is stable for νd² < ν² < νf², unstable by divergence for ν² ≤ νd², and unstable by flutter for ν² ≥ νf², where the critical values νd and νf are

0 ≤ √(− det K) =: νd ≤ νf := (1/2) √((trK)² − 4 det K).    (12)

If trK > 0 and det K > 0, the circulatory system is stable for ν² < νf² and unstable by flutter for ν² ≥ νf². If trK ≤ 0, the system is unstable.

The proof is a consequence of formula (11), reversible symmetry, and the fact that the time dependence of solutions of equation (2) is given by exp(λt) for simple eigenvalues λ, with an additional prefactor, polynomial in t (secular terms), in the case of multiple eigenvalues with a Jordan block. The solutions grow monotonically for positive real λ, implying static instability (divergence), oscillate with an increasing amplitude for complex λ with positive real part (flutter), and remain bounded when λ is semi-simple and purely imaginary (stability). For K having two equal eigenvalues, νf = 0 and the circulatory system (2) is unstable, in agreement with the Merkin theorem for circulatory systems with two degrees of freedom [34, 62].

Fig. 1. Stability diagrams and trajectories of eigenvalues for the increasing parameter ν > 0 for the circulatory system (2) with trK > 0 and det K < 0 (a) and trK > 0 and det K > 0 (b).

Stability diagrams and the motion of eigenvalues in the complex plane for ν increasing from zero are presented in Fig. 1. When trK > 0 and det K < 0 there are two real and two purely imaginary eigenvalues at ν = 0, and the system is statically unstable, see Fig. 1(a). With the increase of ν both the imaginary and the real eigenvalues move towards the origin, until at ν = νd the real pair merges and originates a double zero eigenvalue with a Jordan block. At ν = νd the system is unstable due to the linear time dependence of a solution corresponding to λ = 0. A further increase of ν yields splitting of the double zero eigenvalue


into two purely imaginary ones. The imaginary eigenvalues of the same sign then move towards each other until at ν = νf they originate a pair of double eigenvalues ±iωf with a Jordan block, where

ωf = √( (1/2) trK ).    (13)

At ν = νf the system is unstable by flutter due to secular terms in its solutions. For ν > νf the flutter instability is caused by two of the four complex eigenvalues lying on the branches of the hyperbolic curve

(Im λ)² − (Re λ)² = ωf².    (14)

The critical values νd and νf constitute the boundaries between the divergence and stability domains and between the stability and flutter domains, respectively. For trK > 0 and det K = 0 the divergence domain shrinks to the point νd = 0, and for trK > 0 and det K > 0 there exist only stability and flutter domains, as shown in Fig. 1(b). For negative ν the boundaries of the divergence and flutter domains are ν = −νd and ν = −νf. In general, the Jordan chain for the eigenvalue iωf consists of an eigenvector u0 and an associated vector u1 that satisfy the equations [53]

(−ωf² I + K + νf N) u0 = 0,   (−ωf² I + K + νf N) u1 = −2iωf u0.    (15)

Due to the non-self-adjointness of the matrix operator, the same eigenvalue possesses a left Jordan chain of generalized eigenvectors v0 and v1:

v0ᵀ (−ωf² I + K + νf N) = 0,   v1ᵀ (−ωf² I + K + νf N) = −2iωf v0ᵀ.    (16)

The eigenvectors u0 and v0 are biorthogonal:

v0ᵀ u0 = 0.    (17)

In the neighborhood of ν = νf the double eigenvalue and the corresponding eigenvectors vary according to the formulas [52, 53]

λ(ν) = iωf ± μ √(ν − νf) + o(|ν − νf|^{1/2}),
u(ν) = u0 ± μ u1 √(ν − νf) + o(|ν − νf|^{1/2}),
v(ν) = v0 ± μ v1 √(ν − νf) + o(|ν − νf|^{1/2}),    (18)

where μ² is a real number given by

μ² = − v0ᵀ N u0 / (2iωf v0ᵀ u1).    (19)


For m = 2 the generalized eigenvectors of the right and left Jordan chains at the eigenvalue iωf, where the eigenfrequency ωf is given by (13) and the critical value νf is defined by (12), are [52]

u0 = [ 2k12 + 2νf ; k22 − k11 ],   v0 = [ 2k12 − 2νf ; k22 − k11 ],   u1 = v1 = [ 0 ; −4iωf ].    (20)

Substituting (20) into equation (19) yields the expression

μ² = − 4νf (k11 − k22) / (2iωf v0ᵀ u1) = νf / (2ωf²) > 0.    (21)

After plugging the real-valued coefficient μ into the expansions (18) we obtain an approximation of order |ν − νf|^{1/2} of the exact eigenvalues λ = λ(ν). This can be verified by the series expansions of (11) about ν = νf.
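The agreement claimed here can be illustrated numerically. The Python sketch below is our own; K = [[3, 1], [1, 2]] is an arbitrary choice with trK > 0 and det K > 0, and the exact eigenvalue from (11) is compared with the approximation iωf + μ√(ν − νf) from (18) and (21):

```python
import math, cmath

k11, k12, k22 = 3.0, 1.0, 2.0
trK, detK = k11 + k22, k11*k22 - k12**2            # 5, 5
wf = math.sqrt(trK/2)                              # eigenfrequency (13)
nuf = 0.5*math.sqrt(trK**2 - 4*detK)               # critical value (12)
mu = math.sqrt(nuf/(2*wf**2))                      # coefficient (21)

nu = nuf + 0.01
# exact eigenvalue from (11), branch with positive real part
lam_exact = cmath.sqrt(-trK/2 + 0.5*cmath.sqrt(trK**2 - 4*(detK + nu**2)))
lam_approx = 1j*wf + mu*math.sqrt(nu - nuf)        # expansion (18), '+' branch
print(abs(lam_exact - lam_approx))                 # small: o(|nu - nuf|^(1/2))
```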

2.2 The influence of small damping and gyroscopic forces on the stability of a circulatory system

The one-dimensional domain of marginal stability of circulatory system (2) given by Proposition 1 blows up into a three-dimensional domain of asymptotic stability of system (1) in the space of the parameters δ, Ω, and ν, which is described by the Routh and Hurwitz criterion for the polynomial (10):

δ trD > 0,  trK + δ² det D + Ω² > 0,  det K + ν² > 0,  Q(δ, Ω, ν) > 0,    (22)

where

Q := −q² + δ trD (trK + δ² det D + Ω²) q − (δ trD)² (det K + ν²),  q := δ(trK trD − tr KD) + 2Ων.    (23)

Considering the asymptotic stability domain (22) in the space of the parameters δ, ν, and Ω, we recall that the initial system (1) is equivalent to the first-order system with the real 2m×2m matrix A(δ, ν, Ω) defined by expression (4). As was established by Arnold [17], the boundary of the asymptotic stability domain of a multiparameter family of real matrices is not a smooth surface. Generically, it possesses singularities corresponding to multiple eigenvalues with zero real part. Applying the qualitative results of [17], we deduce that the parts of the ν-axis belonging to the stability domain of system (2), corresponding to two different pairs of simple purely imaginary eigenvalues, form edges of dihedral angles on the surfaces that bound the asymptotic stability domain of system (1), see Fig. 2(a). At the points ±νf of the ν-axis, corresponding to the stability-flutter boundary of system (2), there exists a pair of double purely imaginary eigenvalues with a Jordan block. Qualitatively, the asymptotic stability domain of system (1) in the space (δ, ν, Ω) near the ν-axis looks like a dihedral


Fig. 2. Singularities dihedral angle (a), trihedral angle (b), and deadlock of an edge (or a half of the Whitney umbrella (c)) of the boundary of the asymptotic stability domain.

angle which becomes more acute while approaching the points ±νf. At these points the angle shrinks, forming the deadlock of an edge, which is a half of the Whitney umbrella surface [17, 30, 45], see Fig. 2(c). In the case when the stability domain of the circulatory system has a common boundary with the divergence domain, as shown in Fig. 1(a), the boundary of the asymptotic stability domain of the perturbed system (1) possesses the trihedral angle singularity at ν = ±νd, see Fig. 2(b). The first two of the conditions of asymptotic stability (22) restrict the region of variation of the parameters δ and Ω either to a half-plane δ trD > 0, if det D > 0, or to a space between the line δ = 0 and one of the branches of the hyperbola |det D| δ² − Ω² = 2ωf², if det D < 0. Provided that δ and Ω belong to the described domain, the asymptotic stability of system (1) is determined by the last two of the inequalities (22), which impose limits on the variation of ν. Solving the equation Q(δ, ν, Ω) = 0, quadratic in ν, we write the stability condition Q > 0 in the form

(ν − ν−cr)(ν − ν+cr) < 0,    (24)

with

ν±cr(δ, Ω) = (Ωb ± √(Ω²b² + ac)) δ / a.    (25)

(25)

The coefficients a, b, and c are

$$a(\delta, \Omega) = 4\Omega^2 + \delta^2 (\mathrm{tr}D)^2,$$
$$b(\delta, \Omega) = 4\nu_f \beta_* + (\delta^2 \det D + \Omega^2)\,\mathrm{tr}D, \qquad (26)$$
$$c(\delta, \Omega) = \nu_f^2\left((\mathrm{tr}D)^2 - 4\beta_*^2\right) + \left(\omega_f^2\,\mathrm{tr}D - 2\nu_f \beta_*\right)(\delta^2 \det D + \Omega^2)\,\mathrm{tr}D,$$

where

$$\beta_* := \frac{\mathrm{tr}(K - \omega_f^2 I)D}{2\nu_f}. \qquad (27)$$
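As a numerical illustration, the following NumPy sketch evaluates the coefficients (26)-(27) and the critical loads (25) for hypothetical 2×2 matrices K and D (the values are illustrative, not from the text), and checks that along a ray Ω = βδ the critical load ν±cr tends, as δ → 0, to the limit (28) derived below. The formulas for the flutter frequency ωf² = trK/2 and flutter load νf² = ((k11 − k22)² + 4k12²)/4 are assumed from the earlier part of the chapter (around eqs. (11)-(13)).

```python
import numpy as np

# Illustrative (hypothetical) stiffness and damping matrices; not from the text.
K = np.array([[3.0, 1.0], [1.0, 1.0]])
D = np.array([[2.0, 1.0], [1.0, -1.0]])

# Assumed flutter frequency and flutter load of the unperturbed circulatory system.
w2f = np.trace(K) / 2.0
nuf = np.sqrt(((K[0, 0] - K[1, 1])**2 + 4.0 * K[0, 1]**2) / 4.0)

trD, detD = np.trace(D), np.linalg.det(D)
beta_star = np.trace((K - w2f * np.eye(2)) @ D) / (2.0 * nuf)   # eq. (27)

def abc(delta, Omega):
    """Coefficients a, b, c of eq. (26)."""
    a = 4.0 * Omega**2 + delta**2 * trD**2
    b = 4.0 * nuf * beta_star + (delta**2 * detD + Omega**2) * trD
    c = nuf**2 * (trD**2 - 4.0 * beta_star**2) \
        + (w2f * trD - 2.0 * nuf * beta_star) * (delta**2 * detD + Omega**2) * trD
    return a, b, c

def nu_cr(delta, Omega, sign):
    """Critical circulatory loads nu_cr^{+/-}(delta, Omega) of eq. (25)."""
    a, b, c = abc(delta, Omega)
    return (Omega * b + sign * np.sqrt(Omega**2 * b**2 + a * c)) / a * delta

def nu_0(beta, sign):
    """The limit (28) along the ray Omega = beta*delta."""
    return nuf * (4.0 * beta * beta_star
                  + sign * trD * np.sqrt(trD**2 + 4.0 * (beta**2 - beta_star**2))) \
           / (trD**2 + 4.0 * beta**2)

beta, delta = 2.0, 1e-5   # beta chosen so that the radicand in (28) is positive
print(nu_cr(delta, beta * delta, +1), nu_0(beta, +1))
print(nu_cr(delta, beta * delta, -1), nu_0(beta, -1))
```

The agreement of the two columns confirms that (25)-(27) and the limit formula are mutually consistent for this sample system.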

For det K ≤ 0, the domain of asymptotic stability consists of two non-intersecting parts, bounded by the surfaces ν = ν±cr(δ, Ω) and by the planes ν = ±νd,

O. N. Kirillov

separating it from the divergence domain. For det K > 0, the inequality det K + ν² > 0 is fulfilled, and in accordance with the condition (24) the asymptotic stability domain is contained between the surfaces ν = ν⁺cr(δ, Ω) and ν = ν⁻cr(δ, Ω). The functions ν±cr(δ, Ω) defined by expressions (25) are singular at the origin due to the vanishing denominator. Assuming Ω = βδ and calculating the limit of these functions as δ tends to zero, we obtain

$$\nu_0^{\pm}(\beta) := \lim_{\delta \to 0} \nu_{cr}^{\pm} = \nu_f\,\frac{4\beta\beta_* \pm \mathrm{tr}D\sqrt{(\mathrm{tr}D)^2 + 4(\beta^2 - \beta_*^2)}}{(\mathrm{tr}D)^2 + 4\beta^2}. \qquad (28)$$

The functions ν±₀(β) are real-valued if the radicand in (28) is non-negative.

Proposition 2. Let λ₁(D) and λ₂(D) be eigenvalues of D. Then,

$$|\beta_*| \le \frac{|\lambda_1(D) - \lambda_2(D)|}{2}. \qquad (29)$$

If D is semi-definite (det D ≥ 0), or indefinite with

$$0 > \det D \ge -\frac{\left(k_{12}(d_{22} - d_{11}) - d_{12}(k_{22} - k_{11})\right)^2}{4\nu_f^2}, \qquad (30)$$

then

$$|\beta_*| \le \frac{|\mathrm{tr}D|}{2}, \qquad (31)$$

and the limits ν±₀(β) are continuous real-valued functions of β. Otherwise, there exists an interval of discontinuity β² < β*² − (trD)²/4.

Proof. With the use of the definition (27) of β*, a series of transformations

$$\beta_*^2 - \frac{(\mathrm{tr}D)^2}{4} = \frac{1}{4\nu_f^2}\left(\frac{(k_{11} - k_{22})(d_{11} - d_{22})}{2} + 2k_{12}d_{12}\right)^2 - \frac{(d_{11} + d_{22})^2}{4}\,\frac{(k_{11} - k_{22})^2 + 4k_{12}^2}{4\nu_f^2} = -\det D - \frac{\left(k_{12}(d_{22} - d_{11}) - d_{12}(k_{22} - k_{11})\right)^2}{4\nu_f^2} \qquad (32)$$

yields the expression

$$\beta_*^2 = \frac{\left(\lambda_1(D) - \lambda_2(D)\right)^2}{4} - \frac{\left(k_{12}(d_{22} - d_{11}) - d_{12}(k_{22} - k_{11})\right)^2}{4\nu_f^2}. \qquad (33)$$

For real β*, formula (32) implies inequality (30). The remaining part of the proposition follows from (33).
Inequality (30) subdivides the set of indefinite damping matrices into two classes.
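The algebraic identity (33) can be spot-checked numerically. The sketch below uses illustrative symmetric matrices and assumes, as above, ωf² = trK/2 and νf² = ((k11 − k22)² + 4k12²)/4 from the earlier part of the chapter; these assumptions are not stated in this excerpt.

```python
import numpy as np

# Illustrative symmetric matrices (entries k_ij, d_ij as in the text).
K = np.array([[3.0, 1.0], [1.0, 1.0]])
D = np.array([[2.0, 1.0], [1.0, -1.0]])
k11, k12, k22 = K[0, 0], K[0, 1], K[1, 1]
d11, d12, d22 = D[0, 0], D[0, 1], D[1, 1]

# Assumed from earlier in the chapter.
w2f = (k11 + k22) / 2.0
nuf2 = ((k11 - k22)**2 + 4.0 * k12**2) / 4.0

# Left-hand side of (33): beta_*^2 via definition (27).
beta_star2 = (np.trace((K - w2f * np.eye(2)) @ D))**2 / (4.0 * nuf2)

# Right-hand side of (33) in terms of the eigenvalues of D.
lam1, lam2 = np.linalg.eigvalsh(D)
rhs = (lam1 - lam2)**2 / 4.0 \
      - (k12 * (d22 - d11) - d12 * (k22 - k11))**2 / (4.0 * nuf2)
print(beta_star2, rhs)
```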


Fig. 3. The functions ν⁺₀(β) (bold lines) and ν⁻₀(β) (fine lines), and their bifurcation when D changes from weakly to strongly indefinite.

Definition 1. We call a 2 × 2 real symmetric matrix D with det D < 0 weakly indefinite if 4β*² < (trD)², and strongly indefinite if 4β*² > (trD)².

As an illustration, we calculate and plot the functions ν±₀(β), normalized by νf, for a positive definite matrix K and three indefinite matrices D₁, D₂, and D₃ (34).

The graphs of the functions ν±₀(β) bifurcate as the damping matrix changes from weakly indefinite to strongly indefinite. Indeed, since D₁ satisfies the strict inequality (30), the limits are continuous functions with separated graphs, as shown in Fig. 3(a). Expression (30) is an equality for the matrix D₂. Consequently, the functions ν±₀(β) are continuous, with their graphs touching each other at the origin, Fig. 3(b). For the matrix D₃, condition (30) is not fulfilled, and the functions are discontinuous. Their graphs, however, are joined together, forming continuous curves, see Fig. 3(c). The calculated ν±₀(β) are bounded functions of β, not exceeding the critical values ±νf of the unperturbed circulatory system.

Proposition 3.

$$|\nu_0^{\pm}(\beta)| \le |\nu_0^{\pm}(\pm\beta_*)| = \nu_f. \qquad (35)$$

Proof. Observe that µ±₀ := ν±₀/νf are roots of the quadratic equation

$$\nu_f^2 a_\beta \mu^2 - 2\delta\Omega b_0 \nu_f \mu - \delta^2 c_0 = 0, \qquad (36)$$

with δ²aβ := a(δ, βδ), b₀ := b(0, 0), c₀ := c(0, 0). According to the Schur criterion [6], all the roots µ of equation (36) are inside the closed unit disk if

$$\delta^2 c_0 + \nu_f^2 a_\beta = (\mathrm{tr}D)^2 + 4(\beta^2 - \beta_*^2) + (\mathrm{tr}D)^2 > 0,$$
$$2\delta\Omega\nu_f b_0 + \nu_f^2 a_\beta - \delta^2 c_0 = (\beta + \beta_*)^2 > 0,$$
$$-2\delta\Omega\nu_f b_0 + \nu_f^2 a_\beta - \delta^2 c_0 = (\beta - \beta_*)^2 > 0. \qquad (37)$$


The first of conditions (37) is satisfied for real ν±₀, implying |µ±₀(β)| ≤ 1, with |µ⁺₀(β*)| = |µ⁻₀(−β*)| = 1.
The limits ν±₀(β) of the critical values of the circulatory parameter ν±cr(δ, Ω), which are complicated functions of δ and Ω, effectively depend only on the ratio β = Ω/δ, defining the direction of approaching zero in the plane (δ, Ω). Along the directions β = β* and β = −β*, the limits coincide with the critical flutter loads of the unperturbed circulatory system (2), in such a way that ν⁺₀(β*) = νf and ν⁻₀(−β*) = −νf. According to Proposition 3, the limit of the critical load of the non-conservative positional force at the onset of flutter for system (1), with dissipative and gyroscopic forces tending to zero, does not exceed the critical flutter load of circulatory system (2), demonstrating a jump in the critical load which is characteristic of the destabilization paradox. Power series expansions of the functions ν±₀(β) around β = ±β* (with the radius of convergence not exceeding |trD|/2) yield simple estimates of the jumps in the critical load for the two-dimensional system (1):

$$\nu_f \mp \nu_0^{\pm}(\beta) = \nu_f\,\frac{2}{(\mathrm{tr}D)^2}(\beta \mp \beta_*)^2 + o\left((\beta \mp \beta_*)^2\right). \qquad (38)$$
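The quadratic estimate (38) can be checked against the exact limit (28). A minimal sketch with illustrative parameter values (νf, trD, and β* are hypothetical, not from the text):

```python
import numpy as np

nuf, trD, beta_star = 1.0, 3.0, 0.5   # illustrative values, not from the text

def nu0_plus(beta):
    """nu_0^+(beta) of eq. (28)."""
    return nuf * (4.0 * beta * beta_star
                  + trD * np.sqrt(trD**2 + 4.0 * (beta**2 - beta_star**2))) \
           / (trD**2 + 4.0 * beta**2)

eps = 1e-2
jump = nuf - nu0_plus(beta_star + eps)          # exact drop of the critical load
estimate = nuf * 2.0 / trD**2 * eps**2          # leading term of eq. (38)
print(jump, estimate)
```

For small eps the two numbers agree to within the o(eps²) remainder, confirming the quadratic opening of the stability boundary near the umbrella point.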

Leaving in expansions (38) only the second-order terms and then substituting β = Ω/δ, we get equations of the form Z = X²/Y², which is canonical for the Whitney umbrella surface [17, 30, 45]. These equations approximate the boundary of the asymptotic stability domain of system (1) in the vicinity of the points (0, 0, ±νf) in the space of the parameters (δ, Ω, ν). An extension to the case when system (1) has m degrees of freedom is given by the following statement.

Theorem 1. Let system (2) with m degrees of freedom be stable for ν < νf, and let at ν = νf its spectrum contain a double eigenvalue iωf with the left and right Jordan chains of generalized eigenvectors u₀, u₁ and v₀, v₁, satisfying equations (15) and (16). Define the real quantities

$$d_1 = \mathrm{Re}(v_0^T D u_0), \qquad d_2 = \mathrm{Im}(v_0^T D u_1 + v_1^T D u_0),$$
$$g_1 = \mathrm{Re}(v_0^T G u_0), \qquad g_2 = \mathrm{Im}(v_0^T G u_1 + v_1^T G u_0), \qquad (39)$$

and

$$\beta_* = -\frac{v_0^T D u_0}{v_0^T G u_0}. \qquad (40)$$

Then, in the vicinity of β := Ω/δ = β* the limit ν⁺₀(β) of the critical flutter load ν⁺cr of the near-reversible system with m degrees of freedom as δ → 0 is

$$\nu_0^{+}(\beta) = \nu_f - \frac{g_1^2}{\mu^2 (d_2 + \beta_* g_2)^2}(\beta - \beta_*)^2 + o\left((\beta - \beta_*)^2\right). \qquad (41)$$


Proof. Perturbing a simple eigenvalue iω(ν) of the stable system (2) at a fixed ν < νf by small dissipative and gyroscopic forces yields the increment

$$\lambda = i\omega - \frac{v^T D u}{2 v^T u}\,\delta - \frac{v^T G u}{2 v^T u}\,\Omega + o(\delta, \Omega). \qquad (42)$$

Since the eigenvectors u(ν) and v(ν) can be chosen real, the first-order increment is real-valued. Therefore, in the first approximation in δ and Ω, the simple eigenvalue iω(ν) remains on the imaginary axis if Ω = β(ν)δ, where

$$\beta(\nu) = -\frac{v^T(\nu) D u(\nu)}{v^T(\nu) G u(\nu)}. \qquad (43)$$

Substituting expansions (18) into formula (43), we obtain

$$\beta(\nu) = -\frac{d_1 \pm d_2\,\mu\sqrt{\nu_f - \nu} + o(\sqrt{\nu_f - \nu})}{g_1 \pm g_2\,\mu\sqrt{\nu_f - \nu} + o(\sqrt{\nu_f - \nu})}, \qquad (44)$$

wherefrom expression (41) follows, if |β − β*| ≪ 1.

Fig. 4. For various ν, bold lines show linear approximations to the boundary of the asymptotic stability domain (white) of system (1) in the vicinity of the origin in the plane (δ, Ω), when trK > 0 and det K > 0, and 4β*² < (trD)² (upper row) or 4β*² > (trD)² (lower row).

After substituting β = Ω/δ, formula (41) gives an approximation of the critical flutter load

$$\nu_{cr}^{+}(\delta, \Omega) = \nu_f - \frac{g_1^2}{\mu^2 (d_2 + \beta_* g_2)^2}\,\frac{(\Omega - \beta_*\delta)^2}{\delta^2}, \qquad (45)$$


which has the canonical Whitney umbrella form. The coefficients (21) and (39) calculated with the use of the vectors (20) are

$$d_1 = 2(k_{22} - k_{11})\,\mathrm{tr}(K - \omega_f^2 I)D, \qquad d_2 = -8\omega_f\left(2d_{12}k_{12} + d_{22}(k_{22} - k_{11})\right),$$
$$g_1 = 4(k_{11} - k_{22})\nu_f, \qquad g_2 = 16\omega_f\nu_f. \qquad (46)$$

With (46), expression (41) reduces to (38). Using exact expressions for the functions ω(ν), u(ν), and v(ν), we obtain better estimates in the case m = 2. Substituting the explicit expression for the eigenfrequency

$$\omega^2(\nu) = \omega_f^2 \pm \sqrt{\nu_f^2 - \nu^2}, \qquad (47)$$

following from (11)-(13), into equation (43), which now reads

$$\delta\left(2\nu_f\beta_* + \left(\omega^2(\nu) - \omega_f^2\right)\mathrm{tr}D\right) - 2\Omega\nu = 0, \qquad (48)$$

we obtain

$$\Omega = \frac{\nu_f}{\nu}\left[\beta_* \pm \frac{\mathrm{tr}D}{2}\sqrt{1 - \frac{\nu^2}{\nu_f^2}}\,\right]\delta. \qquad (49)$$

Equation (49) is simply formula (28) inverted with respect to β = Ω/δ.
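The inversion claim can be verified numerically: for a given load ν, equation (49) yields two directions β = Ω/δ, and substituting either of them back into (28) must recover ν. A sketch with illustrative parameter values (again hypothetical, not from the text):

```python
import numpy as np

nuf, trD, beta_star = 1.0, 3.0, 0.5   # illustrative values, not from the text

def nu0_plus(beta):
    """nu_0^+(beta) of eq. (28)."""
    return nuf * (4.0 * beta * beta_star
                  + trD * np.sqrt(trD**2 + 4.0 * (beta**2 - beta_star**2))) \
           / (trD**2 + 4.0 * beta**2)

nu = 0.8
root = trD / 2.0 * np.sqrt(1.0 - nu**2 / nuf**2)
beta_p = nuf / nu * (beta_star + root)   # eq. (49), plus branch
beta_m = nuf / nu * (beta_star - root)   # eq. (49), minus branch
print(nu0_plus(beta_p), nu0_plus(beta_m))   # both recover nu = 0.8
```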

Fig. 5. The domain of asymptotic stability of system (1) with the singularities Whitney umbrella, dihedral angle, and trihedral angle when K > 0 and 4β*² < (trD)² (a), K > 0 and 4β*² > (trD)² (b), and when trK > 0 and det K < 0 (c).

We use the linear approximation (49) to study the asymptotic behavior of the stability domain of the two-dimensional system (1) in the vicinity of the origin in the plane (δ, Ω) for various ν. It is enough to consider only the case trK > 0 and det K > 0, so that −νf < ν < νf, because for det K ≤ 0 the region ν² < ν²d ≤ ν²f is unstable and should be excluded. For ν² < ν²f the radicand in expression (49) is real and nonzero, so that in the first approximation the domain of asymptotic stability is contained between two lines intersecting at the origin, as depicted in Fig. 4 (central column). When


ν approaches the critical values ±νf, the angle becomes more acute, until at ν = νf or ν = −νf it degenerates to a single line Ω = δβ* or Ω = −δβ*, respectively. For β* ≠ 0 these lines are not parallel to each other, and due to inequality (31) they are never vertical, see Fig. 4 (right column). However, the degeneration can be lifted already in the second-order approximation in δ:

$$\Omega = \pm\delta\beta_* \pm \frac{\omega_f\,\mathrm{tr}D\sqrt{\det D + \beta_*^2}}{2\nu_f}\,\delta^2 + O(\delta^3). \qquad (50)$$

If the radicand is positive, equation (50) defines two curves touching each other at the origin, as shown in Fig. 4 by dashed lines. Inside the cusps |ν±cr(δ, Ω)| > νf. The evolution of the domain of asymptotic stability in the plane (δ, Ω), as ν goes from ±νf to zero, depends on the structure of the matrix D and is governed by the sign of the expression 4β*² − (trD)². For the negative sign the angle between the lines (49) gets wider, tending to π as ν → 0, see Fig. 4 (upper left). Otherwise, the angle reaches a maximum for some ν² < ν²f and then shrinks to the single line δ = 0 at ν = 0, Fig. 4 (lower left). At ν = 0 the Ω-axis corresponds to a marginally stable gyroscopic system. Since the linear approximation to the asymptotic stability domain does not contain the Ω-axis at any ν ≠ 0, small gyroscopic forces cannot stabilize a circulatory system in the absence of damping forces (δ = 0), which is in agreement with the theorems of Lakhadanov and Karapetyan [12, 13]. Reconstructing with the use of the obtained results the asymptotic stability domain of system (1), we find that it has three typical configurations in the vicinity of the ν-axis in the parameter space (δ, Ω, ν). In the case of a positive definite matrix K and a semi-definite or weakly indefinite matrix D, the addition of small damping and gyroscopic forces blows the stability interval ν² < ν²f of the circulatory system up to a three-dimensional region bounded by the parts of the singular surface ν = ν±cr(δ, Ω) which belong to the half-space δ trD > 0, Fig. 5(a). The stability interval of the circulatory system forms an edge of a dihedral angle. At ν = 0 the angle of the intersection reaches its maximum (π), creating another edge along the Ω-axis. While approaching the points ±νf, the angle becomes more acute and ends up with the deadlock of an edge, Fig. 5(a).
When the matrix D approaches the threshold 4β*² = (trD)², two smooth parts of the stability boundary corresponding to negative and positive ν come towards each other until they touch when D is at the threshold. After D becomes strongly indefinite, this temporarily glued configuration collapses into two pockets of asymptotic stability, as shown in Fig. 5(b). Each of the two pockets has a deadlock of an edge as well as two edges which meet at the origin and form a singularity known as the "break of an edge" [17]. The configuration of the asymptotic stability domain shown in Fig. 5(c) corresponds to an indefinite matrix K with trK > 0 and det K < 0. In this case


the condition ν² > ν²d divides the domain of asymptotic stability into two parts, corresponding to positive and negative ν. The intervals of the ν-axis form edges of dihedral angles, which end up with the deadlocks at ν = ±νf and with the trihedral angles at ν = ±νd, Fig. 5(c). Qualitatively, this configuration does not depend on the properties of the matrix D.

Fig. 6. Bifurcation of the domain of asymptotic stability (white) in the plane (δ, Ω) at ν = 0 due to the change of the structure of the matrix D according to the criterion (44).

We note that the parameter 4β*² − (trD)² governs not only the bifurcation of the stability domain near the ν-axis, but also the bifurcation of the whole stability domain in the space of the parameters δ, Ω, and ν. This is seen from the stability conditions (24)-(26). For example, for ν = 0 the inequality Q > 0 reduces to c(δ, Ω) > 0, where c(δ, Ω) is given by (26). For positive semi-definite matrices D this condition is always satisfied. For indefinite matrices the equation c(δ, Ω) = 0 defines either a hyperbola or two intersecting lines. In the case of weakly indefinite D the stability domain is bounded by the ν-axis and one of the hyperbolic branches, see Figure 6 (left). At the threshold 4β*² = (trD)² the stability domain separates into two half-conical parts, as shown in the center of Figure 6. Strongly indefinite damping makes stabilization by small gyroscopic forces impossible, see Figure 6 (right); in this case non-conservative forces are required for stabilization. Thus, we generalize the results of the works [35, 36], which were obtained for diagonal matrices K and D. Moreover, the authors of [35, 36] did not take into account the non-conservative positional forces corresponding to the matrix N in equation (1), and missed the existence of the two classes of indefinite matrices which lead to the bifurcation of the domain of asymptotic stability. We can also conclude that, at least in two dimensions, the requirement of definiteness of the matrix D established in [46] is not necessary for the stabilization of a circulatory system by gyroscopic and damping forces.
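The reduction at ν = 0 can be checked directly. The sketch below evaluates the sign of the printed coefficient c of (26) for a weakly and a strongly indefinite damping matrix; the matrices K, D and the relations ωf² = trK/2, νf = |k11 − k22|/2 (for diagonal K) are illustrative assumptions, not values from the text.

```python
import numpy as np

K = np.diag([3.0, 1.0])    # positive definite, k12 = 0 (illustrative)
w2f = np.trace(K) / 2.0    # assumed omega_f^2
nuf = abs(K[0, 0] - K[1, 1]) / 2.0   # assumed nu_f for diagonal K

def c(delta, Omega, D):
    """Coefficient c(delta, Omega) of eq. (26)."""
    trD, detD = np.trace(D), np.linalg.det(D)
    bs = np.trace((K - w2f * np.eye(2)) @ D) / (2.0 * nuf)   # beta_* of eq. (27)
    return nuf**2 * (trD**2 - 4.0 * bs**2) \
           + (w2f * trD - 2.0 * nuf * bs) * (delta**2 * detD + Omega**2) * trD

D_weak   = np.array([[2.0, 1.5], [1.5, 1.0]])    # det < 0, 4*beta_*^2 < (trD)^2
D_strong = np.array([[1.0, 2.0], [2.0, -2.0]])   # det < 0, 4*beta_*^2 > (trD)^2

# Weakly indefinite damping: c > 0 already for small delta and Omega = 0.
print(c(0.1, 0.0, D_weak))
# Strongly indefinite damping: small gyroscopic forces do not make c positive.
print(c(0.1, 0.2, D_strong))
```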


3   A gyroscopic system with weak damping and circulatory forces

A statically unstable potential system which has been stabilized by gyroscopic forces can be destabilized by the introduction of small stationary damping, which is a velocity-dependent force [1]. However, many statically unstable gyropendulums enjoy robust stability at high speeds [31]. To explain this phenomenon, the concept of rotating damping has been introduced; this force is also proportional to the displacements in a non-conservative way and thus contributes not only to the matrix D in equation (1), but to the matrix N as well [3-5, 31]. This leads to the problem of perturbation of gyroscopic system (3) by weak dissipative and non-conservative positional forces [14, 27, 31, 32, 46, 48, 49, 59, 62, 63, 66, 74].

3.1   Stability of a gyroscopic system

In the absence of dissipative and circulatory forces (δ = ν = 0), the polynomial (10) has four roots ±λ±, where

$$\lambda_{\pm} = \sqrt{-\frac{1}{2}(\mathrm{tr}K + \Omega^2) \pm \frac{1}{2}\sqrt{(\mathrm{tr}K + \Omega^2)^2 - 4\det K}}. \qquad (51)$$

Analysis of these eigenvalues yields the following result, see e.g. [47].

Proposition 4. If det K > 0 and trK < 0, gyroscopic system (3) with two degrees of freedom is unstable by divergence for Ω² < Ω₀⁻², unstable by flutter for Ω₀⁻² ≤ Ω² ≤ Ω₀⁺², and stable for Ω₀⁺² < Ω², where the critical values Ω₀⁻ and Ω₀⁺ are

$$0 \le \Omega_0^- := \sqrt{-\mathrm{tr}K - 2\sqrt{\det K}} \;\le\; \Omega_0^+ := \sqrt{-\mathrm{tr}K + 2\sqrt{\det K}}. \qquad (52)$$
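Proposition 4 is easy to confirm numerically by computing the spectrum of the first-order (companion) form of system (3). The sketch below uses an illustrative K < 0 and the gyroscopic matrix G = [[0, 1], [-1, 0]] (an assumption consistent with the characteristic polynomial (10) of a 2-DOF system, but not stated in this excerpt).

```python
import numpy as np

def gyro_eigs(K, Omega):
    """Eigenvalues of the pencil lam^2*I + lam*Omega*G + K with G = [[0,1],[-1,0]],
    via the 4x4 companion (first-order) matrix."""
    G = np.array([[0.0, 1.0], [-1.0, 0.0]])
    A = np.block([[np.zeros((2, 2)), np.eye(2)], [-K, -Omega * G]])
    return np.linalg.eigvals(A)

K = np.diag([-1.0, -3.0])    # statically unstable: K < 0, det K = 3 > 0 (illustrative)
Om_m = np.sqrt(-np.trace(K) - 2.0 * np.sqrt(np.linalg.det(K)))   # Omega_0^- of eq. (52)
Om_p = np.sqrt(-np.trace(K) + 2.0 * np.sqrt(np.linalg.det(K)))   # Omega_0^+

div  = gyro_eigs(K, 0.5 * Om_m)           # below Omega_0^-: divergence
flut = gyro_eigs(K, 0.5 * (Om_m + Om_p))  # between the critical values: flutter
stab = gyro_eigs(K, 1.2 * Om_p)           # above Omega_0^+: gyroscopic stabilization
print(div.real.max(), flut.real.max(), np.abs(stab.real).max())
```

Divergence shows up as a positive real eigenvalue, flutter as a complex eigenvalue with a positive real part, and gyroscopic stabilization as a purely imaginary spectrum.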

If det K > 0 and trK > 0, the gyroscopic system is stable for any Ω [2]. If det K ≤ 0, the system is unstable [1]. Representing for det K > 0 the equation (51) in the form

$$\lambda_{\pm} = \sqrt{-\frac{1}{2}\left(\Omega^2 - \frac{\Omega_0^{-2} + \Omega_0^{+2}}{2}\right) \pm \frac{1}{2}\sqrt{\left(\Omega^2 - \Omega_0^{-2}\right)\left(\Omega^2 - \Omega_0^{+2}\right)}}, \qquad (53)$$

we find that at Ω = 0 there are in general four real roots ±λ± = ±(Ω₀⁺ ± Ω₀⁻)/2, and system (3) is statically unstable. With the increase of Ω² the distance λ₊ − λ₋ between the two roots of the same sign gets smaller. The roots move towards each other until they merge at Ω² = Ω₀⁻², with the origination of a pair of double real eigenvalues ±ω₀ with Jordan blocks, where

$$\omega_0 = \frac{1}{2}\sqrt{\Omega_0^{+2} - \Omega_0^{-2}} = \sqrt[4]{\det K} > 0. \qquad (54)$$


Further increase of Ω² yields splitting of ±ω₀ into two couples of complex conjugate eigenvalues lying on the circle

$$(\mathrm{Re}\,\lambda)^2 + (\mathrm{Im}\,\lambda)^2 = \omega_0^2. \qquad (55)$$

The complex eigenvalues move along the circle until at Ω² = Ω₀⁺² they reach the imaginary axis and originate a complex-conjugate pair of double purely imaginary eigenvalues ±iω₀. For Ω² > Ω₀⁺² the double eigenvalues split into four simple purely imaginary eigenvalues which do not leave the imaginary axis, Fig. 7.
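The circle property (55) inside the flutter interval can be verified directly; the sketch reuses the illustrative K < 0 and the assumed G = [[0, 1], [-1, 0]] from above.

```python
import numpy as np

K = np.diag([-1.0, -3.0])                 # illustrative K < 0 with det K > 0
G = np.array([[0.0, 1.0], [-1.0, 0.0]])
w0 = np.linalg.det(K)**0.25               # omega_0 of eq. (54)

Omega = 1.5                               # inside the flutter interval (Om_0^-, Om_0^+)
A = np.block([[np.zeros((2, 2)), np.eye(2)], [-K, -Omega * G]])
eigs = np.linalg.eigvals(A)
radii = eigs.real**2 + eigs.imag**2       # eq. (55) predicts radii = w0^2 for all four
print(radii, w0**2)
```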

Fig. 7. Stability diagram for the gyroscopic system with K < 0 (left) and the corresponding trajectories of the eigenvalues in the complex plane for increasing parameter Ω > 0 (right).

Thus, system (3) with K < 0 is statically unstable for Ω ∈ (−Ω₀⁻, Ω₀⁻), dynamically unstable for Ω ∈ [−Ω₀⁺, −Ω₀⁻] ∪ [Ω₀⁻, Ω₀⁺], and stable (gyroscopic stabilization) for Ω ∈ (−∞, −Ω₀⁺) ∪ (Ω₀⁺, ∞), see Fig. 7. The values ±Ω₀⁻ of the gyroscopic parameter define the boundary between the divergence and flutter domains, while the values ±Ω₀⁺ originate the flutter-stability boundary.

3.2   The influence of small damping and non-conservative positional forces on the stability of a gyroscopic system

Consider the asymptotic stability domain in the plane (δ, ν) in the vicinity of the origin, assuming that Ω ≠ 0 is fixed. Observing that the third of the inequalities (22) is fulfilled for det K > 0, and that the first one simply restricts the region of variation of δ to the half-plane δ trD > 0, we focus our analysis on the remaining two of the conditions (22). Taking into account the structure of the coefficients (26) and keeping the terms linear in δ in the Taylor expansions of the functions ν±cr(δ, Ω), we


get the equations determining a linear approximation to the stability boundary:

$$\nu = \frac{\mathrm{tr}KD - \mathrm{tr}K\,\mathrm{tr}D - \mathrm{tr}D\,\lambda_{\pm}^2(\Omega)}{2\Omega}\,\delta = \frac{2\,\mathrm{tr}KD + \mathrm{tr}D(\Omega^2 - \mathrm{tr}K) \pm \mathrm{tr}D\sqrt{(\Omega^2 + \mathrm{tr}K)^2 - 4\det K}}{4\Omega}\,\delta, \qquad (56)$$
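The equality of the two forms in (56) is a matter of substituting λ±² from (51); a minimal check with illustrative matrices (hypothetical values, not from the text):

```python
import numpy as np

K = np.array([[3.0, 1.0], [1.0, 2.0]])     # trK > 0, det K > 0 (illustrative)
D = np.array([[1.0, 2.0], [2.0, -1.0]])
Omega = 2.0
trK, detK, trD = np.trace(K), np.linalg.det(K), np.trace(D)
trKD = np.trace(K @ D)

# First form of eq. (56), with lambda_{+/-}^2 taken from eq. (51).
lam2 = np.array([(-(trK + Omega**2) + s * np.sqrt((trK + Omega**2)**2 - 4.0 * detK)) / 2.0
                 for s in (+1.0, -1.0)])
form1 = (trKD - trK * trD - trD * lam2) / (2.0 * Omega)

# Second, explicit form of eq. (56).
form2 = np.array([(2.0 * trKD + trD * (Omega**2 - trK)
                   + s * trD * np.sqrt((Omega**2 + trK)**2 - 4.0 * detK)) / (4.0 * Omega)
                  for s in (+1.0, -1.0)])
print(sorted(form1), sorted(form2))
```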

where the eigenvalues λ±(Ω) are given by formula (51). For det K > 0 and trK > 0 the gyroscopic system is stable at any Ω. Consequently, the coefficients λ²±(Ω) are always real, and equations (56) define in general two lines intersecting at the origin, Fig. 8. Since trK > 0, the second of the inequalities (22) is satisfied for det D > 0, and it gives an upper bound on δ² for det D < 0. Thus, a linear approximation to the domain of asymptotic stability near the origin in the plane (δ, ν) is an angle-shaped area between the two lines (56), as shown in Fig. 8. With the change of Ω the size of the angle varies and, moreover, the stability domain rotates as a whole about the origin. As Ω → ∞, the size of the angle tends to π/2 in such a way that the stability domain fits into one of the four quadrants of the parameter plane, as shown in Fig. 8 (right column). From (56) it follows that asymptotically, as Ω → 0,

$$\nu(\Omega) = \frac{\nu_f}{\Omega}\left(\beta_* \pm \frac{\mathrm{tr}D}{2}\right) + o\!\left(\frac{1}{\Omega}\right). \qquad (57)$$

Consequently, the angle between the lines (56) tends to π for matrices D satisfying the condition 4β*² < (trD)², see Fig. 8 (upper left). In this case, in the linear approximation, the domain of asymptotic stability spreads over two quadrants and contains the δ-axis. Otherwise, the angle tends to zero as Ω → 0, Fig. 8 (lower left). In the linear approximation the stability domain then always belongs to one quadrant and does not contain the δ-axis, so that in the absence of non-conservative positional forces gyroscopic system (3) with K > 0 cannot be made asymptotically stable by damping forces with a strongly indefinite matrix D, which is also visible in the three-dimensional picture of Fig. 5(b). The three-dimensional domain of asymptotic stability of near-Hamiltonian system (1) with K > 0 and D semi-definite or weakly indefinite lies inside a dihedral angle with the Ω-axis as its edge, as shown in Fig. 5(a). With the increase of |Ω|, the section of the domain by the plane Ω = const becomes narrower and rotates about the origin, so that points of the parameter plane (δ, ν) that were stable at lower |Ω| can lose their stability at higher absolute values of the gyroscopic parameter (gyroscopic destabilization of a statically stable potential system in the presence of damping and non-conservative positional forces). To study the case K < 0, we write equation (56) in the form

$$\nu = \frac{\Omega_0^+}{\Omega}\left[\gamma_* + \frac{\mathrm{tr}D}{4}\sqrt{\frac{\Omega^2}{\Omega_0^{+2}} - 1}\left(\sqrt{\Omega^2 - \Omega_0^{+2}} \pm \sqrt{\Omega^2 - \Omega_0^{-2}}\right)\right]\delta, \qquad (58)$$


Fig. 8. For various Ω, bold lines show linear approximations to the boundary of the asymptotic stability domain (white) of system (1) in the vicinity of the origin in the plane (δ, ν), when trK > 0 and det K > 0, and 4β*² < (trD)² (upper row) or 4β*² > (trD)² (lower row).

where

$$\gamma_* := \frac{\mathrm{tr}\left[K + (\Omega_0^{+2} - \omega_0^2) I\right]D}{2\Omega_0^+}. \qquad (59)$$

Proposition 5. Let λ₁(D) and λ₂(D) be eigenvalues of D. Then,

$$|\gamma_*| \le \Omega_0^-\,\frac{|\lambda_1(D) - \lambda_2(D)|}{4} + \Omega_0^+\,\frac{|\lambda_1(D) + \lambda_2(D)|}{4}. \qquad (60)$$

Proof. With the use of the Cauchy-Schwarz inequality we obtain

$$|\gamma_*| \le \frac{\left|\mathrm{tr}\left(K - \frac{\mathrm{tr}K}{2}I\right)\left(D - \frac{\mathrm{tr}D}{2}I\right)\right|}{2\Omega_0^+} + \Omega_0^+\,\frac{|\mathrm{tr}D|}{4} \le \frac{|\lambda_1(K) - \lambda_2(K)|\,|\lambda_1(D) - \lambda_2(D)|}{4\Omega_0^+} + \Omega_0^+\,\frac{|\mathrm{tr}D|}{4}. \qquad (61)$$

Taking into account that |λ₁(K) − λ₂(K)| = Ω₀⁻Ω₀⁺, we get inequality (60).
Expression (58) is real-valued when Ω² ≥ Ω₀⁺² or Ω² ≤ Ω₀⁻². For sufficiently small |δ| the first inequality implies the second of the stability conditions (22), whereas the last inequality contradicts it. Consequently, the domain of asymptotic stability is determined by the inequalities δ trD > 0 and Q(δ, ν, Ω) > 0, and its linear approximation in the vicinity of the origin in the (δ, ν)-plane has the form of an angle with the boundaries given by equations (58). For Ω tending to infinity the angle expands to π/2, whereas for Ω = Ω₀⁺ or Ω = −Ω₀⁺

51

For various Ω, bold lines show linear approximations to the boundary of the asymptoti stability domain (white) of system (1) in the vi inity of the origin in the plane (δ, ν), when K < 0.

Fig. 9.

it degenerates to a single line ν = δγ∗ or ν = −δγ∗ respe tively. For γ∗ 6= 0 these lines are not parallel to ea h other, and due to inequality (60) they never stay verti al, see Fig. 9 (left). The degeneration an, however, be removed in the se ond-order approximation in δ ν = ±δγ∗ ±

q

trD ω20 det D − γ2∗ 2Ω+ 0

δ2 + O(δ3 ),

(62)

as shown by dashed lines in Fig. 9 (left). Therefore, gyros opi stabilization of stati ally unstable onservative system with K < 0 an be improved up to asymptoti stability by small damping and ir ulatory for es, if their magnitudes are in the narrow region with the boundaries depending on Ω. The lower the desirable absolute value of the riti al gyros opi parameter Ωcr (δ, ν) the poorer

hoi e of the appropriate ombinations of damping and ir ulatory for es. To estimate the new riti al value of the gyros opi parameter Ωcr (δ, ν), whi h an deviate signi antly from that of the onservative gyros opi system, we onsider the formula (58) in the vi inity of the points (0, 0, ±Ω+ 0 , ) in the parameter spa e. Leaving only the terms, whi h are

onstant or proportional to q + Ω ± Ω0 in both the numerator and denominator and assuming ν = γδ, we nd 2 + + ±Ω+ (63) (γ ∓ γ∗ )2 + o((γ − γ∗ )2 ), cr (γ) = ±Ω0 ± Ω0 (ω0 trD)2 After substitution γ = ν/δ equations (63) take the form anoni al for the Whitney umbrella. The domain of asymptoti stability onsists of two po kets of two Whitney umbrellas, sele ted by the onditions δtrD > 0 and Q(δ, ν, Ω) > 0. Equations (58) are a linear approximation to the stability boundary in the vi inity of the Ω-axis. Moreover, they des ribe in an impli it form a limit of the riti al gyros opi parameter Ωcr (δ, γδ) when δ tends to zero, as a fun tion of the ratio γ = ν/δ, Fig. 10(b). Most of the dire tions γ give the limit + value |Ω± cr (γ)| > Ω0 with an ex eption for γ = γ∗ and γ = −γ∗ , so that

52

O. N. Kirillov

+ + − Ω+ cr (γ∗ ) = Ω0 and Ωcr (−γ∗ ) = −Ω0 . Estimates of the riti al gyros opi pa-

rameter (63) are extended to the ase of arbitrary number of degrees of freedom by the following statement.

Fig. 10. Blowing the domain of gyros opi stabilization of a stati ally unstable onservative system with K < 0 up to the domain of asymptoti stability with the Whitney umbrella singularities (a). The limits of the riti al gyros opi parameter Ω± cr as fun tions of γ = ν/δ (b).

Let the system (3) with even number m of degrees of freedom be gyros opi ally stabilized for Ω > Ω+0 and let at Ω = Ω+0 its spe trum ontain a double eigenvalue iω0 with the Jordan hain of generalized eigenve tors u0 , u1 , satisfying the equations Theorem 2.

(−Iω20 + iω0 Ω+ 0 G + K)u0 = 0, + (−Iω20 + iω0 Ω+ 0 G + K)u1 = −(2iω0 I + Ω0 G)u0 .

De ne the real quantities d1 , d2 , n1 ,

n2 ,

and γ∗ as

d1 = Re(uT0 Du0 ),

d2 = Im(uT0 Du1 − uT1 Du0 ),

n1 = Im(uT0 Nu0 ),

n2 = Re(uT0 Nu1 − uT1 Nu0 ),

γ∗ = −iω0

(64)

uT0 Du0 , uT0 Nu0

(65) (66)

where the bar over a symbol denotes omplex onjugate. Then, in the vi inity of γ := ν/δ = γ∗ the limit of the riti al value of the gyros opi parameter Ω+cr of the near-Hamiltonian system as δ → 0 is + Ω+ cr (γ) = Ω0 +

whi h is valid for |γ − γ∗| ≪ 1.

n21 (γ − γ∗ )2 , µ2 (ω0 d2 − γ∗ n2 − d1 )2

(67)

Sensitivity analysis of Hamiltonian and reversible systems

53

Proof. Perturbing the system (3), whi h is stabilized by the gyros opi for es with Ω > Ω+ 0 , by small damping and ir ulatory for es, yields an in rement to a simple eigenvalue [53℄ λ = iω −

ω2 uT Duδ − iωuT Nuν + o(δ, ν). uT Ku + ω2 uT u

(68)

Choose the eigenvalues and the orresponding eigenve tors that merge at Ω = Ω+ 0

q 1 + 2 iω(Ω) = iω0 ± iµ Ω − Ω+ 0 + o(|Ω − Ω0 | ), q 1 + 2 u(Ω) = u0 ± iµu1 Ω − Ω+ 0 + o(|Ω − Ω0 | ),

where

µ2 = −

2ω20 uT0 u0 . T + T T 2 T Ω+ 0 (ω0 u1 u1 − u1 Ku1 − iω0 Ω0 u1 Gu1 − u0 u0 )

(69) (70)

Sin e D and K are real symmetri matri es and N is a real skew-symmetri one, the rst-order in rement to the eigenvalue iω(Ω) given by (68) is real-valued. Consequently, in the rst approximation in δ and ν, simple eigenvalue iω(Ω) remains on the imaginary axis, if ν = γ(Ω)δ, where γ(Ω) = −iω(Ω)

uT (Ω)Du(Ω) . uT (Ω)Nu(Ω)

(71)

Substitution of the expansions (69) into the formula (71) yields q q d1 ∓ µd2 Ω − Ω+ 0 q γ(Ω) = −(ω0 ± µ Ω − Ω+ , 0) n1 ± µn2 Ω − Ω+ 0

(72)

wherefrom the expression (67) follows, if |γ − γ∗ | ≪ 1.

Substituting γ = ν/δ in expression (72) yields the estimate for the riti al value of the gyros opi parameter Ω+ cr (δ, ν) + Ω+ cr (δ, ν) = Ω0 +

n21 (ν − γ∗ δ)2 . µ2 (ω0 d2 − γ∗ n2 − d1 )2 δ2

(73)

We show now that for m = 2 expression (67) implies (63). At the riti al value of the gyros opi parameter Ω+ 0 de ned by equation (52), the double eigenvalue iω0 with ω0 given by (54) has the Jordan hain −1 −iω0 Ω+ − k12 0 0 , u1 = 2 u0 = . −ω20 + k11 ω0 − k22 iω0 (k22 − k11 ) − Ω+ 0 k12

(74)

54

O. N. Kirillov

With the ve tors (74) equation (70) yields µ2 =

2 2 Ω+ Ω+ 0 (ω0 − k11 )(ω0 − k22 ) 0 > 0, = 2 2 2 Ω+ ω2 − k2 0

0

(75)

12

whereas the formula (66) reprodu es the oeÆ ient γ∗ given by (59). To show that (63) follows from (67) it remains to al ulate the oeÆ ients (65). We have 2 2 2 n1 = −2Ω+ 0 ω0 (ω0 − k11 ), ω0 d2 − γ∗ n2 − d1 = −2ω0 (ω0 − k11 )trD. (76) 2 2 Taking into a

ount that (Ω+ 0 ) = −trK + 2ω0 , and using the relations (76) in (73) we exa tly reprodu e (63). Therefore, in the presen e of small damping and non- onservative positional for es, gyros opi for es an both destabilize a stati ally stable onservative system (gyros opi destabilization) and stabilize a stati ally unstable onservative system (gyros opi stabilization). The rst ee t is essentially related with the dihedral angle singularity of the stability boundary, whereas the se ond one is governed by the Whitney umbrella singularity. In the remaining se tions we demonstrate how these singularities appear in me hani al systems.

4

The modified Maxwell-Bloch equations with mechanical applications

The modi ed Maxwell-Blo h equations are the normal form for rotationally symmetri , planar dynami al systems [28, 48, 59℄. They follow from equation (1) for m = 2, D = I, and K = κI, and thus an be written as a single dierential equation with the omplex oeÆ ients x + iΩx_ + δx_ + iνx + κx = 0, x = x1 − ix2 ,

(77)

where κ orresponds to potential for es. Equations in this form appear in gyrodynami al problems su h as the tippe top inversion, the rising egg, and the onset of os illations in the squealing dis brake and the singing wine glass [14, 31, 48, 59, 62, 66, 68, 76℄. A

ording to stability onditions (22) the solution x = 0 of equation (77) is asymptoti ally stable if and only if δ > 0,

Ω>

ν δ − κ. δ ν

(78)

For κ > 0 the domain of asymptoti stability is a dihedral angle with the Ω-axis serving as its edge, Fig. 11(a). The se tions of the domain by the planes Ω = const are ontained in the angle-shaped regions with the boundaries ν=

Ω±

√ Ω2 + 4κ δ. 2

(79)

Sensitivity analysis of Hamiltonian and reversible systems

55

Fig. 11. Two on gurations of the asymptoti stability domain of the modi ed MaxwellBlo h equations for κ > 0 (a) and κ < 0 (b) orresponding to gyros opi destabilization and gyros opi stabilization respe tively; Hauger's gyropendulum ( ).

The domain shown in Fig. 11(a) is a parti ular ase of that depi ted in Fig. 5(a). For K = κI the interval [−νf , νf ] shown in Fig. 5(a)√shrinks to a point so that at Ω = 0 the angle is bounded by the lines ν = ±δ κ and thus it is less than π. The domain of asymptoti stability is twisting around the Ω-axis in su h a manner that it always remains in the half-spa e δ > 0, Fig. 11(a). Consequently, the system stable at Ω = 0 an be ome unstable at greater Ω, as shown in Fig. 11(a) by the dashed line. The larger magnitudes of ir ulatory for es, the lower |Ω| at the onset of instability. As κ > 0 de reases, the hypersurfa es forming the dihedral angle approa h ea h other so that, at κ = 0, they temporarily merge along the line ν = 0 and a new on guration originates for κ < 0, Fig. 11(b). The new domain of asymptoti stability onsists of two disjoint parts that are po kets of two Whitney umbrellas singled out by inequality δ > 0. The absolute values of the gyros opi parameter Ω in the stability domain are always not less than √ Ω+ = 2 −κ . As a onsequen e, the system unstable at Ω = 0 an be ome 0 asymptoti ally stable at greater Ω, as shown in Fig. 11(b) by the dashed line. 4.1

Stability of Hauger’s gyropendulum

Hauger's gyropendulum [14℄ is an axisymmetri rigid body of mass m hinged at the point O on the axis of symmetry as shown in Figure (11)( ). The body's moment of inertia about the axis through the point O perpendi ular to the axis of symmetry is denoted by I, the body's moment of inertia about the axis of symmetry is denoted by I0 , and the distan e between the fastening point and the

enter of mass is s. The orientation of the pendulum, whi h is asso iated with the trihedron Oxfyf zf , with respe t to the xed trihedron Oxi yi zi is spe i ed by the angles ψ, θ, and φ. The pendulum experien es the for e of gravity G = mg and a follower torque T that lies in the plane of the zi and zf oordinate axes. The moment ve tor makes an angle of ηα with the axis zi , where η is a

56

O. N. Kirillov

parameter (η 6= 1) and α is the angle between the zi and zf axes. Additionally, the pendulum experien es the restoring elasti moment R = −rα in the hinge and the dissipative moments B = −bωs and K = −kφ, where ωs is the angular velo ity of an auxiliary oordinate system Oxs ys zs with respe t to the inertial system and r, b, and k are the orresponding oeÆ ients. Linearization of the nonlinear equations of motion derived in [14℄ with the new variables x1 = ψ and x2 = θ and the subsequent nondimensionalization yield the Maxwell-Blo h equations (77) where the dimensionless parameters are given by Ω=

I0 1−η T b r − mgs , ν= T, ω = − . , δ= , κ= 2 2 I Iω Iω Iω k

(80)

The domain of asymptoti stability of the Hauger gyropendulum, given by (78), is shown in Fig. 11(a,b). A

ording to formulas (52) and (54), for the stati ally unstable gyropendulum (κ < 0) the√singular points on the Ω-axis orrespond to the riti al √ = ±2 −κ and the

riti al frequen y ω = −κ . Noting that values ±Ω+ 0 0√ + + Ωcr (ν = ± −κδ, δ) = ±Ω0 and substituting γ = ν/δ into formula (78), we √ expand Ω+ cr (γ) in a series in the neighborhood of γ = ± −κ √ √ √ 1 (γ ∓ −κ)2 + o (γ ∓ −κ)2 . Ω+ cr (γ) = ±2 −κ ± √ −κ

(81)

Proceeding from γ to ν and δ in (81) yields approximations of the stability boundary near the singularities:

Ωcr⁺(ν, δ) = ±2√(−κ) ± (1/√(−κ)) (ν ∓ δ√(−κ))²/δ².   (82)

They also follow from formula (63) after substituting ω₀ = √(−κ) and γ∗ = √(−κ), where the last value is given by (59). Thus, Hauger's gyropendulum, which is unstable at Ω = 0, can become asymptotically stable for sufficiently large |Ω| > Ω₀⁺ under a suitable combination of dissipative and nonconservative positional forces. Note that Hauger failed to find the Whitney umbrella singularities on the boundary of the pendulum's gyroscopic stabilization domain.

4.2 Friction-induced instabilities in rotating elastic bodies of revolution

The modified Maxwell–Bloch equations (77) with Ω = 2Ω̃, κ = ρ² − Ω̃², ν = 0, and δ = 0, where ρ > 0 is the frequency of free vibrations of the potential system corresponding to δ = Ω̃ = ν = 0, describe a two-mode approximation of the models of rotating elastic bodies of revolution after their linearization and discretization [67, 71, 76]. In the absence of dissipative and nonconservative

Sensitivity analysis of Hamiltonian and reversible systems


positional forces, the characteristic polynomial (10) corresponding to the operator L0(Ω̃) = Iλ² + 2λΩ̃G + (ρ² − Ω̃²)I, which belongs to the class of matrix polynomials considered, e.g., in [38], has four purely imaginary roots

λp± = iρ ± iΩ̃,  λn± = −iρ ± iΩ̃.   (83)

In the plane (Ω̃, Im λ) the eigenvalues (83) form a collection of straight lines intersecting each other – the spectral mesh [64, 76]. Two nodes of the mesh at Ω̃ = 0 correspond to the double semi-simple eigenvalues λ = ±iρ. The double semi-simple eigenvalue iρ at Ω̃ = Ω̃0 = 0 has two linearly independent eigenvectors u1 and u2:

u1 = (1/√(2ρ)) (0, 1)^T,  u2 = (1/√(2ρ)) (1, 0)^T.   (84)

The eigenvectors are orthogonal, ui^T uj = 0 for i ≠ j, and satisfy the normalization condition ui^T ui = (2ρ)⁻¹. At the other two nodes, at Ω̃ = ±Ω̃d, there exist double semi-simple eigenvalues λ = 0. The range |Ω̃| < Ω̃d = ρ is called subcritical for the gyroscopic parameter Ω̃.

In the following, with the use of the perturbation theory of multiple eigenvalues, we describe the deformation of the mesh caused by dissipative (δD) and nonconservative (νN) perturbations, originating, e.g., from frictional contact, and clarify the key role of indefinite damping and nonconservative positional forces in the development of the subcritical flutter instability. This gives a clear mathematical description of the mechanism of excitation of particular modes of rotating structures in frictional contact, such as squealing disc brakes and singing wine glasses [67, 71, 76].

Under a perturbation Ω̃ = Ω̃0 + ΔΩ̃ of the gyroscopic parameter, the double eigenvalue iρ bifurcates into two simple ones according to the asymptotic formula [58]

λp± = iρ + iΔΩ̃ (f11 + f22)/2 ± iΔΩ̃ √((f11 − f22)²/4 + f12 f21),   (85)

where the quantities fij are

fij = uj^T (∂L0(Ω̃)/∂Ω̃)|_{Ω̃=0, λ=iρ} ui = 2iρ uj^T G ui.   (86)
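The coefficients (86) are easy to evaluate explicitly with the eigenvectors (84). In the sketch below the sign convention for the skew-symmetric matrix G is our assumption (it is not fixed in this excerpt), so only sign-independent facts are checked: the diagonal fij vanish by skew symmetry and the off-diagonal pair is antisymmetric of unit size.

```python
import numpy as np

rho = 2.0
# eigenvectors (84); the sign convention for G is assumed, not fixed in this excerpt
u1 = np.array([0.0, 1.0]) / np.sqrt(2 * rho)
u2 = np.array([1.0, 0.0]) / np.sqrt(2 * rho)
G = np.array([[0.0, -1.0], [1.0, 0.0]])

def f(ui, uj):
    # f_ij = u_j^T (dL0/dOmega)|_{Omega=0, lambda=i rho} u_i = 2 i rho u_j^T G u_i, eq. (86)
    return 2j * rho * (uj @ G @ ui)

f11, f22, f12, f21 = f(u1, u1), f(u2, u2), f(u1, u2), f(u2, u1)
print(f11, f22)                     # both zero by skew symmetry of G
print(abs(f12), f12 == -f21)        # unit size, antisymmetric pair
```

Substituting these values into (85) collapses the square root to 1, which is exactly how the asymptotic formula recovers the exact eigenvalues (83).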

The skew symmetry of G yields f11 = f22 = 0 and f12 = −f21 = i, so that (85) reproduces the exact result (83).

4.2.1 Deformation of the spectral mesh. Consider a perturbation L0(Ω̃) + ΔL(Ω̃) of the gyroscopic system, assuming that the size of the perturbation ΔL(Ω̃) = δλD + νN ∼ ε is small, where ε = ‖ΔL(0)‖ is the Frobenius norm


of the perturbation at Ω̃ = 0. The behavior of the perturbed eigenvalue iρ for small Ω̃ and small ε is described by the asymptotic formula [58]

λ = iρ + iΩ̃ (f11 + f22)/2 + i(ǫ11 + ǫ22)/2 ± i√((Ω̃(f11 − f22) + ǫ11 − ǫ22)²/4 + (Ω̃f12 + ǫ12)(Ω̃f21 + ǫ21)),   (87)

where fij are given by (86) and the ǫij are small complex numbers of order ε:

ǫij = uj^T ΔL(0) ui = iρδ uj^T D ui + ν uj^T N ui.   (88)

With the use of the vectors (84) we obtain

λ = iρ − (µ1 + µ2)δ/4 ± √c,  c = ((µ1 − µ2)/4)² δ² + (iΩ̃ + ν/(2ρ))²,   (89)

where the eigenvalues µ1, µ2 of D satisfy the equation µ² − µ trD + det D = 0. Separation of real and imaginary parts in equation (89) yields

Re λ = −(µ1 + µ2)δ/4 ± √((|c| + Re c)/2),  Im λ = ρ ± √((|c| − Re c)/2),   (90)

where

Re c = ((µ1 − µ2)/4)² δ² − Ω̃² + ν²/(4ρ²),  Im c = Ω̃ν/ρ.   (91)

The formulas (89)–(91) describe the splitting of the double eigenvalues at the nodes of the spectral mesh due to variation of the parameters. Assuming ν = 0 in formulas (90), we find that

(Re λ + (µ1 + µ2)δ/4)² + Ω̃² = (µ1 − µ2)²δ²/16,  Im λ = ρ,   (92)

when

Ω̃² − (µ1 − µ2)²δ²/16 < 0,   (93)

and

Ω̃² − (Im λ − ρ)² = (µ1 − µ2)²δ²/16,  Re λ = −(µ1 + µ2)δ/4,   (94)

when the sign in inequality (93) is opposite. For a given δ, equation (94) defines a hyperbola in the plane (Ω̃, Im λ), while (92) is the equation of a circle in the plane (Ω̃, Re λ), as shown in Fig. 12(a,c). For tracking the complex eigenvalues as the gyroscopic parameter Ω̃ changes, it is convenient to consider the eigenvalue branches in the three-dimensional space (Ω̃, Im λ, Re λ). In this space the circle belongs to the plane Im λ = ρ and the hyperbola lies in the plane Re λ = −δ(µ1 + µ2)/4, see Fig. 13(a,c).
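The circle (92) and the hyperbola (94) can be verified directly from formula (89). The sketch below sweeps Ω̃ for ν = 0 with illustrative values of µ1, µ2, δ, ρ and checks both identities on each branch.

```python
import numpy as np
import cmath

mu1, mu2, delta, rho = 1.5, 0.4, 0.2, 3.0   # illustrative values, nu = 0

def lam_pair(Om):
    # formula (89) with nu = 0:  lambda = i rho - delta(mu1+mu2)/4 ± sqrt(c)
    c = ((mu1 - mu2) / 4) ** 2 * delta ** 2 + (1j * Om) ** 2
    s = cmath.sqrt(c)
    base = 1j * rho - delta * (mu1 + mu2) / 4
    return base + s, base - s

r2 = (mu1 - mu2) ** 2 * delta ** 2 / 16   # squared radius of the bubble, cf. (92)

for Om in np.linspace(-0.2, 0.2, 41):
    lp, _ = lam_pair(Om)
    if Om ** 2 < r2:      # condition (93): on the circle, Im(lambda) = rho
        lhs = (lp.real + delta * (mu1 + mu2) / 4) ** 2 + Om ** 2
        assert abs(lhs - r2) < 1e-12 and abs(lp.imag - rho) < 1e-12
    else:                 # hyperbola (94): Re(lambda) is constant
        assert abs(Om ** 2 - (lp.imag - rho) ** 2 - r2) < 1e-12
        assert abs(lp.real + delta * (mu1 + mu2) / 4) < 1e-12
print("circle (92) and hyperbola (94) verified")
```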


Fig. 12. Origination of a latent source of the subcritical flutter instability in the presence of full dissipation: submerged bubble of instability (a); coalescence of eigenvalues in the complex plane at two exceptional points (b); hyperbolic trajectories of imaginary parts (c).

The radius rb of the circle of complex eigenvalues – the bubble of instability – and the distance db of its center from the plane Re λ = 0 are expressed by means of the eigenvalues µ1 and µ2 of the matrix D:

rb = |(µ1 − µ2)δ|/4,  db = |(µ1 + µ2)δ|/4.   (95)

Consequently, the bubble of instability is "submerged" under the surface Re λ = 0 in the space (Ω̃, Im λ, Re λ) and does not intersect the plane Re λ = 0 under the condition db > rb, which is equivalent to the positive definiteness of the matrix δD. Hence, the role of full dissipation or pervasive damping is to deform the spectral mesh in such a way that the double semi-simple eigenvalue is inflated to the bubble of complex eigenvalues (92), connected with the two branches of the hyperbola (94) at the points

Im λ = ρ,  Re λ = −δ(µ1 + µ2)/4,  Ω̃ = ±δ(µ1 − µ2)/4,   (96)

and to plunge all the eigenvalue curves into the region Re λ ≤ 0. The eigenvalues at the points (96) are double and have a Jordan chain of order 2. In the complex plane the eigenvalues move with the variation of Ω̃ along the line Re λ = −db until they meet at the points (96) and then split in the orthogonal direction; however, they never cross the imaginary axis, see Fig. 12(b). The radius of the bubble of instability is greater than the depth of its submersion under the surface Re λ = 0 only if the eigenvalues µ1 and µ2 of the damping matrix have different signs, i.e., if the damping is indefinite. Damping with an indefinite matrix appears in systems with frictional contact when the friction coefficient decreases with relative sliding velocity [35, 36, 40]. Indefinite damping leads to the emersion of the bubble of instability, meaning that the


Fig. 13. The mechanism of subcritical flutter instability (bold lines): the ring (bubble) of complex eigenvalues submerged under the surface Re λ = 0 due to the action of dissipation with det D > 0 – a latent source of instability (a); repulsion of eigenvalue branches of the spectral mesh due to the action of nonconservative positional forces (b); emersion of the bubble of instability due to indefinite damping with det D < 0 (c); collapse of the bubble of instability and immersion and emersion of its parts due to the combined action of dissipative and nonconservative positional forces (d).

eigenvalues of the bubble have positive real parts in the range Ω̃² < Ω̃cr², where Ω̃cr = (δ/2)√(−det D). Changing the damping matrix δD from positive definite to indefinite switches the state of the bubble of instability from latent (Re λ < 0) to active (Re λ > 0), see Fig. 13(a,c). Since for small δ we have Ω̃cr < Ω̃d, the flutter instability is subcritical and is localized in the neighborhood of the nodes of the spectral mesh at Ω̃ = 0.

In the absence of dissipation, the nonconservative positional forces destroy the marginal stability of gyroscopic systems [12, 13]. Indeed, assuming δ = 0 in formula (89) we obtain

λp± = iρ ± iΩ̃ ± ν/(2ρ),  λn± = −iρ ± iΩ̃ ∓ ν/(2ρ).   (97)
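Formula (97) can be cross-checked against (89): with δ = 0 the quantity c becomes a perfect square, so the square root is evaluated exactly. The sketch below compares the two expressions for illustrative parameter values and confirms that one branch has a positive real part.

```python
import cmath

rho, Om, nu = 3.0, 0.7, 0.3   # illustrative values, delta = 0

# formula (89) with delta = 0:  lambda = i rho ± sqrt((i Om + nu/(2 rho))^2)
c = (1j * Om + nu / (2 * rho)) ** 2
lam_plus = 1j * rho + cmath.sqrt(c)
lam_minus = 1j * rho - cmath.sqrt(c)

# closed form (97): lambda_p± = i rho ± i Om ± nu/(2 rho)
lp_pos = 1j * rho + 1j * Om + nu / (2 * rho)
lp_neg = 1j * rho - 1j * Om - nu / (2 * rho)
print(abs(lam_plus - lp_pos) < 1e-12 or abs(lam_plus - lp_neg) < 1e-12)   # True
print(max(lam_plus.real, lam_minus.real) > 0)   # True: one branch is unstable
```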


According to (97), the eigenvalues of the branches iρ + iΩ̃ and −iρ − iΩ̃ of the spectral mesh acquire positive real parts due to the perturbation by the nonconservative positional forces. The eigenvalues of the other two branches are shifted to the left of the imaginary axis, see Fig. 13(b).

Fig. 14. Subcritical flutter instability due to the combined action of dissipative and nonconservative positional forces: collapse and emersion of the bubble of instability (a); excursions of eigenvalues to the right side of the complex plane when Ω̃ goes from negative to positive values (b); crossing of imaginary parts (c).

In contrast to the effect of indefinite damping, the instability induced by the nonconservative forces alone is not local. However, in combination with the dissipative forces, both definite and indefinite, the nonconservative forces can create subcritical flutter instability in the vicinity of diabolical points. From equation (89) we find that in the presence of dissipative and circulatory perturbations the trajectories of the eigenvalues in the complex plane are described by the formula

(Re λ + (trD/4)δ)(Im λ − ρ) = Ω̃ν/(2ρ).   (98)

Nonconservative positional forces with ν ≠ 0 destroy the merging of modes shown in Fig. 12, so that the eigenvalues move along separated trajectories. According to (98), the eigenvalues whose |Im λ| increases with an increase in |Ω̃| move closer to the imaginary axis than the others, as shown in Fig. 14(b). In the space (Ω̃, Im λ, Re λ) the action of the nonconservative positional forces separates the bubble of instability and the adjacent hyperbolic eigenvalue branches into two non-intersecting curves, see Fig. 13(d). The form of each of the new eigenvalue curves carries the memory of the original bubble of instability, so that the real parts of the eigenvalues can be positive for the values of the


gyroscopic parameter localized near Ω̃ = 0 in the range Ω̃² < Ω̃cr², where

Ω̃cr = δ (trD/4) √(−(ν² − δ²ρ² det D)/(ν² − δ²ρ²(trD/2)²))   (99)

follows from the equations (89)–(91).

The eigenfrequencies of the unstable modes from the interval Ω̃² < Ω̃cr² are localized near the frequency of the double semi-simple eigenvalue at the node of the undeformed spectral mesh, ωcr− < ω < ωcr+, where

ωcr± = ρ ± (ν/(2ρ)) √(−(ν² − δ²ρ² det D)/(ν² − δ²ρ²(trD/2)²)).   (100)

When the radicand in formulas (99) and (100) is positive, the eigenvalues make an excursion to the right side of the complex plane, as shown in Fig. 14(b). In the presence of nonconservative positional forces such excursions beyond the stability boundary are possible even when dissipation is full (det D > 0). Equation (99) describes a surface in the space of the parameters δ, ν, and Ω̃, which is an approximation to the stability boundary. Solving (99) for the parameter ν yields

ν = ±δρ trD √((δ² det D + 4Ω̃²)/(δ²(trD)² + 16Ω̃²)).   (101)

If det D > 0 and Ω̃ is fixed, formula (101) describes two independent curves in the plane (δ, ν), intersecting each other at the origin along the straight lines given by the expression

ν = ±(ρ trD/2) δ.   (102)

However, in the case det D < 0, the radical in (101) is real only for δ² < −4Ω̃²/det D, meaning that (101) describes two branches of a closed loop in the plane of the parameters δ and ν. The loop is self-intersecting at the origin, with the tangents given by the expression (102). Hence, the surface described by equation (101) is a cone with an "8"-shaped loop in cross-section, see Fig. 15(a). The domain of asymptotic stability is inside two of the four pockets of the cone, selected by the inequality δ trD > 0, as shown in Fig. 15(a). The singularity of the stability domain at the origin is a degeneration of the more general configuration shown in Fig. 5(b). The domain of asymptotic stability bifurcates when det D changes from negative to positive values. This process is shown in Fig. 15. In the case of indefinite damping there exists an instability gap due to the singularity at the origin. Starting in the flutter domain at Ω̃ = 0, for any combination of the parameters


Fig. 15. Domains of asymptotic stability in the space (δ, ν, Ω̃) for different types of damping: indefinite damping det D < 0 (a); semi-definite (pervasive) damping det D = 0 (b); full dissipation det D > 0 (c).

δ and ν, one can reach the domain of asymptotic stability at higher values of |Ω̃| (gyroscopic stabilization), as shown in Fig. 15(a) by the dashed line. The gap is responsible for the subcritical flutter instability localized in the vicinity of the node of the spectral mesh of the unperturbed gyroscopic system. When det D = 0, the gap vanishes in the direction ν = 0. In the case of full dissipation (det D > 0) the singularity at the origin unfolds. However, the memory of it is preserved in the two instability gaps located in the folds of the stability boundary with locally strong curvature, Fig. 15(c). At some values of δ and ν one can penetrate the fold of the stability boundary with a change of Ω̃, as shown in Fig. 15(c) by the dashed line. For such δ and ν the flutter instability is localized in the vicinity of Ω̃ = 0.

The phenomenon of the local subcritical flutter instability is controlled by the eigenvalues of the matrix D. When both of them are positive, the folds of the stability boundary are more pronounced if one of the eigenvalues is close to zero. If one of the eigenvalues is negative and the other is positive, the local subcritical flutter instability is possible for any combination of δ and ν, including the case when the nonconservative positional forces are absent (ν = 0). The instability mechanism behind the squealing disc brake or the singing wine glass can be described as the emersion (or activation), due to indefinite damping and nonconservative positional forces, of the bubbles of instability created by the full dissipation in the vicinity of the nodes of the spectral mesh.
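The two descriptions of the stability boundary can be checked for mutual consistency. The sketch below, with illustrative values of an indefinite damping matrix, verifies that the ν → 0 limit of (99) reproduces the earlier expression Ω̃cr = (δ/2)√(−det D), and that (101) is the exact inversion of (99).

```python
import math

mu1, mu2 = 1.0, -0.25          # indefinite damping matrix D: det D < 0
trD, detD = mu1 + mu2, mu1 * mu2
rho, delta = 2.0, 0.05         # illustrative values

def Om_cr(nu):
    # critical gyroscopic parameter, formula (99)
    rad = -(nu**2 - delta**2 * rho**2 * detD) / (nu**2 - delta**2 * rho**2 * (trD / 2)**2)
    return delta * (trD / 4) * math.sqrt(rad)

def nu_boundary(Om):
    # stability boundary solved for nu, formula (101), "+" branch
    return delta * rho * trD * math.sqrt(
        (delta**2 * detD + 4 * Om**2) / (delta**2 * trD**2 + 16 * Om**2))

# nu -> 0 limit reproduces Om_cr = (delta/2) sqrt(-det D) stated in Sec. 4.2.1
print(abs(Om_cr(0.0) - (delta / 2) * math.sqrt(-detD)) < 1e-12)   # True
# (101) inverts (99): on the boundary nu(Om), formula (99) returns Om
Om = 0.02
print(abs(Om_cr(nu_boundary(Om)) - Om) < 1e-9)   # True
```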

Conclusions

The investigation of stability and the sensitivity analysis of the critical parameters and critical frequencies of near-Hamiltonian and near-reversible systems are complicated by the singularities of the boundary of the asymptotic stability domain, which


are related to multiple eigenvalues. In the paper we have developed methods of approximation of the stability boundaries near the singularities and obtained estimates of the critical values of parameters in the case of an arbitrary number of degrees of freedom, using the perturbation theory of eigenvalues and eigenvectors of non-self-adjoint operators. In the case of two degrees of freedom the domain of asymptotic stability of near-reversible and near-Hamiltonian systems is fully described and its typical configurations are found. Bifurcation of the stability domain due to a change of the matrix of dissipative forces is discovered and described. Two classes of indefinite damping matrices are found, and the explicit threshold separating the weakly and strongly indefinite matrices is derived. The role of dissipative and nonconservative forces in the paradoxical effects of gyroscopic stabilization of statically unstable potential systems, as well as of destabilization of statically stable ones, is clarified. Finally, the mechanism of subcritical flutter instability in rotating elastic bodies of revolution in frictional contact, exciting oscillations in the squealing disc brake and in the singing wine glass, is established.

Acknowledgments

The author is grateful to Professor P. Hagedorn for his interest in this work and for useful discussions.

References

1. W. Thomson and P. G. Tait, Treatise on Natural Philosophy, Vol. 1, Part 1, New Edition, Cambridge Univ. Press, Cambridge, 1879.
2. E. J. Routh, A treatise on the stability of a given state of motion, Macmillan, London, 1892.
3. A. L. Kimball, Internal friction theory of shaft whirling, Phys. Rev., 21(6) (1923), p. 703.
4. D. M. Smith, The motion of a rotor carried by a flexible shaft in flexible bearings, Proc. Roy. Soc. Lond. A, 142 (1933), pp. 92–118.
5. P. L. Kapitsa, Stability and transition through the critical speed of fast rotating shafts with friction, Zh. Tekh. Fiz., 9 (1939), pp. 124–147.
6. H. Bilharz, Bemerkung zu einem Satze von Hurwitz, Z. angew. Math. Mech., 24(2) (1944), pp. 77–82.
7. M. G. Krein, A generalization of some investigations of linear differential equations with periodic coefficients, Doklady Akad. Nauk SSSR N.S., 73 (1950), pp. 445–448.
8. H. Ziegler, Die Stabilitätskriterien der Elastomechanik, Ing.-Arch., 20 (1952), pp. 49–56.
9. E. O. Holopainen, On the effect of friction in baroclinic waves, Tellus, 13(3) (1961), pp. 363–367.


10. V. V. Bolotin, Non-conservative Problems of the Theory of Elastic Stability, Pergamon, Oxford, 1963.
11. G. Herrmann and I. C. Jong, On the destabilizing effect of damping in nonconservative elastic systems, ASME J. of Appl. Mechs., 32(3) (1965), pp. 592–597.
12. V. M. Lakhadanov, On stabilization of potential systems, Prikl. Mat. Mekh., 39(1) (1975), pp. 53–58.
13. A. V. Karapetyan, On the stability of nonconservative systems, Vestn. MGU. Ser. Mat. Mekh., 4 (1975), pp. 109–113.
14. W. Hauger, Stability of a gyroscopic nonconservative system, Trans. ASME, J. Appl. Mech., 42 (1975), pp. 739–740.
15. I. P. Andreichikov and V. I. Yudovich, The stability of visco-elastic rods, Izv. Acad. Nauk SSSR. MTT, 1 (1975), pp. 150–154.
16. V. N. Tkhai, On stability of mechanical systems under the action of position forces, PMM U.S.S.R., 44 (1981), pp. 24–29.
17. V. I. Arnold, Geometrical Methods in the Theory of Ordinary Differential Equations, Springer, New York and Berlin, 1983.
18. A. S. Deif, P. Hagedorn, Matrix polynomials subjected to small perturbations, Z. angew. Math. Mech., 66 (1986), pp. 403–412.
19. M. B. Sevryuk, Reversible systems, Lecture Notes in Mathematics 1211, Springer, Berlin, 1986.
20. N. V. Banichuk, A. S. Bratus, A. D. Myshkis, Stabilizing and destabilizing effects in nonconservative systems, PMM U.S.S.R., 53(2) (1989), pp. 158–164.
21. S. Barnett, Leverrier's algorithm: a new proof and extensions, SIAM J. Matrix Anal. Appl., 10(4) (1989), pp. 551–556.
22. A. P. Seyranian, Destabilization paradox in stability problems of nonconservative systems, Advances in Mechanics, 13(2) (1990), pp. 89–124.
23. R. S. MacKay, Movement of eigenvalues of Hamiltonian equilibria under non-Hamiltonian perturbation, Phys. Lett. A, 155 (1991), pp. 266–268.
24. H. Langer, B. Najman, K. Veselić, Perturbation of the eigenvalues of quadratic matrix polynomials, SIAM J. Matrix Anal. Appl., 13(2) (1992), pp. 474–489.
25. G. Haller, Gyroscopic stability and its loss in systems with two essential coordinates, Intern. J. of Nonl. Mechs., 27 (1992), pp. 113–127.
26. A. N. Kounadis, On the paradox of the destabilizing effect of damping in nonconservative systems, Intern. J. of Nonl. Mechs., 27 (1992), pp. 597–609.
27. V. F. Zhuravlev, Nutational vibrations of a free gyroscope, Izv. Ross. Akad. Nauk, Mekh. Tverd. Tela, 6 (1992), pp. 13–16.
28. A. M. Bloch, P. S. Krishnaprasad, J. E. Marsden, T. S. Ratiu, Dissipation-induced instabilities, Annales de l'Institut Henri Poincaré, 11(1) (1994), pp. 37–90.
29. J. Maddocks and M. L. Overton, Stability theory for dissipatively perturbed Hamiltonian systems, Comm. Pure and Applied Math., 48 (1995), pp. 583–610.
30. I. Hoveijn and M. Ruijgrok, The stability of parametrically forced coupled oscillators in sum resonance, Z. angew. Math. Phys., 46 (1995), pp. 384–392.
31. S. H. Crandall, The effect of damping on the stability of gyroscopic pendulums, Z. angew. Math. Phys., 46 (1995), pp. 761–780.
32. V. V. Beletsky, Some stability problems in applied mechanics, Appl. Math. Comp., 70 (1995), pp. 117–141.


33. O. M. O'Reilly, N. K. Malhotra, N. S. Namachchivaya, Some aspects of destabilization in reversible dynamical systems with application to follower forces, Nonlin. Dyn., 10 (1996), pp. 63–87.
34. D. R. Merkin, Introduction to the Theory of Stability, Springer, Berlin, 1997.
35. P. Freitas, M. Grinfeld, P. A. Knight, Stability of finite-dimensional systems with indefinite damping, Adv. Math. Sci. Appl., 7(1) (1997), pp. 437–448.
36. W. Kliem and P. C. Müller, Gyroscopic stabilization of indefinite damped systems, Z. angew. Math. Mech., 77(1) (1997), pp. 163–164.
37. R. Hryniv and P. Lancaster, On the perturbation of analytic matrix functions, Integral Equations and Operator Theory, 34(3) (1999), pp. 325–338.
38. R. Hryniv and P. Lancaster, Stabilization of gyroscopic systems, Z. angew. Math. Mech., 81(10) (2001), pp. 675–681.
39. V. V. Bolotin, A. A. Grishko, M. Yu. Panov, Effect of damping on the postcritical behavior of autonomous nonconservative systems, Intern. J. of Nonl. Mechs., 37 (2002), pp. 1163–1179.
40. K. Popp, M. Rudolph, M. Kröger, M. Lindner, Mechanisms to generate and to avoid friction induced vibrations, VDI-Berichte 1736, VDI-Verlag, Düsseldorf, 2002.
41. P. Lancaster, A. S. Markus, F. Zhou, Perturbation theory for analytic matrix functions: The semisimple case, SIAM J. Matrix Anal. Appl., 25(3) (2003), pp. 606–626.
42. N. Hoffmann and L. Gaul, Effects of damping on mode-coupling instability in friction induced oscillations, Z. angew. Math. Mech., 83 (2003), pp. 524–534.
43. O. N. Kirillov, How do small velocity-dependent forces (de)stabilize a non-conservative system?, DCAMM Report 681, Lyngby, 2003.
44. A. P. Seiranyan and O. N. Kirillov, Effect of small dissipative and gyroscopic forces on the stability of nonconservative systems, Doklady Physics, 48(12) (2003), pp. 679–684.
45. W. F. Langford, Hopf meets Hamilton under Whitney's umbrella, in IUTAM Symposium on Nonlinear Stochastic Dynamics. Proceedings of the IUTAM symposium, Monticello, IL, USA, August 26–30, 2002, Solid Mech. Appl. 110, S. N. Namachchivaya et al., eds., Kluwer, Dordrecht, 2003, pp. 157–165.
46. A. P. Ivanov, The stability of mechanical systems with positional nonconservative forces, J. Appl. Maths. Mechs., 67(5) (2003), pp. 625–629.
47. A. P. Seyranian and A. A. Mailybaev, Multiparameter Stability Theory with Mechanical Applications, World Scientific, Singapore, 2003.
48. N. M. Bou-Rabee, J. E. Marsden, L. A. Romero, Tippe Top inversion as a dissipation-induced instability, SIAM J. Appl. Dyn. Sys., 3 (2004), pp. 352–377.
49. H. K. Moffatt, Y. Shimomura, M. Branicki, Dynamics of an axisymmetric body spinning on a horizontal surface. I. Stability and the gyroscopic approximation, Proc. Roy. Soc. Lond. A, 460 (2004), pp. 3643–3672.
50. O. N. Kirillov, Destabilization paradox, Doklady Physics, 49(4) (2004), pp. 239–245.
51. O. N. Kirillov, A. P. Seyranian, Collapse of the Keldysh chains and stability of continuous nonconservative systems, SIAM J. Appl. Math., 64(4) (2004), pp. 1383–1407.
52. O. N. Kirillov and A. P. Seyranian, Stabilization and destabilization of a circulatory system by small velocity-dependent forces, J. Sound and Vibr., 283(3–5) (2005), pp. 781–800.


53. O. N. Kirillov, A theory of the destabilization paradox in non-conservative systems, Acta Mechanica, 174(3–4) (2005), pp. 145–166.
54. O. N. Kirillov and A. P. Seyranian, Instability of distributed nonconservative systems caused by weak dissipation, Doklady Mathematics, 71(3) (2005), pp. 470–475.
55. O. N. Kirillov and A. O. Seyranian, The effect of small internal and external damping on the stability of distributed non-conservative systems, J. Appl. Math. Mech., 69(4) (2005), pp. 529–552.
56. P. Lancaster, P. Psarrakos, On the pseudospectra of matrix polynomials, SIAM J. Matrix Anal. Appl., 27(1) (2005), pp. 115–120.
57. A. P. Seyranian, O. N. Kirillov, A. A. Mailybaev, Coupling of eigenvalues of complex matrices at diabolic and exceptional points, J. Phys. A: Math. Gen., 38(8) (2005), pp. 1723–1740.
58. O. N. Kirillov, A. A. Mailybaev, A. P. Seyranian, Unfolding of eigenvalue surfaces near a diabolic point due to a complex perturbation, J. Phys. A: Math. Gen., 38(24) (2005), pp. 5531–5546.
59. N. M. Bou-Rabee, J. E. Marsden, L. A. Romero, A geometric treatment of Jellet's egg, Z. angew. Math. Mech., 85(9) (2005), pp. 618–642.
60. A. A. Mailybaev, O. N. Kirillov, A. P. Seyranian, Berry phase around degeneracies, Dokl. Math., 73(1) (2006), pp. 129–133.
61. T. Butlin, J. Woodhouse, Studies of the sensitivity of brake squeal, Appl. Mech. and Mater., 5–6 (2006), pp. 473–479.
62. R. Krechetnikov and J. E. Marsden, On destabilizing effects of two fundamental non-conservative forces, Physica D, 214 (2006), pp. 25–32.
63. O. N. Kirillov, Gyroscopic stabilization of non-conservative systems, Phys. Lett. A, 359(3) (2006), pp. 204–210.
64. U. Günther, O. N. Kirillov, A Krein space related perturbation theory for MHD alpha-2 dynamos and resonant unfolding of diabolical points, J. Phys. A: Math. Gen., 39 (2006), pp. 10057–10076.
65. V. Kobelev, Sensitivity analysis of the linear nonconservative systems with fractional damping, Struct. Multidisc. Optim., 33 (2007), pp. 179–188.
66. R. Krechetnikov and J. E. Marsden, Dissipation-induced instabilities in finite dimensions, Rev. Mod. Phys., 79 (2007), pp. 519–553.
67. U. von Wagner, D. Hochlenert, P. Hagedorn, Minimal models for disk brake squeal, J. Sound Vibr., 302(3) (2007), pp. 527–539.
68. O. N. Kirillov, On the stability of nonconservative systems with small dissipation, J. Math. Sci., 145(5) (2007), pp. 5260–5270.
69. A. N. Kounadis, Flutter instability and other singularity phenomena in symmetric systems via combination of mass distribution and weak damping, Int. J. of Non-Lin. Mech., 42(1) (2007), pp. 24–35.
70. O. N. Kirillov, Destabilization paradox due to breaking the Hamiltonian and reversible symmetry, Int. J. of Non-Lin. Mech., 42(1) (2007), pp. 71–87.
71. G. Spelsberg-Korspeter, D. Hochlenert, O. N. Kirillov, P. Hagedorn, In- and out-of-plane vibrations of a rotating plate with frictional contact: Investigations on squeal phenomena, Trans. ASME, J. Appl. Mech. (2007), submitted.
72. J.-J. Sinou and L. Jezequel, Mode coupling instability in friction-induced vibrations and its dependency on system parameters including damping, Eur. J. Mech. A, 26 (2007), pp. 106–122.


73. P. Kessler, O. M. O'Reilly, A.-L. Raphael, M. Zworski, On dissipation-induced destabilization and brake squeal: A perspective using structured pseudospectra, J. Sound Vibr., 308 (2007), pp. 1–11.
74. O. N. Kirillov, Gyroscopic stabilization in the presence of non-conservative forces, Dokl. Math., 76(2) (2007), pp. 780–785.
75. G. Spelsberg-Korspeter, O. N. Kirillov, P. Hagedorn, Modeling and stability analysis of an axially moving beam with frictional contact, Trans. ASME, J. Appl. Mech., 75(3) (2008), 031001.
76. O. N. Kirillov, Subcritical flutter in the acoustics of friction, Proc. R. Soc. A, 464 (2008).
77. J. Kang, C. M. Krousgrill, F. Sadeghi, Dynamic instability of a thin circular plate with friction interface and its application to disc brake squeal, J. Sound. Vibr. (2008).

Block triangular miniversal deformations of matrices and matrix pencils

Lena Klimenko¹ and Vladimir V. Sergeichuk²

¹ Information and Computer Centre of the Ministry of Labour and Social Policy of Ukraine, Esplanadnaya 8/10, Kiev, Ukraine
[email protected]

² Institute of Mathematics, Tereshchenkivska 3, Kiev, Ukraine
[email protected]

Abstract. For each square complex matrix, V. I. Arnold constructed a normal form with the minimal number of parameters to which a family of all matrices B that are close enough to this matrix can be reduced by similarity transformations that smoothly depend on the entries of B. Analogous normal forms were also constructed for families of complex matrix pencils by A. Edelman, E. Elmroth, and B. Kågström, and for contragredient matrix pencils (i.e., of matrix pairs up to transformations (A, B) ↦ (S⁻¹AR, R⁻¹BS)) by M. I. Garcia-Planas and V. V. Sergeichuk. In this paper we give other normal forms for families of matrices, matrix pencils, and contragredient matrix pencils; our normal forms are block triangular.

Keywords: canonical forms, matrix pencils, versal deformations, perturbation theory.

1 Introduction

The reduction of a matrix to its Jordan form is an unstable operation: both the Jordan form and the reduction transformations depend discontinuously on the entries of the original matrix. Therefore, if the entries of a matrix are known only approximately, then it is unwise to reduce it to Jordan form. Furthermore, when investigating a family of matrices smoothly depending on parameters, then although each individual matrix can be reduced to its Jordan form, it is unwise to do so, since in such an operation the smoothness relative to the parameters is lost. For these reasons, Arnold [1] constructed a miniversal deformation of any Jordan canonical matrix J; that is, a family of matrices in a neighborhood of J with the minimal number of parameters, to which all matrices M close to J can be reduced by similarity transformations that smoothly depend on the entries of M (see Definition 1). Miniversal deformations were also constructed for:


(i) the Kronecker canonical form of complex matrix pencils, by Edelman, Elmroth, and Kågström [9]; another miniversal deformation (which is simple in the sense of Definition 2) was constructed by Garcia-Planas and Sergeichuk [10];
(ii) the Dobrovol'skaya and Ponomarev canonical form of complex contragredient matrix pencils (i.e., of matrices of counter linear operators U ⇄ V) in [10].

Belitskii [4] proved that each Jordan canonical matrix J is permutationally similar to some matrix J#, which is called a Weyr canonical matrix and possesses the property that all matrices commuting with J# are block triangular. Due to this property, J# plays a central role in Belitskii's algorithm for reducing the matrices of any system of linear mappings to canonical form, see [5, 11]. In this paper, we find another property of Weyr canonical matrices: they possess block triangular miniversal deformations (in the sense of Definition 2). Therefore, if we consider, up to smooth similarity transformations, a family of matrices that are close enough to a given square matrix, then we can take it in its Weyr canonical form J# and the family in the form J# + E, in which E is block triangular. We also give block triangular miniversal deformations of those canonical forms of pencils and contragredient pencils that are obtained from (i) and (ii) by replacing the Jordan canonical matrices with the Weyr canonical matrices. All matrices that we consider are complex matrices.
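The discontinuity of the Jordan form under perturbation is easy to observe numerically: perturbing a single corner entry of an n × n nilpotent Jordan block by ε moves its eigenvalues onto the circle |λ| = ε^(1/n), a huge response for tiny ε. A small sketch (the size and ε are arbitrary illustrative choices):

```python
import numpy as np

n, eps = 8, 1e-8
J = np.diag(np.ones(n - 1), 1)      # nilpotent Jordan block: all eigenvalues 0
E = np.zeros((n, n))
E[-1, 0] = eps                      # perturb one corner entry by eps
lam = np.linalg.eigvals(J + E)
print(np.max(np.abs(lam)))          # ~ eps**(1/n) = 0.1, not ~ 1e-8
```

An O(10⁻⁸) change in the data thus produces an O(10⁻¹) change in the spectrum, which is why reducing approximately known matrices to Jordan form is ill-advised.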

2 Miniversal deformations of matrices

Definition 1 (see [1–3]). A deformation of an n-by-n matrix A is a matrix function A(α1, . . . , αk) (its arguments α1, . . . , αk are called parameters) on a neighborhood of the point 0 = (0, . . . , 0) that is holomorphic at 0 and equals A at 0. Two deformations of A are identified if they coincide on a neighborhood of 0. A deformation A(α1, . . . , αk) of A is versal if all matrices A + E in some neighborhood of A reduce to the form

A(h1(E), . . . , hk(E)) = S(E)⁻¹(A + E)S(E),  S(0) = In,

in which S(E) is a matrix function of the entries of E that is holomorphic at zero. A versal deformation with the minimal number of parameters is called miniversal.

Definition 2. Let a deformation A(α1, . . . , αk) of A be represented in the form A + B(α1, . . . , αk).

Blo k triangular miniversal deformations of matri es and matrix pen ils – –

71

If k entries of B(α1 , . . . , αk ) are the independent parameters α1 , . . . , αk and the others are zero then the deformation A is alled simple3. A simple deformation is blo k triangular with respe t to some partition of A into blo ks if B(α1 , . . . , αk) is blo k triangular with respe t to the

onformal partition and ea h of its blo ks is either 0 or all of its entries are independent parameters.

If A(α1, ..., αk) is a miniversal deformation of A and S^(-1)AS = B for some nonsingular S, then S^(-1)A(α1, ..., αk)S is a miniversal deformation of B. Therefore, it suffices to construct miniversal deformations of canonical matrices for similarity. Let

    J(λ) := Jn1(λ) ⊕ · · · ⊕ Jnl(λ),    n1 ≥ n2 ≥ · · · ≥ nl,    (1)

be a Jordan canonical matrix with a single eigenvalue λ; the units of the Jordan blocks are written over the diagonal:

    Jni(λ) := [ λ  1        0 ]
              [    λ  ⋱       ]    (ni-by-ni).
              [        ⋱   1  ]
              [ 0          λ  ]

For each pair of natural numbers p and q, define the p × q matrix

    Tpq := [ ∗ 0 ⋯ 0 ]
           [ ⋮ ⋮   ⋮ ]    if p < q,
           [ ∗ 0 ⋯ 0 ]

           [ 0  ⋯  0 ]
           [ ⋮     ⋮ ]    if p ≥ q,        (2)
           [ 0  ⋯  0 ]
           [ ∗  ⋯  ∗ ]

in which the stars denote independent parameters (alternatively, we may take Tpq with p = q as in the case p < q).

Theorem 1 ([3, §30, Theorem 2]).
(i) Let J(λ) be a Jordan canonical matrix of the form (1) with a single eigenvalue λ, and let H := [Tni,nj] be the parameter block matrix partitioned conformally to J(λ), with the blocks Tni,nj defined in (2). Then

    J(λ) + H    (3)

is a simple miniversal deformation of J(λ).
(ii) Let

    J := J(λ1) ⊕ · · · ⊕ J(λτ),    λi ≠ λj if i ≠ j,    (4)

be a Jordan canonical matrix in which every J(λi) is of the form (1), and let J(λi) + Hi be its miniversal deformation (3). Then

    J + K := (J(λ1) + H1) ⊕ · · · ⊕ (J(λτ) + Hτ)    (5)

³ Arnold's miniversal deformations presented in Theorem 1 are simple. Moreover, by [10, Corollary 2.1] the set of matrices of any quiver representation (i.e., of any finite system of linear mappings) over C or R possesses a simple miniversal deformation.

L. Klimenko, V. V. Sergeichuk

is a simple miniversal deformation of J.

Definition 3 ([13]). The Weyr canonical form J# of a Jordan canonical matrix J (and of any matrix that is similar to J) is defined as follows.

(i) If J has a single eigenvalue, then we write it in the form (1). Permute the first columns of Jn1(λ), Jn2(λ), ..., and Jnl(λ) into the first l columns, then permute the corresponding rows. Next permute the second columns of all blocks of size at least 2 × 2 into the next columns and permute the corresponding rows; and so on. The obtained matrix is the Weyr canonical form J(λ)# of J(λ).

(ii) If J has distinct eigenvalues, then we write it in the form (4). The Weyr canonical form of J is

    J# := J(λ1)# ⊕ · · · ⊕ J(λτ)#.    (6)

Each direct summand of (6) has the form

    J(λ)# = [ λIs1  Is2             0   ]
            [       λIs2  ⋱             ]    (7)
            [              ⋱    Isk     ]
            [ 0                 λIsk    ]

in which si is the number of Jordan blocks Jl(λ) of size l ≥ i in J(λ). The sequence (s1, s2, ..., sk) is called the Weyr characteristic of J (and of any matrix that is similar to J) for the eigenvalue λ; see [12]. By [4] or [11, Theorem 1.2], all matrices commuting with J# are block triangular. In the next lemma we construct a miniversal deformation of J# that is block triangular with respect to the coarsest partition of J# for which all diagonal blocks have the form λiI and each off-diagonal block is 0 or I. This means that the sizes of the diagonal blocks of (7) with respect to this partition form the sequence obtained from

    sk, sk-1 − sk, ..., s2 − s3, s1 − s2,
    sk, sk-1 − sk, ..., s2 − s3,
    ..................
    sk, sk-1 − sk,
    sk


by removing the zero members.
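The Weyr characteristic in (7) is simply the conjugate partition of the Jordan block sizes: si counts the blocks of size at least i. The following Python sketch (an illustration of ours, not part of the paper) computes it from the sizes n1 ≥ · · · ≥ nl in (1):

```python
def weyr_characteristic(jordan_sizes):
    """Conjugate partition of the Jordan block sizes for one eigenvalue:
    s_i = number of Jordan blocks J_n(lambda) with n >= i, for i = 1, ..., k,
    where k is the largest block size."""
    k = max(jordan_sizes)
    return [sum(1 for n in jordan_sizes if n >= i) for i in range(1, k + 1)]

# The example (13) below with p = 2 copies of J4 and q = 3 copies of J2:
print(weyr_characteristic([4, 4, 2, 2, 2]))  # [5, 5, 2, 2] = (p+q, p+q, p, p)
```

The returned sequence gives the sizes of the diagonal blocks λIs1, ..., λIsk in (7).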

Theorem 2.
(i) Let J(λ) be a Jordan canonical matrix of the form (1) with a single eigenvalue λ, and let J(λ) + H be its miniversal deformation (3). Denote by

    J(λ)# + H#    (8)

the parameter matrix obtained from J(λ) + H by the permutations described in Definition 3(i). Then J(λ)# + H# is a miniversal deformation of J(λ)#, and its matrix H# is lower block triangular.
(ii) Let J be a Jordan canonical matrix represented in the form (4) and let J# be its Weyr canonical form. Apply the permutations described in (i) to each of the direct summands of the miniversal deformation (5) of J. Then the obtained matrix

    J# + K# := (J(λ1)# + H1#) ⊕ · · · ⊕ (J(λτ)# + Hτ#)    (9)

is a miniversal deformation of J#, which is simple and block triangular (in the sense of Definition 2).

Let us prove this theorem. The form of J(λ)# + H# and the block triangularity of H# become clearer if we carry out the permutations from Definition 3(i) in two steps.

First step. Let us write the sequence n1, n2, ..., nl from (1) in the form

    m1, ..., m1 (r1 times), m2, ..., m2 (r2 times), ..., mt, ..., mt (rt times),    (10)

where m1 > m2 > · · · > mt.

Partition J(λ) into t horizontal and t vertical strips of sizes r1m1, r2m2, ..., rtmt (each of them contains the Jordan blocks of one size), produce the described permutations within each of these strips, and obtain

    J(λ)+ := Jm1(λIr1) ⊕ · · · ⊕ Jmt(λIrt),    (11)

in which

    Jmi(λIri) := [ λIri  Iri         0    ]
                 [       λIri  ⋱         ]    (mi diagonal blocks).
                 [              ⋱   Iri  ]
                 [ 0                λIri ]


By the same permutations of rows and columns of J(λ) + H, reduce H to

    H+ := [T~mi,mj(ri, rj)],

in which every T~mi,mj(ri, rj) is obtained from the matrix Tmi,mj defined in (2) by replacing each entry 0 with the ri × rj zero block and each entry ∗ with the ri × rj block

    ⋆ := [ ∗ ⋯ ∗ ]
         [ ⋮   ⋮ ]    (12)
         [ ∗ ⋯ ∗ ]

For example, if

    J(λ) = J4(λ) ⊕ · · · ⊕ J4(λ) (p times) ⊕ J2(λ) ⊕ · · · ⊕ J2(λ) (q times),    (13)

then

    J(λ)+ = J4(λIp) ⊕ J2(λIq) =

             (1,1) (1,2) (1,3) (1,4) (2,1) (2,2)
      (1,1) [ λIp   Ip    0     0     0     0  ]
      (1,2) [  0   λIp    Ip    0     0     0  ]
      (1,3) [  0    0    λIp    Ip    0     0  ]    (14)
      (1,4) [  0    0     0    λIp    0     0  ]
      (2,1) [  0    0     0     0    λIq    Iq ]
      (2,2) [  0    0     0     0     0    λIq ]

A strip is indexed by (i, j) if it contains the j-th strip of Jmi(λIri). Correspondingly,

             (1,1) (1,2) (1,3) (1,4) (2,1) (2,2)
      (1,1) [  0    0     0     0     0     0  ]
      (1,2) [  0    0     0     0     0     0  ]
 H+ = (1,3) [  0    0     0     0     0     0  ]    (15)
      (1,4) [  ⋆    ⋆     ⋆     ⋆     ⋆     ⋆  ]
      (2,1) [  ⋆    0     0     0     0     0  ]
      (2,2) [  ⋆    0     0     0     ⋆     ⋆  ]

Second step. We permute in J(λ)+ the first vertical strips of Jm1(λIr1), Jm2(λIr2), ..., Jmt(λIrt) into the first t vertical strips and permute the corresponding horizontal strips; then we permute the second vertical strips into the next vertical strips and permute the corresponding horizontal strips; we continue the process until J(λ)# is achieved. The same permutations transform H+ to H#.


For example, applying these permutations to (14) and (15), we obtain

              (1,1) (2,1) (1,2) (2,2) (1,3) (1,4)
       (1,1) [ λIp   0     Ip    0     0     0  ]
       (2,1) [  0   λIq    0     Iq    0     0  ]
J(λ)# = (1,2) [  0    0    λIp    0     Ip    0  ]    (16)
       (2,2) [  0    0     0    λIq    0     0  ]
       (1,3) [  0    0     0     0    λIp    Ip ]
       (1,4) [  0    0     0     0     0    λIp ]

and

              (1,1) (2,1) (1,2) (2,2) (1,3) (1,4)
       (1,1) [  0    0     0     0     0     0  ]
       (2,1) [  ⋆    0     0     0     0     0  ]
  H# = (1,2) [  0    0     0     0     0     0  ]    (17)
       (2,2) [  ⋆    ⋆     0     ⋆     0     0  ]
       (1,3) [  0    0     0     0     0     0  ]
       (1,4) [  ⋆    ⋆     ⋆     ⋆     ⋆     ⋆  ]

Proof of Theorem 2. (i) Following (14), we index the vertical (horizontal) strips of J(λ)+ in (11) by pairs of natural numbers as follows: a strip is indexed by (i, j) if it contains the j-th strip of Jmi(λIri). The pairs that index the strips of J(λ)+ form the sequence

    (1, 1), (1, 2), ..., (1, mt), ..., (1, m2), ..., (1, m1),
    (2, 1), (2, 2), ..., (2, mt), ..., (2, m2),
    ······························    (18)
    (t, 1), (t, 2), ..., (t, mt),

which is ordered lexicographically. Rearranging the pairs by the columns of (18):

    (1, 1), (2, 1), ..., (t, 1); ...; (1, mt), (2, mt), ..., (t, mt); ...; (1, m1)    (19)

(i.e., as in lexicographic ordering but starting from the second elements of the pairs) and making the same permutation of the corresponding strips in J(λ)+ and H+, we obtain J(λ)# and H#; see examples (16) and (17). The ((i, j), (i′, j′))-th entry of H+ is a star if and only if

    either i ≤ i′ and j = mi, or i > i′ and j′ = 1.    (20)

By (10), in these cases j ≥ j′; and if j = j′, then either j = j′ = mi and i = i′, or j = j′ = 1 and i > i′. Therefore H# is lower block triangular. ⊓⊔

(ii) This statement follows from (i) and Theorem 1(ii).


Remark 1. Let J(λ) be a Jordan matrix with a single eigenvalue, let m1 > m2 > · · · > mt be the distinct sizes of its Jordan blocks, and let ri be the number of Jordan blocks of size mi. Then the deformation J(λ)# + H# from Theorem 2 can be formally constructed as follows:

– J(λ)# and H# are matrices of the same size; they are conformally partitioned into horizontal and vertical strips, which are indexed by the pairs (19).
– The ((i, j), (i, j))-th diagonal block of J(λ)# is λIri, its ((i, j), (i, j + 1))-th block is Iri, and its other blocks are zero.
– The ((i, j), (i′, j′))-th block of H# has the form (12) if and only if (20) holds; its other blocks are zero.
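The combinatorial core of Theorem 2(i), namely that every star position allowed by (20) lies on or below the diagonal in the ordering (19), can be checked mechanically. The following Python sketch (ours, not the authors') enumerates the strip indices for given distinct block sizes m1 > · · · > mt:

```python
def h_sharp_is_lower_triangular(m):
    """m = [m1, ..., mt]: the distinct Jordan block sizes, strictly decreasing.
    Strips are the pairs (i, j) with 1 <= j <= m_i, put in the ordering (19);
    the block ((i, j), (i', j')) carries stars iff condition (20) holds.
    Returns True iff no star block lies strictly above the diagonal."""
    t = len(m)
    pairs = [(i, j) for i in range(1, t + 1) for j in range(1, m[i - 1] + 1)]
    order = sorted(pairs, key=lambda p: (p[1], p[0]))   # the ordering (19)
    pos = {p: n for n, p in enumerate(order)}
    for (i, j) in pairs:               # row strip
        for (i2, j2) in pairs:         # column strip
            star = (i <= i2 and j == m[i - 1]) or (i > i2 and j2 == 1)
            if star and pos[(i, j)] < pos[(i2, j2)]:
                return False           # a star strictly above the diagonal
    return True

print(h_sharp_is_lower_triangular([4, 2]))   # True, as in example (17)
```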

3 Miniversal deformations of matrix pencils

By Kronecker's theorem on matrix pencils (see [6, Sect. XII, §4]), each pair of m × n matrices reduces by equivalence transformations

    (A, B) ↦ (S^(-1)AR, S^(-1)BR),    S and R nonsingular,

to a Kronecker canonical pair (Akr, Bkr) being a direct sum, uniquely determined up to permutation of summands, of pairs of the form

    (Ir, Jr(λ)),  (Jr(0), Ir),  (Fr, Gr),  (F^T_r, G^T_r),

in which λ ∈ C and

    Fr := [ I_{r-1} ]        Gr := [ 0 ⋯ 0   ]    (21)
          [ 0 ⋯ 0   ]              [ I_{r-1} ]

are matrices of size r × (r − 1) with r ≥ 1. Definitions 1 and 2 are extended to matrix pairs in a natural way. Miniversal deformations of (Akr, Bkr) were obtained in [9, 10]. The deformation obtained in [10] is simple; in this section we reduce it to block triangular form by permutations of rows and columns. For this purpose, we replace in (Akr, Bkr)

– the direct sum (I, J) of all pairs of the form (Ir, Jr(λ)) by the pair (I, J#), and
– the direct sum (J(0), I) of all pairs of the form (Jr(0), Ir) by the pair (J(0)#, I),


in which J# and J(0)# are the Weyr matrices from Definition 3. We obtain a canonical matrix pair of the form

    (F^T_p1, G^T_p1) ⊕ · · · ⊕ (F^T_pl, G^T_pl) ⊕ (I, J#) ⊕ (J(0)#, I) ⊕ (F_q1, G_q1) ⊕ · · · ⊕ (F_qr, G_qr),    (22)

in which we suppose that

    p1 ≤ · · · ≤ pl,    q1 ≥ · · · ≥ qr.    (23)

(This special ordering of the direct summands of (22) allows us to construct a miniversal deformation of (22) that is block triangular.) Denote by

    0↑ := [ ∗ ⋯ ∗ ]    0↓ := [   0   ]    0← := [ ∗     ]    0→ := [     ∗ ]
          [   0   ]          [ ∗ ⋯ ∗ ]          [ ⋮  0  ]          [  0  ⋮ ]
                                                [ ∗     ]          [     ∗ ]

the matrices in which the entries of the first row, the last row, the first column, and the last column, respectively, are stars and the other entries are zero, and write

    Z := [ ∗ ⋯ ∗ 0 ⋯ 0 ]
         [ 0 ⋯ ⋯ ⋯ ⋯ 0 ]
         [ ⋮         ⋮ ]
         [ 0 ⋯ ⋯ ⋯ ⋯ 0 ]

(the number of zeros in the first row of Z is equal to the number of rows). The stars denote independent parameters. In the following theorem we give a simple miniversal deformation of (22) that is block triangular with respect to the partition of (22) in which J# and J(0)# are partitioned as in Theorem 2 and all blocks of (F^T_pi, G^T_pi) and (F_qi, G_qi) are 1-by-1.

Theorem 3. Let (A, B) be a canonical matrix pair of the form (22) satisfying (23). One of the block triangular simple miniversal deformations of (A, B) has the form (Ā, B̄), in which

    Ā := [ F^T_p1                                       ]
         [        F^T_p2           0                    ]
         [   0          ⋱                               ]
         [               F^T_pl                         ]
         [   0    0  ⋯  0       I                       ]    (24)
         [  0→   0→  ⋯  0→      0   J(0)#+H#            ]
         [   0    0  ⋯  0      0↓     0↓      F_q1      ]
         [   ⋮    ⋮      ⋮      ⋮      ⋮          ⋱  0  ]
         [   0    0  ⋯  0      0↓     0↓      0  F_qr   ]

and

    B̄ := [ G^T_p1                                             ]
         [  Z^T   G^T_p2               0                      ]
         [   ⋮      ⋱    ⋱                                    ]
         [  Z^T    ⋯   Z^T  G^T_pl                            ]
         [  0←    0←   ⋯    0←      J#+K#                     ]    (25)
         [   0     0   ⋯    0         0     I                 ]
         [   0     0   ⋯    0        0↑    0↑  G_q1           ]
         [   ⋮     ⋮        ⋮         ⋮     ⋮   Z   G_q2      ]
         [   ⋮     ⋮        ⋮         ⋮     ⋮   ⋮      ⋱      ]
         [   0     0   ⋯    0        0↑    0↑   Z  ⋯ Z  G_qr  ]

where J(0)# + H# and J# + K# are the block triangular miniversal deformations (8) and (9).

Proof. The following miniversal deformation of matrix pairs was obtained in [10]. The matrix pair (22) is equivalent to its Kronecker canonical form

    (Akr, Bkr) := (F_q1, G_q1) ⊕ · · · ⊕ (F_qr, G_qr) ⊕ (I, J) ⊕ (J(0), I) ⊕ (F^T_p1, G^T_p1) ⊕ · · · ⊕ (F^T_pl, G^T_pl).

By [10, Theorem 4.1], one of the simple miniversal deformations of (Akr, Bkr) has the form (Ākr, B̄kr), in which

    Ākr := [ F_qr                 0↓  0↓                  ]
           [       F_q(r-1)   0   0↓  0↓        0         ]
           [           ⋱          ⋮   ⋮                   ]
           [             F_q1     0↓  0↓                  ]
           [        0          I   0                      ]
           [        0          0  J(0)+H   0→  ⋯  0→      ]
           [                              F^T_pl          ]
           [           0                       ⋱          ]
           [                                    F^T_p1    ]

and

    B̄kr := [ G_qr   Z   ⋯   Z   0↑  0↑                        ]
           [      G_q(r-1) ⋱ ⋮   0↑  0↑           0           ]
           [   0       ⋱    Z    ⋮   ⋮                        ]
           [              G_q1   0↑  0↑                       ]
           [        0           J+K   0    0←  ⋯  0←          ]
           [        0            0    I         0             ]
           [                              G^T_pl  Z^T ⋯ Z^T   ]
           [           0                        ⋱      Z^T    ]
           [                                        G^T_p1    ]

In view of Theorem 2, the deformation (Ākr, B̄kr) is permutationally equivalent to the deformation (24)-(25) from Theorem 3. (The blocks H# and K# are lower block triangular; because of this we reduce (Ākr, B̄kr) to the deformation of Theorem 3, which is lower block triangular.) ⊓⊔

Remark 2. Constructing J(λ)#, for each r we join all r-by-r Jordan blocks Jr(λ) of J(λ) into Jr(λI); see (11). We can analogously join pairs of equal sizes in (22) and obtain a pair of the form

    (F̂^T_p′1, Ĝ^T_p′1) ⊕ · · · ⊕ (F̂^T_p′l′, Ĝ^T_p′l′) ⊕ (I, J#) ⊕ (J(0)#, I) ⊕ (F̂_q′1, Ĝ_q′1) ⊕ · · · ⊕ (F̂_q′r′, Ĝ_q′r′),    (26)

in which p′1 < · · · < p′l′ and q′1 > · · · > q′r′. This pair is permutationally equivalent to (22). Producing the same permutations of rows and columns in (24) and (25), we join all F^T_p, G^T_p, Fq, Gq into F̂^T_p, Ĝ^T_p, F̂q, Ĝq, and 0, 0↑, 0↓, 0←, 0→, Z into 0̂, 0̂↑, 0̂↓, 0̂←, 0̂→, Ẑ, which consist of blocks 0 and ⋆ defined in (12); the resulting pair is a block triangular miniversal deformation of (26).

4 Miniversal deformations of contragredient matrix pencils

Each pair of m × n and n × m matrices reduces by transformations of contragredient equivalence

    (A, B) ↦ (S^(-1)AR, R^(-1)BS),    S and R nonsingular,

to the Dobrovol'skaya and Ponomarev canonical form [7] (see also [8]), being a direct sum, uniquely determined up to permutation of summands, of pairs of the form

    (Ir, Jr(λ)),  (Jr(0), Ir),  (Fr, G^T_r),  (F^T_r, Gr),    (27)

in which λ ∈ C and the matrices Fr and Gr are defined in (21). For each matrix M, define the matrices

    M△ := [ 0 ⋯ 0 ]        M⊲ := [ M | 0 ]
          [   M   ] ,

that are obtained by adding a zero row on top and a zero column on the right, respectively. Each block matrix whose blocks have the form T△ (in which T is defined in (2)) is denoted by H△. Each block matrix whose blocks have the form T⊲ is denoted by H⊲.

Theorem 4. Let

    (I, J) ⊕ (A, B)    (28)

be a canonical matrix pair for contragredient equivalence, in which J is a nonsingular Jordan canonical matrix,

    (A, B) := (F_p1, G^T_p1) ⊕ · · · ⊕ (F_pl, G^T_pl) ⊕ (I, J(0)) ⊕ (J′(0), I) ⊕ (G^T_q1, F_q1) ⊕ · · · ⊕ (G^T_qr, F_qr),

J(0) and J′(0) are Jordan matrices with the single eigenvalue 0, and

    p1 ≥ p2 ≥ · · · ≥ pl,    q1 ≤ q2 ≤ · · · ≤ qr.

Then one of the simple miniversal deformations of (28) has the form

    (I, J + K) ⊕ (Ā, B̄),    (29)

in which J + K is the deformation (5) of J and (Ā, B̄) is the following deformation of (A, B):

    Ā := [ F_p1   T   ⋯  T    H△    H                          ]
         [      F_p2  ⋱  ⋮    H△    H            0             ]
         [   0       ⋱   T     ⋮    ⋮                          ]
         [             F_pl   H△    H                          ]
         [        0            I    H      H⊲   ⋯   H⊲         ]
         [        0            0  J′(0)+H         0            ]
         [                     0    H    G^T_q1  T  ⋯  T       ]
         [        0            ⋮    ⋮          G^T_q2 ⋱ ⋮      ]
         [                     0    H        0     ⋱   T       ]
         [                     0    H            G^T_qr        ]

and

    B̄ := [ G^T_p1+T                      H    H⊲                     ]
         [    T   G^T_p2+T       0       H    H⊲          0          ]
         [    ⋮      ⋱                   ⋮    ⋮                      ]
         [    T    ⋯  T  G^T_pl+T        H    H⊲                     ]
         [         0                  J(0)+H  H     H   ⋯   H        ]
         [         0                     0    I          0           ]
         [    H    H  ⋯  H               0    0   F_q1+T             ]
         [    ⋮           ⋮              ⋮    ⋮     T   F_q2+T       ]
         [    H    H  ⋯  H               0    0     ⋮      ⋱         ]
         [                                          T  ⋯ T  F_qr+T   ]

Proof. The following simple miniversal deformation of (28) was obtained in [10, Theorem 5.1]: up to obvious permutations of strips, it has the form

    (I, J + K) ⊕ (A′, B′),    (30)

in which J + K is (5),

    A′ := [ F_p1+T   T   ⋯  T     0    H                          ]
          [        F_p2+T ⋱  ⋮    0    H            0             ]
          [    0         ⋱   T    ⋮    ⋮                          ]
          [              F_pl+T   0    H                          ]
          [         0             I    0       0   ⋯   0          ]
          [    H    H  ⋯  H       0  J′(0)+H        0             ]
          [                       0    H    G^T_q1  T  ⋯  T       ]
          [         0             ⋮    ⋮          G^T_q2 ⋱ ⋮      ]
          [                       0    H        0     ⋱   T       ]
          [                       0    H            G^T_qr        ]

and

    B′ := [ G^T_p1                      H    0                      ]
          [    T   G^T_p2       0       H    0           0          ]
          [    ⋮      ⋱                 ⋮    ⋮                      ]
          [    T    ⋯  T  G^T_pl        H    0                      ]
          [    H    H  ⋯  H          J(0)+H  H     H   ⋯   H        ]
          [         0                   H    I          0           ]
          [    H    H  ⋯  H             H    0   F_q1+T             ]
          [    ⋮           ⋮            ⋮    ⋮     T   F_q2+T       ]
          [    H    H  ⋯  H             H    0     ⋮      ⋱         ]
          [                                        T  ⋯ T  F_qr+T   ]


Let (C, D) be the canonical pair (28), and let (P, Q) be any matrix pair of the same size in which each entry is 0 or ∗. By [10, Theorem 2.1] (see also the beginning of the proof of Theorem 5.1 in [10]), (C + P, D + Q) is a versal (respectively, miniversal) deformation of (C, D) if and only if for every pair (M, N) of the size of (C, D) there exist square matrices S and R and a pair (respectively, a unique pair) (P̂, Q̂) obtained from (P, Q) by replacing its stars with complex numbers such that

    (M, N) + (CR − SC, DS − RD) = (P̂, Q̂).    (31)

The matrices of (C, D) are block diagonal:

    C = C1 ⊕ C2 ⊕ · · · ⊕ Ct,    D = D1 ⊕ D2 ⊕ · · · ⊕ Dt,

in which (Ci, Di) are of the form (27). Partitioning conformally the matrices of (M, N) and (P, Q) and equating the corresponding blocks in (31), we find that (C + P, D + Q) is a versal deformation of (C, D) if and only if for each pair of indices (i, j) and every pair (Mij, Nij) of the size of (Pij, Qij) there exist matrices Sij and Rij and a pair (P̂ij, Q̂ij) obtained from (Pij, Qij) by replacing its stars with complex numbers such that

    (Mij, Nij) + (CiRij − SijCj, DiSij − RijDj) = (P̂ij, Q̂ij).    (32)

Let (C + P′, D + Q′) be the deformation (30) of (C, D). Since it is versal, for each pair of indices (i, j) and every pair (Mij, Nij) of the size of (P′ij, Q′ij) there exist matrices Sij and Rij and a pair (P̂′ij, Q̂′ij) obtained from (P′ij, Q′ij) by replacing its stars with complex numbers such that

    (Mij, Nij) + (CiRij − SijCj, DiSij − RijDj) = (P̂′ij, Q̂′ij).    (33)

Let (C + P, D + Q) be the deformation (29). In order to prove that it is versal, let us verify the condition (32). If (Pij, Qij) = (P′ij, Q′ij), then (32) holds by (33). Let (Pij, Qij) ≠ (P′ij, Q′ij) for some (i, j). Since the condition (33) holds, it suffices to verify that for each (P̂′ij, Q̂′ij) obtained from (P′ij, Q′ij) by replacing its stars with complex numbers there exist matrices S and R and a pair (P̂ij, Q̂ij) obtained from (Pij, Qij) by replacing its stars with complex numbers such that

    (P̂′ij, Q̂′ij) + (CiR − SCj, DiS − RDj) = (P̂ij, Q̂ij).    (34)

The following five cases are possible.


Case 1: (Ci, Di) = (Fp, G^T_p) and i = j. Then

    (P̂′ii, Q̂′ii) = (T̂, 0),    T̂ = [     0       ]
                                   [ α1 ⋯ αp-1  ]

(we denote by T̂ any matrix obtained from T by replacing its stars with complex numbers). Taking the lower triangular Toeplitz matrices

    S := [  0                    ]        R := [  0                 ]
         [ αp-1   0              ]             [ αp-1   0           ]
         [ αp-2  αp-1  ⋱         ]             [  ⋮      ⋱   ⋱      ]
         [  ⋮      ⋱    ⋱   ⋱    ]             [ α2   ⋯  αp-1  0    ]
         [ α1    α2   ⋯ αp-1  0  ]

in (34), we obtain

    (P̂ii, Q̂ii) = ( 0, [ αp-1  0 ⋯ 0 ] ) = (0, T̂).
                      [  ⋮    ⋮   ⋮ ]
                      [ α1    0 ⋯ 0 ]

Case 2: (Ci, Di) = (Fp, G^T_p) and (Cj, Dj) = (Im, Jm(0)). Then (P̂′ij, Q̂′ij) = (0, T̂). Taking S := −T̂△ and R := 0 in (34), we obtain (P̂ij, Q̂ij) = (T̂△, 0).

Case 3: (Ci, Di) = (Im, Jm(0)) and (Cj, Dj) = (Jn(0), In). Then (P̂′ij, Q̂′ij) = (0, T̂). Taking S := 0 and R := T̂ in (34), we obtain (P̂ij, Q̂ij) = (T̂, 0).

Case 4: (Ci, Di) = (Im, Jm(0)) and (Cj, Dj) = (G^T_q, Fq). Then (P̂′ij, Q̂′ij) = (0, T̂). Taking S := 0 and R := T̂⊲ in (34), we obtain (P̂ij, Q̂ij) = (T̂⊲, 0).

Case 5: (Ci, Di) = (Jn(0), In) and (Cj, Dj) = (Fp, G^T_p). Then (P̂′ij, Q̂′ij) = (T̂, 0). Taking S := T̂⊲ and R := 0 in (34), we obtain (P̂ij, Q̂ij) = (0, T̂⊲).

We have proved that the deformation (29) is versal. It is miniversal since it has the same number of parameters as the miniversal deformation (30). ⊓⊔

Remark 3. The deformation (I, J + K) ⊕ (A, B) from Theorem 4 can be made block triangular by the following permutations of its rows and columns, which are transformations of contragredient equivalence:

– First, we reduce (I, J + K) to the form (I, J# + K#), in which J# + K# is defined in (9).
– Second, we reduce the diagonal block J(0) + H in B to the form J(0)# + H# (defined in (8)) by the permutations of rows and columns of B described in Definition 3. Then we make the contragredient permutations of rows and columns of A.
– Finally, we reduce the diagonal block J′(0) + H in A to the form J′(0)# + H# (defined in (8)) by the permutations of rows and columns of A described in Definition 3, and make the contragredient permutations of rows and columns of B. The obtained deformation J′(0)# + H# is lower block triangular; we make it upper block triangular by the transformation P(J′(0)# + H#)P, where

    P := [ 0     1 ]
         [   ⋰    ]
         [ 1     0 ]

(i.e., we rearrange in the inverse order the rows and columns of A that cross J′(0)# + H# and make the contragredient permutations of rows and columns of B).

References

1. V. I. Arnold, On matrices depending on parameters, Russian Math. Surveys, 26 (no. 2) (1971), pp. 29-43.
2. V. I. Arnold, Lectures on bifurcations in versal families, Russian Math. Surveys, 27 (no. 5) (1972), pp. 54-123.
3. V. I. Arnold, Geometrical Methods in the Theory of Ordinary Differential Equations, Springer-Verlag, New York, 1988.
4. G. R. Belitskii, Normal forms in a space of matrices, in Analysis in Infinite-Dimensional Spaces and Operator Theory, V. A. Marchenko, ed., Naukova Dumka, Kiev, 1983, pp. 3-15 (in Russian).
5. G. R. Belitskii, Normal forms in matrix spaces, Integral Equations Operator Theory, 38 (2000), pp. 251-283.
6. F. R. Gantmacher, Matrix Theory, Vol. 2, AMS Chelsea Publishing, Providence, RI, 2000.
7. N. M. Dobrovol'skaya and V. A. Ponomarev, A pair of counter operators, Uspehi Mat. Nauk, 20 (no. 6) (1965), pp. 80-86.
8. R. A. Horn and D. I. Merino, Contragredient equivalence: a canonical form and some applications, Linear Algebra Appl., 214 (1995), pp. 43-92.
9. A. Edelman, E. Elmroth, and B. Kågström, A geometric approach to perturbation theory of matrices and matrix pencils. Part I: Versal deformations, SIAM J. Matrix Anal. Appl., 18 (1997), pp. 653-692.
10. M. I. García-Planas and V. V. Sergeichuk, Simplest miniversal deformations of matrices, matrix pencils, and contragredient matrix pencils, Linear Algebra Appl., 302-303 (1999), pp. 45-61 (some misprints of this paper were corrected in its preprint arXiv:0710.0946).
11. V. V. Sergeichuk, Canonical matrices for linear matrix problems, Linear Algebra Appl., 317 (2000), pp. 53-102.
12. H. Shapiro, The Weyr characteristic, Amer. Math. Monthly, 106 (1999), pp. 919-929.
13. E. Weyr, Répartition des matrices en espèces et formation de toutes les espèces, C. R. Acad. Sci. Paris, 100 (1885), pp. 966-969.

Determining the Schein rank of Boolean matrices

Evgeny E. Marenich⋆
Murmansk State Pedagogic University
[email protected]

Abstract. In this paper we present some results on the Schein rank of Boolean matrices. A notion of the intersection number of a bipartite graph is defined, and its applications to the Schein rank of Boolean matrices are derived. We discuss minimal and maximal matrices of a given Schein rank and the number of m × n Boolean matrices with a given Schein rank. The Schein ranks of some m × n Boolean matrices are determined. In the last section, we give some further results concerning the Schein rank of Boolean matrices.

Keywords: Boolean matrix, Schein rank, coding functions for bipartite graphs.

1 Introduction

The following are described in Sections 2 and 3:
1. the set of all m × n minimal Boolean matrices of Schein rank k;
2. the set of all m × n maximal Boolean matrices of Schein rank 2, 3;
3. some maximal m × n Boolean matrices of Schein rank k.

In Section 4 we define the intersection number of a bipartite graph Γ and prove that the intersection number is equal to the minimum number of maximal complete bipartite subgraphs whose union includes all edges of Γ. In Section 5 we define a k-canonical family CS(k) of bipartite graphs and obtain the family CS(2) and some graphs in the family CS(3). In Section 6, we apply the intersection number and canonical families to determining the Schein rank of Boolean matrices. In particular, formulas for the number of all m × n Boolean matrices of Schein rank k are obtained. In Section 7, coding of bipartite graphs is studied. In Section 8, we define the bipartite intersection graphs and investigate the Schein rank of associated matrices. In Section 9, we give some further results concerning the Schein rank of Boolean matrices.

⋆ This research is conducted in accordance with the Thematic plan of the Russian Federal Educational Agency, theme №1.03.07.

2 The Schein rank of Boolean matrices

Our notation and terminology are similar to those of [1], [4]. We collect in this section a number of results and definitions required later. Where possible, we state simple corollaries of these results without proof; a detailed treatment may be found in [3], [4].

Let U be a finite set and 2^U the collection of all subsets of U. The number of elements in U is denoted by |U|. Let Bul(U) = (2^U, ⊆) be the Boolean algebra (or poset) of all subsets of a finite set U partially ordered by inclusion, and let Bul(k) be the Boolean algebra of all subsets of a finite set of k elements. Let P = {0̃, 1̃} be a two-element Boolean lattice with the greatest element 1̃ and the least element 0̃. The lattice operations meet ∧ and join ∨ are defined as follows:

    ∧ | 0̃  1̃        ∨ | 0̃  1̃
    0̃ | 0̃  0̃        0̃ | 0̃  1̃
    1̃ | 0̃  1̃        1̃ | 1̃  1̃

Following [4], we recall some definitions. Let P^{m×n} denote the set of all m × n (Boolean) matrices with entries in P. Matrices with entries in P will be denoted by roman capitals: A = ||aij||_{m×n}, B = ||bij||_{m×n}, C = ||cij||_{m×n}, X = ||xij||_{m×n}, .... The usual definitions of addition and multiplication of matrices over a field apply to Boolean matrices as well. The n × n identity matrix E = E_{n×n} is the matrix such that

    eij = 1̃ if i = j,    eij = 0̃ if i ≠ j.

Denote by Ē_{n×n} the n × n matrix with 0̃ entries on the main diagonal and 1̃ elsewhere. The m × n zero matrix 0_{m×n} is the matrix all of whose entries are 0̃. The m × n universal matrix J_{m×n} is the matrix all of whose entries are 1̃. The transpose of A will be denoted by A^(t). Define a partial ordering ≤ on P^{m×n} by A ≤ B iff aij ≤ bij for all i, j. Let A^(r) (A_(r)) denote the r-th column (row) of A. A subspace of P^{m×1} (P^{1×n}) is a subset of P^{m×1} (P^{1×n}) containing the zero vector and closed under addition. The column space Column(A) of a matrix A is the span of the set of all columns of A; likewise one has the row space Row(A) of A. The definitions of the column rank rank_c(A) and row rank rank_r(A) of A are due to Kim [4]. The Schein rank rank_s(A) of a nonzero matrix A is the least k such that A is a sum of k matrices of the form uv with u ∈ P^{m×1}, v ∈ P^{1×n}.
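As an illustration (ours, not the paper's), Boolean matrix addition and multiplication over the two-element lattice can be modelled in Python with 0/1 entries, addition being entrywise join and multiplication a join of meets:

```python
def bool_add(A, B):
    """Entrywise join: (A + B)_ij = a_ij or b_ij."""
    return [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def bool_mult(A, B):
    """Boolean product: (AB)_ij = OR over t of (a_it and b_tj)."""
    return [[int(any(A[i][t] & B[t][j] for t in range(len(B))))
             for j in range(len(B[0]))] for i in range(len(A))]

E = [[1, 0], [0, 1]]        # the identity matrix E_{2x2}
J = [[1, 1], [1, 1]]        # the universal matrix J_{2x2}
print(bool_mult(J, J))      # [[1, 1], [1, 1]]  -- J is idempotent
print(bool_add(E, J) == J)  # True: E <= J in the partial ordering above
```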

Theorem 1 (Kim, Roush, [3]). Let A ∈ P^{m×n}, A ≠ 0_{m×n}. Then the following conditions are equivalent:
(i) rank_s(A) = k.
(ii) k is the least integer such that A is a product of an m × k matrix and a k × n matrix.
(iii) k is the smallest dimension of a subspace W such that W contains the column space Column(A) (row space Row(A)).

Example. We have Column(E_{n×n}) = P^{n×1}. From Theorem 1(iii) it follows that rank_s(E_{n×n}) = n.
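Theorem 1(ii) yields a direct, if very expensive, way to compute rank_s for tiny matrices: a factorization A = BC with inner dimension k is the same as writing the support of A as a union of k all-ones blocks R × C contained in it. A Python sketch of this brute-force search (ours, only meant as an illustration on small examples):

```python
from itertools import combinations

def schein_rank(A):
    """Least k such that A is a Boolean product of an m-by-k and a k-by-n matrix,
    computed as the least number of all-ones blocks R x C inside the support of A
    whose union is the whole support.  Exponential; tiny matrices only."""
    m, n = len(A), len(A[0])
    support = {(i, j) for i in range(m) for j in range(n) if A[i][j]}
    if not support:
        return 0
    crosses = []
    for mask in range(1, 1 << m):               # nonempty row sets R
        rows = [i for i in range(m) if mask >> i & 1]
        cols = [j for j in range(n) if all(A[i][j] for i in rows)]
        if cols:                                 # largest C for this R
            crosses.append({(i, j) for i in rows for j in cols})
    for k in range(1, len(support) + 1):
        for combo in combinations(crosses, k):
            if set().union(*combo) == support:
                return k

E3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(schein_rank(E3))    # 3, in accordance with the Example above
```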

The following theorem is due to Kim [4].

Theorem 2. Let A ∈ P^{m×n}. Then:
(i) rank_s(A) = rank_s(A^(t)).
(ii) rank_s(A) ≤ min{rank_c(A), rank_r(A)}.
(iii) rank_s(A) ≤ min{m, n}.
(iv) If Column(A) ≤ Column(B), then rank_s(A) ≤ rank_s(B).

Corollary 1. Let A ∈ P^{m×n}. If B is a submatrix of A, then rank_s(B) ≤ rank_s(A).

Corollary 2. Let A1, ..., Ak ∈ P^{m×n}. Then

    rank_s(A1 + A2 + ... + Ak) ≤ rank_s(A1) + rank_s(A2) + ... + rank_s(Ak).

Corollary 3. Let A1, ..., Ak be Boolean matrices. If the product A1A2...Ak is defined, then rank_s(A1A2...Ak) ≤ rank_s(Ai) for i = 1, ..., k; that is,

    rank_s(A1A2...Ak) ≤ min{rank_s(A1), rank_s(A2), ..., rank_s(Ak)}.

Example. If A ∈ P^{n×n} is invertible, then rank_s(A) = n.

A square matrix is called a permutation matrix if every row and every column contains exactly one 1̃.

Corollary 4. Let A ∈ P^{m×n} and let π ∈ P^{m×m}, σ ∈ P^{n×n} be permutation matrices. Then rank_s(πA) = rank_s(Aσ) = rank_s(A).

Corollary 5. Let A ∈ P^{n×n}. Then rank_s(A) ≥ rank_s(A²) ≥ rank_s(A³) ≥ ....

3 Matrices of Schein rank 2, 3

Let A ∈ P^{m×n}. Denote by ρ(A) the number of 1̃'s in A. By Chrk(m, n) we denote the set of all matrices A ∈ P^{m×n} such that rank_s(A) = k, where min{m, n} ≥ k. The term 1̃-rank of a matrix A ∈ P^{m×n} is the maximum number of 1̃ entries of A no two of which share a row or a column of A; we denote it by ρt(A). By the König theorem [1], the term 1̃-rank of A equals the minimum number of rows and columns of A containing all 1̃ entries of A.

An element a of a poset (Q, ≤) is maximal if whenever a ≤ x, then a = x. Minimal elements are defined dually. The set of all minimal matrices in (Chrk(m, n), ≤) is described in the following theorem.

Theorem 3. Let m, n ≥ k. A matrix A is minimal in (Chrk(m, n), ≤) iff ρ(A) = k and A has a k × k permutation submatrix.

Proof. If ρ(A) = k and A has a k × k permutation submatrix, then A is minimal in (Chrk(m, n), ≤). Let C ∈ Chrk(m, n). We first show that ρt(C) ≥ k. Suppose, to the contrary, that ρt(C) < k. By the König theorem [1], there are ρt(C) rows and columns of C containing all 1̃ entries of C, so rank_s(C) ≤ ρt(C) < k, which contradicts rank_s(C) = k. Therefore ρt(C) ≥ k, and hence there exists a matrix A ∈ P^{m×n} such that A ≤ C, ρ(A) = k, and A has a k × k permutation submatrix. ⊓⊔

The number of all minimal matrices in (Chrk(m, n), ≤) is

    n(n − 1)...(n − k + 1) · m(m − 1)...(m − k + 1).

Let ∆k ∈ P^{k×k} have the following form:

    ∆k = [ 1̃ 1̃ ⋯ 1̃ 1̃ ]
         [ 0̃ 1̃ ⋯ 1̃ 1̃ ]
         [ ⋮    ⋱    ⋮ ]
         [ 0̃ 0̃ ⋯ 0̃ 1̃ ]

From [5] it follows that rank_s(∆k) = k for k ≥ 1. Let ∼ be the equivalence relation on P^{m×n} defined by B ∼ C iff C = πBσ for some permutation matrices π ∈ P^{m×m}, σ ∈ P^{n×n}. Now we obtain some maximal matrices in (Chrk(m, n), ≤).


Theorem 4. Let A ∈ P^{m×n}. If there exists a submatrix B of A such that B ∼ ∆k and B contains all 0̃ entries of A, then A is maximal in (Chrk(m, n), ≤).

Proof. We have rank_s(A) = rank_s(∆k) = k. It suffices to show that ∆k is maximal in (Chrk(k, k), ≤). Let B be obtained from ∆k by replacing a selection of the 0̃'s by 1̃'s, and let r be the least integer such that b_ir ≠ (∆k)_ir for some i. Then B^(r) is a span of some of the columns B^(i), i ≠ r. Therefore rank_s(B) < k. ⊓⊔

The set of all maximal matrices of (Chr2(m, n), ≤) is described in the following theorem.

Theorem 5. Let A ∈ P^{m×n} and m, n ≥ 2. Then a matrix A is maximal in (Chr2(m, n), ≤) iff exactly one entry of A is 0̃.

The number of all maximal elements in the poset (Chr2(m, n), ≤) is nm. The set of all maximal matrices in the poset (Chr3(m, n), ≤) is described in the following theorem.

Theorem 6. Let A ∈ P^{m×n} and m, n ≥ 3. A matrix A is maximal in (Chr3(m, n), ≤) iff there exists a submatrix B of A such that B ∼ ∆3 or B ∼ Ē_{3×3}, and B contains all 0̃ entries of A.

Proof. Let C ∈ Chr3(m, n). By the König theorem, the 0̃-term rank of C equals the minimum number of rows and columns of C containing all 0̃ entries of C. The proof is divided into the following cases.

Case 1: there exist three 0̃ entries such that no two of them share a row or a column of C. The matrix A obtained from C by replacing all other 0̃ entries by 1̃ is maximal in (Chr3(m, n), ≤).

Case 2: there exist two rows and columns of C containing all 0̃ entries of C. If there were a single row (or a single column) of C containing all 0̃ entries of C, then rank_s(C) ≤ 2, a contradiction.

Case 2.1: there exist two columns of C containing all 0̃ entries of C. Then there exists a submatrix B of C such that rank_s(B) = rank_s(C) = 3 and each row of B is a row of the matrix

    D = [ 0̃ 0̃ 1̃ ]
        [ 1̃ 0̃ 1̃ ]
        [ 0̃ 1̃ 1̃ ]
        [ 1̃ 1̃ 1̃ ]

By considering all matrices B such that rank_s(B) = 3, we conclude the proof in this case.

Case 2.2: there exist a row and a column of C containing all 0̃ entries of C. It is easy to see that rank_s(C) = rank_s(A) for some matrix A such that ρ(A) = n − 3 and A has a submatrix B such that B ∼ ∆3. ⊓⊔

Remark. The matrix Ek×k is not maximal in (Chrk (k, k), 6) for k > 5. 4

On coding of bipartite graphs by sets

Let Γ = Γ(V1 ∪ V2, E) be a bipartite graph with bipartition V1 = {1, 2, 3, . . .}, V2 = {1', 2', 3', . . .}, and let U be a finite set. A function f : V1 ∪ V2 → 2^U is called a U-coding function for Γ if for any vertices v1, v2 the conditions {v1, v2} ∈ E and f(v1) ∩ f(v2) ≠ ∅ are equivalent. We call f(v) the code of v ∈ V1 ∪ V2. Note that coding functions exist for any bipartite graph Γ.

The intersection number nintbp(Γ) of a bipartite graph Γ = Γ(V1 ∪ V2, E) is the least number |U| such that there exists a U-coding function for Γ. Note that every maximal complete bipartite subgraph has at least one edge. The following example clarifies the above definitions.

Example. Let Γ1 be the bipartite graph on the vertices 1, 2, 3, 4 and 1', 2', 3', 4' shown below.

[Figure: the graph Γ1.]

Then some maximal complete bipartite subgraphs of Γ1 are the complete bipartite graphs on the vertex sets {1, 2; 1', 2'}, {2, 3; 2', 3'} and {3, 4; 3', 4'}.

[Figure: the three maximal complete bipartite subgraphs of Γ1.]

In the following theorem we show that the intersection number nintbp(Γ) of a bipartite graph is closely connected to the set of all complete bipartite subgraphs of Γ.

Theorem 7. Let Γ = Γ(V1 ∪ V2, E) be a bipartite graph. The intersection number nintbp(Γ) is equal to the minimum number of maximal complete bipartite subgraphs whose union includes all edges of Γ.

Determining the Schein rank of Boolean matrices

Proof. Let nintbp(Γ) = k and let f be a U-coding function for Γ, where U = {1, . . . , k}. Define the sets

Vr = {v | v ∈ V1 ∪ V2, r ∈ f(v)}, r = 1, . . . , k.

Note that Vr ≠ ∅, r = 1, . . . , k. Let Γr be the subgraph of Γ with vertex set Vr. Then Γr is a complete bipartite subgraph of Γ, and the union of the subgraphs Γ1, . . . , Γk includes all edges of Γ. Each subgraph Γ1, . . . , Γk is contained in some maximal complete bipartite subgraph. Therefore the minimum number of maximal complete bipartite subgraphs whose union includes all edges of Γ is less than or equal to k = nintbp(Γ).

Conversely, let the minimum number of complete bipartite subgraphs whose union includes all edges of Γ be equal to k, and let the union of Γ1, . . . , Γk include all edges of Γ. For any v ∈ V1 ∪ V2 define the set f(v) by: r ∈ f(v) iff v is a vertex of Γr. We now prove that f : V1 ∪ V2 → 2^U, U = {1, . . . , k}, is a U-coding function for Γ.

Let v1 ∈ V1, v2 ∈ V2 and {v1, v2} ∈ E. Then {v1, v2} is an edge of some Γr. Therefore r ∈ f(v1) ∩ f(v2) and f(v1) ∩ f(v2) ≠ ∅. Conversely, let v1 ∈ V1, v2 ∈ V2 and f(v1) ∩ f(v2) ≠ ∅. Then there exists r ∈ f(v1) ∩ f(v2), so v1 and v2 are both vertices of the complete bipartite subgraph Γr. Therefore {v1, v2} is an edge of Γr, and {v1, v2} ∈ E. We have proved that f is a U-coding function for Γ. Then nintbp(Γ) is less than or equal to the minimum number of maximal complete bipartite subgraphs whose union includes all edges of Γ.

Thus nintbp(Γ) equals the minimum number of maximal complete bipartite subgraphs whose union includes all edges of Γ. ⊓⊔

Example. The minimum number of maximal complete bipartite subgraphs whose union includes all edges of Γ1 is equal to 3. Therefore nintbp(Γ1) = 3.
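The characterization in Theorem 7 above lends itself to a brute-force check. The sketch below is our own illustration, not part of the chapter: since the exact edge set of Γ1 cannot be recovered from the figure, it uses a hypothetical path-like bipartite graph, and the function names are ours. It computes the intersection number as a minimum cover of the edges by complete bipartite subgraphs.

```python
from itertools import combinations

def bicliques(V1, V2, E):
    """All complete bipartite subgraphs (S, T) with S x T inside E, T maximal for S."""
    out = []
    for r in range(1, len(V1) + 1):
        for S in combinations(V1, r):
            T = [v for v in V2 if all((u, v) in E for u in S)]
            if T:
                out.append((set(S), set(T)))
    return out

def nintbp(V1, V2, E):
    """Intersection number = least size of a biclique edge cover (Theorem 7)."""
    cands = bicliques(V1, V2, E)
    edges = set(E)
    for k in range(1, len(edges) + 1):
        for cover in combinations(cands, k):
            covered = {(u, v) for S, T in cover for u in S for v in T}
            if edges <= covered:
                return k
    return 0

# A 4+4 path-like bipartite graph (our own small example): i -- j' when |i - j| <= 1.
V1, V2 = [1, 2, 3, 4], [1, 2, 3, 4]
E = {(i, j) for i in V1 for j in V2 if abs(i - j) <= 1}
print(nintbp(V1, V2, E))  # 3
```

Restricting the search to bicliques with a maximal second side loses nothing, since enlarging a covering biclique keeps it a cover; this keeps the brute force feasible for toy graphs.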

5

On canonical bipartite graphs

Let Γ = Γ(V1 ∪ V2, E) be a bipartite graph with bipartition V1 = {1, 2, 3, . . .}, V2 = {1', 2', 3', . . .}. Denote by V̂1 the set of all nonisolated vertices of V1. In the same way, we define V̂2. Define the sets

E(v) = {z | {v, z} ∈ E}, v ∈ V1 ∪ V2.

Let ∼ be the equivalence relation on V̂1 ∪ V̂2 defined by: u ∼ v whenever E(u) = E(v). Let Γc = Γ(V1' ∪ V2', E') be the bipartite graph with bipartition V1', V2', where V1' = V̂1/∼, V2' = V̂2/∼ are the quotient sets and E' is defined by:

{î, ĵ'} ∈ E' iff {i, j} ∈ E.

We call Γc a canonical representation of Γ.

Example. Consider a graph Γ and its canonical representation Γc.

[Figure: the graph Γ and its canonical representation Γc.]

Lemma 1. For any bipartite graph Γ, the following statements are valid.

(i) nintbp(Γ) = nintbp(Γc).
(ii) If nintbp(Γ) = k, then k ≤ |V1'|, |V2'| ≤ 2^k − 1.

Let CS(k) be the set of all nonisomorphic canonical representations of bipartite graphs of intersection number k. We call CS(k) a k-canonical family. Any canonical representation of a bipartite graph is called a canonical graph.

Example.

1. The canonical family CS(1) contains a unique graph: a single edge.

2. The canonical family CS(2) contains four graphs.

[Figure: the four canonical graphs of CS(2).]

3. In CS(3) we consider all graphs with three vertices in each part of the bipartition V1', V2'.

[Figure: the canonical graphs of CS(3) with |V1'| = |V2'| = 3.]


The canonical families CS(k) give us all bipartite graphs Γ = Γ(V1 ∪ V2, E) such that nintbp(Γ) = k. Let Fk(m, n) be the number of all bipartite graphs Γ = Γ(V1 ∪ V2, E) such that V1 = {1, . . . , m}, V2 = {1', 2', . . . , n'} and nintbp(Γ) = k. We have

F1(m, n) = (2^m − 1)(2^n − 1), F1(n, n) = (2^n − 1)^2.   (1)

For the canonical family CS(2), we obtain the following theorem.

Theorem 8. For all m, n ≥ 1,

F2(m, n) = (3/2)(3^m − 2·2^m + 1)(3^n − 2·2^n + 1)
         + (1/2)(3^m − 2·2^m + 1)(4^n − 3·3^n + 3·2^n − 1)
         + (1/2)(3^n − 2·2^n + 1)(4^m − 3·3^m + 3·2^m − 1)
         + (1/2)(4^m − 3·3^m + 3·2^m − 1)(4^n − 3·3^n + 3·2^n − 1).   (2)

In particular, for all n ≥ 1,

F2(n, n) = (3/2)(3^n − 2·2^n + 1)^2
         + (3^n − 2·2^n + 1)(4^n − 3·3^n + 3·2^n − 1)
         + (1/2)(4^n − 3·3^n + 3·2^n − 1)^2.   (3)
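Formula (2) is easy to check numerically. The following sketch is our own illustration (the helpers a and b are just shorthand for the two repeated factors); it reproduces the values F2(2, 2) = 6 and F2(3, 3) = 306 obtained from canonical families later in the chapter.

```python
def F2(m, n):
    # Formula (2): the number of m x n Boolean matrices of Schein rank 2.
    a = lambda t: 3**t - 2 * 2**t + 1            # first repeated factor
    b = lambda t: 4**t - 3 * 3**t + 3 * 2**t - 1  # second repeated factor
    # All four summands carry denominator 2, so collect them and halve once.
    return (3 * a(m) * a(n) + a(m) * b(n) + a(n) * b(m) + b(m) * b(n)) // 2

print(F2(2, 2), F2(3, 3))  # 6 306
```

The symmetry F2(m, n) = F2(n, m) is also visible directly from the formula.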

6

On the Schein rank of Boolean matrices and the intersection number of associated graphs

Let A ∈ P^{m×n} and let U be a finite set. To a matrix A ∈ P^{m×n} associate the bipartite graph Γ(A) = Γ(V1 ∪ V2, E) with bipartition V1 = {1, . . . , m}, V2 = {1', 2', . . . , n'} and with the set of edges E such that {i, j'} ∈ E if and only if aij = 1̃. The following theorem reduces the Schein rank problem for any matrix A to determining the intersection number of Γ(A).

Theorem 9.

The Schein rank of A equals the intersection number of Γ(A).

Proof. We first prove that nintbp(Γ(A)) ≤ rank_s(A). Let rank_s(A) = k. Then

A = C1 D1 + C2 D2 + . . . + Ck Dk

for some C1, C2, . . . , Ck ∈ P^{m×1}, D1, D2, . . . , Dk ∈ P^{1×n}. Define the sets:

f(i) = {j | (Cj)_(i) = 1̃, j = 1, . . . , k}, i = 1, . . . , m,
f(j') = {i | (Di)^(j) = 1̃, i = 1, . . . , k}, j = 1, . . . , n.


Let U = {1, . . . , k} and consider the function f : V1 ∪ V2 → 2^U. We now prove that f is a U-coding function for Γ(A). The following statements are equivalent:

– aij = 1̃;
– 1̃ = (Cr Dr)_ij = (Cr)_(i) ∧ (Dr)^(j) for some r;
– there exists r such that r ∈ f(i), r ∈ f(j');
– f(i) ∩ f(j') ≠ ∅.

We have proved that aij = 1̃ iff f(i) ∩ f(j') ≠ ∅. Therefore f is a U-coding function for Γ(A). Thus nintbp(Γ(A)) ≤ k = rank_s(A).

We now prove that rank_s(A) ≤ nintbp(Γ(A)). Let nintbp(Γ(A)) = k and let f : V1 ∪ V2 → 2^U be a U-coding function for Γ(A), where U = {1, . . . , k}. Define column vectors C1, C2, . . . , Ck ∈ P^{m×1} by setting: (Cr)_(i) = 1̃ iff r ∈ f(i), i = 1, . . . , m, r = 1, . . . , k. Similarly, define row vectors D1, D2, . . . , Dk ∈ P^{1×n} by setting: (Dr)^(j) = 1̃ iff r ∈ f(j'), j = 1, . . . , n, r = 1, . . . , k. We claim that A = C1 D1 + C2 D2 + . . . + Ck Dk. Indeed, the following statements are equivalent:

– aij = 1̃;
– f(i) ∩ f(j') ≠ ∅;
– there exists r such that r ∈ f(i), r ∈ f(j');
– there exists r such that (Cr)_(i) = 1̃, (Dr)^(j) = 1̃;
– (Cr)_(i) ∧ (Dr)^(j) = (Cr Dr)_ij = 1̃ for some r;
– (C1 D1 + C2 D2 + . . . + Ck Dk)_ij = 1̃.

Therefore

A = C1 D1 + C2 D2 + . . . + Ck Dk, rank_s(A) ≤ k = nintbp(Γ(A)).

We have proved that rank_s(A) = nintbp(Γ(A)). ⊓⊔

From Theorem 9 and [2, Remark 6.7], we obtain the following corollary.

Corollary 6 ([5]). Let A ∈ P^{m×n}. The Schein rank of A is equal to the minimum number of complete bipartite subgraphs whose union includes all edges of Γ(A).
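Corollary 6 suggests a direct, if exponential, way to compute the Schein rank of a small matrix: cover its 1̃ entries by all-ones submatrices, which correspond to complete bipartite subgraphs of Γ(A). The sketch below is our own brute-force illustration (the function name is ours) and is feasible only for tiny matrices.

```python
from itertools import combinations

def schein_rank(A):
    """Schein (Boolean) rank of a 0/1 matrix, computed as a minimum cover of its
    1-entries by all-ones submatrices (Corollary 6). Brute force; tiny inputs only."""
    m, n = len(A), len(A[0])
    ones = {(i, j) for i in range(m) for j in range(n) if A[i][j]}
    if not ones:
        return 0
    rects = []                                   # all-ones rectangles, maximal in columns
    for r in range(1, m + 1):
        for rows in combinations(range(m), r):
            cols = [j for j in range(n) if all(A[i][j] for i in rows)]
            if cols:
                rects.append({(i, j) for i in rows for j in cols})
    for k in range(1, len(ones) + 1):
        if any(ones <= set().union(*c) for c in combinations(rects, k)):
            return k

# The circulant from the example above, with n = 4: its Schein rank is n.
A = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [1, 0, 0, 1]]
print(schein_rank(A))  # 4
```

Here every all-ones rectangle of the circulant has at most two cells, so at least four rectangles are needed to cover its eight 1-entries, in agreement with the example's count of maximal complete bipartite subgraphs.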

Example. Let A ∈ P^{n×n} have the following form:

    1̃ 1̃ 0̃ . . . 0̃ 0̃
    0̃ 1̃ 1̃ . . . 0̃ 0̃
A = · · · · · · · · ·
    0̃ 0̃ 0̃ . . . 1̃ 1̃
    1̃ 0̃ 0̃ . . . 0̃ 1̃

Then Γ(A) has the following form:

[Figure: the graph Γ(A), a cycle-like bipartite graph.]

Note that Γ(A) has 2n edges and any maximal complete bipartite subgraph of Γ(A) contains two edges. Therefore the minimum number of maximal complete bipartite subgraphs whose union includes all edges of Γ(A) is n. Thus rank_s(A) = n.

The canonical families CS(k) give us all matrices A ∈ P^{m×n} such that rank_s(A) = k.

Theorem 10.

Let m, n ≥ 1 and min{m, n} ≥ k. Then |Chr_k(m, n)| = Fk(m, n).

Proof. The number of all matrices A ∈ P^{m×n} such that rank_s(A) = k is equal to the number of all bipartite graphs Γ = Γ(V1 ∪ V2, E) such that V1 = {1, . . . , m}, V2 = {1', 2', . . . , n'} and nintbp(Γ) = k. ⊓⊔

The results of Section 5 give us the formulas for |Chr_1(m, n)| and |Chr_2(m, n)|.

Example. 1. The number of all matrices A ∈ P^{2×2} such that rank_s(A) = k is equal to Fk(2, 2). Using canonical families, we get: F0(2, 2) = 1, F1(2, 2) = 9, F2(2, 2) = 6.

2. The number of all matrices A ∈ P^{3×3} such that rank_s(A) = k is equal to Fk(3, 3). Using canonical families, we get: F0(3, 3) = 1, F1(3, 3) = 49, F2(3, 3) = 306, F3(3, 3) = 156.

From the proof of Theorem 9 we obtain the following statements. If f : V1 ∪ V2 → 2^U is a U-coding function for Γ(A) and U = {1, . . . , k}, then A = XY, where X ∈ P^{m×k}, Y ∈ P^{k×n} are given by:

xij = 1̃ iff j ∈ f(i), i ∈ V1, j ∈ U;   (4)
yij = 1̃ iff i ∈ f(j'), i ∈ U, j' ∈ V2.   (5)

Thus X_(i) is associated to the set f(i) and Y^(j) is associated to the set f(j'). Conversely, if A = XY, where X ∈ P^{m×k}, Y ∈ P^{k×n}, then f : V1 ∪ V2 → 2^U given by (4), (5) is a U-coding function for Γ(A).

7

On coding of bipartite graphs by antichains

Let A ∈ P^{m×n} be a matrix, Γ(A) = Γ(V1 ∪ V2, E) the bipartite graph associated to A, and f : V1 ∪ V2 → 2^U a U-coding function for Γ(A).

For a given real number x, denote by ⌊x⌋ the greatest integer that is less than or equal to x. Similarly, ⌈x⌉ is the least integer that is greater than or equal to x.

Denote by l = N(k) the least number l such that k ≤ bin(l, ⌊l/2⌋), where k ∈ N. We have: N(1) = 1, N(2) = 2, N(3) = 3, N(4) = N(5) = N(6) = 4, N(7) = . . . = N(10) = 5, N(11) = . . . = N(20) = 6, N(21) = . . . = N(35) = 7, N(36) = . . . = N(70) = 8, N(71) = . . . = N(126) = 9, N(127) = . . . = N(252) = 10, N(253) = . . . = N(462) = 11, N(463) = . . . = N(924) = 12, N(925) = . . . = N(1716) = 13.

Consider the following properties of N(k).
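The numbers N(k) are straightforward to tabulate. A minimal sketch (our own; it just searches for the least l with k ≤ bin(l, ⌊l/2⌋)):

```python
from math import comb

def N(k):
    """Least l such that k <= bin(l, floor(l/2))."""
    l = 1
    while comb(l, l // 2) < k:
        l += 1
    return l

print([N(k) for k in range(1, 12)])  # [1, 2, 3, 4, 4, 4, 5, 5, 5, 5, 6]
```

The breakpoints of N(k) are exactly the central binomial coefficients bin(l, ⌊l/2⌋) = 1, 2, 3, 6, 10, 20, 35, 70, . . ., matching the table above.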

Lemma 2. Let q, t, k ∈ N, 1 ≤ q ≤ k. Then:

(i) k ≥ N(bin(k, q));
(ii) k = N(bin(k, q)) for any given t ≥ 1 and sufficiently large k = 2q − t;
(iii) k = N(bin(k, q)) for any given t ≥ 1 and sufficiently large k = 2q + t.

Proof. The equality k = N(bin(k, q)) is equivalent to

bin(k − 1, ⌊(k − 1)/2⌋) < bin(k, q).   (6)

Let t be even, t = 2a. Then (6) is equivalent to

q(q − 1) . . . (q − a + 1) < 2(q − a)(q − a − 1) . . . (q − 2a + 1).   (7)

Both sides of (7) are polynomials of degree a in the single variable q. Comparing their leading coefficients, we see that (6) holds for any sufficiently large q. Let t be odd; similar reasoning gives (6). ⊓⊔

For k = 2q ± t, we can get a more precise result.

Corollary 7. Let q, k ∈ N, 1 ≤ q ≤ k. The equality k = N(bin(k, q)) holds if:

k = 2q for all q;
k = 2q − 1 for all q ≥ 2;
k = 2q − 2 for all q ≥ 3;
k = 2q − 3 for all q ≥ 3;
k = 2q − 4 for all q ≥ 8.

Corollary 8. Let q, k ∈ N, 1 ≤ q ≤ k. The equality k = N(bin(k, q)) holds if:

k = 2q + 1 for all q ≥ 1;
k = 2q + 2 for all q ≥ 1;
k = 2q + 3 for all q ≥ 4.

In particular,

N(bin(k, ⌊k/2⌋)) = N(bin(k, ⌈k/2⌉)) = k,  k ≥ 1.

A subset B of a poset (Q, ≤) is a ≤-antichain if for any pair of distinct elements x and y of B, both x ≰ y and y ≰ x. The following lemma is useful for the calculation of the Schein rank of Boolean matrices.

Lemma 3. Let A ∈ P^{m×n}. Then:

(i) If the family of all rows of A is a ≤-antichain, then rank_s(A) ≥ N(m).
(ii) If the family of all columns of A is a ≤-antichain, then rank_s(A) ≥ N(n).

Proof. (i) Let f : V1 ∪ V2 → 2^U be a U-coding function for Γ(A), where |U| = rank_s(A). We now prove that {f(i) | i ∈ V1} is a ⊆-antichain. Suppose f(i1) ⊆ f(i2) for some i1, i2 ∈ V1. Then A = XY, where X ∈ P^{m×k}, Y ∈ P^{k×n} are given by (4) and (5). According to the definition of X, if x_{i1 j} = 1̃ then j ∈ f(i1) ⊆ f(i2) and x_{i2 j} = 1̃, for any j ∈ U. Therefore

X_(i1) ≤ X_(i2),  A_(i1) = X_(i1) Y ≤ X_(i2) Y = A_(i2).

Since the family of all rows of A is a ≤-antichain, we see that i1 = i2, so {f(i) | i ∈ V1} is a ⊆-antichain of the poset Bul(U). By Sperner's theorem [1], we have

m ≤ bin(|U|, ⌊|U|/2⌋),

and therefore rank_s(A) = |U| ≥ N(m). ⊓⊔

We say that A ∈ P^{n×n} is an (n, k, λ) design if each column and each row of A has exactly k 1̃'s, and each two rows of A have exactly λ 1̃'s in common.

Example. Let A ∈ P^{n×n} be an (n, k, λ) design, where λ < k < n. Then

n ≥ rank_s(A) ≥ max{min{n, nk/λ²}, N(n)}.   (8)

Since λ < k, the family of all rows of A is a ≤-antichain. Therefore rank_s(A) ≥ N(n). Combining this with

rank_s(A) ≥ min{n, nk/λ²},   (9)

obtained in [5], we get (8). Note that the inequality (8) is exact (while the inequality (9) is not) for E_{n×n}.

8

Bipartite intersection graphs Γk,p,q

Let k, p, q ∈ N and U = {1, . . . , k}. We enumerate the l-element subsets of U in the lexicographical ordering:

Wl(U) = {w_{k,l,1}, . . . , w_{k,l,b(k,l)}},  where b(k, l) = bin(k, l).

Define the bipartite graph Γk,p,q = Γ(V1 ∪ V2, E) by setting:

V1 = Wp(U), V2 = Wq(U);
{w_{k,p,i}, w_{k,q,j}} ∈ E iff w_{k,p,i} ∩ w_{k,q,j} ≠ ∅.

We have |V1| = bin(k, p) and |V2| = bin(k, q). Note that all vertices in each part of Γk,p,q have the same degree:

deg(v) = bin(k, q) − bin(k − p, q), v ∈ V1,
deg(v) = bin(k, p) − bin(k − q, p), v ∈ V2.

The graph Γ(A) = Γ(V1 ∪ V2, E) is associated to the matrix

A(k, p, q) = (a(k, p, q)_ij) ∈ P^{bin(k,p)×bin(k,q)},

where

a(k, p, q)_ij = 1̃ iff w_{k,p,i} ∩ w_{k,q,j} ≠ ∅.

If p + q ≤ k, then the sets of all rows and of all columns of A(k, p, q) are ≤-antichains. The rows of A(k, p, 1) are associated to the p-element subsets of U listed in the lexicographical ordering.

Let C(k, p) ∈ P^{k×k} be the circulant matrix obtained by cycling the row whose first p entries are 1̃ and whose last k − p entries are 0̃.

Theorem 11. Let p, k ∈ N, 1 ≤ p ≤ k. Then:

(i) k ≥ rank_s(A(k, p, 1)) ≥ N(bin(k, p)).
(ii) If k = N(bin(k, p)), then rank_s(A(k, p, 1)) = k.
(iii) If k ≥ 2p − 1, then rank_s(A(k, p, 1)) = k.

Proof. (i) The set of all rows of A(k, p, 1) is a ≤-antichain. (iii) The circulant C(k, p) is a submatrix of A(k, p, 1). Therefore k ≥ rank_s(A(k, p, 1)) ≥ rank_s(C(k, p)). From [5], if k ≥ 2p − 1, then rank_s(C(k, p)) = k. ⊓⊔


Example. Consider the following matrix:

             1̃ 1̃ 0̃ 0̃
             1̃ 0̃ 1̃ 0̃
A(4, 2, 1) = 1̃ 0̃ 0̃ 1̃
             0̃ 1̃ 1̃ 0̃
             0̃ 1̃ 0̃ 1̃
             0̃ 0̃ 1̃ 1̃

Since 4 = N(bin(4, 2)), we see that rank_s(A(4, 2, 1)) = 4.

It is easy to prove that

A(k, p, q) = A(k, p, 1) · A(k, 1, q),  A(k, p, 1) = (A(k, 1, p))^(t).   (10)

The matrix A(k, p, 1) is a block matrix. Indeed,

A(k, p, 1) = ( J_{bin(k−1,p−1)×1}  A(k − 1, p − 1, 1)
               0_{bin(k−1,p)×1}   A(k − 1, p, 1)     ).

Combining this with (10), we get that A(k, p, q) is the following block matrix:

A(k, p, q) = ( J_{bin(k−1,p−1)×bin(k−1,q−1)}  A(k − 1, p − 1, q)
               A(k − 1, p, q − 1)             A(k − 1, p, q)     ).

Example. The graph Γ5,2,2 has the parts V1 = V2 = {12, 13, 14, 15, 23, 24, 25, 34, 35, 45}.

[Figure: the graph Γ5,2,2.]

Therefore A(5, 2, 2) = A(5, 2, 1) · A(5, 1, 2), that is,

             1̃ 1̃ 1̃ 1̃ 1̃ 1̃ 1̃ 0̃ 0̃ 0̃
             1̃ 1̃ 1̃ 1̃ 1̃ 0̃ 0̃ 1̃ 1̃ 0̃
             1̃ 1̃ 1̃ 1̃ 0̃ 1̃ 0̃ 1̃ 0̃ 1̃
             1̃ 1̃ 1̃ 1̃ 0̃ 0̃ 1̃ 0̃ 1̃ 1̃
A(5, 2, 2) = 1̃ 1̃ 0̃ 0̃ 1̃ 1̃ 1̃ 1̃ 1̃ 0̃
             1̃ 0̃ 1̃ 0̃ 1̃ 1̃ 1̃ 1̃ 0̃ 1̃
             1̃ 0̃ 0̃ 1̃ 1̃ 1̃ 1̃ 0̃ 1̃ 1̃
             0̃ 1̃ 1̃ 0̃ 1̃ 1̃ 0̃ 1̃ 1̃ 1̃
             0̃ 1̃ 0̃ 1̃ 1̃ 0̃ 1̃ 1̃ 1̃ 1̃
             0̃ 0̃ 1̃ 1̃ 0̃ 1̃ 1̃ 1̃ 1̃ 1̃

with the factors

             1̃ 1̃ 0̃ 0̃ 0̃
             1̃ 0̃ 1̃ 0̃ 0̃
             1̃ 0̃ 0̃ 1̃ 0̃
             1̃ 0̃ 0̃ 0̃ 1̃                 1̃ 1̃ 1̃ 1̃ 0̃ 0̃ 0̃ 0̃ 0̃ 0̃
A(5, 2, 1) = 0̃ 1̃ 1̃ 0̃ 0̃ , A(5, 1, 2) = 1̃ 0̃ 0̃ 0̃ 1̃ 1̃ 1̃ 0̃ 0̃ 0̃
             0̃ 1̃ 0̃ 1̃ 0̃                 0̃ 1̃ 0̃ 0̃ 1̃ 0̃ 0̃ 1̃ 1̃ 0̃
             0̃ 1̃ 0̃ 0̃ 1̃                 0̃ 0̃ 1̃ 0̃ 0̃ 1̃ 0̃ 1̃ 0̃ 1̃
             0̃ 0̃ 1̃ 1̃ 0̃                 0̃ 0̃ 0̃ 1̃ 0̃ 0̃ 1̃ 0̃ 1̃ 1̃
             0̃ 0̃ 1̃ 0̃ 1̃
             0̃ 0̃ 0̃ 1̃ 1̃

Now we obtain the following properties of the Schein rank of A(k, p, q).

Theorem 12. Let k, p, q ∈ N, 1 ≤ p, q ≤ k. Then:

(i) rank_s(A(k, p, q)) ≤ min{rank_s(A(k, 1, q)), rank_s(A(k, p, 1))} ≤ k.
(ii) If p + q ≤ k, then rank_s(A(k, p, q)) ≥ max{N(bin(k, p)), N(bin(k, q))}.

Proof. The inequality (i) follows from (10). (ii) The families of all rows and of all columns of A(k, p, q) are ≤-antichains. This completes the proof. ⊓⊔

The following is an immediate consequence of Theorem 11 and Corollary 8.

Corollary 9. Let k, p ∈ N, 1 ≤ p ≤ k. Then:

(i) If p ≤ k/2 and k = N(bin(k, p)), then rank_s(A(k, p, p)) = k.
(ii) rank_s(A(2p, p, p)) = 2p.
(iii) rank_s(A(2p + 1, p, p + 1)) = 2p + 1.

Example. If k = 4, 5, 6, 7, then k = N(bin(k, 2)). Therefore rank_s(A(k, 2, 2)) = k for k = 4, 5, 6, 7. If k = 6, 7, 8, 9, then k = N(bin(k, 3)). Therefore rank_s(A(k, 3, 3)) = k for k = 6, 7, 8, 9.

The following corollary is an application of Theorem 11.

Corollary 10. Let k, p ∈ N, 1 ≤ p < k. Then

rank_s ( A(k, p, p)              E_{bin(k,p)×bin(k,p)}
         E_{bin(k,p)×bin(k,p)}   A(k, k − p, k − p)   ) = k.

Proof. Consider the product of block matrices:

( A(k, p, 1)     ) ( A(k, 1, p)  A(k, 1, k − p) ) = ( A(k, p, p)              E_{bin(k,p)×bin(k,p)}
( A(k, k − p, 1) )                                    E_{bin(k,p)×bin(k,p)}   A(k, k − p, k − p)   ).   (11)

Taking into account (10), we obtain

k ≥ rank_s ( A(k, p, p)              E_{bin(k,p)×bin(k,p)}
             E_{bin(k,p)×bin(k,p)}   A(k, k − p, k − p)   ) ≥ rank_s(A(k, p, p)) = k. ⊓⊔

In particular, for p = 1 we have

rank_s ( E_{k×k}  E_{k×k}
         E_{k×k}  J_{k×k} ) = k.

9

The Schein rank of En×n

The following exercise is due to Kim [4, p. 63, Exercise 24].

Exercise. Prove that the Schein rank of the matrix E_{n×n} is k if n = bin(k, ⌊k/2⌋).

The Schein ranks of all square matrices with 0̃ on the main diagonal and 1̃ elsewhere are determined in [8]. From Theorem 9 and Sperner's theorem, we get the following result.

Theorem 13.

The Schein rank of E_{n×n} is equal to N(n).

Proof. The matrix E = E_{n×n} is associated to the bipartite graph Γ(E) = Γ(V1 ∪ V2, E). We have:

V1 = {1, . . . , n}, V2 = {1', 2', . . . , n'},

and {i, j'} is an edge of Γ(E) whenever i ≠ j.

We now calculate nintbp(Γ(E)). Let nintbp(Γ(E)) = m and let f be a U-coding function for Γ(E), where |U| = m. Denote:

f(i) = ai, f(i') = bi, i = 1, . . . , n.

Consider the following sets:

g(i) = ai, g(i') = āi = U − ai, i = 1, . . . , n.

It is easy to prove that g : V1 ∪ V2 → 2^U is a U-coding function for Γ(E). In particular, ai ∩ āj ≠ ∅ for all i ≠ j. If ai ⊆ aj for some i ≠ j, then āj ⊆ āi and

ai ∩ āj ⊆ ai ∩ āi = ∅,

which is a contradiction. Therefore the family {a1, a2, . . . , an} is a ⊆-antichain in Bul(U). According to Sperner's theorem, n ≤ bin(m, ⌊m/2⌋), and hence m ≥ N(n).

We now prove that there exists a U-coding function for Γ(E) with |U| = k = N(n). By Sperner's theorem, the size of a maximal ⊆-antichain in Bul(k) equals bin(k, ⌊k/2⌋). Let {a1, a2, . . . , an} be an n-element ⊆-antichain in Bul(U) such that {a1, a2, . . . , an} is contained in a maximal ⊆-antichain and |ai| = ⌊k/2⌋ for all i = 1, . . . , n. Then

|āi| = ⌈k/2⌉ for all i = 1, . . . , n.

Denote:

f(i) = ai, f(i') = āi = U − ai, i = 1, . . . , n.

Suppose ai ∩ āj = ∅ for some i, j. We have

|ai| + |āj| = ⌊k/2⌋ + ⌈k/2⌉ = k,  ai ∪ āj = U.

Therefore ai = aj and i = j. Thus the equality f(i) ∩ f(j') = ai ∩ āj = ∅ is equivalent to i = j. We have proved that f : V1 ∪ V2 → 2^U is a U-coding function for Γ(E). ⊓⊔
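The coding function built in the proof translates directly into a Boolean factorization E_{n×n} = XY, as in (4) and (5). The following sketch is our own illustration (the function name is ours): it codes row i by a ⌊k/2⌋-subset a_i and column j by the complement of a_j, then verifies the Boolean product.

```python
from itertools import combinations

def E_factorization(k):
    """Factorize E_{n x n} (0 on the diagonal, 1 elsewhere), n = bin(k, floor(k/2)),
    as a Boolean product X * Y: row i is coded by a floor(k/2)-subset a_i of
    U = {0, ..., k-1}, column j by the complement of a_j (proof of Theorem 13)."""
    U = range(k)
    subsets = list(combinations(U, k // 2))              # lexicographic floor(k/2)-subsets
    n = len(subsets)
    X = [[int(u in a) for u in U] for a in subsets]      # x_{iu} = 1 iff u in a_i
    Y = [[int(u not in a) for a in subsets] for u in U]  # y_{uj} = 1 iff u in comp(a_j)
    return X, Y, n

X, Y, n = E_factorization(4)                             # n = bin(4, 2) = 6
prod = [[int(any(X[i][u] and Y[u][j] for u in range(4)))
         for j in range(n)] for i in range(n)]
assert prod == [[int(i != j) for j in range(n)] for i in range(n)]   # E_{6 x 6}
```

The product entry (i, j) is 1̃ iff a_i meets the complement of a_j, i.e. iff a_i ⊄ a_j; for equal-size subsets this happens exactly when i ≠ j, which is the pattern of E.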

Corollary 11. Let n = bin(k, ⌊k/2⌋). The following statements are valid.

(i) rank_s(E_{n×n}) = k.
(ii) If E_{n×n} = XY, where X ∈ P^{n×N(n)}, Y ∈ P^{N(n)×n}, then

X = πA(k, ⌊k/2⌋, 1),  Y = A(k, 1, ⌈k/2⌉) π^(t),   (12)

or

X = πA(k, ⌈k/2⌉, 1),  Y = A(k, 1, ⌊k/2⌋) π^(t),   (13)

where π ∈ P^{bin(k,⌊k/2⌋)×bin(k,⌊k/2⌋)} is a permutation matrix.

Proof. Using Theorem 13 and the properties of the numbers N(n), we get

rank_s(E_{n×n}) = N(bin(k, ⌊k/2⌋)) = k.

By Sperner's theorem, see [1], there exist only two ⊆-antichains of maximal size in Bul(k), namely the family of all ⌊k/2⌋-element subsets and the family of all ⌈k/2⌉-element subsets. From the proof of Theorem 13 we get (ii). ⊓⊔

If k is even, then (12) coincides with (13). If k is odd, then (12) does not coincide with (13).

Example. The matrix E_{6×6} is the product of two matrices:

         0̃ 1̃ 1̃ 1̃ 1̃ 1̃     1̃ 1̃ 0̃ 0̃
         1̃ 0̃ 1̃ 1̃ 1̃ 1̃     1̃ 0̃ 1̃ 0̃
E_{6×6} = 1̃ 1̃ 0̃ 1̃ 1̃ 1̃  =  1̃ 0̃ 0̃ 1̃  ·  0̃ 0̃ 0̃ 1̃ 1̃ 1̃
         1̃ 1̃ 1̃ 0̃ 1̃ 1̃     0̃ 1̃ 1̃ 0̃     0̃ 1̃ 1̃ 0̃ 0̃ 1̃
         1̃ 1̃ 1̃ 1̃ 0̃ 1̃     0̃ 1̃ 0̃ 1̃     1̃ 0̃ 1̃ 0̃ 1̃ 0̃
         1̃ 1̃ 1̃ 1̃ 1̃ 0̃     0̃ 0̃ 1̃ 1̃     1̃ 1̃ 0̃ 1̃ 0̃ 0̃

Example. Let B = B(n) be the n × n matrix with 0̃ on the main and back diagonals and 1̃ elsewhere. In particular, consider

       0̃ 1̃ 1̃ 1̃ 0̃            0̃ 1̃ 1̃ 1̃ 1̃ 0̃
       1̃ 0̃ 1̃ 0̃ 1̃            1̃ 0̃ 1̃ 1̃ 0̃ 1̃
B(5) = 1̃ 1̃ 0̃ 1̃ 1̃ ,   B(6) = 1̃ 1̃ 0̃ 0̃ 1̃ 1̃ .
       1̃ 0̃ 1̃ 0̃ 1̃            1̃ 1̃ 0̃ 0̃ 1̃ 1̃
       0̃ 1̃ 1̃ 1̃ 0̃            1̃ 0̃ 1̃ 1̃ 0̃ 1̃
                              0̃ 1̃ 1̃ 1̃ 1̃ 0̃

We have

B_(r) = B_(n−r+1),  B^(r) = B^(n−r+1),  r = 1, . . . , n.   (14)

By removing the rows with numbers ⌈n/2⌉ + 1, ⌈n/2⌉ + 2, . . . , n from B(n), we get a matrix X. From (14), we have rank_s(B(n)) = rank_s(X). By removing the columns with numbers ⌈n/2⌉ + 1, ⌈n/2⌉ + 2, . . . , n from X, we get E_{k×k}, where k = ⌈n/2⌉. From (14), we have rank_s(X) = rank_s(E_{k×k}). Therefore rank_s(B(n)) = N(⌈n/2⌉).

Example. Let C(n) be the n × n matrix with 1̃ on the main and back diagonals and 0̃ elsewhere. We have rank_s(C(n)) = ⌈n/2⌉.

Acknowledgments

This research was conducted in accordance with the Thematic plan of the Russian Federal Educational Agency, theme №1.03.07.

References

1. M. Aigner, Combinatorial Theory, Grundlehren Math. Wiss. 234, Springer-Verlag, Berlin, 1979.
2. J. Orlin, Contentment in graph theory: Covering graphs with cliques, K. Nederlandse Akad. van Wetenschappen Proc. Ser. A, 80 (1977), pp. 406-424.
3. Ki Hang Kim and F. W. Roush, Generalized fuzzy matrices, Fuzzy Sets and Systems, 4 (1980), pp. 293-315.
4. Ki Hang Kim, Boolean matrix theory and applications, Marcel Dekker, New York and Basel, 1982.
5. D. A. Gregory and N. J. Pullman, Semiring rank: Boolean rank and nonnegative rank factorization, Journal of Combinatorics, Information & System Sciences, v. 8, No. 3 (1983), pp. 223-233.
6. Di Nola and S. Sessa, On the Schein rank of matrices over linear lattices, Linear Algebra Appl., 118 (1989), pp. 155-158.
7. Di Nola and S. Sessa, Determining the Schein rank of matrices over linear lattices and finite relational equations, The Journal of Fuzzy Mathematics, vol. 1, No. 1 (1993), pp. 33-38.
8. D. de Caen, D. A. Gregory and N. J. Pullman, The Boolean rank of zero-one matrices, Proc. Third Caribbean Conference on Combinatorics, Graph Theory, and Computing, Barbados, 1981, pp. 169-173.

Lattices of matrix rows and matrix columns. Lattices of invariant column eigenvectors

Valentina Marenich⋆

Murmansk State Pedagogical University
[email protected]

Abstract. We consider matrices over a Brouwerian lattice. The linear span of the columns of a matrix A forms a semilattice; we call it the column semilattice of A. The questions are: when is the column semilattice a lattice, when is it a distributive lattice, and what formulas can be obtained for the meet and the join operations? We prove that for any lattice matrix A the column semilattice is a lattice, and we obtain formulas for the meet and the join operations. If A is an idempotent or A is a regular matrix, then the column semilattice is a distributive lattice. We also consider invariant eigenvectors of a square matrix A over a Brouwerian lattice. It is proved that all A-invariant eigenvectors form a distributive lattice, and simple formulas for the meet and the join operations are obtained.

Keywords: lattice matrix, lattices of columns, invariant eigenvectors of lattice matrices.

1

Introduction

In Section 2 we recall some definitions: lattice matrices, column vectors over a lattice, operations on lattice matrices, Brouwerian and Boolean lattices, and systems of linear equations over a lattice. We also recall the solvability criterion for a system of linear equations over a Brouwerian lattice and some of its corollaries, which are needed for the sequel (for more details see [1]).

In Section 3 we define the column semilattice (Column(A), ≤), which is the linear span of the columns of a matrix A. Similarly, a row semilattice can be defined. The questions are: when is (Column(A), ≤) a lattice, when is (Column(A), ≤) a distributive lattice, and what formulas can be obtained for the meet and the join operations? In 1962, K. A. Zarezky proved that for a square matrix A over the two-element Boolean lattice, the column semilattice is a lattice whenever A is a regular matrix. We consider some cases when the column semilattice is a lattice and get formulas for the meet ∧̃ and the join ∨̃ operations. Note that similar results can be obtained for a row semilattice. The main result of this section is the following. For a regular matrix A over a Brouwerian lattice:

1. the formula for the meet operation ∧̃ in the lattice (Column(A), ≤) is

u ∧̃ v = C(u ∧ v) for all u, v ∈ Column(A),

where C is an idempotent such that Column(A) = Column(C);
2. (Column(A), ≤) is a distributive lattice.

In Section 4 we recall the definition of invariant column eigenvectors, which is due to L. A. Skornyakov, see [6]. The set of all invariant column eigenvectors forms a subspace. We prove that for any m × m matrix A over a distributive lattice:

1. the subspace of all invariant column eigenvectors coincides with Column((A + A²)^k), where k ≥ m;
2. the matrix (A + A²)^k is an idempotent.

In Section 5 we consider a square matrix A and A-invariant eigenvectors over a Brouwerian lattice. From the previous results it follows that all A-invariant eigenvectors form a distributive lattice. Simple formulas for the meet and the join operations are also obtained.

⋆ This research was conducted in accordance with the Thematic plan of the Russian Federal Educational Agency, theme №1.03.07.

2

Preliminaries

The following notation will be used throughout. Denote by (P, ∧, ∨, ≤) a lattice.

2.1

Lattice matrices and column vectors

Let P^{m×n} be the set of all m × n matrices over P and A = ‖aij‖ ∈ P^{m×n}. We define the following matrix operations:

– for any matrices A, B ∈ P^{m×n},

A + B = ‖aij ∨ bij‖;

– for any matrices A ∈ P^{m×n}, B ∈ P^{n×k},

AB = ‖ ∨_{r=1}^{n} (air ∧ brj) ‖ ∈ P^{m×k}.

A square lattice matrix A ∈ P^{m×m} is an idempotent if A² = A. The transpose of A is defined by analogy with linear algebra and is denoted by A^(t).
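Over a finite chain (a simple distributive lattice with join = max and meet = min), the matrix operations above can be sketched as follows. This is our own illustration, not from the paper, and the function names are ours.

```python
def lat_add(A, B):
    """Entrywise join: (A + B)_ij = a_ij v b_ij."""
    return [[max(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def lat_mul(A, B):
    """Lattice matrix product: (AB)_ij = join over r of (a_ir ^ b_rj)."""
    n = len(B)
    return [[max(min(A[i][r], B[r][j]) for r in range(n))
             for j in range(len(B[0]))] for i in range(len(A))]

# Over the chain 0 < 1 < 2 < 3, this matrix is an idempotent: A * A = A.
A = [[3, 1], [1, 3]]
assert lat_mul(A, A) == A
```

Replacing max/min on a chain by union/intersection of sets gives the analogous operations over the Boolean lattice Bul(U).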


Any element (p1, . . . , pm)^t in P^{m×1} is called a column vector. We define a partial order on P^{m×1}:

(p1, . . . , pm)^t ≤ (p1', . . . , pm')^t ⇔ p1 ≤ p1', . . . , pm ≤ pm',

and the following operations: for any (p1, . . . , pm)^t, (p1', . . . , pm')^t ∈ P^{m×1} and λ ∈ P,

(p1, . . . , pm)^t + (p1', . . . , pm')^t = (p1 ∨ p1', . . . , pm ∨ pm')^t;   (1)
λ(p1, . . . , pm)^t = (λ ∧ p1, . . . , λ ∧ pm)^t.   (2)

With these notations we define the linear span of column vectors (by analogy with linear algebra). Any set S ⊆ P^{m×1} closed under the operations (1) and (2) is called a subspace.

The partially ordered set (P^{m×1}, ≤) is a lattice with the meet ∧ and join ∨ operations defined as follows: for any (p1, . . . , pm)^t, (p1', . . . , pm')^t ∈ P^{m×1},

(p1, . . . , pm)^t ∨ (p1', . . . , pm')^t = (p1 ∨ p1', . . . , pm ∨ pm')^t,
(p1, . . . , pm)^t ∧ (p1', . . . , pm')^t = (p1 ∧ p1', . . . , pm ∧ pm')^t.

Recall that any partially ordered set is called, more simply, a poset.

2.2

Brouwerian lattices

Let us recall the definition of Brouwerian lattices. If for given elements a, b ∈ P the greatest solution of the inequality a ∧ x ≤ b exists, then it is denoted by b/a and is called the relative pseudocomplement of a in b. If b/a exists for all a, b ∈ P, then (P, ∧, ∨, ≤) is called a Brouwerian lattice. Note that:

– any Brouwerian lattice has a greatest element, denoted by 1̂;
– any Brouwerian lattice is a distributive lattice;
– any finite distributive lattice is a Brouwerian lattice.

Let (P, ∧, ∨, ≤) be a Brouwerian lattice, A ∈ P^{m×n} a matrix, and c = (c1, . . . , cm)^t ∈ P^{m×1} a column vector. Define the vector

c/A = ( ∧_{i=1}^{m} ci/ai1, . . . , ∧_{i=1}^{m} ci/ain )^t ∈ P^{n×1}.

2.3

Boolean lattices

Let (P, ∧, ∨, ≤) be a distributive lattice with the least element 0̂ and the greatest element 1̂. If for any a ∈ P there exists ā ∈ P such that a ∨ ā = 1̂ and a ∧ ā = 0̂, then (P, ∧, ∨, ≤) is called a Boolean lattice.

Any Boolean lattice is a Brouwerian lattice, where b/a = ā ∨ b. Denote by Ā the matrix Ā = ‖āij‖. Let U be a finite set and 2^U the collection of all subsets of U. Denote by Bul(U) = (2^U, ⊆) the poset of all subsets of U partially ordered by inclusion (we call it a Boolean algebra). Let Bul(k) be the Boolean algebra of all subsets of a finite k-element set. It is obvious that Bul(U) (Bul(k)) is a Boolean lattice.

2.4

Systems of linear equations over Brouwerian lattices

Before continuing, we require the following results, which are known from [1]. Let A ∈ P^{m×n}, c ∈ P^{m×1}. Define a system of linear equations

Ax = c,   (3)

and a system of linear inequations

Ax ≤ c.   (4)

Theorem 1. Let (P, ∧, ∨, ≤) be a Brouwerian lattice. Then:

(i) x = c/A is the greatest solution of the system of inequations (4).
(ii) System (3) is solvable whenever x = c/A is a solution of (3). If System (3) is solvable, then x = c/A is its greatest solution.

Corollary 1. Let (P, ∧, ∨, ≤) be a Brouwerian lattice. Then x = A·(c/A) is the greatest vector in the set {Ax | Ax ≤ c, x ∈ P^{n×1}}.

Theorem 2. Let (P, ∧, ∨, ≤) be a Boolean lattice. Then the following conditions are equivalent:

(i) System (3) is solvable.
(ii) The greatest solution of System (3) is

x = c/A = \overline{A^{(t)} c̄}.

The solvability of systems of linear equations over Boolean lattices was studied in detail by Rudeanu in [2].

Corollary 2.

Let (P, ∧, ∨, ≤) be a Boolean lattice. Then

x = A·(c/A) = A · \overline{A^{(t)} c̄}

is the greatest vector in the set {Ax | Ax ≤ c, x ∈ P^{n×1}}.
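Over the Boolean lattice of subsets of a finite set, encoded as bitmasks, the residual c/A from Theorem 1 reduces to the componentwise formula (c/A)_j = meet over i of (complement(a_ij) join c_i). The following sketch is our own illustration (names are ours); maximality is verified by brute force on a two-element universe.

```python
from itertools import product

FULL = 0b11                     # Bul(2): subsets of a 2-element set, as bitmasks

def residual(A, c):
    """Greatest solution of Ax <= c: x_j = AND_i ( complement(a_ij) OR c_i )."""
    m, n = len(A), len(A[0])
    x = []
    for j in range(n):
        v = FULL
        for i in range(m):
            v &= (~A[i][j] & FULL) | c[i]
        x.append(v)
    return x

def apply(A, x):
    """(Ax)_i = OR_j (a_ij AND x_j), the Boolean matrix-vector product."""
    out = []
    for row in A:
        v = 0
        for a, xj in zip(row, x):
            v |= a & xj
        out.append(v)
    return out

A = [[0b01, 0b11], [0b10, 0b01]]
c = [0b01, 0b11]
x = residual(A, c)
assert all(v | ci == ci for v, ci in zip(apply(A, x), c))        # Ax <= c
# x is the greatest solution: every y with Ay <= c satisfies y <= x.
for y in product(range(4), repeat=2):
    if all(v | ci == ci for v, ci in zip(apply(A, list(y)), c)):
        assert all(yi | xi == xi for yi, xi in zip(y, x))
```

The bitmask test `u | v == v` encodes the subset relation u ⊆ v, i.e. u ≤ v in Bul(U).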

3

Semilattices and lattices of matrix columns and matrix rows

Let A = ‖aij‖ ∈ P^{m×n} be a matrix and A^(j) = (a1j, . . . , amj)^t its j-th column. The linear span of the columns of A we denote by Column(A). If u ∈ Column(A), then u = Ax for some column vector x ∈ P^{n×1}. Define the poset (Column(A), ≤) with respect to the partial order ≤ induced by the lattice (P^{m×1}, ≤).

Let (P, ∧, ∨, ≤) be a distributive lattice and A ∈ P^{m×n}. Then (Column(A), ≤) is an upper semilattice with the join operation ∨̃ given by

(p1, . . . , pm)^t ∨̃ (p1', . . . , pm')^t = (p1, . . . , pm)^t + (p1', . . . , pm')^t = (p1 ∨ p1', . . . , pm ∨ pm')^t

for any (p1, . . . , pm)^t, (p1', . . . , pm')^t ∈ Column(A). We call (Column(A), ≤) a column semilattice. Similarly, a row semilattice (Row(A), ≤) can be defined.

The questions are: when is (Column(A), ≤) a lattice, when is (Column(A), ≤) a distributive lattice, and what formulas can be obtained for the meet and the join operations? In 1962, K. A. Zarezky obtained the following result.

Theorem 3 (Zarezky's criterion). Let P = {0̂, 1̂} be the two-element Boolean lattice and A ∈ P^{m×m} a square matrix. Then (Column(A), ≤) is a distributive lattice whenever A is a regular matrix.

Recall that a square matrix A ∈ P^{m×m} is called a regular matrix if there exists B ∈ P^{m×m} such that ABA = A.
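Over the two-element Boolean lattice, regularity can be checked by brute force for small m. The following sketch (our own helper names; the sample matrix is an arbitrary choice) confirms that an idempotent Boolean matrix is regular, with B = A itself serving as a witness:

```python
from itertools import product

def bool_mul(X, Y):
    # Boolean matrix product over the lattice {0, 1}
    n = len(Y)
    return tuple(tuple(int(any(X[i][k] and Y[k][j] for k in range(n)))
                       for j in range(len(Y[0]))) for i in range(len(X)))

def is_regular(A):
    # search all m x m Boolean matrices B for a witness of ABA = A;
    # feasible only for small m (2^(m*m) candidates)
    m = len(A)
    for bits in product((0, 1), repeat=m * m):
        B = tuple(tuple(bits[i * m + j] for j in range(m)) for i in range(m))
        if bool_mul(bool_mul(A, B), A) == A:
            return True
    return False

A = ((1, 1, 0),
     (0, 1, 0),
     (0, 0, 0))
assert bool_mul(A, A) == A   # A is an idempotent
assert is_regular(A)         # hence regular (B = A works)
```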

It is known that A is a regular matrix if and only if there exists an idempotent C such that

Column(A) = Column(C),   (5)

see [3]. In the following theorem we consider some cases when the column semilattice is a lattice and obtain formulas for the meet ∧̃ and join ∨̃ operations.

Theorem 4. Let (P, ∧, ∨, ≤) be a lattice and A ∈ P^{m×n}.

(i) If (P, ∧, ∨, ≤) is a Brouwerian lattice, then (Column(A), ≤) is a lattice, where the formulas for the meet ∧̃ and join ∨̃ operations are

u ∨̃ v = u + v,   u ∧̃ v = A · ((u ∧ v)/A) = A · ((u/A) ∧ (v/A)),

for all u, v ∈ Column(A).

Lattices of matrix rows and matrix columns

(ii) If (P, ∧, ∨, ≤) is a Boolean lattice, then (Column(A), ≤) is a lattice, where the formulas for the meet ∧̃ and join ∨̃ operations are

u ∨̃ v = u + v = u ∨ v,   u ∧̃ v = A · (A^t · (u′ + v′))′,

for all u, v ∈ Column(A).

Proof. (i) Let w̃ = u ∧̃ v. Then w̃ is the greatest vector in the set

{w = Ax, x ∈ P^{n×1} | w ≤ u, w ≤ v} = {w = Ax, x ∈ P^{n×1} | w ≤ u ∧ v}.

According to Corollary 1,

w̃ = A · ((u ∧ v)/A).

(ii) According to (i) and Corollary 2,

u ∧̃ v = A · ((u ∧ v)/A) = A · (A^t · (u ∧ v)′)′ = A · (A^t · (u′ + v′))′. ⊓⊔
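The Boolean meet formula can be verified mechanically over the two-element lattice: for a sample matrix we enumerate Column(A) and confirm that A · (A^t · (u ∧ v)′)′ is the greatest element of Column(A) lying below both u and v. The helper names and the matrix are ours, chosen only for illustration:

```python
from itertools import product

def bmv(A, x):
    # Boolean matrix-vector product over {0, 1}
    return tuple(int(any(a and b for a, b in zip(row, x))) for row in A)

def meet_in_column_lattice(A, u, v):
    # u meet v = A * complement(A^t * complement(u AND v)), Theorem 4 (ii)
    m, n = len(A), len(A[0])
    w = tuple(int(a and b) for a, b in zip(u, v))          # u AND v
    x = tuple(int(not any(A[i][j] and not w[i] for i in range(m)))
              for j in range(n))                           # (u AND v) / A
    return bmv(A, x)

A = ((1, 0, 1),
     (1, 1, 0),
     (0, 1, 1))
column = {bmv(A, x) for x in product((0, 1), repeat=3)}    # Column(A)
leq = lambda a, b: all(x <= y for x, y in zip(a, b))

for u in column:
    for v in column:
        w = meet_in_column_lattice(A, u, v)
        assert w in column and leq(w, u) and leq(w, v)     # lower bound in Column(A)
        # and every common lower bound in Column(A) lies below w:
        assert all(not (leq(z, u) and leq(z, v)) or leq(z, w) for z in column)
```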

Corollary 3. Let (P, ∧, ∨, ≤) be a finite distributive lattice. Then the column semilattice (Column(A), ≤) is a lattice, in which the meet and join operations can be calculated by the formulas from Theorem 4 (i) (every finite distributive lattice is Brouwerian).

For some column lattices, we can express the meet operation ∧̃ more simply.

Theorem 5. Let (P, ∧, ∨, ≤) be a Brouwerian lattice and let the matrix A ∈ P^{m×m} be an idempotent. Then:

(i) the formulas for the meet ∧̃ and join ∨̃ operations in the lattice (Column(A), ≤) are

u ∨̃ v = u + v,   u ∧̃ v = A(u ∧ v),

for all u, v ∈ Column(A);

(ii) (Column(A), ≤) is a distributive lattice.

Proof. (i) According to Theorem 4,

u ∧̃ v = A · ((u/A) ∧ (v/A)).

Note that u/A is a solution of Ax = u: according to Theorem 1, A · (u/A) = u. Note also that w = Aw for any idempotent matrix A ∈ P^{m×m} and any vector w ∈ Column(A); in particular, Au = u gives a_{ij} ∧ u_j ≤ u_i for all i, j, whence u ≤ u/A, and similarly v ≤ v/A. We therefore get

A(u ∧ v) ≤ A · ((u/A) ∧ (v/A)) = u ∧̃ v.

On the other hand, u ∧̃ v ∈ Column(A) and u ∧̃ v ≤ u ∧ v, so

A(u ∧ v) ≤ u ∧̃ v = A(u ∧̃ v) ≤ A(u ∧ v).

Therefore

u ∧̃ v = A(u ∧ v).

(ii) For any u, v, w ∈ Column(A),

w ∨̃ (u ∧̃ v) = w + A(u ∧ v) = Aw + A(u ∧ v) = A(w ∨ (u ∧ v)) = A((w ∨ u) ∧ (w ∨ v)) = (w ∨̃ u) ∧̃ (w ∨̃ v).

This completes the proof.

Note that results similar to Theorem 4, Corollary 3 and Theorem 5 can be obtained for a row semilattice (Row(A), ≤). The following statement is an analog of Zarezky's theorem.

Theorem 6. Let (P, ∧, ∨, ≤) be a Brouwerian lattice, A ∈ P^{m×m} a regular matrix, and C ∈ P^{m×m} an idempotent such that Column(A) = Column(C). Then:

(i) the formula for the meet operation ∧̃ in the lattice (Column(A), ≤) is

u ∧̃ v = C(u ∧ v),

for all u, v ∈ Column(A);

(ii) (Column(A), ≤) is a distributive lattice.

Proof. Recall that for any regular matrix A there always exists an idempotent C such that Column(A) = Column(C), see (5). The proof of (i) and (ii) follows from Theorem 5.

A result similar to statement (ii) was proved by K.A. Zarezky for semigroups of binary relations, see [4]. Kim and Roush obtained a similar result for the fuzzy lattice, see [5].

4 Subspaces of A-invariant column eigenvectors

Let (P, ∧, ∨, ≤) be a distributive lattice and A ∈ P^{m×m} a square matrix. The following definition of invariant column eigenvectors over a lattice is due to L.A. Skornyakov, see [6]. (We say "A-invariant column vectors" instead of "invariant column eigenvectors".)

A column vector u ∈ P^{m×1} is called A-invariant if Au = u. If u, v ∈ P^{m×1} are A-invariant column vectors and p ∈ P, then u + v and pv are A-invariant column vectors. Therefore the set of all A-invariant column vectors forms a subspace. Our purpose is to describe the subspace of all A-invariant column vectors. The following two lemmas are needed for the sequel.

Lemma 1. Let (P, ∧, ∨, ≤) be a distributive lattice and A ∈ P^{m×m} a square matrix. Then

A^m ≤ Σ_{r=m+1}^{2m} A^r ≤ Σ_{r>m} A^r ≤ Σ_{1≤r≤m} A^r.   (6)

These inequalities were proved by K. Chechlarova in [7].

Lemma 2. Let (P, ∧, ∨, ≤) be a distributive lattice, A ∈ P^{m×m} a square matrix and k ≥ m. Then

A^k ≤ A^{k+1} + A^{k+2} + ... + A^{k+m},   (7)

A^{2k+1} ≤ A^k + A^{k+1} + ... + A^{2k}.   (8)

Proof. By (6), we get

A^m ≤ A^{m+1} + A^{m+2} + ... + A^{2m}.

Multiplying both sides by A^{k−m}, we obtain (7).

Now let us prove (8). First we prove (8) for a Boolean matrix A ∈ {0̂, 1̂}^{m×m}. For any column vector ξ ∈ {0̂, 1̂}^{m×1},

A^k ξ ≤ (A^k + A^{k+1})ξ ≤ ... ≤ (A^k + A^{k+1} + ... + A^{2k})ξ.

If all these inequalities are strict, then there exists a chain of length greater than m in the lattice ({0̂, 1̂}^m, ≤). This is a contradiction, because the length of the Boolean algebra ({0̂, 1̂}^m, ≤) ≅ Bul(m) is equal to m. Suppose therefore that

(A^k + A^{k+1} + ... + A^{k+s})ξ = (A^k + A^{k+1} + ... + A^{k+s+1})ξ

for some s, where 1 ≤ s ≤ k − 1. Then

A^{k+s+1}ξ ≤ (A^k + A^{k+1} + ... + A^{k+s})ξ.

We prove by induction on r ≥ s + 1 the inequalities

A^{k+r}ξ ≤ (A^k + A^{k+1} + ... + A^{k+s})ξ.   (9)

For r = s + 1 the inequality is already proved. We assume that the inequality holds for r and prove it for r + 1. Indeed,

A^{k+r+1}ξ ≤ (A^{k+1} + ... + A^{k+s+1})ξ ≤ (A^k + A^{k+1} + ... + A^{k+s+1})ξ = (A^k + ... + A^{k+s})ξ + A^{k+s+1}ξ = (A^k + A^{k+1} + ... + A^{k+s})ξ.


Thus inequalities (9) are valid. Setting r = k + 1 ≥ s + 1 in (9), we obtain

A^{2k+1}ξ ≤ (A^k + A^{k+1} + ... + A^{2k})ξ.

Since ξ is an arbitrary column vector, inequality (8) is valid for any Boolean matrix A ∈ {0̂, 1̂}^{m×m}. Now suppose that A is a lattice matrix, A ∈ P^{m×m}. Using inequality (8) for Boolean {0̂, 1̂}-matrices and the decomposition of A into the linear span of sections (constituents), we see that (8) is valid over the lattice (P, ∧, ∨, ≤). (The linear span of sections is defined in [5].)

Lemma 2 gives us the following.

Theorem 7. Let (P, ∧, ∨, ≤) be a distributive lattice, k ≥ m and A ∈ P^{m×m}. Then:

(i) (A + A²)^k = (A + A²)^k · A;

(ii) (A + A²)^k is an idempotent matrix.

Proof. (i) It follows from (7) that A^k ≤ A^{k+1} + ... + A^{2k}. Combining this inequality with (8), we get

(A + A²)^k = A^k + A^{k+1} + ... + A^{2k} = A^k + A^{k+1} + ... + A^{2k} + A^{2k+1} = A^{k+1} + ... + A^{2k} + A^{2k+1} = (A + A²)^k · A.

(ii) Using (i), we see that (A + A²)^k = (A + A²)^k A^s for any s ≥ 0. Therefore

((A + A²)^k)² = (A + A²)^k (A^k + A^{k+1} + ... + A^{2k}) = Σ_{s=k}^{2k} (A + A²)^k A^s = Σ_{s=k}^{2k} (A + A²)^k = (A + A²)^k,

and (A + A²)^k is an idempotent matrix.

A result similar to Theorem 7 was proved by K.H. Kim for the two-element Boolean lattice P = {0̂, 1̂}, see [3]. In the following lemma we describe invariant vectors of idempotent matrices.

Lemma 3. Let (P, ∧, ∨, ≤) be a distributive lattice, B ∈ P^{m×m} an idempotent and ξ ∈ P^{m×1} a column vector. Then ξ is a B-invariant vector if and only if ξ ∈ Column(B).

Proof. Suppose Bξ = ξ; then ξ ∈ Column(B).

Conversely, suppose ξ ∈ Column(B). Since B is an idempotent, we get B = B² and B^{(j)} = B · B^{(j)} for every column B^{(j)}, j = 1, ..., m. By the definition of Column(B),

ξ = β_1 B^{(1)} + ... + β_m B^{(m)}

for some β_1, ..., β_m ∈ P. Therefore

Bξ = B(β_1 B^{(1)} + ... + β_m B^{(m)}) = β_1 B · B^{(1)} + ... + β_m B · B^{(m)} = ξ. ⊓⊔

Now we can describe all invariant eigenvectors of m × m matrices over a lattice.

Theorem 8. Let (P, ∧, ∨, ≤) be a distributive lattice, A ∈ P^{m×m} and k ≥ m. Then the subspace of all A-invariant column vectors coincides with Column((A + A²)^k), and the matrix (A + A²)^k is an idempotent.

Proof. According to Theorem 7, (A + A²)^k is an idempotent and (A + A²)^k = (A + A²)^k · A.

First we prove that the conditions Aξ = ξ and (A + A²)^kξ = ξ are equivalent for any ξ ∈ P^{m×1}. If Aξ = ξ, then obviously (A + A²)^kξ = ξ. If (A + A²)^kξ = ξ, then Aξ = A(A + A²)^kξ = (A + A²)^kξ = ξ.

Since (A + A²)^k is an idempotent, Lemma 3 shows that (A + A²)^kξ = ξ is equivalent to ξ ∈ Column((A + A²)^k).

For the two-element Boolean lattice P = {0̂, 1̂}, Theorem 8 is a corollary of results obtained by T.S. Blyth, see [10].
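Theorems 7 and 8 can be checked directly over the two-element Boolean lattice. The sketch below (our helper names; the sample matrix is an arbitrary choice) raises A + A² to the power k = m, checks idempotency, and confirms that its column space consists exactly of the A-invariant vectors:

```python
from itertools import product

def bmul(X, Y):
    # Boolean matrix product over {0, 1}
    n = len(Y)
    return tuple(tuple(int(any(X[i][k] and Y[k][j] for k in range(n)))
                       for j in range(len(Y[0]))) for i in range(len(X)))

def badd(X, Y):
    # Boolean matrix sum (entrywise join)
    return tuple(tuple(int(a or b) for a, b in zip(rx, ry))
                 for rx, ry in zip(X, Y))

def bmv(A, x):
    return tuple(int(any(a and b for a, b in zip(row, x))) for row in A)

A = ((0, 1, 0),
     (0, 0, 1),
     (1, 0, 0))
m = len(A)
B = badd(A, bmul(A, A))    # A + A^2
P = B
for _ in range(m - 1):     # P = (A + A^2)^m; any k >= m works
    P = bmul(P, B)

assert bmul(P, P) == P     # Theorem 7 (ii): P is idempotent
invariant = {x for x in product((0, 1), repeat=m) if bmv(A, x) == x}
column_P = {bmv(P, x) for x in product((0, 1), repeat=m)}
assert invariant == column_P   # Theorem 8
print(sorted(invariant))  # [(0, 0, 0), (1, 1, 1)]
```

For this cyclic A the only invariant vectors are the constant ones, and the enumeration confirms they are exactly Column((A + A²)^m).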

5 Lattices of invariant column vectors

Let (P, ∧, ∨, ≤) be a Brouwerian lattice and A ∈ P^{m×m} a square matrix. From the previous results it follows that all A-invariant vectors form a distributive lattice; moreover, simple formulas for the meet and the join operations are obtained.

Theorem 9. Let (P, ∧, ∨, ≤) be a Brouwerian lattice, A ∈ P^{m×m} a square matrix, J_{m×1} = (1̂, ..., 1̂)^t ∈ P^{m×1} the universal column vector, and k ≥ m. Then:

(i) all A-invariant vectors form a lattice, which coincides with (Column((A + A²)^k), ∧̃, ∨̃, ≤);

(ii) the formulas for the meet ∧̃ and the join ∨̃ operations in the lattice of all A-invariant vectors are

u ∨̃ v = u + v = u ∨ v,   u ∧̃ v = (A + A²)^k (u ∧ v),

for all A-invariant column vectors u, v ∈ P^{m×1};

(iii) the lattice of all A-invariant vectors is distributive, with the greatest element A^m J_{m×1}. If (P, ∧, ∨, ≤) is a Brouwerian lattice with 0̂, then the lattice of all A-invariant vectors has the least element (0̂, ..., 0̂)^t ∈ P^{m×1}.

Proof. Statements (i) and (ii) are immediate consequences of Theorems 8, 4 and 5.

Let us prove (iii). First we show that A^m · J_{m×1} is an A-invariant vector. From the obvious inequality A J_{m×1} ≤ J_{m×1} it follows that

A^{r+1} J_{m×1} ≤ A^r J_{m×1}, r = 1, 2, ....   (10)

According to (6),

A^m J_{m×1} ≤ Σ_{r>m} A^r J_{m×1}.

Consider the right-hand side of this inequality. By (10), A^{m+1} J_{m×1} is the greatest summand, therefore

Σ_{r>m} A^r J_{m×1} = A^{m+1} J_{m×1}.

Applying (10) again, we get

A^m J_{m×1} ≤ Σ_{r>m} A^r J_{m×1} = A^{m+1} J_{m×1} ≤ A^m J_{m×1},

hence A^{m+1} J_{m×1} = A^m J_{m×1}, i.e. A^m J_{m×1} is A-invariant.

To conclude the proof, it remains to note that A^m · J_{m×1} is the greatest A-invariant vector. Indeed, if ξ is an A-invariant vector, then ξ = Aξ = ... = A^m ξ ≤ A^m J_{m×1}. ⊓⊔
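Part (iii) admits a quick machine check over the two-element Boolean lattice (an assumption for concreteness; the sample matrix is our own): compute A^m · J and confirm it is the greatest A-invariant vector.

```python
from itertools import product

def bmv(A, x):
    # Boolean matrix-vector product over {0, 1}
    return tuple(int(any(a and b for a, b in zip(row, x))) for row in A)

A = ((1, 1, 0),
     (0, 0, 1),
     (0, 0, 0))
m = len(A)
g = (1,) * m          # J, the all-ones (universal) column vector
for _ in range(m):
    g = bmv(A, g)     # g = A^m * J

assert bmv(A, g) == g                 # g is itself A-invariant
for x in product((0, 1), repeat=m):
    if bmv(A, x) == x:                # every A-invariant vector ...
        assert all(a <= b for a, b in zip(x, g))  # ... lies below g
print(g)  # (1, 0, 0)
```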

Acknowledgments. This research was conducted in accordance with the Thematic plan of the Russian Federal Educational Agency, theme №1.03.07.

References

1. E.E. Marenich, V.G. Kumarov, Inversion of matrices over a pseudocomplemented lattice, Journal of Mathematical Sciences, 144:2 (2007), pp. 3968-3979.
2. S. Rudeanu, Lattice Functions and Equations, Springer-Verlag, London, 2001.
3. K.H. Kim, Boolean Matrix Theory and Applications, Marcel Dekker, New York and Basel, 1982.
4. K.A. Zarezky, Regular elements in semigroups of binary relations, Uspekhi Mat. Nauk, 17:3 (1962), pp. 105-108.
5. K.H. Kim, F.W. Roush, Generalized fuzzy matrices, Fuzzy Sets and Systems, 4 (1980), pp. 293-315.
6. L.A. Skornyakov, Eigenvectors of a matrix over a distributive lattice, Vestnik Kievskogo Universiteta, 27 (1986), pp. 96-97.
7. K. Chechlarova, Powers of matrices over distributive lattices - a review, Fuzzy Sets and Systems, 138 (2003), pp. 627-641.
8. S. Kirkland, N.J. Pullman, Boolean spectral theory, Linear Algebra Appl., 175 (1992), pp. 177-190.
9. Y.-J. Tan, On the powers of matrices over distributive lattices, Linear Algebra Appl., 336 (2001), pp. 1-14.
10. T.S. Blyth, On eigenvectors of Boolean matrices, Proc. Royal Soc. Edinburgh Sect. A, 67 (1966), pp. 196-204.

Matrix algebras and their length

Olga V. Markova

Moscow State University, Mathematics and Mechanics Dept.
ov [email protected]

Abstract. Let F be a field and let A be a finite-dimensional F-algebra. We define the length of a finite generating set of this algebra as the smallest number k such that words of length not greater than k generate A as a vector space, and the length of the algebra is the maximum of the lengths of its generating sets. In this paper we study the connection between the length of an algebra and the lengths of its subalgebras. It turns out that the length of an algebra can be smaller than the length of its subalgebra. To investigate how different the length of an algebra and the length of its subalgebra can be, we evaluate the difference and the ratio of the lengths of an algebra and its subalgebra for several representative families of algebras. We also give examples of length computation for two- and three-block upper triangular matrix algebras.

Keywords: length; finite-dimensional associative algebras; matrix subalgebras; upper triangular matrices; block matrices.

1 Main Definitions and Notation

Let F be an arbitrary field and let A be a finite-dimensional associative algebra over F. Each finite-dimensional algebra is certainly finitely generated. Let S = {a_1, ..., a_k} be a finite generating set for A.

Notation 1. Let ⟨S⟩ denote the linear span, i.e. the set of all finite linear combinations with coefficients from F, of the set S.

Definition 1. The length of a word a_{i_1} ··· a_{i_t}, where a_{i_j} ∈ S, a_{i_j} ≠ 1, is t. If A is an algebra with 1, then 1 is said to be a word of elements from S of length 0.

Notation 2. Let S^i denote the set of all words in the alphabet a_1, ..., a_k of length less than or equal to i, i ≥ 0.

Notation 3. Let L_i(S) = ⟨S^i⟩ and let L(S) = ⋃_{i=0}^∞ L_i(S) be the linear span of all words in the alphabet a_1, ..., a_k. Note that L_0(S) = F if A is unitary, and L_0(S) = 0 otherwise.


Since S is a generating set for A, any element of A can be written as a finite linear combination of words in a_1, ..., a_k, i.e., A = L(S). The definition of S^i implies that L_{i+j}(S) = ⟨L_i(S)L_j(S)⟩ and

L_0(S) ⊆ L_1(S) ⊆ ··· ⊆ L_h(S) ⊆ ··· ⊆ L(S) = A.

Since A is finite-dimensional, there exists an integer h ≥ 0 such that L_h(S) = L_{h+1}(S).

Definition 2. A number l(S) is called the length of a finite generating set S provided it equals the smallest number h such that L_h(S) = L_{h+1}(S).

Note that if for some h ≥ 0 it holds that L_h(S) = L_{h+1}(S), then

L_{h+2}(S) = ⟨L_1(S)L_{h+1}(S)⟩ = ⟨L_1(S)L_h(S)⟩ = L_{h+1}(S),

and similarly L_i(S) = L_h(S) for all i ≥ h. Thus l(S) is defined correctly. Since S is a generating set for A, it follows that L_h(S) = L(S) = A. The following definition is crucial for this paper.

Definition 3. The length of the algebra A, denoted by l(A), is the maximum of the lengths of all its generating sets.

Definition 4. A word v ∈ L_j(S) is called reducible over S if there exists i < j such that v ∈ L_i(S) and L_i(S) ≠ L_j(S).

Notation 4. Let M_n(F) be the full matrix algebra of order n over F, T_n(F) the algebra of n × n upper triangular matrices over F, D_n(F) the algebra of n × n diagonal matrices over F, and N_n(F) the subalgebra of nilpotent matrices in T_n(F).

Notation 5. We denote by E the identity matrix and by E_{i,j} the matrix unit, i.e. the matrix with 1 in position (i, j) and 0 elsewhere.
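The definitions above translate directly into a computation: track dim L_h(S) as the rank of the vectorized words until it stabilizes. The sketch below is our own illustration (floating-point rank over small integer matrices; feasible only for small generating sets, since the number of words grows as k^h):

```python
import numpy as np

def length_of_generating_set(S, max_len=12):
    """l(S): the smallest h with L_h(S) = L_{h+1}(S), for a unital algebra.
    dim L_h(S) is computed as the rank of the stacked vectorized words."""
    n = S[0].shape[0]
    words = [np.eye(n)]                 # the single word of length 0
    vecs = [w.flatten() for w in words]
    prev_dim = 1                        # dim L_0 = dim <E> = 1
    for h in range(1, max_len + 1):
        words = [w @ a for w in words for a in S]   # all words of length h
        vecs += [w.flatten() for w in words]
        dim = np.linalg.matrix_rank(np.array(vecs))
        if dim == prev_dim:             # L_h(S) = L_{h-1}(S)
            return h - 1
        prev_dim = dim
    raise ValueError("max_len too small")

# a single generator of a commutative subalgebra of M_3(F):
C = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 1, 0]], dtype=float)
assert length_of_generating_set([C]) == 2   # powers E, C, C^2 and then C^3 = E

# the pair {E_{1,2}, E_{2,1}} generates M_2(F):
E12 = np.array([[0, 1], [0, 0]], dtype=float)
E21 = np.array([[0, 0], [1, 0]], dtype=float)
assert length_of_generating_set([E12, E21]) == 2
```

Both test cases agree with the theory developed below: a nonderogatory 3 × 3 generator has length n − 1 = 2, and the pair {E_{1,2}, E_{2,1}} reaches all of M_2(F) with words of length at most 2.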

2 Introduction

The problem of evaluating the length of the full matrix algebra in terms of its order was posed in 1984 by A. Paz in [4] and has not been solved yet. The case of 3 × 3 matrices was studied by Spencer and Rivlin [5], [6] in connection with possible applications in mechanics. Some known upper bounds for the length of the matrix algebra are not linear.

Theorem 6. [4, Theorem 1, Remark 1] Let F be an arbitrary field. Then l(M_n(F)) ≤ ⌈(n² + 2)/3⌉.

Theorem 7. [3, Corollary 3.2] Let F be an arbitrary field. Then l(M_n(F)) ≤ n·√(2n²/(n − 1) + 1/4) + n/2 − 2.

In [4] Paz also suggested a linear bound: it was conjectured there that l(M_n(F)) = 2n − 2.

In this paper we show that the length of a subalgebra A′ of an algebra A can be greater than l(A), and that for any natural number k the difference of the lengths can be l(A′) − l(A) = k (Theorem 9). Also we investigate the ratio between l(A′) and l(A). The question on the possible values of the length ratio remains open in general, but in Sections 3.1 and 3.2 we give some examples of length computation for two- and three-block upper triangular matrix algebras. Apart from their intrinsic interest, these examples give the following result: for any rational number r ∈ [1, 2] there exist an F-algebra A and its subalgebra A′ such that l(A′)/l(A) = r (Corollary 2). We note that there are still very few examples of algebras with exactly evaluated length. In this paper we give some new series of such examples: the algebras A_{n,m}, cf. Theorem 11, and A_{n1,n2,n3}, cf. Theorem 14. In addition, in Section 3.3 we give some examples of algebras A satisfying the inequality l(A) ≥ l(A′) for any subalgebra A′ ⊆ A.

3 On the lengths of an algebra and its subalgebras

Notice that, generally speaking, the length function, unlike the dimension function, can increase when passing from an algebra to its subalgebras. We first consider two types of transformations preserving the length of a generating set.

Proposition 1. Let F be an arbitrary field and let A be a finite-dimensional associative F-algebra. If S = {a_1, ..., a_k} is a generating set for A and C = {c_{ij}} ∈ M_k(F) is non-singular, then the set of coordinates of the vector

C · (a_1, ..., a_k)^t = (c_{11}a_1 + c_{12}a_2 + ... + c_{1k}a_k, ..., c_{k1}a_1 + c_{k2}a_2 + ... + c_{kk}a_k)^t,   (2)

i.e. the set

S_C = {c_{11}a_1 + c_{12}a_2 + ... + c_{1k}a_k, ..., c_{k1}a_1 + c_{k2}a_2 + ... + c_{kk}a_k},

is also a generating set for A, and l(S_C) = l(S).

Proof. Let us prove by induction on n that L_n(S) = L_n(S_C) holds for every n. Since any linear combination γ_1a_1 + ... + γ_ka_k lies in L_1(S), we have L_1(S_C) ⊆ L_1(S). The non-singularity of C provides that a_i ∈ L_1(S_C), i = 1, ..., k, i.e. L_1(S) ⊆ L_1(S_C). Hence L_1(S_C) = L_1(S). Let us take n > 1 and suppose that for n − 1 the equality holds. Then

L_n(S) = ⟨L_1(S)L_{n−1}(S)⟩ = ⟨L_1(S_C)L_{n−1}(S_C)⟩ = L_n(S_C).

Proposition 2. Let F be an arbitrary field and let A be a finite-dimensional associative unitary F-algebra. Let S = {a_1, ..., a_k} be a generating set for A such that 1_A ∉ ⟨a_1, ..., a_k⟩. Then S_1 = {a_1 + γ_1·1_A, ..., a_k + γ_k·1_A} is also a generating set for A, and l(S_1) = l(S).

Proof. The proof is analogous to that of Proposition 1, but simpler.

For further considerations we need the following class of matrices.

Definition 5. Let F be an arbitrary field. A matrix C ∈ M_n(F) is called nonderogatory provided dim_F(⟨E, C, C², ..., C^{n−1}⟩) = n.

Lemma 1. [8, Lemma 7.7] Let F be an arbitrary field and let A be a commutative subalgebra of M_n(F). If there exists a nonderogatory matrix A ∈ A, then A is the subalgebra generated by A, and l(A) = n − 1.
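Definition 5 is straightforward to test numerically. The sketch below (floating-point rank; the helper name is ours) checks the matrix E_{1,2} + E_{2,3} + E_{4,4} that appears in Example 1 below:

```python
import numpy as np

def is_nonderogatory(C):
    # Definition 5: dim span{E, C, C^2, ..., C^(n-1)} = n
    n = C.shape[0]
    powers = [np.linalg.matrix_power(C, i).flatten() for i in range(n)]
    return np.linalg.matrix_rank(np.array(powers)) == n

A = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 1]], dtype=float)   # E_{1,2} + E_{2,3} + E_{4,4}
assert is_nonderogatory(A)               # E, A, A^2, A^3 are linearly independent
assert not is_nonderogatory(np.eye(4))   # the identity is derogatory for n > 1
```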

Proposition 3. Let F be an arbitrary field and let A_4 ⊂ T_4(F) be the algebra generated by the matrices E, E_{4,4}, E_{1,2}, E_{1,3} and E_{2,3}. Then l(A_4) = 2.


Proof. The dimension of any subalgebra of M_4(F) generated by a single matrix does not exceed 4, but dim_F A_4 = 5. Hence for any generating set S = {A_1, ..., A_k} for A_4 it holds that k ≥ 2, and if k = 2, then E ∉ ⟨A_1, A_2⟩. If the generating set S contains 3 matrices A_1, A_2, A_3 such that E ∉ ⟨A_1, A_2, A_3⟩, then dim_F L_1(S) ≥ 4, and in this case dim_F L_2(S) = 5, that is, l(S) ≤ 2. Let us consider the case when S = {A, B}, E ∉ ⟨A, B⟩. It follows from Proposition 2 that the matrices A and B can be taken in the following form:

A = a_{12}E_{1,2} + a_{13}E_{1,3} + a_{23}E_{2,3} + a_{44}E_{4,4},   B = b_{12}E_{1,2} + b_{13}E_{1,3} + b_{23}E_{2,3} + b_{44}E_{4,4}.

Since S is a generating set, a_{44} ≠ 0 or b_{44} ≠ 0. Without loss of generality we assume that a_{44} ≠ 0; then by Proposition 1 we can take b_{44} = 0. We have

A² = a_{12}a_{23}E_{1,3} + a_{44}²E_{4,4},  AB = a_{12}b_{23}E_{1,3},  BA = a_{23}b_{12}E_{1,3},  B² = b_{12}b_{23}E_{1,3},  A³ = a_{44}³E_{4,4}.

All other products in A and B of length greater than or equal to 3 are equal to zero. Hence we obtain that for a generating set S the vectors (a_{12}, a_{23}) and (b_{12}, b_{23}) are always linearly independent. But in this case AB ≠ 0 or BA ≠ 0, that is, E_{1,3} ∈ L_2(S). Hence E_{4,4} = a_{44}^{−2}(A² − a_{12}a_{23}E_{1,3}) ∈ L_2(S), and E_{1,2}, E_{2,3} ∈ ⟨A, B, E_{1,3}, E_{4,4}⟩ ⊆ L_2(S). Consequently, L_2(S) = A_4 and l(S) = 2. That is, l(A_4) = 2.

Example 1. Let F be an arbitrary field and let A_4 ⊂ T_4(F) be generated by the matrices E, E_{4,4}, E_{1,2}, E_{1,3} and E_{2,3}. There exists a subalgebra A′ of A_4 generated by the nonderogatory matrix A = E_{1,2} + E_{2,3} + E_{4,4}, with l(A′) = 3 > 2 = l(A_4).

Corollary 1. For all n ≥ 4 and for any field F with more than n − 4 elements there exist subalgebras A′_n ⊂ A_n ⊂ M_n(F) such that l(A′_n) = n − 1 > l(A_n).

Proof. Let f_4, ..., f_n ∈ F be distinct nonzero elements. Consider the following subalgebras:

A_n = ⟨{E_{1,1} + E_{2,2} + E_{3,3}, E_{1,2}, E_{1,3}, E_{2,3}, E_{4,4}, ..., E_{n,n}}⟩,

A′_n = ⟨E_{1,2} + E_{1,3} + Σ_{i=4}^{n} f_i E_{i,i}⟩.

Then l(A′_n) = n − 1, since the matrix E_{1,2} + E_{1,3} + Σ_{i=4}^{n} f_i E_{i,i} is nonderogatory. It follows from [8, Theorem 4.5] that l(D_{n−4}(F)) = n − 5. Consequently, l(A_n) = l(A_4 ⊕ D_{n−4}(F)) ≤ 2 + (n − 5) + 1 = n − 2 by consecutive application of Example 1 and Theorem 8.


Thus Proposition 3 and Example 1 provide a positive answer to the question whether the length of an algebra can be smaller than the length of its subalgebra. Consequently, the next natural question is: what values can be taken by the difference and the ratio of the lengths of an algebra and of its subalgebra?

Let (A, A′) be a pair, where A is an F-algebra over an arbitrary field F and A′ ⊆ A is its subalgebra. The next theorem shows that there exist families of such pairs with l(A′) > l(A) for which the difference can be arbitrarily large, and thus answers the first question. The second question is considered in the next two sections.

Theorem 9. For any natural number k there exist a number n and algebras A′ ⊂ A ⊂ M_n(F) such that l(A′) − l(A) = k.

Proof. Example 2 and Proposition 4 below give an explicit construction of a pair (A, A′) of F-algebras such that A′ ⊆ A and l(A′) − l(A) = k. This construction is based on Example 1.

Example 2. Let F be a sufficiently large field, let k be a fixed positive integer, and let n = 4k. Let A = A_4 ⊕ ... ⊕ A_4 (k summands) ⊂ M_n(F), and let A_i = E_{4i−3,4i−3} + E_{4i−3,4i−2} + E_{4i−2,4i−2} + E_{4i−2,4i−1} + E_{4i−1,4i−1}, i = 1, ..., k. Let us assign

A′ = ⟨ Σ_{i=1}^{k} (a_i A_i + b_i E_{4i,4i}) ⟩ ⊂ A,

where a_i, b_i, i = 1, ..., k, are distinct nonzero elements of F. Then l(A) = 3k − 1, as shown below, while l(A′) = n − 1 = 4k − 1 by Lemma 1.

Proposition 4. Let k be a fixed natural number, let F be a field with |F| > 2k + 1, and let A = A_4 ⊕ ... ⊕ A_4 (k summands) ⊂ M_n(F). Then l(A) = 3k − 1.

Proof. It follows from Theorem 8 and Example 1 that l(A) ≤ 2k + k − 1 = 3k − 1. Consider the generating set

S_A = { A = Σ_{i=1}^{k} (α_i(A_i − E_{4i−2,4i−1}) + β_i E_{4i,4i}), E_{4j−2,4j−1}, j = 1, ..., k },

where α_i, β_i, i = 1, ..., k, are distinct nonzero elements of F and A_i = E_{4i−3,4i−3} + E_{4i−3,4i−2} + E_{4i−2,4i−2} + E_{4i−2,4i−1} + E_{4i−1,4i−1}, i = 1, ..., k. Since AE_{4j−2,4j−1} = α_j(E_{4j−3,4j−1} + E_{4j−2,4j−1}), E_{4j−2,4j−1}A = α_j E_{4j−2,4j−1}, and the degree of the minimal polynomial of A is 3k, we have l(S_A) = 3k − 1 = l(A).

3.1 Two-block subalgebras in the upper triangular matrix algebra

We note that in Example 2 the value m = l(A′) − l(A) is an arbitrary number; however, the ratio r = (l(A′) + 1) : (l(A) + 1) = 4 : 3 is a constant. The main aim of this and the next section is to show that for any rational number r ∈ [1, 2] there exist an F-algebra A and its subalgebra A′ such that l(A′)/l(A) = r. In this section we consider the following 2-parametric family of algebras A_{n,m}:

A_{n,m} = ⟨ E, Σ_{i=1}^{n} E_{i,i}, E_{i,j} for 1 ≤ i < j ≤ n or n + 1 ≤ i < j ≤ n + m ⟩ ⊂ T_{m+n}(F),

where n ≥ m are natural numbers, over an arbitrary field F. We compute their lengths explicitly and find a subalgebra A′_{n,m} with l(A′_{n,m}) > l(A_{n,m}) in each algebra of this family; then, choosing appropriate values of the parameters n and m, we obtain the required behavior of the ratio l(A′_{n,m})/l(A_{n,m}), see Corollary 2.

Remark 1. The aforementioned constructions generalize Example 1; namely, we obtain a series of algebras A(n) = A_{n,m} and their subalgebras A′(n) with fixed length difference m, for which the length ratio r = r(n) is a non-constant linear-fractional function.

Remark 2. The algebra A_4 described in Example 1 coincides with A_{3,1}.

Notation 10. Any A ∈ A_{n,m} has the block-diagonal form A = A′ ⊕ A″, where A′ ∈ T_n(F) and A″ ∈ T_m(F). From now on we will use this notation.

In the following two lemmas we mark special elements in generating sets which are significant for the computation of the length of A_{n,m}.

Lemma 2. Let n ∈ ℕ, n ≥ 3, and let S be a generating set for A_{n,m}. Then there exists a generating set S̃ for A_{n,m} such that the following conditions hold:

1. dim L_1(S̃) = |S̃| + 1;

2. there exists a matrix A_0 = A_0′ ⊕ A_0″ ∈ S̃ such that A_0′ = Σ_{1≤i<j≤n} a_{i,j}E_{i,j} and A_0″ = Σ_{1≤i<j≤m} a_{i+n,j+n}E_{i+n,j+n} + E;

3. for all S ∈ S̃, S ≠ A_0, it holds that (S)_{i,i} = 0, i = 1, ..., n + m;

4. there exist B_1, ..., B_{n−1} ∈ S̃ such that either

(i) for all r = 1, ..., n − 1: B_r′ = E_{r,r+1} + Σ_{i=1}^{n} Σ_{j=i+2}^{n} b_{i,j;r}E_{i,j}, B_r″ ∈ N_m(F),

or

(ii) there exists k ∈ {1, ..., n − 1} such that

B_r′ = E_{r,r+1} + b_r E_{k,k+1} + Σ_{i=1}^{n} Σ_{j=i+2}^{n} b_{i,j;r}E_{i,j}, B_r″ ∈ N_m(F), r = 1, ..., n − 1, r ≠ k,

B_k′ = A_0′ = a_k E_{k,k+1} + Σ_{i=1}^{n} Σ_{j=i+2}^{n} â_{i,j}E_{i,j}, a_k ≠ 0,

B_k″ = A_0″ = Σ_{1≤i<j≤m} â_{i+n,j+n}E_{i+n,j+n} + E;

5. l(S̃) = l(S).

Proof. Let us consecutively transform the set S into a generating set satisfying conditions 1-4.

1. This condition is equivalent to the fact that all elements of S are linearly independent and E ∉ ⟨S⟩. Otherwise we remove all redundant elements from S. Notice that this does not change the length of S.

2. In order to prove the existence of A_0 it is sufficient to show that S contains a matrix with two distinct eigenvalues. But if all matrices in S had only one eigenvalue, then S would not be a generating set for A_{n,m}.

3. Proposition 2 allows us to transform the given generating set into a generating set S′ = {S − (S)_{1,1}E, S ∈ S} preserving its length. Then by Proposition 1 the transformation of S′ into the generating set S″ = {A_0} ∪ {S − (S)_{n+1,n+1}A_0, S ∈ S′, S ≠ A_0} also does not change its length. For simplicity of the subsequent text, let us redenote S″ by S.

4. Since E_{i,i+1} ∈ A_{n,m}, but for any t ≥ 2 and S ∈ S^t \ S the coefficient (S)_{i,i+1} = 0, i = 1, ..., n − 1, there exist B_1, ..., B_{n−1} ∈ S such that the vectors u_i = ((B_i)_{1,2}, (B_i)_{2,3}, ..., (B_i)_{n−1,n}), i = 1, ..., n − 1, are linearly independent. Let us next perform the following transformation F of the set S (by Proposition 1, F preserves the length of S), which is the identity on all elements S ∈ S, S ≠ B_i, i = 1, ..., n − 1, i.e. F(S) = S, and whose action on the matrices B_j, j = 1, ..., n − 1, depends on whether A_0 belongs to this set, as follows.

(i) Assume that A_0 ∉ {B_1, ..., B_{n−1}}. There exists a non-singular linear transformation F = {f_{i,j}} ∈ M_{n−1}(F) that maps the set {u_i} onto the set

{e_1 = (1, 0, ..., 0), e_2 = (0, 1, ..., 0), ..., e_{n−1} = (0, 0, ..., 1)} ⊂ F^{n−1},

i.e. e_i = Σ_{j=1}^{n−1} f_{i,j}u_j. Then let us assign F(B_r) = Σ_{j=1}^{n−1} f_{r,j}B_j. Then

F(B_r)′ = E_{r,r+1} + Σ_{i=1}^{n} Σ_{j=i+2}^{n} b_{i,j;r}E_{i,j}.

(ii) Assume that A_0 ∈ {B_1, ..., B_{n−1}}, i.e. A_0 = B_p for some p ∈ {1, ..., n − 1}. Since any matrix in M_{n−1,n−2}(F) of rank n − 2 contains a non-singular submatrix of order n − 2, there exists a number k ∈ {1, ..., n − 1} such that the vectors

v_i = ((B_i)_{1,2}, ..., (B_i)_{k−1,k}, (B_i)_{k+1,k+2}, ..., (B_i)_{n−1,n}), i = 1, ..., n − 1, i ≠ p,

are linearly independent. Since the matrices B_j were numbered arbitrarily, we may assume that p = k. There exists a non-singular linear transformation G = {g_{i,j}} ∈ M_{n−2}(F) that maps the set {v_i} onto the set

{e_1 = (1, 0, ..., 0), e_2 = (0, 1, ..., 0), ..., e_{n−2} = (0, 0, ..., 1)} ⊂ F^{n−2},

i.e. e_i = Σ_{j=1}^{k−1} g_{i,j}v_j + Σ_{j=k}^{n−2} g_{i,j}v_{j+1}. Then let us assign

F(B_r) = Σ_{j=1}^{k−1} g_{r,j}B_j + Σ_{j=k}^{n−2} g_{r,j}B_{j+1}, r ≠ k,   F(A_0) = A_0 − Σ_{i=1, i≠k}^{n−1} (A_0)_{i,i+1}F(B_i).

Then

F(B_r)′ = E_{r,r+1} + b_r E_{k,k+1} + Σ_{i=1}^{n} Σ_{j=i+2}^{n} b_{i,j;r}E_{i,j} for r ≠ k,

F(A_0)′ = a_k E_{k,k+1} + Σ_{i=1}^{n} Σ_{j=i+2}^{n} â_{i,j}E_{i,j}, a_k ≠ 0,

F(A_0)″ = Σ_{1≤i<j≤m} â_{i+n,j+n}E_{i+n,j+n} + E.

For simplicity of the subsequent text, let us redenote F(A_0) and F(B_r) by A_0 and B_r, correspondingly. Then the generating set S̃ = F(S) is of the required type.

5. The transformations applied to the set S in paragraphs 1-4 did not change its length; consequently, the length of the new generating set is equal to the length of the initial one.

Lemma 3. Let n ≥ 3, m ≥ 2 and let S be a generating set for A_{n,m} satisfying conditions 1-4 of Lemma 2. Then one of the following conditions holds:

(i) there exist C, C_1, ..., C_{m−1} ∈ ⟨S \ A_0⟩ such that

C_r″ = E_{r+n,r+n+1} + Σ_{i=1}^{m} Σ_{j=i+2}^{m} c_{i+n,j+n;r}E_{i+n,j+n}, r = 1, ..., m − 1,

and if Ã_0 = E − A_0 + C, then Ã_0″ = Σ_{i=1}^{m} Σ_{j=i+2}^{m} ã_{i+n,j+n}E_{i+n,j+n};

or

(ii) there exist C, C_r ∈ ⟨S \ A_0⟩, r = 1, ..., m − 1, r ≠ s, such that

C_r″ = E_{r+n,r+n+1} + c_r E_{s+n,s+n+1} + Σ_{i=1}^{m} Σ_{j=i+2}^{m} c_{i+n,j+n;r}E_{i+n,j+n},

and if Ã_0 = E − A_0 + C, then

Ã_0″ = ã_s E_{s+n,s+n+1} + Σ_{i=1}^{m} Σ_{j=i+2}^{m} ã_{i+n,j+n}E_{i+n,j+n}, ã_s ≠ 0,

where s ∈ {1, ..., m − 1} (in this case let us assign C_s = Ã_0).

Proof. Since all S_1, S_2 ∈ S and all i = n + 1, ..., n + m − 1 satisfy (S_1S_2)_{i,i+1} = 0 if S_1 ≠ A_0, S_2 ≠ A_0, and

(S_1A_0)_{i,i+1} = (S_1)_{i,i+1},   (A_0S_2)_{i,i+1} = (S_2)_{i,i+1},   (A_0²)_{i,i+1} = 2(A_0)_{i,i+1},

there exist C̃_1, ..., C̃_{m−1} ∈ S such that the vectors

w_i = ((C̃_i)_{n+1,n+2}, (C̃_i)_{n+2,n+3}, ..., (C̃_i)_{n+m−1,n+m}), i = 1, ..., m − 1,

are linearly independent. The matrices C_i can be obtained from C̃_i, i = 1, ..., m − 1, by applying transformations similar to those in paragraph 4 of Lemma 2. Assign C = Σ_{i=1}^{m−1} â_{i+n,i+n+1}C_i, so that (C)_{i+n,i+n+1} = (A_0)_{i+n,i+n+1} for all i = 1, ..., m − 1.

Further length computation of A_{n,m} is carried out separately for different values of n and m.

Lemma 4. Let F be an arbitrary field. Then l(A_{1,1}) = 1 and l(A_{2,2}) = 3.

Proof. 1. The algebra A_{1,1} = D_2(F); consequently, it follows from [8, Theorem 4.5] that l(A_{1,1}) = 1.

2. The algebra A_{2,2} is generated by the nonderogatory matrix E_{1,1} + E_{1,2} + E_{2,2} + E_{3,4}. Consequently, by Lemma 1 we have l(A_{2,2}) = 3.

Lemma 5. Let F be an arbitrary field. Then l(A_{2,1}) = 2 and l(A_{3,2}) = 3.

Proof. 1. The algebra A_{2,1} is generated by the nonderogatory matrix E_{1,1} + E_{1,2} + E_{2,2}. Consequently, by Lemma 1 we have l(A_{2,1}) = 2.

2. Let S be a generating set for A_{3,2}. Without loss of generality we assume that S satisfies conditions 1-4 of Lemma 2 and therefore one of the conditions of Lemma 3. Let us show that L_3(S) = A_{3,2}. We have B_1B_2(E − A_0) = aE_{1,3}, a ≠ 0; B_1(E − A_0)² = b_{11}E_{1,2} + b_{12}E_{2,3} + b_{13}E_{1,3}; B_2(E − A_0)² = b_{21}E_{1,2} + b_{22}E_{2,3} + b_{23}E_{1,3}, with linearly independent vectors (b_{11}, b_{12}) and (b_{21}, b_{22}); C_1A_0² = bE_{4,5} + cE_{1,3}, b ≠ 0. Then E_{4,4} + E_{5,5} = A_0 − (A_0)_{1,2}E_{1,2} − (A_0)_{1,3}E_{1,3} − (A_0)_{2,3}E_{2,3} − (A_0)_{4,5}E_{4,5} ∈ L_3(S).

Consider S = {A = E_{1,2}, B = E_{2,3} + E_{4,4} + E_{4,5} + E_{5,5}, E}. It follows from the equations A² = 0, AB = E_{1,3}, BA = 0, B² = E_{4,4} + 2E_{4,5} + E_{5,5} and B³ − B² = E_{4,5} that l(S) = 3 = l(A_{3,2}).


Lemma 6. Let F l(An,m ) = n − 1.

be an arbitrary eld, let

n, m ∈ N

and

n − m > 2.

Then

Proof. Let us rst prove the upper bound l(An,m )

6 n − 1. Consider a generating set S for An,m . Without loss of generality we assume S satisfying the

conditions 1–4 of Lemma 2.

1. We use induction on p = n − (j − i) to prove that Ei,j ∈ Ln−1(S) for 1 ≤ i < j ≤ n, j − i ≥ 2. If p = 1 then B1B2···Bn−1 = (ak)^t E1,n ∈ Ln−1(S), t ∈ {0, 1}, since B1″B2″···Bn−1″ = 0 as a product of n − 1 nilpotent matrices of order m ≤ n − 2 when t = 0, and as a product of n − 2 nilpotent and one unitriangular matrices of order m ≤ n − 2 when t = 1. Consider the following matrix products

Bj,j+n−p−1 = BjBj+1···Bj+n−p−1(E − A0)^{p−1} ∈ Ln−1(S), j = 1, …, p.

We have B′j,j+n−p−1 = (ak)^t Ej,j+n−p + Σ_{h=1}^{n} Σ_{i=h+n−p+1}^{n} dh,i;j,p Eh,i, t ∈ {0, 1}, and B″j,j+n−p−1 = 0 as a product of n − 1 nilpotent matrices of order m ≤ n − 2 when t = 0, and as a product of n − 2 nilpotent and one unitriangular matrices of order m ≤ n − 2 when t = 1. Applying the induction hypothesis we obtain that Ei,i+n−q−1 ∈ Ln−1(S) for all q = 1, …, p − 1, i = 1, …, q. Hence Bj,j+n−p−1 − (ak)^t Ej,j+n−p ∈ Ln−1(S). Since by definition it holds that Bj,j+n−p−1 ∈ Ln−1(S), then

Ej,j+n−p = (ak)^{−t}(Bj,j+n−p−1 − (Bj,j+n−p−1 − (ak)^t Ej,j+n−p)) ∈ Ln−1(S),

j = 1, …, p.

2. Let us now consider Bj,j = Bj(E − A0)^{n−2} ∈ Ln−1(S), j = 1, …, n − 1. It follows immediately that (Bj,j)r,r+1 = (Bj)r,r+1, j, r = 1, …, n − 1, that is

B′j,j = Ej,j+1 + γj Ek,k+1 + Σ_{h=1}^{n} Σ_{i=h+2}^{n} dh,i;j Eh,i, j ≠ k,

B′k,k = (ak)^t Ek,k+1 + Σ_{h=1}^{n} Σ_{i=h+2}^{n} dh,i;k Eh,i, t ∈ {0, 1},

and B″r,r = 0 as a product of n − 1 nilpotent matrices of order m ≤ n − 2 when r ≠ k, and as a product of n − 2 nilpotent and one unitriangular matrices of order m ≤ n − 2 when r = k.

Matrix algebras and their length

It follows from paragraph 1 that Σ_{h=1}^{n} Σ_{i=h+2}^{n} dh,i;j Eh,i ∈ Ln−1(S), j = 1, …, n. Therefore

Ek,k+1 = (ak)^{−t}(Bk,k − Σ_{h=1}^{n} Σ_{i=h+2}^{n} dh,i;k Eh,i) ∈ Ln−1(S).

Then

Ej,j+1 = Bj,j − Σ_{h=1}^{n} Σ_{i=h+2}^{n} dh,i;j Eh,i − γj Ek,k+1 ∈ Ln−1(S).

Consequently, Ei,j ∈ Ln−1(S), 1 ≤ i < j ≤ n. Hence for any N ∈ Nn(F) it holds that N ⊕ 0 ∈ Ln−1(S).

3. Let S1, …, Sn ∈ S and assume there exists some Si ≠ A0. It follows from [7, Equation (1)] that there exists V ∈ Ln−1(S) such that S1···Sn + V = S′ ⊕ 0, S′ ∈ Nn(F), but it follows from paragraphs 1 and 2 that S′ ⊕ 0 ∈ Ln−1(S). Therefore S1···Sn is reducible. By the Cayley–Hamilton theorem it holds that (A0″)^{m+1} ∈ ⟨A0″, (A0″)², …, (A0″)^m⟩. Consequently, there exists VA ∈ Ln−1(S) such that A0^n + VA = A′ ⊕ 0, A′ ∈ Nn(F), but it follows from paragraphs 1 and 2 that A′ ⊕ 0 ∈ Ln−1(S). Therefore A0^n is also reducible. So any word of length n in elements of S is reducible, therefore Ln(S) = Ln−1(S) and l(S) ≤ n − 1.

By Theorem 8 we obtain that l(An,m) ≥ n − 1. Consequently, l(An,m) = n − 1.

Lemma 7. Let F be an arbitrary field, n ∈ N and n > 3. Then l(An,n−1) = n − 1.

Proof. Let us first prove the upper bound l(An,n−1) ≤ n − 1. Let S be a generating set for An,n−1. Without loss of generality we assume S to satisfy the

conditions 1–4 of Lemma 2 and therefore one of the conditions of Lemma 3.

1. We use induction on p = n − (j − i) to prove that Ei,j ∈ Ln−1(S) for 1 ≤ i < j ≤ n, j − i ≥ 2. Consider the case when p = 1.

(i) Assume that there is no such number k that A0 = Bk. Then we obtain B1B2···Bn−1 = E1,n ∈ Ln−1(S), since B1″B2″···Bn−1″ = 0 as a product of n − 1 nilpotent matrices of order n − 1. Also it holds that C1···Cn−2A0 = aEn+1,2n−1 + bE1,n, a ≠ 0, that is En+1,2n−1 ∈ Ln−1(S).

(ii) Assume now that there exists a number k such that A0 = Bk. Then we obtain B1B2···Bn−1 = akE1,n + αEn+1,2n−1. Assume that α = 0. It follows from the equalities (C1···Cn−2)″ = aE1,n−1, a ≠ 0, and (C1···Cn−2)′ = β1E1,n−2 + β2E1,n−1 + β3E1,n + β4E2,n−1 + β5E2,n + β6E3,n ∈ Nn(F) for n > 3 that if k = n − 1 then A0C1···Cn−2 = aEn+1,2n−1, and if k ≠ n − 1 then C1···Cn−2A0 = aEn+1,2n−1; consequently, E1,n, En+1,2n−1 ∈ Ln−1(S). Assume now that α ≠ 0. Therefore (B1···Bk−1Bk+1···Bn−1)″ = αE1,n−1 and (B1···Bk−1Bk+1···Bn−1)′ = β1E1,n−1 + β2E1,n + β3E2,n. Since n > 3, then k ≠ 1 or k ≠ n − 1. If k ≠ 1 we obtain that A0B1···Bk−1Bk+1···Bn−1 =


αEn+1,2n−1, and if k ≠ n − 1 we obtain that B1···Bk−1Bk+1···Bn−1A0 = αEn+1,2n−1; consequently, E1,n, En+1,2n−1 ∈ Ln−1(S). Therefore in all cases it holds that E1,n, En+1,2n−1 ∈ Ln−1(S). Consider the matrices Bj,j+n−p−1 ∈ Ln−1(S), j = 1, …, p, defined in Lemma 6. We have B″j,j+n−p−1 = b(j, p)En+1,2n−1, b(j, p) ∈ F, as a product of n − 1 nilpotent matrices of order n − 1 when t = 0, and as a product of n − 2 nilpotent and one unitriangular matrices of order n − 1 when t = 1.

Hence, using the induction hypothesis and arguments similar to those of paragraph 1 of Lemma 6, we obtain that Ej,j+n−p ∈ Ln−1(S), j = 1, …, p.

2. Consider Bj,j ∈ Ln−1(S), j = 1, …, n − 1, defined in Lemma 6. It follows immediately that B″r,r = b(r)En+1,2n−1, b(r) ∈ F, as a product of n − 1 nilpotent matrices of order n − 1 when r ≠ k, and as a product of n − 2 nilpotent and one unitriangular matrices of order n − 1 when r = k. Hence, using arguments similar to those of paragraph 2 of Lemma 6, we obtain that Ej,j+1 ∈ Ln−1(S). Consequently, Ei,j ∈ Ln−1(S), 1 ≤ i < j ≤ n. Hence for any N ∈ Nn(F) it holds that N ⊕ 0 ∈ Ln−1(S). Therefore, as was shown in paragraph 3 of Lemma 6, any word of length n in elements of S is reducible, thus Ln(S) = Ln−1(S) and l(S) ≤ n − 1. Then l(An,n−1) ≤ n − 1.

By Theorem 8 we obtain that l(An,n−1) ≥ n − 1. Consequently, l(An,n−1) = n − 1.

Lemma 8. Let F be a field, n ∈ N and n > 2. Then l(An,n) = n.

Proof.

I. Let us first prove the upper bound l(An,n) ≤ n. Let S be a generating set for An,n. Without loss of generality we assume S to satisfy the conditions 1–4 of Lemma 2 and therefore one of the conditions of Lemma 3.

1. We use induction on p = n − (j − i) to prove that Ei,j, Ei+n,j+n ∈ Ln(S) for 1 ≤ i < j ≤ n, j − i ≥ 2. Consider the case when p = 1. Assume that there does not exist such a number k that A0 = Bk. Then we obtain B1B2···Bn−1(E − A0) = E1,n ∈ Ln(S), since B1″B2″···Bn−1″(E − A0)″ = 0 as a product of n nilpotent matrices of order n. Assume now that there exists a number k such that A0 = Bk. Then we obtain B1B2···Bn−1 = akE1,n + α1En+1,2n−1 + α2En+1,2n + α3En+2,2n. Notice that since n = m > 2, if condition (ii) of Lemma 3 holds, then the number s introduced there satisfies one of the inequalities s ≠ 1 or s ≠ n − 1, and if condition (i) of Lemma 3 holds, both inequalities hold true. If s ≠ 1 then Ã0B1B2···Bn−1 = E1,n, and if s ≠ n − 1 then B1B2···Bn−1Ã0 = E1,n, that is E1,n ∈ Ln(S). Also it holds that C1···Cn−1A0 = aE1,n + bEn+1,2n ∈ Ln(S), b ≠ 0, therefore En+1,2n = b^{−1}(C1···Cn−1A0 − aE1,n) ∈ Ln(S). Consider the following matrix products

Bj,j+n−p−1 = BjBj+1···Bj+n−p−1(E − A0)^p ∈ Ln(S), j = 1, …, p.

We have

B′j,j+n−p−1 = (ak)^t Ej,j+n−p + Σ_{h=1}^{n} Σ_{i=h+n−p+1}^{n} bh,i;j,p Eh,i, t ∈ {0, 1},

and B″j,j+n−p−1 = b(j, p)En+1,2n, b(j, p) ∈ F, as a product of n nilpotent matrices of order n when t = 0, and as a product of n − 1 nilpotent and one unitriangular matrices of order n when t = 1. Consider

Cj,j+n−p−1 = CjCj+1···Cj+n−p−1A0^p ∈ Ln(S), j = 1, …, p.

C″j,j+n−p−1 = (ãs)^t Ej+n,j+2n−p + Σ_{h=1}^{n} Σ_{i=h+n−p+1}^{n} ch,i;j,p Eh+n,i+n, t ∈ {0, 1},

and C′j,j+n−p−1 = c(j, p)E1,n as a product of n nilpotent matrices of order n when t = 0, and as a product of n − 1 nilpotent and one unitriangular matrices of order n when t = 1. Applying the induction hypothesis we obtain that Ei,i+n−q−1, Ei+n,i+2n−q−1 ∈ Ln(S) for all q = 2, …, p − 1, i = 1, …, q,

and E1,n, En+1,2n ∈ Ln(S) as was shown above. Therefore,

Bj,j+n−p−1 − (ak)^t Ej,j+n−p, Cj,j+n−p−1 − (ãs)^t Ej+n,j+2n−p ∈ Ln(S).

Since by definition it holds that Bj,j+n−p−1, Cj,j+n−p−1 ∈ Ln(S), then

Ej,j+n−p = (ak)^{−t}(Bj,j+n−p−1 − (Bj,j+n−p−1 − (ak)^t Ej,j+n−p)) ∈ Ln(S),
Ej+n,j+2n−p = (ãs)^{−t}(Cj,j+n−p−1 − (Cj,j+n−p−1 − (ãs)^t Ej+n,j+2n−p)) ∈ Ln(S),

j = 1, …, p.

2. Consider next Bj,j = Bj(E − A0)^{n−1} ∈ Ln(S) and Cj,j = CjA0^{n−1} ∈ Ln(S), j = 1, …, n − 1. It follows immediately that

B′j,j = Ej,j+1 + βj Ek,k+1 + Σ_{h=1}^{n} Σ_{i=h+2}^{n} bh,i;j,n−1 Eh,i, j ≠ k,

B′k,k = (ak)^t Ek,k+1 + Σ_{h=1}^{n} Σ_{i=h+2}^{n} bh,i;k,n−1 Eh,i, t ∈ {0, 1},

C″j,j = Ej+n,j+n+1 + γj Es+n,s+n+1 + Σ_{h=1}^{n} Σ_{i=h+2}^{n} ch,i;j,n−1 Eh+n,i+n, j ≠ s,

C″s,s = (ãs)^t Es+n,s+n+1 + Σ_{h=1}^{n} Σ_{i=h+2}^{n} ch,i;s,n−1 Eh+n,i+n, t ∈ {0, 1},

and B″r,r = b(r)En+1,2n as a product of n nilpotent matrices of order n when t = 0, and as a product of n − 1 nilpotent and one unitriangular matrices of order n when t = 1, and C′r,r = c(r)E1,n as a product of n − 1 nilpotent and one unitriangular matrices of order n.

It follows from paragraph 1 that

Bk,k − (ak)^t Ek,k+1, Cs,s − (ãs)^t Es+n,s+n+1 ∈ Ln(S).

Therefore Ek,k+1, Es+n,s+n+1 ∈ Ln(S). Then

Ej,j+1 = Bj,j − Σ_{h=1}^{n} Σ_{i=h+2}^{n} bh,i;j,n−1 Eh,i − βj Ek,k+1 − b(j)En+1,2n ∈ Ln(S),
Ej+n,j+n+1 = Cj,j − Σ_{h=1}^{n} Σ_{i=h+2}^{n} ch,i;j,n−1 Eh+n,i+n − γj Es+n,s+n+1 − c(j)E1,n ∈ Ln(S).

Consequently, Ej,j+n−p, Ej+n,j+2n−p ∈ Ln(S), j = 1, …, p.

3. Then it holds that

0 ⊕ E = A0 − akEk,k+1 − Σ_{1≤i<j≤n, (i,j)≠(k,k+1)} âi,j Ei,j − Σ_{1≤i<j≤n} âi+n,j+n Ei+n,j+n,

that is 0 ⊕ E ∈ Ln(S). Hence any generating set S satisfies Ln(S) = An,n, therefore l(An,n) ≤ n.

II. Let us construct a generating set for An,n of length n. Let

Sn = {Ai = Ei,i+1 + En+i,n+i+1, i = 1, …, n − 1, E, En = Σ_{j=1}^{n} Ej,j}.

Since AiAj = 0 for j ≠ i + 1 and EnAi = AiEn = Ei,i+1, we have E1,n ∉ Ln−1(Sn), where

Ln−1(Sn) = ⟨E, En, Ei,j, Ei+n,j+n, E1,n + En+1,2n | 1 ≤ i < j ≤ n, j − i ≤ n − 2⟩,

but E1,n = A1···An−1En ∈ Ln(Sn) and therefore Ln(Sn) = An,n. Consequently, l(An,n) = n.

The combination of Lemmas 4–8 implies

Theorem 11. Let F be an arbitrary field, let n ≥ m be natural numbers and let

An,m = ⟨E, Σ_{i=1}^{n} Ei,i, Ei,j | 1 ≤ i < j ≤ n or n + 1 ≤ i < j ≤ n + m⟩ ⊂ Tm+n(F).

Then

l(An,m) =
  n − 1, for n − m ≥ 2,
  n − 1, for n = m + 1, n > 3,
  n, for n = m + 1, m = 1, 2,
  n, for n = m ≠ 2,
  n + 1, for n = m = 2.


The following corollary shows in particular that the length ratio for a two-block algebra and its subalgebra can take on any rational value in [1, 2].

Corollary 2. Let F be an arbitrary field, let n ≥ m be fixed natural numbers. Let

Cn,m = Σ_{i=1}^{n−1} Ei,i+1 + Σ_{j=1}^{m−1} (Ej+n,j+n + Ej+n,j+n+1) + En+m,n+m ∈ An,m

be a nonderogatory matrix, and let

A′n,m = ⟨Cn,m^j | 0 ≤ j ≤ n + m − 1⟩ ⊆ An,m.

Then

1. l(A′n,m) = n + m − 1;
2. for n = m = 1, 2 or n = 2, m = 1 we have A′n,m = An,m;
3. l(A′n,m) − l(An,m) = m for n − m ≥ 2 or n = m + 1, n > 3, and l(A′n,m) − l(An,m) = m − 1 for n = m ≠ 2 or n = 3, m = 2; moreover,
(l(A′n,m) + 1)/(l(An,m) + 1) = 1 + m/n for n − m ≥ 2 or n = m + 1, n > 3, and (l(A′n,m) + 1)/(l(An,m) + 1) = 1 + (m − 1)/(n + 1) for n = m ≠ 2 or n = 3, m = 2.

3.2 Three block subalgebras in upper triangular matrix algebra

In this section we consider the 3-parametric family of algebras An1,n2,n3 ⊂ Tn1(F) ⊕ Tn2(F) ⊕ Tn3(F),

An1,n2,n3 = ⟨E, Σ_{i=1}^{n1} Ei,i, Σ_{i=n1+1}^{n1+n2} Ei,i, Ei,j | 1 ≤ i < j ≤ n1, or n1 + 1 ≤ i < j ≤ n1 + n2, or n1 + n2 + 1 ≤ i < j ≤ n1 + n2 + n3⟩

over an arbitrary field F, compute their lengths explicitly and find the subalgebras A′n1,n2,n3 ⊂ An1,n2,n3 with l(A′n1,n2,n3) > l(An1,n2,n3); then, choosing appropriate values of the parameters n1, n2, n3, we obtain arbitrary rational ratios l(A′n1,n2,n3)/l(An1,n2,n3) ∈ [1, 2), see Corollary 3.

Notation 12. Any A ∈ An1,n2,n3 is of the following form:

        ( A′   0    0   )
    A = ( 0    A″   0   ),   where A′ ∈ Tn1(F), A″ ∈ Tn2(F), A‴ ∈ Tn3(F).
        ( 0    0    A‴  )

From now on we will use the following notation: A = A′ ⊕ A″ ⊕ A‴.


In the following three lemmas we mark special elements in generating sets which are significant for the computation of the length of An1,n2,n3.

Lemma 9. Let S be a generating set for An1,n2,n3. Then there exists a generating set S̃ for An1,n2,n3 such that the following conditions hold:

1. dim L1(S̃) = |S̃| + 1;
2. any S ∈ S̃ satisfies (S)i,i = 0, i = 1, …, n1;
3. either (i) there exist matrices A1 = (ai,j;1), A2 = (ai,j;2) ∈ S̃ such that

A1″ = E + N1, N1 ∈ Nn2(F), A1‴ ∈ Nn3(F), A2″ ∈ Nn2(F), A2‴ = E + N2, N2 ∈ Nn3(F),

and all S ∈ S̃, S ≠ A1, A2, satisfy (S)i,i = 0, i = 1, …, n1 + n2 + n3; or (ii) there exists a matrix A0 = (ai,j;0) such that

A0″ = E + N, N ∈ Nn2(F), A0‴ = aE + M, M ∈ Nn3(F), a ∉ {0, 1},

and all S ∈ S̃, S ≠ A0, satisfy (S)i,i = 0, i = 1, …, n1 + n2 + n3;

4. l(S̃) = l(S).

Proof. Let us successively transform the set S into a generating set satisfying the conditions 1–3.

1. We use the same arguments as in point 1 of Lemma 2.

2. Proposition 2 allows us to transform the given generating set into a generating set S1 = {S − (S)1,1E, S ∈ S} preserving its length.

3. (i) Assume there exist C1, C2 ∈ S1 such that the vectors

c1 = ((C1)n1+1,n1+1, (C1)n1+n2+1,n1+n2+1), c2 = ((C2)n1+1,n1+1, (C2)n1+n2+1,n1+n2+1)

are linearly independent. Thus there exists a non-singular matrix F = (fi,j) ∈ M2(F) such that (1, 0) = f1,1c1 + f1,2c2, (0, 1) = f2,1c1 + f2,2c2. Let us assign Ai = fi,1C1 + fi,2C2, i = 1, 2. Then Proposition 1 allows us to transform the given generating set into a generating set

S2 = {A1, A2, S | S ∈ S, S ≠ C1, C2}

preserving its length. And by Proposition 1 the transformation of S2 into a generating set

S3 = {A1, A2, S − (S)n1+1,n1+1A1 − (S)n1+n2+1,n1+n2+1A2 | S ∈ S1, S ≠ A1, A2}

also does not change its length. In this case we assign S̃ = S3.

(ii) Otherwise there exists a matrix A in S1 such that the vectors

((A)n1+1,n1+1, (A)n1+n2+1,n1+n2+1), ((A²)n1+1,n1+1, (A²)n1+n2+1,n1+n2+1)

are linearly independent. Thus the matrix A has two distinct non-zero eigenvalues. Then we can replace the matrix A in S1 with the matrix A0 = ((A)n1+1,n1+1)^{−1}A.


Then Proposition 1 allows us to transform the given generating set into a generating set S2 = {A0, S − (S)n1+1,n1+1A0, S ∈ S, S ≠ A0}. Let us assign S̃ = S2.

Lemma 10. Let S be a generating set for An1,n2,n3 satisfying the conditions 1, 2 and 3(i) of Lemma 9. Then there exist a generating set S̃ for An1,n2,n3 satisfying l(S̃) = l(S), matrices B1, …, Bn1−1 ∈ S̃ and k1, k2 ∈ {1, …, n1 − 1} such that one of the following conditions holds:

1. B′r = Er,r+1 + Σ_{i=1}^{n1} Σ_{j=i+2}^{n1} bi,j;r Ei,j, B″r ∈ Nn2(F), B‴r ∈ Nn3(F), r = 1, …, n1 − 1;

2. there exists j ∈ {1, 2} such that

B′r = Er,r+1 + brj Ekj,kj+1 + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} bh,i;r Eh,i, B″r ∈ Nn2(F), B‴r ∈ Nn3(F), r = 1, …, n1 − 1, r ≠ kj,
A′j = B′kj = a(kj, j)Ekj,kj+1 + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} ah,i;j Eh,i, a(kj, j) ≠ 0, B″kj = A″j, B‴kj = A‴j;

3. B′r = Er,r+1 + br1 Ek1,k1+1 + br2 Ek2,k2+1 + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} bh,i;r Eh,i, B″r ∈ Nn2(F), B‴r ∈ Nn3(F), r = 1, …, n1 − 1, r ≠ k1, k2,
A′j = B′kj = a(k1, j)Ek1,k1+1 + a(k2, j)Ek2,k2+1 + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} ah,i;j Eh,i, a(kj, j) ≠ 0, a(k1, 1)a(k2, 2) − a(k2, 1)a(k1, 2) ≠ 0, B″kj = A″j, B‴kj = A‴j, j = 1, 2.

Proof. Since Ei,i+1 ∈ An1,n2,n3, but for any t ≥ 2 and S ∈ St \ S the coefficient (S)i,i+1 = 0, i = 1, …, n1 − 1, there exist B1, …, Bn1−1 ∈ S such that the vectors ((Bi)1,2, (Bi)2,3, …, (Bi)n1−1,n1), i = 1, …, n1 − 1, are linearly independent. Consider next the following transformation F of the set S (by Proposition 1, F preserves the length of S), which is identical for all elements S ∈ S, S ≠ Bi, i = 1, …, n1 − 1, i.e. F(S) = S, and whose action on the set of matrices Bj, j = 1, …, n1 − 1, depends on whether A1 and A2 belong to this set, as follows. If |{B1, …, Bn1−1} ∩ {A1, A2}| ≤ 1, then F is constructed similarly to the transformation described in point 4 of Lemma 2. Assume that both A1, A2 ∈ {B1, …, Bn1−1}, i.e. A1 = Bp, A2 = Bq for some distinct p, q ∈ {1, …, n1 − 1}. Since any matrix in Mn1−1,n1−3(F) of rank n1 − 3 contains a non-singular submatrix of order n1 − 3, there exist numbers k1, k2 ∈ {1, …, n1 − 1}, k1 < k2, such that the vectors

vi = ((Bi)1,2, …, (Bi)k1−1,k1, (Bi)k1+1,k1+2, …, (Bi)k2−1,k2, (Bi)k2+1,k2+2, …, (Bi)n1−1,n1),


i = 1, …, n1 − 1, i ≠ p, q, are linearly independent. Since the matrices Bj were numbered arbitrarily, we may assume that p = k1, q = k2. There exists a non-singular linear transformation G = {gi,j} ∈ Mn1−3(F) that maps the set {vi} into the set

{e1 = (1, 0, …, 0), e2 = (0, 1, …, 0), …, en1−3 = (0, 0, …, 1)} ⊂ F^{n1−3},

i.e. ei = Σ_{j=1}^{n1−3} gi,j vj. Then let us assign

F(Br) = Σ_{j=1}^{k1−1} gr,j Bj + Σ_{j=k1}^{k2−1} gr,j Bj+1 + Σ_{j=k2}^{n1−1} gr,j Bj+2, r ≠ k1, k2,

F(As) = As − Σ_{i=1, i≠k1,k2}^{n1−1} (As)i,i+1 F(Bi), s = 1, 2.

For the sake of simplicity of the subsequent text let us redenote F(A1), F(A2) and F(Br) by A1, A2 and Br, correspondingly.

Lemma 11. Let S be a generating set for An1,n2,n3 satisfying conditions 1, 2 and 3(ii) of Lemma 9. Then there exist a generating set S̃ for An1,n2,n3 satisfying l(S̃) = l(S), matrices B1, …, Bn1−1 ∈ S̃ and k0 ∈ {1, …, n1 − 1} such that one of the following conditions holds:

1. B′r = Er,r+1 + Σ_{i=1}^{n1} Σ_{j=i+2}^{n1} bi,j;r Ei,j, B″r ∈ Nn2(F), B‴r ∈ Nn3(F), r = 1, …, n1 − 1;

2. B′r = Er,r+1 + br0 Ek0,k0+1 + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} bh,i;r Eh,i, B″r ∈ Nn2(F), B‴r ∈ Nn3(F), r = 1, …, n1 − 1, r ≠ k0,
A′0 = B′k0 = a(k0, 0)Ek0,k0+1 + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} ah,i;0 Eh,i, a(k0, 0) ≠ 0, B″k0 = A″0, B‴k0 = A‴0.

Proof. The proof is analogous to the proof of point 4 of Lemma 2.

Theorem 13. Let F = F2, let n1, n2, n3 ∈ N, n1 ≥ n2 + 2, n2 ≥ n3, (n2, n3) ≠ (1, 1), (2, 1), (2, 2). Then l(An1,n2,n3) = n1 − 1.

Proof. Let us first prove the upper bound l(An1,n2,n3) ≤ n1 − 1. Let S be a generating set for An1,n2,n3. Without loss of generality we assume S to satisfy the conditions 1–2 of Lemma 9. Since by [8, Theorem 6.1] l(D3(F2)) = 1, the only possibility for S is to satisfy condition 3(i) of Lemma 9, and consequently we assume S to satisfy one of the conditions of Lemma 10.


1. We use induction on p = n1 − (j − i) to prove that Ei,j ∈ Ln1−1(S) for 1 ≤ i < j ≤ n1, j − i ≥ 2. Notice that B1B2···Bn1−1 = E1,n1 ∈ Ln1−1(S), since (B1B2···Bn1−1)″ = 0 and (B1B2···Bn1−1)‴ = 0 as products of n1 − 1 nilpotent matrices of order ns+1 ≤ n1 − 2 if S satisfies condition 1 of Lemma 10, and as products of n1 − 2 nilpotent and one unitriangular matrices of order ns+1 ≤ n1 − 2 if S satisfies condition 2 or 3 of Lemma 10. Consider the following matrix products

Bj,j+n1−p−1 = BjBj+1···Bj+n1−p−1(E − A1 − A2)^{p−1} ∈ Ln1−1(S), j = 1, …, p.

We have B′j,j+n1−p−1 = Ej,j+n1−p + Σ_{h=1}^{n1} Σ_{i=h+n1−p+1}^{n1} ch,i;j Eh,i, and B″j,j+n1−p−1 = 0 and B‴j,j+n1−p−1 = 0 as products of n1 − 1 nilpotent matrices of order ns+1 ≤ n1 − 2 if S satisfies condition 1 of Lemma 10 or if for ks defined in points 2 and 3 of Lemma 10 it holds that ks ∉ {j, …, j + n1 − p − 1}, s = 1, 2, and as products of n1 − 2 nilpotent and one unitriangular matrices of order ns+1 ≤ n1 − 2 otherwise. Applying the induction hypothesis we obtain that Ei,i+n1−q−1 ∈ Ln1−1(S) for all q = 1, …, p − 1, i = 1, …, q. Therefore Bj,j+n1−p−1 − Ej,j+n1−p ∈ Ln1−1(S). Hence we obtain that

Ej,j+n1−p = (Bj,j+n1−p−1 − (Bj,j+n1−p−1 − Ej,j+n1−p)) ∈ Ln1−1(S), j = 1, …, p.

2. Consider next Bj,j = Bj(E − A1 − A2)^{n1−2} ∈ Ln1−1(S), j = 1, …, n1 − 1. We have

B′j,j = Ej,j+1 + γj,1 Ek1,k1+1 + γj,2 Ek2,k2+1 + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} ch,i;j Eh,i, j ≠ k1, k2,

B′kr,kr = ak1,r Ek1,k1+1 + ak2,r Ek2,k2+1 + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} ch,i;r Eh,i, r = 1, 2,

and B″j,j = 0 and B‴j,j = 0 as products of n1 − 1 nilpotent matrices of order ns+1 ≤ n1 − 2 if ks ≠ j or does not exist, and as products of n1 − 2 nilpotent and one unitriangular matrices of order ns+1 ≤ n1 − 2 if ks = j, s = 1, 2. It follows from paragraph 1 that Ej,j+1 + γj,1Ek1,k1+1 + γj,2Ek2,k2+1, ak1,rEk1,k1+1 + ak2,rEk2,k2+1 ∈ Ln1−1(S), j ≠ k1, k2, r = 1, 2, and hence Ej,j+1 ∈ Ln1−1(S), j = 1, …, n1 − 1. Consequently, Ei,j ∈ Ln1−1(S), 1 ≤ i < j ≤ n1.

3. From paragraphs 1 and 2 we obtain that (E − A1 − A2)^{n2} = Σ_{i=1}^{n1} Ei,i + Σ_{1≤h<i≤n1} λh,i Eh,i ∈ Ln2(S) ⊆ Ln1−1(S), whence Σ_{i=1}^{n1} Ei,i ∈ Ln1−1(S). Hence any generating set S satisfies Ln1−1(S) = An1,n2,n3, therefore l(An1,n2,n3) ≤ n1 − 1.

The lower bound l(An1,n2,n3) ≥ n1 − 1 is obtained as before. Consequently,

l(An1,n2,n3) = n1 − 1.

Theorem 14. Let F be an arbitrary field, |F| ≥ 3, and let n1, n2, n3 ∈ N, n1 ≥ n2 + n3 + 2, n2 ≥ n3, (n2, n3) ≠ (1, 1), (2, 1), (2, 2). Then l(An1,n2,n3) = n1 − 1.

Proof. I. Let us first prove the upper bound l(An1,n2,n3) ≤ n1 − 1. Let S be a generating set for An1,n2,n3. Without loss of generality we assume S to satisfy the conditions 1–2 of Lemma 9. If S satisfies condition 3(i) of Lemma 9, then the proof is analogous to the proof of Theorem 13. Consequently, we assume S to satisfy condition 3(ii) of Lemma 9, and therefore one of the conditions of Lemma 11.

1. We use induction on p = n1 − (j − i) to prove that Ei,j ∈ Ln1−1(S) for 1 ≤ i < j ≤ n1, j − i ≥ 2. Let us denote m = n1 + n2 + 1. If p = 1, then B1···Bn1−1 = bE1,n1 ∈ Ln1−1(S), b ≠ 0, since (B1···Bn1−1)″ = 0 and (B1···Bn1−1)‴ = 0 as products of n1 − 2 nilpotent and one unitriangular matrices or n1 − 1 nilpotent matrices of orders n2 and n3, correspondingly. If p ≤ n1 − n3 − 2 and j = 1, …, p, consider

Bj,j+n1−p−1 = Bj···Bj+n1−p−1(E − A0)^{p−1} ∈ Ln1−1(S),

B′j,j+n1−p−1 = a(k0, 0)^t Ej,j+n1−p + Σ_{h=1}^{n1} Σ_{i=h+n1−p+1}^{n1} ch,i;j Eh,i, t ∈ {0, 1},

and B″j,j+n1−p−1 = 0, B‴j,j+n1−p−1 = 0, as products of nilpotent matrices whose orders are smaller than the number of factors. If n1 − n3 − 1 ≤ p < n1 − 1 and j = 1, …, p, consider

Bj,j+n1−p−1 = Bj···Bj+n1−p−1(E − a^{−1}A0)^{n3−n1+p}(E − A0)^{n1−n3−1} ∈ Ln1−1(S),

B′j,j+n1−p−1 = a(k0, 0)^t Ej,j+n1−p + Σ_{h=1}^{n1} Σ_{i=h+n1−p+1}^{n1} ch,i;j Eh,i, t ∈ {0, 1},

and B″j,j+n1−p−1 = 0 and B‴j,j+n1−p−1 = 0, as products of nilpotent matrices whose orders are smaller than the number of factors. Applying the induction hypothesis we obtain that Ei,i+n1−q−1 ∈ Ln1−1(S) for all q = 2, …, p − 1, i = 1, …, q, and E1,n1 ∈ Ln1−1(S) as shown above.


Therefore, Bj,j+n1−p−1 − a(k0, 0)^t Ej,j+n1−p ∈ Ln1−1(S). Hence we obtain that

Ej,j+n1−p = (a(k0, 0))^{−t}(Bj,j+n1−p−1 − (Bj,j+n1−p−1 − (a(k0, 0))^t Ej,j+n1−p)) ∈ Ln1−1(S), j = 1, …, p.

2. For j = 1, …, n1 − 1 consider the products Bj,j = Bj(E − a^{−1}A0)^{n3−1}(E − A0)^{n1−n3−1}, j ≠ k0, and Bk0,k0 = Bk0(E − a^{−1}A0)^{n3}(E − A0)^{p−n3−1}, Bj,j ∈ Ln1−1(S). We have

B′j,j = Ej,j+1 + γj Ek0,k0+1 + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} ch,i;j Eh,i, j ≠ k0,

B′k0,k0 = a(k0, 0)Ek0,k0+1 + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} ch,i;k0 Eh,i,

and B″r,r = 0 and B‴r,r = 0 as products of nilpotent matrices whose orders are smaller than the number of factors. Together with paragraph 1 this gives Ej,j+n1−p ∈ Ln1−1(S), j = 1, …, p.

3. We have (E − A0)^{n2}(E − a^{−1}A0)^{n3} = Σ_{i=1}^{n1} Ei,i + Σ_{1≤h<i≤n1} λh,i Eh,i ∈ Ln2+n3(S) ⊆ Ln1−1(S), whence Σ_{i=1}^{n1} Ei,i ∈ Ln1−1(S). Hence Ln1−1(S) = An1,n2,n3, therefore l(An1,n2,n3) ≤ n1 − 1.

The lower bound l(An1,n2,n3) ≥ n1 − 1 is obtained as before. Consequently, l(An1,n2,n3) = n1 − 1.

The following corollary shows in particular that the length ratio for a three-block algebra and its subalgebra can also take on many different values, namely any rational value in [1, 2).

Corollary 3. Let F be an arbitrary field, |F| ≥ 3, let n1, n2, n3 ∈ N, n1 ≥ n2 + n3 + 2, n2 ≥ n3 ≥ 3. Let a ∈ F, a ≠ 0, 1, and let

Cn1,n2,n3 = Σ_{i=1}^{n1−1} Ei,i+1 + Σ_{j=n1+1}^{n1+n2−1} (Ej,j + Ej,j+1) + En1+n2,n1+n2 + Σ_{k=n1+n2+1}^{n1+n2+n3−1} (aEk,k + Ek,k+1) + aEn1+n2+n3,n1+n2+n3 ∈ An1,n2,n3

be a nonderogatory matrix, and let

A′n1,n2,n3 = ⟨Cn1,n2,n3^j | 0 ≤ j ≤ n1 + n2 + n3 − 1⟩ ⊆ An1,n2,n3.

Then

1. l(A′n1,n2,n3) = n1 + n2 + n3 − 1;


2. l(A′n1,n2,n3) − l(An1,n2,n3) = n2 + n3;
3. (l(A′n1,n2,n3) + 1)/(l(An1,n2,n3) + 1) = 1 + (n2 + n3)/n1 < 2.

Remark 3. Let us denote An1 = ⟨E^{(n1)}, Ei,j, 1 ≤ i < j ≤ n1⟩ ⊂ Tn1(F). Notice that An1,n2,n3 = An1 ⊕ An2,n3 and l(An1,n2,n3) = l(An1) = max(l(An1), l(An2,n3)). That is, we obtain another example providing sharpness of the lower bound in (1).

3.3 Examples

We now give examples of algebras whose lengths bound the lengths of their subalgebras.

Corollary 4. Let F be an arbitrary field, n, m ∈ N, n − m ≥ 2, and let An,m be the algebra introduced in Theorem 11. Let also

B = ⟨Ei,j, 1 ≤ i < j ≤ n, E, Σ_{i=1}^{n} Ei,i, N1, …, Np ∈ 0 ⊕ Nm(F)⟩ ⊆ An,m.

Then l(B) = n − 1 = l(An,m).

Example 3. Let F be an arbitrary field, let A ⊆ Tn(F) be a subalgebra of the upper triangular matrix algebra. Then l(A) ≤ l(Tn(F)).

Proposition 5. Let F be an arbitrary field, let A be a finite-dimensional F-algebra, and let B ⊆ A be such that there exist a1, …, an ∈ A satisfying ⟨B, a1, …, an⟩ = A and aib, bai ∈ ⟨a1, …, an⟩ for all b ∈ B. Then l(B) ≤ l(A).

Proof. Let SB be a generating set for B. Then SA = SB ∪ {a1, …, an} is a generating set for A of the length l(SB). Then l(A) ≥ l(SA) = l(SB) and therefore l(A) ≥ max_{SB} l(SB) = l(B).

Let us give some examples of algebras satisfying the condition of Proposition 5.

Example 4. Let F be an arbitrary field, let A be a subalgebra of Tn(F) and let B = A ∩ Dn(F). Then l(B) ≤ l(A).

Example 5. Let F be an arbitrary field and let A, B be finite-dimensional F-algebras. Then A ⊂ A ⊕ B and l(A) ≤ l(A ⊕ B).

The author is greatly indebted to her supervisor Dr. A. E. Guterman for the attention given to this work and for useful discussions.


References

1. T. J. Laffey, Simultaneous Reduction of Sets of Matrices under Similarity, Linear Algebra and its Applications, 84 (1986), 123–138.
2. W. E. Longstaff, Burnside's theorem: irreducible pairs of transformations, Linear Algebra and its Applications, 382 (2004), 247–269.
3. C. J. Pappacena, An Upper Bound for the Length of a Finite-Dimensional Algebra, Journal of Algebra, 197 (1997), 535–545.
4. A. Paz, An Application of the Cayley–Hamilton Theorem to Matrix Polynomials in Several Variables, Linear and Multilinear Algebra, 15 (1984), 161–170.
5. A. J. M. Spencer, R. S. Rivlin, The Theory of Matrix Polynomials and its Application to the Mechanics of Isotropic Continua, Archive for Rational Mechanics and Analysis, 2 (1959), 309–336.
6. A. J. M. Spencer, R. S. Rivlin, Further Results in the Theory of Matrix Polynomials, Archive for Rational Mechanics and Analysis, 4 (1960), 214–230.
7. O. V. Markova, On the length of the upper triangular matrix algebra, Uspekhi Matem. Nauk, 60 (2005), no. 5, 177–178 [in Russian]; English translation: Russian Mathematical Surveys, 60 (2005), no. 5, 984–985.
8. O. V. Markova, Length computation of matrix subalgebras of special type, Fundamental and Applied Mathematics, 13 (2007), Issue 4, 165–197.

On a New Class of Singular Nonsymmetric Matrices with Nonnegative Integer Spectra

Tatjana Nahtman¹,⋆ and Dietrich von Rosen²

¹ Institute of Mathematical Statistics, University of Tartu, Estonia, tatjana.nahtman@ut.ee; Department of Statistics, University of Stockholm, Sweden, tatjana.nahtman@statistics.su.se
² Department of Biometry and Engineering, Swedish University of Agricultural Sciences, dietrich.von.rosen@bt.slu.se

Abstract. The objective of this paper is to consider a class of singular nonsymmetric matrices with integer spectrum. The class comprises generalized triangular matrices with diagonal elements obtained by summing the elements of the corresponding column. If the size of a matrix belonging to the class equals n × n, the spectrum of the matrix is given by the sequence of distinct non-negative integers up to n − 1, irrespective of the elements of the matrix. Right and left eigenvectors are obtained. Moreover, several interesting relations are presented, including factorizations via triangular matrices.

Keywords: eigenvectors, generalized triangular matrix, integer spectrum, nonsymmetric matrix, triangular factorization, Vandermonde matrix.

1 Introduction

In this paper we consider a new class of singular matrices with remarkable algebraic properties. For example, the spectrum of a matrix belonging to this class depends only on the size of the matrix and not on the specific elements of this matrix. Moreover, the spectrum entirely consists of the successive non-negative integer values 0, 1, …, n − 1. A special case of this class of matrices originates from statistical sampling theory (Bondesson & Traat, 2005, 2007). In their papers, via sampling theory (the Poisson sampling design) as well as some analytic proofs, eigenvalues and eigenvectors were presented. Their proofs remind one of the use of Lagrangian polynomials, which for example are used when finding the inverse of a Vandermonde matrix (e.g. see Macon & Spitzbart, 1958; El-Mikkawy, 2003). We have not found any other work related to the matrix class which we are going to consider.

⋆ The work of T. Nahtman was supported by the grant GMTMS6702 of the Estonian Research Foundation.

Properties of singular matrix with integer spectrum

The main purpose of this paper is to introduce the class, show some basic algebraic properties, show how to factor the class and demonstrate how to find eigenvalues and eigenvectors of matrices belonging to the class. The paper focuses more on the presentation of results than on giving complete proofs of the most general versions of the theorems.

Definition 1. A square nonsymmetric matrix B = (bij) of order n belongs to the Bn-class if its elements satisfy the following conditions:

bii = Σ_{j=1, j≠i}^{n} bji, i = 1, …, n,   (1)

bij + bji = 1, j ≠ i, i, j = 1, …, n,   (2)

bij − bik = bij bki / bkj, bkj ≠ 0, j ≠ k, i ≠ k, j; i, j, k = 1, …, n.   (3)

Instead of (3) one may use bkj = bij bki/(bij − bik) or bij bkj = bik bkj + bij bki. Relation (2) defines a generalized triangular structure, and it can be shown that (3) is a necessary and sufficient condition for the class to have the non-negative integer spectrum consisting of the distinct integers {0, 1, …, n − 1}. Due to (1), the sum of the elements in each row equals n − 1. Another matrix with integer eigenvalues and row element sum equal to n − 1, with many applications in various fields, is the well-known tridiagonal Kac matrix (Clement matrix); see Taussky & Todd (1991). Moreover, for any B ∈ Bn with positive elements we may consider (n − 1)^{−1}B as a transition matrix with interesting symmetric properties reflected by the equidistant integer spectrum. When B ∈ B3,

        ( b21 + b31   b12         b13       )   ( b21 + b31   1 − b21         1 − b31       )
    B = ( b21         b12 + b32   b23       ) = ( b21         1 − b21 + b32   1 − b32       )   (4)
        ( b31         b32         b13 + b23 )   ( b31         b32             2 − b31 − b32 )

It is worth observing that any B ∈ Bn is a sum of three matrices: an upper triangular matrix, a diagonal matrix and a skew-symmetric matrix. For (4) we have

        ( 0  1  1 )   ( b21 + b31   0             0           )   ( 0     −b21   −b31 )
    B = ( 0  1  1 ) + ( 0           −b21 + b32    0           ) + ( b21   0      −b32 )
        ( 0  0  2 )   ( 0           0             −b31 − b32  )   ( b31   b32    0    )

Note that the eigenvalues {0, 1, 2} of B are found on the diagonal of the upper triangular matrix, irrespective of the values of (bij) as long as they satisfy (1)–(3).
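The "irrespective of the elements" claim can be checked numerically. The following sketch is an illustration (not part of the paper): it builds B ∈ Bn from arbitrary distinct nonzero values ri via (5) below, fills the diagonal by (1), and verifies exactly — in rational arithmetic — that conditions (2) and (3) hold and that every k ∈ {0, …, n − 1} is an eigenvalue.

```python
from fractions import Fraction
from itertools import permutations

def bn_matrix(r):
    """B in the B_n-class built from distinct nonzero r_i via b_ij = r_i/(r_i - r_j),
    with diagonal entries b_ii given by the column sums of condition (1)."""
    n = len(r)
    B = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                B[i][j] = Fraction(r[i], r[i] - r[j])
    for i in range(n):
        B[i][i] = sum(B[j][i] for j in range(n) if j != i)
    return B

def det(M):
    """Exact determinant via the Leibniz formula (adequate for small n)."""
    n = len(M)
    total = Fraction(0)
    for p in permutations(range(n)):
        sign = 1
        for a in range(n):              # count inversions for the permutation sign
            for b in range(a + 1, n):
                if p[a] > p[b]:
                    sign = -sign
        term = Fraction(sign)
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

r = [1, 2, 5, -3]          # any distinct nonzero values will do
B = bn_matrix(r)
n = len(B)
# condition (2): b_ij + b_ji = 1 off the diagonal
assert all(B[i][j] + B[j][i] == 1 for i in range(n) for j in range(n) if i != j)
# condition (3) in the product form b_ij b_kj = b_ik b_kj + b_ij b_ki
assert all(B[i][j] * B[k][j] == B[i][k] * B[k][j] + B[i][j] * B[k][i]
           for i in range(n) for j in range(n) for k in range(n)
           if len({i, j, k}) == 3)
# spectrum: det(B - k*I) vanishes for k = 0, 1, ..., n-1
for k in range(n):
    M = [[B[i][j] - (k if i == j else 0) for j in range(n)] for i in range(n)]
    assert det(M) == 0
```

Since the characteristic polynomial is monic of degree n and the n values 0, …, n − 1 are all roots, the spectrum is exactly {0, 1, …, n − 1}.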


T. Nahtman, D. von Rosen

In the Conditional Poisson sampling design (e.g., see Aires, 1999),

bij = pi(1 − pj)/(pi − pj)

are used to calculate conditional inclusion probabilities, where the pi's are inclusion probabilities under the Poisson design. Bondesson & Traat (2005, 2007) generalized this expression somewhat and considered

bij = ri/(ri − rj),   (5)

where the ri's are arbitrary distinct values. In this paper, instead of (5), we assume (3) to hold. Note that any bij satisfying (5) also satisfies (3). For the matrix defined via the elements in (5), Bondesson & Traat (2005, 2007) presented eigenvalues, and right and left eigenvectors. They expressed their results as functions of the ri in (5), whereas in this paper we will express the results in terms of bij, i.e. the elements of B ∈ Bn. Moreover, the proofs of all results in this paper are purely algebraic, whereas Bondesson & Traat (2005, 2007) indicated proofs based on series expansions and identification of coefficients. It is, however, not clear how to apply their results to the Bn-class of matrices given in Definition 1. Moreover, the algebraic approach of this paper opens up a world of interesting relations, in particular the triangular factorization of matrices in the Bn-class, presented in Section 4. As noted before, it follows from (3) that

bkj = bij(1 − bik)/(bij − bik) = bij bki/(bij − bik).   (6)

Hence, any row in B, B ∈ Bn, generates all other elements and thus there are at most n − 1 functionally independent elements in B. For example, we may use b1j, j = 2, 3, . . . , n, to generate all other elements in B. Furthermore, if we choose for rj in (5), without loss of generality,

r1 = 1   and   rj = −bj1/b1j,   j = 2, 3, . . . , n,

it follows that

b1j = 1/(1 − rj)

and

bij = [ (1/(1−rj)) (1 − 1/(1−ri)) ] / [ 1/(1−rj) − 1/(1−ri) ] = ri/(ri − rj).

Thus, all bij's can be generated by the above choice of rj. This means that a matrix defined by (5), as considered in Bondesson & Traat (2005, 2007), is a

Properties of singular matrix with integer spe trum

143

canonical version of any matrix defined through (3), assuming that (1) and (2) hold. The class Bn can be generalized in a natural way.
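Since the parametrization (5) automatically satisfies (1)–(3), it gives a convenient way to check the integer-spectrum claim numerically. A minimal sketch in Python/NumPy (the values of ri below are an arbitrary hypothetical choice of distinct numbers, not taken from the paper):

```python
import numpy as np

r = np.array([1.0, 2.0, 4.0, 7.0])                      # arbitrary distinct values, eq. (5)
n = len(r)
b = r[:, None] / (r[:, None] - r[None, :] + np.eye(n))  # off-diagonal b_ij = r_i/(r_i - r_j)
np.fill_diagonal(b, 0.0)
B = b + np.diag(b.sum(axis=0))                          # diagonal entries b_ii = sum_{k!=i} b_ki

off = b + b.T + np.eye(n)                               # condition b_ij + b_ji = 1 off the diagonal
assert np.allclose(off, np.ones((n, n)))
# spectrum is exactly {0, 1, ..., n-1}
assert np.allclose(np.sort(np.linalg.eigvals(B).real), np.arange(n))
```

The same construction is reused in the later sketches whenever a concrete member of Bn is needed.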

Definition 2. The matrix Bn,k : (n − k + 2) × (n − k + 2), k = 2, . . . , n, is obtained from the matrix B, B ∈ Bn, by elimination of k − 2 consecutive rows and columns starting from the second row and column, with corresponding adjustments in the main diagonal.

The paper consists of five sections. In Section 2 some basic and fundamental relations for any B ∈ Bn are given which will be used in the subsequent sections. Section 3 consists of a straightforward proof concerning the spectrum of any B ∈ Bn. In Section 4 we consider a factorization of B ∈ Bn into a product of three triangular matrices. Finally, in Section 5 expressions for left and right eigenvectors are presented. Several proofs of theorems are omitted due to lengthy calculations; for further details we refer to the technical report Nahtman & von Rosen (2007). All proofs of this paper could easily have been presented for, say, n < 7, but for a general n we rely on induction, which is more difficult to look through. There is certainly space for improving the proofs, and this is another reason for omitting them. In the present paper only real-valued matrices are considered, although the generalization to matrices with complex-valued entries could be performed fairly easily.

2

Preparations

This section shows some relations among the elements in B ∈ Bn which are of utmost importance for the subsequent presentation.

Theorem 1. Let B ∈ Bn. For all n > 1:

(i) The sum of the products of the off-diagonal row elements equals 1:

    Σ_{i=1}^{n} Π_{j=1, j≠i}^{n} bij = 1.

(ii) The sum of the products of the off-diagonal column elements equals 1:

    Σ_{j=1}^{n} Π_{i=1, i≠j}^{n} bij = 1.


Proof. Because of symmetry only (i) is proven. For n = 2 the trivial relation b12 + b21 = 1 is obtained. Moreover, for n = 3,

Σ_{i=1}^{3} Π_{j=1, j≠i}^{3} bij = b12 b13 + b21 b23 + b31 b32 = b12 − b12 b31 + b21 − b21 b32 + b31 b32
= 1 − (b12 − b13) b32 − b21 b32 + b31 b32 = 1 − (b12 + b21) b32 + (b13 + b31) b32 = 1,

where in the second equality (3) is utilized. Now it is assumed that the theorem is true for n − 1, i.e.

Σ_{i=1}^{n−1} Π_{j=1, j≠i}^{n−1} bij = 1,                                (7)

which by symmetry yields

Σ_{i=1, i≠k}^{n} Π_{j=1, j≠i, j≠k}^{n} bij = 1,   k = 1, 2, . . . , n.   (8)

From here on a chain of calculations is started:

Σ_{i=1}^{n} Π_{j=1, j≠i}^{n} bij
= Σ_{i=1}^{n−1} ( Π_{j=1, j≠i}^{n−1} bij ) bin + Π_{j=1}^{n−1} bnj
= Σ_{i=1}^{n−2} ( Π_{j=1, j≠i}^{n−2} bij ) bi,n−1 bin + Π_{j=1}^{n−2} bn−1,j · bn−1,n + Π_{j=1}^{n−2} bnj · bn,n−1
= Σ_{i=1}^{n−2} ( Π_{j=1, j≠i}^{n−2} bij ) bi,n−1 (1 − bni) + Π_{j=1}^{n−2} bn−1,j (1 − bn,n−1) + Π_{j=1}^{n−2} bnj · bn,n−1.   (9)

Since by the induction assumption

Σ_{i=1}^{n−2} ( Π_{j=1, j≠i}^{n−2} bij ) bi,n−1 + Π_{j=1}^{n−2} bn−1,j = 1,

the last expression in (9) equals

1 − Σ_{i=1}^{n−2} ( Π_{j=1, j≠i}^{n−2} bij ) (bi,n−1 − bin) bn,n−1 − Π_{j=1}^{n−2} bn−1,j · bn,n−1 + Π_{j=1}^{n−2} bnj · bn,n−1,   (10)


where (3) has been used: bi,n−1 bni = (bi,n−1 − bin) bn,n−1. Reshaping (10) we obtain

1 − Σ_{i=1}^{n−1} ( Π_{j=1, j≠i}^{n−1} bij ) bn,n−1 + Σ_{i=1, i≠n−1}^{n} ( Π_{j=1, j≠i, j≠n−1}^{n} bij ) bn,n−1,   (11)

and using the induction assumption, i.e. (7) as well as (8), we see that (11) is indeed equal to 1 − bn,n−1 + bn,n−1 = 1, and the theorem is proved. ⊓⊔

Corollary 1. Let B ∈ Bn. For all n > 1,

Σ_{i=1}^{n−1} Π_{j=1, j≠i}^{n} bij = 1 − Π_{j=1}^{n−1} bnj.

Corollary 2. Let B ∈ Bn. For every integer a such that a < n,

Σ_{i=a}^{n} Π_{j=a, j≠i}^{n} bij = 1.
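Theorem 1 and Corollary 2 are easy to verify numerically for a member of B4 generated via (5); a minimal sketch (the ri values are an arbitrary hypothetical choice):

```python
import numpy as np

r = np.array([1.0, 2.0, 4.0, 7.0])      # arbitrary distinct values, eq. (5)
n = len(r)
b = r[:, None] / (r[:, None] - r[None, :] + np.eye(n))  # b_ij = r_i/(r_i - r_j), i != j
np.fill_diagonal(b, 0.0)

mask = ~np.eye(n, dtype=bool)
row = sum(np.prod(b[i, mask[i]]) for i in range(n))      # Theorem 1 (i)
col = sum(np.prod(b[mask[:, j], j]) for j in range(n))   # Theorem 1 (ii)
assert np.isclose(row, 1.0) and np.isclose(col, 1.0)

# Corollary 2: the same identity holds on any trailing index set {a, ..., n}
a = 1
sub = b[a:, a:]
msk = ~np.eye(n - a, dtype=bool)
assert np.isclose(sum(np.prod(sub[i, msk[i]]) for i in range(n - a)), 1.0)
```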

Theorem 2. Let B ∈ Bn and put cij = b−1ij bji. Then,

(i) c−1ij = cji, i ≠ j;
(ii) cki cjk = −cji, k ≠ i, j ≠ k, i ≠ j   (cancellation);
(iii) cki clj = ckj cli, k ≠ i, j; l ≠ i, j   (exchangeability).

Proof. (i) follows immediately from the definition of cij. For (ii) it is observed that (see (3))

bij bki / bkj = −bji bik / bjk,

and hence

cki cjk = (bik/bki)(bkj/bjk) = −(bkj/bki) · bij bki/(bji bkj) = −bij/bji = −cji.

Concerning (iii), it is noted that cki clj = cki clj cil cli = −cki cij cli = ckj cli. ⊓⊔
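The cancellation and exchangeability rules can likewise be checked numerically; a small sketch under the same hypothetical r-parametrization (5):

```python
import numpy as np

r = np.array([1.0, 2.0, 4.0, 7.0])          # arbitrary distinct values, eq. (5)
n = len(r)
b = r[:, None] / (r[:, None] - r[None, :] + np.eye(n))
np.fill_diagonal(b, 0.0)
c = b.T / (b + np.eye(n))                   # c_ij = b_ji / b_ij (diagonal harmlessly set to 0)

i, j, k, l = 0, 1, 2, 3
assert np.isclose(c[i, j] * c[j, i], 1.0)                 # (i)   c_ij^{-1} = c_ji
assert np.isclose(c[k, i] * c[j, k], -c[j, i])            # (ii)  cancellation
assert np.isclose(c[k, i] * c[l, j], c[k, j] * c[l, i])   # (iii) exchangeability
```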


Throughout the paper the following abbreviations for two types of multiple sums will be used; both will frequently be applied in the subsequent:

Σ^{[m,n]}_{i1 ≤ ··· ≤ ik}   and   Σ^{[m,n]}_{i1 < ··· < ik},

denoting k-fold sums over indices i1, . . . , ik running from m to n, ordered weakly and strictly, respectively.

In the next theorem, Un and Vn from the previous theorem are presented elementwise.

Theorem 10. Let Un = (uij) and Vn = (vij) be given by (21) and (22), respectively. Then,

uij = (−b1j)^{I{j>1}} Π_{k=2, k≠j}^{i} bjk,   i ≥ j,                      (23)

Properties of singular matrix with integer spe trum

151

and

vij = (−bj1/b1j)^{I{j>1}} Π_{k=1}^{j−1} b−1ik,   i ≥ j.                   (24)

Example 1. For n = 4 the matrices U4 and V4 are given by

U4 = [ 1            0             0             0
       b12          −b12          0             0
       b12 b13      −b12 b23      −b13 b32      0
       b12 b13 b14  −b12 b23 b24  −b13 b32 b34  −b14 b42 b43 ],

V4 = [ 1   0                0                    0
       1   −1/b12           0                    0
       1   −b21/(b12 b31)   −1/(b13 b32)         0
       1   −b21/(b12 b41)   −b31/(b13 b41 b42)   −1/(b14 b42 b43) ].

The matrices Un and Vn may also be related to Theorem 7.

Theorem 11. Let Un and Vn be given by (21) and (22), respectively. Then,

Un = Π_{i=0}^{n−2} Diag(In−i−2, Un,n−i),                                  (25)

Vn = Π_{i=0}^{n−2} Diag(Ii, Un,2+i),                                      (26)

where Un,k is defined in (19).

Before considering the VTU-decomposition, i.e. the factorization Un B Vn = Tn, which is one of the main theorems of the paper, where Tn is a triangular matrix specified below in Theorem 12, a technical lemma stating another basic property of B ∈ Bn is presented. Once again the proof is omitted.

Lemma 1. Let B ∈ Bn and let (U21n : U22n) be the last row in Un, given in (21). Then,

(U21n : U22n) B = 0.

Theorem 12. (VTU-decomposition) Let B ∈ Bn, and let Un and Vn = U−1n be the triangular matrices given by (21) and (22), respectively. Then Un B Vn = Tn, where the upper triangular Tn equals

Tn = Σ_{k=1}^{n} (n − k) ek e′k + Σ_{r=3}^{n} Σ_{k=1}^{r−2} Σ_{l=k+1}^{r−1} ( Π_{m=k+1}^{l} b−1rm ) blr ek e′l
   − Σ_{r=3}^{n} Σ_{k=1}^{r−2} ( Π_{m=k+1}^{r−1} b−1rm ) ek e′r − Σ_{k=1}^{n−1} ek e′k+1.


Proof. After the proof we show some details for n = 3. Suppose that Un−1 Bn−1 Vn−1 = Tn−1 holds, where Bn−1 ∈ Bn−1. Using the notation of Theorem 9,

Un B Vn = [ Un−1   0    ]  B  [ Vn−1   0    ]
          [ U21n   U22n ]     [ V21n   V22n ],

and let B be partitioned as

B = [ B11  B12 ]   with blocks of sizes (n−1)×(n−1), (n−1)×1,
    [ B21  B22 ]                        1×(n−1), 1×1.

From Lemma 1 it follows that (U21n : U22n) B = 0, and thus

Un B Vn = [ Un−1 B11 Vn−1 + Un−1 B12 V21n   Un−1 B12 V22n ]
          [ 0                               0             ].              (27)

The blocks of the non-zero elements should be studied in some detail. Thus, one has to show that Un−1 B12 V22n equals the first n − 1 elements in the nth column of Tn. Let T = (tij), where tij = 0 if i > j. For example, for the second element in Un−1 B12 V22n:

−(−b12 b2n + b12 b1n) b−11n Π_{m=2}^{n−1} b−1nm = −bn2 b1n b−11n Π_{m=2}^{n−1} b−1nm = −Π_{m=3}^{n−1} b−1nm,

which equals t2n.

whi h equals t2n . For Un−1 B11 Vn−1 + Un−1 B12 Vn21 , given in (27), it is noted that this expression equals Un−1 Bn−1 Vn−1 + I −

n−1 X

n−1 X

bin Un−1 di di′ Vn−1 +

i=1

Un−1 bin di Vn21

(28)

i=1

and then the two last terms in (28) should be exploited. After some al ulations this will give a useful re ursive relation between Un BVn and Un−1 Bn−1 Vn−1 :

Un Bn Vn =(In−1 : 0) ′ Un−1 Bn−1 Vn−1 (In−1 : 0) −

n−2 X

n−1 Y

′ b−1 nm ek en −

k=1 m=k+1

en−1 en′ +

n−1 X k=1

ek ek′ +

n−2 X n−1 X

l Y

′ b−1 nm bln ek el .

k=1 l=k+1 m=k+1

By utilizing this expression together with the indu tion assumption about Un−1 Bn−1 Vn−1 = Tn−1 leads to the Tn of the theorem. ⊓ ⊔


Corollary 3. Let Tn = (tij) be the upper triangular matrix defined in Theorem 12. Then the elements of Tn are given by

tij = Σ_{k=j+1}^{n} Π_{l=i+1}^{j} b−1kl − Σ_{k=i}^{j−1} tik = Σ_{k=j+1}^{n} Π_{l=i+1}^{j} b−1kl − I{j>i} Σ_{k=j}^{n} Π_{l=i+1}^{j−1} b−1kl.

Observe that the expression implies that tii = n − i. Moreover, Tn 1 = 0. The structure of the matrix Tn is the following:

T′n = [ n−1                              0     ···   0               0    0
        Σ_{i′=3}^{n} b−1i′2 − (n−1)      n−2   ···   0               0    0
        ⋮                                ⋮     ⋱     ⋮               ⋮    ⋮
        ⋮                                ⋮           2               0    0
        ⋮                                ⋮           b−1n,n−1 − 2    1    0
        −Π_{j′=2}^{n−1} b−1nj′           ···         −b−1n,n−1       −1   0 ].

This section is ended by showing some detailed calculations for n = 3.

Example 2. For n = 3,

T3 = [ 2   b−132 − 2   −b−132
       0   1           −1
       0   0           0      ].

From (23) and (24) in Theorem 10 we have

U3 = [ 1        0         0
       b12      −b12      0
       b12 b13  −b12 b23  −b13 b32 ],

V3 = [ 1   0                0
       1   −b−112           0
       1   b23 b−113 b−132  −b−113 b−132 ].

We are going to show that V3 T3 U3 = B ∈ B3. Now

T3 U3 = [ b21 + b31   b12        b13
          b12 b31     −b12 b32   b13 b32
          0           0          0       ]

and

V3 T3 U3 = [ b21 + b31   b12         b13
             b21         b12 + b32   b23
             b31         b32         b13 + b23 ] = B,

where in the above calculations we have used (3) and Theorem 2 (ii).
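The factorization Un B Vn = Tn of Theorem 12 can be verified numerically by building Un elementwise from (23) and conjugating a B ∈ B4 generated via (5); a minimal sketch (the ri values are an arbitrary hypothetical choice):

```python
import numpy as np

r = np.array([1.0, 2.0, 4.0, 7.0])        # arbitrary distinct values, eq. (5)
n = len(r)
b = r[:, None] / (r[:, None] - r[None, :] + np.eye(n))  # b_ij = r_i/(r_i - r_j)
np.fill_diagonal(b, 0.0)
B = b + np.diag(b.sum(axis=0))            # b_ii = sum_{k!=i} b_ki

# U_n elementwise via (23): u_ij = (-b_1j)^{I(j>1)} * prod_{k=2..i, k!=j} b_jk, for i >= j
U = np.zeros((n, n))
for i in range(1, n + 1):
    for j in range(1, i + 1):
        u = -b[0, j - 1] if j > 1 else 1.0
        for k in range(2, i + 1):
            if k != j:
                u *= b[j - 1, k - 1]
        U[i - 1, j - 1] = u

T = U @ B @ np.linalg.inv(U)              # V_n = U_n^{-1}, so T should equal T_n
assert np.allclose(np.tril(T, -1), 0)                        # T is upper triangular
assert np.allclose(np.diag(T), np.arange(n - 1, -1, -1))     # t_ii = n - i
assert np.allclose(T @ np.ones(n), 0)                        # T_n 1 = 0
```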

5

Eigenvectors of the matrix B

It is already known from Theorem 5 that the matrix B ∈ Bn has eigenvalues {0, 1, . . . , n − 1}. This can also be seen from the structure of the matrix Tn given in Corollary 3 and the fact that the matrices B and Tn are similar, i.e. Un B U−1n = Tn. The right eigenvectors of the matrix B are of special interest in sampling theory when B is a function of the inclusion probabilities, outlined in the Introduction. We are going to present the eigenvectors of the matrix B ∈ Bn in a general form. From Section 2 we know that Un B U−1n = Tn, where the matrix Tn is an upper-triangular matrix given by Theorem 12. Since B and Tn are similar, they have the same eigenvalues, and the eigenvectors of B are rather easy to obtain using the eigenvectors of Tn. In the next theorem we shall obtain explicit expressions for the eigenvectors of the matrix Tn.

Theorem 13. Let Tn be given by Theorem 12. Then there exist upper triangular matrices VT and UT, with UT = V−1T, such that

Tn = UT Λ VT,   Λ = diag(n − 1, n − 2, . . . , 1, 0).                     (29)

The matrix UT = (uij) is given by

uij = 1 + Σ_{g=1}^{j−i} (−1)^g Σ^{[j+1,n]}_{i1<···<ig} ··· ,

where Σ^{[j+1,n]} is the multiple sum introduced in Section 2.

A−1 > 0; Au > 0 for some vector u > 0; all the eigenvalues of A have positive real parts.

Riccati equations associated with an M-matrix

181

Theorem 4. For a Z-matrix A it holds that: A is an M-matrix if and only if there exists a nonzero vector v > 0 such that Av > 0 or a nonzero vector w > 0 such that wT A > 0.

The equivalence of (a) and (c) in Theorem 3 implies the next result.

Lemma 5. Let A be a nonsingular M-matrix. If B > A is a Z-matrix, then B is also a nonsingular M-matrix.

The following well-known result concerns the properties of Schur complements of M-matrices.

Lemma 6. Let M be a nonsingular M-matrix or an irreducible singular M-matrix. Partition M as

M = [ M11  M12
      M21  M22 ],

where M11 and M22 are square matrices. Then M11 and M22 are nonsingular M-matrices. The Schur complement of M11 (or M22) in M is also an M-matrix (singular or nonsingular according to M). Moreover, the Schur complement is irreducible if M is irreducible.

2.2  The dual equation

Reverting the coefficients of equation (1) yields the dual equation

Y B Y − Y A − D Y + C = 0,                                                (4)

which is still a NARE, associated with the matrix

N = [ A    −B
      −C    D ],

which is a nonsingular M-matrix or an irreducible singular M-matrix if and only if the matrix M is so. In fact N is clearly a Z-matrix and N = Π M Π, where Π = Π−1 is the matrix which permutes the blocks of M. So, if Mv > 0 for v > 0, then N Π v > 0 and, by Theorem 4, N is an M-matrix.

Existence of nonnegative solutions

The special structure of the matrix M of (2) allows one to prove the existence of a minimal nonnegative solution S of (1), i.e., a solution S ≥ 0 such that X − S ≥ 0 for any solution X ≥ 0 to (1). See [20] and [21] for more details.

Theorem 7. Let M in (2) be an M-matrix. Then the NARE (1) has a minimal nonnegative solution S. If M is irreducible, then S > 0 and A − SC and D − CS are irreducible M-matrices. If M is nonsingular, then A − SC and D − CS are nonsingular M-matrices.

182

D. Bini, B. Iannazzo, B. Meini, F. Poloni

Observe that the above theorem holds for the dual equation (4) and guarantees the existence of a minimal nonnegative solution of (4), which is denoted by T.

2.4  The eigenvalue problem associated with the matrix equation

A useful technique frequently encountered in the theory of matrix equations consists in relating the solutions to some invariant subspaces of a matrix polynomial. In particular, the solutions of (1) can be described in terms of the invariant subspaces of the matrix

H = [ D  −C
      B  −A ],                                                            (5)

which is obtained by premultiplying the matrix M by J = [ In  0; 0  −Im ]. In fact, if X is a solution of equation (1), then, by direct inspection,

H [ In; X ] = [ In; X ] R,                                                (6)

where R = D − CX. Moreover, the eigenvalues of the matrix R are a subset of the eigenvalues of H. Conversely, if the columns of the (n + m) × n matrix [ Y; Z ] span an invariant subspace of H, and Y is a nonsingular n × n matrix, then Z Y−1 is a solution of the Riccati equation. In fact,

H [ Z; T ] = [ Z; T ] V

for some V, from which, post-multiplying by Z−1, one obtains

H [ I; T Z−1 ] = [ Z; T ] V Z−1 = [ I; T Z−1 ] Z V Z−1;

setting X = T Z−1 one has D − CX = Z V Z−1 and B − AX = XD − XCX. Similarly, for the solutions of the dual equation it holds that

H [ Y; Im ] = [ Y; Im ] U,

where U = B Y − A. The eigenvalues of the matrix U are a subset of the eigenvalues of H.

2.5  The eigenvalues of H

We say that a set A of k complex numbers has a (k1, k2) splitting with respect to the unit circle if k = k1 + k2 and A = A1 ∪ A2, where A1 is formed by k1 elements of modulus at most 1 and A2 is formed by k2 elements of modulus at least 1. Similarly, we say that A has a (k1, k2) splitting with respect to the imaginary axis if k = k1 + k2 and A = A1 ∪ A2, where A1 is formed by k1 elements with nonpositive real part and A2 is formed by k2 elements with nonnegative real part. We say that the splitting is complete if at least one of the sets A1 or A2 has no elements on its boundary. Since the eigenvalues of an M-matrix have nonnegative real part, it follows that the eigenvalues of H have an (m, n) splitting with respect to the imaginary axis. This property is proved in the next theorem.

Theorem 8. Let M be an irreducible M-matrix. Then the eigenvalues of H = JM have an (m, n) splitting with respect to the imaginary axis. Moreover, the only eigenvalue that can lie on the imaginary axis is 0.

Proof. Let v > 0 be the only positive eigenvector of M, and let λ ≥ 0 be the associated eigenvalue; define Dv = diag(v). The matrix M̃ = D−1v M Dv has the same eigenvalues as M; moreover, it is an M-matrix such that M̃e = λe. Due to the sign structure of M-matrices, this means that M̃ is diagonally dominant (strictly in the nonsingular case). Notice that H̃ = D−1v H Dv = J M̃, thus H̃ is diagonally dominant as well, with m negative and n positive diagonal entries. We apply Gershgorin's theorem [30, Sec. 14] to H̃; due to the diagonal dominance, the Gershgorin circles never cross the imaginary axis (in the singular case, they are tangent at 0). Thus, by a continuity argument we can say that m eigenvalues of H̃ lie in the negative half-plane and n in the positive one, and the only eigenvalues on the imaginary axis are the zero ones. But since H and H̃ are similar, they have the same eigenvalues. ⊓⊔

We can give a more precise result on the location of the eigenvalues of H after defining the drift of the Riccati equation. Indeed, when M is a singular irreducible M-matrix, by the Perron–Frobenius theorem the eigenvalue 0 is simple and there are positive vectors u and v such that

uT M = 0,   M v = 0,                                                      (7)

and both the vectors u and v are unique up to a scalar factor. Writing u = [u1; u2] and v = [v1; v2], with u1, v1 ∈ Rn and u2, v2 ∈ Rm, one can define

µ = uT2 v2 − uT1 v1 = −uT J v.                                            (8)

The number µ determines some properties of the Riccati equation. Depending on the sign of µ and following a Markov chain terminology, one can call µ the drift as in [6], and can classify the Riccati equations associated with a singular irreducible M-matrix in three categories:

(a) positive recurrent if µ < 0;
(b) null recurrent if µ = 0;
(c) transient if µ > 0.

In fluid queues problems, v coincides with the vector of ones. In general v and u can be computed by performing the LU factorization of the matrix M, say M = LU, and solving the two triangular linear systems uT L = [0, . . . , 0, 1] and U v = 0 (see [30, Sec. 54]).

The location of the eigenvalues of H is made precise in the following [20, 23]:

Theorem 9. Let M be a nonsingular or a singular irreducible M-matrix, and let λ1, . . . , λm+n be the eigenvalues of H = JM ordered by nonincreasing real part. Then λn and λn+1 are real and

Re λn+m ≤ · · · ≤ Re λn+2 < λn+1 ≤ 0 ≤ λn < Re λn−1 ≤ · · · ≤ Re λ1.

The minimal nonnegative solutions S and T of the equation (1) and of the dual equation (4), respectively, are such that σ(D − CS) = {λ1, . . . , λn} and σ(A − SC) = σ(A − BT) = {−λn+1, . . . , −λn+m}. If M is nonsingular then λn+1 < 0 < λn. If M is singular and irreducible then:

1. if µ < 0 then λn = 0 and λn+1 < 0;
2. if µ = 0 then λn = λn+1 = 0 and there exists only one eigenvector, up to a scalar constant, for the eigenvalue 0;
3. if µ > 0 then λn > 0 and λn+1 = 0.

We call λn and λn+1 the central eigenvalues of H. If H (and thus M) is nonsingular, then the central eigenvalues lie in two different half planes, so the splitting is complete. In the singular case the splitting is complete if and only if µ ≠ 0. The close to null recurrent case, i.e., the case µ ≈ 0, deserves particular attention, since it corresponds to an ill-conditioned null eigenvalue for the matrix H. In fact, if u and v are normalized such that ‖u‖2 = ‖v‖2 = 1, then 1/|µ| is the condition number of the null eigenvalue of the matrix H (see [19]). When M is singular irreducible, by the Perron–Frobenius theorem the eigenvalue 0 is simple, therefore H = JM has a one dimensional kernel and uT J and v are the unique (up to a scalar constant) left and right eigenvectors, respectively, corresponding to the eigenvalue 0. However, the algebraic multiplicity of 0 as an eigenvalue of H can be 2; in that case, the Jordan form of H has a 2 × 2 Jordan block corresponding to the 0 eigenvalue and it holds uT J v = 0 [31].
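The null vectors u, v of (7) and the drift (8) can be computed numerically; a minimal sketch that, for illustration, obtains u and v from the Perron eigenvectors of the nonnegative part of a hypothetical singular irreducible M, rather than from the LU factorization mentioned above:

```python
import numpy as np

rng = np.random.default_rng(1)
n = m = 2
P = rng.random((n + m, n + m))
s = np.max(np.abs(np.linalg.eigvals(P)))   # spectral radius of P
M = s * np.eye(n + m) - P                  # singular irreducible M-matrix

# the right/left Perron eigenvectors of P are the null vectors of M
w, V = np.linalg.eig(P)
v = np.abs(V[:, np.argmax(w.real)].real)   # M v = 0
w2, W = np.linalg.eig(P.T)
u = np.abs(W[:, np.argmax(w2.real)].real)  # u^T M = 0

assert np.allclose(M @ v, 0)
assert np.allclose(u @ M, 0)

J = np.diag(np.r_[np.ones(n), -np.ones(m)])
mu = -u @ J @ v                            # drift, eq. (8)
assert np.isclose(mu, u[n:] @ v[n:] - u[:n] @ v[:n])
```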


The next result, presented in [25], shows the reduction from the case µ < 0 to the case µ > 0 and conversely, when M is singular irreducible. This property enables us to restrict our interest only to the case µ ≤ 0.

Lemma 10. The matrix S is the minimal nonnegative solution of (1) if and only if Z = ST is the minimal nonnegative solution of the equation

X CT X − X AT − DT X + BT = 0.                                            (9)

Therefore, if M is singular and irreducible, the equation (1) is transient if and only if the equation (9) is positive recurrent.

Proof. The first part is easily shown by taking transposes on both sides of the equation (1). The M-matrix corresponding to (9) is

Mt = [ AT    −CT
       −BT    DT ].

Since

[ vT2  vT1 ] Mt = 0   and   Mt [ u2; u1 ] = 0,

the second part readily follows. ⊓⊔

2.6  The differential of the Riccati operator

The matrix equation (1) defines a Riccati operator

R(X) = X C X − A X − X D + B,

whose differential dRX at a point X is

dRX[H] = H C X + X C H − A H − H D.                                       (10)

The differential H → dRX[H] is a linear operator which can be represented by the matrix

∆X = (C X − D)T ⊗ Im + In ⊗ (X C − A),                                    (11)

where ⊗ denotes the Kronecker product (see [30, Sec. 10]). We say that a solution X of the matrix equation (1) is critical if the matrix ∆X is singular. From the properties of the Kronecker product [30, Sec. 10], it follows that the eigenvalues of ∆X are the sums of those of C X − D and X C − A. If X = S, where S is the minimal nonnegative solution, then D − CX and A − XC are M-matrices (compare Theorem 7), and thus all the eigenvalues of ∆S have nonpositive real parts. Moreover, since D − CS and A − SC are M-matrices, −∆S is an M-matrix. The minimal nonnegative solution S is critical if and only if both M-matrices D − CS and A − SC are singular; thus, in view of Theorem 9, the minimal solution is critical if and only if M is irreducible singular and µ = 0. Moreover, if 0 ≤ X ≤ S then D − CX ≥ D − CS and A − XC ≥ A − SC are nonsingular M-matrices by Lemma 5, thus −∆X is a nonsingular M-matrix.
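The matrix representation (11) of the differential is easy to validate numerically against (10); a minimal sketch with hypothetical random coefficients (column-major vec convention):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 2, 3
A = rng.random((m, m)); D = rng.random((n, n))
C = rng.random((n, m)); X = rng.random((m, n))   # X has the shape of a solution of (1)

# eq. (11): Delta_X = (CX - D)^T kron I_m + I_n kron (XC - A)
Delta = np.kron((C @ X - D).T, np.eye(m)) + np.kron(np.eye(n), X @ C - A)

# Delta represents H -> HCX + XCH - AH - HD, eq. (10), in the vec basis
H = rng.random((m, n))
dR = H @ C @ X + X @ C @ H - A @ H - H @ D
assert np.allclose(Delta @ H.flatten(order="F"), dR.flatten(order="F"))

# every eigenvalue of Delta is a sum of an eigenvalue of CX - D and one of XC - A
pairs = np.add.outer(np.linalg.eigvals(C @ X - D), np.linalg.eigvals(X @ C - A)).ravel()
assert all(np.min(np.abs(e - pairs)) < 1e-8 for e in np.linalg.eigvals(Delta))
```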

2.7  The number of positive solutions

If the matrix M is irreducible, Theorem 7 states that there exists a minimal positive solution S of the NARE. In the study of nonsymmetric Riccati differential equations associated with an M-matrix [18, 34] one is interested in all the positive solutions. In [18] it is shown that if M is nonsingular or singular irreducible with µ ≠ 0, then there exists a second solution S+ such that S+ ≥ S and S+ is obtained by a rank one correction of the matrix S. More precisely, the following result holds [18].

Theorem 11. If M is irreducible nonsingular or irreducible singular with µ ≠ 0, then there exists a second positive solution S+ of (1) given by

S+ = S + k a bT,

where k = (λn − λn+1)/(bT C a), a is such that (A − SC) a = −λn+1 a and b is such that bT (D − CS) = λn bT.

We prove that there are exactly two nonnegative solutions in the noncritical case and only one in the critical case. In order to prove this result it is useful to study the form of the Jordan chains of an invariant subspace of H corresponding to a positive solution.

Lemma 12. Let M be irreducible and let Σ be any positive solution of (1). Denote by η1, . . . , ηn the eigenvalues of D − CΣ ordered by nondecreasing real part. Then η1 is real, and there exists a positive eigenvector v of H associated with η1. Moreover, any other vector independent of v, belonging to Jordan chains of H corresponding to η1, . . . , ηn, cannot be positive or negative.

Proof. Since Σ is a solution of (1), from (6) one has

H [ I; Σ ] = [ I; Σ ] (D − CΣ).

Since D − CS is an irreducible M-matrix by Theorem 7, and Σ ≥ S (S is the minimal positive solution), D − CΣ is an irreducible Z-matrix and thus can be written as sI − N with N nonnegative and irreducible. Then by Theorem 1 and Corollary 2, η1 is a simple real eigenvalue of D − CΣ, the corresponding eigenvector can be chosen positive, and there are no other positive or negative eigenvectors or Jordan chains corresponding to any of the eigenvalues. Let P−1 (D − CΣ) P = K be the Jordan canonical form of D − CΣ, where the first column of P is the positive eigenvector corresponding to η1. Then we have

H [ P; ΣP ] = [ P; ΣP ] K.


Thus, the columns of [ P; ΣP ] are the Jordan chains of H corresponding to η1, . . ., ηn, and there are no positive or negative columns, except for the first one. ⊓⊔

Theorem 13. If M is an irreducible nonsingular M-matrix or an irreducible singular M-matrix with µ ≠ 0, then (1) has exactly two positive solutions. If M is irreducible singular with µ = 0, then (1) has a unique positive solution.

Proof. From Lemma 12 applied to S it follows that H has a positive eigenvector corresponding to λn, and no other positive or negative eigenvectors or Jordan chains corresponding to λ1, . . . , λn. Let T be the minimal nonnegative solution of the dual equation (4). Then

H [ T; I ] = [ T; I ] (−(A − BT)).

As in the proof of Lemma 12, we can prove that H has a positive eigenvector corresponding to the eigenvalue λn+1 and no other positive or negative eigenvectors or Jordan chains corresponding to λn+1, . . . , λn+m. If M is irreducible nonsingular, or irreducible singular with µ ≠ 0, then λn > λn+1, and there are only two linearly independent positive eigenvectors corresponding to real eigenvalues. By Lemma 12, there can be at most two solutions, corresponding to λn, λn−1, . . . , λ1 and to λn+1, λn−1, . . . , λ1, respectively. Since it is known from Theorem 11 that there exist at least two positive solutions, (1) has exactly two positive solutions. If M is irreducible singular with µ = 0, there is only one positive eigenvector, corresponding to λn = λn+1, and the unique solution of (1) is obtained from the Jordan chains corresponding to λn, λn−1, . . . , λ1. ⊓⊔

The next results provide a useful property of the minimal solutions which will be used in Section 4.

Theorem 14. Let M be singular and irreducible, and let S and T be the minimal nonnegative solutions of (1) and (4), respectively. Then the following properties hold:

(a) if µ < 0, then S v1 = v2 and T v2 < v1;
(b) if µ = 0, then S v1 = v2 and T v2 = v1;
(c) if µ > 0, then S v1 < v2 and T v2 = v1.

Proof. From the proof of Theorem 13, it follows that if µ ≠ 0 there exist two independent positive eigenvectors a and b of H relative to the central eigenvalues λn and λn+1, respectively. We write a = [a1; a2] and b = [b1; b2], with a1, b1 ∈ Rn and a2, b2 ∈ Rm.


Since the solution S is constructed from an invariant subspace containing a, then S a1 = a2; since the solution S+ is constructed from an invariant subspace containing b, then S+ b1 = b2. Analogously, if T+ is the second positive solution of the dual equation, then T b2 = b1 and T+ a2 = a1. The statements (a) and (c) follow from the fact that if µ < 0 then v = a (compare Theorem 9), so S v1 = v2 and T v2 < T+ v2 = v1, since T < T+; if µ > 0 then v = b, so T v2 = v1 and S v1 < S+ v1 = v2, since S < S+. The statement (b), corresponding to the case µ = 0, can be proved in a similar way. ⊓⊔

Remark 1. When µ > 0, from Lemma 10 and Theorem 14 we deduce that the minimal nonnegative solution S of (1) is such that uT2 S = uT1.

2.8  Perturbation analysis for the minimal solution

We conclude this section with a result of Guo and Higham [24], who perform a qualitative description of the perturbation of the minimal nonnegative solution S of a NARE (1) associated with an M-matrix. The result is split in two theorems where an M-matrix M̃ is considered which is obtained by means of a small perturbation of M. Here, we denote by S̃ the minimal nonnegative solution of the perturbed Riccati equation associated with M̃.

Theorem 15. If M is a nonsingular M-matrix or an irreducible singular M-matrix with µ ≠ 0, then there exist constants γ > 0 and ε > 0 such that ‖S̃ − S‖ ≤ γ ‖M̃ − M‖ for all M̃ with ‖M̃ − M‖ < ε.

Theorem 16. If M is an irreducible singular M-matrix with µ = 0, then there exist constants γ > 0 and ε > 0 such that

(a) ‖S̃ − S‖ ≤ γ ‖M̃ − M‖^{1/2} for all M̃ with ‖M̃ − M‖ < ε;
(b) ‖S̃ − S‖ ≤ γ ‖M̃ − M‖ for all singular M̃ with ‖M̃ − M‖ < ε.

It is interesting to observe that in the critical case, where µ = 0, or if µ ≈ 0, one has to expect poor numerical performance even if the algorithm used for approximating S is backward stable. Moreover, the rounding errors introduced to represent the input values of M in the floating point representation with precision ε may generate an error of the order √ε in the solution S. This kind of problem will be overcome in Section 4.1.

3

Numerical methods

We give a brief review of the numerical methods developed so far for computing the minimal nonnegative solution of the NARE (1) associated with an M-matrix.

Here we consider the case where the M-matrix M is nonsingular, or is singular, irreducible and µ ≤ 0. The case µ > 0 can be reduced to the case µ < 0 by means of Lemma 10. The critical case where µ = 0 needs different techniques, which will be treated in Section 4. We start with a direct method based on the Schur form of the matrix H; then we consider iterative methods based on fixed-point techniques and Newton's iteration, and we conclude the section by analyzing a class of doubling algorithms. The latter class includes methods based on Cyclic Reduction (CR) of [9] and on the Structure-preserving Doubling Algorithm (SDA) of [2].

Schur method

A classical approach for solving equation (1) is to use the (ordered) Schur decomposition of the matrix M to compute the invariant subspace of H corresponding to the minimal solution S. This approach for the symmetric algebraic Riccati equation was first presented by Laub in 1979 [40]. Concerning the NARE, a study of that method in the singular and critical case was done by Guo [23], who presented a modified Schur method for the critical or nearly critical case (µ ≈ 0). As explained in Section 2.4, from

H [ In; S ] = [ In; S ] (D − CS)

it follows that finding the minimal solution S of the NARE (1) is equivalent to finding a basis of the invariant subspace of H relative to the eigenvalues of D − CS, i.e., the eigenvalues of H with nonnegative real part. A method for finding an invariant subspace is obtained by computing a semi-ordered Schur form of H, that is, computing an orthogonal matrix Q and a quasi upper-triangular matrix T such that Q∗ H Q = T, where T is block upper triangular with diagonal blocks Ti,i of size at most 2. The semi-ordering means that if Ti,i, Tj,j and Tk,k are diagonal blocks having eigenvalues with positive, null and negative real parts, respectively, then i < j < k. A semi-ordered Schur form can be computed in two steps:

– Compute a real Schur form of H by the customary Hessenberg reduction followed by the application of the QR algorithm, as described in [19].
– Swap the diagonal blocks by means of orthogonal transformations, as described in [4].

The minimal solution of the NARE can be obtained from the first n columns of the matrix Q: partitioning them as [ Q1; Q2 ] with Q1 an n × n matrix, one has S = Q2 Q1−1. In the critical case this method does not work, since there is no way to choose an invariant subspace relative to the first n eigenvalues; moreover, in the nearly critical case where µ ≈ 0, there is a lack of accuracy since the 0 eigenvalue is ill-conditioned. However, the modified Schur method given by C.-H. Guo [24] overcomes these problems. The cost of this algorithm, following [23], is 200n³.
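The two steps above are available in standard libraries; a minimal sketch of the ordered-Schur approach using SciPy on a hypothetical random M-matrix (scipy.linalg.schur performs both the reduction and the block reordering when a sort criterion is supplied):

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)
n = m = 3
P = rng.random((n + m, n + m))
M = 6.0 * np.eye(n + m) - P                 # nonsingular M-matrix (6 exceeds the spectral radius of P)
D, C = M[:n, :n], -M[:n, n:]                # blocks of M = [[D, -C], [-B, A]]
B, A = -M[n:, :n], M[n:, n:]

H = np.block([[D, -C], [B, -A]])            # H = J M, eq. (5)
T, Q, k = schur(H, sort=lambda x, y: x >= 0)  # eigenvalues with Re >= 0 ordered first
assert k == n                               # (m, n) splitting, Theorem 8
Q1, Q2 = Q[:n, :n], Q[n:, :n]
S = Q2 @ np.linalg.inv(Q1)                  # S = Q2 Q1^{-1}

assert np.allclose(S @ C @ S - A @ S - S @ D + B, 0)   # S solves the NARE (1)
assert S.min() > 0                          # minimal solution is positive (Theorem 7)
```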

3.2  Functional iterations

In [20] a class of fixed-point methods for (1) is considered. The fixed-point iterations are based on suitable splittings of A and D, that is, A = A1 − A2 and D = D1 − D2, with A1, D1 chosen to be M-matrices and A2, D2 ≥ 0. The form of the iterations is

A1 Xk+1 + Xk+1 D1 = Xk C Xk + Xk D2 + A2 Xk + B,    (12)

where at each step a Sylvester equation of the form M1 X + X M2 = N must be solved. Some possible choices for the splitting are:
1. A1 and D1 are the diagonal parts of A and D, respectively;
2. A1 is the lower triangular part of A and D1 the upper triangular part of D;
3. A1 = A and D1 = D.
The solution Xk+1 of the Sylvester equation can be computed, for instance, by using the Bartels and Stewart method [5], as in the sylvsol function of Nick Higham's Matrix Function Toolbox for MATLAB [28]. The cost of this computation is roughly 60n³ flops, including the computation of the Schur forms of the coefficients A1 and D1 [29]. However, observe that for the first splitting A1 and D1 are diagonal matrices, and the Sylvester equation can be solved with O(n²) flops; for the second splitting, the matrices A1 and D1 are already in Schur form. This substantially reduces the cost of the application of the Bartels and Stewart method to 2n³ flops. Concerning the third iteration, observe that the matrix coefficients A1 and D1 are independent of the iteration; therefore, the computation of their Schur forms must be performed only once. A monotonic convergence result holds for the three iterations [20].

Theorem 17. If R(X) ≤ 0 for some positive matrix X, then for the fixed-point iterations (12) with X0 = 0, it holds that Xk < Xk+1 < X for k ≥ 0. Moreover, limk Xk = S.

We also have an asymptotic convergence result [20].

Theorem 18. For the fixed-point iterations (12) with X0 = 0, it holds that

lim sup_k ‖Xk − S‖^{1/k} = ρ((I ⊗ A1 + D1ᵀ ⊗ I)^{-1} (I ⊗ (A2 + SC) + (D2 + CS)ᵀ ⊗ I)).
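For the first splitting the iteration is only a few lines; the sketch below (our own code, not the authors') exploits the entrywise decoupling of the Sylvester equation when A1 and D1 are diagonal:

```python
import numpy as np

def nare_fixed_point(A, B, C, D, steps=500, tol=1e-13):
    """Fixed-point iteration (12) with the first splitting (A1, D1 = diagonal
    parts of A and D): a sketch. With diagonal A1, D1 the Sylvester equation
    A1 X + X D1 = N decouples entrywise, so each step costs O(n^2) work
    beyond the matrix products."""
    a1, d1 = np.diag(A), np.diag(D)
    A2 = np.diag(a1) - A                  # A = A1 - A2
    D2 = np.diag(d1) - D                  # D = D1 - D2
    denom = a1[:, None] + d1[None, :]     # (A1 X + X D1)_{ij} = (a1_i + d1_j) X_{ij}
    X = np.zeros_like(B)
    for _ in range(steps):
        X_new = (X @ C @ X + X @ D2 + A2 @ X + B) / denom
        if np.max(np.abs(X_new - X)) < tol:
            return X_new
        X = X_new
    return X
```

Starting from X0 = 0, the iterates increase monotonically toward S, as Theorem 17 predicts.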

Riccati equations associated with an M-matrix

191

These iterations have linear convergence, which turns to sublinear in the critical case. The computational cost varies from 8n³ arithmetic operations per step for the first splitting, to 64n³ for the first step plus 10n³ for each subsequent step for the last splitting. The most expensive iteration is the third one, which, on the other hand, has the highest (linear) convergence speed.

3.3 Newton's method

Newton's iteration was first applied to the symmetric algebraic Riccati equation by Kleinman in 1968 [37] and later on by various authors. In particular, Benner and Byers [7] complemented the method with an optimization technique (exact line search) in order to reduce the number of steps needed to reach convergence. The study of the Newton method for nonsymmetric algebraic Riccati equations was started by Guo and Laub in [26], and a nice convergence result was given by Guo and Higham in [24]. The convergence of the Newton method is generally quadratic, except for the critical case, where the convergence is observed to be linear with rate 1/2 [26]. At each step, a Sylvester matrix equation must be solved, so the computational cost is O(n³) flops per step, but with a large overhead constant.

The Newton method for a NARE [26] consists in the iteration

Xk+1 = N(Xk) = Xk − (dR_{Xk})^{-1} R(Xk),   k = 0, 1, . . . ,    (13)

which, in view of (10), can be written explicitly as

(A − Xk C) Xk+1 + Xk+1 (D − C Xk) = B − Xk C Xk.    (14)

Therefore, the matrix Xk+1 is obtained by solving a Sylvester equation. This linear equation is defined by the matrix

∆Xk = (D − C Xk)ᵀ ⊗ Im + In ⊗ (A − Xk C),

which is nonsingular if 0 ≤ Xk < S, as shown in Section 2.6. Thus, if 0 ≤ Xk < S for every k, the sequence (13) is well-defined. In the noncritical case, dR_S is nonsingular, and the iteration is quadratically convergent in a neighborhood of the minimal nonnegative solution S by the traditional results on Newton's method (see e.g. [36]). Moreover, the following monotonic convergence result holds [24]:

Theorem 19. Consider Newton's method (14) starting from X0 = 0. Then for each k = 0, 1, . . . , we have 0 ≤ Xk ≤ Xk+1 < S and ∆Xk is a nonsingular M-matrix. Therefore, the sequence (Xk) is well-defined and converges monotonically to S.


The same result holds when 0 ≤ X0 ≤ S; the proof in [24] can be easily adapted to this case. In [26], a hybrid method was suggested, which consists in performing a certain number of iterations of a linearly convergent algorithm, such as the ones of Section 3.2, and then using the computed value as the starting point for Newton's method. At each step of Newton's iteration, the largest computational work is the solution of the Sylvester equation (14). We recall that computing the solution Xk+1 by means of the Bartels and Stewart method [5] costs roughly 60n³ flops; therefore the overall cost of a Newton iteration is 66n³ flops. It is worth noting that in the critical and nearly critical cases the matrix ∆Xk becomes almost singular as Xk approaches the solution S; therefore, some numerical instability is to be expected. Such instability can be removed by means of a suitable technique, which we will describe in Section 4.1.
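As a concrete sketch (ours) of iteration (14), the following function performs each Newton step by solving the Sylvester equation through its Kronecker (vectorized) form; this is O(n⁶) work per step and is only for illustration, the Bartels and Stewart method [5] being the O(n³) tool mentioned above:

```python
import numpy as np

def newton_nare(A, B, C, D, steps=50, tol=1e-13):
    """Newton's method (14) for XCX - XD - AX + B = 0 with X0 = 0: a sketch.
    Each step solves (A - Xk C) X + X (D - C Xk) = B - Xk C Xk, here via the
    Kronecker form of the Sylvester operator (illustration only)."""
    m, n = B.shape
    X = np.zeros((m, n))
    for _ in range(steps):
        M1 = A - X @ C
        M2 = D - C @ X
        N = B - X @ C @ X
        # vec(M1 X + X M2) = (kron(I_n, M1) + kron(M2.T, I_m)) vec(X)
        K = np.kron(np.eye(n), M1) + np.kron(M2.T, np.eye(m))
        X_new = np.linalg.solve(K, N.flatten('F')).reshape((m, n), order='F')
        if np.max(np.abs(X_new - X)) < tol:
            return X_new
        X = X_new
    return X
```

In the noncritical case the iterates converge quadratically and monotonically from below, in agreement with Theorem 19.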

3.4 Doubling algorithms

In this section we report some quadratically convergent algorithms, obtained in [13], for solving (1). Quadratically convergent methods for computing the extremal solution of the NARE can be obtained by transforming the NARE into a Unilateral Quadratic Matrix Equation (UQME) of the kind

A2 X² + A1 X + A0 = 0,    (15)

where A0, A1, A2 and X are p × p matrices. Equations of this kind can be solved efficiently by means of doubling algorithms like Cyclic Reduction (CR) [9, 12] or Logarithmic Reduction (LR) [39]. The first attempt to reduce a NARE to a UQME was performed by Ramaswami [46] in the framework of fluid queues. Subsequently, many contributions in this direction have been given by several authors [23, 10, 13, 33, 6], and different reduction techniques have been designed. Concerning algorithms, Cyclic Reduction and SDA are the most effective computational techniques. The former was applied for the first time in [9] by Bini and Meini to solve unilateral quadratic equations. The latter was first presented by Anderson in 1978 [2] for the numerical solution of discrete-time algebraic Riccati equations. A new interpretation was given by Chu, Fan, Guo, Hwang, Lin and Xu [16, 32, 41] for other kinds of algebraic Riccati equations.


CR applied to (15) generates sequences of matrices defined by the following equations:

  V^(k)    = (A1^(k))^{-1},
  A0^(k+1) = −A0^(k) V^(k) A0^(k),
  A1^(k+1) = A1^(k) − A0^(k) V^(k) A2^(k) − A2^(k) V^(k) A0^(k),    k = 0, 1, . . . ,    (16)
  A2^(k+1) = −A2^(k) V^(k) A2^(k),
  Â^(k+1)  = Â^(k) − A2^(k) V^(k) A0^(k),

where Ai^(0) = Ai, i = 0, 1, 2, and Â^(0) = A1.

The following result provides convergence properties of CR [12].

Theorem 20. Let x1, . . . , x2p be the roots of a(z) = det(A0 + zA1 + z²A2), including roots at infinity if deg a(z) < 2p, ordered by increasing modulus. Suppose that |xp| ≤ 1 ≤ |xp+1| and |xp| < |xp+1|, and that a solution G exists to (15) such that ρ(G) = |xp|. Then G is the unique solution to (15) with minimal spectral radius; moreover, if CR (16) can be carried out with no breakdown, the sequence

G^(k) = −(Â^(k))^{-1} A0

is such that, for any norm,

‖G^(k) − G‖ ≤ ϑ |xp/xp+1|^{2^k},

where ϑ > 0 is a suitable constant. Moreover, it holds that ‖A0^(k)‖ = O(|xp|^{2^k}) and ‖A2^(k)‖ = O(|xp+1|^{−2^k}).
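Iteration (16) is straightforward to implement. The sketch below (our code, with an arbitrary scalar test) returns G^(k) = −(Â^(k))^{-1} A0 and stops when successive approximations stagnate:

```python
import numpy as np

def cyclic_reduction(A0, A1, A2, steps=60, tol=1e-14):
    """Cyclic Reduction (16) for the UQME A2 X^2 + A1 X + A0 = 0: a sketch.
    Under the assumptions of Theorem 20, the iterates G(k) = -(Ahat(k))^{-1} A0
    converge quadratically to the solution of minimal spectral radius."""
    A0k, A1k, A2k, Ahat = A0.copy(), A1.copy(), A2.copy(), A1.copy()
    G = -np.linalg.solve(Ahat, A0)
    for _ in range(steps):
        V = np.linalg.inv(A1k)
        A0n = -A0k @ V @ A0k
        A2n = -A2k @ V @ A2k
        A1n = A1k - A0k @ V @ A2k - A2k @ V @ A0k
        Ahat = Ahat - A2k @ V @ A0k
        A0k, A1k, A2k = A0n, A1n, A2n
        G_new = -np.linalg.solve(Ahat, A0)   # G(k) uses the *original* A0
        if np.linalg.norm(G_new - G, 1) < tol:
            return G_new
        G = G_new
    return G
```

For the scalar equation x² − 2.5x + 1 = 0 (roots 0.5 and 2, so the splitting assumption of Theorem 20 holds) the iterates converge to the minimal root 0.5.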

Observe that the convergence conditions of the above theorem require that the roots of a(z) have a (p, p) complete splitting with respect to the unit circle. For this reason, before transforming the NARE into a UQME, it is convenient to transform the Hamiltonian H into a new matrix Ĥ such that the eigenvalues of Ĥ have an (n, m) splitting with respect to the unit circle, i.e., n eigenvalues belong to the closed unit disk and m are outside. This can be obtained by means of one of two operators: the Cayley transform Cγ(z) = (z + γ)^{-1}(z − γ), where γ > 0, or the shrink-and-shift operator Sτ(z) = 1 − τz, where τ > 0. In fact, the Cayley transform maps the open right half-plane into the open unit disk. Similarly, for suitable values of τ, the transformation Sτ maps a suitable subset of the right half-plane inside the unit disk. This property is better explained in the following result, which has been proved in [13].

Theorem 21. Let γ, τ > 0 and let

Hγ = Cγ(H) = (H + γI)^{-1}(H − γI),   Hτ = Sτ(H) = I − τH.

Assume µ < 0; then:

1. Hγ has eigenvalues ξi = Cγ(λi), i = 1, . . . , m + n, such that max_{i=1,...,n} |ξi| ≤ 1 ≤ min_{i=n+1,...,m+n} |ξi|;
2. if τ ≤ 1/max{maxi (A)i,i, maxi (D)i,i}, then Hτ has eigenvalues µi = Sτ(λi), i = 1, . . . , m + n, such that max_{i=1,...,n} |µi| ≤ 1 ≤ min_{i=n+1,...,m+n} |µi|.

In the double shift, η > 0, ξ < 0, and p and q are such that pᵀv = qᵀw = 1. Since v and w are orthogonal vectors, the double shift moves one zero eigenvalue to η and the other to ξ. Indeed, the eigenvalues of H̃ = H + ξqwᵀ are those of H̃ᵀ = Hᵀ + ξwqᵀ, which are the eigenvalues of H except that one zero eigenvalue is replaced by ξ, by Lemma 26. Also, the eigenvalues of H̆ = H̃ + ηvpᵀ are the eigenvalues of H̃ except that the remaining zero eigenvalue is replaced by η, by Lemma 26 again. From H̆ we may define a new Riccati equation

X C̆ X − X D̆ − Ă X + B̆ = 0.    (29)

As before, the minimal nonnegative solution S of (1) is a solution of (29) such that σ(D̆ − C̆S) = {η, λ1, . . . , λn−1}. However, it seems very difficult to determine the existence of a solution Y of the dual equation of (29) such that σ(Ă − B̆Y) = {−ξ, −λn+2, . . . , −λn+m}.

4.2 Choosing a new initial value

If the right eigenvector of H relative to the null eigenvalue is partitioned as v = [v1; v2], from Theorem 14 it follows that for the minimal nonnegative solution S it holds that Sv1 = v2 (and then (D − CS)v1 = 0).

In the algorithms in which the initial value can be chosen, like Newton's method, the usual choice X0 = 0 does not exploit this information; rather, it relies only on the positivity of S. Note that in the Riccati equations modeling fluid queues, the condition Xv1 = v2 is equivalent to the stochasticity of S, since v1 = v2 = e. A possibly better convergence is expected if one could generate a sequence such that Xk v1 = v2 for every k ≥ 0. More precisely, one must choose an iteration


which preserves the affine subspace Ŵ = {A ∈ C^{n×n} : Av1 = v2} and an initial value X0 ∈ Ŵ for which the sequence converges to the desired solution.

A similar idea has been used in [45] in order to improve the convergence speed of certain functional iterations for solving nonlinear matrix equations related to special Markov chains. A nice property of Newton's method is that it is structure-preserving with respect to the affine subspace Ŵ. To prove this fact, consider the following preliminary result, which concerns the Newton iteration.

Lemma 28. The Newton method Xk+1 = N(Xk), N(Xk) = Xk − (dF_{Xk})^{-1} F(Xk), applied to the matrix equation F(X) = 0, when defined, preserves the affine structure V̂ if and only if F is a function from V̂ to its parallel linear subspace V.

Proof. Consider a matrix X ∈ V̂. The matrix N(X) belongs to V̂ if and only if N(X) − X = (dF_X)^{-1}(−F(X)) belongs to V, and that occurs if and only if F(X) (and then −F(X)) belongs to V. ⊓⊔

Now we are ready to prove that the Newton method applied to the Riccati operator is structure-preserving with respect to Ŵ.

Proposition 29. If X0 is such that X0 v1 = v2, and the Newton method applied to the Riccati equation R(X) = 0 is well defined, then Xk v1 = v2 for every k ≥ 0. That is, the Newton method preserves the structure Ŵ.

Proof. In view of Lemma 28, one needs to prove that R is a function from Ŵ to the parallel linear subspace W. If X ∈ Ŵ, then R(X)v1 = 0; in fact,

R(X)v1 = XCXv1 − AXv1 − XDv1 + Bv1 = XCv2 − Av2 − XDv1 + Bv1,

and the last term is 0 since Cv2 = Dv1 and Av2 = Bv1. ⊓⊔
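Proposition 29 can be checked numerically. The sketch below is our own illustration; the random seed, the size n = 5, and the block convention M = [[D, −C], [−B, A]] (consistent with H = [[D, −C], [B, −A]] used above) are assumptions. It builds a singular M-matrix with Me = 0 as in Test 32, so that v1 = v2 = e, starts from the structured value (X0)i,j = (v2)i/s, and performs a few Newton steps (14) via the Kronecker form; the structure Xk e = e is preserved at every step:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
R = rng.random((2 * n, 2 * n))
M = np.diag(R.sum(axis=1)) - R            # singular M-matrix with M e = 0
D, C = M[:n, :n], -M[:n, n:]              # assumed partition M = [[D, -C], [-B, A]]
B, A = -M[n:, :n], M[n:, n:]

e = np.ones(n)
X = np.outer(e, e) / n                    # (X0)_{i,j} = (v2)_i / s, with v1 = v2 = e
for _ in range(6):
    # Newton step (14): solve (A - XC) Y + Y (D - CX) = B - XCX for Y
    K = np.kron(np.eye(n), A - X @ C) + np.kron((D - C @ X).T, np.eye(n))
    X = np.linalg.solve(K, (B - X @ C @ X).flatten('F')).reshape((n, n), order='F')
    # by Proposition 29, X @ e stays equal to e (up to rounding) at every step
```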

A possible choice for the starting value is (X0)i,j = (v2)i/s, where s = Σi v1(i). It must be observed that the structure-preserving convergence is no longer monotonic. Since the approximation error has a null component along the subspace W, one should expect a better convergence speed for the sequences obtained with X0 ∈ Ŵ. A proof of this fact and the convergence analysis of this approach are still work in progress.

If µ = 0, the differential of R is singular at the solution S, as well as at any point X ∈ Ŵ. This makes the sequence Xk undefined. A way to overcome this drawback is to consider the shifted Riccati equation described in Section 4.1.


The differential of the shifted Riccati equation (26) at a point X is represented by the matrix

∆̃X = ∆X + I ⊗ (η(Xv1 − v2)p2ᵀ) + (ηv1(p1ᵀ + p2ᵀX))ᵀ ⊗ I,    (30)

where the vector p ≠ 0, partitioned as p = [p1; p2], is an arbitrary nonnegative vector such that pᵀv = 1. Choosing p2 = 0 provides a nice simplification of the problem; in fact,

∆̃X = ∆X + Qᵀ ⊗ I,

where Q = ηv1p1ᵀ. The next result gives more insight into the action of the Newton iteration on the structure V̂.

Proposition 30. Assume that p2 = 0. If X ∈ Ŵ, then R(X) = R̃(X), where R̃ is defined in (27). Moreover, the sequences generated by Newton's method, when defined, applied to R(X) = 0 and to R̃(X) = 0 with X0 ∈ Ŵ are the same.

Proof. The fact that R(X) = R̃(X), under the assumption p2 = 0, follows from

R̃(X) = R(X) − η(Xv1 − v2)p1ᵀ.

Let N(X) = X − (dR_X)^{-1}R(X) and Ñ(X) = X − (dR̃_X)^{-1}R̃(X) denote the Newton operators for the original equation and for the shifted one, respectively. To prove that the sequences are the same, it must be shown that

(A − XC)N(X) + N(X)(D̃ − CX) = B̃ − XCX

holds for any X ∈ Ŵ and for any η (for which the equation has a unique solution). One has

(A − XC)N(X) + N(X)(D̃ − CX) = B − XCX + N(X)ηv1p1ᵀ = B − XCX + ηv2p1ᵀ = B̃ − XCX,

where we have used that N(X)v1 = v2, since N(X) ∈ Ŵ. This completes the proof. ⊓⊔

Since any starting value X0 ∈ Ŵ gives the same sequence for the Newton method applied either to the Riccati equation (1) or to the shifted Riccati equation (26), choosing such an initial value has the same effect as applying the shift technique. For applicability one needs the matrix ∆Xk to be nonsingular at each step. Unfortunately, the derivative might be singular for some singular M-matrix and some X ∈ Ŵ+ = {X ∈ Ŵ : X ≥ 0}.


If a breakdown occurs, it is always possible to perform the iteration by using the shifted iteration, with p2 = 0 and a suitable choice of the parameter η. In fact, by Proposition 30 the iteration is the same for any choice of p1 and η.

The convergence is more subtle. Besides the loss of monotonic convergence, one may note that S is not the only solution belonging to Ŵ, even if it is the only one belonging to Ŵ+. In fact, in view of Theorem 13, there are at most two positive solutions, and only one of them has the property Sv1 = v2. The proof of convergence is still work in progress; we conjecture that for each X0 ∈ Ŵ+ the sequence generated by the Newton method, if defined, converges to S. A possible improvement of the algorithm could be obtained by implementing the exact line search introduced in [7].

5 Numerical experiments and comparisons

We present some numerical experiments to illustrate the behavior of the algorithms presented in Sections 3 and 4.1 in the critical and noncritical cases. To compare the accuracy of the methods we have used the relative error err = ‖X − X̂‖1/‖X‖1 on the computed solution X̂, when the exact solution X was provided. Elsewhere, we have used the relative residual error

res = ‖X̂CX̂ − X̂D − AX̂ + B‖1 / (‖X̂CX̂‖1 + ‖X̂D‖1 + ‖AX̂‖1 + ‖B‖1).
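In the same notation, the residual measure can be computed as follows (a small helper of our own, using 1-norms as in the formula above):

```python
import numpy as np

def relative_residual(A, B, C, D, X):
    """Relative residual 'res' of an approximate solution X of
    XCX - XD - AX + B = 0, with matrix 1-norms as in the text."""
    num = np.linalg.norm(X @ C @ X - X @ D - A @ X + B, 1)
    den = (np.linalg.norm(X @ C @ X, 1) + np.linalg.norm(X @ D, 1)
           + np.linalg.norm(A @ X, 1) + np.linalg.norm(B, 1))
    return num / den
```

For an exact solution the returned value is of the order of the machine precision.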

The tests were performed using MATLAB 6 Release 12 on an AMD Athlon 64 processor. The code for the different algorithms is available for download at the web page http://bezout.dm.unipi.it/mriccati/. In these tests we consider three methods: the Newton method (N), the SDA, and the Cyclic Reduction (CR) algorithm applied to the UQME (17) (in both SDA and CR we have considered the matrix Ĥ obtained by the Cayley transform of H, and not the one relying on the shrink-and-shift operator). We have also considered the improved versions of these methods applied to the singular/critical case; we denote them IN, ISDA and ICR, respectively, where "I" stands for "Improved". The initial value for IN is chosen as suggested in Section 4.1; the parameter for the shift is chosen as η = max{maxi (A)i,i, maxi (D)i,i} and the vector p is chosen to be e/Σi vi. The iterations are stopped when the relative residual/error ceases to decrease or becomes smaller than 10ε, where ε is the machine precision.

Test 31. A null recurrent case [6, Example 1]. Let

M = [  0.003  −0.001  −0.001  −0.001
      −0.001   0.003  −0.001  −0.001
      −0.001  −0.001   0.003  −0.001
      −0.001  −0.001  −0.001   0.003 ],


where D is a 2 × 2 matrix. The minimal positive solution is

X = (1/2) [ 1  1
            1  1 ].

As suggested by Theorem 16, the accuracy of the customary algorithms N, SDA and CR is poor in the critical case, and is near to √ε ≈ 10⁻⁸. We report in Table 1 the number of steps and the relative error for the three algorithms. If one uses the singularity, due to the particular structure of the problem, the solution is achieved in one step by IN, ISDA and ICR, with full accuracy.

Algorithm   Steps   Relative error
N           21      6.0 · 10⁻⁷
SDA         36      8.6 · 10⁻⁷
CR          31      4.7 · 10⁻⁹

Table 1. Accuracy of the algorithms in the critical case, Test 31

Test 32. Random choice of a singular M-matrix with Me = 0 [20]. To construct M, we generated a 100 × 100 random matrix R and set M = diag(Re) − R. The matrices A, B, C and D are 50 × 50. We generated 5 different matrices M and computed the relative residuals and the number of steps needed for the iterations to converge. All the algorithms (N, IN, SDA, ISDA, CR and ICR) arrive at a relative residual less than 10ε. The numbers of steps needed by the algorithms are reported in Table 2. As one can see, the basic algorithms require the same number of steps; when the singularity is exploited, the Newton method requires one or two steps fewer than ISDA and ICR, but the lower cost per step of these two methods makes their overall cost much lower than that of the Newton method. The use of the singularity dramatically reduces the number of steps needed for the algorithms to converge.

Algorithm   Steps needed
N           11–12
IN          3
SDA         11–12
ISDA        4–5
CR          11–13
ICR         4–5

Table 2. Minimum and maximum number of steps needed for the algorithms to converge in Test 32


Table 3 summarizes the spectral and computational properties of the solutions of the NARE (1). Table 4 reports the computational cost of the algorithms for solving (1) with m = n, together with the convergence properties in the noncritical case.

M:               nonsingular     singular, µ < 0   singular, µ = 0   singular, µ > 0
eigenvalues:     λn+1 < 0 < λn   λn+1 < 0 = λn     λn+1 = 0 = λn     λn+1 = 0 < λn
splitting:       complete        complete          not complete      complete
solutions ≥ 0:   2               2                 1                 2
∆S:              nonsingular     nonsingular       singular          nonsingular
accuracy:        ε               ε                 √ε                ε

Table 3. Summary of the properties of the NARE

Algorithm                   Computational cost     Reference
Schur method                200n³                  [23, 40]
Functional iteration        8n³–14n³ (per step)    [20, 26]
Newton's method             66n³ (per step)        [26, 24]
CR applied to (17)          (74/3)n³ (per step)    [10, 13]
CR applied to (18) (SDA)    (64/3)n³ (per step)    [16, 25, 13]
CR applied to (19), (20)    (38/3)n³ (per step)    [33, 13]

Table 4. Comparison of the algorithms.

References

1. S. Ahn and V. Ramaswami. Transient analysis of fluid flow models via stochastic coupling to a queue. Stoch. Models, 20(1):71–101, 2004.
2. B. D. O. Anderson. Second-order convergent algorithms for the steady-state Riccati equation. Internat. J. Control, 28(2):295–306, 1978.
3. S. Asmussen. Stationary distributions for fluid flow models with or without Brownian noise. Comm. Statist. Stochastic Models, 11(1):21–49, 1995.
4. Z. Bai and J. W. Demmel. On swapping diagonal blocks in real Schur form. Linear Algebra Appl., 186:73–95, 1993.
5. R. H. Bartels and G. W. Stewart. Solution of the matrix equation AX + XB = C. Commun. ACM, 15(9):820–826, 1972.


6. N. G. Bean, M. M. O'Reilly, and P. G. Taylor. Algorithms for return probabilities for stochastic fluid flows. Stochastic Models, 21(1):149–184, 2005.
7. P. Benner and R. Byers. An exact line search method for solving generalized continuous-time algebraic Riccati equations. IEEE Trans. Automat. Control, 43(1):101–107, 1998.
8. A. Berman and R. J. Plemmons. Nonnegative matrices in the mathematical sciences, volume 9 of Classics in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1994. Revised reprint of the 1979 original.
9. D. Bini and B. Meini. On the solution of a nonlinear matrix equation arising in queueing problems. SIAM J. Matrix Anal. Appl., 17(4):906–926, 1996.
10. D. A. Bini, B. Iannazzo, G. Latouche, and B. Meini. On the solution of algebraic Riccati equations arising in fluid queues. Linear Algebra Appl., 413(2-3):474–494, 2006.
11. D. A. Bini, B. Iannazzo, and F. Poloni. A fast Newton's method for a nonsymmetric algebraic Riccati equation. SIAM J. Matrix Anal. Appl., 30(1):276–290, 2008.
12. D. A. Bini, G. Latouche, and B. Meini. Numerical methods for structured Markov chains. Numerical Mathematics and Scientific Computation. Oxford University Press, New York, 2005. Oxford Science Publications.
13. D. A. Bini, B. Meini, and F. Poloni. From algebraic Riccati equations to unilateral quadratic matrix equations: old and new algorithms. Technical Report 1665, Dipartimento di Matematica, Università di Pisa, Italy, July 2007.
14. A. Brauer. Limits for the characteristic roots of a matrix. IV. Applications to stochastic matrices. Duke Math. J., 19:75–91, 1952.
15. C.-Y. Chiang and W.-W. Lin. A structured doubling algorithm for nonsymmetric algebraic Riccati equations (a singular case). Technical report, National Center for Theoretical Sciences, National Tsing Hua University, Taiwan R.O.C., July 2006.
16. E. K.-W. Chu, H.-Y. Fan, and W.-W. Lin. A structure-preserving doubling algorithm for continuous-time algebraic Riccati equations. Linear Algebra Appl., 396:55–80, 2005.
17. A. da Silva Soares and G. Latouche. Further results on the similarity between fluid queues and QBDs. In Matrix-analytic methods (Adelaide, 2002), pages 89–106. World Sci. Publ., River Edge, NJ, 2002.
18. S. Fital and C.-H. Guo. Convergence of the solution of a nonsymmetric matrix Riccati differential equation to its stable equilibrium solution. J. Math. Anal. Appl., 318(2):648–657, 2006.
19. G. H. Golub and C. F. Van Loan. Matrix computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, third edition, 1996.
20. C.-H. Guo. Nonsymmetric algebraic Riccati equations and Wiener-Hopf factorization for M-matrices. SIAM J. Matrix Anal. Appl., 23(1):225–242, 2001.
21. C.-H. Guo. A note on the minimal nonnegative solution of a nonsymmetric algebraic Riccati equation. Linear Algebra Appl., 357:299–302, 2002.
22. C.-H. Guo. Comments on a shifted cyclic reduction algorithm for quasi-birth-death problems. SIAM J. Matrix Anal. Appl., 24(4):1161–1166, 2003.
23. C.-H. Guo. Efficient methods for solving a nonsymmetric algebraic Riccati equation arising in stochastic fluid models. J. Comput. Appl. Math., 192(2):353–373, 2006.


24. C.-H. Guo and N. J. Higham. Iterative solution of a nonsymmetric algebraic Riccati equation. SIAM J. Matrix Anal. Appl., 29(2):396–412, 2007.
25. C.-H. Guo, B. Iannazzo, and B. Meini. On the doubling algorithm for a (shifted) nonsymmetric algebraic Riccati equation. SIAM J. Matrix Anal. Appl., 29(4):1083–1100, 2007.
26. C.-H. Guo and A. J. Laub. On the iterative solution of a class of nonsymmetric algebraic Riccati equations. SIAM J. Matrix Anal. Appl., 22(2):376–391, 2000.
27. C. He, B. Meini, and N. H. Rhee. A shifted cyclic reduction algorithm for quasi-birth-death problems. SIAM J. Matrix Anal. Appl., 23(3):673–691, 2001/02.
28. N. J. Higham. The Matrix Function Toolbox. http://www.ma.man.ac.uk/~higham/mftoolbox.
29. N. J. Higham. Functions of Matrices: Theory and Computation. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008.
30. L. Hogben, editor. Handbook of linear algebra. Discrete Mathematics and its Applications (Boca Raton). Chapman & Hall/CRC, Boca Raton, FL, 2007. Associate editors: Richard Brualdi, Anne Greenbaum and Roy Mathias.
31. R. A. Horn and S. Serra Capizzano. Canonical and standard forms for certain rank one perturbations and an application to the (complex) Google pageranking problem. To appear in Internet Mathematics, 2007.
32. T.-M. Hwang, E. K.-W. Chu, and W.-W. Lin. A generalized structure-preserving doubling algorithm for generalized discrete-time algebraic Riccati equations. Internat. J. Control, 78(14):1063–1075, 2005.
33. B. Iannazzo and D. Bini. A cyclic reduction method for solving algebraic Riccati equations. Technical report, Dipartimento di Matematica, Università di Pisa, Italy, 2005.
34. J. Juang. Global existence and stability of solutions of matrix Riccati equations. J. Math. Anal. Appl., 258(1):1–12, 2001.
35. J. Juang and W.-W. Lin. Nonsymmetric algebraic Riccati equations and Hamiltonian-like matrices. SIAM J. Matrix Anal. Appl., 20(1):228–243, 1999.
36. L. V. Kantorovich. Functional analysis and applied mathematics. NBS Rep. 1509. U.S. Department of Commerce, National Bureau of Standards, Los Angeles, Calif., 1952. Translated by C. D. Benster.
37. D. Kleinman. On an iterative technique for Riccati equation computations. IEEE Trans. Automat. Control, 13(1):114–115, 1968.
38. P. Lancaster and L. Rodman. Algebraic Riccati equations. Oxford Science Publications. The Clarendon Press, Oxford University Press, New York, 1995.
39. G. Latouche and V. Ramaswami. A logarithmic reduction algorithm for quasi-birth-death processes. J. Appl. Probab., 30(3):650–674, 1993.
40. A. J. Laub. A Schur method for solving algebraic Riccati equations. IEEE Trans. Automat. Control, 24(6):913–921, 1979.
41. W.-W. Lin and S.-F. Xu. Convergence analysis of structure-preserving doubling algorithms for Riccati-type matrix equations. SIAM J. Matrix Anal. Appl., 28(1):26–39, 2006.
42. L.-Z. Lu. Newton iterations for a non-symmetric algebraic Riccati equation. Numer. Linear Algebra Appl., 12(2-3):191–200, 2005.
43. L.-Z. Lu. Solution form and simple iteration of a nonsymmetric algebraic Riccati equation arising in transport theory. SIAM J. Matrix Anal. Appl., 26(3):679–685, 2005.
44. V. L. Mehrmann. The autonomous linear quadratic control problem, volume 163 of Lecture Notes in Control and Information Sciences. Springer-Verlag, Berlin, 1991. Theory and numerical solution.
45. B. Meini. New convergence results on functional iteration techniques for the numerical solution of M/G/1 type Markov chains. Numer. Math., 78(1):39–58, 1997.
46. V. Ramaswami. Matrix analytic methods for stochastic fluid flows. In D. Smith and P. Hey, editors, Teletraffic Engineering in a Competitive World, Proceedings of the 16th International Teletraffic Congress, pages 1019–1030. Elsevier Science B.V., Edinburgh, UK, 1999.
47. L. C. G. Rogers. Fluid models in queueing theory and Wiener-Hopf factorization of Markov chains. Ann. Appl. Probab., 4(2):390–413, 1994.
48. D. Williams. A "potential-theoretic" note on the quadratic Wiener-Hopf equation for Q-matrices. In Seminar on Probability, XVI, volume 920 of Lecture Notes in Math., pages 91–94. Springer, Berlin, 1982.

A generalized conjugate direction method for nonsymmetric large ill-conditioned linear systems

Edouard R. Boudinov¹ and Arkadiy I. Manevich²

¹ FORTIS Bank, Brussels, Belgium, edouard.boudinov@mail.ru
² Department of Computational Mechanics and Strength of Structures, Dniepropetrovsk National University, Dniepropetrovsk, Ukraine, armanevich@yandex.ru

Abstract. A new version of the generalized conjugate direction (GCD) method for nonsymmetric linear algebraic systems is proposed, oriented towards large and ill-conditioned sets of equations. In distinction from the known Krylov subspace methods for unsymmetrical matrices, the method uses explicitly computed A-conjugate (in the generalized sense) vectors, along with an orthogonal set of residuals obtained in the Arnoldi orthogonalization process. Employing entire sequences of orthonormal basis vectors in the Krylov subspaces, similarly to GMRES and FOM, ensures high stability of the method. But instead of solving a linear set of equations with a Hessenberg matrix at each iteration to determine the step, we use A-conjugate vectors and some simple recurrence formulas. The performance of the proposed algorithm is illustrated by the results of extensive numerical experiments with large-scale ill-conditioned linear systems and by comparison with known efficient algorithms.

Keywords: linear algebraic equations, large-scale problems, iterative methods for linear systems, Krylov subspace methods, conjugate direction methods, orthogonalization.

1 Introduction

The method proposed in this paper is based on the notion of A-conjugacy in the generalized sense, or "one-sided conjugacy" (in the Russian literature the term "A-pseudo-orthogonality" is also used). We recall the primary definition: vectors dk are called conjugate direction vectors of a real non-singular matrix A (in the generalized sense) if the following conditions are satisfied:

(di, Adk) = 0 for i < k;   (di, Adk) ≠ 0 for i = k;    (1)

(in the general case (di, Adk) ≠ 0 for i > k). The notion of A-conjugacy in the generalized sense was introduced and studied already in the 1970s by G. W. Stewart [4], V. V. Voevodin and E. E. Tyrtyshnikov [7], [11], [12], and others.


A few generalized CD algorithms for non-symmetric systems, based on one-sided conjugacy, were elaborated already in the 1980s and later (L. A. Hageman and D. M. Young [10] and others; see also [19], [20]). These algorithms belong to different classes of Krylov subspace methods: minimum residual methods, orthogonal residual methods, and orthogonal error methods. The convergence of these algorithms has been well studied and, in particular, the finite termination property has been proved. Of course, these results pertain to exact arithmetic. However, in practice the generalized CD algorithms turned out to be less efficient, on the whole, than methods based on an orthogonalization procedure, such as the Full Orthogonalization Method (FOM) [15] and the Generalized Minimal Residual method (GMRES) [16], elaborated in the same years. It is well known that the convergence of CD algorithms in finite precision arithmetic differs essentially from its theoretical estimates in exact arithmetic.

In this paper we propose a new generalized conjugate direction algorithm for solving nonsymmetric linear systems (fitting into the class of orthogonal residual methods) which is competitive with the most efficient known methods in the case of large-dimension or ill-conditioned systems. Similarly to GMRES and FOM, the algorithm employs entire sequences of orthonormal basis vectors in the Krylov subspaces obtained in the Arnoldi orthogonalization process [1]. This process is also considered as a way of computing residuals, instead of their usual updating. For simplicity we describe the algorithm in two forms, sequentially introducing new elements. First a "basic algorithm" is presented, which determines iterates by employing the one-sided conjugation and some recurrence formulas (but residuals are updated by the usual formula). Then the final algorithm is described, which uses the orthogonalization process for deriving residuals. The performance of the proposed algorithm is demonstrated by applying it to a set of standard linear problems.
The results are compared to those obtained by the classical conjugate gradient method, GMRES and some other efficient methods.

2 Basic algorithm

We solve the problem

Ax = b,   x, b ∈ ℜ^N,    (2)

where A is an N × N non-singular real matrix (in the general case a nonsymmetric one). Given an initial guess x1, we compute an initial residual r1 = b − Ax1 and an initial conjugate vector d1 as the normalized residual r1: d1 = r1⁰ = r1/‖r1‖. The condition (d1, Ad1) ≠ 0 is assumed to be satisfied. The "basic algorithm" is as follows:

xk+1 = xk + αk dk,   αk = (rk, dk)/(dk, Adk),    (3)


rk+1 = rk − αk Adk,    (4)

dk+1 = rk+1 + Σ_{i=1}^{k} βi^(k+1) di,    (5)

onditions (1), whi h lead to a triangular set of equations with respe t to β(k+1) . i This pro ess an be slightly simpli ed if to use the following apparent identity whi h follows from formula (4): Adi =

ri − ri+1 αi

(i = 1, . . . , k − 1),

(6)

Then the following two-term recurrence formulas for the coefficients βi^(k) are easily derived:

βi^(k) = αi [ βi−1^(k)/αi−1 − (ri, Ark)/‖ri‖² ],   β1^(k) = − (d1, Ark)/(d1, Ad1).    (7)

The termination criterion is taken in the form ‖r_k‖ ≤ ε or ‖r_k‖ ≤ ε‖r_1‖. The algorithm constructs the orthogonal set of vectors r_i, i ≤ k, and the A-conjugate (in the generalized sense) set of vectors d_i, i ≤ k. Note that this method belongs to the "long recurrence" algorithms with respect to the conjugate vectors (c.v.), because every new c.v. is computed from the conditions of A-conjugacy with respect to all preceding c.v.'s. But it is a "short recurrence" algorithm with respect to the orthogonal set of residuals. In the case of a symmetric matrix A the algorithm reduces to the classical CG method: the vector set d_i, i ≤ k, becomes A-conjugate in the usual sense and all β_i^{(k)}, i < k, vanish.
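The basic algorithm can be sketched in a few lines of NumPy (our own illustrative code, not the authors' Java implementation; for clarity the β's are obtained by solving the triangular one-sided conjugation system (d_j, A d_{k+1}) = 0, j ≤ k, directly rather than by the recurrence (7)):

```python
import numpy as np

def gcd_basic(A, b, x1, tol=1e-10):
    """Sketch of the basic generalized CD algorithm: iterates (3),
    residual update (4), new direction (5) from the triangular
    one-sided conjugation conditions."""
    n = len(b)
    x = np.asarray(x1, dtype=float).copy()
    r = b - A @ x
    d = r / np.linalg.norm(r)              # d_1 = r_1 / ||r_1||
    D, AD = [d], [A @ d]                   # long recurrence: keep all d_i, A d_i
    for _ in range(n):                     # at most N steps in exact arithmetic
        alpha = (r @ d) / (d @ AD[-1])     # step length (3)
        x = x + alpha * d
        r = r - alpha * AD[-1]             # residual update (4)
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        w = A @ r
        beta = np.zeros(len(D))
        for j in range(len(D)):            # triangular set of conjugation conditions
            s = D[j] @ w + sum(beta[i] * (D[j] @ AD[i]) for i in range(j))
            beta[j] = -s / (D[j] @ AD[j])
        d = r + sum(beta[i] * D[i] for i in range(len(D)))   # new direction (5)
        D.append(d)
        AD.append(A @ d)
    return x
```

On a well-conditioned sketch problem this terminates well before N iterations; on ill-conditioned ones it inherits the loss of orthogonality discussed in the next section.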

3 Final algorithm

The basic algorithm is close to several known algorithms, such as ORTHORES (L. A. Hageman, D. M. Young [10]) and some others. In exact arithmetic it reaches the solution in at most N iterations for almost every initial vector x_1 ([4], [13]). But in practice the efficiency of this algorithm turns out to be insufficient for large and/or ill-conditioned systems. The main reason for this shortcoming, in our opinion, is connected with the updating formula (4) for the residuals. The updating formula (with α_k from (3)) ensures orthogonality of the current residual r_{k+1} to the last conjugate direction d_k with high accuracy, but the orthogonality to all preceding residuals r_i, i ≤ k, is maintained only in exact arithmetic. Round-off errors are not corrected in the next step; they are only accumulated from step to step.

A generalized conjugate direction method

This accumulation gradually violates the orthogonality of the vectors {r_i} and destroys the A-conjugacy of the vectors {d_i}. We would like to underline that the basic property of the residuals {r_i}, required for the efficiency of the algorithm, is their mutual orthogonality. Accumulation of errors and derangement of the residuals' orthogonality is a principal inherent drawback of the basic algorithm (as of every short recurrence CD-algorithm).

At first sight, the remedy is the direct computation of the residuals by the formula r_k = b − Ax_k. But this way is wrong. The round-off errors in the computation of the step lengths (of the points x_{k+1}) are again accumulated, so the residuals are computed "exactly", but at "inexact" points! The orthogonality of the residuals is again gradually distorted. Besides, an additional matrix-vector multiplication per iteration would be required. We propose another way, which is realized in the final algorithm. Instead of the usual updating of the residuals we compute r_k directly from the conditions of orthogonality with respect to all preceding r⁰_i, i < k (using the modified Gram-Schmidt orthogonalization). Indeed, it is known a priori that the new residual must be orthogonal to all r⁰_i, i < k, so only a proper scaling is needed for the orthogonalized vector to coincide with the residual (in exact arithmetic). Such a scaling is given by the following formula:

r_{k+1} = −α_k ( A r⁰_k − Σ_{i=1}^{k} γ_{k,i} r⁰_i ),   γ_{k,i} = (A r⁰_k, r⁰_i),   r⁰_k = r_k / ‖r_k‖.   (8)
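Up to the scaling by −α_k, formula (8) is a modified Gram-Schmidt sweep that keeps the whole sequence of normalized residuals orthonormal. A generic sketch of such a sweep (an Arnoldi-style basis builder; names and shapes are ours, not the authors' code):

```python
import numpy as np

def orthonormal_krylov_basis(A, r1, k):
    """Build an orthonormal basis q_1, ..., q_k of the Krylov subspace
    K_k(A, r1): each A q_{j-1} is purged of its components along all
    previous basis vectors (modified Gram-Schmidt), then normalized."""
    Q = np.zeros((len(r1), k))
    Q[:, 0] = r1 / np.linalg.norm(r1)
    for j in range(1, k):
        w = A @ Q[:, j - 1]
        for i in range(j):                      # modified Gram-Schmidt sweep
            w = w - (w @ Q[:, i]) * Q[:, i]
        Q[:, j] = w / np.linalg.norm(w)
    return Q
```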

It can easily be shown that in exact arithmetic the formulas (8) and (4) for the residuals are identical (both determine a vector orthogonal to all {r_i}, i ≤ k, in the Krylov subspace K_{k+1}, and have equal projections onto the vector r⁰_k). The other formulas of the algorithm remain principally the same, but some changes appear because we introduce the normalized vectors r⁰_i instead of r_i. The vector d_k of (5) is now defined as follows:

d_{k+1} = r⁰_{k+1} + Σ_{i=1}^{k} β_i^{(k+1)} d_i,   (9)

β_i^{(k)} = α_i [ β_{i−1}^{(k)} / α_{i−1} − (r⁰_i, A r⁰_k) / ‖r_i‖ ],   β_1^{(k)} = − (r⁰_1, A r⁰_k) / (r⁰_1, A r⁰_1).   (10)

The formula (3) for the iterate x_{k+1} remains the same, but the formula for the step length α_k changes due to the new scaling of the vectors d_i:

x_{k+1} = x_k + α_k d_k,   α_k = ‖r_k‖ / (d_k, A d_k).   (11)

This formula for α_k follows from (4) and the identity

(r_k, d_k) = ( r_k, r⁰_k + Σ_{i=1}^{k−1} β_i^{(k)} d_i ) = ‖r_k‖   (12)


(since all vectors d_i, i ≤ k − 1, are linear combinations of the vectors r_j, j ≤ i). It is evident that the orthogonal vector set {r⁰_i} is less susceptible to degeneracy than the A-conjugate vector set {d_i}. Hence all computations based on the vectors {r⁰_i} have higher accuracy than those based on {d_i}. Therefore it is worthwhile to replace, whenever possible, operations based on {d_i} by ones based on {r⁰_i}. One has

(d_k, A d_k) = ( r⁰_k + Σ_{i=1}^{k−1} β_i^{(k)} d_i, A d_k ) = (r⁰_k, A d_k)
            = ( r⁰_k, A r⁰_k + Σ_{i=1}^{k−1} β_i^{(k)} (r_i − r_{i+1}) / α_i )
            = (r⁰_k, A r⁰_k) − ( β_{k−1}^{(k)} / α_{k−1} ) ‖r_k‖   (13)

(here we use formulas (1) and (6)). Thus the coefficients α_k and β_i^{(k)} are computed via the vectors {r⁰_i}, and the A-conjugate vectors {d_i} are used only for the computation of the current vector d_k by Eq. (9). With the modification (8) the algorithm becomes a "long recurrence" one also with respect to the residuals. This property is usually considered a shortcoming, since it entails increased storage requirements and complication of the computations. But the long recurrence property makes the algorithm more stable and less sensitive to round-off errors, as was noted already in the 1980s ([14]). So in the case of ill-conditioned or large problems a long recurrence becomes a merit of an algorithm rather than a drawback. The final algorithm performs only one matrix-vector multiplication per iteration. We omit here all additional details and options of the algorithm.

It can easily be seen that the final algorithm constructs the same bases in the Krylov subspaces as do GMRES and FOM (they use the similar Gram-Schmidt orthogonalization with the same initial vectors). But as for determining the steps in these subspaces, the computational scheme of our algorithm is quite different from that of GMRES (and FOM). GMRES finds the step by solving a linear set of equations with an upper Hessenberg matrix. This process involves Givens rotations for reducing the Hessenberg matrices to triangular form and/or other computational elements. In our algorithm this subproblem is solved by employing conjugate directions. It is important that no extra matrix-vector product is required per iteration.

4 Numerical experiments

The algorithm has been implemented in the Java programming language and has been tested on a variety of linear algebraic problems (in most cases ill-conditioned).


For comparison we have chosen the following methods: the classical CG [2]; Bi-CG [3], [6]; the Conjugate Gradient Squared (CGS) [17]; Bi-CGSTAB [18]; and GMRES [16]. We used the MATLAB implementations of the Bi-CG, CGS and Bi-CGSTAB methods ([21]), but for CG and GMRES we employed our own implementations. In order to reduce the execution time, the matrices were first precalculated and then used in the methods implemented in MATLAB. Our implementations of the CG and GMRES methods were benchmarked against the MATLAB implementations of these methods (the pcg and gmres functions of MATLAB), and it was established that the numbers of iterations were identical in both implementations, but the running time was less in our implementation. The termination criterion was taken in the form ‖r_k‖ ≤ ε‖r_1‖ (with ε = 10−13–10−15).³ All computations have been performed on a PC (Pentium 3.2 GHz, 2000 MB RAM) in double precision. Our main aims were to compare 1) the long recurrence algorithms with the short recurrence ones, and 2) the orthogonalization procedure for computing the residuals with the usual updating of the residuals.

First we present the results for symmetric systems with the following matrices (here the degree in the denominators is gradually increased, so the matrices become more and more degenerate):

SYMM1:  a_ii = 1/i,   a_ij (i ≠ j) = 1/(ij)   (14)
SYMM2:  a_ii = 1/i²,  a_ij (i < j) = 1/(ij²),      a_ij (i > j) = 1/(i²j)      (15)
SYMM3:  a_ii = 1/i³,  a_ij (i ≠ j) = 1/(ij)²   (16)
SYMM4:  a_ii = 1/i⁴,  a_ij (i < j) = 1/((ij)²j),   a_ij (i > j) = 1/((ij)²i)   (17)
SYMM5:  a_ii = 1/i⁵,  a_ij (i ≠ j) = 1/(ij)³   (18)

In Table 1 the results of the calculations for the number of variables N = 1000 and ε = ‖r‖/‖r_1‖ = 10−13 (in the termination criterion) are presented. The notation in this and the following tables: N is the number of variables, ε_x is the accuracy in the arguments, k_iter is the number of iterations, t is the running time. The classical CG and our basic method have successfully solved the relatively simple problems SYMM1–SYMM3; in the problem SYMM3 the accuracy of the CG was very low, and in the other problems (SYMM4, SYMM5) these algorithms

³ The algorithm provides in the k-th iteration the orthogonal residual point x^or_{k+1}, in which r_{k+1} is orthogonal to all preceding d_i: (r^or_{k+1}, d_i) = 0, i = 1, . . . , k. Having obtained the conjugate vector basis in the Krylov subspace K_k, at the cost of a few additional computations we obtained the minimal residual point x^mr_{k+1}, which is defined by the conditions (r^mr_{k+1}, A d_i) = 0, i = 1, . . . , k. For correct comparison with GMRES we used this point in the termination criterion.


Table 1. Number of iterations for symmetric matrices (14)–(18). N = 1000, ε = 10−13; "*" — the algorithm failed.

Problem   CG     GMRES   Basic algorithm   Final algorithm GCD
SYMM1     152    88      88                88
SYMM2     1001   177     198               177
SYMM3     1001   239     269               239
SYMM4     *      258     *                 258
SYMM5     *      170     *                 170

have failed. The GMRES and our final algorithm GCD have successfully solved all the problems with identical accuracy and number of iterations, and the running time was practically the same. We would like to note that in all the cases the number of iterations (and hence the number of stored conjugate vectors) in GMRES and our algorithm was several times smaller than the number of variables N (for N = 1000 it did not exceed 258).

Table 2 shows the results for the larger number of variables N = 10000. The CG algorithm and our basic algorithm have solved with reasonable accuracy only the first problem. The GMRES and our final algorithm have solved all problems; the numbers of iterations were again identical for both methods and much smaller than the dimension of the problem. The running times of the two methods were approximately the same.

Table 2. Number of iterations for symmetric matrices (14)–(18). N = 10000, ε = 10−13; "*" — the algorithm failed.

Problem   CG    GMRES   Basic algorithm   Final algorithm GCD
SYMM1     448   186     208               186
SYMM2     *     513     554               513
SYMM3     *     766     *                 766

We see that even when solving linear systems with symmetric matrices, general algorithms designed for non-symmetric problems turn out to be more stable and efficient than special algorithms for symmetric systems; it is clear that the matrices can remain symmetric in the process of computations only in exact arithmetic. In the next cycle of numerical experiments we consider linear problems with nonsymmetric matrices. They were obtained from the matrices of type


(14)–(18) by introducing an asymmetry factor µ:

ASYMM1:  a_ii = 1/i,   a_ij (i < j) = (1+µ)/(ij),        a_ij (i > j) = (1−µ)/(ij)        (19)
ASYMM2:  a_ii = 1/i²,  a_ij (i < j) = (1+µ)/(ij²),       a_ij (i > j) = (1−µ)/(i²j)       (20)
ASYMM3:  a_ii = 1/i³,  a_ij (i < j) = (1+µ)/(ij)²,       a_ij (i > j) = (1−µ)/(ij)²       (21)
ASYMM4:  a_ii = 1/i⁴,  a_ij (i < j) = (1+µ)/((ij)²j),    a_ij (i > j) = (1−µ)/((ij)²i)    (22)
ASYMM5:  a_ii = 1/i⁵,  a_ij (i < j) = (1+µ)/(ij)³,       a_ij (i > j) = (1−µ)/(ij)³       (23)

The following algorithms have been tested alongside our algorithm (GCD): Bi-CG, CGS, Bi-CGSTAB and GMRES (the first three are short recurrence algorithms). Results for N = 1000 with µ = 0.5 and ε = 10−13 are presented in Table 3.

Table 3. Numbers of iterations for unsymmetric problems with the matrices (19)–(23), solved by various algorithms; N = 1000; asymmetry coefficient µ = 0.5; ε = 10−13; "*" — the algorithm failed.

Matrix    GMRES   GCD   Bi-CG   CGS    Bi-CGSTAB
ASYMM1    95      95    184     131    92
ASYMM2    183     183   1430    2473   918
ASYMM3    244     244   *       *      *
ASYMM4    264     264   *       *      *
ASYMM5    176     176   *       *      *

The CGS, Bi-CG and Bi-CGSTAB algorithms have solved only the problems ASYMM1 and ASYMM2. The GMRES and our algorithm have solved all the problems with approximately the same accuracy and numbers of iterations. The data presented in the above tables enable us to draw the following conclusions:
– in solving ill-conditioned problems the short recurrence algorithms compare unfavorably with the long recurrence ones; only long recurrence algorithms are efficient for ill-conditioned problems of moderate and large dimensions;
– algorithms based on the usual updating of residuals (CG, our basic algorithm) are at a disadvantage in relation to algorithms based on an orthogonalization procedure (GMRES, our final algorithm);
– the convergence of our final algorithm GCD is identical to that of GMRES.
Therefore in the subsequent computations we dealt only with GMRES and our algorithm GCD. Table 4 shows the results obtained by these algorithms for the same problems with the larger number of variables N = 10000. Along with


the numbers of iterations, here we also present the accuracy in the arguments and the execution time. Again both algorithms have solved all problems with approximately the same numbers of iterations, accuracy and execution times.

Table 4. Results for unsymmetric problems with the matrices (19)–(23); N = 10000; asymmetry coefficient µ = 0.5; ε = 10−13.

          GMRES                       Proposed method GCD
Matrix    εx        kiter   t (sec)   εx        kiter   t (sec)
ASYMM1    < 10−9    200     224       < 10−9    200     224
ASYMM2    < 10−5    532     621       < 10−6    532     619
ASYMM3    < 0.009   788     957       < 0.005   788     956

In order to examine the algorithms on very large scale problems, we considered unsymmetric problems produced from the matrices (19) with nonzero elements only on five diagonals, i.e., a_ij = 0 for j > i + 2 and j < i − 2. Table 5 presents the results obtained by GMRES and our algorithm. Both methods were very efficient in solving the problems up to N = 150000 (on the given PC). We see that the accuracy of the solutions did not decrease as the dimension of the problem increased. The numbers of iterations of the two methods were again identical, and the running time was practically the same.

Table 5. Results for unsymmetric problems with the matrices (19) having only 5 diagonals with non-zero elements; asymmetry factor µ = 0.5; ε = 10−10.

         GMRES                       Proposed method GCD
N        εx       kiter   t (sec)    εx       kiter   t (sec)
1000     < 10−7   76      0.06       < 10−7   76      0.06
10000    < 10−6   159     2.02       < 10−6   159     1.95
50000    < 10−6   264     25.4       < 10−6   264     23.0
100000   < 10−6   329     83.4       < 10−6   329     73.9
150000   < 10−6   374     208        < 10−6   374     210

Table 6 demonstrates the influence of the asymmetry factor on the efficiency of GMRES and the proposed algorithm, for two problems with N = 1000: the matrix (19) and the Hilbert matrix modified with the asymmetry factor:

a_ij = (1 + µ)/(i + j − 1)   (i < j),     a_ij = (1 − µ)/(i + j − 1)   (i > j).   (24)

The asymmetry of the first matrix practically did not affect the performance of the algorithms. In the second problem even a very small matrix asymmetry had an impact on the convergence rate of both algorithms: they required N iterations for solving the problems. The above conclusion about the comparative efficiency of the two methods holds for any value of µ.

Table 6. Results for unsymmetric problems with various asymmetry coefficients µ (µ = 0, 0.1, 0.5, 1.0, 2.0); N = 1000; ε = 10−13.

… n ≥ 5 that are (φ, ψ)-circulants for appropriately chosen values of φ and ψ.

1. The issue that we treat in this short paper is motivated by our study of the normal Hankel problem, i.e., the problem of describing normal Hankel matrices. This problem is still open despite a number of available partial results. A detailed account of its present state is given in Section 1 of our paper [1]. We need a shorter version of this account to formulate and then prove our result.

Let

H = H_1 + iH_2   (1)

be an arbitrary Hankel matrix, H_1 and H_2 being its real and imaginary parts, respectively. Denote by P_n the backward identity matrix:

P_n = [         1 ]
      [     ⋰     ]
      [ 1         ].

Then

T = HP_n = T_1 + iT_2   (2)

On normal Hankel (φ, ψ)-circulants

is a Toeplitz matrix, T_1 and T_2 being again the real and imaginary parts of T. One can show that, for H to be a normal matrix, it is necessary and sufficient that the associated Toeplitz matrix (2) satisfy the relation

Im (T T*) = 0.   (3)

Let a_1, a_2, . . . , a_{n−1} and a_{−1}, a_{−2}, . . . , a_{−n+1} be the off-diagonal entries in the first row and the first column of T_1. Denote by b_1, b_2, . . . , b_{n−1} and b_{−1}, b_{−2}, . . . , b_{−n+1} the corresponding entries of T_2. Using these entries, we can form the matrices

F = [ a_{n−1}  b_{n−1} ]         G = [ a_{−1}    b_{−1}   ]
    [ a_{n−2}  b_{n−2} ]             [ a_{−2}    b_{−2}   ]
    [   ⋮        ⋮     ]   and       [   ⋮         ⋮      ]
    [ a_1      b_1     ]             [ a_{−n+1}  b_{−n+1} ].

It turns out that all the classes of normal Hankel matrices previously described in the literature correspond to the cases where, for at least one of the matrices F and G, the rank is less than two. Therefore, we hereafter assume that rank F = rank G = 2. In this case, the basic equality (3) implies (see the details in our paper [2]) that

G = FW,   (4)

where

W = [ α  β ]
    [ γ  δ ]

is a real 2 × 2 matrix with determinant

αδ − βγ = 1.   (5)

The matrix equality (4) is equivalent to the scalar relations

a_{−i} = αa_{n−i} + γb_{n−i},   b_{−i} = βa_{n−i} + δb_{n−i},   1 ≤ i ≤ n − 1.   (6)

Writing the Toeplitz matrix (2) in the form

T = [ t_0       t_1       t_2       …   t_{n−1} ]
    [ t_{−1}    t_0       t_1       …   t_{n−2} ]
    [ t_{−2}    t_{−1}    t_0       …   t_{n−3} ]
    [   …         …         …       …     …     ]
    [ t_{−n+1}  t_{−n+2}  t_{−n+3}  …   t_0     ],   (7)

V. N. Chugunov and Kh. D. Ikramov

we can replace the real relations (6) by the complex formulas

t_{−i} = φ t_{n−i} + ψ t̄_{n−i},   1 ≤ i ≤ n − 1,   (8)

where

φ = (α + δ)/2 + i (β − γ)/2,   ψ = (α − δ)/2 + i (β + γ)/2.   (9)

The complex form of relation (5) is as follows:

|φ|² − |ψ|² = 1.   (10)
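Formulas (8)–(10) are straightforward to verify numerically; a small sketch (the function name and the sample quadruple are ours):

```python
import numpy as np

def phi_psi(alpha, beta, gamma, delta):
    """The pair (phi, psi) of (9) associated with W = [[alpha, beta],
    [gamma, delta]]; if alpha*delta - beta*gamma = 1, then (10) holds."""
    phi = (alpha + delta) / 2 + 1j * (beta - gamma) / 2
    psi = (alpha - delta) / 2 + 1j * (beta + gamma) / 2
    return phi, psi

# sample quadruple with alpha*delta - beta*gamma = 1
alpha, beta, gamma, delta = 2.0, 1.0, 1.0, 1.0
phi, psi = phi_psi(alpha, beta, gamma, delta)
assert abs(abs(phi)**2 - abs(psi)**2 - 1) < 1e-12          # (10)
# the real relations (6) and the complex relation (8) agree:
a, b = 0.3, 0.7                                            # a_{n-i}, b_{n-i}
t = a + 1j * b
assert np.isclose((alpha*a + gamma*b) + 1j*(beta*a + delta*b),
                  phi*t + psi*np.conj(t))
```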

Let (φ, ψ) be a fixed pair of complex numbers obeying condition (10). A Toeplitz matrix T is called a (φ, ψ)-circulant if its entries satisfy relations (8). The corresponding Hankel matrix H = T P_n will be called a Hankel (φ, ψ)-circulant. The case ψ = 0, |φ| = 1 corresponds to the well-known classes of Toeplitz and Hankel φ-circulants. However, for ψ ≠ 0, it is not at all clear whether there exist nontrivial normal Hankel (φ, ψ)-circulants. Indeed, if the equalities (6) are substituted into our basic relation (3), then the result is a system of n − 1 real equations with respect to the 2n real unknowns a_0, a_1, . . . , a_{n−1} and b_0, b_1, . . . , b_{n−1}. Since these equations are quadratic, they need not have real solutions. It was shown in [1] that the above system is solvable for n = 3 and n = 4 for every quadruple (α, β, γ, δ) satisfying condition (5). The question of the existence of normal Hankel (φ, ψ)-circulants for larger values of n was left open there. Below, we construct a special class of Toeplitz matrices that generate normal Hankel matrices for any n ≥ 5. These matrices are (φ, ψ)-circulants for appropriate values of φ and ψ, with ψ ≠ 0.

2. We seek T as a Toeplitz matrix with the first row of the form

( 0  0  · · ·  0  a  b  a ).

Here, a = x + iy and b = z + iw are complex numbers to be determined. This matrix T must be a (φ, ψ)-circulant for appropriate values of φ and ψ (that is, for appropriate α, β, γ, and δ). The Hankel (φ, ψ)-circulant corresponding to this T is normal if and only if the basic relation (3) is fulfilled. Now, observe that the property of T of being a (φ, ψ)-circulant implies that T T* is a Toeplitz matrix (see [1] or [2] for explanations of this fact). Moreover, T T* is obviously a Hermitian matrix. It follows that the matrix relation (3) is equivalent to the n − 1 scalar conditions

Im {T T*}_{1j} = 0,   j = 2, 3, . . . , n.   (11)

Due to the "tridiagonal" structure of T, we have

{T T*}_{1j} = 0,   j = 4, 5, . . . , n − 2.


The remaining conditions in (11) correspond to j = 2, 3, n − 1 and n. They have the same form for any value of n beginning from n = 5. Thus, to find the desired a and b, it suffices to analyze the case n = 5. Since

{T T*}_{12} = b ā + a b̄,   {T T*}_{13} = |a|²,

the first two conditions in (11) are automatically fulfilled. It remains to satisfy the two conditions corresponding to j = 4 and j = 5. This yields the following system of two equations in the four real variables x, y, z and w:

βx² + (δ − α)xy − γy² = 0,   (12)

[2βx + (δ − α)y]z + [(δ − α)x − 2γy]w = 0.   (13)

Furthermore, we must keep in mind the relation

rank F = rank [ x  y ]
              [ z  w ]
              [ x  y ]
              [ 0  0 ]  = 2,

which is equivalent to the inequality

yz − xw ≠ 0

and excludes solutions of system (12), (13) for which x = y = 0. Suppose that (x, y) is a nontrivial solution of equation (12). Substituting x and y into (13), we obtain a linear equation with respect to z and w. However, if at least one of the expressions inside the brackets is nonzero, then this equation is equivalent to the relation

yz − xw = 0,   (14)

signifying that rank F = 1. Indeed, the determinant of the system composed of equations (13) and (14) is given by the formula

| 2βx + (δ − α)y   (δ − α)x − 2γy |
|        y                −x      |  = −2[βx² + (δ − α)xy − γy²]

and, hence, vanishes in view of (12). On the other hand, if, for the chosen solution (x, y), we have

2βx + (δ − α)y = 0,   (15)

(δ − α)x − 2γy = 0,   (16)

then (13) is satisfied by any pair (z, w). Almost all of these pairs satisfy the condition yz − xw ≠ 0.
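The determinant identity above can be checked numerically for random data:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, gamma, delta, x, y = rng.standard_normal(6)
# determinant of the 2x2 system formed by equations (13) and (14)
det = (2*beta*x + (delta - alpha)*y) * (-x) - ((delta - alpha)*x - 2*gamma*y) * y
assert np.isclose(det, -2 * (beta*x**2 + (delta - alpha)*x*y - gamma*y**2))
```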


By assumption, the homogeneous system (15), (16) has a nontrivial solution (x, y), which means that its determinant

| 2β      δ − α |
| δ − α   −2γ   |  = −4βγ − (δ − α)²

must be zero. Taking (5) into account, we obtain the condition

|δ + α| = 2.   (17)

Summing up, we have shown that, for every quadruple (α, β, γ, δ) satisfying conditions (5) and (17), there exist complex scalars a = x + iy and b = z + iw specifying the desired Toeplitz matrix T. This matrix is a (φ, ψ)-circulant for φ and ψ determined by the chosen values of α, β, γ, and δ. The corresponding matrix H (see (2)) is a normal Hankel (φ, ψ)-circulant.

V. N. Chugunov acknowledges the support of the Russian Foundation for Basic Research (projects nos. 04-07-90336 and 05-01-00721) and a Priority Research Grant OMN-3 of the Department of Mathematical Sciences of the Russian Academy of Sciences.
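A concrete instance of the construction for n = 5, with our own illustrative parameter choice α = δ = β = 1, γ = 0 (so that (5) and (17) hold); equations (12), (15), (16) are then satisfied by (x, y) = (0, 1), and we take (z, w) = (1, 0), i.e. a = i, b = 1:

```python
import numpy as np

alpha, beta, gamma, delta = 1.0, 1.0, 0.0, 1.0      # alpha*delta - beta*gamma = 1
n = 5
a, b = 1j, 1.0                                       # a = x + iy, b = z + iw
phi = (alpha + delta)/2 + 1j*(beta - gamma)/2        # (9)
psi = (alpha - delta)/2 + 1j*(beta + gamma)/2
assert abs(abs(phi)**2 - abs(psi)**2 - 1) < 1e-14    # (10)

t = {0: 0.0, 1: 0.0, 2: a, 3: b, 4: a}               # first row (0, 0, a, b, a)
for i in range(1, n):
    t[-i] = phi*t[n - i] + psi*np.conj(t[n - i])     # (phi, psi)-circulancy (8)
T = np.array([[t[j - i] for j in range(n)] for i in range(n)])
# the normality criterion (3) for the Hankel matrix H = T P_n:
assert np.allclose((T @ T.conj().T).imag, 0.0)
```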

References
1. V. N. Chugunov, Kh. D. Ikramov, On normal Hankel matrices of low orders, Mat. Zametki (accepted for publication).
2. V. N. Chugunov, Kh. D. Ikramov, On normal Hankel matrices, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI) 346 (2007), 63–80.

On the Treatment of Boundary Artifacts in Image Restoration by Reflection and/or Anti-Reflection

Marco Donatelli⋆ and Stefano Serra-Capizzano⋆⋆

Dipartimento di Fisica e Matematica, Università dell'Insubria - Sede di Como, Via Valleggio 11, 22100 Como, Italy
⋆ marco.donatelli@uninsubria.it, ⋆⋆ stefano.serrac@uninsubria.it, ⋆⋆ serra@mail.dm.unipi.it

Abstract. The abrupt boundary truncation of an image introduces artifacts in the restored image. For large image restoration problems with shift-invariant blurring, it is advisable to use Fast Fourier Transform (FFT)-based procedures to reduce the computational effort. In this direction several techniques manipulate the observed image at the boundary or make assumptions on the boundary of the true image, in such a way that FFT-based algorithms can be used. We compare the use of reflection with that of anti-reflection, in connection with the choice of the boundary conditions or for extending the observed image, both theoretically and numerically. Furthermore, we combine the two proposals. More precisely, we apply anti-reflection, followed by reflection if necessary, to the observed image, and we observe that the resulting restoration quality is increased with respect to the case of plain reflection.

Keywords: image deblurring, boundary conditions, fast transforms and matrix algebras.

1 Introduction

The blurred image is expressed as a function of an original scene that is larger than the field of view (FOV) of the blurred image, since pixels of the original scene outside the captured image window contribute to the pixels near the boundaries of the blurred observed image. Indeed, the standard observation model can be expressed as

g = Af_o + u,   (1)

where f_o and g, lexicographically ordered, are the true and observed images, and u is the noise. The matrix A represents a convolution of the true image f_o with the point spread function (PSF), which we assume to be known and shift-invariant. If the observed image is n × n and the PSF m × m, then (1) implies that f_o is (n+m−1) × (n+m−1) and that A is a Toeplitz matrix of size n² × (n+m−1)². This means that the linear system (1) is underdetermined.
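A 1-D analogue of the model (1) makes the sizes explicit (the PSF values and lengths below are our own illustrative choices):

```python
import numpy as np

n, m = 8, 3
h = np.array([0.2, 0.5, 0.3])            # hypothetical 1-D PSF
fo = np.arange(1.0, n + m)               # true signal of length n + m - 1
A = np.zeros((n, n + m - 1))
for i in range(n):
    A[i, i:i + m] = h                    # each row: a shifted copy of the PSF
g = A @ fo                               # noiseless observation (u = 0)
assert A.shape == (n, n + m - 1)         # more unknowns than equations
assert np.allclose(g, np.convolve(fo, h[::-1], mode='valid'))
```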


The goal is to recover f_o only in the FOV, i.e., the image f equal to the n × n middle part of f_o. A well-established solution to both the problems of non-uniqueness and noise amplification is regularization. A classic approach is Tikhonov regularization [10], which involves simultaneously minimizing the data error and a measure of the roughness of the solution. This leads to the linear system

(AᵀA + µI)f = Aᵀg,   (2)

where µ > 0 is the regularization parameter, which should be chosen appropriately and usually satisfies µ ≪ 1. In general the solution of the linear system (2) can be computationally expensive, since it is not automatic that an FFT-based algorithm can be applied directly. However, an interesting approach is proposed in [8] when m ≪ n. Indeed, for dealing with the rectangular matrix A while using FFT-based algorithms, it is necessary to resort to iterative methods [2], in which the main task is the application of FFT-based procedures for matrix-vector multiplication. Conversely, for employing FFT-based direct algorithms, the linear system to solve should have a coefficient matrix diagonalizable by a suitable fast trigonometric transform, such as the sine, cosine, ω-Fourier (|ω| = 1), or Hartley transforms (see e.g. [6]). This can be done by modifying system (1) in order to obtain a square coefficient matrix. The first approach amounts to imposing boundary conditions (BCs) on f_o and then computing a regularized solution of

Bf_b = g,   (3)

where B is n² × n², with a structure depending on the shift-invariant kernel and on the type of BCs [5]. The second approach is to extend g in some way to obtain g_e of size 2n × 2n, and then to regularize

Cf_e = g_e,   (4)

where C is the (2n)² × (2n)² circulant matrix obtained by periodically completing A; here the restored image is the n × n part of f_e corresponding to g [1]. In this paper, we compare the two approaches in the case of a reflective pad, i.e., the two proposals in [7] and [1]. We will also consider the use of reflection and anti-reflection as possible choices of boundary conditions. The main results are the following:
– In the case of strongly symmetric (symmetric with respect to each axis independently) PSFs, the considered approaches produce comparable restorations in practical problems.
– Imposing anti-reflective boundary conditions leads to a better restoration quality with respect to the reflective boundary conditions, at least for moderate levels of noise [9, 4, 3]. However, a direct fast method is available only in the strongly symmetric setting.

Fig. 1. (a) Full reflection of the left top quadrant to the right and bottom. (b) Half reflection on each edge of the middle image. (c) Half anti-reflection on each edge of the middle image (scaled image). The edges of the images are emphasized by tiny vertical and horizontal lines.

– To improve the results obtained by image extension as in (4), we use the ideas in [1], but instead of using reflection, we apply anti-reflection or anti-reflection followed by reflection. In this way we obtain an FFT-based algorithm also in the case of a generic PSF (not necessarily symmetric), so overcoming the limitations in [7, 9] concerning the assumption of a strongly symmetric convolution kernel.

The paper is completed with numerical results that validate the proposals and the related analysis.
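For reference, the circulant Tikhonov filter that underlies the FFT-based approach (4) can be sketched in 1-D (function name, PSF and sizes are our own):

```python
import numpy as np

def circulant_tikhonov(ge, h, mu):
    """Solve (C^T C + mu I) f = C^T ge in the Fourier domain, where C is
    the circulant matrix of the (assumed known) PSF h, zero-padded to
    len(ge); the eigenvalues of C are the FFT of the padded PSF."""
    H = np.fft.fft(h, n=len(ge))
    F = np.conj(H) * np.fft.fft(ge) / (np.abs(H)**2 + mu)
    return np.real(np.fft.ifft(F))

# noiseless periodic test: the filter inverts a circular blur almost exactly
rng = np.random.default_rng(0)
x = rng.standard_normal(16)
h = np.array([0.5, 0.3, 0.2])     # transfer function has no unit-circle zeros
y = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, n=16)))
assert np.allclose(circulant_tikhonov(y, h, 1e-12), x, atol=1e-6)
```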

2 Reflection for image extension and BCs

In this section, we compare the reflection pad used to extend g and the imposition of reflective BCs. The proposal in [1] for extending g is to form a new image g_e of size 2n × 2n as described in Fig. 1 (a). The observed image g is the top left quadrant, the top right quadrant is the mirror image of g around the y axis, and the bottom half is the mirror image of the top half around the x axis. After that, the solution of the Tikhonov linear system is computed by circular convolution, because the coefficient matrix C in (4) is circulant. In [1] it is shown that, for 1-D images and symmetric PSFs, when the truncated image is locally stationary at its boundaries, this approach leads to smaller expected errors in the restored image than applying the circular convolution directly to g. Indeed, the circular convolution assumes a circular signal and, independently of g, g_e is always a periodic image; hence it is reasonable to expect that it was obtained from a periodic scene. This clearly reduces the boundary artifacts in the restoration, also in the case of non-symmetric PSFs. We note that a reflection of size n/2 with respect to each edge can also be used, as in Fig. 1 (b), obtaining


the same algorithm. Indeed, the previous observation amounts only to a translation of the period of the image by n/2 in each variable. The use of reflective or Neumann BCs implies that the true image outside the FOV is a reflection of the image inside the FOV. Therefore f_o is assumed to be an extension by reflection of f, as in Fig. 1 (b). The reflection is done with respect to each edge with a bandwidth depending on the support of the PSF, since each pixel at the boundary needs to be well defined. Imposing reflective BCs, the square linear system has size n² and the matrix B in (3) has a Toeplitz-plus-Hankel structure. More specifically, if the PSF is strongly symmetric, then B can be diagonalized by the discrete cosine transform of type I (DCT-I) (two-dimensional in the case of images). Now we provide an algebraic formulation of the two approaches in the case of strongly symmetric PSFs in the 1-D case. The latter will allow us to give a qualitative comparison of the solutions computed by the two strategies applying Tikhonov regularization to (3) and (4), respectively. Since the PSF is symmetric, we have h = [h_{−q}, . . . , h_0, . . . , h_q] with h_{−i} = h_i and q = (m − 1)/2. Let T_k = { φ_α(x) = Σ_{j=−k}^{k} α_j e^{ijx}, α_{−j} = α_j } be the set of even trigonometric polynomials of degree at most k; then the symbol

φ_h(x) = Σ_{j=−q}^{q} h_j e^{ijx}   (5)

is such that φ_h ∈ T_q and q ≤ (n − 1)/2 for m ≤ n. Imposing reflective BCs, thanks to the symmetry of the PSF, in (3) we have B = R_n D R_nᵀ, where R_n is the DCT-I matrix (R_n is real and orthogonal), D = diag(b) with b = R_nᵀ(Be_1)/R_nᵀe_1 (the division is component-wise), and e_1 is the first vector of the canonical basis. Moreover, since b_i = φ_h(iπ/n), i = 0, . . . , n − 1, B can be expressed in terms of its symbol φ_h and will be denoted by B = R_n(φ_h) (see [7]). Therefore, using the Tikhonov regularization approach (2) for the linear system (3), we obtain

f_r = R_n diag( b / (b² + µ) ) R_nᵀ g,   (6)

where the operations between vectors are intended component-wise. Setting z = b/(b² + µ) and defining p_r ∈ T_{n−1} as the interpolating polynomial in the pairs (iπ/n, z_i) for i = 0, . . . , n − 1, we find

f_r = R_n(p_r) g.   (7)

For the other approach in (4), without loss of generality, let {1, . . . , n} be the FOV and n be even. Hence, by reflecting g = [g_1, . . . , g_n] on both sides, we have

g_e = [g_{n/2}, . . . , g_2, g_1, g_1, g_2, . . . , g_n, g_n, g_{n−1}, . . . , g_{n/2+1}],   (8)
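The extension (8) coincides with NumPy's 'symmetric' padding by n/2 on each side; a quick sketch (n even):

```python
import numpy as np

def reflect_extend(g):
    """Half reflection (8) on each side of g: the result has length 2n
    and is one period of an even, 2n-periodic extension of g."""
    n = len(g)
    return np.concatenate([g[:n // 2][::-1], g, g[n // 2:][::-1]])

g = np.arange(1.0, 7.0)                   # g_1, ..., g_6
ge = reflect_extend(g)
assert np.allclose(ge, [3, 2, 1, 1, 2, 3, 4, 5, 6, 6, 5, 4])
assert np.allclose(ge, np.pad(g, 3, mode='symmetric'))
```

In 2-D the same padding is applied along each axis, reproducing Fig. 1 (b).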


that, as already observed, leads to the same proposal as in [1]. Defining

    P = [ E_l ; I ; E_r ]  (of size 2n × n),                                      (9)

with E_l = [ J | 0 ]_{n/2×n}, E_r = [ 0 | J ]_{n/2×n}, and J the n/2 × n/2 flip matrix with entries [J]_{s,t} = 1 if s + t = n + 1 and zero otherwise, we have g_e = P g. Moreover C = F_{2n} Λ F_{2n}^H, where F_{2n} is the Fourier matrix of order 2n and Λ = diag(c), with c = F_{2n}^H(C e_1)/F_{2n}^H e_1 (the division is component-wise). Since c_i = φ_h(2πi/(2n)), i = 0, ..., 2n-1, we denote C = C_{2n}(φ_h). Using the Tikhonov regularization (2) for the linear system (4), the restored signal of size n is

    f_c = [ 0 | I | 0 ]_{n×2n} C_{2n}(p_c) P g,                                   (10)

where I is the identity of order n and, similarly to the reflective BCs case, p_c ∈ T_{2n-1} is the interpolating polynomial at the pairs (iπ/n, v_i) for i = 0, ..., 2n-1, with v = c/(c^2 + µ). We show that p_c ∈ T_n and that it is the interpolating polynomial at (iπ/n, v_i) for i = 0, ..., n, i.e., the points (iπ/n, v_i) for i = n+1, ..., 2n-1 do not add any further information. The interpolation conditions are

    p_c(iπ/n) = v_i,              i = 0, ..., n,                                  (11)
    p_c((n+i)π/n) = v_{n+i},      i = 1, ..., n-1.                                (12)

From the trigonometric identity cos((n+i)π/n) = cos((n-i)π/n) it follows that c_{n+i} = c_{n-i}, which implies v_{n+i} = v_{n-i} and p_c((n+i)π/n) = p_c((n-i)π/n), for i = 1, ..., n-1. Therefore, conditions (12) can be written as p_c((n-i)π/n) = v_{n-i} for i = 1, ..., n-1, which are a subset of (11). Moreover, c_i = b_i, and then v_i = z_i, for i = 0, ..., n-1. Concluding, let Ω_n = { iπ/n | i = 0, ..., n } be the set of interpolation nodes, forming a uniform grid on [0, π], and let ψ = φ_h/(φ_h^2 + µ); then

    p_c ∈ T_n      interpolating ψ in Ω_n,                                        (13)
    p_r ∈ T_{n-1}  interpolating ψ in Ω_n \ {π}.                                  (14)

In order to compare f_r with f_c, it remains to check whether [ 0 | I | 0 ] C_{2n}(φ_α) P belongs to the DCT-I algebra. Let φ_α ∈ T_{n/2}; then the n × 2n matrix T = [ 0 | I | 0 ] C_{2n}(φ_α) is the banded matrix (each row shifted one position to the right with respect to the previous one)

    T = [ α_{-n/2} ... α_0 ... α_{n/2}                                      ]
        [           ⋱          ⋱           ⋱                                ]    (15)
        [                       α_{-n/2} ... α_0 ... α_{n/2}                ]

232

M. Donatelli, S. Serra-Capizzano

and TP = R_n(φ_α). We note that [ 0 | I | 0 ] C_{2n}(φ_α) P = R_n(φ_α) holds only if φ_α ∈ T_{n/2}. Therefore this identity cannot be used directly in (10), since p_c ∈ T_n but generally fails to belong to T_{n/2}. However, from (7) and (10) it holds that

    f_r − f_c = (R_n(p_r) − [ 0 | I | 0 ] C_{2n}(p_c) P) g                        (16)
              = (R_n(p_r − φ_α) − [ 0 | I | 0 ] C_{2n}(p_c − φ_α) P) g,           (17)

for φ_α ∈ T_{n/2}. We take

    φ_α = arg min_{p ∈ T_{n/2}} ||ψ − p||_∞.                                      (18)

Therefore

    C_{2n}(p_c − φ_α) = C_{2n}(p_c − ψ + ψ − φ_α)                                 (19)
                      = C_{2n}(r_n) + C_{2n}(a_{n/2}),                            (20)

where r_n is the classical remainder of the trigonometric interpolation with n+1 equispaced nodes in [0, π], the nodes belonging to Ω_n, while a_{n/2} is the sup-norm optimal remainder of degree n/2. Similarly,

    R_n(p_r − φ_α) = R_n(r̃_{n-1}) + R_n(a_{n/2}),                                (21)

where r̃_{n-1} is the remainder of the trigonometric interpolation with n equispaced nodes in [0, π], the nodes belonging to Ω_n \ {x_n = π}. As a consequence, since the transforms associated with the circulant and the cosine algebras are unitary, the spectral norms ||C_n(s)||, ||R_n(s)|| are bounded by the infinity norm of s. Moreover ||P|| = ||[ 0 | I | 0 ]|| = 1 and hence, by using (19)-(21) in (17), we find

    ||f_r − f_c|| ≤ (||R_n(p_r − φ_α)|| + ||C_{2n}(p_c − φ_α)||) ||g||            (22)
                  ≤ (||r_n||_∞ + ||r̃_{n-1}||_∞ + 2||a_{n/2}||_∞) ||g||           (23)
                  ≤ 2(K n ||a_n||_∞ + ||a_{n/2}||_∞) ||g||,                       (24)

with K a constant, where the latter inequality follows from the evaluation of the Lebesgue constants of the interpolation operators. In fact, after the change of variable y = cos(x), the operator behind r_n is the interpolation on [−1, 1] at the Chebyshev nodes of the second kind (the zeros of sin(nx)/sin(x)) plus the additional endpoints {±1}: its Lebesgue constant is known to grow as K log(n). The other Lebesgue constant, related to the operator behind r̃_{n-1}, is again associated with the Chebyshev nodes of the second kind plus only y = 1 (i.e. x = x_0 = 0); in this case the Lebesgue constant is known to grow as Kn. Since ||a_t||_∞ converges exponentially to zero as t tends to infinity (due to the C^∞ regularity of ψ), it follows that ||f_r − f_c|| converges exponentially to zero as n tends to infinity. As a consequence, the vectors f_r and f_c do not coincide in general, but their numerical difference is negligible already for moderate values of n.


Finally, when the PSF is not strongly symmetric, we notice that B cannot be diagonalized by the DCT-I and has only a Toeplitz-plus-Hankel structure. Therefore, in general, the linear system arising from Tikhonov regularization with reflective BCs cannot be solved by an FFT-based algorithm. On the other hand, the other approach, based on the extension of g, can again be applied without modifications.

3

Image extension by anti-reflection

The reflective pad is effective if the image is locally stationary at its boundaries, but it can still create significant artifacts if the image intensity has a large gradient at the boundary. Reflecting the image creates a cusp that is likely to be highly inconsistent with the original image, since the image beyond the boundary more than likely continues to change according to the gradient at the boundary, rather than the negative of that gradient. According to this observation, in [9] the author proposed to anti-reflect, instead of reflect, the image at the boundary. This idea preserves the continuity of the normal derivative at the boundary, without creating a cusp. Fig. 1 (c) shows how to extend an image by anti-reflection. We note a different scaling with respect to Fig. 1 (a) and Fig. 1 (b), since the anti-reflection produces values outside the original range and the visualization requires rescaling the image. We analyze 1-D images in detail. Imposing anti-reflective BCs, the image f = [f_1, ..., f_n] is assumed to be extended as

    f_{1-j} = 2f_1 − f_{j+1},        f_{n+j} = 2f_n − f_{n-j},                    (25)

for j = 1, 2, ... [9]. Anti-reflective BCs usually provide better restorations than reflective BCs, also in practical 2-D applications, while, from a computational viewpoint, they share the same properties as reflective BCs [4, 3]. Indeed, when the PSF is strongly symmetric, the matrix B in (3) is essentially diagonalized by the discrete sine transform of type III (DST-III), in the sense that the first and last equations are decoupled and the inner (n−2) × (n−2) block can be diagonalized by the DST-III. Hence, several computations involving B, like Tikhonov regularization, can be done by FFT-based algorithms. In the remaining case of a PSF that is not strongly symmetric, the matrix B is Toeplitz plus Hankel plus a rank-two correction, and the linear system arising from Tikhonov regularization cannot be handled by simply invoking FFT-based algorithms. Therefore, when the PSF is not strongly symmetric, it could be useful to apply the anti-reflective pad to extend g and to regularize (4). The extended image g_e can be easily computed as g_e = Pg, with P defined in (9), where now E_l = [ 2e | −J | 0 ] and E_r = [ 0 | −J | 2e ], with e = [1, ..., 1]^T. We observe that in the case of a strongly symmetric PSF with the anti-reflective pad, differently from


Fig. 2. (a) Original image, where the box indicates the observed region. (b) Gaussian blurred and noisy image. (c) Out-of-focus blurred and noisy image.

the reflective case, the two approaches (BCs on f and extension of g) produce different restorations, usually of comparable quality: indeed the eigenvalues of B are not a subset of the eigenvalues of C, as happens for the reflective pad, even though they are defined on a uniform grid { iπ/(n+1) | i = 1, ..., n } as well. The main problem with extending g by anti-reflection is that g_e is not periodic, and the model (4) could suffer from this. On the other hand, the ringing effects are greatly reduced with respect to applying the circulant deconvolution directly to g, since the boundaries are far away from the restored portion of the image, when compared with the circulant case. However, we can improve the model, and hence the restoration, by extending g_e by reflection, obtaining a new periodic extended image g_p of size 4n × 4n. Clearly this further proposal leads to a moderate increase in the computational effort. In fact, as observed in [1], g_p is real and symmetric and hence only the computation of the real part of a 2-D FFT of size 2n × 2n is required.
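The anti-reflective rule (25) is a direct transcription; a minimal numpy sketch (function name and test values are ours):

```python
import numpy as np

def antireflect_pad(g, k):
    """Anti-reflective extension of eq. (25):
    g_{1-j} = 2 g_1 - g_{j+1},  g_{n+j} = 2 g_n - g_{n-j},  j = 1..k."""
    g = np.asarray(g, dtype=float)
    left = 2*g[0] - g[1:k+1][::-1]     # g_{1-k}, ..., g_0
    right = 2*g[-1] - g[-2:-k-2:-1]    # g_{n+1}, ..., g_{n+k}
    return np.concatenate([left, g, right])

g = np.array([1.0, 2.0, 4.0, 8.0])
ge = antireflect_pad(g, 2)
# left:  2*g1 - [g3, g2] = [-2, 0];  right: 2*g4 - [g3, g2] = [12, 14]
```

Unlike the reflective pad, the extension continues the signal linearly through each boundary value, so no cusp is introduced.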

4

Numerical experiments

For the following experiments we use Matlab 7.0. The blurred images are contaminated by mild white Gaussian noise. The restorations are compared visually, and the relative restoration error (RRE) is defined as ||f̂ − f||_2 / ||f||_2, where f̂ and f are the restored and the true image, respectively. For the Tikhonov regularization the parameter µ is chosen experimentally so that it minimizes the RRE over a certain range of µ. The image in Fig. 2 (a) was blurred with a Gaussian PSF (Fig. 2 (b)) and with an out-of-focus PSF (Fig. 2 (c)). The observed images are n × n with n = 195. Since both PSFs are strongly symmetric, we can compare the two approaches based on reflective BCs and on reflective extension of the observed image, respectively. The restored images and the absolute difference of the RREs for the two strategies, in Fig. 3 and Fig. 4, validate the theoretical analysis given in Section 2. We note that both strategies reach the minimum RRE for the same

Fig. 3. Restorations of the image in Fig. 2 (b) (Gaussian blur): (a) restoration by reflective BCs, (b) restoration by reflective extension of the observed image, (c) loglog difference of the RREs for the two approaches ('*' corresponds to the optimal µ = 0.026 used in the restored images (a) and (b); absence of the line means an exactly zero value).

Fig. 4. Restorations of the image in Fig. 2 (c) (out-of-focus blur): (a) restoration by reflective BCs, (b) restoration by reflective extension of the observed image, (c) loglog difference of the RREs for the two approaches ('*' corresponds to the optimal µ = 0.304 used in the restored images (a) and (b)).

value of µ, and we observe that, around this minimum, the absolute difference of the RREs has the same order as the machine precision (10^−16). Now we consider the anti-reflective extension of the observed image described in Section 3, and we compare it only with the reflective extension, in the case of a nonsymmetric PSF. Indeed, for strongly symmetric PSFs we have seen that the two approaches based on reflective BCs and on reflective extension of the observed image are equivalent. Moreover, in the recent literature a certain supremacy of the anti-reflective BCs with respect to the reflective BCs is widely documented [4, 3], for moderate levels of noise. On the other hand, when the PSF is not strongly symmetric, the BC approach with Tikhonov regularization leads to a linear system that cannot be solved by FFT-based algorithms. Hence, in such a case the only fast approach is the one based on the extension of the observed image. According to the above comments, we choose a PSF representing a motion along


Fig. 5. (a) Motion-blurred and noisy image. (b) Restoration by reflective extension, 2n × 2n (RRE = 0.0932). (c) Restoration by anti-reflective extension, 2n × 2n (RRE = 0.0807). (d) Restoration by anti-reflective extension and then reflective extension, 4n × 4n (RRE = 0.0770).

Fig. 6. Loglog RRE vs. µ for the test in Fig. 5 and the three approaches: −− reflective extension, · · · anti-reflective extension 2n × 2n, −− anti-reflective extension and then reflective extension 4n × 4n.

the x axis. The original image is again that in Fig. 2 (a), while the blurred and noisy image is in Fig. 5 (a). In Fig. 5 (c) the restored image is obtained by anti-reflective extension; even though the extended image is not periodic, it is better than the restored image with reflective extension in Fig. 5 (b). The improvement is especially visible near the right edge, that is, in the direction of the motion. If we want to further improve the restoration, as described in Section 3, we can extend by reflection the 2n × 2n image obtained by the anti-reflective pad and then apply the circulant deconvolution to the new 4n × 4n problem. Indeed, the restored image in Fig. 5 (d) is better than that in Fig. 5 (c). Moreover, the last approach is more stable under perturbations of the parameter µ, as shown in Fig. 6 by the plot of the RREs vs. µ for the considered approaches.

Acknowledgment. The work of the authors was partially supported by MUR, grant №2006017542.


References

1. F. Aghdasi and R. K. Ward, Reduction of boundary artifacts in image restoration, IEEE Trans. Image Process., 5 (1996), pp. 611-618.
2. M. Bertero and P. Boccacci, A simple method for the reduction of the boundary effects in the Richardson-Lucy approach to image deconvolution, Astron. Astrophys., 437 (2005), pp. 369-374.
3. M. Christiansen and M. Hanke, Deblurring methods using antireflective boundary conditions, manuscript, 2006.
4. M. Donatelli, C. Estatico, A. Martinelli, and S. Serra-Capizzano, Improved image deblurring with anti-reflective boundary conditions and re-blurring, Inverse Problems, 22 (2006), pp. 2035-2053.
5. P. C. Hansen, J. G. Nagy, and D. P. O'Leary, Deblurring Images: Matrices, Spectra, and Filtering, SIAM, Philadelphia, PA, 2006.
6. T. Kailath and V. Olshevsky, Displacement structure approach to discrete-trigonometric-transform based preconditioners of G. Strang type and T. Chan type, Calcolo, 33 (1996), pp. 191-208.
7. M. K. Ng, R. H. Chan, and W. C. Tang, A fast algorithm for deblurring models with Neumann boundary conditions, SIAM J. Sci. Comput., 21 (1999), pp. 851-866.
8. S. J. Reeves, Fast image restoration without boundary artifacts, IEEE Trans. Image Process., 14 (2005), pp. 1448-1453.
9. S. Serra-Capizzano, A note on anti-reflective boundary conditions and fast deblurring models, SIAM J. Sci. Comput., 25 (2003), pp. 1307-1325.
10. A. N. Tikhonov, Solution of incorrectly formulated problems and the regularization method, Soviet Math. Dokl., 4 (1963), pp. 1035-1038.

Zeros of Determinants of λ-Matrices

Walter Gander
Computational Science, ETH, CH-8092 Zürich, Switzerland
gander@inf.ethz.ch

Abstract. Jim Wilkinson discovered that the computation of zeros of polynomials is ill-conditioned when the polynomial is given by its coefficients. For many problems we need to compute zeros of polynomials, but we do not necessarily need to represent the polynomial by its coefficients. We develop algorithms that avoid the coefficients. They turn out to be stable; the drawback, however, is an often heavily increased computational effort. Modern processors, on the other hand, are mostly idle and waiting for numbers to crunch, so it may pay to accept more computations in order to increase stability and also to exploit parallelism. We apply the method to nonlinear eigenvalue problems.

Keywords: Nonlinear eigenvalue problems, Gaussian elimination, determinants, algorithmic differentiation.

1

Introduction

The classical textbook approach to solve an eigenvalue problem Ax = λx is to first compute the coefficients of the characteristic polynomial P_n(λ) = det(λI − A) by expanding the determinant,

    P_n(λ) = c_0 + c_1 λ + · · · + c_{n-1} λ^{n-1} + λ^n,

and then, second, to apply some iterative method, e.g. Newton's method, to compute the zeros of P_n, which are the eigenvalues of the matrix A. In the beginning of the era of numerical analysis, a research focus was to develop reliable solvers for zeros of polynomials; a typical example is [4]. However, the crucial discovery by Jim Wilkinson [6] was that the zeros of a polynomial can be very sensitive to small changes of the coefficients of the polynomial. Thus the determination of the zeros from the coefficients is ill-conditioned. It is easy today to repeat the experiment using a computer algebra system. Executing the following Maple statements

p := 1:
for i from 1 by 1 to 20 do p := p*(x-i) od:
PP := expand(p);
Digits := 7;
PPP := evalf(PP);
Digits := 30;
Z := fsolve(PPP, x, complex, maxsols = 20);

we can simulate what Jim Wilkinson experienced. We first expand the product

    ∏_{i=1}^{20} (x − i) = x^20 − 210 x^19 ± · · · + 20!

and then round the coefficients to floating point numbers with 7 decimal digits,

    x^20 − 210.0 x^19 + 20615.0 x^18 ∓ · · · − 8.752948 × 10^18 x + 2.432902 × 10^18.

Continuing the computation with 30 decimal digits to determine the exact zeros of the polynomial with truncated coefficients, we note that we do not obtain the numbers 1, 2, ..., 20. Instead many zeros are complex, such as e.g. 17.175 ± 9.397i. Thus truncating the coefficients to 7 decimal digits has a very large effect on the zeros. The problem is ill-conditioned.
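The same experiment can be repeated in a few lines of Python with numpy (the rounding scheme below is ours; the precise perturbed zeros depend on how the coefficients are rounded):

```python
import numpy as np

# Wilkinson's polynomial: monic with roots 1..20
coeffs = np.poly(np.arange(1, 21))
# round every coefficient to 7 significant decimal digits
rounded = np.array([float('%.6e' % c) for c in coeffs])
roots = np.roots(rounded)
# the zeros move dramatically; several acquire large imaginary parts
assert np.max(np.abs(roots.imag)) > 0.1
```

Rounding at the seventh digit perturbs the coefficients only at relative level ~1e-7, yet some zeros move by whole units, which is the ill-conditioning the text describes.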

2

Matlab Reverses Computing

Instead of expanding the determinant to obtain the coefficients of the characteristic polynomial, the command P = poly(A) in Matlab computes the eigenvalues of A by the QR algorithm and expands the linear factors

    P_n(λ) = (λ − λ_1)(λ − λ_2) · · · (λ − λ_n) = λ^n + c_{n-1} λ^{n-1} + · · · + c_0

to compute the coefficients. Given, on the other hand, the coefficients c_k of a polynomial, the command lambda = roots(P) forms the companion matrix

    A = [ −c_{n-1}  −c_{n-2}  · · ·  −c_1  −c_0 ]
        [    1         0      · · ·    0     0  ]
        [    0         1      · · ·    0     0  ]
        [    ⋮         ⋮        ⋱      ⋮     ⋮  ]
        [    0         0      · · ·    1     0  ]

and uses again the QR algorithm to find the eigenvalues, which are the zeros of the polynomial.
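The companion construction is easy to check on a small example (here in numpy rather than Matlab; the test polynomial is ours):

```python
import numpy as np

# p(x) = (x-1)(x-2)(x-3) = x^3 - 6x^2 + 11x - 6, so c2, c1, c0 = -6, 11, -6
c = np.array([-6.0, 11.0, -6.0])
A = np.zeros((3, 3))
A[0, :] = -c               # first row: -c_{n-1}, ..., -c_0
A[1:, :-1] = np.eye(2)     # shifted identity below
# the eigenvalues of the companion matrix are the zeros of p
assert np.allclose(np.sort(np.linalg.eigvals(A).real), [1.0, 2.0, 3.0])
```

This is exactly the detour the text describes: roots via eigenvalues of a matrix, rather than eigenvalues via roots of a polynomial.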

3

Evaluating the Characteristic Polynomial

How can we evaluate the characteristic polynomial without first computing its coefficients? One way is to use Gaussian elimination and the fact that it is easy to


compute the determinant of a triangular matrix. Assume that we have computed the decomposition

    C = LU

with L a lower unit triangular and U an upper triangular matrix. Then det(C) = det(L) det(U) = u_11 u_22 · · · u_nn, since det(L) = 1. Using partial pivoting for the decomposition we have to change the sign of the determinant each time we interchange two rows. The program then becomes:

function f = determinant(C)
n = length(C);
f = 1;
for i = 1:n
  [cmax,kmax] = max(abs(C(i:n,i)));
  if cmax == 0   % matrix singular
    f = 0;
    return
  end
  kmax = kmax+i-1;
  if kmax ~= i
    h = C(i,:); C(i,:) = C(kmax,:); C(kmax,:) = h;
    f = -f;
  end
  f = f*C(i,i);
  % elimination step
  C(i+1:n,i) = C(i+1:n,i)/C(i,i);
  C(i+1:n,i+1:n) = C(i+1:n,i+1:n) - C(i+1:n,i)*C(i,i+1:n);
end
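For readers outside Matlab, a numpy transcription of the same LU-with-pivoting determinant (function name ours) can be checked against the library routine:

```python
import numpy as np

def determinant(C):
    """det(C) via Gaussian elimination with partial pivoting,
    tracking the sign of each row interchange."""
    C = np.array(C, dtype=float)
    n = C.shape[0]
    f = 1.0
    for i in range(n):
        k = i + np.argmax(np.abs(C[i:, i]))   # pivot row
        if C[k, i] == 0.0:
            return 0.0                        # matrix singular
        if k != i:
            C[[i, k]] = C[[k, i]]
            f = -f                            # row swap flips the sign
        f *= C[i, i]
        # elimination step
        C[i+1:, i] /= C[i, i]
        C[i+1:, i+1:] -= np.outer(C[i+1:, i], C[i, i+1:])
    return f

rng = np.random.default_rng(5)
M = rng.standard_normal((6, 6))
assert np.isclose(determinant(M), np.linalg.det(M))
```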

Let C(λ) = λI − A. We would like to use Newton's method to compute the zeros of P(λ) = det(C(λ)) = 0. For this we need the derivative P′(λ). It can be computed by algorithmic differentiation, that is, by differentiating each statement of the program that computes P(λ). For instance, the statement updating the determinant, f = f*C(i,i);, will be preceded by the statement for the derivative, thus

fs = fs*C(i,i) + f*Cs(i,i);
f = f*C(i,i);

We use the variable Cs for the matrix C′(λ) and fs for the derivative of the determinant. For larger matrices there is, however, the danger that the value of the determinant overflows or underflows. Notice that for Newton's iteration we do not need both values f = det(C(λ)) and fs = (d/dλ) det(C(λ)). It is sufficient to

compute the ratio

    P(λ)/P′(λ) = f/fs.

Overflow can be reduced by computing the logarithm. Thus instead of computing f = f*C(i,i) we can compute lf = lf + log(C(i,i)). Even better is the derivative of the logarithm,

    lfs := (d/dλ) log(f) = fs/f,

which yields directly the inverse Newton correction. Thus instead of updating the logarithm lf = lf + log(c_ii) we directly compute the derivative

    lfs = lfs + cs_ii / c_ii.

These considerations lead to

function ffs = deta(C,Cs)
% DETA computes the Newton correction ffs = f/fs
n = length(C);
lfs = 0;
for i = 1:n
  [cmax,kmax] = max(abs(C(i:n,i)));
  if cmax == 0   % matrix singular
    ffs = 0;
    return
  end
  kmax = kmax+i-1;
  if kmax ~= i
    h = C(i,:); C(i,:) = C(kmax,:); C(kmax,:) = h;
    h = Cs(kmax,:); Cs(kmax,:) = Cs(i,:); Cs(i,:) = h;
  end
  lfs = lfs + Cs(i,i)/C(i,i);
  % elimination step
  Cs(i+1:n,i) = (Cs(i+1:n,i)*C(i,i)-Cs(i,i)*C(i+1:n,i))/C(i,i)^2;
  C(i+1:n,i) = C(i+1:n,i)/C(i,i);
  Cs(i+1:n,i+1:n) = Cs(i+1:n,i+1:n) - Cs(i+1:n,i)*C(i,i+1:n) - ...
                    C(i+1:n,i)*Cs(i,i+1:n);
  C(i+1:n,i+1:n) = C(i+1:n,i+1:n) - C(i+1:n,i)*C(i,i+1:n);
end
ffs = 1/lfs;

Note that, as an alternative to the algorithmic differentiation presented here, one could use the formula of Jacobi,

    (d/dλ) det(C(λ)) = det(C(λ)) trace( C(λ)^{-1} C′(λ) ),

which gives an explicit expression for the derivative of the determinant.
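Jacobi's formula is easy to sanity-check numerically against a central finite difference (a generic numpy sketch; matrix and step size are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n))
lam = 0.7
C = lam*np.eye(n) - A            # C(lam) = lam*I - A
Cs = np.eye(n)                   # C'(lam) = I
# Jacobi: d/dlam det(C) = det(C) * trace(C^{-1} C')
jac = np.linalg.det(C) * np.trace(np.linalg.solve(C, Cs))
# central finite-difference check
h = 1e-6
fd = (np.linalg.det((lam+h)*np.eye(n) - A)
      - np.linalg.det((lam-h)*np.eye(n) - A)) / (2*h)
assert abs(jac - fd) < 1e-4 * max(1.0, abs(jac))
```

Note that the algorithmic-differentiation route in deta avoids forming det(C) at all, which is precisely what protects it from over- and underflow.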

4

Suppression instead Deflation

If x_1, ..., x_k are already computed zeros, then we would like to continue working with the deflated polynomial

    P_{n-k}(x) := P_n(x) / ((x − x_1) · · · (x − x_k))                            (1)

of degree n − k. However, we cannot explicitly deflate the zeros, since we are working with P(λ) = det(λI − A). Differentiating Equation (1) we obtain

    P′_{n-k}(x) = P′_n(x)/((x − x_1) · · · (x − x_k))
                − P_n(x)/((x − x_1) · · · (x − x_k)) · Σ_{i=1}^{k} 1/(x − x_i).

Thus the Newton iteration becomes

    x_new = x − P_{n-k}(x)/P′_{n-k}(x)
          = x − (P_n(x)/P′_n(x)) / ( 1 − (P_n(x)/P′_n(x)) Σ_{i=1}^{k} 1/(x − x_i) ).

This variant of Newton's iteration is called the Newton-Maehly iteration [2, 3].
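The suppression idea is easy to demonstrate on a small explicit polynomial (a self-contained sketch; the function, test polynomial and starting values are ours):

```python
import numpy as np

def newton_maehly(p, dp, roots_found, x, tol=1e-12, maxit=100):
    """One zero of p by Newton's method with suppression of
    already computed zeros (Newton-Maehly)."""
    for _ in range(maxit):
        r = p(x) / dp(x)                            # plain Newton correction
        s = sum(1.0/(x - xi) for xi in roots_found)
        step = r / (1.0 - r*s)                      # suppressed correction
        x -= step
        if abs(step) < tol:
            break
    return x

p  = lambda x: (x-1)*(x-2)*(x-3)                    # zeros 1, 2, 3
dp = lambda x: 3*x**2 - 12*x + 11
found = []
for guess in (0.5, 1.7, 3.3):
    found.append(newton_maehly(p, dp, found, guess))
assert np.allclose(sorted(found), [1.0, 2.0, 3.0], atol=1e-8)
```

Each call iterates on the implicitly deflated polynomial, so the second guess cannot fall back to the zero that was already found.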

5

Example

We generate a random symmetric matrix A with eigenvalues 1, 2, ..., n:

x = [1:n]'; Q = rand(n); Q = orth(Q); A = Q*diag(x)*Q';

respectively a nonsymmetric matrix with

x = [1:n]'; Q = rand(n); A = Q*diag(x)*inv(Q);

Then we compute the solutions of det(C(λ)) = 0 with C(λ) = λI − A using the Newton-Maehly iteration. We compare the results with those obtained by the QR algorithm, eig(A), and with the zeros of the characteristic polynomial, roots(poly(A)). In Tables 1 and 2 the norm of the difference between the computed and the exact eigenvalues is printed. Notice that, due to ill-conditioning, the roots of the characteristic polynomial differ very much, and that for n = 200 the coefficients of the characteristic polynomial overflow and the zeros cannot be computed any more. On the other hand, we can see that our method competes in accuracy very well with the standard QR algorithm.

Table 1. Norm of the difference between the computed and the exact eigenvalues for a symmetric matrix

    n     roots(poly(A))   eig(A)        det(A − λI) = 0
    50    1.3598e+02       3.9436e−13    4.7243e−14
    100   9.5089e+02       1.1426e−12    1.4355e−13
    150   2.8470e+03       2.1442e−12    3.4472e−13
    200   −−−              3.8820e−12    6.5194e−13

Table 2. Norm of the difference between the computed and the exact eigenvalues for a nonsymmetric matrix

    n     roots(poly(A))   eig(A)        det(A − λI) = 0
    50    1.3638e+02       3.7404e−12    2.7285e−12
    100   9.7802e+02       3.1602e−11    3.5954e−11
    150   2.7763e+03       6.8892e−11    3.0060e−11
    200   −−−              1.5600e−10    6.1495e−11

6

Generalization to λ-matrices

Consider a quadratic eigenvalue problem

    det(C(λ)) = 0,    with C(λ) = λ^2 M + λC + K.

If det(M) ≠ 0, then one way to "linearize" the problem is to consider the equivalent generalized eigenvalue problem of dimension 2n:

    det( [ M 0 ; 0 K ] − λ [ 0 M ; −M −C ] ) = 0.

Alternatively, with our approach we can compute the zeros of det(C(λ)) with Newton's iteration. Take the mass-spring system example from [5]. For the non-overdamped case the matrix is C(λ) = λ^2 M + λC + K with

    M = I,  C = τ tridiag(−1, 3, −1),  K = κ tridiag(−1, 3, −1)

and with κ = 5, τ = 3 and n = 50. The Matlab program to compute the eigenvalues is

% Figure 3.3 in Tisseur-Meerbergen
clear, format compact
n = 50
tau = 3, kappa = 5,
e = -ones(n-1,1);
C = (diag(e,-1) + diag(e,1) + 3*eye(n));
K = kappa*C; C = tau*C;
lam = -0.5+0.1*i;
tic
for k = 1:2*n
  ffs = 1; q = 0;
  while abs(ffs) > 1e-14
    Q = lam*(lam*eye(n) + C) + K;
    Qs = 2*lam*eye(n) + C;
    ffs = deta(Q,Qs);
    s = 0;
    if k > 1
      s = sum(1./(lam-lamb(1:k-1)));
    end
    lam = lam - ffs/(1-ffs*s);
    q = q+1;
  end
  clc
  k, lam, q, ffs,
  lamb(k) = lam;
  lam = lam*(1+0.01*i);
end
toc
clf
plot(real(lamb),imag(lamb),'o')

and produces Figure 1. The computation in Matlab needed 13.9 seconds on an IBM X41 laptop. As starting value for the iteration we used the complex number λ(1 + 0.01i), with λ the last computed eigenvalue.

Fig. 1. Eigenvalues in the complex plane for the non-overdamped case


In the second, "overdamped", case we have κ = 5, τ = 10. Since the eigenvalues are all real, we can choose real starting values. We chose 1.01λ, where again λ was the last eigenvalue found. Figure 2 shows the eigenvalues, which are all real and were computed with Matlab in 16.3 seconds.

Fig. 2. Real eigenvalues for the overdamped case

Finally we recomputed a cubic eigenvalue problem from [1]. Here we have

    C(λ) = λ^3 A_3 + λ^2 A_2 + λ A_1 + A_0

with A_0 = tridiag(1, 8, 1), A_2 = diag(1, 2, ..., n) and A_1 = A_3 = I. In [1] the matrix dimension was n = 20, thus 60 eigenvalues had to be computed. Using our method we compute these in 1.9 seconds. Figure 3 shows the 150 eigenvalues for n = 50, which were computed in 17.9 seconds.

Fig. 3. Cubic eigenvalue problem

7

Conclusion

We have demonstrated that computing zeros of polynomials from their coefficients is ill-conditioned. However, direct evaluation of the characteristic polynomial is feasible. With this computationally intensive method we have shown that medium-size nonlinear eigenvalue problems may be solved with a simple program which computes determinants by Gaussian elimination, applies algorithmic differentiation, and suppresses already computed zeros. We obtained results in reasonable time in spite of the fact that we did not compile the Matlab program and did not make use of the banded structure of the matrices. This algorithm, though computationally expensive, may be useful for its potential for parallelization on future multicore architectures.

References

1. P. Arbenz and W. Gander, Solving nonlinear eigenvalue problems by algorithmic differentiation, Computing, 36 (1986), pp. 205-215.
2. H. J. Maehly, Zur iterativen Auflösung algebraischer Gleichungen, ZAMP (Zeitschrift für angewandte Mathematik und Physik), 5 (1954), pp. 260-263.
3. J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, Springer, 1991.
4. W. Kellenberger, Ein konvergentes Iterationsverfahren zur Berechnung der Wurzeln eines Polynoms, Z. Angew. Math. Phys., 21 (1970), pp. 647-651.
5. F. Tisseur and K. Meerbergen, The quadratic eigenvalue problem, SIAM Rev., 43 (2001), pp. 234-286.
6. J. H. Wilkinson, Rounding Errors in Algebraic Processes, Dover Publications, 1994.

How to find a good submatrix⋆

S. A. Goreinov, I. V. Oseledets, D. V. Savostyanov, E. E. Tyrtyshnikov, and N. L. Zamarashkin
Institute of Numerical Mathematics of the Russian Academy of Sciences, Gubkina 8, 119333 Moscow, Russia
{sergei,ivan,draug,tee,kolya}@bach.inm.ras.ru

Abstract. Pseudoskeleton approximation and some other problems require the knowledge of a sufficiently well-conditioned submatrix in a large-scale matrix. The quality of a submatrix can be measured by the modulus of its determinant, also known as its volume. In this paper we discuss a search algorithm for the maximum-volume submatrix which has already proved to be useful in several matrix and tensor approximation algorithms. We investigate the behavior of this algorithm on random matrices and present some of its applications, including the maximization of a bivariate functional.

Keywords: maximum volume, low rank, maxvol, pseudoskeleton approximation.

1

Introduction

Several problems in matrix analysis require the knowledge of a good submatrix of a given (supposedly large) matrix. By "good" we mean a sufficiently well-conditioned submatrix. The application that we are particularly interested in is the approximation of a given matrix by a low-rank matrix: A ≈ UV^⊤,

where A is m × n and U and V are m × r and n × r, respectively. An optimal approximation in the spectral or Frobenius norm can be computed via the singular value decomposition (SVD), which, however, requires too many operations. A much faster way is to use CGR decompositions [1] (later also referred to as CUR by some authors), which in Matlab notation can be written as

    A ≈ A(:, J) A(I, J)^{-1} A(I, :),                                             (1)

where I, J are appropriately chosen index sets of length r from 1 : n and 1 : m. It can be seen that the right-hand-side matrix coincides with A in r rows and

⋆ This work was supported by RFBR grant №08-01-00115a and by a Priority Research Grant OMN-3 of the Department of Mathematical Sciences of the Russian Academy of Sciences.


r columns. Moreover, if A is strictly of rank r and Â = A(I, J) is nonsingular, the exact equality holds. However, in the approximate case the quality of the approximation (1) relies heavily on the "quality" of the submatrix. The question is how to measure this quality and how to find a good submatrix. A theoretical answer (basically, an existence theory) [3] is that if Â has maximal in modulus determinant among all r × r submatrices of A, then an element-wise error estimate of the following form holds:

    |A − A_r| ≤ (r + 1) σ_{r+1},

where |A| = max_{ij} |a_{ij}| denotes the Chebyshev norm, A_r is the right-hand side of (1), and σ_{r+1} is the (r+1)-th singular value of the matrix A, i.e. the error of the best rank-r approximation in the spectral norm. That is the theory, but what about a practical algorithm? How to find a good submatrix? That is the topic of the current paper. As we have seen, the quality of a submatrix can be measured by its determinant, so we want to find a submatrix with the largest possible determinant. An intermediate step towards the solution of that problem is the computation of the maximal-volume submatrix not in a matrix where both dimensions are large, but in a matrix where only one dimension is large, i.e. in a "tall matrix". Such a procedure (called maxvol) plays a crucial role in several matrix algorithms we have developed, and it deserves a special description [2, 4]. In this paper we investigate the behavior of the maxvol algorithm on random matrices and present some theoretical results and its application to the fast search for the maximum entry of a large-scale matrix. We also propose a new approach for the maximization of a bivariate functional on the basis of the maxvol algorithm.

1.1

Notation

In this article we use Matlab-like notation for defining rows and columns of a matrix. Thus we write the i-th row of a matrix A as a_{i,:} and the j-th column of A as a_{:,j}. We will also use the columns and rows of the identity matrix, denoting them by e_i and e_j^T respectively, using the same notation for different sizes; the actual size will always be clear from the context.

1.2

Definitions and basic lemmas

Let us give some formal definitions and prove the basic lemmas that we rely on.

Definition 1. We refer to the modulus of the determinant of a square matrix as its volume.

Definition 2. We call an r × r submatrix A★ of a rectangular m × n matrix A a maximum-volume submatrix if it has maximum determinant in modulus among all possible r × r submatrices of A.

Definition 3. We call an r × r submatrix A□ of a rectangular n × r matrix A of full rank dominant if all the entries of A A□^{-1} are not greater than 1 in modulus.

The main observation that lays the ground for the construction of the maxvol algorithm is the following lemma.

Lemma 1. For an n × r matrix, a maximum-volume r × r submatrix is dominant.

Proof. Without loss of generality we can assume that the maximum-volume submatrix A★ occupies the first r rows of A; let us refer to these as the upper submatrix. Then

    A A★^{-1} = [ I_{r×r} ; Z ] = B.                                              (2)

Multiplication by a nonsingular matrix does not change the ratio of the determinants of any pair of r × r submatrices of A. Therefore, the upper submatrix I_{r×r} is a maximum-volume submatrix of B, and it is dominant in B iff A★ is dominant in A. Now, if there is some entry |b_{ij}| > 1 in B, then we can construct a new submatrix with a volume larger than that of the upper submatrix. To see this, swap rows i and j in B; the new upper submatrix B′ is the identity with row j replaced by row i of B,

    B′ = [ 1                    ]
         [    ⋱                 ]
         [ ∗  ∗  b_{ij}  ∗  … ∗ ]                                                 (3)
         [              ⋱       ]
         [                    1 ]

and

    | det(B′) | = |b_{ij}| > 1 = | det(I_{r×r}) |.

That would mean that I_{r×r} (and hence A★) is not a maximum-volume submatrix, a contradiction. □

The volume of a dominant submatrix cannot be very much smaller than the maximum volume, as the following lemma shows.

Lemma 2. For any nonsingular n × r matrix A,

    | det(A□) | ≥ | det(A★) | / r^{r/2}                                           (4)

for every dominant r × r submatrix A□ of A, where A★ denotes a maximum-volume r × r submatrix.

Proof. Suppose that A□ is the upper submatrix and write

    A A□^{-1} = [ I_{r×r} ; Z ] = B.                                              (5)


All entries of B are not greater than 1 in modulus; therefore, by the Hadamard inequality, the volume of any r × r submatrix B_{r×r} of B satisfies

    | det(B_{r×r}) | ≤ ∏_{i=1}^{r} ||b_{σ_i,:}|| ≤ r^{r/2},

where σ_i are the indices of the rows that contain B_{r×r}. In particular this holds for the submatrix of B corresponding to A★, which proves (4). The inequality is sharp: for example, if Z contains a Fourier, Hadamard or Walsh matrix as a submatrix, it is easy to see that equality is attained. □

2

Algorithm maxvol

A dominant property of the maximal-volume submatrix allows us to onstru t a simple and eÆ ient algorithm for the sear h of maximal volume submatrix.

Algorithm 1. Given: an n × r matrix A. Find: an r × r dominant submatrix A_□.

0. Start with an arbitrary nonsingular r × r submatrix A⊡. Reorder the rows in A so that A⊡ occupies the first r rows of A.
1. Compute B = AA⊡^{-1} and find its maximal-in-modulus entry b_{ij}.
2. If |b_{ij}| > 1, swap rows i and j in B. Now the upper submatrix of B has the form (3) and volume |b_{ij}| > 1; by swapping the rows we have increased the volume of the upper submatrix in B, as well as in A. Let A⊡ be the new upper submatrix of A and go to step 1. If |b_{ij}| = 1, return A_□ = A⊡.

On each iterative step of Algorithm 1, the volume of A⊡ increases until the volume of A_□ is reached. In practice, we can relax the stopping criterion in the iterative step to |b_{ij}| < 1 + δ with a sufficiently small parameter δ (we think that δ ∼ 10^{-2} can be a good choice). This dramatically reduces the number of iterative steps but does not change the "good" properties of the submatrix.

If computations proceed in a naive way, then the most expensive part of an iteration is step 1, which needs one r × r matrix inversion and nr² operations for the matrix-by-matrix product AA⊡^{-1}. We can reduce the complexity of this step by a factor of r if we note that on each iteration A⊡ is updated by a rank-one matrix, and apply the Sherman–Morrison–Woodbury formula for the matrix inverse. Now we describe this in detail. Swapping rows i and j of the matrix A is equivalent to the following rank-one update:

A := A + e_j(a_{i,:} − a_{j,:}) + e_i(a_{j,:} − a_{i,:}) = A + (e_j − e_i)(a_{i,:} − a_{j,:}) = A + pv^T.    (6)


For the upper submatrix, this update is

A⊡ := A⊡ + e_j(a_{i,:} − a_{j,:}) = A⊡ + qv^T.    (7)

For the inverse of the upper submatrix, we use the SMW formula

A⊡^{-1} := A⊡^{-1} − A⊡^{-1} q (1 + v^T A⊡^{-1} q)^{-1} v^T A⊡^{-1}.    (8)

Note that

v^T A⊡^{-1} q = (a_{i,:} − a_{j,:}) A⊡^{-1} e_j = ((AA⊡^{-1})_{i,:} − (AA⊡^{-1})_{j,:}) e_j = b_{ij} − b_{jj} = b_{ij} − 1.

We proceed with the formula for the fast update of B = AA⊡^{-1}:

B = AA⊡^{-1} := (A + pv^T)(A⊡^{-1} − A⊡^{-1} q v^T A⊡^{-1}/b_{ij})
  = AA⊡^{-1} − AA⊡^{-1} q v^T A⊡^{-1}/b_{ij} + pv^T A⊡^{-1} − p v^T A⊡^{-1} q v^T A⊡^{-1}/b_{ij}
  = B − (Bq − b_{ij} p + p v^T A⊡^{-1} q) v^T A⊡^{-1}/b_{ij}.

Using v^T A⊡^{-1} = b_{i,:} − b_{j,:} and v^T A⊡^{-1} q = b_{ij} − 1, we have

B := B − (b_{:,j} − b_{ij} p + (b_{ij} − 1)p)(b_{i,:} − b_{j,:})/b_{ij},

and finally

B := B − (b_{:,j} − e_j + e_i)(b_{i,:} − e_j^T)/b_{ij}.    (9)

Note also that the upper r × r submatrix of B remains the identity after each update, because b_{1:r,j} = e_j for j ≤ r and (e_i)_{1:r} = 0 for i > r, which is always the case. So we need to update only the submatrix Z. This can also be done by a rank-one update:

Z := Z − (b_{:,j} + e_i)(b_{i,:} − e_j^T)/b_{ij}.    (10)

Note that in the last formula we use the "old" indexing, i.e. the rows of Z are numbered from r + 1 to n. Therefore, each iterative step of the algorithm reduces to a rank-one update of Z, which can be done in (n − r)r operations, and a search for the maximum-modulus element in Z, which is of the same complexity. The overall complexity of Algorithm 1 is therefore O(nr²) for the initialization and O(cnr) for the iterative part, where c is the number of iterations. We can write a rather rough estimate for c as follows. Each iteration step increases the volume of A⊡ by a factor |b_{ij}| ≥ 1 + δ. After k steps

|det(A⊡^{(k)})| ≥ |det(A⊡^{(0)})|(1 + δ)^k,

therefore

c ≤ (log|det(A_□)| − log|det(A⊡^{(0)})|)/log(1 + δ).    (11)
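The rank-one update (10) can be verified numerically against a direct recomputation of B after the row swap (a Python/NumPy sketch; the sizes and seed are arbitrary, and e_i is written in 0-based indexing of the lower block):

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 10, 3
A = rng.standard_normal((n, r))

B = A @ np.linalg.inv(A[:r])     # the upper r x r block of B is the identity
Z = B[r:].copy()

# Take the maximal-in-modulus entry of Z; row iz of Z is row i = r + iz of A.
iz, j = np.unravel_index(np.argmax(np.abs(Z)), Z.shape)
i = r + iz
bij = B[i, j]

# Rank-one update (10): Z := Z - (b_{:,j} + e_i)(b_{i,:} - e_j^T)/b_ij.
e_i = np.zeros(n - r); e_i[iz] = 1.0
e_j = np.zeros(r);     e_j[j] = 1.0
Z_new = Z - np.outer(Z[:, j] + e_i, B[i] - e_j) / bij

# Reference: swap rows i and j of A and recompute the lower block of B directly.
A2 = A.copy(); A2[[i, j]] = A2[[j, i]]
Z_ref = (A2 @ np.linalg.inv(A2[:r]))[r:]
ok = bool(np.allclose(Z_new, Z_ref))
```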

This shows that a good initial guess for A_□ can reduce the number of iterations. If no "empirical" guesses are available, it is always safe to apply Gaussian elimination with pivoting to A and use the set of pivoted rows as an initial approximation to the maximal-volume submatrix.
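The whole procedure can be sketched as follows (Python/NumPy; for brevity B is recomputed on every step instead of using the rank-one updates (9)–(10), the initial rows are chosen by Gaussian elimination with partial pivoting, and the sizes and δ are arbitrary assumptions):

```python
import numpy as np

def maxvol(A, delta=1e-2, max_iter=1000):
    """Return row indices of a (near-)dominant r x r submatrix of an n x r matrix A."""
    n, r = A.shape
    # Initialization: pivoted rows of Gaussian elimination applied to A.
    U, perm = A.copy(), np.arange(n)
    for k in range(r):
        p = k + np.argmax(np.abs(U[k:, k]))
        U[[k, p]], perm[[k, p]] = U[[p, k]], perm[[p, k]]
        U[k + 1:, k:] -= np.outer(U[k + 1:, k] / U[k, k], U[k, k:])
    W, rows = A[perm].copy(), perm.copy()
    # Iterations of Algorithm 1 with the relaxed criterion |b_ij| < 1 + delta.
    for _ in range(max_iter):
        B = W @ np.linalg.inv(W[:r])
        i, j = np.unravel_index(np.argmax(np.abs(B)), B.shape)
        if abs(B[i, j]) < 1.0 + delta:
            break
        W[[i, j]], rows[[i, j]] = W[[j, i]], rows[[j, i]]  # volume grows by |b_ij|
    return rows[:r]

rng = np.random.default_rng(2)
A = rng.standard_normal((200, 8))
rows = maxvol(A)
max_entry = np.max(np.abs(A @ np.linalg.inv(A[rows])))  # <= 1 + delta when dominant
```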

3 maxvol-based maximization methods

As an application, consider the following simple and interesting problem: find the maximum-in-modulus element of a low-rank matrix A = UV^T, given by U and V. This problem arises, for example, in the maximization of a two-dimensional separable function on a grid, or as an essential part of the Cross3D algorithm for the computation of the Tucker approximation of a three-dimensional tensor in linear time [4]. Direct comparison of all elements requires rnm operations for an m × n matrix of rank r. Is it possible to devise an algorithm with complexity linear in the matrix size?

3.1 Theoretical estimates

Our idea is to search for the maximum element not in the whole matrix, but only in the submatrix of maximal volume. Though it looks not very natural at first glance, this algorithm actually works well in many cases. Often the maximal element in the maximal-volume submatrix is not the same as the true maximal element, but it cannot be very much smaller (for example, if it is zero, then the submatrix of maximal volume is zero and the matrix is also zero, which we hope is not true). But are there any quantitative estimates? In fact, we can replace the maximal-volume submatrix by an arbitrary dominant submatrix, which yields the same estimate. But first we need to extend the definition of the dominant submatrix to the case of m × n matrices. It is done in a very simple manner.

Definition 4. We call an r × r submatrix A_□ of a rectangular m × n matrix A dominant if it is dominant in the columns and the rows that it occupies, in the sense of Definition 3.

Theorem 1. If A_□ is a dominant r × r submatrix of an m × n matrix A of rank r, then

|A_□| ≥ |A|/r²,    (12)

where |A_□| and |A| denote the maximal-in-modulus elements of A_□ and A, respectively.

Proof. If the maximum-in-modulus element b of A belongs to A_□, the statement is trivial. If not, consider the (r + 1) × (r + 1) submatrix that contains A_□ and b,

Â = [A_□, c; d^T, b].    (13)

The elements of the vectors c and d can be bounded as follows:

|c| ≤ r|A_□|,  |d| ≤ r|A_□|.    (14)

This immediately follows from c = A_□(A_□^{-1}c) = A_□c̃, where all the elements of c̃ are not greater than 1 in modulus. The bound for the elements of d is proved in the same way.

Now we have to bound |b|. Since A has rank r and A_□ is nonsingular,

b = d^T A_□^{-1} c,    (15)

and it immediately follows that

|A| = |b| ≤ |d|r ≤ |A_□|r²,

which completes the proof. ∎

The restriction rank A = r may be removed with almost no change in the bound (12); however, one then has to take A_□ to be the maximum-volume submatrix rather than an arbitrary dominant one.

Theorem 2. If A_□ is the maximum-volume r × r (nonsingular) submatrix of an m × n matrix A, then

|A_□| ≥ |A|/(2r² + r).

Proof. Again, consider the submatrix Â that contains A_□ and b, see (13). Bound (14) follows immediately, because the maximum-volume submatrix is dominant, see Lemma 1. Since rank A is now arbitrary, equality (15) is no longer valid. Instead, we use an inequality from [3],

|b − d^T A_□^{-1} c| ≤ (r + 1)σ_{r+1}(Â),    (16)

where σ_1(Â) ≥ σ_2(Â) ≥ ... ≥ σ_{r+1}(Â) are the singular values of Â. That gives

|b| ≤ (r + 1)σ_{r+1}(Â) + |d^T A_□^{-1}c| = (r + 1)σ_{r+1}(Â) + |d^T c̃| ≤ (r + 1)σ_{r+1}(Â) + |d|r ≤ (r + 1)σ_{r+1}(Â) + |A_□|r².    (17)

We need an estimate for σ_{r+1}(Â) in terms of the values of its elements. Note that

Â^T Â = [A_□^T, d; c^T, b][A_□, c; d^T, b] = [A_□^T A_□ + dd^T, A_□^T c + bd; c^T A_□ + bd^T, c^T c + b²].

From the singular value interlacing theorem,

σ_r(A_□^T A_□ + dd^T) ≥ σ²_{r+1}(Â),

and for r > 1,

σ_{r−1}(A_□^T A_□) ≥ σ_r(A_□^T A_□ + dd^T) ≥ σ²_{r+1}(Â).

Finally we have σ_1(A_□) ≥ σ_{r+1}(Â) and |A_□| ≥ σ_1(A_□)/r. Plugging this into (17), we get

|b| ≤ (r + 1)r|A_□| + r²|A_□| = (2r² + r)|A_□|,

which completes the proof. ∎
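The rank-r bound (12) can be checked by brute force on a small random matrix (a Python/NumPy sketch; the sizes 7 × 7 and r = 2 are arbitrary assumptions):

```python
import itertools

import numpy as np

rng = np.random.default_rng(3)
m, n, r = 7, 7, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # exact rank r

# Brute force over all r x r submatrices to find the one of maximal volume.
subs = [np.ix_(rows, cols)
        for rows in itertools.combinations(range(m), r)
        for cols in itertools.combinations(range(n), r)]
best = max(subs, key=lambda s: abs(np.linalg.det(A[s])))

# The maximum-volume submatrix is dominant, so Theorem 1 applies:
# its largest entry is at most r^2 times smaller than the largest entry of A.
ratio = np.max(np.abs(A[best])) / np.max(np.abs(A))
```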

Fig. 1. Distribution of the ratio of the maxvol element over the true maximal element. Left: matrix size 1000, rank 10 (53676384 trials); right: matrix size 10000, rank 10 (33627632 trials). [Histograms not reproduced.]

Now it is clear that we can reduce the search to only the r² elements of the dominant submatrix. Then the search time does not depend on the matrix size, and the total complexity is just the complexity of finding A_□, which is O(nr² + mr²) operations. The maximum element in A_□ is "sufficiently good" in the sense of the theorems proved above. In practical cases, the ratio |A|/|A_□| is considerably smaller than r². Consider two examples which illustrate this fact.

3.2 Search for the maximum element in random low-rank matrices

In order to see how good the maximal element in our "good" submatrix is, we tested it first on random matrices. Given n, m, r, two matrices U and V were generated with elements uniformly distributed in [−1, 1]. Then U and V were replaced with the Q-factors of their QR decompositions, and a matrix A = UDV^T was generated with a random positive diagonal D with elements uniformly distributed on [0, 1]. We generated a large set of trial matrices, and for each of them we computed the maximal element using the proposed algorithm. The actual degradation of the maximal element is presented in Figure 1, which shows the histogram of the ratio of the maximal element in A_□ over the true maximal element. Note that this ratio was never lower than 0.5 across all trials (the smooth humps in the middle part of the histograms), and in some 5% of cases (the sharp peaks in the right part of the histograms) we even found the true maximal element, which would be much less probable for a random choice.
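A small-scale version of this experiment can be sketched as follows (Python/NumPy). Here the "good" rows of U and of V are found by a simplified maxvol iteration and the maximum is searched only in the resulting r × r cross; the sizes, seed, and the helper `maxvol_rows` are assumptions of this sketch:

```python
import numpy as np

def maxvol_rows(A, delta=1e-2, max_iter=1000):
    """Row indices of a (1 + delta)-dominant r x r submatrix of an n x r matrix A."""
    n, r = A.shape
    W, rows = A.copy(), np.arange(n)
    for _ in range(max_iter):
        B = W @ np.linalg.inv(W[:r])
        i, j = np.unravel_index(np.argmax(np.abs(B)), B.shape)
        if abs(B[i, j]) < 1.0 + delta:
            break
        W[[i, j]], rows[[i, j]] = W[[j, i]], rows[[j, i]]
    return rows[:r]

rng = np.random.default_rng(4)
m, n, r = 300, 400, 5
U, _ = np.linalg.qr(rng.uniform(-1.0, 1.0, (m, r)))
V, _ = np.linalg.qr(rng.uniform(-1.0, 1.0, (n, r)))
A = (U * rng.uniform(0.0, 1.0, r)) @ V.T          # A = U D V^T, exact rank r

I, J = maxvol_rows(U), maxvol_rows(V)             # "good" rows and columns
ratio = np.max(np.abs(A[np.ix_(I, J)])) / np.max(np.abs(A))
```

The cross A[I, J] is dominant (up to the factor 1 + δ) in the rows and columns it occupies, so the proven bounds apply to the found element; in practice the observed ratio is typically far better than 1/(2r² + r).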

3.3 Maximization of bivariate functions

There is an interesting application of our algorithm: it can be applied to the problem of global optimization of bivariate functions. Suppose we want to find a maximum of |f(x, y)| in some rectangle (x, y) ∈ Π = [a_0, a_1] × [b_0, b_1], where f is some given function. "Discretizing" the problem on some sufficiently fine grid (x_i, y_j), i = 1, 2, ..., m, j = 1, 2, ..., n, we obtain an m × n matrix A = [f(x_i, y_j)] in which the maximal-in-modulus element is to be found. Assume additionally that the function f(x, y) can be sufficiently well approximated by a sum of separable functions:

f(x, y) ≈ Σ_{α=1}^{r} u_α(x) v_α(y).

Then it is easy to see that in this case the matrix A admits a rank-r approximation of the form

A = [f(x_i, y_j)] ≈ UV^T,

where U and V are m × r and n × r matrices, respectively, with elements U = [u_α(x_i)], V = [v_α(y_j)]. Thus the "discretized" problem is equivalent to the problem of finding the maximal-in-modulus element in a large low-rank matrix A, so we can apply our method. We have no guarantee that we will find the exact maximum, but we will have an estimate of it.

As an example we considered the standard banana-function minimization problem:

b(x, y) = 100(y − x²)² + (1 − x)².

This function has a minimum at (1, 1) equal to 0 and is positive at all other points. In order to reformulate the problem as a maximization problem, we introduce an auxiliary function

f(x, y) = 1/(b(x, y) + 10^{-6}),

whose maximum is located at the same point (1, 1). The rectangle [−2, 2] × [−2, 2] was chosen and discretized on a 500 × 500 uniform grid; the corresponding matrix A was approximated by a matrix of rank 10, for which the maximum was found by our maxvol algorithm. The extremal point was contained in the grid, and maxvol returned the exact position of the minimum: (1, 1). For other choices of grids the situation was the same, and the approximations to the extremum were very good (the error was O(h), where h is the grid step). This result is very encouraging. However, it should not be treated as a universal optimization method; rather, it can be very useful within global optimization methods, because it gives an estimate of the value of the global optimum. This can be efficiently used, for example, in branch-and-bound methods, with maxvol estimates for the maximal value in a particular domain. Another possibility is to use "local" separable approximations to functions and then to minimize this local part by the maxvol algorithm. Incorporation of our method into robust optimization methods will be the subject of future research.

4 Conclusion and future work

In this paper we presented a simple iterative method for the search of a submatrix of maximal volume in a given rectangular matrix. This submatrix plays


an important role in the theory and algorithms for the approximation by low (tensor) rank matrices. As an application, we constructed an algorithm for the computation of the maximal-in-modulus element in a given low-rank matrix and proved that this element cannot be much smaller than the "true" maximal element. Experiments on random matrices, as well as the experiment with the minimization of the banana function, show that our algorithm performs very well. Future work will focus on maximizing separable functions by using the branch-and-bound method with maxvol estimates of the maximal element in each subdomain, and by using "local" approximations by separable functions.

References

1. S. A. Goreinov, E. E. Tyrtyshnikov, and N. L. Zamarashkin. A theory of pseudo-skeleton approximations. Linear Algebra Appl., 261: 1–21, 1997.
2. E. E. Tyrtyshnikov. Incomplete cross approximation in the mosaic-skeleton method. Computing, 64(4): 367–380, 2000.
3. S. A. Goreinov and E. E. Tyrtyshnikov. The maximal-volume concept in approximation by low-rank matrices. Contemporary Mathematics, 208: 47–51, 2001.
4. I. V. Oseledets, D. V. Savostyanov, and E. E. Tyrtyshnikov. Tucker dimensionality reduction of three-dimensional arrays in linear time. SIAM J. Matrix Anal. Appl., 30(3): 939–956, 2008.

Conjugate and Semi-Conjugate Direction Methods with Preconditioning Projectors

V. P. Il'in

Institute of Computational Mathematics and Mathematical Geophysics, Siberian Branch of Russian Academy of Sciences, ak. Lavrentieva 6, 630090 Novosibirsk, Russia
ilin@sscc.ru

Abstract. The acceleration of the original projective iterative methods of multiplicative or additive type for solving systems of linear algebraic equations (SLAEs) by means of conjugate direction approaches is considered. The orthogonal and variational properties of the preconditioned conjugate gradient, conjugate residual and semi-conjugate residual algorithms, as well as estimates of the number of iterations, are presented. Similar results are obtained for the dynamically preconditioned iterative process in Krylov subspaces. Application of the discussed techniques to domain decomposition, Kaczmarz, and Cimmino methods is proposed.

1 Introduction

The aim of this paper is to analyze iterative algorithms in the Krylov subspaces whose preconditioners are some kinds of projection operators. At first we consider the general approach to the acceleration of some convergent iterations with a constant iteration matrix. Let us have the system of linear algebraic equations

Au = f,  u = {u_i},  f = {f_i} ∈ R^N,  A = {a_{i,j}} ∈ R^{N,N},    (1)

and the convergent stationary iterative process

u^{k+1} = Bu^k + g,  u^k → u as k → ∞,  g = (I − B)A^{-1}f.    (2)

Suppose that the iteration matrix B has eigenvalues λ_q(B) and spectral radius ρ = max_q |λ_q(B)| < 1. Then the vector u is the solution of the system

Ãu ≡ (I − B)u = g,    (3)

where I is the identity matrix and Ã is the preconditioned nonsingular matrix. If Ã is a symmetric positive definite (s.p.d.) matrix, its spectral condition number is

κ = ‖Ã‖₂‖Ã^{-1}‖₂ = (1 + ρ)/(1 − ρ),    (4)


and to solve SLAE (3) we can apply some iterative conjugate direction methods (see [1]–[4]):

r⁰ = g − Ãu⁰,  p⁰ = r⁰;  for n = 0, 1, ...:
u^{n+1} = u^n + α_n p^n,  r^{n+1} = r^n − α_n Ãp^n,  p^{n+1} = r^{n+1} + β_n p^n,    (5)

which have the optimality property in the Krylov subspaces

K_{n+1}(r⁰, Ã) = Span{p⁰, p¹, ..., p^n} = Span{p⁰, Ãp⁰, ..., Ã^n p⁰}.

In the conjugate gradient (CG) and conjugate residual (CR) methods (s = 0, 1, respectively), the iterative parameters α_n^{(s)} and β_n^{(s)} are defined as follows:

α_n^{(s)} = (Ã^s r^n, r^n)/(Ãp^n, Ã^s p^n),  β_n^{(s)} = (Ã^s r^{n+1}, r^{n+1})/(Ã^s r^n, r^n).    (6)

These algorithms provide the residual and direction (correction) vectors r^n and p^n with the orthogonality properties

(Ã^s r^n, r^k) = (Ã^s r^n, r^n)δ_{n,k},  (Ãp^n, Ã^s p^k) = (Ãp^n, Ã^s p^n)δ_{n,k}.    (7)

Also, the functionals Φ_n^{(s)}(r^n) = (Ã^{s−1}r^n, r^n), s = 0, 1, are minimized in the Krylov subspaces, and the number of iterations necessary for satisfying the condition

(Φ_n^{(s)}(r^n)/Φ_0^{(s)}(r⁰))^{1/2} ≤ ε < 1

is estimated by the value

n(ε) ≤ 1 + ln((1 + √(1 − ε²))/ε)/ln γ,  γ = (√κ + 1)/(√κ − 1).    (8)
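For s = 0, iteration (5)–(6) is the classical conjugate gradient method; a minimal self-contained sketch (Python/NumPy, with an arbitrary model s.p.d. matrix playing the role of Ã):

```python
import numpy as np

rng = np.random.default_rng(5)
N = 40
M = rng.standard_normal((N, N))
At = M @ M.T / N + np.eye(N)          # model s.p.d. matrix in place of A~
g = rng.standard_normal(N)

u = np.zeros(N)
r = g - At @ u                        # r^0 = g - A~ u^0
p = r.copy()
for _ in range(N):
    Ap = At @ p
    alpha = (r @ r) / (Ap @ p)        # alpha_n^(0) = (r^n, r^n)/(A~ p^n, p^n)
    u = u + alpha * p
    r_new = r - alpha * Ap
    beta = (r_new @ r_new) / (r @ r)  # beta_n^(0)
    p = r_new + beta * p
    r = r_new
rel_res = np.linalg.norm(g - At @ u) / np.linalg.norm(g)
```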

It should be noted that the matrix-vector multiplication in (5) amounts to the implementation of one iteration (2) and does not require explicitly forming the matrices Ã and B, because, for example,

Ãp^n = p^n − Bp^n.

If the matrix Ã is nonsymmetric and positive definite, i.e.

(Ãu, u) ≥ δ(u, u),  δ > 0,  u ≠ 0,

system (3) can be solved by means of the semi-conjugate residual (SCR) method, a stabilized version of the generalized conjugate residual (GCR) algorithm, which is described in [5] and has instability features in terms of truncation errors, see [4].


In SCR, the vectors u^{n+1} and r^{n+1} are computed according to formulas (5), with the coefficients α_n^{(s)} from (6) for s = 1, and the direction vectors p^{n+1} are defined as follows:

p^{n+1,0} = r^{n+1},  p^{n+1,l} = p^{n+1,l−1} + β_{n,l}p^{l−1},  l = 1, ..., n,  p^{n+1} = p^{n+1,n},
β_{n,l} = −(Ãp^l, Ãp^{n+1,l−1})/(Ãp^l, Ãp^l).    (9)

Relations (5), (9) realize the construction of Ã^TÃ-orthogonal (conjugate) vectors p⁰, p¹, ..., p^{n+1} by means of the modified Gram–Schmidt orthogonalization [6]. In this case, the functional Φ_n^{(1)}(r^n) = (r^n, r^n) is minimized in the subspace K_{n+1}(r⁰, Ã), and the residual vectors are right semi-conjugate, in the sense that the equalities (Ãr^k, r^n) = 0 are satisfied for k < n. Since the SCR and GMRES methods (see [4]) have the same variational properties in the Krylov subspaces, a similar estimate of the number of iterations n(ε) is valid for them, and it will be used below.

This paper is organized as follows. In Section 2, we describe projective methods of the multiplicative type using the conjugate direction and semi-conjugate direction approaches. The next section is devoted to additive-type projective methods in the Krylov subspaces. Also, the application of dynamic preconditioners is discussed. This approach means using a variable iteration matrix B_n at different iterations; this is an implementation requirement, for example, in many two-level iterative processes.
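A minimal sketch of the SCR recursion (5), (9) with full Gram–Schmidt orthogonalization of the directions, on a model nonsymmetric positive definite matrix (Python/NumPy; the size and the skew-symmetric test matrix are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)
N = 30
S = rng.standard_normal((N, N))
S = (S - S.T) / (2.0 * np.sqrt(N))        # skew-symmetric part
At = np.eye(N) + S                        # (A~u, u) = (u, u) > 0, A~ nonsymmetric
g = rng.standard_normal(N)

u = np.zeros(N)
r = g - At @ u
P, AP = [r.copy()], [At @ r]
for _ in range(N):
    p, Ap = P[-1], AP[-1]
    alpha = (r @ Ap) / (Ap @ Ap)
    u = u + alpha * p
    r = r - alpha * Ap
    if np.linalg.norm(r) <= 1e-12 * np.linalg.norm(g):
        break
    # New direction: A~-orthogonalize r against all previous directions, as in (9).
    pn, Apn = r.copy(), At @ r
    for pk, Apk in zip(P, AP):
        beta = -(Apk @ Apn) / (Apk @ Apk)
        pn, Apn = pn + beta * pk, Apn + beta * Apk
    P.append(pn); AP.append(Apn)
rel_res = np.linalg.norm(g - At @ u) / np.linalg.norm(g)
```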

2 Multiplicative projector methods

Let Ω = {i = 1, 2, ..., N} denote the set of matrix row numbers and let Ω_p, p = 1, 2, ..., l, be its non-intersecting integer subsets with m_p elements each,

Ω = ∪_{p=1}^{l} Ω_p,  m_1 + ... + m_l = N.

Also, let us introduce subvectors u_{(p)}, f_{(p)}, p = 1, ..., l, of dimensions m_p and rectangular matrices A_{(p)} ∈ R^{m_p × N}:

u_{(p)} = {u_i, i ∈ Ω_p},  f_{(p)} = {f_i, i ∈ Ω_p},  A_{(p)} = {A_i, i ∈ Ω_p},    (10)

where A_i is the i-th row of the matrix A. Then SLAE (1) can be rewritten as

A_{(p)} u = f_{(p)},  p = 1, 2, ..., l.    (11)

To solve (11), we consider an iterative process in which the computation of the n-th approximation consists of the following stages:

u^{n,p} = u^{n,p−1} + ωA⁺_{(p)}r^{n,p−1}_{(p)},  n = 1, 2, ...,  p = 1, 2, ..., l,  u^n = u^{n,l}.    (12)


Here u^{0,0} = {u_i⁰, i = 1, 2, ..., N} is the initial guess, ω is some iterative parameter,

r^{n,p−1}_{(p)} = f_{(p)} − A_{(p)}u^{n,p−1}

is the residual subvector of dimension m_p, and A⁺_{(p)} is the pseudoinverse of the matrix A_{(p)}, defined by the formula A⁺_{(p)} = A^t_{(p)}(A_{(p)}A^t_{(p)})^{-1} if A_{(p)} has full rank. It follows that I − A⁺_{(p)}A_{(p)} is a symmetric positive semidefinite matrix realizing the orthogonal projection onto the p-th subspace, which is presented geometrically by the intersection of the hyperplanes described by the i-th equations, i ∈ Ω_p. Iterative method (12) can be written in the matrix form

u^n = Bu^{n−1} + g,  B = (I − T_l) ⋯ (I − T_1),  T_p = ωA⁺_{(p)}A_{(p)}.    (13)
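A minimal sketch of the block iteration (12)–(13) with ω = 1 (Python/NumPy; the near-identity test matrix and block sizes are arbitrary assumptions chosen so that the process converges quickly, and A⁺_(p) is formed explicitly via `numpy.linalg.pinv`):

```python
import numpy as np

rng = np.random.default_rng(7)
N, l = 20, 4                                  # l blocks of size m_p = 5
A = np.eye(N) + 0.1 * rng.standard_normal((N, N)) / np.sqrt(N)
u_exact = rng.standard_normal(N)
f = A @ u_exact
blocks = np.array_split(np.arange(N), l)
pinvs = [np.linalg.pinv(A[idx]) for idx in blocks]   # A_(p)^+, precomputed

u, omega = np.zeros(N), 1.0
for _ in range(100):
    for idx, Aplus in zip(blocks, pinvs):            # successive stages (12)
        u = u + omega * Aplus @ (f[idx] - A[idx] @ u)
rel_res = np.linalg.norm(A @ u - f) / np.linalg.norm(f)
```

Each stage orthogonally projects the current iterate onto the solution set of the p-th block of equations.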

Projective algorithm (12), (13) for ω = 1 and m_p = 1 is the "pointwise" method published by S. Kaczmarz in [7]. Its various generalizations and investigations were made by many authors, see [8], [9]. In [10] the following assertion was proved for an abstract iterative projection method of the multiplicative type, with application to the domain decomposition approach:

Theorem 1. Let T_p, p = 1, ..., l, be s.p.d. matrices, and let the following inequalities be valid for any vector v ∈ R^N:

(T_p v, v)/(v, v) ≤ α < 2,  p = 1, 2, ..., l;    ‖v‖² ≤ β Σ_{p=1}^{l} (T_p v, v).

Then the estimate

‖B‖₂ ≤ ρ = 1 − (2 − α)/{β[l + α²l(l − 1)/2]}

is true for the Euclidean norm ‖B‖₂. If the scaled matrices ωT_p satisfy for all p the conditions

(ωT_p v, v)/(v, v) ≤ α < 2,  ‖v‖² ≤ β[(ωT_1 v, v) + ... + (ωT_l v, v)],

then for ω = (α√((l − 1)l))^{-1} we have ρ = 1 − (3αβl)^{-1}.

It should be noted that the iteration matrix B in iterative process (13) is nonsymmetric, because the matrices T_p are not commutative in general.

Now we consider the alternating direction block version of the Kaczmarz method, in which each iteration consists of two stages. The first one realizes the conventional formulas (12) or (13), and the second stage implements similar computations but in the backward ordering of the number p:

u^{n+1/2,p} = u^{n,p−1} + ωA⁺_{(p)}r^{n,p−1}_{(p)},  p = 1, 2, ..., l,  u^{n+1/2} = u^{n+1/2,l} = u^{n+1/2,l+1},
u^{n+1,p} = u^{n+1/2,p+1} + ωA⁺_{(p)}r^{n+1/2,p+1}_{(p)},  p = l, ..., 2, 1,  u^{n+1} = u^{n+1,1}.    (14)


The iteration matrix of iterations (14) is the matrix product B = B₂B₁, where B₁ coincides with B from (13) and B₂ has a similar form. Thus,

u^{n+1} = B₂B₁u^n + g,  B₂ = (I − T₁)(I − T₂) ⋯ (I − T_l) = B₁^t.    (15)

Under the conditions of Theorem 1, the estimate ‖B_k‖₂ ≤ ρ is valid for each matrix B₁, B₂, and for the iteration matrix of the alternating direction method we have the inequality ‖B‖ ≤ ‖B₁‖ · ‖B₂‖ ≤ ρ² < 1. Since method (14), (15) can be presented in the form (2) with an s.p.d. matrix B, it is possible to accelerate the convergence of the iterations by means of conjugate direction methods, applied formally to the preconditioned SLAE (3), and the following result is true.

Theorem 2. The iterations of the alternating direction multiplicative projective conjugate gradient (ADMPCG) and conjugate residual (ADMPCR) methods, defined by relations (3), (5), and (6) for s = 0, 1, respectively, converge under the conditions of Theorem 1, and estimate (8) is valid for the number of iterations n(ε), where κ = (1 + ρ²)/(1 − ρ²) and the value ρ is determined in Theorem 1.

Now let us consider the successive multiplicative projective semi-conjugate residual (SMPSCR) method in the Krylov subspaces, which is an alternative to the above ADMPCR algorithm. The new approach is based on the acceleration of iterative process (13) with the nonsymmetric iteration matrix B by means of formulas (5) and (9), where the preconditioned matrix Ã is described by (3), (13). The SMPSCR procedure requires, for computing u^{n+1}, saving in memory all previous direction vectors p⁰, ..., p^n, similarly to the GMRES method [4]. These two approaches have the same convergence property because they both provide the minimization of the functional (r^n, r^n) in the subspace K_{n+1}(r⁰, Ã). The following result is true for the successive multiplicative method.

Theorem 3. Suppose that the SMPSCR algorithm, defined by formulas (3), (5), (6) and (9), (11)–(13) for s = 1, has a diagonalizable matrix Ã = XΛX^{-1}, Λ = diag(λ₁, ..., λ_N), where λ_i are the eigenvalues of Ã and X is a square matrix whose columns are the corresponding eigenvectors. Then this method converges under the conditions of Theorem 1, and the following estimate is valid for the number of iterations:

n(ε) ≤ 1 + ln((1 + √(1 − ε₁²))/ε₁)/ln γ,  ε₁ = ε/(‖X‖₂ · ‖X^{-1}‖₂),  γ = γ₂/γ₁.

Here γ₁ = a + √(a² − d²) and γ₂ = c + √(c² − d²), where a and d are the semi-major axis and the focal distance (d² < c²) of an ellipse E(a, d, c) which includes all the values λ_i, excludes the origin, and is centered at c.

It should be noted that for the SMPSCR method, as for GMRES, different reduced versions with a bounded number of saved direction vectors p^n can be constructed. This decreases the computational resources required for the implementation of the algorithm, but the quantities n(ε) increase in these cases.

3 Additive projective methods

Let us recall that the Kaczmarz method is based on the successive projection of points from the space R^N onto the hyperplanes described by the corresponding equations of the algebraic system. A similar idea is used in the Cimmino algorithm (see [11]–[13] and the references therein), but here the projections of the given point u^n onto all the hyperplanes are made simultaneously, and the next iterative approximation is chosen by means of some averaging procedure, or linear combination, of the projected points u^{n,i}, i = 1, ..., N. Such an additive-type iterative process to solve SLAE (11) can be presented in a generalized block version as

u^{n,p} = u^{n−1} + A⁺_{(p)}r^{n−1}_{(p)},  p = 1, 2, ..., l,  u^n = (u^{n,1} + u^{n,2} + ... + u^{n,l})/l.    (16)

These relations can be written in the following matrix form:

u^n = Bu^{n−1} + g,  B = I − l^{-1} Σ_{p=1}^{l} A⁺_{(p)}A_{(p)} = I − l^{-1} Σ_{p=1}^{l} T_p,  g = l^{-1} Σ_{p=1}^{l} A⁺_{(p)}f_{(p)},    (17)

where the matrices T_p are defined in (13). Obviously, the limit vector of this sequence, u = lim_{n→∞} u^n, if it exists, satisfies the preconditioned system of equations

Ãu = f̃,  Ã = Σ_{p=1}^{l} T_p,  f̃ = Σ_{p=1}^{l} A⁺_{(p)}f_{(p)}.    (18)

If the matrix Ã of system (18) is s.p.d., its spectral properties are obtained from the following result [10].

Theorem 4. Let the quantities 0 < α < 2 and 0 < ρ < 1 be defined as in Theorem 1. Then the eigenvalues λ(Ã) of the s.p.d. matrix Ã from system (18) satisfy the inequalities

(2 − α)(1 − ρ)/4 ≤ λ(Ã) ≤ αl.

Now we can estimate the convergence rate of the additive projective approach.


Theorem 5. Estimate (8) for the number of iterations n(ε) is valid for the conjugate gradient and conjugate residual methods applied to SLAE (18), i.e. to accelerate the additive projective algorithm (17). In this case the condition number satisfies the estimate κ(Ã) ≤ 4αl(2 − α)^{-1}(1 − ρ)^{-1}.

Remark 1. It follows from Theorems 1 and 5 that the multiplicative method is faster in comparison with the similar additive procedure. However, the latter has a considerable advantage for parallel implementation on a multiprocessor computer, because the calculation of each projection onto a subspace can be done independently.

Remark 2. Theorems 1 and 4 were proved in [10] to analyze the convergence properties of the multiplicative and additive domain decomposition methods. It is evident that Theorems 2, 3 and 5 on the acceleration of projective iterative methods by means of conjugate direction or semi-conjugate direction algorithms in the Krylov subspaces can be used successfully in these applications. Thus, the block variant of SLAE (11) can be interpreted as a matrix representation of the algebraic domain decomposition (ADD) formulation.
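For comparison with the multiplicative variant, the additive (Cimmino-type) process (16)–(17) can be sketched in the same setting (Python/NumPy; the near-identity test matrix and block sizes are arbitrary assumptions of this sketch, with ω = 1):

```python
import numpy as np

rng = np.random.default_rng(8)
N, l = 20, 4
A = np.eye(N) + 0.1 * rng.standard_normal((N, N)) / np.sqrt(N)
u_exact = rng.standard_normal(N)
f = A @ u_exact
blocks = np.array_split(np.arange(N), l)
pinvs = [np.linalg.pinv(A[idx]) for idx in blocks]

u = np.zeros(N)
for _ in range(400):
    # All block projections use the same u^{n-1} and are then averaged, as in (16).
    u = sum(u + Aplus @ (f[idx] - A[idx] @ u)
            for idx, Aplus in zip(blocks, pinvs)) / l
rel_res = np.linalg.norm(A @ u - f) / np.linalg.norm(f)
```

Unlike (12), the l block corrections here are independent of each other, which is what makes the additive variant attractive for parallel implementation.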

4 Iterations in Krylov subspaces with dynamic preconditioning

If we have a large problem, i.e. the original algebraic system (1) has a dimensionality of several millions or hundreds of millions, then it is natural to use some iterative procedure for solving the auxiliary SLAEs at each step of block projection method (12) or (17). In this case we obtain a two-level iterative approach: at the external level we have an iterative method of the form

u^{n+1} = B_n u^n + g_n = u^n + C_n^{-1}(f − Au^n),  B_n = I − C_n^{-1}A,    (19)

with variable (dynamic) iteration matrices B_n and preconditioning matrices C_n, and at the internal level the subsystems of dimensionality m_p are solved iteratively. The acceleration of iterative process (19) in the Krylov subspaces

K_{n+1}(r⁰, C_n^{-1}A) = span{C_0^{-1}r⁰, AC_1^{-1}r⁰, ..., A^n C_n^{-1}r⁰}


can be done by the following dynamically preconditioned semi-conjugate residual (DPSCR) method:

r⁰ = f − Au⁰,  p⁰ = C_0^{-1}r⁰;  for n = 0, 1, ...:
u^{n+1} = u^n + α_n p^n,  r^{n+1} = r^n − α_n Ap^n,
p^{n+1} = C_{n+1}^{-1}r^{n+1} + Σ_{k=0}^{n} β_{n,k}p^k,
α_n = (AC_n^{-1}r^n, r^n)/(Ap^n, Ap^n),  β_{n,k} = −(Ap^k, Ap^{n+1,k})/(Ap^k, Ap^k),    (20)

where the direction vectors can equivalently be built by the Gram–Schmidt recursion p^{n+1,0} = C_{n+1}^{-1}r^{n+1}, p^{n+1,l} = p^{n+1,l−1} + β_{n,l−1}p^{l−1}, p^{n+1} = p^{n+1,n}.

The DPSCR algorithm provides the minimization of the residual norm ‖r^{n+1}‖ in the subspace K_{n+1}(r⁰, C_n^{-1}A), and the following equality is true:

‖r^{n+1}‖² = (r⁰, r⁰) − (AC_0^{-1}r⁰, r⁰)²/(Ap⁰, Ap⁰) − ... − (AC_n^{-1}r^n, r^n)²/(Ap^n, Ap^n).    (21)

Thus, this method converges if the matrices C_n^{-1}A are positive definite. In order to decrease the computational complexity of the algorithm, for large n two reduced versions of method (20) can be applied. The first one is based on the procedure of periodic restarts after every m iterations: for n = ml, l = 1, 2, ..., the residual vector r^n is computed not from the recurrent relation but from the original equation (r^{ml} = f − Au^{ml}), and the subsequent calculations are performed in the conventional form. The second way consists in truncated orthogonalization, i.e. for n ≥ m only the last m direction vectors p^n, ..., p^{n−m+1} and Ap^n, ..., Ap^{n−m+1} are saved in memory and used in the recursion. The following combination of these two approaches can be proposed. Let m_1 be the restart period, m_2 be the number of saved orthogonal direction vectors, and n′ = n − [n/m_2]m_2, where [b] is the integer part of b. Then the unified reduced recursion for p^n is written as

p^{n+1} = C_{n+1}^{-1}r^{n+1} + Σ_{k=n−m+1}^{n} β_{n,k}p^k,  m = min{n′, m_1}.    (22)

It is easy to show from (21) that the reduced versions of DPSCR also converge if the matrices C_n^{-1}A are positive definite for all n.

References

1. G. Golub, C. Van Loan. Matrix Computations. The Johns Hopkins Univ. Press, Baltimore, 1989.
2. O. Axelsson. Iterative Solution Methods. Cambridge Univ. Press, New York, 1994.
3. V. P. Il'in. Iterative Incomplete Factorization Methods. World Scientific Publ., Singapore, 1992.
4. Y. Saad. Iterative Methods for Sparse Linear Systems. PWS Publ., New York, 1996.
5. S. C. Eisenstat, H. C. Elman, M. H. Schultz. Variational iterative methods for nonsymmetric systems of linear equations. SIAM J. Numer. Anal., 20 (1983), pp. 345–357.
6. C. L. Lawson, R. J. Hanson. Solving Least Squares Problems. Prentice-Hall, Inc., New Jersey, 1974.
7. S. Kaczmarz. Angenäherte Auflösung von Systemen linearer Gleichungen. Bull. Internat. Acad. Polon. Sci. Lettres A, 335–357 (1937). Translated into English: Int. J. Control 57(6): 1269–1271 (1993).
8. K. Tanabe. Projection method for solving a singular system of linear equations and its applications. Numer. Math., 17 (1971), pp. 203–214.
9. V. P. Il'in. On the iterative Kaczmarz method and its generalizations (in Russian). Sib. J. Industr. Math., 9 (2006), pp. 39–49.
10. J. H. Bramble, J. E. Pasciak, J. Wang, J. Xu. Convergence estimates for product iterative methods with applications to domain decomposition. Math. Comput., 57 (1991), no. 195, pp. 1–21.
11. G. Cimmino. Calcolo approssimato per le soluzioni dei sistemi di equazioni lineari. La Ricerca Scientifica, II, 9 (1938), pp. 326–333.
12. R. Bramley, A. Sameh. Row projection methods for large nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 13 (1992), pp. 168–193.
13. G. Appleby, D. C. Smolarski. A linear acceleration row action method for projecting onto subspaces. Electronic Transactions on Numer. Anal., 20 (2005), pp. 243–275.

Some Relationships between Optimal Preconditioner and Superoptimal Preconditioner

Jian-Biao Chen¹,⋆, Xiao-Qing Jin²,⋆⋆, Yi-Min Wei³,⋆⋆⋆, and Zhao-Liang Xu¹,†

¹ Department of Mathematics, Shanghai Maritime University, Shanghai 200135, P. R. China.
² Department of Mathematics, University of Macau, Macao, P. R. China.
³ Institute of Mathematics, School of Mathematical Sciences, Fudan University, Shanghai 200433, P. R. China.

xqjin@umac.mo, ymwei@fudan.edu.cn

Abstract. For any given n-by-n matrix An, a specific circulant preconditioner tF(An) proposed by E. Tyrtyshnikov [SIAM J. Matrix Anal. Appl., Vol. 13 (1992), pp. 459–473] is defined to be the solution of

min_{Cn} ‖In − Cn^{−1} An‖F

over all n-by-n nonsingular circulant matrices Cn. The tF(An), called the superoptimal circulant preconditioner, has been proved to be a good preconditioner for a large class of structured systems, including some ill-conditioned problems from image processing. In this paper, we study this preconditioner from an operator viewpoint. We give some relationships between the optimal preconditioner (operator) proposed by T. Chan [SIAM J. Sci. Statist. Comput., Vol. 9 (1988), pp. 766–771] and the superoptimal preconditioner (operator).

Keywords: optimal preconditioner, superoptimal preconditioner.

1 Introduction

In 1986, circulant preconditioners were proposed for solving Toeplitz systems [18, 22] by the preconditioned conjugate gradient method. Since then, the use of

⋆ The research of this author is partially sponsored by the Hi-Tech Research and Development Program of China (grant number: 2007AA11Z249).
⋆⋆ The research of this author is supported by the research grant RG-UL/0708S/Y1/JXQ/FST from University of Macau.
⋆⋆⋆ The research of this author is supported by the National Natural Science Foundation of China under grant 10871051 and the Shanghai Science and Technology Committee under grant 08511501703.
† The research of this author is supported by the National Natural Science Foundation of China under grant 10871051.


circulant preconditioners for solving structured systems has been studied extensively [4, 10–12, 16, 17, 19, 20]. In 1988, T. Chan [6] proposed a specific circulant preconditioner as follows. For any arbitrary matrix An, T. Chan's circulant preconditioner cF(An) is defined to be the minimizer of the Frobenius norm

min_{Cn} ‖An − Cn‖F,

where Cn runs over all circulant matrices. The cF(An) is called the optimal circulant preconditioner in [6]. A generalization of the optimal circulant preconditioner is defined in [9]. More precisely, given a unitary matrix U ∈ C^{n×n}, let

MU ≡ {U∗ Λn U | Λn is any n-by-n diagonal matrix}.   (1)

The optimal preconditioner cU(An) is defined to be the minimizer of

min_{Wn} ‖An − Wn‖F,

where Wn runs over MU. We remark that in (1), when U = F, the Fourier matrix, MF is the set of all circulant matrices [8], and then cU(An) turns back to cF(An). The matrix U can also take other fast discrete transform matrices, such as the discrete Hartley matrix, the discrete sine matrix or the discrete cosine matrix, etc., and then MU is the set of matrices that can be diagonalized by a corresponding fast transform [2, 4, 10, 17]. We refer to [14] for a survey of the optimal preconditioner.

Now we introduce the superoptimal circulant preconditioner proposed by Tyrtyshnikov in 1992. For any arbitrary matrix An, the superoptimal circulant preconditioner tF(An) is defined to be the minimizer of

min_{Cn} ‖In − Cn^{−1} An‖F,

where Cn runs over all nonsingular circulant matrices. The generalized superoptimal preconditioner tU(An) is defined to be the minimizer of

min_{Wn} ‖In − Wn^{−1} An‖F,

where Wn runs over all nonsingular matrices in MU given by (1). Again, tU(An) turns back to tF(An) when U = F. In this paper, we study the superoptimal preconditioner from an operator viewpoint. We give some relationships between the optimal preconditioner (operator) and the superoptimal preconditioner (operator).

Now we introduce some lemmas which will be used later. Let δ(En) denote the diagonal matrix whose diagonal is equal to the diagonal of the matrix En.


Lemma 1. ([3])

Let An ∈ Cn×n . Then cU (An ) ≡ U∗ δ(UAn U∗ )U.
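As a quick numerical illustration of Lemma 1 (not part of the original paper), the formula cU(An) = U∗ δ(U An U∗) U can be evaluated directly with U = F, the unitary Fourier matrix; the helper name `optimal_circulant` below is our own, and the sketch assumes dense NumPy arrays:

```python
import numpy as np

def optimal_circulant(A):
    """T. Chan's optimal circulant preconditioner c_F(A) via Lemma 1:
    c_F(A) = F* delta(F A F*) F, with F the unitary Fourier matrix."""
    n = A.shape[0]
    F = np.fft.fft(np.eye(n)) / np.sqrt(n)   # unitary DFT matrix
    d = np.diag(F @ A @ F.conj().T)          # delta(F A F*): diagonal part in the Fourier basis
    return F.conj().T @ np.diag(d) @ F

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
C = optimal_circulant(A)

# Some other circulant matrix, for comparing Frobenius distances to A.
c = rng.standard_normal(6)
other = np.array([np.roll(c, i) for i in range(6)])
```

Each row of C is a cyclic shift of the previous one, and no circulant matrix is closer to A in the Frobenius norm, which is exactly the minimizer property defining cF(An).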

For a relationship between cU(An) and tU(An), we have

Lemma 2 ([3]). Let An ∈ C^{n×n} be such that An and cU(An) are invertible. Then

tU(An) ≡ cU(An An∗) cU(An∗)^{−1}.

Lemma 3. ([3, 7])

For any matrix An ∈ Cn×n , δ(UAn A∗n U∗ ) − δ(UAn U∗ ) · δ(UA∗n U∗ )

is a positive semi-de nite diagonal matrix. 2
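Lemma 3 is easy to check numerically; the following sketch (an illustration of ours, not from the paper) draws a random An and unitary U and inspects the diagonal of the difference:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
# Any unitary U will do; take the Q factor of a random complex matrix.
U, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

def delta(E):
    """delta(E): the diagonal matrix whose diagonal equals that of E."""
    return np.diag(np.diag(E))

Uh = U.conj().T
D = delta(U @ A @ A.conj().T @ Uh) - delta(U @ A @ Uh) @ delta(U @ A.conj().T @ Uh)
d = np.diag(D)
```

Writing B = U An U∗, the kth diagonal entry of D equals Σ_j |bkj|² − |bkk|², which is real and non-negative: precisely the semi-definiteness asserted by the lemma.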

2 Relationships between cU and tU

The optimal preconditioner was studied from an operator viewpoint in [3]. Let the Banach algebra of all n-by-n matrices over the complex field, equipped with a matrix norm ‖·‖, be denoted by (C^{n×n}, ‖·‖), and let (MU, ‖·‖) be the subalgebra of (C^{n×n}, ‖·‖). We note that MU is an inverse-closed, commutative algebra. Let tU be an operator from (C^{n×n}, ‖·‖) to (MU, ‖·‖) such that for any An in C^{n×n}, tU(An) is the minimizer of ‖In − Wn^{−1} An‖F over all nonsingular Wn ∈ MU. Before we discuss the operator tU in detail, we introduce the following theorem, which is concerned with the operator norms of cU.

Theorem 1 ([2, 3]). For all n ≥ 1, we have

(i) ‖cU‖F ≡ sup_{‖An‖F = 1} ‖cU(An)‖F = 1.
(ii) ‖cU‖2 ≡ sup_{‖An‖2 = 1} ‖cU(An)‖2 = 1.

The following theorem includes some properties of tU(An).

Theorem 2. Let An ∈ C^{n×n} with n ≥ 1 be such that An and cU(An) are invertible. We have

(i) tU(αAn) = α tU(An), for all α ∈ C.
(ii) tU(An∗) = tU(An)∗ for normal An.
(iii) tU(Bn An) = Bn tU(An) for Bn ∈ MU if Bn An and cU(Bn An) are invertible.
(iv) tU(An) is stable for any normal and stable matrix An. We recall that a matrix is stable if all the real parts of its eigenvalues are negative.

Proof. For (i), if α = 0, (i) holds obviously. If α ≠ 0, it follows from Lemma 2 that

tU(αAn) = cU(αAn ᾱAn∗) cU(ᾱAn∗)^{−1} = αᾱ cU(An An∗) [ᾱ cU(An∗)]^{−1} = α cU(An An∗) cU(An∗)^{−1} = α tU(An).

For (ii), we have by Lemma 2 again,

tU(An)∗ = [cU(An An∗) cU(An∗)^{−1}]∗ = [cU(An∗)^{−1}]∗ cU(An An∗) = cU(An)^{−1} cU(An An∗),

and then by Lemma 1,

tU(An∗) = cU(An∗ An) cU(An)^{−1} = U∗ δ(U An∗ An U∗) U U∗ δ(U An U∗)^{−1} U = U∗ δ(U An U∗)^{−1} U U∗ δ(U An∗ An U∗) U = cU(An)^{−1} cU(An∗ An).

Since An is normal, we obtain tU(An∗) = tU(An)∗.

For (iii), we have

tU(Bn An) = cU(Bn An An∗ Bn∗) cU(An∗ Bn∗)^{−1} = Bn cU(An An∗) Bn∗ cU(An∗ Bn∗)^{−1} = Bn cU(An An∗) cU(An∗)^{−1} = Bn tU(An).

For (iv), it follows from [15] that δ(U An U∗) and δ(U An∗ U∗) are stable. Since δ(U An An∗ U∗) is a positive diagonal matrix, we know that δ(U An An∗ U∗) · δ(U An∗ U∗)^{−1} is also stable. ⊓⊔

For (iv), it follows from [15℄ that δ(UAn U∗ ) and δ(UA∗n U∗ ) are stable. Sin e δ(UAn A∗n U∗ ) is a positive diagonal matrix, we know that δ(UAn A∗n U∗ ) · ⊓ ⊔ δ(UA∗n U∗ )−1 is also stable. In general, we remark that (ii) is not true. For example, Let U = I2 and

11 A2 = . 02

It is easy to verify that tU (A∗n ) 6= tU (An )∗ .
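The verification takes a few lines (an illustrative sketch of ours): with U = I2, Lemma 1 gives cU(A) = δ(A), so Lemma 2 reduces to tU(A) = δ(AA∗) δ(A∗)^{−1}.

```python
import numpy as np

def t_I(A):
    """t_U(A) for U = I, via Lemma 2: delta(A A*) delta(A*)^{-1}."""
    return np.diag(np.diag(A @ A.conj().T)) @ np.linalg.inv(np.diag(np.diag(A.conj().T)))

A2 = np.array([[1.0, 1.0],
               [0.0, 2.0]])
lhs = t_I(A2.conj().T)    # t_U(A2*) = diag(1, 5/2)
rhs = t_I(A2).conj().T    # t_U(A2)* = diag(2, 2)
```

The two diagonal matrices differ, confirming that property (ii) fails for this non-normal A2.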

Theorem 3. Let An ∈ C^{n×n} with n ≥ 1 be such that An and cU(An) are invertible. We have

(i) sup_{‖An‖F = 1} ‖tU(An)‖F ≥ 1.
(ii) sup_{‖An‖2 = 1} ‖tU(An)‖2 ≥ 1.


Proof. For (i), we have by Lemmas 1 and 2,

tU(An) = U∗ δ(U An An∗ U∗) δ(U An∗ U∗)^{−1} U.

Notice that, by Lemma 3 and the invertibility of An and cU(An),

δ(U An An∗ U∗) ≥ δ(U An U∗) δ(U An∗ U∗) ≥ 0,

where M ≥ N for matrices M and N means that all the entries of M − N are non-negative. We obtain

|δ(U An An∗ U∗) δ(U An∗ U∗)^{−1}| ≥ |δ(U An U∗)| > 0,   (2)

where |Q| = [|qij|] for any matrix Q = [qij]. Thus we have by (2) and Theorem 1,

sup_{‖An‖F = 1} ‖tU(An)‖F = sup_{‖An‖F = 1} ‖δ(U An An∗ U∗) δ(U An∗ U∗)^{−1}‖F
≥ sup_{‖An‖F = 1} ‖δ(U An U∗)‖F = sup_{‖An‖F = 1} ‖cU(An)‖F = ‖cU‖F = 1.

For (ii), it follows by (2) that

‖tU(An)‖2 = ‖δ(U An An∗ U∗) δ(U An∗ U∗)^{−1}‖2 ≥ ‖δ(U An U∗)‖2 = ‖cU(An)‖2.

Hence by Theorem 1 again,

sup_{‖An‖2 = 1} ‖tU(An)‖2 ≥ ‖cU‖2 = 1. ⊓⊔

Finally, we give a relationship in every unitarily invariant norm between cU(An)^{−1} An and tU(An)^{−1} An.

Theorem 4. Let An ∈ C^{n×n} with n ≥ 1 be such that An and cU(An) are invertible. For every unitarily invariant norm ‖·‖, we have

‖tU(An)^{−1} An‖ ≤ ‖cU(An)^{−1} An‖.

Proof. It follows from [13, Theorem 2.2] that, with the singular values ordered decreasingly as σ1 ≥ σ2 ≥ ··· ≥ σn, we have

σk[tU(An)^{−1} An] ≤ σk[cU(An)^{−1} An],   k = 1, 2, ..., n.

Thus, for every unitarily invariant norm ‖·‖, the result holds by [21, p. 79, Theorem 3.7]. ⊓⊔


References

1. D. Bertaccini, A Circulant Preconditioner for the Systems of LMF-Based ODE Codes, SIAM J. Sci. Comput., Vol. 22 (2000), pp. 767–786.
2. R. Chan and X. Jin, An Introduction to Iterative Toeplitz Solvers, SIAM, Philadelphia, 2007.
3. R. Chan, X. Jin and M. Yeung, The Circulant Operator in the Banach Algebra of Matrices, Linear Algebra Appl., Vol. 149 (1991), pp. 41–53.
4. R. Chan and M. Ng, Conjugate Gradient Methods for Toeplitz Systems, SIAM Review, Vol. 38 (1996), pp. 427–482.
5. R. Chan, M. Ng and X. Jin, Strang-Type Preconditioners for Systems of LMF-Based ODE Codes, IMA J. Numer. Anal., Vol. 21 (2001), pp. 451–462.
6. T. Chan, An Optimal Circulant Preconditioner for Toeplitz Systems, SIAM J. Sci. Statist. Comput., Vol. 9 (1988), pp. 766–771.
7. C. Cheng, X. Jin, S. Vong and W. Wang, A Note on Spectra of Optimal and Superoptimal Preconditioned Matrices, Linear Algebra Appl., Vol. 422 (2007), pp. 482–485.
8. P. Davis, Circulant Matrices, 2nd ed., Chelsea Publishing, New York, 1994.
9. T. Huckle, Circulant and Skew Circulant Matrices for Solving Toeplitz Matrix Problems, SIAM J. Matrix Anal. Appl., Vol. 13 (1992), pp. 767–777.
10. X. Jin, Developments and Applications of Block Toeplitz Iterative Solvers, Science Press, Beijing; and Kluwer Academic Publishers, Dordrecht, 2002.
11. X. Jin, Three Useful Preconditioners in Structured Matrix Computations, Proceedings of the 4th ICCM (2007), Vol. III, pp. 570–591. Eds: L.-Z. Ji, K.-F. Liu, L. Yang and S.-T. Yau, Higher Education Press, Beijing, 2007.
12. X. Jin and Y. Wei, Numerical Linear Algebra and Its Applications, Science Press, Beijing, 2004.
13. X. Jin and Y. Wei, A Short Note on Singular Values of Optimal and Superoptimal Preconditioned Matrices, Int. J. Comput. Math., Vol. 84 (2007), pp. 1261–1263.
14. X. Jin and Y. Wei, A Survey and Some Extensions of T. Chan's Preconditioner, Linear Algebra Appl., Vol. 428 (2008), pp. 403–412.
15. X. Jin, Y. Wei and W. Xu, A Stability Property of T. Chan's Preconditioner, SIAM J. Matrix Anal. Appl., Vol. 25 (2003), pp. 627–629.
16. T. Ku and C. Kuo, Design and Analysis of Toeplitz Preconditioners, IEEE Trans. Signal Process., Vol. 40 (1992), pp. 129–141.
17. M. Ng, Iterative Methods for Toeplitz Systems, Oxford University Press, Oxford, 2004.
18. J. Olkin, Linear and Nonlinear Deconvolution Problems, Ph.D. thesis, Rice University, Houston, 1986.
19. D. Potts and G. Steidl, Preconditioners for Ill-Conditioned Toeplitz Matrices, BIT, Vol. 39 (1999), pp. 579–594.
20. S. Serra, Preconditioning Strategies for Asymptotically Ill-Conditioned Block Toeplitz Systems, BIT, Vol. 34 (1994), pp. 579–594.
21. G. Stewart and J. Sun, Matrix Perturbation Theory, Academic Press, Boston, 1990.
22. G. Strang, A Proposal for Toeplitz Matrix Calculations, Stud. Appl. Math., Vol. 74 (1986), pp. 171–176.
23. E. Tyrtyshnikov, Optimal and Super-Optimal Circulant Preconditioners, SIAM J. Matrix Anal. Appl., Vol. 13 (1992), pp. 459–473.

Scaling, Preconditioning, and Superlinear Convergence in GMRES-type Iterations⋆

Igor Kaporin

Computing Center of Russian Academy of Sciences, Vavilova 40, Moscow 119991, Russia
kaporin@ccas.ru

Abstract. A theoretical justification is found for several standard techniques related to ILU preconditioning, such as pre-scaling and pivot modification, with implications for practical implementation. An improved estimate for the reduction of the GMRES residual is obtained within the general framework of two-stage preconditioning. In particular, an estimate in terms of a conditioning measure of the scaled coefficient matrix and the Frobenius norm of the scaled ILU residual is presented.

Keywords: unsymmetric sparse matrix, two-side scaling, incomplete LU preconditioning, two-stage preconditioning, superlinear convergence.

1 Introduction

In the present paper we address certain theoretical issues related to the construction of computational methods for the numerical solution of large linear systems with general nonsingular unsymmetric sparse coefficient matrices.

As is known, direct solvers (which are based on the "exact" sparse triangular factorization of the matrix) represent a quite robust, advanced and well-established piece of numerical software. As an example, one can refer to the UMFPACK solver [5], which implements an unsymmetric multifrontal sparse Gauss elimination. However, the sparsity structure inherent to many important classes of problems (such as fully three-dimensional discrete models) is rather unsuitable for such methods. This is due to the huge volumes of intermediate data generated by a direct solver (namely, arrays presenting nonzero elements of the triangular factors), which are many orders of magnitude larger than the order of the system. Moreover, the corresponding computation time grows even faster than the storage space as the linear system size increases.

An alternative to direct solvers is represented by iterative methods. Unfortunately, any "classical" fixed-storage simplistic scheme (for instance, the ILU(0)

⋆ This work was partially supported through the Presidium of Russian Academy of Sciences program P-14 and the program "Leading Scientific Schools" (project NSh2240.2006.1).


preconditioned GMRES(m) method) is completely unreliable for general unsymmetric linear systems. More promising are the incomplete LU-type preconditioned Krylov subspace iterative solvers based on the approximate factorization "by value" without any restrictions on the sparsity of the triangular factors. An appropriate use of the "approximate" triangular factorization makes it possible to generate much more compact triangular factors compared to those arising in direct solvers. It should be stressed that almost all results and techniques developed for the "exact" LU-factorization need to be essentially revisited and reformed in order to be useful for the purpose of efficient preconditioning of the Krylov subspace iterations. In this case, for instance, careful pivoting (strictly targeted at "as good as possible" diagonal dominance in the approximate triangular factors) appears to be much more important than any near-optimum pre-ordering, or even the dynamic account for the local fill-in [12].

It can be definitely stated that the currently available software products implementing preconditioned iterative sparse linear solvers still suffer from the following deficiencies: (a) their reliability is still worse than that of direct solvers; (b) in order to provide satisfactory reliability and efficiency, they require quite complicated tuning of the solver control parameters (which are related to the numerical algorithm itself rather than to the problem solved).

Below we present a superlinear convergence estimate for preconditioned GMRES-type iterative linear equation solvers. The formulation of the result is specifically adjusted to the case when the preconditioning is based on an approximate triangular factorization applied to a pre-scaled coefficient matrix. Hence, in addition to many empirical observations (see, e.g., [2]), a certain theoretical evidence is found for the considered robust iterative solvers.

2 Problem setting

Consider the linear algebraic system

Ax = b   (1)

with a general unsymmetric nonsingular sparse n × n matrix A. The Incomplete LU (ILU) preconditioned GMRES-type iterative methods use a preconditioner matrix C ≈ A obtained from the ILU equation

A = PLUQ + E,

where L and U are nonsingular lower and upper triangular matrices, respectively, while P and Q are permutation matrices. Hence, the preconditioner is given by

C = PLUQ,   (2)


which is obviously an "easily invertible" matrix. The additive term E is the ILU error matrix, a standard assumption for which is

|(E)ij| = O(τ),   (3)

where τ ≪ 1 is a prescribed threshold parameter. Note that a more general structure of the error matrix is admissible in preconditioned GMRES-type methods, namely,

|(E − X)ij| = O(τ),   rank(X) = O(1),   (4)

which was proposed in [19] in the context of preconditioned Toeplitz-like system solvers. The low-rank term in the ILU error matrix may arise due to the use of pivot correction, a technique which can be helpful in the case of diagonal pivoting; see [12] for more detail.

We consider the preconditioned Krylov subspace iterative solver for the unsymmetric linear system (1) as an application of GMRES iterations [16] to the right preconditioned system

AC^{−1} y = b,   (5)

so that the solution of (1) is obtained as x = C^{−1} y.

Note: Under a proper choice of the permutation matrices P and Q (mainly aimed at the improvement of the diagonal dominance of L and U), one can observe that

‖E‖F² ≡ trace(E^T E) = O(nτ²),   (6)

i.e., only relatively few nonzero entries of E attain their maximum allowed magnitude. At the same time, the stability of the triangular factors is often improved (more precisely, the ratio cond(C)/cond(A) is not large), which is desirable from the numerical stability viewpoint.
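As an illustration of the scheme (1)–(5) (a minimal sketch, not the solver studied in this paper), SciPy's `spilu` computes an incomplete LU factorization "by value" whose drop tolerance plays the role of the threshold τ in (3), and the resulting factors can be wrapped as the preconditioner action C^{−1}; note that SciPy applies M on its own side internally, whereas the paper's analysis is for the right preconditioned system (5):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# A small unsymmetric, diagonally dominant tridiagonal test matrix.
n = 400
A = sp.diags([-1.0, 2.5, -1.3], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

# Incomplete LU "by value"; drop_tol plays the role of the threshold tau in (3).
ilu = spla.spilu(A, drop_tol=1e-4, fill_factor=10)
M = spla.LinearOperator((n, n), matvec=ilu.solve)  # action of C^{-1}

x, info = spla.gmres(A, b, M=M, atol=1e-12)
residual = np.linalg.norm(b - A @ x)
```

With a tight drop tolerance, C is close to A and GMRES converges in very few iterations; loosening `drop_tol` trades factorization cost against iteration count.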

3 Scaling techniques

It was noted by many authors (see, e.g., [2, 13] and references therein) that the ILU factorization "by value" applied to a properly two-side scaled coefficient matrix

AS = DL A DR   (7)

may yield much better preconditioning compared to similar algorithms applied to the original coefficient matrix A (especially in several hard-to-solve cases, see also [12]). The incomplete triangular factorization (now applied to the scaled matrix (7)) yields the equation

AS = PS LS US QS + ES,


where PS and QS are permutation matrices arising due to pre-ordering and pivoting applied to the scaled matrix. Hence, according to (7) the resulting preconditioner is

DL^{−1} PS LS US QS DR^{−1} ≡ C ≈ A.

Note that in an actual implementation the latter preconditioning can readily be transformed to the same form (2) (though with different triangular factors, even if the permutations were the same).

Next we will construct a preconditioning quality measure via (i) a special condition number of AS (presenting the scaling quality measure) and (ii) the Frobenius norm of the scaled ILU error matrix ES. The corresponding ILU-GMRES convergence estimate can be referred to as a constructive one, because the residual norm bound is expressed literally via the very functionals which are expected to be directly optimized in the procedures of scaling and approximate factorization. Moreover, the improvement of the scaling quality and the attained value of the factorization quality criteria can be readily evaluated a posteriori (numerically).

Note: Below in Section 5 we present a convergence estimate for the GMRES method which does not depend on quantities of the type cond(DL), ‖E‖, or ‖(LU)^{−1}‖. Taking into account that our result holds in "exact arithmetic", one can conclude that "bad" (i.e., too large) values of quality indicators (which are often associated with ILU preconditioning), such as (a) condition numbers of the scaling matrices DL and DR, (b) size of the elements of the unscaled "original" error matrix E, and (c) norm of the inverse of the scaled preconditioner, may have their destructive effect on the GMRES convergence only in the presence of round-off errors.

4 How to estimate GMRES convergence

From now on, let ‖·‖ denote the matrix spectral norm

‖B‖ = max_{z≠0} ‖Bz‖/‖z‖,   ‖z‖ = √(z^T z).   (8)

For the kth residual

rk = b − A xk   (9)

in the preconditioned minimum residual method (also known as GMRES(∞), cf. [16]) one has, by construction,

‖rk‖ = min_{Pk(0)=1} ‖Pk(M) r0‖ = ‖Pk∗(M) r0‖.   (10)

Here

M = AC^{−1}   (11)

is the right preconditioned matrix and Pk∗(·) is the polynomial determined at the kth step of the minimum residual method; this polynomial has degree not greater than k and is normalized by the condition Pk∗(0) = 1. For the sake of simplicity, let M be diagonalizable, that is,

M = V Λ V^{−1}.   (12)

Here, the columns of V are the (normalized) eigenvectors v1, v2, ..., vn of M and the entries of the diagonal matrix Λ are the corresponding eigenvalues λ1, λ2, ..., λn of M. Using (10) and (12), one finds

‖rk‖ = ‖Pk∗(M) r0‖ ≤ ‖P̃k(M) r0‖ = ‖P̃k(VΛV^{−1}) r0‖ = ‖V P̃k(Λ) V^{−1} r0‖ = ‖VD P̃k(Λ) D^{−1} V^{−1} r0‖
≤ ‖VD‖ ‖P̃k(Λ)‖ ‖(VD)^{−1}‖ ‖r0‖ = κ ‖P̃k(Λ)‖ ‖r0‖ = κ max_{1≤i≤n} |P̃k(λi)| ‖r0‖,   (13)

which holds for an arbitrary polynomial P̃k of degree not greater than k normalized by the condition P̃k(0) = 1. Note that hereafter the notation

κ = min_{D=diag.} ‖VD‖ ‖(VD)^{−1}‖ = min_{D=diag.} cond(VD)   (14)

is used to denote the minimized condition number of VD, where D is an arbitrary nonsingular diagonal matrix.

Nontrivial choices of P̃k(·) and upper bounds for max_{1≤i≤n} |P̃k(λi)| are typically obtained via the separation of the spectrum of M into a cluster part and an outlying part; see, for instance, [7, 4, 3] for the case of an SPD matrix M, and [6, 15, 16, 21] for the general case. In [18], an alternative technique is used which allows one to relax diagonalizability condition (12).

A standard approach to the analysis of preconditioned iterations is to use the general theory of Krylov subspace methods for the preconditioned system (5) using substitution (11). Unfortunately, one can hardly estimate and/or control any related properties of the preconditioned matrix M even a posteriori. It is not known how one can effectively relate any characteristics of the "localization" or "distribution" of the eigenvalue spectrum of M to the result of preconditioning. For instance, in general even tr(M) is very hard to estimate (e.g., its exact evaluation seems to cost n times the solution cost of the original linear system). Therefore, we actually reject the use of the preconditioned spectrum as an "interface" between the preconditioning and the iterations. Instead, we separately use some properties of the two factors ES and AS^{−1} in the multiplicative splitting

DL (I − M^{−1}) DL^{−1} = (AS − CS) AS^{−1} = ES AS^{−1}.

This well conforms to the two-stage preconditioning scheme, where at the first stage one improves some conditioning measure of the matrix AS by the choice of DL and DR, and at the second stage one seeks an easily invertible CS which directly approximates AS.

5 Superlinear GMRES convergence via scaled error matrix

Let us denote the singular values of a real-valued n × n matrix Z as

σ1(Z) ≥ σ2(Z) ≥ ··· ≥ σn(Z) ≥ 0.

Recalling the definition of the Frobenius matrix norm given in (6) and taking into account that σi(Z)² is exactly the ith eigenvalue of Z^T Z, one has

‖Z‖F² = tr(Z^T Z) = Σ_{i=1}^{n} σi(Z)².   (15)

Moreover, by (det Z)² = det(Z^T Z) it follows that

|det Z| = Π_{i=1}^{n} σi(Z).   (16)

5.1 Main result

Next we present a scale-invariant generalization of the preconditioned GMRES convergence result earlier presented in [12].

Theorem 1. Let CS be a preconditioner for the scaled matrix AS as defined in (7), and let the iterates xk be generated by the GMRES(∞) method with the preconditioner C = DL^{−1} CS DR^{−1}. Then the kth residual rk = b − A xk satisfies

‖rk‖/‖r0‖ ≤ κ K(AS) (4e (n/k) sin²[CS, AS])^{k/2},   k = 1, 2, ..., n,   (17)

where e = exp(1), the quantity κ was defined in (14),

K(Z) = (n^{−1/2} ‖Z‖F)^n / |det Z|   (18)

denotes the unsymmetric K-condition number of a nonsingular matrix Z, and

sin²[Y, Z] = 1 − (tr Z^T Y)² / (‖Y‖F² ‖Z‖F²)   (19)

denotes the squared sine of the Euclidean acute angle between the matrices Y and Z.
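The two functionals (18) and (19) are cheap to evaluate; the sketch below (our own illustration, with helper names `K` and `sin2` that are not from the paper) also checks that K(Z) ≥ 1 with equality when all singular values of Z coincide, and that sin²[Y, Z] is scale invariant:

```python
import numpy as np

def K(Z):
    """Unsymmetric K-condition number (18): (n^{-1/2} ||Z||_F)^n / |det Z|."""
    n = Z.shape[0]
    return (np.linalg.norm(Z, "fro") / np.sqrt(n)) ** n / abs(np.linalg.det(Z))

def sin2(Y, Z):
    """Squared Euclidean sine (19) of the angle between matrices Y and Z."""
    t = np.trace(Z.T @ Y)
    return 1.0 - t ** 2 / (np.linalg.norm(Y, "fro") ** 2 * np.linalg.norm(Z, "fro") ** 2)

I4 = np.eye(4)
Z = np.diag([1.0, 2.0, 3.0, 4.0])
```

By the arithmetic–geometric mean inequality applied to the squared singular values, K(Z) ≥ 1 always, so K(AS) in (17) directly measures how far AS is from a perfectly conditioned (orthogonal-times-scalar) matrix.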


Proof. Let us define the scalar

ξ = trace(AS^T CS) / ‖CS‖F².   (20)

(Note that if CS ≈ AS, then ξ ≈ 1.) Let the eigenvalues of the preconditioned matrix M = AC^{−1} (recall that C = DL^{−1} CS DR^{−1}) be numbered by decrease of their distance to ξ:

|ξ − λ1| ≥ |ξ − λ2| ≥ ··· ≥ |ξ − λn| ≥ 0.   (21)

Following the techniques introduced in [24] (see also [11]), let us consider the polynomial P̃k of the form

P̃k(λ) = Π_{i=1}^{k} (1 − λ/λi).

Taking into account that P̃k(λj) = 0 for 1 ≤ j ≤ k and using the above ordering of the eigenvalues, one can deduce from (13) the following residual norm estimate:

(1/κ) ‖rk‖/‖r0‖ ≤ max_{1≤j≤n} |P̃k(λj)| = max_{k<j≤n} |P̃k(λj)| = max_{k<j≤n} Π_{i=1}^{k} |1 − λj/λi|
= max_{k<j≤n} Π_{i=1}^{k} |(ξ − λi) − (ξ − λj)| / |λi|
≤ max_{k<j≤n} Π_{i=1}^{k} (|ξ − λi| + |ξ − λj|) / |λi|
≤ 2^k Π_{i=1}^{k} |ξ − λi| / |λi| = 2^k Π_{i=1}^{k} |1 − ξ/λi|
= 2^k Π_{i=1}^{k} |λπ(i)(I − ξM^{−1})| ≤ 2^k Π_{i=1}^{k} |λi(I − ξM^{−1})|,

where the index permutation π(i) corresponds to the reordering of the original numbering (21) according to the decrease of the eigenvalue moduli of the matrix I − ξM^{−1}:

|λ1(I − ξM^{−1})| ≥ |λ2(I − ξM^{−1})| ≥ ··· ≥ |λn(I − ξM^{−1})|.

Next we use the identity

I − ξM^{−1} = I − ξCA^{−1} = (A − ξC)A^{−1} = DL^{−1} (AS − ξCS) AS^{−1} DL

and apply the classical inequalities

Π_{i=1}^{k} |λi(XY)| ≤ Π_{i=1}^{k} σi(XY) ≤ Π_{i=1}^{k} σi(X) σi(Y),


the left of which is known as the Weyl inequality (written for Z = XY), while the right one was found by Horn, see [14], with X = AS − ξCS and Y = AS^{−1}. This yields the following estimate:

(1/κ) ‖rk‖/‖r0‖ ≤ 2^k Π_{i=1}^{k} |λi(I − ξM^{−1})| = 2^k Π_{i=1}^{k} |λi(DL^{−1} (AS − ξCS) AS^{−1} DL)|
= 2^k Π_{i=1}^{k} |λi((AS − ξCS) AS^{−1})| ≤ 2^k Π_{i=1}^{k} σi((AS − ξCS) AS^{−1})
≤ 2^k Π_{i=1}^{k} σi(AS − ξCS) σi(AS^{−1})
= 2^k (Π_{i=1}^{k} (σi(AS − ξCS))²)^{1/2} (Π_{i=1}^{k} σi(AS^{−1})).   (22)

The first product can be bounded using the inequality between the arithmetic and geometric mean values,

(Π_{i=1}^{m} ηi)^{1/m} ≤ (1/m) Σ_{i=1}^{m} ηi,   ηi ≥ 0,   (23)

taken with m = k and ηi = (σi(AS − ξCS))²:

(Π_{i=1}^{k} (σi(AS − ξCS))²)^{1/2} ≤ ((1/k) Σ_{i=1}^{k} (σi(AS − ξCS))²)^{k/2}
≤ ((1/k) Σ_{i=1}^{n} (σi(AS − ξCS))²)^{k/2}
= ((1/k) ‖AS − ξCS‖F²)^{k/2}
= ((1/k) ‖AS‖F² sin²[AS, CS])^{k/2}.   (24)

Here the last equality follows from (19) and (20).


The second product in (22) can also be estimated using inequality (23), this time taken with m = n − k and ηi = (σi(AS))²:

Π_{i=1}^{k} σi(AS^{−1}) = (Π_{i=1}^{k} σ_{n+1−i}(AS))^{−1} = (1/|det AS|) Π_{i=1}^{n−k} σi(AS)
= (1/|det AS|) (Π_{i=1}^{n−k} (σi(AS))²)^{1/2}
≤ (1/|det AS|) ((1/(n − k)) Σ_{i=1}^{n−k} (σi(AS))²)^{(n−k)/2}
≤ (1/|det AS|) ((1/(n − k)) Σ_{i=1}^{n} (σi(AS))²)^{(n−k)/2}
= (1/|det AS|) ((1/(n − k)) ‖AS‖F²)^{(n−k)/2}
≤ (exp(k/2)/|det AS|) (‖AS‖F²/n)^{(n−k)/2}.   (25)

The latter inequality follows from

(n/(n − k))^{(n−k)/2} = exp((n − k)/2 · log(n/(n − k))) ≤ exp((n − k)/2 · (n/(n − k) − 1)) = exp(k/2),

where we have used log η ≤ η − 1.

Substituting now the above two inequalities (24) and (25) into (22), one gets

(1/κ) ‖rk‖/‖r0‖ ≤ 2^k ((1/k) ‖AS‖F² sin²[AS, CS])^{k/2} (exp(k/2)/|det AS|) (‖AS‖F²/n)^{(n−k)/2}
= (4e (n/k) sin²[AS, CS])^{k/2} (n^{−1/2} ‖AS‖F)^n / |det AS|.

Finally, it only remains to recall definition (18), and the required inequality (17) follows. ⊓⊔


Hence, Theorem 1 actually gives a theoretical basis for two-stage preconditionings. For instance, at the first stage one chooses the scaling matrices DL and DR (subject to the condition of near minimization of K(DL A DR), see Section 5.3 below and [12] for more detail), and at the second stage one constructs an easily invertible approximation to the scaled matrix AS = DL A DR, e.g., with the use of an approximate triangular factorization with permutations as in [12].

Note that the earlier superlinear GMRES convergence estimate [11] was formulated in terms of the quantities ‖In − AC^{−1}‖F and |λ(AC^{−1})|min, which, in general, can hardly be estimated even a posteriori. It turns out that simplistic upper bounds like

‖In − AC^{−1}‖F = ‖(A − C)C^{−1}‖F ≤ ‖C^{−1}‖ ‖E‖F

are often senseless due to occasionally huge values of ‖C^{−1}‖; see, for instance, the data in Tables 2–7 below. At the same time, one can see there that "reasonably huge" values of the norm of the inverse preconditioner may not destroy the GMRES convergence.

Also, the above preconditioning quality measure (19) satisfies a natural condition of being a scale-invariant functional of its matrix arguments, that is,

sin²[γCS, αAS] = sin²[CS, AS],   α ≠ 0,   γ ≠ 0.

This well conforms with the obvious fact that the GMRES residual norm is invariant with respect to any re-scaling of the preconditioner (i.e., C := βC, β ≠ 0). Certainly, the particular value of the constant 4e in (17) is somewhat overestimated due to the rather rough techniques used in the proof of Theorem 1. Based on special analytical examples, it can be conjectured that the unimprovable value of this constant equals one.

Note: Starting with a sufficiently large iteration number k, the right-hand sides of the above estimate (17) decrease faster than any geometric progression. In this sense, these estimates confirm the superlinear GMRES convergence which is often observed when the preconditioning is good enough.

5.2 The corresponding GMRES iteration number bound

Using the techniques developed in [11] one can readily find an upper bound for the iteration number needed to attain a specified residual norm reduction ε ≪ 1. We will use the following auxiliary result (for the proof see [11]).

Lemma 1. Let t > 0 and

s ≥ (1 + (1 + e^{−1}) t) / log(e + t),   (26)

where e = exp(1). Then the inequality

s log s ≥ t   (27)

holds.

As was mentioned in [11], for any t > 0 it holds that t < s log s < 1.064 t, i.e., the relative overestimation in (27) is never larger than 6.5%. Now we can prove a GMRES iteration number bound similar to the ones presented in [11], [12].

Theorem 2. The iteration number k sufficient for the ε times reduction of the residual norm in the minimum residual method satisfies

k ≤ ⌈ (4en sin²[CS, AS] + (2 + 2e^{−1}) log(κ ε^{−1} K(AS))) / log(e + (2en sin²[CS, AS])^{−1} log(κ ε^{−1} K(AS))) ⌉   (28)

with κ determined in (14) and e = exp(1).
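To get a feel for bound (28), one can evaluate its right-hand side directly; the sketch below uses hypothetical example values for K(AS), sin²[CS, AS], κ and ε (not data from this paper):

```python
import numpy as np

def gmres_iter_bound(n, K_AS, sin2_CS_AS, kappa=1.0, eps=1e-8):
    """Right-hand side of (28): iterations sufficient for ||r_k||/||r_0|| <= eps."""
    e = np.e
    lg = np.log(kappa * K_AS / eps)                 # log(kappa * eps^{-1} * K(A_S))
    num = 4 * e * n * sin2_CS_AS + (2 + 2 / e) * lg
    den = np.log(e + lg / (2 * e * n * sin2_CS_AS))
    return int(np.ceil(num / den))

# Improving the factorization quality (smaller sin^2[C_S, A_S]) shrinks the bound.
k_coarse = gmres_iter_bound(n=10_000, K_AS=1e3, sin2_CS_AS=1e-3)
k_fine = gmres_iter_bound(n=10_000, K_AS=1e3, sin2_CS_AS=1e-5)
```

The bound grows only logarithmically in κ K(AS)/ε, while it is roughly proportional to n sin²[CS, AS] when the latter dominates, which quantifies the payoff of a more accurate incomplete factorization.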

Proof. By the result of Theorem 1, a sufficient condition for the required inequality ‖rk‖/‖r0‖ ≤ ε is

κ K(AS) (4e (n/k) sin²[CS, AS])^{k/2} ≤ ε,

which can be rewritten as

(k/2) log(k/(4en sin²[CS, AS])) ≥ log(κ ε^{−1} K(AS)).

Multiplying the latter inequality by (2en sin²[CS, AS])^{−1} and denoting

s = k/(4en sin²[CS, AS]),   t = (2en sin²[CS, AS])^{−1} log(κ ε^{−1} K(AS)),

one can see that the resulting inequality is equivalent to condition (27). By Lemma 1, a sufficient condition for (27) to hold is (26), which yields exactly the required estimate (28). The use of the closest integer from above is valid, since the function s log s increases for s > 1/e, and by (26) it holds that s > 1. ⊓⊔

5.3 Relating the new estimate to scaling

In view of (17), it is natural to require that the scaling should minimize functional (18) with Z = AS = DL A DR. As is shown in [12], the minimizer satisfies exactly the requirement that AS have the Euclidean norms of each row and column equal to the same number, e.g.,

Σ_{j=1}^{n} (DL)i² (A)ij² (DR)j² = 1,   Σ_{i=1}^{n} (DL)i² (A)ij² (DR)j² = 1,

onvergen e estimate (17) is redu ed onsiderably. In this respe t, one an even use sparse triangular matri es instead of diagonal DL and DR , as it was done in the two-side expli it pre onditioning proposed and investigated in [8℄. There, a general unsymmetri matrix A was pre onditioned using the two-side transformation b = GL AGU , A

with GL and GU hosen as the sparse lower and upper triangular matri es, respe tively. The positions and values of their entries were determined from the same ondition of K(GL AGU ) minimization. To this end, a RAS-type pro edure was used, where at ea h half-step one evaluates the K- ondition numbA b T or ber minimizer GL or GU , where K(M) = (n−1 trM)n / det M and M = A T b A b , respe tively ( f. also [9℄). The strategy onsidered in [8℄ was as folM=A lows: allowing the matri es GL and GU to have a suÆ iently large number of bA b T omes lose enough to the nonzeroes, one an assume that the matrix M = A identity matrix In to make the expli it Conjugate Gradient iterations eÆ ient in solving the two-side pre onditioned system My = f. Sin e su h a onstru tion is ompletely free of ne essity of solving systems with large sparse triangular matri es, this method is onsidered suitable for the parallel implementation. In the ontext of the present paper, even the use of GL and GU ontaining not more than 2 nonzeroes in ea h row and olumn instead of diagonal matri es DL and DR , respe tively, may result in a further onsiderable redu tion of K(GL AGU ). Moreover, one an expe t that approximate triangular fa torization of the type bb bQ b +E b GL AGU = P LU

Superlinear convergence in preconditioned GMRES


will possess even better preconditioning quality than that obtained with the simple diagonal scaling. In this case, the convergence estimate (17) of Theorem 1 will take the form

    \frac{\|r_k\|}{\|r_0\|} \le \kappa \left( K(G_L A G_U)\, \frac{4en\sin^2[\hat{P}\hat{L}\hat{U}\hat{Q},\, G_L A G_U]}{k} \right)^{k/2}.

Hence, the resulting two-level preconditioner takes the form C = G_L^{-1}\hat{P}\hat{L}\hat{U}\hat{Q}G_U^{-1}, and its application additionally requires two matrix-vector multiplications with the sparse matrices G_L and G_U. Of course, such a scheme would involve certain additional algorithmic complications; however, the expected gain in preconditioning quality should prevail.

5.4 Relating the new estimate to ILU preconditioning

If the matrix A_S = D_L A D_R is scaled to satisfy

    \|A_S\|_F^2 = n, \qquad (29)

(note that (29) holds for scalings obtained using any number of RAS iterations), then the following upper bound holds:

    \sin^2[C_S, A_S] \equiv 1 - \frac{(\operatorname{trace} A_S^T C_S)^2}{\|C_S\|_F^2\,\|A_S\|_F^2}
    = \frac{\min_\sigma \|A_S - \sigma C_S\|_F^2}{\|A_S\|_F^2}
    \le \|A_S - C_S\|_F^2\,\|A_S\|_F^{-2} = n^{-1}\|A_S - C_S\|_F^2 = n^{-1}\|E_S\|_F^2.

Hence, under condition (29), the result of Theorem 1 coincides exactly with the one presented in [12]:

    \frac{\|r_k\|}{\|r_0\|} \le \frac{\kappa}{|\det A_S|} \left( \frac{3.3}{\sqrt{k}}\,\|E_S\|_F \right)^k, \qquad (30)

where we have also used the numerical inequality 2\sqrt{e} < 3.3.
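As a quick numerical sanity check of this chain of inequalities, the following sketch verifies that 1 − (trace A_S^T C_S)²/(‖C_S‖²_F‖A_S‖²_F) never exceeds n⁻¹‖E_S‖²_F once ‖A_S‖²_F = n. The matrices here are synthetic (a random A_S and a perturbed C_S standing in for the ILU product), not an actual ILU factorization:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30

# A_S: a random matrix rescaled so that ||A_S||_F^2 = n (condition (29));
# C_S: a perturbation of A_S playing the role of the ILU product P L U Q.
A = rng.standard_normal((n, n))
A *= np.sqrt(n) / np.linalg.norm(A, "fro")
E = 0.05 * rng.standard_normal((n, n))   # scaled ILU error E_S = A_S - C_S
C = A - E

# sin^2[C_S, A_S] = 1 - trace(A_S^T C_S)^2 / (||C_S||_F^2 ||A_S||_F^2)
t = np.trace(A.T @ C)
sin2 = 1.0 - t**2 / (np.linalg.norm(C, "fro")**2 * np.linalg.norm(A, "fro")**2)

bound = np.linalg.norm(E, "fro")**2 / n  # n^{-1} ||E_S||_F^2
assert sin2 <= bound + 1e-12
```

The inequality holds for any such pair, since sin²[C_S, A_S] equals the minimum of ‖A_S − σC_S‖²_F/‖A_S‖²_F over σ, which the choice σ = 1 can only overestimate.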

It should be noted that if the ILU threshold parameter is chosen sufficiently small, e.g., τ = 0.001, and the ILU factors are stable enough, then the typical values of \|E_S\|_F are not big (one can often observe \|E_S\|_F < 1 even for realistic large-scale problems, cf. the numerical data in [12]). As was noted above, the quantity \|E_S\|_F can easily be evaluated in the process of the approximate factorization of A_S, which allows us to use it as an a posteriori indicator of the ILU preconditioning quality.

Turning back to the low-rank modified form of the error term (4), one can generalize the main result to take into account the case when the pivot modification rule is used in the ILU factorization (see [12] for more detail). Setting ξ = 1 in (22), one finds, for any integer 1 \le m \ll k, the following estimate:

    \frac{\|r_k\|}{\|r_0\|} \le 2^k \kappa
    \left( \prod_{i=1}^{m} \sigma_i(E_S) \right)
    \left( \prod_{i=m+1}^{k} \sigma_i^2(E_S) \right)^{1/2}
    \left( \prod_{i=1}^{k} \sigma_i^{-1}(A_S) \right). \qquad (31)


Estimating these three products separately, one has

    \prod_{i=1}^{m} \sigma_i(E_S) \le \|E_S\|^m,

    \left( \prod_{i=m+1}^{k} \sigma_i^2(E_S) \right)^{1/2}
    \le \left( \frac{1}{k-m} \sum_{i=m+1}^{n} \sigma_i(E_S)^2 \right)^{(k-m)/2}
    = \left( \frac{1}{k-m} \min_{\mathrm{rank}(X)=m} \|E_S - X\|_F^2 \right)^{(k-m)/2},

where we have used the well-known result of Eckart and Young, see, e.g., Theorem B5 in [14], Section 10. Finally, by (25) and (29), it follows that

    \prod_{i=1}^{k} \sigma_i^{-1}(A_S) \le \frac{\exp(k/2)}{|\det A_S|}.

Substituting the latter three inequalities in (31) gives the needed generalization of (30):

    \frac{\|r_k\|}{\|r_0\|} \le \frac{\kappa}{|\det A_S|}
    \left( 3.3\,\|E_S\| \right)^m
    \left( \frac{3.3}{\sqrt{k-m}} \min_{\mathrm{rank}(X)=m} \|E_S - X\|_F \right)^{k-m}. \qquad (32)

One can readily apply the techniques of Section 5.2 and find that the corresponding iteration number bound will differ only by an additive term of the type m + o(m). However, in certain cases, for some moderate value of m, it may hold that

    \min_{\mathrm{rank}(X)=m} \|E_S - X\|_F \ll \|E_S\|_F.

For instance, the use of pivot modifications in ILU algorithms is equivalent to the approximate triangular decomposition of a diagonally perturbed input matrix,

    A_S + \widetilde{D} = P_S L_S U_S Q_S + \widetilde{E}_S,

where \widetilde{D} is a diagonal matrix having only m nonzero elements (which may have considerably larger magnitudes compared to the ILU threshold parameter τ), and the entries of \widetilde{E}_S satisfy the bound (3). Clearly, \mathrm{rank}(\widetilde{D}) = m, and therefore one finds

    \min_{\mathrm{rank}(X)=m} \|E_S - X\|_F
    = \min_{\mathrm{rank}(X)=m} \|\widetilde{E}_S - \widetilde{D} - X\|_F
    \le \|\widetilde{E}_S\|_F,

which quantity may really be considerably smaller than the Frobenius norm of the total residual E_S = \widetilde{E}_S - \widetilde{D}. Hence, one can expect that m pivot modifications in the ILU preconditioning may cost m additional GMRES iterations. It should be noted that the complete diagonal pivoting in ILU described in [12] usually requires a rather small, if any, number of pivot modifications.
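The Eckart–Young step used above can be illustrated directly: for a synthetic error matrix with an m-dimensional dominant part (a stand-in for \widetilde{E}_S − \widetilde{D}, not data from [12]), the best rank-m Frobenius approximation error is the ℓ₂ norm of the tail singular values:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 40, 3

# an E_S-like matrix: a rank-m part (as produced by m pivot modifications)
# plus a small full remainder playing the role of E~_S
E = rng.standard_normal((n, m)) @ rng.standard_normal((m, n)) \
    + 1e-3 * rng.standard_normal((n, n))

# Eckart-Young: min_{rank(X)=m} ||E - X||_F equals the l2 norm of the
# tail singular values sigma_{m+1}, ..., sigma_n.
s = np.linalg.svd(E, compute_uv=False)
tail = np.sqrt(np.sum(s[m:] ** 2))

full = np.linalg.norm(E, "fro")
assert tail < 1e-1 * full   # removing the rank-m part leaves a small residual
```

This is exactly the mechanism by which a few large pivot modifications inflate \|E_S\|_F while leaving the rank-m-corrected error small.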

Table 1. RAS(δ) scaling statistics for 18 test problems with δ = 0.8 and δ = 0.1

Problem    n      nz(A)   log K(A)   δ=0.8: #RAS  log K(A_S)   δ=0.1: #RAS  log K(A_S)
gre_1107   1107   5664    4.487+02   7            4.067+02     35           3.828+02
qh1484     1484   6110    3.562+04   19           1.296+03     39           1.286+03
west2021   2021   7310    1.974+04   21           3.177+02     131          2.006+02
nnc1374    1374   8588    1.409+04   41           9.204+02     79           6.190+02
sherman3   5005   20033   8.789+04   3            1.177+03     4            1.177+03
sherman5   3312   20793   1.140+04   6            2.164+02     20           1.769+02
saylr4     3564   22316   5.760+03   3            4.985+03     4            4.984+03
lnsp3937   3937   25407   7.792+04   29           1.272+03     95           1.216+03
gemat12    4929   33044   1.107+04   13           3.214+03     92           3.154+03
dw8192     8192   41746   2.532+04   3            5.493+03     10           5.486+03
circuit3   12127  48137   3.583+04   22           7.603+03     62           7.545+03
cryg10K    10000  49699   4.562+04   4            6.393+03     27           6.389+03
fd18       16428  63406   2.173+05   17           4.490+03     93           4.015+03
bayer10    13436  71594   1.312+05   24           2.412+03     173          1.917+03
lhr04      4101   82682   3.635+03   26           9.399+02     193          8.253+02
utm5940    5940   83842   5.625+03   12           3.406+03     54           3.332+03
bayer04    20545  85537   4.193+05   45           2.214+03     209          1.648+03
orani678   2529   90158   1.219+03   11           1.639+02     88           1.026+02

Table 2. RAS(0.1)+ILU(0.001) preconditioning statistics for the 18 test problems

Problem    Precond. density  Lower est. for ||C_S^{-1}||  ||E_S||_F  #GMRES iters.  #Estimated iters.
gre_1107   15.87             1.216+06                     1.962-02   10             90
qh1484     2.72              2.108+05                     4.889-03   3              221
west2021   2.73              2.824+11                     1.235-02   4              47
nnc1374    8.18              1.395+07                     1.268-02   21             129
sherman3   4.82              4.675+01                     4.360-02   15             280
sherman5   2.15              3.809+00                     2.801-02   6              49
saylr4     0.80              9.194+02                     1.388-02   60             889
lnsp3937   5.56              1.346+04                     4.833-02   8              294
gemat12    2.52              8.304+05                     2.504-02   12             631
dw8192     4.01              6.982+01                     3.978-02   15             1126
circuit3   1.56              6.375+03                     1.688-02   12             1343
cryg10K    3.72              3.344+03                     4.379-02   34             1315
fd18       11.73             2.278+30                     6.855-02   30             921
bayer10    3.55              7.903+38                     5.448-02   7              452
lhr04      2.04              6.725+03                     6.300-02   18             218
utm5940    5.65              4.073+04                     9.949-02   30             830
bayer04    2.97              1.105+38                     5.123-02   5              390
orani678   0.95              3.656+00                     5.503-02   6              37


Table 3. RAS(0.1)+ILU(0.01) preconditioning statistics for the 18 test problems

Problem    Precond. density  Lower est. for ||C_S^{-1}||  ||E_S||_F  #GMRES iters.  #Estimated iters.
gre_1107   13.73             3.281+03                     2.708-01   21             158
qh1484     2.24              2.243+06                     8.372-02   21             341
west2021   2.37              9.970+02                     1.103-01   8              73
nnc1374    7.27              2.489+04                     1.775-01   48             212
sherman3   2.62              3.451+01                     3.554-01   35             438
sherman5   1.42              4.039+00                     2.624-01   11             85
saylr4     0.78              9.163+02                     4.895-02   59             1064
lnsp3937   3.65              3.564+05                     4.329-01   16             475
gemat12    1.74              9.887+17                     2.667-01   48             963
dw8192     2.73              2.662+01                     3.506-01   45             1670
circuit3   1.23              1.483+03                     1.952-01   56             1969
cryg10K    2.34              1.046+03                     3.828-01   78             1949
fd18       9.33              1.826+45                     7.006-01   93             1507
bayer10    2.52              5.266+50                     5.658-01   12 †           755
lhr04      1.01              1.590+05                     6.587-01   55             392
utm5940    2.52              2.457+02                     8.077-01   68             1338
bayer04    2.28              4.151+32                     5.299-01   9              651
orani678   0.38              2.854+00                     4.440-01   8              70

Table 4. RAS(0.1)+ILU(0.07) preconditioning statistics for the 18 test problems

Problem    Precond. density  Lower est. for ||C_S^{-1}||  ||E_S||_F  #GMRES iters.  #Estimated iters.
gre_1107   9.30              9.996+02                     2.291+00   50             409
qh1484     1.73              2.163+05                     7.746-01   62             596
west2021   1.87              7.695+12                     1.172+00   18             177
nnc1374    5.70              8.907+04                     1.574+00   610 †          452
sherman3   1.56              3.132+01                     1.919+00   73             800
sherman5   1.00              1.770+00                     1.234+00   22             168
saylr4     0.76              8.869+02                     1.444-01   60             1279
lnsp3937   2.23              5.106+04                     2.628+00   36             966
gemat12    1.12              1.932+10                     2.104+00   174            1780
dw8192     1.42              5.544+01                     1.808+00   205            2627
circuit3   0.98              2.001+02                     1.653+00   163            3321
cryg10K    1.55              1.627+02                     2.352+00   175            3272
fd18       5.61              9.233+59                     4.458+00   222            3048
bayer10    1.66              7.786+28                     3.764+00   47             1637
lhr04      0.39              5.661+03                     3.397+00   92             874
utm5940    0.96              6.650+01                     4.138+00   137            2557
bayer04    1.53              7.189+51                     3.625+00   104 †          1442
orani678   0.15              2.408+00                     2.180+00   14             191

Table 5. RAS(0.8)+ILU(0.001) preconditioning statistics for the 18 test problems

Problem    Precond. density  Lower est. for ||C_S^{-1}||  ||E_S||_F  #GMRES iters.  #Estimated iters.
gre_1107   14.98             1.442+07                     2.157-02   14             96
qh1484     3.10              1.665+13                     6.356-03   4              230
west2021   2.94              1.236+06                     1.079-02   6              69
nnc1374    9.18              4.938+04                     6.882-03   36             169
sherman3   4.81              6.704+01                     4.341-02   15             280
sherman5   2.19              3.916+00                     2.643-02   6              58
saylr4     0.79              9.445+02                     1.384-02   60             889
lnsp3937   5.29              6.959+02                     4.531-02   8              302
gemat12    2.55              1.932+04                     2.515-02   13             642
dw8192     3.96              4.462+01                     3.933-02   16             1125
circuit3   1.57              4.101+03                     1.666-02   12             1350
cryg10K    3.73              4.367+04                     4.396-02   41             1316
fd18       13.61             1.745+41                     6.887-02   24             1021
bayer10    4.32              1.735+27                     4.986-02   8              549
lhr04      2.21              9.265+08                     6.237-02   19             244
utm5940    6.14              3.131+03                     1.012-01   32             849
bayer04    3.74              3.142+31                     4.947-02   10             507
orani678   1.06              1.589+01                     5.903-02   5              54

Table 6. RAS(0.8)+ILU(0.01) preconditioning statistics for the 18 test problems

Problem    Precond. density  Lower est. for ||C_S^{-1}||  ||E_S||_F  #GMRES iters.  #Estimated iters.
gre_1107   12.28             4.509+05                     2.640-01   28             165
qh1484     2.49              5.968+10                     8.875-02   19             348
west2021   2.40              2.354+04                     1.440-01   14             114
nnc1374    8.56              5.503+06                     1.594-01   168 †          291
sherman3   2.62              3.515+01                     3.559-01   35             438
sherman5   1.46              4.191+00                     2.577-01   11             99
saylr4     0.78              9.378+02                     4.929-02   59             1065
lnsp3937   3.45              1.648+02                     4.110-01   15             487
gemat12    1.78              2.472+14                     2.602-01   35             973
dw8192     2.72              4.546+01                     3.481-01   48             1669
circuit3   1.23              1.281+03                     2.654-01   44             2106
cryg10K    2.35              7.093+02                     3.858-01   78             1953
fd18       10.41             7.823+41                     7.219-01   139            1673
bayer10    3.41              1.119+39                     4.974-01   28 †           886
lhr04      1.26              3.094+06                     6.223-01   41             428
utm5940    2.80              5.501+03                     8.158-01   70             1367
bayer04    3.05              1.450+54                     5.359-01   24             840
orani678   0.41              5.877+01                     4.694-01   8              99


Fig. 1. Set of points (log k, log k_est) depicting the correlation between the observed and estimated iteration numbers

6 Numerical experiments

The correctness of the above convergence estimate (30) has also been tested numerically, using several small-sized "hard" test matrices taken from the University of Florida Sparse Matrix Collection [1]. The limitation on the sizes of the matrices was imposed in order to ease the "exact" LU factorization of the coefficient matrix A, which was used for the evaluation of log |det A|. The linear systems were solved with an artificial right-hand side b = Ax_*, where the components of the exact solution were chosen as x_*(i) = i/n, i = 1, 2, ..., n. The initial guess was always chosen as x_0 = 0, and the stopping criterion in the GMRES iteration was set as \|\tilde{r}_k\| \le ε\|r_0\| with ε = 10^{-8}, where \|\tilde{r}_k\| is the estimated GMRES residual norm. If the matrix A is very ill-conditioned and the preconditioning is not sufficiently strong (e.g., if the ILU threshold parameter τ is set too large), the true residual norm can be much larger than the estimated one (due to the calculations in finite precision). In the cases of a complete failure, when \|r_k\| > \|r_0\|, we put the "†" mark after the GMRES iteration number in Tables 2–7. In the GMRES(m) scheme, we took m = 900 and used approximate LU preconditioning with the "best" default tuning of the pre-ordering and pivoting (see [12] for more detail).
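A minimal sketch of such a test run, using SciPy's gmres and spilu as stand-ins for the GMRES(m) and ILU(τ) codes of [12]; the matrix here is an arbitrary nonsymmetric example, not one of the 18 test problems, and only the artificial right-hand side b = Ax_* with x_*(i) = i/n and x_0 = 0 follow the text:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import gmres, spilu, LinearOperator

n = 200
# an arbitrary nonsymmetric tridiagonal test matrix (illustrative only)
A = diags([-1.0, 2.5, -1.2], [-1, 0, 1], shape=(n, n), format="csc")
x_star = np.arange(1, n + 1) / n          # exact solution x_*(i) = i/n
b = A @ x_star                            # artificial right-hand side

ilu = spilu(A, drop_tol=1e-3)             # threshold-ILU preconditioner
M = LinearOperator((n, n), matvec=ilu.solve)

x, info = gmres(A, b, M=M, restart=900, x0=np.zeros(n))
assert info == 0                          # converged within the iteration limit
```

A failure of the kind marked "†" in the tables would show up here as a large true residual ‖b − Ax‖ despite a small estimated one.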


Fig. 2. Set of points (log k, log \|E_S\|_F) depicting the correlation between the observed iteration number and the Frobenius norm of the scaled ILU error

Note: It has been observed (especially in calculations with the "nnc1374" matrix) that much better results, in the sense of closeness between the "iterative" and the "true" residual (the latter is r_k = b − Ax_k), are obtained using the BiCGStab iterations [22]. Probably, an improved GMRES implementation [23] (where the plane rotations are replaced by elementary reflections) would be more competitive.

In the scaling procedure, the RAS stopping criterion was

    \max\left( \frac{\max_i \sum_j (A_S)_{ij}^2}{\min_i \sum_j (A_S)_{ij}^2},\;
               \frac{\max_j \sum_i (A_S)_{ij}^2}{\min_j \sum_i (A_S)_{ij}^2} \right) \le 1 + \delta

with δ = 0.1, 0.8, and the ILU threshold parameter τ was set to τ = 0.001, 0.01, 0.07. We present numerical results for 18 sample problems from the above-mentioned collection. The problems are taken from the subset of 60 matrices which was used in [12] for the testing of ILU preconditionings. Hence, the statistics on the total of 2 × 3 × 18 = 108 test runs are reported in Tables 2–7. In Table 1 we list the names of the test matrices with their sizes and numbers of nonzeroes, and present values for the quality measure K(A_S) which characterize the result of scaling. Clearly, the smaller δ, the smaller is K(A_S), which corresponds to a better scaling. However, the number of RAS iterations increases considerably when refining the precision from δ = 0.8 to δ = 0.1.

Fig. 3. Set of points (log k, log \|C_S^{-1}\|) depicting the (absence of) correlation between the observed iteration number and the (lower bound for) spectral norm of the inverse scaled preconditioner

In GMRES(m) we took m = 900 and used approximate LU preconditioning as in [12]. All computing was done in double precision. The iteration number counts and other related data are given in Tables 2–7. For each test run we give:

1. The resulting preconditioner density nz(L + U)/nz(A);
2. The lower bound on the spectral norm of C_S^{-1}, taken as (v^T U_S^{-1})(L_S^{-1} u)/n, where the components of the vectors u and v are 1 or −1, with the signs determined in the course of the back substitutions to obtain a local maximum at each step;
3. The Frobenius norm of the scaled ILU residual E_S;
4. The actual number k of GMRES iterations;
5. The upper bound k_est for the iteration number k, obtained from estimate (30) with κ = 1 in the same way as in the proof of Theorem 2.

First of all, the results presented give another confirmation that a good pre-scaling can be useful for the improvement of the ILU-GMRES performance.

Table 7. RAS(0.8)+ILU(0.07) preconditioning statistics for the 18 test problems

Problem    Precond. density  Lower est. for ||C_S^{-1}||  ||E_S||_F  #GMRES iters.  #Estimated iters.
gre_1107   8.25              9.027+05                     2.353+00   67             433
qh1484     1.92              2.938+08                     7.703-01   59             598
west2021   1.85              4.015+06                     1.124+00   27             236
nnc1374    7.41              1.838+06                     1.650+00   720 †          619
sherman3   1.56              3.129+01                     1.922+00   73             801
sherman5   1.01              1.706+00                     1.436+00   22             210
saylr4     0.76              9.742+02                     1.440-01   60             1279
lnsp3937   2.17              1.929+01                     2.523+00   37             975
gemat12    1.16              5.290+15                     2.142+00   >900 †         1820
dw8192     1.40              2.341+01                     1.777+00   215            2614
circuit3   1.00              2.395+02                     1.649+00   153            3340
cryg10K    1.56              1.694+02                     2.350+00   163            3273
fd18       7.07              2.029+64                     4.708+00   420            3405
bayer10    2.26              3.880+70                     4.046+00   114            2006
lhr04      0.46              3.720+21                     3.887+00   >900 †         1044
utm5940    1.07              4.046+02                     4.218+00   145            2626
bayer04    2.03              7.703+49                     3.686+00   121 †          1789
orani678   0.15              1.685+01                     2.152+00   13             238

Next we address the consistency analysis for the above-presented GMRES convergence theory. One can see that for the cases considered, upper bound (28) is, on average, a twenty-times overestimation of the actual iteration count. However, the relative variations of the upper bound (from one problem to another) correlate with the actual iteration numbers rather well, as is illustrated in Figure 1. (In Figs. 1–3 we have used only the data on 99 out of the 108 test runs, thus ignoring the breakdown occasions marked by "†".) A much weaker correlation is observed between the Frobenius norm of the scaled ILU residual \|E_S\|_F and the actual GMRES iteration number, see Figure 2. Furthermore, the conventional indicator \|C_S^{-1}\| does not demonstrate any correlation with the GMRES iteration number. Note that, if there is a hidden dependence, for instance, of the form k = α k_est^β, then the points (log k, log k_est) lie on the corresponding straight line. The reader may clearly observe that only the discrete set shown in Figure 1 can safely be interpreted as a "linear function plus noise". More precisely, one can find two intersecting straight lines in Figure 1 which, in fact, correspond to two different classes of test problems.
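An estimated iteration count of the kind plotted in Figure 1 can be produced from bound (30) by searching for the smallest admissible k. The sketch below mirrors that construction with κ = 1; the function name k_est and the log-domain evaluation (needed because |det A_S| over- or underflows for the larger problems) are illustrative assumptions:

```python
import math

def k_est(norm_es, logdet_as, eps=1e-8, kappa=1.0, kmax=10**6):
    """Smallest k with kappa/|det A_S| * (3.3*||E_S||_F/sqrt(k))^k <= eps,
    evaluated entirely in logarithms (bound (30), kappa = 1 in the tables)."""
    log_target = math.log(eps) - math.log(kappa) + logdet_as
    for k in range(1, kmax + 1):
        lhs = k * (math.log(3.3 * norm_es) - 0.5 * math.log(k))
        if lhs <= log_target:
            return k
    return None

# the predicted iteration count grows with the scaled ILU error norm,
# matching the trend of the "#Estimated iterations" columns
assert k_est(0.05, 0.0) < k_est(0.5, 0.0) < k_est(5.0, 0.0)
```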

7 Conclusion

First, a theoretical justification is found for the standard pre-scaling technique related to the ILU factorization, with implications for practical implementation. (Namely, a more accurate evaluation of D_L and D_R may be useful, or even sparse matrices with more than n nonzeroes can be used instead of the diagonal ones.) Second, an estimate for the reduction of the original (unscaled) residual is obtained in terms of the scaled ILU error. These results can readily be used as a working tool for the construction of efficient two-stage preconditionings for Krylov subspace methods.

Acknowledgments

The author thanks Eugene Tyrtyshnikov for his kind interest in this research and for his valuable assistance in related presentations.

References

1. Univ. of Florida Sparse Matrix Collection. http://www.cise.ufl.edu/research/sparse/matrices/
2. V.F. de Almeida, A.M. Chapman, and J.J. Derby, On Equilibration and Sparse Factorization of Matrices Arising in Finite Element Solutions of Partial Differential Equations, Numer. Methods Partial Differ. Equ., 16 (2000), pp. 11–29.
3. O. Axelsson and I. Kaporin, On the sublinear and superlinear rate of convergence of conjugate gradient methods, Numerical Algorithms, 25 (2000), pp. 1–22.
4. O. Axelsson and G. Lindskog, On the rate of convergence of the preconditioned conjugate gradient method, Numerische Mathematik, 48 (1986), pp. 499–523.
5. T. Davis, http://www.cise.ufl.edu/research/sparse/umfpack/
6. S.L. Campbell, I.C. Ipsen, C.T. Kelley, and C.D. Meyer, GMRES and the Minimal Polynomial, BIT, 36 (1996), pp. 664–675.
7. A. Jennings, Influence of the eigenvalue spectrum on the convergence rate of the conjugate gradient method, Journal of the Institute of Mathematics and Its Applications, 20 (1977), pp. 61–72.
8. I. Kaporin, Explicitly preconditioned conjugate gradient method for the solution of nonsymmetric linear systems, Int. J. Computer Math., 40 (1992), pp. 169–187.
9. I. Kaporin, New convergence results and preconditioning strategies for the conjugate gradient method, Numer. Linear Algebra Appl., 1 (1994), pp. 179–210.
10. I. Kaporin, High quality preconditioning of a general symmetric positive matrix based on its U^T U + U^T R + R^T U-decomposition, Numer. Linear Algebra Appl., 5 (1998), pp. 484–509.
11. I. Kaporin, Superlinear convergence in minimum residual iterations, Numer. Linear Algebra Appl., 12 (2005), pp. 453–470.
12. I. Kaporin, Scaling, Reordering, and Diagonal Pivoting in ILU Preconditionings, Russian Journal of Numerical Analysis and Mathematical Modelling, 22 (2007), pp. 341–375.
13. O.E. Livne and G.H. Golub, Scaling by Binormalization, Numer. Alg., 35 (2004), pp. 97–120.
14. A.W. Marshall and I. Olkin, Inequalities: Theory of Majorization and its Applications, Academic Press, New York, 1979.
15. I. Moret, A note on the superlinear convergence of GMRES, SIAM Journal on Numerical Analysis, 34 (1997), pp. 513–516.
16. Y. Saad and M.H. Schultz, GMRES: A generalized minimal residual method for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput., 7 (1986), pp. 856–869.
17. M.H. Schneider and S.A. Zenios, A comparative study of algorithms for matrix balancing, Operations Research, 38 (1990), pp. 439–455.
18. V. Simoncini and D.B. Szyld, On the Occurrence of Superlinear Convergence of Exact and Inexact Krylov Subspace Methods, Dept. Math., Temple University Report 03-3-13; Philadelphia, Pennsylvania, March 2003, 25 pp.
19. E.E. Tyrtyshnikov, A unifying approach to some old and new theorems on distribution and clustering, Linear Algebra and Its Applications, 232 (1996), pp. 1–43.
20. E.E. Tyrtyshnikov, Krylov subspace methods and minimal residuals, J. Numer. Math. (2007, submitted).
21. H.A. van der Vorst and C. Vuik, The superlinear convergence behaviour of GMRES, Journal of Computational and Applied Mathematics, 48 (1993), pp. 327–341.
22. H.A. van der Vorst, Bi-CGStab: a fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Statist. Comput., 13 (1992), pp. 631–644.
23. H.F. Walker, Implementation of the GMRES method using Householder transformations, SIAM J. Sci. Statist. Comput., 9 (1988), pp. 152–163.
24. R. Winther, Some superlinear convergence results for the conjugate gradient method, SIAM J. Numer. Anal., 17 (1980), pp. 14–17.

Toeplitz and Toeplitz-block-Toeplitz matrices and their correlation with syzygies of polynomials

Houssam Khalil¹, Bernard Mourrain², and Michelle Schatzman¹

¹ Institut Camille Jordan, 43 boulevard du 11 novembre 1918, 69622 Villeurbanne Cedex, France
  khalil@math.univ-lyon1.fr, schatz@math.univ-lyon1.fr
² INRIA, GALAAD team, 2004 route des Lucioles, BP 93, 06902 Sophia Antipolis Cedex, France
  mourrain@sophia.inria.fr

Abstract. In this paper, we re-investigate the resolution of Toeplitz systems T u = g from a new point of view, by correlating the solution of such problems with syzygies of polynomials or moving lines. We show an explicit connection between the generators of a Toeplitz matrix and the generators of the corresponding module of syzygies. We show that this module is generated by two elements of degree n, and that the solution of T u = g can be reinterpreted as the remainder of an explicit vector depending on g, by these two generators. This approach extends naturally to multivariate problems, and we describe, for Toeplitz-block-Toeplitz matrices, the structure of the corresponding generators.

Keywords: Toeplitz matrix, rational interpolation, syzygies.

1 Introduction

Structured matrices appear in various domains, such as scientific computing, signal processing, ... They usually express, in a linearized way, a problem which depends on fewer parameters than the number of entries of the corresponding matrix. An important area of research is devoted to the development of methods for the treatment of such matrices, which depend on the actual parameters involved in these matrices.

Among well-known structured matrices, Toeplitz and Hankel structures have been intensively studied [5, 6]. Nearly optimal algorithms are known for the multiplication or the resolution of linear systems for such structures. Namely, if A is a Toeplitz matrix of size n, multiplying it by a vector or solving a linear system with A requires Õ(n) arithmetic operations (where Õ(n) = O(n log^c(n)) for some c > 0) [2, 12]. Such algorithms are called super-fast, in opposition to fast algorithms requiring O(n²) arithmetic operations.
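The super-fast Toeplitz-vector product mentioned here is classically obtained by embedding T in a 2n × 2n circulant and multiplying via the FFT; a minimal self-contained sketch (the layout of the coefficient array t is an assumption of this example):

```python
import numpy as np

def toeplitz_matvec(t, u):
    """Multiply the n x n Toeplitz matrix T = (t_{i-j}) by u in O(n log n)
    via circulant embedding; t[k] stores t_{k-(n-1)}, k = 0, ..., 2n-2."""
    n = len(u)
    # first column of the 2n x 2n circulant: t_0..t_{n-1}, a free slot,
    # then the wrap-around t_{-(n-1)}..t_{-1}
    c = np.concatenate([t[n - 1:], [0.0], t[:n - 1]])
    w = np.fft.ifft(np.fft.fft(c) * np.fft.fft(np.concatenate([u, np.zeros(n)])))
    return w[:n].real

rng = np.random.default_rng(3)
n = 64
t = rng.standard_normal(2 * n - 1)
u = rng.standard_normal(n)
T = np.array([[t[(i - j) + (n - 1)] for j in range(n)] for i in range(n)])
assert np.allclose(toeplitz_matvec(t, u), T @ u)
```

This is the evaluation-interpolation technique alluded to below: the circulant product is a polynomial product modulo x^{2n} − 1, computed by FFT.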


The fundamental ingredients in these algorithms are the so-called generators [6], encoding the minimal information stored in these matrices, and on which the matrix transformations are translated. The correlation with other types of structured matrices has also been well developed in the literature [10, 9], allowing to treat as efficiently other structures such as Vandermonde or Cauchy-like structures.

Such problems are strongly connected to polynomial problems [4, 1]. For instance, the product of a Toeplitz matrix by a vector can be deduced from the product of two univariate polynomials, and thus can be computed efficiently by evaluation-interpolation techniques based on FFT. The inverse of a Hankel or Toeplitz matrix is connected to the Bezoutian of the polynomials associated to their generators. However, most of these methods involve univariate polynomials. So far, few investigations have been pursued for the treatment of multilevel structured matrices [11], related to multivariate problems. Such linear systems appear for instance in resultant or residue constructions, in normal form computations, or more generally in multivariate polynomial algebra. We refer to [8] for a general description of such correlations between multi-structured matrices and multivariate polynomials. Surprisingly, they also appear in numerical schemes and preconditioners. A main challenge here is to devise super-fast algorithms of complexity Õ(n) for the resolution of multi-structured systems of size n.

In this paper, we consider block-Toeplitz matrices, where each block is a Toeplitz matrix. Such a structure, which is the first step to multi-level structures, is involved in many bivariate problems, or in numerical linear problems. We first re-investigate the resolution of Toeplitz systems T u = g from a new point of view, by correlating the solution of such problems with syzygies of polynomials or moving lines. We show an explicit connection between the generators of a Toeplitz matrix and the generators of the corresponding module of syzygies. We show that this module is generated by two elements of degree n, and that the solution of T u = g can be reinterpreted as the remainder of an explicit vector depending on g, by these two generators. This approach extends naturally to multivariate problems, and we describe, for Toeplitz-block-Toeplitz matrices, the structure of the corresponding generators. In particular, we show the known result that the module of syzygies of k non-zero bivariate polynomials is free of rank k − 1, by a new elementary proof. Exploiting the properties of moving lines associated to Toeplitz matrices, we give a new point of view to resolve a Toeplitz-block-Toeplitz system.

In the next section we study the scalar Toeplitz case. In Section 3 we consider the Toeplitz-block-Toeplitz case.

Let R = K[x]. For n ∈ N, we denote by K[x]_n the vector space of polynomials of degree ≤ n. Let L = K[x, x^{-1}] be the set of Laurent polynomials in the variable x. For any polynomial p = \sum_{i=-m}^{n} p_i x^i ∈ L, we denote by p^+ the sum of terms


with nonnegative exponents, p^+ = \sum_{i=0}^{n} p_i x^i, and by p^- the sum of terms with strictly negative exponents, p^- = \sum_{i=-m}^{-1} p_i x^i. We have p = p^+ + p^-. For n ∈ N, we denote by U_n = \{ω : ω^n = 1\} the set of roots of unity of order n.

2 Univariate case

We begin with the univariate case and the following problem:

Problem 1. Given a Toeplitz matrix T = (t_{i-j})_{i,j=0}^{n-1} ∈ K^{n×n} (T = (T_{ij})_{i,j=0}^{n-1} with T_{ij} = t_{i-j}) of size n and g = (g_0, ..., g_{n-1}) ∈ K^n, find u = (u_0, ..., u_{n-1}) ∈ K^n such that

    T u = g. \qquad (1)

Let E = \{1, ..., x^{n-1}\}, and let Π_E be the projection of R onto the vector space generated by E, along ⟨x^n, x^{n+1}, ...⟩.

Definition 1. We define the following polynomials:
– T(x) = \sum_{i=-n+1}^{n-1} t_i x^i,
– \widetilde{T}(x) = \sum_{i=0}^{2n-1} \widetilde{t}_i x^i with \widetilde{t}_i = t_i if i < n and \widetilde{t}_i = t_{i-2n} if i ≥ n,
– u(x) = \sum_{i=0}^{n-1} u_i x^i, \quad g(x) = \sum_{i=0}^{n-1} g_i x^i.

Notice that \widetilde{T} = T^+ + x^{2n} T^- and T(ω) = \widetilde{T}(ω) if ω ∈ U_{2n}. We also have (see [8])

    T u = g \iff Π_E(T(x)u(x)) = g(x).
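The equivalence above can be checked numerically: the coefficients of x^0, ..., x^{n-1} in the product T(x)u(x) are exactly the entries of Tu. A small sketch (the storage convention t[d + n − 1] = t_d is an assumption of this example):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 8
t = rng.standard_normal(2 * n - 1)                  # t[d + n - 1] = t_d
u = rng.standard_normal(n)
T = np.array([[t[(i - j) + n - 1] for j in range(n)] for i in range(n)])

# T(x) u(x): convolution of coefficient sequences; entry e of the result
# is the coefficient of x^{e-(n-1)}
prod = np.convolve(t, u)
proj = prod[n - 1:2 * n - 1]                        # Pi_E: keep x^0, ..., x^{n-1}
assert np.allclose(proj, T @ u)
```

The projection Π_E simply discards the coefficients of the negative powers and of x^n and beyond.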

For any polynomial u ∈ K[x] of degree d, we write u(x) = \underline{u}(x) + x^n \overline{u}(x), with \deg(\underline{u}) ≤ n − 1, and \deg(\overline{u}) ≤ d − n if d ≥ n and \overline{u} = 0 otherwise. Then we have

    T(x)u(x) = T(x)\underline{u}(x) + T(x)x^n\overline{u}(x)
             = Π_E(T(x)\underline{u}(x)) + Π_E(T(x)x^n\overline{u}(x))
               + (α_{-n+1}x^{-n+1} + \dots + α_{-1}x^{-1})
               + (α_n x^n + \dots + α_{n+m}x^{n+m})
             = Π_E(T(x)\underline{u}(x)) + Π_E(T(x)x^n\overline{u}(x))
               + x^{-n+1}A(x) + x^n B(x), \qquad (2)

with m = \max(n − 2, d − 1) and

    A(x) = α_{-n+1} + \dots + α_{-1}x^{n-2}, \qquad
    B(x) = α_n + \dots + α_{n+m}x^m. \qquad (3)


See [8] for more details on the correlation between structured matrices and (multivariate) polynomials.

2.1 Moving lines and Toeplitz matrices

We consider here another problem, related to interesting questions in Effective Algebraic Geometry.

Problem 2. Given three polynomials a, b, c ∈ R, respectively of degree < l, < m, < n, find three polynomials p, q, r ∈ R of degree < ν − l, < ν − m, < ν − n, such that

    a(x)p(x) + b(x)q(x) + c(x)r(x) = 0. \qquad (4)

We denote by L(a, b, c) the set of (p, q, r) ∈ K[x]³ which are solutions of (4). It is a K[x]-submodule of K[x]³. The solutions of Problem 2 are L(a, b, c) ∩ (K[x]_{ν-l-1} × K[x]_{ν-m-1} × K[x]_{ν-n-1}). Given a new polynomial d(x) ∈ K[x], we denote by L(a, b, c; d) the set of (p, q, r) ∈ K[x]³ such that a(x)p(x) + b(x)q(x) + c(x)r(x) = d(x).

Theorem 1. For any non-zero vector of polynomials (a, b, c) ∈ K[x]³, the K[x]-module L(a, b, c) is free of rank 2.

Proof. By Hilbert's theorem, the ideal I generated by (a, b, c) has a free resolution of length at most 1, that is, of the form

    0 → K[x]^p → K[x]³ → K[x] → K[x]/I → 0.

As I ≠ 0, for dimensional reasons, we must have p = 2.

Definition 2. A µ-basis of L(a, b, c) is a basis (p, q, r), (p′, q′, r′) of L(a, b, c), with (p, q, r) of minimal degree µ.

Notice that if µ₁ is the smallest degree of a generator and µ₂ the degree of the second generator (p′, q′, r′), we have d = \max(\deg(a), \deg(b), \deg(c)) = µ₁ + µ₂. Indeed, we have

    0 → K[x]_{ν-d-µ₁} ⊕ K[x]_{ν-d-µ₂} → K[x]³_{ν-d} → K[x]_ν → K[x]_ν/(a, b, c)_ν → 0

for ν ≫ 0. As the alternating sum of the dimensions of the K-vector spaces is zero and K[x]_ν/(a, b, c)_ν is 0 for ν ≫ 0, we have

    0 = 3(d − ν − 1) + (ν − µ₁ − d + 1) + (ν − µ₂ − d + 1) + (ν + 1) = d − µ₁ − µ₂.

For L(\widetilde{T}(x), x^n, x^{2n} − 1), we have µ₁ + µ₂ = 2n. We are now going to show that in fact µ₁ = µ₂ = n:


Proposition 1. The K[x]-module L(\widetilde{T}(x), x^n, x^{2n} − 1) has an n-basis.

Proof. Consider the map

    K[x]³_{n-1} → K[x]_{3n-1}, \qquad (p(x), q(x), r(x)) ↦ \widetilde{T}(x)p(x) + x^n q(x) + (x^{2n} − 1)r(x), \qquad (5)

whose 3n × 3n matrix is of the form

    S := \begin{pmatrix} T_0 & 0 & -I_n \\ T_1 & I_n & 0 \\ T_2 & 0 & I_n \end{pmatrix}, \qquad (6)

where T_0, T_1, T_2 are the coefficient matrices of (\widetilde{T}(x), x\widetilde{T}(x), \dots, x^{n-1}\widetilde{T}(x)), respectively for the lists of monomials (1, \dots, x^{n-1}), (x^n, \dots, x^{2n-1}), (x^{2n}, \dots, x^{3n-1}). Notice in particular that T = T_0 + T_2. Reducing the first rows (T_0 | 0 | −I_n) by the last rows (T_2 | 0 | I_n), we replace them by the block (T_0 + T_2 | 0 | 0), without changing the rank of S. As T = T_0 + T_2 is invertible, this shows that the matrix S is of rank 3n. Therefore, there are no syzygies in degree n − 1. As µ₁ + µ₂ = 2n and µ₁ ≤ n, µ₂ ≤ n, where µ₁, µ₂ are the smallest degrees of a pair of generators of L(\widetilde{T}(x), x^n, x^{2n} − 1) of degree ≤ n, we have µ₁ = µ₂ = n. Thus there exist two linearly independent syzygies (u₁, v₁, w₁), (u₂, v₂, w₂) of degree n, which generate L(\widetilde{T}(x), x^n, x^{2n} − 1).

A similar result can also be found in [12], but the proof there, being much longer than this one, is based on interpolation techniques and explicit computations.

Let us now describe how to construct explicitly two generators of L(\widetilde{T}(x), x^n, x^{2n} − 1) of degree n (see also [12]). As \widetilde{T}(x) is of degree ≤ 2n − 1 and the map (5) is surjective, there exists (u, v, w) ∈ K[x]³_{n-1} such that

    \widetilde{T}(x)u(x) + x^n v(x) + (x^{2n} − 1)w(x) = \widetilde{T}(x)x^n, \qquad (7)

and we deduce that (u₁, v₁, w₁) = (x^n − u, −v, −w) ∈ L(\widetilde{T}(x), x^n, x^{2n} − 1). As there exists (u′, v′, w′) ∈ K[x]³_{n-1} such that

    \widetilde{T}(x)u′(x) + x^n v′(x) + (x^{2n} − 1)w′(x) = 1 = x^n \cdot x^n − (x^{2n} − 1), \qquad (8)

we deduce that (u₂, v₂, w₂) = (−u′, x^n − v′, −w′ − 1) ∈ L(\widetilde{T}(x), x^n, x^{2n} − 1). Now, the vectors (u₁, v₁, w₁), (u₂, v₂, w₂) of L(\widetilde{T}(x), x^n, x^{2n} − 1) are linearly independent, since by construction the coefficient vectors of x^n in (u₁, v₁, w₁) and (u₂, v₂, w₂) are respectively (1, 0, 0) and (0, 1, 0).
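The block structure (6) and the identity T = T₀ + T₂ can be verified numerically; the sketch below also confirms that solving S(u; v; w)ᵀ = (g; 0; 0)ᵀ, as in the proof of Proposition 2 below, recovers the solution of Tu = g. The coefficient layout t[d + n − 1] = t_d is an assumption of this example:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6
t = rng.standard_normal(2 * n - 1)                 # t[d + n - 1] = t_d
T = np.array([[t[(i - j) + n - 1] for j in range(n)] for i in range(n)])

# coefficients t~_0 .. t~_{2n-1} of T~(x): t~_i = t_i for i < n,
# t~_i = t_{i-2n} for i >= n (with t_{-n} taken as 0)
tt = np.concatenate([t[n - 1:], [0.0], t[:n - 1]])

# stack the coefficient vectors of T~(x), x T~(x), ..., x^{n-1} T~(x)
# over the monomials 1, ..., x^{3n-1}; the three n-row blocks are T0, T1, T2
P = np.zeros((3 * n, n))
for j in range(n):
    P[j:j + 2 * n, j] = tt
T0, T1, T2 = P[:n], P[n:2 * n], P[2 * n:]
assert np.allclose(T0 + T2, T)                     # "Notice in particular that T = T0 + T2"

In_, Z = np.eye(n), np.zeros((n, n))
S = np.block([[T0, Z, -In_], [T1, In_, Z], [T2, Z, In_]])
g = rng.standard_normal(n)
uvw = np.linalg.solve(S, np.concatenate([g, np.zeros(2 * n)]))
assert np.allclose(T @ uvw[:n], g)                 # the first block solves T u = g
```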

of (1) if and only if there exist

(u(x), v(x), w(x)) ∈ L(T~ (x), xn , x2n − 1; g(x))

Toeplitz and Toeplitz-block-Toeplitz matrices

301

Proof. The vector u is a solution of (1) if and only if ΠE(T(x) u(x)) = g(x). As u(x) is of degree ≤ n − 1, we deduce from (2) and (3) that there exist polynomials A(x) ∈ K[x]_{n−2} and B(x) ∈ K[x]_{n−1} such that

    T(x) u(x) − x^{−n+1} A(x) − x^n B(x) = g(x).

By evaluation at the roots ω ∈ U_{2n}, and since ω^{−n} = ω^n and T~(ω) = T(ω) for ω ∈ U_{2n}, we have

    T~(ω) u(ω) + ω^n v(ω) = g(ω),    ∀ ω ∈ U_{2n},

with v(x) = −x A(x) − B(x) of degree ≤ n − 1. We deduce that there exists w(x) ∈ K[x] such that

    T~(x) u(x) + x^n v(x) + (x^{2n} − 1) w(x) = g(x).

Notice that w(x) is of degree ≤ n − 1, because (x^{2n} − 1) w(x) is of degree ≤ 3n − 1. Conversely, a solution (u(x), v(x), w(x)) ∈ L(T~(x), x^n, x^{2n} − 1; g(x)) ∩ K[x]^3_{n−1} yields a solution (u, v, w) ∈ K^{3n} of the linear system

    S (u, v, w)^T = (g, 0, 0)^T,

where S has the block structure (6), so that T2 u + w = 0 and T0 u − w = (T0 + T2) u = g. As T0 + T2 = T, the vector u is a solution of (1), which ends the proof of the proposition.

2.2 Euclidean division

As a consequence of Proposition 1, we have the following property:

Proposition 3. Let {(u1, v1, w1), (u2, v2, w2)} be an n-basis of L(T~(x), x^n, x^{2n} − 1). The remainder of the division of the particular solution (0, x^n g, −g)^T by the matrix

    ( u1  u2 )
    ( v1  v2 )
    ( w1  w2 )

is the vector solution given in Proposition 2.

Proof. The vector (0, x^n g, −g)^T belongs to L(T~(x), x^n, x^{2n} − 1; g) (a particular solution). Dividing it by the above matrix, we obtain

    ( u )   (   0   )   ( u1  u2 )
    ( v ) = ( x^n g ) − ( v1  v2 ) (p, q)^T.
    ( w )   (  −g   )   ( w1  w2 )

302

H. Khalil, B. Mourrain, M. Schatzman

(u, v, w) is the remainder of the division, thus (u, v, w) ∈ K[x]^3_{n−1} ∩ L(T~(x), x^n, x^{2n} − 1; g). Moreover, (u, v, w) is the unique vector in K[x]^3_{n−1} ∩ L(T~(x), x^n, x^{2n} − 1; g): if there were another one, their difference would belong to L(T~(x), x^n, x^{2n} − 1) ∩ K[x]^3_{n−1}, which is equal to {(0, 0, 0)}.

Problem 3. Given a matrix of polynomials

    ( e(x)  e′(x) )
    ( f(x)  f′(x) )

of degree n such that the leading coefficient matrix ( e_n  e′_n ; f_n  f′_n ) is invertible, and a vector (p(x), q(x))^T of degree m ≥ n, find the remainder of the division of (p(x), q(x))^T by ( e(x)  e′(x) ; f(x)  f′(x) ).

Proposition 4. The first coordinate of the remainder vector of the division of (0, x^n g)^T by ( u1  u2 ; v1  v2 ) is the polynomial solution of (1).
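Before turning to the division algorithm, here is a direct (dense, non-superfast) numerical check of the reformulation above: assembling the matrix S of (6) from random Toeplitz data and verifying both the identity T = T0 + T2 and that the u-block of the solution of S(u, v, w)^T = (g, 0, 0)^T solves (1). The sizes and data are arbitrary illustrative choices (NumPy sketch):

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
t = rng.standard_normal(2 * n - 1)                 # t_{-n+1}, ..., t_{n-1}
T = np.array([[t[n - 1 + i - j] for j in range(n)] for i in range(n)])

# coefficients of T~(x) = T^+(x) + x^{2n} T^-(x), degrees 0..2n-1
tt = np.zeros(2 * n)
tt[:n] = t[n - 1:]                                 # t_0, ..., t_{n-1}
tt[n + 1:] = t[:n - 1]                             # t_{-n+1}, ..., t_{-1}, shifted by 2n

# 3n x n coefficient matrix of (x^j T~(x))_{0 <= j < n}; row blocks T0, T1, T2
C = np.zeros((3 * n, n))
for j in range(n):
    C[j:j + 2 * n, j] = tt
T0, T1, T2 = C[:n], C[n:2 * n], C[2 * n:]
assert np.allclose(T0 + T2, T)                     # the key identity T = T0 + T2

I, Z = np.eye(n), np.zeros((n, n))
S = np.block([[T0, Z, -I], [T1, I, Z], [T2, Z, I]])  # the matrix (6)
g = rng.standard_normal(n)
u, v, w = np.split(np.linalg.solve(S, np.concatenate([g, np.zeros(2 * n)])), 3)
assert np.allclose(T @ u, g)                       # u solves the Toeplitz system (1)
```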

We describe here a generalized Euclidean division algorithm to solve Problem 3. Let

    E(x) = ( p(x) )  of degree m,    B(x) = ( e(x)  e′(x) )  of degree n ≤ m,
           ( q(x) )                         ( f(x)  f′(x) )

and write E(x) = B(x) Q(x) + R(x) with deg R(x) < n and deg Q(x) ≤ m − n. Let z = 1/x. Then

    E(x) = B(x) Q(x) + R(x)
      ⇔ E(1/z) = B(1/z) Q(1/z) + R(1/z)
      ⇔ z^m E(1/z) = z^n B(1/z) · z^{m−n} Q(1/z) + z^{m−n+1} · z^{n−1} R(1/z)
      ⇔ Ê(z) = B̂(z) Q̂(z) + z^{m−n+1} R̂(z),        (9)

where Ê(z), B̂(z), Q̂(z), R̂(z) are the polynomials obtained by reversing the order of the coefficients of E(x), B(x), Q(x), R(x). Hence

    (9) ⇒ Ê(z)/B̂(z) = Q̂(z) + z^{m−n+1} R̂(z)/B̂(z)
        ⇒ Q̂(z) = Ê(z)/B̂(z) mod z^{m−n+1}.

The power series 1/B̂(z) exists because the coefficient of highest degree of B(x) is invertible. Thus Q̂(z) is obtained by computing the first m − n + 1 coefficients of Ê(z)/B̂(z).

To find W(z) = 1/B̂(z) we use Newton's iteration: let f(W) = B̂ − W^{−1}. From f(W_l) = B̂ − W_l^{−1} and f′(W_l)·(W_{l+1} − W_l) = W_l^{−1} (W_{l+1} − W_l) W_l^{−1} = −f(W_l), we obtain

    W_{l+1} = 2 W_l − W_l B̂ W_l.


Take W_0 = B̂(0)^{−1}, which exists. Then

    W − W_{l+1} = W − 2 W_l + W_l B̂ W_l = (W − W_l) B̂ (W − W_l) = W (I_2 − B̂ W_l)^2,

thus W_l(z) = W(z) mod z^{2^l}, for l = 0, ..., ⌈log(m − n + 1)⌉.

Proposition 5. We need O(n log(n) log(m − n) + m log m) arithmetic operations to solve Problem 3.

Proof. We must perform ⌈log(m − n + 1)⌉ Newton iterations to obtain the first m − n + 1 coefficients of 1/B̂ = W(z), and each iteration requires O(n log n) arithmetic operations (multiplication of polynomials of degree n). We then need O(m log m) arithmetic operations for the multiplication Ê · (1/B̂).
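A minimal sketch of this division-by-reversal scheme for scalar polynomials (the text applies it to 2 × 2 matrix polynomials; the scalar case is shown here only for illustration). Coefficient arrays are ordered from low to high degree, and `series_inverse` implements the iteration W_{l+1} = 2W_l − W_l B̂ W_l with all products truncated to the current precision:

```python
import numpy as np

def trunc(a, k):
    """Truncate (or zero-pad) a coefficient array to length k."""
    out = np.zeros(k)
    j = min(len(a), k)
    out[:j] = a[:j]
    return out

def series_inverse(b, k):
    """First k coefficients of 1/B(z) via Newton: W <- 2W - W*B*W (mod z^prec)."""
    w = np.array([1.0 / b[0]])             # constant term of B must be invertible
    prec = 1
    while prec < k:
        prec = min(2 * prec, k)
        wbw = trunc(np.convolve(np.convolve(w, trunc(b, prec)), w), prec)
        w = 2 * trunc(w, prec) - wbw
    return w

def poly_quotient(e, b):
    """Quotient of E by B (coefficients low to high) via the reversal trick (9)."""
    m, n = len(e) - 1, len(b) - 1
    ehat, bhat = e[::-1], b[::-1]          # reverse the coefficient order
    qhat = trunc(np.convolve(ehat, series_inverse(bhat, m - n + 1)), m - n + 1)
    return qhat[::-1]

# E = B*Q + R with B = x^2 + 1, Q = x + 2, R = 3
e = np.array([5.0, 1.0, 2.0, 1.0])         # E = x^3 + 2x^2 + x + 5
b = np.array([1.0, 0.0, 1.0])              # B = x^2 + 1
q = poly_quotient(e, b)                    # recovers [2, 1], i.e. Q = x + 2
r = trunc(e - np.convolve(b, q), len(b) - 1)
```

Here `poly_quotient` recovers Q = x + 2 and the remainder R = 3 from E = (x^2 + 1)(x + 2) + 3.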

2.3 Construction of the generators

The canonical basis of K[x]^3 is denoted by σ1, σ2, σ3. Let ρ1, ρ2 be the generators of L(T~(x), x^n, x^{2n} − 1) of degree n given by

    ρ1 = x^n σ1 − (u, v, w) = (u1, v1, w1),
    ρ2 = x^n σ2 − (u′, v′, w′) = (u2, v2, w2),        (10)

where (u, v, w) and (u′, v′, w′) are the vectors given in (7) and (8). We now describe how to compute (u1, v1, w1) and (u2, v2, w2). We give two methods; the second one is the method of [12]. The first one uses the Euclidean gcd algorithm, so let us first recall the algebraic and computational properties of the well-known extended Euclidean algorithm (see [13]). Given two polynomials p(x), p′(x) of degrees m and m′ respectively, let

    r0 = p,  s0 = 1,  t0 = 0,        r1 = p′,  s1 = 0,  t1 = 1,

and define

    r_{i+1} = r_{i−1} − q_i r_i,    s_{i+1} = s_{i−1} − q_i s_i,    t_{i+1} = t_{i−1} − q_i t_i,

where q_i results when the division algorithm is applied to r_{i−1} and r_i, i.e. r_{i−1} = q_i r_i + r_{i+1} with deg r_{i+1} < deg r_i, for i = 1, ..., l, where l is such that r_l = 0; therefore r_{l−1} = gcd(p(x), p′(x)).


Proposition 6. The following relations hold:

    s_i p + t_i p′ = r_i   and   (s_i, t_i) = 1,        for i = 1, ..., l,

    deg r_{i+1} < deg r_i,   deg s_{i+1} > deg s_i   and   deg t_{i+1} > deg t_i,        for i = 1, ..., l − 1,

and

    deg s_{i+1} = deg(q_i s_i) = deg p′ − deg r_i,    deg t_{i+1} = deg(q_i t_i) = deg p − deg r_i.

Proposition 7. By applying the Euclidean gcd algorithm to p(x) = x^{n−1} T(x) and p′(x) = x^{2n−1}, down to remainders of degree n − 1 and n − 2, we obtain ρ1 and ρ2, respectively.

Proof. We saw that Tu = g if and only if there exist A(x) and B(x) such that

    p(x) u(x) + x^{2n−1} B(x) = x^{n−1} g(x) + A(x),

with p(x) = x^{n−1} T(x) a polynomial of degree ≤ 2n − 2. In (7) and (8) we saw that for g(x) = 1 (g = e1) and for g(x) = x^n T(x) (g = (0, t_{−n+1}, ..., t_{−1})^T) we obtain a basis of L(T~(x), x^n, x^{2n} − 1). Now Tu1 = e1 if and only if there exist A1(x), B1(x) such that

    p(x) u1(x) + x^{2n−1} B1(x) = x^{n−1} + A1(x),        (11)

and Tu2 = (0, t_{−n+1}, ..., t_{−1})^T if and only if there exist A2(x), B2(x) such that

    p(x) (u2(x) + x^n) + x^{2n−1} B2(x) = A2(x),        (12)

with deg A1(x) ≤ n − 2 and deg A2(x) ≤ n − 2. Thus, by applying the extended Euclidean algorithm to p(x) = x^{n−1} T(x) and p′(x) = x^{2n−1} until deg r_l(x) = n − 1 and deg r_{l+1}(x) = n − 2, we obtain

    u1(x) = (1/c1) s_l(x),    B1(x) = (1/c1) t_l(x),    x^{n−1} + A1(x) = (1/c1) r_l(x),

and

    x^n + u2(x) = (1/c2) s_{l+1}(x),    B2(x) = (1/c2) t_{l+1}(x),    A2(x) = (1/c2) r_{l+1}(x),


with c1 and c2 the leading coefficients of r_l(x) and s_{l+1}(x), respectively. Indeed, equation (11) is equivalent to a linear system whose matrix has, as its first n columns, the coefficient vectors of the shifts x^j p(x) (j = 0, ..., n − 1), built from t_{−n+1}, ..., t_{n−1}, and, as its last n columns, the coefficient vectors of x^{2n−1+j} (j = 0, ..., n − 1), i.e. an identity block in the last n rows; the right-hand side is (A1, 1, 0, ..., 0)^T and the unknowns are (u1, B1). Since T is invertible, the (2n − 1) × (2n − 1) block at the bottom is invertible; hence u1 and B1 are unique, and therefore u1, B1 and A1 are unique. Moreover, by Proposition 6, deg r_l = n − 1 (r_l = c1 (x^{n−1} + A1(x))), so deg s_{l+1} = (2n − 1) − (n − 1) = n and deg t_{l+1} = (2n − 2) − (n − 1) = n − 1; by the same proposition, deg s_l ≤ n − 1 and deg t_l ≤ n − 2. Therefore (1/c1) s_l = u1 and (1/c1) t_l = B1.

Finally, Tu = e1 if and only if there exist v(x), w(x) such that

    T~(x) u(x) + x^n v(x) + (x^{2n} − 1) w(x) = 1.        (13)

As T~(x) = T^+(x) + x^{2n} T^−(x) = T(x) + (x^{2n} − 1) T^−(x), this is equivalent to

    T(x) u(x) + x^n v(x) + (x^{2n} − 1)(w(x) + T^−(x) u(x)) = 1.        (14)

On the other hand, T(x) u(x) − x^{−n+1} A1(x) + x^n B1(x) = 1, and x^{−n+1} A1(x) = x^n (x A1(x)) − x^{−n} (x^{2n} − 1) x A1(x), thus

    T(x) u(x) + x^n (B1(x) − x A1(x)) + (x^{2n} − 1) x^{−n+1} A1(x) = 1.        (15)

By comparing (14) and (15), and since 1 = x^n x^n − (x^{2n} − 1), we obtain the proposition, with w(x) = x^{−n+1} A1(x) − T^−(x) u(x) + 1, which is the part of nonnegative degree of −T^−(x) u(x) + 1.

Remark 1. A superfast Euclidean gcd algorithm, which uses no more than O(n log^2 n) arithmetic operations, is given in [13], Chapter 11.

The second method to compute (u1, v1, w1) and (u2, v2, w2) is given in [12]. We are interested in computing the coefficients of σ1 and σ2; the coefficients of σ3 correspond to elements in the ideal (x^{2n} − 1), and thus can be obtained by reduction of (T~(x)  x^n) · B(x) by x^{2n} − 1, with

    B(x) = ( u1(x)  u2(x) ) = ( x^n − u     −u′    )
           ( v1(x)  v2(x) )   (   −v     x^n − v′  ).


A superfast algorithm to compute B(x) is given in [12]. Let us describe how to compute it. By evaluation of (10) at the roots ω_j ∈ U_{2n}, we deduce that (u(x) v(x))^T and (u′(x) v′(x))^T are the solutions of the following rational interpolation problem:

    T~(ω_j) u(ω_j) + ω_j^n v(ω_j) = 0,      with u_n = 1, v_n = 0,
    T~(ω_j) u′(ω_j) + ω_j^n v′(ω_j) = 0,    with u′_n = 0, v′_n = 1.

Definition 3. The τ-degree of a vector polynomial w(x) = (w1(x) w2(x))^T is defined as

    τ-deg w(x) := max{deg w1(x), deg w2(x) − τ}.

B(x) is an n-reduced basis of the module of all vector polynomials r(x) ∈ K[x]^2 that satisfy the interpolation conditions f_j^T r(ω_j) = 0, j = 0, ..., 2n − 1, with f_j = (T~(ω_j), ω_j^n)^T. B(x) is called a τ-reduced basis (with τ = n) that corresponds to the interpolation data (ω_j, f_j), j = 0, ..., 2n − 1.

Definition 4. A set of vector polynomials in K[x]^2 is called τ-reduced if the τ-highest degree coefficients are linearly independent.

Theorem 2. Let τ = n and let J be a positive integer. Let σ1, ..., σJ ∈ K and φ1, ..., φJ ∈ K^2 which are ≠ (0 0)^T. Let 1 ≤ j ≤ J and τ_J ∈ Z. Suppose that B_j(x) ∈ K[x]^{2×2} is a τ_J-reduced basis matrix, with basis vectors having τ_J-degree δ1 and δ2 respectively, corresponding to the interpolation data {(σ_i, φ_i); i = 1, ..., j}. Let τ_{j→J} := δ1 − δ2, and let B_{j→J}(x) be a τ_{j→J}-reduced basis matrix corresponding to the interpolation data {(σ_i, B_j^T(σ_i) φ_i); i = j + 1, ..., J}. Then B_J(x) := B_j(x) B_{j→J}(x) is a τ_J-reduced basis matrix corresponding to the interpolation data {(σ_i, φ_i); i = 1, ..., J}.

Proof. For the proof, see [12]. When we apply this theorem with the ω_j ∈ U_{2n} as interpolation points, we obtain a superfast algorithm (O(n log^2 n)) which computes B(x) [12]. We consider the two following problems:

3 Bivariate case

Let m, n ∈ N. In this section we denote by E = {(i, j); 0 ≤ i ≤ m − 1, 0 ≤ j ≤ n − 1} and R = K[x, y]. We denote by K[x, y]_{m,n} the vector space of bivariate polynomials of degree ≤ m in x and ≤ n in y.

Notation. For a block matrix M of block size n, with each block of size m, we use the following indexing:

    M = (M_{(i1,i2),(j1,j2)})_{0≤i1,j1≤m−1, 0≤i2,j2≤n−1} = (M_{αβ})_{α,β∈E}.        (16)

Here (i2, j2) gives the block position and (i1, j1) the position within the block.

Problem 4. Given a Toeplitz-block-Toeplitz matrix T = (t_{α−β})_{α∈E,β∈E} ∈ K^{mn×mn} (T = (T_{αβ})_{α,β∈E} with T_{αβ} = t_{α−β}) of size mn, and g = (g_α)_{α∈E} ∈ K^{mn}, find u = (u_α)_{α∈E} such that

    T u = g.        (17)

Definition 5. We define the following polynomials:

– T(x, y) := Σ_{(i,j)∈E−E} t_{i,j} x^i y^j,

– T~(x, y) := Σ_{i=0}^{2m−1} Σ_{j=0}^{2n−1} t̃_{i,j} x^i y^j with

    t̃_{i,j} := t_{i,j}           if i < m, j < n,
               t_{i−2m,j}        if i ≥ m, j < n,
               t_{i,j−2n}        if i < m, j ≥ n,
               t_{i−2m,j−2n}     if i ≥ m, j ≥ n,

– u(x, y) := Σ_{(i,j)∈E} u_{i,j} x^i y^j,    g(x, y) := Σ_{(i,j)∈E} g_{i,j} x^i y^j.

3.1 Moving hyperplanes

For any non-zero vector of polynomials a = (a1, ..., an) ∈ K[x, y]^n, we denote by L(a) the set of vectors (h1, ..., hn) ∈ K[x, y]^n such that

    Σ_{i=1}^{n} a_i h_i = 0.        (18)

It is a K[x, y]-submodule of K[x, y]^n.

Proposition 8. The vector u is a solution of (17) if and only if there exist h2, ..., h9 ∈ K[x, y]_{m−1,n−1} such that (u(x, y), h2(x, y), ..., h9(x, y)) belongs to

    L(T~(x, y), x^m, x^{2m} − 1, y^n, x^m y^n, (x^{2m} − 1) y^n, y^{2n} − 1, x^m (y^{2n} − 1), (x^{2m} − 1)(y^{2n} − 1)).


Proof. Let L = {x^{α1} y^{α2}; 0 ≤ α1 ≤ m − 1, 0 ≤ α2 ≤ n − 1}, and let ΠE be the projection of R onto the vector space spanned by L. By [8], we have

    T u = g ⇔ ΠE(T(x, y) u(x, y)) = g(x, y),        (19)

which implies that

    T(x, y) u(x, y) = g(x, y) + x^m y^n A1(x, y) + x^m y^{−n} A2(x, y)
                    + x^{−m} y^n A3(x, y) + x^{−m} y^{−n} A4(x, y)
                    + x^m A5(x, y) + x^{−m} A6(x, y) + y^n A7(x, y) + y^{−n} A8(x, y),        (20)

where the A_i(x, y) are polynomials of degree at most m − 1 in x and n − 1 in y. Since ω^m = ω^{−m}, υ^n = υ^{−n} and T~(ω, υ) = T(ω, υ) for ω ∈ U_{2m}, υ ∈ U_{2n}, we deduce by evaluation at the roots ω ∈ U_{2m}, υ ∈ U_{2n} that

    R(x, y) := T~(x, y) u(x, y) + x^m h2(x, y) + y^n h4(x, y) + x^m y^n h5(x, y) − g(x, y) ∈ (x^{2m} − 1, y^{2n} − 1),

with h2 = −(A5 + A6), h4 = −(A7 + A8), h5 = −(A1 + A2 + A3 + A4). By reduction by the polynomials x^{2m} − 1, y^{2n} − 1, and as R(x, y) is of degree ≤ 3m − 1 in x and ≤ 3n − 1 in y, there exist h3(x, y), h6(x, y), ..., h9(x, y) ∈ K[x, y]_{m−1,n−1} such that

    T~(x, y) u(x, y) + x^m h2(x, y) + (x^{2m} − 1) h3(x, y) + y^n h4(x, y)
    + x^m y^n h5(x, y) + (x^{2m} − 1) y^n h6(x, y) + (y^{2n} − 1) h7(x, y)
    + x^m (y^{2n} − 1) h8(x, y) + (x^{2m} − 1)(y^{2n} − 1) h9(x, y) = g(x, y).        (21)

Conversely, a solution of (21) can be transformed into a solution of (20), which ends the proof of the proposition.

In the following, we denote by T the vector

    T = (T~(x, y), x^m, x^{2m} − 1, y^n, x^m y^n, (x^{2m} − 1) y^n, y^{2n} − 1, x^m (y^{2n} − 1), (x^{2m} − 1)(y^{2n} − 1)).
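The identity T~(ω, υ) = T(ω, υ) for ω ∈ U_{2m}, υ ∈ U_{2n}, which drives the evaluation step in the proof above, can be checked numerically from the case definition of t̃_{i,j} in Definition 5. The sizes below and the convention that the unused indices i = −m, j = −n carry zero coefficients are illustrative assumptions:

```python
import numpy as np

m, n = 3, 4
rng = np.random.default_rng(1)
# t_{i,j} for (i,j) in E - E: -m+1 <= i <= m-1, -n+1 <= j <= n-1, stored on a
# (2m) x (2n) grid; the unused row i = -m and column j = -n are set to 0.
t = np.zeros((2 * m, 2 * n))
t[1:, 1:] = rng.standard_normal((2 * m - 1, 2 * n - 1))

def tval(i, j):                    # t_{i,j} with i in [-m, m-1], j in [-n, n-1]
    return t[i + m, j + n]

# \tilde t_{i,j} for 0 <= i <= 2m-1, 0 <= j <= 2n-1, per the case definition
ttil = np.array([[tval(i - 2 * m if i >= m else i, j - 2 * n if j >= n else j)
                  for j in range(2 * n)] for i in range(2 * m)])

# check that T~ and T agree at the roots of unity: T~(w, v) = T(w, v)
w = np.exp(2j * np.pi * 2 / (2 * m))          # a 2m-th root of unity
v = np.exp(2j * np.pi * 3 / (2 * n))          # a 2n-th root of unity
T_wv = sum(tval(i, j) * w ** i * v ** j
           for i in range(-m + 1, m) for j in range(-n + 1, n))
Ttil_wv = sum(ttil[i, j] * w ** i * v ** j
              for i in range(2 * m) for j in range(2 * n))
assert np.isclose(T_wv, Ttil_wv)
```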

Proposition 9. There is no element of K[x, y]_{m−1,n−1} in L(T).

Proof. We consider the map K[x, y]^9_{m−1,n−1} → K[x, y]_{3m−1,3n−1}        (22)

    p(x, y) = (p1(x, y), ..., p9(x, y)) ↦ T · p        (23)

whose 9mn × 9mn matrix is of the form

         ( T0 | E-block columns )
    S := ( T1 | E-block columns )        (24)
         ( T2 | E-block columns )

where the block columns corresponding to p2, ..., p9 are signed combinations of the matrices E_{ij} := e_{ij} ⊗ I_m (such as E_{21}, −E_{11} + E_{31}, −E_{11}, −E_{21}, E_{11} − E_{31}, ..., E_{2n}, −E_{1n} + E_{3n}), with e_{ij} the 3 × n matrix whose entries are all zero except the (i, j)-th entry, which equals 1, and where the 9mn × mn block (T0; T1; T2) is the coefficient matrix of the shifts x^k y^l T~(x, y), built from banded m × m Toeplitz blocks t_i formed from the coefficients t_{i,−m+1}, ..., t_{i,m−1} of the symbol.

For the same reasons as in the proof of Proposition 1, the matrix S is invertible.

Theorem 3. For any non-zero vector of polynomials a = (a_i)_{i=1,...,n} ∈ K[x, y]^n, the K[x, y]-module L(a1, ..., an) is free of rank n − 1.

Proof. Consider first the case where the a_i are monomials, a_i = x^{α_i} y^{β_i}, sorted in lexicographic order with x < y, a1 being the biggest and an the smallest. Then the module of syzygies of a is generated by the S-polynomials

    S(a_i, a_j) = lcm(a_i, a_j) (σ_i / a_i − σ_j / a_j),

where (σ_i)_{i=1,...,n} is the canonical basis of K[x, y]^n [3]. We easily check that

    S(a_i, a_k) = (lcm(a_i, a_k) / lcm(a_i, a_j)) S(a_i, a_j) − (lcm(a_i, a_k) / lcm(a_j, a_k)) S(a_j, a_k)

if i ≠ j ≠ k and lcm(a_i, a_j) divides lcm(a_i, a_k). Therefore L(a) is generated by the S(a_i, a_j) which are minimal for the division, that is, by the S(a_i, a_{i+1}) (for i = 1, ..., n − 1), since the monomials a_i are sorted lexicographically. As the syzygy S(a_i, a_{i+1}) involves the basis elements σ_i, σ_{i+1}, these syzygies are linearly independent over K[x, y], which shows that L(a) is a free module of rank n − 1 and that we have the following resolution:

    0 → K[x, y]^{n−1} → K[x, y]^n → (a) → 0.

Suppose now that the a_i are general polynomials in K[x, y], and let us compute a Gröbner basis of (a) for a monomial ordering refining the degree [3]. We denote by m1, ..., ms the leading terms of the polynomials in this Gröbner basis, sorted in lexicographic order. The previous construction yields a resolution of (m1, ..., ms):

    0 → K[x, y]^{s−1} → K[x, y]^s → (m_i)_{i=1,...,s} → 0.

Using [7] (or [3]), this resolution can be deformed into a resolution of (a) of the form

    0 → K[x, y]^p → K[x, y]^n → (a) → 0,

which shows that L(a) is also a free module. Its rank p is necessarily equal to n − 1, since the alternating sum of the dimensions of the vector spaces of elements of degree ≤ ν in each module of this resolution must be 0, for every ν ∈ N.

3.2 Generators and reduction

In this section, we describe an explicit set of generators of L(T). The canonical basis of K[x, y]^9 is denoted by σ1, ..., σ9. First, as T~(x, y) is of degree ≤ 2m − 1 in x and ≤ 2n − 1 in y, and as the map (22) is surjective, there exist u1, u2 ∈ K[x, y]^9_{m−1,n−1} such that

    T · u1 = T~(x, y) x^m,    T · u2 = T~(x, y) y^n.

Thus

    ρ1 = x^m σ1 − u1 ∈ L(T),    ρ2 = y^n σ1 − u2 ∈ L(T).

We also have u3 ∈ K[x, y]^9_{m−1,n−1} such that T · u3 = 1 = x^m x^m − (x^{2m} − 1) = y^n y^n − (y^{2n} − 1). We deduce that

    ρ3 = x^m σ2 − σ3 − u3 ∈ L(T),    ρ4 = y^n σ4 − σ7 − u3 ∈ L(T).


Finally, we have the obvious relations:

    ρ5 = y^n σ2 − σ5 ∈ L(T),
    ρ6 = x^m σ4 − σ5 ∈ L(T),
    ρ7 = x^m σ5 − σ6 − σ4 ∈ L(T),
    ρ8 = y^n σ5 − σ8 − σ2 ∈ L(T).

Proposition 10. The relations ρ1, ..., ρ8 form a basis of L(T).

Proof. Let h = (h1, ..., h9) ∈ L(T). By reduction by the previous elements of L(T), we can assume that the coefficients h1, h2, h4, h5 are in K[x, y]_{m−1,n−1}. Thus

    T~(x, y) h1 + x^m h2 + y^n h4 + x^m y^n h5 ∈ (x^{2m} − 1, y^{2n} − 1).

As this polynomial is of degree ≤ 3m − 1 in x and ≤ 3n − 1 in y, by reduction by the polynomials x^{2m} − 1, y^{2n} − 1 we deduce that the coefficients h3, h6, ..., h9 are in K[x, y]_{m−1,n−1}. By Proposition 9, there is no non-zero syzygy in K[x, y]^9_{m−1,n−1}. Thus h = 0, and every element of L(T) can be reduced to 0 by the previous relations. In other words, ρ1, ..., ρ8 is a generating set of the K[x, y]-module L(T). By Theorem 3, the relations ρ_i cannot be dependent over K[x, y], and thus they form a basis of L(T).

3.3 Interpolation

Our aim is now to compute efficiently a system of generators of L(T). More precisely, we are interested in computing the coefficients of σ1, σ2, σ4, σ5 in ρ1, ρ2, ρ3. Let us call B(x, y) the corresponding coefficient matrix, which is of the form

              ( x^m  y^n   0  )
    B(x, y) = (  0    0   x^m ) + K[x, y]^{4,3}_{m−1,n−1}.        (25)
              (  0    0    0  )
              (  0    0    0  )

Notice that the other coefficients of the relations ρ1, ρ2, ρ3 correspond to elements in the ideal (x^{2m} − 1, y^{2n} − 1) and thus can be obtained easily by reduction of the entries of (T~(x, y), x^m, y^n, x^m y^n) · B(x, y) by the polynomials x^{2m} − 1, y^{2n} − 1. Notice also that the relation ρ4 can easily be deduced from ρ3, since ρ3 − x^m σ2 + σ3 + y^n σ4 − σ7 = ρ4. Since the other relations ρ_i (for i > 4) are explicit and independent of T~(x, y), we can easily deduce a basis of L(T) from the matrix B(x, y).

As L(T; g) ∩ K[x, y]_{m−1,n−1} contains only one element, by computing the basis given in Proposition 10 and reducing a particular solution by it, we can obtain this element, which gives us the solution of Tu = g. We can give a fast algorithm for these two steps, but a superfast algorithm is not available.

4 Conclusions

We showed in this paper a correlation between the solution of a Toeplitz system and the syzygies of polynomials. We generalized this approach, and gave a correlation between the solution of a Toeplitz-block-Toeplitz system and the syzygies of bivariate polynomials. In the univariate case we could exploit this correlation to give a superfast resolution algorithm. The generalization of this technique to the bivariate case is not yet clear, and it remains an important challenge.

References

1. D. Bini and V. Y. Pan. Polynomial and matrix computations. Vol. 1. Progress in Theoretical Computer Science. Birkhäuser Boston Inc., Boston, MA, 1994. Fundamental algorithms.
2. R. Bitmead and B. Anderson. Asymptotically fast solution of Toeplitz and related systems of equations. Linear Algebra and Its Applications, 34:103-116, 1980.
3. D. Eisenbud. Commutative algebra, volume 150 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1995. With a view toward algebraic geometry.
4. P. Fuhrmann. A polynomial approach to linear algebra. Springer-Verlag, 1996.
5. G. Heinig and K. Rost. Algebraic methods for Toeplitz-like matrices and operators, volume 13 of Operator Theory: Advances and Applications. Birkhäuser Verlag, Basel, 1984.
6. T. Kailath and A. H. Sayed. Displacement structure: theory and applications. SIAM Rev., 37(3):297-386, 1995.
7. H. M. Möller and F. Mora. New constructive methods in classical ideal theory. J. Algebra, 100(1):138-178, 1986.
8. B. Mourrain and V. Y. Pan. Multivariate polynomials, duality, and structured matrices. J. Complexity, 16(1):110-180, 2000.
9. V. Y. Pan. Nearly optimal computations with structured matrices. In Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms (San Francisco, CA, 2000), pages 953-962, New York, 2000. ACM.
10. V. Y. Pan. Structured matrices and polynomials. Birkhäuser Boston Inc., Boston, MA, 2001. Unified superfast algorithms.
11. E. Tyrtyshnikov. Fast algorithms for block Toeplitz matrices. Sov. J. Numer. Math. Modelling, 1(2):121-139, 1985.
12. M. Van Barel, G. Heinig, and P. Kravanja. A stabilized superfast solver for nonsymmetric Toeplitz systems. SIAM J. Matrix Anal. Appl., 23(2):494-510 (electronic), 2001.
13. J. von zur Gathen and J. Gerhard. Modern computer algebra. Cambridge University Press, Cambridge, second edition, 2003.

Concepts of Data-Sparse Tensor-Product Approximation in Many-Particle Modelling Heinz-Jurgen Flad1, Wolfgang Ha kbus h2, Boris N. Khoromskij2, and Reinhold S hneider1 1

2

Institut fur Mathematik, Te hnis he Universitat Berlin Strae des 17. Juni 137, D-10623 Berlin, Germany {flad,schneidr}@math.tu-berlin.de

Max-Plan k-Institute for Mathemati s in the S ien es Inselstr. 22-26, D-04103 Leipzig, Germany {wh,bokh}@mis.mpg.de

We present on epts of data-sparse tensor approximations to the fun tions and operators arising in many-parti le models of quantum

hemistry. Our approa h is based on the systemati use of stru tured tensor-produ t representations where the low-dimensional omponents are represented in hierar hi al or wavelet based matrix formats. The modern methods of tensor-produ t approximation in higher dimensions are dis ussed with the fo us on analyti ally based approa hes. We give numeri al illustrations whi h on rm the eÆ ien y of tensor de omposition te hniques in ele troni stru ture al ulations. Keywords: S hrodinger equation, Hartree-Fo k method, density fun tional theory, tensor-produ t approximation. Abstract.

1 Introduction

Among the most challenging problems of scientific computing nowadays are those of high dimensions, for instance multi-particle interactions, integral or differential equations on [0, 1]^d, and the related numerical operator calculus for d ≥ 3. Many standard approaches have a computational complexity that grows exponentially in the dimension d, and thus fail because of the well-known "curse of dimensionality". To get rid of this exponential growth in the complexity one can use the idea of tensor-product constructions (cf. [86]) at all stages of the solution process. Hereby we approximate the quantity of interest in tensor-product formats and use other approximation methods for the remaining low-dimensional components. Depending on the specific properties of the problem, these low-dimensional components are already in a data-sparse format, like band-structured matrices, or can be approximated via hierarchical (low-rank) matrix and wavelet formats, respectively. In order to obtain low-rank tensor-product approximations it is convenient to start with a separable approximation of possibly large separation rank. This is the case, e.g., for hyperbolic cross

314

H.-J. Flad, W. Hackbusch, B. Khoromskij, R. Schneider

approximations in tensor-product wavelet bases, or for Gaussian-type and plane-wave basis sets which are frequently used in quantum chemistry and solid state physics. With such a representation at hand it is possible to apply algebraic recompression methods to generate the desired low-rank approximations. We want to stress, however, that these recompression methods in multi-linear algebra lead to severe computational problems, since they are, in fact, equivalent to some kind of nonlinear approximation in d ≥ 3. Despite these computational difficulties, such a procedure is especially favourable for smooth functions with few singularities, which are actually typical for our envisaged applications to be discussed below. A large class of translation-invariant kernels of integral operators can be represented via integral transformations of a separable function, e.g. a Gaussian function. Using exponentially convergent quadrature rules for the parametric integrals, it is possible to derive low-rank tensor-product approximations for these integral operators. In a similar manner it is possible to derive such representations for matrix-valued functions in the tensor-product format.

It is the purpose of the present paper to discuss possible applications of the afore-outlined approach to electronic structure calculations, with applications in quantum chemistry and solid state physics. It will be shown in the following how to combine the different techniques, which complement each other nicely, to provide a feasible numerical operator calculus for some standard many-particle models in quantum chemistry. Within the present work, we focus on the Hartree-Fock method and the Kohn-Sham equations of density functional theory (DFT). We present a brief survey of existing approximation methods, and give some numerical results confirming their efficiency. Our approach aims towards a numerical solution of the Hartree-Fock and Kohn-Sham equations with computational complexity that scales almost linearly in the number of particles (atoms). In particular, large molecular systems such as biomolecules and nanostructures reveal severe limitations of the standard numerical algorithms, and tensor-product approximations might help to overcome at least some of them.

The rest of the paper is organised as follows. Section 2 gives a brief outline of electronic structure calculations and of the Hartree-Fock method in particular. This is followed by a discussion of best N-term approximation and its generalization to tensor-product wavelet bases. We present an application of this approach to the Hartree-Fock method. In Section 4, we first introduce various tensor-product formats for the approximation of functions and matrices in higher dimensions. Thereafter we consider a variety of methods to obtain separable approximations of multivariate functions. These methods center around the Sinc interpolation and convenient integral representations for these functions. Section 5 provides an overview of different data-sparse formats for the univariate components of tensor products. Finally, we discuss in Section 6 possible applications of these tensor-product techniques in order to obtain linear scaling methods for the Hartree-Fock and Kohn-Sham equations.

Tensor-Product Approximation in Many-Particle Modelling

315

2 Basic principles of electronic structure calculations

The physics of stationary states, i.e. time-harmonic quantum mechanical systems of N particles, is completely described by a single wave function

    (r1, s1, ..., rN, sN) ↦ Ψ(r1, s1, ..., rN, sN) ∈ C,    r_i ∈ R^3, s_i ∈ S,

which is a function depending on the spatial coordinates r_i ∈ R^3 of the particles i = 1, ..., N together with their spin degrees of freedom s_i. Since identical quantum mechanical particles, e.g. electrons, cannot be distinguished, the wave function must admit a certain symmetry with respect to the interchange of particles. The Pauli exclusion principle states that for electrons the spin variables can take only two values, s_i ∈ S = {±1/2}, and the wave function has to be antisymmetric with respect to the permutation of particles:

    Ψ(r1, s1, ..., r_i, s_i, ..., r_j, s_j, ..., rN, sN) = −Ψ(r1, s1, ..., r_j, s_j, ..., r_i, s_i, ..., rN, sN).

The Born-Oppenheimer approximation considers a quantum mechanical ensemble of N electrons moving in an exterior electrical field generated by the nuclei of K atoms. Therein the wave function is supposed to be a solution of the stationary electronic Schrödinger equation HΨ = EΨ,

with the many-particle Schrödinger operator (non-relativistic Hamiltonian) H given by

    H := −(1/2) Σ_{i=1}^{N} Δ_i − Σ_{a=1}^{K} Σ_{i=1}^{N} Z_a / |r_i − R_a| + Σ_{i<j≤N} 1 / |r_i − r_j|

a > 1. We will see in a moment that the sparse grid approximation is not too bad: to store both functions f_L and g_L with respect to the given basis requires 2 · 2^L coefficients, whereas the sparse grid approximation requires O(L^2 2^L) nonzero coefficients, in contrast to O(2^{dL}) for the full product. Keeping in mind that a really optimal tensor-product approximation for d ≥ 2 is still an unsolved problem, and in general it might be quite expensive, the sparse grid approximation is simple and cheap from the algorithmic point of view. It also achieves an almost optimal complexity for storage requirements. It is a trivial task to convert an "optimal" tensor-product representation into a sparse grid approximation. The opposite direction is a highly nontrivial task and requires fairly sophisticated compression algorithms. It is worthwhile to mention that previous wavelet matrix compression approaches are based on some Calderón-Zygmund type estimates for the kernels. The sparse grid approximation is intimately related to wavelet matrix compression of integral operators with globally smooth kernels. The kernel functions of Calderón-Zygmund operators are not globally smooth. Nevertheless, it can be shown that they can be approximated within linear or almost linear complexity by means of wavelet Galerkin methods, see e.g. [8, 17-19, 77], since they are smooth in the far-field region. This result is proved provided that the Schwartz kernel K(x, y) in R^d × R^d is approximated by tensor-product bases Ψ ⊗ Ψ, where Ψ is an isotropic wavelet basis in R^d. Recently developed fast methods like wavelet matrix compression and hierarchical matrices work well for isotropic basis functions or isotropic clusters. Corresponding results for sparse grid approximations with ⊗_{i=1}^{d} Ψ^{i} have not been derived so far. Tensor-product bases in the framework of sparse grids do not have this geometric isotropy, which might spoil³

³ It should be mentioned that in our applications at best almost optimal tensor-product approximations can be achieved. This is not of particular significance, since we are aiming at a certain accuracy, and small variations of the separation rank, required in order to achieve this accuracy, do not cause much harm.


the efficiency of these methods. This is not the case for the more general tensor-product approximations of these operators discussed in Sections 4.2.2 and 4.2.3 below. Therefore tensor-product approximations will provide an appropriate and efficient tool for handling nonlocal operators acting on functions which are represented by means of tensor-product (sparse grid) bases. The development of such a tool will play a fundamental role for dealing with operators in high dimensions.
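For d = 2, the storage comparison above can be made concrete by counting coefficients in a simple level-block model: block (l1, l2) carries 2^{l1+l2} coefficients and the sparse grid keeps the blocks with l1 + l2 ≤ L. This model and the closed form it satisfies are illustrative assumptions, not quantities from the text; the count grows like L · 2^L, of the same flavour as the O(L^2 2^L) bound quoted above:

```python
def sparse_grid_count(L):
    """Coefficients kept by the d = 2 sparse grid: level blocks with l1 + l2 <= L."""
    return sum(2 ** (l1 + l2) for l1 in range(L + 1) for l2 in range(L + 1 - l1))

def full_grid_count(L):
    """Coefficients of the full tensor product at level L (d = 2)."""
    return 2 ** (2 * L)

L = 10
assert sparse_grid_count(L) == L * 2 ** (L + 1) + 1   # ~ L * 2^L, far below 2^{2L}
assert sparse_grid_count(L) < full_grid_count(L)
```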

4

Toolkit for tensor-product approximations

The numerical treatment of operators in higher dimensions arising in traditional finite element methods (FEM) and boundary element methods (BEM), as well as in quantum chemistry, material sciences and financial mathematics, faces the fundamental difficulty that the computational cost of traditional methods usually grows exponentially in d, even for algorithms with linear complexity O(N) in the problem size N (indeed, N scales exponentially in d as N = n^d, where n is the "one-dimensional" problem size). There are several approaches to remove the dimension parameter d from the exponent (cf. [5, 41, 49, 53, 58]). For the approximation of functions, such methods are usually based on different forms of the separation of variables. Specifically, a multivariate function F : R^d → R can be approximated in the form

F_r(x_1, ..., x_d) = Σ_{k=1}^{r} s_k Φ_k^{(1)}(x_1) ⋯ Φ_k^{(d)}(x_d) ≈ F,

where the set of functions {Φ_k^{(ℓ)}(x_ℓ)} can be fixed, like the best N-term approximation discussed in Section 3, or chosen adaptively. The latter approach tries to optimize the functions {Φ_k^{(ℓ)}(x_ℓ)} in order to achieve, for a certain separation rank r, at least an almost optimal approximation property. By increasing r, the approximation can be made as accurate as desired. In the case of globally analytic functions there holds r = O(|log ε|^{d−1}), while for analytic functions with point singularities one can prove r = O(|log ε|^{2(d−1)}) (cf. [53]). In the following we give a short overview of various approaches to generate separable approximations with low separation rank. We first introduce in Section 4.1 two different tensor-product formats which are used in the following. Section 4.2 provides a succinct discussion of low-rank tensor-product approximations of special functions, including the Coulomb and Yukawa potentials, for which a certain type of "separable" integral representation exists. This integral representation can be used to obtain separable approximations either by applying the Sinc approximation (Section 4.2.1) or directly through a best N-term approximation by exponential sums (Section 4.2.2).


H.-J. Flad, W. Hackbusch, B. Khoromskij, R. Schneider

4.1 Tensor-product representations in higher dimension

Let a d-th order tensor A = [a_{i_1...i_d}] ∈ C^I be given, defined on the product index set I = I_1 × ... × I_d. It can be approximated via the canonical decomposition (CANDECOMP) or parallel factors (PARAFAC) model (shortly, canonical model) in the following manner:

A ≈ A(r) = Σ_{k=1}^{r} b_k V_k^{(1)} ⊗ ... ⊗ V_k^{(d)},   b_k ∈ C,   (11)

where the Kronecker factors V_k^{(ℓ)} ∈ C^{I_ℓ} are unit-norm vectors which are chosen such that for a certain approximation only a minimal number r of components in the representation (11) is required. The minimal number r is called the Kronecker rank of the given tensor A(r). Here and in the following we use the notation ⊗ to represent the canonical tensor U ≡ [u_i]_{i∈I} = U^{(1)} ⊗ ... ⊗ U^{(d)} ∈ C^I, defined by u_{i_1...i_d} = u_{i_1}^{(1)} ⋯ u_{i_d}^{(d)} with U^{(ℓ)} ≡ [u_{i_ℓ}^{(ℓ)}]_{i_ℓ∈I_ℓ} ∈ C^{I_ℓ}. We make use of the multi-index notation i := (i_1, ..., i_d) ∈ I. The Tucker model deals with the approximation

A ≈ A(r) = Σ_{k_1=1}^{r_1} ... Σ_{k_d=1}^{r_d} b_{k_1...k_d} V_{k_1}^{(1)} ⊗ ... ⊗ V_{k_d}^{(d)},   (12)

where the Kronecker factors V_{k_ℓ}^{(ℓ)} ∈ C^{I_ℓ} (k_ℓ = 1, ..., r_ℓ, ℓ = 1, ..., d) are complex vectors of the respective size n_ℓ = |I_ℓ|, r = (r_1, ..., r_d) is the Tucker rank, and b_{k_1...k_d} ∈ C. Without loss of generality, we assume that the vectors {V_{k_ℓ}^{(ℓ)}} are orthonormal, i.e.,

⟨V_{k_ℓ}^{(ℓ)}, V_{m_ℓ}^{(ℓ)}⟩ = δ_{k_ℓ,m_ℓ},   k_ℓ, m_ℓ = 1, ..., r_ℓ;  ℓ = 1, ..., d,
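For small dense tensors, an orthonormal Tucker representation of the kind in (12) can be computed by the higher-order SVD. The following NumPy sketch is our own illustration (full ranks r_ℓ = n_ℓ are kept, so the reconstruction is exact):

```python
import numpy as np

def unfold(T, mode):
    """Mode-m matricisation: rows indexed by i_m, columns by the remaining indices."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T):
    """Higher-order SVD: orthonormal Kronecker factors U^(l) and a Tucker core."""
    U = [np.linalg.svd(unfold(T, m), full_matrices=False)[0] for m in range(T.ndim)]
    core = T
    for m, Um in enumerate(U):
        # contract mode m of the core with U^(l)^T
        core = np.moveaxis(np.tensordot(Um.T, np.moveaxis(core, m, 0), axes=1), 0, m)
    return core, U

rng = np.random.default_rng(0)
T = rng.standard_normal((3, 4, 5))
core, U = hosvd(T)

# reconstruct: with full Tucker ranks the representation (12) is exact
R = core
for m, Um in enumerate(U):
    R = np.moveaxis(np.tensordot(Um, np.moveaxis(R, m, 0), axes=1), 0, m)
assert np.allclose(T, R)
```

Truncating the columns of each U^(ℓ) to r_ℓ < n_ℓ yields the rank-(r_1, ..., r_d) approximation discussed in the text.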

Here δ_{k_ℓ,m_ℓ} denotes Kronecker's delta. On the level of operators (matrices) we distinguish the following tensor-product structures. Given a matrix A ∈ C^{N×N} with N = n^d, we approximate it within the canonical model by a matrix A(r) of the form

A ≈ A(r) = Σ_{k=1}^{r} V_k^{(1)} ⊗ ⋯ ⊗ V_k^{(d)},   (13)

where the V_k^{(ℓ)} are hierarchically structured matrices of order n × n. Again the important parameter r is denoted as the Kronecker rank.
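A concrete instance of the canonical matrix format (13): the five-point discretization of −∆ on a uniform 2D grid has Kronecker rank 2 (cf. Section 6), since −∆_h = A ⊗ I + I ⊗ A for the one-dimensional three-point stencil A. A small NumPy check (our own illustration) that also exploits the structure in a matrix-by-vector product:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
# one-dimensional three-point stencil (Dirichlet boundary conditions)
A1 = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
I = np.eye(n)

# Kronecker (canonical) rank-2 representation of the 2D five-point stencil
L2 = np.kron(A1, I) + np.kron(I, A1)

# matrix-by-vector product using only n x n operations:
# kron(A1, I) x = vec(A1 X) and kron(I, A1) x = vec(X A1^T) for X = reshape(x)
x = rng.random(n * n)
X = x.reshape(n, n)
y_kron = (A1 @ X + X @ A1.T).reshape(-1)
assert np.allclose(L2 @ x, y_kron)
```

The reshaped product costs O(rn·N^{1/d}·n) = O(rN^{(d+1)/d}) instead of O(N²) for the dense matrix, which is the point of the format.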


We also introduce the following rank-(r_1, ..., r_d) Tucker-type tensor-product matrix format:

A = Σ_{k_1=1}^{r_1} ... Σ_{k_d=1}^{r_d} b_{k_1...k_d} V_{k_1}^{(1)} ⊗ ... ⊗ V_{k_d}^{(d)} ∈ R^{I_1² × ... × I_d²},   (14)

where the Kronecker factors V_{k_ℓ}^{(ℓ)} ∈ R^{I_ℓ×I_ℓ}, k_ℓ = 1, ..., r_ℓ, ℓ = 1, ..., d, are matrices of a certain structure (say, H-matrix, wavelet-based format, Toeplitz/circulant, low-rank, banded, etc.). The matrix representation in the form (14) is a model reduction which generalises the low-rank approximation of matrices, corresponding to the case d = 2. For a class of matrix-valued functions (cf. [53, 58] and Section 6.1 below) it is possible to show that r = O(|log ε|^{2(d−1)}). Further results on the tensor-product approximation of certain matrix-valued functions can be found in [41, 54]. Note that algebraic recompression methods based on the singular value decomposition (SVD) cannot be directly generalised to d > 3. We refer to [5, 6, 25-27, 33, 58, 59, 64, 67, 74, 90] and references therein for a detailed description of the methods of numerical multi-linear algebra. In the following, we stress the significance of analytical methods for the separable approximation of multivariate functions and related function-generated matrices/tensors.

4.2 Separable approximation of functions

Separable approximation of functions plays an important role in the design of effective tensor-product decomposition methods. For a large class of functions (cf. [84, 85]) it is possible to show that tensor-product approximations with low separation rank exist. In this section, we overview the most commonly used methods to construct separable approximations of multivariate functions.

4.2.1 Sinc interpolation methods

Sinc-approximation methods provide efficient tools for interpolating C^∞ functions on R having exponential decay as |x| → ∞ (cf. [80]). Let

S_{k,h}(x) = sin[π(x − kh)/h] / [π(x − kh)/h]   (k ∈ Z, h > 0, x ∈ R)

be the k-th Sinc function with step size h, evaluated at x. Let f be given in the Hardy space H¹(D_δ) with respect to the strip D_δ := {z ∈ C : |ℑz| ≤ δ} for some δ < π/2. For h > 0 and M ∈ N₀, the corresponding Sinc interpolant (cardinal series representation) and quadrature read as

C_M(f, h) = Σ_{k=−M}^{M} f(kh) S_{k,h},    T_M(f, h) = h Σ_{k=−M}^{M} f(kh),


where the latter approximates the integral I(f) = ∫_R f(x) dx. For the interpolation error, the choice h = √(πδ/(bM)) implies the exponential convergence rate

‖f − C_M(f, h)‖_∞ ≤ C M^{1/2} e^{−√(πδbM)}.

Similarly, for the quadrature error, the choice h = √(2πδ/(bM)) yields

|I(f) − T_M(f, h)| ≤ C e^{−√(2πδbM)}.
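As a quick numerical illustration of the quadrature T_M(f, h) (our own example with ad hoc parameters, not taken from the text): for the Gaussian f(x) = e^{−x²} one has I(f) = √π, and the trapezoidal-type sum reproduces it to near machine precision already for moderate M:

```python
import numpy as np

# Sinc quadrature T_M(f, h) = h * sum_{k=-M}^{M} f(k h)
# applied to the Gaussian f(x) = exp(-x^2), for which I(f) = sqrt(pi).
f = lambda x: np.exp(-x**2)
M, h = 30, 0.5          # illustrative choices; the text couples h to M and the decay of f
k = np.arange(-M, M + 1)
TM = h * np.sum(f(k * h))
assert abs(TM - np.sqrt(np.pi)) < 1e-12   # exponential accuracy in M
```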

If f has double-exponential decay as |x| → ∞, i.e.,

|f(ξ)| ≤ C exp(−b e^{a|ξ|})   for all ξ ∈ R with a, b, C > 0,

the convergence rate of both Sinc interpolation and Sinc quadrature can be improved up to O(e^{−cM/log M}). For example, let d = 2. Given a function F(ζ, η) defined in the product domain Ω := [0, 1] × [a, b], a, b ∈ R, we assume that for each fixed η ∈ [a, b] the univariate function F(·, η) belongs to C^∞(0, 1] and allows a certain holomorphic extension (with respect to ζ) to the complex plane C (cf. [53] for more details). Moreover, the function F(·, η) restricted to [0, 1] is allowed to have a singularity with respect to ζ at the end-point ζ = 0 of [0, 1]. Specifically, it is assumed that there is a function φ : R → (0, 1] such that for any η ∈ [a, b] the composition f(x) = F(φ(x), η) belongs to the class H¹(D_δ). For this class of functions a separable approximation is based on the transformed Sinc interpolation [41, 80], leading to

F_M(ζ, η) = Σ_{k=−M}^{M} F(φ(kh), η) S_{k,h}(φ^{−1}(ζ)) ≈ F(ζ, η).

The following error bound holds:

sup_{ζ∈[a,b]} |F(ζ, η) − F_M(ζ, η)| ≤ C e^{−sM/log M},   (15)

with φ^{−1}(ζ) = arsinh(arcosh(ζ^{−1})). In the case of a multivariate function in [0, 1]^{d−1} × [a, b], one can adapt the corresponding tensor-product approximation by successive application of the one-dimensional interpolation (cf. [53]). In the numerical example shown in Fig. 1, we approximate the Euclidean distance |x − y| in R³ on the domain |x_i − y_i| ≤ 1 (i = 1, 2, 3) by Sinc interpolation. To that end, the approximation (15) is applied to the function F(ζ, η, ϑ) = √(ζ² + η² + ϑ²) in Ω := [0, 1]³.


4.2.2 Integral representation methods

Integral representation methods are based on the quadrature approximation of integral Laplace-type transforms representing spherically symmetric functions. In particular, some functions of the Euclidean distance in R^d, say,

1/|x − y|,  |x − y|^β,  e^{−|x−y|},  e^{−λ|x−y|}/|x − y|,   x, y ∈ R^d,

can be approximated by Sinc quadratures of the corresponding Gaussian integral on the semi-axis [41, 53, 54, 65]. For example, in the range 0 < a ≤ |x − y| ≤ A, one can use the integral representation

1/|x − y| = (1/√π) ∫_R exp(−|x − y|² t²) dt = ∫_R F(ρ; t) dt,   x, y ∈ R^d,   (16)

of the Coulomb potential with

F(ρ; t) = (1/√π) e^{−ρ²t²},   ρ = |x − y|,   d = 3.

After the substitutions t = log(1 + e^u) and u = sinh(w) in the integral (16), we apply the quadrature to obtain

T_M(F, h) := h Σ_{k=−M}^{M} cosh(kh) G(ρ, sinh(kh)) ≈ ∫_R F(ρ, t) dt = 1/ρ   (17)

with G(ρ, u) = (2/√π) e^{−ρ² log²(1+e^u)} e^u/(1 + e^u) and with h = C₀ log M/M. The quadrature (17) is proven to converge exponentially in M,

E_M := |1/ρ − T_M(F, h)| ≤ C e^{−sM/log M}.
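The quadrature (17) is easy to reproduce numerically. The following NumPy sketch is our own illustration, with the ad hoc choice C₀ = 1 in h = C₀ log M/M:

```python
import numpy as np

def coulomb_quad(rho, M):
    """Sinc quadrature (17) for 1/rho after the substitutions t = log(1+e^u), u = sinh(w)."""
    h = np.log(M) / M                      # h = C0 log M / M with C0 = 1 (illustrative)
    k = np.arange(-M, M + 1)
    u = np.sinh(k * h)
    G = (2.0 / np.sqrt(np.pi)) * np.exp(-(rho * np.log1p(np.exp(u)))**2) \
        * np.exp(u) / (1.0 + np.exp(u))
    return h * np.sum(np.cosh(k * h) * G)

for rho in (1.0, 5.0, 50.0):
    assert abs(coulomb_quad(rho, M=50) - 1.0 / rho) < 1e-4
```

Each quadrature node contributes one Gaussian exp(−t_k² ρ²), so (17) is at the same time a separable (Kronecker rank 2M+1) approximation of 1/|x − y| in R³.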

Here C and s do not depend on M (but depend on ρ); see [53]. With a proper scaling of the Coulomb potential, one can apply this quadrature on the reference interval ρ ∈ [1, R]. A numerical example for this quadrature with values ρ ∈ [1, R], R ≤ 5000, is presented in Fig. 2. We observe almost linear error growth in ρ. In electronic structure calculations, the Galerkin discretisation of the Coulomb potential in tensor-product wavelet bases is of specific interest. For simplicity, we consider an isotropic 3d-wavelet basis

γ_{j,a}^{(s)}(x) := ψ_{j,a_1}^{(s_1)}(x_1) ψ_{j,a_2}^{(s_2)}(x_2) ψ_{j,a_3}^{(s_3)}(x_3),

where the functions ψ_{j,a}^{(0)}(x) := 2^{j/2} ψ^{(0)}(2^j x − a) and ψ_{j,a}^{(1)}(x) := 2^{j/2} ψ^{(1)}(2^j x − a), with j, a ∈ Z, correspond to univariate scaling functions and wavelets, respectively. The nonstandard representation of the Coulomb potential (cf. [8, 34])


requires integrals of the form

∫_{R³} ∫_{R³} γ_{j,a}^{(p)}(x) (1/|x−y|) γ_{j,b}^{(q)}(y) d³x d³y = (2^{−2j+1}/√π) ∫_0^∞ I^{(p,q)}(t, a − b) dt,

with

I^{(p,q)}(t, a) = G^{(p_1,q_1)}(a_1, t) G^{(p_2,q_2)}(a_2, t) G^{(p_3,q_3)}(a_3, t),

and

G^{(p,q)}(a, t) = ∫∫_{R×R} ψ^{(p)}(x − a) e^{−(x−y)²t²} ψ^{(q)}(y) dx dy.

In order to benefit from the tensor-product structure, it is important to have a uniform error bound with respect to the spatial separation |a − b| of the wavelets. Recently, the following theorem was proven by Schwinger [79].

Theorem 3. Given a univariate wavelet basis ψ_{j,a}^{(p)} which satisfies

|∫ ψ^{(p)}(x − y) ψ^{(q)}(y) dy| ≲ e^{−c|x|}   for c > 0,

then for any δ < π/4, the integration error of the exponential quadrature rule (cf. [80]) with h = √(πδ/M) (h = √(2πδ/M) for pure scaling functions, i.e., p = q = (0, 0, 0)) satisfies

|∫_0^∞ I^{(p,q)}(t, a) dt − h Σ_{m=−M}^{M} e^{mh} I^{(p,q)}(e^{mh}, a)| ≤ C e^{−α√M},   (18)

with α = 2√(πδ) (α = √(2πδ) for pure scaling functions) and a constant C independent of the translation parameter a.

We illustrate the theorem for the case of pure scaling functions in Fig. 3. Similar results for wavelets are presented in [14].

4.2.3 On the best approximation by exponential sums

Using integral representation methods, the Sinc quadrature can be applied, for example, to the integrals

1/ρ = ∫_0^∞ e^{−ρξ} dξ   and   1/ρ = (1/√π) ∫_{−∞}^{∞} e^{−ρ²t²} dt

to obtain an exponentially convergent sum of exponentials approximating the inverse function 1/ρ. Instead, one can directly determine the best approximation of a function with respect to a certain norm by exponential sums Σ_{ν=1}^{n} ω_ν e^{−t_ν x}

or Σ_{ν=1}^{n} ω_ν e^{−t_ν x²}, where ω_ν, t_ν ∈ R are to be chosen optimally. For some applications in quantum chemistry of approximation by exponential sums we refer, e.g., to [1, 60, 62]. We recall some facts from the theory of approximation by exponential sums (cf. [10] and the discussion in [53]). The existence result is based on the fundamental Big Bernstein Theorem: if f is completely monotone for x > 0, i.e.,

(−1)^n f^{(n)}(x) ≥ 0   for all n ≥ 0, x > 0,

then it is the restriction of the Laplace transform of a measure to the half-axis:

f(z) = ∫_{R₊} e^{−tz} dµ(t).
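The quadrature route mentioned above — discretising 1/x = ∫_0^∞ e^{−xξ} dξ — already produces an exponential sum with positive weights. A NumPy sketch with our own (not optimised) substitution and step size:

```python
import numpy as np

# Exponential-sum approximation of 1/x on [1, R] by Sinc quadrature of
# 1/x = int_0^inf exp(-x*xi) dxi after the substitution xi = log(1 + e^u).
# (Illustrative parameters; the best approximations of the text need far fewer terms.)
M = 80
h = np.sqrt(np.pi / M)
nu = np.arange(-M, M + 1)
t = np.log1p(np.exp(nu * h))                       # exponents t_nu > 0
w = h * np.exp(nu * h) / (1.0 + np.exp(nu * h))   # weights omega_nu > 0

x = np.linspace(1.0, 100.0, 1000)
approx = (w[:, None] * np.exp(-np.outer(t, x))).sum(axis=0)
assert np.max(np.abs(approx - 1.0 / x) * x) < 1e-4   # relative error on [1, 100]
```

The best approximations discussed next achieve a prescribed accuracy with a much smaller number n of terms, at the price of a nonlinear optimisation for {ω_ν, t_ν}.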

For n ≥ 1, consider the set E⁰_n of exponential sums and the extended set E_n:

E⁰_n := { u = Σ_{ν=1}^{n} ω_ν e^{−t_ν x} : ω_ν, t_ν ∈ R },

E_n := { u = Σ_{ν=1}^{ℓ} p_ν(x) e^{−t_ν x} : t_ν ∈ R, p_ν polynomials with Σ_{ν=1}^{ℓ} (1 + degree(p_ν)) ≤ n }.

Now one can address the problem of finding the best approximation to f over the set E_n, characterised by the best N-term approximation error

d_∞(f, E_n) := inf_{v∈E_n} ‖f − v‖_∞.

We recall the complete elliptic integral of the first kind with modulus κ,

K(κ) = ∫_0^1 dt/√((1 − t²)(1 − κ²t²))   (0 < κ < 1)

(cf. [12]), and define K′(κ) := K(κ′) by κ² + (κ′)² = 1.

Theorem 4 ([10]). Assume that f is completely monotone and analytic for ℜz > 0, and let 0 < a < b. Then for the uniform approximation on the interval [a, b],

lim_{n→∞} d_∞(f, E_n)^{1/n} ≤ 1/ω²,   where   ω = exp(πK(κ)/K′(κ))   with   κ = a/b.

The same result holds for E⁰_n, but the best approximation may belong to the closure of E⁰_n.


In the case discussed below we have κ = 1/R for possibly large R. Applying the asymptotics

K(κ′) = ln(4/κ) + C₁κ + ...             for κ′ → 1,
K(κ) = (π/2){1 + (1/4)κ² + C₁κ⁴ + ...}  for κ → 0,

of the complete elliptic integrals (cf. [44]), we obtain

1/ω² = exp(−2πK(κ)/K′(κ)) ≈ exp(−π²/ln(4R)) ≈ 1 − π²/ln(4R).

The latter expression indicates that the number n of different terms needed to achieve a tolerance ε is asymptotically

n ≈ |log ε|/|log ω⁻²| ≈ |log ε| ln(4R)/π².

This result shows the same asymptotical convergence in n as the corresponding bound in the Sinc-approximation theory. Optimisation with respect to the maximum norm leads to the nonlinear minimisation problem inf_{v∈E⁰_n} ‖f − v‖_{L∞[1,R]} involving the 2n parameters {ω_ν, t_ν}_{ν=1}^{n}. The numerical implementation is based on the Remez algorithm (cf. [12]). For the particular application with f(x) = x^{−1}, we have the same asymptotical dependence n = n(ε, R) as in the Sinc approximation above; however, the numerical results⁵ indicate a noticeable improvement compared with the quadrature method, at least for n ≤ 15. The best approximation to 1/ρ^µ in the interval [1, R] with respect to a weighted L²_W-norm can be reduced to the minimisation of an explicitly given differentiable functional

d_2(f, E_n) := inf_{v∈E_n} ‖f − v‖_{L²_W}.

Given R > 1, µ > 0, n ≥ 1, find the 2n real parameters t_1, ω_1, ..., t_n, ω_n ∈ R such that

F_µ(R; t_1, ω_1, ..., t_n, ω_n) := ∫_1^R W(x) |x^{−µ} − Σ_{i=1}^{n} ω_i e^{−t_i x}|² dx = min.   (19)

⁵ Numerical results for the best approximation of x^{−1} by sums of exponentials can be found in [10] and [11]; a full list of numerical data is presented at www.mis.mpg.de/scicomp/EXP_SUM/1_x/tabelle.


In the particular case µ = 1 and W(x) = 1, the integral (19) can be calculated in closed form⁶:

F_1(R; t_1, ω_1, ..., t_n, ω_n) = 1 − 1/R + 2 Σ_{i=1}^{n} ω_i [Ei(−t_i) − Ei(−t_i R)]
  + (1/2) Σ_{i=1}^{n} (ω_i²/t_i) (e^{−2t_i} − e^{−2t_i R})
  + 2 Σ_{1≤i<j≤n} (ω_i ω_j/(t_i + t_j)) [e^{−(t_i+t_j)} − e^{−(t_i+t_j)R}]

with the integral exponential function Ei(x) = ∫_{−∞}^{x} (e^t/t) dt (cf. [12]). In the special case R = ∞, the expression for F_1(∞; ...) simplifies even further. Gradient or Newton type methods with a proper choice of the initial guess can be used to obtain the minimiser of F_1 (cf. [56]).

5 Data sparse formats for univariate components

5.1 Hierarchical matrix techniques

The hierarchical matrix (H-matrix) technique [46, 50, 51, 55] (see also the mosaic-skeleton method [83]) allows an efficient treatment of dense matrices arising, e.g., from BEM, the evaluation of volume integrals and multi-particle interactions, certain matrix-valued functions, etc. In particular, it provides matrix formats which enable the computation and storage of inverse FEM stiffness matrices corresponding to elliptic problems as well as of BEM matrices. Hierarchical matrices are represented by means of a certain block partitioning. Fig. 4 shows typical admissible block structures. Each block is filled by a submatrix of rank not exceeding k. Then, for the mentioned class of matrices, it can be shown that the exact dense matrix A and the approximating hierarchical matrix A_H differ by ‖A − A_H‖ ≤ O(η^k) for a certain number η < 1. This exponential decrease allows one to obtain an error ε by the choice k = O(log(1/ε)). It is shown (cf. [50-52]) that the H-matrix arithmetic exhibits almost linear complexity in N:

– Data compression. The storage of N × N H-matrices as well as the matrix-by-vector multiplication and matrix-matrix addition have a cost O(kN log N), where the local rank k is the parameter determining the approximation error.
– Matrix-by-matrix and matrix-inverse complexity. The approximate matrix-matrix multiplication and the approximate inversion both take O(k²N log² N) operations.
– The Hadamard (entry-wise) matrix product. The exact Hadamard product of two rank-k H-matrices leads to an H-matrix of block-rank k² (see Section 5.2 below).

⁶ In the general case, the integral (19) may be approximated by certain quadratures.


5.2 Hierarchical Kronecker tensor-product approximations

Since n is much smaller than N, one can apply the hierarchical (or low-rank) matrix structure to represent the Kronecker factors V_k^{(ℓ)} in (13) with complexity O(n log^q n) or even O(n), which finally leads to O(rn) = O(rN^{1/d}) data to represent the compressed matrix A_r. We denote by HKT(r, s) the class of Kronecker rank-r matrices whose Kronecker factors V_k^{(ℓ)} are represented by block-rank-s H-matrices (shortly, HKT-matrices). It was shown in [58] that the advantages of replacing A with A_r (cf. (13)), where all the Kronecker factors possess the structure of general H-matrices, are the following:

– Data compression. The storage for the V_k^{(ℓ)} matrices of (13) is only O(rn) = O(rN^{1/d}), while that for the original (dense) matrix A is O(N²), where r = O(log^α N) for some α > 0. Consequently, we enjoy a linear-logarithmic complexity of O(n log^α n) in the univariate problem size n.
– Matrix-by-vector complexity. Instead of O(N²) operations to compute Ax, x ∈ C^N, we now need only O(rkn^d log n) = O(rkN log n) operations. If the vector can be represented in tensor-product form (say, x = x_1 ⊗ ... ⊗ x_d, x_i ∈ C^n), the corresponding cost is reduced to O(rkn log n) = O(rkN^{1/d} log n) operations.
– Matrix-by-matrix complexity. Instead of O(N³) operations to compute AB, we now need only O(r²n³) = O(r²N^{3/d}) operations for a rather general structure of the Kronecker factors. Remarkably, this is much better than the corresponding matrix-by-vector complexity for a general vector x.
– Hadamard product. The Hadamard (entry-wise) product of two HKT-matrices A ∗ B is represented in the same format: (U_1 ⊗ V_1) ∗ (U_2 ⊗ V_2) = (U_1 ∗ U_2) ⊗ (V_1 ∗ V_2). In turn, the exact Hadamard product U_1 ∗ U_2 (same for V_1 ∗ V_2) of two rank-k H-matrices results in an H-matrix of block-rank k², with the corresponding "skeleton" vectors defined by the Hadamard products of those in the initial factors (since there holds (a ⊗ b) ∗ (a_1 ⊗ b_1) = (a ∗ a_1) ⊗ (b ∗ b_1)). Therefore, basic linear algebra operations can be performed in the tensor-product representation using one-dimensional operations, thus avoiding an exponential scaling in the dimension d.

The exact product of two HKT-matrices can be represented in the same format, but with squared Kronecker rank and properly modified block-rank [58]. If A, B ∈ HKT(r, s), where s corresponds to the block-rank of the H-matrices involved, then in general AB ∉ HKT(r, s). However,

A = Σ_{k=1}^{r} U_k^A ⊗ V_k^A,   B = Σ_{l=1}^{r} U_l^B ⊗ V_l^B,   U_k^A, V_k^A, U_l^B, V_l^B ∈ C^{n×n},   (20)

leads to

AB = Σ_{k=1}^{r} Σ_{l=1}^{r} (U_k^A U_l^B) ⊗ (V_k^A V_l^B).

It can be proven that the matrices U_k^A U_l^B and V_k^A V_l^B possess the same hierarchical partitioning as the initial factors in (20), with blocks of possibly larger (than s) rank bounded, nevertheless, by s_{AB} = O(s log N). Thus, AB ∈ HKT(r², s_{AB}) with s_{AB} = O(s log N).
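The two algebraic identities used here — (A_1 ⊗ B_1)(A_2 ⊗ B_2) = (A_1A_2) ⊗ (B_1B_2), which gives the squared Kronecker rank, and the factor-wise Hadamard product — can be checked directly in NumPy (dense toy sizes, our own illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 4, 2
UA = rng.standard_normal((r, n, n)); VA = rng.standard_normal((r, n, n))
UB = rng.standard_normal((r, n, n)); VB = rng.standard_normal((r, n, n))

A = sum(np.kron(UA[k], VA[k]) for k in range(r))
B = sum(np.kron(UB[l], VB[l]) for l in range(r))

# product of two Kronecker rank-r matrices has Kronecker rank (at most) r^2
AB = sum(np.kron(UA[k] @ UB[l], VA[k] @ VB[l]) for k in range(r) for l in range(r))
assert np.allclose(A @ B, AB)

# Hadamard product acts factor-wise: (U1 (x) V1) * (U2 (x) V2) = (U1*U2) (x) (V1*V2)
assert np.allclose(np.kron(UA[0], VA[0]) * np.kron(UB[0], VB[0]),
                   np.kron(UA[0] * UB[0], VA[0] * VB[0]))
```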

5.3 Wavelet Kronecker tensor-product approximations

Wavelet matrix compression was introduced in [8]. This technique has been considered by one of the authors during the past decade in a series of publications (cf. [77]). The compression of the Kronecker factors V_i ∈ R^{n×n} is not so obvious, since it is not clear to what extent they satisfy a Calderon-Zygmund condition. It is more likely that they obey more or less a hyperbolic cross structure. An underlying truncation criterion based on the size of the coefficients will provide an automatic way to find the optimal structure independent of an a priori assumption. A basic thresholding or a posteriori criterion has been formulated by Harbrecht [61] and in [22]. With this criterion at hand, we expect linear scaling with respect to the size of the matrices.

– Data compression. The matrices V_k^{(ℓ)} in (13) can be compressed with total storage of about O(rn) = O(rN^{1/d}), where r = O(log^α N) is as above. The data vector requires at most O(n log^d n) nonzero coefficients.
– Matrix-by-vector complexity. Instead of O(N²) operations to compute Ax, x ∈ C^N, we now need only O(rn^d) = O(rN) operations. If the vector is represented in tensor-product form (say, x = x_1 ⊗ ... ⊗ x_d, x_i ∈ C^n) or in sparse grid representation, the corresponding cost is reduced to O(rn), resp. O(rn log^d n), operations.
– Matrix-by-matrix complexity. Using the compression of the Lemarie algebra [82], instead of O(N³) operations to compute AB we need only O(r²n log^q n) = O(r²N^{1/d} log^q N), or even O(r²n), operations.

Adaptive wavelet schemes for nonlinear operators have been developed in [3, 24] and for nonlocal operators in [23]. Corresponding schemes for hyperbolic cross approximations have not been worked out up to now. Perhaps the basic ideas can be transferred immediately to the tensor-product case.

6 Linear scaling methods for Hartree-Fock and Kohn-Sham equations

Operator-valued functions G(L) of elliptic operators L play a prominent role in quantum many-particle theory. A possible representation of the operator G(L) is given by the Dunford-Cauchy integral (cf. [38-41])

G(L) = (1/2πi) ∫_Γ G(z)(zI − L)^{−1} dz,

where Γ envelopes the spectrum spec(L) of the operator L in the complex plane. This kind of representation is especially suitable for tensor-product approximation using Sinc or Gauss-Lobatto quadratures for the contour integral, yielding an approximate operator of the form

G(L) ≈ Σ_k c_k G(z_k)(z_k I − L)^{−1}.   (21)

An important example of an operator-valued function is the sign function of the shifted Fock operator, which can be directly related to the spectral projector P_ρ associated with the density matrix ρ. This relation,

P_ρ = (1/2)[I − sign(F − µI)] = −(1/2πi) ∫_Γ (F − zI)^{−1} dz,

where Γ ∩ spec(F) = ∅ encloses the N/2 lowest eigenvalues of the Fock operator, was first noticed by Beylkin, Coult and Mohlenkamp [7]. In order to be applicable, the method requires a finite gap between the highest occupied eigenvalue ε_{N/2} and the lowest unoccupied eigenvalue ε_{N/2+1}, so that the parameter can be adjusted to ε_{N/2} < µ < ε_{N/2+1}. This constraint, in particular, excludes metallic systems. In general, the approximability of inverse matrices, required in (21), within the HKT format is still an open problem. First results on fast approximate algorithms to compute inverse matrices in the HKT format for the case d > 2 can be found in [41]. In Fig. 6, we consider the HKT representation of the discrete Laplacian inverse (−∆_h)^{−1} (homogeneous Dirichlet boundary conditions) in R^d, which can be obtained with O(dn log^q n) cost. Numerical examples for still higher dimensions d ≤ 1024 are presented in [45]. For comparison, the following numerical example manifests the optimal Kronecker rank of the discrete elliptic inverse in d = 2. Let −∆_h now correspond to a five-point stencil discretization of the Laplacian on a uniform mesh in the unit rectangle in R² (Dirichlet boundary conditions). It is easy to see that the Kronecker rank of −∆_h is 2. The Kronecker ranks of (−∆_h)^{−1} for different relative approximation accuracies (in the Frobenius norm) are given in Table 6. Our results indicate a logarithmic bound O(log ε^{−1}) for the approximate Kronecker rank r.

6.1 Matrix-valued functions approach for density matrices

Let F ∈ R^{M×M} be the Fock matrix that represents the Fock operator F (cf. (8)) in an orthogonal basis {ϕ_i}_{i=1}^{M}, M > N/2. There exist two different approaches


to compute the Galerkin discretization D ∈ R^{M×M} of the density matrix (6) via the matrix sign of the shifted Fock matrix:

D = (1/2)[I − sign(F − µI)],   with µ ∈ (ε_{N/2}, ε_{N/2+1}).
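A dense toy computation of D via the Newton-Schulz sign iteration (22) described below (surrogate random "Fock matrix" with a spectral gap; our own illustration, not the chapter's data):

```python
import numpy as np

rng = np.random.default_rng(2)
# surrogate "Fock matrix": random symmetric matrix with a gap around mu
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
eigs = np.array([-3.0, -2.0, -1.5, -1.0, 1.0, 1.5, 2.0, 3.0])
F = Q @ np.diag(eigs) @ Q.T
mu = 0.0   # lies in the gap between occupied and unoccupied levels

S = (F - mu * np.eye(8)) / np.linalg.norm(F - mu * np.eye(8), 2)
for _ in range(60):                       # iteration (22)
    S = S + 0.5 * (np.eye(8) - S @ S) @ S
D = 0.5 * (np.eye(8) - S)                 # projector onto eigenvalues below mu

P_exact = Q @ np.diag((eigs < mu).astype(float)) @ Q.T
assert np.allclose(D, P_exact, atol=1e-8)
assert np.allclose(D @ D, D, atol=1e-8)   # D is a projector
```

In the tensor formats discussed above, each step only needs matrix-matrix products plus a recompression, which is what makes iterations of this type attractive there.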

The first approach uses an exponentially convergent quadrature for the integral to obtain an expansion into resolvents (21), whereas the second approach is based on a Newton-Schulz iteration scheme. Concerning the tensor-product approximation of resolvents in the HKT format we refer to our discussion in Section 5.2. For the Newton-Schulz iteration scheme proposed in [7],

S^{(n+1)} = S^{(n)} + (1/2)[I − (S^{(n)})²] S^{(n)},   S^{(0)} = (F − µI)/‖F − µI‖₂,   (22)

the sequence S^{(n)} converges to sign(F − µI). First applications in quantum chemistry by Nemeth and Scuseria [71] demonstrate the practicability of this approach. Iteration schemes of the form (22) seem to be especially favourable for tensor-product formats. Starting from an initial approximation of the Fock matrix F with low separation rank, one has to perform matrix-matrix multiplications, which can be handled in an efficient manner in the tensor-product format,

cf. our discussion in Section 5.2. After each iteration step a recompression of the tensor-product decomposition of S^{(n+1)} becomes necessary. For the recompression one can apply the simple alternating least squares (ALS) method [5, 87, 90] or Newton-type and related algebraic iterative methods [33]. The ALS algorithm starts with an initial decomposition of S^{(n+1)} with separation rank r and obtains the best approximation with separation rank r̃ ≤ r by iteratively solving an optimisation problem for each coordinate separately. Assume that r is actually much larger than necessary, i.e., r̃

N, we obtain:

J₊ = Σ_{l=2}^{N} sin(lkπ/(N+1)) Σ_{i=1}^{N} λ_i c_i(t) c_{l−i}(t) sin((l−i)π/(N+1))
   + Σ_{l=N+2}^{2N} sin(lkπ/(N+1)) Σ_{i=1}^{N} λ_i c_i(t) c_{l−i}(t) sin((l−i)π/(N+1)),

J₋ = Σ_{l=1−N}^{N−1} sin(lkπ/(N+1)) Σ_{i=1}^{N} λ_i c_i(t) c_{i−l}(t) sin((i−l)π/(N+1))
   = Σ_{l=1}^{N−1} sin(lkπ/(N+1)) Σ_{i=1}^{N} λ_i c_i(t) [c_{i−l}(t) sin((i−l)π/(N+1)) − c_{i+l}(t) sin((i+l)π/(N+1))].

In the second equality for J₋, the terms with indices l and −l were combined. In the second term of J₊, l was replaced by 2N − l + 2, and the limits of the summation over i were adjusted accordingly. From the inequality

Separation of variables in nonlinear Fermi equation

1 ≤ j ≤ N and the equalities j = l − i and j = i − l, we have i ≤ l − 1, i ≥ l − N, and i ≥ l + 1, respectively. Then

J₊ = Σ_{l=2}^{N} sin(lkπ/(N+1)) (β_l(t) − γ_l(t)),

β_l(t) = Σ_{i=1}^{l−1} λ_i c_i(t) c_{l−i}(t) sin((l−i)π/(N+1)),

γ_l(t) = Σ_{i=N−l+2}^{N} λ_i c_i(t) c_{2N−l−i+2}(t) sin((2N−l−i+2)π/(N+1)),

J₋ = Σ_{l=1}^{N−1} sin(lkπ/(N+1)) (δ_l(t) − ε_l(t)),

δ_l(t) = Σ_{i=l+1}^{N} λ_i c_i(t) c_{i−l}(t) sin((i−l)π/(N+1)),

ε_l(t) = Σ_{i=1}^{N−l} λ_i c_i(t) c_{i+l}(t) sin((i+l)π/(N+1)).

Now let us use the orthogonality relations:

(Y^{(j)}, Y^{(l)}) = Σ_{k=1}^{N} sin(jkπ/(N+1)) sin(lkπ/(N+1)) = ((N+1)/2) δ_{jl},   j, l = 1(1)N.   (10)
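The orthogonality relations (10) can be verified numerically in a few lines (our own illustration, with the paper's value N = 31):

```python
import numpy as np

# the sine vectors Y^(j)_k = sin(j*k*pi/(N+1)) satisfy
# (Y^(j), Y^(l)) = (N+1)/2 * delta_jl, j, l = 1..N
N = 31
k = np.arange(1, N + 1)
Y = np.array([np.sin(j * k * np.pi / (N + 1)) for j in range(1, N + 1)])
G = Y @ Y.T                      # Gram matrix of the N sine vectors
assert np.allclose(G, (N + 1) / 2 * np.eye(N), atol=1e-10)
```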

The transformed relation (8) has the following vector form:

Σ_{l=1}^{N} (−(1/α) c̈_l(t) + λ_l c_l(t)) Y^{(l)} = Σ_{l=2}^{N} (β_l(t) − γ_l(t)) Y^{(l)} + Σ_{l=1}^{N−1} (δ_l(t) − ε_l(t)) Y^{(l)}.

Finally, by (10) we find (5). ⊓⊔

2. Corollary 1. The vector C = (c_1(t), ..., c_N(t))^T is bound by the following relation:

Λ^{1/2} C̈ = −B Λ^{1/2} C,   (11)

where B_T is the symmetric Toeplitz matrix with zero diagonal and first row (0, t_1, t_2, ..., t_{N−1}), i.e.

(B_T)_{ij} = t_{|i−j|},  t_0 := 0,   (12)

and B_H is the persymmetric Hankel matrix with first row (t_2, t_3, ..., t_N, 0) and last row (0, t_N, ..., t_3, t_2), i.e.

(B_H)_{ij} = t_{i+j} for i + j ≤ N,  0 for i + j = N + 1,  t_{2N+2−i−j} for i + j ≥ N + 2;   (13)

the values t_k are defined in (6).

Yu. I. Kuznetsov

Proof. Let us represent the equations (5) for the vector C in vector form. The first and the third terms on the right-hand side of equation (5) determine the symmetric Toeplitz matrix B_T. The second and the fourth terms of the same equation form the persymmetric Hankel matrix B_H. ⊓⊔

Let us now define the vector C = (c_1(t), ..., c_N(t))^T and the symmetric matrix

B = Λ^{1/2}(I + α(B_T − B_H))Λ^{1/2},

where I is the identity matrix and Λ = diag(λ_1, ..., λ_N). If

𝒞 = (Λ^{1/2}C, Λ^{1/2}Ċ)^T,   (14)

then

𝒞̇ = A𝒞,   (15)

where

A = [0 I; −B 0] ∈ R^{2N×2N}.

The vector 𝒞 determines the coordinates in the Lagrange space. The total energy of the linear oscillator (1) at α = 0 is the Hamilton function

H = (1/2)(Ż(t), Ż(t)) + (1/2)(ΛZ(t), Z(t)).   (16)

At α = 0 the energy (16) is conserved. The eigenvalue problem for the matrix A has the form

[0 I; −B 0] (U_j, V_j)^T = µ_j (U_j, V_j)^T,   B U_j = x_j U_j,   (17)

where x_j = −µ_j² are real numbers. The vectors U_j form an orthonormal basis in R^N, (U_j, U_l) = δ_{jl}, j, l = 1(1)N. By choosing α one can ensure x_j > 0, hence

µ_j = ±i p_j,   p_j = √x_j,   (18)


where i = √−1. To this pair of eigenvalues there corresponds a pair of eigenvectors of A:

U_{±j} = (U_j, ±i p_j U_j)^T,   V_{±j} = (1/2)(U_j, ±i (1/p_j) U_j)^T.   (19)

The vectors U_{±j}, V_{±j} form a biorthogonal system. Since (U_{∓j}, V_{∓k}) = δ_{jk}, j, k = 1(1)2N, we have

Z(t) = Σ_{j=1}^{n} [ ((φ_j + iϕ_j)/2) e^{−ip_j t} (U_j, −ip_j U_j)^T + ((φ_j − iϕ_j)/2) e^{ip_j t} (U_j, ip_j U_j)^T ],

where

φ_j = (DU_j, Z(0)),   ϕ_j = (1/p_j)(DU_j, Ż(0)).

The motion determined by the vector Z(t) is periodic and is a superposition of the harmonics of the linear oscillator.

3. The system of equations (15) is solved by the Radau Runge-Kutta method. In the numerical experiments the calculations begin at t = 0, when the system is at rest. At α = 0, the energy (16) is preserved in the initial harmonics. For N = 31, over 100000 steps (τ = 0.001), the relative error of the total energy H is about 4·10⁻⁴. The purpose of the numerical experiments was to show that some localization takes place for α ≠ 0. If C^{(j)}(0) = e_j, j = 1, ..., 31, where C = (c_1(t), ..., c_N(t))^T, α = 1, N = 31, then over 20000 iterations (τ = 0.001)

√(Σ_{i=1}^{N} (c_i^{(j)}(t))²) = √(Σ_{i=1}^{[N/j]} (c_{ji}^{(j)}(t))²) + ε_j(t),

where Σ_{i=1}^{[N/j]} (c_{ji}^{(j)}(t))² contains only the coefficients whose number is divisible by j. Especially expressive is the case j = 2^k, k = 1, ..., 4: ε_j(t) ≡ 0. We also get: j = 3, ε_3 = 10⁻⁶; j = 5, ε_5 = 3·10⁻³; j = 6, ε_6 = 5·10⁻³; j = 7, ε_7 = 4·10⁻².


Faster Multipoint Polynomial Evaluation via Structured Matrices

B. Murphy and R. E. Rosholt
Department of Mathematics and Computer Science, Lehman College, City University of New York, Bronx, NY 10468, USA
brian.murphy@lehman.cuny.edu
rhys.rosholt@lehman.cuny.edu

Abstract. We accelerate multipoint polynomial evaluation by reducing the problem to structured matrix computation and transforming the resulting matrix structure.

Keywords: Algorithm design and analysis, Multipoint polynomial evaluation, Vandermonde matrices, Hankel matrices.

Exploiting the links between computations with polynomials and structured matrices, and transformation of matrix structure, are two effective means for enhancing the efficiency of algorithms in both areas [P89/90], [P92], [BP94], [GKO95], [P01]. We demonstrate the power of these techniques by accelerating multipoint evaluation of univariate polynomials. Multipoint polynomial evaluation is a classical problem of algebraic computations. Given the coefficient vector p = (p_j)_{j=0}^{N−1} of a polynomial

p(x) = p_0 + p_1 x + ⋯ + p_{N−1} x^{N−1}

and n distinct points x_1, ..., x_n, one seeks the vector v = (v_i)_{i=1}^{n} of the values v_i = p(x_i), i = 1, ..., n. Hereafter "ops" stands for "arithmetic operations", m_M (resp. i_M) denotes the number of ops required for multiplication of a matrix M (resp. of the inverse M^{−1}) by a vector, and we assume that N > n. (N is large, e.g., for univariate polynomials obtained from multivariate polynomials via Kronecker's map.) One can compute the vector v in 2(N − 1)n ops by applying Horner's algorithm n times, whereas the Moenck-Borodin algorithm [MB72] uses O((N/n)m(n) log n) ops, provided that a pair of polynomials in x can be multiplied modulo x^k in m(k) ops, with m(k) = O(k log k) where the field of constants supports FFT and m(k) = O((k log k) log log k) over any field of constants [CK91]. We take advantage of shifting to the equivalent problem of multiplication of the n × N Vandermonde matrix

V_{n,N}(x) = (x_i^j)_{i=1,j=0}^{n,N−1}


by the vector p. This enables us to exploit the matrix structure to decrease the upper bound to O(((N/n) + log n)m(n)), thus yielding some acceleration of these classical computations. Our techniques may be of interest as a sample of structure transformation for the acceleration of computations with structured matrices. In our case we rely on the transformation of the matrix V_{n,N}(x) into the Hankel matrix

H(x) = V_{n,n}^T(x) V_{n,N}(x).

We use the following auxiliary results (see, e.g., [P01, Chapters 2 and 3]).

Fact 1. H(x) = V_{n,n}^T(x) V_{n,N}(x) is the n × N Hankel matrix

H(x) = ( Σ_{i=1}^n x_i^{k+j} )_{k=1,j=0}^{n,N-1}.

Fact 2. m_H = O((N/n)m(n)) for H = H_{n,N}(x).

Fact 3. m_V = O(m(n) log n) for an n × n Vandermonde matrix V, and i_V = O(m(n) log n) if this matrix is nonsingular.

We compute the vector v as follows.

Algorithm 2.
1. Compute the N + n entries of the Hankel matrix H_{n,N}(x) by using O((N/n)m(n) + m(n) log n) ops.
2. Compute the vector z = H_{n,N}(x)p by using O((N/n)m(n)) ops.
3. Apply O(m(n) log n) ops to compute and output the vector v = V_{n,n}^{-T}(x)z.

The matrices V_{n,n}(x) and their transposes V_{n,n}^T(x) are nonsingular because the n points x_1, . . . , x_n are distinct.
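The three stages above can be checked on small inputs with a naive O(nN) sketch (function and variable names are ours; we index V_{n,n}(x) with exponents 0, . . . , n-1, a harmless reindexing under which distinctness of the points alone guarantees nonsingularity; the asymptotic gains require FFT-based polynomial arithmetic, which this sketch omits):

```python
import numpy as np

def eval_via_hankel(p, xs):
    # Stage 1: power sums s_k = sum_i x_i^k, k = 0, ..., N+n-2, give the
    # entries H[k, j] = s_{k+j} of the n x N Hankel matrix H = V^T V_{n,N}.
    N, n = len(p), len(xs)
    s = [sum(x**k for x in xs) for k in range(N + n - 1)]
    H = np.array([[s[k + j] for j in range(N)] for k in range(n)], float)
    z = H @ np.array(p, float)              # Stage 2: z = H p = V^T v
    V = np.vander(xs, n, increasing=True)   # row i is (1, x_i, ..., x_i^{n-1})
    return np.linalg.solve(V.T, z)          # Stage 3: v = V^{-T} z
```

For p(x) = 1 + 2x + 3x^2 + 4x^3 + 5x^4 and the points 1, 2, 3 this returns p(1) = 15, p(2) = 129, p(3) = 547, matching Horner's rule.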


The cost bounds on Stages 2 and 3 follow from Facts 2 and 3, respectively. To perform Stage 1 we first apply O(m(n) log n) ops to compute the coefficients of the polynomial

q(x) = Π_{i=1}^n (x - x_i)

(cf., e.g., [P01, Section 3.1]) and then apply O((N/n)m(n)) ops to compute the power sums

Σ_{i=1}^n x_i^k,  k = 1, 2, . . . , N + n,

of its zeros (cf. [BP94, page 34]).

References

[BP94] D. Bini, V. Y. Pan, Polynomial and Matrix Computations, Volume 1: Fundamental Algorithms, Birkhäuser, Boston, 1994.
[CK91] D. G. Cantor, E. Kaltofen, On Fast Multiplication of Polynomials over Arbitrary Rings, Acta Informatica, 28(7), 697-701, 1991.
[GKO95] I. Gohberg, T. Kailath, V. Olshevsky, Fast Gaussian Elimination with Partial Pivoting for Matrices with Displacement Structure, Math. of Computation, 64, 1557-1576, 1995.
[MB72] R. Moenck, A. Borodin, Fast Modular Transform via Division, Proc. of 13th Annual Symposium on Switching and Automata Theory, 90-96, IEEE Computer Society Press, Washington, DC, 1972.
[P89/90] V. Y. Pan, On Computations with Dense Structured Matrices, Math. of Computation, 55(191), 179-190, 1990. Proceedings version in Proc. ISSAC'89, 34-42, ACM Press, New York, 1989.
[P92] V. Y. Pan, Complexity of Computations with Matrices and Polynomials, SIAM Review, 34(2), 225-262, 1992.
[P01] V. Y. Pan, Structured Matrices and Polynomials: Unified Superfast Algorithms, Birkhäuser/Springer, Boston/New York, 2001.

Testing Pivoting Policies in Gaussian Elimination⋆ Brian Murphy1,⋆⋆ , Guoliang Qian2,⋆⋆⋆ , Rhys Eri Rosholt1,† , Ai-Long Zheng3, Severin Ngnosse2,‡ , and Islam Taj-Eddin2,§ 1

Department of Mathemati s and Computer S ien e, Lehman College, City University of New York, Bronx, NY 10468, USA ⋆⋆

2

brian.murphy@lehman.cuny.edu † rosholt@lehman.cuny.edu

Ph.D. Program in Computer S ien e, The City University of New York, New York, NY 10036 USA ⋆⋆⋆

guoliangqian@yahoo.com ‡ sngnosse@msn.com § itaj-eddin@gc.cuny.edu

3

Ph.D. Program in Mathemati s, The City University of New York, New York, NY 10036 USA, azheng 1999@yahoo.com

Abstract. We begin with specifying a class of matrices for which Gaussian elimination with partial pivoting fails and then observe that both rook and complete pivoting easily handle these matrices. We display the results of testing partial, rook and complete pivoting for this and other classes of matrices. Our tests confirm that rook pivoting is an inexpensive but solid backup wherever partial pivoting fails.

Keywords: Gaussian elimination, pivoting.

1 Introduction

Hereafter we write GEPP, GECP, and GERP to denote Gaussian elimination with partial, complete, and rook pivoting. GEPP and GECP are Wilkinson's classical algorithms [1], [2], [3], whereas GERP is a more recent and much less known invention [4], [5], [6]. Each of the three algorithms uses (2/3)n^3 + O(n^2) ops to yield triangular factorization of an n × n matrix, but they differ in the number of comparisons involved, and GEPP is slightly weaker numerically. Namely, both GERP and GECP guarantee numerical stability [7], [5], whereas GEPP is statistically stable for most input instances in computational practice but fails for some rare but important classes of inputs [8], [9], [10]. Nevertheless GEPP is omnipresent in modern numerical matrix computations, whereas GECP is rarely used. The reason is simple: GEPP involves (1/2)n^2 + O(n) comparisons versus (1/3)n^3 + O(n^2) in GECP, that is, the computational cost of pivoting is negligible versus the arithmetic cost for GEPP but is substantial for GECP.

GERP combines the advantages of both GECP and GEPP. According to the theory and extensive tests, GERP is numerically almost as stable as GECP and is likely to use about 2n^2 comparisons for random input matrices (see [4], [5], [6], and our Remark 1), although it uses the order of n^3 comparisons in the worst case [3, page 160]. Each of GEPP, GECP, and GERP can be combined with initial scaling for additional heuristic protection against instability, which requires from about n^2 to about 2n^2 comparisons and as many ops [1, Section 3.5.2], [2, Section 3.4.4], [3, Section 9.7], so that the overall computational cost is still strongly dominated by the elimination ops.

The customary examples of well-conditioned matrices for which GEPP fails numerically are rather complicated, but in the next section we give a simple example, which should provide clearer insight into this problem. Namely, we specify a class of input matrices for which already the rounding errors at the first elimination step of GEPP completely corrupt the output. The results of our numerical tests in Section 3 show that both GECP and GERP have no problems with this class. We also include the test results for six other input classes. For each class we present the number of comparisons, the growth factor, and the norms of the error and residual vectors, which gives a more complete picture versus [4], [5], and [6] (cf. our concluding Remark 2). Our tests confirm that GERP is an inexpensive but solid backup wherever GEPP fails.

⋆ Supported by PSC CUNY Award 69350-0038.
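The rook-pivoting search can be sketched as follows (our own reading of [4]-[6], not the code used in the tests below): starting from the largest entry of the pivot column, alternately rescan the candidate's row and column of the trailing block until the entry is maximal in both; for random matrices this settles after O(1) sweeps, which is the source of the roughly 2n^2 total comparisons.

```python
import numpy as np

def rook_pivot(A, k):
    # Find (i, j) in the trailing block A[k:, k:] such that |A[i, j]| is
    # maximal in both its row and its column.  |A[i, j]| strictly grows
    # with every move, so the search terminates.
    i = k + int(np.argmax(np.abs(A[k:, k])))
    j = k
    while True:
        jn = k + int(np.argmax(np.abs(A[i, k:])))
        if jn == j:
            return i, j
        j = jn
        im = k + int(np.argmax(np.abs(A[k:, j])))
        if im == i:
            return i, j
        i = im
```

For A = [[1, 9, 2], [3, 4, 8], [5, 6, 7]] and k = 0 the search returns (1, 2), i.e., the entry 8, which dominates both its row and its column (a rook pivot need not be the globally largest entry).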

2 A Hard Input Class for GEPP

Already the first step of Gaussian elimination tends to magnify the input errors wherever the pivot entry is absolutely smaller than some other entries in the same row and column. For example, represent an input matrix M as follows,

M = ( 1   v^T )  = (m_ij)_{i,j=0}^{n-1},   B = (m_ij)_{i,j=1}^{n-1},    (1)
    ( u   B   )

let ε denote the machine epsilon (also called the unit roundoff), and suppose that

u = se,  v = te,  e = (1, 1, . . . , 1)^T,  |m_ij| ≤ 1 for i, j > 0,    (2)

s > 2/ε,  t = 1.

Then the first elimination step, performed error-free, produces the (n-1) × (n-1) matrix B_s = B + see^T, which turns into the rank-one matrix fl(s)ee^T as the result of rounding. Here and hereafter fl(a) denotes the floating-point representation of a real number a.


Partial pivoting fixes the latter problem for this matrix but does not help against exactly the same problem where the input matrix M satisfies equations (1) and (2) and where

s = 1,  t > 2/ε.    (3)

In this case the first elimination step, performed error-free, would produce the (n-1) × (n-1) matrix B_t = B + tee^T. Rounding would turn it into the rank-one matrix fl(t)ee^T.

We refer the reader to [8] and [9] (cf. also [10]) on some narrow but important classes of linear systems of equations coming from computational practice on which GEPP fails to produce correct output.
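The absorption effect behind this class is easy to reproduce (a sketch with our own numbers; note that NumPy's eps is twice the unit roundoff ε, so t = 4/eps corresponds to t = 2/ε):

```python
import numpy as np

# For t = 2/epsilon the spacing of doubles near t exceeds 2, so every entry
# b of B with |b| <= 1 satisfies fl(b + t) == t: the error-free update
# B_t = B + t*e*e^T collapses to the rank-one matrix fl(t)*e*e^T.
t = 4.0 / np.finfo(float).eps       # = 2**54, i.e., 2/epsilon (unit roundoff)
rng = np.random.default_rng(1)
B = rng.uniform(-1, 1, (5, 5))      # |entries| <= 1, as in (2)
print(np.all(B + t == t))           # True: B is wiped out in one step
```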

3 Experimental Results

Tables 1-4 show the results of tests by Dr. Xinmao Wang at the Department of Mathematics, University of Science and Technology of China, Hefei, Anhui 230026, China. He implemented GEPP, GECP, and GERP in C++ under 64-bit Fedora Core 7 Linux on an AMD Athlon64 3200+ uniprocessor with 1 GB memory. In his implementation he used n comparisons for computing the maximum of n numbers. He tested the algorithms for n × n matrices M of the following seven classes.

1. Matrices with random integer entries uniformly and independently of each other distributed in the range (-10^l, 10^l).
2. Matrices M = PLU for n × n permutation matrices P that define n interchanges of random pairs of rows and for lower unit triangular matrices L and U^T with random integer entries in the range (-10^b, 10^b).
3. Matrices M = SΣT for random orthogonal matrices S and T (computed as the Q-factors in the QR factorizations of matrices with random integer entries uniformly and independently of each other distributed in the range (-10^c, 10^c)) and for the diagonal matrix Σ = diag(σ_i)_{i=1}^n where σ_1 = σ_2 = · · · = σ_{n-ρ} = 1 and σ_{n-ρ+1} = · · · = σ_n = 10^{-q} (cf. [3, Section 28.3]).
4. Matrices M satisfying equations (1)-(3), where B denotes an (n-1) × (n-1) matrix from matrix class 1 above.

5. Matrices

   M = (  I                        )
       ( -M_1   I                  )
       (       -M_1   I            )
       (             .    .        )
       (              .    .       )
       (               -M_1    I   )

from [8, page 232], block bidiagonal with identity diagonal blocks and subdiagonal blocks -M_1, where

   M_1 = exp( -0.05   0.3  )  ≈  ( 0.994357  0.289669 )
             (  0.3   -0.05 )     ( 0.289669  0.994357 ).

6. Matrices

   M = (   1       0       0     ···     0     -1/C         )
       ( -kh/2   1-kh/2    0     ···     0     -1/C         )
       ( -kh/2    -kh    1-kh/2    ⋱     ⋮       ⋮          )
       (   ⋮       ⋮       ⋱       ⋱     0     -1/C         )
       ( -kh/2    -kh     ···    -kh   1-kh/2  -1/C         )
       ( -kh/2    -kh     ···    -kh    -kh   1-1/C-kh/2    )

from [9, page 1360], where kh = 2/3 and C = 6.

7. Matrices

   M = (  1    0   ···   0   1 )
       ( -1    1    ⋱    ⋮   1 )
       ( -1   -1    ⋱    0   1 )
       (  ⋮    ⋮         1   1 )
       ( -1   -1   ···  -1   1 )

from [10, page 156].

Table 1. Numbers of comparisons in GERP (for the single matrices of classes 5-7 a single count is reported).

n = 128              minimal  maximal  average
Class 1               31371    37287    34147
Class 2               35150    40904    38168
Class 3, ρ = 1        30189    36097    32995
Class 3, ρ = 2        30597    36561    32960
Class 3, ρ = 3        29938    35761    32967
Class 4               31342    36333    33648
Class 5               24318
Class 6               32258
Class 7               32764

n = 256              minimal  maximal  average
Class 1              131692   146780   139419
Class 2              147123   161971   153559
Class 3, ρ = 1       127911   143706   136361
Class 3, ρ = 2       129228   144226   136427
Class 3, ρ = 3       129945   145882   136508
Class 4              131533   146014   138392
Class 5               97790
Class 6              130050
Class 7              131068

For each of classes 1-4 the tests were performed for m = 1000 input instances M for each of the two values n = 128 and n = 256, for b = c = l = 4, and for q = 10. For class 3 the tests were performed for each of the three values


Table 2. Growth factor in GEPP/GECP/GERP.

n = 128              GEPP         GECP        GERP
Class 1            13.8 ± 2.5    6.4 ± 0.4   8.4 ± 0.8
Class 2             2.5 ± 0.5    1.5 ± 0.2   1.8 ± 0.2
Cl. 3, ρ = 1       17.4 ± 4.0    8.7 ± 1.0  11.6 ± 1.8
Cl. 3, ρ = 2       15.6 ± 3.6    7.7 ± 0.8  10.2 ± 1.4
Cl. 3, ρ = 3       14.3 ± 3.5    7.0 ± 0.7   9.3 ± 1.3
Class 4              FAIL           1           1
Class 5             3.4e6           2           2
Class 6            6.6e36         1.33        1.33
Class 7            1.7e38           2           2

n = 256              GEPP         GECP        GERP
Class 1            21.8 ± 3.8    9.5 ± 0.6  12.8 ± 1.3
Class 2             3.4 ± 0.6    1.9 ± 0.2   2.4 ± 0.3
Cl. 3, ρ = 1       32.2 ± 7.4   15.5 ± 1.7  20.6 ± 2.9
Cl. 3, ρ = 2       29.2 ± 6.7   13.8 ± 1.4  18.6 ± 2.9
Cl. 3, ρ = 3       27.0 ± 6.1   12.5 ± 1.3  16.7 ± 2.3
Class 4              FAIL           1           1
Class 5            3.1e13           2           2
Class 6            8.6e74         1.33        1.33
Class 7            5.8e76           2           2

ρ = 1, 2, 3. Besides the results of these tests, Tables 1-4 also cover the test results for matrices M of classes 5-7 (from the papers [8], [9], and [10], respectively), for which GEPP produced corrupted outputs. To every matrix GEPP, GECP, and GERP were applied. As was expected, for matrix classes 1-3 the numerical performance of GEPP, GECP, and GERP was similar, but for classes 4-7 GEPP either failed or lost many more correct input bits versus GECP and GERP.

Table 1 shows the minimum, maximum and average numbers of comparisons used in GERP for every input class of matrices. Table 2 shows the average growth factor

φ = max_{i,j,k} |m_ij^{(k)}| / max_{i,j} |m_ij|

(as well as its standard deviation from the average), where M^{(k)} = (m_ij^{(k)})_{i,j=k}^{n-1} denotes the matrix computed in k steps of Gaussian elimination with the selected pivoting policy and M = M^{(0)} = (m_ij)_{i,j=0}^{n-1} denotes the input matrix.

Tables 3 and 4 show the average norms of the error and residual vectors, respectively, as well as the standard deviations from the average, where the linear systems My = f were solved by applying GECP, GEPP, and GERP. The vectors f were defined according to the following rule: first generate vectors y with random components from the sets {-1, 0, 1} or {-1, 1}, then save these vectors for computing the error vectors, and finally compute the vectors f = My.
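The growth factor is easy to reproduce for class 7 with a self-contained sketch (our own GEPP implementation, not the C++ code used in the tests): on the class-7 matrix partial pivoting performs no row interchanges and the last column doubles at every step, giving φ = 2^{n-1}, in line with the huge Class 7 entries reported for GEPP in Table 2.

```python
import numpy as np

def growth_factor_gepp(M):
    # Run GEPP and track phi = max_{i,j,k} |m_ij^(k)| / max_{i,j} |m_ij|.
    A = M.astype(float).copy()
    n = A.shape[0]
    g = np.abs(A).max()
    for k in range(n - 1):
        p = k + int(np.argmax(np.abs(A[k:, k])))   # partial pivoting
        A[[k, p]] = A[[p, k]]
        A[k + 1:, k:] -= np.outer(A[k + 1:, k] / A[k, k], A[k, k:])
        g = max(g, np.abs(A[k:, k:]).max())
    return g / np.abs(M).max()

n = 16
M = np.tril(-np.ones((n, n)), -1) + np.eye(n)  # class 7: -1 below diagonal,
M[:, -1] = 1.0                                 # ones on diagonal, last column
print(growth_factor_gepp(M))                   # 2**(n-1) = 32768.0, exactly
```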

Remark 1. Table 1 shows the results of testing GERP where n comparisons were used for computing the maximum of n numbers. Extensive additional tests with random matrices (of class 1) for n = 2^h and for h ranging from 5 to 10 were performed at the Graduate Center of the City University of New York. In these tests a modification of GERP was run in which no tested row or column is examined again until the next elimination step. Furthermore, these tests used k - 1 comparisons for computing the maximum of k numbers. The observed numbers of comparisons slightly decreased versus Table 1 and always stayed below 2n^2.

Table 3. Norms of the error vectors in GEPP/GECP/GERP.

n = 128                 GEPP                 GECP                 GERP
Class 1          6.8e-13 ± 3.4e-12   5.2e-13 ± 2.8e-12   4.8e-13 ± 2.2e-12
Class 2            1.7e7 ± 2.6e8       8.7e5 ± 4.6e6       6.6e5 ± 3.7e6
Class 3, ρ = 1    1.1e-5 ± 8.4e-6     7.4e-6 ± 5.7e-6     8.7e-6 ± 6.7e-6
Class 3, ρ = 2    1.7e-5 ± 8.8e-6     1.2e-5 ± 6.1e-6     1.3e-5 ± 7.0e-6
Class 3, ρ = 3    2.1e-5 ± 9.2e-6     1.5e-5 ± 6.2e-6     1.7e-5 ± 7.5e-6
Class 4               FAIL           5.7e-13 ± 6.3e-12   5.7e-13 ± 3.5e-12
Class 5              1.0e-9              2.7e-15             2.7e-15
Class 6               3.1e3              2.7e-15             2.7e-15
Class 7                6.5                 0.0                 0.0

n = 256                 GEPP                 GECP                 GERP
Class 1          3.8e-12 ± 3.7e-11   2.8e-12 ± 4.0e-11   2.6e-12 ± 2.0e-11
Class 2            3.9e7 ± 5.0e8       1.1e6 ± 4.1e6       2.2e6 ± 1.3e7
Class 3, ρ = 1    2.0e-5 ± 1.5e-5     1.3e-5 ± 9.3e-6     1.5e-5 ± 1.1e-5
Class 3, ρ = 2    3.1e-5 ± 1.6e-5     2.0e-5 ± 1.1e-5     2.4e-5 ± 1.2e-5
Class 3, ρ = 3    3.9e-5 ± 1.7e-5     2.5e-5 ± 1.1e-5     2.9e-5 ± 1.2e-5
Class 4               FAIL           3.6e-12 ± 4.0e-11   3.6e-12 ± 2.5e-11
Class 5              1.4e-2              3.7e-15             3.7e-15
Class 6              7.2e57              3.6e-14             3.6e-14
Class 7               11.3                 0.0                 0.0

Remark 2. Similar test results for class 1 were presented earlier in [5] and [6], and for classes 3 and 5-7 in [5], but [5] shows no norms of the error and residual vectors. It seems that GEPP, GECP, and GERP have not been tested earlier for classes 2 and 4.

Acknowledgement. We are happy to acknowledge the valuable experimental support of our work by Dr. Xinmao Wang.

References

1. G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd edition, The Johns Hopkins University Press, Baltimore, Maryland, 1996.
2. G. W. Stewart, Matrix Algorithms, Vol. I: Basic Decompositions, SIAM, Philadelphia, 1998.
3. N. J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd edition, SIAM, Philadelphia, 2002.
4. L. Neal and G. Poole, A Geometric Analysis of Gaussian Elimination, II, Linear Algebra and Its Applications, 173, 239-264, 1992.


Table 4. Norms of the residual vectors in GEPP/GECP/GERP.

n = 128                 GEPP                 GECP                 GERP
Class 1           1.6e-9 ± 3.0e-10    1.1e-9 ± 1.7e-10    1.2e-9 ± 2.1e-10
Class 2           2.2e-4 ± 1.6e-3     1.2e-4 ± 4.7e-4     1.1e-4 ± 6.3e-4
Class 3, ρ = 1   3.1e-14 ± 5.1e-15   2.0e-14 ± 2.9e-15   2.3e-14 ± 3.6e-15
Class 3, ρ = 2   3.0e-14 ± 5.0e-15   1.9e-14 ± 2.8e-15   2.3e-14 ± 3.6e-15
Class 3, ρ = 3   3.0e-14 ± 5.3e-15   1.9e-14 ± 2.8e-15   2.3e-14 ± 3.5e-15
Class 4               FAIL             3.3e2 ± 3.3e2       3.5e2 ± 3.3e2
Class 5              1.1e-9              1.9e-15             1.9e-15
Class 6               2.9e3              1.7e-14             1.7e-14
Class 7               14.5                 0.0                 0.0

n = 256                 GEPP                 GECP                 GERP
Class 1           7.1e-9 ± 1.1e-9     4.4e-9 ± 5.8e-10    5.2e-9 ± 7.2e-10
Class 2           2.1e-3 ± 3.7e-2     6.2e-4 ± 2.1e-3     1.5e-3 ± 1.6e-2
Class 3, ρ = 1   9.8e-14 ± 1.5e-14   5.7e-14 ± 6.8e-15   7.4e-14 ± 9.3e-15
Class 3, ρ = 2   9.7e-14 ± 1.4e-14   5.7e-14 ± 7.0e-15   7.1e-14 ± 9.2e-15
Class 3, ρ = 3    3.9e-5 ± 1.7e-5    5.7e-14 ± 6.9e-15   7.0e-14 ± 9.1e-15
Class 4               FAIL             6.7e2 ± 6.5e2       6.6e2 ± 6.3e2
Class 5              9.0e-3              2.6e-15             2.6e-15
Class 6              2.1e58              1.0e-13             1.0e-13
Class 7               41.1                 0.0                 0.0

5. L. V. Foster, The Growth Factor and Efficiency of Gaussian Elimination with Rook Pivoting, J. of Comp. and Applied Math., 86, 177-194, 1997. Corrigendum in J. of Comp. and Applied Math., 98, 177, 1998.
6. G. Poole and L. Neal, The Rook's Pivoting Strategy, J. of Comp. and Applied Math., 123, 353-369, 2000.
7. J. H. Wilkinson, Error Analysis of Direct Methods of Matrix Inversion, J. of the ACM, 8, 281-330, 1961.
8. S. J. Wright, A Collection of Problems for Which Gaussian Elimination with Partial Pivoting Is Unstable, SIAM J. on Sci. Stat. Computing, 14, 1, 231-238, 1993.
9. L. V. Foster, Gaussian Elimination with Partial Pivoting Can Fail in Practice, SIAM J. on Matrix Analysis and Applications, 15, 4, 1354-1362, 1994.
10. N. J. Higham and D. J. Higham, Large Growth Factors in Gaussian Elimination with Pivoting, SIAM J. on Matrix Analysis and Applications, 10, 2, 155-164, 1989.

Newton's Iteration for Matrix Inversion, Advances and Extensions⋆

Victor Y. Pan
Department of Mathematics and Computer Science, Lehman College of CUNY, Bronx, NY 10468, USA
victor.pan@lehman.cuny.edu
http://comet.lehman.cuny.edu/vpan/

Abstract. We first cover Newton's iteration for generalized matrix inversion, its ameliorations, recursive compression of its iterates in the case of structured inputs, some techniques of continuation via factorization, and an extension to splitting the Singular Value Decomposition. We combine the latter extension with our recent fast algorithms for null space bases (prompted by our progress in randomized preconditioning). We apply these combinations to compute the respective spaces of singular vectors and to arrive at divide-and-conquer algorithms for matrix inversion and computing determinants. Our techniques promise to be effective for computing other matrix functions in the case of ill conditioned inputs.

Keywords: Matrix inversion, Newton's iteration, Matrix structure, Continuation (homotopy), Divide-and-conquer algorithms, Null spaces.

1 Introduction

Newton's iteration for generalized matrix inversion amounts mostly to performing a sequence of matrix multiplications. This level-three BLAS performance is particularly effective on systolic arrays and parallel computers. Newton's iteration for the generalized inverse is important in its own right but also as a sample technique for computing various other matrix functions such as the square root, the matrix sign function, and the solution of the Riccati equation.

We survey and advance this approach, show its acceleration in the case of structured input matrices, its combination with our new techniques of homotopic continuation, factorization, and preconditioning, as well as its extension to divide-and-conquer algorithms for splitting the Singular Value Decomposition, that is, for computing the respective subspaces generated by singular vectors (hereafter we refer to such subspaces as singular subspaces and invoke the usual abbreviation SVD). The latter extensions employ our recent techniques for the computation of bases for null spaces, which should enhance the power of the approach.

⋆ Supported by PSC CUNY Award 69330-0038.


We recall some basic definitions in the next section and then, in Section 3, the convergence analysis from [1] and [2] and some recipes for the initialization. In Section 4 we describe three techniques that exploit input structure to save running time and computer memory by compressing the computed approximations. All three techniques usually require reasonably good initialization (in spite of an interesting phenomenon of autocorrection in compression), and in Section 5 we cover a general recipe for initialization by means of homotopy (continuation), effective for both general and structured inputs. We improve conditioning of the continuation by representing it as a recursive factorization. These preconditioning techniques can be of interest in their own right, independently of the considered iterative processes. In Section 6 we describe a modified iteration directed to splitting the SVD and its generalizations. This technique produces bases for the respective singular subspaces and can be extended to divide-and-conquer algorithms for the inverses, determinants, square roots, and other matrix functions. The technique is proposed for general Hermitian input matrices. (It does not preserve matrix structure except for symmetry.) We cover this direction in Section 7, where we also employ our recent effective algorithms for computing null space bases of general non-Hermitian matrices. We briefly recall these algorithms in Section 8 and point out their natural extension to randomized preconditioning of ill conditioned inputs. In Section 9 we discuss some directions for further study.

2 Basic Definitions

We rely on the customary definitions for matrix computations in [3]-[8]. M^H denotes the Hermitian transpose of a matrix M. I_k is the k × k identity matrix. I is the identity matrix of unspecified size. (A, B) is the 1 × 2 block matrix with blocks A and B. diag(a_i)_i (resp. diag(B_i)_i) is the diagonal (resp. block diagonal) matrix with diagonal entries a_i (resp. diagonal blocks B_i). U is a unitary matrix if U^H U = I. N(M) denotes the (right) null space of a matrix M. range(M) is the range of a matrix M, that is, its column span. A matrix M is a matrix basis for a space S if its columns form a basis for this space, that is, if the matrix has full column rank and range(M) = S. A matrix basis for the null space N(M) is a null matrix basis for the matrix M. ρ = rank(M) is its rank. σ_i(M) is its ith largest singular value, i = 1, 2, . . . , ρ. cond_2 M = σ_1(M)/σ_ρ(M) ≥ 1 is the condition number of a matrix M of rank ρ. A matrix is well conditioned if its condition number is not large (relative to the computational task and computer environment) and is ill conditioned otherwise. C^+ and C^- denote the Moore-Penrose generalized inverse of a matrix C, so that C^+ = C^- = C^{-1} for a nonsingular matrix C.

3 Newton's iteration for matrix inversion. Its initialization and acceleration

Newton's iteration

x_{k+1} = x_k - f(x_k)/f'(x_k),  k = 0, 1, . . . ,    (1)

rapidly improves a crude initial approximation x = x_0 to the solution x = r of an equation f(x) = 0 provided f(x) is a smooth nearly linear function on an open line interval that covers the two points r and x_0. Equation (1) can be obtained by truncating all terms of order at least two in Taylor's expansion of the function f(x) at x = r.

Hotelling [9] and Schultz [10] extended Newton's iteration (1) to the case where x = X, x_k = X_k, and f(x_k) = f(X_k) are matrices and f(X) = M - X^{-1} for two matrices M and X. In this case Newton's iteration rapidly improves a crude initial approximation X_0 to the inverse of a nonsingular n × n matrix M,

X_{k+1} = X_k(2I - MX_k),  k = 0, 1, . . . .    (2)

Indeed, define the error and residual matrices

E_k = M^{-1} - X_k,  e_k = ||E_k||,  R_k = ME_k = I - MX_k,  ρ_k = ||R_k||

for all k, assume a matrix norm || · || satisfying the submultiplicative property ||AB|| ≤ ||A|| ||B||, and deduce from equation (2) that

R_k = R_{k-1}^2 = R_0^{2^k},  ρ_k ≤ ρ_0^{2^k},    (3)

ME_k = (ME_{k-1})^2 = (ME_0)^{2^k},  e_k ≤ e_0^{2^k} ||M||^{2^k - 1}.    (4)

The latter equations show quadratic convergence of the approximations X_k to the inverse matrix M^{-1} provided ρ_0 < 1. Each step (2) amounts essentially to performing matrix multiplication twice. Iteration (2) is numerically stable for nonsingular matrices M, and numerical stability has been proved in [2] for its extensions (16) and (17) in Section 6 even where the matrix M is singular.

Ben-Israel in [11] and Ben-Israel and Cohen in [12] proved that the iteration converges where X_0 = aM^H for a sufficiently small positive scalar a. Söderström and Stewart [1] analyzed Newton's iteration based on the SVDs of the involved matrices. This study was continued by Schreiber in [13] and then in [2]. We outline this work by using Generalized SVDs (hereafter referred to as GSVDs), that is, nonunique representations of matrices as UΣV^H where U and V are unitary matrices and Σ is a diagonal matrix. They turn into SVDs wherever Σ denotes a diagonal matrix filled with nonnegative entries in nonincreasing order. Assume that the matrices M and X_0 have the GSVDs

M = UΣV^H,  X_0 = VΦ_0U^H    (5)


for some unitary matrices U and V and diagonal matrices Σ = diag(σ_i)_i and Φ_0 = diag(φ_{i,0})_i. In particular this is the case where

X_0 = f(M^H)    (6)

is a matrix function in M^H, e.g.,

X_0 = aM^H + bI    (7)

for two scalars a and b. Then we have

X_k M = V S_k V^H,  S_k = diag(s_i^{(k)})_i,  1 - s_i^{(k+1)} = (1 - s_i^{(k)})^2    (8)

for all i and k. Furthermore, we have

s_i^{(0)} = σ_i φ_{i,0} = σ_i f(σ_i)    (9)

for all i under (6), so that iteration (2) converges to the generalized inverse M^- if 0 < s_i^{(0)} = σ_i φ_{i,0} < 2 for all i. Convergence is locally quadratic but can be slow initially if the values s_i^{(0)} are close to zero or two for some subscripts i.

More precisely, assume the choice (7) for b = 0 and a = 1/(||M||_1 ||M||_∞) [11]. Then it can be proved that ρ_0 ≤ 1 - 1/((cond_2 M)^2 n) (cf. [14]). By choosing a = y/(||M||_1 ||M||_∞) for any value of y such that

1 ≤ y ≤ 2(cond_2 M)^2 n/(1 + (cond_2 M)^2 n),

we obtain the slightly improved bound ρ_0 ≤ 1 - y/((cond_2 M)^2 n). In particular for y = 2n/(1 + n) we obtain that ρ_0 ≤ 1 - 2n/((cond_2 M)^2 (1 + n)). Under these choices we need about ν = 2 log_2 cond_2 M steps (2) to decrease the residual norm ρ_k below 1/e = 0.36788 . . . . Then in the order of l = log_2 ln h additional steps (2) we would yield the bound ρ_{ν+l} ≤ e^{-2^l} = 1/h, e = 2.71828 . . . . The bound on the number ν of initial steps is critical for ill conditioned matrices. It was decreased roughly twofold in [2] by means of replacing iteration (2) by its scaled version

X_{k+1} = a_k X_k(2I - MX_k),  k = 0, 1, . . . , l    (10)

for appropriate scalars a_k.

Clearly, the inversion of a nonsingular matrix M can be reduced to the inversion of either of the Hermitian positive definite matrices M^H M or MM^H, because M^{-1} = (M^H M)^{-1}M^H = M^H(MM^H)^{-1}, or of the Hermitian matrix

( 0    M )   having the inverse   ( 0       M^{-H} )
( M^H  0 ),                       ( M^{-1}  0      ).

Now suppose M is a Hermitian matrix. Then one can further accelerate the computations twofold by choosing the initial approximation X_0 = yI/||M||_1 for any value y such that 1 ≤ y ≤ 2√n(cond_2 M)/(1 + √n(cond_2 M)). This yields the bound ρ_0 ≤ 1 - 2√n/((cond_2 M)(1 + √n)).

The paper [2] obtains some acceleration for a large class of inputs by means of replacing iteration (2) with the cubic iteration

X_{k+1} = (cX_k^2 + dX_k + eI)X_k,  k = 0, 1, . . . ,

for appropriate scalars c, d, and e. The latter resource was employed again in [15] in the case of structured input matrices. For more narrow input classes one can try to yield further acceleration of convergence by applying more general iteration schemes. For example, recall the following two-stage iteration [16]-[18], having effective applications to integral equations via the associated tensor computations,

X_{k+1} = X_k(2I - X_k),  Y_{k+1} = Y_k(2I - X_k).

Here Y_0 = I and M = X_0 is a nonsingular matrix such that σ_1(I - X_0) = ||I - X_0||_2 < 1. It is readily verified that X_k = X_0 Y_k for all k and that the matrices X_k converge to the identity matrix I. Consequently the matrices Y_k converge to the inverse M^{-1} = X_0^{-1}.
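The basic iteration (2) with the Ben-Israel initialization X_0 = aM^H, a = 1/(||M||_1 ||M||_∞), can be sketched in a few lines (function and variable names are ours):

```python
import numpy as np

def newton_inverse(M, steps=60):
    # X_0 = a M^H guarantees ||I - M X_0||_2 < 1, so the residual
    # I - M X_k is squared at every step and X_k -> M^{-1}.
    a = 1.0 / (np.linalg.norm(M, 1) * np.linalg.norm(M, np.inf))
    X = a * M.conj().T
    I = np.eye(M.shape[0])
    for _ in range(steps):
        X = X @ (2 * I - M @ X)   # two matrix multiplications per step
    return X

M = np.random.default_rng(0).standard_normal((6, 6))
residual = np.linalg.norm(np.eye(6) - M @ newton_inverse(M))
```

For a random well conditioned 6 × 6 matrix the residual norm drops to machine-precision level within a few tens of steps, whereas an ill conditioned M needs the order of 2 log_2 cond_2 M steps just to bring the residual below 1/e, which is what the scaling (10) and the homotopic continuation of Section 5 address.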

4 Structured iteration, recursive compressions, and autocorrection

Next, assuming that the input matrix M is structured and is given with its short displacement generator, we modify Newton's iteration to perform its steps faster. We begin with recalling some background on the displacement representation of matrices (cf. [19]-[21]). We rely on the Sylvester displacement operators ∇_{A,B}(M) ← AM - MB, defined by the pairs of the associated n × n operator matrices A and B. The next simple fact relates them to the Stein operators ∆_{A,B}(M) = M - AMB.

Theorem 1. ∇_{A,B} = A∆_{A^{-1},B} if A is nonsingular. ∇_{A,B} = -∆_{A,B^{-1}}B if B is nonsingular.

∇_{A,B}(M) is the displacement of M, and its rank is the displacement rank of M. The matrix pair {S, T} is a displacement generator of length l for M if ∇_{A,B}(M) = ST^H and if S and T are n × l matrices. If a matrix M has displacement rank r = rank ∇_{A,B}(M) and is given with a displacement generator of length l, then one can readily compute its displacement generator of length r in O(l^2 n) ops [21, Section 4.6]. The most popular structures of Toeplitz, Hankel, Vandermonde and Cauchy types are associated with the operators ∇_{A,B} where each of the operator matrices A and B is diagonal or unit f-circulant. For such operators simple l-term bilinear or trilinear expressions of an n × n matrix M via the entries of its displacement generator {S, T} of length l can be found in [20], [21, Sections 4.4.4 and 4.4.5], and [22].

To extend our study to the case of indefinite Hermitian matrices M we just need to modify the matrices M_k and P_k by replacing t_k ← t_k √-1 for all k. We refer the reader to [36, Section 7] on some extensions of homotopic techniques to the case of non-Hermitian input matrices. If the input matrix M has structure of Toeplitz type or has rank structure, then so do the matrices M_k, P_k, and V_k for all k, and we can accelerate the computations respectively. We can extend the structures of other types from the matrix M to the matrices M_k for all k (and consequently also to the matrices P_k and V_k for all k) simply by redefining the matrices: M_k ← M + t_k N, where the matrix N shares its structure with the matrix M. E.g., for a Hankel-like matrix


M, we can choose N to be the reflection matrix, which has ones on its antidiagonal and zero entries elsewhere. For matrices M having structure of Vandermonde or Cauchy type, we can choose N to be a Vandermonde or Cauchy matrix, respectively, associated with the same operator ∇_{A,B}.

Alternatively, to invert a matrix M having structure of Vandermonde or Cauchy type we can first compute the matrix N = VMW where each of V and W is an appropriate Vandermonde matrix or the inverse or transpose of such a matrix. This would reduce the original inversion problem to the case of a Toeplitz-like matrix N because M^{-1} = W^{-1}N^{-1}V^{-1}. (This technique of displacement transformation is due to [38], was extensively used by G. Heinig, and is most widely known because of its effective application to the practical solution of Toeplitz and Toeplitz-like linear systems of equations in [39].)

We have the following lower bound on the number l of homotopic steps,

l + 1 ≥ log_κ cond_2(M)

for every scalar κ exceeding the condition numbers of the matrices M_0, P_0, . . ., P_{l-1}, and V_l. This bound is implied by the inequality

cond_2(M) ≤ cond_2(M_0) cond_2(V_l) Π_{k=0}^{l-1} cond_2(P_k).

With an appropriate choice of step sizes one only needs O(log cond_2 M + log_2 ln h) Newton steps overall to approximate M^{-1} with the residual norm below 1/h (cf. [36]).
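The displacement machinery of Section 4 is easy to check numerically. In this sketch (our own example) Z is the unit down-shift matrix, and the Sylvester displacement ZT - TZ of a Toeplitz matrix T has rank at most 2, so T admits a displacement generator of length 2:

```python
import numpy as np

n = 6
rng = np.random.default_rng(2)
d = rng.standard_normal(2 * n - 1)               # diagonals t_{i-j}
T = np.array([[d[i - j + n - 1] for j in range(n)] for i in range(n)])
Z = np.diag(np.ones(n - 1), -1)                  # unit down-shift matrix
D = Z @ T - T @ Z                                # Sylvester displacement
# Only the first row and last column of D can be nonzero, so rank(D) <= 2.
print(np.linalg.matrix_rank(D))
```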

6 Splitting GSVDs

We keep using the definitions in equations (5) and (8) and at first recall the following iteration from [13],

Y_k = X_k(2I - MX_k),  X_{k+1} = Y_k M Y_k,  k = 0, 1, . . . ,    (16)

such that

X_{k+1}M = ((2I - X_k M)X_k M)^2,  k = 0, 1, . . . ,

and for X_0 = aM^H the singular values s_i^{(k)} of the matrices X_k M satisfy the quartic equations

s_i^{(k+1)} = ((2 - s_i^{(k)})s_i^{(k)})^2,  i = 1, 2, . . . , n;  k = 0, 1, . . . .

The basic quartic polynomial mapping s ← (2 - s)^2 s^2 for this iteration has four fixed points s̃_0 = 0, s̃_1 = (3 - √5)/2 = 0.3819 . . . , s̃_2 = 1, and s̃_3 = (3 + √5)/2 = 2.618 . . . . The iteration sends the singular values s_i^{(0)} of the matrix X_0 M to zero


if they lie in the interval {s : 0 < s < s̃_1} and sends them to one if they lie in the interval {s : s̃_1 < s < 2 - s̃_1 = (1 + √5)/2 = 1.618 . . .}. If all singular values of the matrix X_0 M lie in these two intervals, then under (6) the matrices X_k converge to the generalized inverse (M_{<s})^- of the matrix M_{<s}, where s f(s) = s̃_1 under (6). Here and hereafter we write M_{<s} = U Σ_{<s} V^H where Σ_{<s} = diag(σ_i^{(<s)}), and σ_i^{(<s)} equals σ_i if σ_i > s and equals zero otherwise, so that M_{<s} and M_{>s} = M - M_{<s} denote the two matrices obtained by setting to zero all singular values σ_i of the matrix M exceeded by s and greater than s, respectively.

The convergence is locally quadratic but initially is slow for matrices M having singular values σ_i such that the values s_i^{(0)} = σ_i φ_{i,0} (equal to σ_i f(σ_i) under (6)) lie near the points s̃_1 and/or 2 - s̃_1. The iteration can be directed towards the matrix M_{<s} for any fixed smaller positive s (cf. [2, Section 7]). At first one should choose appropriate scalars c, d, a_0, a_1, . . . , define the initial approximation X_0 = cI + dM^H, and apply iteration (10). For appropriate scalars a_k and sufficiently large l one yields that 0 ≤ s_i^{(l)} < s̃_1 if σ_i < s and s̃_1 ≤ s_i^{(l)} < 2 - s̃_1 otherwise. Then one writes X_0 ← X_l and shifts to iteration (16).

Similar results are obtained in [2] for the iteration

X_{k+1} = (3I - 2X_k M)X_k M X_k,  k = 0, 1, . . . ,    (17)

such that X_{k+1} M = (3I − 2X_k M)(X_k M)^2, k = 0, 1, . . . . This iteration is associated with the cubic polynomial mapping s ← s^2(3 − 2s), which has nonnegative fixed points 0, 1/2, and 1. The iteration sends the singular values s_i of the matrix X_0 M towards zero where 0 ≤ s_i^{(0)} < 1/2 and towards one where 1/2 < s_i^{(0)} ≤ s~_4 = (1 + √3)/2 = 1.366 . . . . The convergence to zero and one is locally quadratic but initially is slow near the points 1/2 and s~_4. Then again this can be readily extended to the iteration for which the iterates X_k converge to the matrix (M_{<s})^− for a selected smaller positive s. Both iterations (16) and (17) are proved to be numerically stable in [2]. Having the matrices M and (M_{<s})^−, one can readily compute the matrices M_{<s} = M(M_{<s})^− M and M_{>s} = M − M_{<s}. The paper [2] also extends iteration (17) to yield convergence to the projection matrices P_s = M(M_{<s})^− = U diag(I_{r(s)}, 0) U^H and P^{(s)} = (M_{<s})^− M = V diag(I_{r(s)}, 0) V^H, where the integer r(s) is defined by the threshold value s. In this case it is sufficient to choose two scalars a and b satisfying

a > 0, b > 0, a s^2 + b = 1/2, a σ_1^2 + b < s~_4 = 1.37 . . . ,

to set X_0 = aP + bI, and to apply the iteration X_{k+1} = (I − 2(X_k − I)) X_k^2 = (3I − 2X_k) X_k^2, k = 0, 1, . . . .

The iteration converges to the matrices P_s and P^{(s)} for P = M M^H and P = M^H M, respectively.
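For illustration, here is a small NumPy sketch of iteration (17) on a synthetic symmetric matrix (the test matrix, its singular values, and the scaling a = 0.5 are illustrative assumptions, chosen so that a*sigma_i^2 falls in the two basins of the cubic map):

```python
import numpy as np

# Iteration (17): X_{k+1} = (3I - 2 X_k M) X_k M X_k, started from X_0 = a M^H.
# Singular values {1.5, 1.2, 0.1}: a * sigma^2 lies in (1/2, s~4) for the two
# large ones and below 1/2 for the small one, so X_k should converge to the
# generalized inverse of M with the small singular value truncated.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
sigma = np.array([1.5, 1.2, 0.1])
M = Q @ np.diag(sigma) @ Q.T            # symmetric, known SVD

a = 0.5
X = a * M.T                             # X_0 = a M^H
I = np.eye(3)
for _ in range(30):
    XM = X @ M
    X = (3 * I - 2 * XM) @ XM @ X

M_trunc_inv = Q @ np.diag([1 / 1.5, 1 / 1.2, 0.0]) @ Q.T
print(np.allclose(X, M_trunc_inv, atol=1e-6))
```

After roughly ten steps the two large singular values of X_k M sit at 1 and the small one at 0, so X_k matches (M_{<s})^− for any threshold s between 0.1 and 1.2.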


Victor Y. Pan

7 Divide-and-conquer algorithms and computing singular subspaces

Let us briefly comment on some applications of the splitting techniques from the previous section. Clearly, the SVD and GSVD computation for a matrix M can be reduced to the computation of the SVDs or GSVDs of the pair of matrices M_{<s} and M_{>s}, and this can be recursively extended. A similar divide-and-conquer process can be applied to computing the generalized inverse M^− = (M_{<s})^− + (M_{>s})^− and the solutions M^− f = (M_{<s})^− f + (M_{>s})^− f of linear systems Mx = f. This process can rely on computing the pairs of matrices (M_{<s})^− and (M_{>s})^− or M_{<s} and M_{>s}. In the latter case the vectors (M_{<s})^− f and (M_{>s})^− f can be computed as the least-squares solutions of the linear systems M_{<s} x_{<s} = f and M_{>s} x_{>s} = f, respectively, that have the minimum Euclidean norms (see [3, Section 5.5] on the respective theory and algorithms). The computations with the matrices M_{<s} and M_{>s} are simpler than with the matrix M because cond M = (cond M_{<s})(cond M_{>s}) and rank(M) = rank(M_{<s}) + rank(M_{>s}). Splitting is more effective where the threshold value s balances it, that is, where the ratios (cond M_{<s})/cond M_{>s} and/or rank(M_{<s})/rank(M_{>s}) are close to one.

In a sample application of splitting the SVD or GSVD to computing the determinant det M, we can compute some unitary matrix bases U_{<s}, V_{<s}, U_{>s}, and V_{>s}, respectively, for the left and right null spaces of the matrices M_{<s} and M_{>s}, respectively, such that

(U_{<s}, U_{>s})^H M (V_{<s}, V_{>s}) = diag(U~_1, U~_2) Σ diag(V~_1, V~_2) = diag(M~_1, M~_2).

Here (U~_1, U~_2) and (V~_1, V~_2) are unitary matrices, whereas the diagonal blocks U~_i, V~_i, and M~_i are n_i × n_i matrices for i = 1, 2; n_1 = r(s) and n_2 = n − n_1. Therefore

det M = (det M~_1)(det M~_2) / ((det(U_{<s}, U_{>s}))(det(V_{<s}, V_{>s}))),

where the matrices (U_{<s}, U_{>s}) and (V_{<s}, V_{>s}) are unitary and the matrices M~_1 and M~_2 have both sizes and condition numbers decreased versus the matrix M. Indeed cond M = (cond M~_1)(cond M~_2).

We can apply the same techniques wherever we can zero some singular values of the input matrix M and preserve the respective singular subspaces of the matrix M. We only need to compute the respective matrix bases for the left and right null spaces of the resulting matrix and for their complements (cf. the next section), which are the respective singular subspaces of the input matrix. The desired suppression of some small singular values can be achieved in smaller numbers of steps (16) or (17), as soon as the respective singular values nearly vanish, whereas the other singular values remain bounded and separated from zero and do not necessarily move close to one.


Clearly, the approach can be applied to computing any polynomial or rational function in a Hermitian matrix M having a GSVD M = U^H Σ U. Consequently it can be extended to polynomial and rational approximation of irrational functions in Hermitian matrices.
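For a Hermitian M this amounts to applying the scalar function to the eigenvalues; a generic NumPy sketch (the example matrix and the choice of the square-root function are illustrative, not from the paper):

```python
import numpy as np

# Evaluate a function of a Hermitian matrix M via its eigendecomposition
# M = U diag(w) U^H: apply the scalar function to the eigenvalues.
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
M = A @ A.T + 4 * np.eye(4)                # Hermitian positive definite

w, Uv = np.linalg.eigh(M)
sqrt_M = Uv @ np.diag(np.sqrt(w)) @ Uv.T   # principal square root of M

print(np.allclose(sqrt_M @ sqrt_M, M))
```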

8 Computation of null matrix bases

Computation of null matrix bases can rely on factorizations of input matrices (say, on their QR or PLUP* factorizations) but also on the recent alternatives in [40], [41]. Here is a relevant basic result from [40], [41], which for simplicity we state only for square input matrices M and for the right null spaces. (Recall that the left null space of a matrix M is the right null space of its Hermitian transpose M^H.)

Theorem 8. Assume an n × n matrix M of a rank ρ, a pair of two matrices U and V of sizes n × r, and the nonsingular matrix C = M + UV^H. Then

r ≥ rank(U) ≥ n − ρ,  (18)

and

N(M) = range(C^+ U Y)  (19)

provided Y is a matrix basis for the null space N(M C^+ U). Furthermore, if

r = rank(U) = n − ρ,  (20)

then we have

N(M) = range(C^+ U),  (21)

V^H C^+ U = I_r.  (22)
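A numerical illustration of (20)-(22) with synthetic data (the sizes and the diagonal choice of M are assumptions for the sketch; since C is nonsingular, C^+ here is just C^{-1}):

```python
import numpy as np

# Theorem 8 with r = rank(U) = n - rho: N(M) = range(C^+ U) and V^H C^+ U = I_r.
rng = np.random.default_rng(3)
n, rho = 6, 3
r = n - rho
M = np.diag([1.0, 2.0, 3.0, 0.0, 0.0, 0.0])      # rank-rho matrix
U = rng.standard_normal((n, r))
V = rng.standard_normal((n, r))
C = M + U @ V.T                                   # generically nonsingular

N = np.linalg.solve(C, U)                         # C^+ U = C^{-1} U here
print(np.allclose(M @ N, 0, atol=1e-8))           # columns of N span N(M)
print(np.allclose(V.T @ N, np.eye(r), atol=1e-8)) # V^H C^+ U = I_r
```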

One can choose the matrices U and V at random based on the following simple results from [42].

Theorem 9. For a finite set Δ of cardinality |Δ| in a ring R and four matrices M ∈ R^{n×n} of a rank ρ, U and V in Δ^{n×r}, and C = M + UV^T, we have

a) rank(C) ≤ r + ρ;

b) rank(C) = n with a probability of at least 1 − 2r/|Δ| if r + ρ ≥ n and either the entries of both matrices U and V have been randomly sampled from the set Δ or U = V and the entries of the matrix U have been randomly sampled from this set;

c) rank(C) = n with a probability of at least 1 − r/|Δ| if r + ρ ≥ n, the matrix U (respectively V) has full rank r, and the entries of the matrix V (respectively U) have been randomly sampled from the set Δ.
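A minimal sketch of claim b) (the sizes and the set Δ = {0, . . . , 999} are illustrative; the bound then guarantees full rank with probability at least 1 − 2r/|Δ|, and the run is seeded so it is deterministic):

```python
import numpy as np

# Randomized additive preprocessing: with r + rho >= n and entries of U, V
# sampled from a large finite set, C = M + U V^T is expected to be nonsingular.
rng = np.random.default_rng(4)
n, rho = 6, 3
r = n - rho
M = np.diag([1.0, 2.0, 3.0, 0.0, 0.0, 0.0])      # rank(M) = rho = 3
U = rng.integers(0, 1000, size=(n, r)).astype(float)
V = rng.integers(0, 1000, size=(n, r)).astype(float)
C = M + U @ V.T

print(np.linalg.matrix_rank(C))                  # expected: full rank n = 6
```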


With weakly random generation of the matrices U and V (which allows endowing them with the desired patterns of structure and sparseness) our null space computations are expected to be numerically stable according to the theoretical and experimental study in [42, Sections 4, 6, and 8]. In particular this study shows that under the assumptions of the previous theorem and under weakly random choice of sparse and structured matrices U and V of a rank r ≥ n − ρ, the ratio (cond_2 C)/cond_2 M is likely to be neither large nor small provided the matrices M, U, and V are scaled so that the ratio ||M||_2/||UV^H||_2 is neither large nor small. We refer the reader to [40]-[44] on such a randomized additive preprocessing M → M + UV^H and its applications to some fundamental matrix computations (such as eigen-solving and linear system solving).

9 Discussion

1. One can enhance the power of our techniques in various ways by combining them with the available software and hardware. E.g., the iterations of Sections 6 and 7 can be applied concurrently to a number of initial approximations X_0, thus producing matrices M_{<s(j)} for a number of threshold values s(j), j = 1, 2, . . . . The matrix M_{<s} − M_{<t} is obtained by zeroing all singular values of the matrix M that are less than s or not less than t. Thus we can employ the power of parallel processing to accelerate splitting the given computational problem into subproblems having smaller sizes and condition numbers.

2. Our divide-and-conquer processes employ various algorithms for computing null space bases numerically, that is, some bases for the spaces of singular vectors associated with the smallest singular values. It is a challenge to refine these algorithms, particularly where these values form clusters not clearly separated from each other.

3. Another challenge is to define new polynomial mappings that would modify iterations (16) and (17) to accelerate the convergence of some singular values to zero while keeping sufficiently many of them away from zero.

4. Many of the techniques in this paper can be applied to the computation of various other matrix functions besides the inverse, e.g., the square roots, the matrix sign function, and the solution of Riccati's equation (cf. [45] and the bibliography therein). Newton's iteration is fundamental for such tasks. Wherever the output matrix is structured, our techniques of recursive compression in Section 4 can support acceleration, at least locally. Our techniques in Section 5 for continuation via recursive factorization can treat the paramount problem of initialization, in both cases of general and structured input matrices. Finally the techniques in Sections 6-8 for splitting GSVDs and computing bases for the respective singular subspaces can be readily extended from the case of inversion to computing the


square roots and other functions in Hermitian matrices. A natural challenge is the transition from GSVDs to eigen-decompositions, which would involve non-Hermitian input matrices.

References

1. T. Söderström, G. W. Stewart, On the Numerical Properties of an Iterative Method for Computing the Moore-Penrose Generalized Inverse, SIAM Journal on Numerical Analysis, 11, 61-74, 1974.
2. V. Y. Pan, R. Schreiber, An Improved Newton Iteration for the Generalized Inverse of a Matrix, with Applications, SIAM Journal on Scientific and Statistical Computing, 12, 5, 1109-1131, 1991.
3. G. H. Golub, C. F. Van Loan, Matrix Computations, 3rd edition, The Johns Hopkins University Press, Baltimore, Maryland, 1996.
4. G. W. Stewart, Matrix Algorithms, Vol. I: Basic Decompositions, SIAM, Philadelphia, 1998.
5. G. W. Stewart, Matrix Algorithms, Vol. II: Eigensystems, SIAM, Philadelphia, 1998 (first edition), 2001 (second edition).
6. J. W. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, 1997.
7. L. N. Trefethen, D. Bau, III, Numerical Linear Algebra, SIAM, Philadelphia, 1997.
8. N. J. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, Philadelphia, 2002 (second edition).
9. H. Hotelling, Analysis of a Complex of Statistical Variables into Principal Components, J. of Educational Psych., 24, 417-441, 498-520, 1933.
10. G. Schultz, Iterative Berechnung der reziproken Matrix, Z. Angew. Math. Mech., 13, 57-59, 1933.
11. A. Ben-Israel, A Note on an Iterative Method for Generalized Inversion of Matrices, Mathematics of Computation, 20, 439-440, 1966.
12. A. Ben-Israel, D. Cohen, On Iterative Computation of Generalized Inverses and Associated Projections, SIAM Journal on Numerical Analysis, 3, 410-419, 1966.
13. R. Schreiber, Computing Generalized Inverses and Eigenvalues of Symmetric Matrices Using Systolic Arrays, in Computing Methods in Applied Sciences and Engineering (edited by R. Glowinski and J.-L. Lions), North-Holland, Amsterdam, 1984.
14. V. Y. Pan, J. Reif, Fast and Efficient Parallel Solution of Dense Linear Systems, Computers and Math. (with Applications), 17, 11, 1481-1491, 1989.
15. G. Codevico, V. Y. Pan, M. Van Barel, Newton-like Iteration Based on Cubic Polynomials for Structured Matrices, Numerical Algorithms, 36, 365-380, 2004.
16. I. V. Oseledets, E. E. Tyrtyshnikov, Approximate Inversion of Matrices in the Process of Solving a Hypersingular Integral Equation, Computational Math. and Math. Physics, 45, 2, 302-313, 2005 (translated from ZhVM i MF, 45, 2, 315-326, 2005).
17. V. Olshevsky, I. V. Oseledets, E. E. Tyrtyshnikov, Tensor Properties of Multilevel Toeplitz and Related Matrices, Linear Algebra and Its Applications, 412, 1-21, 2006.
18. V. Olshevsky, I. V. Oseledets, E. E. Tyrtyshnikov, Superfast Inversion of Two-Level Toeplitz Matrices Using Newton Iteration and Tensor-Displacement Structure, Operator Theory: Advances and Applications, 179, 229-240, 2008.


19. T. Kailath, S.-Y. Kung, M. Morf, Displacement Ranks of Matrices and Linear Equations, J. Math. Anal. Appl., 68(2), 395-407, 1979.
20. I. Gohberg, V. Olshevsky, Complexity of Multiplication with Vectors for Structured Matrices, Linear Algebra and Its Applications, 202, 163-192, 1994.
21. V. Y. Pan, Structured Matrices and Polynomials: Unified Superfast Algorithms, Birkhäuser/Springer, Boston/New York, 2001.
22. V. Y. Pan, X. Wang, Inversion of Displacement Operators, SIAM Journal on Matrix Analysis and Applications, 24, 3, 660-677, 2003.
23. V. Y. Pan, Parallel Solution of Toeplitz-like Linear Systems, J. of Complexity, 8, 1-21, 1992.
24. V. Y. Pan, Concurrent Iterative Algorithm for Toeplitz-like Linear Systems, IEEE Transactions on Parallel and Distributed Systems, 4, 5, 592-600, 1993.
25. V. Y. Pan, Decreasing the Displacement Rank of a Matrix, SIAM Journal on Matrix Analysis and Applications, 14, 1, 118-121, 1993.
26. V. Y. Pan, S. Branham, R. Rosholt, A. Zheng, Newton's Iteration for Structured Matrices and Linear Systems of Equations, in the SIAM volume Fast Reliable Algorithms for Matrices with Structure (edited by T. Kailath and A. H. Sayed), 189-210, SIAM Publications, Philadelphia, 1999.
27. V. Y. Pan, Y. Rami, Newton's Iteration for the Inversion of Structured Matrices, in Structured Matrices: Recent Developments in Theory and Computation (edited by D. Bini, E. Tyrtyshnikov and P. Yalamov), 79-90, Nova Science Publishers, USA, 2001.
28. D. A. Bini, B. Meini, Approximate Displacement Rank and Applications, in AMS Conference "Structured Matrices in Operator Theory, Control, Signal and Image Processing", Boulder, 1999 (edited by V. Olshevsky), American Math. Society, 215-232, Providence, RI, 2001.
29. V. Y. Pan, Y. Rami, X. Wang, Structured Matrices and Newton's Iteration: Unified Approach, Linear Algebra and Its Applications, 343-344, 233-265, 2002.
30. V. Y. Pan, M. Van Barel, X. Wang, G. Codevico, Iterative Inversion of Structured Matrices, Theoretical Computer Science, 315, 2-3 (Special Issue on Algebraic and Numerical Computing), 581-592, 2004.
31. R. Vandebril, M. Van Barel, G. Golub, N. Mastronardi, A Bibliography on Semiseparable Matrices, Calcolo, 42, 3-4, 249-270, 2005.
32. D. Bini, V. Y. Pan, Improved Parallel Computations with Toeplitz-like and Hankel-like Matrices, Linear Algebra and Its Applications, 188/189, 3-29, 1993.
33. D. Bini, V. Y. Pan, Polynomial and Matrix Computations, Vol. 1: Fundamental Algorithms, Birkhäuser, Boston, 1994.
34. V. Y. Pan, A. Zheng, X. Huang, O. Dias, Newton's Iteration for Inversion of Cauchy-like and Other Structured Matrices, Journal of Complexity, 13, 108-124, 1997.
35. V. Y. Pan, A Homotopic Residual Correction Process, Proc. of the Second Conference on Numerical Analysis and Applications (edited by L. Vulkov, J. Wasniewski and P. Yalamov), Lecture Notes in Computer Science, 1988, 644-649, Springer, Berlin, 2001.
36. V. Y. Pan, M. Kunin, R. Rosholt, H. Kodal, Homotopic Residual Correction Processes, Math. of Computation, 75, 345-368, 2006.
37. V. Y. Pan, New Homotopic/Factorization and Symmetrization Techniques for Newton's and Newton/Structured Iteration, Computers and Math. with Applications, 54, 721-729, 2007.


38. V. Y. Pan, Computations with Dense Structured Matrices, Math. of Comp., 55(191), 179-190, 1990. Proc. version in Proc. Annual ACM-SIGSAM International Symposium on Symbolic and Algebraic Computation (ISSAC '89), 34-42, ACM Press, New York, 1989.
39. I. Gohberg, T. Kailath, V. Olshevsky, Fast Gaussian Elimination with Partial Pivoting for Matrices with Displacement Structure, Math. of Comp., 64(212), 1557-1576, 1995.
40. V. Y. Pan, Computations in the Null Spaces with Additive Preconditioning, Technical Report TR 2007009, CUNY Ph.D. Program in Computer Science, Graduate Center, City University of New York, April 2007. Available at http://www.cs.gc.cuny.edu/tr/techreport.php?id=352
41. V. Y. Pan, G. Qian, Solving Homogeneous Linear Systems with Weakly Random Additive Preprocessing, Technical Report TR 2008009, CUNY Ph.D. Program in Computer Science, Graduate Center, the City University of New York, 2008. Available at http://www.cs.gc.cuny.edu/tr/techreport.php?id=352
42. V. Y. Pan, D. Ivolgin, B. Murphy, R. E. Rosholt, Y. Tang, X. Yan, Additive Preconditioning for Matrix Computations, Tech. Report TR 2008004, Ph.D. Program in Computer Science, Graduate Center, the City University of New York, 2008. Available at http://www.cs.gc.cuny.edu/tr/techreport.php?id=352 Proceedings version in Proc. of the Third International Computer Science Symposium in Russia (CSR 2008), Lecture Notes in Computer Science (LNCS), 5010, 372-383, 2008.
43. V. Y. Pan, X. Yan, Additive Preconditioning, Eigenspaces, and the Inverse Iteration, Linear Algebra and Its Applications, in press.
44. V. Y. Pan, D. Grady, B. Murphy, G. Qian, R. E. Rosholt, A. Ruslanov, Schur Aggregation for Linear Systems and Determinants, Theoretical Computer Science, Special Issue on Symbolic-Numerical Algorithms (D. A. Bini, V. Y. Pan, and J. Verschelde, editors), in press.
45. N. J. Higham, Functions of Matrices: Theory and Computation, SIAM, Philadelphia, 2008.

Truncated decompositions and filtering methods with Reflective/Anti-Reflective boundary conditions: a comparison

C. Tablino Possio⋆

Dipartimento di Matematica e Applicazioni, Università di Milano Bicocca, via Cozzi 53, 20125 Milano, Italy
cristina.tablinopossio@unimib.it

Abstract. The paper analyzes and compares some spectral filtering methods, such as truncated singular/eigen-value decompositions and Tikhonov/Re-blurring regularizations, in the case of the recently proposed Reflective [18] and Anti-Reflective [21] boundary conditions. We give numerical evidence to the fact that spectral decompositions (SDs) provide a good image restoration quality, and this is true in particular for the Anti-Reflective SD, despite the loss of orthogonality in the associated transform. The related computational cost is comparable with that of previously known spectral decompositions, and is substantially lower than that of the singular value decomposition. The model extension to the cross-channel blurring phenomenon of color images is also considered and the related spectral filtering methods are suitably adapted.

Keywords: filtering methods, spectral decompositions, boundary conditions.

1 Introduction

In this paper we deal with the classical image restoration problem of blurred and noisy images in the case of a space invariant blurring. Under such an assumption the image formation process is modelled according to the following integral equation with space invariant kernel

g(x) = ∫ h(x − x~) f(x~) dx~ + η(x),  x ∈ R^2,  (1)

where f denotes the true physical object to be restored, g is the recorded blurred and noisy image, and η takes into account unknown errors in the collected data, e.g. measurement errors and noise. As customary, we consider the discretization of (1) by means of a standard

⋆ The work of the author was partially supported by MIUR 2006017542.


2D generalization of the rectangle quadrature formula on an equispaced grid, ordered row-wise from the top-left corner to the bottom-right one. Hence, we obtain the relations

g_i = Σ_{j ∈ Z^2} h_{i−j} f_j + η_i,  i ∈ Z^2,  (2)

in which an infinite and shift-invariant matrix A~_∞ = [h_{i−j}], (i, j) = ((i_1, i_2), (j_1, j_2)), i.e., a two-level Toeplitz matrix, is involved. In principle, (2) presents an infinite summation since the true image scene does not have a finite boundary. Nevertheless, the data g_i are clearly collected only at a finite number of values, so representing only a finite region of such an infinite scene. In addition, the blurring operator typically shows a finite support, so that it is completely described by a Point Spread Function (PSF) mask such as

h_PSF = [h_{i_1,i_2}], i_1 = −q_1, . . . , q_1, i_2 = −q_2, . . . , q_2,  (3)

where h_{i_1,i_2} ≥ 0 for any i_1, i_2 and Σ_{i=−q}^{q} h_i = 1, i = (i_1, i_2), q = (q_1, q_2) (normalization according to a suitable conservation law). Therefore, relations (2) imply

g_i = Σ_{s=−q}^{q} h_s f_{i−s} + η_i,  i_1 = 1, . . . , n_1, i_2 = 1, . . . , n_2,  (4)

where the range of collected data defines the so-called Field of View (FOV). Once again, we are assuming that all the involved data in (5), similarly to (2), are reshaped in a row-wise ordering. In such a way we obtain the linear system

A~ f~ = g − η  (5)

where A~ ∈ R^{N(n)×N(n+2q)} is a finite principal sub-matrix of A~_∞, with main diagonal containing h_{0,0}, f~ ∈ R^{N(n+2q)}, g, η ∈ R^{N(n)}, and with N(m) = m_1 m_2 for any two-index m = (m_1, m_2). Such a reshape is considered just to perform the theoretical analysis, since all the deblurring/denoising methods are able to deal directly with data in matrix form. For instance, it is evident that the blurring process in (4) consists in a discrete convolution between the PSF mask, after a rotation of 180°, and the proper true image data in

F~ = [f_{i_1,i_2}], i_1 = −q_1 + 1, . . . , n_1 + q_1, i_2 = −q_2 + 1, . . . , n_2 + q_2.

Hereafter, with a two-index notation, we denote by F = [f_{i_1,i_2}], i_1 = 1, . . . , n_1, i_2 = 1, . . . , n_2, the true image inside the FOV and by G = [g_{i_1,i_2}], i_1 = 1, . . . , n_1, i_2 = 1, . . . , n_2, the recorded image. Thus, assuming the knowledge of the PSF mask in (3) and of some statistical properties of η, the deblurring problem is defined as: restore, as well as possible, the true image F on the basis of the recorded image G. As evident


from (4), the problem is undetermined since the number of unknowns involved in the convolution exceeds the number of recorded data. Boundary conditions (BCs) are introduced to artificially describe the scene outside the FOV: the values of the unknowns outside the FOV are fixed or are defined as linear combinations of the unknowns inside the FOV, the target being to reduce (5) to a square linear system

A_n f = g − η  (6)

with A_n ∈ R^{N(n)×N(n)}, n = (n_1, n_2), N(n) = n_1 n_2, and f, g, η ∈ R^{N(n)}. The choice of the BCs does not affect the global spectral behavior of the matrix. However, it may have a valuable impact both with respect to the accuracy of the restored image and to the computational costs for recovering f from the blurred datum, with or without noise. Notice also that, typically, the matrix A is very ill-conditioned and there is a significant intersection between the subspace related to small eigen/singular values and the high frequency subspace. Such a feature requires the use of suitable regularization methods that allow one to properly restore the image F with controlled noise levels [12-14, 24], among which we can cite truncated SVD, Tikhonov, and total variation [12, 14, 24]. Hereafter, we focus our attention on the special case of PSFs satisfying a strong symmetry property, i.e., such that

h_{|i|} = h_i for any i = −q, . . . , q.  (7)

(7)

This assumption is ful lled in the majority of models in real opti al appli ations. For instan e, in most 2D astronomi al imaging with opti al lens [5℄ the model of the PSF is ir ularly symmetri , and hen e, strongly symmetri ; in the multi-image de onvolution of some re ent interferometri teles opes, the PSF is strongly symmetri too [6℄. Moreover, in real appli ations when the PSF is obtained by measurements (like a guide star in astronomy), the in uen e of noise leads to a numeri ally nonsymmetri PSF, also when the kernel of the PSF is strongly (or entro) symmetri . In su h a ase, by employing a symmetrized version of the measured PSF, omparable restorations are observed [15, 1℄. The paper is organized as follows. In Se tion 2 we fo us on two re ently proposed BCs, i.e., the Re e tive [18℄ and Anti-Re e tive BCs [21℄ and their relevant properties. Se tion 3 summarizes some lassi al ltering te hniques as the trun ated singular/eigen-values de omposition and the Tikhonov method. The Re-blurring method [11, 9℄ is onsidered in the ase of Anti-Re e tive BCs and its re-interpretation in the framework of the lassi al Tikhonov regularization is given. In Se tion 4 the model is generalized for taking into a

ount the

ross- hannel blurring phenomenon and the previous ltering methods are suitable adapted. Lastly, Se tion 5 deals with some omputational issues and reports several numeri al tests, the aim being to ompare the quoted ltering methods and the two type of BCs, both in the ase of gray-s ale and olor images. In


Section 6 some conclusions and remarks end the paper.
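The blurring model (4) can be transcribed directly in NumPy; a sketch with 0-based indices, where F_ext plays the role of the extended data F~ (the delta and averaging masks are illustrative):

```python
import numpy as np

# g_i = sum_{s=-q}^{q} h_s f_{i-s}: discrete convolution of the PSF mask
# with the extended image (F_ext has shape (n1 + 2 q1, n2 + 2 q2)).
def blur(F_ext, h):
    q1, q2 = (h.shape[0] - 1) // 2, (h.shape[1] - 1) // 2
    n1, n2 = F_ext.shape[0] - 2 * q1, F_ext.shape[1] - 2 * q2
    G = np.zeros((n1, n2))
    for i1 in range(n1):
        for i2 in range(n2):
            for s1 in range(-q1, q1 + 1):
                for s2 in range(-q2, q2 + 1):
                    G[i1, i2] += h[s1 + q1, s2 + q2] * F_ext[i1 - s1 + q1, i2 - s2 + q2]
    return G

# With a delta PSF the blur is the identity on the field of view.
F_ext = np.arange(49, dtype=float).reshape(7, 7)
delta = np.zeros((3, 3)); delta[1, 1] = 1.0
print(np.allclose(blur(F_ext, delta), F_ext[1:6, 1:6]))

# A normalized PSF preserves constant images (conservation law).
h = np.full((3, 3), 1.0 / 9.0)
print(np.allclose(blur(np.ones((7, 7)), h), 1.0))
```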

2 Boundary conditions

In this section we summarize the relevant properties of two recently proposed types of BCs, i.e., the Reflective [18] and Anti-Reflective BCs [21]. Special attention is given to the structural and spectral properties of the arising matrices. In fact, though the choice of the BCs does not affect the global spectral behavior of the matrix A, it can have a valuable impact with respect both to the accuracy of the restoration (especially close to the boundaries, where ringing effects can appear), and to the computational costs for recovering the image from the blurred one, with or without noise. Moreover, taking into account the scale of the problem, the regularization methods analysis can be greatly simplified whenever a spectral (or singular value) decomposition of A is easily available. This means that the target is to obtain the best possible approximation properties, keeping unaltered the fact that the arising matrix shows an exploitable structure. For instance, the use of periodic BCs enforces a circulant structure, so that the spectral decomposition can be computed efficiently with the fast Fourier transform (FFT) [8]. Despite these computational facilities, they give rise to significant ringing effects when a significant discontinuity is introduced into the image. Hereafter, we focus on two recently proposed boundary conditions that more carefully describe the scene outside the FOV. Clearly, several other methods deal with this topic in the image processing literature, e.g. local mean value [22] or extrapolation techniques (see [17] and references therein). Nevertheless, the penalty for their good approximation properties could lie in a linear algebra problem more difficult to cope with.

2.1 Reflective boundary conditions

In [18] Ng et al. analyze the use of Reflective BCs, both from the model and the linear algebra point of view. The improvement with respect to Periodic BCs is due to the preservation of the continuity of the image. In fact, the scene outside the FOV is assumed to be a reflection of the scene inside the FOV. For example, with a boundary at x_1 = 0 and x_2 = 0 the reflective condition is given by f(±x_1, ±x_2) = f(x_1, x_2). More precisely, along the borders, the BCs impose

f_{i_1,1−i_2} = f_{i_1,i_2}, f_{i_1,n_2+i_2} = f_{i_1,n_2+1−i_2}, for any i_1 = 1, . . . , n_1, i_2 = 1, . . . , q_2,
f_{1−i_1,i_2} = f_{i_1,i_2}, f_{n_1+i_1,i_2} = f_{n_1+1−i_1,i_2}, for any i_1 = 1, . . . , q_1, i_2 = 1, . . . , n_2,


and, at the corners, the BCs impose, for any i_1 = 1, . . . , q_1, i_2 = 1, . . . , q_2,

f_{1−i_1,1−i_2} = f_{i_1,i_2}, f_{1−i_1,n_2+i_2} = f_{i_1,n_2+1−i_2},
f_{n_1+i_1,n_2+i_2} = f_{n_1+1−i_1,n_2+1−i_2}, f_{n_1+i_1,1−i_2} = f_{n_1+1−i_1,i_2},

i.e., a double reflection, first with respect to one axis and then with respect to the other, no matter in which order. As a consequence the rectangular matrix A~ is reduced to a square Toeplitz-plus-Hankel block matrix with Toeplitz-plus-Hankel blocks, i.e., A_n shows the two-level Toeplitz-plus-Hankel structure. Moreover, if the blurring operator satisfies the strong symmetry condition (7), then the matrix A_n belongs to the DCT-III matrix algebra. Therefore, its spectral decomposition can be computed very efficiently using the fast discrete cosine transform (DCT-III) [23]. More in detail, let C_n = {A_n ∈ R^{N(n)×N(n)}, n = (n_1, n_2), N(n) = n_1 n_2 | A_n = R_n Λ_n R_n^T} be the two-level DCT-III matrix algebra, i.e., the algebra of matrices that are simultaneously diagonalized by the orthogonal transform R_n = R_{n_1} ⊗ R_{n_2},

R_m = [ sqrt((2 − δ_{t,1})/m) cos((s − 1)(t − 1/2)π/m) ]_{s,t=1}^{m},  (8)

with δ_{s,t} denoting the Kronecker symbol. Thus, the explicit structure of the matrix is A_n = Toeplitz(V) + Hankel(σ(V), Jσ(V)), with V = [V_0 V_1 . . . V_{q_1} 0 . . . 0], where each V_{i_1}, i_1 = 1, . . . , q_1, is the unilevel DCT-III matrix associated to the i_1-th row of the PSF mask, i.e., V_{i_1} = Toeplitz(v_{i_1}) + Hankel(σ(v_{i_1}), Jσ(v_{i_1})), with v_{i_1} = [h_{i_1,0}, . . . , h_{i_1,q_2}, 0, . . . , 0]. Here, we denote by σ the shift operator such that σ(v_{i_1}) = [h_{i_1,1}, . . . , h_{i_1,q_2}, 0, . . . , 0] and by J the usual flip matrix; at the block level the same operations are intended in a block-wise sense. Besides this structural characterization, the spectral description is completely known. In fact, let f be the bivariate generating function associated to the PSF mask (3), that is

f(x_1, x_2) = h_{0,0} + 2 Σ_{s_1=1}^{q_1} h_{s_1,0} cos(s_1 x_1) + 2 Σ_{s_2=1}^{q_2} h_{0,s_2} cos(s_2 x_2) + 4 Σ_{s_1=1}^{q_1} Σ_{s_2=1}^{q_2} h_{s_1,s_2} cos(s_1 x_1) cos(s_2 x_2),  (9)

then the eigenvalues of the corresponding matrix A_n ∈ C_n are given by

λ_s(A_n) = f(x_{s_1}^{[n_1]}, x_{s_2}^{[n_2]}), s = (s_1, s_2), with x_r^{[m]} = (r − 1)π/m,

where s_1 = 1, . . . , n_1, s_2 = 1, . . . , n_2, and where the two-index notation highlights the tensorial structure of the corresponding eigenvectors.


Lastly, notice that standard operations like matrix-vector products, solution of linear systems, and eigenvalue evaluations can be performed by means of the FCT-III [18] within O(n_1 n_2 log(n_1 n_2)) arithmetic operations (ops). For example, by multiplying both sides of R_n^T A_n = Λ_n R_n^T by e_1 = [1, 0, . . . , 0]^T, it holds that

[Λ_n]_{(i_1,i_2)} = [R_n^T (A_n e_1)]_{(i_1,i_2)} / [R_n^T e_1]_{(i_1,i_2)}, i_1 = 1, . . . , n_1, i_2 = 1, . . . , n_2,

i.e., it is enough to consider an inverse FCT-III applied to the first column of A_n, with a computational cost of O(n_1 n_2 log(n_1 n_2)) ops.

2.2 Anti-reflective boundary conditions

More recently, Anti-Reflective boundary conditions (AR-BCs) have been proposed in [21] and studied in [2-4, 9, 10, 19]. The improvement is due to the fact that not only the continuity of the image, but also that of the normal derivative, is guaranteed at the boundary. This regularity, which is not shared with Dirichlet or periodic BCs, and only partially shared with reflective BCs, significantly reduces typical ringing artifacts. The key idea is simply to assume that the scene outside the FOV is the anti-reflection of the scene inside the FOV. For example, with a boundary at x_1 = 0 the anti-reflective condition imposes f(−x_1, x_2) − f(x*_1, x_2) = −(f(x_1, x_2) − f(x*_1, x_2)) for any x_2, where x*_1 is the center of the one-dimensional anti-reflection, i.e., f(−x_1, x_2) = 2f(x*_1, x_2) − f(x_1, x_2) for any x_2. In order to preserve a tensorial structure, at the corners a double anti-reflection, first with respect to one axis and then with respect to the other, is considered, so that the BCs impose

f(−x_1, −x_2) = 4f(x*_1, x*_2) − 2f(x*_1, x_2) − 2f(x_1, x*_2) + f(x_1, x_2),

where (x*_1, x*_2) is the center of the two-dimensional anti-reflection. More precisely, by choosing as center of the anti-reflection the first available datum, along the borders the BCs impose

f_{1−i_1,i_2} = 2f_{1,i_2} − f_{i_1+1,i_2}, f_{n_1+i_1,i_2} = 2f_{n_1,i_2} − f_{n_1−i_1,i_2}, i_1 = 1, . . . , q_1, i_2 = 1, . . . , n_2,
f_{i_1,1−i_2} = 2f_{i_1,1} − f_{i_1,i_2+1}, f_{i_1,n_2+i_2} = 2f_{i_1,n_2} − f_{i_1,n_2−i_2}, i_1 = 1, . . . , n_1, i_2 = 1, . . . , q_2.

At the corners, the BCs impose, for any i_1 = 1, . . . , q_1 and i_2 = 1, . . . , q_2,

f_{1−i_1,1−i_2} = 4f_{1,1} − 2f_{1,i_2+1} − 2f_{i_1+1,1} + f_{i_1+1,i_2+1},
f_{1−i_1,n_2+i_2} = 4f_{1,n_2} − 2f_{1,n_2−i_2} − 2f_{i_1+1,n_2} + f_{i_1+1,n_2−i_2},
f_{n_1+i_1,1−i_2} = 4f_{n_1,1} − 2f_{n_1,i_2+1} − 2f_{n_1−i_1,1} + f_{n_1−i_1,i_2+1},
f_{n_1+i_1,n_2+i_2} = 4f_{n_1,n_2} − 2f_{n_1,n_2−i_2} − 2f_{n_1−i_1,n_2} + f_{n_1−i_1,n_2−i_2}.
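Both padding rules are easy to exercise in NumPy; a sketch (sizes illustrative) checking that reflective padding coincides with np.pad's 'symmetric' mode and that anti-reflective padding extends a linear ramp exactly, which illustrates the preserved normal derivative:

```python
import numpy as np

def pad_antireflective(F, q1, q2):
    """Anti-reflective padding f_{1-i} = 2 f_1 - f_{1+i}, applied axis by axis
    (sequential application reproduces the double anti-reflection corner rule)."""
    def pad_axis(X, q):
        top = 2 * X[:1, :] - X[1:q + 1, :][::-1, :]
        bot = 2 * X[-1:, :] - X[-q - 1:-1, :][::-1, :]
        return np.vstack([top, X, bot])
    F = pad_axis(F, q1)
    return pad_axis(F.T, q2).T

q1 = q2 = 2
n1, n2 = 5, 6

# Reflective padding agrees with np.pad(..., mode='symmetric').
F = np.arange(n1 * n2, dtype=float).reshape(n1, n2) ** 2
idx1 = np.r_[np.arange(q1)[::-1], np.arange(n1), np.arange(n1 - 1, n1 - q1 - 1, -1)]
idx2 = np.r_[np.arange(q2)[::-1], np.arange(n2), np.arange(n2 - 1, n2 - q2 - 1, -1)]
print(np.allclose(F[np.ix_(idx1, idx2)], np.pad(F, ((q1, q1), (q2, q2)), mode='symmetric')))

# Anti-reflective padding continues a linear ramp exactly (continuity of the
# image and of its normal derivative at the boundary).
ramp = np.add.outer(np.arange(n1, dtype=float), np.arange(n2, dtype=float))
expected = np.add.outer(np.arange(-q1, n1 + q1, dtype=float),
                        np.arange(-q2, n2 + q2, dtype=float))
print(np.allclose(pad_antireflective(ramp, q1, q2), expected))
```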


As a consequence the rectangular matrix A~ is reduced to a square Toeplitz-plus-Hankel block matrix with Toeplitz-plus-Hankel blocks, plus an additional structured low rank matrix. Moreover, under the assumption of strong symmetry of the PSF and of a mild finite support condition (more precisely h_i = 0 if |i_j| ≥ n − 2 for some j ∈ {1, 2}), the resulting linear system A_n f = g is such that A_n belongs to the AR2D_n commutative matrix algebra [3]. This new algebra shares some properties with the τ (or DST-I) algebra [7]. Going inside the definition, a matrix A_n ∈ AR2D_n has the following block structure

A_n = [ D_0 + Z^[1]                 0^T                     0
        D_1 + Z^[2]                                         0
            ...                                            ...
        D_{q_1−1} + Z^[q_1]                                 0
        D_{q_1}            τ(D_0, . . . , D_{q_1})         D_{q_1}
        0                                          D_{q_1−1} + Z^[q_1]
            ...                                            ...
        0                                            D_1 + Z^[2]
        0^T                          0               D_0 + Z^[1] ],

where τ(D_0, . . . , D_{q_1}) is a block τ matrix with respect to the AR1D blocks D_{i_1}, i_1 = 1, . . . , q_1, and Z^[k] = 2 Σ_{t=k}^{q_1} D_t for k = 1, . . . , q_1. In particular, the AR1D block D_{i_1} is associated to the i_1-th row of the PSF, i.e., h^[1D]_{i_1} = [h_{i_1,i_2}], i_2 = −q_2, . . . , q_2, and it is defined as

D_{i_1} = [ h_{i_1,0} + z^[1]_{i_1}              0^T                        0
            h_{i_1,1} + z^[2]_{i_1}                                         0
                ...                                                        ...
            h_{i_1,q_2−1} + z^[q_2]_{i_1}                                   0
            h_{i_1,q_2}        τ(h_{i_1,0}, . . . , h_{i_1,q_2})       h_{i_1,q_2}
            0                                            h_{i_1,q_2−1} + z^[q_2]_{i_1}
                ...                                                        ...
            0                                              h_{i_1,1} + z^[2]_{i_1}
            0^T                          0                 h_{i_1,0} + z^[1]_{i_1} ],

where z^[k]_{i_1} = 2 Σ_{t=k}^{q_2} h_{i_1,t} for k = 1, . . . , q_2 and τ(h_{i_1,0}, . . . , h_{i_1,q_2}) is the previously defined unilevel τ matrix associated to the one-dimensional PSF h^[1D]_{i_1}. Notice that the rank-one correction given by the elements z^[k]_{i_1} pertains to the contribution of the anti-reflection centers with respect to the vertical borders, while

Truncated decompositions and filtering methods with R/AR BCs


the low rank correction given by the matrices Z^[k] pertains to the contribution of the anti-reflection centers with respect to the horizontal borders. It is evident from the above matrix structure that favorable computational properties are guaranteed also by virtue of the τ structure. Therefore, firstly we recall the relevant properties of the two-level τ algebra [7]. Let

    T_n = {An ∈ R^{N(n)×N(n)}, n = (n1, n2), N(n) = n1 n2 | An = Qn Λn Qn}

be the two-level τ matrix algebra, i.e., the algebra of matrices that are simultaneously diagonalized by the symmetric orthogonal transform Qn = Q_{n1} ⊗ Q_{n2},

    Q_m = [ sqrt(2/(m+1)) sin(stπ/(m+1)) ]_{s,t=1}^{m}.        (10)

With the same notation as in the DCT-III algebra case, the explicit structure of the matrix is two-level Toeplitz-plus-Hankel. More precisely,

    An = Toeplitz(V) - Hankel(σ²(V), Jσ²(V))

with V = [V_0 V_1 ... V_{q1} 0 ... 0], where each V_{i1}, i1 = 1, ..., q1, is the unilevel τ matrix associated with the i1-th row of the PSF mask, i.e., V_{i1} = Toeplitz(v_{i1}) - Hankel(σ²(v_{i1}), Jσ²(v_{i1})) with v_{i1} = [h_{i1,0}, ..., h_{i1,q2}, 0, ..., 0]. Here, we denote by σ² the double shift operator such that σ²(v_{i1}) = [h_{i1,2}, ..., h_{i1,q2}, 0, ..., 0]; at the block level the same operations are intended in a block-wise sense. Once more, the spectral characterization is completely known, since for any An ∈ T_n the related eigenvalues are given by

    λ_s(An) = f(x^[n1]_{s1}, x^[n2]_{s2}),   s = (s1, s2),   x^[m]_r = rπ/(m+1),

where s1 = 1, ..., n1, s2 = 1, ..., n2, and f is the bivariate generating function associated to the PSF defined in (9). As in the DCT-III case, standard operations like matrix-vector products, resolution of linear systems and eigenvalue evaluations can be performed by means of the FST-I within O(n1 n2 log(n1 n2)) ops. For instance, it is enough to apply an FST-I to the first column of An to obtain the eigenvalues

    [Λn]_{(i1,i2)} = [Qn (An e_1)]_{(i1,i2)} / [Qn e_1]_{(i1,i2)},   i1 = 1, ..., n1,  i2 = 1, ..., n2.
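A small numerical check of this eigenvalue-recovery formula, in the unilevel case: build a τ matrix as Q diag(f) Q (so it is a τ matrix by construction) and recover its eigenvalues from its first column alone. In practice Q x is applied via a fast sine transform in O(m log m) ops; here the explicit matrix (10) is used for clarity, and the PSF values are illustrative.

```python
import numpy as np

m, q = 16, 2
h = np.array([0.5, 0.2, 0.05])                 # symmetric 1D PSF [h_0, h_1, h_2]
x = np.arange(1, m + 1) * np.pi / (m + 1)      # grid x_s = s*pi/(m+1)
f = h[0] + 2 * sum(h[t] * np.cos(t * x) for t in range(1, q + 1))  # generating function

s = np.arange(1, m + 1)
Q = np.sqrt(2.0 / (m + 1)) * np.sin(np.outer(s, s) * np.pi / (m + 1))  # DST-I matrix (10)
assert np.allclose(Q @ Q, np.eye(m))           # Q is symmetric orthogonal
A = Q @ np.diag(f) @ Q                         # a tau matrix by construction

e1 = np.eye(m)[:, 0]
lam = (Q @ (A @ e1)) / (Q @ e1)                # eigenvalues from the first column only
assert np.allclose(lam, f)
```

The division is safe because (Q e_1)_s = sqrt(2/(m+1)) sin(sπ/(m+1)) never vanishes for s = 1, ..., m.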

Now, with respect to the AR2D_n matrix algebra, a complete spectral characterization is given in [3, 4]. A really useful fact is the existence of a transform Tn that simultaneously diagonalizes all the matrices belonging to AR2D_n, although the orthogonality property is partially lost.

Theorem 1. [4] Any matrix An ∈ AR2D_n, n = (n1, n2), can be diagonalized by Tn, i.e.,

    An = Tn Λn T̃n,   T̃n = Tn^{-1},

where Tn = T_{n1} ⊗ T_{n2}, T̃n = T̃_{n1} ⊗ T̃_{n2}, with

    Tm = [ α_m^{-1}      0^T       0            ]
         [ α_m^{-1} p    Q_{m-2}   α_m^{-1} Jp  ]
         [ 0             0^T       α_m^{-1}     ]

and

    T̃m = [ α_m          0^T       0            ]
         [ -Q_{m-2} p    Q_{m-2}   -Q_{m-2} Jp  ]
         [ 0             0^T       α_m          ].

The entries of the vector p ∈ R^{m-2} are defined as p_j = 1 - j/(m-1), j = 1, ..., m-2, J ∈ R^{(m-2)×(m-2)} is the flip matrix, and α_m is a normalizing factor chosen such that the Euclidean norms of the first and last columns of Tm equal 1.

Theorem 2. [3] Let An ∈ AR2D_n, n = (n1, n2), be the matrix related to the PSF h_PSF = [h_{i1,i2}]_{i1=-q1,...,q1, i2=-q2,...,q2}. Then, the eigenvalues of An are given by

– 1, with algebraic multiplicity 4;
– the n2 - 2 eigenvalues of the unilevel τ matrix related to the one-dimensional PSF h^{r} = [Σ_{i1=-q1}^{q1} h_{i1,-q2}, ..., Σ_{i1=-q1}^{q1} h_{i1,q2}], each one with algebraic multiplicity 2;
– the n1 - 2 eigenvalues of the unilevel τ matrix related to the one-dimensional PSF h^{c} = [Σ_{i2=-q2}^{q2} h_{-q1,i2}, ..., Σ_{i2=-q2}^{q2} h_{q1,i2}], each one with algebraic multiplicity 2;
– the (n1 - 2)(n2 - 2) eigenvalues of the two-level τ matrix related to the two-dimensional PSF h_PSF.

Notice that the three sets of multiple eigenvalues are exactly related to the type of low rank correction imposed by the BCs through the centers of the anti-reflections. More in detail, the eigenvalues of τ_{n2-2}(h^{r}) and of τ_{n1-2}(h^{c}) take into account the condensed PSF information considered along the horizontal and vertical borders respectively, while the eigenvalue equal to 1 takes into account the condensed information of the whole PSF at the four corners. In addition, it is worth noticing that the spectral characterization can be completely described in terms of the generating function associated to the PSF defined in (9), simply by extending to 0 the standard τ evaluation grid, i.e., it holds

    λ_s(An) = f(x^[n1]_{s1}, x^[n2]_{s2}),   s = (s1, s2),   s_j = 0, ..., n_j,   x^[m]_r = rπ/(m+1),

where the 0-index refers to the first/last columns of the matrix Tm [3]. See [2, 4] for some algorithms related to standard operations like matrix-vector products, resolution of linear systems and eigenvalue evaluations with a computational cost of O(n1 n2 log(n1 n2)) ops.


It is worthwhile stressing that the computational cost of the inverse transform is comparable with that of the direct transform and that, at least at first sight, the real penalty is the loss of orthogonality due to the first/last columns of the matrix Tm.

3  Filtering methods

Owing to the ill-conditioning, the standard solution f = An^{-1} g is not physically meaningful, since it is completely corrupted by the noise propagation from data to solution, i.e., by the so-called inverted noise. For this reason, restoration methods look for an approximate solution with controlled noise levels: widely considered regularization methods are obtained through spectral filtering [14, 16]. Hereafter, we consider the truncated Singular Value Decompositions (SVDs) (or Spectral Decompositions (SDs)) and the Tikhonov (or Re-blurring) regularization method.

3.1  Truncated SVDs and truncated SDs

The Singular Value Decomposition (SVD) highlights a standard perspective for dealing with the inverted noise. More precisely, if

    An = Un Σn Vn^T ∈ R^{N(n)×N(n)}

is the SVD of An, i.e., Un and Vn are orthogonal matrices and Σn is a diagonal matrix with entries σ_1 ≥ σ_2 ≥ ... ≥ σ_{N(n)} ≥ 0, then the solution of the linear system An f = g can be written as

    f = Σ_{k=1}^{N(n)} (u_k^T g / σ_k) v_k,

where u_k and v_k denote the k-th columns of the matrices Un and Vn, respectively. With regard to the image restoration problem, the idea is to consider a sharp filter, i.e., to take in the summation only the terms corresponding to singular values greater than a certain threshold value δ, so damping the effects caused by division by the small singular values. Therefore, by setting the filter factors as

    φ_k = 1, if σ_k > δ;   φ_k = 0, otherwise,

the filtered solution is defined as

    f_filt = Σ_{k=1}^{N(n)} φ_k (u_k^T g / σ_k) v_k = Σ_{k∈I_δ} (u_k^T g / σ_k) v_k,   I_δ = {k | σ_k > δ}.
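As a minimal illustration of the sharp filter above (a dense sketch with assumed names, not the paper's structured implementation, which would use the fast transforms instead of a generic SVD):

```python
import numpy as np

def tsvd_filter(A, g, delta):
    """Truncated-SVD regularized solution: keep only the terms with sigma_k > delta."""
    U, s, Vt = np.linalg.svd(A)
    phi = (s > delta).astype(float)                  # sharp filter factors
    coeffs = phi * (U.T @ g) / np.where(s > delta, s, 1.0)  # safe division
    return Vt.T @ coeffs
```

With delta = 0 and a well-conditioned matrix this reproduces the plain solution; raising delta progressively discards the noise-dominated components.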


Due to the scale of the problem, the SVD of the matrix An is in general an expensive computational task (and not negligible also in the case of a separable PSF). Thus, an "a priori" known spectral decomposition, whenever available, can give rise to a valuable simplification. More precisely, let

    An = Vn Λn Ṽn ∈ R^{N(n)×N(n)},   Ṽn = Vn^{-1},

be a spectral decomposition of An; then the filtered solution is defined as

    f_filt = Σ_{k=1}^{N(n)} φ_k (ṽ_k g / λ_k) v_k = Σ_{k∈I_δ} (ṽ_k g / λ_k) v_k,   I_δ = {k | |λ_k(An)| > δ},

where v_k and ṽ_k denote the k-th column of Vn and the k-th row of Ṽn, respectively, and where φ_k = 1 if k ∈ I_δ, 0 otherwise.

3.2  Tikhonov and re-blurring regularizations

In the classical Tikhonov regularization method, the image filtering is obtained by looking for the solution of the following minimization problem

    min_f ||An f - g||²_2 + µ||Dn f||²_2,        (11)

where µ > 0 is the regularization parameter and Dn is a carefully chosen matrix (typically Dn = In, or Dn represents the discretization of a differential operator, properly adapted with respect to the chosen BCs). The target is to minimize the Euclidean norm of the residual ||An f - g||_2 without explosions with respect to the quantity ||Dn f||_2. As well known, (11) is equivalent to the solution of the damped least square problem

    (An^T An + µ Dn^T Dn) f = An^T g.        (12)

In addition, the Tikhonov regularization method can be reinterpreted in the framework of classical spectral filtering methods. For instance, in the case of Dn = In, by making use of the SVD An = Un Σn Vn^T, the solution of (12) can be rewritten as

    f_filt = Vn Φn Σn^{-1} Un^T g,

where Φn = diag(φ_k) with φ_k = σ_k²/(σ_k² + µ), k = 1, ..., N(n). A severe drawback in adopting the Tikhonov regularization approach in the case of An ∈ AR2D_n is due to the fact that An^T ∉ AR2D_n, so that all the favorable computational properties are substantially spoiled. An alternative approach, named Re-blurring, has been proposed in [11, 9]: the proposal is to replace An^T by An' in (12), where An' is the blurring matrix related to the current BCs with a PSF rotated by 180°. This approach is completely equivalent to (12) in the case of Dirichlet and Periodic BCs, while the novelty concerns both Reflective BCs and Anti-Reflective BCs, where in general An' ≠ An^T. The authors show that the Re-blurring with anti-reflective BCs is computationally convenient and leads to a larger reduction of the ringing effects arising in classical deblurring schemes. From the modelling point of view, the authors' motivation relies upon the fact that Re-blurring smoothes the noise in the right hand side of the system, in the same manner as this happens in the case of Dirichlet, Periodic and Reflective BCs. Hereafter, we consider an explanation of the observed approximation results. As previously claimed, we focus our attention on the case of a strongly symmetric PSF, so that the matrix An' equals the matrix An. Moreover, also in this case it is evident that the linear system

    (An² + µ Dn²) f = An g        (13)

is not equivalent to a minimization problem, again because the matrix An ∈ AR2D_n is not symmetric. Nevertheless, the symmetrization of (13) can be performed by diagonalization, so obtaining

    (Λ²_{A,n} + µ Λ²_{D,n}) f̂ = Λ_{A,n} ĝ,        (14)

where f̂ = T̃n f and ĝ = T̃n g. In such a way (14) is again equivalent to the minimization problem

    min_f ||Λ_{A,n} T̃n f - T̃n g||²_2 + µ||Λ_{D,n} T̃n f||²_2,        (15)

or equivalently, again by making use of the diagonalization result, to

    min_f ||T̃n (An f - g)||²_2 + µ||T̃n Dn f||²_2.        (16)

Clearly, the last formulation in (16) is the most natural one, and it allows us to claim that the Re-blurring method can be interpreted as a standard Tikhonov regularization method in the space transformed by means of T̃n. Recalling that T̃n is not an orthogonal transformation, the goal becomes to compare ||T̃n f||_2 and ||f||_2, that is, to bound ||T̃n||_2 = ||T̃_{n1}||_2 ||T̃_{n2}||_2, being ||T̃n f||_2 ≤ ||T̃n||_2 ||f||_2. A quite sharp estimate of such a norm can be found by exploiting the structure of the unilevel matrix T̃m ∈ R^{m×m}. Let f̄ = [f_2, ..., f_{m-1}]; it holds that

    ||T̃m f||²_2 = α²_m f²_1 + ||Q_{m-2}(-f_1 p + f̄ - f_m Jp)||²_2 + α²_m f²_m
                = α²_m (f²_1 + f²_m) + ||-f_1 p + f̄ - f_m Jp||²_2
                ≤ α²_m (f²_1 + f²_m) + (||f̄||_2 + (|f_1| + |f_m|)||p||_2)²
                ≤ α²_m (f²_1 + f²_m) + ||f̄||²_2 + 3||p||²_2 ||f||²_2 + 4||p||_2 ||f||²_2
                ≤ (1 + 2||p||_2)² ||f||²_2,


since α²_m = 1 + ||p||²_2. Since, by definition, ||p||²_2 ≃ m, we have

    ||T̃m||_2 ≤ 1 + 2||p||_2 ≃ 2√m.        (17)

Notice that the bound given in (17) is quite sharp, since for instance ||T̃m e_1||²_2 equals 1 + 2||p||²_2.
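Under the assumptions of Theorem 1, the bound (17) can be checked numerically; the sketch below (with illustrative names) assembles T̃m from α_m, p and the DST-I matrix Q_{m-2}:

```python
import numpy as np

def tilde_T(m):
    """Build the unilevel inverse transform T~_m of Theorem 1."""
    j = np.arange(1, m - 1)
    p = 1.0 - j / (m - 1.0)                        # p_j = 1 - j/(m-1)
    s = np.arange(1, m - 1)
    Q = np.sqrt(2.0 / (m - 1)) * np.sin(np.outer(s, s) * np.pi / (m - 1))  # Q_{m-2}
    alpha = np.sqrt(1.0 + p @ p)                   # alpha_m^2 = 1 + ||p||^2
    T = np.zeros((m, m))
    T[0, 0] = T[-1, -1] = alpha
    T[1:-1, 0] = -Q @ p
    T[1:-1, -1] = -Q @ p[::-1]                     # -Q_{m-2} J p
    T[1:-1, 1:-1] = Q
    return T, p

m = 64
T, p = tilde_T(m)
assert np.linalg.norm(T, 2) <= 1.0 + 2.0 * np.linalg.norm(p) + 1e-9   # bound (17)
v = T[:, 0]                                        # T~_m e_1
assert np.isclose(v @ v, 1.0 + 2.0 * (p @ p))      # sharpness: 1 + 2||p||^2
```

The first assertion verifies ||T̃m||_2 ≤ 1 + 2||p||_2, and the second the sharpness remark on ||T̃m e_1||²_2.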

4  Cross-channel blurring

Hereafter, we extend the analysis of the deblurring problem to the case of color images digitalized, for instance, according to the standard RGB system. Several techniques can be used for recording color images, but the main problem concerns the fact that light from one color channel can end up on a pixel assigned to another color. The consequence of this phenomenon is called cross-channel blurring among the three channels of the image, and it adds to the previously analyzed blurring of each one of the three colors, named within-channel blurring. By assuming that the cross-channel blurring takes place after the within-channel blurring of the image, that it is spatially invariant, and that the same within-channel blurring occurs in all the three color channels, the problem can be modelled [16] as

    (A_color ⊗ An) f = g - η        (18)

with An ∈ R^{N(n)×N(n)}, n = (n1, n2), N(n) = n1 n2, and

              [ a_rr  a_rg  a_rb ]
    A_color = [ a_gr  a_gg  a_gb ].
              [ a_br  a_bg  a_bb ]

The row entries denote the amount of within-channel blurring pertaining to each color channel; a normalized conservation law prescribes that A_color e = e, e = [1 1 1]^T. Lastly, the vectors f, g, η ∈ R^{3N(n)} are assumed to collect the three color channels in the RGB order. Clearly, if A_color = I_3, i.e., the blurring is only of within-channel type, the problem is simply decoupled into three independent gray-scale deblurring problems. In the general case, taking into account the tensorial structure of the whole blurring matrix A_color ⊗ An, it is evident that the truncated SVDs and SDs can be formulated as the natural extension of those considered in the within-blurring case. Notice that in the case of SDs, we will consider an SVD for the matrix A_color, since it naturally assures an orthogonal decomposition, no matter the specific matrix, while its computational cost is negligible with respect to the scale of the problem. In addition, we tune the filtering strategy with respect to the spectral information given only by the matrix An, i.e., for any fixed σ_k (or λ_k) we simultaneously sum, or discard, the three contributions on f related to the three singular values of A_color. With respect to the Tikhonov regularization method, the approach is a bit more involved. Under the assumption An = An^T = Vn Λn Ṽn, the damped least square problem

    [(A_color ⊗ An)^T (A_color ⊗ An) + µ I_{3n}] f = (A_color ⊗ An)^T g

can be rewritten as

    [(A_color^T A_color) ⊗ Vn Λn² Ṽn + µ(I_3 ⊗ I_n)] f = (A_color ⊗ Vn Λn Ṽn)^T g.        (19)

Thus, by setting S_3n = I_3 ⊗ Ṽn, f̂ = S_3n f, ĝ = S_3n g, (19) can be transformed into

    S_3n [(A_color^T A_color) ⊗ Vn Λn² Ṽn + µ(I_3 ⊗ I_n)] S_3n^{-1} f̂ = S_3n (A_color ⊗ Vn Λn Ṽn)^T S_3n^{-1} ĝ,

so obtaining the linear system

    [(A_color^T A_color) ⊗ Λn² + µ(I_3 ⊗ I_n)] f̂ = (A_color^T ⊗ Λn) ĝ,

that can easily be decoupled into n1 n2 linear systems of dimension 3. Clearly, in the case of any matrix An ∈ Cn, all these manipulations can be performed by means of an orthogonal transformation S_3n. Notice also that the computational cost is always O(n1 n2 log(n1 n2)) ops. With respect to An = Tn Λn T̃n ∈ AR2D_n, we can consider the same strategy by referring to the Re-blurring regularization method. More precisely, the linear system

    [(A_color^T A_color) ⊗ An² + µ(I_3 ⊗ I_n)] f = (A_color^T ⊗ An) g

can be transformed into

    [(A_color^T A_color) ⊗ Λn² + µ(I_3 ⊗ I_n)] f̂ = (A_color^T ⊗ Λn) ĝ.

Though the transformation S_3n = I_3 ⊗ T̃n is not orthogonal as in the Reflective case, the obtained restored images are fully comparable with the previous ones and the computational cost is still O(n1 n2 log(n1 n2)) ops.

5  Numerical tests

5.1  Some computational issues

Before analyzing the image restoration results, we discuss how the methods can work without reshaping the involved data. In fact, the tensorial structure of the matrices obtained by considering Reflective and Anti-Reflective BCs can be exploited in depth, so that the algorithms can deal directly, and more naturally, with the data collected in matrix form. Hereafter, we consider a two-index notation in the sense of the previously adopted row-wise ordering.


In the SD case considered in Section 3.1, since ṽ_k = ṽ^[n1]_{k1} ⊗ ṽ^[n2]_{k2} is represented in matrix form as (ṽ^[n1]_{k1})^T ṽ^[n2]_{k2}, the required scalar product can be computed as

    ṽ_k g = ((ṽ^[n1]_{k1})^T ṽ^[n2]_{k2}) ⊙ G,

where ⊙ denotes the summation of all the involved terms after an element-wise product. Clearly, v_k = v^[n1]_{k1} ⊗ v^[n2]_{k2} is represented in matrix form as v^[n1]_{k1} (v^[n2]_{k2})^T. In a similar manner, in the case of the SVD of An with separable PSF h = h1 ⊗ h2, we can represent v_k = v^[n1]_{k1} ⊗ v^[n2]_{k2} in matrix form as v^[n1]_{k1} (v^[n2]_{k2})^T and u_k^T = (u^[n1]_{k1} ⊗ u^[n2]_{k2})^T as u^[n1]_{k1} (u^[n2]_{k2})^T. The eigenvalues required for the SD can be stored into a matrix Λ* ∈ R^{n1×n2}. In the case of An ∈ Cn this matrix can be evaluated as

    Λ* = (Ṽ_{n1} A*^T Ṽ_{n2}^T) ./ (Ṽ_{n1} E_1*^T Ṽ_{n2}^T),

where A* ∈ R^{n2×n1} denotes the first column of An and E_1* the first canonical basis vector, reshaped as matrices in column-wise order. In addition, the two-level direct and inverse transforms y = Vn x and y = Ṽn x can be directly evaluated on matrix data as

    Y = V_{n1} X V_{n2}^T = (V_{n2} (V_{n1} X)^T)^T   and   Y = Ṽ_{n1} X Ṽ_{n2}^T = (Ṽ_{n2} (Ṽ_{n1} X)^T)^T,

by referring to the corresponding unilevel transforms. In the same way, the eigenvalues required in the case of An ∈ AR2D_n can be suitably stored as

    Λ* = [ 1                     Λ*(τ_{n2-2}(h^{r}))   1                   ]
         [ Λ*(τ_{n1-2}(h^{c}))   Λ*(τ_{n-2}(h))        Λ*(τ_{n1-2}(h^{c})) ]  ∈ R^{n1×n2},
         [ 1                     Λ*(τ_{n2-2}(h^{r}))   1                   ]

with reference to the notations of Theorem 2, where the eigenvalues of the unilevel and two-level τ matrices are evaluated as outlined in Section 2.2. Lastly, the linear systems obtained, for any fixed µ, in the case of the Tikhonov and Re-blurring regularization methods can be solved, with reference to the matrix Φn of the corresponding filter factors, by applying the Reflective and Anti-Reflective transforms with a computational cost of O(n1 n2 log(n1 n2)) ops.
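The identity behind evaluating a two-level transform directly on matrix data (with row-wise ordering) can be verified on its own; nothing here is specific to the τ or AR transforms, and the matrices below are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2 = 4, 5
V1 = rng.standard_normal((n1, n1))
V2 = rng.standard_normal((n2, n2))
X = rng.standard_normal((n1, n2))

# y = (V1 kron V2) x with x = vec(X) in row-wise ordering equals vec(V1 X V2^T)
y = np.kron(V1, V2) @ X.ravel()            # ravel() gives the row-wise vec
assert np.allclose(y.reshape(n1, n2), V1 @ X @ V2.T)
```

Thus a two-level transform never needs the N(n) x N(n) Kronecker matrix: two unilevel transforms applied to the rows and columns of X suffice.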

5.2  Truncated decompositions

In this section we compare the effectiveness of truncated spectral decompositions (SDs) with respect to the standard truncated SVDs both in the case of


Reflective and Anti-Reflective BCs. Due to the scale of the problem, the SVD of the matrix An is in general an expensive computational task (and not negligible also in the case of a separable PSF). Thus, a spectral decomposition, whenever available as in these cases, leads to a valuable simplification. Firstly, we consider the case of the separable PSF caused by atmospheric turbulence

    h_{i1,i2} = (1/(2π σ_{i1} σ_{i2})) exp(-(1/2)(i1/σ_{i1})² - (1/2)(i2/σ_{i2})²),

where σ_{i1} and σ_{i2} determine the width of the PSF itself. Since the Gaussian function decays exponentially away from its center, it is customary to truncate the values in the PSF mask after an assigned decay, i.e., |i1|, |i2| ≤ l. It is evident from the quoted definition that the Gaussian PSF satisfies the strong symmetry condition (7). Another example of strongly symmetric PSF is given by the PSF representing the out-of-focus blur

    h_{i1,i2} = 1/(πr²), if i1² + i2² ≤ r²;   h_{i1,i2} = 0, otherwise,

where r is the radius of the PSF. In the reported numerical tests, the blurred image g has been perturbed by adding a Gaussian noise contribution η = η_n ν, with ν a fixed noise vector, η_n = ρ||g||_2/||ν||_2, and ρ an assigned value. In such a way the Signal-to-Noise Ratio (SNR) [5] is given by

    SNR = 20 log10 (||g||_2 / ||η||_2) = 20 log10 ρ^{-1} (dB).
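The two masks can be generated as in the following sketch (function names are assumptions), including a check of the strong symmetry condition (7) and of the normalization:

```python
import numpy as np

def gaussian_psf(q, sigma):
    """Truncated Gaussian PSF on a (2q+1) x (2q+1) support, normalized to sum 1."""
    i = np.arange(-q, q + 1)
    h = np.exp(-0.5 * (i / sigma) ** 2)
    H = np.outer(h, h)                    # separable: h1 kron h2
    return H / H.sum()

def outoffocus_psf(q, r):
    """Out-of-focus (disk) PSF of radius r, normalized to sum 1."""
    i = np.arange(-q, q + 1)
    H = ((i[:, None] ** 2 + i[None, :] ** 2) <= r * r).astype(float)
    return H / H.sum()

H = gaussian_psf(7, 2.0)                  # 15 x 15 support, sigma_{i1} = sigma_{i2} = 2
assert np.allclose(H, H[::-1, :]) and np.allclose(H, H[:, ::-1])  # strong symmetry
assert np.isclose(H.sum(), 1.0)
```

Normalizing the mask to sum 1 matches the conservation assumption under which the corner eigenvalue of Theorem 2 equals 1.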

5.2.1  Gray-scale images

In Figure 1 we report the template true image (the FOV is delimited by a white frame), together with the blurred image with the Gaussian PSF with support 15 × 15 and σ_{i1} = σ_{i2} = 2 and the reference perturbation ν, reshaped in matrix form. We consider the optimal image restoration with respect to the relative restoration error (RRE), i.e., ||f_filt - f_true||_2 / ||f_true||_2, where f_filt is the computed approximation of the true image f_true obtained by spectral filtering. More in detail, the RRE is analyzed by progressively adding a new basis element at a time, according to the non-increasing order of the singular/eigen-values (the eigenvalues are ordered with respect to their absolute value). In the case of SDs (or SVDs related to a separable PSF) this can be done as described in Section 5.1 and, besides the preliminary cost related to the computation of the decomposition, the addition of a new term has a computational cost equal to 4 n1 n2 ops. The algorithm proposed in [4], which makes use of the Anti-Reflective direct and inverse transforms, is less expensive in the case of tests with few threshold values. Hereafter, the aim is to compare the truncated SVD with the truncated SD

Fig. 1. True image (FOV is delimited by a white frame), reference noise perturbation, blurred image with the Gaussian PSF with support 15 × 15 and σ_{i1} = σ_{i2} = 2, and blurred image with the Out-of-Focus PSF with support 15 × 15.

Table 1. Optimal RREs of truncated SVD and SD with reference to the true image in Figure 1 (Gaussian blur σ_{i1} = σ_{i2} = 2).

                 Reflective BCs                           Anti-Reflective BCs
          5x5       11x11     15x15     21x21      5x5       11x11     15x15     21x21
ρ=0
  SVD   0.059164  0.087402  0.090742  0.093856   0.039165  0.064081  0.086621  0.087237
  SD    0.043754  0.087400  0.090746  0.093867   0.038316  0.063114  0.083043  0.083521
ρ=0.001
  SVD   0.060278  0.091964  0.094468  0.097034   0.062182  0.094237  0.098897  0.10042
  SD    0.060278  0.091964  0.094476  0.097034   0.059617  0.089105  0.092814  0.094343
ρ=0.01
  SVD   0.091151  0.11214   0.11307   0.11495    0.096049  0.12231   0.12403   0.12536
  SD    0.091152  0.11214   0.11307   0.11495    0.091383  0.11230   0.11343   0.11495
ρ=0.05
  SVD   0.11635   0.13356   0.13508   0.13739    0.12791   0.15070   0.15188   0.15492
  SD    0.11635   0.13356   0.13510   0.13739    0.11666   0.13414   0.13570   0.13816
ρ=0.1
  SVD   0.13024   0.14607   0.14746   0.15047    0.14399   0.16756   0.16964   0.17225
  SD    0.13024   0.14607   0.14746   0.15047    0.13083   0.14709   0.14852   0.15162

restorations both in the case of Reflective and Anti-Reflective BCs. Periodic BCs are not analyzed here, since Reflective and Anti-Reflective BCs give better performances with respect to the approximation of the image at the boundary. In Tables 1 and 2 we report the results obtained by varying the dimension of the PSF support, the parameter ρ related to the amount of the noise perturbation and the variance of the considered Gaussian blur. As expected, the optimal RRE worsens as the parameter ρ increases, and the Anti-Reflective BCs show better performances in the case of low noise levels. In fact, for low ρ values the reduction of ringing artifacts is significant, while the quality of the restoration for higher ρ values is essentially driven by the goal of noise filtering. Therefore, in such a case the choice of the BCs becomes more and more meaningless, since it is not able to influence the image restoration quality. Some examples of restored images are reported in Figure 2. More impressive is the fact that SDs give better, or equal, results with respect

Table 2. Optimal RREs of truncated SVD and SD with reference to the true image in Figure 1 (Gaussian blur σ_{i1} = σ_{i2} = 5).

                 Reflective BCs                           Anti-Reflective BCs
          5x5       11x11     15x15     21x21      5x5       11x11     15x15     21x21
ρ=0
  SVD   0.063387  0.081274  0.097351  0.14634    0.040214  0.079543  0.088224  0.13686
  SD    0.045365  0.081274  0.096387  0.14634    0.039437  0.078970  0.088832  0.13129
ρ=0.001
  SVD   0.063915  0.096243  0.11449   0.15217    0.068197  0.095808  0.11522   0.15767
  SD    0.063915  0.096274  0.11449   0.15217    0.063575  0.093247  0.1127    0.14893
ρ=0.01
  SVD   0.089032  0.13343   0.14947   0.17397    0.09412   0.14482   0.16825   0.21148
  SD    0.089032  0.13343   0.14946   0.17397    0.089038  0.13611   0.15270   0.17446
ρ=0.05
  SVD   0.12203   0.16002   0.17339   0.18335    0.13553   0.18563   0.21006   0.22962
  SD    0.12203   0.16002   0.17339   0.18335    0.12253   0.16269   0.17439   0.18414
ρ=0.1
  SVD   0.13412   0.16793   0.17963   0.19057    0.15010   0.20164   0.22256   0.23960
  SD    0.13412   0.16793   0.17963   0.19057    0.13487   0.16916   0.18088   0.19218

to those obtained by considering SVDs. This numerical evidence is really interesting in the case of Anti-Reflective BCs: despite the loss of the orthogonality property in the spectral decomposition, the restoration results are better than those obtained by considering the SVD. Moreover, the observed trend with respect to the Reflective BCs is also conserved. A further analysis refers to the so-called Picard plots (see Figure 3), where the coefficients |u_k^T g|, or |ṽ_k g|, (black dots) are compared with the singular values σ_k, or the absolute values of the eigenvalues |λ_k|, (red line). As expected, initially these coefficients decrease faster than σ_k, or |λ_k|, while afterwards they level off at a plateau determined by the level of the noise in the image. The threshold of this change of behavior is in good agreement with the optimal k value obtained in the numerical tests by monitoring the RRE. Moreover, notice that the Picard plots related to the SDs are quite in agreement with those corresponding to SVDs. In the case of the Anti-Reflective SD we observe an increasing data dispersion with respect to the plateau, but the correspondence between the threshold and the chosen optimal k is still preserved. The computational relevance of this result is due to the significantly lower computational cost required by the Anti-Reflective SDs with respect to the corresponding SVDs.
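The optimal-k search used in these tests (adding one spectral term at a time and monitoring the error against the known true image) can be sketched, for the dense SVD case, as follows; `optimal_rre` is an assumed name, not the paper's code:

```python
import numpy as np

def optimal_rre(A, g, f_true):
    """Add spectral terms in non-increasing order of the singular values and
    return the truncation index k minimizing the relative restoration error."""
    U, s, Vt = np.linalg.svd(A)               # s is already non-increasing
    f = np.zeros_like(f_true)
    best_err, best_k = np.inf, 0
    for k in range(len(s)):
        f = f + (U[:, k] @ g / s[k]) * Vt[k, :]    # add the k-th term
        err = np.linalg.norm(f - f_true) / np.linalg.norm(f_true)
        if err < best_err:
            best_err, best_k = err, k + 1
    return best_k, best_err
```

Each added term costs O(N(n)) here; for the structured SDs of Section 5.1 the analogous update costs 4 n1 n2 ops, as stated in the text.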

Fig. 2. Optimal restorations of truncated SVD and SD in the case of Reflective and Anti-Reflective BCs with reference to Figure 1 (Gaussian blur σ_{i1} = σ_{i2} = 2). Panels: R-TSVD, R-TSD, AR-TSVD, AR-TSD, for ρ = 0.01 and ρ = 0.05.

Lastly, Table 3 reports the spectral filtering results obtained in the case of Out-of-Focus blur by varying the dimension of the PSF support and the parameter ρ related to the noise perturbation. The RRE follows the same trend observed in the case of Gaussian blur. Other image restoration tests with different gray-scale images have been considered in [20]. A more interesting remark again pertains to the computational cost. Since the Out-of-Focus PSF is not separable, but the transforms are, the use of SDs related to Reflective or Anti-Reflective BCs allows one to exploit the tensorial nature of the corresponding transforms, both with respect to the computation of the eigenvalues and of the eigenvectors (or of the Reflective and Anti-Reflective transforms).

Fig. 3. Picard plot of truncated SVD and SD in the case of Reflective and Anti-Reflective BCs with reference to Figure 1 (Gaussian blur σ_{i1} = σ_{i2} = 2). Panel titles: for ρ = 0.01, R-SVD (k_ott = 7681), R-SD (k_ott = 7678), AR-SVD (k_ott = 8329), AR-SD (k_ott = 7816); for ρ = 0.05, R-SVD (k_ott = 4653), R-SD (k_ott = 4654), AR-SVD (k_ott = 5415), AR-SD (k_ott = 4710).

Table 3. Optimal RREs of truncated SDs with reference to the true image in Figure 1 (Out-of-Focus blur).

                 Reflective BCs                           Anti-Reflective BCs
          5x5       11x11     15x15     21x21      5x5       11x11     15x15     21x21
ρ=0     0.072593  0.084604  0.088323  0.096479   0.072821  0.085366  0.091252  0.099293
ρ=0.001 0.072671  0.085809  0.091035  0.10436    0.072904  0.086643  0.093929  0.10752
ρ=0.01  0.080016  0.12255   0.13569   0.15276    0.080427  0.12316   0.13803   0.15683
ρ=0.05  0.10645   0.15365   0.16810   0.18777    0.10685   0.15571   0.17147   0.19172
ρ=0.1   0.12089   0.16314   0.17836   0.20471    0.12147   0.16482   0.17987   0.20829

5.2.2  Color images in the case of cross-channel blurring

Here, we analyze some restoration tests in the case of the template color image reported in Figure 4, by assuming the presence of a cross-channel blurring phenomenon modelled as in (18). The entity of this mixing effect is chosen according to the matrix

              [ 0.7   0.2   0.1  ]
    A_color = [ 0.25  0.5   0.25 ].        (20)
              [ 0.15  0.1   0.75 ]

Fig. 4. True image (FOV is delimited by a white frame) and cross-channel blurred image with the Gaussian PSF with support 15 × 15 and σ_{i1} = σ_{i2} = 2 and matrix A_color in (20).

In Figure 4 the cross-channel blurred image with the Gaussian PSF with support 15 × 15 and σ_{i1} = σ_{i2} = 2 is also reported. Notice that the entity of the cross-channel blurring is not negligible, since the whole image turns out to be darkened and the color intensities of the additive RGB system are substantially altered. Table 4 reports the optimal RREs of truncated SVDs and SDs obtained by varying the dimension of the Gaussian PSF support and the parameter ρ related to the amount of the noise perturbation. It is worth stressing that we tune the filtering strategy with respect to the spectral information given just by the matrix An, i.e., for any fixed σ_k (or λ_k) we simultaneously sum, or discard, the three contributions on f related to the three singular values of A_color. In fact, the magnitude of the singular values of the considered matrix A_color does not differ enough to dramatically change the filtering information given just by An. Nevertheless, also the comparison with the restoration results obtained by considering a global ordering justifies this approach. The color case behaves as the gray-scale one: as expected, the optimal RRE becomes worse as the parameter ρ increases and the Anti-Reflective SD shows better performances in the case of low noise levels. In addition, by referring to Figure 5, we note that the truncated SVD in the

Table 4. Optimal RREs of truncated SVD and SD with reference to the true image in Figure 4 (Cross-channel and Gaussian blur σ_{i1} = σ_{i2} = 2).

ontribution on f related to the three singular values of A olor . In fa t, the magnitude of singular values of the onsidered matrix A olor does not dier enough to dramati ally hange the ltering information given just by An . Nevertheless, also the omparison with the restoration results obtained by onsidering a global ordering justi es this approa h. The olor ase behaves as the gray-s ale one: as expe ted the optimal RRE be omes worse as the parameter ρ in reases and the Anti-Re e tive SD shows better performan es in the ase of low noise levels. In addition, by referring to Figure 5, we note that the trun ated SVD in the Table 4. Optimal RREs of trun ated SVD and SD with referen e to the true image in Figure 4 (Cross- hannel and Gaussian Blur σi1 = σi2 = 2). PSF SVD SD SVD SD SVD SD SVD SD SVD SD

Refle tive BCs 11x11 15x15 ρ =0 0.078276 0.12114 0.11654 0.078276 0.12114 0.11654 ρ =0.001 0.078992 0.1212 0.11663 0.078992 0.12119 0.11663 ρ =0.01 0.10152 0.12396 0.12088 0.10152 0.12396 0.12088 ρ =0.05 0.12102 0.13853 0.13743 0.12102 0.13853 0.13743 ρ =0.1 0.13437 0.14898 0.14854 0.13437 0.14898 0.14854 5x5

21x21

PSF

0.1178 0.1178

SVD SD

0.11792 0.11792

SVD SD

0.12198 0.12198

SVD SD

0.13844 0.13844

SVD SD

0.14947 0.14947

SVD SD

Anti-Refle tive BCs 5x5 11x11 15x15 ρ =0 0.076646 0.1006 0.1098 0.074953 0.098474 0.10508 ρ =0.001 0.077394 0.10639 0.1111 0.075727 0.10233 0.10612 ρ =0.01 0.10431 0.12695 0.12624 0.10087 0.11737 0.11805 ρ =0.05 0.13017 0.15075 0.15063 0.12127 0.13699 0.13756 ρ =0.1 0.1456 0.16516 0.16626 0.13507 0.14796 0.14955

21x21 0.10646 0.10216 0.11002 0.10443 0.12779 0.118 0.15166 0.13795 0.16647 0.15018

case of Anti-Reflective BCs shows a few more 'freckles' than the corresponding truncated SVD in the case of Reflective BCs. Nevertheless, for low noise levels, it is just the Anti-Reflective SD that exhibits fewer 'freckles' than the Reflective SD.

5.3  Tikhonov and re-blurring regularizations

By considering a Gaussian blurring of the true image reported in Figure 1, Table 5 compares the optimal RRE obtained in the case of the Tikhonov method for Reflective BCs and of the Re-blurring method for Anti-Reflective BCs. In addition, in Table 6, the same comparison refers to the case of the Out-of-Focus PSF. As expected, the RRE deteriorates as the noise level or the dimension of the PSF support increases. Notice also that the gap between the Reflective and Anti-Reflective BCs is reduced also for low noise levels. Further numerical tests can be found in [9, 2]. Lastly, we focus our attention on the case of the color image in Figure 4. The

Fig. 5. Optimal restorations of truncated SVD and SD in the case of Reflective and Anti-Reflective BCs with reference to Figure 4 (Cross-channel and Gaussian blur σ_{i1} = σ_{i2} = 2). Panels: R-TSVD, R-TSD, AR-TSVD, AR-TSD, for ρ = 0.01 and ρ = 0.05.

Trun ated de ompositions and ltering methods with R/AR BCs

405

Table 5. Optimal RREs of Tikhonov (R) and Re-blurring (AR) methods and corresponding µ_opt (each cell: RRE / µ_opt) with reference to the true image in Figure 1 (Gaussian blur, σ1 = σ2 = 2).

            5x5                  11x11                15x15                 21x21
ρ=0     R   0.041015 / 4.1e-005  0.079044 / 9e-006    0.086386 / 1.1e-005   0.089556 / 1.6e-005
        AR  0.034237 / 1.1e-005  0.059465 / 1e-006    0.078963 / 1e-006     0.079805 / 1e-006
ρ=0.001 R   0.050155 / 0.000188  0.087482 / 5.7e-005  0.090825 / 4.3e-005   0.093071 / 4.9e-005
        AR  0.048556 / 0.000163  0.085279 / 4.6e-005  0.089388 / 3.3e-005   0.090821 / 3.3e-005
ρ=0.01  R   0.083456 / 0.005555  0.10748 / 0.001786   0.10863 / 0.001678    0.11023 / 0.001573
        AR  0.083436 / 0.005536  0.10744 / 0.001792   0.10868 / 0.001691    0.11019 / 0.001575
ρ=0.05  R   0.12024 / 0.038152   0.12982 / 0.01929    0.13071 / 0.018417    0.13307 / 0.017892
        AR  0.12049 / 0.038379   0.13006 / 0.01957    0.13096 / 0.018669    0.1333 / 0.018105
ρ=0.1   R   0.14767 / 0.06587    0.14721 / 0.039231   0.14822 / 0.038181    0.15097 / 0.037893
        AR  0.14813 / 0.066251   0.14766 / 0.039707   0.14866 / 0.038644    0.15144 / 0.038296

Table 6. Optimal RREs of Tikhonov (R) and Re-blurring (AR) methods and corresponding µ_opt (each cell: RRE / µ_opt) with reference to the true image in Figure 1 (Out-of-Focus blur).

            5x5                  11x11                15x15                 21x21
ρ=0     R   0.031422 / 0.000172  0.05346 / 6.9e-005   0.060954 / 3.5e-005   0.074785 / 2.7e-005
        AR  0.036213 / 0.000302  0.051236 / 6.8e-005  0.06683 / 5.7e-005    0.084482 / 5.8e-005
ρ=0.001 R   0.034441 / 0.000271  0.061465 / 0.000145  0.073751 / 0.000101   0.09074 / 7.9e-005
        AR  0.038313 / 0.000402  0.059957 / 0.000138  0.076695 / 0.000126   0.095274 / 0.000106
ρ=0.01  R   0.069647 / 0.008493  0.11361 / 0.004117   0.12881 / 0.003037    0.14914 / 0.001873
        AR  0.070384 / 0.008923  0.11404 / 0.00422    0.12982 / 0.003139    0.15061 / 0.001969
ρ=0.05  R   0.12204 / 0.053687   0.1532 / 0.030719    0.16614 / 0.022121    0.18769 / 0.01346
        AR  0.12256 / 0.05423    0.15402 / 0.031574   0.16739 / 0.023213    0.18933 / 0.014472
ρ=0.1   R   0.16366 / 0.092379   0.17357 / 0.055919   0.1829 / 0.042944     0.20323 / 0.028803
        AR  0.16433 / 0.093069   0.17485 / 0.057326   0.18457 / 0.044901    0.20511 / 0.031011

Table 7. Optimal RREs of Tikhonov (R) and Re-blurring (AR) methods and corresponding µ_opt (each cell: RRE / µ_opt) with reference to the true image in Figure 4 (cross-channel and Gaussian blur, σ1 = σ2 = 2).

            5x5                  11x11                15x15                 21x21
ρ=0     R   0.069148 / 0.000203  0.11508 / 0.001204   0.1123 / 0.000717     0.11335 / 0.000726
        AR  0.062854 / 0.000102  0.091232 / 7e-006    0.1014 / 4.4e-005     0.098266 / 1.5e-005
ρ=0.001 R   0.071259 / 0.000312  0.11515 / 0.001228   0.11239 / 0.000744    0.11347 / 0.000755
        AR  0.066734 / 0.000209  0.098658 / 5.8e-005  0.10276 / 7.7e-005    0.10111 / 4.5e-005
ρ=0.01  R   0.094871 / 0.004975  0.11896 / 0.002919   0.11712 / 0.002421    0.1182 / 0.002459
        AR  0.094458 / 0.004841  0.1144 / 0.001884    0.11481 / 0.00184     0.11507 / 0.001755
ρ=0.05  R   0.13209 / 0.029798   0.13662 / 0.015305   0.13599 / 0.014896    0.13669 / 0.014824
        AR  0.13239 / 0.029944   0.13561 / 0.014992   0.13593 / 0.014772    0.13611 / 0.014595
ρ=0.1   R   0.16281 / 0.051315   0.15543 / 0.029068   0.15547 / 0.028822    0.15586 / 0.02868
        AR  0.16341 / 0.051659   0.15526 / 0.029213   0.15602 / 0.029029    0.15588 / 0.02872

image restorations have been obtained by considering the transformation procedure outlined at the end of Section 4. Although the RREs in Table 7 are larger than in the gray-scale case, the perceived quality of the image restoration is very satisfying, and slightly fewer 'freckles' than in the corresponding SDs and SVDs are observed (see Figure 6). Notice also that the lack of orthogonality in the S3n transform related to the Anti-Reflective BCs does not deteriorate the performances of the restoration.
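The filtering mechanism compared above can be reproduced in miniature. The sketch below is a hedged one-dimensional analogue, not the paper's Reflective/Anti-Reflective algebra: it builds a generic Gaussian blurring matrix (the size, noise level and the parameters k and µ are all illustrative choices) and contrasts the truncated-SVD filter (filter factors φ_i ∈ {0, 1}) with the Tikhonov filter factors φ_i = s_i²/(s_i² + µ²):

```python
import numpy as np

def svd_filter_restore(A, b, method="tsvd", k=None, mu=None):
    # Filtered solution x = sum_i phi_i * (u_i^T b / s_i) * v_i
    U, s, Vt = np.linalg.svd(A)
    coeffs = U.T @ b
    if method == "tsvd":
        phi = np.zeros_like(s)
        phi[:k] = 1.0                      # keep the k largest singular values
    else:                                  # Tikhonov filter factors
        phi = s**2 / (s**2 + mu**2)
    return Vt.T @ (phi * coeffs / s)

rng = np.random.default_rng(0)
n = 32
d = np.arange(n)
A = np.exp(-0.5 * ((d[:, None] - d[None, :]) / 2.0) ** 2)   # Gaussian blur
A /= A.sum(axis=1, keepdims=True)
x_true = np.sin(np.linspace(0.0, np.pi, n))
b = A @ x_true + 1e-3 * rng.standard_normal(n)              # noisy data

rre = lambda x: np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
x_tsvd = svd_filter_restore(A, b, "tsvd", k=8)
x_tik = svd_filter_restore(A, b, "tikhonov", mu=1e-2)
print(rre(x_tsvd), rre(x_tik))   # both small; the unfiltered solve is not
```

In the structured case treated in the paper, the role of the SVD is played by the Reflective/Anti-Reflective spectral decompositions, applied by fast trigonometric transforms at a substantially lower cost.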

6 Conclusions

In this paper we have analyzed and compared SD and SVD filtering methods in the case of both Reflective and Anti-Reflective BCs. Numerical evidence is given of the good performances achievable through SDs, at a substantially lower computational cost with respect to SVDs. In addition, the tensorial structure of the Reflective and Anti-Reflective SDs can be exploited in depth also in the case of non-separable PSFs. Special mention must be made of the fact that the loss of orthogonality of the Anti-Reflective transform does not seem to have any consequence on the trend of the image restoration results. The analysis in the case of cross-channel blurring in color images confirms the quoted considerations. Finally, the Re-blurring regularizing method has been re-interpreted as a standard Tikhonov regularization method in the space transformed by means of Ten. Some numerical tests highlight the image restoration performances, also in the case of cross-channel blurring. Future work will concern the analysis of effective strategies for properly choosing the optimal regularizing parameters in the Anti-Reflective BCs case.

References

1. B. Anconelli, M. Bertero, P. Boccacci, M. Carbillet, and H. Lanteri, Reduction of boundary effects in multiple image deconvolution with an application to LBT LINC-NIRVANA, Astron. Astrophys., 448 (2006), pp. 1217–1224.
2. A. Aricò, M. Donatelli, and S. Serra-Capizzano, The Anti-Reflective Algebra: Structural and Computational Analyses with Application to Image Deblurring and Denoising, Calcolo, 45–3 (2008), pp. 149–175.
3. A. Aricò, M. Donatelli, and S. Serra-Capizzano, Spectral analysis of the anti-reflective algebra, Linear Algebra Appl., 428 (2008), pp. 657–675.
4. A. Aricò, M. Donatelli, J. Nagy, and S. Serra-Capizzano, The anti-reflective transform and regularization by filtering, in Numerical Linear Algebra in Signals, Systems, and Control, Lecture Notes in Electrical Engineering, edited by S. Bhattacharyya, R. Chan, V. Olshevsky, A. Routray, and P. Van Dooren, Springer Verlag, in press.


Fig. 6. Optimal RREs of Tikhonov and Re-blurring methods with reference to the true image in Figure 4 (cross-channel and Gaussian blur, σ1 = σ2 = 2). Panels (for ρ = 0.01 and ρ = 0.05): R−Tikhonov, AR−Re-blurring.


5. M. Bertero and P. Boccacci, Introduction to Inverse Problems in Imaging, Inst. of Physics Publ., London, UK, 1998.
6. M. Bertero and P. Boccacci, Image restoration for the Large Binocular Telescope (LBT), Astron. Astrophys. Suppl. Ser., 147 (2000), pp. 323–332.
7. D. Bini and M. Capovani, Spectral and computational properties of band symmetric Toeplitz matrices, Linear Algebra Appl., 52/53 (1983), pp. 99–125.
8. P. J. Davis, Circulant Matrices, Wiley, New York, 1979.
9. M. Donatelli, C. Estatico, A. Martinelli, and S. Serra-Capizzano, Improved image deblurring with anti-reflective boundary conditions and re-blurring, Inverse Problems, 22 (2006), pp. 2035–2053.
10. M. Donatelli, C. Estatico, J. Nagy, L. Perrone, and S. Serra-Capizzano, Anti-reflective boundary conditions and fast 2D deblurring models, Proceedings of SPIE's 48th Annual Meeting, San Diego, CA, USA, F. Luk Ed., 5205 (2003), pp. 380–389.
11. M. Donatelli and S. Serra-Capizzano, Anti-reflective boundary conditions and re-blurring, Inverse Problems, 21 (2005), pp. 169–182.
12. H. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2000.
13. C. W. Groetsch, The Theory of Tikhonov Regularization for Fredholm Integral Equations of the First Kind, Pitman, Boston, 1984.
14. P. C. Hansen, Rank-Deficient and Discrete Ill-Posed Problems, SIAM, Philadelphia, PA, 1997.
15. M. Hanke and J. Nagy, Restoration of atmospherically blurred images by symmetric indefinite conjugate gradient techniques, Inverse Problems, 12 (1996), pp. 157–173.
16. P. C. Hansen, J. Nagy, and D. P. O'Leary, Deblurring Images: Matrices, Spectra and Filtering, SIAM Publications, Philadelphia, 2006.
17. R. L. Lagendijk and J. Biemond, Iterative Identification and Restoration of Images, Springer-Verlag New York, Inc., 1991.
18. M. K. Ng, R. H. Chan, and W. C. Tang, A fast algorithm for deblurring models with Neumann boundary conditions, SIAM J. Sci. Comput., 21 (1999), no. 3, pp. 851–866.
19. L. Perrone, Kronecker Product Approximations for Image Restoration with Anti-Reflective Boundary Conditions, Numer. Linear Algebra Appl., 13–1 (2006), pp. 1–22.
20. F. Rossi, Tecniche di filtraggio nella ricostruzione di immagini con condizioni al contorno antiriflettenti (in Italian), Basic Degree Thesis, University of Milano-Bicocca, Milano, 2006.
21. S. Serra-Capizzano, A note on anti-reflective boundary conditions and fast deblurring models, SIAM J. Sci. Comput., 25–3 (2003), pp. 1307–1325.
22. Y. Shi and Q. Chang, Acceleration methods for image restoration problem with different boundary conditions, Appl. Numer. Math., 58–5 (2008), pp. 602–614.
23. G. Strang, The Discrete Cosine Transform, SIAM Review, 41–1 (1999), pp. 135–147.
24. C. R. Vogel, Computational Methods for Inverse Problems, SIAM, Philadelphia, PA, 2002.

Discrete-time stability of a class of Hermitian polynomial matrices with positive semidefinite coefficients

Harald K. Wimmer
Mathematisches Institut, Universität Würzburg, D-97074 Würzburg, Germany
wimmer@mathematik.uni-wuerzburg.de

Abstract. Polynomial matrices G(z) = Iz^m − Σ_{i=0}^{m−1} C_i z^i with positive semidefinite coefficients C_i are studied. If C_0 is positive definite and Σ C_i = I, then all characteristic values of G(z) are in the closed unit disc, and those lying on the unit circle are m-th roots of unity having linear elementary divisors. The result yields a stability and convergence criterion for a system of difference equations.

Keywords: polynomial matrix, zeros of polynomials, root location, block companion matrix, difference equation, stability.

1 Introduction

In this note we deal with a theorem on polynomials and its extension to polynomial matrices. The following result can be found in [1], and to some extent also in [3], [4, p. 92] and [5, p. 3].

Theorem 1. Let

g(z) = z^m − (c_{m−1} z^{m−1} + · · · + c_1 z + c_0)

be a real polynomial such that

c_i ≥ 0, i = 0, . . . , m − 1,  c_0 > 0,  and  Σ_{i=0}^{m−1} c_i = 1.   (1)

(i) Then all zeros of g(z) are in the closed unit disc.
(ii) The zeros of g(z) lying on the unit circle are simple and they are m-th roots of unity.
(iii) The number of zeros of g(z) on the unit circle is equal to

k = gcd({m} ∪ {i; c_i ≠ 0}).

Moreover,

g(z) = (z^k − 1) p(z^k)  with  p(λ^k) ≠ 0  if  |λ| = 1.
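Theorem 1 can be checked numerically. In the sketch below the coefficients c_i are a hypothetical choice satisfying (1), for which g(z) = z^4 − 0.5z^2 − 0.5 = (z^2 − 1)(z^2 + 0.5) factors by hand, so the unimodular zeros are exactly E_2 = {±1}:

```python
import numpy as np
from math import gcd

# Hypothetical coefficients with c_i >= 0, c_0 > 0 and sum c_i = 1 (m = 4)
c = np.array([0.5, 0.0, 0.5, 0.0])           # c_0, c_1, c_2, c_3
m = len(c)

# g(z) = z^m - (c_{m-1} z^{m-1} + ... + c_1 z + c_0)
roots = np.roots(np.concatenate(([1.0], -c[::-1])))

# (i): all zeros lie in the closed unit disc
assert np.all(np.abs(roots) <= 1 + 1e-9)

# (ii)-(iii): the unimodular zeros form E_k with k = gcd({m} U {i: c_i != 0})
on_circle = roots[np.isclose(np.abs(roots), 1.0)]
k = gcd(m, *[i for i in range(m) if c[i] != 0])
print(len(on_circle), k)                     # two unimodular zeros, k = 2
```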

It is the purpose of this note to extend the preceding theorem to polynomial matrices and to derive a stability and convergence result for a system of difference equations. We consider matrices G(z) = Iz^m − Σ_{i=0}^{m−1} C_i z^i, where the coefficients C_i ∈ C^{n×n} are Hermitian and positive semidefinite (C_i ≥ 0), C_0 is positive definite (C_0 > 0), and we assume Σ_{i=0}^{m−1} C_i = I.

The following notation will be used. We define σ(G) = {λ; det G(λ) = 0}. In accordance with [2, p. 341] the elements of σ(G) will be called the characteristic values of G(z). If G(λ)v = 0 and v ∈ C^n, v ≠ 0, then v is said to be an eigenvector corresponding to λ. An r-tuple of vectors (v_0, v_1, . . . , v_{r−1}), v_i ∈ C^n, v_0 ≠ 0, is called a Jordan chain (or Keldysh chain) of length r if

G(λ)v_0 = 0,  G′(λ)v_0 + G(λ)v_1 = 0,  . . . ,
(1/(r−1)!) G^{(r−1)}(λ)v_0 + (1/(r−2)!) G^{(r−2)}(λ)v_1 + · · · + G(λ)v_{r−1} = 0.

Let D = {z ∈ C; |z| < 1} be the open unit disc and ∂D = {z ∈ C; |z| = 1} the unit circle of the complex plane, and let R_≥ be the set of nonnegative real numbers. Let E_m = {ζ ∈ C; ζ^m = 1} be the group of m-th roots of unity. If ζ ∈ E_m then ord ζ will denote the order of ζ, i.e. if ord ζ = s then s is the smallest positive divisor of m such that ζ^s = 1.

2 Polynomial matrices

Let us have a closer look at Theorem 1. It is not difficult to show that the zeros of the polynomial g(z) lie in the closed unit disc. But it is remarkable that the unimodular zeros of g(z) should be roots of unity. In the theorem below we shall encounter this property in a more general setting. Accordingly, the focus of this section will be on characteristic values on the unit circle and corresponding eigenvectors. To make the exposition self-contained we do not take advantage of Theorem 1 in the subsequent proof.

Theorem 2.

Let

G(z) = Iz^m − Σ_{i=0}^{m−1} C_i z^i

be an n × n polynomial matrix with Hermitian coefficients C_i ≥ 0, i = 0, . . . , m − 1, C_0 > 0, such that

Σ_{i=0}^{m−1} C_i = I.   (2)

(i) Then |λ| ≤ 1 for all λ ∈ σ(G).
(ii) If λ ∈ σ(G) and |λ| = 1 then λ^m = 1. The elementary divisors of G(z) corresponding to λ are linear. If ord λ = s then E_s ⊆ σ(G).
(iii) Let v ∈ C^n, v ≠ 0. Define

k(v) = gcd({m} ∪ {i; C_i v ≠ 0, i = 0, 1, . . . , m − 1}),   (3)
M(v) = {λ ∈ σ(G); |λ| = 1, G(λ)v = 0}.   (4)

Suppose the set M(v) is nonempty. Then M(v) = E_{k(v)}. If m = k(v)ℓ then

G(z)v = [Iz^{k(v)ℓ} − Σ_{j=0}^{ℓ−1} C_{k(v)j} z^{k(v)j}] v = (z^{k(v)} − 1) p(z^{k(v)}),   (5)

where p(z) ∈ C^n[z] and

p(λ^{k(v)}) ≠ 0  if  |λ| = 1.   (6)

Proof. Note that det C_0 ≠ 0 implies 0 ∉ σ(G). Let λ be a characteristic value of G(z) and v a corresponding eigenvector with v*v = 1. Set g_v(z) = v*G(z)v and c_i = v*C_i v. Then g_v(z) = z^m − Σ_{i=0}^{m−1} c_i z^i, and the assumptions (2) imply (1). We have g_v(λ) = 0 or, equivalently,

1 = Σ_{i=0}^{m−1} c_i / λ^{m−i}.   (7)

Hence

1 = |Σ_{i=0}^{m−1} c_i / λ^{m−i}| ≤ Σ_{i=0}^{m−1} c_i / |λ^{m−i}|.   (8)

(i) Suppose |λ| > 1, i.e. |1/λ^{m−i}| < 1, 0 ≤ i ≤ m − 1. Then (8) implies the strict inequality 1 < Σ c_i, in contradiction to (1).

(ii) Put

β_i = c_i / λ^{m−i},  i = 0, . . . , m − 1.   (9)

If |λ| = 1 then (1) and (7) yield

Σ_{i=0}^{m−1} β_i = Σ_{i=0}^{m−1} |β_i| = 1.

Hence β_i = u α_i, i = 0, . . . , m − 1, with α_i ∈ R_≥, u ∈ C, |u| = 1. From (7) we obtain 1 = u Σ α_i. Therefore u ∈ R, u > 0. Thus u = 1, and we have (9) with β_i ∈ R_≥. Take i = 0. Then c_0 > 0 yields λ^m ∈ R_≥. Because of |λ| = 1 we obtain λ^m = 1, i.e. λ ∈ E_m. We rewrite (9) as

β_i = λ^i c_i,  β_i ∈ R_≥,  i = 0, . . . , m − 1.   (10)

Let ord λ = s and m = ℓs. Suppose c_i ≠ 0, i.e. c_i > 0. Then (10) implies λ^i = 1, that is, i ∈ {0, s, 2s, . . . , (ℓ − 1)s}. Therefore c_i = 0 if i ∉ sZ. Because of C_i ≥ 0 we have c_i = v*C_i v = 0 if and only if C_i v = 0. Hence

G(z)v = [I(z^s)^ℓ − (C_{(ℓ−1)s}(z^s)^{ℓ−1} + · · · + C_s z^s + C_0)] v.   (11)

Let µ^s = 1. Then (11) and G(1) = 0 imply G(µ)v = 0. Thus E_s ⊆ σ(G). Moreover v*G(λ) = 0, and

g_v(z) = z^{ℓs} − Σ_{j=0}^{ℓ−1} c_{js} z^{js}  and  Σ_{j=0}^{ℓ−1} c_{js} = 1.

Let us show that the elementary divisors corresponding to λ are linear. It suffices to prove (see e.g. [2, p. 342]) that the vector v cannot be extended to a Jordan chain of length greater than 1. Suppose there exists a vector w ∈ C^n such that G′(λ)v + G(λ)w = 0. Then v*G(λ) = 0 and λ^s = 1 imply

0 = v*[G(λ)w + G′(λ)v] = v*G′(λ)v = g_v′(λ) = ℓs λ^{ℓs−1} − Σ_{j=0}^{ℓ−1} js c_{js} λ^{js−1} = λ^{−1} [ℓs − Σ_{j=0}^{ℓ−1} js c_{js}].

Thus we would obtain ℓs = Σ_{j=0}^{ℓ−1} js c_{js}, which is incompatible with Σ_{j=0}^{ℓ−1} c_{js} = 1.

(iii) Let M(v) be defined by (4) and let D(v) denote the set of common positive divisors of {m} ∪ {i; C_i v ≠ 0, i = 0, . . . , m − 1}. Then s ∈ D(v) with m = ℓs is equivalent to (11). We know that M(v) ⊆ E_m, and we have seen that s ∈ D(v) if E_s ⊆ M(v). Since (11) implies E_s ⊆ M(v), it is obvious that

s ∈ D(v)  if and only if  E_s ⊆ M(v).   (12)

Now let λ, µ ∈ M(v) and ord λ = s, ord µ = t. Set q = lcm{s, t} and r = m/q. Then

G(z)v = [Iz^{rq} − Σ_{j=0}^{r−1} C_{jq} z^{jq}] v.

Hence E_q ⊆ M(v). In particular, we have λµ ∈ M(v). Therefore M(v) is a subgroup of E_m. Hence M(v) = E_k̂ for some divisor k̂ of m. Note that E_s ⊆ E_k̂ is equivalent to s | k̂. Therefore it follows from (12) that k̂ is the greatest element of D(v). Thus, if k(v) is given by (3) then k̂ = k(v).

It remains to show that the polynomial vector p(z^{k(v)}) in (5) satisfies the

condition (6). Suppose p(λ^{k(v)}) = 0 for some λ ∈ ∂D. Then λ ∈ M(v), and therefore λ ∈ E_{k(v)}, i.e. λ^{k(v)} − 1 = 0. Hence G(λ)v = G′(λ)v = 0. But then there would exist an elementary divisor (z − λ)^t with t ≥ 2. Therefore we have (6). ⊓⊔

From Σ C_i = I it follows that G(1) = 0. Thus 1 ∈ σ(G). More precisely, det G(z) = (z − 1)^n f(z), f(1) ≠ 0. To check whether G(z) has characteristic values on ∂D different from 1 we introduce the following matrices. Let s, s ≠ 1, s ≠ m, be a positive divisor of m such that m = sℓ. Define

T_s = I − Σ_{j=0}^{ℓ−1} C_{js}.

Corollary 1. For each nontrivial divisor s of m the matrix T_s is nonsingular if and only if λ = 1 is the only characteristic value of G(z) on the unit circle.

Proof. Suppose G(λ)v = 0 and v ≠ 0, |λ| = 1, ord λ = s, s > 1. Then C_i v = 0 for i ∉ sZ. Hence Σ_{j=0}^{ℓ−1} C_{js} v = Σ_{i=0}^{m−1} C_i v = v, and therefore T_s v = 0 and rank T_s < n. Conversely, suppose rank T_s < n for some s. Let T_s v = 0, v ≠ 0. Then Σ C_i = I and C_i ≥ 0 imply G(z)v = [Iz^m − Σ_{j=0}^{ℓ−1} C_{js} z^{js}] v, and we conclude that {1} ⊊ E_s ⊆ σ(G). ⊓⊔

3 A difference equation

Theorem 2 deals with the location of characteristic values with respect to the unit circle. Therefore it can be applied to stability problems of systems of difference equations.

Theorem 3. Let C_0, . . . , C_{m−1} ∈ C^{n×n} be Hermitian and such that (2) holds, i.e.

C_i ≥ 0, i = 0, . . . , m − 1,  C_0 > 0,  and  Σ_{i=0}^{m−1} C_i = I.

Then all solutions (x(t))_{t∈N_0} of the difference equation

x(t + m) = C_{m−1} x(t + m − 1) + · · · + C_1 x(t + 1) + C_0 x(t),   (13a)
x(0) = x_0, . . . , x(m − 1) = x_{m−1},   (13b)

are bounded for t → ∞. The sequence (x(jm))_{j∈N_0} is convergent.

Proof. It is well known that the solutions of (13) are bounded if and only if all characteristic values of the associated polynomial matrix G(z) = Iz^m − Σ C_i z^i are in the closed unit disc and those which lie on the unit circle have linear elementary divisors. To prove convergence of x(jm) we consider the block companion matrix

    F = [ 0    I    0   . . .  0
          0    0    I   . . .  0
          .    .    .   . . .  .
          C_0  C_1  C_2 . . .  C_{m−1} ]

associated with G(z). Note that det G(z) = det(zI − F). Moreover, G(z) and F have the same elementary divisors. Set y(t) = (x^T(t), x^T(t + 1), · · · , x^T(t + m − 1))^T and define y_0 conforming to (13b). Then (13) is equivalent to y(t + 1) = F y(t), y(0) = y_0. The corresponding equation for w(j) = y(jm) is

w(j + 1) = F^m w(j).

We know that σ(G) ⊆ D ∪ ∂D and that λ ∈ σ(G) ∩ ∂D implies λ^m = 1. Therefore σ(F^m) ⊆ {1} ∪ D, and F^m is similar to diag(I, F̂) with σ(F̂) ⊆ D. Hence w(j), and in particular x(jm), is convergent. ⊓⊔

References

1. N. Anderson, E. B. Saff, and R. S. Varga, An extension of the Eneström–Kakeya theorem and its sharpness, SIAM J. Math. Anal., 12 (1981), pp. 10–22.
2. H. Baumgärtel, Analytic Perturbation Theory for Matrices and Operators, Operator Theory: Advances and Applications, Vol. 15, Birkhäuser, Basel, 1985.
3. A. Hurwitz, Über einen Satz des Herrn Kakeya, Tôhoku Math. J., 4 (1913), pp. 89–93; in: Mathematische Werke von A. Hurwitz, 2. Band, pp. 627–631, Birkhäuser, Basel, 1933.
4. A. M. Ostrowski, Solutions of Equations in Euclidean and Banach Spaces, 3rd ed., Academic Press, New York, 1973.
5. V. V. Prasolov, Polynomials, Algorithms and Computation in Mathematics, Vol. 11, Springer, New York, 2004.

MATRICES AND APPLICATIONS

Splitting algorithm for solving mixed variational inequalities with inversely strongly monotone operators⋆

Ildar Badriev and Oleg Zadvornov
Department of Computational Mathematics and Mathematical Cybernetics, Kazan State University, Kremlevskaya 18, 420008 Kazan, Russia
Ildar.Badriev@ksu.ru

Abstract. We consider a boundary value problem whose generalized statement is formulated as a mixed variational inequality in a Hilbert space. The operator of this variational inequality is a sum of several inversely strongly monotone operators (which are not necessarily potential operators). The functional occurring in this variational inequality is also a sum of several lower semicontinuous convex proper functionals. For solving the considered variational inequality, a decomposition iterative method is offered. The suggested method does not require the inversion of the original operators. The convergence of this method is investigated.

Keywords: variational inequality, inversely strongly monotone operator, iterative method.

1 Statement of the problem

Let Ω ⊂ R^n, n ≥ 1, be a bounded domain with a Lipschitz continuous boundary Γ. We consider the following boundary value problem with respect to the function u = (u_1, u_2, . . . , u_n):

Σ_{j=1}^{n} [ ∂v_j^{(i)}(x)/∂x_j + d_{ij}(x) u_j(x) ] = f_i(x),  x ∈ Ω,  i = 1, . . . , n,   (1)

u(x) = 0,  x ∈ Γ,   (2)

−v_j^{(i)}(x) ∈ [ g_j(|∂u(x)/∂x_j|) / |∂u(x)/∂x_j| ] ∂u_i(x)/∂x_j,  x ∈ Ω,  i, j = 1, . . . , n,   (3)

where f = (f_1, f_2, . . . , f_n) is a given function and D = {d_{ij}} is an unsymmetric matrix such that

(Dξ, ξ) ≥ α_0 (Dξ, Dξ)  ∀ ξ ∈ R^n,  α_0 > 0.   (4)

⋆ This work was supported by the Russian Foundation for Basic Research, projects No. 06-01-00633 and 07-01-00674.


We assume that the multi-valued functions g_j can be represented in the form

g_j(ξ) = g_{0j}(ξ) + ϑ_j h(ξ − β_j),

where ϑ_j, β_j are given nonnegative constants, h is the multi-valued function and g_{0j} are the single-valued functions given by the formulas

h(ξ) = { 0, ξ < 0;  [0, 1], ξ = 0;  1, ξ > 0 },    g_{0j}(ξ) = { 0, ξ ≤ β_j;  g*_j(ξ − β_j), ξ > β_j },

and g*_j : [0, +∞) → [0, +∞) are continuous functions which satisfy the following conditions:

g*_j(0) = 0,  g*_j(ξ) ≥ g*_j(ζ)  ∀ ξ ≥ ζ ≥ 0,   (5)
∃ σ_j > 0 :  |g*_j(ξ) − g*_j(ζ)| ≤ (1/σ_j) |ξ − ζ|  ∀ ξ, ζ ≥ 0,   (6)
∃ k_j > 0, ξ*_j ≥ 0 :  g*_j(ξ*_j) ≥ k_j ξ*_j,  g*_j(ξ) − g*_j(ζ) ≥ k_j (ξ − ζ)  ∀ ξ ≥ ζ ≥ ξ*_j.   (7)

Let us introduce the notations: V = [W̊_2^{(1)}(Ω)]^n, H = [L_2(Ω)]^n, B_j = ∂/∂x_j : V → H, j = 1, 2, . . . , n.

A generalized solution of problem (1)–(3) is defined as a function u ∈ V satisfying for all η ∈ V the variational inequality

(A_0 u, η − u)_V + Σ_{j=1}^{n} (A_j ∘ B_j(u), B_j(η − u))_H + F_0(η) − F_0(u) + Σ_{j=1}^{n} [F_j(B_j η) − F_j(B_j u)] ≥ 0.   (8)

Here B*_j : H → V, j = 1, 2, . . . , n, are the operators conjugate to B_j. The operators A_0 : V → V and A_j : H → H, j = 1, 2, . . . , n, are generated by the forms

(A_0 u, η)_V = ∫_Ω (Du, η) dx,  u, η ∈ V,    (A_j y, z)_H = ∫_Ω (G_j(y), z) dx,  y, z ∈ H.

The operators G_j : R^n → R^n and the functionals F_0 : V → R^1, F_j : H → R^1, j = 1, 2, . . . , n, are defined by the formulas

G_j(y) = g_{0j}(|y|) |y|^{−1} y,  y ≠ 0,  G_j(0) = 0,    F_0(η) = −∫_Ω (f, η) dx,  η ∈ V,
F_j(z) = ϑ_j ∫_Ω µ(|z| − β_j) dx,  z ∈ H,    µ(ζ) = { 0, ζ < 0;  ζ, ζ ≥ 0 }.

The following result is valid.

Lemma 1. Let condition (4) be satisfied. Then A_0 is an inversely strongly monotone operator, i.e.,

(A_0 η − A_0 u, η − u)_V ≥ σ_0 ||A_0 η − A_0 u||_V^2,  σ_0 > 0,  ∀ u, η ∈ V.   (9)

Proof. It follows from (4) that

|Dξ| ≤ α_0^{−1/2} (Dξ, ξ)^{1/2}  ∀ ξ ∈ R^n,

and hence

|(Dξ, ζ)| ≤ α_0^{−1/2} (Dξ, ξ)^{1/2} |ζ|  ∀ ξ, ζ ∈ R^n.

Because of this,

|(A_0 u, η)_V| ≤ ∫_Ω |(Du, η)| dx ≤ α_0^{−1/2} ∫_Ω (Du, u)^{1/2} |η| dx ≤
α_0^{−1/2} ( ∫_Ω (Du, u) dx )^{1/2} ( ∫_Ω |η|^2 dx )^{1/2} = α_0^{−1/2} (A_0 u, u)_V^{1/2} ||η||_H ≤
α_0^{−1/2} c_H (A_0 u, u)_V^{1/2} ||η||_V = σ_0^{−1/2} (A_0 u, u)_V^{1/2} ||η||_V,

where c_H is the Friedrichs constant (the constant of the embedding of V into H) and σ_0 = α_0 / c_H^2. Therefore,

||A_0 u||_V = sup_{η≠0} |(A_0 u, η)_V| / ||η||_V ≤ σ_0^{−1/2} (A_0 u, u)_V^{1/2},

whence, by virtue of the linearity of A_0, the required inequality follows. ⊓⊔

By analogy with [4] we obtain that the following results are valid.

Lemma 2. Let conditions (5)–(7) be satisfied. Then A_j are coercive and inversely strongly monotone operators, i.e.,

(A_j y − A_j z, y − z)_H ≥ σ_j ||A_j y − A_j z||_H^2,  σ_j > 0,  ∀ y, z ∈ H.   (10)

Lemma 3. The functionals F_0 : V → R^1, F_j : H → R^1, j = 1, 2, . . . , n, are convex and Lipschitz continuous.

It follows from these results that the variational inequality (8) has at least one solution (see e.g. [5]).

2 The iterative process

In the following we consider the abstract variational inequality (8), postulating the properties (9), (10) and assuming that B_j : V → H, j = 1, 2, . . . , n, are linear continuous operators and F_j, j = 0, 1, 2, . . . , n, are proper convex Lipschitz continuous functionals. In addition, we assume that the operator Σ_{j=1}^{n} B*_j B_j : V → V is a canonical isomorphism, i.e.,

( Σ_{j=1}^{n} B*_j B_j u, η )_V = (u, η)_V  ∀ u, η ∈ V.   (11)

To solve the variational inequality (8) we consider the following splitting algorithm. Let u^(0) ∈ V, y_j^(0) ∈ H, λ_j^(0) ∈ H, j = 1, 2, . . . , n, be arbitrary elements. For k = 0, 1, 2, . . . and for known y_j^(k), λ_j^(k), j = 1, 2, . . . , n, we define u^(k+1) as a solution of the variational inequality

(1/τ_0) ( u^(k+1) − u^(k), η − u^(k+1) )_V + F_0(η) − F_0(u^(k+1)) + ( A_0 u^(k), η − u^(k+1) )_V +
Σ_{j=1}^{n} ( B*_j λ_j^(k), η − u^(k+1) )_V + r Σ_{j=1}^{n} ( B*_j ( B_j u^(k) − y_j^(k) ), η − u^(k+1) )_V ≥ 0  ∀ η ∈ V.   (12)

Then we find y_j^(k+1), j = 1, 2, . . . , n, by solving the variational inequalities

(1/τ_j) ( y_j^(k+1) − y_j^(k), z − y_j^(k+1) )_H + F_j(z) − F_j(y_j^(k+1)) + ( A_j y_j^(k) − λ_j^(k), z − y_j^(k+1) )_H +
r ( y_j^(k) − B_j u^(k+1), z − y_j^(k+1) )_H ≥ 0  ∀ z ∈ H,  j = 1, 2, . . . , n.   (13)

Finally, we set

λ_j^(k+1) = λ_j^(k) + r ( B_j u^(k+1) − y_j^(k+1) ),  j = 1, 2, . . . , n.   (14)

Here τ_j > 0, j = 0, 1, 2, . . . , n, and r > 0 are the iterative parameters.

To analyze the convergence of the method (12)–(14) we formulate it via the transition operator T : V × H^n × H^n → V × H^n × H^n that takes each vector q = (q_0, q_1, . . . , q_{2n}) = (u, Y, Λ), Y ∈ H^n, Λ ∈ H^n, to the element Tq = (T^0 q, T^1 q, . . . , T^{2n} q) as follows:

T^0 q = Prox_{τ_0 F_0} [ q_0 − τ_0 ( A_0 q_0 + Σ_{j=1}^{n} B*_j q_{n+j} + r Σ_{j=1}^{n} B*_j ( B_j q_0 − q_j ) ) ],   (15)

T^j q = Prox_{τ_j F_j} [ q_j − τ_j ( A_j q_j − q_{n+j} + r ( q_j − B_j T^0 q ) ) ],  j = 1, 2, . . . , n,   (16)

T^{n+j} q = q_{n+j} + r ( B_j T^0 q − T^j q ),  j = 1, 2, . . . , n.   (17)

Here Prox_G is a proximal mapping (see e.g. [5]). Recall that a mapping Prox_G : Z → Z is said to be proximal if it takes each element p of a Hilbert space Z to the element v = Prox_G(p) that is the solution of the minimization problem

(1/2) ||v − p||_Z^2 + G(v) = min_{z∈Z} { (1/2) ||z − p||_Z^2 + G(z) }.

This problem is equivalent (if G is a convex proper lower semicontinuous functional) to the variational inequality

( v − p, z − v )_Z + G(z) − G(v) ≥ 0  ∀ z ∈ Z.   (18)

It is easy to show that a proximal mapping is firmly nonexpansive, i.e.,

||Prox_G(p) − Prox_G(z)||_Z^2 ≤ ( Prox_G(p) − Prox_G(z), p − z )_Z  ∀ p, z ∈ Z.
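For a concrete instance, the proximal mapping of G(v) = t|v| on Z = R is the classical soft-thresholding operator; the sketch below (an illustrative scalar case, not part of the algorithm above) verifies the firm-nonexpansiveness inequality empirically:

```python
import numpy as np

def prox_abs(p, t):
    """Proximal mapping of G(v) = t*|v| on R: soft-thresholding,
    v = argmin_z { 0.5*(z - p)^2 + t*|z| }."""
    return np.sign(p) * np.maximum(np.abs(p) - t, 0.0)

# Firm nonexpansiveness: |Px - Pz|^2 <= (Px - Pz)(x - z)
rng = np.random.default_rng(1)
for _ in range(1000):
    x, z = rng.uniform(-5, 5, size=2)
    Px, Pz = prox_abs(x, 1.0), prox_abs(z, 1.0)
    assert (Px - Pz) ** 2 <= (Px - Pz) * (x - z) + 1e-12
```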

We introduce the notations

Y^(k) = ( y_1^(k), y_2^(k), . . . , y_n^(k) ),  Λ^(k) = ( λ_1^(k), λ_2^(k), . . . , λ_n^(k) ).

Then, using the definition of a proximal mapping by the variational inequality (18), it is easy to verify that the iterative process (12)–(14) can be represented in the form

q^(0) is an arbitrary element,  q^(k+1) = T q^(k),  q^(k) = ( u^(k), Y^(k), Λ^(k) ),  k = 0, 1, 2, . . . ,   (19)

i.e., T is the transition operator of this iterative process. Let us now obtain a relationship between the solution of the original variational inequality (8) and the components of the fixed points of the transition operator T. The following result is true.

Theorem 1. Let the operator T : V × H^n × H^n → V × H^n × H^n be defined by the relationships (15)–(17). Then the point q = (u, Y, Λ), where u ∈ V, Y = (y_1, y_2, . . . , y_n) ∈ H^n, Λ = (λ_1, λ_2, . . . , λ_n) ∈ H^n, is a fixed point of the operator T if and only if

y_j = B_j u,  j = 1, 2, . . . , n,   (20)
λ_j ∈ ∂F_j(y_j) + A_j y_j,  j = 1, 2, . . . , n,   (21)
−Σ_{j=1}^{n} B*_j λ_j ∈ ∂F_0(u) + A_0 u.   (22)

Moreover, the first component u of each fixed point q of the operator T is a solution of the problem (8).

Proof. Let q = (u, Y, Λ) be a fixed point of the operator T, i.e., according to (15)–(17),

u = Prox_{τ_0 F_0} [ u − τ_0 ( A_0 u + Σ_{j=1}^{n} B*_j λ_j + r Σ_{j=1}^{n} B*_j ( B_j u − y_j ) ) ],   (23)
y_j = Prox_{τ_j F_j} [ y_j − τ_j ( A_j y_j − λ_j + r ( y_j − B_j u ) ) ],  j = 1, 2, . . . , n,   (24)
λ_j = λ_j + r ( B_j u − y_j ),  j = 1, 2, . . . , n.   (25)

Obviously, the relations (25) are equivalent to (20). By (20) and the definition (18) of a proximal mapping, the relations (24) are equivalent to the variational inequalities

τ_j ( A_j y_j − λ_j, z − y_j )_H + τ_j F_j(z) − τ_j F_j(y_j) ≥ 0  ∀ z ∈ H,  j = 1, 2, . . . , n,

or

( A_j y_j − λ_j, z − y_j )_H + F_j(z) − F_j(y_j) ≥ 0  ∀ z ∈ H,  j = 1, 2, . . . , n,   (26)

each of which is equivalent to −A_j y_j + λ_j ∈ ∂F_j(y_j), j = 1, 2, . . . , n, i.e., the inclusions (21) hold. In an analogous way we obtain that relation (23) is equivalent to the variational inequality

( A_0 u + Σ_{j=1}^{n} B*_j λ_j, η − u )_V + F_0(η) − F_0(u) ≥ 0  ∀ η ∈ V,   (27)

i.e., to the inclusion (22). We have thereby shown that the equality Tq = q is equivalent to the relations (20)–(22).

Let us now verify that the first component u of each fixed point q of the operator T is a solution of the problem (8). To this end, in the inequalities (26) we use the relations (20) to replace y_j by B_j u, j = 1, 2, . . . , n, and set z = B_j η, where η is an arbitrary element of V. Adding the resulting inequalities and using the definition of the conjugate operator, we obtain

Σ_{j=1}^{n} ( B*_j ∘ A_j ∘ B_j(u) − B*_j λ_j, η − u )_V + Σ_{j=1}^{n} [ F_j(B_j η) − F_j(B_j u) ] ≥ 0  ∀ η ∈ V.   (28)

Adding the inequalities (27) and (28), the terms containing λ_j cancel, and we conclude that u is a solution of the problem (8). The proof of the theorem is complete. ⊓⊔


Theorem 2. Suppose that there exists a solution of problem (8) and

∃ u* ∈ dom F_0 :  B_j u* ∈ dom F_j;  F_j is continuous at the point B_j u*,  j = 1, 2, . . . , n.   (29)

Then the set of fixed points of the operator T is nonempty.

Proof. Let u be a solution of the problem (8) and y_j = B_j u, j = 1, 2, . . . , n. The variational inequality (8) is equivalent to the following inclusion:

−A_0 u − Σ_{j=1}^{n} B*_j A_j y_j ∈ ∂( F_0 + Σ_{j=1}^{n} F_j ∘ B_j )(u).   (30)

If conditions (29) are satisfied, then it follows from Propositions 5.6 and 5.7 of [5] that

∂( F_0 + Σ_{j=1}^{n} F_j ∘ B_j )(u) = ∂F_0(u) + Σ_{j=1}^{n} ∂(F_j ∘ B_j)(u) = ∂F_0(u) + Σ_{j=1}^{n} B*_j ∂F_j(y_j).   (31)

Relations (31) and (30) imply that there exist elements v ∈ ∂F_0(u), z_j ∈ ∂F_j(y_j), j = 1, 2, . . . , n, such that

−A_0 u − Σ_{j=1}^{n} B*_j A_j y_j = v + Σ_{j=1}^{n} B*_j z_j,

or

−A_0 u − Σ_{j=1}^{n} B*_j ( A_j y_j + z_j ) = v.

Let λ_j = A_j y_j + z_j, j = 1, 2, . . . , n; then we have the inclusions

−A_0 u − Σ_{j=1}^{n} B*_j λ_j = v ∈ ∂F_0(u);   −A_j y_j + λ_j = z_j ∈ ∂F_j(y_j),  j = 1, 2, . . . , n,

i.e., the relations (21), (22) hold. Next, the relations (20) are valid by virtue of the definition of y_j. Therefore, by Theorem 1, the operator T has a fixed point, namely the point q = (u, Y, Λ) where Y = (y_1, y_2, . . . , y_n) ∈ H^n, Λ = (λ_1, λ_2, . . . , λ_n) ∈ H^n. The proof of the theorem is complete. ⊓⊔

Thus the convergence analysis of the iterative process (12)–(14) can be reduced to that of the successive approximation method for finding a fixed point of T.

3 The investigation of the convergence of the iterative process

Let us introduce the Hilbert space Q = V × H^n × H^n with the inner product

(·, ·)_Q = ((1 − τ_0 r)/τ_0) (·, ·)_V + Σ_{j=1}^{n} (1/τ_j) (·, ·)_H + Σ_{j=1}^{n} (1/r) (·, ·)_H,

where r, τ_j, j = 0, 1, 2, . . . , n, are positive constants; moreover, τ_j r < 1, j = 0, 1, 2, . . . , n. The investigation of the convergence of the iterative process (19) is based on the following

Theorem 3. Let conditions (9)–(11) be satisfied, and let τ_j … 0, j = 0, 1, 2, . . . , n; therefore it follows from (9), (10) and (33) that T is a nonexpansive operator.

We rewrite relation (15), in view of (11), in the form

T^0 q = Prox_{τ_0 F_0} [ (1 − τ_0 r) q_0 − τ_0 A_0 q_0 − τ_0 Σ_{j=1}^{n} B*_j ( q_{n+j} − r q_j ) ] = Prox_{τ_0 F_0} [ S_0 q_0 − τ_0 Σ_{j=1}^{n} B*_j ( q_{n+j} − r q_j ) ],

where S_0 : V → V is the operator given by the formula S_0 = (1 − τ_0 r) I − τ_0 A_0.

I. Badriev, O. Zadvornov

According to [3], by using (9) we obtain

$$\|S_0 p_0 - S_0 q_0\|_V^2 = (1-\tau_0 r)^2 \|q_0 - p_0\|_V^2 - 2\tau_0 (1-\tau_0 r)(A_0 q_0 - A_0 p_0, q_0 - p_0)_V + \tau_0^2 \|A_0 q_0 - A_0 p_0\|_V^2 \le (1-\tau_0 r)^2 \|q_0 - p_0\|_V^2 - 2\tau_0 \Bigl( 1 - \tau_0\,\frac{2\sigma_0 r + 1}{2\sigma_0} \Bigr)(A_0 q_0 - A_0 p_0, q_0 - p_0)_V,$$

i.e.,

$$\|S_0 p_0 - S_0 q_0\|_V^2 \le (1-\tau_0 r)^2 \|q_0 - p_0\|_V^2 - \tau_0 (1-\tau_0 r)\,\delta_0\,(A_0 q_0 - A_0 p_0, q_0 - p_0)_V \qquad (34)$$

for any $q_0, p_0 \in V$. Further, by using the firm nonexpansiveness of the proximal mapping $\operatorname{Prox}_{\tau_0 F_0}$, we obtain

$$\|T_0 q - T_0 p\|_V^2 \le (T_0 q - T_0 p,\, S_0 q_0 - S_0 p_0)_V - \tau_0 \sum_{j=1}^{n} \bigl( B_j^* ( q_{n+j} - p_{n+j} ) - r B_j^* ( q_j - p_j ),\, T_0 q - T_0 p \bigr)_V.$$

Let us transform the first term on the right-hand side by the relation

$$(v, w)_Z = \frac{1}{2\varepsilon}\|v\|_Z^2 - \frac{1}{2\varepsilon}\|v - \varepsilon w\|_Z^2 + \frac{\varepsilon}{2}\|w\|_Z^2 \qquad \forall\, v, w \in Z,\ \forall\, \varepsilon > 0 \qquad (35)$$

with $Z = V$, $v = S_0 q_0 - S_0 p_0$, $w = T_0 q - T_0 p$. We have

$$\|T_0 q - T_0 p\|_V^2 \le \frac{1}{2\varepsilon}\|S_0 q_0 - S_0 p_0\|_V^2 + \frac{\varepsilon}{2}\|T_0 q - T_0 p\|_V^2 - \frac{1}{2\varepsilon}\|(S_0 q_0 - S_0 p_0) - \varepsilon (T_0 q - T_0 p)\|_V^2 - \tau_0 \sum_{j=1}^{n} \bigl( B_j^* ( q_{n+j} - p_{n+j} ) - r B_j^* ( q_j - p_j ),\, T_0 q - T_0 p \bigr)_V.$$

Therefore, by virtue of (34) we obtain

$$\frac{2-\varepsilon}{2}\|T_0 q - T_0 p\|_V^2 \le \frac{(1-\tau_0 r)^2}{2\varepsilon}\|q_0 - p_0\|_V^2 - \frac{\tau_0 (1-\tau_0 r)\,\delta_0}{2\varepsilon}(A_0 q_0 - A_0 p_0, q_0 - p_0)_V - \frac{1}{2\varepsilon}\|(1-\tau_0 r)(q_0 - p_0) - \tau_0 (A_0 q_0 - A_0 p_0) - \varepsilon (T_0 q - T_0 p)\|_V^2 - \tau_0 \sum_{j=1}^{n} \bigl( B_j^* ( q_{n+j} - p_{n+j} ) - r B_j^* ( q_j - p_j ),\, T_0 q - T_0 p \bigr)_V.$$
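Identity (35) is elementary (it is just the expansion of $\|v - \varepsilon w\|_Z^2$). As a quick sanity check outside the paper, it can be verified numerically for Euclidean vectors; the vectors and the value of ε below are arbitrary, and this throwaway Python sketch is our own illustration, not part of the proof:

```python
def inner(v, w):
    # Euclidean inner product, standing in for (., .)_Z
    return sum(a * b for a, b in zip(v, w))

def norm2(v):
    # squared norm ||v||_Z^2
    return inner(v, v)

v = [0.5, -0.2, 0.1]
w = [1.0, 0.3, -0.7]
eps = 0.25

# (v, w) = 1/(2*eps)*||v||^2 - 1/(2*eps)*||v - eps*w||^2 + eps/2*||w||^2
lhs = inner(v, w)
rhs = (norm2(v) / (2 * eps)
       - norm2([a - eps * b for a, b in zip(v, w)]) / (2 * eps)
       + eps * norm2(w) / 2)
assert abs(lhs - rhs) < 1e-12
```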

After dividing by $\tau_0$ and choosing $\varepsilon = 1 - \tau_0 r$, we have

$$\frac{1+\tau_0 r}{2\tau_0}\|T_0 q - T_0 p\|_V^2 + \frac{\delta_0}{2}(A_0 q_0 - A_0 p_0, q_0 - p_0)_V + \frac{1}{2(1-\tau_0 r)\tau_0}\|(1-\tau_0 r)[(q_0 - T_0 q) - (p_0 - T_0 p)] - \tau_0 (A_0 q_0 - A_0 p_0)\|_V^2 \le \frac{1-\tau_0 r}{2\tau_0}\|q_0 - p_0\|_V^2 - \sum_{j=1}^{n} ( q_{n+j} - p_{n+j},\, B_j (T_0 q - T_0 p) )_H + r \sum_{j=1}^{n} ( q_j - p_j,\, B_j (T_0 q - T_0 p) )_H.$$

From this inequality, after transforming the terms $( q_j - p_j,\, B_j (T_0 q - T_0 p) )_H$ by (35) with $Z = H$, $\varepsilon = 1$, $v = q_j - p_j$, $w = B_j (T_0 q - T_0 p)$, it follows that

$$\frac{1+\tau_0 r}{2\tau_0}\|T_0 q - T_0 p\|_V^2 + \frac{\delta_0}{2}(A_0 q_0 - A_0 p_0, q_0 - p_0)_V + \frac{1}{2(1-\tau_0 r)\tau_0}\|(1-\tau_0 r)[(q_0 - T_0 q) - (p_0 - T_0 p)] - \tau_0 (A_0 q_0 - A_0 p_0)\|_V^2 + \frac{r}{2}\sum_{j=1}^{n}\|(q_j - B_j T_0 q) - (p_j - B_j T_0 p)\|_H^2 \le \frac{1-\tau_0 r}{2\tau_0}\|q_0 - p_0\|_V^2 + \frac{r}{2}\sum_{j=1}^{n}\|q_j - p_j\|_H^2 + \frac{r}{2}\sum_{j=1}^{n}\|B_j (T_0 q - T_0 p)\|_H^2 - \sum_{j=1}^{n}( q_{n+j} - p_{n+j},\, B_j (T_0 q - T_0 p) )_H. \qquad (36)$$

For each $j = 1, 2, \ldots, n$, we rewrite relation (16) in the form

$$T_j q = \operatorname{Prox}_{\tau_j F_j}( q_j - \tau_j r q_j - \tau_j A_j q_j + \tau_j q_{n+j} + \tau_j B_j T_0 q ) = \operatorname{Prox}_{\tau_j F_j}( S_j q_j + \tau_j q_{n+j} + \tau_j B_j T_0 q ),$$

where the operators $S_j : H \to H$ are defined by the relations $S_j = (1 - \tau_j r) I - \tau_j A_j$. By virtue of (10), by analogy with (34), we obtain the estimates

$$\|S_j p_j - S_j q_j\|_H^2 \le (1-\tau_j r)^2 \|q_j - p_j\|_H^2 - \tau_j (1-\tau_j r)\,\delta_j\,(A_j q_j - A_j p_j, q_j - p_j)_H, \qquad (37)$$

and, taking into account the firm nonexpansiveness of the proximal mapping $\operatorname{Prox}_{\tau_j F_j}$ and the equality (35) with arbitrary $\varepsilon > 0$, $Z = H$, $v = S_j q_j - S_j p_j$, $w = T_j q - T_j p$, we have

$$\|T_j q - T_j p\|_H^2 \le (T_j q - T_j p,\, S_j q_j - S_j p_j)_H + \tau_j r (T_j q - T_j p,\, B_j (T_0 q - T_0 p))_H + \tau_j (T_j q - T_j p,\, q_{n+j} - p_{n+j})_H = \frac{1}{2\varepsilon}\|S_j q_j - S_j p_j\|_H^2 + \frac{\varepsilon}{2}\|T_j q - T_j p\|_H^2 - \frac{1}{2\varepsilon}\|(S_j q_j - S_j p_j) - \varepsilon (T_j q - T_j p)\|_H^2 + \tau_j r (T_j q - T_j p,\, B_j (T_0 q - T_0 p))_H + \tau_j (T_j q - T_j p,\, q_{n+j} - p_{n+j})_H.$$

By setting $\varepsilon = 1 - \tau_j r$ in the last inequality and using estimate (37) for $\|S_j q_j - S_j p_j\|_H^2$, we obtain the inequality

$$\frac{1+\tau_j r}{2\tau_j}\|T_j q - T_j p\|_H^2 + \frac{\delta_j}{2}(A_j q_j - A_j p_j, q_j - p_j)_H + \frac{1}{2(1-\tau_j r)\tau_j}\|(1-\tau_j r)[(q_j - T_j q) - (p_j - T_j p)] - \tau_j (A_j q_j - A_j p_j)\|_H^2 \le \frac{1-\tau_j r}{2\tau_j}\|q_j - p_j\|_H^2 + ( T_j q - T_j p,\, q_{n+j} - p_{n+j} )_H + r ( T_j q - T_j p,\, B_j (T_0 q - T_0 p) )_H,$$

which, after using relation (35) with $\varepsilon = 1$, $Z = H$, $v = T_j q - T_j p$, $w = B_j (T_0 q - T_0 p)$ to transform the last term, implies

$$\frac{1+\tau_j r}{2\tau_j}\|T_j q - T_j p\|_H^2 + \frac{\delta_j}{2}(A_j q_j - A_j p_j, q_j - p_j)_H + \frac{r}{2}\|(T_j q - B_j T_0 q) - (T_j p - B_j T_0 p)\|_H^2 + \frac{1}{2(1-\tau_j r)\tau_j}\|(1-\tau_j r)[(q_j - T_j q) - (p_j - T_j p)] - \tau_j (A_j q_j - A_j p_j)\|_H^2 \le \frac{1-\tau_j r}{2\tau_j}\|q_j - p_j\|_H^2 + ( q_{n+j} - p_{n+j},\, T_j q - T_j p )_H + \frac{r}{2}\|T_j q - T_j p\|_H^2 + \frac{r}{2}\|B_j (T_0 q - T_0 p)\|_H^2. \qquad (38)$$

Further, for each $j = 1, 2, \ldots, n$, by virtue of (17) we have

$$\frac{1}{2r}\|T_{n+j} q - T_{n+j} p\|_H^2 = \frac{1}{2r}\|q_{n+j} - p_{n+j}\|_H^2 + ( q_{n+j} - p_{n+j},\, B_j (T_0 q - T_0 p) )_H - ( q_{n+j} - p_{n+j},\, T_j q - T_j p )_H + \frac{r}{2}\|B_j (T_0 q - T_0 p) - (T_j q - T_j p)\|_H^2. \qquad (39)$$

By adding relations (38), (39) for $j = 1, 2, \ldots, n$, and relation (36), after multiplying by 2, we have

$$\frac{1+\tau_0 r}{\tau_0}\|T_0 q - T_0 p\|_V^2 + \delta_0 (A_0 q_0 - A_0 p_0, q_0 - p_0)_V + \frac{1}{(1-\tau_0 r)\tau_0}\|(1-\tau_0 r)[(q_0 - T_0 q) - (p_0 - T_0 p)] - \tau_0 (A_0 q_0 - A_0 p_0)\|_V^2 + r \sum_{j=1}^{n}\|(q_j - B_j T_0 q) - (p_j - B_j T_0 p)\|_H^2 + \sum_{j=1}^{n}\frac{1+\tau_j r}{\tau_j}\|T_j q - T_j p\|_H^2 + \sum_{j=1}^{n}\delta_j (A_j q_j - A_j p_j, q_j - p_j)_H + r \sum_{j=1}^{n}\|(T_j q - B_j T_0 q) - (T_j p - B_j T_0 p)\|_H^2 + \sum_{j=1}^{n}\frac{1}{(1-\tau_j r)\tau_j}\|(1-\tau_j r)[(q_j - T_j q) - (p_j - T_j p)] - \tau_j (A_j q_j - A_j p_j)\|_H^2 + \frac{1}{r}\sum_{j=1}^{n}\|T_{n+j} q - T_{n+j} p\|_H^2$$
$$\le \frac{1-\tau_0 r}{\tau_0}\|q_0 - p_0\|_V^2 + r \sum_{j=1}^{n}\|q_j - p_j\|_H^2 + r \sum_{j=1}^{n}\|B_j (T_0 q - T_0 p)\|_H^2 - 2\sum_{j=1}^{n}( q_{n+j} - p_{n+j},\, B_j (T_0 q - T_0 p) )_H + \sum_{j=1}^{n}\frac{1-\tau_j r}{\tau_j}\|q_j - p_j\|_H^2 + 2\sum_{j=1}^{n}( q_{n+j} - p_{n+j},\, T_j q - T_j p )_H + r \sum_{j=1}^{n}\|T_j q - T_j p\|_H^2 + r \sum_{j=1}^{n}\|B_j (T_0 q - T_0 p)\|_H^2 + \frac{1}{r}\sum_{j=1}^{n}\|q_{n+j} - p_{n+j}\|_H^2 + 2\sum_{j=1}^{n}( q_{n+j} - p_{n+j},\, B_j (T_0 q - T_0 p) )_H - 2\sum_{j=1}^{n}( q_{n+j} - p_{n+j},\, T_j q - T_j p )_H + r \sum_{j=1}^{n}\|B_j (T_0 q - T_0 p) - (T_j q - T_j p)\|_H^2.$$

Then, by virtue of (11),

$$\sum_{j=1}^{n}\|B_j \eta\|_H^2 = \|\eta\|_V^2 \qquad \forall\, \eta \in V. \qquad (40)$$

Taking into account this equation, we have

$$\frac{1-\tau_0 r}{\tau_0}\|T_0 q - T_0 p\|_V^2 + \sum_{j=1}^{n}\frac{1}{\tau_j}\|T_j q - T_j p\|_H^2 + \frac{1}{r}\sum_{j=1}^{n}\|T_{n+j} q - T_{n+j} p\|_H^2 + \delta_0 (A_0 q_0 - A_0 p_0, q_0 - p_0)_V + \sum_{j=1}^{n}\delta_j (A_j q_j - A_j p_j, q_j - p_j)_H + \sum_{j=1}^{n}\frac{1}{(1-\tau_j r)\tau_j}\|(1-\tau_j r)[(q_j - T_j q) - (p_j - T_j p)] - \tau_j (A_j q_j - A_j p_j)\|_H^2 + \frac{1}{(1-\tau_0 r)\tau_0}\|(1-\tau_0 r)[(q_0 - T_0 q) - (p_0 - T_0 p)] - \tau_0 (A_0 q_0 - A_0 p_0)\|_V^2 + r \sum_{j=1}^{n}\|(q_j - B_j T_0 q) - (p_j - B_j T_0 p)\|_H^2 \le \frac{1-\tau_0 r}{\tau_0}\|q_0 - p_0\|_V^2 + \sum_{j=1}^{n}\frac{1}{\tau_j}\|q_j - p_j\|_H^2 + \frac{1}{r}\sum_{j=1}^{n}\|q_{n+j} - p_{n+j}\|_H^2,$$

i.e., inequality (33) is true. The proof of the theorem is complete. ⊓⊔

Recall (see [6]) that the operator $T : Q \to Q$ is called asymptotically regular if $T^{k+1} q - T^k q \to 0$ as $k \to +\infty$ for any $q \in Q$. The following theorem is valid.

Theorem 4. Let the operator $T$ have at least one fixed point, and let conditions (9)–(11), (32) hold. Then the iterative sequence $\{q^{(k)}\}_{k=0}^{+\infty}$ constructed according to (19) converges weakly in $Q$ as $k \to +\infty$ to $q^*$, a fixed point of the operator $T$; the relation

$$\lim_{k\to+\infty}\bigl\| y_j^{(k)} - B_j u^{(k)} \bigr\|_H = 0, \qquad j = 1, 2, \ldots, n, \qquad (41)$$

is valid; and the operator $T : Q \to Q$ is asymptotically regular, i.e.,

$$\lim_{k\to+\infty}\bigl\| q^{(k+1)} - q^{(k)} \bigr\|_Q = 0. \qquad (42)$$

Proof. We use inequality (33) with $q = q^{(k)}$, assuming that $p$ is a fixed point of the operator $T$ (the existence of at least one fixed point is guaranteed by the assumptions of the theorem). Since $Tq^{(k)} = q^{(k+1)}$ by the definition of the iterative sequence, $p_j = T_j p$, $j = 0, 1, 2, \ldots, n$, for a fixed point, and, by Theorem 1, $p_j = B_j T_0 p = B_j p_0$, $j = 1, 2, \ldots, n$, we have

$$\bigl\| q^{(k+1)} - p \bigr\|_Q^2 + \delta_0 \bigl( A_0 u^{(k)} - A_0 p_0,\, u^{(k)} - p_0 \bigr)_V + \sum_{j=1}^{n}\delta_j \bigl( A_j y_j^{(k)} - A_j p_j,\, y_j^{(k)} - p_j \bigr)_H + \frac{1}{\tau_0 (1-\tau_0 r)}\bigl\| (1-\tau_0 r)\bigl( u^{(k)} - u^{(k+1)} \bigr) - \tau_0 \bigl( A_0 u^{(k)} - A_0 p_0 \bigr) \bigr\|_V^2 + \sum_{j=1}^{n}\frac{1}{\tau_j (1-\tau_j r)}\bigl\| (1-\tau_j r)\bigl( y_j^{(k)} - y_j^{(k+1)} \bigr) - \tau_j \bigl( A_j y_j^{(k)} - A_j p_j \bigr) \bigr\|_H^2 + r \sum_{j=1}^{n}\bigl\| y_j^{(k)} - B_j u^{(k+1)} \bigr\|_H^2 \le \bigl\| q^{(k)} - p \bigr\|_Q^2.$$

This inequality implies that the numerical sequence $\bigl\{ \| q^{(k)} - p \|_Q \bigr\}_{k=0}^{+\infty}$ is nonincreasing and hence has a finite limit:

$$\lim_{k\to+\infty}\bigl\| q^{(k)} - p \bigr\|_Q < +\infty,$$

therefore

$$\lim_{k\to+\infty}\bigl( A_0 u^{(k)} - A_0 p_0,\, u^{(k)} - p_0 \bigr)_V = 0, \qquad (43)$$

$$\lim_{k\to+\infty}\bigl( A_j y_j^{(k)} - A_j p_j,\, y_j^{(k)} - p_j \bigr)_H = 0, \quad j = 1, 2, \ldots, n, \qquad (44)$$

$$\lim_{k\to+\infty}\bigl\| y_j^{(k)} - B_j u^{(k+1)} \bigr\|_H = 0, \quad j = 1, 2, \ldots, n, \qquad (45)$$

$$\lim_{k\to+\infty}\bigl\| (1-\tau_0 r)\bigl( u^{(k)} - u^{(k+1)} \bigr) - \tau_0 \bigl( A_0 u^{(k)} - A_0 p_0 \bigr) \bigr\|_V = 0, \qquad (46)$$

$$\lim_{k\to+\infty}\bigl\| (1-\tau_j r)\bigl( y_j^{(k)} - y_j^{(k+1)} \bigr) - \tau_j \bigl( A_j y_j^{(k)} - A_j p_j \bigr) \bigr\|_H = 0, \quad j = 1, 2, \ldots, n. \qquad (47)$$

By using (9), (10), (43) and (44), we obtain

$$\lim_{k\to+\infty}\bigl\| A_0 u^{(k)} - A_0 p_0 \bigr\|_V = 0, \qquad \lim_{k\to+\infty}\bigl\| A_j y_j^{(k)} - A_j p_j \bigr\|_H = 0, \quad j = 1, 2, \ldots, n. \qquad (48)$$

It follows from (46)–(48) that

$$\lim_{k\to+\infty}\bigl\| u^{(k)} - u^{(k+1)} \bigr\|_V = 0, \qquad \lim_{k\to+\infty}\bigl\| y_j^{(k)} - y_j^{(k+1)} \bigr\|_H = 0, \quad j = 1, 2, \ldots, n. \qquad (49)$$

Further, by using (40), (45), (49), from the inequality

$$\bigl\| y_j^{(k)} - B_j u^{(k)} \bigr\|_H \le \bigl\| y_j^{(k)} - B_j u^{(k+1)} \bigr\|_H + \bigl\| B_j \bigl( u^{(k)} - u^{(k+1)} \bigr) \bigr\|_H \le \bigl\| y_j^{(k)} - B_j u^{(k+1)} \bigr\|_H + \Bigl( \sum_{i=1}^{n}\bigl\| B_i \bigl( u^{(k)} - u^{(k+1)} \bigr) \bigr\|_H^2 \Bigr)^{1/2} = \bigl\| y_j^{(k)} - B_j u^{(k+1)} \bigr\|_H + \bigl\| u^{(k)} - u^{(k+1)} \bigr\|_V, \quad j = 1, 2, \ldots, n,$$

we obtain (41). It follows from (17) and (41) that

$$\lim_{k\to+\infty}\bigl\| \lambda_j^{(k+1)} - \lambda_j^{(k)} \bigr\|_H = r \lim_{k\to+\infty}\bigl\| y_j^{(k+1)} - B_j u^{(k+1)} \bigr\|_H = 0, \quad j = 1, 2, \ldots, n. \qquad (50)$$

Relations (49), (50) imply that condition (42) is satisfied, i.e., $T$ is an asymptotically regular operator. Since, in addition, by the assumptions of the theorem the operator $T$ has a nonempty set of fixed points and, by Theorem 3, is nonexpansive, it follows from [7] that the iterative sequence $\{q^{(k)}\}_{k=0}^{+\infty}$ constructed by (19) converges weakly in $Q$ as $k \to +\infty$. Its limit $q^*$ is a fixed point of the operator $T$. The proof of the theorem is complete. ⊓⊔

Note that if the assumptions of Theorem 1 are valid, then it follows from Theorems 2 and 4 that the sequences $\{u^{(k)}\}_{k=0}^{+\infty}$ and $\{y_j^{(k)}\}_{k=0}^{+\infty}$ constructed by (12)–(14) converge weakly to $u$ and $B_j u$, $j = 1, 2, \ldots, n$, in $V$ and $H$, respectively, as $k \to +\infty$, where $u$ is a solution of variational inequality (8).

4  Application of the iterative method to the problem (1)–(3)

Let us apply the suggested iterative method (12)–(14) to the problem (1)–(3). Since in (14) the calculations are performed by explicit formulas, it is sufficient to consider only the problems (12), (13). Since $F_0$ is a linear functional, the variational inequality (12) can be rewritten in the standard way in the form

$$\frac{1}{\tau_0}\bigl( u^{(k+1)} - u^{(k)},\, \eta \bigr)_V + \Bigl( A_0 u^{(k)} - \hat f + r u^{(k)} + \sum_{j=1}^{n} B_j^* \bigl( \lambda_j^{(k)} - r y_j^{(k)} \bigr),\, \eta \Bigr)_V = 0 \qquad \forall\, \eta \in V,$$

where the element $\hat f \in V$ is defined by the formula

$$(\hat f, \eta)_V = \int_\Omega (f, \eta)\, dx, \qquad \eta \in V.$$

Thus the first step of the iterative process can be reduced to solving $n$ Dirichlet problems for the Poisson equation. Further, for each $j = 1, 2, \ldots, n$, let us rewrite variational inequality (13) in the form

$$\bigl( y_j^{(k+1)},\, z - y_j^{(k+1)} \bigr)_H + G_j(z) - G_j\bigl( y_j^{(k+1)} \bigr) \ge 0 \qquad \forall\, z \in H, \qquad (51)$$

where

$$G_j(z) = \tau_j F_j(z) - \Bigl( y_j^{(k)} - \tau_j \bigl[ A_j y_j^{(k)} - \lambda_j^{(k)} + r \bigl( y_j^{(k)} - B_j u^{(k+1)} \bigr) \bigr],\, z \Bigr)_H.$$

By using the definition of a proximal mapping, we obtain that the variational inequality (51) is equivalent to the following minimization problem:

$$\frac12 \| z \|_H^2 + G_j(z) \ge \frac12 \bigl\| y_j^{(k+1)} \bigr\|_H^2 + G_j\bigl( y_j^{(k+1)} \bigr) \qquad \forall\, z \in H,$$

or

$$\frac{1}{2\tau_j}\| z \|_H^2 + F_j(z) - \frac{1}{2\tau_j}\bigl\| y_j^{(k+1)} \bigr\|_H^2 - F_j\bigl( y_j^{(k+1)} \bigr) \ge \frac{1}{\tau_j}\Bigl( y_j^{(k)} - \tau_j \bigl[ A_j y_j^{(k)} - \lambda_j^{(k)} + r \bigl( y_j^{(k)} - B_j u^{(k+1)} \bigr) \bigr],\, z - y_j^{(k+1)} \Bigr)_H \qquad \forall\, z \in H,$$

i.e.,

$$\frac{1}{\tau_j}\Bigl( y_j^{(k)} - \tau_j \bigl[ A_j y_j^{(k)} - \lambda_j^{(k)} + r \bigl( y_j^{(k)} - B_j u^{(k+1)} \bigr) \bigr] \Bigr) \in \partial \hat F_j\bigl( y_j^{(k+1)} \bigr), \qquad (52)$$

where

$$\hat F_j(z) = \frac{1}{2\tau_j}\| z \|_H^2 + F_j(z).$$

It is known (see [5]) that $p \in \partial \hat F_j(z)$ if and only if $z \in \partial \hat F_j^*(p)$, where $\hat F_j^*$ is the functional conjugate to $\hat F_j$ (see, e.g., [5]). So the inclusion (52) is equivalent to the following one:

$$y_j^{(k+1)} \in \partial \hat F_j^*\Bigl( \frac{1}{\tau_j}\Bigl( y_j^{(k)} - \tau_j \bigl[ A_j y_j^{(k)} - \lambda_j^{(k)} + r \bigl( y_j^{(k)} - B_j u^{(k+1)} \bigr) \bigr] \Bigr) \Bigr). \qquad (53)$$

Since

$$\hat F_j(z) = \int_\Omega \int_0^{|z|} g_{\tau_j}(\xi)\, d\xi\, dx, \qquad g_{\tau_j}(\xi) = \begin{cases} \xi/\tau_j, & \xi \le \beta_j, \\ \xi/\tau_j + \vartheta_j, & \xi > \beta_j, \end{cases}$$

it is not difficult to check that

$$\hat F_j^*(z) = \int_\Omega \int_0^{|z|} \varphi_{\tau_j}(\xi)\, d\xi\, dx, \qquad \varphi_{\tau_j}(\xi) = \begin{cases} \tau_j \xi, & \xi \le \beta_j/\tau_j, \\ \beta_j, & \beta_j/\tau_j < \xi \le \beta_j/\tau_j + \vartheta_j, \\ \tau_j (\xi - \vartheta_j), & \xi > \beta_j/\tau_j + \vartheta_j \end{cases}$$

($\varphi_{\tau_j}$ is the generalized inverse of $g_{\tau_j}$; in particular, the last branch $\tau_j(\xi - \vartheta_j)$ makes $\varphi_{\tau_j}$ continuous).

Then we obtain that the functional $\hat F_j^*$ is Gâteaux differentiable; moreover,

$$\bigl( \hat F_j^* \bigr)'(z) = \varphi_{\tau_j}(|z|)\, \frac{z}{|z|},$$

hence, by virtue of Proposition 5.3 of [5], the subdifferential $\partial \hat F_j^*(z)$ contains a unique element, coinciding with $\bigl( \hat F_j^* \bigr)'(z)$. So the calculations in (53) are performed by explicit formulas.
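To illustrate how the explicit formula in (53) is evaluated pointwise, here is a small Python sketch of $y = \varphi_{\tau}(|z|)\, z/|z|$. This is our own illustration, not code from the paper; the parameter values in the usage are hypothetical, chosen only to exercise all three branches of $\varphi_{\tau}$ (note that $\varphi_{\tau}$ is continuous across both breakpoints):

```python
def phi(xi, tau, beta, theta):
    # Piecewise function phi_tau from the text; tau, beta, theta play the
    # roles of tau_j, beta_j, vartheta_j.
    if xi <= beta / tau:
        return tau * xi
    if xi <= beta / tau + theta:
        return beta                  # flat part: the jump of g_tau inverted
    return tau * (xi - theta)

def prox_step(z, tau, beta, theta):
    """Evaluate y = phi(|z|) * z / |z| pointwise (y = 0 when z = 0)."""
    a = abs(z)
    return 0.0 if a == 0 else phi(a, tau, beta, theta) * (z / a)

# hypothetical parameters: tau = 0.5, beta = 1.0, theta = 2.0
# breakpoints at xi = beta/tau = 2.0 and xi = beta/tau + theta = 4.0
```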

References

1. I.B. Badriev, O.A. Zadvornov, A Decomposition Method for Variational Inequalities of the Second Kind with Strongly Inverse-Monotone Operators, Differential Equations, Pleiades Publishing, Inc., 2003, 39, pp. 936–944.
2. O.A. Zadvornov, On the Convergence of the Semi-implicit Method for Solving Variational Inequalities of the Second Kind, Izvestiya Vuzov. Matematika, 2005, 6, pp. 61–70 (in Russian).
3. I.B. Badriev, O.A. Zadvornov, On the Convergence of a Dual-Type Iterative Method for Mixed Variational Inequalities, Differential Equations, Pleiades Publishing, Inc., 2006, 42(8), pp. 1180–1188.
4. I.B. Badriev, O.A. Zadvornov, A.M. Saddek, Convergence Analysis of Iterative Methods for Some Variational Inequalities with Pseudomonotone Operators, Differential Equations, Pleiades Publishing, Inc., 2001, 37(7), pp. 934–942.
5. Ekeland I., Temam R., Convex Analysis and Variational Problems, North-Holland Publishing Company, Amsterdam, 1976.
6. Browder F.E., Petryshyn W.V., The solution by iteration of nonlinear functional equations in Banach spaces, Bull. Amer. Math. Soc., 1966, V. 72, pp. 571–575.
7. Opial Z., Weak convergence of the sequence of successive approximations for nonexpansive mappings, Bull. Amer. Math. Soc., 1967, V. 73, pp. 591–597.
8. Gajewski H., Gröger K., Zacharias K., Nichtlineare Operatorgleichungen und Operatordifferentialgleichungen, Berlin: Akademie-Verlag, 1974.

Multilevel Algorithm for Graph Partitioning

N. S. Bochkarev, O. V. Diyankov, and V. Y. Pravilnikov
Neurok TechSoft LLC, Russia
diyankov@aconts.com

Abstract. A class of multilevel algorithms for partitioning a sparse matrix prior to the parallel solution of a system of linear equations is described. This matrix partitioning problem can be described in terms of a graph partitioning problem, which is known to be NP-hard, so several heuristics for its solution have been proposed in the past decades. For this purpose we use the multilevel algorithm proposed by B. Hendrickson and R. Leland [2] and further developed by G. Karypis and V. Kumar [3]. This algorithm is very efficient and tends to produce high-quality partitionings for a wide range of matrices arising in many practical applications.

Keywords: graph partitioning, parallel computations, load balancing.

1  Introduction

Efficient algorithms for graph partitioning are critical for scientific simulations on high-performance parallel computers. For example, the parallel iterative solution of a linear system of equations

Ax = b,

where A is a large sparse matrix, b is a right-hand side and x is a vector of unknowns, is based on a partitioning of the matrix A. The main purpose of the partitioning procedure is to divide the matrix A into the required number of parts (stripes) in such a way that each part has approximately the same number of rows and the number of interprocess communications performed during the parallel solution is kept as small as possible. This class of problems can be strictly described in terms of the graph partitioning problem (see Sec. 2), which is known to be NP-hard, so several heuristics for its solution have been developed in the past decades. They can be subdivided into three main categories. The first one contains the so-called spectral algorithms. While achieving partitionings of very good quality, they require a large amount of hardware resources (CPU cycles and memory) because of the necessity to find the eigenvector corresponding to the second smallest eigenvalue of the Laplacian matrix of the adjacency graph of A (the Fiedler vector). The second group contains greedy algorithms, which find graph partitionings by sequentially adding nodes to growing subsets following some greedy strategy, such as minimizing the number of cut edges (see Sec. 2) at each step. The third group contains multilevel (ML) algorithms, which are among the best in terms of partitioning quality and computational resource requirements; this is very important as problems become larger. The ML approach itself was originally proposed by B. Hendrickson and R. Leland [2] and further developed by G. Karypis and V. Kumar [3] in their METIS package. In this paper we describe its analogue and present numerical test results.

The remainder of the paper is organized as follows. In Section 2 we define the graph partitioning problem. In Section 3 the main idea behind multilevel techniques is demonstrated. In Sections 4, 5, and 6 we describe in detail the different phases of the multilevel approach: coarsening, initial partitioning, and uncoarsening, respectively. In Section 7 we present a variant of the original ML approach which is called the Cell-Based Multilevel (CBML) approach and describe its advantages. Section 8 presents numerical test results. Section 9 provides a summary of the tests.

Fig. 1. Sparse matrix and its adjacency graph.

2  The problem statement

It is well known that the nonzero pattern of a sparse matrix can be represented by its adjacency graph (see Fig. 1). Namely, given a square n × n sparse matrix A containing nz nonzero entries, its adjacency graph is G = (V, E), where V is the set of nodes corresponding to the rows of A (|V| = n), and E is the set of edges corresponding to the nonzero entries of A (|E| = nz). The graph is undirected when A is symmetric and directed otherwise. In the following paragraphs we assume that A is symmetric. This restriction is easy to fulfil by considering the matrix A* = A + A^T and its adjacency graph instead of A, since the exact values of the matrix's nonzero entries are unimportant.

The k-way graph partitioning problem is formulated as follows: partition V into k disjoint subsets {V_1, V_2, ..., V_k} such that |V_i| ≈ |V|/k for i = 1..k (load-balancing condition), while minimizing the number of edges whose incident nodes belong to different partitions (cut-size minimization condition) (see Fig. 1). These edges are called cut edges, and their number is called the cut size. The problem can be trivially extended to graphs with weights assigned to the nodes and edges (see [3]). When the number of parts is a power of 2, i.e. k = 2^p, the problem is frequently solved in a recursive bisection fashion. Namely, we first obtain a 2-way partitioning of our graph: V = V_1 ∪ V_2. Then we recursively apply the same procedure to the subgraphs of G induced by V_1 and V_2. After p steps the original graph is partitioned into k parts. It is worth noting that this approach often works worse than the original k-way partitioning approach when k > 2, but it is still frequently used due to its simplicity. In this paper we describe the original k-way partitioning approach and propose some improvements. We estimate partitioning quality depending on the degree to which the load-balancing and cut-size minimization conditions are fulfilled.
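The two conditions can be made concrete with a few lines of code. A minimal Python sketch (the adjacency-list representation and the function names are ours, not from the paper):

```python
def cut_size(adj, part):
    """Number of edges whose endpoints lie in different partitions.

    adj:  dict node -> iterable of neighbour nodes (undirected graph,
          so each edge appears twice); part: dict node -> partition id.
    """
    cut = sum(1 for u in adj for v in adj[u] if part[u] != part[v])
    return cut // 2  # each cut edge was counted from both endpoints

def imbalance(part, k):
    """Largest partition size divided by the ideal size |V|/k."""
    sizes = [0] * k
    for p in part.values():
        sizes[p] += 1
    return max(sizes) * k / len(part)

# 2-way partitioning of the 4-cycle 1-2-3-4-1
adj = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
part = {1: 0, 2: 0, 3: 1, 4: 1}
assert cut_size(adj, part) == 2   # edges (2,3) and (4,1) are cut
assert imbalance(part, 2) == 1.0  # perfectly balanced
```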

3  Multilevel k-way graph partitioning

The whole procedure may be depicted by Fig. 2. It consists of the following main phases:

Fig. 2. Multilevel graph partitioning algorithm.

1. Coarsening phase. During the coarsening phase, a sequence of smaller graphs {G_1, G_2, ..., G_m} is constructed until the number of nodes in the coarsest graph G_m becomes less than some predefined value (around a few hundred). The number of graphs in this sequence is called the coarsening depth. We use a special parameter ν to control the coarsening depth: namely, we keep building a coarser graph G_{i+1} from a finer one G_i while |V_i| > ν · |V_1|. Each graph G_i forms a layer of coarsening; that is why this approach is called "multilevel". At each layer, possibly except the first one, weights are assigned to the nodes and edges of the graphs (see Sec. 4) so that a partitioning of the coarsest graph remains good with respect to the original one. There are many ways to construct a coarser graph from a finer one; we use the edge-collapsing technique based on matchings (see Sec. 4).

2. Initial partitioning phase. During the initial partitioning phase, a high-quality partitioning of the coarsest graph G_m is computed. Since the number of nodes of G_m is small compared with that of G_1, this phase can be accomplished very quickly; actually, it takes about 10% of the total partitioning time. There exist many algorithms to do this (see [3], [5]). We use the Restarted Greedy Graph Growing (RGGG) algorithm, which is described in detail in Sec. 5.

3. Uncoarsening phase with refinement. During this phase, the partitioning just found for the coarsest graph G_m is projected back to the original graph G_1 by going through the set of intermediate graphs. On each layer the projected partitioning is refined. There are many local refinement algorithms intended for this. It is worth mentioning the Kernighan-Lin refinement algorithm [5] and its linear-time variant, the Fiduccia-Mattheyses refinement algorithm [1]. In our multilevel approach we use a variant of the original Fiduccia-Mattheyses local refinement algorithm, which we call the boundary Fiduccia-Mattheyses local refinement algorithm.
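The three phases combine into a simple driver loop. A schematic Python sketch (the adjacency-dict representation and the signatures of coarsen, initial_partition and refine are our own assumptions; these callbacks stand in for the algorithms of Secs. 4-6):

```python
def multilevel_partition(adj, k, coarsen, initial_partition, refine,
                         min_size=200):
    """Generic multilevel V-cycle (schematic).

    adj: dict node -> list of neighbours.  coarsen returns a coarser
    adjacency dict plus a mapping fine-node -> coarse-node."""
    levels, mappings = [adj], []
    # 1. Coarsening phase: build the sequence G1, G2, ..., Gm.
    while len(levels[-1]) > min_size:
        coarse, mapping = coarsen(levels[-1])
        if len(coarse) == len(levels[-1]):
            break                       # no further reduction possible
        levels.append(coarse)
        mappings.append(mapping)
    # 2. Initial partitioning of the coarsest graph Gm.
    part = initial_partition(levels[-1], k)
    # 3. Uncoarsening with refinement, level by level.
    for fine, mapping in zip(reversed(levels[:-1]), reversed(mappings)):
        part = {u: part[mapping[u]] for u in fine}  # projection
        part = refine(fine, part, k)
    return part
```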

4  Coarsening phase

Given a weighted graph G_i with weights assigned to the nodes and the edges, the next-level coarser graph G_{i+1} is constructed from it by merging some subsets of its nodes {ν_{j1}^{(i)}, ν_{j2}^{(i)}, ..., ν_{jk}^{(i)}} (ancestors) into multinodes ν_j^{(i+1)} (descendants). The weight of ν_j^{(i+1)} equals the sum of the weights of {ν_{j1}^{(i)}, ν_{j2}^{(i)}, ..., ν_{jk}^{(i)}}. When more than one node of {ν_{j1}^{(i)}, ν_{j2}^{(i)}, ..., ν_{jk}^{(i)}} contains edges incident to the same node u ∉ {ν_{j1}^{(i)}, ν_{j2}^{(i)}, ..., ν_{jk}^{(i)}}, the weight of the edge (ν_j^{(i+1)}, u) equals the sum of the weights of these edges. Obviously, a coarser graph can be constructed from a finer one in many different ways. For matrices with unstructured nonzero patterns it seems reasonable to use a coarsening procedure based on collapsing together edges of G that form a matching, because of the necessity to preserve the connectivity structure of the original graph in the coarsest one. Recall that a matching in a graph (weighted or unweighted) is a subset of its edges no two of which are incident to the same node. A matching is called maximal if it is impossible to add one more edge to it so that the resulting subset of edges still forms a matching. A maximal matching that has the maximum number of edges is called a maximum matching. Since the goal of the coarsening procedure is to decrease the size of the graph, the matching should contain a large number of edges. But the complexity of computing a maximum matching is higher than that of computing a maximal matching. That is why we construct maximal matchings during the coarsening phase. They can be generated very quickly using depth-first search [6] or a randomized algorithm. We implemented the following three types of matchings:

– Random matching (RM). This type of matching is very popular because of its simplicity and often gives good results. It is demonstrated by the following pseudocode:

Algorithm 3. Random Matching Algorithm
INPUT: graph G(V,E)
OUTPUT: matching M
1.  matching M = ∅;
2.  foreach( u ∈ V ) mask[u] = 0;
3.  foreach( u ∈ V ) {
4.    if( 0 == mask[u] ) {
5.      mask[u] = 1;
6.      if( exist v ∈ adj[u] such that 0 == mask[v] ) {
7.        mask[v] = 1;
8.        M ←− (u, v);
9.      }
10.   }
11. }

Initially, the matching is empty (line 1) and all nodes are unmasked (line 2). Then the nodes are visited in random order (line 3). If node u is already masked, it is skipped. Otherwise it is masked (lines 4, 5), and then we arbitrarily select an adjacent unmasked node v, if such a node exists (line 6), mask it (line 7), and add the edge (u, v) to the matching (line 8). Obviously, this algorithm has linear time complexity with respect to the number of nodes, i.e., O(|V|).

– Heavy-Edge Matching (HEM). As in the previous algorithm, the nodes are visited in random order. But now we select the unmasked node v adjacent to u in such a way that the weight of the edge (u, v) is maximal over all unmatched adjacent edges. The algorithm can be illustrated by the following pseudocode:

Algorithm 4. Heavy-Edge Matching Algorithm
INPUT: graph G(V,E)
OUTPUT: matching M
1.  matching M = ∅;
2.  foreach( u ∈ V ) mask[u] = 0;
3.  foreach( u ∈ V ) {
4.    if( 0 == mask[u] ) {
5.      mask[u] = 1;
6.      if( exist v ∈ adj[u] such that 0 == mask[v] and w(u,v) −→ max ) {
7.        mask[v] = 1;
8.        M ←− (u, v);
9.      }
10.   }
11. }

This algorithm has linear time complexity with respect to the number of edges, i.e., O(|E|).

– Heavy-Clique Matching (HCM). In this section we describe our version of the heavy-clique matching algorithm. The algorithm can be effective for graphs with a few highly-connected components [6]. In [3] one can find a variant of the HCM algorithm based on the concept of edge density. In contrast to that algorithm, we developed our own. Recall that for an undirected graph G one can define the degree of a node [6], which is the number of edges incident to the node. The algorithm can be demonstrated by the following pseudocode:

Algorithm 5. Heavy-Clique Matching Algorithm
INPUT: graph G(V,E)
OUTPUT: matching M
1. matching M = ∅;
2. for( i = 1; G ≠ ∅; i++ ) {
3.   if( min_{u∈V}(deg[u]) == 1 ) break;
4.   build G_i from G such that min_{u∈V_i}(deg[u]) is as large as possible;
5.   G = G\G_i;
6. }
7. foreach( G_i ) M ←− HEM(G_i);
8. M ←− HEM(G);

Initially, the matching is empty (line 1). In line 4 we try to build a subgraph G_i of the given graph G with the following property: min_{u∈V_i}(deg[u]) is as large as possible, where deg[u] is the degree of node u. In other words, we try to extract from G a subgraph in which the minimal degree of a node is as large as possible. The operation in line 4 can be implemented in O(|E|) by the algorithm described below (see the Maximal Minimum Degree Subgraph Extraction algorithm). Then we perform the same procedure for the subgraph of G induced by the set of nodes V\V_i until the condition in line 3 is satisfied or the subgraph becomes empty (loop in lines 2-6). As a result, we obtain a sequence of graphs {G_1, G_2, ..., G_q} and the remaining part of the input graph, which consists of isolated nodes or isolated pairs of nodes. After that we build the required matching M as the union of heavy-edge matchings for all graphs G_i and G.

Let us consider the algorithm which demonstrates how we can build a subgraph of G with maximal minimum degree (the operation in line 4) in O(|E|) time:

Algorithm 6. Maximal Minimum Degree Subgraph Extraction
INPUT: graph G(V,E)
OUTPUT: subgraph G* ⊆ G
1.  sort the nodes of G in degree-ascending order;
2.  for( u = 0; u < |V|; u++ ) {
3.    save D[u] ←− deg[u];
4.    G = G\u;
5.    recalculate the degrees of the remaining nodes;
6.    maintain the degree-ascending order of the nodes;
7.  }
8.  find u* : D[u*] = max_u(D[u]);
9.  while( u < u* ) G = G\{u};
10. G* = G;

In line 1 we sort the nodes of G in degree-ascending order. In the loop in lines 2-7 we visit the sorted nodes one at a time, take the node with minimum degree, save its degree, and then exclude it together with its incident edges from G; it is necessary to recalculate the degrees of the remaining nodes and maintain their degree-ascending order. In line 8 we find the maximal value among all degrees that were saved in line 3 and the corresponding node u*. The output of the algorithm is then obtained by removing from G the nodes which were visited before u* in the loop in lines 2-7.

We have considered three algorithms for matching generation; in [3] one can find others. When the edges are unweighted (or all have the same weight), it seems reasonable to use RM. In order to construct a coarser graph from a finer one, we need to collapse together the matched edges. This procedure can also be implemented in O(|E|). It is worth noting that the coarsening phase usually takes about 80% of the total partitioning time.
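As an executable illustration of the heavy-edge matching idea, here is a compact Python rendering (the dict-of-dicts weighted adjacency representation is our own choice, not the paper's data structure):

```python
import random

def heavy_edge_matching(adj, seed=0):
    """HEM sketch: visit nodes in random order and match each unmasked
    node with its heaviest unmasked neighbour, if any.

    adj: dict u -> dict v -> edge weight (undirected, symmetric)."""
    rng = random.Random(seed)
    nodes = list(adj)
    rng.shuffle(nodes)                   # random visiting order
    matched = set()
    matching = []
    for u in nodes:
        if u in matched:
            continue
        matched.add(u)
        candidates = [v for v in adj[u] if v not in matched]
        if candidates:
            v = max(candidates, key=lambda v: adj[u][v])  # heaviest edge
            matched.add(v)
            matching.append((u, v))
    return matching
```

The returned edge list is a maximal matching of the visited graph; collapsing its edges produces the next coarser level.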

5  Initial partitioning phase

During this phase a high-quality partitioning of the coarsest graph G_m is constructed. Since G_m has quite a small number of nodes, this phase takes quite a small amount of time. We use the Restarted Greedy Graph Growing algorithm, which can be outlined by the following pseudocode:

Algorithm 7. Restarted Greedy Graph Growing Algorithm
INPUT: graph G, the number of parts k, the number of restarts rests
OUTPUT: partitioning of G into k parts
1.  while( rests−− ) {
2.    put all nodes in partition P0;
3.    for( i = 1; i < k; i++ ) {
4.      randomly select u ∈ P0 and put it in Pi;
5.      while( size[Pi] < |V|/k ) {
6.        select u ∈ P0 such that cutsize −→ min;
7.        move u from P0 to Pi;
8.      }
9.    }
10.   save partitioning;
11. }

At the beginning of each restart we put all nodes in partition P0 (line 2). In order to construct partition Pi, we first randomly select a node from partition P0 and put it in partition Pi (the growing subset), which was empty before that (line 4). Then we sequentially move nodes from P0 to Pi in such a way that each movement results in the smallest possible increase in the cut size (lines 6, 7). We continue this until the size of Pi becomes greater than or equal to |V|/k. Then we try to construct partition P_{i+1} in the same way (loop in lines 3-9). Obviously, in order to construct a k-way partitioning of the graph we must construct k − 1 partitions; after that, the nodes remaining in partition P0 form the missing k-th partition. After the required number of restarts is finished, we take the best partitioning found as the result. This procedure can be used as a standalone partitioner, but greedy algorithms often give partitionings of poor quality, and the required amount of time often exceeds the amount of time required by the multilevel partitioner.
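A single restart of the greedy growing step can be sketched as follows (a simplified, unweighted Python rendering of the inner loop; the names and representation are ours):

```python
import random

def greedy_grow(adj, k, seed=0):
    """One restart of greedy graph growing: grow partitions 1..k-1 out
    of P0, each time moving the node that increases the cut size least."""
    rng = random.Random(seed)
    part = {u: 0 for u in adj}
    target = len(adj) // k
    for p in range(1, k):
        pool = [u for u in adj if part[u] == 0]
        part[rng.choice(pool)] = p          # random seed node for Pp
        while sum(1 for u in adj if part[u] == p) < target:
            pool = [u for u in adj if part[u] == 0]
            if not pool:
                break
            # cut-size increase if u moves from P0 to Pp: edges back into
            # P0 become cut, edges into Pp stop being cut
            def delta(u):
                return (sum(1 for v in adj[u] if part[v] == 0) -
                        sum(1 for v in adj[u] if part[v] == p))
            part[min(pool, key=delta)] = p
    return part
```

A full RGGG would repeat this for several restarts and keep the partitioning with the smallest cut.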

6

Uncoarsening with refinement phase

This phase onsists of the following two steps. First, partitioning of the oarsest graph is proje ted ba k to the original graph by going through intermediate graphs. Sin e ea h node of Gi+1 is formed by a distin t subset of nodes of Gi , the proje tion is trivial to realize. Namely, we an derive partitioning of Gi from (i) (i) partitioning of Gi+1 by assigning to the set of nodes fν(i) j1 , νj2 , ..., νjk g that

ollapsed into ν(i+1) the partition number that holds ν(i+1) . The next step is j j a re nement pro edure for just found partitioning of Gi. We use a modi ation of the original lo al Fidu

ia-Mattheyses re nement algorithm [1℄ whi h we

all boundary Fidu

ia-Mattheyses re nement algorithm. The entral on ept behind re nement algorithms is the on ept of gain of a node. Given a node u whi h belongs to the partition Pi , the gain of movement of node u from Pi to

442

N. S. Bo hkarev, O. V. Diyankov, and V. Y. Pravilnikov

Pj (i 6= j) is given by the following formula: X X gain(Pi −→ Pj ) = w(u,v) − w(u,v) , u

v⊂Pj

v⊂Pi

where w(u,v) is the weight of the edge (u, v). In other words, gain(Pi −→ Pj ) u gives the de rease in the ut size we obtain after the movement is performed. The boundary Fidu

ia-Mattheyses re nement algorithm may be outlined by the followig pseudo ode:

Algorithm 8. Boundary Fiduccia-Mattheyses Refinement Algorithm
INPUT: graph G(V, E), the number of parts k, the number of restarts rests
OUTPUT: partitioning of G into k parts
 1. while( rests-- ) {
 2.   unlock all nodes V;
 3.   for( p = 0; p < k; p++ ) {
 4.     put all boundary nodes from P_p into PQ_p;
 5.   }
 6.   while( not all PQs are empty ) {
 7.     from all PQs find i, j and u ∈ P_i such that gain_u(P_i −→ P_j) −→ max;
 8.     if( gain_u(P_i −→ P_j) < 0 ) break;
 9.     if( after the movement P_i and P_j are still balanced ) {
10.       move u: P_i −→ P_j;
11.       adjust gains of all unlocked nodes v ∈ adj[u];
12.     }
13.     else {
14.       remove u from PQ_i;
15.     }
16.   }
17. }

The boundary Fiduccia-Mattheyses refinement algorithm is iterative in nature. The number of iterations (or restarts) is controlled by the rests parameter. We maintain k priority queues that hold the boundary nodes of each partition that are allowed to move, i.e. the unlocked nodes (see [1] for an explanation). Initially all the nodes are unlocked (line 2). In lines 3-5 we initialize all queues with the boundary nodes of the corresponding partitions. As the key of a node we use the maximum gain over all allowable movements of that node, i.e. movements from P_i to P_j, i ≠ j. Then, in the loop in lines 6-16, we look at the tops of all queues and select the node with the maximum key value (line 7). After that we know all the information necessary to perform the movement just found. If the gain of the movement is negative, we exit the loop in lines 6-16, because only movements with positive gains can refine the partitioning. It is worth noting that, due to the load-balancing condition, such a movement may not be allowed (this is controlled by line 9). In this case we simply remove the node from its queue and repeat the procedure. We exit the loop in lines 6-16 only if there are no movements that preserve load balancing and decrease the cut size.

An advantage of this algorithm over the one described in [1] is its time complexity, which can be approximated by O(N* · S), where N* is the number of boundary nodes (nodes which have at least one adjacent node belonging to another partition) and S is the average sparsity.
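For concreteness, the gain of a single node movement can be computed as in the following sketch (pure Python; the dict-based graph layout and the helper name `gain` are illustrative assumptions, not the authors' code):

```python
# Sketch of the gain computation used by boundary Fiduccia-Mattheyses
# refinement.  The graph is a dict: node -> {neighbor: edge_weight};
# part[v] gives the partition index of node v.

def gain(graph, part, u, i, j):
    """Decrease in cut size if node u moves from partition i to partition j."""
    assert part[u] == i and i != j
    to_j = sum(w for v, w in graph[u].items() if part[v] == j)
    in_i = sum(w for v, w in graph[u].items() if part[v] == i)
    return to_j - in_i

# Tiny example: nodes 0, 1 in part 0; nodes 2, 3 in part 1.
graph = {
    0: {1: 1, 2: 2},
    1: {0: 1, 2: 1},
    2: {0: 2, 1: 1, 3: 1},
    3: {2: 1},
}
part = {0: 0, 1: 0, 2: 1, 3: 1}
# Moving node 0 to part 1 gains w(0,2) - w(0,1) = 2 - 1 = 1.
print(gain(graph, part, 0, 0, 1))  # 1
```

A negative gain (e.g. for node 3, whose only neighbor shares its partition) is exactly the situation in which line 8 of Algorithm 8 terminates the inner loop.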

7 Cell-based multilevel approach

In this section we introduce a new multilevel technique for sparse matrix partitioning. We call it the cell-based multilevel (CBML) partitioning algorithm. It is effective for large sparse matrices arising from discretizations of PDEs in which several unknowns are related to each grid cell. Let us consider such a matrix A. In the case when A is a multiblock matrix, we consider one of its blocks. The adjacency graph G of A then has a special connectivity structure: the set of its nodes V can be represented as V = ∪_μ V_μ, in which each subset V_μ has the following property: all the nodes u ∈ V_μ are indistinguishable (recall that two nodes v and u are called indistinguishable if adj[u] ∪ {u} = adj[v] ∪ {v}). This information about the connectivity pattern may be employed to find a better partitioning compared with that generated by the original algorithm. Namely, we can consider the so-called reduced graph G* of G, in which each node v ∈ V* corresponds to a subset V_μ, has the same connectivity structure as any node in V_μ, and carries the weight w[v] = |V_μ|. Then we apply the multilevel technique to the reduced graph G*. After a partitioning of G* is generated, a partitioning of G can be derived from it, since each node of G* is formed from a distinct subset of nodes of G. In the case when |V*|/|V| ≪ 1, this approach seems to produce better partitionings compared with those generated by the original multilevel algorithm applied to G (see Fig. 8 in the numerical test results section).
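The grouping of indistinguishable nodes that defines the reduced graph G* can be sketched as follows (a minimal illustration under an assumed adjacency-set representation; the helper `reduce_graph` is hypothetical):

```python
# Merge indistinguishable nodes, i.e. nodes u, v with
# adj[u] U {u} = adj[v] U {v}.  Each group V_mu becomes one node of the
# reduced graph with weight |V_mu|; its connectivity is inherited from
# any representative of the group.

def reduce_graph(adj):
    """adj: dict node -> set of neighbors.  Returns (groups, weights)."""
    groups = {}  # frozen closed neighborhood -> list of its nodes
    for u, nbrs in adj.items():
        key = frozenset(nbrs | {u})
        groups.setdefault(key, []).append(u)
    # One reduced node per group, represented by its smallest member.
    weights = {min(g): len(g) for g in groups.values()}
    return list(groups.values()), weights

# Nodes 0 and 1 are indistinguishable (closed neighborhood {0,1,2,3});
# nodes 2 and 3 are not (their closed neighborhoods differ).
adj = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1}, 3: {0, 1}}
groups, weights = reduce_graph(adj)
```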

8 Numerical test results

In this section we present the results of a comparison of our partitioner with the METIS package, which can be downloaded from http://www.glaros.dtc.umn.edu/gkhome/metis/metis/download.

All tests are performed on an Opteron 2.0 GHz with 2 Gb RAM running under SLES 9. We use the public XOM matrix collection, which can be downloaded from http://www.aconts.com/XOMMatrices. Table 1 summarizes the properties of the test matrices.

Table 1. Summary of the public XOM matrix collection

Problem       N        Z    Z/N    ZD    ND     SD      POD
CI-1     113465  1654732  14.58     0     2  23619   477535
CI-2      62449   460319   7.37     0     0    128      202
CIT-1     17436   344245  19.74     0  4207   7388    98092
CIT-2    249428  5613978  22.51    30  1323  16106  1024619
SBO-1     21700   145122   6.69     1     0      1        7
SBO-2    111756   888190   7.95     0     0      8       12
SBO-3    216051  1849317   8.56     0     0      0        0
SBO-4     93264   667882   7.16     0     0      0        0
SEO-1     22421   204784   9.13     0   180     94      849

Here N is the number of rows, Z is the number of nonzero entries, S = Z/N is the average sparsity, ZD is the number of zero diagonal entries, ND is the number of negative diagonal entries, SD is the number of "small" diagonal entries (i.e. |a_ii| < 0.01 · Σ_{j≠i} |a_ij|), and POD is the number of positive off-diagonal entries.
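The column statistics of Table 1 are straightforward to compute; a minimal sketch, assuming the sparse matrix is given as a dict of its nonzero entries (the helper `matrix_stats` is hypothetical, not the authors' tooling):

```python
# Compute the Table 1 statistics for a sparse matrix given as a dict
# (i, j) -> a_ij of nonzero entries over n rows.

def matrix_stats(entries, n):
    z = len(entries)
    zd = sum(1 for i in range(n) if (i, i) not in entries)
    nd = sum(1 for i in range(n) if entries.get((i, i), 0.0) < 0.0)
    rowsum = [0.0] * n          # sum of |a_ij| over off-diagonal j
    for (i, j), a in entries.items():
        if i != j:
            rowsum[i] += abs(a)
    sd = sum(1 for i in range(n)
             if abs(entries.get((i, i), 0.0)) < 0.01 * rowsum[i])
    pod = sum(1 for (i, j), a in entries.items() if i != j and a > 0.0)
    return {"N": n, "Z": z, "Z/N": z / n, "ZD": zd, "ND": nd,
            "SD": sd, "POD": pod}

A = {(0, 0): 4.0, (0, 1): -1.0, (1, 0): 1.0, (1, 1): -2.0,
     (2, 2): 0.001, (2, 0): 3.0}
stats = matrix_stats(A, 3)   # one negative and one "small" diagonal entry
```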

– Matching algorithms' impact on partitioning quality. The main observation behind these tests is that all algorithms for matching generation give good results with respect to partitioning quality and time requirements. But it seems reasonable to use "heavy" matchings (i.e. HEM or HCM) on "deep" layers of coarsening, where weights are assigned to the nodes and the edges. As the experiments show, HEM is the best algorithm, resulting in very good partitionings of the coarsest graph. In the tests described below, HEM is used as the default matching strategy. In Fig. 3 a comparison of different matching strategies for different numbers of parts (2,4,6,8,10,12,16,20,24,28,32) is presented for the problem CI-1.
– Coarsening depth parameter's impact on partitioning quality. As described in sec. 3, the depth of coarsening is controlled by a special parameter ν, i.e. we try to build a coarser graph G_{i+1} from a finer one G_i until |V_i| > ν · |V_1|, where |V_1| is the number of nodes of G_1. The smaller the parameter's value, the smaller the number of nodes in the coarsest graph, and hence the better the partitioning we can obtain after the initial partitioning phase. As our experiments show, small values of ν result in better initial partitionings. In this section we present the results of tests in which the coarsening depth parameter varies, for the problem CI-1 and for different numbers of parts (2,4,6,8,10,12,16,20,24,28,32).
– Comparison with the METIS package. In this section the results of a comparison of our partitioner (MLPT) with the METIS package are presented for all matrices and for different numbers of parts. In the tests we use the HEM algorithm to find matchings, and the value of the coarsening depth parameter is ν = 0.0001. We can conclude that our variant of the multilevel algorithm gen-

Multilevel Algorithm for Graph Partitioning

445

Fig. 3. Comparison of different matching algorithms (HEM, HCM, RM) and their impact on partitioning quality for different numbers of parts (2,4,6,8,10,12,16,20,24,28,32) for the problem CI-1. [Plot of log2(cutsize) versus the number of parts.]

Fig. 4. Influence of the coarsening depth parameter ν (0.0001, 0.001, 0.01, 0.1) on partitioning quality for different numbers of parts (2,4,6,8,10,12,16,20,24,28,32) for the problem CI-1. [Plot of log2(cutsize) versus the number of parts.]


Fig. 5. Comparison of the partitioning quality generated by MLPT with that generated by the METIS package for different numbers of parts (2,4,6,8,10,12,16,20,24,28,32). Here cutsize_MLPT is the cut size for MLPT and cutsize_METIS is the cut size for METIS. [Plot of cutsize_MLPT/cutsize_METIS versus the number of parts for all test problems.]

Fig. 6. Comparison of the partitioning time required by MLPT with that required by the METIS package for different numbers of parts (2,4,6,8,10,12,16,20,24,28,32). Here time_MLPT is the time required by MLPT and time_METIS is the time required by METIS. [Plot of time_MLPT/time_METIS versus the number of parts for all test problems.]


Fig. 7. Influence of partitioning quality on the performance of the matrix-vector product (MVP) operation on an MPI architecture for different numbers of parts (2,3,4,5,6,7,8). Here speedup is the ratio of serial time to parallel time required by the operation. [Plot of speedup per test problem.]

Fig. 8. Comparison of the cell-based multilevel algorithm with the original one for the problem CIT-2 for different numbers of parts (2,4,6,8,10,12,16,20,24,28,32). Here cutsize_CBMLPT is the cut size for CBMLPT and cutsize_MLPT is the cut size for MLPT. [Plot of cutsize_CBMLPT/cutsize_MLPT versus the number of parts.]


erates partitionings competitive with those generated by the METIS package. In Fig. 5 we compare the quality of the partitioning generated by our partitioner with that generated by the METIS package. In Fig. 6 we compare the required time.
– MPI matrix-vector product. When solving a large sparse linear system of equations via Krylov-type iterative methods on a machine with a distributed-memory architecture, it is very important to perform the matrix-vector product (MVP) operation as fast as possible. In Fig. 7 we present the impact of partitioning quality on the performance of the MVP operation on an MPI architecture for 2, 3, 4, 5, 6, 7, and 8 parts. We can conclude that there is a significant speedup for most matrices.
– Comparison of the CBML approach with the original one. In this section we present the advantages of the cell-based multilevel approach over the original one for the problem CIT-2. As mentioned earlier, this approach tends to generate good partitionings for problems arising from discretizations of PDEs in which several unknowns are related to each grid cell. Fig. 8 presents the result of the comparison. One can conclude that the CBML algorithm is preferable to the original one for such systems.

9 Conclusion

We evaluated the performance of our multilevel partitioner on a range of matrices arising from discretizations of PDEs. One can conclude that the multilevel technique works quite well. As mentioned earlier, the coarsening phase requires more than half of the total partitioning time. This demonstrates that, in order to effectively parallelize the whole algorithm, some tricks must be employed to parallelize the coarsening phase. In [4] a parallel multilevel algorithm is proposed which is based on graph coloring. Comparing partitioning quality, one can conclude that the best partitionings are generated when the HEM algorithm is used to find the edges to contract. In addition, the obtained partitionings are competitive with those generated by the METIS package.

References

1. C. M. Fiduccia and R. M. Mattheyses, A linear time heuristic for improving network partitions. In: Proc. 19th IEEE Design Automation Conference, 1982, pp. 175-181.
2. B. Hendrickson and R. Leland, A Multilevel Algorithm for Partitioning Graphs. Tech. report SAND93-1301, Sandia National Laboratories, Albuquerque, NM, 1993.
3. G. Karypis and V. Kumar, Multilevel Graph Partition and Sparse Matrix Ordering. In: Intl. Conf. on Parallel Processing, 1995.


4. G. Karypis and V. Kumar, A parallel algorithm for multilevel graph partitioning and sparse matrix ordering. In: J. Parallel and Distributed Computing, 1998, No. 48, pp. 71-95.
5. B. W. Kernighan and S. Lin, An efficient heuristic procedure for partitioning graphs. In: Bell Sys. Tech. J., 1970, No. 49, pp. 291-307.
6. O. Ore, Theory of Graphs. AMS Colloquium Publications 38. AMS, 1962.

2D-extension of Singular Spectrum Analysis: algorithm and elements of theory

N. E. Golyandina⋆ and K. D. Usevich⋆⋆

Mathematical Department, St. Petersburg State University, Universitetskij pr. 28, St. Petersburg, Petrodvorets, 198504, Russia
⋆ nina@gistatgroup.com, ⋆⋆ usevich.k.d@gmail.com

Abstract. Singular Spectrum Analysis is a nonparametric method which allows one to solve problems such as decomposition of a time series into a sum of interpretable components, extraction of periodic components, noise removal, and others. In this paper, the algorithm and theory of the SSA method are extended to the analysis of two-dimensional arrays (e.g. images). The 2D-SSA algorithm, based on the SVD of a Hankel-block-Hankel matrix, is introduced. Another formulation of the algorithm, by means of the Kronecker-product SVD, is presented. Basic SSA notions such as separability are considered. Results on the ranks of Hankel-block-Hankel matrices generated by exponential, sine-wave and polynomial 2D-arrays are obtained. An example of a 2D-SSA application is presented.

Keywords: Singular Spectrum Analysis, image analysis, Hankel-block-Hankel matrix, separability, finite rank, Singular Value Decomposition, Kronecker-product SVD.

1 Introduction

The purpose of this paper is to extend the SSA (Singular Spectrum Analysis) algorithm and theory developed in [7] to the case of two-dimensional arrays of data (i.e. real-valued functions of two variables defined on a Cartesian grid). Monochrome digital images are a standard example here.

Singular Spectrum Analysis is a well-known model-free technique for the analysis of real-valued time series. Basically, SSA is an exploratory method intended to decompose a time series into a sum of interpretable components, such as trend, periodicities and noise (see [3, 4, 7] for more details). SSA has proved to be successful for such tasks. Moreover, there are several SSA extensions for time series forecasting, change-point detection, missing value imputation and so on. These are the reasons to believe that the two-dimensional extension of SSA (2D-SSA, first presented in [6]) has similar capabilities. However, its application has been hampered by a lack of theory, which this paper is intended to remedy.

Suppose we observe a 2D-array of data (a real matrix) that is a sum of unknown components F = F^(1) + . . . + F^(m). The general task of the 2D-SSA


algorithm is to produce a decomposition

F = F̃^(1) + . . . + F̃^(m),     (1)

where the terms approximate the initial components. In §2 we present the algorithm of 2D-SSA. First of all, the algorithm is formulated on the basis of the SVD of the Hankel-block-Hankel (HbH for short) matrix generated by the input 2D-array. However, another, equivalent representation of the algorithm is better suited for examination and analysis. It is based on the decomposition of a matrix into a sum of Kronecker products. The key step of the algorithm is the grouping of the terms of the SVD. This step governs the resulting decomposition (1). The main problems of grouping are the possibility of proper grouping and the identification of terms in the SVD. These problems are discussed in §2.4 and investigated in §3 and §4. In §3 we study the notion of separability inherited from the 1D case. Separability means the possibility to extract constituents from their sum by 2D-SSA. We also provide a brief review of results on one-dimensional separability as the basis for results in the 2D case. Section 4 deals with the so-called 2D-SSA rank of a 2D-array, defined as the number of SVD terms corresponding to the 2D-array and equal to the rank of a Hankel-block-Hankel matrix generated by the 2D-array. This number is important, as it should be taken into account when performing identification. We provide rank calculations for different 2D-arrays: exponents, polynomials and sine-waves. In §5 we demonstrate the 2D-SSA notions by an example of periodic noise removal.

General definitions

First of all, let us review definitions that will be used throughout this paper. The following operator is widely used in SSA theory and is quite helpful for the formulation of the 2D-SSA algorithm.

Definition 1. Let A = (a_ij)_{i,j=1}^{m,n} ∈ M_{m,n}(Q) be a matrix over a Euclidean space Q. The hankelization operator H^Q : M_{m,n}(Q) → M_{m,n}(Q) is defined by

        | ã_1  ã_2      . . .  ã_n       |
        | ã_2  ã_3      . . .  ã_{n+1}   |
H^Q A = | . . .                          |,    ã_k = ( Σ_{(i,j)∈D_k} a_ij ) / #D_k,
        | ã_m  ã_{m+1}  . . .  ã_{m+n−1} |

where D_k = {(i, j) : 1 ≤ i ≤ m, 1 ≤ j ≤ n, i + j = k + 1}.
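The hankelization of Definition 1 is diagonal averaging; a minimal sketch for real scalar entries (the helper name `hankelize` is an assumption):

```python
# Hankelization (diagonal averaging): each anti-diagonal i + j = const of
# the matrix is replaced by its mean, producing the nearest Hankel matrix.

def hankelize(A):
    m, n = len(A), len(A[0])
    sums, counts = {}, {}
    for i in range(m):
        for j in range(n):
            k = i + j                       # anti-diagonal index
            sums[k] = sums.get(k, 0.0) + A[i][j]
            counts[k] = counts.get(k, 0) + 1
    return [[sums[i + j] / counts[i + j] for j in range(n)]
            for i in range(m)]

A = [[1.0, 2.0],
     [4.0, 8.0]]
H = hankelize(A)   # the anti-diagonal {2, 4} is averaged to 3
```

Note that `hankelize` is idempotent: applying it to an already-Hankel matrix changes nothing, which is what makes it an orthogonal projection.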


Further, we will denote by M_{m,n} = M_{m,n}(R) the space of real matrices with the Frobenius inner product

⟨X, Y⟩_M = Σ_{i=1}^{m} Σ_{j=1}^{n} x_ij y_ij,     (2)

where X = (x_ij)_{i,j=1}^{m,n}, Y = (y_ij)_{i,j=1}^{m,n} ∈ M_{m,n}. Let us introduce an isomorphism between M_{m,n} and R^{mn}.

Definition 2. The vectorization (see, for instance, [8]) of A = (a_ij)_{i,j=1}^{m,n} ∈ M_{m,n} is given by

vec A = (a_11, . . . , a_m1; a_12, . . . , a_m2; . . . ; a_1n, . . . , a_mn)^T.     (3)

Definition 3. The (m, n)-matricizing of X ∈ R^{mn}, denoted by matr_{m,n}(X), is defined to be the A ∈ M_{m,n} satisfying vec A = X.

Then, recall the operation of the Kronecker product [8, 9].
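Definitions 2 and 3 amount to column-major reshaping; a minimal pure-Python sketch (the helper names `vec` and `matr` are assumptions):

```python
# Column-major vectorization (Definition 2) and its inverse, the
# (m, n)-matricizing (Definition 3), for matrices stored as lists of rows.

def vec(A):
    m, n = len(A), len(A[0])
    return [A[i][j] for j in range(n) for i in range(m)]  # column by column

def matr(x, m, n):
    return [[x[i + j * m] for j in range(n)] for i in range(m)]

A = [[1, 2, 3],
     [4, 5, 6]]
assert vec(A) == [1, 4, 2, 5, 3, 6]   # columns (1,4), (2,5), (3,6)
assert matr(vec(A), 2, 3) == A        # matr inverts vec
```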

Definition 4. For A = (a_ij)_{i,j=1}^{m,n} ∈ M_{m,n} and B = (b_kl)_{k,l=1}^{p,q} ∈ M_{p,q}, their Kronecker product is, by definition,

        | a_11 B  . . .  a_1n B |
A ⊗ B = | . . .                 |.     (4)
        | a_m1 B  . . .  a_mn B |

Finally, we need an isomorphism between classes of block matrices.

Definition 5. The rearrangement R : M_{mp,nq} → M_{pq,mn} is defined as R(C) = D ∈ M_{pq,mn}, where

(D)_{i+(j−1)p, k+(l−1)m} = (C)_{i+(k−1)p, j+(l−1)q}     (5)

for 1 ≤ i ≤ p, 1 ≤ j ≤ q, 1 ≤ k ≤ m, 1 ≤ l ≤ n. Note that the introduced rearrangement of a matrix is the transpose of the rearrangement defined in [2]. The following properties of the rearrangement are quite useful, despite being easily checked.

– Let A = (a_ij)_{i,j=1}^{m,n} ∈ M_{m,n} and B = (b_kl)_{k,l=1}^{p,q} ∈ M_{p,q}. Then

R(A ⊗ B) = vec B (vec A)^T.     (6)

– For any C ∈ M_{mp,nq},

‖R(C)‖_M = ‖C‖_M.     (7)
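Property (6) can be checked numerically; the following sketch implements the rearrangement R and verifies R(A ⊗ B) = vec B (vec A)^T on a small example (pure Python; the helper names are assumptions):

```python
# Rearrangement R of Definition 5 (0-based indexing) and a check of the
# rank-one property (6) for the Kronecker product of two 2x2 matrices.

def kron(A, B):
    m, n, p, q = len(A), len(A[0]), len(B), len(B[0])
    return [[A[i // p][j // q] * B[i % p][j % q]
             for j in range(n * q)] for i in range(m * p)]

def vec(A):
    m, n = len(A), len(A[0])
    return [A[i][j] for j in range(n) for i in range(m)]

def rearrange(C, m, n, p, q):
    """R: M_{mp,nq} -> M_{pq,mn}; 0-based version of formula (5)."""
    D = [[0] * (m * n) for _ in range(p * q)]
    for i in range(p):
        for j in range(q):
            for k in range(m):
                for l in range(n):
                    D[i + j * p][k + l * m] = C[i + k * p][j + l * q]
    return D

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]
vB, vA = vec(B), vec(A)
outer = [[b * a for a in vA] for b in vB]        # vec B (vec A)^T
assert rearrange(kron(A, B), 2, 2, 2, 2) == outer
```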

2 2D-SSA

2.1 Basic algorithm

Consider a 2D-array of data

    | f(0, 0)        f(0, 1)        . . .  f(0, Ny − 1)       |
    | f(1, 0)        f(1, 1)        . . .  f(1, Ny − 1)       |
F = | . . .                                                   |.
    | f(Nx − 1, 0)   f(Nx − 1, 1)   . . .  f(Nx − 1, Ny − 1)  |

The algorithm is based on the SVD of a Hankel-block-Hankel (HbH) matrix constructed from the 2D-array. The dimensions of the HbH matrix are defined by the window sizes (Lx, Ly), which are restricted by 1 ≤ Lx ≤ Nx, 1 ≤ Ly ≤ Ny and 1 < Lx Ly < Nx Ny. Let Kx = Nx − Lx + 1 and Ky = Ny − Ly + 1 for convenience of notation.

Embedding

At this step, the input 2D-array is arranged into a Hankel-block-Hankel matrix of size Lx Ly × Kx Ky:

    | H_0       H_1     H_2  . . .  H_{Ky−1} |
    | H_1       H_2     H_3  . . .  H_{Ky}   |
W = | H_2       H_3          . . .           |,     (8)
    | . . .                                  |
    | H_{Ly−1}  H_{Ly}       . . .  H_{Ny−1} |

where

      | f(0, j)       f(1, j)   . . .  f(Kx − 1, j) |
      | f(1, j)       f(2, j)   . . .  f(Kx, j)     |
H_j = | . . .                                       |.
      | f(Lx − 1, j)  f(Lx, j)  . . .  f(Nx − 1, j) |

Obviously, there is a one-to-one correspondence between 2D-arrays of size Nx × Ny and HbH matrices (8). Let us call the matrix W the Hankel-block-Hankel matrix generated by the 2D-array F.

SVD
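The embedding step can be sketched in a few lines (a pure-Python illustration, not the authors' code; `hbh` is a hypothetical helper whose column layout follows the moving-window correspondence described in §2.2):

```python
# Build the Hankel-block-Hankel matrix W of size (Lx*Ly) x (Kx*Ky) from
# an Nx x Ny array f (list of rows).  Column k + l*Kx is the column-major
# vectorization of the Lx x Ly window with top-left corner (k, l).

def hbh(f, Lx, Ly):
    Nx, Ny = len(f), len(f[0])
    Kx, Ky = Nx - Lx + 1, Ny - Ly + 1
    W = [[0.0] * (Kx * Ky) for _ in range(Lx * Ly)]
    for k in range(Kx):
        for l in range(Ky):
            for i in range(Lx):
                for j in range(Ly):
                    W[i + j * Lx][k + l * Kx] = f[k + i][l + j]
    return W

f = [[1, 2, 3],
     [4, 5, 6]]
W = hbh(f, 1, 2)   # Lx = 1, Ly = 2, so Kx = 2, Ky = 2: a 2 x 4 matrix
```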

Then, the SVD is applied to the Hankel-block-Hankel matrix (8):

W = Σ_{i=1}^{d} √λ_i U_i V_i^T.     (9)

Here λ_i (1 ≤ i ≤ d) are the non-zero eigenvalues of the matrix W W^T, arranged in decreasing order λ_1 ≥ λ_2 ≥ · · · ≥ λ_d > 0; {U_1, . . . , U_d} is a system of eigenvectors of the matrix W W^T, orthonormal in R^{Lx Ly}; {V_1, . . . , V_d} is an orthonormal system of vectors in R^{Kx Ky}, hereafter called factor vectors. The factor vectors can be expressed as follows: V_i = W^T U_i / √λ_i. The triple (√λ_i, U_i, V_i) is said to be the ith eigentriple. Note that √λ_i is called a singular value of the matrix W.

Grouping

After specifying m disjoint subsets of indices I_k (groups of eigentriples),

I_1 ∪ I_2 ∪ · · · ∪ I_m = {1, . . . , d},     (10)

one obtains the decomposition of the HbH matrix

W = Σ_{k=1}^{m} W_{I_k},  where  W_I = Σ_{i∈I} √λ_i U_i V_i^T.     (11)

This is the most important step of the algorithm, as it controls the resulting decomposition of the input 2D-array. The problem of proper grouping of the eigentriples will be discussed further (in §2.4).

Projection

The projection step is necessary in order to obtain a decomposition (1) of the input 2D-array from the decomposition (11) of the HbH matrix. Firstly, the matrices W_{I_k} are reduced to Hankel-block-Hankel matrices W̃_{I_k}. Secondly, the 2D-arrays F̃_{I_k} are obtained from the W̃_{I_k} by the one-to-one correspondence.

The matrices W̃_{I_k}, in their turn, are obtained by orthogonal projection of the matrices W_{I_k} in the Frobenius norm (2) onto the linear space of block-Hankel Lx Ly × Kx Ky matrices with Hankel Lx × Kx blocks. The orthogonal projection of

    | Z_{1,1}   Z_{1,2}   . . .  Z_{1,Ky}  |
Z = | . . .                                |,     Z_{i,j} ∈ M_{Lx,Kx},
    | Z_{Ly,1}  Z_{Ly,2}  . . .  Z_{Ly,Ky} |

can be expressed as a two-step hankelization

                   | H^R Z_{1,1}   . . .  H^R Z_{1,Ky}  |
Z̃ = H^{M_{Lx,Kx}} | . . .                              |.
                   | H^R Z_{Ly,1}  . . .  H^R Z_{Ly,Ky} |

In other words, the hankelization is applied first to the blocks (within-block hankelization) and then to the whole matrix, i.e. the blocks on the secondary diagonals are averaged between themselves (between-block hankelization). Certainly, the hankelization operators can be applied in the reversed order. Thus, the result of the algorithm is

F = Σ_{k=1}^{m} F̃_{I_k}.     (12)


A component F̃_{I_k} is said to be the 2D-array reconstructed by the eigentriples with indices I_k.

2.2 Algorithm: Kronecker products

Let us examine the algorithm in terms of tensors and matrix Kronecker products.

Embedding

Columns of the Hankel-block-Hankel matrix W generated by the 2D-array F can be treated as vectorized Lx × Ly submatrices (moving 2D windows) of the input 2D-array F (see Fig. 1).

Fig. 1. Moving 2D windows: an Lx × Ly window F_{k,l} sliding over the Nx × Ny input array.

More precisely, if W_m stands for the mth column of the Hankel-block-Hankel matrix W = [W_1 : . . . : W_{Kx Ky}], then

W_{k+(l−1)Kx} = vec(F_{k,l}) for 1 ≤ k ≤ Kx, 1 ≤ l ≤ Ky,     (13)

where F_{k,l} denotes the Lx × Ly submatrix beginning at the entry (k, l):

        | f(k − 1, l − 1)       . . .  f(k − 1, l + Ly − 2)      |
F_{k,l} = | . . .                                                |.     (14)
        | f(k + Lx − 2, l − 1)  . . .  f(k + Lx − 2, l + Ly − 2) |

An analogous equality holds for the rows of the Hankel-block-Hankel matrix W. Let W^n be the nth row of the matrix W = [W^1 : . . . : W^{Lx Ly}]^T. Then

W^{i+(j−1)Lx} = vec(F^{i,j}) for 1 ≤ i ≤ Lx, 1 ≤ j ≤ Ly,     (15)

where F^{i,j} denotes the Kx × Ky submatrix beginning at the entry (i, j). Basically, the HbH matrix is a 2D representation of the 4th-order tensor X^{ij}_{kl},

X^{ij}_{kl} = (F_{k,l})_{i,j} = (F^{i,j})_{k,l} = f(i + k − 2, j + l − 2),     (16)

and the SVD of the matrix W is an orthogonal decomposition of this tensor. Another 2D representation of the tensor X^{ij}_{kl} can be obtained by the rearrangement


(5) of W:

           | F^{1,1}   F^{1,2}   . . .  F^{1,Ky}  |
X = R(W) = | . . .                                |.     (17)
           | F^{Kx,1}  F^{Kx,2}  . . .  F^{Kx,Ky} |

Let us call this block Lx Kx × Ly Ky matrix the 2D-trajectory matrix and formulate the subsequent steps of the algorithm in terms of 2D-trajectory matrices.

SVD

First of all, recall that the eigenvectors {U_i}_{i=1}^{d} form an orthonormal basis of span(W_1, . . . , W_{Kx Ky}) and the factor vectors {V_i}_{i=1}^{d} form an orthonormal basis of span(W^1, . . . , W^{Lx Ly}). Consider the matrices

Ψ_i = matr_{Lx,Ly}(U_i) ∈ M_{Lx,Ly},   Φ_i = matr_{Kx,Ky}(V_i) ∈ M_{Kx,Ky},

and call Ψ_i and Φ_i eigenarrays and factor arrays, respectively. It is easily seen that the systems {Ψ_i}_{i=1}^{d} and {Φ_i}_{i=1}^{d} form orthogonal bases of span({F_{k,l}}_{k,l=1}^{Kx,Ky}) and span({F^{i,j}}_{i,j=1}^{Lx,Ly}) (see (13) and (15)). Moreover, by (6) one can rewrite the SVD step of the algorithm as a decomposition of the 2D-trajectory matrix

X = Σ_{i=1}^{d} X_i = Σ_{i=1}^{d} √λ_i Φ_i ⊗ Ψ_i.     (18)

The decomposition is biorthogonal and has the same optimality properties as the SVD (see [2]). We will call it the Kronecker-product SVD (KP-SVD for short).

Grouping

The grouping step in terms of Kronecker products has exactly the same form as (11). Choosing m disjoint subsets I_k (10), one obtains the grouped expansion

X = Σ_{k=1}^{m} X_{I_k},  where  X_I = Σ_{i∈I} √λ_i Φ_i ⊗ Ψ_i.     (19)

Note that in practice it is more convenient to perform the grouping step on the basis of Ψ_i and Φ_i (instead of U_i and V_i), since they are two-dimensional, as is the input 2D-array.

Projection

It follows from (18) and (6) that the matrices X_{I_k} are rearrangements of the corresponding matrices W_{I_k}. Since the rearrangement R preserves the Frobenius inner product, the resulting 2D-arrays F̃_{I_k} in (12) can be expressed through orthogonal projections, in the Frobenius norm, of the matrices X_{I_k} onto the linear subspace of 2D-trajectory matrices (17), followed by the one-to-one correspondence between 2D-arrays and matrices of the form (17).

2.3 Special cases

Here we will consider some special cases of 2D-SSA. It happens that these special cases describe most of the well-known SSA-like algorithms.

2.3.1 1D sequences: SSA for time series. The first special case occurs when the input array has only one dimension, namely, it is a one-dimensional finite real-valued sequence (1D-sequence for short):

F = (f(0, 0), . . . , f(Nx − 1, 0))^T.     (20)

In this case, the 2D-SSA algorithm coincides with the original SSA algorithm [7] applied to the same data. Let us briefly describe the SSA algorithm in its standard notation, denoting f(i, 0) by f_i and Nx by N. The only parameter L = Lx is called the window length. Let K = N − L + 1 = Kx. The algorithm consists of four steps (the same as those of 2D-SSA). The result of the embedding step is the Hankel matrix

    | f_0      f_1    f_2      . . .  f_{K−1} |
    | f_1      f_2    f_3      . . .  f_K     |
W = | f_2      f_3    f_4      . . .  f_{K+1} |.     (21)
    | . . .                                   |
    | f_{L−1}  f_L    f_{L+1}  . . .  f_{N−1} |
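The 1D embedding can be sketched in two lines (a pure-Python illustration; the helper name `trajectory` is an assumption):

```python
# Trajectory (Hankel) matrix (21) of a series f_0, ..., f_{N-1} with
# window length L: element (i, j) is f_{i+j}.

def trajectory(f, L):
    K = len(f) - L + 1
    return [[f[i + j] for j in range(K)] for i in range(L)]

X = trajectory([1, 2, 3, 4, 5], 3)
# Each anti-diagonal of X is constant, as in any Hankel matrix.
assert X == [[1, 2, 3],
             [2, 3, 4],
             [3, 4, 5]]
```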

This matrix is called the trajectory matrix.¹ The SVD and decomposition steps are exactly the same as in the 2D case. The projection in the 1D case is formulated as the one-step hankelization H^R.

2.3.2 Extreme window sizes. Let us return to the general 2D-array case, when Nx, Ny > 1. Consider extreme window sizes: (a) Lx = 1 or Lx = Nx; (b) Ly = 1 or Ly = Ny.

1. If conditions (a) and (b) are both met, then due to the condition 1 < Lx Ly < Nx Ny we get (Lx, Ly) = (Nx, 1) or (Lx, Ly) = (1, Ny). In this case, the HbH matrix W coincides with the 2D-array F itself or with its transpose. Thus, the algorithm of 2D-SSA reduces to a grouping of the SVD components of the 2D-array F. This technique is used in image processing, and it works well for 2D-arrays that are products of 1D-sequences (f(i, j) = p_i q_j).

2. Consider the case when either (a) or (b) is met. Let it be (b). Without loss of generality, we can assume that Ly = 1 and 1 < Lx < Nx. Then the HbH matrix W generated by F consists of stacked Hankel matrices W = [H_0 : H_1 : . . . : H_{Ny−1}]

¹ In the SSA literature, the trajectory matrix is usually denoted by X.


and we come to the algorithm of MSSA [4, 6, 10] for simultaneous decomposition of multiple time series. More precisely, we treat the 2D-array as a set of time series arranged into columns and apply the MSSA algorithm with parameter Lx to this set of series. Practically, MSSA is preferable to the general 2D-SSA if we expect only one dimension of the input 2D-array to be `structured'.

2.3.3 Product of 1D sequences. In §2.3.1, we have shown that SSA for time series can be considered as a special case of 2D-SSA. However, we can establish another relation between SSA and 2D-SSA. Consider the outer product of 1D-sequences as an important particular case of 2D-arrays: f(i, j) = p_i q_j. Products of 1D-sequences are of great importance for the general case of 2D-SSA, as we can study properties (e.g. separability) of sums of products of 1D-sequences based on properties of the factors. The main fact here is that a 2D-SSA decomposition of the 2D-array F = (f(i, j))_{i,j=0}^{Nx−1,Ny−1} can be expressed through SSA decompositions of the 1D-sequences (p_i)_{i=0}^{Nx−1} and (q_j)_{j=0}^{Ny−1}. In matrix notation, the product of two 1D-sequences P = (p_0, . . . , p_{Nx−1})^T and Q = (q_0, . . . , q_{Ny−1})^T is F = PQ^T. Let us fix window sizes (Lx, Ly) and denote by W^(p) and W^(q) the Hankel matrices generated by P and Q, respectively:

        | p_0       p_1     . . .  p_{Kx−1} |          | q_0       q_1     . . .  q_{Ky−1} |
W^(p) = | p_1       p_2     . . .  p_{Kx}   |, W^(q) = | q_1       q_2     . . .  q_{Ky}   |.
        | . . .                             |          | . . .                             |
        | p_{Lx−1}  p_{Lx}  . . .  p_{Nx−1} |          | q_{Ly−1}  q_{Ly}  . . .  q_{Ny−1} |

Then the Hankel-block-Hankel matrix W generated by the 2D-array F is

W = W^(q) ⊗ W^(p).     (22)

Thus, the following theorem holds.

Theorem 1 ([9, Th. 13.10]). Let W^(p) and W^(q) have singular value decompositions

W^(p) = Σ_{m=1}^{d_p} √(λ^(p)_m) U^(p)_m (V^(p)_m)^T,   W^(q) = Σ_{n=1}^{d_q} √(λ^(q)_n) U^(q)_n (V^(q)_n)^T.

Then

W = Σ_{m=1}^{d_p} Σ_{n=1}^{d_q} √(λ^(p)_m λ^(q)_n) (U^(q)_n ⊗ U^(p)_m) (V^(q)_n ⊗ V^(p)_m)^T     (23)

yields a singular value decomposition of the matrix W, after rearranging its terms (in decreasing order of λ^(p)_m λ^(q)_n).

2.4 Comments on Grouping step

Let us now discuss perhaps the most sophisticated point of the algorithm: the grouping of the eigentriples. Rules for grouping are not defined within the 2D-SSA algorithm, and this step is supposed to be performed by hand, on the basis of theoretical results. The way of grouping depends on the task one has to solve. The general task of 2D-SSA is to extract additive components from the observed 2D-array. Let us try to formalize this task. Suppose we observe a sum of 2D-arrays: F = F^(1) + . . . + F^(m). For example, F is a sum of a smooth surface, regular fluctuations and noise. When applying the 2D-SSA algorithm to F, we have to somehow group the eigentriples (i.e. group the terms of (9) or (18)) at the grouping step. The problems arising here are:

– Is it possible to group the eigentriples so as to recover the initial decomposition of F into the F^(k)?
– How can one identify the eigentriples corresponding to a component F^(k)?

In order to answer the first question, we introduce the notion of separability of the 2D-arrays F^(1), . . . , F^(m) by 2D-SSA (following the 1D case [7]) as the possibility to extract them from their sum. In other words, we call the set of 2D-arrays separable if the answer to the first question is positive. In §3.1 we present the strict definition of separability and study its properties. In §3.2 we review some facts on the separability of time series (the 1D-SSA case), establish a link between the 1D-SSA and 2D-SSA cases, and deduce several important examples of 2D-SSA separability (§3.3). For practical reasons, we discuss approximate and asymptotic separability.

If the components are separable, then we come to the second question: how to perform an appropriate grouping? The main idea is based on the following fact: the eigenarrays {Ψ_i}_{i∈I_k} and factor arrays {Φ_i}_{i∈I_k} corresponding to a component F^(k) can be expressed as linear combinations of submatrices of that component. We can conclude that they repeat the form of the component F^(k). For example, smooth surfaces produce smooth eigenarrays (factor arrays), periodic components generate periodic eigenarrays, and so on. In §3.4 we also describe the tool of weighted correlations for checking separability a posteriori. This tool can provide additional guidance for grouping.

Another matter of concern is the number of eigentriples we have to gather to obtain a component F^(k). This number is called the 2D-SSA rank of the 2D-array F^(k) and is equal to the rank of the HbH matrix generated by F^(k). Actually, we are interested in separable 2D-arrays. Clearly, in the non-trivial case they have rank-deficient HbH matrices. This class of 2D-arrays has an important subclass: the 2D-arrays keeping their 2D-SSA rank constant within a range of window sizes. In the 1D case (see §2.3.1) the HbH matrices are Hankel and this subclass coincides with the whole class. For the general 2D case this is not so. However, 2D-arrays from the subclass defined above are of considerable interest


since the number of eigentriples they produce does not depend on the choice of window sizes. §4 contains several examples of such 2D-arrays and rank calculations for them.

3 2D separability

This section deals with the problem of separability stated in §2.4 as the possibility to extract terms from the observed sum. We consider the problem of separability for two 2D-arrays, F^(1) and F^(2). Let us fix window sizes (Lx, Ly) and consider the SVD of the HbH matrix W generated by F = F^(1) + F^(2):

W = Σ_{i=1}^{d} √λ_i U_i V_i^T.

If we denote by W^(1) and W^(2) the Hankel-block-Hankel matrices generated by F^(1) and F^(2), then the problem of separability can be formulated as follows: does there exist a grouping {I_1, I_2} such that

W^(1) = Σ_{i∈I_1} √λ_i U_i V_i^T   and   W^(2) = Σ_{i∈I_2} √λ_i U_i V_i^T.     (24)

The important point to note here is that if W has equal singular values, then the SVD of W is not unique. For this reason, we introduce two notions (in the same fashion as in [7]): strong and weak separability. Strong separability means that any SVD of the matrix W allows the desired grouping, while weak separability means that there exists such an SVD.

3.1 Basic definitions

Let L^(m,n) = L^(m,n)(G) denote the linear space spanned by the m × n submatrices of a 2D-array G. In particular, for fixed window sizes (Lx, Ly), we have L^(Lx,Ly)(F) = span({F_{k,l}}) and L^(Kx,Ky)(F) = span({F^{i,j}}).

Definition 6. Two 2D-arrays F^(1) and F^(2) with equal sizes are weakly (Lx, Ly)-separable if

L^(Lx,Ly)(F^(1)) ⊥ L^(Lx,Ly)(F^(2))   and   L^(Kx,Ky)(F^(1)) ⊥ L^(Kx,Ky)(F^(2)).

Definition 7. We call two 2D-arrays F(1) and F(2) strongly separable if they are weakly separable and the sets of singular values of their Hankel-block-Hankel matrices do not intersect.

2D-extension of Singular Spectrum Analysis

461

Hereafter we will speak mostly about weak separability and will say `separability' for short.

Remark 1. The set of 2D-arrays separable from a fixed 2D-array F is a linear space.

Since exact separability is usually not feasible, let us introduce approximate separability as almost orthogonality of the corresponding subspaces. Consider 2D-arrays F and G and fix window sizes (Lx, Ly). As in (14), Fk1,l1, Gk2,l2 stand for the Lx × Ly submatrices of F and G, and Fi1,j1, Gi2,j2 for their Kx × Ky submatrices. Let us introduce a distance between two 2D-arrays in order to measure approximate separability:

    ρ(Lx,Ly)(F, G) def= max(ρL, ρK),    (25)

where

    ρK = max_{(k1,l1),(k2,l2)∈JK} ⟨Fk1,l1, Gk2,l2⟩M / (‖Fk1,l1‖M ‖Gk2,l2‖M),  JK = {1, ..., Kx} × {1, ..., Ky};
    ρL = max_{(i1,j1),(i2,j2)∈JL} ⟨Fi1,j1, Gi2,j2⟩M / (‖Fi1,j1‖M ‖Gi2,j2‖M),  JL = {1, ..., Lx} × {1, ..., Ly}.

Remark 2. The 2D-arrays F and G are separable iff ρ(Lx,Ly)(F, G) = 0.
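The distance (25) can be computed by brute force over all pairs of windows; a small sketch under our own naming (`windows`, `rho` are not from the text, and we take the cosines in absolute value):

```python
import numpy as np

def windows(G, m, n):
    """All m x n sliding submatrices of a 2D array G."""
    return [G[k:k + m, l:l + n]
            for k in range(G.shape[0] - m + 1)
            for l in range(G.shape[1] - n + 1)]

def rho(F, G, Lx, Ly):
    """Separability measure (25): the largest cosine between Lx x Ly
    windows of F and G, and between their Kx x Ky windows."""
    Kx, Ky = F.shape[0] - Lx + 1, F.shape[1] - Ly + 1
    def max_cos(m, n):
        return max(abs(np.sum(A * B)) / (np.linalg.norm(A) * np.linalg.norm(B))
                   for A in windows(F, m, n) for B in windows(G, m, n))
    return max(max_cos(Lx, Ly), max_cos(Kx, Ky))

# a 7x7 sine-wave array is exactly separable from a constant for window (4, 4)
i, j = np.meshgrid(np.arange(7), np.arange(7), indexing="ij")
F = np.cos(2 * np.pi * 0.25 * (i + j))
G = np.full((7, 7), 3.0)
```

In line with Remark 2, rho vanishes for a separable pair and reaches 1 when the window spaces share a direction.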

A natural way to deal with approximate separability is to study separability asymptotic in the array sizes, namely `good' approximate separability for relatively large 2D-arrays. Consider two infinite 2D-arrays F = (fij)_{i,j=0}^{∞,∞} and G = (gij)_{i,j=0}^{∞,∞}. Let F|m,n and G|m,n denote finite submatrices of the infinite 2D-arrays F and G: F|m,n = (fij)_{i,j=0}^{m−1,n−1}, G|m,n = (gij)_{i,j=0}^{m−1,n−1}.

Definition 8. F and G are said to be asymptotically separable if

    lim_{Nx,Ny→∞} ρ(Lx,Ly)(F|Nx,Ny, G|Nx,Ny) = 0    (26)

for any Lx = Lx(Nx, Ny) and Ly = Ly(Nx, Ny) such that Lx, Kx, Ly, Ky → ∞ as Nx, Ny → ∞.

3.2 Separability of 1D sequences

Just as the original 1D-SSA algorithm can be treated as a special case of 2D-SSA, the notion of L-separability of time series (originally introduced in [7]) is a special case of (Lx, Ly)-separability.

Remark 3. Time series F(1) = (f(1)0, ..., f(1)N−1)^T and F(2) = (f(2)0, ..., f(2)N−1)^T are L-separable if they are (L, 1)-separable as 2D-arrays.


Let us now give several examples of (weak) L-separability, which is studied thoroughly in [7].

Example 1. The sequence F = (f0, ..., fN−1)^T with fn = cos(2πωn + ϕ) is L-separable from a non-zero constant sequence (c, ..., c)^T if Lω and Kω, where K = N − L + 1, are integers.

Example 2. Two cosine sequences of length N given by

    f(1)n = cos(2πω1 n + ϕ1)  and  f(2)n = cos(2πω2 n + ϕ2)

are L-separable if ω1 ≠ ω2, 0 < ω1, ω2 ≤ 1/2 and Lω1, Lω2, Kω1, Kω2 are integers.

In general, there are only a few examples of exact separability. Hence we come to the consideration of approximate separability. It is studied with the help of asymptotic separability of time series, first introduced in [7]. Asymptotic separability is defined in the same fashion as in the 2D case (see Definition 8). The only difference is that just one dimension (and parameter) tends to infinity (because the other dimension is fixed).
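Example 1 is easy to verify numerically: with ω = 1/4, N = 11 and L = 4 (so K = 8), both Lω and Kω are integers, and every lagged window of the cosine is orthogonal to the constant windows (the helper name `lagged` is ours):

```python
import numpy as np

N, L, omega, phi = 11, 4, 0.25, 0.3
K = N - L + 1                      # K = 8, so L*omega = 1 and K*omega = 2
f = np.cos(2 * np.pi * omega * np.arange(N) + phi)
c = np.full(N, 2.0)

def lagged(x, M):
    """All M-lagged subvectors of a series x."""
    return np.array([x[k:k + M] for k in range(len(x) - M + 1)])

# inner products of every cosine window with a constant window
dotL = lagged(f, L) @ lagged(c, L)[0]
dotK = lagged(f, K) @ lagged(c, K)[0]
```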

Example 3. Two cosine sequences given by

    f(l)n = Σ_{k=0}^{m} c(l)k cos(2πω(l)k n + ϕ(l)k),  0 < ω(l)k ≤ 1/2,  l = 1, 2,    (27)

with different frequencies are asymptotically separable.

In Table 1 one can see a short summary on the asymptotic separability of time series.

Table 1. Asymptotic separability

               const   cos   exp   exp cos   poly
    const        −      +     +       +       −
    cos          +      +     +       +       +
    exp          +      +     +       +       +
    exp cos      +      +     +       +       +
    poly         −      +     +       +       −

In this table, const stands for non-zero constant sequences, cos for cosine sequences (27), exp denotes sequences exp(αn), exp cos stands for e^{αn} cos(2πωn + ϕ), and poly for polynomial sequences. Note that the conditions of separability are omitted in the table. For more details, such as conditions, convergence rates, and other types of separability (e.g. stochastic separability of a deterministic signal from white noise), see [7].

3.3 Products of 1D sequences

Let us study the separability properties of products of 1D sequences (introduced in §2.3.3). Consider four 1D sequences

    P(1) = (p(1)0, ..., p(1)Nx−1)^T,  Q(1) = (q(1)0, ..., q(1)Ny−1)^T,
    P(2) = (p(2)0, ..., p(2)Nx−1)^T,  Q(2) = (q(2)0, ..., q(2)Ny−1)^T.

Proposition 1. If the sequences P(1) and P(2) are Lx-separable or the sequences Q(1) and Q(2) are Ly-separable, then their products F(1) = P(1)(Q(1))^T and F(2) = P(2)(Q(2))^T are (Lx, Ly)-separable.

Proof. First of all, notice that the submatrices of the 2D-arrays are products of subvectors of the 1D sequences:

    F(1)k1,l1 = (p(1)k1−1, ..., p(1)k1+Lx−2)^T (q(1)l1−1, ..., q(1)l1+Ly−2),
    F(2)k2,l2 = (p(2)k2−1, ..., p(2)k2+Lx−2)^T (q(2)l2−1, ..., q(2)l2+Ly−2).    (28)

Let us recall an important property of the Frobenius inner product:

    ⟨AB^T, CD^T⟩M = ⟨A, C⟩2 ⟨B, D⟩2,    (29)

where A, B, C, and D are vectors. Applying (29) to (28), we obtain the orthogonality of all Lx × Ly submatrices of the 2D-arrays:

    ⟨F(1)k1,l1, F(2)k2,l2⟩M = 0.

Likewise, all their Kx × Ky submatrices are orthogonal too. According to Remark 2, we conclude that the 2D-arrays F(1) and F(2) are separable, and the proof is complete. ⊓⊔
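Identity (29), which factors the Frobenius inner product of two outer products into ordinary inner products, can be checked on random vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
A, C = rng.normal(size=(5, 1)), rng.normal(size=(5, 1))
B, D = rng.normal(size=(7, 1)), rng.normal(size=(7, 1))

lhs = np.sum((A @ B.T) * (C @ D.T))          # <AB^T, CD^T>_M
rhs = (A.T @ C).item() * (B.T @ D).item()    # <A, C> <B, D>
```

This factorization is exactly why orthogonality of the 1D window spaces carries over to the 2D window spaces of the products in Proposition 1.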

Furthermore, we can generalize Proposition 1 to approximate and asymptotic separability.

Lemma 1. Under the assumptions of Proposition 1,

    ρ(Lx,Ly)(F(1), F(2)) ≤ ρLx(P(1), P(2)) ρLy(Q(1), Q(2)).

Proof. Equalities (28) and (29) make the proof obvious. ⊓⊔

Proposition 2. Let F(1) and F(2) be products of infinite 1D sequences:

    F(1) = P(1)(Q(1))^T,  F(2) = P(2)(Q(2))^T,
    P(j) = (p(j)0, ..., p(j)n, ...)^T  and  Q(j) = (q(j)0, ..., q(j)n, ...)^T.

If P(1), P(2) or Q(1), Q(2) are asymptotically separable, then F(1) and F(2) are asymptotically separable too.


Proof. The proposition follows immediately from Lemma 1. ⊓⊔

The following example of asymptotic separability can be obtained using Proposition 2 and Remark 1.

Example 4. The 2D-array given by

    f(1)(i, j) = cos(2πω1 i) ln(j + 1) + ln(i + 1) cos(2πω2 j)

is asymptotically separable from a constant 2D-array f(2)(i, j) = const.

Example 4 demonstrates that separability in the 2D case is more varied than in the 1D case. For instance, nothing but periodic 1D sequences are separable from a constant sequence. The next example is an analogue of Example 3.

Example 5. Two 2D sine-wave arrays given by

    f(l)(i, j) = Σ_{k=1}^{m} c(l)k cos(2πω(l)1k i + ϕ(l)1k) cos(2πω(l)2k j + ϕ(l)2k),  l = 1, 2,

with different frequencies are asymptotically separable by 2D-SSA.

However, the problem of lack of strong separability in the presence of weak separability arises more frequently in the 2D case. The wider the range of the eigenvalues of the HbH matrix corresponding to a 2D-array, the more likely is mixing of the components produced by this 2D-array and the other constituents. This becomes a problem at the Grouping step. For example, if two 1D sequences have eigenvalues in the range [λ2, λ1], then the range of the eigenvalues of their product is, by Proposition 1, wider: [λ2², λ1²].

3.4 Checking the separability: weighted correlations

Following the 1D case, we introduce a necessary condition of separability which can be applied in practice.

Definition 9. A weighted inner product of 2D-arrays F(1) and F(2) is defined as follows:

    ⟨F(1), F(2)⟩w def= Σ_{i=0}^{Nx−1} Σ_{j=0}^{Ny−1} f(1)(i, j) · f(2)(i, j) · wx(i) · wy(j),

where

    wx(i) = min(i + 1, Lx, Kx, Nx − i)  and  wy(j) = min(j + 1, Ly, Ky, Ny − j).


In fact, the functions wx(i) and wy(j) give the numbers of entries on the secondary diagonals of Hankel Lx × Kx and Ly × Ky matrices respectively. More precisely,

    wx(i) = #{(k, l) : 1 ≤ k ≤ Kx, 1 ≤ l ≤ Lx, k + l = i + 2},
    wy(j) = #{(k, l) : 1 ≤ k ≤ Ky, 1 ≤ l ≤ Ly, k + l = j + 2}.

Hence, for the Hankel-block-Hankel matrix W generated by F, the product wx(i)wy(j) is equal to the number of entries of W corresponding to the entry (i, j) of the 2D-array F. The same holds for the number of entries of the 2D-trajectory matrix X. This observation implies the following proposition.

Proposition 3.

    ⟨F(1), F(2)⟩w = ⟨X(1), X(2)⟩M = ⟨W(1), W(2)⟩M.

With the help of the weighted inner product, we can formulate a necessary condition for separability.

Proposition 4. If F(1) and F(2) are separable, then ⟨F(1), F(2)⟩w = 0.
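Proposition 3 can be verified numerically: the weights wx(i)wy(j) count how many times the entry (i, j) of a 2D-array enters its HbH matrix, so the weighted inner product of two arrays equals the Frobenius inner product of their HbH matrices (the helper names below are ours):

```python
import numpy as np

def hbh_matrix(F, Lx, Ly):
    """Columns are vectorized Lx x Ly sliding windows of F."""
    Nx, Ny = F.shape
    return np.array([F[k:k + Lx, l:l + Ly].ravel()
                     for k in range(Nx - Lx + 1)
                     for l in range(Ny - Ly + 1)]).T

def w_inner(F1, F2, Lx, Ly):
    """Weighted inner product of Definition 9."""
    Nx, Ny = F1.shape
    Kx, Ky = Nx - Lx + 1, Ny - Ly + 1
    i, j = np.arange(Nx), np.arange(Ny)
    wx = np.minimum(np.minimum(i + 1, Nx - i), min(Lx, Kx))
    wy = np.minimum(np.minimum(j + 1, Ny - j), min(Ly, Ky))
    return np.sum(F1 * F2 * np.outer(wx, wy))

rng = np.random.default_rng(1)
F1, F2 = rng.normal(size=(6, 7)), rng.normal(size=(6, 7))
lhs = w_inner(F1, F2, 3, 4)
rhs = np.sum(hbh_matrix(F1, 3, 4) * hbh_matrix(F2, 3, 4))
```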

Finally, we introduce weighted correlations to measure approximate separability, and the matrix of weighted correlations to provide additional information useful for grouping.

Definition 10. A weighted correlation (w-correlation) ρw between two 2D-arrays F(1) and F(2) is defined as

    ρw(F(1), F(2)) = ⟨F(1), F(2)⟩w / (‖F(1)‖w ‖F(2)‖w).

Consider the 2D-array F and apply 2D-SSA with parameters (Lx, Ly). If we choose the maximal grouping (10), namely m = d and Ik = {k}, 1 ≤ k ≤ d, then each F̃Ik is called the kth elementary reconstructed component, and the matrix of weighted correlations R = (rij)_{i,j=1}^{d} is given by

    rij = |ρw(F̃Ii, F̃Ij)|.    (30)

For an example of application see §5.

4 2D-SSA ranks of 2D-arrays. Examples of calculation

4.1 Basic properties

Let us first introduce the definition of the 2D-SSA rank.


Definition 11. The (Lx, Ly)-rank (the 2D-SSA rank for window sizes (Lx, Ly)) of the 2D-array F is defined to be

    rankLx,Ly(F) def= dim L(Lx,Ly) = dim L(Kx,Ky) = rank W.

It is immediate that the (Lx, Ly)-rank is equal to the number of components in the SVD (9) of the Hankel-block-Hankel matrix generated by F. There is another way to express the rank, through the 2D-trajectory matrix (17).

Lemma 2. If for fixed window sizes (Lx, Ly) there exists a representation

    X = Σ_{i=1}^{m} Ai ⊗ Bi,  Bi ∈ MLx,Ly,  Ai ∈ MKx,Ky,    (31)

then rankLx,Ly(F) does not exceed m. Furthermore, if each of the systems {Ai}_{i=1}^{m}, {Bi}_{i=1}^{m} is linearly independent, then rankLx,Ly(F) = m.

Proof. The proof is evident, since equality (31) can be rewritten by (6) as

    W = Σ_{i=1}^{m} vec Bi (vec Ai)^T.  ⊓⊔

By Theorem 1, the 2D-SSA rank of a product of 1D sequences is equal to the product of their ranks:

    rankLx,Ly(PQ^T) = rankLx(P) rankLy(Q),    (32)

where rankL(·) stands for rankL,1(·). For a sum of products of 1D sequences F = Σ_{i=1}^{n} P(i)(Q(i))^T, the 2D-SSA rank is not, in general, equal to the sum of the products of the ranks, due to possible linear dependence of the vectors. In order to calculate 2D-SSA ranks for this kind of 2D-arrays, the following lemma may be useful.

Lemma 3. If for fixed window sizes (Lx, Ly) there exist linearly independent systems {Aj}_{j=1}^{n} and {Bi}_{i=1}^{m} such that

    X = Σ_{i,j=1}^{m,n} cij Aj ⊗ Bi,  Bi ∈ MLx,Ly,  Aj ∈ MKx,Ky,    (33)

then rankLx,Ly(F) = rank C, where C = (cij)_{i,j=1}^{m,n}.


Proof. Let us rewrite condition (33) in the same way as in the proof of Lemma 2:

    W = Σ_{i,j=1}^{m,n} cij vec Bi (vec Aj)^T.

If we set A = [vec A1 : ... : vec An] and B = [vec B1 : ... : vec Bm], then W = BCA^T. Since A and B have linearly independent columns, the ranks of W and C coincide. ⊓⊔

4.2 Ranks of time series

In the 1D case, the class of series having constant rank within a range of window lengths is called time series of finite rank [7]. This class mostly consists of sums of products of polynomials, exponentials and cosines:

    fn = Σ_{k=1}^{d′} P(k)mk(n) ρk^n cos(2πωk n + ϕk) + Σ_{k=d′+1}^{d} P(k)mk(n) ρk^n.    (34)

Here 0 < ωk < 0.5, ρk ≠ 0, and P(k)l are polynomials of degree l. The time series (34) form the class of time series governed by linear recurrent formulae (see [3, 7]). It happens that the SSA ranks of time series of the form (34) can be explicitly calculated.

Proposition 5. Let a time series FN = (f0, ..., fN−1) be defined by (34) with (ωk, ρk) ≠ (ωl, ρl) for 1 ≤ k, l ≤ d′ and ρk ≠ ρl for d′ < k, l ≤ d. Then rankL(FN) is equal to

    r = 2 Σ_{k=1}^{d′} (mk + 1) + Σ_{k=d′+1}^{d} (mk + 1)    (35)

if L ≥ r and K ≥ r.

Proof. Equality (34) can be rewritten as a sum of complex exponentials:

    fn = Σ_{k=1}^{d′} P(k)mk(n) (αk λk^n + βk (λk′)^n) + Σ_{k=d′+1}^{d} P(k)mk(n) ρk^n,

where λk = ρk e^{2πiωk}, λk′ = ρk e^{−2πiωk} and αk, βk ≠ 0. The latter equality yields a canonical representation (see [1, §8]) of the Hankel matrix W with rank r. Under the stated conditions on L and K, rank W = r by [1, Theorem 8.1]. ⊓⊔
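Proposition 5, together with the product rule (32), can be checked numerically on small examples (the helper names and the test series below are our own choices):

```python
import numpy as np

def traj(x, L):
    """L-trajectory (Hankel) matrix of a 1D series."""
    return np.array([x[k:k + L] for k in range(len(x) - L + 1)]).T

def hbh(F, Lx, Ly):
    """Trajectory matrix of a 2D array (columns = vectorized windows)."""
    Nx, Ny = F.shape
    return np.array([F[k:k + Lx, l:l + Ly].ravel()
                     for k in range(Nx - Lx + 1)
                     for l in range(Ny - Ly + 1)]).T

n = np.arange(20, dtype=float)
cosine = np.cos(2 * np.pi * 0.1 * n)             # (35): r = 2
damped = 0.95**n * np.cos(2 * np.pi * 0.1 * n)   # exp * cos: r = 2
quadr  = (n + 1.0)**2                            # polynomial of degree 2: r = 3

r_cos  = np.linalg.matrix_rank(traj(cosine, 8))
r_dmp  = np.linalg.matrix_rank(traj(damped, 8))
r_quad = np.linalg.matrix_rank(traj(quadr, 8))
# product rule (32): the product of two rank-2 series has 2D-SSA rank 2 * 2
r_prod = np.linalg.matrix_rank(hbh(np.outer(cosine[:10], damped[:10]), 4, 4))
```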

4.3 Calculation of 2D-SSA ranks

Proposition 5 together with (32) makes it possible to calculate 2D-SSA ranks for 2D-arrays that are products of 1D sequences. However, the general 2D case is much more complicated. In this section, we present results concerning 2D-SSA ranks for 2D exponential, polynomial and sine-wave arrays. In the examples below, one can observe the effect that the 2D-SSA rank of a 2D-array given by f(i, j) = p^{i+j} is equal to the SSA rank of the sequence (p^i). This is not surprising, since 2D-SSA is in general invariant to rotations (and other linear maps) of the arguments of a 2D-function f(i, j).

4.3.1 Exponents. The result on the rank of a sum of 2D exponentials is quite simple.

Proposition 6. For an exponential 2D-array F = (f(i, j))_{i,j=0}^{Nx−1,Ny−1} defined by

    f(i, j) = Σ_{n=1}^{m} cn ρn^i µn^j,  ρn, µn ≠ 0,    (36)

rankLx,Ly(F) = m if Lx, Ly, Kx, Ky ≥ m and (ρl, µl) ≠ (ρk, µk) for l ≠ k.

Proof. The proof is based on Lemma 2. Let us express the entries of the matrix X using equality (16):

    (Fk,l)i,j = f(i + k − 2, j + l − 2) = Σ_{n=1}^{m} cn ρn^{i−1} µn^{j−1} ρn^{k−1} µn^{l−1}.    (37)

It is easy to check that equality (37) defines the decomposition

    X = Σ_{n=1}^{m} An ⊗ Bn,

where

    An = (ρn^0, ..., ρn^{Kx−1})^T (µn^0, ..., µn^{Ky−1}),
    Bn = (ρn^0, ..., ρn^{Lx−1})^T (µn^0, ..., µn^{Ly−1}).

Obviously, each of the systems {Ai}_{i=1}^{m}, {Bi}_{i=1}^{m} is linearly independent. Applying Lemma 2 finishes the proof. ⊓⊔
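A quick numerical illustration of Proposition 6 with m = 2 exponential terms; the bases are our own choice, kept close to 1 so the trajectory matrix stays well scaled:

```python
import numpy as np

def hbh_matrix(F, Lx, Ly):
    """Columns are vectorized Lx x Ly sliding windows of F."""
    Nx, Ny = F.shape
    return np.array([F[k:k + Lx, l:l + Ly].ravel()
                     for k in range(Nx - Lx + 1)
                     for l in range(Ny - Ly + 1)]).T

i, j = np.meshgrid(np.arange(9), np.arange(9), indexing="ij")
# m = 2 terms with distinct pairs (rho, mu)
F = 2.0 * 1.1**i * 0.9**j + 0.5 * 0.95**i * 1.05**j
r = np.linalg.matrix_rank(hbh_matrix(F, 4, 4))
```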

4.3.2 Polynomials. Let Pm be a polynomial of degree m:

    Pm(i, j) = Σ_{s=0}^{m} Σ_{t=0}^{m−s} gst i^s j^t,

where at least one of the leading coefficients gs,m−s, s = 0, ..., m, is non-zero. Consider the 2D-array F of sizes Nx, Ny ≥ 2m + 1 with f(i, j) = Pm(i, j).


Proposition 7. If Lx, Ly, Kx, Ky ≥ m + 1, then

    rankLx,Ly(F) = rankm+1,m+1(G′),

where G′ is the (2m + 1) × (2m + 1) matrix containing the block G″ = (g′st)_{s,t=0}^{m}, g′st = gst s! t!, in its top-left corner and zeros elsewhere (note that g′st = 0 for s + t > m).

In addition, the following inequality holds:

    m + 1 ≤ rankLx,Ly(F) ≤ (m/2 + 1)² for even m,  ((m + 1)/2 + 1)(m + 1)/2 for odd m.    (38)

Proof. The first part of the proposition is proved in the same way as Proposition 6, except that Lemma 3 is used instead of Lemma 2. Let us apply the Taylor formula:

    (Fk,l)i,j = Pm(i + k − 2, j + l − 2)
      = Σ_{s=0}^{m} Σ_{t=0}^{m} ((i − 1)^s (j − 1)^t / (s! t!)) (∂^{s+t}Pm / ∂i^s ∂j^t)(k − 1, l − 1)
      = Σ_{s=0}^{m} Σ_{t=0}^{m} Σ_{u=0}^{m−s} Σ_{v=0}^{m−t} g_{u+s,v+t} (u + s)! (v + t)! ((i − 1)^s (j − 1)^t / (s! t!)) ((k − 1)^u (l − 1)^v / (u! v!)).    (39)

If we set g′st = 0 for s + t ≥ m + 1, then we can rewrite (39) as

    X = Σ_{s,t,u,v=0}^{m} g′_{u+s,v+t} A_{u+(m+1)v} ⊗ B_{s+(m+1)t},    (40)

where

    A_{u+(m+1)v} = (1/(u! v!)) (0^u, ..., (Kx − 1)^u)^T (0^v, ..., (Ky − 1)^v) for 0 ≤ u, v ≤ m,
    B_{s+(m+1)t} = (1/(s! t!)) (0^s, ..., (Lx − 1)^s)^T (0^t, ..., (Ly − 1)^t) for 0 ≤ s, t ≤ m.

Let W(g) be the Hankel-block-Hankel matrix generated by G′ with window sizes (m + 1, m + 1). Then (40) can be rewritten as

    X = Σ_{i,j=0}^{(m+1)²−1} (W(g))ji Ai ⊗ Bj.

The systems {Ai}_{i=0}^{(m+1)²−1} and {Bj}_{j=0}^{(m+1)²−1} are linearly independent due to the restrictions on Lx, Ly. By Lemma 3, the first part of the proposition is proved.

The bounds in (38) can be proved using the fact that

    rankm+1,m+1(G′) = dim L(m+1,m+1)(G′) = dim span({G′k,l}_{k,l=1}^{m+1,m+1}),

where G′k,l is the (m + 1) × (m + 1) submatrix of G′ beginning at the entry (k, l). Denote by Tn the space of (m + 1) × (m + 1) matrices with zero entries below the nth secondary diagonal:

    Tn def= {A = (aij)_{i,j=0}^{m,m} ∈ Mm+1,m+1 : aij = 0 for i + j > n}.

Then G′k,l belongs to Tn for n ≥ m − (k + l) + 2 and, in general, does not for smaller n. Let us introduce

    Cn def= span({G′k,l}_{k+l=m−n+2}) ⊆ Tn,
    Sn def= span(C0, ..., Cn) = span(Sn−1, Cn) ⊆ Tn.

Then L(m+1,m+1)(G′) = Sm. By the conditions of the theorem, there exists i such that gi,m−i ≠ 0. Hence, there exist C0, ..., Cm ∈ Mm+1,m+1 such that Cn ∈ Cn, Cn ∉ Tn−1. Therefore, the system {C0, ..., Cm} is linearly independent and the lower bound is proved. To prove the upper bound, note that dim Sn ≤ min(dim Sn−1 + dim Cn, dim Tn). Since dim Cn ≤ m + 1 − n and dim Tn = Σ_{k=1}^{n+1} k, one can show that

    dim Sm ≤ Σ_{n=0}^{m} min(n + 1, m − n + 1) = (m/2 + 1)² for even m,  ((m + 1)/2 + 1)(m + 1)/2 for odd m.  ⊓⊔

Let us demonstrate two examples that meet the bounds in inequality (38) exactly: the 2D-SSA rank of the 2D-array given by f(k, l) = (k + l)² (m = 2) is equal to 3, while the 2D-SSA rank for f(k, l) = kl is equal to 4.

4.3.3 Sine-wave 2D-arrays. Consider a sum of sine-wave functions

    hd(k, l) = Σ_{m=1}^{d} Am(k, l),    (41)

    Am(k, l) = (cos(2πω(X)m k)  sin(2πω(X)m k)) ( am  bm ) ( cos(2πω(Y)m l) )
                                                ( cm  dm ) ( sin(2πω(Y)m l) ),    (42)

where 1 ≤ k ≤ Nx, 1 ≤ l ≤ Ny, at least one coefficient in each group {am, bm, cm, dm} is non-zero and the frequencies satisfy the conditions

    (ω(X)n, ω(Y)n) ≠ (ω(X)m, ω(Y)m) for n ≠ m,  ω(X)m, ω(Y)m ∈ (0, 1/2).    (43)

Proposition 8. For window sizes (Lx, Ly) such that Lx, Ly, Kx, Ky ≥ 4d, the 2D-SSA rank of F = (hd(k, l))_{k,l=0}^{Nx−1,Ny−1} is equal to

    rankLx,Ly(F) = Σ_{m=1}^{d} νm,  where νm = 2 or 4,

and the numbers νm can be expressed as

    νm = 2 rank ( am   bm   cm   dm )
                ( dm  −cm  −bm   am ).    (44)

Proof. The summands Am in (42) can be rewritten as sums of complex exponentials:

    4Am(k, l) = (am − dm − i(cm + bm)) e^{2πiω(X)m k} e^{2πiω(Y)m l}
              + (am − dm + i(cm + bm)) e^{−2πiω(X)m k} e^{−2πiω(Y)m l}
              + (am + dm + i(cm − bm)) e^{−2πiω(X)m k} e^{2πiω(Y)m l}
              + (am + dm − i(cm − bm)) e^{2πiω(X)m k} e^{−2πiω(Y)m l}.

Note that the coefficients of the first pair of complex exponentials vanish at once if am = dm and bm = −cm, while the second pair vanishes if am = −dm and bm = cm. Therefore, the number of non-zero coefficients of the complex exponentials corresponding to each summand Am(k, l) is equal to νm defined in (44). The 2D-array can then be represented as a sum of products:

    hd(k, l) = Σ_{n=1}^{r} xn yn^k zn^l,  r = Σ_{m=1}^{d} νm,    (45)

where all the coefficients xn ∈ C are non-zero, while yn and zn have the form yn = e^{2πiω(X)n}, zn = e^{2πiω(Y)n}, and the pairs (yn, zn) are distinct due to conditions (43), namely (yn, zn) ≠ (ym, zm) for n ≠ m. Due to [5], the rank of the Hankel-block-Hankel matrix W generated by the 2D-array (45) is equal to r at least for Lx, Ly ≥ 4d. ⊓⊔

Note that the condition Lx, Ly ≥ 4d is only sufficient for the result of Proposition 8. The same result is valid for a wider range of Lx, Ly; this range depends on the input 2D-array, see [5] for the case of complex exponentials. Let us apply the proposition to two examples. Let f(k, l) = cos(2πω(X)k + 2πω(Y)l). Then the 2D-SSA rank equals 2. If f(k, l) = cos(2πω(X)k) · cos(2πω(Y)l), then the 2D-SSA rank equals 4.
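Both pairs of examples, the polynomial bounds after Proposition 7 and the two sine-wave arrays above, can be confirmed numerically (the window sizes and frequencies below are our own choices):

```python
import numpy as np

def hbh_matrix(F, Lx, Ly):
    """Columns are vectorized Lx x Ly sliding windows of F."""
    Nx, Ny = F.shape
    return np.array([F[k:k + Lx, l:l + Ly].ravel()
                     for k in range(Nx - Lx + 1)
                     for l in range(Ny - Ly + 1)]).T

k, l = np.meshgrid(np.arange(10, dtype=float), np.arange(10, dtype=float),
                   indexing="ij")
r_sum2 = np.linalg.matrix_rank(hbh_matrix((k + l)**2, 4, 4))  # lower bound: 3
r_prod = np.linalg.matrix_rank(hbh_matrix(k * l, 4, 4))       # upper bound: 4
r_plane = np.linalg.matrix_rank(
    hbh_matrix(np.cos(2 * np.pi * (0.1 * k + 0.23 * l)), 5, 5))       # rank 2
r_sep = np.linalg.matrix_rank(
    hbh_matrix(np.cos(2 * np.pi * 0.1 * k) * np.cos(2 * np.pi * 0.23 * l),
               5, 5))                                                  # rank 4
```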

5 Example of analysis

Consider a real-life digital image of Mars (275 × 278) obtained by a web-camera² (see Fig. 2). As one can see, the image is corrupted by a kind of periodic noise,

² Source: Pierre Thierry


probably sinusoidal due to the possible electromagnetic nature of the noise. Let us try to extract this noise by 2D-SSA. It is more convenient to use the 2D-trajectory matrix notation here. After choosing window sizes (25, 25) we obtain the expansion (18). As we will show, these window sizes are sufficient for separation of the periodic noise.

Fig. 2. 2D-array: Mars

Fig. 3. Eigenarrays Ψ1–Ψ20

Let us look at the eigenarrays (Fig. 3). The eigenarrays of the eigentriples with indices N = {13, 14, 16, 17} have a periodic structure similar to that of the noise. The factor arrays have the same periodicity too. This observation entitles us to believe that these eigentriples constitute the periodic noise. In addition, 4 is a likely rank for sine-wave 2D-arrays.

Fig. 4. Weighted correlations for the leading 30 components

Let us validate our conjecture by examining the plot of the weighted correlation matrix (see Fig. 4). The plot depicts the w-correlations rij (30) between the elementary reconstructed components (the top-left corner represents the entry r11). Values are plotted in grayscale: white stands for 0 and black for 1. The plot contains two blocks uncorrelated with the rest. This means that the sum of the elementary reconstructed components corresponding to the indices from N is separable from the rest. Reconstruction of a 2D-array by the set N gives us the periodic noise, while the residual produces a filtered image.

Fig. 5. Reconstructed noise and residual (filtered image)

As a matter of fact, the noise is not purely periodic but is in a sense modulated. This happens due to clipping of the signal values to the range [0, 255].


References

1. G. Heinig and K. Rost, Algebraic Methods for Toeplitz-like Matrices and Operators, Akademie Verlag, Berlin, 1984.
2. C. F. Van Loan and N. P. Pitsianis, Approximation with Kronecker products, in M. S. Moonen and G. H. Golub, eds., Linear Algebra for Large Scale and Real Time Applications, Kluwer Publications, pp. 293–314, 1993.
3. V. M. Buchstaber, Time series analysis and grassmannians, in S. Gindikin, ed., Applied Problems of Radon Transform, AMS Transactions, Series 2, Vol. 162, Providence (RI), pp. 1–17, 1994.
4. J. Elsner and A. Tsonis, Singular Spectrum Analysis. A New Tool in Time Series Analysis, Plenum Press, New York, 1996.
5. H. Hua Yang and Y. Hua, On rank of block Hankel matrix for 2-D frequency detection and estimation, IEEE Transactions on Signal Processing, Vol. 44, Issue 4, pp. 1046–1048, 1996.
6. D. Danilov and A. Zhigljavsky, eds., Principal Components of Time Series: the "Caterpillar" Method, St. Petersburg State University, St. Petersburg, 1997 (in Russian).
7. N. Golyandina, V. Nekrutkin, and A. Zhigljavsky, Analysis of Time Series Structure: SSA and Related Techniques, Chapman & Hall/CRC, Boca Raton, 2001.
8. J. R. Magnus and H. Neudecker, Matrix Differential Calculus with Applications in Statistics and Econometrics, John Wiley & Sons, 2004.
9. A. J. Laub, Matrix Analysis for Scientists and Engineers, SIAM, 2004.
10. D. Stepanov and N. Golyandina, SSA-based approaches to analysis and forecast of multidimensional time series, Proceedings of the 5th St. Petersburg Workshop on Simulation, St. Petersburg State University, St. Petersburg, pp. 293–298, 2005.

Application of Radon transform for fast solution of boundary value problems for elliptic PDE in domains with complicated geometry

Alexandre I. Grebennikov

Facultad de Ciencias Fisico Matematicas, Benemerita Universidad Autonoma de Puebla, Av. San Claudio y Rio Verde, Col. San Manuel, Ciudad Universitaria, Puebla, Puebla, 72570, Mexico
agrebe@fcfm.buap.mx

Abstract. A new approach to the solution of boundary value problems for a wide class of elliptic partial differential equations of mathematical physics is proposed. This class includes the Laplace, Poisson, and Helmholtz equations. The approach is based on the Local Ray Principle discovered by the author and leads to the new General Ray (GR) method, which presents the solution of Dirichlet boundary problems by explicit analytical formulas that include the direct and inverse Radon transforms. The GR-method is realized by fast algorithms and MATLAB software, whose quality is demonstrated by numerical experiments.

Keywords: Dirichlet problem for the Laplace equation, direct and inverse Radon transform.

1 Introduction

The traditional scheme of solving inverse problems of mathematical physics requires, as a rule, the solution of a sequence of direct problems [1]. That is why the development of new fast methods for the solution of direct problems is very important for solving inverse problems [2, p. 311]. There are two main approaches to solving boundary value problems for partial differential equations in analytical form: Fourier decomposition and the Green function method [2]. Fourier decomposition is used, as a rule, only in theoretical investigations. The Green function method is explicit, but it is difficult to construct the Green function for a domain Ω of complex geometry. The known numerical algorithms are based on the Finite Difference method, the Finite Element (Finite Volume) method and the Boundary Integral Equation method. Numerical approaches lead to systems of linear algebraic equations [3] that require a lot of computer time and memory. A new approach to the solution of boundary value problems on the basis of the General Ray Principle (GRP) was proposed by the author in [4], [5] for the

onsidered domain Ω. The known numeri al algorithms are based on the Finite Dieren es method, Finite Elements (Finite Volume) method and the Boundary Integral Equation method. Numeri al approa hes lead to solution of systems of linear algebrai equations [3℄ that require a lot of omputer time and memory. A new approa h for the solution of boundary value problems on the base of the General Ray Prin iple (GRP) was proposed by the author in [4℄, [5℄ for the

476

A. I. Grebennikov

stationary wave field. The GRP leads to explicit analytical formulas (the GR-method) and fast algorithms, developed and illustrated by numerical experiments in [5]–[8] for the solution of direct and coefficient inverse problems for the equations of mathematical physics. But there were some difficulties with the strict theoretical justification of that version of the GR-method. Here we extend the proposed approach to the construction of another version of the GR-method based on the application of the direct Radon transform [9] to the PDE [10]–[12]. This version of the GR-method is justified theoretically, formulated in algorithmic form, implemented as a program package in the MATLAB system and illustrated by numerical experiments.

2 General Ray Principle

The General Ray Principle (GRP) was proposed in [4], [5]. It gives a non-traditional mathematical model for the considered physical field and the corresponding boundary problems. The GRP consists in the following main assumptions:

1. the physical field can be simulated mathematically by the superposition of plane vectors (general rays) that form a field V(l) for some fixed straight line l; each vector of the field V(l) is parallel to the direction of this line l, and the superposition corresponds to all possible lines l that intersect the domain Ω;
2. the field V(l) is characterized by some potential function u(x, y);
3. we know some characteristics, such as the values of the function u(x, y) and/or the flow of the vector V(l), at any boundary point P0 = (x0, y0) of the domain.

Application of the GRP to the problem under investigation means constructing an analogue of the given PDE in the form of a family of ODEs describing the distribution of the function u(x, y) along the "general rays", which are presented by a straight line l with some parameterization. We use the traditional Radon parameterization with a parameter t:

    x = p cos ϕ − t sin ϕ,  y = p sin ϕ + t cos ϕ.

Here |p| is the length of the perpendicular from the origin to the line l, and ϕ ∈ [0, 2π] is the angle between the axis X and this perpendicular. Using this parameterization, we considered in [4], [5] the variant of the GRP that reduces the Laplace equation to an assemblage (depending on p, ϕ) of ordinary differential equations with respect to the variable t. This family of ODEs was used as the local analogue of the PDE. There we constructed the corresponding version of the General Ray method for a convex domain Ω. It consists in the following steps:

1. solution of boundary value problems for the obtained assemblage of ODEs in explicit analytical or approximate form, using well-known standard formulas and numerical methods;
2. calculation of the integral average of this solution along the line l;

Radon transform for fast solution of BVP

477

3. transformation of these solutions by the inverse Radon transform, producing the required superposition.

The numerical justification of this version of the GR-method was given for the case of the domain Ω being the unit circle [5]. For some more complicated domains the quality of the method was illustrated by numerical examples. The reduction of the considered PDE to a family of ODEs with respect to the variable t makes it possible to satisfy the boundary conditions directly and to construct efficient and fast numerical algorithms. At the same time, there are some difficulties with the implementation of this method for complicated geometry of the domain Ω, as well as with its theoretical justification even in simple cases.
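The general-ray parameterization above translates directly into a numerical line integral; here is a minimal sketch in which the quadrature grid and the function name `radon_line` are our own choices, not part of the GR-method itself:

```python
import numpy as np

def radon_line(u, p, phi, t_max=3.0, n=4001):
    """Integral of u along the ray x = p cos(phi) - t sin(phi),
    y = p sin(phi) + t cos(phi) (trapezoidal rule on [-t_max, t_max])."""
    t = np.linspace(-t_max, t_max, n)
    x = p * np.cos(phi) - t * np.sin(phi)
    y = p * np.sin(phi) + t * np.cos(phi)
    vals = u(x, y)
    dt = t[1] - t[0]
    return (vals[:-1] + vals[1:]).sum() * dt / 2

# sanity check: for the indicator of the unit disk the line integral is the
# chord length 2*sqrt(1 - p^2), independent of the angle phi
disk = lambda x, y: (x**2 + y**2 <= 1.0).astype(float)
```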

onstru t the eÆ ient and fast numeri al algorithms. At the same time, there are some diÆ ulties with implementation of this method for the ompli ated geometry of the domain Ω, as well as with its theoreti al justi ation even for the simple ases.

3

Formulation and theoretical justification of p-version of GR-method

Let us onsider the Diri hlet boundary problem for the Poisson equation: △u(x, y) = ψ(x, y), u(x, y) = f(x, y),

(x, y) ∈ Ω; (x, y) ∈ Γ.

(1) (2)

for the fun tion u(x, y) that has two ontinuous derivatives with respe t to both variables inside the plane domain Ω bounded by a ontinuous urve Γ . Here ψ(x, y), (x, y) ∈ Ω and f(x, y), (x, y) ∈ Γ , are given fun tions. In [10℄{[12℄, investigations are presented on the possibility of redu tion of solution of PDE to the family of ODEs using the dire t Radon transform [9℄. This redu tion leads to ODE with respe t to variable p and an be interpreted in the frame of the introdu ed General Ray Prin iple. But at rst glan e, using the variable p makes it impossible to satisfy dire tly the boundary onditions expressed in (x, y) variables. Possibly by this reason the mentioned and other related investigations were on entrated only at theoreti al aspe t of onstru tion of some basis of general solutions of PDE. Unfortunately, this approa h was not used for onstru tion of numeri al methods and algorithms for solution of boundary value problems, ex ept for some simple examples [10℄. The important new element, introdu ed here into this s heme, onsists in satisfying the boundary onditions by their redu tion to homogeneous ones. The p-version of the GR-method an be formulated as the sequen e of the following steps: 1. redu e the boundary value problem to homogeneous one; 2. represent the distribution of the potential fun tion along the general ray (a straight line l) by its dire t Radon transform uϕ (p); 3. onstru t the family of ODEs in the variable p with respe t the fun tion uϕ (p);


4. solve the constructed ODEs with zero boundary conditions;
5. calculate the inverse Radon transform of the obtained solution;
6. revert to the initial boundary conditions.

We present below the implementation of this scheme. We suppose that the boundary Γ can be represented in polar coordinates (r, α) by some one-valued positive function that we denote r0(α), α ∈ [0, 2π]. This is always possible for a simply connected star-shaped domain Ω with the centre at the origin. Let us write the boundary function as

    f̄(α) = f(r0(α) cos α, r0(α) sin α).    (3)

Supposing that the functions r0 and f̄(α) have second derivatives, we introduce the functions

    f0(α) = f̄(α) / r0²(α),    (4)
    ψ0(x, y) = ψ(x, y) − 4f0(α) − f0″(α),  (x, y) ∈ Ω,    (5)
    u0(x, y) = u(x, y) − r² f0(α).    (6)

To proceed with the first step of the scheme, we can write the boundary value problem with respect to the function u0(x, y) as the following two equations:

    △u0(x, y) = ψ0(x, y),  (x, y) ∈ Ω;    (7)
    u0(x, y) = 0,          (x, y) ∈ Γ.    (8)
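The reduction (4)–(8) rests on the polar-coordinate identity Δ(r² f0(α)) = 4f0(α) + f0″(α), which is exactly why subtracting 4f0 + f0″ in (5) makes u0 satisfy (7). A quick finite-difference check, with a test angular profile g of our own choosing:

```python
import numpy as np

g   = lambda a: np.cos(3 * a) + 0.5 * np.sin(a)       # angular profile f0
gpp = lambda a: -9 * np.cos(3 * a) - 0.5 * np.sin(a)  # its second derivative

def v(x, y):
    """v = r^2 * g(alpha) written in Cartesian coordinates."""
    return (x**2 + y**2) * g(np.arctan2(y, x))

x0, y0, h = 0.7, 0.4, 1e-4
# 5-point Laplacian stencil
lap = (v(x0 + h, y0) + v(x0 - h, y0) + v(x0, y0 + h) + v(x0, y0 - h)
       - 4 * v(x0, y0)) / h**2
alpha = np.arctan2(y0, x0)
exact = 4 * g(alpha) + gpp(alpha)
```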

To carry out the second and third steps we need the direct Radon transform [7]:

R[u](p, φ) = ∫_{−∞}^{+∞} u(p cos φ − t sin φ, p sin φ + t cos φ) dt.

After application of the Radon transform to equation (7), using formula (2) on p. 3 of [8], we obtain the family of ODEs with respect to the variable p:

d²u_φ(p)/dp² = R[ψ0](p, φ), (p, φ) ∈ Ω̂, (9)

where Ω̂ is the domain of possible values of the parameters p, φ. As a rule, φ ∈ [0, 2π], while the modulus of the parameter p equals the radius in polar coordinates and varies within the limits determined by the boundary curve Γ. In the case considered, for a fixed φ the parameter p lies in the limits −r0(φ − π) < p < r0(φ). Unfortunately, boundary condition (8) cannot be transformed directly by the Radon transform into corresponding boundary conditions for every equation of the

Radon transform for fast solution of BVP

479

family (9). For the fourth step we propose to use the following boundary conditions for every fixed φ ∈ [0, 2π]:

u_φ(−r0(φ − π)) = 0; u_φ(r0(φ)) = 0. (10)

Denote by û_φ(p) the solution of problem (9)-(10), which is uniquely determined as a function of the variable p for every φ ∈ [0, 2π], p ∈ (−r0(φ − π), r0(φ)); outside this interval we extend û_φ(p) ≡ 0 for all φ, with continuity in p. Let us denote the inverse Radon transform as an operator R^{−1}, which for any function z(p, φ) can be represented by the formula

R^{−1}[z] = (1/(2π²)) ∫_{−π}^{π} ∫_{−∞}^{∞} z'_p(t, φ) / ((x cos φ + y sin φ) − t) dt dφ.

The justification of the fifth step of the scheme is contained in the following theorem.

Theorem 1. The following formula for the solution of boundary value problem (7)-(8) is true:

u0(x, y) = R^{−1}[û_φ(p)], (x, y) ∈ Ω. (11)

Proof. Substituting the function defined by (11) into the left-hand side of equation (7) and using [8, Lemma 2.1, p. 3], we obtain the relations

△u0(x, y) = R^{−1}[d²û_φ(p)/dp²] = R^{−1}[R[ψ0](p, φ)] = ψ0(x, y), (12)

which means that equation (7) is satisfied (see also [8], p. 40). From the condition û_φ(p) ≡ 0, p ∉ (−r0(φ − π), r0(φ)), φ ∈ [0, π], and Theorem 2.6 (the support theorem) of [8, p. 10], it follows that u0(x, y) ≡ 0 for (x, y) ∉ Ω and, due to its continuity, u0 satisfies the boundary conditions (8). This finishes the proof.

The sixth step of the GR-method is presented in detail in the following theorem.

Theorem 2. The solution u(x, y) of boundary-value problem (1), (2) is presented by the following formulas:

u(x, y) = R^{−1}[ ψ̃₂(p, φ) − ((p + r0(φ − π))/(r0(φ) + r0(φ − π))) ψ̃₂(r0(φ), φ) ] + r² f0(α), (13)

ψ̃₂(p, φ) = ∫_{−r0(φ−π)}^{p} ∫_{−r0(φ−π)}^{p'} ψ̂0(p'', φ) dp'' dp', where ψ̂0(p, φ) = R[ψ0(x, y)]. (14)

The justification of this theorem follows directly from the explicit formula for the solution of equation (9) with conditions (10). The direct and inverse Radon transforms in the explicit formulas (13), (14) can be implemented numerically by the fast discrete Fourier transform (FFT), which ensures the efficiency of the proposed method.
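The fourth step — solving each ray ODE u_φ''(p) = R[ψ0](p, φ) with zero endpoint values — amounts to double integration plus a linear correction, which is exactly the structure of formulas (13)-(14). A minimal numerical sketch of this one-dimensional solve (assuming only NumPy; the grid and the constant test right-hand side are illustrative, not from the paper):

```python
import numpy as np

def solve_ray_ode(g, p):
    # Solve u''(p) = g(p) on [p[0], p[-1]] with u = 0 at both endpoints,
    # by double trapezoidal integration plus a linear correction,
    # mirroring the structure of formulas (13)-(14).
    U1 = np.concatenate([[0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * np.diff(p))])
    U2 = np.concatenate([[0.0], np.cumsum(0.5 * (U1[1:] + U1[:-1]) * np.diff(p))])
    lam = (p - p[0]) / (p[-1] - p[0])   # linear weight: 0 at left end, 1 at right end
    return U2 - lam * U2[-1]            # subtract the linear part so u(p[-1]) = 0

p = np.linspace(-1.0, 1.0, 2001)
u = solve_ray_ode(np.full_like(p, 2.0), p)   # u'' = 2, exact solution p**2 - 1
err = np.max(np.abs(u - (p**2 - 1.0)))
```

For a constant right-hand side the trapezoidal rule is exact, so `err` is at rounding level; for a general R[ψ0](p, φ) the error is O(h²) in the step of the p-grid.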


4


Results of numerical experiments

We have constructed a fast algorithmic and program implementation of the GR-method for the considered problem in the MATLAB system. We used a uniform discretization of the variables p ∈ [−1, 1], φ ∈ [0, π], as well as a discretization of the variables x, y, with n nodes. We tested mathematically simulated model examples with known exact functions u(x, y), f(x, y), ψ(x, y). Graphic illustrations of numerical examples of the solution by the p-version of the GR-method are presented in Fig. 1(a)-1(d). From Fig. 1(a), 1(d) we can see that the method gives a good approximation also for a non-differentiable curve Γ.

5

Conclusion

A new version of the GR-method is constructed. It is based on the application of the Radon transform directly to the Poisson equation. This version of the GR-method for arbitrary simply connected star-shaped domains is justified theoretically, formulated in algorithmic form, implemented as a program package in the MATLAB system, and illustrated by numerical experiments. The proposed version can be applied to the solution of boundary value problems for other PDEs with constant coefficients. In perspective, it seems interesting to develop this approach for the solution of direct and inverse problems involving the equations of mathematical physics with variable coefficients.

Acknowledgments. The author acknowledges VIEP BUAP for support in the frame of Project No. 04/EXES/07, and also SEP and CONACYT for support in the frame of Project No. CB 2006-01/0057479.

References
1. A.N. Tikhonov, V.Y. Arsenin, Solutions of Ill-Posed Problems, V.H. Winston & Sons, Washington, D.C., 1977.
2. S.L. Sobolev, Partial Differential Equations of Mathematical Physics, Pergamon Press, 1964.
3. A.A. Samarskii, The Theory of Difference Schemes, Marcel Dekker, Inc., New York, 2001.
4. A.I. Grebennikov, Fast algorithm for solution of Dirichlet problem for Laplace equation, WSEAS Transactions on Computers Journal, 2(4), 1039-1043 (2003).
5. A.I. Grebennikov, The study of the approximation quality of GR-method for solution of the Dirichlet problem for Laplace equation, WSEAS Transactions on Mathematics Journal, 2(4), 312-317 (2003).


6. A.I. Grebennikov, Spline Approximation Method and Its Applications, MAX Press, Russia, 2004.
7. A.I. Grebennikov, A novel approach for the solution of direct and inverse problems of some equations of mathematical physics, Proceedings of the 5th International Conference on Inverse Problems in Engineering: Theory and Practice (ed. D. Lesnic), Vol. II, Leeds University Press, Leeds, UK, Chapter G04, 1-10 (2005).
8. A. Grebennikov, Linear regularization algorithms for computer tomography, Inverse Problems in Science and Engineering, Vol. 14, No. 1, January, 53-64 (2006).
9. J. Radon, Über die Bestimmung von Funktionen durch ihre Integralwerte längs gewisser Mannigfaltigkeiten, 75 years of Radon transform (Vienna, 1992), Conf. Proc. Lecture Notes Math. Phys., IV, 324-339 (1994).
10. Sigurdur Helgason, The Radon Transform, Birkhäuser, Boston-Basel-Berlin, 1999.
11. M. Gelfand and S.J. Shapiro, Homogeneous functions and their applications, Uspekhi Mat. Nauk, 10, 3-70 (1955).
12. V.A. Borovikov, Fundamental solutions of linear partial differential equations with constant coefficients, Trudy Moscov. Mat. Obshch., 8, 877-890 (1959).


Fig. 1. (a) Solution of the Poisson equation in the unit circle with the homogeneous Dirichlet condition; (b)-(d) further numerical examples of solutions obtained by the p-version of the GR-method.

Application of a multigrid method to solving diffusion-type equations⋆

M. E. Ladonkina, O. Yu. Milyukova, and V. F. Tishkin

Institute for Mathematical Modeling, RAS, Moscow, Russia
ladm@imamod.ru, miliukova@imamod.ru, tishkin@imamod.ru

Abstract. A new efficient multigrid algorithm is proposed for solving parabolic equations. It is similar to implicit schemes in stability and accuracy, but the computational complexity is substantially reduced at each time step. Stability and accuracy of the proposed two-grid algorithm are analyzed theoretically for one- and two-dimensional heat diffusion equations. Good accuracy is demonstrated on model problems for one- and two-dimensional heat diffusion equations, including those with thermal conductivity defined as a discontinuous function of coordinates.

Keywords: parabolic equations, multigrid methods, stability, accuracy.

1

Introduction

Numerical simulation of many problems in mathematical physics must take into account diffusion processes modeled by parabolic equations. Explicit schemes lead to severe CFL restrictions on the time step [1], [2]. Implicit schemes are free from stability restrictions, but are difficult to use because of the high computational complexity of solving the corresponding linear algebraic equations. Application of classical multigrid methods [3] may also be costly and not much better than explicit schemes. Therefore, new algorithms should be developed for parabolic equations. In this paper, we present a new efficient multigrid algorithm. We analyze the stability and accuracy of the two-grid algorithm applied to model problems for one- and two-dimensional heat diffusion equations with constant and variable coefficients. The proposed algorithm is similar to an implicit scheme in regard to stability and accuracy and substantially reduces the computational complexity at each time step.

⋆ This work was supported by the RFBR (Grant N 08-01-00435).


2


Description of the algorithm

As an example, we consider an initial-boundary value problem for one- and two-dimensional heat diffusion equations:

ρc_v ∂T/∂t = div(k grad T) + f, x ∈ G, (1)
T(x, t) = g(x, t), x ∈ γ,
T(x, 0) = T0(x),

where c_v is the heat capacity per unit volume, ρ is the density, k is the thermal conductivity, T is the temperature at point x at time t, f is the heat source density, γ is the computational domain boundary, and g(x, t) and T0(x) are given functions. To approximate problem (1) in the computational domain G = {0 < x < l1, 0 < y < l2, 0 < t ≤ T}, we use the fully implicit finite-difference scheme

(ρc_v)_{ij} (u_{ij}^{n+1} − u_{ij}^{n})/τ = k_{i+0.5,j}(u_{i+1,j}^{n+1} − u_{ij}^{n+1})/h_x² − k_{i−0.5,j}(u_{i,j}^{n+1} − u_{i−1,j}^{n+1})/h_x²
 + k_{i,j+0.5}(u_{i,j+1}^{n+1} − u_{ij}^{n+1})/h_y² − k_{i,j−0.5}(u_{i,j}^{n+1} − u_{i,j−1}^{n+1})/h_y² + Φ_{ij}, (2)
0 < i < N1, 0 < j < N2,

u_{0j}^{n+1} = u1(y_j, t^{n+1}), u_{N1,j}^{n+1} = u2(y_j, t^{n+1}), 0 ≤ j ≤ N2,
u_{i,0}^{n+1} = u3(x_i, t^{n+1}), u_{i,N2}^{n+1} = u4(x_i, t^{n+1}), 0 ≤ i ≤ N1,
u_{ij}^{0} = T0(x_i, y_j), 0 ≤ i ≤ N1, 0 ≤ j ≤ N2,

where h_x and h_y are constant mesh sizes in the x and y directions and τ is the time step. Finite-difference scheme (2) is a system of linear algebraic equations in the unknown values of the solution at the (n+1)-th time level:

A_h u_h = f_h. (3)

The proposed algorithm for al ulating the grid fun tion at the next time level onsists of the following steps. Step 1. One or several smoothing iterations of equation (2) or (3) are performed using the formula

×

(ρcv )ij τ

(ρcv )ij u ij ki+0.5,j usi+1,j + ki−0.5,j usi−1,j + + τ h2x ki,j+0.5 usi,j+1 + ki,j−0.5 usi,j−1 + + Φij × h2y −1 ki+0.5,j + ki−0.5,j ki,j+0.5 + ki,j−0.5 + + + (1 − σ)usij , h2x h2y s+1 uij

=σ

(4)

Appli ation of a multigrid method

485

where i = 1, 2, ..., N1 − 1, j = 1, 2, ..., N2 − 1, σ is a weight oeÆ ient (0 < σ 6 1), and u0ij = unij . In formula (4), index n+1 is omitted and u ij = unij . The resulting grid fun tion is denoted by usm ij . Then, the residual is al ulated as rh = Ah usm h − fh . Step 2. The residual is restri ted to the oarse grid: Rlp = r2i1 ,2j1 ,

l = i1 = 1, ..., N1 /2 − 1,

p = j1 = 1, ..., N2 /2 − 1.

Step 3. A oarse grid orre tion equation is solved. For the two-dimensional problem analyzed here, it has the form ∆lp ∆l+1,p − ∆lp ∆l,p − ∆l−1,p − kl+0.5,p + kl−0.5,p − 2 τ Hx H2x l,p+0.5 ∆l,p+1 − ∆lp + kl,p−0.5 ∆l,p − ∆l,p−1 = Rlp , −k H2y H2y (ρcv )lp

(5)

∆l0 = ∆l,N2 /2 = ∆0p = ∆N1 /2,p = 0, l = 1, 2, ..., N1 /2 − 1, p = 1, 2, ..., N2 /2 − 1,

where Hx = 2hx ,Hy = 2hy . Step 4. The oarse grid orre tion ∆lp is interpolated to the ne grid by performing a 4-point fa e- entered and a 16-point ell- entered interpolation: ∆lp , i = 2l, j = 2p, 9 1 i = 2l + 1, j = 2p, 16 (∆lp + ∆l+1,p ) − 16 (∆l−1,p + ∆l+2,p ), 9 1 i = 2l, j = 2p + 1, 16 (∆lp + ∆l,p+1 ) − 16 (∆l,p−1 + ∆l,p+2 ), 81 δij = 256 (6) (∆lp + ∆l+1,p + ∆l,p+1 + ∆l+1,p+1 )− 9 − 256 (∆l−1,p + ∆l−1,p+1 + ∆l,p+2 + ∆l+1,p+2 + +∆l+2,p+1 + ∆l+2,p + ∆l+1,p−1 + ∆l,p−1 ) + i = 2l + 1, 1 + 256 (∆l−1,p−1 + ∆l+2,p+2 + ∆l−1,p+2 + ∆l+2,p−1 ), j = 2p + 1

where i = 1, 2, ..., N1 − 1, j = 1, 2, ..., N2 − 1. Note that δ0j = δN1 ,j = δi,0 = δi,N2 = 0. Step 5. Finally, the grid fun tion is al ulated at the next time level as uij = usm ij − δij .

(7)

Thus, a single iteration of the two-grid y le is performed. Even though the system of linear equations remains in ompletely solved, the algorithm is similar

486

M. E. Ladonkina, O. Yu. Milukova, V. F. Tishkin

to an impli it s heme in terms of stability and a

ura y. This is demonstrated below both theoreti ally and numeri ally for several model problems. Moreover, when the number of ne grid points is suÆ iently large, the omputational ost is lower in the proposed algorithm as ompared to the impli it s heme used on the ne grid, be ause the solution of oarse grid orre tion equation (5) has a mu h lower omputational omplexity as ompared to the solution of impli it s heme (2).

3

Theoretical stability analysis

We use Fourier analysis [4℄, [5℄ to examine stability of the two-grid algorithm with respe t to initial onditions. As a model example, we onsider the Diri hlet problem for the one-dimensional heat diusion equation with unit oeÆ ients on the interval [0, 1℄. Suppose that N is an even number and a single smoothing iteration is performed. The impli it s heme used on the ne grid is uim − 2uim + uim uim im i i−1 i −u i + Φi , = i+1 τ h2x

0 < i < N,

(8)

im uim 0 = uN1 = 0,

u0i = T0 (xi ),

0 6 i 6 N1 ,

where uim is the solution of the impli it s heme for the heat diusion equation i at the next time level, h = 1/N, and (T0 )i is a given grid fun tion. We represent the solution at the nth level as a Fourier series, i = u

N−1 X

√ ak sin kπxi 2.

k=1

The Fourier series expansion of the solution at the (n+1)th time level obtained in [6℄ is ui =

N−1 X

N−k {[qksm − 0.5(1 + q k )Qkcor qkres ]ak + 0.5(1 + q k )Qkcor qres aN−k }

k=1,k6=N/2

√ √ × sin kπxi 2 + qN/2 sm aN/2 sin 0.5Nπxi 2,

(9)

where qksm = 1 +

σR (qk − 1), R+1

qk = os kπ/N,

qkres =

qksm [1 + R(1 − qk )] − 1 , τ τ , Qkcor = 1 + 0.5R(1 − q2k )

q k = qk [1 + 0.5(1 − q2k )],

R = 2τ/h2 .

(10)

Appli ation of a multigrid method

487

In the one-dimensional problem, the interpolation at Step 4 is performed as follows: δi =

∆l ,

i = 2l,

9 16 (∆l

+ ∆l+1 ) −

1 16 (∆l−1

+ ∆l+2 ),

i = 2l + 1,

where i = 1, 2, ..., N − 1. Now, we show that the algorithm is absolutely stable in a ertain norm the linear subspa e with respe t to initial onditions when √ σ = 0.5. We de ne √ k H as the span of the Fourier modes 2 sin kπxi and 2 sin(N − k)πxi , where k = 1, 2, ...N/2 − 1. By virtue of representation (9) ombined with the equalities N−k Qkcor = Qcor and q k = −q N−k , the ve tor √ √ xk = ak sin kπxi 2 + aN−k sin(N − k)πxi 2 ∈ Hk

is transformed into ve tor

yk = Ak xk ,

where

Ak =

k )Qkcor qkres qksm − 0.5(1 + q 0.5(1 −

qk )Qkcor qkres

N−k 0.5(1 + q k )Qkcor qres N−k qsm

− 0.5(1 −

N−k qk )Qkcor qres

1 6 k 6 N/2 − 1.

,

It was shown in [6℄ that the eigenvalues of Ak satisfy the inequalities λk1 6= λk2 ,

|λk1 | 6 1,

(11)

|λk2 | 6 1.

We de ne the norm ku k1 on the spa e of grid fun tions as N/2−1

ku k21 =

X

(αk1 )2 + (αk2 )2 + a2N/2 ,

k=1

√

where αk1 , αk2 are the omponents of u in the basis ek1 , ek2 , 2 sin 0.5Nπxi (k = 1, 2, ..., N/2 − 1); ek1 and ek2 are the eigenvalues asso iated with eigenvalues λk1 and λk2 , respe tively. Combining (11) with the inequality |qksm | 6 1, we have kuk1 6 ku k1 .

This proves the absolute stability in the norm kk1 of the algorithm with respe t to initial onditions. We note here that the norms kk1 and kkL2 are equivalent [6℄.

488

M. E. Ladonkina, O. Yu. Milukova, V. F. Tishkin

It was shown in [6℄ that the algorithm is stable with respe t to the right-hand side. Thus, it is proved that the algorithm is absolutely stable with respe t to initial onditions and right-hand side. For the one-dimensional model problem, it holds that 0

kuk1 6 ku k1 + τQ1

n X j=0

kΦj k2 ,

where Q1 = const is independent of h, τ. The norm k k2 is de ned by analogy with k k1 .

4

Solution error

As a model example, we onsider an initial-boundary value problem for the twodimensional heat diusion equation with unit oeÆ ients on the unit square, subje t to zero boundary onditions: ∂u ∂2 u ∂2 u + , 0 < x < 1, 0 < y < 1, = ∂t ∂x2 ∂y2 u(x, 0, t) = 0, u(x, 1, t) = 0, 0 6 x 6 1,

0 6 t 6 T,

u(0, y, t) = 0,

0 < t 6 T,

u(1, y, t) = 0,

0 6 y 6 1,

0 6 t 6 T,

u(x, y, 0) = T0 (x, y),

0 6 x 6 1,

0 6 y 6 1.

(12)

We assume here that T0 (x, y) is an in nitely dierentiable fun tion. The impli it s heme used on the ne grid is im im im im uim uim uim im i,j+1 − 2uij + ui,j−1 i+1,j − 2uij + ui−1,j ij − u ij + , = τ h2 h2 0 < i < N, 0 < j < N,

u0i,j

im uim 0,j = uN,j = 0,

0 < j < N,

uim i,0

0 < i < N,

=

uim i,N

= T0 (xi , yj ),

= 0,

(13)

0 6 i 6 N, 0 6 j 6 N,

where h = 1/N, the grid fun tion (T0 )ij approximates T0 (x, y), and N is an even number. Suppose that a single smoothing iteration is performed and σ = 0.5. We represent the solution at the nth time level as a Fourier series expansion: un ij =

N−1 X N−1 X

akm 2 sin kπxi sin mπyj ,

0 < i < N, 0 < j < N.

(14)

k=1 m=1

We al ulate the Fourier series expansion of the solution at the next time level. Following [6℄, we demonstrate ea h step of the algorithm. Substituting

Appli ation of a multigrid method

489

expansion (14) into the right-hand side of (4) and setting Cv ρ ≡ 0, kij ≡ 1, s = 0, and hx = hy = h, we perform the smoothing step to obtain usm ij =

N−1 X N−1 X

qkm sm akm 2 sin kπxi sin mπyj ,

(15)

k=1 m=1

where

qkm sm = 1 +

0.5R (qk + qm − 2), 2R + 1

(16)

sm qk , q k are de ned in (10), and R = 2τ/h2 . Repla ing uim ij with uij given by n (15) and u im ij with uij de ned by (14) in (13), we nd a Fourier series expansion

for the residual on the ne grid: rij =

N−1 X N−1 X

qkm res akm 2 sin kπxi sin mπyj ,

0 < i < N, 0 < j < N,

k=1 m=1

where

qkm res =

qksm [1 + R(2 − qk − qm )] − 1 . τ

(17)

Performing Step 2 (restri ting the residual to the oarse grid) and using the identities sin(N − k)πx2i = − sin kπx2i and sin(0.5πNx2i ) = 0, we obtain N N 2 −1 2 −1

Rlp =

X X

k,N−m N−k,m (qkm ak,N−m − qres aN−k,m + res akm − qres

(18)

k=1 m=1

+qN−k,N−m aN−k,N−m ) × 2 sin kπxl sin mπyp , res

where xl = x2i and yp = y2j (l = i = 1, 2, ..., N/2 − 1, p = j = 1, 2, ..., N/2 − 1). We represent the solution ∆lp of the oarse grid orre tion equation ∆lp ∆l+1,p − 2∆lp + ∆l−1,p ∆l,p+1 − 2∆lp + ∆l,p−1 − = Rlp , − τ H2 H2 ∆l0 = ∆l,N/2 = ∆0p = ∆N/2,p = 0

(19)

(l = 1, 2, ..., N/2 − 1, p = 1, 2, ..., N/2 − 1, H = 2h) as Fourier series, N/2−1 N/2−1

∆lp =

X

k=1

X

a ~km 2 sin kπxl sin mπyp .

m=1

Substituting (18) and (20) into (19), we obtain N/2−1 N/2−1

∆lp =

X

k=1

X

km k,N−m N−k,m Qkm ak,N−m − qres aN−k,m + cor (qres akm − qres

m=1

+qN−k,N−m aN−k,N−m )2 sin kπxl sin mπyp , res

490

M. E. Ladonkina, O. Yu. Milukova, V. F. Tishkin

where Qkm cor =

τ . 1 + 0.5R(2 − q2k − q2m )

(20)

We interpolate ∆lp to the ne grid in two substeps. First, interpolation is performed to the grid {(ih, pH), i = 1, ..., N − 1, p = 1, ..., N/2 − 1} as follows: ~ ip = ∆

∆lp ,

i = 2l,

9 16 (∆lp

+ ∆l+1,p ) −

1 16 (∆l−1,p

+ ∆l+2,p ),

(21)

i = 2l + 1.

~ ip is interpolated to the ne grid by formulas analogous to (21). It an Then, ∆ be shown that this pro edure is equivalent to interpolation by (6). Following [6℄ in ea h substep, we nd the Fourier series expansion of the grid fun tion δij : δij =

X

X

km k,N−m 0.25(1 + q k )(1 + q m )Qkm ak,N−m − cor (qres akm − qres

k6=N/2 m6=N/2 N−k,m −qres aN−k,m + qN−k,N−m aN−k,N−m )2 sin kπxl sin mπyp . res

Finally, formula (7) at Step 5 yields uij =

X X

k6= N 2

(b1km akm + b2km ak,N−m + b3km aN−k,m − b4km aN−k,N−m )

m6= N 2

×2 sin kπxi sin mπyj + +

X

N−1 X

qN/2,m aN/2,m 2 sin 0.5Nπxi sin mπyj + sm

m=1

qk,N/2 ak,N/2 2sinkπxi sin0.5Nπyj , sm

(22)

k6= N 2

where

k b1km = qkm k )(1 + q m )Qkm sm − (1 + q cor qres /4, k,N−m b2km = (1 + q k )(1 + q m )Qkm /4, cor qres N−k,m b3km = (1 + q k )(1 + q m )Qkm /4, cor qres N−k,N−m k )(1 + q m )Qkm /4, b4km = (1 + q cor qres

km km qkm k are sm , qres , Qcor are de ned by (16), (17), and (20), respe tively, and qk , q de ned in (10), R = 2τ/h2 .

As a result, we have Fourier series expansion (22) of the solution at the next time level obtained by the proposed algorithm. To analyze the a

ura y of the solution, we start with estimating the trun ation error of impli it s heme (13) on this solution. In (13), we substitute uij

Appli ation of a multigrid method

491

n im given by (22) for uim ij and repla e u ij with uij represented by (14). The resulting residual is

X

ϕij =

X

(r1km akm + r2k,m ak,N−m + r3k,m aN−k,m −

k6=N/2 m6=N/2

+

N−1 X

−r4k,m aN−k,N−m )2 sin kπxi sin mπyj +

r5m aN/2,m 2 sin 0.5Nπxi sin mπyj +

m=1

X

r6k ak,N/2 2 sin kπxi sin 0.5Nπyj ,

k6=N/2

where r1km = r5m

=

b1 km −1 τ

+

qN/2,m −1 sm τ

2b1 km (2−qk −qm ) , h2

+

2qN/2,m (2−qm ) sm , h2

2,3,4 r2,3,4 km = bkm

r6k

=

h

qk,N/2 −1 sm τ

1 τ

+

+

2(2−qk −qm ) h2

i

,

(23)

2qk,N/2 (2−qk ) sm . h2

Applying the triangle inequality and the Parseval identity, we obtain kϕkL2 6 kϕ1 kL2 + kϕ2 kL2 + kϕ3 kL2 + kϕ4 kL2 ,

(24)

where the terms on the right-hand side are de ned as kϕ1 k2L2 = kϕ2 k2L2 = kϕ3 k2L2 = kϕ4 k2L2 = +

Suppose that

P

k6=N/2

P

k6=N/2

P

k6=N/2

P

k6=N/2

PN−1

P

P

P

P

1 2 2 m6=N/2 (rkm ) (akm ) , 2 2 2 m6=N/2 (rk,N−m ) (akm ) , 3 2 2 m6=N/2 (rN−k,m ) (akm ) ,

(25)

4 2 2 m6=N/2 (rN−k,N−m ) (akm ) +

5 2 2 m=1 (rm ) (aN/2,m )

+

P

6 2 2 k6=N/2 (rk ) (ak,N/2 ) .

τ = hβ , where 0 < β < 2.

We assume that unij has 2p bounded nite-dieren e derivatives with respe t to both oordinates. To obtain an upper bound for the rst term in (24), the (k, m) index domain Ω is partitioned into four subdomains, Ω = Ω1 ∪ Ω2 ∪ Ω3 ∪ Ω4 (see Fig. 1).

492

M. E. Ladonkina, O. Yu. Milukova, V. F. Tishkin

m 6

N−1

Ω3

Ω4

Ω1

Ω2

m1

1

1

N−1

k1

k

Figure 1. De omposition of the (k, m) index domain Ω into subdomains:k1 = m1 = is the integer part of Nβδ .

[Nβδ ], 0 < δ < 1/7, [Nβδ ]

We nd upper bounds for |r1km | and |akm | in ea h subdomain. In Ω1 , it holds that kπh 2), and the im third one is the value of maxi,j |(uij − uij )/uim ij | at t = 0.199 in Problems 5-7, respe tively.

im im im Table 5. maxi,j,t |(uim ij −uij )/uij | and maxi,j |(uij −uij )/uij | at t = 0.199 in Problem

5.

N=100 K s=1 s=2 s=1 10 .045 .031 .96 · 10−9 50 .026 .014 .62 · 10−5 100 .02 .007 .265 · 10−3

N=200 s=1 s=2 s=1

N=500 s=1 s=2 s=1

.030 .021 .245 · 10−10 .021 .011 .169 · 10−8 .016 .007 .129 · 10−6 .028 .015 .333 · 10−10

498

M. E. Ladonkina, O. Yu. Milukova, V. F. Tishkin

im im im Table 6. maxi,j,t |(uim ij −uij )/uij | and maxi,j |(uij −uij )/uij | at t = 0.199 in Problem

6.

N=100 K s=1 s=2 s=1 10 .047 .032 .108 · 10−8 50 .027 .014 .137 · 10−4 100 .021 .009 .313 · 10−3

N=200 s=1 s=2 s=1

N=500 s=1 s=2 s=1

.030 .022 .259 · 10−10 .022 .011 .203 · 10−8 .017 .007 .170 · 10−6 .029 .016 .353 · 10−10

im im im Table 7. maxi,j,t |(uim ij −uij )/uij | and maxi,j |(uij −uij )/uij | at t = 0.199 in Problem

7.

N=100 K s=1 s=2 s=1 10 .044 .030 .495 · 10−6 50 .031 .015 .137 · 10−4 100 .03 .013 .41 · 10−2

N=200 s=1 s=2 s=1

N=500 s=1 s=2 s=1

.030 .021 .122 · 10−4 .024 .016 .304 · 10−4 .016 .007 .126 · 10−4 .033 .016 .133 · 10−4

These results demonstrate that the proposed algorithm provides good a

ura y as applied to an initial-boundary value problem for the heat diusion equation. To examine the dependen e of a

ura y on the magnitude of the jump in thermal ondu tivity, we ompare the results for problems 3-7 presented above with the results obtained for a relatively small jump in k and with those for thermal ondu tivity de ned as a ontinuous fun tion of oordinates. In Problem 8, 1 + 0.3 sin 10πx, if (x − 0.5)2 + (y − 0.5)2 < 1/16, k= otherwise. 1,

In Problem 9,

k = 1 + 0.3 sin 10πx.

im Tables 8 and 9 list the values of maxi,j,t |(uim ij − uij )/uij | at 0 < t < 0.199, im im and maxi,j |(uij − uij )/uij | at t = 0.199 in the rst and se ond olumns orresponding to ea h value of N in Problems 8 and 9, respe tively. These results are obtained by using approximation (41) for thermal ondu tivity on the oarse grid.

It is lear from omparison between Tables 7, 8 and 9 that higher a

ura y is a hieved when k is ontinuous or has a small jump, as ompared to the ase of a large jump in thermal ondu tivity.

Appli ation of a multigrid method

499

im im im Table 8. maxi,j,t |(uim ij −uij )/uij | and maxi,j |(uij −uij )/uij | at t = 0.199 in Problem

8 (s=1).

K N=100 N=200 N=500 10 .000313 .213 · 10−7 .000197 .207 · 10−7 100 .000457 .979 · 10−6 .000276 .951 · 10−7 .898 · 10−4 .102 · 10−7 im im im Table 9. maxi,j,t |(uim ij −uij )/uij | and maxi,j |(uij −uij )/uij | at t = 0.199 in Problem

9 (s=1).

K N=100 N=200 N=500 10 .000243 .416 · 10−7 .614 · 10−4 .207 · 10−7 100 .00029 .898 · 10−6 .846 · 10−4 .378 · 10−7 .24 · 10−4 .188 · 10−8

6

CONCLUSION

A new eÆ ient algorithm is developed for solving diusion-type equations. By applying the algorithm to several model problems, it is shown both theoreti ally and numeri ally that the algorithm is similar to an impli it s heme in terms of stability and a

ura y. The new algorithm substantially redu es the the omputational omplexity at ea h time level, as ompared to impli it s hemes.

References 1. A.A. SAMARSKY, Dieren e s heme theory, Nauka, 1989 (in Russian). 2. N.S. BAHVALOV, N.P. ZHIDKOV and G.M. KOBELKOV, Numeri al methods, Nauka, 1987 (in Russian). 3. R.P. FEDORENKO, A relaxation method for solving dieren e ellipti equations, Zh. Vy hisl. Mat. Mat. Fiz., Vol.1 (1961), N 5, pp. 922-927 (in Russian). 4. S.K. GODUNOV, V.S.RYABENKIY, A relaxation method for solving dieren e ellipti equations, Zh. Vy hisl. Mat. Mat. Fiz., Vol.1 (1961), N 5, pp. 922-927 (in Russian). 5. R. RIHTMAER, K.MORTON, Dieren e methods for solving of boundary value problem, Mir, 1972 (in Russian). 6. M.E. LADONKINA, O.Yu. MILYUKOVA, V.F. TISHKIN, A numeri al algorithm for diusion-type equations based on the multigrid methods, Mat. Model., Vol.19, (2007), N 4, pp. 71-89, (in Russian). 7. A.A. SAMARSKY, Ye.S. NIKOLAYEV, Methods for solving nite-dieren e equations, Nauka, (1978). 8. I. GUSTAFSSON, A Class of First Order Fa torization Methods, BIT, V.18 (1978), pp.142-156. 9. M.E. LADONKINA, O.Yu. MILYUKOVA, V.F. TISHKIN, Appli ation of the multigrid method for al ulation diusion pro esses, CD-Pro eedings of

500

M. E. Ladonkina, O. Yu. Milukova, V. F. Tishkin

West-East Speed Flow Field Conferen e 19-22, November 2007, Mos ow, Russia (http://wehsff.imamod.ru/pages/s7.htm).

Monotone matrices and finite volume schemes for diffusion problems preserving non-negativity of solution I. V. Kapyrin Institute of Numeri al Mathemati s, Russian A ademy of S ien es, ul. Gubkina 8, Mos ow, 119333 Russia ivan.kapyrin@gmail.com

A new nite volume s heme for 3D diusion problems with heterogeneous full diusion tensor is onsidered. The dis retization uses nonlinear two-point ux approximation on unstru tured tetrahedral grids. Monotoni ity of the linearized operator allows us to guarantee nonnegativity of the dis rete solution.

Abstract.

Introduction The simulation of substan e transport in porous media [1℄ ne essitates the dis retization of the diusion operator. In su h problems, the diusion tensor is strongly inhomogeneous and anisotropi and the geometry of the omputational domain requires the use of unstru tured ondensing meshes. Under these onditions, the solutions produ ed by some modern numeri al s hemes [2℄ exhibit unphysi al os illations and negative values. Negative solution values may lead to in orre tly omputed hemi al intera tions between the substan es. As a result, the s heme be omes non onservative. In the present paper a nite volume (FV) method for numeri al solution of three-dimensional diusion problems with anisotropi full diusion tensor on tetrahedral grids is being onsidered. The method was introdu ed in [3℄ for problems with homogeneous Diri hlet boundary onditions, here we extend it to the ase of nonhomogeneous onditions of Diri hlet and Neumann types. For the formulation of the s hemes we use a spe ial nonlinear diusive ux approximation, introdu ed for two-dimensional diusion problems by C.Le Potier in [4℄ and modi ed in [5℄. The resulting s hemes are onservative and monotone in the sense of ensuring the nonnegativity of solution for respe tive sour es and boudary onditions (see [6℄, Se tion 2.4). The proof of the latter feature of the method is based on the monotoni ity property of the linearized operator matrix.

502

1

I. V. Kapyrin

Nonlinear Finite Volume Method

Let Ω be a onvex polyhedral domain in R3 with boundary ∂Ω. Consider the stationary diusion equation with two types of boundary onditions in the mixed statement: ∇ · r = f,

r = −D∇C in Ω,

C|ΓD = gD (x),

r · n|ΓN = gN (x).

(1a) (1b) (1 )

Here, C is the on entration of the substan e, r is the diusion ux, f is the sour e fun tion, and D is a symmetri positive de nite diusion tensor of dimension 3 × 3 that is pie ewise onstant in Ω. The boundary ∂Ω onsists of two parts ΓD and ΓN . On ΓD the on entration is spe i ed by a ontinuous fun tion gD (x). On ΓN the ontinuous fun tion gN (x) pres ribes the diusive ux through the boundary. In the following we assume that ΓN is the union of noninterse ting planar fragments. In the omputational domain Ω, we onstru t a onformal tetrahedral mesh εh , su h that the diusion tensor is onstant on ea h of its elements T . Let NT be the number of tetrahedra T ∈ εh , NP be the number of verti es, Ne be the total number of fa es, and NB be the number of external fa es in εh . The mass onservation law (1a) an be integrated with respe t to T ∈ εh by using Green's identity: Z

r · n ds =

Z

f dx

∀T ∈ εh ,

(2)

T

∂T

where n denotes the unit outward normal to ∂T . Let ne be an outward normal to the fa e e of T whose length is numeri ally equal to the surfa e area of the

orresponding fa e; i.e., |ne | = |e|. Relation (2) an be rewritten as: X

e∈∂T

re · n e =

Z

f dx

∀T ∈ εh ,

(3)

T

where re is the mean diusion ux density through the fa e e: re =

1 |e|

Z

r ds

e

The diusion ux re · ne through e an be approximated as follows. For ea h T ∈ εh and ea h external fa e e, we introdu e their degrees of freedom. The set NT +NB of support points of these degrees of freedom is de ned as B = {Xj }j=1 . For ea h tetrahedron T , B in ludes some point XT inside T (its oordinates will be spe i ed later). Let the tetrahedron T have a fa e e belonging to ∂Ω and ne be

Monotone matri es and nite volume s hemes

503

the outward normal to e. Then if e ∈ ΓD we add its enter of mass Xe to B, otherwise, if e ∈ ΓN we add to B the proje tion Xe of the internal point XT along the ve tor Dne (the hoi e of XT will guarantee that Xe lies inside the fa e e). Sin e Ω is onvex, for any internal vertex Oi of εh , there are four points Xi,j (j = 1, 2, 3, 4) from B su h that Oi lies inside the tetrahedron formed by them (the nearest points are pi ked). Therefore, there are nonnegative oeÆ ients λi,j satisfying the onditions 4 X j=1

−−−−→ λi,j · Oi Xi,j = 0,

4 X

λi,j = 1.

j=1

The oeÆ ients λi,j > 0 are used for linear interpolation of the on entration at interior nodes of the initial mesh from its values at points of B: COi =

4 X

(4)

λi,j CXi,j .

j=1

A similar formula an be written for the on entrations at points Oi ∈ ΓN using the values at three vertexes of a triangle in ΓN , whi h ontains Oi . For the points Oi ∈ ΓD the interpolation is not needed be ause the respe tive on entration values are known from the Diri hlet boundary onditions. O1

O2 A

X+ M

X−

B

O3

Fig. 1.

Geometri onstru tions for the nonlinear nite-volume method.

Consider two neighboring tetrahedra T+ = AO1 O2 O3 and T− = BO1 O2 O3 in the initial mesh εh (see gure 1), X+ ,X− are the orresponding elements in B , D+ and D− are diusion tensors, and V + and V − { are their volumes. Let

504

I. V. Kapyrin

M be the center of mass of the common face e, e = O_1O_2O_3. We introduce the following notation (here and below, i, j and k are assumed to be different; i.e., {i, j, k} = {1, 2, 3}, {2, 1, 3}, {3, 1, 2}):
– T_i^+ and T_i^− are the tetrahedra X_+MO_jO_k and X_−MO_jO_k, respectively, and V_i^+ and V_i^− are their respective volumes.
– n_e is the normal to the common face O_1O_2O_3 that is external with respect to T_+.
– n_{ei}^+ and n_{ei}^− are the normals to the face MO_jO_k that are external with respect to T_i^+ and T_i^−, respectively.
– n_{ij}^+ and n_{ij}^− are the normals to the respective faces MX_+O_k and MX_−O_k that are external with respect to T_i^+ and T_i^−, respectively.
– n_i^+ and n_i^− are the normals to the respective faces X_+O_jO_k and X_−O_jO_k that are external with respect to T_i^+ and T_i^−, respectively.
– The lengths of all the above normals are numerically equal to the surface areas of the corresponding faces.

Each pair of tetrahedra T_i^+ and T_i^− is associated with an auxiliary variable C_{M,i}, the substance concentration at the point M. The diffusion flux r_i^* (here and below, the star denotes either a plus or a minus) on each tetrahedron T_i^* is defined by using Green's identity \int_{T_i^*} \nabla C\,dx = \int_{\partial T_i^*} C\,n\,ds, integrating it to second-order accuracy, and taking into account n_i^* + n_{ei}^* + n_{ij}^* + n_{ik}^* = 0:

V_i^* D_*^{-1} r_i^* = \frac{1}{3}\left(n_i^* C_{M,i} + n_{ei}^* C_{X_*} + n_{ij}^* C_{O_j} + n_{ik}^* C_{O_k}\right). \qquad (5)

The introduced degrees of freedom C_{M,i} are eliminated using the assumption of flux continuity through e: r_i^+ · n_e = r_i^− · n_e. As a result, the flux in (5) is defined in terms of the concentrations C_{X_+}, C_{X_−} at the points X_+ and X_− and in terms of C_{O_j} and C_{O_k}, for which we use linear interpolation (4). The total diffusion flux r_e · n_e through e is represented as a linear combination of the three fluxes r_i^+ · n_e:

r_e \cdot n_e = \mu_1^e\, r_1^+ \cdot n_e + \mu_2^e\, r_2^+ \cdot n_e + \mu_3^e\, r_3^+ \cdot n_e. \qquad (6)

To determine the coefficients μ_i^e, i = 1, 2, 3, we set the following conditions on diffusion flux (6) through e.
– If the values r_i^+ · n_e/|n_e| approximate the diffusion flux density, then r_e · n_e/|n_e| is also its approximation:

\sum_{j=1}^{3} \mu_j^e = 1. \qquad (7)

– The approximation stencil for the flux is two-point and nonlinear:

r_e \cdot n_e = K_+(C_{O_1}, C_{O_2}, C_{O_3})\, C_{X_+} - K_-(C_{O_1}, C_{O_2}, C_{O_3})\, C_{X_-}. \qquad (8)


This condition is ensured by the equation

(a_{12} C_{O_2} + a_{13} C_{O_3})\mu_1^e + (a_{21} C_{O_1} + a_{23} C_{O_3})\mu_2^e + (a_{31} C_{O_1} + a_{32} C_{O_2})\mu_3^e = 0, \qquad (9)

where

a_{ij} = \frac{(D_+ n_j^+, n_e)(D_- n_i^-, n_e) - (D_- n_j^-, n_e)(D_+ n_i^+, n_e)}{(D_+ n_i^+, n_e)V_i^- - (D_- n_i^-, n_e)V_i^+}.

Equations (7) and (9) define a family of solutions with parameter p^e:

\mu_1^e(p^e) = \mu_1^e(0) + p^e\left[C_{O_1}(a_{31} - a_{21}) + C_{O_2} a_{32} - C_{O_3} a_{23}\right], \qquad (10a)
\mu_2^e(p^e) = \mu_2^e(0) + p^e\left[C_{O_2}(a_{12} - a_{32}) + C_{O_3} a_{13} - C_{O_1} a_{31}\right], \qquad (10b)
\mu_3^e(p^e) = \mu_3^e(0) + p^e\left[C_{O_3}(a_{23} - a_{13}) + C_{O_1} a_{21} - C_{O_2} a_{12}\right]. \qquad (10c)

Here, μ_1^e(0), μ_2^e(0) and μ_3^e(0) comprise a particular solution to system (7), (9):

\mu_i^e(0) = \frac{\left[(D_- n_i^-, n_e)V_i^+ - (D_+ n_i^+, n_e)V_i^-\right] C_{O_i}}{\sum_{j=1}^{3}\left[(D_- n_j^-, n_e)V_j^+ - (D_+ n_j^+, n_e)V_j^-\right] C_{O_j}}. \qquad (11)

Remark 1. Coefficients (11) are identical to those in the two-dimensional nonlinear finite-volume method with the volumes replaced by areas. In the two-dimensional case, μ_1^e and μ_2^e are unique and precisely determined by conditions (7) and (8) on two-point approximations of the diffusion flux.

In the case when O_1O_2O_3 ∈ Γ_N, we have the following diffusive flux approximation:

r_e \cdot n_e = \int_e g_N(x)\, ds. \qquad (12)

If the face O_1O_2O_3 belongs to Γ_D, Green's identity on the tetrahedron X_+O_1O_2O_3 with volume V^+ yields the equation

V^+ D^{-1} r = \frac{1}{3}\left(C_{X_+} n_e + C_{O_1} n_1^+ + C_{O_2} n_2^+ + C_{O_3} n_3^+\right), \qquad (13)

where C_{O_i}, i ∈ {1, 2, 3}, are known from the boundary conditions. For the external face e ∈ Γ_D, we can write

r_e \cdot n_e = K_B^+ C_{X_+} + K_B^-, \qquad (14)

where K_B^+ = \frac{(Dn_e, n_e)}{3V^+}, and

K_B^- = \frac{(Dn_1^+, n_e)C_{O_1} + (Dn_2^+, n_e)C_{O_2} + (Dn_3^+, n_e)C_{O_3}}{3V^+}. \qquad (15)


Thus, the diffusion flux r_e · n_e is defined by formulas (6), (10) and (5) for internal mesh faces and by formulas (12), (14) for external mesh faces. Let C_T be the concentration at the point X_T corresponding to a tetrahedron T having the face e ∈ Γ_N. We eliminate the concentration C_e at the point X_e on the face e using the approximation of the diffusive flux through e:

\frac{C_e - C_T}{l} = -g_N(X_e), \qquad l = \frac{\|X_e - X_T\|}{\|Dn\|},

where n is the unit normal vector to the face e. It should be mentioned here that, with nonnegative C_{T_i}, i = 1, …, N_T, and a nonpositive function g_N(x), the nonnegativity of C_{O_i} in (4) is guaranteed after the elimination of C_e for all faces e ∈ Γ_N. The formulation of the method is completed by substituting the flux expressions into mass conservation law (3). Discretization of (3) produces a nonlinear system of equations

A(C_X)\, C_X = F, \qquad (16)

where C_X is the N_T-vector of unknown concentrations at the points X_T of the set B. The matrix A(C_X) can be represented as the union of submatrices

A(C_X) = \sum_{e \in \partial\varepsilon_h} N_e A_e(C_X) N_e^T, \qquad (17)

N_e being the respective assembling matrices, consisting of zeros and ones. Here A_e(C_X) is a 2 × 2 matrix of the form

A_e(C_X) = \begin{pmatrix} K_e^+ & -K_e^- \\ -K_e^+ & K_e^- \end{pmatrix} \qquad (18)

for any internal face e, and a 1 × 1 matrix of the form A_e(C_X) = K_B^+ for any e ∈ Γ_D. For the component F_T of the right-hand-side vector F corresponding to tetrahedron T, the following relation holds:

F_T = \int_T f\,dx - \sum_{e \in \partial T \cap \Gamma_D} K_B^- - \sum_{e \in \partial T \cap \Gamma_N} \int_e g_N\, ds. \qquad (19)
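The assembly (17) amounts to a scatter-add of small face matrices into the global matrix: each internal face contributes a 2 × 2 block into the rows and columns of its two neighbouring elements, and each Dirichlet face adds its 1 × 1 coefficient to a diagonal entry. A simplified sketch with made-up block values and a hypothetical element numbering (the real blocks come from (18), (14) and the nonlinear coefficients):

```python
import numpy as np

def assemble(n, faces, face_blocks, dirichlet):
    """Scatter-add face contributions, mimicking A = sum_e N_e A_e N_e^T."""
    A = np.zeros((n, n))
    for (p, m), Ae in zip(faces, face_blocks):  # p, m: indices of T+ and T-
        A[np.ix_([p, m], [p, m])] += Ae         # the N_e A_e N_e^T term
    for p, kb in dirichlet:                     # faces on Gamma_D: 1x1 blocks
        A[p, p] += kb
    return A

# Toy data: 3 elements, 2 internal faces, 1 Dirichlet face
faces = [(0, 1), (1, 2)]
face_blocks = [np.array([[1.0, -0.8], [-1.0, 0.8]]),   # [[K+, -K-], [-K+, K-]]
               np.array([[0.9, -0.7], [-0.9, 0.7]])]
A = assemble(3, faces, face_blocks, dirichlet=[(0, 0.5)])
print(A)
```

Each internal-face block has zero column sums, so the assembled matrix is column diagonally dominant, strictly so in the columns touched by Dirichlet faces.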

System (16) is solved using the Picard iteration

A(C_X^k)\, C_X^{k+1} = F \qquad (20)

with some initial approximation C_X^0.

To construct monotone schemes, we define the location of a point X_T ∈ B corresponding to an arbitrary tetrahedron T = ABCD in the initial mesh ε_h with faces a, b, c and d opposite to A, B, C and D, respectively. Let R_A, R_B, R_C and R_D be the position vectors of the corresponding vertices of T. The vectors n_a, n_b, n_c and n_d are the outward normals


to the faces. Their lengths are numerically equal to the surface areas of the corresponding faces. Define

R_{X_T} = \frac{R_A\|n_a\|_D + R_B\|n_b\|_D + R_C\|n_c\|_D + R_D\|n_d\|_D}{\|n_a\|_D + \|n_b\|_D + \|n_c\|_D + \|n_d\|_D}, \qquad (21)

where \|n_\beta\|_D = \sqrt{(Dn_\beta, n_\beta)} and β ∈ {a, b, c, d}. Note that, for an isotropic tensor, expression (21) gives the coordinates of the center of the sphere inscribed in T.
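Formula (21) is a weighted average of the vertices, with the D-norms of the opposite-face normals as weights. A small NumPy sketch that recomputes the face normals from the vertex coordinates (D is assumed symmetric positive definite; the tetrahedron below is made up):

```python
import numpy as np

def support_point(verts, D):
    """Point X_T from (21): vertices weighted by ||n||_D of the
    opposite faces, where |n| equals the face area."""
    verts = np.asarray(verts, dtype=float)  # rows: A, B, C, D
    w = []
    for i in range(4):
        p, q, r = verts[[j for j in range(4) if j != i]]
        n = 0.5 * np.cross(q - p, r - p)    # normal of the face opposite vertex i
        w.append(np.sqrt(n @ D @ n))        # ||n||_D (the sign is irrelevant here)
    w = np.array(w)
    return (w[:, None] * verts).sum(axis=0) / w.sum()

T = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
xt = support_point(T, np.eye(3))
print(xt)  # isotropic D: the incenter, all coordinates 1/(3 + sqrt(3))
```

For an anisotropic D the weights change and X_T moves, but it always stays inside T because the weights are positive.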

2 Monotonicity of the Method

Hereafter we formulate the monotonicity property that is the main feature of the proposed FV method.

Theorem 1. Let the right-hand side in system (16) of the nonlinear finite-volume method be nonnegative (i.e., F_i ≥ 0), and let the boundary conditions satisfy g_D(x) ≥ 0 on Γ_D and g_N(x) ≤ 0 on Γ_N. Let (16) be the corresponding nonlinear system of the FV discretization for (1); let the support points of the degrees of freedom on the tetrahedra be given by formula (21); let the initial approximation satisfy (C_X^0)_i ≥ 0; and, for any internal face e, let the nonnegative values μ_i^e, i ∈ {1, 2, 3}, be chosen from solutions (10a)-(10c) on every Picard iteration (20). Then all the iterative approximations to C_X are nonnegative:

(C_X^k)_i \geq 0, \qquad i = 1, \dots, N_T, \qquad \forall k \geq 0.

Proof. We rely on the following definition of a monotone matrix: the matrix A is called monotone if the condition Ax ≥ 0 implies that the vector x is nonnegative. Assume that the matrix A(C_X) is monotone for any nonnegative vector C_X and that the right-hand side F is nonnegative. Then the solution C_X^{k+1} to system (20) is also a nonnegative vector. Taking into account (C_X^0)_i ≥ 0, we find by induction that (C_X^k)_i ≥ 0, ∀k ≥ 0, ∀i = 1, …, N_T.

Let us prove that the matrix A(C_X) is monotone for any nonnegative vector C_X and that the right-hand side F is nonnegative. Consider the coefficients K_+(C_{O_1}, C_{O_2}, C_{O_3}), K_−(C_{O_1}, C_{O_2}, C_{O_3}), K_B^+ and K_B^− in expressions (8) and (14) for the diffusion flux through a face. The coefficient K_B^+ is positive because D is positive definite. Plugging (5) (after eliminating C_{M,i}) into (6) gives formulas for K_+ and K_−:

K_+ = \sum_{i=1}^{3} \mu_i^e \cdot \frac{(D_+ n_e, n_e)}{3V^+} \cdot \frac{(D_- n_i^-, n_e)V_i^+}{(D_- n_i^-, n_e)V_i^+ - (D_+ n_i^+, n_e)V_i^-},

K_- = -\sum_{i=1}^{3} \mu_i^e \cdot \frac{(D_- n_e, n_e)}{3V^-} \cdot \frac{(D_+ n_i^+, n_e)V_i^-}{(D_- n_i^-, n_e)V_i^+ - (D_+ n_i^+, n_e)V_i^-}.


For K_+ and K_− to be positive and for K_B^− to be nonpositive, it is sufficient to show that

(D_- n_i^-, n_e) > 0, \qquad (D_+ n_i^+, n_e) < 0. \qquad (22)

Consider the tetrahedron ABCD ∈ ε_h with faces a, b, c and d opposite to the vertices A, B, C and D, respectively, and with normals n_a, n_b, n_c and n_d to these faces (the lengths of the normals are numerically equal to the surface areas of the corresponding faces). The point X_T inside the tetrahedron is defined by formula (21). Let n_{ab} be defined as the normal (external with respect to X_TBCD) to the plane X_TCD, let n_{bc} be defined as the normal (external with respect to X_TACD) to the plane X_TAD, and so on for n_{βγ}, where β, γ ∈ {a, b, c, d}, β ≠ γ. Since the length of a normal is not important for the proof of (22), n_{ab} can be calculated as

n_{ab} = \frac{1}{2}\left(\|n_a\|_D + \|n_b\|_D + \|n_c\|_D + \|n_d\|_D\right)\left(\overrightarrow{CX_T} \times \overrightarrow{DX_T}\right). \qquad (23)

For the vectors \overrightarrow{CX_T} and \overrightarrow{DX_T} we have the expressions

\overrightarrow{CX_T} = \frac{\overrightarrow{CA}\|n_a\|_D + \overrightarrow{CB}\|n_b\|_D + \overrightarrow{CD}\|n_d\|_D}{\|n_a\|_D + \|n_b\|_D + \|n_c\|_D + \|n_d\|_D}, \qquad \overrightarrow{DX_T} = \frac{\overrightarrow{DA}\|n_a\|_D + \overrightarrow{DB}\|n_b\|_D + \overrightarrow{DC}\|n_c\|_D}{\|n_a\|_D + \|n_b\|_D + \|n_c\|_D + \|n_d\|_D}.

Substituting them into vector product (23) gives

n_{ab} = n_b\|n_a\|_D - n_a\|n_b\|_D.

Let us show that (Dn_a, n_{ab}) < 0 and (Dn_b, n_{ab}) > 0 by using the Cauchy-Schwarz inequality:

(Dn_a, n_{ab}) = (Dn_a, n_b)\|n_a\|_D - (Dn_a, n_a)\|n_b\|_D = \|n_a\|_D\left[(n_a, n_b)_D - \|n_a\|_D\|n_b\|_D\right] < 0. \qquad (24)

Here, (·, ·)_D is the scalar product in the metric defined by the tensor D. Similarly, we can prove (Dn_b, n_{ab}) > 0 and inequalities of the form (Dn_β, n_{βγ}) < 0 and (Dn_γ, n_{βγ}) > 0, β ≠ γ, where β, γ ∈ {a, b, c, d}. In (22), n_i^− and n_i^+ are replaced by the corresponding vectors n_{βγ}, and n_e is replaced by n_β or n_γ. Then, using (24), we prove (22). Therefore, K_+ and K_− are positive and K_B^− is nonpositive. Thus, the matrix A(C_X) has the following properties.
– All the diagonal elements of A(C_X) are positive.
– All the off-diagonal elements of A(C_X) are nonpositive.
– The matrix is column diagonally dominant; this diagonal dominance is strict for columns corresponding to elements that have faces on the boundary of the computational domain with Dirichlet conditions.


Therefore, A^T(C_X) is an M-matrix and all the elements of (A^T(C_X))^{-1} are nonnegative. Since the transposition and inversion of matrices are commuting operations, we have (A^T(C_X))^{-1} = (A^{-1}(C_X))^T. Therefore, all the elements of A^{-1}(C_X) are nonnegative and A(C_X) is monotone. The nonnegativity of the right-hand side F represented by formula (19) is provided by the conditions of the theorem and the nonpositivity of the coefficients K_B^−. ⊓⊔
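The mechanism of the proof can be checked numerically: when A(C) has a positive diagonal, nonpositive off-diagonal entries and column diagonal dominance, its inverse is entrywise nonnegative, so each Picard step (20) maps a nonnegative right-hand side to a nonnegative iterate. A toy sketch with a made-up, mildly nonlinear 3 × 3 system (not the actual FV matrix):

```python
import numpy as np

def picard(A_of, F, C0, k_max=50, tol=1e-12):
    """Picard iteration A(C^k) C^{k+1} = F for a nonlinear system like (16)."""
    C = np.asarray(C0, dtype=float)
    for _ in range(k_max):
        C_new = np.linalg.solve(A_of(C), F)
        if np.linalg.norm(C_new - C) < tol:
            return C_new
        C = C_new
    return C

def A_of(C):
    """Hypothetical coefficient matrix: positive diagonal, nonpositive
    off-diagonal, diagonally dominant for C >= 0 (an M-matrix)."""
    a = 0.1 / (1.0 + C[0])  # weak state-dependent coupling
    return np.array([[2.0, -a, -0.5],
                     [-a, 2.0, -0.5],
                     [-0.5, -0.5, 2.0]])

F = np.array([1.0, 0.5, 0.2])       # nonnegative right-hand side
C = picard(A_of, F, np.zeros(3))
print(C, bool((C >= 0).all()))      # the iterates stay nonnegative
```

Every iterate is the solution of a linear system with a monotone matrix and a nonnegative right-hand side, which is exactly the induction step of the proof.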

Remark 2. The validity of (22) implies that the values μ_i^e ≥ 0, i ∈ {1, 2, 3}, required in the assumption of the theorem can always be chosen by setting p^e = 0 ∀e in (10a)-(10c). The range of p^e for which the μ_i^e are positive is an interval; it may degenerate into the point p^e = 0 when two of the three C_{O_i} are zero. If C_{O_i} = 0 ∀i ∈ {1, 2, 3}, then solution (10a)-(10c) is always positive and does not depend on p^e.

Remark 3. The point X_T given by (21) is a solution to the system of six equations determining the equality of the angles in the D-metric between the vectors n_β, n_{βγ} and n_γ, −n_{βγ}, where β, γ ∈ {a, b, c, d} and β ≠ γ.

Corollary 1. Consider the nonstationary diffusion equation

\frac{\partial C}{\partial t} - \nabla \cdot D\nabla C = f \qquad (25)

with a nonnegative right-hand side, a nonnegative initial condition, and a nonnegative Dirichlet boundary condition. The nonlinear FV method is used to construct the implicit scheme

\left(\frac{V}{\Delta t} + A(C_X^{n+1})\right) C_X^{n+1} = \frac{V}{\Delta t}\, C_X^n + F^{n+1},

where V is a diagonal matrix of the elements' volumes and F involves the right-hand side and the boundary conditions. At every time step, the system is solved by the Picard method

\left(\frac{V}{\Delta t} + A(C_X^{n+1,k})\right) C_X^{n+1,k+1} = \frac{V}{\Delta t}\, C_X^n + F^{n+1}, \qquad k = 1, 2, \dots, \qquad C_X^{n+1,0} = C_X^n.

If the μ_i^e, ∀e, i ∈ {1, 2, 3}, are positive, then (C_X^{n+1,k})_j ≥ 0, j = 1, …, N_T, k = 1, 2, …

Corollary 2. In the explicit scheme for the discretization of (25),

\frac{V}{\Delta t}\, C_X^{n+1} = \left(\frac{V}{\Delta t} - A(C_X^n)\right) C_X^n + F,

the solution C_X^{n+1} can be made nonnegative by choosing a sufficiently small Δt ensuring that the diagonal elements of V/Δt − A(C_X^n) are nonnegative (its off-diagonal elements are obviously nonnegative). Moreover, Δt ∼ h² (where h is the size of a quasi-uniform mesh), which is similar to the stability condition for explicit schemes.

Although the convergence of the discrete solution to the solution of differential problem (1a)-(1c) is not proved, test computations have revealed that the nonlinear finite-volume method with coefficients (11) has quadratic convergence with respect to the concentration and linear convergence with respect to the diffusion fluxes. At the same time, the convergence of the Picard iterations is not guaranteed, and this problem may become a key question in the further development of this method.
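The time-step bound in Corollary 2 is straightforward to evaluate: Δt may not exceed min_i V_ii/A_ii over the positive diagonal entries. A sketch with made-up volumes and coefficients:

```python
import numpy as np

def max_stable_dt(V_diag, A):
    """Largest dt keeping diag(V/dt - A) nonnegative: dt <= V_ii / A_ii."""
    V_diag = np.asarray(V_diag, dtype=float)
    d = np.diag(A)
    mask = d > 0
    return np.min(V_diag[mask] / d[mask])

V_diag = np.array([1.0, 2.0, 1.5])            # element volumes (made up)
A = np.array([[4.0, -1.0, -0.5],              # A(C^n): M-matrix-like values
              [-1.0, 5.0, -1.0],
              [-0.5, -1.0, 3.0]])
dt = max_stable_dt(V_diag, A)                 # 0.25 here: limited by element 0

# One explicit step (V/dt) C^{n+1} = (V/dt - A) C^n + F stays nonnegative
C0 = np.array([1.0, 0.0, 0.2])
F = np.array([0.1, 0.0, 0.0])
C1 = C0 + dt / V_diag * (F - A @ C0)
print(dt, C1)
```

With this Δt the update matrix V/Δt − A has only nonnegative entries, so nonnegative data produce a nonnegative next step, as the corollary states.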

Acknowledgements

The author is grateful to Yu. V. Vassilevski, C. Le Potier, D. A. Svyatski, and K. N. Lipnikov for fruitful discussions of the problem and the ideas used in the development of the method. This work was supported in part by the Russian Foundation for Basic Research (project no. 04-07-90336), by the program "Computational and Information Issues of the Solution to Large-Scale Problems" of the Department of Mathematical Sciences of the Russian Academy of Sciences, and by a grant from the Foundation for the Support of National Science for the best graduate students of the Russian Academy of Sciences.

References

1. A. Bourgeat, M. Kern, S. Schumacher and J. Talandier. The COUPLEX test cases: Nuclear waste disposal simulation. Computational Geosciences, 2004, 8, pp. 83-98.
2. G. Bernard-Michel, C. Le Potier, A. Beccantini, S. Gounand and M. Chraibi. The Andra Couplex 1 test case: Comparisons between finite element, mixed hybrid finite element and finite volume discretizations. Computational Geosciences, 2004, 8, pp. 83-98.
3. I. V. Kapyrin. A family of monotone methods for the numerical solution of three-dimensional diffusion problems on unstructured tetrahedral meshes. Doklady Mathematics, 2007, Vol. 76, No. 2, pp. 734-738.
4. C. Le Potier. Schéma volumes finis monotone pour des opérateurs de diffusion fortement anisotropes sur des maillages de triangles non structurés. C. R. Acad. Sci. Paris, 2005, Ser. I 341, pp. 787-792.
5. K. Lipnikov, M. Shashkov, D. Svyatski and Yu. Vassilevski. Monotone finite volume schemes for diffusion equations on unstructured triangular and shape-regular polygonal meshes. Journal of Computational Physics, 2007, Vol. 227, No. 1, pp. 492-512.
6. A. A. Samarskii and P. N. Vabishchevich. Numerical Methods for Solving Convection-Diffusion Problems. Editorial URSS, Moscow, 1999, 248 p. [in Russian].

Sparse Approximation of FEM Matrix for Sheet Current Integro-Differential Equation⋆

Mikhail Khapaev¹ and Mikhail Yu. Kupriyanov²

¹ Dept. of Computer Science, Moscow State University, 119992 Moscow, Russia, vmhap@cs.msu.su
² Nuclear Physics Institute, Moscow State University, 119992 Moscow, Russia, mkupr@pn.sinp.msu.ru

Abstract. We consider a two-dimensional integro-differential equation for currents in thin superconducting films. The integral operator of this equation is a hypersingular operator with a kernel decaying as 1/R³. For the numerical solution, the Galerkin finite element method (FEM) on a triangular mesh with linear elements is used. It results in a dense FEM matrix of large dimension. As the kernel decays quickly, the off-diagonal elements of the FEM matrix are small. We investigate a simple sparsification approach based on dropping small entries of the FEM matrix. The conclusion is that it allows one to reduce memory requirements to some extent. Nevertheless, for problems with a large number of mesh points, more sophisticated techniques, such as hierarchical matrix algorithms, should be considered.

Keywords: superconductivity, FEM, sparse matrix.

1 Introduction

In this paper we consider the problem of the numerical solution of a boundary value problem for an integro-differential equation for the sheet current in thin superconducting films. The simplest form of this equation for a single conductor is

-\lambda_\perp \Delta\psi(r) + \frac{1}{4\pi}\iint_S \left(\nabla\psi(r'),\, \nabla' \frac{1}{|r - r'|}\right) ds + H_z(r) = 0, \qquad (1)

where λ_⊥ is a constant parameter, S is a 2D bounded domain in the plane (x, y), and r = (x, y). ψ(r) is the unknown function; it is the stream-function (potential) representation of the 2D sheet current. H_z(r) is the right-hand side and has the sense of the z-component of the external magnetic field. The boundary condition for (1) is

\psi(r) = F(r), \qquad r \in \partial S. \qquad (2)

Here the function F(r) is completely defined by the inlet and outlet currents over the conductor boundary ∂S and the currents circulating around holes in S. In the paper we

⋆ The paper is supported by ISTC project 3174.


evaluate the problem in a more general form, accounting for several simply connected conductors with holes and the finite thickness of the films. Our interest in problem (1), (2) is motivated by computations of inductances and current fields in microelectronic superconductor structures [1, 2]. Traditionally, problems for surface, sheet or volume currents are equally solved using the PEEC (Partial Element Equivalent Circuit) technique [3, 4]. This approach leads to an equation with a weakly singular kernel. In our case it is

\lambda_\perp J(r) + \frac{1}{4\pi}\iint_S \frac{J(r')}{|r - r'|}\, ds = -\nabla\chi(r), \qquad (3)

\nabla \cdot J(r) = 0, \qquad \Delta\chi = 0. \qquad (4)

In (3), J(r) is the unknown current and χ(r) is one more unknown function (the phase). Equation (1) can be obtained from (3) by differentiation. Equation (3) needs boundary conditions for the function χ(r) and the current J(r). Equations similar to (3) are well known for normal conductors. Approaches similar to PEEC for (3) for superconductors are also known [6, 7]. For a normal conductor, the function χ(r) has the sense of a voltage potential. Recently, the fast-multipole-based program FASTHENRY [5] for (3) was adapted for superconductors [8].

The main problem in the numerical solution of (1) or (3) is the dense matrix of large size. It is necessary to fill this matrix fast and then to store it or its approximation. It is also necessary to have a fast and reliable method for solving the system of linear equations with this matrix. Otherwise, the simulation of many practical problems can be unfeasible.

We prefer to solve equation (1) instead of (3) because (1) accounts for important physical features of the problem and because of numerical efficiency considerations:
– Many superconductivity problems are formulated solely in terms of currents and magnetic field. In these cases it is difficult to define boundary conditions for χ(r).
– Holes in S are a problem for (3) and an easy task for (1). Given currents circulating around holes are accounted for in the boundary conditions through the function F(r) in (2). Non-decaying currents circulating around holes are typical for problems in superconductivity.
– FEM for (1) has better numerical approximation than PEEC and thus can give a smaller system of linear equations.
– The off-diagonal FEM matrix elements for (1) quickly tend to zero with the distance between finite elements.

In this paper we outline the derivation of the boundary value problem for integro-differential equations for sheet currents in thin superconducting films. Properties of the operators are discussed and the finite element method is formulated. We study the decay of the matrix elements and formulate a simple strategy for dropping small


elements of the matrix. Then a direct sparse solver is used for the factorization and solution. Two numerical examples are considered. The sparsification technique we developed allows one to extend the set of problems that can be efficiently solved. It is also shown that, even for quickly decaying kernels, more sophisticated methods for solving large dense FEM (Galerkin) systems of equations, such as those in [9, 10], should be used.
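The dropping strategy can be sketched as thresholding entries relative to the corresponding diagonal entries; the drop rule and the 1/R³-like test matrix below are illustrative assumptions, not the exact criterion of the paper:

```python
import numpy as np

def drop_small(A, eps):
    """Zero out entries a_ij with |a_ij| <= eps * sqrt(|a_ii| |a_jj|);
    the diagonal is always kept."""
    A = np.asarray(A, dtype=float)
    d = np.sqrt(np.abs(np.diag(A)))
    keep = np.abs(A) > eps * np.outer(d, d)
    np.fill_diagonal(keep, True)
    return np.where(keep, A, 0.0)

# Dense matrix whose off-diagonal decay mimics the 1/R^3 kernel
n = 50
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
A = 1.0 / (1.0 + np.abs(i - j)) ** 3
A_s = drop_small(A, eps=1e-2)
nnz = np.count_nonzero(A_s)
print(nnz, n * n)  # only a narrow band of entries survives
```

The surviving entries can then be stored in a compressed sparse format and handed to a direct sparse solver, as described above.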

2 Equations evaluation

2.1 Preliminaries

In this paper we study the currents in conducting layers separated by layers of dielectric. Let t_m be the thickness of the conducting layers and d_k be the thickness of the dielectric layers, where k and m are the numbers of the layers. The conducting layers can contain a few simply connected conductors of arbitrary shape. Let the number of conductors in all layers be N_c and the total number of holes in all conductors be N_h. Each conductor can have current terminals where inlet or outlet currents are given. For a large class of microwave and digital circuits it can be assumed [11, 6] that d_k ≪ l and t_m ≪ l, where l is the typical lateral size of the circuit in the plane (x, y).

Each conductor occupies a space domain V_m = S_m × [h_m^0, h_m^1], m = 1, …, N_c. The two-dimensional domain S_m is the projection of the conductor onto the plane (x, y). We call ∂S_m, the boundary of the projection S_m, the boundary of the conductor. Let ∂S_{h,k} be the boundary of the hole with number k and ∂S_{ext,m} the external boundary of the m-th conductor. We assume that all current terminals are on the external boundaries of the conductors. The magnetic field is excited by the external magnetic field, the currents circulating around holes, and the currents through chains of terminals on the conductors.

For further convenience, let P, P_0 stand for points in 3D space and r, r_0 for points in the plane. Also, consider the differential operators ∂_x = ∂/∂x, ∂_y = ∂/∂y, ∇_xy = (∂_x, ∂_y).

2.2 London Equations for Conductors of Finite Thickness

The basic equations for the further consideration are the static London equations [1]. Let j be the current density and H the total magnetic field, including the self-field of j and the external magnetic field; λ is the so-called London penetration depth [1]. Then the basic equations are:

\lambda^2 \nabla \times j + H = 0, \qquad (5)

\nabla \times H = j. \qquad (6)


Typically, λ and the film thickness are of the same order. As the film is assumed thin, j ≈ j(x, y), and the problem reduces to the z-component of (5) [12]:

\lambda^2\left(\partial_x j_y(P_0) - \partial_y j_x(P_0)\right) + H_z(P_0) = 0. \qquad (7)

Consider the sheet current density J_m(r):

J_m(r) = \int_{h_m^0}^{h_m^1} j(P)\, dz, \qquad r \in S_m. \qquad (8)

The self magnetic field in (7) is calculated by means of the average current density J_n(r)/t_n and the Biot-Savart formula:

H(P_0) = \frac{1}{4\pi} \sum_{n=1}^{N_c} \int_{V_n} \frac{1}{t_n}\, J_n(r) \times \nabla_P \frac{1}{|P - P_0|}\, dv_P. \qquad (9)

Consider the London penetration depth for films:

\lambda_m^s = \lambda_m^2 / t_m. \qquad (10)

Averaging (7) over the thickness of the conductors, we obtain the following equations for the sheet currents in the conductors:

\lambda_m^s\left(\partial_x J_{m,y}(r_0) - \partial_y J_{m,x}(r_0)\right) + \frac{1}{4\pi} \sum_{n=1}^{N_c} \iint_{S_n} \left(J_n(r) \times \nabla_{xy} G_{mn}(r, r_0)\right)_z ds_r + H_z(r_0) = 0, \qquad (11)

where r_0 ∈ S_m, m = 1, …, N_c, H_z(r) is the z-component of the external magnetic field, and

G_{mn}(r, r_0) = \frac{1}{t_m t_n} \int_{h_m^0}^{h_m^1} dz_0 \int_{h_n^0}^{h_n^1} \frac{1}{|P - P_0|}\, dz. \qquad (12)

Equations (11) must be completed by the charge conservation law ∇ · J_m = 0, m = 1, …, N_c. Our goal is to take into account the small but finite thickness of the conductors. Therefore we substitute both of the one-dimensional integra

We are pleased to acknowledge that Alexander Guterman engaged in thorough editing of the "Algebra and Matrices" papers, Maxim Olshanskii and Yuri Vassilevski invested their time and expertise in the "Matrices and Applications" part, and Sergei Goreinov committed himself to the enormous technical necessities of making the texts into pages. It is much appreciated that the Moscow meeting that gave a base to this book was supported by the Russian Foundation for Basic Research, the Russian Academy of Sciences, the International Foundation for Technology and Investments, Neurok Techsoft, and the University of Insubria (Como, Italy).


The soul of the meeting was Gene Golub, who rendered a charming "Golub's dimension" to the three main axes of the conference topics. This book now comes out in his everlasting, eminently bright and immensely grateful memory.

Vadim Olshevsky
Eugene Tyrtyshnikov

Table of Contents

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v

Algebra and Matrices

Operators Preserving Primitivity for Matrix Pairs . . . . . . . . . . . . . . . . . . . . . 2
L. B. Beasley (Utah State University), A. E. Guterman (Moscow State University)

Decompositions of quaternions and their matrix equivalents . . . . . . . . . . . . 20
D. Janovska (Institute of Chemical Technology), G. Opfer (University of Hamburg)

Sensitivity analysis of Hamiltonian and reversible systems prone to dissipation-induced instabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
O. N. Kirillov (Moscow State University)

Block triangular miniversal deformations of matrices and matrix pencils . 69
L. Klimenko (Computing Centre of Ministry of Labour and Social Policy of Ukraine), V. V. Sergeichuk (Kiev Institute of Mathematics)

Determining the Schein rank of Boolean matrices . . . . . . . . . . . . . . . . . . . . . 85
E. E. Marenich (Murmansk State Pedagogic University)

Lattices of matrix rows and matrix columns. Lattices of invariant column eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
V. Marenich (Murmansk State Pedagogic University)

Matrix algebras and their length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
O. V. Markova (Moscow State University)

On a New Class of Singular Nonsymmetric Matrices with Nonnegative Integer Spectra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
T. Nahtman (University of Tartu), D. von Rosen (Swedish University of Agricultural Sciences)

Reduction of a set of matrices over a principal ideal domain to the Smith normal forms by means of the same one-sided transformations . . . . 166
V. M. Prokip (Institute for Applied Problems of Mechanics and Mathematics)


Matrices and Algorithms

Nonsymmetric algebraic Riccati equations associated with an M-matrix: recent advances and algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
D. Bini (University of Pisa), B. Iannazzo (University of Insubria), B. Meini (University of Pisa), F. Poloni (Scuola Normale Superiore of Pisa)

A generalized conjugate direction method for nonsymmetric large ill-conditioned linear systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
E. R. Boudinov (FORTIS Bank, Brussels), A. I. Manevich (Dnepropetrovsk National University)

There exist normal Hankel (φ, ψ)-circulants of any order n . . . . . . . . . . . . . 222
V. Chugunov (Institute of Numerical Math. RAS), Kh. Ikramov (Moscow State University)

On the Treatment of Boundary Artifacts in Image Restoration by reflection and/or anti-reflection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
M. Donatelli (University of Insubria), S. Serra-Capizzano (University of Insubria)

Zeros of Determinants of λ-Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
W. Gander (ETH, Zurich)

How to find a good submatrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
S. Goreinov (INM RAS), I. Oseledets (INM RAS), D. Savostyanov (INM RAS), E. Tyrtyshnikov (INM RAS), N. Zamarashkin (INM RAS)

Conjugate and Semi-Conjugate Direction Methods with Preconditioning Projectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
V. Il'in (Novosibirsk Institute of Comp. Math.)

Some Relationships between Optimal Preconditioner and Superoptimal Preconditioner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
J.-B. Chen (Shanghai Maritime University), X.-Q. Jin (University of Macau), Y.-M. Wei (Fudan University), Zh.-L. Xu (Shanghai Maritime University)

Scaling, Preconditioning, and Superlinear Convergence in GMRES-type iterations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
I. Kaporin (Computing Center of Russian Academy of Sciences)


Toeplitz and Toeplitz-block-Toeplitz matrices and their correlation with syzygies of polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
H. Khalil (Institute Camille Jordan), B. Mourrain (INRIA), M. Schatzman (Institute Camille Jordan)

Concepts of Data-Sparse Tensor-Product Approximation in Many-Particle Modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
H.-J. Flad (TU Berlin), W. Hackbusch (Max-Planck-Institute, Leipzig), B. Khoromskij (Max-Planck-Institute, Leipzig), R. Schneider (TU Berlin)

Separation of variables in nonlinear Fermi equation . . . . . . . . . . . . . . . . . . . . 348
Yu. I. Kuznetsov (Novosibirsk Institute of Comp. Math.)

Faster Multipoint Polynomial Evaluation via Structured Matrices . . . . . . . 354
B. Murphy (Lehman College), R. E. Rosholt (Lehman College)

Testing Pivoting Policies in Gaussian Elimination . . . . . . . . . . . . . . . . . . . . . 357
B. Murphy (Lehman College), G. Qian (University of New York), R. E. Rosholt (Lehman College), A.-L. Zheng (University of New York), S. Ngnosse (University of New York), I. Taj-Eddin (University of New York)

Newton's Iteration for Matrix Inversion, Advances and Extensions . . . . . . 364
V. Y. Pan (Lehman College)

Truncated decompositions and filtering methods with Reflective/Anti-Reflective boundary conditions: a comparison . . . . . . . . . . . 382
C. Tablino Possio (University of Milano Bicocca)

Discrete-time stability of a class of hermitian polynomial matrices with positive semidefinite coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
H. Wimmer (University of Wurzburg)

Matrices and Applications

Splitting algorithm for solving mixed variational inequalities with inversely strongly monotone operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
I. Badriev (Kazan State University), O. Zadvornov (Kazan State University)

Multilevel Algorithm for Graph Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . 434
N. Bochkarev (Neurok), O. Diyankov (Neurok), V. Pravilnikov (Neurok)


2D-extension of Singular Spe trum Analysis: algorithm and elements of theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450

N. E. Golyandina (St. Petersburg State University), K. D. Usevi h (St. Petersburg State University)

Appli ation of Radon transform for fast solution of boundary value problems for ellipti PDE in domains with ompli ated geometry . . . . . . . 475

A. I. Grebennikov (Autonomous University of Puebla)

Appli ation of a multigrid method to solving diusion-type equations . . . 483

M. E. Ladonkina (Institute for Math. Modelling RAS), O. Yu. Milukova (Institute for Math. Modelling RAS), V. F. Tishkin (Institute for Math. Modelling RAS)

Monotone matri es and nite volume s hemes for diusion problems preserving non-negativity of solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501

I. Kapyrin (Institute of Numerical Math. RAS)

Sparse Approximation of FEM Matrix for Sheet Current Integro-Differential Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511

M. Khapaev (Moscow State University), M. Kupriyanov (Nuclear Physics Institute)

The method of magnetic field computation in presence of an ideal conductive multiconnected surface by using the integro-differential equation of the first kind . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 524

T. Kochubey (Southern Scientific Centre RAS), V. I. Astakhov (Southern Scientific Centre RAS)

Spectral model order reduction preserving passivity for large multiport RCLM networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534

Yu. M. Nechepurenko (Institute of Numerical Math. RAS), A. S. Potyagalova (Cadence), I. A. Karaseva (Moscow Institute of Physics and Technology)

New Smoothers in Multigrid Methods for Strongly Nonsymmetric Linear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 540

G. Muratova (Southern Federal University), E. Andreeva (Southern Federal University)

Operator equations for eddy currents on singular carriers . . . . . . . . . . . . . . 547

J. Naumenko (Southern S ienti Centre RAS)

Matrix approach to modelling of polarized radiation transfer in heterogeneous systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 558


T. A. Sushkevich (Keldysh Institute for Applied Mathematics), S. A. Strelkov (Keldysh Institute for Applied Mathematics), S. V. Maksakova (Keldysh Institute for Applied Mathematics)

The Method of Regularization of Tikhonov Based on Augmented Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 580

A. I. Zhdanov (Samara State Aerospace University), T. G. Parchaikina (Samara State Aerospace University)

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587


ALGEBRA AND MATRICES

Operators Preserving Primitivity for Matrix Pairs

LeRoy B. Beasley^1 and Alexander E. Guterman^2,⋆

^1 Department of Mathematics and Statistics, Utah State University, Logan, Utah 84322-4125, USA, [email protected]

^2 Department of Algebra, Faculty of Mathematics and Mechanics, Moscow State University, Moscow, 119991, GSP-1, Russia, [email protected]

1 Introduction

A nonnegative matrix is called primitive if some power of it has only positive entries; equivalently, it is irreducible and its spectral radius is the only eigenvalue of maximal modulus; equivalently, the greatest common divisor of the lengths of all circuits in the associated directed graph is equal to 1. An alternative definition of primitivity arises in the asymptotic analysis of homogeneous discrete-time positive systems of the form

x(t + 1) = Ax(t),   t = 0, 1, . . . ,   (1)

where the non-negative vector x(0) represents the initial state. In this context the primitivity of A can be equivalently restated as the property that any positive initial condition x(0) produces a state evolution which becomes strictly positive within a finite number of steps. The two-dimensional (2D) analogues of such systems are described by the following equation, see [11],

x(h + 1, k + 1) = Ax(h, k + 1) + Bx(h + 1, k),   h, k ∈ Z, h + k > 0,   (2)

where A and B are n × n nonnegative matrices and the initial conditions x(h, −h), h ∈ Z, are nonnegative n × 1 vectors. Positive discrete homogeneous 2D-dynamical systems are used to model diffusion processes, water pollution, etc., see [6, 7]. An entry of the vector x(h, k) typically represents a quantity, such as pressure, concentration or density, at a particular site along a stream. It can be seen that at each time-step the conditions of a site are determined by its previous conditions and the conditions of the site directly upstream from it, see [7, 11]. To investigate systems of type (2), we need the following concept:

Definition 1. Let A, B ∈ Mn(Z), and let h, k be non-negative integers. The (h, k)-Hurwitz product, denoted by (A, B)^(h,k), is the sum of all matrices which are products of h copies of A and k copies of B.

⋆ The second author wishes to thank the grants RFBR 05-01-01048, NSh-5666.2006.1 and MK-2718.2007.1 for partial financial support.


Example 1. (A, B)^(1,0) = A and

(A, B)^(2,2) = A^2 B^2 + ABAB + AB^2 A + BA^2 B + BABA + B^2 A^2.

In general the Hurwitz product satisfies the following recurrence relations: (A, B)^(h,0) = A^h, (A, B)^(0,k) = B^k, and

(A, B)^(h,k) = A (A, B)^(h−1,k) + B (A, B)^(h,k−1)   for h, k ≥ 1.

It can be directly checked, see [11], that the solution of (2) can be represented in the following way:

x(h, k) = Σ_{s=0}^{h+k} (A, B)^(s, h+k−s) x(h − s, s − h) = Σ_{s=0}^{h+k} (A, B)^(h+k−s, s) x(s − k, k − s).

Thus the Hurwitz products (A, B)^(h,k) with h + k = t and the initial conditions determine the state after t time-steps. It is natural to ask for necessary and sufficient conditions on the matrix pair (A, B) in order that the solutions of (2) are eventually (i.e., for all (h, k) with h + k sufficiently large) strictly positive for each appropriate sequence of initial values. As for the system (1), where the analogous question is answered in terms of primitivity, in this case primitivity for matrix pairs is needed, which means the existence of integers h, k, h + k > 0, such that the Hurwitz product (A, B)^(h,k) is a positive matrix.

Definition 2. The exponent of the primitive pair (A, B) is the minimum value of h + k taken over all pairs (h, k) such that (A, B)^(h,k) is positive.
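The recurrence above translates directly into a computation over 0/1 patterns (replacing a nonnegative matrix by its Boolean pattern does not change primitivity). The following sketch is our illustration, not part of the paper: it computes (A, B)^(h,k) by the recurrence and searches for the exponent of a pair.

```python
import numpy as np

def bool_mul(X, Y):
    """Matrix product over the Boolean semiring (+ is OR, * is AND)."""
    return (X.astype(int) @ Y.astype(int)) > 0

def hurwitz(A, B, h, k, _memo=None):
    """(h,k)-Hurwitz product of the 0/1 patterns of A and B, via
    (A,B)^(h,k) = A (A,B)^(h-1,k) + B (A,B)^(h,k-1)."""
    n = A.shape[0]
    if _memo is None:
        _memo = {}
    if (h, k) in _memo:
        return _memo[(h, k)]
    if h == 0 and k == 0:
        M = np.eye(n, dtype=bool)       # empty product
    else:
        M = np.zeros((n, n), dtype=bool)
        if h > 0:
            M |= bool_mul(A > 0, hurwitz(A, B, h - 1, k, _memo))
        if k > 0:
            M |= bool_mul(B > 0, hurwitz(A, B, h, k - 1, _memo))
    _memo[(h, k)] = M
    return M

def pair_exponent(A, B, max_total=64):
    """Smallest h + k with (A,B)^(h,k) positive (Definition 2); among the
    minimizing pairs, the one with the largest h is reported, following
    Definition 10 below.  Returns None if nothing is found up to max_total."""
    for t in range(1, max_total + 1):
        for h in range(t, -1, -1):
            if hurwitz(A, B, h, t - h).all():
                return (h, t - h)
    return None
```

For instance, for a primitive matrix A, pair_exponent(A, O) recovers the classical exponent of A, since every Hurwitz product involving at least one copy of O vanishes.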

An important issue in dealing with primitive matrices or matrix pairs is to find the complete list of matrix operators which map primitive matrices to primitive matrices or primitive matrix pairs to primitive matrix pairs. If such transformations exist then they allow us to simplify the system without losing its main property, namely, the primitivity. In this paper we deal with such transformations. Following Frobenius, Schur and Dieudonné, many authors have studied the problems of determining the maps on the n × n matrix algebra Mn(F) over a field F that leave certain matrix relations, subsets, or properties invariant. For a survey of problems and results of this type see [9, 10]. The notion of primitivity is related to nonnegative matrices, i.e., matrices with entries in the semiring of nonnegative real numbers. In the last decades much attention has been paid to Preserver Problems for matrices over various semirings, where completely different techniques are necessary to obtain a classification of operators with certain preserving properties; see [10, Section 9.1] and the references therein for more details. The notion of a semiring can be introduced as follows.


Definition 3. A semiring S consists of a set S and two binary operations, addition and multiplication, such that:
– S is an Abelian monoid under addition (identity denoted by 0);
– S is a semigroup under multiplication (identity, if any, denoted by 1);
– multiplication is distributive over addition on both sides;
– s0 = 0s = 0 for all s ∈ S.
In this paper we will always assume that there is a multiplicative identity 1 in S which is different from 0.

We need the following special class of semirings:

Definition 4. A semiring is called antinegative if the zero element is the only element with an additive inverse.

Standard examples of semirings which are not rings are antinegative; these include the non-negative reals and integers, max-algebras, and Boolean algebras.

Definition 5. A binary Boolean semiring, B, is the set {0, 1} with the operations:

0 + 0 = 0            0 · 0 = 0
0 + 1 = 1 + 0 = 1    0 · 1 = 1 · 0 = 0
1 + 1 = 1            1 · 1 = 1.

We will not use the term "binary" in the sequel.
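As a brute-force illustration (ours, not the authors'), one can check that B = ({0, 1}, +, ·) with the tables above satisfies all the semiring axioms of Definition 3 and is antinegative in the sense of Definition 4:

```python
from itertools import product

B = (0, 1)
add = lambda a, b: a | b   # 0+0=0, 0+1=1+0=1, 1+1=1
mul = lambda a, b: a & b   # 0*0=0, 0*1=1*0=0, 1*1=1

# Abelian monoid under addition, with identity 0
assert all(add(a, b) == add(b, a) for a, b in product(B, B))
assert all(add(add(a, b), c) == add(a, add(b, c)) for a, b, c in product(B, B, B))
assert all(add(a, 0) == a for a in B)

# semigroup under multiplication, with identity 1; 0 is absorbing
assert all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a, b, c in product(B, B, B))
assert all(mul(a, 1) == a and mul(1, a) == a for a in B)
assert all(mul(a, 0) == 0 and mul(0, a) == 0 for a in B)

# two-sided distributivity
assert all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a, b, c in product(B, B, B))
assert all(mul(add(a, b), c) == add(mul(a, c), mul(b, c)) for a, b, c in product(B, B, B))

# antinegativity: only 0 has an additive inverse (1 + x = 1 for all x)
assert all(add(a, b) != 0 for a in B for b in B if a != 0)
```

The last assertion is exactly why Lemma 1 below can argue by "antinegativity of B": once an entry is nonzero, no further addition can cancel it.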

Linear operators on certain antinegative semirings without zero divisors that strongly preserve primitivity were characterized by L. B. Beasley and N. J. Pullman in [3, 4]. Let us note that linear transformations T : M(S) → M(S) preserving primitive matrix pairs obviously preserve primitivity, so they are classified in [3, 4]. To see this it is sufficient to consider primitive matrix pairs of the form (A, O). Then their images are primitive matrix pairs of the form (T(A), O). Hence, T(A) is primitive. However, if we consider operators on the Cartesian product M(S) × M(S), then there is no easy way to reduce the problem of characterization of operators preserving primitive matrix pairs to the problem of characterization of certain transformations in each component. In this paper we investigate the structure of surjective additive transformations on the Cartesian product preserving primitive matrix pairs. It turns out that for the characterization of these transformations we have to apply different and more involved techniques and ideas, such as primitive assignments, cycle matrices, etc.

Our paper is organized as follows: in Section 2 we collect some basic facts, definitions and notation; in Section 3 we characterize surjective additive transformations T : Mn²(B) → Mn²(B) preserving the set of primitive matrix pairs; in Section 4 we extend this result to matrices over an arbitrary antinegative semiring without zero divisors. Here Mm,n(B) denotes the set of m × n matrices with entries from the Boolean semiring B.


2 Preliminaries

In this paper, unless otherwise stated, S will denote any antinegative semiring without zero divisors and Mn(S) will denote the n × n matrices with entries from S. Further, we denote by Mn²(S) the Cartesian product of Mn(S) with itself, Mn(S) × Mn(S). The notions of primitivity and exponent for square matrices are classical.

Definition 6. A matrix A ∈ Mn(S) is primitive if there is an integer k > 0 such that all entries of A^k are non-zero. If A is primitive, the exponent of A is the smallest such k.

A classical example of a primitive matrix is the so-called Wielandt matrix.

Definition 7. The Wielandt matrix is

       ( 0 1           )
       (   0 1         )
Wn =   (     .  .      )  ∈ Mn(S),
       ( 1        0  1 )
       ( 1           0 )

i.e., Wn = E1,2 + E2,3 + · · · + En−1,n + En−1,1 + En,1.

Also we consider the following primitive matrix

       ( 1 1           )
       (   0 1         )
W'n =  (     .  .      )  ∈ Mn(S),
       (          0  1 )
       ( 1           0 )

i.e., W'n = E1,1 + E1,2 + E2,3 + · · · + En−1,n + En,1.
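Both patterns are easy to experiment with. The sketch below is our illustration (it assumes the reconstructed displays above: Wn is the n-cycle with the extra arc (n − 1) → 1, and W'n is the n-cycle with a loop at vertex 1); the primitivity test simply takes Boolean powers.

```python
import numpy as np

def wielandt(n):
    """Wn: the n-cycle 1 -> 2 -> ... -> n -> 1 plus the arc (n-1) -> 1,
    giving circuits of coprime lengths n and n - 1."""
    W = np.zeros((n, n), dtype=int)
    for q in range(n - 1):
        W[q, q + 1] = 1          # superdiagonal: q -> q+1
    W[n - 1, 0] = 1              # arc n -> 1
    W[n - 2, 0] = 1              # arc (n-1) -> 1
    return W

def wielandt_prime(n):
    """W'n: the n-cycle plus a loop at vertex 1."""
    W = np.zeros((n, n), dtype=int)
    for q in range(n - 1):
        W[q, q + 1] = 1
    W[n - 1, 0] = 1
    W[0, 0] = 1                  # the diagonal cell (1,1)
    return W

def exponent(A, max_pow=200):
    """Exponent of a primitive 0/1 matrix: the least k with A^k positive."""
    P = np.eye(A.shape[0], dtype=bool)
    for k in range(1, max_pow + 1):
        P = (P.astype(int) @ A) > 0
        if P.all():
            return k
    return None                  # not primitive within max_pow steps
```

For n = 4 this gives exponent(wielandt(4)) = 10 = n² − 2n + 2, the maximal possible value, while exponent(wielandt_prime(4)) is much smaller (the loop shortens the mixing time).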

These matrices are primitive, and the Wielandt matrix Wn is the matrix with the maximal possible exponent, see [8, Chapter 8.5].

Definition 8. An operator T : Mm,n(S) → Mm,n(S) is called linear if it is additive and T(αX) = αT(X) for all X ∈ Mm,n(S), α ∈ S.

Definition 9. We say that an operator T : Mn(S) → Mn(S) preserves (strongly preserves) primitivity if for a primitive matrix A the matrix T(A) is also primitive (A is primitive if and only if T(A) is primitive).

Definition 10. A pair (A, B) ∈ Mn²(S) is called primitive if there exist nonnegative integers h, k such that the matrix (A, B)^(h,k) is positive. In this case, we say that the exponent of (A, B) is (h, k), where h + k is the smallest integer such that (A, B)^(h,k) is positive, and if there is (a, b) such that a + b = h + k and (A, B)^(a,b) is positive then h ≥ a.


Example 2. The notion of primitive pairs generalizes the notion of primitivity. Indeed, pairs (A, B) with k = 0 and pairs (A, O) are primitive if and only if A is primitive. In particular, for any primitive matrix A ∈ Mn(S) the matrix pairs (A, O), (O, A), (A, A) are also primitive. For example, (Wn, O) and (O, Wn) are primitive. We note that there are primitive pairs (A, B) such that neither A nor B is primitive, for example

A := En,1,    B := ( 1 1 · · · 1
                     0 1 · · · 1
                       . . .
                     0 0 · · · 1 ),

the upper triangular matrix of all ones.

We will use the notion of irreducible matrices, and below we present the following two equivalent definitions of irreducibility, see [5] for details:

Definition 11. A matrix A ∈ Mn(S) is called irreducible if n = 1 or the sum of the first n powers of A has no zero entries. A is reducible if it is not irreducible. Equivalently, a matrix A is reducible if there is a permutation matrix P such that

P^t A P = ( A1   Os,n−s
            A2   A3     ).

If A is not reducible, it is irreducible.

Definition 12. An operator T : Mn²(S) → Mn²(S) preserves primitive pairs if for any primitive pair (A1, A2) we have that T(A1, A2) is also primitive.

In order to describe the final form of our operators we need the following notions.

Definition 13. The matrix X ◦ Y denotes the Hadamard or Schur product, i.e., the (i, j) entry of X ◦ Y is xi,j yi,j.

Definition 14. An operator T : Mm,n(S) → Mm,n(S) is called a (U, V)-operator if there exist invertible matrices U and V of appropriate orders such that T(X) = UXV for all X ∈ Mm,n(S), or, if m = n, T(X) = UX^t V for all X ∈ Mm,n(S), where X^t denotes the transpose of X.

Definition 15. An operator T is called a (P, Q, B)-operator if there exist permutation matrices P and Q, and a matrix B with no zero entries, such that T(X) = P(X ◦ B)Q for all X ∈ Mm,n(S), or, if m = n, T(X) = P(X ◦ B)^t Q for all X ∈ Mm,n(S). A (P, Q, B)-operator is called a (P, Q)-operator if B = J, the matrix of all ones.
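As an illustration of Definition 15 (ours, with hypothetical example matrices, specialized to the Boolean semiring where B = J makes every (P, Q, B)-operator a (P, Q)-operator), such an operator can be sketched as:

```python
import numpy as np

def pqb_operator(P, Q, B, transpose=False):
    """Return T with T(X) = P (X ∘ B) Q, or P (X ∘ B)^t Q if transpose=True,
    over the Boolean semiring.  P and Q are permutation matrices and B is a
    matrix with no zero entries (all arguments are 0/1 numpy arrays)."""
    def T(X):
        Y = (X > 0) & (B > 0)        # Hadamard product X ∘ B
        if transpose:
            Y = Y.T
        return ((P.astype(int) @ Y.astype(int) @ Q.astype(int)) > 0).astype(int)
    return T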

Definition 16. A line of a matrix A is a row or a column of A.

Definition 17. We say that the matrix A dominates the matrix B if and only if bi,j ≠ 0 implies ai,j ≠ 0, and we write A ≥ B or B ≤ A.


The matrix In is the n × n identity matrix, Jm,n is the m × n matrix of all ones, and Om,n is the m × n zero matrix. We omit the subscripts when the order is obvious from the context and write I, J, and O, respectively. The matrix Ei,j, called a cell, denotes the matrix with exactly one nonzero entry, that being a one in the (i, j) entry. Let Ri denote the matrix whose ith row is all ones and which is zero elsewhere, and Cj denote the matrix whose jth column is all ones and which is zero elsewhere. We let |A| denote the number of nonzero entries in the matrix A. We denote by A[i, j|k, l] the 2 × 2 submatrix of A which lies on the intersection of the ith and jth rows with the kth and lth columns. A monomial matrix is a matrix which has exactly one non-zero entry in each row and each column.

3 Matrices over the Binary Boolean Semiring

The following lemma allows us to construct non-primitive matrix pairs:

Lemma 1. Let S be an antinegative semiring without zero divisors, (A, B) ∈ Mn²(S), and assume that at least one of the following two conditions is satisfied:
1) |A| + |B| < n + 1;
2) A and B together contain at most n − 1 off-diagonal cells.
Then the pair (A, B) is not primitive.

Proof. 1. Let K be an irreducible matrix. We write K = D + P, where D is a certain diagonal matrix and P is a matrix with zero diagonal. Let Pi,j denote the permutation matrix which corresponds to the transposition (i, j), i.e., Pi,j = I − Ei,i − Ej,j + Ei,j + Ej,i. If K has a row or column with no nonzero off-diagonal entry, say the ith row, then

P1,i K P1,i = ( α    O1,n−1
               A2   A3     ),

so that K is reducible. Thus K must have a nonzero off-diagonal entry in each row and each column. Hence |P| ≥ n. Further, if K is irreducible and |P| = n then P is a monomial matrix.

2. Note that the expansion of (A + B)^(h+k) contains all the terms found in the (h, k)-Hurwitz product of (A, B). So, if (A, B) is a primitive pair in Mn²(S) with exponent (h, k) then due to the antinegativity of S we have that (A + B)^(h+k) has no zero entries, that is, A + B is primitive.

3. Assume to the contrary that (A, B) is a primitive pair. Then by Item 2 the matrix A + B is primitive. Thus A + B is irreducible. Hence by Item 1 the matrix A + B has at least n nonzero off-diagonal entries, and if A + B has exactly n nonzero off-diagonal entries then (A + B) ◦ (J \ I) is a monomial matrix. Since any power of a monomial matrix is a monomial matrix, we must have that A + B has a nonzero diagonal entry. Since |A| + |B| ≥ |A + B| we have that |A| + |B| ≥ n + 1 and, together, A and B have at least n nonzero off-diagonal cells. This concludes the proof. ⊓⊔


Definition 18. A graph is a full-cycle graph if it is a vertex permutation of the cycle 1 → 2 → · · · → (n − 1) → n → 1. A (0, 1) full-cycle matrix is the adjacency matrix of a full-cycle graph. If a matrix A with exactly n nonzero entries dominates a full-cycle (0, 1)-matrix, we also say that A is a full-cycle matrix.

Corollary 1. Any primitive matrix A ∈ Mn(B) with exactly n + 1 non-zero cells, one of which is a diagonal cell, dominates a full-cycle matrix.

Proof. It follows from the proof of Lemma 1, Item 1, that A dominates a permutation matrix P. Assume that P is not a full-cycle matrix. Since |P| = n, it follows that the graph of P is disconnected. Thus the graph of A is disconnected. Hence A is not primitive, a contradiction. ⊓⊔

Lemma 2. Let T : Mm,n(B) → Mm,n(B) be a surjective additive operator. Then T(O) = O and, hence, T is a bijective linear operator.

Proof. By additivity we have T(A) = T(A + O) = T(A) + T(O) for any A. By the definition of addition in B it follows that T(O) ≤ T(A) for any A. Since T is surjective, for any i, 1 ≤ i ≤ m, and j, 1 ≤ j ≤ n, there exists Ai,j ∈ Mm,n(B) such that T(Ai,j) = Ei,j. Thus for all i, j we have that T(O) ≤ T(Ai,j) = Ei,j, i.e., T(O) = O. Let us check the linearity of T now. Let λ ∈ B, X ∈ Mm,n(B). If λ = 1 then T(λX) = T(X) = λT(X). If λ = 0 then T(λX) = T(O) = O = λT(X). The bijectivity of T follows from the fact that any surjective operator on a finite set is injective, and Mm,n(B) is finite. ⊓⊔

Definition 19. For matrices A = [ai,j], B = [bi,j] ∈ Mn(B) we denote by [A|B] ∈ Mn,2n(B) the concatenation of the matrices A and B, i.e., the matrix whose ith row is (ai,1, . . . , ai,n, bi,1, . . . , bi,n) for all i, i = 1, . . . , n.

Definition 20. Let T : Mn²(B) → Mn²(B) be a surjective additive operator. Define the operator T* : Mn,2n(B) → Mn,2n(B) by T*([A|B]) = [C|D] if T(A, B) = (C, D).

Lemma 3. Let T : Mn²(B) → Mn²(B) be a surjective additive operator. Then the operator T* is surjective and additive.

Proof. Follows from the bijection between the B-semimodules Mn²(B) and Mn,2n(B). ⊓⊔

Definition 21. Let D = {D | D is a diagonal matrix in Mn(B)}. Define the set D² = D × D = {(A, B) | A, B ∈ D}.


Definition 22. Let σ : {1, 2, · · · , n} → {1, 2, · · · , n} be a bijection (permutation). We define the permutation matrix Pσ corresponding to σ by the formula Pσ = Σ_{i=1}^{n} Ei,σ(i).

We note that in this case Pσ^t Ei,j Pσ = Eσ(i),σ(j) for all i, j ∈ {1, 2, · · · , n}. In the next lemma we show how to complete pairs of cells to a matrix which is similar to either Wn or W'n via a permutation similarity.

Lemma 4. For any two pairs of distinct indices (i, j), (k, l) such that (i, j) ≠ (l, k) and either i ≠ j or k ≠ l or both, there exist a permutation matrix P and n − 1 cells F1, . . . , Fn−1 such that Ei,j + Ek,l + F1 + . . . + Fn−1 = P Wn P^t or P W'n P^t.

Proof. Let i, j, k, l be four distinct integers in {1, 2, · · · , n}. There are five cases to consider:

1. (i, i), (i, l). Let σ be any permutation such that σ(i) = n and σ(l) = 1, and let Fq = E_{σ⁻¹(q),σ⁻¹(q+1)}, q = 1, · · · , n − 1. Then Pσ^t (Ei,i + Ei,l + Σ_{q=1}^{n−1} Fq) Pσ = W'n.

2. (i, i), (k, l). In this case, let σ be any permutation such that σ(i) = 2, σ(k) = n, and σ(l) = 1, and let Fq = E_{σ⁻¹(q),σ⁻¹(q+1)}, q = 1, · · · , n − 1. Then Pσ^t (Ei,i + Ek,l + Σ_{q=1}^{n−1} Fq) Pσ = W'n.

3. (i, j), (i, l). In this case, let σ be any permutation such that σ(i) = n − 1, σ(j) = n, and σ(l) = 1. Let F1 = Ej,l and Fq = E_{σ⁻¹(q−1),σ⁻¹(q)} for 2 ≤ q ≤ n − 1. Then Pσ^t (Ei,j + Ei,l + Σ_{q=1}^{n−1} Fq) Pσ = Wn.

4. (i, j), (k, j). In this case, let σ be any permutation such that σ(i) = n − 1, σ(k) = n, and σ(j) = 1. Let Fq = E_{σ⁻¹(q),σ⁻¹(q+1)} for 1 ≤ q ≤ n − 1. Then Pσ^t (Ei,j + Ek,j + Σ_{q=1}^{n−1} Fq) Pσ = Wn.

5. (i, j), (k, l). In this case, let σ be any permutation such that σ(i) = 1, σ(j) = 2, σ(k) = 3, and σ(l) = 4. Let F1 = Ej,k, Fq = E_{σ⁻¹(q+2),σ⁻¹(q+3)} for 2 ≤ q ≤ n − 3, Fn−2 = E_{σ⁻¹(n),i}, and Fn−1 = E_{σ⁻¹(n−1),i}. Then Pσ^t (Ei,j + Ek,l + Σ_{q=1}^{n−1} Fq) Pσ = Wn. ⊓⊔

Definition 23. Let E = {Ei,j | 1 ≤ i, j ≤ n} be the set of all cells. An assignment on E is a mapping η : E → {0, 1}.

Definition 24. We say that η is nontrivial if η is onto.

Definition 25. Let A ∈ Mn(B) and let A = {Ei,j | A ≥ Ei,j}; we say that η is A-nontrivial if η|A is onto. That is, η is A-nontrivial if the restriction of η to the cells of A is onto.


Definition 26. Further, if A is primitive we say that η is A-primitive if

Σ_{Ei,j ∈ A | η(Ei,j)=0} (Ei,j, O) + Σ_{Ei,j ∈ A | η(Ei,j)=1} (O, Ei,j)

is a primitive pair.

Definition 27. If an assignment η is both A-nontrivial and A-primitive, then we say that η is A-nontrivial-primitive.

Remark 1. An assignment means a coloring of the edges of the full graph in two colors. An assignment is non-trivial if both colors are used; it is A-nontrivial if both colors are used on the graph of the matrix A. An assignment is A-primitive if, taking the sums of the matrix units corresponding to the edges of A of the two different colors, we get a primitive matrix pair.

Lemma 5. Let (i, j, α), (k, l, β) be two triples such that 1 ≤ i, j, k, l ≤ n, k ≠ l, α, β ∈ {0, 1}, and (i, j) ≠ (k, l). Let S = {η | η(Ei,j) = α, η(Ek,l) = β}. Then S contains a Wn-nontrivial-primitive assignment and S contains a W'n-nontrivial-primitive assignment.

Proof. Since every primitive matrix has a primitive assignment [2, Theorem 2.1], the matrices Wn and W'n have primitive assignments. Hence the lemma is trivial if Wn ≱ Ei,j + Ek,l and W'n ≱ Ei,j + Ek,l. Thus we assume that Wn ≥ Ei,j + Ek,l or W'n ≥ Ei,j + Ek,l. We shall define η to fulfill the requirements in each case.

Case 1. W'n ≥ Ei,j + Ek,l. Let us show that in this case there exists a W'n-nontrivial-primitive assignment η such that η(Ei,j) = α and η(Ek,l) = β. If i = j = 1 and l ≡ k + 1 mod n and η(E1,1) ≠ η(Ek,k+1), define η(Ep,q) = η(E1,1) for all (p, q) ≠ (k, k + 1). If η(E1,1) = η(Ek,k+1), define η(Ek−1,k) ≠ η(E1,1) and η(Ep,q) = η(Ek,k+1) for all (p, q) ≠ (1, 1), (k − 1, k). This defines a W'n-nontrivial-primitive assignment in S. Note that here Wn ≱ Ei,j + Ek,l, and hence there is a Wn-nontrivial-primitive assignment in S. If i ≠ j and k ≠ l, then j ≡ i + 1 mod n and l ≡ k + 1 mod n. If η(Ei,i+1) = η(Ek,k+1), fix s, 1 ≤ s ≤ n, s ≠ i, k, and let η(E1,1) = η(Es,s+1) ≠ η(Ei,i+1) and η(Ep,q) = η(Ei,i+1) for all (p, q) ≠ (1, 1), (s, s + 1). If η(Ei,i+1) ≠ η(Ek,k+1), let η(E1,1) = η(Ei,i+1) and η(Ep,q) = η(Ek,k+1) for all (p, q) ≠ (1, 1), (i, i + 1). In all cases, we have defined a W'n-nontrivial-primitive assignment in S. Case 2 will deal with this case for a Wn-nontrivial-primitive assignment in S.

Case 2. Wn ≥ Ei,j + Ek,l. Let us show that in this case there exists a Wn-nontrivial-primitive assignment η such that η(Ei,j) = α and η(Ek,l) = β. We have the following subcases:

Subcase 1. i, j, k, l ∈ {1, n − 1, n}. That is, (i, j) = (n, 1) and (k, l) = (n − 1, 1), or vice versa, or (i, j) = (n, 1) and (k, l) = (n − 1, n), or vice versa. If


η(Ei,j) = η(Ek,l), let η(E1,2) ≠ η(Ei,j) and η(Ep,q) = η(Ei,j) for all (p, q) ≠ (1, 2), (k, l). If η(Ei,j) ≠ η(Ek,l) then, since Ei,j and Ek,l are two of the cells En−1,n, En,1, En−1,1, let Er,s be the other of the three. If (r, s) = (n − 1, 1), let η(Er,s) = η(Ek,l) and η(Ep,q) = η(Ei,j) for all (p, q) ≠ (r, s), (k, l). If (r, s) ≠ (n − 1, 1), let η(Er,s) = η(En−1,1) and η(Ep,p+1) ≠ η(En−1,1) for all p, 1 ≤ p ≤ n − 2.

Subcase 2. i ∈ {n, n − 1}, k ∉ {n − 1, n}. (Equivalently, k ∈ {n, n − 1}, i ∉ {n − 1, n}.) If η(Ei,j) ≠ η(Ek,l), let η(Ep,q) = η(Ei,j) for all (p, q) ≠ (k, l). If η(Ei,j) = η(Ek,l), let η(Es,s+1) ≠ η(Ei,j) for some s ≠ k, s < n − 1, and let η(Ep,q) = η(Ei,j) for all (p, q) ≠ (s, s + 1). Here, unless n = 3, the choice of s is always possible. The case n = 3 is an easy exercise.

Subcase 3. i, k ∉ {n − 1, n}. If η(Ei,j) = η(Ek,l), let η(En−1,n) = η(En−1,1) ≠ η(Ei,j) and η(Ep,q) = η(Ei,j) for all other (p, q). If η(Ei,j) ≠ η(Ek,l), let η(Ep,q) = η(Ei,j) for all (p, q) ≠ (k, l).

In all cases and subcases a Wn-nontrivial-primitive assignment in S has been defined. ⊓⊔

Lemma 6. Let T : Mn²(B) → Mn²(B) be a surjective additive operator which preserves primitive pairs. Then T(D²) = D².

Proof. Let us show that there are no elements of Mn²(B) \ D² which are mapped by T to D². Assume the converse, i.e., there is a matrix pair (X, Y) ∈ Mn²(B) \ D² such that T(X, Y) ∈ D². Note that by Lemma 2 the operator T is bijective. Thus by [1, Theorem 1.2] the image of a cell must be a cell. If n = 1 then all matrices are diagonal, so we can assume that n ≥ 2 till the end of this proof. Without loss of generality we may assume that X is non-diagonal. Thus there is Ei,j ≤ X, i ≠ j. By Lemma 2 the operator T is bijective and T(O, O) = (O, O). Hence T(Ei,j, O) ≠ (O, O). Thus T(Ei,j, O) ∈ D², since otherwise T(X, O) ∉ D² by the antinegativity of B. Since n ≥ 2 we have that |D² \ {(O, O)}| ≥ 15 > 2. Thus by the surjectivity of T there are also other pairs of matrices whose images lie in D², say T(X′, Y′) ∈ D². Thus, similarly to the above, we can say that there is a pair (r, s) such that either T(Er,s, O) ∈ D² (if X′ ≠ O) or T(O, Er,s) ∈ D² (if Y′ ≠ O), with (r, s) ≠ (j, i) and (r, s) ≠ (i, j). We consider the first possibility now, i.e., there exists (r, s) such that T(Er,s, O) ∈ D².

Case 1. If r = s, by a permutational similarity of Mn²(B) we can assume that (r, r) = (1, 1) and that j ≡ (i + 1) mod n. By Lemma 5 there are n − 1 cells F1, F2, · · · , Fn−1 and a W'n-nontrivial-primitive assignment η such that W'n ≥ Ei,j + Er,r + F1 + · · · + Fn−1 and, for A = {Ei,j, Er,r, F1, · · · , Fn−1}, the pair

(A, B) = Σ_{Ek,l ∈ A | η(Ek,l)=0} (Ek,l, O) + Σ_{Ek,l ∈ A | η(Ek,l)=1} (O, Ek,l)

is a primitive pair. But T(A, B) dominates two elements of D² by the choice of i, j, r, s, and hence it cannot be primitive by Lemma 1, a contradiction.


Case 2. If r ≠ s, by a permutational similarity of Mn²(B) we can assume that Wn ≥ Ei,j + Er,s. By Lemma 5 there are n − 1 cells F1, F2, · · · , Fn−1 and a Wn-nontrivial-primitive assignment η such that Wn ≥ Ei,j + Er,s + F1 + · · · + Fn−1 and, for A = {Ei,j, Er,s, F1, · · · , Fn−1}, the pair

(A, B) = Σ_{Ek,l ∈ A | η(Ek,l)=0} (Ek,l, O) + Σ_{Ek,l ∈ A | η(Ek,l)=1} (O, Ek,l)

is primitive. But T(A, B) dominates two elements of D² and hence cannot be primitive by Lemma 1, a contradiction.

The cases T(Ei,j, O), T(O, Er,s) ∈ D², and the case where X is diagonal and Y is non-diagonal, can be considered in a similar way. Thus T(Mn²(B) \ D²) ⊆ Mn²(B) \ D². Since T is bijective by Lemma 2 and the set Mn²(B) is finite, it follows that T(Mn²(B) \ D²) = Mn²(B) \ D², and thus we have that T(D²) = D². ⊓⊔

Remark 2. Note that the sum of any three (or fewer) off-diagonal cells, no two of which are collinear, is dominated by a full-cycle permutation matrix unless one is the transpose of another. That is, if i ≠ p ≠ r ≠ i and j ≠ q ≠ s ≠ j, and (Ei,j + Ep,q + Er,s) ◦ (Ei,j + Ep,q + Er,s)^t = O, then there is a full-cycle permutation matrix P such that P ≥ Ei,j + Ep,q + Er,s.

Let (A, B) be a matrix pair. For our purposes we will assume that if ai,j ≠ 0 then bi,j = 0. Let G be the digraph whose adjacency matrix is A and let H be the digraph whose adjacency matrix is B. We color all the arcs in G color one and all the arcs in H color two, and then consider G ∪ H, the two-colored digraph with the same vertex set.

Definition 28. We call this two-colored digraph the digraph associated with the matrix pair (A, B).

A useful tool in determining when a matrix pair is primitive is called the cycle matrix.

Definition 29. If the digraph associated with the pair (A, B) has cycles C1, C2, · · · , Ck, the cycle matrix M is a 2 × k matrix of integers such that the (1, i) entry is the number of arcs in cycle Ci that correspond to the part of the digraph associated with A, i.e., the arcs colored color 1, and the (2, i) entry is the number of arcs in cycle Ci that correspond to the part of the digraph associated with B, i.e., the arcs colored color 2.

The usefulness of this matrix is contained in the following result of Shader and Suwilo, see [11].

Theorem 1 ([11]). Let (A, B) be a matrix pair with cycle matrix M. Then (A, B) is a primitive pair if and only if the greatest common divisor of all 2 × 2 minors of M is equal to 1.
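Given a cycle matrix, Theorem 1 is immediate to apply. The following sketch is ours (it takes the cycle matrix as input; enumerating the cycles of the two-colored digraph is omitted) and computes the gcd of all 2 × 2 minors:

```python
from itertools import combinations
from math import gcd

def primitive_pair_from_cycle_matrix(M):
    """Shader-Suwilo criterion: the pair is primitive iff the gcd of all
    2x2 minors of the 2xk cycle matrix M equals 1.  M[0][i] is the number
    of color-1 (A) arcs of cycle i, M[1][i] the number of color-2 (B) arcs."""
    k = len(M[0])
    g = 0
    for i, j in combinations(range(k), 2):
        minor = M[0][i] * M[1][j] - M[0][j] * M[1][i]
        g = gcd(g, abs(minor))
    return g == 1   # with no or only zero minors, g stays 0 and the test fails
```

For example, a color-1 loop together with an n-cycle containing a single color-2 arc gives M = [[1, n−1], [0, 1]]; the only minor is 1, so such a pair is primitive.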


Lemma 7. Let (A, B) be a matrix pair with A + B = W'n, |A| + |B| = n + 1 and |A| ≥ |B|. Then (A, B) is a primitive pair if and only if B = O or B is an off-diagonal cell.

Proof. Let M be the cycle matrix of the pair (A, B). If B = O then A = W'n and (W'n, O) is a primitive pair. If B is an off-diagonal cell then

M = ( 1  n−1
      0   1  ),

so det M = 1, and hence (A, B) is a primitive pair by Theorem 1. Now assume that (A, B) is a primitive pair. We must show that B = O or that B is an off-diagonal cell. If B = O then we are done, so assume that B ≠ O. By Lemma 1 either A or B or both contain a diagonal cell.

Case 1. Assume that B ≠ O and B dominates a diagonal cell. Then, since A + B = W'n and |A| + |B| = n + 1, it follows that the non-zero cells of A and B are complementary. Thus

M = ( n−α  0
       α   1 ),

where α is the number of off-diagonal cells dominated by B. Since (A, B) is a primitive pair, we must have det M = ±1. Here det M = n − α, so we have that α = n − 1 and hence |A| = 1, a contradiction, since |A| ≥ |B| so that |A| ≥ (n + 1)/2 > 1.

Case 2. Assume that B ≠ O and A has a nonzero diagonal entry. Here the cycle matrix for (A, B) is

M = ( 1  n−α
      0   α  ),

where α is the number of nonzero entries in B. Since, by Theorem 1, the determinant of M must be 1 or −1, we must have that α = 1. That is, B is an off-diagonal cell. ⊓⊔

Lemma 8. Let n ≥ 3 and let T : Mn²(B) → Mn²(B) be a surjective additive operator which preserves primitive pairs. Then either T(D, O) = (D, O) or T(D, O) = (O, D).

Proof. By Lemma 2, T is a bijective linear operator. Suppose that T(Ei,i, O) = (Ek,k, O) and T(Ej,j, O) = (O, El,l). Let C = E1,2 + E2,3 + · · · + En−1,n + En,1, a full-cycle matrix, and let T(C, O) = (X, Y). Then (C + Ei,i, O) and (C + Ej,j, O) are both primitive pairs, and hence their images must be primitive pairs. Since by Lemma 6 T(D²) = D², we must have that T(Mn²(B) \ D²) = Mn²(B) \ D², so that T(C + Ei,i, O) = (X + Ek,k, Y) and T(C + Ej,j, O) = (X, Y + El,l) must both be primitive pairs. It was pointed out in the proof of Lemma 6 that T is bijective on the set of cells. Thus T(C + Ei,i, O) = (X + Ek,k, Y) and T(C + Ej,j, O) = (X, Y + El,l) are primitive pairs which dominate exactly n + 1 cells. Since by Corollary 1 the only primitive matrices which dominate exactly n + 1 cells, one of which is a diagonal cell, dominate a full-cycle matrix, we must have that X + Y + Ek,k and X + Y + El,l dominate full-cycle matrices. It now follows that X + Y is a full-cycle matrix. Since (X + Ek,k, Y) is a primitive pair, we have by Lemma 7 that Y is an off-diagonal cell. Since (X, Y + El,l) is a primitive pair, we have by Lemma 7 that X is an off-diagonal cell. Since X + Y is a full-cycle matrix, it follows that n = 2, a contradiction. Thus T(D, O) = (D, O) or T(D, O) = (O, D). ⊓⊔

Henceforth, we let K denote the matrix with zero main diagonal and ones everywhere else; that is, K is the adjacency matrix of the complete loopless digraph. Let us show that T acts on Mn²(B) componentwise.

Lemma 9. Let n ≥ 3 and let T : Mn²(B) → Mn²(B) be a surjective additive operator which preserves primitive pairs. Then there are bijective linear operators L : Mn(B) \ D → Mn(B) \ D and S : D → D such that either T(X, O) = (L(X ◦ K), O) + (S(X ◦ I), O) for all X, or T(X, O) = (O, L(X ◦ K)) + (O, S(X ◦ I)) for all X.

Proof. By Lemma 2, T is a bije tive linear operator. Thus, by [1, Theorem 2.1℄

all cells in M2n(B) are mapped to cells. By virtue of Lemma 8 we may assume without loss of generality that for all l we have T(El,l, O) = (El,l, O) and T(O, El,l) = (O, Eσ(l),σ(l)) for some permutation σ. Suppose that for some pairs (p, q), (x, y) with p ≠ q we have T(Ep,q, O) = (O, Ex,y). Here, by Lemma 6, x ≠ y. Let F1, F2, ..., Fn−1 be any cells such that Ep,q + F1 + F2 + ... + Fn−1 is a full-cycle. For an arbitrary k, let

(A, B) = (Ek,k + Ep,q + F1 + F2 + ... + Fn−2, Fn−1).    (3)

Then (A, B) is a primitive pair by Lemma 7. Thus the image must be a primitive pair. As was pointed out, T maps cells to cells, thus |T(A, B)| = |(A, B)| = n + 1. Since T(Ek,k, O) = (Ek,k, O) ∈ (D, O), it follows that the sum of the two components of T(A, B) is not a matrix which is similar to the Wielandt matrix by a permutational transformation. Thus it is similar to Wn′ and Lemma 7 can be applied. Therefore, T(Ek,k + F1 + F2 + ... + Fn−2, Fn−1) must be a pair of the form (C, O), since T(Ep,q, O) = (O, Ex,y) and the component of T(A, B) which is without diagonal

cells can possess no more than one non-zero cell. By varying the choice of the Fi's we get that if F is an off-diagonal cell not in row p or column q, then T(F, O) ≤ (J, O). That is, there are n² − 3n + 3 off-diagonal cells F such that T(F, O) ≤ (J, O). Note however that in the expression (A, B) = (Ek,k + Ep,q + F1 + F2 + ... + Fn−2, Fn−1), see formula (3), the matrix Fn−1 could be replaced by any of the other off-diagonal cells not in row p or column q. That is, there are also n² − 3n + 3 off-diagonal cells F such that T(O, F) ≤ (J, O). Further, if T(Ep,r, O) ≤ (O, J) then, as above, T(Ei,q, O) ≤ (J, O), so that the number of off-diagonal cells F such that T(F, O) ≤ (J, O) is at least n² − 3n + 4. If T(Ep,r, O) ≤ (J, O) then again the number of off-diagonal cells F such that T(F, O) ≤ (J, O) is at least n² − 3n + 4. It follows that 2[n² − 3n + 3] + 1 ≤ n² − n, since T is bijective by Lemma 2 and,

Operators preserving primitivity for matrix pairs

15

therefore, T is bijective on the set of cells. But that never happens. In this case we have arrived at a contradiction. Define L : Mn(B) \ D → Mn(B) \ D by T(X ◦ K, O) = (L(X ◦ K), O) and S : D → D by T(X ◦ I, O) = (S(X ◦ I), O). The lemma now follows. ⊓⊔

Since the action of T is defined on M2n(B) independently in each component, the following definition is correct and makes sense.

Definition 30. Let T : M2n(B) → M2n(B) be a linear operator such that T(X, O) ∈ Mn(B) × O and T(O, X) ∈ O × Mn(B) for all X ∈ Mn(B). Define the linear operators T1 and T2 on Mn(B) by T(X, Y) = (T1(X), T2(Y)).

Corollary 2. Let T : M2n(B) → M2n(B) be a surjective additive operator which preserves primitive pairs. Then there are bijective linear operators L1 : Mn(B) → Mn(B) and L2 : Mn(B) → Mn(B) which preserve primitivity such that T(X, Y) = (L1(X), L2(Y)) for all (X, Y) ∈ M2n(B), or T(X, Y) = (L2(Y), L1(X)) for all (X, Y) ∈ M2n(B).

Proof. By Lemma 9, T(X, O) = (L(X ◦ K), O) + (S(X ◦ I), O) for all X, or T(X, O) = (O, L(X ◦ K)) + (O, S(X ◦ I)) for all X. If T(X, O) = (L(X ◦ K), O) + (S(X ◦ I), O), then by the bijectivity of T and Lemma 9, T(O, X) = (O, L′(X ◦ K)) + (O, S′(X ◦ I)). Here, define L1(X) = L(X ◦ K) + S(X ◦ I) and L2(X) = L′(X ◦ K) + S′(X ◦ I), so that T(X, Y) = (L1(X), L2(Y)). If T(X, O) = (O, L(X ◦ K)) + (O, S(X ◦ I)), then by the bijectivity of T and Lemma 9, T(O, X) = (L′(X ◦ K), O) + (S′(X ◦ I), O). In this case T(X, Y) = (L2(Y), L1(X)). ⊓⊔

Lemma 10. If L : Mn(B) → Mn(B) is a bijective linear operator that preserves primitive matrices then L strongly preserves primitive matrices.

Proof. Since the set Mn(B) is finite, the set of primitive matrices and the set of non-primitive matrices partition Mn(B). Since L is bijective, and the image of the set of primitive matrices is contained in the set of primitive matrices, the image of the set of primitive matrices must be equal to the set of primitive matrices; consequently, the image of the set of non-primitive matrices must be the set of non-primitive matrices. That is, L strongly preserves primitive matrices. ⊓⊔

We now define a special operator that we need for Theorem 2 below.

Definition 31. An operator D : Mn(B) → Mn(B) is a diagonal replacement operator if D(Ei,j) = Ei,j whenever i ≠ j, and D(D) ⊆ D. It is nonsingular if D(Ei,i) ≠ O for all i. If D is bijective then there is a permutation σ of {1, ..., n} such that D(Ei,i) = Eσ(i),σ(i) for all i. In such a case we use the notation Dσ to denote the operator.


Theorem 2. [4, Theorem 3.1] The semigroup of linear operators on Mn(B) that strongly preserve primitive matrices is generated by transposition, the similarity operators and nonsingular diagonal replacement when n ≠ 2. When n = 2 it is generated by those operators and the special operator defined by

    [ a  b ]     [ a + d  b ]
    [ c  d ]  →  [   c    0 ]   for all a, b, c, d ∈ B.

Let us now formulate our main theorem for matrix pairs.

Theorem 3. Let n ≥ 3 and T : M2n(B) → M2n(B) be a surjective additive operator which preserves primitive pairs. Then there are permutation matrices P, Q, and R such that:
T(X, Y) = (P(X ◦ K)Pt, P(Y ◦ K)Pt) + (Q(X ◦ I)Qt, R(Y ◦ I)Rt) for all (X, Y) ∈ M2n(B);
T(X, Y) = (P(Y ◦ K)Pt, P(X ◦ K)Pt) + (Q(Y ◦ I)Qt, R(X ◦ I)Rt) for all (X, Y) ∈ M2n(B);
T(X, Y) = (P(Xt ◦ K)Pt, P(Yt ◦ K)Pt) + (Q(X ◦ I)Qt, R(Y ◦ I)Rt) for all (X, Y) ∈ M2n(B); or
T(X, Y) = (P(Yt ◦ K)Pt, P(Xt ◦ K)Pt) + (Q(Y ◦ I)Qt, R(X ◦ I)Rt) for all (X, Y) ∈ M2n(B).

Proof. By Corollary 2, induced actions of T on (Mn(B), O) and (O, Mn(B)) arise. According to the same corollary these actions are linear and correctly defined. By Lemma 10 these induced operators strongly preserve primitivity. Applying Theorem 2 now, we have that for some permutation matrices P and Q, and permutations σ and τ of {1, ..., n}, T(X, Y) = (PDσ(X)Pt, QDτ(Y)Qt) for all (X, Y) ∈ M2n(B); or the similar transformations in the other three cases. Thus we only need to show that P = Q and that it is impossible that there is a transposition in the first coordinate and no transposition in the second one. We start with the transposition transformation. Without loss of generality assume that T(X, O) = (PDσ(X)Pt, O)

and T(O, Y) = (O, QDτ(Y)Qt).

Also, without loss of generality, we may assume that P = I, that is, T(X, O) = (Dσ(X), O). Now, it is impossible that T(O, Ei,i+1) = (O, Ei,i+1) for all i = 1, ..., n, since there is no permutation matrix Q such that Q(E1,2 + E2,3 + ... + En−1,n + En,1)t Qt = E1,2 + E2,3 + ... + En−1,n + En,1.

Therefore, there is some i such that T(O, Ei,i+1) ≠ (O, Ei,i+1) (subscripts taken modulo n). Say, without loss of generality, that T(O, En,1) ≠ (O, En,1). Let A1 = E1,1 + E1,2 + E2,3 + ... + En−1,n and A2 = En,1.


Then (A1, A2) is primitive, whereas T(A1, A2) = (Ei1,i1 + E1,2 + E2,3 + ... + En−1,n, Ep,q),

where (p, q) ≠ (n, 1). This matrix pair cannot be primitive since it has exactly n off-diagonal entries and they do not form a full cycle, a contradiction. Thus, either X is transposed in both components or X is not transposed in both

components. Suppose that P ≠ Q. Then there is some Ei,j with i ≠ j such that PEi,j and QEi,j are cells in different rows. Let k1, k2, ..., kn−2 be distinct positive integers less than n such that i, j ∉ {k1, k2, ..., kn−2}. Let A = E1,1 + Ej,k1 + Ek1,k2 + ... + Ekn−3,kn−2 + Ekn−2,i. Then (A, Ei,j) is a primitive pair, but T(A, Ei,j) = (X, Y)

cannot be primitive, as it has a row with no off-diagonal entry in either X or Y, a contradiction. Thus P = Q. Now, by splitting any matrix into its diagonal and off-diagonal parts, we obtain the form as in the statement of the theorem. Note that the special operator for n = 2 in Theorem 2 is not surjective. ⊓⊔

4 Matrices over Antinegative Semirings Without Zero Divisors

Definition 32. The pattern, Ā, of a matrix A ∈ Mn(S) is the (0, 1)-matrix whose (i, j)-th entry is 0 if ai,j = 0 and 1 if ai,j ≠ 0.

Remark 3. For a given matrix A ∈ Mn(S) we consider its pattern Ā as a matrix in Mn(B). If S is antinegative and without zero divisors then the mapping Mn(S) → Mn(B), A ↦ Ā, is a homomorphism of semirings.

Remark 4. Let S be antinegative and without zero divisors. Then direct computations show that (A, B) ∈ M2n(S) is primitive if and only if (Ā, B̄) ∈ M2n(B) is primitive.
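Remarks 3 and 4 can be illustrated concretely. The sketch below (the helper names are ours, not notation from the paper) takes S to be the semiring of nonnegative integers, which is antinegative and has no zero divisors, and checks that the pattern map respects both addition and multiplication:

```python
import numpy as np

def pattern(A):
    """Pattern of A: the (0,1)-matrix with 1 exactly where A is nonzero."""
    return (A != 0).astype(int)

def bool_add(X, Y):
    """Addition in the Boolean semiring B."""
    return ((X + Y) != 0).astype(int)

def bool_mul(X, Y):
    """Multiplication in the Boolean semiring B."""
    return ((X @ Y) != 0).astype(int)

# S = nonnegative integers: antinegative, no zero divisors
A = np.array([[0, 2], [3, 0]])
B = np.array([[1, 0], [0, 4]])

# the pattern map A -> pattern(A) is a semiring homomorphism (Remark 3)
assert np.array_equal(pattern(A + B), bool_add(pattern(A), pattern(B)))
assert np.array_equal(pattern(A @ B), bool_mul(pattern(A), pattern(B)))
```

Over a semiring with additive cancellation (e.g. the integers with negatives) the multiplicative property could fail, which is why antinegativity and the absence of zero divisors are assumed.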

Definition 33. Let T be an additive operator on Mn(S). Its pattern T̄ is the additive operator on Mn(B) defined by the rule that T̄(Ei,j) is the pattern of T(Ei,j) and T̄(O) is the pattern of T(O).

Remark 5. It is easy to see that if S is antinegative and zero-divisor-free, then for any A ∈ Mn(S) the pattern of T(A) equals T̄(Ā). Moreover, the following statement is true:


Lemma 11. Let S be an antinegative semiring without zero divisors. Then the transformation which maps each additive operator T on Mn(S) to the operator T̄ on Mn(B) is a homomorphism from the semiring of additive operators on Mn(S) to the semiring of additive operators on Mn(B).

Proof. It is straightforward to see that if T is the zero operator, then T̄ is the zero operator. The rest follows from [4, Lemma 2.1]. ⊓⊔

Let us apply the above lemma and Theorem 3 to obtain the characterization result over any antinegative semiring without zero divisors.

Corollary 3. Let T : M2n(S) → M2n(S) be a surjective additive operator which preserves primitive pairs. Then there is a permutation matrix P ∈ Mn(S), additive functions φ, ψ : S → S with zero kernels (i.e., φ(x) = 0 implies x = 0 and ψ(y) = 0 implies y = 0), and permutations σ and τ of {1, ..., n} such that:
T(X, Y) = (PDσ(Xφ)Pt, PDτ(Yψ)Pt) for all (X, Y) ∈ M2n(S), where Xφ denotes the element-wise action of φ on the entries of X;
T(X, Y) = (PDτ(Yψ)Pt, PDσ(Xφ)Pt) for all (X, Y) ∈ M2n(S);
T(X, Y) = (PDσ((Xφ)t)Pt, PDτ((Yψ)t)Pt) for all (X, Y) ∈ M2n(S); or
T(X, Y) = (PDτ((Yψ)t)Pt, PDσ((Xφ)t)Pt) for all (X, Y) ∈ M2n(S).

References

1. L. B. Beasley, A. E. Guterman, Linear preservers of extremes of rank inequalities over semirings: Factor rank, Journal of Mathematical Sciences (New York) 131, no. 5 (2005), 5919–5938.
2. L. B. Beasley, S. J. Kirkland, A note on k-primitive directed graphs, Linear Algebra and its Appl. 373 (2003), 67–74.
3. L. B. Beasley, N. J. Pullman, Linear operators that strongly preserve the index of imprimitivity, Linear and Multilinear Algebra 31 (1992), 267–283.
4. L. B. Beasley, N. J. Pullman, Linear operators that strongly preserve primitivity, Linear and Multilinear Algebra 25 (1989), 205–213.
5. R. Brualdi and H. Ryser, Combinatorial Matrix Theory, Cambridge University Press, New York, 1991.
6. E. Fornasini, A 2D systems approach to river pollution modelling, Multidimensional Systems and Signal Processing 2 (1991), 233–265.
7. E. Fornasini, M. Valcher, Primitivity of positive matrix pairs: algebraic characterization, graph theoretic description and 2D systems interpretation, SIAM J. Matrix Anal. Appl. 19 (1998), 71–88.
8. R. A. Horn, C. R. Johnson, "Matrix Analysis", Cambridge University Press, New York.
9. C.-K. Li and N.-K. Tsing, Linear preserver problems: a brief introduction and some special techniques, Directions in matrix theory (Auburn, AL, 1990), Linear Algebra Appl. 162/164 (1992), 217–235.


10. P. Pierce and others, A Survey of Linear Preserver Problems, Linear and Multilinear Algebra 33 (1992), 1–119.
11. B. Shader, S. Suwilo, Exponents of non-negative matrix pairs, Linear Algebra and its Appl. 363 (2003), 275–293.
12. H. Wielandt, Unzerlegbare, nicht negative Matrizen, Math. Z. 52 (1958), 642–645.

Decompositions of quaternions and their matrix equivalents

Drahoslava Janovská¹ and Gerhard Opfer²

¹ Institute of Chemical Technology, Prague, Department of Mathematics, Technická 5, 166 28 Prague 6, Czech Republic, [email protected]
² University of Hamburg, Faculty for Mathematics, Informatics, and Natural Sciences [MIN], Bundesstraße 55, 20146 Hamburg, Germany, [email protected]

Dedicated to the memory of Gene Golub

Abstract. Since quaternions have isomorphic representations in matrix form, we investigate various well known matrix decompositions for quaternions.

Keywords: decompositions of quaternions, Schur, polar, SVD, Jordan, QR, LU.

1 Introduction

We will study various decompositions of quaternions, where we will employ the isomorphic matrix images of quaternions. The matrix decompositions allow in many cases analogous decompositions of the underlying quaternion. Let us denote the skew field of quaternions by H. It is well known that quaternions have an isomorphic representation either by certain complex (2 × 2)-matrices or by certain real (4 × 4)-matrices. Let a := (a1, a2, a3, a4) ∈ H. Then the two isomorphisms φ : H → C2×2, φ1 : H → R4×4 are defined as follows:

φ(a) := [  α   β ]
        [ −β̄   ᾱ ] ∈ C2×2,   α := a1 + a2 i,  β := a3 + a4 i,    (1)

φ1(a) := [ a1 −a2 −a3 −a4 ]
         [ a2  a1 −a4  a3 ] ∈ R4×4.    (2)
         [ a3  a4  a1 −a2 ]
         [ a4 −a3  a2  a1 ]

There is another very similar, but nevertheless different mapping, φ2 : H → R4×4, the meaning of which will be explained immediately:

φ2(a) := [ a1 −a2 −a3 −a4 ]
         [ a2  a1  a4 −a3 ] ∈ R4×4.    (3)
         [ a3 −a4  a1  a2 ]
         [ a4  a3 −a2  a1 ]

Quaternionic decompositions

21

In the first equation (1) the overlined quantities ᾱ, β̄ denote the complex conjugates of the non-overlined quantities α, β, respectively. Let b ∈ H be another quaternion. Then the isomorphisms imply φ(ab) = φ(a)φ(b), φ1(ab) = φ1(a)φ1(b). The third map, φ2, has the interesting property that it reverses the order of the multiplication:

φ2(ab) = φ2(b)φ2(a)  for all a, b ∈ H,
φ1(a)φ2(b) = φ2(b)φ1(a)  for all a, b ∈ H.    (4)

The mapping φ2 plays a central role in the investigation of linear maps H → H. There is a formal similarity to the Kronecker product of two arbitrary matrices. See [16] for the mentioned linear maps and [11, Lemma 4.3.1] for the Kronecker product.
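The definitions (1)–(3) and the multiplication rules above are easy to check numerically. The following sketch (the function names qmul, phi, phi1, phi2 are ours, not notation from the paper) builds the three matrix images of a quaternion and verifies that φ and φ1 are homomorphisms while φ2 reverses products:

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions given as 4-tuples (a1, a2, a3, a4)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (a1*b1 - a2*b2 - a3*b3 - a4*b4,
            a1*b2 + a2*b1 + a3*b4 - a4*b3,
            a1*b3 - a2*b4 + a3*b1 + a4*b2,
            a1*b4 + a2*b3 - a3*b2 + a4*b1)

def phi(a):
    """Complex 2x2 representation, formula (1)."""
    a1, a2, a3, a4 = a
    al, be = a1 + a2*1j, a3 + a4*1j
    return np.array([[al, be], [-be.conjugate(), al.conjugate()]])

def phi1(a):
    """Real 4x4 representation, formula (2)."""
    a1, a2, a3, a4 = a
    return np.array([[a1, -a2, -a3, -a4],
                     [a2,  a1, -a4,  a3],
                     [a3,  a4,  a1, -a2],
                     [a4, -a3,  a2,  a1]], dtype=float)

def phi2(a):
    """Real 4x4 pseudo representation, formula (3)."""
    a1, a2, a3, a4 = a
    return np.array([[a1, -a2, -a3, -a4],
                     [a2,  a1,  a4, -a3],
                     [a3, -a4,  a1,  a2],
                     [a4,  a3, -a2,  a1]], dtype=float)

a, b = (1.0, 2.0, 2.0, 4.0), (0.5, -1.0, 3.0, 2.0)
ab = qmul(a, b)
assert np.allclose(phi(ab),  phi(a)  @ phi(b))              # phi is a homomorphism
assert np.allclose(phi1(ab), phi1(a) @ phi1(b))             # phi1 is a homomorphism
assert np.allclose(phi2(ab), phi2(b) @ phi2(a))             # phi2 reverses the order
assert np.allclose(phi1(a) @ phi2(b), phi2(b) @ phi1(a))    # property (4)
assert np.allclose(phi(a) @ phi(a).conj().T, 25*np.eye(2))  # phi(a)phi(a)* = |a|^2 I
```

The last assertion anticipates the identities (11), (12) below, with |a|² = 1 + 4 + 4 + 16 = 25 for the chosen a.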

Definition 1. A complex (2 × 2)-matrix of the form introduced in (1) will be called a complex q-matrix. A real (4 × 4)-matrix of the form introduced in (2) will be called a real q-matrix. A real (4 × 4)-matrix of the form introduced in (3) will be called a real pseudo q-matrix. The set of all complex q-matrices will be denoted by HC. The set of all real q-matrices will be denoted by HR. The set of all real pseudo q-matrices will be denoted by HP.

We introduce some common notation. Let C be a matrix of any size with real or complex entries. By D := Cᵀ we denote the transposed matrix of C, where rows and columns are interchanged. By E := C̄ we denote the conjugate matrix of C, where all entries of C are changed to their complex conjugates. Finally, C* := (C̄)ᵀ. Let a := (a1, a2, a3, a4) ∈ H. The first component, a1, is called the real part of a, denoted by ℜa. The quaternion av := (0, a2, a3, a4) will be called the vector part of a. From the above representations it is clear how to recover a quaternion from the corresponding matrix. Thus, it is also possible to introduce inverse mappings

φ−1 : HC → H,   φ1−1 : HR → H,   φ2−1 : HP → H,

where φ−1, φ1−1 as well define isomorphisms. If we define a new algebra H̃ where a new multiplication, denoted by ⋆, is introduced by a ⋆ b := ba, then φ2 is also an isomorphism between H̃ and HP. This particularly implies that φ2(ab) = φ2(b)φ2(a) ∈ HP and φ2(a−1) = φ2(a)−1 = φ2(a)ᵀ/|a|² ∈ HP for all a ∈ H\{0}. Because of these isomorphisms it is possible to associate notions known from matrix theory with quaternions. Simple examples are:

22

D. Janovska, G. Opfer

det(a) := det(φ(a)) = |a|²,   det(φ1(a)) = det(φ2(a)) = |a|⁴,    (5)
tr(a) := tr(φ(a)) = 2a1,   tr(φ1(a)) = tr(φ2(a)) = 4a1,    (6)
eig(a) := eig(φ(a)) = [σ+, σ−],    (7)
eig(φ1(a)) = eig(φ2(a)) = [σ+, σ+, σ−, σ−],
    where σ+ = a1 + √(a2² + a3² + a4²) i = a1 + |av| i,  σ− = σ̄+,    (8)
|a| = ‖φ(a)‖2 = ‖φ1(a)‖2 = ‖φ2(a)‖2,    (9)
cond(a) := cond(φ(a)) = cond(φ1(a)) = cond(φ2(a)) = 1,    (10)
φ(aā) = φ(a)φ(a)* = |a|² φ(1) = |a|² I2,    (11)
φ1(aā) = φ1(a)φ1(a)ᵀ = φ2(aā) = φ2(a)ᵀφ2(a) = |a|² I4,    (12)

where det, tr, eig, cond refer to determinant, trace, collection of eigenvalues, and condition number, respectively. By I2, I4 we denote the identity matrices of order 2 and 4, respectively. We note that a general theory for determinants of quaternion valued matrices is not available; see [1]. We will review the classical matrix decompositions and investigate their applicability to quaternions. For the classical theory we usually refer to one of the books of Horn & Johnson, [10], [11]. In this connection it is useful to introduce another notion, namely that of equivalence between two quaternions. Such an equivalence may already be regarded as one of the important decompositions, namely the Schur decomposition, as we will see.

Definition 2. Two quaternions a, b ∈ H will be called equivalent if there is an h ∈ H\{0} such that b = h−1ah.

Equivalent quaternions a, b will be denoted by a ∼ b. The set

[a] := {s : s = h−1ah, h ∈ H\{0}}

will be called the equivalence class of a. It is the set of all quaternions which are equivalent to a.

The above defined notion of equivalence defines an equivalence relation.

Lemma 1. Two quaternions a, b are equivalent if and only if

ℜa = ℜb,   |a| = |b|.    (13)

Furthermore, a ∈ R ⇔ {a} = [a]. Let a ∈ C. Then {a, ā} ⊂ [a]. Let a = (a1, a2, a3, a4) ∈ H. Then

σ+ := a1 + √(a2² + a3² + a4²) i ∈ [a].


Proof. See [13]. ⊓⊔

The complex number σ+ occurring in the last lemma will be called the complex representative of [a]. The equivalence a ∼ b can also be expressed in the form ah − hb = 0, with an h ≠ 0. This is the homogeneous form of Sylvester's equation. This equation was investigated by Janovská & Opfer [16]. It should be noted that algebraists usually refer to equivalent elements as conjugate elements. See [18, p. 35].

2 Decompositions of quaternions

A matrix decomposition of the form φ(a) = φ(b)φ(c) or φ(a) = φ(b)φ(c)φ(d) with a, b, c, d ∈ H, and the same with φ1, also represents a direct decomposition of the involved quaternions, namely a = bc or a = bcd, because of the isomorphy of the involved mappings φ, φ1. The same applies to φ2, only the multiplication order has to be reversed. We will study the possibility of decomposing quaternions with respect to various well known matrix decompositions. A survey paper on decompositions of quaternionic matrices was given by [19].

2.1 Schur decompositions

Let U be an arbitrary real or complex square matrix. If UU* = I (identity matrix) then U will be called unitary. If U is real, then U* = Uᵀ. A real, unitary matrix will also be called orthogonal.

Theorem 1 (Schur 1). Let A be an arbitrary real or complex square matrix. Then there exists a unitary matrix U of the same size as A such that

D := U*AU    (14)

is an upper triangular matrix and as such contains the eigenvalues of A on its diagonal.

Proof. See Horn & Johnson [10, p. 79]. ⊓⊔

Theorem 2 (Schur 2). Let A be an arbitrary real square matrix of order n. Then there exists a real, orthogonal matrix V of order n such that

H := VᵀAV    (15)

is an upper Hessenberg matrix with k ≤ n block entries on the diagonal which are either real (1 × 1) matrices or real (2 × 2) matrices which have a pair of non-real complex conjugate eigenvalues which are also eigenvalues of A.


Proof. See Horn & Johnson [10, p. 82]. ⊓⊔

The representation A = UDU* implied by (14) is usually referred to as the complex Schur decomposition of A, whereas A = VHVᵀ implied by (15) is usually referred to as the real Schur decomposition of A. Let a be a quaternion; then we might ask whether there is a Schur decomposition of the matrices φ(a), φ1(a), φ2(a) in terms of quaternions. The (affirmative) answer was already given by Janovská & Opfer [15, 2007].

Theorem 3. Let a ∈ H\R and σ+ be the complex representative of [a]. There exists h ∈ H with |h| = 1 such that σ+ = h−1ah and

φ(a) = φ(h)φ(σ+)φ(h−1),   φ1(a) = φ1(h)φ1(σ+)φ1(h−1),   φ2(a) = φ2(h−1)φ2(σ+)φ2(h)    (16)

are the Schur decompositions of φ(a), φ1(a), φ2(a), respectively, which includes that φ(h), φ1(h), φ2(h) are unitary and φ(h−1) = φ(h)*, φ1(h−1) = φ1(h)ᵀ, φ2(h−1) = φ2(h)ᵀ. The first decomposition is complex, the other two are real.

Lemma 1 and the fa t that , 1 are isomorphisms. See [15℄. The last equation

an be written as 2 (h)2 (a) = 2 (σ+ )2 (h). Applying (4) one obtains ah = hσ+ whi h oin ides with the equation for σ+ given in the beginning of the theorem. Matrix (σ+ ) is omplex and diagonal: (σ+ ) = diag(σ+ , σ− ). The other matri es 1 (σ+ ), 2 (σ+ ) are upper Hessenberg with two real (2 × 2) blo ks ea h:

a1 −|av | |av | a1 1 (σ+ ) = 0 0 0 0

0 0 a1 |av |

0 a1 −|av | 0 0 0 a1 0 0 , 2 (σ+ ) = |av | . 0 −|av | 0 a1 |av | a1 0 0 −|av | a1

⊓ ⊔

If we have a look at the forms of φ1 and φ2, defined in (2), (3), respectively, we see that an upper (and lower) triangular matrix reduces immediately to a multiple of the identity matrix. This corresponds to the case where a is a real quaternion. In other words, it is not possible to find a complex Schur decomposition of φ1(a), φ2(a) in HR, HP, respectively, if a ∉ R. In the mentioned paper [15, Section 8] we can also find how to construct the h which occurs in Theorem 3. One possibility is to put h := h̃/|h̃|, where

h̃ := (|av| + a2, |av| + a2, a3 − a4, a3 + a4)   if |a3| + |a4| > 0,
h̃ := (1, 0, 0, 0)                              if a3 = a4 = 0 and a2 > 0,    (17)
h̃ := (0, 1, 0, 0)                              if a3 = a4 = 0 and a2 < 0.
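Formula (17) can be checked numerically. The sketch below (the helper names qmul, qinv are ours) normalizes h̃ and verifies that h−1ah is the complex representative σ+ = a1 + |av| i of [a]:

```python
import math

def qmul(a, b):
    """Hamilton product of quaternions given as 4-tuples."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (a1*b1 - a2*b2 - a3*b3 - a4*b4,
            a1*b2 + a2*b1 + a3*b4 - a4*b3,
            a1*b3 - a2*b4 + a3*b1 + a4*b2,
            a1*b4 + a2*b3 - a3*b2 + a4*b1)

def qinv(a):
    """Inverse of a nonzero quaternion: conj(a)/|a|^2."""
    n2 = sum(x*x for x in a)
    return (a[0]/n2, -a[1]/n2, -a[2]/n2, -a[3]/n2)

a = (1.0, 2.0, 2.0, 4.0)
nv = math.sqrt(a[1]**2 + a[2]**2 + a[3]**2)              # |a_v|
h_t = (nv + a[1], nv + a[1], a[2] - a[3], a[2] + a[3])   # formula (17), case |a3|+|a4| > 0
nh = math.sqrt(sum(x*x for x in h_t))
h = tuple(x/nh for x in h_t)                             # h := h~/|h~|, so |h| = 1

sigma = qmul(qinv(h), qmul(a, h))                        # h^{-1} a h
# sigma should equal the complex representative (a1, |a_v|, 0, 0)
assert all(abs(s - t) < 1e-12 for s, t in zip(sigma, (a[0], nv, 0.0, 0.0)))
```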


Let σ+ ∼ a and multiply the defining equation σ+ = h−1ah from the left by h; then hσ+ − ah = 0 is the homogeneous form of Sylvester's equation, and it was shown in [16] that under the condition stated in (13) the homogeneous equation has a solution space (null space) which is a two-dimensional subspace of H over R.

2.2 The polar decomposition

The aim is to generalize the polar representation of a complex number. Let z ∈ C\{0} be a complex number. Then z = |z|(z/|z|), and this representation of z is unique in the class of all two-factor representations z = pu, where the first factor p is positive and the second, u, has modulus one. For matrices A one could correspondingly ask for a representation of the form A = PU, where the first factor P is positive semidefinite and the second, U, is unitary. This is indeed possible, even for non-square matrices A ∈ Cm×n, m ≤ n. Matrix P is always uniquely defined as P = (AA*)1/2, and U is uniquely defined if A has maximal rank m. If A is square and nonsingular, then U = P−1A. See Horn & Johnson [10, Theorem 7.3.2 and Corollary 7.3.3, pp. 412/413]. Let a ∈ H\{0} be a non-vanishing quaternion a := (a1, a2, a3, a4). The quantity av := (0, a2, a3, a4) was called the vector part of a, as previously explained. The matrices φ(a), φ1(a), φ2(a) are nonsingular square matrices whose columns are orthogonal to each other; see (11), (12). The representation of a (in terms of quaternions) is obviously

a = |a| (a/|a|).    (18)

The corresponding matrix representations in HC, HR, HP can easily be deduced by using (1) to (3) and the properties listed in (11), (12). We obtain

φ(a) = diag(|a|, |a|) φ(a/|a|),    (19)
φ1(a) = diag(|a|, |a|, |a|, |a|) φ1(a/|a|),    (20)
φ2(a) = diag(|a|, |a|, |a|, |a|) φ2(a/|a|).    (21)

In all cases the first factor is positive definite and the second is unitary or orthogonal, respectively. From a purely algebraic standpoint this representation of a is complete. However, already the name polar representation means more. In the complex case we have

z/|z| = exp(αi),  z ≠ 0,

where α := arg z is the angle between the x-axis and an arrow representing z emanating from the origin of the z-plane. As a formula: α = arctan(ℑz/ℜz). In


the quaternionic case one finds (cf. [2, p. 11])

a/|a| = exp(αu),  a ≠ 0,

with u := av/|av|, α := arctan(|av|/a1), and exp defined by its Taylor series using u² = −1.
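Since u² = −1, the exponential series collapses to the closed form exp(αu) = cos α + u sin α, so the polar form can be checked directly. A minimal sketch (our own helper names; we use atan2 instead of the printed arctan for quadrant safety, and assume a nonzero vector part):

```python
import math

def polar(a):
    """Polar data of a quaternion a = |a| * exp(alpha*u): returns (|a|, alpha, u).
    Assumes the vector part a_v = (0, a2, a3, a4) is nonzero."""
    a1, a2, a3, a4 = a
    na = math.sqrt(a1*a1 + a2*a2 + a3*a3 + a4*a4)   # |a|
    nv = math.sqrt(a2*a2 + a3*a3 + a4*a4)           # |a_v|
    alpha = math.atan2(nv, a1)                      # arctan(|a_v|/a1), quadrant-safe
    u = (a2/nv, a3/nv, a4/nv)                       # unit pure quaternion, u^2 = -1
    return na, alpha, u

def qexp_pure(alpha, u):
    """exp(alpha*u) for a unit pure quaternion u: cos(alpha) + sin(alpha)*u."""
    c, s = math.cos(alpha), math.sin(alpha)
    return (c, s*u[0], s*u[1], s*u[2])

a = (1.0, 2.0, 2.0, 4.0)
na, alpha, u = polar(a)
e = qexp_pure(alpha, u)
reconstructed = tuple(na*x for x in e)              # |a| * exp(alpha*u)
assert all(abs(x - y) < 1e-12 for x, y in zip(a, reconstructed))
assert abs(na - 5.0) < 1e-12                        # |(1,2,2,4)| = 5
```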

2.3 The singular value decomposition (SVD)

We start with the following well known theorem on the singular value decomposition of a given matrix A. We restrict ourselves here to square matrices. The singular values of A are the square roots of the (non-negative) eigenvalues of the positive semidefinite matrix AA*.

Theorem 4. Let A be an arbitrary square matrix with real or complex entries. Then there are two unitary matrices U, V of the same size as A such that

D := UAV*

is a diagonal matrix with the singular values of A in decreasing order on the diagonal, and the number of positive diagonal entries is the rank of A.

Proof. See Horn & Johnson [10, 1991, p. 414]. ⊓⊔

Let a be a quaternion. The eigenvalues of φ(a) are σ+, σ−, defined in (8), and

φ(a)φ(a)* = [ |a|²   0  ]
            [  0   |a|² ].

Thus the singular values of φ(a) are |a|, |a|. The wanted decomposition must be of the form

[ |a|  0  ]       [  α   β ]
[  0  |a| ]  = U  [ −β̄   ᾱ ]  V*,

and the main question is whether U, V ∈ HC. In order to solve this problem, we write it directly in terms of quaternions, namely

|a| = u a v̄,   |u| = |v| = 1.    (22)

Theorem 5. Let a ∈ H\R. Choose u ∈ H with |u| = 1 and define v := ua/|a| or, equivalently, choose v with |v| = 1 and define u := vā/|a|. Then (22) defines a singular value decomposition of a, and

φ(|a|) = φ(u)φ(a)φ(v)*

defines a corresponding SVD in HC. An SVD with u = v is impossible. The corresponding SVDs in HR and in HP are

φ1(|a|) = φ1(u)φ1(a)φ1(v)ᵀ,   φ2(|a|) = φ2(v)ᵀφ2(a)φ2(u).


Proof. It is easy to see that (22) is valid if we choose u, v according to the given rules. If u = v, then a = |a| ∈ R follows, which was excluded. ⊓⊔

One very easy realization of (22) is to choose u := 1 and v := a/|a|, or to choose v := 1 and u := ā/|a|.

Example 1. Let a := (1, 2, 2, 4), so that |a| = 5. With the choice u := 1, v := a/|a|, the three SVDs are:

[ 5  0 ]   [ 1  0 ] [ 1+2i   2+4i ] [ 1−2i  −2−4i ]
[ 0  5 ] = [ 0  1 ] [ −2+4i  1−2i ] [ 2−4i   1+2i ] / 5,

[ 5 0 0 0 ]   [ 1 0 0 0 ] [ 1 −2 −2 −4 ] [  1  2  2  4 ]
[ 0 5 0 0 ] = [ 0 1 0 0 ] [ 2  1 −4  2 ] [ −2  1  4 −2 ] / 5,
[ 0 0 5 0 ]   [ 0 0 1 0 ] [ 2  4  1 −2 ] [ −2 −4  1  2 ]
[ 0 0 0 5 ]   [ 0 0 0 1 ] [ 4 −2  2  1 ] [ −4  2 −2  1 ]

[ 5 0 0 0 ]   [  1  2  2  4 ]     [ 1 −2 −2 −4 ] [ 1 0 0 0 ]
[ 0 5 0 0 ] = [ −2  1 −4  2 ] / 5 [ 2  1  4 −2 ] [ 0 1 0 0 ].
[ 0 0 5 0 ]   [ −2  4  1 −2 ]     [ 2 −4  1  2 ] [ 0 0 1 0 ]
[ 0 0 0 5 ]   [ −4 −2  2  1 ]     [ 4  2 −2  1 ] [ 0 0 0 1 ]
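The SVDs of Example 1 can be reproduced numerically with the choice u := 1, v := a/|a|. A sketch (the helper names phi, phi1 are ours):

```python
import numpy as np

def phi(a):
    """Complex 2x2 image of a quaternion, formula (1)."""
    a1, a2, a3, a4 = a
    al, be = a1 + a2*1j, a3 + a4*1j
    return np.array([[al, be], [-be.conjugate(), al.conjugate()]])

def phi1(a):
    """Real 4x4 image of a quaternion, formula (2)."""
    a1, a2, a3, a4 = a
    return np.array([[a1, -a2, -a3, -a4],
                     [a2,  a1, -a4,  a3],
                     [a3,  a4,  a1, -a2],
                     [a4, -a3,  a2,  a1]], dtype=float)

a = (1.0, 2.0, 2.0, 4.0)
na = 5.0                                  # |a| = sqrt(1+4+4+16)
u = (1.0, 0.0, 0.0, 0.0)                  # u := 1
v = tuple(x/na for x in a)                # v := a/|a|

# complex SVD: phi(|a|) = phi(u) phi(a) phi(v)^*
D = phi(u) @ phi(a) @ phi(v).conj().T
assert np.allclose(D, na*np.eye(2))

# real SVD: phi1(|a|) = phi1(u) phi1(a) phi1(v)^T
D1 = phi1(u) @ phi1(a) @ phi1(v).T
assert np.allclose(D1, na*np.eye(4))

# both singular values of phi(a) equal |a|
assert np.allclose(np.linalg.svd(phi(a), compute_uv=False), [na, na])
```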

2.4 The Jordan decomposition

Let a := (a1, a2, a3, a4) ∈ H\R. Since the two eigenvalues σ± of φ(a), defined in (8), are different, there will be an s ∈ H\{0} such that a = s−1σ+s, which implies φ(a) = φ(s−1)φ(σ+)φ(s). This representation is the Jordan decomposition of φ(a), and

J := φ(σ+) = [ σ+  0  ]
             [ 0   σ− ]

is the Jordan canonical form of φ(a) [10, p. 126]. In this context this representation is almost the same as the Schur decomposition, only we do not require that |s| = 1. For the computation of s, we could use formula (17). In HR, HP this decomposition reads

φ1(a) = φ1(s−1)φ1(σ+)φ1(s),   φ2(a) = φ2(s)φ2(σ+)φ2(s−1),

where the explicit forms of φ1(σ+), φ2(σ+) are given in the proof of Theorem 3.

2.5 The QR decomposition

Let A be an arbitrary complex square matrix. Then there is a unitary matrix U and an upper triangular matrix R of the same size as A such that A = UR.

This well known theorem can be found in [10, p. 112], and this decomposition is referred to as the QR decomposition of A. All triangular matrices in HC, in HR,


and in HP reduce to diagonal matrices. Therefore, the QR decompositions of a quaternion a ≠ 0 have the trivial form

a = (a/|a|) |a|  ⇔  φ(a) = φ(a/|a|) φ(|a|),  φ1(a) = φ1(a/|a|) φ1(|a|),  φ2(a) = φ2(a/|a|) φ2(|a|),

which is identical with the polar decomposition (18).

2.6 The LU decomposition

Let A ∈ Cn×n be given with entries ajk, j, k = 1, 2, ..., n. Define the n submatrices Aℓ := (ajk), j, k = 1, 2, ..., ℓ, for ℓ = 1, 2, ..., n. Then, following Horn & Johnson [10, p. 160], there is a lower triangular matrix L and an upper triangular matrix U such that

A = LU

if and only if all n submatrices Aℓ, ℓ = 1, 2, ..., n, are nonsingular. The above representation is called the LU decomposition of A. Since triangular matrices in HC, in HR, and in HP reduce to diagonal matrices, and since a product of two diagonal matrices is again diagonal, an LU decomposition of a quaternion a will in general not exist, since φ(a), φ1(a), φ2(a) are in general not diagonal. So we may ask for the ordinary LU decomposition of φ(a), φ1(a), φ2(a). In order that such a decomposition exist, we must require that the mentioned submatrices are nonsingular. Let a = (a1, a2, a3, a4). Then the two mentioned submatrices of φ(a) are nonsingular if and only if the first (1 × 1) submatrix α := a1 + a2 i ≠ 0, since this implies that the second (2 × 2) submatrix, which is φ(a) itself, is also nonsingular, because its determinant is |a|² = |α|² + a3² + a4² > 0.

Theorem 6. Let a = (a1, a2, a3, a4) ∈ H. Put α := a1 + a2 i and β := a3 + a4 i. An LU decomposition of φ(a) exists if and only if α ≠ 0. If this condition is valid, then

φ(a) = [  α  β ]   [  1   0 ] [ α   β  ]
       [ −β̄  ᾱ ] = [ l21  1 ] [ 0  u22 ],

where

l21 = −β̄/α,   u22 = (|α|² + |β|²)/α = |a|²/α.

Proof. The "if and only if" part follows from the general theory. The above formula is easy to check. ⊓⊔
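The formulas of Theorem 6 can be checked directly. A minimal sketch (the helper name phi is ours):

```python
import numpy as np

def phi(a):
    """Complex 2x2 image of a quaternion, formula (1)."""
    a1, a2, a3, a4 = a
    al, be = a1 + a2*1j, a3 + a4*1j
    return np.array([[al, be], [-be.conjugate(), al.conjugate()]])

a = (1.0, 2.0, 2.0, 4.0)
al, be = a[0] + a[1]*1j, a[2] + a[3]*1j
assert al != 0                            # existence condition of Theorem 6

l21 = -be.conjugate()/al                  # l21 = -conj(beta)/alpha
u22 = (abs(al)**2 + abs(be)**2)/al        # u22 = |a|^2/alpha
L = np.array([[1, 0], [l21, 1]])
U = np.array([[al, be], [0, u22]])
assert np.allclose(L @ U, phi(a))         # phi(a) = L U
```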

Theorem 7. Let a = (a1, a2, a3, a4) ∈ H. The four submatrices Aℓ of φ1(a) and of φ2(a) are nonsingular if and only if a1 ≠ 0. If this condition is valid, then

φ1(a) := [ a1 −a2 −a3 −a4 ]   [  1   0   0   0 ] [ a1 −a2 −a3 −a4 ]
         [ a2  a1 −a4  a3 ] = [ l21  1   0   0 ] [  0  u22 u23 u24 ]
         [ a3  a4  a1 −a2 ]   [ l31 l32  1   0 ] [  0   0  u33 u34 ]
         [ a4 −a3  a2  a1 ]   [ l41 l42 l43  1 ] [  0   0   0  u44 ],

where (results for φ2(a) are in parentheses):

lj1 := aj/a1, j = 2, 3, 4   (no change for φ2(a)),
l32 := (a1a4 + a2a3)/(a1² + a2²)   (l32 := (−a1a4 + a2a3)/(a1² + a2²) for φ2(a)),
l42 := (a2a4 − a1a3)/(a1² + a2²)   (l42 := (a2a4 + a1a3)/(a1² + a2²) for φ2(a)),
u22 := (a1² + a2²)/a1   (no change for φ2(a)),
u23 := (−a1a4 + a2a3)/a1   (u23 := (a1a4 + a2a3)/a1 for φ2(a)),
u24 := (a1a3 + a2a4)/a1   (u24 := (−a1a3 + a2a4)/a1 for φ2(a)),
u33 := a1 + l31a3 − l32u23   (no change for φ2(a)),
l43 := (a2 + l41a3 − l42u23)/u33   (l43 := (−a2 + l41a3 − l42u23)/u33 for φ2(a)),
u34 := −a2 + l31a4 − l32u24   (u34 := a2 + l31a4 − l32u24 for φ2(a)),
u44 := a1 + l41a4 − l42u24 − l43u34   (no change for φ2(a)).

A Cholesky decomposition cannot be achieved since all three matrices φ(a), φ1(a), φ2(a) are missing symmetry.

Acknowledgment. The authors acknowledge with pleasure the support of the Grant Agency of the Czech Republic (grant No. 201/06/0356). The work is a part of the research project MSM 6046137306 financed by MSMT, Ministry of Education, Youth and Sports, Czech Republic.

References

1. J. Fan, Determinants and multiplicative functionals on quaternionic matrices, Linear Algebra Appl. 369 (2003), 193–201.
2. P. R. Girard, Quaternions, Clifford Algebras and Relativistic Physics, Birkhäuser, Basel, Boston, Berlin, 2007, 179 p.
10. R. A. Horn & C. R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, New York, 1992, 561 p.
11. R. A. Horn & C. R. Johnson, Topics in Matrix Analysis, Cambridge University Press, Cambridge, New York, 1991, 607 p.
12. D. Janovská & G. Opfer, Givens' transformation applied to quaternion valued vectors, BIT 43 (2003), Suppl., 991–1002.
13. D. Janovská & G. Opfer, Fast Givens Transformation for Quaternionic Valued Matrices Applied to Hessenberg Reductions, ETNA 20 (2005), 1–26.
14. D. Janovská & G. Opfer, Linear equations in quaternions, in: Numerical Mathematics and Advanced Applications, Proceedings of ENUMATH 2005, A. B. de Castro, D. Gómez, P. Quintela, and P. Salgado, eds., Springer Verlag, New York, 2006, pp. 946–953.
15. D. Janovská & G. Opfer, Computing quaternionic roots by Newton's method, ETNA 26 (2007), pp. 82–102.
16. D. Janovská & G. Opfer, On one linear equation in one quaternionic unknown, Hamburger Beiträge zur Angewandten Mathematik, Nr. 2007-14, September 2007, 34 p., dedicated to Bernd Fischer on the occasion of his 50th birthday.
17. D. Janovská & G. Opfer, Linear equations in quaternionic variables, Mitt. Math. Ges. Hamburg 27 (2008), 223–234.
18. B. L. van der Waerden, Algebra I, 5th ed., Springer, Berlin, Göttingen, Heidelberg, 1960, 292 p.
19. F. Zhang, Quaternions and matrices of quaternions, Linear Algebra Appl. 251 (1997), 21–57.

Sensitivity analysis of Hamiltonian and reversible systems prone to dissipation-induced instabilities

Oleg N. Kirillov⋆

Institute of Mechanics, Moscow State Lomonosov University, Michurinskii pr. 1, 119192 Moscow, Russia, [email protected]; Department of Mechanical Engineering, Technische Universität Darmstadt, Hochschulstr. 1, 64289 Darmstadt, Germany, [email protected]

Abstract. Stability of a linear autonomous non-conservative system in the presence of potential, gyroscopic, dissipative, and non-conservative positional forces is studied. The cases when the non-conservative system is close either to a gyroscopic system or to a circulatory one are examined. It is known that marginal stability of gyroscopic and circulatory systems can be destroyed or improved up to asymptotic stability by the action of small non-conservative positional and velocity-dependent forces. We show that in both cases the boundary of the asymptotic stability domain of the perturbed system possesses singularities such as "dihedral angle", "break of an edge" and "Whitney's umbrella" that govern stabilization and destabilization and are responsible for the imperfect merging of modes. Sensitivity analysis of the critical parameters is performed with the use of the perturbation theory for eigenvalues and eigenvectors of non-self-adjoint operators. In the case of two degrees of freedom, the stability boundary is found in terms of the invariants of the matrices of the system. Bifurcation of the stability domain due to a change of the structure of the damping matrix is described. As a mechanical example, the Hauger gyropendulum is analyzed in detail; an instability mechanism in a general mechanical system with two degrees of freedom, which originates after discretization of models of a rotating disc in frictional contact and possesses the spectral mesh in the plane 'frequency' versus 'angular velocity', is analytically described, and its role in the excitation of vibrations in the squealing disc brake and in the singing wine glass is discussed.

Keywords: matrix polynomial, Hamiltonian system, reversible system, Lyapunov stability, indefinite damping, perturbation, dissipation-induced instabilities, destabilization paradox, multiple eigenvalue, singularity.

⋆ The work has been partly supported by the Alexander von Humboldt Foundation and by the German Research Foundation, Grant DFG HA 1060/43-1.

1 Introduction

Consider an autonomous non-conservative system

ẍ + (ΩG + δD)ẋ + (K + νN)x = 0,  (1)

where the dot stands for time differentiation, x ∈ Rᵐ, and the real matrix K = Kᵀ corresponds to potential forces. Real matrices D = Dᵀ, G = −Gᵀ, and N = −Nᵀ are related to dissipative (damping), gyroscopic, and non-conservative positional (circulatory) forces, with magnitudes controlled by the scaling factors δ, Ω, and ν, respectively. A circulatory system is obtained from (1) by neglecting velocity-dependent forces,

ẍ + (K + νN)x = 0,  (2)

while a gyroscopic one has no damping and no non-conservative positional forces,

ẍ + ΩGẋ + Kx = 0.  (3)

Circulatory and gyroscopic systems (2) and (3) possess fundamental symmetries that become evident after transformation of equation (1) to the form ẏ = Ay with

A = [ −½ΩG , I ; ½δΩDG + ¼Ω²G² − K − νN , −δD − ½ΩG ],   y = [ x ; ẋ + ½ΩGx ],  (4)

where I is the identity matrix. In the absence of damping and gyroscopic forces (δ = Ω = 0), RAR = −A with

R = R⁻¹ = [ I 0 ; 0 −I ].  (5)

This means that the matrix A has a reversible symmetry, and equation (2) describes a reversible dynamical system [16, 19, 33]. Due to this property,

det(A − λI) = det(R(A − λI)R) = det(A + λI),  (6)

and the eigenvalues of circulatory system (2) appear in pairs (−λ, λ). Without damping and non-conservative positional forces (δ = ν = 0), the matrix A possesses the Hamiltonian symmetry JAJ = Aᵀ, where J is the unit symplectic matrix [17, 23, 28]

J = −J⁻¹ = [ 0 I ; −I 0 ].  (7)

As a consequence,

det(A − λI) = det(J(A − λI)J) = det(Aᵀ + λI) = det(A + λI),  (8)

which implies that if λ is an eigenvalue of A then so is −λ, similarly to the reversible case. Therefore, an equilibrium of a circulatory or of a gyroscopic system is either unstable or all its eigenvalues lie on the imaginary axis of the complex plane, implying marginal stability if they are semi-simple.

In the presence of all four forces, the Hamiltonian and reversible symmetries are broken and the marginal stability is generally destroyed. Instead, system (1) can be asymptotically stable if its characteristic polynomial

P(λ) = det(Iλ² + (ΩG + δD)λ + K + νN)  (9)

satisfies the criterion of Routh and Hurwitz.

The most interesting situation for many applications, ranging from rotor dynamics [3–5, 14, 25, 27, 30, 31, 48, 49, 59, 62] to physics of the atmosphere [9, 29, 62, 66] and from stability and optimization of structures [8, 10, 11, 15, 22, 26, 33, 39, 54, 55, 65, 69] to friction-induced instabilities and acoustics of friction [40, 42, 61, 67, 71–73, 75, 76], is when system (1) is close either to circulatory system (2) with δ, Ω ≪ ν (a near-reversible system) or to gyroscopic system (3) with δ, ν ≪ Ω (a near-Hamiltonian system). The effect of small damping and gyroscopic forces on the stability of circulatory systems, as well as the effect of small damping and non-conservative positional forces on the stability of gyroscopic systems, are regarded as paradoxical, since the stability properties are extremely sensitive to the choice of the perturbation, and the balance of forces resulting in asymptotic stability is not evident, as happens in such phenomena as "tippe top inversion", "rising egg", and the onset of friction-induced oscillations in the squealing brake and in the singing wine glass [31, 48, 49, 59, 61, 62, 67, 71–73, 75–77].

Historically, Thomson and Tait in 1879 were the first to find that dissipation destroys gyroscopic stabilization (dissipation-induced instability) [1, 28, 62, 66]. A similar effect of non-conservative positional forces on the stability of gyroscopic systems was established almost a century later by Lakhadanov and Karapetyan [12, 13]. A more sophisticated manifestation of dissipation-induced instabilities was discovered by Ziegler on the example of a double pendulum loaded by a follower force, with the damping non-uniformly distributed among the natural modes [8]. Without dissipation, the Ziegler pendulum is a reversible system, which is marginally stable for loads not exceeding some critical value. Small dissipation of order o(1) makes the pendulum either unstable or asymptotically stable with a critical load which generically is lower than that of the undamped system by a quantity of order O(1) (the destabilization paradox). A similar discontinuous change in the stability domain for near-Hamiltonian systems has been observed by Holopainen [9, 66] in his study of the effect of dissipation on the stability of baroclinic waves in Earth's atmosphere, by Hoveijn and Ruijgrok on the example of a rotating shaft on an elastic foundation [30], and by Crandall, who investigated a gyroscopic pendulum with stationary and rotating damping [31].
Contrary to the Ziegler pendulum, the undamped gyropendulum is a gyroscopic system that is marginally stable when its spin exceeds a critical value. Although stationary damping, corresponding to a dissipative velocity-dependent force, destroys the gyroscopic stabilization [1], the Crandall gyropendulum with stationary and rotating damping, where the latter is related to a non-conservative positional force, can be asymptotically stable for rotation rates considerably exceeding the critical spin of the undamped system. This is an example of the destabilization paradox in a Hamiltonian system.

As was understood during the last decade, the reason underlying the destabilization paradox is that the multiparameter family of non-normal matrix operators of system (1) generically possesses multiple eigenvalues related to singularities of the boundary of the asymptotic stability domain, which were described and classified by Arnold already in the 1970s [17]. Hoveijn and Ruijgrok were apparently the first to associate the discontinuous change in the critical load in their example with the Whitney umbrella singularity existing on the stability boundary [30]. The same singularity on the boundary of asymptotic stability has been identified for the Ziegler pendulum [47], for models of disc brakes [72, 76], of rods loaded by a follower force [54, 55], and of gyropendulums and spinning tops [63, 70]. These examples reflect the general fact that the codimension-1 Hamiltonian (or reversible) Hopf bifurcation can be viewed as a singular limit of the codimension-3 dissipative resonant 1:1 normal form, and the essential singularity at which these two cases meet is topologically equivalent to Whitney's umbrella (Hamilton meets Hopf under Whitney's umbrella) [45, 66].
Despite the achieved qualitative understanding, the development of sensitivity analysis for the critical parameters near the singularities, which is essential for controlling stabilization and destabilization, is only beginning; it involves such modern disciplines as the multiparameter perturbation theory of analytical matrix functions [7, 18, 20, 23, 24, 28, 29, 37, 41, 57, 58] and of non-self-adjoint boundary eigenvalue problems [51, 53–55], the theory of structured pseudospectra of matrix polynomials [56, 73], and the theory of versal deformations of matrix families [30, 45, 47, 60]. The growing number of physical and mechanical applications demonstrating the destabilization paradox due to an interplay of non-conservative effects, and the need for a justification of the use of Hamiltonian or reversible models to describe real-world systems that are in fact only near-Hamiltonian or near-reversible, require a unified treatment of this phenomenon.

The goal of the present paper is to find and to analyze the domain of asymptotic stability of system (1) in the space of the parameters δ, Ω, and ν, with special attention to the near-reversible and near-Hamiltonian cases. In the subsequent sections we combine the study of the two-dimensional system, analyzing the Routh–Hurwitz stability conditions, with the perturbative approach to the case of arbitrarily large m. Typical singularities of the stability boundary will be identified. Bifurcation of the domain of asymptotic stability due to a change of the structure of the matrix D of dissipative forces will be thoroughly analyzed, and the effect of gyroscopic stabilization of a dissipative system with indefinite damping and non-conservative positional forces will be described. The estimates of the critical parameters and explicit expressions approximating the boundary of the asymptotic stability domain will be extended to the case of m > 2 degrees of freedom with the use of the perturbation theory of multiple eigenvalues of non-self-adjoint operators. In the last section the general theory will be applied to the study of the onset of stabilization and destabilization in models of gyropendulums and disc brakes.

2 A circulatory system with small velocity-dependent forces

We begin with the near-reversible case (δ, Ω ≪ ν), which covers Ziegler's and Nikolai's pendulums loaded by the follower force [8, 10, 11, 33, 47, 43, 44, 53, 66] (their continuous analogue is the viscoelastic Beck column [10, 39, 54, 55]), the Reut–Sugiyama pendulum [50], the low-dimensional models of disc brakes by North [67, 73], Popp [40], and Sinou and Jezequel [72], the model of a mass sliding over a conveyor belt by Hoffmann and Gaul [42], and the models of rotors with internal and external damping by Kimball and Smith [3, 4] and Kapitsa [5, 66]; it even finds applications in the modeling of two-legged walking and of the dynamics of space tethers [32].

2.1 Stability of a circulatory system

Stability of system (1) is determined by its characteristic polynomial (9), which in the case of two degrees of freedom has a convenient form provided by the Leverrier–Barnett algorithm [21]:

P(λ, δ, ν, Ω) = λ⁴ + δtrD λ³ + (trK + δ²det D + Ω²)λ² + (δ(trK trD − trKD) + 2Ων)λ + det K + ν²,  (10)

where without loss of generality we assume that det G = 1 and det N = 1. In the absence of damping and gyroscopic forces (δ = Ω = 0), system (1) is circulatory, and the polynomial (10) has four roots −λ₊, −λ₋, λ₋, and λ₊, where

λ± = √( −½trK ± ½√((trK)² − 4(det K + ν²)) ).  (11)

The eigenvalues (11) can be real, complex, or purely imaginary, implying instability or marginal stability in accordance with the following statement.

Proposition 1. If trK > 0 and det K ≤ 0, circulatory system (2) with two degrees of freedom is stable for νd² < ν² < νf², unstable by divergence for ν² ≤ νd², and unstable by flutter for ν² > νf², where the critical values νd and νf are

0 ≤ √(−det K) =: νd ≤ νf := ½√((trK)² − 4 det K).  (12)

If trK > 0 and det K > 0, the circulatory system is stable for ν² < νf² and unstable by flutter for ν² > νf². If trK ≤ 0, the system is unstable.

The proof is a consequence of formula (11), reversible symmetry, and the fact that the time dependence of solutions of equation (2) is given by exp(λt) for simple eigenvalues λ, with an additional prefactor, polynomial in t (secular terms), in the case of multiple eigenvalues with a Jordan block. The solutions grow monotonically for positive real λ, implying static instability (divergence); oscillate with increasing amplitude for complex λ with positive real part (flutter); and remain bounded when λ is semi-simple and purely imaginary (stability). For K having two equal eigenvalues, νf = 0 and the circulatory system (2) is unstable, in agreement with the Merkin theorem for circulatory systems with two degrees of freedom [34, 62].
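Proposition 1 lends itself to a quick numerical check. The sketch below (the matrix K is hypothetical sample data; N is taken as [[0, −1], [1, 0]], one of the two orientations with det N = 1 — the spectrum of (2) is the same for either choice) compares the critical values (12) with the eigenvalues of a companion linearization of (2):

```python
import numpy as np

def circulatory_eigenvalues(K, nu):
    """Spectrum of the circulatory system (2): det(lambda^2 I + K + nu N) = 0."""
    N = np.array([[0.0, -1.0], [1.0, 0.0]])   # assumed orientation, det N = 1
    A = K + nu * N
    # companion linearization z = (x, x'):  z' = [[0, I], [-A, 0]] z
    M = np.block([[np.zeros((2, 2)), np.eye(2)],
                  [-A, np.zeros((2, 2))]])
    return np.linalg.eigvals(M)

def critical_loads(K):
    """nu_d and nu_f from (12)."""
    trK, detK = np.trace(K), np.linalg.det(K)
    return np.sqrt(max(-detK, 0.0)), 0.5 * np.sqrt(trK**2 - 4.0 * detK)

K = np.array([[7.0, 3.0], [3.0, 5.0]])        # sample data: trK > 0, det K > 0
nu_d, nu_f = critical_loads(K)                # here nu_d = 0, nu_f = sqrt(10)

lam_stable = circulatory_eigenvalues(K, 0.9 * nu_f)    # marginally stable
lam_flutter = circulatory_eigenvalues(K, 1.1 * nu_f)   # flutter

print(max(abs(lam_stable.real)), max(lam_flutter.real))
```

For this K the spectrum is purely imaginary below νf and contains a complex quadruplet with positive real parts above it, as the proposition predicts.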

Fig. 1. Stability diagrams and trajectories of eigenvalues for increasing parameter ν > 0 for the circulatory system (2) with trK > 0 and det K < 0 (a), and trK > 0 and det K > 0 (b).

Stability diagrams and the motion of eigenvalues in the complex plane for ν increasing from zero are presented in Fig. 1. When trK > 0 and det K < 0 there are two real and two purely imaginary eigenvalues at ν = 0, and the system is statically unstable, see Fig. 1(a). With the increase of ν both the imaginary and the real eigenvalues move towards the origin, until at ν = νd the real pair merges and originates a double zero eigenvalue with a Jordan block. At ν = νd the system is unstable due to the linear time dependence of a solution corresponding to λ = 0. A further increase of ν yields splitting of the double zero eigenvalue into two purely imaginary ones. The imaginary eigenvalues of the same sign then move towards each other until at ν = νf they originate a pair of double eigenvalues ±iωf with a Jordan block, where

ωf = √(trK/2).  (13)

At ν = νf the system is unstable by flutter due to secular terms in its solutions. For ν > νf the flutter instability is caused by two of the four complex eigenvalues lying on the branches of the hyperbolic curve

(Im λ)² − (Re λ)² = ωf².  (14)

The critical values νd and νf constitute the boundaries between the divergence and stability domains and between the stability and flutter domains, respectively. For trK > 0 and det K = 0 the divergence domain shrinks to the point νd = 0, and for trK > 0 and det K > 0 there exist only stability and flutter domains, as shown in Fig. 1(b). For negative ν the boundaries of the divergence and flutter domains are ν = −νd and ν = −νf.

In general, the Jordan chain for the eigenvalue iωf consists of an eigenvector u0 and an associated vector u1 that satisfy the equations [53]

(−ωf²I + K + νfN)u0 = 0,  (−ωf²I + K + νfN)u1 = −2iωf u0.  (15)

Due to the non-self-adjointness of the matrix operator, the same eigenvalue possesses a left Jordan chain of generalized eigenvectors v0 and v1:

v0ᵀ(−ωf²I + K + νfN) = 0,  v1ᵀ(−ωf²I + K + νfN) = −2iωf v0ᵀ.  (16)

The eigenvectors u0 and v0 are biorthogonal,

v0ᵀu0 = 0.  (17)

In the neighborhood of ν = νf the double eigenvalue and the corresponding eigenvectors vary according to the formulas [52, 53]

λ(ν) = iωf ± μ√(ν − νf) + o((ν − νf)^(1/2)),
u(ν) = u0 ± μu1√(ν − νf) + o((ν − νf)^(1/2)),
v(ν) = v0 ± μv1√(ν − νf) + o((ν − νf)^(1/2)),  (18)

where μ² is a real number given by

μ² = −(v0ᵀNu0)/(2iωf v0ᵀu1).  (19)


For m = 2 the generalized eigenvectors of the right and left Jordan chains at the eigenvalue iωf, where the eigenfrequency is given by (13) and the critical value νf is defined by (12), are [52]

u0 = [ 2k12 − 2νf ; k22 − k11 ],  v0 = [ 2k12 + 2νf ; k22 − k11 ],  u1 = v1 = [ 0 ; −4iωf ].  (20)

Substituting (20) into equation (19) yields the expression

μ² = −4νf(k11 − k22)/(2iωf v0ᵀu1) = νf/(2ωf²) > 0.  (21)

After plugging the real-valued coefficient μ into expansions (18) we obtain an approximation of order |ν − νf|^(1/2) of the exact eigenvalues λ = λ(ν). This can be verified by the series expansions of (11) about ν = νf.
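The square-root splitting (18) with the coefficient (21) can also be checked against the exact spectrum. In the sketch below (hypothetical sample K; N assumed as [[0, −1], [1, 0]] with det N = 1), the growth rate just above νf is compared with the prediction μ√(ν − νf):

```python
import numpy as np

K = np.array([[7.0, 3.0], [3.0, 5.0]])        # hypothetical sample stiffness
N = np.array([[0.0, -1.0], [1.0, 0.0]])       # assumed orientation, det N = 1
trK, detK = np.trace(K), np.linalg.det(K)
om_f = np.sqrt(trK / 2.0)                     # (13)
nu_f = 0.5 * np.sqrt(trK**2 - 4.0 * detK)     # (12)
mu = np.sqrt(nu_f / (2.0 * om_f**2))          # (21)

def spectrum(nu):
    # companion linearization of the circulatory system (2)
    M = np.block([[np.zeros((2, 2)), np.eye(2)],
                  [-(K + nu * N), np.zeros((2, 2))]])
    return np.linalg.eigvals(M)

eps = 1e-4
growth = max(spectrum(nu_f + eps).real)       # exact growth rate past flutter
approx = mu * np.sqrt(eps)                    # prediction of expansion (18)
print(growth, approx)
```

The two numbers agree to within the o(√(ν − νf)) error of the expansion, while just below νf the spectrum stays on the imaginary axis.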

2.2 The influence of small damping and gyroscopic forces on the stability of a circulatory system

The one-dimensional domain of marginal stability of circulatory system (2) given by Proposition 1 blows up into a three-dimensional domain of asymptotic stability of system (1) in the space of the parameters δ, Ω, and ν, which is described by the Routh–Hurwitz criterion for the polynomial (10):

δtrD > 0,  trK + δ²det D + Ω² > 0,  det K + ν² > 0,  Q(δ, Ω, ν) > 0,  (22)

where

Q := −q² + δtrD (trK + δ²det D + Ω²) q − (δtrD)²(det K + ν²),  q := δ(trK trD − trKD) + 2Ων.  (23)
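Conditions (22)–(23) can be evaluated directly and cross-checked against the roots of the quartic (10). A minimal sketch with hypothetical sample matrices (det G = det N = 1 assumed, as in (10)):

```python
import numpy as np

def quartic_coeffs(K, D, delta, Omega, nu):
    """Coefficients of the characteristic polynomial (10) for m = 2."""
    trK, detK = np.trace(K), np.linalg.det(K)
    trD, detD = np.trace(D), np.linalg.det(D)
    return [1.0,
            delta * trD,
            trK + delta**2 * detD + Omega**2,
            delta * (trK * trD - np.trace(K @ D)) + 2.0 * Omega * nu,
            detK + nu**2]

def rh_stable(K, D, delta, Omega, nu):
    """Asymptotic stability via the Routh-Hurwitz conditions (22)-(23)."""
    _, a1, a2, a3, a4 = quartic_coeffs(K, D, delta, Omega, nu)
    Q = -a3**2 + a1 * a2 * a3 - a1**2 * a4    # (23) with q = a3
    return a1 > 0 and a2 > 0 and a4 > 0 and Q > 0

# hypothetical sample data
K = np.array([[7.0, 3.0], [3.0, 5.0]])
D = np.array([[2.0, 0.0], [0.0, 1.0]])
checks = []
for d, Om, nu in [(0.1, 0.05, 1.0), (0.1, 0.05, 3.5), (0.01, 0.3, 3.0)]:
    roots = np.roots(quartic_coeffs(K, D, d, Om, nu))
    checks.append((rh_stable(K, D, d, Om, nu), bool(max(roots.real) < 0)))
print(checks)   # the criterion agrees with the actual root locations
```

For these parameter points the first case is asymptotically stable and the other two are not, and in all three the criterion matches the sign of the largest real part of the roots.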

Considering the asymptotic stability domain (22) in the space of the parameters δ, ν, and Ω, we recall that the initial system (1) is equivalent to a first-order system with the real 2m×2m matrix A(δ, ν, Ω) defined by expression (4). As was established by Arnold [17], the boundary of the asymptotic stability domain of a multiparameter family of real matrices is not a smooth surface. Generically, it possesses singularities corresponding to multiple eigenvalues with zero real part. Applying the qualitative results of [17], we deduce that the parts of the ν-axis belonging to the stability domain of system (2) and corresponding to two different pairs of simple purely imaginary eigenvalues form edges of dihedral angles on the surfaces that bound the asymptotic stability domain of system (1), see Fig. 2(a). At the points ±νf of the ν-axis, corresponding to the stability–flutter boundary of system (2), there exists a pair of double purely imaginary eigenvalues with a Jordan block. Qualitatively, the asymptotic stability domain of system (1) in the space (δ, ν, Ω) near the ν-axis looks like a dihedral angle, which becomes more acute while approaching the points ±νf. At these points the angle shrinks, forming the deadlock of an edge, which is a half of the Whitney umbrella surface [17, 30, 45], see Fig. 2(c). In the case when the stability domain of the circulatory system has a common boundary with the divergence domain, as shown in Fig. 1(a), the boundary of the asymptotic stability domain of the perturbed system (1) possesses the trihedral angle singularity at ν = ±νd, see Fig. 2(b).

Fig. 2. Singularities dihedral angle (a), trihedral angle (b), and deadlock of an edge (or a half of the Whitney umbrella (c)) of the boundary of the asymptotic stability domain.

The first two of the conditions of asymptotic stability (22) restrict the region of variation of the parameters δ and Ω either to the half-plane δtrD > 0, if det D > 0, or to the region between the line δ = 0 and one of the branches of the hyperbola |det D|δ² − Ω² = 2ωf², if det D < 0. Provided that δ and Ω belong to the described domain, the asymptotic stability of system (1) is determined by the last two of the inequalities (22), which impose limits on the variation of ν. Solving the equation Q(δ, ν, Ω) = 0, which is quadratic in ν, we write the stability condition Q > 0 in the form

(ν − ν⁻cr)(ν − ν⁺cr) < 0,  (24)

with

ν±cr(δ, Ω) = (Ωb ± √(Ω²b² + ac)) δ / a.  (25)

The coefficients a, b, and c are

a(δ, Ω) = 4Ω² + δ²(trD)²,
b(δ, Ω) = 4νfβ* + (δ²det D + Ω²)trD,
c(δ, Ω) = νf²((trD)² − 4β*²) + (ωf²trD − 2νfβ*)(δ²det D + Ω²)trD,  (26)

where

β* := tr((K − ωf²I)D)/(2νf).  (27)
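Formulas (25)–(27) can be verified by substituting ν±cr back into Q from (23). A numerical sketch with hypothetical sample matrices:

```python
import numpy as np

K = np.array([[7.0, 3.0], [3.0, 5.0]])        # hypothetical sample data
D = np.array([[2.0, 0.0], [0.0, 1.0]])
trK, detK = np.trace(K), np.linalg.det(K)
trD, detD = np.trace(D), np.linalg.det(D)
om_f2 = trK / 2.0
nu_f = 0.5 * np.sqrt(trK**2 - 4.0 * detK)
beta_star = np.trace((K - om_f2 * np.eye(2)) @ D) / (2.0 * nu_f)   # (27)

def nu_cr(delta, Omega):
    """Critical values (25) with the coefficients (26)."""
    a = 4.0 * Omega**2 + delta**2 * trD**2
    b = 4.0 * nu_f * beta_star + (delta**2 * detD + Omega**2) * trD
    c = (nu_f**2 * (trD**2 - 4.0 * beta_star**2)
         + (om_f2 * trD - 2.0 * nu_f * beta_star)
           * (delta**2 * detD + Omega**2) * trD)
    r = np.sqrt(Omega**2 * b**2 + a * c)
    return (Omega * b - r) * delta / a, (Omega * b + r) * delta / a

def Q(delta, Omega, nu):
    """Hurwitz determinant (23)."""
    q = delta * (trK * trD - np.trace(K @ D)) + 2.0 * Omega * nu
    p = trK + delta**2 * detD + Omega**2
    return -q**2 + delta * trD * p * q - (delta * trD)**2 * (detK + nu**2)

lo, hi = nu_cr(0.2, 0.1)
print(Q(0.2, 0.1, lo), Q(0.2, 0.1, hi))   # both vanish to rounding error
```

Between the two roots, Q is positive, in accordance with (24).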

For det K ≤ 0, the domain of asymptotic stability consists of two non-intersecting parts, bounded by the surfaces ν = ν±cr(δ, Ω) and by the planes ν = ±νd separating it from the divergence domain. For det K > 0, the inequality det K + ν² > 0 is fulfilled, and in accordance with condition (24) the asymptotic stability domain is contained between the surfaces ν = ν⁺cr(δ, Ω) and ν = ν⁻cr(δ, Ω).

The functions ν±cr(δ, Ω) defined by expressions (25) are singular at the origin due to the vanishing denominator. Assuming Ω = βδ and calculating the limit of these functions as δ tends to zero, we obtain

ν±0(β) := lim_(δ→0) ν±cr = νf (4ββ* ± trD√((trD)² + 4(β² − β*²))) / ((trD)² + 4β²).  (28)
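The limit (28) can be cross-checked against ν±cr evaluated along the ray Ω = βδ for a small but finite δ; the sketch below (hypothetical sample matrices) also shows that the limit stays within ±νf:

```python
import numpy as np

K = np.array([[7.0, 3.0], [3.0, 5.0]])        # hypothetical sample data
D = np.array([[2.0, 0.0], [0.0, 1.0]])
trK, detK = np.trace(K), np.linalg.det(K)
trD, detD = np.trace(D), np.linalg.det(D)
nu_f = 0.5 * np.sqrt(trK**2 - 4.0 * detK)
beta_star = np.trace((K - (trK / 2) * np.eye(2)) @ D) / (2.0 * nu_f)

def nu0_plus(beta):
    """'+' branch of the limit (28) along the ray Omega = beta * delta."""
    rad = trD**2 + 4.0 * (beta**2 - beta_star**2)
    return nu_f * (4 * beta * beta_star + trD * np.sqrt(rad)) \
        / (trD**2 + 4 * beta**2)

def nu_cr_plus(delta, Omega):
    """'+' branch of (25) with the coefficients (26)."""
    a = 4 * Omega**2 + delta**2 * trD**2
    b = 4 * nu_f * beta_star + (delta**2 * detD + Omega**2) * trD
    c = (nu_f**2 * (trD**2 - 4 * beta_star**2)
         + ((trK / 2) * trD - 2 * nu_f * beta_star)
           * (delta**2 * detD + Omega**2) * trD)
    return (Omega * b + np.sqrt(Omega**2 * b**2 + a * c)) * delta / a

beta = 0.7
limit = nu0_plus(beta)
finite = nu_cr_plus(1e-5, beta * 1e-5)        # small but finite delta
print(limit, finite, limit <= nu_f)
```

Along β = β* the limit reproduces the unperturbed flutter load νf exactly.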

The functions ν±0(β) are real-valued if the radicand in (28) is non-negative.

Proposition 2. Let λ1(D) and λ2(D) be the eigenvalues of D. Then

|β*| ≤ |λ1(D) − λ2(D)|/2.  (29)

If D is semi-definite (det D ≥ 0) or indefinite with

0 > det D ≥ −(k12(d22 − d11) − d12(k22 − k11))²/(4νf²),  (30)

then

|β*| ≤ |trD|/2,  (31)

and the limits ν±0(β) are continuous real-valued functions of β. Otherwise, there exists an interval of discontinuity β² < β*² − (trD)²/4.

Proof. With the use of the definition (27) of β*, the series of transformations

β*² − (trD)²/4 = (1/(4νf²)) ((k11 − k22)(d11 − d22)/2 + 2k12d12)² − ((d11 + d22)²/4) ((k11 − k22)² + 4k12²)/(4νf²)
  = −det D − (k12(d22 − d11) − d12(k22 − k11))²/(4νf²)  (32)

yields the expression

β*² = (λ1(D) − λ2(D))²/4 − (k12(d22 − d11) − d12(k22 − k11))²/(4νf²).  (33)

For real β*, formula (32) implies inequality (30). The remaining part of the proposition follows from (33).

Inequality (30) subdivides the set of indefinite damping matrices into two classes.
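The chain of transformations (32)–(33) can be verified numerically; in the sketch below the sampled symmetric matrices are arbitrary test data (note that 4νf² = (k11 − k22)² + 4k12² > 0 for any symmetric K that is not a multiple of the identity):

```python
import numpy as np

def beta_star_sq(K, D):
    """beta_*^2 computed from the definition (27)."""
    nu_f = 0.5 * np.sqrt(np.trace(K)**2 - 4.0 * np.linalg.det(K))
    om_f2 = np.trace(K) / 2.0
    return (np.trace((K - om_f2 * np.eye(2)) @ D) / (2.0 * nu_f))**2

def rhs_33(K, D):
    """Right-hand side of identity (33)."""
    nu_f2 = (np.trace(K)**2 - 4.0 * np.linalg.det(K)) / 4.0
    lam = np.linalg.eigvalsh(D)
    cross = K[0, 1] * (D[1, 1] - D[0, 0]) - D[0, 1] * (K[1, 1] - K[0, 0])
    return (lam[1] - lam[0])**2 / 4.0 - cross**2 / (4.0 * nu_f2)

rng = np.random.default_rng(0)
for _ in range(5):
    K = rng.normal(size=(2, 2)); K = K + K.T   # arbitrary symmetric samples
    D = rng.normal(size=(2, 2)); D = D + D.T
    lhs, rhs = beta_star_sq(K, D), rhs_33(K, D)
    assert abs(lhs - rhs) < 1e-8 * max(1.0, abs(lhs))
print("identity (33) confirmed on random samples")
```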

Fig. 3. The functions ν⁺0(β) (bold lines) and ν⁻0(β) (fine lines), and their bifurcation when D changes from weakly to strongly indefinite.

Definition 1. We call a 2 × 2 real symmetric matrix D with det D < 0 weakly indefinite if 4β*² < (trD)², and strongly indefinite if 4β*² > (trD)².
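Definition 1 translates directly into a small classifier. A sketch assuming a 2×2 symmetric stiffness K with (trK)² > 4 det K; the sample matrices are hypothetical:

```python
import numpy as np

def damping_class(K, D):
    """Classify an indefinite symmetric 2x2 damping matrix D (Definition 1)."""
    trK, detK = np.trace(K), np.linalg.det(K)
    nu_f = 0.5 * np.sqrt(trK**2 - 4.0 * detK)              # (12)
    beta_star = np.trace((K - (trK / 2) * np.eye(2)) @ D) / (2.0 * nu_f)
    if not np.linalg.det(D) < 0:
        raise ValueError("Definition 1 applies to indefinite D only")
    return ("strongly indefinite" if 4.0 * beta_star**2 > np.trace(D)**2
            else "weakly indefinite")

# hypothetical sample matrices
K = np.array([[7.0, 3.0], [3.0, 5.0]])
D_weak = np.array([[2.0, 0.1], [0.1, -0.5]])
D_strong = np.array([[0.1, 2.0], [2.0, -0.1]])
print(damping_class(K, D_weak), damping_class(K, D_strong))
```

A large off-diagonal coupling d12 (relative to trD) pushes β* up and makes D strongly indefinite for this K.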

As an illustration, we calculate and plot the functions ν±0(β), normalized by νf, for a positive definite matrix K and indefinite matrices D1, D2, and D3. (34)

The graphs of the functions ν±0(β) bifurcate with the change of the damping matrix from a weakly indefinite to a strongly indefinite one. Indeed, since D1 satisfies the strict inequality (30), the limits are continuous functions with separated graphs, as shown in Fig. 3(a). Expression (30) is an equality for the matrix D2. Consequently, the functions ν±0(β) are continuous, with their graphs touching each other at the origin, Fig. 3(b). For the matrix D3, condition (30) is not fulfilled, and the functions are discontinuous. Their graphs, however, are joined together, forming continuous curves, see Fig. 3(c). The calculated ν±0(β) are bounded functions of β, not exceeding the critical values ±νf of the unperturbed circulatory system.

Proposition 3.

|ν±0(β)| ≤ |ν±0(±β*)| = νf.  (35)

Proof. Let us observe that μ±0 := ν±0/νf are roots of the quadratic equation

aβνf²μ² − 2βb0νfμ − c0 = 0,  (36)

with aβ := a(δ, βδ)/δ² = 4β² + (trD)², b0 := b(0, 0), and c0 := c(0, 0). According to the Schur criterion [6], all the roots μ of equation (36) are inside the closed unit disk if

c0 + aβνf² = νf²((trD)² + 4(β² − β*²) + (trD)²) ≥ 0,
2βb0νf + aβνf² − c0 = 4νf²(β + β*)² ≥ 0,
−2βb0νf + aβνf² − c0 = 4νf²(β − β*)² ≥ 0.  (37)

The first of conditions (37) is satisfied for real ν±0, implying |μ±0(β)| ≤ 1, with |μ⁺0(β*)| = |μ⁻0(−β*)| = 1.

The limits ν±0(β) of the critical values ν±cr(δ, Ω) of the circulatory parameter, which are complicated functions of δ and Ω, effectively depend only on the ratio β = Ω/δ, defining the direction of approaching zero in the plane (δ, Ω). Along the directions β = β* and β = −β*, the limits coincide with the critical flutter loads of the unperturbed circulatory system (2), in such a way that ν⁺0(β*) = νf and ν⁻0(−β*) = −νf. According to Proposition 3, the limit of the non-conservative positional force at the onset of flutter for system (1), with the dissipative and gyroscopic forces tending to zero, does not exceed the critical flutter load of circulatory system (2), demonstrating a jump in the critical load which is characteristic of the destabilization paradox. Power series expansions of the functions ν±0(β) around β = ±β* (with a radius of convergence not exceeding |trD|/2) yield simple estimates of the jumps in the critical load for the two-dimensional system (1):

νf ∓ ν±0(β) = νf (2/(trD)²)(β ∓ β*)² + o((β ∓ β*)²).  (38)
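The quadratic estimate (38) can be compared with the exact limit (28) near β = β*. A sketch with hypothetical sample matrices:

```python
import numpy as np

K = np.array([[7.0, 3.0], [3.0, 5.0]])        # hypothetical sample data
D = np.array([[2.0, 0.1], [0.1, 1.0]])
trK, detK = np.trace(K), np.linalg.det(K)
trD = np.trace(D)
nu_f = 0.5 * np.sqrt(trK**2 - 4.0 * detK)
beta_star = np.trace((K - (trK / 2) * np.eye(2)) @ D) / (2.0 * nu_f)

def nu0_plus(beta):
    """Exact '+' branch of the limit (28)."""
    rad = trD**2 + 4.0 * (beta**2 - beta_star**2)
    return nu_f * (4 * beta * beta_star + trD * np.sqrt(rad)) \
        / (trD**2 + 4 * beta**2)

db = 0.01                                      # small offset from beta_star
exact_jump = nu_f - nu0_plus(beta_star + db)
approx_jump = nu_f * 2.0 * db**2 / trD**2      # leading term of (38)
print(exact_jump, approx_jump)
```

The exact and estimated jumps agree to within the o((β − β*)²) error of the expansion.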

Leaving in expansions (38) only the second-order terms and then substituting β = Ω/δ, we get equations of the form Z = X²/Y², which is canonical for the Whitney umbrella surface [17, 30, 45]. These equations approximate the boundary of the asymptotic stability domain of system (1) in the vicinity of the points (0, 0, ±νf) in the space of the parameters (δ, Ω, ν). An extension to the case when system (1) has m degrees of freedom is given by the following statement.

Theorem 1. Let system (2) with m degrees of freedom be stable for ν < νf, and let at ν = νf its spectrum contain a double eigenvalue iωf with the left and right Jordan chains of generalized eigenvectors u0, u1 and v0, v1 satisfying equations (15) and (16). Define the real quantities

d1 = Re(v0ᵀDu0),  d2 = Im(v0ᵀDu1 + v1ᵀDu0),
g1 = Re(v0ᵀGu0),  g2 = Im(v0ᵀGu1 + v1ᵀGu0),  (39)

and

β* = −(v0ᵀDu0)/(v0ᵀGu0).  (40)

Then, in the vicinity of β := Ω/δ = β*, the limit ν⁺0(β) of the critical flutter load ν⁺cr of the near-reversible system with m degrees of freedom as δ → 0 is

ν⁺0(β) = νf − (g1²/(μ²(d2 + β*g2)²))(β − β*)² + o((β − β*)²).  (41)


Proof. Perturbing a simple eigenvalue iω(ν) of the stable system (2) at a fixed ν < νf by small dissipative and gyroscopic forces yields the increment

λ = iω − (vᵀDu/(2vᵀu)) δ − (vᵀGu/(2vᵀu)) Ω + o(δ, Ω).  (42)

Since the eigenvectors u(ν) and v(ν) can be chosen real, the first-order increment is real-valued. Therefore, in the first approximation in δ and Ω, the simple eigenvalue iω(ν) remains on the imaginary axis if Ω = β(ν)δ, where

β(ν) = −(vᵀ(ν)Du(ν))/(vᵀ(ν)Gu(ν)).  (43)

Substituting expansions (18) into formula (43), we obtain

β(ν) = −(d1 ± d2μ√(νf − ν) + o(√(νf − ν)))/(g1 ± g2μ√(νf − ν) + o(√(νf − ν))),  (44)

from which expression (41) follows if |β − β*| ≪ 1.

Fig. 4. For various ν, bold lines show linear approximations to the boundary of the asymptotic stability domain (white) of system (1) in the vicinity of the origin in the plane (δ, Ω), when trK > 0 and det K > 0, and 4β*² < (trD)² (upper row) or 4β*² > (trD)² (lower row).

After substituting β = Ω/δ, formula (41) gives an approximation of the critical flutter load

ν⁺cr(δ, Ω) = νf − (g1²/(μ²(d2 + β*g2)²)) (Ω − β*δ)²/δ²,  (45)

which has the canonical Whitney's umbrella form. The coefficients (21) and (39) calculated with the use of the vectors (20) are

d1 = 2(k22 − k11)tr((K − ωf²I)D),  d2 = −8ωf(2d12k12 + d22(k22 − k11)),
g1 = 4(k11 − k22)νf,  g2 = 16ωfνf.  (46)

With (46), expression (41) reduces to (38). Using the exact expressions for the functions ω(ν), u(ν), and v(ν), we obtain better estimates in the case m = 2. Substituting the explicit expression for the eigenfrequency,

ω²(ν) = ωf² ± √(νf² − ν²),  (47)

following from (11)–(13), into equation (43), which now reads

δ(2νfβ* + (ω²(ν) − ωf²)trD) − 2Ων = 0,  (48)

we obtain

Ω = (νf/ν) ( β* ± (trD/2)√(1 − ν²/νf²) ) δ.  (49)

Equation (49) is simply formula (28) inverted with respect to β = Ω/δ.

Fig. 5. The domain of asymptotic stability of system (1) with the singularities Whitney umbrella, dihedral angle, and trihedral angle when K > 0 and 4β*² < (trD)² (a), K > 0 and 4β*² > (trD)² (b), and when trK > 0 and det K < 0 (c).

We use the linear approximation (49) to study the asymptotic behavior of the stability domain of the two-dimensional system (1) in the vicinity of the origin in the plane (δ, Ω) for various ν. It is enough to consider only the case trK > 0 and det K > 0, so that −νf < ν < νf, because for det K ≤ 0 the region ν² < νd² ≤ νf² is unstable and should be excluded.

For ν² < νf² the radicand in expression (49) is real and nonzero, so that in the first approximation the domain of asymptotic stability is contained between two lines intersecting at the origin, as depicted in Fig. 4 (central column). When ν approaches the critical values ±νf, the angle becomes more acute, until at ν = νf or ν = −νf it degenerates to the single line Ω = δβ* or Ω = −δβ*, respectively. For β* ≠ 0 these lines are not parallel to each other, and due to inequality (31) they are never vertical, see Fig. 4 (right column). However, the degeneration can be lifted already in the second-order approximation in δ:

Ω = ±δβ* ± (ωf trD √(det D + β*²)/(2νf)) δ² + O(δ³).  (50)

If the radicand is positive, equation (50) defines two curves touching each other at the origin, as shown in Fig. 4 by dashed lines. Inside the cusps |ν±cr(δ, Ω)| > νf. The evolution of the domain of asymptotic stability in the plane (δ, Ω), as ν goes from ±νf to zero, depends on the structure of the matrix D and is governed by the sign of the expression 4β*² − (trD)². For the negative sign the angle between the lines (49) gets wider, tending to π as ν → 0, see Fig. 4 (upper left). Otherwise, the angle reaches a maximum for some ν² < νf² and then shrinks to the single line δ = 0 at ν = 0, Fig. 4 (lower left). At ν = 0 the Ω-axis corresponds to a marginally stable gyroscopic system. Since the linear approximation to the asymptotic stability domain does not contain the Ω-axis for any ν ≠ 0, small gyroscopic forces cannot stabilize a circulatory system in the absence of damping forces (δ = 0), which is in agreement with the theorems of Lakhadanov and Karapetyan [12, 13].

Reconstructing with the use of the obtained results the asymptotic stability domain of system (1), we find that it has three typical configurations in the vicinity of the ν-axis in the parameter space (δ, Ω, ν). In the case of a positive definite matrix K and a semi-definite or weakly indefinite matrix D, the addition of small damping and gyroscopic forces blows the stability interval ν² < νf² of the circulatory system up to a three-dimensional region bounded by the parts of the singular surface ν = ν±cr(δ, Ω) that belong to the half-space δtrD > 0, Fig. 5(a). The stability interval of the circulatory system forms an edge of a dihedral angle. At ν = 0 the angle of intersection reaches its maximum (π), creating another edge along the Ω-axis. While approaching the points ±νf, the angle becomes more acute and ends up with the deadlock of an edge, Fig. 5(a).
When the matrix D approa hes the threshold 4β2∗ = (trD)2 , two smooth parts of the stability boundary orresponding to negative and positive ν ome towards ea h other until they tou h, when D is at the threshold. After D be omes strongly inde nite this temporary glued on guration ollapses into two po kets of asymptoti stability, as shown in Fig. 5(b). Ea h of the two po kets has a deadlo k of an edge as well as two edges whi h meet at the origin and form a singularity known as the \break of an edge" [17℄. The on guration of the asymptoti stability domain, shown in Fig. 5( ),

corresponds to an indefinite matrix K with trK > 0 and det K < 0. In this case the condition ν² > νd² divides the domain of asymptotic stability into two parts, corresponding to positive and negative ν. The intervals of the ν-axis form edges of dihedral angles, which end up with the deadlocks at ν = ±νf and with the trihedral angles at ν = ±νd, Fig. 5(c). Qualitatively, this configuration does not depend on the properties of the matrix D.

Fig. 6. Bifurcation of the domain of asymptotic stability (white) in the plane (δ, Ω) at ν = 0 due to the change of the structure of the matrix D according to the criterion (44).

We note that the parameter 4β∗² − (trD)² governs not only the bifurcation of the stability domain near the ν-axis, but also the bifurcation of the whole stability domain in the space of the parameters δ, Ω, and ν. This is seen from the stability conditions (24)-(26). For example, for ν = 0 the inequality Q > 0 reduces to c(δ, Ω) > 0, where c(δ, Ω) is given by (26). For positive semi-definite matrices D this condition is always satisfied. For indefinite matrices the equation c(δ, Ω) = 0 defines either a hyperbola or two intersecting lines. In the case of weakly-indefinite D the stability domain is bounded by the ν-axis and one of the hyperbolic branches, see Figure 6 (left). At the threshold 4β∗² = (trD)² the stability domain separates into two half-conical parts, as shown in the center of Figure 6. Strongly-indefinite damping makes stabilization by small gyroscopic forces impossible, see Figure 6 (right). In this case non-conservative forces are required for stabilization. Thus, we generalize the results of the works [35, 36], which were obtained for diagonal matrices K and D. Moreover, the authors of the works [35, 36] did not take into account the non-conservative positional forces corresponding to the matrix N in equation (1) and missed the existence of the two classes of indefinite matrices which lead to the bifurcation of the domain of asymptotic stability. We can also conclude that, at least in two dimensions, the requirement of definiteness of the matrix D established in [46] is not necessary for the stabilization of a circulatory system by gyroscopic and damping forces.

3  A gyroscopic system with weak damping and circulatory forces

A statically unstable potential system which has been stabilized by gyroscopic forces can be destabilized by the introduction of small stationary damping, which is a velocity-dependent force [1]. However, many statically unstable gyropendulums enjoy robust stability at high speeds [31]. To explain this phenomenon the concept of rotating damping has been introduced, which is also proportional to the displacements in a non-conservative way and thus contributes not only to the matrix D in equation (1) but to the matrix N as well [3-5, 31]. This leads to the problem of perturbation of gyroscopic system (3) by weak dissipative and non-conservative positional forces [14, 27, 31, 32, 46, 48, 49, 59, 62, 63, 66, 74].

3.1  Stability of a gyroscopic system

In the absence of dissipative and circulatory forces (δ = ν = 0), the polynomial (10) has four roots ±λ±, where

λ± = [ −(trK + Ω²)/2 ± (1/2)√((trK + Ω²)² − 4 det K) ]^(1/2).   (51)

Analysis of these eigenvalues yields the following result, see e.g. [47].

Proposition 4. If det K > 0 and trK < 0, gyroscopic system (3) with two degrees of freedom is unstable by divergence for Ω² < Ω₀⁻², unstable by flutter for Ω₀⁻² ≤ Ω² ≤ Ω₀⁺², and stable for Ω₀⁺² < Ω², where the critical values Ω₀⁻ and Ω₀⁺ are given by

0 ≤ √(−trK − 2√(det K)) =: Ω₀⁻ ≤ Ω₀⁺ := √(−trK + 2√(det K)).   (52)

If det K > 0 and trK > 0, the gyroscopic system is stable for any Ω [2]. If det K ≤ 0, the system is unstable [1].

Representing for det K > 0 the equation (51) in the form

λ± = [ −(1/2)(Ω² − (Ω₀⁻² + Ω₀⁺²)/2) ± (1/2)√((Ω² − Ω₀⁻²)(Ω² − Ω₀⁺²)) ]^(1/2),   (53)

we find that at Ω = 0 there are in general four real roots ±λ± = ±(Ω₀⁺ ± Ω₀⁻)/2, and system (3) is statically unstable. With the increase of Ω² the distance λ₊ − λ₋ between the two roots of the same sign gets smaller. The roots move towards each other until they merge at Ω² = Ω₀⁻² with the origination of a pair of double real eigenvalues ±ω₀ with Jordan blocks, where

ω₀ = (1/2)√(Ω₀⁺² − Ω₀⁻²) = (det K)^(1/4) > 0.   (54)

Further increase of Ω² yields splitting of ±ω₀ into two couples of complex conjugate eigenvalues lying on the circle

(Re λ)² + (Im λ)² = ω₀².   (55)

The complex eigenvalues move along the circle until at Ω² = Ω₀⁺² they reach the imaginary axis and originate a complex-conjugate pair of double purely imaginary eigenvalues ±iω₀. For Ω² > Ω₀⁺² the double eigenvalues split into four simple purely imaginary eigenvalues which do not leave the imaginary axis, Fig. 7.

Fig. 7. Stability diagram for the gyroscopic system with K < 0 (left) and the corresponding trajectories of the eigenvalues in the complex plane for increasing parameter Ω > 0 (right).

Thus, the system (3) with K < 0 is statically unstable for Ω ∈ (−Ω₀⁻, Ω₀⁻), dynamically unstable for Ω ∈ [−Ω₀⁺, −Ω₀⁻] ∪ [Ω₀⁻, Ω₀⁺], and stable (gyroscopic stabilization) for Ω ∈ (−∞, −Ω₀⁺) ∪ (Ω₀⁺, ∞), see Fig. 7. The values ±Ω₀⁻ of the gyroscopic parameter define the boundary between the divergence and flutter domains, while the values ±Ω₀⁺ originate the flutter-stability boundary.
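As a small numerical sketch of Proposition 4 (not part of the original analysis), the three regimes can be read off the roots of the characteristic polynomial λ⁴ + (trK + Ω²)λ² + det K = 0 underlying (51). The stiffness matrix K = diag(−1, −4), for which Ω₀⁻ = 1 and Ω₀⁺ = 3, is an assumption chosen purely for illustration.

```python
import cmath

def eigenvalues(trK, detK, Omega):
    """Roots of lambda^4 + (trK + Omega^2)*lambda^2 + detK = 0,
    the characteristic polynomial (10) with delta = nu = 0,
    solved as a quadratic in lambda^2."""
    b = trK + Omega ** 2
    disc = cmath.sqrt(b * b - 4 * detK)
    roots = []
    for lam2 in ((-b + disc) / 2, (-b - disc) / 2):
        r = cmath.sqrt(lam2)
        roots += [r, -r]
    return roots

# Illustrative statically unstable system: K = diag(-1, -4),
# so trK = -5 < 0, detK = 4 > 0, giving Omega0- = 1 and Omega0+ = 3.
trK, detK = -5.0, 4.0

def regime(Omega, tol=1e-9):
    lams = eigenvalues(trK, detK, Omega)
    if any(abs(l.imag) < tol and l.real > tol for l in lams):
        return "divergence"   # a positive real eigenvalue exists
    if any(l.real > tol for l in lams):
        return "flutter"      # complex eigenvalue with positive real part
    return "stable"           # all eigenvalues purely imaginary

print(regime(0.5), regime(2.0), regime(4.0))  # divergence flutter stable
```

The three sampled values of Ω reproduce the divergence, flutter, and gyroscopic stabilization intervals of Fig. 7.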

3.2  The influence of small damping and non-conservative positional forces on the stability of a gyroscopic system

Consider the asymptotic stability domain in the plane (δ, ν) in the vicinity of the origin, assuming that Ω ≠ 0 is fixed. Observing that the third of the inequalities (22) is fulfilled for det K > 0 and the first one simply restricts the region of variation of δ to the half-plane δtrD > 0, we focus our analysis on the remaining two of the conditions (22). Taking into account the structure of the coefficients (26) and keeping the linear terms with respect to δ in the Taylor expansions of the functions ν±cr(δ, Ω), we get the equations determining a linear approximation to the stability boundary:

ν = [trKD − trKtrD − trD λ±²(Ω)]/(2Ω) δ = [2trKD + trD(Ω² − trK) ± trD√((Ω² + trK)² − 4 det K)]/(4Ω) δ,   (56)

where the eigenvalues λ±(Ω) are given by formula (51). For det K > 0 and trK > 0 the gyroscopic system is stable at any Ω. Consequently, the coefficients λ±²(Ω) are always real, and equations (56) define in general two lines intersecting at the origin, Fig. 8. Since trK > 0, the second of the inequalities (22) is satisfied for det D > 0, and it gives an upper bound on δ² for det D < 0. Thus, a linear approximation to the domain of asymptotic stability near the origin in the plane (δ, ν) is an angle-shaped area between the two lines (56), as shown in Fig. 8. With the change of Ω the size of the angle varies and, moreover, the stability domain rotates as a whole about the origin. As Ω → ∞, the size of the angle tends to π/2 in such a way that the stability domain fits one of the four quadrants of the parameter plane, as shown in Fig. 8 (right column). From (56) it follows that asymptotically, as Ω → 0,

ν(Ω) = (νf/Ω)(β∗ ± trD/2) + o(1/Ω).   (57)

Consequently, the angle between the lines (56) tends to π for the matrices D satisfying the condition 4β∗² < (trD)², see Fig. 8 (upper left). In this case, in the linear approximation, the domain of asymptotic stability spreads over two quadrants and contains the δ-axis. Otherwise, the angle tends to zero as Ω → 0, Fig. 8 (lower left). In the linear approximation the stability domain always belongs to one quadrant and does not contain the δ-axis, so that in the absence of non-conservative positional forces gyroscopic system (3) with K > 0 cannot be made asymptotically stable by damping forces with a strongly-indefinite matrix D, which is also visible in the three-dimensional picture of Fig. 5(b). The three-dimensional domain of asymptotic stability of near-Hamiltonian system (1) with K > 0 and D semi-definite or weakly-indefinite lies inside a dihedral angle with the Ω-axis as its edge, as shown in Fig. 5(a). With the increase in |Ω|, the section of the domain by the plane Ω = const gets narrower and rotates about the origin, so that points of the parameter plane (δ, ν) that were stable at lower |Ω| can lose their stability at higher absolute values of the gyroscopic parameter (gyroscopic destabilization of a statically stable potential system in the presence of damping and non-conservative positional forces). To study the case when K < 0, we write equation (56) in the form

ν = (Ω₀⁺/Ω) [ γ∗ + (trD/4) √(Ω²/Ω₀⁺² − 1) ( √(Ω² − Ω₀⁺²) ± √(Ω² − Ω₀⁻²) ) ] δ,   (58)

Fig. 8. For various Ω, bold lines show linear approximations to the boundary of the asymptotic stability domain (white) of system (1) in the vicinity of the origin in the plane (δ, ν), when trK > 0 and det K > 0, and 4β∗² < (trD)² (upper row) or 4β∗² > (trD)² (lower row).

where

γ∗ := tr[K + (Ω₀⁺² − ω₀²)I]D / (2Ω₀⁺).   (59)

Proposition 5. Let λ₁(D) and λ₂(D) be the eigenvalues of D. Then

|γ∗| ≤ Ω₀⁺ |λ₁(D) + λ₂(D)|/4 + Ω₀⁻ |λ₁(D) − λ₂(D)|/4.   (60)

Proof. With the use of the Cauchy-Schwarz inequality we obtain

|γ∗| ≤ Ω₀⁺ |trD|/4 + |tr[(K − (trK/2)I)(D − (trD/2)I)]|/(2Ω₀⁺) ≤ Ω₀⁺ |trD|/4 + |λ₁(K) − λ₂(K)| |λ₁(D) − λ₂(D)|/(4Ω₀⁺).   (61)

Taking into account that |λ₁(K) − λ₂(K)| = Ω₀⁻Ω₀⁺, we get inequality (60).

Expression (58) is real-valued when Ω² ≥ Ω₀⁺² or Ω² ≤ Ω₀⁻². For sufficiently small |δ| the first inequality implies the second of the stability conditions (22), whereas the last inequality contradicts it. Consequently, the domain of asymptotic stability is determined by the inequalities δtrD > 0 and Q(δ, ν, Ω) > 0, and its linear approximation in the vicinity of the origin in the (δ, ν)-plane has the form of an angle with the boundaries given by equations (58). For Ω tending to infinity the angle expands to π/2, whereas for Ω = Ω₀⁺ or Ω = −Ω₀⁺ it degenerates to a single line, ν = δγ∗ or ν = −δγ∗ respectively. For γ∗ ≠ 0 these lines are not parallel to each other, and due to inequality (60) they never become vertical, see Fig. 9 (left). The degeneration can, however, be removed in the second-order approximation in δ:

ν = ±δγ∗ ± (trD/(2Ω₀⁺)) √(ω₀² det D − γ∗²) δ² + O(δ³),   (62)

as shown by the dashed lines in Fig. 9 (left).

Fig. 9. For various Ω, bold lines show linear approximations to the boundary of the asymptotic stability domain (white) of system (1) in the vicinity of the origin in the plane (δ, ν), when K < 0.

Therefore, gyroscopic stabilization of a statically unstable conservative system with K < 0 can be improved up to asymptotic stability by small damping and circulatory forces, if their magnitudes lie in a narrow region with boundaries depending on Ω. The lower the desired absolute value of the critical gyroscopic parameter Ωcr(δ, ν), the poorer the choice of appropriate combinations of damping and circulatory forces. To estimate the new critical value of the gyroscopic parameter Ωcr(δ, ν), which can deviate significantly from that of the conservative gyroscopic system, we consider the formula (58) in the vicinity of the points (0, 0, ±Ω₀⁺) in the parameter space. Keeping only the terms which are constant or proportional to √(Ω ∓ Ω₀⁺) in both the numerator and the denominator, and assuming ν = γδ, we find

±Ω⁺cr(γ) = ±Ω₀⁺ ± Ω₀⁺ (2/(ω₀ trD)²) (γ ∓ γ∗)² + o((γ ∓ γ∗)²).   (63)

After the substitution γ = ν/δ, equations (63) take the form canonical for the Whitney umbrella. The domain of asymptotic stability consists of two pockets of two Whitney umbrellas, selected by the conditions δtrD > 0 and Q(δ, ν, Ω) > 0. Equations (58) are a linear approximation to the stability boundary in the vicinity of the Ω-axis. Moreover, they describe in an implicit form the limit of the critical gyroscopic parameter Ωcr(δ, γδ) as δ tends to zero, as a function of the ratio γ = ν/δ, Fig. 10(b). Most of the directions γ give the limit value |Ω±cr(γ)| > Ω₀⁺, with the exception of γ = γ∗ and γ = −γ∗, so that

Ω⁺cr(γ∗) = Ω₀⁺ and Ω⁻cr(−γ∗) = −Ω₀⁺. Estimates of the critical gyroscopic parameter (63) are extended to the case of an arbitrary number of degrees of freedom by the following statement.

Fig. 10. Blowing the domain of gyroscopic stabilization of a statically unstable conservative system with K < 0 up to the domain of asymptotic stability with the Whitney umbrella singularities (a). The limits of the critical gyroscopic parameter Ω±cr as functions of γ = ν/δ (b).

Theorem 2. Let the system (3) with an even number m of degrees of freedom be gyroscopically stabilized for Ω > Ω₀⁺, and let at Ω = Ω₀⁺ its spectrum contain a double eigenvalue iω₀ with the Jordan chain of generalized eigenvectors u₀, u₁ satisfying the equations

(−Iω₀² + iω₀Ω₀⁺G + K)u₀ = 0,
(−Iω₀² + iω₀Ω₀⁺G + K)u₁ = −(2iω₀I + Ω₀⁺G)u₀.   (64)

Define the real quantities d₁, d₂, n₁, n₂, and γ∗ as

d₁ = Re(ū₀ᵀDu₀),  d₂ = Im(ū₀ᵀDu₁ − ū₁ᵀDu₀),
n₁ = Im(ū₀ᵀNu₀),  n₂ = Re(ū₀ᵀNu₁ − ū₁ᵀNu₀),   (65)

γ∗ = −iω₀ (ū₀ᵀDu₀)/(ū₀ᵀNu₀),   (66)

where the bar over a symbol denotes complex conjugation. Then, in the vicinity of γ := ν/δ = γ∗ the limit of the critical value of the gyroscopic parameter Ω⁺cr of the near-Hamiltonian system as δ → 0 is

Ω⁺cr(γ) = Ω₀⁺ + n₁² (γ − γ∗)² / (µ²(ω₀d₂ − γ∗n₂ − d₁)²),   (67)

which is valid for |γ − γ∗| ≪ 1.

Sensitivity analysis of Hamiltonian and reversible systems

53

Proof. Perturbing the system (3), whi h is stabilized by the gyros opi for es with Ω > Ω+ 0 , by small damping and ir ulatory for es, yields an in rement to a simple eigenvalue [53℄ λ = iω −

ω2 uT Duδ − iωuT Nuν + o(δ, ν). uT Ku + ω2 uT u

(68)

Choose the eigenvalues and the orresponding eigenve tors that merge at Ω = Ω+ 0

q 1 + 2 iω(Ω) = iω0 ± iµ Ω − Ω+ 0 + o(|Ω − Ω0 | ), q 1 + 2 u(Ω) = u0 ± iµu1 Ω − Ω+ 0 + o(|Ω − Ω0 | ),

where

µ2 = −

2ω20 uT0 u0 . T + T T 2 T Ω+ 0 (ω0 u1 u1 − u1 Ku1 − iω0 Ω0 u1 Gu1 − u0 u0 )

(69) (70)

Sin e D and K are real symmetri matri es and N is a real skew-symmetri one, the rst-order in rement to the eigenvalue iω(Ω) given by (68) is real-valued. Consequently, in the rst approximation in δ and ν, simple eigenvalue iω(Ω) remains on the imaginary axis, if ν = γ(Ω)δ, where γ(Ω) = −iω(Ω)

uT (Ω)Du(Ω) . uT (Ω)Nu(Ω)

(71)

Substitution of the expansions (69) into the formula (71) yields q q d1 ∓ µd2 Ω − Ω+ 0 q γ(Ω) = −(ω0 ± µ Ω − Ω+ , 0) n1 ± µn2 Ω − Ω+ 0

(72)

wherefrom the expression (67) follows, if |γ − γ∗ | ≪ 1.

Substituting γ = ν/δ in expression (72) yields the estimate for the riti al value of the gyros opi parameter Ω+ cr (δ, ν) + Ω+ cr (δ, ν) = Ω0 +

n21 (ν − γ∗ δ)2 . µ2 (ω0 d2 − γ∗ n2 − d1 )2 δ2

(73)

We show now that for m = 2 expression (67) implies (63). At the riti al value of the gyros opi parameter Ω+ 0 de ned by equation (52), the double eigenvalue iω0 with ω0 given by (54) has the Jordan hain −1 −iω0 Ω+ − k12 0 0 , u1 = 2 u0 = . −ω20 + k11 ω0 − k22 iω0 (k22 − k11 ) − Ω+ 0 k12

(74)

54

O. N. Kirillov

With the ve tors (74) equation (70) yields µ2 =

2 2 Ω+ Ω+ 0 (ω0 − k11 )(ω0 − k22 ) 0 > 0, = 2 2 2 Ω+ ω2 − k2 0

0

(75)

12

whereas the formula (66) reprodu es the oeÆ ient γ∗ given by (59). To show that (63) follows from (67) it remains to al ulate the oeÆ ients (65). We have 2 2 2 n1 = −2Ω+ 0 ω0 (ω0 − k11 ), ω0 d2 − γ∗ n2 − d1 = −2ω0 (ω0 − k11 )trD. (76) 2 2 Taking into a

ount that (Ω+ 0 ) = −trK + 2ω0 , and using the relations (76) in (73) we exa tly reprodu e (63). Therefore, in the presen e of small damping and non- onservative positional for es, gyros opi for es an both destabilize a stati ally stable onservative system (gyros opi destabilization) and stabilize a stati ally unstable onservative system (gyros opi stabilization). The rst ee t is essentially related with the dihedral angle singularity of the stability boundary, whereas the se ond one is governed by the Whitney umbrella singularity. In the remaining se tions we demonstrate how these singularities appear in me hani al systems.

4

The modified Maxwell-Bloch equations with mechanical applications

The modi ed Maxwell-Blo h equations are the normal form for rotationally symmetri , planar dynami al systems [28, 48, 59℄. They follow from equation (1) for m = 2, D = I, and K = κI, and thus an be written as a single dierential equation with the omplex oeÆ ients x + iΩx_ + δx_ + iνx + κx = 0, x = x1 − ix2 ,

(77)

where κ orresponds to potential for es. Equations in this form appear in gyrodynami al problems su h as the tippe top inversion, the rising egg, and the onset of os illations in the squealing dis brake and the singing wine glass [14, 31, 48, 59, 62, 66, 68, 76℄. A

ording to stability onditions (22) the solution x = 0 of equation (77) is asymptoti ally stable if and only if δ > 0,

Ω>

ν δ − κ. δ ν

(78)

For κ > 0 the domain of asymptoti stability is a dihedral angle with the Ω-axis serving as its edge, Fig. 11(a). The se tions of the domain by the planes Ω = const are ontained in the angle-shaped regions with the boundaries ν=

Ω±

√ Ω2 + 4κ δ. 2

(79)

Sensitivity analysis of Hamiltonian and reversible systems

55

Fig. 11. Two on gurations of the asymptoti stability domain of the modi ed MaxwellBlo h equations for κ > 0 (a) and κ < 0 (b) orresponding to gyros opi destabilization and gyros opi stabilization respe tively; Hauger's gyropendulum ( ).

The domain shown in Fig. 11(a) is a parti ular ase of that depi ted in Fig. 5(a). For K = κI the interval [−νf , νf ] shown in Fig. 5(a)√shrinks to a point so that at Ω = 0 the angle is bounded by the lines ν = ±δ κ and thus it is less than π. The domain of asymptoti stability is twisting around the Ω-axis in su h a manner that it always remains in the half-spa e δ > 0, Fig. 11(a). Consequently, the system stable at Ω = 0 an be ome unstable at greater Ω, as shown in Fig. 11(a) by the dashed line. The larger magnitudes of ir ulatory for es, the lower |Ω| at the onset of instability. As κ > 0 de reases, the hypersurfa es forming the dihedral angle approa h ea h other so that, at κ = 0, they temporarily merge along the line ν = 0 and a new on guration originates for κ < 0, Fig. 11(b). The new domain of asymptoti stability onsists of two disjoint parts that are po kets of two Whitney umbrellas singled out by inequality δ > 0. The absolute values of the gyros opi parameter Ω in the stability domain are always not less than √ Ω+ = 2 −κ . As a onsequen e, the system unstable at Ω = 0 an be ome 0 asymptoti ally stable at greater Ω, as shown in Fig. 11(b) by the dashed line. 4.1

Stability of Hauger’s gyropendulum

Hauger's gyropendulum [14℄ is an axisymmetri rigid body of mass m hinged at the point O on the axis of symmetry as shown in Figure (11)( ). The body's moment of inertia about the axis through the point O perpendi ular to the axis of symmetry is denoted by I, the body's moment of inertia about the axis of symmetry is denoted by I0 , and the distan e between the fastening point and the

enter of mass is s. The orientation of the pendulum, whi h is asso iated with the trihedron Oxfyf zf , with respe t to the xed trihedron Oxi yi zi is spe i ed by the angles ψ, θ, and φ. The pendulum experien es the for e of gravity G = mg and a follower torque T that lies in the plane of the zi and zf oordinate axes. The moment ve tor makes an angle of ηα with the axis zi , where η is a

56

O. N. Kirillov

parameter (η 6= 1) and α is the angle between the zi and zf axes. Additionally, the pendulum experien es the restoring elasti moment R = −rα in the hinge and the dissipative moments B = −bωs and K = −kφ, where ωs is the angular velo ity of an auxiliary oordinate system Oxs ys zs with respe t to the inertial system and r, b, and k are the orresponding oeÆ ients. Linearization of the nonlinear equations of motion derived in [14℄ with the new variables x1 = ψ and x2 = θ and the subsequent nondimensionalization yield the Maxwell-Blo h equations (77) where the dimensionless parameters are given by Ω=

I0 1−η T b r − mgs , ν= T, ω = − . , δ= , κ= 2 2 I Iω Iω Iω k

(80)

The domain of asymptoti stability of the Hauger gyropendulum, given by (78), is shown in Fig. 11(a,b). A

ording to formulas (52) and (54), for the stati ally unstable gyropendulum (κ < 0) the√singular points on the Ω-axis orrespond to the riti al √ = ±2 −κ and the

riti al frequen y ω = −κ . Noting that values ±Ω+ 0 0√ + + Ωcr (ν = ± −κδ, δ) = ±Ω0 and substituting γ = ν/δ into formula (78), we √ expand Ω+ cr (γ) in a series in the neighborhood of γ = ± −κ √ √ √ 1 (γ ∓ −κ)2 + o (γ ∓ −κ)2 . Ω+ cr (γ) = ±2 −κ ± √ −κ

(81)

Pro eeding from γ to ν and δ in (81) yields approximations of the stability boundary near the singularities: √ √ 1 (ν ∓ δ −κ)2 . = ±2 −κ ± √ (82) δ2 −κ √ √ They also follow from formula (63) after substituting ω0 = −κ, and γ∗ = −κ, Ω+ cr (ν, δ)

where the last value is given by (59). Thus, Hauger's gyropendulum, whi h is unstable at Ω = 0, an be ome asymptoti ally stable for suÆ iently large |Ω| > Ω+ 0 under a suitable ombination of dissipative and non onservative positional for es. Note that Hauger failed to nd Whitney umbrella singularities on the boundary of the pendulum's gyros opi stabilization domain. 4.2

Friction-induced instabilities in rotating elastic bodies of revolution

e , κ = ρ2 − Ω e 2 , and The modi ed Maxwell-Blo h equations (77) with Ω = 2Ω ν = 0 and δ = 0, where ρ > 0 is the frequen y of free vibrations of the potential e = ν = 0, des ribe a two-mode approximation system orresponding to δ = Ω of the models of rotating elasti bodies of revolution after their linearization and dis retization [67, 71, 76℄. In the absen e of dissipative and non- onservative

Sensitivity analysis of Hamiltonian and reversible systems

57

positional for es the hara teristi polynomial (10) orresponding to the opere = Iλ2 + 2λΩG e + (ρ2 − Ω e 2 )I, whi h belongs to the lass of matrix ator L0 (Ω) polynomials onsidered, e.g., in [38℄, has four purely imaginary roots e λ± p = iρ ± iΩ,

e λ± n = −iρ ± iΩ.

(83)

e Im λ) the eigenvalues (83) form a olle tion of straight lines In the plane (Ω, interse ting with ea h other { the spe tral mesh [64, 76℄. Two nodes of the e = 0 orrespond to the double semi-simple eigenvalues λ = ±iρ. The mesh at Ω e =Ω e 0 = 0 has two linearly-independent double semi-simple eigenvalue iρ at Ω eigenve tors u1 and u2 1 u1 = √ 2ρ

0 , 1

1 u2 = √ 2ρ

1 . 0

(84)

The eigenve tors are orthogonal uTi uj = 0, i 6= j, and satisfy the normalization e = ±Ω e d there exist double

ondition uTi ui = (2ρ)−1 . At the other two nodes at Ω e e semi-simple eigenvalues λ = 0. The range |Ω| < Ωd = ρ is alled sub riti al for e. the gyros opi parameter Ω In the following, with the use of the perturbation theory of multiple eigenvalues, we des ribe the deformation of the mesh aused by dissipative (δD) and non- onservative perturbations (νN), originating, e.g. from the fri tional

onta t, and larify the key role of inde nite damping and non- onservative positional for es in the development of the sub riti al utter instability. This will give a lear mathemati al des ription of the me hanism of ex itation of parti ular modes of rotating stru tures in fri tional onta t, su h as squealing dis brakes and singing wine glasses [67, 71, 76℄. e =Ω e 0 + ∆Ω e , the double Under perturbation of the gyros opi parameter Ω eigenvalue iρ into two simple ones bifur ates a

ording to the asymptoti formula [58℄ r 2 e f11 + f22 ± i∆Ω e (f11 − f22 ) + f12 f21 λ± p = iρ + i∆Ω 2 4 where the quantities fij are e ∂L ( Ω) 0 = 2iρuTj Gui . ui fij = uTj e e ∂Ω Ω=0,λ=iρ

(85)

(86)

The skew symmetry of G yields f11 = f22 = 0, f12 = −f21 = i, so that (86) gives the exa t result (83). 4.2.1 Deformation of the spectral mesh. Consider a perturbation of the gye + ∆L(Ω) e , assuming that the size of the perturbation ros opi system L0 (Ω) e = δλD + νN ∼ ε is small, where ε = k∆L(0)k is the Frobenius norm ∆L(Ω)

58

O. N. Kirillov

e = 0. The behavior of the perturbed eigenvalue iρ for of the perturbation at Ω e and small ε is des ribed by the asymptoti formula [58℄ small Ω e (f11 + f22 ) + i ǫ11 + ǫ22 λ = iβ + iΩ 2 2 s e (Ω(f11 − f22 ) + ǫ11 − ǫ22 )2 e 12 + ǫ12 )(Ωf e 21 + ǫ21 ), ±i + (Ωf 4

(87)

where fij are given by (86) and ǫij are small omplex numbers of order ε ǫij = uTj ∆L(0)ui = iρδuTj Dui + νuTj Nui .

(88)

With the use of the ve tors (84) we obtain √ µ1 + µ2 λ = iρ − δ ± c, c = 4

µ1 − µ2 4

2

2 ν e δ + iΩ + , 2ρ 2

(89)

where the eigenvalues µ1 , µ2 of D satisfy the equation µ2 − µtrD + det D = 0. Separation of real and imaginary parts in equation (89) yields µ + µ2 Re λ = − 1 δ± 4

where Re c =

r

µ1 − µ2 4

|c| + Re c , Im λ = ρ ± 2

2

e2 + δ2 − Ω

r

|c| − Re c , 2

e Ων ν2 , Im c = . 2 4ρ ρ

(90) (91)

The formulas (89)-(91) des ribe splitting of the double eigenvalues at the nodes of the spe tral mesh due to variation of parameters. Assuming ν = 0 in formulas (90) we nd that

Re λ +

when and

µ1 + µ2 δ 4

2

2 e 2 = (µ1 − µ2 ) δ2 , +Ω 16

2 e 2 − (µ1 − µ2 ) δ2 < 0, Ω 16

2 e 2 − (Im λ − ρ)2 = (µ1 − µ2 ) δ2 , Ω 16

Re λ = −

Im λ = ρ

(92) (93)

µ1 + µ2 δ, 4

(94)

when the sign in inequality (93) is opposite. For a given δ equation (94) de nes e Im λ), while (92) is the equation of a ir le in the a hyperbola in the plane (Ω, e Re λ), as shown in Fig. 12(a, ). For tra king the omplex eigenvalues plane (Ω, e , it is onvenient to onsider the due to hange of the gyros opi parameter Ω e Im λ, Re λ). In this spa e eigenvalue bran hes in the three-dimensional spa e (Ω, the ir le belongs to the plane Im λ = ρ and the hyperbola lies in the plane Re λ = −δ(µ1 + µ2 )/4, see Fig. 13(a, ).

Sensitivity analysis of Hamiltonian and reversible systems

59

Origination of a latent sour e of the sub riti al utter instability in presen e of full dissipation: Submerged bubble of instability (a); oales en e of eigenvalues in the omplex plane at two ex eptional points (b); hyperboli traje tories of imaginary parts ( ).

Fig. 12.

The radius rb of the ir le of omplex eigenvalues|the bubble of instability |and the distan e db of its enter from the plane Re λ = 0 are expressed by means of the eigenvalues µ1 and µ2 of the matrix D rb = |(µ1 − µ2 )δ|/4,

db = |(µ1 + µ2 )δ|/4.

(95)

Consequently, the bubble of instability is \submerged" under the surfa e Re λ = e Im λ, Re λ) and does not interse t the plane Re λ = 0 under the 0 in the spa e (Ω,

ondition db > rb , whi h is equivalent to the positive-de niteness of the matrix δD. Hen e, the role of full dissipation or pervasive damping is to deform the spe tral mesh in su h a way that the double semi-simple eigenvalue is in ated to the bubble of omplex eigenvalues (92) onne ted with the two bran hes of the hyperbola (94) at the points Im λ = ρ,

Re λ = −δ(µ1 + µ2 )/4,

e = ±δ(µ1 − µ2 )/4, Ω

(96)

and to plunge all the eigenvalue urves into the region Re λ 6 0. The eigenvalues at the points (96) are double and have a Jordan hain of order 2. In the omplex e along the lines Re λ = −db plane the eigenvalues move with the variation of Ω until they meet at the points (96) and then split in the orthogonal dire tion; however, they never ross the imaginary axis, see Fig. 12(b). The radius of the bubble of instability is greater then the depth of its submersion under the surfa e Re λ = 0 only if the eigenvalues µ1 and µ2 of the damping matrix have dierent signs, i.e. if the damping is inde nite. The damping with the inde nite matrix appears in the systems with fri tional onta t when the fri tion oeÆ ient is de reasing with relative sliding velo ity [35, 36, 40℄. Inde nite damping leads to the emersion of the bubble of instability meaning that the

60

O. N. Kirillov

The me hanism of sub riti al utter instability (bold lines): The ring (bubble) of omplex eigenvalues submerged under the surfa e Re λ = 0 due to a tion of dissipation with det D > 0 - a latent sour e of instability (a); repulsion of eigenvalue bran hes of the spe tral mesh due to a tion of non- onservative positional for es (b); emersion of the bubble of instability due to inde nite damping with det D < 0 ( );

ollapse of the bubble of instability and immersion and emersion of its parts due to

ombined a tion of dissipative and non- onservative positional for es (d). Fig. 13.

e2 e2 eigenvalues √ of the bubble have positive real parts in the range Ω < Ωcr , where δ e cr = Ω 2 − det D. Changing the damping matrix δD from positive de nite to inde nite we trigger the state of the bubble of instability from latent (Re λ < 0) e cr < Ω e d , the to a tive (Re λ > 0), see Fig. 13(a, ). Sin e for small δ we have Ω

utter instability is sub riti al and is lo alized in the neighborhood of the nodes e = 0. of the spe tral mesh at Ω In the absen e of dissipation, the non- onservative positional for es destroy the marginal stability of gyros opi systems [12, 13℄. Indeed, assuming δ = 0 in the formula (89) we obtain e λ± p = iρ ± iΩ ±

ν , 2ρ

e λ± n = −iρ ± iΩ ∓

ν . 2ρ

(97)

Sensitivity analysis of Hamiltonian and reversible systems

61

e and −iρ − iΩ e of the A

ording to (97), the eigenvalues of the bran hes iρ + iΩ spe tral mesh get positive real parts due to perturbation by the non- onservative positional for es. The eigenvalues of the other two bran hes are shifted to the left from the imaginary axis, see Fig. 13(b).

Fig. 14. Sub riti al utter instability due to ombined a tion of dissipative and non onservative positional for es: Collapse and emersion of the bubble of instability (a); e goes from ex ursions of eigenvalues to the right side of the omplex plane when Ω negative values to positive (b); rossing of imaginary parts ( ).

In ontrast to the ee t of inde nite damping the instability indu ed by the non- onservative for es only is not lo al. However, in ombination with the dissipative for es, both de nite and inde nite, the non- onservative for es an

reate sub riti al utter instability in the vi inity of diaboli al points. From equation (89) we nd that in presen e of dissipative and ir ulatory perturbations the traje tories of the eigenvalues in the omplex plane are des ribed by the formula

Re λ +

trD 4

δ (Im λ − ρ) =

e Ων . 2ρ

(98)

Non- onservative positional for es with ν 6= 0 destroy the merging of modes, shown in Fig. 12, so that the eigenvalues move along the separated traje tories. e A

ording to (98) the eigenvalues with | Im λ| in reasing due to an in rease in |Ω| move loser to the imaginary axis then the others, as shown in Fig 14(b). In the e Im λ, Re λ) the a tion of the non- onservative positional for es sepaspa e (Ω, rates the bubble of instability and the adja ent hyperboli eigenvalue bran hes into two non-interse ting urves, see Fig 13(d). The form of ea h of the new eigenvalue urves arries the memory about the original bubble of instability, so that the real parts of the eigenvalues an be positive for the values of the

62

O. N. Kirillov

e = 0 in the range Ω e2 < Ω e 2 , where gyros opi parameter lo alized near Ω cr e cr = δ Ω

trD 4

s

−

ν2 − δ2 ρ2 det D . − δ2 ρ2 (trD/2)2

ν2

(99)

follows from the equations (89)-(91). e2 < Ω e 2 are The eigenfrequen ies of the unstable modes from the interval Ω cr lo alized near the frequen y of the double semi-simple eigenvalue at the node of + the undeformed spe tral mesh: ω− cr < ω < ωcr ω± cr

ν =ρ± 2ρ

s

−

ν2 − δ2 ρ2 det D . − δ2 ρ2 (trD/2)2

ν2

(100)

When the radicand in formulas (99) and (100) is real, the eigenvalues make an excursion to the right side of the complex plane, as shown in Fig. 14(b). In the presence of non-conservative positional forces such excursions beyond the stability boundary are possible even when dissipation is full (det D > 0).

Equation (99) describes a surface in the space of the parameters δ, ν, and Ω̃, which is an approximation to the stability boundary. Solving (99) for the parameter ν yields

\[
\nu = \pm\,\delta\rho\operatorname{tr}D\,\sqrt{\frac{\delta^{2}\det D + 4\widetilde{\Omega}^{2}}{\delta^{2}(\operatorname{tr}D)^{2} + 16\widetilde{\Omega}^{2}}}. \tag{101}
\]

If det D > 0 and Ω̃ is fixed, formula (101) describes two independent curves in the plane (δ, ν) intersecting each other at the origin along the straight lines given by the expression

\[
\nu = \pm\,\frac{\rho\operatorname{tr}D}{2}\,\delta. \tag{102}
\]
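A quick numerical check of the limit relating (101) and (102): near the origin the slope of the curve (101) approaches ρ tr D / 2. The parameter values below are arbitrary illustrations, not taken from the text.

```python
import numpy as np

# Hypothetical parameter values chosen only to illustrate formula (101);
# rho, trD, detD, Omega stand for rho, tr D, det D and the gyroscopic
# parameter Omega-tilde in the text.
rho, trD, detD, Omega = 1.3, 0.8, -0.5, 0.7

def nu(delta):
    """The '+' branch of the stability-boundary approximation (101)."""
    return delta * rho * trD * np.sqrt(
        (delta**2 * detD + 4 * Omega**2) /
        (delta**2 * trD**2 + 16 * Omega**2))

# For delta -> 0 the square root tends to 1/2, so nu/delta tends to the
# slope rho * trD / 2 of the tangent lines (102).
delta = 1e-6
slope = nu(delta) / delta
print(slope, rho * trD / 2)
```

The same computation with det D < 0 also shows that the radicand stays positive only while δ² < −4Ω̃²/det D, in agreement with the discussion that follows.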

However, in the case when det D < 0, the radical in (101) is real only for δ² < −4Ω̃²/det D, meaning that (101) describes two branches of a closed loop in the plane of the parameters δ and ν. The loop is self-intersecting at the origin with the tangents given by the expression (102). Hence, the shape of the surface described by equation (101) is a cone with an "8"-shaped loop in a cross-section, see Fig. 15(a). The asymptotic stability domain lies inside two of the four pockets of the cone, selected by the inequality δ tr D > 0, as shown in Fig. 15(a). The singularity of the stability domain at the origin is a degeneration of the more general configuration shown in Fig. 5(b).
The domain of asymptotic stability bifurcates when det D changes from negative to positive values. This process is shown in Fig. 15. In the case of indefinite damping there exists an instability gap due to the singularity at the origin. Starting in the flutter domain at Ω̃ = 0 for any combination of the parameters

Sensitivity analysis of Hamiltonian and reversible systems

63

Fig. 15. Domains of asymptotic stability in the space (δ, ν, Ω̃) for different types of damping: indefinite damping det D < 0 (a); semi-definite (pervasive) damping det D = 0 (b); full dissipation det D > 0 (c).

δ and ν one can reach the domain of asymptotic stability at higher values of |Ω̃| (gyroscopic stabilization), as shown in Fig. 15(a) by the dashed line. The gap is responsible for the subcritical flutter instability localized in the vicinity of the node of the spectral mesh of the unperturbed gyroscopic system. When det D = 0, the gap vanishes in the direction ν = 0. In the case of full dissipation (det D > 0) the singularity at the origin unfolds. However, the memory of it is preserved in the two instability gaps located in the folds of the stability boundary with locally strong curvature, Fig. 15(c). At some values of δ and ν one can penetrate the fold of the stability boundary with the change of Ω̃, as shown in Fig. 15(c) by the dashed line. For such δ and ν the flutter instability is localized in the vicinity of Ω̃ = 0.
The phenomenon of the local subcritical flutter instability is controlled by the eigenvalues of the matrix D. When both of them are positive, the folds of the stability boundary are more pronounced if one of the eigenvalues is close to zero. If one of the eigenvalues is negative and the other is positive, the local subcritical flutter instability is possible for any combination of δ and ν, including the case when the non-conservative positional forces are absent (ν = 0). The instability mechanism behind the squealing disc brake or the singing wine glass can be described as the emersion (or activation), due to indefinite damping and non-conservative positional forces, of the bubbles of instability created by the full dissipation in the vicinity of the nodes of the spectral mesh.

Conclusions

Investigation of stability and sensitivity analysis of the critical parameters and critical frequencies of near-Hamiltonian and near-reversible systems is complicated by the singularities of the boundary of the asymptotic stability domain, which


are related to multiple eigenvalues. In the paper we have developed methods of approximation of the stability boundaries near the singularities and obtained estimates of the critical values of parameters in the case of an arbitrary number of degrees of freedom, using the perturbation theory of eigenvalues and eigenvectors of non-self-adjoint operators. In the case of two degrees of freedom the domain of asymptotic stability of near-reversible and near-Hamiltonian systems is fully described and its typical configurations are found. Bifurcation of the stability domain due to a change of the matrix of dissipative forces is discovered and described. Two classes of indefinite damping matrices are found, and the explicit threshold separating the weakly and strongly indefinite matrices is derived. The role of dissipative and non-conservative forces in the paradoxical effects of gyroscopic stabilization of statically unstable potential systems, as well as of destabilization of statically stable ones, is clarified. Finally, the mechanism of subcritical flutter instability in rotating elastic bodies of revolution in frictional contact, exciting oscillations in the squealing disc brake and in the singing wine glass, is established.

Acknowledgments

The author is grateful to Professor P. Hagedorn for his interest in this work and for useful discussions.

References

1. W. Thomson and P. G. Tait, Treatise on Natural Philosophy, Vol. 1, Part 1, New Edition, Cambridge Univ. Press, Cambridge, 1879.
2. E. J. Routh, A treatise on the stability of a given state of motion, Macmillan, London, 1892.
3. A. L. Kimball, Internal friction theory of shaft whirling, Phys. Rev., 21(6) (1923), p. 703.
4. D. M. Smith, The motion of a rotor carried by a flexible shaft in flexible bearings, Proc. Roy. Soc. Lond. A, 142 (1933), pp. 92–118.
5. P. L. Kapitsa, Stability and transition through the critical speed of fast rotating shafts with friction, Zh. Tekh. Fiz., 9 (1939), pp. 124–147.
6. H. Bilharz, Bemerkung zu einem Satze von Hurwitz, Z. angew. Math. Mech., 24(2) (1944), pp. 77–82.
7. M. G. Krein, A generalization of some investigations of linear differential equations with periodic coefficients, Doklady Akad. Nauk SSSR N.S., 73 (1950), pp. 445–448.
8. H. Ziegler, Die Stabilitätskriterien der Elastomechanik, Ing.-Arch., 20 (1952), pp. 49–56.
9. E. O. Holopainen, On the effect of friction in baroclinic waves, Tellus, 13(3) (1961), pp. 363–367.


10. V. V. Bolotin, Non-conservative Problems of the Theory of Elastic Stability, Pergamon, Oxford, 1963.
11. G. Herrmann and I. C. Jong, On the destabilizing effect of damping in nonconservative elastic systems, ASME J. of Appl. Mechs., 32(3) (1965), pp. 592–597.
12. V. M. Lakhadanov, On stabilization of potential systems, Prikl. Mat. Mekh., 39(1) (1975), pp. 53–58.
13. A. V. Karapetyan, On the stability of nonconservative systems, Vestn. MGU. Ser. Mat. Mekh., 4 (1975), pp. 109–113.
14. W. Hauger, Stability of a gyroscopic nonconservative system, Trans. ASME, J. Appl. Mech., 42 (1975), pp. 739–740.
15. I. P. Andreichikov and V. I. Yudovich, The stability of visco-elastic rods, Izv. Acad. Nauk SSSR. MTT, 1 (1975), pp. 150–154.
16. V. N. Tkhai, On stability of mechanical systems under the action of position forces, PMM U.S.S.R., 44 (1981), pp. 24–29.
17. V. I. Arnold, Geometrical Methods in the Theory of Ordinary Differential Equations, Springer, New York and Berlin, 1983.
18. A. S. Deif, P. Hagedorn, Matrix polynomials subjected to small perturbations, Z. angew. Math. Mech., 66 (1986), pp. 403–412.
19. M. B. Sevryuk, Reversible systems, Lecture Notes in Mathematics 1211, Springer, Berlin, 1986.
20. N. V. Banichuk, A. S. Bratus, A. D. Myshkis, Stabilizing and destabilizing effects in nonconservative systems, PMM U.S.S.R., 53(2) (1989), pp. 158–164.
21. S. Barnett, Leverrier's algorithm: a new proof and extensions, SIAM J. Matrix Anal. Appl., 10(4) (1989), pp. 551–556.
22. A. P. Seyranian, Destabilization paradox in stability problems of nonconservative systems, Advances in Mechanics, 13(2) (1990), pp. 89–124.
23. R. S. MacKay, Movement of eigenvalues of Hamiltonian equilibria under non-Hamiltonian perturbation, Phys. Lett. A, 155 (1991), pp. 266–268.
24. H. Langer, B. Najman, K. Veselić, Perturbation of the eigenvalues of quadratic matrix polynomials, SIAM J. Matrix Anal. Appl., 13(2) (1992), pp. 474–489.
25. G. Haller, Gyroscopic stability and its loss in systems with two essential coordinates, Intern. J. of Nonl. Mechs., 27 (1992), pp. 113–127.
26. A. N. Kounadis, On the paradox of the destabilizing effect of damping in nonconservative systems, Intern. J. of Nonl. Mechs., 27 (1992), pp. 597–609.
27. V. F. Zhuravlev, Nutational vibrations of a free gyroscope, Izv. Ross. Akad. Nauk, Mekh. Tverd. Tela, 6 (1992), pp. 13–16.
28. A. M. Bloch, P. S. Krishnaprasad, J. E. Marsden, T. S. Ratiu, Dissipation-induced instabilities, Annales de l'Institut Henri Poincaré, 11(1) (1994), pp. 37–90.
29. J. Maddocks and M. L. Overton, Stability theory for dissipatively perturbed Hamiltonian systems, Comm. Pure and Applied Math., 48 (1995), pp. 583–610.
30. I. Hoveijn and M. Ruijgrok, The stability of parametrically forced coupled oscillators in sum resonance, Z. angew. Math. Phys., 46 (1995), pp. 384–392.
31. S. H. Crandall, The effect of damping on the stability of gyroscopic pendulums, Z. angew. Math. Phys., 46 (1995), pp. 761–780.
32. V. V. Beletsky, Some stability problems in applied mechanics, Appl. Math. Comp., 70 (1995), pp. 117–141.


33. O. M. O'Reilly, N. K. Malhotra, N. S. Namachchivaya, Some aspects of destabilization in reversible dynamical systems with application to follower forces, Nonlin. Dyn., 10 (1996), pp. 63–87.
34. D. R. Merkin, Introduction to the Theory of Stability, Springer, Berlin, 1997.
35. P. Freitas, M. Grinfeld, P. A. Knight, Stability of finite-dimensional systems with indefinite damping, Adv. Math. Sci. Appl., 7(1) (1997), pp. 437–448.
36. W. Kliem and P. C. Müller, Gyroscopic stabilization of indefinite damped systems, Z. angew. Math. Mech., 77(1) (1997), pp. 163–164.
37. R. Hryniv and P. Lancaster, On the perturbation of analytic matrix functions, Integral Equations and Operator Theory, 34(3) (1999), pp. 325–338.
38. R. Hryniv and P. Lancaster, Stabilization of gyroscopic systems, Z. angew. Math. Mech., 81(10) (2001), pp. 675–681.
39. V. V. Bolotin, A. A. Grishko, M. Yu. Panov, Effect of damping on the postcritical behavior of autonomous non-conservative systems, Intern. J. of Nonl. Mechs., 37 (2002), pp. 1163–1179.
40. K. Popp, M. Rudolph, M. Kröger, M. Lindner, Mechanisms to generate and to avoid friction induced vibrations, VDI-Berichte 1736, VDI-Verlag, Düsseldorf, 2002.
41. P. Lancaster, A. S. Markus, F. Zhou, Perturbation theory for analytic matrix functions: The semisimple case, SIAM J. Matrix Anal. Appl., 25(3) (2003), pp. 606–626.
42. N. Hoffmann and L. Gaul, Effects of damping on mode-coupling instability in friction induced oscillations, Z. angew. Math. Mech., 83 (2003), pp. 524–534.
43. O. N. Kirillov, How do small velocity-dependent forces (de)stabilize a non-conservative system?, DCAMM Report 681, Lyngby, 2003.
44. A. P. Seiranyan and O. N. Kirillov, Effect of small dissipative and gyroscopic forces on the stability of nonconservative systems, Doklady Physics, 48(12) (2003), pp. 679–684.
45. W. F. Langford, Hopf meets Hamilton under Whitney's umbrella, in IUTAM Symposium on Nonlinear Stochastic Dynamics. Proceedings of the IUTAM symposium, Monticello, IL, USA, August 26–30, 2002, Solid Mech. Appl. 110, S. N. Namachchivaya et al., eds., Kluwer, Dordrecht, 2003, pp. 157–165.
46. A. P. Ivanov, The stability of mechanical systems with positional nonconservative forces, J. Appl. Maths. Mechs., 67(5) (2003), pp. 625–629.
47. A. P. Seyranian and A. A. Mailybaev, Multiparameter stability theory with mechanical applications, World Scientific, Singapore, 2003.
48. N. M. Bou-Rabee, J. E. Marsden, L. A. Romero, Tippe Top inversion as a dissipation-induced instability, SIAM J. Appl. Dyn. Sys., 3 (2004), pp. 352–377.
49. H. K. Moffatt, Y. Shimomura, M. Branicki, Dynamics of an axisymmetric body spinning on a horizontal surface. I. Stability and the gyroscopic approximation, Proc. Roy. Soc. Lond. A, 460 (2004), pp. 3643–3672.
50. O. N. Kirillov, Destabilization paradox, Doklady Physics, 49(4) (2004), pp. 239–245.
51. O. N. Kirillov, A. P. Seyranian, Collapse of the Keldysh chains and stability of continuous nonconservative systems, SIAM J. Appl. Math., 64(4) (2004), pp. 1383–1407.
52. O. N. Kirillov and A. P. Seyranian, Stabilization and destabilization of a circulatory system by small velocity-dependent forces, J. Sound and Vibr., 283(3–5) (2005), pp. 781–800.


53. O. N. Kirillov, A theory of the destabilization paradox in non-conservative systems, Acta Mechanica, 174(3–4) (2005), pp. 145–166.
54. O. N. Kirillov and A. P. Seyranian, Instability of distributed nonconservative systems caused by weak dissipation, Doklady Mathematics, 71(3) (2005), pp. 470–475.
55. O. N. Kirillov and A. O. Seyranian, The effect of small internal and external damping on the stability of distributed non-conservative systems, J. Appl. Math. Mech., 69(4) (2005), pp. 529–552.
56. P. Lancaster, P. Psarrakos, On the pseudospectra of matrix polynomials, SIAM J. Matrix Anal. Appl., 27(1) (2005), pp. 115–120.
57. A. P. Seyranian, O. N. Kirillov, A. A. Mailybaev, Coupling of eigenvalues of complex matrices at diabolic and exceptional points, J. Phys. A: Math. Gen., 38(8) (2005), pp. 1723–1740.
58. O. N. Kirillov, A. A. Mailybaev, A. P. Seyranian, Unfolding of eigenvalue surfaces near a diabolic point due to a complex perturbation, J. Phys. A: Math. Gen., 38(24) (2005), pp. 5531–5546.
59. N. M. Bou-Rabee, J. E. Marsden, L. A. Romero, A geometric treatment of Jellet's egg, Z. angew. Math. Mech., 85(9) (2005), pp. 618–642.
60. A. A. Mailybaev, O. N. Kirillov, A. P. Seyranian, Berry phase around degeneracies, Dokl. Math., 73(1) (2006), pp. 129–133.
61. T. Butlin, J. Woodhouse, Studies of the sensitivity of brake squeal, Appl. Mech. and Mater., 5–6 (2006), pp. 473–479.
62. R. Krechetnikov and J. E. Marsden, On destabilizing effects of two fundamental non-conservative forces, Physica D, 214 (2006), pp. 25–32.
63. O. N. Kirillov, Gyroscopic stabilization of non-conservative systems, Phys. Lett. A, 359(3) (2006), pp. 204–210.
64. U. Günther, O. N. Kirillov, A Krein space related perturbation theory for MHD alpha-2 dynamos and resonant unfolding of diabolical points, J. Phys. A: Math. Gen., 39 (2006), pp. 10057–10076.
65. V. Kobelev, Sensitivity analysis of the linear nonconservative systems with fractional damping, Struct. Multidisc. Optim., 33 (2007), pp. 179–188.
66. R. Krechetnikov and J. E. Marsden, Dissipation-induced instabilities in finite dimensions, Rev. Mod. Phys., 79 (2007), pp. 519–553.
67. U. von Wagner, D. Hochlenert, P. Hagedorn, Minimal models for disk brake squeal, J. Sound Vibr., 302(3) (2007), pp. 527–539.
68. O. N. Kirillov, On the stability of nonconservative systems with small dissipation, J. Math. Sci., 145(5) (2007), pp. 5260–5270.
69. A. N. Kounadis, Flutter instability and other singularity phenomena in symmetric systems via combination of mass distribution and weak damping, Int. J. of Non-Lin. Mech., 42(1) (2007), pp. 24–35.
70. O. N. Kirillov, Destabilization paradox due to breaking the Hamiltonian and reversible symmetry, Int. J. of Non-Lin. Mech., 42(1) (2007), pp. 71–87.
71. G. Spelsberg-Korspeter, D. Hochlenert, O. N. Kirillov, P. Hagedorn, In- and out-of-plane vibrations of a rotating plate with frictional contact: Investigations on squeal phenomena, Trans. ASME, J. Appl. Mech., (2007) (submitted).
72. J.-J. Sinou and L. Jezequel, Mode coupling instability in friction-induced vibrations and its dependency on system parameters including damping, Eur. J. Mech. A, 26 (2007), pp. 106–122.


73. P. Kessler, O. M. O'Reilly, A.-L. Raphael, M. Zworski, On dissipation-induced destabilization and brake squeal: A perspective using structured pseudospectra, J. Sound Vibr., 308 (2007), pp. 1–11.
74. O. N. Kirillov, Gyroscopic stabilization in the presence of non-conservative forces, Dokl. Math., 76(2) (2007), pp. 780–785.
75. G. Spelsberg-Korspeter, O. N. Kirillov, P. Hagedorn, Modeling and stability analysis of an axially moving beam with frictional contact, Trans. ASME, J. Appl. Mech., 75(3) (2008), 031001.
76. O. N. Kirillov, Subcritical flutter in the acoustics of friction, Proc. R. Soc. A, 464 (2008), pp.
77. J. Kang, C. M. Krousgrill, F. Sadeghi, Dynamic instability of a thin circular plate with friction interface and its application to disc brake squeal, J. Sound Vibr. (2008).

Block triangular miniversal deformations of matrices and matrix pencils Lena Klimenko1 and Vladimir V. Sergei huk2 1

Information and Computer Centre of the Ministry of Labour and So ial Poli y of Ukraine, Esplanadnaya 8/10, Kiev, Ukraine [email protected]

2

Institute of Mathemati s, Teresh henkivska 3, Kiev, Ukraine [email protected]

Abstract. For each square complex matrix, V. I. Arnold constructed a normal form with the minimal number of parameters to which all matrices B that are close enough to this matrix can be reduced by similarity transformations that smoothly depend on the entries of B. Analogous normal forms were also constructed for families of complex matrix pencils by A. Edelman, E. Elmroth, and B. Kågström, and for contragredient matrix pencils (i.e., matrix pairs considered up to transformations (A, B) ↦ (S⁻¹AR, R⁻¹BS)) by M. I. Garcia-Planas and V. V. Sergeichuk. In this paper we give other normal forms for families of matrices, matrix pencils, and contragredient matrix pencils; our normal forms are block triangular.

Keywords: canonical forms, matrix pencils, versal deformations, perturbation theory.

1 Introduction

The reduction of a matrix to its Jordan form is an unstable operation: both the Jordan form and the reduction transformations depend discontinuously on the entries of the original matrix. Therefore, if the entries of a matrix are known only approximately, then it is unwise to reduce it to Jordan form. Furthermore, when investigating a family of matrices smoothly depending on parameters, then although each individual matrix can be reduced to its Jordan form, it is unwise to do so since in such an operation the smoothness relative to the parameters is lost. For these reasons, Arnold [1] constructed a miniversal deformation of any Jordan canonical matrix J; that is, a family of matrices in a neighborhood of J with the minimal number of parameters, to which all matrices M close to J can be reduced by similarity transformations that smoothly depend on the entries of M (see Definition 1). Miniversal deformations were also constructed for:

70

L. Klimenko, V. V. Sergeichuk

(i) the Kronecker canonical form of complex matrix pencils by Edelman, Elmroth, and Kågström [9]; another miniversal deformation (which is simple in the sense of Definition 2) was constructed by Garcia-Planas and Sergeichuk [10];
(ii) the Dobrovol'skaya and Ponomarev canonical form of complex contragredient matrix pencils (i.e., of matrices of counter linear operators U ⇄ V) in [10].

Belitskii [4] proved that each Jordan canonical matrix J is permutationally similar to some matrix J#, which is called a Weyr canonical matrix and possesses the property that all matrices that commute with J# are block triangular. Due to this property, J# plays a central role in Belitskii's algorithm for reducing the matrices of any system of linear mappings to canonical form, see [5, 11].

In this paper, we find another property of Weyr canonical matrices: they possess block triangular miniversal deformations (in the sense of Definition 2). Therefore, if we consider, up to smooth similarity transformations, a family of matrices that are close enough to a given square matrix, then we can take it in its Weyr canonical form J# and the family in the form J# + E, in which E is block triangular. We also give block triangular miniversal deformations of those canonical forms of pencils and contragredient pencils that are obtained from (i) and (ii) by replacing the Jordan canonical matrices with the Weyr canonical matrices.

All matrices that we consider are complex matrices.
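The discontinuity of the Jordan form is easy to observe numerically: perturbing the nilpotent Jordan block J₂(0) by ε in its lower-left entry moves the double zero eigenvalue to ±√ε, so an error of order 10⁻¹⁰ in the data shifts the eigenvalues by 10⁻⁵. A minimal sketch (an illustration only, not one of the paper's constructions):

```python
import numpy as np

# J_2(0): a single 2x2 Jordan block with eigenvalue 0
J = np.array([[0.0, 1.0],
              [0.0, 0.0]])

eps = 1e-10
E = np.zeros((2, 2))
E[1, 0] = eps                      # tiny perturbation in the lower-left entry

eig_J = np.linalg.eigvals(J)       # both eigenvalues are exactly 0
eig_JE = np.linalg.eigvals(J + E)  # they jump to +/- sqrt(eps) = +/- 1e-5

print(eig_J)
print(sorted(eig_JE.real))         # approximately [-1e-05, 1e-05]
```

The square-root growth of the eigenvalue perturbation is exactly the non-smooth behavior that miniversal deformations are designed to avoid.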

2 Miniversal deformations of matrices

Definition 1 (see [1–3]). A deformation of an n-by-n matrix A is a matrix function A(α₁, ..., α_k) (its arguments α₁, ..., α_k are called parameters) on a neighborhood of \vec{0} = (0, ..., 0) that is holomorphic at \vec{0} and equals A at \vec{0}. Two deformations of A are identified if they coincide on a neighborhood of \vec{0}. A deformation A(α₁, ..., α_k) of A is versal if all matrices A + E in some neighborhood of A reduce to the form

\[
A(h_1(E), \dots, h_k(E)) = S(E)^{-1}(A + E)S(E), \qquad S(0) = I_n,
\]

in which S(E) is a matrix function of the entries of E that is holomorphic at zero. A versal deformation with the minimal number of parameters is called miniversal.

a deformation

A

of

A

be represented in the form

A+

Blo k triangular miniversal deformations of matri es and matrix pen ils – –

71

If k entries of B(α1 , . . . , αk ) are the independent parameters α1 , . . . , αk and the others are zero then the deformation A is alled simple3. A simple deformation is blo k triangular with respe t to some partition of A into blo ks if B(α1 , . . . , αk) is blo k triangular with respe t to the

onformal partition and ea h of its blo ks is either 0 or all of its entries are independent parameters.

If A(α₁, ..., α_k) is a miniversal deformation of A and S⁻¹AS = B for some nonsingular S, then S⁻¹A(α₁, ..., α_k)S is a miniversal deformation of B. Therefore, it suffices to construct miniversal deformations of canonical matrices for similarity. Let

\[
J(\lambda) := J_{n_1}(\lambda) \oplus \dots \oplus J_{n_l}(\lambda), \qquad n_1 \geqslant n_2 \geqslant \dots \geqslant n_l, \tag{1}
\]

be a Jordan canonical matrix with a single eigenvalue equal to λ; the units of the Jordan blocks are written over the diagonal:

\[
J_{n_i}(\lambda) := \begin{pmatrix} \lambda & 1 & & 0 \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda \end{pmatrix} \quad (n_i\text{-by-}n_i).
\]

For each pair of natural numbers p and q, define the p × q matrix

\[
T_{pq} := \begin{pmatrix} * & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ * & 0 & \cdots & 0 \end{pmatrix} \ \text{if } p < q, \qquad
T_{pq} := \begin{pmatrix} 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \\ * & \cdots & * \end{pmatrix} \ \text{if } p \geqslant q, \tag{2}
\]

in which the stars denote independent parameters (alternatively, we may take T_{pq} with p = q as in the case p < q).
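Each block T_{pq} carries min(p, q) independent parameters, so a block matrix built from the blocks T_{n_i,n_j} has Σ_{i,j} min(n_i, n_j) parameters in total; for n₁ ≥ n₂ ≥ ⋯ this sum equals Arnold's count n₁ + 3n₂ + 5n₃ + ⋯. A quick computational check (a sketch; the helper names are ours):

```python
import numpy as np

def T_block(p, q):
    """Star pattern of T_pq from (2); a 1 marks an independent parameter."""
    T = np.zeros((p, q), dtype=int)
    if p < q:
        T[:, 0] = 1      # stars fill the first column
    else:
        T[-1, :] = 1     # stars fill the last row
    return T

def deformation_parameters(ns):
    """Total number of stars in the block matrix [T_{n_i, n_j}]."""
    return sum(int(T_block(p, q).sum()) for p in ns for q in ns)

ns = (4, 2, 1)    # hypothetical block sizes with n_1 >= n_2 >= n_3
arnold = sum((2 * i - 1) * n for i, n in enumerate(ns, 1))
print(deformation_parameters(ns), arnold)   # both equal 15
```

Since T_block(p, q).sum() is min(p, q), the two counts agree for every nonincreasing size tuple.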

Theorem 1 ([3, §30, Theorem 2]). (i) Let J(λ) be a Jordan canonical matrix of the form (1) with a single eigenvalue equal to λ. Let H := [T_{n_i,n_j}] be the parameter block matrix partitioned conformally to J(λ) with the blocks T_{n_i,n_j} defined in (2). Then

\[
J(\lambda) + H \tag{3}
\]

³ Arnold's miniversal deformations presented in Theorem 1 are simple. Moreover, by [10, Corollary 2.1] the set of matrices of any quiver representation (i.e., of any finite system of linear mappings) over C or R possesses a simple miniversal deformation.


is a simple miniversal deformation of J(λ).
(ii) Let

\[
J := J(\lambda_1) \oplus \dots \oplus J(\lambda_\tau), \qquad \lambda_i \neq \lambda_j \ \text{if } i \neq j, \tag{4}
\]

be a Jordan canonical matrix in which every J(λ_i) is of the form (1), and let J(λ_i) + H_i be its miniversal deformation (3). Then

\[
J + K := (J(\lambda_1) + H_1) \oplus \dots \oplus (J(\lambda_\tau) + H_\tau) \tag{5}
\]

is a simple miniversal deformation of J.

Definition 3 ([13]). The Weyr canonical form J# of a Jordan canonical matrix J (and of any matrix that is similar to J) is defined as follows.
(i) If J has a single eigenvalue, then we write it in the form (1). Permute the first columns of J_{n_1}(λ), J_{n_2}(λ), ..., and J_{n_l}(λ) into the first l columns, then permute the corresponding rows. Next permute the second columns of all blocks of size at least 2 × 2 into the next columns and permute the corresponding rows; and so on. The obtained matrix is the Weyr canonical form J(λ)# of J(λ).
(ii) If J has distinct eigenvalues, then we write it in the form (4). The Weyr canonical form of J is

\[
J^{\#} := J(\lambda_1)^{\#} \oplus \dots \oplus J(\lambda_\tau)^{\#}. \tag{6}
\]

Each direct summand of (6) has the form

\[
J(\lambda)^{\#} = \begin{pmatrix} \lambda I_{s_1} & I_{s_2} & & 0 \\ & \lambda I_{s_2} & \ddots & \\ & & \ddots & I_{s_k} \\ 0 & & & \lambda I_{s_k} \end{pmatrix}, \tag{7}
\]

in which s_i is the number of Jordan blocks J_l(λ) of size l ≥ i in J(λ). The sequence (s₁, s₂, ..., s_k) is called the Weyr characteristic of J (and of any matrix that is similar to J) for the eigenvalue λ, see [12]. By [4] or [11, Theorem 1.2], all matrices commuting with J# are block triangular.

In the next lemma we construct a miniversal deformation of J# that is block triangular with respect to the coarsest partition of J# for which all diagonal blocks have the form λ_i I and each off-diagonal block is 0 or I. This means that the sizes of the diagonal blocks of (7) with respect to this partition form the sequence obtained from

s_k, s_{k-1} − s_k, ..., s_2 − s_3, s_1 − s_2,
s_k, s_{k-1} − s_k, ..., s_2 − s_3,
..................
s_k, s_{k-1} − s_k,
s_k


by removing the zero members.
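For instance, for J = J₂(λ) ⊕ J₁(λ) the permutation of Definition 3(i) takes the basis order (1, 2, 3) to (1, 3, 2), and the resulting matrix has the shape (7) with Weyr characteristic (s₁, s₂) = (2, 1). A small numerical check (the value λ = 5 is arbitrary):

```python
import numpy as np

lam = 5.0
# J = J_2(lam) (+) J_1(lam): Jordan form with a single eigenvalue
J = np.array([[lam, 1.0, 0.0],
              [0.0, lam, 0.0],
              [0.0, 0.0, lam]])

# Definition 3(i): the first columns of both blocks (columns 1 and 3)
# come first, then the remaining column of the 2x2 block
perm = [0, 2, 1]
P = np.eye(3)[:, perm]          # permutation matrix
J_weyr = P.T @ J @ P

# shape (7) with s_1 = 2, s_2 = 1:
# [ lam*I_2   I_1-over-0 ]
# [   0       lam*I_1    ]
expected = np.array([[lam, 0.0, 1.0],
                     [0.0, lam, 0.0],
                     [0.0, 0.0, lam]])
assert np.array_equal(J_weyr, expected)
```

The check s₁ = dim ker(J# − λI) = 2 also holds, since J# − λI has rank 1 here.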

Theorem 2. (i) Let J(λ) be a Jordan canonical matrix of the form (1) with a single eigenvalue equal to λ. Let J(λ) + H be its miniversal deformation (3). Denote by

\[
J(\lambda)^{\#} + H^{\#} \tag{8}
\]

the parameter matrix obtained from J(λ) + H by the permutations described in Definition 3(i). Then J(λ)# + H# is a miniversal deformation of J(λ)# and its matrix H# is lower block triangular.
(ii) Let J be a Jordan canonical matrix represented in the form (4) and let J# be its Weyr canonical form. Let us apply the permutations described in (i) to each of the direct summands of the miniversal deformation (5) of J. Then the obtained matrix

\[
J^{\#} + K^{\#} := (J(\lambda_1)^{\#} + H_1^{\#}) \oplus \dots \oplus (J(\lambda_\tau)^{\#} + H_\tau^{\#}) \tag{9}
\]

is a miniversal deformation of J#, which is simple and block triangular (in the sense of Definition 2).

Let us prove this theorem. The form of J(λ)# + H# and the block triangularity of H# become clearer if we carry out the permutations from Definition 3(i) in two steps.

First step. Let us write the sequence n₁, n₂, ..., n_l from (1) in the form

\[
\underbrace{m_1, \dots, m_1}_{r_1\ \text{times}},\ \underbrace{m_2, \dots, m_2}_{r_2\ \text{times}},\ \dots,\ \underbrace{m_t, \dots, m_t}_{r_t\ \text{times}}, \tag{10}
\]

where m₁ > m₂ > ⋯ > m_t. Partition J(λ) into t horizontal and t vertical strips of sizes r₁m₁, r₂m₂, ..., r_t m_t (each of them contains Jordan blocks of the same size), produce the described permutations within each of these strips, and obtain

\[
J(\lambda)^{+} := J_{m_1}(\lambda I_{r_1}) \oplus \dots \oplus J_{m_t}(\lambda I_{r_t}),
\]

in which

\[
J_{m_i}(\lambda I_{r_i}) := \begin{pmatrix} \lambda I_{r_i} & I_{r_i} & & 0 \\ & \lambda I_{r_i} & \ddots & \\ & & \ddots & I_{r_i} \\ 0 & & & \lambda I_{r_i} \end{pmatrix} \quad (m_i \text{ diagonal blocks}). \tag{11}
\]


By the same permutations of rows and columns of J(λ) + H, reduce H to

\[
H^{+} := [\widetilde{T}_{m_i,m_j}(r_i, r_j)],
\]

in which every \widetilde{T}_{m_i,m_j}(r_i, r_j) is obtained from the matrix T_{m_i,m_j} defined in (2) by replacing each entry 0 with the r_i × r_j zero block and each entry ∗ with the r_i × r_j block

\[
\star := \begin{pmatrix} * & \cdots & * \\ \vdots & & \vdots \\ * & \cdots & * \end{pmatrix}. \tag{12}
\]

For example, if

\[
J(\lambda) = \underbrace{J_4(\lambda) \oplus \dots \oplus J_4(\lambda)}_{p\ \text{times}} \oplus \underbrace{J_2(\lambda) \oplus \dots \oplus J_2(\lambda)}_{q\ \text{times}}, \tag{13}
\]

then

\[
J(\lambda)^{+} = J_4(\lambda I_p) \oplus J_2(\lambda I_q) =
\begin{array}{c|cccccc}
 & (1,1) & (1,2) & (1,3) & (1,4) & (2,1) & (2,2) \\ \hline
(1,1) & \lambda I_p & I_p & 0 & 0 & 0 & 0 \\
(1,2) & 0 & \lambda I_p & I_p & 0 & 0 & 0 \\
(1,3) & 0 & 0 & \lambda I_p & I_p & 0 & 0 \\
(1,4) & 0 & 0 & 0 & \lambda I_p & 0 & 0 \\
(2,1) & 0 & 0 & 0 & 0 & \lambda I_q & I_q \\
(2,2) & 0 & 0 & 0 & 0 & 0 & \lambda I_q
\end{array} \tag{14}
\]

A strip is indexed by (i, j) if it contains the j-th strip of J_{m_i}(λI_{r_i}). Correspondingly,

\[
H^{+} =
\begin{array}{c|cccccc}
 & (1,1) & (1,2) & (1,3) & (1,4) & (2,1) & (2,2) \\ \hline
(1,1) & 0 & 0 & 0 & 0 & 0 & 0 \\
(1,2) & 0 & 0 & 0 & 0 & 0 & 0 \\
(1,3) & 0 & 0 & 0 & 0 & 0 & 0 \\
(1,4) & \star & \star & \star & \star & \star & \star \\
(2,1) & \star & 0 & 0 & 0 & 0 & 0 \\
(2,2) & \star & 0 & 0 & 0 & \star & \star
\end{array} \tag{15}
\]

Second step. We permute in J(λ)⁺ the first vertical strips of J_{m_1}(λI_{r_1}), J_{m_2}(λI_{r_2}), ..., J_{m_t}(λI_{r_t}) into the first t vertical strips and permute the corresponding horizontal strips, then permute the second vertical strips into the next vertical strips and permute the corresponding horizontal strips; we continue the process until J(λ)# is achieved. The same permutations transform H⁺ to H#.


For example, applying these permutations to (14) and (15), we obtain

\[
J(\lambda)^{\#} =
\begin{array}{c|cccccc}
 & (1,1) & (2,1) & (1,2) & (2,2) & (1,3) & (1,4) \\ \hline
(1,1) & \lambda I_p & 0 & I_p & 0 & 0 & 0 \\
(2,1) & 0 & \lambda I_q & 0 & I_q & 0 & 0 \\
(1,2) & 0 & 0 & \lambda I_p & 0 & I_p & 0 \\
(2,2) & 0 & 0 & 0 & \lambda I_q & 0 & 0 \\
(1,3) & 0 & 0 & 0 & 0 & \lambda I_p & I_p \\
(1,4) & 0 & 0 & 0 & 0 & 0 & \lambda I_p
\end{array} \tag{16}
\]

and

\[
H^{\#} =
\begin{array}{c|cccccc}
 & (1,1) & (2,1) & (1,2) & (2,2) & (1,3) & (1,4) \\ \hline
(1,1) & 0 & 0 & 0 & 0 & 0 & 0 \\
(2,1) & \star & 0 & 0 & 0 & 0 & 0 \\
(1,2) & 0 & 0 & 0 & 0 & 0 & 0 \\
(2,2) & \star & \star & 0 & \star & 0 & 0 \\
(1,3) & 0 & 0 & 0 & 0 & 0 & 0 \\
(1,4) & \star & \star & \star & \star & \star & \star
\end{array} \tag{17}
\]
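The two-step construction can be checked numerically for this example with, say, p = 2 and q = 1 (values chosen arbitrarily; the helper names are ours): permuting the strips of J(λ)⁺ = J₄(λI₂) ⊕ J₂(λI₁) from the order (1,1), (1,2), (1,3), (1,4), (2,1), (2,2) into (1,1), (2,1), (1,2), (2,2), (1,3), (1,4) reproduces the block pattern of J(λ)# in (16).

```python
import numpy as np

def J_strip(m, r, lam):
    """J_m(lam * I_r) from (11): lam on the diagonal and identity
    blocks on the block superdiagonal."""
    return lam * np.eye(m * r) + np.kron(np.eye(m, k=1), np.eye(r))

p, q, lam = 2, 1, 3.0              # hypothetical sizes for example (13)
n = 4 * p + 2 * q
J_plus = np.zeros((n, n))
J_plus[:4 * p, :4 * p] = J_strip(4, p, lam)   # J_4(lam I_p)
J_plus[4 * p:, 4 * p:] = J_strip(2, q, lam)   # J_2(lam I_q)

# strip index ranges in J(lam)^+, ordered (1,1),(1,2),(1,3),(1,4),(2,1),(2,2)
strips = {(1, 1): range(0, 2), (1, 2): range(2, 4), (1, 3): range(4, 6),
          (1, 4): range(6, 8), (2, 1): range(8, 9), (2, 2): range(9, 10)}
# second-step order from (19)
order = [(1, 1), (2, 1), (1, 2), (2, 2), (1, 3), (1, 4)]
perm = [i for s in order for i in strips[s]]
J_sharp = J_plus[np.ix_(perm, perm)]

# expected block pattern of (16) for p = 2, q = 1
E = lam * np.eye(n)
E[0:2, 3:5] = np.eye(2)   # row strip (1,1): I_p in column strip (1,2)
E[2, 5] = 1.0             # row strip (2,1): I_q in column strip (2,2)
E[3:5, 6:8] = np.eye(2)   # row strip (1,2): I_p in column strip (1,3)
E[6:8, 8:10] = np.eye(2)  # row strip (1,3): I_p in column strip (1,4)
assert np.array_equal(J_sharp, E)
```

The permutation list works out to [0, 1, 8, 2, 3, 9, 4, 5, 6, 7], i.e. exactly the interleaving of the two Jordan strips described in the second step.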

Proof of Theorem 2. (i) Following (14), we index the vertical (horizontal) strips of J(λ)⁺ in (11) by pairs of natural numbers as follows: a strip is indexed by (i, j) if it contains the j-th strip of J_{m_i}(λI_{r_i}). The pairs that index the strips of J(λ)⁺ form the sequence

(1, 1), (1, 2), ..., (1, m_t), ..., (1, m_2), ..., (1, m_1),
(2, 1), (2, 2), ..., (2, m_t), ..., (2, m_2),
······························
(t, 1), (t, 2), ..., (t, m_t),   (18)

which is ordered lexicographically. Rearranging the pairs by the columns of (18):

(1, 1), (2, 1), ..., (t, 1); ...; (1, m_t), (2, m_t), ..., (t, m_t); ...; (1, m_1)   (19)

(i.e., as in lexicographic ordering but starting from the second elements of the pairs) and making the same permutation of the corresponding strips in J(λ)⁺ and H⁺, we obtain J(λ)# and H#; see examples (16) and (17). The ((i, j), (i′, j′))-th entry of H⁺ is a star if and only if

either i ≤ i′ and j = m_i, or i > i′ and j′ = 1.   (20)

By (10), in these cases j ≥ j′, and if j = j′ then either j = j′ = m_i and i = i′, or j = j′ = 1 and i > i′. Therefore, H# is lower block triangular.
(ii) This statement follows from (i) and Theorem 1(ii). ⊓⊔


Remark 1. Let J(λ) be a Jordan matrix with a single eigenvalue, let m₁ > m₂ > ⋯ > m_t be the distinct sizes of its Jordan blocks, and let r_i be the number of Jordan blocks of size m_i. Then the deformation J(λ)# + H# from Theorem 2 can be formally constructed as follows:

– J(λ)# and H# are matrices of the same size; they are conformally partitioned into horizontal and vertical strips, which are indexed by the pairs (19).
– The ((i, j), (i, j))-th diagonal block of J(λ)# is λI_{r_i}, its ((i, j), (i, j + 1))-th block is I_{r_i}, and its other blocks are zero.
– The ((i, j), (i′, j′))-th block of H⁺ has the form (12) if and only if (20) holds; its other blocks are zero.

3 Miniversal deformations of matrix pencils

By Kronecker's theorem on matrix pencils (see [6, Sect. XII, §4]), each pair of m × n matrices reduces by equivalence transformations

\[
(A, B) \mapsto (S^{-1}AR,\ S^{-1}BR), \qquad S \text{ and } R \text{ nonsingular},
\]

to a Kronecker canonical pair (A_kr, B_kr) being a direct sum, uniquely determined up to permutation of summands, of pairs of the form

\[
(I_r, J_r(\lambda)), \quad (J_r(0), I_r), \quad (F_r, G_r), \quad (F_r^T, G_r^T),
\]

in which λ ∈ C and

\[
F_r := \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ & & 1 \\ 0 & \cdots & 0 \end{pmatrix}, \qquad
G_r := \begin{pmatrix} 0 & \cdots & 0 \\ 1 & & \\ & \ddots & \\ 0 & & 1 \end{pmatrix} \tag{21}
\]

are matrices of size r × (r − 1) with r ≥ 1.

Definitions 1 and 2 are extended to matrix pairs in a natural way. Miniversal deformations of (A_kr, B_kr) were obtained in [9, 10]. The deformation obtained in [10] is simple; in this section we reduce it to block triangular form by permutations of rows and columns. For this purpose, we replace in (A_kr, B_kr)

– the direct sum (I, J) of all pairs of the form (I_r, J_r(λ)) by the pair (I, J#), and
– the direct sum (J(0), I) of all pairs of the form (J_r(0), I_r) by the pair (J(0)#, I),


in which J# and J(0)# are the Weyr matrices from Definition 3. We obtain a canonical matrix pair of the form

\[
\bigoplus_{i=1}^{l} (F_{p_i}^T, G_{p_i}^T) \oplus (I, J^{\#}) \oplus (J(0)^{\#}, I) \oplus \bigoplus_{i=1}^{r} (F_{q_i}, G_{q_i}), \tag{22}
\]

in which we suppose that

\[
p_1 \leqslant \dots \leqslant p_l, \qquad q_1 \geqslant \dots \geqslant q_r. \tag{23}
\]

(This special ordering of the direct summands of (22) makes it possible to construct a miniversal deformation of it that is block triangular.)

Denote by

\[
0^{\uparrow} := \begin{pmatrix} * & \cdots & * \\ & 0 & \end{pmatrix}, \quad
0^{\downarrow} := \begin{pmatrix} & 0 & \\ * & \cdots & * \end{pmatrix}, \quad
0^{\leftarrow} := \begin{pmatrix} * & & \\ \vdots & 0 & \\ * & & \end{pmatrix}, \quad
0^{\rightarrow} := \begin{pmatrix} & & * \\ & 0 & \vdots \\ & & * \end{pmatrix}
\]

the matrices in which the entries of the first row, the last row, the first column, and the last column, respectively, are stars and the other entries are zero, and write

\[
Z := \begin{pmatrix} * & \cdots & * & 0 & \cdots & 0 \\ & & 0 & & & \end{pmatrix}
\]

(the number of zeros in the first row of Z is equal to the number of rows). The stars denote independent parameters.

In the following theorem we give a simple miniversal deformation of (22) that is block triangular with respect to the partition of (22) in which J# and J(0)# are partitioned as in Theorem 2 and all blocks of (F_{p_i}^T, G_{p_i}^T) and (F_{q_i}, G_{q_i}) are 1-by-1.
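The singular blocks F_r and G_r of (21), used throughout Theorem 3 below, are one-liners in NumPy (a sketch with our naming; the rank check illustrates that for every t the r × (r − 1) pencil F_r + t·G_r has full column rank, which is characteristic of these Kronecker summands):

```python
import numpy as np

def F(r):
    """F_r from (21): identity in the top r-1 rows, zero last row."""
    return np.eye(r, r - 1)

def G(r):
    """G_r from (21): zero first row, identity in the bottom r-1 rows."""
    return np.eye(r, r - 1, k=-1)

print(F(3))   # [[1,0],[0,1],[0,0]]
print(G(3))   # [[0,0],[1,0],[0,1]]

# F_r + t*G_r has full column rank r-1 for every value of t
for t in (0.0, 1.0, -2.5):
    assert np.linalg.matrix_rank(F(3) + t * G(3)) == 2
```

With r = 1 both functions return empty 1 × 0 matrices, matching the convention that the smallest singular summands are trivial.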

Theorem 3. Let (A, B) be a canonical matrix pair of the form (22) satisfying (23). One of the block triangular simple miniversal deformations of (A, B) has the form (\mathcal{A}, \mathcal{B}), in which

\[
\mathcal{A} := \begin{pmatrix}
F_{p_1}^T & & & & & & & \\
& \ddots & & & & & & \\
& & F_{p_l}^T & & & & & \\
& & & I & & & & \\
0^{\rightarrow} & \cdots & 0^{\rightarrow} & 0 & J(0)^{\#} + H^{\#} & & & \\
& & & 0^{\downarrow} & 0^{\downarrow} & F_{q_1} & & \\
& & & \vdots & \vdots & \ddots & \ddots & \\
& & & 0^{\downarrow} & 0^{\downarrow} & 0^{\downarrow} & \cdots & F_{q_r}
\end{pmatrix} \tag{24}
\]

L. Klimenko, V. V. Sergeichuk

and

B̄ := \begin{bmatrix}
G^T_{p_1} & & & & & & & \\
Z^T & G^T_{p_2} & & & & & & \\
\vdots & \ddots & \ddots & & & & & \\
Z^T & \cdots & Z^T & G^T_{p_l} & & & & \\
0^{←} & 0^{←} & \cdots & 0^{←} & J_\#{+}K_\# & & & \\
& & & & 0 & I & & \\
0^{↑} & \cdots & & 0^{↑} & & 0 & G_{q_1} & \\
0^{↑} & & & 0^{↑} & & & Z & \ddots \\
0^{↑} & \cdots & & 0^{↑} & & & Z \ \cdots \ Z & G_{q_r}
\end{bmatrix}    (25)

(all blocks that are not displayed are zero), where J(0)_\# + H_\# and J_\# + K_\# are the block triangular miniversal deformations (8) and (9).

Proof. The following miniversal deformation of matrix pairs was obtained in [10]. The matrix pair (22) is equivalent to its Kronecker canonical form

(A_kr, B_kr) := \bigoplus_{i=1}^{r} (F_{q_i}, G_{q_i}) \oplus (I, J) \oplus (J(0), I) \oplus \bigoplus_{i=1}^{l} (F^T_{p_i}, G^T_{p_i}).

By [10, Theorem 4.1], one of the simple miniversal deformations of (A_kr, B_kr) has the form (Ā_kr, B̄_kr), in which

Ā_kr := \begin{bmatrix}
F_{q_r} & & & & & & \\
0^{↓} & F_{q_{r-1}} & & & & & \\
\vdots & \ddots & \ddots & & & & \\
0^{↓} & \cdots & 0^{↓} & F_{q_1} & & & \\
& & & & I & & \\
0^{→} & 0^{→} & \cdots & 0^{→} & 0 & J(0){+}H & \\
& & & & & & F^T_{p_l} \\
& & & & & & & \ddots \\
& & & & & & & & F^T_{p_1}
\end{bmatrix}

and

B̄_kr := \begin{bmatrix}
G_{q_r} & Z & \cdots & Z & 0^{↑} & & & \\
& G_{q_{r-1}} & \ddots & \vdots & 0^{↑} & & & \\
& & \ddots & Z & \vdots & & & \\
& & & G_{q_1} & 0^{↑} & & & \\
& & & & J{+}K & 0^{←} & \cdots & 0^{←} \\
& & & & & I & & \\
& & & & & & G^T_{p_l} & Z^T \cdots Z^T \\
& & & & & & & \ddots \\
& & & & & & & & G^T_{p_1}
\end{bmatrix}

(all blocks that are not displayed are zero).

In view of Theorem 2, the deformation (Ā_kr, B̄_kr) is permutationally equivalent to the deformation (Ā, B̄) from Theorem 3. (The blocks H and K in (Ā_kr, B̄_kr) are lower block triangular; because of this we reduce (Ā_kr, B̄_kr) to (Ā, B̄), which is lower block triangular.) ⊓⊔

Remark 2. Constructing J(λ)_#, we for each r join all r-by-r Jordan blocks J_r(λ) of J(λ) into J_r(λI); see (11). We can analogously join pairs of equal sizes in (22) and obtain a pair of the form

\bigoplus_{i=1}^{l'} (\hat F^T_{p'_i}, \hat G^T_{p'_i}) \oplus (I, J_\#) \oplus (J(0)_\#, I) \oplus \bigoplus_{i=1}^{r'} (\hat F_{q'_i}, \hat G_{q'_i}),    (26)

in which p'_1 < ⋯ < p'_{l'} and q'_1 > ⋯ > q'_{r'}. This pair is permutationally equivalent to (22). Producing the same permutations of rows and columns in (24) and (25), we join all F^T_p, G^T_p, F_q, G_q into \hat F^T_p, \hat G^T_p, \hat F_q, \hat G_q, and 0, 0^↑, 0^↓, 0^←, 0^→, Z into \hat 0, \hat 0^↑, \hat 0^↓, \hat 0^←, \hat 0^→, \hat Z, which consist of the blocks 0 and ⋆ defined in (12); the resulting pair is a block triangular miniversal deformation of (26).

4

Miniversal deformations of contragredient matrix pencils

Each pair of m × n and n × m matrices reduces by transformations of contragredient equivalence

(A, B) ↦ (S^{-1}AR, R^{-1}BS),    S and R nonsingular,

to the Dobrovol'skaya and Ponomarev canonical form [7] (see also [8]), which is a direct sum, uniquely determined up to permutation of summands, of pairs of the form

(I_r, J_r(λ)),  (J_r(0), I_r),  (F_r, G^T_r),  (F^T_r, G_r),    (27)


in which λ ∈ C and the matrices F_r and G_r are defined in (21). For each matrix M, define the matrices

M^△ := \begin{bmatrix} 0 & \cdots & 0 \\ & M & \end{bmatrix}, \qquad
M^▷ := \begin{bmatrix} & & 0 \\ & M & \vdots \\ & & 0 \end{bmatrix}

that are obtained by adding a zero row on the top and a zero column on the right, respectively. Each block matrix whose blocks have the form T^△ (in which T is defined in (2)) is denoted by H^△. Each block matrix whose blocks have the form T^▷ is denoted by H^▷.

Theorem 4.

Let

(I, J) ⊕ (A, B)    (28)

be a canonical matrix pair for contragredient equivalence, in which J is a nonsingular Jordan canonical matrix,

(A, B) := \bigoplus_{i=1}^{l} (F_{p_i}, G^T_{p_i}) \oplus (I, J(0)) \oplus (J'(0), I) \oplus \bigoplus_{i=1}^{r} (F^T_{q_i}, G_{q_i}),

J(0) and J'(0) are Jordan matrices with the single eigenvalue 0, and

p_1 ≥ p_2 ≥ ⋯ ≥ p_l,    q_1 ≤ q_2 ≤ ⋯ ≤ q_r.

Then one of the simple miniversal deformations of (28) has the form

(I, J + K) ⊕ (Ā, B̄),    (29)

in which J + K is the deformation (5) of J and (Ā, B̄) is the following deformation of (A, B):

Ā := \begin{bmatrix}
F_{p_1} & T & \cdots & T & & H^{△} & & \\
& F_{p_2} & \ddots & \vdots & & H^{△} & & \\
& & \ddots & T & & \vdots & & \\
& & & F_{p_l} & & H^{△} & & \\
& & & & I & 0 & & \\
& & & & 0 & J'(0){+}H & & \\
& & & & & H & G^T_{q_1} & T \cdots T \\
& & & & & H^{▷} & & \ddots \\
& & & & & H & & & G^T_{q_r}
\end{bmatrix}

and

B̄ := \begin{bmatrix}
G^T_{p_1}{+}T & & & & H & & & \\
T & G^T_{p_2}{+}T & & & H & & & \\
\vdots & \ddots & \ddots & & \vdots & & & \\
T & \cdots & T & G^T_{p_l}{+}T & H & & & \\
& & & & J(0){+}H & 0 & H^{▷} & H \cdots H \\
& & & & 0 & I & & \\
& & & & H & & F_{q_1}{+}T & \\
& & & & H^{△} & & T & \ddots \\
& & & & H & & T \cdots T & F_{q_r}{+}T
\end{bmatrix}

(all blocks that are not displayed are zero).

Proof. The following simple miniversal deformation of (28) was obtained in [10, Theorem 5.1]: up to obvious permutations of strips, it has the form

(I, J + K) ⊕ (A', B'),    (30)

in which J + K is (5), and

A' := \begin{bmatrix}
F_{p_1}{+}T & T & \cdots & T & & H & & \\
& F_{p_2}{+}T & \ddots & \vdots & & H & & \\
& & \ddots & T & & \vdots & & \\
& & & F_{p_l}{+}T & & H & & \\
& & & & I & 0 & & \\
& & & & 0 & J'(0){+}H & & \\
& & & & & H & G^T_{q_1} & T \cdots T \\
& & & & & \vdots & & \ddots \\
& & & & & H & & & G^T_{q_r}
\end{bmatrix},

B' := \begin{bmatrix}
G^T_{p_1} & & & & H & & & \\
T & G^T_{p_2} & & & H & & & \\
\vdots & \ddots & \ddots & & \vdots & & & \\
T & \cdots & T & G^T_{p_l} & H & & & \\
& & & & J(0){+}H & 0 & H & \cdots H \\
& & & & 0 & I & & \\
& & & & H & & F_{q_1}{+}T & \\
& & & & \vdots & & T & \ddots \\
& & & & H & & T \cdots T & F_{q_r}{+}T
\end{bmatrix}

(all blocks that are not displayed are zero).


Let (C, D) be the canonical pair (28), and let (P, Q) be any matrix pair of the same size in which each entry is 0 or ∗. By [10, Theorem 2.1] (see also the beginning of the proof of Theorem 5.1 in [10]), (C + P, D + Q) is a versal (respectively, miniversal) deformation of (C, D) if and only if for every pair (M, N) of the size of (C, D) there exist square matrices S and R and a pair (respectively, a unique pair) (P̄, Q̄), obtained from (P, Q) by replacing its stars with complex numbers, such that

(M, N) + (CR − SC, DS − RD) = (P̄, Q̄).    (31)

The matrices of (C, D) are block diagonal:

C = C_1 ⊕ C_2 ⊕ ⋯ ⊕ C_t,    D = D_1 ⊕ D_2 ⊕ ⋯ ⊕ D_t,

in which (C_i, D_i) are of the form (27). Partitioning conformally the matrices of (M, N) and (P, Q) and equating the corresponding blocks in (31), we find that (C + P, D + Q) is a versal deformation of (C, D) if and only if for each pair of indices (i, j) and every pair (M_ij, N_ij) of the size of (P_ij, Q_ij) there exist matrices S_ij and R_ij and a pair (P̄_ij, Q̄_ij), obtained from (P_ij, Q_ij) by replacing its stars with complex numbers, such that

(M_ij, N_ij) + (C_i R_ij − S_ij C_j, D_i S_ij − R_ij D_j) = (P̄_ij, Q̄_ij).    (32)

Let (C + P', D + Q') be the deformation (30) of (C, D). Since it is versal, for each pair of indices (i, j) and every pair (M_ij, N_ij) of the size of (P'_ij, Q'_ij) there exist matrices S_ij and R_ij and a pair (P̄'_ij, Q̄'_ij), obtained from (P'_ij, Q'_ij) by replacing its stars with complex numbers, such that

(M_ij, N_ij) + (C_i R_ij − S_ij C_j, D_i S_ij − R_ij D_j) = (P̄'_ij, Q̄'_ij).    (33)

Let (C + P, D + Q) be the deformation (29). In order to prove that it is versal, let us verify condition (32). If (P_ij, Q_ij) = (P'_ij, Q'_ij), then (32) holds by (33). Let (P_ij, Q_ij) ≠ (P'_ij, Q'_ij) for some (i, j). Since condition (33) holds, it suffices to verify that for each (P̄'_ij, Q̄'_ij), obtained from (P'_ij, Q'_ij) by replacing its stars with complex numbers, there exist matrices S and R and a pair (P̄_ij, Q̄_ij), obtained from (P_ij, Q_ij) by replacing its stars with complex numbers, such that

(P̄'_ij, Q̄'_ij) + (C_i R − S C_j, D_i S − R D_j) = (P̄_ij, Q̄_ij).    (34)

The following five cases are possible.


Case 1: (Ci , Di ) = (Fp , GTp ) and i = j. Then ′ ′ (Pii , Qii )

= (T, 0) =

0 α1 · · · αp−1

,0

(we denote by T any matrix obtained from T by repla ing its stars with

omplex numbers). Taking

0

αp−1 .. S := . α 2 α1

in (34), we obtain

..

0

.

.. .. . . .. .. .. . . . . α2 . . αp−1 0

0 . αp−1 . . .. .. .. . . . R := . . . . . . α . . . 3

,

0

. α2 α3 . . αp−1 0

αp−1

(Pii , Qii ) = 0, ...

α1

0 = (0, T ).

Case 2: (Ci , Di ) = (Fp , GTp ) and (Cj , Dj ) = (Im , Jm (0)). Then

′ ′ (Pij , Qij ) = (0, T ). Taking S := −T△ and R := 0 in (34), we obtain (Pij , Qij ) = (T△ , 0). Case 3: (Ci , Di ) = (Im , Jm (0)) and (Cj , Dj ) = (Jn (0), In ). Then (Pij′ , Qij′ ) = (0, T ). Taking S := 0 and R := T in (34), we obtain (Pij , Qij ) = (T, 0). Case 4: (Ci , Di ) = (Im , Jm (0)) and (Gj , Dj ) = (GTq , Fq ). Then (Pij′ , Qij′ ) = (0, T ). Taking S := 0 and R := T⊲ in (34), we obtain (Pij , Qij ) = (T⊲ , 0). Case 5: (Ci , Di ) = (Jn (0), In ) and (Gj , Dj ) = (Fp , GTp , ). Then (Pij′ , Qij′ ) = (T, 0). Taking S := T⊲ and R := 0 in (34), we obtain (Pij , Qij ) = (0, T⊲ ).

We have proved that the deformation (29) is versal. It is miniversal sin e it ⊓ ⊔ has the same number of parameters as the miniversal deformation (30).

Remark 3. The deformation (I, J + K) ⊕ (A, B) from Theorem 4 an be made blo k triangular by the following permutations of its rows and olumns, whi h are transformations of ontragredient equivalen e: – First, we redu e (I, J + K) to the form (I, J# + K# ), in whi h J# + K# is

de ned in (9). – Se ond, we redu e the diagonal blo k J(0) + H in B to the form J(0)# + H# (de ned in (8)) by the permutations of rows and olumns of B des ribed in De nition 3. Then we make the ontragredient permutations of rows and

olumns of A.

84

L. Klimenko, V. V. Sergei huk

– Finally, we redu e the diagonal blo k J ′ (0)+H in A to the form J ′ (0)# +H# (de ned in (8)) by the permutations of rows and olumns of A des ribed in

De nition 3, and make the ontragredient permutations of rows and olumns of B. The obtained deformation J ′ (0)# + H# is lower blo k triangular, we make it upper blo k triangular by transformations P(J ′ (0)# + H# )P,

0

1

P := · · · 1 0

(i.e., we rearrange in the inverse order the rows and olumns of A that ross J ′ (0)# +H# and make the ontragredient permutations of rows and olumns of B).

References 1. V. I. Arnold, On matri es depending on parameters, Russian Math. Surveys, 26 (no. 2) (1971), pp. 29{43. 2. V. I. Arnold, Le tures on bifur ations in versal families, Russian Math. Surveys, 27 (no. 5) (1972), pp. 54{123. 3. V. I. Arnold, Geometri al Methods in the Theory of Ordinary Dierential Equations, Springer-Verlag, New York, 1988. 4. G. R. Belitskii, Normal forms in a spa e of matri es, in Analysis in In niteDimensional Spa es and Operator Theory, V. A. Mar henko, ed., Naukova Dumka, Kiev, 1983, pp. 3-15 (in Russian). 5. G. R. Belitskii, Normal forms in matrix spa es, Integral Equations Operator Theory, 38 (2000), pp. 251{283. 6. F. R. Gantma her, Matrix Theory, Vol. 2, AMS Chelsea Publishing, Providen e, RI, 2000. 7. N. M. Dobrovol'skaya and V. A. Ponomarev, A pair of ounter operators, Uspehi Mat. Nauk, 20 (no. 6) (1965), pp. 80{86. 8. R. A. Horn and D. I. Merino, Contragredient equivalen e: a anoni al form and some appli ations, Linear Algebra Appl., 214 (1995), pp. 43{92. m, A geometri approa h to per9. A. Edelman, E. Elmroth, and B. K agstro turbation theory of matri es and matrix pen ils. Part I: Versal deformations, SIAM J. Matrix Anal. Appl., 18 (1997), pp. 653{692. 10. M. I. Gar ia-Planas and V. V. Sergei huk, Simplest miniversal deformations of matri es, matrix pen ils, and ontragredient matrix pen ils, Linear Algebra Appl., 302{303 (1999), pp. 45{61 (some misprints of this paper were orre ted in its preprint arXiv:0710.0946). 11. V. V. Sergei huk, Canoni al matri es for linear matrix problems, Linear Algebra Appl., 317 (2000), pp. 53{102. 12. H. Shapiro, The Weyr hara teristi , Amer. Math. Monthly, 106 (1999), pp. 919{ 929. 13. E. Weyr, Repartition des matri es en espe es et formation de toutes les espe es, C. R. A ad. S i. Paris, 100 (1885), pp. 966{969.

Determining the Schein rank of Boolean matrices Evgeny E. Mareni h⋆ Murmansk State Pedagogi University [email protected]

Abstract. In this paper we present some results of S hein rank of Boolean matri es. A notion of the interse tion number of a bipartite graph is de ned and its appli ations to S hein rank of Boolean matri es are derived. We dis uss minimal and maximal matri es of given S hein rank, the number of m × n Boolean matri es with given S hein rank. The S hein ranks of some m × n Boolean matri es are determined. In the last se tion, we give some further result on erning the S hein rank of Boolean matri es.

Keywords: Boolean matrix, S hein rank, oding fun tions for bipartite

graphs.

1

Introduction

The following are des ribed in Se tions 2 and 3: 1. the set of all m × n minimal Boolean matri es of S hein rank k; 2. the set of all m × n maximal Boolean matri es of S hein rank 2,3; 3. some maximal m × n Boolean matri es of S hein rank k. In Se tion 4 we de ne the interse tion number of a bipartite graph Γ and prove that the interse tion number is equal to the minimum number of maximal

omplete bipartite subgraphs whose union in ludes all edges of Γ . In Se tion 5 we de ne a k- anoni al family CS(k) of bipartite graphs, obtain the family CS(2) and some graphs in the family CS(3). In Se tion 6, we apply the interse tion number and anoni al families to determining the S hein rank of Boolean matri es. In parti ular, formulas for the number of all m × n Boolean matri es of S hein rank k are obtained. In Se tion 7, oding of bipartite graphs is studied. In Se tion 8, we de ne the bipartite interse tion graphs and investigate the S hein rank of asso iated matri es. In Se tion 9, we give some further result on erning the S hein rank of Boolean matri es. ⋆

This resear h is ondu ted in a

ordan e with the Themati plan of Russian Federal Edu ational Agen y, theme №1.03.07.

86

2

E. E. Mareni h

The Schein rank of Boolean matrices

Our notation and terminology are similar to those of [1℄, [4℄. We olle t in this se tion a number of result and de nitions required latter. Where possible we state simple orollaries of these results without proof. We olle t in this se tion a number of results and de nitions required latter. A detailed treatment may be found in [3℄, [4℄. Let U be a nite set, 2U be the olle tion of all subsets of U. The number of elements in U is denoted by |U|. Let Bul(U) = (2U , ⊆) be the Boolean algebra (or poset) of all subsets of a nite set U partially ordered by in lusion. Let Bul(k) be the Boolean algebra of all subsets of a nite set of k elements. Let P = {e0, e1} be a two-element Boolean latti e with the greatest element e1 and the least element e0. The latti e operations meet ∧ and join ∨ are de ned as follows: ∧ |e 0e 1 ∨ |e 0e 1

e 0 |e 0e 0 e 1 |e 0e 1

e 0 |e 0e 1 e 1 |e 1e 1

Following [4℄, we re all some de nitions. Let Pm×n denote the set of all m × n (Boolean) matri es with entries in P. Matri es with all entries in P will be denoted by ROMAN apitals A = kaij km×n , B = kbij km×n , C = kcij km×n , X = ||xij km×n , . . .. Then the usual de nitions for addition and multipli ation of matri es over eld are applied to Boolean matri es as well. The n×n identity matrix E = En×n is the matrix su h that eij =

e 1, if i = j, e 0, if i 6= j.

Denote by En×n the n × n matrix with e0' entries on the main diagonal and e 1 elsewhere. The m × n zero matrix 0m×n is the matrix all of whose entries are e0. The 1. m × n universal matrix Jm×n is the matrix all of whose entries are e The transpose of A will be denoted by A(t) . De ne a partial ordering 6 on Pm×n by A 6 B i aij 6 bij for all i, j. Let A(r) (A(r) ) denote the rth olumn (row) of A. A subspa e of Pm×1 (P1×n ) is a subset of Pm×1 (P1×n ) ontaining the zero ve tor and losed under addition. A olumn spa e Column(A) of a matrix A is the span of the set of all olumns of A. Likewise one has a row spa e Row(A) of A. De nitions of the olumn rank rankc (A) (row rank rankr (A)) of A is due to Kim, [4℄.

Determining the S hein rank of Boolean matri es

87

Theorem 1 (Kim, Roush, [3℄). Let A ∈ Pm×n , A 6= 0m×n .

Then the following

onditions are equivalent: (i) ranks (A) = k. (ii) k is the least integer su h that A is a produ t of an m × k matrix and an k × n matrix. (iii) k is the smallest dimension of a subspa e W su h that W ontains the

olumn spa e Column(A) (row spa e Row(A)). Example. We have Column(En×n ) = Pn×1 . From Theorem 1 (iii), it follows

that

ranks (En×n ) = n.

The following theorem is due to Kim [4℄.

Let A ∈ Pm×n . Then: (i) ranks (A) = ranks (A(t) ).

Theorem 2.

(ii) ranks (A) 6 min{rankc (A), rankr (A)}. (iii) ranks (A) 6 min{m, n}. (iv) If Column(A) 6 Column(B) then ranks (A) 6 ranks (B). Corollary 1.

Let A ∈ Pm×n . If B is a submatrix of A then ranks (B) 6 ranks (A).

Corollary 2.

Let A1 , . . . , Ak ∈ Pm×n . Then

ranks (A1 + A2 + . . . + Ak ) 6 ranks (A1 ) + ranks (A2 ) + . . . + ranks (Ak ).

Let A1 , . . . , Ak be Boolean matri es. If the produ t A1 A2 . . . Ak is de ned, then

Corollary 3.

ranks (A1 A2 . . . Ak ) 6 ranks (Ai ), i = 1, . . . , k, ranks (A1 A2 . . . Ak ) 6 min{ranks (A1 ), ranks (A2 ), . . . , ranks (Ak )}.

Example. If A ∈ Pn×n is invertible, then ranks (A) = n. A square matrix is alled a permutation matrix if every row and every olumn

ontains only one e1.

Corollary 4.

es. Then

Let A ∈ Pm×n and π ∈ Pm×m , σ ∈ Pn×n be permutation matriranks (πA) = ranks (Aσ) = ranks (A).

Corollary 5.

Let A ∈ Pn×n . Then ranks (A) > ranks (A2 ) > ranks (A3 ) > . . . .

88

3

E. E. Mareni h

Matrices of Schein rank 2, 3

Let A ∈ Pm×n . By ρ(A) denote the number of e1's in A. By Chrk (m, n) we denote the set of all matrix A ∈ Pm×n su h that ranks (A) = k, where min{m, n} > k. The term e1-rank of a matrix A ∈ Pm×n is the maximum number of e1's entries of A no two of whi h share a row or olumn of A. We denote the term e 1-rank of A by ρt (A). By Konig theorem [1℄, it follows that the e 1-term rank of A is the minimum number of rows and olumns of A ontaining all e 1's entries of A. An element a of a poset (Q, 6) is maximal if whenever a 6 x, then a = x. We dually de ne minimal elements. The set of all minimal matri es in (Chrk (m, n), 6) is des ribed in the following theorem. Theorem 3. Let m, n > k. A matrix A is minimal ρ(A) = k and A has a k × k permutation submatrix.

in

(Chrk (m, n), 6)

i

Proof. If ρ(A) = k and A has a k× k permutation submatrix, then A is minimal

in (Chrk (m, n), 6). Let C ∈ Chrk (m, n). We rst show that ρt (C) > k. Suppose, to the ontrary, that ρt (C) < k. By Konig theorem [1℄, it follows that ρt (C) rows and olumns of C ontaining all e1's entries of C. We see that ranks (C) 6 ρt (C) < k. This is a ontradi tion sin e ρt (C) > k. Therefore exists a matrix A ∈ Pm×n su h that A 6 C, ρ(A) = k and A has a k × k permutation submatrix. ⊓ ⊔ The number of all minimal matri es in (Chrk (m, n), 6) is n(n − 1) . . . (n − k + 1)m(m − 1) . . . (m − k + 1).

Let ∆k ∈ Pk×k have the following form:

e 1 e 0 ∆k = e 0 e 0

e 1 1 ... e e 1 ... e 1 e 1 ... e 1 . ... ... e 0 e 0 ... e 1 e 1 e 1 e 0

From [5℄ it follows that ranks (∆k ) = k, k > 1. Let ∼ be the equivalen e relation on Pm×n de ned by B ∼ C i C = πBσ for some permutation matri es π ∈ Pm×m , σ ∈ Pn×n . Now we obtain some maximal matri es in (Chrk (m, n), 6).

Determining the S hein rank of Boolean matri es

89

Theorem 4. Let A ∈ Pm×n . If there exists a submatrix B of A su h that B ∼ ∆k and B ontains all e 0's entries of A, then A is maximal in (Chrk (m, n), 6 ).

Proof. We have ranks (A) = ranks (∆k ) = k. It suÆ es to show that ∆k is maximal in (Chrk (k, k), 6). Let B be obtained from ∆k by repla ing a sele tion of the e0's by e1's. Let r be the least integer su h that bir 6= (∆k )ir for some i. Then B(r) is a span of some rows B(i) , i 6= r. Therefore ranks (B) < k. ⊓ ⊔ The set of all maximal matri es of (Chr2 (m, n), 6) is des ribed in the following theorem. Theorem 5. Let A ∈ Pm×n and m, n > 2. Then (Chr2 (m, n), 6) i only one entry of A is e 0.

a matrix

A

is maximal in

The number of all maximal elements in the poset (Chr2 (m, n), 6) is nm. The set of all maximal matri es in the poset (Chr3 (m, n), 6) is des ribed in the following theorem.

Theorem 6. Let A ∈ Pm×n and m, n > 3. A matrix A is maximal (Chr3 (m, n), 6) i there exists a submatrix B of A su h that B ∼ ∆3 0's entries of A. B ∼ E3×3 , and B ontains all e

in or

Proof. Let C ∈ Chr3 (m, n). By Konig theorem, it follows that the e0-term rank of C is the minimum number of rows and olumns of C ontaining all e0's entries of C. By Konig theorem the proof is now divided into following ases. Case 1: there exist three e0's entrees su h that no two e0 entries share a row or olumn of A. The matrix A obtained from C by repla ing other e0 entrees by e 1 is maximal in (Chr3 (m, n), 6). Case 2: there exist two rows and olumns of C ontaining all e0 entries of 0's en-tries of C, then C. If there exists a row (a olumn) of C ontaining all e ranks (C) 6 2, whi h is a ontradi tion. Therefore there exist a row and a olumn of C ontaining all e0's entries of C. Case 2.1: there exist two olumns of C ontaining all e0's entries of C. Then there exists a submatrix B of C su h that ranks (B) = ranks (C) = 3 and ea h row of B is a row of the matrix

e 0 e 1 D=e 0 e 1

e 0 e 0 e 1 e 1

e 1 e 1 . e 1 e 1

By onsidering all matri es B su h that ranks (B) = 3, we on lude the proof in this ase.

90

E. E. Mareni h

Case 2.2: there exist a row and a olumn of C ontaining all e0 entries of C. It is easy to see that ranks (C) = ranks (A) for some matrix A su h that ρ(A) = n − 3 and A has a submatrix B su h that B ∼ ∆3 . ⊓ ⊔

Remark. The matrix Ek×k is not maximal in (Chrk (k, k), 6) for k > 5. 4

On coding of bipartite graphs by sets

Let Γ = Γ (V1 ∪ V2 , E) be a bipartite graph with bipartition V1 = {1, 2, 3, . . .}, V2 = {1 ′ , 2 ′ , 3 ′ , . . .} and U a nite set. A fun tion f : V1 ∪ V2 → 2U is alled U- oding fun tion for Γ if for any verti es v1 , v2 onditions {v1 , v2 } ∈ E and f(v1 ) ∩ f(v2 ) 6= ∅ are equivalent. We

all f(v) the ode of v ∈ V1 ∪ V2 . Note that there exist oding fun tions for any bipartite graph Γ . The interse tion number nintbp (Γ ) of a bipartite graph Γ = Γ (V1 ∪ V2 , E) is the least number |U| su h that there exists a U- oding fun tion for Γ . Note that every maximal omplete bipartite subgraph has at least one edge. The following example lari es the above de nitions.

Example. Let Γ1 be the following graph: 1s

2s

3s

4s

1′

2′

3′

4′

@ @ @ @ @ @ @ @ @ s @s @s @s

Γ1 :

Then some maximal omplete bipartite subgraphs of Γ1 are: 1

2

2

3

3

4

1′

2′

2′

3′

3′

4′

s s @ @ @ s @s ,

s s @ @ @ s @s ,

s s @ @ @ s @s .

In the following theorem we show that the interse tion number nintbp (Γ ) of a bipartite graph is losely onne ted to the set of all omplete bipartite subgraphs of Γ . Γ = Γ (V1 ∪ V2 , E) be a bipartite graph. The interse tion number nintbp (Γ ) is equal to the minimum number of maximal omplete bipartite subgraphs whose union in ludes all edges of Γ .

Theorem 7.

Let

Determining the S hein rank of Boolean matri es

Proof. Let nintbp (Γ ) {1, . . . , k}. De ne sets

91

= k and f be a U- oding fun tion for Γ , where U =

Vr = {v | v ∈ V1 ∪ V2 , r ∈ f(v)}, r = 1, . . . , k.

Note that Vr 6= ∅, r = 1, . . . , k. Let Γr be a subgraph su h that Vr is the set of verti es of Γr . Then Γr is a omplete bipartite subgraph of Γ . The union of subgraphs Γ1 , . . . , Γk in ludes all edges of Γ . Any subgraph Γ1 , . . . , Γk is ontained in some maximal omplete bipartite subgraph. Therefore the minimum number of maximal omplete bipartite subgraphs whose union in ludes all edges of Γ is less than or equal to k = nintbp (Γ ). Let the minimum number of omplete bipartite subgraphs whose union in ludes all edges of Γ is equal to k. Let the union of Γ1 , . . . , Γk in ludes all edges of Γ . For any v ∈ V1 ∪ V2 de ne the set f(v): r ∈ f(v) i v is a vertex of Γr . We now prove that f : V1 ∪ V2 → 2U is a U- oding fun tion for Γ . Let v1 ∈ V1 , v2 ∈ V2 and {v1 , v2 } ∈ E. Then {v1 , v2 } is an edge of some Γr . Therefore r ∈ f(v1 ), f(v2 ) and f(v1 ) ∩ f(v2 ) 6= ∅. Let v1 ∈ V1 , v2 ∈ V2 , and f(v1 )∩f(v2 ) 6= ∅. Then there exists r ∈ f(v1 ), f(vr ). Therefore {v1 , v2 } is an edge of Γr . Thus {v1 , v2 } ∈ E. We have proved that f : V1 ∪ V2 → 2U is a U- oding fun tion for Γ . Then nintbp (Γ ) is less than or equal to the minimum number of maximal omplete bipartite subgraphs whose union in ludes all edges of Γ . Thus nintbp (Γ ) equals the minimum number of maximal omplete bipartite ⊓ ⊔ subgraphs whose union in ludes all edges of Γ .

Example. The minimum number of maximal omplete bipartite subgraphs whose union in ludes all edges of Γ1 is equal to 3. Therefore nintbp (Γ1 ) = 3.

5

On canonical bipartite graphs

Let Γ = Γ (V1 ∪ V2 , E) be a bipartite graph with bipartition V1 = {1, 2, 3, . . .}, V2 = {1 ′ , 2 ′ , 3 ′ , . . .} and U a nite set. Denote by V^1 the set of all nonisolated verti es of V1 . In the same way, we de ne V^2 . De ne the following sets E(v) = {z | {v, z} ∈ E}, v ∈ V1 ∪ V2 .

Let ∼ be the equivalen e relation on V^1 ∪ V^2 de ned by: u ∼ v whenever E(u) = E(v). ′ ′ ′ Let Γc = Γ (V1 ∪ V2 , E ) be a bipartite graph with bipartition V1′ ,V2′ , where V1′ = V^1 /∼, V2′ = V^2 /∼ are quotient sets and E ′ is de ned by:

92

E. E. Mareni h {^i, ^j ′ } ∈ E ′ i {i, j} ∈ E.

We all Γc a anoni al representation of Γ .

Example. Consider the graph Γ and its anoni al representation Γc , Γ :

sH s s @H @ @HH@ H @ [email protected] [email protected] s @s Hs ,

Γc :

s s @ @ @ @s . s

For any bipartite graph Γ , the following statements are valid.

Lemma 1.

(i) nintbp (Γ ) = nintbp (Γc ). (ii) If nintbp (Γ ) = k, then k 6 V1′ , V2′ 6 2k − 1. Let CS(k) be the set of all nonisomorphi anoni al representations for bipartite graphs of interse tion number k. We all CS(k) a k- anoni al family. Any

anoni al representation of a bipartite graph is alled a anoni al graph.

Example.

1. The anoni al family CS(1) ontains the unique graph s

s.

2. The anoni al family CS(2) ontains four graphs s

s

s

s,

s s @ @ @ @s , s

s s s @ @ @ @ @ @ @s @s . s

s s @ @ @ @ @ @ s @s @s ,

3. In CS(3) we onsider all graphs with three verti es in bipartition V1′ , V2′ : s s s @ @ @ @ @ @ s @s @s , s s s @ @ @ @ @ @ @s , @s s

s

s s

s

s s @ @ @ s @s , s s

s

s,

s s s HH @ @ @HH@ H [email protected] @ H s Hs , @ @s s s s

s

s

s.

Determining the S hein rank of Boolean matri es

93

The anoni al family CS(k) give us all bipartite graphs Γ = Γ (V1 ∪ V2 , E) su h that nintbp (Γ ) = k. Let Fk (m, n) is the number of all bipartite graphs Γ = Γ (V1 ∪ V2 , E) su h that V1 = {1, . . . , m}, V2 = {1 ′ , 2 ′ , . . . , n ′ }, nintbp (Γ ) = k. We have (1)

F1 (m, n) = (2m − 1)(2n − 1), F1 (n, n) = (2n − 1)2 .

For the anoni al family CS(2), we obtain the following theorem. Theorem 8.

For all m, n > 1 (2)

F2 (m, n) = 23 (3m − 2 · 2m + 1)(3n − 2 · 2n + 1) + + 21 (3m − 2 · 2m + 1)(4n − 3 · 3n + 3 · 2n − + 12 (3n − 2 · 2n + 1)(4m − 3 · 3m + 3 · 2m − m m m n n n

+ 21 (4

−3·3

+3·2

In parti ular, for all n > 1 n

n

n

n

+ (3 − 2 · 2 + 1)(4 − 3 · 3 + 3 · 2 − 1) +

6

1) +

− 1)(4 − 3 · 3 + 3 · 2 − 1).

F2 (n, n) = 32 (3n − 2 · 2n + 1)2 + n

1) +

1 n 2 (4

(3) n

n

2

− 3 · 3 + 3 · 2 − 1) .

On the Schein rank of Boolean matrices and the intersection number of associated graphs

Let A ∈ Pm×n , U be a nite set. To a matrix A ∈ Pm×n asso iate a bipartite graph Γ (A) = Γ (V1 ∪ V2 , E) with bipartition V1 = {1, . . . , m}, V2 = {1 ′ , 2 ′ , . . . , n ′ } by taking aij = e1 if and only if there is an edge between i and j ′ . To a matrix A ∈ Pm×n asso iate a bipartite graph Γ (A) = Γ (V1 ∪ V2 , E) by taking bipartition V1 = {1, . . . , m}, V2 = {1 ′ , 2 ′ , . . . , n ′ } and a set of edges E su h that {i, j ′ } ∈ E if and only if aij = e1. The following theorem redu es the S hein rank problem for any matrix A to determining the interse tion number of Γ (A). Theorem 9.

The S hein rank of A equals the interse tion number of Γ (A).

Proof. We rst prove that nintbp (Γ ) 6 ranks (A). Let ranks (A) = k. Then A = C1 D1 + C2 D2 + . . . + Ck Dk

for some C1 , C2 , . . . , Ck ∈ Pm×1 , D1 , D2 , . . . , Dk ∈ P1×n . De ne sets: f(i) = {j | (Cj )(i) = e 1, j = 1, . . . , k}, i = 1, . . . , m, ′ (j) f(j ) = {i | (Di ) = e 1, i = 1, . . . , k}, j = 1, . . . , n.

94

E. E. Mareni h

Let f : V1 ∪V2 → 2U and U = {1, . . . , k} be a fun tion and a set. We now prove that f is a U- oding fun tion of Γ (A). The following statements are equivalent: – – – –

aij = e 1; e 1 = (Cr Dr )ij = (Cr )(i) ∧ (Dr )(j) for some r; there exists r su h that r ∈ f(i), r ∈ f(j ′ ); f(i) ∩ f(j ′ ) 6= ∅.

We have proved that aij = e1 i f(i) ∩ f(j ′ ) 6= ∅. Therefore f is a U- oding fun tion of Γ (A). Thus nintbp (Γ ) 6 k = ranks (A). We now prove that nintbp (Γ ) 6 ranks (A). Let nintbp (Γ ) = k and f is a U- oding fun tion for Γ (A). We have f : V1 ∪ V2 → 2U , where U = {1, . . . , k}. De ne olumn ve tors C1 , C2 , . . . , Ck ∈ Pm×1 by setting: (Cr )(i) = e 1 i r ∈ f(i), i = 1, . . . , m, r = 1, . . . , k. Similarly, de ne row ve tors D1 , D2 , . . . , Dk ∈ P1×n by setting: (Dr )(j) = e 1 i r ∈ f(j ′ ), j = 1, . . . , n, r = 1, . . . , k. We laim that A = C1 D1 +C2 D2 +. . .+Ck Dk . Indeed, the following statements are equivalent: – – – – – –

aij = e 1; f(i) ∩ f(j ′ ) 6= ∅; there exists r su h that r ∈ f(i), r ∈ f(j ′ ); there exists r su h that (Cr )(i) = e1, (Dr )(j) = e1; (Cr )(i) ∧ (Dr )(j) = (Cr Dr )ij = e 1 for some r; (C1 D1 + C2 D2 + . . . + Ck Dk )ij = e 1.

Therefore

A = C1 D1 + C2 D2 + . . . + Ck Dk , ranks (A) 6 k = nintbp (Γ ).

We have proved that ranks (A) = nintbp (Γ ).

⊓ ⊔

From Theorem 9 and [2℄, [Remark 6.7℄, we obtain the following orollary. Corollary 6 ([5℄). Let A ∈ Pm×n .

The S hein rank of A is equal to the minimum number of omplete bipartite subgraphs whose union in ludes all edges of Γ (A).

Example. Let A ∈ Pn×n have the following form:

e 1e 1e 0 ... e 0e 0 e 1e 1 ... e 0e 0 0e A = ··· ··· ··· eee 0 0 0 ... e 1e 1 e 1e 0e 0 ... e 0e 1

Determining the S hein rank of Boolean matri es

95

Then Γ (A) have the following form: Γ (A) :

s s s @ @ @ @ @@ @s s @s

s s @ @ @ @s s ···

··· ···

Note that Γ (A) has 2n edges and any maximal omplete bipartite subgraph

ontains two edges. Therefore the minimum number of maximal omplete bipartite subgraphs whose union in ludes all edges of Γ is n. Thus ranks (A) = n. The anoni al family CS(k) give us all matri es A ∈ Pm×n su h that ranks (A) = k. Theorem 10.

Let m, n > 1 and min{m, n} > k. Then |Chrk(m, n)| = Fk (m, n).

Proof. The number of all matri es A ∈ Pm×n su h that ranks (A) = k is equal to the number of all bipartite graphs Γ = Γ (V1 ∪V2 , E) su h that V1 = {1, . . . , m},

V2 = {1 ′ , 2 ′ , . . . , n ′ }, nintbp (Γ ) = k.

⊓ ⊔

The results of se tion 5 give us the formulas for |Chr1 (m, n)| and |Chr2 (m, n)|.

Example. 1. The number of all matri es A ∈ P2×2 su h that ranks (A) = k is

equal to Fk (2, 2). Using anoni al families, we get: F0 (2, 2) = 1, F1 (2, 2) = 9, F2 (2, 2) = 6. 2. The number of all matri es A ∈ P3×3 su h that ranks (A) = k is equal to Fk (3, 3). Using anoni al families, we get: F0 (3, 3) = 1, F1 (3, 3) = 49, F2 (3, 3) = 306, F3 (3, 3) = 156. From the proof of Theorem 9 we obtain the following statements. If f : V1 ∪ V2 → 2U is a U- oding fun tion for Γ (A) and U = {1, . . . , k} is a set, then A = XY where X ∈ Pm×k , Y ∈ Pk×n are given by: xij = e 1 i j ∈ f(i), i ∈ V1 , j ∈ U; e yij = 1 i i ∈ f(j ′ ), i ∈ U, j ′ ∈ V2 .

(4) (5)

Thus X(i) asso iates to the set f(i) and Y (j) asso iates to the set f(j ′ ). If A = XY , where X ∈ Pm×k , then f : V1 ∪ V2 → 2U given by (4), (5) is a U- oding fun tion for Γ (A).

7

On coding of bipartite graphs by antichains

Let A ∈ Pm×n be a matrix, Γ (A) = Γ (V1 ∪ V2 , E) a bipartite graph asso iated to A, f : V1 ∪ V2 → 2U a U- oding fun tion for Γ (A).

96

E. E. Mareni h

For given real number x, denote by ⌊x⌋ the greatest integer that is less than or equal to x. Similarly, ⌈x⌉ is the least interger that > x. l , where k ∈ N. Denote by l = N(k) the least number su h that k 6 ⌊l/2⌋ We have: N(1) = 1, N(2) = 2, N(3) = 3, N(4) = N(5) = N(6) = 4, N(7) = . . . = N(10) = 5, N(11) = . . . = N(20) = 6, N(21) = . . . = N(35) = 7, N(36) = . . . = N(70) = 8, N(71) = . . . = N(126) = 9, N(127) = . . . = N(252) = 10, N(253) = . . . = N(462) = 11, N(463) = . . . = N(924) = 12, N(925) = . . . = N(1716) = 13. Consider the following properties of N(k).

Let q, t, k ∈ N, 1 6 q 6 k. Then: (i) k > N( qk ); (ii) k = N( qk ) for any given t > 1 and suÆ iently large k = 2q − t; (iii) k = N( qk ) for any given t > 1 and suÆ iently large k = 2q + t.

Lemma 2.

) is equivalent to k−1 k < . ⌊(k − 1)/2⌋ q

Proof. (i) The equality k = N(

k q

(6)

Let t be even, t = 2a. Then (6) is equivalent to

q(q − 1) . . . (q − a + 1) < 2(q − a)(q − a − 1) . . . (q − 2a + 1).

(7)

Both sides of (7) are polynomials in one variable q. These polynomials have degree a. We ompare their highest oeÆ ients and see that (6) holds for any suÆ iently large q. ⊓ ⊔ Let t be odd. Similar reasoning gives (6). For k = 2q ± t, we an get more pre ise result.

Corollary 7. Let q, k ∈ N, 1 ≤ q ≤ k. The equality k = N(bin(k, q)) holds if: k = 2q for all q; k = 2q − 1 for all q ≥ 2; k = 2q − 2 for all q ≥ 3; k = 2q − 3 for all q ≥ 3; k = 2q − 4 for all q ≥ 8.

Corollary 8. Let q, k ∈ N, 1 ≤ q ≤ k. The equality k = N(bin(k, q)) holds if: k = 2q + 1 for all q ≥ 1; k = 2q + 2 for all q ≥ 1; k = 2q + 3 for all q ≥ 4.

Determining the Schein rank of Boolean matrices

In particular,

    N(bin(k, ⌊k/2⌋)) = N(bin(k, ⌈k/2⌉)) = k,  k ≥ 1.

A subset B of a poset (Q, ≤) is a ≤-antichain if for any pair of distinct elements x and y of B, both x ≰ y and y ≰ x. The following lemma is useful for the calculation of the Schein rank of Boolean matrices.

Lemma 3. Let A ∈ P^{m×n}. Then:
(i) If the family of all rows of A is a ≤-antichain, then rank_s(A) ≥ N(m).
(ii) If the family of all columns of A is a ≤-antichain, then rank_s(A) ≥ N(n).

Proof. (i) Let f : V_1 ∪ V_2 → 2^U be a U-coding function for Γ(A), where |U| = rank_s(A) = k. We now prove that {f(i) | i ∈ V_1} is a ⊆-antichain. Suppose f(i_1) ⊆ f(i_2) for some i_1, i_2 ∈ V_1. Then A = XY, where X ∈ P^{m×k}, Y ∈ P^{k×n} are given by (4) and (5). According to the definition of X, if x_{i_1 j} = 1̃, then j ∈ f(i_1) ⊆ f(i_2) and hence x_{i_2 j} = 1̃. Therefore

    X_{(i_1)} ≤ X_{(i_2)},   A_{(i_1)} = X_{(i_1)} Y ≤ X_{(i_2)} Y = A_{(i_2)}.

Since the family of all rows of A is a ≤-antichain, we see that i_1 = i_2, and {f(i) | i ∈ V_1} is a ⊆-antichain. Thus the family of sets {f(i) | i ∈ V_1} is an antichain in the poset Bul(U). By Sperner's theorem [1], we have

    m ≤ bin(|U|, ⌊|U|/2⌋),

hence rank_s(A) = |U| ≥ N(m). ⊓⊔

We say that A ∈ P^{n×n} is an (n, k, λ)-design if each column and each row of A has exactly k entries 1̃, and each two rows of A have exactly λ entries 1̃ in common.

Example. Let A ∈ P^{n×n} be an (n, k, λ)-design, where λ < k < n. Then

    n ≥ rank_s(A) ≥ max{min{n, nk/λ²}, N(n)}.    (8)

Since λ < k, the family of all rows of A is a ≤-antichain; therefore rank_s(A) ≥ N(n). Combining this with the inequality

    rank_s(A) ≥ min{n, nk/λ²},    (9)

obtained in [5], we get (8). Note that the inequality (8) is exact (while the inequality (9) is not) for E_{n×n}.

8  Bipartite intersection graphs Γ_{k,p,q}

Let k, p, q ∈ N and U = {1, ..., k}. We enumerate the l-element subsets of U in lexicographical order:

    W_l(U) = {w_{k,l,1}, ..., w_{k,l,b(k,l)}},  where b(k, l) = bin(k, l).

Define the bipartite graph Γ_{k,p,q} = Γ(V_1 ∪ V_2, E) by setting V_1 = W_p(U), V_2 = W_q(U), and

    {w_{k,p,i}, w_{k,q,j}} ∈ E  iff  w_{k,p,i} ∩ w_{k,q,j} ≠ ∅.

We have |V_1| = bin(k, p) and |V_2| = bin(k, q). Note that Γ_{k,p,q} is a regular graph:

    deg(v) = bin(k, q) − bin(k − p, q),  v ∈ V_1;   deg(v) = bin(k, p) − bin(k − q, p),  v ∈ V_2.

The graph Γ(A) = Γ(V_1 ∪ V_2, E) is associated to the matrix

    A(k, p, q) = (a(k, p, q)_{ij}) ∈ P^{bin(k,p)×bin(k,q)},  where a(k, p, q)_{ij} = 1̃ iff w_{k,p,i} ∩ w_{k,q,j} ≠ ∅.

If p + q ≤ k, then the sets of all rows and all columns of A(k, p, q) are ≤-antichains. The rows of A(k, p, 1) are associated to the p-element subsets of U, taken in lexicographical order. Let C(k, p) ∈ P^{k×k} be the circulant matrix obtained by cycling the row whose first p entries are 1̃ and whose last k − p entries are 0̃.

Theorem 11. Let p, k ∈ N, 1 ≤ p ≤ k. Then:
(i) k ≥ rank_s(A(k, p, 1)) ≥ N(bin(k, p)).
(ii) If k = N(bin(k, p)), then rank_s(A(k, p, 1)) = k.
(iii) If k ≥ 2p − 1, then rank_s(A(k, p, 1)) = k.

Proof. (i) The set of all rows of A(k, p, 1) is a ≤-antichain. (iii) The circulant C(k, p) is a submatrix of A(k, p, 1). Therefore k ≥ rank_s(A(k, p, 1)) ≥ rank_s(C(k, p)). From [5], if k ≥ 2p − 1, then rank_s(C(k, p)) = k. ⊓⊔


Example. Consider the following matrix (entries 1̃, 0̃ are abbreviated to 1, 0):

    A(4, 2, 1) =
    ( 1 1 0 0 )
    ( 1 0 1 0 )
    ( 1 0 0 1 )
    ( 0 1 1 0 )
    ( 0 1 0 1 )
    ( 0 0 1 1 )

Since 4 = N(bin(4, 2)), we see that rank_s(A(4, 2, 1)) = 4.
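The value rank_s(A(4, 2, 1)) = 4 can be confirmed by exhaustive search. The sketch below (names ours) computes the Schein (Boolean) rank of a small 0-1 matrix as the minimum number of all-ones submatrices whose union covers the 1-entries; this brute force is feasible only for very small matrices:

```python
from itertools import combinations

def schein_rank(A):
    """Boolean (Schein) rank: the minimum number of all-ones submatrices
    (rank-one Boolean matrices) whose union covers exactly the 1-entries."""
    m, n = len(A), len(A[0])
    ones = {(i, j) for i in range(m) for j in range(n) if A[i][j]}
    # Candidate maximal blocks: for each column set C take the rows that are
    # 1 on all of C, then close C to every column shared by those rows.
    blocks = set()
    for r in range(1, n + 1):
        for C in combinations(range(n), r):
            R = tuple(i for i in range(m) if all(A[i][j] for j in C))
            if R:
                Cc = tuple(j for j in range(n) if all(A[i][j] for i in R))
                blocks.add((R, Cc))
    cells = [{(i, j) for i in R for j in C} for R, C in blocks]
    # Smallest subfamily of blocks covering all 1-entries.
    for k in range(1, len(cells) + 1):
        for fam in combinations(cells, k):
            if set().union(*fam) == ones:
                return k
    return 0

A421 = [[1,1,0,0],[1,0,1,0],[1,0,0,1],[0,1,1,0],[0,1,0,1],[0,0,1,1]]
print(schein_rank(A421))  # -> 4, in agreement with 4 = N(bin(4, 2)) = N(6)
```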

It is easy to prove that

    A(k, p, q) = A(k, p, 1) · A(k, 1, q),    (10)
    A(k, p, 1) = (A(k, 1, p))^{(t)}.

The matrix A(k, p, 1) is a block matrix. Indeed,

    A(k, p, 1) = ( J_{bin(k−1,p−1)×1}   A(k − 1, p − 1, 1) )
                 ( 0_{bin(k−1,p)×1}     A(k − 1, p, 1)     ).

Combining this with (10), we get that A(k, p, q) is the following block matrix:

    A(k, p, q) = ( J_{bin(k−1,p−1)×bin(k−1,q−1)}   A(k − 1, p − 1, q) )
                 ( A(k − 1, p, q − 1)              A(k − 1, p, q)     ).
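Identity (10) can be checked directly by constructing A(k, p, q) from the lexicographically ordered subsets. A small sketch (names ours), using 0/1 integers for 0̃/1̃:

```python
from itertools import combinations

def A(k, p, q):
    """A(k, p, q): rows are p-subsets, columns are q-subsets of {1,...,k}
    in lexicographical order; an entry is 1 iff the subsets intersect."""
    P = list(combinations(range(1, k + 1), p))
    Q = list(combinations(range(1, k + 1), q))
    return [[int(bool(set(u) & set(v))) for v in Q] for u in P]

def boolmul(X, Y):
    """Boolean matrix product: join of meets."""
    return [[int(any(x and y for x, y in zip(row, col)))
             for col in zip(*Y)] for row in X]

# Identity (10): A(k, p, q) = A(k, p, 1) . A(k, 1, q) under the Boolean product.
for k, p, q in [(4, 2, 1), (5, 2, 2), (6, 3, 2)]:
    assert A(k, p, q) == boolmul(A(k, p, 1), A(k, 1, q))
```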

Example. The graph Γ_{5,2,2} has V_1 = V_2 = {12, 13, 14, 15, 23, 24, 25, 34, 35, 45}, the 2-element subsets of {1, ..., 5} in lexicographical order; two vertices are adjacent iff the corresponding subsets intersect. (The drawing of Γ_{5,2,2} is omitted here.) Therefore (entries 1̃, 0̃ are abbreviated to 1, 0)

    A(5, 2, 2) =
    ( 1 1 1 1 1 1 1 0 0 0 )
    ( 1 1 1 1 1 0 0 1 1 0 )
    ( 1 1 1 1 0 1 0 1 0 1 )
    ( 1 1 1 1 0 0 1 0 1 1 )
    ( 1 1 0 0 1 1 1 1 1 0 )
    ( 1 0 1 0 1 1 1 1 0 1 )
    ( 1 0 0 1 1 1 1 0 1 1 )
    ( 0 1 1 0 1 1 0 1 1 1 )
    ( 0 1 0 1 1 0 1 1 1 1 )
    ( 0 0 1 1 0 1 1 1 1 1 )

and A(5, 2, 2) = A(5, 2, 1) · A(5, 1, 2), where

    A(5, 2, 1) =
    ( 1 1 0 0 0 )
    ( 1 0 1 0 0 )
    ( 1 0 0 1 0 )
    ( 1 0 0 0 1 )
    ( 0 1 1 0 0 )
    ( 0 1 0 1 0 )
    ( 0 1 0 0 1 )
    ( 0 0 1 1 0 )
    ( 0 0 1 0 1 )
    ( 0 0 0 1 1 )

    A(5, 1, 2) =
    ( 1 1 1 1 0 0 0 0 0 0 )
    ( 1 0 0 0 1 1 1 0 0 0 )
    ( 0 1 0 0 1 0 0 1 1 0 )
    ( 0 0 1 0 0 1 0 1 0 1 )
    ( 0 0 0 1 0 0 1 0 1 1 )


Now we obtain the following properties of the Schein rank of A(k, p, q).

Theorem 12. Let k, p, q ∈ N, 1 ≤ p, q ≤ k. Then:
(i) rank_s(A(k, p, q)) ≤ min{rank_s(A(k, 1, q)), rank_s(A(k, p, 1))} ≤ k.
(ii) If p + q ≤ k, then rank_s(A(k, p, q)) ≥ max{N(bin(k, p)), N(bin(k, q))}.

Proof. The inequality (i) follows from (10). (ii) The families of all rows and all columns of A(k, p, q) are ≤-antichains. This completes the proof. ⊓⊔

The following is an immediate consequence of Theorem 11 and Corollary 8.

Corollary 9. Let k, p ∈ N, 1 ≤ p ≤ k. Then:
(i) If p ≤ k/2 and k = N(bin(k, p)), then rank_s(A(k, p, p)) = k.
(ii) rank_s(A(2p, p, p)) = 2p.
(iii) rank_s(A(2p + 1, p, p + 1)) = 2p + 1.

Example. If k = 4, 5, 6, 7, then k = N(bin(k, 2)); therefore rank_s(A(k, 2, 2)) = k for k = 4, 5, 6, 7. If k = 6, 7, 8, 9, then k = N(bin(k, 3)); therefore rank_s(A(k, 3, 3)) = k for k = 6, 7, 8, 9.

The following corollary is an application of Theorem 11.

Corollary 10. Let k, p ∈ N, 1 ≤ p < k. Then

    rank_s ( A(k, p, p)              E_{bin(k,p)×bin(k,p)} )
           ( E_{bin(k,p)×bin(k,p)}   A(k, k − p, k − p)    ) = k.

Proof. Consider the product of block matrices:

    ( A(k, p, 1)     )
    ( A(k, k − p, 1) ) · ( A(k, 1, p)  A(k, 1, k − p) ) = ( A(k, p, p)              E_{bin(k,p)×bin(k,p)} )
                                                          ( E_{bin(k,p)×bin(k,p)}   A(k, k − p, k − p)    ).    (11)

Taking into account (10), we obtain

    k ≥ rank_s ( A(k, p, p)              E_{bin(k,p)×bin(k,p)} )
               ( E_{bin(k,p)×bin(k,p)}   A(k, k − p, k − p)    ) ≥ rank_s(A(k, p, p)) = k. ⊓⊔

In particular, for p = 1 we have

    rank_s ( E_{k×k}  E_{k×k} )
           ( E_{k×k}  J_{k×k} ) = k.

9  The Schein rank of E_{n×n}

The following exercise is due to Kim [4, p. 63, Exercise 24].

Exercise. Prove that the Schein rank of the matrix E_{n×n} is k if n = bin(k, ⌊k/2⌋).

The ranks of all square matrices with 0̃ on the main diagonal and 1̃ elsewhere are determined in [8]. From Theorem 9 and Sperner's theorem, we get the following result.

Theorem 13. The Schein rank of E_{n×n} is equal to N(n).

Proof. The matrix E = E_{n×n} is associated to a bipartite graph Γ(E) = Γ(V_1 ∪ V_2, E). We have

    V_1 = {1, ..., n},  V_2 = {1′, 2′, ..., n′},

and {i, j′} ∈ E is an edge of Γ(E) whenever i ≠ j.

We now calculate nint_bp(Γ(E)). Let nint_∅(Γ(E)) = m and let f be a U-coding function for Γ(E), where |U| = m. Denote f(i) = a_i, f(i′) = b_i, i = 1, ..., n. Consider the sets

    g(i) = a_i,  g(i′) = ā_i = U − a_i,  i = 1, ..., n.

It is easy to prove that g : V_1 ∪ V_2 → 2^U is a U-coding function for Γ(E). In particular, a_i ∩ ā_j ≠ ∅ for all i ≠ j. If a_i ⊆ a_j for some i ≠ j, then ā_j ⊆ ā_i,

    a_i ∩ ā_j ⊆ a_i ∩ ā_i = ∅,  a_i ∩ ā_j = ∅.

This is a contradiction. Therefore the family {a_1, a_2, ..., a_n} is a ⊆-antichain in Bul(U). According to Sperner's theorem, n ≤ bin(m, ⌊m/2⌋).

We now prove that there exists a U-coding function for Γ(E) with |U| = k. By Sperner's theorem, the size of a maximal ⊆-antichain in Bul(k) equals bin(k, ⌊k/2⌋). Let {a_1, a_2, ..., a_n} be some n-element ⊆-antichain in Bul(U) such that {a_1, a_2, ..., a_n} is contained in a maximal ⊆-antichain and |a_i| = ⌊k/2⌋ for all i = 1, ..., n. Then |ā_i| = ⌈k/2⌉ for all i = 1, ..., n.


Denote f(i) = a_i, f(i′) = ā_i = U − a_i, i = 1, ..., n. Suppose a_i ∩ ā_j = ∅ for some i, j. We have

    |a_i| + |ā_j| = ⌊k/2⌋ + ⌈k/2⌉ = k,  a_i ∪ ā_j = U.

Therefore ā_i = ā_j, a_i = a_j, i = j. Thus the equality f(i) ∩ f(j′) = a_i ∩ ā_j = ∅ is equivalent to i = j. We have proved that f : V_1 ∪ V_2 → 2^U is a U-coding function for Γ(E). ⊓⊔

Corollary 11. Let n = bin(k, ⌊k/2⌋). The following statements are valid.
(i) rank_s(E_{n×n}) = k.
(ii) If E_{n×n} = XY, where X ∈ P^{n×N(n)}, Y ∈ P^{N(n)×n}, then

    X = πA(k, ⌊k/2⌋, 1),  Y = A(k, 1, ⌈k/2⌉) π^{(t)},    (12)

or

    X = πA(k, ⌈k/2⌉, 1),  Y = A(k, 1, ⌊k/2⌋) π^{(t)},    (13)

where π ∈ P^{bin(k,⌊k/2⌋)×bin(k,⌊k/2⌋)} is a permutation matrix.

Proof. Using Theorem 13 and the properties of the numbers N(n), we get rank_s(E_{n×n}) = N(bin(k, ⌊k/2⌋)) = k. By Sperner's theorem, see [1], there exist only two ⊆-antichains of maximal length in Bul(k): the family of all ⌊k/2⌋-element subsets and the family of all ⌈k/2⌉-element subsets. From the proof of Theorem 13 we get (ii). ⊓⊔

If k is even, then (12) coincides with (13); if k is odd, then (12) does not coincide with (13).

Example. The matrix E_{6×6} is the product of two matrices, E_{6×6} = XY, where (entries 1̃, 0̃ are abbreviated to 1, 0)

    E_{6×6} =
    ( 0 1 1 1 1 1 )
    ( 1 0 1 1 1 1 )
    ( 1 1 0 1 1 1 )
    ( 1 1 1 0 1 1 )
    ( 1 1 1 1 0 1 )
    ( 1 1 1 1 1 0 )

    X =
    ( 1 1 0 0 )
    ( 1 0 1 0 )
    ( 1 0 0 1 )
    ( 0 1 1 0 )
    ( 0 1 0 1 )
    ( 0 0 1 1 )

    Y =
    ( 0 0 0 1 1 1 )
    ( 0 1 1 0 0 1 )
    ( 1 0 1 0 1 0 )
    ( 1 1 0 1 0 0 )

Example. Let B = B(n) be the n × n matrix with 0̃ on the main and back diagonals and 1̃ elsewhere. In particular, consider

    B(5) =
    ( 0 1 1 1 0 )
    ( 1 0 1 0 1 )
    ( 1 1 0 1 1 )
    ( 1 0 1 0 1 )
    ( 0 1 1 1 0 )

    B(6) =
    ( 0 1 1 1 1 0 )
    ( 1 0 1 1 0 1 )
    ( 1 1 0 0 1 1 )
    ( 1 1 0 0 1 1 )
    ( 1 0 1 1 0 1 )
    ( 0 1 1 1 1 0 )
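The factorization of E_{6×6} can be verified with a Boolean matrix product. A short check (names ours, entries as 0/1 integers):

```python
def boolmul(X, Y):
    """Boolean matrix product: join of meets."""
    return [[int(any(a and b for a, b in zip(row, col)))
             for col in zip(*Y)] for row in X]

X = [[1,1,0,0],[1,0,1,0],[1,0,0,1],[0,1,1,0],[0,1,0,1],[0,0,1,1]]
Y = [[0,0,0,1,1,1],[0,1,1,0,0,1],[1,0,1,0,1,0],[1,1,0,1,0,0]]
E6 = [[int(i != j) for j in range(6)] for i in range(6)]
assert boolmul(X, Y) == E6   # hence rank_s(E_{6x6}) <= 4 = N(6)
```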


We have

    B_{(r)} = B_{(n−r+1)},  B^{(r)} = B^{(n−r+1)},  r = 1, ..., n,    (14)

i.e. the rows (and the columns) of B(n) numbered r and n − r + 1 coincide. By removing the rows numbered ⌈n/2⌉ + 1, ⌈n/2⌉ + 2, ..., n from B(n), we get a matrix X. From (14), we have rank_s(B(n)) = rank_s(X). By removing the columns numbered ⌈n/2⌉ + 1, ⌈n/2⌉ + 2, ..., n from X, we get E_{k×k}, where k = ⌈n/2⌉. From (14), we have rank_s(X) = rank_s(E_{k×k}). Therefore rank_s(B(n)) = N(⌈n/2⌉).

Example. Let C(n) be the n × n matrix with 1̃ on the main and back diagonals and 0̃ elsewhere. We have rank_s(C(n)) = ⌈n/2⌉.

Acknowledgments

This research is conducted in accordance with the Thematic plan of the Russian Federal Educational Agency, theme №1.03.07.

References

1. M. Aigner, Combinatorial Theory, Grundlehren Math. Wiss. 234, Springer-Verlag, Berlin, 1979.
2. J. Orlin, Contentment in graph theory: Covering graphs with cliques, K. Nederlandse Ak. van Wetenschappen Proc. Ser. A, 80 (1977), pp. 406-424.
3. Ki Hang Kim and F.W. Roush, Generalized fuzzy matrices, Fuzzy Sets and Systems, 4 (1980), pp. 293-315.
4. Ki Hang Kim, Boolean matrix theory and applications, Marcel Dekker, New York and Basel, 1982.
5. D.A. Gregory, N.J. Pullman, Semiring rank: Boolean rank and nonnegative rank factorization, Journal of Combinatorics, Information & System Sciences, 8 (1983), no. 3, pp. 223-233.
6. Di Nola, S. Sessa, On the Schein rank of matrices over linear lattice, Linear Algebra Appl., 118 (1989), pp. 155-158.
7. Di Nola, S. Sessa, Determining the Schein rank of matrices over linear lattices and finite relational equations, The Journal of Fuzzy Mathematics, 1 (1993), no. 1, pp. 33-38.
8. D. de Caen, D.A. Gregory and N.J. Pullman, The Boolean rank of zero-one matrices, Proc. Third Caribbean Conference on Combinatorics, Graph Theory, and Computing, Barbados, 1981, pp. 169-173; SIAM J. Numer. Anal., 19 (1982), pp. 400-408.

Lattices of matrix rows and matrix columns. Lattices of invariant column eigenvectors

Valentina Marenich⋆

Murmansk State Pedagogic University
[email protected]

Abstract. We consider matrices over a Brouwerian lattice. The linear span of the columns of a matrix A forms a semilattice; we call it the column semilattice of A. The questions are: when is the column semilattice a lattice, when is it a distributive lattice, and what formulas can be obtained for the meet and join operations? We prove that for any lattice matrix A the column semilattice is a lattice, and we obtain formulas for the meet and join operations. If A is an idempotent or A is a regular matrix, then the column semilattice is a distributive lattice. We also consider invariant eigenvectors of a square matrix A over a Brouwerian lattice. It is proved that all A-invariant eigenvectors form a distributive lattice, and simple formulas for the meet and join operations are obtained.

Keywords: lattice matrix, lattices of columns, invariant eigenvectors of lattice matrices.

1  Introduction

In Section 2 we recall some definitions: lattice matrices, column vectors over a lattice, operations over lattice matrices, Brouwerian and Boolean lattices, and systems of linear equations over a lattice. We also recall the solvability criterion for a system of linear equations over a Brouwerian lattice and some of its corollaries, which are needed for the sequel (for more details see [1]). In Section 3, we define the column semilattice (Column(A), ≤), which is the linear span of the columns of a matrix A. Similarly, a row semilattice can be defined. The questions are: when is (Column(A), ≤) a lattice, when is (Column(A), ≤) a distributive lattice, and what formulas can be obtained for the meet and join operations? In 1962, K.A. Zarezky proved that for a square matrix A over the two-element Boolean lattice, the column semilattice is a lattice whenever A is a regular matrix. We consider some cases when the column semilattice is a lattice and get formulas for the meet ∧̃ and join ∨̃ operations. Note that similar results

⋆ This research is conducted in accordance with the Thematic plan of the Russian Federal Educational Agency, theme №1.03.07.


can be obtained for a row semilattice. The main result of this section is the following. For a regular matrix A over a Brouwerian lattice:
1. the formula for the meet operation ∧̃ in the lattice (Column(A), ≤) is u ∧̃ v = C(u ∧ v) for all u, v ∈ Column(A), where C is an idempotent such that Column(A) = Column(C);
2. (Column(A), ≤) is a distributive lattice.

In Section 4, we recall the definition of invariant column eigenvectors, which is due to L.A. Skornyakov, see [6]. The set of all invariant column eigenvectors forms a subspace. We prove that for any m × m matrix A over a distributive lattice:
1. the subspace of all invariant column eigenvectors coincides with Column((A + A²)^k), where k ≥ m;
2. the matrix (A + A²)^k is an idempotent.
In Section 5, we consider a square matrix A and A-invariant eigenvectors over a Brouwerian lattice. From the previous results it follows that all A-invariant eigenvectors form a distributive lattice. Simple formulas for the meet and join operations are also obtained.

2  Preliminaries

The following notation will be used throughout. Denote by (P, ∧, ∨, ≤) a lattice.

2.1  Lattice matrices and column vectors

Let P^{m×n} be the set of all m × n matrices over P and A = ‖a_{ij}‖ ∈ P^{m×n}. We define the following matrix operations:
– for any matrices A, B ∈ P^{m×n}: A + B = ‖a_{ij} ∨ b_{ij}‖;
– for any matrices A ∈ P^{m×n}, B ∈ P^{n×k}: AB = ‖∨_{r=1}^{n} (a_{ir} ∧ b_{rj})‖_{m×k}.

A square lattice matrix A ∈ P^{m×m} is an idempotent if A² = A. The transpose of A is defined by analogy with linear algebra and is denoted by A^{(t)}.
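For a concrete illustration, take the chain ([0, 1], min, max), which is a distributive (indeed Brouwerian) lattice. The matrix operations above then read as follows (a sketch; function names are ours):

```python
def madd(A, B):
    """A + B: entrywise join (here, max)."""
    return [[max(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mmul(A, B):
    """AB: join of meets, the lattice analogue of the matrix product."""
    return [[max(min(A[i][r], B[r][j]) for r in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[0.2, 0.7], [1.0, 0.4]]
B = [[0.5, 0.1], [0.9, 0.3]]
assert madd(A, B) == [[0.5, 0.7], [1.0, 0.4]]
assert mmul(A, B) == [[0.7, 0.3], [0.5, 0.3]]
```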


Any element (p_1, ..., p_m)^t of P^{m×1} is called a column vector. We define a partial order on P^{m×1}:

    (p_1, ..., p_m)^t ≤ (p_1′, ..., p_m′)^t ⇔ p_1 ≤ p_1′, ..., p_m ≤ p_m′,

and the following operations: for any (p_1, ..., p_m)^t, (p_1′, ..., p_m′)^t ∈ P^{m×1} and λ ∈ P,

    (p_1, ..., p_m)^t + (p_1′, ..., p_m′)^t = (p_1 ∨ p_1′, ..., p_m ∨ p_m′)^t;    (1)
    λ(p_1, ..., p_m)^t = (λ ∧ p_1, ..., λ ∧ p_m)^t.    (2)

With these notations we define the linear span of column vectors by analogy with linear algebra. Any set S ⊆ P^{m×1} closed under the operations (1) and (2) is called a subspace. The partially ordered set (P^{m×1}, ≤) is a lattice with meet ∧ and join ∨ defined componentwise: for any (p_1, ..., p_m)^t, (p_1′, ..., p_m′)^t ∈ P^{m×1},

    (p_1, ..., p_m)^t ∨ (p_1′, ..., p_m′)^t = (p_1 ∨ p_1′, ..., p_m ∨ p_m′)^t,
    (p_1, ..., p_m)^t ∧ (p_1′, ..., p_m′)^t = (p_1 ∧ p_1′, ..., p_m ∧ p_m′)^t.

Recall that any partially ordered set is called, more simply, a poset.

2.2  Brouwerian lattices

Let us recall the definition of a Brouwerian lattice. If for given elements a, b ∈ P the greatest solution of the inequality a ∧ x ≤ b exists, then it is denoted by b/a and is called the relative pseudocomplement of a in b. If b/a exists for all a, b ∈ P, then (P, ∧, ∨, ≤) is called a Brouwerian lattice. Note that:
– any Brouwerian lattice has a greatest element, denoted by 1̂;
– any Brouwerian lattice is a distributive lattice;
– any finite distributive lattice is a Brouwerian lattice.

Let (P, ∧, ∨, ≤) be a Brouwerian lattice, A ∈ P^{m×n} a matrix, and c = (c_1, ..., c_m)^t ∈ P^{m×1} a column vector. Define a vector

    c/A = ( ∧_{i=1}^{m} c_i/a_{i1}, ..., ∧_{i=1}^{m} c_i/a_{in} )^t ∈ P^{n×1}.
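On the chain ([0, 1], min, max) the relative pseudocomplement is explicit, so the vector c/A can be computed entrywise. A small sketch (names ours), which also exhibits c/A as a solution of the system of inequations Ax ≤ c (cf. Theorem 1 below):

```python
def rpc(a, b):
    """Relative pseudocomplement b/a on the chain [0, 1]: the greatest x
    with min(a, x) <= b; this is 1 when a <= b, and b otherwise."""
    return 1.0 if a <= b else b

def residual(A, c):
    """The vector c/A = (AND_i c_i/a_{i1}, ..., AND_i c_i/a_{in})^t."""
    m, n = len(A), len(A[0])
    return [min(rpc(A[i][j], c[i]) for i in range(m)) for j in range(n)]

A = [[0.6, 0.3], [0.2, 0.9]]
c = [0.5, 0.9]
x = residual(A, c)
# A x <= c componentwise for x = c/A.
Ax = [max(min(a, xi) for a, xi in zip(row, x)) for row in A]
assert all(v <= ci for v, ci in zip(Ax, c))
```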

2.3  Boolean lattices

Let (P, ∧, ∨, ≤) be a distributive lattice with the least element 0̂ and the greatest element 1̂. If for any a ∈ P there exists ā ∈ P such that a ∨ ā = 1̂ and a ∧ ā = 0̂, then (P, ∧, ∨, ≤) is called a Boolean lattice. Any Boolean lattice is a Brouwerian lattice, with b/a = ā ∨ b. Denote by Ā the matrix Ā = ‖ā_{ij}‖.
Let U be a finite set and 2^U the collection of all subsets of U. Denote by Bul(U) = (2^U, ⊆) the poset of all subsets of U partially ordered by inclusion (we call it a Boolean algebra). Let Bul(k) be the Boolean algebra of all subsets of a finite k-element set. It is obvious that Bul(U) (resp. Bul(k)) is a Boolean lattice.

2.4  Systems of linear equations over Brouwerian lattices

Before continuing, we require the following results, which are known from [1]. Let A ∈ P^{m×n}, c ∈ P^{m×1}. Define a system of linear equations

    Ax = c    (3)

and a system of linear inequations

    Ax ≤ c.    (4)

Theorem 1. Let (P, ∧, ∨, ≤) be a Brouwerian lattice. Then:
(i) x = c/A is the greatest solution of the system of inequations (4).
(ii) System (3) is solvable whenever x = c/A is a solution of (3). If system (3) is solvable, then x = c/A is its greatest solution.

Corollary 1. Let (P, ∧, ∨, ≤) be a Brouwerian lattice. Then x = A(c/A) is the greatest vector in the set {Ax | Ax ≤ c, x ∈ P^{n×1}}.

Theorem 2. Let (P, ∧, ∨, ≤) be a Boolean lattice. Then the following conditions are equivalent:
(i) System (3) is solvable.
(ii) The greatest solution of system (3) is x = c/A, which over a Boolean lattice can be computed entrywise as (c/A)_j = ∧_{i=1}^{m} (ā_{ij} ∨ c_i).

The solvability of systems of linear equations over Boolean lattices was studied in detail by Rudeanu in [2].

Corollary 2. Let (P, ∧, ∨, ≤) be a Boolean lattice. Then x = A(c/A) is the greatest vector in the set {Ax | Ax ≤ c, x ∈ P^{n×1}}, where c/A is computed entrywise as in Theorem 2.

3  Semilattices and lattices of matrix columns and matrix rows

Let A = ‖a_{ij}‖ ∈ P^{m×n} be a matrix and A^{(j)} = (a_{1j}, ..., a_{mj})^t its j-th column. The linear span of the columns of A is denoted by Column(A). If u ∈ Column(A), then u = Ax for some column vector x ∈ P^{n×1}. Define the poset (Column(A), ≤) with respect to the partial order ≤ induced by the lattice (P^{m×1}, ≤).
Let (P, ∧, ∨, ≤) be a distributive lattice and A ∈ P^{m×n}. Then (Column(A), ≤) is an upper semilattice with the join operation ∨̃ given by

    (p_1, ..., p_m)^t ∨̃ (p_1′, ..., p_m′)^t = (p_1, ..., p_m)^t + (p_1′, ..., p_m′)^t = (p_1 ∨ p_1′, ..., p_m ∨ p_m′)^t

for any (p_1, ..., p_m)^t, (p_1′, ..., p_m′)^t ∈ Column(A). We call (Column(A), ≤) the column semilattice. Similarly, a row semilattice (Row(A), ≤) can be defined.
The questions are: when is (Column(A), ≤) a lattice, when is (Column(A), ≤) a distributive lattice, and what formulas can be obtained for the meet and join operations? In 1962, K.A. Zarezky obtained the following result.

Theorem 3 (Zarezky's criterion). Let P = {0̂, 1̂} be the two-element Boolean lattice and A ∈ P^{m×m} a square matrix. Then (Column(A), ≤) is a distributive lattice whenever A is a regular matrix.

Recall that a square matrix A ∈ P^{m×m} is called a regular matrix if there exists B ∈ P^{m×m} such that ABA = A. It is known that A is a regular matrix whenever there exists an idempotent C such that

    Column(A) = Column(C),    (5)

see [3]. In the following theorem we consider some cases when the column semilattice is a lattice and obtain formulas for the meet ∧̃ and join ∨̃ operations.

Theorem 4. Let (P, ∧, ∨, ≤) be a lattice and A ∈ P^{m×n}.
(i) If (P, ∧, ∨, ≤) is a Brouwerian lattice, then (Column(A), ≤) is a lattice, where the formulas for the meet ∧̃ and join ∨̃ operations are

    u ∨̃ v = u + v,   u ∧̃ v = A(u/A ∧ v/A) = A((u ∧ v)/A),

for all u, v ∈ Column(A).
(ii) If (P, ∧, ∨, ≤) is a Boolean lattice, then (Column(A), ≤) is a lattice, where the formulas for the meet ∧̃ and join ∨̃ operations are

    u ∨̃ v = u + v = u ∨ v,   u ∧̃ v = A z̄,  where z = A^{(t)}(ū + v̄),

for all u, v ∈ Column(A).

Proof. (i) Let w̃ = u ∧̃ v. Then w̃ is the greatest vector in the set

    {w = Ax, x ∈ P^{n×1} | w ≤ u, w ≤ v} = {w = Ax, x ∈ P^{n×1} | w ≤ u ∧ v}.

According to Corollary 1,

    w̃ = A((u ∧ v)/A).

(ii) According to (i) and Corollary 2, u ∧̃ v = A((u ∧ v)/A) = A z̄ with z = A^{(t)}(ū + v̄), since the complement of u ∧ v is ū + v̄. ⊓⊔

Corollary 3. Let (P, ∧, ∨, ≤) be a finite distributive lattice. Then the column semilattice (Column(A), ≤) is a lattice, in which the meet and join operations can be calculated by the formulas from Theorem 4 (i).

For some column lattices, we can express the meet operation ∧̃ more simply.

Theorem 5. Let (P, ∧, ∨, ≤) be a Brouwerian lattice and let the matrix A ∈ P^{m×m} be an idempotent. Then:
(i) the formulas for the meet and join operations ∧̃, ∨̃ in the lattice (Column(A), ≤) are

    u ∨̃ v = u + v,   u ∧̃ v = A(u ∧ v),

for all u, v ∈ Column(A);
(ii) (Column(A), ≤) is a distributive lattice.

Proof. (i) According to Theorem 4,

    u ∧̃ v = A(u/A ∧ v/A).

Note that x = u is a solution of Ax = u. According to Theorem 1,

    u ≤ u/A.

Since u ≤ u/A and v ≤ v/A, we get

    A(u ∧ v) ≤ A(u/A ∧ v/A) = u ∧̃ v.

Note that w = Aw for any idempotent matrix A ∈ P^{m×m} and any vector w ∈ Column(A). We have

    A(u ∧ v) ≤ u ∧̃ v = A(u ∧̃ v) ≤ A(u ∧ v).

Therefore u ∧̃ v = A(u ∧ v).

(ii) For any u, v, w ∈ Column(A),

    w ∨̃ (u ∧̃ v) = w + A(u ∧ v) = Aw + A(u ∧ v) = A(w ∨ (u ∧ v)) = A((w ∨ u) ∧ (w ∨ v)) = (w ∨̃ u) ∧̃ (w ∨̃ v).

This completes the proof. ⊓⊔
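Theorem 5 can be checked by brute force on a small example over the finite chain {0, 1/2, 1} (an idempotent A picked by hand; all names ours):

```python
from itertools import product

P = [0.0, 0.5, 1.0]   # a finite chain, hence a distributive (Brouwerian) lattice

def mmul(A, B):
    return [[max(min(A[i][r], B[r][j]) for r in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(A, x):
    return tuple(max(min(a, xi) for a, xi in zip(row, x)) for row in A)

A = [[1.0, 0.5], [0.5, 1.0]]
assert mmul(A, A) == A                                 # A is an idempotent
col = {apply(A, x) for x in product(P, repeat=2)}      # Column(A)
for u in col:
    for v in col:
        meet = apply(A, [min(a, b) for a, b in zip(u, v)])   # A(u AND v)
        lower = [w for w in col
                 if all(wi <= min(ui, vi) for wi, ui, vi in zip(w, u, v))]
        # A(u AND v) is the greatest lower bound of u, v inside Column(A)
        assert meet in lower
        assert all(all(wi <= mi for wi, mi in zip(w, meet)) for w in lower)
```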

Note that results similar to Theorem 4, Corollary 3 and Theorem 5 can be obtained for the row semilattice (Row(A), ≤). The following statement is an analog of Zarezky's theorem.

Theorem 6. Let (P, ∧, ∨, ≤) be a Brouwerian lattice, A ∈ P^{m×m} a regular matrix, and C ∈ P^{m×m} an idempotent such that Column(A) = Column(C). Then:
(i) the formula for the meet operation ∧̃ in the lattice (Column(A), ≤) is

    u ∧̃ v = C(u ∧ v),

for all u, v ∈ Column(A);
(ii) (Column(A), ≤) is a distributive lattice.

Proof. Recall that for any regular matrix A there always exists an idempotent C such that Column(A) = Column(C), see (5). The proof of (i) and (ii) follows from Theorem 5. ⊓⊔

A result similar to statement (ii) was proved by K.A. Zarezky for semigroups of binary relations, see [4]. Kim and Roush obtained a similar result for the fuzzy lattice, see [5].

4  Subspaces of A-invariant column eigenvectors

Let (P, ∧, ∨, ≤) be a distributive lattice and A ∈ P^{m×m} a square matrix. The following definition of invariant column eigenvectors over a lattice is due to L.A. Skornyakov, see [6]. (We say "A-invariant column vectors" instead of "invariant column eigenvectors".)

A column vector u ∈ P^{m×1} is called A-invariant if Au = u. If u, v ∈ P^{m×1} are invariant column vectors and p ∈ P, then u + v and pv are A-invariant column vectors. Therefore the set of all A-invariant column vectors forms a subspace. Our purpose is to describe the subspace of all A-invariant column vectors. The following two lemmas are needed for the sequel.

Lemma 1. Let (P, ∧, ∨, ≤) be a distributive lattice and A ∈ P^{m×m} a square matrix. Then

    A^m ≤ Σ_{r=m+1}^{2m} A^r ≤ Σ_{r>m} A^r ≤ Σ_{1≤r≤m} A^r.    (6)

These inequalities were proved by K. Chechlarova in [7].

Lemma 2. Let (P, ∧, ∨, ≤) be a distributive lattice, A ∈ P^{m×m} a square matrix and k ≥ m. Then

    A^k ≤ A^{k+1} + A^{k+2} + ... + A^{k+m},    (7)
    A^{2k+1} ≤ A^k + A^{k+1} + ... + A^{2k}.    (8)

Proof. By (6), we get

    A^m ≤ A^{m+1} + A^{m+2} + ... + A^{2m}.

Multiplying both sides by A^{k−m}, we obtain (7). Now let us prove (8). First we prove (8) for a Boolean {0̃, 1̃}-matrix A. Let A ∈ {0̃, 1̃}^{m×m}; then for any column vector ξ ∈ {0̃, 1̃}^{m×1}

    A^k ξ ≤ (A^k + A^{k+1})ξ ≤ ... ≤ (A^k + A^{k+1} + ... + A^{2k})ξ.

If all of these inequalities are strict, we obtain a chain of k + 1 distinct elements in the lattice ({0̃, 1̃}^m, ≤). This is a contradiction, because the length of the Boolean algebra ({0̃, 1̃}^m, ≤) ≅ Bul(m) is equal to m. Suppose

    (A^k + A^{k+1} + ... + A^{k+s})ξ = (A^k + A^{k+1} + ... + A^{k+s+1})ξ

for some s, where 1 ≤ s ≤ k − 1. Then

    A^{k+s+1} ξ ≤ (A^k + A^{k+1} + ... + A^{k+s})ξ.

We prove by induction on r ≥ s + 1 the following inequalities:

    A^{k+r} ξ ≤ (A^k + A^{k+1} + ... + A^{k+s})ξ.    (9)

For r = s + 1 the inequality is already proved. We assume that the inequality holds for r and prove it for r + 1. Indeed,

    A^{k+r+1} ξ ≤ (A^{k+1} + ... + A^{k+s+1})ξ ≤ (A^k + A^{k+1} + ... + A^{k+s+1})ξ = (A^k + A^{k+1} + ... + A^{k+s})ξ + A^{k+s+1} ξ = (A^k + A^{k+1} + ... + A^{k+s})ξ.

Thus the inequalities (9) are valid. By setting r = k + 1 ≥ s + 1 in (9), we obtain

    A^{2k+1} ξ ≤ (A^k + A^{k+1} + ... + A^{2k})ξ.

Since ξ is an arbitrary column vector, the inequality (8) is valid for any Boolean matrix A ∈ {0̃, 1̃}^{m×m}. Now suppose that A is a lattice matrix, A ∈ P^{m×m}. Using the inequality (8) for Boolean {0̃, 1̃}-matrices and the decomposition of A into the linear span of sections (constituents), we see that (8) is valid over the lattice (P, ∧, ∨, ≤). (The linear span of sections is defined in [5].) ⊓⊔

Lemma 2 gives us the following.

Theorem 7. Let (P, ∧, ∨, ≤) be a distributive lattice, k ≥ m and A ∈ P^{m×m}. Then:
(i) (A + A²)^k = (A + A²)^k · A;
(ii) (A + A²)^k is an idempotent matrix.

Proof. (i) It follows from (7) that

    A^k ≤ A^{k+1} + ... + A^{2k}.

Combining this inequality with (8), we get

    (A + A²)^k = A^k + A^{k+1} + ... + A^{2k} = A^k + A^{k+1} + ... + A^{2k} + A^{2k+1} = A^{k+1} + ... + A^{2k} + A^{2k+1} = (A + A²)^k · A.

(ii) Using (i), we see that (A + A²)^k = (A + A²)^k A^s for any s ≥ 0. Therefore

    ((A + A²)^k)² = (A + A²)^k (A + A²)^k = (A + A²)^k (A^k + A^{k+1} + ... + A^{2k}) = Σ_{s=k}^{2k} (A + A²)^k A^s = Σ_{s=k}^{2k} (A + A²)^k = (A + A²)^k,

and (A + A²)^k is an idempotent matrix. ⊓⊔

A result similar to Theorem 7 was proved by K.H. Kim for the two-element Boolean lattice P = {0̃, 1̃}, see [3]. In the following lemma we describe the invariant vectors of idempotent matrices.

Lemma 3. Let (P, ∧, ∨, ≤) be a distributive lattice, B ∈ P^{m×m} an idempotent and ξ ∈ P^{m×1} a column vector. Then ξ is a B-invariant vector whenever ξ ∈ Column(B).

Proof. Suppose Bξ = ξ; then ξ ∈ Column(B).
Suppose ξ ∈ Column(B). Since B is an idempotent, we get B = B² and B^{(j)} = B · B^{(j)} for any column B^{(j)}, j = 1, ..., m. By the definition of Column(B),

    ξ = β_1 B^{(1)} + ... + β_m B^{(m)}

for some β_1, ..., β_m ∈ P. Therefore

    Bξ = B(β_1 B^{(1)} + ... + β_m B^{(m)}) = β_1 B · B^{(1)} + ... + β_m B · B^{(m)} = ξ. ⊓⊔

Now we can describe all invariant eigenvectors of m × m matrices over a lattice.

Theorem 8. Let (P, ∧, ∨, ≤) be a distributive lattice, A ∈ P^{m×m} and k ≥ m. Then the subspace of all A-invariant column vectors coincides with Column((A + A²)^k), and the matrix (A + A²)^k is an idempotent.

Proof. According to Theorem 7, (A + A²)^k is an idempotent and (A + A²)^k = (A + A²)^k · A.
First we shall prove that the conditions Aξ = ξ and (A + A²)^k ξ = ξ are equivalent for any ξ ∈ P^{m×1}. Suppose Aξ = ξ; then obviously (A + A²)^k ξ = ξ. Suppose (A + A²)^k ξ = ξ; then Aξ = A(A + A²)^k ξ = (A + A²)^k ξ = ξ.
Since (A + A²)^k is an idempotent, using Lemma 3 we see that (A + A²)^k ξ = ξ is equivalent to ξ ∈ Column((A + A²)^k). ⊓⊔

For the two-element Boolean lattice P = {0̃, 1̃}, Theorem 8 is a corollary of results obtained by T.S. Blyth, see [10].
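For the two-element Boolean lattice, Theorems 7 and 8 are easy to confirm experimentally on random matrices (a sketch; all names ours):

```python
import random
from itertools import product

def bmul(A, B):
    n = len(A)
    return [[int(any(A[i][r] and B[r][j] for r in range(n)))
             for j in range(n)] for i in range(n)]

def bapply(A, x):
    return tuple(int(any(a and xi for a, xi in zip(row, x))) for row in A)

random.seed(3)
m = 4
for _ in range(50):
    A = [[random.randint(0, 1) for _ in range(m)] for _ in range(m)]
    S = [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(A, bmul(A, A))]  # A + A^2
    T = S
    for _ in range(m - 1):
        T = bmul(T, S)                       # T = (A + A^2)^m, i.e. k = m
    assert bmul(T, T) == T                   # Theorem 7 (ii): T is an idempotent
    invariant = {x for x in product((0, 1), repeat=m) if bapply(A, x) == x}
    column = {bapply(T, x) for x in product((0, 1), repeat=m)}
    assert invariant == column               # Theorem 8
```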

5  Lattices of invariant column vectors

Let (P, ∧, ∨, ≤) be a Brouwerian lattice and A ∈ P^{m×m} a square matrix. From the previous results it follows that all A-invariant vectors form a distributive lattice; simple formulas for the meet and join operations are obtained below.

Theorem 9. Let (P, ∧, ∨, ≤) be a Brouwerian lattice, A ∈ P^{m×m} a square matrix, J_{m×1} = (1̂, ..., 1̂)^t ∈ P^{m×1} the universal column vector, and k ≥ m. Then:

(i) all A-invariant vectors form a lattice, which coincides with (Column((A + A²)^k), ∧̃, ∨̃, ≤);
(ii) the formulas for the meet ∧̃ and join ∨̃ operations in the lattice of all A-invariant vectors are

    u ∨̃ v = u + v = u ∨ v,   u ∧̃ v = (A + A²)^k (u ∧ v),

for all A-invariant column vectors u, v ∈ P^{m×1};
(iii) the lattice of all A-invariant vectors is distributive, with the greatest element A^m J_{m×1}. If (P, ∧, ∨, ≤) is a Brouwerian lattice with 0̂, then the lattice of all A-invariant vectors has the least element 0̃ = (0̂, ..., 0̂)^t ∈ P^{m×1}.

Proof. Statements (i) and (ii) are immediate consequences of Theorems 8, 4, 5.
Let us prove (iii). First we shall prove that A^m · J_{m×1} is an A-invariant vector. From the obvious inequality AJ_{m×1} ≤ J_{m×1}, it follows that

    A^{r+1} J_{m×1} ≤ A^r J_{m×1},  r = 1, 2, ....    (10)

According to (6),

    A^m J_{m×1} ≤ Σ_{r>m} A^r J_{m×1}.

Consider the right part of this inequality. By (10), A^{m+1} J_{m×1} is the greatest summand, therefore

    Σ_{r>m} A^r J_{m×1} = A^{m+1} J_{m×1}.

Applying (10) again, we get

    A^m J_{m×1} ≤ Σ_{r>m} A^r J_{m×1} = A^{m+1} J_{m×1} ≤ A^m J_{m×1},

hence A^{m+1} J_{m×1} = A^m J_{m×1}, i.e. A^m J_{m×1} is A-invariant. To conclude the proof, it remains to note that A^m · J_{m×1} is the greatest A-invariant vector. Indeed, if ξ is an A-invariant vector, then ξ = Aξ = ... = A^m ξ ≤ A^m J_{m×1}. ⊓⊔
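The greatest element A^m J_{m×1} from Theorem 9 (iii) can likewise be verified for Boolean matrices (a sketch; names ours):

```python
import random
from itertools import product

def bapply(A, x):
    return tuple(int(any(a and xi for a, xi in zip(row, x))) for row in A)

random.seed(7)
m = 4
for _ in range(200):
    A = [[random.randint(0, 1) for _ in range(m)] for _ in range(m)]
    top = (1,) * m                   # J_{m x 1}
    for _ in range(m):
        top = bapply(A, top)         # A^m J
    assert bapply(A, top) == top     # A^m J is A-invariant
    for x in product((0, 1), repeat=m):
        if bapply(A, x) == x:        # every invariant vector lies below A^m J
            assert all(xi <= ti for xi, ti in zip(x, top))
```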

Acknowledgments

This research is conducted in accordance with the Thematic plan of the Russian Federal Educational Agency, theme №1.03.07.

References

1. E.E. Marenich, V.G. Kumarov, Inversion of matrices over a pseudocomplemented lattice, Journal of Mathematical Sciences, 144 (2007), no. 2, Springer, New York, pp. 3968-3979.
2. S. Rudeanu, Lattice Functions and Equations, Springer-Verlag, London, 2001.
3. Ki Hang Kim, Boolean matrix theory and applications, Marcel Dekker, New York and Basel, 1982.
4. K.A. Zarezky, Regular elements in semigroups of binary relations, Uspekhi Mat. Nauk, 17-3 (1962), pp. 105-108.
5. K.H. Kim, F.W. Roush, Generalized fuzzy matrices, Fuzzy Sets and Systems, 4 (1980), pp. 293-315.
6. L.A. Skornyakov, Eigenvectors of a matrix over a distributive lattice, Vestnik Kievskogo Universiteta, 27 (1986), pp. 96-97.
7. K. Chechlarova, Powers of matrices over distributive lattices - a review, Fuzzy Sets and Systems, 138 (2003), pp. 627-641.
8. S. Kirkland, N.J. Pullman, Boolean spectral theory, Linear Algebra Appl., 175 (1992), pp. 177-190.
9. Y.-J. Tan, On the powers of matrices over distributive lattices, Linear Algebra Appl., 336 (2001), pp. 1-14.
10. T.S. Blyth, On eigenvectors of Boolean matrices, Proc. Royal Soc. Edinburgh Sect. A, 67 (1966), pp. 196-204.

Matrix algebras and their length

Olga V. Markova
Moscow State University, Mathematics and Mechanics Dept.
ov [email protected]

Abstract. Let F be a field and let A be a finite-dimensional F-algebra. We define the length of a finite generating set of this algebra as the smallest number k such that words of length not greater than k generate A as a vector space, and the length of the algebra is the maximum of the lengths of its generating sets. In this paper we study the connection between the length of an algebra and the lengths of its subalgebras. It turns out that the length of an algebra can be smaller than the length of its subalgebra. To investigate how different the length of an algebra and the length of its subalgebra can be, we evaluate the difference and the ratio of the lengths of an algebra and its subalgebra for several representative families of algebras. Also we give examples of length computation for two and three block upper triangular matrix algebras.

Keywords: length; finite-dimensional associative algebras; matrix subalgebras; upper triangular matrices; block matrices.

1 Main Definitions and Notation

Let F be an arbitrary field and let A be a finite-dimensional associative algebra over F. Each finite-dimensional algebra is certainly finitely generated. Let S = {a_1, . . . , a_k} be a finite generating set for A.

Notation 1. Let ⟨S⟩ denote the linear span, i.e. the set of all finite linear combinations with coefficients from F, of the set S.

Definition 1. The length of a word a_{i_1} · · · a_{i_t}, a_{i_j} ∈ S, a_{i_j} ≠ 1, is t. If A is an algebra with 1, then it is said that 1 is a word of elements from S of length 0.

Notation 2. Let S^i denote the set of all words in the alphabet a_1, . . . , a_k of length less than or equal to i, i ≥ 0.

Notation 3. Let L_i(S) = ⟨S^i⟩ and let L(S) = ⋃_{i=0}^{∞} L_i(S) be the linear span of all words in the alphabet a_1, . . . , a_k. Note that L_0(S) = F, if A is unitary, and L_0(S) = 0 otherwise.


Since S is a generating set for A, any element of A can be written as a finite linear combination of words in a_1, . . . , a_k, i.e., A = L(S). The definition of S^i implies that L_{i+j}(S) = ⟨L_i(S)L_j(S)⟩ and L_0(S) ⊆ L_1(S) ⊆ · · · ⊆ L_h(S) ⊆ · · · ⊆ L(S) = A. Since A is finite-dimensional, there exists an integer h ≥ 0 such that L_h(S) = L_{h+1}(S).

Definition 2. The number l(S) is called the length of the finite generating set S provided it equals the smallest number h such that L_h(S) = L_{h+1}(S).

Note that if for some h ≥ 0 it holds that L_h(S) = L_{h+1}(S), then

L_{h+2}(S) = ⟨L_1(S)L_{h+1}(S)⟩ = ⟨L_1(S)L_h(S)⟩ = L_{h+1}(S),

and similarly L_i(S) = L_h(S) for all i ≥ h. Thus l(S) is defined correctly. Since S is a generating set for A, it follows that L_{l(S)}(S) = L(S) = A. The following definition is crucial for this paper.

Definition 3. The length of the algebra A, denoted by l(A), is the maximum of the lengths of all its generating sets.

Definition 4. The word v ∈ L_j(S) is called reducible over S if there exists i < j such that v ∈ L_i(S) and L_i(S) ≠ L_j(S).
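Definitions 1–3 are effectively algorithmic: for a concrete generating set S of matrices one can build the spans L_1(S), L_2(S), . . . and stop as soon as the dimension stabilizes; the number of steps taken is l(S). The sketch below is my own illustration, not part of the paper: it works over Q with exact Fraction arithmetic rather than over a general field F, and it treats the algebra as unital, counting the identity as the word of length 0.

```python
from fractions import Fraction

def rank(rows):
    # Gaussian elimination over Q: row-reduce flat vectors and count pivots.
    rows = [list(r) for r in rows]
    r, ncols = 0, (len(rows[0]) if rows else 0)
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def mat_mul(A, B):
    n = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))

def length(S):
    # l(S): the smallest h with L_h(S) = L_{h+1}(S), where L_i(S) is the
    # linear span of all words of length <= i (the identity is the word of
    # length 0, i.e. the algebra is treated as unital).
    n = len(S[0])
    I = tuple(tuple(Fraction(int(i == j)) for j in range(n)) for i in range(n))
    layer = [tuple(tuple(Fraction(x) for x in row) for row in A) for A in S]
    words = [I] + layer                                    # spans L_1(S)
    h, dim = 1, rank([sum(w, ()) for w in words])
    while True:
        layer = [mat_mul(A, W) for A in S for W in layer]  # words of length h+1
        new_dim = rank([sum(w, ()) for w in words + layer])
        if new_dim == dim:
            return h
        words, dim, h = words + layer, new_dim, h + 1
```

For instance, for S = {E_{1,2}, E_{2,1}} ⊂ M_2(Q) it returns 2, witnessing l(M_2(Q)) ≥ 2.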

Notation 4. Let M_n(F) be the full matrix algebra of order n over F, T_n(F) the algebra of n × n upper triangular matrices over F, D_n(F) the algebra of n × n diagonal matrices over F, and N_n(F) the subalgebra of nilpotent matrices in T_n(F).

Notation 5. We denote by E the identity matrix and by E_{i,j} the matrix unit, i.e. the matrix with 1 in position (i, j) and 0 elsewhere.

2 Introduction

The problem of evaluating the length of the full matrix algebra in terms of its order was posed in 1984 by A. Paz in [4] and has not been solved yet. The case of 3 × 3 matrices was studied by Spencer and Rivlin [5], [6] in connection with possible applications in mechanics. Some known upper bounds for the length of the matrix algebra are not linear.

Theorem 6. [4, Theorem 1, Remark 1] Let F be an arbitrary field. Then l(M_n(F)) ≤ ⌈(n² + 2)/3⌉.

Theorem 7. [3, Corollary 3.2] Let F be an arbitrary field. Then l(M_n(F)) ≤ n √(2n²/(n − 1) + 1/4) + n/2 − 2.

In [4] Paz also suggested a linear bound: l(M_n(F)) ≤ 2n − 2.

In this paper we show that the length of an algebra can be smaller than the length of its subalgebra, i.e. that there exist algebras A′ ⊂ A with l(A′) > l(A), and for any natural number k the difference of the lengths can equal l(A′) − l(A) = k (Theorem 9). Also we investigate the ratio between l(A′) and l(A). The question on the possible values of the length ratio remains open in general, but in Sections 3.1 and 3.2 we give some examples of length computation for two and three block upper triangular matrix algebras. Apart from their intrinsic interest, these examples give the following result: for any rational number r ∈ [1, 2] there exist an F-algebra A and a subalgebra A′ of it such that l(A′)/l(A) = r (Corollary 2). We note that there are still very few examples of algebras with exactly evaluated length. In this paper we give some new series of such examples: the algebras A_{n,m}, cf. Theorem 11, and A_{n1,n2,n3}, cf. Theorem 14. In addition, in Section 3.3 we give some examples of algebras A satisfying the inequality l(A) ≥ l(A′) for any subalgebra A′ ⊆ A.

3 On the lengths of an algebra and its subalgebras

Notice that, generally speaking, the length function, unlike the dimension function, can increase when passing from an algebra to its subalgebras. We first consider two types of transformations preserving the length of a generating set.

Proposition 1. Let F be an arbitrary field and let A be a finite-dimensional associative F-algebra. If S = {a_1, . . . , a_k} is a generating set for A and C = (c_{ij}) ∈ M_k(F) is non-singular, then the set of coordinates of the vector

C (a_1, . . . , a_k)^T = (c_{11} a_1 + c_{12} a_2 + . . . + c_{1k} a_k, . . . , c_{k1} a_1 + c_{k2} a_2 + . . . + c_{kk} a_k)^T,   (2)

i.e. the set S_c = {c_{11} a_1 + c_{12} a_2 + . . . + c_{1k} a_k, . . . , c_{k1} a_1 + c_{k2} a_2 + . . . + c_{kk} a_k}, is also a generating set for A and l(S_c) = l(S).

Proof. Let us prove by induction on n that L_n(S) = L_n(S_c) holds for every n. Since any linear combination γ_1 a_1 + . . . + γ_k a_k belongs to L_1(S), we have L_1(S_c) ⊆ L_1(S). The non-singularity of C provides that a_i ∈ L_1(S_c), i = 1, . . . , k, i.e. L_1(S) ⊆ L_1(S_c). Hence L_1(S_c) = L_1(S). Let us take n > 1 and suppose that for n − 1 the equality holds. Then L_n(S) = ⟨L_1(S)L_{n−1}(S)⟩ = ⟨L_1(S_c)L_{n−1}(S_c)⟩ = L_n(S_c).

Proposition 2. Let F be an arbitrary field and let A be a finite-dimensional associative unitary F-algebra. Let S = {a_1, . . . , a_k} be a generating set for A such that 1_A ∉ ⟨a_1, . . . , a_k⟩. Then S_1 = {a_1 + γ_1 1_A, . . . , a_k + γ_k 1_A} is also a generating set for A and l(S_1) = l(S).

Proof. The proof is analogous to that of Proposition 1, but simpler.

For further considerations we need the following class of matrices:

Definition 5. Let F be an arbitrary field. A matrix C ∈ M_n(F) is called nonderogatory provided dim_F(⟨E, C, C², . . . , C^{n−1}⟩) = n.
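Definition 5 can be tested mechanically: flatten E, C, C², . . . , C^{n−1} into vectors and compare the rank of the stack with n. A small sketch over Q follows (the helper names are my own, for illustration only):

```python
from fractions import Fraction

def rank(rows):
    # Gaussian elimination over Q; counts pivots.
    rows = [list(r) for r in rows]
    r, ncols = 0, (len(rows[0]) if rows else 0)
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_nonderogatory(C):
    # Is dim span{E, C, C^2, ..., C^(n-1)} equal to n?
    n = len(C)
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # E
    powers = []
    for _ in range(n):
        powers.append([x for row in P for x in row])                  # flatten
        P = [[sum(P[i][k] * Fraction(C[k][j]) for k in range(n))
              for j in range(n)] for i in range(n)]
    return rank(powers) == n
```

For example, E_{1,2} + E_{2,3} + E_{4,4} ∈ M_4(Q) passes the test, while the identity matrix fails it.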

Lemma 1. [8, Lemma 7.7] Let F be an arbitrary field and let A be a commutative subalgebra of M_n(F). If there exists a nonderogatory matrix A ∈ A, then A is the subalgebra generated by A, and l(A) = n − 1.

Proposition 3. Let F be an arbitrary field and let A_4 ⊂ T_4(F) be the algebra generated by the matrices E, E_{4,4}, E_{1,2}, E_{1,3} and E_{2,3}. Then l(A_4) = 2.


Proof. The dimension of any subalgebra of M_4(F) generated by a single matrix does not exceed 4, but dim_F A_4 = 5. Hence for any generating set S = {A_1, . . . , A_k} for A_4 it holds that k ≥ 2, and if k = 2, then E ∉ ⟨A_1, A_2⟩. If the generating set S contains 3 matrices A_1, A_2, A_3 such that E ∉ ⟨A_1, A_2, A_3⟩, then dim_F L_1(S) ≥ 4 and in this case dim_F L_2(S) = 5, that is l(S) ≤ 2. Let us consider the case when S = {A, B}, E ∉ ⟨A, B⟩. It follows from Proposition 2 that the matrices A and B can be taken in the following form:

A =
( 0  a12  a13   0  )
( 0   0   a23   0  )
( 0   0    0    0  )
( 0   0    0   a44 ) ,

B =
( 0  b12  b13   0  )
( 0   0   b23   0  )
( 0   0    0    0  )
( 0   0    0   b44 ) .

Since S is a generating set, a44 ≠ 0 or b44 ≠ 0. Without loss of generality we will assume that a44 ≠ 0. Then by Proposition 1 we can take b44 = 0. Then A² = a12 a23 E_{1,3} + a44² E_{4,4}, AB = a12 b23 E_{1,3}, BA = a23 b12 E_{1,3}, B² = b12 b23 E_{1,3}, A³ = a44³ E_{4,4}. Other products in A and B of length greater than or equal to 3 are equal to zero. Hence we obtain that for a generating set S the vectors (a12, a23) and (b12, b23) are always linearly independent. But in this case AB ≠ 0 or BA ≠ 0, that is E_{1,3} ∈ L_2(S). Hence E_{4,4} = a44^{−2}(A² − a12 a23 E_{1,3}) ∈ L_2(S), and E_{1,2}, E_{2,3} ∈ ⟨A, B, E_{1,3}, E_{4,4}⟩ ⊆ L_2(S). Consequently, L_2(S) = A_4 and l(S) = 2. That is, l(A_4) = 2.
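The multiplication table used in the proof can be verified directly for sample entries; the short Python check below (my own illustration with arbitrarily chosen nonzero values a12 = 1, a23 = 2, a44 = 3, b12 = b23 = 1) confirms the formulas for A², AB, BA, B² and A³:

```python
def mat_mul(A, B):
    n = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))

def unit(n, i, j):
    # matrix unit E_{i,j} (1-based indices, as in the text)
    return tuple(tuple(int((r, c) == (i - 1, j - 1)) for c in range(n))
                 for r in range(n))

def add(X, Y):
    return tuple(tuple(a + b for a, b in zip(rx, ry)) for rx, ry in zip(X, Y))

def scale(c, X):
    return tuple(tuple(c * a for a in row) for row in X)
```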

Example 1. Let F be an arbitrary field and let A_4 ⊂ T_4(F) be generated by the matrices E, E_{4,4}, E_{1,2}, E_{1,3} and E_{2,3}. There exists a subalgebra A′ of A_4 generated by the nonderogatory matrix A = E_{1,2} + E_{2,3} + E_{4,4}, and l(A′) = 3 > 2 = l(A_4).

Corollary 1. For all n ≥ 4 and for any field F with the number of elements greater than n − 4 there exist subalgebras A′_n ⊂ A_n ⊂ M_n(F) such that l(A′_n) = n − 1 > l(A_n).

Proof. Let f_4, . . . , f_n ∈ F be distinct nonzero elements. Consider the following subalgebras:

A_n = ⟨{E_{1,1} + E_{2,2} + E_{3,3}, E_{1,2}, E_{1,3}, E_{2,3}, E_{4,4}, . . . , E_{n,n}}⟩,

A′_n = ⟨E_{1,2} + E_{2,3} + ∑_{i=4}^{n} f_i E_{i,i}⟩.

Then l(A′_n) = n − 1, since the matrix E_{1,2} + E_{2,3} + ∑_{i=4}^{n} f_i E_{i,i} is nonderogatory. It follows from [8, Theorem 4.5] that l(D_{n−4}(F)) = n − 5. Consequently, l(A_n) = l(A_4 ⊕ D_{n−4}(F)) ≤ 2 + (n − 5) + 1 = n − 2 by successive application of Example 1 and Theorem 8.


Thus Proposition 3 and Example 1 provide a positive answer to the question whether the length of an algebra can be smaller than the length of its subalgebra. Consequently, the next natural question is: what values can be taken by the difference and the ratio of the lengths of an algebra and of its subalgebra?

Let (A, A′) be a pair, where A is an F-algebra over an arbitrary field F and A′ ⊆ A is its subalgebra. The next theorem shows that there exist families of such pairs such that l(A′) > l(A) and the difference can be arbitrarily large, and thus answers the first question. The second question is considered in the next two sections.

Theorem 9. For any natural number k there exist a number n and algebras A′ ⊂ A ⊂ M_n(F) such that l(A′) − l(A) = k.

Proof. Example 2 and Proposition 4 below give an explicit construction of a pair (A, A′) of F-algebras such that A′ ⊆ A and l(A′) − l(A) = k. This construction is based on Example 1.

Example 2. Let F be a sufficiently large field, let k be a fixed positive number and n = 4k. Let A = A_4 ⊕ . . . ⊕ A_4 (k times) ⊂ M_n(F), A_i = E_{4i−3,4i−3} + E_{4i−3,4i−2} + E_{4i−2,4i−2} + E_{4i−2,4i−1} + E_{4i−1,4i−1}, i = 1, . . . , k, and let us assign

A′ = ⟨ ∑_{i=1}^{k} (a_i A_i + b_i E_{4i,4i}) ⟩ ⊂ A,

where a_i, b_i, i = 1, . . . , k, are distinct nonzero elements from F. Then l(A) = 3k − 1 as shown below, while l(A′) = n − 1 = 4k − 1 by Lemma 1.

Proposition 4. Let k be a fixed natural number and let F be a field with |F| ≥ 2k + 1. Let A = A_4 ⊕ . . . ⊕ A_4 (k times) ⊂ M_n(F). Then l(A) = 3k − 1.

Proof. It follows from Theorem 8 and Example 1 that l(A) ≤ 2k + k − 1 = 3k − 1. Consider the generating set

S_A = { A = ∑_{i=1}^{k} (α_i (A_i − E_{4i−2,4i−1}) + β_i E_{4i,4i}), E_{4j−2,4j−1}, j = 1, . . . , k },

where α_i, β_i, i = 1, . . . , k, are distinct nonzero elements from F and A_i = E_{4i−3,4i−3} + E_{4i−3,4i−2} + E_{4i−2,4i−2} + E_{4i−2,4i−1} + E_{4i−1,4i−1}, i = 1, . . . , k. Since A E_{4j−2,4j−1} = α_j (E_{4j−3,4j−1} + E_{4j−2,4j−1}), E_{4j−2,4j−1} A = α_j E_{4j−2,4j−1}, and the degree of the minimal polynomial of A is 3k, then l(S_A) = 3k − 1 = l(A).

3.1 Two block subalgebras in upper triangular matrix algebra

We note that in Example 2 the value m = l(A′) − l(A) is an arbitrary number; however, the ratio r = (l(A′) + 1) : (l(A) + 1) = 4 : 3 is a constant. The main aim of this and the next section is to show that for any rational number r ∈ [1, 2] there exist an F-algebra A and a subalgebra A′ of it such that l(A′)/l(A) = r. In this section we consider the following 2-parametric family of algebras A_{n,m}:

A_{n,m} = ⟨ E, ∑_{i=1}^{n} E_{i,i}, E_{i,j} : 1 ≤ i < j ≤ n, or n + 1 ≤ i < j ≤ n + m ⟩ ⊂ T_{m+n}(F),

where n ≥ m are natural numbers, over an arbitrary field F. We compute their lengths explicitly and find the subalgebra A′_{n,m} with l(A′_{n,m}) > l(A_{n,m}) in each algebra of this family; then, choosing appropriate values of the parameters n and m, we obtain the required behavior of the ratio l(A′_{n,m})/l(A_{n,m}), see Corollary 2.

Remark 1. The aforementioned constructions generalize Example 1; namely, we obtain a series of algebras A(n) = A_{n,m} and their subalgebras A′(n) with the fixed length difference m, for which the length ratio r = r(n) is a non-constant linear-fractional function.

Remark 2. The algebra A_4 described in Example 1 coincides with A_{3,1}.

Notation 10. Any A ∈ A_{n,m} is of the following form:

A =
( A′  0  )
( 0   A″ ) ,

where A′ ∈ T_n(F), A″ ∈ T_m(F). From now on we will use the notation A = A′ ⊕ A″.

In the following two lemmas we mark special elements in generating sets which are significant for the computation of the length of A_{n,m}.

Lemma 2. Let n ∈ N, n ≥ 3, and let S be a generating set for A_{n,m}. Then there exists a generating set S̃ for A_{n,m} such that the following conditions hold:

1. dim L_1(S̃) = |S̃| + 1;
2. there exists a matrix A_0 = A_0′ ⊕ A_0″ ∈ S̃ such that A_0′ = ∑_{1≤i<j≤n} a_{i,j} E_{i,j} and A_0″ = ∑_{1≤i<j≤m} a_{i+n,j+n} E_{i+n,j+n} + E;
3. for all S ∈ S̃, S ≠ A_0, it holds that (S)_{i,i} = 0, i = 1, . . . , n + m;
4. there exist B_1, . . . , B_{n−1} ∈ S̃ such that either
(i) for all r = 1, . . . , n − 1: B_r′ = E_{r,r+1} + ∑_{i=1}^{n} ∑_{j=i+2}^{n} b_{i,j;r} E_{i,j}, B_r″ ∈ N_m(F),
or
(ii) there exists k ∈ {1, . . . , n − 1} such that B_r′ = E_{r,r+1} + b_r E_{k,k+1} + ∑_{i=1}^{n} ∑_{j=i+2}^{n} b_{i,j;r} E_{i,j}, B_r″ ∈ N_m(F) for r = 1, . . . , n − 1, r ≠ k, and B_k′ = A_0′ = a_k E_{k,k+1} + ∑_{i=1}^{n} ∑_{j=i+2}^{n} â_{i,j} E_{i,j} with a_k ≠ 0, B_k″ = A_0″ = ∑_{1≤i<j≤m} â_{i+n,j+n} E_{i+n,j+n} + E;
5. l(S̃) = l(S).

Proof. Let us successively transform the set S into a generating set satisfying conditions 1–4.

1. This condition is equivalent to the fact that all elements of S are linearly independent and E ∉ ⟨S⟩. Otherwise we remove all redundant elements from S. Notice that this does not change the length of S.

2. In order to prove the existence of A_0 it is sufficient to show that S contains a matrix with two distinct eigenvalues. But if all matrices in S had only one eigenvalue, then S would not be a generating set for A_{n,m}.

3. Proposition 2 allows us to transform the given generating set into a generating set S′ = {S − (S)_{1,1} E : S ∈ S} preserving its length. Then by Proposition 1 the transformation of S′ into a generating set S″ = {A_0, S − (S)_{n+1,n+1} A_0 : S ∈ S′, S ≠ A_0} also does not change its length. For the sake of simplicity of the subsequent text let us redenote S″ by S.

4. Since E_{i,i+1} ∈ A_{n,m}, but for any t ≥ 2 and S ∈ S^t \ S the coefficient (S)_{i,i+1} = 0, i = 1, . . . , n − 1, there exist B_1, . . . , B_{n−1} ∈ S such that the vectors u_i = ((B_i)_{1,2}, (B_i)_{2,3}, . . . , (B_i)_{n−1,n}), i = 1, . . . , n − 1, are linearly independent. Let us next perform the following transformation F of the set S (by Proposition 1, F preserves the length of S), which is the identity on all elements S ∈ S, S ≠ B_i, i = 1, . . . , n − 1, i.e. F(S) = S, and whose action on the matrices B_j, j = 1, . . . , n − 1, depends on whether A_0 belongs to this set, as follows:

(i) Assume that A_0 ∉ {B_1, . . . , B_{n−1}}. There exists a non-singular linear transformation F = (f_{i,j}) ∈ M_{n−1}(F) that maps the set {u_i} onto the set

{e_1 = (1, 0, . . . , 0), e_2 = (0, 1, . . . , 0), . . . , e_{n−1} = (0, 0, . . . , 1)} ⊂ F^{n−1},

i.e. e_i = ∑_{j=1}^{n−1} f_{i,j} u_j. Then let us assign F(B_r) = ∑_{j=1}^{n−1} f_{r,j} B_j. Then F(B_r)′ = E_{r,r+1} + ∑_{i=1}^{n} ∑_{j=i+2}^{n} b_{i,j;r} E_{i,j}.

(ii) Assume that A_0 ∈ {B_1, . . . , B_{n−1}}, i.e. A_0 = B_p for some p ∈ {1, . . . , n − 1}. Since any matrix in M_{n−1,n−2}(F) of rank n − 2 contains a non-singular submatrix of order n − 2, there exists a number k ∈ {1, . . . , n − 1} such that the vectors v_i = ((B_i)_{1,2}, . . . , (B_i)_{k−1,k}, (B_i)_{k+1,k+2}, . . . , (B_i)_{n−1,n}), i = 1, . . . , n − 1, i ≠ p, are linearly independent. Since the matrices B_j were numbered arbitrarily, we may assume that p = k. There exists a non-singular linear transformation G = (g_{i,j}) ∈ M_{n−2}(F) that maps the set {v_i} onto the set

{e_1 = (1, 0, . . . , 0), e_2 = (0, 1, . . . , 0), . . . , e_{n−2} = (0, 0, . . . , 1)} ⊂ F^{n−2},

i.e. e_i = ∑_{j=1}^{k−1} g_{i,j} v_j + ∑_{j=k}^{n−2} g_{i,j} v_{j+1}. Then let us assign

F(B_r) = ∑_{j=1}^{k−1} g_{r,j} B_j + ∑_{j=k}^{n−2} g_{r,j} B_{j+1}, r ≠ k,   F(A_0) = A_0 − ∑_{i=1, i≠k}^{n−1} (A_0)_{i,i+1} F(B_i).

Then

F(B_r)′ = E_{r,r+1} + b_r E_{k,k+1} + ∑_{i=1}^{n} ∑_{j=i+2}^{n} b_{i,j;r} E_{i,j} if r ≠ k,

F(A_0)′ = a_k E_{k,k+1} + ∑_{i=1}^{n} ∑_{j=i+2}^{n} â_{i,j} E_{i,j}, a_k ≠ 0,

F(A_0)″ = ∑_{1≤i<j≤m} â_{i+n,j+n} E_{i+n,j+n} + E.

For the sake of simplicity of the subsequent text let us redenote F(A_0) and F(B_r) by A_0 and B_r, correspondingly. Then the generating set S̃ = F(S) is of the required type.

5. The transformations applied to the set S in paragraphs 1–4 did not change its length; consequently, the length of the new generating set is equal to the length of the initial one.

Lemma 3. Let n ≥ 3, m ≥ 2 and let S be a generating set for A_{n,m} satisfying conditions 1–4 of Lemma 2. Then one of the following conditions holds:

(i) there exist C, C_1, . . . , C_{m−1} ∈ ⟨S \ A_0⟩ such that

C_r″ = E_{r+n,r+n+1} + ∑_{i=1}^{m} ∑_{j=i+2}^{m} c_{i+n,j+n;r} E_{i+n,j+n}, r = 1, . . . , m − 1,

and if Ã_0 = E − A_0 + C then Ã_0″ = ∑_{i=1}^{m} ∑_{j=i+2}^{m} ã_{i+n,j+n} E_{i+n,j+n};

or

(ii) there exist C, C_r ∈ ⟨S \ A_0⟩, r = 1, . . . , m − 1, r ≠ s, such that

C_r″ = E_{r+n,r+n+1} + c_r E_{s+n,s+n+1} + ∑_{i=1}^{m} ∑_{j=i+2}^{m} c_{i+n,j+n;r} E_{i+n,j+n},

and if Ã_0 = E − A_0 + C then

Ã_0″ = ã_s E_{s+n,s+n+1} + ∑_{i=1}^{m} ∑_{j=i+2}^{m} ã_{i+n,j+n} E_{i+n,j+n}, ã_s ≠ 0,

where s ∈ {1, . . . , m − 1} (in this case let us assign C_s = Ã_0).

Proof. Since all S_1, S_2 ∈ S and all i = n + 1, . . . , n + m − 1 satisfy (S_1 S_2)_{i,i+1} = 0 if S_1 ≠ A_0, S_2 ≠ A_0, and

(S_1 A_0)_{i,i+1} = (S_1)_{i,i+1}, (A_0 S_2)_{i,i+1} = (S_2)_{i,i+1} and (A_0²)_{i,i+1} = 2(A_0)_{i,i+1},

there exist C̃_1, . . . , C̃_{m−1} ∈ S such that the vectors

w_i = ((C̃_i)_{n+1,n+2}, (C̃_i)_{n+2,n+3}, . . . , (C̃_i)_{n+m−1,n+m}), i = 1, . . . , m − 1,

are linearly independent. The matrices C_i can be obtained from C̃_i, i = 1, . . . , m − 1, by applying transformations similar to those in paragraph 4 of Lemma 2. Assign C = ∑_{i=1}^{m−1} â_{i+n,i+n+1} C_i. That is, (C)_{i+n,i+n+1} = (A_0)_{i+n,i+n+1} for all i = 1, . . . , m − 1.

Further length computation of A_{n,m} is carried out separately for different values of n and m.

Lemma 4. Let F be an arbitrary field. Then l(A_{1,1}) = 1 and l(A_{2,2}) = 3.

Proof. 1. The algebra A_{1,1} = D_2(F). Consequently, it follows from [8, Theorem 4.5] that l(A_{1,1}) = 1.

2. The algebra A_{2,2} is generated by the nonderogatory matrix E_{1,1} + E_{1,2} + E_{2,2} + E_{3,4}. Consequently, by Lemma 1 we have l(A_{2,2}) = 3.

Lemma 5. Let F be an arbitrary field. Then l(A_{2,1}) = 2 and l(A_{3,2}) = 3.

Proof. 1. The algebra A_{2,1} is generated by the nonderogatory matrix E_{1,1} + E_{1,2} + E_{2,2}. Consequently, by Lemma 1 we have l(A_{2,1}) = 2.

2. Let S be a generating set for A_{3,2}. Without loss of generality we assume S to satisfy conditions 1–4 of Lemma 2 and therefore one of the conditions of Lemma 3. Let us show that L_3(S) = A_{3,2}. We have B_1 B_2 (E − A_0) = a E_{1,3}, a ≠ 0, B_1 (E − A_0)² = b_{11} E_{1,2} + b_{12} E_{2,3} + b_{13} E_{1,3}, B_2 (E − A_0)² = b_{21} E_{1,2} + b_{22} E_{2,3} + b_{23} E_{1,3}, with linearly independent vectors (b_{11}, b_{12}) and (b_{21}, b_{22}). Further, C_1 A_0² = b E_{4,5} + c E_{1,3}, b ≠ 0. Then E_{4,4} + E_{5,5} = A_0 − (A_0)_{1,2} E_{1,2} − (A_0)_{1,3} E_{1,3} − (A_0)_{2,3} E_{2,3} − (A_0)_{4,5} E_{4,5} ∈ L_3(S). Consider S = {A = E_{1,2}, B = E_{2,3} + E_{4,4} + E_{4,5} + E_{5,5}, E}. It follows from the equations A² = 0, AB = E_{1,3}, BA = 0, B² = E_{4,4} + 2E_{4,5} + E_{5,5} and B³ − B² = E_{4,5} that l(S) = 3 = l(A_{3,2}).
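The concluding equations for the set S = {E_{1,2}, E_{2,3} + E_{4,4} + E_{4,5} + E_{5,5}, E} can be checked by direct multiplication. A short Python illustration (the helper names are mine):

```python
def mat_mul(A, B):
    n = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))

def unit(n, i, j):
    # matrix unit E_{i,j} (1-based indices, as in the text)
    return tuple(tuple(int((r, c) == (i - 1, j - 1)) for c in range(n))
                 for r in range(n))

def add(*Ms):
    n = len(Ms[0])
    return tuple(tuple(sum(M[i][j] for M in Ms) for j in range(n))
                 for i in range(n))
```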


Lemma 6. Let F be an arbitrary field, let n, m ∈ N and n − m ≥ 2. Then l(A_{n,m}) = n − 1.

Proof. Let us first prove the upper bound l(A_{n,m}) ≤ n − 1. Consider a generating set S for A_{n,m}. Without loss of generality we assume S satisfies conditions 1–4 of Lemma 2.

1. We use induction on p = n − (j − i) to prove that E_{i,j} ∈ L_{n−1}(S) for 1 ≤ i < j ≤ n, j − i ≥ 2. If p = 1, then B_1 B_2 . . . B_{n−1} = (a_k)^t E_{1,n} ∈ L_{n−1}(S), t ∈ {0, 1}, since B_1″ B_2″ . . . B_{n−1}″ = 0 as a product of n − 1 nilpotent matrices of order m ≤ n − 2 when t = 0, and as a product of n − 2 nilpotent and one unitriangular matrices of order m ≤ n − 2 when t = 1. Consider the following matrix products:

B_{j,j+n−p−1} = B_j B_{j+1} . . . B_{j+n−p−1} (E − A_0)^{p−1} ∈ L_{n−1}(S), j = 1, . . . , p.

We have

B_{j,j+n−p−1}′ = (a_k)^t E_{j,j+n−p} + ∑_{h=1}^{n} ∑_{i=h+n−p+1}^{n} d_{h,i;j,p} E_{h,i}, t ∈ {0, 1},

and B_{j,j+n−p−1}″ = 0 as a product of n − 1 nilpotent matrices of order m ≤ n − 2 when t = 0, and as a product of n − 2 nilpotent and one unitriangular matrices of order m ≤ n − 2 when t = 1. Applying the induction hypothesis we obtain that E_{i,i+n−q−1} ∈ L_{n−1}(S) for all q = 1, . . . , p − 1, i = 1, . . . , q. Hence B_{j,j+n−p−1} − (a_k)^t E_{j,j+n−p} ∈ L_{n−1}(S). Since by definition it holds that B_{j,j+n−p−1} ∈ L_{n−1}(S), then

E_{j,j+n−p} = (a_k)^{−t} (B_{j,j+n−p−1} − (B_{j,j+n−p−1} − (a_k)^t E_{j,j+n−p})) ∈ L_{n−1}(S), j = 1, . . . , p.

2. Let us now consider B_{j,j} = B_j (E − A_0)^{n−2} ∈ L_{n−1}(S), j = 1, . . . , n − 1. It follows immediately that (B_{j,j})_{r,r+1} = (B_j)_{r,r+1}, j, r = 1, . . . , n − 1, that is

B_{j,j}′ = E_{j,j+1} + γ_j E_{k,k+1} + ∑_{h=1}^{n} ∑_{i=h+2}^{n} d_{h,i;j} E_{h,i}, j ≠ k,

B_{k,k}′ = a_k^t E_{k,k+1} + ∑_{h=1}^{n} ∑_{i=h+2}^{n} d_{h,i;k} E_{h,i}, t ∈ {0, 1},

and B_{r,r}″ = 0 as a product of n − 1 nilpotent matrices of order m ≤ n − 2 when r ≠ k, and as a product of n − 2 nilpotent and one unitriangular matrices of order m ≤ n − 2 when r = k.

It follows from paragraph 1 that ∑_{h=1}^{n} ∑_{i=h+2}^{n} d_{h,i;j} E_{h,i} ∈ L_{n−1}(S), j = 1, . . . , n. Therefore

E_{k,k+1} = a_k^{−t} (B_{k,k} − ∑_{h=1}^{n} ∑_{i=h+2}^{n} d_{h,i;k} E_{h,i}) ∈ L_{n−1}(S).

Then

E_{j,j+1} = B_{j,j} − ∑_{h=1}^{n} ∑_{i=h+2}^{n} d_{h,i;j} E_{h,i} − γ_j E_{k,k+1} ∈ L_{n−1}(S).

Consequently, E_{i,j} ∈ L_{n−1}(S), 1 ≤ i < j ≤ n. Hence for any N ∈ N_n(F) it holds that N ⊕ 0 ∈ L_{n−1}(S).

3. Let S_1, . . . , S_n ∈ S and assume there exists some S_i ≠ A_0. It follows from [7, Equation (1)] that there exists V ∈ L_{n−1}(S) such that S_1 · · · S_n + V = S′ ⊕ 0, S′ ∈ N_n(F), but it follows from paragraphs 1 and 2 that S′ ⊕ 0 ∈ L_{n−1}(S). Therefore S_1 · · · S_n is reducible. By the Cayley–Hamilton Theorem it holds that (A_0″)^{m+1} ∈ ⟨A_0″, (A_0″)², . . . , (A_0″)^m⟩. Consequently, there exists V_A ∈ L_{n−1}(S) such that A_0^n + V_A = A′ ⊕ 0, A′ ∈ N_n(F), but it follows from paragraphs 1 and 2 that A′ ⊕ 0 ∈ L_{n−1}(S). Therefore A_0^n is also reducible. So any word of length n in elements of S is reducible, therefore L_n(S) = L_{n−1}(S) and l(S) ≤ n − 1.

By Theorem 8 we obtain that l(A_{n,m}) ≥ n − 1. Consequently, l(A_{n,m}) = n − 1.

Lemma 7. Let F be an arbitrary field, n ∈ N and n > 3. Then l(A_{n,n−1}) = n − 1.

Proof. Let us first prove the upper bound l(A_{n,n−1}) ≤ n − 1. Let S be a generating set for A_{n,n−1}. Without loss of generality we assume S to satisfy conditions 1–4 of Lemma 2 and therefore one of the conditions of Lemma 3.

1. We use induction on p = n − (j − i) to prove that E_{i,j} ∈ L_{n−1}(S) for 1 ≤ i < j ≤ n, j − i ≥ 2. Consider the case p = 1.

(i) Assume that there is no number k such that A_0 = B_k. Then we obtain B_1 B_2 · · · B_{n−1} = E_{1,n} ∈ L_{n−1}(S), since B_1″ B_2″ · · · B_{n−1}″ = 0 as a product of n − 1 nilpotent matrices of order n − 1. Also it holds that C_1 · · · C_{n−2} A_0 = a E_{n+1,2n−1} + b E_{1,n}, a ≠ 0, that is E_{n+1,2n−1} ∈ L_{n−1}(S).

(ii) Assume now that there exists a number k such that A_0 = B_k. Then we obtain B_1 B_2 · · · B_{n−1} = a_k E_{1,n} + α E_{n+1,2n−1}. Assume that α = 0. It follows from the equalities (C_1 · · · C_{n−2})″ = a E_{1,n−1}, a ≠ 0, and (C_1 · · · C_{n−2})′ = β_1 E_{1,n−2} + β_2 E_{1,n−1} + β_3 E_{1,n} + β_4 E_{2,n−1} + β_5 E_{2,n} + β_6 E_{3,n} ∈ N_n(F) for n > 3 that if k = n − 1 then A_0 C_1 · · · C_{n−2} = a E_{n+1,2n−1}, and if k ≠ n − 1 then C_1 · · · C_{n−2} A_0 = a E_{n+1,2n−1}; consequently, E_{1,n}, E_{n+1,2n−1} ∈ L_{n−1}(S). Assume now that α ≠ 0. Then (B_1 · · · B_{k−1} B_{k+1} · · · B_{n−1})″ = α E_{1,n−1} and (B_1 · · · B_{k−1} B_{k+1} · · · B_{n−1})′ = β_1 E_{1,n−1} + β_2 E_{1,n} + β_3 E_{2,n}. Since n > 3, k ≠ 1 or k ≠ n − 1. If k ≠ 1 we obtain that A_0 B_1 · · · B_{k−1} B_{k+1} · · · B_{n−1} =


α E_{n+1,2n−1}, and if k ≠ n − 1 we obtain that B_1 · · · B_{k−1} B_{k+1} · · · B_{n−1} A_0 = α E_{n+1,2n−1}; consequently, E_{1,n}, E_{n+1,2n−1} ∈ L_{n−1}(S). Therefore in all cases it holds that E_{1,n}, E_{n+1,2n−1} ∈ L_{n−1}(S).

Consider the matrices B_{j,j+n−p−1} ∈ L_{n−1}(S), j = 1, . . . , p, defined in Lemma 6. Here B_{j,j+n−p−1}″ = b(j,p) E_{n+1,2n−1}, b(j,p) ∈ F, as a product of n − 1 nilpotent matrices of order n − 1 when t = 0, and as a product of n − 2 nilpotent and one unitriangular matrices of order n − 1 when t = 1. Hence, using the induction hypothesis and arguments similar to those of paragraph 1 of Lemma 6, we obtain that E_{j,j+n−p} ∈ L_{n−1}(S), j = 1, . . . , p.

2. Consider B_{j,j} ∈ L_{n−1}(S), j = 1, . . . , n − 1, defined in Lemma 6. It follows immediately that B_{r,r}″ = b(r) E_{n+1,2n−1}, b(r) ∈ F, as a product of n − 1 nilpotent matrices of order n − 1 when r ≠ k, and as a product of n − 2 nilpotent and one unitriangular matrices of order n − 1 when r = k. Hence, using arguments similar to those of paragraph 2 of Lemma 6, we obtain that E_{j,j+1} ∈ L_{n−1}(S). Consequently, E_{i,j} ∈ L_{n−1}(S), 1 ≤ i < j ≤ n. Hence for any N ∈ N_n(F) it holds that N ⊕ 0 ∈ L_{n−1}(S). Therefore, as was shown in paragraph 3 of Lemma 6, any word of length n in elements of S is reducible, thus L_n(S) = L_{n−1}(S) and l(S) ≤ n − 1. Then l(A_{n,n−1}) ≤ n − 1.

By Theorem 8 we obtain that l(A_{n,n−1}) ≥ n − 1. Consequently, l(A_{n,n−1}) = n − 1.

Lemma 8. Let F be a field, n ∈ N and n > 2. Then l(A_{n,n}) = n.

Proof. I. Let us first prove the upper bound l(A_{n,n}) ≤ n. Let S be a generating set for A_{n,n}. Without loss of generality we assume S to satisfy conditions 1–4 of Lemma 2 and therefore one of the conditions of Lemma 3.

1. We use induction on p = n − (j − i) to prove that E_{i,j}, E_{i+n,j+n} ∈ L_n(S) for 1 ≤ i < j ≤ n, j − i ≥ 2. Consider the case p = 1. Assume that there does not exist a number k such that A_0 = B_k. Then we obtain B_1 B_2 · · · B_{n−1}(E − A_0) = E_{1,n} ∈ L_n(S), since B_1″ B_2″ · · · B_{n−1}″ (E − A_0)″ = 0 as a product of n nilpotent matrices of order n. Assume now that there exists a number k such that A_0 = B_k. Then we obtain B_1 B_2 · · · B_{n−1} = a_k E_{1,n} + α_1 E_{n+1,2n−1} + α_2 E_{n+1,2n} + α_3 E_{n+2,2n}. Notice that, since n = m > 2, if condition (ii) of Lemma 3 holds, then the number s introduced there satisfies one of the inequalities s ≠ 1 or s ≠ n − 1; and if condition (i) of Lemma 3 holds, both inequalities hold true. If s ≠ 1 then Ã_0 B_1 B_2 · · · B_{n−1} = E_{1,n}, and if s ≠ n − 1 then B_1 B_2 · · · B_{n−1} Ã_0 = E_{1,n}, that is, E_{1,n} ∈ L_n(S). Also it holds that C_1 · · · C_{n−1} A_0 = a E_{1,n} + b E_{n+1,2n} ∈ L_n(S), b ≠ 0, therefore E_{n+1,2n} = b^{−1}(C_1 · · · C_{n−1} A_0 − a E_{1,n}) ∈ L_n(S). Consider the following matrix products: B_{j,j+n−p−1} = B_j B_{j+1} · · · B_{j+n−p−1}(E − A_0)^p ∈ L_n(S), j = 1, . . . , p.

We have

B_{j,j+n−p−1}′ = a_k^t E_{j,j+n−p} + ∑_{h=1}^{n} ∑_{i=h+n−p+1}^{n} b_{h,i;j,p} E_{h,i}, t ∈ {0, 1},

and B_{j,j+n−p−1}″ = b(j,p) E_{n+1,2n}, b(j,p) ∈ F, as a product of n nilpotent matrices of order n when t = 0, and as a product of n − 1 nilpotent and one unitriangular matrices of order n when t = 1. Consider also C_{j,j+n−p−1} = C_j C_{j+1} · · · C_{j+n−p−1} A_0^p ∈ L_n(S), j = 1, . . . , p. Here

C_{j,j+n−p−1}″ = (ã_s)^t E_{j+n,j+2n−p} + ∑_{h=1}^{n} ∑_{i=h+n−p+1}^{n} c_{h,i;j,p} E_{h+n,i+n}, t ∈ {0, 1},

and C_{j,j+n−p−1}′ = c(j,p) E_{1,n} as a product of n nilpotent matrices of order n when t = 0, and as a product of n − 1 nilpotent and one unitriangular matrices of order n when t = 1. Applying the induction hypothesis we obtain that

E_{i,i+n−q−1}, E_{i+n,i+2n−q−1} ∈ L_n(S) for all q = 2, . . . , p − 1, i = 1, . . . , q,

and E_{1,n}, E_{n+1,2n} ∈ L_n(S) as was shown above. Therefore

B_{j,j+n−p−1} − (a_k)^t E_{j,j+n−p}, C_{j,j+n−p−1} − (ã_s)^t E_{j+n,j+2n−p} ∈ L_n(S).

Since by definition it holds that B_{j,j+n−p−1}, C_{j,j+n−p−1} ∈ L_n(S), then

E_{j,j+n−p} = (a_k)^{−t}(B_{j,j+n−p−1} − (B_{j,j+n−p−1} − (a_k)^t E_{j,j+n−p})) ∈ L_n(S),
E_{j+n,j+2n−p} = (ã_s)^{−t}(C_{j,j+n−p−1} − (C_{j,j+n−p−1} − (ã_s)^t E_{j+n,j+2n−p})) ∈ L_n(S),

j = 1, . . . , p.

2. Consider next B_{j,j} = B_j(E − A_0)^{n−1} ∈ L_n(S) and C_{j,j} = C_j A_0^{n−1} ∈ L_n(S), j = 1, . . . , n − 1. It follows immediately that

B_{j,j}′ = E_{j,j+1} + β_j E_{k,k+1} + ∑_{h=1}^{n} ∑_{i=h+2}^{n} b_{h,i;j,n−1} E_{h,i}, j ≠ k,

B_{k,k}′ = (a_k)^t E_{k,k+1} + ∑_{h=1}^{n} ∑_{i=h+2}^{n} b_{h,i;k,n−1} E_{h,i}, t ∈ {0, 1},

C_{j,j}″ = E_{j+n,j+n+1} + γ_j E_{s+n,s+n+1} + ∑_{h=1}^{n} ∑_{i=h+2}^{n} c_{h,i;j,n−1} E_{h+n,i+n}, j ≠ s,

C_{s,s}″ = (ã_s)^t E_{s+n,s+n+1} + ∑_{h=1}^{n} ∑_{i=h+2}^{n} c_{h,i;s,n−1} E_{h+n,i+n}, t ∈ {0, 1},

and B_{r,r}″ = b(r) E_{n+1,2n} as a product of n nilpotent matrices of order n when t = 0, and as a product of n − 1 nilpotent and one unitriangular matrices of order n when t = 1, while C_{r,r}′ = c(r) E_{1,n} as a product of n − 1 nilpotent and one unitriangular matrices of order n.

It follows from paragraph 1 that

B_{k,k} − (a_k)^t E_{k,k+1}, C_{s,s} − (ã_s)^t E_{s+n,s+n+1} ∈ L_n(S).

Therefore E_{k,k+1}, E_{s+n,s+n+1} ∈ L_n(S). Then

E_{j,j+1} = B_{j,j} − ∑_{h=1}^{n} ∑_{i=h+2}^{n} b_{h,i;j,n−1} E_{h,i} − β_j E_{k,k+1} − b(j) E_{n+1,2n} ∈ L_n(S),

E_{j+n,j+n+1} = C_{j,j} − ∑_{h=1}^{n} ∑_{i=h+2}^{n} c_{h,i;j,n−1} E_{h+n,i+n} − γ_j E_{s+n,s+n+1} − c(j) E_{1,n} ∈ L_n(S).

Consequently, E_{j,j+n−p}, E_{j+n,j+2n−p} ∈ L_n(S), j = 1, . . . , p.

3. Then it holds that

0 ⊕ E = A_0 − a_k E_{k,k+1} − ∑_{i=1}^{n} ∑_{j=i+2}^{n} â_{i,j} E_{i,j} − ∑_{1≤i<j≤n} â_{i+n,j+n} E_{i+n,j+n},

that is, 0 ⊕ E ∈ L_n(S). Hence any generating set S satisfies L_n(S) = A_{n,n}, therefore l(A_{n,n}) ≤ n.

II. Let us construct a generating set for A_{n,n} of length n. Let

S_n = { A_i = E_{i,i+1} + E_{n+i,n+i+1}, i = 1, . . . , n − 1, E, E_n = ∑_{j=1}^{n} E_{j,j} }.

Since A_i A_j = 0 for j ≠ i + 1 and E_n A_i = A_i E_n = E_{i,i+1}, we have E_{1,n} ∉ L_{n−1}(S_n), where

L_{n−1}(S_n) = ⟨ E, E_n, E_{i,j}, E_{i+n,j+n}, E_{1,n} + E_{n+1,2n} : 1 ≤ i < j ≤ n, j − i ≤ n − 2 ⟩,

but E_{1,n} = A_1 · · · A_{n−1} E_n ∈ L_n(S_n), and therefore L_n(S_n) = A_{n,n}. Consequently, l(A_{n,n}) = n.

The combination of Lemmas 4–8 implies

Theorem 11. Let F be an arbitrary field, let n ≥ m be natural numbers and let

A_{n,m} = ⟨ E, ∑_{i=1}^{n} E_{i,i}, E_{i,j} : 1 ≤ i < j ≤ n, or n + 1 ≤ i < j ≤ n + m ⟩ ⊂ T_{m+n}(F).

Then

l(A_{n,m}) =
  n − 1, for n − m ≥ 2;
  n − 1, for n = m + 1, n > 3;
  n + 1, for n = m = 2;
  n, for n = m ≠ 2;
  n, for n = m + 1, m = 1, 2.
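The length of an algebra is a maximum over all generating sets, so a brute-force program cannot certify the values in Theorem 11 directly, but witness generating sets attaining them are easy to verify: the nonderogatory generator of A_{2,2} from Lemma 4 has length 3 = n + 1, and the set S_n from part II of the proof of Lemma 8 has length n. A Python sketch over Q follows (my own illustrative helper, not from the paper; it spans words of growing length until the dimension stabilizes):

```python
from fractions import Fraction

def rank(rows):
    # Gaussian elimination over Q: row-reduce flat vectors and count pivots.
    rows = [list(r) for r in rows]
    r, ncols = 0, (len(rows[0]) if rows else 0)
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def mat_mul(A, B):
    n = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))

def length(S):
    # l(S): smallest h with L_h(S) = L_{h+1}(S); identity = word of length 0.
    n = len(S[0])
    I = tuple(tuple(Fraction(int(i == j)) for j in range(n)) for i in range(n))
    layer = [tuple(tuple(Fraction(x) for x in row) for row in A) for A in S]
    words = [I] + layer
    h, dim = 1, rank([sum(w, ()) for w in words])
    while True:
        layer = [mat_mul(A, W) for A in S for W in layer]
        new_dim = rank([sum(w, ()) for w in words + layer])
        if new_dim == dim:
            return h
        words, dim, h = words + layer, new_dim, h + 1
```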


The following corollary shows in particular that the length ratio for a two block algebra and its subalgebra can take on any rational value in [1, 2].

Corollary 2. Let F be an arbitrary field and let n ≥ m be fixed natural numbers. Let

C_{n,m} = ∑_{i=1}^{n−1} E_{i,i+1} + ∑_{j=1}^{m−1} (E_{j+n,j+n} + E_{j+n,j+n+1}) + E_{n+m,n+m} ∈ A_{n,m}

be a nonderogatory matrix, and let A′_{n,m} = ⟨ C_{n,m}^j | 0 ≤ j ≤ n + m − 1 ⟩ ⊆ A_{n,m}. Then

1. l(A′_{n,m}) = n + m − 1;
2. for n = m = 1, 2 or n = 2, m = 1 it holds that A′_{n,m} = A_{n,m};
3. l(A′_{n,m}) − l(A_{n,m}) = m for n − m ≥ 2, or n = m + 1, n > 3, and l(A′_{n,m}) − l(A_{n,m}) = m − 1 for n = m ≠ 2, or n = 3, m = 2; moreover,

(l(A′_{n,m}) + 1)/(l(A_{n,m}) + 1) = 1 + m/n, for n − m ≥ 2, or n = m + 1, n > 3,

and

(l(A′_{n,m}) + 1)/(l(A_{n,m}) + 1) = 1 + (m − 1)/(n + 1), for n = m ≠ 2, or n = 3, m = 2.
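The case analysis in Corollary 2 can be sanity-checked arithmetically: taking the values l(A_{n,m}) from Theorem 11 and l(A′_{n,m}) = n + m − 1, the stated differences and normalized ratios follow. The helper below is a hypothetical illustration, not part of the paper:

```python
from fractions import Fraction

def l_alg(n, m):
    # l(A_{n,m}) according to Theorem 11 (n >= m >= 1 assumed).
    if n - m >= 2 or (n == m + 1 and n > 3):
        return n - 1
    if n == m == 2:
        return n + 1
    return n  # n == m != 2, or n == m + 1 with m in (1, 2)

def l_sub(n, m):
    # l(A'_{n,m}) = n + m - 1: generated by the nonderogatory C_{n,m}.
    return n + m - 1

def diff_and_ratio(n, m):
    return (l_sub(n, m) - l_alg(n, m),
            Fraction(l_sub(n, m) + 1, l_alg(n, m) + 1))
```

Varying n and m sweeps these linear-fractional values through the rationals needed for the corollary.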

3.2 Three block subalgebras in upper triangular matrix algebra

In this section we consider the 3-parametric family of algebras A_{n1,n2,n3} ⊂ T_{n1}(F) ⊕ T_{n2}(F) ⊕ T_{n3}(F),

A_{n1,n2,n3} = ⟨ E, ∑_{i=1}^{n1} E_{i,i}, ∑_{i=n1+1}^{n1+n2} E_{i,i}, E_{i,j} : 1 ≤ i < j ≤ n1, or n1 + 1 ≤ i < j ≤ n1 + n2, or n1 + n2 + 1 ≤ i < j ≤ n1 + n2 + n3 ⟩

over an arbitrary field F, compute their lengths explicitly and find the subalgebras A′_{n1,n2,n3} ⊂ A_{n1,n2,n3} with l(A′_{n1,n2,n3}) > l(A_{n1,n2,n3}); then, choosing appropriate values of the parameters n1, n2, n3, we obtain arbitrary rational ratios l(A′_{n1,n2,n3})/l(A_{n1,n2,n3}) ∈ [1, 2), see Corollary 3.

Notation 12. Any A ∈ A_{n1,n2,n3} is of the following form:

A =
( A′  0   0  )
( 0   A″  0  )
( 0   0   A‴ ) ,

where A′ ∈ T_{n1}(F), A″ ∈ T_{n2}(F), A‴ ∈ T_{n3}(F). From now on we will use the notation A = A′ ⊕ A″ ⊕ A‴.

132

O. V. Markova

In the following three lemmas we mark special elements in generating sets which are significant for the computation of the length of A_{n1,n2,n3}.

Lemma 9. Let S be a generating set for A_{n1,n2,n3}. Then there exists a generating set S̃ for A_{n1,n2,n3} such that the following conditions hold:

1. dim L_1(S̃) = |S̃| + 1;
2. any S ∈ S̃ satisfies (S)_{i,i} = 0, i = 1, ..., n1;
3. either
   (i) there exist matrices A_1 = (a_{i,j;1}), A_2 = (a_{i,j;2}) ∈ S̃ such that
       A_1′′ = E + N_1, N_1 ∈ N_{n2}(F),  A_1′′′ ∈ N_{n3}(F),  A_2′′ ∈ N_{n2}(F),  A_2′′′ = E + N_2, N_2 ∈ N_{n3}(F),
       and all S ∈ S̃, S ≠ A_1, A_2, satisfy (S)_{i,i} = 0, i = 1, ..., n1 + n2 + n3;
   or
   (ii) there exists a matrix A_0 = (a_{i,j;0}) ∈ S̃ such that
       A_0′′ = E + N, N ∈ N_{n2}(F),  A_0′′′ = aE + M, M ∈ N_{n3}(F), a ∉ {0, 1},
       and all S ∈ S̃, S ≠ A_0, satisfy (S)_{i,i} = 0, i = 1, ..., n1 + n2 + n3;
4. l(S̃) = l(S).

Proof. Let us consecutively transform the set S into a generating set satisfying conditions 1–3.

1. We use the same arguments as in point 1 of Lemma 2.

2. Proposition 2 allows us to transform the given generating set into a generating set S_1 = {S − (S)_{1,1} E | S ∈ S} preserving its length.

3. (i) Assume there exist C_1, C_2 ∈ S_1 such that the vectors

c_1 = ((C_1)_{n1+1,n1+1}, (C_1)_{n1+n2+1,n1+n2+1}),  c_2 = ((C_2)_{n1+1,n1+1}, (C_2)_{n1+n2+1,n1+n2+1})

are linearly independent. Then there exists a non-singular matrix F = (f_{i,j}) ∈ M_2(F) such that (1, 0) = f_{1,1} c_1 + f_{1,2} c_2 and (0, 1) = f_{2,1} c_1 + f_{2,2} c_2. Let us set A_i = f_{i,1} C_1 + f_{i,2} C_2, i = 1, 2. Then Proposition 1 allows us to transform the given generating set into a generating set

S_2 = {A_1, A_2, S | S ∈ S_1, S ≠ C_1, C_2}

preserving its length. By Proposition 1 the transformation of S_2 into the generating set

S_3 = {A_1, A_2, S − (S)_{n1+1,n1+1} A_1 − (S)_{n1+n2+1,n1+n2+1} A_2 | S ∈ S_2, S ≠ A_1, A_2}

also does not change its length. In this case we set S̃ = S_3.

(ii) Otherwise there exists a matrix A in S_1 such that the vectors

((A)_{n1+1,n1+1}, (A)_{n1+n2+1,n1+n2+1}),  ((A²)_{n1+1,n1+1}, (A²)_{n1+n2+1,n1+n2+1})

are linearly independent. Thus the matrix A has two distinct non-zero eigenvalues. Then we can replace the matrix A in S_1 with the matrix A_0 = (A)_{n1+1,n1+1}^{−1} A.


Then Proposition 1 allows us to transform the given generating set into a generating set S_2 = {A_0, S − (S)_{n1+1,n1+1} A_0 | S ∈ S_1, S ≠ A_0}. Let us set S̃ = S_2. ⊓⊔

Lemma 10. Let S be a generating set for A_{n1,n2,n3} satisfying conditions 1, 2 and 3(i) of Lemma 9. Then there exist a generating set S̃ for A_{n1,n2,n3} satisfying l(S̃) = l(S), and matrices B_1, ..., B_{n1−1} ∈ S̃ and k_1, k_2 ∈ {1, ..., n1 − 1} such that one of the following conditions holds:

1. B_r′ = E_{r,r+1} + Σ_{i=1}^{n1} Σ_{j=i+2}^{n1} b_{i,j;r} E_{i,j},  B_r′′ ∈ N_{n2}(F),  B_r′′′ ∈ N_{n3}(F),  r = 1, ..., n1 − 1;

2. there exists j ∈ {1, 2} such that
   B_r′ = E_{r,r+1} + b_{rj} E_{kj,kj+1} + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} b_{h,i;r} E_{h,i},  B_r′′ ∈ N_{n2}(F),  B_r′′′ ∈ N_{n3}(F),  r = 1, ..., n1 − 1, r ≠ k_j,
   A_j′ = B_{kj}′ = a(k_j, j) E_{kj,kj+1} + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} a_{h,i;j} E_{h,i},  a(k_j, j) ≠ 0,  B_{kj}′′ = A_j′′,  B_{kj}′′′ = A_j′′′;

3. B_r′ = E_{r,r+1} + b_{r1} E_{k1,k1+1} + b_{r2} E_{k2,k2+1} + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} b_{h,i;r} E_{h,i},  B_r′′ ∈ N_{n2}(F),  B_r′′′ ∈ N_{n3}(F),  r = 1, ..., n1 − 1, r ≠ k_1, k_2,
   A_j′ = B_{kj}′ = a(k_1, j) E_{k1,k1+1} + a(k_2, j) E_{k2,k2+1} + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} a_{h,i;j} E_{h,i},
   a(k_j, j) ≠ 0,  a(k_1, 1)a(k_2, 2) − a(k_2, 1)a(k_1, 2) ≠ 0,  B_{kj}′′ = A_j′′,  B_{kj}′′′ = A_j′′′,  j = 1, 2.

Proof. Since E_{i,i+1} ∈ A_{n1,n2,n3}, but for any t ≥ 2 and S ∈ S^t \ S the coefficient (S)_{i,i+1} = 0, i = 1, ..., n1 − 1, there exist B_1, ..., B_{n1−1} ∈ S such that the vectors ((B_i)_{1,2}, (B_i)_{2,3}, ..., (B_i)_{n1−1,n1}), i = 1, ..., n1 − 1, are linearly independent. Consider next the following transformation F of the set S (by Proposition 1, F preserves the length of S), which is identical for all elements S ∈ S, S ≠ B_i, i = 1, ..., n1 − 1, i.e. F(S) = S, and whose action on the matrices B_j, j = 1, ..., n1 − 1, depends on whether A_1 and A_2 belong to this set, as follows.

If |{B_1, ..., B_{n1−1}} ∩ {A_1, A_2}| ≤ 1, then F is constructed similarly to the transformation described in point 4 of Lemma 2.

Assume that both A_1, A_2 ∈ {B_1, ..., B_{n1−1}}, i.e. A_1 = B_p, A_2 = B_q for some distinct p, q ∈ {1, ..., n1 − 1}. Since any matrix in M_{n1−1,n1−3}(F) of rank n1 − 3 contains a non-singular submatrix of order n1 − 3, there exist numbers k_1, k_2 ∈ {1, ..., n1 − 1}, k_1 < k_2, such that the vectors

v_i = ((B_i)_{1,2}, ..., (B_i)_{k1−1,k1}, (B_i)_{k1+1,k1+2}, ..., (B_i)_{k2−1,k2}, (B_i)_{k2+1,k2+2}, ..., (B_i)_{n1−1,n1}),


i = 1, ..., n1 − 1, i ≠ p, q, are linearly independent. Since the matrices B_j were numbered arbitrarily, we may assume that p = k_1, q = k_2. There exists a non-singular linear transformation G = (g_{i,j}) ∈ M_{n1−3}(F) that maps the set {v_i} onto the set {e_1 = (1, 0, ..., 0), e_2 = (0, 1, ..., 0), ..., e_{n1−3} = (0, 0, ..., 1)} ⊂ F^{n1−3}, i.e. e_i = Σ_{j=1}^{n1−3} g_{i,j} v_j. Then let us set

F(B_r) = Σ_{j=1}^{k1−1} g_{r,j} B_j + Σ_{j=k1}^{k2−1} g_{r,j} B_{j+1} + Σ_{j=k2}^{n1−1} g_{r,j} B_{j+2},  r ≠ k_1, k_2,

F(A_s) = A_s − Σ_{i=1, i≠k1,k2}^{n1−1} (A_s)_{i,i+1} F(B_i),  s = 1, 2.

For simplicity of the subsequent text we redenote F(A_1), F(A_2) and F(B_r) by A_1, A_2 and B_r, correspondingly. ⊓⊔

Lemma 11. Let S be a generating set for A_{n1,n2,n3} satisfying conditions 1, 2 and 3(ii) of Lemma 9. Then there exist a generating set S̃ for A_{n1,n2,n3} satisfying l(S̃) = l(S), and matrices B_1, ..., B_{n1−1} ∈ S̃ and k_0 ∈ {1, ..., n1 − 1} such that one of the following conditions holds:

1. B_r′ = E_{r,r+1} + Σ_{i=1}^{n1} Σ_{j=i+2}^{n1} b_{i,j;r} E_{i,j},  B_r′′ ∈ N_{n2}(F),  B_r′′′ ∈ N_{n3}(F),  r = 1, ..., n1 − 1;

2. B_r′ = E_{r,r+1} + b_{r0} E_{k0,k0+1} + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} b_{h,i;r} E_{h,i},  B_r′′ ∈ N_{n2}(F),  B_r′′′ ∈ N_{n3}(F),  r = 1, ..., n1 − 1, r ≠ k_0,
   A_0′ = B_{k0}′ = a(k_0, 0) E_{k0,k0+1} + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} a_{h,i;0} E_{h,i},  a(k_0, 0) ≠ 0,  B_{k0}′′ = A_0′′,  B_{k0}′′′ = A_0′′′.

Proof. The proof is analogous to the proof of point 4 of Lemma 2. ⊓⊔

Theorem 13. Let F = F_2, let n1, n2, n3 ∈ N, n1 ≥ n2 + 2, n2 ≥ n3, (n2, n3) ≠ (1, 1), (2, 1), (2, 2). Then l(A_{n1,n2,n3}) = n1 − 1.

Proof. Let us first prove the upper bound l(A_{n1,n2,n3}) ≤ n1 − 1. Let S be a generating set for A_{n1,n2,n3}. Without loss of generality we assume S to satisfy conditions 1–2 of Lemma 9. Since by [8, Theorem 6.1] l(D_3(F_2)) = 1, the only possibility for S is to satisfy condition 3(i) of Lemma 9, and consequently we assume S to satisfy one of the conditions of Lemma 10.


1. We use induction on p = n1 − (j − i) to prove that E_{i,j} ∈ L_{n1−1}(S) for 1 ≤ i < j ≤ n1, j − i ≥ 2. Notice that B_1 B_2 ··· B_{n1−1} = E_{1,n1} ∈ L_{n1−1}(S), since (B_1 B_2 ··· B_{n1−1})′′ = 0 and (B_1 B_2 ··· B_{n1−1})′′′ = 0 as products of n1 − 1 nilpotent matrices of order n_{s+1} ≤ n1 − 2 if S satisfies condition 1 of Lemma 10, and as products of n1 − 2 nilpotent and one unitriangular matrices of order n_{s+1} ≤ n1 − 2 if S satisfies condition 2 or 3 of Lemma 10. Consider the following matrix products

B_{j,j+n1−p−1} = B_j B_{j+1} ··· B_{j+n1−p−1} (E − A_1 − A_2)^{p−1} ∈ L_{n1−1}(S),  j = 1, ..., p.

We have B′_{j,j+n1−p−1} = E_{j,j+n1−p} + Σ_{h=1}^{n1} Σ_{i=h+n1−p+1}^{n1} c_{h,i;j} E_{h,i}, and B′′_{j,j+n1−p−1} = 0 and B′′′_{j,j+n1−p−1} = 0 as products of n1 − 1 nilpotent matrices of order n_{s+1} ≤ n1 − 2 if S satisfies condition 1 of Lemma 10, or if for the k_s defined in points 2 and 3 of Lemma 10 it holds that k_s ∉ {j, ..., j + n1 − p − 1}, s = 1, 2; and as products of n1 − 2 nilpotent and one unitriangular matrices of order n_{s+1} ≤ n1 − 2 otherwise. Applying the induction hypothesis we obtain that E_{i,i+n1−q−1} ∈ L_{n1−1}(S) for all q = 1, ..., p − 1, i = 1, ..., q. Therefore B_{j,j+n1−p−1} − E_{j,j+n1−p} ∈ L_{n1−1}(S). Hence we obtain that

E_{j,j+n1−p} = B_{j,j+n1−p−1} − (B_{j,j+n1−p−1} − E_{j,j+n1−p}) ∈ L_{n1−1}(S),  j = 1, ..., p.

2. Consider next B_{j,j} = B_j (E − A_1 − A_2)^{n1−2} ∈ L_{n1−1}(S), j = 1, ..., n1 − 1. We have

B′_{j,j} = E_{j,j+1} + γ_{j,1} E_{k1,k1+1} + γ_{j,2} E_{k2,k2+1} + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} c_{h,i;j} E_{h,i},  j ≠ k_1, k_2,
B′_{kr,kr} = a_{k1,r} E_{k1,k1+1} + a_{k2,r} E_{k2,k2+1} + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} c_{h,i;r} E_{h,i},  r = 1, 2,

and B′′_{j,j} = 0 and B′′′_{j,j} = 0 as products of n1 − 1 nilpotent matrices of order n_{s+1} ≤ n1 − 2 if k_s ≠ j or does not exist, and as products of n1 − 2 nilpotent and one unitriangular matrices of order n_{s+1} ≤ n1 − 2 if k_s = j, s = 1, 2. It follows from paragraph 1 that E_{j,j+1} + γ_{j,1}E_{k1,k1+1} + γ_{j,2}E_{k2,k2+1} and a_{k1,r}E_{k1,k1+1} + a_{k2,r}E_{k2,k2+1} lie in L_{n1−1}(S), j ≠ k_1, k_2, r = 1, 2, and hence E_{j,j+1} ∈ L_{n1−1}(S), j = 1, ..., n1 − 1. Consequently, E_{i,j} ∈ L_{n1−1}(S), 1 ≤ i < j ≤ n1.

3. From paragraphs 1 and 2 we obtain that (E − A_1 − A_2)^{n2} = Σ_{i=1}^{n1} E_{i,i} + Σ_{1≤h<i≤n1} λ_{h,i} E_{h,i} ∈ L_{n2}(S), and therefore Σ_{i=1}^{n1} E_{i,i} ∈ L_{n1−1}(S); hence l(S) ≤ n1 − 1. Since also l(A_{n1,n2,n3}) ≥ n1 − 1, consequently

l(A_{n1,n2,n3}) = n1 − 1. ⊓⊔

Theorem 14. Let F be an arbitrary field, |F| ≥ 3, and let n1, n2, n3 ∈ N, n1 ≥ n2 + n3 + 2, n2 ≥ n3, (n2, n3) ≠ (1, 1), (2, 1), (2, 2). Then l(A_{n1,n2,n3}) = n1 − 1.

Proof.

I. Let us first prove the upper bound l(A_{n1,n2,n3}) ≤ n1 − 1. Let S be a generating set for A_{n1,n2,n3}. Without loss of generality we assume S to satisfy conditions 1–2 of Lemma 9. If S satisfies condition 3(i) of Lemma 9, then

the proof is analogous to the proof of Theorem 13. Consequently, we assume S to satisfy condition 3(ii) of Lemma 9, and therefore one of the conditions of Lemma 11.

1. We use induction on p = n1 − (j − i) to prove that E_{i,j} ∈ L_{n1−1}(S) for 1 ≤ i < j ≤ n1, j − i ≥ 2. Let us denote m = n1 + n2 + 1. If p = 1, then B_1 ··· B_{n1−1} = bE_{1,n1} ∈ L_{n1−1}(S), b ≠ 0, since (B_1 ··· B_{n1−1})′′ = 0 and (B_1 ··· B_{n1−1})′′′ = 0 as products of n1 − 2 nilpotent and one unitriangular matrices, or of n1 − 1 nilpotent matrices, of orders n2 and n3, correspondingly.

If p ≤ n1 − n3 − 2 and j = 1, ..., p, consider

B_{j,j+n1−p−1} = B_j ··· B_{j+n1−p−1} (E − A_0)^{p−1} ∈ L_{n1−1}(S),
B′_{j,j+n1−p−1} = a(k_0, 0)^t E_{j,j+n1−p} + Σ_{h=1}^{n1} Σ_{i=h+n1−p+1}^{n1} c_{h,i;j} E_{h,i},  t ∈ {0, 1},

and B′′_{j,j+n1−p−1} = 0 and B′′′_{j,j+n1−p−1} = 0, as products of nilpotent matrices of lengths smaller than the orders of the factors. If n1 − n3 − 1 ≤ p < n1 − 1 and j = 1, ..., p, consider

B_{j,j+n1−p−1} = B_j ··· B_{j+n1−p−1} (E − a^{−1}A_0)^{n3−n1+p} (E − A_0)^{n1−n3−1} ∈ L_{n1−1}(S),
B′_{j,j+n1−p−1} = a(k_0, 0)^t E_{j,j+n1−p} + Σ_{h=1}^{n1} Σ_{i=h+n1−p+1}^{n1} c_{h,i;j} E_{h,i},  t ∈ {0, 1},

and B′′_{j,j+n1−p−1} = 0 and B′′′_{j,j+n1−p−1} = 0, as products of nilpotent matrices of lengths smaller than the orders of the factors. Applying the induction hypothesis we obtain that E_{i,i+n1−q−1} ∈ L_{n1−1}(S) for all q = 2, ..., p − 1, i = 1, ..., q, and E_{1,n1} ∈ L_{n1−1}(S) as shown above. Therefore B_{j,j+n1−p−1} − a(k_0, 0)^t E_{j,j+n1−p} ∈ L_{n1−1}(S). Hence we obtain that

E_{j,j+n1−p} = (a(k_0, 0))^{−t} (B_{j,j+n1−p−1} − (B_{j,j+n1−p−1} − (a(k_0, 0))^t E_{j,j+n1−p})) ∈ L_{n1−1}(S),  j = 1, ..., p.

2. For j = 1, ..., n1 − 1 consider the products B_{j,j} = B_j (E − a^{−1}A_0)^{n3−1} (E − A_0)^{n1−n3−1}, j ≠ k_0, and B_{k0,k0} = B_{k0} (E − a^{−1}A_0)^{n3} (E − A_0)^{p−n3−1}, all lying in L_{n1−1}(S). We have

B′_{j,j} = E_{j,j+1} + γ_j E_{k0,k0+1} + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} c_{h,i;j} E_{h,i},  j ≠ k_0,
B′_{k0,k0} = a(k_0, 0) E_{k0,k0+1} + Σ_{h=1}^{n1} Σ_{i=h+2}^{n1} c_{h,i;k0} E_{h,i},

and B′′_{r,r} = 0 and B′′′_{r,r} = 0 as products of nilpotent matrices of lengths smaller than the orders of the factors. Together with paragraph 1 this gives E_{j,j+1} ∈ L_{n1−1}(S), j = 1, ..., n1 − 1.

3. We have (E − A_0)^{n2} (E − a^{−1}A_0)^{n3} = Σ_{i=1}^{n1} E_{i,i} + Σ_{1≤h<i≤n1} λ_{h,i} E_{h,i} ∈ L_{n2+n3}(S), and therefore Σ_{i=1}^{n1} E_{i,i} ∈ L_{n1−1}(S); hence l(S) ≤ n1 − 1. Since also l(A_{n1,n2,n3}) ≥ n1 − 1, consequently l(A_{n1,n2,n3}) = n1 − 1. ⊓⊔

The following corollary shows in particular that the length ratio for a three-block algebra and its subalgebra can also take on many different values, namely any rational value in [1, 2).

Corollary 3. Let F be an arbitrary field, |F| ≥ 3, let n1, n2, n3 ∈ N, n1 ≥ n2 + n3 + 2, n2 ≥ n3 ≥ 3. Let a ∈ F, a ≠ 0, 1, and let

C_{n1,n2,n3} = Σ_{i=1}^{n1−1} E_{i,i+1} + Σ_{j=n1+1}^{n1+n2−1} (E_{j,j} + E_{j,j+1}) + E_{n1+n2,n1+n2} + Σ_{k=n1+n2+1}^{n1+n2+n3−1} (aE_{k,k} + E_{k,k+1}) + aE_{n1+n2+n3,n1+n2+n3} ∈ A_{n1,n2,n3}

be a nonderogatory matrix, and let A′_{n1,n2,n3} = ⟨ C_{n1,n2,n3}^{j} | 0 ≤ j ≤ n1 + n2 + n3 − 1 ⟩ ⊆ A_{n1,n2,n3}. Then

1. l(A′_{n1,n2,n3}) = n1 + n2 + n3 − 1;


2. l(A′_{n1,n2,n3}) − l(A_{n1,n2,n3}) = n2 + n3;

3. (l(A′_{n1,n2,n3}) + 1)/(l(A_{n1,n2,n3}) + 1) = 1 + (n2 + n3)/n1 < 2.

Remark 3. Let us denote A_{n1} = ⟨ E^{(n1)}, E_{i,j}, 1 ≤ i < j ≤ n1 ⟩ ⊂ T_{n1}(F). Notice that A_{n1,n2,n3} = A_{n1} ⊕ A_{n2,n3} and l(A_{n1,n2,n3}) = l(A_{n1}) = max(l(A_{n1}), l(A_{n2,n3})). That is, we obtain another example providing sharpness of the lower bound in (1).

3.3 Examples

We now give examples of algebras whose length bounds the lengths of their subalgebras.

Corollary 4. Let F be an arbitrary field, n, m ∈ N, n > m > 2, and let A_{n,m} be the algebra introduced in Theorem 11. Let also

B = ⟨ E_{i,j}, 1 ≤ i < j ≤ n, E, Σ_{i=1}^{n} E_{i,i}, N_1, ..., N_p ∈ 0 ⊕ N_m(F) ⟩ ⊆ A_{n,m}.

Then l(B) = n − 1 = l(A_{n,m}).

Example 3. Let F be an arbitrary field and let A ⊆ T_n(F) be a subalgebra of the upper triangular matrix algebra. Then l(A) ≤ l(T_n(F)).
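The length function used in this section can be spot-checked numerically. For a generating set S of a matrix algebra, L_k(S) is the span of the identity together with all words in S of length at most k, and l(S) is the smallest k with L_k(S) equal to the whole algebra (these standard definitions are stated earlier in the paper, outside this excerpt). A minimal sketch, using two hypothetical generating sets for T_3(F):

```python
import numpy as np

def span_dim(mats):
    """Dimension of the linear span of a list of matrices, via the rank
    of their flattened coordinate vectors."""
    M = np.array([m.flatten() for m in mats], dtype=float)
    return np.linalg.matrix_rank(M)

def length_of_generating_set(gens, algebra_dim, max_len=20):
    """Smallest k such that span(identity + words in gens of length <= k)
    has the full algebra dimension."""
    n = gens[0].shape[0]
    words = [np.eye(n)]      # the single word of length 0
    current = [np.eye(n)]    # words of the most recent length
    for k in range(1, max_len + 1):
        current = [w @ g for w in current for g in gens]
        words += current
        if span_dim(words) == algebra_dim:
            return k
    raise ValueError("generating set did not reach the algebra dimension")

def E(n, i, j):
    m = np.zeros((n, n)); m[i - 1, j - 1] = 1.0; return m

# T_3(F) has dimension 6.  A generating set of matrix units lacking E_{1,3}:
# E_{1,3} = E_{1,2} E_{2,3} only appears among words of length 2 = n - 1.
gens = [E(3, 1, 1), E(3, 2, 2), E(3, 3, 3), E(3, 1, 2), E(3, 2, 3)]
print(length_of_generating_set(gens, algebra_dim=6))  # -> 2

# The full set of matrix units of T_3 already spans at length 1.
full = [E(3, i, j) for i in range(1, 4) for j in range(i, 4)]
print(length_of_generating_set(full, algebra_dim=6))  # -> 1
```

The two outputs illustrate that l(S) depends on the generating set, while l(A) — the maximum over all generating sets — bounds them all, in line with Example 3.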

Proposition 5. Let F be an arbitrary field, let A be a finite-dimensional F-algebra, and let B ⊆ A be such that there exist a_1, ..., a_n ∈ A satisfying ⟨B, a_1, ..., a_n⟩ = A and a_i b, b a_i ∈ ⟨a_1, ..., a_n⟩ for all b ∈ B. Then l(B) ≤ l(A).

Proof. Let S_B be a generating set for B. Then S_A = S_B ∪ {a_1, ..., a_n} is a generating set for A of length l(S_B). Then l(A) ≥ l(S_A) = l(S_B) and therefore l(A) ≥ max_{S_B} l(S_B) = l(B). ⊓⊔

Let us give some examples of algebras satisfying the condition of Proposition 5.

Example 4. Let F be an arbitrary field, let A be a subalgebra of T_n(F) and let B = A ∩ D_n(F). Then l(B) ≤ l(A).

Example 5. Let F be an arbitrary field and let A, B be finite-dimensional F-algebras. Then A ⊂ A ⊕ B and l(A) ≤ l(A ⊕ B).

The author is greatly indebted to her supervisor Dr. A. E. Guterman for the attention given to this work and for useful discussions.


References

1. T. J. Laffey, Simultaneous Reduction of Sets of Matrices under Similarity, Linear Algebra and its Applications, 84 (1986), 123–138.
2. W. E. Longstaff, Burnside's theorem: irreducible pairs of transformations, Linear Algebra and its Applications, 382 (2004), 247–269.
3. C. J. Pappacena, An Upper Bound for the Length of a Finite-Dimensional Algebra, Journal of Algebra, 197 (1997), 535–545.
4. A. Paz, An Application of the Cayley–Hamilton Theorem to Matrix Polynomials in Several Variables, Linear and Multilinear Algebra, 15 (1984), 161–170.
5. A. J. M. Spencer, R. S. Rivlin, The Theory of Matrix Polynomials and its Applications to the Mechanics of Isotropic Continua, Archive for Rational Mechanics and Analysis, 2 (1959), 309–336.
6. A. J. M. Spencer, R. S. Rivlin, Further Results in the Theory of Matrix Polynomials, Archive for Rational Mechanics and Analysis, 4 (1960), 214–230.
7. O. V. Markova, On the length of upper-triangular matrix algebra, Uspekhi Matem. Nauk, 60 (2005), no. 5, 177–178 [in Russian]; English translation: Russian Mathematical Surveys, 60 (2005), no. 5, 984–985.
8. O. V. Markova, Length computation of matrix subalgebras of special type, Fundamental and Applied Mathematics, 13 (2007), no. 4, 165–197.

On a New Class of Singular Nonsymmetric Matrices with Nonnegative Integer Spectra

Tatjana Nahtman^{1,*} and Dietrich von Rosen^{2}

1 Institute of Mathematical Statistics, University of Tartu, Estonia, tatjana.nahtman@ut.ee; Department of Statistics, University of Stockholm, Sweden, tatjana.nahtman@statistics.su.se
2 Department of Biometry and Engineering, Swedish University of Agricultural Sciences, dietrich.von.rosen@bt.slu.se

Abstract. The objective of this paper is to consider a class of singular nonsymmetric matrices with integer spectrum. The class comprises generalized triangular matrices with diagonal elements obtained by summing the elements of the corresponding column. If the size of a matrix belonging to the class equals n × n, the spectrum of the matrix is given by the sequence of distinct non-negative integers up to n − 1, irrespective of the elements of the matrix. Right and left eigenvectors are obtained. Moreover, several interesting relations are presented, including factorizations via triangular matrices.

Keywords: eigenvectors, generalized triangular matrix, integer spectrum, nonsymmetric matrix, triangular factorization, Vandermonde matrix.

1 Introduction

In this paper we consider a new class of singular matrices with remarkable algebraic properties. For example, the spectrum of a matrix belonging to this class depends only on the size of the matrix and not on the specific elements of this matrix. Moreover, the spectrum entirely consists of the successive non-negative integer values 0, 1, ..., n − 1. A special case of this class of matrices originates from statistical sampling theory (Bondesson & Traat, 2005, 2007). In their papers, via sampling theory (the Poisson sampling design) as well as some analytic proofs, eigenvalues and eigenvectors were presented. Their proofs are reminiscent of the use of Lagrangian polynomials, which for example are used when finding the inverse of a Vandermonde matrix (e.g. see Macon & Spitzbart, 1958; El-Mikkawy, 2003). We have not found any other work related to the matrix class which we are going to consider.

* The work of T. Nahtman was supported by the grant GMTMS6702 of the Estonian Research Foundation.

Properties of singular matrix with integer spectrum

141

The main purpose of this paper is to introduce the class, show some basic algebraic properties, show how to factor the class and demonstrate how to find eigenvalues and eigenvectors of matrices belonging to the class. The paper focuses more on the presentation of results than on giving complete proofs of the most general versions of the theorems.

Definition 1. A square nonsymmetric matrix B = (b_{ij}) of order n belongs to the B_n-class if its elements satisfy the following conditions:

b_{ii} = Σ_{j=1, j≠i}^{n} b_{ji},  i = 1, ..., n,   (1)

b_{ij} + b_{ji} = 1,  j ≠ i, i, j = 1, ..., n,   (2)

b_{ij} − b_{ik} = b_{ij} b_{ki} / b_{kj},  b_{kj} ≠ 0, j ≠ k, i ≠ k, j; i, j, k = 1, ..., n.   (3)

Instead of (3) one may use b_{kj} = b_{ij} b_{ki}/(b_{ij} − b_{ik}) or b_{ij} b_{kj} = b_{ik} b_{kj} + b_{ij} b_{ki}. Relation (2) defines a generalized triangular structure, and it can be shown that (3) is a necessary and sufficient condition for the class to have the non-negative integer spectrum consisting of the distinct integers {0, 1, ..., n − 1}. Due to (1), the sum of the elements in each row equals n − 1. Another matrix with integer eigenvalues and row element sum equal to n − 1, with many applications in various fields, is the well-known tridiagonal Kac matrix (Clement matrix); see Taussky & Todd (1991). Moreover, for any B ∈ B_n with positive elements we may consider (n − 1)^{−1} B as a transition matrix with interesting symmetric properties reflected by the equidistant integer spectrum. When B ∈ B_3,

B = [ b21 + b31   b12         b13
      b21         b12 + b32   b23
      b31         b32         b13 + b23 ]
  = [ b21 + b31   1 − b21         1 − b31
      b21         1 − b21 + b32   1 − b32
      b31         b32             2 − b31 − b32 ].   (4)

It is worth observing that any B ∈ B_n is a sum of three matrices: an upper triangular matrix, a diagonal matrix and a skew-symmetric matrix. For (4) we have

B = [ 0 1 1     [ b21 + b31   0            0                [ 0     −b21   −b31
      0 1 1   +   0           −b21 + b32   0              +   b21   0      −b32
      0 0 2 ]     0           0            −b31 − b32 ]       b31   b32    0 ].

Note that the eigenvalues {0, 1, 2} of B are found on the diagonal of the upper triangular matrix, irrespective of the values of (b_{ij}) as long as they satisfy (1)–(3).
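Definition 1 and the observations above can be verified numerically. The sketch below builds a member of B_n from the parametrization b_ij = r_i/(r_i − r_j) of Bondesson & Traat discussed below (the r_i values chosen here are arbitrary distinct numbers, an assumption for illustration only), and checks conditions (1)–(3), the row sums, and the integer spectrum:

```python
import numpy as np

def bn_matrix(r):
    """Build B in the B_n-class from distinct values r_i via b_ij = r_i/(r_i - r_j),
    with the diagonal b_ii = sum_{j != i} b_ji as required by condition (1)."""
    r = np.asarray(r, dtype=float)
    n = len(r)
    B = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                B[i, j] = r[i] / (r[i] - r[j])
    # at this point the diagonal is zero, so column sums are exactly (1)
    B[np.diag_indices(n)] = B.sum(axis=0)
    return B

B = bn_matrix([1.0, 2.0, 4.0, 7.0])
n = 4
off = ~np.eye(n, dtype=bool)

# (2): b_ij + b_ji = 1 off the diagonal
assert np.allclose((B + B.T)[off], 1.0)

# (3): b_ij - b_ik = b_ij * b_ki / b_kj for distinct i, j, k
for i in range(n):
    for j in range(n):
        for k in range(n):
            if len({i, j, k}) == 3:
                assert np.isclose(B[i, j] - B[i, k], B[i, j] * B[k, i] / B[k, j])

# every row sums to n - 1, and the spectrum is {0, 1, ..., n - 1}
assert np.allclose(B.sum(axis=1), n - 1)
print(np.sort(np.linalg.eigvals(B).real))  # approximately [0, 1, 2, 3]
```

Any other choice of distinct r_i produces the same spectrum, which is exactly the size-only dependence emphasized in the Introduction.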

142

T. Nahtman, D. von Rosen

In the Conditional Poisson sampling design (e.g., see Aires, 1999),

b_{ij} = p_i (1 − p_j)/(p_i − p_j)

are used to calculate conditional inclusion probabilities, where the p_i's are inclusion probabilities under the Poisson design. Bondesson & Traat (2005, 2007) generalized this expression somewhat and considered

b_{ij} = r_i/(r_i − r_j),   (5)

where the r_i's are arbitrary distinct values. In this paper, instead of (5), we assume (3) to hold. Note that any b_{ij} satisfying (5) also satisfies (3). For the matrix defined via the elements in (5), Bondesson & Traat (2005, 2007) presented eigenvalues, and right and left eigenvectors. They expressed their results as functions of r_i in (5), whereas in this paper we express the results in terms of b_{ij}, i.e. the elements of B ∈ B_n. Moreover, the proofs of all results in this paper are purely algebraic, whereas Bondesson & Traat (2005, 2007) indicated proofs based on series expansions and identification of coefficients. It is, however, not clear how to apply their results to the B_n-class of matrices given in Definition 1. Moreover, the algebraic approach of this paper opens up a world of interesting relations, in particular the triangular factorization of matrices in the B_n-class presented in Section 4.

As noted before, it follows from (3) that

b_{kj} = b_{ij} b_{ki}/(b_{ij} − b_{ik}) = b_{ij}(1 − b_{ik})/(b_{ij} − b_{ik}).   (6)

Hence, any row in B, B ∈ B_n, generates all other elements and thus there are at most n − 1 functionally independent elements in B. For example, we may use b_{1j}, j = 2, 3, ..., n, to generate all other elements in B. Furthermore, if we choose for r_j in (5), without loss of generality, r_1 = 1 and

r_j = −b_{j1}/b_{1j},  j = 2, 3, ..., n,

it follows that b_{1j} = 1/(1 − r_j) and

b_{ij} = b_{1j}(1 − b_{1i})/(b_{1j} − b_{1i}) = [1/(1 − r_j)] (1 − 1/(1 − r_i)) / [1/(1 − r_j) − 1/(1 − r_i)] = r_i/(r_i − r_j).

Thus, all b_{ij}'s can be generated by the above choice of r_j. This means that a matrix defined by (5), as considered in Bondesson & Traat (2005, 2007), is a canonical version of any matrix defined through (3), assuming that (1) and (2) hold. The class B_n can be generalized in a natural way.

Definition 2. The matrix B_{n,k} : (n − k + 2) × (n − k + 2), k = 2, ..., n, is obtained from the matrix B, B ∈ B_n, by elimination of k − 2 consecutive rows and columns starting from the second row and column, with corresponding adjustments in the main diagonal.

The paper consists of five sections. In Section 2 some basic and fundamental relations for any B ∈ B_n are given, which will be used in the subsequent. Section 3 consists of a straightforward proof concerning the spectrum of any B ∈ B_n. In Section 4 we consider a factorization of B ∈ B_n into a product of three triangular matrices. Finally, in Section 5, expressions for left and right eigenvectors are presented. Several proofs of theorems are omitted due to lengthy calculations; for further details we refer to the technical report Nahtman & von Rosen (2007). All proofs of this paper could easily have been presented for, say, n < 7, but for general n we rely on induction, which is more difficult to look through. There is certainly space for improving the proofs, and this is another reason for omitting them. In the present paper only real-valued matrices are considered, although the generalization to matrices with complex-valued entries could be performed fairly easily.

2 Preparations

This section shows some relations among the elements in B ∈ B_n which are of utmost importance for the subsequent presentation.

Theorem 1. Let B ∈ B_n. For all n > 1:

(i) The sum of the products of the off-diagonal row elements equals 1:

Σ_{i=1}^{n} Π_{j=1, j≠i}^{n} b_{ij} = 1.

(ii) The sum of the products of the off-diagonal column elements equals 1:

Σ_{j=1}^{n} Π_{i=1, i≠j}^{n} b_{ij} = 1.


Proof. Because of symmetry only (i) is proven. For n = 2 the trivial relation b12 + b21 = 1 is obtained. Moreover, for n = 3,

Σ_{i=1}^{3} Π_{j≠i} b_{ij} = b12 b13 + b21 b23 + b31 b32 = b12 − b12 b31 + b21 − b21 b32 + b31 b32
= 1 − (b12 − b13) b32 − b21 b32 + b31 b32 = 1 − (b12 + b21) b32 + (b13 + b31) b32 = 1,

where in the second equality (3) is utilized. Now it is assumed that the theorem is true for n − 1, i.e.

Σ_{i=1}^{n−1} Π_{j=1, j≠i}^{n−1} b_{ij} = 1,   (7)

which by symmetry yields

Σ_{i=1, i≠k}^{n} Π_{j=1, j≠i,k}^{n} b_{ij} = 1,  k = 1, 2, ..., n.   (8)

From here on a chain of calculations is started:

Σ_{i=1}^{n} Π_{j=1, j≠i}^{n} b_{ij}
= Σ_{i=1}^{n−1} Π_{j=1, j≠i}^{n−1} b_{ij} b_{in} + Π_{j=1}^{n−1} b_{nj}
= Σ_{i=1}^{n−2} Π_{j=1, j≠i}^{n−2} b_{ij} b_{i,n−1} b_{in} + Π_{j=1}^{n−2} b_{n−1,j} b_{n−1,n} + Π_{j=1}^{n−2} b_{nj} b_{n,n−1}
= Σ_{i=1}^{n−2} Π_{j=1, j≠i}^{n−2} b_{ij} b_{i,n−1}(1 − b_{ni}) + Π_{j=1}^{n−2} b_{n−1,j}(1 − b_{n,n−1}) + Π_{j=1}^{n−2} b_{nj} b_{n,n−1}.   (9)

Since by the induction assumption

Σ_{i=1}^{n−2} Π_{j=1, j≠i}^{n−2} b_{ij} b_{i,n−1} + Π_{j=1}^{n−2} b_{n−1,j} = 1,

the last expression in (9) equals

1 − Σ_{i=1}^{n−2} Π_{j=1, j≠i}^{n−2} b_{ij} (b_{i,n−1} − b_{in}) b_{n,n−1} − Π_{j=1}^{n−2} b_{n−1,j} b_{n,n−1} + Π_{j=1}^{n−2} b_{nj} b_{n,n−1},   (10)

where (3) has been used: b_{i,n−1} b_{ni} = (b_{i,n−1} − b_{in}) b_{n,n−1}. Reshaping (10) we obtain

1 − Σ_{i=1}^{n−1} Π_{j=1, j≠i}^{n−1} b_{ij} b_{n,n−1} + Σ_{i=1, i≠n−1}^{n} Π_{j=1, j≠i,n−1}^{n} b_{ij} b_{n,n−1},   (11)

and using the induction assumption, i.e. (7) as well as (8), we see that (11) is indeed equal to

1 − b_{n,n−1} + b_{n,n−1} = 1,

and the theorem is proved. ⊓⊔

Corollary 1. Let B ∈ B_n. For all n > 1,

Σ_{i=1}^{n−1} Π_{j=1, j≠i}^{n} b_{ij} = 1 − Π_{j=1}^{n−1} b_{nj}.

Corollary 2. Let B ∈ B_n. For every integer a such that a < n,

Σ_{i=a}^{n} Π_{j=a, j≠i}^{n} b_{ij} = 1.

Theorem 2. Let B ∈ B_n and put c_{ij} = b_{ij}^{−1} b_{ji}. Then,

(i) c_{ij}^{−1} = c_{ji},  i ≠ j,
(ii) c_{ki} c_{jk} = −c_{ji},  k ≠ i, j ≠ k, i ≠ j,  (cancellation)
(iii) c_{ki} c_{lj} = c_{kj} c_{li},  k ≠ i, j; l ≠ i, j.  (exchangeability)

Proof. (i) follows immediately from the definition of c_{ij}. For (ii) it is observed that (see (3))

b_{ij} b_{ki} / b_{kj} = −b_{ji} b_{ik} / b_{jk},

and hence

c_{ki} c_{jk} = b_{ki}^{−1} b_{ik} b_{jk}^{−1} b_{kj} = b_{ki}^{−1} b_{ik} b_{jk}^{−1} b_{kj} b_{ij}^{−1} b_{ij} = −b_{ki}^{−1} b_{ki} b_{kj}^{−1} b_{kj} b_{ji}^{−1} b_{ij} = −c_{ji}.

Concerning (iii) it is noted that c_{ki} c_{lj} = c_{ki} c_{lj} c_{il} c_{li} = −c_{ki} c_{ij} c_{li} = c_{kj} c_{li}. ⊓⊔
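Theorems 1 and 2 can likewise be spot-checked on a concrete member of B_4, built from the parametrization b_ij = r_i/(r_i − r_j) with arbitrary distinct r_i (an illustrative choice, as in the Introduction's sampling-theory discussion). A small sketch:

```python
import itertools
import numpy as np

# Build a member of B_4 from b_ij = r_i/(r_i - r_j), diagonal from condition (1).
r = np.array([1.0, 2.0, 4.0, 7.0])
n = len(r)
B = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            B[i, j] = r[i] / (r[i] - r[j])
B[np.diag_indices(n)] = B.sum(axis=0)

off = ~np.eye(n, dtype=bool)

# Theorem 1: sums of products of off-diagonal row (resp. column) elements equal 1
assert np.isclose(sum(np.prod(B[i, off[i]]) for i in range(n)), 1.0)
assert np.isclose(sum(np.prod(B[off[:, j], j]) for j in range(n)), 1.0)

# Theorem 2 with c_ij = b_ji / b_ij: cancellation (ii) and exchangeability (iii)
c = B.T / B
for i, j, k in itertools.permutations(range(n), 3):
    assert np.isclose(c[k, i] * c[j, k], -c[j, i])            # (ii)
for k, l, i, j in itertools.permutations(range(n), 4):
    assert np.isclose(c[k, i] * c[l, j], c[k, j] * c[l, i])   # (iii)
print("Theorems 1 and 2 verified on this example")
```

The checks run over all admissible index triples and quadruples, so a single pass exercises every instance of (ii) and (iii) for this matrix.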


Throughout the paper the following abbreviations for two types of multiple sums will be used; both will frequently be applied in the subsequent:

Σ_{i1 ≤ ··· ≤ ik}^{[m,n]} = Σ_{i1=m}^{n} Σ_{i2=i1}^{n} ··· Σ_{ik=i_{k−1}}^{n},    Σ_{i1 < ··· < ik}^{[m,n]} = Σ_{i1=m}^{n} Σ_{i2=i1+1}^{n} ··· Σ_{ik=i_{k−1}+1}^{n}.

In the next theorem, Un and Vn from the previous theorem are presented elementwise.

Theorem 10. Let Un = (u_{ij}) and Vn = (v_{ij}) be given by (21) and (22), respectively. Then

u_{ij} = (−b_{1j})^{I{j>1}} Π_{k=2, k≠j}^{i} b_{jk},  i ≥ j,   (23)

and

v_{ij} = (−b_{j1}/b_{1j})^{I{j>1}} Π_{k=1}^{j−1} b_{ik}^{−1},  i ≥ j.   (24)

Example 1. For n = 4 the matrices U4 and V4 are given by

U4 = [ 1            0             0             0
       b12          −b12          0             0
       b12 b13      −b12 b23      −b13 b32      0
       b12 b13 b14  −b12 b23 b24  −b13 b32 b34  −b14 b42 b43 ],

V4 = [ 1  0               0                    0
       1  −1/b12          0                    0
       1  −b21/(b12 b31)  −1/(b13 b32)         0
       1  −b21/(b12 b41)  −b31/(b13 b41 b42)   −1/(b14 b42 b43) ].

The matrices Un and Vn may also be related to Theorem 7.

Theorem 11. Let Un and Vn be given by (21) and (22), respectively. Then,

Un = Π_{i=0}^{n−2} Diag(I_{n−i−2}, U_{n,n−i}),   (25)

Vn = Π_{i=0}^{n−2} Diag(I_{i}, U_{n,2+i}),   (26)

where U_{n,k} is defined in (19).

Before considering the VTU-decomposition, i.e. the factorization Un B Vn = Tn, which is one of the main theorems of the paper, where Tn is a triangular matrix specified below in Theorem 12, a technical lemma stating another basic property of B ∈ B_n is presented. Once again the proof is omitted.

Lemma 1. Let B ∈ B_n and let (U_n^{21} : U_n^{22}) be the last row in Un, given in (21). Then,

(U_n^{21} : U_n^{22}) B = 0.

Theorem 12 (VTU-decomposition). Let B ∈ B_n, and let Un and Vn = Un^{−1} be the triangular matrices given by (21) and (22), respectively. Then Un B Vn = Tn, where the upper triangular Tn equals

Tn = Σ_{k=1}^{n} (n − k) e_k e_k′ − Σ_{k=1}^{n−1} e_k e_{k+1}′ − Σ_{r=3}^{n} Σ_{k=1}^{r−2} Π_{m=k+1}^{r−1} b_{rm}^{−1} e_k e_r′ + Σ_{r=3}^{n} Σ_{k=1}^{r−2} Σ_{l=k+1}^{r−1} Π_{m=k+1}^{l} b_{rm}^{−1} b_{lr} e_k e_l′.

Proof. After the proof we show some details for n = 3. Suppose that U_{n−1} B_{n−1} V_{n−1} = T_{n−1} holds, where B_{n−1} ∈ B_{n−1}. Using the notation of Theorem 9,

Un B Vn = [ U_{n−1} 0; U_n^{21} U_n^{22} ] B [ V_{n−1} 0; V_n^{21} V_n^{22} ],

and let B be partitioned as

B = [ B11 B12; B21 B22 ],  with blocks of sizes (n−1)×(n−1), (n−1)×1, 1×(n−1) and 1×1.

From Lemma 1 it follows that (U_n^{21} : U_n^{22}) B = 0, and thus

Un B Vn = [ U_{n−1} B11 V_{n−1} + U_{n−1} B12 V_n^{21}   U_{n−1} B12 V_n^{22}; 0   0 ].   (27)

The blocks of non-zero elements should be studied in some detail. Thus, one has to show that U_{n−1} B12 V_n^{22} equals the first n − 1 elements in the nth column of Tn. Let T = (t_{ij}), where t_{ij} = 0 if i > j. For example, for the second element in U_{n−1} B12 V_n^{22}:

−(−b12 b_{2n} + b12 b_{1n}) b_{1n}^{−1} Π_{m=2}^{n−1} b_{nm}^{−1} = −b_{n2} b_{1n} b_{1n}^{−1} Π_{m=2}^{n−1} b_{nm}^{−1} = −Π_{m=3}^{n−1} b_{nm}^{−1},

which equals t_{2n}. For U_{n−1} B11 V_{n−1} + U_{n−1} B12 V_n^{21}, given in (27), it is noted that this expression equals

U_{n−1} B_{n−1} V_{n−1} + I − Σ_{i=1}^{n−1} b_{in} U_{n−1} d_i d_i′ V_{n−1} + Σ_{i=1}^{n−1} b_{in} U_{n−1} d_i V_n^{21},   (28)

and then the two last terms in (28) should be exploited. After some calculations this gives a useful recursive relation between Un B Vn and U_{n−1} B_{n−1} V_{n−1}:

Un Bn Vn = (I_{n−1} : 0)′ U_{n−1} B_{n−1} V_{n−1} (I_{n−1} : 0) + Σ_{k=1}^{n−1} e_k e_k′ − e_{n−1} e_n′ − Σ_{k=1}^{n−2} Π_{m=k+1}^{n−1} b_{nm}^{−1} e_k e_n′ + Σ_{k=1}^{n−2} Σ_{l=k+1}^{n−1} Π_{m=k+1}^{l} b_{nm}^{−1} b_{ln} e_k e_l′.

Utilizing this expression together with the induction assumption U_{n−1} B_{n−1} V_{n−1} = T_{n−1} leads to the Tn of the theorem. ⊓⊔
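The VTU-decomposition can be checked numerically from the elementwise formula (23) alone: build B from b_ij = r_i/(r_i − r_j) (arbitrary distinct r_i, an illustrative choice), assemble U_n, and verify Lemma 1 together with the triangular form claimed by Theorem 12. A sketch:

```python
import numpy as np

# B in B_n from b_ij = r_i/(r_i - r_j); diagonal from condition (1)
r = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
n = len(r)
B = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            B[i, j] = r[i] / (r[i] - r[j])
B[np.diag_indices(n)] = B.sum(axis=0)

# U_n elementwise via (23): u_ij = (-b_1j)^{I(j>1)} * prod_{k=2..i, k!=j} b_jk, i >= j
U = np.zeros((n, n))
for i in range(1, n + 1):
    for j in range(1, i + 1):
        u = -B[0, j - 1] if j > 1 else 1.0
        for k in range(2, i + 1):
            if k != j:
                u *= B[j - 1, k - 1]
        U[i - 1, j - 1] = u

# Lemma 1: the last row of U_n annihilates B
assert np.allclose(U[-1] @ B, 0.0)

# Theorem 12: T_n = U_n B U_n^{-1} is upper triangular with diagonal n-1, ..., 1, 0
T = U @ B @ np.linalg.inv(U)
assert np.allclose(np.tril(T, -1), 0.0, atol=1e-8)
assert np.allclose(np.diag(T), np.arange(n - 1, -1, -1))
print(np.diag(T))  # approximately [4, 3, 2, 1, 0]
```

Since U_n is lower triangular with nonzero diagonal it is invertible, so the similarity U_n B U_n^{-1} exhibits the spectrum {0, 1, ..., n − 1} directly on the diagonal of T_n.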


Corollary 3. Let Tn = (t_{ij}) be the upper triangular matrix defined in Theorem 12. Then the elements of Tn are given by

t_{ij} = Σ_{k=j+1}^{n} Π_{l=i+1}^{j} b_{kl}^{−1} − Σ_{k=i}^{j−1} t_{ik} = Σ_{k=j+1}^{n} Π_{l=i+1}^{j} b_{kl}^{−1} − I{j>i} Σ_{k=j}^{n} Π_{l=i+1}^{j−1} b_{kl}^{−1}.

Observe that the expression implies that t_{ii} = n − i. Moreover, Tn 1 = 0. The structure of the matrix Tn is the following (displayed via its transpose Tn′, whose entries above the main diagonal are all zero):

Tn′ =
[ n−1
  Σ_{i′=3}^{n} b_{i′2}^{−1} − (n−1)                                   n−2
  Σ_{i′=4}^{n} Π_{j=2}^{3} b_{i′j}^{−1} − Σ_{i′=3}^{n} b_{i′2}^{−1}   Σ_{i′=4}^{n} b_{i′3}^{−1} − (n−2)                                   n−3
  ⋮                                                                   Σ_{i′=5}^{n} Π_{j=3}^{4} b_{i′j}^{−1} − Σ_{i′=4}^{n} b_{i′3}^{−1}   ⋱
  ⋮                                                                   ⋮                                                                   ⋯   2
  ⋮                                                                   ⋮                                                                   ⋯   b_{n,n−1}^{−1} − 2   1
  −Π_{j=2}^{n−1} b_{nj}^{−1}                                          ⋯                                                                   ⋯   −b_{n,n−1}^{−1}      −1   0 ].

This section is ended by showing some detailed calculations for n = 3.

Example 2. For n = 3,

T3 = [ 2   b32^{−1} − 2   −b32^{−1}
       0   1              −1
       0   0              0 ].

From (23) and (24) in Theorem 10 we have

U3 = [ 1        0         0
       b12      −b12      0
       b12 b13  −b12 b23  −b13 b32 ],

V3 = [ 1   0                        0
       1   −b12^{−1}                0
       1   b23 b13^{−1} b32^{−1}    −b13^{−1} b32^{−1} ].

We are going to show that V3 T3 U3 = B ∈ B3. Now

T3 U3 = [ 2 − 2b12 + (b12 − b13)   −b12 b32^{−1}(1 − b23) + 2b12   b13
          b12 b31                  −b12 b32                        b13 b32
          0                        0                               0 ]
      = [ b21 + b31   b12        b13
          b12 b31     −b12 b32   b13 b32
          0           0          0 ],

since b12 b31 b32^{−1} = b12 − b13 by (3) and 1 − b23 = b32, and then

V3 T3 U3 = [ b21 + b31   b12        b13
             b21         b12 + b32  b23
             b31         b32        b13 + b23 ] = B,

where in the above calculations we have used (3) and Theorem 2(ii); for instance, the (2,3) entry equals b13 − b12^{−1} b13 b32 = b23 by (3), and the (3,1) entry equals (b21 + b31) + b23 b13^{−1} b32^{−1} b12 b31 = b31, since b12 b23 b31 = −b13 b21 b32 by Theorem 2(ii).

5

Eigenvectors of the matrix B

It is already known from Theorem 5 that the matrix B ∈ Bn has eigenvalues {0, 1, . . . , n − 1}. This an also be seen from the stru ture of the matrix Tn given in Corollary 3 and the fa t that the matri es B and Tn are similar, i.e. Un BUn−1 = Tn . The right eigenve tors of the matrix B are of spe ial interest in sampling theory when B is a fun tion of the in lusion probabilities, outlined in the Introdu tion. We are going to present the eigenve tors of the matrix B ∈ Bn in a general form. From Se tion 2 we know that Un BU−1 n = Tn , where the matrix Tn is an upper-triangular matrix given by Theorem 12. Sin e B and Tn are similar, they have the same eigenvalues and then the eigenve tors of B are rather easy to obtain using the eigenve tors of Tn . In the next theorem we shall obtain expli it expressions for the eigenve tors of the matrix Tn .

Theorem 13. Let Tn be given by Theorem 12. Then there exist upper triangular matrices VT and UT, with UT = VT^{-1}, such that

Tn = UT Λ VT,   Λ = diag(n − 1, n − 2, . . . , 1, 0).   (29)

The matrix UT = (uij) is given by

uij = 1 + Σ_{g=1}^{j−i} (−1)^g Σ^{[j+1,n]},

where Σ^{[j+1,n]} denotes summation over increasing indices i1 < · · · taken from {j + 1, . . . , n}.

Au > 0 for some vector u > 0; all the eigenvalues of A have positive real parts.

Riccati equations associated with an M-matrix

Theorem 4. For a Z-matrix A it holds that: A is an M-matrix if and only if there exists a nonzero vector v ≥ 0 such that Av ≥ 0, or a nonzero vector w ≥ 0 such that w^T A ≥ 0.
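As a quick numerical illustration (our own sketch, not part of the text): for a nonsingular M-matrix A the candidate vector v = A^{-1}e is positive and satisfies Av = e > 0, so the nonsingular case of the characterization can be tested as follows; the function name is hypothetical.

```python
import numpy as np

def is_nonsingular_m_matrix(A, tol=1e-12):
    """Test the characterization of Theorem 4 in the nonsingular case:
    A must be a Z-matrix (nonpositive off-diagonal entries), and v = A^{-1} e
    must be positive -- then Av = e > 0 exhibits the required vector."""
    A = np.asarray(A, dtype=float)
    off = A - np.diag(np.diag(A))
    if off.max() > tol:                  # not a Z-matrix
        return False
    try:
        v = np.linalg.solve(A, np.ones(A.shape[0]))
    except np.linalg.LinAlgError:        # singular, hence not a nonsingular M-matrix
        return False
    return bool((v > tol).all())

print(is_nonsingular_m_matrix([[2., -1.], [-1., 2.]]))   # True
print(is_nonsingular_m_matrix([[1., -2.], [-2., 1.]]))   # False: a Z-matrix, but not an M-matrix
```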

The equivalence of (a) and (c) in Theorem 3 implies the next result.

Lemma 5. Let A be a nonsingular M-matrix. If B ≥ A is a Z-matrix, then B is also a nonsingular M-matrix.

The following well-known result concerns the properties of Schur complements of M-matrices.

Lemma 6. Let M be a nonsingular M-matrix or an irreducible singular M-matrix. Partition M as

M = [M11, M12; M21, M22],

where M11 and M22 are square matrices. Then M11 and M22 are nonsingular M-matrices. The Schur complement of M11 (or M22) in M is also an M-matrix (singular or nonsingular according to M). Moreover, the Schur complement is irreducible if M is irreducible.

2.2 The dual equation

Reversing the coefficients of equation (1) yields the dual equation

YBY − YA − DY + C = 0,   (4)

which is still a NARE, associated with the matrix

N = [A, −B; −C, D],

which is a nonsingular M-matrix or an irreducible singular M-matrix if and only if the matrix M is so. In fact, N is clearly a Z-matrix and N = ΠMΠ, where Π = Π^{-1} is the matrix which permutes the blocks of M. So, if Mv > 0 for v > 0, then NΠv > 0 and, by Theorem 4, N is an M-matrix.

2.3 Existence of nonnegative solutions

The special structure of the matrix M of (2) allows one to prove the existence of a minimal nonnegative solution S of (1), i.e., a solution S ≥ 0 such that X − S ≥ 0 for any solution X ≥ 0 to (1). See [20] and [21] for more details.

Theorem 7. Let M in (2) be an M-matrix. Then the NARE (1) has a minimal nonnegative solution S. If M is irreducible, then S > 0, and A − SC and D − CS are irreducible M-matrices. If M is nonsingular, then A − SC and D − CS are nonsingular M-matrices.


D. Bini, B. Iannazzo, B. Meini, F. Poloni

Observe that the above theorem holds for the dual equation (4) and guarantees the existence of a minimal nonnegative solution of (4), which is denoted by T.

2.4 The eigenvalue problem associated with the matrix equation

A useful technique frequently encountered in the theory of matrix equations consists in relating the solutions to some invariant subspaces of a matrix polynomial. In particular, the solutions of (1) can be described in terms of the invariant subspaces of the matrix

H = [D, −C; B, −A],   (5)

which is obtained by premultiplying the matrix M by J = [In, 0; 0, −Im]. In fact, if X is a solution of equation (1), then, by direct inspection,

H [In; X] = [In; X] R,   (6)

where R = D − CX. Moreover, the eigenvalues of the matrix R are a subset of the

eigenvalues of H. Conversely, if the columns of the (n + m) × n matrix [Z; T] span an invariant subspace of H, and Z is a nonsingular n × n matrix, then TZ^{-1} is a solution of the Riccati equation; in fact

H [Z; T] = [Z; T] V

for some V, from which, post-multiplying by Z^{-1}, one obtains

H [I; TZ^{-1}] = [I; TZ^{-1}] ZVZ^{-1};

setting X = TZ^{-1} one has D − CX = ZVZ^{-1} and B − AX = XD − XCX. Similarly, for the solutions of the dual equation it holds that

H [Y; Im] = [Y; Im] U,

where U = BY − A. The eigenvalues of the matrix U are a subset of the eigenvalues of H.

2.5 The eigenvalues of H

We say that a set A of k complex numbers has a (k1, k2) splitting with respect to the unit circle if k = k1 + k2 and A = A1 ∪ A2, where A1 is formed by k1 elements of modulus at most 1 and A2 is formed by k2 elements of modulus at least 1. Similarly, we say that A has a (k1, k2) splitting with respect to the imaginary axis if k = k1 + k2 and A = A1 ∪ A2, where A1 is formed by k1 elements with nonpositive real part and A2 is formed by k2 elements with nonnegative real part. We say that the splitting is complete if at least one of the sets A1 or A2 has no elements on its boundary. Since the eigenvalues of an M-matrix have nonnegative real parts, it follows that the eigenvalues of H have an (m, n) splitting with respect to the imaginary axis. This property is proved in the next theorem.

Theorem 8. Let M be an irreducible M-matrix. Then the eigenvalues of H = JM have an (m, n) splitting with respect to the imaginary axis. Moreover, the only eigenvalue that can lie on the imaginary axis is 0.

Proof. Let v > 0 be the only positive eigenvector of M, and let λ ≥ 0 be the associated eigenvalue; define Dv = diag(v). The matrix M' = Dv^{-1} M Dv has the same eigenvalues as M; moreover, it is an M-matrix such that M'e = λe. Due to the sign structure of M-matrices, this means that M' is diagonally dominant (strictly in the nonsingular case). Notice that H' = Dv^{-1} H Dv = JM', thus H' is diagonally dominant as well, with m negative and n positive diagonal entries. We apply Gershgorin's theorem [30, Sec. 14] to H'; due to the diagonal dominance, the Gershgorin circles never cross the imaginary axis (in the singular case, they are tangent at 0). Thus, by a continuity argument, m eigenvalues of H' lie in the negative half-plane and n in the positive one, and the only eigenvalues on the imaginary axis are the zero ones. But since H and H' are similar, they have the same eigenvalues. ⊓⊔
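The splitting stated in Theorem 8 can be observed numerically on a small example of our own (n = m = 2): the matrix M below is a strictly diagonally dominant Z-matrix, hence a nonsingular M-matrix.

```python
import numpy as np

n = m = 2
D = np.array([[5., -1.], [-1., 5.]]);  A = D.copy()
C = np.ones((n, m));                   B = np.ones((m, n))
# M = [[D, -C], [-B, A]] is a strictly diagonally dominant Z-matrix,
# hence a nonsingular M-matrix.
M = np.block([[D, -C], [-B, A]])
J = np.diag([1.] * n + [-1.] * m)      # J = diag(I_n, -I_m)
H = J @ M                              # H = [[D, -C], [B, -A]], eq. (5)
eigs = np.linalg.eigvals(H)
# (m, n) splitting: n eigenvalues in the open right half-plane, m in the left one
print(int((eigs.real > 0).sum()), int((eigs.real < 0).sum()))
```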

We can give a more precise result on the location of the eigenvalues of H after defining the drift of the Riccati equation. Indeed, when M is a singular irreducible M-matrix, by the Perron–Frobenius theorem the eigenvalue 0 is simple and there are positive vectors u and v such that

u^T M = 0,   Mv = 0,   (7)

and both the vectors u and v are unique up to a scalar factor. Writing u = [u1; u2] and v = [v1; v2], with u1, v1 ∈ Rn and u2, v2 ∈ Rm, one can define

µ = u2^T v2 − u1^T v1 = −u^T J v.   (8)

The number µ determines some properties of the Riccati equation. Depending on the sign of µ and following Markov chain terminology, one can call µ the


drift as in [6], and can classify the Riccati equations associated with a singular irreducible M-matrix in three categories:

(a) positive recurrent if µ < 0;
(b) null recurrent if µ = 0;
(c) transient if µ > 0.

In fluid queue problems, v coincides with the vector of ones. In general, v and u can be computed by performing the LU factorization of the matrix M, say M = LU, and solving the two triangular linear systems u^T L = [0, . . . , 0, 1] and Uv = 0 (see [30, Sec. 54]). The location of the eigenvalues of H is made precise in the following [20, 23]:

Theorem 9. Let M be a nonsingular or a singular irreducible M-matrix, and let λ1, . . . , λm+n be the eigenvalues of H = JM ordered by nonincreasing real part. Then λn and λn+1 are real and

Re λn+m ≤ · · · ≤ Re λn+2 < λn+1 ≤ 0 ≤ λn < Re λn−1 ≤ · · · ≤ Re λ1.

The minimal nonnegative solutions S and T of the equation (1) and of the dual equation (4), respectively, are such that σ(D − CS) = {λ1, . . . , λn} and σ(A − SC) = σ(A − BT) = {−λn+1, . . . , −λn+m}. If M is nonsingular, then λn+1 < 0 < λn. If M is singular and irreducible, then:

1. if µ < 0, then λn = 0 and λn+1 < 0;
2. if µ = 0, then λn = λn+1 = 0 and there exists only one eigenvector, up to a scalar constant, for the eigenvalue 0;
3. if µ > 0, then λn > 0 and λn+1 = 0.

We call λn and λn+1 the central eigenvalues of H. If H (and thus M) is nonsingular, then the central eigenvalues lie in two different half-planes, so the splitting is complete. In the singular case the splitting is complete if and only if µ ≠ 0. The close-to-null-recurrent case, i.e., the case µ ≈ 0, deserves particular attention, since it corresponds to an ill-conditioned null eigenvalue of the matrix H. In fact, if u and v are normalized so that ‖u‖2 = ‖v‖2 = 1, then 1/|µ| is the condition number of the null eigenvalue of H (see [19]). When M is singular irreducible, by the Perron–Frobenius theorem the eigenvalue 0 is simple; therefore H = JM has a one-dimensional kernel, and u^T J and v are the unique (up to a scalar constant) left and right eigenvectors, respectively, corresponding to the eigenvalue 0. However, the algebraic multiplicity of 0 as an eigenvalue of H can be 2; in that case, the Jordan form of H has a 2 × 2 Jordan block corresponding to the 0 eigenvalue and it holds that u^T J v = 0 [31].
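The drift (8) can be computed numerically. The sketch below uses our own toy singular irreducible M-matrix and obtains u and v as null vectors via the SVD, rather than through the LU factorization described above.

```python
import numpy as np

# Our own toy singular irreducible M-matrix (zero row sums), with n = m = 2.
M = np.array([[ 3., -1., -1., -1.],
              [-3.,  5., -1., -1.],
              [-1., -2.,  4., -1.],
              [-1., -1., -1.,  3.]])
n = 2

def null_vector(A):
    """Null vector of A from the SVD (right singular vector of the
    smallest singular value)."""
    return np.linalg.svd(A)[2][-1]

v = null_vector(M)                  # right null vector: M v = 0
u = null_vector(M.T)                # left null vector:  u^T M = 0
v = v / v.sum();  u = u / u.sum()   # both can be taken positive (Perron-Frobenius)
mu = u[n:] @ v[n:] - u[:n] @ v[:n]  # drift (8): mu = u2^T v2 - u1^T v1
print(mu)                           # here mu = -0.025 < 0: positive recurrent
```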


The next result, presented in [25], shows the reduction from the case µ < 0 to the case µ > 0 and conversely, when M is singular irreducible. This property enables us to restrict our interest only to the case µ ≤ 0.

Lemma 10. The matrix S is the minimal nonnegative solution of (1) if and only if Z = S^T is the minimal nonnegative solution of the equation

X C^T X − X A^T − D^T X + B^T = 0.   (9)

Therefore, if M is singular and irreducible, the equation (1) is transient if and only if the equation (9) is positive recurrent.

Therefore, if M is singular and irredu ible, the equation (1) is transient if and only if the equation (9) is positive re urrent. Proof. The rst part is easily shown by taking transpose on both sides of the

equation (1). The M-matrix orresponding to (9) is Mt =

Sin e

AT −CT . −BT DT

T T v2 v1 Mt = 0,

the se ond part readily follows. 2.6

Mt

u2 = 0, u1

⊓ ⊔

The differential of the Riccati operator

The matrix equation (1) defines a Riccati operator

R(X) = XCX − AX − XD + B,

whose differential dRX at a point X is

dRX[H] = HCX + XCH − AH − HD.   (10)

The differential H → dRX[H] is a linear operator which can be represented by the matrix

∆X = (CX − D)^T ⊗ Im + In ⊗ (XC − A),   (11)

where ⊗ denotes the Kronecker product (see [30, Sec. 10]). We say that a solution X of the matrix equation (1) is critical if the matrix ∆X is singular. From the properties of the Kronecker product [30, Sec. 10], it follows that the eigenvalues of ∆X are the sums of those of CX − D and XC − A. If X = S, where S is the minimal nonnegative solution, then D − CX and A − XC are M-matrices (compare Theorem 7), and thus all the eigenvalues of ∆S have nonpositive real parts. Moreover, since D − CS and A − SC are M-matrices, −∆S is an M-matrix. The minimal nonnegative solution S is critical if and only if both M-matrices D − CS and A − SC are singular; thus, in view of Theorem 9, the minimal solution is critical if and only if M is irreducible singular and µ = 0. Moreover, if 0 ≤ X ≤ S, then D − CX ≥ D − CS and A − XC ≥ A − SC are nonsingular M-matrices by Lemma 5, thus −∆X is a nonsingular M-matrix.
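The Kronecker representation (11) of the differential (10) can be checked directly on random data (our own sketch; vec is column-major, matching the rule vec(AXB) = (B^T ⊗ A)vec(X)).

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 2                          # X is m x n, as in R(X) = XCX - AX - XD + B
A = rng.standard_normal((m, m));  D = rng.standard_normal((n, n))
C = rng.standard_normal((n, m));  X = rng.standard_normal((m, n))
Hd = rng.standard_normal((m, n))     # direction of differentiation

vec = lambda Y: Y.flatten(order='F')                                       # column-major vec
dR = Hd @ C @ X + X @ C @ Hd - A @ Hd - Hd @ D                             # differential (10)
Delta = np.kron((C @ X - D).T, np.eye(m)) + np.kron(np.eye(n), X @ C - A)  # matrix (11)
print(np.allclose(Delta @ vec(Hd), vec(dR)))                               # True
```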

2.7 The number of positive solutions

If the matrix M is irreducible, Theorem 7 states that there exists a minimal positive solution S of the NARE. In the study of nonsymmetric Riccati differential equations associated with an M-matrix [18, 34] one is interested in all the positive solutions. In [18] it is shown that if M is nonsingular, or singular irreducible with µ ≠ 0, then there exists a second solution S+ such that S+ ≥ S, and S+ is obtained by a rank one correction of the matrix S. More precisely, the following result holds [18].

Theorem 11. If M is irreducible nonsingular or irreducible singular with µ ≠ 0, then there exists a second positive solution S+ of (1) given by

S+ = S + k a b^T,

where k = (λn − λn+1)/(b^T C a), a is such that (A − SC)a = −λn+1 a, and b is such that b^T(D − CS) = λn b^T.

We prove that there are exactly two nonnegative solutions in the noncritical case and only one in the critical case. In order to prove this result, it is useful to study the form of the Jordan chains of an invariant subspace of H corresponding to a positive solution.

Lemma 12. Let M be irreducible and let Σ be any positive solution of (1). Denote by η1, . . . , ηn the eigenvalues of D − CΣ ordered by nondecreasing real part. Then η1 is real, and there exists a positive eigenvector v of H associated with η1. Moreover, any other vector independent of v, belonging to Jordan chains of H corresponding to η1, . . . , ηn, cannot be positive or negative.

Proof. Since Σ is a solution of (1), from (6) one has

H [I; Σ] = [I; Σ](D − CΣ).

Since D − CS is an irreducible M-matrix by Theorem 7, and Σ ≥ S (S is the minimal positive solution), D − CΣ is an irreducible Z-matrix and thus can be written as sI − N with N nonnegative and irreducible. Then, by Theorem 1 and Corollary 2, η1 is a simple real eigenvalue of D − CΣ, the corresponding eigenvector can be chosen positive, and there are no other positive or negative eigenvectors or Jordan chains corresponding to any of the eigenvalues. Let P^{-1}(D − CΣ)P = K be the Jordan canonical form of D − CΣ, where the first column of P is the positive eigenvector corresponding to η1. Then we have

H [P; ΣP] = [P; ΣP] K.


Thus, the columns of [P; ΣP] are the Jordan chains of H corresponding to η1, . . . , ηn, and there are no positive or negative columns, except for the first one. ⊓⊔

Theorem 13. If M is an irreducible nonsingular M-matrix or an irreducible singular M-matrix with µ ≠ 0, then (1) has exactly two positive solutions. If M is irreducible singular with µ = 0, then (1) has a unique positive solution.

Proof. From Lemma 12 applied to S it follows that H has a positive eigenvector corresponding to λn, and no other positive or negative eigenvectors or Jordan chains corresponding to λ1, . . . , λn. Let T be the minimal nonnegative solution of the dual equation (4). Then

H [T; I] = [T; I](−(A − BT)).

As in the proof of Lemma 12, we can prove that H has a positive eigenvector corresponding to the eigenvalue λn+1 and no other positive or negative eigenvectors or Jordan chains corresponding to λn+1, . . . , λn+m. If M is irreducible nonsingular, or irreducible singular with µ ≠ 0, then λn > λn+1, and there are only two linearly independent positive eigenvectors corresponding to real eigenvalues. By Lemma 12, there can be at most two solutions, corresponding to λn, λn−1, . . . , λ1 and to λn+1, λn−1, . . . , λ1, respectively. Since it is known from Theorem 11 that there exist at least two positive solutions, (1) has exactly two positive solutions. If M is irreducible singular with µ = 0, there is only one positive eigenvector, corresponding to λn = λn+1, and the unique solution of (1) is obtained from the Jordan chains corresponding to λn, λn−1, . . . , λ1. ⊓⊔

The next results provide a property of the minimal solutions which will be useful in Section 4.

Theorem 14. Let M be singular and irreducible, and let S and T be the minimal nonnegative solutions of (1) and (4), respectively. Then the following properties hold:

(a) if µ < 0, then Sv1 = v2 and Tv2 < v1;
(b) if µ = 0, then Sv1 = v2 and Tv2 = v1;
(c) if µ > 0, then Sv1 < v2 and Tv2 = v1.

Proof. From the proof of Theorem 13 it follows that, if µ ≠ 0, there exist two independent positive eigenvectors a and b of H relative to the central eigenvalues λn and λn+1, respectively. We write a = [a1; a2] and b = [b1; b2], with a1, b1 ∈ Rn and a2, b2 ∈ Rm.


Since the solution S is constructed from an invariant subspace containing a, we have Sa1 = a2; since the solution S+ is constructed from an invariant subspace containing b, we have S+b1 = b2. Analogously, if T+ is the second positive solution of the dual equation, then Tb2 = b1 and T+a2 = a1. The statements (a) and (c) follow from the fact that if µ < 0 then v = a (compare Theorem 9), so Sv1 = v2 and Tv2 < T+v2 = v1, since T < T+; if µ > 0 then v = b, so Tv2 = v1 and Sv1 < S+v1 = v2, since S < S+. The statement (b), corresponding to the case µ = 0, can be proved in a similar way. ⊓⊔

Remark 1. When µ > 0, from Lemma 10 and Theorem 14 we deduce that the minimal nonnegative solution S of (1) is such that u2^T S = u1^T.

2.8 Perturbation analysis for the minimal solution

We conclude this section with a result of Guo and Higham [24], who give a qualitative description of the perturbation of the minimal nonnegative solution S of a NARE (1) associated with an M-matrix. The result is split in two theorems, where an M-matrix M~ is considered which is obtained by means of a small perturbation of M. Here, we denote by S~ the minimal nonnegative solution of the perturbed Riccati equation associated with M~.

Theorem 15. If M is a nonsingular M-matrix or an irreducible singular M-matrix with µ ≠ 0, then there exist constants γ > 0 and ε > 0 such that ‖S~ − S‖ ≤ γ‖M~ − M‖ for all M~ with ‖M~ − M‖ < ε.

Theorem 16. If M is an irreducible singular M-matrix with µ = 0, then there exist constants γ > 0 and ε > 0 such that

(a) ‖S~ − S‖ ≤ γ‖M~ − M‖^{1/2} for all M~ with ‖M~ − M‖ < ε;
(b) ‖S~ − S‖ ≤ γ‖M~ − M‖ for all singular M~ with ‖M~ − M‖ < ε.

It is interesting to observe that in the critical case, where µ = 0, or when µ ≈ 0, one has to expect poor numerical performance even if the algorithm used for approximating S is backward stable. Moreover, the rounding errors introduced to represent the input values of M in the floating point representation with precision ε may generate an error of the order √ε in the solution S. This kind of problem will be overcome in Section 4.1.

3 Numerical methods

We give a brief review of the numerical methods developed so far for computing the minimal nonnegative solution of the NARE (1) associated with an M-matrix.


Here we consider the case where the M-matrix M is nonsingular, or is singular, irreducible and µ ≤ 0. The case µ > 0 can be reduced to the case µ < 0 by means of Lemma 10. The critical case where µ = 0 needs different techniques, which will be treated in Section 4. We start with a direct method based on the Schur form of the matrix H, then we consider iterative methods based on fixed-point techniques and Newton's iteration, and we conclude the section by analyzing a class of doubling algorithms. The latter class includes methods based on Cyclic Reduction (CR) of [9], and on the Structure-preserving Doubling Algorithm (SDA) of [2].

3.1 Schur method

A classical approach for solving equation (1) is to use the (ordered) Schur decomposition of the matrix H to compute the invariant subspace corresponding to the minimal solution S. This approach for the symmetric algebraic Riccati equation was first presented by Laub in 1979 [40]. Concerning the NARE, a study of this method in the singular and critical cases was done by Guo [23], who presented a modified Schur method for the critical or nearly critical case (µ ≈ 0). As explained in Section 2.4, from

H [In; S] = [In; S](D − CS)

it follows that finding the minimal solution S of the NARE (1) is equivalent to finding a basis of the invariant subspace of H relative to the eigenvalues of D − CS, i.e., the eigenvalues of H with nonnegative real part. A method for finding an invariant subspace is obtained by computing a semi-ordered Schur form of H, that is, computing an orthogonal matrix Q and a quasi upper-triangular matrix T such that Q*HQ = T, where T is block upper triangular with diagonal blocks Ti,i of size at most 2. The semi-ordering means that if Ti,i, Tj,j and Tk,k are diagonal blocks having eigenvalues with positive, null and negative real parts, respectively, then i < j < k. A semi-ordered Schur form can be computed in two steps:

– compute a real Schur form of H by the customary Hessenberg reduction followed by the application of the QR algorithm, as described in [19];
– swap the diagonal blocks by means of orthogonal transformations, as described in [4].
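The two steps above correspond to what scipy.linalg.schur performs when given a sorting criterion. The sketch below, on our own toy NARE, recovers S = Q2 Q1^{-1} from the semi-ordered real Schur form; it is only an illustration, not the modified method of [23].

```python
import numpy as np
from scipy.linalg import schur

n = m = 2
D = np.array([[5., -1.], [-1., 5.]]);  A = D.copy()
C = np.ones((n, m));                   B = np.ones((m, n))
H = np.block([[D, -C], [B, -A]])       # H = JM, eq. (5)

# Real Schur form with the eigenvalues of nonnegative real part ordered first
T, Q, sdim = schur(H, output='real', sort='rhp')
Q1, Q2 = Q[:n, :n], Q[n:, :n]          # first n columns span the invariant subspace
S = Q2 @ np.linalg.inv(Q1)             # S = Q2 Q1^{-1}
residual = S @ C @ S - A @ S - S @ D + B
print(sdim, np.linalg.norm(residual))  # sdim = n, tiny residual
```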

The minimal solution of the NARE can be obtained from the first n columns of the matrix Q, partitioned as [Q1; Q2] with Q1 an n × n matrix; that is, S = Q2 Q1^{-1}. In the critical case this method does not work, since there is no way to choose an invariant subspace relative to the first n eigenvalues; moreover, in the near


critical case where µ ≈ 0, there is a lack of accuracy since the 0 eigenvalue is ill-conditioned. However, the modified Schur method given by C.-H. Guo [24] overcomes these problems. The cost of this algorithm, following [23], is 200n^3 flops.

3.2 Functional iterations

In [20] a class of fixed-point methods for (1) is considered. The fixed-point iterations are based on suitable splittings of A and D, that is, A = A1 − A2 and D = D1 − D2, with A1, D1 chosen to be M-matrices and A2, D2 ≥ 0. The form of the iterations is

A1 Xk+1 + Xk+1 D1 = Xk C Xk + Xk D2 + A2 Xk + B,   (12)

where at each step a Sylvester equation of the form M1 X + X M2 = N must be solved. Some possible choices for the splitting are:

1. A1 and D1 are the diagonal parts of A and D, respectively;
2. A1 is the lower triangular part of A and D1 the upper triangular part of D;
3. A1 = A and D1 = D.

The solution Xk+1 of the Sylvester equation can be computed, for instance, by using the Bartels and Stewart method [5], as in the sylvsol function of Nick Higham's Matrix Function Toolbox for MATLAB [28]. The cost of this computation is roughly 60n^3 flops, including the computation of the Schur forms of the coefficients A1 and D1 [29]. However, observe that for the first splitting A1 and D1 are diagonal matrices, so the Sylvester equation can be solved with O(n^2) flops; for the second splitting, the matrices A1 and D1 are already in Schur form, which substantially reduces the cost of the Bartels and Stewart method to 2n^3 flops. Concerning the third iteration, observe that the matrix coefficients A1 and D1 are independent of the iteration; therefore, the computation of their Schur forms must be performed only once. A monotonic convergence result holds for the three iterations [20].

Theorem 17. If R(X) ≤ 0 for some positive matrix X, then for the fixed-point iterations (12) with X0 = 0 it holds that Xk < Xk+1 < X for k ≥ 0. Moreover, lim Xk = S.

We also have an asymptotic convergence result [20].

Theorem 18. For the fixed-point iterations (12) with X0 = 0, it holds that

lim sup_k ‖Xk − S‖^{1/k} = ρ((I ⊗ A1 + D1^T ⊗ I)^{-1}(I ⊗ (A2 + SC) + (D2 + CS)^T ⊗ I)).
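A minimal sketch of iteration (12) with the third splitting (A1 = A, D1 = D), on our own toy data; for simplicity the Sylvester equation is solved here through its Kronecker-product form instead of the Bartels and Stewart method.

```python
import numpy as np

n = m = 2
D = np.array([[5., -1.], [-1., 5.]]);  A = D.copy()
C = np.ones((n, m));                   B = np.ones((m, n))

def sylvester(A1, D1, Q):
    """Solve A1 X + X D1 = Q via (I kron A1 + D1^T kron I) vec(X) = vec(Q)."""
    K = np.kron(np.eye(n), A1) + np.kron(D1.T, np.eye(m))
    return np.linalg.solve(K, Q.flatten(order='F')).reshape((m, n), order='F')

X = np.zeros((m, n))
for _ in range(100):                    # third splitting: A1 = A, D1 = D
    X = sylvester(A, D, X @ C @ X + B)  # iteration (12), monotone from X0 = 0
residual = X @ C @ X - A @ X - X @ D + B
print(np.linalg.norm(residual))         # linear convergence to the minimal solution S
```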


These iterations have linear convergence, which turns to sublinear in the critical case. The computational cost varies from 8n^3 arithmetic operations per step for the first splitting, to 64n^3 for the first step plus 10n^3 for each subsequent step for the last splitting. The most expensive iteration is the third one which, on the other hand, has the highest (linear) convergence speed.

3.3 Newton's method

Newton's iteration was first applied to the symmetric algebraic Riccati equation by Kleinman in 1968 [37] and later on by various authors. In particular, Benner and Byers [7] complemented the method with an optimization technique (exact line search) in order to reduce the number of steps needed to reach convergence. The study of Newton's method for nonsymmetric algebraic Riccati equations was started by Guo and Laub in [26], and a nice convergence result was given by Guo and Higham in [24]. The convergence of Newton's method is generally quadratic, except for the critical case, where the convergence is observed to be linear with rate 1/2 [26]. At each step a Sylvester matrix equation must be solved, so the computational cost is O(n^3) flops per step, but with a large overhead constant.

Newton's method for a NARE [26] consists in the iteration

Xk+1 = N(Xk) = Xk − (dRXk)^{-1} R(Xk),   k = 0, 1, . . . ,   (13)

which, in view of (10), can be written explicitly as

(A − Xk C)Xk+1 + Xk+1(D − C Xk) = B − Xk C Xk.   (14)

Therefore, the matrix Xk+1 is obtained by solving a Sylvester equation. This linear equation is defined by the matrix

∆Xk = (D − C Xk)^T ⊗ Im + In ⊗ (A − Xk C),

which is nonsingular if 0 ≤ Xk < S, as shown in Section 2.6. Thus, if 0 ≤ Xk < S for every k, the sequence (13) is well defined. In the noncritical case, dRS is nonsingular, and the iteration is quadratically convergent in a neighborhood of the minimal nonnegative solution S by the traditional results on Newton's method (see e.g. [36]). Moreover, the following monotonic convergence result holds [24]:

Theorem 19. Consider Newton's method (14) starting from X0 = 0. Then for each k = 0, 1, . . . , we have 0 ≤ Xk ≤ Xk+1 < S and ∆Xk is a nonsingular M-matrix. Therefore, the sequence (Xk) is well defined and converges monotonically to S.


The same result holds when 0 ≤ X0 ≤ S; the proof in [24] can easily be adapted to this case. In [26], a hybrid method was suggested, which consists in performing a certain number of iterations of a linearly convergent algorithm, such as the ones of Section 3.2, and then using the computed value as the starting point for Newton's method. At each step of Newton's iteration, the largest computational work is the solution of the Sylvester equation (14). We recall that the solution Xk+1, computed by means of the Bartels and Stewart method [5], costs roughly 60n^3 flops; therefore the overall cost of Newton's iteration is 66n^3 flops. It is worth noting that in the critical and nearly critical cases, the matrix ∆Xk becomes almost singular as Xk approaches the solution S; therefore, some numerical instability is to be expected. Such instability can be removed by means of a suitable technique, which we will describe in Section 4.1.

3.4 Doubling algorithms

In this section we report some quadratically convergent algorithms obtained in [13] for solving (1). Quadratically convergent methods for computing the extremal solution of the NARE can be obtained by transforming the NARE into a Unilateral Quadratic Matrix Equation (UQME) of the kind

A2 X^2 + A1 X + A0 = 0,   (15)

where A0, A1, A2 and X are p × p matrices. Equations of this kind can be solved efficiently by means of doubling algorithms like Cyclic Reduction (CR) [9, 12] or Logarithmic Reduction (LR) [39]. The first attempt to reduce a NARE to a UQME was performed by Ramaswami [46] in the framework of fluid queues. Subsequently, many contributions in this direction have been given by several authors [23, 10, 13, 33, 6], and different reduction techniques have been designed. Concerning algorithms, Cyclic Reduction and SDA are the most effective computational techniques. The former was applied for the first time in [9] by Bini and Meini to solve unilateral quadratic equations. The latter was first presented by Anderson in 1978 [2] for the numerical solution of discrete-time algebraic Riccati equations; a new interpretation was given by Chu, Fan, Guo, Hwang, Lin and Xu [16, 32, 41] for other kinds of algebraic Riccati equations.


CR applied to (15) generates sequences of matrices defined by the following equations:

V^(k) = (A1^(k))^{-1},
A0^(k+1) = −A0^(k) V^(k) A0^(k),
A1^(k+1) = A1^(k) − A0^(k) V^(k) A2^(k) − A2^(k) V^(k) A0^(k),
A2^(k+1) = −A2^(k) V^(k) A2^(k),
Â^(k+1) = Â^(k) − A2^(k) V^(k) A0^(k),   k = 0, 1, . . . ,   (16)

where Ai^(0) = Ai, i = 0, 1, 2, and Â^(0) = A1. The following result provides convergence properties of CR [12].

Theorem 20. Let x1, . . . , x2p be the roots of a(z) = det(A0 + zA1 + z^2 A2), including roots at infinity if deg a(z) < 2p, ordered by increasing modulus. Suppose that |xp| ≤ 1 ≤ |xp+1| and |xp| < |xp+1|, and that a solution G exists to (15) such that ρ(G) = |xp|. Then G is the unique solution to (15) with minimal spectral radius; moreover, if CR (16) can be carried out with no breakdown, the sequence

G^(k) = −(Â^(k))^{-1} A0

is such that, for any norm,

‖G^(k) − G‖ ≤ ϑ |xp/xp+1|^{2^k},

where ϑ > 0 is a suitable constant. Moreover, it holds that ‖A0^(k)‖ = O(|xp|^{2^k}) and ‖A2^(k)‖ = O(|xp+1|^{−2^k}).
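A sketch of CR (16) on a toy UQME of our own with a known minimal solution: with A0 = A2 = I and A1 = diag(−5/2, −10/3), the roots of a(z) are {1/2, 2} and {1/3, 3}, so the solution of minimal spectral radius is G = diag(1/2, 1/3).

```python
import numpy as np

A0 = np.eye(2)                       # original A0, kept for G^(k) = -(Ahat^(k))^{-1} A0
A1 = np.diag([-5/2, -10/3])
A2 = np.eye(2)
A0k, A1k, A2k, Ahat = A0.copy(), A1.copy(), A2.copy(), A1.copy()

for _ in range(6):                   # cyclic reduction (16)
    V = np.linalg.inv(A1k)
    # tuple assignment evaluates all right-hand sides with the old iterates
    A0k, A1k, A2k, Ahat = (-A0k @ V @ A0k,
                           A1k - A0k @ V @ A2k - A2k @ V @ A0k,
                           -A2k @ V @ A2k,
                           Ahat - A2k @ V @ A0k)

G = -np.linalg.inv(Ahat) @ A0        # converges quadratically to diag(1/2, 1/3)
print(G)
```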

Observe that the convergence conditions of the above theorem require that the roots of a(z) have a (p, p) complete splitting with respect to the unit circle. For this reason, before transforming the NARE into a UQME, it is convenient to transform the Hamiltonian H into a new matrix Ĥ such that the eigenvalues of Ĥ have an (n, m) splitting with respect to the unit circle, i.e., n eigenvalues belong to the closed unit disk and m are outside. This can be obtained by means of one of two operators: the Cayley transform Cγ(z) = (z + γ)^{-1}(z − γ), where γ > 0, or the shrink-and-shift operator Sτ(z) = 1 − τz, where τ > 0. In fact, the Cayley transform maps the open right half-plane into the open unit disk. Similarly, for suitable values of τ, the transformation Sτ maps a suitable subset of the right half-plane inside the unit disk. This property is better explained in the following result, which has been proved in [13].

Theorem 21. Let γ, τ > 0 and let

Hγ = Cγ(H) = (H + γI)^{-1}(H − γI),   Hτ = Sτ(H) = I − τH.

Assume µ < 0; then:

1. Hγ has eigenvalues ξi = Cγ(λi), i = 1, . . . , m + n, such that max_{i=1,...,n} |ξi| ≤ 1 ≤ min_{i=n+1,...,m+n} |ξi|;
2. if τ ≤ 1/max{max_i (A)i,i, max_i (D)i,i}, then Hτ has eigenvalues µi = Sτ(λi), i = 1, . . . , m + n, such that max_{i=1,...,n} |µi| ≤ 1 ≤ min_{i=n+1,...,m+n} |µi|.

η > 0, ξ < 0, and p and q are such that p^T v = q^T w = 1. Since v and w are orthogonal vectors, the double shift moves one zero eigenvalue to η and the other one to ξ. Indeed, the eigenvalues of H~ = H + ξqw^T are those of H~^T = H^T + ξwq^T, which are the eigenvalues of H except that one zero eigenvalue is replaced by ξ, by Lemma 26. Also, the eigenvalues of H¯ = H~ + ηvp^T are the eigenvalues of H~ except that the remaining zero eigenvalue is replaced by η, by Lemma 26 again. From H¯ we may define a new Riccati equation

X C¯ X − X D¯ − A¯ X + B¯ = 0.   (29)

As before, the minimal nonnegative solution S of (1) is a solution of (29) such that σ(D¯ − C¯S) = {η, λ1, . . . , λn−1}. However, it seems very difficult to determine the existence of a solution Y¯ of the dual equation of (29) such that σ(A¯ − B¯Y¯) = {−ξ, −λn+2, . . . , −λn+m}.

4.2 Choosing a new initial value

If the right eigenvector of H relative to the null eigenvalue is partitioned as v = [v1; v2], from Theorem 14 it follows that for the minimal nonnegative solution S it holds that Sv1 = v2 (and then (D − CS)v1 = 0).

In the algorithms in which the initial value can be chosen, like Newton's method, the usual choice X0 = 0 does not exploit this information; rather, it relies only on the positivity of S. Note that in the Riccati equations modeling fluid queues, the condition Xv1 = v2 is equivalent to the stochasticity of S, since v1 = v2 = e. A possibly better convergence is expected if one could generate a sequence such that Xk v1 = v2 for any k ≥ 0. More precisely, one must choose an iteration


which preserves the affine subspace Ŵ = {A ∈ C^{m×n} : Av1 = v2} and an initial value X0 ∈ Ŵ for which the sequence converges to the desired solution. A similar idea has been used in [45] in order to improve the convergence speed of certain functional iterations for solving nonlinear matrix equations related to special Markov chains. A nice property of Newton's method is that it is structure-preserving with respect to the affine subspace Ŵ. To prove this fact, consider the following preliminary result concerning the Newton iteration.

Lemma 28. The Newton method

Xk+1 = N(Xk),   N(Xk) = Xk − (dFXk)^{-1} F(Xk),

applied to the matrix equation F(X) = 0, when defined, preserves the affine structure V̂ if and only if F is a function from V̂ to its parallel linear subspace V.

Proof. Consider a matrix X ∈ V̂. The matrix N(X) belongs to V̂ if and only if N(X) − X = (dFX)^{-1}(−F(X)) belongs to V, and that occurs if and only if F(X) (and then −F(X)) belongs to V. ⊓⊔

Now we are ready to prove that the Newton method applied to the Riccati operator is structure-preserving with respect to Ŵ.

Proposition 29. If X0 is such that X0v1 = v2, and the Newton method applied to the Riccati equation R(X) = 0 is well defined, then Xkv1 = v2 for any k ≥ 0. That is, the Newton method preserves the structure Ŵ.

c Proof. In view of Lemma 28, one needs to prove that R is a fun tion from W

to the parallel linear subspa e W . c, then R(X)v1 = 0, in fa t If X ∈ W

R(X)v1 = XCXv1 − AXv1 − XDv1 + Bv1 = XCv2 − Av2 − XDv1 + Bv1 ,

and the last term is 0 sin e Cv2 = Dv1 and Av2 = Bv1 .

⊓ ⊔
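The computation in the proof is easy to check numerically. A minimal pure-Python sketch with illustrative 2×2 blocks chosen so that Cv_2 = Dv_1 and Av_2 = Bv_1 (the fluid-queue case v_1 = v_2 = e); the data and helper names are ours, not from the text:

```python
# Numeric check of the identity in the proof of Proposition 29:
# if X v1 = v2, then R(X) v1 = X C v2 - A v2 - X D v1 + B v1 = 0,
# provided C v2 = D v1 and A v2 = B v1. The blocks below are illustrative.

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def matvec(P, v):
    return [sum(P[i][j] * v[j] for j in range(len(v))) for i in range(len(P))]

def matadd(P, Q, s=1.0):  # returns P + s*Q
    return [[P[i][j] + s * Q[i][j] for j in range(len(P[0]))]
            for i in range(len(P))]

A = [[3.0, -1.0], [-1.0, 3.0]]
D = [[3.0, -1.0], [-1.0, 3.0]]
B = [[1.0, 1.0], [1.0, 1.0]]
C = [[1.0, 1.0], [1.0, 1.0]]
v1 = [1.0, 1.0]   # fluid-queue case: v1 = v2 = e
v2 = [1.0, 1.0]

# a point of the affine subspace W-hat (not the solution): X v1 = v2
X = [[1.0, 0.0], [0.0, 1.0]]

# R(X) = XCX - AX - XD + B
RX = matadd(matadd(matmul(matmul(X, C), X), matmul(A, X), -1.0),
            matadd(matmul(X, D), B, -1.0), -1.0)
print(matvec(RX, v1))  # prints [0.0, 0.0], although R(X) itself is nonzero
```

Here R(X) = [[-4, 4], [4, -4]] ≠ 0, yet R(X)v_1 = 0, as the proposition requires.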

A possible choice for the starting value is (X_0)_{i,j} = (v_2)_i / s, where s = Σ_i v_1(i). It must be observed that the structure-preserving iteration no longer converges monotonically. Since the approximation error has a null component along the subspace W, one should expect a better convergence speed for the sequences obtained with X_0 ∈ Ŵ. A proof of this fact and the convergence analysis of this approach are still work in progress.

If µ = 0, the differential of R is singular at the solution S, as well as at any point X ∈ Ŵ. This makes the sequence X_k undefined. A way to overcome this drawback is to consider the shifted Riccati equation described in Section 4.1.

Riccati equations associated with an M-matrix


The differential of the shifted Riccati equation (26) at a point X is represented by the matrix

    Δ̃_X = Δ_X - I ⊗ (η(Xv_1 - v_2)p_2^T) - (ηv_1(p_1^T + p_2^T X))^T ⊗ I,    (30)

where the vector p ≠ 0, partitioned as p = [p_1; p_2], is an arbitrary nonnegative vector such that p^T v = 1. Choosing p_2 = 0 provides a nice simplification of the problem; in fact

    Δ̃_X = Δ_X - Q^T ⊗ I,

where Q = ηv_1 p_1^T. The next result gives more insight into the action of the Newton iteration on the structure V̂.

Proposition 30. Assume that p_2 = 0. If X ∈ Ŵ then R̃(X) = R(X), where R̃ is defined in (27). Moreover, the sequences generated by Newton's method, when defined, applied to R(X) = 0 and to R̃(X) = 0 with X_0 ∈ Ŵ are the same.

Proof. The fact that R(X) = R̃(X), under the assumption p_2 = 0, follows from

    R̃(X) = R(X) - η(Xv_1 - v_2)p_1^T.

Let N(X) = X - (dR_X)^{-1}R(X) and Ñ(X) = X - (dR̃_X)^{-1}R̃(X) denote the Newton operators for the original equation and for the shifted one, respectively. To prove that the sequences are the same, it must be shown that

    (A - XC)N(X) + N(X)(D̃ - CX) = B̃ - XCX

holds for any X ∈ Ŵ and for any η (for which the equation has a unique solution). One has

    (A - XC)N(X) + N(X)(D̃ - CX) = B - XCX + N(X)ηv_1 p_1^T = B - XCX + ηv_2 p_1^T = B̃ - XCX,

where we have used that N(X)v_1 = v_2 since N(X) ∈ Ŵ. This completes the proof. ⊓⊔

Since any starting value X_0 ∈ V̂ gives the same sequence for the Newton method applied either to the Riccati equation (1) or to the shifted Riccati equation (26), choosing such an initial value has the same effect as applying the shift technique. For applicability one needs the matrix Δ_{X_k} to be nonsingular at each step. Unfortunately, the derivative might be singular for some singular M-matrix and some X ∈ Ŵ⁺ = {X ∈ Ŵ : X > 0}.


D. Bini, B. Iannazzo, B. Meini, F. Poloni

If a breakdown occurs, it is always possible to perform the iteration by using the shifted iteration, with p_2 = 0 and a suitable choice of the parameter η. In fact, the iteration is proved in Proposition 30 to be the same for any choice of p_1 and η.

The convergence is more subtle. Besides the loss of monotonic convergence, one may note that S is not the only solution belonging to Ŵ, even if it is the only one belonging to Ŵ⁺. In fact, in view of Theorem 13, there are at most two positive solutions, and only one of them has the property Sv_1 = v_2. The proof of convergence is still work in progress; we conjecture that for each X_0 ∈ Ŵ⁺ the sequence generated by the Newton method, if defined, converges to S. A possible improvement of the algorithm could be obtained by implementing the exact line search introduced in [7].

5  Numerical experiments and comparisons

We present some numerical experiments to illustrate the behavior of the algorithms presented in Sections 3 and 4.1 in the critical and noncritical cases. To compare the accuracy of the methods we have used the relative error err = ‖X - X̂‖_1 / ‖X‖_1 on the computed solution X̂, when the exact solution X was available. Elsewhere, we have used the relative residual error

    res = ‖X̂CX̂ - X̂D - AX̂ + B‖_1 / ( ‖X̂CX̂‖_1 + ‖X̂D‖_1 + ‖AX̂‖_1 + ‖B‖_1 ).

The tests were performed using MATLAB 6 Release 12 on an AMD Athlon 64 processor. The code for the different algorithms is available for download at the web page http://bezout.dm.unipi.it/mriccati/.

In these tests we consider three methods: the Newton method (N), the SDA, and the Cyclic Reduction (CR) algorithm applied to the UQME (17) (in both SDA and CR we have considered the matrix Ĥ obtained by the Cayley transform of H, and not the one relying on the shrink-and-shift operator). We have also considered the improved versions of these methods applied to the singular/critical case; we denote them by IN, ISDA and ICR, respectively, where "I" stands for "Improved". The initial value for IN is chosen as suggested in Section 4.1; the parameter for the shift is chosen as η = max{max_i (A)_{i,i}, max_i (D)_{i,i}} and the vector p is chosen to be e / Σ_i v_i. The iterations are stopped when the relative residual/error ceases to decrease or becomes smaller than 10ε, where ε is the machine precision.
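The error measures above are straightforward to compute. A minimal pure-Python sketch of the relative residual in the 1-norm (maximum absolute column sum); the function name and test data are illustrative, not from the text:

```python
# res = ||Xh C Xh - Xh D - A Xh + B||_1 /
#       (||Xh C Xh||_1 + ||Xh D||_1 + ||A Xh||_1 + ||B||_1)
# for a candidate solution Xh of the NARE  XCX - AX - XD + B = 0.

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def norm1(P):  # maximum absolute column sum
    return max(sum(abs(P[i][j]) for i in range(len(P)))
               for j in range(len(P[0])))

def relative_residual(A, B, C, D, Xh):
    XCX = matmul(matmul(Xh, C), Xh)
    XD, AX = matmul(Xh, D), matmul(A, Xh)
    R = [[XCX[i][j] - XD[i][j] - AX[i][j] + B[i][j]
          for j in range(len(B[0]))] for i in range(len(B))]
    return norm1(R) / (norm1(XCX) + norm1(XD) + norm1(AX) + norm1(B))
```

At an exact solution the residual vanishes, so res is at the level of the rounding errors committed in forming R.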

Test 31. A null recurrent case [6, Example 1]. Let

    M = [  0.003  -0.001  -0.001  -0.001
          -0.001   0.003  -0.001  -0.001
          -0.001  -0.001   0.003  -0.001
          -0.001  -0.001  -0.001   0.003 ],

where D is a 2 × 2 matrix. The minimal positive solution is

    X = (1/2) [ 1  1
                1  1 ].

As suggested by Theorem 16, the accuracy of the customary algorithms N, SDA and CR is poor in the critical case, and is near to √ε ≈ 10⁻⁸. We report in Table 1 the number of steps and the relative error for the three algorithms. If one exploits the singularity, due to the particular structure of the problem, the solution is achieved in one step by IN, ISDA and ICR with full accuracy.

Table 1. Accuracy of the algorithms in the critical case, Test 31.

    Algorithm   Steps   Relative error
    N           21      6.0 · 10⁻⁷
    SDA         36      8.6 · 10⁻⁷
    CR          31      4.7 · 10⁻⁹
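The claimed minimal solution of Test 31 can be verified directly. A pure-Python sketch; the block partition M = [[D, -C], [-B, A]] is our assumption (it is not restated in this excerpt), chosen so that R(X) = XCX - AX - XD + B vanishes at the reported X:

```python
# Check that X = (1/2) * ones(2,2) solves the NARE built from the matrix M
# of Test 31. The partition M = [[D, -C], [-B, A]] is an assumption here.

M = [[ 0.003, -0.001, -0.001, -0.001],
     [-0.001,  0.003, -0.001, -0.001],
     [-0.001, -0.001,  0.003, -0.001],
     [-0.001, -0.001, -0.001,  0.003]]

D = [row[:2] for row in M[:2]]
C = [[-x for x in row[2:]] for row in M[:2]]
B = [[-x for x in row[:2]] for row in M[2:]]
A = [row[2:] for row in M[2:]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

X = [[0.5, 0.5], [0.5, 0.5]]
XCX = matmul(matmul(X, C), X)
XD, AX = matmul(X, D), matmul(A, X)
R = [[XCX[i][j] - AX[i][j] - XD[i][j] + B[i][j] for j in range(2)]
     for i in range(2)]
# R(X) vanishes up to round-off
assert all(abs(R[i][j]) < 1e-12 for i in range(2) for j in range(2))
```

For this particular M the diagonal and off-diagonal blocks coincide pairwise, so the check is insensitive to the exact ordering convention of the blocks.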

Test 32. Random choice of a singular M-matrix with Me = 0 [20]. To construct M, we generated a 100 × 100 random matrix R and set M = diag(Re) - R. The matrices A, B, C and D are 50 × 50. We generated 5 different matrices M and computed the relative residuals and the number of steps needed for the iterations to converge. All the algorithms (N, IN, SDA, ISDA, CR and ICR) arrive at a relative residual less than 10ε. The numbers of steps needed by the algorithms are reported in Table 2. As one can see, the basic algorithms require the same number of steps, whilst, exploiting the singularity, the Newton method requires one or two steps less than ISDA and ICR; however, the cost per step of these two methods makes their overall cost much lower than that of the Newton method. The use of the singularity dramatically reduces the number of steps needed for the algorithms to converge.

Table 2. Minimum and maximum number of steps needed by the algorithms to converge in Test 32.

    Algorithm   Steps needed
    N           11–12
    IN          3
    SDA         11–12
    ISDA        4–5
    CR          11–13
    ICR         4–5
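The construction of the singular M-matrix used in Test 32 can be sketched as follows (a smaller size is used for illustration):

```python
# Build M = diag(Re) - R from a random nonnegative R. The diagonal of M
# carries the full row sum of R, so every row of M sums to zero (Me = 0),
# and the off-diagonal entries -R[i][j] are nonpositive: M is a singular
# M-matrix (Z-matrix structure with zero row sums).
import random

n = 10  # illustrative size; Test 32 uses n = 100
random.seed(0)
R = [[random.random() for _ in range(n)] for _ in range(n)]
row_sums = [sum(row) for row in R]
M = [[(row_sums[i] if i == j else 0.0) - R[i][j] for j in range(n)]
     for i in range(n)]

# Me = 0 up to round-off
assert all(abs(sum(M[i])) < 1e-12 for i in range(n))
```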



Table 3 summarizes the spectral and computational properties of the solutions of the NARE (1). Table 4 reports the computational cost of the algorithms for solving (1) with m = n, together with the convergence properties in the noncritical case.

Table 3. Summary of the properties of the NARE.

    M                  splitting   eigenvalues              solutions > 0   Δ_S           accuracy
    nonsingular        complete    λ_{n+1} < 0 < λ_n        2               nonsingular   ε
    singular, µ > 0    complete    λ_{n+1} < 0 = λ_n        2               nonsingular   ε
    singular, µ = 0                λ_{n+1} = 0 = λ_n        1               singular      √ε
    singular, µ < 0                λ_{n+1} = 0 < λ_n        2               nonsingular   ε

Table 4. Comparison of the algorithms.

    Algorithm                   Computational cost    Reference
    Schur method                200n³                 [23, 40]
    Functional iteration        8n³–14n³ (per step)   [20, 26]
    Newton's method             66n³ (per step)       [26, 24]
    CR applied to (17)          (74/3)n³ (per step)   [10, 13]
    CR applied to (18) (SDA)    (64/3)n³ (per step)   [16, 25, 13]
    CR applied to (19), (20)    (38/3)n³ (per step)   [33, 13]

References

1. S. Ahn and V. Ramaswami. Transient analysis of fluid flow models via stochastic coupling to a queue. Stoch. Models, 20(1):71–101, 2004.
2. B. D. O. Anderson. Second-order convergent algorithms for the steady-state Riccati equation. Internat. J. Control, 28(2):295–306, 1978.
3. S. Asmussen. Stationary distributions for fluid flow models with or without Brownian noise. Comm. Statist. Stochastic Models, 11(1):21–49, 1995.
4. Z. Bai and J. W. Demmel. On swapping diagonal blocks in real Schur form. Linear Algebra Appl., 186:73–95, 1993.
5. R. H. Bartels and G. W. Stewart. Solution of the matrix equation AX + XB = C. Commun. ACM, 15(9):820–826, 1972.
6. N. G. Bean, M. M. O'Reilly, and P. G. Taylor. Algorithms for return probabilities for stochastic fluid flows. Stochastic Models, 21(1):149–184, 2005.
7. P. Benner and R. Byers. An exact line search method for solving generalized continuous-time algebraic Riccati equations. IEEE Trans. Automat. Control, 43(1):101–107, 1998.
8. A. Berman and R. J. Plemmons. Nonnegative matrices in the mathematical sciences, volume 9 of Classics in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1994. Revised reprint of the 1979 original.
9. D. Bini and B. Meini. On the solution of a nonlinear matrix equation arising in queueing problems. SIAM J. Matrix Anal. Appl., 17(4):906–926, 1996.
10. D. A. Bini, B. Iannazzo, G. Latouche, and B. Meini. On the solution of algebraic Riccati equations arising in fluid queues. Linear Algebra Appl., 413(2-3):474–494, 2006.
11. D. A. Bini, B. Iannazzo, and F. Poloni. A fast Newton's method for a nonsymmetric algebraic Riccati equation. SIAM J. Matrix Anal. Appl., 30(1):276–290, 2008.
12. D. A. Bini, G. Latouche, and B. Meini. Numerical methods for structured Markov chains. Numerical Mathematics and Scientific Computation. Oxford University Press, New York, 2005. Oxford Science Publications.
13. D. A. Bini, B. Meini, and F. Poloni. From algebraic Riccati equations to unilateral quadratic matrix equations: old and new algorithms. Technical Report 1665, Dipartimento di Matematica, Università di Pisa, Italy, July 2007.
14. A. Brauer. Limits for the characteristic roots of a matrix. IV. Applications to stochastic matrices. Duke Math. J., 19:75–91, 1952.
15. C.-Y. Chiang and W.-W. Lin. A structured doubling algorithm for nonsymmetric algebraic Riccati equations (a singular case). Technical report, National Center for Theoretical Sciences, National Tsing Hua University, Taiwan R.O.C., July 2006.
16. E. K.-W. Chu, H.-Y. Fan, and W.-W. Lin. A structure-preserving doubling algorithm for continuous-time algebraic Riccati equations. Linear Algebra Appl., 396:55–80, 2005.
17. A. da Silva Soares and G. Latouche. Further results on the similarity between fluid queues and QBDs. In Matrix-analytic methods (Adelaide, 2002), pages 89–106. World Sci. Publ., River Edge, NJ, 2002.
18. S. Fital and C.-H. Guo. Convergence of the solution of a nonsymmetric matrix Riccati differential equation to its stable equilibrium solution. J. Math. Anal. Appl., 318(2):648–657, 2006.
19. G. H. Golub and C. F. Van Loan. Matrix computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, third edition, 1996.
20. C.-H. Guo. Nonsymmetric algebraic Riccati equations and Wiener–Hopf factorization for M-matrices. SIAM J. Matrix Anal. Appl., 23(1):225–242, 2001.
21. C.-H. Guo. A note on the minimal nonnegative solution of a nonsymmetric algebraic Riccati equation. Linear Algebra Appl., 357:299–302, 2002.
22. C.-H. Guo. Comments on a shifted cyclic reduction algorithm for quasi-birth-death problems. SIAM J. Matrix Anal. Appl., 24(4):1161–1166, 2003.
23. C.-H. Guo. Efficient methods for solving a nonsymmetric algebraic Riccati equation arising in stochastic fluid models. J. Comput. Appl. Math., 192(2):353–373, 2006.
24. C.-H. Guo and N. J. Higham. Iterative solution of a nonsymmetric algebraic Riccati equation. SIAM J. Matrix Anal. Appl., 29(2):396–412, 2007.
25. C.-H. Guo, B. Iannazzo, and B. Meini. On the doubling algorithm for a (shifted) nonsymmetric algebraic Riccati equation. SIAM J. Matrix Anal. Appl., 29(4):1083–1100, 2007.
26. C.-H. Guo and A. J. Laub. On the iterative solution of a class of nonsymmetric algebraic Riccati equations. SIAM J. Matrix Anal. Appl., 22(2):376–391, 2000.
27. C. He, B. Meini, and N. H. Rhee. A shifted cyclic reduction algorithm for quasi-birth-death problems. SIAM J. Matrix Anal. Appl., 23(3):673–691, 2001/02.
28. N. J. Higham. The Matrix Function Toolbox. http://www.ma.man.ac.uk/~higham/mftoolbox.
29. N. J. Higham. Functions of Matrices: Theory and Computation. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008.
30. L. Hogben, editor. Handbook of linear algebra. Discrete Mathematics and its Applications (Boca Raton). Chapman & Hall/CRC, Boca Raton, FL, 2007. Associate editors: Richard Brualdi, Anne Greenbaum and Roy Mathias.
31. R. A. Horn and S. Serra Capizzano. Canonical and standard forms for certain rank one perturbations and an application to the (complex) Google pageranking problem. To appear in Internet Mathematics, 2007.
32. T.-M. Hwang, E. K.-W. Chu, and W.-W. Lin. A generalized structure-preserving doubling algorithm for generalized discrete-time algebraic Riccati equations. Internat. J. Control, 78(14):1063–1075, 2005.
33. B. Iannazzo and D. Bini. A cyclic reduction method for solving algebraic Riccati equations. Technical report, Dipartimento di Matematica, Università di Pisa, Italy, 2005.
34. J. Juang. Global existence and stability of solutions of matrix Riccati equations. J. Math. Anal. Appl., 258(1):1–12, 2001.
35. J. Juang and W.-W. Lin. Nonsymmetric algebraic Riccati equations and Hamiltonian-like matrices. SIAM J. Matrix Anal. Appl., 20(1):228–243, 1999.
36. L. V. Kantorovich. Functional analysis and applied mathematics. NBS Rep. 1509. U.S. Department of Commerce, National Bureau of Standards, Los Angeles, Calif., 1952. Translated by C. D. Benster.
37. D. Kleinman. On an iterative technique for Riccati equation computations. IEEE Trans. Automat. Control, 13(1):114–115, 1968.
38. P. Lancaster and L. Rodman. Algebraic Riccati equations. Oxford Science Publications. The Clarendon Press, Oxford University Press, New York, 1995.
39. G. Latouche and V. Ramaswami. A logarithmic reduction algorithm for quasi-birth-death processes. J. Appl. Probab., 30(3):650–674, 1993.
40. A. J. Laub. A Schur method for solving algebraic Riccati equations. IEEE Trans. Automat. Control, 24(6):913–921, 1979.
41. W.-W. Lin and S.-F. Xu. Convergence analysis of structure-preserving doubling algorithms for Riccati-type matrix equations. SIAM J. Matrix Anal. Appl., 28(1):26–39, 2006.
42. L.-Z. Lu. Newton iterations for a non-symmetric algebraic Riccati equation. Numer. Linear Algebra Appl., 12(2-3):191–200, 2005.
43. L.-Z. Lu. Solution form and simple iteration of a nonsymmetric algebraic Riccati equation arising in transport theory. SIAM J. Matrix Anal. Appl., 26(3):679–685, 2005.
44. V. L. Mehrmann. The autonomous linear quadratic control problem, volume 163 of Lecture Notes in Control and Information Sciences. Springer-Verlag, Berlin, 1991. Theory and numerical solution.
45. B. Meini. New convergence results on functional iteration techniques for the numerical solution of M/G/1 type Markov chains. Numer. Math., 78(1):39–58, 1997.
46. V. Ramaswami. Matrix analytic methods for stochastic fluid flows. In D. Smith and P. Hey, editors, Teletraffic Engineering in a Competitive World, Proceedings of the 16th International Teletraffic Congress, pages 1019–1030. Elsevier Science B.V., Edinburgh, UK, 1999.
47. L. C. G. Rogers. Fluid models in queueing theory and Wiener–Hopf factorization of Markov chains. Ann. Appl. Probab., 4(2):390–413, 1994.
48. D. Williams. A "potential-theoretic" note on the quadratic Wiener–Hopf equation for Q-matrices. In Seminar on Probability, XVI, volume 920 of Lecture Notes in Math., pages 91–94. Springer, Berlin, 1982.

A generalized conjugate direction method for nonsymmetric large ill-conditioned linear systems

Edouard R. Boudinov¹ and Arkadiy I. Manevich²

¹ FORTIS Bank, Brussels, Belgium, edouard.boudinov@mail.ru
² Department of Computational Mechanics and Strength of Structures, Dniepropetrovsk National University, Dniepropetrovsk, Ukraine, armanevich@yandex.ru

Abstract. A new version of the generalized conjugate direction (GCD) method for nonsymmetric linear algebraic systems is proposed which is oriented toward large and ill-conditioned sets of equations. In distinction from the known Krylov subspace methods for unsymmetric matrices, the method uses explicitly computed A-conjugate (in the generalized sense) vectors, along with an orthogonal set of residuals obtained in the Arnoldi orthogonalization process. Employing entire sequences of orthonormal basis vectors in the Krylov subspaces, similarly to GMRES and FOM, ensures high stability of the method. But instead of solving a linear set of equations with a Hessenberg matrix in each iteration to determine the step, we use A-conjugate vectors and some simple recurrence formulas. The performance of the proposed algorithm is illustrated by the results of extensive numerical experiments with large-scale ill-conditioned linear systems and by comparison with known efficient algorithms.

Keywords: linear algebraic equations, large-scale problems, iterative methods for linear systems, Krylov subspace methods, conjugate direction methods, orthogonalization.

1  Introduction

The method proposed in this paper is based on the notion of A-conjugacy in the generalized sense, or "one-sided conjugacy" (in the Russian literature the term "A-pseudo-orthogonality" is also used). We recall the primary definition: vectors d_k are called conjugate direction vectors of a real nonsingular matrix A (in the generalized sense) if the following conditions are satisfied:

    (d_i, A d_k) = 0  for i < k;    (d_i, A d_k) ≠ 0  for i = k    (1)

(in the general case (d_i, A d_k) ≠ 0 for i > k). The notion of A-conjugacy in the generalized sense was introduced and studied already in the 1970s by G. W. Stewart [4], V. V. Voevodin and E. E. Tyrtyshnikov [7], [11], [12], and others.


A few generalized CD-algorithms for nonsymmetric systems, based on one-sided conjugacy, were elaborated already in the 1980s and later (L. A. Hageman, D. M. Young [10] and others; see also [19], [20]). These algorithms belong to different classes of Krylov subspace methods: minimum residual methods, orthogonal residual methods, orthogonal error methods. The convergence of these algorithms has been well studied and, in particular, the finite termination property has been proved. Of course, these results hold in exact arithmetic. However, in practice the generalized CD-algorithms turned out to be, on the whole, less efficient than methods based on an orthogonalization procedure, such as the Full Orthogonalization Method (FOM) [15] and the Generalized Minimal Residual method (GMRES) [16], elaborated in the same years. It is well known that the convergence of CD-algorithms in finite precision arithmetic differs essentially from its theoretical estimates in exact arithmetic.

In this paper we propose a new generalized conjugate direction algorithm for solving nonsymmetric linear systems (fitting into the class of orthogonal residual methods) which is competitive with the most efficient known methods in the case of large-dimension or ill-conditioned systems. Similarly to GMRES and FOM, the algorithm employs entire sequences of orthonormal basis vectors in the Krylov subspaces obtained in the Arnoldi orthogonalization process [1]. This process is also considered as a way of computing the residuals, instead of their usual updating.

For simplicity we describe the algorithm in two forms, sequentially introducing the new elements. First a "basic algorithm" is presented which determines the iterates by employing the one-sided conjugation and some recurrence formulas (but the residuals are updated by the usual formula). Then the final algorithm is described, which uses the orthogonalization process for deriving the residuals. The performance of the proposed algorithm is demonstrated by applying it to a set of standard linear problems. The results are compared to those obtained by the classical conjugate gradient method, GMRES and some other efficient methods.

2  Basic algorithm

We solve the problem

    Ax = b,    x, b ∈ ℝ^N,    (2)

where A is an N × N nonsingular real matrix (in the general case a nonsymmetric one). Given an initial guess x_1, we compute the initial residual r_1 = b - Ax_1 and the initial conjugate vector d_1 as the normalized residual r_1: d_1 = r_1^0 = r_1 / ‖r_1‖. The condition (d_1, A d_1) ≠ 0 is assumed to be satisfied. The "basic algorithm" is as follows:

    x_{k+1} = x_k + α_k d_k,    α_k = (r_k, d_k) / (d_k, A d_k),    (3)


    r_{k+1} = r_k - α_k A d_k,    (4)

    d_{k+1} = r_{k+1} + Σ_{i=1}^{k} β_i^{(k+1)} d_i    (5)

((x, y) denotes the scalar product of x and y). The coefficients α_k in (3) ensure that r_{k+1} is orthogonal to d_k. The coefficients β_i^{(k+1)} (i = 1, ..., k) are computed from the one-sided conjugacy conditions (1), which lead to a triangular set of equations with respect to the β_i^{(k+1)}. This process can be slightly simplified by using the following apparent identity, which follows from formula (4):

    A d_i = (r_i - r_{i+1}) / α_i    (i = 1, ..., k - 1).    (6)

Then the following two-term recurrence formulas for the coefficients β_i^{(k)} can easily be derived:

    β_i^{(k)} = α_i [ β_{i-1}^{(k)} / α_{i-1} - (r_i, A r_k) / ‖r_i‖² ],    β_1^{(k)} = - (d_1, A r_k) / (d_1, A d_1).    (7)

The termination criterion is taken in the form ‖r_k‖ ≤ ε or ‖r_k‖ ≤ ε‖r_1‖. The algorithm constructs the orthogonal set of vectors r_i, i ≤ k, and the A-conjugate (in the generalized sense) set of vectors d_i, i ≤ k. Note that this method belongs to the "long recurrence" algorithms with respect to the conjugate vectors, because every new conjugate vector is computed from the conditions of A-conjugacy with respect to all preceding ones. But it is a "short recurrence" algorithm with respect to the orthogonal set of residuals. In the case of a symmetric matrix A the algorithm reduces to the classical CG method: the vector set d_i, i ≤ k, becomes A-conjugate in the usual sense and all β_i^{(k)}, i < k, vanish.
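A minimal pure-Python sketch of the basic algorithm on a tiny dense system. For clarity the β coefficients are obtained directly from the triangular one-sided conjugacy conditions (1) by forward substitution rather than from the two-term recurrence (7); in exact arithmetic both give the same directions. All names and the test matrix are illustrative:

```python
# Basic generalized conjugate direction algorithm, formulas (3)-(5).

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def axpy(a, x, y):  # returns a*x + y
    return [a * xi + yi for xi, yi in zip(x, y)]

def gcd_basic(A, b, x0, tol=1e-12):
    x = x0[:]
    r = axpy(-1.0, matvec(A, x), b)              # r1 = b - A x1
    d = [ri / dot(r, r) ** 0.5 for ri in r]      # d1 = r1 / ||r1||
    ds, Ads = [], []
    for _ in range(len(b)):
        Ad = matvec(A, d)
        alpha = dot(r, d) / dot(d, Ad)           # (3)
        x = axpy(alpha, d, x)
        r = axpy(-alpha, Ad, r)                  # (4)
        if dot(r, r) ** 0.5 <= tol:
            break
        ds.append(d)
        Ads.append(Ad)
        # beta_i from (d_i, A d_new) = 0, i <= k: a triangular system,
        # solved by forward substitution since (d_i, A d_j) = 0 for i < j
        Ar = matvec(A, r)
        betas = []
        for i in range(len(ds)):
            s = dot(ds[i], Ar) + sum(betas[j] * dot(ds[i], Ads[j])
                                     for j in range(i))
            betas.append(-s / dot(ds[i], Ads[i]))
        d = r[:]
        for bi, di in zip(betas, ds):
            d = axpy(bi, di, d)                  # (5)
    return x, r

A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [2.0, 0.0, 5.0]]   # nonsymmetric
b = [1.0, 2.0, 3.0]
x, r = gcd_basic(A, b, [0.0, 0.0, 0.0])
```

For this well-conditioned 3×3 system the finite termination property holds: the residual vanishes (up to round-off) after at most N = 3 iterations.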

3  Final algorithm

The basic algorithm is close to a few known algorithms, such as ORTHORES (L. A. Hageman, D. M. Young [10]) and some others. In exact arithmetic it reaches the solution in at most N iterations for almost every initial vector x_1 ([4], [13]). But in practice the efficiency of this algorithm is found to be insufficient for large and/or ill-conditioned systems. The main reason for this shortcoming, in our opinion, is connected with the updating formula for the residuals (4). The updating formula (with α_k from (3)) ensures orthogonality of the current residual r_{k+1} to the last conjugate direction d_k with high accuracy, but the orthogonality to all preceding residuals r_i, i ≤ k, is maintained only in exact arithmetic. Round-off errors are not corrected at the next step; they are only accumulated from step


to step. This accumulation gradually violates the orthogonality of the vectors {r_i} and destroys the A-conjugacy of the vectors {d_i}. We would like to underline that the basic property of the residuals {r_i}, required for the efficiency of the algorithm, is their mutual orthogonality. Accumulation of errors and derangement of the residual orthogonality is a principal inherent drawback of the basic algorithm (as well as of every short recurrence CD-algorithm).

At first sight, the remedy is the direct computation of the residuals by the formula r_k = b - Ax_k. But this way is wrong. The round-off errors in the computation of the step lengths (the point x_{k+1}) are again accumulated, so the residuals are computed "exactly", but at "inexact" points! The orthogonality of the residuals is again gradually distorted. Besides, an additional matrix-vector multiplication per iteration is required.

We propose another way, which is realized in the final algorithm. Instead of the usual updating of the residuals, we compute r_k directly from the conditions of orthogonality with respect to all preceding r_i^0, i < k (using the modified Gram–Schmidt orthogonalization). Indeed, it is known a priori that the new residual should be orthogonal to all r_i^0, i < k, so we need only a proper scaling in order that the resulting vector coincide with the residual (in exact arithmetic). Such a scaling is given by the following formula:

    r_{k+1} = -α_k ( A r_k^0 - Σ_{i=1}^{k} γ_{k,i} r_i^0 ),    γ_{k,i} = (A r_k^0, r_i^0),    r_k^0 = r_k / ‖r_k‖.    (8)

It can easily be shown that in exact arithmetic formulas (8) and (4) for the residuals are identical (both determine a vector in the Krylov subspace K_{k+1} orthogonal to all {r_i}, i ≤ k, and both have equal projections onto the vector r_k^0). The other formulas of the algorithm remain principally the same, but some changes appear because we introduce the normalized vectors r_i^0 instead of r_i. The vector d_k (5) is now defined as follows:

    d_{k+1} = r_{k+1}^0 + Σ_{i=1}^{k} β_i^{(k+1)} d_i,    (9)

    β_i^{(k)} = α_i [ β_{i-1}^{(k)} / α_{i-1} - (r_i^0, A r_k^0) / ‖r_i‖ ],    β_1^{(k)} = - (r_1^0, A r_k^0) / (r_1^0, A r_1^0).    (10)

The formula for the iterate x_{k+1} (3) remains the same, but the formula for the step length α_k changes due to the new scaling of the vectors d_i:

    x_{k+1} = x_k + α_k d_k,    α_k = ‖r_k‖ / (d_k, A d_k).    (11)

This formula for α_k follows from (4) and the identity

    (r_k, d_k) = ( r_k, r_k^0 + Σ_{i=1}^{k-1} β_i^{(k)} d_i ) = ‖r_k‖    (12)


(since all vectors d_i, i ≤ k - 2, are linear combinations of the vectors r_j, j ≤ i). It is evident that the orthogonal vector set {r_i^0} is less susceptible to degeneracy than the A-conjugate vector set {d_i}. Hence all computations based on the vectors {r_i^0} have higher accuracy than those based on {d_i}. Therefore it is worthwhile to replace, whenever possible, operations based on {d_i} by ones based on {r_i^0}. One has

    (d_k, A d_k) = ( r_k^0 + Σ_{i=1}^{k-1} β_i^{(k)} d_i, A d_k ) = (r_k^0, A d_k)
                 = ( r_k^0, A r_k^0 + Σ_{i=1}^{k-1} β_i^{(k)} (r_i - r_{i+1}) / α_i )
                 = (r_k^0, A r_k^0) - (β_{k-1}^{(k)} / α_{k-1}) ‖r_k‖    (13)

(here we use formulas (1), (6)). Thus the coefficients α_k and β_i^{(k)} are computed via the vectors {r_i^0}, and the A-conjugate vectors {d_i} are used only for the computation of the current vector d_k by Eq. (9).

With modification (8) the algorithm becomes a "long recurrence" one also with respect to the residuals. This property is usually considered a shortcoming, since it is connected with increased storage requirements and complication of the computations. But the long recurrence property makes the algorithm more stable and less sensitive to round-off errors. This was noted already in the 1980s ([14]). So in the case of ill-conditioned or large problems a long recurrence becomes a merit of an algorithm rather than a drawback. The final algorithm performs only one matrix-vector multiplication per iteration. We omit here all additional details and options of the algorithm.

It can easily be seen that the final algorithm constructs the same bases in the Krylov subspaces as do GMRES and FOM (they use the similar Gram–Schmidt orthogonalization with the same initial vectors). But as for determining the steps in these subspaces, the computational scheme of our algorithm and that of GMRES (and FOM) are quite different. GMRES finds the step by solving a linear set of equations with an upper Hessenberg matrix; this process involves Givens rotations for reducing the Hessenberg matrix to triangular form and/or other computational elements. In our algorithm this subproblem is solved by employing conjugate directions. It is important that no extra matrix-vector product is required per iteration.
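The orthogonalization primitive behind formula (8) is the modified Gram–Schmidt process, in which each coefficient is computed from the partially orthogonalized vector. A minimal sketch (the text writes the coefficients γ_{k,i} in classical form; the modified variant recomputes them on the fly):

```python
# One modified Gram-Schmidt step: orthogonalize w against the orthonormal
# vectors q_1, ..., q_k one at a time. Each coefficient is computed from the
# current, partially orthogonalized w, which is what makes MGS more stable
# than classical Gram-Schmidt in finite precision.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def mgs_orthogonalize(w, qs):
    w = w[:]
    for q in qs:
        g = dot(w, q)                       # coefficient from the current w
        w = [wi - g * qi for wi, qi in zip(w, q)]
    return w

q1 = [1.0, 0.0, 0.0]
q2 = [0.0, 1.0, 0.0]
w = mgs_orthogonalize([1.0, 2.0, 3.0], [q1, q2])
# w is now orthogonal to q1 and q2: w == [0.0, 0.0, 3.0]
```

In the final algorithm the input vector is A r_k^0, the set {q_i} is the stored normalized residuals {r_i^0}, and the result is rescaled by -α_k as in (8).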

4  Numerical experiments

The algorithm has been implemented in the Java programming language and has been tested on a variety of linear algebraic problems (in most cases ill-conditioned).


For comparison we have chosen the following methods: the classical CG [2]; the Bi-CG [3], [6]; the Conjugate Gradient Squared (CGS) [17]; the Bi-CGSTAB [18]; and GMRES [16]. We used the MATLAB implementations of the Bi-CG, CGS and Bi-CGSTAB methods ([21]), but for CG and GMRES we employed our own implementations. In order to reduce the execution time, the matrices were first precalculated and then used in the methods implemented in MATLAB. Our implementations of the CG and GMRES methods were benchmarked against the MATLAB implementations of these methods (the pcg and gmres functions of MATLAB), and it was established that the numbers of iterations were identical in both implementations, but the running time was less in our implementation. The termination criterion was taken in the form ‖r_k‖ ≤ ε‖r_1‖ (with ε = 10⁻¹³–10⁻¹⁵).³ All computations have been performed on a PC Pentium 3.2 GHz with 2000 MB of RAM in double precision.

³ The algorithm provides in the k-th iteration the orthogonal residual point x^{or}_{k+1}, at which r_{k+1} is orthogonal to all preceding d_i: (r^{or}_{k+1}, d_i) = 0, i = 1, ..., k. Having obtained the conjugate vector basis in the Krylov subspace K_k, at the cost of a few additional computations we obtained the minimal residual point x^{mr}_{k+1}, which is defined by the conditions (r^{mr}_{k+1}, A d_i) = 0, i = 1, ..., k. For correct comparison with GMRES we used this point in the termination criterion.

Our main aims were to compare 1) the long recurrence algorithms with the short recurrence ones, and 2) the orthogonalization procedure for specifying the residuals with the usual updating of the residuals.

First we present the results for symmetric systems with the following matrices (here the degree in the denominators is gradually increased, so the matrices become more and more degenerate):

    SYMM1 :  a_ii = 1/i,    a_ij (i ≠ j) = 1/(ij)                                     (14)
    SYMM2 :  a_ii = 1/i²,   a_ij (i < j) = 1/(ij²),      a_ij (i > j) = 1/(i²j)       (15)
    SYMM3 :  a_ii = 1/i³,   a_ij (i ≠ j) = 1/(ij)²                                    (16)
    SYMM4 :  a_ii = 1/i⁴,   a_ij (i < j) = 1/((ij)²j),   a_ij (i > j) = 1/((ij)²i)    (17)
    SYMM5 :  a_ii = 1/i⁵,   a_ij (i ≠ j) = 1/(ij)³                                    (18)

In Table 1 the results of the calculations for the number of variables N = 1000 and ε = ‖r‖/‖r_1‖ = 10⁻¹³ (in the termination criterion) are presented. Notation in this and the following tables: N is the number of variables, ε_x is the accuracy in the arguments, k_iter is the number of iterations, t is the running time. The classical CG and our basic method have successfully solved the relatively simple problems SYMM1–SYMM3; in the problem SYMM3 the accuracy of the CG was very low, and in the other problems (SYMM4, SYMM5) these algorithms


Table 1. Number of iterations for symmetric matrices (14)–(18). N = 1000, ε = 10⁻¹³; "*" means the algorithm failed.

    Problem   CG     GMRES   Basic algorithm   Final algorithm GCD
    SYMM1     152    88      88                88
    SYMM2     1001   177     198               177
    SYMM3     1001   239     269               239
    SYMM4     *      258     *                 258
    SYMM5     *      170     *                 170

have failed. The GMRES and our final algorithm GCD have successfully solved all the problems with identical accuracy and numbers of iterations, and the running time was practically the same. We would like to note that in all the cases the number of iterations (and so the number of stored conjugate vectors) in GMRES and in our algorithm was several times smaller than the number of variables N (for N = 1000 it did not exceed 258).

Table 2 shows the results for the larger number of variables N = 10000. The CG-algorithm and our basic algorithm have solved with reasonable accuracy only the first problem. The GMRES and our final algorithm have solved all the problems, the numbers of iterations were again identical for both methods and much smaller than the dimension of the problem, and the running times of both methods were approximately the same.

Table 2. Number of iterations for symmetric matrices (14)–(18). N = 10000, ε = 10⁻¹³; "*" means the algorithm failed.

    Problem   CG    GMRES   Basic algorithm   Final algorithm GCD
    SYMM1     448   186     208               186
    SYMM2     *     513     554               513
    SYMM3     *     766     *                 766

We see that even when solving linear systems with symmetric matrices, general algorithms designed for nonsymmetric problems turn out to be more stable and efficient compared to special algorithms for symmetric systems; it is clear that the matrices may remain symmetric in the process of computations only in exact arithmetic.

In the next cycle of numerical experiments we consider linear problems with nonsymmetric matrices. They were obtained from the matrices of type

A generalized conjugate direction method

(14)-(18) by introducing an asymmetry factor μ:

    ASYMM1:  a_ii = 1/i,     a_ij (i < j) = (1+μ)/(ij),        a_ij (i > j) = (1−μ)/(ij),        (19)
    ASYMM2:  a_ii = 1/i^2,   a_ij (i < j) = (1+μ)/(i j^2),     a_ij (i > j) = (1−μ)/(i^2 j),     (20)
    ASYMM3:  a_ii = 1/i^3,   a_ij (i < j) = (1+μ)/(ij)^2,      a_ij (i > j) = (1−μ)/(ij)^2,      (21)
    ASYMM4:  a_ii = 1/i^4,   a_ij (i < j) = (1+μ)/((ij)^2 j),  a_ij (i > j) = (1−μ)/((ij)^2 i),  (22)
    ASYMM5:  a_ii = 1/i^5,   a_ij (i < j) = (1+μ)/(ij)^3,      a_ij (i > j) = (1−μ)/(ij)^3.      (23)
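A sketch of how the first of these test matrices, ASYMM1 (19), can be generated (assuming, as the formulas suggest, 1-based indices):

```python
import numpy as np

def asymm1(n, mu):
    """Test matrix (19): a_ii = 1/i, a_ij = (1+mu)/(ij) for i < j,
    a_ij = (1-mu)/(ij) for i > j (indices are 1-based)."""
    i = np.arange(1, n + 1)
    A = 1.0 / np.outer(i, i)             # 1/(ij) everywhere
    A[np.triu_indices(n, 1)] *= 1 + mu   # strictly upper triangle: i < j
    A[np.tril_indices(n, -1)] *= 1 - mu  # strictly lower triangle: i > j
    A[np.diag_indices(n)] = 1.0 / i      # diagonal: 1/i
    return A
```

For μ = 0 the matrix is symmetric; μ ≠ 0 tilts the off-diagonal entries, leaving the symmetric part of the off-diagonal block unchanged.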

The following algorithms have been tested alongside our algorithm (GCD): Bi-CG, CGS, Bi-CGSTAB and GMRES (the first three are short-recurrence algorithms). Results for N = 1000 with μ = 0.5 and ε = 10^-13 are presented in Table 3.

Table 3. Numbers of iterations for unsymmetrical problems with matrices (19)-(23), solved by various algorithms; N = 1000; the asymmetry coefficient μ = 0.5; ε = 10^-13; "*" — the algorithm has failed.

    Matrix    GMRES   GCD   Bi-CG   CGS    Bi-CGSTAB
    ASYMM1    95      95    184     131    92
    ASYMM2    183     183   1430    2473   918
    ASYMM3    244     244   *       *      *
    ASYMM4    264     264   *       *      *
    ASYMM5    176     176   *       *      *

The CGS, Bi-CG and Bi-CGSTAB algorithms solved only problems ASYMM1 and ASYMM2. GMRES and our algorithm solved all the problems with approximately the same accuracy and numbers of iterations. The data presented in the above tables enable us to draw the following conclusions:

– in solving ill-conditioned problems the short-recurrence algorithms compare unfavorably with the long-recurrence ones; only long-recurrence algorithms are efficient in ill-conditioned problems of moderate and large dimensions;
– algorithms based on the usual updating of residuals (CG, our basic algorithm) are at a disadvantage relative to algorithms based on an orthogonalization procedure (GMRES, our final algorithm);
– the convergence of our final algorithm GCD is identical to that of GMRES.

Therefore in the subsequent computations we dealt only with GMRES and our algorithm GCD. Table 4 shows the results obtained by these algorithms for the same problems with a larger number of variables, N = 10000. Along with


the numbers of iterations, here we also present the accuracy in the arguments and the execution time. Again, both algorithms solved all the problems with approximately the same numbers of iterations, accuracy and execution times.

Table 4. Results for unsymmetrical problems with matrices (19)-(23); N = 10000; the asymmetry coefficient μ = 0.5; ε = 10^-13.

              GMRES                        Proposed method GCD
    Matrix    εx        kiter   t (sec)    εx        kiter   t (sec)
    ASYMM1    < 10^-9   200     224        < 10^-9   200     224
    ASYMM2    < 10^-5   532     621        < 10^-6   532     619
    ASYMM3    < 0.009   788     957        < 0.005   788     956

In order to examine the algorithms on very large-scale problems, we considered unsymmetrical problems produced from the matrices (19) with nonzero elements only on five diagonals, i.e., a_ij = 0 for j > i + 2 and j < i − 2. Table 5 presents the results obtained by GMRES and our algorithm. Both methods were very efficient in solving the problems up to N = 150000 (on the given PC). We see that the accuracy of the solutions did not decrease as the dimension of the problem increased. The numbers of iterations in the two methods were again identical, and the running time was practically the same.

Table 5. Results for unsymmetrical problems with matrices (19) having only 5 diagonals with nonzero elements; the asymmetry factor μ = 0.5; ε = 10^-10.

              GMRES                        Proposed method GCD
    N         εx        kiter   t (sec)    εx        kiter   t (sec)
    1000      < 10^-7   76      0.06       < 10^-7   76      0.06
    10000     < 10^-6   159     2.02       < 10^-6   159     1.95
    50000     < 10^-6   264     25.4       < 10^-6   264     23.0
    100000    < 10^-6   329     83.4       < 10^-6   329     73.9
    150000    < 10^-6   374     208        < 10^-6   374     210

Table 6 demonstrates the influence of the asymmetry factor on the efficiency of GMRES and the proposed algorithm, for two problems with N = 1000: the matrix (19) and the Hilbert matrix modified with the asymmetry factor:

    a_ij = (1+μ)/(i+j−1)  (i < j),    a_ij = (1−μ)/(i+j−1)  (i > j).    (24)

The matrix asymmetry of the first matrix practically did not affect the performance of the algorithms. In the second problem even a very small matrix asymmetry had an impact on the convergence rate of both algorithms: they required N iterations for solving the problems. The above conclusion about the comparative efficiency of the two methods holds for any value of μ.

Table 6. Results for unsymmetrical problems with various asymmetry coefficients μ (μ = 0, 0.1, 0.5, 1.0, 2.0); N = 1000; ε = 10^-13.

5 that are (φ, ψ)-circulants for appropriately chosen values of φ and ψ.

1. The issue that we treat in this short paper is motivated by our study of the normal Hankel problem, i.e., the problem of describing normal Hankel matrices. This problem is still open despite a number of available partial results. A detailed account of its present state is given in Section 1 of our paper [1]. We need a shorter version of this account to formulate and then prove our result.

Let

    H = H1 + iH2    (1)

be an arbitrary Hankel matrix, H1 and H2 being its real and imaginary parts, respectively. Denote by Pn the backward identity matrix:

    Pn = antidiag(1, 1, . . . , 1).

Then

    T = H Pn = T1 + iT2    (2)

On normal Hankel (φ, ψ)-circulants

is a Toeplitz matrix, T1 and T2 being again the real and imaginary parts of T. One can show that, for H to be a normal matrix, it is necessary and sufficient that the associated Toeplitz matrix (2) satisfies the relation

    Im (T T∗) = 0.    (3)

Let a1, a2, . . ., a_{n−1} and a_{−1}, a_{−2}, . . ., a_{−n+1} be the off-diagonal entries in the first row and the first column of T1. Denote by b1, b2, . . ., b_{n−1} and b_{−1}, b_{−2}, . . ., b_{−n+1} the corresponding entries in T2. Using these entries, we can form the matrices

    F = | a_{n−1}  b_{n−1} |
        | a_{n−2}  b_{n−2} |
        |   ...      ...   |
        |  a_1      b_1    |

and

    G = | a_{−1}    b_{−1}   |
        | a_{−2}    b_{−2}   |
        |   ...       ...    |
        | a_{−n+1}  b_{−n+1} |.

It turns out that all the classes of normal Hankel matrices previously described in the literature correspond to the cases where, for at least one of the matrices F and G, the rank is less than two. Therefore, we hereafter assume that rank F = rank G = 2. In this case, the basic equality (3) implies (see details in our paper [2]) that

    G = F W,    (4)

where

    W = | α  β |
        | γ  δ |

is a real 2 × 2 matrix with determinant

    αδ − βγ = 1.    (5)

The matrix equality (4) is equivalent to the scalar relations

    a_{−i} = α a_{n−i} + γ b_{n−i},    b_{−i} = β a_{n−i} + δ b_{n−i},    1 ≤ i ≤ n − 1.    (6)

Writing the Toeplitz matrix (2) in the form

    T = | t_0      t_1      t_2      ...  t_{n−1} |
        | t_{−1}   t_0      t_1      ...  t_{n−2} |
        | t_{−2}   t_{−1}   t_0      ...  t_{n−3} |    (7)
        |  ...      ...      ...     ...   ...    |
        | t_{−n+1} t_{−n+2} t_{−n+3} ...  t_0     |,

V. N. Chugunov and Kh. D. Ikramov

we can replace the real relations (6) by the complex formulas

    t_{−i} = φ t_{n−i} + ψ t̄_{n−i},    1 ≤ i ≤ n − 1,    (8)

where

    φ = (α + δ)/2 + i (β − γ)/2,    ψ = (α − δ)/2 + i (β + γ)/2.    (9)

The complex form of relation (5) is as follows:

    |φ|^2 − |ψ|^2 = 1.    (10)
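As a quick sanity check of (9) and (10), one can verify numerically (with an arbitrarily chosen quadruple normalized so that (5) holds) that |φ|^2 − |ψ|^2 reduces to αδ − βγ:

```python
import numpy as np

# pick a quadruple with alpha*delta - beta*gamma = 1, as in (5)
alpha, beta, gamma = 2.0, 3.0, 0.5
delta = (1.0 + beta * gamma) / alpha                   # enforce (5)

phi = (alpha + delta) / 2 + 1j * (beta - gamma) / 2    # (9)
psi = (alpha - delta) / 2 + 1j * (beta + gamma) / 2

# |phi|^2 - |psi|^2 = alpha*delta - beta*gamma = 1, i.e. relation (10)
assert abs(abs(phi) ** 2 - abs(psi) ** 2 - 1.0) < 1e-12
```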

Let (φ, ψ) be a fixed pair of complex numbers obeying condition (10). A Toeplitz matrix T is called a (φ, ψ)-circulant if its entries satisfy relations (8). The corresponding Hankel matrix H = T Pn will be called a Hankel (φ, ψ)-circulant. The case ψ = 0, |φ| = 1 corresponds to the well-known classes of Toeplitz and Hankel φ-circulants. However, for ψ ≠ 0, it is not at all clear whether there exist nontrivial normal Hankel (φ, ψ)-circulants. Indeed, if equalities (6) are substituted into our basic relation (3), then the result is a system of n − 1 real equations with respect to the 2n real unknowns a0, a1, . . ., a_{n−1} and b0, b1, . . ., b_{n−1}. Since these equations are quadratic, they need not have real solutions. It was shown in [1] that the above system is solvable for n = 3 and n = 4 for every quadruple (α, β, γ, δ) satisfying condition (5). The question of the existence of normal Hankel (φ, ψ)-circulants for larger values of n was left open there. Below, we construct a special class of Toeplitz matrices that generate normal Hankel matrices for any n ≥ 5. These matrices are (φ, ψ)-circulants for appropriate values of φ and ψ, where ψ ≠ 0.

2. We seek T as a Toeplitz matrix with the first row of the form

    ( 0  0  · · ·  0  a  b  a ).

Here, a = x + iy and b = z + iw are complex numbers to be determined. This matrix T must be a (φ, ψ)-circulant for appropriate values of φ and ψ (that is, for appropriate α, β, γ, and δ). The Hankel (φ, ψ)-circulant corresponding to this T is normal if and only if the basic relation (3) is fulfilled. Now, observe that the property of T to be a (φ, ψ)-circulant implies that T T∗ is a Toeplitz matrix (see [1] or [2] for explanations of this fact). Moreover, T T∗ is obviously a Hermitian matrix. It follows that the matrix relation (3) is equivalent to the n − 1 scalar conditions

    Im {T T∗}_{1j} = 0,    j = 2, 3, . . . , n.    (11)

Due to the "tridiagonal" structure of T, we have

    {T T∗}_{1j} = 0,    j = 4, 5, . . . , n − 2.


The remaining conditions in (11) correspond to j = 2, 3, n − 1 and n. They have the same form for any value of n, beginning from n = 5. Thus, to find the desired a and b, it suffices to analyze the case n = 5. Since

    {T T∗}_{12} = b ā + a b̄,    {T T∗}_{13} = |a|^2,

the first two conditions in (11) are automatically fulfilled. It remains to satisfy the two conditions corresponding to j = 4 and j = 5. This yields the following system of two equations in the four real variables x, y, z and w:

    β x^2 + (δ − α) x y − γ y^2 = 0,    (12)

    [2βx + (δ − α)y] z + [(δ − α)x − 2γy] w = 0.    (13)

Furthermore, we must keep in mind the relation

    rank F = rank | 0  0 |
                  | x  y |
                  | z  w |  = 2,
                  | x  y |

which is equivalent to the inequality

    yz − xw ≠ 0

and excludes solutions of system (12), (13) for which x = y = 0. Suppose that (x, y) is a nontrivial solution of equation (12). Substituting x and y into (13), we obtain a linear equation with respect to z and w. However, if at least one of the expressions inside the brackets is nonzero, then this equation is equivalent to the relation

    yz − xw = 0,    (14)

signifying that rank F = 1. Indeed, the determinant of the system composed of equations (13) and (14) is given by the formula

    det | 2βx + (δ−α)y   (δ−α)x − 2γy |
        |       y              −x     |  = −2[βx^2 + (δ−α)xy − γy^2]

and, hence, vanishes in view of (12). On the other hand, if, for the chosen solution (x, y), we have

    2βx + (δ − α)y = 0,    (15)
    (δ − α)x − 2γy = 0,    (16)

then (13) is satisfied by any pair (z, w). Almost all of these pairs satisfy the condition yz − xw ≠ 0.


By assumption, the homogeneous system (15), (16) has a nontrivial solution (x, y), which means that its determinant

    det |  2β      δ − α |
        | δ − α    −2γ   |  = −4βγ − (δ − α)^2

must be zero. Taking (5) into account, we obtain the condition

    |δ + α| = 2.    (17)

Summing up, we have shown that, for every quadruple (α, β, γ, δ) satisfying conditions (5) and (17), there exist complex scalars a = x + iy and b = z + iw specifying the desired Toeplitz matrix T. This matrix is a (φ, ψ)-circulant for φ and ψ determined by the chosen values of α, β, γ, and δ. The corresponding matrix H (see (2)) is a normal Hankel (φ, ψ)-circulant.

V. N. Chugunov acknowledges the support of the Russian Foundation for Basic Research (projects nos. 04-07-90336 and 05-01-00721) and a Priority Research Grant OMN-3 of the Department of Mathematical Sciences of the Russian Academy of Sciences.
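The construction can be checked numerically. The sketch below uses one admissible choice of parameters, α = 2, δ = 0, β = 1, γ = −1 (so that (5) and (17) hold, giving φ = 1 + i, ψ = 1, per (9)) together with a = 1 + i, b = 2 + i; it builds T for n = 5 via relations (8) and verifies relation (3) and the normality of H = T P5:

```python
import numpy as np

n = 5
phi, psi = 1 + 1j, 1 + 0j      # (9) with alpha = 2, delta = 0, beta = 1, gamma = -1
a, b = 1 + 1j, 2 + 1j          # a = x + iy, b = z + iw; here yz - xw = 1 != 0

# t[k + n - 1] stores t_k, k = -(n-1), ..., n-1
t = np.zeros(2 * n - 1, dtype=complex)
t[n - 1:] = [0, 0, a, b, a]                 # first row (0, 0, a, b, a)
for i in range(1, n):                        # relations (8)
    t[n - 1 - i] = phi * t[2 * n - 1 - i] + psi * np.conj(t[2 * n - 1 - i])

T = np.array([[t[n - 1 + j - i] for j in range(n)] for i in range(n)])
P = np.fliplr(np.eye(n))                     # backward identity P_n
H = T @ P                                    # Hankel matrix, cf. (2)

assert np.max(np.abs((T @ T.conj().T).imag)) < 1e-12            # relation (3)
assert np.linalg.norm(H @ H.conj().T - H.conj().T @ H) < 1e-12  # H is normal
```

Note that |φ|^2 − |ψ|^2 = 2 − 1 = 1, so condition (10) holds, and ψ ≠ 0, so H lies outside the classical φ-circulant classes.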

References

1. V. N. Chugunov, Kh. D. Ikramov, On normal Hankel matrices of low orders, Mat. Zametki (accepted for publication).
2. V. N. Chugunov, Kh. D. Ikramov, On normal Hankel matrices, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI) 346 (2007), 63-80.

On the Treatment of Boundary Artifacts in Image Restoration by Reflection and/or Anti-Reflection

Marco Donatelli⋆ and Stefano Serra-Capizzano⋆⋆

Dipartimento di Fisica e Matematica, Università dell'Insubria - Sede di Como, Via Valleggio 11, 22100 Como, Italy
⋆ marco.donatelli@uninsubria.it, ⋆⋆ stefano.serrac@uninsubria.it, serra@mail.dm.unipi.it

Abstract. The abrupt boundary truncation of an image introduces artifacts in the restored image. For large image restoration problems with shift-invariant blurring, it is advisable to use Fast Fourier Transform (FFT) based procedures for reducing the computational effort. In this direction several techniques manipulate the observed image at the boundary or make some assumptions on the boundary of the true image, in such a way that FFT-based algorithms can be used. We compare the use of reflection with that of anti-reflection, in connection with the choice of the boundary conditions or for extending the observed image, both theoretically and numerically. Furthermore, we combine the two proposals. More precisely, we apply anti-reflection, followed by reflection if necessary, to the observed image, and we observe that the resulting restoration quality is increased with respect to the case of plain reflection.

Keywords: image deblurring, boundary conditions, fast transforms and matrix algebras.

1 Introduction

The blurred image is expressed as a function of an original scene that is larger than the field of view (FOV) of the blurred image, since pixels from the original scene outside the captured image window contribute to the pixels near the boundaries of the blurred observed image. Indeed the standard observation model can be expressed as

    g = A fo + u,    (1)

where fo and g, lexicographically ordered, are the true and observed images, and u is the noise. The matrix A represents a convolution of the true image fo with the point spread function (PSF), which we assume to be known and shift invariant. If the observed image is n × n and the PSF m × m, then (1) implies that fo is (n + m − 1) × (n + m − 1) and that A is a Toeplitz matrix of size n^2 × (n + m − 1)^2. This means that the linear system (1) is underdetermined.
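A 1-D analogue of the observation model (1) can be sketched as follows (the sizes n = 6, m = 3 and the PSF h are illustrative assumptions, not taken from the paper); the n × (n + m − 1) Toeplitz matrix A acts on the larger scene as a "valid" convolution:

```python
import numpy as np

n, m = 6, 3
h = np.array([0.25, 0.5, 0.25])       # a hypothetical shift-invariant PSF

# A is n x (n+m-1): row i picks up the m scene pixels that blur into g_i
A = np.zeros((n, n + m - 1))
for i in range(n):
    A[i, i:i + m] = h[::-1]

f_o = np.linspace(0.0, 1.0, n + m - 1)   # a stand-in "true scene"
g = A @ f_o                               # noise-free observation, u = 0

# equals the valid part of the convolution of f_o with h
assert np.allclose(g, np.convolve(f_o, h, mode='valid'))
```

Since A has more columns than rows, A f_o = g has infinitely many solutions, which is the nonuniqueness the regularization below addresses.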


The goal is to recover fo only in the FOV, i.e., the image f equal to the n × n middle part of fo. A well-established solution to both the problems of nonuniqueness and noise amplification is regularization. A classic approach is Tikhonov regularization [10], which involves simultaneously minimizing the data error and a measure of the roughness of the solution. This leads to the linear system

    (A^T A + μI) f = A^T g,    (2)

where μ > 0 is the regularization parameter, which should be opportunely chosen and usually satisfies μ ≪ 1. In general the solution of the linear system (2)

can be computationally expensive, since it is not automatic that an FFT-based algorithm can be applied directly. However, an interesting approach is proposed in [8] when m ≪ n. Indeed, for dealing with the rectangular matrix A while using FFT-based algorithms, it is necessary to resort to iterative methods [2], in which the main task is the application of FFT-based procedures for matrix-vector multiplication. Conversely, for employing FFT-based direct algorithms, the linear system to solve should have a coefficient matrix diagonalizable by a suitable fast trigonometric transform, such as the sine, cosine, ω-Fourier (|ω| = 1), or Hartley transforms (see e.g. [6]). This can be done by modifying system (1) in order to obtain a square coefficient matrix. The first approach amounts to imposing boundary conditions (BCs) on fo and then computing a regularized solution of

    B fb = g,    (3)

where B is n^2 × n^2, with a structure depending on the shift-invariant kernel and on the type of BCs [5]. The second approach is to extend g in some way to obtain ge of size 2n × 2n, and then to regularize

    C fe = ge,    (4)

where C is the (2n)^2 × (2n)^2 circulant matrix obtained by periodically completing A; here the restored image is the n × n part of fe corresponding to g [1]. In this paper, we compare the two approaches in the case of the reflective pad, i.e., the two proposals in [7] and [1]. We will also consider the use of reflection and anti-reflection as possible choices for the boundary conditions. The main results are the following:

– In the case of strongly symmetric (symmetric with respect to each axis independently) PSFs, the considered approaches produce comparable restorations in practical problems.
– Imposing anti-reflective boundary conditions leads to a better restoration quality with respect to the reflective boundary conditions, at least for moderate levels of noise [9, 4, 3]. However, a direct fast method is available only in the strongly symmetric setting.


Fig. 1. (a) Full reflection of the left top quadrant on the right and bottom. (b) Half reflection on each edge of the middle image. (c) Half anti-reflection on each edge of the middle image (scaled image). The edges of the images are emphasized by tiny vertical and horizontal lines.

– To improve the results obtained by image extension as in (4), we use the ideas in [1], but instead of using reflection, we apply anti-reflection or anti-reflection followed by reflection. In this way we obtain an FFT-based algorithm also in the case of a generic PSF (not necessarily symmetric), thus overcoming the limitations in [7, 9] concerning the assumption of a strongly symmetric convolution kernel.

The paper is completed with numerical results that validate the proposals and the related analysis.

2 Reflection for image extension and BCs

In this section, we compare the reflection pad used to extend g with the imposition of reflective BCs. The proposal in [1] to extend g is to form a new image ge of size 2n × 2n as described in Fig. 1 (a). The observed image g is at the top left quadrant, the top right quadrant is the mirror image of g around the y axis, and the bottom half is the mirror image of the top half around the x axis. After that, the solution of the Tikhonov linear system is computed by circular convolution, because the coefficient matrix C in (4) is circulant. In [1] it is shown that, for 1-D images and symmetric PSFs, when the truncated image is locally stationary at its boundaries, this approach leads to smaller expected errors in the restored image with respect to applying the circular convolution directly to g. Indeed the circular convolution assumes a circular signal and, independently of g, ge is always a periodic image; hence it is reasonable to expect that it was obtained from a periodic scene. This clearly reduces the boundary artifacts in the restoration, also in the case of non-symmetric PSFs. We note that a reflection of size n/2 with respect to each edge can also be used, as in Fig. 1 (b), obtaining


the same algorithm. Indeed, the previous observation means only a translation of the period of the image by n/2 in each variable. The use of reflective or Neumann BCs implies that the true image outside the FOV is a reflection of the image inside the FOV. Therefore fo is assumed to be an extension by reflection of f as in Fig. 1 (b). The reflection is done with respect to each edge with a bandwidth depending on the support of the PSF, since each pixel at the boundary needs to be well defined. Imposing reflective BCs, the square linear system has size n^2 and the matrix B in (3) has a Toeplitz plus Hankel structure. More specifically, if the PSF is strongly symmetric, then B can be diagonalized by the discrete cosine transform of type I (DCT-I) (two-dimensional in the case of images).

Now we provide an algebraic formulation of the two approaches for strongly symmetric PSFs in the 1-D case. The latter will allow us to give a qualitative comparison of the solutions computed by the two strategies, applying Tikhonov regularization to (3) and (4), respectively. Since the PSF is symmetric, we have h = [h_{−q}, . . . , h_0, . . . , h_q] with h_{−i} = h_i and q = (m − 1)/2. Let T_k = { φα(x) = Σ_{j=−k}^{k} αj e^{ijx}, α_{−j} = α_j } be the set of even trigonometric polynomials of degree at most k; then the symbol

    φh(x) = Σ_{j=−q}^{q} h_j e^{ijx},    (5)

is such that φh ∈ T_q and q ≤ (n − 1)/2 for m ≤ n. Imposing reflective BCs, thanks to the symmetry of the PSF, in (3) we have B = Rn D Rn^T, where Rn is the DCT-I matrix (Rn is real and orthogonal), D = diag(b) with b = Rn^T(B e1)/Rn^T e1 (the division is component-wise), and e1 is the first vector of the canonical basis. Moreover, since b_i = φh(iπ/n), i = 0, . . . , n − 1, B can be expressed in terms of its symbol φh and will be denoted by B = Rn(φh) (see [7]). Therefore, using the Tikhonov regularization approach (2) for the linear system (3), we obtain

    fr = Rn diag( b / (b^2 + μ) ) Rn g,    (6)

where the operations between vectors are intended component-wise. Setting z = b/(b^2 + μ) and defining pr ∈ T_{n−1} as the polynomial interpolating the pairs (iπ/n, z_i) for i = 0, . . . , n − 1, we find

    fr = Rn(pr) g.    (7)

For the other approach in (4), without loss of generality, let {1, . . . , n} be the FOV and n be even. Hence, by reflecting g = [g1, . . . , gn] on both sides, we have

    ge = [g_{n/2}, . . . , g_2, g_1, g_1, g_2, . . . , g_n, g_n, g_{n−1}, . . . , g_{n/2+1}],    (8)


that, as already observed, leads to the same proposal as in [1]. Defining

    P = | El |
        | I  |
        | Er |  (2n × n),    (9)

where El = [ J | 0 ]_{n/2 × n}, Er = [ 0 | J ]_{n/2 × n}, and J is the n/2 × n/2 flip matrix with entries [J]_{s,t} = 1 if s + t = n/2 + 1 and zero otherwise, we have ge = Pg. Moreover, C = F2n Λ F2n^H, where F2n is the Fourier matrix of order 2n and Λ = diag(c) with c = F2n^H (C e1) / F2n^H e1. Since c_i = φh(2πi/(2n)), i = 0, . . . , 2n − 1, we denote C = C2n(φh). Using the Tikhonov regularization (2) for the linear system (4), the restored signal of size n is

    fc = [ 0 | I | 0 ]_{n × 2n} C2n(pc) P g,    (10)

where I is the identity of order n and, similarly to the reflective BCs case, pc ∈ T_{2n−1} is the polynomial interpolating the pairs (iπ/n, v_i) for i = 0, . . . , 2n − 1 with v = c/(c^2 + μ). We show that pc ∈ T_n and that it is the polynomial interpolating at (iπ/n, v_i) for i = 0, . . . , n, i.e., the points (iπ/n, v_i) for i = n + 1, . . . , 2n − 1 do not add any further information. The interpolation conditions are

    pc(iπ/n) = v_i,    i = 0, . . . , n,    (11)

    pc((n + i)π/n) = v_{n+i},    i = 1, . . . , n − 1.    (12)

From the trigonometric identity cos((n + i)π/n) = cos((n − i)π/n), it follows that c_{n+i} = c_{n−i}, which implies v_{n+i} = v_{n−i} and pc((n + i)π/n) = pc((n − i)π/n) = v_{n−i} for i = 1, . . . , n − 1. Therefore, conditions (12) can be written as pc((n − i)π/n) = v_{n−i} for i = 1, . . . , n − 1, which are a subset of (11). Moreover, c_i = b_i, and then v_i = z_i, for i = 0, . . . , n − 1. Concluding, let Ωn = { iπ/n | i = 0, . . . , n } be the interpolation nodes forming a uniform grid on [0, π] and let ψ = φh/(φh^2 + μ); then

    pc ∈ T_n      interpolating ψ in Ωn,           (13)
    pr ∈ T_{n−1}  interpolating ψ in Ωn \ {π}.     (14)

In order to compare fr with fc, it remains to check whether [ 0 | I | 0 ] C2n(φα) P belongs to the DCT-I algebra. Let φα ∈ T_{n/2}; then the n × 2n matrix T = [ 0 | I | 0 ] C2n(φα) is the banded matrix

    T = | α_{−n/2} ... α_0 ... α_{n/2}                               |
        |     α_{−n/2} ... α_0 ... α_{n/2}                           |    (15)
        |                 ...          ...                           |
        |               α_{−n/2}  ...  α_0  ...  α_{n/2}             |


and T P = Rn(φα). We note that [ 0 | I | 0 ] C2n(φα) P = Rn(φα) holds only if φα ∈ T_{n/2}. Therefore this identity cannot be used in (10), since pc ∈ T_n but generally fails to belong to T_{n/2}. However, from (7) and (10), it holds that

    fr − fc = (Rn(pr) − [ 0 | I | 0 ] C2n(pc) P) g              (16)
            = (Rn(pr − φα) − [ 0 | I | 0 ] C2n(pc − φα) P) g,   (17)

for φα ∈ T_{n/2}. We take

    φα = arg min_{p ∈ T_{n/2}} ||ψ − p||∞.    (18)

Therefore

    C2n(pc − φα) = C2n(pc − ψ + ψ − φα)       (19)
                 = C2n(r_n) + C2n(a_{n/2}),    (20)

where r_n is the classical remainder in the trigonometric interpolation with n + 1 equispaced nodes in [0, π] belonging to Ωn, while a_{n/2} is the sup-norm optimal remainder of degree n/2. Similarly,

    Rn(pr − φα) = Rn(r̃_{n−1}) + Rn(a_{n/2}),    (21)

where r̃_{n−1} is the remainder of the trigonometric interpolation with n equispaced nodes in [0, π] belonging to Ωn \ {x_n = π}. As a consequence, since the transforms associated with the circulant and the cosine algebras are unitary, the spectral norms ||Cn(s)||, ||Rn(s)|| are bounded by the infinity norm of s. Moreover ||P|| = ||[ 0 | I | 0 ]|| = 1 and hence, by using (19)-(21) in (17), we find

    ||fr − fc|| ≤ (||Rn(pr − φα)|| + ||C2n(pc − φα)||) ||g||        (22)
                ≤ (||r_n||∞ + ||r̃_{n−1}||∞ + 2||a_{n/2}||∞) ||g||   (23)
                ≤ 2(K n ||a_n||∞ + ||a_{n/2}||∞) ||g||,              (24)

with K constant, where the latter inequality follows from the evaluation of the Lebesgue constants of the interpolation operators. In fact, after the change of variable y = cos(x), the operator behind r_n is the interpolation on [−1, 1] at the Chebyshev nodes of the second kind (the zeros of sin(nx)/sin(x)) plus the additional endpoints {±1}: its Lebesgue constant is known to grow as K log(n). The other Lebesgue constant, related to the operator behind r̃_{n−1}, is again associated with the Chebyshev nodes of the second kind plus only y = 1 (i.e. x = x_0 = 0); in this case the Lebesgue constant is known to grow as Kn. Since ||a_t||∞ converges exponentially to zero as t tends to infinity (due to the C∞ regularity of ψ), it follows that ||fr − fc|| converges exponentially to zero as n tends to infinity. As a consequence, the vectors fr and fc do not coincide in general, but their numerical difference is negligible already for moderate values of n.


Finally, when the PSF is not strongly symmetric, we notice that B cannot be diagonalized by DCT-I and has only a Toeplitz plus Hankel structure. Therefore in general the linear system arising from Tikhonov regularization with reflective BCs cannot be solved by an FFT-based algorithm. On the other hand, the other approach, based on the extension of g, can again be applied without modifications.

3 Image extension by anti-reflection

The reflective pad is effective if the image is locally stationary at its boundaries, but it can still create significant artifacts if the image intensity has a large gradient at the boundary. Reflecting the image will create a cusp that is likely to be highly inconsistent with the original image, since the image beyond the boundary more than likely continues to change according to the gradient at the boundary rather than the negative of that gradient. Based on this observation, in [9] the author proposed to anti-reflect, instead of reflect, the image at the boundary. This idea preserves the continuity of the normal derivative at the boundary without creating a cusp. Fig. 1 (c) shows how to extend an image by anti-reflection. We note a different scaling with respect to Fig. 1 (a) and Fig. 1 (b), since the anti-reflection produces values outside the original range and the subsequent visualization requires scaling the image. We analyze 1-D images in detail. Imposing anti-reflective BCs, the image f = [f1, . . . , fn] is assumed to be extended as

    f_{1−j} = 2f_1 − f_{j+1},    f_{n+j} = 2f_n − f_{n−j},    (25)

for j = 1, 2, . . . [9]. Anti-reflective BCs usually provide better restorations than reflective BCs, also in practical 2-D applications, while, from a computational viewpoint, they share the same properties as the reflective BCs [4, 3]. Indeed, when the PSF is strongly symmetric the matrix B in (3) is essentially diagonalized by the discrete sine transform of type III (DST-III), in the sense that the first and last equations are decoupled and the inner (n − 2) × (n − 2) block can be diagonalized by DST-III. Hence, several computations involving B, such as Tikhonov regularization, can be done by FFT-based algorithms. In the remaining case, a PSF that is not strongly symmetric, the matrix B is Toeplitz plus Hankel plus a rank-two correction, and the linear system arising from Tikhonov regularization cannot be handled by simply invoking FFT-based algorithms. Therefore, when the PSF is not strongly symmetric, it can be useful to apply the anti-reflection pad to extend g and to regularize (4). The extended image ge can be easily computed as ge = Pg, with P defined in (9), where now El = [ 2e | −J | 0 ] and Er = [ 0 | −J | 2e ], e = [1, . . . , 1]^T. We observe that in the case of a strongly symmetric PSF with the anti-reflective pad, differently from


Fig. 2. (a) Original image, where the box indicates the observed region. (b) Gaussian blurred and noisy image. (c) Out-of-focus blurred and noisy image.

the reflective case, the two approaches (BCs on f and extension of g) produce different restorations, usually of comparable quality: indeed the eigenvalues of B are not a subset of the eigenvalues of C, as happens for the reflective pad, even though they are defined on a uniform grid { iπ/(n + 1) | i = 1, . . . , n } as well. The main problem in extending g by anti-reflection is that ge is not periodic, and the model (4) could suffer from this. On the other hand, the ringing effects are greatly reduced with respect to applying the circulant deconvolution directly to g, since the boundaries are far away from the portion of the restored image, when compared with the circulant case. However, we can improve the model, and hence the restoration, by extending ge by reflection, obtaining a new periodic extended image gp of size 4n × 4n. Clearly this further proposal leads to a moderate increase in the computational effort. In fact, as observed in [1], gp is real and symmetric, and hence only the computation of the real part of a 2-D FFT of size 2n × 2n is required.
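In code, the anti-reflective extension (25) of a 1-D signal can be sketched as follows (the pad width p ≤ n − 1 is arbitrary); note that a linear ramp is continued without a cusp, unlike with reflection:

```python
import numpy as np

def antireflect_pad(f, p):
    """Anti-reflective extension (25) of a 1-D signal:
    f_{1-j} = 2 f_1 - f_{j+1},  f_{n+j} = 2 f_n - f_{n-j},  j = 1, ..., p."""
    left = 2 * f[0] - f[p:0:-1]          # values at positions 1-p, ..., 0
    right = 2 * f[-1] - f[-2:-p - 2:-1]  # values at positions n+1, ..., n+p
    return np.concatenate([left, f, right])

# a linear ramp is extended exactly: the slope is preserved at both ends
ramp = np.array([1.0, 2.0, 3.0, 4.0])
print(antireflect_pad(ramp, 2))          # -> [-1, 0, 1, 2, 3, 4, 5, 6]
```

Replacing the two formulas by plain reflection would instead produce [2, 1, 1, 2, 3, 4, 4, 3]-style values, with the cusp discussed above.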

4 Numerical experiments

For the following experiments we use Matlab 7.0. The blurred images are contaminated by mild white Gaussian noise. The restorations are compared visually, and the relative restoration error (RRE) is defined as ||f̂ − f||_2 / ||f||_2, where f̂ and f are the restored and the true image, respectively. For the Tikhonov regularization the parameter μ is chosen experimentally so that it minimizes the RRE over a certain range of μ. The image in Fig. 2 (a) was blurred with a Gaussian PSF (Fig. 2 (b)) and with an out-of-focus PSF (Fig. 2 (c)). The observed images are n × n with n = 195. Since both PSFs are strongly symmetric, we can compare the two approaches based on reflective BCs and on reflective extension of the observed image, respectively. The restored images and the absolute difference of the RREs for the two strategies in Fig. 3 and Fig. 4 validate the theoretical analysis given in Section 2. We note that both strategies reach the minimum RRE for the same


Fig. 3. Restorations of the image in Fig. 2 (b) (Gaussian blur): (a) restoration by reflective BCs, (b) restoration by reflective extension of the observed image, (c) loglog difference of the RREs for the two approaches ('*' corresponds to the optimal μ equal to 0.026 used in the restored images (a) and (b); absence of a line means exactly zero value).

Fig. 4. Restorations of the image in Fig. 2 (c) (out-of-focus blur): (a) restoration by reflective BCs, (b) restoration by reflective extension of the observed image, (c) loglog difference of the RREs for the two approaches ('*' corresponds to the optimal μ equal to 0.304 used in the restored images (a) and (b)).

value of μ, and we observe that, around this minimum, the absolute difference of the RREs has the same order as the machine precision (10^-16). Now we consider the anti-reflective extension of the observed image described in Section 3 and compare it only with the reflective extension, in the case of a nonsymmetric PSF. Indeed, for strongly symmetric PSFs we have seen that the two approaches based on reflective BCs and reflective extension of the observed image are equivalent. Moreover, in the recent literature a certain supremacy of the anti-reflective BCs with respect to reflective BCs is widely documented [4, 3], for moderate levels of noise. On the other hand, when the PSF is not strongly symmetric, the BC approach with Tikhonov regularization leads to a linear system that cannot be solved by FFT-based algorithms. Hence, in such a case the only fast approach is the one based on the extension of the observed image. According to the above comments, we choose a PSF representing a motion along


Fig. 5. (a) Moving blurred and noisy image. (b) Restoration by reflective extension 2n × 2n (RRE = 0.0932). (c) Restoration by anti-reflective extension 2n × 2n (RRE = 0.0807). (d) Restoration by anti-reflective extension and then reflective extension 4n × 4n (RRE = 0.0770).

Fig. 6. Loglog RRE vs. µ for the test in Fig. 5 and the three approaches: −− reflective extension, · · · anti-reflective extension 2n × 2n, −− anti-reflective extension and then reflective extension 4n × 4n.

the x axis. The original image is again that in Fig. 2 (a), while the blurred and noisy image is in Fig. 5 (a). In Fig. 5 (c) the restored image is obtained by anti-reflective extension which, even though the extended image is not periodic, is better than the restored image with reflective extension in Fig. 5 (b). The improvement is especially visible near the right edge, that is, in the direction of the motion. If we want to further improve the restoration, as described in Section 3, we can extend by reflection the 2n × 2n image obtained by the anti-reflective padding and then apply the circulant deconvolution to the new 4n × 4n problem. Indeed, the restored image in Fig. 5 (d) is better than that in Fig. 5 (c). Moreover, the last approach is more stable under perturbations of the parameter µ, as shown in Fig. 6 by the plot of the RREs vs. µ for the considered approaches.
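As a side note, the RRE used throughout this comparison is straightforward to compute; the following Python/NumPy sketch (with hypothetical toy image arrays) is only illustrative, not the authors' code:

```python
import numpy as np

def rre(f_hat, f):
    """Relative restoration error ||f_hat - f||_2 / ||f||_2."""
    return np.linalg.norm(f_hat - f) / np.linalg.norm(f)

# toy check on a hypothetical 3x4 "image"
f = np.arange(12.0).reshape(3, 4) + 1.0
perfect = rre(f, f)          # a perfect restoration has RRE 0
scaled = rre(1.1 * f, f)     # a uniformly rescaled image has RRE 0.1
print(perfect, scaled)
```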

Acknowledgment. The work of the authors was partially supported by MUR, grant №2006017542.


References

1. F. Aghdasi and R. K. Ward, Reduction of boundary artifacts in image restoration, IEEE Trans. Image Process., 5 (1996), pp. 611-618.
2. M. Bertero and P. Boccacci, A simple method for the reduction of the boundary effects in the Richardson-Lucy approach to image deconvolution, Astron. Astrophys., 437 (2005), pp. 369-374.
3. M. Christiansen and M. Hanke, Deblurring methods using antireflective boundary conditions, manuscript, 2006.
4. M. Donatelli, C. Estatico, A. Martinelli, and S. Serra-Capizzano, Improved image deblurring with anti-reflective boundary conditions and reblurring, Inverse Problems, 22 (2006), pp. 2035-2053.
5. P. C. Hansen, J. G. Nagy, and D. P. O'Leary, Deblurring Images: Matrices, Spectra, and Filtering, SIAM, Philadelphia, PA, 2006.
6. T. Kailath and V. Olshevsky, Displacement structure approach to discrete-trigonometric-transform based preconditioners of G. Strang type and T. Chan type, Calcolo, 33 (1996), pp. 191-208.
7. M. K. Ng, R. H. Chan, and W. C. Tang, A fast algorithm for deblurring models with Neumann boundary conditions, SIAM J. Sci. Comput., 21 (1999), pp. 851-866.
8. S. J. Reeves, Fast image restoration without boundary artifacts, IEEE Trans. Image Process., 14 (2005), pp. 1448-1453.
9. S. Serra-Capizzano, A note on anti-reflective boundary conditions and fast deblurring models, SIAM J. Sci. Comput., 25 (2003), pp. 1307-1325.
10. A. N. Tikhonov, Solution of incorrectly formulated problems and regularization method, Soviet Math. Dokl., 4 (1963), pp. 1035-1038.

Zeros of Determinants of λ-Matrices

Walter Gander
Computational Science, ETH, CH-8092 Zurich, Switzerland
gander@inf.ethz.ch

Abstract. Jim Wilkinson discovered that the computation of zeros of polynomials is ill-conditioned when the polynomial is given by its coefficients. For many problems we need to compute zeros of polynomials, but we do not necessarily need to represent the polynomial by its coefficients. We develop algorithms that avoid the coefficients. They turn out to be stable; the drawback, however, is often a heavily increased computational effort. Modern processors, on the other hand, are mostly idle and wait for numbers to crunch, so it may pay to accept more computations in order to increase stability and also to exploit parallelism. We apply the method to nonlinear eigenvalue problems.

Keywords: Nonlinear eigenvalue problems, Gaussian elimination, determinants, algorithmic differentiation.

1

Introduction

The classical textbook approach to solve an eigenvalue problem Ax = λx is to first compute the coefficients of the characteristic polynomial Pn(λ) = det(λI − A) by expanding the determinant:

    Pn(λ) = c0 + c1 λ + · · · + c_{n−1} λ^{n−1} + λ^n.

Then, second, apply some iterative method, e.g. Newton's method, to compute the zeros of Pn, which are the eigenvalues of the matrix A. In the beginning of the era of numerical analysis a research focus was to develop reliable solvers for zeros of polynomials; a typical example is [4]. However, the crucial discovery by Jim Wilkinson [6] was that the zeros of a polynomial can be very sensitive to small changes of the coefficients of the polynomial. Thus the determination of the zeros from the coefficients is ill-conditioned. It is easy today to repeat the experiment using a computer algebra system. Executing the following Maple statements

p := 1:
for i from 1 by 1 to 20 do p := p*(x-i) od:
PP := expand(p):
Digits := 7:


PPP := evalf(PP):
Digits := 30:
Z := fsolve(PPP, x, complex, maxsols = 20):

we can simulate what Jim Wilkinson experienced. We first expand the product

    prod_{i=1}^{20} (x − i) = x^20 − 210 x^19 ± · · · + 20!,

then round the coefficients to floating point numbers with 7 decimal digits:

    x^20 − 210.0 x^19 + 20615.0 x^18 ∓ · · · − 8.752948 × 10^18 x + 2.432902 × 10^18.

Continuing now the computation with 30 decimal digits to determine the exact zeros of the polynomial with truncated coefficients, we note that we do not obtain the numbers 1, 2, . . . , 20. Instead many zeros are complex, such as e.g. 17.175 ± 9.397i. Thus truncating the coefficients to 7 decimal digits has a very large effect on the zeros. The problem is ill-conditioned.
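The same experiment can be repeated without a computer algebra system; the following NumPy sketch (an illustration, not part of the original text) rounds the coefficients to 7 significant digits and looks at the resulting zeros:

```python
import numpy as np

# coefficients of prod_{i=1}^{20} (x - i), highest power first
coeffs = np.poly(np.arange(1, 21))

# round every coefficient to 7 significant decimal digits
rounded = np.array([float('%.6e' % c) for c in coeffs])

zeros = np.roots(rounded)
# the zeros are no longer 1, 2, ..., 20: several of them become complex
max_imag = np.max(np.abs(zeros.imag))
print(max_imag)
```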

2

Matlab Reverses Computing

Instead of expanding the determinant to obtain the coefficients of the characteristic polynomial, the command P = poly(A) in Matlab computes the eigenvalues of A by the QR algorithm and expands the linear factors

    Pn(λ) = (λ − λ1)(λ − λ2) · · · (λ − λn) = λ^n + c_{n−1} λ^{n−1} + · · · + c0

to compute the coefficients. Given, on the other hand, the coefficients ck of a polynomial, the command lambda = roots(P) forms the companion matrix

        | -c_{n-1}  -c_{n-2}  ...  -c_1  -c_0 |
        |    1         0      ...    0     0  |
    A = |    0         1      ...    0     0  |
        |    :         :       .     :     :  |
        |    0         0      ...    1     0  |

and uses again the QR algorithm to find the eigenvalues, which are the zeros of the polynomial.
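To illustrate (a sketch in Python/NumPy, not the Matlab internals), the companion matrix of λ³ − 6λ² + 11λ − 6 = (λ−1)(λ−2)(λ−3) has exactly the zeros of the polynomial as its eigenvalues:

```python
import numpy as np

c = [1.0, -6.0, 11.0, -6.0]        # lambda^3 - 6 lambda^2 + 11 lambda - 6

# companion matrix as in the text: first row -c_{n-1} ... -c_0,
# ones on the subdiagonal, zeros elsewhere
n = len(c) - 1
A = np.zeros((n, n))
A[0, :] = [-ck for ck in c[1:]]
A[1:, :-1] = np.eye(n - 1)

eig = np.sort(np.linalg.eigvals(A).real)
print(eig)
```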

3

Evaluating the Characteristic Polynomial

How can we evaluate the characteristic polynomial without first computing its coefficients? One way is to use Gaussian elimination and the fact that it is easy to

240

W. Gander

compute the determinant of a triangular matrix. Assume that we have computed the decomposition C = LU

with L a lower unit triangular and U an upper triangular matrix. Then det(C) = det(L) det(U) = u11 u22 · · · unn since det(L) = 1. Using partial pivoting for the decomposition, we have to change the sign of the determinant each time we interchange two rows. The program then becomes:

function f = determinant(C)
n = length(C);
f = 1;
for i = 1:n
  [cmax,kmax] = max(abs(C(i:n,i)));
  if cmax == 0                 % matrix singular
    f = 0; return
  end
  kmax = kmax+i-1;
  if kmax ~= i                 % interchange rows i and kmax
    h = C(i,:); C(i,:) = C(kmax,:); C(kmax,:) = h;
    f = -f;
  end
  f = f*C(i,i);
  % elimination step
  C(i+1:n,i) = C(i+1:n,i)/C(i,i);
  C(i+1:n,i+1:n) = C(i+1:n,i+1:n) - C(i+1:n,i)*C(i,i+1:n);
end

Let C(λ) = λI − A. We would like to use Newton's method to compute the zeros of P(λ) = det(C(λ)) = 0. For this we need the derivative P′(λ). It can be computed by algorithmic differentiation, that is, by differentiating each statement of the program that computes P(λ). For instance, the statement to update the determinant, f = f*C(i,i);, will be preceded by the statement for the derivative, thus

fs = fs*C(i,i) + f*Cs(i,i);
f  = f*C(i,i);

We used the variable Cs for the matrix C′(λ) and fs for the derivative of the determinant. There is, however, for larger matrices the danger that the value of the determinant over- respectively underflows. Notice that for Newton's iteration we do not need both values f = det(C(λ)) and fs = d/dλ det(C(λ)). It is sufficient to

compute the ratio

    P(λ)/P′(λ) = f/fs.

Overflow can be reduced by computing the logarithm. Thus instead of computing f = f*C(i,i) we can compute lf = lf + log(C(i,i)). Even better is the derivative of the logarithm,

    lfs := d/dλ log(f) = fs/f,

which yields directly the inverse Newton correction. Thus instead of updating the logarithm, lf = lf + log(c_ii), we directly compute the derivative,

    lfs = lfs + cs_ii / c_ii.

These considerations lead to

function ffs = deta(C,Cs)
% DETA computes Newton correction ffs = f/fs
n = length(C);
lfs = 0;
for i = 1:n
  [cmax,kmax] = max(abs(C(i:n,i)));
  if cmax == 0                 % matrix singular
    ffs = 0; return
  end
  kmax = kmax+i-1;
  if kmax ~= i                 % interchange rows in C and Cs
    h = C(i,:); C(i,:) = C(kmax,:); C(kmax,:) = h;
    h = Cs(kmax,:); Cs(kmax,:) = Cs(i,:); Cs(i,:) = h;
  end
  lfs = lfs + Cs(i,i)/C(i,i);
  % elimination step, differentiated statements first
  Cs(i+1:n,i) = (Cs(i+1:n,i)*C(i,i)-Cs(i,i)*C(i+1:n,i))/C(i,i)^2;
  C(i+1:n,i) = C(i+1:n,i)/C(i,i);
  Cs(i+1:n,i+1:n) = Cs(i+1:n,i+1:n) - Cs(i+1:n,i)*C(i,i+1:n) - ...
                    C(i+1:n,i)*Cs(i,i+1:n);
  C(i+1:n,i+1:n) = C(i+1:n,i+1:n) - C(i+1:n,i)*C(i,i+1:n);
end
ffs = 1/lfs;

Note that as an alternative to the algorithmic differentiation presented here one could use the formula of Jacobi,

    d/dλ det(C(λ)) = det(C(λ)) trace(C⁻¹(λ) C′(λ)),

which gives an explicit expression for the derivative of the determinant.
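Jacobi's formula is easy to check numerically; the sketch below (illustrative, with a random test matrix) compares it against a central finite difference:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))
lam = 0.7

C  = lam * np.eye(n) - A            # C(lambda) = lambda*I - A
Cp = np.eye(n)                      # C'(lambda) = I

# Jacobi: d/dlambda det C(lambda) = det(C) * trace(C^{-1} C')
jacobi = np.linalg.det(C) * np.trace(np.linalg.solve(C, Cp))

h = 1e-6                            # central finite difference for comparison
fd = (np.linalg.det((lam + h) * np.eye(n) - A)
      - np.linalg.det((lam - h) * np.eye(n) - A)) / (2 * h)
print(jacobi, fd)
```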

4

Suppression instead of Deflation

If x1, . . . , xk are already computed zeros, then we would like to continue working with the deflated polynomial

    P_{n−k}(x) := Pn(x) / ((x − x1) · · · (x − xk))        (1)

of degree n − k. However, we cannot explicitly deflate the zeros since we are working with P(λ) = det(λI − A). Differentiating Equation (1) we obtain

    P′_{n−k}(x) = P′n(x) / ((x − x1) · · · (x − xk)) − Pn(x) / ((x − x1) · · · (x − xk)) · sum_{i=1}^{k} 1/(x − xi).

Thus the Newton iteration becomes

    x_new = x − P_{n−k}(x)/P′_{n−k}(x) = x − (Pn(x)/P′n(x)) / (1 − (Pn(x)/P′n(x)) sum_{i=1}^{k} 1/(x − xi)).

This variant of Newton's iteration is called the Newton-Maehly iteration [2, 3].
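A compact sketch of the Newton-Maehly iteration for the standard eigenvalue problem (illustrative Python; the Newton correction P(λ)/P′(λ) = 1/trace((λI − A)⁻¹) is obtained from Jacobi's formula rather than from algorithmic differentiation, and the small symmetric test matrix with eigenvalues 1, . . . , 5 is an assumption of this example):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.arange(1.0, n + 1)) @ Q.T     # eigenvalues 1..5

def newton_correction(lam):
    # P(lam)/P'(lam) = 1 / trace((lam*I - A)^{-1}), by Jacobi's formula
    return 1.0 / np.trace(np.linalg.inv(lam * np.eye(n) - A))

found = []
for _ in range(n):
    x = 0.0                        # start to the left of all eigenvalues
    for _ in range(200):
        r = newton_correction(x)
        s = sum(1.0 / (x - xi) for xi in found)   # Maehly suppression
        step = r / (1.0 - r * s)
        x -= step
        if abs(step) < 1e-12:
            break
    found.append(x)

print(np.sort(found))
```

Starting below the smallest remaining eigenvalue, the suppressed iteration converges monotonically, so the eigenvalues are found in increasing order.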

5

Example

We generate a random symmetric matrix A with eigenvalues 1, 2, . . . , n:

x = [1:n]'; Q = rand(n); Q = orth(Q); A = Q*diag(x)*Q';

respectively a nonsymmetric matrix with

x = [1:n]'; Q = rand(n); A = Q*diag(x)*inv(Q);

Then we compute the solutions of det(C(λ)) = 0 with C(λ) = λI − A using the Newton-Maehly iteration. We compare the results with the ones obtained by the QR algorithm, eig(A), and with the zeros of the characteristic polynomial, roots(poly(A)). In Tables 1 and 2 the norm of the difference of the computed eigenvalues to the exact ones is printed. Notice that due to ill-conditioning the roots of the characteristic polynomial differ very much, and that for n = 200 the coefficients of the characteristic polynomial overflow and the zeros cannot be computed any more. On the other hand we can see that our method competes in accuracy very well with the standard QR algorithm.

Table 1. Norm of difference of the computed to the exact eigenvalues for a symmetric matrix

    n    roots(poly(A))   eig(A)       det(A − λI) = 0
    50   1.3598e+02       3.9436e−13   4.7243e−14
    100  9.5089e+02       1.1426e−12   1.4355e−13
    150  2.8470e+03       2.1442e−12   3.4472e−13
    200  −−−              3.8820e−12   6.5194e−13

Table 2. Norm of difference of the computed to the exact eigenvalues for a nonsymmetric matrix

    n    roots(poly(A))   eig(A)       det(A − λI) = 0
    50   1.3638e+02       3.7404e−12   2.7285e−12
    100  9.7802e+02       3.1602e−11   3.5954e−11
    150  2.7763e+03       6.8892e−11   3.0060e−11
    200  −−−              1.5600e−10   6.1495e−11

6

Generalization to λ-matrices

Consider a quadratic eigenvalue problem

    det(C(λ)) = 0,  with C(λ) = λ²M + λC + K.

If det(M) ≠ 0 then one way to "linearize" the problem is to consider the equivalent generalized eigenvalue problem of dimension 2n:

    det( | 0   K |  − λ | M  0 | ) = 0.
         | −M −C |      | 0  M |
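The equivalence can be checked numerically; the following sketch (illustrative, with random small M, C, K and M made diagonally dominant so that det(M) ≠ 0) uses a companion-type state-space form with M inverted, and verifies that every eigenvalue of the linearization makes C(λ) singular:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
M = rng.standard_normal((n, n)) + 5 * np.eye(n)   # keep M nonsingular
C = rng.standard_normal((n, n))
K = rng.standard_normal((n, n))

# companion-type linearization of lambda^2 M + lambda C + K (M inverted)
Minv = np.linalg.inv(M)
L = np.block([[np.zeros((n, n)), np.eye(n)],
              [-Minv @ K,        -Minv @ C]])
lams = np.linalg.eigvals(L)

# each eigenvalue must make C(lambda) = lambda^2 M + lambda C + K singular
def smallest_sv(lam):
    Cl = lam**2 * M + lam * C + K
    return np.linalg.svd(Cl, compute_uv=False)[-1]

resid = max(smallest_sv(l) for l in lams)
print(resid)
```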

Alternatively, with our approach we can compute the zeros of det(C(λ)) with Newton's iteration. Take the mass-spring system example from [5]. For the nonoverdamped case the matrix is C(λ) = λ²M + λC + K with

    M = I,  C = τ tridiag(−1, 3, −1),  K = κ tridiag(−1, 3, −1)

and with κ = 5, τ = 3 and n = 50. The Matlab program to compute the eigenvalues is

% Figure 3.3 in Tisseur-Meerbergen
clear, format compact
n = 50;
tau = 3; kappa = 5;
e = -ones(n-1,1);
C = diag(e,-1) + diag(e,1) + 3*eye(n);
K = kappa*C; C = tau*C;
lam = -0.5 + 0.1*i;
tic
for k = 1:2*n
  ffs = 1; q = 0;
  while abs(ffs) > 1e-14
    Q  = lam*(lam*eye(n) + C) + K;
    Qs = 2*lam*eye(n) + C;
    ffs = deta(Q,Qs);
    s = 0;
    if k > 1
      s = sum(1./(lam-lamb(1:k-1)));   % Maehly suppression
    end
    lam = lam - ffs/(1-ffs*s);
    q = q+1;
  end
  clc; k, lam, q, ffs
  lamb(k) = lam;
  lam = lam*(1+0.01*i);                % starting value for the next eigenvalue
end
toc
clf
plot(real(lamb),imag(lamb),'o')

and produces Figure 1.

Fig. 1. Eigenvalues in the complex plane for the nonoverdamped case

The computation in Matlab needed 13.9 seconds on an IBM X41 laptop. As starting values for the iteration we used the complex number λ(1 + 0.01i) near the last computed eigenvalue λ.


In the second, "overdamped", case we have κ = 5, τ = 10. Since the eigenvalues are all real we can choose real starting values. We chose 1.01λ, where again λ was the last eigenvalue found. Figure 2 shows the eigenvalues, which are all real and computed with Matlab in 16.3 seconds.

Fig. 2. Real eigenvalues for the overdamped case

Finally we recomputed a cubic eigenvalue problem from [1]. Here we have

    C(λ) = λ³A3 + λ²A2 + λA1 + A0

with A0 = tridiag(1, 8, 1), A2 = diag(1, 2, . . . , n) and A1 = A3 = I. In [1] the matrix dimension was n = 20, thus 60 eigenvalues had to be computed. Using our method we compute these in 1.9 seconds. Figure 3 shows the 150 eigenvalues for n = 50, which have been computed in 17.9 seconds.

7

Conclusion

We have demonstrated that computing zeros of polynomials from their coefficients is ill-conditioned. However, direct evaluation of the characteristic polynomial is feasible. With this computationally intensive method we have shown that medium-size nonlinear eigenvalue problems may be solved with a simple program which computes determinants by Gaussian elimination, applies algorithmic differentiation and suppresses already computed zeros. We obtained results in reasonable time in spite of the fact that we did not compile the Matlab program


Fig. 3.

Cubic Eigenvalue Problem

and that we did not make use of the banded structure of the matrices. This algorithm, though computationally expensive, may be useful for its potential for parallelization on future multicore architectures.

References

1. P. Arbenz and W. Gander, Solving Nonlinear Eigenvalue Problems by Algorithmic Differentiation, Computing, 36 (1986), pp. 205-215.
2. H. J. Maehly, Zur iterativen Auflösung algebraischer Gleichungen, ZAMP (Zeitschrift für angewandte Mathematik und Physik), (1954), pp. 260-263.
3. J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, Springer, 1991.
4. W. Kellenberger, Ein konvergentes Iterationsverfahren zur Berechnung der Wurzeln eines Polynoms, Z. Angew. Math. Phys., 21 (1970), pp. 647-651.
5. F. Tisseur and K. Meerbergen, The Quadratic Eigenvalue Problem, SIAM Rev., 43 (2001), pp. 234-286.
6. J. H. Wilkinson, Rounding Errors in Algebraic Processes, Dover Publications, 1994.

How to find a good submatrix⋆

S. A. Goreinov, I. V. Oseledets, D. V. Savostyanov, E. E. Tyrtyshnikov, and N. L. Zamarashkin

Institute of Numerical Mathematics of Russian Academy of Sciences, Gubkina 8, 119333 Moscow, Russia
{sergei,ivan,draug,tee,kolya}@bach.inm.ras.ru

Abstract. Pseudoskeleton approximation and some other problems require the knowledge of a sufficiently well-conditioned submatrix in a large-scale matrix. The quality of a submatrix can be measured by the modulus of its determinant, also known as its volume. In this paper we discuss a search algorithm for the maximum-volume submatrix which has already proved to be useful in several matrix and tensor approximation algorithms. We investigate the behavior of this algorithm on random matrices and present some of its applications, including the maximization of a bivariate functional.

Keywords: maximum volume, low rank, maxvol, pseudoskeleton approximation.

1

Introduction

Several problems in matrix analysis require the knowledge of a good submatrix in a given (supposedly large) matrix. By "good" we mean a sufficiently well-conditioned submatrix. The application that we are particularly interested in is the approximation of a given matrix by a low-rank matrix:

    A ≈ UVᵀ,

where A is m × n and U and V are m × r and n × r, respectively. The optimal approximation in the spectral or Frobenius norm can be computed via the singular value decomposition (SVD) which, however, requires too many operations. A much faster way is to use CGR decompositions [1] (later also referred to as CUR by some authors), which in Matlab notation can be written as:

    A ≈ A(:, J) A(I, J)⁻¹ A(I, :),   (1)

where I, J are appropriately chosen index sets of length r from 1 : n and 1 : m. It can be seen that the right-hand side matrix coincides with A in r rows and

⋆ This work was supported by RFBR grant №08-01-00115a and by a Priority Research Grant OMN-3 of the Department of Mathematical Sciences of the Russian Academy of Sciences.

248

S. Goreinov, I. Oseledets, D. Savostyanov, E. Tyrtyshnikov, N. Zamarashkin

r columns. Moreover, if A is strictly of rank r and Â = A(I, J) is nonsingular, the exact equality holds. However, in the approximate case the quality of the approximation (1) relies heavily on the "quality" of the submatrix. The question is how to measure this quality and how to find a good submatrix. A theoretical answer (basically, an existence theory) [3] is that if Â has maximal in modulus determinant among all r × r submatrices of A, then the element-wise error estimate is of the form

    |A − Ar| ≤ (r + 1) σ_{r+1},

where |A| = max_{ij} |a_{ij}| denotes the Chebyshev norm, Ar is the right-hand side of (1) and σ_{r+1} is the (r + 1)-th singular value of the matrix A, i.e. the error of the best rank-r approximation in the spectral norm. That is the theory, but what about a practical algorithm? How do we find a good submatrix? That is the topic of the current paper. As we have seen, the submatrix quality can be measured by its determinant, so we want to find a submatrix with the largest possible determinant. An intermediate step towards the solution of that problem is the computation of the maximal-volume submatrix not in a matrix where both dimensions are large, but in a matrix where only one dimension is large, i.e. in a "tall matrix". Such a procedure (called maxvol) plays a crucial role in several matrix algorithms we have developed, and it deserves a special description [2, 4]. In this paper we investigate the behavior of the maxvol algorithm on random matrices and present some theoretical results and its application to the fast search of the maximum entry of a large-scale matrix. We also propose a new approach for the maximization of a bivariate functional on the basis of the maxvol algorithm.

1.1

Notation

In this article we use Matlab-like notation for defining rows and columns of a matrix. Therefore we write the i-th row of a matrix A as a_{i,:} and the j-th column of A as a_{:,j}. We will also use columns and rows of the identity matrix, denoting them by e_i and e_jᵀ respectively, using the same notation for different sizes; the actual size will always be clear from the context.

1.2

Definitions and basic lemmas

Let us give some formal definitions and prove the basic lemmas we rely on.

Definition 1. We refer to the modulus of the determinant of a square matrix as its volume.

Definition 2. We call an r × r submatrix A⊡ of a rectangular m × n matrix A a maximum-volume submatrix if it has maximum determinant in modulus among all possible r × r submatrices of A.


Definition 3. We call an r × r submatrix A⊡ of a rectangular n × r matrix A of full rank dominant if all the entries of A A⊡⁻¹ are not greater than 1 in modulus.

The main observation that lays the ground for the construction of the maxvol algorithm is the following lemma.

Lemma 1. For an n × r matrix, a maximum-volume r × r submatrix is dominant.

Proof. Without loss of generality we can assume that A⊡ occupies the first r rows of A. Let us refer to them as the upper submatrix. Then

    A A⊡⁻¹ = [ I_{r×r} ; Z ] = B.   (2)

Multiplication by a nonsingular matrix does not change the ratio of determinants of any pair of r × r submatrices in A. Therefore, the upper submatrix I_{r×r} is a maximum-volume submatrix in B, and it is dominant in B iff A⊡ is dominant in A. Now, if there is some |b_{ij}| > 1 in B, then we can construct a new submatrix with a volume larger than the volume of the upper submatrix. To see that, swap rows i and j in B; the new upper submatrix B′ coincides with the identity except that its j-th row is replaced by the i-th row of B, which carries b_{ij} in the diagonal position:

    B′ = | 1                      |
         |   . . .                |
         | ∗  ∗  b_{ij}  ∗  ...  ∗ |   (3)
         |              . . .     |
         |                      1 |

so that

    |det(B′)| = |b_{ij}| > 1 = |det(I_{r×r})|.

That means that I_{r×r} (and hence A⊡) is not the maximum-volume submatrix.

The volume of a dominant submatrix cannot be very much smaller than the maximum volume, as the following lemma shows.

Lemma 2. For any nonsingular n × r matrix A,

    |det(A⊡)| ≥ |det(A⋆)| / r^{r/2}   (4)

for all dominant r × r submatrices A⊡ of A, where A⋆ denotes a maximum-volume r × r submatrix of A.

Proof. Suppose that A⊡ is the upper submatrix and write

    A A⊡⁻¹ = [ I_{r×r} ; Z ] = B.   (5)

All entries of B are not greater than 1 in modulus; therefore, by the Hadamard inequality, the volume of any r × r submatrix B_{r×r} of B is not greater than

    |det(B_{r×r})| ≤ prod_{i=1}^{r} |b_{σ_i,:}| ≤ r^{r/2},

where the σ_i are the indices of the rows that contain B_{r×r}. The inequality is sharp: for example, if Z contains a Fourier, Hadamard or Walsh matrix as a submatrix, it is easy to see that the equality is attained.

2

Algorithm maxvol

The dominance property of the maximal-volume submatrix allows us to construct a simple and efficient algorithm for the search of the maximal-volume submatrix.

Algorithm 1. Given: an n × r matrix A. Find: an r × r dominant submatrix A⊡.

0. Start with an arbitrary nonsingular r × r submatrix A⊡. Reorder the rows in A so that A⊡ occupies the first r rows in A.
1. Compute B = A A⊡⁻¹ and find its maximal in modulus entry b_{ij}.
2. If |b_{ij}| > 1, then swap rows i and j in B. Now the upper submatrix in B has the form (3) and the volume |b_{ij}| > 1. By swapping the rows we have increased the volume of the upper submatrix in B, as well as in A. Let A⊡ be the new upper submatrix of A and go to step 1. If |b_{ij}| = 1, return A⊡.

On each iterative step of Algorithm 1 the volume of A⊡ increases, until the maximum volume is reached. In practice, we can relax the stopping criterion in the iterative step to |b_{ij}| < 1 + δ with a sufficiently small parameter δ (we think that δ ∼ 10⁻² can be a good choice). This dramatically reduces the number of iterative steps but does not change the "good" properties of the submatrix. If the computations proceed in a naive way, then the most expensive part of an iteration is step 1, which needs one r × r matrix inversion and nr² operations for the matrix-by-matrix product A A⊡⁻¹. We can reduce the complexity of this step by a factor of r if we note that on each iteration A⊡ is updated by a rank-one matrix, and apply the Sherman-Woodbury-Morrison formula for the matrix inverse. Now we describe this in detail. Swapping rows i and j of the matrix A is equivalent to the following rank-one update:

    A := A + e_j(a_{i,:} − a_{j,:}) + e_i(a_{j,:} − a_{i,:}) = A + (e_j − e_i)(a_{i,:} − a_{j,:}) = A + p vᵀ.   (6)


For the upper submatrix, this update is

    A⊡ := A⊡ + e_j(a_{i,:} − a_{j,:}) = A⊡ + q vᵀ.   (7)

For the inverse of the upper submatrix we use the SWM formula

    A⊡⁻¹ := A⊡⁻¹ − A⊡⁻¹ q (1 + vᵀ A⊡⁻¹ q)⁻¹ vᵀ A⊡⁻¹.   (8)

Note that

    vᵀ A⊡⁻¹ q = (a_{i,:} − a_{j,:}) A⊡⁻¹ e_j = ((A A⊡⁻¹)_{i,:} − (A A⊡⁻¹)_{j,:}) e_j = b_{ij} − b_{jj} = b_{ij} − 1.

We proceed with the formula of the fast update of B = A A⊡⁻¹:

    B := (A + p vᵀ)(A⊡⁻¹ − A⊡⁻¹ q vᵀ A⊡⁻¹ / b_{ij})
       = A A⊡⁻¹ − A A⊡⁻¹ q vᵀ A⊡⁻¹ / b_{ij} + p vᵀ A⊡⁻¹ − p vᵀ A⊡⁻¹ q vᵀ A⊡⁻¹ / b_{ij}
       = B − (B q − b_{ij} p + p vᵀ A⊡⁻¹ q) vᵀ A⊡⁻¹ / b_{ij}.

Using vᵀ A⊡⁻¹ = b_{i,:} − b_{j,:} and vᵀ A⊡⁻¹ q = b_{ij} − 1, we have

    B := B − (b_{:,j} − b_{ij} p + (b_{ij} − 1) p)(b_{i,:} − b_{j,:}) / b_{ij},

and finally

    B := B − (b_{:,j} − e_j + e_i)(b_{i,:} − e_jᵀ) / b_{ij}.   (9)

Note also that the upper r × r submatrix of B remains the identity after each update, because b_{1:r,j} = e_j for j ≤ r and (e_i)_{1:r} = 0 for i > r, which is always the case. So we need to update only the submatrix Z. This can also be done by a rank-one update:

    Z := Z − (b_{:,j} + e_i)(b_{i,:} − e_jᵀ) / b_{ij}.   (10)

Note that in the last formula we use "old" indexing, i.e. the rows of Z are numbered from r + 1 to n.

Therefore, each iterative step of the algorithm reduces to a rank-one update of Z, which can be done in (n − r)r operations, and a search for the maximum-modulus element in Z, which is of the same complexity. The overall complexity of Algorithm 1 is therefore O(nr²) for the initialization and O(cnr) for the iterative part, where c is the number of iterations. We can write a rather rough estimate for c as follows. Each iteration step increases the volume of A⊡ by a factor |b_{ij}| ≥ 1 + δ. After k steps

    |det(A⊡^{(k)})| ≥ |det(A⊡^{(0)})| (1 + δ)^k,

therefore

    c ≤ ( log V⋆ − log |det(A⊡^{(0)})| ) / log(1 + δ),   (11)

where V⋆ is the maximum volume among all r × r submatrices. This shows that a good initial guess for A⊡ can reduce the number of iterations. If no "empirical" guesses are available, it is always safe to apply Gaussian elimination with pivoting to A and use the set of pivoted rows as an initial approximation to the maximal-volume submatrix.
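A naive Python sketch of Algorithm 1 (illustrative; B = A A⊡⁻¹ is recomputed on every step instead of applying the rank-one updates derived above, and the relaxed stopping criterion |b_{ij}| < 1 + δ is used):

```python
import numpy as np

def maxvol(A, delta=1e-2, max_iter=500):
    """Return row indices of a dominant r x r submatrix of a tall n x r matrix A."""
    n, r = A.shape
    ind = list(range(r))                 # initial guess: the first r rows
    for _ in range(max_iter):
        B = A @ np.linalg.inv(A[ind])
        i, j = np.unravel_index(np.argmax(np.abs(B)), B.shape)
        if abs(B[i, j]) < 1 + delta:     # relaxed dominance reached
            break
        ind[j] = i                       # row i replaces the row in position j
    return ind

rng = np.random.default_rng(4)
A = rng.standard_normal((200, 5))
ind = maxvol(A)
B = A @ np.linalg.inv(A[ind])
print(np.max(np.abs(B)))
```

Each swap multiplies the volume of the selected submatrix by |b_{ij}| > 1 + δ, so the loop terminates; the rank-one update (10) would lower the cost per step from O(nr²) to O(nr).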

3

maxvol-based maximization methods

As an application, consider the following simple and interesting problem: find the maximum in modulus element of a low-rank matrix A = UVᵀ, given by U and V. This problem arises, for example, in the maximization of a two-dimensional separable function on a grid, or as an essential part of the Cross3D algorithm for the computation of the Tucker approximation of a three-dimensional tensor in linear time [4]. Direct comparison of all elements requires rnm operations for an m × n matrix of rank r. Is it possible to devise an algorithm with complexity linear in the matrix size? 3.1

Theoretical estimates

Our idea is to search for the maximum element not in the whole matrix, but only in the submatrix of maximal volume. Though it looks not very natural at first glance, this algorithm actually works well in many cases. Often the maximal element in the maximal-volume submatrix is not the same as the true maximal element, but it cannot be very much smaller (for example, if it is zero, then the submatrix of maximal volume is zero and the matrix is also zero, which we hope is not true). But are there any quantitative estimates? In fact, we can replace the maximal-volume submatrix by an arbitrary dominant submatrix, which yields the same estimate. But first we need to extend the definition of the dominant submatrix to the case of m × n matrices. This is done in a very simple manner.

Definition 4. We call an r × r submatrix A⊡ of a rectangular m × n matrix A dominant if it is dominant in the columns and rows that it occupies, in terms of Definition 3.

Theorem 1. If A⊡ is a dominant r × r submatrix of an m × n matrix A of rank r, then

    |A⊡| ≥ |A| / r².   (12)

Proof. If the maximum in modulus element b of A belongs to A⊡, the statement is trivial. If not, consider the (r + 1) × (r + 1) submatrix that contains A⊡ and b,

    Â = [ A⊡  c ; dᵀ  b ].   (13)

The elements of the vectors c and d can be bounded as follows:

    |c| ≤ r |A⊡|,  |d| ≤ r |A⊡|.   (14)

This immediately follows from c = A⊡ (A⊡⁻¹ c) = A⊡ c̃, where all elements of c̃ are not greater than 1 in modulus. The bound for the elements of d is proved in the same way.

Now we have to bound |b|. Since A has rank r and A⊡ is nonsingular,

    b = dᵀ A⊡⁻¹ c,   (15)

and it immediately follows that

    |A| = |b| ≤ |d| r ≤ |A⊡| r²,

which completes the proof.

The restriction rank A = r may be removed with almost no change in the bound (12). However, one has to replace the dominant submatrix by the maximum-volume one.

Theorem 2. If A⊡ is a maximum-volume r × r (nonsingular) submatrix of an m × n matrix A, then

    |A⊡| ≥ |A| / (2r² + r).

Proof. Again, consider the submatrix Â that contains A⊡ and b, see (13). The bound (14) follows immediately, because the maximum-volume submatrix is dominant, see Lemma 1. Since rank A is now arbitrary, the equality (15) is no longer valid. Instead, we use an inequality from [3],

    |b − dᵀ A⊡⁻¹ c| ≤ (r + 1) σ_{r+1}(Â),   (16)

where σ₁(Â) ≥ σ₂(Â) ≥ . . . ≥ σ_{r+1}(Â) are the singular values of Â. That gives

    |b| ≤ (r + 1) σ_{r+1}(Â) + |dᵀ A⊡⁻¹ c| = (r + 1) σ_{r+1}(Â) + |dᵀ c̃| ≤ (r + 1) σ_{r+1}(Â) + |d| r ≤ (r + 1) σ_{r+1}(Â) + |A⊡| r².   (17)

We need an estimate for σ_{r+1}(Â) in terms of the values of its elements. Note that

    Âᵀ Â = [ A⊡ᵀ  d ; cᵀ  b ] [ A⊡  c ; dᵀ  b ] = [ A⊡ᵀ A⊡ + d dᵀ   A⊡ᵀ c + b d ; cᵀ A⊡ + b dᵀ   cᵀ c + b² ].

From the singular value interlacing theorem,

    σ_r(A⊡ᵀ A⊡ + d dᵀ) ≥ σ²_{r+1}(Â),

and for r > 1

    σ_{r−1}(A⊡ᵀ A⊡) ≥ σ_r(A⊡ᵀ A⊡ + d dᵀ) ≥ σ²_{r+1}(Â).

Finally we have σ₁(A⊡) ≥ σ_{r+1}(Â) and |A⊡| ≥ σ₁(A⊡)/r. Plugging this into (17), we get

    |b| ≤ (r + 1) r |A⊡| + r² |A⊡| = (2r² + r) |A⊡|,

which completes the proof.

Fig. 1. Distribution of the ratio of the maxvol element over the true maximal element. Left: matrix size 1000, rank 10, 53676384 trials; right: matrix size 10000, rank 10, 33627632 trials.

Now it is clear that we can reduce the search to only r² elements of the dominant submatrix. Then the search time does not depend on the matrix size, and the total complexity is just the complexity of finding A⊡, which is O(nr² + mr²) operations. The maximum element in A⊡ is "sufficiently good" in the sense of the proven theorems. In practical cases, the ratio |A|/|A⊡| is considerably smaller than r². Consider two examples which illustrate this fact. 3.2

Search for the maximum element in random low-rank matrices

In order to see how good the maximal element in our "good" submatrix is, we tested it first on random matrices. Given n, m, r, two matrices U and V were generated with elements uniformly distributed in [−1, 1]. Then U, V were replaced with the Q-factors of their QR-decompositions and a matrix A = UDVᵀ was generated with a random positive diagonal D with elements uniformly distributed on [0, 1]. We generated a large set of trial matrices, and for each of these matrices we computed the maximal element using the proposed algorithm. The actual degradation of the maximal element is presented in Figure 1, where the histogram of the ratio of the maximal element in A⊡ over the true maximal element is given. Note that this ratio is for certain not lower than 0.5 for all trials (smooth humps in the middle part of the histograms), and in some 5% of cases (sharp peaks in the right part of the histograms) we even found the true maximal element, which would be much less probable for a random choice. 3.3

Maximization of bivariate functions

There is an interesting application of our algorithm: it can be applied to the problem of global optimization of bivariate functions. Suppose we want to find a maximum of |f(x, y)| in some rectangle (x, y) ∈ Π = [a0, a1] × [b0, b1], where f is some given function. "Discretizing" the problem on some sufficiently fine


grid (x_i, y_j), i = 1, 2, . . . , m, j = 1, 2, . . . , n, we obtain an m × n matrix A = [f(x_i, y_j)] in which to find the maximal in modulus element. Assume additionally that the function f(x, y) can be sufficiently well approximated by a sum of separable functions:

    f(x, y) ≈ sum_{α=1}^{r} u_α(x) v_α(y).

Then it is easy to see that in this case the matrix A admits a rank-r approximation of the form

    A = [f(x_i, y_j)] ≈ UVᵀ,

where U and V are m × r and n × r matrices, respectively, with elements U = [u_α(x_i)], V = [v_α(y_j)]. Thus the "discretized" problem is equivalent to the problem of finding the maximal in modulus element in a large low-rank matrix A, so we can apply our method. We have no guarantee that we will find the exact maximum, but we will have an estimate of it. As an example we considered a standard banana function minimization problem:

    b(x, y) = 100(y − x²)² + (1 − x)².

This function has a minimum at (1, 1) equal to 0 and is positive at all other points. In order to reformulate the problem as a maximization problem, we introduce an auxiliary function

    f(x, y) = 1 / (b(x, y) + 10⁻⁶),

the maximum of which is located at the same point (1, 1). The rectangle [−2, 2] × [−2, 2] was chosen and discretized on a 500 × 500 uniform grid; the corresponding matrix A was approximated by a matrix of rank 10, for which the maximum was found by our maxvol algorithm. The extremal point was contained in the grid, and maxvol returned the exact position of the minimum: (1, 1). For other choices of grids the situation was the same, and the approximations to the extremum were very good (the error was O(h), where h is the grid step). This result is very encouraging. However, it should not be treated as a universal optimization method; rather, it can be very useful within global optimization methods, because it gives an estimate of the value of the global optimum. This can be used efficiently, for example, in branch-and-bound methods, with maxvol estimates for the maximal value in a particular domain. Another possibility is to use "local" separable approximations to functions and then minimize this local part by the maxvol algorithm. Incorporation of our method into robust optimization methods will be the subject of future research.
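The banana-function experiment can be imitated in a few lines. In the sketch below (Python/NumPy), a truncated SVD stands in for the cross approximation and a crude alternating row/column argmax stands in for the full maxvol search; both substitutions are ours, as is the grid size 501 chosen so that (1, 1) lies exactly on the grid:

```python
import numpy as np

def banana_reciprocal(x, y, eps=1e-6):
    # f = 1/(b + eps), maximal where the banana function b is minimal
    b = 100.0 * (y - x**2) ** 2 + (1.0 - x) ** 2
    return 1.0 / (b + eps)

n = 501                                     # grid chosen so that (1, 1) is a grid point
xs = np.linspace(-2.0, 2.0, n)
ys = np.linspace(-2.0, 2.0, n)
A = banana_reciprocal(xs[:, None], ys[None, :])

# rank-10 approximation A_r = Ur @ Vr.T (truncated SVD here; the paper
# would build it by incomplete cross approximation instead)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = 10
Ur = U[:, :r] * s[:r]
Vr = Vt[:r, :].T

# start from the column of A_r with the largest norm (computable in
# O(n r^2) via the Gram matrix), then alternate row/column argmax of |A_r|
G = Ur.T @ Ur
j = int(np.argmax(np.einsum('jr,rs,js->j', Vr, G, Vr)))
for _ in range(10):
    i = int(np.argmax(np.abs(Ur @ Vr[j])))
    j = int(np.argmax(np.abs(Vr @ Ur[i])))
print(xs[i], ys[j])                         # both close to 1.0
```

The sharp peak of f at (1, 1) dominates the Frobenius norm, so the low-rank approximation preserves the location of the maximal element, which is exactly the property the maxvol-based search exploits.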

4 Conclusion and future work

In this paper we presented a simple iterative method for the search of a submatrix of maximal volume in a given rectangular matrix. This submatrix plays

256

S. Goreinov, I. Oseledets, D. Savostyanov, E. Tyrtyshnikov, N. Zamarashkin

an important role in the theory and algorithms for approximation by low (tensor) rank matrices. As an application, we constructed an algorithm for the computation of the maximal-in-modulus element in a given low-rank matrix and proved that this element cannot be much smaller than the "true" maximal element. Experiments on random matrices show that our algorithm performs very well, as does the experiment with the minimization of the banana function. Future work will focus on maximizing separable functions by using the branch-and-bound method with maxvol estimates of the maximal element in each subdomain, and by using "local" approximations by separable functions.


Conjugate and Semi-Conjugate Direction Methods with Preconditioning Projectors V. P. Il'in Institute of Computational Mathemati s and Mathemati al Geophysi s, Siberian Bran h of Russian A ademy of S ien es, ak. Lavrentieva 6, 630090 Novosibirsk, Russia ilin@sscc.ru

Abstract. The acceleration of the original projective iterative methods of multiplicative or additive type for solving systems of linear algebraic equations (SLAEs) by means of conjugate direction approaches is considered. The orthogonal and variational properties of the preconditioned conjugate gradient, conjugate residual and semi-conjugate residual algorithms, as well as estimates of the number of iterations, are presented. Similar results are obtained for the dynamically preconditioned iterative process in Krylov subspaces. Application of the discussed techniques to domain decomposition, Kaczmarz, and Cimmino methods is proposed.

1 Introduction

The aim of this paper is to analyze iterative algorithms in Krylov subspaces whose preconditioners are certain projector operators. First we consider a general approach to the acceleration of convergent iterations with a constant iteration matrix. Let us have the system of linear algebraic equations

    Au = f,   u = {u_i},   f = {f_i} ∈ R^N,   A = {a_{ij}} ∈ R^{N×N},   (1)

and the convergent stationary iterative process

    u^{k+1} = Bu^k + g,   u^k → u (k → ∞),   g = (I − B)A^{−1}f.   (2)

Suppose that the iteration matrix B has eigenvalues λ_q(B) and spectral radius ρ = max_q |λ_q(B)| < 1. Then the vector u is the solution of the system

    Ãu ≡ (I − B)u = g,   (3)

where I is the identity matrix and Ã is the preconditioned non-singular matrix. If Ã is a symmetric positive definite (s.p.d.) matrix, its spectral condition number is

    κ = ‖Ã‖₂‖Ã^{−1}‖₂ = (1 + ρ)/(1 − ρ),   (4)


and to solve SLAE (3) we can apply iterative conjugate direction methods (see [1]–[4]):

    r^0 = g − Ãu^0,  p^0 = r^0;  n = 0, 1, ...:
    u^{n+1} = u^n + α_n p^n,   r^{n+1} = r^n − α_n Ãp^n,
    p^{n+1} = r^{n+1} + β_n p^n,   (5)

which have the optimality property in the Krylov subspaces

    K_{n+1}(r^0, Ã) = Span{p^0, p^1, ..., p^n} = Span{p^0, Ãp^0, ..., Ã^n p^0}.

In the conjugate gradient (CG) and conjugate residual (CR) methods, s = 0, 1 respectively, the iterative parameters α_n^{(s)} and β_n^{(s)} are defined as follows:

    α_n^{(s)} = (Ã^s r^n, r^n)/(Ãp^n, Ã^s p^n),   β_n^{(s)} = (Ã^s r^{n+1}, r^{n+1})/(Ã^s r^n, r^n).   (6)

These algorithms provide the residual and direction (correction) vectors r^n and p^n with the orthogonality properties

    (Ã^s r^n, r^k) = (Ã^s r^n, r^n)δ_{n,k},   (Ãp^n, Ã^s p^k) = (Ãp^n, Ã^s p^n)δ_{n,k}.   (7)

Also, the functionals Φ_n^{(s)}(r^n) = (Ã^{s−1} r^n, r^n), s = 0, 1, are minimized in the Krylov subspaces, and the number of iterations necessary for satisfying the condition

    (Φ_n^{(s)}(r^n)/Φ_0^{(s)}(r^0))^{1/2} ≤ ε < 1

is estimated by the value

    n(ε) ≤ 1 + ln((1 + √(1 − ε²))/ε) / ln γ,   γ = (√κ + 1)/(√κ − 1).   (8)

It should be noted that the matrix-vector multiplication in (5) amounts to the implementation of one iteration (2) and does not require explicit forming of the matrices Ã and B, because, for example, Ãp^n = p^n − Bp^n. If the matrix Ã is nonsymmetric and positive definite, i.e.

    (Ãu, u) ≥ δ(u, u),   δ > 0,   u ≠ 0,

then system (3) can be solved by means of the semi-conjugate residual (SCR) method, which realizes a stabilized version of the generalized conjugate residual (GCR) algorithm; GCR is described in [5] and has instability features in terms of truncation errors, see [4].


In SCR, the vectors u^{n+1} and r^{n+1} are computed according to formulas (5), with the coefficients α_n^{(s)} from (6) for s = 1, and the direction vectors p^{n+1} are defined as follows:

    p^{n+1,0} = r^{n+1},   p^{n+1,l} = p^{n+1,l−1} + β_{n,l} p^{l−1},   l = 1, ..., n,
    β_{n,l} = −(Ãp^l, Ãp^{n+1,l−1})/(Ãp^l, Ãp^l),   p^{n+1} = p^{n+1,n}.   (9)

Relations (5), (9) realize the construction of ÃᵗÃ-orthogonal (conjugate) vectors p^0, p^1, ..., p^{n+1} by means of modified Gram-Schmidt orthogonalization [6]. In this case, the functional Φ_n^{(1)}(r^n) = (r^n, r^n) is minimized in the subspace K_{n+1}(r^0, Ã), and the residual vectors are right semi-conjugate, in the sense that the equalities (Ãr^k, r^n) = 0 are satisfied for k < n. Since the SCR and GMRES methods (see [4]) have the same variational properties in the Krylov subspaces, a similar estimate of the number of iterations n(ε) is valid for them, and it will be used below.

This paper is organized as follows. In Section 2, we describe projective methods of the multiplicative type using the conjugate direction and semi-conjugate direction approaches. The next section is devoted to additive type projective methods in the Krylov subspaces. Also, the application of dynamic preconditioners is discussed; this approach means using a variable iteration matrix B_n at different iterations, which is an implementation requirement, for example, in many two-level iterative processes.

2 Multiplicative projector methods

Let Ω = {1, 2, ..., N} denote the set of matrix row numbers and Ω_p, p = 1, 2, ..., l, be its non-intersecting integer subsets, with m_p the numbers of their elements:

    Ω = ∪_{p=1}^{l} Ω_p,   m_1 + ... + m_l = N.

Also, let us introduce the subvectors u_{(p)}, f_{(p)}, p = 1, ..., l, of dimensions m_p and the rectangular matrices A_{(p)} ∈ R^{m_p×N}:

    u_{(p)} = {u_i, i ∈ Ω_p},   f_{(p)} = {f_i, i ∈ Ω_p},   A_{(p)} = {A_i, i ∈ Ω_p},   (10)

p = 1, 2, ..., l.

(11)

To solve (11), we onsider an iterative pro ess in whi h the omputing of ea h n-th approximation step onsists of the following stages: n,p−1 , un,p = un,p−1 + ωA+ (p) r(p)

n = 1, 2, ...,

p = 1, 2, ..., l,

un = un,l .

(12)


Here u^{0,0} = {u_i^0, i = 1, 2, ..., N} is the initial guess, ω is an iterative parameter, and

    r_{(p)}^{n,p−1} = f_{(p)} − A_{(p)} u^{n,p−1}

is the residual subvector of dimension m_p; A_{(p)}^+ is the pseudoinverse of the matrix A_{(p)}, defined by the formula A_{(p)}^+ = A_{(p)}^t (A_{(p)} A_{(p)}^t)^{−1} if A_{(p)} has full rank. It follows from the above that I − A_{(p)}^+ A_{(p)} is a symmetric positive semidefinite matrix realizing the orthogonal projection onto the p-th subspace, which is presented geometrically by the intersection of the hyperplanes described by the i-th equations, i ∈ Ω_p. Iterative method (12) can be written in the matrix form

    u^n = Bu^{n−1} + g,   B = (I − T_l) ⋯ (I − T_1),   T_p = ω A_{(p)}^+ A_{(p)}.   (13)

Projective algorithm (12), (13) for ω = 1 and m_p = 1 presents the "pointwise" method published by S. Kaczmarz in [7]. Its different generalizations and investigations were made by many authors, see [8], [9]. In [10] the following assertion was proved for an abstract iterative projection method of the multiplicative type, with application to the domain decomposition approach:
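One possible rendering of the projection sweep (12) is the following Python/NumPy sketch; the block partitioning, the test matrix and all names are our own illustrative choices:

```python
import numpy as np

def block_kaczmarz(A, f, blocks, omega=1.0, n_sweeps=500):
    # iteration (12): sweep over the row blocks, each step projecting the
    # current iterate using the pseudoinverse A_(p)^+ = A^t (A A^t)^{-1}
    u = np.zeros(A.shape[1])
    pinvs = [np.linalg.pinv(A[idx]) for idx in blocks]
    for _ in range(n_sweeps):
        for idx, Aplus in zip(blocks, pinvs):
            u = u + omega * Aplus @ (f[idx] - A[idx] @ u)
    return u

rng = np.random.default_rng(3)
A = rng.standard_normal((12, 12)) + 12.0 * np.eye(12)   # well-conditioned test system
f = rng.standard_normal(12)
blocks = [list(range(0, 4)), list(range(4, 8)), list(range(8, 12))]
u = block_kaczmarz(A, f, blocks)
print(np.linalg.norm(A @ u - f) < 1e-6)   # True
```

With m_p = 1 (each block a single row) and ω = 1 this reduces to the pointwise Kaczmarz method mentioned above.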

Theorem 1. Let T_p, p = 1, ..., l, be s.p.d. matrices, and let the following inequalities be valid for any vector v ∈ R^N:

    (T_p v, v)/(v, v) ≤ α < 2,   p = 1, 2, ..., l;   ‖v‖² ≤ β Σ_{p=1}^{l} (T_p v, v).

Then the estimate

    ‖B‖² ≤ ρ = 1 − (2 − α)/{β[l + α²l(l − 1)/2]}

is true for the Euclidean norm ‖B‖₂. If the matrices T̄_p = ωT_p for all p satisfy the conditions

    (T̄_p v, v)/(v, v) ≤ ᾱ < 2,   ‖v‖² ≤ β̄[(T̄_1 v, v) + ... + (T̄_l v, v)],

then for ω = (ᾱ√((l − 1)l))^{−1} we have ρ = 1 − (3ᾱβ̄l)^{−1}.

It should be noted that the iteration matrix B in iterative process (13) is nonsymmetric, because the matrices T_p are not commutative in general.

Now we consider the alternating direction block version of the Kaczmarz method, in which each iteration consists of two stages. The first one realizes the conventional formulas (12) or (13), and the second stage implements similar computations but in the backward ordering of the number p:

    u^{n+1/2,p} = u^{n,p−1} + ω A_{(p)}^+ r_{(p)}^{n,p−1},   p = 1, 2, ..., l,   u^{n+1/2} = u^{n+1/2,l} = u^{n+1/2,l+1},
    u^{n+1,p} = u^{n+1/2,p+1} + ω A_{(p)}^+ r_{(p)}^{n+1/2,p+1},   p = l, ..., 2, 1,   u^{n+1} = u^{n+1,1}.   (14)


The iteration matrix of iterations (14) is the matrix product B = B_2 B_1, where B_1 coincides with B from (13) and B_2 has a similar form. Thus,

    u^{n+1} = B_2 B_1 u^n + g,   B_2 = (I − T_1)(I − T_2) ⋯ (I − T_l) = B_1^t.   (15)

Under the conditions of Theorem 1, the estimate ‖B_k‖₂ ≤ ρ is valid for each of the matrices B_1, B_2, and for the iteration matrix of the alternating direction method we have the inequality ‖B‖ ≤ ‖B_1‖·‖B_2‖ ≤ ρ² < 1. Since method (14), (15) can be presented in the form (2) with an s.p.d. matrix B, it is possible to accelerate the convergence of the iterations by means of conjugate direction methods, applied formally to the preconditioned SLAE (3), and the following result is true.

Theorem 2. The iterations of the alternating direction multiplicative projective conjugate gradient (ADMPCG) and conjugate residual (ADMPCR) methods, defined by relations (3), (5), and (6) for s = 0, 1 respectively, are convergent under the conditions of Theorem 1, and estimate (8) is valid for the number of iterations n(ε), where κ = (1 + ρ²)/(1 − ρ²) and the value ρ is determined in Theorem 1.

Now let us consider the successive multiplicative projective semi-conjugate residual (SMPSCR) method in the Krylov subspaces, which is an alternative to the above ADMPCR algorithm. The new approach is based on the acceleration of iterative process (13) with nonsymmetric iteration matrix B by means of formulas (5) and (9), where the preconditioned matrix Ã is described by (3), (13). The SMPSCR procedure requires, for computing u^{n+1}, saving in memory all previous direction vectors p^0, ..., p^n, similarly to the GMRES method [4]. These two approaches have the same convergence property because both provide minimization of the functional (r^n, r^n) in the subspace K_{n+1}(r^0, Ã). The following result is true for the successive multiplicative method.

Theorem 3. Suppose that the SMPSCR algorithm, defined by formulas (3), (5), (6) and (9), (11)-(13) for s = 1, has a diagonalizable matrix Ã = XΛX^{−1}, Λ = diag(λ_1, ..., λ_N), where λ_i are the eigenvalues of Ã and X is a square matrix whose columns are the corresponding eigenvectors. Then this method is convergent under the conditions of Theorem 1, and the following estimate is valid for the number of iterations:

    n(ε) ≤ 1 + ln((1 + √(1 − ε_1²))/ε_1) / ln γ,   ε_1 = ε/(‖X‖₂·‖X^{−1}‖₂),   γ = γ_2/γ_1.

Here γ_1 = a + √(a² − d²), γ_2 = c + √(c² − d²), where a and d are the semi-major axis and the focal distance (d² < c²) of the ellipse E(a, d, c) which includes all the values λ_i, excludes the origin, and is centered at c.

It should be noted that for the SMPSCR method, as for GMRES, different reduced versions with a bounded number of saved direction vectors p^n can be


constructed. This will decrease the computational resources needed for the implementation of the algorithm, but the quantities n(ε) will increase in these cases.

3 Additive projective methods

Let us recall that the Kaczmarz method is based on the successive projection of points of the space R^N onto the hyperplanes described by the corresponding equations of the algebraic system. A similar idea is used in the Cimmino algorithm (see [11]-[13] and the references therein), but here the projections of a given point u^n onto all the hyperplanes are made simultaneously, and the next iterative approximation is chosen by means of some averaging procedure, or linear combination, of the projected points u^{n,i}, i = 1, ..., N. Such an additive type iterative process to solve SLAE (11) can be presented in a generalized block version as

    u^{n,p} = u^{n−1} + A_{(p)}^+ r_{(p)}^{n−1},   p = 1, 2, ..., l,   u^n = (u^{n,1} + u^{n,2} + ... + u^{n,l})/l.   (16)

These relations can be written in the following matrix form:

    u^n = Bu^{n−1} + g,   B = I − l^{−1} Σ_{p=1}^{l} A_{(p)}^+ A_{(p)} = I − l^{−1} Σ_{p=1}^{l} T_p,   g = l^{−1} Σ_{p=1}^{l} A_{(p)}^+ f_{(p)},   (17)

where the matrices T_p are defined in (13). Obviously, the limit vector of this sequence, u = lim_{n→∞} u^n, if it exists, satisfies the preconditioned system of equations

    Ãu = f̃,   Ã = Σ_{p=1}^{l} T_p,   f̃ = Σ_{p=1}^{l} A_{(p)}^+ f_{(p)}.   (18)
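A minimal sketch of the additive process (16) follows (Python/NumPy; the partitioning and test data are our own illustrative choices):

```python
import numpy as np

def block_cimmino(A, f, blocks, n_iter=2000):
    # iteration (16): project onto all row blocks simultaneously,
    # then average the l projected points
    u = np.zeros(A.shape[1])
    pinvs = [np.linalg.pinv(A[idx]) for idx in blocks]
    l = len(blocks)
    for _ in range(n_iter):
        u = sum(u + Aplus @ (f[idx] - A[idx] @ u)
                for idx, Aplus in zip(blocks, pinvs)) / l
    return u

rng = np.random.default_rng(4)
A = rng.standard_normal((12, 12)) + 12.0 * np.eye(12)
f = rng.standard_normal(12)
blocks = [list(range(0, 4)), list(range(4, 8)), list(range(8, 12))]
u = block_cimmino(A, f, blocks)
print(np.linalg.norm(A @ u - f) < 1e-6)   # True
```

Unlike the multiplicative sweep (12), the l projections here are independent of each other, which is exactly the parallelization advantage discussed in Remark 1 below the theorems.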

If the matrix Ã of system (18) is s.p.d., its spectral properties are obtained from the following result [10].

Theorem 4. Let the quantities 0 < α < 2 and 0 < ρ < 1 be defined as in Theorem 1. Then the spectral radius λ(Ã) of the s.p.d. matrix Ã from system (18) satisfies the inequalities

    (2 − α)(1 − ρ)/4 ≤ λ(Ã) ≤ αl.

Now we can estimate the convergence rate of the additive projective approach.


Theorem 5. Estimate (8) for the number of iterations n(ε) is valid for the conjugate gradient and conjugate residual methods applied to SLAE (18), i.e., to accelerate the additive projective algorithm (17). In this case the condition number satisfies the estimate κ(Ã) ≤ 4αl(2 − α)^{−1}(1 − ρ)^{−1}.

Remark 1. It follows from Theorems 1 and 5 that the multiplicative method is faster in comparison with the similar additive procedure. However, the latter has a considerable advantage for parallel implementation on a multi-processor computer, because the calculation of each projection onto a subspace can be done independently.

Remark 2. Theorems 1 and 4 were proved in [10] to analyse the convergence properties of the multiplicative and additive domain decomposition methods. It is evident that Theorems 2, 3 and 5 on the acceleration of projective iterative methods by means of conjugate direction or semi-conjugate direction algorithms in the Krylov subspaces can be used successively in these applications. Thus, the block variant of SLAE (11) can be interpreted as a matrix representation of the algebraic domain decomposition (ADD) formulation.

4 Iterations in Krylov subspaces with dynamic preconditioning

If we have a large problem, i.e. the original algebraic system (1) has a dimension of several millions or hundreds of millions, then it is natural to use some iterative procedure for solving the auxiliary SLAEs at each step of the block projection method (12) or (17). In this case we obtain a two-level iterative approach: at the external level we have an iterative method of the form

    u^{n+1} = B_n u^n + g_n = u^n + C_n^{−1}(f − Au^n),   B_n = I − C_n^{−1}A,   (19)

with variable (dynamic) iteration matrices B_n and preconditioning matrices C_n, and at the internal level the subsystems of dimension m_p are solved iteratively. The acceleration of iterative process (19) in the Krylov subspaces

    K_{n+1}(r^0, C_n^{−1}A) = span{C_0^{−1}r^0, AC_1^{−1}r^0, ..., A^n C_n^{−1}r^0}


can be done by the following dynamically preconditioned semi-conjugate residual (DPSCR) method:

    r^0 = f − Au^0,   p^0 = C_0^{−1}r^0;   n = 0, 1, ...:
    u^{n+1} = u^n + α_n p^n,   r^{n+1} = r^n − α_n Ap^n,
    p^{n+1} = C_{n+1}^{−1}r^{n+1} + Σ_{k=0}^{n} β_{n,k} p^k = p^{n+1,l} + Σ_{k=l}^{n} β_{n,k} p^k,
    p^{n+1,0} = C_{n+1}^{−1}r^{n+1},   p^{n+1,l} = p^{n+1,l−1} + β_{n,l−1} p^{l−1},   p^{n+1} = p^{n+1,n},
    α_n = (AC_n^{−1}r^n, r^n)/(Ap^n, Ap^n),   β_{n,k} = −(Ap^k, Ap^{n+1,k})/(Ap^k, Ap^k).   (20)

The algorithm DPSCR provides minimization of the residual norm ‖r^{n+1}‖ in the subspace K_{n+1}(r^0, C_n^{−1}A), and the following equality is true:

    ‖r^{n+1}‖² = (r^0, r^0) − (AC_0^{−1}r^0, r^0)²/(Ap^0, Ap^0) − ... − (AC_n^{−1}r^n, r^n)²/(Ap^n, Ap^n).   (21)

Thus, this method converges if the matrices C_n^{−1}A are positive definite. In order to decrease the computational complexity of the algorithm, for large n two reduced versions of method (20) can be applied. The first one is based on the procedure of periodic restarting after every m iterations. This means that for n = ml, l = 1, 2, ..., the residual vector r^n is computed not from the recurrent relation but from the original equation (r^{ml} = f − Au^{ml}), and the subsequent calculations are performed in the conventional form. The second way consists in truncated orthogonalization, i.e. for n ≥ m only the last m direction vectors p^n, ..., p^{n−m+1} and Ap^n, ..., Ap^{n−m+1} are saved in memory and used in the recursion. The following combination of these two approaches can be proposed. Let m_1 be the restart period, m_2 the number of saved orthogonal direction vectors, and n′ = n − [n/m_2]m_2, where [b] is the integer part of b. Then the unified reduced recursion for p^n is written as

    p^{n+1} = C_{n+1}^{−1}r^{n+1} + Σ_{k=n−m+1}^{n} β_{n,k} p^k,   m = min{n′, m_1}.   (22)

It is easy to show from (21) that the reduced versions of DPSCR also converge if the matrices C_n^{−1}A are positive definite for all n.
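The truncated-orthogonalization idea can be sketched as follows (Python/NumPy; the dynamic preconditioner, all names and the test data are illustrative assumptions of ours, and only the last m direction vectors are kept, in the spirit of (22)):

```python
import numpy as np

def dpscr_truncated(A, f, precond, m=20, tol=1e-10, maxit=400):
    u = np.zeros_like(f)
    r = f - A @ u
    P, AP = [], []                       # last m direction vectors p^k and A p^k
    for n in range(maxit):
        p = precond(r, n)                # dynamic preconditioning: p = C_n^{-1} r
        Ap = A @ p
        for pk, Apk in zip(P, AP):       # orthogonalize A p against saved A p^k
            beta = -(Apk @ Ap) / (Apk @ Apk)
            p = p + beta * pk
            Ap = Ap + beta * Apk
        alpha = (Ap @ r) / (Ap @ Ap)     # minimizes the residual norm
        u = u + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        P.append(p); AP.append(Ap)
        P, AP = P[-m:], AP[-m:]          # truncation: keep only m directions
    return u

rng = np.random.default_rng(5)
B = rng.standard_normal((40, 40))
A = B @ B.T + 40.0 * np.eye(40)          # positive definite test matrix
f = rng.standard_normal(40)
u = dpscr_truncated(A, f, lambda r, n: r / (1.0 + 0.01 * n))
print(np.linalg.norm(A @ u - f) < 1e-8)  # True
```

As equality (21) suggests, each step can only decrease the residual norm, so the truncated version still converges for positive definite C_n^{−1}A, only more slowly than the full recursion.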

References

1. G. Golub, C. Van Loan. Matrix Computations. The Johns Hopkins Univ. Press, Baltimore, 1989.
2. O. Axelsson. Iterative Solution Methods. Cambridge Univ. Press, New York, 1994.
3. V. P. Il'in. Iterative Incomplete Factorization Methods. World Scientific Publ., Singapore, 1992.
4. Y. Saad. Iterative Methods for Sparse Linear Systems. PWS Publ., New York, 1996.
5. S. C. Eisenstat, H. C. Elman, M. H. Schultz. Variational iterative methods for nonsymmetric systems of linear equations. SIAM J. Numer. Anal., 20 (1983), pp. 345-357.
6. C. L. Lawson, R. J. Hanson. Solving Least Squares Problems. Prentice-Hall, Inc., New Jersey, 1974.
7. S. Kaczmarz. Angenäherte Auflösung von Systemen linearer Gleichungen. Bull. Internat. Acad. Polon. Sci. Lettres A, 335-357 (1937). Translated into English: Int. J. Control 57(6): 1269-1271 (1993).
8. K. Tanabe. Projection method for solving a singular system of linear equations and its applications. Numer. Math., 17 (1971), pp. 203-214.
9. V. P. Il'in. On the iterative Kaczmarz method and its generalizations (in Russian). Sib. J. Industr. Math., 9 (2006), pp. 39-49.
10. J. H. Bramble, J. E. Pasciak, J. Wang, J. Xu. Convergence estimates for product iterative methods with applications to domain decomposition. Math. of Comput., 57 (1991), no. 195, pp. 1-21.
11. G. Cimmino. Calcolo approssimato per le soluzioni dei sistemi di equazioni lineari. La Ricerca Scientifica, II, 9 (1938), pp. 326-333.
12. R. Bramley, A. Sameh. Row projection methods for large nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 13 (1992), pp. 168-193.
13. G. Appleby, D. C. Smolarski. A linear acceleration row action method for projecting onto subspaces. Electronic Transactions on Numer. Anal., 20 (2005), pp. 243-275.

Some Relationships between Optimal Preconditioner and Superoptimal Preconditioner

Jian-Biao Chen¹,⋆, Xiao-Qing Jin²,⋆⋆, Yi-Min Wei³,⋆⋆⋆, and Zhao-Liang Xu¹,†

¹ Department of Mathematics, Shanghai Maritime University, Shanghai 200135, P. R. China.
² Department of Mathematics, University of Macau, Macao, P. R. China. xqjin@umac.mo
³ Institute of Mathematics, School of Mathematical Sciences, Fudan University, Shanghai 200433, P. R. China. ymwei@fudan.edu.cn

Abstract. For any given n-by-n matrix A_n, a specific circulant preconditioner t_F(A_n) proposed by E. Tyrtyshnikov [SIAM J. Matrix Anal. Appl., Vol. 13 (1992), pp. 459-473] is defined to be the solution of

    min_{C_n} ‖I_n − C_n^{−1}A_n‖_F

over all n-by-n nonsingular circulant matrices C_n. The preconditioner t_F(A_n), called the superoptimal circulant preconditioner, has been proved to be a good preconditioner for a large class of structured systems, including some ill-conditioned problems from image processing. In this paper, we study this preconditioner from an operator viewpoint. We will give some relationships between the optimal preconditioner (operator) proposed by T. Chan [SIAM J. Sci. Statist. Comput., Vol. 9 (1988), pp. 766-771] and the superoptimal preconditioner (operator).

Keywords: optimal preconditioner, superoptimal preconditioner.

1 Introduction

In 1986, circulant preconditioners were proposed for solving Toeplitz systems [18, 22] by the preconditioned conjugate gradient method. Since then, the use of

⋆ The research of this author is partially sponsored by the Hi-Tech Research and Development Program of China (grant number: 2007AA11Z249).
⋆⋆ The research of this author is supported by the research grant RG-UL/0708S/Y1/JXQ/FST from the University of Macau.
⋆⋆⋆ The research of this author is supported by the National Natural Science Foundation of China under grant 10871051 and by the Shanghai Science and Technology Committee under grant 08511501703.
† The research of this author is supported by the National Natural Science Foundation of China under grant 10871051.


circulant preconditioners for solving structured systems has been studied extensively [4, 10-12, 16, 17, 19, 20]. In 1988, T. Chan [6] proposed a specific circulant preconditioner as follows. For an arbitrary matrix A_n, T. Chan's circulant preconditioner c_F(A_n) is defined to be the minimizer of the Frobenius norm

    min_{C_n} ‖A_n − C_n‖_F,

where C_n runs over all circulant matrices. The c_F(A_n) is called the optimal circulant preconditioner in [6]. A generalization of the optimal circulant preconditioner is defined in [9]. More precisely, given a unitary matrix U ∈ C^{n×n}, let

    M_U ≡ {U*Λ_nU | Λ_n is any n-by-n diagonal matrix}.   (1)

The optimal preconditioner c_U(A_n) is defined to be the minimizer of

    min_{W_n} ‖A_n − W_n‖_F,

where W_n runs over M_U. We remark that in (1), when U = F, the Fourier matrix, M_F is the set of all circulant matrices [8], and then c_U(A_n) turns back to c_F(A_n). The matrix U can also be taken as another fast discrete transform matrix such as the discrete Hartley matrix, the discrete sine matrix or the discrete cosine matrix, etc., and then M_U is the set of matrices that can be diagonalized by the corresponding fast transform [2, 4, 10, 17]. We refer to [14] for a survey of the optimal preconditioner.

Now we introduce the superoptimal circulant preconditioner proposed by Tyrtyshnikov in 1992 [23]. For an arbitrary matrix A_n, the superoptimal circulant preconditioner t_F(A_n) is defined to be the minimizer of

    min_{C_n} ‖I_n − C_n^{−1}A_n‖_F,

where C_n runs over all nonsingular circulant matrices. The generalized superoptimal preconditioner t_U(A_n) is defined to be the minimizer of

    min_{W_n} ‖I_n − W_n^{−1}A_n‖_F,

where W_n runs over all nonsingular matrices in M_U given by (1). Again, t_U(A_n) turns back to t_F(A_n) when U = F. In this paper, we study the superoptimal preconditioner from an operator viewpoint. We will give some relationships between the optimal preconditioner (operator) and the superoptimal preconditioner (operator).

Now we introduce some lemmas which will be used later. Let δ(E_n) denote the diagonal matrix whose diagonal is equal to the diagonal of the matrix E_n.


Lemma 1. ([3]) Let A_n ∈ C^{n×n}. Then c_U(A_n) ≡ U*δ(UA_nU*)U.

For a relationship between c_U(A_n) and t_U(A_n), we have

Lemma 2. ([3]) Let A_n ∈ C^{n×n} be such that A_n and c_U(A_n) are invertible. Then

    t_U(A_n) ≡ c_U(A_nA_n*) c_U(A_n*)^{−1}.
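Lemmas 1 and 2 give a direct way to compute both preconditioners when U = F. The following Python/NumPy sketch (dense FFT matrices, small n, and test data of our own choosing) builds c_F(A_n) and t_F(A_n) and checks that the superoptimal preconditioner indeed achieves a smaller value of ‖I_n − C_n^{−1}A_n‖_F, since c_F(A_n) is itself a nonsingular circulant competitor:

```python
import numpy as np

def c_F(A):
    # optimal circulant preconditioner via Lemma 1 with U = F (Fourier):
    # c_F(A) = F* diag(F A F*) F  -- dense matrices, fine for small n
    n = A.shape[0]
    F = np.fft.fft(np.eye(n)) / np.sqrt(n)     # unitary Fourier matrix
    d = np.diag(F @ A @ F.conj().T)
    return F.conj().T @ np.diag(d) @ F

def t_F(A):
    # superoptimal circulant preconditioner via Lemma 2:
    # t_F(A) = c_F(A A*) c_F(A*)^{-1}
    return c_F(A @ A.conj().T) @ np.linalg.inv(c_F(A.conj().T))

rng = np.random.default_rng(6)
n = 8
A = rng.standard_normal((n, n))
r_opt = np.linalg.norm(np.eye(n) - np.linalg.solve(c_F(A), A))
r_sup = np.linalg.norm(np.eye(n) - np.linalg.solve(t_F(A), A))
print(r_sup <= r_opt + 1e-10)   # True: t_F minimizes ||I - C^{-1} A||_F
```

In practice one would use FFTs of vectors rather than dense Fourier matrices, but the dense form matches the formulas of the lemmas most transparently.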

Lemma 3. ([3, 7]) For any matrix A_n ∈ C^{n×n},

    δ(UA_nA_n*U*) − δ(UA_nU*) · δ(UA_n*U*)

is a positive semi-definite diagonal matrix.
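Lemma 3 is easy to verify numerically; the i-th entry of δ(UA_nA_n*U*) is the squared norm of the i-th row of UA_nU*, which dominates the squared modulus of its diagonal entry. A quick Python/NumPy check with U = F (the random test matrix is our choice):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 16
F = np.fft.fft(np.eye(n)) / np.sqrt(n)           # unitary Fourier matrix
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

d1 = np.diag(F @ A @ A.conj().T @ F.conj().T)    # diag of  U A A* U*
d2 = np.diag(F @ A @ F.conj().T) * np.diag(F @ A.conj().T @ F.conj().T)
# the difference is a non-negative real diagonal, as Lemma 3 states
print(np.all((d1 - d2).real >= -1e-10), np.max(np.abs((d1 - d2).imag)) < 1e-10)
```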

2 Relationships between c_U and t_U

The optimal preconditioner was studied from an operator viewpoint in [3]. Let the Banach algebra of all n-by-n matrices over the complex field, equipped with a matrix norm ‖·‖, be denoted by (C^{n×n}, ‖·‖), and let (M_U, ‖·‖) be the subalgebra of (C^{n×n}, ‖·‖). We note that M_U is an inverse-closed, commutative algebra. Let t_U be an operator from (C^{n×n}, ‖·‖) to (M_U, ‖·‖) such that for any A_n in C^{n×n}, t_U(A_n) is the minimizer of ‖I_n − W_n^{−1}A_n‖_F over all nonsingular W_n ∈ M_U. Before we discuss the operator t_U in detail, we introduce the following theorem, which is concerned with the operator norms of c_U.

Theorem 1. ([2, 3])

For all n ≥ 1, we have

(i) ‖c_U‖_F ≡ sup_{‖A_n‖_F=1} ‖c_U(A_n)‖_F = 1;
(ii) ‖c_U‖₂ ≡ sup_{‖A_n‖₂=1} ‖c_U(A_n)‖₂ = 1.

The following theorem includes some properties of t_U(A_n).

Theorem 2. Let A_n ∈ C^{n×n} with n ≥ 1 be such that A_n and c_U(A_n) are invertible. We have

(i) t_U(αA_n) = α t_U(A_n) for all α ∈ C.
(ii) t_U(A_n*) = t_U(A_n)* for any normal matrix A_n.
(iii) t_U(B_nA_n) = B_n t_U(A_n) for B_n ∈ M_U, if B_nA_n and c_U(B_nA_n) are invertible.
(iv) t_U(A_n) is stable for any normal and stable matrix A_n. We recall that a matrix is stable if all the real parts of its eigenvalues are negative.

Proof. For (i), if α = 0, (i) holds obviously. If α ≠ 0, it follows from Lemma 2 that

    t_U(αA_n) = c_U(αA_n ᾱA_n*) c_U(ᾱA_n*)^{−1} = αᾱ c_U(A_nA_n*)[ᾱ c_U(A_n*)]^{−1} = α c_U(A_nA_n*) c_U(A_n*)^{−1} = α t_U(A_n).

For (ii), we have by Lemma 2 again,

    t_U(A_n)* = [c_U(A_nA_n*) c_U(A_n*)^{−1}]* = [c_U(A_n*)^{−1}]* c_U(A_nA_n*) = c_U(A_n)^{−1} c_U(A_nA_n*),

and then by Lemma 1,

    t_U(A_n*) = c_U(A_n*A_n) c_U(A_n)^{−1} = U*δ(UA_n*A_nU*)U U*δ(UA_nU*)^{−1}U = U*δ(UA_nU*)^{−1}U U*δ(UA_n*A_nU*)U = c_U(A_n)^{−1} c_U(A_n*A_n).

Since A_n is normal, we obtain t_U(A_n*) = t_U(A_n)*.

For (iii), we have

    t_U(B_nA_n) = c_U(B_nA_nA_n*B_n*) c_U(A_n*B_n*)^{−1} = B_n c_U(A_nA_n*)B_n* c_U(A_n*B_n*)^{−1} = B_n c_U(A_nA_n*) c_U(A_n*)^{−1} = B_n t_U(A_n).

For (iv), it follows from [15] that δ(UA_nU*) and δ(UA_n*U*) are stable. Since δ(UA_nA_n*U*) is a positive diagonal matrix, we know that δ(UA_nA_n*U*) · δ(UA_n*U*)^{−1} is also stable. □

In general, we remark that (ii) is not true. For example, let U = I_2 and

    A_2 = [ 1  1 ;  0  2 ].

It is easy to verify that t_U(A_2*) ≠ t_U(A_2)*.

Theorem 3. Let A_n ∈ C^{n×n} with n ≥ 1 be such that A_n and c_U(A_n) are invertible. We have

(i) sup_{‖A_n‖_F=1} ‖t_U(A_n)‖_F ≥ 1;
(ii) sup_{‖A_n‖₂=1} ‖t_U(A_n)‖₂ ≥ 1.

Proof. For (i), we have by Lemmas 1 and 2,

    t_U(A_n) = U*δ(UA_nA_n*U*)δ(UA_n*U*)^{−1}U.

Notice that from Lemma 3 and the invertibility of A_n and c_U(A_n),

    δ(UA_nA_n*U*) ≥ δ(UA_nU*)δ(UA_n*U*) ≥ 0,

where M ≥ N for any matrices M and N means that all the entries of M − N are non-negative. We obtain

    |δ(UA_nA_n*U*)δ(UA_n*U*)^{−1}| ≥ |δ(UA_nU*)| ≥ 0,   (2)

where |Q| = [|q_{ij}|] for any matrix Q = [q_{ij}]. Thus we have by (2) and Theorem 1,

    sup_{‖A_n‖_F=1} ‖t_U(A_n)‖_F = sup_{‖A_n‖_F=1} ‖δ(UA_nA_n*U*)δ(UA_n*U*)^{−1}‖_F ≥ sup_{‖A_n‖_F=1} ‖δ(UA_nU*)‖_F = sup_{‖A_n‖_F=1} ‖c_U(A_n)‖_F = ‖c_U‖_F = 1.

For (ii), it follows from (2) that

    ‖t_U(A_n)‖₂ = ‖δ(UA_nA_n*U*)δ(UA_n*U*)^{−1}‖₂ ≥ ‖δ(UA_nU*)‖₂ = ‖c_U(A_n)‖₂.

Hence by Theorem 1 again,

    sup_{‖A_n‖₂=1} ‖t_U(A_n)‖₂ ≥ ‖c_U‖₂ = 1. □

Finally, we give a relationship in unitarily invariant norms between c_U(A_n)^{−1}A_n and t_U(A_n)^{−1}A_n.

Theorem 4. Let A_n ∈ C^{n×n} with n ≥ 1 be such that A_n and c_U(A_n) are invertible. For every unitarily invariant norm ‖·‖, we have

    ‖t_U(A_n)^{−1}A_n‖ ≤ ‖c_U(A_n)^{−1}A_n‖.

Proof. It follows from [13, Theorem 2.2] that if the singular values are ordered in the decreasing way σ_1 ≥ σ_2 ≥ ⋯ ≥ σ_n, then

    σ_k[t_U(A_n)^{−1}A_n] ≤ σ_k[c_U(A_n)^{−1}A_n],   k = 1, 2, ..., n.

Thus, for every unitarily invariant norm ‖·‖, the result holds from [21, p. 79, Theorem 3.7]. □


References

1. D. Bertaccini. A Circulant Preconditioner for the Systems of LMF-Based ODE Codes. SIAM J. Sci. Comput., Vol. 22 (2000), pp. 767-786.
2. R. Chan and X. Jin. An Introduction to Iterative Toeplitz Solvers. SIAM, Philadelphia, 2007.
3. R. Chan, X. Jin and M. Yeung. The Circulant Operator in the Banach Algebra of Matrices. Linear Algebra Appl., Vol. 149 (1991), pp. 41-53.
4. R. Chan and M. Ng. Conjugate Gradient Methods for Toeplitz Systems. SIAM Review, Vol. 38 (1996), pp. 427-482.
5. R. Chan, M. Ng and X. Jin. Strang-Type Preconditioners for Systems of LMF-Based ODE Codes. IMA J. Numer. Anal., Vol. 21 (2001), pp. 451-462.
6. T. Chan. An Optimal Circulant Preconditioner for Toeplitz Systems. SIAM J. Sci. Statist. Comput., Vol. 9 (1988), pp. 766-771.
7. C. Cheng, X. Jin, S. Vong and W. Wang. A Note on Spectra of Optimal and Superoptimal Preconditioned Matrices. Linear Algebra Appl., Vol. 422 (2007), pp. 482-485.
8. P. Davis. Circulant Matrices. 2nd ed., Chelsea Publishing, New York, 1994.
9. T. Huckle. Circulant and Skew Circulant Matrices for Solving Toeplitz Matrix Problems. SIAM J. Matrix Anal. Appl., Vol. 13 (1992), pp. 767-777.
10. X. Jin. Developments and Applications of Block Toeplitz Iterative Solvers. Science Press, Beijing; and Kluwer Academic Publishers, Dordrecht, 2002.
11. X. Jin. Three Useful Preconditioners in Structured Matrix Computations. Proceedings of the 4th ICCM (2007), Vol. III, pp. 570-591. Eds: L.-Z. Ji, K.-F. Liu, L. Yang and S.-T. Yau, Higher Education Press, Beijing, 2007.
12. X. Jin and Y. Wei. Numerical Linear Algebra and Its Applications. Science Press, Beijing, 2004.
13. X. Jin and Y. Wei. A Short Note on Singular Values of Optimal and Superoptimal Preconditioned Matrices. Int. J. Comput. Math., Vol. 84 (2007), pp. 1261-1263.
14. X. Jin and Y. Wei. A Survey and Some Extensions of T. Chan's Preconditioner. Linear Algebra Appl., Vol. 428 (2008), pp. 403-412.
15. X. Jin, Y. Wei and W. Xu. A Stability Property of T. Chan's Preconditioner. SIAM J. Matrix Anal. Appl., Vol. 25 (2003), pp. 627-629.
16. T. Ku and C. Kuo. Design and Analysis of Toeplitz Preconditioners. IEEE Trans. Signal Process., Vol. 40 (1992), pp. 129-141.
17. M. Ng. Iterative Methods for Toeplitz Systems. Oxford University Press, Oxford, 2004.
18. J. Olkin. Linear and Nonlinear Deconvolution Problems. Ph.D. thesis, Rice University, Houston, 1986.
19. D. Potts and G. Steidl. Preconditioners for Ill-Conditioned Toeplitz Matrices. BIT, Vol. 39 (1999), pp. 579-594.
20. S. Serra. Preconditioning Strategies for Asymptotically Ill-Conditioned Block Toeplitz Systems. BIT, Vol. 34 (1994), pp. 579-594.
21. G. Stewart and J. Sun. Matrix Perturbation Theory. Academic Press, Boston, 1990.
22. G. Strang. A Proposal for Toeplitz Matrix Calculations. Stud. Appl. Math., Vol. 74 (1986), pp. 171-176.
23. E. Tyrtyshnikov. Optimal and Super-Optimal Circulant Preconditioners. SIAM J. Matrix Anal. Appl., Vol. 13 (1992), pp. 459-473.

Scaling, Preconditioning, and Superlinear Convergence in GMRES-type Iterations⋆

Igor Kaporin

Computing Center of Russian Academy of Sciences, Vavilova 40, Moscow 119991, Russia
kaporin@ccas.ru

Abstract. A theoretical justification is found for several standard techniques related to ILU preconditioning, such as pre-scaling and pivot modification, with implications for practical implementation. An improved estimate for the reduction of the GMRES residual is obtained within the general framework of two-stage preconditioning. In particular, an estimate in terms of a conditioning measure of the scaled coefficient matrix and the Frobenius norm of the scaled ILU residual is presented.

Keywords: unsymmetric sparse matrix, two-sided scaling, incomplete LU preconditioning, two-stage preconditioning, superlinear convergence.

1 Introduction

⋆ This work was partially supported through the Presidium of Russian Academy of Sciences program P-14 and the program "Leading Scientific Schools" (project NSh-2240.2006.1).

In the present paper we address certain theoretical issues related to the construction of computational methods for the numerical solution of large linear systems with general nonsingular unsymmetric sparse coefficient matrices.

As is known, direct solvers (which are based on the "exact" sparse triangular factorization of the matrix) represent a quite robust, advanced and well-established piece of numerical software. As an example, one can refer to the UMFPACK solver [5], which implements an unsymmetric multifrontal sparse Gauss elimination. However, the sparsity structure inherent to many important classes of problems (such as fully three-dimensional discrete models) is rather unsuitable for such methods. This is due to the huge volumes of intermediate data generated by a direct solver (namely, arrays holding the nonzero elements of the triangular factors), which are many orders of magnitude larger than the order of the system. Moreover, the corresponding computation time grows even faster than the storage space as the linear system size increases.

An alternative to direct solvers is represented by iterative methods. Unfortunately, "classical" fixed-storage simplistic schemes (for instance, the ILU(0)


preconditioned GMRES(m) method) are completely unreliable for general unsymmetric linear systems. More promising are the incomplete LU-type preconditioned Krylov subspace iterative solvers based on the approximate factorization "by value", without any restrictions on the sparsity of the triangular factors. An appropriate use of the "approximate" triangular factorization makes it possible to generate much more compact triangular factors as compared to those arising in direct solvers.

It should be stressed that almost all results and techniques developed for the "exact" LU factorization need to be essentially revisited and reformed in order to be useful for the purpose of efficient preconditioning of the Krylov subspace iterations. In this case, for instance, a careful pivoting (strictly targeted at "as good as possible" diagonal dominance in the approximate triangular factors) appears to be much more important than any near-optimum pre-ordering, or even the dynamic account for the local fill-in [12].

It can definitely be stated that the currently available software products implementing preconditioned iterative sparse linear solvers still suffer from the following deficiencies: (a) their reliability is still worse than that of direct solvers; (b) in order to provide satisfactory reliability and efficiency, they require quite complicated tuning of the solver control parameters (which are related to the numerical algorithm itself rather than to the problem solved).

Below we present a superlinear convergence estimate for preconditioned GMRES-type iterative linear equation solvers. The formulation of the result is specifically adjusted to the case when the preconditioning is based on an approximate triangular factorization applied to a pre-scaled coefficient matrix. Hence, in addition to many empirical observations (see, e.g., [2]), a certain theoretical evidence is provided for the considered robust iterative solvers.

2 Problem setting

Consider the linear algebraic system

    Ax = b    (1)

with a general unsymmetric nonsingular sparse n × n matrix A. The incomplete LU (ILU) preconditioned GMRES-type iterative methods use a preconditioner matrix C ≈ A which is obtained from the ILU equation

    A = PLUQ + E,

where L and U are nonsingular lower and upper triangular matrices, respectively, while P and Q are permutation matrices. Hence, the preconditioner is given by

    C = PLUQ,    (2)


which is obviously an "easily invertible" matrix. The additive term E is the ILU error matrix, for which a standard assumption is

    |(E)_ij| = O(τ),    (3)

where τ ≪ 1 is a prescribed threshold parameter. Note that a more general structure of the error matrix is admissible in preconditioned GMRES-type methods, namely,

    |(E − X)_ij| = O(τ),    rank(X) = O(1),    (4)

which was proposed in [19] in the context of preconditioned Toeplitz-like system solvers. The low-rank term in the ILU error matrix may arise from the use of pivot correction, a technique which can be helpful in the case of diagonal pivoting; see [12] for more detail.

We consider the preconditioned Krylov subspace iterative solver for the unsymmetric linear system (1) as an application of GMRES iterations [16] to the right-preconditioned system

    AC⁻¹ y = b,    (5)

so that the solution of (1) is obtained as x = C⁻¹ y.

Note: Under a proper choice of the permutation matrices P and Q (mainly aimed at improving the diagonal dominance of L and U), one can observe that

    ‖E‖²_F ≡ trace(EᵀE) = O(nτ²),    (6)

i.e., only relatively few nonzero entries of E attain their maximum allowed magnitude. At the same time, the stability of the triangular factors is often improved (more precisely, the ratio cond(C)/cond(A) is not large), which is desirable from the numerical stability viewpoint.
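The construction above (a threshold-ILU preconditioner C and GMRES applied to the right-preconditioned system (5)) can be sketched with standard sparse tools. This is only a sketch: scipy's spilu stands in for the paper's permuted threshold-ILU factorization (2), and the test matrix is an assumption, not one of the paper's problems.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Small unsymmetric sparse test matrix (assumption: any nonsingular
# sparse matrix would do here).
n = 200
A = sp.diags([-1.3 * np.ones(n - 1), 2.0 * np.ones(n), -0.7 * np.ones(n - 1)],
             [-1, 0, 1], format="csc")
b = A @ np.ones(n)

# Incomplete LU "by value": entries below a threshold are dropped.
ilu = spla.spilu(A, drop_tol=1e-3)
apply_Cinv = ilu.solve                        # action of C^{-1}

# Right preconditioning as in (5): solve (A C^{-1}) y = b, then x = C^{-1} y.
AC = spla.LinearOperator((n, n), matvec=lambda v: A @ apply_Cinv(v))
y, info = spla.gmres(AC, b)
x = apply_Cinv(y)
print(info, np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```

Note that the residual of the right-preconditioned system coincides with the true residual b − Ax, which is why right preconditioning is preferred when residual-based stopping tests are used.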

3 Scaling techniques

It was noted by many authors (see, e.g., [2, 13] and references therein) that the ILU factorization "by value" applied to a properly two-sided scaled coefficient matrix

    A_S = D_L A D_R    (7)

may yield much better preconditioning compared to similar algorithms applied to the original coefficient matrix A (especially in several hard-to-solve cases; see also [12]). The incomplete triangular factorization (now applied to the scaled matrix (7)) yields the equation

    A_S = P_S L_S U_S Q_S + E_S,


where P_S and Q_S are permutation matrices arising due to the pre-ordering and pivoting applied to the scaled matrix. Hence, according to (7) the resulting preconditioner is

    D_L⁻¹ P_S L_S U_S Q_S D_R⁻¹ ≡ C ≈ A.

Note that in an actual implementation the latter preconditioning can readily be transformed to the same form (2) (though with different triangular factors, even if the permutations were the same).

Next we will construct a preconditioning quality measure via (i) a special condition number of A_S (presenting the scaling quality measure) and (ii) the Frobenius norm of the scaled ILU error matrix E_S. The corresponding ILU-GMRES convergence estimate can be referred to as a constructive one, because the residual norm bound is expressed literally via the very functionals which are expected to be directly optimized in the procedures of scaling and approximate factorization. Moreover, the improvement of the scaling quality and the attained value of the factorization quality criteria can readily be evaluated a posteriori (numerically).

Note: Below in Section 5 we present a convergence estimate for the GMRES method which does not depend on quantities of the type cond(D_L), ‖E‖, or ‖(LU)⁻¹‖. Taking into account that our result holds in exact arithmetic, one can conclude that "bad" (i.e., too large) values of quality indicators often associated with ILU preconditioning, such as (a) the condition numbers of the scaling matrices D_L and D_R, (b) the size of the elements of the unscaled "original" error matrix E, and (c) the norm of the inverse of the scaled preconditioner, may have their destructive effect on the GMRES convergence only in the presence of round-off errors.

4 How to estimate GMRES convergence

From now on, let ‖·‖ denote the matrix spectral norm

    ‖B‖ = max_{z≠0} ‖Bz‖/‖z‖,    ‖z‖ = √(zᵀz).    (8)

For the kth residual

    r_k = b − A x_k    (9)

in the preconditioned minimum residual method (also known as GMRES(∞), cf. [16]) one has, by construction,

    ‖r_k‖ = min_{P_k(0)=1} ‖P_k(M) r_0‖ = ‖P_k^*(M) r_0‖.    (10)

Here

    M = A C⁻¹    (11)

is the right-preconditioned matrix and P_k^*(·) is the polynomial determined at the kth step of the minimum residual method; this polynomial has degree not greater than k and is normalized by the condition P_k^*(0) = 1. For the sake of simplicity, let M be diagonalizable, that is,

    M = V Λ V⁻¹.    (12)

Here, the columns of V are the (normalized) eigenvectors v_1, v_2, ..., v_n of M, and the entries of the diagonal matrix Λ are the corresponding eigenvalues λ_1, λ_2, ..., λ_n of M. Using (10) and (12), one finds

    ‖r_k‖ = ‖P_k^*(M) r_0‖ ≤ ‖P̃_k(M) r_0‖ = ‖V P̃_k(Λ) V⁻¹ r_0‖ = ‖(VD) P̃_k(Λ) (VD)⁻¹ r_0‖
          ≤ ‖VD‖ ‖P̃_k(Λ)‖ ‖(VD)⁻¹‖ ‖r_0‖ = κ ‖P̃_k(Λ)‖ ‖r_0‖ = κ max_{1≤i≤n} |P̃_k(λ_i)| ‖r_0‖,    (13)

which holds for an arbitrary polynomial P̃_k of degree not greater than k normalized by the condition P̃_k(0) = 1, with D taken as the minimizing diagonal matrix in

    κ = min_{D diag.} ‖VD‖ ‖(VD)⁻¹‖ = min_{D diag.} cond(VD);    (14)

hereafter this notation is used to denote the minimal condition number of VD over nonsingular diagonal matrices D.

Nontrivial choices of P̃_k(·) and upper bounds for max_{1≤i≤n} |P̃_k(λ_i)| are typically obtained via separation of the spectrum of M into a cluster part and an outlying part; see, for instance, [7, 4, 3] for the case of an SPD matrix M, and [6, 15, 16, 21] for the general case. In [18], an alternative technique is used which allows one to relax the diagonalizability condition (12).

A standard approach to the analysis of preconditioned iterations is to use the general theory of Krylov subspace methods for the preconditioned system (5) via substitution (11). Unfortunately, one can hardly estimate and/or control any related properties of the preconditioned matrix M, even a posteriori. It is not known how one can effectively relate any characteristics of the "localization" or "distribution" of the eigenvalue spectrum of M to the result of preconditioning. For instance, in general even tr(M) is very hard to estimate (its exact evaluation appears to cost as much as n solutions of the original linear system). Therefore, we actually reject the use of the preconditioned spectrum as an "interface" between the preconditioning and the iterations. Instead, we separately use some properties of the two factors E_S and A_S⁻¹ in the multiplicative splitting

    D_L (I − M⁻¹) D_L⁻¹ = (A_S − C_S) A_S⁻¹ = E_S A_S⁻¹.


This conforms well to the two-stage preconditioning scheme, where at the first stage one improves some conditioning measure of the matrix A_S by the choice of D_L and D_R, and at the second stage one seeks an easily invertible C_S which directly approximates A_S.

5 Superlinear GMRES convergence via scaled error matrix

Let us denote the singular values of a real n × n matrix Z as

    σ_1(Z) ≥ σ_2(Z) ≥ ··· ≥ σ_n(Z) ≥ 0.

Recalling the definition of the Frobenius matrix norm given in (6) and taking into account that σ_i(Z)² is exactly the ith eigenvalue of ZᵀZ, one has

    ‖Z‖²_F = tr(ZᵀZ) = Σ_{i=1}^n σ_i(Z)².    (15)

Moreover, by (det Z)² = det(ZᵀZ) it follows that

    |det Z| = Π_{i=1}^n σ_i(Z).    (16)
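Identities (15) and (16) are easy to confirm numerically on a small random matrix (the matrix itself is an arbitrary assumption):

```python
import numpy as np

# Check (15): ||Z||_F^2 equals the sum of squared singular values,
# and (16): |det Z| equals the product of the singular values.
rng = np.random.default_rng(0)
Z = rng.standard_normal((6, 6))
sigma = np.linalg.svd(Z, compute_uv=False)   # sigma_1 >= ... >= sigma_n

frob_sq = np.sum(Z * Z)                      # tr(Z^T Z)
print(frob_sq, np.sum(sigma**2))             # agree up to rounding
print(abs(np.linalg.det(Z)), np.prod(sigma)) # agree up to rounding
```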

5.1 Main result

Next we present a scale-invariant generalization of the preconditioned GMRES convergence result presented earlier in [12].

Theorem 1. Let C_S be a preconditioner for the scaled matrix A_S as defined in (7), and let the iterates x_k be generated by the GMRES(∞) method with the preconditioner C = D_L⁻¹ C_S D_R⁻¹. Then the kth residual r_k = b − A x_k satisfies

    ‖r_k‖/‖r_0‖ ≤ κ K(A_S) ( (4en/k) sin²[C_S, A_S] )^{k/2},    k = 1, 2, ..., n,    (17)

where e = exp(1), the quantity κ was defined in (14),

    K(Z) = ( n^{-1/2} ‖Z‖_F )^n / |det Z|    (18)

denotes the unsymmetric K-condition number of a nonsingular matrix Z, and

    sin²[Y, Z] = 1 − (tr ZᵀY)² / ( ‖Y‖²_F ‖Z‖²_F )    (19)

denotes the squared sine of the Euclidean acute angle between the matrices Y and Z.


Proof. Let us define the scalar

    ξ = trace(A_Sᵀ C_S) / ‖C_S‖²_F.    (20)

(Note that if C_S ≈ A_S, then ξ ≈ 1.) Let the eigenvalues of the preconditioned matrix M = AC⁻¹ (recall that C = D_L⁻¹ C_S D_R⁻¹) be numbered by decreasing distance to ξ:

    |ξ − λ_1| ≥ |ξ − λ_2| ≥ ··· ≥ |ξ − λ_n| ≥ 0.    (21)

Following the techniques introduced in [24] (see also [11]), let us consider the polynomial P̃_k of the form

    P̃_k(λ) = Π_{i=1}^k ( 1 − λ/λ_i ).

Taking into account that P̃_k(λ_j) = 0 for 1 ≤ j ≤ k and using the above ordering of the eigenvalues, one can deduce from (13) the following residual norm estimate:

    (1/κ) ‖r_k‖/‖r_0‖ ≤ max_{1≤j≤n} |P̃_k(λ_j)| = max_{k<j≤n} |P̃_k(λ_j)| = max_{k<j≤n} Π_{i=1}^k |1 − λ_j/λ_i|
        = max_{k<j≤n} Π_{i=1}^k |(ξ − λ_i) − (ξ − λ_j)| / |λ_i|
        ≤ max_{k<j≤n} Π_{i=1}^k ( |ξ − λ_i| + |ξ − λ_j| ) / |λ_i|
        ≤ 2^k Π_{i=1}^k |ξ − λ_i| / |λ_i| = 2^k Π_{i=1}^k |1 − ξ/λ_i|
        = 2^k Π_{i=1}^k |λ_{π(i)}(I − ξM⁻¹)| ≤ 2^k Π_{i=1}^k |λ_i(I − ξM⁻¹)|,

where the index permutation π(i) corresponds to the reordering of the original numbering (21) according to decreasing moduli of the eigenvalues of the matrix I − ξM⁻¹:

    |λ_1(I − ξM⁻¹)| ≥ |λ_2(I − ξM⁻¹)| ≥ ··· ≥ |λ_n(I − ξM⁻¹)|.

Next we use the identity

    I − ξM⁻¹ = I − ξCA⁻¹ = (A − ξC)A⁻¹ = D_L⁻¹ (A_S − ξC_S) A_S⁻¹ D_L

and apply the classical inequalities

    Π_{i=1}^k |λ_i(XY)| ≤ Π_{i=1}^k σ_i(XY) ≤ Π_{i=1}^k σ_i(X) σ_i(Y),


the left of which is known as the Weyl inequality (written for Z = XY), while the right one was found by Horn (see [14]), with X = A_S − ξC_S and Y = A_S⁻¹. This yields the following estimate:

    (1/κ) ‖r_k‖/‖r_0‖ ≤ 2^k Π_{i=1}^k |λ_i(I − ξM⁻¹)|
        = 2^k Π_{i=1}^k |λ_i( D_L⁻¹ (A_S − ξC_S) A_S⁻¹ D_L )|
        = 2^k Π_{i=1}^k |λ_i( (A_S − ξC_S) A_S⁻¹ )|
        ≤ 2^k Π_{i=1}^k σ_i( (A_S − ξC_S) A_S⁻¹ )
        ≤ 2^k Π_{i=1}^k σ_i(A_S − ξC_S) σ_i(A_S⁻¹)
        = 2^k ( Π_{i=1}^k (σ_i(A_S − ξC_S))² )^{1/2} ( Π_{i=1}^k σ_i(A_S⁻¹) ).    (22)

The first product can be bounded using the inequality between the arithmetic and geometric means,

    ( Π_{i=1}^m η_i )^{1/m} ≤ (1/m) Σ_{i=1}^m η_i,    η_i ≥ 0,    (23)

taken with m = k and η_i = (σ_i(A_S − ξC_S))²:

    ( Π_{i=1}^k (σ_i(A_S − ξC_S))² )^{1/2} ≤ ( (1/k) Σ_{i=1}^k (σ_i(A_S − ξC_S))² )^{k/2}
        ≤ ( (1/k) Σ_{i=1}^n (σ_i(A_S − ξC_S))² )^{k/2}
        = ( (1/k) ‖A_S − ξC_S‖²_F )^{k/2}
        = ( (1/k) ‖A_S‖²_F sin²[A_S, C_S] )^{k/2}.    (24)

Here the last equality follows from (19) and (20).


The second product in (22) can also be estimated using inequality (23), this time taken with m = n − k and η_i = (σ_i(A_S))²:

    Π_{i=1}^k σ_i(A_S⁻¹) = ( Π_{i=1}^k σ_{n+1−i}(A_S) )⁻¹
        = (1/|det A_S|) ( Π_{i=1}^n σ_i(A_S) ) ( Π_{i=1}^k σ_{n+1−i}(A_S) )⁻¹
        = (1/|det A_S|) Π_{i=1}^{n−k} σ_i(A_S)
        = (1/|det A_S|) ( Π_{i=1}^{n−k} (σ_i(A_S))² )^{1/2}
        ≤ (1/|det A_S|) ( (1/(n−k)) Σ_{i=1}^{n−k} (σ_i(A_S))² )^{(n−k)/2}
        ≤ (1/|det A_S|) ( (1/(n−k)) Σ_{i=1}^{n} (σ_i(A_S))² )^{(n−k)/2}
        = (1/|det A_S|) ( ‖A_S‖²_F / (n−k) )^{(n−k)/2}
        ≤ ( exp(k/2) / |det A_S| ) ( ‖A_S‖²_F / n )^{(n−k)/2}.    (25)

The latter inequality follows from

    ( n/(n−k) )^{(n−k)/2} = exp( ((n−k)/2) log( n/(n−k) ) ) ≤ exp( ((n−k)/2) ( n/(n−k) − 1 ) ) = exp(k/2),

where we have used log η ≤ η − 1.

where we have used log η 6 η − 1. Substituting now the above two ineqialities (24) and (25) into (22), one gets n−k k2 2 1 1 krk k exp(k/2) 1 2 k 2 2 6 2 kA kA k [A k sin , C ] S S S S F F κ kr0 k k | det AS | n k/2 n−1 kA k2 n/2 n S F . = 4e sin2 [AS , CS ] k | det AS |

Finally, it only remains to re all de nition (18), and the required inequality (17) follows.
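The ingredients of bound (17) are computable for small examples. The sketch below evaluates K(A_S), sin²[C_S, A_S], and the right-hand side of (17), replacing the hard-to-compute κ of (14) by the upper bound cond(V) (i.e., taking D = I); the matrices are synthetic assumptions, not data from the paper.

```python
import numpy as np

# Evaluate the right-hand side of (17) for a synthetic pair (A_S, C_S).
rng = np.random.default_rng(1)
n = 40
A_S = np.eye(n) + 0.05 * rng.standard_normal((n, n))   # "scaled" matrix
C_S = A_S + 1e-3 * rng.standard_normal((n, n))         # ILU-quality approximation

def K(Z):
    # unsymmetric K-condition number (18)
    m = Z.shape[0]
    return (np.linalg.norm(Z, "fro") / np.sqrt(m)) ** m / abs(np.linalg.det(Z))

def sin2(Y, Z):
    # squared Euclidean sine (19)
    t = np.trace(Z.T @ Y)
    return 1.0 - t * t / (np.linalg.norm(Y, "fro") ** 2 * np.linalg.norm(Z, "fro") ** 2)

M = A_S @ np.linalg.inv(C_S)
_, V = np.linalg.eig(M)
kappa_ub = np.linalg.cond(V)          # upper bound for kappa of (14), D = I

s2 = sin2(C_S, A_S)
rhs = lambda k: kappa_ub * K(A_S) * (4 * np.e * n / k * s2) ** (k / 2)
vals = [rhs(k) for k in (1, 2, 4, 8)]
print(s2, vals)   # decays superlinearly once k exceeds 4*e*n*sin^2
```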


Hence, Theorem 1 actually gives a theoretical basis for two-stage preconditionings. For instance, at the first stage one chooses the scaling matrices D_L and D_R (subject to the condition of near-minimization of K(D_L A D_R); see Section 5.3 below and [12] for more detail), and at the second stage one constructs an easily invertible approximation to the scaled matrix A_S = D_L A D_R, e.g., with the use of an approximate triangular factorization with permutations as in [12].

Note that the earlier superlinear GMRES convergence estimate [11] was formulated in terms of the quantities ‖I_n − AC⁻¹‖_F and |λ(AC⁻¹)|_min, which, in general, can hardly be estimated even a posteriori. It turns out that simplistic upper bounds like

    ‖I_n − AC⁻¹‖_F = ‖(A − C)C⁻¹‖_F ≤ ‖C⁻¹‖ ‖E‖_F

are often meaningless due to the occasionally huge values of ‖C⁻¹‖; see, for instance, the data in Tables 2-7 below. At the same time, one can see there that "reasonably huge" values of the norm of the inverse preconditioner need not destroy the GMRES convergence.

Also, the above preconditioning quality measure (19) satisfies the natural condition of being a scale-invariant functional of its matrix arguments, that is,

    sin²[γC_S, αA_S] = sin²[C_S, A_S],    α ≠ 0,    γ ≠ 0.

This conforms well with the obvious fact that the GMRES residual norm is invariant with respect to any re-scaling of the preconditioner (i.e., C := βC, β ≠ 0).

Certainly, the particular value of the constant 4e in (17) is somewhat overestimated due to the rather rough techniques used in the proof of Theorem 1. Based on special analytical examples, it can be conjectured that the unimprovable value of this constant equals one.

Note: Starting from a sufficiently large iteration number k, the right-hand sides of estimate (17) decrease faster than any geometric progression. In this sense, these estimates confirm the superlinear GMRES convergence which is often observed when the preconditioning is good enough.

5.2 The corresponding GMRES iteration number bound

Using the techniques developed in [11], one can readily find an upper bound for the iteration number needed to attain a specified residual norm reduction ε ≪ 1. We will use the following auxiliary result (for the proof, see [11]).

Lemma 1. Let t > 0 and

    s ≥ ( 1 + (1 + e⁻¹) t ) / log(e + t),    (26)

where e = exp(1). Then the inequality

    s log s ≥ t    (27)

holds.

As was mentioned in [11], for any t > 0 it holds that t < s log s < 1.064t, i.e., the relative overestimation in (27) is never larger than 6.5%. Now we can prove a GMRES iteration number bound similar to the ones presented in [11], [12].

Theorem 2. The iteration number k sufficient for the ε-times reduction of the residual norm in the minimum residual method satisfies

    k ≤ ⌈ ( 4en sin²[C_S, A_S] + (2 + 2e⁻¹) log( (κ/ε) K(A_S) ) ) / log( e + (2en sin²[C_S, A_S])⁻¹ log( (κ/ε) K(A_S) ) ) ⌉    (28)

with κ determined in (14) and e = exp(1).

Proof. By the result of Theorem 1, a sufficient condition for the required inequality ‖r_k‖/‖r_0‖ ≤ ε to hold is

    κ K(A_S) ( (4en/k) sin²[C_S, A_S] )^{k/2} ≤ ε,

which can be rewritten as

    (k/2) log( k / (4en sin²[C_S, A_S]) ) ≥ log( (κ/ε) K(A_S) ).

Multiplying the latter inequality by (2en sin²[C_S, A_S])⁻¹ and denoting

    s = k / (4en sin²[C_S, A_S]),    t = (2en sin²[C_S, A_S])⁻¹ log( (κ/ε) K(A_S) ),

one can see that the resulting inequality is equivalent to condition (27). By Lemma 1, a sufficient condition for (27) to hold is (26), which yields exactly the required estimate (28). The use of the closest integer from above is valid, since the function s log s increases for s > 1/e, and by (26) it holds s > 1. □
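Lemma 1 and bound (28) are straightforward to check numerically. The inputs fed to the bound below (n, sin², κ, K(A_S), ε) are illustrative assumptions, not data from the paper:

```python
import numpy as np

# Check of Lemma 1: taking s equal to the right-hand side of (26)
# must give s*log(s) >= t, with the cited overestimation of at most 6.5%.
e = np.e
for t in (1e-3, 0.1, 1.0, 10.0, 1e3):
    s = (1 + (1 + 1 / e) * t) / np.log(e + t)
    assert t <= s * np.log(s) <= 1.064 * t + 1e-12

# Evaluation of iteration bound (28) for illustrative parameter values.
def gmres_iter_bound(n, s2, kappa, K_AS, eps):
    L = np.log(kappa * K_AS / eps)
    num = 4 * e * n * s2 + (2 + 2 / e) * L
    den = np.log(e + L / (2 * e * n * s2))
    return int(np.ceil(num / den))

k_bound = gmres_iter_bound(n=1000, s2=1e-3, kappa=10.0, K_AS=1e3, eps=1e-8)
print(k_bound)
```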

5.3 Relating the new estimate to scaling

In view of (17), it is natural to require that the scaling minimize functional (18) with Z = A_S = D_L A D_R. As is shown in [12], the minimizer satisfies exactly the requirement that A_S have the Euclidean norms of each row and each column equal to the same number, e.g.,

    Σ_{j=1}^n (D_L)²_i (A)²_{ij} (D_R)²_j = 1,    Σ_{i=1}^n (D_L)²_i (A)²_{ij} (D_R)²_j = 1,

exactly as was recommended in [2, 13]. The diagonal matrices D_L and D_R can be evaluated as an approximate solution of the above nonlinear system of equations using the RAS (Row-Alternating Scaling) iterations (see, e.g., [17] and references therein). Each RAS half-iteration consists in a one-sided re-scaling of the current matrix to normalize all its rows or all its columns, at odd and even steps, respectively. The RAS algorithm and its "symmetrized" version are investigated in [12] from the viewpoint of K(A_S) reduction.

Note that both the convergence theory above and the numerical examples given later (cf. also [12]) clearly indicate that it makes sense to invest a considerable fraction of the computational effort into the evaluation of scaling matrices D_L and D_R for which the factor K(A_S) in the right-hand side of the GMRES convergence estimate (17) is reduced considerably. In this respect, one can even use sparse triangular matrices instead of the diagonal D_L and D_R, as was done in the two-sided explicit preconditioning proposed and investigated in [8]. There, a general unsymmetric matrix A was preconditioned using the two-sided transformation

    Â = G_L A G_U,

with G_L and G_U chosen as sparse lower and upper triangular matrices, respectively. The positions and values of their entries were determined from the same condition of K(G_L A G_U) minimization. To this end, a RAS-type procedure was used, where at each half-step one evaluates the K-condition number minimizer G_L or G_U, where K(M) = (n⁻¹ tr M)ⁿ / det M and M = Â Âᵀ or M = Âᵀ Â, respectively (cf. also [9]). The strategy considered in [8] was as follows: allowing the matrices G_L and G_U to have a sufficiently large number of nonzeros, one can assume that the matrix M = Â Âᵀ comes close enough to the identity matrix I_n to make the explicit Conjugate Gradient iterations efficient in solving the two-sided preconditioned system My = f. Since such a construction is completely free of the necessity of solving systems with large sparse triangular matrices, this method is considered suitable for parallel implementation.

In the context of the present paper, even the use of G_L and G_U containing not more than 2 nonzeros in each row and column, instead of the diagonal matrices D_L and D_R, may result in a further considerable reduction of K(G_L A G_U). Moreover, one can expect that an approximate triangular factorization of the type

    G_L A G_U = P̂ L̂ Û Q̂ + Ê

will possess even better preconditioning quality than that obtained with simple diagonal scaling. In this case, convergence estimate (17) of Theorem 1 takes the form

    ‖r_k‖/‖r_0‖ ≤ κ K(G_L A G_U) ( (4en/k) sin²[P̂ L̂ Û Q̂, G_L A G_U] )^{k/2}.

Hence, the resulting two-level preconditioner takes the form C = G_L⁻¹ P̂ L̂ Û Q̂ G_U⁻¹, and its application additionally requires two matrix-vector multiplications with the sparse matrices G_L and G_U. Of course, such a scheme would involve certain additional algorithmic complications; however, the expected gain in preconditioning quality should prevail.
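For concreteness, the diagonal RAS iteration of this section can be sketched as follows. This is a dense toy version on a synthetic matrix (a practical implementation would be sparse, and the triangular-scaling variant of [8] is not covered):

```python
import numpy as np

# RAS sketch: odd half-steps normalize all rows of A_S = D_L A D_R
# to unit Euclidean norm, even half-steps normalize all columns.
rng = np.random.default_rng(2)
n = 30
A = rng.standard_normal((n, n)) * np.exp(rng.standard_normal((n, n)))
d_l = np.ones(n)
d_r = np.ones(n)
for _ in range(100):
    A_S = d_l[:, None] * A * d_r[None, :]
    d_l /= np.sqrt((A_S ** 2).sum(axis=1))   # row half-step
    A_S = d_l[:, None] * A * d_r[None, :]
    d_r /= np.sqrt((A_S ** 2).sum(axis=0))   # column half-step
A_S = d_l[:, None] * A * d_r[None, :]
row2 = (A_S ** 2).sum(axis=1)
col2 = (A_S ** 2).sum(axis=0)
print(row2.max(), col2.max(), np.linalg.norm(A_S, "fro") ** 2)
# after convergence all row/column norms are 1, so ||A_S||_F^2 = n,
# which is exactly condition (29) used in Section 5.4
```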

5.4 Relating the new estimate to ILU preconditioning

If the matrix A_S = D_L A D_R is scaled to satisfy

    ‖A_S‖²_F = n    (29)

(note that (29) holds for scalings obtained using any number of RAS iterations), then the following upper bound holds:

    sin²[C_S, A_S] ≡ 1 − (trace A_Sᵀ C_S)² / ( ‖C_S‖²_F ‖A_S‖²_F )
        = min_σ ‖A_S − σC_S‖²_F / ‖A_S‖²_F
        ≤ ‖A_S − C_S‖²_F ‖A_S‖⁻²_F
        = n⁻¹ ‖A_S − C_S‖²_F = n⁻¹ ‖E_S‖²_F.

Hence, under condition (29) the result of Theorem 1 coincides exactly with the one presented in [12]:

    ‖r_k‖/‖r_0‖ ≤ ( κ / |det A_S| ) ( (3.3/√k) ‖E_S‖_F )^k,    (30)

where we have also used the numerical inequality 2√e < 3.3.
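The a posteriori indicator ‖E_S‖_F of (30) can be sketched as follows; scipy's spilu serves only as a stand-in for the paper's threshold ILU with permutations, and the matrix is synthetic:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Scale A, factor the scaled matrix incompletely, reassemble C_S from
# the factors, and measure E_S = A_S - C_S in the Frobenius norm.
n = 120
A = (sp.eye(n, format="csc") * 3.0
     + sp.random(n, n, density=0.04, random_state=3, format="csc")).toarray()

dl = 1.0 / np.sqrt((A ** 2).sum(axis=1))        # one row half-step
As = dl[:, None] * A
dr = 1.0 / np.sqrt((As ** 2).sum(axis=0))       # one column half-step
As = As * dr[None, :]

ilu = spla.spilu(sp.csc_matrix(As), drop_tol=1e-2)
# reassemble C_S = Pr^T L U Pc^T from the SuperLU permutation vectors
Pr = sp.csc_matrix((np.ones(n), (ilu.perm_r, np.arange(n))), shape=(n, n))
Pc = sp.csc_matrix((np.ones(n), (np.arange(n), ilu.perm_c)), shape=(n, n))
Cs = (Pr.T @ (ilu.L @ ilu.U) @ Pc.T).toarray()
Es = As - Cs
print(np.linalg.norm(Es, "fro"))                # the indicator of (30)
```

In a real threshold-ILU code this norm is accumulated for free during the factorization, rather than by reassembling the factors as done here for illustration.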

It should be noted that if the ILU threshold parameter is chosen sufficiently small, e.g., τ = 0.001, and the ILU factors are stable enough, then the typical values of ‖E_S‖_F are not big (one can often observe ‖E_S‖_F < 1 even for realistic large-scale problems; cf. the numerical data in [12]). As was noted above, the quantity ‖E_S‖_F can easily be evaluated in the process of the approximate factorization of A_S, which allows us to use it as an a posteriori indicator of the ILU preconditioning quality.

Turning back to the low-rank modified form of the error term (4), one can generalize the main result to take into account the case when the pivot modification rule is used in the ILU factorization (see [12] for more detail). Setting ξ = 1 in (22), one finds, for any integer 1 ≤ m ≪ k, the following estimate:

    (1/κ) ‖r_k‖/‖r_0‖ ≤ 2^k ( Π_{i=1}^m σ_i(E_S) ) ( Π_{i=m+1}^k σ_i(E_S)² )^{1/2} ( Π_{i=1}^k σ_i(A_S⁻¹) ).    (31)


Estimating these three products separately, one has

    Π_{i=1}^m σ_i(E_S) ≤ ‖E_S‖^m,

    ( Π_{i=m+1}^k σ_i(E_S)² )^{1/2} ≤ ( (1/(k−m)) Σ_{i=m+1}^n σ_i(E_S)² )^{(k−m)/2}
        = ( (1/(k−m)) min_{rank(X)=m} ‖E_S − X‖²_F )^{(k−m)/2},

where we have used the well-known result of Eckart and Young (see, e.g., Theorem B5 in [14], Section 10). Finally, by (25) and (29), it follows that

    Π_{i=1}^k σ_i(A_S⁻¹) ≤ exp(k/2) / |det A_S|.

Substituting the latter three inequalities into (31) gives the needed generalization of (30):

    ‖r_k‖/‖r_0‖ ≤ ( κ / |det A_S| ) ( 3.3 ‖E_S‖ )^m ( (3.3/√(k−m)) min_{rank(X)=m} ‖E_S − X‖_F )^{k−m}.    (32)
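The Eckart-Young step used above is easy to demonstrate: the best rank-m Frobenius approximation error of a matrix equals the tail of its singular values. Below is a synthetic E_S with m dominant "pivot modification" directions (all values are assumptions for illustration):

```python
import numpy as np

# min_{rank(X)=m} ||E_S - X||_F = ( sum_{i>m} sigma_i(E_S)^2 )^{1/2}
rng = np.random.default_rng(4)
m, n = 3, 50
Es = rng.standard_normal((n, m)) @ rng.standard_normal((m, n)) \
     + 1e-3 * rng.standard_normal((n, n))       # rank-m spike + small noise
U, S, Vt = np.linalg.svd(Es)
tail = np.sqrt((S[m:] ** 2).sum())              # best rank-m residual
X = (U[:, :m] * S[:m]) @ Vt[:m]                 # the optimal rank-m X
print(tail, np.linalg.norm(Es - X, "fro"), np.linalg.norm(Es, "fro"))
```

Here the tail is far smaller than ‖E_S‖_F itself, which is exactly the situation in which the generalized bound (32) is much sharper than (30).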

One can readily apply the techniques of Section 5.2 and find that the corresponding iteration number bound will differ only by an additive term of the type m + o(m). However, in certain cases, for some moderate value of m, it may hold that

    min_{rank(X)=m} ‖E_S − X‖_F ≪ ‖E_S‖_F.

For instance, the use of pivot modifications in ILU algorithms is equivalent to the approximate triangular decomposition of a diagonally perturbed input matrix,

    A_S + D̃ = P_S L_S U_S Q_S + Ẽ_S,

where D̃ is a diagonal matrix having only m nonzero elements (which may have considerably larger magnitudes compared to the ILU threshold parameter τ), and the entries of Ẽ_S satisfy the bound (3). Clearly, rank(D̃) = m, and therefore one finds

    min_{rank(X)=m} ‖E_S − X‖_F = min_{rank(X)=m} ‖Ẽ_S − D̃ − X‖_F ≤ ‖Ẽ_S‖_F,

which quantity may be considerably smaller than the Frobenius norm of the total residual E_S = Ẽ_S − D̃. Hence, one can expect that m pivot modifications in ILU preconditioning may cost m additional GMRES iterations. It should be noted that the complete diagonal pivoting in ILU described in [12] usually requires a rather small, if any, number of pivot modifications.

Table 1. RAS(δ) scaling statistics for 18 test problems with δ = 0.8 and δ = 0.1

Problem      n      nz(A)    log K(A)    δ=0.8: #RAS  log K(A_S)    δ=0.1: #RAS  log K(A_S)
gre_1107     1107    5664    4.487+02          7      4.067+02           35      3.828+02
qh1484       1484    6110    3.562+04         19      1.296+03           39      1.286+03
west2021     2021    7310    1.974+04         21      3.177+02          131      2.006+02
nnc1374      1374    8588    1.409+04         41      9.204+02           79      6.190+02
sherman3     5005   20033    8.789+04          3      1.177+03            4      1.177+03
sherman5     3312   20793    1.140+04          6      2.164+02           20      1.769+02
saylr4       3564   22316    5.760+03          3      4.985+03            4      4.984+03
lnsp3937     3937   25407    7.792+04         29      1.272+03           95      1.216+03
gemat12      4929   33044    1.107+04         13      3.214+03           92      3.154+03
dw8192       8192   41746    2.532+04          3      5.493+03           10      5.486+03
circuit3    12127   48137    3.583+04         22      7.603+03           62      7.545+03
cryg10K     10000   49699    4.562+04          4      6.393+03           27      6.389+03
fd18        16428   63406    2.173+05         17      4.490+03           93      4.015+03
bayer10     13436   71594    1.312+05         24      2.412+03          173      1.917+03
lhr04        4101   82682    3.635+03         26      9.399+02          193      8.253+02
utm5940      5940   83842    5.625+03         12      3.406+03           54      3.332+03
bayer04     20545   85537    4.193+05         45      2.214+03          209      1.648+03
orani678     2529   90158    1.219+03         11      1.639+02           88      1.026+02

Table 2. RAS(0.1)+ILU(0.001) preconditioning statistics for the 18 test problems

Problem      Precond.   Lower est.      ‖E_S‖_F    #GMRES      #Estimated
             density    for ‖C_S⁻¹‖                iterations  iterations
gre_1107      15.87     1.216+06        1.962−02       10           90
qh1484         2.72     2.108+05        4.889−03        3          221
west2021       2.73     2.824+11        1.235−02        4           47
nnc1374        8.18     1.395+07        1.268−02       21          129
sherman3       4.82     4.675+01        4.360−02       15          280
sherman5       2.15     3.809+00        2.801−02        6           49
saylr4         0.80     9.194+02        1.388−02       60          889
lnsp3937       5.56     1.346+04        4.833−02        8          294
gemat12        2.52     8.304+05        2.504−02       12          631
dw8192         4.01     6.982+01        3.978−02       15         1126
circuit3       1.56     6.375+03        1.688−02       12         1343
cryg10K        3.72     3.344+03        4.379−02       34         1315
fd18          11.73     2.278+30        6.855−02       30          921
bayer10        3.55     7.903+38        5.448−02        7          452
lhr04          2.04     6.725+03        6.300−02       18          218
utm5940        5.65     4.073+04        9.949−02       30          830
bayer04        2.97     1.105+38        5.123−02        5          390
orani678       0.95     3.656+00        5.503−02        6           37


Table 3. RAS(0.1)+ILU(0.01) preconditioning statistics for the 18 test problems

Problem      Precond.   Lower est.      ‖E_S‖_F    #GMRES      #Estimated
             density    for ‖C_S⁻¹‖                iterations  iterations
gre_1107      13.73     3.281+03        2.708−01       21          158
qh1484         2.24     2.243+06        8.372−02       21          341
west2021       2.37     9.970+02        1.103−01        8           73
nnc1374        7.27     2.489+04        1.775−01       48          212
sherman3       2.62     3.451+01        3.554−01       35          438
sherman5       1.42     4.039+00        2.624−01       11           85
saylr4         0.78     9.163+02        4.895−02       59         1064
lnsp3937       3.65     3.564+05        4.329−01       16          475
gemat12        1.74     9.887+17        2.667−01       48          963
dw8192         2.73     2.662+01        3.506−01       45         1670
circuit3       1.23     1.483+03        1.952−01       56         1969
cryg10K        2.34     1.046+03        3.828−01       78         1949
fd18           9.33     1.826+45        7.006−01       93         1507
bayer10        2.52     5.266+50        5.658−01       12†         755
lhr04          1.01     1.590+05        6.587−01       55          392
utm5940        2.52     2.457+02        8.077−01       68         1338
bayer04        2.28     4.151+32        5.299−01        9          651
orani678       0.38     2.854+00        4.440−01        8           70

Table 4. RAS(0.1)+ILU(0.07) preconditioning statistics for the 18 test problems

Problem      Precond.   Lower est.      ‖E_S‖_F    #GMRES      #Estimated
             density    for ‖C_S⁻¹‖                iterations  iterations
gre_1107       9.30     9.996+02        2.291+00       50          409
qh1484         1.73     2.163+05        7.746−01       62          596
west2021       1.87     7.695+12        1.172+00       18          177
nnc1374        5.70     8.907+04        1.574+00      610†         452
sherman3       1.56     3.132+01        1.919+00       73          800
sherman5       1.00     1.770+00        1.234+00       22          168
saylr4         0.76     8.869+02        1.444−01       60         1279
lnsp3937       2.23     5.106+04        2.628+00       36          966
gemat12        1.12     1.932+10        2.104+00      174         1780
dw8192         1.42     5.544+01        1.808+00      205         2627
circuit3       0.98     2.001+02        1.653+00      163         3321
cryg10K        1.55     1.627+02        2.352+00      175         3272
fd18           5.61     9.233+59        4.458+00      222         3048
bayer10        1.66     7.786+28        3.764+00       47         1637
lhr04          0.39     5.661+03        3.397+00       92          874
utm5940        0.96     6.650+01        4.138+00      137         2557
bayer04        1.53     7.189+51        3.625+00      104†        1442
orani678       0.15     2.408+00        2.180+00       14          191

Table 5. RAS(0.8)+ILU(0.001) preconditioning statistics for the 18 test problems

Problem      Precond.   Lower est.      ‖E_S‖_F    #GMRES      #Estimated
             density    for ‖C_S⁻¹‖                iterations  iterations
gre_1107      14.98     1.442+07        2.157−02       14           96
qh1484         3.10     1.665+13        6.356−03        4          230
west2021       2.94     1.236+06        1.079−02        6           69
nnc1374        9.18     4.938+04        6.882−03       36          169
sherman3       4.81     6.704+01        4.341−02       15          280
sherman5       2.19     3.916+00        2.643−02        6           58
saylr4         0.79     9.445+02        1.384−02       60          889
lnsp3937       5.29     6.959+02        4.531−02        8          302
gemat12        2.55     1.932+04        2.515−02       13          642
dw8192         3.96     4.462+01        3.933−02       16         1125
circuit3       1.57     4.101+03        1.666−02       12         1350
cryg10K        3.73     4.367+04        4.396−02       41         1316
fd18          13.61     1.745+41        6.887−02       24         1021
bayer10        4.32     1.735+27        4.986−02        8          549
lhr04          2.21     9.265+08        6.237−02       19          244
utm5940        6.14     3.131+03        1.012−01       32          849
bayer04        3.74     3.142+31        4.947−02       10          507
orani678       1.06     1.589+01        5.903−02        5           54

Table 6. RAS(0.8)+ILU(0.01) preconditioning statistics for the 18 test problems

Problem      Precond.   Lower est.      ‖E_S‖_F    #GMRES      #Estimated
             density    for ‖C_S⁻¹‖                iterations  iterations
gre_1107      12.28     4.509+05        2.640−01       28          165
qh1484         2.49     5.968+10        8.875−02       19          348
west2021       2.40     2.354+04        1.440−01       14          114
nnc1374        8.56     5.503+06        1.594−01      168†         291
sherman3       2.62     3.515+01        3.559−01       35          438
sherman5       1.46     4.191+00        2.577−01       11           99
saylr4         0.78     9.378+02        4.929−02       59         1065
lnsp3937       3.45     1.648+02        4.110−01       15          487
gemat12        1.78     2.472+14        2.602−01       35          973
dw8192         2.72     4.546+01        3.481−01       48         1669
circuit3       1.23     1.281+03        2.654−01       44         2106
cryg10K        2.35     7.093+02        3.858−01       78         1953
fd18          10.41     7.823+41        7.219−01      139         1673
bayer10        3.41     1.119+39        4.974−01       28†         886
lhr04          1.26     3.094+06        6.223−01       41          428
utm5940        2.80     5.501+03        8.158−01       70         1367
bayer04        3.05     1.450+54        5.359−01       24          840
orani678       0.41     5.877+01        4.694−01        8           99


Fig. 1. Set of points (log k, log k_est) depicting the correlation between the observed and estimated iteration numbers

6 Numerical experiments

The correctness of the above convergence estimate (30) has also been tested numerically using several small-sized "hard" test matrices taken from the University of Florida Sparse Matrix Collection [1]. The limitation on the sizes of the matrices was set in order to ease the "exact" LU-factorization of the coefficient matrix A, which was used for the evaluation of log|det A|. The linear systems were solved with an artificial right-hand side b = Ax*, where the components of the exact solution were chosen as x*(i) = i/n, i = 1, 2, ..., n. The initial guess was always chosen as x_0 = 0, and the stopping criterion in the GMRES iteration was set as ||er_k|| <= eps ||r_0|| with eps = 10^{-8}, where ||er_k|| is the estimated GMRES residual norm. If the matrix A is very ill-conditioned and the preconditioning is not sufficiently strong (e.g. if the ILU threshold parameter tau is set too large), the true residual norm can be much larger than the estimated one (due to the calculations in finite precision). In the cases of a complete failure, when ||r_k|| > ||r_0||, we put the "†" mark after the GMRES iteration number in Tables 2-7. In the GMRES(m) scheme, we took m = 900 and used approximate LU preconditioning with the "best" default tuning of the pre-ordering and pivoting (see [12] for more detail).
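As a sketch of this experimental setup (not the author's code: dense numpy, an arbitrary well-conditioned test matrix instead of the UF collection, and no ILU preconditioning), the artificial right-hand side b = Ax* and the relative-residual stopping rule can be reproduced with a minimal unrestarted GMRES:

```python
import numpy as np

def gmres(A, b, x0=None, tol=1e-8, maxiter=None):
    # Minimal unrestarted GMRES: Arnoldi with modified Gram-Schmidt plus a
    # small least-squares solve; stops when the estimated residual norm
    # satisfies ||r_k|| <= tol * ||r_0||, as in the text.
    n = b.size
    x0 = np.zeros(n) if x0 is None else x0
    maxiter = n if maxiter is None else maxiter
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)
    if beta == 0.0:
        return x0, 0
    Q = np.zeros((n, maxiter + 1))
    H = np.zeros((maxiter + 1, maxiter))
    Q[:, 0] = r0 / beta
    rhs = np.zeros(maxiter + 1)
    rhs[0] = beta
    for k in range(maxiter):
        w = A @ Q[:, k]
        for i in range(k + 1):            # modified Gram-Schmidt
            H[i, k] = Q[:, i] @ w
            w = w - H[i, k] * Q[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] > 1e-14:
            Q[:, k + 1] = w / H[k + 1, k]
        # least-squares solve of the small (k+2) x (k+1) Hessenberg system
        y, *_ = np.linalg.lstsq(H[:k + 2, :k + 1], rhs[:k + 2], rcond=None)
        est = np.linalg.norm(rhs[:k + 2] - H[:k + 2, :k + 1] @ y)
        if est <= tol * beta:
            break
    return x0 + Q[:, :k + 1] @ y, k + 1

# artificial right-hand side b = A x*, x*(i) = i/n, x_0 = 0 (as in the text)
n = 60
rng = np.random.default_rng(0)
A = n * np.eye(n) + rng.standard_normal((n, n))   # diagonally dominant test matrix
x_star = np.arange(1, n + 1) / n
b = A @ x_star
x, iters = gmres(A, b, tol=1e-8)
```

The least-squares solve plays the role of the Givens (or Householder [23]) update of a production implementation; it is chosen here only for brevity.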


Fig. 2. Set of points (log k, log ||E_S||_F) depicting the correlation between the observed iteration number and the Frobenius norm of the scaled ILU error.

Note: It has been observed (especially in calculations with the nnc1374 matrix) that much better results, in the sense of closeness between the "iterative" and the "true" residual (the latter being r_k = b - Ax_k), are obtained using the BiCGStab iterations [22]. Probably, an improved GMRES implementation [23] (where the plane rotations are replaced by elementary reflections) would be more competitive.
In the scaling procedure, the RAS stopping criterion was

    max( (max_i sum_j (A_S)_{ij}^2) / (min_i sum_j (A_S)_{ij}^2),
         (max_j sum_i (A_S)_{ij}^2) / (min_j sum_i (A_S)_{ij}^2) ) <= 1 + delta

with delta = 0.1, 0.8, and the ILU threshold parameter tau was set to tau = 0.001, 0.01, 0.07. We present numerical results for 18 sample problems from the above-mentioned collection. The problems are taken from the subset of 60 matrices which was used in [12] for testing of ILU preconditionings. Hence, the statistics on the total of 2 x 3 x 18 = 108 test runs are reported in Tables 2-7. In Table 1 we list the names of the test matrices with their sizes and numbers of nonzeroes, and present values of the quality measure K(A_S) which characterize the result of scaling. Clearly, the smaller delta, the smaller is K(A_S), which corresponds to better scaling. However, the number of RAS iterations increases considerably when refining the precision from delta = 0.8 to delta = 0.1.

Fig. 3. Set of points (log k, log ||C_S^{-1}||) depicting the (absence of) correlation between the observed iteration number and the (lower bound for the) spectral norm of the inverse scaled preconditioner.

In GMRES(m) we took m = 900 and used approximate LU preconditioning as in [12]. All computing was done in double precision. The iteration number counts and other related data are given in Tables 2-7. For each test run we give:
1. the resulting preconditioner density nz(L + U)/nz(A);
2. the lower bound on the spectral norm of C_S^{-1} (taken as (v^T U_S^{-1})(L_S^{-1} u)/n, where the components of the vectors u and v are 1 or -1, with signs determined in the course of the back substitutions to obtain a local maximum at each step);
3. the Frobenius norm of the scaled ILU residual E_S;
4. the actual number k of GMRES iterations;
5. the upper bound k_est for the iteration number k obtained from estimate (30) with kappa = 1, in the same way as in the proof of Theorem 2.

First of all, the results presented give another confirmation that good prescaling can be useful for the improvement of the ILU-GMRES performance.
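Item 2 describes a LINPACK-style norm estimate. A minimal dense sketch of the sign-picking substitutions (an assumed interpretation written for illustration; the actual code of [12] operates on the sparse ILU factors) is:

```python
import numpy as np

def lower_bound_inv_norm(L, U):
    # Greedy lower bound on ||(LU)^{-1}||_2 in the spirit of item 2 above:
    # take u, v with entries +-1, signs chosen during the substitutions to
    # enlarge each intermediate component, and return |v^T U^{-1} L^{-1} u| / n.
    n = L.shape[0]
    y = np.zeros(n)                      # y = L^{-1} u (forward substitution)
    for i in range(n):
        s = L[i, :i] @ y[:i]
        u_i = 1.0 if s < 0 else -1.0     # sign that maximizes |u_i - s|
        y[i] = (u_i - s) / L[i, i]
    z = np.zeros(n)                      # z = U^{-T} v (U^T is lower triangular)
    for i in range(n):
        s = U[:i, i] @ z[:i]
        v_i = 1.0 if s < 0 else -1.0
        z[i] = (v_i - s) / U[i, i]
    return abs(z @ y) / n

rng = np.random.default_rng(1)
n = 8
L = np.tril(rng.standard_normal((n, n)), -1) + np.eye(n)   # unit lower triangular
U = np.triu(rng.standard_normal((n, n)), 1) + 2 * np.eye(n)
est = lower_bound_inv_norm(L, U)
```

Since ||u||_2 = ||v||_2 = sqrt(n), the Cauchy-Schwarz inequality guarantees that the returned value never exceeds ||(LU)^{-1}||_2, so it is always a valid lower bound.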

Table 7. RAS(0.8)+ILU(0.07) preconditioning statistics for the 18 test problems

Problem | Precond. density | Lower est. for ||C_S^{-1}|| | ||E_S||_F | #GMRES iterations | #Estimated iterations
gre_1107 | 8.25 | 9.027e+05 | 2.353e+00 | 67 | 433
qh1484 | 1.92 | 2.938e+08 | 7.703e-01 | 59 | 598
west2021 | 1.85 | 4.015e+06 | 1.124e+00 | 27 | 236
nnc1374 | 7.41 | 1.838e+06 | 1.650e+00 | 720† | 619
sherman3 | 1.56 | 3.129e+01 | 1.922e+00 | 73 | 801
sherman5 | 1.01 | 1.706e+00 | 1.436e+00 | 22 | 210
saylr4 | 0.76 | 9.742e+02 | 1.440e-01 | 60 | 1279
lnsp3937 | 2.17 | 1.929e+01 | 2.523e+00 | 37 | 975
gemat12 | 1.16 | 5.290e+15 | 2.142e+00 | >900† | 1820
dw8192 | 1.40 | 2.341e+01 | 1.777e+00 | 215 | 2614
circuit3 | 1.00 | 2.395e+02 | 1.649e+00 | 153 | 3340
cryg10K | 1.56 | 1.694e+02 | 2.350e+00 | 163 | 3273
fd18 | 7.07 | 2.029e+64 | 4.708e+00 | 420 | 3405
bayer10 | 2.26 | 3.880e+70 | 4.046e+00 | 114 | 2006
lhr04 | 0.46 | 3.720e+21 | 3.887e+00 | >900† | 1044
utm5940 | 1.07 | 4.046e+02 | 4.218e+00 | 145 | 2626
bayer04 | 2.03 | 7.703e+49 | 3.686e+00 | 121† | 1789
orani678 | 0.15 | 1.685e+01 | 2.152e+00 | 13 | 238
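The RAS stopping criterion displayed earlier amounts to checking that the row and column sums of squares of A_S are equilibrated to within the factor 1 + delta; a few illustrative lines (hypothetical helper name) make this concrete:

```python
import numpy as np

def ras_balanced(A_s, delta):
    # RAS stopping criterion from the text: both the row and the column sums
    # of squares of the scaled matrix must be equilibrated within 1 + delta.
    row = (A_s ** 2).sum(axis=1)
    col = (A_s ** 2).sum(axis=0)
    spread = max(row.max() / row.min(), col.max() / col.min())
    return spread <= 1.0 + delta
```

A perfectly balanced matrix (e.g. the identity) passes for any delta >= 0, while a badly scaled diagonal fails even for generous tolerances.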

Next we address the consistency analysis for the above-presented GMRES convergence theory. One can see that, for the cases considered, upper bound (28) is, on average, a twenty-fold overestimation of the actual iteration count. However, the relative variations of the upper bound (from one problem to another) correlate with the actual iteration numbers rather well, as is illustrated in Figure 1. (In Figs. 1-3 we have used only the data on 99 out of 108 test runs, thus ignoring the breakdown occasions marked by "†".) A much weaker correlation is observed between the Frobenius norm of the scaled ILU residual ||E_S||_F and the actual GMRES iteration number, see Figure 2. Furthermore, the conventional indicator ||C_S^{-1}|| does not demonstrate any correlation with the GMRES iteration number. Note that if there is a hidden dependence, for instance of the form k = alpha k_est^beta, then the points (log k, log k_est) lie on the corresponding straight line. The reader may clearly observe that only the discrete set shown in Figure 1 can safely be interpreted as a "linear function plus noise". More precisely, one can find two intersecting straight lines in Figure 1 which, in fact, correspond to two different classes of test problems.

7 Conclusion

First, a theoretical justification is found for the standard pre-scaling technique related to the ILU factorization, with implications for practical implementation. (Namely, a more accurate evaluation of D_L and D_R may be useful, or even sparse matrices with more than n nonzeroes can be used instead of the diagonal ones.) Second, an estimate for the reduction of the original (unscaled) residual is obtained in terms of the scaled ILU error. These results can readily be used as a working tool for the construction of efficient two-stage preconditionings for Krylov subspace methods.

Acknowledgments. The author thanks Eugene Tyrtyshnikov for his kind interest in this research and for his valuable assistance in related presentations.

References

1. University of Florida Sparse Matrix Collection. http://www.cise.ufl.edu/research/sparse/matrices/
2. V.F. de Almeida, A.M. Chapman, and J.J. Derby, On Equilibration and Sparse Factorization of Matrices Arising in Finite Element Solutions of Partial Differential Equations, Numer. Methods Partial Differ. Equ., 16 (2000), pp. 11-29.
3. O. Axelsson and I. Kaporin, On the sublinear and superlinear rate of convergence of conjugate gradient methods, Numerical Algorithms, 25 (2000), pp. 1-22.
4. O. Axelsson and G. Lindskog, On the rate of convergence of the preconditioned conjugate gradient method, Numerische Mathematik, 48 (1986), pp. 499-523.
5. T. Davis, http://www.cise.ufl.edu/research/sparse/umfpack/
6. S.L. Campbell, I.C. Ipsen, C.T. Kelley, and C.D. Meyer, GMRES and the Minimal Polynomial, BIT, 36 (1996), pp. 664-675.
7. A. Jennings, Influence of the eigenvalue spectrum on the convergence rate of the conjugate gradient method, Journal of the Institute of Mathematics and Its Applications, 20 (1977), pp. 61-72.
8. I. Kaporin, Explicitly preconditioned conjugate gradient method for the solution of nonsymmetric linear systems, Int. J. Computer Math., 40 (1992), pp. 169-187.
9. I. Kaporin, New convergence results and preconditioning strategies for the conjugate gradient method, Numer. Linear Algebra Appl., 1 (1994), pp. 179-210.
10. I. Kaporin, High quality preconditioning of a general symmetric positive matrix based on its U^T U + U^T R + R^T U-decomposition, Numer. Linear Algebra Appl., 5 (1998), pp. 484-509.
11. I. Kaporin, Superlinear convergence in minimum residual iterations, Numer. Linear Algebra Appl., 12 (2005), pp. 453-470.
12. I. Kaporin, Scaling, Reordering, and Diagonal Pivoting in ILU Preconditionings, Russian Journal of Numerical Analysis and Mathematical Modelling, 22 (2007), pp. 341-375.
13. O.E. Livne and G.H. Golub, Scaling by Binormalization, Numer. Alg., 35 (2004), pp. 97-120.
14. A.W. Marshall and I. Olkin, Inequalities: Theory of Majorization and its Applications, Academic Press, New York, 1979.
15. I. Moret, A note on the superlinear convergence of GMRES, SIAM Journal on Numerical Analysis, 34 (1997), pp. 513-516.
16. Y. Saad and M.H. Schultz, GMRES: A generalized minimal residual method for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput., 7 (1986), pp. 856-869.
17. M.H. Schneider and S.A. Zenios, A comparative study of algorithms for matrix balancing, Operations Research, 38 (1990), pp. 439-455.
18. V. Simoncini and D.B. Szyld, On the Occurrence of Superlinear Convergence of Exact and Inexact Krylov Subspace Methods, Dept. Math., Temple University Report 03-3-13, Philadelphia, Pennsylvania, March 2003, 25 pp.
19. E.E. Tyrtyshnikov, A unifying approach to some old and new theorems on distribution and clustering, Linear Algebra and its Applications, 232 (1996), pp. 1-43.
20. E.E. Tyrtyshnikov, Krylov subspace methods and minimal residuals, J. Numer. Math. (2007, submitted).
21. H.A. van der Vorst and C. Vuik, The superlinear convergence behaviour of GMRES, Journal of Computational and Applied Mathematics, 48 (1993), pp. 327-341.
22. H.A. van der Vorst, Bi-CGStab: a fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Statist. Comput., 13 (1992), pp. 631-644.
23. H.F. Walker, Implementation of the GMRES method using Householder transformations, SIAM J. Sci. Statist. Comput., 9 (1988), pp. 152-163.
24. R. Winther, Some superlinear convergence results for the conjugate gradient method, SIAM J. Numer. Anal., 17 (1980), pp. 14-17.

Toeplitz and Toeplitz-block-Toeplitz matrices and their correlation with syzygies of polynomials

Houssam Khalil^1, Bernard Mourrain^2, and Michelle Schatzman^{1,*}

^1 Institut Camille Jordan, 43 boulevard du 11 novembre 1918, 69622 Villeurbanne cedex, France
khalil@math.univ-lyon1.fr, schatz@math.univ-lyon1.fr
^2 INRIA, GALAAD team, 2004 route des Lucioles, BP 93, 06902 Sophia Antipolis Cedex, France
mourrain@sophia.inria.fr

Abstract. In this paper, we re-investigate the resolution of Toeplitz systems T u = g from a new point of view, by correlating the solution of such problems with syzygies of polynomials or moving lines. We show an explicit connection between the generators of a Toeplitz matrix and the generators of the corresponding module of syzygies. We show that this module is generated by two elements of degree n, and that the solution of T u = g can be reinterpreted as the remainder of an explicit vector depending on g, by these two generators. This approach extends naturally to multivariate problems, and we describe, for Toeplitz-block-Toeplitz matrices, the structure of the corresponding generators.

Keywords: Toeplitz matrix, rational interpolation, syzygy.

1 Introduction

Structured matrices appear in various domains, such as scientific computing, signal processing, etc. They usually express, in a linearized way, a problem which depends on fewer parameters than the number of entries of the corresponding matrix. An important area of research is devoted to the development of methods for the treatment of such matrices, which depend on the actual parameters involved in these matrices.
Among well-known structured matrices, Toeplitz and Hankel structures have been intensively studied [5, 6]. Nearly optimal algorithms are known for the multiplication or the resolution of linear systems for such structures. Namely, if A is a Toeplitz matrix of size n, multiplying it by a vector or solving a linear system with A requires O~(n) arithmetic operations (where O~(n) = O(n log^c(n)) for some c > 0) [2, 12]. Such algorithms are called super-fast, in opposition to fast algorithms requiring O(n^2) arithmetic operations.


The fundamental ingredients in these algorithms are the so-called generators [6], encoding the minimal information stored in these matrices, on which the matrix transformations are translated. The correlation with other types of structured matrices has also been well developed in the literature [10, 9], allowing to treat as efficiently other structures such as Vandermonde or Cauchy-like structures.
Such problems are strongly connected to polynomial problems [4, 1]. For instance, the product of a Toeplitz matrix by a vector can be deduced from the product of two univariate polynomials, and thus can be computed efficiently by evaluation-interpolation techniques based on FFT. The inverse of a Hankel or Toeplitz matrix is connected to the Bezoutian of the polynomials associated to its generators. However, most of these methods involve univariate polynomials. So far, few investigations have been pursued for the treatment of multilevel structured matrices [11], related to multivariate problems. Such linear systems appear for instance in resultant or residue constructions, in normal form computations, or more generally in multivariate polynomial algebra. We refer to [8] for a general description of such correlations between multi-structured matrices and multivariate polynomials. Surprisingly, they also appear in numerical schemes and preconditioners. A main challenge here is to devise super-fast algorithms of complexity O~(n) for the resolution of multi-structured systems of size n.
In this paper, we consider block-Toeplitz matrices, where each block is a Toeplitz matrix. Such a structure, which is the first step towards multi-level structures, is involved in many bivariate problems, or in numerical linear problems. We re-investigate first the resolution of Toeplitz systems T u = g from a new point of view, by correlating the solution of such problems with syzygies of polynomials or moving lines. We show an explicit connection between the generators of a Toeplitz matrix and the generators of the corresponding module of syzygies. We show that this module is generated by two elements of degree n, and that the solution of T u = g can be reinterpreted as the remainder of an explicit vector depending on g, by these two generators. This approach extends naturally to multivariate problems, and we describe, for Toeplitz-block-Toeplitz matrices, the structure of the corresponding generators. In particular, we show the known result that the module of syzygies of k non-zero bivariate polynomials is free of rank k - 1, by a new elementary proof. Exploiting the properties of moving lines associated to Toeplitz matrices, we give a new point of view to solve a Toeplitz-block-Toeplitz system.
In the next section we study the scalar Toeplitz case. In Section 3 we consider the Toeplitz-block-Toeplitz case.
Let R = K[x]. For n in N, we denote by K[x]_n the vector space of polynomials of degree <= n. Let L = K[x, x^{-1}] be the set of Laurent polynomials in the variable x. For any polynomial p = sum_{i=-m}^{n} p_i x^i in L, we denote by p^+ the sum of the terms with positive exponents: p^+ = sum_{i=0}^{n} p_i x^i, and by p^- the sum of the terms with strictly negative exponents: p^- = sum_{i=-m}^{-1} p_i x^i. We have p = p^+ + p^-. For n in N, we denote by U_n = {w; w^n = 1} the set of roots of unity of order n.
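The superfast Toeplitz-times-vector product mentioned above reduces to univariate polynomial multiplication; a generic FFT sketch (an illustration of the principle, not the algorithm of [2, 12]) embeds T into a circulant of order 2n:

```python
import numpy as np

def toeplitz_matvec(t, u):
    # Multiply the n x n Toeplitz matrix T[i, j] = t[i - j + n - 1] by u via one
    # circular convolution of length 2n (FFT evaluation-interpolation),
    # i.e. O(n log n) work instead of O(n^2).
    n = u.size
    # first column of T, a padding slot, then the wrapped-around first row:
    # this embeds T in the leading n x n corner of a 2n x 2n circulant
    c = np.concatenate([t[n - 1:], [0.0], t[:n - 1]])
    w = np.fft.ifft(np.fft.fft(c) * np.fft.fft(np.concatenate([u, np.zeros(n)])))
    return w[:n].real

t = np.arange(-3.0, 4.0)              # t_{-3}, ..., t_3 for n = 4
u = np.array([1.0, 2.0, 0.0, -1.0])
T = np.array([[t[i - j + 3] for j in range(4)] for i in range(4)])
out = toeplitz_matvec(t, u)
```

The circulant embedding is exactly the evaluation-interpolation scheme at the roots of unity U_{2n} used throughout this paper.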

2 Univariate case

We begin with the univariate case and the following problem:

Problem 1. Given a Toeplitz matrix T = (t_{i-j})_{i,j=0}^{n-1} in K^{n x n} of size n and g = (g_0, ..., g_{n-1}) in K^n, find u = (u_0, ..., u_{n-1}) in K^n such that

    T u = g.    (1)

Let E = {1, ..., x^{n-1}}, and let Pi_E be the projection of R on the vector space generated by E, along <x^n, x^{n+1}, ...>.

Definition 1. We define the following polynomials:
- T(x) = sum_{i=-n+1}^{n-1} t_i x^i,
- T~(x) = sum_{i=0}^{2n-1} t~_i x^i, with t~_i = t_i if i < n and t~_i = t_{i-2n} if i >= n,
- u(x) = sum_{i=0}^{n-1} u_i x^i, g(x) = sum_{i=0}^{n-1} g_i x^i.

Notice that T~ = T^+ + x^{2n} T^- and T(w) = T~(w) if w in U_{2n}. We also have (see [8])

    T u = g  <=>  Pi_E(T(x) u(x)) = g(x).
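The equivalence T u = g <=> Pi_E(T(x)u(x)) = g(x) is easy to verify numerically, since the i-th entry of T u is exactly the coefficient of x^i in T(x)u(x); a small sketch with random data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
t = rng.standard_normal(2 * n - 1)     # t[k] holds t_{k-(n-1)}, k = 0..2n-2
T = np.array([[t[i - j + n - 1] for j in range(n)] for i in range(n)])
u = rng.standard_normal(n)

# T(x)u(x) is the convolution of the coefficient sequences; exponent e sits
# at index e + n - 1, so Pi_E keeps indices n-1 .. 2n-2 (exponents 0 .. n-1)
prod = np.convolve(t, u)
proj = prod[n - 1:2 * n - 1]
```

Here `proj` coincides with T @ u, which is the matrix side of the equivalence.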

For any polynomial u in K[x] of degree d, we write u(x) = u_0(x) + x^n u_1(x), with deg(u_0) <= n - 1, and deg(u_1) <= d - n if d >= n and u_1 = 0 otherwise. Then we have

    T(x) u(x) = T(x) u_0(x) + T(x) x^n u_1(x)
              = Pi_E(T(x) u_0(x)) + Pi_E(T(x) x^n u_1(x))
                + (a_{-n+1} x^{-n+1} + ... + a_{-1} x^{-1}) + (a_n x^n + ... + a_{n+m} x^{n+m})
              = Pi_E(T(x) u_0(x)) + Pi_E(T(x) x^n u_1(x)) + x^{-n+1} A(x) + x^n B(x),    (2)

with m = max(n - 2, d - 1) and

    A(x) = a_{-n+1} + ... + a_{-1} x^{n-2},    B(x) = a_n + ... + a_{n+m} x^m.    (3)


See [8] for more details on the correlation between structured matrices and (multivariate) polynomials.

2.1 Moving lines and Toeplitz matrices

We consider here another problem, related to interesting questions in Effective Algebraic Geometry.

Problem 2. Given three polynomials a, b, c in R, respectively of degree < l, < m, < n, find three polynomials p, q, r in R of degree < nu - l, < nu - m, < nu - n, such that

    a(x) p(x) + b(x) q(x) + c(x) r(x) = 0.    (4)

We denote by L(a, b, c) the set of (p, q, r) in K[x]^3 which are solutions of (4). It is a K[x]-submodule of K[x]^3. The solutions of Problem 2 are L(a, b, c) intersected with K[x]_{nu-l-1} x K[x]_{nu-m-1} x K[x]_{nu-n-1}. Given a new polynomial d(x) in K[x], we denote by L(a, b, c; d) the set of (p, q, r) in K[x]^3 such that a(x) p(x) + b(x) q(x) + c(x) r(x) = d(x).

Theorem 1. For any non-zero vector of polynomials (a, b, c) in K[x]^3, the K[x]-module L(a, b, c) is free of rank 2.

Proof. By Hilbert's syzygy theorem, the ideal I generated by (a, b, c) has a free resolution of length at most 1, that is, of the form

    0 -> K[x]^p -> K[x]^3 -> K[x] -> K[x]/I -> 0.

As I != 0, for dimensional reasons, we must have p = 2.

Definition 2. A mu-basis of L(a, b, c) is a basis (p, q, r), (p', q', r') of L(a, b, c), with (p, q, r) of minimal degree mu.

Notice that if mu_1 is the smallest degree of a generator and mu_2 the degree of the second generator (p', q', r'), we have d = max(deg(a), deg(b), deg(c)) = mu_1 + mu_2. Indeed, we have

    0 -> K[x]_{nu-d-mu_1} (+) K[x]_{nu-d-mu_2} -> K[x]^3_{nu-d} -> K[x]_nu -> K[x]_nu/(a, b, c)_nu -> 0

for nu >> 0. As the alternating sum of the dimensions of the K-vector spaces is zero and K[x]_nu/(a, b, c)_nu = 0 for nu >> 0, we have

    0 = 3(d - nu - 1) + (nu - mu_1 - d + 1) + (nu - mu_2 - d + 1) + (nu + 1) = d - mu_1 - mu_2.

For L(T~(x), x^n, x^{2n} - 1), we have mu_1 + mu_2 = 2n. We are now going to show that in fact mu_1 = mu_2 = n:


Proposition 1. The K[x]-module L(T~(x), x^n, x^{2n} - 1) has an n-basis.

Proof. Consider the map

    K[x]^3_{n-1} -> K[x]_{3n-1},  (p(x), q(x), r(x)) |-> T~(x) p(x) + x^n q(x) + (x^{2n} - 1) r(x),    (5)

whose 3n x 3n matrix is of the form

    S := ( T_0   0    -I_n
           T_1   I_n   0
           T_2   0     I_n ),    (6)

where T_0, T_1, T_2 are the coefficient matrices of (T~(x), x T~(x), ..., x^{n-1} T~(x)), respectively for the lists of monomials (1, ..., x^{n-1}), (x^n, ..., x^{2n-1}), (x^{2n}, ..., x^{3n-1}). Notice in particular that T = T_0 + T_2.
Reducing the first rows of (T_0 | 0 | -I_n) by the last rows (T_2 | 0 | I_n), we replace them by the block (T_0 + T_2 | 0 | 0), without changing the rank of S. As T = T_0 + T_2 is invertible, this shows that the matrix S is of rank 3n. Therefore, there is no syzygy in degree n - 1. As the sum mu_1 + mu_2 = 2n with mu_1 <= n and mu_2 <= n, where mu_1, mu_2 are the smallest degrees of a pair of generators of L(T~(x), x^n, x^{2n} - 1) of degree <= n, we have mu_1 = mu_2 = n. Thus there exist two linearly independent syzygies (u_1, v_1, w_1), (u_2, v_2, w_2) of degree n, which generate L(T~(x), x^n, x^{2n} - 1).
A similar result can also be found in [12], but the proof, much longer than this one, is based on interpolation techniques and explicit computations.

Let us now describe how to construct explicitly two generators of L(T~(x), x^n, x^{2n} - 1) of degree n (see also [12]). As T~(x) is of degree <= 2n - 1 and the map (5) is surjective, there exists (u, v, w) in K[x]^3_{n-1} such that

    T~(x) u(x) + x^n v(x) + (x^{2n} - 1) w(x) = T~(x) x^n,    (7)

and we deduce that (u_1, v_1, w_1) = (x^n - u, -v, -w) in L(T~(x), x^n, x^{2n} - 1). As there exists (u', v', w') in K[x]^3_{n-1} such that

    T~(x) u'(x) + x^n v'(x) + (x^{2n} - 1) w'(x) = 1 = x^n x^n - (x^{2n} - 1),    (8)

we deduce that (u_2, v_2, w_2) = (-u', x^n - v', -w' - 1) in L(T~(x), x^n, x^{2n} - 1). The vectors (u_1, v_1, w_1), (u_2, v_2, w_2) of L(T~(x), x^n, x^{2n} - 1) are linearly independent since, by construction, the coefficient vectors of x^n in (u_1, v_1, w_1) and (u_2, v_2, w_2) are respectively (1, 0, 0) and (0, 1, 0).

Proposition 2. The vector u is a solution of (1) if and only if there exist v(x) in K[x]_{n-1} and w(x) in K[x]_{n-1} such that

    (u(x), v(x), w(x)) in L(T~(x), x^n, x^{2n} - 1; g(x)).
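The matrix S of (6) and the row-reduction step used in the proof of Proposition 1 are easy to verify numerically; the following sketch (illustrative code, with randomly chosen coefficients of T~) builds S column by column:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
tt = rng.standard_normal(2 * n)        # coefficients of T~(x), degree <= 2n-1

S = np.zeros((3 * n, 3 * n))
for j in range(n):
    S[j:j + 2 * n, j] = tt             # column of x^j * T~(x)
    S[n + j, n + j] = 1.0              # column of x^{n+j}
    S[2 * n + j, 2 * n + j] = 1.0      # column of x^{2n+j} - x^j ...
    S[j, 2 * n + j] = -1.0             # ... with the -x^j part on top

T0, T2 = S[:n, :n], S[2 * n:, :n]
# adding the last n rows to the first n rows yields (T0 + T2 | 0 | 0),
# and T0 + T2 is exactly the wrapped Toeplitz matrix T of Definition 1
reduced = S[:n] + S[2 * n:]
```

The reduction leaves only the block T0 + T2 in the first rows, so S is invertible exactly when T is: this is the "no syzygy in degree n - 1" argument.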


Proof. The vector u is a solution of (1) if and only if we have Pi_E(T(x) u(x)) = g(x). As u(x) is of degree <= n - 1, we deduce from (2) and (3) that there exist polynomials A(x) in K[x]_{n-2} and B(x) in K[x]_{n-1} such that

    T(x) u(x) - x^{-n+1} A(x) - x^n B(x) = g(x).

By evaluation at the roots w in U_{2n}, and since w^{-n} = w^n and T~(w) = T(w) for w in U_{2n}, we have

    T~(w) u(w) + w^n v(w) = g(w)  for all w in U_{2n},

with v(x) = -x A(x) - B(x) of degree <= n - 1. We deduce that there exists w(x) in K[x] such that

    T~(x) u(x) + x^n v(x) + (x^{2n} - 1) w(x) = g(x).

Notice that w(x) is of degree <= n - 1, because (x^{2n} - 1) w(x) is of degree <= 3n - 1. Conversely, a solution (u(x), v(x), w(x)) in L(T~(x), x^n, x^{2n} - 1; g(x)) intersected with K[x]^3_{n-1} yields a solution (u, v, w) in K^{3n} of the linear system

    S (u, v, w)^T = (g, 0, 0)^T,

where S has the block structure (6), so that T_2 u + w = 0 and T_0 u - w = (T_0 + T_2) u = g. As we have T_0 + T_2 = T, the vector u is a solution of (1), which ends the proof of the proposition.

2.2 Euclidean division

As a consequence of Proposition 1, we have the following property:

Proposition 3. Let {(u_1, v_1, w_1), (u_2, v_2, w_2)} be an n-basis of L(T~(x), x^n, x^{2n} - 1). The remainder of the division of (0, x^n g, -g)^T by the matrix

    ( u_1  u_2
      v_1  v_2
      w_1  w_2 )

is the vector solution given in Proposition 2.

Proof. The vector (0, x^n g, -g)^T belongs to L(T~(x), x^n, x^{2n} - 1; g) (a particular solution). We divide it by the above matrix and obtain

    (u, v, w)^T = (0, x^n g, -g)^T - ( u_1 u_2 ; v_1 v_2 ; w_1 w_2 ) (p, q)^T.

Then (u, v, w) is the remainder of the division, thus (u, v, w) belongs to K[x]^3_{n-1} intersected with L(T~(x), x^n, x^{2n} - 1; g). Moreover, (u, v, w) is the unique such vector: if there were another one, their difference would lie in L(T~(x), x^n, x^{2n} - 1) intersected with K[x]^3_{n-1}, which is equal to {(0, 0, 0)}.

Problem 3. Given a 2 x 2 matrix of polynomials

    B(x) = ( e(x)  e'(x)
             f(x)  f'(x) )

of degree n, and a vector of polynomials (p(x), q(x))^T of degree m >= n, such that the leading-coefficient matrix ( e_n e'_n ; f_n f'_n ) is invertible, find the remainder of the division of (p(x), q(x))^T by B(x).

Proposition 4. The first coordinate of the remainder vector of the division of (0, x^n g)^T by ( u_1 u_2 ; v_1 v_2 ) is the polynomial u(x), the solution of (1).

We describe here a generalized Euclidean division algorithm to solve Problem 3. Let

    E(x) = (p(x), q(x))^T of degree m,    B(x) = ( e(x) e'(x) ; f(x) f'(x) ) of degree n <= m,

and write E(x) = B(x) Q(x) + R(x) with deg(R(x)) < n and deg(Q(x)) <= m - n. Let z = 1/x. Then

    E(x) = B(x) Q(x) + R(x)
    <=>  E(1/z) = B(1/z) Q(1/z) + R(1/z)
    <=>  z^m E(1/z) = z^n B(1/z) z^{m-n} Q(1/z) + z^{m-n+1} z^{n-1} R(1/z)
    <=>  E^(z) = B^(z) Q^(z) + z^{m-n+1} R^(z),    (9)

where E^(z), B^(z), Q^(z), R^(z) are obtained from E(x), B(x), Q(x), R(x) by reversing the order of the coefficients. Hence

    (9)  =>  E^(z) B^(z)^{-1} = Q^(z) + z^{m-n+1} R^(z) B^(z)^{-1}
         =>  Q^(z) = E^(z) B^(z)^{-1} mod z^{m-n+1}.

B^(z)^{-1} exists as a power series in z because its constant term, the leading-coefficient matrix of B(x), is invertible. Thus Q^(z) is obtained by computing the first m - n + 1 coefficients of E^(z) B^(z)^{-1}.
To find W(x) = B^(x)^{-1} we use Newton's iteration. Let f(W) = B^ - W^{-1}; then f(W_l) = B^ - W_l^{-1} and f'(W_l)(W_{l+1} - W_l) = W_l^{-1}(W_{l+1} - W_l)W_l^{-1} = -f(W_l), which gives

    W_{l+1} = 2 W_l - W_l B^ W_l,

with W_0 = B^_0^{-1}, which exists. Moreover,

    W - W_{l+1} = W - 2 W_l + W_l B^ W_l = W (I_2 - B^ W_l)^2 = (W - W_l) B^ (W - W_l).

Thus W_l(x) = W(x) mod x^{2^l} for l = 0, ..., ceil(log(m - n + 1)).

Proposition 5. We need O(n log(n) log(m - n) + m log m) arithmetic operations to solve Problem 3.

Proof. We must do ceil(log(m - n + 1)) Newton iterations to obtain the first m - n + 1 coefficients of 1/B^ = W(x). Each iteration requires O(n log n) arithmetic operations (multiplication of polynomials of degree n), and then we need O(m log m) arithmetic operations to do the multiplication E^ (1/B^).
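In the scalar case, the reversal-plus-Newton scheme of this section fits in a few lines; the sketch below (hypothetical helper names, coefficients listed by increasing degree, divisor with invertible leading coefficient) computes quotient and remainder through a truncated series inverse:

```python
import numpy as np

def series_inverse(b, k):
    # first k coefficients of 1/b(z) by Newton's iteration
    # W_{l+1} = 2 W_l - W_l b W_l; requires b[0] != 0
    b = np.asarray(b, float)
    w = np.array([1.0 / b[0]])
    while w.size < k:
        m = min(2 * w.size, k)
        bw = np.convolve(b[:m], w)[:m]
        prod = np.convolve(w, bw)[:m]
        prod = np.pad(prod, (0, m - prod.size))
        w = 2.0 * np.pad(w, (0, m - w.size)) - prod
    return w[:k]

def poly_divmod(e, b):
    # e(x) = b(x) q(x) + r(x), deg r < deg b, via coefficient reversal:
    # reversed(q) = reversed(e) * series_inverse(reversed(b)) mod z^{m-n+1}
    e, b = np.asarray(e, float), np.asarray(b, float)
    m, n = e.size - 1, b.size - 1          # degrees (n >= 1 assumed)
    eh, bh = e[::-1], b[::-1]
    qh = np.convolve(eh, series_inverse(bh, m - n + 1))[:m - n + 1]
    q = qh[::-1]
    r = e - np.convolve(b, q)
    return q, r[:n]                        # remainder has degree < n

q, r = poly_divmod([7.0, 2.0, 2.0, 3.0], [1.0, 0.0, 1.0])  # e = (1+x^2)(2+3x) + (5-x)
```

The 2 x 2 matrix version of the text replaces the scalar reciprocal by the matrix Newton iteration W_{l+1} = 2 W_l - W_l B^ W_l.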

2.3 Construction of the generators

The canonical basis of K[x]^3 is denoted by sigma_1, sigma_2, sigma_3. Let rho_1, rho_2 be the generators of L(T~(x), x^n, x^{2n} - 1) of degree n given by

    rho_1 = x^n sigma_1 - (u, v, w) = (u_1, v_1, w_1),
    rho_2 = x^n sigma_2 - (u', v', w') = (u_2, v_2, w_2),    (10)

where (u, v, w) and (u', v', w') are the vectors given in (7) and (8). We describe here how to compute (u_1, v_1, w_1) and (u_2, v_2, w_2); we give two methods, the second one being the method given in [12].
The first one uses the Euclidean gcd algorithm. Let us first recall the algebraic and computational properties of the well-known extended Euclidean algorithm (see [13]). Given two polynomials p(x), p'(x) of degree m and m' respectively, let

    r_0 = p, s_0 = 1, t_0 = 0,    r_1 = p', s_1 = 0, t_1 = 1,

and define

    r_{i+1} = r_{i-1} - q_i r_i,  s_{i+1} = s_{i-1} - q_i s_i,  t_{i+1} = t_{i-1} - q_i t_i,

where q_i results when the division algorithm is applied to r_{i-1} and r_i, i.e. r_{i-1} = q_i r_i + r_{i+1} with deg r_{i+1} < deg r_i, for i = 1, ..., l, where l is such that r_l = 0; therefore r_{l-1} = gcd(p(x), p'(x)).


Proposition 6. The following relations hold:

    s_i p + t_i p' = r_i  and  (s_i, t_i) = 1,  for i = 1, ..., l,
    deg r_{i+1} < deg r_i,  deg s_{i+1} > deg s_i  and  deg t_{i+1} > deg t_i,  for i = 1, ..., l - 1,
    deg s_{i+1} = deg(q_i s_i) = deg p' - deg r_i,  deg t_{i+1} = deg(q_i t_i) = deg p - deg r_i.
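The recurrences above, together with the invariant s_i p + t_i p' = r_i of Proposition 6, can be sketched as follows (numpy's decreasing-power convention, illustrative tolerances):

```python
import numpy as np

def trim(a):
    # strip (near-)zero leading coefficients; decreasing-power convention
    a = np.asarray(a, float)
    nz = np.flatnonzero(np.abs(a) > 1e-9)
    return a[nz[0]:] if nz.size else np.zeros(1)

def padd(a, b):
    k = max(a.size, b.size)
    return np.pad(a, (k - a.size, 0)) + np.pad(b, (k - b.size, 0))

def extended_euclid(p, pp):
    # r0 = p, s0 = 1, t0 = 0;  r1 = p', s1 = 0, t1 = 1;
    # r_{i+1} = r_{i-1} - q_i r_i, and the same recurrences for s_i, t_i
    R = [trim(p), trim(pp)]
    S = [np.array([1.0]), np.array([0.0])]
    T = [np.array([0.0]), np.array([1.0])]
    while np.max(np.abs(R[-1])) > 1e-9:
        q, r = np.polydiv(R[-2], R[-1])
        R.append(trim(r))
        S.append(trim(padd(S[-2], -np.convolve(q, S[-1]))))
        T.append(trim(padd(T[-2], -np.convolve(q, T[-1]))))
    return R, S, T

p  = np.poly([1.0, 2.0, 3.0])     # (x-1)(x-2)(x-3)
pp = np.poly([1.0, -4.0])         # (x-1)(x+4)
R, S, T = extended_euclid(p, pp)
gcd = R[-2] / R[-2][0]            # last non-zero remainder, made monic
```

Stopping at intermediate degrees instead of running to the gcd is exactly how Proposition 7 below extracts the generators.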

Proposition 7. By applying the extended Euclidean algorithm to p(x) = x^{n-1} T(x) and p'(x) = x^{2n-1}, down to remainders of degree n - 1 and n - 2, we obtain rho_1 and rho_2 respectively.

Proof. We saw that T u = g if and only if there exist A(x) and B(x) such that

    p(x) u(x) + x^{2n-1} B(x) = x^{n-1} g(x) + A(x),

with p(x) = x^{n-1} T(x) a polynomial of degree <= 2n - 2. In (7) and (8) we saw that for g(x) = 1 (g = e_1) and for g(x) = x^n T(x) (g = (0, t_{-n+1}, ..., t_{-1})^T) we obtain a basis of L(T~(x), x^n, x^{2n} - 1). Now T u_1 = e_1 if and only if there exist A_1(x), B_1(x) such that

    p(x) u_1(x) + x^{2n-1} B_1(x) = x^{n-1} + A_1(x),    (11)

and T u_2 = (0, t_{-n+1}, ..., t_{-1})^T if and only if there exist A_2(x), B_2(x) such that

    p(x) (u_2(x) + x^n) + x^{2n-1} B_2(x) = A_2(x),    (12)

with deg A_1(x) <= n - 2 and deg A_2(x) <= n - 2. Thus, by applying the extended Euclidean algorithm to p(x) = x^{n-1} T(x) and p'(x) = x^{2n-1} until deg r_l(x) = n - 1 and deg r_{l+1}(x) = n - 2, we obtain

    u_1(x) = (1/c_1) s_l(x),   B_1(x) = (1/c_1) t_l(x),   x^{n-1} + A_1(x) = (1/c_1) r_l(x),

and

    x^n + u_2(x) = (1/c_2) s_{l+1}(x),   B_2(x) = (1/c_2) t_{l+1}(x),   A_2(x) = (1/c_2) r_{l+1}(x),


where c_1 and c_2 are the leading coefficients of r_l(x) and s_{l+1}(x), respectively. In fact, equation (11), written coefficient-wise in the monomial basis (1, x, ..., x^{3n-2}), is a square linear system in the coefficients of A_1, u_1 and B_1: one column block carries the Toeplitz band t_{-n+1}, ..., t_{n-1} (coming from the product p(x) u_1(x)), the adjacent block is a shifted identity (coming from x^{2n-1} B_1(x)), and the right-hand side is (A_1, 1, 0, ..., 0)^T. Since T is invertible, the (2n - 1) x (2n - 1) block at the bottom of this system is invertible, hence u_1 and B_1 are unique, and therefore u_1, B_1 and A_1 are unique.
By Proposition 6, deg r_l = n - 1 (r_l = c_1 (x^{n-1} + A_1(x))), so deg s_{l+1} = (2n - 1) - (n - 1) = n and deg t_{l+1} = (2n - 2) - (n - 1) = n - 1; thus, by the same proposition, deg s_l <= n - 1 and deg t_l <= n - 2. Therefore (1/c_1) s_l = u_1 and (1/c_1) t_l = B_1.
Finally, T u = e_1 if and only if there exist v(x), w(x) such that

    T~(x) u(x) + x^n v(x) + (x^{2n} - 1) w(x) = 1.    (13)

Since T~(x) = T^+ + x^{2n} T^- = T + (x^{2n} - 1) T^-, (13) is equivalent to

    T(x) u(x) + x^n v(x) + (x^{2n} - 1)(w(x) + T^-(x) u(x)) = 1.    (14)

On the other hand, T(x) u(x) - x^{-n+1} A_1(x) + x^n B_1(x) = 1, and x^{-n+1} A_1(x) = x^n (x A_1(x)) - x^{-n} (x^{2n} - 1) x A_1(x), thus

    T(x) u(x) + x^n (B_1(x) - x A_1(x)) + (x^{2n} - 1) x^{-n+1} A_1(x) = 1.    (15)

By comparing (14) and (15), and as 1 = x^n x^n - (x^{2n} - 1), we obtain the proposition, with w(x) = x^{-n+1} A_1(x) - T^-(x) u(x) + 1, which is the part of positive degree of -T^-(x) u(x) + 1.

Remark 1. A superfast Euclidean gcd algorithm, which uses no more than O(n log^2 n) operations, is given in [13], Chapter 11.

The second method to compute (u_1, v_1, w_1) and (u_2, v_2, w_2) is given in [12]. We are interested in computing the coefficients on sigma_1 and sigma_2; the coefficients on sigma_3 correspond to elements in the ideal (x^{2n} - 1) and thus can be obtained by reduction. We have

    (T~(x)  x^n) B(x) = (0  0)  mod x^{2n} - 1,   with B(x) = ( u(x) u'(x) ; v(x) v'(x) ) = ( x^n - u  -u' ; -v  x^n - v' ),

where, with a slight abuse of notation, the columns (u(x), v(x))^T and (u'(x), v'(x))^T denote the sigma_1, sigma_2 coordinates of rho_1 and rho_2.


A superfast algorithm to compute B(x) is given in [12]. Let us describe how to compute it. By evaluation of (10) at the roots w_j in U_{2n}, we deduce that (u(x) v(x))^T and (u'(x) v'(x))^T are the solutions of the following rational interpolation problem:

    T~(w_j) u(w_j) + w_j^n v(w_j) = 0,   with u_n = 1, v_n = 0,
    T~(w_j) u'(w_j) + w_j^n v'(w_j) = 0,  with u'_n = 0, v'_n = 1.

Definition 3. The tau-degree of a vector polynomial w(x) = (w_1(x) w_2(x))^T is defined as

    tau-deg w(x) := max{deg w_1(x), deg w_2(x) - tau}.

B(x) is an n-reduced basis of the module of all vector polynomials r(x) in K[x]^2 that satisfy the interpolation conditions f_j^T r(w_j) = 0, j = 0, ..., 2n - 1, with f_j = (T~(w_j), w_j^n)^T. B(x) is called a tau-reduced basis (with tau = n) that corresponds to the interpolation data (w_j, f_j), j = 0, ..., 2n - 1.

Definition 4. A set of vector polynomials in K[x]^2 is called tau-reduced if their tau-highest degree coefficients are linearly independent.

Theorem 2. Let tau = n. Suppose J is a positive integer. Let sigma_1, ..., sigma_J in K and phi_1, ..., phi_J in K^2, all different from (0 0)^T. Let 1 <= j <= J and tau_J in Z. Suppose that B_j(x) in K[x]^{2x2} is a tau_J-reduced basis matrix with basis vectors having tau_J-degree delta_1 and delta_2, respectively, corresponding to the interpolation data {(sigma_i, phi_i); i = 1, ..., j}. Let tau_{j->J} := delta_1 - delta_2. Let B_{j->J}(x) be a tau_{j->J}-reduced basis matrix corresponding to the interpolation data {(sigma_i, B_j^T(sigma_i) phi_i); i = j + 1, ..., J}. Then B_J(x) := B_j(x) B_{j->J}(x) is a tau_J-reduced basis matrix corresponding to the interpolation data {(sigma_i, phi_i); i = 1, ..., J}.

Proof. For the proof, see [12].

When we apply this theorem with the w_j in U_{2n} as interpolation points, we obtain a superfast algorithm (O(n log^2 n)) which computes B(x) [12]. We consider the two following problems:

3 Bivariate case

Let m in N, n in N. In this section we denote by E = {(i, j); 0 <= i <= m - 1, 0 <= j <= n - 1}, and R = K[x, y]. We denote by K[x, y]_{m,n} the vector space of bivariate polynomials of degree <= m in x and <= n in y.

Notation. For a block matrix M of block size n, each block being of size m, we will use the following indexation:

    M = (M_{(i_1,i_2),(j_1,j_2)})_{0<=i_1,j_1<=m-1; 0<=i_2,j_2<=n-1} = (M_{alpha beta})_{alpha,beta in E},    (16)

where (i_2, j_2) gives the block's position and (i_1, j_1) the position inside the block.

Problem 4. Given a Toeplitz-block-Toeplitz matrix T = (t_{alpha-beta})_{alpha,beta in E} in K^{mn x mn} (that is, T = (T_{alpha beta})_{alpha,beta in E} with T_{alpha beta} = t_{alpha-beta}) of size mn and g = (g_alpha)_{alpha in E} in K^{mn}, find u = (u_alpha)_{alpha in E} such that

    T u = g.    (17)

Definition 5. We define the following polynomials:
\[
T(x, y) := \sum_{(i,j)\in E-E} t_{i,j}\, x^i y^j, \qquad
\tilde T(x, y) := \sum_{i,j=0}^{2m-1,\,2n-1} \tilde t_{i,j}\, x^i y^j
\]
with
\[
\tilde t_{i,j} := \begin{cases}
t_{i,j} & \text{if } i < m,\ j < n,\\
t_{i-2m,j} & \text{if } i \ge m,\ j < n,\\
t_{i,j-2n} & \text{if } i < m,\ j \ge n,\\
t_{i-2m,j-2n} & \text{if } i \ge m,\ j \ge n,
\end{cases}
\]
and
\[
u(x, y) := \sum_{(i,j)\in E} u_{i,j}\, x^i y^j, \qquad g(x, y) := \sum_{(i,j)\in E} g_{i,j}\, x^i y^j.
\]

3.1 Moving hyperplanes

For any non-zero vector of polynomials a = (a_1, ..., a_n) ∈ K[x, y]^n, we denote by L(a) the set of vectors (h_1, ..., h_n) ∈ K[x, y]^n such that
\[
\sum_{i=1}^{n} a_i h_i = 0. \tag{18}
\]
It is a K[x, y]-submodule of K[x, y]^n.

Proposition 8. The vector u is a solution of (17) if and only if there exist h_2, ..., h_9 ∈ K[x, y]^{m−1}_{n−1} such that (u(x, y), h_2(x, y), ..., h_9(x, y)) belongs to
\[
L\bigl(\tilde T(x, y),\ x^m,\ x^{2m}-1,\ y^n,\ x^m y^n,\ (x^{2m}-1)\,y^n,\ y^{2n}-1,\ x^m (y^{2n}-1),\ (x^{2m}-1)(y^{2n}-1)\bigr).
\]

H. Khalil, B. Mourrain, M. Schatzman

Proof. Let L = {x^{α_1} y^{α_2}; 0 ≤ α_1 ≤ m − 1, 0 ≤ α_2 ≤ n − 1}, and let Π_E be the projection of R onto the vector space generated by L. By [8], we have
\[
T u = g \iff \Pi_E\bigl(T(x, y)\, u(x, y)\bigr) = g(x, y), \tag{19}
\]
which implies that
\[
\begin{aligned}
T(x, y)\,u(x, y) = {}& g(x, y) + x^m y^n A_1(x, y) + x^m y^{-n} A_2(x, y) \\
& + x^{-m} y^n A_3(x, y) + x^{-m} y^{-n} A_4(x, y) \\
& + x^m A_5(x, y) + x^{-m} A_6(x, y) + y^n A_7(x, y) + y^{-n} A_8(x, y),
\end{aligned} \tag{20}
\]
where the A_i(x, y) are polynomials of degree at most m − 1 in x and n − 1 in y. Since ω^m = ω^{−m}, υ^n = υ^{−n} and T̃(ω, υ) = T(ω, υ) for ω ∈ U_{2m}, υ ∈ U_{2n}, we deduce by evaluation at the roots ω ∈ U_{2m}, υ ∈ U_{2n} that
\[
R(x, y) := \tilde T(x, y)\,u(x, y) + x^m h_2(x, y) + y^n h_4(x, y) + x^m y^n h_5(x, y) - g(x, y) \in (x^{2m}-1,\ y^{2n}-1)
\]

with h_2 = −(A_5 + A_6), h_4 = −(A_7 + A_8), h_5 = −(A_1(x, y) + A_2(x, y) + A_3(x, y) + A_4(x, y)). By reduction by the polynomials x^{2m} − 1, y^{2n} − 1, and as R(x, y) is of degree ≤ 3m − 1 in x and ≤ 3n − 1 in y, there exist h_3(x, y), h_6(x, y), ..., h_9(x, y) ∈ K[x, y]^{m−1}_{n−1} such that
\[
\begin{aligned}
\tilde T(x, y)\,u(x, y) &+ x^m h_2(x, y) + (x^{2m}-1)\,h_3(x, y) + y^n h_4(x, y) + x^m y^n h_5(x, y) \\
&+ (x^{2m}-1)\,y^n h_6(x, y) + (y^{2n}-1)\,h_7(x, y) \\
&+ x^m (y^{2n}-1)\,h_8(x, y) + (x^{2m}-1)(y^{2n}-1)\,h_9(x, y) = g(x, y).
\end{aligned} \tag{21}
\]
Conversely, a solution of (21) can be transformed into a solution of (20), which ends the proof of the proposition.

In the following, we denote by T the vector
\[
T = \bigl(\tilde T(x, y),\ x^m,\ x^{2m}-1,\ y^n,\ x^m y^n,\ (x^{2m}-1)\,y^n,\ y^{2n}-1,\ x^m (y^{2n}-1),\ (x^{2m}-1)(y^{2n}-1)\bigr).
\]
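The correspondence (19) between the matrix-vector product T u and coefficient extraction from the bivariate polynomial product T(x, y) u(x, y) can be checked numerically. The sketch below is our own illustration (not from the paper), using NumPy and a hypothetical column-major flattening α ↦ i_1 + m·i_2 of the index set E:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4  # in-block size m (x-degree), number of blocks n (y-degree)

# Coefficients t_{i,j} for (i, j) in E - E, stored with offset (m-1, n-1).
t = rng.standard_normal((2 * m - 1, 2 * n - 1))
u = rng.standard_normal((m, n))  # u_{i,j}, (i, j) in E

# Toeplitz-block-Toeplitz matrix T[(i1,i2),(j1,j2)] = t_{i1-j1, i2-j2},
# flattened column-major: index = i1 + m * i2.
idx = [(i1, i2) for i2 in range(n) for i1 in range(m)]
T = np.array([[t[a1 - b1 + m - 1, a2 - b2 + n - 1] for (b1, b2) in idx]
              for (a1, a2) in idx])
g = T @ u.flatten(order="F")

# Polynomial side: coefficients of T(x, y) * u(x, y) via full 2-D convolution.
prod = np.zeros((3 * m - 2, 3 * n - 2))
for i in range(m):
    for j in range(n):
        prod[i:i + 2 * m - 1, j:j + 2 * n - 1] += u[i, j] * t
# Projection Pi_E keeps the coefficients of x^i y^j with (i, j) in E;
# the x^i y^j coefficient of the product sits at offset (i + m - 1, j + n - 1).
g_poly = prod[m - 1:2 * m - 1, n - 1:2 * n - 1].flatten(order="F")

assert np.allclose(g, g_poly)  # Tu = g  <=>  Pi_E(T(x,y) u(x,y)) = g(x,y)
```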

Proposition 9. There is no non-zero element of K[x, y]^{m−1}_{n−1} in L(T).

Proof. We consider the map
\[
K[x, y]^{9}_{m-1,n-1} \to K[x, y]_{3m-1,3n-1}, \qquad
p(x, y) = (p_1(x, y), \dots, p_9(x, y)) \mapsto T \cdot p, \tag{22}
\]


whose 9mn × 9mn matrix is of the form
\[
S := \left(\begin{array}{c|ccccc}
T_0 & E_{21} & -E_{11}+E_{31} & -E_{11} & -E_{21} & E_{11}-E_{31}\\
\vdots & \ddots & \ddots & \ddots & \ddots & \ddots\\
 & E_{2n} & -E_{1n}+E_{3n} & -E_{1n} & -E_{2n} & E_{1n}-E_{3n}\\
\hline
T_1 & E_{11} & E_{21} & -E_{11}+E_{31} & & \\
\vdots & & \ddots & \ddots & \ddots & \\
 & & E_{1n} & E_{2n} & -E_{1n}+E_{3n} & \\
\hline
T_2 & E_{11} & E_{21} & -E_{11}+E_{31} & & \\
\vdots & & \ddots & \ddots & \ddots & \\
 & & E_{1n} & E_{2n} & -E_{1n}+E_{3n} &
\end{array}\right) \tag{24}
\]
where E_{ij} is the 3m × mn matrix e_{ij} ⊗ I_m and e_{ij} is the 3 × n matrix whose entries all equal zero except the (i, j)-th entry, which equals 1, and where the matrix (T_0; T_1; T_2) is the following 9mn × mn block matrix, built from the 3m × m Toeplitz blocks t_i:
\[
\begin{pmatrix} T_0 \\ T_1 \\ T_2 \end{pmatrix} =
\begin{pmatrix}
t_0 & 0 & \cdots & 0\\
t_1 & t_0 & \ddots & \vdots\\
\vdots & \ddots & \ddots & 0\\
t_{n-1} & \cdots & t_1 & t_0\\
0 & t_{n-1} & & t_1\\
t_{-n+1} & 0 & \ddots & \vdots\\
\vdots & \ddots & \ddots & \\
t_{-1} & \cdots & t_{-n+1} & 0\\
0 & t_{-1} & \ddots & t_{-n+1}\\
\vdots & \ddots & \ddots & \vdots\\
0 & \cdots & 0 & t_{-1}
\end{pmatrix},
\qquad
t_i =
\begin{pmatrix}
t_{i,0} & 0 & \cdots & 0\\
t_{i,1} & t_{i,0} & \ddots & \vdots\\
\vdots & \ddots & \ddots & 0\\
t_{i,m-1} & \cdots & t_{i,1} & t_{i,0}\\
0 & t_{i,m-1} & & t_{i,1}\\
t_{i,-m+1} & 0 & \ddots & \vdots\\
\vdots & \ddots & \ddots & \\
t_{i,-1} & \cdots & t_{i,-m+1} & 0\\
0 & t_{i,-1} & \ddots & t_{i,-m+1}\\
\vdots & \ddots & \ddots & \vdots\\
0 & \cdots & 0 & t_{i,-1}
\end{pmatrix}.
\]

For the same reasons as in the proof of Proposition 1, the matrix S is invertible.

Theorem 3. For any non-zero vector of polynomials a = (a_i)_{i=1,...,n} ∈ K[x, y]^n, the K[x, y]-module L(a_1, ..., a_n) is free of rank n − 1.

Proof. Consider first the case where the a_i are monomials, a_i = x^{α_i} y^{β_i}, sorted in lexicographic order with x < y, a_1 being the biggest and a_n the smallest. Then the module of syzygies of a is generated by the S-polynomials
\[
S(a_i, a_j) = \operatorname{lcm}(a_i, a_j)\left(\frac{\sigma_i}{a_i} - \frac{\sigma_j}{a_j}\right),
\]


where (σ_i)_{i=1,...,n} is the canonical basis of K[x, y]^n [3]. We easily check that
\[
S(a_i, a_k) = \frac{\operatorname{lcm}(a_i, a_k)}{\operatorname{lcm}(a_i, a_j)}\, S(a_i, a_j) - \frac{\operatorname{lcm}(a_i, a_k)}{\operatorname{lcm}(a_j, a_k)}\, S(a_j, a_k)
\]
if i ≠ j ≠ k and lcm(a_i, a_j) divides lcm(a_i, a_k). Therefore L(a) is generated by the S(a_i, a_j) which are minimal for the division, that is, by the S(a_i, a_{i+1}) (for i = 1, ..., n − 1), since the monomials a_i are sorted lexicographically. As the syzygies S(a_i, a_{i+1}) involve the basis elements σ_i, σ_{i+1}, they are linearly independent over K[x, y], which shows that L(a) is a free module of rank n − 1 and that we have the following resolution:
\[
0 \to K[x, y]^{n-1} \to K[x, y]^{n} \to (a) \to 0.
\]

Suppose now that the a_i are general polynomials in K[x, y], and let us compute a Gröbner basis of the a_i, for a monomial ordering refining the degree [3]. We denote by m_1, ..., m_s the leading terms of the polynomials in this Gröbner basis, sorted in lexicographic order. The previous construction yields a resolution of (m_1, ..., m_s):
\[
0 \to K[x, y]^{s-1} \to K[x, y]^{s} \to (m_i)_{i=1,\dots,s} \to 0.
\]
Using [7] (or [3]), this resolution can be deformed into a resolution of (a), of the form
\[
0 \to K[x, y]^{p} \to K[x, y]^{n} \to (a) \to 0,
\]
which shows that L(a) is also a free module. Its rank p is necessarily equal to n − 1, since the alternating sum of the dimensions of the vector spaces of elements of degree ≤ ν in each module of this resolution must be 0 for every ν ∈ N.

3.2 Generators and reduction

In this section, we describe an explicit set of generators of L(T). The canonical basis of K[x, y]^9 is denoted by σ_1, ..., σ_9. First, as T̃(x, y) is of degree ≤ 2m − 1 in x and ≤ 2n − 1 in y and as the map (22) is surjective, there exist u_1, u_2 ∈ K[x, y]^9_{m−1,n−1} such that T · u_1 = T̃(x, y) x^m and T · u_2 = T̃(x, y) y^n. Thus,
\[
\rho_1 = x^m \sigma_1 - u_1 \in L(T), \qquad \rho_2 = y^n \sigma_1 - u_2 \in L(T).
\]
We also have u_3 ∈ K[x, y]^9_{m−1,n−1} such that T · u_3 = 1 = x^m x^m − (x^{2m} − 1) = y^n y^n − (y^{2n} − 1). We deduce that
\[
\rho_3 = x^m \sigma_2 - \sigma_3 - u_3 \in L(T), \qquad \rho_4 = y^n \sigma_4 - \sigma_7 - u_3 \in L(T).
\]


Finally, we have the obvious relations:
\[
\begin{aligned}
\rho_5 &= y^n \sigma_2 - \sigma_5 \in L(T), &
\rho_6 &= x^m \sigma_4 - \sigma_5 \in L(T),\\
\rho_7 &= x^m \sigma_5 - \sigma_6 - \sigma_4 \in L(T), &
\rho_8 &= y^n \sigma_5 - \sigma_8 - \sigma_2 \in L(T).
\end{aligned}
\]

Proposition 10. The relations ρ_1, ..., ρ_8 form a basis of L(T).
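These memberships can be checked mechanically with a few lines of sparse polynomial arithmetic over exponent dictionaries. The sketch below is our own illustration (hypothetical helper names, one small choice of m, n); since ρ_5, ..., ρ_8 do not involve σ_1, only the monomial generators T_2, ..., T_9 of the vector T are needed:

```python
# Polynomials as dictionaries {(i, j): coefficient} of x^i y^j.
def pmul(p, q):
    out = {}
    for (a, b), c in p.items():
        for (e, f), d in q.items():
            out[(a + e, b + f)] = out.get((a + e, b + f), 0) + c * d
    return out

def padd(*ps):
    out = {}
    for p in ps:
        for k, c in p.items():
            out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}

m, n = 3, 2
xm, yn = {(m, 0): 1}, {(0, n): 1}
T = {2: xm,                                      # x^m
     3: {(2 * m, 0): 1, (0, 0): -1},             # x^{2m} - 1
     4: yn,                                      # y^n
     5: {(m, n): 1},                             # x^m y^n
     6: {(2 * m, n): 1, (0, n): -1},             # (x^{2m} - 1) y^n
     7: {(0, 2 * n): 1, (0, 0): -1},             # y^{2n} - 1
     8: {(m, 2 * n): 1, (m, 0): -1},             # x^m (y^{2n} - 1)
     9: {(2 * m, 2 * n): 1, (2 * m, 0): -1,
         (0, 2 * n): -1, (0, 0): 1}}             # (x^{2m} - 1)(y^{2n} - 1)

def dot(rho):  # T . rho for rho = {sigma-index: polynomial coefficient}
    return padd(*[pmul(T[k], p) for k, p in rho.items()])

one = {(0, 0): 1}
rho5 = {2: yn, 5: {(0, 0): -1}}
rho6 = {4: xm, 5: {(0, 0): -1}}
rho7 = {5: xm, 6: {(0, 0): -1}, 4: {(0, 0): -1}}
rho8 = {5: yn, 8: {(0, 0): -1}, 2: {(0, 0): -1}}
assert all(dot(r) == {} for r in (rho5, rho6, rho7, rho8))
```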

Proof. Let h = (h_1, ..., h_9) ∈ L(T). By reduction by the previous elements of L(T), we can assume that the coefficients h_1, h_2, h_4, h_5 are in K[x, y]^{m−1}_{n−1}. Thus, T̃(x, y) h_1 + x^m h_2 + y^n h_4 + x^m y^n h_5 ∈ (x^{2m} − 1, y^{2n} − 1). As this polynomial is of degree ≤ 3m − 1 in x and ≤ 3n − 1 in y, by reduction by these polynomials we deduce that the coefficients h_3, h_6, ..., h_9 are in K[x, y]^{m−1}_{n−1}. By Proposition 9, there is no non-zero syzygy in K[x, y]^9_{m−1,n−1}. Thus we have h = 0, and every element of L(T) can be reduced to 0 by the previous relations. In other words, ρ_1, ..., ρ_8 is a generating set of the K[x, y]-module L(T). By Theorem 3, the relations ρ_i cannot be dependent over K[x, y] and thus form a basis of L(T).

3.3 Interpolation

Our aim is now to compute efficiently a system of generators of L(T). More precisely, we are interested in computing the coefficients of σ_1, σ_2, σ_4, σ_5 in ρ_1, ρ_2, ρ_3. Let us call B(x, y) the corresponding coefficient matrix, which is of the form
\[
B(x, y) = \begin{pmatrix} x^m & y^n & 0\\ 0 & 0 & x^m\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix} + K[x, y]^{4\times 3}_{m-1,n-1}. \tag{25}
\]
Notice that the other coefficients of the relations ρ_1, ρ_2, ρ_3 correspond to elements in the ideal (x^{2m} − 1, y^{2n} − 1) and thus can be obtained easily by reduction of the entries of (T̃(x, y), x^m, y^n, x^m y^n) · B(x, y) by the polynomials x^{2m} − 1, y^{2n} − 1. Notice also that the relation ρ_4 can easily be deduced from ρ_3, since we have ρ_3 − x^m σ_2 + σ_3 + y^n σ_4 − σ_7 = ρ_4. Since the other relations ρ_i (for i > 4) are explicit and independent of T̃(x, y), we can easily deduce a basis of L(T) from the matrix B(x, y). As L(T) ∩ K[x, y]^{m−1}_{n−1} contains only one element, by computing the basis given in Proposition 10 and reducing it we can obtain this element of L(T) ∩ K[x, y]^{m−1}_{n−1}, which gives us the solution of T u = g. We can give a fast algorithm for these two steps, but a superfast algorithm is not available.

4 Conclusions

We have shown in this paper a connection between the solution of a Toeplitz system and the syzygies of univariate polynomials. We generalized this approach and obtained a connection between the solution of a Toeplitz-block-Toeplitz system and the syzygies of bivariate polynomials. In the univariate case we could exploit this connection to give a superfast solution algorithm. The generalization of this technique to the bivariate case is not yet clear and remains an important challenge.

References

1. D. Bini and V. Y. Pan. Polynomial and matrix computations. Vol. 1. Progress in Theoretical Computer Science. Birkhäuser Boston Inc., Boston, MA, 1994. Fundamental algorithms.
2. R. Bitmead and B. Anderson. Asymptotically fast solution of Toeplitz and related systems of equations. Linear Algebra and Its Applications, 34:103-116, 1980.
3. D. Eisenbud. Commutative algebra, volume 150 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1995. With a view toward algebraic geometry.
4. P. Fuhrmann. A polynomial approach to linear algebra. Springer-Verlag, 1996.
5. G. Heinig and K. Rost. Algebraic methods for Toeplitz-like matrices and operators, volume 13 of Operator Theory: Advances and Applications. Birkhäuser Verlag, Basel, 1984.
6. T. Kailath and A. H. Sayed. Displacement structure: theory and applications. SIAM Rev., 37(3):297-386, 1995.
7. H. M. Möller and F. Mora. New constructive methods in classical ideal theory. J. Algebra, 100(1):138-178, 1986.
8. B. Mourrain and V. Y. Pan. Multivariate polynomials, duality, and structured matrices. J. Complexity, 16(1):110-180, 2000.
9. V. Y. Pan. Nearly optimal computations with structured matrices. In Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms (San Francisco, CA, 2000), pages 953-962, New York, 2000. ACM.
10. V. Y. Pan. Structured matrices and polynomials. Birkhäuser Boston Inc., Boston, MA, 2001. Unified superfast algorithms.
11. E. Tyrtyshnikov. Fast algorithms for block Toeplitz matrices. Sov. J. Numer. Math. Modelling, 1(2):121-139, 1985.
12. M. Van Barel, G. Heinig, and P. Kravanja. A stabilized superfast solver for nonsymmetric Toeplitz systems. SIAM J. Matrix Anal. Appl., 23(2):494-510 (electronic), 2001.
13. J. von zur Gathen and J. Gerhard. Modern computer algebra. Cambridge University Press, Cambridge, second edition, 2003.

Concepts of Data-Sparse Tensor-Product Approximation in Many-Particle Modelling

Heinz-Jürgen Flad¹, Wolfgang Hackbusch², Boris N. Khoromskij², and Reinhold Schneider¹

¹ Institut für Mathematik, Technische Universität Berlin
Straße des 17. Juni 137, D-10623 Berlin, Germany
{flad,schneidr}@math.tu-berlin.de

² Max-Planck-Institute for Mathematics in the Sciences
Inselstr. 22-26, D-04103 Leipzig, Germany
{wh,bokh}@mis.mpg.de

Abstract. We present concepts of data-sparse tensor approximations to the functions and operators arising in many-particle models of quantum chemistry. Our approach is based on the systematic use of structured tensor-product representations where the low-dimensional components are represented in hierarchical or wavelet-based matrix formats. The modern methods of tensor-product approximation in higher dimensions are discussed with the focus on analytically based approaches. We give numerical illustrations which confirm the efficiency of tensor decomposition techniques in electronic structure calculations.

Keywords: Schrödinger equation, Hartree-Fock method, density functional theory, tensor-product approximation.

1 Introduction

Among the most challenging problems of scientific computing nowadays are those of high dimensions, for instance, multi-particle interactions, integral or differential equations on [0, 1]^d and the related numerical operator calculus for d ≥ 3. Many standard approaches have a computational complexity that grows exponentially in the dimension d and thus fail because of the well-known "curse of dimensionality". To get rid of this exponential growth in the complexity one can use the idea of tensor-product constructions (cf. [86]) on all stages of the solution process. Hereby we approximate the quantity of interest in tensor-product formats and use other approximation methods for the remaining low-dimensional components. Depending on the specific properties of the problem, these low-dimensional components are already in a data-sparse format, like band-structured matrices, or can be approximated via hierarchical (low-rank) matrix and wavelet formats, respectively. In order to obtain low-rank tensor-product approximations it is convenient to start already with a separable approximation of possibly large separation rank. This is the case e.g. for hyperbolic cross


approximations in tensor-product wavelet bases or for Gaussian-type and plane-wave basis sets which are frequently used in quantum chemistry and solid state physics. With such a representation at hand it is possible to apply algebraic recompression methods to generate the desired low-rank approximations. We want to stress, however, that these recompression methods in multi-linear algebra lead to severe computational problems, since they are, in fact, equivalent to some kind of nonlinear approximation in d ≥ 3. Despite these computational difficulties, such a procedure is especially favourable for smooth functions with few singularities, which are actually typical for our envisaged applications to be discussed below. A large class of translation-invariant kernels of integral operators can be represented via integral transformations of a separable function, e.g. a Gaussian. Using exponentially convergent quadrature rules for the parametric integrals it is possible to derive low-rank tensor-product approximations for these integral operators. In a similar manner it is possible to derive such representations for matrix-valued functions in the tensor-product format. It is the purpose of the present paper to discuss possible applications of the afore-outlined approach to electronic structure calculations with applications in quantum chemistry and solid state physics. It will be shown in the following how to combine the different techniques, which complement each other nicely, to provide a feasible numerical operator calculus for some standard many-particle models in quantum chemistry. Within the present work, we focus on the Hartree-Fock method and the Kohn-Sham equations of density functional theory (DFT). We present a brief survey on existing approximation methods, and give some numerical results confirming their efficiency. Our approach aims towards a numerical solution of the Hartree-Fock and Kohn-Sham equations with

computational complexity that scales almost linearly in the number of particles (atoms). In particular, large molecular systems such as biomolecules and nanostructures reveal severe limitations of the standard numerical algorithms, and tensor-product approximations might help to overcome at least some of them. The rest of the paper is organised as follows. Section 2 gives a brief outline of electronic structure calculations and of the Hartree-Fock method in particular. This is followed by a discussion of best N-term approximation and its generalization to tensor-product wavelet bases. We present an application of this approach to the Hartree-Fock method. In Section 4, we first introduce various tensor-product formats for the approximation of functions and matrices in higher dimensions. Thereafter we consider a variety of methods to obtain separable approximations of multivariate functions. These methods center around the Sinc interpolation and convenient integral representations for these functions. Section 5 provides an overview of different data-sparse formats for the univariate components of tensor products. Finally, we discuss in Section 6 possible applications of these tensor-product techniques in order to obtain linear scaling methods for Hartree-Fock and Kohn-Sham equations.

2 Basic principles of electronic structure calculations

The physics of stationary states, i.e. time-harmonic quantum mechanical systems of N particles, is completely described by a single wave function
\[
(r_1, s_1, \dots, r_N, s_N) \mapsto \Psi(r_1, s_1, \dots, r_N, s_N) \in \mathbb{C}, \qquad r_i \in \mathbb{R}^3,\ s_i \in S,
\]
which depends on the spatial coordinates r_i ∈ R^3 of the particles i = 1, ..., N together with their spin degrees of freedom s_i. Since identical quantum mechanical particles, e.g. electrons, cannot be distinguished, the wave function must admit a certain symmetry with respect to the interchange of particles. The Pauli exclusion principle states that for electrons the spin variables can take only two values, s_i ∈ S = {±1/2}, and the wave function has to be antisymmetric with respect to the permutation of particles:
\[
\Psi(r_1, s_1, \dots, r_i, s_i, \dots, r_j, s_j, \dots, r_N, s_N) = -\Psi(r_1, s_1, \dots, r_j, s_j, \dots, r_i, s_i, \dots, r_N, s_N).
\]

The Born-Oppenheimer approximation considers a quantum mechanical ensemble of N electrons moving in an exterior electrical field generated by the nuclei of K atoms. Therein the wave function is supposed to be a solution of the stationary electronic Schrödinger equation
\[
H\Psi = E\Psi,
\]
with the many-particle Schrödinger operator (non-relativistic Hamiltonian) H given by
\[
H := -\frac{1}{2}\sum_{i=1}^{N}\Delta_i - \sum_{a=1}^{K}\sum_{i=1}^{N}\frac{Z_a}{|r_i - R_a|} + \sum_{i<j\le N}\frac{1}{|r_i - r_j|}.
\]

We will see in a moment that the sparse grid approximation is not too bad: to store both functions f_L and g_L with respect to the given basis requires 2 · 2^L coefficients, whereas the sparse grid approximation requires O(L² 2^L) nonzero coefficients, in contrast to O(2^{dL}) for the full product. Keeping in mind that a really optimal tensor-product approximation for d ≥ 2 is still an unsolved problem, and in general it might be quite expensive, the sparse grid approximation is simple and cheap from the algorithmic point of view. It also achieves an almost optimal complexity for storage requirements. It is a trivial task to convert an "optimal" tensor-product representation into a sparse grid approximation. The opposite direction is a highly nontrivial task and requires fairly sophisticated compression algorithms. It is worthwhile to mention that previous wavelet matrix compression approaches are based on some Calderón-Zygmund type estimates for the kernels. The sparse grid approximation is intimately related to wavelet matrix compression of integral operators with globally smooth kernels. The kernel functions of Calderón-Zygmund operators are not globally smooth. Nevertheless, it can be shown that they can be approximated within linear or almost linear complexity by means of wavelet Galerkin methods, see e.g. [8, 17-19, 77], since they are smooth in the far-field region. This result is proved provided that the Schwartz kernel K(x, y) in R^d × R^d is approximated by tensor-product bases Ψ ⊗ Ψ, where Ψ is an isotropic wavelet basis in R^d. Recently developed fast methods like wavelet matrix compression and hierarchical matrices work well for isotropic basis functions or isotropic clusters. Corresponding results for sparse grid approximations with ⊗_{i=1}^{2d} Ψ have not been derived so far. Tensor-product bases in the framework of sparse grids do not have this geometric isotropy, which might spoil³

³ It should be mentioned that in our applications at best almost optimal tensor-product approximations can be achieved. This is not of particular significance since we are aiming at a certain accuracy, and small variations of the separation rank, required in order to achieve this accuracy, do not cause much harm.


the efficiency of these methods. This is not the case for the more general tensor-product approximations of these operators discussed in Sections 4.2.2 and 4.2.3 below. Therefore tensor-product approximations will provide an appropriate and efficient tool for handling nonlocal operators acting on functions which are represented by means of tensor-product (sparse grid) bases. The development of such a tool will play a fundamental role for dealing with operators in high dimensions.
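The coefficient counts quoted above for the sparse grid approximation can be made concrete for d = 2, where the index set {(j_1, j_2) : j_1 + j_2 ≤ L} with 2^{j_1} · 2^{j_2} coefficients per block holds exactly L · 2^{L+1} + 1 = O(L · 2^L) coefficients, consistent with the O(L² 2^L) bound and far below the O(2^{dL}) full count. A small sketch (our own illustration, not from the paper):

```python
# Coefficient count of the level-L sparse grid index set {(j1, j2): j1 + j2 <= L},
# with 2^{j1} * 2^{j2} coefficients on block (j1, j2), versus the full product 4^L.
def sparse_grid_coeffs(L):
    return sum(2 ** j1 * 2 ** j2
               for j1 in range(L + 1) for j2 in range(L + 1 - j1))

for L in range(1, 16):
    assert sparse_grid_coeffs(L) == L * 2 ** (L + 1) + 1  # = O(L * 2^L)
assert sparse_grid_coeffs(15) < 4 ** 15  # far below the full-product count
```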

4 Toolkit for tensor-product approximations

The numerical treatment of operators in higher dimensions arising in traditional finite element methods (FEM) and boundary element methods (BEM), as well as in quantum chemistry, material sciences and financial mathematics, shares the fundamental difficulty that the computational cost of traditional methods usually grows exponentially in d, even for algorithms with linear

complexity O(N) in the problem size N (indeed, N scales exponentially in d as N = n^d, where n is the "one-dimensional" problem size). There are several approaches to remove the dimension parameter d from the exponent (cf. [5, 41, 49, 53, 58]). For the approximation of functions, such methods are usually based on different forms of the separation of variables. Specifically, a multivariate function F : R^d → R can be approximated in the form
\[
F_r(x_1, \dots, x_d) = \sum_{k=1}^{r} s_k\, \Phi_k^{(1)}(x_1) \cdots \Phi_k^{(d)}(x_d) \approx F,
\]
where the set of functions {Φ_k^{(ℓ)}(x_ℓ)} can be fixed, like in the best N-term approximation discussed in Section 3, or chosen adaptively. The latter approach tries to optimize the functions {Φ_k^{(ℓ)}(x_ℓ)} in order to achieve, for a certain separation rank r, at least an almost optimal approximation property. By increasing r, the approximation can be made as accurate as desired. In the case of globally analytic functions there holds r = O(|log ε|^{d−1}), while for analytic functions with point singularities one can prove r = O(|log ε|^{2(d−1)}) (cf. [53]). In the following we give a short overview of various approaches to generate separable approximations with low separation rank. We first introduce in Section 4.1 two different tensor-product formats which will be used in the following. Section 4.2 provides a succinct discussion of low-rank tensor-product approximations of special functions, including the Coulomb and Yukawa potentials, for which a certain type of "separable" integral representation exists. This integral representation can be used to obtain separable approximations either by applying the Sinc approximation (Section 4.2.1) or directly through a best N-term approximation by exponential sums (Section 4.2.2).


4.1 Tensor-product representations in higher dimension

Let a d-th order tensor A = [a_{i_1...i_d}] ∈ C^I be given, defined on the product index set I = I_1 × ... × I_d. It can be approximated via the canonical decomposition (CANDECOMP) or parallel factors (PARAFAC) model (shortly, canonical model) in the following manner:
\[
A \approx A_{(r)} = \sum_{k=1}^{r} b_k\, V_k^{(1)} \otimes \cdots \otimes V_k^{(d)}, \qquad b_k \in \mathbb{C}, \tag{11}
\]
where the Kronecker factors V_k^{(ℓ)} ∈ C^{I_ℓ} are unit-norm vectors which are chosen such that for a certain approximation only a minimal number r of components in the representation (11) is required. The minimal number r is called the Kronecker rank of the given tensor A_{(r)}. Here and in the following we use the notation ⊗ to represent the canonical tensor U ≡ [u_i]_{i∈I} = b U^{(1)} ⊗ ... ⊗ U^{(d)} ∈ C^I, defined by u_{i_1...i_d} = b · u^{(1)}_{i_1} ··· u^{(d)}_{i_d} with U^{(ℓ)} ≡ [u^{(ℓ)}_{i_ℓ}]_{i_ℓ∈I_ℓ} ∈ C^{I_ℓ}. We make use of the multi-index notation i := (i_1, ..., i_d) ∈ I. The Tucker model deals with the approximation

\[
A \approx A_{(\mathbf{r})} = \sum_{k_1=1}^{r_1} \cdots \sum_{k_d=1}^{r_d} b_{k_1 \dots k_d}\, V_{k_1}^{(1)} \otimes \cdots \otimes V_{k_d}^{(d)}, \tag{12}
\]
where the Kronecker factors V_{k_ℓ}^{(ℓ)} ∈ C^{I_ℓ} (k_ℓ = 1, ..., r_ℓ; ℓ = 1, ..., d) are complex vectors of the respective size n_ℓ = |I_ℓ|, r = (r_1, ..., r_d) is the Tucker rank, and b_{k_1...k_d} ∈ C. Without loss of generality, we assume that the vectors {V_{k_ℓ}^{(ℓ)}} are orthonormal, i.e.,
\[
\bigl\langle V_{k_\ell}^{(\ell)}, V_{m_\ell}^{(\ell)} \bigr\rangle = \delta_{k_\ell, m_\ell}, \qquad k_\ell, m_\ell = 1, \dots, r_\ell;\ \ell = 1, \dots, d,
\]
where δ_{k_ℓ,m_ℓ} is Kronecker's delta.

On the level of operators (matrices) we distinguish the following tensor-product structures. Given a matrix A ∈ C^{N×N} with N = n^d, we approximate it with the canonical model by a matrix A_{(r)} of the form

\[
A \approx A_{(r)} = \sum_{k=1}^{r} V_k^{(1)} \otimes \cdots \otimes V_k^{(d)}, \tag{13}
\]
where the V_k^{(ℓ)} are hierarchically structured matrices of order n × n. Again the important parameter r is called the Kronecker rank.
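For small dense tensors, a Tucker approximation (12) with orthonormal factors can be obtained by the truncated higher-order SVD (HOSVD). The following sketch (function names are ours, not from the paper) recovers a tensor of exact multilinear rank (2, 2, 2):

```python
import numpy as np

def hosvd(A, ranks):
    """Truncated higher-order SVD: orthonormal Tucker factors V^(l) and core b."""
    factors = []
    for ell in range(A.ndim):
        unf = np.moveaxis(A, ell, 0).reshape(A.shape[ell], -1)  # mode-ell unfolding
        U = np.linalg.svd(unf, full_matrices=False)[0]
        factors.append(U[:, :ranks[ell]])
    core = A
    for V in factors:  # contract each mode with V^H; contracted axis cycles to the end
        core = np.tensordot(core, V.conj(), axes=([0], [0]))
    return core, factors

def tucker_expand(core, factors):
    A = core
    for V in factors:  # sum over k_l, back to full size n_l
        A = np.tensordot(A, V, axes=([0], [1]))
    return A

# A tensor of exact multilinear rank (2, 2, 2) is reproduced exactly.
rng = np.random.default_rng(1)
b = rng.standard_normal((2, 2, 2))
Vs = [np.linalg.qr(rng.standard_normal((nl, 2)))[0] for nl in (5, 6, 7)]
A = tucker_expand(b, Vs)
core, factors = hosvd(A, (2, 2, 2))
assert np.allclose(tucker_expand(core, factors), A)
```

For tensors of only approximate low rank, the same truncation yields a quasi-optimal, but in general not optimal, Tucker approximation; this is one reason why, as noted below, SVD-based recompression does not carry over directly to d ≥ 3.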


We also introduce the following rank-(r_1, ..., r_d) Tucker-type tensor-product matrix format:
\[
A = \sum_{k_1=1}^{r_1} \cdots \sum_{k_d=1}^{r_d} b_{k_1 \dots k_d}\, V_{k_1}^{(1)} \otimes \cdots \otimes V_{k_d}^{(d)} \in \mathbb{R}^{I_1^2 \times \dots \times I_d^2}, \tag{14}
\]
where the Kronecker factors V_{k_ℓ}^{(ℓ)} ∈ R^{I_ℓ × I_ℓ}, k_ℓ = 1, ..., r_ℓ, ℓ = 1, ..., d, are matrices of a certain structure (say, H-matrix, wavelet-based format, Toeplitz/circulant, low-rank, banded, etc.). The matrix representation in the form (14) is a model reduction which generalises the low-rank approximation of matrices, corresponding to the case d = 2. For a class of matrix-valued functions (cf. [53, 58] and Section 6.1 below) it is possible to show that r = O(|log ε|^{2(d−1)}). Further results on the tensor-product approximation of certain matrix-valued functions can be found in [41, 54]. Note that algebraic recompression methods based on the singular value decomposition (SVD) cannot be directly generalised to d ≥ 3. We refer to [5, 6, 25-27, 33, 58, 59, 64, ?, 67, 74, 90] and references therein for a detailed description of the methods of numerical multi-linear algebra. In the following, we stress the significance of analytical methods for the separable approximation of multivariate functions and related function-generated matrices/tensors.

4.2 Separable approximation of functions

Separable approximation of functions plays an important role in the design of effective tensor-product decomposition methods. For a large class of functions (cf. [84, 85]) it is possible to show that tensor-product approximations with low separation rank exist. In this section, we give an overview of the most commonly used methods to construct separable approximations of multivariate functions.

4.2.1 Sinc interpolation methods

Sinc approximation methods provide efficient tools for interpolating C^∞ functions on R having exponential decay as |x| → ∞ (cf. [80]). Let
\[
S_{k,h}(x) = \frac{\sin[\pi(x - kh)/h]}{\pi(x - kh)/h} \qquad (k \in \mathbb{Z},\ h > 0,\ x \in \mathbb{R})
\]
be the k-th Sinc function with step size h, evaluated at x. Given f in the Hardy space H^1(D_δ) with respect to the strip D_δ := {z ∈ C : |ℑz| ≤ δ} for a δ < π/2, and given h > 0 and M ∈ N_0, the corresponding Sinc interpolant (cardinal series representation) and quadrature read as
\[
C_M(f, h) = \sum_{k=-M}^{M} f(kh)\, S_{k,h}, \qquad
T_M(f, h) = h \sum_{k=-M}^{M} f(kh),
\]


where the latter approximates the integral
\[
I(f) = \int_{\mathbb{R}} f(x)\,dx.
\]
For the interpolation error, the choice h = \sqrt{πδ/(bM)}, where b denotes the exponential decay rate of f, implies the exponential convergence rate
\[
\|f - C_M(f, h)\|_\infty \le C M^{1/2} e^{-\sqrt{\pi\delta b M}}.
\]
Similarly, for the quadrature error, the choice h = \sqrt{2πδ/(bM)} yields
\[
|I(f) - T_M(f, h)| \le C e^{-\sqrt{2\pi\delta b M}}.
\]

If f has a double-exponential decay as |x| → ∞, i.e.,
\[
|f(\xi)| \le C \exp(-b e^{a|\xi|}) \qquad \text{for all } \xi \in \mathbb{R}, \text{ with } a, b, C > 0,
\]
the convergence rate of both Sinc interpolation and Sinc quadrature can be improved up to O(e^{−cM/log M}).

For example, let d = 2. Given a function F(ζ, η) defined in the product domain Ω := [0, 1] × [a, b], a, b ∈ R, we assume that for each fixed η ∈ [a, b] the univariate function F(·, η) belongs to C^∞(0, 1] and allows a certain holomorphic extension (with respect to ζ) to the complex plane C (cf. [53] for more details). Moreover, the function F(·, η) restricted onto [0, 1] is allowed to have a singularity with respect to ζ at the end-point ζ = 0 of [0, 1]. Specifically, it is assumed that there is a function φ : R → (0, 1] such that for any η ∈ [a, b] the composition f(x) = F(φ(x), η) belongs to the class H^1(D_δ). For this class of functions a separable approximation is based on the transformed Sinc interpolation [41, 80], leading to
\[
F_M(\zeta, \eta) = \sum_{k=-M}^{M} F(\varphi(kh), \eta)\, S_{k,h}(\varphi^{-1}(\zeta)) \approx F(\zeta, \eta).
\]
The following error bound
\[
\sup_{(\zeta,\eta)\in\Omega} |F(\zeta, \eta) - F_M(\zeta, \eta)| \le C e^{-sM/\log M} \tag{15}
\]
holds with φ^{−1}(ζ) = arsinh(arcosh(ζ^{−1})). In the case of a multivariate function on [0, 1]^{d−1} × [a, b], one can adapt the corresponding tensor-product approximation by successive application of the one-dimensional interpolation (cf. [53]). In the numerical example shown in Fig. 1, we approximate the Euclidean distance |x − y| in R^3 on the domain |x_i − y_i| ≤ 1 (i = 1, 2, 3) by Sinc interpolation. To that end, the approximation (15) is applied to the function
\[
F(\zeta, \eta, \vartheta) = \sqrt{\zeta^2 + \eta^2 + \vartheta^2} \quad \text{in } \Omega := [0, 1]^3.
\]


4.2.2 Integral representation methods

Integral representation methods are based on the quadrature approximation of integral Laplace-type transforms representing spherically symmetric functions. In particular, some functions of the Euclidean distance in R^d, say,
\[
1/|x - y|, \quad |x - y|^\beta, \quad e^{-|x-y|}, \quad e^{-\lambda|x-y|}/|x - y|, \qquad x, y \in \mathbb{R}^d,
\]
can be approximated by Sinc quadratures of the corresponding Gaussian integral on the semi-axis [41, 53, 54, 65]. For example, in the range 0 < a ≤ |x − y| ≤ A, one can use the integral representation
\[
\frac{1}{|x - y|} = \frac{1}{\sqrt{\pi}} \int_{\mathbb{R}} \exp(-|x - y|^2 t^2)\,dt = \int_{\mathbb{R}} F(\rho; t)\,dt, \qquad x, y \in \mathbb{R}^d, \tag{16}
\]
of the Coulomb potential, with
\[
F(\rho; t) = \frac{1}{\sqrt{\pi}}\, e^{-\rho^2 t^2}, \qquad \rho = |x - y|, \quad d = 3.
\]

After the substitutions t = log(1 + e^u) and u = sinh(w) in the integral (16), we apply the quadrature to obtain
\[
T_M(F, h) := h \sum_{k=-M}^{M} \cosh(kh)\, G(\rho, \sinh(kh)) \approx \int_{\mathbb{R}} F(\rho, t)\,dt = \frac{1}{\rho} \tag{17}
\]
with
\[
G(\rho, u) = \frac{2}{\sqrt{\pi}}\, \frac{e^{-\rho^2 \log^2(1 + e^u)}}{1 + e^{-u}}
\]
and with h = C_0 log M / M. The quadrature (17) is proven to converge exponentially in M,
\[
E_M := \left|\frac{1}{\rho} - T_M(F, h)\right| \le C e^{-sM/\log M},
\]
where C, s do not depend on M (but depend on ρ), see [53]. With the proper scaling of the Coulomb potential, one can apply this quadrature in the reference interval ρ ∈ [1, R]. A numerical example for this quadrature with values ρ ∈ [1, R], R ≤ 5000, is presented in Fig. 2. We observe almost linear error growth in ρ.

In electronic structure calculations, the Galerkin discretisation of the Coulomb potential in tensor-product wavelet bases is of specific interest. For simplicity, we consider an isotropic 3d wavelet basis

\[
\gamma_{j,a}^{(s)}(x) := \psi_{j,a_1}^{(s_1)}(x_1)\, \psi_{j,a_2}^{(s_2)}(x_2)\, \psi_{j,a_3}^{(s_3)}(x_3),
\]
where the functions ψ^{(0)}_{j,a}(x) := 2^{j/2} ψ^{(0)}(2^j x − a) and ψ^{(1)}_{j,a}(x) := 2^{j/2} ψ^{(1)}(2^j x − a), with j, a ∈ Z, correspond to univariate scaling functions and wavelets, respectively. The nonstandard representation of the Coulomb potential (cf. [8, 34])


requires integrals of the form
\[
\int_{\mathbb{R}^3}\int_{\mathbb{R}^3} \gamma_{j,a}^{(p)}(x)\, \frac{1}{|x - y|}\, \gamma_{j,b}^{(q)}(y)\, d^3x\, d^3y
= \frac{2^{-2j+1}}{\sqrt{\pi}} \int_{0}^{\infty} I^{(p,q)}(t, a - b)\,dt,
\]
with
\[
I^{(p,q)}(t, a) = G^{(p_1,q_1)}(a_1, t)\, G^{(p_2,q_2)}(a_2, t)\, G^{(p_3,q_3)}(a_3, t)
\]
and
\[
G^{(p,q)}(a, t) = \int_{\mathbb{R}}\int_{\mathbb{R}} \psi^{(p)}(x - a)\, e^{-(x-y)^2 t^2}\, \psi^{(q)}(y)\, dx\, dy.
\]

In order to benefit from the tensor-product structure, it is important to have a uniform error bound with respect to the spatial separation |a − b| of the wavelets. Recently, the following theorem was proven by Schwinger [79].

Theorem 3. Given a univariate wavelet basis ψ^{(p)}_{j,a} which satisfies
\[
\left|\int \psi^{(p)}(x - y)\, \psi^{(q)}(y)\,dy\right| \lesssim e^{-c|x|} \qquad \text{for some } c > 0.
\]
Then for any δ < π/4, the integration error of the exponential quadrature rule (cf. [80]) with h = \sqrt{πδ/M} (h = \sqrt{2πδ/M} for pure scaling functions, i.e., p = q = (0, 0, 0)) satisfies
\[
\left| \int_{0}^{\infty} I^{(p,q)}(t, a)\,dt - h \sum_{m=-M}^{M} e^{mh}\, I^{(p,q)}(e^{mh}, a) \right| \le C e^{-\alpha\sqrt{M}} \tag{18}
\]
with α = 2\sqrt{πδ} (α = \sqrt{2πδ} for pure scaling functions), where the constant C is independent of the translation parameter a.

We illustrate the theorem for the case of pure scaling functions in Fig. 4.2.2. Similar results for wavelets are presented in [14].

4.2.3 On the best approximation by exponential sums

Using integral representation methods, the Sinc-quadrature can be applied, for example, to the integrals

$$\frac{1}{\rho} = \int_0^\infty e^{-\rho\xi}\,d\xi \quad\text{and}\quad \frac{1}{\rho} = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-\rho^2 t^2}\,dt$$

to obtain an exponentially convergent sum of exponentials approximating the inverse function 1/ρ. Instead, one can directly determine the best approximation of a function with respect to a certain norm by exponential sums

$$\sum_{\nu=1}^{n} \omega_\nu e^{-t_\nu x} \quad\text{or}\quad \sum_{\nu=1}^{n} \omega_\nu e^{-t_\nu x^2},$$

where $\omega_\nu, t_\nu \in \mathbb{R}$ are to be chosen optimally. For some applications in quantum chemistry of approximation by exponential sums we refer e.g. to [1, 60, 62]. We recall some facts from the theory of approximation by exponential sums (cf. [10] and the discussion in [53]). The existence result is based on the fundamental Big Bernstein Theorem: if f is completely monotone for x > 0, i.e., for all n ≥ 0, x > 0,

$$(-1)^n f^{(n)}(x) \geq 0,$$

then it is the restriction of the Laplace transform of a measure to the half-axis:

$$f(z) = \int_{\mathbb{R}_+} e^{-tz}\,d\mu(t).$$
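The quadrature route above can be made concrete: substituting ξ = e^x in the first integral representation of 1/ρ and applying the trapezoidal rule with step h gives a (2M+1)-term exponential sum, exactly the exponential rule of Theorem 3. A minimal Python sketch of ours (the step choice h = √(2πδ/M), δ < π/4, mirrors the scaling-function case of Theorem 3):

```python
import math

def inv_expsum(M, delta=0.7):
    """Weights/nodes of an exponential-sum approximation of 1/rho,
    from 1/rho = int_0^inf e^{-rho*xi} d(xi) with xi = e^x and the
    trapezoidal rule of step h (exponentially convergent, cf. (18))."""
    h = math.sqrt(2.0 * math.pi * delta / M)
    nodes = [math.exp(m * h) for m in range(-M, M + 1)]   # t_nu
    weights = [h * t for t in nodes]                      # omega_nu
    return weights, nodes

def eval_expsum(weights, nodes, rho):
    # sum_nu omega_nu * exp(-t_nu * rho), an approximation of 1/rho
    return sum(w * math.exp(-t * rho) for w, t in zip(weights, nodes))
```

The error decays like $e^{-\alpha\sqrt{M}}$ uniformly on compact intervals of ρ, consistent with (18).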

For n ≥ 1, consider the set $E^0_n$ of exponential sums and the extended set $E_n$:

$$E^0_n := \Big\{\, u = \sum_{\nu=1}^{n}\omega_\nu e^{-t_\nu x} \;:\; \omega_\nu, t_\nu \in \mathbb{R} \,\Big\},$$

$$E_n := \Big\{\, u = \sum_{\nu=1}^{\ell} p_\nu(x)\, e^{-t_\nu x} \;:\; t_\nu \in \mathbb{R},\ p_\nu \text{ polynomials with } \sum_{\nu=1}^{\ell}\big(1 + \mathrm{degree}(p_\nu)\big) \leq n \,\Big\}.$$

Now one can address the problem of finding the best approximation to f over the set $E_n$, characterised by the best N-term approximation error $d_\infty(f, E_n) := \inf_{v\in E_n}\|f - v\|_\infty$.

We recall the complete elliptic integral of the first kind with modulus κ,

$$K(\kappa) = \int_0^1 \frac{dt}{\sqrt{(1-t^2)(1-\kappa^2 t^2)}} \qquad (0 < \kappa < 1)$$

(cf. [12]), and define K′(κ) := K(κ′) by κ² + (κ′)² = 1.

Theorem 4 ([10]). Assume that f is completely monotone and analytic for ℜz > 0, and let 0 < a < b. Then for the uniform approximation on the interval [a, b],

$$\lim_{n\to\infty} d_\infty(f, E_n)^{1/n} \leq \frac{1}{\omega^2}, \quad\text{where}\quad \omega = \exp\left(\frac{\pi K(\kappa)}{K'(\kappa)}\right) \quad\text{with}\quad \kappa = \frac{a}{b}.$$

The same result holds for $E^0_n$, but the best approximation may belong to the closure of $E^0_n$.


In the case discussed below, we have κ = 1/R for possibly large R. Applying the asymptotics

$$K(\kappa') = \ln\frac{4}{\kappa} + C_1\kappa + \dots \quad\text{for } \kappa' \to 1, \qquad K(\kappa) = \frac{\pi}{2}\Big\{1 + \tfrac{1}{4}\kappa^2 + C_1\kappa^4 + \dots\Big\} \quad\text{for } \kappa \to 0,$$

of the complete elliptic integrals (cf. [44]), we obtain

$$\frac{1}{\omega^2} = \exp\Big(-\frac{2\pi K(\kappa)}{K'(\kappa)}\Big) \approx \exp\Big(-\frac{\pi^2}{\ln(4R)}\Big) \approx 1 - \frac{\pi^2}{\ln(4R)}.$$

The latter expression indicates that the number n of different terms to achieve a tolerance ε is asymptotically

$$n \approx \frac{|\log\varepsilon|}{|\log\omega^{-2}|} \approx \frac{|\log\varepsilon|\,\ln(4R)}{\pi^2}.$$

This result shows the same asymptotical convergence in n as the corresponding bound in the Sinc-approximation theory. Optimisation with respect to the maximum norm leads to the nonlinear minimisation problem $\inf_{v\in E^0_n}\|f - v\|_{L^\infty[1,R]}$ involving the 2n parameters $\{\omega_\nu, t_\nu\}_{\nu=1}^{n}$. The numerical implementation is based on the Remez algorithm (cf. [12]). For the particular application with f(x) = x⁻¹, we have the same asymptotical dependence n = n(ε, R) as in the Sinc-approximation above; however, the numerical results⁵ indicate a noticeable improvement compared with the quadrature method, at least for n ≤ 15. The best approximation to 1/ρ^µ in the interval [1, R] with respect to a weighted L²-norm can be reduced to the minimisation of an explicitly given differentiable functional $d_2(f, E_n) := \inf_{v\in E_n}\|f - v\|_{L^2_W}$.
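The predicted term count n(ε, R) can be evaluated directly, computing K(κ) from the arithmetic-geometric mean via K(κ) = π/(2 AGM(1, κ′)). A small sketch (the function names are ours):

```python
import math

def agm(a, b, tol=1e-15):
    # arithmetic-geometric mean iteration
    while abs(a - b) > tol * abs(a):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return a

def K(kappa):
    # complete elliptic integral of the first kind: K = pi / (2 AGM(1, kappa'))
    return math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - kappa ** 2)))

def n_terms(eps, R):
    # term count from the rate d_inf^{1/n} <= 1/omega^2 of Theorem 4,
    # omega = exp(pi K(kappa) / K(kappa')), kappa = 1/R
    kappa = 1.0 / R
    omega = math.exp(math.pi * K(kappa) / K(math.sqrt(1.0 - kappa ** 2)))
    return math.log(1.0 / eps) / (2.0 * math.log(omega))
```

For large R this agrees with the asymptotic simplification |log ε| ln(4R)/π².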

Given R > 1, µ > 0, n ≥ 1, find the 2n real parameters $t_1, \omega_1, \dots, t_n, \omega_n \in \mathbb{R}$ such that

$$F_\mu(R;\, t_1, \omega_1, \dots, t_n, \omega_n) := \int_1^R W(x)\Big(\frac{1}{x^\mu} - \sum_{i=1}^{n}\omega_i e^{-t_i x}\Big)^2 dx = \min. \qquad (19)$$

⁵ Numerical results for the best approximation of x⁻¹ by sums of exponentials can be found in [10] and [11]; a full list of numerical data is presented in www.mis.mpg.de/scicomp/EXP_SUM/1_x/tabelle.

Tensor-Product Approximation in Many-Particle Modelling

In the particular case of µ = 1 and W(x) = 1, the integral (19) can be calculated in a closed form⁶:

$$F_1(R;\, t_1, \omega_1, \dots, t_n, \omega_n) = 1 - \frac{1}{R} - 2\sum_{i=1}^{n}\omega_i\big[\mathrm{Ei}(-t_i R) - \mathrm{Ei}(-t_i)\big] + \frac{1}{2}\sum_{i=1}^{n}\frac{\omega_i^2}{t_i}\big(e^{-2t_i} - e^{-2t_i R}\big) + 2\sum_{1\leq i<j\leq n}\frac{\omega_i\omega_j}{t_i + t_j}\big[e^{-(t_i+t_j)} - e^{-(t_i+t_j)R}\big]$$

with the exponential integral function $\mathrm{Ei}(x) = \int_{-\infty}^{x}\frac{e^t}{t}\,dt$ (cf. [12]). In the special case R = ∞, the expression for F₁(∞; ...) even simplifies. Gradient or Newton type methods with a proper choice of the initial guess can be used to obtain the minimiser of F₁ (cf. [56]).
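The closed form can be checked against direct quadrature on the defining integral. A pure-Python sketch of ours (the Ei series used here is the standard one and is adequate only for moderate arguments):

```python
import math

def Ei(x):
    # Ei(x) = gamma + ln|x| + sum_{k>=1} x^k/(k*k!), valid for moderate |x|
    gamma = 0.5772156649015329
    s, term = 0.0, 1.0
    for k in range(1, 60):
        term *= x / k
        s += term / k
    return gamma + math.log(abs(x)) + s

def F1_closed(R, ts, ws):
    # closed form of the functional (19) for mu = 1, W = 1
    n = len(ts)
    val = 1.0 - 1.0 / R
    val -= 2.0 * sum(w * (Ei(-t * R) - Ei(-t)) for w, t in zip(ws, ts))
    val += 0.5 * sum(w * w / t * (math.exp(-2 * t) - math.exp(-2 * t * R))
                     for w, t in zip(ws, ts))
    for i in range(n):
        for j in range(i + 1, n):
            tij = ts[i] + ts[j]
            val += 2.0 * ws[i] * ws[j] / tij * (math.exp(-tij) - math.exp(-tij * R))
    return val

def F1_quad(R, ts, ws, m=20000):
    # Simpson's rule on int_1^R (1/x - sum_i w_i e^{-t_i x})^2 dx
    f = lambda x: (1.0 / x - sum(w * math.exp(-t * x) for w, t in zip(ws, ts))) ** 2
    h = (R - 1.0) / m
    s = f(1.0) + f(R)
    for k in range(1, m):
        s += (4 if k % 2 else 2) * f(1.0 + k * h)
    return s * h / 3.0
```

Both evaluations agree to quadrature accuracy, which is a convenient sanity check before handing F₁ to a gradient or Newton minimiser.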

5 Data sparse formats for univariate components

5.1 Hierarchical matrix techniques

The hierarchical matrix (H-matrix) technique [46, 50, 51, 55] (see also the mosaic skeleton method [83]) allows an efficient treatment of dense matrices arising, e.g., from BEM, evaluation of volume integrals and multi-particle interactions, certain matrix-valued functions, etc. In particular, it provides matrix formats which enable the computation and storage of inverse FEM stiffness matrices corresponding to elliptic problems as well as of BEM matrices. The hierarchical matrices are represented by means of a certain block partitioning. Fig. 4 shows typical admissible block structures. Each block is filled by a submatrix of a rank not exceeding k. Then, for the mentioned class of matrices, it can be shown that the exact dense matrix A and the approximating hierarchical matrix $A_H$ differ by $\|A - A_H\| \leq O(\eta^k)$ for a certain number η < 1. This exponential decrease allows one to obtain an error ε by the choice k = O(log(1/ε)). It is shown (cf. [50–52]) that the H-matrix arithmetic exhibits almost linear complexity in N:

– Data compression. The storage of N × N H-matrices as well as the matrix-by-vector multiplication and matrix-matrix addition have a cost O(kN log N), where the local rank k is the parameter determining the approximation error.
– Matrix-by-matrix and matrix-inverse complexity. The approximate matrix-matrix multiplication and the approximate inversion both take O(k²N log² N) operations.
– The Hadamard (entry-wise) matrix product. The exact Hadamard product of two rank-k H-matrices leads to an H-matrix of block-rank k² (see Section 5.2 below).

⁶ In the general case, the integral (19) may be approximated by certain quadratures.
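The exponential decay O(η^k) in the block rank can be reproduced on a toy admissible block: interpolating a kernel in one variable at k Chebyshev nodes yields a rank-k separable approximation whose error behaves like η^k. A sketch of ours (a real H-matrix code would do this per block over a cluster tree):

```python
import math

def cheb_nodes(k, a, b):
    # k Chebyshev nodes on [a, b]
    return [0.5 * (a + b) + 0.5 * (b - a) * math.cos((2 * i + 1) * math.pi / (2 * k))
            for i in range(k)]

def lagrange_weights(nodes, x):
    # Lagrange basis polynomials L_l(x) for the given nodes
    ws = []
    for l, xl in enumerate(nodes):
        w = 1.0
        for m, xm in enumerate(nodes):
            if m != l:
                w *= (x - xm) / (xl - xm)
        ws.append(w)
    return ws

def lowrank_block_error(k, kernel, xs, ys, a, b):
    # rank-k separable approximation kernel(x,y) ~ sum_l L_l(x) kernel(xi_l, y)
    # on an admissible (well separated) block xs x ys, xs subset of [a, b]
    xi = cheb_nodes(k, a, b)
    err = 0.0
    for x in xs:
        L = lagrange_weights(xi, x)
        for y in ys:
            approx = sum(L[l] * kernel(xi[l], y) for l in range(k))
            err = max(err, abs(kernel(x, y) - approx))
    return err

kernel = lambda x, y: 1.0 / (y - x)        # model kernel, singular at x = y
xs = [i / 20.0 for i in range(21)]         # targets in [0, 1]
ys = [3.0 + j / 20.0 for j in range(21)]   # sources in [3, 4] (separated)
```

Doubling k here drops the maximal block error by several orders of magnitude, which is the mechanism behind k = O(log(1/ε)).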


5.2 Hierarchical Kronecker tensor-product approximations

Since n is much smaller than N, one can apply the hierarchical (or low-rank) matrix structure to represent the Kronecker factors $V_k^\ell$ in (13) with the complexity O(n log^q n) or even O(n), which finally leads to O(rn) = O(rN^{1/d}) data to represent the compressed matrix $A_r$. We call HKT(r, s) the class of Kronecker rank-r matrices whose Kronecker factors $V_k^\ell$ are represented by block-rank-s H-matrices (shortly, HKT-matrices). It was shown in [58] that the advantages of replacing A with $A_r$ (cf. (13)), where all the Kronecker factors possess the structure of general H-matrices, are the following:

– Data compression. The storage for the $V_k^\ell$ matrices of (13) is only O(rn) = O(rN^{1/d}), while that for the original (dense) matrix A is O(N²), where r = O(log^α N) for some α > 0. Consequently, we enjoy a linear-logarithmic complexity of O(n log^α n) in the univariate problem size n.
– Matrix-by-vector complexity. Instead of O(N²) operations to compute Ax, x ∈ C^N, we now need only O(rkn^d log n) = O(rkN log n) operations. If the vector can be represented in a tensor-product form (say, x = x₁ ⊗ ... ⊗ x_d, x_i ∈ C^n), the corresponding cost is reduced to O(rkn log n) = O(rkN^{1/d} log n) operations.
– Matrix-by-matrix complexity. Instead of O(N³) operations to compute AB, we now need only O(r²n³) = O(r²N^{3/d}) operations for a rather general structure of the Kronecker factors. Remarkably, this result is much better than the corresponding matrix-by-vector complexity for a general vector x.
– Hadamard product. The Hadamard (entry-wise) product of two HKT-matrices A ∗ B is represented in the same format: (U₁ ⊗ V₁) ∗ (U₂ ⊗ V₂) = (U₁ ∗ U₂) ⊗ (V₁ ∗ V₂). In turn, the exact Hadamard product U₁ ∗ U₂ (same for V₁ ∗ V₂) of two rank-k H-matrices results in an H-matrix of block-rank k², with the corresponding "skeleton" vectors defined by the Hadamard products of those in the initial factors (since there holds (a ⊗ b) ∗ (a₁ ⊗ b₁) = (a ∗ a₁) ⊗ (b ∗ b₁)). Therefore, basic linear algebra operations can be performed in the tensor-product representation using one-dimensional operations, thus avoiding an exponential scaling in the dimension d.

The exact product of two HKT-matrices can be represented in the same format, but with squared Kronecker rank and properly modified block-rank [58]. If A, B ∈ HKT(r, s), where s corresponds to the block-rank of the H-matrices involved, then in general AB ∉ HKT(r, s). However,

$$A = \sum_{k=1}^{r} U^A_k \otimes V^A_k, \qquad B = \sum_{l=1}^{r} U^B_l \otimes V^B_l, \qquad U^A_k, V^A_k, U^B_l, V^B_l \in \mathbb{C}^{n\times n}, \qquad (20)$$

leads to

$$AB = \sum_{k=1}^{r}\sum_{l=1}^{r} \big(U^A_k U^B_l\big) \otimes \big(V^A_k V^B_l\big).$$

It can be proven that the matrices $U^A_k U^B_l$ and $V^A_k V^B_l$ possess the same hierarchical partitioning as the initial factors in (20), with blocks of possibly larger (than s) rank bounded, nevertheless, by $s_{AB} = O(s \log N)$. Thus, AB ∈ HKT(r², s_{AB}) with $s_{AB} = O(s \log N)$.
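Both identities used above — the mixed-product rule (U₁ ⊗ V₁)(U₂ ⊗ V₂) = (U₁U₂) ⊗ (V₁V₂) behind (20), and its Hadamard analogue — are easy to verify on small dense blocks. A tiny sketch with integer matrices:

```python
def kron(A, B):
    # Kronecker product of dense matrices stored as lists of lists
    return [[a * b for a in Arow for b in Brow]
            for Arow in A for Brow in B]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def hada(A, B):
    # Hadamard (entry-wise) product
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

U1, V1 = [[1, 2], [3, 4]], [[0, 1], [1, 1]]
U2, V2 = [[2, 0], [1, 1]], [[1, 3], [2, 1]]
```

Because both identities hold entry-wise, they transfer verbatim to the blockwise (H-matrix) representation of the Kronecker factors.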

5.3 Wavelet Kronecker tensor-product approximations

Wavelet matrix compression was introduced in [8]. This technique has been considered by one of the authors during the past decade in a series of publications (cf. [77]). The compression of the Kronecker factors $V_i \in \mathbb{R}^{n\times n}$ is not so obvious, since it is not clear to what extent they satisfy a Calderon-Zygmund condition. It is more likely that they obey more or less a hyperbolic cross structure. An underlying truncation criterion based on the size of the coefficients will provide an automatic way to find the optimal structure independent of an a priori assumption. A basic thresholding or a posteriori criterion has been formulated by Harbrecht [61] and in [22]. With this criterion at hand, we expect linear scaling with respect to the size of the matrices.

– Data compression. The matrices $V_k^\ell$ in (13) can be compressed requiring total storage of about O(rn) = O(rN^{1/d}), where r = O(log^α N) is as above. The data vector requires at most O(n log^d n) nonzero coefficients.
– Matrix-by-vector complexity. Instead of O(N²) operations to compute Ax, x ∈ C^N, we now need only O(rn^d) = O(rN) operations. If the vector is represented in a tensor-product form (say, x = x₁ ⊗ ... ⊗ x_d, x_i ∈ C^n) or in sparse grid representation, then the corresponding cost is reduced to O(rn), resp. O(rn log^d n) operations.
– Matrix-by-matrix complexity. Using the compression of the Lemarie algebra [82], instead of O(N³) operations to compute AB, we need only O(r²n log^q n) = O(r²N^{1/d} log^q N), or even O(r²n), operations.

Adaptive wavelet schemes for nonlinear operators have been developed in [3, 24] and for nonlocal operators in [23]. Corresponding schemes for hyperbolic cross approximations have not been worked out up to now. Perhaps the basic ideas can be transferred immediately to the tensor-product case.

6 Linear scaling methods for Hartree-Fock and Kohn-Sham equations

Operator-valued functions G(L) of elliptic operators L play a prominent role in quantum many-particle theory. A possible representation of the operator G(L)


is given by the Dunford-Cauchy integral (cf. [38–41])

$$G(L) = \frac{1}{2\pi i}\int_\Gamma G(z)\,(zI - L)^{-1}\,dz,$$

where Γ envelopes the spectrum spec(L) of the operator L in the complex plane. This kind of representation is especially suitable for tensor-product approximation, using Sinc or Gauss-Lobatto quadratures for the contour integral to get an approximate operator of the form

$$G(L) \approx \sum_k c_k\, G(z_k)\,(z_k I - L)^{-1}. \qquad (21)$$

An important example of an operator-valued function is the sign function of the shifted Fock operator, which can be directly related to the spectral projector $P_\rho$ associated with the density matrix ρ. This relation,

$$P_\rho = \frac{1}{2}\big[I - \mathrm{sign}(F - \mu I)\big] = -\frac{1}{2\pi i}\int_\Gamma (F - zI)^{-1}\,dz,$$

where Γ with Γ ∩ spec(F) = ∅ encloses the N/2 lowest eigenvalues of the Fock operator, was first noticed by Beylkin, Coult and Mohlenkamp [7]. In order to be applicable, the method requires a finite gap between the highest occupied eigenvalue $\varepsilon_{N/2}$ and the lowest unoccupied eigenvalue $\varepsilon_{N/2+1}$, to adjust the parameter $\varepsilon_{N/2} < \mu < \varepsilon_{N/2+1}$. This constraint, in particular, excludes metallic systems. In general, the approximability of inverse matrices, required in (21), within the HKT format is still an open problem. First results on fast approximate algorithms to compute inverse matrices in the HKT format for the case d ≥ 2 can be found in [41]. In Fig. 6, we consider the HKT representation of the discrete Laplacian inverse $(-\Delta_h)^{-1}$ (homogeneous Dirichlet boundary conditions) in $\mathbb{R}^d$, which can be obtained with O(dn log^q n) cost. Numerical examples for still higher dimensions d ≤ 1024 are presented in [45]. For comparison, the following numerical example manifests the optimal Kronecker rank of the discrete elliptic inverse in d = 2. Let $-\Delta_h$ now correspond to a five-point stencil discretization of the Laplacian on a uniform mesh in the unit rectangle in $\mathbb{R}^2$ (Dirichlet boundary conditions). It is easy to see that the Kronecker rank of $-\Delta_h$ is 2. The Kronecker ranks of $(-\Delta_h)^{-1}$ for different relative approximation accuracies (in the Frobenius norm) are given in Table 6. Our results indicate a logarithmic bound O(log ε⁻¹) for the approximate Kronecker rank r.

6.1 Matrix-valued functions approach for density matrices

Let $F \in \mathbb{R}^{M\times M}$ be the Fock matrix that represents the Fock operator F (cf. (8)) in an orthogonal basis $\{\varphi_i\}_{i=1}^{M}$, M > N/2. There exist two different approaches to compute the Galerkin discretization $D \in \mathbb{R}^{M\times M}$ of the density matrix (6) via the matrix sign of the shifted Fock matrix:

$$D = \frac{1}{2}\big[I - \mathrm{sign}(F - \mu I)\big], \quad\text{with } \mu \in (\varepsilon_{N/2},\, \varepsilon_{N/2+1}).$$
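Numerically, sign(F − µI) can be produced by a Newton-Schulz-type iteration of the kind discussed next. A toy sketch of ours (the 2×2 matrix is an illustrative stand-in, not an actual Fock matrix):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sign_iteration(S, steps=30):
    # S <- S + (1/2)(I - S^2) S; converges to sign(S) when the spectrum
    # of S lies in [-1, 1] \ {0}, hence the initial normalisation below
    n = len(S)
    I = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(steps):
        S2 = matmul(S, S)
        corr = [[I[i][j] - S2[i][j] for j in range(n)] for i in range(n)]
        upd = matmul(corr, S)
        S = [[S[i][j] + 0.5 * upd[i][j] for j in range(n)] for i in range(n)]
    return S

# toy "Fock matrix" F = [[2, 1], [1, 2]] (eigenvalues 1 and 3), mu = 1.5;
# start from (F - mu I)/||F - mu I||_2:
S0 = [[0.5 / 1.5, 1.0 / 1.5], [1.0 / 1.5, 0.5 / 1.5]]
S = sign_iteration(S0)
D = [[(float(i == j) - S[i][j]) / 2.0 for j in range(2)] for i in range(2)]
# D projects onto the eigenvector whose eigenvalue lies below mu
```

Only matrix-matrix products appear in the loop, which is what makes the scheme attractive in compressed tensor formats.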

The first approach uses an exponentially convergent quadrature for the contour integral to obtain an expansion into resolvents (21), whereas the second approach is based on a Newton-Schulz iteration scheme. Concerning the tensor-product approximation of resolvents in the HKT format we refer to our discussion in Section 5.2. For the Newton-Schulz iteration scheme proposed in [7],

$$S^{(n+1)} = S^{(n)} + \frac{1}{2}\Big[I - \big(S^{(n)}\big)^2\Big] S^{(n)}, \qquad S^{(0)} = (F - \mu I)\,/\,\|F - \mu I\|_2, \qquad (22)$$

the sequence $S^{(n)}$ converges to sign(F − µI). First applications in quantum chemistry by Nemeth and Scuseria [71] demonstrate the practicability of this approach. Iteration schemes of the form (22) seem to be especially favourable for tensor-product formats. Starting from an initial approximation of the Fock matrix F with low separation rank, one has to perform matrix-matrix multiplications, which can be handled in an efficient manner in the tensor-product format,

cf. our discussion in Section 5.2. After each iteration step a recompression of the tensor-product decomposition of $S^{(n+1)}$ becomes necessary. For the recompression one can apply the simple alternating least squares (ALS) method [5, 87, 90] or Newton-type and related algebraic iterative methods [33]. The ALS algorithm starts with an initial decomposition of $S^{(n+1)}$ with separation rank r and obtains the best approximation with separation rank $\tilde r \leq r$ by iteratively solving an optimisation problem for each coordinate separately. Assume that r is actually much larger than necessary, i.e., $\tilde r$

N, we obtain:

$$J_+ = \sum_{l=2}^{N}\sin\frac{lk\pi}{N+1}\sum_{i=1}^{N}\lambda_i c_i(t)\,c_{l-i}(t)\sin\frac{(l-i)\pi}{N+1} + \sum_{l=N+2}^{2N}\sin\frac{lk\pi}{N+1}\sum_{i=1}^{N}\lambda_i c_i(t)\,c_{l-i}(t)\sin\frac{(l-i)\pi}{N+1},$$

$$J_- = \sum_{l=1-N}^{N-1}\sin\frac{lk\pi}{N+1}\sum_{i=1}^{N}\lambda_i c_i(t)\,c_{i-l}(t)\sin\frac{(i-l)\pi}{N+1} = \sum_{l=1}^{N-1}\sin\frac{lk\pi}{N+1}\sum_{i=1}^{N}\lambda_i c_i(t)\Big[c_{i-l}(t)\sin\frac{(i-l)\pi}{N+1} - c_{i+l}(t)\sin\frac{(i+l)\pi}{N+1}\Big].$$

In the second equality for $J_-$, the terms with indices l and −l were combined. In the second term in $J_+$, 2N − l + 2 was taken instead of l; the limits of the summation over i were corrected accordingly. From the inequality

Separation of variables in nonlinear Fermi equation


1 ≤ j ≤ N and the equalities j = l − i and j = i − l, we have i ≤ l − 1, i ≥ l − N, and i ≥ l + 1, respectively. Then

$$J_+ = \sum_{l=2}^{N}\sin\frac{lk\pi}{N+1}\,\big(\beta_l(t) - \gamma_l(t)\big), \qquad \beta_l(t) = \sum_{i=1}^{l-1}\lambda_i c_i(t)\,c_{l-i}(t)\sin\frac{(l-i)\pi}{N+1},$$

$$\gamma_l(t) = \sum_{i=N-l+2}^{N}\lambda_i c_i(t)\,c_{2N-l-i+2}(t)\sin\frac{(2N-l-i+2)\pi}{N+1},$$

$$J_- = \sum_{l=1}^{N-1}\sin\frac{lk\pi}{N+1}\,\big(\delta_l(t) - \varepsilon_l(t)\big), \qquad \delta_l(t) = \sum_{i=l+1}^{N}\lambda_i c_i(t)\,c_{i-l}(t)\sin\frac{(i-l)\pi}{N+1},$$

$$\varepsilon_l(t) = \sum_{i=1}^{N-l}\lambda_i c_i(t)\,c_{i+l}(t)\sin\frac{(i+l)\pi}{N+1}.$$

Now let us use the orthogonality relations:

$$\big(Y^{(j)}, Y^{(l)}\big) = \sum_{k=1}^{N}\sin\frac{jk\pi}{N+1}\,\sin\frac{lk\pi}{N+1} = \frac{N+1}{2}\,\delta_{jl}, \qquad j, l = 1(1)N. \qquad (10)$$
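The discrete sine orthogonality (10) is easy to confirm numerically (a quick sketch):

```python
import math

def sin_inner(j, l, N):
    # (Y^(j), Y^(l)) = sum_{k=1}^{N} sin(jk*pi/(N+1)) * sin(lk*pi/(N+1))
    return sum(math.sin(j * k * math.pi / (N + 1)) *
               math.sin(l * k * math.pi / (N + 1)) for k in range(1, N + 1))
```

For 1 ≤ j, l ≤ N the sum equals (N+1)/2 when j = l and vanishes otherwise.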

The transformed relation (8) has the following vector form:

$$-\sum_{l=1}^{N}\Big(\frac{1}{\alpha}\,\ddot c_l(t) + \lambda_l c_l(t)\Big)\,Y^{(l)} = \sum_{l=2}^{N}\big(\beta_l(t) - \gamma_l(t)\big)\,Y^{(l)} + \sum_{l=1}^{N-1}\big(\delta_l(t) - \varepsilon_l(t)\big)\,Y^{(l)}.$$

Finally, by (10) we find (5). ⊓⊔

2. Corollary 1. The vector $C = \big(c_1(t), \dots, c_N(t)\big)^T$ is bound by the following relation:

$$\Lambda^{1/2}\ddot C = -B\,\Lambda^{1/2} C. \qquad (11)$$

$$B_T = \begin{pmatrix}
0 & t_1 & t_2 & \cdots & t_{N-2} & t_{N-1}\\
t_1 & 0 & t_1 & \ddots & & t_{N-2}\\
t_2 & t_1 & \ddots & \ddots & \ddots & \vdots\\
\vdots & \ddots & \ddots & \ddots & t_1 & t_2\\
t_{N-2} & & \ddots & t_1 & 0 & t_1\\
t_{N-1} & t_{N-2} & \cdots & t_2 & t_1 & 0
\end{pmatrix}, \qquad (12)$$


Yu. I. Kuznetsov

$$B_H = \begin{pmatrix}
t_2 & t_3 & \cdots & t_{N-1} & t_N & 0\\
t_3 & t_4 & \cdots & t_N & 0 & t_N\\
\vdots & \vdots & & \iddots & \iddots & \vdots\\
t_{N-1} & t_N & \iddots & & & t_4\\
t_N & 0 & \iddots & & & t_3\\
0 & t_N & \cdots & t_4 & t_3 & t_2
\end{pmatrix}; \qquad (13)$$

the values $t_k$ are defined in (6).

Proof. Let us represent the equations (5) for the vector C in the vector form. The first and the third terms on the right-hand side of equation (5) determine the symmetric Toeplitz matrix $B_T$. The second and the fourth terms of the same equation form the persymmetric Hankel matrix $B_H$. ⊓⊔

Let us now define the vector $C = \big(c_1(t), \dots, c_N(t)\big)^T$ and the symmetric matrix

$$B = \Lambda^{1/2}\big(I + \alpha(B_T - B_H)\big)\Lambda^{1/2},$$

where I is the identity matrix and Λ = diag(λ₁, ..., λ_N). If

$$\mathcal{C} = \begin{pmatrix}\Lambda^{1/2} C\\ \Lambda^{1/2}\dot C\end{pmatrix}, \qquad (14)$$

then

$$\dot{\mathcal{C}} = A\,\mathcal{C}, \qquad (15)$$

where

$$A = \begin{pmatrix} 0 & I\\ -B & 0\end{pmatrix} \in \mathbb{R}^{2N\times 2N}.$$

The vector $\mathcal{C}$ determines the coordinates in the Lagrange space. The total energy of the linear oscillator (1) at α = 0 is the Hamilton function:

$$H = \frac{1}{2}\big(\dot Z(t), \dot Z(t)\big) + \frac{1}{2}\big(\Lambda Z(t), Z(t)\big). \qquad (16)$$

At α = 0 the energy (16) is conserved. The eigenvalue problem for the matrix A is of the following form:

$$\begin{pmatrix} 0 & I\\ -B & 0\end{pmatrix}\begin{pmatrix} U_j\\ V_j\end{pmatrix} = \mu_j\begin{pmatrix} U_j\\ V_j\end{pmatrix}, \qquad B\,U_j = x_j U_j, \qquad (17)$$

where $x_j = -\mu_j^2$ are real numbers. The vectors $U_j$ form an orthonormal basis in $\mathbb{R}^N$, $(U_j, U_l) = \delta_{jl}$, $j, l = 1(1)N$. By choosing α one can ensure $x_j > 0$, hence

$$\mu_j = \pm i\,p_j, \qquad p_j = \sqrt{x_j}, \qquad (18)$$


where $i = \sqrt{-1}$. A pair of eigenvectors of A corresponds to this pair of eigenvalues:

$$U_{\pm j} = \begin{pmatrix} U_j\\ \pm i\,p_j\,U_j\end{pmatrix}, \qquad V_{\pm j} = \frac{1}{2}\begin{pmatrix} U_j\\ \pm i\,\frac{1}{p_j}\,U_j\end{pmatrix}. \qquad (19)$$

The vectors $U_{\pm j}$, $V_{\pm j}$ form a biorthogonal system. As $(U_{\mp j}, V_{\mp k}) = \delta_{jk}$, $j, k = 1(1)2N$, then

$$Z(t) = \sum_{j=1}^{n}\left[\frac{\phi_j + i\varphi_j}{2}\,e^{-i p_j t}\begin{pmatrix} U_j\\ -i p_j U_j\end{pmatrix} + \frac{\phi_j - i\varphi_j}{2}\,e^{i p_j t}\begin{pmatrix} U_j\\ i p_j U_j\end{pmatrix}\right],$$

where

$$\phi_j = \big(D U_j,\, Z(0)\big), \qquad \varphi_j = \frac{1}{p_j}\big(D U_j,\, \dot Z(0)\big).$$

The motion determined by the vector Z(t) is periodic and is a superposition of the harmonics of the linear oscillator.

3. The system of equations (15) is solved by the Radau Runge-Kutta method. In the numerical experiments the calculations begin at t = 0, when the system is at rest. At α = 0, the energy (16) is preserved in the initial harmonics. For N = 31, after 100000 steps (τ = 0.001) the relative error of the total energy H is about 4·10⁻⁴. The purpose of the numerical experiments was to show that some localization took place for α ≠ 0. If $C^{(j)}(0) = e_j$, j = 1, ..., 31, where $C = \big(c_1(t), \dots, c_N(t)\big)^T$, α = 1, N = 31, then after 20000 iterations (τ = 0.001)

$$\sqrt{\sum_{i=1}^{N}\big(c^{(j)}_i(t)\big)^2} = \sqrt{\sum_{i=1}^{[N/j]}\big(c^{(j)}_{ji}(t)\big)^2} + \varepsilon_j(t),$$

where $\sum_{i=1}^{[N/j]}\big(c^{(j)}_{ji}(t)\big)^2$ contains only the coefficients whose number is divisible by j. Especially expressive is the case j = 2^k, k = 1, ..., 4: $\varepsilon_j(t) \equiv 0$. We also get: j = 3, ε₃ = 10⁻⁶; j = 5, ε₅ = 3·10⁻³; j = 6, ε₆ = 5·10⁻³; j = 7, ε₇ = 4·10⁻².

References

1. E. Fermi, Collected papers (Note e memorie), University of Chicago Press, 1965. V. 2.
2. V. K. Mezentsev, S. L. Musher, I. V. Ryzhenkova, S. K. Turitsyn, Two-dimensional solitons in discrete systems, JETP Letters, 60 (11) (1994), 815–821.

Faster Multipoint Polynomial Evaluation via Structured Matrices

B. Murphy and R. E. Rosholt

Department of Mathematics and Computer Science, Lehman College, City University of New York, Bronx, NY 10468, USA
brian.murphy@lehman.cuny.edu, rhys.rosholt@lehman.cuny.edu

Abstract. We accelerate multipoint polynomial evaluation by reducing the problem to structured matrix computation and transforming the resulting matrix structure.

Keywords: Algorithm design and analysis, Multipoint polynomial evaluation, Vandermonde matrices, Hankel matrices.

Exploiting the links between computations with polynomials and structured matrices and transformation of matrix structure are two effective means for enhancing the efficiency of algorithms in both areas [P89/90], [P92], [BP94], [GKO95], [P01]. We demonstrate the power of these techniques by accelerating multipoint evaluation of univariate polynomials.

Multipoint polynomial evaluation is a classical problem of algebraic computations. Given the coefficient vector $p = (p_j)_{j=0}^{N-1}$ of a polynomial

$$p(x) = p_0 + p_1 x + \cdots + p_{N-1}x^{N-1}$$

and n distinct points $x_1, \dots, x_n$, one seeks the vector $v = (v_i)_{i=1}^{n}$ of the values $v_i = p(x_i)$, i = 1, ..., n. Hereafter "ops" stands for "arithmetic operations", $m_M$ (resp. $i_M$) denotes the number of ops required for multiplication of a matrix M (resp. of the inverse matrix M⁻¹) by a vector, and we assume that N > n. (N is large, e.g., for univariate polynomials obtained from multivariate polynomials via Kronecker's map.) One can compute the vector v in 2(N − 1)n ops by applying Horner's algorithm n times, whereas the Moenck-Borodin algorithm [MB72] uses O((N/n)m(n) log n) ops, provided that a pair of polynomials in x can be multiplied modulo $x^k$ in m(k) ops, with m(k) = O(k log k) where the field of constants supports FFT and m(k) = O((k log k) log log k) over any field of constants [CK91]. We take advantage of shifting to the equivalent problem of multiplication of the n × N Vandermonde matrix

$$V_{n,N}(x) = \big(x_i^j\big)_{i=1,\,j=0}^{n,\,N-1}$$

by the vector p. This enables us to exploit the matrix structure to decrease the upper bound to O(((N/n) + log n)m(n)), thus yielding some acceleration of these classical computations. Our techniques may be of interest as a sample of structure transformation for the acceleration of computations with structured matrices. In our case we rely on the transformation of the matrix $V_{n,N}(x)$ into the Hankel matrix $H(x) = V_{n,n}^T(x)\,V_{n,N}(x)$.

We use the following auxiliary results (see, e.g., [P01, Chapters 2 and 3]).

Fact 1. $H(x) = V_{n,n}^T(x)\,V_{n,N}(x)$ is an n × N Hankel matrix $\Big(\sum_{i=1}^{n} x_i^{k+j}\Big)_{k=1,\,j=0}^{n,\,N-1}$.

Fact 2. $m_H = O((N/n)\,m(n))$ for $H = H_{n,N}(x)$.

Fact 3. $m_V = O(m(n)\log n)$ for an n × n Vandermonde matrix V, and $i_V = O(m(n)\log n)$ if this matrix is nonsingular.

We compute the vector v as follows.

Algorithm 2.
1. Compute the N + n entries of the Hankel matrix $H_{n,N}(x)$ by using O((N/n)m(n) + m(n) log n) ops.
2. Compute the vector $z = H_{n,N}(x)\,p$ by using O((N/n)m(n)) ops.
3. Apply O(m(n) log n) ops to compute and output the vector $v = V_{n,n}^{-T}(x)\,z$.

The matrices $V_{n,n}(x)$ and their transposes $V_{n,n}^T(x)$ are nonsingular because the n points $x_1, \dots, x_n$ are distinct. The cost bounds on Stages 2 and 3 follow from Facts 2 and 3, respectively. To perform Stage 1 we first apply O(m(n) log n) ops to compute the coefficients of the polynomial

$$q(x) = \prod_{i=1}^{n}(x - x_i)$$

(cf., e.g., [P01, Section 3.1]) and then apply O((N/n)m(n)) ops to compute the power sums

$$\sum_{i=1}^{n} x_i^k, \qquad k = 1, 2, \dots, N + n$$

of its zeros (cf. [BP94, page 34]).
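Fact 1 and the reduction behind Algorithm 2 can be verified directly on a small instance. A pure-Python sketch of ours (the fast computation of each stage is the point of the paper; here everything is done naively):

```python
def horner(p, x):
    # evaluate p(x) = p0 + p1*x + ... from the coefficient list p
    v = 0.0
    for c in reversed(p):
        v = v * x + c
    return v

def hankel_power_sums(xs, N):
    # H[k][j] = sum_i xs[i]^(k+1+j), the Hankel matrix of Fact 1
    n = len(xs)
    return [[sum(x ** (k + 1 + j) for x in xs) for j in range(N)]
            for k in range(n)]

xs = [0.5, 1.0, 1.5, 2.0]                 # n distinct points
p = [3.0, -1.0, 2.0, 0.5, 1.0, -2.0]      # N = 6 coefficients
n, N = len(xs), len(p)
v = [horner(p, x) for x in xs]            # the target multipoint values
H = hankel_power_sums(xs, N)
z = [sum(H[k][j] * p[j] for j in range(N)) for k in range(n)]   # Stage 2: z = H p
# Fact 1 gives H = V^T V_{n,N}, hence V_{n,n}^T v = z and v = V^{-T} z (Stage 3):
lhs = [sum((x ** (k + 1)) * vx for x, vx in zip(xs, v)) for k in range(n)]
```

Here `lhs` is $V_{n,n}^T v$, which matches z entry by entry, confirming that recovering v from z only requires solving one n × n transposed Vandermonde system.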

References

[BP94] D. Bini, V. Y. Pan, Polynomial and Matrix Computations, Volume 1: Fundamental Algorithms, Birkhäuser, Boston, 1994.
[CK91] D. G. Cantor, E. Kaltofen, On Fast Multiplication of Polynomials over Arbitrary Rings, Acta Informatica, 28(7), 697–701, 1991.
[GKO95] I. Gohberg, T. Kailath, V. Olshevsky, Fast Gaussian Elimination with Partial Pivoting for Matrices with Displacement Structure, Math. of Computation, 64, 1557–1576, 1995.
[MB72] R. Moenck, A. Borodin, Fast Modular Transform via Division, Proc. of 13th Annual Symposium on Switching and Automata Theory, 90–96, IEEE Computer Society Press, Washington, DC, 1972.
[P89/90] V. Y. Pan, On Computations with Dense Structured Matrices, Math. of Computation, 55(191), 179–190, 1990. Proceedings version in Proc. ISSAC 89, 34–42, ACM Press, New York, 1989.
[P92] V. Y. Pan, Complexity of Computations with Matrices and Polynomials, SIAM Review, 34, 2, 225–262, 1992.
[P01] V. Y. Pan, Structured Matrices and Polynomials: Unified Superfast Algorithms, Birkhäuser/Springer, Boston/New York, 2001.

Testing Pivoting Policies in Gaussian Elimination⋆ Brian Murphy1,⋆⋆ , Guoliang Qian2,⋆⋆⋆ , Rhys Eri Rosholt1,† , Ai-Long Zheng3, Severin Ngnosse2,‡ , and Islam Taj-Eddin2,§ 1

Department of Mathemati s and Computer S ien e, Lehman College, City University of New York, Bronx, NY 10468, USA ⋆⋆

2

brian.murphy@lehman.cuny.edu † rosholt@lehman.cuny.edu

Ph.D. Program in Computer S ien e, The City University of New York, New York, NY 10036 USA ⋆⋆⋆

guoliangqian@yahoo.com ‡ sngnosse@msn.com § itaj-eddin@gc.cuny.edu

3

Ph.D. Program in Mathemati s, The City University of New York, New York, NY 10036 USA, azheng 1999@yahoo.com

Abstract. We begin with spe ifying a lass of matri es for whi h Gaussian elimination with partial pivoting fails and then observe that both rook and omplete pivoting easily handle these matri es. We display the results of testing partial, rook and omplete pivoting for this and other

lasses of matri es. Our tests on rm that rook pivoting is an inexpensive but solid ba kup wherever partial pivoting fails.

Keywords: Gaussian elimination, pivoting.

1

Introduction

Hereafter we write GEPP, GECP, and GERP to denote Gaussian elimination with partial, omplete, and rook pivoting. GEPP and GPPP are Wilkinson's

lassi al algorithms [1℄, [2℄, [3℄, whereas GERP is a more re ent and mu h less known invention [4℄, [5℄, [6℄. Ea h of the three algorithms uses (2/3)n3 + O(n2 )

ops to yield triangular fa torization of an n × n matrix, but they dier in the number of omparisons involved, and GEPP has slightly weaker numeri ally. Namely, both GERP and GECP guarantee numeri al stability [7℄, [5℄, whereas GEPP is statisti ally stable for most of the input instan es in omputational pra ti e but fails for some rare but important lasses of inputs [8℄, [9℄, [10℄. Nevertheless GEPP is omnipresent in modern numeri al matrix omputations, whereas GECP is rarely used. The reason is simple: GEPP involves (1/2)n2 + O(n) omparisons versus (1/3)n3 + O(n2 ) in GECP, that is the omputational ⋆

Supported by PSC CUNY Award 69350{0038

358

B. Murphy et al.

ost of pivoting is negligible versus arithmeti ost for GEPP but is substantial for GECP. GERP ombines the advantages of both GECP and GEPP. A

ording to the theory and extensive tests, GERP is stable numeri ally almost as as GECP and is likely to use about 2n2 omparisons for random input matri es (see [4℄, [5℄, [6℄, and our Remark 1), although it uses the order of n3 omparisons in the worst ase [3, page 160℄. Ea h of GEPP, GECP, and GERP an be ombined with initial s aling for additional heuristi prote tion against instability, whi h requires from about n2 to about 2n2 omparisons and as many ops [1, Se tion 3.5.2℄, [2, Se tion 3.4.4℄, [3, Se tion 9.7℄, so that the overall omputational ost is still strongly dominated by the elimination ops. The ustomary examples of well onditioned matri es for whi h GEPP fails numeri ally are rather ompli ated, but in the next se tion we give a simple example, whi h should provide learer insight into this problem. Namely, we spe ify a lass of input matri es for whi h already the rounding errors at the rst elimination step of GEPP ompletely orrupt the output. The results of our numeri al tests in Se tion 3 show that both GECP and GERP have no problems with this lass. We also in lude the test results for six other input lasses. For ea h lass we present the number of omparisons, growth fa tor, and the norms of the error and residual ve tors, whi h gives a more omplete pi ture versus [4℄, [5℄, and [6℄ ( f. our on luding Remark 2). Our tests on rm that GERP is an inexpensive but solid ba kup wherever GEPP fails.

2

A Hard Input Class for GEPP

Already the rst step of Gaussian elimination tends to magnify the input errors wherever the pivot entry is absolutely smaller than some other entries in the same row and olumn. For example, represent an input matrix M as follows, M=

1 vT u B

n−1 n−1 = (mij )i,j=0 , B = (mij )i,j=1 ,

(1)

let ε denote the ma hine epsilon (also alled unit roundo), and suppose that u = se, v = te, e = (1, 1, . . . , 1)T , |mij | 6 1 for i, j > 0,

(2)

s < 2/ε, t = 1.

Then the rst elimination step, performed error-free, produ es an (n−1)×(n−1) matrix Bs = B + seeT , whi h turns into a rank-one matrix (s)eeT in the result of rounding. Here and hereafter (a) denotes the oating-point representations of a real number a.

Testing Pivoting Poli ies in Gaussian Elimination

359

Partial pivoting xes the latter problem for this matrix but does not help against exa tly the same problem where the input matrix M satis es equations (1) and (2) and where s = 1, t > 2/ε.

(3)

In this ase the rst elimination step, performed error-free, would produ e the (n−1)× (n−1) matrix Bt = B+teeT . Rounding would turn it into the rank-one matrix (t)eeT . We refer the reader to [8℄ and [9℄ ( f. also [10℄) on some narrow but important

lasses of linear systems of equations oming from omputational pra ti e on whi h GEPP fails to produ e orre t output.

3

Experimental Results

Tables 1{4 show the results of tests by Dr. Xinmao Wang at the Department of Mathemati s, University of S ien e and Te hnology of China, Hefei, Anhui 230026, China. He implemented GEPP, GECP, and GERP in C++ under the 64-bit Fedore Core 7 Linux with AMD Athlon64 3200+ unipro essor and 1 GB memory. In his implementation he used n omparisons for omputing the maximum of n numbers. He tested the algorithms for n × n matri es M of the following seven lasses. 1. Matri es with random integer entries uniformly and independently of ea h other distributed in the range (−10l , 10l ). 2. Matri es M = PLU for n × n permutation matri es P that de ne n inter hanges of random pairs of rows and for lower unit triangular matri es L and UT with random integer entries in the range (−10b , 10b ). 3. Matri es M = SΣT for random orthogonal matri es S and T ( omputed as the Q-fa tors in the QR fa torization of matri es with random integer entries uniformly and independently of ea h other distributed in the range (−10c , 10c )) and for the diagonal matrix Σ = diag(σi )ni=1 where σ1 = σ2 = · · · = σn−ρ = 1 and σn−ρ+1 = σn = 10−q ( f. [3, Se tion 28.3℄). 4. Matri es M satisfying equations (1){(3) where B denotes an (n−1)×(n−1) matrix from matrix lass 1 above.

360

B. Murphy et al. 0

5. Matri es

I O ... B−M1 I O . . . B B B M = B −M1 I B B .. .. @ . . −M1

1 I OC C .. C .C C C C OA I

from [8, page 232℄, where

−0.05 0.3 0.994357 0.289669 M1 = exp . ≈ 0.3 −0.05 0.289669 0.994357 1 0 0 ··· 0 −1/C − kh 1 − kh 0 ··· 0 −1/C 2 2 .. .. kh kh . . . . . − 2 −kh 1 − 2 6. Matri es M = . . . . . .. .. .. . 0 −1/C kh − 2 −kh · · · −kh 1 − kh −1/C 2 −kh · · · −kh −kh 1 − 1/C − − kh 2

page 1360℄, where kh = 23 , C = 6. 0

7. Matri es

1

0 ··· 0 1

B B−1 1 B B M=B B−1 −1 B B .. .. @ . .

kh 2

from [9,

1

. .C . .. .. C C C .. . 0 1C C from [10, page 156℄. C C .. . 1 1A ..

−1 −1 · · · −1 1

n = 128:

Class            minimal   maximal   average
Class 1          31371     37287     34147
Class 2          35150     40904     38168
Class 3, ρ = 1   30189     36097     32995
Class 3, ρ = 2   30597     36561     32960
Class 3, ρ = 3   29938     35761     32967
Class 4          31342     36333     33648
Class 5          24318
Class 6          32258
Class 7          32764

n = 256:

Class            minimal   maximal   average
Class 1          131692    146780    139419
Class 2          147123    161971    153559
Class 3, ρ = 1   127911    143706    136361
Class 3, ρ = 2   129228    144226    136427
Class 3, ρ = 3   129945    145882    136508
Class 4          131533    146014    138392
Class 5          97790
Class 6          130050
Class 7          131068

(Each of the classes 5–7 consists of a single fixed matrix, so a single count is reported for it.)

Table 1. Numbers of comparisons in GERP.

For each matrix of classes 1–4 the tests were performed for m = 1000 input instances M for each of the two values n = 128 and n = 256, for b = c = l = 4, and for q = 10. For class 3 the tests were performed for each of the three values

Testing Pivoting Policies in Gaussian Elimination


n = 128:

Class            GEPP         GECP         GERP
Class 1          13.8 ± 2.5   6.4 ± 0.4    8.4 ± 0.8
Class 2          2.5 ± 0.5    1.5 ± 0.2    1.8 ± 0.2
Class 3, ρ = 1   17.4 ± 4.0   8.7 ± 1.0    11.6 ± 1.8
Class 3, ρ = 2   15.6 ± 3.6   7.7 ± 0.8    10.2 ± 1.4
Class 3, ρ = 3   14.3 ± 3.5   7.0 ± 0.7    9.3 ± 1.3
Class 4          FAIL         1            1
Class 5          3.4e6        2            2
Class 6          6.6e36       1.33         1.33
Class 7          1.7e38       2            2

n = 256:

Class            GEPP         GECP         GERP
Class 1          21.8 ± 3.8   9.5 ± 0.6    12.8 ± 1.3
Class 2          3.4 ± 0.6    1.9 ± 0.2    2.4 ± 0.3
Class 3, ρ = 1   32.2 ± 7.4   15.5 ± 1.7   20.6 ± 2.9
Class 3, ρ = 2   29.2 ± 6.7   13.8 ± 1.4   18.6 ± 2.9
Class 3, ρ = 3   27.0 ± 6.1   12.5 ± 1.3   16.7 ± 2.3
Class 4          FAIL         1            1
Class 5          3.1e13       2            2
Class 6          8.6e74       1.33         1.33
Class 7          5.8e76       2            2

Table 2. Growth factor in GEPP/GECP/GERP.

ρ = 1, 2, 3. Besides the results of these tests, Tables 1–4 also cover the test results for matrices M of classes 5–7 (from the papers [8], [9], and [10], respectively), for which GEPP produced corrupted outputs. To every matrix GEPP, GECP, and GERP were applied. As was expected, for matrix classes 1–3 the numerical performance of GEPP, GECP, and GERP was similar, but for classes 4–7 GEPP either failed or lost many more correct input bits than GECP and GERP. Table 1 shows the minimum, maximum and average numbers of comparisons used in GERP for every input class of matrices. Table 2 shows the average growth factor

φ = max_{i,j,k=0}^{n−1} |m_{ij}^{(k)}| / max_{i,j=0}^{n−1} |m_{ij}|

(as well as its standard deviation from the average), where M^{(k)} = (m_{ij}^{(k)})_{i,j=k}^{n−1} denotes the matrix computed in k steps of Gaussian elimination with the selected pivoting policy and M = M^{(0)} = (m_{ij})_{i,j=0}^{n−1} denotes the input matrix. Tables 3 and 4 show the average norms of the error and residual vectors, respectively, as well as the standard deviations from the average, where the linear systems My = f were solved by applying GECP, GEPP, and GERP. The vectors f were defined according to the following rule: first generate vectors y with random components from the sets {−1, 0, 1} or {−1, 1}, then save these vectors for computing the error vectors, and finally compute the vectors f = My.
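The growth factor defined above is easy to instrument directly. The following is a minimal Python/numpy sketch of GEPP that tracks it (the reported experiments used a C++ implementation; the function name `growth_factor_gepp` is ours):

```python
import numpy as np

def growth_factor_gepp(M):
    """Gaussian elimination with partial pivoting on a copy of M, returning
    phi = max_{i,j,k} |m_ij^(k)| / max_{i,j} |m_ij|, where M^(k) is the
    trailing block left after k elimination steps."""
    A = np.array(M, dtype=float)
    n = A.shape[0]
    denom = np.abs(A).max()
    numer = denom                                   # the k = 0 term of the maximum
    for k in range(n - 1):
        p = k + int(np.argmax(np.abs(A[k:, k])))    # partial pivoting in column k
        A[[k, p]] = A[[p, k]]
        A[k+1:, k] /= A[k, k]
        A[k+1:, k+1:] -= np.outer(A[k+1:, k], A[k, k+1:])
        numer = max(numer, np.abs(A[k+1:, k+1:]).max())
    return numer / denom

rng = np.random.default_rng(0)
M = rng.integers(-10**4, 10**4, (128, 128))         # a class 1 test matrix, l = 4
print(growth_factor_gepp(M))                        # modest for random inputs
```

On random class 1 inputs the growth factor stays small, matching the GEPP column of Table 2.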

Remark 1. Table 1 shows the results of testing GERP where n comparisons were used for computing the maximum of n numbers. Extensive additional tests with random matrices (of class 1) for n = 2^h and for h ranging from 5 to 10 were performed in the Graduate Center of the City University of New York. In these tests the modification of GERP was run where no tested row or column is examined again until the next elimination step. Furthermore, the tests used k−1 comparisons for computing the maximum of k numbers. The observed numbers of comparisons slightly decreased versus Table 1 and always stayed below 2n².
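The rook pivoting search behind GERP alternates column and row scans until the current entry is maximal in magnitude in both its row and its column. A minimal numpy sketch of one such pivot search (names are ours; as in the tests above, a maximum of m numbers is charged m comparisons):

```python
import numpy as np

def rook_pivot(A, k):
    """One rook-pivot search in the trailing block A[k:, k:]; returns the
    pivot position (in the coordinates of A) and the comparison count."""
    B = np.abs(A[k:, k:])
    m = B.shape[0]
    comparisons = 0
    j = 0
    i = int(np.argmax(B[:, j])); comparisons += m       # scan column 0
    while True:
        jj = int(np.argmax(B[i, :])); comparisons += m  # scan row i
        if B[i, jj] == B[i, j]:   # (i, j) is also maximal in its row: done
            return k + i, k + j, comparisons
        j = jj
        ii = int(np.argmax(B[:, j])); comparisons += m  # scan the new column
        if B[ii, j] == B[i, j]:   # (i, j) is also maximal in its column: done
            return k + i, k + j, comparisons
        i = ii

rng = np.random.default_rng(5)
A = rng.standard_normal((8, 8))
i, j, c = rook_pivot(A, 0)
print(i, j, c)
```

The magnitude of the visited entries strictly increases from scan to scan, so the search always terminates.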


n = 128:

Class            GEPP                 GECP                 GERP
Class 1          6.8e-13 ± 3.4e-12    5.2e-13 ± 2.8e-12    4.8e-13 ± 2.2e-12
Class 2          1.7e7 ± 2.6e8        8.7e5 ± 4.6e6        6.6e5 ± 3.7e6
Class 3, ρ = 1   1.1e-5 ± 8.4e-6      7.4e-6 ± 5.7e-6      8.7e-6 ± 6.7e-6
Class 3, ρ = 2   1.7e-5 ± 8.8e-6      1.2e-5 ± 6.1e-6      1.3e-5 ± 7.0e-6
Class 3, ρ = 3   2.1e-5 ± 9.2e-6      1.5e-5 ± 6.2e-6      1.7e-5 ± 7.5e-6
Class 4          FAIL                 5.7e-13 ± 6.3e-12    5.7e-13 ± 3.5e-12
Class 5          1.0e-9               2.7e-15              2.7e-15
Class 6          3.1e3                2.7e-15              2.7e-15
Class 7          6.5                  0.0                  0.0

n = 256:

Class            GEPP                 GECP                 GERP
Class 1          3.8e-12 ± 3.7e-11    2.8e-12 ± 4.0e-11    2.6e-12 ± 2.0e-11
Class 2          3.9e7 ± 5.0e8        1.1e6 ± 4.1e6        2.2e6 ± 1.3e7
Class 3, ρ = 1   2.0e-5 ± 1.5e-5      1.3e-5 ± 9.3e-6      1.5e-5 ± 1.1e-5
Class 3, ρ = 2   3.1e-5 ± 1.6e-5      2.0e-5 ± 1.1e-5      2.4e-5 ± 1.2e-5
Class 3, ρ = 3   3.9e-5 ± 1.7e-5      2.5e-5 ± 1.1e-5      2.9e-5 ± 1.2e-5
Class 4          FAIL                 3.6e-12 ± 4.0e-11    3.6e-12 ± 2.5e-11
Class 5          1.4e-2               3.7e-15              3.7e-15
Class 6          7.2e57               3.6e-14              3.6e-14
Class 7          11.3                 0.0                  0.0

Table 3. Norms of the error vectors in GEPP/GECP/GERP.

Remark 2. Similar test results for class 1 were presented earlier in [5] and [6], and for classes 3 and 5–7 in [5], but [5] shows no norms of the error and residual vectors. It seems that GEPP, GECP, and GERP have not been tested earlier for classes 2 and 4.

Acknowledgement. We are happy to acknowledge the valuable experimental support of our work by Dr. Xinmao Wang.

References

1. G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd edition, The Johns Hopkins University Press, Baltimore, Maryland, 1996.
2. G. W. Stewart, Matrix Algorithms, Vol. I: Basic Decompositions, SIAM, Philadelphia, 1998.
3. N. J. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, Philadelphia, 2002 (second edition).
4. L. Neal and G. Poole, A Geometric Analysis of Gaussian Elimination, II, Linear Algebra and Its Applications, 173, 239–264, 1992.


n = 128:

Class            GEPP                 GECP                 GERP
Class 1          1.6e-9 ± 3.0e-10     1.1e-9 ± 1.7e-10     1.2e-9 ± 2.1e-10
Class 2          2.2e-4 ± 1.6e-3      1.2e-4 ± 4.7e-4      1.1e-4 ± 6.3e-4
Class 3, ρ = 1   3.1e-14 ± 5.1e-15    2.0e-14 ± 2.9e-15    2.3e-14 ± 3.6e-15
Class 3, ρ = 2   3.0e-14 ± 5.0e-15    1.9e-14 ± 2.8e-15    2.3e-14 ± 3.6e-15
Class 3, ρ = 3   3.0e-14 ± 5.3e-15    1.9e-14 ± 2.8e-15    2.3e-14 ± 3.5e-15
Class 4          FAIL                 3.3e2 ± 3.3e2        3.5e2 ± 3.3e2
Class 5          1.1e-9               1.9e-15              1.9e-15
Class 6          2.9e3                1.7e-14              1.7e-14
Class 7          14.5                 0.0                  0.0

n = 256:

Class            GEPP                 GECP                 GERP
Class 1          7.1e-9 ± 1.1e-9      4.4e-9 ± 5.8e-10     5.2e-9 ± 7.2e-10
Class 2          2.1e-3 ± 3.7e-2      6.2e-4 ± 2.1e-3      1.5e-3 ± 1.6e-2
Class 3, ρ = 1   9.8e-14 ± 1.5e-14    5.7e-14 ± 6.8e-15    7.4e-14 ± 9.3e-15
Class 3, ρ = 2   9.7e-14 ± 1.4e-14    5.7e-14 ± 7.0e-15    7.1e-14 ± 9.2e-15
Class 3, ρ = 3   3.9e-5 ± 1.7e-5      5.7e-14 ± 6.9e-15    7.0e-14 ± 9.1e-15
Class 4          FAIL                 6.7e2 ± 6.5e2        6.6e2 ± 6.3e2
Class 5          9.0e-3               2.6e-15              2.6e-15
Class 6          2.1e58               1.0e-13              1.0e-13
Class 7          41.1                 0.0                  0.0

Table 4. Norms of the residual vectors in GEPP/GECP/GERP.

5. L. V. Foster, The Growth Factor and Efficiency of Gaussian Elimination with Rook Pivoting, J. of Comp. and Applied Math., 86, 177–194, 1997. Corrigendum in J. of Comp. and Applied Math., 98, 177, 1998.
6. G. Poole and L. Neal, The Rook's Pivoting Strategy, J. of Comp. and Applied Math., 123, 353–369, 2000.
7. J. H. Wilkinson, Error Analysis of Direct Methods of Matrix Inversion, J. of ACM, 8, 281–330, 1961.
8. S. J. Wright, A Collection of Problems for Which Gaussian Elimination with Partial Pivoting Is Unstable, SIAM J. on Sci. Stat. Computing, 14, 1, 231–238, 1993.
9. L. V. Foster, Gaussian Elimination with Partial Pivoting Can Fail in Practice, SIAM J. on Matrix Analysis and Applications, 15, 4, 1354–1362, 1994.
10. N. J. Higham and D. J. Higham, Large Growth Factors in Gaussian Elimination with Pivoting, SIAM J. on Matrix Analysis and Applications, 10, 2, 155–164, 1989.

Newton's Iteration for Matrix Inversion, Advances and Extensions⋆

Victor Y. Pan
Department of Mathematics and Computer Science
Lehman College of CUNY, Bronx, NY 10468, USA
victor.pan@lehman.cuny.edu
http://comet.lehman.cuny.edu/vpan/

Abstract. We first cover Newton's iteration for generalized matrix inversion, its ameliorations, recursive compression of its iterates in the case of structured inputs, some techniques of continuation via factorization, and extension to splitting the Singular Value Decomposition. We combine the latter extension with our recent fast algorithms for the null space bases (prompted by our progress in randomized preconditioning). We applied these combinations to compute the respective spaces of singular vectors and to arrive at divide-and-conquer algorithms for matrix inversion and computing determinants. Our techniques promise to be effective for computing other matrix functions in the case of ill conditioned inputs.

Keywords: Matrix inversion, Newton's iteration, Matrix structure, Continuation (homotopy), Divide-and-conquer algorithms, Null spaces.

1  Introduction

Newton's iteration for generalized matrix inversion amounts mostly to performing a sequence of matrix multiplications. This level-three BLAS performance is particularly effective on systolic arrays and parallel computers. Newton's iteration for the generalized inverse is important in its own right but also as a sample technique for computing various other matrix functions such as the square root, the matrix sign function, and the solution of Riccati's equation. We survey and advance this approach, show its acceleration in the case of structured input matrices, its combination with our new techniques of homotopic continuation, factorization, and preconditioning, as well as its extension to divide-and-conquer algorithms for splitting the Singular Value Decomposition, that is, for computing the respective subspaces generated by singular vectors (hereafter we refer to such subspaces as singular subspaces and invoke the usual abbreviation SVD). The latter extensions employ our recent techniques for the computation of bases for null spaces, which should enhance the power of the approach.

⋆ Supported by PSC CUNY Award 69330–0038.


We recall some basic definitions in the next section and then, in Section 3, the convergence analysis from [1] and [2] and some recipes for the initialization. In Section 4 we describe three techniques that exploit input structure to save running time and computer memory by compressing the computed approximations. All three techniques usually require reasonably good initialization (in spite of some interesting phenomenon of autocorrection in compression), and in Section 5 we cover a general recipe for initialization by means of homotopy (continuation), effective for both general and structured inputs. We improve conditioning of continuation by representing it as recursive factorization. These preconditioning techniques can be of interest in their own right, independently of the considered iterative processes. In Section 6 we describe a modified iteration directed to splitting the SVD and its generalizations. This technique produces bases for the respective singular subspaces and can be extended to divide-and-conquer algorithms for the inverses, determinants, square roots, and other matrix functions. The technique is proposed for general Hermitian input matrices. (It does not preserve matrix structure except for symmetry.) We cover this direction in Section 7, where we also employ our recent effective algorithms for computing null space bases of general non-Hermitian matrices. We briefly recall these algorithms in Section 8 and point out their natural extension to randomized preconditioning of ill conditioned inputs. In Section 9 we discuss some directions for further study.

2  Basic Definitions

We rely on the customary definitions for matrix computations in [3]–[8]. M^H denotes the Hermitian transpose of a matrix M. I_k is the k × k identity matrix. I is the identity matrix of an unspecified size. (A, B) is the 1 × 2 block matrix with blocks A and B. diag(a_i)_i (resp. diag(B_i)_i) is the diagonal (resp. block diagonal) matrix with the diagonal entries a_i (resp. diagonal blocks B_i). U is a unitary matrix if U^H U = I. N(M) denotes the (right) null space of a matrix M. range(M) is the range of a matrix M, that is, its column span. A matrix M is a matrix basis for a space S if its columns form a basis for this space, that is, if the matrix has full column rank and if range(M) = S. A matrix basis for the null space N(M) is a null matrix basis for a matrix M. ρ = rank(M) is its rank. σ_i(M) is its ith largest singular value, i = 1, 2, …, ρ. cond₂ M = σ₁(M)/σ_ρ(M) ≥ 1 is the condition number of a matrix M of a rank ρ. A matrix is well conditioned if its condition number is not large (relative to the computational task and computer environment) and is ill conditioned otherwise. C⁺ and C⁻ denote the Moore–Penrose generalized inverse of a matrix C, so that C⁺ = C⁻ = C⁻¹ for a nonsingular matrix C.
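These definitions are straightforward to exercise numerically. A small sketch (numpy assumed; the variable names are ours) computing the rank, cond₂, and checking that C⁺ = C⁻¹ for a nonsingular, well conditioned C:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 4))

sigma = np.linalg.svd(M, compute_uv=False)    # sigma_1 >= sigma_2 >= ...
rho = int(np.sum(sigma > sigma[0] * 1e-12))   # numerical rank rho = rank(M)
cond2 = sigma[0] / sigma[rho - 1]             # cond_2 M = sigma_1 / sigma_rho >= 1
print(rho, cond2)

# For a nonsingular C the Moore-Penrose inverse C^+ coincides with C^{-1}.
G = rng.standard_normal((5, 5))
C = G @ G.T + np.eye(5)                       # positive definite, well conditioned
print(np.allclose(np.linalg.pinv(C), np.linalg.inv(C)))
```

For the well conditioned C above the two inverses agree to machine precision.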


3  Newton's iteration for matrix inversion. Its initialization and acceleration

Newton's iteration

x_{k+1} = x_k − f(x_k)/f′(x_k), k = 0, 1, …,   (1)

rapidly improves a crude initial approximation x = x₀ to the solution x = r of an equation f(x) = 0 provided f(x) is a smooth nearly linear function on an open line interval that covers the two points r and x₀. Equation (1) can be obtained by truncating all terms of the orders of at least two in Taylor's expansion of the function f(x) at x = r. Hotelling [9] and Schultz [10] extended Newton's iteration (1) to the case where x = X, x_k = X_k, and f(x_k) = f(X_k) are matrices and f(X) = M − X⁻¹ for two matrices M and X. In this case Newton's iteration rapidly improves a crude initial approximation X₀ to the inverse of a nonsingular n × n matrix M,

X_{k+1} = X_k(2I − MX_k), k = 0, 1, ….   (2)

Indeed, define the error and residual matrices

E_k = M⁻¹ − X_k, e_k = ‖E_k‖, R_k = ME_k = I − MX_k, ρ_k = ‖R_k‖

for all k, assume a matrix norm ‖·‖ satisfying the submultiplicative property ‖AB‖ ≤ ‖A‖ ‖B‖, and deduce from equation (2) that

R_k = R_{k−1}² = R_0^{2^k}, ρ_k ≤ ρ_0^{2^k},   (3)

ME_k = (ME_{k−1})² = (ME_0)^{2^k}, e_k ≤ e_0^{2^k} ‖M‖^{2^k − 1}.   (4)

The latter equations show quadratic convergence of the approximations X_k to the inverse matrix M⁻¹ provided ρ₀ < 1. Each step (2) amounts essentially to performing matrix multiplication twice. Iteration (2) is numerically stable for nonsingular matrices M, but numerical stability has been proved in [2] for its extensions (16) and (17) in Section 6 even where the matrix M is singular. Ben-Israel in [11] and Ben-Israel and Cohen in [12] proved that the iteration converges where X₀ = aM^H for a sufficiently small positive scalar a. Söderström and Stewart [1] analyzed Newton's iteration based on the SVDs of the involved matrices. This study was continued by Schreiber in [13] and then in [2]. We outline this work by using Generalized SVDs (hereafter to be referred to as GSVDs), that is, nonunique representations of matrices as UΣV^H where U and V are unitary matrices and Σ is a diagonal matrix. They turn into SVDs wherever Σ denotes diagonal matrices filled with nonnegative entries in nonincreasing order. Assume that the matrices M and X₀ have GSVDs

M = UΣV^H, X₀ = VΦ₀U^H   (5)


for some unitary matrices U and V and diagonal matrices Σ = diag(σ_i)_i and Φ₀ = diag(φ_{i,0})_i. In particular this is the case where

X₀ = f(M^H)   (6)

is a matrix function in M^H, e.g.,

X₀ = aM^H + bI   (7)

for two scalars a and b. Then we have

X_kM = VS_kV^H, S_k = diag(s_i^{(k)})_i, 1 − s_i^{(k+1)} = (1 − s_i^{(k)})²   (8)

for all i and k. Furthermore, we have

s_i^{(0)} = σ_iφ_{i,0} = σ_if(σ_i)   (9)

for all i under (6), so that iteration (2) converges to the generalized inverse M⁻ if 0 < s_i^{(0)} = σ_iφ_{i,0} < 2 for all i. Convergence is locally quadratic but can be slow initially if the values s_i^{(0)} are close to zero or to two for some subscripts i. More precisely, assume the choice (7) for b = 0 and a = 1/(‖M‖₁‖M‖_∞) [11]. Then it can be proved that ρ₀ ≤ 1 − 1/((cond₂ M)²n) (cf. [14]). By choosing a = y/(‖M‖₁‖M‖_∞) for any value of y such that 1 ≤ y ≤ 2(cond₂ M)²n/(1 + (cond₂ M)²n) we obtain the slightly improved bound ρ₀ ≤ 1 − y/((cond₂ M)²n). In particular for y = 2n/(1+n) we obtain that ρ₀ ≤ 1 − 2n/((cond₂ M)²(1+n)). Under these choices we need about ν = 2 log₂ cond₂ M steps (2) to decrease the residual norm ρ_k below 1/e = 0.3678794…. Then in the order of l = log₂ ln h additional steps (2) we would yield the bound ρ_{ν+l} ≤ e^{−2^l} ≤ 1/h, e = 2.7182818…. The bound on the number ν of initial steps is critical for ill conditioned matrices. It was decreased roughly by twice in [2] by means of replacing iteration (2) by its scaled version

X_{k+1} = a_kX_k(2I − MX_k), k = 0, 1, …, l

(10)

for appropriate scalars a_k. Clearly, the inversion of a nonsingular matrix M can be reduced to the inversion of either of the Hermitian positive definite matrices M^HM or MM^H, because M⁻¹ = (M^HM)⁻¹M^H = M^H(MM^H)⁻¹, or of the Hermitian matrix

[ 0     M ]                        [ 0      M^{−H} ]
[ M^H   0 ],  having the inverse   [ M^{−1}  0     ].

Now suppose M is a Hermitian matrix. Then one can further accelerate the computations by twice by choosing the initial approximation X₀ = yI/‖M‖₁ for any value y such that 1 ≤ y ≤ 2√n(cond₂ M)/(1 + √n(cond₂ M)). This yields the bound ρ₀ ≤ 1 − 2√n/((cond₂ M)(1 + √n)).

eleration for a large lass of inputs by means of repla ing iteration (2) with ubi iteration Xk+1 = (cX2k + dXk + eI)Xk ,

368

Vi tor Y. Pan

k = 0, 1, . . . for appropriate s alars c, d, and e. The latter resour e was employed

again in [15℄ in the ase of sru tured input matri es. For more narrow input

lasses one an try to yield further a

eleration of onvergen e by applying more general iteration s hemes. For example, re all the following two-stage iteration [16℄{[18℄, having ee tive appli ations to integral equations via the asso iated tensor omputations, Xk+1 = Xk (2I − Xk ), Yk+1 = Yk (2I − Xk ).

Here Y0 = I and M = X0 is a nonsingular matrix su h that σ1 (I − X0 ) = kI − X0 k2 < 1. It is readily veri ed that Xk = X0 Yk for all k and that the matri es Xk onverge to the identity matrix I. Consequently the matri es Yk

onverge to the inverse M−1 = X−1 0 .

4

Structured iteration, recursive compressions, and autocorrection

Next, assuming that the input matrix M is stru tured and is given with its short displa ement generator, we modify Newton's iteration to perform its steps faster. We begin with re alling some ba kground on the displa ement representation of matri es ( f. [19℄{[21℄). We rely on the Sylvester displa ement operators ∇A,B (M) ← AM − MB, de ned by the pairs of the asso iated n × n operator matri es A and B. The next simple fa t relates them to the Stein operators ∆A,B (M) = M − AMB. Theorem 1. ∇A,B = A∆A−1 ,B

nonsingular.

if A is nonsingular. ∇A,B = −∆−1 A,B

−1

B

if B is

∇A,B (M) is the displa ement of M, its rank is the displa ement rank of M. The matrix pair {S, T } is a displa ement generator of length l for M if ∇A,B (M) = ST H and if S and T are n × l matri es. If a matrix M has displa ement rank r = rank ∇A,B (M) and is given with its displa ement generator of a length l, then one an readily ompute its displa ement generator of length r in O(l2 n) ops [21, Se tions 4.6℄. Most popular stru tures of Toeplitz, Hankel, Vandermonde and Cau hy types are asso iated with the operators ∇A,B where ea h of the operator matri es A and B is diagonal or unit f- ir ulant. For su h operators simple l-term bilinear or trilinear expressions of an n × n matrix M via the entries of its displa ement generator {S, T } of length l an be found in [20℄, [21, Se tions 4.4.4 and 4.4.5℄, and [22℄. If l λ1 . To extend our study to the ase of inde nite Hermitian matri es M we just √ need to modify the matri es Mk and Pk by repla ing tk ← tk −1 for all k. We refer the reader to [36, Se tion 7℄ on some extensions of homotopi te hniques to the ase of non-Hermitian input matri es. If the input matrix M has stru ture of Toeplitz type or has rank stru ture, then so do the matri es Mk , Pk , and Vk for all k, and we an a

elerate the

omputations respe tively. We an extend the stru tures of other types from the matrix M to the matri es Mk for all k (and onsequently also to the matri es Pk and Vk for all k) simply by rede ning the matri es: Mk ← M + tk N where the matrix N shares the stru ture with the matrix M. E.g., for a Hankel-like matrix

374

Vi tor Y. Pan

M, we can choose N being the reflection matrix, which has entries ones on its antidiagonal and zero entries elsewhere. For matrices M having structure of Vandermonde or Cauchy type, we can choose N being a Vandermonde or Cauchy matrix, respectively, associated with the same operator ∇_{A,B}. Alternatively, to invert a matrix M having the structure of Vandermonde or Cauchy type we can first compute the matrix N = VMW where each of V and W is an appropriate Vandermonde matrix or the inverse or transpose of such a matrix. This would reduce the original inversion problem to the case of a Toeplitz-like matrix N because M⁻¹ = W⁻¹N⁻¹V⁻¹. (This technique of displacement transformation is due to [38], was extensively used by G. Heinig, and is most widely known because of its effective application to practical solution of Toeplitz and Toeplitz-like linear systems of equations in [39].) We have the following lower bound on the number l of homotopic steps,

l + 1 ≥ log_κ cond₂(M)

for every scalar κ exceeding the condition numbers of the matrices M₀, P₀, …, P_{l−1}, and V_l. This bound is implied by the inequality

cond₂(M) ≤ cond₂(M₀) cond₂(V_l) ∏_{k=0}^{l−1} cond₂(P_k).

With an appropriate choice of step sizes one only needs O(log₂ cond₂ M + log₂ ln h) Newton steps (2) overall to approximate M⁻¹ with the residual norm below 1/h (cf. [36]).
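The homotopic recipe can be sketched for a Hermitian positive definite input (a simplified fixed halving schedule for the shift rather than the step-size choices of [36]; the names are ours):

```python
import numpy as np

def newton_refine(A, X, tol=1e-12, max_steps=100):
    """Newton's iteration (2) started from a supplied approximation X."""
    I = np.eye(A.shape[0])
    for _ in range(max_steps):
        R = I - A @ X
        if np.linalg.norm(R) < tol:
            break
        X = X @ (I + R)               # X (I + R) == X (2I - A X)
    return X

def homotopy_inverse(M, levels=25):
    """Continuation for Hermitian positive definite M: start from the well
    conditioned matrix M + t0*I, halve the shift repeatedly, and reuse the
    current approximate inverse as the initial guess at each new shift."""
    n = M.shape[0]
    I = np.eye(n)
    t0 = np.linalg.norm(M, 2)
    A = M + t0 * I
    X = newton_refine(A, I / np.linalg.norm(A, 2))  # rho_0 < 1 since A is PD
    for j in range(1, levels + 1):
        X = newton_refine(M + (t0 / 2**j) * I, X)
    return newton_refine(M, X)

rng = np.random.default_rng(9)
G = rng.standard_normal((10, 10))
M = G @ G.T + 0.5 * np.eye(10)
X = homotopy_inverse(M)
print(np.linalg.norm(np.eye(10) - M @ X))   # small residual
```

Halving the shift keeps the initial residual at each new matrix on the path below 1/2, so every refinement needs only a handful of Newton steps.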

6  Splitting GSVDs

We keep using the definitions in equations (5) and (8) and at first recall the following iteration from [13],

Y_k = X_k(2I − MX_k), X_{k+1} = Y_kMY_k, k = 0, 1, …,   (16)

such that

X_{k+1}M = ((2I − X_kM)X_kM)², k = 0, 1, …,

and for X₀ = aM^H the singular values s_i^{(k)} of the matrices X_kM satisfy the quartic equations

s_i^{(k+1)} = ((2 − s_i^{(k)})s_i^{(k)})², i = 1, 2, …, n; k = 0, 1, ….

The basic quartic polynomial mapping s ← (2 − s)²s² for this iteration has the four fixed points s̃₀ = 0, s̃₁ = (3 − √5)/2 = 0.3819…, s̃₂ = 1, and s̃₃ = (3 + √5)/2 = 2.618…. The iteration sends the singular values s_i^{(0)} of the matrix X₀M to zero


if they lie in the interval {s : 0 < s < s~√1 } and sends them to one if they lie in the interval {s : ~s1 < s < 2 − ~s1 = (1 + 5)/2 = 1.618 . . . }. If all singular values of the matrix X0 M lie in these two intervals, then under (6) the matri es Xk onverge to the generalized inverse (M<s )− of the matrix M<s where sf(s) = s~1 under (6). Here and hereafter we write M<s = UΣ<s V H where Σ<s = diag(σi(<s) ), σi(<s) equals σi if σi > s and equals zero otherwise, so that M<s and M>s = M − M<s V H denote the two matri es obtained by setting to zeros all singular values σi of the matrix M ex eeded by s and greater than s, respe tively. The onvergen e is lo ally quadrati but initially is slow for matri es M (0) having singular values σi su h that the values s(0) = σi φi (equal to σi f(σi ) i under (6)) lie near the points s~1 and/or 2 − s~1 . The iteration an be dire ted towards the matrix M<s for any xed smaller positive s ( f. [2, Se tion 7℄). At rst one should hoose appropriate s alars c, d, a0 , a1 , . . . , de ne the initial approximation X0 = cI + dMH , and apply iteration (10). For appropriate s alars ak and suÆ iently large l one yields that (l) (l) 0 6 si < s~1 if σi < s and ~ s1 6 si < 2 − s~1 otherwise. Then one writes X0 ← Xl and shifts to iteration (16). Similar results are obtained in [2℄ for the iteration Xk+1 = (3I − 2Xk M)Xk MXk , k = 0, 1, . . . ,

(17)

su h that Xk+1 M = (3I − 2Xk M)(Xk M) , k = 0, 1, . . . . This iteration is asso iated with the ubi polynomial mapping s ← s2 (3 − 2s), whi h has nonnegative xed points 0, 1/2, and 1. The iteration sends the singular values si of the matrix X0 M towards zero where 0 6 s(0) < 1/2 and towards one where i √ (0) 1/2 < si 6 s~4 = (1 + 3)/2 = 1.366 . . . . The onvergen e to zero and one is lo ally quadrati but initially is slow near the points 1/2 and s~4 . Then again this an be readily extended to the iteration for whi h the iterates Xk onverge to the matrix (M<s )− for a sele ted smaller positive s. Both iterations (16) and (17) are proved to be numeri ally stable in [2℄. Having the matri es M and (M<s )− , one an readily ompute the matri es M<s = M(M<s )− M and M>s = M − M<s . The paper [2℄ also extends iteration (17) to yield onvergen e to the proje tion matri es Ps = M(M<s )− = Udiag(Ir(s) , 0)UH and P(s) = (M<s )− M = V diag(Ir(s) , 0)V H where the integer r(s) is de ned by the threshold value s. In this ase it is suÆ ient to hoose two s alars a and b satisfying 2

a > 0, b > 0, as21 + b = 1/2, aσ21 + b < s~4 = 1.37 . . . ,

to set X0 = aP + bI, and to apply the iteration Xk+1 = (I − 2(Xk − I))X2k , k = 0, 1, . . . .

The iteration onverges to the matri es Ps and P(s) for P = MMH and P = MH M, respe tively.

376

7

Vi tor Y. Pan

Divide-and-conquer algorithms and computing singular subspaces

Let us brie y omment on some appli ations of the splitting te hniques from the previous se tion. Clearly, the SVD and GSVD omputation for a matrix M an be redu ed to the omputation of the SVDs or GSVDs of the pair of matri es M<s and M>s , and this an be re ursively extended. Similar divide-and- onquer pro ess an be applied to omputing the generalized inverse M− = (M<s )− + (M>s )− and the solutions M− f = (M<s )− f + (M>s )− f of linear systems Mx = f . This pro ess an rely on omputing the pairs of matri es (M<s )− and (M>s )− or M<s and M>s . In the latter ase the ve tors (M<s )− f and (M>s )− f an be omputed as the least-squares solutions of the linear systems M<s x<s = f and M>s x>s = f , respe tively, that have the minimum Eu lidean norms (see [3, Se tion 5.5℄ on the respe tive theory and algorithms). The omputations with the matri es M<s and M>s are simpler than with the matrix M be ause ondM = ( ondM<s ) ondM>s and rank(M) = rank(M<s ) + rank(M>s ). Splitting is more ee tive where the treshold value s balan es it, that is where the ratios ( ondM<s )/ ondM>s and/or rank(M<s )/ rank(M>s ) are lose to one. In a sample appli ation of splitting SVD or GSVD to omputing the deterH minant det M, we an ompute some unitary matrix bases UH <s , V<s , U>s , and V>s , respe tively, for the left and right null spa es of the matri es M<s and M>s , respe tively, su h that H U<s ~ 1, U ~ 2 )Σdiag(V~1 , V~2 ) = diag(M ~ 1, M ~ 2 ). M(V<s , V>s ) = diag(U UH >s

~ 1, U ~ 2 ) and (V~1 , V~2 ) are unitary matri es, whereas the diagonal blo ks Here (U ~ i , V~i, and M ~ i are ni × ni matri es for i = 1, 2; n1 = r(s) and n2 = n − n1 . U Therefore ~ 1 )(det M ~ 2 )/((det(U<s , U>s )(det(V<s , V>s )), det M = (det M where the matri es (U<s , U>s ) and (V<s , V>s ) are unitary and the matri es M1 and M2 have both sizes and ondition numbers de reased versus the matrix M. ~ 1 )( ondM ~ 2 ). Indeed ondM = ( ondM We an apply the same te hniques wherever we an zero some singular values of the input matrix M and preserve the respe tive singular subspa es of the matrix M. We only need to ompute the respe tive matrix bases for the left and right null spa es of the resulting matrix and for their omplements ( f. the next se tion), whi h are the respe tive singular subspa es of the input matrix. The desired suppression of some small singular values an be a hieved in smaller numbers of steps (16) or (17), as soon as the respe tive singular values nearly vanish, whereas the other singular values remain bounded and separated from zero and do not ne essarily move lose to one.

Newton's Iteration for Matrix Inversion, Advan es and Extensions

377

Clearly, the approa h an be applied to omputing any polynomial or rational fun tion in a Hermitian matrix M having a GSVD M = UH ΣU. Consequently it

an be extended to polynomial and rational approximation of irrational fun tions in Hermitian matri es.

8

Computation of null matrix bases

Computation of null matrix bases an rely on fa torizations of input matri es (say, on their QR or PLUP* fa torizations) but also on the re ent alternatives in [40℄, [41℄. Here is a relevant basi result from [40℄, [41℄, whi h for simpli ity we state only for square input matri es M and for the right null spa es. (Re all that the left null spa e of a matrix M is the right null spa e of its Hermitian transpose MH .) Theorem 8. U and V of

Assume an n × n matrix M of a rank ρ, a pair of two matri es sizes n × r, and the nonsingular matrix C = M + UV H . Then r > rank(U) > n − ρ,

(18)

N(M) = range(C+ UY)

(19)

provided Y is a matrix basis for the null spa e then we have

N(MC+ U).

Furthermore if

r = rank(U) = n − ρ,

(20)

N(M) = range(C+ U),

(21)

V H C+ U = Ir .

(22)

One an hoose the matri es U and V at random based on the following simple results from [42℄. Theorem 9. For a nite set ∆ of ardinality |∆| in a ring R and four matri es M ∈ Rn×n of a rank ρ, U and V in ∆r×n , and C = M + UV T , we

have

a) rank(C) 6 r + ρ, b) rank(C) = n with a probability of at least 1 −

if r + ρ > n and either the entries of both matri es U and V have been randomly sampled from the set ∆ or U = V and the entries of the matrix U have been randomly sampled from this set, r

) rank(C) = n with a probability of at least 1 − |∆| if r + ρ > n, the matrix U (respe tively V ) has full rank r, and the entries of the matrix V (respe tively U) have been randomly sampled from the set ∆. 2r |∆|

378

Vi tor Y. Pan

With weakly random generation of the matri es U and V (allowing to endow them with the desired patterns of stru ture and sparseness) our null spa e

omputations are expe ted to be numeri ally stable a

ording to the theoreti al and experimental study in [42, Se tions 4, 6, and 8℄. In parti ular this study shows that under the assumptions of the previous theorem and under weakly random hoi e of sparse and stru tured matri es U and V of a rank r > n − ρ, the ratio ( ond2 C)/ ond2 M is likely to be neither large nor small provided the matri es M, U, and V are s aled so that the ratio kMk2/kUV H k2 is neither large nor small. We refer the reader to [40℄{[44℄ on su h a randomized additive prepro essing M → M + UV H and its appli ations to some fundamental matrix

omputations (su h as eigen-solving and linear system solving).

9

Discussion

1. One an enhan e the power of our te hniques in various ways by ombining them with the available software and hardware. E.g., the iterations of Se tions 6 and 7 an be applied on urrently to a number of initial approximations X0 , thus produ ing matri es M<s(j) for a number of treshold values s(j), j = 1, 2, . . . . The matrix M<s − M s is obtained by zeroing all singular values of the matrix M that are less than s or not less than t. Thus we an employ the power of parallel pro essing to a

elerate splitting the given omputational problem into subproblems having smaller sizes and ondition numbers. 2. Our divide-and- onquer pro esses employ various algorithms for omputing null spa e bases numeri ally, that is some bases for the spa es of singular ve tors asso iated with the smallest singular values. It is a hallenge to re ne these algorithms, parti ularly where these values form lusters not learly separated from ea h other. 3. Another hallenge is to de ne new polynomial mappings that would modify iterations (16) and (17) to a

elerate the onvergen e of some singular values to zero while keeping suÆ iently many of them away from zero. 4. Many of the te hniques in this paper an be applied to the omputation of various other matrix fun tions besides the inverse, e.g., the square roots, matrix sign fun tion, and the solution of Ri

ati's equation ( f. [45℄ and the bibliography therein). Newton's iteration is fundamental for su h tasks. Wherever the output matrix is stru tured, our te hniques of re ursive ompression in Se tion 4 an support a

eleration, at least lo ally. Our te hniques in Se tion 5 for ontinuation via re ursive fa torization an treat the paramount problem of initialization, in both ases of general and stru tured input matri es. Finally the te hniques in Se tions 6{8 for splitting GSVDs and omputing bases for the respe tive singular subspa es an be readily extended from the ase of inversion to omputing the

Newton's Iteration for Matrix Inversion, Advan es and Extensions

379

square roots and other fun tions in Hermitian matri es. A natural hallenge is the transition from GSVDs to eigen-de ompositions, whi h would involve nonHermitian input matri es.

References

1. T. Söderström, G. W. Stewart, On the Numerical Properties of an Iterative Method for Computing the Moore–Penrose Generalized Inverse, SIAM Journal on Numerical Analysis, 11, 61–74, 1974.
2. V. Y. Pan, R. Schreiber, An Improved Newton Iteration for the Generalized Inverse of a Matrix, with Applications, SIAM Journal on Scientific and Statistical Computing, 12, 5, 1109–1131, 1991.
3. G. H. Golub, C. F. Van Loan, Matrix Computations, 3rd edition, The Johns Hopkins University Press, Baltimore, Maryland, 1996.
4. G. W. Stewart, Matrix Algorithms, Vol. I: Basic Decompositions, SIAM, Philadelphia, 1998.
5. G. W. Stewart, Matrix Algorithms, Vol. II: Eigensystems, SIAM, Philadelphia, 1998 (first edition), 2001 (second edition).
6. J. W. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, 1997.
7. L. N. Trefethen, D. Bau, III, Numerical Linear Algebra, SIAM, Philadelphia, 1997.
8. N. J. Higham, Accuracy and Stability of Numerical Algorithms, second edition, SIAM, Philadelphia, 2002.
9. H. Hotelling, Analysis of a Complex of Statistical Variables into Principal Components, J. of Educational Psychology, 24, 417–441, 498–520, 1933.
10. G. Schulz, Iterative Berechnung der reziproken Matrix, Z. Angew. Math. Mech., 13, 57–59, 1933.
11. A. Ben-Israel, A Note on an Iterative Method for Generalized Inversion of Matrices, Mathematics of Computation, 20, 439–440, 1966.
12. A. Ben-Israel, D. Cohen, On Iterative Computation of Generalized Inverses and Associated Projections, SIAM Journal on Numerical Analysis, 3, 410–419, 1966.
13. R. Schreiber, Computing Generalized Inverses and Eigenvalues of Symmetric Matrices Using Systolic Arrays, in Computing Methods in Applied Science and Engineering (edited by R. Glowinski and J.-L. Lions), North-Holland, Amsterdam, 1984.
14. V. Y. Pan, J. Reif, Fast and Efficient Parallel Solution of Dense Linear Systems, Computers and Math. (with Applications), 17, 11, 1481–1491, 1989.
15. G. Codevico, V. Y. Pan, M. Van Barel, Newton-like Iteration Based on Cubic Polynomials for Structured Matrices, Numerical Algorithms, 36, 365–380, 2004.
16. I. V. Oseledets, E. E. Tyrtyshnikov, Approximate Inversion of Matrices in the Process of Solving a Hypersingular Integral Equation, Computational Math. and Math. Physics, 45, 2, 302–313, 2005 (translated from JVM i MF, 45, 2, 315–326, 2005).
17. V. Olshevsky, I. V. Oseledets, E. E. Tyrtyshnikov, Tensor Properties of Multilevel Toeplitz and Related Matrices, Linear Algebra and Its Applications, 412, 1–21, 2006.
18. V. Olshevsky, I. V. Oseledets, E. E. Tyrtyshnikov, Superfast Inversion of Two-Level Toeplitz Matrices Using Newton Iteration and Tensor-Displacement Structure, Operator Theory: Advances and Applications, 179, 229–240, 2008.

Victor Y. Pan

19. T. Kailath, S.-Y. Kung, M. Morf, Displacement Ranks of Matrices and Linear Equations, J. Math. Anal. Appl., 68(2), 395–407, 1979.
20. I. Gohberg, V. Olshevsky, Complexity of Multiplication with Vectors for Structured Matrices, Linear Algebra and Its Applications, 202, 163–192, 1994.
21. V. Y. Pan, Structured Matrices and Polynomials: Unified Superfast Algorithms, Birkhäuser/Springer, Boston/New York, 2001.
22. V. Y. Pan, X. Wang, Inversion of Displacement Operators, SIAM Journal on Matrix Analysis and Applications, 24, 3, 660–677, 2003.
23. V. Y. Pan, Parallel Solution of Toeplitz-like Linear Systems, J. of Complexity, 8, 1–21, 1992.
24. V. Y. Pan, Concurrent Iterative Algorithm for Toeplitz-like Linear Systems, IEEE Transactions on Parallel and Distributed Systems, 4, 5, 592–600, 1993.
25. V. Y. Pan, Decreasing the Displacement Rank of a Matrix, SIAM Journal on Matrix Analysis and Applications, 14, 1, 118–121, 1993.
26. V. Y. Pan, S. Branham, R. Rosholt, A. Zheng, Newton's Iteration for Structured Matrices and Linear Systems of Equations, in Fast Reliable Algorithms for Matrices with Structure (edited by T. Kailath and A. H. Sayed), 189–210, SIAM Publications, Philadelphia, 1999.
27. V. Y. Pan, Y. Rami, Newton's Iteration for the Inversion of Structured Matrices, in Structured Matrices: Recent Developments in Theory and Computation (edited by D. Bini, E. Tyrtyshnikov and P. Yalamov), 79–90, Nova Science Publishers, USA, 2001.
28. D. A. Bini, B. Meini, Approximate Displacement Rank and Applications, in AMS Conference "Structured Matrices in Operator Theory, Control, Signal and Image Processing", Boulder, 1999 (edited by V. Olshevsky), American Math. Society, 215–232, Providence, RI, 2001.
29. V. Y. Pan, Y. Rami, X. Wang, Structured Matrices and Newton's Iteration: Unified Approach, Linear Algebra and Its Applications, 343–344, 233–265, 2002.
30. V. Y. Pan, M. Van Barel, X. Wang, G. Codevico, Iterative Inversion of Structured Matrices, Theoretical Computer Science, 315, 2–3 (Special Issue on Algebraic and Numerical Computing), 581–592, 2004.
31. R. Vandebril, M. Van Barel, G. Golub, N. Mastronardi, A Bibliography on Semiseparable Matrices, Calcolo, 42, 3–4, 249–270, 2005.
32. D. Bini, V. Y. Pan, Improved Parallel Computations with Toeplitz-like and Hankel-like Matrices, Linear Algebra and Its Applications, 188/189, 3–29, 1993.
33. D. Bini, V. Y. Pan, Polynomial and Matrix Computations, Vol. 1: Fundamental Algorithms, Birkhäuser, Boston, 1994.
34. V. Y. Pan, A. Zheng, X. Huang, O. Dias, Newton's Iteration for Inversion of Cauchy-like and Other Structured Matrices, Journal of Complexity, 13, 108–124, 1997.
35. V. Y. Pan, A Homotopic Residual Correction Process, Proc. of the Second Conference on Numerical Analysis and Applications (edited by L. Vulkov, J. Wasniewski and P. Yalamov), Lecture Notes in Computer Science, 1988, 644–649, Springer, Berlin, 2001.
36. V. Y. Pan, M. Kunin, R. Rosholt, H. Kodal, Homotopic Residual Correction Processes, Math. of Computation, 75, 345–368, 2006.
37. V. Y. Pan, New Homotopic/Factorization and Symmetrization Techniques for Newton's and Newton/Structured Iteration, Computers and Math. with Applications, 54, 721–729, 2007.

Newton's Iteration for Matrix Inversion, Advances and Extensions

38. V. Y. Pan, Computations with Dense Structured Matrices, Math. of Comp., 55(191), 179–190, 1990. Proc. version in Proc. Annual ACM-SIGSAM International Symposium on Symbolic and Algebraic Computation (ISSAC '89), 34–42, ACM Press, New York, 1989.
39. I. Gohberg, T. Kailath, V. Olshevsky, Fast Gaussian Elimination with Partial Pivoting for Matrices with Displacement Structure, Math. of Comp., 64(212), 1557–1576, 1995.
40. V. Y. Pan, Computations in the Null Spaces with Additive Preconditioning, Technical Report TR 2007009, CUNY Ph.D. Program in Computer Science, Graduate Center, City University of New York, April 2007. Available at http://www.cs.gc.cuny.edu/tr/techreport.php?id=352
41. V. Y. Pan, G. Qian, Solving Homogeneous Linear Systems with Weakly Random Additive Preprocessing, Technical Report TR 2008009, CUNY Ph.D. Program in Computer Science, Graduate Center, the City University of New York, 2008. Available at http://www.cs.gc.cuny.edu/tr/techreport.php?id=352
42. V. Y. Pan, D. Ivolgin, B. Murphy, R. E. Rosholt, Y. Tang, X. Yan, Additive Preconditioning for Matrix Computations, Tech. Report TR 2008004, Ph.D. Program in Computer Science, Graduate Center, the City University of New York, 2008. Available at http://www.cs.gc.cuny.edu/tr/techreport.php?id=352. Proceedings version in Proc. of the Third International Computer Science Symposium in Russia (CSR 2008), Lecture Notes in Computer Science (LNCS), 5010, 372–383, 2008.
43. V. Y. Pan, X. Yan, Additive Preconditioning, Eigenspaces, and the Inverse Iteration, Linear Algebra and Its Applications, in press.
44. V. Y. Pan, D. Grady, B. Murphy, G. Qian, R. E. Rosholt, A. Ruslanov, Schur Aggregation for Linear Systems and Determinants, Theoretical Computer Science, Special Issue on Symbolic–Numerical Algorithms (D. A. Bini, V. Y. Pan, and J. Verschelde, editors), in press.
45. N. J. Higham, Functions of Matrices: Theory and Computation, SIAM, Philadelphia, 2008.

Truncated decompositions and filtering methods with Reflective/Anti-Reflective boundary conditions: a comparison

C. Tablino Possio⋆

Dipartimento di Matematica e Applicazioni, Università di Milano Bicocca, via Cozzi 53, 20125 Milano, Italy
cristina.tablinopossio@unimib.it

Abstract. The paper analyzes and compares some spectral filtering methods, such as truncated singular/eigen-value decompositions and Tikhonov/Re-blurring regularizations, in the case of the recently proposed Reflective [18] and Anti-Reflective [21] boundary conditions. We give numerical evidence to the fact that spectral decompositions (SDs) provide a good image restoration quality, and this is true in particular for the Anti-Reflective SD, despite the loss of orthogonality in the associated transform. The related computational cost is comparable with that of previously known spectral decompositions, and is substantially lower than that of the singular value decomposition. The model extension to the cross-channel blurring phenomenon of color images is also considered, and the related spectral filtering methods are suitably adapted.

Keywords: filtering methods, spectral decompositions, boundary conditions.

1 Introduction

In this paper we deal with the classical image restoration problem of blurred and noisy images in the case of a space invariant blurring. Under such an assumption the image formation process is modelled according to the following integral equation with space invariant kernel

g(x) = ∫_{R^2} h(x − x̃) f(x̃) dx̃ + η(x),   x ∈ R^2,    (1)

where f denotes the true physical object to be restored, g is the recorded blurred and noisy image, and η takes into account unknown errors in the collected data, e.g. measurement errors and noise.

⋆ The work of the author was partially supported by MIUR 2006017542.

As customary, we consider the discretization of (1) by means of a standard


2D generalization of the rectangle quadrature formula on an equispaced grid, ordered row-wise from the top-left corner to the bottom-right one. Hence, we obtain the relations

g_i = Σ_{j∈Z^2} h_{i−j} f_j + η_i,   i ∈ Z^2,    (2)

in which an infinite shift-invariant matrix Ã_∞ = [h_{i−j}]_{(i,j)=((i1,i2),(j1,j2))}, i.e., a two-level Toeplitz matrix, is involved. In principle, (2) presents an infinite summation, since the true image scene does not have a finite boundary. Nevertheless, the data g_i are clearly collected only at a finite number of values, representing only a finite region of such an infinite scene. In addition, the blurring operator typically shows a finite support, so that it is completely described by a Point Spread Function (PSF) mask such as

h_PSF = [h_{i1,i2}]_{i1=−q1,…,q1, i2=−q2,…,q2},    (3)

where h_{i1,i2} > 0 for any i1, i2 and Σ_{i=−q}^{q} h_i = 1, i = (i1, i2), q = (q1, q2) (normalization according to a suitable conservation law). Therefore, relations (2) imply

g_i = Σ_{s=−q}^{q} h_s f_{i−s} + η_i,   i1 = 1, …, n1, i2 = 1, …, n2,    (4)

where the range of collected data defines the so-called Field of View (FOV). Once again, we are assuming that all the involved data in (5), similarly to (2), are reshaped in a row-wise ordering. In such a way we obtain the linear system

Ã f̃ = g − η    (5)

where Ã ∈ R^{N(n)×N(n+2q)} is a finite principal sub-matrix of Ã_∞ with main diagonal containing h_{0,0}, f̃ ∈ R^{N(n+2q)}, g, η ∈ R^{N(n)}, and with N(m) = m1 m2 for any two-index m = (m1, m2). Such a reshape is considered just to perform the theoretical analysis, since all the deblurring/denoising methods are able to deal directly with data in matrix form. For instance, it is evident that the blurring process in (4) consists in a discrete convolution between the PSF mask, after a rotation of 180°, and the proper true image data in

F̃ = [f_{i1,i2}]_{i1=−q1+1,…,n1+q1, i2=−q2+1,…,n2+q2}.

Hereafter, with a two-index notation, we denote by F = [f_{i1,i2}]_{i1=1,…,n1, i2=1,…,n2} the true image inside the FOV and by G = [g_{i1,i2}]_{i1=1,…,n1, i2=1,…,n2} the recorded image. Thus, assuming the knowledge of the PSF mask in (3) and of some statistical properties of η, the deblurring problem is defined as to restore, as best as possible, the true image F on the basis of the recorded image G. As evident


from (4), the problem is undetermined, since the number of unknowns involved in the convolution exceeds the number of recorded data. Boundary conditions (BCs) are introduced to artificially describe the scene outside the FOV: the values of the unknowns outside the FOV are fixed or are defined as linear combinations of the unknowns inside the FOV, the target being to reduce (5) to a square linear system

A_n f = g − η    (6)

with A_n ∈ R^{N(n)×N(n)}, n = (n1, n2), N(n) = n1 n2 and f, g, η ∈ R^{N(n)}. The choice of the BCs does not affect the global spectral behavior of the matrix. However, it may have a valuable impact both with respect to the accuracy of the restored image and to the computational costs for recovering f from the blurred datum, with or without noise. Notice also that, typically, the matrix A is very ill-conditioned and there is a significant intersection between the subspace related to small eigen/singular values and the high frequency subspace. Such a feature requires the use of suitable regularization methods that allow to properly restore the image F with controlled noise levels [12-14, 24], among which we can cite the truncated SVD, Tikhonov, and total variation [12, 14, 24]. Hereafter, we focus our attention on the special case of PSFs satisfying a strong symmetry property, i.e., such that

h_{|i|} = h_i   for any i = −q, …, q.    (7)

This assumption is fulfilled in the majority of models in real optical applications. For instance, in most 2D astronomical imaging with optical lenses [5] the model of the PSF is circularly symmetric, and hence strongly symmetric; in the multi-image deconvolution of some recent interferometric telescopes, the PSF is strongly symmetric too [6]. Moreover, in real applications when the PSF is obtained by measurements (like a guide star in astronomy), the influence of noise leads to a numerically nonsymmetric PSF, even when the kernel of the PSF is strongly (or centro) symmetric. In such a case, by employing a symmetrized version of the measured PSF, comparable restorations are observed [15, 1].

The paper is organized as follows. In Section 2 we focus on two recently proposed BCs, i.e., the Reflective [18] and Anti-Reflective BCs [21], and their relevant properties. Section 3 summarizes some classical filtering techniques, such as the truncated singular/eigen-value decomposition and the Tikhonov method. The Re-blurring method [11, 9] is considered in the case of Anti-Reflective BCs, and its re-interpretation in the framework of the classical Tikhonov regularization is given. In Section 4 the model is generalized to take into account the cross-channel blurring phenomenon, and the previous filtering methods are suitably adapted. Lastly, Section 5 deals with some computational issues and reports several numerical tests, the aim being to compare the quoted filtering methods and the two types of BCs, both in the case of gray-scale and color images. In


Section 6 some conclusions and remarks end the paper.
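Before moving on, the discrete model (4) can be made concrete with a short sketch: given a normalized PSF mask and a true image already extended by q pixels outside the FOV, the recorded image is the discrete convolution plus noise. The helper name `blur_fov` and its arguments are illustrative, not from the paper.

```python
import numpy as np

def blur_fov(F_ext, h, noise_level=0.0, rng=None):
    """Space-invariant blur of Eq. (4): g_i = sum_{s=-q}^{q} h_s f_{i-s} + eta_i.

    F_ext is the true image extended by q = (q1, q2) samples on every side
    of the FOV; h is the (2*q1+1) x (2*q2+1) PSF mask.  Returns the
    n1 x n2 recorded image G inside the FOV.
    """
    q1, q2 = h.shape[0] // 2, h.shape[1] // 2
    n1, n2 = F_ext.shape[0] - 2 * q1, F_ext.shape[1] - 2 * q2
    G = np.zeros((n1, n2))
    for s1 in range(-q1, q1 + 1):
        for s2 in range(-q2, q2 + 1):
            # f_{i-s} for i inside the FOV: a shifted window of F_ext
            G += h[q1 + s1, q2 + s2] * F_ext[q1 - s1 : q1 - s1 + n1,
                                             q2 - s2 : q2 - s2 + n2]
    if noise_level > 0.0:
        rng = rng or np.random.default_rng(0)
        G += noise_level * rng.standard_normal(G.shape)  # the term eta
    return G
```

Because the mask sums to 1, a constant scene is reproduced exactly; the double loop is only for transparency, since in practice the convolution is carried out by fast transforms once BCs fix the scene outside the FOV.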

2 Boundary conditions

In this section we summarize the relevant properties of two recently proposed types of BCs, i.e., the Reflective [18] and Anti-Reflective BCs [21]. Special attention is given to the structural and spectral properties of the arising matrices. In fact, though the choice of the BCs does not affect the global spectral behavior of the matrix A, it can have a valuable impact with respect both to the accuracy of the restoration (especially close to the boundaries, where ringing effects can appear) and to the computational costs for recovering the image from the blurred one, with or without noise. Moreover, taking into account the scale of the problem, the regularization methods analysis can be greatly simplified whenever a spectral (or singular value) decomposition of A is easily available. This means that the target is to obtain the best possible approximation properties, keeping unaltered the fact that the arising matrix shows an exploitable structure. For instance, the use of periodic BCs enforces a circulant structure, so that the spectral decomposition can be computed efficiently with the fast Fourier transform (FFT) [8]. Despite these computational facilities, they give rise to significant ringing effects when a significant discontinuity is introduced into the image. Hereafter, we focus on two recently proposed boundary conditions that more carefully describe the scene outside the FOV. Clearly, several other methods deal with this topic in the image processing literature, e.g. local mean value [22] or extrapolation techniques (see [17] and references therein). Nevertheless, the penalty of their good approximation properties could lie in a linear algebra problem more difficult to cope with.

2.1 Reflective boundary conditions

In [18] Ng et al. analyze the use of Reflective BCs, both from the model and the linear algebra point of view. The improvement with respect to Periodic BCs is due to the preservation of the continuity of the image. In fact, the scene outside the FOV is assumed to be a reflection of the scene inside the FOV. For example, with a boundary at x1 = 0 and x2 = 0 the reflective condition is given by f(±x1, ±x2) = f(x1, x2). More precisely, along the borders, the BCs impose

f_{i1,1−i2} = f_{i1,i2},   f_{i1,n2+i2} = f_{i1,n2+1−i2},   for any i1 = 1, …, n1, i2 = 1, …, q2,
f_{1−i1,i2} = f_{i1,i2},   f_{n1+i1,i2} = f_{n1+1−i1,i2},   for any i1 = 1, …, q1, i2 = 1, …, n2,


and, at the corners, the BCs impose, for any i1 = 1, …, q1, i2 = 1, …, q2,

f_{1−i1,1−i2} = f_{i1,i2},   f_{1−i1,n2+i2} = f_{i1,n2+1−i2},
f_{n1+i1,n2+i2} = f_{n1+1−i1,n2+1−i2},   f_{n1+i1,1−i2} = f_{n1+1−i1,i2},

i.e., a double reflection, first with respect to one axis and then with respect to the other, no matter in which order.

As a consequence the rectangular matrix Ã is reduced to a square Toeplitz-plus-Hankel block matrix with Toeplitz-plus-Hankel blocks, i.e., A_n shows the two-level Toeplitz-plus-Hankel structure. Moreover, if the blurring operator satisfies the strong symmetry condition (7) then the matrix A_n belongs to the DCT-III matrix algebra. Therefore, its spectral decomposition can be computed very efficiently using the fast discrete cosine transform (DCT-III) [23]. More in detail, let

C_n = {A_n ∈ R^{N(n)×N(n)}, n = (n1, n2), N(n) = n1 n2 | A_n = R_n Λ_n R_n^T}

be the two-level DCT-III matrix algebra, i.e., the algebra of matrices that are simultaneously diagonalized by the orthogonal transform

R_n = R_{n1} ⊗ R_{n2},   R_m = [ sqrt((2 − δ_{t,1})/m) cos((s − 1)(t − 1/2)π/m) ]_{s,t=1}^{m},    (8)

with δ_{s,t} denoting the Kronecker symbol. Thus, the explicit structure of the matrix is

A_n = Toeplitz(V) + Hankel(σ(V), Jσ(V)),

with V = [V_0 V_1 … V_{q1} 0 … 0], and where each V_{i1}, i1 = 1, …, q1, is the unilevel DCT-III matrix associated with the i1-th row of the PSF mask, i.e., V_{i1} = Toeplitz(v_{i1}) + Hankel(σ(v_{i1}), Jσ(v_{i1})), with v_{i1} = [h_{i1,0}, …, h_{i1,q2}, 0, …, 0]. Here, we denote by σ the shift operator such that σ(v_{i1}) = [h_{i1,1}, …, h_{i1,q2}, 0, …, 0] and by J the usual flip matrix; at the block level the same operations are intended in a block-wise sense.

Besides this structural characterization, the spectral description is completely known. In fact, let f be the bivariate generating function associated with the PSF mask (3), that is

f(x1, x2) = h_{0,0} + 2 Σ_{s1=1}^{q1} h_{s1,0} cos(s1 x1) + 2 Σ_{s2=1}^{q2} h_{0,s2} cos(s2 x2)
          + 4 Σ_{s1=1}^{q1} Σ_{s2=1}^{q2} h_{s1,s2} cos(s1 x1) cos(s2 x2);    (9)

then the eigenvalues of the corresponding matrix A_n ∈ C_n are given by

λ_s(A_n) = f(x_{s1}^{[n1]}, x_{s2}^{[n2]}),   s = (s1, s2),   x_r^{[m]} = (r − 1)π/m,

where s1 = 1, …, n1, s2 = 1, …, n2, and where the two-index notation highlights the tensorial structure of the corresponding eigenvectors.
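The eigenvalue formula above lends itself to a direct numerical check; below is a minimal NumPy sketch (the function name `reflective_eigenvalues` is ours, not from [18]) that evaluates the generating function (9) of a strongly symmetric PSF mask on the cosine grid x_r = (r − 1)π/m.

```python
import numpy as np

def reflective_eigenvalues(h, n1, n2):
    """Eigenvalues of the DCT-III (Reflective-BC) blurring matrix for a
    strongly symmetric PSF mask h of shape (2*q1+1, 2*q2+1), obtained by
    sampling the generating function f of Eq. (9) at x_r = (r-1)*pi/m."""
    q1, q2 = h.shape[0] // 2, h.shape[1] // 2
    x1 = np.arange(n1) * np.pi / n1          # (r-1)*pi/n1, r = 1..n1
    x2 = np.arange(n2) * np.pi / n2
    lam = np.full((n1, n2), h[q1, q2])       # the h_{0,0} term
    for s1 in range(1, q1 + 1):
        lam += 2 * h[q1 + s1, q2] * np.cos(s1 * x1)[:, None]
    for s2 in range(1, q2 + 1):
        lam += 2 * h[q1, q2 + s2] * np.cos(s2 * x2)[None, :]
    for s1 in range(1, q1 + 1):
        for s2 in range(1, q2 + 1):
            lam += 4 * h[q1 + s1, q2 + s2] * np.outer(np.cos(s1 * x1),
                                                      np.cos(s2 * x2))
    return lam
```

For a normalized mask the eigenvalue at s = (1, 1) (i.e., at x = (0, 0)) equals the mask sum, namely 1, in agreement with the conservation law assumed in (3).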


Lastly, notice that standard operations like matrix-vector products, solution of linear systems and eigenvalue evaluations can be performed by means of the FCT-III [18] within O(n1 n2 log(n1 n2)) arithmetic operations (ops). For example, by multiplying both sides of R_n^T A_n = Λ_n R_n^T by e_1 = [1, 0, …, 0]^T, it holds that

[Λ_n]_{(i1,i2)} = [R_n^T (A_n e_1)]_{(i1,i2)} / [R_n^T e_1]_{(i1,i2)},   i1 = 1, …, n1, i2 = 1, …, n2,

i.e., it is enough to consider an inverse FCT-III applied to the first column of A_n, with a computational cost of O(n1 n2 log(n1 n2)) ops.

2.2 Anti-reflective boundary conditions

More recently, Anti-Reflective boundary conditions (AR-BCs) have been proposed in [21] and studied in [2-4, 9, 10, 19]. The improvement is due to the fact that not only the continuity of the image, but also that of its normal derivative, is guaranteed at the boundary. This regularity, which is not shared with Dirichlet or periodic BCs, and is only partially shared with reflective BCs, significantly reduces typical ringing artifacts. The key idea is simply to assume that the scene outside the FOV is the anti-reflection of the scene inside the FOV. For example, with a boundary at x1 = 0, the anti-reflective condition imposes f(−x1, x2) − f(x1*, x2) = −(f(x1, x2) − f(x1*, x2)) for any x2, where x1* is the center of the one-dimensional anti-reflection, i.e., f(−x1, x2) = 2f(x1*, x2) − f(x1, x2) for any x2. In order to preserve a tensorial structure, at the corners a double anti-reflection, first with respect to one axis and then with respect to the other, is considered, so that the BCs impose

f(−x1, −x2) = 4f(x1*, x2*) − 2f(x1*, x2) − 2f(x1, x2*) + f(x1, x2),

where (x1*, x2*) is the center of the two-dimensional anti-reflection. More precisely, by choosing as center of the anti-reflection the first available data, along the borders the BCs impose

f_{1−i1,i2} = 2f_{1,i2} − f_{i1+1,i2},   f_{n1+i1,i2} = 2f_{n1,i2} − f_{n1−i1,i2},   i1 = 1, …, q1, i2 = 1, …, n2,
f_{i1,1−i2} = 2f_{i1,1} − f_{i1,i2+1},   f_{i1,n2+i2} = 2f_{i1,n2} − f_{i1,n2−i2},   i1 = 1, …, n1, i2 = 1, …, q2.

At the corners, the BCs impose, for any i1 = 1, …, q1 and i2 = 1, …, q2,

f_{1−i1,1−i2} = 4f_{1,1} − 2f_{1,i2+1} − 2f_{i1+1,1} + f_{i1+1,i2+1},
f_{1−i1,n2+i2} = 4f_{1,n2} − 2f_{1,n2−i2} − 2f_{i1+1,n2} + f_{i1+1,n2−i2},
f_{n1+i1,1−i2} = 4f_{n1,1} − 2f_{n1,i2+1} − 2f_{n1−i1,1} + f_{n1−i1,i2+1},
f_{n1+i1,n2+i2} = 4f_{n1,n2} − 2f_{n1,n2−i2} − 2f_{n1−i1,n2} + f_{n1−i1,n2−i2}.
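These border and corner rules amount to a simple padding scheme. The sketch below (illustrative name `pad_antireflective`, not from [21]) performs the one-dimensional anti-reflection row-wise and then column-wise, so the corners automatically receive the double anti-reflection stated above.

```python
import numpy as np

def pad_antireflective(F, q1, q2):
    """Extend an n1 x n2 image F outside the FOV by anti-reflection,
    with the anti-reflection centers at the first/last available samples.
    Row extension first, then column extension on the enlarged array,
    which reproduces the stated double anti-reflection at the corners."""
    n1, n2 = F.shape
    E = np.zeros((n1 + 2 * q1, n2 + 2 * q2))
    E[q1:q1 + n1, q2:q2 + n2] = F
    for i in range(1, q1 + 1):                       # top/bottom borders
        E[q1 - i, q2:q2 + n2] = 2 * F[0, :] - F[i, :]
        E[q1 + n1 - 1 + i, q2:q2 + n2] = 2 * F[-1, :] - F[-1 - i, :]
    for j in range(1, q2 + 1):                       # left/right + corners
        E[:, q2 - j] = 2 * E[:, q2] - E[:, q2 + j]
        E[:, q2 + n2 - 1 + j] = 2 * E[:, q2 + n2 - 1] - E[:, q2 + n2 - 1 - j]
    return E
```

Since anti-reflection reproduces linear functions exactly, padding a linear ramp extends it without any kink: this is the continuity-plus-normal-derivative property that reduces ringing.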


As a consequence the rectangular matrix Ã is reduced to a square Toeplitz-plus-Hankel block matrix with Toeplitz-plus-Hankel blocks, plus an additional structured low rank matrix. Moreover, under the assumption of strong symmetry of the PSF and of a mild finite support condition (more precisely, h_i = 0 if |i_j| > n − 2 for some j ∈ {1, 2}), the resulting linear system A_n f = g is such that A_n belongs to the AR2D_n commutative matrix algebra [3]. This new algebra shares some properties with the τ (or DST-I) algebra [7]. Going into the definition, a matrix A_n ∈ AR2D_n has the following block structure

        [ D_0 + Z^[1]           0^T                 0            ]
        [ D_1 + Z^[2]                               0            ]
        [     ...                                  ...           ]
        [ D_{q1−1} + Z^[q1]                         0            ]
  A_n = [ D_{q1}          τ(D_0, …, D_{q1})        D_{q1}        ]
        [ 0                                  D_{q1−1} + Z^[q1]   ]
        [ ...                                      ...           ]
        [ 0                                   D_1 + Z^[2]        ]
        [ 0                      0^T          D_0 + Z^[1]        ]

where τ(D_0, …, D_{q1}) is a block τ matrix with respect to the AR1D blocks D_{i1}, i1 = 1, …, q1, and Z^[k] = 2 Σ_{t=k}^{q1} D_t for k = 1, …, q1. In particular, the AR1D block D_{i1} is associated with the i1-th row of the PSF, i.e., h_{i1}^[1D] = [h_{i1,i2}]_{i2=−q2,…,q2}, and it is defined as

           [ h_{i1,0} + z_{i1}^[1]              0^T                       0                     ]
           [ h_{i1,1} + z_{i1}^[2]                                        0                     ]
           [        ...                                                  ...                    ]
           [ h_{i1,q2−1} + z_{i1}^[q2]                                    0                     ]
  D_{i1} = [ h_{i1,q2}              τ(h_{i1,0}, …, h_{i1,q2})            h_{i1,q2}              ]
           [ 0                                                 h_{i1,q2−1} + z_{i1}^[q2]        ]
           [ ...                                                         ...                    ]
           [ 0                                                  h_{i1,1} + z_{i1}^[2]           ]
           [ 0                                  0^T             h_{i1,0} + z_{i1}^[1]           ]

where z_{i1}^[k] = 2 Σ_{t=k}^{q2} h_{i1,t} for k = 1, …, q2, and τ(h_{i1,0}, …, h_{i1,q2}) is the unilevel τ matrix associated with the one-dimensional PSF h_{i1}^[1D] previously defined. Notice that the rank-1 correction given by the elements z_{i1}^[k] pertains to the contribution of the anti-reflection centers with respect to the vertical borders, while


the low rank correction given by the matrices Z^[k] pertains to the contribution of the anti-reflection centers with respect to the horizontal borders.

It is evident from the above matrix structure that favorable computational properties are guaranteed also by virtue of the τ structure. Therefore, we first recall the relevant properties of the two-level τ algebra [7]. Let

T_n = {A_n ∈ R^{N(n)×N(n)}, n = (n1, n2), N(n) = n1 n2 | A_n = Q_n Λ_n Q_n}

be the two-level τ matrix algebra, i.e., the algebra of matrices that are simultaneously diagonalized by the symmetric orthogonal transform

Q_n = Q_{n1} ⊗ Q_{n2},   Q_m = [ sqrt(2/(m + 1)) sin(stπ/(m + 1)) ]_{s,t=1}^{m}.    (10)

With the same notation as in the DCT-III algebra case, the explicit structure of the matrix is two-level Toeplitz-plus-Hankel. More precisely,

A_n = Toeplitz(V) − Hankel(σ^2(V), Jσ^2(V))

with V = [V_0 V_1 … V_{q1} 0 … 0], where each V_{i1}, i1 = 1, …, q1, is the unilevel τ matrix associated with the i1-th row of the PSF mask, i.e., V_{i1} = Toeplitz(v_{i1}) − Hankel(σ^2(v_{i1}), Jσ^2(v_{i1})) with v_{i1} = [h_{i1,0}, …, h_{i1,q2}, 0, …, 0]. Here, we denote by σ^2 the double shift operator such that σ^2(v_{i1}) = [h_{i1,2}, …, h_{i1,q2}, 0, …, 0]; at the block level the same operations are intended in a block-wise sense. Once more, the spectral characterization is completely known, since for any A_n ∈ T_n the related eigenvalues are given by

λ_s(A_n) = f(x_{s1}^{[n1]}, x_{s2}^{[n2]}),   s = (s1, s2),   x_r^{[m]} = rπ/(m + 1),

where s1 = 1, …, n1, s2 = 1, …, n2, and f is the bivariate generating function associated with the PSF defined in (9). As in the DCT-III case, standard operations like matrix-vector products, solution of linear systems and eigenvalue evaluations can be performed by means of the FST-I within O(n1 n2 log(n1 n2)) ops. For instance, it is enough to consider an FST-I applied to the first column of A_n to obtain the eigenvalues

[Λ_n]_{(i1,i2)} = [Q_n (A_n e_1)]_{(i1,i2)} / [Q_n e_1]_{(i1,i2)},   i1 = 1, …, n1, i2 = 1, …, n2.
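The unilevel τ construction and its DST-I diagonalization can be verified directly. A small sketch follows, assuming q ≤ m − 3 so that the double-shifted Hankel correction fits inside the matrix; the helper names `dst1` and `tau_matrix` are ours.

```python
import numpy as np
from scipy.linalg import toeplitz, hankel

def dst1(m):
    """Symmetric orthogonal DST-I transform Q_m of Eq. (10)."""
    s = np.arange(1, m + 1)
    return np.sqrt(2.0 / (m + 1)) * np.sin(np.outer(s, s) * np.pi / (m + 1))

def tau_matrix(v, m):
    """Unilevel tau matrix of order m for the symbol coefficients
    v = [h_0, ..., h_q]: Toeplitz(v) - Hankel(sigma^2(v), J sigma^2(v)),
    sigma^2 being the double shift operator."""
    col = np.zeros(m); col[:len(v)] = v
    s2 = np.zeros(m); s2[:max(len(v) - 2, 0)] = v[2:]
    return toeplitz(col) - hankel(s2, s2[::-1])
```

The eigenvalues are the samples f(sπ/(m + 1)) of the one-dimensional symbol f(x) = h_0 + 2 Σ_{k=1}^{q} h_k cos(kx), which the test below checks for a tridiagonal and a pentadiagonal example.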

Now, with respect to the AR2D_n matrix algebra, a complete spectral characterization is given in [3, 4]. A really useful fact is the existence of a transform T_n that simultaneously diagonalizes all the matrices belonging to AR2D_n, although the orthogonality property is partially lost.

Theorem 1 ([4]). Any matrix A_n ∈ AR2D_n, n = (n1, n2), can be diagonalized by T_n, i.e.,

A_n = T_n Λ_n T̃_n,   T̃_n = T_n^{−1},


where T_n = T_{n1} ⊗ T_{n2}, T̃_n = T̃_{n1} ⊗ T̃_{n2}, with

        [ α_m^{−1}        0^T        0           ]                 [  α_m           0^T        0            ]
  T_m = [ α_m^{−1} p    Q_{m−2}    α_m^{−1} Jp   ]   and   T̃_m = [ −Q_{m−2} p    Q_{m−2}   −Q_{m−2} Jp    ].
        [ 0               0^T       α_m^{−1}     ]                 [  0             0^T        α_m          ]

The entries of the vector p ∈ R^{m−2} are defined as p_j = 1 − j/(m − 1), j = 1, …, m − 2, J ∈ R^{(m−2)×(m−2)} is the flip matrix, and α_m is a normalizing factor chosen so that the Euclidean norms of the first and last columns of T_m equal 1.

Theorem 2 ([3]). Let A_n ∈ AR2D_n, n = (n1, n2), be the matrix related to the PSF h_PSF = [h_{i1,i2}]_{i1=−q1,…,q1, i2=−q2,…,q2}. Then the eigenvalues of A_n are given by

– 1, with algebraic multiplicity 4;
– the n2 − 2 eigenvalues of the unilevel τ matrix related to the one-dimensional PSF h^{r} = [Σ_{i1=−q1}^{q1} h_{i1,−q2}, …, Σ_{i1=−q1}^{q1} h_{i1,q2}], each one with algebraic multiplicity 2;
– the n1 − 2 eigenvalues of the unilevel τ matrix related to the one-dimensional PSF h^{c} = [Σ_{i2=−q2}^{q2} h_{−q1,i2}, …, Σ_{i2=−q2}^{q2} h_{q1,i2}], each one with algebraic multiplicity 2;
– the (n1 − 2)(n2 − 2) eigenvalues of the two-level τ matrix related to the two-dimensional PSF h_PSF.

Notice that the three sets of multiple eigenvalues are exactly related to the type of low rank correction imposed by the BCs through the centers of the anti-reflections. More in detail, the eigenvalues of τ_{n2−2}(h^{r}) and of τ_{n1−2}(h^{c}) take into account the condensed PSF information considered along the horizontal and vertical borders respectively, while the eigenvalue equal to 1 takes into account the condensed information of the whole PSF at the four corners. In addition, it is worth noticing that the spectral characterization can be completely described in terms of the generating function associated with the PSF defined in (9), simply by extending to 0 the standard τ evaluation grid, i.e., it holds

λ_s(A_n) = f(x_{s1}^{[n1]}, x_{s2}^{[n2]}),   s = (s1, s2),   s_j = 0, …, n_j,   x_r^{[m]} = rπ/(m + 1),

where the 0-index refers to the first/last columns of the matrix T_m [3]. See [2, 4] for some algorithms related to standard operations like matrix-vector products, solution of linear systems and eigenvalue evaluations with a computational cost of O(n1 n2 log(n1 n2)) ops.


It is worthwhile stressing that the computational cost of the inverse transform is comparable with that of the direct transform and that, at least at first sight, the true penalty is the loss of orthogonality due to the first/last columns of the matrix T_m.

3 Filtering methods

Owing to the ill-conditioning, the standard solution f = A_n^{−1} g is not physically meaningful, since it is completely corrupted by the noise propagation from data to solution, i.e., by the so-called inverted noise. For this reason, restoration methods look for an approximate solution with controlled noise levels: widely considered regularization methods are obtained through spectral filtering [14, 16]. Hereafter, we consider the truncated Singular Value Decompositions (SVDs) (or Spectral Decompositions (SDs)) and the Tikhonov (or Re-blurring) regularization method.

3.1 Truncated SVDs and truncated SDs

The Singular Value Decomposition (SVD) highlights a standard perspective for dealing with the inverted noise. More precisely, if

A_n = U_n Σ_n V_n^T ∈ R^{N(n)×N(n)}

is the SVD of A_n, i.e., U_n and V_n are orthogonal matrices and Σ_n is a diagonal matrix with entries σ_1 ≥ σ_2 ≥ … ≥ σ_{N(n)} ≥ 0, then the solution of the linear system A_n f = g can be written as

f = Σ_{k=1}^{N(n)} (u_k^T g / σ_k) v_k,

where u_k and v_k denote the k-th columns of the matrices U_n and V_n, respectively. With regard to the image restoration problem, the idea is to consider a sharp filter, i.e., to keep in the summation only the terms corresponding to singular values greater than a certain threshold value δ, so damping the effects caused by division by the small singular values. Therefore, by setting the filter factors as

φ_k = 1 if σ_k > δ, 0 otherwise,

the filtered solution is defined as

f_filt = Σ_{k=1}^{N(n)} φ_k (u_k^T g / σ_k) v_k = Σ_{k∈I_δ} (u_k^T g / σ_k) v_k,   I_δ = {k | σ_k > δ}.
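With a dense SVD (affordable only for small test problems, which is exactly why the fast transforms of Section 2 matter), the sharp filter reads as follows; `tsvd_filter` is an illustrative name.

```python
import numpy as np

def tsvd_filter(A, g, delta):
    """Truncated-SVD regularized solution: keep only the terms of the SVD
    expansion with sigma_k > delta (sharp filter factors phi_k in {0, 1})."""
    U, s, Vt = np.linalg.svd(A)
    phi = s > delta
    # u_k^T g / sigma_k where kept, 0 where filtered out
    coeffs = np.where(phi, (U.T @ g) / np.where(phi, s, 1.0), 0.0)
    return Vt.T @ coeffs
```

With δ = 0 and a nonsingular matrix this reproduces the exact solution; with δ above the largest singular value everything is filtered out and the zero vector is returned.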


Due to the scale of the problem, the SVD of the matrix A_n is in general an expensive computational task (and not a negligible one even in the case of a separable PSF). Thus, an "a priori" known spectral decomposition, whenever available, can give rise to a valuable simplification. More precisely, let

A_n = V_n Λ_n Ṽ_n ∈ R^{N(n)×N(n)},   Ṽ_n = V_n^{−1},

be a spectral decomposition of A_n; then the filtered solution is defined as

f_filt = Σ_{k=1}^{N(n)} φ_k (ṽ_k g / λ_k) v_k = Σ_{k∈I_δ} (ṽ_k g / λ_k) v_k,   I_δ = {k | |λ_k(A)| > δ},

where v_k and ṽ_k denote the k-th column of V_n and the k-th row of Ṽ_n, respectively, and where φ_k = 1 if k ∈ I_δ, 0 otherwise.
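The same filter with a generic, possibly non-orthogonal spectral decomposition can be sketched with a dense eigendecomposition standing in for the fast transforms of Sections 2.1-2.2; the name `sd_filter` is ours.

```python
import numpy as np

def sd_filter(A, g, delta):
    """Filtered solution from a spectral decomposition A = V Lambda V~,
    V~ = V^{-1} (the columns of V need not be orthogonal): keep only the
    eigen-terms with |lambda_k| > delta."""
    lam, V = np.linalg.eig(A)
    c = np.linalg.solve(V, g)            # the coefficients v~_k g
    keep = np.abs(lam) > delta
    c = np.where(keep, c / np.where(keep, lam, 1.0), 0.0)
    return (V @ c).real
```

Note that the filtering acts on the transformed coefficients ṽ_k g, so when V is far from orthogonal the filtered components are measured in a skewed basis — the issue quantified for the anti-reflective transform by the norm bound of Section 3.2.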

3.2 Tikhonov and re-blurring regularizations

In the classical Tikhonov regularization method, the image filtering is obtained by looking for the solution of the following minimization problem

min_f ‖A_n f − g‖_2^2 + µ ‖D_n f‖_2^2,    (11)

where µ > 0 is the regularization parameter and D_n is a carefully chosen matrix (typically D_n = I_n, or the discretization of a differential operator, properly adapted with respect to the chosen BCs). The target is to minimize the Euclidean norm of the residual ‖A_n f − g‖_2 without explosions in the quantity ‖D_n f‖_2. As is well known, (11) is equivalent to the solution of the damped least squares problem

(A_n^T A_n + µ D_n^T D_n) f = A_n^T g.    (12)

In addition, the Tikhonov regularization method can be reinterpreted in the framework of classical spectral filtering methods. For instance, in the case D_n = I_n, by making use of the SVD A_n = U_n Σ_n V_n^T, the solution of (12) can be rewritten as

f_filt = V_n Φ_n Σ_n^{−1} U_n^T g,

where Φ_n = diag(φ_k) with φ_k = σ_k^2/(σ_k^2 + µ), k = 1, …, N(n). A severe drawback in adopting the Tikhonov regularization approach in the case of A_n ∈ AR2D_n is due to the fact that A_n^T ∉ AR2D_n, so that all the favorable computational properties are substantially spoiled. An alternative approach, named Re-blurring, has been proposed in [11, 9]: the proposal is to replace A_n^T by A′_n in (12), where A′_n is the blurring matrix related to the current BCs with a PSF


rotated by 180◦ . This approa h is ompletely equivalent to (12) in the ase of Diri hlet and Periodi BCs, while the novelty on erns both Re e tive BCs and ′ Anti-Re e tive BCs, where in general An 6= ATn . The authors show that the Re-blurring with anti-re e tive BCs is omputationally onvenient and leads to a larger redu tion of the ringing ee ts arising in lassi al deblurring s hemes. From the modelling point of view, the authors motivation relies upon the fa t that Re-blurring smoothes the noise in the right hand side of the system, in the same manner as this happens in the ase of Diri hlet, Periodi and Re e tive BCs. Hereafter, we onsider an explanation of the observed approximation results. As previously laimed, we fo us our attention on the ase of a strongly symmetri ′ PSF, so that the matrix An equals the matrix An . Moreover, also in this ase it is evident that the linear system (A2n + µD2n )f = An g.

(13)

is not equivalent to a minimization problem, again because the matrix An ∈ AR2D_n is not symmetric. Nevertheless, the symmetrization of (13) can be performed by diagonalization, so obtaining

(Λ²_{A,n} + µ Λ²_{D,n}) f̂ = Λ_{A,n} ĝ,

(14)

where f̂ = T̃n f and ĝ = T̃n g. In such a way (14) is again equivalent to the minimization problem

min_f ‖Λ_{A,n} T̃n f − T̃n g‖₂² + µ ‖Λ_{D,n} T̃n f‖₂²,

(15)

or equivalently, again by making use of the diagonalization result, to

min_f ‖T̃n (An f − g)‖₂² + µ ‖T̃n Dn f‖₂².

(16)

Clearly, the last formulation in (16) is the most natural and it allows us to claim that the Re-blurring method can be interpreted as a standard Tikhonov regularization method in the space transformed by means of T̃n. Recalling that T̃n is not an orthogonal transformation, the goal becomes to

compare ‖T̃n f‖₂ and ‖f‖₂, that is, to bound ‖T̃n‖₂ = ‖T̃n1‖₂ ‖T̃n2‖₂, being ‖T̃n f‖₂ ≤ ‖T̃n‖₂ ‖f‖₂. A quite sharp estimate of such a norm can be found by exploiting the structure of the unilevel matrix T̃m ∈ R^{m×m}. Let f̄ = [f₂, . . . , f_{m−1}]; it holds that

‖T̃m f‖₂² = α_m² f₁² + ‖Q_{m−2}(−f₁ p + f̄ − f_m Jp)‖₂² + α_m² f_m²
 = α_m²(f₁² + f_m²) + ‖−f₁ p + f̄ − f_m Jp‖₂²
 ≤ α_m²(f₁² + f_m²) + (‖f̄‖₂ + (|f₁| + |f_m|) ‖p‖₂)²
 ≤ α_m²(f₁² + f_m²) + ‖f̄‖₂² + 3 ‖p‖₂² ‖f‖₂² + 4 ‖p‖₂ ‖f‖₂²
 ≤ (1 + 2‖p‖₂)² ‖f‖₂²,

394

C. Tablino Possio

being α_m² = 1 + ‖p‖₂². Since, by definition, ‖p‖₂² ≃ m, we have

‖T̃m‖₂ ≤ 1 + 2‖p‖₂ ≃ 2√m.

(17)

Notice that the bound given in (17) is quite sharp since, for instance, ‖T̃m e₁‖₂² equals 1 + 2‖p‖₂².
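Stepping back to the unstructured case, the equivalence between the damped normal equations (12) and the SVD filter factors φ_k = σ_k²/(σ_k² + µ) can be checked numerically. A minimal sketch, assuming Dn = In and using an illustrative ill-conditioned matrix (not the paper's blurring operators):

```python
import numpy as np

def tikhonov_filter_svd(A, g, mu):
    # f = V diag(phi_k / sigma_k) U^T g, with phi_k = sigma_k^2 / (sigma_k^2 + mu)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    phi = s**2 / (s**2 + mu)              # Tikhonov filter factors
    return Vt.T @ (phi / s * (U.T @ g))

rng = np.random.default_rng(0)
A = np.vander(np.linspace(0, 1, 8), 8)    # illustrative ill-conditioned matrix
f_true = rng.standard_normal(8)
g = A @ f_true + 1e-6 * rng.standard_normal(8)

mu = 1e-3
f_filt = tikhonov_filter_svd(A, g, mu)
# same solution as the damped normal equations (A^T A + mu I) f = A^T g
f_ne = np.linalg.solve(A.T @ A + mu * np.eye(8), A.T @ g)
assert np.allclose(f_filt, f_ne, rtol=1e-5)
```

The SVD route makes explicit how µ damps the contributions associated with small singular values, which is exactly the mechanism exploited by the structured transforms discussed above.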

4

Cross-channel blurring

Hereafter, we extend the analysis of the deblurring problem to the case of color images digitalized, for instance, according to the standard RGB system. Several techniques can be used for recording color images, but the main problem concerns the fact that light from one color channel can end up on a pixel assigned to another color. The consequence of this phenomenon is called cross-channel blurring among the three channels of the image, and it adds to the previously analyzed blurring of each one of the three colors, named within-channel blurring. By assuming that the cross-channel blurring takes place after the within-channel blurring of the image, that it is spatially invariant, and that the same within-channel blurring occurs in all three color channels, the problem can be modelled [16] as

(Acolor ⊗ An) f = g − η    (18)

with An ∈ R^{N(n)×N(n)}, n = (n1, n2), N(n) = n1 n2, and

Acolor = [ a_rr  a_rg  a_rb
           a_gr  a_gg  a_gb
           a_br  a_bg  a_bb ].

The row-entries denote the amount of within-channel blurring pertaining to each color channel; a normalized conservation law prescribes that Acolor e = e, e = [1 1 1]^T. Lastly, the vectors f, g, η ∈ R^{3N(n)} are assumed to collect the three color channels in the RGB order. Clearly, if Acolor = I₃, i.e., the blurring is only of within-channel type, the problem is simply decoupled into three independent gray-scale deblurring problems. In the general case, taking into account the tensorial structure of the whole blurring matrix Acolor ⊗ An, it is evident that the truncated SVDs and SDs can be formulated as the natural extension of those considered in the within-blurring case. Notice that, in the case of SDs, we will consider an SVD for the matrix Acolor, since it naturally ensures an orthogonal decomposition, no matter the specific matrix, while its computational cost is negligible with respect to the scale of the problem. In addition, we tune the filtering strategy with respect to the spectral information given only by the matrix An, i.e., for any fixed σ_k (or λ_k) we simultaneously sum, or discard, the three contributions on f related to the


three singular values of Acolor. With respect to the Tikhonov regularization method, the approach is a bit more involved. Under the assumption An = An^T = Vn Λn Ṽn, the damped least squares problem

[(Acolor ⊗ An)^T (Acolor ⊗ An) + µ I_{3N(n)}] f = (Acolor ⊗ An)^T g

can be rewritten as

[(Acolor^T Acolor) ⊗ Vn Λn² Ṽn + µ (I₃ ⊗ In)] f = (Acolor ⊗ Vn Λn Ṽn)^T g.

(19)

Thus, by setting S_{3n} = I₃ ⊗ Ṽn, f̂ = S_{3n} f, ĝ = S_{3n} g, (19) can be transformed into

S_{3n} [(Acolor^T Acolor) ⊗ Vn Λn² Ṽn + µ (I₃ ⊗ In)] S_{3n}^{-1} f̂ = S_{3n} (Acolor ⊗ Vn Λn Ṽn)^T S_{3n}^{-1} ĝ,

so obtaining the linear system

[(Acolor^T Acolor) ⊗ Λn² + µ (I₃ ⊗ In)] f̂ = (Acolor^T ⊗ Λn) ĝ,

that can easily be decoupled into n1 n2 linear systems of dimension 3. Clearly, in the case of any matrix An ∈ C_n, all these manipulations can be performed by means of an orthogonal transformation S_{3n}. Notice also that the computational cost is always O(n1 n2 log n1 n2) flops. With respect to An = Tn Λn T̃n ∈ AR2D_n, we can consider the same strategy by referring to the Re-blurring regularization method. More precisely, the linear system

[(Acolor^T Acolor) ⊗ An² + µ (I₃ ⊗ In)] f = (Acolor^T ⊗ An) g

can be transformed into

[(Acolor^T Acolor) ⊗ Λn² + µ (I₃ ⊗ In)] f̂ = (Acolor^T ⊗ Λn) ĝ.

Though the transformation S_{3n} = I₃ ⊗ T̃n is not orthogonal, as it is in the Reflective case, the obtained restored images are fully comparable with the previous ones and the computational cost is still O(n1 n2 log n1 n2) flops.
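The decoupling into 3×3 systems can be verified on a toy problem. A sketch under simplifying assumptions: the illustrative sizes are small, and An is replaced by a symmetric matrix with orthonormal eigenvectors, so the transformation is orthogonal as in the Reflective case:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6                                     # stands in for N(n) = n1*n2 (illustrative)
Acolor = np.array([[0.70, 0.20, 0.10],
                   [0.25, 0.50, 0.25],
                   [0.15, 0.10, 0.75]])
M = rng.standard_normal((N, N))
A = (M + M.T) / 2                         # symmetric stand-in for A_n
lam, V = np.linalg.eigh(A)                # A = V diag(lam) V^T, V orthogonal
mu = 1e-2
g = rng.standard_normal(3 * N)

# direct solve of [(Acolor^T Acolor) ⊗ A^2 + mu I] f = (Acolor^T ⊗ A) g
lhs = np.kron(Acolor.T @ Acolor, A @ A) + mu * np.eye(3 * N)
f_direct = np.linalg.solve(lhs, np.kron(Acolor.T, A) @ g)

# decoupled solve: transform by S = I3 ⊗ V^T, then one 3x3 system per eigenvalue
ghat = np.kron(np.eye(3), V.T) @ g
B = Acolor.T @ Acolor
fhat = np.empty(3 * N)
for k in range(N):
    lhs3 = B * lam[k] ** 2 + mu * np.eye(3)   # [(Acolor^T Acolor) lam_k^2 + mu I3]
    rhs3 = Acolor.T @ (lam[k] * ghat[k::N])   # (Acolor^T ⊗ Lam) ghat, channel blocks
    fhat[k::N] = np.linalg.solve(lhs3, rhs3)
f_decoupled = np.kron(np.eye(3), V) @ fhat

assert np.allclose(f_direct, f_decoupled)
```

In the actual image sizes the direct solve is infeasible, while the decoupled route costs one transform pair plus N(n) tiny systems, which is the source of the O(n1 n2 log n1 n2) count quoted above.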

5 Numerical tests

5.1 Some computational issues

Before analyzing the image restoration results, we discuss how the methods can work without reshaping the involved data. In fact, the tensorial structure of the matrices obtained by considering Reflective and Anti-Reflective BCs can be exploited in depth, so that the algorithms can deal directly, and more naturally, with the data collected in matrix form. Hereafter, we consider a two-index notation in the sense of the previously adopted row-wise ordering.


In the SD case considered in Section 3.1, since ṽ_k = ṽ_{k1}^{[n1]} ⊗ ṽ_{k2}^{[n2]} is represented in matrix form as (ṽ_{k1}^{[n1]})^T ṽ_{k2}^{[n2]}, the required scalar product can be computed as

ṽ_k g = [(ṽ_{k1}^{[n1]})^T ṽ_{k2}^{[n2]}] ⊙ G,

where ⊙ denotes the summation of all the involved terms after an element-wise product. Clearly, v_k = v_{k1}^{[n1]} ⊗ v_{k2}^{[n2]} is represented in matrix form as v_{k1}^{[n1]} (v_{k2}^{[n2]})^T. In a similar manner, in the case of the SVD of An with separable PSF h = h1 ⊗ h2, we can represent v_k = v_{k1}^{[n1]} ⊗ v_{k2}^{[n2]} in matrix form as v_{k1}^{[n1]} (v_{k2}^{[n2]})^T and u_k^T = (u_{k1}^{[n1]} ⊗ u_{k2}^{[n2]})^T as u_{k1}^{[n1]} (u_{k2}^{[n2]})^T. The eigenvalues required for the SD can be stored into a matrix Λ* ∈ R^{n1×n2}. In the case of An ∈ C_n this matrix can be evaluated as

Λ* = (Ṽ_{n1} A* Ṽ_{n2}^T) ./ (Ṽ_{n1} E*₁ Ṽ_{n2}^T),

where A* denotes the first column of An and E*₁ the first canonical basis vector, both reshaped as matrices in column-wise order. In addition, the two-level direct and inverse transforms y = Vn x and y = Ṽn x can be directly evaluated on matrix data as

Y = V_{n1} X V_{n2}^T = (V_{n2} (V_{n1} X)^T)^T

and

Y = Ṽ_{n1} X Ṽ_{n2}^T = (Ṽ_{n2} (Ṽ_{n1} X)^T)^T

by referring to the corresponding unilevel transforms. In the same way, the eigenvalues required in the case of An ∈ AR2D_n can be suitably stored as

Λ* = [ 1                    Λ*(τ_{n2−2}(h^r))   1
       Λ*(τ_{n1−2}(h^c))    Λ*(τ_{n−2}(h))      Λ*(τ_{n1−2}(h^c))
       1                    Λ*(τ_{n2−2}(h^r))   1 ]  ∈ R^{n1×n2},

with reference to the notations of Theorem 2, where the eigenvalues of the unilevel and two-level τ matrices are evaluated as outlined in Section 2.2. Lastly, the linear systems obtained, for any fixed µ, in the case of the Tikhonov and Re-blurring regularization methods can be solved with reference to the matrix Φn of the corresponding filter factors by applying the Reflective and Anti-Reflective transforms with a computational cost of O(n1 n2 log n1 n2) flops.

5.2 Truncated decompositions

In this section we compare the effectiveness of truncated spectral decompositions (SDs) with respect to the standard truncated SVDs, both in the case of


Reflective and Anti-Reflective BCs. Due to the scale of the problem, the SVD of the matrix An is in general an expensive computational task (and not negligible also in the case of a separable PSF). Thus, a spectral decomposition, whenever available as in these cases, leads to a valuable simplification. Firstly, we consider the case of the separable PSF caused by atmospheric turbulence

h_{i1,i2} = (1 / (2π σ_{i1} σ_{i2})) exp( −(1/2)(i1/σ_{i1})² − (1/2)(i2/σ_{i2})² ),

where σ_{i1} and σ_{i2} determine the width of the PSF itself. Since the Gaussian function decays exponentially away from its center, it is customary to truncate the values in the PSF mask after an assigned decay, |i1|, |i2| ≤ l. It is evident from the quoted definition that the Gaussian PSF satisfies the strong symmetry condition (7). Another example of strongly symmetric PSF is given by the PSF representing the out-of-focus blur

h_{i1,i2} = 1/(πr²)  if i1² + i2² ≤ r²,   and   h_{i1,i2} = 0  otherwise,

where r is the radius of the PSF. In the reported numerical tests, the blurred image g has been perturbed by adding a Gaussian noise contribution η = η_n ν, with ν a fixed noise vector, η_n = ρ‖g‖₂/‖ν‖₂, and ρ an assigned value. In this way the Signal-to-Noise Ratio (SNR) [5] is given by

SNR = 20 log₁₀ (‖g‖₂ / ‖η‖₂) = 20 log₁₀ ρ⁻¹ (dB).

5.2.1 Gray-scale images

In Figure 1 we report the template true image (the FOV is delimited by a white frame), together with the blurred image with the Gaussian PSF with support 15×15 and σ_{i1} = σ_{i2} = 2, and the reference perturbation ν, reshaped in matrix form. We consider the optimal image restoration with respect to the relative restoration error (RRE), i.e., ‖f^filt − f_true‖₂/‖f_true‖₂, where f^filt is the computed approximation of the true image f_true obtained by considering spectral filtering. More in detail, the RRE is analyzed by progressively adding a new basis element at a time, according to the non-increasing order of the singular/eigen-values (the eigenvalues are ordered with respect to their absolute value). In the case of SDs (or SVDs related to a separable PSF) this can be done as described in Section 5.1 and, besides the preliminary cost related to the computation of the decomposition, the addition of a new term has a computational cost equal to 4 n1 n2 flops. The algorithm proposed in [4], which makes use of the Anti-Reflective direct and inverse transforms, is less expensive in the case of tests with few threshold values. Hereafter, the aim is to compare the truncated SVD with the truncated SD

[Figure 1 panels: True image; Reference noise perturbation; Effect of Gaussian blurring; Effect of Out-of-Focus blurring.]

Fig. 1. True image (FOV is delimited by a white frame), reference noise perturbation, blurred image with the Gaussian PSF with support 15×15 and σ_{i1} = σ_{i2} = 2, and blurred image with the Out-of-Focus PSF with support 15×15.
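The two PSF masks used in the tests can be generated directly from their definitions. A sketch (the truncation radius l and the parameters are illustrative; normalization to unit sum is added for convenience):

```python
import numpy as np

def gaussian_psf(l, s1, s2):
    """Truncated Gaussian PSF h[i1,i2] on |i1|,|i2| <= l, normalized to sum 1."""
    i1, i2 = np.meshgrid(np.arange(-l, l + 1), np.arange(-l, l + 1), indexing="ij")
    h = np.exp(-0.5 * (i1 / s1) ** 2 - 0.5 * (i2 / s2) ** 2) / (2 * np.pi * s1 * s2)
    return h / h.sum()

def out_of_focus_psf(l, r):
    """Out-of-focus PSF: 1/(pi r^2) on the disc i1^2 + i2^2 <= r^2, 0 outside."""
    i1, i2 = np.meshgrid(np.arange(-l, l + 1), np.arange(-l, l + 1), indexing="ij")
    h = np.where(i1**2 + i2**2 <= r**2, 1.0 / (np.pi * r**2), 0.0)
    return h / h.sum()

h = gaussian_psf(7, 2.0, 2.0)            # 15x15 support as in Figure 1
# strong symmetry (7): h is symmetric with respect to both axes
assert np.allclose(h, h[::-1, :]) and np.allclose(h, h[:, ::-1])
k = out_of_focus_psf(7, 5)
assert np.allclose(k, k[::-1, :]) and np.allclose(k, k[:, ::-1])
```

Both masks satisfy the strong symmetry condition, which is the hypothesis under which the Anti-Reflective SD analysis of the previous sections applies.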

Table 1. Optimal RREs of truncated SVD and SD with reference to the true image in Figure 1 (Gaussian blur σ_{i1} = σ_{i2} = 2).

Reflective BCs
  ρ       method    5x5        11x11      15x15      21x21
  0       SVD       0.059164   0.087402   0.090742   0.093856
          SD        0.043754   0.087400   0.090746   0.093867
  0.001   SVD       0.060278   0.091964   0.094468   0.097034
          SD        0.060278   0.091964   0.094476   0.097034
  0.01    SVD       0.091151   0.11214    0.11307    0.11495
          SD        0.091152   0.11214    0.11307    0.11495
  0.05    SVD       0.11635    0.13356    0.13508    0.13739
          SD        0.11635    0.13356    0.13510    0.13739
  0.1     SVD       0.13024    0.14607    0.14746    0.15047
          SD        0.13024    0.14607    0.14746    0.15047

Anti-Reflective BCs
  ρ       method    5x5        11x11      15x15      21x21
  0       SVD       0.039165   0.064081   0.086621   0.087237
          SD        0.038316   0.063114   0.083043   0.083521
  0.001   SVD       0.062182   0.094237   0.098897   0.10042
          SD        0.059617   0.089105   0.092814   0.094343
  0.01    SVD       0.096049   0.12231    0.12403    0.12536
          SD        0.091383   0.11230    0.11343    0.11495
  0.05    SVD       0.12791    0.15070    0.15188    0.15492
          SD        0.11666    0.13414    0.13570    0.13816
  0.1     SVD       0.14399    0.16756    0.16964    0.17225
          SD        0.13083    0.14709    0.14852    0.15162

restorations, both in the case of Reflective and Anti-Reflective BCs. Periodic BCs are not analyzed here, since Reflective and Anti-Reflective BCs give better performances with respect to the approximation of the image at the boundary. In Tables 1 and 2 we report the results obtained by varying the dimension of the PSF support, the parameter ρ related to the amount of the noise perturbation


and the variance of the considered Gaussian blur. As expected, the optimal RRE worsens as the parameter ρ increases, and the Anti-Reflective BCs show better performances in the case of low noise levels. In fact, for low ρ values, the reduction of ringing artifacts is significant, while the quality of the restoration for higher ρ values is essentially driven by the goal of noise filtering. Therefore, in such a case, the choice of the BCs becomes more and more meaningless, since it is not able to influence the image restoration quality. Some examples of restored images are reported in Figure 2. More impressive is the fact that SDs give better, or equal, results with respect

Table 2. Optimal RREs of truncated SVD and SD with reference to the true image in Figure 1 (Gaussian blur σ_{i1} = σ_{i2} = 5).

Reflective BCs
  ρ       method    5x5        11x11      15x15      21x21
  0       SVD       0.063387   0.081274   0.097351   0.14634
          SD        0.045365   0.081274   0.096387   0.14634
  0.001   SVD       0.063915   0.096243   0.11449    0.15217
          SD        0.063915   0.096274   0.11449    0.15217
  0.01    SVD       0.089032   0.13343    0.14947    0.17397
          SD        0.089032   0.13343    0.14946    0.17397
  0.05    SVD       0.12203    0.16002    0.17339    0.18335
          SD        0.12203    0.16002    0.17339    0.18335
  0.1     SVD       0.13412    0.16793    0.17963    0.19057
          SD        0.13412    0.16793    0.17963    0.19057

Anti-Reflective BCs
  ρ       method    5x5        11x11      15x15      21x21
  0       SVD       0.040214   0.079543   0.088224   0.13686
          SD        0.039437   0.078970   0.088832   0.13129
  0.001   SVD       0.068197   0.095808   0.11522    0.15767
          SD        0.063575   0.093247   0.1127     0.14893
  0.01    SVD       0.09412    0.14482    0.16825    0.21148
          SD        0.089038   0.13611    0.15270    0.17446
  0.05    SVD       0.13553    0.18563    0.21006    0.22962
          SD        0.12253    0.16269    0.17439    0.18414
  0.1     SVD       0.15010    0.20164    0.22256    0.23960
          SD        0.13487    0.16916    0.18088    0.19218

to those obtained by considering SVDs. This numerical evidence is really interesting in the case of Anti-Reflective BCs: despite the loss of the orthogonality property in the spectral decomposition, the restoration results are better than those obtained by considering the SVD. Moreover, the observed trend with respect to the Reflective BCs is also conserved. A further analysis refers to the so-called Picard plots (see Figure 3), where the coefficients |u_k^T g|, or |ṽ_k g| (black dots), are compared with the singular values σ_k, or the absolute values of the eigenvalues |λ_k| (red line). As expected, initially these coefficients decrease faster than σ_k, or |λ_k|, while afterwards they level off at a plateau determined by the level of the noise in the image. The threshold of this change of behavior is in good agreement with the optimal k value obtained in the numerical test by monitoring the RRE. Moreover, notice that the Picard plots related to the SDs are quite in agreement with those corresponding to SVDs. In the case of the Anti-Reflective SD we observe an increasing data dispersion with respect to the plateau, but the correspondence between the threshold and the chosen optimal k is still preserved. The computational relevance of this result is due to the significantly lower computational cost required by the Anti-Reflective SDs with respect to the corresponding SVDs.
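The Picard-plot diagnostic described above can be reproduced in a few lines: the coefficients |u_k^T g| decay faster than σ_k until they hit the noise floor, and the index where the plateau begins is a natural truncation level. A sketch on a synthetic ill-posed problem (the operator and noise level are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = 10.0 ** -np.linspace(0, 8, n)          # singular values decaying from 1 to 1e-8
A = U @ np.diag(s) @ V.T

f_true = V @ s**0.5                        # solution coefficients decay (discrete Picard condition)
eta = 1e-5
g = A @ f_true + eta * rng.standard_normal(n)

coef = np.abs(U.T @ g)                     # the "black dots" |u_k^T g| of a Picard plot
# exact part decays like s**1.5 (faster than s); the noise floor is of order eta
k_plateau = int(np.argmax(coef < 3 * eta)) # first index at the noise plateau
assert 0 < k_plateau < n
```

Summing only the first k_plateau terms of the (truncated) decomposition then filters out the components dominated by noise, which is exactly what the optimal-RRE search in the tables does by monitoring the restoration error.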

[Figure 2 panels — optimal restorations, ρ = 0.01 and ρ = 0.05: R-TSVD, R-TSD, AR-TSVD, AR-TSD.]

Fig. 2. Optimal restorations of truncated SVD and SD in the case of Reflective and Anti-Reflective BCs with reference to Figure 1 (Gaussian blur σ_{i1} = σ_{i2} = 2).


[Figure 3 panels — Picard plots, ρ = 0.01: R-SVD (kott = 7681), R-SD (kott = 7678), AR-SVD (kott = 8329), AR-SD (kott = 7816); ρ = 0.05: R-SVD (kott = 4653), R-SD (kott = 4654), AR-SVD (kott = 5415), AR-SD (kott = 4710).]

Fig. 3. Picard plot of truncated SVD and SD in the case of Reflective and Anti-Reflective BCs with reference to Figure 1 (Gaussian blur σ_{i1} = σ_{i2} = 2).

Lastly, Table 3 reports the spectral filtering results obtained in the case of Out-of-Focus blur by varying the dimension of the PSF support and the parameter ρ related to the noise perturbation. The RRE follows the same trend observed in the case of Gaussian blur. Other image restoration tests with different gray-scale images have been considered in [20]. A more interesting remark again pertains to the computational cost. Since the Out-of-Focus PSF is not separable, but the transforms are, the use of SDs related to Reflective or Anti-Reflective BCs allows one to exploit the tensorial nature of the


Table 3. Optimal RREs of truncated SDs with reference to the true image in Figure 1 (Out-of-Focus blur).

Reflective BCs
  ρ        5x5        11x11      15x15      21x21
  0        0.072593   0.084604   0.088323   0.096479
  0.001    0.072671   0.085809   0.091035   0.10436
  0.01     0.080016   0.12255    0.13569    0.15276
  0.05     0.10645    0.15365    0.16810    0.18777
  0.1      0.12089    0.16314    0.17836    0.20471

Anti-Reflective BCs
  ρ        5x5        11x11      15x15      21x21
  0        0.072821   0.085366   0.091252   0.099293
  0.001    0.072904   0.086643   0.093929   0.10752
  0.01     0.080427   0.12316    0.13803    0.15683
  0.05     0.10685    0.15571    0.17147    0.19172
  0.1      0.12147    0.16482    0.17987    0.20829

corresponding transforms, both with respect to the computation of the eigenvalues and of the eigenvectors (or of the Reflective and Anti-Reflective transforms).

5.2.2 Color images in the case of cross-channel blurring

Here, we analyze some restoration tests in the case of the template color image reported in Figure 4, by assuming the presence of a cross-channel blurring phenomenon modelled as in (18). The entity of this mixing effect is chosen according to the matrix

Acolor = [ 0.7   0.2   0.1
           0.25  0.5   0.25
           0.15  0.1   0.75 ].

(20)
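The matrix in (20) can be checked against the normalized conservation law Acolor e = e stated in Section 4:

```python
import numpy as np

# the cross-channel mixing matrix of (20)
Acolor = np.array([[0.70, 0.20, 0.10],
                   [0.25, 0.50, 0.25],
                   [0.15, 0.10, 0.75]])
e = np.ones(3)
# normalized conservation law: Acolor e = e, i.e. each row sums to 1
assert np.allclose(Acolor @ e, e)

# a pure-red pixel [1, 0, 0] is mapped to the first column of Acolor:
# part of its intensity leaks into the green and blue channels
mixed = Acolor @ np.array([1.0, 0.0, 0.0])
assert np.allclose(mixed, Acolor[:, 0])
```

The off-diagonal weights quantify the leakage between channels, which is why the blurred image in Figure 4 appears darkened and retinted.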

In Figure 4 the cross-channel blurred image with the Gaussian PSF with support 15×15 and σ_{i1} = σ_{i2} = 2 is also reported.

[Figure 4 panels: True image; Effect of Cross-channel Gaussian blurring.]

Fig. 4. True image (FOV is delimited by a white frame) and cross-channel blurred image with the Gaussian PSF with support 15×15 and σ_{i1} = σ_{i2} = 2 and matrix Acolor in (20).

Notice that the entity of the cross-channel blurring is not negligible, since the whole image turns out to be darkened and the color intensities of the additive RGB system are substantially altered. Table 4 reports the optimal RREs of truncated SVDs and SDs obtained by varying the dimension of the Gaussian PSF support and the parameter ρ related to the amount of the noise perturbation. It is worth stressing that we tune the


filtering strategy with respect to the spectral information given just by the matrix An, i.e., for any fixed σ_k (or λ_k) we simultaneously sum, or discard, the three contributions on f related to the three singular values of Acolor. In fact, the magnitude of the singular values of the considered matrix Acolor does not differ enough to dramatically change the filtering information given just by An. Nevertheless, also the comparison with the restoration results obtained by considering a global ordering justifies this approach. The color case behaves as the gray-scale one: as expected, the optimal RRE becomes worse as the parameter ρ increases, and the Anti-Reflective SD shows better performances in the case of low noise levels. In addition, by referring to Figure 5, we note that the truncated SVD in the

Table 4. Optimal RREs of truncated SVD and SD with reference to the true image in Figure 4 (Cross-channel and Gaussian blur σ_{i1} = σ_{i2} = 2).

Reflective BCs
  ρ       method    5x5        11x11      15x15      21x21
  0       SVD       0.078276   0.12114    0.11654    0.1178
          SD        0.078276   0.12114    0.11654    0.1178
  0.001   SVD       0.078992   0.1212     0.11663    0.11792
          SD        0.078992   0.12119    0.11663    0.11792
  0.01    SVD       0.10152    0.12396    0.12088    0.12198
          SD        0.10152    0.12396    0.12088    0.12198
  0.05    SVD       0.12102    0.13853    0.13743    0.13844
          SD        0.12102    0.13853    0.13743    0.13844
  0.1     SVD       0.13437    0.14898    0.14854    0.14947
          SD        0.13437    0.14898    0.14854    0.14947

Anti-Reflective BCs
  ρ       method    5x5        11x11      15x15      21x21
  0       SVD       0.076646   0.1006     0.1098     0.10646
          SD        0.074953   0.098474   0.10508    0.10216
  0.001   SVD       0.077394   0.10639    0.1111     0.11002
          SD        0.075727   0.10233    0.10612    0.10443
  0.01    SVD       0.10431    0.12695    0.12624    0.12779
          SD        0.10087    0.11737    0.11805    0.118
  0.05    SVD       0.13017    0.15075    0.15063    0.15166
          SD        0.12127    0.13699    0.13756    0.13795
  0.1     SVD       0.1456     0.16516    0.16626    0.16647
          SD        0.13507    0.14796    0.14955    0.15018

case of Anti-Reflective BCs shows a few more 'freckles' than the corresponding truncated SVD in the case of Reflective BCs. Nevertheless, for low noise levels, it is just the Anti-Reflective SD that exhibits fewer 'freckles' than the Reflective SD.

5.3

Tikhonov and re-blurring regularizations

By considering a Gaussian blurring of the true image reported in Figure 1, Table 5 compares the optimal RRE obtained in the case of the Tikhonov method for Reflective BCs and of the Re-blurring method for Anti-Reflective BCs. In addition, in Table 6, the same comparison refers to the case of the Out-of-Focus PSF. As expected, the RRE deteriorates as the noise level or the dimension of the PSF support increases. Notice also that the gap between the Reflective and Anti-Reflective BCs is reduced also for low noise levels. Further numerical tests can be found in [9, 2]. Lastly, we focus our attention on the case of the color image in Figure 4. The

[Figure 5 panels — optimal restorations, ρ = 0.01 and ρ = 0.05: R-TSVD, R-TSD, AR-TSVD, AR-TSD.]

Fig. 5. Optimal restorations of truncated SVD and SD in the case of Reflective and Anti-Reflective BCs with reference to Figure 4 (Cross-channel and Gaussian blur σ_{i1} = σ_{i2} = 2).


Table 5. Optimal RREs of Tikhonov and Re-blurring methods, and corresponding µ_ott, with reference to the true image in Figure 1 (Gaussian blur σ_{i1} = σ_{i2} = 2). Each entry: RRE (µ_ott).

  ρ       BCs   5x5                   11x11                 15x15                 21x21
  0       R     0.041015 (4.1e-005)   0.079044 (9e-006)     0.086386 (1.1e-005)   0.089556 (1.6e-005)
          AR    0.034237 (1.1e-005)   0.059465 (1e-006)     0.078963 (1e-006)     0.079805 (1e-006)
  0.001   R     0.050155 (0.000188)   0.087482 (5.7e-005)   0.090825 (4.3e-005)   0.093071 (4.9e-005)
          AR    0.048556 (0.000163)   0.085279 (4.6e-005)   0.089388 (3.3e-005)   0.090821 (3.3e-005)
  0.01    R     0.083456 (0.005555)   0.10748 (0.001786)    0.10863 (0.001678)    0.11023 (0.001573)
          AR    0.083436 (0.005536)   0.10744 (0.001792)    0.10868 (0.001691)    0.11019 (0.001575)
  0.05    R     0.12024 (0.038152)    0.12982 (0.01929)     0.13071 (0.018417)    0.13307 (0.017892)
          AR    0.12049 (0.038379)    0.13006 (0.01957)     0.13096 (0.018669)    0.1333 (0.018105)
  0.1     R     0.14767 (0.06587)     0.14721 (0.039231)    0.14822 (0.038181)    0.15097 (0.037893)
          AR    0.14813 (0.066251)    0.14766 (0.039707)    0.14866 (0.038644)    0.15144 (0.038296)

Table 6. Optimal RREs of Tikhonov and Re-blurring methods, and corresponding µ_ott, with reference to the true image in Figure 1 (Out-of-Focus blur). Each entry: RRE (µ_ott).

  ρ       BCs   5x5                   11x11                 15x15                 21x21
  0       R     0.031422 (0.000172)   0.05346 (6.9e-005)    0.060954 (3.5e-005)   0.074785 (2.7e-005)
          AR    0.036213 (0.000302)   0.051236 (6.8e-005)   0.06683 (5.7e-005)    0.084482 (5.8e-005)
  0.001   R     0.034441 (0.000271)   0.061465 (0.000145)   0.073751 (0.000101)   0.09074 (7.9e-005)
          AR    0.038313 (0.000402)   0.059957 (0.000138)   0.076695 (0.000126)   0.095274 (0.000106)
  0.01    R     0.069647 (0.008493)   0.11361 (0.004117)    0.12881 (0.003037)    0.14914 (0.001873)
          AR    0.070384 (0.008923)   0.11404 (0.00422)     0.12982 (0.003139)    0.15061 (0.001969)
  0.05    R     0.12204 (0.053687)    0.1532 (0.030719)     0.16614 (0.022121)    0.18769 (0.01346)
          AR    0.12256 (0.05423)     0.15402 (0.031574)    0.16739 (0.023213)    0.18933 (0.014472)
  0.1     R     0.16366 (0.092379)    0.17357 (0.055919)    0.1829 (0.042944)     0.20323 (0.028803)
          AR    0.16433 (0.093069)    0.17485 (0.057326)    0.18457 (0.044901)    0.20511 (0.031011)

Table 7. Optimal RREs of Tikhonov and Re-blurring methods, and corresponding µ_ott, with reference to the true image in Figure 4 (Cross-channel and Gaussian blur σ_{i1} = σ_{i2} = 2). Each entry: RRE (µ_ott).

  ρ       BCs   5x5                   11x11                 15x15                 21x21
  0       R     0.069148 (0.000203)   0.11508 (0.001204)    0.1123 (0.000717)     0.11335 (0.000726)
          AR    0.062854 (0.000102)   0.091232 (7e-006)     0.1014 (4.4e-005)     0.098266 (1.5e-005)
  0.001   R     0.071259 (0.000312)   0.11515 (0.001228)    0.11239 (0.000744)    0.11347 (0.000755)
          AR    0.066734 (0.000209)   0.098658 (5.8e-005)   0.10276 (7.7e-005)    0.10111 (4.5e-005)
  0.01    R     0.094871 (0.004975)   0.11896 (0.002919)    0.11712 (0.002421)    0.1182 (0.002459)
          AR    0.094458 (0.004841)   0.1144 (0.001884)     0.11481 (0.00184)     0.11507 (0.001755)
  0.05    R     0.13209 (0.029798)    0.13662 (0.015305)    0.13599 (0.014896)    0.13669 (0.014824)
          AR    0.13239 (0.029944)    0.13561 (0.014992)    0.13593 (0.014772)    0.13611 (0.014595)
  0.1     R     0.16281 (0.051315)    0.15543 (0.029068)    0.15547 (0.028822)    0.15586 (0.02868)
          AR    0.16341 (0.051659)    0.15526 (0.029213)    0.15602 (0.029029)    0.15588 (0.02872)


image restorations have been obtained by considering the transformation procedure outlined at the end of Section 4. Although the RREs in Table 7 are bigger than in the gray-scale case, the perception of the image restoration quality is very satisfying, and a few less 'freckles' than in the corresponding SDs and SVDs are observed (see Figure 6). Notice also that the lack of orthogonality in the S_{3n} transform related to the Anti-Reflective BCs does not deteriorate the performances of the restoration.

6

Conclusions

In this paper we have analyzed and compared SD and SVD filtering methods in the case of both Reflective and Anti-Reflective BCs. Numerical evidence is given of the good performances achievable through SDs, at a substantially lower computational cost with respect to SVDs. In addition, the tensorial structure of the Reflective and Anti-Reflective SDs can be exploited in depth also in the case of non-separable PSFs. A special mention has to be made of the fact that the loss of orthogonality of the Anti-Reflective transform does not seem to have any consequence on the trend of the image restoration results. The analysis in the case of cross-channel blurring in color images allows us to confirm the quoted considerations. Finally, the Re-blurring regularizing method has been re-interpreted as a standard Tikhonov regularization method in the space transformed by means of T̃n. Some numerical tests highlight the image restoration performances, also in the case of cross-channel blurring. Future works will concern the analysis of effective strategies allowing to properly choose the optimal regularizing parameters in the Anti-Reflective BCs case.

References

1. B. Anconelli, M. Bertero, P. Boccacci, M. Carbillet, and H. Lanteri, Reduction of boundary effects in multiple image deconvolution with an application to LBT LINC-NIRVANA, Astron. Astrophys., 448 (2006), pp. 1217–1224.
2. A. Aricò, M. Donatelli, and S. Serra-Capizzano, The Anti-Reflective Algebra: Structural and Computational Analyses with Application to Image Deblurring and Denoising, Calcolo, 45(3) (2008), pp. 149–175.
3. A. Aricò, M. Donatelli, and S. Serra Capizzano, Spectral analysis of the anti-reflective algebra, Linear Algebra Appl., 428 (2008), pp. 657–675.
4. A. Aricò, M. Donatelli, J. Nagy, and S. Serra-Capizzano, The anti-reflective transform and regularization by filtering, in Numerical Linear Algebra in Signals, Systems, and Control, Lecture Notes in Electrical Engineering, S. Bhattacharyya, R. Chan, V. Olshevsky, A. Routray, and P. Van Dooren, eds., Springer Verlag, in press.


[Figure 6 panels, ρ = 0.01 and ρ = 0.05: R-Tikhonov, AR-Re-blurring.]

Fig. 6. Optimal RREs of Tikhonov and Re-blurring methods with reference to the true image in Figure 4 (Cross-channel and Gaussian blur σ_{i1} = σ_{i2} = 2, ρ = 0.05).


5. M. Bertero and P. Boccacci, Introduction to inverse problems in imaging, Inst. of Physics Publ. London, UK, 1998.
6. M. Bertero and P. Boccacci, Image restoration for Large Binocular Telescope (LBT), Astron. Astrophys. Suppl. Ser., 147 (2000), pp. 323–332.
7. D. Bini and M. Capovani, Spectral and computational properties of band symmetric Toeplitz matrices, Linear Algebra Appl., 52/53 (1983), pp. 99–125.
8. P. J. Davis, Circulant Matrices, Wiley, New York, 1979.
9. M. Donatelli, C. Estatico, A. Martinelli, and S. Serra Capizzano, Improved image deblurring with anti-reflective boundary conditions and re-blurring, Inverse Problems, 22 (2006), pp. 2035–2053.
10. M. Donatelli, C. Estatico, J. Nagy, L. Perrone, and S. Serra Capizzano, Anti-reflective boundary conditions and fast 2D deblurring models, Proceedings of SPIE's 48th Annual Meeting, San Diego, CA, USA, F. Luk, ed., 5205 (2003), pp. 380–389.
11. M. Donatelli and S. Serra Capizzano, Anti-reflective boundary conditions and re-blurring, Inverse Problems, 21 (2005), pp. 169–182.
12. H. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2000.
13. C. W. Groetsch, The Theory of Tikhonov Regularization for Fredholm Integral Equations of the First Kind, Pitman, Boston, 1984.
14. P. C. Hansen, Rank-deficient and discrete ill-posed problems, SIAM, Philadelphia, PA, 1997.
15. M. Hanke and J. Nagy, Restoration of atmospherically blurred images by symmetric indefinite conjugate gradient techniques, Inverse Problems, 12 (1996), pp. 157–173.
16. P. C. Hansen, J. Nagy, and D. P. O'Leary, Deblurring Images: Matrices, Spectra and Filtering, SIAM Publications, Philadelphia, 2006.
17. R. L. Lagendijk and J. Biemond, Iterative Identification and Restoration of Images, Springer-Verlag New York, Inc., 1991.
18. M. K. Ng, R. H. Chan, and W. C. Tang, A fast algorithm for deblurring models with Neumann boundary conditions, SIAM J. Sci. Comput., 21 (1999), no. 3, pp. 851–866.
19. L. Perrone, Kronecker Product Approximations for Image Restoration with Anti-Reflective Boundary Conditions, Numer. Linear Algebra Appl., 13(1) (2006), pp. 1–22.
20. F. Rossi, Tecniche di filtraggio nella ricostruzione di immagini con condizioni al contorno antiriflettenti (in Italian; Filtering techniques in image reconstruction with anti-reflective boundary conditions), Basic Degree Thesis, University of Milano-Bicocca, Milano, 2006.
21. S. Serra Capizzano, A note on anti-reflective boundary conditions and fast deblurring models, SIAM J. Sci. Comput., 25(3) (2003), pp. 1307–1325.
22. Y. Shi and Q. Chang, Acceleration methods for image restoration problem with different boundary conditions, Appl. Numer. Math., 58(5) (2008), pp. 602–614.
23. G. Strang, The Discrete Cosine Transform, SIAM Review, 41(1) (1999), pp. 135–147.
24. C. R. Vogel, Computational Methods for Inverse Problems, SIAM, Philadelphia, PA, 2002.

Discrete-time stability of a class of Hermitian polynomial matrices with positive semidefinite coefficients

Harald K. Wimmer

Mathematisches Institut, Universität Würzburg, D-97074 Würzburg, Germany
wimmer@mathematik.uni-wuerzburg.de

Abstract. Polynomial matrices G(z) = Iz^m − Σ_{i=0}^{m−1} C_i z^i with positive semidefinite coefficients C_i are studied. If C_0 is positive definite and Σ C_i = I, then all characteristic values of G(z) are in the closed unit disc, and those lying on the unit circle are m-th roots of unity having linear elementary divisors. The result yields a stability and convergence criterion for a system of difference equations.

Keywords: polynomial matrix, zeros of polynomials, root location, block companion matrix, difference equation, stability.

1

Introduction

In this note we deal with a theorem on polynomials and its extension to polynomial matrices. The following result can be found in [1], and to some extent also in [3], [4, p. 92], and [5, p. 3].

Theorem 1. Let

g(z) = z^m − (c_{m−1} z^{m−1} + · · · + c_1 z + c_0)

be a real polynomial such that

c_i ≥ 0, i = 0, . . . , m − 1,   c_0 > 0,   and   Σ_{i=0}^{m−1} c_i = 1.    (1)

(i) Then all zeros of g(z) are in the closed unit disc.
(ii) The zeros of g(z) lying on the unit circle are simple and they are m-th roots of unity.
(iii) The number of zeros of g(z) on the unit circle is equal to

k = gcd({m} ∪ {i; c_i ≠ 0}).

Moreover,

g(z) = (z^k − 1) p(z^k)   with   p(λ^k) ≠ 0   if   |λ| = 1.
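Theorem 1 can be checked numerically on an example. The sketch below verifies statements (i)–(iii) for one admissible coefficient choice (the polynomial is illustrative):

```python
import numpy as np

# g(z) = z^4 - (c3 z^3 + c2 z^2 + c1 z + c0), with c = (c0, c1, c2, c3) = (0.5, 0, 0.5, 0):
# c_i >= 0, c0 > 0, and sum c_i = 1, as required by (1)
c = np.array([0.5, 0.0, 0.5, 0.0])
roots = np.roots([1.0, -c[3], -c[2], -c[1], -c[0]])

# (i) all zeros lie in the closed unit disc
assert np.all(np.abs(roots) <= 1 + 1e-12)

# (ii) zeros on the unit circle are m-th (here 4th) roots of unity
on_circle = roots[np.isclose(np.abs(roots), 1.0)]
assert np.allclose(on_circle**4, 1.0)

# (iii) their number equals k = gcd({m} ∪ {i : c_i != 0}) = gcd(4, 0, 2) = 2,
# and indeed g(z) = (z^2 - 1)(z^2 + 1/2) here
assert len(on_circle) == 2
```

Here the unimodular zeros are ±1, while the remaining zeros ±i/√2 lie strictly inside the disc, in accordance with the factorization g(z) = (z^k − 1) p(z^k).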

It is the purpose of this note to extend the preceding theorem to polynomial matrices and to derive a stability and convergence result for a system of difference equations. We consider matrices G(z) = Iz^m − Σ_{i=0}^{m−1} C_i z^i where the coefficients

410

H. K. Wimmer

Ci ∈ Cn×n are hermitian and positive semide nite (Ci > 0) and C0 is positive P de nite (C0 > 0), and we assume m−1 i=0 Ci = I. The following notation will be used. We de ne σ(G) = {λ; det G(λ) = 0}. In a

ordan e with [2, p. 341℄ the elements of σ(G) will be alled the hara teristi values of G(z). If G(λ)v = 0 and v ∈ Cn , v 6= 0, then v is said to be an eigenve tor orresponding to λ. An r-tuple of ve tors (v0 , v1 , . . . , vr−1 ), vi ∈ Cn , v0 6= 0, is alled a Jordan hain (or Keldysh hain) of length r if G(λ)v0 = 0, G′ (λ)v0 + G(λ)v1 = 0, · · · , (r−1) 1 (λ)v0 (r−1)! G

+

(r−2) 1 (λ)v1 (r−2)! G

+ · · · + G(λ)vr−1 = 0.

Let D = {z ∈ C; |z| < 1} be the open unit dis and ∂D = {z ∈ C; |z| = 1} the unit

ir le of the omplex plane, and let R> be the set of nonnegative real numbers. Let Em = {ζ ∈ C; ζm = 1} be the group of m-th roots of unity. If ζ ∈ Em then ord ζ will denote the order of ζ, i.e. if ord ζ = s then s is the smallest positive divisor of m su h that ζs = 1.

2 Polynomial matrices

Let us have a closer look at Theorem 1. It is not difficult to show that the zeros of the polynomial $g(z)$ lie in the closed unit disc. But it is remarkable that the unimodular zeros of $g(z)$ should be roots of unity. In the theorem below we shall encounter this property in a more general setting. Accordingly, the focus of this section will be on characteristic values on the unit circle and corresponding eigenvectors. To make the exposition self-contained we do not take advantage of Theorem 1 in the subsequent proof.

Theorem 2.

Let
$$G(z) = Iz^m - \sum_{i=0}^{m-1} C_i z^i$$
be an $n \times n$ polynomial matrix with hermitian coefficients $C_i$ such that
$$C_i \ge 0,\ i = 0,\dots,m-1, \qquad C_0 > 0, \qquad \sum_{i=0}^{m-1} C_i = I. \tag{2}$$
(i) Then $|\lambda| \le 1$ for all $\lambda \in \sigma(G)$.
(ii) If $\lambda \in \sigma(G)$ and $|\lambda| = 1$ then $\lambda^m = 1$. The elementary divisors of $G(z)$ corresponding to $\lambda$ are linear. If $\operatorname{ord}\lambda = s$ then $E_s \subseteq \sigma(G)$.
(iii) Let $v \in \mathbb{C}^n$, $v \ne 0$. Define
$$k(v) = \gcd\big(\{m\} \cup \{i;\ C_i v \ne 0,\ i = 0, 1, \dots, m-1\}\big). \tag{3}$$
Suppose the set
$$M(v) = \{\lambda \in \sigma(G);\ |\lambda| = 1,\ G(\lambda)v = 0\} \tag{4}$$
is nonempty. Then $M(v) = E_{k(v)}$. If $m = k(v)\ell$, then
$$G(z)v = \Big[Iz^{k(v)\ell} - \sum_{j=0}^{\ell-1} C_{k(v)j}\,z^{k(v)j}\Big]v = (z^{k(v)} - 1)\,p\big(z^{k(v)}\big) \tag{5}$$
where $p(z) \in \mathbb{C}^n[z]$ and
$$p\big(\lambda^{k(v)}\big) \ne 0 \quad\text{if}\quad |\lambda| = 1. \tag{6}$$

Proof. Note that $\det C_0 \ne 0$ implies $0 \notin \sigma(G)$. Let $\lambda$ be a characteristic value of $G(z)$ and $v$ a corresponding eigenvector with $v^*v = 1$. Set $g_v(z) = v^*G(z)v$ and $c_i = v^*C_iv$. Then $g_v(z) = z^m - \sum_{i=0}^{m-1} c_i z^i$, and the assumptions (2) imply (1). We have $g_v(\lambda) = 0$, or equivalently
$$1 = \sum_{i=0}^{m-1}\frac{c_i}{\lambda^{m-i}}. \tag{7}$$
Hence
$$1 = \Big|\sum_{i=0}^{m-1}\frac{c_i}{\lambda^{m-i}}\Big| \le \sum_{i=0}^{m-1}\frac{c_i}{|\lambda^{m-i}|}. \tag{8}$$
(i) Suppose $|\lambda| > 1$, i.e. $|1/\lambda^{m-i}| < 1$, $0 \le i \le m-1$. Then (8) implies the strict inequality $1 < \sum c_i$, in contradiction to (1).
(ii) Put
$$\beta_i = \frac{c_i}{\lambda^{m-i}}, \quad i = 0,\dots,m-1. \tag{9}$$
If $|\lambda| = 1$ then (1) and (7) yield
$$\Big|\sum_{i=0}^{m-1}\beta_i\Big| = \sum_{i=0}^{m-1}|\beta_i| = 1.$$
Hence $\beta_i = u\alpha_i$, $i = 0,\dots,m-1$, with $\alpha_i \in \mathbb{R}_{\ge}$, $u \in \mathbb{C}$, $|u| = 1$. From (7) we obtain $1 = u\sum\alpha_i$. Therefore $u \in \mathbb{R}$, $u > 0$. Thus $u = 1$, and we have (9) with $\beta_i \in \mathbb{R}_{\ge}$. Take $i = 0$. Then $c_0 > 0$ yields $\lambda^m \in \mathbb{R}_{\ge}$. Because of $|\lambda| = 1$ we obtain $\lambda^m = 1$, i.e. $\lambda \in E_m$. We rewrite (9) as
$$\beta_i = \lambda^i c_i, \quad \beta_i \in \mathbb{R}_{\ge}, \quad i = 0,\dots,m-1. \tag{10}$$
Let $\operatorname{ord}\lambda = s$ and $m = \ell s$. Suppose $c_i \ne 0$, i.e. $c_i > 0$. Then (10) implies $\lambda^i = 1$, that is $i \in \{0, s, 2s, \dots, (\ell-1)s\}$. Therefore $c_i = 0$ if $i \notin s\mathbb{Z}$. Because of $C_i \ge 0$ we have $c_i = v^*C_iv = 0$ if and only if $C_iv = 0$. Hence
$$G(z)v = \Big[I(z^s)^{\ell} - \big(C_{(\ell-1)s}(z^s)^{\ell-1} + \cdots + C_s z^s + C_0\big)\Big]v. \tag{11}$$

Let $\mu^s = 1$. Then (11) and $G(1) = 0$ imply $G(\mu)v = 0$. Thus $E_s \subseteq \sigma(G)$. Moreover $v^*G(\lambda) = 0$, and
$$g_v(z) = z^{\ell s} - \sum_{j=0}^{\ell-1} c_{js}z^{js} \qquad\text{and}\qquad \sum_{j=0}^{\ell-1} c_{js} = 1.$$
Let us show that the elementary divisors corresponding to $\lambda$ are linear. It suffices to prove (see e.g. [2, p. 342]) that the vector $v$ cannot be extended to a Jordan chain of length greater than 1. Suppose there exists a vector $w \in \mathbb{C}^n$ such that $G'(\lambda)v + G(\lambda)w = 0$. Then $v^*G(\lambda) = 0$ and $\lambda^s = 1$ imply
$$0 = v^*[G(\lambda)w + G'(\lambda)v] = v^*G'(\lambda)v = g_v'(\lambda) = \ell s\lambda^{\ell s-1} - \sum_{j=0}^{\ell-1} js\,c_{js}\lambda^{js-1} = \lambda^{-1}\Big[\ell s - \sum_{j=0}^{\ell-1} js\,c_{js}\Big].$$
Thus we would obtain $\ell s = \sum_{j=0}^{\ell-1} js\,c_{js}$, which is incompatible with $\sum_{j=0}^{\ell-1} c_{js} = 1$.

(iii) Let $M(v)$ be defined by (4) and let $D(v)$ denote the set of common positive divisors of $\{m\} \cup \{i;\ C_iv \ne 0,\ i = 0,\dots,m-1\}$. Then $s \in D(v)$ with $m = \ell s$ is equivalent to (11). We know that $M(v) \subseteq E_m$, and we have seen that $s \in D(v)$ if $E_s \subseteq M(v)$. Since (11) implies $E_s \subseteq M(v)$ it is obvious that
$$s \in D(v) \quad\text{if and only if}\quad E_s \subseteq M(v). \tag{12}$$
Now let $\lambda, \mu \in M(v)$ and $\operatorname{ord}\lambda = s$, $\operatorname{ord}\mu = t$. Set $q = \operatorname{lcm}\{s, t\}$ and $r = m/q$. Then
$$G(z)v = \Big[Iz^{rq} - \sum_{j=0}^{r-1} C_{jq}z^{jq}\Big]v.$$
Hence $E_q \subseteq M(v)$. In particular, we have $\lambda\mu \in M(v)$. Therefore $M(v)$ is a subgroup of $E_m$. Hence $M(v) = E_{\hat k}$ for some divisor $\hat k$ of $m$. Note that $E_s \subseteq E_{\hat k}$ is equivalent to $s \mid \hat k$. Therefore it follows from (12) that $\hat k$ is the greatest element of $D(v)$. Thus, if $k(v)$ is given by (3), then $\hat k = k(v)$.

It remains to show that the polynomial vector $p(z^{k(v)})$ in (5) satisfies the condition (6). Suppose $p(\lambda^{k(v)}) = 0$ for some $\lambda \in \partial D$. Then $\lambda \in M(v)$ and therefore $\lambda \in E_{k(v)}$, i.e. $\lambda^{k(v)} - 1 = 0$. Hence $G(\lambda)v = G'(\lambda)v = 0$. But then there would exist an elementary divisor $(z - \lambda)^t$ with $t \ge 2$. Therefore we have (6). ⊓⊔

From $\sum C_i = I$ it follows that $G(1) = 0$. Thus $1 \in \sigma(G)$. More precisely, $\det G(z) = (z-1)^n f(z)$, $f(1) \ne 0$. To check whether $G(z)$ has characteristic values on $\partial D$ different from 1 we introduce the following matrices. Let $s$, $s \ne 1$, $s \ne m$, be a positive divisor of $m$ such that $m = s\ell$. Define
$$T_s = I - \sum_{j=0}^{\ell-1} C_{js}.$$

Corollary 1. The matrices $T_s$ are nonsingular for each nontrivial divisor $s$ of $m$ if and only if $\lambda = 1$ is the only characteristic value of $G(z)$ on the unit circle.

Proof. Suppose $G(\lambda)v = 0$ with $v \ne 0$, $|\lambda| = 1$, $\operatorname{ord}\lambda = s$, $s > 1$. Then $C_iv = 0$ for $i \notin s\mathbb{Z}$. Hence $\sum_{j=0}^{\ell-1} C_{js}v = \sum_{i=0}^{m-1} C_iv = v$, and therefore $T_sv = 0$ and $\operatorname{rank} T_s < n$. Conversely, suppose $\operatorname{rank} T_s < n$ for some $s$. Let $T_sv = 0$, $v \ne 0$. Then $\sum C_i = I$ and $C_i \ge 0$ imply
$$G(z)v = \Big[Iz^m - \sum_{j=0}^{\ell-1} C_{js}z^{js}\Big]v,$$
and we conclude that $\{1\} \subsetneq E_s \subseteq \sigma(G)$. ⊓⊔
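Theorem 2 and Corollary 1 can be checked numerically through the block companion matrix $F$ of $G(z)$ used in Section 3, since $\det G(z) = \det(zI - F)$. The sketch below builds a hypothetical family $C_i$ (random positive definite blocks normalized so that $\sum_i C_i = I$); the construction and tolerances are illustrative assumptions, not part of the paper:

```python
# Sketch: characteristic values of G(z) = I z^m - sum_i C_i z^i via eig(F).
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 4

# Hermitian C_i >= 0 with C_0 > 0 and sum_i C_i = I:
# random PSD blocks P_i, normalized as C_i = S^{-1/2} P_i S^{-1/2}, S = sum P_i.
P = []
for i in range(m):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    P.append(M @ M.conj().T + (np.eye(n) if i == 0 else 0))
S = sum(P)
w, U = np.linalg.eigh(S)
S_inv_half = U @ np.diag(w**-0.5) @ U.conj().T
C = [S_inv_half @ Pi @ S_inv_half for Pi in P]
assert np.allclose(sum(C), np.eye(n))

# block companion matrix F associated with G(z)
F = np.zeros((n * m, n * m), dtype=complex)
F[:-n, n:] = np.eye(n * (m - 1))
for i in range(m):
    F[-n:, i * n:(i + 1) * n] = C[i]
lam = np.linalg.eigvals(F)

# Theorem 2(i): all characteristic values lie in the closed unit disc
assert np.all(np.abs(lam) <= 1 + 1e-8)

# Theorem 2(ii): since every C_i here is positive definite, k(v) = 1 for all v,
# so the unimodular characteristic values all equal 1 (and are m-th roots of unity)
uni = lam[np.abs(np.abs(lam) - 1) < 1e-8]
assert np.allclose(uni, 1.0)

# Corollary 1: T_2 = I - (C_0 + C_2) = C_1 + C_3 is positive definite, hence nonsingular
T2 = np.eye(n) - (C[0] + C[2])
assert np.linalg.matrix_rank(T2) == n
```

The nontrivial divisor of $m = 4$ is $s = 2$, and the nonsingularity of $T_2$ matches the fact that $\lambda = 1$ is the only unimodular characteristic value here.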

3 A difference equation

Theorem 2 deals with the location of characteristic values with respect to the unit circle. Therefore it can be applied to stability problems of systems of difference equations.

Theorem 3. Let $C_0, \dots, C_{m-1} \in \mathbb{C}^{n\times n}$ be hermitian and such that (2), i.e.
$$C_i \ge 0,\ i = 0,\dots,m-1, \qquad C_0 > 0, \qquad \sum_{i=0}^{m-1} C_i = I,$$
holds. Then all solutions $\big(x(t)\big)_{t\in\mathbb{N}_0}$ of the difference equation
$$x(t+m) = C_{m-1}x(t+m-1) + \cdots + C_1x(t+1) + C_0x(t), \tag{13a}$$
$$x(0) = x_0,\ \dots,\ x(m-1) = x_{m-1}, \tag{13b}$$
are bounded for $t \to \infty$. The sequence $\big(x(jm)\big)_{j\in\mathbb{N}_0}$ is convergent.

Proof. It is well known that the solutions of (13) are bounded if and only if all characteristic values of the associated polynomial matrix $G(z) = Iz^m - \sum C_iz^i$ are in the closed unit disc and those which lie on the unit circle have linear elementary divisors. To prove convergence of $x(jm)$ we consider the block companion matrix
$$F = \begin{pmatrix} 0 & I & 0 & \cdots & 0 \\ 0 & 0 & I & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & I \\ C_0 & C_1 & C_2 & \cdots & C_{m-1} \end{pmatrix}$$
associated with $G(z)$. Note that $\det G(z) = \det(zI - F)$. Moreover, $G(z)$ and $F$ have the same elementary divisors. Set $y(t) = \big(x^T(t), x^T(t+1), \dots, x^T(t+m-1)\big)^T$ and define $y_0$ conforming to (13b). Then (13) is equivalent to $y(t+1) = Fy(t)$, $y(0) = y_0$. The corresponding equation for $w(j) = x(jm)$ is $w(j+1) = F^mw(j)$. We know that $\sigma(G) \subseteq \overline{D}$ and that $\lambda \in \sigma(G) \cap \partial D$ implies $\lambda^m = 1$. Therefore $\sigma(F^m) \subseteq \{1\} \cup D$, and $F^m$ is similar to $\operatorname{diag}(I, \hat F)$ with $\sigma(\hat F) \subseteq D$. Hence $w(j)$ is convergent. ⊓⊔
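The stability statement can also be simulated directly. The following sketch uses the hypothetical scalar choice $C_i = c_iI$ with $c = (0.2, 0.3, 0.5)$, which satisfies (2), and checks boundedness of $x(t)$ and convergence of the subsequence $x(jm)$:

```python
# Sketch: simulate x(t+m) = sum_i C_i x(t+i) for C_i = c_i * I (hypothetical data).
import numpy as np

n, m = 2, 3
c = [0.2, 0.3, 0.5]                      # c_i >= 0, c_0 > 0, sum c_i = 1
C = [ci * np.eye(n) for ci in c]         # hermitian, PSD, C_0 > 0, sum = I

rng = np.random.default_rng(1)
x = [rng.standard_normal(n) for _ in range(m)]   # initial values x(0), ..., x(m-1)
for t in range(3000):
    x.append(sum(C[i] @ x[t + i] for i in range(m)))
xs = np.array(x)

# boundedness: with C_i = c_i I each new entry is a convex combination of the
# previous m entries, so no entry ever exceeds the initial maximum
assert np.max(np.abs(xs)) <= np.max(np.abs(xs[:m])) + 1e-12

# convergence of the subsequence x(jm)
sub = xs[::m]
assert np.linalg.norm(sub[-1] - sub[-2]) < 1e-10
```

For this choice the scalar polynomial factors as $z^3 - 0.5z^2 - 0.3z - 0.2 = (z-1)(z^2 + 0.5z + 0.2)$, so the only unimodular characteristic value is 1 and the whole sequence $x(t)$ converges.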

References

1. N. Anderson, E. B. Saff, and R. S. Varga, An extension of the Eneström–Kakeya theorem and its sharpness, SIAM J. Math. Anal. 12 (1981), pp. 10–22.
2. H. Baumgärtel, Analytic Perturbation Theory for Matrices and Operators, Operator Theory: Advances and Applications, Vol. 15, Birkhäuser, Basel, 1985.
3. A. Hurwitz, Über einen Satz des Herrn Kakeya, Tôhoku Math. J. 4 (1913), pp. 89–93; in: Mathematische Werke von A. Hurwitz, 2. Band, pp. 627–631, Birkhäuser, Basel, 1933.
4. A. M. Ostrowski, Solutions of Equations in Euclidean and Banach Spaces, 3rd ed., Academic Press, New York, 1973.
5. V. V. Prasolov, Polynomials, Algorithms and Computation in Mathematics, Vol. 11, Springer, New York, 2004.

MATRICES AND APPLICATIONS

Splitting algorithm for solving mixed variational inequalities with inversely strongly monotone operators⋆

Ildar Badriev and Oleg Zadvornov
Department of Computational Mathematics and Mathematical Cybernetics, Kazan State University, Kremlevskaya 18, 420008 Kazan, Russia
Ildar.Badriev@ksu.ru

Abstract. We consider a boundary value problem whose generalized statement is formulated as a mixed variational inequality in a Hilbert space. The operator of this variational inequality is a sum of several inversely strongly monotone operators (which are not necessarily potential operators). The functional occurring in this variational inequality is also a sum of several lower semicontinuous convex proper functionals. For solving the considered variational inequality a decomposition iterative method is offered. The suggested method does not require the inversion of the original operators. The convergence of this method is investigated.

Keywords: variational inequality, inversely strongly monotone operator, iterative method.

1 Statement of the problem

Let $\Omega \subset \mathbb{R}^n$, $n \ge 1$, be a bounded domain with a Lipschitz continuous boundary $\Gamma$. We consider the following boundary value problem with respect to the function $u = (u_1, u_2, \dots, u_n)$:
$$\sum_{j=1}^{n}\Big[\frac{\partial}{\partial x_j}\,v_j^{(i)}(x) + d_{ij}(x)\,u_j(x)\Big] = f_i(x), \quad x \in \Omega,\ i = 1,\dots,n, \tag{1}$$
$$u(x) = 0, \quad x \in \Gamma, \tag{2}$$
$$-v_j^{(i)}(x) \in g_j\big(|\partial u(x)/\partial x_j|\big)\,\frac{1}{|\partial u(x)/\partial x_j|}\,\frac{\partial u_i(x)}{\partial x_j}, \quad x \in \Omega,\ i, j = 1,\dots,n, \tag{3}$$
where $f = (f_1, f_2, \dots, f_n)$ is a given function and $D = \{d_{ij}\}$ is an unsymmetric matrix such that
$$(D\xi, \xi) \ge \alpha_0\,(D\xi, D\xi) \quad \forall\,\xi \in \mathbb{R}^n, \quad \alpha_0 > 0. \tag{4}$$

⋆ This work was supported by the Russian Foundation for Basic Research, projects 06-01-00633 and 07-01-00674.

We assume that the multi-valued functions $g_j$ can be represented in the form
$$g_j(\xi) = g_{0j}(\xi) + \vartheta_j\,h(\xi - \beta_j),$$
where $\vartheta_j, \beta_j$ are given nonnegative constants, $h$ is the multi-valued and $g_{0j}$ are the single-valued functions given by the formulas
$$h(\xi) = \begin{cases}0, & \xi < 0,\\ [0,1], & \xi = 0,\\ 1, & \xi > 0,\end{cases} \qquad g_{0j}(\xi) = \begin{cases}0, & \xi \le \beta_j,\\ g_{*j}(\xi - \beta_j), & \xi > \beta_j,\end{cases}$$
and $g_{*j} : [0, +\infty) \to [0, +\infty)$ are continuous functions which satisfy the following conditions:
$$g_{*j}(0) = 0, \qquad g_{*j}(\xi) \ge g_{*j}(\zeta) \quad \forall\,\xi > \zeta \ge 0, \tag{5}$$
$$\exists\,\sigma_j > 0:\ |g_{*j}(\xi) - g_{*j}(\zeta)| \le \frac{1}{\sigma_j}\,|\xi - \zeta| \quad \forall\,\xi, \zeta \ge 0, \tag{6}$$
$$\exists\,k_j > 0,\ \xi_{*j} \ge 0:\ g_{*j}(\xi_{*j}) \ge k_j\,\xi_{*j}, \qquad g_{*j}(\xi) - g_{*j}(\zeta) \ge k_j(\xi - \zeta) \quad \forall\,\xi > \zeta \ge \xi_{*j}. \tag{7}$$
Let us introduce the notations: $V = \mathring{W}{}_2^{(1)}(\Omega)$, $H = [L_2(\Omega)]^n$, $B_j = \partial/\partial x_j : V \to H$, $j = 1, 2, \dots, n$.

A generalized solution of the problem (1)–(3) is defined as a function $u \in V$ satisfying for all $\eta \in V$ the variational inequality
$$(A_0u,\ \eta - u)_V + \sum_{j=1}^{n}\big(A_j \circ B_j(u),\ B_j(\eta - u)\big)_H + F_0(\eta) - F_0(u) + \sum_{j=1}^{n}\big[F_j(B_j\eta) - F_j(B_ju)\big] \ge 0. \tag{8}$$
Here $B_j^* : H \to V$, $j = 1, 2, \dots, n$, are the operators conjugate to $B_j$. The operators $A_0 : V \to V$ and $A_j : H \to H$, $j = 1, 2, \dots, n$, are generated by the forms
$$(A_0u, \eta)_V = \int_\Omega (Du, \eta)\,dx, \quad u, \eta \in V, \qquad (A_jy, z)_H = \int_\Omega (G_j(y), z)\,dx, \quad y, z \in H.$$
The operators $G_j : \mathbb{R}^n \to \mathbb{R}^n$ and the functionals $F_0 : V \to \mathbb{R}^1$, $F_j : H \to \mathbb{R}^1$, $j = 1, 2, \dots, n$, are defined by the formulas
$$G_j(y) = g_{0j}(|y|)\,|y|^{-1}y,\ y \ne 0, \quad G_j(0) = 0, \qquad F_0(\eta) = -\int_\Omega (f, \eta)\,dx,\ \eta \in V,$$
$$F_j(z) = \vartheta_j\int_\Omega \mu(|z| - \beta_j)\,dx,\ z \in H, \qquad \mu(\zeta) = \begin{cases}0, & \zeta \le 0,\\ \zeta, & \zeta > 0.\end{cases}$$
The following result is valid.

Lemma 1. Let the condition (4) be satisfied. Then $A_0$ is an inversely strongly monotone operator, i.e.,
$$(A_0\eta - A_0u,\ \eta - u)_V \ge \sigma_0\,\|A_0\eta - A_0u\|_V^2, \quad \sigma_0 > 0, \quad \forall u, \eta \in V. \tag{9}$$

Proof. It follows from (4) that
$$|D\xi| \le \alpha_0^{-1/2}(D\xi, \xi)^{1/2} \quad \forall\,\xi \in \mathbb{R}^n,$$
and hence
$$|(D\xi, \zeta)| \le \alpha_0^{-1/2}(D\xi, \xi)^{1/2}\,|\zeta| \quad \forall\,\xi, \zeta \in \mathbb{R}^n.$$
Because of this,
$$|(A_0u, \eta)_V| \le \int_\Omega |(Du, \eta)|\,dx \le \alpha_0^{-1/2}\int_\Omega (Du, u)^{1/2}|\eta|\,dx \le \alpha_0^{-1/2}\Big(\int_\Omega (Du, u)\,dx\Big)^{1/2}\Big(\int_\Omega |\eta|^2\,dx\Big)^{1/2}$$
$$= \alpha_0^{-1/2}(A_0u, u)_V^{1/2}\,\|\eta\|_H \le \alpha_0^{-1/2}c_H\,(A_0u, u)_V^{1/2}\,\|\eta\|_V = \sigma_0^{-1/2}(A_0u, u)_V^{1/2}\,\|\eta\|_V,$$
where $c_H$ is the Friedrichs constant (the constant of the embedding of $V$ into $H$) and $\sigma_0 = \alpha_0/c_H^2$. Therefore,
$$\|A_0u\|_V = \sup_{\eta \ne 0}\frac{|(A_0u, \eta)_V|}{\|\eta\|_V} \le \sigma_0^{-1/2}(A_0u, u)_V^{1/2},$$
whence, by virtue of the linearity of $A_0$, the required inequality follows. ⊓⊔

By analogy with [4] we obtain that the following results are valid.

Lemma 2. Let the conditions (5)–(7) be satisfied. Then $A_j$ are coercive and inversely strongly monotone operators, i.e.,
$$(A_jy - A_jz,\ y - z)_H \ge \sigma_j\,\|A_jy - A_jz\|_H^2, \quad \sigma_j > 0, \quad \forall y, z \in H. \tag{10}$$

Lemma 3. The functionals $F_0 : V \to \mathbb{R}^1$, $F_j : H \to \mathbb{R}^1$, $j = 1, 2, \dots, n$, are convex and Lipschitz continuous.

It follows from these results that the variational inequality (8) has at least one solution (see e.g. [5]).

2 The iterative process

In the following we will consider the abstract variational inequality (8), postulating the properties (9), (10) and assuming that $B_j : V \to H$, $j = 1, 2, \dots, n$, are linear continuous operators and $F_j$, $j = 0, 1, \dots, n$, are proper convex and Lipschitz continuous functionals. In addition, we assume that the operator $\sum_{j=1}^{n}B_j^*B_j : V \to V$ is a canonical isomorphism, i.e.,
$$\Big(\sum_{j=1}^{n}B_j^*B_ju,\ \eta\Big)_V = (u, \eta)_V \quad \forall u, \eta \in V. \tag{11}$$

To solve the variational inequality (8) we consider the following splitting algorithm. Let $u^{(0)} \in V$, $y_j^{(0)} \in H$, $\lambda_j^{(0)} \in H$, $j = 1, 2, \dots, n$, be arbitrary elements. For $k = 0, 1, 2, \dots$ and known $y_j^{(k)}$, $\lambda_j^{(k)}$, $j = 1, \dots, n$, we define $u^{(k+1)}$ as a solution of the variational inequality
$$\frac{1}{\tau_0}\big(u^{(k+1)} - u^{(k)},\ \eta - u^{(k+1)}\big)_V + F_0(\eta) - F_0\big(u^{(k+1)}\big) + \big(A_0u^{(k)},\ \eta - u^{(k+1)}\big)_V$$
$$+\ \sum_{j=1}^{n}\big(B_j^*\lambda_j^{(k)},\ \eta - u^{(k+1)}\big)_V + r\sum_{j=1}^{n}\big(B_j^*\big(B_ju^{(k)} - y_j^{(k)}\big),\ \eta - u^{(k+1)}\big)_V \ge 0 \quad \forall \eta \in V. \tag{12}$$
Then we find $y_j^{(k+1)}$, $j = 1, 2, \dots, n$, by solving the variational inequalities
$$\frac{1}{\tau_j}\big(y_j^{(k+1)} - y_j^{(k)},\ z - y_j^{(k+1)}\big)_H + F_j(z) - F_j\big(y_j^{(k+1)}\big) + \big(A_jy_j^{(k)} - \lambda_j^{(k)},\ z - y_j^{(k+1)}\big)_H$$
$$+\ r\big(y_j^{(k)} - B_ju^{(k+1)},\ z - y_j^{(k+1)}\big)_H \ge 0 \quad \forall z \in H,\ j = 1, 2, \dots, n. \tag{13}$$
Finally, we set
$$\lambda_j^{(k+1)} = \lambda_j^{(k)} + r\big(B_ju^{(k+1)} - y_j^{(k+1)}\big), \quad j = 1, 2, \dots, n. \tag{14}$$

Here $\tau_j > 0$, $j = 0, 1, \dots, n$, and $r > 0$ are the iterative parameters.

To analyze the convergence of the method (12)–(14) we formulate it via the transition operator $T : V \times H^n \times H^n \to V \times H^n \times H^n$ that takes each vector $q = (q_0, q_1, \dots, q_{2n}) = (u, Y, \Lambda)$, $Y \in H^n$, $\Lambda \in H^n$, to the element $Tq = (T_0q, T_1q, \dots, T_{2n}q)$ as follows:
$$T_0q = \operatorname{Prox}_{\tau_0F_0}\Big[q_0 - \tau_0\Big(A_0q_0 + \sum_{j=1}^{n}B_j^*q_{n+j} + r\sum_{j=1}^{n}B_j^*(B_jq_0 - q_j)\Big)\Big], \tag{15}$$
$$T_jq = \operatorname{Prox}_{\tau_jF_j}\big[q_j - \tau_j\big(A_jq_j - q_{n+j} + r(q_j - B_jT_0q)\big)\big], \quad j = 1,\dots,n, \tag{16}$$
$$T_{n+j}q = q_{n+j} + r\big(B_jT_0q - T_jq\big), \quad j = 1,\dots,n. \tag{17}$$
Here $\operatorname{Prox}_G$ is a proximal mapping (see e.g. [5]). Recall that a mapping $\operatorname{Prox}_G : Z \to Z$ is said to be proximal if it takes each element $p$ of a Hilbert space $Z$ to the element $v = \operatorname{Prox}_G(p)$ that is the solution of the minimization problem
$$\frac12\|v - p\|_Z^2 + G(v) = \min_{z \in Z}\Big\{\frac12\|z - p\|_Z^2 + G(z)\Big\}.$$
This problem is equivalent (if $G$ is a convex proper lower semicontinuous functional) to the variational inequality
$$(v - p,\ z - v)_Z + G(z) - G(v) \ge 0 \quad \forall z \in Z. \tag{18}$$
It is easy to show that a proximal mapping is firmly nonexpansive, i.e.,
$$\|\operatorname{Prox}_G(p) - \operatorname{Prox}_G(z)\|_Z^2 \le \big(\operatorname{Prox}_G(p) - \operatorname{Prox}_G(z),\ p - z\big)_Z \quad \forall p, z \in Z.$$
We introduce the notations
$$Y^{(k)} = \big(y_1^{(k)}, y_2^{(k)}, \dots, y_n^{(k)}\big), \qquad \Lambda^{(k)} = \big(\lambda_1^{(k)}, \lambda_2^{(k)}, \dots, \lambda_n^{(k)}\big).$$
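In finite dimensions the proximal mapping can be made concrete. For $Z = \mathbb{R}^d$ and $G = \tau\|\cdot\|_1$ the minimizer of $\frac12\|z-p\|_Z^2 + G(z)$ is the componentwise soft-thresholding operator (a standard fact, used here only as a hypothetical illustration of the definition and of firm nonexpansiveness):

```python
# Sketch: prox of G = tau*||.||_1 on R^d and a firm-nonexpansiveness check.
import numpy as np

def prox_l1(p, tau):
    """Minimizer of (1/2)||z - p||^2 + tau*||z||_1: soft-thresholding."""
    return np.sign(p) * np.maximum(np.abs(p) - tau, 0.0)

rng = np.random.default_rng(2)
tau = 0.5
p, z = rng.standard_normal(6), rng.standard_normal(6)
u, v = prox_l1(p, tau), prox_l1(z, tau)

# firm nonexpansiveness: ||Prox(p) - Prox(z)||^2 <= (Prox(p) - Prox(z), p - z)
lhs = np.dot(u - v, u - v)
rhs = np.dot(u - v, p - z)
assert lhs <= rhs + 1e-12

# sanity check against the defining minimization problem on a 1-d grid
pp = 0.9
grid = np.linspace(-3, 3, 60001)
obj = 0.5 * (grid - pp)**2 + tau * np.abs(grid)
assert abs(grid[np.argmin(obj)] - prox_l1(np.array([pp]), tau)[0]) < 1e-3
```

The grid check recovers $\operatorname{Prox}_{\tau\|\cdot\|_1}(0.9) = 0.4$ for $\tau = 0.5$, matching the closed form.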

Then, using the definition of a proximal mapping via the variational inequality (18), it is easy to verify that the iterative process (12)–(14) can be represented in the form
$$q^{(0)}\ \text{is an arbitrary element}, \qquad q^{(k+1)} = Tq^{(k)}, \quad q^{(k)} = \big(u^{(k)}, Y^{(k)}, \Lambda^{(k)}\big), \quad k = 0, 1, 2, \dots, \tag{19}$$
i.e., $T$ is the transition operator of this iterative process.

Let us now obtain a relationship between the solution of the original variational inequality (8) and the components of a fixed point of the transition operator $T$. The following result is true.

Theorem 1. Let the operator $T : V \times H^n \times H^n \to V \times H^n \times H^n$ be defined by the relationships (15)–(17). Then the point $q = (u, Y, \Lambda)$, where $u \in V$, $Y = (y_1, y_2, \dots, y_n) \in H^n$, $\Lambda = (\lambda_1, \lambda_2, \dots, \lambda_n) \in H^n$, is a fixed point of the operator $T$ if and only if
$$y_j = B_ju, \quad j = 1, 2, \dots, n, \tag{20}$$
$$\lambda_j \in \partial F_j(y_j) + A_jy_j, \quad j = 1, 2, \dots, n, \tag{21}$$
$$-\sum_{j=1}^{n}B_j^*\lambda_j \in \partial F_0(u) + A_0u. \tag{22}$$

Moreover, the first component $u$ of each fixed point $q$ of the operator $T$ is a solution of the problem (8).

Proof. Let $q = (u, Y, \Lambda)$ be a fixed point of the operator $T$, i.e., according to (15)–(17),
$$u = \operatorname{Prox}_{\tau_0F_0}\Big[u - \tau_0\Big(A_0u + \sum_{j=1}^{n}B_j^*\lambda_j + r\sum_{j=1}^{n}B_j^*(B_ju - y_j)\Big)\Big], \tag{23}$$
$$y_j = \operatorname{Prox}_{\tau_jF_j}\big[y_j - \tau_j\big(A_jy_j - \lambda_j + r(y_j - B_ju)\big)\big], \quad j = 1,\dots,n, \tag{24}$$
$$\lambda_j = \lambda_j + r(B_ju - y_j), \quad j = 1,\dots,n. \tag{25}$$
Obviously, the relations (25) are equivalent to (20). By (20) and the definition (18) of a proximal mapping, the relations (24) are equivalent to the variational inequalities
$$\tau_j(A_jy_j - \lambda_j,\ z - y_j)_H + \tau_jF_j(z) - \tau_jF_j(y_j) \ge 0 \quad \forall z \in H, \quad j = 1,\dots,n,$$
or
$$(A_jy_j - \lambda_j,\ z - y_j)_H + F_j(z) - F_j(y_j) \ge 0 \quad \forall z \in H, \quad j = 1,\dots,n, \tag{26}$$
each of which is equivalent to the inclusion $\lambda_j - A_jy_j \in \partial F_j(y_j)$, $j = 1,\dots,n$, i.e., the inclusions (21) hold. In an analogous way we obtain that the relation (23) is equivalent to the variational inequality
$$\Big(A_0u + \sum_{j=1}^{n}B_j^*\lambda_j,\ \eta - u\Big)_V + F_0(\eta) - F_0(u) \ge 0 \quad \forall \eta \in V, \tag{27}$$
i.e., to the inclusion (22). We have thereby shown that the equality $Tq = q$ is equivalent to the relations (20)–(22).

Let us now verify that the first component $u$ of each fixed point $q$ of the operator $T$ is a solution of the problem (8). To this end, in the inequalities (26) we use the relations (20) to replace $y_j$ by $B_ju$, $j = 1,\dots,n$, and set $z = B_j\eta$, where $\eta$ is an arbitrary element of $V$. By adding the resulting inequalities and using the definition of the conjugate operator we have
$$\sum_{j=1}^{n}\big(B_j^* \circ A_j \circ B_j(u) - B_j^*\lambda_j,\ \eta - u\big)_V + \sum_{j=1}^{n}\big[F_j(B_j\eta) - F_j(B_ju)\big] \ge 0 \quad \forall \eta \in V. \tag{28}$$
By adding the inequalities (27) and (28) we obtain that $u$ is a solution of the problem (8). The proof of the theorem is complete. ⊓⊔

Theorem 2. Suppose that there exists a solution of problem (8) and
$$\exists\,u^* \in \operatorname{dom}F_0:\ B_ju^* \in \operatorname{dom}F_j;\quad F_j\ \text{is continuous at the point}\ B_ju^*,\quad j = 1, 2, \dots, n. \tag{29}$$

Then the set of fixed points of the operator $T$ is nonempty.

Proof. Let $u$ be a solution of the problem (8) and $y_j = B_ju$, $j = 1,\dots,n$. The variational inequality (8) is equivalent to the inclusion
$$-A_0u - \sum_{j=1}^{n}B_j^*A_jy_j \in \partial\Big(F_0 + \sum_{j=1}^{n}F_j \circ B_j\Big)(u). \tag{30}$$
If the conditions (29) are satisfied, then it follows from Propositions 5.6 and 5.7 of [5] that
$$\partial\Big(F_0 + \sum_{j=1}^{n}F_j \circ B_j\Big)(u) = \partial F_0(u) + \sum_{j=1}^{n}\partial(F_j \circ B_j)(u) = \partial F_0(u) + \sum_{j=1}^{n}B_j^*\,\partial F_j(y_j). \tag{31}$$
Relations (31) and (30) imply that there exist elements $v \in \partial F_0(u)$ and $z_j \in \partial F_j(y_j)$, $j = 1,\dots,n$, such that
$$-A_0u - \sum_{j=1}^{n}B_j^*A_jy_j = v + \sum_{j=1}^{n}B_j^*z_j, \qquad\text{or}\qquad -A_0u - \sum_{j=1}^{n}B_j^*(A_jy_j + z_j) = v.$$
Let $\lambda_j = A_jy_j + z_j$, $j = 1,\dots,n$; then we have the inclusions
$$-A_0u - \sum_{j=1}^{n}B_j^*\lambda_j = v \in \partial F_0(u); \qquad \lambda_j - A_jy_j = z_j \in \partial F_j(y_j), \quad j = 1,\dots,n,$$
i.e., the relations (21), (22) hold. Next, the relations (20) are valid by virtue of the definition of $y_j$. Therefore, by Theorem 1, the operator $T$ has a fixed point, namely the point $q = (u, Y, \Lambda)$, where $Y = (y_1, y_2, \dots, y_n) \in H^n$, $\Lambda = (\lambda_1, \lambda_2, \dots, \lambda_n) \in H^n$. The proof of the theorem is complete. ⊓⊔

Thus the convergence analysis of the iterative process (12)–(14) can be reduced to that of the successive approximation method for finding a fixed point of $T$.

3 The investigation of the convergence of the iterative process

Let us introduce the Hilbert space $Q = V \times H^n \times H^n$ with the inner product
$$(\cdot,\cdot)_Q = \frac{1-\tau_0r}{\tau_0}(\cdot,\cdot)_V + \sum_{j=1}^{n}\frac{1}{\tau_j}(\cdot,\cdot)_H + \frac{1}{r}\sum_{j=1}^{n}(\cdot,\cdot)_H,$$
where $r, \tau_j$, $j = 0, 1, \dots, n$, are positive constants; moreover, $\tau_jr < 1$, $j = 0, 1, \dots, n$. The investigation of the convergence of the iterative process (19) is based on the following

Theorem 3. Let the conditions (9)–(11) be satisfied, and let
$$\tau_j < \frac{2\sigma_j}{2\sigma_jr + 1}, \quad j = 0, 1, \dots, n. \tag{32}$$
Then for all $q, p \in Q$ the inequality
$$\|Tq - Tp\|_Q^2 + \delta_0(A_0q_0 - A_0p_0,\ q_0 - p_0)_V + \sum_{j=1}^{n}\delta_j(A_jq_j - A_jp_j,\ q_j - p_j)_H$$
$$+\ \frac{1}{(1-\tau_0r)\tau_0}\big\|(1-\tau_0r)[(q_0 - T_0q) - (p_0 - T_0p)] - \tau_0(A_0q_0 - A_0p_0)\big\|_V^2$$
$$+\ \sum_{j=1}^{n}\frac{1}{(1-\tau_jr)\tau_j}\big\|(1-\tau_jr)[(q_j - T_jq) - (p_j - T_jp)] - \tau_j(A_jq_j - A_jp_j)\big\|_H^2$$
$$+\ r\sum_{j=1}^{n}\big\|(q_j - B_jT_0q) - (p_j - B_jT_0p)\big\|_H^2 \le \|q - p\|_Q^2 \tag{33}$$
holds, where $\delta_j = \dfrac{2}{1-\tau_jr}\Big(1 - \tau_j\,\dfrac{2\sigma_jr + 1}{2\sigma_j}\Big)$, $j = 0, 1, \dots, n$.

Proof. By virtue of (32) we have $\delta_j > 0$, $j = 0, 1, \dots, n$; therefore it follows from (9), (10) and (33) that $T$ is a nonexpansive operator. We rewrite relation (15) in view of (11) in the form
$$T_0q = \operatorname{Prox}_{\tau_0F_0}\Big[(1-\tau_0r)q_0 - \tau_0A_0q_0 - \tau_0\sum_{j=1}^{n}B_j^*(q_{n+j} - rq_j)\Big] = \operatorname{Prox}_{\tau_0F_0}\Big[S_0q_0 - \tau_0\sum_{j=1}^{n}B_j^*(q_{n+j} - rq_j)\Big],$$
where $S_0 : V \to V$ is the operator given by the formula $S_0 = (1-\tau_0r)I - \tau_0A_0$.

According to [3], by using (9) we obtain
$$\|S_0q_0 - S_0p_0\|_V^2 = (1-\tau_0r)^2\|q_0 - p_0\|_V^2 - 2\tau_0(1-\tau_0r)(A_0q_0 - A_0p_0,\ q_0 - p_0)_V + \tau_0^2\|A_0q_0 - A_0p_0\|_V^2$$
$$\le (1-\tau_0r)^2\|q_0 - p_0\|_V^2 - 2\tau_0\Big(1 - \tau_0\,\frac{2\sigma_0r + 1}{2\sigma_0}\Big)(A_0q_0 - A_0p_0,\ q_0 - p_0)_V,$$
i.e.,
$$\|S_0p_0 - S_0q_0\|_V^2 \le (1-\tau_0r)^2\|q_0 - p_0\|_V^2 - \tau_0(1-\tau_0r)\,\delta_0\,(A_0q_0 - A_0p_0,\ q_0 - p_0)_V \tag{34}$$
for any $q_0, p_0 \in V$. Further, by using the firm nonexpansiveness of the proximal mapping $\operatorname{Prox}_{\tau_0F_0}$ we obtain
$$\|T_0q - T_0p\|_V^2 \le (T_0q - T_0p,\ S_0q_0 - S_0p_0)_V - \tau_0\sum_{j=1}^{n}\big(B_j^*(q_{n+j} - p_{n+j}) - rB_j^*(q_j - p_j),\ T_0q - T_0p\big)_V.$$
Let us transform the first term on the right-hand side by the relation
$$(v, w)_Z = \frac{1}{2\varepsilon}\|v\|_Z^2 - \frac{1}{2\varepsilon}\|v - \varepsilon w\|_Z^2 + \frac{\varepsilon}{2}\|w\|_Z^2 \quad \forall v, w \in Z,\ \forall \varepsilon > 0, \tag{35}$$
with $Z = V$, $v = S_0q_0 - S_0p_0$, $w = T_0q - T_0p$. We have
$$\|T_0q - T_0p\|_V^2 \le \frac{1}{2\varepsilon}\|S_0q_0 - S_0p_0\|_V^2 + \frac{\varepsilon}{2}\|T_0q - T_0p\|_V^2 - \frac{1}{2\varepsilon}\|(S_0q_0 - S_0p_0) - \varepsilon(T_0q - T_0p)\|_V^2$$
$$-\ \tau_0\sum_{j=1}^{n}\big(B_j^*(q_{n+j} - p_{n+j}) - rB_j^*(q_j - p_j),\ T_0q - T_0p\big)_V.$$
Therefore, by virtue of (34) we obtain
$$\frac{2-\varepsilon}{2}\|T_0q - T_0p\|_V^2 \le \frac{(1-\tau_0r)^2}{2\varepsilon}\|q_0 - p_0\|_V^2 - \frac{\tau_0(1-\tau_0r)\delta_0}{2\varepsilon}(A_0q_0 - A_0p_0,\ q_0 - p_0)_V$$
$$-\ \frac{1}{2\varepsilon}\|(1-\tau_0r)(q_0 - p_0) - \tau_0(A_0q_0 - A_0p_0) - \varepsilon(T_0q - T_0p)\|_V^2 - \tau_0\sum_{j=1}^{n}\big(B_j^*(q_{n+j} - p_{n+j}) - rB_j^*(q_j - p_j),\ T_0q - T_0p\big)_V.$$

After division by $\tau_0$ and choosing $\varepsilon = 1 - \tau_0r$ we have
$$\frac{1+\tau_0r}{2\tau_0}\|T_0q - T_0p\|_V^2 + \frac{\delta_0}{2}(A_0q_0 - A_0p_0,\ q_0 - p_0)_V + \frac{1}{2(1-\tau_0r)\tau_0}\big\|(1-\tau_0r)[(q_0 - T_0q) - (p_0 - T_0p)] - \tau_0(A_0q_0 - A_0p_0)\big\|_V^2$$
$$\le \frac{1-\tau_0r}{2\tau_0}\|q_0 - p_0\|_V^2 - \sum_{j=1}^{n}\big(q_{n+j} - p_{n+j},\ B_j(T_0q - T_0p)\big)_H + r\sum_{j=1}^{n}\big(q_j - p_j,\ B_j(T_0q - T_0p)\big)_H.$$
From this inequality, after the transformation of the terms $\big(q_j - p_j,\ B_j(T_0q - T_0p)\big)_H$ by (35) with $Z = H$, $\varepsilon = 1$, $v = q_j - p_j$, $w = B_j(T_0q - T_0p)$, it follows that
$$\frac{1+\tau_0r}{2\tau_0}\|T_0q - T_0p\|_V^2 + \frac{\delta_0}{2}(A_0q_0 - A_0p_0,\ q_0 - p_0)_V + \frac{1}{2(1-\tau_0r)\tau_0}\big\|(1-\tau_0r)[(q_0 - T_0q) - (p_0 - T_0p)] - \tau_0(A_0q_0 - A_0p_0)\big\|_V^2$$
$$+\ \frac{r}{2}\sum_{j=1}^{n}\big\|(q_j - B_jT_0q) - (p_j - B_jT_0p)\big\|_H^2 \le \frac{1-\tau_0r}{2\tau_0}\|q_0 - p_0\|_V^2 + \frac{r}{2}\sum_{j=1}^{n}\|q_j - p_j\|_H^2$$
$$+\ \frac{r}{2}\sum_{j=1}^{n}\|B_j(T_0q - T_0p)\|_H^2 - \sum_{j=1}^{n}\big(q_{n+j} - p_{n+j},\ B_j(T_0q - T_0p)\big)_H. \tag{36}$$

For each $j = 1, \dots, n$ we rewrite (16) in the form
$$T_jq = \operatorname{Prox}_{\tau_jF_j}\big(q_j - \tau_jrq_j - \tau_jA_jq_j + \tau_jq_{n+j} + \tau_jrB_jT_0q\big) = \operatorname{Prox}_{\tau_jF_j}\big(S_jq_j + \tau_jq_{n+j} + \tau_jrB_jT_0q\big),$$
where the operators $S_j : H \to H$ are defined by the relationships $S_j = (1-\tau_jr)I - \tau_jA_j$. By virtue of (10), analogously to (34), we obtain the estimates
$$\|S_jp_j - S_jq_j\|_H^2 \le (1-\tau_jr)^2\|q_j - p_j\|_H^2 - \tau_j(1-\tau_jr)\,\delta_j\,(A_jq_j - A_jp_j,\ q_j - p_j)_H, \tag{37}$$
and, taking into account the firm nonexpansiveness of the proximal mapping $\operatorname{Prox}_{\tau_jF_j}$ and the equality (35) with an arbitrary $\varepsilon > 0$, $Z = H$, $v = S_jq_j - S_jp_j$, $w = T_jq - T_jp$, we have
$$\|T_jq - T_jp\|_H^2 \le (T_jq - T_jp,\ S_jq_j - S_jp_j)_H + \tau_jr(T_jq - T_jp,\ B_j(T_0q - T_0p))_H + \tau_j(T_jq - T_jp,\ q_{n+j} - p_{n+j})_H$$
$$= \frac{1}{2\varepsilon}\|S_jq_j - S_jp_j\|_H^2 + \frac{\varepsilon}{2}\|T_jq - T_jp\|_H^2 - \frac{1}{2\varepsilon}\|(S_jq_j - S_jp_j) - \varepsilon(T_jq - T_jp)\|_H^2$$
$$+\ \tau_jr(T_jq - T_jp,\ B_j(T_0q - T_0p))_H + \tau_j(T_jq - T_jp,\ q_{n+j} - p_{n+j})_H.$$
By setting $\varepsilon = 1 - \tau_jr$ in the last inequality and using the estimate (37) for $\|S_jq_j - S_jp_j\|_H^2$ we obtain the inequality
$$\frac{1+\tau_jr}{2\tau_j}\|T_jq - T_jp\|_H^2 + \frac{\delta_j}{2}(A_jq_j - A_jp_j,\ q_j - p_j)_H + \frac{1}{2(1-\tau_jr)\tau_j}\big\|(1-\tau_jr)[(q_j - T_jq) - (p_j - T_jp)] - \tau_j(A_jq_j - A_jp_j)\big\|_H^2$$
$$\le \frac{1-\tau_jr}{2\tau_j}\|q_j - p_j\|_H^2 + (q_{n+j} - p_{n+j},\ T_jq - T_jp)_H + r(T_jq - T_jp,\ B_j(T_0q - T_0p))_H,$$
which, after using the relation (35) with $\varepsilon = 1$, $Z = H$, $v = T_jq - T_jp$, $w = B_j(T_0q - T_0p)$ for the transformation of the last term, implies
$$\frac{1+\tau_jr}{2\tau_j}\|T_jq - T_jp\|_H^2 + \frac{\delta_j}{2}(A_jq_j - A_jp_j,\ q_j - p_j)_H + \frac{r}{2}\big\|(T_jq - B_jT_0q) - (T_jp - B_jT_0p)\big\|_H^2$$
$$+\ \frac{1}{2(1-\tau_jr)\tau_j}\big\|(1-\tau_jr)[(q_j - T_jq) - (p_j - T_jp)] - \tau_j(A_jq_j - A_jp_j)\big\|_H^2$$
$$\le \frac{1-\tau_jr}{2\tau_j}\|q_j - p_j\|_H^2 + (q_{n+j} - p_{n+j},\ T_jq - T_jp)_H + \frac{r}{2}\|T_jq - T_jp\|_H^2 + \frac{r}{2}\|B_j(T_0q - T_0p)\|_H^2. \tag{38}$$
Further, for each $j = 1, \dots, n$, by virtue of (17) we have
$$\frac{1}{2r}\|T_{n+j}q - T_{n+j}p\|_H^2 = \frac{1}{2r}\|q_{n+j} - p_{n+j}\|_H^2 + (q_{n+j} - p_{n+j},\ B_j(T_0q - T_0p))_H$$
$$-\ (q_{n+j} - p_{n+j},\ T_jq - T_jp)_H + \frac{r}{2}\big\|B_j(T_0q - T_0p) - (T_jq - T_jp)\big\|_H^2. \tag{39}$$

By adding the relations (38) and (39) with $j = 1, \dots, n$ to the relation (36), after multiplying by 2, the inner-product terms $(q_{n+j} - p_{n+j},\ B_j(T_0q - T_0p))_H$ and $(q_{n+j} - p_{n+j},\ T_jq - T_jp)_H$ cancel and we have
$$\frac{1+\tau_0r}{\tau_0}\|T_0q - T_0p\|_V^2 + \sum_{j=1}^{n}\frac{1+\tau_jr}{\tau_j}\|T_jq - T_jp\|_H^2 + \frac{1}{r}\sum_{j=1}^{n}\|T_{n+j}q - T_{n+j}p\|_H^2$$
$$+\ \delta_0(A_0q_0 - A_0p_0,\ q_0 - p_0)_V + \sum_{j=1}^{n}\delta_j(A_jq_j - A_jp_j,\ q_j - p_j)_H$$
$$+\ \frac{1}{(1-\tau_0r)\tau_0}\big\|(1-\tau_0r)[(q_0 - T_0q) - (p_0 - T_0p)] - \tau_0(A_0q_0 - A_0p_0)\big\|_V^2$$
$$+\ \sum_{j=1}^{n}\frac{1}{(1-\tau_jr)\tau_j}\big\|(1-\tau_jr)[(q_j - T_jq) - (p_j - T_jp)] - \tau_j(A_jq_j - A_jp_j)\big\|_H^2$$
$$+\ r\sum_{j=1}^{n}\big\|(q_j - B_jT_0q) - (p_j - B_jT_0p)\big\|_H^2 + r\sum_{j=1}^{n}\big\|(T_jq - B_jT_0q) - (T_jp - B_jT_0p)\big\|_H^2$$
$$\le \frac{1-\tau_0r}{\tau_0}\|q_0 - p_0\|_V^2 + \sum_{j=1}^{n}\frac{1}{\tau_j}\|q_j - p_j\|_H^2 + \frac{1}{r}\sum_{j=1}^{n}\|q_{n+j} - p_{n+j}\|_H^2$$
$$+\ 2r\sum_{j=1}^{n}\|B_j(T_0q - T_0p)\|_H^2 + r\sum_{j=1}^{n}\|T_jq - T_jp\|_H^2 + r\sum_{j=1}^{n}\big\|B_j(T_0q - T_0p) - (T_jq - T_jp)\big\|_H^2.$$
Then, by virtue of (11),
$$\sum_{j=1}^{n}\|B_j\eta\|_H^2 = \|\eta\|_V^2 \quad \forall \eta \in V. \tag{40}$$
Taking into account this equation (so that $2r\sum_j\|B_j(T_0q - T_0p)\|_H^2 = 2r\|T_0q - T_0p\|_V^2$), cancelling the coinciding terms $r\sum_j\|(T_jq - B_jT_0q) - (T_jp - B_jT_0p)\|_H^2$ on both sides and moving the remaining right-hand terms with $T_0q - T_0p$ and $T_jq - T_jp$ to the left (note that $\frac{1+\tau_0r}{\tau_0} - 2r = \frac{1-\tau_0r}{\tau_0}$ and $\frac{1+\tau_jr}{\tau_j} - r = \frac{1}{\tau_j}$), we have
$$\frac{1-\tau_0r}{\tau_0}\|T_0q - T_0p\|_V^2 + \sum_{j=1}^{n}\frac{1}{\tau_j}\|T_jq - T_jp\|_H^2 + \frac{1}{r}\sum_{j=1}^{n}\|T_{n+j}q - T_{n+j}p\|_H^2$$
$$+\ \delta_0(A_0q_0 - A_0p_0,\ q_0 - p_0)_V + \sum_{j=1}^{n}\delta_j(A_jq_j - A_jp_j,\ q_j - p_j)_H$$
$$+\ \frac{1}{(1-\tau_0r)\tau_0}\big\|(1-\tau_0r)[(q_0 - T_0q) - (p_0 - T_0p)] - \tau_0(A_0q_0 - A_0p_0)\big\|_V^2$$
$$+\ \sum_{j=1}^{n}\frac{1}{(1-\tau_jr)\tau_j}\big\|(1-\tau_jr)[(q_j - T_jq) - (p_j - T_jp)] - \tau_j(A_jq_j - A_jp_j)\big\|_H^2$$
$$+\ r\sum_{j=1}^{n}\big\|(q_j - B_jT_0q) - (p_j - B_jT_0p)\big\|_H^2 \le \frac{1-\tau_0r}{\tau_0}\|q_0 - p_0\|_V^2 + \sum_{j=1}^{n}\frac{1}{\tau_j}\|q_j - p_j\|_H^2 + \frac{1}{r}\sum_{j=1}^{n}\|q_{n+j} - p_{n+j}\|_H^2,$$
i.e., the inequality (33) is true. The proof of the theorem is complete. ⊓⊔

Recall (see [6]) that the operator $T : Q \to Q$ is called asymptotically regular if $T^{k+1}q - T^kq \to 0$ as $k \to +\infty$ for any $q \in Q$. The following result is valid.

Theorem 4. Let the operator $T$ have at least one fixed point and let the conditions (9)–(11), (32) hold. Then the iterative sequence $\{q^{(k)}\}_{k=0}^{+\infty}$ constructed according to (19) converges weakly to $q^*$ in $Q$ as $k \to +\infty$, where $q^*$ is a fixed point of the operator $T$; the relation
$$\lim_{k\to+\infty}\big\|y_j^{(k)} - B_ju^{(k)}\big\|_H = 0, \quad j = 1, \dots, n, \tag{41}$$
is valid; and the operator $T : Q \to Q$ is asymptotically regular, i.e.,
$$\lim_{k\to+\infty}\big\|q^{(k+1)} - q^{(k)}\big\|_Q = 0. \tag{42}$$

Proof. We use the inequality (33) with $q = q^{(k)}$, assuming that $p$ is a fixed point of the operator $T$ (the existence of at least one fixed point is provided by the assumptions of the theorem). Since $Tq^{(k)} = q^{(k+1)}$ by the definition of the iterative sequence, $p_j = T_jp$, $j = 0, 1, \dots, 2n$, for a fixed point, and, by Theorem 1, $p_j = B_jT_0p = B_jp_0$, $j = 1, \dots, n$, we have

$$\big\|q^{(k+1)} - p\big\|_Q^2 + \delta_0\big(A_0u^{(k)} - A_0p_0,\ u^{(k)} - p_0\big)_V + \sum_{j=1}^{n}\delta_j\big(A_jy_j^{(k)} - A_jp_j,\ y_j^{(k)} - p_j\big)_H$$
$$+\ \frac{1}{\tau_0(1-\tau_0r)}\big\|(1-\tau_0r)\big(u^{(k)} - u^{(k+1)}\big) - \tau_0\big(A_0u^{(k)} - A_0p_0\big)\big\|_V^2$$
$$+\ \sum_{j=1}^{n}\frac{1}{\tau_j(1-\tau_jr)}\big\|(1-\tau_jr)\big(y_j^{(k)} - y_j^{(k+1)}\big) - \tau_j\big(A_jy_j^{(k)} - A_jp_j\big)\big\|_H^2$$
$$+\ r\sum_{j=1}^{n}\big\|y_j^{(k)} - B_ju^{(k+1)}\big\|_H^2 \le \big\|q^{(k)} - p\big\|_Q^2.$$

This inequality implies that the numerical sequence $\big\{\|q^{(k)} - p\|_Q\big\}_{k=0}^{+\infty}$ is nonincreasing and hence has a finite limit:
$$\lim_{k\to+\infty}\big\|q^{(k)} - p\big\|_Q < +\infty,$$
therefore
$$\lim_{k\to+\infty}\big(A_0u^{(k)} - A_0p_0,\ u^{(k)} - p_0\big)_V = 0, \tag{43}$$
$$\lim_{k\to+\infty}\big(A_jy_j^{(k)} - A_jp_j,\ y_j^{(k)} - p_j\big)_H = 0, \quad j = 1, \dots, n, \tag{44}$$
$$\lim_{k\to+\infty}\big\|y_j^{(k)} - B_ju^{(k+1)}\big\|_H = 0, \quad j = 1, \dots, n, \tag{45}$$
$$\lim_{k\to+\infty}\big\|(1-\tau_0r)\big(u^{(k)} - u^{(k+1)}\big) - \tau_0\big(A_0u^{(k)} - A_0p_0\big)\big\|_V = 0, \tag{46}$$
$$\lim_{k\to+\infty}\big\|(1-\tau_jr)\big(y_j^{(k)} - y_j^{(k+1)}\big) - \tau_j\big(A_jy_j^{(k)} - A_jp_j\big)\big\|_H = 0, \quad j = 1, \dots, n. \tag{47}$$
By using (9), (10), (43) and (44) we obtain
$$\lim_{k\to+\infty}\big\|A_0u^{(k)} - A_0p_0\big\|_V = 0, \qquad \lim_{k\to+\infty}\big\|A_jy_j^{(k)} - A_jp_j\big\|_H = 0, \quad j = 1, \dots, n. \tag{48}$$
It follows from (46)–(48) that
$$\lim_{k\to+\infty}\big\|u^{(k)} - u^{(k+1)}\big\|_V = 0, \qquad \lim_{k\to+\infty}\big\|y_j^{(k)} - y_j^{(k+1)}\big\|_H = 0, \quad j = 1, \dots, n. \tag{49}$$
Further, by using (40), (45), (49), from the inequality
$$\big\|y_j^{(k)} - B_ju^{(k)}\big\|_H \le \big\|y_j^{(k)} - B_ju^{(k+1)}\big\|_H + \big\|B_j\big(u^{(k)} - u^{(k+1)}\big)\big\|_H$$
$$\le \big\|y_j^{(k)} - B_ju^{(k+1)}\big\|_H + \Big(\sum_{i=1}^{n}\big\|B_i\big(u^{(k)} - u^{(k+1)}\big)\big\|_H^2\Big)^{1/2} = \big\|y_j^{(k)} - B_ju^{(k+1)}\big\|_H + \big\|u^{(k)} - u^{(k+1)}\big\|_V, \quad j = 1, \dots, n,$$

we obtain (41). It follows from (17) and (41) that
$$\lim_{k\to+\infty}\big\|\lambda_j^{(k+1)} - \lambda_j^{(k)}\big\|_H = r\lim_{k\to+\infty}\big\|y_j^{(k+1)} - B_ju^{(k+1)}\big\|_H = 0, \quad j = 1, \dots, n. \tag{50}$$
Relations (49), (50) imply that the condition (42) is satisfied, i.e., $T$ is an asymptotically regular operator. Since, in addition, by the assumptions of the theorem the operator $T$ has a nonempty set of fixed points and, by Theorem 3, is a nonexpansive operator, it follows from [7] that the iterative sequence $\{q^{(k)}\}_{k=0}^{+\infty}$ constructed by (19) converges weakly in $Q$ as $k \to +\infty$. Its limit $q^*$ is a fixed point of the operator $T$. The proof of the theorem is complete. ⊓⊔

Note that if the assumptions of Theorems 2 and 4 are valid, then the sequences $\{u^{(k)}\}_{k=0}^{+\infty}$ and $\{y_j^{(k)}\}_{k=0}^{+\infty}$ constructed by (12)–(14) converge weakly to $u$ and $B_ju$, $j = 1, \dots, n$, in $V$ and $H$, respectively, as $k \to +\infty$, where $u$ is a solution of the variational inequality (8).
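The abstract iteration (12)–(14) can be exercised in the simplest finite-dimensional setting. The sketch below assumes $V = H = \mathbb{R}^d$, $n = 1$, $B_1 = I$ (so (11) holds), $A_0 = cI$ (inversely strongly monotone with $\sigma_0 = 1/c$), $A_1 = 0$, $F_0 = -(f, \cdot)$ and $F_1 = \|\cdot\|_1$; all parameter values are hypothetical, with step sizes chosen to satisfy a bound of the form $\tau_j < 2\sigma_j/(2\sigma_jr + 1)$ in the spirit of the convergence theory below. In this setting (8) amounts to minimizing $\frac{c}{2}\|u\|^2 - (f, u) + \|u\|_1$, whose solution is known in closed form, so the run can be checked:

```python
# Toy finite-dimensional run of the splitting method (12)-(14): a sketch under
# simplifying assumptions (V = H = R^d, n = 1, B_1 = I, A_0 = c*I, A_1 = 0,
# F_0 = -(f,.), F_1 = ||.||_1), not the infinite-dimensional problem of the paper.
import numpy as np

def soft(p, t):                        # prox of t*||.||_1
    return np.sign(p) * np.maximum(np.abs(p) - t, 0.0)

c, r = 2.0, 1.0
tau0 = 0.9 * 2.0 / (2.0 * r + c)       # < 2*sigma_0/(2*sigma_0*r + 1) with sigma_0 = 1/c
tau1 = 0.9 / r                         # A_1 = 0, so only tau_1 * r < 1 is needed
f = np.array([3.0, -0.5, 1.7, 0.2])

u = np.zeros_like(f)
y = np.zeros_like(f)
lam = np.zeros_like(f)
for _ in range(5000):
    # (12): F_0 is linear, so the u-step is an explicit update
    u = u - tau0 * (c * u - f + lam + r * (u - y))
    # (13): a prox step for F_1 = ||.||_1
    y = soft(y - tau1 * (-lam + r * (y - u)), tau1)
    # (14): multiplier update
    lam = lam + r * (u - y)

u_exact = soft(f, 1.0) / c             # minimizer of (c/2)|u|^2 - (f,u) + |u|_1
assert np.allclose(u, u_exact, atol=1e-6)
assert np.allclose(y, u, atol=1e-6)
```

At the limit, $u = y$ and $\lambda = f - cu$ reproduce the fixed-point relations (20)–(22) of Theorem 1.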

4 Application of the iterative method to the problem (1)–(3)

Let us apply the suggested iterative method (12)–(14) to the problem (1)–(3). Since in (14) the calculations are performed by explicit formulas, it is sufficient to consider only the problems (12), (13). Since $F_0$ is a linear functional, the variational inequality (12) can be rewritten in the standard way in the form
$$\frac{1}{\tau_0}\big(u^{(k+1)} - u^{(k)},\ \eta\big)_V + \Big(A_0u^{(k)} - \hat f + ru^{(k)} + \sum_{j=1}^{n}B_j^*\big(\lambda_j^{(k)} - ry_j^{(k)}\big),\ \eta\Big)_V = 0 \quad \forall \eta \in V,$$
where the element $\hat f \in V$ is defined by the formula
$$(\hat f, \eta)_V = \int_\Omega (f, \eta)\,dx, \quad \eta \in V.$$

Thus the first step of the iterative process can be reduced to the solution of $n$ Dirichlet problems for the Poisson equation. Further, for each $j = 1, \dots, n$, let us rewrite the variational inequality (13) in the form
$$\big(y_j^{(k+1)},\ z - y_j^{(k+1)}\big)_H + G_j(z) - G_j\big(y_j^{(k+1)}\big) \ge 0 \quad \forall z \in H, \tag{51}$$
where
$$G_j(z) = \tau_jF_j(z) - \Big(y_j^{(k)} - \tau_j\big[A_jy_j^{(k)} - \lambda_j^{(k)} + r\big(y_j^{(k)} - B_ju^{(k+1)}\big)\big],\ z\Big)_H.$$

By using the definition of a proximal mapping we obtain that the variational inequality (51) is equivalent to the following minimization problem:
$$\frac12\big\|y_j^{(k+1)}\big\|_H^2 + G_j\big(y_j^{(k+1)}\big) \le \frac12\|z\|_H^2 + G_j(z) \quad \forall z \in H,$$
or
$$\frac{1}{2\tau_j}\|z\|_H^2 + F_j(z) - \frac{1}{2\tau_j}\big\|y_j^{(k+1)}\big\|_H^2 - F_j\big(y_j^{(k+1)}\big) \ge \frac{1}{\tau_j}\Big(y_j^{(k)} - \tau_j\big[A_jy_j^{(k)} - \lambda_j^{(k)} + r\big(y_j^{(k)} - B_ju^{(k+1)}\big)\big],\ z - y_j^{(k+1)}\Big)_H \quad \forall z \in H,$$
i.e.,
$$\frac{1}{\tau_j}\Big(y_j^{(k)} - \tau_j\big[A_jy_j^{(k)} - \lambda_j^{(k)} + r\big(y_j^{(k)} - B_ju^{(k+1)}\big)\big]\Big) \in \partial\hat F_j\big(y_j^{(k+1)}\big), \tag{52}$$

i 1 (k) h (k+1) (k) (k) (k) , ∈ ∂b Fj yj yj − Aj yj − λj + r yj − Bj u(k+1) τj

(52)

where

1 b Fj (z) = kzk2H + Fj (z). 2 τj It is known (see [5℄), that p ∈ ∂ bFj (z) if and only if z ∈ ∂ bFj∗ (q), where bFj∗ is a fun tional onjugate to bFj (see, e.g., [5℄). So the in lusion (52) is equivalent to

the following one:

y_j^{(k+1)} ∈ ∂F̂_j^* ( (1/τ_j) y_j^{(k)} − [ A_j y_j^{(k)} − λ_j^{(k)} + r ( y_j^{(k)} − B_j u^{(k+1)} ) ] ).   (53)

Since

F̂_j(z) = ∫_Ω ∫_0^{|z|} g_{τ_j}(ξ) dξ dx,   where   g_{τ_j}(ξ) = { ξ/τ_j,  ξ ⩽ β_j ;   ξ/τ_j + ϑ_j,  ξ > β_j },

then it is not difficult to check that

F̂_j^*(z) = ∫_Ω ∫_0^{|z|} φ_{τ_j}(ξ) dξ dx,   where   φ_{τ_j}(ξ) = { τ_j ξ,  ξ ⩽ β_j/τ_j ;   β_j,  β_j/τ_j < ξ ⩽ β_j/τ_j + ϑ_j ;   τ_j (ξ − ϑ_j),  ξ > β_j/τ_j + ϑ_j }.


I. Badriev, O. Zadvornov

Then we obtain that the functional F̂_j^* is Gâteaux differentiable; moreover,

(F̂_j^*)′(z) = φ_{τ_j}(|z|) z / |z| ,

hence, by virtue of Proposition 5.3 of [5], the subdifferential ∂F̂_j^*(z) contains a unique element, coinciding with (F̂_j^*)′(z). So calculations by (53) are performed by explicit formulas.

References
1. I.B. Badriev, O.A. Zadvornov, A Decomposition Method for Variational Inequalities of the Second Kind with Strongly Inverse-Monotone Operators, Differential Equations, Pleiades Publishing, Inc., 2003, 39, pp. 936-944.
2. O.A. Zadvornov, On the Convergence of the Semi-implicit Method for Solving the Variational Inequalities of the Second Kind, Izvestiya Vuzov. Matematika, 2005, 6, pp. 61-70 (in Russian).
3. I.B. Badriev, O.A. Zadvornov, On the Convergence of a Dual-Type Iterative Method for Mixed Variational Inequalities, Differential Equations, Pleiades Publishing, Inc., 2006, 42(8), pp. 1180-1188.
4. I.B. Badriev, O.A. Zadvornov, A.M. Saddek, Convergence Analysis of Iterative Methods for Some Variational Inequalities with Pseudomonotone Operators, Differential Equations, Pleiades Publishing, Inc., 2001, 37(7), pp. 934-942.
5. Ekeland I., Temam R., Convex Analysis and Variational Problems, North-Holland Publishing Company, Amsterdam, 1976.
6. Browder F.E., Petryshyn W.V., The solution by iteration of nonlinear functional equations in Banach spaces, Bull. Amer. Math. Soc., 1966, V. 72, pp. 571-575.
7. Opial Z., Weak convergence of the sequence of successive approximations for nonexpansive mappings, Bull. Amer. Math. Soc., 1967, V. 73, pp. 591-597.
8. Gajewski H., Gröger K., Zacharias K., Nichtlineare Operatorgleichungen und Operatordifferentialgleichungen, Berlin: Akademie-Verlag, 1974.

Multilevel Algorithm for Graph Partitioning

N. S. Bochkarev, O. V. Diyankov, and V. Y. Pravilnikov

Neurok TechSoft LLC, Russia
diyankov@aconts.com

Abstract. A class of multilevel algorithms for partitioning of a sparse matrix prior to parallel solution of a system of linear equations is described. This matrix partitioning problem can be stated in terms of a graph partitioning problem, which is known to be NP-hard, so several heuristics for its solution have been proposed in the past decades. For this purpose we use the multilevel algorithm proposed by B. Hendrickson and R. Leland [2] and further developed by G. Karypis and V. Kumar [3]. This algorithm is very efficient and tends to produce high-quality partitionings for a wide range of matrices arising in many practical applications.

Keywords: graph partitioning, parallel computations, load balancing.

1

Introduction

Efficient algorithms for graph partitioning are critical for scientific simulations on high-performance parallel computers. For example, parallel iterative solution of a linear system of equations

Ax = b,

where A is a large sparse matrix, b is a right-hand side and x is a vector of unknowns, is based on the partitioning of the matrix A. The main purpose of the partitioning procedure is to divide the matrix A into the required number of parts (stripes) in such a way that each part has approximately the same number of rows and the number of interprocess communications performed during the parallel solution is kept as small as possible. This class of problems can be strictly described in terms of the graph partitioning problem (see sec. 2), which is known to be NP-hard, so several heuristics for its solution have been developed in the past decades. They can be subdivided into three main categories. The first one contains so-called spectral algorithms. While achieving partitionings of very good quality, they require a large amount of hardware resources (CPU cycles and memory) because of the necessity to find the eigenvector corresponding to the second smallest eigenvalue of the Laplacian matrix of the adjacency graph of A. The second group contains greedy algorithms, which find graph partitionings by sequentially adding nodes to growing subsets following some greedy strategy such as minimizing the number of cut edges (see sec. 2) at each step. The third group contains multilevel (ML) algorithms, which are among the best ones in terms of partitioning quality and computational resource requirements, which is very important as problems become larger. The ML approach itself was originally proposed by B. Hendrickson and R. Leland [2] and further developed by G. Karypis and V. Kumar [3] in their METIS package. In this paper we describe its analogue and present numerical test results. The remainder of the paper is organized as follows. In section 2 we define the graph partitioning problem. In section 3 the main idea behind multilevel techniques is demonstrated. In sections 4, 5, and 6 we describe in detail the different phases of the multilevel approach - coarsening, initial partitioning and uncoarsening, respectively. In section 7 we present a variant of the original ML approach which is called the Cell-Based Multilevel (CBML) approach and describe its advantages. Section 8 presents numerical test results. Section 9 provides a summary of the tests.

Fig. 1. Sparse matrix and its adjacency graph.

2

The problem statement

It is well known that the nonzero pattern of a sparse matrix can be represented by its adjacency graph (see Fig. 1). Namely, given a square n × n sparse matrix A containing nz nonzero entries, its adjacency graph is G = (V, E), where V is the set of nodes corresponding to the rows of A (|V| = n), and E is the set of edges corresponding to the nonzero entries of A (|E| = nz). The graph is undirected when A is symmetric and directed otherwise. In the following paragraphs we assume that A is symmetric. This restriction is easy to fulfil by considering the matrix A* = A + A^T and its adjacency graph instead of A, since the exact values of the matrix's nonzero entries are unimportant. The k-way graph partitioning problem is formulated as follows: partition V into k disjoint subsets {V1, V2, ..., Vk} such that |Vi| ≈ |V|/k for i = 1..k (load-balancing condition), while minimizing the number of edges whose incident nodes belong to different partitions (cut-size minimization condition) (see Fig. 1). These edges are called cut edges and their number is called the cut size. This problem can be trivially extended to graphs with weights assigned to the nodes and edges (see [3]). When the number of parts is a power of 2, i.e. k = 2^p, the problem is frequently solved in a recursive bisection fashion. Namely, we first obtain a 2-way partitioning of our graph: V = V1 ∪ V2. Then we recursively apply the same procedure to the subgraphs of G induced by V1 and V2. After p steps the original graph is partitioned into k parts. It is worth noting that this approach often works worse than the original k-way partitioning approach when


k > 2, but it is still frequently used due to its simplicity. In this paper we describe the original k-way partitioning approach and propose some improvements. We estimate partitioning quality by the degree to which the load-balancing and cut-size minimization conditions are fulfilled.
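The construction of the adjacency graph described in sec. 2, including the symmetrization A* = A + A^T, can be sketched in a few lines. This is a minimal illustration of our own (the function name and the (i, j)-pair input format are choices made for the sketch, not taken from the paper):

```python
def adjacency_graph(n, entries):
    """Build the adjacency graph of an n x n sparse matrix.

    `entries` is an iterable of (i, j) index pairs of nonzero entries
    (the values themselves are irrelevant for partitioning).  The
    pattern is symmetrized, i.e. the graph of A* = A + A^T is returned,
    so the result is always undirected and has no self-loops.
    """
    adj = {u: set() for u in range(n)}
    for i, j in entries:
        if i != j:            # diagonal entries produce no edges
            adj[i].add(j)
            adj[j].add(i)     # symmetrization: edges of A + A^T
    return adj
```

With this representation, |V| = n and |E| equals the number of off-diagonal nonzero positions of the symmetrized pattern.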

3

Multilevel k-way graph partitioning

The whole procedure may be depicted by Fig. 2. It consists of the following main phases:

Fig. 2. Multilevel graph partitioning algorithm.

1. Coarsening phase. During the coarsening phase, a sequence of smaller graphs {G1, G2, ..., Gm} is constructed until the number of nodes in the coarsest graph Gm becomes less than some predefined value (around a few hundred). The number of graphs in this sequence is called the coarsening depth. We use a special parameter ν to control the coarsening depth. Namely, we try to build a coarser graph Gi+1 from a finer one Gi until |Vi| > ν · |V1|. Each graph Gi forms a layer of coarsening. That is why this approach is called "multilevel". At each layer, possibly except the first one, weights are assigned to the nodes and edges of the graphs (see sec. 4), so that a partitioning of the coarsest graph remains good with respect to the original one. There are many possibilities to construct a coarser graph from a finer one, but we use an edge-collapsing technique that is based on matchings (see sec. 4).


2. Initial partitioning phase. During the initial partitioning phase, a high-quality partitioning of the coarsest graph Gm is computed. Since the number of nodes of Gm is small compared with that of G1, this phase can be accomplished very quickly. Actually, it takes about 10% of the total partitioning time. There exist many algorithms to do this (see [3], [5]). For this we use the Restarted Greedy Graph Growing (RGGG) algorithm, which is described in detail in sec. 5.
3. Uncoarsening phase with refinement. During this phase, the just-found partitioning of the coarsest graph Gm is projected back to the original graph G1 by going through the set of intermediate graphs. On each layer the just-projected partitioning is refined. There are many local refinement algorithms intended to do this. It is worth mentioning the Kernighan-Lin refinement algorithm [5] and its linear-time variant - the Fiduccia-Mattheyses refinement algorithm [1]. In our multilevel approach we use a variant of the original Fiduccia-Mattheyses local refinement algorithm which is called the boundary Fiduccia-Mattheyses local refinement algorithm.
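The three phases above can be sketched as a generic driver. This is our own schematic, not the authors' code; `coarsen`, `initial_partition` and `refine` are placeholder callables standing for the phase implementations described in secs. 4-6:

```python
def multilevel_partition(graph, k, coarsen, initial_partition, refine,
                         min_size=200):
    """Generic multilevel k-way partitioning driver (sketch).

    `coarsen(G)` must return (coarser_graph, mapping), where mapping[v]
    is the coarse node that fine node v collapsed into; `initial_partition`
    and `refine` are user-supplied phase implementations.
    """
    # Phase 1: coarsening -- build the sequence G1, G2, ..., Gm.
    levels = []                        # stack of (finer graph, mapping)
    g = graph
    while len(g) > min_size:
        coarser, mapping = coarsen(g)
        if len(coarser) == len(g):     # no progress: stop coarsening
            break
        levels.append((g, mapping))
        g = coarser
    # Phase 2: initial partitioning of the coarsest graph Gm.
    part = initial_partition(g, k)     # dict: coarse node -> part id
    # Phase 3: uncoarsening with refinement, back to G1.
    while levels:
        finer, mapping = levels.pop()
        part = {v: part[mapping[v]] for v in finer}   # project back
        part = refine(finer, part, k)
        g = finer
    return part
```

The projection step is trivial precisely because each coarse node is formed from a distinct subset of finer nodes, as the text notes.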

4

Coarsening phase

Given a weighted graph Gi with weights assigned to the nodes and the edges, the next-level coarser graph Gi+1 is constructed from it by merging together some subsets of its nodes {ν_{j1}^{(i)}, ν_{j2}^{(i)}, ..., ν_{jk}^{(i)}} (ancestors) into multinodes ν_j^{(i+1)} (descendants). The weight of ν_j^{(i+1)} equals the sum of the weights of {ν_{j1}^{(i)}, ν_{j2}^{(i)}, ..., ν_{jk}^{(i)}}. In the case when more than one node of {ν_{j1}^{(i)}, ν_{j2}^{(i)}, ..., ν_{jk}^{(i)}} contains edges incident to the same node u ∉ {ν_{j1}^{(i)}, ν_{j2}^{(i)}, ..., ν_{jk}^{(i)}}, the weight of the edge (ν_j^{(i+1)}, u) equals the sum of the weights of these edges. It is obvious that a coarser graph can be constructed from a finer one in many different ways. For matrices with unstructured nonzero patterns it seems reasonable to use a coarsening procedure based on collapsing together the edges of G that form a matching, because of the necessity to preserve the connectivity structure of the original graph in the coarsest one. Recall that a matching in a graph (weighted or unweighted) is a subset of its edges no two of which are incident to the same node. A matching is called maximal if it is impossible to add one more edge to it such that the resulting subset of edges still forms a matching. A maximal matching that has the maximum number of edges is called a maximum matching. Since the goal of the coarsening procedure is to decrease the size of the graph, a matching should contain a large number of edges. But the complexity of computing a maximum matching is higher than that of computing a maximal matching. That is why we construct maximal matchings during the coarsening phase. They can be generated very quickly using depth-first search [6] or a randomized algorithm. We implemented the following three types of matchings:
– Random matching (RM). This type of matching is very popular because of its simplicity and often gives good results. It is demonstrated by the following pseudocode:

Algorithm 3. Random Matching Algorithm
INPUT: graph G(V,E)
OUTPUT: matching M
1. matching M = ∅;
2. foreach( u ∈ V ) mask[u] = 0;
3. foreach( u ∈ V ) {
4.   if( 0 == mask[u] ) {
5.     mask[u] = 1;
6.     if( exist v ∈ adj[u] such that 0 == mask[v] ) {
7.       mask[v] = 1;
8.       M ← (u, v);
9.     }
10.  }
11. }

Initially, the matching is empty (line 1) and all nodes are unmasked (line 2). Then the nodes are visited in random order (line 3). If node u is already masked, it is skipped. Otherwise, it is masked (lines 4, 5), and then we arbitrarily select an adjacent unmasked node v if such a node exists (line 6), mask it (line 7) and add the edge (u, v) to the matching (line 8). Obviously, this algorithm has linear time complexity with respect to the number of nodes, i.e. O(|V|).
– Heavy-Edge Matching (HEM). As in the previous algorithm, the nodes are visited in random order. But now we select the unmasked node v adjacent to u in such a way that the weight of the edge (u, v) is maximal over all unmatched adjacent edges. The algorithm can be illustrated by the following pseudocode:

Algorithm 4. Heavy-Edge Matching Algorithm
INPUT: graph G(V,E)
OUTPUT: matching M
1. matching M = ∅;
2. foreach( u ∈ V ) mask[u] = 0;
3. foreach( u ∈ V ) {
4.   if( 0 == mask[u] ) {
5.     mask[u] = 1;
6.     if( exist v ∈ adj[u] such that 0 == mask[v] and w(u,v) → max ) {
7.       mask[v] = 1;
8.       M ← (u, v);
9.     }
10.  }
11. }

This algorithm has linear time complexity with respect to the number of edges, i.e. O(|E|).
– Heavy-Clique Matching (HCM). In this section we describe our version of the heavy-clique matching algorithm. The algorithm can be effective for graphs with a few highly-connected components [6]. In [3] one can find a variant of the HCM algorithm based on the concept of edge density. In contrast to this algorithm, we developed our own one. Recall that for an undirected graph G one can define the concept of the degree of a node [6], which gives the number of edges incident to the node. The algorithm can be demonstrated by the following pseudocode:

Algorithm 5. Heavy-Clique Matching Algorithm
INPUT: graph G(V,E)
OUTPUT: matching M
1. matching M = ∅;
2. for( i = 1; G ≠ ∅; i++ ) {
3.   if( min_{u∈V} (deg[u]) == 1 ) break;
4.   build Gi from G such that min_{u∈Vi} (deg[u]) is as large as possible;
5.   G = G\Gi;
6. }
7. foreach( Gi ) M ← HEM(Gi);
8. M ← HEM(G);

Initially, the matching is empty (line 1). In line 4 we try to build a subgraph Gi of the given graph G which has the following property: min_{u∈Vi} (deg[u]) is as large as possible, where deg[u] is the degree of node u. In other words, we try to extract a subgraph of G in which the minimal degree of a node is as large as possible. The operation in line 4 can be implemented in O(|E|) by the algorithm described below (see the Maximal Minimum Degree Subgraph Extraction algorithm). Then we perform the same procedure for the subgraph of G induced by the set of nodes {V\Vi}, until the condition in line 3 is satisfied or the subgraph becomes empty (loop in lines 2-6). As a result, we obtain a sequence of graphs {G1, G2, ..., Gq} and the remaining part of the input graph, which consists of isolated nodes or isolated pairs of nodes. After that we build the required matching M as a union of heavy-edge matchings for all graphs Gi and G. Let us consider the algorithm which demonstrates how we can build a subgraph of G with maximal minimum degree (the operation in line 4) in O(|E|) time:

Algorithm 6. Maximal Minimum Degree Subgraph Extraction
INPUT: graph G(V,E)
OUTPUT: subgraph G* ⊂ G
1. sort nodes of G in degree-ascending order;
2. for( u = 0; u < |V|; u++ ) {
3.   save D[u] ← deg[u];
4.   G = G\{u};
5.   recalculate the degrees of the remaining nodes;
6.   maintain the degree-ascending order of the nodes;
7. }
8. find u*: D[u*] = max_u (D[u]);
9. while( u < u* ) G = G\{u};
10. G* = G;

In line 1 we sort the nodes of G in degree-ascending order. In the loop in lines 2-7 we visit the sorted nodes one at a time, take the node with minimum degree, save its degree and then exclude it together with its incident edges from G. It is necessary to recalculate the degrees of the remaining nodes and maintain their degree-ascending order. In line 8 we find the maximal value of all degrees that were saved in line 3 and the corresponding node u*. Then the output of the algorithm is obtained by removing from G the nodes which were visited before u* in the loop in lines 2-7. We have considered three algorithms for matching generation; in [3] one can find other ones. When the edges are unweighted (or have the same weight) it seems reasonable to use RM. In order to construct a coarser graph from a finer one we need to collapse together the matched edges. This procedure can also be implemented in O(|E|). It is worth noting that the coarsening phase usually takes about 80% of the total partitioning time.
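The matching-based coarsening of this section can be sketched in executable form. Below is a minimal Python illustration of our own (not the authors' code) of heavy-edge matching together with the edge-collapsing step that builds the next coarser weighted graph; random matching (RM) is the special case in which all edge weights are equal. Graphs are represented as dicts mapping a node to a dict {neighbour: edge weight}:

```python
import random

def heavy_edge_matching(adj, rng=random):
    """HEM (Algorithm 4): visit nodes in random order and match each
    unmasked node with the unmasked neighbour joined by the heaviest
    edge.  With unit edge weights this degenerates to random matching."""
    matching, mask = [], {u: 0 for u in adj}
    order = list(adj)
    rng.shuffle(order)                 # random visiting order
    for u in order:
        if mask[u]:
            continue
        mask[u] = 1
        cands = [v for v in adj[u] if not mask[v]]
        if cands:
            v = max(cands, key=lambda x: adj[u][x])   # heaviest edge
            mask[v] = 1
            matching.append((u, v))
    return matching

def collapse_matching(adj, node_w, matching):
    """Collapse each matched edge into a multinode: ancestor node
    weights are summed, and parallel edges between multinodes are
    merged with their weights added.  Returns the coarser adjacency,
    its node weights, and the fine-to-coarse mapping."""
    mapping, nxt = {}, 0
    for u, v in matching:
        mapping[u] = mapping[v] = nxt
        nxt += 1
    for u in adj:                      # unmatched nodes survive as-is
        if u not in mapping:
            mapping[u] = nxt
            nxt += 1
    cw = {c: 0 for c in range(nxt)}
    cadj = {c: {} for c in range(nxt)}
    for u in adj:
        cw[mapping[u]] += node_w[u]
        for v, w in adj[u].items():
            cu, cv = mapping[u], mapping[v]
            if cu != cv:               # edges inside a multinode vanish
                cadj[cu][cv] = cadj[cu].get(cv, 0) + w
    return cadj, cw, mapping
```

Both routines are linear in the number of edges, matching the O(|E|) bounds stated above.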

5

Initial partitioning phase

During this phase a high-quality partitioning of the coarsest graph Gm is constructed. Since Gm has quite a small number of nodes, this phase takes quite a small amount of time. For this we use the Restarted Greedy Graph Growing algorithm, which can be outlined by the following pseudocode:

Algorithm 7. Restarted Greedy Graph Growing Algorithm
INPUT: graph G, the number of parts k, the number of restarts rests
OUTPUT: partitioning of G into k parts
1. while( rests-- ) {
2.   put all nodes in partition P0;
3.   for( j = 1; j < k; j++ ) {
4.     randomly select u ∈ P0 and put it in Pj;
5.     while( size[Pj] < |V|/k ) {
6.       select u ∈ P0 such that cutsize → min;
7.       move u from P0 to Pj;
8.     }
9.   }
10.  save partitioning;
11. }

At the beginning of each restart we put all nodes in partition P0 (line 2). In order to construct partition Pj, we first randomly select a node from partition P0 and put it in partition Pj (the growing subset), which was empty before (line 4). Then we sequentially move nodes from P0 to Pj in such a way that each movement results in the smallest possible increase in the cut size (lines 6, 7). We continue this until the size of Pj becomes greater than or equal to |V|/k. Then we try to construct partition Pj+1 in the same way (loop in lines 3-9). It is obvious that in order to construct a k-way partitioning of the graph we must construct k − 1 partitions; after that, the nodes remaining in partition P0 form the missing k-th partition. After the required number of restarts is finished, we use the best partitioning as the result. This procedure can be used as a standalone partitioner, but greedy algorithms often give partitionings of poor quality, and the required amount of time often exceeds the amount of time required by the multilevel partitioner.
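Algorithm 7 can be sketched as follows. This is a simplified illustration of our own, assuming unit node weights; the greedy step moves the node whose transfer increases the cut size the least:

```python
import random

def rggg(adj, k, restarts=4, rng=random):
    """Restarted Greedy Graph Growing (simplified sketch).
    Parts 1..k-1 are grown out of part 0; the remaining nodes form
    the k-th part.  The best of `restarts` attempts is kept."""
    target = len(adj) // k

    def cut_size(part):
        # each cut edge is seen from both endpoints, hence // 2
        return sum(1 for u in adj for v in adj[u]
                   if part[u] != part[v]) // 2

    def growth(u, part, p):
        # change in cut size if u moves from part 0 to part p
        to_zero = sum(1 for v in adj[u] if part[v] == 0)
        to_p = sum(1 for v in adj[u] if part[v] == p)
        return to_zero - to_p

    best, best_cut = None, None
    for _ in range(restarts):
        part = {u: 0 for u in adj}
        for p in range(1, k):
            pool = [u for u in adj if part[u] == 0]
            part[rng.choice(pool)] = p          # random seed node
            while sum(1 for u in adj if part[u] == p) < target:
                cands = [u for u in adj if part[u] == 0]
                if not cands:
                    break
                part[min(cands, key=lambda u: growth(u, part, p))] = p
        c = cut_size(part)
        if best_cut is None or c < best_cut:    # keep the best restart
            best, best_cut = part, c
    return best
```

As the text notes, this can serve as a standalone partitioner, but it is intended here only for the small coarsest graph.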

6

Uncoarsening with refinement phase

This phase consists of the following two steps. First, the partitioning of the coarsest graph is projected back to the original graph by going through the intermediate graphs. Since each node of Gi+1 is formed by a distinct subset of nodes of Gi, the projection is trivial to realize. Namely, we can derive the partitioning of Gi from the partitioning of Gi+1 by assigning to the set of nodes {ν_{j1}^{(i)}, ν_{j2}^{(i)}, ..., ν_{jk}^{(i)}} that collapsed into ν_j^{(i+1)} the partition number that holds ν_j^{(i+1)}. The next step is a refinement procedure for the just-found partitioning of Gi. We use a modification of the original local Fiduccia-Mattheyses refinement algorithm [1] which we call the boundary Fiduccia-Mattheyses refinement algorithm. The central concept behind refinement algorithms is the concept of the gain of a node. Given a node u which belongs to partition Pi, the gain of moving node u from Pi to


Pj (i ≠ j) is given by the following formula:

gain_u(Pi → Pj) = Σ_{v ∈ adj[u] ∩ Pj} w(u,v) − Σ_{v ∈ adj[u] ∩ Pi} w(u,v),

where w(u,v) is the weight of the edge (u, v). In other words, gain_u(Pi → Pj) gives the decrease in the cut size we obtain after the movement is performed. The boundary Fiduccia-Mattheyses refinement algorithm may be outlined by the following pseudocode:

Algorithm 8. Boundary Fiduccia-Mattheyses Refinement Algorithm
INPUT: graph G(V,E), the number of parts k, the number of restarts rests
OUTPUT: partitioning of G into k parts
1. while( rests-- ) {
2.   unlock all nodes V;
3.   for( p = 0; p < k; p++ ) {
4.     put all boundary nodes from Pp to PQp;
5.   }
6.   while( not all PQs are empty ) {
7.     from all PQs find i, j and u ∈ Pi such that gain_u(Pi → Pj) → max;
8.     if( gain_u(Pi → Pj) < 0 ) break;
9.     if( after the movement Pi and Pj are still balanced ) {
10.      move u: Pi → Pj;
11.      adjust gains of all unlocked nodes v ∈ adj[u];
12.    }
13.    else {
14.      remove u from PQi;
15.    }
16.  }
17. }

The boundary Fiduccia-Mattheyses refinement algorithm is iterative in nature. The number of iterations (or restarts) is controlled by the rests parameter. We maintain k priority queues to hold the boundary nodes from each partition that are allowed to move, i.e. unlocked (see [1] for an explanation). Initially all the nodes are unlocked (line 2). In lines 3-5 we initialize all queues with the boundary nodes of the corresponding partitions. As the key of a node we use the maximum gain over all allowable movements of that node, i.e. movements from Pi to Pj, i ≠ j. Then, in the loop in lines 6-16 we look at the tops of all queues and select the node that has the maximum value of the key (line 7). After that we know all the information necessary to perform the just-found movement. If the gain of the movement is negative, we exit from the loop in lines 6-16 because only movements with positive gains can refine the partitioning. It is worth noting that due to the load-balancing condition such a movement may not be allowed (this is controlled by line 9). In this case we simply remove the node from its queue and perform the procedure again. We exit from the loop in lines 6-16 only if there are no movements that preserve load balancing and decrease the cut size. An advantage of this algorithm over the one described in [1] is its time complexity, which can be approximated by the formula O(N* · S), where N* is the number of boundary nodes (nodes which have at least one adjacent node belonging to another partition) and S is the average sparsity.
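The gain formula above translates directly into code. A small sketch of our own for weighted adjacency dicts ({node: {neighbour: weight}}), which a refinement loop like Algorithm 8 could call and incrementally adjust:

```python
def gain(adj, part, u, j):
    """Gain of moving node u from its current part to part j: the
    decrease in cut size, per the boundary FM gain formula."""
    i = part[u]
    to_j = sum(w for v, w in adj[u].items() if part[v] == j)  # edges healed
    in_i = sum(w for v, w in adj[u].items() if part[v] == i)  # edges cut
    return to_j - in_i
```

A node is a boundary node exactly when it has at least one neighbour in another part, so only such nodes can have positive gain; this is why restricting the priority queues to boundary nodes loses nothing.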

7

Cell-based multilevel approach

In this section we introduce a new multilevel technique for sparse matrix partitioning. We call it the cell-based multilevel (CBML) partitioning algorithm. It is effective for large sparse matrices arising from discretizations of PDEs in which several unknowns are related to each grid cell. Let us consider such a matrix A. In the case when A is a multiblock matrix, we consider one of its blocks. The adjacency graph G of A has a special connectivity structure. Namely, the set of its nodes V can be represented as V = ∪ Vµ, in which each subset Vµ has the following property: all the nodes u ∈ Vµ are indistinguishable (recall that two nodes v and u are called indistinguishable if adj[u] ∪ {u} = adj[v] ∪ {v}). This information about the connectivity pattern may be employed to find a better partitioning compared with that generated by the original algorithm. Namely, we can consider the so-called reduced graph G* of G, in which each node v ∈ V* corresponds to a subset Vµ, has the same connectivity structure as any node in Vµ, and has the weight w[v] = |Vµ|. Then we apply the multilevel technique to the reduced graph G*. After a partitioning of G* is generated, a partitioning of G can be derived from it, since each node of G* is formed from a distinct subset of nodes of G. In the case when |V*|/|V| ≪ 1 this approach seems to produce better partitionings compared with those generated by the original multilevel algorithm applied to G (see Fig. 8 in the numerical test results section).
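The construction of the reduced graph can be sketched by grouping nodes with equal closed neighbourhoods adj[u] ∪ {u}. This is our own minimal illustration (unit fine-node weights assumed), not the authors' implementation:

```python
def reduced_graph(adj):
    """Group indistinguishable nodes (equal closed neighbourhoods
    adj[u] | {u}) into supernodes.  Returns the reduced adjacency,
    the supernode weights w[v] = |V_mu|, and the fine-to-supernode
    mapping.  `adj` maps each node to a set of neighbours."""
    groups, mapping = {}, {}
    for u in adj:
        key = frozenset(adj[u]) | {u}        # closed neighbourhood of u
        mapping[u] = groups.setdefault(key, len(groups))
    weight, radj = {}, {}
    for u in adj:
        g = mapping[u]
        weight[g] = weight.get(g, 0) + 1     # w[v] = |V_mu|
        radj.setdefault(g, set())
        for v in adj[u]:
            if mapping[v] != g:              # keep only inter-group edges
                radj[g].add(mapping[v])
    return radj, weight, mapping
```

A partitioning of G is then recovered from a partitioning of G* by assigning each fine node the part of its supernode, exactly as in the uncoarsening projection.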

8

Numerical test results

In this section we present the results of a comparison of our partitioner with the METIS package, which can be downloaded from http://www.glaros.dtc.umn.edu/gkhome/metis/metis/download.

All tests were performed on an Opteron 2.0 GHz with 2 Gb RAM running under SLES 9. We use the public XOM matrix collection, which can be downloaded from http://www.aconts.com/XOMMatrices. Table 1 summarizes the test matrices' properties.


Table 1. Summary of the public XOM matrix collection

Problem      N        Z      Z/N    ZD    ND     SD      POD
CI-1     113465  1654732   14,58     0     2  23619   477535
CI-2      62449   460319    7,37     0     0    128      202
CIT-1     17436   344245   19,74     0  4207   7388    98092
CIT-2    249428  5613978   22,51    30  1323  16106  1024619
SBO-1     21700   145122    6,69     1     0      1        7
SBO-2    111756   888190    7,95     0     0      8       12
SBO-3    216051  1849317    8,56     0     0      0        0
SBO-4     93264   667882    7,16     0     0      0        0
SEO-1     22421   204784    9,13     0   180     94      849

Here N is the number of rows, Z is the number of nonzero entries, S = Z/N is the average sparsity, ZD is the number of zero diagonal entries, ND is the number of negative diagonal entries, SD is the number of "small" diagonal entries (i.e. |a_ii| < 0.01 · Σ_{j≠i} |a_ij|), and POD is the number of positive off-diagonal entries.

– Matching algorithms' impact on partitioning quality. The main observation behind these tests is that all algorithms for matching generation give good results with respect to partitioning quality and time requirements. But it seems reasonable to use "heavy" matchings (i.e. HEM or HCM) on "deep" layers of coarsening, where weights are assigned to the nodes and the edges. As the experiments show, HEM is the best algorithm, resulting in a very good partitioning of the coarsest graph. In the tests described below HEM is used as the default matching strategy. In Fig. 3 a comparison of different matching strategies for different numbers of parts (2, 4, 6, 8, 10, 12, 16, 20, 24, 28, 32) is presented for the problem CI-1.
– Coarsening depth parameter's impact on partitioning quality. As was described in sec. 3, the depth of coarsening is controlled by the special parameter ν, i.e. we try to build a coarser graph Gi+1 from a finer one Gi until |Vi| > ν · |V1|, where |V1| is the number of nodes of G1. The smaller the parameter's value, the smaller the number of nodes in the coarsest graph, and thus the better the partitioning we can obtain after the initial partitioning phase. As our experiments show, small values of ν result in better initial partitionings. In this section we present the results of tests where the coarsening depth parameter varies, for the problem CI-1 and for different numbers of parts (2, 4, 6, 8, 10, 12, 16, 20, 24, 28, 32).
– Comparison with the METIS package. In this section the results of a comparison of our partitioner (MLPT) with the METIS package are presented for all matrices and different numbers of parts. In the tests we use the HEM algorithm to find matchings, and the value of the coarsening depth parameter is ν = 0.0001. We can conclude that our variant of the multilevel algorithm gen-


Fig. 3. Comparison of different matching algorithms (HEM, HCM, RM) and their impact on partitioning quality (log2 of cut size versus number of parts) for different numbers of parts (2, 4, 6, 8, 10, 12, 16, 20, 24, 28, 32) for the problem CI-1.

Fig. 4. Influence of the coarsening depth parameter ν (0.0001, 0.001, 0.01, 0.1) on partitioning quality (log2 of cut size versus number of parts) for different numbers of parts (2, 4, 6, 8, 10, 12, 16, 20, 24, 28, 32) for the problem CI-1.


Fig. 5. Comparison of the partitioning quality generated by MLPT with that generated by the METIS package (ratio cutsize_MLPT / cutsize_METIS versus number of parts) for all test problems and different numbers of parts (2, 4, 6, 8, 10, 12, 16, 20, 24, 28, 32). Here cutsize_MLPT is the cut size for MLPT and cutsize_METIS is the cut size for METIS.

Fig. 6. Comparison of the partitioning time required by MLPT with that required by the METIS package (ratio time_MLPT / time_METIS versus number of parts) for all test problems and different numbers of parts (2, 4, 6, 8, 10, 12, 16, 20, 24, 28, 32). Here time_MLPT is the time required by MLPT and time_METIS is the time required by METIS.


Fig. 7. Influence of partitioning quality on the performance of the matrix-vector product (MVP) operation on an MPI architecture for different numbers of parts (2, 3, 4, 5, 6, 7, 8), for all test problems. Here speedup is the ratio of serial time to parallel time required by the operation.

Fig. 8. Comparison of the Cell-Based Multilevel Algorithm with the original one (ratio cutsize_CBMLPT / cutsize_MLPT versus number of parts) for the problem CIT-2 for different numbers of parts (2, 4, 6, 8, 10, 12, 16, 20, 24, 28, 32). Here cutsize_CBMLPT is the cut size for CBMLPT and cutsize_MLPT is the cut size for MLPT.


erates partitionings competitive with those generated by the METIS package. In Fig. 5 we compare the quality of the partitionings generated by our partitioner with that generated by the METIS package. In Fig. 6 we compare the required time.
– MPI matrix-vector product. While solving a large sparse linear system of equations via some Krylov-type iterative method on a machine with a distributed memory architecture, it is very important to perform the matrix-vector product (MVP) operation as fast as possible. In Fig. 7 we present the impact of partitioning quality on the performance of the MVP operation on an MPI architecture for 2, 3, 4, 5, 6, 7, and 8 parts. We can conclude that there is significant speedup for most matrices.
– Comparison of the CBML approach with the original one. In this section we present the advantages of the cell-based multilevel approach over the original one for the problem CIT-2. As was mentioned earlier, this approach tends to generate good partitionings for problems arising from discretization of PDEs in which several unknowns are related to each grid cell. Fig. 8 presents the result of the comparison. One can conclude that the CBML algorithm is preferable over the original one for such systems.

9

Conclusion

We evaluated the performance of our multilevel partitioner for a range of matrices arising from discretization of PDEs. One can conclude that the multilevel technique works quite well. As was mentioned earlier, the coarsening phase requires more than half of the total partitioning time. This fact demonstrates that, in order to effectively parallelize the whole algorithm, some tricks must be employed to parallelize the coarsening phase. In [4] a parallel multilevel algorithm is proposed which is based on graph coloring. Comparing partitioning quality, one can conclude that the best partitionings are generated when the HEM algorithm is used to find the edges to contract. In addition, the obtained partitionings are competitive with those generated by the METIS package.

References 1. C. M. Fidu

ia and R. M. Mattheyses, A linear time heuristi for improving network partitions. In: Pro . 19th IEEE Design Automation Conferen e, 1982, pp. 175-181. 2. B. Hendri kson and R. Leland, A Multilevel Algorithm for Partitioning Graphs. Te h. report SAND93-1301, Sandia National Laboratories, Albuquerque, NM, 1993. 3. G. Karypis and V. Kumar, Multilevel Graph Partition and Sparse Matrix Ordering. In: Intl. Conf. on Parallel Pro essing, 1995.

Multilevel Algorithm for Graph Partitioning

449

4. G. Karypis and V. Kumar, A parallel algorithm for multilevel graph partitioning and sparse matrix ordering. In: J. Parallel and Distributed Computing, 1998, No. 48, pp. 71-95.
5. B. W. Kernighan and S. Lin, An efficient heuristic procedure for partitioning graphs. In: Bell Sys. Tech. J., 1970, No. 49, pp. 291-307.
6. O. Ore, Theory of Graphs. AMS Colloquium Publications 38. AMS, 1962.

2D-extension of Singular Spectrum Analysis: algorithm and elements of theory N. E. Golyandina⋆ and K. D. Usevi h⋆⋆ Mathemati al Department, St. Petersburg State University, Universitetskij pr. 28, St. Petersburg Petrodvorets 198504, Russia ⋆ nina@gistatgroup.com, ⋆⋆ usevich.k.d@gmail.com

Abstract. Singular Spectrum Analysis is a nonparametric method which allows one to solve problems like decomposition of a time series into a sum of interpretable components, extraction of periodic components, noise removal and others. In this paper, the algorithm and theory of the SSA method are extended to analyse two-dimensional arrays (e.g. images). The 2D-SSA algorithm based on the SVD of a Hankel-block-Hankel matrix is introduced. Another formulation of the algorithm by means of the Kronecker-product SVD is presented. Basic SSA notions such as separability are considered. Results on ranks of Hankel-block-Hankel matrices generated by exponential, sine-wave and polynomial 2D-arrays are obtained. An example of 2D-SSA application is presented.

Keywords: Singular Spectrum Analysis, image analysis, Hankel-block-Hankel matrix, separability, finite rank, Singular Value Decomposition, Kronecker-product SVD.

1

Introduction

The purpose of this paper is to extend the SSA (Singular Spectrum Analysis) algorithm and theory developed in [7] to the case of two-dimensional arrays of data (i.e. real-valued functions of two variables defined on a Cartesian grid). Monochrome digital images are a standard example here. Singular Spectrum Analysis is a well-known model-free technique for the analysis of real-valued time series. Basically, SSA is an exploratory method intended to perform decomposition of a time series into a sum of interpretable components, such as trend, periodicities and noise (see [3, 4, 7] for more details). SSA has proved to be successful for such tasks. Moreover, there are several SSA extensions for time series forecasting, change-point detection, missing values imputation and so on. These are the reasons to believe that the two-dimensional extension of SSA (2D-SSA, first presented in [6]) has similar capabilities. However, its application was hampered by the lack of theory, which this paper is intended to reduce. Suppose we observe a 2D-array of data (a real matrix) being a sum of unknown components F = F(1) + . . . + F(m). The general task of the 2D-SSA


algorithm is to produce a decomposition

  F = F̃(1) + . . . + F̃(m),    (1)

where the terms approximate the initial components. In §2 we present the algorithm of 2D-SSA. First of all, the algorithm is formulated based on the SVD of the Hankel-block-Hankel (HbH, for short) matrix generated by the input 2D-array. However, another, equivalent representation of the algorithm fits better for examination and analysis. It is based on the decomposition of a matrix into a sum of Kronecker products. The key step of the algorithm is the grouping of terms of the SVD. This step governs the resulting decomposition (1). The main problems of grouping are: the possibility of proper grouping and the identification of terms in the SVD. These problems are discussed in §2.4 and investigated in §3 and §4. In §3 we study the notion of separability inherited from the 1D case. Separability means the possibility to extract constituents from their sum by 2D-SSA. We also provide a brief review of results on one-dimensional separability as the basis for results in the 2D case. Section 4 deals with the so-called 2D-SSA rank of a 2D-array, defined as the number of SVD terms corresponding to the 2D-array and equal to the rank of the Hankel-block-Hankel matrix generated by the 2D-array. This number is important, as it should be taken into account when performing identification. We provide rank calculations for different 2D-arrays: exponents, polynomials and sine-waves. In §5 we demonstrate 2D-SSA notions by an example of periodic noise removal.

General definitions

First of all, let us review definitions that will be used throughout this paper. The following operator is widely used in the SSA theory and is quite helpful for the 2D-SSA algorithm formulation.

Definition 1. Let A = (aij)_{i,j=1}^{m,n} ∈ Mm,n(Q) be a matrix over a Euclidean space Q. The hankelization operator HQ : Mm,n(Q) → Mm,n(Q) is defined by

           ã1   ã2    . . .  ãn
  HQ A =   ã2   ã3    . . .  ãn+1
           . . .
           ãm   ãm+1  . . .  ãm+n−1

where

  ãk = ( Σ_{(i,j)∈Dk} aij ) / #Dk,   Dk = {(i, j) : 1 ≤ i ≤ m, 1 ≤ j ≤ n, i + j = k + 1}.
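As a concrete illustration of Definition 1 (our code, not part of the paper), the following NumPy sketch implements the hankelization operator H^R for real matrices by averaging along secondary diagonals; the name `hankelize` is a hypothetical one chosen here.

```python
import numpy as np

def hankelize(A):
    """H^R of Definition 1 with Q = R: replace every entry by the mean
    over its secondary diagonal D_k (entries with equal i + j), which
    yields the nearest Hankel matrix in the Frobenius norm."""
    m, n = A.shape
    H = np.empty((m, n))
    for s in range(m + n - 1):                  # s = i + j in 0-based terms
        idx = [(i, s - i) for i in range(m) if 0 <= s - i < n]
        mean = sum(A[i, j] for i, j in idx) / len(idx)
        for i, j in idx:
            H[i, j] = mean
    return H

# A Hankel matrix is a fixed point of the operator.
A = np.array([[0., 1., 2.], [1., 2., 3.]])
assert np.allclose(hankelize(A), A)
```

Since the operator is an orthogonal projection, applying it twice changes nothing, which gives a quick sanity check.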


Further, we will denote by Mm,n = Mm,n(R) the space of real m × n matrices equipped with the Frobenius inner product

  ⟨X, Y⟩M = Σ_{i=1}^{m} Σ_{j=1}^{n} xij yij,    (2)

where X = (xij)_{i,j=1}^{m,n}, Y = (yij)_{i,j=1}^{m,n} ∈ Mm,n. Let us introduce an isomorphism between Mm,n and R^{mn}.

Definition 2. The vectorization (see, for instance, [8]) of A = (aij)_{i,j=1}^{m,n} ∈ Mm,n is given by

  vec A = (a11, . . . , am1; a12, . . . , am2; . . . ; a1n, . . . , amn)^T.    (3)

Definition 3. The (m, n)-matricizing of X ∈ R^{mn}, denoted by matr_{m,n}(X), is defined to be the A ∈ Mm,n satisfying vec A = X.

Then, recall the operation of the Kronecker product [8, 9].

Definition 4. For A = (aij)_{i,j=1}^{m,n} ∈ Mm,n and B = (bkl)_{k,l=1}^{p,q} ∈ Mp,q, their Kronecker product is, by definition,

           a11 B  . . .  a1n B
  A ⊗ B =  . . .
           am1 B  . . .  amn B    (4)

Finally, we need an isomorphism between classes of block matrices.

Definition 5. The rearrangement R : Mmp,nq → Mpq,mn is defined as R(C) = D ∈ Mpq,mn, where

  (D)_{i+(j−1)p, k+(l−1)m} = (C)_{i+(k−1)p, j+(l−1)q}    (5)

for 1 ≤ i ≤ p, 1 ≤ j ≤ q, 1 ≤ k ≤ m, 1 ≤ l ≤ n. Note that the introduced rearrangement of a matrix is the transpose of the rearrangement defined in [2]. The following properties of the rearrangement are quite useful, despite being easily checked.

– Let A = (aij)_{i,j=1}^{m,n} ∈ Mm,n and B = (bkl)_{k,l=1}^{p,q} ∈ Mp,q. Then

  R(A ⊗ B) = vec B (vec A)^T.    (6)

– For any C ∈ Mmp,nq,

  ‖R(C)‖M = ‖C‖M.    (7)
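Properties (6) and (7) are easy to check numerically. The NumPy sketch below (our code; `rearrange` and `vec` are hypothetical names, and the loops translate Definition 5 into 0-based indices) verifies both on random matrices.

```python
import numpy as np

def rearrange(C, m, n, p, q):
    """The rearrangement R : M_{mp,nq} -> M_{pq,mn} of Definition 5,
    in 0-based indices: D[i + j*p, k + l*m] = C[i + k*p, j + l*q]."""
    D = np.empty((p * q, m * n))
    for i in range(p):
        for j in range(q):
            for k in range(m):
                for l in range(n):
                    D[i + j * p, k + l * m] = C[i + k * p, j + l * q]
    return D

def vec(A):
    return A.flatten(order="F")   # stack the columns, as in Definition 2

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))   # m x n
B = rng.standard_normal((4, 5))   # p x q
C = np.kron(A, B)
# Property (6): R(A ⊗ B) = vec B (vec A)^T
assert np.allclose(rearrange(C, 2, 3, 4, 5), np.outer(vec(B), vec(A)))
# Property (7): R preserves the Frobenius norm
assert np.isclose(np.linalg.norm(rearrange(C, 2, 3, 4, 5)), np.linalg.norm(C))
```

Property (6) is the reason the rearrangement turns Kronecker-product structure into rank-one structure, which is used repeatedly below.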

2

2D-SSA

2.1

Basic algorithm

Consider a 2D-array of data

       f(0, 0)        f(0, 1)        . . .  f(0, Ny − 1)
  F =  f(1, 0)        f(1, 1)        . . .  f(1, Ny − 1)
       . . .
       f(Nx − 1, 0)   f(Nx − 1, 1)   . . .  f(Nx − 1, Ny − 1)

The algorithm is based on the SVD of a Hankel-block-Hankel (HbH) matrix constructed from the 2D-array. The dimensions of the HbH matrix are defined by the window sizes (Lx, Ly), which are restricted by 1 ≤ Lx ≤ Nx, 1 ≤ Ly ≤ Ny and 1 < Lx Ly < Nx Ny. Let Kx = Nx − Lx + 1 and Ky = Ny − Ly + 1 for convenience of notation.

Embedding

At this step, the input 2D-array is arranged into a Hankel-block-Hankel matrix of size Lx Ly × Kx Ky:

       H0      H1    H2   . . .  HKy−1
       H1      H2    H3   . . .  HKy
  W =  H2      H3    H4   . . .  HKy+1    (8)
       . . .
       HLy−1   HLy   .    . . .  HNy−1

where

        f(0, j)        f(1, j)   . . .  f(Kx − 1, j)
  Hj =  f(1, j)        f(2, j)   . . .  f(Kx, j)
        . . .
        f(Lx − 1, j)   f(Lx, j)  . . .  f(Nx − 1, j)

Obviously, there is a one-to-one correspondence between 2D-arrays of size Nx × Ny and HbH matrices (8). Let us call the matrix W the Hankel-block-Hankel matrix generated by the 2D-array F.

SVD

Then, the SVD is applied to the Hankel-block-Hankel matrix (8):

  W = Σ_{i=1}^{d} √λi Ui Vi^T.    (9)

Here λi (1 ≤ i ≤ d) are the non-zero eigenvalues of the matrix WW^T arranged in decreasing order λ1 ≥ λ2 ≥ · · · ≥ λd > 0; {U1, . . . , Ud} is a system of eigenvectors of the matrix WW^T, orthonormal in R^{Lx Ly}; {V1, . . . , Vd} is an orthonormal system of vectors in R^{Kx Ky}, hereafter called factor vectors. The factor vectors can be expressed as follows: Vi = W^T Ui / √λi. The triple (√λi, Ui, Vi) is said to be the ith eigentriple. Note that √λi is called a singular value of the matrix W.

Grouping

After specifying m disjoint subsets of indices Ik (groups of eigentriples),

  I1 ∪ I2 ∪ · · · ∪ Im = {1, . . . , d},    (10)

one obtains the decomposition of the HbH matrix

  W = Σ_{k=1}^{m} W_{Ik},  where  W_I = Σ_{i∈I} √λi Ui Vi^T.    (11)

This is the most important step of the algorithm, as it controls the resulting decomposition of the input 2D-array. The problem of proper grouping of the eigentriples will be discussed further (in §2.4).

Projection

The Projection step is necessary in order to obtain a decomposition (1) of the input 2D-array from the decomposition (11) of the HbH matrix. Firstly, the matrices W_{Ik} are reduced to Hankel-block-Hankel matrices W̃_{Ik}. Secondly, the 2D-arrays F̃_{Ik} are obtained from W̃_{Ik} by the one-to-one correspondence. The matrices W̃_{Ik}, in their turn, are obtained by orthogonal projection of the matrices W_{Ik}, in the Frobenius norm (2), onto the linear space of block-Hankel Lx Ly × Kx Ky matrices with Hankel Lx × Kx blocks. The orthogonal projection of

       Z1,1    Z1,2    . . .  Z1,Ky
  Z =  Z2,1    Z2,2    . . .  Z2,Ky      with blocks Zi,j ∈ M_{Lx,Kx},
       . . .
       ZLy,1   ZLy,2   . . .  ZLy,Ky

can be expressed as the two-step hankelization

  Z̃ = H^{M_{Lx,Kx}} ( H^R Zi,j )_{i,j=1}^{Ly,Ky},

i.e. H^R is first applied to each block, and then the hankelization over the space M_{Lx,Kx} is applied to the resulting block matrix.

In other words, the hankelization is applied first to the blocks (within-block hankelization) and then to the whole matrix, i.e. the blocks on secondary diagonals are averaged between themselves (between-block hankelization). Certainly, the hankelization operators can be applied in the reversed order. Thus, the result of the algorithm is

  F = Σ_{k=1}^{m} F̃_{Ik}.    (12)
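The four steps of the algorithm can be sketched end-to-end in NumPy. This is a minimal illustration under our own choices (array size, window sizes, and the trivial grouping of all eigentriples into one group, which must reproduce the input exactly), not the authors' implementation; the projection step is realized here as averaging of window entries, which amounts to the two-step hankelization described above.

```python
import numpy as np

def hbh(F, Lx, Ly):
    """Embedding: build the Lx*Ly x Kx*Ky Hankel-block-Hankel matrix (8).
    Column k + l*Kx holds the vectorized Lx x Ly window at (k, l)."""
    Nx, Ny = F.shape
    Kx, Ky = Nx - Lx + 1, Ny - Ly + 1
    W = np.empty((Lx * Ly, Kx * Ky))
    for l in range(Ky):
        for k in range(Kx):
            W[:, k + l * Kx] = F[k:k + Lx, l:l + Ly].flatten(order="F")
    return W

def reconstruct(Wgroup, Nx, Ny, Lx, Ly):
    """Projection: average all entries of a grouped matrix W_I that
    correspond to the same entry (i, j) of the 2D-array, i.e. the
    two-step hankelization followed by the one-to-one correspondence."""
    Kx, Ky = Nx - Lx + 1, Ny - Ly + 1
    acc = np.zeros((Nx, Ny))
    cnt = np.zeros((Nx, Ny))
    for l in range(Ky):
        for k in range(Kx):
            win = Wgroup[:, k + l * Kx].reshape(Lx, Ly, order="F")
            acc[k:k + Lx, l:l + Ly] += win
            cnt[k:k + Lx, l:l + Ly] += 1
    return acc / cnt

rng = np.random.default_rng(1)
F = rng.standard_normal((6, 5))
Lx, Ly = 3, 2
W = hbh(F, Lx, Ly)
U, s, Vt = np.linalg.svd(W, full_matrices=False)   # SVD step, cf. (9)
I_all = s > 1e-12                                  # trivial grouping (10)
W_I = (U[:, I_all] * s[I_all]) @ Vt[I_all]         # grouped matrix, cf. (11)
F_rec = reconstruct(W_I, 6, 5, Lx, Ly)             # projection, cf. (12)
assert np.allclose(F_rec, F)
```

With a non-trivial grouping, the same `reconstruct` call applied to each W_{Ik} produces the additive components F̃_{Ik} of (12).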


A component F̃_{Ik} is said to be the 2D-array reconstructed from the eigentriples with indices Ik.

2.2

Algorithm: Kronecker products

Let us examine the algorithm in terms of tensors and matrix Kronecker products.

Embedding

Columns of the Hankel-block-Hankel matrix W generated by the 2D-array F can be treated as vectorized Lx × Ly submatrices (moving 2D windows) of the input 2D-array F (see Fig. 1).

Fig. 1. Moving 2D windows

More precisely, if Wm stands for the mth column of the Hankel-block-Hankel matrix W = [W1 : . . . : W_{Kx Ky}], then

  W_{k+(l−1)Kx} = vec(Fk,l)  for 1 ≤ k ≤ Kx, 1 ≤ l ≤ Ky,    (13)

where Fk,l denotes the Lx × Ly submatrix beginning from the entry (k, l):

         f(k − 1, l − 1)        . . .  f(k − 1, l + Ly − 2)
  Fk,l = . . .                                                  (14)
         f(k + Lx − 2, l − 1)   . . .  f(k + Lx − 2, l + Ly − 2)

An analogous equality holds for the rows of the Hankel-block-Hankel matrix W. Let W^n be the nth row of the matrix W = [W^1 : . . . : W^{Lx Ly}]^T. Then

  W^{i+(j−1)Lx} = vec(F^{i,j})  for 1 ≤ i ≤ Lx, 1 ≤ j ≤ Ly,    (15)

where F^{i,j} denotes the Kx × Ky submatrix beginning from the entry (i, j). Basically, the HbH matrix is a 2D representation of the 4th-order tensor X^{ij}_{kl}:

  X^{ij}_{kl} = (Fk,l)_{i,j} = (F^{i,j})_{k,l} = f(i + k − 2, j + l − 2),    (16)

and the SVD of the matrix W is an orthogonal decomposition of this tensor. Another 2D representation of the tensor X^{ij}_{kl} can be obtained by the rearrangement


(5) of W:

              F1,1    F1,2    . . .  F1,Ky
  X = R(W) =  . . .                            (17)
              FKx,1   FKx,2   . . .  FKx,Ky

Let us call this block Lx Kx × Ly Ky matrix the 2D-trajectory matrix and formulate the subsequent steps of the algorithm in terms of 2D-trajectory matrices.

SVD

First of all, recall that the eigenvectors {Ui}_{i=1}^{d} form an orthonormal basis of span(W1, . . . , W_{Kx Ky}) and the factor vectors {Vi}_{i=1}^{d} form an orthonormal basis of span(W^1, . . . , W^{Lx Ly}). Consider the matrices

  Ψi = matr_{Lx,Ly}(Ui) ∈ M_{Lx,Ly},   Φi = matr_{Kx,Ky}(Vi) ∈ M_{Kx,Ky},

and call Ψi and Φi eigenarrays and factor arrays, respectively. It is easily seen that the systems {Ψi}_{i=1}^{d} and {Φi}_{i=1}^{d} form orthogonal bases of span({Fk,l}_{k,l=1}^{Kx,Ky}) and span({F^{i,j}}_{i,j=1}^{Lx,Ly}) (see (13) and (15)). Moreover, by (6) one can rewrite the SVD step of the algorithm as a decomposition of the 2D-trajectory matrix:

  X = Σ_{i=1}^{d} Xi = Σ_{i=1}^{d} √λi Φi ⊗ Ψi.    (18)

The decomposition is biorthogonal and has the same optimality properties as the SVD (see [2]). We will call it the Kronecker-product SVD (KP-SVD for short).

Grouping

The Grouping step in terms of Kronecker products has exactly the same form as (11). Choosing m disjoint subsets Ik (10), one obtains the grouped expansion

  X = Σ_{k=1}^{m} X_{Ik},  where  X_I = Σ_{i∈I} √λi Φi ⊗ Ψi.    (19)

Note that in practice it is more convenient to perform the grouping step on the basis of Ψi and Φi (instead of Ui and Vi), since they are two-dimensional, as is the input 2D-array.

Projection

It follows from (18) and (6) that the matrices X_{Ik} are rearrangements of the corresponding matrices W_{Ik}. Since the rearrangement R preserves the Frobenius inner product, the resulting 2D-arrays F̃_{Ik} in (12) can be expressed through orthogonal projections, in the Frobenius norm, of the matrices X_{Ik} onto the linear subspace of 2D-trajectory matrices (17), followed by the one-to-one correspondence between 2D-arrays and matrices of the form (17).

2.3

Special cases

Here we will consider some special cases of 2D-SSA. It happens that these special cases describe most of the well-known SSA-like algorithms.

2.3.1 1D sequences: SSA for time series. The first special case occurs when the input array has only one dimension, namely, it is a one-dimensional finite real-valued sequence (1D-sequence for short):

  F = (f(0, 0), . . . , f(Nx − 1, 0))^T.    (20)

In this case, the 2D-SSA algorithm coincides with the original SSA algorithm [7] applied to the same data. Let us briefly describe the SSA algorithm in its standard notation, denoting f(i, 0) by fi and Nx by N. The only parameter L = Lx is called the window length. Let K = N − L + 1 = Kx. The algorithm consists of four steps (the same as those of 2D-SSA). The result of the Embedding step is the Hankel matrix

       f0    f1    f2    . . .  fK−1
       f1    f2    f3    . . .  fK
  W =  f2    f3    f4    . . .  fK+1    (21)
       . . .
       fL−1  fL    fL+1  . . .  fN−1

This matrix is called the trajectory matrix¹. The SVD and Decomposition steps are exactly the same as in the 2D case. Projection in the 1D case is formulated as the one-step hankelization H^R.

2.3.2 Extreme window sizes. Let us return to the general 2D-array case, when Nx, Ny > 1. Consider extreme window sizes: (a) Lx = 1 or Lx = Nx; (b) Ly = 1 or Ly = Ny.

1. If conditions (a) and (b) are both met, then, due to the condition 1 < Lx Ly < Nx Ny, we get (Lx, Ly) = (Nx, 1) or (Lx, Ly) = (1, Ny). In this case, the HbH matrix W coincides with the 2D-array F itself or with its transpose. Thus, the algorithm of 2D-SSA is reduced to a grouping of the SVD components of the 2D-array F. This technique is used in image processing and works well for 2D-arrays that are products of 1D-sequences (f(i, j) = pi qj).
2. Consider the case when either (a) or (b) is met; let it be (b). Without loss of generality, we can assume that Ly = 1 and 1 < Lx < Nx. Then the HbH matrix W generated by F consists of stacked Hankel matrices

  W = [H0 : H1 : . . . : H_{Ny−1}]

¹ In the SSA literature, the trajectory matrix is usually denoted by X.


and we come to the algorithm of MSSA [4, 6, 10] for simultaneous decomposition of multiple time series. More precisely, we treat the 2D-array as a set of time series arranged into columns and apply the MSSA algorithm with parameter Lx to this set of series. Practically, MSSA is preferable to the general 2D-SSA if we expect only one dimension of the input 2D-array to be `structured'.

2.3.3 Product of 1D sequences. In §2.3.1, we have shown that SSA for time series can be considered as a special case of 2D-SSA. However, we can establish another relation between SSA and 2D-SSA. Consider the outer product of 1D-sequences as an important particular case of 2D-arrays: f(i, j) = pi qj. Products of 1D-sequences are of great importance for the general case of 2D-SSA, as we can study properties (e.g. separability) of sums of products of 1D-sequences based on the properties of the factors. The main fact here is that a 2D-SSA decomposition of the 2D-array F = (f(i, j))_{i,j=0}^{Nx−1,Ny−1} can be expressed through SSA decompositions of the 1D-sequences (pi)_{i=0}^{Nx−1} and (qj)_{j=0}^{Ny−1}. In matrix notation, the product of two 1D-sequences P = (p0, . . . , p_{Nx−1})^T and Q = (q0, . . . , q_{Ny−1})^T is F = PQ^T. Let us fix window sizes (Lx, Ly) and denote by W^(p) and W^(q) the Hankel matrices generated by P and Q respectively:

           p0     p1   . . .  pKx−1
  W^(p) =  p1     p2   . . .  pKx
           . . .
           pLx−1  pLx  . . .  pNx−1

           q0     q1   . . .  qKy−1
  W^(q) =  q1     q2   . . .  qKy
           . . .
           qLy−1  qLy  . . .  qNy−1

Then the Hankel-block-Hankel matrix W generated by the 2D-array F is W = W^(q) ⊗ W^(p).

Thus, the following theorem holds.

Theorem 1 ([9, Th. 13.10]). Let W^(p) and W^(q) have singular value decompositions

  W^(p) = Σ_{m=1}^{dp} √λm^(p) Um^(p) (Vm^(p))^T,   W^(q) = Σ_{n=1}^{dq} √λn^(q) Un^(q) (Vn^(q))^T.    (22)

Then

  W = Σ_{m=1}^{dp} Σ_{n=1}^{dq} √(λm^(p) λn^(q)) (Un^(q) ⊗ Um^(p)) (Vn^(q) ⊗ Vm^(p))^T    (23)

yields a singular value decomposition of the matrix W, after rearranging of its terms (in decreasing order of λm^(p) λn^(q)).
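The identity W = W^(q) ⊗ W^(p), on which Theorem 1 rests, can be checked numerically. The sketch below is our own illustration (helper names `hankel_traj` and `hbh` and the sizes are assumptions, not the paper's notation): it builds both sides for a random product array.

```python
import numpy as np

def hankel_traj(x, L):
    """L x K Hankel (trajectory) matrix of a 1D sequence, K = N - L + 1."""
    N = len(x)
    K = N - L + 1
    return np.array([[x[i + j] for j in range(K)] for i in range(L)], dtype=float)

def hbh(F, Lx, Ly):
    """Embedding step of Section 2.1: the HbH matrix (8), with column
    k + l*Kx equal to the vectorized Lx x Ly window at (k, l)."""
    Nx, Ny = F.shape
    Kx, Ky = Nx - Lx + 1, Ny - Ly + 1
    W = np.empty((Lx * Ly, Kx * Ky))
    for l in range(Ky):
        for k in range(Kx):
            W[:, k + l * Kx] = F[k:k + Lx, l:l + Ly].flatten(order="F")
    return W

rng = np.random.default_rng(2)
P = rng.standard_normal(7)
Q = rng.standard_normal(6)
F = np.outer(P, Q)                       # f(i, j) = p_i q_j
Lx, Ly = 3, 2
Wp = hankel_traj(P, Lx)
Wq = hankel_traj(Q, Ly)
# The HbH matrix of the product array equals the Kronecker product W(q) ⊗ W(p).
assert np.allclose(hbh(F, Lx, Ly), np.kron(Wq, Wp))
```

Combined with the fact that singular values of a Kronecker product are the products of singular values of the factors, this is exactly the content of Theorem 1.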

2.4

Comments on Grouping step

Let us now discuss perhaps the most sophisticated point of the algorithm: the grouping of the eigentriples. Rules for grouping are not defined within the 2D-SSA algorithm, and this step is supposed to be performed by hand, on the basis of theoretical results. The way of grouping depends on the task one has to solve. The general task of 2D-SSA is to extract additive components from the observed 2D-array. Let us try to formalize this task. Suppose we observe a sum of 2D-arrays: F = F(1) + . . . + F(m). For example, F is a sum of a smooth surface, regular fluctuations and noise. When applying the 2D-SSA algorithm to F, we have to group somehow the eigentriples (i.e. to group the terms of (9) or (18)) at the Grouping step. The problems arising here are:

– Is it possible to group the eigentriples so as to obtain the initial decomposition of F into the F(k)?
– How can one identify the eigentriples corresponding to a component F(k)?

In order to answer the first question, we introduce the notion of separability of the 2D-arrays F(1), . . . , F(m) by 2D-SSA (following the 1D case [7]) as the possibility to extract them from their sum. In other words, we call the set of 2D-arrays separable if the answer to the first question is positive. In §3.1 we present the strict definition of separability and study its properties. In §3.2 we review some facts on separability of time series (the 1D-SSA case), establish a link between the 1D-SSA and 2D-SSA cases and deduce several important examples of 2D-SSA separability (§3.3). For practical reasons, we discuss approximate and asymptotic separability. If components are separable, then we come to the second question: how to perform an appropriate grouping? The main idea is based on the following fact: the eigenarrays {Ψi}_{i∈Ik} and factor arrays {Φi}_{i∈Ik} corresponding to a component F(k) can be expressed as linear combinations of submatrices of the component. We can conclude that they repeat the form of the component F(k). For example, smooth surfaces produce smooth eigenarrays (and factor arrays), periodic components generate periodic eigenarrays, and so on. In §3.4 we also describe the tool of weighted correlations for checking separability a posteriori. This tool can provide an additional hint for grouping. Another matter of concern is the number of eigentriples we have to gather to obtain a component F(k). This number is called the 2D-SSA rank of the 2D-array F(k) and is equal to the rank of the HbH matrix generated by F(k). Actually, we are interested in separable 2D-arrays. Clearly, they have rank-deficient HbH matrices in the non-trivial case. This class of 2D-arrays has an important subclass: the 2D-arrays keeping their 2D-SSA rank constant within a range of window sizes. In the 1D case (see §2.3.1) the HbH matrices are Hankel and this subclass coincides with the whole class. For the general 2D case this is not so. However, 2D-arrays from the subclass defined above are of considerable interest,


since the number of eigentriples they produce does not depend on the choice of window sizes. §4 contains several examples of such 2D-arrays and rank calculations for them.

3

2D separability

This section deals with the problem of separability stated in §2.4 as the possibility to extract terms from the observed sum. We consider the problem of separability for two 2D-arrays, F(1) and F(2). Let us fix window sizes (Lx, Ly) and consider the SVD of the HbH matrix W generated by F = F(1) + F(2):

  W = Σ_{i=1}^{d} √λi Ui Vi^T.

If we denote by W(1) and W(2) the Hankel-block-Hankel matrices generated by F(1) and F(2), then the problem of separability can be formulated as follows: does there exist a grouping {I1, I2} such that

  W(1) = Σ_{i∈I1} √λi Ui Vi^T  and  W(2) = Σ_{i∈I2} √λi Ui Vi^T.    (24)

The important point to note here is that if W has equal singular values, then the SVD of W is not unique. For this reason, we introduce two notions (in the same fashion as in [7]): strong and weak separability. Strong separability means that any SVD of the matrix W allows the desired grouping, while weak separability means that there exists such an SVD.

3.1

Basic definitions

Let L(m,n) = L(m,n)(G) denote the linear space spanned by the m × n submatrices of a 2D-array G. In particular, for fixed window sizes (Lx, Ly), we have L(Lx,Ly)(F) = span({Fk,l}) and L(Kx,Ky)(F) = span({F^{i,j}}).

Definition 6. Two 2D-arrays F(1) and F(2) with equal sizes are weakly (Lx, Ly)-separable if

  L(Lx,Ly)(F(1)) ⊥ L(Lx,Ly)(F(2))  and  L(Kx,Ky)(F(1)) ⊥ L(Kx,Ky)(F(2)).

Due to properties of SVDs, Definition 6 means that if F(1) and F(2) are weakly separable, then the sum of the SVDs of W(1) and W(2) (24) is an SVD of W. We also introduce the definition of strong separability.

Definition 7. We call two 2D-arrays F(1) and F(2) strongly separable if they are weakly separable and the sets of singular values of their Hankel-block-Hankel matrices do not intersect.


Hereafter we will speak mostly about weak separability and will say `separability' for short.

Remark 1. The set of 2D-arrays separable from a fixed 2D-array F is a linear space.

Since exact separability is not feasible, let us introduce approximate separability as almost-orthogonality of the corresponding subspaces. Consider 2D-arrays F and G and fix window sizes (Lx, Ly). As in (14), Fk1,l1 and Gk2,l2 stand for Lx × Ly submatrices of F and G, and F^{i1,j1}, G^{i2,j2} for Kx × Ky submatrices. Let us introduce a distance between two 2D-arrays in order to measure the approximate separability:

  ρ(Lx,Ly)(F, G) = max(ρL, ρK),    (25)

where

  ρK = max_{(k1,l1),(k2,l2)∈JK} ⟨Fk1,l1, Gk2,l2⟩M / (‖Fk1,l1‖M ‖Gk2,l2‖M),   JK = {1, . . . , Kx} × {1, . . . , Ky};
  ρL = max_{(i1,j1),(i2,j2)∈JL} ⟨F^{i1,j1}, G^{i2,j2}⟩M / (‖F^{i1,j1}‖M ‖G^{i2,j2}‖M),   JL = {1, . . . , Lx} × {1, . . . , Ly}.

Remark 2. The 2D-arrays F and G are separable iff ρ(Lx,Ly)(F, G) = 0.
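To make the distance (25) concrete, here is a minimal NumPy sketch (our code; the function name `rho` is hypothetical, and we take absolute values of the normalized inner products so that the quantity is non-negative). It is checked on products of exactly L-separable cosine sequences satisfying the integer-period conditions of Example 2 below, for which ρ must vanish.

```python
import numpy as np

def rho(F, G, Lx, Ly):
    """Separability distance (25): the largest normalized Frobenius inner
    product between Lx x Ly (and Kx x Ky) submatrices of F and G."""
    def worst(A, B, wx, wy):
        m = 0.0
        ax, ay = A.shape[0] - wx + 1, A.shape[1] - wy + 1
        for k1 in range(ax):
            for l1 in range(ay):
                S = A[k1:k1 + wx, l1:l1 + wy]
                for k2 in range(ax):
                    for l2 in range(ay):
                        T = B[k2:k2 + wx, l2:l2 + wy]
                        m = max(m, abs(np.sum(S * T)) /
                                   (np.linalg.norm(S) * np.linalg.norm(T)))
        return m
    Kx, Ky = F.shape[0] - Lx + 1, F.shape[1] - Ly + 1
    return max(worst(F, G, Lx, Ly), worst(F, G, Kx, Ky))

# Cosines with frequencies 1/4 and 1/2, N = 11, L = 4, K = 8: Lω and Kω
# are integers for both, so the products below are exactly separable.
n = np.arange(11)
P1 = np.cos(2 * np.pi * n / 4)
P2 = np.cos(2 * np.pi * n / 2)
q = np.ones(3)
F1, F2 = np.outer(P1, q), np.outer(P2, q)
assert rho(F1, F2, 4, 1) < 1e-10
```

The brute-force double loop is only practical for small arrays, but it follows the definition literally, which is the point of the sketch.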

A quite natural way to deal with approximate separability is to study separability of 2D-arrays that is asymptotic in the array sizes, namely `good' approximate separability for relatively big 2D-arrays. Consider two infinite 2D-arrays F = (fij)_{i,j=0}^{∞,∞} and G = (gij)_{i,j=0}^{∞,∞}. Let F|m,n and G|m,n denote finite submatrices of the infinite 2D-arrays F and G: F|m,n = (fij)_{i,j=0}^{m−1,n−1}, G|m,n = (gij)_{i,j=0}^{m−1,n−1}.

Definition 8. F and G are said to be asymptotically separable if

  lim_{Nx,Ny→∞} ρ(Lx,Ly)(F|Nx,Ny, G|Nx,Ny) = 0    (26)

for any Lx = Lx(Nx, Ny) and Ly = Ly(Nx, Ny) such that Lx, Kx, Ly, Ky → ∞ as Nx, Ny → ∞.

3.2

Separability of 1D sequences

Just as the original 1D-SSA algorithm can be treated as a special case of 2D-SSA, the notion of L-separability of time series (originally introduced in [7]) is a special case of (Lx, Ly)-separability.

Remark 3. Time series F(1) = (f0^(1), . . . , f_{N−1}^(1))^T and F(2) = (f0^(2), . . . , f_{N−1}^(2))^T are L-separable if they are (L, 1)-separable as 2D-arrays.


Let us now give several examples of (weak) L-separability, which is thoroughly studied in [7].

Example 1. The sequence F = (f0, . . . , fN−1)^T with fn = cos(2πωn + ϕ) is L-separable from a non-zero constant sequence (c, . . . , c)^T if Lω and Kω, where K = N − L + 1, are integers.

Example 2. Two cosine sequences of length N given by

  fn^(1) = cos(2πω1 n + ϕ1)  and  fn^(2) = cos(2πω2 n + ϕ2)

are L-separable if ω1 ≠ ω2, 0 < ω1, ω2 ≤ 1/2 and Lω1, Lω2, Kω1, Kω2 are integers.

In general, there are only a small number of exact separability examples. Hence, we come to the consideration of approximate separability. It is studied with the help of asymptotic separability of time series, first introduced in [7]. Asymptotic separability is defined in the same fashion as in the 2D case (see Definition 8). The only difference is that we let just one dimension (and parameter) tend to infinity (because the other dimension is fixed).

Example 3. Two cosine sequences given by

  fn^(l) = Σ_{k=0}^{m} ck^(l) cos(2πωk^(l) n + ϕk^(l)),   0 < ωk^(l) ≤ 1/2,   l = 1, 2,    (27)

with different frequencies are asymptotically separable.

In Table 1, one can see a short summary on the asymptotic separability of time series.

Table 1. Asymptotic separability

            const  cos  exp  exp cos  poly
  const       −     +    +      +      −
  cos         +     +    +      +      +
  exp         +     +    +      +      +
  exp cos     +     +    +      +      +
  poly        −     +    +      +      −

In this table, const stands for non-zero constant sequences, cos for cosine sequences (27), exp denotes sequences exp(αn), exp cos stands for e^{αn} cos(2πωn + φ), and poly stands for polynomial sequences. Note that the conditions of separability are omitted in the table. For more details, such as conditions, convergence rates, and other types of separability (e.g. stochastic separability of a deterministic signal from white noise), see [7].

3.3

Products of 1D sequences

Let us study separability properties for products of 1D-sequences (introduced in §2.3.3). Consider four 1D-sequences

  P(1) = (p0^(1), . . . , p_{Nx−1}^(1))^T,   Q(1) = (q0^(1), . . . , q_{Ny−1}^(1))^T,
  P(2) = (p0^(2), . . . , p_{Nx−1}^(2))^T,   Q(2) = (q0^(2), . . . , q_{Ny−1}^(2))^T.

Proposition 1. If P(1) and P(2) are Lx-separable, or the sequences Q(1) and Q(2) are Ly-separable, then their products F(1) = P(1)(Q(1))^T and F(2) = P(2)(Q(2))^T are (Lx, Ly)-separable.

Proof. First of all, let us notice that submatrices of the 2D-arrays are products of subvectors of the 1D-sequences:

  F(1)_{k1,l1} = (p_{k1−1}^(1), . . . , p_{k1+Lx−2}^(1))^T (q_{l1−1}^(1), . . . , q_{l1+Ly−2}^(1)),
  F(2)_{k2,l2} = (p_{k2−1}^(2), . . . , p_{k2+Lx−2}^(2))^T (q_{l2−1}^(2), . . . , q_{l2+Ly−2}^(2)).    (28)

Let us recall an important feature of the Frobenius inner product:

  ⟨AB^T, CD^T⟩M = ⟨A, C⟩2 ⟨B, D⟩2,    (29)

where A, B, C, and D are vectors. Applying (29) to (28), we obtain the orthogonality of all Lx × Ly submatrices of the 2D-arrays:

  ⟨F(1)_{k1,l1}, F(2)_{k2,l2}⟩M = 0.

Likewise, all their Kx × Ky submatrices are orthogonal too. According to Remark 2, we conclude that the 2D-arrays F(1) and F(2) are separable, and the proof is complete. ⊓⊔

Furthermore, we can generalize Proposition 1 to approximate and asymptotic separability.

Lemma 1. Under the assumptions of Proposition 1,

  ρ(Lx,Ly)(F(1), F(2)) ≤ ρLx(P(1), P(2)) ρLy(Q(1), Q(2)).

Proof. Equalities (28) and (29) make the proof obvious. ⊓⊔

Proposition 2. Let F(1) and F(2) be products of infinite 1D-sequences:

  F(1) = P(1)(Q(1))^T,   F(2) = P(2)(Q(2))^T,

where P(j) = (p0^(j), . . . , pn^(j), . . .)^T and Q(j) = (q0^(j), . . . , qn^(j), . . .)^T. If P(1), P(2) or Q(1), Q(2) are asymptotically separable, then F(1) and F(2) are asymptotically separable too.


Proof. The proposition follows immediately from Lemma 1. ⊓⊔

The following example of asymptotic separability can be shown using Proposition 2 and Remark 1.

Example 4. The 2D-array given by

  f(1)(i, j) = cos(2πω1 i) ln(j + 1) + ln(i + 1) cos(2πω2 j)

is asymptotically separable from a constant 2D-array f(2)(i, j) = const.

Example 4 demonstrates that separability in the 2D case is more varied than in the 1D case. For instance, nothing but periodic 1D-sequences are separable from a constant sequence. The next example is an analogue of Example 3.

Example 5. Two 2D sine-wave arrays given by

  f(l)(i, j) = Σ_{k=1}^{m} ck^(l) cos(2πω1k^(l) i + ϕ1k^(l)) cos(2πω2k^(l) j + ϕ2k^(l)),   l = 1, 2,

with different frequencies are asymptotically separable by 2D-SSA.

However, the problem of lack of strong separability in the presence of weak separability appears more frequently in the 2D case. The wider the range of eigenvalues of the HbH matrix corresponding to a 2D-array, the more likely is mixing of the components produced by the 2D-array and other constituents. This becomes a problem at the Grouping step. For example, if two 1D-sequences have eigenvalues from the range [λ2, λ1], then the range of eigenvalues of their product, by Proposition 1, is wider: [λ2², λ1²].

3.4

Checking the separability: weighted correlations

Following the 1D case, we introduce a necessary condition of separability which can be applied in practice.

Definition 9. A weighted inner product of 2D-arrays F(1) and F(2) is defined as follows:

  ⟨F(1), F(2)⟩w = Σ_{i=0}^{Nx−1} Σ_{j=0}^{Ny−1} f(1)(i, j) · f(2)(i, j) · wx(i) · wy(j),

where

  wx(i) = min(i + 1, Lx, Kx, Nx − i)  and  wy(j) = min(j + 1, Ly, Ky, Ny − j).


In fact, the functions wx(i) and wy(j) give the number of entries on the secondary diagonals of Hankel Lx × Kx and Ly × Ky matrices respectively. More precisely,

  wx(i) = #{(k, l) : 1 ≤ k ≤ Kx, 1 ≤ l ≤ Lx, k + l = i + 2},
  wy(j) = #{(k, l) : 1 ≤ k ≤ Ky, 1 ≤ l ≤ Ly, k + l = j + 2}.

Hence, for a Hankel-block-Hankel matrix W generated by F, the product wx(i)wy(j) is equal to the number of entries in W corresponding to the entry (i, j) of the 2D-array F. The same holds for the number of entries in the 2D-trajectory matrix X. This observation implies the following proposition.

Proposition 3.

  ⟨F(1), F(2)⟩w = ⟨X(1), X(2)⟩M = ⟨W(1), W(2)⟩M.

With the help of the weighted inner product, we can formulate a necessary condition for separability.

Proposition 4. If F(1) and F(2) are separable, then ⟨F(1), F(2)⟩w = 0.

Finally, we introduce weighted correlations to measure approximate separability, and the matrix of weighted correlations to provide additional information useful for grouping.

Definition 10. A weighted correlation (w-correlation) ρw between two 2D-arrays F(1) and F(2) is defined as

  ρw(F(1), F(2)) = ⟨F(1), F(2)⟩w / (‖F(1)‖w ‖F(2)‖w).

Consider the 2D-array F and apply 2D-SSA with parameters (Lx, Ly). If we choose the maximal grouping (10), namely m = d and Ik = {k}, 1 ≤ k ≤ d, then each F̃_{Ik} is called the kth elementary reconstructed component, and the matrix of weighted correlations R = (rij)_{i,j=1}^{d} is given by

  rij = |ρw(F̃_{Ii}, F̃_{Ij})|.    (30)

For an example of application, see §5.
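Definitions 9 and 10 translate directly into code. The NumPy sketch below is our own illustration (the names `w_inner` and `w_corr` are hypothetical); by Proposition 4 the w-correlation must vanish for products of exactly separable cosine sequences of the kind used in Example 2.

```python
import numpy as np

def w_inner(F1, F2, Lx, Ly):
    """Weighted inner product of Definition 9; wx(i)*wy(j) counts how many
    entries of the HbH matrix correspond to the array entry (i, j)."""
    Nx, Ny = F1.shape
    Kx, Ky = Nx - Lx + 1, Ny - Ly + 1
    wx = np.minimum.reduce([np.arange(Nx) + 1, np.full(Nx, Lx),
                            np.full(Nx, Kx), Nx - np.arange(Nx)])
    wy = np.minimum.reduce([np.arange(Ny) + 1, np.full(Ny, Ly),
                            np.full(Ny, Ky), Ny - np.arange(Ny)])
    return float(np.sum(F1 * F2 * np.outer(wx, wy)))

def w_corr(F1, F2, Lx, Ly):
    """w-correlation of Definition 10."""
    return w_inner(F1, F2, Lx, Ly) / np.sqrt(
        w_inner(F1, F1, Lx, Ly) * w_inner(F2, F2, Lx, Ly))

# Separable cosine products (frequencies 1/4 and 1/2, N = 11, L = 4).
n = np.arange(11)
F1 = np.outer(np.cos(2 * np.pi * n / 4), np.ones(3))
F2 = np.outer(np.cos(2 * np.pi * n / 2), np.ones(3))
assert abs(w_corr(F1, F2, 4, 1)) < 1e-10
```

In practice one assembles the d × d matrix of |w_corr| values between elementary reconstructed components and looks for blocks of strongly correlated components, which suggests how to group them.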

4

2D-SSA ranks of 2D-arrays. Examples of calculation

4.1

Basic properties

Let us first introduce a definition of the 2D-SSA rank.


Definition 11. The (Lx, Ly)-rank (the 2D-SSA rank for window sizes (Lx, Ly)) of the 2D-array F is defined to be

  rank_{Lx,Ly}(F) = dim L(Lx,Ly) = dim L(Kx,Ky) = rank W.

It is immediate that the (Lx, Ly)-rank is equal to the number of components in the SVD (9) of the Hankel-block-Hankel matrix generated by F. There is another way to express the rank, through the 2D-trajectory matrix (17).

Lemma 2. If for fixed window sizes (L_x, L_y) there exists a representation

X = Σ_{i=1}^{m} A_i ⊗ B_i,   B_i ∈ M_{L_x,L_y},  A_i ∈ M_{K_x,K_y},    (31)

then rank_{L_x,L_y}(F) does not exceed m. Furthermore, if each of the systems {A_i}_{i=1}^{m} and {B_i}_{i=1}^{m} is linearly independent, then rank_{L_x,L_y}(F) = m.

Proof. The proof is evident, since equality (31) can be rewritten as

W = Σ_{i=1}^{m} vec B_i (vec A_i)^T

by (6).  ⊓⊔

By Theorem 1, the 2D-SSA rank of a product of 1D-sequences is equal to the product of their ranks:

rank_{L_x,L_y}(P Q^T) = rank_{L_x}(P) rank_{L_y}(Q),    (32)

where rank_L(·) stands for rank_{L,1}(·). For a sum of products of 1D-sequences, F = Σ_{i=1}^{n} P^{(i)} (Q^{(i)})^T, the 2D-SSA rank is in general not equal to the sum of the products of ranks, due to possible linear dependence of vectors. To calculate 2D-SSA ranks for 2D-arrays of this kind, the following lemma may be useful.

Lemma 3. If for fixed window sizes (L_x, L_y) there exist linearly independent systems {A_j}_{j=1}^{n} and {B_i}_{i=1}^{m} such that

X = Σ_{i,j=1}^{m,n} c_{ij} A_j ⊗ B_i,   B_i ∈ M_{L_x,L_y},  A_j ∈ M_{K_x,K_y},    (33)

then rank_{L_x,L_y}(F) = rank C, where C = (c_{ij})_{i,j=1}^{m,n}.

2D-extension of Singular Spectrum Analysis

Proof. Let us rewrite condition (33) in the same way as in the proof of Lemma 2:

W = Σ_{i,j=1}^{m,n} c_{ij} vec B_i (vec A_j)^T.

If we set A = [vec A_1 : ... : vec A_n] and B = [vec B_1 : ... : vec B_m], then W = B C A^T. Since A and B have linearly independent columns, the ranks of W and C coincide.  ⊓⊔

4.2  Ranks of time series

In the 1D case, the class of series having constant rank within a range of window lengths is called time series of finite rank [7]. This class mostly consists of sums of products of polynomials, exponentials and cosines:

f_n = Σ_{k=1}^{d'} P^{(k)}_{m_k}(n) ρ_k^n cos(2πω_k n + ϕ_k) + Σ_{k=d'+1}^{d} P^{(k)}_{m_k}(n) ρ_k^n.    (34)

Here 0 < ω_k < 0.5, ρ_k ≠ 0, and P_l^{(k)} are polynomials of degree l. The time series (34) form the class of time series governed by linear recurrent formulae (see [3, 7]). It happens that SSA ranks of time series like (34) can be explicitly calculated.

Proposition 5. Let a time series F_N = (f_0, ..., f_{N−1}) be defined by (34) with (ω_k, ρ_k) ≠ (ω_l, ρ_l) for 1 ≤ k, l ≤ d' and ρ_k ≠ ρ_l for d' < k, l ≤ d. Then rank_L(F_N) is equal to

r = 2 Σ_{k=1}^{d'} (m_k + 1) + Σ_{k=d'+1}^{d} (m_k + 1)    (35)

if L ≥ r and K ≥ r.

Proof. Equality (34) can be rewritten as a sum of complex exponentials:

f_n = Σ_{k=1}^{d'} P^{(k)}_{m_k}(n) (α_k λ_k^n + β_k (λ_k')^n) + Σ_{k=d'+1}^{d} P^{(k)}_{m_k}(n) ρ_k^n,

where λ_k = ρ_k e^{2πiω_k}, λ_k' = ρ_k e^{−2πiω_k} and α_k, β_k ≠ 0. The latter equality yields a canonical representation (see [1, §8]) of the Hankel matrix W with rank r. Under the stated conditions on L and K, rank W = r by [1, Theorem 8.1].  ⊓⊔
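Proposition 5 is easy to check numerically for concrete series: form the L × K trajectory (Hankel) matrix and compute its rank. A small sketch (the helper name is ours):

```python
import numpy as np

def hankel_rank(f, L):
    # rank of the L x K trajectory (Hankel) matrix of the series f
    K = len(f) - L + 1
    H = np.array([f[i:i + L] for i in range(K)]).T
    return np.linalg.matrix_rank(H)
```

For f_n = cos(2πωn), formula (35) predicts r = 2 (d' = 1, m_1 = 0), and for f_n = (n + 1)ρ^n it also predicts r = 2 (one exponential term with a polynomial factor of degree 1).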

4.3  Calculation of 2D-SSA ranks

Proposition 5 together with (32) gives the possibility to calculate 2D-SSA ranks for 2D-arrays that are products of 1D-sequences. However, the general 2D case is much more complicated. In this section, we introduce results concerning 2D-SSA ranks of 2D exponential, polynomial and sine-wave arrays. In the examples below, one can observe the effect that the 2D-SSA rank of a 2D-array given by f(i, j) = p^{i+j} is equal to the SSA rank of the sequence (p^i). This is not surprising, since 2D-SSA is in general invariant to rotations (and other linear maps) of the arguments of a 2D-function f(i, j).

4.3.1  Exponents. The result on the rank of a sum of 2D exponentials is quite simple.

Proposition 6. For an exponential 2D-array F = (f(i, j))_{i,j=0}^{N_x−1,N_y−1} defined by

f(i, j) = Σ_{n=1}^{m} c_n ρ_n^i μ_n^j,   ρ_n, μ_n ≠ 0,    (36)

rank_{L_x,L_y}(F) = m if L_x, L_y, K_x, K_y ≥ m and (ρ_l, μ_l) ≠ (ρ_k, μ_k) for l ≠ k.

Proof. The proof is based on Lemma 2. Let us express the entries of the matrix X using equality (16):

(F_{k,l})_{i,j} = f(i + k − 2, j + l − 2) = Σ_{n=1}^{m} c_n ρ_n^{i−1} μ_n^{j−1} ρ_n^{k−1} μ_n^{l−1}.    (37)

It is easy to check that equality (37) defines the decomposition

X = Σ_{n=1}^{m} A_n ⊗ B_n,

where

A_n = (ρ_n^0, ..., ρ_n^{K_x−1})^T (μ_n^0, ..., μ_n^{K_y−1}),
B_n = (ρ_n^0, ..., ρ_n^{L_x−1})^T (μ_n^0, ..., μ_n^{L_y−1}).

Obviously, each of the systems {A_i}_{i=1}^{m} and {B_i}_{i=1}^{m} is linearly independent. Applying Lemma 2 finishes the proof.  ⊓⊔

4.3.2  Polynomials. Let P_m be a polynomial of degree m:

P_m(i, j) = Σ_{s=0}^{m} Σ_{t=0}^{m−s} g_{st} i^s j^t,

where at least one of the leading coefficients g_{s,m−s}, s = 0, ..., m, is non-zero. Consider the 2D-array F of sizes N_x, N_y ≥ 2m + 1 with f(i, j) = P_m(i, j).

Proposition 7. If L_x, L_y, K_x, K_y ≥ m + 1, then

rank_{L_x,L_y}(F) = rank_{m+1,m+1}(G'),

where

G' = ( G''  0 ; 0  0_{m×m} ),   G'' = ( g'_{00} ... g'_{0m} ; ⋮ ⋰ ; g'_{m0} ... 0 ),   g'_{st} = g_{st} s! t!.

In addition, the following inequality holds:

m + 1 ≤ rank_{L_x,L_y}(F) ≤ { (m/2 + 1)^2 for even m;  ((m + 1)/2 + 1)(m + 1)/2 for odd m. }    (38)

Proof. The first part of the proposition is proved in the same way as Proposition 6, except for using Lemma 3 instead of Lemma 2. Let us apply the Taylor formula:

(F_{k,l})_{i,j} = P_m(i + k − 2, j + l − 2)
 = Σ_{s=0}^{m} Σ_{t=0}^{m} (1/(s! t!)) (i − 1)^s (j − 1)^t (∂^{s+t} P_m / ∂i^s ∂j^t)(k − 1, l − 1)    (39)
 = Σ_{s=0}^{m} Σ_{t=0}^{m} ((i − 1)^s / s!) ((j − 1)^t / t!) Σ_{u=0}^{m−s} Σ_{v=0}^{m−t} g_{u+s,v+t} (u + s)! (v + t)! ((k − 1)^u / u!) ((l − 1)^v / v!).

If we set g'_{st} = 0 for s + t ≥ m + 1, then we can rewrite (39) as

X = Σ_{s,t,u,v=0}^{m} g'_{u+s,v+t} A_{u+(m+1)v} ⊗ B_{s+(m+1)t},    (40)

where

A_{u+(m+1)v} = (1/(u! v!)) (0^u, ..., (K_x − 1)^u)^T (0^v, ..., (K_y − 1)^v)  for 0 ≤ u, v ≤ m,
B_{s+(m+1)t} = (1/(s! t!)) (0^s, ..., (L_x − 1)^s)^T (0^t, ..., (L_y − 1)^t)  for 0 ≤ s, t ≤ m.

Let W^{(g)} be the Hankel-block-Hankel matrix generated by G' with window sizes (m + 1, m + 1). Then (40) can be rewritten as

X = Σ_{i,j=0}^{(m+1)^2−1} (W^{(g)})_{ji} A_i ⊗ B_j.

The systems {A_i}_{i=0}^{(m+1)^2−1} and {B_j}_{j=0}^{(m+1)^2−1} are linearly independent due to the restrictions on L_x, L_y. By Lemma 3, the first part of the proposition is proved.

The bounds in (38) can be proved using the fact that

rank_{m+1,m+1}(G') = dim L^{(m+1,m+1)}(G') = dim span {G'_{k,l}}_{k,l=1}^{m+1,m+1},

where G'_{k,l} is the (m + 1) × (m + 1) submatrix of G' beginning at the entry (k, l). Denote by T_n the space of (m + 1) × (m + 1) matrices with zero entries below the nth secondary diagonal:

T_n := {A = (a_{ij})_{i,j=0}^{m,m} ∈ M_{m+1,m+1} : a_{ij} = 0 for i + j > n}.

Then G'_{k,l} belongs to T_n for n ≥ m − (k + l) + 2 and does not, in general, for smaller n. Let us introduce

C_n := span {G'_{k,l}}_{k+l=m−n+2} ⊆ T_n,
S_n := span(C_0, ..., C_n) = span(S_{n−1}, C_n) ⊆ T_n.

Then L^{(m+1,m+1)}(G') = S_m. By the conditions of the theorem, there exists i such that g_{i,m−i} ≠ 0. Hence, there exist C_0, ..., C_m ∈ M_{m+1,m+1} such that C_n ∈ C_n ⊆ T_n, C_n ∉ T_{n−1}. Therefore, the system {C_0, ..., C_m} is linearly independent, and the lower bound is proved.

To prove the upper bound, note that dim S_n ≤ min(dim S_{n−1} + dim C_n, dim T_n). Since dim C_n ≤ m + 1 − n and dim T_n = Σ_{k=1}^{n+1} k, one can show that

dim S_m ≤ Σ_{n=0}^{m} min(n + 1, m − n + 1) = { (m/2 + 1)^2 for even m;  ((m + 1)/2 + 1)(m + 1)/2 for odd m. }  ⊓⊔
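The two extreme cases of inequality (38) for m = 2 (lower bound m + 1 = 3 and upper bound (m/2 + 1)^2 = 4) can be reproduced numerically. A sketch with our own helper, using windows large enough that L_x, L_y, K_x, K_y ≥ m + 1:

```python
import numpy as np

def window_rank(F, Lx, Ly):
    # rank of the matrix of vectorized Lx-by-Ly sliding windows of F,
    # i.e. the (Lx, Ly)-rank of the 2D-array F
    Nx, Ny = F.shape
    cols = [F[a:a + Lx, b:b + Ly].ravel()
            for a in range(Nx - Lx + 1) for b in range(Ny - Ly + 1)]
    return np.linalg.matrix_rank(np.array(cols).T)

k, l = np.meshgrid(np.arange(9.0), np.arange(9.0), indexing='ij')
rank_sum_sq = window_rank((k + l)**2, 4, 4)  # expected: 3
rank_prod = window_rank(k * l, 4, 4)         # expected: 4
```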

Let us demonstrate two examples that meet the bounds in inequality (38) exactly: the 2D-SSA rank of the 2D-array given by f(k, l) = (k + l)^2 (m = 2) is equal to 3, while the 2D-SSA rank for f(k, l) = kl is equal to 4.

4.3.3  Sine-wave 2D-arrays. Consider a sum of sine-wave functions

h_d(k, l) = Σ_{m=1}^{d} A_m(k, l),    (41)

A_m(k, l) = (cos(2πω_m^{(X)} k), sin(2πω_m^{(X)} k)) ( a_m  b_m ; c_m  d_m ) (cos(2πω_m^{(Y)} l), sin(2πω_m^{(Y)} l))^T,    (42)

where 1 ≤ k ≤ N_x, 1 ≤ l ≤ N_y, at least one coefficient in each group {a_m, b_m, c_m, d_m} is non-zero, and the frequencies meet the following conditions:

(ω_n^{(X)}, ω_n^{(Y)}) ≠ (ω_m^{(X)}, ω_m^{(Y)}) for n ≠ m,   ω_m^{(X)}, ω_m^{(Y)} ∈ (0, 1/2).    (43)

Proposition 8. For window sizes (L_x, L_y) such that L_x, L_y, K_x, K_y ≥ 4d, the 2D-SSA rank of F = (h_d(k, l))_{k,l=0}^{N_x−1,N_y−1} is equal to

rank_{L_x,L_y}(F) = Σ_{m=1}^{d} ν_m,   where ν_m = 2 or 4,

and the numbers ν_m can be expressed as

ν_m = 2 rank ( a_m  b_m  c_m  d_m ; d_m  −c_m  −b_m  a_m ).    (44)

Proof. The summands A_m of (42) can be rewritten as a sum of complex exponentials:

4A_m(k, l) = (a_m − d_m − i(c_m + b_m)) e^{2πiω_m^{(X)} k} e^{2πiω_m^{(Y)} l}
 + (a_m − d_m + i(c_m + b_m)) e^{−2πiω_m^{(X)} k} e^{−2πiω_m^{(Y)} l}
 + (a_m + d_m + i(c_m − b_m)) e^{−2πiω_m^{(X)} k} e^{2πiω_m^{(Y)} l}
 + (a_m + d_m − i(c_m − b_m)) e^{2πiω_m^{(X)} k} e^{−2πiω_m^{(Y)} l}.

Note that the coefficients of the first pair of complex exponentials become zero at once if a_m = d_m and b_m = −c_m. The second pair of complex exponentials vanishes if a_m = −d_m and b_m = c_m. Therefore, the number of non-zero coefficients of the complex exponentials corresponding to each summand A_m(k, l) is equal to ν_m defined in (44). Then the 2D-array can be represented as a sum of products:

h_d(k, l) = Σ_{n=1}^{r} x_n y_n^k z_n^l,   r = Σ_{m=1}^{d} ν_m,    (45)

where all the coefficients x_n ∈ C are non-zero, while y_n and z_n have the form y_n = e^{2πiω_n^{(X)}}, z_n = e^{2πiω_n^{(Y)}}, and the pairs (y_n, z_n) are distinct due to conditions (43), namely (y_n, z_n) ≠ (y_m, z_m) for n ≠ m.

Due to [5], the rank of the Hankel-block-Hankel matrix W generated by the 2D-array (45) is equal to r at least for L_x, L_y ≥ 4d.  ⊓⊔

Note that the condition L_x, L_y ≥ 4d is just sufficient for the result of Proposition 8. The same result is valid for a larger range of L_x, L_y; this range depends on the input 2D-array, see [5] for the case of complex exponentials.

Let us apply the proposition to two examples. Let f(k, l) = cos(2πω^{(X)} k + 2πω^{(Y)} l). Then the 2D-SSA rank equals 2. If f(k, l) = cos(2πω^{(X)} k) · cos(2πω^{(Y)} l), then the 2D-SSA rank equals 4.
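Both examples can be reproduced numerically with a few lines of NumPy (the helper name is ours; the frequencies are arbitrary assumed values in (0, 1/2)):

```python
import numpy as np

def window_rank(F, Lx, Ly):
    # rank of the matrix of vectorized Lx-by-Ly sliding windows of F,
    # i.e. the (Lx, Ly)-rank of the 2D-array F
    Nx, Ny = F.shape
    cols = [F[a:a + Lx, b:b + Ly].ravel()
            for a in range(Nx - Lx + 1) for b in range(Ny - Ly + 1)]
    return np.linalg.matrix_rank(np.array(cols).T)

k, l = np.meshgrid(np.arange(16.0), np.arange(16.0), indexing='ij')
wx, wy = 0.23, 0.31  # assumed sample frequencies
plane = np.cos(2 * np.pi * (wx * k + wy * l))                    # planar wave
prod = np.cos(2 * np.pi * wx * k) * np.cos(2 * np.pi * wy * l)   # product
```

The planar wave spans only two complex exponentials, while the product spans four, which is exactly what the ranks show.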

5  Example of analysis

Consider a real-life digital image of Mars (275 × 278) obtained by a web-camera² (see Fig. 2). As one can see, the image is corrupted by a kind of periodic noise, probably sinusoidal due to the possibly electromagnetic nature of the noise. Let us try to extract this noise by 2D-SSA. It is more suitable to use the 2D-trajectory matrix notation here. After choosing window sizes (25, 25) we obtain expansion (18). As we will show, these window sizes are enough for separation of the periodic noise.

² Source: Pierre Thierry

Fig. 2. 2D-array: Mars

Fig. 3. Eigenarrays Ψ1-Ψ20

Let us look at the eigenarrays (Fig. 3). The eigenarrays from the eigentriples with indices N = {13, 14, 16, 17} have a periodic structure similar to the noise. The factor arrays have the same periodicity too. This observation entitles us to believe that these eigentriples constitute the periodic noise. In addition, 4 is a likely rank for sine-wave 2D-arrays.

Fig. 4. Weighted correlations for the leading 30 components

Let us validate our conjecture by examining the plot of the weighted correlation matrix (see Fig. 4). The plot depicts the w-correlations r_ij (30) between the elementary reconstructed components (the top-left corner represents the entry r_11). Values are plotted in grayscale: white stands for 0 and black for 1. The plot contains two blocks uncorrelated with the rest. This means that the sum of the elementary reconstructed components corresponding to indices from N is separable from the rest. Reconstruction of the 2D-array by the set N gives us the periodic noise, while the residual produces a filtered image.

Fig. 5. Reconstructed noise and residual (filtered image)

As a matter of fact, the noise is not purely periodic and is in a sense modulated. This happens due to clipping of the signal value range to [0, 255].

References

1. G. Heinig and K. Rost, Algebraic Methods for Toeplitz-like Matrices and Operators, Akademie Verlag, Berlin, 1984.
2. C. F. Van Loan and N. P. Pitsianis, Approximation with Kronecker products, in M. S. Moonen and G. H. Golub, eds., Linear Algebra for Large Scale and Real Time Applications, Kluwer Publications, pp. 293-314, 1993.
3. V. M. Buchstaber, Time series analysis and grassmannians, in S. Gindikin, ed., Applied Problems of Radon Transform, AMS Transactions, Series 2, Vol. 162, Providence (RI), pp. 1-17, 1994.
4. J. Elsner and A. Tsonis, Singular Spectrum Analysis. A New Tool in Time Series Analysis, Plenum Press, New York, 1996.
5. H. Hua Yang and Y. Hua, On rank of block Hankel matrix for 2-D frequency detection and estimation, IEEE Transactions on Signal Processing, Vol. 44, Issue 4, pp. 1046-1048, 1996.
6. D. Danilov and A. Zhigljavsky, eds., Principal Components of Time Series: the "Caterpillar" Method, St. Petersburg State University, St. Petersburg, 1997 (in Russian).
7. N. Golyandina, V. Nekrutkin, and A. Zhigljavsky, Analysis of Time Series Structure: SSA and Related Techniques, Chapman & Hall/CRC, Boca Raton, 2001.
8. J. R. Magnus and H. Neudecker, Matrix Differential Calculus with Applications to Statistics and Econometrics, John Wiley & Sons, 2004.
9. A. J. Laub, Matrix Analysis for Scientists and Engineers, SIAM, 2004.
10. D. Stepanov and N. Golyandina, SSA-based approaches to analysis and forecast of multidimensional time series, Proceedings of the 5th St. Petersburg Workshop on Simulation, St. Petersburg State University, St. Petersburg, pp. 293-298, 2005.

Application of Radon transform for fast solution of boundary value problems for elliptic PDE in domains with complicated geometry

Alexandre I. Grebennikov

Facultad de Ciencias Fisico Matematicas, Benemerita Universidad Autonoma de Puebla, Av. San Claudio y Rio Verde, Col. San Manuel, Ciudad Universitaria, Puebla, Puebla, 72570, Mexico
agrebe@fcfm.buap.mx

Abstract. A new approach to the solution of boundary value problems for a wide class of elliptic partial differential equations of mathematical physics is proposed. This class includes the Laplace, Poisson, and Helmholtz equations. The approach is based on the Local Ray Principle, discovered by the author, and leads to the new General Ray (GR) method, which presents the solution of Dirichlet boundary problems by explicit analytical formulas that include the direct and inverse Radon transforms. The GR-method is realized by fast algorithms and MATLAB software, whose quality is demonstrated by numerical experiments.

Keywords: Dirichlet problem for the Laplace equation, direct and inverse Radon transform.

1  Introduction

The traditional scheme of solving inverse problems of mathematical physics requires, as a rule, the solution of a sequence of direct problems [1]. That is why the development of new fast methods for the solution of direct problems is very important for solving inverse problems [2, p. 311]. There are two main approaches to solving boundary value problems for partial differential equations in analytical form: Fourier decomposition and the Green function method [2]. Fourier decomposition is used, as a rule, only in theoretical investigations. The Green function method is an explicit one, but it is difficult to construct the Green function for a considered domain Ω with complex geometry. The known numerical algorithms are based on the finite difference method, the finite element (finite volume) method and the boundary integral equation method. Numerical approaches lead to the solution of systems of linear algebraic equations [3] that require a lot of computer time and memory.

A new approach to the solution of boundary value problems on the basis of the General Ray Principle (GRP) was proposed by the author in [4], [5] for the stationary wave field. The GRP leads to explicit analytical formulas (the GR-method) and fast algorithms, developed and illustrated by numerical experiments in [5]-[8] for the solution of direct and coefficient inverse problems for the equations of mathematical physics. But there were some difficulties with the strict theoretical justification of that version of the GR-method. Here we extend the proposed approach to the construction of another version of the GR-method, based on the application of the direct Radon transform [9] to the PDE [10]-[12]. This version of the GR-method is justified theoretically, formulated in algorithmic form, implemented as a program package in the MATLAB system and illustrated by numerical experiments.

2  General Ray Principle

The General Ray Principle (GRP) was proposed in [4], [5]. It gives a non-traditional mathematical model for the considered physical field and the corresponding boundary problems. The GRP consists in the following main assumptions:

1. the physical field can be simulated mathematically by the superposition of plane vectors (general rays) that form a field V(l) for some fixed straight line l; each vector of the field V(l) is parallel to the direction along this line l, and the superposition corresponds to all possible lines l that intersect the domain Ω;
2. the field V(l) is characterized by some potential function u(x, y);
3. we know some characteristics, such as the values of the function u(x, y) and/or the flow of the vector V(l), at any boundary point P_0 = (x_0, y_0) of the domain.

Application of the GRP to the problem under investigation means constructing an analogue of the given PDE in the form of a family of ODEs describing the distribution of the function u(x, y) along the "general rays", which are presented by a straight line l with some parameterization. We use the traditional Radon parameterization with a parameter t:

x = p cos ϕ − t sin ϕ,   y = p sin ϕ + t cos ϕ.

Here |p| is the length of the perpendicular from the origin to the line l, and ϕ ∈ [0, 2π] is the angle between the axis X and this perpendicular. Using this parameterization, we considered in [4], [5] the variant of the GRP that reduces the Laplace equation to an assemblage (depending on p, ϕ) of ordinary differential equations with respect to the variable t. This family of ODEs was used as the local analogue of the PDE. There we constructed the corresponding version of the General Ray method for a convex domain Ω. It consists in the following steps:

1. solution of boundary value problems for the obtained assemblage of ODEs in explicit analytical or approximate form, using well-known standard formulas and numerical methods;
2. calculation of the integral average of this solution along the line l;

3. transformation of these solutions by the inverse Radon transform, producing the required superposition.

The numerical justification of this version of the GR-method was given for the case of the domain Ω being the unit circle [5]. For some more complicated domains the quality of the method was illustrated by numerical examples. The reduction of the considered PDE to the family of ODEs with respect to the variable t makes it possible to satisfy the boundary conditions directly and to construct efficient and fast numerical algorithms. At the same time, there are some difficulties with the implementation of this method for complicated geometry of the domain Ω, as well as with its theoretical justification even in simple cases.

onstru t the eÆ ient and fast numeri al algorithms. At the same time, there are some diÆ ulties with implementation of this method for the ompli ated geometry of the domain Ω, as well as with its theoreti al justi ation even for the simple ases.

3  Formulation and theoretical justification of the p-version of the GR-method

Let us consider the Dirichlet boundary problem for the Poisson equation:

Δu(x, y) = ψ(x, y),   (x, y) ∈ Ω;    (1)
u(x, y) = f(x, y),   (x, y) ∈ Γ,    (2)

for the function u(x, y) that has two continuous derivatives with respect to both variables inside the plane domain Ω bounded by a continuous curve Γ. Here ψ(x, y), (x, y) ∈ Ω, and f(x, y), (x, y) ∈ Γ, are given functions.

In [10]-[12], investigations are presented on the possibility of reducing the solution of a PDE to a family of ODEs using the direct Radon transform [9]. This reduction leads to ODEs with respect to the variable p and can be interpreted in the frame of the introduced General Ray Principle. But at first glance, using the variable p makes it impossible to satisfy directly the boundary conditions, which are expressed in the (x, y) variables. Possibly for this reason the mentioned and other related investigations concentrated only on the theoretical aspect of constructing some basis of general solutions of the PDE. Unfortunately, this approach was not used for the construction of numerical methods and algorithms for the solution of boundary value problems, except for some simple examples [10].

The important new element introduced here into this scheme consists in satisfying the boundary conditions by reducing them to homogeneous ones. The p-version of the GR-method can be formulated as the sequence of the following steps:

1. reduce the boundary value problem to a homogeneous one;
2. represent the distribution of the potential function along the general ray (a straight line l) by its direct Radon transform u_ϕ(p);
3. construct the family of ODEs in the variable p with respect to the function u_ϕ(p);

4. solve the constructed ODEs with zero boundary conditions;
5. calculate the inverse Radon transform of the obtained solution;
6. revert to the initial boundary conditions.

We present below the implementation of this scheme. We suppose that the boundary Γ can be represented in polar coordinates (r, α) by some single-valued positive function that we denote r_0(α), α ∈ [0, 2π]. This is always possible for a simply connected star-shaped domain Ω with the centre at the origin. Let us write the boundary function as

f̃(α) = f(r_0(α) cos α, r_0(α) sin α).    (3)

Supposing that the functions r_0 and f̃(α) have second derivatives, we introduce the functions

f_0(α) = f̃(α) / r_0^2(α),    (4)

ψ_0(x, y) = ψ(x, y) − 4f_0(α) − f_0''(α),   (x, y) ∈ Ω,    (5)

u_0(x, y) = u(x, y) − r^2 f_0(α).    (6)
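The correction terms in (5) can be verified directly. Since f_0 depends only on α, the Laplacian of r² f_0(α) in polar coordinates is (a one-line check, using the standard polar form of Δ):

```latex
\Delta\bigl(r^{2}f_{0}(\alpha)\bigr)
  =\frac{1}{r}\frac{\partial}{\partial r}\Bigl(r\,\frac{\partial}{\partial r}\bigl(r^{2}f_{0}\bigr)\Bigr)
   +\frac{1}{r^{2}}\frac{\partial^{2}}{\partial\alpha^{2}}\bigl(r^{2}f_{0}\bigr)
  =4f_{0}(\alpha)+f_{0}''(\alpha),
```

so Δu_0 = Δu − 4f_0 − f_0'' = ψ_0 in Ω, and on Γ one has u_0 = f̃(α) − r_0²(α) f_0(α) = 0 by (3) and (4).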

To proceed with the first step of the scheme, we can write the boundary value problem with respect to the function u_0(x, y) as the following two equations:

Δu_0(x, y) = ψ_0(x, y),   (x, y) ∈ Ω;    (7)
u_0(x, y) = 0,   (x, y) ∈ Γ.    (8)

To make the second and the third steps we need the direct Radon transform [7]:

R[u](p, ϕ) = ∫_{−∞}^{+∞} u(p cos ϕ − t sin ϕ, p sin ϕ + t cos ϕ) dt.
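For a sampled pair (p, ϕ), this line integral can be approximated by a one-dimensional quadrature along t over a truncated range. A small NumPy sketch (our own function, not part of the author's MATLAB package; trapezoid rule):

```python
import numpy as np

def radon_line(u, p, phi, t):
    # numerically integrate u along the ray x = p cos(phi) - t sin(phi),
    # y = p sin(phi) + t cos(phi), using the trapezoid rule over samples t
    x = p * np.cos(phi) - t * np.sin(phi)
    y = p * np.sin(phi) + t * np.cos(phi)
    v = u(x, y)
    return float(np.sum(0.5 * (v[1:] + v[:-1]) * np.diff(t)))
```

For the indicator function of the unit disk, the exact value is the chord length 2·sqrt(1 − p²) for |p| < 1, independent of ϕ.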

After application of the Radon transform to equation (7) and using formula (2) on p. 3 of [8], we obtain the family of ODEs with respect to the variable p:

d²u_ϕ(p)/dp² = R[ψ_0](p, ϕ),   (p, ϕ) ∈ Ω̂,    (9)

where Ω̂ is the domain of possible values of the parameters p, ϕ. As a rule, ϕ ∈ [0, 2π], while the modulus of the parameter p is equal to the radius in the polar coordinates and varies within the limits determined by the boundary curve Γ. In the considered case, for a fixed ϕ the parameter p lies in the limits −r_0(ϕ − π) < p < r_0(ϕ).

Unfortunately, boundary condition (8) cannot be transformed directly by the Radon transform into corresponding boundary conditions for every equation of the family (9). For the fourth step we propose to use the following boundary conditions for every fixed ϕ ∈ [0, 2π]:

u_ϕ(−r_0(ϕ − π)) = 0;   u_ϕ(r_0(ϕ)) = 0.    (10)

Denote by û_ϕ(p) the solution of problem (9)-(10); it can be uniquely determined as a function of the variable p for every ϕ ∈ [0, 2π], p ∈ (−r_0(ϕ − π), r_0(ϕ)), and outside of this interval we extend û_ϕ(p) ≡ 0 for all ϕ, with continuity in p. Let us denote the inverse Radon transform as an operator R^{−1}, which for any function z(p, ϕ) can be represented by the formula

R^{−1}[z](x, y) = (1/(2π²)) ∫_{−π}^{π} ∫_{−∞}^{∞} z'_p(t, ϕ) / ((x cos ϕ + y sin ϕ) − t) dt dϕ.

The justification of the fifth step of the scheme is contained in the following theorem.

Theorem 1. The following formula for the solution of boundary value problem (7)-(8) is true:

u_0(x, y) = R^{−1}[û_ϕ(p)],   (x, y) ∈ Ω.    (11)

Proof. Substituting the function defined by (11) into the left-hand side of equation (7) and using [8, Lemma 2.1, p. 3], we obtain the following relations:

Δu_0(x, y) = R^{−1}[d²û_ϕ(p)/dp²] = R^{−1}[R[ψ_0](p, ϕ)] = ψ_0(x, y),    (12)

which mean that equation (7) is satisfied (see also [8], p. 40). From the condition û_ϕ(p) ≡ 0 for p ∉ (−r_0(ϕ − π), r_0(ϕ)), ϕ ∈ [0, π], and Theorem 2.6 (the support theorem) from [8, p. 10], it follows that u_0(x, y) ≡ 0 for (x, y) ∉ Ω and, due to its continuity, satisfies the boundary conditions (8). This finishes the proof.

The sixth step of the GR-method is presented in detail in the following theorem.

Theorem 2. The solution u(x, y) of boundary value problem (1), (2) is presented by the following formulas:

u(x, y) = R^{−1}[ ψ̂_2(p, ϕ) − ((p + r_0(ϕ − π)) / (r_0(ϕ) + r_0(ϕ − π))) ψ̂_2(r_0(ϕ), ϕ) ] + r² f_0(α),    (13)

ψ̂_2(p, ϕ) = ∫_{−r_0(ϕ−π)}^{p} ( ∫_{−r_0(ϕ−π)}^{s} ψ̂_0(q, ϕ) dq ) ds,   ψ̂_0(p, ϕ) = R[ψ_0(x, y)].    (14)

The justification of this theorem obviously follows from the explicit formula for the solution of equation (9) with conditions (10). The direct and inverse Radon transforms in the explicit formulas (13), (14) can be implemented numerically by the fast Fourier discrete transformation (FFDT), which ensures the efficiency of the proposed method.
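The explicit formula behind Theorem 2 is just double integration of the right-hand side plus a linear correction enforcing the zero boundary conditions (10). A numerical sketch of this step for one fixed ϕ (our own function; trapezoid quadrature, on an arbitrary grid p):

```python
import numpy as np

def solve_uphi(g, p):
    # solve u''(p) = g(p) with u(p[0]) = u(p[-1]) = 0: integrate twice with
    # the trapezoid rule, then subtract the linear term restoring u(p[-1]) = 0
    dp = np.diff(p)
    v = np.concatenate(([0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * dp)))
    u = np.concatenate(([0.0], np.cumsum(0.5 * (v[1:] + v[:-1]) * dp)))
    return u - u[-1] * (p - p[0]) / (p[-1] - p[0])
```

Applying this to g = R[ψ_0](·, ϕ) on (−r_0(ϕ − π), r_0(ϕ)) produces the û_ϕ of Theorem 1, in agreement with (13)-(14).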

4  Results of numerical experiments

We have constructed a fast algorithmic and program implementation of the GR-method for the considered problem in the MATLAB system. We used a uniform discretization of the variables p ∈ [−1, 1], ϕ ∈ [0, π], as well as a discretization of the variables x, y, with n nodes. We made tests with mathematically simulated model examples with known exact functions u(x, y), f(x, y), ψ(x, y). Graphic illustrations of numerical examples of solution by the p-version of the GR-method are presented in Fig. 1(a)-1(d). From Fig. 1(a), 1(d) we can see that the method gives a good approximation also for a non-differentiable curve Γ.

5  Conclusion

A new version of the GR-method is constructed. It is based on the application of the Radon transform directly to the Poisson equation. This version of the GR-method for arbitrary simply connected star-shaped domains is justified theoretically, formulated in algorithmic form, implemented as a program package in the MATLAB system and illustrated by numerical experiments. The proposed version can be applied to the solution of boundary value problems for other PDEs with constant coefficients. In perspective, it seems interesting to develop this approach for the solution of direct and inverse problems involving the equations of mathematical physics with variable coefficients.

Acknowledgments. The author acknowledges VIEP BUAP for support in the frame of Project No 04/EXES/07, and also SEP and CONACYT for support in the frame of Project No CB 2006-01/0057479.

References

1. A. N. Tikhonov and V. Y. Arsenin, Solutions of Ill-Posed Problems, V. H. Winston & Sons, Washington, D.C., 1977.
2. S. L. Sobolev, Partial Differential Equations of Mathematical Physics, Pergamon Press, 1964.
3. A. A. Samarskii, The Theory of Difference Schemes, Marcel Dekker, Inc., New York, 2001.
4. A. I. Grebennikov, Fast algorithm for solution of Dirichlet problem for Laplace equation, WSEAS Transactions on Computers Journal, 2(4), 1039-1043 (2003).
5. A. I. Grebennikov, The study of the approximation quality of GR-method for solution of the Dirichlet problem for Laplace equation, WSEAS Transactions on Mathematics Journal, 2(4), 312-317 (2003).
6. A. I. Grebennikov, Spline Approximation Method and Its Applications, MAX Press, Russia, 2004.
7. A. I. Grebennikov, A novel approach for the solution of direct and inverse problems of some equations of mathematical physics, Proceedings of the 5th International Conference on Inverse Problems in Engineering: Theory and Practice (ed. D. Lesnic), Vol. II, Leeds University Press, Leeds, UK, Chapter G04, 1-10 (2005).
8. A. Grebennikov, Linear regularization algorithms for computer tomography, Inverse Problems in Science and Engineering, Vol. 14, No. 1, 53-64 (2006).
9. J. Radon, Über die Bestimmung von Funktionen durch ihre Integralwerte längs gewisser Mannigfaltigkeiten, in: 75 Years of Radon Transform (Vienna, 1992), Conf. Proc. Lecture Notes Math. Phys., IV, 324-339 (1994).
10. S. Helgason, The Radon Transform, Birkhäuser, Boston-Basel-Berlin, 1999.
11. M. Gelfand and S. J. Shapiro, Homogeneous functions and their applications, Uspekhi Mat. Nauk, 10, 3-70 (1955).
12. V. A. Borovikov, Fundamental solutions of linear partial differential equations with constant coefficients, Trudy Moskov. Mat. Obshch., 8, 877-890 (1959).

Fig. 1. (a) Solution of the Poisson equation in the unit circle with the homogeneous Dirichlet condition; (b)-(d) further numerical examples.

Application of a multigrid method to solving diffusion-type equations⋆

M. E. Ladonkina⋆, O. Yu. Milyukova⋆⋆, and V. F. Tishkin⋆⋆⋆

Institute for Mathematical Modeling, RAS, Moscow, Russia
⋆ ladm@imamod.ru, ⋆⋆ miliukova@imamod.ru, ⋆⋆⋆ tishkin@imamod.ru

Abstract. A new efficient multigrid algorithm is proposed for solving parabolic equations. It is similar to implicit schemes in stability and accuracy, but the computational complexity is substantially reduced at each time step. Stability and accuracy of the proposed two-grid algorithm are analyzed theoretically for one- and two-dimensional heat diffusion equations. Good accuracy is demonstrated on model problems for one- and two-dimensional heat diffusion equations, including those with thermal conductivity defined as a discontinuous function of coordinates.

Keywords: parabolic equations, multigrid methods, stability, accuracy.

1  Introduction

Numerical simulation of many problems in mathematical physics must take into account diffusion processes modeled by parabolic equations. Explicit schemes lead to severe CFL restrictions on the time step [1], [2]. Implicit schemes are free from stability restrictions but are difficult to use because of the high computational complexity of the corresponding linear algebraic equations. Application of classical multigrid methods [3] may also be costly and not much better than explicit schemes. Therefore, new algorithms should be developed for parabolic equations.

In this paper, we present a new efficient multigrid algorithm. We analyze the stability and accuracy of the two-grid algorithm applied to model problems for one- and two-dimensional heat diffusion equations with constant and variable coefficients. The proposed algorithm is similar to an implicit scheme in regard to stability and accuracy and substantially reduces the computational complexity at each time step.

⋆ This work was supported by the RFBR (Grant N 08-01-00435).

2  Description of the algorithm

As an example, we consider an initial-boundary value problem for one- and two-dimensional heat diffusion equations,

ρc_v ∂T/∂t = div(k grad T) + f,   x ∈ G,
T(x, t) = g(x, t),   x ∈ γ,    (1)
T(x, 0) = T_0(x),

where c_v is the heat capacity per unit volume, ρ is the density, k is the thermal conductivity, T is the temperature at point x at time t, f is the heat source density, γ is the computational domain boundary, and g(x, t) and T_0(x) are given functions. To approximate problem (1) in the computational domain G = {0 < x < l_1, 0 < y < l_2, 0 < t ≤ T}, we use the fully implicit finite-difference scheme

omputational domain boundary, and g(x, t) and T0 (x) are given fun tions. To approximate problem (1) in the omputational domain G = {0 < x < l1 , 0 < y < l2 , 0 < t 6 T }, we use the fully impli it nite-dieren e s heme (ρcv )ij

n+1 uij − un ij

τ

= ki+0.5,j

+ki,j+0.5

n+1 n+1 ui+1,j − uij

h2x

n+1 n+1 ui,j+1 − uij

h2y

− ki−0.5,j

n+1 n+1 ui,j − ui−1,j

h2x

+

n+1 n+1 ui,j − ui,j−1

+ Φij , (2) h2y 0 < i < N 1 , 0 < j < N2 ,

− ki,j−0.5

n+1 u0j = u1 (yj , tn+1 ),

n+1 uN = u2 (yj , tn+1 ), 1 ,j

0 6 j 6 N2 ,

n+1 ui,0

n+1 ui,N 2

0 6 i 6 N1 ,

= u3 (xi , tn+1 ), u0ij

= u4 (xi , tn+1 ),

= T0 (xi , yj ),

0 6 i 6 N1 ,

0 6 j 6 N2 ,

where h_x and h_y are constant mesh sizes in the x and y directions and τ is the time step. Finite-difference scheme (2) is a system of linear algebraic equations in the unknown values of the solution at the (n+1)th time level:

A_h u_h = f_h.    (3)


The proposed algorithm for calculating the grid function at the next time level consists of the following steps.

Step 1. One or several smoothing iterations of equation (2) or (3) are performed using the formula

u_ij^{s+1} = σ [ (ρc_v)_ij ū_ij/τ + (k_{i+0.5,j} u_{i+1,j}^s + k_{i−0.5,j} u_{i−1,j}^s)/h_x² + (k_{i,j+0.5} u_{i,j+1}^s + k_{i,j−0.5} u_{i,j−1}^s)/h_y² + Φ_ij ]
 × [ (ρc_v)_ij/τ + (k_{i+0.5,j} + k_{i−0.5,j})/h_x² + (k_{i,j+0.5} + k_{i,j−0.5})/h_y² ]^{−1} + (1 − σ) u_ij^s,    (4)


where i = 1, 2, ..., N1 − 1, j = 1, 2, ..., N2 − 1, σ is a weight oeÆ ient (0 < σ 6 1), and u0ij = unij . In formula (4), index n+1 is omitted and u ij = unij . The resulting grid fun tion is denoted by usm ij . Then, the residual is al ulated as rh = Ah usm h − fh . Step 2. The residual is restri ted to the oarse grid: Rlp = r2i1 ,2j1 ,

l = i1 = 1, ..., N1 /2 − 1,

p = j1 = 1, ..., N2 /2 − 1.
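Formula (4) is a damped (weighted) Jacobi sweep. A sketch for the 2D scheme follows; the face conductivities k_{i±0.5,j}, k_{i,j±0.5} are approximated here by arithmetic averages of a nodal field, which is our assumption, not a choice stated in the text:

```python
import numpy as np

def smooth(u, u_tilde, tau, hx, hy, k, rc, phi, sigma=0.5, sweeps=1):
    """Smoothing iteration (4): damped Jacobi for the implicit scheme (2).
    All arrays are (N1+1, N2+1); rc plays the role of (rho * c_v)_ij."""
    for _ in range(sweeps):
        kxp = 0.5 * (k[1:-1, 1:-1] + k[2:, 1:-1])    # ~ k_{i+0.5,j}
        kxm = 0.5 * (k[1:-1, 1:-1] + k[:-2, 1:-1])   # ~ k_{i-0.5,j}
        kyp = 0.5 * (k[1:-1, 1:-1] + k[1:-1, 2:])    # ~ k_{i,j+0.5}
        kym = 0.5 * (k[1:-1, 1:-1] + k[1:-1, :-2])   # ~ k_{i,j-0.5}
        num = (rc[1:-1, 1:-1] * u_tilde[1:-1, 1:-1] / tau
               + (kxp * u[2:, 1:-1] + kxm * u[:-2, 1:-1]) / hx**2
               + (kyp * u[1:-1, 2:] + kym * u[1:-1, :-2]) / hy**2
               + phi[1:-1, 1:-1])
        den = rc[1:-1, 1:-1] / tau + (kxp + kxm) / hx**2 + (kyp + kym) / hy**2
        u_new = u.copy()
        u_new[1:-1, 1:-1] = sigma * num / den + (1 - sigma) * u[1:-1, 1:-1]
        u = u_new
    return u

# sanity check: with phi = 0, a constant field is a fixed point of the sweep
N = 8; h = 1.0 / N
u = np.full((N + 1, N + 1), 3.0)
out = smooth(u, u, tau=0.1, hx=h, hy=h,
             k=np.ones_like(u), rc=np.ones_like(u), phi=np.zeros_like(u))
print(np.allclose(out, 3.0))   # -> True
```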

Step 3. A coarse-grid correction equation is solved. For the two-dimensional problem analyzed here, it has the form

(ρcv)_{lp} Δ_{lp}/τ − k_{l+0.5,p} (Δ_{l+1,p} − Δ_{lp})/Hx² + k_{l−0.5,p} (Δ_{lp} − Δ_{l−1,p})/Hx²
  − k_{l,p+0.5} (Δ_{l,p+1} − Δ_{lp})/Hy² + k_{l,p−0.5} (Δ_{lp} − Δ_{l,p−1})/Hy² = R_{lp},   (5)

Δ_{l,0} = Δ_{l,N2/2} = Δ_{0,p} = Δ_{N1/2,p} = 0,   l = 1, 2, ..., N1/2 − 1,   p = 1, 2, ..., N2/2 − 1,

where Hx = 2hx, Hy = 2hy.

Step 4. The coarse-grid correction Δ_{lp} is interpolated to the fine grid by performing a 4-point face-centered and a 16-point cell-centered interpolation:

δ_{ij} =
  Δ_{lp},                                                           i = 2l, j = 2p,
  (9/16)(Δ_{lp} + Δ_{l+1,p}) − (1/16)(Δ_{l−1,p} + Δ_{l+2,p}),       i = 2l + 1, j = 2p,
  (9/16)(Δ_{lp} + Δ_{l,p+1}) − (1/16)(Δ_{l,p−1} + Δ_{l,p+2}),       i = 2l, j = 2p + 1,       (6)
  (81/256)(Δ_{lp} + Δ_{l+1,p} + Δ_{l,p+1} + Δ_{l+1,p+1})
    − (9/256)(Δ_{l−1,p} + Δ_{l−1,p+1} + Δ_{l,p+2} + Δ_{l+1,p+2} + Δ_{l+2,p+1} + Δ_{l+2,p} + Δ_{l+1,p−1} + Δ_{l,p−1})
    + (1/256)(Δ_{l−1,p−1} + Δ_{l+2,p+2} + Δ_{l−1,p+2} + Δ_{l+2,p−1}),   i = 2l + 1, j = 2p + 1,

where i = 1, 2, ..., N1 − 1, j = 1, 2, ..., N2 − 1. Note that δ_{0,j} = δ_{N1,j} = δ_{i,0} = δ_{i,N2} = 0.

Step 5. Finally, the grid function at the next time level is calculated as

u_{ij} = u^{sm}_{ij} − δ_{ij}.   (7)
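The odd-point weights 9/16 and −1/16 appearing in interpolation (6) are the standard 4-point cubic midpoint-interpolation weights. A quick 1D check that they reproduce cubic polynomials exactly (a sketch; function names are ours):

```python
import numpy as np

def interp_odd(d):
    """Interpolate coarse values d[l] (at even fine points, unit spacing)
    to the odd fine point between d[l] and d[l+1] by the 4-point rule."""
    return 9.0 / 16 * (d[1:-2] + d[2:-1]) - 1.0 / 16 * (d[:-3] + d[3:])

# exactness on cubic polynomials: the midpoint value is reproduced
xc = np.arange(8, dtype=float)              # coarse coordinates 0..7
p = lambda x: 2 * x**3 - x**2 + 3 * x - 5
vals = interp_odd(p(xc))                    # values at midpoints xc[1..5] + 0.5
print(np.allclose(vals, p(xc[1:-2] + 0.5)))  # -> True
```

These are the Lagrange weights of the cubic through four equispaced points evaluated at their central midpoint, which is why the formula is third-order accurate.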

Thus, a single iteration of the two-grid cycle is performed. Even though the system of linear equations is solved only incompletely, the algorithm is similar to an implicit scheme in terms of stability and accuracy. This is demonstrated below both theoretically and numerically for several model problems. Moreover, when the number of fine grid points is sufficiently large, the computational cost of the proposed algorithm is lower than that of the implicit scheme used on the fine grid, because the solution of coarse-grid correction equation (5) has a much lower computational complexity than the solution of implicit scheme (2).
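Steps 1-5 can be sketched for the 1D model problem (a hedged illustration: unit coefficients, injection restriction, and linear rather than cubic interpolation, so it mirrors but does not reproduce the authors' algorithm):

```python
import numpy as np

def laplacian(n, h):
    """Dense tridiagonal matrix of -u_xx with zero Dirichlet boundaries."""
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def two_grid_step(u_prev, tau, h, phi, sigma=0.5):
    """One time level of the two-grid algorithm (Steps 1-5) for the 1D model
    problem with unit coefficients; u_prev holds the N-1 interior values."""
    n = u_prev.size                          # n = N - 1, N even
    A = np.eye(n) + tau * laplacian(n, h)    # implicit-scheme matrix
    f = u_prev + tau * phi
    # Step 1: one weighted-Jacobi smoothing sweep (formula (4))
    d = np.diag(A)
    u = sigma * (f - (A - np.diag(d)) @ u_prev) / d + (1 - sigma) * u_prev
    r = A @ u - f                            # residual r_h = A_h u^sm - f_h
    # Step 2: restrict the residual to the coarse grid by injection
    R = r[1::2]
    # Step 3: solve the coarse-grid correction equation directly
    Ac = np.eye(R.size) + tau * laplacian(R.size, 2 * h)
    delta_c = np.linalg.solve(Ac, R)
    # Step 4: interpolate the correction (linear here, for brevity)
    delta = np.zeros(n)
    delta[1::2] = delta_c
    dpad = np.concatenate(([0.0], delta_c, [0.0]))
    delta[0::2] = 0.5 * (dpad[:-1] + dpad[1:])
    # Step 5: correct the smoothed approximation
    return u - delta

N = 16; h = 1.0 / N; tau = 0.01
x = np.linspace(h, 1 - h, N - 1)
u_prev = np.sin(np.pi * x)
u_next = two_grid_step(u_prev, tau, h, np.zeros_like(u_prev))
```

A single cycle already reduces the residual of the implicit system noticeably, which is the point of the stability and accuracy claims that follow.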

3  Theoretical stability analysis

We use Fourier analysis [4], [5] to examine the stability of the two-grid algorithm with respect to initial conditions. As a model example, we consider the Dirichlet problem for the one-dimensional heat diffusion equation with unit coefficients on the interval [0, 1]. Suppose that N is an even number and a single smoothing iteration is performed. The implicit scheme used on the fine grid is

(u^{im}_i − ũ_i)/τ = (u^{im}_{i+1} − 2u^{im}_i + u^{im}_{i−1})/h² + Φ_i,   0 < i < N,   (8)

u^{im}_0 = u^{im}_N = 0,   u^0_i = T0(x_i),   0 ≤ i ≤ N,

where u^{im}_i is the solution of the implicit scheme for the heat diffusion equation at the next time level, h = 1/N, and (T0)_i is a given grid function. We represent the solution at the nth level as a Fourier series,

ũ_i = Σ_{k=1}^{N−1} a_k √2 sin kπx_i.

The Fourier series expansion of the solution at the (n+1)th time level obtained in [6] is

u_i = Σ_{k=1, k≠N/2}^{N−1} {[q^{sm}_k − 0.5(1 + q̄_k) Q^{cor}_k q^{res}_k] a_k + 0.5(1 + q̄_k) Q^{cor}_k q^{res}_{N−k} a_{N−k}} √2 sin kπx_i
    + q^{sm}_{N/2} a_{N/2} √2 sin 0.5Nπx_i,   (9)

where

q^{sm}_k = 1 + σR/(R + 1) (q_k − 1),   q_k = cos kπ/N,
q^{res}_k = (q^{sm}_k [1 + R(1 − q_k)] − 1)/τ,   Q^{cor}_k = τ/(1 + 0.5R(1 − q_{2k})),   (10)
q̄_k = q_k [1 + 0.5(1 − q_{2k})],   R = 2τ/h².
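The factors in (10) can be evaluated numerically; a small check (our script, with τ = h, i.e. β = 1) that the smoothing factor q^{sm}_k stays strictly inside (0, 1) for every mode:

```python
import numpy as np

def factors(k, N, tau, h, sigma=0.5):
    """Amplification factors (10) for the 1D model problem."""
    R = 2 * tau / h**2
    q_k = np.cos(k * np.pi / N)
    q_2k = np.cos(2 * k * np.pi / N)
    q_sm = 1 + sigma * R / (R + 1) * (q_k - 1)
    q_res = (q_sm * (1 + R * (1 - q_k)) - 1) / tau
    Q_cor = tau / (1 + 0.5 * R * (1 - q_2k))
    q_bar = q_k * (1 + 0.5 * (1 - q_2k))
    return q_sm, q_res, Q_cor, q_bar

N = 64; h = 1.0 / N; tau = h            # tau = h^beta with beta = 1
smoothing = [abs(factors(k, N, tau, h)[0]) for k in range((1), N)]
print(max(smoothing) < 1.0)             # -> True: every mode is damped
```

Since q_k ∈ (−1, 1) for 1 ≤ k ≤ N − 1, the formula gives q^{sm}_k ∈ (1/(R+1), 1), which the script confirms numerically.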


In the one-dimensional problem, the interpolation at Step 4 is performed as follows:

δ_i = Δ_l,   i = 2l,
δ_i = (9/16)(Δ_l + Δ_{l+1}) − (1/16)(Δ_{l−1} + Δ_{l+2}),   i = 2l + 1,

where i = 1, 2, ..., N − 1. Now, we show that the algorithm is absolutely stable in a certain norm with respect to initial conditions when σ = 0.5. We define the linear subspace H^k as the span of the Fourier modes √2 sin kπx_i and √2 sin(N − k)πx_i, where k = 1, 2, ..., N/2 − 1. By virtue of representation (9) combined with the equalities Q^{cor}_k = Q^{cor}_{N−k} and q̄_k = −q̄_{N−k}, the vector

x^k = a_k √2 sin kπx_i + a_{N−k} √2 sin(N − k)πx_i ∈ H^k

is transformed into the vector y^k = A^k x^k, where

A^k = [ q^{sm}_k − 0.5(1 + q̄_k) Q^{cor}_k q^{res}_k        0.5(1 + q̄_k) Q^{cor}_k q^{res}_{N−k}
        0.5(1 − q̄_k) Q^{cor}_k q^{res}_k                   q^{sm}_{N−k} − 0.5(1 − q̄_k) Q^{cor}_k q^{res}_{N−k} ],
   1 ≤ k ≤ N/2 − 1.

It was shown in [6] that the eigenvalues of A^k satisfy the inequalities

λ^k_1 ≠ λ^k_2,   |λ^k_1| ≤ 1,   |λ^k_2| ≤ 1.   (11)
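Assuming the reconstruction of A^k above is faithful, the bound (11) can be probed numerically (our script; σ = 0.5 and τ = h, inside the range 0 < β < 2 used below):

```python
import numpy as np

def mode_matrix(k, N, tau, h, sigma=0.5):
    """The 2x2 matrix A^k acting on the subspace H^k, assembled from (10)."""
    R = 2 * tau / h**2
    def parts(j):
        qj = np.cos(j * np.pi / N)
        q2j = np.cos(2 * j * np.pi / N)
        q_sm = 1 + sigma * R / (R + 1) * (qj - 1)
        q_res = (q_sm * (1 + R * (1 - qj)) - 1) / tau
        q_bar = qj * (1 + 0.5 * (1 - q2j))
        return q_sm, q_res, q_bar
    Q_cor = tau / (1 + 0.5 * R * (1 - np.cos(2 * k * np.pi / N)))
    qs_k, qr_k, qb = parts(k)
    qs_n, qr_n, _ = parts(N - k)
    return np.array([
        [qs_k - 0.5 * (1 + qb) * Q_cor * qr_k, 0.5 * (1 + qb) * Q_cor * qr_n],
        [0.5 * (1 - qb) * Q_cor * qr_k, qs_n - 0.5 * (1 - qb) * Q_cor * qr_n],
    ])

N = 32; h = 1.0 / N; tau = h
rho = max(max(abs(np.linalg.eigvals(mode_matrix(k, N, tau, h))))
          for k in range(1, N // 2))
print(rho <= 1.0)
```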

We define the norm ‖ũ‖₁ on the space of grid functions as

‖ũ‖₁² = Σ_{k=1}^{N/2−1} [(α^k_1)² + (α^k_2)²] + a²_{N/2},

where α^k_1, α^k_2 are the components of ũ in the basis e^k_1, e^k_2, √2 sin 0.5Nπx_i (k = 1, 2, ..., N/2 − 1); here e^k_1 and e^k_2 are the eigenvectors associated with the eigenvalues λ^k_1 and λ^k_2, respectively. Combining (11) with the inequality |q^{sm}_k| ≤ 1, we have

‖u‖₁ ≤ ‖ũ‖₁.

This proves the absolute stability of the algorithm with respect to initial conditions in the norm ‖·‖₁. We note here that the norms ‖·‖₁ and ‖·‖_{L2} are equivalent [6].


It was shown in [6] that the algorithm is stable with respect to the right-hand side. Thus, it is proved that the algorithm is absolutely stable with respect to the initial conditions and the right-hand side. For the one-dimensional model problem, it holds that

‖u‖₁ ≤ ‖u⁰‖₁ + τ Q₁ Σ_{j=0}^{n} ‖Φ^j‖₂,

where Q₁ = const is independent of h and τ. The norm ‖·‖₂ is defined by analogy with ‖·‖₁.

4  Solution error

As a model example, we consider an initial-boundary value problem for the two-dimensional heat diffusion equation with unit coefficients on the unit square, subject to zero boundary conditions:

∂u/∂t = ∂²u/∂x² + ∂²u/∂y²,   0 < x < 1, 0 < y < 1, 0 < t ≤ T,
u(x, 0, t) = 0,   u(x, 1, t) = 0,   0 ≤ x ≤ 1, 0 ≤ t ≤ T,
u(0, y, t) = 0,   u(1, y, t) = 0,   0 ≤ y ≤ 1, 0 ≤ t ≤ T,      (12)
u(x, y, 0) = T0(x, y),   0 ≤ x ≤ 1, 0 ≤ y ≤ 1.

We assume here that T0(x, y) is an infinitely differentiable function. The implicit scheme used on the fine grid is

(u^{im}_{ij} − ũ_{ij})/τ = (u^{im}_{i+1,j} − 2u^{im}_{ij} + u^{im}_{i−1,j})/h² + (u^{im}_{i,j+1} − 2u^{im}_{ij} + u^{im}_{i,j−1})/h²,
    0 < i < N, 0 < j < N,
u^{im}_{0,j} = u^{im}_{N,j} = 0,   0 < j < N,                     (13)
u^{im}_{i,0} = u^{im}_{i,N} = 0,   0 < i < N,
u^0_{ij} = T0(x_i, y_j),   0 ≤ i ≤ N, 0 ≤ j ≤ N,

where h = 1/N, the grid function (T0)_{ij} approximates T0(x, y), and N is an even number. Suppose that a single smoothing iteration is performed and σ = 0.5. We represent the solution at the nth time level as a Fourier series expansion:

u^n_{ij} = Σ_{k=1}^{N−1} Σ_{m=1}^{N−1} a_{km} 2 sin kπx_i sin mπy_j,   0 < i < N, 0 < j < N.   (14)

We calculate the Fourier series expansion of the solution at the next time level. Following [6], we demonstrate each step of the algorithm. Substituting expansion (14) into the right-hand side of (4) and setting ρcv ≡ 1, k_{ij} ≡ 1, s = 0, and hx = hy = h, we perform the smoothing step to obtain

u^{sm}_{ij} = Σ_{k=1}^{N−1} Σ_{m=1}^{N−1} q^{km}_{sm} a_{km} 2 sin kπx_i sin mπy_j,   (15)

where

q^{km}_{sm} = 1 + 0.5R/(2R + 1) (q_k + q_m − 2),   (16)

q_k, q̄_k are defined in (10), and R = 2τ/h². Replacing u^{im}_{ij} with u^{sm}_{ij} given by (15) and ũ_{ij} with u^n_{ij} defined by (14) in (13), we find a Fourier series expansion for the residual on the fine grid:

r_{ij} = Σ_{k=1}^{N−1} Σ_{m=1}^{N−1} q^{km}_{res} a_{km} 2 sin kπx_i sin mπy_j,   0 < i < N, 0 < j < N,

where

q^{km}_{res} = (q^{km}_{sm} [1 + R(2 − q_k − q_m)] − 1)/τ.   (17)

Performing Step 2 (restricting the residual to the coarse grid) and using the identities sin(N − k)πx_{2i} = −sin kπx_{2i} and sin(0.5πNx_{2i}) = 0, we obtain

R_{lp} = Σ_{k=1}^{N/2−1} Σ_{m=1}^{N/2−1} (q^{km}_{res} a_{km} − q^{k,N−m}_{res} a_{k,N−m} − q^{N−k,m}_{res} a_{N−k,m}
         + q^{N−k,N−m}_{res} a_{N−k,N−m}) × 2 sin kπx_l sin mπy_p,   (18)

where x_l = x_{2i} and y_p = y_{2j} (l = i = 1, 2, ..., N/2 − 1, p = j = 1, 2, ..., N/2 − 1). We represent the solution Δ_{lp} of the coarse-grid correction equation

Δ_{lp}/τ − (Δ_{l+1,p} − 2Δ_{lp} + Δ_{l−1,p})/H² − (Δ_{l,p+1} − 2Δ_{lp} + Δ_{l,p−1})/H² = R_{lp},   (19)

Δ_{l,0} = Δ_{l,N/2} = Δ_{0,p} = Δ_{N/2,p} = 0

(l = 1, 2, ..., N/2 − 1, p = 1, 2, ..., N/2 − 1, H = 2h) as a Fourier series,

Δ_{lp} = Σ_{k=1}^{N/2−1} Σ_{m=1}^{N/2−1} ã_{km} 2 sin kπx_l sin mπy_p.

Substituting (18) and (20) into (19), we obtain

Δ_{lp} = Σ_{k=1}^{N/2−1} Σ_{m=1}^{N/2−1} Q^{km}_{cor} (q^{km}_{res} a_{km} − q^{k,N−m}_{res} a_{k,N−m} − q^{N−k,m}_{res} a_{N−k,m}
         + q^{N−k,N−m}_{res} a_{N−k,N−m}) 2 sin kπx_l sin mπy_p,

where

Q^{km}_{cor} = τ/(1 + 0.5R(2 − q_{2k} − q_{2m})).   (20)

We interpolate Δ_{lp} to the fine grid in two substeps. First, interpolation is performed to the grid {(ih, pH), i = 1, ..., N − 1, p = 1, ..., N/2 − 1} as follows:

Δ̃_{ip} = Δ_{lp},   i = 2l,
Δ̃_{ip} = (9/16)(Δ_{lp} + Δ_{l+1,p}) − (1/16)(Δ_{l−1,p} + Δ_{l+2,p}),   i = 2l + 1.   (21)

Then, Δ̃_{ip} is interpolated to the fine grid by formulas analogous to (21). It can be shown that this procedure is equivalent to interpolation by (6). Following [6] in each substep, we find the Fourier series expansion of the grid function δ_{ij}:

δ_{ij} = Σ_{k≠N/2} Σ_{m≠N/2} 0.25(1 + q̄_k)(1 + q̄_m) Q^{km}_{cor} (q^{km}_{res} a_{km} − q^{k,N−m}_{res} a_{k,N−m}
         − q^{N−k,m}_{res} a_{N−k,m} + q^{N−k,N−m}_{res} a_{N−k,N−m}) 2 sin kπx_i sin mπy_j.

Finally, formula (7) at Step 5 yields

u_{ij} = Σ_{k≠N/2} Σ_{m≠N/2} (b1_{km} a_{km} + b2_{km} a_{k,N−m} + b3_{km} a_{N−k,m} − b4_{km} a_{N−k,N−m}) 2 sin kπx_i sin mπy_j
       + Σ_{m=1}^{N−1} q^{N/2,m}_{sm} a_{N/2,m} 2 sin 0.5Nπx_i sin mπy_j
       + Σ_{k≠N/2} q^{k,N/2}_{sm} a_{k,N/2} 2 sin kπx_i sin 0.5Nπy_j,   (22)

where

b1_{km} = q^{km}_{sm} − (1 + q̄_k)(1 + q̄_m) Q^{km}_{cor} q^{km}_{res}/4,
b2_{km} = (1 + q̄_k)(1 + q̄_m) Q^{km}_{cor} q^{k,N−m}_{res}/4,
b3_{km} = (1 + q̄_k)(1 + q̄_m) Q^{km}_{cor} q^{N−k,m}_{res}/4,
b4_{km} = (1 + q̄_k)(1 + q̄_m) Q^{km}_{cor} q^{N−k,N−m}_{res}/4,

q^{km}_{sm}, q^{km}_{res}, and Q^{km}_{cor} are defined by (16), (17), and (20), respectively, and q_k, q̄_k are defined in (10), R = 2τ/h².

As a result, we have Fourier series expansion (22) of the solution at the next time level obtained by the proposed algorithm.

To analyze the accuracy of the solution, we start with estimating the truncation error of implicit scheme (13) on this solution. In (13), we substitute u_{ij} given by (22) for u^{im}_{ij} and replace ũ_{ij} with u^n_{ij} represented by (14). The resulting residual is

φ_{ij} = Σ_{k≠N/2} Σ_{m≠N/2} (r1_{km} a_{km} + r2_{km} a_{k,N−m} + r3_{km} a_{N−k,m} − r4_{km} a_{N−k,N−m}) 2 sin kπx_i sin mπy_j
       + Σ_{m=1}^{N−1} r5_m a_{N/2,m} 2 sin 0.5Nπx_i sin mπy_j
       + Σ_{k≠N/2} r6_k a_{k,N/2} 2 sin kπx_i sin 0.5Nπy_j,

where

r1_{km} = (b1_{km} − 1)/τ + 2b1_{km}(2 − q_k − q_m)/h²,
r^{2,3,4}_{km} = b^{2,3,4}_{km} [1/τ + 2(2 − q_k − q_m)/h²],        (23)
r5_m = (q^{N/2,m}_{sm} − 1)/τ + 2q^{N/2,m}_{sm}(2 − q_m)/h²,
r6_k = (q^{k,N/2}_{sm} − 1)/τ + 2q^{k,N/2}_{sm}(2 − q_k)/h².

Applying the triangle inequality and the Parseval identity, we obtain

‖φ‖_{L2} ≤ ‖φ1‖_{L2} + ‖φ2‖_{L2} + ‖φ3‖_{L2} + ‖φ4‖_{L2},   (24)

where the terms on the right-hand side are defined as

‖φ1‖²_{L2} = Σ_{k≠N/2} Σ_{m≠N/2} (r1_{km})² (a_{km})²,
‖φ2‖²_{L2} = Σ_{k≠N/2} Σ_{m≠N/2} (r2_{k,N−m})² (a_{km})²,
‖φ3‖²_{L2} = Σ_{k≠N/2} Σ_{m≠N/2} (r3_{N−k,m})² (a_{km})²,        (25)
‖φ4‖²_{L2} = Σ_{k≠N/2} Σ_{m≠N/2} (r4_{N−k,N−m})² (a_{km})²
           + Σ_{m=1}^{N−1} (r5_m)² (a_{N/2,m})² + Σ_{k≠N/2} (r6_k)² (a_{k,N/2})².

Suppose that τ = h^β, where 0 < β < 2.

We assume that u^n_{ij} has 2p bounded finite-difference derivatives with respect to both coordinates. To obtain an upper bound for the first term in (24), the (k, m) index domain Ω is partitioned into four subdomains, Ω = Ω1 ∪ Ω2 ∪ Ω3 ∪ Ω4 (see Fig. 1).


Figure 1. Decomposition of the (k, m) index domain Ω into the subdomains Ω1, Ω2, Ω3, Ω4; k1 = m1 = [N^{βδ}], 0 < δ < 1/7, where [N^{βδ}] is the integer part of N^{βδ}.

We find upper bounds for |r1_{km}| and |a_{km}| in each subdomain. In Ω1, it holds that kπh < ..., and the third one is the value of max_{i,j} |(u_{ij} − u^{im}_{ij})/u^{im}_{ij}| at t = 0.199 in Problems 5-7, respectively.

Table 5. max_{i,j,t} |(u^{im}_{ij} − u_{ij})/u^{im}_{ij}| and max_{i,j} |(u^{im}_{ij} − u_{ij})/u^{im}_{ij}| at t = 0.199 in Problem 5.

        N=100                        N=200                        N=500
K       s=1    s=2    s=1           s=1    s=2    s=1            s=1    s=2    s=1
10      .045   .031   .96·10⁻⁹      .030   .021   .245·10⁻¹⁰     .028   .015   .333·10⁻¹⁰
50      .026   .014   .62·10⁻⁵      .021   .011   .169·10⁻⁸
100     .02    .007   .265·10⁻³     .016   .007   .129·10⁻⁶


Table 6. max_{i,j,t} |(u^{im}_{ij} − u_{ij})/u^{im}_{ij}| and max_{i,j} |(u^{im}_{ij} − u_{ij})/u^{im}_{ij}| at t = 0.199 in Problem 6.

        N=100                        N=200                        N=500
K       s=1    s=2    s=1           s=1    s=2    s=1            s=1    s=2    s=1
10      .047   .032   .108·10⁻⁸     .030   .022   .259·10⁻¹⁰     .029   .016   .353·10⁻¹⁰
50      .027   .014   .137·10⁻⁴     .022   .011   .203·10⁻⁸
100     .021   .009   .313·10⁻³     .017   .007   .170·10⁻⁶

Table 7. max_{i,j,t} |(u^{im}_{ij} − u_{ij})/u^{im}_{ij}| and max_{i,j} |(u^{im}_{ij} − u_{ij})/u^{im}_{ij}| at t = 0.199 in Problem 7.

        N=100                        N=200                        N=500
K       s=1    s=2    s=1           s=1    s=2    s=1            s=1    s=2    s=1
10      .044   .030   .495·10⁻⁶     .030   .021   .122·10⁻⁴      .033   .016   .133·10⁻⁴
50      .031   .015   .137·10⁻⁴     .024   .016   .304·10⁻⁴
100     .03    .013   .41·10⁻²      .016   .007   .126·10⁻⁴

These results demonstrate that the proposed algorithm provides good accuracy as applied to an initial-boundary value problem for the heat diffusion equation.

To examine the dependence of the accuracy on the magnitude of the jump in thermal conductivity, we compare the results for Problems 3-7 presented above with the results obtained for a relatively small jump in k and with those for a thermal conductivity defined as a continuous function of the coordinates. In Problem 8,

k = 1 + 0.3 sin 10πx,  if (x − 0.5)² + (y − 0.5)² < 1/16,
k = 1,                 otherwise.

In Problem 9,

k = 1 + 0.3 sin 10πx.

Tables 8 and 9 list the values of max_{i,j,t} |(u^{im}_{ij} − u_{ij})/u^{im}_{ij}| at 0 < t < 0.199 and max_{i,j} |(u^{im}_{ij} − u_{ij})/u^{im}_{ij}| at t = 0.199 in the first and second columns corresponding to each value of N in Problems 8 and 9, respectively. These results are obtained by using approximation (41) for the thermal conductivity on the coarse grid.

It is clear from a comparison between Tables 7, 8 and 9 that higher accuracy is achieved when k is continuous or has a small jump, as compared to the case of a large jump in thermal conductivity.


Table 8. max_{i,j,t} |(u^{im}_{ij} − u_{ij})/u^{im}_{ij}| and max_{i,j} |(u^{im}_{ij} − u_{ij})/u^{im}_{ij}| at t = 0.199 in Problem 8 (s=1).

K       N=100                    N=200                    N=500
10      .000313   .213·10⁻⁷      .000197   .207·10⁻⁷
100     .000457   .979·10⁻⁶      .000276   .951·10⁻⁷      .898·10⁻⁴   .102·10⁻⁷

Table 9. max_{i,j,t} |(u^{im}_{ij} − u_{ij})/u^{im}_{ij}| and max_{i,j} |(u^{im}_{ij} − u_{ij})/u^{im}_{ij}| at t = 0.199 in Problem 9 (s=1).

K       N=100                    N=200                    N=500
10      .000243   .416·10⁻⁷      .614·10⁻⁴  .207·10⁻⁷
100     .00029    .898·10⁻⁶      .846·10⁻⁴  .378·10⁻⁷     .24·10⁻⁴   .188·10⁻⁸

6  CONCLUSION

A new efficient algorithm is developed for solving diffusion-type equations. By applying the algorithm to several model problems, it is shown both theoretically and numerically that the algorithm is similar to an implicit scheme in terms of stability and accuracy. The new algorithm substantially reduces the computational complexity at each time level, as compared to implicit schemes.

References

1. A.A. SAMARSKY, Difference scheme theory, Nauka, 1989 (in Russian).
2. N.S. BAHVALOV, N.P. ZHIDKOV and G.M. KOBELKOV, Numerical methods, Nauka, 1987 (in Russian).
3. R.P. FEDORENKO, A relaxation method for solving difference elliptic equations, Zh. Vychisl. Mat. Mat. Fiz., Vol. 1 (1961), N 5, pp. 922-927 (in Russian).
4. S.K. GODUNOV, V.S. RYABENKIY, A relaxation method for solving difference elliptic equations, Zh. Vychisl. Mat. Mat. Fiz., Vol. 1 (1961), N 5, pp. 922-927 (in Russian).
5. R. RICHTMYER, K. MORTON, Difference methods for solving of boundary value problems, Mir, 1972 (in Russian).
6. M.E. LADONKINA, O.Yu. MILYUKOVA, V.F. TISHKIN, A numerical algorithm for diffusion-type equations based on the multigrid methods, Mat. Model., Vol. 19 (2007), N 4, pp. 71-89 (in Russian).
7. A.A. SAMARSKY, Ye.S. NIKOLAYEV, Methods for solving finite-difference equations, Nauka, 1978.
8. I. GUSTAFSSON, A Class of First Order Factorization Methods, BIT, Vol. 18 (1978), pp. 142-156.
9. M.E. LADONKINA, O.Yu. MILYUKOVA, V.F. TISHKIN, Application of the multigrid method for calculation diffusion processes, CD-Proceedings of the West-East High Speed Flow Field Conference, 19-22 November 2007, Moscow, Russia (http://wehsff.imamod.ru/pages/s7.htm).

Monotone matrices and finite volume schemes for diffusion problems preserving non-negativity of solution

I. V. Kapyrin
Institute of Numerical Mathematics, Russian Academy of Sciences, ul. Gubkina 8, Moscow, 119333 Russia
ivan.kapyrin@gmail.com

Abstract. A new finite volume scheme for 3D diffusion problems with heterogeneous full diffusion tensor is considered. The discretization uses a nonlinear two-point flux approximation on unstructured tetrahedral grids. Monotonicity of the linearized operator allows us to guarantee the nonnegativity of the discrete solution.

Introduction

The simulation of substance transport in porous media [1] necessitates the discretization of the diffusion operator. In such problems, the diffusion tensor is strongly inhomogeneous and anisotropic, and the geometry of the computational domain requires the use of unstructured condensing meshes. Under these conditions, the solutions produced by some modern numerical schemes [2] exhibit unphysical oscillations and negative values. Negative solution values may lead to incorrectly computed chemical interactions between the substances. As a result, the scheme becomes nonconservative. In the present paper, a finite volume (FV) method for the numerical solution of three-dimensional diffusion problems with an anisotropic full diffusion tensor on tetrahedral grids is considered. The method was introduced in [3] for problems with homogeneous Dirichlet boundary conditions; here we extend it to the case of nonhomogeneous conditions of Dirichlet and Neumann types. For the formulation of the schemes we use a special nonlinear diffusive flux approximation, introduced for two-dimensional diffusion problems by C. Le Potier in [4] and modified in [5]. The resulting schemes are conservative and monotone in the sense of ensuring the nonnegativity of the solution for respective sources and boundary conditions (see [6], Section 2.4). The proof of the latter feature of the method is based on the monotonicity property of the linearized operator matrix.

1  Nonlinear Finite Volume Method

Let Ω be a convex polyhedral domain in R³ with boundary ∂Ω. Consider the stationary diffusion equation with two types of boundary conditions in the mixed statement:

∇ · r = f,   r = −D∇C in Ω,     (1a)
C|_{ΓD} = gD(x),                (1b)
r · n|_{ΓN} = gN(x).            (1c)

Here, C is the concentration of the substance, r is the diffusion flux, f is the source function, and D is a symmetric positive definite diffusion tensor of dimension 3 × 3 that is piecewise constant in Ω. The boundary ∂Ω consists of two parts, ΓD and ΓN. On ΓD the concentration is specified by a continuous function gD(x). On ΓN the continuous function gN(x) prescribes the diffusive flux through the boundary. In the following we assume that ΓN is the union of nonintersecting planar fragments.

In the computational domain Ω, we construct a conformal tetrahedral mesh εh such that the diffusion tensor is constant on each of its elements T. Let NT be the number of tetrahedra T ∈ εh, NP the number of vertices, Ne the total number of faces, and NB the number of external faces in εh. The mass conservation law (1a) can be integrated with respect to T ∈ εh by using Green's identity:

∫_{∂T} r · n ds = ∫_T f dx   ∀T ∈ εh,   (2)

where n denotes the unit outward normal to ∂T. Let ne be an outward normal to the face e of T whose length is numerically equal to the surface area of the corresponding face; i.e., |ne| = |e|. Relation (2) can be rewritten as

Σ_{e∈∂T} re · ne = ∫_T f dx   ∀T ∈ εh,   (3)

where re is the mean diffusion flux density through the face e:

re = (1/|e|) ∫_e r ds.

The diffusion flux re · ne through e can be approximated as follows. For each T ∈ εh and each external face e, we introduce their degrees of freedom. The set of support points of these degrees of freedom is defined as B = {Xj}, j = 1, ..., NT + NB. For each tetrahedron T, B includes some point XT inside T (its coordinates will be specified later). Let the tetrahedron T have a face e belonging to ∂Ω and let ne be


the outward normal to e. Then, if e ∈ ΓD, we add its center of mass Xe to B; otherwise, if e ∈ ΓN, we add to B the projection Xe of the internal point XT along the vector Dne (the choice of XT will guarantee that Xe lies inside the face e). Since Ω is convex, for any internal vertex Oi of εh there are four points X_{i,j} (j = 1, 2, 3, 4) from B such that Oi lies inside the tetrahedron formed by them (the nearest points are picked). Therefore, there are nonnegative coefficients λ_{i,j} satisfying the conditions

Σ_{j=1}^{4} λ_{i,j} (X_{i,j} − Oi) = 0,   Σ_{j=1}^{4} λ_{i,j} = 1.

The coefficients λ_{i,j} ≥ 0 are used for linear interpolation of the concentration at interior nodes of the initial mesh from its values at points of B:

C_{Oi} = Σ_{j=1}^{4} λ_{i,j} C_{X_{i,j}}.   (4)
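The coefficients λ_{i,j} in (4) are barycentric coordinates of Oi with respect to the tetrahedron formed by the four chosen points; a sketch of computing them by solving the defining 4×4 linear system (the function name is ours):

```python
import numpy as np

def barycentric_weights(O, X):
    """Coefficients lambda_j with sum 1 and sum_j lambda_j (X_j - O) = 0,
    for a point O inside the tetrahedron with vertices X (4 x 3):
    these are the standard barycentric coordinates of O."""
    M = np.vstack([np.asarray(X, dtype=float).T, np.ones(4)])  # 4x4 system
    rhs = np.append(np.asarray(O, dtype=float), 1.0)
    return np.linalg.solve(M, rhs)

X = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
O = np.array([0.2, 0.2, 0.2])
lam = barycentric_weights(O, X)
print(np.all(lam >= 0), np.isclose(lam.sum(), 1.0))  # -> True True
```

Nonnegativity of the weights corresponds to O lying inside the tetrahedron, exactly the situation the text requires.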

A similar formula can be written for the concentrations at points Oi ∈ ΓN using the values at the three vertices of a triangle in ΓN which contains Oi. For the points Oi ∈ ΓD the interpolation is not needed, because the respective concentration values are known from the Dirichlet boundary conditions.

Fig. 1. Geometric constructions for the nonlinear finite-volume method (two neighboring tetrahedra AO1O2O3 and BO1O2O3 with the points X+, X− and the face center M).

Consider two neighboring tetrahedra T+ = AO1O2O3 and T− = BO1O2O3 in the initial mesh εh (see Fig. 1); X+, X− are the corresponding elements of B, D+ and D− are the diffusion tensors, and V+ and V− are the volumes of T+ and T−. Let M be the center of mass of the common face e = O1O2O3. We introduce the following notation (here and below, i, j and k are assumed to be different; i.e., {i, j, k} = {1, 2, 3}, {2, 1, 3}, {3, 1, 2}):

– Ti+ and Ti− are the tetrahedra X+MOjOk and X−MOjOk, respectively, and Vi+ and Vi− are their respective volumes.
– ne is the normal to the common face O1O2O3 that is external with respect to T+.
– nei+ and nei− are the normals to the face MOjOk that are external with respect to Ti+ and Ti−, respectively.
– nij+ and nij− are the normals to the respective faces MX+Ok and MX−Ok that are external with respect to Ti+ and Ti−, respectively.
– ni+ and ni− are the normals to the respective faces X+OjOk and X−OjOk that are external with respect to Ti+ and Ti−, respectively.
– The lengths of all the above normals are numerically equal to the surface areas of the corresponding faces.

Each pair of tetrahedra Ti+ and Ti− is associated with an auxiliary variable CM,i, which is the substance concentration at the point M. The diffusion flux ri* (here and below, the star denotes either a plus or a minus) on each tetrahedron Ti* is defined by using Green's identity ∫_{Ti*} ∇C dx = ∫_{∂Ti*} C n ds, integrating it to second-order accuracy, and taking into account ni* + nei* + nij* + nik* = 0:

Vi* D*⁻¹ ri* = (1/3) (ni* CM,i + nei* CX* + nij* COj + nik* COk).   (5)

The introduced degrees of freedom CM,i are eliminated using the assumption of flux continuity through e: ri+ · ne = ri− · ne. As a result, the flux in (5) is defined in terms of the concentrations CX+, CX− at the points X+ and X− and in terms of COj and COk, for which we use linear interpolation (4). The total diffusion flux re · ne through e is represented as a linear combination of three fluxes ri+ · ne:

re · ne = μe1 r1+ · ne + μe2 r2+ · ne + μe3 r3+ · ne.   (6)

To determine the coefficients μei, i = 1, 2, 3, we set the following conditions on diffusion flux (6) through e.

– If the values ri+ · ne/|ne| approximate the diffusion flux density, then re · ne/|ne| is also its approximation:

Σ_{j=1}^{3} μej = 1.   (7)

– The approximation stencil for the flux is two-point and nonlinear:

re · ne = K+(CO1, CO2, CO3) CX+ − K−(CO1, CO2, CO3) CX−.   (8)


This condition is ensured by the equation

(a12 CO2 + a13 CO3) μe1 + (a21 CO1 + a23 CO3) μe2 + (a31 CO1 + a32 CO2) μe3 = 0,   (9)

where

aij = [(D+ nj+, ne)(D− ni−, ne) − (D− nj−, ne)(D+ ni+, ne)] / [(D+ ni+, ne)Vi− − (D− ni−, ne)Vi+].

Equations (7) and (9) define a family of solutions with parameter pe:

μe1(pe) = μe1(0) + pe [CO1(a31 − a21) + CO2 a32 − CO3 a23],   (10a)
μe2(pe) = μe2(0) + pe [CO2(a12 − a32) + CO3 a13 − CO1 a31],   (10b)
μe3(pe) = μe3(0) + pe [CO3(a23 − a13) + CO1 a21 − CO2 a12].   (10c)

Here, μe1(0), μe2(0) and μe3(0) comprise a particular solution to system (7), (9):

μei(0) = [(D− ni−, ne)Vi+ − (D+ ni+, ne)Vi−] COi / Σ_{j=1}^{3} [(D− nj−, ne)Vj+ − (D+ nj+, ne)Vj−] COj.   (11)

Remark 1. Coefficients (11) are identical to those in the two-dimensional nonlinear finite-volume method with the volumes replaced by areas. In the two-dimensional case, μe1 and μe2 are unique and precisely determined by conditions (7) and (8) on two-point approximations of the diffusion flux.

In the case when O1O2O3 ∈ ΓN, we have the following diffusive flux approximation:

re · ne = ∫_e gN(x) ds.   (12)

If the face O1O2O3 belongs to ΓD, Green's identity on the tetrahedron X+O1O2O3 with volume V+ yields the equation

V+ D⁻¹ r = (1/3) (CX+ ne + CO1 n1+ + CO2 n2+ + CO3 n3+),   (13)

where COi, i ∈ {1, 2, 3}, are known from the boundary conditions. For the external face e ∈ ΓD, we can write

re · ne = KB+ CX+ + KB−,   (14)

where

KB+ = (Dne, ne)/(3V+),
KB− = [(Dn1+, ne)CO1 + (Dn2+, ne)CO2 + (Dn3+, ne)CO3]/(3V+).   (15)


Thus, the diffusion flux re · ne is defined by formulas (6), (10) and (5) for internal mesh faces and by formulas (12), (14) for external mesh faces. Let CT be the concentration at the point XT corresponding to a tetrahedron T having the face e ∈ ΓN. We eliminate the concentration Ce at the point Xe on the face e using the approximation of the diffusive flux through e:

(Ce − CT)/l = −gN(Xe),

where l = ‖Xe − XT‖/‖Dn‖ and n is the unit normal vector to the face e. It is to be mentioned here that, with nonnegative CTi, i = 1, ..., NT, and a nonpositive function gN(x), the nonnegativity of COi in (4) is guaranteed after the elimination of Ce for all faces e ∈ ΓN. The formulation of the method is completed by substituting the flux expressions into mass conservation law (3). Discretization of (3) produces a nonlinear system of equations

A(CX) CX = F,   (16)

where CX is the NT-vector of unknown concentrations at the points XT of the set B. The matrix A(CX) can be represented as the union of submatrices

A(CX) = Σ_{e∈∂εh} Ne Ae(CX) NeT,   (17)

Ne being the respective assembling matrices, consisting of zeros and ones. Here Ae(CX) is a 2 × 2 matrix of the form

Ae(CX) = [  Ke+   −Ke−
           −Ke−    Ke+ ]   (18)

for any internal face e and a 1 × 1 matrix of the form Ae(CX) = KB+ for any e ∈ ΓD. For the component FT of the right-hand-side vector F corresponding to tetrahedron T, the following relation holds:

FT = ∫_T f dx − Σ_{e∈∂T∩ΓD} KB− − Σ_{e∈∂T∩ΓN} ∫_e gN ds.   (19)

System (16) is solved using the Picard iteration

A(CX^k) CX^{k+1} = F   (20)

with some initial approximation CX^0. To construct monotone schemes, we define the location of a point XT ∈ B corresponding to an arbitrary tetrahedron T = ABCD in the initial mesh εh with faces a, b, c and d opposite to A, B, C and D, respectively. Let RA, RB, RC and RD be the position vectors of the

corresponding vertices of T. The vectors na, nb, nc and nd are outward normals to the faces; their lengths are numerically equal to the surface areas of the corresponding faces. Define

R_{XT} = (RA ‖na‖D + RB ‖nb‖D + RC ‖nc‖D + RD ‖nd‖D) / (‖na‖D + ‖nb‖D + ‖nc‖D + ‖nd‖D),   (21)

where ‖nβ‖D = sqrt((Dnβ, nβ)) and β ∈ {a, b, c, d}. Note that, for an isotropic tensor, expression (21) gives the coordinates of the center of the sphere inscribed in T.

2  Monotonicity of the Method

Hereafter we formulate the monotonicity property that is the main feature of the proposed FV method.

Theorem 1. Let the right-hand side in system (16) of the nonlinear finite-volume method be nonnegative (i.e., Fi ≥ 0), and let the boundary conditions satisfy gD(x) ≥ 0 on ΓD and gN(x) ≤ 0 on ΓN. Let (16) be the corresponding nonlinear system of the FV discretization for (1); let the support points of the degrees of freedom on the tetrahedra be given by formula (21); let the initial approximation satisfy (CX^0)i ≥ 0; and, for any internal face e, let nonnegative values μei, i ∈ {1, 2, 3}, be chosen from solutions (10a)-(10c) on every Picard iteration (20). Then all the iterative approximations to CX are nonnegative:

(CX^k)i ≥ 0,   i = 1, . . . , NT,   ∀k ≥ 0.
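The Picard iteration (20) used in the theorem can be sketched as follows (a toy scalar illustration with a made-up coefficient function, not the FV matrix):

```python
import numpy as np

def picard(A_of_c, F, c0, n_iter=60):
    """Picard iteration (20): freeze the matrix at the current iterate and
    solve the resulting linear system A(c^k) c^{k+1} = F."""
    c = np.asarray(c0, dtype=float)
    for _ in range(n_iter):
        c = np.linalg.solve(A_of_c(c), F)
    return c

# toy nonlinear problem (illustrative only): solve (1 + c) c = 2, root c = 1
A_of_c = lambda c: np.array([[1.0 + c[0]]])
c = picard(A_of_c, np.array([2.0]), np.array([0.0]))
print(np.isclose(c[0], 1.0))   # -> True
```

The theorem's point is that when each frozen matrix A(c^k) is monotone and F is nonnegative, every iterate produced this way stays nonnegative.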

Proof. We rely on the following definition of a monotone matrix: the matrix A is called monotone if the condition Ax ≥ 0 implies that the vector x is nonnegative. Assume that the matrix A(CX) is monotone for any nonnegative vector CX and that the right-hand side F is nonnegative. Then the solution CX^{k+1} to system (20) is also a nonnegative vector. Taking into account (CX^0)i ≥ 0, we find by induction that (CX^k)i ≥ 0, ∀k ≥ 0, ∀i = 1, . . . , NT.

Let us prove that the matrix A(CX) is monotone for any nonnegative vector CX and that the right-hand side F is nonnegative. Consider the coefficients K+(CO1, CO2, CO3), K−(CO1, CO2, CO3), KB+ and KB− in expressions (8) and (14) for the diffusion flux through a face. The coefficient KB+ is positive because D is positive definite. Plugging (5) (after eliminating CM,i) into (6) gives formulas for K+ and K−:

K+ = Σ_{i=1}^{3} μei · (D+ ne, ne)/(3V+) · (D− ni−, ne)Vi+ / [(D− ni−, ne)Vi+ − (D+ ni+, ne)Vi−],

K− = −Σ_{i=1}^{3} μei · (D− ne, ne)/(3V−) · (D+ ni+, ne)Vi− / [(D− ni−, ne)Vi+ − (D+ ni+, ne)Vi−].


For K+ and K− to be positive and for KB− to be nonpositive, it is sufficient to show that

(D− ni−, ne) > 0,   (D+ ni+, ne) < 0.   (22)

Consider the tetrahedron ABCD ∈ εh with faces a, b, c and d opposite to the vertices A, B, C and D, respectively, and with normals na, nb, nc and nd to these faces (the lengths of the normals are numerically equal to the surface areas of the corresponding faces). The point XT inside the tetrahedron is defined by formula (21). Let nab be defined as the normal (external with respect to XTBCD) to the plane XTCD, let nbc be defined as the normal (external with respect to XTACD) to the plane XTAD, and so on for nβγ, where β, γ ∈ {a, b, c, d}, β ≠ γ. Since the length of a normal is not important for the proof of (22), nab can be calculated as

nab = (1/2)(‖na‖D + ‖nb‖D + ‖nc‖D + ‖nd‖D)(CX_T × DX_T),   (23)

where CX_T and DX_T denote the vectors from C and D to XT. For these vectors, we have the expressions

CX_T = (CA ‖na‖D + CB ‖nb‖D + CD ‖nd‖D) / (‖na‖D + ‖nb‖D + ‖nc‖D + ‖nd‖D),
DX_T = (DA ‖na‖D + DB ‖nb‖D + DC ‖nc‖D) / (‖na‖D + ‖nb‖D + ‖nc‖D + ‖nd‖D).

Substituting them into vector product (23) gives

nab = nb ‖na‖D − na ‖nb‖D.

Let us show that (Dna, nab) < 0 and (Dnb, nab) > 0 by using the Cauchy-Schwarz inequality:

(Dna, nab) = (Dna, nb) ‖na‖D − (Dna, na) ‖nb‖D = ‖na‖D [(na, nb)D − ‖na‖D ‖nb‖D] < 0.   (24)

Here, (·, ·)D is the scalar product in the metric defined by the tensor D. Similarly, we can prove (Dnb, nab) > 0 and inequalities of the form (Dnβ, nβγ) < 0 and (Dnγ, nβγ) > 0, β ≠ γ, where β, γ ∈ {a, b, c, d}. In (22), ni− and ni+ are replaced by the corresponding vectors nβγ, and ne is replaced by nβ or nγ. Then, using (24), we prove (22). Therefore, K+ and K− are positive and KB− is nonpositive.

Thus, the matrix A(CX) has the following properties.

– All the diagonal elements of A(CX) are positive.
– All the off-diagonal elements of A(CX) are nonpositive.
– The matrix is column diagonally dominant; this diagonal dominance is strict for columns corresponding to elements that have faces on the boundary of the computational domain with Dirichlet conditions.


Therefore, AT(CX) is an M-matrix and all the elements of (AT(CX))⁻¹ are nonnegative. Since the transposition and inversion of matrices are commuting operations, we have (AT(CX))⁻¹ = (A⁻¹(CX))T. Therefore, all the elements of A⁻¹(CX) are nonnegative and A(CX) is monotone. The nonnegativity of the right-hand side F represented by formula (19) is provided by the conditions of the theorem and the nonpositivity of the coefficients KB−. ⊓⊔
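The monotonicity argument rests on inverse nonnegativity; a minimal numerical check of that criterion on a toy M-matrix (illustrative only, not the FV matrix A(CX)):

```python
import numpy as np

def is_monotone(A, tol=1e-12):
    """Inverse-nonnegativity criterion: A is monotone iff A is invertible
    and every entry of A^{-1} is nonnegative."""
    try:
        Ainv = np.linalg.inv(A)
    except np.linalg.LinAlgError:
        return False
    return bool(np.all(Ainv >= -tol))

# a 1D diffusion matrix: positive diagonal, nonpositive off-diagonal,
# diagonally dominant -> an M-matrix, hence monotone
A = np.array([[2., -1., 0.], [-1., 2., -1.], [0., -1., 2.]])
print(is_monotone(A))    # -> True
print(is_monotone(-A))   # -> False
```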

Remark 2. The validity of (22) implies that the values μei ≥ 0, i ∈ {1, 2, 3}, required in the assumption of the theorem can always be chosen by setting pe = 0 ∀e in (10a)-(10c). The range of pe for which the μei are positive is an interval; it may degenerate into the point pe = 0 when two of the three COi are zero. If COi = 0 ∀i ∈ {1, 2, 3}, then solution (10a)-(10c) is always positive and does not depend on pe.

Remark 3. The point X_T given by (21) is a solution to the system of six equations determining the equality of the angles in the D-metric between the vectors n_β, n_βγ and n_γ, −n_βγ, where β, γ ∈ {a, b, c, d} and β ≠ γ.

Corollary 1. Consider the nonstationary diffusion equation

∂C/∂t − ∇ · D∇C = f   (25)

with a nonnegative right-hand side, a nonnegative initial condition, and a nonnegative Dirichlet boundary condition. The nonlinear FV method is used to construct the implicit scheme

(V/∆t + A(C_X^{n+1})) C_X^{n+1} = (V/∆t) C_X^n + F^{n+1},

where V is a diagonal matrix of the elements' volumes and F involves the right-hand side and the boundary conditions. At every time step, the system is solved by the Picard method

(V/∆t + A(C_X^{n+1,k})) C_X^{n+1,k+1} = (V/∆t) C_X^n + F^{n+1},   k = 1, 2, . . . ,   C_X^{n+1,0} = C_X^n.

If the µ_ei ∀e, i ∈ {1, 2, 3} are positive, then (C_X^{n+1,k})_j > 0, j = 1, . . . , N_T, k = 1, 2, . . . .

Corollary 2. In the explicit scheme for the discretization of (25)

(V/∆t) C_X^{n+1} = (V/∆t − A(C_X^n)) C_X^n + F,

the solution C_X^{n+1} can be made nonnegative by choosing a sufficiently small ∆t ensuring that the diagonal elements of V/∆t − A(C_X^n) are nonnegative


I. V. Kapyrin

(its off-diagonal elements are obviously nonnegative). Moreover, ∆t ∼ h² (where h is the size of a quasi-uniform mesh), which is similar to the stability condition for explicit schemes. Although the convergence of the discrete solution to the solution of differential problem (1a)-(1c) is not proved, test computations have revealed that the nonlinear finite-volume method with coefficients (11) has quadratic convergence with respect to the concentration and linear convergence with respect to the diffusion fluxes. At the same time, the convergence of the Picard iterations is not guaranteed, and this problem may become a key question in the further development of this method.
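The ∆t restriction in Corollary 2 amounts to keeping the diagonal of V/∆t − A(C_X^n) nonnegative, i.e. ∆t ≤ min_j V_jj / A_jj(C_X^n). One explicit step under this bound might look as follows (the matrix, volumes and data are illustrative):

```python
import numpy as np

# Illustrative A(C^n) with the monotone-scheme sign structure,
# element volumes V, nonnegative data C^n and F.
A = np.array([[ 3.0, -1.0, -0.5],
              [-1.5,  2.0, -1.0],
              [-0.5, -0.5,  1.5]])
V = np.diag([0.2, 0.3, 0.25])
C_n = np.array([1.0, 0.5, 2.0])
F = np.array([0.1, 0.0, 0.2])

# Largest dt keeping diag(V/dt - A) nonnegative: dt <= V_jj / A_jj.
dt = np.min(np.diag(V) / np.diag(A))

# One explicit step: (V/dt) C^{n+1} = (V/dt - A) C^n + F.
M = V / dt - A                    # all entries nonnegative at this dt
C_next = (M @ C_n + F) / np.diag(V / dt)
assert np.all(C_next >= 0)        # nonnegativity is preserved
```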

Acknowledgements. The author is grateful to Yu. V. Vassilevski, C. Le Potier, D. A. Svyatski, and K. N. Lipnikov for fruitful discussions of the problem and the ideas used in the development of the method. This work was supported in part by the Russian Foundation for Basic Research (project no. 04-07-90336), by the program "Computational and Information Issues of the Solution to Large-Scale Problems" of the Department of Mathematical Sciences of the Russian Academy of Sciences, and by a grant from the Foundation for the Support of National Science for best graduate students of the Russian Academy of Sciences.

References

1. A. Bourgeat, M. Kern, S. Schumacher and J. Talandier. The COUPLEX test cases: Nuclear waste disposal simulation. Computational Geosciences, 2004, 8, pp. 83-98.
2. G. Bernard-Michel, C. Le Potier, A. Beccantini, S. Gounand and M. Chraibi. The Andra Couplex 1 test case: Comparisons between finite element, mixed hybrid finite element and finite volume discretizations. Computational Geosciences, 2004, 8, pp. 83-98.
3. I. V. Kapyrin. A family of monotone methods for the numerical solution of three-dimensional diffusion problems on unstructured tetrahedral meshes. Doklady Mathematics, 2007, Vol. 76, No. 2, pp. 734-738.
4. C. Le Potier. Schéma volumes finis monotone pour des opérateurs de diffusion fortement anisotropes sur des maillages de triangles non structurés. C. R. Acad. Sci. Paris, 2005, Ser. I 341, pp. 787-792.
5. K. Lipnikov, M. Shashkov, D. Svyatski and Yu. Vassilevski. Monotone finite volume schemes for diffusion equations on unstructured triangular and shape-regular polygonal meshes. Journal of Computational Physics, 2007, Vol. 227, No. 1, pp. 492-512.
6. A. A. Samarskii and P. N. Vabishchevich. Numerical Methods for Solving Convection-Diffusion Problems. Editorial URSS, Moscow, 1999, 248 p. [in Russian].

Sparse Approximation of FEM Matrix for Sheet Current Integro-Differential Equation⋆

Mikhail Khapaev¹ and Mikhail Yu. Kupriyanov²

¹ Dept. of Computer Science, Moscow State University, 119992 Moscow, Russia, vmhap@cs.msu.su
² Nuclear Physics Institute, Moscow State University, 119992 Moscow, Russia, mkupr@pn.sinp.msu.ru

Abstract. We consider a two-dimensional integro-differential equation for currents in thin superconducting films. The integral operator of this equation is hypersingular, with a kernel decaying as 1/R³. For the numerical solution, the Galerkin finite element method (FEM) on a triangular mesh with linear elements is used. It results in a dense FEM matrix of large dimension. Since the kernel decays quickly, the off-diagonal elements of the FEM matrix are small. We investigate a simple sparsification approach based on dropping small entries of the FEM matrix. The conclusion is that it allows one to reduce memory requirements to some extent. Nevertheless, for problems with a large number of mesh points, more sophisticated techniques, such as hierarchical matrix algorithms, should be considered.

Keywords: superconductivity, FEM, sparse matrix.

1 Introduction

In this paper we consider the numerical solution of a boundary value problem for an integro-differential equation for the sheet current in thin superconducting films. The simplest form of this equation for a single conductor is

−λ⊥ ∆ψ(r) + (1/4π) ∫∫_S (∇ψ(r′), ∇′(1/|r − r′|)) ds + Hz(r) = 0,   (1)

where λ⊥ is a constant parameter, S is a 2D bounded domain in the plane (x, y), and r = (x, y). ψ(r) is the unknown function: it is the stream-function (potential) representation of the 2D sheet current. Hz(r) is the right-hand side and has the sense of the z-component of the external magnetic field. The boundary condition for (1) is

ψ(r) = F(r),   r ∈ ∂S.   (2)

Here the function F(r) is completely defined by the inlet and outlet currents over the conductor boundary ∂S and the currents circulating around holes in S. In the paper we

⋆ The paper is supported by ISTC project 3174.


consider the problem in a more general form, accounting for several simply connected conductors with holes and the finite thickness of the films. Our interest in problem (1), (2) is motivated by computations of inductances and current fields in microelectronic superconductor structures [1, 2]. Traditionally, problems for surface, sheet or volume currents are solved using the PEEC (Partial Element Equivalent Circuit) technique [3, 4]. This approach leads to an equation with a weakly singular kernel. In our case it is

λ⊥ J(r) + (1/4π) ∫∫_S J(r′)/|r − r′| ds = −∇χ(r),   (3)

∇ · J(r) = 0,   ∆χ = 0.   (4)

In (3), J(r) is the unknown current and χ(r) is one more unknown function (the phase). Equation (1) can be obtained from (3) by differentiation. Equation (3) needs boundary conditions for the function χ(r) and the current J(r). Equations similar to (3) are well known for normal conductors. Approaches similar to PEEC for (3) for superconductors are also known [6, 7]. For a normal conductor, the function χ(r) has the sense of a voltage potential. Recently, the fast-multipole-based program FASTHENRY [5] for (3) was adapted for superconductors [8]. The main problem in the numerical solution of (1) or (3) is the dense matrix of large size. It is necessary to fill this matrix fast and then store it or its approximation. It is also necessary to have a fast and reliable method for solving the system of linear equations with this matrix. Otherwise, the simulation of many practical problems can be unfeasible. We prefer to solve equation (1) instead of (3) because (1) accounts for important physical features of the problem and because of numerical efficiency considerations:
– Many superconductivity problems are formulated solely in terms of currents and magnetic field. In these cases it is difficult to define boundary conditions for χ(r).
– Holes in S are a problem for (3) but an easy task for (1). Given currents circulating around holes are accounted for in the boundary conditions via the function F(r) in (1). Non-decaying currents circulating around holes are typical for problems in superconductivity.
– FEM for (1) has better numerical approximation than PEEC and thus can give a smaller system of linear equations.
– The off-diagonal FEM matrix elements for (1) quickly tend to zero with the distance between finite elements.
In this paper we outline the derivation of the boundary value problem for integro-differential equations for sheet currents in thin superconducting films. Properties of the operators are discussed and the finite element method is formulated. We study the decay of the matrix elements and formulate a simple strategy for dropping small


elements of the matrix. Then a direct sparse solver is used for factorization and solution. Two numerical examples are considered. The sparsification technique we developed allows us to extend the set of problems that can be solved efficiently. It is also shown that even for quickly decaying kernels, more sophisticated methods for solving large dense FEM (Galerkin) systems of equations, such as [9, 10], should be used.
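The dropping strategy described above (assemble the dense Galerkin matrix, zero out entries below a threshold, pass the result to a direct sparse solver) can be sketched as follows; the 1/R³ model matrix and the drop tolerance here are illustrative stand-ins for the actual FEM matrix:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Stand-in for a dense FEM matrix with a quickly decaying (~1/R^3) kernel:
# a dominant diagonal plus off-diagonal entries decaying with "distance".
n = 200
idx = np.arange(n)
R = np.abs(idx[:, None] - idx[None, :]) + 1.0
A = 1.0 / R**3
np.fill_diagonal(A, 2.0)

# Drop entries smaller than eps times the row maximum (keep the diagonal).
eps = 1e-4
mask = np.abs(A) >= eps * np.abs(A).max(axis=1, keepdims=True)
np.fill_diagonal(mask, True)
A_sparse = sp.csr_matrix(np.where(mask, A, 0.0))
print("kept fraction of entries:", A_sparse.nnz / n**2)

# Direct sparse factorization and solve, as in the pipeline above.
b = np.ones(n)
x = spla.spsolve(A_sparse.tocsc(), b)
x_dense = np.linalg.solve(A, b)
print("relative error vs dense solve:",
      np.linalg.norm(x - x_dense) / np.linalg.norm(x_dense))
```

Because the dropped entries are tiny and the matrix is diagonally dominant, the sparsified solve stays close to the dense one while storing only a narrow band of entries.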

2 Equations evaluation

2.1 Preliminaries

In this paper we study the currents in conducting layers separated by layers of dielectric. Let tm be the thickness of the conducting layers and dk be the thickness of the dielectric layers, where k, m are the numbers of the layers. Conducting layers can contain several simply connected conductors of arbitrary shape. Let the number of conductors in all layers be Nc and the total number of holes in all conductors be Nh. Each conductor can have current terminals where inlet or outlet currents are given. For a large class of microwave and digital circuits it can be assumed [11, 6] that dk ≪ l, tm ≪ l, where l is the typical lateral size of the circuit in the plane (x, y). Each conductor occupies the space domain Vm = Sm × [h0m, h1m], m = 1, . . . , Nc. The two-dimensional domain Sm is the projection of the conductor onto the plane (x, y). We call the boundary of the conductor ∂Sm the boundary of the projection Sm. Let ∂Sh,k be the boundary of the hole with number k and ∂Sext,m the external boundary of the m-th conductor. We assume that all current terminals are on the external boundaries of the conductors. The magnetic field is excited by the external magnetic field, the currents circulating around holes, and the currents through chains of terminals on the conductors. For further convenience, let P, P0 stand for points in 3D space and r, r0 for points in the plane. Also, consider the differential operators ∂x = ∂/∂x, ∂y = ∂/∂y, ∇xy = (∂x, ∂y).

2.2 London Equations for Conductors of Finite Thickness

The basic equations for further consideration are the static London equations [1]. Let j be the current density, H the total magnetic field (including the self-field of j and the external magnetic field), and λ the so-called London penetration depth [1]. Then the basic equations are:

λ² ∇ × j + H = 0,   (5)
∇ × H = j.   (6)


Typically, λ and the film thickness are of the same order. As the film is assumed thin, j ≈ j(x, y) and the problem reduces to the z-component of (5) [12]:

λ² (∂x jy(P0) − ∂y jx(P0)) + Hz(P0) = 0.   (7)

Consider the sheet current density Jm(r):

Jm(r) = ∫_{h0m}^{h1m} j(P) dz,   r ∈ Sm.   (8)

The self magnetic field in (7) is calculated by means of the average current density Jn(r)/tn and the Biot–Savart formula:

H(P0) = (1/4π) Σ_{n=1}^{Nc} (1/tn) ∫_{Vn} Jn(r) × ∇P (1/|P − P0|) dvP.   (9)

Consider the London penetration depth for films:

λsm = λm²/tm.   (10)

Averaging (7) over the thickness of the conductors, we obtain the following equations for the sheet currents in the conductors:

λsm (∂x Jm,y(r0) − ∂y Jm,x(r0)) + (1/4π) Σ_{n=1}^{Nc} ∫∫_{Sn} (Jn(r) × ∇xy Gmn(r, r0))_z ds_r + Hz(r0) = 0,   (11)

where r0 ∈ Sm, m = 1, . . . , Nc, Hz(r) is the z-component of the external magnetic field, and

Gmn(r, r0) = (1/(tm tn)) ∫_{h0m}^{h1m} dz0 ∫_{h0n}^{h1n} dz / |P − P0|.   (12)
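The layer-averaged kernel (12) reduces to a double one-dimensional integral over the two film thicknesses and can be evaluated, for example, by Gauss–Legendre quadrature. A sketch (the layer heights and the lateral distance ρ = |r − r0| below are illustrative):

```python
import numpy as np

def G_mn(rho, h0m, h1m, h0n, h1n, order=16):
    """Kernel (12): (1/(t_m t_n)) * int dz0 int dz / |P - P0|,
    where rho = |r - r0| is the lateral distance between the points."""
    tm, tn = h1m - h0m, h1n - h0n
    # Gauss-Legendre nodes/weights mapped from [-1, 1] to each layer.
    x, w = np.polynomial.legendre.leggauss(order)
    z0 = 0.5 * (h1m + h0m) + 0.5 * tm * x
    w0 = 0.5 * tm * w
    z1 = 0.5 * (h1n + h0n) + 0.5 * tn * x
    w1 = 0.5 * tn * w
    # |P - P0| = sqrt(rho^2 + (z0 - z1)^2) on the tensor quadrature grid.
    dist = np.sqrt(rho**2 + (z0[:, None] - z1[None, :])**2)
    return (w0[:, None] * w1[None, :] / dist).sum() / (tm * tn)

# For rho much larger than the film thicknesses, G_mn approaches 1/rho,
# which is the quick decay exploited by the sparsification strategy.
val = G_mn(rho=10.0, h0m=0.0, h1m=0.1, h0n=0.3, h1n=0.4)
print(val, 1.0 / 10.0)
```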

The equations (11) must be completed by the charge conservation law ∇ · Jm = 0, m = 1, . . . , Nc. Our goal is to take into account the small but finite thickness of the conductors. Therefore we substitute both of the one-dimensional integra