The Association for Computing Machinery 1515 Broadway New York, New York 10036
Copyright © 2005 by the Association for Computing Machinery, Inc. (ACM). Permission to make digital or hard copies of portions of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permission to republish from: Publications Dept., ACM, Inc. Fax +1 (212) 869-0481 or .

For other copying of articles that carry a code at the bottom of the first or last page, copying is permitted provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.

Notice to Past Authors of ACM-Published Articles: ACM intends to create a complete electronic archive of all articles and/or other material previously published by ACM. If you have written a work that has been previously published by ACM in any journal or conference proceedings prior to 1978, or in any SIG Newsletter at any time, and you do NOT want this work to appear in the ACM Digital Library, please inform [email protected], stating the title of the work, the author(s), and where and when published.
ISBN: 1-59593-095-7
Additional copies may be ordered prepaid from:
ACM Order Department, PO Box 11405, New York, NY 10286-1405
Phone: 1-800-342-6626 (US and Canada), +1-212-626-0500 (all other countries)
Fax: +1-212-944-1318
E-mail: [email protected]
ACM Order Number 505050
Printed in the USA
Foreword

ISSAC 2005 is a continuation of a well-established series of international conferences for the presentation of the latest advances in the field of Symbolic and Algebraic Computation. The first meeting of the series (1966) was held in Washington, DC, and sponsored by the Association for Computing Machinery (ACM). Since then, the abbreviated name of the meeting has evolved through SYMSAM, SYMSAC, EUROSAM, and EUROCAL to finally settle on the present name ISSAC. This 30th meeting was hosted by the Key Laboratory of Mathematics Mechanization, Chinese Academy of Sciences, Beijing, China, from July 24 to July 27. The topics of the conference include, but are not limited to:

• Algorithmic mathematics. Algebraic, symbolic and symbolic-numeric algorithms. Simplification, function manipulation, equations, summation, integration, ODE/PDE, linear algebra, number theory, group and geometric computing.

• Computer Science. Theoretical and practical problems in symbolic computation. Systems, problem solving environments, user interfaces, software, libraries, parallel/distributed computing and programming languages for symbolic computation, concrete analysis, benchmarking, theoretical and practical complexity of computer algebra algorithms, automatic differentiation, code generation, mathematical data structures and exchange protocols.

• Applications. Problem treatments using algebraic, symbolic or symbolic-numeric computation in an essential or a novel way. Engineering, economics and finance, physical and biological sciences, computer science, logic, mathematics, statistics, education.
Following tradition, ISSAC 2005 featured invited talks, contributed papers, tutorials, poster sessions, software exhibitions, and satellite workshops. This volume contains all the contributed papers presented at the meeting, as well as the abstracts of the invited talks.

The picture on the front cover shows a page from the classic Chinese mathematics book "Jade Mirror of the Four Elements" by Zhu Shijie, written in 1303 AD during the Yuan Dynasty. On this page, a system of equations in three unknowns of degree three is reduced to a univariate equation by eliminating variables.

For ISSAC 2005, a total of 111 papers were submitted, and each was distributed to members of the program committee and external reviewers. An average of 2.5 referee reports was obtained for each submission, and finally 48 papers were selected for presentation. We are particularly pleased that the papers in these proceedings represent such a broad spectrum of topics found in the field of computer algebra. Of course it is our pleasure to acknowledge the contributions of all the researchers and educators who submitted papers for consideration and all those who assisted in the selection process. We would also like to express our sincere gratitude to all the organizers listed in the front material of these proceedings. The success of ISSAC 2005 is in large part due to the efforts of these people. Finally, we thank the Association for Computing Machinery (ACM) and its Special Interest Group on Symbolic and Algebraic Computation (SIGSAM) for their sponsorship and for their assistance in the organization. We would also like to thank the following sponsors for both their financial and logistic support:

• the National Natural Science Foundation of China (NSFC),
• the Chinese Academy of Mathematics and Systems Science (AMSS),
• the Institute of Systems Sciences (ISS),
• the Key Laboratory of Mathematics Mechanization (KLMM),
• Maplesoft Inc.,
• and the Institut national de recherche en informatique et en automatique (INRIA).
We write this note in anticipation that the attendees of the ISSAC 2005 conference will find the experience both scientifically rewarding and personally satisfying.
Xiao-Shan Gao, General Co-Chair
George Labahn, General Co-Chair
Peter Paule, Program Chair
Manuel Kauers, Editor
These proceedings are dedicated to the memory of our friend and colleague
Manuel Bronstein, who died suddenly in June 2005. He will be missed by us all.
Table of Contents

Dedication to Manuel Bronstein .......... v
ISSAC 2005 Conference Organization .......... xi
ISSAC Steering Committee .......... xii
Reviewers .......... xii
Sponsor & Supporters .......... xiv

Invited Talks

• A View on the Future of Symbolic Computation .......... 1
  B. Buchberger (Johannes Kepler University)
• D-finiteness: Algorithms and Applications .......... 2
  B. Salvy (INRIA Rocquencourt)
• On a Finite Kernel Theorem for Polynomial-Type Optimization Problems and Some of its Applications .......... 4
  W. Wen-tsun (Academia Sinica)
Contributed Talks

• Gosper’s Algorithm, Accurate Summation, and the Discrete Newton-Leibniz Formula .......... 5
  S. A. Abramov (Russian Academy of Sciences), M. Petkovšek (University of Ljubljana)
• Signature of Symmetric Rational Matrices and the Unitary Dual of Lie Groups .......... 13
  J. Adams (University of Maryland), B. D. Saunders, Z. Wan (University of Delaware)
• Sum of Roots with Positive Real Parts .......... 21
  H. Anai (Fujitsu Laboratories Ltd.), S. Hara (The University of Tokyo), K. Yokoyama (Rikkyo University)
• Algebraic General Solutions of Algebraic Ordinary Differential Equations .......... 29
  J. M. Aroca, J. Cano (Univ. de Valladolid), R. Feng, X. S. Gao (Academia Sinica)
• Adherence is Better than Adjacency: Computing the Riemann Index Using CAD .......... 37
  J. C. Beaumont, R. J. Bradford, J. H. Davenport, N. Phisanbut (University of Bath)
• Fast Algorithms for Polynomial Solutions of Linear Differential Equations .......... 45
  A. Bostan (Algorithms Project), T. Cluzeau (Université de Limoges), B. Salvy (INRIA Rocquencourt)
• Non Complete Integrability of a Magnetic Satellite in Circular Orbit .......... 53
  D. Boucher (Université de Rennes 1)
• Symmetric and Semisymmetric Graphs Construction Using G-graphs .......... 61
  A. Bretto, L. Gillibert (Université de Caen), B. Laget (Ecole Nationale d’Ingénieurs de Saint-Etienne)
• Picard–Vessiot Extensions for Linear Functional Systems .......... 68
  M. Bronstein (INRIA – CAFÉ), Z. Li (Acad. of Math. and Syst. Sci.), M. Wu (INRIA – CAFÉ, Acad. of Math. and Syst. Sci.)
• On Using Bi-equational Constraints in CAD Construction .......... 76
  C. W. Brown (United States Naval Academy), S. McCallum (Macquarie University NSW)
• Hybrid Symbolic-Numeric Integration in Multiple Dimensions via Tensor-Product Series .......... 84
  O. A. Carvajal, F. W. Chapman, K. O. Geddes (University of Waterloo)
• A BLAS Based C Library for Exact Linear Algebra on Integer Matrices .......... 92
  Z. Chen, A. Storjohann (University of Waterloo)
• Structure and Asymptotic Expansion of Multiple Harmonic Sums .......... 100
  C. Costermans, J. Y. Enjalbert, H. Ngoc Minh, M. Petitot (Université Lille 2)
• Lifting Techniques for Triangular Decompositions .......... 108
  X. Dahan (LIX, École polytechnique), M. Moreno Maza (University of Western Ontario), É. Schost (LIX, École polytechnique), W. Wu, Y. Xie (ORCCA, UWO)
• Computing the Multiplicity Structure in Solving Polynomial Systems .......... 116
  B. H. Dayton, Z. Zeng (Northeastern Illinois University)
• Algorithms for the Non-monic Case of the Sparse Modular GCD Algorithm .......... 124
  J. de Kleine, M. Monagan (Simon Fraser University), A. Wittkopf (Maplesoft)
• Computing µ-Bases of Rational Curves and Surfaces Using Polynomial Matrix Factorization .......... 132
  J. Deng, F. Chen, L. Shen (University of Science and Technology of China)
• Efficient Computation of the Characteristic Polynomial .......... 140
  J.-G. Dumas, C. Pernet (Université Joseph Fourier), Z. Wan (University of Delaware)
• Selfintersections of a Bézier Bicubic Surface .......... 148
  A. Galligo, J. P. Pavone (Université de Nice Sophia-Antipolis)
• A Procedure for Proving Special Function Inequalities Involving a Discrete Parameter .......... 156
  S. Gerhold, M. Kauers (Johannes Kepler Universität)
• Generalized Loewy-Decomposition of D-Modules .......... 163
  D. Grigoriev (Université de Rennes, Beaulieu), F. Schwarz (FhG, Institut SCAI)
• On Computing Nearest Singular Hankel Matrices .......... 171
  M. A. Hitz (North Georgia College & State University)
• A Reliable Block Lanczos Algorithm over Small Finite Fields .......... 177
  B. Hovinen (University of Toronto), W. Eberly (University of Calgary)
• Schur Partition for Symmetric Ternary Forms and Readable Proof to Inequalities .......... 185
  F. Huang, S. Chen (Chinese Academy of Science)
• Affine Transformations of Algebraic Numbers .......... 193
  D. J. Jeffrey, Pratibha (The University of Western Ontario), K. B. Roach (The University of Waterloo)
• Architecture-Aware Classical Taylor Shift by 1 .......... 200
  J. R. Johnson, W. Krandick, A. D. Ruslanov (Drexel University)
• On the Complexity of Factoring Bivariate Supersparse (Lacunary) Polynomials .......... 208
  E. Kaltofen (North Carolina State University), P. Koiran (École Normale Supérieure de Lyon)
• Generic Matrix Multiplication and Memory Management in LinBox .......... 216
  E. Kaltofen (North Carolina State University), D. Morozov (Duke University), G. Yuhasz (North Carolina State University)
• Exact Analytical Solutions to the Nonlinear Schrödinger Equation Model .......... 224
  B. Li, Y. Chen (Ningbo University, Chinese Academy of Sciences), Q. Wang (Dalian University of Technology, Chinese Academy of Sciences)
• Half-GCD and Fast Rational Recovery .......... 231
  D. Lichtblau (Wolfram Research, Inc.)
• Application of Wu’s Method to Symbolic Model Checking .......... 237
  W. Mao (Chinese Academy of Sciences), J. Wu (Chinese Academy of Sciences, Lanzhou University, Universität Mannheim)
• Probabilistic Algorithms for Computing Resultants .......... 245
  M. Monagan (Simon Fraser University)
• Generalized Normal Forms and Polynomial System Solving .......... 253
  B. Mourrain (INRIA)
• Domains and Expressions: An Interface Between Two Approaches to Computer Algebra .......... 261
  C. E. Oancea, S. M. Watt (The University of Western Ontario)
• Symbolic-Numeric Completion of Differential Systems by Homotopy Continuation .......... 269
  G. Reid (University of Western Ontario), J. Verschelde (University of Illinois at Chicago), A. Wittkopf (Simon Fraser University), W. Wu (The University of Western Ontario)
• Algorithms for Symbolic/Numeric Control of Affine Dynamical Systems .......... 277
  A. Rondepierre, J.-G. Dumas (Université Joseph Fourier)
• Finding Telescopers with Minimal Depth for Indefinite Nested Sum and Product Expressions .......... 285
  C. Schneider (J. Kepler University)
• Multivariate Power Series Multiplication .......... 293
  É. Schost (LIX, École Polytechnique)
• Partial Degree Formulae for Rational Algebraic Surfaces .......... 301
  S. Pérez-Díaz, J. R. Sendra (Universidad de Alcalá)
• Computing the Rank and a Small Nullspace Basis of a Polynomial Matrix .......... 309
  A. Storjohann (University of Waterloo), G. Villard (École Normale Supérieure de Lyon)
• Approximation of Dynamical Systems using S-Systems Theory: Application to Biological Systems .......... 317
  L. Tournier (Laboratoire de Modélisation et Calcul)
• Generalized Laplace Transformations and Integration of Hyperbolic Systems of Linear Partial Differential Equations .......... 325
  S. P. Tsarev (Krasnoyarsk State Pedagogical University)
• Preconditioners for Singular Black Box Matrices .......... 332
  W. J. Turner (Wabash College)
• Solving Second Order Linear Differential Equations with Klein’s Theorem .......... 340
  M. van Hoeij (Florida State University), J.-A. Weil (Université de Limoges)
• Deterministic Equation Solving over Finite Fields .......... 348
  C. van de Woestijne (Universiteit Leiden)
• Stability Analysis of Biological Systems with Real Solution Classification .......... 354
  D. Wang (Beihang University), B. Xia (Peking University)
• An Open Problem on Metric Invariants of Tetrahedra .......... 362
  L. Yang, Z. Zeng (East China Normal University)
• Admissible Orderings and Finiteness Criteria for Differential Standard Bases .......... 365
  A. Zobnin (Moscow State University)

Author Index .......... 373
ISSAC 2005 Conference Organization

General Chairs: Xiao-Shan Gao, MMRC, Chinese Academy of Sciences (China); George Labahn, University of Waterloo, Ontario (Canada)

SIGSAM Chair: Emil Volcheck, National Security Agency (USA)

Program Committee: Peter Paule (Chair), RISC-Linz (Austria); Ron Boisvert, NIST (USA); John Cannon, University of Sydney (Australia); Howard Cheng, University of Lethbridge (Canada); Frédéric Chyzak, INRIA-Rocquencourt (France); Robert Corless, University of Western Ontario (Canada); Mark Giesbrecht, University of Waterloo, Ontario (Canada); Andreas Griewank, Humboldt-Universität zu Berlin (Germany); Tudor Jebelean, RISC-Linz (Austria); Hongbo Li, MMRC, Chinese Academy of Sciences (China); Daniel Lichtblau, Wolfram Research (USA); Michael Monagan, Simon Fraser University (Canada); Teo Mora, Università di Genova (Italy); Marko Petkovšek, University of Ljubljana (Slovenia); Tomás Recio, University of Cantabria (Spain); Felix Ulmer, Université de Rennes 1 (France); Paul Wang, Kent State University (USA); Kazuhiro Yokoyama, Kyushu University (Japan)

Poster Committee: Austin Lobo (Chair), Washington College (USA); Alin Bostan, INRIA-Rocquencourt (France); Ha Le, Simon Fraser University (Canada); William Turner, Wabash College (USA)

Tutorials: Claude-Pierre Jeannerod, INRIA-Lyon (France)

Software Exhibits: Dongming Wang, Université Pierre et Marie Curie (France)

Proceedings: Manuel Kauers, RISC-Linz (Austria)

Local Arrangements: Ziming Li (Chair), Dongdai Lin, Huilin Liu, Yujie Ma, Lihong Zhi, all MMRC, Chinese Academy of Sciences (China)

Treasurer: Zhuojun Liu, MMRC, Chinese Academy of Sciences (China)

Publicity Committee: Ilias Kotsireas, Wilfrid Laurier University (Canada)

Web/Registration: Dingkang Wang (Chair), Zhuosheng Lu, MMRC, Chinese Academy of Sciences (China)

ISSAC Steering Committee: Erich Kaltofen (Chair), North Carolina State University (USA); Mark Giesbrecht, University of Waterloo (Canada); Wolfram Koepf, Universität Kassel (Germany); Gilles Villard, INRIA-Lyon (France); Emil Volcheck, National Security Agency (USA); Kazuhiro Yokoyama, Kyushu University (Japan)
Reviewers: P. Abbott, S.A. Abramov, I. Ajwa, M. Angeles Gomez-Molleda, A. Antonov, D. Aruliah, G. Ateniese, M. Audin, E. Bach, M. Barkatou, D.A. Bini, U. Bodenhofer, P. Borwein, A. Bostan, F. Boulier, D. Bradley, M. Bronstein, J. Carette, H. Cheng, B. Chen, W.Y.C. Chen, E.-W. Chionh, R. Churchill, F. Chyzak, G.E. Collins, D. Coombs, G. Cooperman, G. Craciun, J.-G. Dumas, W. Eberly, M. El Kahoui, I.Z. Emiris, T. Erlebach, J. Farr, G. Fee, R. Feng, C. Fieker, S. Fortune, A. Galligo, X.-S. Gao, J. Gerhard, P. Giorgi, M. Giusti, L. Gonzalez-Vega, T. Hagerup, M. Harrison, W. Harris, G. Havas, T.F. Havel, F. Hess, Q.-H. Hou, E. Hubert, R. Israel, T. Jebelean, R. Joan, J. Johnson, B. Juettler, D. Kapadia, D. Kapur, M. Kauers, M. Kida, E.P. Klement, J. Knoop, I.S. Kotsireas, W. Krandick, G. Labelle, G. Landsmann, J. Lauri, D. Lazard, G. Lecerf, H. Le, D. Lichtblau, P. Lisonek, Z. Li, A. Lobo, A.J. Maciejewski, E. Mansfield, M.G. Marinari, A. Martens, R. Martin, D. Masulovic, J. May, J.M. McNamee, E. Melis, M. Monagan, T. Mora, M. Moreno-Maza, J. Moulin Ollagnier, B. Mourrain, J.-M. Muller, M. Noro, F. Ollivier, A. Orlando, P. Orponen, V. Pan, H. Park, C. Pernet, M. Petkovšek, G. Pfister, R. Pozo, A. Quadrat, A. Quiros, S. Ratschan, D. Richardson, A. Riese, E. Rodriguez Carbonell, C.M. Roney-Dougal, F. Rouillier, S.M. Rump, B. Salvy, S. Saminger, F. Santos, T. Sasaki, D. Saunders, J. Schicho, C. Schneider, W. Schreiner, F. Schwarz, J. Segura, A. Seidl, R. Sendra, T. Shimoyama, I. Shparlinski, M. Singer, A. Solomon, G. Sommer, A. Steel, H.J. Stetter, A. Storjohann, A. Szanto, N. Temme, V. Timofte, S. Tsarev, B. Unger, C. van de Woestijne, J. van der Hoeven, M. van der Put, M. van Hoeij, J. Verschelde, G. Villard, K. Weber, J.A. Weil, V. Weispfenning, F. Winkler, W. Wu, Y. Wu, L. Yang, P. Yu, M. Zhang, L. Zhi, B. Zimmermann, E. Zuazua, W. Zudilin

In addition, 52 reviewers have requested that their name not appear.
Sponsor & Supporters

ISSAC 2005 is sponsored by the Association for Computing Machinery, SIGSAM — Special Interest Group on Symbolic and Algebraic Manipulation, with financial support from
National Natural Science Foundation of China
Maplesoft
Academy of Mathematics and Systems Science
Institute of Systems Science
Key Laboratory of Mathematics Mechanization
Institut national de recherche en informatique et en automatique
A View on the Future of Symbolic Computation
Invited Talk Abstract

Bruno Buchberger
Research Institute for Symbolic Computation, Johannes Kepler University, Linz, Austria
[email protected]

Since approximately 1960, symbolic computation has added algebraic algorithms (polynomial algorithms, simplification algorithms for expressions, algorithms for integration, algorithms for the analysis of algebraic structures like groups, etc.) to numerics, and has provided both numerical and algebraic algorithms within powerful integrated mathematical software systems like Macsyma, Reduce, ..., Mathematica, Maple, .... Various wonderful tools like graphics, notebook facilities, extensible two-dimensional syntax, etc. greatly enhanced the attractiveness of these systems for mathematicians, scientists, and engineers. Over the recent decades, sometimes based on very early work in the 19th century, new and deep research results in various branches of mathematics have been developed by the symbolic computation research community, which have led to an impressive variety of new algebraic algorithms. In parallel, in a different community, based on new and deep results in mathematical logic, algorithms and systems for automated theorem proving were developed. In the editorial for the Journal of Symbolic Computation (1985), I tried to offer this journal as a common forum for both the computer algebra and the computational logic community and for the interaction and merging of the two fields. In fact,
in some specific theorem proving methods (as, for example, decision methods for the first-order theory of real closed fields and decision methods for geometry), algebraic techniques play an important role. However, we are not yet at a stage where both worlds, the world of computational algebra (the algorithmization of the object level of mathematics) and the world of computational logic (the algorithmization of the meta-level of mathematics), would find their common frame in terms of integrated mathematical software systems. In the talk, I will sketch a view of future symbolic computation that will hopefully integrate numerics, computer algebra, and computational logic in a unified frame and will offer software systems supporting the entire process of what could be called "mathematical theory exploration" or "mathematical knowledge management". In this view, symbolic computation is not only a specific part of mathematics but, rather, will be a specific way of doing mathematics. This will have drastic effects on the way research, education, and application in mathematics will be possible, and on how the publication, accumulation, and use of mathematical knowledge will be organized. We envisage a kind of "Bourbakism of the 21st century", which will be very different — and partly in opposition to — the Bourbakism of the 20th century.
D-finiteness: Algorithms and Applications
Invited Talk Abstract

Bruno Salvy
Algorithms Project, Inria Rocquencourt, 78153 Le Chesnay (France)
[email protected]

Differentially finite series are solutions of linear differential equations with polynomial coefficients. P-recursive sequences are solutions of linear recurrences with polynomial coefficients. Corresponding notions are obtained by replacing classical differentiation or difference operators by their q-analogues. All these objects share numerous properties that are described in the framework of "D-finiteness". Our aim in this area is to enable computer algebra systems to deal in an algorithmic way with a large number of special functions and sequences. Indeed, it can be estimated that approximately 60% of the functions described in Abramowitz & Stegun's handbook [1] fall into this category, as well as 25% of the sequences in Sloane's encyclopedia [20, 21]. In a way, D-finite sequences or series are non-commutative analogues of algebraic numbers: the role of the minimal polynomial is played by a linear operator. Ore [14] described a non-commutative version of Euclidean division and the extended Euclidean algorithm for these linear operators (known as Ore polynomials). In the same way as in the commutative case, these algorithms make several closure properties effective (see [22]). It follows that identities between these functions or sequences can be proved or computed automatically. Part of the success of the gfun package [17] comes from an implementation of these operations. Another part comes from the possibility of discovering such identities empirically, with Padé-Hermite approximants on power series [2] taking the place of the LLL algorithm on floating-point numbers. The discovery that a series is D-finite is also important from the complexity point of view: several operations can be performed on D-finite series at a lower cost than on arbitrary power series. This includes multiplication, but also evaluation at rational points by binary splitting [4]. A typical application is the numerical evaluation of π in computer algebra systems; we give another one in these proceedings [3]. Also, the local behaviour of solutions of linear differential equations in the neighbourhood of their singularities is well understood [9], and implementations of algorithms computing
the corresponding expansions are available [24, 13]. This gives access to the asymptotics of numerous sequences or to analytic proofs that sequences or functions cannot satisfy such equations [10]. Results of a more algebraic nature are obtained by differential Galois theory [18, 19], which naturally shares many subroutines with algorithms for D-finite series. The truly spectacular applications of D-finiteness come from the multivariate case: instead of series or sequences, one works with multivariate series or sequences, or with sequences of series or polynomials, .... They obey systems of linear operators that may be of differential, difference, q-difference or mixed types, with the extra constraint that a finite number of initial conditions are sufficient to specify the solution. This is a non-commutative analogue of polynomial systems with a finite number of solutions. It turns out that, as in the polynomial case, Gröbner bases give algorithmic answers to many decision questions, by providing normal forms in a finite-dimensional vector space. This was observed first in the differential case [11, 23] and then extended to the more general multivariate Ore case [8]. A crucial insight of Zeilberger [27, 15] is that elimination in this non-commutative setting computes definite integrals or sums. This is known as creative telescoping. In the hypergeometric setting (when the quotient is a vector space of dimension 1), a fast algorithm for this operation is known as Zeilberger's fast algorithm [26]. In the more general case, Gröbner bases are of help in this elimination. This is true in the differential case [16, 25] and to a large extent in the more general multivariate case [8]. Also, Zeilberger's fast algorithm has been generalized to the multivariate Ore case by Chyzak [5, 6]. Still, various efficiency issues remain, and phenomena of non-minimality of the eliminated operators are not completely understood. A further generalization of D-finite series is due to Gessel [12], who developed a theory of symmetric series. These series are such that when all but a finite number of their variables (in a certain basis) are specialized to 0, the resulting series is D-finite in the previous sense. Closure properties under scalar product lead to proofs of D-finiteness (in the classical sense) for various combinatorial sequences. Again, algorithms based on Gröbner bases make these operations effective [7]. The talk will survey the nicest of these algorithms and their applications. I will also indicate where current work is in progress, or where more work is needed.
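As a small taste of how algorithmic such summation has become, here is a sketch in SymPy (our illustration; the talk's own tools are the Maple packages cited above) that applies Gosper's algorithm to a telescoping sum:

```python
from sympy import symbols, factorial
from sympy.concrete.gosper import gosper_sum

n, k = symbols('n k', integer=True, nonnegative=True)

# Gosper's algorithm finds a hypergeometric antidifference when one exists;
# here it evaluates sum_{k=0}^{n} k * k! in closed form.
print(gosper_sum(k * factorial(k), (k, 0, n)))   # factorial(n + 1) - 1
```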
Categories and Subject Descriptors
I.1.2 [Symbolic and Algebraic Manipulation]: Algorithms

General Terms
Algorithms

Keywords
Computer algebra, Linear differential equations, Linear recurrences, Creative telescoping, Elimination.

1. REFERENCES

[1] M. Abramowitz and I. A. Stegun, editors. Handbook of mathematical functions with formulas, graphs, and mathematical tables. Dover Publications Inc., New York, 1992. Reprint of the 1972 edition.
[2] B. Beckermann and G. Labahn. A uniform approach for the fast computation of matrix-type Padé approximants. SIAM Journal on Matrix Analysis and Applications, 15(3):804–823, July 1994.
[3] A. Bostan, T. Cluzeau, and B. Salvy. Fast algorithms for polynomial solutions of linear differential equations. In M. Kauers, editor, Symbolic and Algebraic Computation, New York, 2005. ACM Press. Proceedings of ISSAC'05, July 2005, Beijing, China.
[4] D. V. Chudnovsky and G. V. Chudnovsky. Approximations and complex multiplication according to Ramanujan. In Ramanujan revisited, pages 375–472. Academic Press, Boston, MA, 1988.
[5] F. Chyzak. Fonctions holonomes en calcul formel. PhD thesis, École polytechnique, 1998.
[6] F. Chyzak. An extension of Zeilberger's fast algorithm to general holonomic functions. Discrete Mathematics, 217(1-3):115–134, 2000.
[7] F. Chyzak, M. Mishna, and B. Salvy. Effective scalar products of D-finite symmetric functions. Journal of Combinatorial Theory, Series A, 2005. 51 pages. To appear.
[8] F. Chyzak and B. Salvy. Non-commutative elimination in Ore algebras proves multivariate holonomic identities. Journal of Symbolic Computation, 26(2):187–227, Aug. 1998.
[9] E. Fabry. Sur les intégrales des équations différentielles linéaires à coefficients rationnels. Thèse de doctorat ès sciences mathématiques, Faculté des Sciences de Paris, July 1885.
[10] P. Flajolet, S. Gerhold, and B. Salvy. On the non-holonomic character of logarithms, powers, and the nth prime function. The Electronic Journal of Combinatorics, 11(2), Apr. 2005. A2, 16 pages.
[11] A. Galligo. Some algorithmic questions on ideals of differential operators. In B. F. Caviness, editor, Proceedings EUROCAL'85, volume 204 of Lecture Notes in Computer Science, pages 413–421. Springer-Verlag, 1985.
[12] I. M. Gessel. Symmetric functions and P-recursiveness. Journal of Combinatorial Theory, Series A, 53:257–285, 1990.
[13] M. van Hoeij. Formal solutions and factorization of differential operators with power series coefficients. Journal of Symbolic Computation, 24(1):1–30, 1997.
[14] O. Ore. Theory of non-commutative polynomials. Annals of Mathematics, 34:480–508, 1933.
[15] M. Petkovšek, H. S. Wilf, and D. Zeilberger. A = B. A. K. Peters, Wellesley, MA, 1996.
[16] M. Saito, B. Sturmfels, and N. Takayama. Gröbner deformations of hypergeometric differential equations. Springer-Verlag, Berlin, 2000.
[17] B. Salvy and P. Zimmermann. Gfun: a Maple package for the manipulation of generating and holonomic functions in one variable. ACM Transactions on Mathematical Software, 20(2):163–177, 1994.
[18] M. F. Singer. Liouvillian solutions of n-th order homogeneous linear differential equations. American Journal of Mathematics, 103(4):661–682, 1981.
[19] M. F. Singer. Algebraic relations among solutions of linear differential equations. Transactions of the American Mathematical Society, 295(2):753–763, 1986.
[20] N. J. A. Sloane. The On-Line Encyclopedia of Integer Sequences. 2005. Published electronically at http://www.research.att.com/~njas/sequences/.
[21] N. J. A. Sloane and S. Plouffe. The Encyclopedia of Integer Sequences. Academic Press, 1995.
[22] R. P. Stanley. Enumerative combinatorics, volume 2. Cambridge University Press, 1999.
[23] N. Takayama. Gröbner basis and the problem of contiguous relations. Japan Journal of Applied Mathematics, 6(1):147–160, 1989.
[24] É. Tournier. Solutions formelles d'équations différentielles. Doctorat d'état, Université scientifique, technologique et médicale de Grenoble, 1987.
[25] H. Tsai. Algorithms for algebraic analysis. PhD thesis, University of California at Berkeley, Spring 2000.
[26] H. S. Wilf and D. Zeilberger. Rational function certification of multisum/integral/"q" identities. Bulletin of the American Mathematical Society, 27(1):148–153, July 1992.
[27] D. Zeilberger. A holonomic systems approach to special functions identities. Journal of Computational and Applied Mathematics, 32(3):321–368, 1990.
On a Finite Kernel Theorem for Polynomial-Type Optimization Problems and Some of its Applications
Invited Talk Abstract

Wu Wen-tsun
MMKL, Academy of Mathematics and System Sciences, Academia Sinica, Beijing, 100080, P.R. China
[email protected]

Extremalization and optimization problems have been considered of utmost importance both in the past and in the present day. Thus, at the very beginning of infinitesimal calculus, the determination of maxima and minima was one of the stimulating problems that caused the creation of calculus, and one of the successful applications that drove its rapid further development. However, the maxima and minima involved are all of local character, leading to equations that are difficult to solve, not to speak of the inherent logical difficulties concerning the necessary and/or sufficient conditions to be satisfied. In recent years, owing to the creation of computers, various kinds of numerical methods have been developed, usually involving some converging process. These methods, besides facing such problems as stability and error control, can hardly give the greatest or least value (the global optimal value, for short) over the whole domain, which is supposed in advance to exist. However, the problem becomes very agreeable if we limit ourselves to the polynomial-type case. In fact, based on the classical treatment of polynomial equation solving in ancient China and its modernization due to J. F. Ritt, we have discovered a Finite Kernel Theorem to the effect that a finite set of real values, to be called the finite kernel set of the given problem, may be determined so that all possible extremal values are found among this finite set, and the corresponding extremal zeros are then trivially determined. Clearly this gives the global optimal value over the whole domain in consideration, if it is already known to exist in some way. Associated packages wsolve and e_val have been given by D. K. Wang and have been applied with success to various kinds of problems, such as polynomial definiteness and non-linear programming, particularly problems involving inequalities.
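The idea behind the Finite Kernel Theorem can be illustrated in miniature with a toy polynomial optimization (our sketch, not the wsolve package itself): the candidate extremal values form a finite set, obtained here from the real critical points, and the global optimum, when it exists, is the least element of that set.

```python
from sympy import symbols, solve, diff

x, y = symbols('x y', real=True)
f = x**4 + y**4 - 4*x*y + 1        # a polynomial objective with a global minimum

# all extremal values of f occur among its values at the finitely many
# real critical points: a "finite kernel set" in miniature
crit = solve([diff(f, x), diff(f, y)], [x, y], dict=True)
values = sorted({f.subs(s) for s in crit if s[x].is_real and s[y].is_real})
print(values)                      # [-1, 1]; the global minimum is -1
```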
Gosper’s Algorithm, Accurate Summation, and the Discrete Newton-Leibniz Formula

S. A. Abramov, Russian Academy of Sciences, Dorodnicyn Computing Centre, Vavilova 40, 119991, Moscow GSP-1, Russia
M. Petkovšek, Department of Mathematics, University of Ljubljana, Jadranska 19, SI-1000 Ljubljana, Slovenia
[email protected] [email protected]

(The work is partially supported by the ECO-NET program of the French Foreign Affairs Ministry. S. A. Abramov is partially supported by RFBR under grant 04-01-00757; M. Petkovšek is partially supported by ARRS under grant P1-0294.)

ABSTRACT

Sufficient conditions are given for validity of the discrete Newton-Leibniz formula when the indefinite sum is obtained either by Gosper's algorithm or by the Accurate Summation algorithm. It is shown that sometimes a polynomial can be factored from the summand in such a way that the safe summation range is increased.

Categories and Subject Descriptors: G.2.1 [Combinatorics]: Counting problems; I.1.2 [Algorithms]: Algebraic algorithms

General Terms: Algorithms

Keywords: symbolic summation, Gosper's algorithm, Accurate Summation algorithm, Newton-Leibniz formula

1. INTRODUCTION

Let K be a field of characteristic zero. A function t : I → K defined on an interval of integers I ⊆ Z is a

• hypergeometric term if there are nonzero polynomials a_0, a_1 ∈ K[n] such that a_1(n)t(n+1) + a_0(n)t(n) = 0 for all n ∈ Z such that n, n+1 ∈ I;

• P-recursive sequence if there are polynomials a_0, a_1, ..., a_ρ ∈ K[n] such that a_0 a_ρ ≠ 0 and a_ρ(n)t(n+ρ) + ··· + a_1(n)t(n+1) + a_0(n)t(n) = 0 for all n ∈ Z such that n, n+1, ..., n+ρ ∈ I.

Each hypergeometric term is, of course, a P-recursive sequence. If t(n) is a hypergeometric term, one can use the well-known Gosper's algorithm [6] to find (if it exists) another hypergeometric term u(n) which satisfies the key equation

$$u(n+1) - u(n) = t(n) \qquad (1)$$

for all n ∈ I \ S, where S is a finite set. Summing this equation on n from v to w we get the discrete analog of the Newton-Leibniz formula

$$\sum_{n=v}^{w} t(n) = u(w+1) - u(v) \qquad (2)$$

provided that [v, w] ∩ Z ⊆ I \ S. In many existing implementations of Gosper's algorithm, however, indiscriminate use of (2) sometimes results in wrong answers. Here is a case in point.

Example 1. Consider the sequence

$$t(n) = \binom{2n-3}{n} \frac{1}{4^n}, \qquad (3)$$

which is defined for all n ∈ Z. This is a hypergeometric term which satisfies

$$2(n+1)(n-2)\, t(n+1) = (2n-1)(n-1)\, t(n) \qquad (4)$$

for all n ∈ Z. Gosper's algorithm succeeds with input t(n) and returns

$$u(n) = \frac{2n(n+1)}{(n-2)\,4^n} \binom{2n-3}{n}.$$

Summing equation (1) on n from 0 to m, the left-hand side telescopes, and we obtain

$$\sum_{n=0}^{m} t(n) \stackrel{?}{=} u(m+1) - u(0) = \frac{(m+1)(m+2)}{2(m-1)\,4^m} \binom{2m-1}{m+1}. \qquad (5)$$

But the expression on the right gives the true value of the sum only at m = 0. At m = 1 it is undefined, while at each m ≥ 2 its value is 3/8 less than the actual value of the sum. The problem here is that u(n) is undefined at n = 2, hence equation (1) does not hold for n ∈ {1, 2}, and summing it over a range including 1 or 2 may give a wrong answer. This is not an isolated example: a similar phenomenon seems to occur with the sum $\sum_{n=0}^{m} \binom{2n-p}{n} 4^{-n}$ for each positive integer p.
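The discrepancy in Example 1 is easy to reproduce. The following SymPy sketch (ours, not part of the paper; it relies on SymPy's binomial agreeing with the generalized definition used above for negative integer upper arguments, which it does) compares both sides of (5):

```python
from sympy import binomial, Rational

def t(n):
    # summand of Example 1: t(n) = C(2n-3, n) / 4^n
    return binomial(2*n - 3, n) / Rational(4)**n

def rhs(m):
    # right-hand side of (5)
    return (m + 1)*(m + 2)*binomial(2*m - 1, m + 1) / (2*(m - 1)*Rational(4)**m)

for m in [0, 2, 3, 4, 5]:              # m = 1 makes (5) undefined
    actual = sum(t(n) for n in range(m + 1))
    print(m, actual, rhs(m), actual - rhs(m))
```

For m = 0 the printed difference is 0, and for every m ≥ 2 it is the constant 3/8, exactly as stated above.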
If t is a P-recursive sequence, then one can use the Accurate Summation algorithm from [3], or its generalization in [5], to solve equation (1) (we discuss this algorithm in Section 5). Problems similar to those arising in Example 1 are possible when one uses the resulting Newton-Leibniz formula. Notice that one can apply the Accurate Summation algorithm in the case ρ = 1 as an alternative to Gosper's algorithm; then the incorrect formula (5) will appear again.

This common error is the discrete analogon of a well-known error in definite integration committed by some of the early symbolic integrators: when attempting to evaluate I = ∫_a^b f(x) dx by computing first an antiderivative F(x) such that F′(x) = f(x), and then using the Newton-Leibniz formula I = F(b) − F(a), we may obtain an incorrect answer unless F(x) is continuous on [a, b]. For example, the actual value of

$$\int_{-1}^{1} \frac{x^2+1}{x^4-x^2+1}\, dx$$

is π, but using the antiderivative arctan(x − 1/x) in the Newton-Leibniz formula gives 0.

The obvious solution is to split the summation interval into several subintervals that do not contain the exceptional points from S. In this paper we analyze the exceptional set S that appears in Gosper's algorithm when summing hypergeometric terms, and more generally, in the Accurate Summation algorithm [3] when summing P-recursive sequences. Section 3 provides sufficient conditions for the Newton-Leibniz formula (2) to hold when the indefinite sum u(n) is obtained by Gosper's algorithm, and Section 5 does the same for Accurate Summation. These conditions provide a bounding interval for the exceptional set S, and are of two kinds: a priori conditions, which are weaker but readily available even before running the algorithms, as they are based on the singularities of the operator annihilating the summand; and a posteriori conditions, which are stronger but available only after running the algorithms, as they are based on their output. On the other hand, in Section 4 we prove that for proper hypergeometric terms the discrete Newton-Leibniz formula is valid without restrictions. For general P-recursive sequences, Section 6 shows that sometimes a polynomial can be factored from the summand in such a way that the size of the bounding interval in the a priori condition is decreased.

A thorough analysis of the relationship between hypergeometric terms as syntactic objects and their analytic meaning in the context of summation has been provided by M. Schorn in [8]. The solution proposed there for evaluation of sums such as the one in Example 1 is by means of suitably chosen limiting processes.

2. PRELIMINARIES

Definition 1. Following conventional notation, the rising factorial power (α)_n and its reciprocal 1/(β)_n are defined for α, β ∈ K and n ∈ Z by

$$(\alpha)_n = \begin{cases} \prod_{k=0}^{n-1} (\alpha+k), & n \ge 0, \\ \prod_{k=1}^{|n|} \frac{1}{\alpha-k}, & n < 0,\ \alpha \ne 1, 2, \dots, |n|, \\ \text{undefined}, & \text{otherwise;} \end{cases}$$

$$\frac{1}{(\beta)_n} = \begin{cases} \prod_{k=0}^{n-1} \frac{1}{\beta+k}, & n \ge 0,\ \beta \ne 0, -1, \dots, 1-n, \\ \prod_{k=1}^{|n|} (\beta-k), & n < 0, \\ \text{undefined}, & \text{otherwise.} \end{cases}$$

Note that if (α)_n resp. 1/(β)_n is defined for some n ∈ Z, then (α)_{n+1} resp. 1/(β)_{n−1} is defined for that n as well. More precisely, if α ∈ Z and α ≥ 1 then (α)_n is defined on [−α+1, ∞) ∩ Z, otherwise it is defined on all of Z. Similarly, if β ∈ Z and β ≤ 0 then 1/(β)_n is defined on (−∞, −β] ∩ Z, otherwise it is defined on all of Z. Thus (α)_n and 1/(β)_n are hypergeometric terms which satisfy

$$(\alpha)_{n+1} = (\alpha+n)(\alpha)_n, \qquad \frac{\beta+n}{(\beta)_{n+1}} = \frac{1}{(\beta)_n} \qquad (6)$$

whenever (α)_n and 1/(β)_{n+1} are defined. If I ⊆ Z is an infinite interval of integers we denote

$$I^+ = \begin{cases} (-\infty, a+1] \cap \mathbb{Z}, & \text{if } I = (-\infty, a] \cap \mathbb{Z}, \\ I, & \text{otherwise;} \end{cases} \qquad I^- = \begin{cases} (-\infty, a-1] \cap \mathbb{Z}, & \text{if } I = (-\infty, a] \cap \mathbb{Z}, \\ I, & \text{otherwise.} \end{cases}$$

We use E to denote the shift operator w.r.t. n, so that E t(n) = t(n+1). Since juxtaposition can mean not only operator application but also composition of operators, we use ◦ to denote the latter in case of ambiguity, so that, e.g., E ◦ t(n) = t(n+1) ◦ E = t(n+1)E. Sometimes we use parentheses to denote operator application, writing, e.g., E(1) = E 1 = 1.

Definition 2. For a linear difference operator

$$L = a_\rho E^\rho + a_{\rho-1} E^{\rho-1} + \cdots + a_0 \qquad (7)$$

where ρ ≥ 1, a_ρ, ..., a_0 ∈ K[n], a_ρ a_0 ≠ 0 and gcd(a_0, ..., a_ρ) = 1, we define the sets S_L^l of leading and S_L^t of trailing integer singularities by

$$S_L^l = \{x \in \mathbb{Z};\ a_\rho(x-\rho) = 0\}, \qquad S_L^t = \{x \in \mathbb{Z};\ a_0(x) = 0\}.$$

We call

• m_L^l = min(S_L^l ∪ {+∞}) the minimal leading singularity of L,
• M_L^l = max(S_L^l ∪ {−∞}) the maximal leading singularity of L,
• m_L^t = min(S_L^t ∪ {+∞}) the minimal trailing singularity of L,
• M_L^t = max(S_L^t ∪ {−∞}) the maximal trailing singularity of L.

Proposition 1. Let L be as in (7) and b ∈ K[n]. If a rational function y ∈ K(n) satisfies

$$a_\rho(n)y(n+\rho) + \cdots + a_0(n)y(n) = b(n), \qquad (8)$$

then y(n) has no integer poles outside the interval (possibly empty) [m_L^l, M_L^t]. For a proof, see [1].
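The integer singularities of Definition 2 are straightforward to compute in practice. A small sketch (ours) for the operator L = 2(n+1)(n−2)E − (2n−1)(n−1) annihilating the term of Example 1 (so ρ = 1):

```python
from sympy import symbols, solve

n = symbols('n')

# L = a1(n) E + a0(n), the annihilator from Example 1, so rho = 1
a1 = 2*(n + 1)*(n - 2)
a0 = -(2*n - 1)*(n - 1)
rho = 1

def integer_roots(p):
    return sorted(r for r in solve(p, n) if r.is_Integer)

S_l = integer_roots(a1.subs(n, n - rho))   # leading:  a_rho(x - rho) = 0
S_t = integer_roots(a0)                    # trailing: a_0(x) = 0
print(S_l, S_t)                            # [0, 3] and [1]
# so m_L^t = 1 and M_L^l = 3; Theorem 1 below flags the interval
# [m_L^t + 1, M_L^l - 1] = {2} as the only point to avoid
```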
3. WHEN CAN GOSPER'S ALGORITHM BE USED TO SUM HYPERGEOMETRIC TERMS?

We denote Gosper's algorithm hereafter by GA. Consider the case when (8) has the form

$$a_1(n)t(n+1) + a_0(n)t(n) = 0 \qquad (9)$$

and set L = a_1(n)E + a_0(n). Let a hypergeometric term t(n) satisfy equation (9). Given a_0(n), a_1(n) as input, GA tries to construct r ∈ K(n) such that

$$a_0(n)r(n+1) + a_1(n)r(n) = -a_1(n) \qquad (10)$$

(this can also be done by the algorithms from [1] or [2]). If such r(n) exists then u(n) = r(n)t(n) satisfies the key equation (1), possibly with finitely many exceptions. We now give two kinds of sufficient conditions for this u(n) to satisfy equation (1) and for the discrete Newton-Leibniz formula in the form

$$\sum_{k=v}^{w} t(k) = u(w) - u(v) + t(w) \qquad (11)$$

to be valid:

1. an a posteriori condition, depending on the poles of r(n) (Proposition 2),
2. an a priori condition, depending only on the integer singularities of L (Theorem 1).

In both, we make the following assumptions:

• L = a_1(n)E + a_0(n) is an operator of type (7) with ρ = 1,
• r ∈ K(n) is a rational function which satisfies (10) as an equation in K(n),
• v, w are integers such that v ≤ w,
• I_1 := [v, w−1] ∩ Z,
• t(n) is a K-valued sequence which is defined for all n ∈ [v, w] ∩ Z and satisfies (9) for all n ∈ I_1,
• u(n) is a K-valued sequence such that u(n) = r(n)t(n) whenever both r(n) and t(n) are defined.

Remark 1. Since u(n) = r(n)t(n), it is clear that, in general, formula (11) should be used instead of (2), because the latter formula needs values of the summand lying outside the summation interval, which however may be undefined. A nice example is provided by the sum

$$\sum_{k=0}^{2n} (-1)^k \binom{4n}{2k} \Big/ \binom{2n}{k}$$

whose evaluation was posed as Problem 10494 in the Amer. Math. Monthly in 1996. Here GA succeeds (see [7]), but the summand is undefined everywhere outside the summation interval.

Proposition 2. (a posteriori condition for GA) If r(n) has no integer poles in [v, w], then the key equation (1) holds for all n ∈ I_1, and the discrete Newton-Leibniz formula (11) is valid.

Proof: By assumption, t(n), t(n+1), r(n), r(n+1), u(n), u(n+1) are defined for all n ∈ I_1, and (9), (10) are valid on I_1. Therefore, for all n ∈ I_1,

$$a_0(n)u(n+1) = a_0(n)r(n+1)t(n+1) = -a_1(n)(1+r(n))\,t(n+1) \ \text{(by (10))} = a_0(n)(1+r(n))\,t(n) \ \text{(by (9))} = a_0(n)u(n) + a_0(n)t(n),$$

or equivalently,

$$a_0(n)\big(u(n+1) - u(n)\big) = a_0(n)t(n). \qquad (12)$$

Pick an n ∈ I_1. If a_0(n) ≠ 0 then (12) implies (1). If a_0(n) = 0 then, since gcd(a_0, a_1) = 1, we have a_1(n) ≠ 0. Hence (9) implies t(n+1) = 0 and (10) implies r(n)+1 = 0. Therefore u(n+1) = r(n+1)t(n+1) = 0 and u(n)+t(n) = (r(n)+1)t(n) = 0, so (1) holds for all n ∈ I_1. Summing (1) over I_1 yields (11).

Theorem 1. (a priori condition for GA) If [v, w] ∩ [m_L^t + 1, M_L^l − 1] = ∅, then the key equation (1) holds for all n ∈ I_1, and the discrete Newton-Leibniz formula (11) is valid.

Proof: Since r(n) satisfies (10), Proposition 1 implies that r(n) has no integer poles outside the interval [α, β], where

α = min({x ∈ Z; a_0(x−1) = 0} ∪ {+∞}) = m_L^t + 1,
β = max({x ∈ Z; a_1(x) = 0} ∪ {−∞}) = M_L^l − 1.

By assumption, the interval [v, w] is disjoint from [α, β]. Hence r(n) has no poles in [v, w], and the assertion follows from Proposition 2.

In practice, one would run GA and then check if the a posteriori condition of Proposition 2 is satisfied, i.e., if r(n) has any integer poles in the summation interval. If yes, this interval would be split into several subintervals in order to guarantee correct evaluation. But it may be useful to check the a priori condition of Theorem 1 first, because this will, in general, restrict the relevant domain to check for poles of r(n).

Example 2. For the hypergeometric term t(n) = $\binom{2n-3}{n}/4^n$ of Example 1, we have L = 2(n+1)(n−2)E − (2n−1)(n−1), r(n) = 2n(n+1)/(n−2), and u(n) = $2n(n+1)\binom{2n-3}{n}/((n-2)\,4^n)$. Thus S_L^t = {1}, S_L^l = {0, 3}, m_L^t = 1, M_L^l = 3, [m_L^t + 1, M_L^l − 1] = {2}, and the only integer pole of r(n) is n = 2. In this case both the a priori and the a posteriori conditions give the same point n = 2 to be avoided by the summation interval. As predicted by either condition, the key equation (1) fails at n = 1 and n = 2 because u(n) or u(n+1) are undefined there. One is tempted to absorb the denominator factor n − 2 into the binomial coefficient by replacing u(n) with, say, the sequence ū(n) = $n(n+1)\binom{2n-1}{n}/((2n-1)\,4^n)$, which is defined everywhere and agrees with u(n) for all n ≠ 1, 2. But then equation

$$\bar u(n+1) - \bar u(n) = t(n) \qquad (13)$$

fails at n = 0 and n = 1.
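A quick check of Proposition 2 and formula (11) on Example 2, over an interval that avoids the pole n = 2 of r(n) (our sketch, with helper names that are not the paper's):

```python
from sympy import binomial, Rational

def t(n):   # t(n) = C(2n-3, n) / 4^n  (Example 1)
    return binomial(2*n - 3, n) / Rational(4)**n

def u(n):   # u(n) = r(n) t(n) with r(n) = 2n(n+1)/(n-2)  (Example 2)
    return Rational(2*n*(n + 1), n - 2) * t(n)

v, w = 3, 8                        # [3, 8] avoids the singular point n = 2
lhs = sum(t(k) for k in range(v, w + 1))
rhs = u(w) - u(v) + t(w)           # discrete Newton-Leibniz formula (11)
assert lhs == rhs                  # holds, as Proposition 2 predicts
```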
Example 3. Let

$$t(n) = \begin{cases} (n-2)(n-3)(n-5)\,(n-1)!, & n \ge 2, \\ (n-2)(n-3)(n-5)\,\dfrac{(-1)^n}{(-n)!}, & n \le 1, \end{cases}$$

where we define as usual 1/k! = 0 when k is a negative integer. This is a hypergeometric term which satisfies

$$(n-5)(n-3)\, t(n+1) = (n-4)(n-1)n\, t(n)$$

for all n ∈ Z. Here we have a_0(n) = −(n−4)(n−1)n, a_1(n) = (n−5)(n−3), r(n) = (n−6)/((n−2)(n−3)), and u(n) = r(n)t(n). Thus S_L^t = {0, 1, 4}, S_L^l = {4, 6}, m_L^t = 0, M_L^l = 6, [m_L^t + 1, M_L^l − 1] = [1, 5], and the set of integer poles of r(n) is {2, 3}. The set to be avoided by the summation interval given by the a priori condition is {1, 2, 3, 4, 5}, while the analogous set given by the a posteriori condition is {2, 3}. The key equation (1) fails at n = 1, 2, 3, as predicted by the a posteriori condition, because u(n) or u(n+1) are undefined there. One can try cancelling the factor (n−2)(n−3) and replacing u(n) by the sequence

$$\bar u(n) = \begin{cases} (n-5)(n-6)\,(n-1)!, & n \ge 2, \\ (n-5)(n-6)\,\dfrac{(-1)^n}{(-n)!}, & n \le 1, \end{cases}$$

which is defined everywhere and agrees with u(n) for all n ≠ 2, 3. But equation (13) still fails at n = 1.

4. SUMMATION OF PROPER HYPERGEOMETRIC TERMS

It is clear that the a priori condition given in Theorem 1 is, in general, too cautious: e.g., if the summand is a polynomial sequence then the integer singularities of the corresponding recurrence present no obstacles to validity of the discrete Newton-Leibniz formula (11). The following example shows that even the a posteriori condition given in Proposition 2 can sometimes be too pessimistic.

Example 4. Let t(n) = (2−n)(−1/2)_n/(4·n!). This hypergeometric term is defined for all n ∈ Z (note that t(n) = 0 for n < 0) and satisfies Lt(n) = 0 for all n ∈ Z, where L is the same operator as in Example 2. Thus both Theorem 1 and Proposition 2 require the point n = 2 to be excluded from the summation interval. Equation (1) indeed fails at n = 1 and n = 2 because u(n) = r(n)t(n) is undefined at n = 2. But if we cancel the factor n − 2 in the product r(n)t(n), where r(n) = 2n(n+1)/(n−2), and replace u(n) by the resulting sequence

$$\bar u(n) = -n(n+1)\,\frac{(-1/2)_n}{2\cdot n!},$$

then equation (13) holds for all n ∈ Z, and the discrete Newton-Leibniz formula

$$\sum_{n=v}^{w} t(n) = \bar u(w+1) - \bar u(v) \qquad (14)$$

is valid for all v ≤ w. This example also shows that, thanks to possible singularities, a hypergeometric term (or a P-recursive sequence) is, in general, not uniquely defined by its annihilating operator and an appropriate number of initial values. In fact, it is shown in [4] that every positive integer is the dimension of the kernel of some operator of type (7) with ρ = 1 in the space of sequences t : Z → K.

The hypergeometric term t(n) from Example 4 is an instance of a proper term, which we are going to define now. Then we show in Theorem 2 that there are no restrictions on the validity of the discrete Newton-Leibniz formula for proper terms.

Definition 3. A hypergeometric term t(n) defined on an interval I of integers is proper if there are

• a polynomial p ∈ K[n],
• a constant z ∈ K,
• nonnegative integers q, r,
• constants α_1, ..., α_q, β_1, ..., β_r ∈ K

such that

$$t(n) = p(n)\, z^n\, \frac{\prod_{i=1}^{q} (\alpha_i)_n}{\prod_{j=1}^{r} (\beta_j)_n} \qquad (15)$$

for all n ∈ I.

Theorem 2. Let t(n) be a proper hypergeometric term defined on an interval I of integers and given by (15). Denote a(n) = z ∏_{i=1}^{q} (n+α_i) and b(n) = ∏_{j=1}^{r} (n+β_j). If a polynomial y ∈ K[n] satisfies

$$a(n)y(n+1) - b(n-1)y(n) = p(n) \qquad (16)$$

and if

$$\bar u(n) = y(n)\, z^n\, \frac{\prod_{i=1}^{q} (\alpha_i)_n}{\prod_{j=1}^{r} (\beta_j)_{n-1}}$$

for all n ∈ I^+ (see Section 2 for notation), then equation (13) holds for all n ∈ I, and the discrete Newton-Leibniz formula (14) is valid whenever [v, w] ∩ Z ⊆ I.

Proof: By assumption, (16) holds for all n ∈ I. Multiplying it by z^n ∏_{i=1}^{q} (α_i)_n / ∏_{j=1}^{r} (β_j)_n yields

$$z^n\,\frac{\prod_{i=1}^{q}(\alpha_i)_n}{\prod_{j=1}^{r}(\beta_j)_n}\, a(n)\,y(n+1) - z^n\,\frac{\prod_{i=1}^{q}(\alpha_i)_n}{\prod_{j=1}^{r}(\beta_j)_n}\, b(n-1)\,y(n) = t(n). \qquad (17)$$

Since (α_i)_n and 1/(β_j)_n are defined for all n ∈ I, (α_i)_{n+1} and 1/(β_j)_{n−1} are defined there too. By (6), a(n)∏_{i=1}^{q}(α_i)_n = z∏_{i=1}^{q}(α_i)_{n+1} and b(n−1)/∏_{j=1}^{r}(β_j)_n = 1/∏_{j=1}^{r}(β_j)_{n−1}. Hence (17) is the same as (13), and (14) follows by summing it over [v, w] ∩ Z.

Example 5. Even though the hypergeometric term (3) from Example 1, defined on I = Z, can be written in terms of rising factorials as

$$t(n) = \frac{(n-2)_n}{4^n\,(1)_n},$$

one can show that it is not a proper term on Z. However, it can also be written as

$$t(n) = \begin{cases} 2\,t^*(n), & n < 2, \\ t^*(n), & n \ge 2, \end{cases}$$

where

$$t^*(n) = (2-n)\,\frac{(-1/2)_n}{4\,(1)_n}$$

is a proper term (namely the one discussed in Example 4). So to evaluate $\sum_{n=v}^{w} t(n)$ one can first split the summation range at n = 2, then use Theorem 2 on both subranges.
and q = P ∗ (1). Then
where t∗ (n) = (2 − n)
(−1/2)n 4(1)n
(E − 1) ◦ Q + q =
is a proper term (namely the one discussed in Example 4). So to evaluate w n=v t(n) one can first split the summation range at n = 2, then use Theorem 2 on both subranges.
ρ−1 ρ−k
bk+j (n − j + 1)E k+1 − Q + P ∗ (1)
k=0 j=1
=
ρ ρ−k+1 k=1
j=1
ρ−1 ρ−k+1
−
k=0
5.
WHEN CAN ACCURATE SUMMATION BE USED TO SUM P-RECURSIVE SEQUENCES?
By the Accurate Summation algorithm (hereafter denoted by AS) we mean a specific version of the general Accurate Integration algorithm given in [3] for integration/summation of solutions of Ore equations. This version, which is adapted for sequences that satisfy equations of the form (8) with b(n) = 0, solves the following problem: Let a minimal annihilator L of the form (7) be known for a K-valued sequence t(n). Determine if there exists a sequence u(n) which satisfies (1) and has a minimal annihilator L̃ of order ρ. It is shown in [3] that if such a u exists then it can be expressed as R t, where R is an operator of order ρ − 1 with rational-function coefficients. AS constructs R if it exists. (GA solves this problem when ρ = 1.) In order to analyze the validity of the discrete Newton-Leibniz formula (11) in this case, we need to express explicitly the quotient and the remainder of a linear difference operator when it is divided by the first-order operator E − 1 from the left. The notion of the adjoint difference operator is useful here.

Definition 4. Let

L = \sum_{k=0}^{\rho} b_k(n) E^k

be an operator in K(n)[E]. Its adjoint L* ∈ K(n)[E^{-1}] is defined as

L* = \sum_{k=0}^{\rho} E^{-k} ◦ b_k(n) = \sum_{k=0}^{\rho} b_k(n-k) E^{-k}.

It is straightforward to verify that (L1 ◦ L2)* = L2* ◦ L1*.

Lemma 1. Let P = \sum_{k=0}^{\rho} b_k(n) E^k, R = \sum_{k=0}^{\rho-1} c_k(n) E^k be operators from K(n)[E], and p ∈ K(n) a rational function such that P = (E − 1) ◦ R + p. Then

c_k(n) = \sum_{j=1}^{\rho-k} b_{k+j}(n-j)

and p = P*(1).

Proof: Let

Q = \sum_{k=0}^{\rho-1} \sum_{j=1}^{\rho-k} b_{k+j}(n-j)\, E^k

and q = P*(1). Then

(E − 1) ◦ Q + q
  = \sum_{k=0}^{\rho-1} \sum_{j=1}^{\rho-k} b_{k+j}(n-j+1)\, E^{k+1} − Q + P*(1)
  = \sum_{k=1}^{\rho} \sum_{j=1}^{\rho-k+1} b_{k+j-1}(n-j+1)\, E^k − \sum_{k=0}^{\rho-1} \sum_{j=1}^{\rho-k} b_{k+j}(n-j)\, E^k + \sum_{j=0}^{\rho} b_j(n-j)
  = b_ρ(n) E^ρ + \sum_{k=0}^{\rho-1} \sum_{j=1}^{\rho-k+1} b_{k+j-1}(n-j+1)\, E^k − \sum_{k=0}^{\rho-1} \sum_{j=2}^{\rho-k+1} b_{k+j-1}(n-j+1)\, E^k
  = b_ρ(n) E^ρ + \sum_{k=0}^{\rho-1} b_k(n) E^k = P.

As the quotient and remainder in operator division are unique, it follows that R = Q and p = q.

Note that just to find the remainder p = P*(1), it suffices to take adjoints on both sides of the equation P = (E − 1) ◦ R + p, which results in P* = R* ◦ (E − 1)* + p = R* ◦ (E^{-1} − 1) + p, and apply this to 1.

Remark 2. Let r ∈ K(n) be a rational function, and L a difference operator as in (7). By Lemma 1, the remainder of 1 − rL when divided by E − 1 from the left is equal to (1 − rL)*(1) = (1 − L* ◦ r)(1) = 1 − (L* ◦ r)(1) = 1 − L* r.

Hence an operator R such that 1 − rL = (E − 1) ◦ R exists if and only if L* r = 1. This observation forms the basis of Accurate Summation.

Algorithm AS
Input: L = \sum_{k=0}^{\rho} a_k(n) E^k ∈ K[n, E].
Output: r ∈ K(n) and R ∈ K(n)[E] such that 1 − rL = (E − 1) ◦ R, if they exist.
if there exists r ∈ K(n) such that L* r = 1 then
  for k := 0, 1, ..., ρ − 1 do
    c_k(n) := −\sum_{j=1}^{\rho-k} r(n-j)\, a_{k+j}(n-j);
  R := \sum_{k=0}^{\rho-1} c_k(n) E^k;
  return (r(n), R)
else such r(n) and R do not exist.

We can find a rational-function solution r(n) of L* r = 1 using, e.g., the algorithm from [1] or the algorithm from [2]. A generalization of [3] was given in [5]; however, the approach taken in [3] has the advantage of simplicity, as it only uses the adjoint operator and algorithms for finding rational solutions. This simplifies the investigation of the solutions that are obtained by AS, and enables us to formulate a priori conditions for AS, similar to Theorem 1 (see Theorems 3 and 5 below).
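To make the post-processing in AS concrete, here is a small SymPy sketch that we added (the function name is ours): given the coefficients a_k of L and a rational function r already known to satisfy L* r = 1 (supplied by hand here; in practice it would come from the algorithms of [1] or [2]), it assembles the coefficients of R exactly as in the loop above. It is exercised on the operator that reappears in Example 6 below.

```python
import sympy as sp

n = sp.Symbol('n')

def accurate_summation_R(a, r):
    """Given L = sum_k a[k](n) E^k and r with L* r = 1, build the
    coefficients c_0..c_{rho-1} of R with 1 - r L = (E - 1) o R."""
    rho = len(a) - 1
    # verify L* r = 1, i.e. sum_j r(n - j) a_j(n - j) = 1
    check = sp.simplify(sum((r * a[j]).subs(n, n - j) for j in range(rho + 1)))
    assert check == 1, "r does not satisfy L* r = 1"
    return [sp.simplify(-sum((r * a[k + j]).subs(n, n - j)
                             for j in range(1, rho - k + 1)))
            for k in range(rho)]

# L = (n-3)(n-2)(n+1) E^2 - (n-3)(n^2 - 2n - 1) E - (n-2)^2  (Example 6 below)
a = [-(n - 2)**2, -(n - 3)*(n**2 - 2*n - 1), (n - 3)*(n - 2)*(n + 1)]
r = -1/((n - 2)*(n - 3))
print(accurate_summation_R(a, r))   # -> [1/(n - 3), n], i.e. R = n E + 1/(n - 3)
```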
Assume that AS succeeds with L, returning r and R. It is shown in [3] that

• the sequence u(n) = R t(n) satisfies (1), possibly with finitely many exceptions, for any sequence t such that Lt = 0 (not only for those t whose minimal annihilator is L);

• if L is a minimal annihilator for t, then a minimal annihilator for u = R t is L̃ = 1 − R ◦ (E − 1) (note that L̃ has the same order ρ as L).

We now give two sufficient conditions for this u(n) to satisfy equation (1) and for the discrete Newton-Leibniz formula (11) to be valid:

1. an a posteriori condition, depending on the poles of r(n) and of the coefficients of R (Proposition 3),
2. an a priori condition, depending only on the integer singularities of L (Theorem 3).

In either case, we make the following assumptions:

• L ∈ K[n, E] is an operator of type (7),
• r ∈ K(n) is a rational function which satisfies L* r = 1 as an equation in K(n),
• R ∈ K(n)[E] is an operator of order ρ − 1 which satisfies 1 − rL = (E − 1) ◦ R in K(n)[E],
• v, w are integers such that v ≤ w − ρ,
• I_ρ := [v, w − ρ] ∩ Z,
• t(n) is a K-valued sequence which is defined for all n ∈ [v, w] ∩ Z and satisfies L t(n) = 0 for all n ∈ I_ρ,
• u(n) is a K-valued sequence such that u(n) = R t(n) whenever R t(n) is defined.

Proposition 3. (a posteriori condition for AS) If r(n) has no poles in I_ρ and the coefficients of R have no integer poles in [v, w − ρ + 1], then equation (1) holds for all n ∈ I_ρ, and the discrete Newton-Leibniz formula

\sum_{k=v}^{w} t(k) = u(w-\rho+1) - u(v) + \sum_{k=1}^{\rho} t(w-\rho+k)    (18)

is valid.

Proof: By the assumptions on t and R, u(n) and u(n + 1) are defined for all n ∈ I_ρ. As r(n) has no poles in I_ρ,

u(n+1) − u(n) = (E ◦ R)\,t(n) − R\,t(n) = ((E − 1) ◦ R)\,t(n) = (1 − rL)\,t(n) = t(n) − r(n)\,L\,t(n) = t(n)

for every n ∈ I_ρ. Thus (1) holds for all n ∈ I_ρ, and summing it over I_ρ yields (18).

Lemma 2. Let α, β ∈ Z ∪ {−∞, ∞}. If r(n) has no integer poles in [α, β] then the coefficients of R have
(i) no integer poles in [α + ρ, β + 1], and also
(ii) no integer poles in [α, β − ρ + 1].

Proof: Write R = \sum_{k=0}^{\rho-1} c_k(n) E^k. By Lemma 1,

c_k(n) = −\sum_{j=1}^{\rho-k} r(n-j)\, a_{k+j}(n-j)

for 0 ≤ k ≤ ρ − 1.

(i) By assumption, r(n − j) has no integer poles in [α + j, β + j], hence c_k(n) has no integer poles in \bigcap_{1 \le j \le \rho-k} [α + j, β + j] = [\max_{1 \le j \le \rho-k}(α + j), \min_{1 \le j \le \rho-k}(β + j)] = [α + ρ − k, β + 1], and the coefficients of R have no integer poles in \bigcap_{0 \le k \le \rho-1} [α + ρ − k, β + 1] = [\max_{0 \le k \le \rho-1}(α + ρ − k), β + 1] = [α + ρ, β + 1].

(ii) To prove the second assertion, we need to express the coefficients c_k(n) in a different way. Since 1 − rL = (E − 1) ◦ R, it follows from Remark 2 that L* r = \sum_{j=0}^{\rho} r(n-j)\, a_j(n-j) = 1. Shifting this k times we find that \sum_{j=0}^{\rho} r(n+k-j)\, a_j(n+k-j) = \sum_{j=-k}^{\rho-k} r(n-j)\, a_{k+j}(n-j) = 1. Therefore

c_k(n) = −\sum_{j=1}^{\rho-k} r(n-j)\, a_{k+j}(n-j) = \sum_{j=-k}^{0} r(n-j)\, a_{k+j}(n-j) − 1 = \sum_{j=0}^{k} r(n+j)\, a_{k-j}(n+j) − 1

for 0 ≤ k ≤ ρ − 1. By assumption, r(n + j) has no integer poles in [α − j, β − j], hence c_k(n) has no integer poles in \bigcap_{0 \le j \le k} [α − j, β − j] = [\max_{0 \le j \le k}(α − j), \min_{0 \le j \le k}(β − j)] = [α, β − k], and the coefficients of R have no integer poles in \bigcap_{0 \le k \le \rho-1} [α, β − k] = [α, \min_{0 \le k \le \rho-1}(β − k)] = [α, β − ρ + 1].

Theorem 3. (a priori condition for AS) If [v, w − ρ] ∩ [m_L^t, M_L^l − ρ] = ∅, then equation (1) holds for all n ∈ I_ρ, and the discrete Newton-Leibniz formula (18) is valid.

Proof: Rewrite L* r = 1 in the equivalent form L′ r = 1 where L′ = E^ρ ◦ L* = \sum_{k=0}^{\rho} a_{\rho-k}(n+k) E^k ∈ K[n, E]. By Proposition 1, r(n) has no integer poles outside [m_{L′}^l, M_{L′}^t]. But S_{L′}^l = S_L^t and S_{L′}^t = S_L^l − ρ, therefore m_{L′}^l = m_L^t and M_{L′}^t = M_L^l − ρ, hence r(n) has no integer poles outside [m_L^t, M_L^l − ρ]. If m_L^t ≤ M_L^l − ρ, then both intervals [v, w − ρ] and [m_L^t, M_L^l − ρ] are nonempty, hence either w − ρ < m_L^t or M_L^l − ρ < v. In the former case, r(n) has no integer poles in (−∞, w − ρ], so by Lemma 2(i), the coefficients of R have no integer poles in (−∞, w − ρ + 1]. In the latter case, r(n) has no integer poles in [v, ∞), so by Lemma 2(ii), the coefficients of R have no integer poles in [v, ∞). In either case, the result follows from Proposition 3. If m_L^t > M_L^l − ρ then r(n) has no integer poles at all. By Lemma 2, the coefficients of R also have no integer poles, and the result again follows from Proposition 3.

A remark similar to the one stated immediately after the proof of Theorem 1, about the use of the a priori and a posteriori conditions in practice, applies here as well.
Example 6. Let L = (n−3)(n−2)(n+1)E² − (n−3)(n²−2n−1)E − (n−2)². Define t(n) by the initial values t(2) = a, t(3) = 0, t(4) = b, t(5) = c, where a, b, c are arbitrary fixed complex numbers, and by the recurrence L t(n) = 0 when n ≤ 1 or n ≥ 6. Then it can be checked that L t(n) = 0 for all n ∈ Z. Algorithm AS succeeds with input L and returns r(n) = −1/((n−2)(n−3)), R = nE + 1/(n−3). In this case S_L^t = {2}, m_L^t = 2, S_L^l = {1, 4, 5}, M_L^l − ρ = 5 − 2 = 3. So both the a posteriori and the a priori conditions reduce to 3 ∉ [v, w − 1]. This is best possible, as the sequence u(n) = R t(n) is undefined at n = 3, and equation (1) does not hold for n = 2, 3. It can be verified that, except in the special case b + 4c = 0, there is no way to define u(3) so that (1) would hold for all n ∈ Z.
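A quick numeric sanity check of Example 6 that we added, taking a = b = c = 1: generate t(n) forward from the initial values and confirm that (1) holds from n = 4 on, while u(3) is undefined (r has a pole at n = 3).

```python
from fractions import Fraction as F

# Operator of Example 6: L = a2(n) E^2 + a1(n) E + a0(n)
a0 = lambda n: -(n - 2)**2
a1 = lambda n: -(n - 3)*(n**2 - 2*n - 1)
a2 = lambda n: (n - 3)*(n - 2)*(n + 1)

# initial values t(2) = a, t(3) = 0, t(4) = b, t(5) = c, here a = b = c = 1
t = {2: F(1), 3: F(0), 4: F(1), 5: F(1)}
for n in range(4, 30):                       # extend via L t(n) = 0; a2(n) != 0 for n >= 4
    t[n + 2] = -(a1(n)*t[n + 1] + a0(n)*t[n]) / a2(n)

# u = R t with R = n E + 1/(n - 3); undefined at n = 3
u = lambda n: n*t[n + 1] + t[n] / (n - 3)

assert all(u(n + 1) - u(n) == t(n) for n in range(4, 28))   # (1) holds for n >= 4
```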
Remark 3. When ρ = 1 and v ≤ w − 1, Theorem 3 implies Theorem 1 in the following way. If r(n) satisfies (10) then it is easy to verify that r̄(n) := −r(n + 1)/a1(n) satisfies L* r̄ = 1, and 1 − r̄ L = (E − 1) ◦ R where R = r(n) is an operator of order 0. Thus u(n) = r(n)t(n) of Theorem 1 agrees with u(n) = R t(n) of Theorem 3. By the assumption of Theorem 1, [v, w] ∩ [m_L^t + 1, M_L^l − 1] = ∅. If m_L^t + 1 ≤ M_L^l − 1 then either w ≤ m_L^t or M_L^l − 1 ≤ v − 1, hence [v, w − 1] ∩ [m_L^t, M_L^l − 1] = ∅, and the conclusion follows by Theorem 3. If m_L^t + 1 > M_L^l − 1 then m_L^t ≥ M_L^l − 1. Again we distinguish two cases: If m_L^t > M_L^l − 1, the conclusion follows by Theorem 3. If m_L^t = M_L^l − 1, then a0(n) and a1(n) have a common zero, since a0(m_L^t) = a1(M_L^l − 1) = 0. But this contradicts the assumption of relative primality of the coefficients of L. Note that when ρ ≥ 2, the polynomials a0(n) and aρ(n) need not be relatively prime.

6. EXPLOITING POLYNOMIAL FACTORS

In this section we show how polynomial factors of hypergeometric terms (even non-proper ones, such as the one in Example 3) and of P-recursive sequences can be used to strengthen the statements of Theorems 1 and 3 (i.e., to weaken the a priori conditions for validity of the discrete Newton-Leibniz formula).

Theorem 4. Assume that
• L = a1(n)E + a0(n) and L̄ = ā1(n)E + ā0(n) are operators of type (7) with ρ = 1,
• t(n), t̄(n) are K-valued sequences with infinitely many nonzero values, defined on an infinite interval of integers I and satisfying L t(n) = L̄ t̄(n) = 0 on I⁻ (see Section 2 for notation),
• p ∈ K[n] is a polynomial such that t(n) = p(n) t̄(n) for all n ∈ I,
• r ∈ K(n) is a rational function which satisfies (10) as an equation in K(n),
• r̄ = p r ∈ K(n),
• ū(n) is a K-valued sequence such that ū(n) = r̄(n) t̄(n) whenever both r̄(n) and t̄(n) are defined,
• v, w are integers such that v ≤ w and [v, w] ∩ Z ⊆ I.

If I ∩ [m_{L̄}^t + 1, M_{L̄}^l − 1] = ∅ then equation (13) holds for all n ∈ I⁻, and the discrete Newton-Leibniz formula

\sum_{k=v}^{w} t(k) = ū(w) − ū(v) + t(w)    (19)

is valid.

Proof: By assumption, we have for all n ∈ I⁻,

ā1(n)\,t̄(n+1) + ā0(n)\,t̄(n) = 0,    (20)
a1(n)\,p(n+1)\,t̄(n+1) + a0(n)\,p(n)\,t̄(n) = 0.    (21)

Multiplying (20) by a1(n)p(n + 1), (21) by ā1(n), and subtracting, we find ā0(n)\,t̄(n)\,a1(n)\,p(n+1) = a0(n)\,p(n)\,t̄(n)\,ā1(n). Since t̄(n) has infinitely many nonzero values on I⁻, this implies that

a0(n)\,p(n)\,ā1(n) = ā0(n)\,a1(n)\,p(n+1)    (22)

holds infinitely often, hence also as an equation in K[n]. Multiplying (10) by ā1(n)p(n), (22) by r(n + 1), subtracting, and cancelling a1(n) in K(n), we obtain

ā0(n)\,r̄(n+1) + ā1(n)\,r̄(n) = −ā1(n)\,p(n)    (23)

as an equation in K(n). It follows from Proposition 1 that r̄ ∈ K(n) has no integer poles outside the interval [m_{L̄}^t + 1, M_{L̄}^l − 1]. Therefore r̄(n) and ū(n) are defined on I, r̄(n + 1) and ū(n + 1) are defined on I⁻, and (23) is valid for all n ∈ I⁻.

Pick an n ∈ I⁻. Multiplying (20) by r̄(n + 1), (23) by t̄(n), and subtracting, we obtain

ā1(n)\,r̄(n+1)\,t̄(n+1) − ā1(n)\,r̄(n)\,t̄(n) = ā1(n)\,p(n)\,t̄(n).

If ā1(n) ≠ 0 this reduces to (13). If ā1(n) = 0 then, by assumption, ā0(n) ≠ 0, so (20) implies t̄(n) = t(n) = ū(n) = 0 and (23) implies r̄(n + 1) = ū(n + 1) = 0, hence (13) holds in this case as well. So (13) holds for all n ∈ I⁻, and the second assertion follows by summing (13) on n from v to w − 1.

Example 7. Let t(n) be the hypergeometric term from Example 3 which satisfies L t(n) = 0 for all n ∈ Z, where L = (n − 5)(n − 3)E − (n − 4)(n − 1)n. Define p(n) = (n − 5)(n − 3)(n − 2) and

t̄(n) = \begin{cases} (n-1)!, & n \ge 2, \\ \dfrac{(-1)^n}{(-n)!}, & n \le 1. \end{cases}

Then L̄ t̄(n) = 0 for all n ≠ 1, where L̄ = E − n. So, in the notation of Theorem 4, the maximal possible interval I is either (−∞, 1] ∩ Z or [2, ∞) ∩ Z. As L̄ has no leading singularities, M_{L̄}^l = −∞ and [m_{L̄}^t + 1, M_{L̄}^l − 1] = ∅. With r(n) = (n − 6)/((n − 2)(n − 3)), r̄(n) = (n − 5)(n − 6) and ū(n) = r̄(n)t̄(n), all the assumptions of Theorem 4 are satisfied, and it follows that formula (19) is valid provided that w ≤ 1 or v ≥ 2.
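A one-line numeric check of Example 7 that we added, on the branch v ≥ 2, where (19) is just the telescoping of (13):

```python
from math import factorial

tbar = lambda n: factorial(n - 1)                   # the n >= 2 branch of tbar
t    = lambda n: (n - 5)*(n - 3)*(n - 2)*tbar(n)    # t = p * tbar
ubar = lambda n: (n - 5)*(n - 6)*tbar(n)            # ubar = rbar * tbar

v, w = 2, 12
assert sum(t(k) for k in range(v, w + 1)) == ubar(w) - ubar(v) + t(w)   # (19)
```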
Now we consider the general case with ρ ≥ 1.

Proposition 4. Let (r(n), R) be the result of applying AS to an operator L ∈ K[n, E] of type (7), and let r = s/q where s, q ∈ K[n]. Then there exist p ∈ K[n] and L̄ ∈ K[n, E] such that

L ◦ p = q L̄,    (24)
(E − 1) ◦ R ◦ p = p − s L̄,

and R ◦ p ∈ K[n, E].
Proof: Let d ∈ K[n] be a polynomial and B ∈ K[n, E] an operator such that

E^ρ ◦ L* ◦ \frac{1}{q} = \frac{1}{d}\, B.

Then

\frac{1}{q}\, L ◦ E^{-ρ} = B* ◦ \frac{1}{d}.

Therefore L ◦ E^{−ρ} ◦ d = q B*. Multiplying this by E^ρ on the right gives L ◦ E^{−ρ} ◦ d ◦ E^ρ = L ◦ d(n − ρ) = q B* ◦ E^ρ. Take p(n) = d(n − ρ) and L̄ = B* ◦ E^ρ. Then (24) is satisfied and

p − s L̄ = p − rq L̄ = p − rL ◦ p = (1 − rL) ◦ p = (E − 1) ◦ R ◦ p.

Hence the operator R ◦ p is the left quotient of p − s L̄ by E − 1 and, consequently, has polynomial coefficients.

Theorem 5. Let
• L, L̄, R, r, p, q be as in Proposition 4,
• v, w ∈ Z be such that v ≤ w − ρ,
• I_ρ = [v, w − ρ] ∩ Z,
• t̄(n) be a K-valued sequence defined for all n ∈ [v, w] ∩ Z such that L̄ t̄(n) = 0 for all n ∈ I_ρ.

Then the K-valued sequence t(n) = p(n) t̄(n) satisfies L t(n) = 0 for all n ∈ I_ρ, and the discrete Newton-Leibniz formula (18) can be applied to t(n) with u(n) = (R ◦ p) t̄(n).

Proof: By (24),

L t(n) = (L ◦ p)\, t̄(n) = q L̄\, t̄(n) = 0

for all n ∈ I_ρ. Also, u(n) = (R ◦ p) t̄(n) and u(n + 1) = (E ◦ R ◦ p) t̄(n) are defined for all n ∈ I_ρ. Therefore, by Proposition 4,

u(n+1) − u(n) = ((E − 1) ◦ R ◦ p)\, t̄(n) = (p − s L̄)\, t̄(n) = p(n)\, t̄(n) − s(n)\, L̄\, t̄(n) = t(n)

for all n ∈ I_ρ, and (18) follows by summing this over I_ρ.

Example 8. Consider again the operator L = 2(n + 1)(n − 2)E − (2n − 1)(n − 1) from Example 2. Here m_L^t = 1 and M_L^l = 3, so, following Theorem 3, we can apply formula (18) if [v, w − 1] ∩ [1, 2] = ∅. Using the algorithm from [1] or the algorithm from [2] we compute the solution r(n) = −(n + 2)/((n − 1)(n − 2)) of L* r = 1, and set q(n) = (n − 1)(n − 2). Then

E ◦ L* ◦ \frac{1}{q(n)} = \frac{1}{n-1}\,\bigl(-(2n+1)E + 2(n+1)\bigr),

therefore we have

d(n) = n − 1,
B = −(2n + 1)E + 2(n + 1),
L̄ = B* ◦ E = 2(n + 1)E − (2n − 1),    (25)
p(n) = n − 2,

and R = 2n(n + 1)/(n − 2), u(n) = (R ◦ p) t̄(n) = 2n(n + 1) t̄(n). Let t̄(n) be a sequence defined for all n ∈ [v, w] ∩ Z and satisfying L̄ t̄(n) = 0 for all n ∈ [v, w − 1] ∩ Z, where L̄ is given in (25). Then by Theorem 5, the sequence t(n) = (n − 2) t̄(n) satisfies L t(n) = 0 for all n ∈ [v, w − 1] ∩ Z, and the formula

\sum_{n=v}^{w-1} t(n) = 2w(w+1)\, t̄(w) − 2v(v+1)\, t̄(v)    (26)

is valid whenever v ≤ w − 1. The general solution of L̄ y = 0 is t̄(n) = c\,\frac{(-1/2)_n}{(1)_n}, where c is an arbitrary constant. Thus, by taking c = −1/4, we see that (26) can be used to sum the term t(n) = (n − 2) t̄(n) = (2 − n)\frac{(-1/2)_n}{4\,(1)_n} considered in Example 4.
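A numeric check of (26) that we added, taking c = 1 and exact rational arithmetic:

```python
from fractions import Fraction as F

def tbar(n):                 # tbar(n) = (-1/2)_n / (1)_n, solving Lbar y = 0
    v = F(1)
    for j in range(n):
        v *= F(2*j - 1, 2*(j + 1))      # ratio (j - 1/2)/(j + 1)
    return v

t = lambda n: (n - 2)*tbar(n)
u = lambda n: 2*n*(n + 1)*tbar(n)

v, w = 0, 10
assert sum(t(n) for n in range(v, w)) == u(w) - u(v)    # formula (26)
```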
Acknowledgements. The authors wish to express their thanks to Jürgen Gerhard for providing (a close variant of) the hypergeometric term in Example 1, and to anonymous referees for their helpful remarks.

7. REFERENCES
[1] S. A. Abramov, Rational solutions of linear difference and differential equations with polynomial coefficients, USSR Comput. Math. Math. Phys. 29 (1989), 7–12. Transl. from Zh. Vychisl. Mat. Mat. Fiz. 29 (1989), 1611–1620.
[2] S. A. Abramov, Rational solutions of linear difference and q-difference equations with polynomial coefficients, Programming and Comput. Software 21 (1995), 273–278. Transl. from Programmirovanie 21 (1995), 3–11.
[3] S. A. Abramov, M. van Hoeij, Integration of solutions of linear functional equations, Integral Transforms and Special Functions 8 (1999), 3–12.
[4] S. A. Abramov, M. Petkovšek, Solution spaces of H-systems and the Ore–Sato theorem, Proceedings FPSAC '05, to appear.
[5] F. Chyzak, An extension of Zeilberger's fast algorithm to general holonomic functions, Discrete Math. 217 (2000), 115–134.
[6] R. W. Gosper, Jr., Decision procedure for indefinite hypergeometric summation, Proc. Natl. Acad. Sci. USA 75 (1978), 40–42.
[7] I. Nemes, M. Petkovšek, H. S. Wilf, D. Zeilberger, How to do Monthly problems with your computer, Amer. Math. Monthly 104 (1997), 505–519.
[8] M. Schorn, Contributions to Symbolic Summation, Diploma Thesis, RISC, J. Kepler University, December 1995.
Signature of Symmetric Rational Matrices and the Unitary Dual of Lie Groups
[Extended Abstract]∗

Jeffrey Adams
Department of Mathematics, University of Maryland, College Park, MD 20742, USA
[email protected]

B. David Saunders†
Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA
[email protected]

Zhendong Wan†
Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA
[email protected]

∗Supported by NSF grant DMS 0200851.
†Supported by NSF grants CCR 0098284 and 0112807.

ABSTRACT
A key step in the computation of the unitary dual of a Lie group is the determination of whether certain rational symmetric matrices are positive semi-definite. The size of some of the computations dictates that high performance integer matrix computations be used. We explore the feasibility of this approach by developing three algorithms for integer symmetric matrix signature and studying their performance both asymptotically and experimentally on a particular matrix family constructed from the exceptional Weyl group E8. We conclude that the computation is doable, with a parallel implementation needed for the largest representations.

Categories and Subject Descriptors
G.4 [Mathematical Software]: Algorithm Design and Analysis; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems—Computations on discrete structures

General Terms
Algorithms, Performance

Keywords
matrix signature, symmetric matrix, Lie group

1. INTRODUCTION
We propose, analyze, and test several algorithms for computing the signature and for verifying or disproving specific definiteness properties of symmetric integer matrices. The latter is sometimes easier than the signature. Also, there is often a considerable difference in the cost of verifying (certifying) a property and of certifying its negation.

This is primarily an experimental paper whose purpose is to assess the feasibility of exact integer linear algebra methods for signature and definiteness determinations that are useful in the study of Lie group representations. The combination of measurements and analysis of asymptotic growth rates of time and memory use that we provide is for the purpose of predicting the cost of the larger computations of interest, so as to determine the most promising algorithms and the hardware resource needs.

The algorithmic asymptotic complexities reported here are largely straightforward applications of known results. We have an interesting observation which allows efficient binary tree structured Chinese remaindering (to construct numbers modulo a collection of distinct primes) and an early termination strategy to be used at the same time. Practical and asymptotic speedup is obtained when using early termination.

We have measured the time and memory costs of three algorithms on some matrices arising in group representations, studied their relative merits, and developed formulas to predict the costs for large matrices. We remark that the largest computation we have succeeded in doing with a standard package such as Mathematica is for a matrix of dimension 400, whereas the largest needed is for dimension 7168. Our measurements indicate it will be feasible.

In Section 2, the motivation for this work in the study of Lie group representations is presented. This application creates large problems straining our limits of time and memory. In Section 3 we present the proposed algorithms and discuss their mathematical basis and complexity. Also included is some discussion of unimplemented alternatives and of the space issues for large instances of the problems. In Section 4, we apply these algorithms to the matrices from Lie group representations. Finally, in Section 5, experimental results are reported and, in Section 6, conclusions are drawn on the state of these problems.
2. LIE GROUP REPRESENTATIONS AND UNITARY DUALITY

We assume some familiarity with root systems and Weyl groups, for example see [11]. We begin with a formal construction. Let R be a root system with Weyl group W. Thus R is a finite subset of V = Rⁿ satisfying certain properties; in particular, for each α ∈ R the reflection s_α ∈ Hom(V, V) takes R to itself. By definition W acts on V. We may choose simple roots S = {α1, ..., αn} so that W is generated by {si = s_{αi} | i = 1, ..., n}, with relations order(si sj) = m_{i,j} for certain m_{i,j} ∈ {2, 3, 4, 6}. For ν ∈ V define ⟨ν, α∨⟩ = 2(ν, α)/(α, α). Let V⁺ be the set of ν ∈ V which are dominant (i.e., ⟨ν, α∨_i⟩ ≥ 0 for 1 ≤ i ≤ n). Fix a finite dimensional representation (ρ, Vρ) of W. Thus ρ : W → GL(Vρ) is a group homomorphism. Fix an invariant Hermitian form (,)ρ on Vρ (i.e., satisfying (v1, v2)ρ = (ρ(w)v1, ρ(w)v2)ρ for all v1, v2 ∈ Vρ, w ∈ W). Choose a matrix Jρ so that (v1, v2)ρ = v1 Jρ v2ᵗ. For α ∈ R and ν ∈ V⁺ we define

Aρ(α, ν) = \frac{1 + ⟨ν, α∨⟩\,ρ(s_α)}{1 + ⟨ν, α∨⟩} ∈ Hom(Vρ, Vρ).    (1)

Let w be the longest element of W and choose a reduced expression w = s_{iN} s_{iN−1} · · · s_{i1} (1 ≤ j ≤ N, 1 ≤ ij ≤ n). Set w0 = 1 and define wj = s_{αij} s_{αij−1} · · · s_{αi1} (1 ≤ j ≤ N). For ν ∈ V define

Aρ(ν) = \prod_{j=1}^{N} Aρ(α_{ij}, w_{j−1}\,ν) ∈ Hom(Vρ, Vρ)    (2)

and

Jρ(ν) = Aρ(ν)\,Jρ.    (3)

Lemma 1.
1. Aρ(ν) is independent of the choice of reduced expression for w,
2. Aρ(0) = Id and Aρ(δ) = 0 (unless ρ is trivial), where δ is one-half the sum of the positive roots,
3. lim_{ν→∞} Aρ(ν) = ρ(w),
4. Aρ(ν) is invertible unless ⟨ν, α∨_i⟩ = 1 for some i,
5. Assume wν = −ν. Then (Aρ(ν)v1, v2)ρ = (v1, Aρ(ν)v2)ρ for all v1, v2 ∈ V, and Jρ(ν) is symmetric.

For proofs of this and other statements in this section see [2] and http://atlas.math.umd.edu/papers. Let H be the affine Hecke algebra of W [11, Chapter 7]. Associated to ν is an irreducible spherical representation τν.

Lemma 2. The representation τν of H is unitary if and only if, for every irreducible representation ρ of W, the operator Jρ(ν) is positive semi–definite.

It is also of interest to determine the signature of Jρ(ν). The denominator in (1) is a convenient normalization, which makes Lemma 1(2) and (3) hold. Since ν ∈ V⁺ it is positive and does not affect whether Jρ(ν) is positive semi–definite. The question of whether Jρ(ν) is positive semi–definite only depends on whether 1 − ⟨ν, α∨⟩ is positive, negative, or 0, for each α ∈ R. Therefore the set of dominant parameters ν is decomposed into a finite number of facets, each one determined by an element of {+, 0, −}ⁿ (not every such n-tuple arises). The classification of the unitary representations τν of H is therefore reduced to a finite calculation.

Each facet contains an element ν with rational coordinates. It is well known that we may choose a basis of Vρ so that ρ(w) is a rational, or even an integral, matrix for all w ∈ W. For n ≤ 8 the former has been carried out explicitly in [16], and the latter in [1] except for E8. Then the matrices Aρ(ν) and Jρ(ν) will have rational entries. We may clear denominators when testing for positive semi–definiteness.

Now let G be a split semi-simple group over R or a p–adic field F, with root system R. Associated to ν is an irreducible spherical representation πν of G.

Lemma 3. If F is p–adic, πν is unitary if and only if τν is unitary.

Conjecture 4. If F = R then πν is unitary if and only if τν is unitary.

Thus computing the unitary representations τν of Lemma 2 tells us about a subset of the unitary dual of Lie groups (the "spherical" unitary dual). We may as well assume that G is simple, or equivalently that R is irreducible. The irreducible root systems are of type An, Bn, Cn, Dn (classical) or E6, E7, E8, F4 or G2 (exceptional). The most interesting case is that of E8. The Weyl group has order 696,729,600; it has 112 representations, the largest of which has dimension 7,168. There are 1,070,716 facets. In the classical case the classification of the unitary representations τν, and Conjecture 4, are known [2]. Dan Barbasch and Dan Ciubotaru have also computed the unitary τν in the exceptional cases. Thus the problem which the calculation of Jρ(ν) solves is already known. However, this calculation is the prototype of a much more general calculation which will be needed to calculate the unitary dual of Hecke algebras and of real and p–adic Lie groups. The Atlas of Lie Groups and Representations is a project to compute the unitary dual by theoretical and computational means, see http://atlas.math.umd.edu. Information about what computations are feasible is of great importance in the continuation of this project.

3. SIGNATURE ALGORITHMS

For the rest of this paper we will consider the question of signatures and sign patterns of symmetric matrices in general, while constantly keeping in mind the operators generated from Lie group matrix representations. The signature of a real symmetric matrix A is generally defined as σ = p − n, the number by which positive eigenvalues outnumber negative ones. Define the signature triple to be σ*(A) = (p, z, n), where p is the number of positive eigenvalues, z is the multiplicity of zero as an eigenvalue, and n is the number of negative eigenvalues. A real symmetric matrix is positive definite if z and n are zero, and positive semi-definite if n is zero. For a polynomial with only real roots, Descartes' rule of signs can be used to determine the number of positive roots and the number of negative roots. Thus the signature triple of a matrix can be determined from the signs in the vector of coefficients of the characteristic polynomial. It is computationally useful that, when only the zero eigenvalue has multiplicity greater than 1, the characteristic polynomial is just a shift of the minimal polynomial.
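As a quick illustration of this sign-reading (the pair-counting convention is made precise in the next paragraph), here is a small sketch we added — the function name is ours, not the paper's code:

```python
def sigma_star(v):
    """(p, z, n) read off a vector v: p = sign-alternating successive nonzero
    pairs, n = sign-constant successive nonzero pairs, z = trailing zeros."""
    nz = [x for x in v if x != 0]          # successive nonzero entries
    z = 0
    while z < len(v) and v[len(v) - 1 - z] == 0:
        z += 1
    p = sum(1 for a, b in zip(nz, nz[1:]) if a * b < 0)
    n = sum(1 for a, b in zip(nz, nz[1:]) if a * b > 0)
    return (p, z, n)

# charpoly of diag(2, -3, 0) is x^3 + x^2 - 6x, coefficient vector (1, 1, -6, 0):
print(sigma_star([1, 1, -6, 0]))   # -> (1, 1, 1): one positive, one zero, one negative
```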
If A has minimal polynomial m(x) of degree d, we define the shifted minimal polynomial to be x^{n−d} m(x). Less well known is that the signature can also be determined in certain cases, in a similar way, from the vector consisting of the leading principal minors of A. In view of these connections, for a vector the signature triple is σ*(v) = (p, z, n), where p is the number of alternating successive nonzero pairs, n is the number of constant successive nonzero pairs, and z is the number of zeros at the end of the vector (which corresponds to the multiplicity of zero as an eigenvalue in our case). To be precise, a pair of entries (vi, vj) of (v0, v1, ..., vn) is successive if i < j and the entries between them are zero. A successive pair of nonzero entries is alternating (constant) if their signs are opposite (the same). The vector in question will be either the coefficient vector of a polynomial or w = (w0 = 1, w1, ..., wn), where wi is (−1)^i mi, with mi being the i × i leading principal minor of the given matrix. Following [12], say a matrix of rank r has generic rank profile if the leading principal minors of dimensions 1 through r are nonzero.

Theorem 1 (Signature theorem). Let A be a real symmetric n × n matrix. The following hold:
1. Signature is invariant under congruence, that is, if Q is nonsingular then σ*(A) = σ*(Q A Qᵀ).
2. σ*(A) = σ*(charpoly(A)). Also, if the nonzero eigenvalues of A are distinct, σ*(A) = σ*(shiftminpoly(A)).
3. If A is in generic rank profile with rank r, σ*(A) = σ*(1, −m1, ..., (−1)^r mr, 0, ..., 0), where mi is the ith leading principal minor of A, and the last (n − r) entries are zero.
4. A matrix in generic rank profile has a unique A = LDLᵀ decomposition with unit lower triangular L and diagonal D = diag(d1, ..., dn). If A has generic rank profile and rank r, then (ending with (n − r) zeroes) σ*(A) = σ*(1, −d1, d1 d2, ..., (−1)^r ∏_{1≤i≤r} di, 0, ..., 0).

Proof. A good source for these fundamental facts is [9]. In particular the third statement is a theorem of Jacobi [9, Chapter X, §3, Theorem 2]. The fourth item follows since di = mi/mi−1. See [10, Chapter 4] for a good discussion of the LDLᵀ decomposition. ✷

The generic rank profile condition assures that the mi consist of nonzero entries followed by zeros, with no intermingling of zero and nonzero values, and the same applies to the diagonal D of the LDLᵀ decomposition. Interestingly, again see [9], the signature can be recovered even when there are some scattered zeroes among the nonzero leading minors, hence something less than generic rank profile suffices. We will not pursue this point further.

Call a vector v a σ*-revealing vector for A if σ*(A) = σ*(v). The algorithms we propose all work by computing images of a signature revealing vector mod a series of primes and constructing the integer vector via the Chinese Remainder Algorithm. The vector will be the coefficients of the characteristic polynomial, of the shifted minimal polynomial, or the vector of leading principal minors. In most of these cases there can be bad primes. For instance, there are primes p for which the minimal polynomial of the image mod p of A is not the image mod p of the integer minimal polynomial of A. This leads to a shifted minpoly mod p that is not a true image of the integer shifted minpoly. Similarly, a prime may divide many leading principal minors of A (and the rank may be lower mod p than the integer rank). Note that in both of these cases the segment of non-zero values computed mod p is shorter than it should be. In the interest of presenting the Chinese remaindering issues in a generic way, we refer to a likely image function which, given A and p, returns a vector which is likely to be the image mod p of the desired integer vector but may not be, and if not, will be shorter. We call a prime p good if likely_image(A, p) returns the correct image; otherwise p is bad. The key feature of this generic Chinese remainder algorithm for such vectors is its early termination technique.

Algorithm: GenCRA [Chinese Remainder Algorithm with early termination]
Input: A, a symmetric integer matrix.
  vp = likely_image(A, p), a function as described above.
  P, a set of primes.
  Ω, the random sample size for the certificate.
Output: The revealed vector v with integer coefficients.
1. Set the list L := ∅. This will be a list of pairs (good prime, imaged answer mod that prime). Set l := 0.
2. Choose a random vector x of length n with entries independently and uniformly chosen from [0, Ω − 1].
3. Uniformly and randomly remove a prime p from P.
4. Call vp = likely_image(A, p). If the length of vp is less than l, [reject] go to step 3. If the length of vp is greater than l, [restart] set l to the length of vp and reset L := {(p, vp)}. Otherwise, [continue] append the pair (p, vp) to L.
5. Use the Chinese remainder algorithm to construct the certificate c(i), where i is the size of L, such that c(i) ≡ x · vq (mod q) for each pair (q, vq) in L. This construction is done incrementally, combining the previous result c(i−1) with the current residue and modulus. If c(i) ≠ c(i−1), go to step 3.
6. Otherwise the termination condition, c(i) = c(i−1), is met. Return the vector v, constructed from the pairs in L by the Chinese remainder algorithm, such that v ≡ vq (mod q) for every pair (q, vq) in L. This remaindering is done using a divide-and-conquer method.

Notes:
1. In order to capture negative numbers, we normalize the final number a to lie in [−(m − 1)/2, (m − 1)/2], where m = ∏_{1≤i≤n} pi.
2. The early termination technique, which may be used to reduce the practical run time, has been studied before — see, e.g., [8, 14, 7]. Here, we use a different and more efficient termination technique. At each step, only one integer, called a "certificate", is constructed at each prime, instead of the entire vector answer. This method has almost the same error probability as when the whole answer is constructed at each prime. It allows the more efficient divide-and-conquer remaindering to be done for the n values in the answer while using the incremental remaindering only for the certificate. This technique can be easily adapted to other cases, such as solving non-singular integer systems over the rationals.
3. The pre-selected primes idea in [14] may be used here also. It works with a preselected prime stream so that one can pre-compute the constants which are independent of the actual answer, for instance (p1 · · · pi)^{−1} mod pi+1 (the precomputed constants help if the Newton interpolation algorithm is applied to construct the final answer). Such moduli are not random; an additional post-check of the result at one additional random modulus will guarantee the correctness of the final answer with very high probability. Please see [14] for more details.
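A compressed Python sketch of this early-termination scheme may help fix ideas. It is our own simplification (the final vector lift is done incrementally here, rather than by the divide-and-conquer method of step 6), and likely_image is a hypothetical callable supplied by the caller:

```python
from random import randrange

def crt_pair(r1, m1, r2, m2):
    # x = r1 (mod m1) and x = r2 (mod m2), with gcd(m1, m2) = 1
    return (r1 + m1 * (((r2 - r1) * pow(m1, -1, m2)) % m2)) % (m1 * m2)

def sym(a, m):
    # symmetric representative in [-(m - 1)/2, (m - 1)/2]
    return a - m if 2 * a > m else a

def gen_cra(likely_image, primes, omega, n):
    """Early-terminated Chinese remaindering with a scalar certificate."""
    x = [randrange(omega) for _ in range(n)]       # random certificate vector
    pairs, cert, mod = [], 0, 1
    for p in primes:
        vp = likely_image(p)
        if pairs and len(vp) < len(pairs[0][1]):
            continue                               # reject: p looks bad
        if pairs and len(vp) > len(pairs[0][1]):
            pairs, cert, mod = [], 0, 1            # restart: earlier primes were bad
        c = sum(a * b for a, b in zip(x, vp)) % p  # certificate residue mod p
        new = crt_pair(cert, mod, c, p) if pairs else c
        if pairs and new == cert:
            break                                  # early termination: certificate stable
        pairs.append((p, vp))
        cert, mod = new, mod * p
    v = []                                         # lift the full vector coordinatewise
    for i in range(len(pairs[0][1])):
        r, m = 0, 1
        for q, vq in pairs:
            r, m = crt_pair(r, m, vq[i] % q, q), m * q
        v.append(sym(r, m))
    return v
```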
Theorem 2. The algorithm above computes the revealed vector with error probability at most

\frac{2}{\Omega} + \frac{\log_\beta^2(2l\Omega B)}{M − \log_\beta(2l\Omega B)},

where B is the infinity norm of the correct answer, M is the number of good primes in P, and β is the minimal prime in P.

Note: For a typical problem, such as the characteristic polynomial or the minimal polynomial, it is easy to choose a reasonable set of primes and random sample size such that the error probability is tiny. Runs are independent: if you repeat the algorithm, you square the error probability.

Proof: Let the vector α denote the correct answer and c denote x · α. We assume the early termination condition is met when c(n) is equal to c(n−1), for some number n. If both |c| ≥ ‖α‖∞ and c = c(n) are true, then the algorithm returns the correct answer. This is true since the modulus, which is the product of the primes in L, is at least 2‖α‖∞ under these hypotheses. If x is a random vector with entries independently and uniformly chosen from the integer set [0, Ω − 1], then Prob(|x · α| < ‖α‖∞) ≤ 2/Ω. This is true since there is at least one entry of α whose absolute value is equal to ‖α‖∞. Without loss of generality, we assume |α0| = ‖α‖∞. Then for any x1, ..., xl−1, there are at most two integers x0 such that |x · α| < ‖α‖∞. The probability analysis of c = c(n), on the condition that the early termination condition c(n) = c(n−1) is met, is straightforward — see, e.g., [14, Theorem 1] for details. So the total error probability is at most 2/Ω + log_β²(2lΩB)/(M − log_β(2lΩB)). ✷

The following theorem gives additional flexibility in using the strategies for signature computation made available by GenCRA.

Theorem 3. Let A be an n × n real symmetric matrix and let S be a set of nonzero integers of sufficient size that ε = n² log(n)²/|S| is as small as desired.
1. Let D be a diagonal matrix whose n diagonal entries are chosen uniformly at random from S, and let B = D A Dᵀ. Then Prob(shiftminpoly(B) = charpoly(B)) ≥ 1 − ε.
2. Let Q be a butterfly matrix whose n log(n) defining entries are chosen uniformly at random from S, and let B = Q A Qᵀ. Then Prob(B is in generic rank profile) ≥ 1 − ε.

Proof: These preconditionings are discussed in detail in [4]. ✷

For the generic rank profile condition, the butterfly is chosen as preconditioner because the specified matrix Q A Qᵀ can be computed in O˜(n²) time. For our purposes here we can afford preconditioning complexity up to O˜(n³) time. A general random matrix could be used for the preconditioner, or Toeplitz [13] or sparse preconditioners [4, Section 6]. It is of interest to keep the size of the entries of the resulting matrix as small as possible. We build three algorithms on the theory described above, two using the minimal polynomial (for blackbox and for dense matrices) and one using the LDLᵀ decomposition.

Algorithm: BBSM [BlackBox Signature by Minpoly]
Input: A, a symmetric matrix in blackbox form; S, a set of integers from which to make random selections.
Output: The signature σ*(A) = (p, z, n).
1. [Preconditioning may be necessary.] Let q be a random prime. Let r := rank(A, q) and mq := minpoly(A, q). If deg(mq) < n and deg(mq) ≤ r, let B := DAD [a blackbox], for a random diagonal matrix D with entries chosen from S. Otherwise, let B := A. [Now B has the same signature as A and its charpoly is its shifted minpoly, with high probability.]
2. Choose a set P of primes and the sample size Ω such that the error probability is as small as desired. Return σ*(x^{max(0,n−r−1)} · GenCRA(B, minpoly(), P, Ω)).

The r = rank(A, p) and v = minpoly(A, p) algorithms used to compute the rank and the minimal polynomial of A mod p, respectively, are as in [17, 13, 4], for example. Here minpoly() and rank() run in time O˜(nE) and are probabilistic with probability of error no more than 1/p, where E is the cost of a single matrix-by-vector product. But rank() will never return a value greater than the true rank, and minpoly() always computes at least a factor of the true minimal polynomial of A mod p. DAD is the blackbox whose matrix-vector product is formed as y = D(A(Dx)).

Algorithm: DSM [Dense Signature by Minpoly]
Input: Matrix A, in dense form.
Output: σ*(A).
Method: Apply algorithm BBSM, except use r = rank(A, p) [6] and v = minpoly(B, p) [5], algorithms which are available for the explicit (dense) matrix representation. Then rank() and minpoly() are deterministic eliminations running in time O˜(n³) and using O(n²d) memory. Of course, in this case the DAD preconditioning is done explicitly (and cheaply). Again, of course, p may be a bad prime (the minpoly of A mod p may not equal the mod p image of the integer minpoly of A).

Especially for blackbox matrices, it is useful that the minpoly computation can suffice, because we have faster algorithms for minpoly than for charpoly in that case. But for the DSM algorithm above, an alternative is to use the charpoly() function as the likely image function, see [5] for example. Also note that the minpoly suffices for determining the sign pattern (not the full signature) even without preconditioning. When the minpoly is of low degree or has small coefficients, this is a great savings. In general, though, BBSM is not a fully memory efficient algorithm because of the size of the
σ*-revealing vector. It is possible that the technique of [3] could be used to determine the signs using less memory and perhaps less time. This approach deserves further examination. Alternatively, combining Theorem 1, parts 3 and 4, and Theorem 3, part 2, we may work from the LDLᵀ decomposition of a matrix in generic rank profile. If the matrix should fail to have generic rank profile, this will be detected during the eliminations because of the need for pivoting. We use preconditioners Q to assure that QAQᵀ has generic rank profile. In some cases symmetric pivoting could be used instead, to avoid the increased entry size caused by preconditioning. The LDLᵀ decomposition based signature algorithm assumes a procedure lpm(v, A, p) which computes the vector of leading principal minors with alternating signs as described above, up to but not including the first which is zero mod p. The matrix mod p can fail to have generic rank profile. We could modify the LDLᵀ elimination to handle such cases, but it is a remote possibility and not worth the overhead. We simply reject such primes as bad primes. The procedure lpm meets the requirements of a σ*-revealing function for GenCRA.
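As an illustration, a dense-matrix sketch of such an lpm procedure (our own, not the implementation measured below) can be as simple as symmetric elimination mod p without pivoting:

```python
def lpm(A, p):
    """Signed leading principal minors (1, -m1, m2, -m3, ...) of A mod p,
    stopping before the first minor that vanishes mod p (p prime)."""
    n = len(A)
    M = [[x % p for x in row] for row in A]
    v, m = [1], 1
    for i in range(n):
        d = M[i][i]                    # i-th pivot; m_i = d_1 * ... * d_i
        if d == 0:
            break                      # generic rank profile fails mod p here
        m = m * d % p
        v.append((-m) % p if (i + 1) % 2 else m)
        inv = pow(d, p - 2, p)         # inverse of the pivot mod p
        for r in range(i + 1, n):
            f = M[r][i] * inv % p
            for c in range(i, n):
                M[r][c] = (M[r][c] - f * M[i][c]) % p
    return v

print(lpm([[2, 1], [1, 2]], 101))      # [1, 99, 3]: m1 = 2, m2 = 3, 99 = -2 mod 101
```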
Algorithm: DSLU [Dense matrix Signature by LU-decomposition]
Input: A, an integer symmetric k × k matrix.
Output: Signature σ*(A) = (p, z, n).
1. [Preconditioning may be necessary.] Let q be a random prime. Compute lpm(vq, A, q) and r := rank(A, q). If the length of vq is less than r + 1, let B := QAQᵀ, for a random integer matrix Q with entries chosen from [1, ..., s] (or any set of s nonzero integers). [Q may be a Toeplitz or butterfly matrix for speed.] Otherwise, let B := A. [Now B has the same signature as A and has generic rank profile.]
2. Choose a set P of primes and the sample size Ω such that the error probability is as small as desired. Return σ*(x^{n−r} · GenCRA(B, lpm(), P, Ω)).

Theorem 4. Let A be an integer blackbox n × n symmetric matrix, whose matrix-vector product costs e operations and whose entries would be of bit length at most d if they were explicitly constructed. Algorithms BBSM, DSM, and DSLU are Las Vegas probabilistic algorithms if the Hadamard bound is used, and are Monte Carlo if early termination is used. Even with early termination, DSM is Las Vegas if the computed integer minimal polynomial is checked by evaluation at A over the integers. Also, a minimal polynomial verification by application to the identity could be done after BBSM, making it Las Vegas for the sign pattern of the signature. Let h be a bound for the bit lengths of the values constructed using GenCRA (the length of the largest characteristic polynomial coefficient or of the largest leading principal minor). Then the expected run times are in O˜(neh) for BBSM and in O˜(n³h) for DSM and DSLU. By the Hadamard bound, h is in O˜(nd), so the expected run times are also in O˜(n²ed) for BBSM and in O˜(n⁴d) for DSM and DSLU. In particular, if e ∈ O˜(n) and d ∈ O(log(n)), then the BBSM expected run time is O˜(n³) and the DSM, DSLU expected times are in O˜(n⁴). ✷

Also, with any of the σ*-revealing vectors, the entries tend to grow in proportion to their index. In particular the ith entry is either an i × i minor or a sum of i × i minors, so is bounded by the Hadamard bound, which is in O˜(id). A heuristic to determine indefiniteness computes the first few vector entries, using many fewer remaindering steps than are required for the later entries. If the sign pattern fails to be constant or strictly alternating, the matrix is indefinite. It is an open question whether a conjugacy class preconditioning, A → QAQᵀ, could (cheaply) make it probable that early entries indicate definiteness.

4. APPLICATION TO LIE MATRICES

The matrices from Lie group representations are rational matrices. Our algorithms in the previous section focus on integer matrices. There are at least two ways to adapt. One is to compute the minimal polynomial, or the leading principal minors, over the rationals. Rational numbers must then be reconstructed in GenCRA, and it is easy to adapt our GenCRA, including the early termination technique, to this case. The other way is to multiply the matrices by the lcm of the denominators of all entries. For these specially constructed matrices from Lie group representations, the lcm of the denominators of all entries is just a little larger than each individual one, so this latter way is the better one. We found for some models that the gcd of all the numerators in the dense representation is not trivial and can be removed. Next we present blackbox algorithms to compute the lcm of the denominators of all entries, and the gcd of the numerators of all entries, respectively.

Algorithm: LD [LCM of Denominators]
Input: A, a rational matrix; M, sample size; n, number of trials.
Output: d, the lcm of the denominators of all entries of the dense representation of A.
1. for i from 1 to n do
     Choose a random vector x(i) with entries independently and uniformly chosen from [0, M − 1].
     y(i) = A x(i)  [apply x(i) to A]
     d(i) = the lcm of the denominators of the entries of y(i).
2. d := lcm(d(1), ..., d(n))
3. return d.

Algorithm: GN [Gcd of Numerators]
Apply algorithm LD, replacing lcm with gcd and denominators with numerators.

The algorithm LD always returns a factor of the true answer. The algorithm GN always returns a multiple of the true answer. For a rational matrix A, if d and g are the lcm of all denominators and the gcd of all numerators of the entries of A, respectively, then Ā = (d/g)A is an integer matrix. For each
individual trial, if all entries of A x(i) are coprime, then d(i) is correct in both algorithms. Please see [15] for the probability analysis. Roughly speaking, the error probability is 2^{−n+1}. Each matrix Jρ(ν) in equation (3) can be represented as a product of many sums of sparse matrices and scalar matrices. The algorithm BBSM is very well suited to this case. We also reduced the product of 121 sums of sparse matrices and
scalar matrices to a product of 94 sparse matrices for the E8 case. We explicitly compute each sum of a sparse matrix and a scalar matrix and store it as a single sparse matrix. Also, if the result is a diagonal matrix, we explicitly multiply it into the previous factor. By reducing the number of sparse matrices in this way, 20% of the run time was saved for the BBSM algorithm. After multiplying these collapsed matrices by the lcm of the denominators of all entries and dividing by the gcd of the numerators, these matrices can be handled by the algorithms discussed in the previous section.

5. EXPERIMENTS

Our goal is to know which algorithm to use and what resources it will take for computing the signature of the Jρ(ν) in equation (3), the matrix generated by the facet ν for the model of dimension n of the group E8. In particular, we focus on the case when the facet is the challenge facet ν0 = 1/40(0, 1, 2, 3, 4, 5, 6, 23), and the goal is to verify (or disprove) that Jn(ν0) is positive definite. To emphasize the model dimension n in what follows, let Jn = Jρ(ν0). The three signature algorithms we have discussed are BBSM, DSM, and DSLU. We have computed full solutions by each method for the models Q2, ..., Q12, Q22, Q32, Q42, Q52 (dimension 1008), and we have computed single modular step costs for every 10th model on up to the largest, Q112 (dimension 7168). The experiments involved individual run times up to about one day and verified, where the full computation was done, that Jn is indeed positive definite. All computations reported here were performed on a 3.2 GHz Xeon processor with 6 GB of main memory.

Our purpose here is to determine the algorithm to use and the resources needed for the full signature computation on examples of all dimensions (up to 7168), and to have tools to estimate the cost for matrices defined by other facets. The blackbox representation of Jn is a product of very sparse factors. Since the blackbox method cost is sensitive to e, the cost of the single matrix-by-vector product, we examined the total number of nonzero elements in these factors. To guess the expected growth of e with dimension, we measured the total number of nonzero entries in the factors of every 10th model up to Q112. Since the number of factors is fixed and they are extremely sparse, we expect a linear or n log(n) term to dominate. The formula e = 132.5900691n + 13.12471811n ln(n), a least squares fit to the data, is plotted in Figure 1 along with the data values. The fit is extremely good and we see that e is only slightly super-linear. Therefore we expect the runtime of the characteristic polynomial computation mod a single prime to grow at a rate only slightly above quadratic in n.

[Figure 1: Blackbox number of non-zeroes — number of non-zero entries vs. matrix order, data and least squares fit.]

Each algorithm involves a series of steps mod primes pi, each step being a computation of a σ*-revealing vector (a minpoly or the diagonal of an LDLᵀ decomposition) mod pi. The computation is finished when the CRA remaindering has given an image of the vector with modulus M = 2∏pi sufficiently large to recover the actual (signed) vector entries. If we know a bound d for the length of the entries of Jn, we have by the Hadamard bound a maximal length b = n(log(n)/2 + d) + 1 for M. The dense matrices are computed by applying the black box to the columns of the identity matrix (over Z, not modularly), and so the computation is sensitive to the length of the actual integer entries. Let d be the bit length of the largest entry. We do not have a theory to predict how the matrix entry bit length d may depend on the model dimension for a fixed facet. We have plotted the bit length d of the largest entry of Jn along with the fitted curve d = 67.7 + 33.2 log(n) in Figure 2. One sees the log(n) playing a stronger role than that for the non-zeroes in the blackbox, but the fit to the data is poorer and likely to have less predictive value. We noted that the n² entries of A all have about the same length, so n²d accurately describes the storage required for A.

[Figure 2: Bit length of entries — bit length vs. matrix order, data and fitted curve.]

It will also be useful to use d in estimating the number of remaindering steps. The needed bit length b for the modulus M is at most 1 bit (for the sign) more than the length of the Hadamard bound. The row norms are no more than √(n(2^d)²), so their product has length bounded by n(log(n)/2 + d). Reasonably consistent with this prediction is the number of bits used when we ran the full algorithms on Jn for n up to 1008 (model Q52). The curve in Figure 3 is a least squares fit, b = 150.1n + 17.30n ln(n). The number of bits for the charpoly coefficients (sums of minors) is expected to be slightly larger than for the entries of the leading principal minors vector, but most likely the final term, the determinant of Jn, is dominant. The number of bits needed appears to be slightly super-linear, not quite as large as the worst case Hadamard bound level.

[Figure 3: Bit length of final modulus — bits vs. matrix order, data and least squares fit.]
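The prediction methodology throughout is a two-parameter least squares fit on the basis functions suggested by theory. The sketch below (our illustration, with made-up sample data) shows the kind of computation meant:

```python
import numpy as np

# Hypothetical (n, seconds) measurements -- stand-ins for the measured data.
n = np.array([112.0, 224.0, 448.0, 896.0, 1008.0])
t = np.array([1.1, 11.0, 102.0, 960.0, 1420.0])

# Least squares on the basis functions n^4 and n^4 * ln(n)
A = np.column_stack([n**4, n**4 * np.log(n)])
(a, b), *_ = np.linalg.lstsq(A, t, rcond=None)

predict = lambda m: a * m**4 + b * m**4 * np.log(m)
print(a, b, predict(7168.0))   # extrapolate the fit to the largest model
```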
Next we consider the costs of the modular steps. Let step(s) denote the cost per step, which varies with the length s of the prime but is essentially constant among primes of the same length. The total time is (b/s) × step(s) when using s-bit primes. Thus we see the advantage of minimizing step(s)/s, the cost per bit. After discussing step(s)/s and determining its value for each algorithm, we will know how to estimate the full runtime of the algorithms.

We expect to be able to reduce the LU step cost by half, as an elimination better tailored to symmetric matrices can be done, while still taking advantage of block operations using BLAS. For the blackbox algorithm, block methods may be able to reduce the costs somewhat, and DSM may perform relatively better than here if the dense charpoly() algorithm [5] is used rather than preconditioning. At any rate, in Figure 4 we see the performance of the modular steps of the three algorithms, BBSM, DSM, and DSLU, as currently implemented. Asymptotically, the BBSM step runtime is expected to grow in proportion to ne. There was a good fit to s = 61n² log(n) nanoseconds per bit (recall that e has an n log(n) dominant term). The algorithms DSM and DSLU depend on the construction of the full integer matrix of Jn, which is done once. The step times consist of computing a modular image of that matrix and proceeding with an elimination on the image. The implementation of the Krylov matrix construction in the minpoly() algorithm of DSM uses a technique of recursive doubling to better exploit fast matrix multiplication, and gets a step time which fits the formula t = 1.8n³ + 0.16n³ log(n) nanoseconds per bit quite well. The DSLU step fits t = 0.3838888362n³ nanoseconds per bit. The log factor for the DSM step can be removed, so it remains possible that it can become competitive, or participate with BBSM in a heterogeneous distributed computation, running on the machines with larger memory. The time formula for the dense matrices should remain valid as long as the modular step fits in main memory; after that, swapping would dramatically increase times. Assuming a memory capable of holding a 2GB virtual memory, this would allow for n² words, so n < 2^15. For all practical purposes, the blackbox step has no memory limitation.

[Figure 4: Time per bit of modulus — time (s) per bit vs. matrix order for the blackbox, minpoly/BLAS, and LU/BLAS steps, with their estimates.]

The total runtime then involves the time per bit in the modular steps times the number of bits in the σ*-revealing vector. The Chinese remaindering adds a cost that is similar for each, but large enough to mute somewhat the effect of the differing step costs. However, with the early termination technique of the previous section this remaindering is a smaller factor than in earlier timings. The two dense algorithms also incur a lower order cost for the creation of the expanded integer matrix initially. In Figure 5 we show the overall run times (where the computations have been done so far) together with their least squares fit formulas. The formulas for time t in nanoseconds are BBSM: t = 10050n³ log(n), DSM: t = 6.08n⁴ log(n), and DSLU: t = 1.45n⁴ log(n). Theory predicts a second log factor in the BBSM time, coming from the single matrix-by-vector product cost e. However that effect is weak and the fit was poorer.

[Figure 5: Total run time — time (s) vs. matrix order for the three algorithms, with their estimates.]

The memory needed to store the σ*-revealing vector mutes the memory advantage of BBSM. However, the modular images of the vector are easily stored on hard disk and manipulated from there, so the memory advantage remains real. The crossover point of the runtime formulas for BBSM and DSLU is around n = 6931. However, the formula t = 10.02n⁴ fits the DSLU data about as well as our formula above; using that form, the crossover point is at n = 9150.

We have also completed a study of the operators for the facet 1/12(0, 2, 2, 3, 4, 5, 7, 23). In this case the ranks of the operators in each model are very low. For the largest model, of dimension 7168, the rank is 448, and for the other 111 models the rank seldom exceeds a tenth of the dimension. We were able to verify the positive semi-definiteness in all 112 models, using a total of about 125 CPU hours. The LU method was most often the fastest, and its time was dominated by the cost to expand the product of 121 sparse matrices into a single dense matrix. The blackbox minpoly method was faster for the models with the lowest ranks (and thus smallest degree minimal polynomial). Also it was the only method to work for the largest model, because of address space limitations.

6. CONCLUSION

We have demonstrated that we can compute (on current hardware) the signature of a dense n × n integer matrix having entries of bit length around log(n) in a minute if n ≤ 200, in three hours if n = 1000, and (projected) in a CPU year for J7168. Beyond that size, using algorithm BBSM, the time grows at a rate slightly higher than n³ and memory is not a constraint (except for storage of the sparse factors of the blackbox). However, we conclude that algorithm DSLU serves best for matrices of dimension n < 7000. It is a tossup between DSLU and BBSM for dimensions 7000 ≤ n ≤ 9000, and BBSM is superior beyond that. DSLU time grows slightly above n⁴. For J7168, DSLU requires explicit use of file storage for the expanded matrix, and all algorithms should do this for intermediate results (modular images of σ*-revealing vectors) because of their large size. We have not measured the cost
of this file manipulation. At the crossover, about one CPU year of run time is required and DSLU needs a large memory, so parallel computation is desirable (and is quite straightforward for either algorithm) on distributed or shared memory hardware. It is an open question whether definiteness can be determined fundamentally faster than signature. There is a fast Monte Carlo algorithm for rank, hence for distinguishing semi-definite from definite matrices. We have sketched a heuristic that sometimes can determine indefiniteness much faster than the signature computation. To provide for the needs of Lie group representation studies, both BBSM and DSLU will be further refined and their parallel implementations developed. Also the judicious incorporation of numeric computation is a possibility.

7. REFERENCES
[1] J. Adams. Integral models of representations of Weyl groups. http://atlas.math.umd.edu/weyl/integral.
[2] D. Barbasch. Unitary spherical spectrum for split classical groups. Preprint, http://www.math.cornell.edu/~barbasch/nsph.ps.
[3] H. Brönnimann, I. Z. Emiris, V. Y. Pan, and S. Pion. Sign determination in residue number systems. Theoret. Comput. Sci., 210:173–197, 1999.
[4] L. Chen, W. Eberly, E. Kaltofen, B. D. Saunders, W. J. Turner, and G. Villard. Efficient matrix preconditioners for black box linear algebra. Linear Algebra and Applications, 343–344:119–146, 2002.
[5] C. Pernet and Z. Wan. LU based algorithms for the characteristic polynomial over a finite field. Poster, ISSAC'03. ACM Press, 2003.
[6] Jean-Guillaume Dumas, Pascal Giorgi, and Clément Pernet. FFPACK: Finite field linear algebra package.
[7] Wayne Eberly. Early termination over small fields. In Proc. ISSAC'03, pages 80–87. ACM Press, 2003.
[8] I. Z. Emiris. A complete implementation for computing general dimensional convex hulls. Inter. J. Comput. Geom. Appl., 8:223–253, 1998.
[9] F. R. Gantmacher. The Theory of Matrices. Chelsea, New York, NY, 1959.
[10] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, Baltimore, Maryland, third edition, 1996.
[11] James E. Humphreys. Reflection Groups and Coxeter Groups, volume 29 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1990.
[12] E. Kaltofen and A. Lobo. On rank properties of Toeplitz matrices over finite fields. In ISSAC 96: Proc. 1996 Internat. Symp. Symbolic Algebraic Comput., pages 241–249.
[13] E. Kaltofen and B. D. Saunders. On Wiedemann's method of solving sparse linear systems. In H. F. Mattson, T. Mora, and T. R. N. Rao, editors, Proc. AAECC-9, volume 539 of Lect. Notes Comput. Sci., pages 29–38, Heidelberg, Germany, 1991. Springer Verlag.
[14] Erich Kaltofen. An output-sensitive variant of the baby steps/giant steps determinant algorithm. In Proc. ISSAC'02, pages 138–144. ACM Press, 2002.
[15] B. David Saunders and Zhendong Wan. Smith normal form of dense integer matrices, fast algorithms into practice. In Proc. ISSAC'04, pages 274–281. ACM Press, 2004.
[16] John R. Stembridge. Explicit matrices for irreducible representations of Weyl groups. Represent. Theory (electronic), 8:267–289, 2004.
[17] D. Wiedemann. Solving sparse linear equations over finite fields. IEEE Trans. Inf. Theory, IT-32:54–62, 1986.
Sum of Roots with Positive Real Parts∗

Hirokazu Anai
Fujitsu Laboratories Ltd
4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki 211-8588, Japan
[email protected]

Shinji Hara
The University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
shinji [email protected]

Kazuhiro Yokoyama
Rikkyo University
3-34-1 Nishi Ikebukuro, Toshima-ku, Tokyo 171-8501, Japan
[email protected]

ABSTRACT
In this paper we present a method to compute or estimate the sum of roots with positive real parts (SORPRP) of a polynomial, which is related to a certain index of "average" stability in optimal control, without computing explicit numerical values of the roots. The method is based on symbolic and algebraic computations and enables us to deal with polynomials having parametric coefficients. Combining the method with quantifier elimination leads to a novel systematic method for achieving optimal regulator design in control. We also report some experimental results for a typical class of plants and confirm the effectiveness of the proposed method.

Categories and Subject Descriptors
I.1.2 [Computing Methodologies]: Symbolic and Algebraic Manipulation—Algebraic Algorithms

General Terms
Algorithms, Experimentation

For a feedback system with a plant P(s) = np(s)/dp(s) controlled by a controller C(s) = nc(s)/dc(s), where np(s), dp(s), nc(s), dc(s) ∈ Q[s], the stability of the system is described as follows: the feedback system is stable if and only if all of the roots of the closed-loop characteristic polynomial g(s) = np nc + dp dc lie within the left half of the Gaussian plane. This is called Hurwitz stability. We may consider a more general notion of stability, called D-stability, which requires that all of the roots lie inside a prescribed region D within the left half plane. The control design problem is to find a controller C(s) so that the system satisfies given specifications. Since the controller C(s, q) has a fixed structure with some parameters q, what we have to do is to seek feasible controller parameters q which solve the controller design problem. For such problems, techniques from computer algebra have been applied successfully [9, 14, 1, 2].

Stability is the first necessary requirement for control system design, and assigning the roots of a certain polynomial to a desired region is an essential problem in the study of stability. The root assignment problem for Hurwitz stability is to find controller parameters q so that the system is Hurwitz stable; this is easily verified by the well-known Routh–Hurwitz criterion. In the case of D-stability, a wedge-shaped region or a circle is usually used as the stability region D. For root assignment problems with such stability regions, the controller design problem reduces to checking a sign definite condition (SDC)

∀z > 0, f(z) > 0,

where f(z) ∈ Q(q)[z]; see [15, 13]. Applying real quantifier elimination (QE) to the sign definite condition, we can obtain the possible regions of controller parameters q that meet D-stability. In particular, we can utilize an efficient quantifier elimination algorithm specialized to the SDC [1, 10]. These controller synthesis methods with respect to stability are implemented as functions in a MATLAB toolbox for robust parametric control [3].
Categories and Subject Descriptors I.1.2 [Computing Methodologies]: Symbolic and Algebraic Manipulation – Algebraic Algorithms
General Terms Algorithm, Experimentation
Keywords Sum of roots with positive real parts, Gr¨ obner basis, resultant, quantifier elimination, optimal regulator control
1. INTRODUCTION In control and system theory, investigating location of roots of the characteristic polynomial is one of important and fundamental topics related to the stability of feedback control systems. For example, in case of a typical feedback ∗
This work has been supported in part by CREST of Japan Science and Technology Agency, the 21st Century COE: Information Science and Technology Strategic Core “Superrobust computation project” (The University of Tokyo) and “Development of dynamic mathematics with high functionality” (Kyushu University).
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ISSAC’05, July 24–27, 2005, Beijing, China. Copyright 2005 ACM 1-59593-095-7/05/0007 ...$5.00.
problems treated are recast as simple conditions on a univariate polynomial with parametric coefficients (one of them being a sign definite condition), we can utilize an efficient quantifier elimination algorithm using the Sturm–Habicht sequence [1, 10]. The proposed method is applied to an even polynomial derived from the "linear quadratic regulator (LQR) problem", which is one of the main concerns in control theory.

The rest of this paper is organized as follows. First we show algorithms to compute other polynomials whose maximal real root coincides with the SORPRP of a given even polynomial. Then, using such polynomials, we state in §3 our main problem, which comes from a control design problem. In §4 we present how we formulate the basic problem as a first-order formula and solve it by quantifier elimination. In §5 we demonstrate some examples of LQR problems in order to confirm the effectiveness of the proposed method. Finally, the paper ends with a conclusion in §6.
2. SORPRP OF EVEN POLYNOMIALS

First we consider an even polynomial f(x) of degree 2m in Q[x] with nonzero constant term. Then for any root α of f(x), −α is also a root of f(x). Thus there are m roots, say α_1, ..., α_m, which have positive real parts, and also m roots, say α_{m+1}, ..., α_{2m}, which have negative real parts. So

$$f(x) = a_{2m}x^{2m} + a_{2m-2}x^{2m-2} + \cdots + a_2x^2 + a_0 = a_{2m}\prod_{i=1}^{2m}(x-\alpha_i), \tag{1}$$

where a_{2k} ∈ Q for 0 ≤ k ≤ m, a_{2m} ≠ 0 and a_0 ≠ 0. Our first target is to compute W = α_1 + ... + α_m, the sum of all roots with positive real parts, without computing all the α_i's. For simplicity, we call W the SORPRP of f. Since, for each non-real root of f(x), its complex conjugate has the same real part, we have the following:

Lemma 1. W is a real number.

2.1 Polynomial having SORPRP as its root

Let B_{i_1,...,i_m} = α_{i_1} + ··· + α_{i_m} for 1 ≤ i_1, ..., i_m ≤ 2m, and let 𝓑 be the set of all distinct values of B_{i_1,...,i_m}.

Definition 1. Gathering all sums B_{i_1,...,i_m}, we can construct a polynomial R_{m,f}(z) and its square-free part R̄_{m,f}(z), where z is a new variable:

$$R_{m,f}(z) = \prod_{i_1,\dots,i_m}(z - B_{i_1,\dots,i_m}), \tag{2}$$

$$\bar R_{m,f}(z) = \prod_{B\in\mathcal B}(z - B). \tag{3}$$

As there are cases where B_{i_1,...,i_m} coincides with B_{j_1,...,j_m} for distinct (i_1,...,i_m) and (j_1,...,j_m), the square-free part R̄_{m,f}(z) may be much smaller than R_{m,f}(z). Since all B_{i_1,...,i_m} are algebraic numbers and the coefficients of R_{m,f}(z) are symmetric under all permutations of the roots, it follows that R_{m,f}(z) ∈ Q[z] and so R̄_{m,f}(z) ∈ Q[z]. We may call R_{m,f}(z) the characteristic polynomial of sums of m roots, and R̄_{m,f}(z) the minimal polynomial of sums of m roots. We note that deg(R_{m,f}(z)) = (2m)!/m! and deg(R̄_{m,f}(z)) ≤ \binom{2m}{m} = (2m)!/(m!)².

It is obvious that the SORPRP W = α_1 + ··· + α_m of f(x) coincides with the maximal real root of R̄_{m,f}(z) (R_{m,f}(z)), since W is a real number. To compute R̄_{m,f}(z) and R_{m,f}(z), we use the following triangular set, related to the Cauchy moduli [5], defined by f(x).

Definition 2. Let D be an arbitrary computable integral domain and K its quotient field. For a polynomial g(x) of degree n in D[x], we define the following polynomials: {g_1(x_1), g_2(x_1, x_2), ..., g_n(x_1, ..., x_n)}, where g_1(x_1) = g(x_1) and g_i(x_1, ..., x_i) is the quotient of g(x_i) divided by (x_i − x_1)···(x_i − x_{i−1}) for each i > 1. We note that g_i(x_1, ..., x_i) ∈ D[x_1, ..., x_i], and g_i coincides with the quotient of g_{i−1}(x_1, ..., x_{i−2}, x_i) divided by x_i − x_{i−1}. Here we call {g_1, ..., g_n} the standard triangular set defined by g(x), and {g_1, ..., g_k} the k-th standard triangular set defined by g(x).

It is well known that {g_1, ..., g_k} forms a Gröbner basis of the ideal ⟨g_1, ..., g_k⟩ generated by itself with respect to the lexicographic order x_1 < ... < x_k in K[x_1, ..., x_k], and the set of all its zeros, with multiplicities counted, coincides with the set {(β_{i_1}, ..., β_{i_k}) | i_1, ..., i_k ∈ {1, ..., n} are distinct to each other}, where β_1, ..., β_n are all roots of g(x) in the algebraic closure of K. Thus, when g(x) is square-free, ⟨g_1, ..., g_k⟩ is a radical ideal. We note that for each g_i its leading coefficient lc(g_i) with respect to the order < coincides with the leading coefficient lc(g) of g(x).

Now let F = {f_1(x_1), ..., f_m(x_1, ..., x_m)} be the m-th standard triangular set defined by f(x) in Q[x_1, ..., x_m]. R_{m,f}(z) can be computed by successive resultant computation, and R̄_{m,f}(z) can be computed as the minimal polynomial of z = x_1 + ··· + x_m modulo the ideal I = ⟨F⟩ (with square-free computation if necessary). As R_{m,f}(z) has huge degree, it is difficult to compute it as it is. But R_{m,f}(z) is very useful for estimating coefficients, so we explain its computation below.

Computation of R_{m,f}(z) via resultant

Let T_m(z) = z − (x_1 + ··· + x_m) and, for each k ≤ m, define T_{k−1} successively as follows:

$$T_{k-1}(z, x_1,\dots,x_{k-1}) = \mathrm{res}_{x_k}\bigl(f_k(x_1,\dots,x_k),\ T_k(z,x_1,\dots,x_k)\bigr),$$

where res_y means the resultant with respect to a variable y. Then T_0(z) ∈ Q[z], and T_0(z) coincides with a_{2m}^d R_{m,f}(z) for some positive integer d. This can be shown as follows: by the construction of the Sylvester matrices in the resultant computation, the leading coefficient of T_i with respect to x_j (T_i considered as a univariate polynomial in x_j) is some power of a_{2m} for each j < i, and the same holds for the leading coefficient of T_i with respect to z. Then, by the properties of resultants, we have

$$T_0(z) = \frac{1}{a_{2m}^{d_1}}\prod_{i=1}^{2m} T_1(z,\alpha_i),$$

$$T_1(z,\alpha_{i_1}) = \frac{1}{a_{2m}^{d_2}}\prod_{i_2\ne i_1} T_2(z,\alpha_{i_1},\alpha_{i_2}),$$

$$\vdots$$

$$T_{m-1}(z,\alpha_{i_1},\dots,\alpha_{i_{m-1}}) = \frac{1}{a_{2m}^{d_m}}\prod_{i_m\ne i_1,\dots,i_{m-1}} T_m(z,\alpha_{i_1},\dots,\alpha_{i_m}),$$

$$T_m(z,\alpha_{i_1},\dots,\alpha_{i_m}) = z - (\alpha_{i_1}+\cdots+\alpha_{i_m}),$$

where i_1, ..., i_m are distinct to each other and each d_i is a positive integer. (See [8].) When f(x) ∈ Z[x], that is, all a_{2k} are integers, T_0(z) belongs to Z[z]. In order to avoid coefficient growth and degree growth in the resultant computation, we may apply factorization techniques to each T_k, or to its factors, to compute smaller factors of T_0. (See §4.2 for the usage of factors.) We note that multi-polynomial resultants can also be applied for computing T_0(z), by considering the system {f_1 = f_2 = ··· = f_m = z − (x_1 + ··· + x_m) = 0} in the variables {x_1, ···, x_m}.
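To make the recipe concrete, here is a minimal sketch (ours, not the authors' implementation) of the successive resultant computation for m = 2 in sympy; the toy polynomial f = x⁴ − 5x² + 4 has roots ±1, ±2, so its SORPRP is 3.

```python
# Sketch of the resultant-based computation of T0(z) for m = 2.
import sympy as sp

z, x1, x2, x = sp.symbols('z x1 x2 x')
f = x**4 - 5*x**2 + 4          # roots +-1, +-2; SORPRP = 1 + 2 = 3

# 2nd standard triangular set {f1(x1), f2(x1, x2)} of Definition 2:
f1 = f.subs(x, x1)
f2 = sp.div(f.subs(x, x2), x2 - x1, x2)[0]   # quotient of f(x2) by (x2 - x1)

# Eliminate x2, then x1, by resultants:
T2 = z - (x1 + x2)
T1 = sp.resultant(f2, T2, x2)
T0 = sp.resultant(f1, T1, x1)

# The maximal real root of T0 is the SORPRP W (here 3):
print(max(sp.real_roots(sp.Poly(T0, z))))
```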
Computation of R̄_{m,f}(z) via minimal polynomial

Let z = x_1 + ··· + x_m and I = ⟨F⟩ in Q[x_1, ..., x_m]. Then we consider a minimal polynomial M(z) of z modulo I, that is, M(z) has the smallest degree among all polynomials h(z) in Q[z] such that h(x_1 + ··· + x_m) belongs to the ideal I. Since the set of all zeros of I, with multiplicities counted, is {(α_{i_1}, ..., α_{i_m}) | i_1, ..., i_m ∈ {1, ..., 2m} are distinct to each other}, it can be shown easily that M(z) is a factor of R_{m,f}(z) and has R̄_{m,f}(z) as its factor. (We may say that M(z) stands between R_{m,f}(z) and R̄_{m,f}(z).) In particular, when f(x) is square-free, M(z)/lc(M(z)) coincides with R̄_{m,f}(z). When f(x) ∈ Z[x], that is, all a_{2k} are integers, by removing the denominators of the coefficients appearing in M(z) we may assume that M(z) belongs to Z[z]. Then the leading coefficient lc(M) divides some power of a_{2m}, as M(z) divides T_0(z). As we already know the Gröbner basis {f_1, ..., f_m} of I, M(z) can be computed rather easily; see §4 in [17].
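The same toy example can be run through the Gröbner basis route. A sketch (ours), assuming sympy's groebner: M(z) is the generator of the elimination ideal of ⟨f_1, f_2, z − (x_1 + x_2)⟩ in the lex order making z smallest, and since our f is square-free, M coincides (up to a constant) with R̄_{2,f}.

```python
# Sketch of computing M(z) as an eliminant via a Groebner basis.
import sympy as sp

z, x1, x2, x = sp.symbols('z x1 x2 x')
f = x**4 - 5*x**2 + 4
f1 = f.subs(x, x1)
f2 = sp.div(f.subs(x, x2), x2 - x1, x2)[0]

G = sp.groebner([f1, f2, z - (x1 + x2)], x2, x1, z, order='lex')
M = [g for g in G.exprs if g.free_symbols <= {z}][0]   # the z-only element
print(sp.factor(M))                     # factors over the sums {0, +-1, +-3}
print(max(sp.real_roots(sp.Poly(M, z))))   # 3, the SORPRP
```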
2.2 Parametric case

Now we consider the case where each coefficient a_{2k} is some polynomial in the parameters p = {p_1, ..., p_t}. Thus the even polynomial f(x) is considered as a multivariate polynomial f(x, p) in Q[x, p]. Setting D = Q[p] and K = Q(p), we can compute the m-th standard triangular set F = {f_1(x_1, p), ..., f_m(x_1, ..., x_m, p)} in D[x_1, ..., x_m]. Then, as lc(f_i) = a_{2m}(p) for each i, F̄ = {f_1/a_{2m}, ..., f_m/a_{2m}} is the reduced Gröbner basis of ⟨F⟩ in K[x_1, ..., x_m]. From F, we can compute T_0(z, p) by successive resultant computation, and M(z, p) as a minimal polynomial of z modulo the ideal ⟨F⟩ in K[x_1, ..., x_m]. We note that, using a block order {x_m > ... > x_1} ≫ z, M(z, p) is found in a Gröbner basis of ⟨F ∪ {z − (x_1 + ··· + x_m)}⟩ in K[x_1, ..., x_m, z]. Then T_0(z, p) belongs to Q[z, p], and by removing denominators we may assume that M(z, p) also belongs to Q[z, p]. As F̄ is the reduced Gröbner basis of ⟨F⟩ and the denominator coincides with a_{2m}(p), the following holds. (See the exercises of Chapter 6.3 in [7].)

Theorem 1. For each (c_1, ..., c_t) ∈ Q^t, consider the polynomial f_c(z) obtained from f(x, p) by substituting the parameters (p_1, ..., p_t) with (c_1, ..., c_t). If the leading coefficient a_{2m}(c_1, ..., c_t) does not vanish, then T_0(z, c_1, ..., c_t) coincides with c·R_{m,f_c}(z) for some non-zero constant c in Q, and M(z, c_1, ..., c_t) is a factor of R_{m,f_c}(z) and has R̄_{m,f_c}(z) as its factor in Q[z].

By Theorem 1, we can handle the SORPRPs of polynomials with parametric coefficients. For the total computational efficiency, computing M(z, p) is much better than computing T_0(z, p) in many cases.

2.3 Note on non-even polynomial case

For an even polynomial of degree 2m, we know the number m of roots with positive real parts from the degree 2m. However, for a non-even polynomial, we do not know the number m of roots with positive real parts in advance. We can compute the number m (without computing the roots themselves) by utilizing the following theorem for h(x) = f(ix), where i is the imaginary unit. (For Sturm sequences, see [16].)

Theorem 2. (Theorem 3.5 in [18]; see also [12]) For a polynomial h(x) of degree n in C[x], let ϕ(x), ψ(x) ∈ R[x] be its real part and its imaginary part, respectively, that is, h(x) = ϕ(x) + iψ(x). We assume that h(x) has no real root. (If h(x) has a real root, we can eliminate it by a GCD computation of ϕ(x) and ψ(x).) Moreover, let N, N′ be the number of roots with positive imaginary parts and the number of roots with negative imaginary parts, respectively. If deg(ϕ) ≥ deg(ψ), let V(a) be the number of variations in sign of the standard Sturm sequence of ϕ(x), ψ(x) evaluated at the point a; otherwise, let V(a) be that of ψ(x), ϕ(x) evaluated at a. Then

$$N = \frac{1}{2}\{n + V(\infty) - V(-\infty)\}, \qquad N' = \frac{1}{2}\{n - V(\infty) + V(-\infty)\}.$$

Since the number N′ of roots of h(x) with negative imaginary parts coincides with the number m of roots of f(x) with positive real parts, we can compute m from N′. Then, the same as in the case of even polynomials, the SORPRP W = α_1 + ··· + α_m of f(x) coincides with the maximal real root of R_{m,f}(z). In the parametric case, m ranges according to the values of the parameters, and m can be computed by a precise analysis of the Sturm sequence of ϕ, ψ using quantifier elimination techniques; however, the problem then becomes much more complicated and harder to solve.

3. FORMULATION OF BASIC PROBLEM

Here we explain the fundamental problem of this paper. We denote the polynomial obtained above (T_0(z, p) or M(z, p)) by R(z). What we do after obtaining R(z) is the following:

Problem 1. Given a polynomial R(z) ∈ Q(p)[z] involving parameters p in its coefficients, and M_1, M_2 ∈ Q (M_1 > M_2), find feasible ranges of the parameters p so that the maximal real root W of R(z) satisfies each of the following requirements: (a) W < M_1, (b) W > M_2, (c) M_2 < W < M_1.

Here we exclude ranges where the leading coefficient of R(z) or its constant term vanishes. From the viewpoint of control theory, the parameters p usually come from controller or plant parameters of the control system to be designed, and the above three requirements originate from control design specifications in terms of the SORPRP.
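Before quantifier elimination enters the picture, note that for fixed numeric parameter values the three requirements are plain conditions on the real roots of R(z). A small sketch (the helper names are ours, and R is assumed to have at least one real root):

```python
# Checking (a), (b), (c) of Problem 1 for fixed numeric parameters.
import sympy as sp

z = sp.symbols('z')

def sorprp_requirements(R, M1, M2):
    """Return the truth of (a) W < M1, (b) W > M2, (c) M2 < W < M1."""
    W = max(sp.real_roots(sp.Poly(R, z)))   # the maximal real root
    a, b = W < M1, W > M2
    return a, b, a and b

# Example: R = (z - 3)*(z + 1) has W = 3, so (a) holds for M1 = 5
# and (b) holds for M2 = 2:
print(sorprp_requirements((z - 3)*(z + 1), 5, 2))   # (True, True, True)
```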
4. SOLVING THE BASIC PROBLEM

4.1 Outline of Algorithm

Problem 1 is resolved by using quantifier elimination over the real closed field. Indeed, all of the requirements reduce to simple first-order formulas for R(z) ∈ Q(p)[z], as follows.

(a) W < M_1: This requirement is equivalent to the first-order formula

∀z > M_1, R(z) ≠ 0.   (4)

This is a so-called sign definite condition [1], hence we can solve it by an efficient quantifier elimination algorithm using the Sturm–Habicht sequence [11, 6].

(b) W > M_2: This requirement is equivalent to the first-order formula

∃z > M_2, R(z) = 0.   (5)

We can also solve this by an efficient quantifier elimination algorithm using the Sturm–Habicht sequence; see [10].

(c) M_2 < W < M_1: This requirement is equivalent to the conjunction of (a) and (b), that is,

(∀z > M_1, R(z) ≠ 0) ∧ (∃z > M_2, R(z) = 0).   (6)

Hence this is achieved by superposing the quantifier-free formulas obtained by performing quantifier elimination for (a) and (b).

4.2 Strategy against large size problems

In many examples, the polynomial R(z) factors into small factors over Q(p), say R(z) = R_1 ··· R_s, due to a certain symmetry among the sums α_{i_1} + ··· + α_{i_m} (see Remark 1 for an example). Then formula (4) is equivalent to

$$\bigwedge_i (\forall z > M_1,\ R_i(z) \ne 0).$$

This means that the result of performing quantifier elimination on (4) is equivalent to the conjunction of the results obtained by applying quantifier elimination to each factor R_i. Moreover, formula (5) is equivalent to

$$\bigvee_i (\exists z > M_2,\ R_i(z) = 0).$$

This implies that the result of performing quantifier elimination on (5) is equivalent to the disjunction of the results obtained by applying quantifier elimination to each factor R_i. Dividing the original large problem into several smaller problems via factorization should greatly reduce the total cost of quantifier elimination. Of course, we have to take the cost of factorization into account, but we can expect the performance of this approach to be superior to applying quantifier elimination to the original formulas (4) and (5).

5. CONTROL APPLICATION

We here consider a typical optimal control problem, the linear quadratic regulator (LQR) problem. We first briefly explain the problem in §5.1 and then show some computational examples, by which we can confirm the effectiveness of our proposed method. All computations except quantifier elimination were done using the computer algebra system Risa/Asir (see http://www.math.kobe-u.ac.jp/Asir/asir.html). All QE computations in this paper were carried out by QEPCAD (see http://www.cs.usna.edu/~qepcad/B/QEPCAD.html), since QEPCAD succeeded in achieving all of the QE computations for our examples in a very small amount of time. For larger-sized problems, we may use an efficient QE algorithm based on the Sturm–Habicht sequence [1, 10]. Some types of QE methods using the Sturm–Habicht sequence are available in the Maple package SyNRAC [4, 19].

5.1 LQR problem

Here we briefly explain the linear quadratic regulator (LQR) problem (see [20] for more details) and introduce the target polynomial whose SORPRP we want to estimate. Let us consider a linear time-invariant SISO (single-input single-output) system represented by

$$\dot x(t) = Ax(t) + bu(t), \qquad y(t) = cx(t), \tag{7}$$

where x ∈ R^m is the state variable, u ∈ R is the control input, y ∈ R is the output, A ∈ R^{m×m} is the system matrix, b ∈ R^m is the input matrix, and c^T ∈ R^m is the output matrix. Then the LQR problem is to find a control input u which minimizes the cost function

$$J = \int_0^\infty \bigl(q\,y^2(t) + r\,u^2(t)\bigr)\,dt, \tag{8}$$

where q > 0 and r > 0 are called weights. If we take a larger value of q, we generally get a faster response. On the other hand, a larger value of r is required when we have a severe restriction on the value of u, since r reflects the penalty on u(t). Note that the ratio q/r plays an essential role in finding the optimal control input and determines the closed-loop poles. Indeed, it is well known that the optimal closed-loop poles are determined by the corresponding polynomial

$$\varphi(s) = r\cdot d(s)d(-s) + q\cdot n(s)n(-s), \tag{9}$$

where d(s) and n(s) are the denominator and numerator of the transfer function P(s) = c(sI − A)^{−1}b of the plant (7). In other words,

$$P(s) = \frac{n(s)}{d(s)},$$

where d(s) := det(sI − A) and n(s) := c adj(sI − A) b. Note that deg(d(s)) = m and deg(n(s)) < m hold.
The polynomial ϕ(s) is our target polynomial, with deg(ϕ(s)) = 2m, and it is an even polynomial. It is strongly desired to establish a guiding principle for choosing appropriate values of r and q, or of the ratio q/r, since the closed-loop poles are exactly the roots of ϕ(s) which have negative real parts. In the sequel we carry out an investigation of the weights r and q in terms of average stability, that is, the sum of roots with negative real parts (SORNRP) of ϕ(s). We can attain this by simply applying our method for the SORPRP, shown in the previous sections, to R(−z), where the polynomial R(z) has the SORPRP of ϕ(s) as its root: since ϕ(s) is even, the SORPRP coincides with the absolute value of the SORNRP, and R(−z) has −1×SORNRP as its maximal real root. In particular, we study the behavior of a parameter involved in the plant P(s) and feasible bounds for the SORPRP W versus the ratio of weights q/r (or q with r = 1) under the specifications of §4. This kind of investigation is important in practice for seeing control performance limitations, since average stability is one appropriate measure of the quickness of feedback control systems.
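As an illustration (with toy numbers of ours, not the experiments reported below), the polynomial (9) and the optimal closed-loop poles can be obtained as follows:

```python
# Sketch: build phi(s) of (9) for a sample second-order plant and
# take the roots with negative real part as the closed-loop poles.
import sympy as sp

s = sp.symbols('s')
q, r = 2, 1                    # sample weights
d = s**2 + 6*s + 900           # d(s) for zeta = 0.1, wn = 30
n = sp.Integer(900)            # a constant numerator n(s)
phi = sp.expand(r*d*d.subs(s, -s) + q*n*n.subs(s, -s))
poles = [z0 for z0 in sp.Poly(phi, s).nroots() if sp.re(z0) < 0]
print(poles)                   # the m = 2 optimal closed-loop poles
```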
5.2 A sample plant: 2nd-order system with time delay

Here we study the LQR problem for a class of typical second-order systems with time delay, represented by

$$P(s) = \frac{k\omega_n^2 e^{-Ls}}{s^2 + 2\zeta\omega_n s + \omega_n^2} \simeq \frac{k\omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}\cdot\frac{1-\frac{1}{2}Ls}{1+\frac{1}{2}Ls},$$

where the exponential e^{−Ls} is transformed into a rational function by the Padé approximation. We consider the case where k = 1, ζ = 0.1, ω_n = 30 (kHz), and r = 1. Initially we assume that L > 0 and r, q > 0. Then the target even polynomial is expressed as

ϕ(s; q, L) = d(s)d(−s) + q·n(s)n(−s) = −25L²s⁶ + (−49L² + 100)s⁴ + ((−25q − 25)L² + 196)s² + 100q + 100.

We remark that the leading coefficient −25L² of ϕ(s) never vanishes, as L > 0, and the constant term 100q + 100 also never vanishes, as q > 0. Let I_3 be the ideal generated by the 3rd standard triangular set of ϕ(s; q, L),

{ϕ(x_1; q, L), ϕ_1(x_1, x_2; q, L), ϕ_2(x_1, x_2, x_3; q, L)},

where ϕ_1(x_1, x_2; q, L) is the quotient of ϕ(x_2; q, L) divided by x_2 − x_1, and ϕ_2(x_1, x_2, x_3; q, L) is the quotient of ϕ(x_3; q, L) divided by (x_3 − x_1)(x_3 − x_2). Then we obtained the following minimal polynomial R(z; q, L) in z of x_1 + x_2 + x_3 with respect to I_3 immediately. The maximal real root of R(z) coincides with the SORPRP W of ϕ(s; q, L). Since, for stability, we need to compute the sum of roots with negative real parts, we should apply our method for the SORPRP to R(−z; q, L); but as ϕ is even, it follows that R(−z; q, L) = R(z; q, L):

R(z; q, L) = −9765625L^10 z^14 + (−95703125L^10 + 195312500L^8)z^12 + ((68359375q − 306796875)L^10 + 995312500L^8 − 1562500000L^6)z^10 + ((382812500q − 352493750)L^10 + (1757812500q + 3258437500)L^8 − 3062500000L^6 + 6250000000L^4)z^8 + ((−78125000q^2 + 519031250q − 123443875)L^10 + (3675000000q + 4263245000)L^8 + (−7812500000q − 9013000000)L^6 + 2450000000L^4 − 12500000000L^2)z^6 + ((294122500q + 11647251)L^10 + (2187500000q^2 + 3474625000q + 1863605100)L^8 + (−17150000000q − 14797020000)L^6 + (1250000000q − 3552000000)L^4 − 4900000000L^2 + 10000000000)z^4 + ((−156250000q^3 − 168625000q^2 − 12620025q − 245025)L^10 + (1225000000q^2 + 97020000q + 1920996)L^8 + (−6250000000q^2 − 23304500000q − 7830818400)L^6 + (−19600000000q + 8635760000)L^4 + (−12500000000q + 25916000000)L^2 + 19600000000)z^2 + (625000000q^3 + 674500000q^2 + 50480100q + 980100)L^8 + (−4900000000q^2 − 5094040000q − 194040000)L^6 + (−5000000000q^2 + 4406000000q + 9406000000)L^4 + (19600000000q + 19600000000)L^2 + 10000000000q + 10000000000.

Remark 1. R(z; q, L) is factorized instantly as R(z; q, L) = R_1 R_2 R_3 R_4 R_5, where

R_1 = Lz + 2,
R_2 = Lz − 2,
R_3 = 625L^4 z^4 − 5000L^3 z^3 + (2450L^4 + 15000L^2)z^2 + (−9800L^3 − 20000L)z + (−2500rqk^2 − 99)L^4 + 9800L^2 + 10000,
R_4 = 625L^4 z^4 + 5000L^3 z^3 + (2450L^4 + 15000L^2)z^2 + (9800L^3 + 20000L)z + (−2500rqk^2 − 99)L^4 + 9800L^2 + 10000,
R_5 = −25z^4 − 49z^2 − 25rqk^2 − 25.
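A numeric spot-check of Remark 1 (a sketch of ours, with r = k = 1 as in the setting above): for fixed (L, q), the SORPRP W is the largest real root over the factors, which is exactly the factor-wise strategy of §4.2.

```python
# W as the largest real root over the factors R1, ..., R5 (r = k = 1).
import sympy as sp

z, L, q = sp.symbols('z L q')
R1 = L*z + 2
R2 = L*z - 2
R3 = (625*L**4*z**4 - 5000*L**3*z**3 + (2450*L**4 + 15000*L**2)*z**2
      + (-9800*L**3 - 20000*L)*z + (-2500*q - 99)*L**4 + 9800*L**2 + 10000)
R4 = R3.subs(z, -z)                    # R4 is R3 with z -> -z
R5 = -25*z**4 - 49*z**2 - 25*q - 25    # no real roots for q > 0

vals = {L: sp.Rational(1, 50), q: 10}  # sample point (L, q)
W = max(root for Ri in (R1, R2, R3, R4, R5)
        for root in sp.real_roots(sp.Poly(Ri.subs(vals), z)))
print(W)   # the SORPRP of phi(s; q, L) at L = 1/50, q = 10
```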
5.3 Relationship between L and q

Here we consider the case where the bounds for the SORPRP are given, that is, M_1 and M_2 are fixed. We then check the behavior of the plant parameter L versus a change of q. The possible regions of (L, q) meeting the specifications in the L–q parameter space are obtained by applying quantifier elimination to R(z; q, L), as explained in §4.

5.3.1 (a) W < M_1

Let M_1 = 500; then specification (a) is equivalent to the following first-order formula:

∀z > 500, R(z; q, L) ≠ 0.   (10)

After performing quantifier elimination on this, we obtain the following equivalent quantifier-free formula in (L, q), which describes the feasible region of (L, q) for (a):

q + 62500490001 ≥ 0 ∧ 250L − 1 ≥ 0 ∧ 2500L^4 q − 39063112499901L^4 + 625004900000L^3 − 3750009800L^2 + 10000000L − 10000 ≤ 0 ∧ 2500L^4 q − 39063112499901L^4 − 625004900000L^3 − 3750009800L^2 − 10000000L − 10000 ≤ 0.

This is illustrated as a shaded region in Fig. 1.

[Figure 1: Feasible region of L − q for (a)]

5.3.2 (b) W > M_2

Let M_2 = 300; then specification (b) is equivalent to the following first-order formula:

∃z > 300, R(z; q, L) = 0.   (11)
After performing quantifier elimination on this, we obtain the following equivalent quantifier-free formula in (L, q), which describes the feasible region of (L, q) for (b):

(q + 8100176401 < 0) ∨ (L > 0 ∧ 150L − 1 < 0) ∨ (2500L^4 q − 5062720499901L^4 − 135002940000L^3 − 1350009800L^2 − 6000000L − 10000 > 0) ∨ (2500L^4 q − 5062720499901L^4 + 135002940000L^3 − 1350009800L^2 + 6000000L − 10000 > 0).

This is illustrated as a shaded region in Fig. 2.

[Figure 2: Feasible region of L − q for (b)]

5.3.3 (c) M_2 < W < M_1

If M_2 = 300 and M_1 = 500 for requirement (c), the problem is recast as the following first-order formula:

(∀z > 500, R(z; q, L) ≠ 0) ∧ (∃z > 300, R(z; q, L) = 0).

A formula describing the feasible region of (L, q) for requirement (c) can be obtained by superposing the above two results for (a) and (b) in the parameter space L–q, as shown in Fig. 3.

[Figure 3: Feasible region of L − q for (a) ∧ (b)]

Control theoretical significance: Any system with parameter values of L and q within the feasible regions shown in Figs. 1, 2 and 3 meets the above requirements in terms of the magnitude of the SORPRP. We can draw the following knowledge from Fig. 3: the plant parameter L is restricted to an interval for a fixed value of q under the specification 300 < W < 500, and the maximum and minimum edges of the feasible interval of L are monotonically increasing. Thus, for instance, for values of L around 0.01, q must be taken larger than a certain threshold. We can obtain the exact threshold value easily, since quantifier elimination gives us the feasible region as a semi-algebraic set. These results greatly help control designers to choose an appropriate value of the ratio of weights q/r for their control system more systematically.

5.4 Relationship between M_1/M_2 and q

Next we investigate the case where the plant is given, that is, L is fixed; we set L = 0.02 (msec). We then estimate the behavior of the possible bounds M_1 and M_2 versus a change of q. The possible regions of M_1 and M_2 according to a change of q are obtained by performing quantifier elimination on R(z; q, 1/50), as explained in §4.

5.4.1 (a) W < M_1

Specification (a) is equivalent to the following first-order formula:

∀z > M_1, R(z; q, 1/50) ≠ 0.   (12)

After performing quantifier elimination on this, we obtain the following equivalent quantifier-free formula in (M_1, q), which describes the feasible region of (M_1, q) for (a):

M_1 − 100 ≥ 0 ∧ 25q + 25M_1^4 + 49M_1^2 + 25 ≥ 0 ∧ 2500q − 625M_1^4 + 250000M_1^3 − 37502450M_1^2 + 2500490000M_1 − 62524499901 ≤ 0.

This is illustrated in Fig. 4 as a shaded region.

5.4.2 (b) W > M_2

Specification (b) is equivalent to the following first-order formula:

∃z > M_2, R(z; q, 1/50) = 0.   (13)
After performing quantifier elimination on this, we obtain the following equivalent quantifier-free formula in (M_2, q), which describes the feasible region of (M_2, q) for (b):

M_2 − 100 < 0 ∨ 25M_2^4 + 49M_2^2 + 25 < 0 ∨ 2500q − 625M_2^4 + 250000M_2^3 − 37502450M_2^2 + 2500490000M_2 − 62524499901 > 0.

This is illustrated in Fig. 5 as a shaded region, which is the exact complementary set of the shaded region in Fig. 4.

[Figure 4: Feasible region of M_1 − q]
[Figure 5: Feasible region of M_2 − q]

Control theoretical significance: Fig. 4 illustrates the behavior of the lower bound of the possible magnitude of M_1 versus q such that the given plant P(s) (with L fixed) satisfies requirement (a). Fig. 5 shows the behavior of the upper bound of the possible magnitude of M_2 versus q such that the given plant P(s) satisfies requirement (b). We can see from Fig. 4 that the lower bound of M_1 is monotonically increasing; thus, if we need to satisfy (a) for a smaller M_1, we must choose a sufficiently smaller q. We can easily obtain the feasible range of q for a given value of M_1 by using the semi-algebraic set, obtained by quantifier elimination, which describes the feasible region in Fig. 4. This information significantly supports control designers in avoiding infeasible requirements with certainty, and in choosing an appropriate value of the weight q for their requirement level (i.e., M_1, M_2) systematically.

6. CONCLUSION

In this paper we have presented a method to compute or estimate the sum of roots with positive real parts (SORPRP) of a polynomial with parametric coefficients, based on symbolic and algebraic computation. Since the method does not compute explicit numerical values of the roots, we can treat polynomials with parametric coefficients. Combining the method with quantifier elimination, we obtained a novel systematic method for achieving optimal regulator design in control. To assess its effectiveness and practicality, we carried out experiments on a concrete example from optimal regulator control. The proposed method offers a promising direction for an otherwise ad hoc part of optimal regulator design (the choice of weights), which is one of the main concerns in control, and gives another successful application of computer algebra to control design problems.

7. REFERENCES
[1] H. Anai and S. Hara. Fixed-structure robust controller synthesis based on sign definite condition by a special quantifier elimination. In Proceedings of American Control Conference 2000, pp. 1312–1316, 2000.
[2] H. Anai and S. Hara. Linear programming approach to robust controller design by a quantifier elimination. In Proceedings of SICE Annual Conference 2002 (Osaka, Japan), pp. 863–869, 2002.
[3] H. Anai, H. Yanami, K. Sakabe, and S. Hara. Fixed-structure robust controller synthesis based on symbolic-numeric computation: design algorithms with a CACSD toolbox (invited paper). In Proceedings of CCA/ISIC/CACSD 2004 (Taipei, Taiwan), pp. 1540–1545, 2004.
[4] H. Anai and H. Yanami. SyNRAC: A Maple-package for solving real algebraic constraints. In Proceedings of the International Workshop on Computer Algebra Systems and their Applications (CASA) 2003 (Saint Petersburg, Russian Federation), P. M. A. Sloot et al. (Eds.): ICCS 2003, LNCS 2657, pp. 828–837. Springer, 2003.
[5] P. Aubry and A. Valibouze. Using Galois ideals for computing relative resolvents. Journal of Symbolic Computation, 30:635–651, 2000.
[6] B. Caviness and J. Johnson, editors. Quantifier Elimination and Cylindrical Algebraic Decomposition. Texts and Monographs in Symbolic Computation. Springer, Berlin, Heidelberg, New York, 1998.
[7] D. Cox, J. Little, and D. O'Shea. Ideals, Varieties and Algorithms. Undergraduate Texts in Mathematics. Springer-Verlag, New York, Berlin, Heidelberg, 1992.
[8] D. Cox, J. Little, and D. O'Shea. Using Algebraic Geometry. Graduate Texts in Mathematics 185. Springer-Verlag, New York, Berlin, Heidelberg, 1998.
[9] P. Dorato, W. Yang, and C. Abdallah. Robust multi-objective feedback design by quantifier elimination. Journal of Symbolic Computation, 24:153–159, 1997.
[10] L. González-Vega. A combinatorial algorithm solving some quantifier elimination problems. In B. F. Caviness and J. R. Johnson, editors, Quantifier Elimination and Cylindrical Algebraic Decomposition, Texts and Monographs in Symbolic Computation, pp. 365–375. Springer, Wien, New York, 1998.
[11] L. González-Vega, T. Recio, H. Lombardi, and M.-F. Roy. Sturm–Habicht sequences, determinants and real roots of univariate polynomials. In B. F. Caviness and J. R. Johnson, editors, Quantifier Elimination and Cylindrical Algebraic Decomposition, Texts and Monographs in Symbolic Computation, pp. 300–316. Springer, Wien, New York, 1998.
[12] H. Weber. Lehrbuch der Algebra, Volume I, third edition. AMS, 2002.
[13] S. Hara, T. Kimura, and R. Kondo. H∞ control system design by a parameter space approach. In Proceedings of MTNS-91, pp. 287–292, Kobe, Japan, 1991.
[14] M. Jirstrand. Nonlinear control system design by quantifier elimination. Journal of Symbolic Computation, 24(2):137–152, August 1997. Applications of quantifier elimination (Albuquerque, NM, 1995).
[15] T. Kimura and S. Hara. A robust control system design by a parameter space approach based on sign definition condition. In Proceedings of KACC-91, pp. 1533–1538, Seoul, Korea, 1991.
[16] B. Mishra. Algorithmic Algebra. Springer Verlag, 1993.
[17] M. Noro and K. Yokoyama. Implementation of prime decomposition of polynomial ideals over small finite fields. Journal of Symbolic Computation, 38:1227–1246, 2004.
[18] T. Takagi. Lecture of Algebra ('Daisugaku Kougi', in Japanese). Kyoritsu Pub., Japan, 1930.
[19] H. Yanami and H. Anai. Development of SyNRAC: formula description and new functions. In Proceedings of the International Workshop on Computer Algebra Systems and their Applications (CASA) 2004: ICCS 2004, LNCS 3039, pp. 286–294. Springer, 2004.
[20] K. Zhou, J. C. Doyle, and K. Glover. Robust and Optimal Control. Prentice Hall, 1995.
Algebraic General Solutions of Algebraic Ordinary Differential Equations

J.M. Aroca and J. Cano
R. Feng and X.S. Gao
Department of Algebra, Geometry and Topology Fac. Ciencias. Univ. de Valladolid Valladolid 47011, Spain
Key Laboratory of Mathematics Mechanization Institute of Systems Science, AMSS, Academia Sinica, Beijing 100080, China
(aroca,jcano)@agt.uva.es
[email protected] [email protected]

ABSTRACT
In this paper, we give a necessary and sufficient condition for an algebraic ODE to have an algebraic general solution. For a first order autonomous ODE, we give an optimal bound for the degree of its algebraic general solutions and a polynomial-time algorithm to compute an algebraic general solution if it exists. Here an algebraic ODE means an ODE given by a differential polynomial.

Categories and Subject Descriptors
I.1.2 [SYMBOLIC AND ALGEBRAIC MANIPULATION]: Algorithms—Algebraic algorithms

General Terms
Algorithms, Theory

Keywords
Algebraic general solution, algebraic differential equation, first order autonomous ODE, algebraic curve, Hermite–Padé approximants

1. INTRODUCTION
Finding the closed form solution of an ODE can be traced back to the work of Liouville. From the algorithmic point of view, the pioneering work is due to Risch. In [17, 18], Risch described a method to find the elementary integral ∫u dx, where u is an elementary function. In Trager's Ph.D. thesis [22], he gave a method to compute the integral of algebraic functions based on Risch's ideas. In [1], Bronstein generalized Trager's results to elementary functions. For higher order linear homogeneous ODEs, Kovacic presented an effective method to find the Liouvillian solutions of second order ODEs [14]. In [20], Singer established a general framework for finding the Liouvillian solutions of general linear homogeneous ODEs. Many other interesting results on finding the Liouvillian solutions of linear ODEs were reported in [2, 6, 23, 24]. Most of these results are limited to the linear case or to some special types of nonlinear equations.

Work on finding closed form solutions of nonlinear differential equations is not as systematic as that for linear equations. For the particular ODEs of the form y′ = R(x, y), where R(x, y) is a rational function, Darboux and Poincaré made important contributions [16]. More recently, Cerveau, Carnicer, Corral et al. also made important progress [4, 3, 7]. In particular, Carnicer gave the degree bound of algebraic solutions in the non-dicritical case. In [21], Singer studied the Liouvillian first integrals of differential equations. In [12], Hubert gave a method to compute a basis of the general solutions of first order ODEs and applied it to study the local behavior of the solutions. In [9, 10], Feng and Gao gave a necessary and sufficient condition for an algebraic ODE to have a rational type general solution and a polynomial-time algorithm to compute a rational general solution if it exists.

In this paper, the idea proposed in [9] is generalized to compute algebraic function solutions. In Section 2, we give a sufficient and necessary condition for an algebraic ODE to have an algebraic general solution, by constructing a class of differential equations whose solutions are all algebraic functions. In Section 3, by treating the variable and its derivative as independent variables, a first order autonomous ODE defines a plane algebraic curve; using the Riemann–Hurwitz formula, we give a degree bound for the algebraic function solutions of the equation. This degree bound is optimal in the sense that there is a class of first order autonomous ODEs whose algebraic function solutions reach the bound. In Section 4, based on the above results and the theory of Hermite–Padé approximants, we give a polynomial-time algorithm to find an algebraic general solution of a first order autonomous ODE.

A first order autonomous ODE F(y, dy/dx) = 0 can be reduced to the form G(y, dx/dy) = 0, where G is also a polynomial (see Section 3.2.1, (7)). Then, to find a solution of F = 0, we may first find x = φ(y) as a function in y by computing the integral of an algebraic function, and then compute the inverse y = φ^{−1}(x). Conversely, for an algebraic function φ(x) which satisfies G(x, φ(x)) = 0, let y = ∫φ(x)dx be the integral of φ(x); then we have G(x, dy/dx) = 0. In the same way, G(x, dy/dx) = 0 can be converted into a first order autonomous ODE F(x, dx/dy) = 0. Then, to find the integral
of φ(x), we may first find x = ϕ(y) by computing a solution of F(x, dx/dy) = 0 and then compute the inverse. Hence, our algorithm is equivalent to a polynomial-time algorithm for finding an algebraic integral of an algebraic function.
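A tiny illustration of this reduction on a toy example of ours: for F = (dy/dx)² − 4y, setting x₁ = dx/dy gives F̄ = 1 − 4y·x₁², so x is an integral of an algebraic function of y.

```python
# For Fbar = 1 - 4*y*x1**2: x1 = dx/dy = 1/(2*sqrt(y)), so
# x = int dy/(2*sqrt(y)) = sqrt(y) + c, i.e. y = (x - c)**2.
import sympy as sp

y = sp.symbols('y', positive=True)
print(sp.integrate(1/(2*sp.sqrt(y)), y))   # sqrt(y)
```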
2. ALGEBRAIC GENERAL SOLUTIONS OF ALGEBRAIC ODES

2.1 Definition of algebraic general solutions

In the following, let K = Q(x) be the differential field of rational functions in x with differential operator d/dx, and let y be an indeterminate over K. Let Q̄ be the algebraic closure of the rational number field Q. We denote by y_i the i-th derivative of y. We use K{y} to denote the ring of differential polynomials over the differential field K, which consists of the polynomials in the y_i with coefficients in K. All differential polynomials in this paper are in K{y}. Let Σ be a system of differential polynomials in K{y}. A zero of Σ is an element in a universal extension field of K which vanishes on every differential polynomial in Σ [19]. In this paper, we also assume that the universal extension field of K contains an infinite number of arbitrary constants. We will use C to denote the constant field of the universal extension field of K.

Let P ∈ K{y} \ K. We denote by ord(P) the highest derivative of y in P, called the order of P. Let o = ord(P) > 0. We may write P as

$$P = a_d y_o^d + a_{d-1} y_o^{d-1} + \cdots + a_0,$$

where the a_i are polynomials in y, y_1, ..., y_{o−1} and a_d ≠ 0. a_d is called the initial of P, and S = ∂P/∂y_o is called the separant of P. The k-th derivative of P is denoted by P^{(k)}. Let S be the separant of P, o = ord(P), and k > 0 an integer. Then we have

$$P^{(k)} = S\,y_{o+k} + R_k, \tag{1}$$

where R_k is of lower order than o + k.

Let P be a differential polynomial of order o. A differential polynomial Q is said to be reduced with respect to P if ord(Q) < o, or if ord(Q) = o and deg(Q, y_o) < deg(P, y_o). For two differential polynomials P and Q, let R = prem(P, Q) be the differential pseudo-remainder of P with respect to Q. We have the following differential remainder formula for R [13, 19]:

$$J P = \sum_i B_i Q^{(i)} + R,$$

where J is a product of certain powers of the initial and separant of Q, and B_i, R are differential polynomials. Moreover, R is reduced with respect to Q. For a differential polynomial P of order o, we say that P is irreducible if P is irreducible when treated as a polynomial in K[y, y_1, ..., y_o]. Let P ∈ K{y} \ K be an irreducible differential polynomial and

$$\Sigma_P = \{A \in K\{y\} \mid SA \equiv 0 \ \mathrm{mod}\ \{P\}\}, \tag{2}$$

where {P} is the perfect differential ideal generated by P [13, 19]. Ritt proved the following [19]:

Lemma 2.1. Σ_P is a prime differential ideal, and a differential polynomial Q belongs to Σ_P iff prem(Q, P) = 0.

Let Σ be a non-trivial prime ideal in K{y}. A zero η of Σ is called a generic zero of Σ if for any differential polynomial P, P(η) = 0 implies that P ∈ Σ. It is well known that an ideal Σ is prime iff it has a generic zero [19]. As a consequence of Lemma 2.1, we have

Lemma 2.2. Let F ∈ K{y} \ K be an irreducible differential polynomial with a generic solution η. Then for a differential polynomial P we have P(η) = 0 iff prem(P, F) = 0.

The following definition of the general solution is due to Ritt.

Definition 2.3. Let F ∈ K{y} \ K be an irreducible differential polynomial. A general solution of F = 0 is defined as a generic zero of Σ_F. An algebraic general solution of F = 0 is defined as a general solution ŷ which satisfies an equation

$$G(x, y) = \sum_{j=0}^{n}\sum_{i=0}^{m_j} a_{i,j}\,x^i y^j = 0, \tag{3}$$

where the a_{i,j} are in C and Σ_{j=0}^{n} Σ_{i=0}^{m_j} a_{i,j} x^i y^j is irreducible in C[x, y]. When n = 1, ŷ is called a rational general solution of F = 0.

For algebraic solutions of a differential equation F = 0, we have the following lemma.

Lemma 2.4. Let G(y) ∈ C(x)[y] be irreducible in C̄(x)[y], where C̄ is the algebraic closure of C. If one solution of G(y) = 0 is a solution of F = 0, then every solution of G(y) = 0 is a solution of F = 0.

Proof. Since G(y) is irreducible in C̄(x)[y], every solution of G(y) = 0 is a generic zero of G(y) = 0. By Lemma 2.2, prem(F, G) = 0. That is,

$$S^k I^l F = P G' + Q G, \tag{4}$$

where S = ∂G/∂y, I is the initial of G, and k, l ∈ Z. Since every solution of G(y) = 0 is a generic zero, S and I do not vanish at it. Hence every solution of G(y) = 0 is a solution of F = 0.

A general solution of F = 0 is usually defined, in a loose sense, as a family of solutions with o independent parameters, where o = ord(F). The definition given by Ritt is more precise: Theorem 6 in Section 12, Chapter 2 of [13] tells us that Ritt's definition of general solutions is equivalent to the definition in the classical literature.

2.2 A criterion for existence of algebraic general solutions

For non-negative integers h, α, k, let A_{(h,α;k)}(y) be the following (h+1) × (α+1) matrix:

$$A_{(h,\alpha;k)}(y)=\begin{pmatrix}\binom{k+1}{0}y_{k+1} & \binom{k+1}{1}y_{k} & \cdots & \binom{k+1}{\alpha}y_{k+1-\alpha}\\ \binom{k+2}{0}y_{k+2} & \binom{k+2}{1}y_{k+1} & \cdots & \binom{k+2}{\alpha}y_{k+2-\alpha}\\ \vdots & \vdots & & \vdots\\ \binom{k+h+1}{0}y_{k+h+1} & \binom{k+h+1}{1}y_{k+h} & \cdots & \binom{k+h+1}{\alpha}y_{k+h+1-\alpha}\end{pmatrix}.$$

Let α = (α_1, ···, α_n) ∈ Z^n_{≥0} and α_0 ∈ Z_{≥0}, where Z_{≥0} denotes the set of non-negative integers. Let A_{(α_0;α)}(y) be the (h+1) × (h+1) matrix

$$\bigl(A_{(h,\alpha_1;\alpha_0)}(y)\ \big|\ A_{(h,\alpha_2;\alpha_0)}(y^2)\ \big|\ \cdots\ \big|\ A_{(h,\alpha_n;\alpha_0)}(y^n)\bigr),$$

where n + α_1 + ··· + α_n = h + 1. Let D_{(α_0;α)} be the determinant of A_{(α_0;α)}(y). Note that if n = 1, D_{(α_0;α)} is just equal to the D_{n,m} in [9].
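A sketch of this matrix in sympy (the encoding of the derivatives y_j as an IndexedBase is ours; it is valid for α ≤ k + 1, so that no negative derivative index occurs):

```python
# Build A_{(h,alpha;k)}(y) with entry (i, j) = C(k+i+1, j) * y_{k+i+1-j}.
import sympy as sp

def A(h, alpha, k):
    y = sp.IndexedBase('y')   # y[j] stands for the j-th derivative of y
    return sp.Matrix(h + 1, alpha + 1,
                     lambda i, j: sp.binomial(k + i + 1, j) * y[k + i + 1 - j])

sp.pprint(A(1, 1, 0))   # 2 x 2 case: [[y[1], y[0]], [y[2], 2*y[1]]]
```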
Lemma 2.5. An element ȳ in the universal extension of K is a solution of D_{(α_0;α)} = 0 iff it satisfies equation (3) with m_j ≤ α_j for j = 0, ···, n.

Proof. Assume that ȳ satisfies equation (3) with m_j ≤ α_j, where j = 0, ···, n. Then we have

$$\sum_{j=1}^{n}\sum_{i=0}^{\alpha_j} a_{i,j}\,(x^i\bar y^j)^{(\alpha_0+1)} = 0,$$

where (x^i ȳ^j)^{(α_0+1)} means the (α_0+1)-th derivative of x^i ȳ^j with respect to x, and a_{i,j} = 0 if i > m_j. Since the a_{i,j} are constants, the (x^i ȳ^j)^{(α_0+1)} (i = 0, ···, α_j, j = 1, ···, n) are linearly dependent over C. That is, the Wronskian determinant W((x^i ȳ^j)^{(α_0+1)}) of the (x^i ȳ^j)^{(α_0+1)} vanishes, where j = 0, ···, n, i = 0, ···, α_j [19]. Then ȳ satisfies equation (3) with m_j ≤ α_j iff W((x^i ȳ^j)^{(α_0+1)}) = 0. By the computation process,

$$W\bigl((x^i\bar y^j)^{(\alpha_0+1)}\bigr) = D_{(\alpha_0;\alpha)}(\bar y)\cdot\bigl|\mathrm{diag}(B_0,\cdots,B_n)\bigr|,$$

where diag(B_0, ···, B_n) is the diagonal matrix of the B_j and

$$B_j = \begin{pmatrix}1 & x & \cdots & x^{\alpha_j}\\ 0 & 1 & \cdots & \alpha_j x^{\alpha_j-1}\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & \alpha_j!\end{pmatrix}$$

for j = 0, ···, n. Hence W((x^i ȳ^j)^{(α_0+1)}) = 0 if and only if D_{(α_0;α)}(ȳ) = 0.

By the above lemma, we can prove the following criterion theorem easily.

Theorem 2.6. Let F be an irreducible differential polynomial. Then F = 0 has an algebraic general solution ŷ iff there exist α = (α_1, ···, α_n) ∈ Z^n_{≥0} and α_0 ∈ Z_{≥0} such that prem(D_{(α_0;α)}, F) = 0.

Proof. (⇒) Let ŷ be an algebraic general solution of F = 0 which satisfies equation (3). Let α = (m_1, m_2, ···, m_n) and α_0 = m_0. Then from Lemmas 2.1, 2.2 and 2.5,

$$D_{(\alpha_0;\alpha)}(\hat y) = 0\ \Rightarrow\ D_{(\alpha_0;\alpha)} \in \Sigma_F\ \Rightarrow\ \mathrm{prem}(D_{(\alpha_0;\alpha)}, F) = 0.$$

(⇐) prem(D_{(α_0;α)}, F) = 0 implies that D_{(α_0;α)} ∈ Σ_F by Lemma 2.1. Then all the zeros of Σ_F must satisfy equation (3). In particular, the generic zero of Σ_F satisfies equation (3).

Given an algebraic differential equation F = 0, if we knew a degree bound in x and y for the equation (3) that may define an algebraic general solution of F = 0, then we could decide whether F = 0 has an algebraic general solution by computing prem(D_{(α_0;α)}, F) step by step. However, for ODEs of order greater than one or with variable coefficients, we do not know such a bound. Even for the case y′ = P(x, y)/Q(x, y), where P(x, y), Q(x, y) ∈ Q[x, y], there is no effective method to obtain the bound [3, 16]. In the following, for first order autonomous ODEs, we give a degree bound for the algebraic function solutions.

3. DEGREE BOUND FOR FIRST ORDER AUTONOMOUS ODES

In the following, we will always assume that F = 0 is a first order autonomous ODE in Q{y}, irreducible in Q̄{y}, and that G(x, y) ∈ Q̄[x, y] is irreducible. We say G(x, y) is nontrivial if deg(G, x) > 0 and deg(G, y) > 0. From now on, we always assume that G(x, y) is nontrivial. When we say that G(x, y) = 0 is an algebraic solution of F = 0, we mean that one of the algebraic functions ŷ(x) defined by G(x, ŷ(x)) = 0 is a solution of F = 0.

3.1 Structure for algebraic general solutions

It is a trivial fact that for an autonomous ODE the solution set is invariant under translation of the independent variable x. Moreover, we have the following fact.

Lemma 3.1. Let G(x, y) = 0 be an algebraic solution of F = 0. Then G(x + c, y) = 0 is an algebraic general solution of F = 0, where c is an arbitrary constant.

Proof. Assume that ȳ(x) is a formal power series solution of G(x, y) = 0. Then ȳ(x + c) is a solution of G(x + c, y) = 0. Because ȳ(x) is a solution of F = 0, ȳ(x + c) is still a solution of F = 0. Hence G(x + c, y) = 0 is an algebraic solution of F = 0. For any T ∈ K{y} satisfying T(ȳ(x + c)) = 0, let R = prem(T, F). Then R(ȳ(x + c)) = 0. Suppose that R ≠ 0. Since F is irreducible and deg(R, y_1) < deg(F, y_1), there are two differential polynomials P, Q ∈ K{y} such that PF + QR ∈ K[y] and PF + QR ≠ 0. Thus (PF + QR)(ȳ(x + c)) = 0. Because ȳ(x + c) ∉ Q̄ and c is an arbitrary constant which is transcendental over K, we have PF + QR = 0, a contradiction. Hence R = 0, which means that T ∈ Σ_F. So ȳ(x + c) is a generic zero of Σ_F, and hence G(x + c, y) = 0 is an algebraic general solution.

Lemma 3.1 reduces the problem of finding an algebraic general solution to the problem of finding a nontrivial algebraic solution. In what follows, we will show how to find a nontrivial algebraic solution in Q̄[x, y]. First of all, we determine the degree of an algebraic solution.

3.2 Degree bound of an algebraic solution

Assume that G(x, y) = 0 is an algebraic solution of the differential equation F = 0. In this subsection, we will give a bound for deg(G, x) and deg(G, y). First, we introduce some concepts concerning algebraic function fields in one variable.

Definition 3.2. Q̄(x, α) is called an algebraic function field in one variable if x is transcendental over Q̄ and α is algebraic over Q̄(x) [11].

An irreducible algebraic curve G(x, y) = 0, where G(x, y) ∈ Q̄[x, y], corresponds to an algebraic function field Q̄(α, β), unique up to isomorphism, where α, β satisfy G(α, β) = 0 and α or β is transcendental over Q̄. It is well known that two algebraic curves with isomorphic function fields have the same genus.

3.2.1 Parametrization of a curve

Let Q̄((t)) be the quotient field of the ring of formal power series Q̄[[t]]. Let G(x, y) be a nontrivial irreducible polynomial in Q̄[x, y]. If x(t), y(t) ∈ Q̄((t)) satisfy G(x(t), y(t)) = 0, we say that they are the coordinates of a parametrization, provided x(t) or y(t) does not belong to Q̄. There exist x_0, y_0 ∈ Q̄, nonzero integers q and p, and units u(t), v(t) in Q̄[[t]], such that

$$x(t) - x_0 = t^q u(t), \qquad y(t) - y_0 = t^p v(t). \tag{5}$$
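A concrete instance of these definitions (our toy curve, used again in Example 3.9 below): (x, y) = (t³, t²) is an irreducible parametrization of G = y³ − x² with q = 3, p = 2 and center (0, 0).

```python
# Check that (t**3, t**2) parametrizes G = y**3 - x**2.
import sympy as sp

t, x, y = sp.symbols('t x y')
G = y**3 - x**2
print(sp.expand(G.subs({x: t**3, y: t**2})))   # 0
```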
The center of the parametrization is the point P ∈ P¹ × P¹ defined according to the following cases: (a) if q > 0 and p > 0, then P = (x_0, y_0); (b) if q > 0 and p < 0, then P = (x_0, ∞); (c) if q < 0 and p > 0, then P = (∞, y_0); (d) if q < 0 and p < 0, then P = (∞, ∞). If p < 0 (resp. q < 0) we agree to take y_0 = 0 (resp. x_0 = 0).

If there exists an integer k ≥ 2 such that x(t), y(t) ∈ Q̄((t^k)), the parametrization is called reducible, otherwise irreducible. If t̄ ∈ Q̄[[t]] has order greater than zero with respect to t, then x(t̄), y(t̄) is another parametrization with the same center; if the order of t̄ is equal to one, the two parametrizations are said to be equivalent. An equivalence class of irreducible parametrizations is called a place B of the curve G = 0, with center the common center of its parametrizations. Two equivalent parametrizations have the same integers q and p as defined above, so given a place B we define nonzero integers ν_x(B) and ν_y(B) as the integers q and p of any of its irreducible parametrizations.

Let g be the genus of G(x, y) = 0 and n = deg(G, y). By the Riemann–Hurwitz formula [15] we have

$$g = 1 - n + \frac{1}{2}\sum_{B}\bigl(|\nu_x(B)| - 1\bigr),$$

where B runs over all places of the curve G = 0. Each place B with center (α, β) corresponds to exactly q_B fractional power series y(x^{1/q_B}) which are solutions of G(x, y(x)) = 0. Let α ∈ Q̄ ∪ {∞}. Then, by the Puiseux theorem, we have

$$\sum_{B}|\nu_x(B)| = \deg(G, y), \tag{6}$$

where the sum runs over all places B of the curve G = 0 with center (α, β).

Since F is first order and autonomous, we can regard F = 0 as an algebraic curve, and we will use F(y, y_1) to denote F.

Lemma 3.3. Let G(x, y) be a nontrivial irreducible polynomial in Q̄[x, y], and let (x(t), y(t)) be the coordinates of a parametrization of G = 0. Then, for any nonzero constant c ∈ Q̄, (x(t) + c, y(t)) are not the coordinates of a parametrization of G = 0.

Proof. By Gauss's lemma, G(x, y) is irreducible in Q̄(y)[x]. Since y(t) ∉ Q̄, Q̄(y(t)) is isomorphic to Q̄(y), which implies that G(x, y(t)) ∈ Q̄(y(t))[x] is irreducible too. Now assume that x(t) is a root of G(x + c, y(t)) = 0. Then G(x, y(t)) divides G(x + c, y(t)). It is clear that deg(G(x + c, y(t)), x) = deg(G(x, y(t)), x), and G(x, y(t)) and G(x + c, y(t)) have the same leading coefficients. Hence G(x, y(t)) = G(x + c, y(t)). Since c ≠ 0, we have that deg(G(x, y), x) = 0, in contradiction with the nontriviality of G(x, y).

Now we are ready to give the degree bound for the algebraic solutions of F = 0. First, we can determine the degree deg(G, x) exactly from the degree of F.

Theorem 3.4. Let G(x, y) ∈ Q̄[x, y] be irreducible and let G(x, y) = 0 be an algebraic solution of F = 0. Then we have

deg(G, x) = deg(F, y_1).

Proof. Assume that deg(G, x) = s and deg(F, y_1) = d. Let us write

$$F = F_0(y) + F_1(y)y_1 + \cdots + F_d(y)y_1^d,$$
$$G(x, y) = A_0(y) + A_1(y)x + \cdots + A_s(y)x^s,$$

where A_i(y), F_j(y) ∈ Q̄[y]. We use Res(A, B, z) to denote the Sylvester resultant of A and B with respect to z, and Z stands for "the zero set of". Let

S = Z(A_s(y)) ∪ Z(F_d(y)) ∪ Z(Res(G, ∂G/∂x, x)) ∪ Z(Res(G, ∂G/∂y, x)) ∪ Z(Res(F, ∂F/∂y_1, y_1)).

Then S is a finite set. Hence we can choose a c ∈ Q̄ such that c ∉ S. Then we have the following results:
(a) the set {z ∈ Q̄ | F(c, z) = 0} = {z_1, z_2, ···, z_d} has exactly d elements;
(b) the set {x ∈ Q̄ | G(x, c) = 0} = {x_1, x_2, ···, x_s} has exactly s elements;
(c) since ∂G/∂y(x_i, c) ≠ 0, there exists a unique formal power series

φ_i(x) = c + g_{i,1}(x − x_i) + g_{i,2}(x − x_i)² + ···

such that G(x, φ_i(x)) = 0, for each i = 1, ···, s.

From Lemma 2.4, φ_i(x) is a solution of F = 0. Then we have F(φ_i(x), φ_i′(x)) = 0, which implies that F(c, g_{i,1}) = 0. Suppose that s > d. Then at least two of the g_{i,1} are equal to each other. Without loss of generality, assume that g_{1,1} = g_{2,1} = c_1. Since ∂F/∂y_1(c, c_1) ≠ 0, there exists only one solution φ(x) of F(y, y_1) = 0 such that φ(0) = c and φ′(0) = c_1. Hence φ_1(x) = φ_2(x + x_2 − x_1) = φ(x − x_1). So (x, φ_1(x)) and (x + x_2 − x_1, φ_1(x)) are the coordinates of two parametrizations of G = 0. This is a contradiction by the above lemma. Hence s ≤ d.

Let G′ = y_1·∂G/∂y + ∂G/∂x and H(y, y_1) = Res(G, G′, x). Then

H(y, y_1) = y_1^s · Res(G, ∂G/∂y, x) + terms of lower order in y_1.

Since Res(G, ∂G/∂y, x) ≠ 0, we have deg(H, y_1) = s. Assume that ȳ(x) is a solution of G(x, y) = 0. Then we have H(ȳ(x), ȳ′(x)) = F(ȳ(x), ȳ′(x)) = 0. Because F is irreducible, we have that deg(H, y_1) ≥ deg(F, y_1). In other words, s ≥ d.

Lemma 3.5. Assume that G(x, y) = 0 is an algebraic solution of F = 0. Then the genus of G(x, y) = 0 equals that of F(y, y_1) = 0.

Proof. Let α satisfy G(x, α) = 0. It is clear that α is transcendental over Q̄. Then Q̄(x, α) and Q̄(α, α′) are the algebraic function fields of G(x, y) = 0 and F(y, y_1) = 0 respectively. We only need to prove Q̄(x, α) = Q̄(α, α′). From Theorem 3.4, we have [Q̄(x, α) : Q̄(α)] = [Q̄(α, α′) : Q̄(α)]. Since G(x, α) = 0, α′ = −(∂G/∂x)(x, α)/(∂G/∂y)(x, α), which implies that α′ ∈ Q̄(x, α). Hence Q̄(x, α) = Q̄(α, α′).

For convenience, we consider a new differential equation

$$\bar F(x_1, y) = x_1^{\deg(F, y_1)}\,F\!\left(y, \frac{1}{x_1}\right) = 0, \tag{7}$$

where x_1 = dx/dy = 1/y_1. F̄ is irreducible in Q̄[x_1, y], and deg(F̄, y) = deg(F, y), deg(F̄, x_1) = deg(F, y_1). Then we have the following lemma.
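A sketch of the construction (7) on a toy ODE of ours (the n = 3, m = 2 instance of Example 3.9 below):

```python
# Form Fbar(x1, y) = x1**deg(F, y1) * F(y, 1/x1) for F = y*y1**2 - 4/9.
import sympy as sp

y, y1, x1 = sp.symbols('y y1 x1')
F = y*y1**2 - sp.Rational(4, 9)
Fbar = sp.expand(x1**sp.degree(F, y1) * F.subs(y1, 1/x1))
print(Fbar)   # y - 4*x1**2/9
```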
Lemma 3.6. Let F¯ be defined as in (7) and G(x, y) = 0 an algebraic solution of F = 0. Then G(x, y) = 0 also defines an algebraic function (in y) solution of F¯ (x1 , y) = 0.
32
Proof. From the proof of Theorem 3.4, we know that Res(G, G′, x) = A(y)F(y, y1), where G′ = y1 ∂G/∂y + ∂G/∂x. In other words, there exist two polynomials P, Q ∈ Q̄[x, y, y1] such that P G + Q G′ = A(y)F(y, y1). Replacing y1 by 1/x1 and multiplying by a suitable power of x1, we obtain

    P̄ G + Q̄ (∂G/∂y + x1 ∂G/∂x) = x1^k A(y) F̄(x1, y),   (8)

where P̄, Q̄ ∈ Q̄[x, y, x1] and k ∈ Z≥0. Suppose that β satisfies G(β, y) = 0. Replacing x by β and x1 by β′ in (8), where β′ = dβ/dy, we obtain F̄(β′, y) = 0. Hence G(x, y) = 0 is an algebraic solution of F̄ = 0.

Lemma 3.7. Let (x(t), y(t)) be an irreducible parametrization of G = 0. Then (x′(t)/y′(t), y(t)) is an irreducible parametrization of F̄(x1, y) = 0.

Proof. Let us denote x1(t) = x′(t)/y′(t), where ′ means the derivative with respect to t. Since x1(t) = (dx/dy)(t), we have F̄(x1(t), y(t)) = 0. Assume that (x1(t), y(t)) is a reducible parametrization. Let k ≥ 2 be such that x1(t), y(t) ∈ Q̄((t^k)). Then x1(t)y′(t) = Σ_{j≥j0} cj t^{kj−1}. Since x′(t) = x1(t)y′(t), we have that c0 = 0 and x(t) = c + Σ_{j≥j0} (cj/(kj)) t^{kj} for some constant c. Hence we get a contradiction, because then x(t), y(t) ∈ Q̄((t^k)).
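The passage from F to F̄ in (7), and the degree relations stated there, can be checked mechanically. The sketch below (ours, again on the equation of Example 3.9 with n = 3, m = 2) builds F̄ = x1^d F(y, 1/x1) with d = deg(F, y1) and confirms that deg(F̄, y) = deg(F, y) and deg(F̄, x1) = deg(F, y1).

    import sympy as sp

    y, y1, x1 = sp.symbols('y y1 x1')
    F = y*y1**2 - sp.Rational(4, 9)              # F from Example 3.9, n = 3, m = 2
    d = sp.degree(F, y1)
    Fbar = sp.expand(x1**d * F.subs(y1, 1/x1))   # equation (7)
    print(Fbar)                                  # y - 4*x1**2/9
    print(sp.degree(Fbar, y) == sp.degree(F, y),
          sp.degree(Fbar, x1) == sp.degree(F, y1))   # True True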
Theorem 3.8. Assume that G(x, y) = 0 is a nontrivial algebraic solution of F = 0. Then we have that

    deg(G, y) ≤ deg(F, y) + deg(F, y1).

Proof. Let F̄ be as in (7). Let gG and gF̄ be the genus of G(x, y) = 0 and of F̄(x1, y) = 0 respectively. Let B be a place of G = 0 with center P = (α, β), and let (x(t), y(t)) be an irreducible parametrization of B. Let us denote by B̃ the place of the algebraic curve F̄(x1, y) = 0 given by the irreducible parametrization (x1(t), y(t)), where x1(t) = x′(t)/y′(t), and let P̃ = (α̃, β̃) be the center of B̃. It is obvious that νy(B) = νy(B̃) and β = β̃. If νx(B) ≠ νy(B), then νx1(B̃) = νx(B) − νy(B). Hence, if νx(B) > νy(B), then α̃ = 0; if νx(B) < νy(B), then α̃ = ∞; and if νx(B) = νy(B), then α̃ ∈ Q̄.

The map that sends each place B of G = 0 to the place B̃ of F̄ = 0 is injective. Indeed, let B and B′ be two places of G = 0 such that B̃ = B̃′. Let (x(t), y(t)) and (z(t), v(t)) be the parametrizations of B and B′ respectively. We may assume that y(t) = y0 + t^p and v(t) = v0 + t^{p′} (see [26], Chap. 4, Theorem 2.2). Since B̃ = B̃′, we have that p = p′, y(t) = v(t) and x′(t) = z′(t). Hence z(t) = x(t) + c for some constant c. By Lemma 3.3 we have that c = 0, so B = B′.

By the Riemann-Hurwitz formula we have that

    2(gG + deg(G, y) − 1) = Σ_B (|νx(B)| − 1),   (9)

where B runs over all places of G = 0. We will split the right hand side of the above equation into four cases: we say that B ∈ (1) if νx(B) > 0 and νy(B) > 0; that B ∈ (2) if νx(B) > 0 and νy(B) < 0; that B ∈ (3) if νx(B) < 0 and νy(B) > 0; and that B ∈ (4) if νx(B) < 0 and νy(B) < 0. Moreover, we say that B ∈ (1)′ if B ∈ (1) and νx(B) > νy(B), and that B ∈ (4)′ if B ∈ (4) and νx(B) < νy(B). In the following inequalities Bx, By, B̃x1 and B̃y will stand for νx(B), νy(B), νx1(B̃) and νy(B̃) respectively. For k = 1 and k = 4, we have that

    Σ_{B∈(k)} (|Bx| − 1) ≤ Σ_{B∈(k)′} |B̃x1| + Σ_{B∈(k)} (|By| − 1).   (10)

For k = 2 and k = 3, we have that

    Σ_{B∈(k)} (|Bx| − 1) ≤ Σ_{B∈(k)} |B̃x1|.   (11)

If B ∈ (1)′ ∪ (2), then the center of B̃ is over x1 = 0. If B ∈ (3) ∪ (4)′, then the center of B̃ is over x1 = ∞. Hence, using the formula (6), we have that

    Σ_{B∈(1)′,(2),(3),(4)′} |B̃x1| ≤ 2 deg(F̄, y).   (12)

By the Riemann-Hurwitz formula, we have that

    Σ_{B∈(1),(4)} (|B̃y| − 1) ≤ 2(gF̄ + deg(F̄, x1) − 1).   (13)

We remark that in the inequalities (12) and (13) we have used the fact that the map B ↦ B̃ between the places of G = 0 and the places of F̄ = 0 is injective. By the inequalities (9)-(13), we have that

    2(gG + deg(G, y) − 1) ≤ 2(gF̄ + deg(F̄, x1) + deg(F̄, y) − 1).

Using this inequality, together with the facts that deg(F̄, x1) = deg(F, y1), deg(F̄, y) = deg(F, y) and gG = gF̄, gives the required inequality.

The following example shows that the degree bound given in Theorem 3.8 is optimal.

Example 3.9. Assume that n > m > 0 and (n, m) = 1. Let G(x, y) = y^n − x^m, which is irreducible. Then G(x, y) = 0 is an algebraic solution of F = y^{n−m} y1^m − (m/n)^m = 0. In this case, we have that deg(G, y) = deg(F, y) + deg(F, y1).
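Example 3.9 can be verified directly on any branch of y^n = x^m. The following check (ours) takes n = 5, m = 2 and confirms that y = x^{m/n} satisfies F = 0:

    import sympy as sp

    x = sp.symbols('x', positive=True)
    n, m = 5, 2                                   # n > m > 0, gcd(n, m) = 1
    y = x**sp.Rational(m, n)                      # a branch of y**n - x**m = 0
    F = y**(n - m)*sp.diff(y, x)**m - sp.Rational(m, n)**m
    print(sp.simplify(F))                         # 0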
4. A POLYNOMIAL-TIME ALGORITHM

The simple degree bounds given in the preceding section allow us to give a polynomial-time algorithm to compute algebraic function solutions of a first order autonomous ODE.

4.1 Algebraic approximant

The algebraic approximant is a special type of Hermite-Padé approximant. It uses an algebraic function to approximate a given function.

Definition 4.1. Let G(x, y) be an irreducible polynomial in Q̄[x, y]. An algebraic function ȳ(x) satisfying G(x, ȳ(x)) = 0 is called an algebraic approximant to a function f(x) if

    G(x, f(x)) = O(x^{(m+1)(n+1)−1}),

where m = deg(G, x) and n = deg(G, y).

More generally, we will find G(x, y) such that

    G(x, f(x)) = O(x^{N+1}),   (14)

where N is a positive integer. We can get the coefficients of G(x, y) with respect to x and y by solving linear equations. Let G(x, y) = Σ_{j=0}^{n} Σ_{i=0}^{m} b_{i,j} x^i y^j and f(x) = a0 + a1 x + ··· + aN x^N + O(x^{N+1}).
Let

    M0 = [ I_{(m+1)×(m+1)} ]
         [ 0_{(N−m)×(m+1)} ]   (15)

where I_{(m+1)×(m+1)} is the unit matrix of order m + 1 and 0_{(N−m)×(m+1)} is the (N − m) × (m + 1) zero matrix. Let Mi = TM^i · M0 for i = 1, ..., n, where

    TM = [ a0    0     0     ···  0  ]
         [ a1    a0    0     ···  0  ]
         [ a2    a1    a0    ···  0  ]
         [ ...   ...   ...   ···  ...]
         [ aN    aN−1  aN−2  ···  a0 ]   (16)

and the ai are the coefficients of f(x). Then, by the computation process, we can write (14) in the matrix form

    (M0 | M1 | ··· | Mn) (B0; B1; ...; Bn) = 0,   where Bi = (b_{0,i}, b_{1,i}, ..., b_{m,i})^T for i = 0, ..., n.   (17)

Let ȳ(x) = a0 + a1 x + ··· be a formal power series. When we say that ϕ(x) is the first N + 1 terms of ȳ(x), we mean that ϕ(x) = a0 + a1 x + ··· + aN x^N. The following lemma will be used in our algorithm.
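The linear algebra in (15)-(17) is straightforward to set up. The sketch below (our own illustration in Python/sympy, not the authors' code) recovers G(x, y) = y^2 − x − 1 from the first 2mn + 1 terms of f(x) = √(1 + x), with m = 1 and n = 2, in agreement with the lemma that follows:

    import sympy as sp

    x = sp.symbols('x')
    m, n = 1, 2                      # degrees of the sought G in x and y
    N = 2*m*n                        # match the first 2mn + 1 coefficients
    f = sp.sqrt(1 + x)
    a = [f.series(x, 0, N + 1).removeO().coeff(x, k) for k in range(N + 1)]
    TM = sp.Matrix(N + 1, N + 1, lambda i, j: a[i - j] if i >= j else 0)  # (16)
    M0 = sp.Matrix.vstack(sp.eye(m + 1), sp.zeros(N - m, m + 1))          # (15)
    Msys = sp.Matrix.hstack(*[TM**i * M0 for i in range(n + 1)])          # (17)
    for v in Msys.nullspace():       # the b_{i,j}, up to a scalar
        print(v.T)                   # proportional to (-1, -1, 0, 0, 1, 0)

The printed null vector gives Q0 = −1 − x, Q1 = 0 and Q2 = 1, that is, G = y^2 − x − 1 up to the scalar λ of Lemma 4.2.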
Lemma 4.2. Let ȳ(x) be a formal power series such that G(x, ȳ(x)) = 0. Assume that m = deg(G, x) and n = deg(G, y), and let ϕ(x) be the first 2mn + 1 terms of ȳ(x). If Q0(x), Q1(x), ..., Qn(x) ∈ Q̄[x] are such that

    Q0(x) + Q1(x)ϕ(x) + ··· + Qn(x)ϕ(x)^n = O(x^{2mn+1}),

where deg(Qi(x), x) ≤ m and not all of them are zero, then

    G(x, y) = λ(Q0(x) + Q1(x)y + ··· + Qn(x)y^n),   (18)

where λ ∈ Q̄ is nonzero.

Proof. Let Q(x, y) = Q0(x) + Q1(x)y + ··· + Qn(x)y^n. There exist S, T ∈ Q̄[x, y] such that

    S G(x, y) + T Q(x, y) = Res(G, Q, y),   (19)

where deg(S, y) < n and deg(T, y) < n. If Q(x, ȳ(x)) = 0, then (18) is true. Assume that Q(x, ȳ(x)) ≠ 0 and Res(G, Q, y) ≠ 0. Then it is not difficult to see that deg(Res(G, Q, y), x) ≤ 2mn. However, substituting ȳ(x) into the left side of (19) turns the left side into a series of order greater than 2mn, a contradiction. Hence Res(G, Q, y) = 0, which implies that (18) is true, because G(x, y) is irreducible.
4.2 An algorithm to compute algebraic solutions

First, we give an algorithm to compute the first N + 1 terms of a formal power series solution of F = 0 for a given positive integer N. Regarding F = 0 as an algebraic curve, we find a point (z0, z1) on it such that the separant S(y, y1) of F(y, y1) does not vanish at (z0, z1). Then we can compute yi = zi step by step from (1), and ȳ(x) = z0 + z1 x + (z2/2!) x^2 + ··· is a formal power series solution of F = 0. Moreover, if z1 ≠ 0, then ȳ(x) ∉ Q̄.

Algorithm 4.3. Input: F = 0 and a positive integer N. Output: the first N + 1 terms of a formal power series solution of F = 0 which is not in Q̄.

1. Find a point (z0, z1) ∈ Q̄^2 on F(y, y1) = 0 such that S(z0, z1) ≠ 0 and z1 ≠ 0.
2. i := 2 and ϕ(x) := z0 + z1 x.
3. While i ≤ N do
   (a) Replace y by ϕ(x) and y1 by ϕ′(x) in F(y, y1).
   (b) c := the coefficient of x^{i−1} in F(ϕ(x), ϕ′(x)).
   (c) zi := −(i−1)! c / S(z0, z1) and ϕ(x) := ϕ(x) + zi x^i / i!.
   (d) i := i + 1.
4. Return(ϕ(x)).
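A direct Python/sympy transcription of Algorithm 4.3 for polynomial F is given below; it is our own sketch, tried on F = y1^2 − 4y, whose nontrivial solutions are the shifted squares (x + c)^2. The point (z0, z1) = (1, 2) lies on the curve and the separant S = 2y1 does not vanish there.

    import sympy as sp

    x, y, y1 = sp.symbols('x y y1')

    def series_solution(F, z0, z1, N):
        # First N+1 terms of a power series solution of F(y, y') = 0 with
        # y(0) = z0, y'(0) = z1; requires S(z0, z1) != 0 and z1 != 0.
        S0 = sp.diff(F, y1).subs({y: z0, y1: z1})   # separant at (z0, z1)
        phi = z0 + z1*x
        for i in range(2, N + 1):
            E = sp.expand(F.subs({y: phi, y1: sp.diff(phi, x)}))  # step (a)
            c = E.coeff(x, i - 1)                                 # step (b)
            zi = -sp.factorial(i - 1)*c/S0                        # step (c)
            phi += zi*x**i/sp.factorial(i)
        return sp.expand(phi)

    print(series_solution(y1**2 - 4*y, 1, 2, 6))    # x**2 + 2*x + 1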
The correctness of the algorithm comes from the following facts. Let ȳ(x) be a formal power series solution of F = 0. Then by (1),

    (F(ȳ(x), ȳ1(x)))^{(i−1)} = S ȳi(x) + R(ȳ(x), ..., ȳ_{i−1}(x)) = 0.

Since ȳk(x)|_{x=0} = zk for k = 1, 2, ..., we have that

    S(z0, z1) zi + R(z0, ..., z_{i−1}) = 0.

Now assume that ϕ(x) = z0 + z1 x + ··· + (z_{i−1}/(i−1)!) x^{i−1}. Then

    (F(ϕ(x), ϕ′(x)))^{(i−1)} = R(ϕ(x), ..., ϕ^{(i−1)}(x)).

Since ϕ^{(k)}(x)|_{x=0} = zk for k = 1, ..., i−1 and ϕ^{(i)}(x) = 0, we have that

    R(z0, ..., z_{i−1}) = (F(ϕ(x), ϕ′(x)))^{(i−1)}|_{x=0},

which equals (i−1)! times the coefficient of x^{i−1} in F(ϕ(x), ϕ′(x)).

Let T = tdeg(F) be the total degree of F. Theorem 9 of [9] shows that the number of points on F(y, y1) = 0 which make S(y, y1) or y1 vanish is at most T^2. The complexity of Algorithm 4.3 is polynomial in terms of the number of multiplications in Q needed in the algorithm. In Step 1, we can find a point (z0, z1) as follows. We may replace y by the integers z0 = 0, ±1, ..., ±⌈T^2/2⌉, and let L(y1) be a monic irreducible factor of F(z0, y1) ∈ Q[y1]; we then take z1 to be a root of L(y1) = 0. Since the number of points which make S(y, y1) or y1 vanish is at most T^2, there always exists an integer z0 ∈ {0, ±1, ..., ±⌈T^2/2⌉} such that the point (z0, z1) satisfies the assumption in Step 1. Hence the complexity of Step 1 is polynomial. All subsequent computations are carried out over the number field Q(z1). Let D = deg(L(y1)) ≤ T. Any element of Q(z1) can be represented as a polynomial in z1 of degree ≤ T − 1. Let β, γ ∈ Q(z1). Then there exist P(z), Q(z) ∈ Q[z] such that β = P(z1) and γ = Q(z1), with deg(P) ≤ T − 1 and deg(Q) ≤ T − 1. To compute φ = β γ, we need to compute R = prem(PQ, L). Therefore, a multiplication of two elements in Q(z1) needs O(T^2) multiplications of rational numbers. Since the inverse of β can also be computed in O(T^2), the division of two elements of Q(z1) needs O(T^2) multiplications of rational numbers too. In Step 3, the computation of (a0 + a1 x + ··· + aN x^N)^T needs at most O(N^2 T^4) multiplications in Q(z1), and hence at most O(T^2 · N^2 T^4) = O(N^2 T^6) multiplications in Q.

Now we can give the algorithm to compute an algebraic solution of F = 0.

Algorithm 4.4. Input: F = 0. Output: an algebraic solution of F = 0 if it exists.

1. d := deg(F, y1) and e := deg(F, y).
2. k := 1. While k ≤ d + e do
   (a) Compute the first 2dk + 1 terms ϕ(x) of a formal power series solution of F = 0 by Algorithm 4.3.
   (b) ai := the coefficient of x^i in ϕ(x) for i = 0, ..., 2dk.
   (c) In (15) and (16), let m = d, n = k and N = 2dk, and construct the linear equations (17).
   (d) If (17) has no nonzero solution, or the dimension of the solution space of (17) is greater than one, then go to Step (i).
   (e) Otherwise, choose one of the nonzero solutions b̄_{i,j}, where i = 0, ..., d and j = 0, ..., k.
   (f) G(x, y) := Σ_{j=0}^{k} Σ_{i=0}^{d} b̄_{i,j} x^i y^j, S := ∂G/∂y, and I := the initial of G(x, y).
   (g) If GCD(G, S) ≠ 1 or GCD(G, I) ≠ 1, then go to Step (i); otherwise, go to the next step.
   (h) Let R = prem(F, G). If R = 0, then return(G(x, y) = 0).
   (i) k := k + 1.
3. If the algorithm does not return G(x, y) = 0 in Step 2, then F = 0 has no algebraic solution and the algorithm terminates.
From Theorem 2.6 and Lemma 2.5, we know that if F = 0 has a nontrivial algebraic solution, then every formal power series solution is algebraic. From Lemma 4.2, we only need to compute the first 2dk + 1 terms of a nontrivial formal power series solution to construct the algebraic approximant. From Theorems 3.4 and 3.8, if F = 0 has an algebraic solution G(x, y) = 0, then there is a k with 1 ≤ k ≤ d + e such that deg(G, x) = d and deg(G, y) = k. From Lemma 4.2 again, the dimension of the solution space of (17) equals one. If G(x, y) = 0 is an algebraic solution, then G(x, y) is irreducible, and it is obvious that GCD(G, S) = 1 and GCD(G, I) = 1. Now assume that GCD(G, S) = 1, GCD(G, I) = 1 and prem(F, G) = 0. We will prove that G(x, y) is irreducible. Suppose that k = h. Then G(x, y) cannot have a factor u(x) ∈ Q̄[x], because GCD(G, I) = 1. If G(x, y) = g(y) ∈ Q̄[y], then by (14), g(ϕ(0)) = 0 and g′(ϕ(0))ϕ′(0) = 0. Since ϕ′(0) = z1 ≠ 0 and g′(y) = S, we would have GCD(G, S) ≠ 1. Hence G(x, y) ∉ Q̄[y]. If G(x, y) is reducible, then G(x, y) has an irreducible factor G̃(x, y) which is nontrivial with deg(G̃, y) < h. Since GCD(G, S) = 1, GCD(G, I) = 1 and prem(F, G) = 0, by (4), G̃(x, y) = 0 is an algebraic solution of F = 0. Hence we would have found G̃(x, y) when k was less than h, and the algorithm would have terminated before k = h, a contradiction with the assumption k = h. So G(x, y) is irreducible and G(x, y) = 0 is an algebraic solution.

The complexity of Algorithm 4.4 is polynomial in T, where T = tdeg(F). In Step 2(a), the complexity is polynomial. In Step 2(c), we only need to compute TM^{2T} · M0, which needs O(T^8) operations, because TM is an l × l matrix with l ≤ 2T^2 + 1 and M0 is a p × q matrix with p ≤ 2T^2 + 1 and q ≤ T + 1. (Note that in the worst case we have to do the operations over Q(z1), so the complexity is multiplied by a further O(T^2).) In Step 2(d), we only need to solve at most 4T^2 + 1 linear equations in at most 2T^2 + 3T + 1 variables; hence its complexity is polynomial. In Step 2(g), by ([25], p. 152), GCD(G, S) and GCD(G, I) can be computed in O(T^6). In Step 2(h), to decide whether prem(F, G) = 0, we first compute R1 = prem(F, G′). Since R1 = (∂G/∂y)^k F(y, −(∂G/∂x)/(∂G/∂y)) with k ≤ T, we can compute it in O(T^12), and we have deg(R1, x) ≤ 2T^2 and deg(R1, y) ≤ 4T^2 + T. Then we compute GCD(R1, G), which can be done in O(T^10). If GCD(R1, G) = G, then prem(F, G) = 0; otherwise prem(F, G) ≠ 0. The number of iterations of Step 2 is at most 2T. Hence the complexity of Step 2 is also polynomial.

Example 4.5. Consider

    F = (y^6 + 2y + 1)y1^3 − (12y^5 + 9y^4 − 1)y1^2 + 27y^8 + 54y^7 + 27y^6 + 4y^3.

1. Let d = 3 and e = 8.
2. For the case k = 1, we get a G(x, y) = 0 which is not a solution of F = 0. Here we only give the process in the case k = 2.
3. The first 13 terms of the formal power series solution of F = 0 are

    ϕ(x) = 1 − 2x + (5/2)x^2 − (9/4)x^3 + (1/2)x^4 + (5/4)x^5 − (41/32)x^6 − (65/64)x^7 + (363/128)x^8 − (111/256)x^9 − (2545/512)x^10 + (5141/1024)x^11 + (5891/1024)x^12.

4. Let m = 3, n = 2 and N = 12. We construct the linear equations (17). Solving them, we get a nonzero solution

    (−1, 1, 0, 0, 0, 3, −3, 1, 1, 0, 0, 0).

5. Let G(x, y) = −1 + x + 3xy − 3x^2 y + x^3 y + y^2, and S = 2y + 3x − 3x^2 + x^3, I = 1.
6. We have GCD(G, S) = 1 and GCD(G, I) = 1.
7. prem(F, G) = 0. Hence G(x, y) = −1 + x + 3xy − 3x^2 y + x^3 y + y^2 = 0 is an algebraic solution of F = 0.
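Step 7 of Example 4.5 can be re-checked by eliminating y1: on G = 0 one has y1 = −Gx/Gy, so Gy^3 · F(y, −Gx/Gy) must reduce to zero modulo G (here G is monic in y, so the pseudo-remainder is an ordinary remainder). The sketch below is our own verification, not part of the paper:

    import sympy as sp

    x, y, y1 = sp.symbols('x y y1')
    F = ((y**6 + 2*y + 1)*y1**3 - (12*y**5 + 9*y**4 - 1)*y1**2
         + 27*y**8 + 54*y**7 + 27*y**6 + 4*y**3)
    G = -1 + x + 3*x*y - 3*x**2*y + x**3*y + y**2
    Gx, Gy = sp.diff(G, x), sp.diff(G, y)
    P = sp.expand(Gy**3 * F.subs(y1, -Gx/Gy))   # clear Gy (deg(F, y1) = 3)
    print(sp.rem(P, G, y))                      # 0, as claimed in step 7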
5. REFERENCES
[1] Bronstein, M., Integration of elementary functions, J. Symb. Comput., 9, 117-173, 1990.
[2] Bronstein, M. and Lafaille, S., Solutions of linear ordinary differential equations in terms of special functions, Proc. ISSAC2002, ACM Press, 2002.
[3] Carnicer, M.M., The Poincaré problem in the nondicritical case, Ann. of Math., 140, 289-294, 1994.
[4] Cerveau, D. and Lins Neto, A., Holomorphic foliations in CP(2) having an invariant algebraic curve, Ann. Inst. Fourier, 41(4), 883-903, 1991.
[5] Cormier, O., Singer, M.F., Trager, B.M. and Ulmer, F., Linear differential operators for polynomial equations, J. Symb. Comput., 34, 355-398, 2002.
[6] Cormier, O., On Liouvillian solutions of linear differential equations of order 4 and 5, Proc. ISSAC2001, 93-100, ACM Press, 2001.
[7] Corral, N. and Fernández-Sánchez, P., Isolated invariant curves of a foliation, to appear in Proc. Amer. Math. Soc.
[8] Davenport, J.H., On the Integration of Algebraic Functions, Lecture Notes in Computer Science, 102, Springer-Verlag, New York, 1981.
[9] Feng, R. and Gao, X.S., Rational general solutions of algebraic ordinary differential equations, Proc. ISSAC2004, 155-162, ACM Press, 2004.
[10] Feng, R. and Gao, X.S., A polynomial-time algorithm to compute rational solutions of first order autonomous ODEs, MM-Preprints, No. 23, 54-65, December 2004.
[11] Fulton, W., Algebraic Curves, Benjamin/Cummings Publishing Company, Inc., 1969.
[12] Hubert, E., The general solution of an ordinary differential equation, Proc. ISSAC1996, 189-195, ACM Press, 1996.
[13] Kolchin, E.R., Differential Algebra and Algebraic Groups, Academic Press, New York, 1973.
[14] Kovacic, J.J., An algorithm for solving second order linear homogeneous differential equations, J. Symb. Comput., 2(1), 3-43, 1986.
[15] Lang, S., Introduction to Algebraic and Abelian Functions, second edition, Springer-Verlag, New York, 1972.
[16] Poincaré, H., Sur l'intégration algébrique des équations différentielles du premier ordre et du premier degré, Rend. Circ. Mat. Palermo, 11, 193-239, 1897.
[17] Risch, R.H., The problem of integration in finite terms, Trans. Amer. Math. Soc., 139, 167-189, 1969.
[18] Risch, R.H., The solution of the problem of integration in finite terms, Bull. Amer. Math. Soc., 76, 605-608, 1970.
[19] Ritt, J.F., Differential Algebra, Amer. Math. Soc. Colloquium Publications, New York, 1950.
[20] Singer, M.F., Liouvillian solutions of nth order homogeneous linear differential equations, Amer. J. Math., 103(4), 661-682, 1981.
[21] Singer, M.F., Liouvillian first integrals of differential equations, Trans. Amer. Math. Soc., 333(2), 673-688, 1992.
[22] Trager, B., Integration of Algebraic Functions, Ph.D. thesis, Dept. of EECS, Massachusetts Institute of Technology, 1984.
[23] Ulmer, F. and Calmet, J., On Liouvillian solutions of homogeneous linear differential equations, Proc. ISSAC1990, 236-243, ACM Press, 1990.
[24] Van der Put, M. and Singer, M., Galois Theory of Linear Differential Equations, Springer, Berlin, 2003.
[25] von zur Gathen, J. and Gerhard, J., Modern Computer Algebra, Cambridge University Press, Cambridge, 1999.
[26] Walker, R.J., Algebraic Curves, Princeton Univ. Press, 1950.
Adherence is Better than Adjacency: Computing the Riemann Index Using CAD

James C. Beaumont, Russell J. Bradford, James H. Davenport & Nalina Phisanbut∗
Dept. of Computer Science, University of Bath, Bath BA2 7AY, England
{J.Beaumont, R.J.Bradford, J.H.Davenport, cspnp}@bath.ac.uk

ABSTRACT

Given an elementary function with algebraic branch cuts, we show how to decide which sheet of the associated Riemann surface we are on at any given point. We do this by establishing a correspondence between the Cylindrical Algebraic Decomposition (CAD) of the complex plane defined by the branch cuts and a finite subset of sheets of the Riemann surface. The key advantage is that we no longer have to deal with the difficult 'constant problem'.

Categories and Subject Descriptors

I.1.2 [Symbolic and Algebraic Manipulation]: Algorithms

General Terms

Algorithms, Theory

Keywords

Riemann Surfaces, Elementary Functions, Branch Cuts

1. INTRODUCTION

The elementary functions are the field of functions obtained by applications of exp, log and the arithmetic operations to a set of variables x1, ..., xn and constants. As in previous papers by the authors [4, 8], we shall focus particularly on those elementary functions which are in fact multivalued. We shall use the notation that terms such as log and √, and more generally f, h, denote single-valued functions from ℂ to ℂ, whilst Log, nSqrt and F, H denote multivalued functions, regarded as mapping into sets of values, so that Sqrt(z) = {w : w^2 = z} = {±√z} for example. Numerous well-known identities for multi-valued functions exist; examples include Log(z^2) − Log(z) − Log(z) = {0}, and Sqrt(z^2) = {±z}. The = is of course to be interpreted as set equality. As pointed out in [15], not all such formulae are identities and some instead require set inclusions: we have that Log(z^2) ⊃ 2 Log(z), for example. Many others may be found in [1], which we will use to provide a set of realistic test formulae, exercising occasional care, however, in their interpretation [16]. Given a proposed identity H = 0, the problem is then to decide on what regions of ℂ^n h = 0 holds. The paper [15] provides a graphic illustration that one is indeed forced to consider the geometry of ℂ^n with respect to the branch cuts of h in order to answer this question. One important application of having a method to decide such questions is to the area of simplification. To obtain, and to define precisely what one means by, a 'simplification' is an old and difficult problem [23]. We shall not grapple with this issue here but refer instead to recent progress in [9] for a potential approach, and to [8] for some of the problems involved with working with multi-valued transformations.

1.1 Previous Work

Progress towards constructing a verification system for multi-valued formulae as described above has been reported in [4] and its precursors. This is based on the Decomposition Method first suggested in [19], which requires one to:

1. calculate the set of branch cuts of the proposed identity F = 0;
2. find a sample point in each of the regions defined by the cuts;
3. evaluate the identity numerically using that point, thereby concluding whether the formula is true or not on that entire region by the Monodromy theorem.

Further details of how each of the steps above should be performed can be found in [4]. A key point to remember is that, as first suggested in [8], we use Cylindrical Algebraic Decomposition (CAD; see [10]) for step two; we shall assume that the reader is familiar with the basic notions involved in this algorithm. To do this, we restrict the class of formulae under consideration to those where the branch cuts are algebraic: it is sufficient to prohibit anything other than nth roots from being nested inside other elementary functions.¹ It is however worth pointing out that step one has now been made computationally efficient by the recent proposal to use resultants to eliminate nth roots in the input expression; see [5] for details. Also in that paper, the use of an efficient method [18] to decide which is the best projection order is demonstrated to be of great importance for the efficiency of step two. Thus in this paper we are still strongly advocating a CAD-based Decomposition Method; what we are proposing here is a new and more efficient approach to performing the final step. This step is surprisingly non-trivial, as was seen in [3, 4], and so an improvement is highly desirable. We first provide a convenient summary of the most serious problems involved at this stage. Suppose for simplicity of exposition that our formula H is of one variable. Then we can compute a description of the branch cuts of h as a semi-algebraic set in 2 real variables. Firstly, suppose that the region we are testing has co-dimension > 0. In the terminology of CAD, this means that we are investigating a region that comprises a particular section, s say, of a stack. Let the sample point of s be p = (x, y), which must be interpreted as the complex number x + iy. As argued in [8], numerical evaluation of H(p) may give completely incorrect results except in the rare cases where x, y have finite floating point representations. A symbolic approach would therefore seem necessary, but in the worst case scenario the point p will have some coordinates that cannot be expressed in terms of radicals. Any attempt to search for alternative sample points in sections where this is possible (those constructed over level 1 regions of full dimension) would result in an undesirable coupling between steps 2 and 3 of the algorithm. Even in the cases where p is expressible using radicals, one runs into the 'constant problem' and generally has to resort to algorithms that are potentially costly and rely on the truth of number-theoretic conjectures [26]. The constant problem is in fact undecidable for sufficiently large function fields [25]. This is inhibiting as, ultimately, one would like to handle non-elementary multi-valued functions as well, such as the Lambert W function [14], for example. Secondly, in all regions, it may happen that p is an 'unlucky sample point'. If H is a rational function of elementary functions Hi, each of which has the set of branches {hi}i, then p is an unlucky point if for one or more of the hi we have that hi(p) are equal for several different branches. If we use such a p, then we cannot guarantee that we can draw correct conclusions about the truth of the identity H on the region it represents. An example is afforded by p(z) Log(q(z)^2) − 2p(z) Log(q(z)), where p, q are polynomials; the unlucky points are then the roots of p, q. The method of [27], as was demonstrated in [3], is an effective, although costly, solution to this problem. Finally, the work reported in [5] demonstrated that the number of cells produced by problems of seemingly simple appearance can be extremely large. This of course exacerbates the problems mentioned above. In such cases, numerical evaluation can also be very time-consuming, assuming we are on regions where it can be applied safely. The rest of the paper is as follows. In sections 2 and 3 we describe and exemplify a method for testing the proposed formula on the branch cuts. In section 4 we propose a new method based on Riemann surfaces to make testing the identity on all cells still more efficient. Finally we summarize our contribution and consider future directions.

∗The authors gratefully acknowledge the support of EPSRC, under grant number GR/R84139/01.
¹Note that this applies to constant functions; witness log(x − exp(2)), which has a non-algebraic branch cut.
2. COMPUTING THE ADHERENCE

If we define ⁰√p(z) = p(z), then we may define rt(z) recursively to be the set of functions of the form φ(ⁿ√pi(z)), where φ is a rational function with pi ∈ rt(z) and n ∈ ℕ. By a base (inverse) function we shall mean any of the 14 inverse elementary functions, such as F(z) = arcsin(z), log(z). These are the only elementary functions with branch cuts, so they will be the main focus here. It will be convenient to work with functions of the form Hi = Fi(Gi(z)), where Gi ∈ rt(z) and Fi is an nth root or logarithm, as our 'building blocks'; since all of the base functions can be defined in terms of these, it follows that all our admissible formulae (as in 1.1) can be expressed recursively in terms of the Hi. The definitions we use for the base functions are those from [12], although everything which follows applies regardless of the initial choices made, as long as we are consistent. With F as above, the Riemann surface associated with F shall be denoted by RS(F), and we recall that this is a path-connected domain for F having either n or an infinite number of sheets respectively, each of which is a domain for a particular branch of the function in question. The branches for F in this case are of course either ⁿ√g(z) exp(2kπi/n) for k = 0, ..., n − 1, or log(z) + 2kπi for k ∈ ℤ. For a fixed k, we shall refer to a particular k-th branch of F, denoted by f^k, whose domain we will refer to as comprising the k-th sheet of RS(F); we denote the principal branch of F by f⁰, or simply f. Further, RI(F)|c shall be the Riemann index of F on a particular cell c of the CAD induced by the branch cuts: that is, either exp(2kπi/n) or 2kπi for some fixed k. The branch cuts serve, of course, to act as boundaries where distinct sheets are joined; we denote the set of branch cuts of f by B(f). The important question we shall address is: which sheet does a given branch cut belong to? We now make a key definition, for the case of CADs in the plane.

Definition 1. (Adherence) Suppose that we have a CAD for B(h), and c is a section of a stack representing part of, or all of, a particular branch cut. Let s be an adjacent sector cell to c. Then we say that the branch cut c adheres to s if c belongs to the same sheet of RS(H) as does s.

Recall that two cells of a CAD are said to be adjacent if their union is path-connected [11]. When H = F(G), c cannot adhere to both adjacent cells, by monodromy. A brief overview of the algorithm, for the single variable case, is presented below. Its purpose is to solve the problem of testing an identity on a branch cut by using an adjacent cell of full dimension instead. In what follows, for any x ∈ ℝ we define sign(x) = 1 if x > 0 and sign(x) = −1 if x < 0. Given an input formula φ, we recall that the TruthValue of a cell c in the CAD with respect to φ is a boolean value, depending on whether or not φ(p) is satisfied at any p ∈ c.

Algorithm 1. Input: H = F(g), where F is Log or nSqrt and g ∈ ℂ(z).² Output: A CAD of B(h); adherent cells determined.

1. Compute S = B(h) and D = CAD(S).
2. Compute adj(D) if need be.
3. For each c ∈ D where TruthValue(c) = True do
   sign := sign(ℑ(g(p))), p ∈ s1, with s1, c adjacent;
   if sign = 1 then RI(H)|c = RI(H)|s1,
   else RI(H)|c = RI(H)|s2, where s2 is adjacent to c and not to s1.

²The case of G ∈ rt(z) has extra difficulties, to which we return in sections 2.1 and 3.
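The sign test of step 3 is simple to carry out at sample points. The following sketch (our own, with an assumed example H = Log(g), g(z) = z^2, whose finite branch cut is the imaginary axis, a vertical line cut) evaluates sign(ℑ(g(p))) at a sample point in each adjacent sector:

    import sympy as sp

    z = sp.symbols('z')
    g = z**2          # H = Log(g); cut where g(z) is real and negative,
                      # i.e. on the imaginary axis (case (a) below)
    for p in (sp.Rational(1, 2) + sp.I, -sp.Rational(1, 2) + sp.I):
        print(p, sp.sign(sp.im(sp.expand(g.subs(z, p)))))
    # prints +1 for 1/2 + I and -1 for -1/2 + I: by step 3 the cut
    # adheres to the sector containing 1/2 + I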
We now comment further on this algorithm. The role of steps 1 and 2 is obvious. The adherence of any branch cut of f(g) will depend only on what we choose to be the closure of the base function f(z). Examination of the definitions of the complex logarithm and nth root functions shows that this in turn is determined only by what definition we take for the principal branch of arg: here we choose the modern convention, which requires that arg(z) ∈ (−π, π]. Thus for the branch cut of f(z) we have Counter Clockwise (CC) Closure, or, in the terminology used here, the branch cut c of f(z) adheres to the cell T having cell index (1,3), which lies in the same stack as c does in the CAD constructed with respect to this single branch cut. We now make step 3 more precise. We shall require that s1 is of full dimension. Thus in the x-y plane there are two possibilities for c: either (a) c is a vertical line x = a for a ∈ ℝ, or (b) it is not. In the former case we choose a sector cell s1 which is adjacent to c and lies in either of the two stacks adjacent to the one in which c resides (another reason for treating these cuts specially is given in 2.1), and in the latter case we choose one of the two sector cells that are adjacent to c in the same stack. Notice that for two-dimensional CADs such as here, we can decide a priori whether intra-stack adjacency alone, which does not require an algorithm, will be sufficient, thus allowing us to bypass step two above. To do this we examine the CAD data structure to see if there are any cells which have truth value True, are of dimension one, and are constructed over level one cells of zero dimension. The idea behind step 3 is the following. Let γ : [0, 1] → ℂ be a path with γ(0) = p and γ(1) = q, with q ∈ c, and let g(q) = s where s ∈ ℝ⁻. Then if g(γ(t)) → s + 0⁺i as t → 1, then c adheres to s1; otherwise, it must adhere to another cell, s2 say. In case (a) this s2 will be a sector cell adjacent to c, but in a different stack to the one in which s1 resides; in case (b), it will be the only other adjacent sector to c in that stack. Applying an adjacency algorithm [11] to D produces output which can be thought of as comprising lists of cell indices, where all indices in a list represent cells which form a single connected component. Computing the adjacencies thus gives us the cells s1 whose sample points are required for the last step. Notice that in general there will be a finite number of adjacent cells s1 to c. We only need to choose one cell for each branch cut that comprises a single connected component, and this can be achieved by using the adjacency information, thus avoiding the potential redundancy. In the case where g ∈ ℂ(z), the particular choice of adjacent cell does not matter. We point out that the representation used in step 1 will produce a CAD containing other sections which are not branch cuts, and we do not wish to apply the algorithm to them. For example, if we wish to represent a semi-circular branch cut we would use {(x^2 + y^2 − r^2 = 0) ∧ (x > 0)}, but this will make for a CAD containing both roots of the bivariate polynomial. However these unwanted cells will always have TruthValue False, so they can be ignored.
2.1 Nested roots

Suppose that H = F(G) where G ∈ rt(z). In [4] we pointed out (for the case of square roots) that the method to calculate B(h) will produce a semi-algebraic set S, say, that contains spurious branch cuts. Recall that in the simplest case, when g = ⁿ√p(z), these are the sets {z | (g^k(z) ∈ B(f)) ∧ (k ≠ 0)}, although in general g may contain several nth roots. They arise as an artefact of the method used to remove nth roots from the input formula. The problem is that in the CAD solution formula construction step, they comprise cells which are assigned the truth value True, since they satisfy the formula S. In some cases one can evaluate s = g(p) for p the sample point of each cell in S and check whether or not s ∈ B(f). In general, we run into the same problems as described above in (1.1). One might think that we could detect the spurious cuts by examining the signs of the imaginary parts of g(pi) for pi ∈ si with i = 1, 2, but the fact that we cannot guarantee that s ∈ B(f) means that reasoning based on the continuity of g is not sufficient. For the important case of square roots, we showed in [4] how to remove the spurious branch cuts by adding polynomial constraint equalities to the system. It is a minor extension to handle nth roots, which we defer to [6]. However this contributes exponentially to the number of polynomials that are used in the CAD construction. A further difficulty is that the faster methods for eliminating roots, such as Gröbner bases, or better, the method of [5], work as black boxes and do not automatically generate the appropriate sets of constraints at each stage. Except in simpler cases, where it may prove to be efficient to remove the spurious cuts, we propose to apply algorithm (1) exactly as we did before; this can be done since for any p ∈ s1, ℑ(g(p)) < 0 or ℑ(g(p)) > 0. (In the case where ℑ(g(p)) = 0, we know the cut is spurious.) Suppose, as in step 3, that c is in fact spurious, although we cannot yet decide if this is so. It is easy to see that s1, s2 will then belong to the same sheet of RS(H) as does c, and so correctness at the sample point testing stage is guaranteed. The 'adherence information' we obtain in this case is of course spurious, but since c adheres to both the si, we cannot obtain incorrect results. Whilst this approach provides us with a generic algorithm, it is wasteful in that we must nevertheless compute g(p) for each spurious c. One further issue, to be exemplified later in section 3, is that given f(g) it may happen that a (non-spurious) cut c derives from both f and g. In that case, we must first calculate the adherence of c with respect to g before we can compute the adherence with respect to f. This is because, unlike the case where g ∈ ℂ(z), the choice of s1 in step 3 does matter: we need to choose it so that c adheres to s1 with respect to g, so that g is continuous onto c, as required.

2.2 Justification

In the interest of brevity, a detailed proof of correctness shall not be given in this paper. However the essential ideas can be sketched as follows. First, one must remember when computing with g in the manner above that g is a complex valued function, g : ℂ → ℂ; only the branch cuts are real algebraic. However, working with both real and complex geometry in this way does not cause any problems provided that one stays away from regions of non-analyticity. (This is why we do not allow complex conjugation: if g(z) = z̄ for example, then one can easily check that this does not satisfy the Cauchy-Riemann equations, and is therefore not analytic.) We have that g(p) ∈ B(f) by necessity. Now g is continuous on a path from at least one of the pi in the adjacent sectors onto p since, following [4], at step 1 we always include the branch cuts at infinity, that is, where the denominator of g vanishes; the only other potential problem would be discontinuities arising in g due to the presence of nth roots (as in example 3, next section), but as with the singularities these will be, by virtue of the CAD, not inside the adjacent sectors we are looking at. The only case where this does not occur is when, given f(g), the cuts for f and g coincide, but this situation is not a problem if we deal with it as described at the end of (2.1). In the case of vertical line cuts, the end points may be singularities. However, since we use sectors in adjacent stacks, as opposed to using the sector above and below such cuts, we avoid losing the notion of a continuous path onto the cut. We remark that [11] is the most recent adjacency algorithm at the time of writing; we should mention that this may fail in higher than four dimensions, albeit with a very low probability. In the case of failure, we cannot guarantee to be able to represent the section in question correctly and we may under-represent the branch cuts. Fortunately for us, the CADs derived from the formulae we are most interested in have at most 4 dimensions.
2.3 Application

In general the input formula H = 0 contains several of our Hi = Fi(Gi) building blocks. To deal with this, all that one needs to add in an implementation is a piece of additional information to the usual CAD cell data structure (such as that found in the package we have used here [24]): namely, which Hi each branch cut cell derives from. Then we simply apply the algorithm as shown, using this data to ensure that we apply the appropriate gi at stage 3. Of course, some branch cut sectors may derive from several Hi. Now the adherence information alone is insufficient to determine the actual RI(Hi) on a cell. However, the point is that we now have a much simpler method (modulo the remark at the end of this subsection) to determine the truth of the formula h ?= 0 on the branch cuts, by testing the appropriate adjacent full dimensional cells instead. Apart from the possibility of choosing unlucky sample points (this will be avoided using the method of section 4), it is now an easy task to calculate the correction factor for h = 0 on these cells by using the method of [8], which uses floating point evaluation of H at the sample point together with bounds on the accuracy derived in the manner shown there. Of course this does not tell us what RI(Hi) is for the individual Hi. A problem occurs, however, on branch cut cells which derive from several Hi, for some of these functions may adhere to different cells. An example of this is discussed in section 3. It is often convenient for a user to specify a function by arccosh, say, as opposed to giving its definition in terms of logs and square roots. We have therefore determined the adherence for each of the base functions using the method above, so that one can now allow our input function F to be any of these 14 functions. For future reference, this information is presented in the appendix. If a particular cut of F under consideration lies on the imaginary axis of the plane, then one must make the necessary minor changes in step 3: we calculate sign := sign(ℜ(g(p))) instead.

3. EXAMPLES

The examples chosen here have branch cuts at infinity (example 1) and singularities (example 2) which are not just zero dimensional regions.

[Figure 1: ℝ* and the corresponding branches for Arctan(x).]

1. H = Arctan((x + y)/(1 − xy)) − Arctan(x) − Arctan(y) = 0; we now investigate the corresponding identity h ?= 0, where h = h1 − h2 − h3. For real inputs, arctan(x) only has branch cuts at ±∞. Our representation is only for finite branch cuts, so we see that the branch cuts for H derive from h1 only: that is, the set B(h) = {xy − 1 = 0}. This comprises the two branch cuts shown in figure (2), which now require investigation. By numeric or symbolic evaluation, the identity is readily verified to be true on region 2. On region 1 we have that H = −π, and H = π on region 3. We now consider the formula on the cuts themselves. Notice that the CAD data structure will tell us that we do not need to compute the adjacency of the CAD with respect to B(h) here. Clearly, a purely numeric approach would fail at any point on the cuts, due to the blow-up. We shall show that arctan(x) must be viewed as a function of the form (−∞, ∞] → (−π/2, π/2]: that is, the point (∞, π/2) belongs to the principal branch of arctan, whilst the point (−∞, −π/2) belongs to the branch arctan(x) − π. This requires us to work on the extended real line ℝ* = ℝ ∪ {∞}, which we recall is constructed by identifying the end points ±∞, as in figure (1). Thus on ℝ* one works with positive infinity only, and if x → ∞⁺ then one passes onto the point ∞ continuously (see the left-hand diagram of figure 1), whilst if we let x → ∞⁻ then we do not. In order to preserve the continuous 1-1 correspondence between the finite domain of arctan(x), that is ℝ, and the extended domain ℝ*, we see that −∞ does not belong to the principal branch domain of arctan(x). As always, when passing through a branch cut one passes onto the adjoining branch domain; we can see that in this case this will be the branch arctan(x) − π, as in the right-hand diagram of figure (1). Let c1 = {(xy − 1 = 0) ∧ (x > 0)} and c2 = {(xy − 1 = 0) ∧ (x < 0)}, and put g = (x + y)/(1 − xy). Now we can use adherence to determine which branch each ci is on by determining whether g tends to +∞ or −∞: the ci will adhere to the cell where g is positive. This requires that our adjacent cells are sign-invariant for g, so we must first add the line y + x = 0 to the CAD. Consider c1, where our adjacent regions are 1 and the part of 2 above the line, and suppose that our sample points for these are given by p1 = (1, 2) and p2 = (1, 1/2) respectively. We obtain g(p1) = −3 and g(p2) = 3, which shows that c1 adheres to region 2, and we conclude the identity is true on this branch cut. Similarly we treat c2 = {(xy − 1 = 0) ∧ (x < 0)} using regions 2 (below the line) and 3, with sample points p3 = (−1, −2) and p4 = (−1, −1/2).
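The region values quoted in this example (H = −π, 0, π on regions 1, 2 and 3) are easy to reproduce numerically; the following check is ours, using the sample points from the text:

    import sympy as sp

    x, y = sp.symbols('x y')
    h = sp.atan((x + y)/(1 - x*y)) - sp.atan(x) - sp.atan(y)
    for px, py in ((1, 2), (1, sp.Rational(1, 2)), (-1, -2)):
        print((px, py), h.subs({x: px, y: py}).evalf())
    # approximately -pi, 0 and pi on regions 1, 2 and 3 respectively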
Theorem 2. Let N ≥ 1, let C(x) be an m × m matrix whose entries are polynomials in F[x] of degree at most d, and suppose that the characteristic p of F satisfies p ≥ dN + 1. Then the matrix factorial A = C(N) ··· C(1) can be computed using O(m^2 M(√(dN)) + m^ω √(dN)) operations in F.

Proof. For ease of exposition, let us suppose that dN and N/d are perfect powers of 4, so that k = √(dN) and n = √(N/d) are powers of 2. The general case can be treated with no extra difficulty (by writing N in base 4). Let M(x) be the polynomial matrix C(nx + n) ··· C(nx + 1). Since the required scalar matrix A equals M(k − 1) ··· M(0), it suffices to evaluate M on 0, 1, ..., k. For this, we use a recursive algorithm. Suppose that the values of the matrix M0(x) = C(nx + n/2) ··· C(nx + 1) at the points 0, 1, ..., k/2 are already known. Let M1(x) = M0(x + 1/2), so that M(x) = M1(x)M0(x). Since the degree of M0 is at most k/2, the values of M0 at 0, 1, ..., k + 1 can be deduced using m^2 simple extrapolations, in complexity O(m^2 M(k)). The values of M1 at 0, 1, ..., k + 1 can be deduced by two extrapolations (of difference 1/2, in degree k/2) of the values of M0 at 0, 1, ..., k + 1. Since p > k + 1, the elements 1, 2, ..., k + 1 and 1/2 − k/2, ..., 1/2 + k/2 are nonzero in F, so these final two extrapolations can also be performed within the same complexity bound.
The modular algorithm ModBsGsPolySols then proceeds as follows:

1. Choose a prime p in {B, ..., 2B} (B defined below);
2. Compute the reduction C[p] of C modulo p;
3. Compute A[p] = C[p](N) ··· C[p](−β) I[p];
4. Return the integer n − rank(A[p]).

Proof. Let us first study the correctness of the algorithm. Let, as before, o = α + β denote the order of the recurrence (2). Since x = 0 is ordinary, we have n ≤ o ≤ n + d, and the indicial polynomial u_{−α}(x) has the form a x(x − 1) ··· (x − n + 1), where a has size bounded by ℓ. On the other hand, since p > d + n + N, none of the elements −β + o − n + 1, ..., o + N is zero in Fp. This ensures that u_{−α}(i + o)[p] is invertible in Fp for all −β ≤ i ≤ N, provided that p does not divide a. Therefore, if p is not a divisor of a, the recurrence (3) can be used for −β ≤ i ≤ N, and the conclusion of Prop. 1 still holds for L[p] (over Fp). In other words, the algorithm returns the dimension of the Fp-vector space of the solutions in Fp[x] of L[p]. Let us call p a good prime if p does not divide a and if, simultaneously, the matrices A and A[p] have the same rank. In short, we have just proved that if the algorithm
chooses a good prime p, then the dimensions of the space of polynomial solutions over Q and over Fp coincide, and thus the algorithm ModBsGsPolySols returns the correct output.

We now estimate the probability of choosing a good prime. Using Lemma 6, the entries of A have sizes bounded by Γ = N(6n log(o) + n log(N) + ℓ) + 2d(6n log(o) + n log(d) + ℓ), which is, by the assumption N ≥ d, in Olog(ℓN + nN log(N)). Let B be the integer B = 2⌈log^c(N)(ℓ + n log(n) + 2nΓ)⌉, so that B = Olog(n^2 ℓN log^{1+c}(N)). Let us suppose that the prime p is chosen uniformly at random in the set of prime numbers between B and 2B. Then, using Lemma 9, it is easy to infer that p is a good prime with probability at least 1 − 1/(2 log^c(N)).

We finally prove the complexity estimate. By [17, Th. 18.8], the cost of Step 1 is in Olog(I(log N) log^2 N). Using the algorithm from Lemma 3, Step 2 can be done using O((d + n) M(n) log(n)) operations in Fp. Step 4 can be done using O(n^ω) operations in Fp. Since N ≥ d + n and p > N + 1, Th. 2 can be used to perform Step 3, and this concludes the complexity analysis, since every operation in Fp costs O(I(log(p))) = Olog(I(log(N))) bit operations.

The algorithm ModBsGsPolySols can easily be modified so as to return also the degrees of all the polynomial solutions, within the same bit complexity bound Olog(M(√N) I(log(N))) and with the same probability. Combining our two algorithms leads to an algorithm for computing polynomial solutions which is output-sensitive. Indeed, suppose that the indicial polynomial of L has positive integer roots N1 < ··· < Nk = N and that the polynomial solutions of L have degrees d1 < ··· < dr = d. Using our ModBsGsPolySols algorithm, we compute the degrees di in bit complexity roughly linear in √N; then, using our algorithm BinSplitPolySols, we return a compact representation of the solutions in bit complexity roughly linear in d. If d ≪ N, this strategy has its benefits; for instance, if d ≈ √N (as in Ex. 4, §5), we compute the solutions in bit complexity roughly linear in √N instead of the N^2 of the basic algorithm.
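Step 4 of the modular algorithm only requires a rank computation over Fp. A self-contained Gaussian-elimination sketch (ours; the 2 × 2 matrix is hypothetical example data, not one of the matrices A[p] of the paper) is:

    import sympy as sp

    def rank_mod_p(A, p):
        # rank of an integer matrix over F_p; the algorithm then returns
        # n - rank(A[p]) as the dimension of polynomial solutions mod p
        M = A.applyfunc(lambda e: e % p)
        r = 0
        for j in range(M.cols):
            piv = next((i for i in range(r, M.rows) if M[i, j] != 0), None)
            if piv is None:
                continue
            M.row_swap(r, piv)
            M[r, :] = (sp.mod_inverse(M[r, j], p) * M[r, :]).applyfunc(lambda e: e % p)
            for i in range(M.rows):
                if i != r and M[i, j] != 0:
                    M[i, :] = (M[i, :] - M[i, j]*M[r, :]).applyfunc(lambda e: e % p)
            r += 1
        return r

    print(rank_mod_p(sp.Matrix([[2, 4], [1, 2]]), 7))   # 1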
5. EXPERIMENTS

We have implemented our algorithms BinSplitPolySols and ModBsGsPolySols in the computer algebra systems Maple v. 9.5 and Magma v. 2.11-2, respectively. Our choice is motivated by the fact that Magma and Maple provide implementations of fast integer arithmetic, based on Karatsuba and FFT multiplications. They both use the GNU Multiple Precision Arithmetic Library (GMP). This is important, since in our experiments over Z, the computations sometimes require up to millions of bits. Moreover, Magma employs asymptotically fast algorithms for performing arithmetic with univariate polynomials over Fp (including Karatsuba and FFT-based methods). Again, this is crucial, since in our modular baby-step / giant-step algorithm, the theoretical gains are valid only in conjunction with fast polynomial multiplication. We have also implemented the basic algorithm in Maple. The performances of our implementation are very similar to those of Maple's function PolynomialSolutions from the LinearFunctionalSystems (LFS) package. Maple provides another implementation of the basic algorithm, namely the function polysols from the DEtools package. Since LFS outperforms DEtools on the set of examples we considered, we have chosen to display only the timings of LFS for comparisons.

Example 1
    N      LFS       BinSplit                     BsGs
                     Compact   Expand    Total
    2^13   2.23      0.29      0.38      0.67     0
    2^14   6.47      0.62      1.48      2.10     0.02
    2^15   25.98     1.45      7.11      8.56     0.03
    2^16   174.17    3.74      45.52     49.26    0.05
    2^17   1358.01   10.63     318.88    329.51   0.07
    2^18   > 4Gb     14.31     > 4Gb              0.11
    2^20             74.52                        0.25
    2^22             398.6                        0.59
    2^24             > 1h                         1.35

Examples 2 and 3
           Example 2                     Example 3
    N      LFS      BinSplit   BsGs      LFS      BinSplit   BsGs
    2^10   0.2      0.09       0         0.54     0.01       0.01
    2^12   1.10     0.27       0.01      3.74     0.03       0.03
    2^14   16.27    1.11       0.02      48.80    0.09       0.06
    2^16   > 4Gb    5.06       0.05      > 4Gb    0.54       0.19
    2^18            24.08      0.12               3.15       0.45
    2^20            115.39     0.27               16.72      1.09
    2^22            416.81     0.68               93.64      2.57
    2^24            2199.9     1.57               506.81     6.17
    2^26            > 1h       3.77               > 1h       14.84

Example 4
    d      N        LFS           BsGs    BinSplit
                                          Compact   Expand
    9      81       0.14          0.03    0.04      0.01
    16     256      0.63          0.05    0.05      0.01
    25     625      4.90          0.08    0.07      0.01
    36     1296     30.14         0.13    0.07      0.02
    49     2401     192           0.17    0.08      0.01
    64     4096     746.94        0.23    0.09      0.03
    81     6561     4098.17       0.3     0.1       0.02
    100    10000    > 6h ?        0.38    0.12      0.02
    121    14641    > 1 day ?     0.52    0.137     0.03
    144    20736    > 1 week ?    0.7     0.179     0.03
    169    28561    > 1 month ?   0.9     0.203     0.04

The equations used in these tables are as follows. Ex. 1 is (1 − x^2)y″ − 2xy′ + N(N + 1)y = 0, where N is a power of 2; the space of polynomial solutions has dimension 1, any nonzero solution has degree N, and the recurrence has order o = 2, but only two terms of the recurrence are nonzero. Ex. 2 is (x^2 + 2x + 1)y′ − (Nx + N − 1)y = 0; it has no nonzero polynomial solutions, but its indicial equation at infinity has N as a root, and the recurrence has order o = 2. Ex. 3 (taken from [10]) is 2x^3 y″ + ((3 − 2N)x^2 + x)y′ − (Nx + 1)y = 0; it has a 1-dimensional space of polynomial solutions and the recurrence has order 1. Finally, in Ex. 4 we consider a family of LDEs indexed by N, of order n = 3; the recurrence (2) has order o = 7, the indicial equation is (x − d)(x − N) = 0, and the LDE has a solution of degree d, but no solution of degree N. In Exs. 1 and 4, the column Compact displays the times used by BinSplit to compute the compact representation of the solutions, while the column Expand indicates the time necessary to compute the expansion of the compact representation in the monomial basis. Their sum is collected in the column Total, whose output is the same as that of LFS. All the tests have been performed on the computers of the MEDICIS resource center www.medicis.polytechnique.fr, using a 2 Gb, 2200+ AMD 64 Athlon processor. The timings (in seconds) shown in these tables prove that the theoretical complexity estimations can be observed in practice:

– The cost of LFS is multiplied by more than 16 when the degree N is multiplied by 4. This is in agreement with the fact that the basic algorithm has complexity (at least) quadratic in N. Moreover, the memory requirements are also roughly proportional to N^2, and this naturally becomes prohibitive (the mention > 4Gb means that the execution was stopped after 4Gb of memory were exhausted).
– The cost of BinSplit is multiplied by slightly more than 5 when the degree N is multiplied by 4. This accurately reflects the behavior of GMP's integer multiplication.
– The cost of BsGs is multiplied by slightly more than 2 when the degree N is multiplied by 4. Again, this is in line with the complexity estimates and shows that the polynomial multiplication we are using is quite good.
– When the recurrence has 2 terms (Exs. 1 and 3), BinSplit essentially computes scalar factorials and there is no linear algebra step. In the opposite case (o > 1), BinSplit multiplies matrices of small size, but containing potentially huge integer entries. A further improvement (not implemented yet) is to use Strassen's algorithm to multiply integer matrices; the gain should already be visible on 2 × 2 matrices.
– In Ex. 1, expanding the compact representation has quadratic complexity, but a constant factor (between 3 and 4) is gained over LFS. At least two issues contribute to this constant factor: LFS computes two power series expansions up to order N instead of a single one, and the sizes of the numerators and denominators are larger in the computations done by LFS than in our algorithm.
– The timings in Ex. 4 clearly show the advantage of first executing the baby-step/giant-step algorithm to compute the possible degrees d of solutions; these values are given as input to BinSplitPolySols in order to compute the compact representation of solutions. This way, we get an algorithm that is output-sensitive. Moreover, expanding the solutions is in this case negligible. Without the information on the degrees, even though the polynomial solutions have moderate degrees (up to d = 81), LFS spends a lot of time in (uselessly) unraveling recurrences up to order N = d^2. (The entries marked by ? are estimated timings.)

6. REFERENCES
[1] S. Abramov, M. Bronstein, and M. Petkovšek. On polynomial solutions of linear operator equations. In Proc. ISSAC'95, pages 290-296. ACM Press, 1995.
[2] S. A. Abramov and K. Y. Kvashenko. Fast algorithms to search for the rational solutions of linear differential equations with polynomial coefficients. In Proc. ISSAC'91, pages 267-270. ACM Press, 1991.
[3] G. Almkvist and D. Zeilberger. The method of differentiating under the integral sign. Journal of Symbolic Computation, 10(6):571-591, 1990.
[4] M. A. Barkatou. On rational solutions of systems of linear differential equations. Journal of Symbolic Computation, 28(4-5):547-567, 1999.
[5] M. Beeler, R. Gosper, and R. Schroeppel. HAKMEM. Artificial Intelligence Memo No. 239. MIT, 1972.
[6] D. J. Bernstein. Fast multiplication and its applications. Available at http://cr.yp.to/papers.html.
[7] P. B. Borwein. On the complexity of calculating factorials. Journal of Algorithms, 6(3):376-380, 1985.
[8] A. Bostan, T. Cluzeau, and B. Salvy. Fast algorithms for polynomial and rational solutions of linear operator equations. Preprint, 2005.
[9] A. Bostan, P. Gaudry, and É. Schost. Linear recurrences with polynomial coefficients and application to integer factorization and Cartier-Manin operator, May 2004. 29 pages. Submitted.
[10] D. Boucher. About the polynomial solutions of homogeneous linear differential equations depending on parameters. In Proc. ISSAC'99, pages 261-268. ACM Press, 1999.
[11] R. P. Brent. Multiple-precision zero-finding methods and the complexity of elementary function evaluation. In Analytic computational complexity, pages 151-176. Academic Press, 1976.
[12] D. V. Chudnovsky and G. V. Chudnovsky. Approximations and complex multiplication according to Ramanujan. In Ramanujan revisited, pages 375-472. Academic Press, 1988.
[13] F. Chyzak. An extension of Zeilberger's fast algorithm to general holonomic functions. Discrete Mathematics, 217(1-3):115-134, 2000.
[14] F. Chyzak, P. Dumas, H. Le, J. Martins, M. Mishna, and B. Salvy. Taming apparent singularities via Ore closure. Preprint, July 2004.
[15] G. Estrin. Organization of computer systems: the fixed plus variable structure computer. In AFIPS conference proceedings, volume 17, pages 33-40, 1960.
[16] J. von zur Gathen and J. Gerhard. Fast algorithms for Taylor shifts and certain difference equations. In Proc. ISSAC'97, pages 40-47. ACM Press, 1997.
[17] J. von zur Gathen and J. Gerhard. Modern computer algebra. Cambridge University Press, 1999.
[18] J. Gerhard. Modular algorithms in symbolic summation and symbolic integration. Volume 3218 of LNCS. Springer-Verlag, 2005.
[19] G. H. Golub and C. F. Van Loan. Matrix computations. Johns Hopkins University Press, 1996.
[20] M. van Hoeij, J.-F. Ragot, F. Ulmer, and J.-A. Weil. Liouvillian solutions of linear differential equations of order three and higher. Journal of Symbolic Computation, 28(4-5):589-609, 1999.
[21] P. Kogge and H. Stone. A parallel algorithm for the efficient solution of a general class of recurrence equations. IEEE Trans. Comp., C-22:786-793, 1973.
[22] J. Liouville. Second mémoire sur la détermination des intégrales dont la valeur est algébrique. Journal de l'École polytechnique, Cahier 14:149-193, 1833.
[23] F. M. Marotte. Les équations différentielles linéaires et la théorie des groupes. PhD thesis, Faculté des Sciences de Paris, 1898.
[24] A. Schönhage, A. F. W. Grotefeld, and E. Vetter. Fast algorithms. Bibliographisches Institut, 1994. A multitape Turing machine implementation.
[25] M. F. Singer. Liouvillian solutions of n-th order homogeneous linear differential equations. American Journal of Mathematics, 103(4):661-682, 1981.
[26] M. F. Singer and F. Ulmer. Linear differential equations and products of linear forms. Journal of Pure and Applied Algebra, 117/118:549-563, 1997.
[27] V. Strassen. Einige Resultate über Berechnungskomplexität. Jahresbericht der Deutschen Mathematiker-Vereinigung, 78(1):1-8, 1976.
Non Complete Integrability of a Magnetic Satellite in Circular Orbit

Delphine Boucher
IRMAR, Université de Rennes 1, Campus de Beaulieu, F-35042 Rennes Cedex
[email protected] ABSTRACT
of meromorphic first integrals which are functionally independent and in involution (see [1], [6] and [15] for precise definitions of these notions). Here we will use a result of J.-J. Morales and J.-P. Ramis :
We consider the motion of a rigid body (for example a satellite) on a circular orbit around a fixed gravitational and magnetic center. We study the non complete meromorphic integrability of the equations of motion which depend on parameters linked to the inertia tensor of the satellite and to the magnetic field. Using tools from computer algebra we apply a criterion deduced from J.-J. Morales and J.-P. Ramis theorem which relies on the differential Galois group of a linear differential system, called normal variational system. With this criterion, we establish non complete integrability for the magnetic satellite with axial symmetry, except for a particular family F already found in [11], and for the satellite without axial symmetry. In the case of the axial symmetry, we discuss the family F using higher order variational equations ([14]) and also prove non complete integrability.
Theorem 1 ([15]). Let (S) be a Hamiltonian system, x0 (t) be a particular solution of (S), Y (t) = A(t) Y (t) be the variational system of (S) computed along the solution x0 (t) and G be the differential Galois group of Y (t) = A(t) Y (t). If the system (S) is completely integrable with meromorphic first integrals, then the connected component of the identity in the group G, denoted G0 , is an abelian group. In [10], using Ziglin’s theory, A. Maciejewski and K. Gozdziewski gave a numerical proof of the non complete integrability of the problem of the satellite with axial symmetry in a non magnetic field. Furthermore, A. Maciejewski ([9, 8]) and M. Audin ([2]) also gave independent (formal) proofs of non complete integrability for the satellite with axial symmetry in a non magnetic field. They both applied Morales and Ramis theorem to an order two variational equation using two different approaches. Then, the proof was extended in many ways. First in [11], A. Maciejewski and M. Przybylska proved that the system of the magnetic satellite with axial symmetry was not completely integrable except for a family F of parameters for which the answer remained open. A major improvement in this result is that they deal with real integrability and not only with complex integrability. Lastly, in [5], the author proved that the system of the satellite without axial symmetry and without magnetic field was not completely integrable.
Categories and Subject Descriptors I.1 [Computing Methodologies]: Symbolic and algebraic manipulation; I.1.2 [Symbolic and algebraic manipulation]: Algorithms; J.2 [Physical sciences and engineering]: Astronomy
General Terms Algorithms
1.
INTRODUCTION
We consider a rigid body (the satellite) moving in a circular orbit around a fixed gravitational and magnetic center ([9, 8], [10], [11]). The equations of the motion of the satellite are given by a Hamiltonian system which depends on a set P of parameters related to the inertia tensor of the body and to the magnetic field ([11]). Our goal is to find the values of the parameters for which this Hamiltonian system may be completely integrable (with meromorphic first integrals), which means that there exists a sufficient number
In this paper we deal with the case of the satellite with and without axial symmetry in a magnetic field. We use a criterion of non complete meromorphic integrability deduced from theorem 1 and established in [4, 3] : Criterion 1 ([4, 3]). Let (S) be a Hamiltonian system and (N V S) be the normal variational system computed along a particular solution of (S). If (N V S) is irreducible and has formal solutions with logarithmic terms at a singular point, then (S) is not completely integrable (with meromorphic first integrals).
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ISSAC’05, July 24–27, 2005, Beijing, China. Copyright 2005 ACM 1-59593-095-7/05/0007 ...$5.00.
We also use a new theorem, announced in [14], on higher order normal variational systems. First for the magnetic satellite with axial symmetry, we apply criterion 1 to an equation of order 2 (instead of applying
53
Kovacic’s algorithm like in [11]). We find the system is not completely integrable except for the family F of parameters already found in [11] and for which the question of integrability remained open. For this family we use the new theorem on higher order normal variational systems ([14]) to establish the non complete integrability.
The parameter ξ is a real parameter linked to the magnetic field; the parameters A, B and C are the components of the inertia tensor which is linked to the distribution of the mass throughout the body. Without loss of generality, one can assume that they satisfy
Then for the case of the satellite without axial symmetry, we deal with a fourth order normal variational equation (or 4 × 4 normal variational system). Since Kovacic’s algorithm is restricted to second order equations, we use criterion 1. We are still in a lucky situation since our normal variational system has no exponents depending on the parameters. This allows us to avoid arithmetic conditions on the parameters and to adapt all the algorithms needed in criterion 1 to our situation. In section 2, we give the normal variational system computed along the particular solution given in [11]. In section 3 we consider the particular case when the magnetic satellite has an axial symmetry. Like in [11], we prove that this normal variational system can be reduced to a linear differential equation of order 2. We apply criterion 1 to it and we conclude to the non complete integrability except for the same particular family F as in [11]. Then we solve the remaining case of this family using higher order normal variational systems ([14]). In section 4 we deal with the general case of the magnetic satellite, without axial symmetry. We prove that the 4 × 4 normal variational system has regular formal solutions with logarithmic terms and that it is irreducible under the conditions on the parameters. Using again criterion 1 we conclude that the equations of the satellite without axial symmetry are not meromorphically completely integrable.
A < B + C, B < C + A, C < A + B,
2.
A ≥ 0, B ≥ 0, C ≥ 0,
C < B ≤ A = 1. In [11], the authors consider a particular solution to the Hamiltonian system : π π xk =t (q1 , , , 1 + w z, 0, 0) 2 2 where z = k cn(w t, k), cos(q1 ) = −k sn(w t, k),
(3)
The constraints (2) on the parameters become: w2 , 0 < w2 < 6 B − 3 ≤ 3, 3 The variational system along the solution xk is A = 1, C = B −
(4)
Y (z) = V S(z) Y (z) where V S(z) =
dt J6 Hessian(H, xk ) dz
is a 6 × 6 matrix. We introduce a new function f defined by
NORMAL VARIATIONAL SYSTEM
f (z) = dn(w t, k) sn(w t, k). Using the following simplifications : d (cn(w t, k)) = −w dn(w t, k) sn(w t, k), dt z2 , k2 we express the coefficients of V S(z) in Q(B, w, ξ, k)(z, f (z)). As the matrix is too big, we do not give it here. Now we construct the normal variational system associated to the particular solution xk . The vector Wk = xk =t (w z, 0, 0, −w2 k f (z), 0, 0) is a particular solution to the variational system. Thanks to it we make a symplectic transformation on the variational system, which enables to keep the symplectic structure of the differential Galois group of the variational system. We consider the following symplectic matrix P6 (such that t P6 J6 P6 = J6 ) whose first column is made of the coordinates of the vector Wk : P6 = 1 0 0 0 0 0 0 C B C B −1 0 0 0 0 C B C B 0 −1 0 0 0 C. B Wk −Wk [5] −Wk [6] Wk [2] Wk [3] 1 C B Wk [1] Wk [1] Wk [1] Wk [1] Wk [1] C B C B 0 0 0 −1 0 A @ 0 0 0 0 −1 sn(w t, k)2 = 1 −
(1)
where x =(q1 , q2 , q3 , p1 , p2 , p3 ), = −I6 , „ «2 1 s3 p 1 c2 s3 p3 1 2 + c3 p2 − H(x) = − ξ c2 + 2 2 s2 s2 „
sin(q1 ) = dn(w t, k)
w2 = 3 (B − C).
J62
1 + 2
k ∈]0, 1[
and w is a new parameter defined by
In this section we recall the equations which define the problem of the satellite using the most recent paper on the subject ([11]) and we compute the normal variational system along the particular solution given in [11] using a symplectic transformation. After some reductions detailed in [11], the equations of the rotational motion of the body can be given by a Hamiltonian system x (t) = J6 ∇H(x(t))
(2)
«„ « c3 p1 c2 c3 p3 c3 p1 c2 c3 p3 s3 p 2 − − s3 p 2 − − s2 s2 Bs2 B Bs2 „ « s3 p 1 1 p3 2 c2 s3 p3 − + + c3 p2 − s3 s2 2 C s2 s2 « „ c3 p1 c2 c3 p3 c3 s2 − p3 c2 − − s3 p 2 − s2 s2
3 3 3 (−s3 c2 s1 + c3 c1 )2 + B (−c3 c2 s1 − s3 c1 )2 + Cs2 2 s1 2 2 2 2 and +
∀i ∈ {1, 2, 3}, ci = cos(qi ), si = sin(qi ).
54
The system satisfied by Y˜ defined by Y = P6 Y˜ is
3.1 The normal variational equation 2
If we replace C with C = 3−w , the normal variational 3 system (5) has a particular solution : t (0, 1, 0, 0). We construct a symplectic matrix P4 (such that t P4 J4 P4 − J4 = 0 where J42 = −I4 ) with t (0, 1, 0, 0) on its first row like in previous section and we reduce the equivalent system to the following one : „ « 0 α(z) µ (z) = µ(z). β(z) 0
Y˜ (z) = V˜S(z) Y˜ (z) where V˜S(z) = P6 (z)−1 (V S(z) P6 (z) − P6 (z)). As the matrix V S is infinitesimally symplectic (i.e. J6 V S +t V S J6 = 0) and P6 is symplectic, the matrix V˜S is also infinitesimally symplectic ([15]). Furthermore as Wk is a solution of the variational system (i.e. V S(z) Wk (z) − Wk (z) = 0), the first row of the matrix V˜S(z) is equal to zero. So V˜S(z) can be written in the following form: 0 B B B ˜ V S(z) = B B @
×
0 0 0 0 0 0
×
A1 (z) 0 0 A3 (z)
where
1
× ×
× × × 0 × ×
A2 (z) 0 0 − A1 (z) t
α(z) =
C C C C C A
β(z) =
(5)
The matrix N V S(z) has the following infinitesimal symplectic structure : „ « A1 (z) A2 (z) N V S(z) = A3 (z) A4 (z)
y (z) −
α (z) y (z) − α(z) β(z) y(z) = 0. α(z)
Using the two equalities
where A4 (z) = −t A1 (z), A2 (z) =t A2 (z), A3 (z) =t A3 (z) :
A1 (z) =
A2 (z) =
(B−1) (1+w z) k w B f (z)
0 z − k1+w w f (z) −1 k w B f (z)
A3 (z) = @
f (z)2 = −
!
!
we get the same linear differential equation as in [11] :
3 k w (w2 −3 B) f (z)
a(z)
3 (B−1) −w
−3 (B−1) w
b(z)
y (z) +
1 −
A
(w2 + 3 − 3 B) (z 2 + 1 − k2 ) + (w z + 1)2 − ξ k w f (z)
Proposition 1. When 0 < w2 < 3, the equation (6) has a formal solution with a logarithmic term at the point infinity if and only if w2 = 2 ξ. Proof. The exponents at infinity are ρ0 = −1 and ρ1 = 2. As they differ each other from an integer one can test whether there is a formal solution with logarithms using Frobenius method (chapter 16 of [7]). We first make the change of variable z = 1/t and we study the new equation around 0. We consider a formal series solution of the type X cn tn , ρ ∈ C tρ
SATELLITE WITH AXIAL SYMMETRY
We assume that two of the components of the inertia tensor are equal, let us say A and B, so the conditions (4) become : and 0 < w2 < 3
2w2 z 2 + 2wz − w2 k2 + w2 + 1 − ξ y(z) = 0. w2 (z 2 − k2 )(z 2 + 1 − k2 )
Instead of applying Kovacic’s algorithm like in [11], we are going to apply criterion 1.
In the following section we study the special case when the satellite has an axial symmetry (see also [2, 11, 5] for results in this case).
w2 3
(6)
3.2 Non complete integrability when w2 −2 ξ = 0
(B − 1) (3 B (k2 − z 2 ) + (w z + 1)2 ) b(z) = − k w B f (z)
A = B = 1, C = 1 −
(2 z 2 + 1 − 2 k2 ) z y (z) + 1 − k2 ) (z 2 − k2 )
(z 2
whose coefficients are now in Q(w, ξ, k)(z). In next subsection we find the already known result of [11] using criterion 1.
and a(z) =
(z 2 + 1 − k2 ) (z 2 − k2 ) k2
z (2 z 2 − 2 k2 + 1) f (z) = 2 f (z) (z + 1 − k2 ) (z 2 − k2 )
0 0
0 0
3.
2w2 z 2 + 2wz − k2 w2 + w2 + 1 − ξ . w k f (z)
According to [15] (or proposition 4.2 p.76 of [13]), this symplectic transformation enables us to apply our criterion directly on this reduced 2 × 2 linear differential system instead of applying it on the initial 4 × 4 system. We first transform it into a linear differential equation of order 2 namely the one satisfied by the first component y of µ:
One can extract from this system the following system called normal variational system ν (z) = N V S(z) ν(z).
−1 w k f (z)
.
n≥0
55
We plug all these expressions into the differential equation and we get a system of equations which are polynomial in w, ξ and linear in p0 , p1 . Under the constraint on w and ξ, we find the following exponential solutions : 1 − wz if and only if 2ξ − w2 = 2w2 k2 − w2 − 2 = 0; p − (4 wz − 4 + w2 ) (−4 wz + w2 + 4) if and only if 2ξ − w2 = (4kw + 4 + w2 )(4kw − 4 − w2 ) = 0. So there is an exponential solution if and only if w2 − 2ξ = 0 and (2w2 k2 − w2 − 2) (4kw + 4 + w2 ) (4kw − 4 − w2 ) = 0 which ends the proof.
and compute the linear recurrence relation satisfied by the coefficients cn . f0 (ρ + n) cn + f1 (ρ + n − 1) cn−1 + f2 (ρ + n − 2) cn−2 +f3 (ρ + n − 3) cn−3 + f4 (ρ + n − 4) cn−4 = 0 with f0 (n) = w2 (n − 2) (n + 1) , (whose roots are the exponents −1 and 2 at infinity) f1 (n) = −2 w,
We can now state : f2 (n) = −2 w2 k2 n2 + w2 k2 + w2 n2 − w2 − 1 + ξ,
Proposition 3. Under the constraints on the parameters, w, ξ ∈ IR, 0 < w2 < 3, the satellite with axial symmetry is not completely integrable along the solution xk with 0 < k < 1 when
f3 (n) = 0, f4 (n) = nw2 k2 (k − 1) (k + 1) (n + 1) .
w2 − 2ξ = 0.
According to proposition 7 of [3], there exists a formal solution at t = 0 with a logarithmic term if and only if the following determinant does not cancel when ρ = ρ0 = −1 : ˛ ˛ ˛ f1 (ρ + 2) f2 (ρ + 1) f3 (ρ) ˛ ˛ ˛ Fρ1 −ρ0 (ρ) = F3 (ρ) = ˛˛ f0 (ρ + 2) f1 (ρ + 1) f2 (ρ) ˛˛ . ˛ 0 f0 (ρ + 1) f1 (ρ) ˛
Proof. This proposition is directly deduced from the two previous ones and from criterion 1. So until now we have no answer for the family F defined by F= w2 w2 ,ξ= , 0 < w2 < 3}. 3 2 One can prove that in this case the connected component of the identity in the Galois group G of the equation (6) is abelian. Indeed, the fourth symmetric equation of (6) has a rational solution so the unique exponential solution to (6) is algebraic, G is a finite subgroup of the Borel group and G0 is the additive group so it is abelian. For the family F, we are going to use higher order variational equations ([13]). {(A, B, C, ξ), A = B = 1, C = 1−
We find F3 (−1) = 4w3 (w2 − 2ξ) so if 0 < w2 < 3, it does not cancel if and only if w2 − 2ξ = 0. Proposition 2. When 0 < w2 < 3, the equation (6) is irreducible if and only if 2ξ − w2 = 0 or (2w2 k2 − w2 − 2)(4kw + 4 + w2 )(4kw − 4 − w2 ) = 0. Proof. Since the equation is of order 2, it is irreducible if, and only if, it has no factor of order one i.e. no solution y such that y /y is in Q(w, ξ, z). Such solutions are called exponential solutions. There are many algorithms to find these solutions (see for example chapter 4 of [16]). Here we follow the ’classical’ method. We first compute the exponents at the singularities of the equation. Then for each singularity we construct sets of exponents which are equal up to integers and we keep the minimum value of each set.√ At the points s1 = k, s2 = i 1 − k2 , s3 = −s1 and s4 = −s2 the exponents are 0 and 12 so at each finite singularity we get two distinct sets of one single exponent and we keep all the exponents. At infinity, the exponents are −1 and 2 so we get one set of two exponents and we keep only the exponent −1. An exponential solution will be of the form (z − s1 )
e1
(z + s1 )
e2
(z − s2 )
e3
(z + s2 )
e4
3.3 Non complete integrability when w2 −2ξ = 0 We first make the change of variable z = 1t in equation (6) and we write the new equivalent equation like a companion system: Y1 (t) = A Y1 (t) (E1 ) with
„ A=
and
8 ˜(t) = − < a : ˜b(t) = −
0 1 a ˜(t) ˜b(t)
«
2 t2 k2 w2 −w2 t2 −4 wt−2 t2 −4 w2 2t2 (−1−t2 +k2 t2 )(kt−1)(kt+1)w2 t(1−2 k2 −2 k2 t2 +2 k4 t2 )
(−1−t2 +k2 t2 )(kt−1)(kt+1)
Then we consider the formal solutions at the point 0 of the higher order variational equations. As it is explained in [14], we stop if we find a logarithmic term in some solution. Then according to theorem 4 and lemma 3 of [14], we conclude to the non complete integrability of the Hamiltonian system. On a practical point of view, we follow the ideas of section 6 of [12]. The second order variational equation is
p(z)
where the degree of p is equal to −(e1 + e2 + e3 + e4 + e∞ ), ei is one of the previously selected exponents at the point si and e∞ is one of the previously selected exponent at the point ∞ (here e∞ = −1). As −(e1 +e2 +e3 +e4 −1) must be a natural integer, the possible exponential solutions are p0 + p1 z where p0 , p1 are unknown coefficients or (z − s1 )e1 (z + s1 )e2 (z − s2 )e3 (z + s2 )e4 where (e1 , e2 , e3 , e4 ) belongs to {(0, 0, 12 , 12 ), (0, 12 , 0, 12 ), (0, 12 , 12 , 0), ( 12 , 12 , 0, 0), ( 12 , 0, 12 , 0), ( 21 , 0, 0, 12 )}.
Y2 (t) = A Y2 (t) + B2 (E2 ) where B2 = A Y1 (t) and Y1 is a solution of (E1 ). We replace Y1 (t) with λ1 F1 (t) + λ2 F2 (t) where (F1 , F2 ) is a basis of formal solutions of (E1 ) at 0 and λ1 , λ2 ∈ C.
56
1 where Q(z) is a diagonal matrix whose diagonal is 1, f (z) , 1 , 1. f (z) As f (z)2 is rational (k2 f (z)2 = (z 2 + 1 − k2 ) (k2 − z 2 )), this transformation does not change the (non) virtual abelianity of the group. We get the equivalent system :
Using the method of variation of constant we look for λ2,1 (t) and λ2,2 (t) such that λ2,1 (t) F1 (t)+λ2,2 (t) F2 (t) is a solution of (E2 ). If there are values of λ1 , λ2 such that the residue in the formal series of λ2,1 (t) or λ2,2 (t) is not equal to zero then we stop as we found a logarithmic term in a solution of (E2 ) else we go on : for k ≥ 3, we define the kth order variational equation (Ek ) by Yk (t)
η (z) = M (z) η(z)
= A Yk (t) + Bk (Ek )
where
Bk−1
where Bk = A Yk−1 (t) + and Yk−1 (t) is a previously computed formal series solution of (Ek−1 ). We look for a solution Yk (t) = λk,1 (t) F1 (t) + λk,2 (t) F2 (t) of (Ek ). If for all λ1 , λ2 , the residues in the formal series of λk,1 (t) and λk,2 (t) are equal to zero then we go on else we stop. These computations lead to the following proposition.
0 B m3 (z) B M (z) = @ m6 (z) m7 (z) and
Proposition 4. Under the constraints on the parameters, w, ξ ∈ IR, 0 < w2 < 3, the satellite with axial symmetry is not completely integrable along the solution xk with 0 < k < 1 when w2 − 2ξ = 0. Proof. Following the method explained below, we find that there is a logarithm in some solution of the fourth order variational equation. More precisely, a basis of formal solutions of (E1 ) at 0 is 8 1 0 −1 1 2 2 1 −w2 −2 + 3w + 2w k12w t + h.o.t. > 2 > 3 t > > A F1 (t) = @ > > > > 1 1 2w2 k2 −w2 −2 > + + h.o.t. > 3 t2 12w2
t1 = w2 −6 B+3 w3 > > ( ) > > > t3 (−1+B)k(w6 −9 w2 B+3 w4 −9 w2 B 2 +27 B 2 −27 B ) > > t = > 2 > (−3 B+2 w2 +3)B(w2 −3)(−3 B+w2 +3) > < (w2 −3)(−3 B+w2 +3)Bt t3 = k(−1+B)w3 w2 −6 B+3 ( ) > > > 3 t2 B (w2 −3) > > t4 = − w2 w2 −6 B+3 > > ( ) > > 2 2 2 > > : t5 = − t (w +3)(−1+B)k(−3 B+w )w 2 2 w −3 −3 B+w +3 B ( )( )
We now extend the proof to the case of the magnetic satellite without axial symmetry.
SATELLITE WITHOUT AXIAL SYMMETRY
Now we assume that there is no axial symmetry i.e. B = 1 then the conditions (4) become w2 , 3
k (B−1) (1+w z) w (z 2 +1−k2 ) (k2 −z 2 ) B −k w B (z 2 +1−k2 ) (k2 −z 2 ) 1+w z −k w z (2 k2 −2 z 2 −1) (z 2 +1−k2 ) (k2 −z 2 ) 3 k w (w2 −3 B) 2 (w +3−3 B) (z 2 +1−k2 )+(w z+1)2 −ξ kw −3(B−1) w (1−B) k (3 B (k2 −z 2 )+(w z+1)2 ) . w B (z 2 +1−k2 ) (k2 −z 2 )
We detect formal solutions with logarithmic terms at the point infinity.
8 8(3w − 4) = − + h.o.t. t 3w
(C) : C = B −
1 m2 (z) 0 0 m5 (z) C C m4 (z) −m3 (z) A −m1 (z) 0
4.1 Formal solutions with logarithmic terms
where h.o.t. means ’higher order terms’. We consider the particular series solution Y1 (t) = λ1 F1 (t) + λ2 F2 (t) of (E1 ) and for k greater than 1, we compute a solution Yk (t) = λk,1 (t) F1 (t) + λk,2 (t) F2 (t) of (Ek ) until we get a logarithm for some value of (λ1 , λ2 ). At the fourth stage, we get, for λ1 = 0 and λ1 = 1,
4.
8 m1 (z) = > > > > > m 2 (z) = > > > > > m3 (z) = > > > < m4 (z) = m5 (z) = > > > > > > m6 (z) = > > > > m7 (z) = > > > : m (z) = 8
m1 (z) m4 (z) m7 (z) m8 (z)
One can notice that the symplectic structure has not been preserved (Q was not a symplectic matrix) but the coefficients of M (z) depend now rationally on the parameters and on z.
> > 0 2 1 > > 1 3 > t + h.o.t. t + 2w > > > @ A > > : F2 (t) = 3 2 2 t + 2w t + h.o.t.
λ4,1 (t)
0
(8)
w2 < 6B − 3 < 3 .
To our knowledge, this case has not been studied before. We work with the normal variational system (5). We make the following transformation
The matrix T is well defined as under the constraints on the parameters, one can check that B − 1, w2 − 6 B + 3, w2 − 3, −3 B + w2 + 3 and −3 B + 2w2 + 3 do not cancel. The normal variational system (8) is equivalent to a system of the type
ν(z) = Q(z) η(z)
t η (t) = (A0 + A1 t + · · · ) η(t)
(7)
57
There exists a matrix P0 such that P0−1 A0 P0 is the diagonal matrix with −1, −2, −3, −4 on its diagonal. We do not give the expression of the matrix P0 . Its determinant is: ` ´3 det(P0 ) = w2 − 6 B + 3 w6 k2 (−1 + B)3 ´3 ` 4 3 w B + w4 − 9 w2 B + 6 w2 − 9 w2 B 2 + 9 + 27 B 2 − 36 B ` ´ ´ ` ´ ` ` 4 4 / 12B 6 w2 − 3 −3 B + w2 + 3 −3 B + 2 w2 + 3 ´2 ´ ` 2 w − 3B .
where B0 is an upper-triangular matrix whose eigenvalues are λ1 − d1 , λ2 − d2 , . . . , λn − dn . Proof. The system satisfied by Z is of the type t Z (t) = [D−1
∞ X
Ak D tk − t D−1 D ] Z(t)
k=0
i.e. ∞ X D−1 Ak D tk − diag(d1 , d2 , . . . , dn )] Z(t) t Z (t) = [
Using Mathematica, one can prove that the determinant of P0 never cancels under the constraints (C) on the parameters. We introduce the variable W = w2 and we find that 3 W 2 B + W 2 − 9 W B + 6 W − 9 W B 2 + 9 + 27 B 2 − 36 B is always positive under the constraints W > 0, 6 B − 3 − W > 0, 1 − B > 0 :
k=0
Furthermore, for all k, the coefficients of the matrix D−1 Ak D (k) (k) are the ai,j tdj −di where the ai,j are the coefficients of the matrix Ak . So the orders at 0 of the non zero coefficients of the matrix D−1 Ak D are at least −1 according to (H1 ). For k ≥ 2 the non zero coefficients of the matrix D−1 Ak D tk have orders greater than 0. If k = 1, the non zero coefficients of the matrix D−1 Ak D tk have orders at least 0 and those of orders 0 are on the upper part of the matrix according to (H1 ). Lastly, when k = 0, then D−1 Ak D tk remains a uppertriangular matrix with a main diagonal λ1 , λ2 , . . . , λn and a second upper diagonal with coefficients 4i,i+1 tdi+1 −di which are in {0, 1} according to (H2 ). So the system satisfied by Z is of the type t Z (t) = (B0 + B1 t + · · · ) Z(t) where the matrix B0 is 1 0 λ1 − d1 ∗ · · · ∗ C B .. .. B0 = @ A . . 0 λn − dn
Experimental‘ImpliesRealQ[{W>0,6*B-3-W>0,1-B>0}, {9*W*B^2-27*B^2+9*W*B+36*B-3*W^2*B-9-6*W-W^2>0}] true
So the system (8) is rationally equivalent to the following system t η (t) = (A0 + A1 t + · · · ) η(t) where the matrix A0 is 0
1 −1 0 0 0 B 0 −2 0 0 C C A0 = B @ 0 0 −3 0 A 0 0 0 −4
According to Theorem 5.1 of [16], as the eigenvalues of A0 differ each other from integers, we can make a transformation η = P η to get an equivalent system of the type
and where the terms ∗ come from the matrix A1 (case k = 1) and from the second diagonal of A0 (case k = 0). Thanks to lemma 2 we establish the following proposition :
t η (t) = (B0 + B1 t + · · · ) η(t)
Proposition 5. Under the conditions (C) on the parameters, the system (8) is equivalent to a system of the type
where the eigenvalues of the matrix B0 are 0, 0, 0, 0. Furthermore, the existence of logarithmic terms will be directly given by the shape of the matrix B0 . To get this transformation matrix P , we will need this small lemma :
t η (t) = (B0 + B1 t + · · · ) η(t) If 2 w2 (w2 − 2 ξ) (3 B − w2 − 3) − 9 ξ 2 (B − 1) = 0 then
Lemma 2. Let t Y (t) = A(t) Y (t) be a n × n linear differential system where
0
0 B 0 B B0 = @ 0 0
A(t) = A0 + A1 t + · · · is the development of A(t) at 0 and where the matrix A0 is a Jordan matrix with eigenvalues λ1 , λ2 , . . . , λn : 0 1 λ1 41,2 0 · · · 0 B 0 C 0 λ2 0 · · · B C A0 = B . C @ .. 4n−1,n A 0 0 λn
1 0 0 0
0 0 0 0
1 0 0 C C. 1 A 0
If 2 w2 (w2 − 2 ξ) (3 B − w2 − 3) − 9 ξ 2 (B − 1) = 0 then 0
0 B 0 B0 = B @ 0 0
with 4i,i+1 ∈ {0, 1}. Let D be a diagonal matrix D = diag(td1 , . . . , tdn ) with
1 0 0 0
0 0 0 0
1 0 0 C C. 0 A 0
In particular, there are formal solutions with a logarithmic term at infinity for all values of the parameters satisfying (C). Proof. Step 1. We increase the three first eigenvalues of A0 of 3 and the fourth of 4 : 0 −3 1 t 0 0 0 B 0 t−3 0 0 C C D1 = B @ 0 0 A 0 t−3 0 0 0 t−4
(H1 ) 0 ≤ d1 − d2 ≤ d1 − d3 ≤ . . . ≤ d1 − dn ≤ 1 and (H2 ) 4i,i+1 = 1 ⇒ di = di+1 Let Z defined by Y = D Z. It satisfies the system t Z (t) = (B0 + B1 t + · · · ) Z(t)
58
and we conclude like previously. Lastly, we assume
We get a new matrix A0 which we diagonalize with a transformation matrix P1 whose determinant is : det(P ` 1) = ´ − 3 w4 B + w4 − 9 w2 B + 6 w2 − 9 w2 B 2 + 9 + 27 B 2 − 36 B ´ ` ´2 ´ ` ` 2 w − 6 B + 3 (−1 + B) / 12B 2 w6 w2 − 3 B The new matrix A0 becomes diagonal with 2, 1, 0, 0 on its diagonal. Step 2. We decrease the two first eigenvalues of A0 of 1 and keep the two other ones : 1 0 t 0 0 0 B 0 t 0 0 C C D2 = B @ 0 0 1 0 A 0 0 0 1
2 w4 − 4 w2 ξ − 3 ξ 2 = 2 w6 + 6 w4 − 4 w4 ξ − 12 w2 ξ − 9 ξ 2 = 0. This is impossible as it leads to w = ξ = 0.
4.2 Irreducibility In this part, we fix k in order to simplify the computations. The parameter k is just used to parameterized the particular solution of the normal variational system, so we can fix it in ]0, 1[. We choose k = 35 . As the normal variational system does not have any symplectic structure, we choose to use a cyclic vector transformation (vector [1, 0, 0, 0]) and to work with an equivalent scalar linear differential equation. We do not detail all the computations.
The determinant of the matrix P2 is : det(P2 ) =
(−3 B +
18w2 + 3) (−1 + B)
w2
• Factors of degree 1. We look for exponential solutions. We need the minimum of the exponents up to an integer at the singularities. We find - at the four finite singularities: 0 and 12 ; -at the point infinity : −1 So if there is an exponential solution, then it is one of these expressions : q q
and the matrix A0 becomes an upper-triangular matrix : 1 0 1 0 0 0 B 0 0 0 0 C C A0 = B @ 0 0 0 1 A 0 0 0 0 Step 3. We decrease the first three other ones. 0 t B 0 B D3 = @ 0 0
eigenvalue of 1 and keep the 0 1 0 0
0 0 1 0
(z − 35 ) (z + i 45 ) , q (z − 35 ) (z − i 45 ) , (z + 35 ) (z − i 45 ) , q q (z + 35 ) (z + i 45 ) , (z + i 45 ) (z − i 45 ) or a0 + a1 z where a0 and a1 are unknown coefficients.
1
q
0 0 C C 0 A 1
The determinant of the matrix P3 is :
Plugging the six first expressions in the differential equation we get polynomial systems and prove they have no solution under the constraints C on the parameters. Plugging a0 + a1 z we get a linear system of size 8 × 2 with coefficients depending on w, B, ξ, we prove that its rank is 2 under the constraints on the parameters, so a0 = a1 = 0. To conclude, there is no non zero exponential solution and no factor of degree one.
2 w2 (w2 − 2 ξ) (3 B − w2 − 3) − 9 ξ 2 (B − 1) . 18w4 The determinant of P3 may cancel. We first assume det(P3 ) = −
2 w2 (w2 − 2 ξ) (3 B − w2 − 3) − 9 ξ 2 (B − 1) = 0 then the matrix P3 is invertible 0 0 B 0 B A0 = B0 = @ 0 0
and the matrix A0 becomes 1 1 0 0 0 0 0 C C 0 0 1 A 0 0 0
• Factors of degree 2. To compute the factors of degree two, one needs to look for rational solutions of the second exterior system, Y (z) = Λ2 (M )(z) Y (z). We compute an equivalent linear differential equation of order 6 and find that the only possible rational solution is (z − 35 ) (z + 35 ) (z + i 45 ) (z − i 45 ). Plugging this expression in the second exterior scalar equation we get a polynomial system in the variables B, w, ξ. We pick up only four equations among the 33 equations and we choose to use the resultant to prove this polynomial system has no solution under the constraints on the parameters.
According to Theorem 5.1 of [16], as B0 is a matrix whose eigenvalues λ satisfy 0 ≤ Re(λ) < 1, the local monodromy around 0 for the system (∗) is e2πiB0 . So in both cases, as the matrix B0 is a Jordan matrix which is not diagonal, there are formal solutions at infinity with logarithmic terms for the normal variational system . Now we assume that 2 w4 − 4 w2 ξ − 3 ξ 2 = 0 and that B=
(z − 35 ) (z + 35 ) ,
• Factors of degree 3. The normal variational system has a factor of degree three if and only if its adjoint system has a factor of degree one. Let us notice here that we have lost the symplectic structure due to the transformation (7) so we cannot link factors of degree one and factors of degree three. We construct a linear differential equation associated to this adjoint system using the cyclic vector approach. If there is an exponential solution, then it is one of these : q
1 2 w6 + 6 w4 − 4 w4 ξ − 12 w2 ξ − 9 ξ 2 . 3 2 w4 − 4 w2 ξ − 3 ξ 2
In this case steps 1 and 2 remain unchanged and step 3 changes. We keep the matrix D3 and compute a new matrix P3 . Its determinant is −1 and the new matrix A0 is 0 1 0 1 0 0 B 0 0 0 0 C C A0 = B0 = B @ 0 0 0 0 A 0 0 0 0
(z − 35 ) (z + 35 ) (z + i 45 ) (z − i 45 ),
59
q
q
(z − 35 ) (z + i 45 ) (a0 + a1 z), q (z − 35 ) (z − i 45 ) (a0 +a1 z), (z + 35 ) (z − i 45 ) (a0 +a1 z), q q (z + 35 ) (z + i 45 ) (a0 +a1 z), (z + i 45 ) (z − i 45 ) (a0 +a1 z)
q
(z − 35 ) (z + 35 ) (a0 + a1 z),
[4] D. Boucher and J.-A. Weil. Application of J.-J. Morales and J.-P. Ramis’ theorem to test the non complete integrability of the planar three-body problem. In IRMA Lectures in Mathematics and Theoretical Physics, From Combinatorics to Dynamical Systems, volume 3, 2003. [5] D. Boucher. Non complete integrability of a satellite in circular orbit. submitted to Portugaliae Mathematica, to appear. [6] R.-C. Churchill. Galoisian obstructions to the integrability of hamiltonian systems. In The Kolchin Seminar in Differential Algebra. Department of Mathematics, City College of New York, 1998. [7] E.-L. Ince. Ordinary Differential Equations. Dover Publications, INC New York, 1956. [8] A. J. Maciejewski. Non-integrability in gravitational and cosmological models. introduction to ziglin theory and its differential Galois extension. The Restless Univers. Applications of Gravitational N-Body Dynamics to Planetary, Stellar and Galatic Systems, pages 361–385, 2001. [9] A.-J. Maciejewski. Non-integrability of a certain problem of rotational motion of a rigid satellite. Dynamics of Natural and Artificial Celestial Bodies, pages 187–192, 2001. [10] A.-J. Maciejewski and K. Gozdziewski. Numerical evidence of nonintegrability of certain lie-poisson system. Reports on Mathematical Physics, 44, 1999. [11] A.-J. Maciejewski and M. Przybylska. Non-integrability of the problem of a rigid satellite in gravitational and magnetic fields. Celestial Mech., pages 317–351, 2003. [12] A.-J. Maciejewski, M. Przybylska, and J.-A. Weil. Non-integrability of the generalized spring-pendulum problem. Journal of Physics A: Mathematical and general, 37:2579–2597, 2004. [13] J.-J. Morales-Ruiz. Differential galois theory and non-integrability of hamiltonian systems. Ferran Sunyer i Balaguer Award Winning Monograph, 1999. [14] J.-J. Morales-Ruiz. Kovalevskaya, liapunov, painlev, ziglin and the differential galois theory. Regular Chaotic Dynamic, 5:251–72, 2000. [15] J.-J. Morales-Ruiz and J.-P. Ramis. Galoisian obstructions to integrability of hamiltonian systems i, ii. Methods Appl. Anal., 8(1):33–95, 97–111, 2001. [16] Put . M. van der and M.-F. Singer. Galois theory of linear differential equations. Grundlehren der Mathematischen Wissenschaften. 328. Berlin: Springer, 2003.
or a0 + a1 z + a2 z 2 where a0 , a1 and a2 are unknown coefficients.
Plugging all these expressions into the adjoint equation, we find polynomial systems with no solution under the constraints (C). We can now state : Proposition 6. Under the constraints (C) on the parameters and with k = 35 , the system (8) is irreducible.
4.3 Conclusion Proposition 7. Under the constraints (C), the magnetic satellite without axial symmetry is not completely integrable along the solution xk with k = 35 . Proof. We use sections 4.1, 4.2 and criterion 1. Let us notice that the restriction k = 35 has been made only in order to simplify the big computations of section 4.2. The aim was to find one solution along which we can compute a normal variational system whose Galois group is not virtually abelian this is why we could make this restriction.
5.
CONCLUSION
We gave a proof of the non complete meromorphic integrability of the satellite (a rigid body moving in a circular orbit around a fixed gravitational and magnetic center) with and without axial symmetry. This result (propositions 3, 4 and 7) completes the results of [9], [1] and [10] on the non complete integrability of the satellite with axial symmetry. Furthermore it was obtained thanks to criterion 1 (established and used in [4]) which we hope to be useful for other problems on complete integrability, and thanks to the new theorem of [14] on higher order variational equations.
6.
ACKNOWLEDGMENTS
I thank Mich`ele Audin for discussions about the satellite with axial symmetry (section 3), Marius Van der Put for discussions about section 4.1, Andrzej Maciejewski for corrections on a previous version; Felix Ulmer and the referees for their comments.
7.
REFERENCES
[1] M. Audin. Les syst`emes hamiltoniens et leur int´egrabilit´e. Cours Sp´ecialis´es, SMF et EDP Sciences., 2001. [2] M. Audin. La r´eduction symplectique appliqu´ee ` a la non-int´egrabilit´e du probl`eme du satellite. Annales de la Facult´e des Sciences de Toulouse, 12:25–46, 2003. [3] D. Boucher. Sur les ´equations diff´erentielles lin´eaires param´etr´ees, une application aux syst`emes hamiltoniens. Th`ese de l’Universit´e de Limoges, October 2000.
60
Symmetric and Semisymmetric Graphs Construction Using G-graphs Alain Bretto
∗
Universite´ de Caen, GREYC CNRS UMR-6072, Campus II Bd Marechal Juin BP 5186, 14032 Caen cedex Caen, France
Luc Gillibert
Bernard Laget
Universite´ de Caen, GREYC CNRS UMR-6072, Campus II Bd Marechal Juin BP 5186, 14032 Caen cedex Caen, France
Ecole Nationale d’Ingénieurs de Saint-Etienne, DIPI 58, rue Jean Parot, 42023 Saint-Etienne Cedex 02. France.
[email protected] [email protected] ABSTRACT
graphs have very nice highly-regular properties. But Cayley graphs are always vertex-transitive, and that can be a limitation. In this article we present and use G-graphs introduced in [4, 5]. G-graphs, like Cayley graphs, have highlyregular properties, consequently G-graphs are a good alternative tool for constructing some symmetric graphs. After the definition of these graphs we give a characterization of bipartite G-graphs. Then, using this characterization, we build a powerful algorithm based on G-graphs for computing symmetric graphs. The classification of symmetric graphs is a very interesting problem. Ronald M. Foster started collecting specimens of small cubic symmetric graphs prior to 1934, maintaining a census of all such graphs. In 1988 the current version of the census was published in a book containing some graphs up to the order 512 [7]. But symmetric graphs are not the only interesting graphs. There exist regular graphs which are edge-transitive but not vertex-transitive [6], they are called semisymmetric graphs, and it is quite difficult to construct them [14, 11]. Indeed, Cayley graphs are always regular and vertex-transitive, so they cannot be semisymmetric, but G-graphs can be either regular or non-regular, vertextransitive or not vertex-transitive. In this paper we exhibit an efficient algorithm, based on G-graphs, constructing cubic semisymmetric graphs. So, with G-graphs, it becomes easy not only to extend the The Foster Census up to order 800, but also to construct cubic semisymmetric graphs, quartic symmetric and semisymmetric graphs, quintic symmetric and semisymmetric graphs and so on.
Symmetric and semisymmetric graphs are used in many scientific domains, especially parallel computation and interconnection networks. The industry and the research world make a huge usage of such graphs. Constructing symmetric and semisymmetric graphs is a large and hard problem. In this paper a tool called G-graphs and based on group theory is used. We show the efficiency of this tool for constructing symmetric and semisymmetric graphs and we exhibit experimental results.
Categories and Subject Descriptors G.2.2 [Discrete Mathematics]: Graph Theory—Graph algorithms
General Terms Algorithms, Theory
Keywords Symmetric graphs, semisymmetric graph, graphs from group, G-graphs.
1.
[email protected] INTRODUCTION
A graph that is both edge-transitive and vertex-transitive is called a symmetric graph. Such graphs are used in many domains, for example, the interconnection network of SIMD computers. But to construct them is a very hard problem. Usually, the graphical representation of groups is used for the construction of those graphs. A popular representation of a group by a graph is the Cayley representation. A lot of work has been done about these graphs [3, 10]. Cayley
2. BASIC DEFINITIONS We define a graph Γ = (V ; E; ) as follows:
∗Authors by alphabetical Order.
• V is the set of vertices and E is the set of edges. • is a map from E to P2 (V ), where P2 (V ) is the set of subsets of V having 1 or 2 elements.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ISSAC’05, July 24–27, 2005, Beijing, China. Copyright 2005 ACM 1-59593-095-7/05/0007 ...$5.00.
In this paper graphs are finite, i.e., sets V and E have finite cardinalities. For each edge a, we denote (a) = [x; y] if (a) = {x, y} with x = y or (a) = {x} = {y} if x = y. If x = y, a is called loop. The set {a ∈ E, (a) = [x; y]} is called multiedge or p-edge, where p is the cardinality of the set. We define the degree of x by deg(x) = card({a ∈ E, x ∈
61
(a)}). The line graph of a graph G is the graph obtained by associating a vertex with each edge of G and connecting two vertices with an edge if and only if the corresponding edges of G shares an extremity. In this paper, groups are also finite. We denote the unit element by e. Let G be a group, and let S = {s1 , s2 , . . . , sk } be a nonempty subset of G. S is a set of generators of G if any element θ ∈ G can be written as a product θ = si1 si2 si3 . . . sit with i1 , i2 , . . . it ∈ {1, 2, . . . , k}. We say that G is generated by S = {s1 , s2 , . . . , sk } and we write G = s1 , s2 , . . . , sk . Let H be a subgroup of G. We denote by Hx a right coset of H (with respect to x) in G. A subset TH of G is said to be a right transversal for H if {Hx, x ∈ TH } is precisely the set of all cosets of H in G.
3.
(e a2 a4)
(e a3)
(a a3 a5)
(a a4)
(a2 a5)
3.1 Algorithmic procedure The following algorithm constructs a G-graph from the list L of the cycles of a group: Group_to_graph_G(L) for all s in L Add s to V for all s’ in L for all x in s for all y in s’ if x=y then
GROUP TO GRAPH PROCESS
Let (G, S) be a group with a set of generators S. For any s ∈ S, we consider the left action of the F subgroup H = s on G. Thus, we have a partition G = x∈Ts sx, where Ts is a right transversal of s. The cardinality of s is o(s) where o(s) is the order of the element s. Let us consider the cycles
Add (s,s’) to E
For the construction of the cycles we use the following algorithm, written in the GAP programming language [12]: InstallGlobalFunction ( c_cycles, function(G, ga) local ls1,ls2,gs,k,x,oa,a,res,G2; res:=[]; G2:=List(G); for a in ga do gs:=[]; oa:=Order(a)-1; ls2:=Set([]); for x in G do if not(x in ls2) then ls1:=[]; for k in [0..oa] do; Add(ls1, Position(G2, (a^k)*x)); AddSet(ls2, (a^k)*x); od; Add(gs, ls1); fi; od; Add(res, gs); od; return res; end);
(s)x = (x, sx, s2 x, . . . , so(s)−1 x) of the permutation gs : x −→ sx. Notice that sx is the support of the cycle (s)x. One cycle of gs contains the unit element e, namely (s)e = (e, s, s2 , . . . , so(s)−1 ). We now define a new graph denoted by Φ(G; S) = (V ; E; ) as follows: • The vertices of Φ(G; S) are the cycles of gs , s ∈ S, i.e., V = s∈S Vs with Vs = {(s)x, x ∈ Ts }. • For all (s)x, (t)y ∈ V , {sx, ty} is a p-edge if: card(sx ∩ ty) = p, p ≥ 1 Thus, Φ(G; S) is a k-partite graph and any vertex has a o(s)˜ loop. We denote Φ(G; S) the graph Φ(G; S) without loops. By construction, one edge stands for one element of G. We can remark that one element of G labels several edges. Both ˜ graphs Φ(G; S) and Φ(G; S) are called graph from a group or G-graph and we say that the graph is generated by the group (G; S). Finally, if S = G, the G-graph is called a canonic graph. Example: Let G be the cyclic group of order 6, G = {e, a, a2 , a3 , a4 , a5 }, it is known that G can be generated by an element of order 3 and an element of order 2. Let S be {a2 , a3 }, (a2 )3 = (a3 )2 = e. The cycles of the permutation ga2 are:
4. PROPERTIES OF THE G-GRAPHS We now introduce some useful results: Proposition 1. Let Φ(G; S) = (V ; E; ) be a G-graph. This graph is connected if and only if S is a generator set of G.
(a2 )e = (e, a2 e, a4 e) = (e, a2 , a4 )
Proof. If card(S) = 1, G = s. The graph has only one vertex. So it is connected. AAssume that card(S) ≥ 2. Let (s)x ∈ Vs and (s )y ∈ Vs , because G = S, there exists s1 s2 s3 . . . sn ∈ S such that y = s1 s2 s3 . . . sn x.
(a2 )a = (a, a2 a, a4 a) = (a, a3 , a5 ) The cycles of ga3 are: (a3 )e = (e, a3 )
x ∈ sx ∩ sn x
(a3 )a = (a, a3 a) = (a, a4 )
sn x ∈ sn x ∩ sn−1 sn x
(a3 )a2 = (a2 , a3 a2 ) = (a2 , a5 )
sn−1 sn x ∈ sn−1 sn x ∩ sn−2 sn−1 sn x
˜ The graph Φ(G; S) is the following:
...
62
that for all t ∈ {1, 2, . . . , p − 1}, for all k ∈ {1, 2, . . . , p − 1} we have:
s2 . . . sn x ∈ s2 s3 . . . sn x ∩ s1 s2 . . . sn x y ∈ s1 s2 . . . sn x ∩ s y
st1 = sk2 ,
Consequently there exists a chain from (s)x ∈ Vs to (s )y ∈ Vs . So Φ(G; S) is a connected graph. Conversely, let x ∈ G. There exists si1 ∈ S and x1 ∈ Tsi1 such that x ∈ si1 x1 , with x = sti11 x1 . The graph is connected so there exists a chain from (si1 )x1 to (sik )e:
because if st1 = sk2 , p being a prime number, sk2 is generator of s1 , and the following equality becomes true: st1 = s1 = sk2 = s2 = s1 , s2 = G Consequently the only edge between (s1 )e and (s2 )e is the edge corresponding to e. More generally, let (s1 )x and (s2 )y be two cycles. If x = y, let us suppose that there exists t ∈ {1, 2, . . . , p − 1}, and k ∈ {1, 2, . . . , p − 1} such that st1 x = sk2 y. We have st1 = sk2 , and that led us to the first case. So there can be only one edge between (s1 )x and (s2 )y: the edge corresponding to the element x. Let us consider the case where x and y are different. If there is a multi-edge between (s1 )x and (s2 )y, then st1 x = sk2 y and sl1 x = si2 y. We can suppose that l = t + n and i = k + m. So we have the two following equalities:
t
x = sti11 x1 , x1 = sti22 x2 , . . . , xk−1 = sikk xk With xk = e, so x = sti11 sti22 . . . stikk . Proposition 2. Let h be a morphism between (G1 , S1 ) and (G2 , S2 ), then there exists a morphism, Φ(h), between Φ(G1 ; S1 ) and Φ(G2 ; S2 ). Proof. We define Φ(h) = φ = (f, f # ) in the following way: • f : s∈S1 V1,s −→ s∈S2 V2,s
t sl1 x = st+n x = sn 1 (s1 x) 1
(s)x −→ (h(s))h(x)
k y = sm sl1 x = si2 y = sk+m 2 (s2 y) 2
#
• f : E1 −→ E2
t m k t k n So sn 1 (s1 x) = s2 (s2 y), but s1 x = s2 y, consequently s1 = m s2 , and that led us to the first case.
([(s)x, (t)y]; u) −→ ([(h(s))h(x), (h(t))h(y)]; h(u))
We will use this well-known result:
It is easy to verify that φ is a morphism from Φ(G1 ; S1 ) to Φ(G2 ; S2 ). So, any group morphism gives rise to a graph morphism.
Theorem 2. [14] Let Γ be a simple graph. Then Aut1 (Γ) Aut∗ (Γ) if and only if
For abelian groups we have the following:
(a) not both G1 and G2 , are components of Γ
Theorem 1. Let G1 and G2 be two abelian groups. These two groups are isomorphic if and only if Φ(G1 ; G1 ) and Φ(G2 ; G2 ) are isomorphic. Proof. It has been proved that group isomorphism leads to graph isomorphism. Now, suppose that Φ(G1 ; G1 ) is isomorphic to Φ(G2 ; G2 ). These two graphs have the same degree sequence. Hence the two groups have the same number of elements of the same order. It is known that two abelian groups are isomorphic if and only if they have the same number of elements of the same order. That leads to our assertion.
G1
G2
(b) and none of the graphs Gi , i ∈ {3, 4, 5}, is a component of Γ. x1
x4
x2
x2
We also have: x3
Proposition 3. Let Γ be a connected bipartite and regular G-graph of degree p, p being a prime number, then either Γ is simple or Γ is of order 2.
x4 G3
x1
x1
x3 G4
x2 x4
x3 G5
We are now in position to characterize the bipartite Ggraphs:
Proof. The graph Γ is bipartite and regular of degree p, ˜ so Γ = Φ(G, {s1 , s2 }) with s1 and s2 two different elements of order p. But Γ is a connected graph, so the family {s1 , s2 } generates the group G, in other words G = s1 , s2 . We can notice that the groups s1 and s2 are isomorphic to the cyclic group of order p called Cp . If s1 and s2 are not different we have:
Theorem 3. Let Γ = (V1 , V2 ; E) be a bipartite connected semi-regular simple graph. Let (G, {s1 , s2 }) be a group with o(s1 ) = deg(x), x ∈ V1 and o(s2 ) = deg(y), y ∈ V2 . The three following properties are equivalent: ˜ (i) The graph Γ = (V1 , V2 ; E) is a G-graph, Φ(G; {s1 , s2 }).
s1 = s2 = s1 , s2 = G
(ii) The line graph L(Γ) is a Cayley graph Cay(H; A), (where A = a1 ∪ a2 \ {e}) with (G, {s1 , s2 }) (H, {a1 , a2 })
Therefore Γ is the graph of the cyclic group Cp generated by a family S, with S containing two elements of order p, so the order of the graph Γ is 2. Now let us consider the case where s1 and s2 are different. It is equivalent to saying
(iii) The group G is a subgroup of Aut∗ (Γ) which acts regulary on the set of edges E.
63
˜ Proof. Suppose that Γ is a G-graph Φ(G; {s1 , s2 }). We show that (iii) is true. An edge stands for a unique element of G and an element of G stands for a unique edge. So it is sufficient to show that the action (left multiplication) of G ˜ on itself preserves the graph Φ(G; {s1 , s2 }). Let e1 and e2 be ˜ two adjacent edges of Φ(G; {s1 , s2 }) and let g −1 ∈ G. The images of e1 and e2 are e1 .g −1 and e2 .g −1 . Because e1 and e2 are adjacent we have e1 = sk .u and e2 = sl .u. Hence, e1 .g −1 = sk .u.g −1 = sk .v and e2 .g −1 = sl .u.g −1 = sl .v. Consequently, e1 .g −1 and e2 .g −1 are adjacent and g induced ˜ an automorphism of Aut∗ (Φ(G; {s1 , s2 })). By construction we can remark that, for all x ∈ V1 , an edge ex incident to x is adjacent to s1 .ex , and for all y ∈ V2 , an edge ey incident to y is adjacent to s2 .ey . Assume now that (iii) is true and show that the line-graph of L(Γ) is a Cayley graph verifying the properties of (ii) Let L(Γ) = (L(V ); L(E)) be the line-graph of Γ. It is easy to see that Aut1 (Γ) Aut(L(Γ)). Moreover, from Theorem 2, we have Aut1 (Γ) Aut∗ (Γ), that leads to Aut∗ (Γ) Aut(L(Γ)). Consequently the action of Aut∗ (Γ) on E is equivalent to the action of Aut(L(Γ)) on E. Hence Aut(L(Γ)) contains a subgroup (H, {a1 , a2 }) isomorphic to (G, {s1 , s2 }) which acts regularly on the set of vertices of L(Γ) which characterize the fact that L(Γ) is a Cayley graph. Moreover it is easy to see that o(a1 )−1
A = {a1 , a21 , . . . , a1
o(a2 )−1
, a2 , a22 , . . . , a2
are sorted by their orders and they are listed up to isomorphism. For computing the list of the groups generated by two elements of order 4 we use the following algorithm: result=[]; for all g, group of order n order4=[]; for all x in g if order(x)=4 then add x to order4 end if end for all for all x1 in order4 for all x2 in order4, x2>x1 if <x1,x2>=g add (g,<x1,x2>) to result end for all end for all end for all return result; After the list is established, it is easy to generate all the corresponding G-graphs and to compute their automorphism group with Nauty [17]. If there is only one orbit in the vertex automorphism group, then the graph is vertex-transitive and symmetric. Otherwise, the graph is semisymmetric and there are two orbits, because there is a theorem [14] which affirms that every semisymmetric graph is bipartite. With that algorithm we establish a list of almost all quartic symmetric graphs up to the order 126. The following tables are not exhaustive. When two or more groups give isomorphic quartic graph the name of the two groups are given in the column G. For more information the full tables are on-line at: http://users.info.unicaen.fr/˜bretto (in Publications).
}
Let us suppose that (ii) is true. From L(Γ) we are going to build Γ. For all u ∈ Ta1 , {ai1 .u; aj1 .u} ∈ L(E), 0 ≤ i < j ≤ o(a1 ) − 1 and for all v ∈ Ta2 , {ak2 .v; al2 .v} ∈ L(E), 0 ≤ k < l ≤ o(a2 ) − 1. We have a bijection from H = L(V ) on E such that two elements of L(V ) are adjacent if and only if the corresponding edges in Γ are adjacent. Let us denote o(a )−1 (a1 ).u = (u, a1 .u, a21 .u, . . . , a1 1 .u), for all u ∈ Ta1 , and o(a )−1 (a2 ).v = (v, a2 .v, a22 .v, . . . , a1 2 .v), for all v ∈ Ta2 . Now, let us put an edge between (a1 ).u and (a2 ).v if and only if card(a1 .u ∩ a2 .v) = 1. By construction this graph is a graph which has L(Γ) as line-graph and it is a H-graph. Moreover it has been shown that, if (G; S) (H; A) then ˜ (G; S) Φ ˜ (H; A). Φ
5.
THE CONSTRUCTION OF SYMMETRIC AND SEMISYMMETRIC GRAPHS
5.1 Quartic G-graphs Let G be a group of order 4n, G = C4 , and S a family such that G =< S > and S = {a, b}, with a4 = e and ˜ S) is bipartite, edge-transitive b4 = e. Then the graph Φ(G; and quartic. So there are two possibilities: ˜ 1. Φ(G; S) is vertex-transitive, so it is a symmetric quartic graph; ˜ 2. Φ(G; S) is not vertex-transitive, so it is a semisymmetric quartic graph. Therefore G-graphs are a very interesting tool for constructing quartic edge-transitive graphs, especially semisymmetric graphs. We establish a list of all small groups G of order 4n such that G is generated by two elements of order 4. For that we use GAP and the SmallGroups library. This library gives access to all groups of certain small orders. The groups
64
O(Γ) 8 10 12 16
Quartic O(G) 16 20 24 32
16 18 20 24
32 36 40 48
26 30 32 32 32
52 60 64 64 64
34 .. . 120 120 120 120 120
68 .. . 240 240 240 240 240
122 126
244 252
symmetric simple G-graphs G O(Aut(Γ)) sg-16-2, sg-16-3, sg-16-4 1152 sg-20-3 240 sg-24-12 768 sg-32-2, sg-32-10 4096 sg-32-11, sg-32-13 sg-32-6 384 sg-36-9 144 sg-40-12 80 sg-48-11, sg-48-12 98304 sg-48-19, sg-48-48 sg-52-3 104 sg-60-7 720 sg-64-18 4096 sg-64-34 256 sg-64-8, sg-64-21 2097152 sg-64-39 sg-64-41, sg-64-48 sg-68-3 136 .. .. . . sg-240-189 960 sg-240-120 480 sg-240-121 480 sg-240-124 480 sg-240-72, sg-240-73 1.3835e20 sg-240-80, sg-240-197 sg-244-3 488 sg-252-32 1008
Quartic semisymmetric simple G-graphs O(Γ) O(G) G O(Aut(Γ)) 24 48 sg-48-30 3072 32 64 sg-64-9, sg-64-20 294912 sg-64-23, sg-64-32 sg-64-33, sg-64-35 48 96 sg-96-185, sg-96-186 3145728 sg-96-194, sg-96-195 64 128 sg-128-122, sg-128-134 25165824 sg-128-136, sg-128-139 sg-128-141 64 128 sg-128-144, sg-128-146 2048 64 128 sg-128-26, sg-128-71 268435456 sg-128-72, sg-128-75 sg-128-76, sg-128-80 sg-128-118 72 144 sg-144-115, sg-144-116 37748736 72 144 sg-144-120 144 72 144 sg-144-33 2.416e9 80 160 sg-160-83, sg-160-84 83886080 80 160 sg-160-234 335544320 96 192 sg-192-185 192 96 192 sg-192-26, sg-192-32, 1.649e12 sg-192-33, sg-192-34 sg-192-35, sg-192-86 sg-192-114, sg-192-957 sg-192-960, sg-192-971 sg-192-972 96 192 sg-192-955, sg-192-963 1.288e10 sg-192-964, sg-192-969 sg-192-970, sg-192-987 sg-192-989, sg-192-991 100 200 sg-200-42 400 120 240 sg-240-107 1.055e15 120 240 sg-240-192 1440 120 240 sg-240-91 2.577e11 120 240 sg-240-95, sg-240-97 7.731e11
3
10
5
(3 12 6)
(4 10 7)
(1 11 6)
(2 9 7)
(3 10 5)
(4 12 8)
12
6
9
2
11
7
8
5.2 Cubic and quintic symmetric or semisymmetric graphs By the same process it is easy to establish some tables of all quintic and cubic symmetric and semisymmetric Ggraphs. Such tables can be found on-line at: http://users.info.unicaen.fr/˜bretto (in Publications).
O(Γ) 54 112 120 144 216 240 294 336 336 378 384 400 432 432 448 486 546 576 672 702 702 720 784 784 784 798
Cubic O(G) 81 168 180 216 324 360 441 504 504 567 576 600 648 648 672 729 819 864 1008 1053 1053 1080 1176 1176 1176 1197
semisymmetric G-graphs G Aut(Γ) sg-81-7 2 orbits; Order=1296 sg-168-43 2 orbits; Order=168 sg-180-19 2 orbits; Order=720 sg-216-153 2 orbits; Order=432 sg-324-58 2 orbits; Order=648 sg-360-51 2 orbits; Order=1440 sg-441-9 2 orbits; Order=882 sg-504-157 2 orbits; Order=2016 sg-504-158 2 orbits; Order=504 sg-567-21 2 orbits; Order=567 sg-576-5129 2 orbits; Order=2304 sg-600-150 2 orbits; Order=600 sg-648-102 2 orbits; Order=1296 sg-648-702 2 orbits; Order=1296 sg-672-1257 2 orbits; Order=672 sg-729-100 2 orbits; Order=1458 sg-819-6 2 orbits; Order=819 sg-864-2666 2 orbits; Order=1728 sg-1008-517 2 orbits; Order=2016 sg-1053-30 2 orbits; Order=1053 sg-1053-51 2 orbits; Order=1053 sg-1080-487 2 orbits; Order=2160 sg-1176-42 2 orbits; Order=1176 sg-1176-215 2 orbits; Order=1176 sg-1176-220 2 orbits; Order=1176 sg-1197-9 2 orbits; Order=1197
Notice that the the following well-known cubic symmetric or semisymmetric graphs are G-graphs. The corresponding groups are indicated between parenthesis:
Example: Let G be the group sg-12-3 generated by a family S = {a, b} of order 2 with a3 = b3 = e. For information the group ˜ S) is named sg-12-3 is isomorphic to A4 . The graph Φ(G; the following: (2 11 8)
4
It is a symmetric quartic graph isomorphic to the cuboctahedral graph. It is also a G-graph generated by the group G = C2 × C2 × C2 .
These two table have been built in 38 minutes on a 2 gigahertz Athlon (a 32 bit processor). By the implication (i) ⇒ (ii) of Theorem 3, the linegraph of a cubic G-graph is a quartic Cayley graph. Our table of cubic symmetric and semisymmetric G-graphs gives us a table of quartic Cayley graph.
(1 9 5)
1
1. The cube (G = A4 , S = {(1, 2, 3), (1, 3, 4)}) 2. The Heawood’s graph (a, b | a7 = b3 = e, ab = baa, S = {b, ba}) 3. The Pappus’s graph (G = a, b, c | a3 = b3 = c3 = e, ab = ba, ac = ca, bc = cba, S = {b, c})
It is a cubic symmetric G-graph on 8 vertices isomorphic to the skeleton of a cube. If we compute the linegraph of ˜ the G-graph Φ(G; S) we found the following quartic Cayley graph:
4. The Mobius-Kantor’s graph (G =SmallGroup(24,3), S = {f 1, f 1 ∗ f 2})
65
502 528 542 550 562 600 610 622 640 662 682 682 682 682 704 710 722 800
5. The Gray graph (G =SmallGroup(81,7), S = {f 1, f 2}) 6. The Ljubljana graph (G =SmallGroup(168,43), S = {f 1, f 1 ∗ f 2 ∗ f 4}) A census of all symmetric and semisymmetric cubic graphs up to 768 vertices has already been established [8, 9], but our non-exhaustive algorithm is faster. For computing almost all cubic symmetric and semisymmetric G-graphs up to the order 800, except order 768, 1 hour and 56 minutes are necessary on a 2 Gigahertz Athlon. No list was established for quintic symmetric or semisymmetric graphs. The two following non-exaustive lists, built in 32 minutes on a 2 gigahertz Athlon, are the fist one: Quintic semisymmetric G-graphs O(Γ) O(G) G Aut(Γ) 120 300 sg-300-22 1200 240 600 sg-600-54 2400 250 625 sg-625-7 10000 720 1800 sg-1800-555 14400
1255 1320 1355 1375 1405 1500 1525 1555 1600 1655 1705 1705 1705 1705 1760 1775 1805 2000
sg-1255-1 sg-1320-13 sg-1355-1 sg-1375-9 sg-1405-1 sg-1500-112 sg-1525-3 sg-1555-1 sg-1600-6786 sg-1655-1 sg-1705-3 sg-1705-4 sg-1705-5 sg-1705-6 sg-1760-1139 sg-1775-3 sg-1805-2 sg-2000-488
2510 5280 2710 2750 2810 12000 3050 3110 12800 3310 3410 3410 3410 3410 3520 3550 7220 16000
Quintic symmetric G-graphs
O(Γ)   O(G)   G             Aut(Γ)
10     25     sg-25-2       28800
22     55     sg-55-1       1320
24     60     sg-60-5       480
32     80     sg-80-49      3840
48     120    sg-120-5      960
50     125    sg-125-3      4000
62     155    sg-155-1      310
64     160    sg-160-199    640
82     205    sg-205-1      410
110    275    sg-275-3      550
122    305    sg-305-1      610
128    320    sg-320-1012   2560
142    355    sg-355-1      710
144    360    sg-360-118    5760
160    400    sg-400-213    3200
162    405    sg-405-15     19440
202    505    sg-505-1      1010
242    605    sg-605-1      1210
242    605    sg-605-5      1210
242    605    sg-605-6      2420
262    655    sg-655-1      1310
264    660    sg-660-13     5280
288    720    sg-720-409    2880
302    755    sg-755-1      1510
310    775    sg-775-3      1550
320    800    sg-800-1065   6400
352    880    sg-880-214    1760
362    905    sg-905-1      1810
382    955    sg-955-1      1910
384    960    sg-960-11357  7680
384    960    sg-960-11358  46080
410    1025   sg-1025-3     2050
422    1055   sg-1055-1     2110
432    1080   sg-1080-260   4320
482    1205   sg-1205-1     2410
486    1215   sg-1215-68    4860

6. REFERENCES
[1] Fred Annexstein, Marc Baumslag, and Arnold L. Rosenberg. Group action graphs and parallel architectures. SIAM J. Comput., 19:544–569, 1990.
[2] Sheldon Akers and Balakrishnan Krishnamurthy. Group graphs as interconnection networks. In Proc. 14th Int. Conf. Fault Tolerant Comput., pages 422–427, 1984.
[3] L. Babai. Automorphism groups, isomorphism, reconstruction. Chapter 27 of Handbook of Combinatorics, 1994.
[4] A. Bretto and B. Laget. A new graphical representation of a group. Tenth International Conference on Applications of Computer Algebra (ACA-2004), Beaumont, USA, 23–25 July 2004. National Science Foundation (NSF), 2004, 25–32 (ISBN: 0-9759946-0-3).
[5] A. Bretto and L. Gillibert. Graphical and computational representation of groups. LNCS 3039, Springer-Verlag, pp. 343–350. Proceedings of ICCS'2004.
[6] I. Z. Bouwer. An edge but not vertex transitive cubic graph. Bull. Canad. Math. Soc., 11:533–535, 1968.
[7] I. Z. Bouwer, W. W. Chernoff, B. Monson and Z. Star. The Foster Census. Charles Babbage Research Centre, 1988.
[8] Marston Conder and Peter Dobcsanyi. Trivalent symmetric graphs on up to 768 vertices. J. Combinatorial Mathematics and Combinatorial Computing, 40:41–63, 2002.
[9] Marston Conder, Aleksander Malnic, Dragan Marusic and Primoz Potocnik. A census of semisymmetric cubic graphs on up to 768 vertices. Preprint, March 2004.
[10] G. Cooperman, L. Finkelstein and N. Sarawagi. Applications of Cayley graphs. Appl. Algebra and Error-Correcting Codes, Springer-Verlag, Lecture Notes in Computer Science, Vol. 508, 1991, 367–378.
[11] J. Folkman. Regular line-symmetric graphs. J. Combinatorial Theory, 3:215–232, 1967.
66
[12] The GAP Team. GAP – Reference Manual, Release 4.3. www.gap-system.org, May 2002.
[13] R. Greenlaw and R. Petreschi. Cubic graphs. ACM Computing Surveys, 27(4):471–495, 1995.
[14] Joseph Lauri and Raffaele Scapellato. Topics in Graph Automorphisms and Reconstruction. London Mathematical Society Student Texts, 2003.
[15] Joseph Lauri. Pseudosimilarity in graphs - a survey. Ars Combinatoria, 36:171–182, 1997.
[16] Joseph Lauri. Constructing graphs with several pseudosimilar vertices or edges. Discrete Mathematics, 267(1-3):197–211, June 2003.
[17] Brendan D. McKay. Practical graph isomorphism. Congressus Numerantium, 30:45–87, 1981.
Picard–Vessiot Extensions for Linear Functional Systems

Manuel Bronstein∗        Ziming Li†        Min Wu∗,†

∗ INRIA – Café, 2004 Route des Lucioles, 06902 Sophia Antipolis, France
† Key Lab of Math.-Mechan., Acad. of Math. and Syst. Sci., Zhong Guan Cun, Beijing (100080), China

[email protected]  [email protected]  [email protected]  [email protected]

ABSTRACT
Picard-Vessiot extensions for ordinary differential and difference equations are well known and are at the core of the associated Galois theories. In this paper, we construct fundamental matrices and Picard-Vessiot extensions for systems of linear partial functional equations having finite linear dimension. We then use those extensions to show that all the solutions of a factor of such a system can be completed to solutions of the original system.

Categories and Subject Descriptors
I.1.2 [Computing Methodologies]: Symbolic and Algebraic Manipulation—Algorithms

General Terms
Algorithms

Keywords
Linear functional systems; Picard-Vessiot extensions; Fundamental matrices; Modules of formal solutions.
1. INTRODUCTION
A linear functional system is a system of form A(Z) = 0 where A is a matrix whose entries are (partial) linear operators, such as differential, shift or q-shift operators or any mixture thereof, and Z denotes a vector of unknowns. A common special case consists of integrable systems, which are of the form {∂i(Z) = Ai Z}1≤i≤m, and correspond to the matrix A given by the stacking of blocks of the form (∂i − Ai). We show in this paper that fundamental matrices¹ and Picard-Vessiot extensions¹ always exist for linear functional systems having finite linear dimension¹, which include in particular all integrable systems. In addition, if the field of coefficients has characteristic 0 and has an algebraically closed constant field, then Picard-Vessiot extensions for such systems contain no new constants.
In this paper, rings are not necessarily commutative and have arbitrary characteristic, unless otherwise specified. Ideals and modules are left ideals and left modules. Fields are however always commutative. The notation (·)^τ denotes the transpose of vectors or matrices, while R^{p×q} denotes the set of p × q matrices with entries in (the ring) R. The commutator of a, b ∈ R is [a, b] = ab − ba. We write 1_R for the identity map on R and 0_R for the zero map on R, and we omit the subscripts when the context is clear.

¹ To be defined precisely in Sect. 3 and 5.
2. FULLY INTEGRABLE SYSTEMS
Let R be a ring and σ be an endomorphism of R. A σ-derivation ([4]) is an additive map δ : R → R satisfying δ(ab) = σ(a)δ(b) + δ(a)b for all a, b ∈ R. A ∆-ring (R, Φ) is a ring R together with a set Φ = {(σ1, δ1), . . . , (σm, δm)}, where each σi is an automorphism of R, each δi is a σi-derivation of R, and [σi, σj] = [δi, δj] = [σi, δj] = 0 for all i ≠ j. If R is also a field, then (R, Φ) is called a ∆-field. An element c of R is called a constant if σi(c) = c and δi(c) = 0 for all i. The set of all the constants of R is denoted C_R and is clearly a subring of R, and a subfield when R is a field. Remark that a ∆-ring is a (partial) differential ring if σi = 1 for all i, and a (partial) difference ring if δi = 0 for all i.

Definition 1. We say that the ∆-ring (R, Φ) is orthogonal if δi = 0 for each i such that σi ≠ 1.

By reordering the indices, we can assume that there exists an integer ℓ ≥ 0 such that σi = 1 for 1 ≤ i ≤ ℓ and δi = 0 for ℓ < i ≤ m. We write (R, Φ, ℓ) for such an orthogonal ∆-ring. All the δi are usual derivations in an orthogonal ∆-ring. Mixed systems of partial linear differential, difference and q-difference equations can be represented by matrices with entries in Ore algebras ([4]) over orthogonal ∆-rings.
Let (F, Φ) be a ∆-field, and suppose that for each i such that σi ≠ 1, there exists ai ∈ F such that σi(ai) ≠ ai and σj(ai) − ai = δj(ai) = 0 for all j ≠ i. Replacing the xi by the ai in the proof of Theorem 1 in [6], one sees that linear functional equations over F can be rewritten as equations over an orthogonal ∆-field. There are however orthogonal ∆-rings that do not contain such ai's, for example F = C(x) together with Φ = {(1, d/dx), (σx, 0)} where σx is the automorphism of F over C that sends x to x − 1. This field is used in modeling differential-delay equations, and does not match the definition of orthogonality given in [6].
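The differential-delay example can be checked directly: σx is an automorphism of C(x) over C that commutes with d/dx, so (C(x), Φ) is indeed a ∆-field. A minimal sketch, assuming Python with sympy (our illustration, not part of the original text):

    import sympy as sp

    x = sp.symbols('x')
    r = (x**2 + 1)/(x - 3)                 # an arbitrary element of F = C(x)
    shift = lambda p: p.subs(x, x - 1)     # sigma_x : x -> x - 1
    # the commutator [sigma_x, d/dx] vanishes on this element, as required:
    assert sp.simplify(sp.diff(shift(r), x) - shift(sp.diff(r, x))) == 0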
Let (F, Φ, ℓ) be an orthogonal ∆-field. We say that a commutative ring E containing F is an orthogonal ∆-extension of (F, Φ, ℓ) if the σi and δi can be extended to automorphisms and derivations of E satisfying: (i) the commutators [σi, σj] = [δi, δj] = [σi, δj] = 0 on E for 1 ≤ i ≠ j ≤ m; (ii) σi = 1_E for i ≤ ℓ and δj = 0_E for j > ℓ.
Let E and Ẽ be two orthogonal ∆-extensions of F. A map φ from E to Ẽ is called a ∆-morphism if φ is a ring homomorphism leaving F fixed and commuting with all the δi and σi. Two orthogonal ∆-extensions of F are said to be isomorphic if there exists a bijective ∆-morphism between them.

Definition 2. A system of form
    δi(Z) = Ai Z for i ≤ ℓ,    σi(Z) = Ai Z for i > ℓ,    (1)
where Ai ∈ F^{n×n} and Z = (z1, . . . , zn)^τ, is called an integrable system if the following conditions are satisfied:
    σi(Aj)Ai + δi(Aj) = σj(Ai)Aj + δj(Ai)    for all i, j.    (2)
The integrable system (1) is said to be fully integrable if the matrices A_{ℓ+1}, . . . , Am are invertible.
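For a quick feel for the conditions (2), consider a hypothetical fully integrable system of size one over C(x, k) with solution z = e^x x^k, namely δx(z) = ((x+k)/x) z and σk(z) = x z. A sympy sketch (our illustration) checking (2) for the mixed pair, with i the differential index (σi = 1) and j the difference index (δj = 0):

    import sympy as sp

    x, k = sp.symbols('x k')
    A1 = sp.Matrix([[(x + k)/x]])    # delta_x(z) = A1*z
    A2 = sp.Matrix([[x]])            # sigma_k(z) = A2*z
    sigma_k = lambda M: M.subs(k, k + 1)
    delta_x = lambda M: M.applyfunc(lambda e: sp.diff(e, x))
    lhs = A2*A1 + delta_x(A2)        # sigma_i(A_j)A_i + delta_i(A_j)
    rhs = sigma_k(A1)*A2             # sigma_j(A_i)A_j + delta_j(A_i) = 0
    assert (lhs - rhs).applyfunc(sp.simplify) == sp.zeros(1, 1)

Both sides reduce to x + k + 1, so (2) holds for this pair.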
Using Ore algebra notation, we write {∂i(Z) = Ai Z}1≤i≤m for the system (1), where the action of ∂i is meant to be δi for i ≤ ℓ and σi for i > ℓ. Note that the conditions (2) are derived from the condition ∂i(∂j(Z)) = ∂j(∂i(Z)) and are exactly the matrix-analogues of the compatibility conditions for first order scalar equations in [6].

Example 1. Let F = C(x, k) and δx and σk denote respectively the ordinary differentiation w.r.t. x and the shift operator w.r.t. k. Then {δx(Z) = Ax Z, σk(Z) = Ak Z}, where
    Ax = ( (x²−kx−k)/(x(x−k)(x−1))          k(kx+x−x²−2k)/((x−k)(x−1))        )
         ( (x²−kx+3k−2x)/(kx(x−k)(x−1))     (x³+x²−kx²−2x+2k)/(x(x−k)(x−1))   )
    Ak = ( (k+1+kx²−xk²−x)/((x−k)(x−1))     x(k+1)(k+1+kx−k²−x)/((x−k)(x−1))  )
         ( −x − (k+1+kx−k²)/(k(x−k)(x−1))   (k+1)(x²−2kx−x+k²)/(k(x−k)(x−1))  )
is a fully integrable system.
3. FUNDAMENTAL MATRICES AND PICARD-VESSIOT EXTENSIONS
A square matrix with entries in a commutative ring is said to be invertible if its determinant is a unit in that ring. Let the orthogonal ∆-ring (F, Φ, ℓ) be as in the previous section and {∂i(Z) = Ai Z}1≤i≤m be a fully integrable system of size n over F. An n × n matrix U with entries in an orthogonal ∆-extension E of F is a fundamental matrix for {∂i(Z) = Ai Z}1≤i≤m if U is invertible and ∂i(U) = Ai U for each i, that is, each column of U is a solution of the system.
Theorem 1. For every fully integrable system, there exists a fundamental matrix whose entries lie in an orthogonal ∆-extension of F.
Proof. Let {∂i(Z) = Ai Z}1≤i≤m be a fully integrable system of size n over F, U = (u_{st}) be a matrix of n² distinct indeterminates and R = F[u11, . . . , u1n, . . . , un1, . . . , unn]. For 1 ≤ i ≤ ℓ, the δi are extended to derivations of R via δi(U) = Ai U and, for ℓ + 1 ≤ j ≤ m, the σj are extended to automorphisms of R via σj(U) = Aj U (σj is bijective because Aj is invertible). It follows from the conditions (2) that these extended maps turn R into a well-defined orthogonal ∆-extension of F and that ∂i(U) = Ai U for each i. Let D = det(U) and let R̄ be the localization of R with respect to D. Extend the δi and σj via the formulas δi(1/D) = −δi(D)/D² and σj(1/D) = 1/σj(D), respectively (note that σj(D) = det(Aj)D for j > ℓ). Then R̄ becomes an orthogonal ∆-extension of F, and U is a fundamental matrix of the system. □
The following proposition reveals that any two fundamental matrices differ by a constant matrix.
Proposition 1. Let {∂i(Z) = Ai Z}1≤i≤m be a fully integrable system of size n over F and U ∈ E^{n×n} be a fundamental matrix, where E is an orthogonal ∆-extension of F. If V ∈ E^{n×d} with d ≥ 1 is a matrix whose columns are solutions of the system, then V = UT for some T ∈ C_E^{n×d}. In particular, any solution of {∂i(Z) = Ai Z}1≤i≤m in E^n is a linear combination of the columns of U over C_E.
Proof. Let T = U^{−1}V. A straightforward calculation implies that δi(T) = 0 for i ≤ ℓ, and σj(T) = T for j > ℓ. Hence all the entries of T belong to C_E. □
In [10, 11], Picard-Vessiot rings for linear ordinary differential and difference systems are defined. Picard-Vessiot fields for integrable systems of partial differential equations have been studied by Kolchin, who proved their existence and developed the associated Galois theory [2, §2][5]. Picard-Vessiot extension fields have also been defined in [1] for fields with operators, which are more general ∆-fields where the operators do not necessarily commute. While the associated Galois theory was developed there, the existence of Picard-Vessiot extensions was not shown. Indeed, with automorphisms allowed, there are fully integrable systems for which no Picard-Vessiot field exists. Generalizing the definition of Picard-Vessiot rings used for difference equations [10, (Errata)], we obtain Picard-Vessiot rings together with a construction proving their existence. Our definition is compatible with the previous ones: for differential systems, Picard-Vessiot rings turn out to be integral domains, and the Picard-Vessiot fields of [5] are their fields of fractions; for ∆-rings, the Picard-Vessiot rings are generated by elements satisfying linear scalar operator equations, which is the defining property of the Picard-Vessiot fields of [1].
An ideal I of a commutative ∆-ring R is said to be invariant if δi(I) ⊂ I and σi(I) ⊂ I for all 1 ≤ i ≤ m. The ring R is said to be simple if its only invariant ideals are (0) and R.
Definition 3. Let {∂i(Z) = Ai Z}1≤i≤m be a fully integrable system over F. A Picard-Vessiot ring for this system is a (commutative) ring E such that: (i) E is a simple orthogonal ∆-extension of F. (ii) E = F[U, det(U)^{−1}] for some fundamental matrix U for the system.
We now construct Picard-Vessiot rings by the same approach used in the ordinary differential and difference cases [10, 11].
Lemma 1. Let R be an orthogonal ∆-extension of F and I a maximal invariant ideal in R. Then, (i) E := R/I is a
simple orthogonal ∆-extension of F. (ii) C_E is a field. (iii) If F has characteristic 0, C_F is algebraically closed and E is a finitely generated algebra over F, then C_E = C_F.
Proof. Let Ī = {σ_{ℓ+1}^{k_{ℓ+1}} ··· σ_m^{k_m}(a) | a ∈ I, k_{ℓ+1}, . . . , k_m ∈ Z}. One can verify that Ī is an invariant ideal containing I but 1 ∉ Ī, and hence Ī = I since I is maximal. The δi and σj can be viewed as derivations and surjective endomorphisms on E = R/I via the formulas δi(a + I) = δi(a) + I and σj(a + I) = σj(a) + I for all a in R, respectively. If σj(a + I) = I then σj(a) ∈ I = Ī and thus a ∈ Ī = I. So the σj are automorphisms of E and E is a simple orthogonal ∆-extension of F. To show the second statement, let c be a nonzero constant of E. Then the ideal (c) is invariant. Since E is simple, (c) contains 1. To show the last statement, suppose that b ∈ C_E but b ∉ C_F. By the argument used in the proof of Lemma 1.8 in [10], there exists a nonzero monic polynomial g over F with minimal degree d such that g(b) = b^d + Σ_{k=0}^{d−1} g_k b^k = 0. Applying the δi and σj to g(b), respectively, we obtain Σ_{k=0}^{d−1} δi(g_k) b^k = 0 for i ≤ ℓ, and Σ_{k=0}^{d−1} (σj(g_k) − g_k) b^k = 0 for j > ℓ. The minimality of d then implies g_k ∈ C_F for 0 ≤ k < d. So b ∈ C_F since C_F is algebraically closed, a contradiction. □
The existence of the Picard-Vessiot extensions is stated in the next theorem.
Theorem 2. Every fully integrable system over F has a Picard-Vessiot ring E. If F has characteristic 0 and C_F is algebraically closed, then C_E = C_F. Furthermore, that extension is minimal, meaning that no proper subring of E satisfies condition (ii) of Definition 3.
Proof. Let {∂i(Z) = Ai Z}1≤i≤m be a fully integrable system over F. By Theorem 1, it has a fundamental matrix U = (u_{st}) with entries in the orthogonal ∆-extension R = F[u11, . . . , unn, det(U)^{−1}]. Let I be a maximal invariant ideal of R and E = R/I. Then E is a simple orthogonal ∆-extension of F by Lemma 1. Clearly, E is generated over F by the entries of the matrix Ū := (u_{st} + I) and by det(Ū)^{−1}. Since Ū is a fundamental matrix for the system, E is a Picard-Vessiot ring for the system. Assume further that F has characteristic 0 and C_F is algebraically closed. Then C_E = C_F by the third assertion of Lemma 1. Let S = F[V, det(V)^{−1}] be a subring of E where V is some fundamental matrix of the system. By Proposition 1, there exists T ∈ C_E^{n×n} such that V = ŪT. Since C_E = C_F, all the entries of Ū and the inverse of det(Ū) are contained in S. Hence S = E. □
Assume that the ground field F has characteristic 0 with an algebraically closed field of constants. Let E be a Picard-Vessiot ring for a fully integrable system of size n over F. Then Proposition 1 together with C_E = C_F implies that all the solutions of this system in E^n form a C_F-vector space of dimension n. A direct generalization of Proposition 1.20 in [11] and Proposition 1.9 in [10] reveals that any two Picard-Vessiot rings for a fully integrable system over F are isomorphic as orthogonal ∆-extensions.
We present a few examples for Picard-Vessiot rings. Consider the fully integrable system of size one:
    ∂i(z) = ai z,  where ai ∈ F and i = 1, . . . , m.    (3)
Let E be the orthogonal ∆-extension F[T, T^{−1}] such that δi(T) = ai T for i ≤ ℓ and σj(T) = aj T for j > ℓ.
Case 1. There does not exist an integer k > 0 and r ∈ F* such that δi(r) = k ai r for i ≤ ℓ and σj(r) = a_j^k r for j > ℓ. Then E is a Picard-Vessiot ring of (3).
Case 2. Assume that the integer k > 0 is minimal so that δi(r) = k ai r and σj(r) = a_j^k r for some r ∈ F* and for all i ≤ ℓ and j > ℓ. Then E/(T^k − r) is a Picard-Vessiot ring of (3).
The verification of the above two assertions is similar to that in Example 1.19 in [11]. Unlike in the differential case, the elements of Picard-Vessiot rings cannot always be interpreted as complex functions: the system {dy/dx = y(x), y(x + 1) = y(x)} is in Case 1 above and has a Picard-Vessiot ring over C(x), but has no nonzero complex function solution.
Next, we show that a Picard-Vessiot ring of the system in Example 1 is F[e^x, e^{−x}, Γ(k), Γ(k)^{−1}] where F = C(x, k). Note that the change of variable¹ Z = MY, where
    M = ( (x−k)/x²    x )
        ( k(x−k)/x²   k ),
transforms the system into B : {δx(Y) = Bx Y, σk(Y) = Bk Y}, where
    Bx = ( 1 0 )    and    Bk = ( 1 0 )
         ( 0 0 )               ( 0 k ).
Thus we need only to find a Picard-Vessiot ring of B. First, let U be a 2 × 2 matrix with indeterminate entries u11, u12, u21 and u22. Define δx(U) = Bx U and σk(U) = Bk U. This turns R = F[u11, u12, u21, u22, 1/det(U)] into an orthogonal ∆-extension of F. Clearly, I = (u12, u21) is an invariant ideal of R and σk^{−1}(I) is contained in I. Hence R/I is an orthogonal ∆-ring. As the ∆-rings E = F[u11, u22, u11^{−1}, u22^{−1}] and R/I are isomorphic, it suffices to show that E is simple. Suppose that J is a nontrivial invariant ideal of E. Let f be a nonzero polynomial in J ∩ F[u11, u22] with the smallest number of terms. It cannot be a monomial, for otherwise J would be E since u11^{−1} and u22^{−1} are in E. We write
    f = u11^{d1} u22^{d2} + r u11^{e1} u22^{e2} + other terms,
where r ∈ F with r ≠ 0, and (d1, d2) ≠ (e1, e2). It follows from δx(u11) = u11 and δx(u22) = 0 that
    δx(f) = d1 u11^{d1} u22^{d2} + (δx(r) + e1 r) u11^{e1} u22^{e2} + other terms,
in which each monomial has already appeared in f. Thus (δx(f) − d1 f) must be zero, because it is in J but has fewer terms. It follows that (δx(r) − (d1 − e1)r) is equal to zero. In the same way, one can show that σk(r) − k^{d2−e2} r = 0, because σk(u11) = u11 and σk(u22) = k u22. But the existence of such a rational function r would imply d1 = e1 and d2 = e2, a contradiction. Thus E is simple, and so a Picard-Vessiot ring of B, hence also of the system in Example 1. If we understand u11 as e^x and u22 as Γ(k), then
    V = ( e^x   0    )
        ( 0     Γ(k) )
is a fundamental matrix for B in E, and hence MV is for the system in Example 1.

¹ The change of variable can be found, for example, by computing the hyperexponential solutions of the system ([6, 12]).
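The fundamental-matrix property of V can be checked mechanically. A minimal sketch, assuming Python with sympy (our illustration; sympy's gamma and expand_func are used to rewrite Γ(k + 1) as kΓ(k)):

    import sympy as sp

    x, k = sp.symbols('x k')
    V = sp.Matrix([[sp.exp(x), 0], [0, sp.gamma(k)]])
    Bx = sp.Matrix([[1, 0], [0, 0]])
    Bk = sp.Matrix([[1, 0], [0, k]])
    # delta_x(V) = Bx*V:
    assert (V.diff(x) - Bx*V).applyfunc(sp.simplify) == sp.zeros(2, 2)
    # sigma_k(V) = Bk*V, using Gamma(k+1) = k*Gamma(k):
    shifted = V.subs(k, k + 1).applyfunc(sp.expand_func)
    assert (shifted - Bk*V).applyfunc(sp.simplify) == sp.zeros(2, 2)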
Last, we describe a simple orthogonal ∆-extension that contains a solution of the inhomogeneous system
    δi(z) = ai for i ≤ ℓ    and    σj(z) = z + aj for j > ℓ,    (4)
where the ai and aj are in a simple orthogonal ∆-ring E with characteristic zero. This is an extension of Example 1.18 in [11]. Note that the ai and aj have to satisfy some compatibility conditions due to the commutativity of the δi and σj. A more general form of these conditions is given in (8) in the next section. If (4) has a solution in E, then there is nothing to do. Otherwise, let R = E[T] and extend the δi and σj to R by the formulas δi(T) = ai and σj(T) = T + aj. The compatibility conditions imply that R becomes a well-defined orthogonal ∆-ring. If R has a nontrivial invariant ideal I, let f = f_d T^d + f_{d−1} T^{d−1} + ··· + f_0 be a nonzero element in I with minimal degree. Let J be the set consisting of zero and the leading coefficients of elements in I with degree d. Our extensions of the δi and σj imply that J is an invariant ideal of E. Hence 1 ∈ J and, therefore, we may also assume d > 0 and f_d = 1. Since d is minimal, both δi(f) and (σj(f) − f) are 0. Consequently, −f_{d−1}/d is a solution of (4), a contradiction. Thus R is simple and contains a solution T of (4).
4. COMPLETING PARTIAL SOLUTIONS
We now consider reducible systems, i.e. systems that can be put into simultaneous block-triangular form by a change of variable Y = MZ for some M ∈ GL_n(F). Factorization algorithms for modules over Laurent–Ore algebras [12] yield such a change of variable for reducible systems, and we motivate them by showing that the solutions of a factor can always be extended to solutions of the complete system.
Theorem 3. Let A : {∂i(Z) = Ai Z}1≤i≤m be a fully integrable system of size n over F, and suppose that there exist a positive integer d < n and matrices Bi in F^{d×d}, Ci in F^{(n−d)×d} and Di in F^{(n−d)×(n−d)} such that
    Ai = ( Bi  0  )    for 1 ≤ i ≤ m.    (5)
         ( Ci  Di )
Then
(i) B : {∂i(X) = Bi X}1≤i≤m and D : {∂i(X) = Di X}1≤i≤m are both fully integrable systems.
(ii) (0, . . . , 0, ζ_{d+1}, . . . , ζ_n)^τ is a solution of A whenever (ζ_{d+1}, . . . , ζ_n)^τ is a solution of D.
(iii) For any solution (η1, . . . , ηd)^τ of B in an orthogonal ∆-extension of F, there exists an orthogonal ∆-extension of F containing η1, . . . , ηd as well as η_{d+1}, . . . , η_n such that (η1, . . . , η_n)^τ is a solution of A.
Proof. Let X = (z1, . . . , zd)^τ and Y = (z_{d+1}, . . . , z_n)^τ. The system A can then be rewritten into a homogeneous system and an inhomogeneous system:
    ∂i(X) = Bi X,    ∂i(Y) = Di Y + Ci X,    for 1 ≤ i ≤ m.    (6)
Since A is fully integrable, the matrices Ai satisfy (2) and Aj is invertible for j > ℓ. Hence, the Bj and Dj for j > ℓ must also be invertible since det(Aj) = det(Bj) det(Dj). In addition, a routine calculation shows that, for all i, j,
    σi(Aj)Ai + δi(Aj) = ( σi(Bj)Bi + δi(Bj)                 0                  )    (7)
                        ( σi(Cj)Bi + σi(Dj)Ci + δi(Cj)      σi(Dj)Di + δi(Dj)  ),
which implies that the Bi and Di also satisfy the compatibility conditions (2). Therefore B and D are both fully integrable. The first statement is proved. The second is immediate from (6).
From Theorem 1, there exist an orthogonal ∆-extension E of F and a fundamental matrix U with entries in E for D. Let η = (η1, . . . , ηd)^τ be a solution of B in some orthogonal ∆-extension R of F. Viewing E and R as commutative F-algebras, we can extend the δi and σj to the commutative E-algebra E ⊗_F R via δi(e ⊗ r) = δi(e) ⊗ r + e ⊗ δi(r) and σj(e ⊗ r) = σj(e) ⊗ σj(r) for i ≤ ℓ and j > ℓ. Then (1 ⊗ η1, . . . , 1 ⊗ ηd)^τ is also a solution of B, so, replacing R by E ⊗_F R, we can assume without loss of generality that R contains E. Substitute η into (6) to get ∂i(Y) = Di Y + Ci η for each i. Let v = (v1, . . . , v_{n−d})^τ, where the v_k are distinct indeterminates over R, and G = R[v1, . . . , v_{n−d}]. We extend the δi and σj to G via δi(v) = b_i and σj(v) = v + b_j, where b1, . . . , bm ∈ R^{n−d} are given by b_i = U^{−1}C_i η for i ≤ ℓ and b_j = U^{−1}D_j^{−1}C_j η for j > ℓ. To turn G into an orthogonal ∆-extension of R, all the δi and σj on G should commute, which is equivalent to the following integrability conditions:
    δi(bj) = δj(bi),              for 1 ≤ i, j ≤ ℓ,
    δi(bj) = σj(bi) − bi,         for i ≤ ℓ, j > ℓ,        (8)
    σi(bj) − bj = σj(bi) − bi,    for ℓ + 1 ≤ i, j ≤ m.
Although the conditions (8) are generally not satisfied for arbitrary bi's, we show that they are satisfied in our case. Since the Ai satisfy the compatibility conditions (2), it follows from the bottom-left block in (7) that, for all i, j,
    σi(Cj)Bi + σi(Dj)Ci + δi(Cj) = σj(Ci)Bj + σj(Di)Cj + δj(Ci).    (9)
For 1 ≤ i, j ≤ ℓ, we have
    δi(bj) = δi(U^{−1}Cj η)
           = −U^{−1}δi(U)U^{−1}Cj η + U^{−1}δi(Cj)η + U^{−1}Cj δi(η)
           = −U^{−1}(Di Cj − δi(Cj) − Cj Bi)η,
which, together with σi = σj = 1 for 1 ≤ i, j ≤ ℓ, and (9) implies δi(bj) = δj(bi). The last two integrability conditions in (8) are verified with similar calculations, using the fact that the Di satisfy the compatibility conditions (2). Therefore G is an orthogonal ∆-extension of R, hence of F. Let ζ = Uv ∈ G^{n−d}. Then, for i ≤ ℓ,
    ∂i(ζ) = δi(ζ) = δi(U)v + Uδi(v) = Di Uv + Ub_i = Di ζ + Ci η,
and, for j > ℓ,
    ∂j(ζ) = σj(ζ) = σj(U)σj(v) = Dj U(v + b_j) = Dj ζ + Cj η.
So (η^τ, ζ^τ)^τ is a solution of the initial system A. □
We point out here (omitting the detailed explanation) that in the differential case, the quotient systems of [7] yield an alternative approach to completing solutions of factors.

Example 2. Let F, δx and σk be as in Example 1, and consider the fully integrable system
    δx(Z) = ( Bx  0  ) Z,    σk(Z) = ( Bk  0  ) Z,    (10)
             ( Cx  Dx )              ( Ck  Dk )
where Z = (z1, z2, z3)^τ,
    Bx = (x+k)/x,    Bk = (k+1)x/k,
    Cx = ( (2x²−k²+2x−kx)/(x(x−k))         )
         ( (x³−x²k+2x²−kx+2x−k²)/((x−k)x)  )
    Ck = ( (k+1)(x³−2x²k−3x²+k²x+4kx+x−k²)/(k(x−k−1)²) )
         ( x²(k+1)/k − (k+1)(x−k)²/(k(x−k−1)²)         )
and
    Dx = ( (k−x−2)/(x−k)          0   )
         ( (k²−x²−2x)/(x(x−k))    k/x )
    Dk = ( (k+1)(x−k)²/(k(x−k−1)²)          0  )
         ( (k+1)(x−k)²/(k(x−k−1)²) − kx     kx ).
We complete the solution η1 = ke^x x^k of the system given by Bx and Bk to a solution of (10). Note that
    U = ( ke^{−x}/(x−k)²    0        )
        ( ke^{−x}/(x−k)²    Γ(k)x^k  )
is a fundamental matrix for the system given by Dx and Dk. By the proof of Theorem 3, we let b1 = U^{−1}Cx η1 and b2 = U^{−1}Dk^{−1}Ck η1, and we find v such that δx(v) = b1 and σk(v) − v = b2. Therefore (η1, ζ1, ζ2)^τ, where ζ = (ζ1, ζ2)^τ = Uv, is a solution of (10).
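The completion procedure in this proof is effective whenever the quadratures δi(v) = bi (and their difference analogues) can be solved. A toy purely differential instance, assuming Python with sympy (hypothetical data, not the matrices of Example 2 above):

    import sympy as sp

    x = sp.symbols('x')
    # toy reducible system A = [[B, 0], [C, D]] with B = (0), C = (1), D = (0),
    # i.e. z1' = 0 and z2' = z1
    eta = sp.Integer(1)                    # solution of the factor: eta' = B*eta = 0
    U = sp.Matrix([[1]])                   # fundamental matrix of D: U' = D*U
    b = U.inv() * sp.Matrix([[1]]) * eta   # b = U^{-1} C eta, as in the proof
    v = sp.integrate(b[0, 0], x)           # solve delta_x(v) = b by quadrature
    zeta = U * sp.Matrix([[v]])            # zeta = U v completes eta
    assert sp.diff(eta, x) == 0
    assert sp.simplify(sp.diff(zeta[0, 0], x) - eta) == 0

Here v = x and (η, ζ)^τ = (1, x)^τ indeed solves the full system.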
Theorem 3 also yields fundamental matrices for reducible systems. Let {∂i(Z) = Ai Z}1≤i≤m be a fully integrable system where the Ai are as in (5). Suppose that U = (u_{ij}) ∈ R^{d×d} and V ∈ E^{(n−d)×(n−d)} are fundamental matrices for the systems {∂i(X) = Bi X}1≤i≤m and {∂i(X) = Di X}1≤i≤m respectively, where R and E are orthogonal ∆-extensions of F. As in the procedure of completing solutions, we can assume without loss of generality that R contains E. Then a fundamental matrix for the initial system can be constructed as follows: for each 1 ≤ i ≤ d, following the procedure of completing solutions, we can find an orthogonal ∆-extension G_i of R and ξ_i ∈ G_i^{n−d} such that (u_{1i}, . . . , u_{di}, ξ_i^τ)^τ ∈ G_i^n is a solution of {∂i(Z) = Ai Z}1≤i≤m. Viewing all the entries of U, V and the ξ_i as elements of G = G_1 ⊗_F ··· ⊗_F G_d,
    W = ( U              0 )
        ( ξ_1 ... ξ_d    V )  ∈ G^{n×n}
is easily seen to be a fundamental matrix for {∂i(Z) = Ai Z}1≤i≤m (it is invertible because det(W) = det(U) det(V)).

5. MODULES AND PICARD-VESSIOT RINGS FOR GENERAL LINEAR FUNCTIONAL SYSTEMS
We now generalize the previous notions and results to systems of the form A(Z) = 0 where A is a matrix of linear operators. As in previous sections, let (F, Φ, ℓ) be an orthogonal ∆-field and S = F[∂1; σ1, δ1] ··· [∂m; σm, δm] be the corresponding Ore algebra [4]. In the differential case, an S-module is classically associated to such a system [8, 11]. In the difference case, however, S-modules do not have appropriate dimensions, so modules over Laurent algebras are used instead [9, 10, 13]. It is therefore natural to introduce in our setting the following extension of S: let θ_{ℓ+1}, . . . , θ_m be indeterminates independent of the ∂i. Since the σj^{−1} are also automorphisms of F, S̄ = S[θ_{ℓ+1}; σ_{ℓ+1}^{−1}, 0] ··· [θ_m; σ_m^{−1}, 0] is also an Ore algebra. Since (∂j θj)a = ∂j σj^{−1}(a) θj = a ∂j θj for any j > ℓ and any a ∈ F, ∂j θj is in the center of S̄. Therefore the left ideal I = Σ_{j=ℓ+1}^m S̄(∂j θj − 1) is a two-sided ideal of S̄, and we call the factor ring R = S̄/I the Laurent-Ore algebra generated by Φ over F. Writing ∂j^{−1} for the image of θj in R, we can also write R (by convention) as
    R := F[∂1; 1, δ1] ··· [∂ℓ; 1, δℓ][∂_{ℓ+1}, ∂_{ℓ+1}^{−1}; σ_{ℓ+1}, 0] ··· [∂m, ∂m^{−1}; σm, 0]
and view it as an extension of S. For linear ordinary difference equations, R = F[σ, σ^{−1}] is the algebra used in [10]. For linear partial difference equations with constant coefficients, R is the Laurent polynomial ring used in [9, 13].
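The commutation rule ∂a = σ(a)∂ + δ(a) that underlies these Ore algebras is easy to prototype. A minimal Python sketch (our illustration; implementations for actual computations exist, cf. [3, 4]), representing an operator in a single ∂ as a dictionary mapping powers of ∂ to coefficients:

    import sympy as sp

    x = sp.symbols('x')

    def ore_mul(c1, n1, c2, n2, sigma, delta):
        """Product (c1*D^n1)*(c2*D^n2) in F[D; sigma, delta],
        using the rule D*a = sigma(a)*D + delta(a)."""
        terms = {0: c2}                  # start with c2 * D^0
        for _ in range(n1):              # push one factor D through at a time
            new = {}
            for e, c in terms.items():
                new[e + 1] = sp.expand(new.get(e + 1, 0) + sigma(c))
                new[e] = sp.expand(new.get(e, 0) + delta(c))
            terms = new
        return {e + n2: sp.expand(c1*c) for e, c in terms.items()}

    # differential case (sigma = id, delta = d/dx): D*x = x*D + 1
    assert ore_mul(1, 1, x, 0, lambda a: a, lambda a: sp.diff(a, x)) == {1: x, 0: 1}
    # shift case (sigma : x -> x+1, delta = 0): D*x = (x+1)*D
    assert ore_mul(1, 1, x, 0, lambda a: a.subs(x, x + 1), lambda a: 0) == {1: x + 1, 0: 0}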
Laurent-Ore algebras allow us to construct fundamental matrices and Picard-Vessiot extensions for linear functional systems of finite linear dimension, a concept that we now define precisely. For our purposes, a linear functional system is a matrix A = (a_{ij}) ∈ S^{p×q} ⊂ R^{p×q}. For any R-module N, we can associate to A a C_F-linear map λ : N^q → N^p given by
    ξ = (ξ1, . . . , ξq)^τ ↦ Aξ = ( Σ_{j=1}^q a_{1j}ξj, . . . , Σ_{j=1}^q a_{pj}ξj )^τ.
We therefore say that ξ ∈ N^q is a solution "in N" of the system A(Z) = 0 if A(ξ) = 0, and write sol_N(A(Z) = 0) for all its solutions in N. Clearly, sol_N(A(Z) = 0) is a vector space over C_F. As in the case of D-modules [8], we can associate to A an R-module as follows: the matrix A ∈ R^{p×q} induces the R-linear map ρ : R^{1×p} → R^{1×q} given by ρ(r1, . . . , rp) = (r1, . . . , rp)A. Let M = coker(ρ) = R^{1×q}/R^{1×p}A, which is simply the quotient of R^{1×q} by the submodule generated by the rows of A. Then
    R^{1×p} →(ρ) R^{1×q} →(π) M → 0    (11)
is an exact sequence of R-modules where π : R^{1×q} → M is the canonical map. For every s ≥ 1 and 1 ≤ i ≤ s, let e_{is} be the unit vector in R^{1×s} with 1 in the i-th position and 0 elsewhere. Then e_{1p}, . . . , e_{pp} and e_{1q}, . . . , e_{qq} are canonical bases of R^{1×p} and R^{1×q}, respectively. Set e_j = π(e_{jq}) for 1 ≤ j ≤ q. Since π is surjective, e1, . . . , eq generate M
as an R-module. Since ρ(e_{ip}) is the i-th row of A, we have
    0 = π(ρ(e_{ip})) = π( Σ_{j=1}^q a_{ij} e_{jq} ) = Σ_{j=1}^q a_{ij} π(e_{jq}) = Σ_{j=1}^q a_{ij} e_j
for 1 ≤ i ≤ p, which implies that (e1, . . . , eq)^τ is a solution of A(Z) = 0 in M. Given two R-modules N1 and N2, let Hom_R(N1, N2) denote the C_F-vector space of all the R-linear maps from N1 to N2. We next show that the proof of Proposition 1.1 of [8] remains valid when D is replaced by R.
Theorem 4. Let M = R^{1×q}/R^{1×p}A. Then sol_N(A(Z) = 0) and Hom_R(M, N) are isomorphic as C_F-vector spaces for any R-module N.
Proof. Applying the functor Hom_R(·, N) to the exact sequence (11) of C_F-vector spaces and using the isomorphism Hom_R(R^{1×s}, N) → N^s given by f ↦ (f(e_{1s}), . . . , f(e_{ss}))^τ, we get the exact sequence:
    0 → Hom_R(M, N) →(π*) N^q →(λ) N^p,
in which π*(f) = (f(e1), . . . , f(eq))^τ and λ((n1, . . . , nq)^τ) = A(n1, . . . , nq)^τ for n1, . . . , nq in N. Since π* is injective, Hom_R(M, N) ≅ Im(π*) = ker(λ) = sol_N(A(Z) = 0). □
Theorem 4 reveals that e := (e1, . . . , eq)^τ ∈ M^q is a "generic" solution of the system A(Z) = 0 in the sense that any solution of A(Z) = 0 is the image of e under some homomorphism. This means that M describes the properties of all the solutions of A(Z) = 0 "anywhere". So we define
Definition 4. Let A ∈ S^{p×q} ⊂ R^{p×q}. We call the R-module M = R^{1×q}/R^{1×p}A the module of formal solutions of the system A(Z) = 0. The dimension of M as an F-vector space is called the linear dimension of the system. The system is said to be of finite linear dimension if 0 < dim_F M < +∞.
Note that we choose to exclude systems with dim_F M = 0 in our definition since such systems cannot have nonzero solutions in any R-module (which includes all orthogonal ∆-extensions of F). The next lemma is used to describe modules of formal solutions for finite-rank left ideals in S ([6]).
Lemma 2. Let J be a left ideal of S. Assume that J does not contain any monomial in ∂_{ℓ+1}, . . . , ∂m, and that S/J is finite dimensional over F. Let I be the left ideal generated by J in R and J̄ = I ∩ S. Then S/J̄ and R/I are isomorphic as vector spaces over F. In particular, R/I is finite dimensional over F.
Proof. Let H be the set of all monomials in ∂_{ℓ+1}, . . . , ∂m. Since every element of H is invertible in R,
    J̄ = {a ∈ S | ha ∈ J for some h ∈ H}.    (12)
Since J ⊂ J̄, dim_F S/J̄ is finite. Let f_j be a nonzero polynomial in F[∂j] ∩ J̄ with minimal degree for j > ℓ. Then each f_j is of positive degree with a nonzero coefficient of ∂j^0 = 1, for otherwise J̄ would contain 1, and, hence, J would have a nonempty intersection with H by (12), a contradiction to our assumption. Since ∂j^{−1}f_j ∈ I, ∂j^{−1} is congruent to an element of F[∂j] modulo I. It follows that every element of R is congruent to an element of S modulo I (note that every element of R can be written as an element of S multiplied by the inverse of an element of H from the right-hand side). Let φ be the map from S/J̄ to R/I that sends a + J̄ to a + I for a ∈ S. The map is well-defined, injective and linear over F because J̄ = S ∩ I. By the conclusion made in the previous paragraph, for every element (b + I) of R/I with b ∈ R, there exists b′ in S such that b ≡ b′ mod I. Thus φ(b′ + J̄) = b + I. The map φ is surjective. □
Example 3. Consider a p × 1 matrix A = (L1, . . . , Lp)^τ, where the Li are in S. The system A(z) = 0 corresponds to the scalar equations L1(z) = ··· = Lp(z) = 0, whose R-module of formal solutions is M = R/ρ(R^{1×p}) = R/I, where I is the left ideal Σ_{i=1}^p R Li of R. Let J be the left ideal Σ_{i=1}^p S Li of S. Then, by Lemma 2, dim_F M is finite if dim_F S/J is finite and J contains no monomial in ∂_{ℓ+1}, . . . , ∂m. Consider the case ℓ = 0 and m = 2. If J is S(∂1 + 1), then dim_F(M) is not finite. On the other hand, if J is equal to S(∂1∂2(∂1 + 1)) + S(∂1∂2(∂2 + 1)), then dim_F S/J is not finite, but dim_F M = 1, because I = R(∂1 + 1) + R(∂2 + 1).
Example 4 (Integrable systems). Let A1, . . . , Am be in F^{n×n}, 1_n be the identity matrix in F^{n×n} and
    A = ( ∂1 · 1_n − A1 )
        (      ...      )  ∈ S^{mn×n}.
        ( ∂m · 1_n − Am )
The system A(Z) = 0 corresponds to {∂i(Z) = Ai Z}1≤i≤m, which is not necessarily fully integrable. Let M be its module of formal solutions and e = (e1, . . . , en)^τ ∈ M^n be as above. Then A(e) = 0 implies that ∂i e = Ai e for each i. Since the entries of Ai are in F, ∂i e_j ∈ Σ_{s=1}^n F e_s for all i, j, and thus R e_j ⊂ Σ_{s=1}^n F e_s for all j. Hence M = Σ_{s=1}^n R e_s = Σ_{s=1}^n F e_s. In particular, dim_F M ≤ n.
To check in practice whether a system is of finite linear dimension, we need to compute dim_F M. As seen in Example 4, when the system is given as an integrable system, we have a set of generators for M over F, so computing dim_F M can be done by linear algebra over F as in Example 5. Note that in the purely differential case, we have dim_F M = n if the matrices Ai satisfy (2), and dim_F M = 0 otherwise. When the system is given by an ideal in S, then Lemma 2 shows that either M = 0 (if the ideal contains a monomial in ∂_{ℓ+1}, . . . , ∂m) or an F-basis of M can be computed via Gröbner bases of S-modules. There are algorithms and implementations for this task [3, 4]. For more general matrices A ∈ S^{p×q}, computing an F-basis of M involves computing Gröbner bases of R-modules. In the purely differential case, this is again Gröbner bases of S-modules. When difference operators are involved, the algorithms developed in [9, 13] for pure difference equations with constant coefficients are generalized in [12] to produce Gröbner bases of R-modules.
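The "linear algebra over F" step can be mechanized: when some σ-matrix is singular, a vector in its left kernel exhibits an F-linear relation among the images of the generators. A minimal sketch with a hypothetical singular matrix (not the Ak of Example 5 below), assuming sympy:

    import sympy as sp

    x, k = sp.symbols('x k')
    # hypothetical singular matrix over F = C(x, k): its last two columns coincide
    Ak_ = sp.Matrix([[1, x, x], [k, 1, 1], [0, k, k]])
    assert Ak_.rank() == 2
    v = Ak_.T.nullspace()[0]       # left kernel: v^tau * Ak_ = 0
    assert (v.T * Ak_).applyfunc(sp.simplify) == sp.zeros(1, 3)
    # v then gives an F-linear dependence among sigma(e1), sigma(e2), sigma(e3)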
Let A ∈ S^{p×q} and M be the R-module of formal solutions for A(Z) = 0. Suppose that dim_F M = n and b1, . . . , bn form a basis of M over F. Then, for b := (b1, . . . , bn)^τ there exists Bi ∈ F^{n×n} such that ∂i(b) = Bi b for each i. We can regard M as the module of formal solutions for the integrable system {∂i(X) = Bi X}1≤i≤m. Indeed, suppose we find, as described in Example 4, its module MB of formal solutions and f := (f1, . . . , fn)^τ such that MB = Σ_{s=1}^n F f_s and ∂i(f) = Bi f for each i. Since b ∈ M^n is a solution of {∂i(X) = Bi X}1≤i≤m, there exists φ ∈ Hom_R(MB, M) such that b = φ(f) by Theorem 4. Since the bi are linearly independent over F, so are the fi. Hence MB = ⊕_{s=1}^n F f_s and φ is an isomorphism of R-modules.
Since ∂i and ∂j commute for any i and j, ∂i(∂j(b)) = ∂j(∂i(b)). From ∂i(b) = Bi b and the linear independence of b1, . . . , bn over F, it follows that
    σi(Bj)Bi + δi(Bj) = σj(Bi)Bj + δj(Bi),    1 ≤ i, j ≤ m,
i.e. B1, . . . , Bm satisfy the compatibility conditions (2). Suppose that Bt is singular for some t > ℓ. Then there exists a nonzero v ∈ F^{1×n} such that vBt = 0 and thus v∂t(b) = vBt b = 0. Since M is an R-module on which ∂t^{−1} acts, we have 0 = ∂t^{−1}(v∂t(b)) = σt^{−1}(v)∂t^{−1}(∂t(b)) = σt^{−1}(v)b, which implies that b1, . . . , bn are linearly dependent over F, a contradiction. So the Bj are invertible for ℓ + 1 ≤ j ≤ m and the system {∂i(X) = Bi X}1≤i≤m is fully integrable. We call it the fully integrable system associated to M w.r.t. the basis b1, . . . , bn (it is also called an integrable connection).
Since any orthogonal ∆-extension E of F is turned into an R-module via the action ∂i(e) = δi(e) for i ≤ ℓ and ∂i(e) = σi(e) for i > ℓ, sol_E(A(Z) = 0) is well-defined. We now set up a correspondence between the solutions in E of A(Z) = 0 and those of its associated fully integrable system.
Proposition 2. Let A(Z) = 0 with A ∈ S^{p×q} be a system of finite linear dimension, M be its module of formal solutions, e1, . . . , eq be R-generators for M and b1, . . . , bn be an F-basis of M such that A(e1, . . . , eq)^τ = 0 and
    ∂i(b1, . . . , bn)^τ = Bi(b1, . . . , bn)^τ    for each i.
Let P ∈ F^{q×n} be given by (e1, . . . , eq)^τ = P(b1, . . . , bn)^τ. Then, for any orthogonal ∆-extension E of F, the correspondence ξ ↦ Pξ is an isomorphism of C_E-modules between sol_E({∂i(X) = Bi X}1≤i≤m) and sol_E(A(Z) = 0).
Proof. To simplify notation, we denote sol_E(A(Z) = 0) and sol_E({∂i(X) = Bi X}1≤i≤m) by W_A and W_B, respectively. Write e = (e1, . . . , eq)^τ and b = (b1, . . . , bn)^τ. According to Theorem 4, for any ξ ∈ W_B, there exists φ ∈ Hom_R(M, E) such that ξ = φ(b). Hence A(Pξ) = A(Pφ(b)) = φ(A(Pb)) = φ(A(e)) = 0, so Pξ belongs to W_A. Thus the correspondence ξ ↦ Pξ is a homomorphism of C_E-modules from W_B to W_A. For every η ∈ W_A there exists ψ ∈ Hom_R(M, E) such that η = ψ(e) = ψ(Pb) = Pψ(b). The correspondence ξ ↦ Pξ is then surjective, because ψ(b) belongs to W_B. If ξ ∈ W_B and Pξ = 0, then there exists φ ∈ Hom_R(M, E) such that ξ = φ(b). Hence 0 = Pξ = φ(Pb) = φ(e). It follows that φ maps everything to 0 as M is generated by e1, . . . , eq over R. Thus ξ = 0 and the correspondence is bijective. □
Definition 5. Let A, M, b1, . . . , bn and P be as in Proposition 2. A q × n matrix V with entries in an orthogonal ∆-extension E of F is called a fundamental matrix for A(Z) = 0 if V = PU where U ∈ E^{n×n} is a fundamental matrix of the fully integrable system associated to M w.r.t. the basis b1, . . . , bn. A Picard-Vessiot ring for any fully integrable system associated to M is called a Picard-Vessiot ring for A(Z) = 0.
Although this is not stated in the definition, it follows from Proposition 2 that the columns of a fundamental matrix form a C_E-basis of the C_E-module sol_E(A(Z) = 0): denote sol_E(A(Z) = 0) and sol_E({∂i(X) = Bi X}1≤i≤m) by W_A and W_B respectively. Then the columns of V = PU are in W_A by Proposition 2. Let c ∈ C_E^{n×1} be such that 0 = Vc = PUc. Since Uc ∈ W_B, we have Uc = 0 by Proposition 2, hence c = 0 since U is invertible. Thus the columns of V are linearly independent over C_E. For any η ∈ W_A there exists ξ ∈ W_B such that η = Pξ. By Proposition 1 there exists c ∈ C_E^{n×1} such that ξ = Uc. Hence η = PUc = Vc.
Let b1, . . . , bn and d1, . . . , dn be two bases of M over F. Write b = (b1, . . . , bn)^τ and d = (d1, . . . , dn)^τ, and let T ∈ GL_n(F) be given by d = Tb. For each i, let Bi, Di ∈ F^{n×n} be such that ∂i(b) = Bi b and ∂i(d) = Di d. If E is a Picard-Vessiot ring for {∂i(X) = Bi X}1≤i≤m and U ∈ E^{n×n} is a corresponding fundamental matrix, then TU is a fundamental matrix for {∂i(Y) = Di Y}1≤i≤m by Theorem 4, so E is a Picard-Vessiot ring for that system too. This justifies the second part of Definition 5.
As a final consequence of Theorems 1 and 2, we have
Theorem 5. Every system A(Z) = 0 of finite linear dimension has a fundamental matrix and has a Picard-Vessiot ring E. If F has characteristic 0 and C_F is algebraically closed, then C_E = C_F.
Proof. Let A ∈ S^{p×q} be such that A(Z) = 0 is of finite linear dimension n > 0, M be its module of formal solutions, e1, . . . , eq be R-generators for M and b1, . . . , bn be an F-basis of M such that A(e1, . . . , eq)^τ = 0 and ∂i(b1, . . . , bn)^τ = Bi(b1, . . . , bn)^τ for each i. Let P ∈ F^{q×n} be given by (e1, . . . , eq)^τ = P(b1, . . . , bn)^τ. Since {∂i(X) = Bi X}1≤i≤m is a fully integrable system, there exists, by Theorem 1, a fundamental matrix U ∈ E^{n×n} for that system where E is some orthogonal ∆-extension of F. Then V := PU ∈ E^{q×n} is a fundamental matrix for A(Z) = 0. The existence of the Picard-Vessiot ring and the second statement follow directly from Theorem 2. □
Assume that F has characteristic 0 with an algebraically closed field of constants. Let E be a Picard-Vessiot ring for the system A(Z) = 0. As mentioned after Theorem 2, sol_E({∂i(X) = Bi X}1≤i≤m) is of dimension n over C_F. But that space is isomorphic to sol_E(A(Z) = 0) by Proposition 2. Therefore the dimension of sol_E(A(Z) = 0) as a C_F-vector space equals n, the linear dimension of A(Z) = 0.
Example 5. Let F, δx and σk be as in Example 1, and let the system A be given by
    Ax = ( (x+1)/x    k(x+1−k)/(x²(k−1))              −k(x+1−k)/(x²(k−1))          )
         ( x+1        (xk−k²+2x²+kx²+k−1)/(x(k−1))    −(xk−k²+2x²+kx²−1)/(x(k−1))  )
         ( x+1        (xk+2x²+kx²−2k²+k)/(x(k−1))     −(xk+2x²+kx²−2k²+1)/(x(k−1)) )
    Ak = ( (k+1)/k     (k+1−xk−x)/(x(k−1))       (xk+x−k−1)/(x(k−1))   )
         ( x(k+1)/k    (1−2x+k−xk+x³)/(k−1)      (2x+xk−x³−k−1)/(k−1)  )
         ( x(k+1)/k    (1−2xk−2x+k+x³)/(k−1)     (2xk+2x−k−x³−1)/(k−1) ).
Note that Ax and Ak satisfy the compatibility conditions (2) but Ak is singular, so the system is not fully integrable. Let S = F[∂x; 1, δx][∂k; σk, 0] and R be the corresponding Laurent-Ore algebra. Let A ∈ S^{6×3} be the matrix corresponding to the system given by Ax and Ak (see Example 4), M = R^{1×3}/R^{1×6}A be the module of formal solutions
for the system A(Z) = 0, and {e1, e2, e3} be a set of R-generators of M such that ∂x(e1, e2, e3)^τ = Ax(e1, e2, e3)^τ and ∂k(e1, e2, e3)^τ = Ak(e1, e2, e3)^τ. Solving the linear system (v1, v2, v3)Ak = 0 over F, we see that Ak has rank 2, and that ∂k(e1), ∂k(e2) and ∂k(e3) are linearly dependent over F (so are e1, e2 and e3 by an application of ∂k^{−1}). A nontrivial solution of (v1, v2, v3)Ak = 0 and an application of ∂k^{−1} yield
    ( e1 )   ( 1                0             )
    ( e2 ) = ( 0                1             ) ( e1 )
    ( e3 )   ( x(k−1)/(x²−1)    (x²−k)/(x²−1) ) ( e2 ),
the 3 × 2 matrix being P, which, together with Ax and Ak, implies that ∂x(e1, e2)^τ = Bx(e1, e2)^τ and ∂k(e1, e2)^τ = Bk(e1, e2)^τ where
    Bx = ( (x³+x²−x−1−xk−k+k²)/(x(x²−1))     k(x+1−k)/(x²(x²−1))         )
         ( (x³−x²−x−1−xk−kx²+k²)/(x²−1)      (3x²−1+xk+kx²−k²)/(x(x²−1)) )
    Bk = ( (xk+x+k²+2k+1)/(k(x+1))           −(k+1)/(x(x+1))   )
         ( −x(kx²−x−k²−2k−1)/(k(x+1))        (x²+x−1−k)/(x+1)  ).
Since Bk is invertible, the system B given by Bx and Bk is fully integrable, and, hence, e1 and e2 form an F-basis of M. The same method used to construct a fundamental matrix for the system in Example 1 yields a fundamental matrix for B:
    U = ( xk e^x     −k x^k          )
        ( kx² e^x    (x²−k−1)x^{k+1} ),
hence PU is a fundamental matrix for A. In addition, a Picard-Vessiot ring of B is a Picard-Vessiot ring of A.
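Since Bx, Bk and U are explicit, the fundamental-matrix claim can be verified mechanically. A minimal sympy check (our addition; the symbols are declared positive to ease power simplification):

    import sympy as sp

    x, k = sp.symbols('x k', positive=True)
    Bx = sp.Matrix([
        [(x**3 + x**2 - x - 1 - x*k - k + k**2)/(x*(x**2 - 1)),
         k*(x + 1 - k)/(x**2*(x**2 - 1))],
        [(x**3 - x**2 - x - 1 - x*k - k*x**2 + k**2)/(x**2 - 1),
         (3*x**2 - 1 + x*k + k*x**2 - k**2)/(x*(x**2 - 1))]])
    Bk = sp.Matrix([
        [(x*k + x + k**2 + 2*k + 1)/(k*(x + 1)), -(k + 1)/(x*(x + 1))],
        [-x*(k*x**2 - x - k**2 - 2*k - 1)/(k*(x + 1)), (x**2 + x - 1 - k)/(x + 1)]])
    U = sp.Matrix([[x*k*sp.exp(x), -k*x**k],
                   [k*x**2*sp.exp(x), (x**2 - k - 1)*x**(k + 1)]])
    # delta_x(U) = Bx*U and sigma_k(U) = Bk*U:
    assert (U.diff(x) - Bx*U).applyfunc(sp.simplify) == sp.zeros(2, 2)
    assert (U.subs(k, k + 1) - Bk*U).applyfunc(sp.simplify) == sp.zeros(2, 2)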
Acknowledgments: We would like to thank an anonymous referee for his useful and constructive remarks. The second and third authors were supported in part by a 973 key project (no. 2004CB31830), and by a Bourse du Gouvernement Français (BGF no. 2002915), respectively.

6. REFERENCES
[1] Bialynicki-Birula, A. On Galois theory of fields with operators. Amer. J. Math. 84 (1962), 89–109.
[2] Cassidy, P. and Singer, M. Galois theory of parameterized differential equations and linear differential algebraic groups. Preprint, 2005.
[3] Chyzak, F., Quadrat, A., and Robertz, D. OreModules: A symbolic package for the study of multidimensional linear systems. In Proc. of MTNS'04, Leuven (Belgium) (2004), CDRom.
[4] Chyzak, F. and Salvy, B. Non-commutative elimination in Ore algebras proves multivariate identities. J. Symbolic Comput. 26, 2 (1998), 187–227.
[5] Kolchin, E. Differential Algebra and Algebraic Groups. Academic Press, New York and London, 1973.
[6] Labahn, G. and Li, Z. Hyperexponential solutions of finite-rank ideals in orthogonal Ore rings. In Proc. ISSAC'2004 (2004), J. Gutierrez, Ed., ACM, 213–220.
[7] Li, Z., Schwarz, F., and Tsarev, S. P. Factoring systems of linear PDEs with finite-dimensional solution spaces. J. Symbolic Comput. 36, 3–4 (2003), 443–471.
[8] Malgrange, B. Motivations and introduction to the theory of D-modules. In Computer Algebra and Differential Equations (1994), E. Tournier, Ed., vol. 193 of LMS Lecture Note Series, Cambridge University Press, pp. 3–20.
[9] Pauer, F. and Unterkircher, A. Gröbner bases for ideals in Laurent polynomial rings and their applications to systems of difference equations. Appl. Algebra in Eng., Comm., and Comp. 9 (1999), 271–291.
[10] Singer, M. and van der Put, M. Galois Theory of Difference Equations. LNM 1666. Springer, 1997.
[11] Singer, M. and van der Put, M. Galois Theory of Linear Differential Equations, vol. 328 of Grundlehren der Mathematischen Wissenschaften. Springer, Heidelberg, 2003.
[12] Wu, M. Factorization and decomposition of modules over Laurent-Ore algebras. Thèse de mathématiques, Université de Nice, expected 2005.
[13] Zampieri, S. A solution of the Cauchy problem for multidimensional discrete linear shift-invariant systems. Linear Algebra and Appl. 202 (1994), 143–162.
On Using Bi-equational Constraints in CAD Construction

Christopher W. Brown        Scott McCallum

Department of Computer Science, Stop 9F, United States Naval Academy, Annapolis, MD 21402, USA
Department of Computing, Macquarie University, NSW 2109, Australia

[email protected]  [email protected]

ABSTRACT
This paper introduces an improved method for constructing cylindrical algebraic decompositions (CADs) for formulas with two polynomial equations as implied constraints. The fundamental idea is that neither of the varieties of the two polynomials is actually represented by the CAD the method produces; only the variety defined by their common zeros is represented. This allows for a substantially smaller projection factor set, and for a CAD with many fewer cells. In the current theory of CADs, the fundamental objective is to decompose n-space into regions in which a polynomial equation is either identically true or identically false. With many polynomials, one seeks a decomposition into regions in which each polynomial equation is identically true or false independently. The results presented here are intended to be the first step in establishing a theory of CADs in which systems of equations are fundamental objects, so that given a system we seek a decomposition into regions in which the system is identically true or false — which means each equation is no longer considered independently. Quantifier elimination problems of this form (systems of equations with side conditions) are quite common, and this approach has the potential to bring large problems of this type into the scope of what can be solved in practice. The special case of formulas containing two polynomial equations as constraints is an important one, but this work is also intended to be extended in the future to the more general case.

Categories and Subject Descriptors
G.4 [Mathematics of Computing]: Mathematical Software—Algorithm design and analysis

General Terms
Algorithms, Theory

Keywords
CAD, polynomial systems

1. INTRODUCTION
Cylindrical Algebraic Decomposition (CAD) provides a data structure for representing semi-algebraic sets. It is a data structure that is particularly useful for performing quantifier elimination in elementary real algebra, and it is in this context that Collins invented CAD in the early 1970s [9]. However, people have pointed out many other interesting uses of CAD, including the simplification of elementary functions [4], non-linear optimization, topologically reliable curve plotting [2], and simplification of large quantifier-free formulas in elementary real algebra [6]. In the context of quantifier elimination, and many other applications of CAD, one frequently encounters input formulas consisting of a system of equations along with a side condition given by an arbitrary formula. In this case, one would like to take advantage of the constraints imposed by this system of equations. This paper proves some results that lead us to more efficient CAD construction when the system in such an input contains two equations, in which case we say there is a bi-equational constraint. We provide some example computations to show that when this improvement is applicable, its benefits are enormous.

1.1 Previous work
An application of CAD generally starts with a formula — a boolean combination of integral polynomial equalities and inequalities — which describes a semi-algebraic set. We extract the set A ⊂ Z[x1, . . . , xr] of all the polynomials occurring in the formula. The CAD algorithm then constructs a decomposition of R^r into cylindrically arranged cells such that the signs of the elements of A are constant inside any given cell. This cylindrical arrangement means that the projections onto R^j, 0 < j < r, of any two cells are either identical or disjoint. Clearly the formula is identically true or identically false in any cell of this CAD. Thus, by marking the cells of the CAD appropriately, we represent the set defined by the formula. CAD construction proceeds in two phases, projection and lifting. The projection phase computes a set of polynomials called the projection factor set. The projection factor set contains the irreducible factors of the set A, and, in general, other polynomials as well. The maximal connected regions in which the projection factors have invariant signs are the cells of the CAD that is to be constructed. Thus, the projection factor set provides an implicit representation of the CAD. The lifting phase then constructs an explicit representation of this CAD. General descriptions of CAD construction may be found in [11], [3], and [9].
Many people have pointed out that one of the problems with a CAD is that it contains a lot more information about the polynomials occurring in the input formula than is typically needed for the problem at hand. In fact, a CAD built for a particular formula is capable of representing any set that can be defined by any formula in the same polynomials, and many more sets as well. Partial CAD [11] was an attempt to ameliorate this by a sort of lazy approach to lifting. The method of equational constraints [10] was an attempt to address this problem for inputs of the form we have discussed — a system of equations with an arbitrary side condition. The idea is as follows: if an input formula includes the constraint f = 0, then decompose R^r into regions in which f has invariant sign, and then refine the decomposition so that the other polynomials have invariant sign in those cells in which f = 0. The signs of the other polynomials in cells in which f ≠ 0 are, after all, irrelevant. Additionally, the method of equational constraints seeks to deduce and use constraints that are not explicit in the input formula, but rather arise as consequences of two or more explicit constraints (e.g. if f = 0 and g = 0 are explicit constraints, then res(f, g) = 0 is also a constraint).
1.2 Our contribution
That part of the equational constraints method that takes advantage of the constraint f = 0 in projection has been successful. However, the part that tries to take advantage of more than one explicit constraint by deducing new, implicit constraints has been problematic. The original scheme proposed by Collins has never been proved correct, and the version proposed in [18] is much weaker. Thus, we can only really say that for inputs of the form of a system of equations with an arbitrary side condition, only the case of a system of one equation has been adequately addressed. This paper addresses the case of two equations. More fundamentally, this paper proposes a break with the way CADs have always been constructed. The projection factor set in CAD construction has always contained polynomials¹. We propose a model of projection where the two constraint polynomials act as a single object during projection.
¹ The Brown-McCallum projection [5] could be viewed as having a projection factor set that includes polynomials and points. That is less of an issue, however, since the points play no role during the projection process.

1.3 Organization of the paper
In Section 2 we present some technical background information, followed by Sections 3, 4, and 5 in which we state and prove our main results. Section 6 describes two 4-variable quantifier elimination problems and walks through the process of applying the theorems of the previous sections to efficiently construct CADs and find solution formulas. These experiences then motivate the algorithm proposed in Section 7.

2. BACKGROUND MATERIAL
In this section we provide a short synopsis of some relatively recent developments in the theory and practice of CAD which are relevant to the present paper. First we discuss the application of equational constraints to simplify (that is, reduce) the McCallum projection operation [14, 15] used in the projection phase of CAD. The idea was originally introduced in [10] without proof. Subsequent papers [17, 18] have partially validated the method initially proposed. It is suggested that the reader first consult [10], especially Sections 4 and 8, for a readable and intuitive account of the idea.
It may be helpful to consider a special case which is relevant to the present paper. Assume r ≥ 2 and let x denote (x1, . . . , x_{r−1}). Let f and g be squarefree, relatively prime elements of the polynomial ring Z[x, xr], both of which are primitive and have positive degree in xr. Suppose we wish to decide whether or not the sentence
    (∃x)(∃xr)[f(x, xr) = 0 ∧ g(x, xr) = 0]
of Tarski algebra is true. (Let us call this a bi-equational existence problem.) The McCallum projection of the set A = {f, g} is the set P(A) ⊂ Z[x] consisting of all of the nonzero coefficients with respect to xr of f and g, together with discr_{xr}(f), discr_{xr}(g) and res_{xr}(f, g). Notice that the equations f(x, xr) = 0 and g(x, xr) = 0 are constraints of our problem. Let us nominate f(x, xr) = 0 to be the pivot constraint, with respect to which the set P(A) will be reduced, and put E = {f}. Then the restricted projection of A relative to E, denoted by P_E(A) in [17], is the smaller set consisting of all of the nonzero coefficients of f only, together with discr_{xr}(f) and res_{xr}(f, g). (An astute reader will note that this definition of P_E(A) differs slightly, but not significantly, from that given in [17].) The main result of [17] implies that P_E(A) can be safely used in place of P(A) for the first projection step (that is, elimination of xr).
The survey paper [10] proceeds to describe how equational constraints could be propagated, and to suggest the use of such propagated constraints to simplify subsequent projection steps. For the bi-equational existence problem described in the previous paragraph, this amounts in part to observing that the (r−1)-variate equation res_{xr}(f, g) = 0 is an implied constraint for our problem, and to suggesting that this implied constraint could be used to reduce the second application of McCallum's projection operation (that is, elimination of x_{r−1}). The issue of propagation is discussed in [18], which regrettably does not succeed in rigorously justifying the use of propagated constraints for subsequent projection steps, as suggested in [10]. (The weaker notion of a semi-restricted projection of a set relative to a constraint is introduced and applied in [18].)
A second important recent development in CAD theory and practice, independent of the concept of equational constraints, was the introduction of the Brown-McCallum projection operation [5]. For our bi-equational existence problem discussed in the previous two paragraphs, the Brown-McCallum projection of A is the set
    Proj(A) = { ldcf_{xr}(f), ldcf_{xr}(g), discr_{xr}(f), discr_{xr}(g), res_{xr}(f, g) }.
Note that Proj(A) includes only the leading coefficients of f and g. Certain algorithm modifications are required to use the Brown-McCallum projection for CAD construction. Subject to such modifications, this projection is currently the best (that is, smallest) general purpose projection operation which has been proved valid for CAD. Finally, combining the above two developments, one could contemplate a restricted equational version of the Brown-McCallum projection. For our bi-equational existence problem, in which f is chosen as the pivot constraint so that
Suppose we want to construct a CAD for f = g = 0 where
E = {f }, this is the set
f = (x1 + x2 )x23 − 3(2x2 + x1 − 1)x3 + x1 x2 + x2 − 3 g = (x1 − x2 )x23 + 2(x2 x1 − 1)x3 + x21 − x2 − 1.
ProjE (A) = {ldcf xr (f ), discrxr (f ), resxr (f, g)}. This is the smallest projection set which, based upon the existing literature, we could be confident to use for our example.
3. STATEMENT OF MAIN RESULTS
This paper assumes that the reader is familiar with the McCallum projection and the notions of order-invariance and analytic delineability on which it relies. A summary of the key technical terms and results can be found in [17]. The reader is referred to [14, 15] for a complete presentation.

3.1 Delineability of V(f, g)
For this subsection and the next, unless stated otherwise, we shall let r ≥ 2, let x denote (x1, . . . , x_{r−1}), and let f and g be squarefree, relatively prime elements of the polynomial ring ℝ[x, x_r], both of which are primitive and have positive degree in x_r. We could construct a CAD representing f = g = 0 using the Brown–McCallum projection Proj(A) of the set A = {f, g}. (See the previous section for the definition of Proj(A).) The main result of [5] implies that over any cell in a CAD of ℝ^{r−1} in which discr_{x_r}(f), discr_{x_r}(g) and res_{x_r}(f, g) are order-invariant and ldcf_{x_r}(f), ldcf_{x_r}(g) are sign-invariant, both f and g are delineable and stack construction can proceed — provided f and g are not nullified in the cell. As suggested in the previous section, we could apply equational constraints to the Brown–McCallum projection. We could choose a pivot, say f, put E = {f} and use the projection set Proj_E(A) defined in the previous section. The main results of [5] and [17] imply that over any cell in a CAD of ℝ^{r−1} in which discr_{x_r}(f) and res_{x_r}(f, g) are order-invariant and ldcf_{x_r}(f) is sign-invariant, f is delineable and g has constant sign in each section of f, which means stack construction can proceed — assuming that f is not nullified in the cell. Thus, the method of equational constraints is based on the observation that g doesn't need to be delineable over a cell to construct a CAD representing f = g = 0. It suffices for f to be delineable, and for g to have constant sign in the sections of f. This allows us to reduce the size of the projection. The idea behind what we propose is that neither f nor g needs to be delineable over the (r − 1)-level² cells to produce a CAD representing f = g = 0. It suffices for the common roots of f and g to be delineable, which we define precisely as follows. Where f and g are real polynomials in x and x_r, we shall denote by V (and sometimes by V(f, g)) the real variety of f and g,

V = V(f, g) = {(x, x_r) ∈ ℝ^r | f(x, x_r) = g(x, x_r) = 0},

and say that V is delineable on a connected subset S of ℝ^{r−1} provided that the portion of V which lies in S × ℝ consists of the union of the graphs of some k ≥ 0 continuous functions θ1 < · · · < θk from S to ℝ.

²The level of a cell is the dimension of the Euclidean space of which it is a subset.

3.2 Towards a projection for V(f, g)
It is clear that if V(f, g) is to be delineable over the cells of the induced CAD of ℝ^{r−1}, we will in general have to include res_{x_r}(f, g) in the projection. The question is: does this suffice? The answer, unfortunately, is no. Suppose we want to construct a CAD for f = g = 0, where

f = (x1 + x2)x3² − 3(2x2 + x1 − 1)x3 + x1x2 + x2 − 3,
g = (x1 − x2)x3² + 2(x1x2 − 1)x3 + x1² − x2 − 1.    (1)

When x1 = x2 = 0, f and g have no common zeros, even though (0, 0) lies on the resultant curve and there are solutions above all points on the resultant curve near (0, 0). Thus, (0, 0) must be a separate cell in the CAD of ℝ², but if the resultant alone is included in the projection, it will not be. What causes the problem is that the leading coefficients both vanish at (0, 0). The simultaneous vanishing of both leading coefficients is not the only way that a projection based purely on res_{x_r}(f, g) can fail. Consider constructing a CAD for f = g = 0, where

f = (x3 + x1)(x3 − x1) + x1 − x2,
g = (x3 + x2)(x3 − x2) + x2 − x1.    (2)

The leading coefficients of f and g are constant, so they are not an issue. However, once again, (0, 0) must be a separate cell in the CAD of ℝ² in order for V(f, g) to be delineable, because it is only over (0, 0) that f and g have exactly one common real root. The Brown–McCallum projection produces from res_{x_r}(f, g) a CAD of ℝ² such that res_{x_r}(f, g) is order-invariant in every cell, and yet (0, 0) is not its own cell. Thus, even when the simultaneous vanishing of leading coefficients is not a problem, the projection consisting only of res_{x_r}(f, g) may not suffice. Why does it fail in this case? The problem is that f, g, ∂f/∂x_r and ∂g/∂x_r have a common zero above (0, 0). It turns out that the previous two examples of how a projection consisting solely of the resultant of f and g can fail show the only two ways it can fail. This is formalized in Theorem 3.1. What is not yet addressed is how to deal with the polynomials occurring in the side condition H. In order to represent the set f = g = 0 ∧ H, it suffices that each polynomial h occurring in H is sign-invariant in each section of V(f, g). One would hope that it would be enough to include res_{x_r, x_{r−1}}(f, g, h), the multipolynomial resultant of f, g and h (for each h in H), in the projection to ensure this. In fact, Theorem 3.2 shows that as long as each such multipolynomial resultant is not the zero polynomial, it does suffice.
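The failure mode of display (2) is easy to check computationally. The following is a minimal sketch (ours, not part of the authors' development) verifying in SymPy that f, g, ∂f/∂x3, ∂g/∂x3 share a zero above (x1, x2) = (0, 0):

```python
# Illustrative check of example (2): a common zero of f, g and both
# x3-derivatives above the origin of the (x1, x2)-plane.
from sympy import symbols, diff, solve

x1, x2, x3 = symbols('x1 x2 x3')
f = (x3 + x1)*(x3 - x1) + x1 - x2
g = (x3 + x2)*(x3 - x2) + x2 - x1

system = [p.subs({x1: 0, x2: 0}) for p in (f, g, diff(f, x3), diff(g, x3))]
print(solve(system, x3))   # expect [0]: a common zero above (0, 0)
```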
3.3 Statements of two main theorems
We define the number of common roots at infinity of f and g at p ∈ ℝ^{r−1} to be the minimum of deg_{x_r} f − deg f(p, x_r) and deg_{x_r} g − deg g(p, x_r).

Theorem 3.1. Let f and g be real polynomials in the variables x1, . . . , x_r of positive degrees in x_r. Let R(x) be the resultant of f and g with respect to x_r, and assume that R is not the zero polynomial. Let S ⊂ ℝ^{r−1} be connected. Suppose that R is order-invariant in S, the number of common roots at infinity of f and g is constant and finite in S, and there is no common zero of the polynomials f, g, ∂f/∂x_r, ∂g/∂x_r in the cylinder over S. Then V is delineable on S.

Theorem 3.2. Let f and g be real polynomials in the variables x1, . . . , x_r of positive degrees in x_r. Let R(x) be the resultant of f and g with respect to x_r, and assume that R is not the zero polynomial. Let S′ ⊂ ℝ^{r−2} be connected, suppose that R is delineable on S′, and let S be a section of
R over S′. Assume the hypotheses of Theorem 3.1. Then V is delineable on S. Let h(x, x_r) be an element of ℝ[x, x_r] which has positive degree in x_r. Denote (x1, . . . , x_{r−2}) by x̃, let ρ(x̃) = res_{x_{r−1}, x_r}(f, g, h) and suppose that ρ(x̃) ≠ 0. Suppose further that ρ is order-invariant in S′. Then h is sign-invariant in each section of V over S.

Subsequent sections (4 and 5) contain the proofs of the above theorems.

4. PROOF OF THEOREM 3.1
We first present a key lemma for the proof of Theorem 3.1. The lemma's statement makes use of the concept of the order-invariance of a polynomial with respect to a given variable in some set, which we now define. We say that f(x, x_r) is order-invariant with respect to x_r in a subset σ of ℝ^r if the order of f with respect to x_r at (p, p_r) is constant as (p, p_r) varies in σ.

Theorem 4.1. Let R(x) be the resultant of f and g with respect to x_r. Let S ⊂ ℝ^{r−1} be connected. Let σ be a section over S contained in the real variety of f such that f is order-invariant with respect to x_r in σ. Suppose that R is order-invariant in S. Then g is sign-invariant in σ.

REMARK CONCERNING PROOF. This theorem is in a sense a rewording and a slight generalization of Theorem 2.2 of [17], the proof of which carries over almost word for word.

We now supply the proof of Theorem 3.1. We assume that S has positive dimension. (The dimension 0 case is trivial.) By connectedness of S it suffices to show that V is delineable on S near an arbitrary point p of S. Let p be a point of S. Assume without loss of generality that the number of common roots at infinity of f and g equals deg_{x_r} f − deg f(p, x_r). By hypothesis, this number is finite. Hence f(p, x_r) ≠ 0. Also, as a consequence of an hypothesis, f is degree-invariant in S near p. (For deg f(q, x_r) ≥ deg f(p, x_r), for all q sufficiently near p, by continuity of the coefficients of f at p. Therefore

deg_{x_r} f − deg f(q, x_r) ≤ deg_{x_r} f − deg f(p, x_r),

for all q sufficiently near p. But the number of common roots at infinity of f and g at q is at most the left hand side of the above inequality, by definition, and is equal to the right hand side of the inequality, by hypothesis. Therefore the inequality is in fact an equation, from which the claim follows immediately.) Denote the degree of f(p, x_r) by l. Let α1 < · · · < αk, k ≥ 0, be the real roots of f(p, x_r), let α_{k+1}, . . . , α_t, k ≤ t, be the distinct non-real roots of f(p, x_r), and let m_i be the multiplicity of α_i, for 1 ≤ i ≤ t. Observe that Σ_{i=1}^{t} m_i = l. Let

κ = min({|α_i − α_j| : 1 ≤ i < j ≤ t} ∪ {1}).

Let ε = κ/2 and let C_i be the circle of radius ε centred at α_i, for 1 ≤ i ≤ t. Refine ε > 0 as necessary to ensure that, for each i, 1 ≤ i ≤ k, for which g(p, α_i) ≠ 0, C_i and its interior contain no root of g(p, x_r). By root continuity (Theorem (1,4) of [13]) and degree-invariance of f on S near p, there exists a neighbourhood N0 ⊂ ℝ^{r−1} of p such that for every fixed point x of S ∩ N0, deg f(x, x_r) = l and the interior of each C_i contains exactly m_i roots (multiplicities counted) of f(x, x_r) (which is here considered as a polynomial in x_r alone). To prove the delineability of V on S near p it suffices to show that for each i, 1 ≤ i ≤ k, for which g(p, α_i) = 0, there exists a neighbourhood N_i ⊂ N0 of p such that for every fixed x ∈ S ∩ N_i, there is exactly one common root, say θ_i(x), of f(x, x_r) and g(x, x_r) contained in the interior of C_i; and for each i, 1 ≤ i ≤ k, for which g(p, α_i) ≠ 0, there exists a neighbourhood N_i ⊂ N0 of p such that for every fixed x ∈ S ∩ N_i, there is no common root of f(x, x_r) and g(x, x_r) contained in the interior of C_i. (For if this has been shown, put N = ∩_{i=0}^{k} N_i and argue as in the proof of Theorem 3.2 of [14].) We now proceed to prove that for each i, 1 ≤ i ≤ k, for which g(p, α_i) = 0, there exists a neighbourhood N_i ⊂ N0 of p such that for every fixed x ∈ S ∩ N_i, there is exactly one common root of f(x, x_r) and g(x, x_r) (as polynomials in x_r) contained in the interior of C_i. By hypothesis, (p, α_i) is not a common zero of f, g, ∂f/∂x_r, ∂g/∂x_r. Therefore either ∂f/∂x_r(p, α_i) ≠ 0 or ∂g/∂x_r(p, α_i) ≠ 0. In the former case, we have m_i = 1, so we can take N_i = N0. Denote by σ_i the graph of the real root function θ_i defined and continuous in S ∩ N0. Then f is order-invariant with respect to x_r in σ_i, since the order of f with respect to x_r at (x, θ_i(x)) is 1, for every fixed x ∈ S ∩ N0. Hence, by Theorem 4.1 (in which we take S to be S ∩ N0), g is sign-invariant in σ_i. Thus, for every fixed x ∈ S ∩ N0, θ_i(x) is a common root of f(x, x_r) and g(x, x_r), indeed the unique such common root inside C_i. In the latter case (∂g/∂x_r(p, α_i) ≠ 0), the implicit function theorem can be applied to g at (p, α_i), yielding a neighbourhood N_i ⊂ N0 of p and an analytic function θ_i : N_i → ℝ whose graph σ_i is identical to the real variety of g near (p, α_i). Observing that g is order-invariant with respect to x_r in σ_i, Theorem 4.1 can be applied with the roles of f and g reversed to deduce that f is sign-invariant in σ_i. Thus, for every fixed x ∈ S ∩ N_i, θ_i(x) is the unique common root of f(x, x_r) and g(x, x_r) inside C_i. It remains to show that for each i, 1 ≤ i ≤ k, for which g(p, α_i) ≠ 0, there exists a neighbourhood N_i ⊂ N0 of p such that for every fixed x ∈ S ∩ N_i, there is no common root of f(x, x_r) and g(x, x_r) inside C_i. We simply choose N_i ⊂ N0 such that g(x, x_r) ≠ 0 for all (x, x_r) with x ∈ N_i and x_r inside C_i. Such an N_i exists by continuity of g and compactness of the closed disk bounded by C_i, since g(p, x_r) ≠ 0 for all x_r in this closed disk. The proof is complete.

5. PROOF OF THEOREM 3.2
The proof of Theorem 3.2 will require the use of some results about real and complex analytic functions. One such result, a key lemma for our proof, we state at the outset.

Theorem 5.1. Let ρ and ρ∗ be functions analytic in the polydisk ∆ about the origin in complex n-space ℂ^n. Suppose that the zero set of ρ∗ is contained in the zero set of ρ in ∆. Then, for some polydisk ∆′ ⊂ ∆ about the origin, and some m ≥ 1, ρ^m is divisible by ρ∗ in ∆′. That is, ρ^m = ρ∗ρ′, for some analytic ρ′ in ∆′.

REMARK CONCERNING THE PROOF. This theorem is a relatively straightforward consequence of an important result concerning the divisibility of one analytic function by another, which is stated as Theorem 9J of Chapter 1 of [19].
(If ρ∗ does not vanish identically near the origin, then one puts m = ord_0 ρ∗ and shows that the hypothesis of Theorem 9J is satisfied for ρ^m and ρ∗ in a suitable polydisk ∆′ ⊂ ∆ about the origin.)

PROOF OF THEOREM 3.2. Throughout the rest of this section we shall denote x_{r−1} by y and x_r by z. Let σ be an arbitrary section of V over S. By connectedness of S′ (hence S, hence σ), it suffices to show that h is locally sign-invariant in σ. Take a point p = (p̃, β, γ) in σ. That h is sign-invariant in σ near p follows by continuity of h in case h(p̃, β, γ) ≠ 0. So henceforth assume that h(p̃, β, γ) = 0. By an hypothesis, either ∂f/∂z or ∂g/∂z does not vanish at p. Without loss of generality assume the former and that p is the origin. We aim to construct a function ρ∗, analytic near 0̃ in complex (r − 2)-space ℂ^{r−2}, whose zero set is the projection onto ℂ^{r−2} of the portion of the variety of f, g and h near the origin in ℂ^r. By assumption the univariate polynomial f(0̃, 0, z) is divisible exactly once by z. Therefore, by Hensel's Lemma (Theorem 3.1 of [17]), there is a polydisk ∆1 about the origin in complex (r − 1)-space and polynomials in z, f1(x̃, y, z) = z − θ(x̃, y) and f2(x̃, y, z), whose coefficients are elements of the formal power series ring ℂ[[x̃, y]], absolutely convergent in ∆1, such that θ(0̃, 0) = 0, f2(0̃, 0, 0) ≠ 0 and f = f1f2. Since a function defined as the sum of a convergent power series is analytic, θ(x̃, y) and the coefficients of f2 are analytic in ∆1. For any δ > 0, we denote by ∆(0; δ) the disk in ℂ about the origin of radius δ. Put P(x̃, y) = res_z(f1, g), an element of ℂ[[x̃, y]], absolutely convergent in ∆1. Then P(0̃, y) ≠ 0, since R(0̃, y) ≠ 0. Therefore, by the (r − 1)-variable analogue of the Weierstrass preparation theorem (as presented in Lecture 16 of [1]), there is a polydisk ∆2 = ∆̃2 × ∆(0; δ) ⊂ ∆1 about the origin in complex (r − 1)-space, a polynomial

P1(x̃, y) = y^k + a1(x̃)y^{k−1} + · · · + a_k(x̃),

with the a_i(x̃) ∈ ℂ[[x̃]], absolutely convergent in ∆̃2, and an element P2(x̃, y) of ℂ[[x̃, y]], absolutely convergent in ∆2, such that P1(0̃, y) = y^k, P2(0̃, 0) ≠ 0 and P = P1P2. Let T(x̃, y) = res_z(f1, h). By the Weierstrass division theorem (Lecture 16 of [1]), there exist a polydisk ∆3 = ∆̃3 × ∆(0; δ′) ⊂ ∆2, an element Q(x̃, y) of ℂ[[x̃, y]], absolutely convergent in ∆3, and an element T∗(x̃, y) of ℂ[[x̃]][y], of degree in y at most k − 1, whose coefficients are absolutely convergent in ∆̃3, such that T = P1Q + T∗. By root continuity (Theorem (1,4) of [13]) and the analyticity (hence continuity) of the coefficients a_i(x̃) of P1, after refining ∆̃3 to a smaller polydisk about the origin, if necessary, we can further assume that for every fixed x̃ ∈ ∆̃3, each root of P1(x̃, y) is in ∆(0; δ′). We now complete our construction of ρ∗: we put ρ∗(x̃) = res_y(P1, T∗). We claim that the zero set of ρ∗ is contained in the zero set of ρ in ∆̃3. The proof is as follows. Let α̃ be an element of ∆̃3 and suppose that ρ∗(α̃) = 0. Then there exists β ∈ ℂ such that P1(α̃, β) = T∗(α̃, β) = 0. Since each root of P1(α̃, y) is in ∆(0; δ′), we have β ∈ ∆(0; δ′). Hence we can legally substitute (α̃, β) for (x̃, y) in the power series identity T = P1Q + T∗, from which we deduce T(α̃, β) = 0. The same substitution in the power series identity P = P1P2 yields P(α̃, β) = 0. Hence, with γ = θ(α̃, β), we have g(α̃, β, γ) = h(α̃, β, γ) = 0. Substitution of (α̃, β, γ) into f = f1f2 yields f(α̃, β, γ) = 0. Therefore, by Theorem 2.4 of [16], ρ(α̃) = 0. The claim is proved. Hence, by Theorem 5.1, there exist a polydisk ∆̃4 ⊂ ∆̃3 and an analytic function ρ′ in ∆̃4 such that ρ^m = ρ∗ρ′, for some m ≥ 1. Since ρ and ρ∗ have real power series representations, so does ρ′ (because the imaginary part of the power series expansion for ρ′ about the origin must be 0). Therefore, ρ′ is analytic in the box B̃4 = ∆̃4 ∩ ℝ^{r−2}. By Lemma A.3 of [14], since ρ is order-invariant in S′ by hypothesis, ρ∗ is order-invariant in S′ ∩ B̃4. But ρ∗(0̃) = 0, since f1, g and h all vanish at the origin. Hence, ρ∗(x̃) = 0 for all x̃ ∈ S′ ∩ B̃4. We conclude our proof in the following way. Let φ denote the continuous function on S′ whose graph is the section S of R over S′. We claim that for every fixed x̃ ∈ S′ ∩ B̃4, φ(x̃) is a root of P1(x̃, y) of multiplicity k, hence the unique root of P1(x̃, y). The claim is proved as follows. Now R is delineable on S′, by hypothesis, and the identity R = P1(P2P̄) is valid, where P̄ = res_z(f2, g). It is straightforward to show that P2P̄ is a polynomial in y (with analytic coefficients). Hence, by Lemma A.7 of [14], P1 is delineable on S′ ∩ B̃4. (The reader will note that we have used the notion of delineability from [14, 17], and have extended the notion to polynomials with analytic coefficients.) This proves our claim. We next claim that h(x̃, φ(x̃), θ(x̃, φ(x̃))) = 0 for all x̃ ∈ S′ ∩ B̃4. The claim is proved as follows. Let α̃ ∈ S′ ∩ B̃4. Then ρ∗(α̃) = 0, by the last sentence of the previous paragraph. Therefore, there exists β ∈ ∆(0; δ′) such that P1(α̃, β) = T∗(α̃, β) = 0. But φ(α̃) is the unique root of P1(α̃, y), as proved above. Hence β = φ(α̃). As in our proof of the claim that the zero set of ρ∗ is contained in the zero set of ρ, we deduce that (α̃, β, γ) is a common zero of f, g and h, where γ = θ(α̃, β). In particular, h(α̃, β, γ) = 0. This proves our claim. The proof that h vanishes throughout σ, near the origin, is complete.

6. EXAMPLES
Here we consider two examples to demonstrate how the results proved earlier can be used to construct CADs more efficiently. They are chosen so that different hypotheses of the theorems of Subsection 3.3 are not satisfied globally. Example computations are performed using Qepcad b [7], a system that performs quantifier elimination and formula simplification based on CAD, to construct CADs, and Maple's Gröbner basis facilities to compute multipolynomial resultants.
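For readers without Maple at hand, one way to obtain such implied constraints is through iterated bivariate resultants, of which the multipolynomial resultant is a factor (cf. [16]). The following hedged sketch (ours, not the authors' Maple session; the function name iterated_res is hypothetical) computes such iterated resultants for the first example below in SymPy:

```python
# Illustrative sketch: iterated resultants standing in for the
# multipolynomial resultants res_{y,x}(f, g, h) of Subsection 6.1.
from sympy import symbols, resultant, factor

x, y, a, b = symbols('x y a b')
f = x**3 - 3*x*y**2 + a*x + b      # Re p(x + iy)
g = 3*x**2 - y**2 + a              # Im p(x + iy) / y

def iterated_res(f, g, h, y, x):
    # Eliminate y first, then x; res_{y,x}(f, g, h) divides the result.
    return factor(resultant(resultant(f, g, y), resultant(f, h, y), x))

print(iterated_res(f, g, y, y, x))          # side condition h = y
print(iterated_res(f, g, x*y - 1, y, x))    # side condition h = xy - 1
```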
6.1 An example involving complex roots
Consider the following question: when does p(z) = z³ + az + b have a non-real root x + iy such that xy < 1? This can be expressed as (∃x)(∃y)[f = g = 0 ∧ y ≠ 0 ∧ xy − 1 < 0], where

f = Re(p(x + iy)) = x³ − 3xy² + ax + b,
g = Im(p(x + iy))/y = 3x² − y² + a.

Applying Qepcad b to this problem in this form produces the solution 27b² + 4a³ > 0 after 342 seconds of CPU time (on a 650 MHz Sun Blade 150). Using equational constraints as Collins originally proposed, Qepcad b returns the formula 27b² + 4a³ > 0 after less than 0.1 seconds, although we cannot be sure that the use of equational constraints is valid.³

³[18] gives criteria for determining that the full use of equational constraints as Collins proposed is valid in the 4-variable case. The criteria are not satisfied for this example.
If Theorem 3.2 can be applied to this problem, we would be able to construct a CAD with a projection factor set consisting of J3 = {res_y(f, g), res_{y,x}(f, g, y), res_{y,x}(f, g, xy − 1)} and J2 = Proj(J3). However, we need to be sure that the hypotheses of the theorems are satisfied. It would suffice to show that

1. neither res_{y,x}(f, g, y) nor res_{y,x}(f, g, xy − 1) is the zero polynomial,
2. there is no common zero of f, g, ∂f/∂x, ∂g/∂x in any stack we construct in 4-space, and
3. the leading coefficients of f and g are not simultaneously zero in any cell over which we lift.

Point 1 is checked with a simple calculation. Point 3 is clearly satisfied, since the leading coefficient of g is constant. Computing a Gröbner basis for f, g, ∂f/∂x, ∂g/∂x with an elimination order, we get 27b² + 4a³, so we cannot verify Point 2 globally. Therefore, we can proceed with the reduced projection as long as we assume 27b² + 4a³ ≠ 0. The 27b² + 4a³ = 0 case can then be treated as a separate (and hopefully simpler!) computation. We can use Qepcad b with the assumption 27b² + 4a³ ≠ 0 to do the quantifier elimination, and interactively remove all the projection factors it creates from the first projection step except for res_y(f, g), and add as projection factors res_{y,x}(f, g, y) and res_{y,x}(f, g, xy − 1). The computation yields 27b² + 4a³ > 0 in 0.11 seconds. The Maple computations required to do the verification above took less than 0.1 seconds. This leaves us with the 27b² + 4a³ = 0 case to consider. However, this is a constraint for the last projection, so as [17] points out, we can use equational constraints for the first and last projections. This computation, in which our limited use of equational constraints is valid, proceeds in less than 0.1 seconds and tells us that the formula is never satisfied when 27b² + 4a³ = 0.
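The Gröbner-basis check for Point 2 can be reproduced outside Maple. Here is a hedged sketch (ours, not the authors' session) in SymPy; the elimination is done with a lexicographic order placing x and y first:

```python
# Illustrative verification of Point 2: where can f, g, df/dx, dg/dx
# vanish simultaneously? Generators free of x and y answer this.
from sympy import symbols, groebner, diff

x, y, a, b = symbols('x y a b')
f = x**3 - 3*x*y**2 + a*x + b
g = 3*x**2 - y**2 + a

G = groebner([f, g, diff(f, x), diff(g, x)], x, y, a, b, order='lex')
# The text reports that elimination yields 27*b**2 + 4*a**3.
print([p for p in G.exprs if not p.has(x, y)])
```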
6.2 Hong's example
In [12], Hong considered the formula (∃x)[f = 0 ∧ g = 0 ∧ h ≤ 0], where

f = ux² + vx + 1,
g = vx³ + wx + u,
h = wx² + vx + u.

What made this example interesting for demonstrating the algorithm in that paper is that CAD-based quantifier elimination performs so poorly! Even using Collins' originally proposed equational constraints method (just hoping that it is valid for this example), Qepcad b runs for several minutes before aborting due to a system limitation on the number of primes that can be used in a modular algorithm. This resource limit is reached before any lifting over zero-dimensional cells (which require computations over the highest degree extensions) is even attempted. If Theorem 3.2 can be applied to this problem, we would be able to construct a CAD with a projection factor set consisting of J3 = {res_x(f, g), res_{x,w}(f, g, h)} and J2 = Proj(J3). However, we need to be sure that the hypotheses of Theorem 3.2 are satisfied. It would suffice to show that:

1. res_{x,w}(f, g, h) is not the zero polynomial,
2. there is no common zero of f, g, ∂f/∂x, ∂g/∂x in any stack we construct in 4-space, and
3. the leading coefficients of f and g are not simultaneously zero in any cell over which we lift.

The first condition is easily verified. For the second condition, we must verify that f, g, ∂f/∂x, ∂g/∂x have no common zeros. From these four polynomials, Maple computes a Gröbner basis with respect to the lexicographical order whose first element is u² + 4. Since this is never zero over the reals, there are no real common zeros. The final condition is problematic, since both leading coefficients are zero when u = v = 0. The easiest way to deal with this is to simply assume u ≠ 0 ∨ v ≠ 0 (so that in constructing a partial CAD we will never lift over a cell in which u = v = 0) and treat the u = v = 0 case as a separate problem — a trivial separate problem, since substituting zero for u and v yields 1 = 0 as a constraint in the formula. Having verified that, as long as we add the assumption u ≠ 0 ∨ v ≠ 0, the reduced projection described above suffices, we can use Qepcad b to perform CAD construction by manually removing polynomials introduced during its projection that are not part of the reduced projection described above. This takes approximately 8 seconds on a 650 MHz Sun Blade 150. The Maple computations that justified the reduced projection took less than 1 second. The resulting solution formula is quite large, presumably reflecting the fact that this is an artificial problem created, in part, to demonstrate that short, simple input formulas can swamp CAD-based quantifier elimination (see Figure 1). What this example demonstrates for us, however, is that taking advantage of bi-equational constraints in an input formula can make the difference between a CAD construction that is utterly infeasible and one that can be accomplished fairly quickly.

Figure 1: The solution set for Hong's example is of the form r = 0 ∧ K, where r is quadratic in w. Being without good tools for producing a 3D visualization of this set, we've provided plots of the regions in which there is exactly one solution for a given point (u, v) (on the left), and where there are exactly two (on the right). This provides some indication that the solution set is inherently complex, and that a short solution formula may not exist.
7. TOWARDS AN ALGORITHM
In order to incorporate the results and ideas from the previous sections into an algorithm, two things are needed: a plan for checking whether or not the hypotheses of the various theorems are satisfied, and a plan for dealing with situations in which the hypotheses are not satisfied globally. An in-depth look at either of these two problems is outside the scope of this paper; especially the second, as it is related to the larger problem of how to cope with the situation in which the Brown–McCallum projection fails because a projection factor is nullified. In this paper, our goal is to provide a reasonable strategy without worrying about finding the best plan. The two previous examples are intended to serve as a guide in this. The hypotheses of Theorems 3.1 and 3.2 will typically fail only when certain polynomials or sets of polynomials of level⁴ less than r are zero. We will make the non-vanishing of these polynomials an assumption in our CAD construction and thus, in the usual way with partial CADs, not lift over any cells in which they fail. Solving the original input problem for the case in which these assumptions fail is a separate problem, and one that is simpler in the sense of being more constrained. This process is demonstrated in our two examples.

⁴The level of a polynomial in the variables x1, . . . , x_r is the maximum i for which the polynomial's degree in x_i is positive.

7.1 An algorithm
We describe an algorithm BEQCCAD, which constructs a CAD for input with bi-equational constraints. The algorithm's input is a triple (f, g, H) representing the formula f = g = 0 ∧ H, where f, g and the polynomials appearing in H are all in ℝ[x1, . . . , x_r]. Assume that f and g are relatively prime, squarefree, r-level polynomials, let h1, . . . , h_s be the r- and (r − 1)-level factors of polynomials appearing in H, and let h_{s+1}, . . . , h_t be the lower-level factors of polynomials appearing in H. To simplify this presentation, let us suppose that res_{x_r}(f, g) does not vanish identically at any point in an (r − 1)-level cell over which we lift. (Since res_{x_r}(f, g) will be a product of polynomials in the projection factor set, this condition will be checked during CAD construction.) The CAD D computed by BEQCCAD represents the set defined by the input formula, except over regions in ℝ^{r−1} in which the hypotheses of Theorems 3.1 and 3.2 are not satisfied. We simply do not lift over cells in such regions. Figure 2 summarizes the steps performed by the algorithm BEQCCAD. The validity of this algorithm follows from Theorems 3.1 and 3.2. The CAD D represents the set defined by the input formula, except possibly over some lower-dimensional regions in which A is not satisfied. As both examples demonstrate, it can be constructed much more quickly than the CAD we would construct from the input formula without taking advantage of any equational constraints in the input. As the second example demonstrates, it can even be constructed more quickly than the CAD we get by simply assuming the validity of the method of equational constraints as originally formulated by Collins.

Input: f, g and H as described above
Output: a formula A in the variables x1, . . . , x_{r−1} and a CAD D representing f = g = 0 ∧ H ∧ A

1. set R := res_{x_r}(f, g)
2. let ρ_i := res_{x_r, x_{r−1}}(f, g, h_i) for all i ∈ 1, . . . , s; if any of the ρ_i are zero, exit returning FAIL
3. set E to the set of irreducible factors of res_{x_r, x_{r−1}, x_{r−2}}(f, g, ∂f/∂x_r, ∂g/∂x_r)
4. set C to the set of all pairs of factors of ldcf_{x_r}(f) and ldcf_{x_r}(g)
5. set A := ( ∧_{p∈E} p ≠ 0 ) ∧ ( ∧_{(p,q)∈C} (p ≠ 0 ∨ q ≠ 0) )
6. construct the CAD D in the following way:
   (a) construct a partial CAD of ℝ^{r−1} for the set {R, ρ1, . . . , ρ_s, h_{s+1}, . . . , h_t} under the assumption A
   (b) lift into r-space only over sections of R, treating only common roots of f and g as sections in these new stacks
   (c) evaluate the original formula at each sample point
7. return D, A

Figure 2: Algorithm BEQCCAD
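As a companion to Figure 2, the following hedged sketch (ours, not the authors' implementation; all function names are our own) shows the directly computable parts of BEQCCAD in SymPy: step 1, step 4, and the assumption formula of step 5. Steps 2 and 3 require multipolynomial resultants, which SymPy does not provide directly, so they are omitted here:

```python
# Illustrative sketch of steps 1, 4 and 5 of BEQCCAD.
from sympy import Poly, resultant, factor_list, And, Or, Ne

def beqccad_assumption(f, g, E_polys, xr):
    """E_polys: the irreducible factors from step 3, supplied by the caller."""
    R = resultant(f, g, xr)                                  # step 1
    lcf = [p for p, _ in factor_list(Poly(f, xr).LC())[1]]
    lcg = [q for q, _ in factor_list(Poly(g, xr).LC())[1]]
    C = [(p, q) for p in lcf for q in lcg]                   # step 4
    A = And(*[Ne(p, 0) for p in E_polys],                    # step 5
            *[Or(Ne(p, 0), Ne(q, 0)) for p, q in C])
    return R, A
```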
8. CONCLUSION
We have presented two theorems that allow us to efficiently construct CAD representations of sets defined by formulas with bi-equational constraints, i.e., of the form f = g = 0 ∧ H. The algorithm we have derived based on these theorems will typically construct a CAD representing the input with some assumptions. The situations in which these assumptions fail to hold need to be handled as separate problems, but they are more constrained problems and thus, in a sense, simpler. The given examples illustrate this. Extending these results to problems with more than two equational constraints is a natural and important direction for future research.
Acknowledgements The first author’s work was supported in part by NSF grant number CCR-0306440.
9. REFERENCES
[1] S. S. Abhyankar. Algebraic Geometry for Scientists and Engineers. American Math. Society, 1990.
[2] D. S. Arnon. Topologically reliable display of algebraic curves. In Proceedings of SIGGRAPH, pages 219–227, 1983.
[3] D. S. Arnon, G. E. Collins, and S. McCallum. Cylindrical algebraic decomposition I: The basic algorithm. SIAM Journal on Computing, 13(4):865–877, 1984.
[4] J. C. Beaumont, R. J. Bradford, J. H. Davenport, and N. Phisanbut. A poly-algorithmic approach to simplifying elementary functions. In Proc. International Symposium on Symbolic and Algebraic Computation, pages 27–34, 2004.
[5] C. W. Brown. Improved projection for cylindrical algebraic decomposition. Journal of Symbolic Computation, 32(5):447–465, November 2001.
[6] C. W. Brown. Simple CAD construction and its applications. Journal of Symbolic Computation, 31(5):521–547, May 2001.
[7] C. W. Brown. QEPCAD B: a program for computing with semi-algebraic sets using CADs. ACM SIGSAM Bulletin, 37(4):97–108, 2003.
[8] B. F. Caviness and J. R. Johnson, editors. Quantifier Elimination and Cylindrical Algebraic Decomposition. Texts and Monographs in Symbolic Computation. Springer-Verlag, 1998.
[9] G. E. Collins. Quantifier elimination for the elementary theory of real closed fields by cylindrical algebraic decomposition. In Lecture Notes in Computer Science, volume 33, pages 134–183. Springer-Verlag, Berlin, 1975. Reprinted in [8].
[10] G. E. Collins. Quantifier elimination by cylindrical algebraic decomposition: 20 years of progress. In B. Caviness and J. Johnson, editors, Quantifier Elimination and Cylindrical Algebraic Decomposition, Texts and Monographs in Symbolic Computation. Springer-Verlag, 1998.
[11] G. E. Collins and H. Hong. Partial cylindrical algebraic decomposition for quantifier elimination. Journal of Symbolic Computation, 12(3):299–328, September 1991.
[12] H. Hong. Quantifier elimination for formulas constrained by quadratic equations via slope resultants. Computer J., 36(5):440–449, 1993.
[13] M. Marden. Geometry of Polynomials, 2nd edition. American Math. Society, 1966.
[14] S. McCallum. An improved projection operation for cylindrical algebraic decomposition of three-dimensional space. Journal of Symbolic Computation, 5(1–2):141–161, 1988.
[15] S. McCallum. An improved projection operator for cylindrical algebraic decomposition. In B. Caviness and J. Johnson, editors, Quantifier Elimination and Cylindrical Algebraic Decomposition, Texts and Monographs in Symbolic Computation. Springer-Verlag, Vienna, 1998.
[16] S. McCallum. Factors of iterated resultants and discriminants. Journal of Symbolic Computation, 27:367–385, 1999.
[17] S. McCallum. On projection in CAD-based quantifier elimination with equational constraint. In S. Dooley, editor, Proc. International Symposium on Symbolic and Algebraic Computation, pages 145–149, 1999.
[18] S. McCallum. On propagation of equational constraints in CAD-based quantifier elimination. In B. Mourrain, editor, Proc. International Symposium on Symbolic and Algebraic Computation, pages 223–230, 2001.
[19] H. Whitney. Complex Analytic Varieties. Addison-Wesley, 1972.
Hybrid Symbolic-Numeric Integration in Multiple Dimensions via Tensor-Product Series
Orlando A. Carvajal, Frederick W. Chapman, Keith O. Geddes∗
Symbolic Computation Group, School of Computer Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
ABSTRACT
Categories and Subject Descriptors
We present a new hybrid symbolic-numeric method for the fast and accurate evaluation of definite integrals in multiple dimensions. This method is well-suited for two classes of problems: (1) analytic integrands over general regions in two dimensions, and (2) families of analytic integrands with special algebraic structure over hyperrectangular regions in higher dimensions. The algebraic theory of multivariate interpolation via natural tensor product series was developed in the doctoral thesis by Chapman, who named this broad new scheme of bilinear series expansions ”Geddes series” in honour of his thesis supervisor. This paper describes an efficient adaptive algorithm for generating bilinear series of Geddes-Newton type and explores applications of this algorithm to multiple integration. We will present test results demonstrating that our new adaptive integration algorithm is effective both in high dimensions and with high accuracy. For example, our Maple implementation of the algorithm has successfully computed nontrivial integrals with hundreds of dimensions to 10-digit accuracy, each in under 3 minutes on a desktop computer. Current numerical multiple integration methods either become very slow or yield only low accuracy in high dimensions, due to the necessity to sample the integrand at a very large number of points. Our approach overcomes this difficulty by using a Geddes-Newton series with a modest number of terms to construct an accurate tensor-product approximation of the integrand. The partial separation of variables achieved in this way reduces the original integral to a manageable bilinear combination of integrals of essentially half the original dimension. We continue halving the dimensions recursively until obtaining one-dimensional integrals, which are then computed by standard numeric or symbolic techniques.
I.1.2 [Symbolic and Algebraic Manipulation]: Algebraic algorithms; G.4 [Mathematical Software]: Algorithm design and analysis
General Terms Algorithms
Keywords multiple integration, symbolic-numeric algorithms, approximation of functions, tensor products, bilinear series, splitting operator, Geddes-Newton series expansions, Geddes series scheme, deconstruction/approximation/reconstruction technique (DART)
1.
INTRODUCTION
The problem of approximating definite integrals is commonly known as quadrature (for single or multiple integrals) or cubature (for multiple integrals). Any iterated definite integral with variable limits of integration can be reduced to the following standard form via simple linear changes of variables:

I(f) = ∫₀¹ · · · ∫₀¹ f(x1, x2, . . . , x_d) dx_d · · · dx2 dx1.

The region [0, 1]^d is known as the unit hypercube of dimension d or the unit d-cube. Multiple integration problems arise in various application areas, including atomic physics, quantum chemistry, statistical mechanics, and Bayesian statistics. The numerical evaluation of multiple integrals is computationally difficult, especially for larger dimensions, due to the size of the region over which the integrand must be sampled. Various methods have been developed for this problem, but no single method is found to be best for all cases. We propose a new method which we believe is superior to other methods for certain important classes of multiple integrals. We shall present an algorithm to approximate a multivariate function via natural tensor product series. There are two main advantages of this approximation. The first is that the approximation is represented using basis functions with only half as many variables as the original function. The second advantage is that because these basis functions belong to the same family as the original function (e.g., polynomial, rational, trigonometric, exponential, or various special functions), the interpolation series typically needs only a modest
∗This work was supported in part by NSERC of Canada Grant No. RGPIN8967-01 and in part by the MITACS NCE of Canada.
number of terms. We thus obtain a method to approximate a multiple integral in terms of a combination of integrals of half the original dimension. In contrast, other methods rely on subdividing the region to obtain the required accuracy. Our method exploits the possibility of symmetrizing the integrand. In two dimensions this symmetrization is always possible, while in higher dimensions it is non-trivial. Nonetheless, we are able to obtain the symmetry we require in higher dimensions for several important classes of integrands. Our method is able to compute multiple integrals efficiently and to high accuracy, in some cases where other methods require too much computation time.

2. EXISTING METHODS
In one dimension, there are various well-known methods for numerical integration. Among the most common methods are Clenshaw-Curtis, Newton-Cotes and Gaussian quadratures. Yet, in one dimension we find that some methods are better suited for certain families of problems than others; for example, some methods are better at handling singularities. In the Maple computer algebra system, a hybrid symbolic-numeric polyalgorithm is applied to compute the numerical value of a definite integral (see [5, 6]). In that approach, various symbolic-mode analytical techniques are applied, if necessary, to deal with infinite regions of integration and to deal with integrand singularities, including derivative singularities which may slow the convergence of numerical quadrature methods. These analytical techniques include change of variables to transform an improper integral into a proper integral, or to eliminate singularities of the integrand, as well as approximating the integrand by a generalized series expansion near a non-removable singularity. At its base, the polyalgorithm makes a (hopefully intelligent) choice among several quadrature methods to compute the numerical result for the (possibly transformed) integration problem. Computing the numerical value of an integral in multiple dimensions is much harder than in one dimension. In multiple dimensions, the geometry of the region over which the integration takes place can be very complex. Even for very regular geometries, the number of points at which the integrand must be sampled grows exponentially with the dimension, assuming a numerical method which is a multidimensional generalization of one-dimensional quadrature formulas. This exponential growth is sometimes called the curse of dimensionality for definite integration. We now briefly describe various existing techniques for numerical multiple integration.

2.1 Quadrature Formulas
The most basic technique to approximate a definite integral uses a quadrature formula that computes the value of the integrand at various sample points in the region of integration. Product formulas extend the quadrature formulas used for one-dimensional integrals to higher dimensions. However, the number of sample points grows exponentially with the dimension. For some specific cases such as polynomial and trigonometric integrands, there are non-product formulas with the property that the number of sample points grows less than exponentially. Stroud [13] provides a comprehensive list of both product and non-product formulas for a wide variety of regions.

2.2 Globally Adaptive Methods
Globally adaptive methods iteratively divide the integration region into subregions until the desired accuracy is achieved. At each iteration, product formulas are used to estimate the value of the integral over the subregions. If the estimated error is not small enough, the subregion with the highest error is subdivided. The most popular implementations of these methods are ADAPT [7] and DCUHRE [1], the latter being an evolution of the former.

2.3 Monte Carlo Methods
The basic idea behind Monte Carlo methods is very simple: if we evaluate the integrand at n uniformly distributed random points in the region of integration, we can approximate the integral by multiplying the arithmetic mean of these n function values by the volume of the region [8]. The convergence rate of O(n^{−1/2}) for such a method is at the same time its great advantage and sad drawback. Because the convergence is independent of the dimension, the method works better with integrals in higher dimensions than deterministic methods. At the same time, this convergence is very slow, and therefore the accuracy which can be achieved within a reasonable amount of computation time is very low.

2.4 Quasi Monte Carlo Methods
Quasi Monte Carlo methods have received significant attention during the past decade [10, 11]. These methods aim to improve the convergence rate of the Monte Carlo method by using quasi-random numbers with specific distributions. A very comprehensive study of a general class of quasi Monte Carlo methods known as lattice rules is given by Sloan [12]. In general, lattice rules are a good choice only for smooth one-periodic integrands over a hypercube.

2.5 Dimensionality Reducing Expansions
He [9] presents analytical methods which reduce the dimension of the integral by one. This reduction is achieved by replacing the original region of integration with its boundary. The resulting dimensionality reducing expansions for multiple integration are derived using a multidimensional version of integration by parts, and thus require partial derivatives of the original integrand. This method is mostly used to create boundary type quadrature formulas (quadrature formulas which sample the integrand only on the boundary of the region) and in the evaluation of oscillatory integrals. The tensor product methods presented in this paper have a significant advantage over dimensionality reducing expansions: tensor product methods reduce the dimension geometrically by a factor of two, rather than arithmetically by only one dimension.
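To fix ideas, here is a minimal sketch (ours) of the plain Monte Carlo estimate described in Subsection 2.3: average n samples of f over the unit d-cube, whose volume is 1:

```python
# Illustrative Monte Carlo cubature over [0,1]^d, with O(n^-1/2) error.
import random

def monte_carlo(f, d, n=100_000):
    total = 0.0
    for _ in range(n):
        x = [random.random() for _ in range(d)]
        total += f(x)
    return total / n   # volume of the unit d-cube is 1

# Example: a 5-dimensional integrand.
print(monte_carlo(lambda x: sum(x) ** 2, d=5))
```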
3. TENSOR PRODUCT SERIES
We start by giving the definition of a tensor product and of the nonlinear splitting operator defined by Chapman [3] for generating natural tensor product series expansions.

3.1 Tensor Products
A tensor product is a finite sum of terms, where each term is a product of univariate functions:

s_n(x, y) = Σ_{i=1}^{n} g_i(x) h_i(y).

In mathematics, we find many bivariate functions and families of functions that can be expressed as tensor products:

e^{x+y} = e^x e^y,
cos(x + y) = cos(x) cos(y) − sin(x) sin(y),
(x + y)² = x² + 2xy + y².

The minimum number of terms among all the equivalent representations of the tensor product is called the rank of the tensor product. A tensor product is natural when the factors of each term can be derived from the original function by a finite linear combination of linear functionals.

3.2 The Splitting Operator
Given a bivariate function f and a point (a, b) where f(a, b) ≠ 0, the splitting operator Υ_{(a,b)} at the point (a, b) is defined by

Υ_{(a,b)} f(x, y) = f(x, b) · f(a, y) / f(a, b).

The point (a, b) will be called a splitting point. Note that Υ_{(a,b)} splits f into a rank-one tensor product. Two important properties of Υ_{(a,b)} f(x, y) are:

• Υ_{(a,b)} f(x, y) interpolates f(x, y) on the lines x = a and y = b. Thus Υ_{(a,b)} f(a, y) ≡ f(a, y) and Υ_{(a,b)} f(x, b) ≡ f(x, b).
• If there is a value x such that f(x, y) ≡ 0, it follows that Υ_{(a,b)} f(x, y) ≡ 0 as well. Likewise for a y such that f(x, y) ≡ 0.

The combination of these two properties is what allows us to generate a natural tensor product series approximation. This series is generated simply by iterating the splitting operator while varying the choice of splitting point.

3.3 Geddes-Newton Series Expansions
Not every bivariate function f can be expressed as a tensor product of finite rank; however, if an approximation is desired within a compact rectangle R = [a, b] × [c, d] where f is continuous, we can uniformly approximate f by a natural tensor product s_n of rank n. We define the n-th remainder to be r_n = f − s_n and call the function |r_n| the absolute error for the approximation s_n of f on R. The following simplified algorithm generates a natural tensor-product series expansion s_n of rank n of a given function f on R. (Here the uniform norm ‖·‖∞ is with respect to the region R.)

1. Define the initial remainder by r_0 := f and initialize the counter i := 1.
2. While the uniform error satisfies ‖r_{i−1}‖∞ > δ, iterate the following two steps:
   (a) Choose a splitting point (a_i, b_i) ∈ R such that |r_{i−1}(a_i, b_i)| = ‖r_{i−1}‖∞.
   (b) Let r_i := r_{i−1} − Υ_{(a_i,b_i)} r_{i−1} and i := i + 1.
3. After exiting the loop, let n := i − 1 and s_n := f − r_n. Return s_n as the desired series approximation of f with uniform error ‖r_n‖∞ ≤ δ over R.

Refining the terminology introduced by Chapman [3], we call the resulting series s_n the Geddes-Newton series expansion¹ of f to n terms with respect to the splitting points {(a_i, b_i)}_{i=1}^{n}. In summary, the above algorithm applies the splitting operator at the i-th splitting point (a_i, b_i) to the previous remainder r_{i−1} to generate the formula Υ_{(a_i,b_i)} r_{i−1} for the i-th term of the Geddes-Newton series s_n.

¹Geddes-Newton series expansions are merely one kind of bilinear series in the general Geddes series scheme. The scheme includes numerous other classes as well, such as Geddes-Taylor series, Geddes-Fourier series, Geddes-Chebyshev series, and many kinds of Geddes-wavelet series.

Example 1
The following function is defined on the unit square [0, 1]²:

f(x, y) = e^{x²y²} cos(x + y).

The Geddes-Newton series to three terms is as follows, where the splitting points are (a, a) for a = 1, 0, 0.616336:

s_3(x, y) = Σ_{i=1}^{3} c_i g_i(x) h_i(y),

where

c_1 = −0.884017, g_1(x) = e^{x²} cos(x + 1), h_1(y) = e^{y²} cos(y + 1);
c_2 = 0.794868, g_2(x) = cos(x) + 0.477636 e^{x²} cos(x + 1), h_2(y) = cos(y) + 0.477636 e^{y²} cos(y + 1);
c_3 = −9.83284, g_3(x) = e^{0.379870 x²} cos(x + 0.616336) − 0.356576 e^{x²} cos(x + 1) − 0.623342 cos(x),
h_3(y) = e^{0.379870 y²} cos(y + 0.616336) − 0.356576 e^{y²} cos(y + 1) − 0.623342 cos(y).

Figure 1: Lines of Interpolation in Example 1
3. After exiting the loop, let n := i − 1 and sn := f − rn . Return sn as the desired series approximation of f with uniform error rn ∞ ≤ δ over R.
86
unit square, it suffices to show that the n-th mean-value constant Mn := (∂/∂x)n (∂/∂y)n rn ∞ has an asymptotic growth rate satisfying Mn = o(n!2 ) as n → ∞. Note that even a rapid growth rate like Mn = O(nn ) still ensures uniform convergence. A rigorous upper bound on Mn can be obtained from upper bounds on the higher-order partial derivatives of f on [0, 1]2 and a positive lower bound on the smallest singular value of the nonsingular n × n matrix [f (ai , bj )]n i,j=1 . The elements of this matrix arise naturally as the values of f at the n2 points where the interpolatory grid lines x = ai and y = bj intersect. The proof of these results is beyond the scope of this paper and will appear separately.
1
0.8
0.6
0.4
0.2
0 0
0.2
0.4
0.6
0.8
1
4. Figure 2: Lines of Interpolation: 12 splitting points
INTEGRATION IN TWO DIMENSIONS
Now that we have a Geddes-Newton series to approximate a continuous function f within a compact rectangle R = [a, b] × [c, d], we can use it to calculate the definite integral:
As illustrated in Figure 1, the splitting points in Example 1 are all located on the diagonal y = x of the unit square. However, due to the interpolating properties of the splitting operator (see Section 3.2), the approximation s3 (x, y) agrees exactly with the given function f (x, y) at all points on the horizontal and vertical lines passing through the three splitting points. With this strong interpolating property, we find that even with only three splitting points used, the maximum error over the unit square for the approximation in Example 1 is f − s3 ∞ ≈ 3.0 × 10−3 . Using the strategies presented in the following sections for generating a Geddes-Newton series expansion, if a function is symmetric (i.e., f (x, y) ≡ f (y, x)) as in Example 1, the splitting points which are chosen in the unit square usually all lie on the diagonal. Moreover, we will always be able to symmetrize any bivariate function. If we request a series accurate to 15 digits for the function f (x, y) defined in Example 1, we find that only 12 terms are required (i.e., 12 splitting points) to yield an approximation satisfying f − s12 ∞ < 5.0 × 10−15 . The distribution of the 12 splitting points for this particular function is illustrated in Figure 2. The issue of convergence of a Geddes-Newton series expansion to the original function has similarities with the case of univariate polynomial interpolation via the Newton interpolation series. In both cases, the addition of a new term in the series corresponds to the addition of a new interpolation point. However, in the case of Geddes-Newton series the interpolation property holds not only at the new point but at all points on two lines passing through that point, as illustrated in Figure 2. Intuitively, for any function f which is continuous in the unit square U = [0, 1]2 , if the splitting points are dense along the diagonal then it is reasonable to expect that sn converges uniformly to f on U as n → ∞. This expectation is based on the fact that the series sn interpolates the function f on the lines x = ai and y = bi of the two-dimensional grid generated by the splitting points. The remainder rn therefore vanishes on the boundary of each cell in the resulting n × n grid. By choosing well distributed splitting points we can cover the whole region U with small grid cells, and thus make rn as close to zero as we wish inside these grid cells by the uniform continuity of rn . In order to prove that the Geddes-Newton series expansion converges uniformly to the original function f on the
I(f ) =
f (x, y) dx dy R
I(f ) ≈ In =
sn (x, y) dx dy . R
The main goal achieved when we approximate f (x, y) by sn (x, y) is a reduction in the dimension of the integrals. Specifically, by separating the variables x and y, we replace the calculation of one integral in two dimensions by 2n integrals each in one dimension: b
d
In =
sn (x, y) dy dx = a
c
n i=1
b
d
gi (x) dx a
hi (y) dy .
c
The fact that we can compute one-dimensional integrals relatively efficiently, combined with the power of this particular interpolation scheme which results in n being of modest size, makes this technique very effective. The basic idea of the integration algorithm is now clear, at least for two dimensions. Assuming that the double integral has been transformed to be over the unit square [0, 1]2 , the following conditions will be assumed to hold: • f is continuous in the closed region [0, 1]2 , which implies that the maximum norm of the integrand in [0, 1]2 is finite: f ∞ < ∞ . • The one-dimensional integrals 1
1
f (x, a) dx 0
and
f (a, y) dy 0
can be computed (numerically or symbolically) for any value a ∈ [0, 1]. A simplistic implementation of the algorithm can lead to unnecessarily large running times. We shall explain how a smart implementation of the algorithm dramatically improves its efficiency.
4.1
Preparation of the Integral
First, let us show how to meet two requirements for the initial integral: the integrand must be symmetric and the integration limits must be constants.
87
an expensive two-dimensional sampling of the remainder rn over the whole unit square. This becomes even more expensive after each iteration because the complexity of the remainder rn grows quadratically in the number of iterations n. What can we do about this difficulty? We use the following observation: after some (typically small) number of iterations, convergence of the interpolation series becomes essentially monotonic. Once this point is reached, the norm of the remainder is almost always attained on the diagonal line y = x. This property is a consequence of the symmetry of the integrand. Therefore, after a certain number of iterations it is not necessary to sample on the whole unit square to estimate the norm. The important role of the diagonal leads us to break the approximation process into two phases named the confinement phase and the convergence phase.
We can apply a simple change of variables to convert any integral with non-constant limits to a new one with constant limits. (This can be extended to integrals in any number of dimensions). The change of variables x = a · (1 − s) + b · s y = c(x(s)) · (1 − t) + d(x(s)) · t transforms the integral as follows: b
1
d(x)
1
f (x, y) dy dx = a
0
c(x)
fˆ(s, t) dt ds .
0
Thus, we can limit our attention to integrals over the unit square. We say that a function is symmetric if f (x, y) ≡ f (y, x) and anti-symmetric if f (x, y) ≡ −f (y, x). We can express any bivariate function as a sum f = fS + fA of a symmetric part fS and an anti-symmetric part fA given by
The Confinement Phase
f (x, y) + f (y, x) f (x, y) − f (y, x) ; fA(x, y) = . 2 2 Additionally, we have that the integral of fA over a symmetric region such as [0, 1]2 is always 0. This conveniently gives us
Our objective in this phase is to confine the location (ai , bi ) of the maximum error ri−1 ∞ to the diagonal y = x. After many experiments and a few prototypes, we arrived at the following conclusions:
fS (x, y) =
[0,1]2
f (x, y) dy dx =
[0,1]2
• We should select splitting points (ai , bi ) on the diagonal y = x, unless the function becomes numerically zero on the diagonal. Only in this case do we select an off-diagonal splitting point.
fS (x, y) dy dx .
Example 2
• To preserve the symmetry of the remainder, off-diagonal splitting points must be chosen in symmetric pairs: selecting the point (a, b) as a splitting point implies that the next splitting point must be (b, a). Since a = b and ri (a, a) = ri (b, b) = 0, the sum of the two new terms is symmetric.
Consider the double integral 1 0
1−x
2 2
ex
y
cos(x + y) dy dx .
0
Applying the change of variables to transform into an integral over the unit square, and then applying symmetrization, yields the new problem: 1 0
1 0
• The criterion for deciding when the first phase is over will be based on the norm of the integrand. Achieving a uniform error which is 1/100th of the initial norm has proven to be a good threshold for switching to the convergence phase (i.e., ri ∞ ≤ f ∞ /100). The number of splitting points required to obtain such a threshold depends on the qualitative behaviour of f ; oscillatory functions take longer to complete phase one.
F (s, t) dt ds , where F (s, t) = fˆS (s, t) =
2 2 2 2 2 2 1 cos(s + t − st) es (1−s) t (1 − s) + es (1−t) t (1 − t) . 2 The Geddes-Newton series for the new integrand F (s, t) was computed to three terms based on the splitting points (a, a) for a = 0, 1, 0.434450 . Then the series was integrated (applying one-dimensional quadratures) and this yielded the following estimate for the original double integral: 0.385433 . This result agrees with the correct value of the integral to 5 significant digits, which is excellent for a three-term approximation. By using more than three terms in the series approximation, more accuracy can be obtained.
4.2
• To avoid the quadratic growth of the remainder as a symbolic expression, we convert the original function to a discrete representation. With the sample values of the function stored in a matrix, the operations to calculate a new term and update the remainder become simple linear algebra. Regarding the discretization mentioned in the latter point, for the confinement phase we choose an initial grid of 25×25 sample points. Each time we choose a splitting point (from this grid), we reduce the number of sample points that are used to estimate ri ∞ , so we must monitor the process and possibly increase the size of the grid. This depends on the qualitative behaviour of f . The criterion we have adopted is that if the number of non-zero rows/columns falls below 15 while the f ∞ /100 threshold has not been reached, the grid is expanded by a factor of two in each dimension. The result of the confinement phase is a list of splitting points (in appropriate order) that were used to generate an approximation satisfying the above-specified threshold. The matrix used in the discretization is now discarded.
A Two-Phase Algorithm
We now describe an implementation of the integration algorithm. Some of the implementation details have been developed based on empirical evidence from experimentation. One important characteristic of our implementation is the division of the approximation process into two phases. Let us explain the primary reason for doing so. The algorithm for generating the series approximation is quite simple as presented in Section 3.3. The only step that can be expensive is finding the splitting point (ai , bi ) where the absolute error |ri−1 | attains its maximum norm ri−1 ∞ . The estimation of the norm in two dimensions would require
88
4.3
Symmetry Considerations
Symmetry of the remainder throughout the process is necessary to achieve optimal performance. We have defined a criterion for the selection of splitting points which preserves symmetry. There are even more benefits of this symmetry. The series resulting from our algorithm has the following algebraic form:
0.018 0.016 0.014 0.012 0.01 0.008 0.006
sn (x, y) =
0.004
i=1
0.002 0
n
0.2
0.4
0.6
0.8
ci
i j=1
ki,j f (x, bj )
i
li,j f (aj , y)
j=1
where ci , ki,j , li,j = 0 are real-valued coefficients, and (ai , bi ) is the splitting point used to generate the tensor product term in the i-th iteration. Although ai = bi does not always n hold for specific i, note that {ai }n i=1 = {bi }i=1 . We can represent the series using matrices and vectors as
1
Figure 3: A Typical Error Curve on the Diagonal
sn (x, y) = VT (x) · LT · P · D · L · V(y) where
The Convergence Phase
• V(x) is the column vector of dimension n whose elements are the univariate functions f (x, ai ).
During the convergence phase, the remainder typically exhibits the following behaviour. The remainder vanishes on the boundary of each grid cell (by the interpolation theory) and has constant sign inside each grid cell. These signs alternate in adjacent grid cells resulting in a checkerboard pattern. The maximum error over the whole region generally occurs in one of the grid cells located along the diagonal. The absolute error |ri (x, x)| on the diagonal oscillates as illustrated in Figure 3, and a reasonable estimate for the point of maximum error is obtained by sampling at the midpoint between adjacent splitting points along the diagonal. Based on the aforementioned properties, at each iteration in the convergence phase a new splitting point is chosen as follows: sample the remainder at each midpoint between adjacent splitting points along the diagonal, and select the point where the absolute error is largest. A minor addition is that if either or both of the points (0, 0) and (1, 1) were not chosen as splitting points during the confinement phase then they are added as candidates for the next splitting point. As discussed in Section 3.3, even though the splitting points are on the diagonal, the interpolation property holds on the vertical and horizontal lines through those points. Hence, the set of points where the remainder is zero will be dense in the unit square as long as the splitting points are dense on the diagonal. Furthermore, if the sequence of remainders {rn }∞ n=1 converges uniformly, the uniform limit will be continuous (since each sn and rn inherit the continuity of f ). The uniform limit must therefore vanish identically on the unit square (since it vanishes on a dense subset). In conclusion, if sn converges uniformly to something on [0, 1]2 as n → ∞, then this uniform limit must be the original function f. This is a direct consequence of the continuity of f, the density of the splitting points on the diagonal, and the interpolation properties of each sn . Experimentation has supported the effectiveness of this approach. The reduction in running times achieved by restricting the search to a one-dimensional region (i.e., the diagonal) is significant. At iteration n only O(n) evaluations of the remainder are computed in order to determine the next splitting point. The convergence phase ends when the estimated norm on the diagonal is less than the requested accuracy. At this point we proceed to integrate the series.
• D is an n×n diagonal matrix whose diagonal elements correspond to the coefficients ci = 1/ri−1 (ai , bi ). • P is an n × n permutation matrix that allows the coefficients ki,j to be obtained from the coefficients li,j via [ki,j ] = P·[li,j ]. The matrix P is symmetric and blockdiagonal. Each on-diagonal splitting point (a, a) generates a diagonal block of the form [1], and each pair of off-diagonal splitting points (a,b) and (b, a) generates 0 1 a diagonal block of the form . If there are 1 0 no off-diagonal splitting points, then P is the identity matrix. • L =[li,j ] is an n × n unit lower triangular matrix. This representation reduces the cost of handling what would otherwise become extremely complex expressions for rn and sn . The direct benefits are: • The cost of evaluating sn and rn is reduced from O(n2 ) to O(n) evaluations of the original function f . • The factors can be grouped to use only matrix-vector multiplications and one inner product, making the computation very efficient. • We only need to perform n one-dimensional integrations of cross-sections of the original function: f (x, ai ) for i = 1, 2, . . . , n. In the end, nearly all the processing time is spent evaluating the integrand, which is what we would hope for, and cannot avoid.
5.
INTEGRATION IN HIGH DIMENSIONS
For integrals in more than two dimensions we again wish to generate an approximation of the integrand by a tensor product series. The number of variables in the new functions to be integrated will thereby be cut in half, and applying the concept recursively will reduce the problem to some number of one-dimensional integration problems.
89
therefore the range of both s and t will be [0, 1]. Otherwise, c1 = c2 and only one of them will have
range [0, 1], while
A major issue is how to guarantee the symmetry which is central to the method. As we have seen in previous sections, it is the symmetry of the integrand which allows the computation to be efficient. Carvajal’s master’s thesis [2] presents a novel approach which allows us to exploit our twodimensional approximation techniques in high-dimensional integration problems. Due to space limitations, we can only outline the main concepts of the new method here; further details are presented in the thesis.
5.1
the other will have range 0, min(cc1 ,c2 ) ⊂ [0, 1]. Step 2. We now compute an approximation sn (s, t) of the symmetric bivariate function v(s, t) in [0, 1]2 with our Geddes-Newton series approximation algorithm. Step 3. Next, we produce the reconstruction of f from v. Take the Geddes-Newton series expansion sn (s, t) of v(s, t) in [0, 1]2 and substitute for s and t using their defining equations above. This yields a multivariate series expansion Sn (x1 , . . . , xk ; xk+1 , . . . , xd) of f (x1 , . . . , xk ; xk+1 , . . . , xd) in the original d variables. This series approximation will be valid over the entire unit d-cube [0, 1]d . Step 4. Since we can separate the variables s and t in the Geddes-Newton series sn , we can separate the variables x1 , . . . , xk from xk+1 , . . . , xd in the multivariate series Sn . This reduces the d-dimensional integrals of the terms in Sn to integrals of dimension d/2. Finally, we apply the same technique recursively to each integral of dimension d/2 until we have only one-dimensional integration subproblems.
Description of DART
Our new method does not claim to handle all possible integrands in multiple dimensions. Rather, it is an approach that proves to be very effective for many common integrands, as described in Carvajal [2]. We call this new method of multivariate approximation and integration the deconstruction/approximation/reconstruction technique (DART). This method exploits the fact that high-dimensional integrals arising in applications frequently fit certain patterns. Multivariate functions constructed from a sum or a product of univariate functions are fairly common. We will see that we do not even need to have the original function be symmetric. The integration method has four steps: (1) Find a change of variables which converts an integrand f (x1 , x2 , . . . , xd ) in d > 2 variables into a symmetric bivariate function v(s, t). (2) Generate a Geddes-Newton series approximation sn (s, t) of v(s, t). (3) Substitute for s and t to transform sn (s, t) into a tensor product series Sn (x1 , x2 , . . . , xd ) in the d original variables. (4) Separate these d variables into two distinct groups and evaluate the resulting integrals in d/2 and d/2 variables by applying the method recursively. This recursion reduces the original d-dimensional integration problem to a collection of independent one-dimensional integrals. The following example illustrates the steps of the method.
The special form of integrand appearing in Example 3 has more general applicability than might be expected. Univariate functions of a linear combination of several variables, as in Example 3, are known as ridge functions. It has been proved that every continuous function on a compact subset of Euclidean space can be uniformly approximated by a finite sum of ridge functions [4]. Therefore, this special class of integrands can potentially be used to develop a very general method for multiple integration that would combine approximation by ridge functions and approximation by Geddes-Newton series expansions.
5.2
Example 3 (Ridge Functions)
f (x1 , x2 , . . . , xd) = u(g1 (x1 ) & g2 (x2 ) & · · · & gd (xd)),
Let us assume that a function f (x1 , x2 , . . . , xd ) in d = 2 k variables can be rewritten as
where & always denotes the same operator, either + or ×. Carvajal [2] gives a detailed presentation of the method for integrands fitting this pattern, for various template functions gi . He also discusses how to achieve an efficient implementation of the method.
f (x1 , x2 , . . . , xd ) = u(a1 x1 + a2 x2 + · · · + ad xd ) with ai > 0. We wish to calculate the integral of f over the unit d-cube [0, 1]d; thus, xi ∈ [0, 1]. Step 1. We start with a deconstruction of the original function into a symmetric bivariate function by making a change of variables. We can split the sum into two groups, each containing k variables, and let a1 x1 + a2 x2 + · · · + ak xk , s= c ak+1 xk+1 + ak+2 xk+2 + · · · + a2k x2k , t= c where c1 =
k i=1
ai ,
c2 =
2k
ai ,
Other Integrand Patterns
DART can be applied to integrands having the following general form. Suppose that the integrand f can be expressed in terms of a univariate function u as
6.
COMPUTATIONAL RESULTS
Our integration algorithm was implemented in Maple 9. The linear algebra operations use the NAG/BLAS routines, which can be executed in both hardware floating point and in arbitrary precision software floating point in Maple. The tests were run on a computer with an Intel Pentium 4 processor at 2.02 GHz, with 0.98 GB of RAM, and running Microsoft Windows XP with Service Pack 1. We first present some results for integration in two dimensions. For this case, we consider the performance of the algorithm as the requested accuracy tolerance is decreased (i.e., the requested number of digits of accuracy is increased). Table 1 shows the results for the following double integral:
c = max (c1, c2 ) .
i=k+1
We obtain f (x1 , x2 , . . . , xd ) = u(a1 x1 + a2 x2 + · · · + a2k x2k ) = u(c · (s + t)) = v(s, t).
1 0
Note that the resulting function v(s, t) is symmetric. Since all xi ∈ [0, 1], in the best case we would have c = c1 = c2 and
1 0
sin(8π x(1 − x) y (1 − y)(x − y)2 ) dy dx .
Maple’s current numerical integration methods are not able
90
Tol 5 e-10 5 e-15 5 e-20
Time 0.3 3.7 5.3
5 e-25
7.2
5 e-30
14.2
Result 0.069551393139 0.06955139313890799 0.06955139313890799 01727 0.06955139313890799 0172712980 0.06955139313890799 017271297964487
Pts 15 21 26
RelErr 2.7 e-13 1.9 e-19 9.9 e-24
33
4.5 e-31
38
2.1 e-35
each integrand, Dim specifies the dimension d for that integrand. The dimension was chosen based on the criterion that for the given integrand, dimension 2 d would lead to a computation time of more than a few minutes. Time, Result and RelErr have the same meaning as in Table 1. The results of Table 2 show that our new method can accurately compute some integrals in very high dimensions, and can do so quite rapidly. In contrast, a state-of-theart method such as DCUHRE is successful only at modest dimensions: at most d = 15, and usually much less.
Table 1: A 2-D Integral at Varying Precisions Fcn F5 F6 F7 F16 F17 F18 F19 F22
Dim 32 16 512 256 64 16 128 32
Time 49.7 21.3 37.9 165.2 147.0 30.4 99.5 92.6
Result 3.1000000000 e01 -1.9250000000 e02 -1.8045810938 e-11 1.0000000000 e00 3.4940406596 e00 5.9973550251 e-02 4.9999927298 e-01 -5.6249710526 e01
7.
RelErr 1.9 e-13 1.6 e-13 3.1 e-10 2.0 e-14 2.8 e-12 1.5 e-11 6.4 e-13 9.8 e-12
Table 2: Integrals in High Dimensions to compute this integral beyond hardware floating point precision. In order to compute an accurate reference value, we used a specialized method based on a Taylor series expansion of sin(x) to obtain a result to 34 digits of accuracy. In Table 1, Tol is the requested accuracy tolerance, Time is the CPU time (seconds) used by our method, Result is the numerical value computed, Pts is the number of splitting points, and RelErr is the actual relative error in the computed result. The latter value is based on a comparison with the accurate reference value. Note that we display Result with a number of digits corresponding to Tol, but the computation uses 4 guard digits so the result may have additional accuracy. Indeed, there is room for the stopping criterion to be fine-tuned since RelErr is significantly smaller than Tol. A detailed discussion of the performance of the method is presented in [2]. In Table 2, we present some results for high-dimensional integration problems via DART. For this case, the requested accuracy tolerance is always 5×10−10 and the computations proceed in hardware floating point precision. The region of integration is the unit d-cube: xi ∈ [0, 1] for all i. The integrands cited in Table 2 are the following functions selected from the families of integrands considered in [2] :
F5 = F6 =
2 d i=1 bi xi 3
F7 = cos
d i=1 bi xi
F16 = exp
d i=1
d
xi
i=1
xi
d
F17 = ln 1 +
xi
−1 d i=1 2xi i=1
F18 = 1 + F19 = sin 2 F22 =
π 4
d i=1
xi
d
xi
i=1 2 d cos
i=1
REFERENCES
[1] J. Bernsten, T. O. Espelid, and A. C. Genz. Algorithm 698: DCUHRE - An Adaptive Multidimensional Integration Routine for a Vector of Integrals. ACM Transactions on Mathematical Software, 17:452–456, 1991. [2] O. A. Carvajal. A New Hybrid Symbolic-Numeric Method for Multiple Integration Based on Tensor-Product Series Approximations. Master’s thesis, Univ of Waterloo, Waterloo, ON, Canada, 2004. [3] F. W. Chapman. Generalized Orthogonal Series for Natural Tensor Product Interpolation. PhD thesis, Univ of Waterloo, Waterloo, ON, Canada, 2003. [4] W. Cheney and W. Light. A Course in Approximation Theory. The Brooks/Cole Series in Advanced Mathematics. Brooks/Cole Publishing Co., Pacific Grove, California, 2000. [5] K.O. Geddes. Numerical Integration in a Symbolic Context. In B. W. Char, editor, Proc of SYMSAC’86, pages 185–191, New York, 1986. ACM Press. [6] K.O. Geddes and G. J. Fee. Hybrid Symbolic-Numeric Integration in Maple. In P. Wang, editor, Proc of ISAAC’92, pages 36–41, New York, 1992. ACM Press. [7] A.C. Genz and A. A. Malik. An Adaptive Algorithm for Numerical Integration over an N-Dimensional Rectangular Region. Journal of Computational and Applied Mathematics, 6:295–302, 1980. [8] J. M. Hammersley and D. C. Handscomb. Monte Carlo Methods. Methuen, 1964. [9] T. X. He. Dimensionality Reducing Expansion of Multivariate Integration. Birkha˝ user, 2001. [10] F. J. Hickernell. What Affects Accuracy of Quasi-Monte Carlo Quadrature? In H. Niederreiter and J. Spanier, editors, Monte Carlo and Quasi-Monte Carlo Methods, pages 16–55. Springer-Verlag, Berlin, 2000. [11] C. Lemieux and P. L’Ecuyer. On Selection Criteria for Lattice Rules and Other Quasi-Monte Carlo Point Sets. Mathematics and Computers in Simulation, 55(1-3):139–148, 2001. [12] I. H. Sloan and S. Joe. Lattice Methods for Multiple Integration. Oxford University Press, 1994. [13] A. H. Stroud. Approximate Calculation of Multiple Integrals. Prentice-Hall, 1971.
xi .
F5 and F6 contain coefficients bi which were assigned nonzero integer values in the range −5 ≤ bi ≤ 5 (see [2] for details). In Table 2, Fcn denotes the integrand function and for
91
A BLAS Based C Library for Exact Linear Algebra on Integer Matrices Zhuliang Chen
Arne Storjohann
http://www.uwaterloo.ca/˜z4chen
http://www.uwaterloo.ca/˜astorjoh
School of Computer Science, U. Waterloo Waterloo, Ontario, N2L 3G1 Canada
ABSTRACT
Linear Algebra over
Algorithms for solving linear systems of equations over the integers are designed and implemented. The implementations are based on the highly optimized and portable ATLAS/BLAS library for numerical linear algebra and the GNU Multiple Precision library (GMP) for large integer arithmetic.
Nonsingular Solving over
Categories and Subject Descriptors
Word-Size Linear Algebra
I.1.2 [Symbolic and Algebraic Manipulation]: Algorithms; G.4 [Mathematical Software]: Algorithm design and analysis, Efficiency
ATLAS/BLAS
General Terms
GMP Bignum
Algorithms Figure 1: Organization of IML
Keywords integer matrix, linear system solving
1.
lifting algorithm [5, 19] for nonsingular solving. Second, we design and report on an implementation of a new algorithm, based on ideas from [13, 14, 21], for certified solving. A feature of the certified solver is that lattice basis reduction may optionally be used to reduce the size of the particular solution. The implementations of these exact solvers are included in the recently released free library of C source code called IML — Integer Matrix Library1 . Below we discuss these two contributions in more detail, but first consider the organization of the IML library shown in Figure 1. The modules in the lowest level refer to two highly optimized and portable software libraries: the Automatically Tuned Linear Algebra Software library [29] for numerical linear algebra, and the GNU Multiple Precision library [16] for large integer arithmetic. Now consider the dashed box in Figure 1. The module WordSize Linear Algebra refers to the computation of matrix invariants (e.g., determinant, inverse, rank, left and right nullspace, row echelon form) over a small prime field. Section 2 discusses our implementation of this module. We use the standard representation for elements of p — nonnegative integers between 0 and p 1. In IML we always choose the moduli p small enough to allow direct use of the numerical BLAS routines for basic arithmetic operations on matrices and vectors. For example, by stipulating that n p 1 2 253 1 (the size of a double mantissa), the multiplication over p of two matrices with inner dimension n can be performed with single matrix multiplication over followed by reduction of entries modulo p. A more general purpose BLAS interface for matrix arithmetic over finite fields is already described
INTRODUCTION
The fundamental problem in exact linear algebra is to compute a particular solution to a system of linear equations. Linear solving can be divided into two cases. The first case – nonsingular solving – is to compute the unique solution to a nonsingular system Ax n n and b n 1. The second and more general b, where A case – certified solving – is to compute a solution with minimal denominator to a system Ax b, or to certify that the system has n m has arbitrary shape and rank. no solution, where A Nonsingular solving is the main building block of many recently proposed algorithms for other problems, including Diophantine solving [13, 20], the certified solving problem mentioned above [21], determinant [1, 10], Smith form [10, 23], and special cases of Hermite form [28]. Nonsingular solving is the main computational task driving the cost of all the algorithms in [1, 10, 13, 20, 21, 23, 28]. The two main contributions of this paper are as follows. First, we describe an efficient implementation of the well-know p-adic
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ISSAC’05, July 24–27, 2005, Beijing, China. Copyright 2005 ACM 1-59593-095-7/05/0007 ...$5.00.
1 http://www.scg.uwaterloo.ca/˜z4chen/iml.html
92
in [8], where a wider class of fields (i.e., larger, nonprime) are handled efficiently by using various techniques (e.g., lazy reduction, blocking to preserve exactness, alternative field representations). See also [6, 9].
a certified solution of the compressed system from the single nonsingular system solution C1 1 C2 b , which is then easily extended to get a solution of the original system. The nonsingular system solution C1 1 C2 b is computed using the fast nonsingular solver discussed above. Asymptotically, the expected cost of our certified solver is the same as the algorithm in [21], but there are two important advantages in practice. First, the approach in [21] needs to solve multiple systems of the form ABy b, for a sequence of random dense matrices B. The algorithm here solves a single nonsingular system C1 y C2 b with right hand side constant number of columns. It is considerably more efficient to solve one nonsingular system with right hand side k columns than to solve k different nonsingular systems with right hand side one column. Just consider that only one as opposed to multiple matrix inverses modulo p need be computed. Second, lattice reduction may be used to optionally reduce the bitlength of numerators in the solution vector. Our algorithm for solving the compressed system Cx b recovers at the same time an integer right kernel basis for C which has dimension k (e.g., k 10). This kernel can be used to reduce the solution size using lattice basis reduction, based on an idea described in [25, Page 381]. Although a reduction is not guaranteed, this method works well in practice. For example, consider a system Ax b with A of dimension 1000 2000 and all entries in A and b chosen randomly in the range 7 7. Without lattice reduction, a particular (in this case Diophantine) solution is computed in 52s and has entries with about 3097 decimal digits. With basis reduction the time increases to 89s but entries in the improved solution have only 311 decimal digits. The key to this approach is that lattice reduction is applied to a vector space of dimension k (in this example k 10). Applying the idea in [25, Page 381] directly would require computing and reducing the right kernel of dimension 1000 of the original matrix A; we expect this (since the system is random) to produce a solution vector with very small entries (i.e., single decimal digit) but with current techniques the lattice reduction of this large kernel is prohibitively expensive in terms of both time and space. For a matrix or vector A over , we denote by A the maximum magnitude of entries in A.
Nonsingular solving. The goal is to solve a system Ax b, n n is nonsingular and b n 1. A key feature of this where A problem is expression swell. We expect the unique solution vector n 1 to require about as much space to write down as the A 1b entire input system. The three main computations in the variation of p-adic lifting that we use are: (i) a matrix inverse modulo p, log p O log n log A ; (ii) O n matrix-vector products with entries of bitlength about log p; (iii) O n operations with integers of bitlength about n log p. Stage (iii) is accomplished with GMP. Our implementation reduces stages (i) and (ii) to Level 3 BLAS Gemm (wordsize matrix-matrix multiplication) and Level 2 BLAS Gemv (wordsize matrix-vector multiplication) by working over a well chosen residue number system. For the practitioner and implementor, the most important section of this paper is Section 3, where all the various optimization techniques used in our p-adic solver are described. The goal of our implementation is both performance and generality. For example, transposed systems At x bt are directly handled, the right hand side b may be multiple columns, and the integers in A and b may be arbitrarily large. However, if b is a single column and/or entries in A are all word-size (i.e., 32 or 64 bit) then the solver is optimized to take advantage. All timings we give in this paper use the same environment: Intel R Itanium R 2 @ 1.3GHz, 50Gb RAM, Intel R C++ compiler 8.0, GMP v. 4.1.3, ATLAS v. 3.6.0. Our code will solve a system of dimension 500, 1000 and 2 000 with single decimal digit entries in about 1.2 seconds, 7 seconds, and 42 seconds (see Table 2). While integers in these input systems were chosen randomly in the range 7 7, the numerators and denominators of entries in the solution vectors have 141, 1904, and 4110 decimal digits, respectively. Our implementation is designed for dense matrices, and doesn’t take advantage of any structure or sparseness that may be present in the input system.
2.
Certified solving. A reduction of Diophantine solving to non-
singular solving is given in [13]; an integer solution (if one exists) n m has arbitrary shape and rank of a system Ax b, where A profile, can be computed with high probability by combining a few random rational solution vectors. The ideas in [13] are taken up in [20] and extended in [21] to get an algorithm for certified solving: compute either a minimal denominator solution or prove that the system is inconsistent. Section 6 describes our new algorithm for the certified solving problem; the definition of certified solving is recalled at the start of the section. n m has full row rank. On the one Suppose for now that A hand, the approach of [21] is to solve multiple nonsingular sysm n is chosen randomly. tems of the form ABx b, where B On the other hand, the algorithm we describe here computes a sinm n k , gle compression of the form C AB, for random B and then computes a certified solution of the compressed system Cx b. The matrix C can be written as C C1 C2 where C1 is nonsingular and the column dimension k of C2 is a small constant. Thus, the compressed system is almost square. Section 4 sketches our (lengthy) proof that the compression will be successful with high probability even when k 25 (i.e., that the compressed system Cx b has the same minimal denominator as the original system Ax b). Section 5 gives a deterministic algorithm for computing
WORD-SIZE LINEAR ALGEBRA
Almost all vector space invariants of a given A p n m can be recovered by computing an echelon transform: a structured nonsingular U and a permutation P such that UPA is in echelon form. The following example has rank profile 1 4 5 .
U
U2 dI3
PA
U1
R
1
1
1
A recursive reduction of echelon transform computation to matrix multiplication is described in [27, Section 2.1]. We have implemented an optimized version of the echelon transform algorithm in IML. The functionality is also available in Maple R 10 as the command Modular[RowEchelonTransform]. All matrix multiplications arising in the algorithm have inner dimension at most n 2 , and thus can be performed directly with Gemm provided that n 2 p 1 2 p 1 253 1. The additive term p 1 arises because the algorithm takes advantage of an extra matrix addition operation allowed to be performed by Gemm. On the one hand, two n n matrices can be multiplied with 2n3 n2 arithmetic operations. On the other hand, the computation of the inverse of A using the echelon transform also has cost
93
2n3 O n2 arithmetic operations. Of course, the real advantage of using a recursive reduction to matrix multiplication is that we can exploit Level 3 BLAS, which is typically orders of magnitude faster than an equivalent computation using Level 2. To evaluate our implementation in IML we use the ratio test: the time for an invariant computation (e.g., inverse or determinant) of an n n matrix is divided by the time for computing the product of two matrices. Table 1 shows some timings for a call to Gemm A A for comput-
Gemm A A 0.01 s 0.06 s 0.46 s 3.58 s 11.56 s 0.89 m
Dimension 300 500 1000 2000 3000 5000
detA mod p Gemm A A
5 3 2.3 1.9 1.7 1.4
2 1.33 0.98 0.74 0.66 0.54
ing AA for different dimensions of A. In the same table, we show the ratios of the times to compute A 1 mod p and det A mod p with the time for the call to Gemm A A . For dimension five thousand an inverse computation takes about one and a half times as long as a matrix multiplication, while the determinant can be computed in about half the time. Similar results are obtained for the computation of rank, nullspace, and reduced row echelon form. We conclude that the echelon transform gives a highly effective (in practice) reduction of matrix (vector space) invariant computation to matrix multiplication.
3.
NONSINGULAR SOLVING
(i)
B:
1
mod A
The second phase computes vectors ci such that mod A 1b pi 1 c0 c1 p ci pi for i 0 1 k 1, k to be determined.
(ii)
3.1
(iii) c : x:
k c 0 c1 p ck 1 p k RatRecon c N D p ;
1;
Fast solution reconstruction
# radix conversion
The first part of phase (iii) is to compute the sum c c0 c1 p ck 1 pk 1 . Initially we used a Horner scheme but discovered this was too slow. Instead, we implemented recursive algorithm for the radix conversion which incorporates large integer multiplication (see [12, Exercise 9.20]) to take better advantage of GMP. The next step is rational reconstruction. The goal is to compute a common denominator d such that d D and mod dc pk N. The naive approach to accomplish this is to apply rational reconstruction on each entry of c independently and then set d to be the lcm of the n denominators. In practice, this is slow since each entry in the solution vector typically has denominator a large factor of d (in many cases equal to d.) In particular, GMP doesn’t currently offer an optimized function for rational number reconstruction. Our implementation3 is very sensitive to the size of the denominator.
Phase (iii) is accomplished using GMP. The optimized implementation of phase (iii) will be discussed later. First consider phases (i) and (ii). Our implementation reduces all of the matrix arithmetic in these stages to a number of calls to the highly efficient BLAS routines Gemm (matrix-matrix products) and Gemv (matrix-vector products). This is accomplished by working in a residue number system. On the one hand, in order to be able to apply Gemm and Gemv directly we need to work with moduli that are bounded in magnitude by B, where n B 1 2 253 1, or, equivalently, B 2t
Hadamard’s inequality bounds the numerators and denominators of entries in A 1b by N : nn 2 A n 1 b and D : nn 2 A n , respectively. Thus, if k is chosen to satisfy pk 2ND we can re n 1 cover A 1b using radix conversion and rational number reconstruction, see [12, Section 5.10].
r : b; for i from 0 to k 1 do ci : mod B mod r p p ; r : r Aci p od;
p ;
n n and a right hand side b Let a nonsingular matrix A n m be given. The first phase of p-adic lifting is to compute the inverse of A modulo p, where p det A.
Table 1: Ratio test for word-size inverse and determinant.
A 1 mod p Gemm A A
where t : log2 1 253 1 n . On the other hand, to maximize the amount of work in each BLAS call the moduli should be chosen as large as possible. For n 20 000 we have t 19 35, so for all practical purposes there will be sufficiently many distinct primes in the range 2t 1 2t . Thus, we may assume that all our moduli are about t bits in length. The obvious approach is to choose p to be a single t-bit prime such that p det A (such a prime can be found easily with a random choice.) Then stage (i) is accomplished with a single application of the echelon transform algorithm described in Section 2. Moreover, each computation of ci in stage (ii) can be accomplished with a n A ci n A p 1 , the single call to Gemv.2 Since Aci computation of Aci in stage (ii) can be accomplished by working over a residue number system with about 1 log2 n log2 A t moduli. Each iteration of the loop in stage (ii) makes t bits of progress (the bitlength of p), so the number of calls to Gemv per t bits of progress is about 2 log2 n log2 A t. Experiments with the approach just described revealed that much more time is spend in phase (ii) than in phase (i). This is not unexpected since a single matrix-matrix multiplication using Level 3 BLAS is orders of magnitude faster than n matrix-vector multiplications using Level 2 BLAS. To better balance the cost between phases (i) and (ii) we modify the above approach by choosing p to be a multiple of t-bit primes: p p1 p2 pl , l to be determined. Then the computation of ci in phase (ii) requires l calls to Gemv. Computing the updated r in phase (ii) by first computing Aci requires a residue number system with about l log2 n log2 A t primes. But we can optimize this by taking advantage of the fact that r Aci p is integral. Using GMP, compute the division with remainder of each entry in p. Then r Aci p r by p to obtain r pQ R, where R is equal to Q R Aci p. Since R Aci p 1 n A , we can compute R Aci p directly in a residue number system with only about log2 n log2 A t moduli. This optimized approach requires only l log2 n log2 A t calls to Gemv per lt bits of lifting progress, compared to the previous estimate of 2l log2 n log2 A l t calls. Of course, as l increases the cost of phase (i) increases, since l inverses need to be computed. Therefore, we need to find a balance between these two concerns. Our experiments suggest the choice l 2 logn log2 A if entries in A are small and can fit into signed long. Otherwise, we choose l log2 n log2 A . In the next two subsections we discuss two more optimizations, aimed at minimizing the cost of phase (iii) and minimizing the number of lifting steps in phase (ii).
2 We will assume here that the right hand side has column dimension m 1. If m 1 then a single call to Gemm should be used. 3 We implemented the standard iterative algorithm based on the ex-
94
Dimension 100 500 1000 1500 2000 2500 3000
Instead, we use a trick that is implemented in Shoup’s NTL library [24]. Initialize d : 1, and for i 1 2 n reconstruct the ith entry of the solution vector as follows. Let e denote the ith entry in c. Compute e¯ : mod de pk and then apply rational number reconstruction on e¯ to get n d.¯ Then n d d¯ is the ith entry in the solution vector. Update d to be d : d d¯ and proceed to the next i. At the end d will be the lcm of all the denominators. The advantage of this approach is that the product of all n denominators being reconstructed is equal to d, instead of being as large as d n . Thus, the net cost of all n calls to the rational number reconstruction routine is about the same as one call with a number having denominator d. We mention an anecdotal timing result. For a nonsingular system with dimension 3000 and single decimal digit entries, the new technique reconstructs the solution within five seconds. Applying rational reconstruction to each entry independently used twentyfive minutes. Thus, a three hundred times speedup is gained. The reason for this is that the GMP library provides highly optimized multiplication and division subroutines (i.e., for the computation of e¯ : mod de pk ).
3.2
Digits 141 877 1905 2991 4110 5257 6430
Dimension 3500 4000 4500 5000 6000 7000 8000
Time 3.1 m 4.7 m 6.5 m 9.5 m 14.7 m 23.4 m 35.4 m
Digits 7620 8821 10040 11271 13761 16288 18849
Table 2: Timings to solve a nonsingular system Ax b, where b is a vector and entries of A and b are randomly chosen in the range 7 7. The Digits column is the number of decimal digits in entries of the solution vector.
Dimension 200 500 1000 2000
Output sensitive lifting
Time 0.2 s 1.2 s 6.6 s 19 s 42 s 1.3 m 2.2 m
IML 0.2 s 1.5 s 6.6 s 42 s
NTL 0.1 s 17 s 37 s 5.6 m
Maple 5.8 s 87 s 23 m 5 7 h
In the worst case, the number of lifting steps k needs to be chosen satisfying pk 2ND in order to reconstruct the solution correctly, where N nn 2 A n 1 b and D nn 2 A n come from Hadamard’s bound. However, Hadamard’s bound is often a pessimistic [2] and we can take advantage of this by performing the lifting in an output sensitive manner.
Table 3: Comparison of timings to solve a nonsingular system Ax b using IML NonSingSolve, NTL solve1, and Maple Modular[LinIntSolve]. Entries in A and column vector b are randomly chosen between 7 and 7.
1. Initialize k0 to be zero and k to be a small positive integer (e.g., k 10).
point out that the time spent for the matrix inverse computations is almost the same as the time spent for the lifting iteration, due to our optimized choice of size of the lifting basis. A more detailed breakdown of the timing can be found in [4]. Table 3 compares our implementation with function solve1 in NTL [24] v. 5.3.2 and Modular[LinIntSolve] in Maple R v. 9.5. NTL also makes full use of GMP for the large integer arithmetic, but except for the fast rational reconstruction technique described in Section 3.1, the implementation in NTL doesn’t incorporate the optimizations described in this section. The speedup of IML over NTL is obtained in large part because NTL is not using the ATLAS/BLAS library to perform the matrix-vector and matrix-matrix multiplications. The solver in Maple is actually BLAS based but uses a homomorphic imaging scheme instead of lifting. Giorgi [15] has implemented a BLAS based version of lifting for nonsingular solving in the LinBox5 library [7]. For input matrices with small entries (i.e., in the range 7 7) the timings obtained are very similar to those in Table 2. For matrices with larger entries the IML implementation using multiple primes for lifting gives an improvement (e.g., the solution to a 2000 2000 input matrix with 100 bit entries takes about twice as long in LinBox as with IML). Finally, we mention a technical optimization based on a feature of BLAS. If the right hand side b has column dimension m 1, the computation of r and ci in phase (ii) of the algorithm should make use of single calls to Gemm instead of m calls to Gemv. For example, Table 4 shows that solving a system with right hand side ten columns takes less than five times as long as solving a system with right hand side a single column.
2. Perform k k0 iterations in phase (ii) to compute the lifting coefficients ck0 ck 1 . 3. Compute c :
c 0 c1 p
ck
1p
k 1.
4. Use the rational reconstruction scheme of the previous section to attempt to compute a common denominator d and pk 2 . y : mod dc pk such that max d y
5. If the reconstruction succeeds and Ay db then return y d. Otherwise, assign k to k0 and increase k by a small positive integer4 and goto step 2. In our implementation we optimize the above scheme by making the following two changes. First, we merge steps 3 and 4. As soon as we compute the ith entry e of c using radix conversion, we perform the rational reconstruction on e before starting to compute the i 1 st entry of c. If the reconstruction of e fails, we avoid computing entries i 1 n of c. In practice, if k is not large enough a failure is reported quickly. Second, instead of assaying correctness of y in step 5 with an expensive matrix-vector multiplication (note that entries in y may have bitlength n times larger than entries in A) we verify the solution by checking the magnitude bound according to the following lemma, due to [3].
L EMMA 1. If d b n A y
3.3
pk 1 2 then Ay
bd.
Timings
Table 2 gives some timings to solve the nonsingular system Ax b for different dimensions of A, where b is a vector and entries of A and b are relatively small. Although not shown in the table, we
4.
LATTICE COMPRESSION
n
Let A
m n with full row rank m be given, n
m. Let B , with entries chosen uniformly and randomly from a finite set Λ 0 1 λ , where λ 2. We call the action of post-
tended euclidean scheme, see [12, Section 9.10]. 4 Our implementation uses k : k 10, which has not been optimized.
m k
5 http://www.linalg.org
95
Dimension 500 2000 4000
m 1 1.2 s 42.0 s 4.7 m
m 10 4.9 s 3.4 m 22.3 m
Ratio 4.1 4.9 4.7
U
Table 4: Comparison of timings to solve a nonsingular system Ax b with m 1 and m 10, where m is the column dimension of b and the entries of A and b are randomly chosen between 7 and 7.
U1
N
N B
where
1s
1
..
.
sm
t1
(1)
S2 S3
and only if all the diagonal entries in the Smith form of N B are equal to 1. B1 B2 B3 and introduce two indeterPartition B as B minants t1 and t2 such that the column dimensions of B1, B2 and B3 is m t1 , t1 t2 and k t2 respectively. There exist unimodular matrices U and V which separately apply row operations and column
96
.
CERTIFIED SOLVING WITH KERNEL BASIS
p Lemma 3 implies that C mod p has full row rank m over for any prime p if and only if N B mod p has full row rank p . Since N B is equivalent to its Smith form, n over p for any prime p if N B mod p has full row rank over
In this section we present a deterministic algorithm to certified solve a linear system Cx b. We consider a special case of the A B problem: we assume that C may be decomposed as C n n k where A n n is nonsingular. The algorithm here is applicable for arbitrary B, but is designed for the case when the column dimension k of B is small (e.g., k O 1 ). In Section 6 we show how to use the lattice compression technique of Section 4 to certified solve an input system with arbitrarily column dimension by reducing to the special case we describe here, with k 25. Before describing the algorithm we recall the difference between a right nullspace (over ) and right kernel (over ). The matrix C is over the principal ideal domain , but we may also consider C n k k to be over the field . On the one hand, a matrix N is a right nullspace for C over if rank N k and CN 0. On n k k to be a right kernel for C the other hand, for a matrix K over the following additional condition must be satisfied: every integer vector in the right nullspace of C must be generated by an integer linear combination of columns of K. Computing a right kernel is a more subtle problem than computing a right nullspace. For example, scaling a nullspace by multiplying by the least common multiple of the denominators of entries will produce a nullspace with integer entries, but this is unlikely to be a kernel. The first step of our algorithm is to compute A 1 B b . Then n k k for C over , the algorithm constructs a right kernel K
m n and B n l be given. Let N be L EMMA 3. Let A a right kernel for A. Then for any prime p, rank AB mod p rank N B mod p rank N mod p over the finite field p.
I2
5.
since N L In m . Having set up the transform as above, we can see that a sufficient condition for all the diagonal entries of the Smith form of N B to equal one is that s1 s2 sm t1 1 1 and S2 S3 R It1 1 . The first step of our proof is to bound the sm t1 1 1 using the technique from probability that s1 [10, Section 6]. A problem of adapting the technique is that it can only provide a useful bound on the probability that s1 sm t1 1 are equal to one, without determining the value of sm t1 . However, if S2 S3 R It1 1 , then all the entries on the diagonal of the Smith form of N B are necessarily 1. So, the second step of our proof is to bound the probability that S2 S3 R It1 1 using the results from [21, Section 3]. In the final step, we derive the result of Theorem 2 by fixing the values of t1 and t2 and combining the previous two results.
Now we give the idea of our proof for Theorem 2. Refer to [4] for the complete proof. Let C AB. To bound the probability of A R C, an equivalent conversion is to assume that A R Im and to bound the probability that C R Im . Since A R Im if and only if for all primes p, A mod p has full row rank over p , our goal is to derive a lower bound on the probability that C mod p has full row rank over p for all primes p. n n m denote a right kernel for A. Then N Let N L In m (i.e., N has the Hermite row basis an identity matrix). Thus, for any prime p, N mod p has full column rank over p . From [21, Lemma 15] we obtain the following.
S2 S3 U2 B2 B3 and the t1 1 n submatrix U2 is in the nullspace for N. The first n m invariant factors are 1
m n be given, where rank A m and T HEOREM 2. Let A n m k n m. Let B have entries uniformly and independently chosen at random from Λ 0 1 λ 1 . If λ max m 1 log2 n log2 A , then the probability that A R AB is at least 1 16 1 2 k 5. In particular, the probability is 1 2 if k 25.
S ..
1
I3
B1 B2 B3
V
U2
multiplication of A by B lattice compression: C AB where C has k more columns than rows. Let A R C denote that A and C have the same Hermite column basis (i.e., the set of all -linear combinations of columns of A is equal to the set of all -linear combinations of columns of C). In this section we sketch the result in [4] which gives a lower bound on the probability that A R C; the bound exponentially converges to one with respect to k. The lattice compression technique will be used by the algorithm in next section. The strongest form of lattice compression has k 0. For ex1) such that all ample, let V be a unimodular matrix (detV C but the first m columns of AV are zero: AV , where m m is necessarily nonsingular. If we take B as the subC matrix of V comprised of the first m columns, then C AB and A R C. However, the most efficient algorithm to compute such a V is too expensive by factor of m and, moreover, guarantees only the bound B m3m 2 3 A 3m (see [27, Proposition 8.10]) which causes this approach to be very inefficient. Alternatively, by choosn m k to be a random matrix with k a small constant, ing B we have A R C with high probability.
operations on N B and transform the submatrix N B1 to Smith form. Such U and V can be partitioned using a conformal block decomposition as
n k 1 such that Cy a minimal denominator solution y b, 1 n and a certificate vector z which proves the minimality of the denominator of y. For clarity, we first consider the computation of only K, then show how extend the method to compute y and z. For our size estimates in this section we assume that O n logn log C . log b Finally, we recall a technique for optionally reducing the the solution using lattice reduction.
v
k 1 such that
x1 Nv has minimal denominator. Let
N
sx1
have Hermite form
T
c e
(2)
s
Two rational matrices that are left equivalent necessarily have the k 1 such that same denominator. Thus, our goal is to find a v the vector
Constructing a right kernel K
N
We will compute K to be the last k columns of the unique unimodular matrix of dimension n k such that
sx1
sv 1
1 or, equivalently, s
T
c e
sv 1
1 s
s K A
B Ik
H
(3) has minimal denominator. The equivalent presentation in (3), together with the choice v T 1c 1 s , reveals that the minimal denominator is s e. But since sT 1 is integral, we can choose sT 1 c mod se 1 s2 , giving the minimal denominator sov k 1 can s2x1 N v¯ s2, where v¯ sT 1 c mod se lution y 1 be computed by first computing sT mod se using similar method as used for the computation of H from sT¯ 1 mod s. The cost estimate remains the same as before.
where the matrix on the right hand side is in column Hermite form. Solving gives K StackMatrix A 1BH H . We now show how to construct H efficiently from A 1B. Let s be the denominator of n k k . A 1 B b and set N StackMatrix sA 1B sIk Then N is a basis for the right nullspace of C over . The algorithm of [17] can be used to compute the upper triangular row k k. The algorithm is easily modified to comHermite basis T k k that is lower pute, instead of T , a left equivalent matrix T¯ triangular with off-diagonal entries in each column reduced modulo the positive column leader (just an alternative definition of the Hermite row basis). This algorithm uses O nk 2 integer operations and takes advantage of the rows sIk to keep all intermediate entries during the computation of T and T¯ reduced modulo s. Then N T¯ 1 is also a right kernel of C over , so the last k rows of K and N T¯ 1 are right equivalent (i.e., H is the column Hermite form of sT¯ 1 ). The following code fragment shows how to compute H from T¯ , keeping all off-diagonal entries in H reduced modulo s.
q
sA 1B sA sIk
1b
e
s
mod s
which shows that q is as desired. It remains to show how to compute u, row k 1 of a unimodular matrix effecting the transformation in (2). The transformation to Hermite form can be accomplished with a sequence of O nk unimodular transformations, each transformation on one or two rows (cf. [27, page 55]). Multiplying these transformations together would yield a unimodular matrix U UO nk U3U2U1 , but 2 this would be too expensive (O n k arithmetic operations) and the U produced might be dense, with O n2 nonzero entries. Instead, store all these transformations and then, working modulo s, apply them in reverse order to row k 1 of the identity matrix. As a result, u can be computed at the same time. Finally, compute z : qA 1. If k O 1 the nonsingular systems A 1 B b and qA 1 can be solved with O n3 log n log C 2 bit operations (see [20]). Additional O n operations with integers of bitlength O n logn log C bits is required. This gives the following.
Once H has been recovered, compute K : NH 1 s . By Hadamard’s bound and Cramer’s rule, all of s and N are bounded in length by O n logn log C bits. Assuming A 1B has already been computed, the dominant step is to compute T¯ at a cost of O nk 2 operations with integers bounded in absolute value by s.
Constructing a minimal denominator solution y
T HEOREM 4. Suppose log b O n logn log C and k O 1 . Given a prime p such that p O log n log C and p det A, the algorithm described above computes K, y and z with O n3 logn log C 2 bit operations.
Now we extend the algorithm just described to compute a minimal denominator solution y. Recall that N StackMatrix sA 1B sIk is a basis for the right nullspace of C. A particular solution of n k 1, and the set of Cx b is x1 : StackMatrix A 1b 0 n 1 all rational solutions is x1 Nv v . Our goal is to find a
Constructing a certificate z
H : sIk ; for i from 1 to k do for j from 1 to i 1 do for l from 1 to j do Hil : Hil T¯i j H jl mod s od od; for j from 1 to i do Hi j : Hi j T¯ii od od
1 n Finally, we show how to construct a certificate vector z such that zC is integral and zb has denominator s e. Our approach 1 n such that qA 1B is integral and qA 1 b is to construct a q has denominator s e, and then set z : qA 1. Let u be equal to row k 1 of a unimodular matrix that transforms the first matrix in (2) to it’s Hermite form. Let q be the vector comprised of the first n q entries of u reduced modulo s, so that u mod s. Recall that N StackMatrix A 1B sIk and x1 StackMatrix A 1b 0 . Then
Incorporating lattice reduction Let y be a minimal denominator solution for Cx b and K be a right kernel for C. If y and K are computed using the algorithm
97
Dimension n m 500 1000 1000 2000 3000 6000 6000 8000
supporting Theorem 4 the bitlength of entries will be O n logn log A . We can try to find a minimal denominator solution y¯ with improved bitlength using the following approach described in [25, Page 381]. ¯ Use lattice basis reduction [18] to compute a reduced kernel K. k so that the vector d y y¯ : d y y Ku is size Then compute u ¯ In all our experiments with reduced with respect to the vectors in K. random matrices, this produces a minimal denominator solution y¯ which has numerators with bitlength about a factor of k smaller than those in y. The main cost of the above approach is to reduce the lattice K of dimension k in in n k -dimensional space. This lattice has a very special shape since k is so small compared to n k (e.g., k O 1 ). Moreover, the lattice is very skew since the norms of column vectors in K are large (i.e., O n logn log A ). Using directly the LLL algorithm [18] would be too expensive. Instead, we use the modification in [26] that works in three stages: (1) compute K T K k k; (2) compute, modulo an integer M O kn K , a unimodk k; (3) set K ¯ KU mod M. Step (2) dominates ular matrix U the cost with O k 4n logn logA arithmetic operations involving integers bounded in length by O kn logn log A bits. The algorithm in [22], which is less sensitive to K , may also be well suited for this task.
6.
k 10 20 30 40 50
1.
m
1 n with
qA
0 0
1 m and qb
0.
Timings
In the first case, y is a solution with minimal denominator and z serves as a certificate for the minimality of the denominator of y. The idea of minimal denominator certificate is a generalization of the integer version of Farkas’ lemma in [11]. In the second case, vector q certifies that the system is inconsistent and has no rational solution. The idea for certifying inconsistency is due to [14]. Refer to [21] for explanations in detail. There are three stages to our certified solver. The first is to either prove that the input system Ax b is inconsistent or to reduce to an equivalent system which has full row rank. Our implementation of this first stage is accomplished using a similar approach as in [21, CertifiedSolver, Page 506]. The description of this phase is omitted here since it is so similar. Henceforth we will assume that the input n m has full row rank n. matrix A The second stage is to compress the system Ax b with A n m using lattice compression as described in Section 4. This give us an almost square system Cx b (e.g., C AB with B
The certified solver described in Section 5 and above was implemented in IML directly as described using GMP. All nonsingular solving uses the ATLAS/BLAS based algorithm described in Section 3. For the lattice compression phase we chose by default λ 2 and k 10 (the user has the option of choosing a larger k). Using this choice, the lattice compression C AB succeeded in all our experiments, and hence only a single nonsingular system C1 1 C2 b with right hand side eleven columns needed to be solved. Table 4 shows that the time for solving this system is only about five times as long as the time for solving a system with right hand side one column. In Table 5 we give timings to compute a minimal denominator solutions to the system Ax b with entries in A and b randomly chosen. The table shows that the solution size can be approximately reduced by a factor of ten using lattice basis reduction. For a randomly chosen input system, we always observed this correspondence between the solution reduction and the kernel basis dimension. Table 6 considers the effect of increasing the value of k on the quality of the solution and the running time. Most of the increase in time is due to the lattice reduction phase. Finally, we remark that an optimized BLAS based algorithm for diophantine solving has been implemented in LinBox [7]. See [15] for a description and detailed comparison with IML.
n 25
6.1
Time 39 s 3.9 m 15 m 42 m 1.6 h
zb and y have the same denominator.
q
Digits 254 128 86 65 53
chosen randomly). The matrix multiplication AB is accomplished by reducing to Gemm by using a residue number system. Note that if m is not too big with respect to n (e.g., m n 25) then we can choose B Im in which case C A. The third stage computes a certified solution y¯ z to the compressed system Cx b using the algorithm described in the previous section. Using GMP, we assay if By¯ z is a certified solution to the original system Ax b by checking that zA is over , and that zb and By¯ have the same denominator.
y z , where m 1 with Ay b, y 1 n with zA 1 m, and z
2. (“no solution”, q), where
n m and b n 1 be given. We first recall the properLet A ties of minimal denominator. Let d y denote the denominator of a solution vector y to the system Ax b, where b can be a vector or a matrix. The set of denominators of all the solutions to the system Ax b generates an ideal I of . Let d A b denote the generator of the ideal I. Then d A b divides all the elements in I and hence is the minimal denominator. Our algorithms for certified linear system solving has the same functionality as the algorithms in [21], which take as input A and b and return as output one of the following:
Table 6: Timings to compute a size-reduced minimal denominator solution to a system Ax b for different k, where k is the column dimension of the right kernel of the compressed system. 500 1000 and b 500 1 are randomly chosen Entries in A from 210 to 210. The column Solution Digits is the number of decimal digits of the largest entry in the solution numerator.
CERTIFIED LINEAR SYSTEM SOLVING
Reduced Solution Digits Time 146 16 s 311 89 s 1102 33 m 2347 4.8 h
Table 5: Timings to compute a minimal denominator solution n m and b n 1. Entries of to the system Ax b, where A A and b are randomly chosen between 7 and 7. The column Digits is the number of solution decimal digits.
Typical Solution Digits Time 1445 9s 3097 52 s 10995 22 m 23451 3.3 h
98
7.
REFERENCES
[15] P. Giorgi. Arithmetic and algorithmic in exact linear algebra for the LinBox library. PhD thesis, Ecole normale superieure de Lyon, LIP, Lyon, France, December 2004. [16] T. Granlund. The GNU multiple precision arithmetic library, 2004. Edition 4.1.4. http://www.swox.com/gmp. [17] C. S. Iliopoulos. Worst-case complexity bounds on algorithms for computing the canonical structure of finite abelian groups and the Hermite and Smith normal forms of an integer matrix. SIAM Journal of Computing, 18(4):658–669, 1989. [18] A. K. Lenstra, H. W. Lenstra, and L. Lov´asz. Factoring polynomials with rational coefficients. Math. Ann., 261:515–534, 1982. [19] R. T. Moenck and J. H. Carter. Approximate algorithms to derive exact solutions to systems of linear equations. In Proc. EUROSAM ’79, volume 72 of Lecture Notes in Compute Science, pages 65–72, Berlin-Heidelberg-New York, 1979. Springer-Verlag. [20] T. Mulders and A. Storjohann. Diophantine linear system solving. In S. Dooley, editor, Proc. Int’l. Symp. on Symbolic and Algebraic Computation: ISSAC ’99, pages 281–288. ACM Press, New York, 1999. [21] T. Mulders and A. Storjohann. Certified dense linear system solving. Journal of Symbolic Computation, 37(4):485–510, 2004. [22] P. Nguyen and D. Stehl´e. Floating-point LLL revisited. In Proceedings of Eurocrypt ’05, 2005. [23] D. Saunders and Z. Wan. Smith normal form of dense integer matrices, fast algorithms into practice. In J. Gutierrez, editor, Proc. Int’l. Symp. on Symbolic and Algebraic Computation: ISSAC ’04, pages 274–281. ACM Press, New York, 2004. [24] V. Shoup. NTL: A Library for Doing Number Theory, 2005. http://www.shoup.net/ntl/. [25] C. C. Sims. Computation with finitely presented groups, volume 48 of Encyclopedia of mathematics and its applications. Cambridge University Press, 1984. [26] A. Storjohann. Faster algorithms for integer lattice basis reduction. Technical Report 249, Departement Informatik, ETH Z¨urich, July 1996. [27] A. Storjohann. Algorithms for Matrix Canonical Forms. PhD thesis, Swiss Federal Institute of Technology, ETH–Zurich, 2000. [28] U. Vollmer. A note on the Hermite basis computation of large integer matrices. In R. Sendra, editor, Proc. Int’l. Symp. on Symbolic and Algebraic Computation: ISSAC ’03, pages 255–257. ACM Press, New York, 2003. [29] R. C. Whaley, A. Petitet, and J. J. Dongarra. Automated empirical optimization of software and the atlas project. Parallel Computing, 27(1-2), 2001.
[1] J. Abbott, M. Bronstein, and T. Mulders. Fast deterministic computation of determinants of dense matrices. In S. Dooley, editor, Proc. Int’l. Symp. on Symbolic and Algebraic Computation: ISSAC ’99, pages 197–204. ACM Press, New York, 1999. [2] J. Abbott and T. Mulders. How tight is Hadamard’s bound? Experimental Mathematics, 10(3):331–336, 2001. [3] S. Cabay. Exact solution of linear systems. In Proc. Second Symp. on Symbolic and Algebraic Manipulation, pages 248—253, 1971. [4] Z. Chen. A BLAS based C library for exact linear algebra on integer matrices. Master’s thesis, School of Computer Science, University of Waterloo, 2005. [5] J. D. Dixon. Exact solution of linear equations using p-adic expansions. Numer. Math., 40:137–141, 1982. [6] J.-G. Dumas. Efficient dot product over word-size finite fields. CoRR, cs.SC/0404008, 2004. [7] J.-G. Dumas, T. Gautier, M. Giesbrecht, P. Giorgi, B. Hovinen, E. Kaltofen, B. D. Saunders, W. J. Turner, and V. G. LinBox: A generic library for exact linear algebra. In A. J. Cohen and N. Gao, X.-S. andl Takayama, editors, Proc. First Internat. Congress Math. Software ICMS 2002, Beijing, China, pages 40–50, Singapore, 2002. World Scientific. [8] J. G. Dumas, T. Gautier, and C. Pernet. Finite field linear algebra subroutines. In Proc. Int’l. Symp. on Symbolic and Algebraic Computation: ISSAC ’02, pages 63–74. ACM Press, New York, 2002. [9] J. G. Dumas, P. Giorgi, and C. Pernet. Finite field linear algebra package. In J. Gutierrez, editor, Proc. Int’l. Symp. on Symbolic and Algebraic Computation: ISSAC ’04, pages 119–126. ACM Press, New York, 2004. [10] W. Eberly, M. Giesbrecht, and G. Villard. Computing the determinant and Smith form of an integer matrix. In Proc. 31st Ann. IEEE Symp. Foundations of Computer Science, pages 675–685, 2000. [11] J. Edmonds and R. Giles. A min-max relation for submodular functions on graphs. Annals of Discrete Mathematics, 1:185–204, 1977. [12] J. von zur Gathen and J. Gerhard. Modern Computer Algebra. Cambridge University Press, 2 edition, 2003. [13] M. Giesbrecht. Efficient parallel solution of sparse systems of linear diophantine equations. In M. Hitz and E. Kaltofen, editors, Second Int’l Symp. on Parallel Symbolic Computation: PASCO ’97, pages 1–10. ACM Press, New York, 1997. [14] M. Giesbrecht, A. Lobo, and B. D. Saunders. Certifying inconsistency of sparse linear systems. In O. Gloor, editor, Proc. Int’l. Symp. on Symbolic and Algebraic Computation: ISSAC ’98, pages 113—119. ACM Press, New York, 1998.
99
Structure and Asymptotic Expansion of Multiple Harmonic Sums C. Costermans
J.Y. Enjalbert
Universite´ Lille 2 1, Place Deliot, ´ 59024 Lille, France
Universite´ Lille 2 1, Place Deliot, ´ 59024 Lille, France
[email protected] Hoang Ngoc Minh
[email protected] M. Petitot
Universite´ Lille 2 1, Place Deliot, ´ 59024 Lille, France
Universite´ Lille 1 59655 Villeneuve d’Ascq, France
[email protected] [email protected] ABSTRACT
with the convention Hs (N ) ≡ 1 when s is the empty composition. It can be proved that the limit
We prove that the algebra of multiple harmonic sums is isomorphic to a shuffle algebra. So the multiple harmonic sums {Hs }, indexed by the compositions s = (s1 , · · · , sr ), are Rlinearly independent as real functions defined over N. We deduce then the algorithm to obtain the asymptotic expansion of multiple harmonic sums.
ζ(s) =
G.2.1 [Discrete Mathematics]: Combinatorics—combinatorial algorithms, generating functions
General Terms Algorithms, Languages
Keywords
Ha (N ) · Hb (N ) = Ha,b (N ) + Hb,a (N ) + Ha+b (N ).
Polylogarithms, multiple harmonic sums, Lyndon words, polyzˆetas
INTRODUCTION N X 1 n=1
ns
(s ∈ N>0 , N ∈ N>0 )
,
(1)
can be generalized to any composition s of length r ≥ 0, i.e. a sequence of positive integers s = (s1 , . . . , sr ) by putting Hs (N ) =
X N ≥n1 >···>nr >0
1 n1 s 1 . . . n r s r
(4)
So, the vector-space HR is closed under product. The main result of this article is to prove that in HR , the functions {Hs }s are linearly independent. As a consequence, HR appears to be isomorphic to some shuffle algebra noted RhY i, . The structure of this algebra is well known and Hoffman showed that it is freely generated by Lyndon words on the alphabet Y . Let HR0 be the R-algebra generated by the functions Hs when s describes the set of all convergent compositions. We show that
Let N>0 be the set of positive integers. The harmonic sums Hs (N ) =
(3)
exists if and only if the composition s is empty or if s1 6= 1. In this case, we will say that s is a convergent composition. The values ζ(s) are called ”Multiple Zeta Values” (MZV). Harmonic sums and MZV arise in high-energy particule physics [3] and in analysis of algorithms [5]. We consider the R-vector space HR generated by the Hs , seen as functions from N to R. The theory of quasisymmetric functions shows that the {Hs (N )}s satisfy shuffle relations. In particular, the product of two harmonic functions is a sum of harmonic functions : for all a, b ∈ N>0 , we have
Categories and Subject Descriptors
1.
lim Hs (N )
N → ∞
HR ' HR0 [H1 ]
(2)
(5)
i.e. that any harmonic function can be decomposed P uniquely −1 in a univariate polynomial, on the sums H1 (N ) = N . n=1 n This decomposition is obtained thanks to a variant of Taylor expansion for univariate polynomials, by defining a derivation D in the shuffle algebra RhY i, . In fact, this decomposition provides an asymptotic expansion, up to order 0, of Hs (N ) as N → ∞. We can then deduce an asymptotic expansion, up to any order, by using the second form of the Euler-Mac Laurin summation formula. Our result of linear independance of the functions Hs lies on the C-linear independance of the polylogarithm functions
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ISSAC’05, July 24–27, 2005, Beijing, China. Copyright 2005 ACM 1-59593-095-7/05/0007 ...$5.00.
100
This means that every polynomial in RhY i, for shuffle product, can be decomposed uniquely in a linear combination of shuffle products of Lyndon words.
(z is a complex number so that |z| < 1) X
Lis (z) =
n1
n1 >···>nl
z s , s1 · · · nl l n >0 1
(6)
2.2
first proved in [9], then resumed in [13, 14]. In this article, harmonic functions are seen as Taylor coefficients X 1 Lis (z) = Hs (N )z N . 1−z N =0 ∞
(7)
l
Mu Mv = Mu
2.3
How to shuffle?
p=
(p|w) ∈ R.
(p|w)w,
v) + yj (u
v 0 ) + yi+j (u0
3.
=
Hw (N )
=
X
0
1 n1s1 . . . nsrr
for 0 ≤ N < r.
If |w| = 0, we put H (N ) = 1, for any N ≥ 0. Lemma 1. Let w = ys1 . . . ysr ∈ Y ∗ . The sum Hw (N ) is convergent, when N → ∞, if and only if s1 > 1. In this case, this limit is nothing but the polyzta (or MZV [15]) ζ(w), and thus, by an Abel’s theorem lim Hw (N ) = ζ(w) = lim Liw (z).
N →∞
z→1
(15)
Then, the word w is said convergent. A polynomial of RhY i is said convergent when it is a linear combination of convergent words.
(11)
Lemma 2. For w = ys w0 , we have
Hoffman generalized Radford theorem in the following way ([12, 10]). One has ) = (C0 [y1 ],
Hw (N )
N ≥n1 >...>nr >0
∗
) ' (R[Lyndon(Y )],
Harmonic sums
and
A nonempty word w is called a Lyndon word if it is strictly smaller (for lexicographic order) than any of its proper right factors, i.e. w < v for any factorization w = uv with u 6= and v 6= . Let Lyndon(Y ) denotes the set of Lyndon words over Y . Let
(RhY i,
(14)
Definition 1. Let w = ys1 . . . ysr ∈ Y ∗ . Then, for N ≥ r ≥ 1, the harmonic sum Hw (N ) is defined as
< w for w ∈ Y \ yi u < yj v if i > j or if i = j and u < v.
Theorem 1
z n1 · · · nsrr ,
HARMONIC SUMS ALGEBRA
3.1
(10)
C0 = R ⊕ (RhY i \ y1 RhY i).
ns 1 n1 >···>nr >0 1
In fact, this result can be improved by replacing C by any algebra of analytic functions defined over C − {0, 1}, for example C[z, 1/z, 1/(z − 1)], and the proof lies on an explicit evaluation of the monodromy group of Liw .
(9)
v 0 ),
X
Theorem 2 ([9]). The functions Liw , for w ∈ Y ∗ , are C-linearly independent.
with i, j ∈ N>0 and u0 , v 0 ∈ Y ∗ . This product is also extended to RhY i by linearity. Provided with this shuffle product, RhY i becomes an associative and commutative R-algebra noted RhY i, . We can totally order Y by putting yi < yj if i > j. So y1 is the biggest letter of the alphabet Y . Lexicographic order is then recursively defined on words w ∈ Y ∗ by
(13)
for r > 0 and by Li (z) = 1, for the empty word .
The concatenation product for words is extended to polynomials by linearity. The shuffle product of two words u = yi u0 and v = yj v 0 in Y ∗ is recursively defined by w = w, for all w ∈ Y ∗ and v = yi (u0
(u, v ∈ Y ∗ ).
Polylogarithms
Liw (z) =
w∈Y ∗
u
v,
Let us associate to any word w = ys1 · · · ysr the polylogarithm Liw (z) defined for |z| < 1 by
We consider the non commutative alphabet Y = {yn | n ∈ N>0 }. As usual, the set of all words over Y is denoted Y ∗ and the empty word is denoted . The length of the word w is denoted |w| and the word w resulting from the concatenation of two words u and v is the word w = uv. Let R be a commutative ring containing Q. A polynomial p ∈ RhY i is a linear combination of words, with coefficients in R. The coefficient of the word w in polynomial p is noted (p|w) and therefore X
(12)
which is a formal series on the letters tn ∈ T , with coefficients in N. For the empty word, we define M = 1. Thanks to Gessel works, it is known that the product of quasi monomial functions is a sum of quasi monomial functions. Moreover, we have the following identity between formal series
BACKGROUND
2.1
sl , tsn11 · · · tn l
n1 >···>nl >0
where ξ1 , . . . , ξl are roots of unity.
2.
X
Mw =
By using the combinatorics developed by M. Bigotte in [2], it is possible to consider a similar generalization for the harmonic sums related to coloured MZV X ξ1 n1 . . . ξl nl l , (8) ζ(ξs11 ,...,ξ ,...,sl ) = n1 s 1 . . . n l s l n >···>n >0 1
Quasi monomial functions
Let T = {tn | n ∈ N>0 } a countable commutative set. To each word w = ys1 . . . ysl ∈ Y ∗ , we associate the quasi monomial function
Hw (N ) = ).
N X Hw0 (k − 1) k=1
101
ks
Corollary 1. For w = ys w0 , we have ζ(w)
=
X Hw0 (k − 1)
ls
k≥1
Hw (N + 1) − Hw (N )
=
Proof. Proposition 2 gives P as an algebra morphism and by Theorem 2, P is the expected isomorphism. ,
(16)
(N + 1)−s Hw0 (N )
Thanks to the relations existing between Liw (1 − z) and Liw (z) [8], we can precise the asymptotic behaviour of Liw in the neighbourhood of 1. For example,
(17)
1 − Li4 (t) + log(t) Li3 (t) − log(t)2 Li2 (t) 2 1 2 + log(t)3 Li1 (t) + ζ(2)2 . 6 5 So, we find, by formula (14), the expansion of Li2,1,1 (1 − ε) and, by dividing it by ε, we find the one of P2,1,1 (1 − ε) :
([10]). For any words u and v, we have
Lemma 3
Hu
v (N )
Li2,1,1 (1 − t)
= Hu (N ) Hv (N ).
Proof. The harmonic sum Hw (N ) can be obtained by specialization of the quasi-mononial function Mw at ti = 1/i if 1 ≤ i ≤ N and ti = 0 if i > N . By (13), Hw (N ) satisfies the expected result.
3.2
1 1 1 2 ζ(2)2 + log3 ε − log2 ε + log ε 5 ε 6 2 ε ε ε − 1+ log3 ε − log2 ε + log ε + O (ε) 12 8 8 From this, we can also deduce the expansion of the Taylor coefficients of P(z) (see [4]). P2,1,1 (1 − ε)
Generating series
Definition 2 ([6]). For any word w ∈ Y ∗ , let Pw be the ordinary generating series of {Hw (N )}N ≥0 : Pw (z) =
X
Hw (N )z N , with P (z) =
N ≥0
1 . 1−z
3.3
Liw (z) = (1 − z)Pw (z). N
Proposition 3. The map H : u 7→ Hu is an isomorphism from (RhY i, ) to the algebra HR .
Proof. Since Pw (z) = N ≥0 Hw (N )z , it is known that the series expansion of (1 − z)Pw (z) is given by
Since Lyndon(Y ) generate freely the shuffle algebra then
[Hw (N ) − Hw (N − 1)]z N .
Corollary 3. Any harmonic sum in HR can be decomposed, uniquely, as a polynomial on the series Hl , for l ∈ Lyndon(Y ), i.e. HR ' R[Hl ; l ∈ Lyndon(Y )].
N ≥1 0
But, by (17), for w = ys1 w , Hw (N ) − Hw (N − 1) = N −s1 Hw0 (N − 1),
Lemma 4. Any l ∈ Lyndon(Y ) is convergent if and only if l 6= y1 . Any convergent polynomial can be decomposed uniquely as shuffle of convergent Lyndon words.
so (1 − z)Pw (z) = Hw (0) +
X Hw0 (N − 1) N ≥1
N s1
z N = Liw (z).
Proof. By definition, a Lyndon word l is strictly smaller (for lexicographic order) than any of its proper right factors. So, if l = y1 u, with u ∈ Y ∗ , we have y1 u < u which is impossible (remind that y1 is the greatest letter of Y ). Thus, the only Lyndon word beginning by y1 is y1 itself, and our first statement is proved. The second one is based on the remark : if y1 appears as a ) in the Radford decomposition of w, then this factor (for word begins by y1 . Since a convergent polynomial contains convergent terms, which do not begin by y1 , the statement is proved.
Definition 3. The Hadamard product is a bilinear function from C[[z]] × C[[z]] to C[[z]] defined, for all integers n and m,by (
n
z z Thus,
P∞ n=0
an z n
m
=
P∞ n=0
z n if n = m, 0 if n 6= m.
bn z n =
P∞ n=0
an bn z n .
Proposition 4. Every harmonic sum Hw ∈ HR can be decomposed in a unique way in a univariate polynomial in H1 , with coefficients in the convergent harmonic sums. This can also be expressed as follows :
Proposition 2. For u, v ∈ Y ∗ , one has Pu (z) Pv (z) = Pu
v (z).
Proof. By Lemma 3,
X
N ≥0
Hu (N )z N
X
Hv (N )z N
=
N ≥0
X
HR ' HR0 [H1 ], Hu (N )Hv (N )z N
where HR0 is defined as the R-algebra generated by the functions Hw , for all convergent words w ∈ C0 .
N ≥0
=
Algebra HR
From Corollary 2, we deduce then
P
X
=
Definition 4. The algebra HR of harmonic sums is defined as the R-vector space HR = spanR (Hw | w ∈ Y ∗ ), equipped with the ordinary product.
Proposition 1 ([6]). For any word w ∈ Y ∗ and for any complex number z satisfying |z| < 1, one has
(1 − z)Pw (z) = Hw (0) +
=
X
Hu
v (N )z
N
.
Example – The Radford decomposition gives, in Lyndon basis, y1 y4 y2 = y1 y4 y2 − y5 y2 − y4 y1 y2 − y4 y2 y1 − y4 y3 . Thus, H1,4,2 = H1 H4,2 − H5,2 − H4,1,2 − H4,2,1 − H4,3 . By Proposition 3, we deduce ker H = {0}. In other words,
N ≥0
Corollary 2 ([6]). Extented by linearity, the map ) to the P : u 7→ Pu is an isomorphism from (RhY i, Hadamard algebra of {Pw }w∈Y ∗ .
Proposition 5. The harmonic sums Hw , for w ∈ Y ∗ are R-linearly independent.
102
4.
ASYMPTOTIC EXPANSIONS
So we deduce the asympotic expansion up to order q + 2, which will appear to be very useful afterwards :
We are going to construct a recursive algorithm to find the asymptotic expansion of Hw . For that, considering any real sequence {sn }n∈N , we will define ASq sn as the asymptotic expansion up to order q of sn , i.e. so that
N X k=2
log(k − 1) kq
sn − ASq (sn ) = O(n−q ).
4.1
with K =
n
Bn
n≥0
z exp(−z) z z = = , n! exp(z) − 1 1 − exp(−z)
x exp(tx) = exp(x) − 1
Z
n
Bn (t)
n≥0
1 qB2 log(N ) log(N ) + − 2N q qN q 2 N q+1
P+∞ k=2
+
m X
N
f (x)dx + M
n=M
x . n!
N −1
1
log(x) dx = (x + 1)q +
1 = (2m + 1)!
B2m+1 (x − [x])f
(2m+1)
(x)dx.
M
q−1 X Bk 1
k Nk
+O
1 Nq
4.2
X
j=1
∞
X 2j − 1
ck (w) = + (q)2j−1
i=1 ∞
2j−2
k=0
where K =
k
P+∞
(2j − 2 − k)!(q)k
k=2 log(k − 1)k
∞ X
−q
X (−y1 )i Di i=0
1 + iN q+2j−1+i i!N 2j+q−1+i
(19)
(20)
i!
Dk (w).
Since Dk w = 0 as soon as k > |w|, this formula can be summered as follows :
X (−1)i (2j − k − 2)i i=0
y1 k . k!
|w|−k
B2j (−1)2j (2j)! N q+2j−1
ck (w)
Proposition 6. Let w ∈ Y ∗ , a word of length |w|. Then the polynomials ck (w) are given by
X 1 1 log N + − (q − 1)iN i 2N q 2iN q+i i=1
log N
|w| X
In particular, Dw = 0 when w is convergent and D(y1 w) = w for each word w ∈ Y ∗ . We can prove that D is a derivation . for the shuffle product In the following sequence, all products and powers will be . carried out with the shuffle product
X log N 1 − q−1+i (1 − q)N q−1 (1 − q)iN i=1
× −(q)2j−1
log(N − 1) , 2N q ! 2j−2 X 2j − 1 (2j − 2 − k)!(q)k k x2j−k−1 (x + 1)q+k k=0 log(x) . (q)2j−1 (x + 1)q+2j−1
(Dp|w) = (p|y1 w).
∞
+
q−2 1 X1 1 1 − j q − 1 j=1 j N j 2
We want to calculate the convergent polynomials ck (w) ∈ C0 . For that, let D : RhY i → RhY i be the linear application defined, for each p ∈ RhY i and for each word w ∈ Y ∗ by the duality
log(k − 1) kq
i=q−1 ∞
log(N − 1) log(N − 1) + (1 − q)N q−1 q−1 N 1 log 1−q 2
k=0
Lemma 6. One has, for any integer q ≥ 2,
−
1 N q+2
Taylor algorithm
w=
Proof. With the function f (x) = x−r , the summation (18) between M = 1 and N gives the expected results.
∞ X
+O
By Theorem 1, any w ∈ Y ∗ can be expressed as follows,
1 (r − 1)N r−1 ! q−1 X Bk−r+1 k − 1 1 1 , + O k − r + 1 r − 1 Nk Nq
with r ≥ 2.
=K+
1 N q+1
where (s)k = Γ(s + k)/Γ(s) for k ∈ N. We just need to insert the previous terms in the summation (18), expand log(N − 1), and make m → ∞.
k=r
k=2
=
ζ(r) −
= −
N X
f (2j−1) (x)
−
k=1
Hr (N )
=
N
log N + γ −
=
f (1) + f (N − 1) 2 (18)
([1]). One has
Lemma 5 H1 (N )
Z
+
f (M ) + f (N ) 2
B2j (2j−1) f (N ) − f (2j−1) (M ) + R2m (2j)!
j=1
where Rm
Z f (n) =
log(k − 1)k−q .
We need the second form of Euler-Maclaurin summation [11] given by, for all integers q, M , N with N > M , N X
B2 q 2 − 4q − 3 − 2 2q 2 − 2
Proof. Let q > 0 f (x) = log(x)(x + 1)−q . We use the Euler-Maclaurin summation (18) from M = 1 to N − 1, which leads us to calculate each term involved in this sum :
and {Bn (·)}n∈N the Bernoulli polynomials defined by X
+ +
Let {Bn }n∈N be the set of Bernoulli numbers obtained in the expansion of the following series X
K+
Euler Mac-Laurin formula
log(N ) 1 − (1 − q)N q−1 (q − 1)2 N q−1
=
ck (w) = e−y1 D Dk (w)
,
with the convention exp(−y1 D) = making D and y1 commute.
.
103
P
i≥0
(−y1 )i Di /i!, i.e. by
Proof. For a polynomial p ∈ R[X] of degree l, the Taylor expansion is finite, and given by p(x) = p(y) + Dp(y)(x − y) + · · · +
In a second time, if w 6∈ Lyndon(Y ), as before, we use Radford decomposition and the table of the asymptotic expansion for the Lyndon words. Example – Let l = y4 y2 ∈ Lyndon(Y ). By Lemma 2,
Dl p(y) (x − y)l . l!
ck (w) = Dk (w) − D Dk (w)y1 + · · · +
c0
=
i=0
=
Dl k D (w)(−y1 )l . l!
But H2 (i − 1) = ζ(2) − H4,2 (N )
c1
=
i=0
ck
=
(−y1 )i Di (y1 y4 y2 ) = y1 y4 y2 − y1 i! i
ζ(4, 2) − ζ(2)
+
1 2
y4 y 2
H4,2 (N )
(−y1 ) D (y4 y2 ) = y4 y2 i!
∞ X i=N +1
0 for k ≥ 2
Algorithm for asymptotic expansions
) + O(N
).
H1,4,2 (N )
+
(21)
• We assume all expansions for Lyndon words of length lower or equal to L are stored. We then consider a Lyndon word of length L + 1, w = ys u. We know from Lemma 2 that the expansion of Hw is linked to the one of Hu by
i=N +1
ζ(4, 2)
=
ζ(4, 1, 2)
=
ζ(5, 2)
=
ζ(4, 2, 1)
=
ζ(4, 3)
=
H1,4,2 (N )
= +
AEq−s+1 (Hu (i − 1)) . is
+ −
. If u ∈ Lyndon(Y ) then AEq−s+1 Hu (i − 1) is assumed to be stored.
+
. If u 6∈ Lyndon(Y ), with the Radford decomposition, we write u as finite sum of terms t = ··· lr , where c ∈ Q, li ∈ Lyndon, with c l1 AEq−s+1 (Ht (i − 1)) = c
1 3
ζ(2) +
1 ζ(2) + 3 N3 2 5
N5
+O
1 2
ζ(2) +
1 4
N4
1 . N6
log(N )ζ(4, 2) − ζ(4, 1, 2) + γζ(4, 2) ζ(5, 2) − ζ(4, 2, 1) − ζ(4, 3) 1 ζ(4, 2) 1 ζ(4, 2) 1 ζ(2) − + 2 N 12 N 2 9 N3 1 1 1 − 24 ζ(2) − 16 + 120 ζ(4, 2) 1 + O N4 N5
32 ζ(2)3 105 5 5 3 ζ(7) + ζ(2)ζ(5) − ζ(2)2 ζ(3) 8 2 2 4 −11ζ(7) + 5ζ(2)ζ(5) + ζ(2)2 ζ(3) 5 221 11 7 − ζ(7) + ζ(2)ζ(5) + ζ(2)2 ζ(3) 16 2 5 17ζ(7) − 10ζ(2)ζ(5), ζ(3)2 −
So, we deduce the reduced form of the previous expansion
So, there are two possibilities
r Y
∞
Thanks to the table giving the relations between MZV up to weight 161 [7],we have the following identities
• If w = ys , then AEq (Hw (N )) = ASq (Hw (N )), an expansion which is already known by Lemma 5, and so can be stored.
AEq (Hw (N )) = ζ(w) − ASq
= − +
Lemma 2 and Lemma 3 give us the following algorithm. We use the notation AEq (Hw (N )) for the asymptotic expansion of Hw (N ) up to order q. In a first time, we store the table of the asymptotic expansions, for w ∈ Lyndon(Y ). For this, we proceed by recurrence on the length of w.
X ∞
∞
Example – Let l = y1 y4 y2 ∈ / Lyndon(Y ). The Radford decomposition of l is given by l = y1 y4 y2 −y5 y2 −y4 y1 y2 − y4 y2 y1 − y4 y3 . Using our algorithm, we find :
We now are going to use both previous tools (Euler MacLaurin formula and Taylor algorithm) to get an asymptotic expansion of Hw up to order q, in the scale of functions {N −β logα N, α ∈ N, β ∈ N}. This means that we are looking for a polynomial p ∈ R[X, Y ] verifying Hw (N ) = p(log N, N
X 1 1 + 4 i i5 i=N +1
X 1 1 + O 7 i6 i i=N +1
ζ(4, 2) −
= −
−q
∞ X
1 , so i3
Expanding the sums in N , we finally find
i
−1
i=N +1
So we get y1 y4 y2 = c0 +c1 y1 = −y4 y1 y2 −y4 y2 y1 −y4 y3 − y5 y2 + y4 y2 y1 . Note that, in this case, Taylor algorithm gives directly the Radford decomposition.
4.3
1 1 1 − +O i 2 i2
=
−y4 y1 y2 − y4 y2 y1 − y4 y3 − y5 y2 2 X
H2 (i − 1) , i4
i=N +1
Example – Let w = y1 y4 y2 . Note that D(w) = y4 y2 and so that Dk (w) = 0, for k ≥ 2. By using Proposition 6, 3 X
∞ X
H4,2 (N ) = ζ(4, 2) −
So, taking x = 0, y = y1 , p = Dk (w), we find
+
32 32 ζ(2)3 ) − γζ(2)3 105 105 7 γζ(3)2 − 3ζ(2)ζ(5) − ζ(2)2 ζ(3) 10 32 115 1 ζ(3)2 − 105 ζ(2)3 ζ(7) + 16 2 N 32 1 ζ(3)2 − 105 ζ(2)3 1 ζ(2) + 12 N2 9 N3 1 1 1 4 − 24 ζ(2) − 16 + 120 ζ(3)2 − 1575 ζ(2)3 log(N )(ζ(3)2 −
O
1 N5
N4
1
This table is in agreement with the Zagier’s dimension conjecture [15] and is available at http://www.lifl.fr/~petitot/publis/MZV.
AEq−s+1 Hlp (i − 1) .
p=1
104
4.4 More examples − ln(N ) − 1 − γ N 1 ln(N ) + 12 γ + 14 2 N2 1 1 5 1 1 − γ− − ln(N ) + O 6 36 6 N3 N4
H2,1 (N ) =
ζ (3) +
+ +
H3,1 (N ) = + +
H2,1,1 (N )
ζ (2, 1) N 1 1 γ + ζ (2, 1) + 12 ln(N ) + 34 2 2 N2 19 1 1 1 − − ζ (2, 1) − γ − 1/3 ln(N ) 36 6 3 N3 ζ (2, 2, 1) −
H2,2,1 (N ) = + +
+
− 1 ln(N ) − 14 − 12 γ ζ (3, 1) + 2 N2 1 1 1 ln(N ) + 2 γ + 6 2 N3 ln(N ) 7 1 1 1 − ln (N ) + O − γ− 4 48 4 N4 N5
=
−1 −
ζ (2, 1, 1) +
1 2
− ln(N )γ − γ −
+
+
+ + + +
ln(N ) −
1 8
+
=
1 1 ln(N ) +O 12 N4
+
2
ln (N ) N5
− − − + − +
H5,1 (N ) H4,1 (N ) =
ζ (4, 1) + 1 2
− 31 γ −
1 3
ln (N ) − N3
+
γ+
=
1 2
+
1 2
H4,2 (N )
= +
H3,2 (N )
= +
1 2
ζ (2) + 1 ζ (2) + 2 N2 N3 − 14 ζ (2) − 38 1 +O N4 N5
ζ (3, 2) −
1 3
= − − + +
1 8 1 4 1 4 1 2
−
− 14 γ −
1 4
1 ln (N ) − 16 4 N ln(N ) + 12 γ + 1/10 ln(N ) +O N5 N6
1 + 12 ζ (2) 1 ζ (2) 4 + 3 N3 N4 −2/5 − 1/3 ζ (2) 1 +O N5 N6
ζ (4, 2) −
1 1 1 ζ (2) − ln(N )2 − ln (N ) 6 6 9
=
ζ (4, 1, 1) +
−
1 1 1 1 1 − ln(N )γ − γ − γ 2 27 3 9 6 N3
+
O
1 1 (ln (N ))2 − ln(N ) 4 4 1 1 1 − ln (N ) γ − γ + ζ (2) 2 4 4 1 1 1 2 γ + − ζ (2) + γ N2 4 6 1 1 1 γ 2 − + (ln(N ))2 + ln(N ) 9 4 6 2 ln (N ) 1 ln(N )γ + O N3 N4
ζ (3, 1, 1) +
1 3 1 γ − ζ (3) 6 3 1 1 ζ (2) γ + ζ (2) ln(N ) + ζ (2) 2 2 1 ln(N )3 − (ln (N ))2 − 1 − ln(N ) 2 1 2 (ln(N )) γ − ln(N )γ − γ − ln(N )γ 2 2 1 1 1 1 γ2 + − ζ (2) γ + γ3 + γ2 N 4 12 8 1 1 1 ζ (2) + ln(N )3 − ln (N ) − 12 8 16 1 1 2 ln(N ) γ + ln(N )γ − γ 4 8 1 1 ζ (2) ln(N ) + ln(N )γ 2 + ln(N )2 4 8 3 ln (N ) 1 ζ (3) +O N2 N3
H4,1,1 (N )
H3,1,1 (N )
ζ (5, 1) +
1 9
ln (N ) + 18 N4 3 1 1 1 1 − − γ − ln(N ) + O 20 3 3 N5 N6
+
1 2 1 6 1 2 1 2 1 8 1 4 1 4 1 6
−
γ 2 + 12 ζ (2) N ln2 (N ) − ln (N )
1 2
ln(N ) N5
ζ (2, 1, 1, 1) +
+
1 4
O
1 N4
H2,1,1,1 (N )
N ln2 (N ) + 21 ln(N )γ N2 − 14 ζ(2) + 14 γ + 14 γ 2 5 ln(N ) + − N2 36 29 5 1 1 − γ − ln(N )γ − ln2 (N ) 216 36 6 12 1 1 2 1 1 1 ζ(2) − γ γ− + 12 12 N3 12 96 1 4
1 61 1 γ+ + ln(N ) 24 288 24
−
H3,2,1 (N )
ln2 (N ) N4
1 ζ (2, 1) + 2 N2
1 4 γ+ 3 9
ζ (3, 2, 1) −
+
1 1 1 + ζ (2, 1) + ln(N ) 2 3 N3
−
1 3 1 ζ (2, 1) − ln(N ) +O 4 8 N4
105
=
−
17 3 − γ 32 8 ln(N ) N5
H3,1,2 (N ) =
ζ (3, 1, 2) +
−
1 1 ζ (2) γ + ζ (3) 2 2
1 1 1 ζ (2) ln (N ) − ζ (2) + ζ (2, 1) 2 4 2
−
+
1 +O N3
ln(N ) N4
ζ (3, 1, 1, 1) +
− − − −
+
−
H4,3 (N ) =
=
H4,2,1 (N )
= − + − + + − + + −
+ +
1 28
ζ (2, 1, 1, 1, 1) +
−1−γ−
1 ζ (2, 1) + 3 N3
=
ζ (4, 2, 1) −
+
1 1 5 + ln(N ) + 16 4 N4
−
2/5 ln(N ) −
1 ζ (2, 1) 3
H4,1,2 (N )
1 2 γ 2
1 4 1 1 4 ln (N ) − ln2 (N ) − γ 24 2 24 1 1 1 ζ (4) − ζ (3) − ln2 (N )γ 8 3 2 1 1 1 ln(N )γ 2 − ζ (3) ln (N ) − ζ (3) γ 2 3 3 1 1 1 1 ζ (2) + ζ (2) γ − γ 3 − ln3 (N ) 2 2 6 6 1 1 2 3 ζ (2) ln (N ) − ln (N ) γ 4 6 1 1 2 2 ln(N ) γ − ln(N )3 γ 4 6 1 1 ζ (2) ln(N )γ + ζ (2) γ 2 2 4 1 1 ζ (2) ln (N ) − ln(N ) − ζ (2, 2) 2 4 4 ln (N ) 1 ln(N )γ +O N N2
−
1 1 γ + ζ (2, 1) 4 2
− 2/5γ − 1 +O N5
ζ (4, 1, 2) +
+
1 1 1 ζ (3) − ζ (2) γ + ζ (2, 1) 3 3 3
+
O
ln(N ) N4
53 100 ln(N ) N6
1 1 ζ (2) ln(N ) − ζ (2) 3 9
=
1 N3
H4,1,1,1 (N ) = + − − −
5.
1 3 1 ζ (3) − γ 9 18 1 1 1 1 ζ (2) γ − ln2 (N )γ − ln(N )γ − γ 6 6 9 27 1 1 1 3 ln N − (ln (N ))2 − ln (N ) 18 18 27 1 1 1 1 + ζ (2) ln(N ) + ζ (2) − ln(N )γ 2 81 6 18 6 3 ln (N ) 1 2 1 γ +O 18 N3 N3
ζ (4, 1, 1, 1) +
−
ACKNOWLEDGMENTS
Thanks to Boutet de Monvel, Cartier, Jacob and Waldschmidt for useful discussions.
6. =
N6
+
ζ (2, 1, 1) 1 + ζ (2, 1, 1) N 2 1 2 3 3 7 1 γ + γ + ln(N ) + + (ln (N ))2 4 4 4 8 4 2 ln (N ) 1 1 1 ln(N )γ − ζ (2) + O 2 4 N2 N3
+
H6,1 (N )
1 6
− 13 ζ (3) + 1/10 1 ζ (3) 1 ζ (3) + + 3 N3 2 N4 N5 + 16 ζ (3) 1 +O N7 N8
ζ (2, 2, 1, 1) −
+
H2,1,1,1,1 (N )
ζ (4, 3) −
−
−
ζ (5, 1, 1) +
−
H2,2,1,1 (N )
1 1 ln(N )γ − γ 4 16 1 2 1 1 1 ln (N ) − ln(N ) − − γ2 8 16 64 8 2 ln (N ) 1 1 ζ (2) +O 8 N4 N5
H5,1,1 (N ) =
1 ln(N )2 γ 4 1 1 1 1 ln(N )γ − γ − ln(N )γ 2 − γ 2 4 8 4 8 1 1 1 (ln(N ))3 − ln(N )2 − ln (N ) 12 8 8 1 1 1 1 + ζ (2) ln (N ) + ζ (2) + ζ (2) γ 16 4 8 4 3 ln (N ) 1 1 3 1 γ − ζ (3) + O 12 6 N2 N3
=
5 − 12
+
H3,1,1,1 (N )
1 ζ (2) + 5 1 ζ (2) 2 + 4 N4 N5 5 23 − 12 ζ (2) 1 84 + + O N6 N7 N8
ζ (5, 2) −
=
1 N2
1 1 1 1 ζ (2) + ζ (2) γ − ζ (2, 1) − 6 2 2 3
1 1 ζ (3) + ζ (2) ln (N ) 2 2
−
1
H5,2 (N )
REFERENCES
[1] F. Bergeron, G. Labelle, P. Leroux.– Combinatorial Species and Tree-like Structures, Encyclopedia of Mathematics and its Applications, Vol. 67, Cambridge University Press (1998). [2] M. Bigotte.– Etude symbolique et algorithmique des polylogarithmes et des nombres Euler-Zagier colors, Ph.D., Lille, (2000).
1 − 51 ln(N ) − 25 − 1/5γ N5 1 1 ln(N ) + 12 + 12 γ 2 N6 ln(N ) 1 13 1 1 − γ− − ln (N ) + O 2 84 2 N7 N8
ζ (6, 1) +
106
[3] J. Bl¨ umlein.– Mathematical Structure of Anomalous Dimensions and QCD Wilson Coefficients in Higher Order, Nuclear Physics B (Proc Suppl.), 135, pp 225-231, (2004). [4] C. Costermans, J.Y. Enjalbert, Hoang Ngoc Minh.– Algorithmic and combinatoric aspects of multiple harmonic sums, in the proceedings of AofA, Barcelone, 6-10 June, (2005). [5] P. Flajolet, G. Labelle, L. Laforest, B. Salvy.– Hypergeometrics and the Cost Structure of Quadtrees, Random Structures and Algorithms, Vol. 7, No.2, pp 117-144, (1995). [6] Hoang Ngoc Minh.– Finite polyzetas, Poly-Bernoulli numbers, identities of polyzetas and noncommutative rational power series, proc. of 4th Int. Conf. on Words, pp. 232-250, September, 10-13 Turku, Finland, (2003). [7] Hoang Ngoc Minh & M. Petitot.– Lyndon words, polylogarithmic functions and the Riemann ζ function, Discrete Math., 217, pp. 273-292, (2000). [8] Hoang Ngoc Minh, M. Petitot & J. van der Hoeven.– L’algbre des polylogarithmes par les sries gnratrices, SFCA’99, Barcelone, (1999).
[9] Hoang Ngoc Minh, M. Petitot & J. van der Hoeven.– Shuffle algebra and polylogarithms, Discrete Mathematics, 225, pp 217-230, (2000). [10] M. Hoffman.– The algebra of multiple harmonic series, Jour. of Alg., August (1997). [11] A. Ivi´c.– The Riemann zeta function, J. Wiley, New York, (1985). [12] C. Reutenauer.– Free Lie Algebras, Lon. Math. Soc. Mono., New Series-7, Oxford Sc. Pub., (1993). [13] V.N. Sorokin.– On the linear independence of the values of generalized polylogarithms, Math. Sb., 192:8, pp. 139–154, (2001); English transl, Sb. Math. 192, pp 1225–1239, (2001). [14] E.A. Ulanskii.– Identities for generalized polylogarithms, Mat. Zametki, 73, pp. 613–624; English transl, Math. Notes, 73, pp 571–581, (2003). [15] D. Zagier.– Values of zeta functions and their applications, First European congress of Mathematics, Vol.2, Birkhuser, Basel, 1994, pp. 497-512.
107
Lifting Techniques for Triangular Decompositions Xavier Dahan
´ LIX, Ecole polytechnique 91128 Palaiseau, France
Marc Moreno Maza
ORCCA, University of Western Ontario (UWO) London, Ontario, Canada
[email protected] ´ Eric Schost
´ LIX, Ecole polytechnique 91128 Palaiseau, France
[email protected] [email protected] Yuzhen Xie Wenyuan Wu ORCCA, UWO
ORCCA, UWO
[email protected] [email protected] ABSTRACT
Let us introduce the notation used below. If k is a perfect field (e.g., Q or a finite field), a triangular set is a family T1 (X1 ), T2 (X1 , X2 ), . . . , Tn (X1 , . . . , Xn ) in k[X1 , . . . , Xn ] which forms a reduced Gr¨ obner basis for the lexicographic order Xn > · · · > X1 and generates a radical ideal (so Ti is monic in Xi ). The notation T 1 , . . . , T s denotes a family of s triangular sets, with T i = T1i , . . . , Tni . Then, any 0dimensional variety V can be represented by such a family, such that I(V ) = ∩i≤s T i holds, and where T i and T i are coprime ideals for i = i ; we call it a triangular decomposition of V . This decomposition is not unique: the different possibilities are obtained by suitably recombining the triangular sets describing the irreducible components of V . In this paper, we consider 0-dimensional varieties defined over Q . Let thus F = F1 , . . . , Fn be a polynomial system in Z[X1, . . . , Xn ]. Since we have in mind to apply Hensel lifting techniques, we will only consider the simple roots of F , that is, those where the Jacobian determinant J of F does not vanish. We write Z(F ) for this set of points; by the Jacobian criterion [10, Ch. 16], Z(F ) is finite, even though the whole zero-set of F , written V (F ), may have higher dimension. Let us assume that we have at hand an oracle that, for any prime p, outputs a triangular decomposition of Z(F mod p). Then for a prime p, a rough sketch of an Hensel lifting algorithm could be: (1) Compute a triangular decomposition t1 , . . . , ts of Z(F mod p), and (2) Lift these triangular sets over Q. However, without more precautions, this algorithm may fail to produce a correct answer. Indeed, extra factorizations or recombinations can occur modulo p. Thus, we have no guarantee that there exist triangular sets T 1 , . . . , T s defined over Q , that describe Z(F ), and with t1 , . . . , ts as modular images. Furthermore, if we assume no control over the modular resolution process, there is little hope of obtaining a quantification of primes p of “bad” reduction. Consider for instance the variety V ⊂ C 2 defined by the polynomials 326X1 −10X26 +51X25 +17X24 +306X22 +102X2 + 34 and X27 +6X24 +2X23 +12. For the order X2 > X1 , the only possible description of V by triangular sets with rational coefficients corresponds to its irreducible decomposition, that is, T 1 : ( X1 −1, X23 +6 ) and T 2 : ( X12 +2, X22 +X1 ). Now, the following triangular sets describe the zeros of (F mod 7), which are not the reduction modulo 7 of T 1 and T 2 ;
We present lifting techniques for triangular decompositions of zero-dimensional varieties, that extend the range of the previous methods. We discuss complexity aspects, and report on a preliminary implementation. Our theoretical results are comforted by these experiments. Categories and Subject Descriptors: I.I.2 [Computing Methodologies]: Symbolic and Algebraic Manipulation – Algebraic Algorithms General Terms: Algorithms, experimentation, theory. Keywords: Polynomial systems, triangular sets, Hensel lifting.
1.
INTRODUCTION
Modular methods for computing polynomial GCDs and solving linear algebra problems have been well-developed for several decades, see [12] and the references therein. Without these methods, the range of problems accessible to symbolic computations would be dramatically limited. Such methods, in particular Hensel lifting, also apply to solving polynomial systems. Standard applications are the resolution of systems over Q after specialization at a prime, and over the rational function field k(Y1 , . . . , Ym ) after specialization at a point (y1 , . . . , ym ). These methods have already been put to use for Gr¨ obner bases [26, 1] and primitive element representations, starting from [13], and refined notably in [14]. Triangular decompositions are well-suited to many practical problems: see some examples in [3, 11, 24]. In addition, these techniques are commonly used in differential algebra [4, 15]. Triangular decompositions of polynomial systems can be obtained by various algorithms [16, 18, 21] but none of them uses modular computations, restricting their practical efficiency. Our goal in this paper is to discuss such techniques, extending the preliminary results of [24].
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ISSAC’05, July 24–27, 2005, Beijing, China. Copyright 2005 ACM 1-59593-095-7/05/0007 ...$5.00.
1 t
108
X22 + 6X2 X12 + 2X2 + X1 X13 + 6X12 + 5X1 + 2
and
2 t
X2 + 6 , X1 + 6
A lifting algorithm should discard t1 and t2 , and replace 1 2 them by the better choice t : ( X1 + 6, X23 + 6 ) and t : 2 2 1 ( X1 + 2, X2 + X1 ), which are the reduction of T and T 2 modulo 7. In [24], this difficulty was bypassed by restricting to equiprojectable varieties, i.e. varieties defined by a single triangular set, where no such ambiguity occurs. However, as this example shows, this assumption discards simple cases. Our main concern is to lift this limitation, thus extending these techniques to handle triangular decompositions. Our answer consists in using a canonical decomposition of a 0-dimensional variety V , its equiprojectable decomposition, described as follows. Consider the map π : V ⊂ A n (k) → A n−1 (k) that forgets the last coordinate. To x in V , we associate N (x) = #π −1 (π(x)), that is, the number of points lying in the same π-fiber as x. Then, we split V into the disjoint union V1 ∪ · · · ∪ Vd , where for all i = 1, . . . , d, Vi equals N −1 (i), i.e., the set of points x ∈ V where N (x) = i. This splitting process is applied recursively to all V1 , . . . , Vd , taking into account the fibers of the successive projections A n (k) → A i (k), for i = n − 1, . . . , 1. In the end, we obtain a family of pairwise disjoint, equiprojectable varieties, whose reunion equals V , which form the equiprojectable decomposition of V . As requested, each of them is representable by a triangular set with coefficients in the definition field of V . Looking back at the example, both Z(F ) and Z(F mod 7) are described on the leftmost picture below (forgetting the actual coordinates of the points). Representing Z(F ) by 1 2 T 1 and T 2 , as well as Z(F mod 7) by t and t amounts to grouping the points as on the central picture; this is the equiprojectable decomposition. The rightmost picture shows the description of Z(F mod 7) by t1 and t2 . 11 00 11 00 1 0 0 1 0 1 1 0 0 1 0 1
11 00 11 00 1 0 0 1 0 1
1 0 0 1 0 1
00 11 00 11 00 11 11 00 0 11 1 00 1 0 0 1
1 0 0 1 0 1
1 0 0 1 0 1
00 11 00 11 00 11 11 00 0 11 1 00 1 0 0 1
height of a 0-dimensional variety V defined over Q: the former denotes its number of points, and the later estimates its arithmetic complexity; see [17] and references therein for its definition. Let then T 1 , . . . , T s be the triangular sets that describe the equiprojectable decomposition of Z = Z(F ). In [9], it is proved that all coefficients in T 1 , . . . , T s have height in O(nO(1) (deg Z + ht Z)2 ). However, better estimates are available, through the introduction of an alternative representation denoted by N 1 , . . . , N s . For i ≤ s, N i = N1i , . . . , Nni is obtained as follows. Let D1i = 1 and N1i = T1i . For 2 ≤ ≤ n and 1 ≤ i ≤ s, define Di =
1 0 0 1 0 1
i Ni = Di Ti mod (T1i , . . . , T−1 ).
• For any C ∈ N, let Γ(C) be the sets of primes in [C + 1, . . . , 2C]. We assume the existence of an oracle O1 which, for any C ∈ N, outputs a random prime in Γ(C), with the uniform distribution. • We assume the existence of an oracle O2 , which, given a system F and a prime p, outputs the representation of the equiprojectable decomposition of Z(F mod p) by means of triangular sets. We give in Section 2 an algorithm to convert any triangular decomposition of Z(F mod p) to the equiprojectable one; its complexity analysis is subject of current research.
1 0 0 1 0 1
00 11 00 11 00 11 11 00 0 11 1 00 1 0 0 1
a a h
h
• For F as in Theorem 1, we write F = (n, d, h), F = ndn (h+11 log(n+3)) and F = 5( F +1) log(2 F +1). The input system is given by a straight-line program of size L, with constants of height at most hL .
b
h
• C ∈ N is such that for any ring R, any d ≥ 1 and monic t ∈ R[X] of degree d, all operations (+, −, ×) in R[X]/t can be computed in Cd log d log log d operations in R [12, Ch. 8,9]. Then all operations (+, −, ×) modulo a triangular set T in n variables can be done in quasi-linear complexity in Cn and deg V (T ).
Theorem 1. Let F1 , . . . , Fn have degree ≤ d and height ≤ h. Let T 1 , . . . , T s be the triangular description of the equiprojectable decomposition of Z(F ). There exists A ∈ N − {0}, with ht A ≤ (n, d, h), and, for n ≥ 2,
a(n, d, h) = 2n d
and
It is proved in [9] that all coefficients in N 1 , . . . , N s have height in O(nO(1) (deg Z +ht Z)). Since T 1 , . . . , T s are easily recovered from N 1 , . . . , N s , our algorithm will compute the latter, their height bounds being the better. Theorem 2 below states our main result regarding lifting techniques for triangular decompositions; in what follows, we say that an algorithm has a quasi-linear complexity in terms of some parameters if its complexity is linear in all of these parameters, up to polylogarithmic factors. We need the following assumptions:
The above algorithm sketch is thus improved by applying lifting only after computing the equiprojectable decomposition of the modular output. Theorem 1 shows how to control the primes of bad reductions for the equiprojectable decomposition, thus overcoming the limitation that we pointed out previously. In what follows, the height of x ∈ Z is defined as ht x = log |x|; the height of f ∈ Z[X1, . . . , Xn ] is the maximum of the heights of its coefficients; that of p/q ∈ Q, with gcd(p, q) = 1, is max(ht p, ht q).
a
2 2n+1
∂Tji ∂Xj
1≤j≤−1
11 00 11 00 1 0 0 1 0 1
Y
Theorem 2. Let ε > 0. There exists an algorithm which, given F , satisfying
(3h + 7 log(n + 1) + 5n log d + 10),
a
4
and with the following property. If a prime p does not divide A, then p cancels none of the denominators of the coefficients of T 1 , . . . , T s , and these triangular sets reduced mod p define the equiprojectable decomposition of Z(F mod p).
F
b
+2 ε
F
+1
, the equality degXi Ti = degXi Ti shows that S is monic in Xi , as requested. Assume that consists of s > 2 triangular sets T 1 , . . . , T s . First, we apply the case s = 2 to T 1 , T 2 , obtaining a triangular set T 1,2 . Observe that every pair T 1,2 , T j , for 3 ≤ j ≤ s, is solvable but not certified solvable: we obtain the requested B´ezout coefficient by updating the known ones. Let us fix 3 ≤ j ≤ s. Given A1 , A2 , B1 , Bj , C2 , Cj ∈ K [X ] such that A1 T1 + A2 T2 = B1 T1 + Bj Tj = C2 T2 + Cj Tj = 1 hold in K [X ], we let α = B1 C2 mod Tj and β = A1 Cj T1 + A2 Bj T2 mod T1 T2 . Then, αT1,2 + βTj = 1 in K [X ], as requested. Proceeding by induction ends the proof. Splitting critical pairs. Let now V be a 0-dimensional variety over k. Proposition 3 below encapsulates the first
Equiprojectable decomposition. Let first W be a 0dimensional variety in A i (k), for some 1 ≤ i ≤ n. For x in A i−1 (k), we define the preimage i µ(x, W ) = (πi−1 )−1 (x) ∩ W ;
for any d ≥ 1, we can then define n
T
T
o
i (x), W ) = d . A(d, W ) = x ∈ W | #µ(πi−1
Thus, x is in A(d, W ) if W contains exactly d points x such i i (x) = πi−1 (x ) holds. Only finitely many of the that πi−1 A(d, W ) are not empty and the non-empty ones form a partition of W . Let 1 ≤ i ≤ n. Writing W = πin (V ), we define B(i, d, V ) = {x ∈ V | πin (x) ∈ A(d, W )} . Thus, B(i, d, V ) is the preimage of A(d, W ) in V , so these sets form a partition of V . If V is i-equiprojectable, then all B(i, d, V ) are (i − 1)-equiprojectable. We then define inductively B(V ) = V , and, for 1 < i ≤ n, B(di , . . . , dn , V ) =
110
same set of points, and all of which satisfy Pκ−1 . First, we partition κ using the equivalence relation T ≡ T if and . Assumption Pκ shows only if T1 , . . . , Tκ−1 = T1 , . . . , Tκ−1 that each equivalence class is certified and solvable of level κ. We then let (κ) be the family of triangular sets obtained by applying Proposition 1 to each equivalence class.
part of the Split-and-Merge algorithm: given any triangular decomposition of V , it outputs another one, without critical pairs. We first describe the basic splitting step.
T
T
T
S
be a triangular decomposition of Proposition 2. Let V which contains critical pairs. Then one can compute a triangular decomposition Split( ) of V which has cardinality larger than that of .
T
T
S
Lemma 1. Let S = S in (κ) . The pair {S, S } is noncritical, certified, of level < κ.
T
of level and let Proof. Let T, T be a critical pair of G be a GCD of T , T in K [X ]. First, assume that G is monic, in the sense of [22]; let Q and Q be the quotients of T and T by G in K [X ]. We define the sets A B A B
= = = =
T
T
Proof. Let T, T ∈ , which respectively divide S and S . Due to assumption Pκ , there exists 0 ≤ ≤ κ such that T1 , . . . , T−1 = T1 , . . . , T−1 and (T1 , . . . , T ) and (T1 , . . . , T ) have no common zero. Then, < κ, since T ≡ T . Thus, T1 , . . . , T = S1 , . . . , S and T1 , . . . , T = S1 , . . . , S . Since {T, T } is certified of level < κ, {S, S } is also.
T1 , . . . , T−1 , G, T+1 , . . . , Tn , T1 , . . . , T−1 , Q, T+1 , . . . , Tn , T1 , . . . , T−1 , G, T+1 , . . . , Tn , T1 , . . . , T−1 , Q , T+1 , . . . , Tn .
S
We partition some more, into the classes of the equivalence relation S ≡ S if and only if degXκ Sκ = degXκ Sκ . (κ) (κ) Let 1 , . . . , δ be the equivalence classes, indexed by the common degree in Xκ ; we define Mergeκ ( κ ) as the data of all these equivalence classes.
S
We let Split( ) = {A, B, A , B }, excluding the triangular sets defining the empty set. Since the pair T, T is critical, V (A) and V (B) are non-empty. Since T and T are not associate in K [X ], at least Q or Q is not constant. Thus, Split( ) has cardinality at least 3. Since T and T are radical, if Q ∈ K , G and Q are coprime in K [X ], so V (T ) is the disjoint union of V (A) and V (B). The same property holds for A and B . Thus, the proposition is proved. Assume now that T , T have no monic GCD in K [X ]. Then, there exist triangular sets C 1 , . . . , C s , D1 . . . Ds such that V (T ) is the disjoint union of V (C 1 ), . . . , V (C s ), V (T ) is the disjoint union of V (D1 ), . . . , V (Ds ), at least one pair C i , Dj is critical and Ci , Dj admits a monic GCD in K [X ]. These triangular sets are obtained by the algorithms of [22] when computing a GCD of T , T in K [X ]. Then the results of the monic case prove the existence of Split( ).
T
T
T
T
T
T
T
T
Proposition 4. V (
T
T
T
(κ) d )
= B(κ, d, V (
T S
T S
T
κ ))
for all d.
S
(κ)
T
S
S
S
S
The main merging algorithm. We can now give the main algorithm. We start from a triangular decomposition of V without critical pairs, and where every pair is certified, so it satisfies Pn . Let us initially define n = { }; note that n is a set of families of triangular sets. Then, for 1 ≤ κ ≤ n, assuming κ is defined, we write κ−1 = ∪U(κ) ∈Tκ Mergeκ ( (κ) ). Lemma 2 shows that this process is well-defined; note that each κ is a set of families of triangular sets as well. Let be a family of triangular sets in 0 . Then satisfies P0 , so by the remarks make previously, consists in a single triangular set. Proposition 4 then shows that the triangular sets in 0 form the equiprojectable components of V .
T
T
T
S
Proof. We know that V ( κ ) is the union of the V ( d ). (κ) Besides, both families {V ( d )} and {B(κ, d, V ( ))} form a partition of V ( κ ). Thus, it suffices to prove that for x (κ) in V ( κ ), x ∈ V ( d ) implies that πκn (x) ∈ A(d, W ), with n W = πκ (V ( κ )). First, for S in (κ) , write WS = πκn (S). Then Lemma 1 shows that the WS form a partition of W , κ and that their images πκ−1 (WS ) are pairwise disjoint. (κ) Let now x ∈ V ( d ) and y = πκn (x). There exists a unique S ∈ (κ) such that x ∈ V (S). The definition of (κ) shows that there are exactly d points y in WS such d κ κ that πκ−1 (y) = πκ−1 (y ). On the other hand, for any y ∈ κ (y) = WS , with S = S, the above remark shows that πκ−1 κ (y ). Thus, there are exactly d points y in W such that πκ−1 κ κ πκ−1 (y) = πκ−1 (y ); this concludes the proof.
T
T
satisfies Pκ−1 .
Thus, we can now suppose that we have a triangular decomposition of V without critical pairs, and where every pair is certified, such as the one computed in Proposition 3. We describe the second part of the Split-and-Merge algorithm: merging solvable families in a suitable order, to obtain the equiprojectable decomposition of V . For 0 ≤ κ ≤ n, we say that satisfies property Pκ if for all T, T ∈ the pair {T, T } is certified, has level ≤ κ and for all κ < i ≤ n satisfies degXi Ti = degXi Ti . Observe that if P0 ( ) holds, then contains only one triangular set, and that the input family satisfies Pn .
T
(κ) d
(κ)
T
S
S
Proof. Write 0 = , and define a sequence i by i+1 = Split( i ), if i contains critical pairs, and i+1 = i otherwise. Testing the presence of critical pairs is done by GCD computations, which yields the B´ezout coefficients in case of coprimality. Let D be the number of irreducible components of V . Any family i has cardinality at most D, so the sequence i becomes stationary after at most D steps.
T
T
Proof. Let S = S in d , and let T, T be as in the proof of Lemma 1; we now prove the degree estimate. For κ < i ≤ n, we have degXi Ti = degXi Si and degXi Ti = degXi Si ; assumption Pκ shows that degXi Si = degXi Si for κ < i ≤ n. Since degXκ Sκ = degXκ Sκ = d, the lemma is proved.
T
T
S
Lemma 2. Each family
Proposition 3. Let be a triangular decomposition of V . One can compute a triangular decomposition of V with no critical pairs, and where each pair of triangular sets is certified.
T
(κ)
T
T
U
T
U
The basic merging algorithm. Let 1 ≤ κ ≤ n. We now define the procedure Mergeκ , which takes as input a family κ of triangular sets which satisfies Pκ , and outputs several families of triangular sets, whose reunion defines the
T
T
111
T
T
T
T
T U
U
3.
Proof. We prove on = n + 1, . . . , 2 that for all d , . . . , dn , B(d , . . . , dn , Z) equals B(d , . . . , dn , Z); taking = 2 gives the lemma. Since B(X) = X for any variety X, this property holds for = n + 1. Assuming it for B(d+1 , . . . , dn , Z), we prove it for B(d , . . . , dn , Z). Let B = B(d+1 , . . . , dn , Z), n (B); Lemma 4 implies that reB = πn (B) and B−1 = π−1 duction modulo p is one-to-one on both B and B−1 . For y in B−1 and z in B−1 , we define
PROOF OF THEOREM 1
In this section, we consider the simple solutions Z(F ) of a system F = F1 , . . . , Fn in Z[X1, . . . , Xn ], that is, those where the Jacobian determinant J of F does not vanish. We prove that for all primes p but a finite number, the equiprojectable decomposition of Z(F ) reduces modulo p to that of Z(F mod p). These results require to control the cardinality of the “specialization” of a variety at p. Such questions are easy to formulate using primitive elements and associated representations, which we now define as a preamble. Primitive element descriptions. Let W ⊂ C be a 0dimensional variety defined over Q. Let ∆ be a linear form in Z[X1, . . . , X ]. Its minimal polynomial is the minimal polynomial µ ∈ Q[T ] of the multiplication endomorphism by ∆ in Q[W ]; it is the squarefree part of Πx∈W (T − ∆(x)). Then ∆ is a primitive element for W if the map x → ∆(x) is one-to-one on W . In this case, µ has degree deg W and Q[W ] is isomorphic to the residue class ring Q[T ]/µ. Writing wi ∈ Q[T ] for the image of Xi , we deduce that µ(T ) = 0 and Xi = wi (T ), 1 ≤ i ≤ , form a parametrization of the points in W . We will use quantitative estimates on the size of the coefficients in this representation, in terms of the degree and height of W . The following result is [5, Th. 2]; using the coefficient χ leads to sharp height bound, as is the case for the polynomials N i defined in the introduction.
µ(y) = (π−1 )−1 (y) ∩ B
and
µ(z) = (π−1 )−1 (z) ∩ B .
We first prove that µ(y) and µ(y) have the same cardinality for all y in B−1 . To this effect, observe the equalities X
X
#µ(y) = #B , y∈B−1
#µ(z) = #B . z∈B−1
Let now y in B−1 . Since µ(y) ⊂ µ(y), injectivity of the reduction mod p on B implies that #µ(y) ≤ #µ(y). Thus, X
#B =
#µ(y) ≤
y∈B−1
X
#µ(y). y∈B−1
Injectivity of the reduction mod p on B−1 implies that X
X
#µ(y) = y∈B−1
#µ(z) = #B . z∈B−1
This sum equals #B . Thus, all inequalities are equalities, giving our claim. (x)); define similarly ν(z) For x in B , write ν(x) = µ(π−1 for z in B . By the previous point, ν(x) and ν(x) have the same cardinality. Recalling from Section 2 that for d ∈ N, we have defined A(d, B ) as the set {x ∈ B | #ν(x) = d}, and A(d, B ) as the set {z ∈ B | #ν(z) = d}, one can see A(d, B ) = A(d, B ). To conclude, recall that by definition {x ∈ Z | πn (x) ∈ A(d, πn (B(d+1 , . . . , dn , Z)))} = B(d, d+1 , . . . , dn , Z). By the induction assumption, this equals {x ∈ Z | πn (x) ∈ A(d, B )}, and we have proved that this equals {x ∈ Z | πn (x) ∈ A(d, B )}. By definition, this is B(d, d , . . . , dn , Z), which is what we wanted.
Lemma 3. Let h∆ be an upper bound of the height of ∆, and H∆ = ht W + (deg W )h∆ + (deg W ) log( + 2) + ( + 1) log deg W . There exist χ, v1 , . . . , v in Z[T ], such that χ, χ , v1 , . . . , v have height at most H∆ , µn equals χ divided by its leading coefficient, and wi = vi /χ mod χ for all i. Geometric considerations. Let now Z = Z(F ). For 1 ≤ i ≤ n, let ∆i be a linear form in Z[X1, . . . , Xi ] which is a primitive element for πin (Z), let µi ∈ Q[T ] be its minimal polynomial, and let w1 , . . . , wn ∈ Q[T ] be the parametrization of Z associated to ∆n . Let finally p a prime. We first introduce assumptions on p (denoted by H1 , H2 , H3 ), that yield the conclusion of Theorem 1 in a series of lemmas; we then give quantitative estimates for these assumptions. H1 . The prime p divides no coefficients in µn , w1 , . . . , wn and µn remains squarefree modulo p.
Lemma 6. Let T 1 , . . . , T s be the triangular sets that describe the equiprojectable decomposition of Z. Then p cancels no denominator in the coefficients of T 1 , . . . , T s , and the reduction of these triangular sets modulo p defines the equiprojectable decomposition of Z.
Let Fq be a finite extension of Fp such that (µn mod p) splits in Fq , let Qq be the corresponding unramified extension of Qp [20] and Zq its ring of integers; then, µn splits in Qq , and has all its roots in Zq; thus, Z lies in Zn q . Note that p divides no coefficient in µ1 , . . . , µn : the roots of µi are the values of ∆i on πin (Z), so they are in Zq, hence the coefficients of µi are in Zq ∩ Q = Zp. The map Zq → Fq of reduction modulo p extends to maps a ∈ Ziq → a ∈ Fiq for all i. Given A ⊂ Ziq, A is the set {a | a ∈ A}. The same notation is used for the reduction of polynomials modulo p. H2 . All polynomials µi are squarefree.
Proof. For i ≤ s, let Zi = Z(T i ). By Lemma 5, Z1 , . . . , Zs are the equiprojectable components of Z. For i ≤ s, Zi is described by a triangular set ti with coefficients in Fp . The coefficients of T i are rational functions of the points in Zi , given by interpolation formulas [9, §3]. With these formulas, Lemma 4 shows that all denominators are nonzero modulo p. The coefficients of ti are obtained using the same formulas, using the coordinates of the points in Zi . Thus, ti = T i mod p.
H3 . The Jacobian determinant of F vanishes nowhere on Z.
Lemma 4. For i ≤ n, #πin (Z) equals #πin (Z).
Lemma 7. The set Z equals Z(F ).
Proof. The inequality #πin (Z) ≤ #πin (Z) is obvious. By assumption H2 , all values taken by ∆i on πin (Z) are distinct, so #πin (Z) ≥ deg µi = #πin (Z).
Proof. First, we prove that F vanishes on Z. Indeed, all Fi belong to the ideal generated by I = (µn , X1 − w1 , . . . , Xn − obner basis, so any wn ) in Q[T, X1 , . . . , Xn ]. Now, I is a Gr¨ Fi can be written in terms of I. Since p divides no denominator and no leading term in I, the division equality
Lemma 5. For all d2 , . . . , dn , B(d2 , . . . , dn , Z) equals B(d2 , . . . , dn , Z).
112
of all, we describe the required subroutines, freely using the notation of Theorem 2, and that preceding it. We do not give details of the complexity estimates for lack of space; they are similar to those of [24].
specializes modulo p, and F vanishes on Z, as requested. Let then Z = Z(F ). By Assumption H3 , Z ⊂ Z , so it suffices to prove that #Z ≤ #Z. Let Fr be a finite extension of Fp that contains the coordinates of all these points and let Qr be the corresponding unramified extension of Qp . By Hensel’s lemma, all points in Z lift to pairwise distinct simple roots of F in Qn r . Thus, #Z ≤ #Z = #Z.
Quantitative estimates. By Lemmas 6 and 7, assumptions H1, H2 and H3 imply Theorem 1. Thus, it suffices to give quantitative estimates for these assumptions. To this effect, we let D and H be upper bounds on the degrees and heights of the varieties πi^n(Z), h∆ be an upper bound on the heights of ∆1, ..., ∆n, and H∆ = H + D h∆ + D log(n + 2) + (n + 1) log D.

Lemma 8. There exists a in N − {0} such that if p does not divide a, then H1 and H2 hold. Moreover, a satisfies

  ht a ≤ n ((2D − 1) H∆ + (2D − 1) log(2D − 1)).

Proof. Fix i in 1, ..., n, and let χ, χ', v1, ..., vi be the polynomials associated to πi^n(Z) and ∆i in Lemma 3; all of them have integer coefficients of height at most H∆. Let now ai be the resultant of χ and χ'; by Hadamard's bound, ht ai ≤ (2D − 1) H∆ + (2D − 1) log(2D − 1). Suppose that p does not divide ai. Then χ̄ keeps the same degree and remains squarefree modulo p. Furthermore, p divides no denominator in any wj, since all denominators in 1/χ' mod χ divide ai. Thus, assumption H1 holds. Repeating this argument for all projections πi^n(Z), and taking a = a1 · · · an, gives assumption H2.

Lemma 9. There exists a' in N − {0} such that if p divides neither a nor a', then H1, H2 and H3 hold, with

  ht a' ≤ 2Dn (d H∆ + h + log d + (d + 1) D log(n + 1)).

Proof. Let χ, v1, ..., vn be associated to ∆n as in Lemma 3, let J^h be the homogenization of J with respect to a new variable, and let a' ∈ Z be the resultant of J^h(χ', v1, ..., vn) and χ; then a' ≠ 0 by the definition of Z. The Jacobian determinant J has coefficients of height at most n(h + log d + (d + 1) log(n + 1)); estimating the height of the determinant of the Sylvester matrix of J^h(χ', v1, ..., vn) and χ yields the bound on ht a'. Suppose now that p divides neither a nor a'. Then the degree of χ̄ does not drop modulo p, and thus no root of χ̄ cancels J^h(χ̄', v̄1, ..., v̄n). In other words, all points described by χ̄(T) = 0 and χ̄'(T) Xi = v̄i(T), 1 ≤ i ≤ n, are simple roots of F̄. This set of points equals Z̄, giving H3.

In view of Lemma 9, we prove Theorem 1 with A = a a'. By [23, Lemma 2.1], all ∆i can be taken of height at most h∆ = n(log n + 2 log D) ≤ n(log n + 2n log d). Using the arithmetic Bézout bound of [17], we get after simplifications that all H∆ are bounded by n d^n (h + 3 log(n + 1) + 2n log d + 3). The previous lemmas then give the upper bounds below, which finish the proof of Theorem 1 after a few simplifications:

  ht a  ≤ 2 n d^(2n) (h + 3 log(n + 1) + 2n log d + 7),
  ht a' ≤ 2 n² d^(2n+1) (2h + 4 log(n + 1) + 3n log d + 3).
4. PROOF OF THEOREM 2

We now give the details of our lifting algorithm: given a polynomial system F, it outputs a triangular representation of its set of simple solutions Z = Z(F), by means of the polynomials N^1, ..., N^s defined in the introduction. First of all, we describe the required subroutines, freely using the notation of Theorem 2 and that preceding it. We do not give details of the complexity estimates for lack of space; they are similar to those of [24].

• EquiprojDecomposition takes as input a polynomial system F and outputs the equiprojectable decomposition of Z(F), encoded by triangular sets. This routine is called here for systems defined over finite fields. For the experiments in the next section, we applied the triangularization algorithm of [21], followed by the Split-and-Merge algorithm of Section 2, modulo a prime. Studying the complexity of this task is left to the forthcoming [7]; hence, we consider this subroutine as an oracle here, which is called O2 in Theorem 2.

• Lift applies the Hensel lifting algorithm of [24], but this time to a family of triangular sets, defined first modulo a prime p1, then modulo the successive powers p1^(2^κ). From [24], one easily sees that the κ-th lifting step has a bit complexity quasi-linear in (L, hL, Cn, Σ_{i≤s} deg V(T^i), 2^κ, log p1), i.e. in (L, hL, Cn, deg Z, 2^κ, log p1).

• Convert computes the polynomials N^i starting from the polynomials T^i. Only multiplications modulo triangular sets are needed to perform this operation, so its complexity is negligible compared with that of Lift.

• RationalReconstruction does the following. Let a = p/q ∈ Q and m ∈ N with gcd(q, m) = 1. If ht m ≥ 2 ht a + 1, then given a mod m, this routine outputs a; if ht m < 2 ht a + 1, the output may be undefined, or differ from a. We extend this notation to the reconstruction of all the coefficients of a family of triangular sets. Using the fast Euclidean algorithm [12, Ch. 5, 11], its complexity is negligible compared with that of Lift.

• We do not consider the cost of prime number generation; we see the primes as an input here. Formally, in Theorem 2, this is handled by calls to the oracle O1.

Computing a triangular decomposition by lifting techniques
Input: the system F, primes p1, p2
Output: the polynomials N^1, ..., N^s

  T^{1,0}, ..., T^{s,0} ← EquiprojDecomposition(Z(F mod p1))
  u^1, ..., u^s ← EquiprojDecomposition(Z(F mod p2))
  m^1, ..., m^s ← Convert(u^1, ..., u^s)
  κ ← 1
  while not(Stop) do
      T^{1,κ}, ..., T^{s,κ} ← Lift(T^{1,κ−1}, ..., T^{s,κ−1}) mod p1^(2^κ)
      N^{1,κ}, ..., N^{s,κ} ← Convert(T^{1,κ}, ..., T^{s,κ})
      N̄^{1,κ}, ..., N̄^{s,κ} ← RationalReconstruction(N^{1,κ}, ..., N^{s,κ})
      Stop ← {m^1, ..., m^s} Equals {N̄^{1,κ}, ..., N̄^{s,κ}} mod p2
      κ ← κ + 1
  end while
  return N̄^{1,κ−1}, ..., N̄^{s,κ−1}
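The RationalReconstruction specification above can be realized by the classical extended Euclidean scheme. The following Python sketch is our own illustration (the function name and the bound sqrt(m/2) are our choices, matching the standard formulation rather than any code from the paper); it recovers p/q from its image a modulo m whenever m is large enough:

  from math import gcd, isqrt

  def rational_reconstruction(a, m):
      # Recover p/q from a ≡ p * q^(-1) (mod m); return None when m is too small.
      bound = isqrt(m // 2)              # |p| and q must stay below sqrt(m/2)
      r0, t0, r1, t1 = m, 0, a % m, 1    # invariant: r_i ≡ t_i * a (mod m)
      while r1 > bound:
          q = r0 // r1
          r0, r1 = r1, r0 - q * r1
          t0, t1 = t1, t0 - q * t1
      p, q_ = (r1, t1) if t1 > 0 else (-r1, -t1)
      if q_ == 0 or q_ > bound or gcd(q_, m) != 1:
          return None                    # reconstruction not (yet) possible
      return p, q_

  # example: 7/12 reduced modulo 10201 = 101**2, then reconstructed
  m = 101**2
  a = 7 * pow(12, -1, m) % m
  assert rational_reconstruction(a, m) == (7, 12)

When the modulus p1^(2^κ) is still too small, the guard returns None or a wrong pair; this is exactly why the algorithm cross-checks the reconstructed coefficients against a second prime p2.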
We still use the notation and assumptions of Theorem 2. From [9, Th. 1], all coefficients of N^1, ..., N^s have height in n^O(1) (deg Z + ht Z), which can explicitly be bounded by hF. For p1 ≤ exp(2 hF + 1), define

  d = d(p1) = ⌈ log2( (2 hF + 1) / log p1 ) ⌉.

Then, p1^(2^d(p1)) has height at least 2 hF + 1. In view of the prerequisites for rational reconstruction, d(p1) bounds the number of lifting steps. From an intrinsic viewpoint, at the last lifting step, 2^κ is in O(n^O(1) (deg Z + ht Z)).

Suppose that p1 does not divide the integer A of Theorem 1. Then, Hensel lifting computes approximations T^{1,κ}, ..., T^{s,κ} of T^1, ..., T^s modulo p1^(2^κ). At the κ-th lifting step, let N^{1,κ}, ..., N^{s,κ} be the output of Convert applied to T^{1,κ}, ..., T^{s,κ}, computed modulo p1^(2^κ), and let N̄^{1,κ}, ..., N̄^{s,κ} be the same polynomials after rational number reconstruction, if possible; by construction, they have rational coefficients of height at most 2^(κ−1) log p1. Supposing that p2 does not divide the integer A of Theorem 1, failure occurs only if, for some κ in 0, ..., d − 1 and some j in 1, ..., s, N̄^{j,κ} and N^j differ but coincide modulo p2. For this to happen, p2 must divide some non-zero integer of height at most hF + 2^(κ−1) log p1 + 1. Taking all κ into account, this shows that for any prime p1, there exists a non-zero integer B_{p1} such that ht B_{p1} ≤ d(hF + 1) + 2^d log p1, and if p2 does not divide B_{p1}, the lifting algorithm succeeds. One checks that the above bound can be simplified into ht B_{p1} ≤ hB.
Writing hA for the bound on ht A given in Theorem 1, let C ∈ N be such that

  C = ⌈ (4 hA + 2 hB) / ε ⌉,  so that C ≤ (1/2) exp(2 hF + 1);

let Γ be the set of pairs of primes in [C+1, ..., 2C]² and γ be the number of primes in C+1, ..., 2C; note that γ ≥ C/(2 log C) and that #Γ = γ². The upper bound on C shows that all primes p at most 2C satisfy the requested inequality log p ≤ 2 hF + 1. We can then estimate how many choices of (p1, p2) in Γ lead to failure. There are at most hA/log C primes p1 in C+1, ..., 2C which divide the integer A of Theorem 1, discriminating at most γ hA/log C pairs (p1, p2). For any other value of p1, there are at most (hA + hB)/log C choices of p2 which divide A or B_{p1}. This discriminates at most γ (hA + hB)/log C pairs (p1, p2). Thus the number of choices in Γ leading to failure is at most γ (2 hA + hB)/log C. The lower bound on γ shows that if (p1, p2) is chosen randomly with uniform probability in Γ, the probability that it leads to failure is at most

  γ (2 hA + hB) / #Γ = (2 hA + hB) / (γ log C) ≤ (4 hA + 2 hB) / C,

which is at most ε, as requested.

To estimate the complexity of this algorithm, note that since we double the precision at each lifting step, the cost of the last lifting step dominates. From the previous discussion, the number of bit operations at the last step is quasi-linear in (L, hL, Cn, deg Z, 2^κ, log p1). The previous estimates show that at this step 2^κ is in O(n^O(1) (deg Z + ht Z)), whereas log p1 is quasi-linear in |log ε|, log h, d, log n. Putting all these estimates together ends the proof of Theorem 2.
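The prime-pair sampling from Γ (the role of oracle O1) is easy to realize in practice. As a hypothetical illustration — not code from the paper — one could draw the pair with sympy, whose randprime(a, b) returns a random prime in [a, b):

  from sympy import randprime

  def draw_prime_pair(C):
      # Sample (p1, p2) from the primes in [C+1, 2C]; independent draws,
      # so p1 = p2 is possible, which the analysis above tolerates.
      p1 = randprime(C + 1, 2 * C + 1)
      p2 = randprime(C + 1, 2 * C + 1)
      return p1, p2

Note that randprime's distribution is only approximately uniform over the primes of the interval; for the 90%-success experiments below this is immaterial.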
5. EXPERIMENTATION

We realized a first Maple 9.5 implementation of our modular algorithm on top of the RegularChains library [19]. Tests on benchmark systems [25] reveal its strong features, compared with two other Maple solvers: Triangularize, from the RegularChains library, and gsolve, from the Groebner library. Remark that the triangular decompositions modulo a prime that are needed in our algorithm are performed by Triangularize. This function is generic code: essentially the same code is used over Z and modulo a prime; thus, Triangularize is not optimized for modular computations. Our computations are done on a 2799 MHz Pentium 4. For the time being, our implementation handles square systems that generate radical ideals.

We compare our algorithm, called TriangularizeModular, with gsolve and Triangularize. For each benchmark system, Table 1 lists the numbers n, d, h, and Table 2 lists the prime p1, the height bound hF, the a priori and actual numbers of lifting steps (d and a), and the maximal height of the output coefficients (Ca). Table 3 gives the time of one call to Triangularize modulo p1 (∆p), of the equiprojectable decomposition (Ep), and of the lifting (Lift), in seconds — the first two steps correspond to the "oracle calls" O2 mentioned in Theorem 2, which will be studied in [6]. Table 3 also gives the total time, the total memory usage and the output size for TriangularizeModular, whereas Table 4 gives the same data for Triangularize and gsolve. The maximum running time is set to 10800 seconds; we set the probability of success to be at least 90%.

  Sys  Name             n   d   h
  1    Cyclohexane      3   4   3
  2    Fee 1            4   4   2
  3    fabfaux          3   3   13
  4    geneig           6   3   2
  5    eco6             6   3   0
  6    Weispfenning-94  3   5   0
  7    Issac97          4   2   2
  8    dessin-2         10  2   7
  9    eco7             7   3   0
  10   Methan61         10  2   16
  11   Reimer-4         4   5   1
  12   Uteshev-Bikker   4   3   3
  13   gametwo5         5   4   8
  14   chemkin          13  3   11

Table 1: Features of the polynomial systems

  Sys  p1         hF          d   a   Ca
  1    4423       4395        7   2   15
  2    24499      24464       8   4   70
  3    2671       2647        7   5   110
  4    116663     116587      10  5   162
  5    105761     105718      10  3   40
  6    7433       7392        7   3   31
  7    1549       1511        6   5   102
  8    358079     358048      11  7   711
  9    387799     387754      11  4   89
  10   450367     450313      11  6   362
  11   55313      55246       9   2   19
  12   7841       7813        7   5   125
  13   159223     159192      10  -   -
  14   850088191  850088102   18  -   -

Table 2: Data for the modular algorithm

  Sys  ∆p    Ep   Lift  Total  Mem.  Output size
  1    1     0.3  2     7      5     243
  2    3     1    9     20     6     4157
  3    8     0.4  6     22     7     5855
  4    5     1    5     18     6     4757
  5    12    1.5  6     35     6     2555
  6    16    1.5  11    43     7     3282
  7    66    0.4  4     133    8     4653
  8    47    9    232   427    13    122902
  9    1515  9    35    2873   11    9916
  10   2292  6    82    4686   25    50476
  11   3507  1    9     5569   38    2621
  12   4879  2    22    8796   63    12870
  13   ∞     -    -     -      -     -
  14   -     -    -     -      fail  -

Table 3: Results from our modular algorithm

  Sys  Triang.  Mem.  Size   gsolve  Mem.   Size
  1    0.4      4     169    0.2     3      239
  2    2        6     1680   504     18     34375
  3    512      275   6250   1041    34     27624
  4    2.5      4     743    -       fail   -
  5    5        5     3134   9       5      2236
  6    3000     250   2695   4950    66     34932
  7    -        fail  -      1050    31     31115
  8    -        fail  -      -       error  -
  9    1593     18    55592  -       fail   -
  10   -        fail  -      -       fail   -
  11   -        fail  -      -       fail   -
  12   -        fail  -      -       fail   -
  13   ∞        -     -      ∞       -      -
  14   -        fail  -      -       fail   -

Table 4: Results from Triangularize and gsolve

TriangularizeModular solves 12 of the 14 test systems before the timeout, while Triangularize succeeds with 7 and gsolve with 6. On most of the problems which gsolve can solve, TriangularizeModular shows less time consumed, less memory usage, and smaller output size. Noticeably, quite a few of the large systems can be solved by TriangularizeModular with a time extension: system 13 is solved in 18745 seconds. Another interesting system is Pinchon-1 (from the FRISCO project), for which n = 29, d = 16, h = 20 and hF = 1.409536095e+29, and which we solve in 64109 seconds. Both Triangularize and gsolve fail on these problems due to memory allocation failure. Our modular method demonstrates its efficiency in reducing the size of the intermediate computations, whence its ability to solve difficult problems.

We observed that for every test system for which Ep can be computed, the Hensel lifting always succeeds, i.e. the equiprojectable decomposition over Q can be reconstructed from Ep. In addition, TriangularizeModular failed on chemkin at the ∆p phase rather than at the lifting stage. Furthermore, the time spent in the equiprojectable decomposition and the Hensel lifting is rather insignificant compared with that spent in the triangular decomposition modulo a prime. For every tested example, the Hensel lifting achieves its final goal in fewer steps than the theoretical bound. In addition, the primes derived from our theoretical bounds are of quite moderate size, even on large examples.

6. CONCLUSIONS

We have presented a modular algorithm for triangular decompositions of 0-dimensional varieties over Q and have demonstrated the feasibility of Hensel lifting in computing triangular decompositions of non-equiprojectable varieties. Experiments show the capacity of this approach to improve the practical efficiency of triangular decomposition. By far, the bottleneck is the modular triangularization phase. This is quite encouraging, since it is the part for which we relied on generic, non-optimized code. The next step is to extend these techniques so as to specialize variables as well during the modular phase, following the approach initiated in [13] for primitive element representations, and to treat systems of positive dimension.

Acknowledgment
The authors are thankful to François Lemaire (Université de Lille 1, France) for his support with the RegularChains library. Merci, François!
7. REFERENCES

[1] E. A. Arnold. Modular algorithms for computing Gröbner bases. J. Symb. Comp., 35(4):403–419, 2003.
[2] P. Aubry and A. Valibouze. Using Galois ideals for computing relative resolvents. J. Symb. Comp., 30(6):635–651, 2000.
[3] F. Boulier, L. Denis-Vidal, T. Henin, and F. Lemaire. Lépisme. In ICPSS, pages 23–27. University of Paris 6, France, 2004.
[4] F. Boulier and F. Lemaire. Computing canonical representatives of regular differential ideals. In ISSAC 2000, pages 37–46. ACM Press, 2000.
[5] X. Dahan. Borne de hauteur (polynomiale) sur les coefficients d'une représentation triangulaire d'une variété zéro-dimensionnelle présentant des symétries. Master's thesis, École Polytechnique, 2003.
[6] X. Dahan, M. Moreno Maza, É. Schost, W. Wu, and Y. Xie. The complexity of the Split-and-Merge algorithm. In preparation.
[7] X. Dahan, M. Moreno Maza, É. Schost, W. Wu, and Y. Xie. On the complexity of the D5 principle. Preprint.
[8] X. Dahan, M. Moreno Maza, É. Schost, W. Wu, and Y. Xie. Equiprojectable decompositions of zero-dimensional varieties. In ICPSS, pages 69–71. University of Paris 6, France, 2004.
[9] X. Dahan and É. Schost. Sharp estimates for triangular sets. In ISSAC 04, pages 103–110. ACM Press, 2004.
[10] D. Eisenbud. Commutative Algebra, volume 150 of GTM. Springer-Verlag, 1995.
[11] M. V. Foursov and M. Moreno Maza. On computer-assisted classification of coupled integrable equations. J. Symb. Comp., 33:647–660, 2002.
[12] J. von zur Gathen and J. Gerhard. Modern Computer Algebra. Cambridge University Press, 1999.
[13] M. Giusti, J. Heintz, J. E. Morais, and L. M. Pardo. When polynomial equation systems can be solved fast? In AAECC-11, pages 205–231. Springer, 1995.
[14] M. Giusti, G. Lecerf, and B. Salvy. A Gröbner free alternative for polynomial system solving. J. Complexity, 17(1):154–211, 2001.
[15] É. Hubert. Notes on triangular sets and triangulation-decomposition algorithms. In Symbolic and Numerical Scientific Computations, volume 2630 of LNCS, pages 1–39. Springer, 2003.
[16] M. Kalkbrener. A generalized euclidean algorithm for computing triangular representations of algebraic varieties. J. Symb. Comp., 15:143–167, 1993.
[17] T. Krick, L. M. Pardo, and M. Sombra. Sharp estimates for the arithmetic Nullstellensatz. Duke Math. J., 109(3):521–598, 2001.
[18] D. Lazard. Solving zero-dimensional algebraic systems. J. Symb. Comp., 13:117–133, 1992.
[19] F. Lemaire, M. Moreno Maza, and Y. Xie. The RegularChains library. In Maple 10, Maplesoft, Canada. To appear.
[20] P. J. McCarthy. Algebraic Extensions of Fields. Dover, New York, 1991.
[21] M. Moreno Maza. On triangular decompositions of algebraic varieties. Technical Report 4/99, NAG, UK. Presented at the MEGA-2000 Conference, Bath, UK. http://www.csd.uwo.ca/~moreno.
[22] M. Moreno Maza and R. Rioboo. Polynomial gcd computations over towers of algebraic extensions. In Proc. AAECC-11, pages 365–382. Springer, 1995.
[23] F. Rouillier. Solving zero-dimensional systems through the rational univariate representation. AAECC, 9:433–461, 1999.
[24] É. Schost. Complexity results for triangular sets. J. Symb. Comp., 36(3-4):555–594, 2003.
[25] The SymbolicData project, 2000–2002. http://www.SymbolicData.org.
[26] W. Trinks. On improving approximate results of Buchberger's algorithm by Newton's method. In EUROCAL 85, volume 203 of LNCS, pages 608–611. Springer, 1985.
Computing the Multiplicity Structure in Solving Polynomial Systems

Barry H. Dayton
Department of Mathematics, Northeastern Illinois University, Chicago, IL 60625

Zhonggang Zeng*
Department of Mathematics, Northeastern Illinois University, Chicago, IL 60625

ABSTRACT
This paper presents algorithms for computing the multiplicity structure of a zero of a polynomial system. The zero can be exact or approximate, with the system being intrinsic or empirical. As an application, the dual space theory and methodology are utilized to analyze deflation methods in solving polynomial systems, to establish a tighter deflation bound, and to derive special case algorithms.

Categories and Subject Descriptors: G.1.5 [Mathematics of Computing]: Roots of Nonlinear Equations – systems of equations; I.1.2 [Symbolic and Algebraic Manipulations]: Algebraic Algorithms.

General Terms: Algorithms, Theory.

Keywords: polynomial ideal, dual space, multiplicity.

* Supported by NSF grant DMS-0412003.

1. INTRODUCTION

In this paper we present several algorithms for computing the multiplicity structure of a polynomial system, namely the dual space of the zero-dimensional ideal at a zero. Using approximate rank-revealing, those algorithms allow the systems and the zeros to be given approximately. Solving polynomial equations is one of the fundamental problems in computational mathematics, with a wealth of applications. Multiple zeros, in particular, present a challenge in computation while possessing a rich spectrum of structural invariants, as shown in the following example.

Example 1. Consider the multiple zero (x1, x2) = (0, 0) of the simple polynomial ideal I = ⟨x1³, x1²x2 + x2⁴⟩. The multiplicity 12 may not be obvious. Other structural invariants are even more subtle, yet more essential in defining the nature of the system and the zero:

• The multiplicity m = 12.

• The Hilbert function {1, 2, 3, 2, 2, 1, 1, 0, ···}, which is a partition of the multiplicity 12.

• The dual space D_(0,0)(I) of the ideal is 12-dimensional, with the basis below grouped by differential orders and counted by the Hilbert function (group sizes 1, 2, 3, 2, 2, 1, 1):

  ∂00;  ∂10, ∂01;  ∂20, ∂11, ∂02;  ∂12, ∂03;  ∂13, ∂04 − ∂21;  ∂05 − ∂22;  ∂06 − ∂23.   (1)

Here, the differentiation operator

  ∂_{j1···js} ≡ ∂_{x1^j1 ··· xs^js} ≡ (1/(j1! ··· js!)) · ∂^(j1+···+js) / (∂x1^j1 ··· ∂xs^js).   (2)

The functionals in (1) vanish on the entire ideal I at the zero (0, 0), and form its multiplicity structure.

• The breadth β_(0,0)(I) = 2 and the depth δ_(0,0)(I) = 6.

The multiplicity at a zero of a polynomial ideal has been an important topic in algebraic geometry since the days of Newton and Leibniz [5]. In theory, the multiplicity can be computed via symbolic methods if the exact zero is known; using the Singular package [7], this can often be done in practice. However, computer algebra systems such as Maple and Mathematica still output multiplicities with inconsistent accuracy (see §7). When a zero is known only approximately, or the polynomial system is inexact, computing the multiplicity structure may be beyond the scope of symbolic computation as narrowly defined.

Computing multiplicity structures has been studied extensively. The duality approach originated with Macaulay [11, 15] in 1916, along with an algorithm (see §5) that appears to be largely unknown in the modern era. This approach was then phrased in terms of dual spaces by Gröbner in 1939 and elaborated recently by Marinari, Möller and Mora in [13] with a symbolic algorithm. Stetter [20, 22] and Thallinger [24] propose a modified approach with an implementation. Formulation of the multiplicity structure as matrix eigenproblems has been studied in [1, 14, 21]. For computing the multiplicity only, a numerical algorithm in [8] utilizes Zeuthen's rule, and an eigenvalue approach is introduced in [12], along with a homotopy algorithm in [18] providing an upper bound.

Those approaches possess various strengths. However, as evidenced by the frequent inaccuracies of advanced CAS packages, finding practical, robust and accurate algorithms for computing the complete multiplicity structure is apparently still at an early stage of development, especially in the presence of data error and zero approximation. In this paper, we give an expository account of multiplicity structure via duality. The main algorithm, MultStructure, is then presented for computing the multiplicity structure. Finally, as an application of the duality analysis, we investigate existing regularization strategies for computing multiple zeros, along with our modifications and special case algorithms.
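The vanishing claim for the basis (1) can be spot-checked mechanically. The sketch below is our own sympy illustration (the helper d and the degree cutoff are our choices): it applies the order-5 functional ∂05 − ∂22 of Example 1 to monomial multiples of the generators, confirming that it kills not just f1, f2 but the multiples that generate the ideal up to a given degree:

  from sympy import symbols, diff, factorial

  x1, x2 = symbols('x1 x2')
  f1 = x1**3
  f2 = x1**2*x2 + x2**4

  def d(j1, j2, p):
      # the functional ∂_{j1 j2}[0](p) from (2): scaled derivative at (0, 0)
      q = diff(p, x1, j1, x2, j2) / (factorial(j1) * factorial(j2))
      return q.subs({x1: 0, x2: 0})

  c = lambda p: d(0, 5, p) - d(2, 2, p)   # the basis functional ∂05 − ∂22

  mons = [x1**a * x2**b for a in range(4) for b in range(4)]
  assert all(c(m * f) == 0 for f in (f1, f2) for m in mons)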
2. NOTATION AND PRELIMINARIES

Throughout this paper, N ≡ {0, 1, 2, ···} and C is the complex field. Matrices are denoted by capital letters A, B, etc., while O and I represent zero matrices and identities respectively. For any matrix A, N(A), rank(A) and nullity(A) are the nullspace, rank and nullity of A, respectively. Vectors are columns denoted by boldface lower case letters u, v, etc. For any matrix or vector, (·)^T denotes the transpose while (·)^H the conjugate transpose. Symbols 0 and e1 represent zero vectors and [1, 0, ···, 0]^T respectively. We use C[x] ≡ C[x1, ···, xs] to denote the ring of polynomials with complex coefficients in the variables x = (x1, ···, xs). For every index array j = [j1, ···, js] ∈ N^s, the monomial x^j = x1^j1 ··· xs^js, and (x − y)^j = (x1 − y1)^j1 ··· (xs − ys)^js. The operator ∂j is as in (2), with order |j| ≡ j1 + ··· + js.

Consider a system of t polynomials {f1, ···, ft} in s variables with an isolated zero x̂, where t ≥ s. The polynomials f1, ···, ft ∈ C[x] generate an ideal I = ⟨f1, ···, ft⟩. As in (2), we define a functional at x̂ ∈ C^s:

  ∂j[x̂] : C[x] → C,  where ∂j[x̂](p) = (∂j p)(x̂) for p ∈ C[x].

Generally, a (differential) functional at x̂ is a linear combination of the ∂j[x̂]'s. All functionals at x̂ that vanish on I form a vector space Dx̂(I), called the dual space of I at x̂:

  Dx̂(I) ≡ { c = Σ_{j ∈ N^s} cj ∂j[x̂] | c(f) = 0 for all f ∈ I },   (3)

where cj ∈ C for all j ∈ N^s.

Definition 1. [11, 22] The multiplicity of a zero x̂ of an ideal I ⊂ C[x] is m if the dual space Dx̂(I) is of dimension m, while Dx̂(I) itself defines the multiplicity structure of x̂.

This definition is a generalization of the univariate case, where x̂ is an m-fold root of p ∈ C[x] if Dx̂(I) is spanned by ∂0[x̂], ∂1[x̂], ···, ∂m−1[x̂]. Notice that Dx̂(I) consists of functionals vanishing not only on the polynomials f1, ···, ft but also on the entire ideal I they generate. In other words, for a functional c at x̂, we have c ∈ Dx̂(I) if and only if

  c(p fi) = 0 for all p ∈ C[x] and 1 ≤ i ≤ t.   (4)

For α ∈ N, D^α_x̂(I) consists of the functionals in Dx̂(I) with differential orders bounded by α. The Hilbert function, to be defined in §3, has a convenient property associated with the multiplicity structure (see Theorem 2):

  H(0) = dim(D^0_x̂(I)) ≡ 1,
  H(α) = dim(D^α_x̂(I)) − dim(D^(α−1)_x̂(I))  for α ∈ {1, 2, ···},   (5)

where dim(·) denotes the dimension of a vector space. Using (5), we introduce the breadth and the depth of the dual space Dx̂(I), denoted by βx̂(I) and δx̂(I) respectively:

  βx̂(I) = H(1)  and  δx̂(I) = max{ α | H(α) > 0 }.

Depth and breadth play significant roles in zero-finding (§6). In contrast to Example 1, the system {x1³, x2⁴} also has the zero (0, 0) of multiplicity 12, but with a different Hilbert function {1, 2, 3, 3, 2, 1, 0, ···} and a dual space spanned by (orders grouped as 1, 2, 3, 3, 2, 1)

  ∂00;  ∂10, ∂01;  ∂20, ∂11, ∂02;  ∂21, ∂12, ∂03;  ∂13, ∂22;  ∂23.

The system {x2³, x2 − x3², x3 − x1²} at the origin is again 12-fold, with Hilbert function {1, 1, ···, 1, 0, ···} and a dual space basis containing one functional per order:

  ∂000,  ∂100,  ∂200 + ∂001,  ···,  ∂600 + ∂401 + ∂202 + ∂210 + ∂003 + ∂011,  ···,
  ∂11,00 + ∂901 + ∂702 + ··· + ∂312 + ∂105 + ∂320 + ∂113 + ∂121.

The last example is a breadth-one case that is of special interest. Dual spaces in this case can be computed via a simple recursive algorithm in §6.2.

3. THE HILBERT FUNCTION

It can be assumed without loss of generality that the system {f1, ..., ft} has a zero at the origin x̂ = 0. There are various constructions of the local ring at the point x̂ (e.g. see [2, §4.2]); the easiest is C[[x]]/⟨f1, ···, ft⟩, where C[[x]] = C[[x1, ···, xs]] denotes the ring of formal power series at x̂ = 0, and I = ⟨f1, ···, ft⟩ is the ideal generated by f1, ···, ft in C[[x]]. These local rings contain the information needed to calculate the multiplicity structure. The intersection multiplicity of the zero is given by dimC(C[[x]]/I) (see [5] or [2, Chap. 4, Prop. 2.11]) and can be decoded from the associated graded ring

  Gr_M(C[[x]]/I) = ⊕_{α=0}^{∞} M^α/M^(α+1) = C[x]/In(I),

where M = ⟨x1, ..., xs⟩ is the maximal ideal at the origin and In(I) is the ideal of initial forms relative to a local degree ordering [6, §5.5]. This ring is a standard graded algebra. The Hilbert function is then defined as [19, 6, 2]

  H(α) = dimC M^α/M^(α+1),  α ∈ N.   (6)

The lemma below summarizes the relevant properties of the Hilbert function. Parts (i), (ii) follow from [6], while (iii), (iv) can be derived from properties of the Hilbert function discovered by Macaulay and enumerated in [19, Thm. 2.2].

Lemma 1. Let H : N → N be the Hilbert function defined in (6). Then

(i) dimC(C[[x]]/I) = dimC(C[x]/In(I)) = Σ_{α=0}^{∞} H(α).
(ii) The zero x̂ is isolated if and only if H(α) = 0 for all sufficiently large α ∈ N.
(iii) If H(1) = β, then H(α) ≤ (β+α−1 choose β−1) for all α ∈ N.
(iv) If H(α) ≤ 1 for some α, then H(σ+1) ≤ H(σ) ≤ 1 for all σ ≥ α.
4. THE MULTIPLICITY MATRICES

Based on (5), the multiplicity structure of the polynomial ideal I = ⟨f1, ···, ft⟩ at x̂ can be computed as dimensions and bases of the vector spaces D^α_x̂(I) for α = 0, 1, ··· until

  dim(D^α_x̂(I)) = dim(D^(α+1)_x̂(I))   (7)

at the smallest such α, say α = σ. When (7) occurs, the corresponding Hilbert function satisfies H(σ+1) = 0, and consequently H(α) = 0 for all α > σ by Lemma 1-(iv). Thus Dx̂(I) = D^σ_x̂(I). The dual subspace D^α_x̂(I) is a subspace of

  F^α_x̂ ≡ { Σ_{|j|≤α} cj ∂j[x̂] | cj ∈ C, j ∈ N^s } ≡ span{ ∂j[x̂] | j ∈ N^s, |j| ≤ α },

which is isomorphic to the space of complex vectors

  C^nα ≡ { c = [cj : |j| ≤ α] | cj ∈ C, j ∈ N^s },  with nα = (α+s choose α).

A functional c = Σ_{|j|≤α} cj ∂j[x̂] ∈ F^α_x̂ is in D^α_x̂(I) if and only if it vanishes on I = ⟨f1, ···, ft⟩:

  c(p fi) = 0 for p ∈ C[x1, ···, xs] and i ∈ {1, ···, t}.   (8)

Since c is linear, (8) is equivalent to

  c((x − x̂)^k fi) = 0 for k ∈ N^s, i ∈ {1, ···, t},

and these conditions form a linear system of homogeneous equations. This observation leads to the following definition/theorem, which formulates the multiplicity matrices.

Theorem 1. Let x̂ be an isolated zero of f1, ···, ft ∈ C[x1, ···, xs]. For α ∈ N, a functional Σ_{|j|≤α} cj ∂j[x̂] is in D^α_x̂(⟨f1, ···, ft⟩) if and only if the coefficient vector c = [cj : |j| ≤ α] ∈ C^nα satisfies

  Σ_{|j|≤α} cj (∂j((x − x̂)^k fi))(x̂) = 0,  k ∈ N^s, |k| < α, i ∈ {1, 2, ···, t},   (9)

which corresponds to a homogeneous linear system Sα c = 0 for c ∈ C^nα. Here Sα, an mα × nα matrix with mα = (α−1+s choose α−1) t and nα = (α+s choose α), is called the α-th order multiplicity matrix. Consequently, the Hilbert function (5) satisfies

  H(α) = nullity(Sα) − nullity(Sα−1),  α = 1, 2, ···,  with S0 ≡ [f1(x̂), ···, ft(x̂)]^T = O_{t×1}.   (10)

The multiplicity matrices are constructed in a way similar to the convolution matrices in [25, §5.3], and depend on the ordering of the index set Iα ≡ { j ∈ N^s | |j| ≤ α }. For an obvious ordering ≺ of I1, we can arrange

  S1 = [ f1(x̂); ···; ft(x̂) | J(x̂) ] ≡ [ 0 | J(x̂) ],   (11)

where the first column holds the values fi(x̂) = 0 and J(x̂) is the Jacobian of the system {f1, ..., ft} at x̂. Generally, we index the rows of Sα by (x − x̂)^k fi for (k, i) ∈ Iα−1 × {1, ···, t}, with the ordering (k, i) ≺ (k', i') if k ≺ k' in Iα−1, or k = k' but i < i'. The columns are indexed by the differential functionals ∂j for j ∈ Iα. If an entry of Sα is at the intersection of the row and column indexed by (x − x̂)^k fi and ∂j respectively, then this entry is the value of ∂j[x̂]((x − x̂)^k fi) (see Example 2). With this arrangement, Sα is the upper-left mα × nα submatrix of every subsequent multiplicity matrix Sσ, σ ≥ α, as illustrated in Example 2.

Example 2. Consider the system {x1 − x2 + x1², x1 − x2 + x2²} at x̂ = (0, 0). The following array shows the expansion of the multiplicity matrices from S1 (the upper-left 2 × 3 block) to S2 (upper-left 6 × 6) and to S3 (the full 12 × 10 matrix), with rows and columns labeled by x^k fi and ∂j respectively:

              ∂00  ∂10  ∂01  ∂20  ∂11  ∂02  ∂30  ∂21  ∂12  ∂03
  f1           0    1   −1    1    0    0    0    0    0    0
  f2           0    1   −1    0    0    1    0    0    0    0
  x1 f1        0    0    0    1   −1    0    1    0    0    0
  x1 f2        0    0    0    1   −1    0    0    0    1    0
  x2 f1        0    0    0    0    1   −1    0    1    0    0
  x2 f2        0    0    0    0    1   −1    0    0    0    1
  x1² f1       0    0    0    0    0    0    1   −1    0    0
  x1² f2       0    0    0    0    0    0    1   −1    0    0
  x1x2 f1      0    0    0    0    0    0    0    1   −1    0
  x1x2 f2      0    0    0    0    0    0    0    1   −1    0
  x2² f1       0    0    0    0    0    0    0    0    1   −1
  x2² f2       0    0    0    0    0    0    0    0    1   −1

The bases of the nullspaces (written as row vectors in the same column indices) are N(S0) = span{[1]}, N(S1) spanned by [1, 0, 0] and [0, 1, 1], and N(S2) spanned by the embeddings of these together with [0, −1, 0, 1, 1, 1]. The nullspaces can be converted to bases of the dual subspaces using the column indices:

  D⁰_(0,0)(I) = span{ ∂00 },
  D¹_(0,0)(I) = span{ ∂00, ∂10 + ∂01 },
  D²_(0,0)(I) = span{ ∂00, ∂10 + ∂01, −∂10 + ∂20 + ∂11 + ∂02 }.

It is also easy to verify that nullity(S3) = nullity(S2) = 3. Therefore the Hilbert function is H = {1, 1, 1, 0, ···}, and the multiplicity equals 3. The dual space D_(0,0)(I) = D²_(0,0)(I), with breadth β_(0,0)(I) = H(1) = 1 and depth δ_(0,0)(I) = max{α | H(α) > 0} = 2. The complete multiplicity structure is obtained.

The Hilbert function can also be calculated from another perspective. The multiplicity matrix Sα is expanded from Sα−1 in the manner of the following matrix partition:

  Sα = [ Sα−1  * ; O  * ] = [ Aα | Bα ],  with Aα = [ Sα−1 ; O ].   (12)

Let Nα be a matrix whose rows form a basis for the left nullspace of Aα. For the ideal ⟨f1, ..., ft⟩, let Inα(I) denote the degree-α part of In(I) as in [6]. The following lemma is a result of standard matrix theory.

Lemma 2. With the assumptions of Theorem 1 and the partition (12), along with Nα as given above, formula (10) is equivalent to

  H(α) = (α−1+s choose α) − rank(Nα Bα),  α ∈ N.

Moreover, rank(Nα Bα) = dimC Inα(I).

We now assert that the notions of multiplicity and Hilbert function in §2 agree with those in §3. The second statement of the following theorem is well known in the folklore and implicit in [11, 13, 15], but we have not been able to find a clear and explicit statement with a proof in the literature. This theorem can be derived from Lemmas 1 and 2.

Theorem 2. Under the assumption of Theorem 1, the Hilbert function defined in (6) satisfies (5). Consequently, the intersection multiplicity of the zero x̂ is identical to the arithmetic multiplicity Σ_{α=0}^{∞} H(α) and to the dual multiplicity given in Definition 1.
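To make the construction concrete, here is a small sympy sketch — our own illustrative code, not the authors' implementation — that assembles Sα per Theorem 1 and recovers the Hilbert function of Example 2 via (10):

  from sympy import Matrix, diff, factorial, symbols

  def indices(s, bound):
      # all exponent vectors j in N^s with |j| <= bound, ordered by total degree
      out = [[]]
      for _ in range(s):
          out = [j + [e] for j in out for e in range(bound + 1)]
      return sorted((tuple(j) for j in out if sum(j) <= bound),
                    key=lambda j: (sum(j), j))

  def mult_matrix(F, vars_, zero, alpha):
      # S_alpha of Theorem 1: rows are (x - xhat)^k * f_i with |k| < alpha,
      # columns are the functionals ∂_j with |j| <= alpha
      def dj(j, p):
          for v, e in zip(vars_, j):
              p = diff(p, v, e) / factorial(e)
          return p.subs(dict(zip(vars_, zero)))
      shift = [v - z0 for v, z0 in zip(vars_, zero)]
      rows = []
      for k in indices(len(vars_), alpha - 1):
          mono = 1
          for b, e in zip(shift, k):
              mono *= b**e
          rows += [[dj(j, mono * f) for j in indices(len(vars_), alpha)] for f in F]
      return Matrix(rows)

  x1, x2 = symbols('x1 x2')
  F = [x1 - x2 + x1**2, x1 - x2 + x2**2]      # the system of Example 2
  nul_prev = 1                                 # nullity(S0) = 1, see (10)
  for alpha in range(1, 5):
      nul = len(mult_matrix(F, (x1, x2), (0, 0), alpha).nullspace())
      print('H(%d) = %d' % (alpha, nul - nul_prev))   # prints 1, 1, 0, 0
      nul_prev = nul
  # with H(0) = 1, the multiplicity is 1 + 1 + 1 = 3, as in Example 2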
5. COMPUTING THE MULTIPLICITY

The multiplicity and its structure can be computed using symbolic, symbolic-numeric or floating point computation based on Theorem 2 and Theorem 1, depending on the representation of the polynomial system and the zero. The main algorithm can be outlined in the following pseudo-code.

Algorithm: MultStructure
Input: system {f1, ···, ft} and zero x̂ ∈ C^s
• initialize S0, N(S0) = span{[1]}, H(0) = 1
• for α = 1, 2, ··· do
    * expand Sα−1 to Sα
    * embed N(Sα−1) into N(Sα)
    * find N(Sα) by expanding N(Sα−1)      (the shaded step)
    * if nullity(Sα) = nullity(Sα−1), then δ = α − 1, H(α) = 0, break the loop;
      otherwise, get H(α) by (10), end if
  end for
• convert N(Sδ) to Dx̂(I)
Output: multiplicity m = Σα H(α), the Hilbert function H, a basis of Dx̂(I), depth δx̂(I), breadth βx̂(I) = H(1)

This algorithm turns out to be essentially equivalent to Macaulay's procedure of 1916 for finding inverse arrays of dialytic arrays [11, 15], except that Macaulay's algorithm requires the construction of dialytic arrays with full row rank. This requirement is difficult and costly to implement with approximate systems or zeros.

Implementation of MultStructure is straightforward for symbolic computation when the system and zero are exact and properly represented. Applying this multiplicity-finding procedure to approximate zeros and/or inexact systems requires the notion and algorithms of numerical rank-revealing at the shaded step in Algorithm MultStructure. The approxi-rank of a matrix A is defined as the minimum rank of matrices within a threshold θ:

  rankθ(A) = min_{‖A−B‖2 ≤ θ} rank(B).

The approxi-nullspace Nθ(A) of A is the (exact) nullspace of the matrix B that is nearest to A with rank(B) = rankθ(A). With this reformulation, approximate rank/nullspace computation becomes well-posed. We refer to [10] for details. Approximate rank-revealing applies the iteration [10]

  u_{k+1} = u_k − [ 2‖A‖∞ u_k^H ; A ]† [ ‖A‖∞ (u_k^H u_k − 1) ; A u_k ],
  ς_{k+1} = ‖A u_{k+1}‖2 / ‖u_{k+1}‖2,  k = 0, 1, ···,   (13)

where (·)† denotes the Moore-Penrose inverse. From a random u0, this iteration virtually guarantees convergence to an approxi-nulvector u, while {ςk} converges to the distance ς between A and the nearest rank-deficient matrix. After finding an approxi-nulvector u, we form the matrix Â = [ ‖A‖∞ u^H ; A ]. Applying (13) to Â, the resulting sequence {ûk} converges to an approxi-nulvector v of A orthogonal to u, while the scalar sequence {ς̂k} converges to the distance between A and the nearest matrix with nullity 2. This process can be recursively continued by stacking ‖A‖∞ v^H on top of Â and applying (13) to the new stacked matrix.

We now describe the numerical procedure that carries out the shaded step in Algorithm MultStructure. The nullspace Nθ(S0) = span{[1]}. Assume an orthonormal basis Y = [y1, ···, yµ] for Nθ(Sα−1) is obtained. Also assume the QR decomposition

  [ T Y^H ; Sα−1 ] = Qα−1 [ Rα−1 ; O ]

is available, where Qα−1 is unitary, Rα−1 is square upper-triangular and T is a diagonal scaling matrix. Embedding the yi's into C^nα by appending zeros at the bottom to form zi for i = 1, ···, µ, we obtain Z = [z1, ···, zµ], whose columns form a subset of an orthonormal basis for Nθ(Sα). Writing Sα = [ Sα−1  F ; O  G ] as in (12), and letting [F1; F2] = Q^H_{α−1} [O; F], we get the matrix partitions

  [ T Z^H ; Sα ] = [ T Y^H  O ; Sα−1  F ; O  G ] = diag(Qα−1, I) [ Rα−1  F1 ; O  F2 ; O  G ].

Let Q̂ [ R̂ ; O ] = [ F2 ; G ] be a QR decomposition. Then

  [ T Z^H ; Sα ] = Qα [ Rα ; O ],  with Rα = [ Rα−1  F1 ; O  R̂ ],   (14)

with proper accumulation of Qα−1 and Q̂ into Qα. Notice that (14) implies that N(Rα) equals N(Sα) ∩ N(Z^H) = N(Sα) ∩ Nθ(Sα−1)⊥. Therefore Nθ(Rα) consists of the approxi-nullvectors of Sα that are approximately orthogonal to those of Sα−1. The procedure below produces the approxi-nullspace Nθ(Rα):

  let A = Rα
  for i = 1, 2, ··· do
    – apply iteration (13), stopping at u and ς with proper criteria
    – if ς > θ, exit, end if
    – get zµ+i = u, reset A to [ ‖A‖∞ u^H ; A ]
    – update the QR decomposition A = QR
  end for

Upon exit, the vectors zµ+1, ···, zµ+ν are the remaining basis vectors of Nθ(Sα), in addition to z1, ···, zµ already obtained. Furthermore, the QR decomposition of [ T̂ Ẑ^H ; Sα ] is a by-product of a proper accumulation of the orthogonal transformations. Here Ẑ is [z1, ···, zµ+ν] with a column permutation, and T̂ is again a scaling matrix.
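As a numerical illustration of iteration (13) — the formula as reconstructed above; the function name, random start and stopping rule are our own choices, not the authors' code — a minimal numpy sketch reads:

  import numpy as np

  def approx_nullvector(A, tol=1e-12, maxit=50):
      # Iteration (13): drive u toward an approxi-nulvector of A; sigma
      # estimates the distance from A to the nearest rank-deficient matrix.
      rng = np.random.default_rng(0)
      m, n = A.shape
      a = np.linalg.norm(A, np.inf)
      u = rng.standard_normal(n)
      sigma = np.inf
      for _ in range(maxit):
          M = np.vstack([2 * a * u.conj(), A])            # stacked (m+1) x n matrix
          rhs = np.concatenate([[a * (u.conj() @ u - 1)], A @ u])
          step, *_ = np.linalg.lstsq(M, rhs, rcond=None)  # applies the pseudo-inverse
          u = u - step
          new_sigma = np.linalg.norm(A @ u) / np.linalg.norm(u)
          if abs(new_sigma - sigma) < tol:
              break
          sigma = new_sigma
      return u / np.linalg.norm(u), sigma

In the shaded-step procedure, one would call this on A = Rα and accept u as a new basis vector zµ+i whenever the returned ς does not exceed the threshold θ.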
6. COMPUTING A MULTIPLE ZERO

Multiple zeros are infinitely sensitive to perturbation without regularization. Newton's iteration, as well as other standard iterative methods, not only loses its superlinear convergence rate; it is also subject to a dismal barrier of
"attainable accuracy". One of the remedies is the deflation approach [9, 17]. In particular, Leykin, Verschelde and Zhao [9] propose an effective deflation method, which we refer to as the LVZ method, with the objective of restoring quadratic convergence for Newton's iteration. From our perspective, deflation is in fact a regularization process that converts the ill-posed zero-finding problem into a well-posed least squares problem, and provides a mechanism to refine the multiple zero to high accuracy.

The LVZ method can be described as follows. Let F(x) be a polynomial system [f1(x), ···, ft(x)]^T in the variable vector x ∈ C^s. If x̂ is a multiple zero of F(x), then the Jacobian J(x̂) ∈ C^(t×s) of F(x) at x̂ is rank deficient. Let rank(J(x̂)) = r < s. Then, for almost all choices of an s × (r+1) random matrix B, the matrix J(x̂)B is of nullity one. Therefore, for almost all choices of b ∈ C^(r+1), the linear system

  [ J(x̂)B ; b^H ] y = [ 0 ; 1 ]

has a unique solution
y = ŷ. Consequently a new (2t + 1) × (s + r + 1) system

  F̂(z) = 0,  with z = [ x ; y ] and F̂(z) ≡ [ F(x) ; J(x)By ; b^H y − 1 ],   (15)

has an isolated solution z = ẑ whose x and y components are x̂ and ŷ respectively. If ẑ is still a multiple zero of F̂, one can repeat the same deflation technique on F̂ and ẑ to obtain a further expanded system and isolated zero. Leykin, Verschelde and Zhao's theory is that the multiplicity of ẑ as a zero of F̂ is less than the multiplicity of x̂ as a zero of F. Since every deflation step deflates the multiplicity by at least one, recursive deflations eventually exhaust the multiplicity, arriving at a system G(u) = 0 with a simple zero û. The number of deflation steps is strictly less than the multiplicity m of x̂ [9, Theorem 3.1]. The Gauss-Newton iteration on the final system G(u) = 0 converges locally to û at a quadratic rate. More importantly, as we see it, the multiple zero x̂ of the system F can now be computed to high accuracy as part of a simple zero û of G. Furthermore, the LVZ method also generates functionals in Dx̂(I) as by-products.

6.1 Duality analysis of the LVZ method

We shall use additional differential notation and operations. The original variables are in vector form x = [x1, ···, xs]^T, which will also be denoted by x0, in contrast to the auxiliary (vector) variables x1, x2, etc. For any fixed or variable vector y = [y1, ···, ys]^T, the directional differentiation operator along y is defined as

  ∇y ≡ y1 ∂/∂x1 + ··· + ys ∂/∂xs.   (16)

When y is fixed in C^s, ∇y induces a functional ∇y[x̂] : p → (∇y p)(x̂). For any variable u = [u1, ···, us]^T, the gradient operator ∆u ≡ [∂/∂u1, ···, ∂/∂us]^T is mainly used in a "dot product" with v = [v1, ···, vs]^T:

  v·∆u ≡ v1 ∂/∂u1 + ··· + vs ∂/∂us,   (17)

especially ∇y ≡ y·∆x ≡ y·∆x0 for any y of dimension s. For any f ∈ C[x0] and auxiliary variables y and z,

  (y·∆x0)(∇z f(x0)) = ∇y∇z f(x0),  z·∆y f(x0) ≡ 0,
  (z·∆y)(∇y f(x0)) = (z·∆y)(y·∆x0) f(x0) = ∇z f(x0).   (18)

Let F = [f1, ···, ft]^T be a polynomial system in C[x0] and J(x0) be its Jacobian matrix. Then

  J(x0) x1 = [ x1·∆x0 f1(x0) ; ··· ; x1·∆x0 ft(x0) ] = ∇x1 F(x0).

We slightly modify the LVZ method. Let J(x̂0) be the Jacobian of F(x0) at the zero x̂0, with rank r. For almost all R ∈ C^((s−r)×s), there is a unique solution x̂1 ∈ C^s to [ J(x̂0) ; R ] x1 = [ 0 ; e1 ]. As a result, ẑ = [ x̂0 ; x̂1 ] is a solution to

  F̃(z) = 0,  with F̃(z) ≡ [ F(x0) ; J(x0)x1 ; Rx1 − e1 ] ≡ [ F(x0) ; ∇x1 F(x0) ; Rx1 − e1 ].   (19)

The value of x1 = x̂1 induces a functional ∇x̂1[x̂0] ∈ Dx̂0(I). If the zero ẑ of F̃ remains multiple, then the Jacobian J̃(ẑ) of F̃(z) at ẑ has a nontrivial nullspace. The deflation can be applied to F̃ in the same way as (19) was applied to F: we seek a solution in (x0, x1, x2, x3) to

  F̃([x0; x1]) = 0,  J̃([x0; x1]) [x2; x3] = 0,  R̃ [x2; x3] = e1.

Using (16)–(18), the equation J̃([x0; x1]) [x2; x3] = 0 implies

  [ (x2·∆x0)F(x0) + (x3·∆x1)F(x0) ;
    (x2·∆x0)∇x1F(x0) + (x3·∆x1)∇x1F(x0) ;
    (x2·∆x0)(Rx1 − e1) + (x3·∆x1)(Rx1 − e1) ]
  = [ ∇x2 F(x0) ; (∇x2∇x1 + ∇x3)F(x0) ; Rx3 ] = 0.

Thus, the second deflation seeks a solution to the system

  F(x0) = 0,  ∇x1F(x0) = 0,  ∇x2F(x0) = 0,  (∇x2∇x1 + ∇x3)F(x0) = 0.   (20)

The third deflation adds variables x4, ···, x7 and equations

  ∇x4F(x0) = 0,  (∇x4∇x1 + ∇x5)F(x0) = 0,  (∇x4∇x2 + ∇x6)F(x0) = 0,
  (∇x4∇x2∇x1 + ∇x4∇x3 + ∇x2∇x5 + ∇x6∇x1 + ∇x7)F(x0) = 0.   (21)

Any solution (x̂0, ···, x̂7) ∈ C^(8s) to (20) and (21) induces eight differential functionals: 1, ∇x̂1, ∇x̂2, ∇x̂4, and

  ∇x̂2∇x̂1 + ∇x̂3,  ∇x̂4∇x̂1 + ∇x̂5,  ∇x̂4∇x̂2 + ∇x̂6,
  ∇x̂4∇x̂2∇x̂1 + ∇x̂4∇x̂3 + ∇x̂2∇x̂5 + ∇x̂6∇x̂1 + ∇x̂7,

all of which vanish on F at x̂0. In general, the k-th deflation step seeks a collection of 2^k differential functionals of order k or less that vanish on the system F at x̂0. It remains to show that those functionals satisfy condition (4). For this purpose, we define differential operators Φν as follows:

  Φν+1 = Σ_{ζ=0}^{2^ν − 1} x_{2^ν + ζ} · ∆_{xζ},  ν = 0, 1, ···.   (22)

Specifically, Φ1 = x1·∆x0, Φ2 = x2·∆x0 + x3·∆x1, and Φ3 = x4·∆x0 + x5·∆x1 + x6·∆x2 + x7·∆x3, with operations such as Φ1F(x0) = ∇x1F(x0), Φ2F(x0) = ∇x2F(x0), Φ2∘Φ1F(x0) = (∇x2∇x1 + ∇x3)F(x0). It is easy to verify that (20) and (21) can be written as

  F = 0, Φ1F = 0, Φ2F = 0, Φ2∘Φ1F = 0, Φ3F = 0, Φ3∘Φ1F = 0, Φ3∘Φ2F = 0, Φ3∘Φ2∘Φ1F = 0.

We have the following lemma.
Lemma 3. Let F = [f1, ···, ft]^T be a system of polynomials fi in the variables x = [x1, ···, xs]^T. Denote F0 = F and x0 = x. Then any isolated solution to the expanded system in the α-th deflation step described as in (19) solves

  Fα ≡ [ Fα−1 ; Φα Fα−1 ] = 0,  α = 1, 2, ···,

in (x0, x1, ···, x_{2^α − 1}), where the Φα are defined in (22).

As a consequence, the following theorem improves the result in [9, Theorem 3.1] by bounding the number of deflation steps with the depth. In other words, the LVZ method deflates the depth, not just the multiplicity.

Theorem 3. Let f1, ···, ft ∈ C[x1, ···, xs] with an isolated multiple zero x̂ of the ideal I = ⟨f1, ···, ft⟩. Then the number k of deflation steps required for the modified LVZ method is bounded by the depth, namely k ≤ δx̂(I). Furthermore, the method generates 2^k differential functionals in the dual space Dx̂(I) as by-products.

Proof. By Lemma 3 and the product rule Φα(f g) = (Φα f) g + (Φα g) f in an induction, the α-th deflation step generates differential functionals of order α that satisfy condition (4). Therefore those functionals belong to Dx̂(I). Since the differential orders of all functionals in Dx̂(I) are bounded by δx̂(I), so is α.

We summarize the modified LVZ method below, at an approximate zero x̃ = [x̃1, ···, x̃s]^T, for a system {f1, ···, ft}:

  • Initialize F = [f1, ···, ft]^T, x0 = [x̃1, ···, x̃s]^T
  • for i = 1, 2, ··· do
      − for k = 0, 1, ··· do
          * calculate the Jacobian J(xk) of F at xk
          * if J(xk) is approxi-rank deficient,
              then solve [ J(xk) ; R ] y = [ 0 ; e1 ] for y = y0,
              reset x0 = [ xk ; y0 ], and break the k loop
          * if the accuracy of xk is satisfactory, exit with x̂ = xk
          * calculate xk+1 = xk − J(xk)† F(xk)
        end for
      − construct F̃ in (19), reset F = F̃
    end for

The LVZ method appears to be robust in computational tests.
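As an illustration of one deflation step (19), the following sympy sketch — our own code; the helper name deflate_once, the choice of R and the sample system are assumptions of the sketch — builds the expanded system F̃ from F:

  from sympy import Matrix, symbols

  def deflate_once(F, xs, R, e1):
      # One deflation step as in (19): append J(x0)*x1 = 0 and R*x1 = e1
      # to the system F(x0) = 0; R is an (s-r) x s matrix of our choosing.
      s = len(xs)
      x1 = Matrix(symbols('y0:%d' % s))    # auxiliary variables playing the role of x1
      J = Matrix(F).jacobian(xs)
      return list(F) + list(J * x1) + list(R * x1 - e1), list(xs) + list(x1)

  x, y = symbols('x y')
  F = [x + y**3, x**2*y - y**4]            # decker2 from the experiments in §7
  R = Matrix([[1, 2]])                     # rank deficiency 1 at (0,0), so s - r = 1
  e1 = Matrix([1])
  G, all_vars = deflate_once(F, (x, y), R, e1)

Here G is the system F̃ of (19) in the four variables (x, y, y0, y1); iterating the construction reproduces systems such as (20) and (21).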
6.2 Special case: dual space of breadth one

Consider the ideal I = ⟨f1, ···, ft⟩ in the case of breadth βx̂(I) = 1. The Hilbert function is {1, 1, ···, 1, 0, ···}, making the depth δx̂(I) = m − 1. Computing experiments show that the LVZ method always requires the maximal number of deflations in this case, with the final system being expanded to a size larger than (2^(m−1) t) × (2^(m−1) s) from t × s. A further modified version of the regularized system is of size roughly (mt) × (ms). Upon solving the system, a complete basis is also obtained for Dx̂(I) as a by-product.

Denote x = x0 and the zero x̂ = x̂0. Notice that, from (11), the breadth βx̂(I) = H(1) = nullity(J(x̂0)) = 1 implies that the system (19), which becomes

  [ J(x̂0) ; b^H ] x1 = [ 0 ; 1 ],

has a unique solution x̂1 ∈ C^s for almost all random vectors b ∈ C^s. Similar to the modified LVZ method, the first deflation step is to set up an expanded system

  G1(x0, x1) = [ F0(x0) ; F1(x0, x1) ],   (23)

with F1(x0, x1) = [ ∇x1F(x0) ; b^H x1 − 1 ]. The system G1(x0, x1) has an isolated zero (x̂0, x̂1). If the Jacobian J1(x0, x1) of G1(x0, x1) is of full rank at (x̂0, x̂1), then the system is regularized. Otherwise, there is a nonzero vector (v0, v1) ∈ C^(2s) such that

  J1(x̂0, x̂1) [ v0 ; v1 ] ≡ [ ∇v0 F(x̂0) ; (∇v0∇x̂1 + ∇v1)F(x̂0) ; b^H v1 ] = 0.   (24)

Since the Jacobian of F at x̂0 is of nullity one, there is a constant γ ∈ C such that v0 = γ x̂1. Equation (24), along with βx̂0(I) = 1 and (v0, v1) ≠ (0, 0), implies γ ≠ 0. Consequently we can choose γ = 1, namely v0 = x̂1. Setting x̂2 = v1, the system

  G2(x0, x1, x2) ≡ [ F0(x0) ; F1(x0, x1) ; F2(x0, x1, x2) ],   (25)

with F2(x0, x1, x2) = [ (∇x1∇x1 + ∇x2)F(x0) ; b^H x2 ], has an isolated zero (x̂0, x̂1, x̂2). We define

  Ψ = Σ_{η=1}^{∞} xη · ∆_{x_{η−1}}.   (26)

Notice that Ψf is in fact a finite sum for any particular polynomial f in (vector) variables, say x0, ···, xσ, since ∆_{xµ} f = 0 for µ > σ + 1. Thus F1(x0, x1) = [ ΨF(x0) ; b^H x1 − 1 ], F2(x0, x1, x2) = ΨF1(x0, x1), and in general

  Fν(x0, ···, xν) = Ψ∘Ψ∘···∘Ψ F1(x0, x1)  (ν − 1 applications of Ψ),  ν ≥ 2.   (27)

For example, besides F1 and F2 in (23) and (25) respectively,

  F3(x0, x1, x2, x3) = [ (∇x1∇x1∇x1 + 3∇x1∇x2 + ∇x3)F0(x0) ; b^H x3 ].

If, say, F3 = 0 at (x̂0, x̂1, x̂2, x̂3), we obtain a functional p → (∇x̂1∇x̂1∇x̂1 + 3∇x̂1∇x̂2 + ∇x̂3) p(x0) for p ∈ C[x0] that vanishes on the system F. The original system F = 0 provides a trivial functional ∂0···0 : p → p(x̂0). The lemma below ensures those functionals are in the dual space.

Lemma 4. Let F = [f1, ···, ft]^T be a polynomial system generating an ideal I with a zero x̂ ∈ C^s. Denote F0 = F, x̂0 = x̂ and x0 = x. For any γ ∈ {1, 2, ···}, let (x̂0, x̂1, ···, x̂γ) be a zero of

  Gγ(x0, x1, ···, xγ) = [ F0(x0) ; F1(x0, x1) ; ··· ; Fγ(x0, ···, xγ) ].   (28)

Then the functionals derived from Gγ(x̂0, ···, x̂γ) = 0 are linearly independent members of the dual space Dx̂0(I).

Theorem 4. Let x̂ be an isolated m-fold zero of the polynomial ideal I = ⟨f1, ···, ft⟩ with breadth βx̂(I) = 1. Denote F0 = [f1, ···, ft]^T, x̂0 = x̂ and x0 = x. Then there is an integer γ ≤ δx̂(I) such that the system Gγ in (28) has a simple zero (x̂0, x̂1, ···, x̂γ) that induces γ + 1 linearly independent functionals in Dx̂0(I).
Similar to Theorem 3, Theorem 4 can be proved for γ ≤ δx̂(I). It is our conjecture that γ = δx̂0(I) always holds in the breadth-one case; we have not seen an exception in our extensive computing experiments. An important implication of γ = δx̂0(I) is that the zero (x̂0, x̂1, ···, x̂γ) of Gγ induces a complete basis for the dual space Dx̂(I). Given an approximate zero x̃ = [x̃1, ···, x̃s]^T of the system F = [f1, ···, ft]^T, the breadth-one algorithm can be summarized below.
Algorithm: BreadthOne
• Initialize F0 = [f1, ···, ft]^T, x̃0 = x̃
• for γ = 1, 2, ··· do
    − solve Fγ(x̃0, ···, x̃γ−1, xγ) = 0 for a least squares solution xγ = x̃γ
    − set z0 = [ x̃0^T, ···, x̃γ^T ]^T
    − for k = 0, 1, ··· do
        * calculate the Jacobian Jγ(zk) of Gγ at zk
        * if Jγ(zk) is approxi-rank deficient, break the k loop
        * if the accuracy of zk is satisfactory, exit the γ loop with z̃ = zk
        * refine zk using the Gauss-Newton iteration zk+1 = zk − Jγ(zk)† Gγ(zk)
      end for
  end for
• multiplicity m = γ + 1
• extract the zero x̃0 of F0 from z̃, and obtain γ + 1 functionals from the components of z̃

Algorithm BreadthOne converges locally to the multiple zero beyond the "attainable accuracy", with a quadratic rate. Moreover, it also refines the basis for the dual space. An important application of Algorithm BreadthOne is that it regularizes the zero-finding problem for a univariate equation f(x) = 0, where the breadth is always one.
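The Gauss-Newton step zk+1 = zk − Jγ(zk)† Gγ(zk) used above amounts to a least-squares solve. Below is a minimal numpy sketch of it — our own toy illustration, using the breadth-one system G1 of (23) for the univariate double root of f(x) = x² with b = 1, not the authors' implementation:

  import numpy as np

  def gauss_newton(G, J, z0, maxit=20, tol=1e-14):
      # z_{k+1} = z_k - J(z_k)^+ G(z_k) on an over-determined system
      z = np.asarray(z0, dtype=float)
      for _ in range(maxit):
          step, *_ = np.linalg.lstsq(J(z), G(z), rcond=None)
          z = z - step
          if np.linalg.norm(step) < tol * (1 + np.linalg.norm(z)):
              break
      return z

  # G1 for f(x) = x^2: rows are f, f'(x)*x1, and b^H x1 - 1 with b = 1
  G = lambda z: np.array([z[0]**2, 2*z[0]*z[1], z[1] - 1.0])
  Jf = lambda z: np.array([[2*z[0], 0.0], [2*z[1], 2*z[0]], [0.0, 1.0]])
  z = gauss_newton(G, Jf, [0.3, 0.8])   # converges to (0, 1): the double root x = 0

The regularized system has a full-rank Jacobian at (0, 1), which is exactly why the quadratic rate is recovered despite the multiplicity of the original root.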
7. COMPUTATIONAL EXPERIMENT

Preliminary implementations of Algorithm MultStructure have been made in both symbolic and numerical computation, as well as of Algorithm BreadthOne with symbolic rank-revealing. Building a comprehensive test suite of polynomial systems is also underway. A large number of problems have been tested. We present results on the following benchmark problems with selected multiple zeros. Most of the systems are well documented in the literature for testing algorithms in polynomial system solving.

1. cbms1 [23]: x³ − yz, y³ − xz, z³ − xy, at (0, 0, 0).
2. cbms2 [23]: x³ − 3x²y + 3xy² − y³ − z², z³ − 3z²x + 3zx² − x³ − y², y³ − 3y²z + 3yz² − z³ − x², at (0, 0, 0).
3. mth191 [9]: x³ + y² + z² − 1, x² + y³ + z² − 1, x² + y² + z³ − 1, at (0, 1, 0).
4. decker2 [3]: x + y³, x²y − y⁴, at (0, 0).
5. Ojika2 [17]: x² + y + z − 1, x + y² + z − 1, x + y + z² − 1, at (0, 0, 1) and (1, 0, 0).
6. Ojika3 [17]: x + y + z − 1, 2x³ + 5y² − 10z + 5z³ + 5, 2x + 2y + z² − 1, at (0, 0, 1) and (−5/2, 5/2, 1).
7. Caprasse [16]: −x1³x3 + 4x1x2²x3 + 4x1²x2x4 + 2x2³x4 + 4x1² − 10x2² + 4x1x3 − 10x2x4 + 2, −x1x3³ + 4x2x3²x4 + 4x1x3x4² + 2x2x4³ + 4x1x3 + 4x3² − 10x2x4 − 10x4² + 2, x2²x3 + 2x1x2x4 − 2x1 − x3, 2x2x3x4 + x1x4² − x1 − 2x3, at (2, −i√3, 2, i√3).
8. KSS [8]: fσ(x1, ···, x5) = xσ² + Σ_{ν=1}^{5} xν − 2xσ − 4 for σ = 1, ···, 5, at (1, 1, 1, 1, 1).
9. Cyclic nine [4]: fν = Σ_{i=0}^{8} Π_{j=i}^{i+ν} xj for ν = 0, ···, 7, and f8 = 1 − Π_{j=0}^{8} xj, with "cyclic" variables x_{8+µ} = xµ for µ = 0, ···, 7. The selected 4-fold zero is Z9 = (z0, z1, z2, z0, −z2, −z1, z0, −z2, −z1) with z0 = −.9396926 − .3520201i, z1 = −2.4601472 − .8954204i and z2 = −.3589306 − .1306401i.
10. DZ1: x1⁴ − x2x3x4, x2⁴ − x1x3x4, x3⁴ − x1x2x4, x4⁴ − x1x2x3, at (0, 0, 0, 0). Modified from cbms1.
11. DZ2: x⁴, x²y + y⁴, z + z² − 7x³ − 8x², at (0, 0, −1).
12. DZ3: 14x + 33y − 3√5(x² + 4xy + 4y² + 2) + √7 + x³ + 6x²y + 12xy² + 8y³, 41x − 18y − √5 + 8x³ − 12x²y + 6xy² − y³ + 3√7(4xy − 4x² − y² − 2), with coefficients rounded to 5 digits, at the approximate zero Z3 = (1.5055, 0.36528).

  System (zero)               m    Hilbert function                β  δ   defl.  Maple  Mathematica
  cbms1 (0,0,0)               11   1,3,3,3,1                       3  4   1      1      3
  cbms2 (0,0,0)               8    1,3,3,1                         3  3   1      NR     NR
  mth191 (0,1,0)              4    1,2,1                           2  2   1      1      2
  decker2 (0,0)               4    1,1,1,1                         1  3   3      4      4
  Ojika2 (0,0,1)              2    1,1                             1  1   1      1      2
  Ojika2 (1,0,0)              2    1,1                             1  1   -      -      -
  Ojika3 (0,0,1)              4    1,2,1                           2  2   -      -      -
  Ojika3 (−5/2,5/2,1)         2    1,1                             1  1   -      -      -
  Caprasse (2,−i√3,2,i√3)     4    1,2,1                           2  2   -      -      -
  Cyclic 9 (Z9)               4    1,2,1                           2  2   -      -      -
  KSS (1,1,1,1,1)             16   1,4,6,4,1                       4  4   -      -      -
  DZ1 (0,0,0,0)               131  1,4,10,16,22,25,22,16,10,4,1    4  10  -      -      -
  DZ2 (0,0,−1)                16   1,2,3,3,2,2,2,1                 2  7   -      -      -
  DZ3 (Z3)                    5    1,1,1,1,1                       1  4   -      -      -

Table 1: Test results on benchmark problems. Incorrect results from the CAS's are shaded in the original. "NR" means no response in a reasonable amount of time.
The CAS packages Maple and Mathematica are used to solve the systems, with the multiplicity extracted from the output zeros. Table 1 lists the results in comparison with the multiplicity structures computed by Algorithm MultStructure. Also in the table are the deflation steps required by the LVZ method along with the modified version. The tests show that Maple either reports the correct multiplicity or skips multiplicity identification, while Mathematica makes the attempt but often underestimates the multiplicities. Neither Maple nor Mathematica is implemented to calculate the other structural invariants of multiple zeros. The solve function in Singular [7], when it terminates successfully, accurately identifies the multiplicities for all of the above systems with exact coefficients. Like the other CAS packages, Singular does not appear to be designed to handle approximate systems or zeros. For example, Singular outputs a cluster of "simple" zeros around Z3 for the (approximate) system DZ3, while MultStructure correctly identifies the underlying multiplicity 5 with Hilbert function {1, 1, 1, 1, 1, 0, ···}. In summary, our codes for Algorithm MultStructure, as well as BreadthOne in breadth-one cases, accurately identify the multiplicity and output the complete multiplicity structure, with bases for the dual spaces along with the Hilbert function.

An example of an analytic system. The primary objective of this paper is computing multiplicity structures for polynomial systems. The methods in this paper can nonetheless be applied to systems of analytic equations, since the construction of the multiplicity matrices only requires that partial derivatives be obtained. Consider the simple system

  f1(x, y) = 1 − cos(x²),  f2(x, y) = sin(y) + x² e^(x+y).

Algorithm MultStructure identifies multiplicity 4 at the zero (0, 0), with the dual space basis ∂0, ∂x, ∂x² − ∂y, and ∂x³ − ∂xy + ∂y.

Acknowledgment: We would like to thank Jan Verschelde and Anton Leykin for valuable discussions and for providing their results in [9]. We are grateful to Teo Mora and Michael Möller for sharing their knowledge of the history of the duality approach to multiplicity. In particular, Teo Mora pointed out reference [11], kindly provided his manuscript [15] and made valuable suggestions on this paper. We also thank the anonymous referees for their helpful comments.
8.
REFERENCES
[1] R. M. Corless, P. M. Gianni, and B. M. Trager, A reordered Schur factorization method for zero-dimensional systems with multiple roots. Proc. ISSAC ’97, AMS Press, pp 133–140. [2] D. Cox, J. Little, and D. O’Shea, Using Algebraic Geometry, Springer Verlag, 1998. [3] D. W. Decker, H. B. Keller, and C. T. Kelly, Convergence rate for Newton’s method at singular points, SIAM J. Numer. Anal., 20 (1983), pp. 296–314. [4] J. C. Faug` ere, A new efficient algorithm for computing G¨ obner bases, Journal of Pure and Applied Algebra, 139 (1998), pp. 61–88. [5] W. Fulton, Intersection Theory, Springer Verlag, Berlin, 1984. [6] G.-M. Greuel and G. Pfister, A Singular Introduction to Commutative Algebra, Springer Verlag, 2002.
Algorithms for the Non-monic Case of the Sparse Modular GCD Algorithm∗

Jennifer de Kleine, Department of Mathematics, Simon Fraser University, Burnaby, B.C., Canada.
Michael Monagan, Department of Mathematics, Simon Fraser University, Burnaby, B.C., Canada.
Allan Wittkopf, Maplesoft, 615 Kumpf Drive, Waterloo, Ont., Canada.
[email protected].
[email protected].
[email protected].
ABSTRACT
Let G = (4y² + 2z)x² + (10y² + 6z) be the greatest common divisor (gcd) of two polynomials A, B ∈ ℤ[x, y, z]. Because G is not monic in the main variable x, the sparse modular gcd algorithm of Richard Zippel cannot be applied directly as one is unable to scale univariate images of G in x consistently. We call this the normalization problem. We present two new sparse modular gcd algorithms which solve this problem without requiring any factorizations. The first, a modification of Zippel's algorithm, treats the scaling factors as unknowns to be solved for. This leads to a structured coupled linear system for which an efficient solution is still possible. The second algorithm reconstructs the monic gcd x² + (5y² + 3z)/(2y² + z) from monic univariate images using a sparse, variable at a time, rational function interpolation algorithm.

Categories and Subject Descriptors: I.1.2 [Symbolic and Algebraic Manipulation]: Algorithms – Algebraic algorithms; F.2.1 [Analysis of Algorithms and Problem Complexity]: Numerical algorithms and problems – Computations on Polynomials.
General Terms: Algorithms.
Keywords: Zippel's Algorithm, Polynomial Greatest Common Divisors, Sparse Multivariate Polynomials, Modular Algorithms, Probabilistic Algorithms.

1. INTRODUCTION
Let A, B be polynomials in ℤ[x1, ..., xn]. Let G be their greatest common divisor (gcd) and let Ā = A/G, B̄ = B/G be their cofactors. Our problem is to compute G, Ā and B̄. In [12] (see also [14] for a more accessible reference) Zippel presented a Las Vegas algorithm for computing G when G is monic in the main variable x1. Zippel's algorithm improves on the running time of Brown's algorithm (see [1]) when G is also sparse. Zippel's algorithm is an output sensitive algorithm: unlike Brown's algorithm, the number of univariate images depends on the size of G, and not on A and B. Most computer algebra systems use either Zippel's algorithm or Wang's EEZ-GCD algorithm (see [10]) for multivariate gcd computation. Zippel's algorithm is implemented in Macsyma, Magma, and Mathematica. A parallel implementation is described by Rayes, Wang and Weber in [8]. Previous work done to improve the asymptotic efficiency includes that of Zippel in [13], and Kaltofen and Lee in [5]. In this paper we present two new algorithms that extend Zippel's algorithm to the case where G is not monic in the main variable x1. In Section 2 we give a description of Zippel's algorithm and previous approaches made to extend it to the non-monic case. In Section 3 we describe our first solution and in Section 4 our second solution. We have implemented both algorithms in Maple. In Section 5 we make some remarks about their efficiency and implementation. Although our algorithms do not require any polynomial factorizations, both require that the content of G in the main variable x1 is 1. The content of G can be computed efficiently by computing the gcd of one coefficient of A, the smallest, with a random linear combination of the other coefficients of A and all coefficients of B in x1. This requires one recursive gcd computation in ℤ[x2, ..., xn].

∗Supported by NSERC of Canada and the MITACS NCE of Canada.

2. ZIPPEL'S ALGORITHM
There are two subroutines in the algorithm. Subroutine M, the main subroutine, computes G = gcd(A, B) where A, B ∈ ℤ[x1, ..., xn]. It does this by computing gcd(A, B) modulo a sequence of primes p1, p2, ... and then reconstructs G from these images by applying the Chinese Remainder Theorem. The first image G1 is computed by calling subroutine P with inputs A mod p1 and B mod p1. Subroutine P, which is recursive, computes G = gcd(A, B) where A, B ∈ ℤp[x1, ..., xn] for a prime p as follows. If n = 1 it uses the Euclidean algorithm. If n > 1 it computes gcd(A, B) at a sequence of random points α1, α2, ... ∈ ℤp for xn and then reconstructs G ∈ ℤp[x1, ..., xn] from the images using dense interpolation, e.g., Newton interpolation. The first image G1 is computed by calling subroutine P recursively with inputs A mod ⟨xn − α1⟩ and B mod ⟨xn − α1⟩. In both subroutines, after the first image G1 is computed, subsequent images are computed using sparse interpolations. This involves solving a set of independent linear systems which are constructed based on the form of G1.
The algorithm assumes that G1 is of the correct form, that is, all non-zero terms of the gcd G are present in G1. This will be true with high probability if the primes are sufficiently large and the evaluation points are chosen at random from ℤp. We identify three classes of primes and evaluation points which cause problems in the algorithm.

Definition 1. (bad prime and evaluation) A prime p is bad if degx1(G mod p) < degx1(G). An evaluation point (α1, ..., αn−1) ∈ ℤp^{n−1} is bad if degx1(G mod I) < degx1(G) where I = ⟨x2 − α1, ..., xn − αn−1⟩.

For example, if A = (3yx + 1)(x − y) and B = (3yx + 1)(x + y + 1) then 3 is a bad prime and y = 0 is a bad evaluation point. These must be avoided so that the univariate images can be scaled consistently and so that unlucky primes and evaluations can be detected. We may avoid them by choosing p and (α1, ..., αn−1) such that L(α1, ..., αn−1) ≢ 0 mod p, where L = lcx1(A) is the leading coefficient of A.

Definition 2. (unlucky prime and evaluation) A prime p is unlucky if the cofactors are not relatively prime modulo p, i.e., degx1(gcd(Ā mod p, B̄ mod p)) > 0. Similarly an evaluation point (α1, ..., αn−1) ∈ ℤp^{n−1} is unlucky if degx1(gcd(Ā mod I, B̄ mod I)) > 0 where I = ⟨x2 − α1, ..., xn − αn−1⟩.

For example, if Ā = 7x + 6y and B̄ = 12x + y then p = 5 is an unlucky prime and y = 0 is an unlucky evaluation point. These must be avoided if G is to be correctly reconstructed. Unlike bad primes and bad evaluations, they cannot be ruled out in advance. Instead they identify themselves when we encounter a univariate image in x1 of higher degree than previous univariate images.

Definition 3. (missing terms) A prime p is said to introduce missing terms if any integer coefficient of G vanishes modulo p. Similarly, an evaluation xn = α is said to introduce missing terms if any coefficient in ℤp[xn] of G vanishes at xn = α.

For example, if G = x² + 5y³x + 35 ∈ ℤ[x, y], the primes 5 and 7 and the evaluation y = 0 cause terms in G to vanish. Zippel's algorithm cannot reconstruct G if it uses them.

Example 1. (the normalization problem) Consider computing the non-monic bivariate gcd G = (y + 50)x³ + 100y ∈ ℤ[x, y] from the input polynomials A = (x − y + 1)G and B = (x + y + 1)G. Here G has leading coefficient y + 50 in the main variable x. Suppose we compute our first bivariate image modulo p1 = 13 and obtain G1 = (y + 11)x³ + 9y (mod 13). We proceed to compute a second image using sparse interpolation working modulo 17. We assume G has the form Gf = (y + α)x³ + βy for some α, β ∈ ℤ17. We have at most one unknown per coefficient in x so we evaluate at one random point, y = 5, and compute the univariate gcd x³ + 6 (mod 17). This image is unique up to a scaling factor m. We evaluate Gf at y = 5 and equate to obtain (5 + α)x³ + 5β = m(x³ + 6). The normalization problem is to determine m. In our example, if we knew L(y) = y + 50, the leading coefficient of G, then m should be L(5) = 4 (mod 17) and we would have (5 + α)x³ + 5β = 4x³ + 7. Solving for α and β in ℤ17, we would obtain G2 = (y + 16)x³ + 15y, the second bivariate image.
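As a quick check of the numbers in Example 1, the following fragment (a sketch we added for illustration, not part of the paper) recomputes the correct scaling factor m = L(5) mod 17 and solves for α and β:

    # Example 1 check: G = (y + 50)x^3 + 100y, second image modulo p2 = 17, y = 5.
    p = 17
    m = (5 + 50) % p                 # m = L(5) where L(y) = y + 50; here m = 4
    image = (1, 6)                   # monic univariate gcd x^3 + 6 (mod 17)
    rhs = (m * image[0] % p, m * image[1] % p)   # scaled image: 4x^3 + 7
    alpha = (rhs[0] - 5) % p                     # (5 + alpha) = 4  =>  alpha = 16
    beta = rhs[1] * pow(5, -1, p) % p            # 5*beta = 7       =>  beta = 15
    assert (alpha, beta) == (16, 15)             # G2 = (y + 16)x^3 + 15y (mod 17)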
Let L = lcx1(G) be the leading coefficient of G. One solution to the normalization problem is to multiply the monic univariate images by the image of a known multiple of L. The solution used in the Macsyma implementation of Zippel's algorithm is to use γ = gcd(lcx1(A), lcx1(B)). Now L divides γ, hence γ = ∆ × L for some ∆ in ℤ[x2, ..., xn]. If ∆ = 1, then this approach works very well. However, if ∆ is a non-trivial polynomial then Zippel's algorithm would reconstruct ∆ × G, which might have many more terms than G, and it would need to remove ∆ from ∆ × G, which would require another gcd computation. A non-trivial example where this will happen is when computing gcd(A, A′), the first gcd computation in a multivariate square-free factorization algorithm. An ingenious solution is presented by Wang in [9]. Wang determines L by factoring one of the leading coefficients of the input polynomials, A say, then heuristically determining which factors belong to G and which belong to Ā. If A and B are sparse, the factorization is usually not hard. Kaltofen in [4] shows how to reduce the factorization to a bivariate factorization and how to make Wang's heuristic work for coefficient rings other than ℤ. We now present our solutions. Neither requires any factorizations.

3. ALGORITHM LINZIP
In Zippel's algorithm, if any coefficient of G with respect to the main variable x1 is a monomial in x2, ..., xn, then the normalization is straightforward. Consider the gcd problem from Example 1. Notice that the O(x⁰) term in our first gcd image G1 = (y + 11)x³ + 9y has a single-term coefficient, 9y. Since we know the exact form, we can scale our univariate gcd images based on this term. Our assumed form becomes Gf = (αy + β)x³ + (1)y for some α, β ∈ ℤ17. Now we have two unknowns in our O(x³) term so we need two evaluation points, neither of which may be 0. We choose y = 5, 7, to get the univariate gcds x³ + 6 (mod 17) and x³ + 9 (mod 17), respectively. Now we scale the first univariate gcd by 5/6 (mod 17), and the second by 7/9 (mod 17) before equating, giving (5α + β)x³ + 5 = 15x³ + 5 and (7α + β)x³ + 7 = 14x³ + 7. Solving for α and β in ℤ17 gives us the bivariate image (8y + 9)x³ + y, which is a scalar multiple of the gcd modulo 17. Thus if (at any level in the recursion) an image has a coefficient in x1 which is a single term, the normalization problem is easily solved. The normalization problem essentially reduces to scaling of the univariate gcd images so that the solution of the linear system produces a correct scalar multiple of the gcd.

The approach followed now for the general case is quite simple in concept: treat the scaling factors of the computed univariate gcds as unknowns as well. This results in larger linear systems that may require additional univariate gcd images. We call this the multiple scaling case (as opposed to the single scaling case). Scaling of both the univariate gcds and the coefficients of the assumed form of the multivariate gcd results in a system that is under-determined by exactly 1 unknown (the computation is only determined up to a scaling factor). We fix the scaling factor of the first image to 1. The following example illustrates this approach.

Example 2. Consider the computation of the bivariate gcd (3y² − 90)x³ + 12y + 100. We obtain g ≡ x³y² + 9x³ + 4y + 3 (mod 13), and the assumed form of the gcd gf = αx³y² + βx³ + γy + σ. Instead of computing two univariate gcd images for the new prime p2 = 17, we compute three,
choosing y = 1, 2, 3 and obtain the gcds x³ + 12, x³ + 8, and x³ respectively. We form the modified system as follows:

    αx³ + βx³ + γ + σ  = m1(x³ + 12) = x³ + 12,
    4αx³ + βx³ + 2γ + σ = m2(x³ + 8),
    9αx³ + βx³ + 3γ + σ = m3(x³),

where m2, m3 are the new scaling factors, and we have set the first scaling factor m1 to 1. Solving this system yields α = 7, β = 11, γ = 11, σ = 1, with scaling factors m2 = 5, m3 = 6, so our new gcd image is given by g ≡ 7x³y² + 11x³ + 11y + 1 (mod 17), which is consistent with our gcd.
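The coupled system in Example 2 is small enough to solve directly. The sketch below (our illustration; the paper's implementation is in Maple and C) fixes m1 = 1 and runs Gauss-Jordan elimination over ℤ17 on the six equations in the unknowns (α, β, γ, σ, m2, m3):

    # Example 2: solve the multiple-scaling system over Z_17 with m1 fixed to 1.
    # Columns: alpha, beta, gamma, sigma, m2, m3 | right-hand side.
    p = 17
    A = [[1, 1, 0, 0,  0,  0,  1],   # y=1, x^3 terms:  a + b          = m1*1  = 1
         [0, 0, 1, 1,  0,  0, 12],   # y=1, constants:  g + s          = m1*12 = 12
         [4, 1, 0, 0, -1,  0,  0],   # y=2, x^3 terms:  4a + b - m2    = 0
         [0, 0, 2, 1, -8,  0,  0],   # y=2, constants:  2g + s - 8*m2  = 0
         [9, 1, 0, 0,  0, -1,  0],   # y=3, x^3 terms:  9a + b - m3    = 0
         [0, 0, 3, 1,  0,  0,  0]]   # y=3, constants:  3g + s - 0*m3  = 0
    A = [[e % p for e in row] for row in A]
    n = 6
    for c in range(n):                              # Gauss-Jordan elimination mod p
        r = next(i for i in range(c, n) if A[i][c])
        A[c], A[r] = A[r], A[c]
        inv = pow(A[c][c], -1, p)
        A[c] = [v * inv % p for v in A[c]]
        for i in range(n):
            if i != c and A[i][c]:
                f = A[i][c]
                A[i] = [(A[i][j] - f * A[c][j]) % p for j in range(n + 1)]
    assert [row[n] for row in A] == [7, 11, 11, 1, 5, 6]   # matches the text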
We explain why we fix the multiplier m1 to be 1 instead of fixing a coefficient in the assumed form of the gcd to be 1. In Example 2, suppose we set α = 1. Suppose we then use the evaluation y = 0, which is not bad. Notice that α should be 0 if y = 0. Attempting to set β = 1 will have the same problem for the prime p = 5, which is not bad. Fixing a multiplier to be 1 cannot cause an inconsistency because the algorithm avoids bad primes and bad evaluation points.

One might wonder about the efficiency of the multiple scaling case since we are constructing a system that ties together all unknowns through the multipliers. In the single scaling case, each degree in x1 has an independent subsystem. The trick is to realize that the resulting system is highly structured, and the structure can be exploited to put the solution expense of the multiple scaling case on the same order as the solution expense of the single scaling case.

Example 3. Consider the linear system that must be solved to compute the gcd for a problem with the assumed form gf = (a2y² + a1y + a0)x² + (b2y³ + b1y + b0)x + (c1y² + c0). We require 3 images to have sufficiently many equations to solve for all unknowns. The resulting linear system has 9 equations in the 10 unknowns. It has the structure shown in Figure 1 below. The equations are ordered by decreasing degree in x, then by image number. The unknowns are in the same order as in the image, followed by the scaling factors. The c's denote (possibly) non-zero entries. All entries not shown are zero.

[Figure 1 shows the 9 × 11 matrix of this system applied to the vector of unknowns (a2, a1, a0, b2, b1, b0, c1, c0, 1, m2, m3) with zero right-hand side: three rectangular blocks of (possibly) non-zero entries, one per coefficient of x, bordered by the columns for the fixed first multiplier and the multipliers m2, m3.]

Figure 1: Structure for the multiple scaling case.

The solution can be easily computed by solution of a number of smaller subsystems corresponding to the rectangular blocks of non-zero entries augmented with the multiplier columns. Once the subsystems are upper triangular, the remaining rows, only involving the multipliers, can be used to compute the multiplier values, which can then be back-substituted into the subsystems to obtain the image coefficients.
This approach solves the normalization problem but it also introduces another difficulty, which is illustrated by the following example.

Example 4. Consider the computation of the bivariate gcd (y + 2)x³ + 12y² + 24y. By a call to algorithm P we obtain our first image, g1 = x³y + 2x³ + 12y² + 11y (mod 13), and the assumed form of the gcd, gf = αx³y + βx³ + γy² + σy. We require at least three univariate gcd images for the new prime p2 = 17. Choosing y = 1, 2, 3 we obtain the gcds x³ + 12, x³ + 7, and x³ + 2 respectively, and form the modified system as follows:

    αx³ + βx³ + γ + σ   = x³ + 12,
    2αx³ + βx³ + 4γ + 2σ = m2(x³ + 7),
    3αx³ + βx³ + 9γ + 3σ = m3(x³ + 2).

In attempting to solve this system, we find that it is under-determined, so we add a new evaluation point, y = 4, obtaining a gcd of x³ + 14, and the new equation

    4αx³ + βx³ + 16γ + 4σ = m4(x³ + 14).

The new system of equations is still under-determined. In fact, the system remains under-determined if we continue to choose new evaluation points for y. This is the case for any chosen prime and set of evaluations, so the algorithm fails to find the gcd for this problem.

What is not necessarily obvious from Example 4 is the cause of the failure, which is the presence of a content in the gcd with respect to the main variable x, namely y + 2. In Example 4 the content in y is absorbed into the multipliers, so we are unable to obtain a solution for the coefficients in our candidate form, as only the relative ratio between terms can be computed. Unfortunately, even if g, the gcd of a and b, has no content, certain choices of primes and evaluation points can cause an unlucky content to appear in the gcd.

Definition 4. (Unlucky Content) Given g ∈ ℤ[x1, ..., xn] with contx1(g) = 1, a prime p is said to introduce an unlucky content if contx1(g mod p) ≠ 1. Similarly, for g ∈ ℤp[x1, ..., xn] with contx1(g) = 1, an evaluation xi = αi is said to introduce an unlucky content if contx1(g mod ⟨xi − αi⟩) ≠ 1.

Consider, for example, g = x(y + 1) + y + 14. If we choose p = 13 then g mod p has a content of y + 1, while for any other prime no content is present. Since unlucky contents are rare, we design the algorithm so this problem is not detected in advance, but rather through its effect, so that its detection does not become a bottleneck of the algorithm.
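The example above is easy to reproduce. The following sympy fragment (ours, relying on sympy's trunc for coefficient reduction mod p) confirms that g = x(y + 1) + y + 14 has trivial content in x over ℤ but content y + 1 modulo 13:

    from functools import reduce
    from sympy import symbols, gcd, trunc, Poly

    x, y = symbols('x y')
    g = x*(y + 1) + y + 14

    def content_x(f):
        # gcd of the coefficients of f viewed as a polynomial in x
        return reduce(gcd, Poly(f, x).all_coeffs())

    print(content_x(g))              # 1: no content over the integers
    print(content_x(trunc(g, 13)))   # y + 1: an unlucky content modulo 13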
We now present the LINZIP M algorithm, which computes the gcd in ℤ[x1, ..., xn] from a number of images in ℤp[x1, ..., xn], and the LINZIP P algorithm, which computes the gcd in ℤp[x1, ..., xn] from a number of images in ℤp[x1, ..., xn−1].

Algorithm 1 (LINZIP M).
Input: a, b ∈ ℤ[x1, ..., xn] with gcd(contx1(a), contx1(b)) = 1 and degree bounds dx on the gcd in x1, ..., xn.
Output: g = gcd(a, b) ∈ ℤ[x1, ..., xn].
1 Compute the scaling factor: γ = gcd(lcx1,...,xn(a), lcx1,...,xn(b)) ∈ ℤ.
2 Choose a random prime p such that γp = γ mod p ≠ 0, and set ap = a mod p, bp = b mod p, then compute from these a modular gcd image gp ∈ ℤp[x1, ..., xn] with a call to LINZIP P. If the algorithm returns Fail, repeat, otherwise set dx1 = degx1(gp) and continue.
3 Assume that gp has no missing terms, and that the prime is not unlucky. We call the assumed form gf. There are two cases here:
3.1 If there exists a coefficient of x1 in gf that is a monomial, then we can use single scaling and normalize by setting the integer coefficient of that monomial to 1. Count the largest number of terms in any coefficient of x1 in gf, calling this nx.
3.2 If there is no such coefficient, then multiple scaling must be used. Compute the minimum number of images needed to determine gf with multiple scaling, calling this nx.
4 Set gm = (γp / lcx1,...,xn(gp)) × gp mod p and m = p.
5 Repeat
5.1 Choose a new random prime p such that γp = γ mod p ≠ 0, and set ap = a mod p, bp = b mod p.
5.2 Set S = ∅, ni = 0.
5.3 Repeat
5.3.1 Choose α2, ..., αn ∈ ℤp \ {0} at random such that degx1(ap mod I) = degx1(a), degx1(bp mod I) = degx1(b) where I = ⟨x2 − α2, ..., xn − αn⟩. Set a1 = ap mod I, b1 = bp mod I.
5.3.2 Compute g1 = gcd(a1, b1).
5.3.3 If degx1(g1) < dx1 our original image and form gf and degree bounds were unlucky, so set dx1 = degx1(g1) and goto 2.
5.3.4 If degx1(g1) > dx1 our current image g1 is unlucky, so goto 5.3.1, unless the number of failures > min(2, ni), in which case assume p is unlucky and goto 5.1.
5.3.5 For single scaling, check that the scaling term in the image g1 is present. If not, the assumed form must be wrong, so goto 2.
5.3.6 Add the equations obtained from equating coefficients of g1 and the evaluation of gf mod I to S, and set ni = ni + 1.
Until ni ≥ nx.
5.4 We may now have a sufficient number of equations in S to solve for all unknowns in gf mod p, so attempt this now, calling the result gp.
5.5 If the system is inconsistent our original image is incorrect (missing terms or unlucky), so goto 2.
5.6 If the system is under-determined, then record the degrees of freedom, and if this has occurred twice before with the same degrees of freedom then assume that an unlucky content problem was introduced by the current prime p, so goto 5.1. Otherwise we need more images, so goto 5.3.1.
5.7 The system is consistent and determined. Scale the new image: set gp = (γp / lcx1,...,xn(gp)) × gp mod p. Apply the Chinese remainder theorem to update gm by combining the coefficients of gp ∈ ℤp[x1, ..., xn] with gm ∈ ℤm[x1, ..., xn], updating m = m × p.
Until gm has stopped changing for one iteration.
7 Remove integer content from gm, placing the result in gc. Test if gc | a and gc | b. If yes, return gc. Otherwise we need more primes, so goto 5.1.

Algorithm 2 (LINZIP P).
Input: a, b ∈ ℤp[x1, ..., xn], a prime p, and degree bounds dx on the gcd in x1, ..., xn.
Output: g = gcd(a, b) ∈ ℤp[x1, ..., xn] or Fail.
0 If the gcd of the inputs has content in xn return Fail.
1 Compute the scaling factor: γ = gcd(lcx1,...,xn−1(a), lcx1,...,xn−1(b)) ∈ ℤp[xn].
2 Choose v ∈ ℤp \ {0} at random such that γ mod ⟨xn − v⟩ ≠ 0. Set av = a mod ⟨xn − v⟩, bv = b mod ⟨xn − v⟩, then compute gv = gcd(av, bv) ∈ ℤp[x1, ..., xn−1] with a recursive call to LINZIP P (n > 2) or via the Euclidean algorithm (n = 2). If for n > 2 the algorithm returns Fail or for n = 2 we have degx1(gv) > dx1 then return Fail, otherwise set dx1 = degx1(gv) and continue.
3 Assume that gv has no missing terms, and that the evaluation is not unlucky. We call the assumed form gf. There are two cases here:
3.1 If there exists a coefficient of x1 in gf that is a monomial, then we can use single scaling and normalize by setting the integer coefficient of that monomial to 1. Count the largest number of terms in any coefficient of x1 in gf, calling this nx.
3.2 If there is no such coefficient, then multiple scaling must be used. Compute the minimum number of images needed to determine gf with multiple scaling, calling this nx.
4 Set gseq = (γ(v) / lcx1,...,xn−1(gv)) × gv mod p and vseq = v.
5 Repeat
5.1 Choose a new random v ∈ ℤp \ {0} such that γ mod ⟨xn − v⟩ ≠ 0 and set av = a mod ⟨xn − v⟩, bv = b mod ⟨xn − v⟩.
5.2 Set S = ∅, ni = 0.
5.3 Repeat
5.3.1 Choose α2, ..., αn−1 ∈ ℤp \ {0} at random such that degx1(av mod I) = degx1(a) and degx1(bv mod I) = degx1(b) where I = ⟨x2 − α2, ..., xn−1 − αn−1⟩. Set a1 = av mod I, b1 = bv mod I.
5.3.2 Compute g1 = gcd(a1, b1).
5.3.3 If degx1(g1) < dx1 then our original image and form gf and degree bounds were unlucky, so set dx1 = degx1(g1) and goto 2.
5.3.4 If degx1(g1) > dx1 then our current image g1 is unlucky, so goto 5.3.1, unless the number of failures > min(1, ni), in which case assume xn = v is unlucky and goto 5.1.
5.3.5 For single scaling, check that the scaling term in the image g1 is present. If not, the assumed form must be wrong, so goto 2.
5.3.6 Add the equations obtained from equating coefficients of g1 and the evaluation of gf mod I to S, and set ni = ni + 1.
Until ni ≥ nx.
5.4 We should now have a sufficient number of equations in S to solve for all unknowns in gf mod p, so attempt this now, calling the result gv.
5.5 If the system is inconsistent our original image is incorrect (missing terms or unlucky), so goto 2.
5.6 If the system is under-determined, then record the degrees of freedom, and if this has occurred twice before with the same degrees of freedom then assume the content problem was introduced by the evaluation of xn, so goto 5.1. Otherwise we need more images, so goto 5.3.1.
5.7 The system is consistent and determined. Scale the new image gv: set gseq = gseq, (γ(v) / lcx1,...,xn−1(gv)) × gv and vseq = vseq, v.
Until we have dxn + degxn(γ) + 1 images.
6 Reconstruct our candidate gcd gc using Newton interpolation (dense) on gseq, vseq, then remove any content in xn.
7 Probabilistic division test: Choose α2, ..., αn ∈ ℤp at random such that for I = ⟨x2 − α2, ..., xn − αn⟩ and g1 = gc mod I we have degx1(g1) = degx1(gc). Then compute a1 = a mod I, b1 = b mod I and test if g1 | a1 and g1 | b1. If yes return gc, otherwise goto 2.
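Step 7 trades a multivariate trial division for a univariate one at a random evaluation point. A rough sketch of the test in Python/sympy (our paraphrase, with hypothetical inputs a, b and candidate gc):

    import random
    from sympy import symbols, Poly

    x1, x2, x3 = symbols('x1 x2 x3')
    p = 101
    gc = x2*x1**2 + x3                   # hypothetical candidate gcd
    a, b = gc*(x1 + x2), gc*(x1 - x3)    # hypothetical inputs (test will pass)

    I = {x2: random.randrange(1, p), x3: random.randrange(1, p)}
    g1 = Poly(gc.subs(I), x1, modulus=p)
    if g1.degree() == Poly(gc, x1).degree():   # evaluation preserved deg_x1
        a1 = Poly(a.subs(I), x1, modulus=p)
        b1 = Poly(b.subs(I), x1, modulus=p)
        print(a1.rem(g1).is_zero and b1.rem(g1).is_zero)   # True: accept gc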
We make some remarks before discussing the correctness and termination of the algorithm.

1. The degree bound of the gcd in the main variable x1 is used to detect unlucky primes and evaluations, but only those that involve x1. We update this degree bound whenever we compute a gcd of lower degree in x1. The degree bounds of the gcd in the non-main variables x2, ..., xn are used to compute the number of images needed for the interpolation in step 6 of LINZIP P. They are not updated by the algorithm. The degree bound for a variable can be obtained by evaluating the inputs mod a random prime and a set of evaluations for all but that variable; then, as long as the prime and evaluations are not bad, the degree of the univariate gcd is a bound on the degree of the multivariate gcd in that variable.

2. The number of required images for the multiple scaling case computed in step 3.2 can be the same as the number of required images for the single scaling case computed in step 3.1, and no more than 50% higher. The worst case is quite infrequent. It will only occur when there are only two coefficients with respect to the main variable, each having exactly the same number of terms. The extra expense of this step can usually be reduced by an intelligent choice of the main variable x1. The exact formula for the number of images needed for a problem with coefficients having term counts n1, ..., ns and a maximum term count nmax is max(nmax, ⌈(Σ_{i=1}^{s} ni − 1)/(s − 1)⌉). The complexity of LINZIP is otherwise the same as that of Zippel's original algorithm. For a detailed asymptotic analysis of the LINZIP algorithm, the interested reader may consult [11].

3. The check in step 0 of LINZIP P is used to detect an unlucky content in the initial gcd introduced higher up in the recursion by either a prime or an evaluation. We note that this approach only requires computation of univariate contents to detect the problem, as any content in the gcd will eventually show up as a univariate content as we evaluate xn, xn−1, ....

4. The check in step 5.6 of either algorithm is intended to check for an unlucky content introduced by the evaluation (LINZIP P) or prime (LINZIP M) chosen in step 5.1 of both algorithms. Since it is possible that a new random image from step 5.3.1 does not constrain the form of the gcd (even without the content problem), we check for multiple failures before rejecting the current iteration of loop 5.

5. The LINZIP P algorithm performs one probabilistic univariate division test in step 7 instead of testing if gc | a and gc | b. This check is substantially less expensive than a multivariate trial division, though there is still a chance that the test fails to detect an incorrect answer, so the termination division test in LINZIP M must be retained.

6. Random evaluation points are chosen from ℤp \ {0} rather than ℤp because zero evaluations are likely to cause missing terms in the assumed form, and possibly scaling problems when normalizing images.

To verify the correctness of this algorithm, in addition to the standard issues with modular algorithms we must also verify that the images are scaled consistently to allow the image reconstruction to proceed. We need to consider 4 main problems, namely bad primes or evaluations, unlucky contents, unlucky primes or evaluations, and missing terms in an initial image.
Bad primes and bad evaluations: The treatment of bad primes and bad evaluations is straightforward. It is handled for the first prime or evaluation by the check that γ does not evaluate to 0 in step 2 of the algorithms, handled for subsequent primes or evaluations by the check that γ does not evaluate to 0 in step 5.1 of the algorithms, and handled for the univariate images in step 5.3.1 of the algorithms.

Unlucky content: The unlucky content problem for the first prime or first evaluation is treated in step 0 of LINZIP P by the single variable content check. As in point 3 above, we emphasize that this check will always detect the problem at some level of the recursion, specifically the level containing the last variable contained in the unlucky content (as all the other variables in the content have been evaluated, so the content becomes univariate). There is no efficient way to detect where such an unlucky content was introduced. It may have been introduced by the prime chosen in LINZIP M or any evaluation in prior calls (for xj with j > n) to LINZIP P in the recursion. Thus LINZIP P fails all the way back up to LINZIP M, which restarts with a new prime. This strategy is efficient, as only evaluations (modular and variable) and other single variable content checks have been performed before such a failure is detected. The introduction of an unlucky content by the prime or evaluation chosen in step 5.1 of either algorithm will be handled in the combination of steps 5.4 and 5.6. The result is a system with additional degrees of freedom, so this always results in an under-determined system. The check in step 5.6 handles this, as eventually we will obtain a solution for all variables but the free ones resulting from the unlucky content, so the degrees of freedom will stabilize, and we will go back to step 5.1, choosing a new prime or evaluation.

Unlucky primes and unlucky evaluations: The treatment of unlucky primes and evaluations is less straightforward. First we consider an unlucky evaluation in step 2 of LINZIP P for xn for which the factor added to the gcd depends upon x1. If the degree bound dx1 is tight, then this will be detected at a lower level of the recursion by step 2 of LINZIP P when n = 2. If the degree bound dx1 is not tight, then the gcd computed in that step may be unlucky, but we proceed with the computation. Once we reach loop 5, we begin to choose new evaluation points for xn. With high probability we will choose a new point that is not unlucky in step 5.1, and the problem will be detected in step 5.3.3. In the worst case, all evaluations in step 5.1 may also be unlucky, introducing the same factor to the gcd, and we will proceed to step 6 and reconstruct an incorrect result. Note that if the factor is in fact different, then the equations accumulated in step 5.3.5 will most likely be inconsistent, and this problem will most likely be detected in steps 5.4 and 5.5. Step 7 will again perform checks much like those in step 5.3.3, and will detect this problem with high probability, but if it does not, an invalid result may be returned from LINZIP P. If we continue to choose unlucky evaluations we will eventually return an incorrect image to LINZIP M. This problem (as well as the unlucky prime case for step 2 of LINZIP M) is handled by the structure of LINZIP M. Since the steps are essentially the same, the same reasoning follows, and we need the computation to be unlucky through all iterations of loop 5. Since the form of the gcd is incorrect, it is unlikely that gm will stabilize, and we will continue to loop. If gm does stabilize, the invalid image will not divide a and b, so step 7 will put us back into the loop. Now within that loop, which cannot terminate until we have found the gcd, step 5.3.4 will eventually detect this problem, as we must eventually find a prime that is not unlucky. Now consider the case where the unlucky evaluation or prime is chosen in step 2 of either algorithm, and the factor added to the gcd is independent of x1, that is, it is a content with respect to x1. This is handled by the same process as the unlucky content problem, specifically it is handled on the way down by step 0 of LINZIP P. Now if an unlucky prime or evaluation occurs in step 5.1 of either algorithm, it must either raise the degree in x1, in which case it will be detected in step 5.3.4, or it results in an unlucky content. If this content is purely a contribution of the cofactors, then this will not cause a problem for the algorithm, as it reconstructs the new gcd image without that content present (as a result of the assumed form). Hence, the only type of unlucky evaluation that can occur in step 5.3.1 must raise the degree of the gcd in x1, and thus is handled by step 5.3.4.

Missing terms: If the initial image of g (in either algorithm) has missing terms, the resulting system will likely be inconsistent, which will be detected by step 5.5 with high probability. If it is not detected in any iteration of loop 5, then an incorrect image will be reconstructed in step 6 of LINZIP P. The additional check in step 7 of LINZIP P will detect this problem with the new images with high probability, but if this also fails, then we return an incorrect image from LINZIP P. Again assuming a sequence of failures to detect this problem, we arrive at LINZIP M. Now we will compute new images in LINZIP M until gc divides both a and b, so the problem must eventually be detected. Note that the missing term case is the most likely failure case of both algorithms, that is, more likely than unlucky primes, unlucky evaluations, and unlucky contents. The probability of choosing a prime or evaluation that causes a term to vanish is O(t/p), where t is the number of terms in the polynomial, and p is the prime. Thus the primes need to be much larger than the number of terms.

4. ALGORITHM RATZIP
An alternative way of handling the non-monic case is to use sparse rational function interpolation. The idea is as follows. Suppose we are computing the gcd of two polynomials in ℤ[x, w, y, z] with x as the main variable. We will compute the monic gcd in ℚ(w, y, z)[x] in the form

    x^n + Σ_{i=0}^{n−1} (a_i(w, y, z) / b_i(w, y, z)) x^i,

where a_i, b_i ∈ ℤ[w, y, z], by interpolating the rational function coefficients using a sparse interpolation. For example, if our gcd is (y + 14)yx³ + 12y²x + y + 14, we compute the monic gcd

    x³ + (12y / (y + 14)) x + 1/y.

We then recover the non-monic gcd by multiplying through by the least common multiple of the denominators. In our example, we multiply through by lcm(y + 14, y) = (y + 14)y to get our non-monic gcd (y + 14)yx³ + 12y²x + y + 14. To illustrate how sparse rational function reconstruction works in general, suppose one of the rational function coefficients is C = (∗w³ + ∗zy²)/(∗z² + ∗y² + wy³), where ∗ indicates an integer. Suppose we have reconstructed C at w = 5 to get C1 = (∗ + ∗zy²)/(∗z² + ∗y² + y³). Notice we have normalized the leading coefficient of the denominator to be 1, essentially dividing through by w. We then assume the form to be Cf = (α(w) + β(w)zy²)/(δ(w)z² + γ(w)y² + y³), where α(w), β(w), δ(w), γ(w) are rational functions in w. We have 4 unknowns so we need 4 equations to solve for the next image, C2. We do this for as many w values as we need, then perform rational function interpolation in w to obtain

    (∗w² + (∗/w)zy²) / ((∗/w)z² + (∗/w)y² + y³).

Clearing the fractions in w gets us what we want, namely (∗w³ + ∗zy²)/(∗z² + ∗y² + wy³).

Example 5. Consider the computation of the above gcd G = (y + 14)yx³ + 12y²x + y + 14 from input polynomials A = (yx + 1)G and B = (yx + 2)G. Using p1 = 11 we compute our first monic gcd image in ℤ11(y)[x] using dense rational function interpolation. Given a degree bound in y, dy = 2, we need N = 2dy + 1 = 5 evaluation points to interpolate a rational function of the form (ay² + by + c)/(dy² + ey + f) in y. If we do this by constructing a linear system, the rational function interpolation will cost O(N³). Instead we use the Euclidean Algorithm. We first apply the Chinese Remainder Theorem to reconstruct polynomial coefficients in y, followed by rational function reconstruction (see [2]). This reduces the cost to O(N²). We choose y = 1, 4, 9, 3, 6, to get the gcd images in ℤ11[x], x³ + 3x + 1, x³ + 10x + 3, x³ + 9x + 5, x³ + 6x + 4 and x³ + 8x + 2, respectively. We
interpolate in y to get x³ + (6y⁴ + 9y³ + 9y² + 10y + 2)x + 10y⁴ + y³ + 5y² + 3y + 4 and then apply rational function reconstruction to the coefficients of x to get our first monic gcd image G1 = x³ + (y/(y + 3))x + 1/y ∈ ℤ11(y)[x], and our assumed form Gf = x³ + (αy/(y + β))x + δ/y.

Working modulo p2 = 13 we compute a second monic gcd image in ℤ13(y)[x] using sparse rational function interpolation. We have at most two unknowns per coefficient in our main variable x so we need two evaluation points. We evaluate at y = 1, 6, and compute the univariate gcd images in ℤ13[x], x³ + 6x + 1 and x³ + x + 11, respectively. We evaluate Gf at our chosen y values and equate by coefficient to get the following system:

    6 = α/(1 + β),    1 = δ/1,
    1 = 6α/(6 + β),   11 = δ/6,

which yields α = 12, β = 1, δ = 1. Substituting back into Gf we get our second monic image in ℤ13(y)[x], G2 = x³ + (12y/(y + 1))x + 1/y. We then apply the Chinese Remainder Theorem to the integer coefficients of the rational functions of G1 and G2 to reconstruct our monic gcd in ℚ(y)[x], x³ + (12y/(y + 14))x + 1/y. Clearing fractions gives us our non-monic gcd in ℤ[x, y].
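The rational function reconstruction used here can be done with a half-run of the extended Euclidean algorithm: stop as soon as the remainder degree drops to the target numerator degree. A self-contained sketch over ℤ13 (our illustration of the idea in [2], not the authors' code) recovers 12y/(y + 1) from its polynomial image:

    # Rational function reconstruction over Z_13 (illustrative sketch).
    # u was computed beforehand as (12y)*(y+1)^(-1) mod m, m = (y-1)(y-2)(y-3).
    p = 13
    m = [1, 7, 11, 7]   # y^3 - 6y^2 + 11y - 6 mod 13, coefficients highest-first
    u = [6, 10, 3]      # image of 12y/(y+1) modulo m

    def pdivmod(a, b):
        a = [c % p for c in a]
        if len(a) < len(b):
            return [0], a
        q = [0] * (len(a) - len(b) + 1)
        inv = pow(b[0], -1, p)
        for i in range(len(q)):
            f = a[i] * inv % p
            q[i] = f
            for j in range(len(b)):
                a[i + j] = (a[i + j] - f * b[j]) % p
        r = a[len(q):] or [0]
        while len(r) > 1 and r[0] == 0:
            r.pop(0)
        return q, r

    def pmul(a, b):
        r = [0] * (len(a) + len(b) - 1)
        for i, ai in enumerate(a):
            for j, bj in enumerate(b):
                r[i + j] = (r[i + j] + ai * bj) % p
        return r

    def psub(a, b):
        k = max(len(a), len(b))
        a = [0] * (k - len(a)) + a
        b = [0] * (k - len(b)) + b
        r = [(s - t) % p for s, t in zip(a, b)]
        while len(r) > 1 and r[0] == 0:
            r.pop(0)
        return r

    def ratrecon(u, m, dn):
        # extended Euclidean algorithm on (m, u); stop when deg(remainder) <= dn,
        # at which point r1 = t1 * u (mod m), i.e. u = r1/t1
        r0, r1, t0, t1 = m, u, [0], [1]
        while len(r1) - 1 > dn:
            quo, rem = pdivmod(r0, r1)
            r0, r1, t0, t1 = r1, rem, t1, psub(t0, pmul(quo, t1))
        inv = pow(t1[0], -1, p)          # normalize: make the denominator monic
        return [c * inv % p for c in r1], [c * inv % p for c in t1]

    num, den = ratrecon(u, m, dn=1)
    assert (num, den) == ([12, 0], [1, 1])   # 12y / (y + 1)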
Algorithm RATZIP M computes the gcd in ℤ[x1, ..., xn] from a number of images in ℤp(x2, ..., xn)[x1]. After applying the Chinese remainder theorem it must clear the fractions in ℚ(x2, ..., xn)[x1], which requires further multivariate gcds; this is the disadvantage of this algorithm in comparison with LINZIP. Algorithm RATZIP P computes the gcd in ℤp(x2, ..., xn)[x1] from a number of images in ℤp(x2, ..., xn−1)[x1]. As for the LINZIP M algorithm, any content of the gcd with respect to x1 must be removed before the initial call to the RATZIP M algorithm. Unlike in the LINZIP algorithms, we do not use single scaling. It is plausible that it may be applied here, but it is not straightforward and we have yet to work out the details. For lack of space, only subroutine RATZIP P is presented. Since it is similar to LINZIP P, the differences are highlighted.

Algorithm 3 (RATZIP P).
Input: a, b ∈ ℤp[x1, ..., xn], a prime p, and degree bounds dx on the gcd in x1, ..., xn.
Output: g = gcd(a, b) ∈ ℤp(x2, ..., xn)[x1] or Fail.
0 If the gcd of the inputs has content in xn return Fail.
1 Compute the scaling factor: γ = gcd(lcx1,...,xn−1(a), lcx1,...,xn−1(b)) ∈ ℤp[xn]. If γ = 1 then set RR = False, else set RR = True.
2 Choose v ∈ ℤp \ {0} at random such that γ mod ⟨xn − v⟩ ≠ 0. Set av = a mod ⟨xn − v⟩, bv = b mod ⟨xn − v⟩, then compute gv = gcd(av, bv) ∈ ℤp(x2, ..., xn−1)[x1] with a recursive call to RATZIP P (n > 2) or via the Euclidean algorithm (n = 2). If for n > 2 the algorithm returns Fail or for n = 2 we have degx1(gv) > dx1 then return Fail, otherwise set dx1 = degx1(gv) and continue.
3 Assume gv has no missing terms, and that the evaluation is not unlucky. We call the assumed form gf. For each coefficient of x1 in gf, count the number of terms in the numerator nt and the number of terms in the denominator dt. Take the maximum sum nt + dt over all coefficients and set nx = nt + dt − 1. The −1 is because we normalize the leading coefficients of the denominators to be 1.
4 Set gm = gv, m = xn − v, and Ni = 1.
5 Repeat
5.1 Choose a new random v ∈ ℤp \ {0} such that γ mod ⟨xn − v⟩ ≠ 0 and set av = a mod ⟨xn − v⟩, bv = b mod ⟨xn − v⟩.
5.2 Set S = ∅, ni = 0.
5.3 Repeat
5.3.1 Choose α2, ..., αn−1 ∈ ℤp \ {0} at random such that degx1(av mod I) = degx1(a) and degx1(bv mod I) = degx1(b) where I = ⟨x2 − α2, ..., xn−1 − αn−1⟩. Set a1 = av mod I, b1 = bv mod I.
5.3.2 Compute g1 = gcd(a1, b1).
5.3.3 If degx1(g1) < dx1 then our original image and form gf and degree bounds were unlucky, so set dx1 = degx1(g1) and goto 2.
5.3.4 If degx1(g1) > dx1 then our current image g1 is unlucky, so goto 5.3.1, unless the number of failures > min(1, ni), in which case assume xn = v is unlucky and goto 5.1.
5.3.5 Add the equations obtained from equating coefficients of g1 and the evaluation of gf mod I to S, and set ni = ni + 1.
Until ni ≥ nx.
5.4 We should now have a sufficient number of equations in S to solve for all unknowns in gf mod p, so attempt this now, calling the result gv.
5.5 If the system is inconsistent our original image is incorrect (missing terms or unlucky), so goto 2.
5.6 If the system is under-determined, then record the degrees of freedom, and if this has occurred twice before with the same degrees of freedom then assume the content problem was introduced by the evaluation of xn, so goto 5.1. Otherwise we need more images, so goto 5.3.1.
5.7 The system is consistent and determined, so we have a new image gv. Solve f ≡ gm mod m(xn) and f ≡ gv mod (xn − v) using the Chinese remainder algorithm for f ∈ ℤp[xn](x2, ..., xn−1)[x1] mod m(xn) × (xn − v). Set gm = f, m = m(xn) × (xn − v), and Ni = Ni + 1.
Until Ni ≥ dxn + 1 and (RR = False or Ni ≥ 3).
6 Reconstruct:
6.1 If RR = True then apply rational function reconstruction in xn and assign the result to gc. If it fails then we need more points, so goto 5.1. For n > 2, clear the rational function denominators of gc ∈ ℤp(xn)(x2, ..., xn−1)[x1] to obtain gc ∈ ℤp(x2, ..., xn)[x1].
6.2 If RR = False then set gc = gm.
7 Probabilistic division test: Choose α2, ..., αn ∈ ℤp at random such that for I = ⟨x2 − α2, ..., xn − αn⟩ and g1 = gc mod I we have degx1(g1) = degx1(gc). Then compute a1 = a mod I, b1 = b mod I and test if g1 | a1 and g1 | b1. If yes return gc, otherwise goto 2.
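The denominator clearing in step 6.1 is an lcm computation. For the gcd used as the running example at the start of this section, a sympy sketch (ours, not the authors' code) reads:

    from sympy import symbols, lcm, cancel, expand

    x, y = symbols('x y')
    monic = x**3 + (12*y/(y + 14))*x + 1/y   # monic gcd with rational coefficients
    L = lcm(y + 14, y)                        # lcm of the coefficient denominators
    nonmonic = cancel(monic * L)              # clear fractions, expand and simplify
    assert nonmonic == expand((y + 14)*y*x**3 + 12*y**2*x + y + 14)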
Our implementation of the RATZIP algorithm includes the following enhancement. To reconstruct the rational functions in some variable y with degree bound dy, we need 2dy + 1 evaluation points. In fact, we may need fewer points than this, depending on the form of the rational functions being reconstructed. In our implementation we use the Maximal Quotient Rational Reconstruction algorithm [7], which uses at most one more evaluation point than the minimum number of points required for the reconstruction to succeed. For example, to reconstruct the rational functions in y of G = x³ + (y/(y + 3))x + 1/y, we would need 4 points, not 5.

5. IMPLEMENTATION
We have implemented algorithm LINZIP in Maple and have compared it with Maple's default algorithm, an implementation of the EEZ-GCD algorithm of Wang [10]. The linear algebra over ℤp, the univariate polynomial computations over ℤp, and the integer arithmetic are all coded in C. The rest is coded in Maple. Algorithm LINZIP is generally faster when the evaluation points used by the EEZ-GCD algorithm cannot be 0. It is also much less sensitive to unlucky primes and evaluations than the EEZ-GCD algorithm. Otherwise it is generally slower, sometimes more than a factor of 3 slower. We have also implemented algorithms LINZIP and RATZIP in Maple using the "recden" [3] data structure. This data structure supports multiple field extensions over ℚ and ℤp, and hence will allow us to extend our implementations to work over finite fields and algebraic number fields. The data structure is currently being implemented in the kernel of Maple for improved efficiency. On our data, the two algorithms perform within a factor of 2 of each other. Algorithm LINZIP is generally faster than RATZIP. A disadvantage of Zippel's algorithm is the large number of univariate images that must be computed for the sparse interpolations, which means a large number of evaluations of the inputs. On our test data we find that the percentage of time spent on evaluations was on average 68% and 75% for LINZIP and RATZIP, respectively. The multivariate trial division in LINZIP M (step 7) and RATZIP M took 19% and 11% of the time, respectively. To improve the efficiency of LINZIP and RATZIP, we are implementing the following idea. Instead of evaluating out all but one variable x1 in LINZIP P and RATZIP P, consider evaluating out all but two variables x1, x2 and computing the bivariate images using a dense gcd algorithm. Thus we think of G as a polynomial in x1 and x2 (main variables) with coefficients in ℤ[x3, ..., xn]. If the cost of computing a bivariate image is less than the cost of evaluation mod I = ⟨x3 − α3, ..., xn − αn⟩, overall efficiency is not compromised. If G mod I is dense in x1 and x2 then we expect a significant reduction in the maximum of the number of terms of the coefficients in x1 and x2, and hence a reduction in the maximum size of the linear systems and a reduction in the number of images needed for the sparse interpolations. We also increase the likelihood of not needing to apply the multiple scaling or rational reconstruction methods. Furthermore, we simplify the multivariate gcd computation for the content of G and, in RATZIP M, the final lcm computation.

6. REFERENCES
[1] W. S. Brown. On Euclid's Algorithm and the Computation of Polynomial Greatest Common Divisors. J. ACM 18, 478–504, 1971.
[2] J. von zur Gathen and J. Gerhard. Modern Computer Algebra. Cambridge University Press, UK, 1999.
[3] M. van Hoeij, M. B. Monagan. A Modular GCD Algorithm over Number Fields Presented with Multiple Field Extensions. Proc. ISSAC 2002, ACM Press, 109–116, 2002.
[4] E. Kaltofen. Sparse Hensel lifting. Proc. EUROCAL '85, Springer-Verlag LNCS 2, 4–17, 1985.
[5] E. Kaltofen, W. Lee. Early Termination in Sparse Interpolation Algorithms. J. Symbolic Comp. 36 (3-4), 365–400, 2003.
[6] E. Kaltofen and B. Trager. Computing with polynomials given by black boxes for their evaluations: Greatest common divisors, factorization, separation of numerators and denominators. J. Symbolic Comp. 9, 301–320, 1990.
[7] M. B. Monagan. Maximal Quotient Rational Reconstruction: An Almost Optimal Algorithm for Rational Reconstruction. Proc. ISSAC 2004, ACM Press, 243–249, 2004.
[8] M. O. Rayes, P. S. Wang, K. Weber. Parallelization of The Sparse Modular GCD Algorithm for Multivariate Polynomials on Shared Memory Multiprocessors. Proc. ISSAC '94, ACM Press, 66–73, 1994.
[9] P. S. Wang. An Improved Multivariate Polynomial Factorization Algorithm. Math. Comp. 32 (144), 1215–1231, 1978.
[10] P. S. Wang. The EEZ-GCD algorithm. ACM SIGSAM Bull. 14, 50–60, 1980.
[11] A. D. Wittkopf. Algorithms and Implementations for Differential Elimination. Ph.D. Thesis, Simon Fraser University (http://www.cecm.sfu.ca/~wittkopf/WittThesis.pdf), 2004.
[12] R. Zippel. Probabilistic Algorithms for Sparse Polynomials. Proc. EUROSAM '79, Springer-Verlag LNCS 72, 216–226, 1979.
[13] R. Zippel. Interpolating Polynomials from their Values. J. Symbolic Comp. 9 (3), 375–403, 1990.
[14] R. Zippel. Effective Polynomial Computation. Kluwer Academic, 1993.
Computing µ-Bases of Rational Curves and Surfaces Using Polynomial Matrix Factorization

Jiansong Deng, Falai Chen, Liyong Shen
Department of Mathematics, University of Science and Technology of China, Hefei, Anhui 230026, P. R. of China
[email protected],
[email protected] ABSTRACT
equation of the rational curve/surface. Thus it provides a connection between the parametric form and the implicit form of a curve/surface. Furthermore, the µ-basis was successfully applied in reparametrizing a rational ruled surface [4], in computing the singular points of a rational curve [7] and in finding more compact representation for the implicit equation of a rational curve with high order of singularities [2]. There are several methods to compute the µ-basis of a rational curve. The first method is based on undetermined coefficients by solving linear system of equations [14]. This method needs O(n3 ) arithmetic operations, where n is the degree of the curve, and it is a trial-and-error approach. The second method was developed by Zheng and Sederberg [18], and it is similar to the Buchberger’s algorithm for computing the Gr¨ obner basis of a module. The computational cost of the method is about 81 n2 + O(n) multiplications in generic 4 case. In [3], Chen and Wang applied vector elimination technique to improve the efficiency of the second algorithm by a factor of two. For a rational ruled surface, an efficient algorithm similar to curve case was developed to compute the µ-basis [1]. However, we do not have a rigorous algorithm to compute the µ-basis of a general rational surface so far. Currently, we use the Gr¨ obner basis technique to compute a generator for the syzygy module of the rational surface, and then try to find the µ-basis by forming linear combinations of the elements in the generator. This is totally a non-automatic approach and fails in most circumstances. In this paper we apply the theory of polynomial matrices developed by researchers in linear systems [11, 12, 13] to the computation of a µ-basis. Using some polynomial matrix operations, such as primitive factorization and GCD extraction, we are able to compute a µ-basis of a rational curve/surface rigorously. The computed µ-basis is further simplified by lowering its degree using vector elimination technique [3]. For curve case, a µ-basis can be computed in 33 2 n + O(n) operations, which is superior than any existing 4 algorithms. The organization of the paper is as follows. In Section 2, some preliminary knowledge about the µ-basis of a rational curve or surface is introduced. In Section 3, some basic concepts and results in the theory of polynomial matrices are reviewed, including the primitive factorization algorithm, Hermite form, and GCD extraction algorithm. Sections 4 and 5 apply the results of Section 3 to the computation of the µ-basis of a rational curve and surface respectively. Some examples are illustrated to demonstrate the detailed process
The µ-bases of rational curves/surfaces are newly developed tools which play an important role in connecting parametric forms and implicit forms of the rational curves/surfaces. They provide efficient algorithms to implicitize rational curves/surfaces as well as algorithms to compute singular points of rational curves and to reparametrize rational ruled surfaces. In this paper, we present an efficient algorithm to compute the µ-basis of a rational curve/surface by using polynomial matrix factorization followed by a technique similar to Gaussian elimination. The algorithm is shown superior than previous algorithms to compute the µ-basis of a rational curve, and it is the only known algorithm that can rigorously compute the µ-basis of a general rational surface. We present some examples to illustrate the algorithm.
Categories and Subject Descriptors I.1.2 [Symbolic and Algebraic Manipulation]: Algorithms—Algebraic algorithms; I.3.5 [Computer Graphics]: Computational Geometry and Object Modeling—Curve, surface, solid, and object representations
General Terms Algorithms
Keywords µ-basis, syzygy module, implicitization, primitive factorization algorithm, Hermite form, GCD extraction algorithm
1.
INTRODUCTION
The µ-basis was first introduced in [9] to provide a compact representation for the implicit equation of a rational parametric curve. Then it was generalized by one of the present authors to general rational surfaces [1, 5, 6]. The µ-basis can be used not only to recover the parametric equation of a rational curve/surface but also to derive the implicit
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ISSAC’05, July 24–27, 2005, Beijing, China. Copyright 2005 ACM 1-59593-095-705/0007 ...$5.00. Copyright2005ACM1-59593-095-7/05/0007...5.00.
132
of the algorithm. We end the paper with some conclusions and future works in Section 6.
2. µ-BASES OF RATIONAL CURVES AND SURFACES
Throughout the paper, we work over the field K of real numbers or rational numbers. K[x1, . . . , xk] = K[x] and K(x) are the polynomial ring and the rational function field, respectively, in variables x = (x1, . . . , xk) with coefficients in K. K^{m×l}[x] denotes the set of m × l matrices with entries in K[x]. If m = 1, we write it as K^l[x] for short. For any F := (f1, . . . , fl) ∈ K^{m×l}[x], the set

    Syz(F) := { (h1, . . . , hl) ∈ K^l[x] : Σ_{i=1}^{l} h_i f_i ≡ 0 }

is a module over K[x], called a syzygy module [8]. If we can find a generating set {b1, . . . , bm}, bi ∈ K^l[x], of a syzygy module, then the matrix M = (b1, . . . , bm) is called the generating matrix of the syzygy module. It follows that F M = 0. A generating set of a module over K[x] is called a basis if the elements in the generating set are K[x]-linearly independent. If a module has a basis, then it is called a free module. Conditions for a syzygy module being free will be given in the next section. We just mention that, if k = 1 or 2, i.e., we are working with univariate or bivariate polynomials, then the syzygy module is free.

Now we review the definitions of µ-bases of a planar rational curve [3] and a rational surface [6]. Consistent with the notation in [6, 3, 5, 1, 9], we use t and s, t as the variable names for the univariate and bivariate cases, respectively.

Definition 1 ([9]). Given a planar rational curve of degree n in homogeneous form:

    P(t) := (a(t), b(t), c(t)) ∈ K³[t],

where max(deg_t a, deg_t b, deg_t c) = n and gcd(a, b, c) = 1. The syzygy module Syz(a, b, c) has a basis {p(t), q(t)} ⊂ K³[t] with degrees µ and n − µ respectively, where µ ≤ n/2. {p(t), q(t)} is called a µ-basis of the rational curve P(t).

Remark 1 A µ-basis has the following properties [3]:
(1) The µ-basis has the lowest possible degree among all the bases of the syzygy module Syz(a, b, c).
(2) The parametric equation of P(t) can be recovered from a µ-basis. In fact, for any basis p, q of Syz(a, b, c), we have [p, q] = κ(a, b, c) for some nonzero constant κ in K.
(3) A basis {p(t), q(t)} of Syz(a, b, c) is a µ-basis if and only if deg_t(p(t)) + deg_t(q(t)) = n, and if and only if LCV(p(t)) and LCV(q(t)) are linearly independent. Here LCV(p(t)) is the leading coefficient vector of the vector polynomial p(t), defined by LCV(p(t)) := (p1µ, p2µ, p3µ) if we write p(t) = (p1µ, p2µ, p3µ)t^µ + . . . + (p10, p20, p30). LCV(q(t)) is defined similarly.
(4) The implicit equation of P(t) can be obtained by taking the resultant of p · v and q · v with respect to t, where v = (x, y, 1).

Definition 2. Given a rational parametric surface in homogeneous form:

    P(s, t) := (a(s, t), b(s, t), c(s, t), d(s, t)) ∈ K⁴[s, t],

where a, b, c, d are relatively prime. A basis {p(s, t), q(s, t), r(s, t)} ⊂ K⁴[s, t] of the syzygy module Syz(a, b, c, d) is called a µ-basis of the surface P(s, t). If in addition, p, q, r satisfy
1. among all the bases of Syz(a, b, c, d), deg_t p + deg_t q + deg_t r is smallest, and
2. among all the bases of Syz(a, b, c, d) which satisfy item 1, deg_s p + deg_s q + deg_s r is smallest,
then {p, q, r} is called a minimal µ-basis of P(s, t).

Remark 2 The existence of a µ-basis of a rational surface was proved in [6]. It can also be seen from Corollary 3.2 in the next section. However, except for parametrizations with no base points, standard computational methods only give generating sets for the syzygy module. The main task of the current paper is to describe how to compute a basis of this module, i.e., a µ-basis.

Remark 3 From [6], the parametric equation of a rational surface can be recovered by the outer product of a µ-basis, i.e., [p, q, r] = κ(a, b, c, d) for some nonzero constant κ in K. Here

    [p, q, r] = ( det(P234), −det(P134), det(P124), −det(P123) ),

where Pijk denotes the 3 × 3 matrix whose rows are (pi, pj, pk), (qi, qj, qk) and (ri, rj, rk). On the other hand, the implicit equation of a rational surface can be obtained by computing the Gröbner basis for the ideal ⟨p·v, q·v, r·v⟩ : g^N, where v = (x, y, z, 1), g ∈ K[s] is defined by ⟨a, b, c, d⟩ ∩ K[s] = ⟨g⟩ and N is a sufficiently large integer. Though it is relatively more efficient than the method of directly computing the Gröbner basis of the ideal ⟨dx − a, dy − b, dz − c, dw − 1⟩ ∩ K[x, y, z, s, t], finding a more efficient method to derive the implicit representation from a µ-basis is a problem worthy of further investigation.
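As a concrete check of Remark 1, consider the standard quadratic parametrization of the circle (our choice of example, not from the paper) with the µ-basis p = (1, t, −1), q = (t, −1, t); the sympy fragment below verifies that p and q are syzygies and that the outer (cross) product recovers the parametrization up to the constant κ = −1:

    from sympy import symbols, Matrix, expand

    t = symbols('t')
    P = Matrix([1 - t**2, 2*t, 1 + t**2])   # P(t) = (a, b, c), the circle, n = 2
    p = Matrix([1, t, -1])                   # candidate mu-basis, deg p = deg q = 1
    q = Matrix([t, -1, t])

    assert expand(p.dot(P)) == 0 and expand(q.dot(P)) == 0    # p, q in Syz(a, b, c)
    assert (p.cross(q) + P).applyfunc(expand) == Matrix([0, 0, 0])   # [p, q] = -P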
3. PRELIMINARY RESULTS IN THE THEORY OF POLYNOMIAL MATRICES
Given a matrix M in K^{m×m}[x], its determinant det M is a polynomial in K[x]. If the polynomial is nonzero, the matrix M is nonsingular, otherwise it is singular. If the determinant det M is a nonzero constant in K, then we call M a unimodular matrix. A matrix M ∈ K^{m×l}[x] is of rank r if there exists at least one minor of order r which is a nonzero polynomial, and all the minors of order r + 1 are zero polynomials. We use rank M to denote the rank of M. If rank M = min(l, m), we call the matrix full-rank. For a nonsingular matrix M ∈ K^{m×m}[x], we can calculate its inverse matrix, whose entries are in K(x). The inverse matrix is also in K^{m×m}[x] if and only if M is unimodular.
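Unimodularity is just a determinant computation, and the last statement is easy to see on a toy example (ours):

    from sympy import symbols, Matrix

    s, t = symbols('s t')
    M = Matrix([[1, s + t], [0, 1]])   # det M = 1, a nonzero constant: unimodular
    N = Matrix([[s, 0], [0, 1]])       # det N = s: nonsingular, not unimodular

    print(M.det(), N.det())            # 1  s
    print(M.inv())                     # [[1, -s - t], [0, 1]]: still polynomial
    print(N.inv())                     # [[1/s, 0], [0, 1]]: entries only in K(s, t)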
Definition 3. Let F ∈ K^{m×l}[x] with m ≤ l. Then F is said to be:
1. minor left prime (MLP) if all m × m minors of F are coprime;
2. factor left prime (FLP) if in any polynomial decomposition F = F1 F2 in which F1 is square, F1 is a unimodular matrix, i.e., det F1 = k0 ∈ K \ {0}.

Minor right prime (MRP) and factor right prime (FRP) can be similarly defined.

In [16], the authors proved that for k = 1, 2, MLP ≡ FLP and MRP ≡ FRP; for k ≥ 3, MLP ≢ FLP and MRP ≢ FRP; and for all k ≥ 1, MLP ⇒ FLP and MRP ⇒ FRP. Here k is the number of variables.

Definition 4. Given two polynomial matrices A, B with the same number of rows (columns), where entries are in K[x], we call them left (right) coprime if whenever there are two matrices Ã, B̃ and a square polynomial matrix C, where the entries of all three matrices are in K[x], such that

    A = CÃ, B = CB̃    (A = ÃC, B = B̃C),

then C is a unimodular matrix.

Definition 5. Given two polynomial matrices A, B with the same number of rows (columns), where entries are in K[x], a square polynomial matrix D is their greatest common left (right) divisor (GCL(R)D) if there exist two left (right) coprime polynomial matrices Ã, B̃ such that

    A = DÃ, B = DB̃    (A = ÃD, B = B̃D).

The following theorem due to Lin [12] describes when a syzygy module is free, i.e., when it has a basis.

Theorem 3.1 ([12]). Let F = [−Ñ, D̃] ∈ K^{m×l}[x] be of rank m, with l > m, D̃ ∈ K^{m×m}[x] being nonsingular, and r = l − m. Then Syz(F) has a generating matrix of dimension l × r (i.e., Syz(F) is free) if and only if there exists an MRP matrix H ∈ K^{l×r}[x] such that F H = 0_{m×r}. Furthermore, H is the generating matrix.

Based on Theorem 3.1, Lin derived the following corollary, the proof of which is constructive. We copy its proof to help the reader more easily understand the algorithms in Sections 4 and 5.

Corollary 3.2. Let F = [−Ñ, D̃] ∈ K^{m×l}[s, t] be of rank m, with l > m, D̃ ∈ K^{m×m}[s, t] being nonsingular, and r = l − m. Then there exists a generating matrix H ∈ K^{l×r}[s, t] of Syz(F).

Proof. Associate F with the rational matrix P = D̃^{−1}Ñ. By a well-known result in bivariate polynomial matrix theory [11, 13], P has a right matrix fraction description (MFD) P = N D^{−1}, where N and D, whose entries are in K[s, t], are right coprime. Let H = (D^T, N^T)^T ∈ K^{l×r}[s, t], which is FRP, hence MRP. Clearly, P = D̃^{−1}Ñ = N D^{−1} gives rise to F H = 0. By Theorem 3.1, H is a generating matrix of Syz(F).

According to the proof of Corollary 3.2, to find a basis of Syz(F), we need to get the MFD of a rational matrix. For any rational matrix M, it is easy to write it as M = AB^{−1}, where A, B are polynomial matrices. Hence it is important to know how to extract GCRDs from A and B. The approach consists of some important algorithms in the theory of bivariate polynomial matrices, including primitive factorization, Hermite form, and GCD extraction. We will review them in the following subsections.
i=0
then C is a unimodular matrix.
where αi (s) ∈ K[s] with degs αi (s) < degs p(s), n1 6 n, and αn1 (s) 6≡ 0. The αi ’s can be obtained by means of the Euclidean division algorithm. In [13] a primitive factorization (PF) algorithm of bivariate polynomial matrices is proposed, which extracts the content of a full-rank matrix with entries in the ring K[s, t] of bivariate polynomials over some algebraically closed field K. [11] eliminates the restriction on K, such that we can do the factorization over the real field or even the field of rational numbers, provided the coefficients start out in the same field. We describe the PF algorithm as follows. Further details can be found in [11].
Definition 5. Given two polynomial matrices A, B with the same number of rows (columns), where entries are in K[x], a square polynomial matrix D is their greatest common left (right) divisor (GCL(R)D) if there exist two left ˜ B ˜ such that (right) coprime polynomial matrices A, ˜ B = DB ˜ A = DA,
ai (s)ti ,
i=0
Definition 4. Given two polynomial matrices A, B with the same number of rows (columns), where entries are in K[x], we call them to be left (right) coprime if whenever ˜ B ˜ and a square polynomial matrix there are two matrices A, C, where entries of all the three matrices are in K[x] such that ˜ B = CB ˜ A = C A,
n X
˜ ˜ (A = AD, B = BD).
The following theorem due to Lin [12] describes when a syzygy module is free, i.e., it has a basis. ˜ , D] ˜ ∈ K m×l [x] be of Theorem 3.1. [12] Let F = [−N m×m ˜ rank m, with l > m, D ∈ K [x] being nonsingular, and r = l − m. Then Syz(F ) has a generating matrix of dimension l × r (i.e., Syz(F ) is free) if and only if there exists an MRP matrix H ∈ K l×r [x] such that F H = 0m×r . Furthermore, H is the generating matrix.
Algorithm 1
(PF Algorithm).
Input F : an m×l full-rank matrix with entries in K[s, t], m 6 l. Output L, R: m × m and m × l matrices, respectively, with entries in K[s, t], such that F = LR and det L = g(s), where g(s) ∈ K[s] is the content of the greatest common divisor (GCD) of the set of m × m minors of F . Step
Based on Theorem 3.1, Lin derived the following corollary, the proof of which is constructive. We copy its proof to help the reader to more easily understand the algorithm in Sections 4 and 5.
1. Calculate the GCD of the set of m × m minors of F . g(s) is its content as a polynomial in t.
˜ , D] ˜ ∈ K m×l [s, t] be of Corollary 3.2. Let F = [−N m×m ˜ ∈ K rank m, with l > m, D [x] being nonsingular, and r = l − m. Then, there exists a generating matrix H ∈ K l×r [s, t] of Syz(F ).
2. Let L be an identity matrix of order m, and R = F . Factorize g(s) into a list of irreducible factors in K[s]. For every factor p(s) do the following steps:
˜. ˜ −1 N Proof. Associate F with a rational matrix P = D By a well-known result in bivariate polynomial matrix theory [11, 13], P has a right matrix fraction description (MFD) P = N D−1 , where N and D, whose entries are in K[s, t], are right coprime. Let H = (DT , N T )T ∈ K l×r [s, t], which ˜ −1 N ˜ = N D−1 gives rise is FRP, hence MRP. Clearly, P = D to F H = 0. By Theorem 3.1, H is a generating matrix of Syz(F ). According to the proof of Corollary 3.2, to find a basis of Syz(F ), we need to get the MFD of a rational matrix. For any rational matrix M , it is easy to write it into M = AB −1 , where A, B are polynomial matrices. Hence it is important
(a) Set the current row and column indices i, j to be ¯ = R (mod p(s)). 1. R ¯ if there exists a (b) Among rows from i to m in R, row with all entries zeros, say row i0 , then i0
D0 = diag(1, . . . , 1, p(s), 1, . . . , 1) is a left divisor of F , and we let L ← LD0 , and R ← D0−1 R. Continue Step 2 for the next factor. ¯ from rows i to m is zero, then go If no row of R to the next sub-step.
134
(c) From columns j to l, find the first column (say column j0 ) with at least one nonzero entry from rows i to m. Set j ← j0 .
13] is a matrix (aij (t)) or (aij (s, t)) with aij ≡ 0, j < i, and degt ajj > degt aij , j > i. [10] and [13] presented algorithms to compute the Hermite form of a full-rank matrix in univariate case and bivariate case respectively. For univariate case, based on Gaussian elimination technique and Euclid division algorithm, one can find a unimodular matrix U such that H = U F is the Hermite form of F . For bivariate case, the algorithm consists of two steps. First, we work over K(s)[t] to find U with entries ˜ =U ˜ F is a Hermite form with respect in K(s)[t] such that H ˜ ∈ K(s). Second, let pi (s, t) be the to K(s)[t], where det U ˜, least common multiple of the denominators in row i of U ˜ ˜. and D = diag(p1 (s, t), . . . , pm (s, t)), H = DH and U = DU Then it follows that H = U F ∈ K[s, t] is the Hermite form of F with respect to K[s][t], and det U ∈ K[s]. It is obvious that the two steps can be merged into one by using the pseudo division algorithm for two polynomials.
(d) In the column j, from rows i to m, find the entry with the smallest degree of t, say i1 . Interchange ¯ This is equivalent to premultirow i and i1 of R. ¯ with matrix D1 , where D1 comes from plying R Im by interchanging rows i and i0 . Let L ← LD1 , R ← D1−1 R = D1 R. (e) Suppose the entries in column j to be ∗, . . . , ∗, ai (s, t), . . . , am (s, t)
T
.
The leading coefficients of ai (s, t), . . . , am (s, t) in t are bi (s), . . . , bm (s). Since bi (s) and p(s) are relatively prime, by Euclidean algorithm we can find x(s) and y(s) in K[s] such that x(s)bi (s) = 1 − y(s)p(s).
3.3 GCD extraction algorithm
Let = x(z)ai (s, t) (mod p(s)).1 There exist qk (s, t) and rk (s, t) in K[s, t] such that
In [13], a GCD extraction algorithm of bivariate polynomial matrices is presented. We describe it as follows.
ak (s, t) = qk (s, t)a∗i (s, t) + rk (s, t),
Algorithm 2 (GCRD extraction algorithm). Input A, B: two bivariate polynomial matrices A(s, t) and B(s, t) with the same number of columns, such that (AT , B T ) is of full rank. Output D: GCRD of A and B. Step
a∗i (s, t)
k = i + 1, . . . , m, where degt rk (s, t) < degt a∗i (s, t), or rk (s, t) ≡ 0. Then for k = i + 1, . . . , m, we add to row k with row i multiplied by −x(s)qk (s, t). This is ¯ with the matrix equivalents to premultiplying R D3 = diag(Ii−1 , E), where 0 B B B E=B B
1 −xqi+1 −xqi+2 .. . −xqm
1. Use the RPF algorithm with respect to K[s][t] on the ¯ B ¯ and R0 such right side of (AT , B T )T , i.e., find A, that ¯ A A = ¯ R0 , B B
1 C C C C. C A
1 1 ..
.
where det R0 ∈ K[s].
1
2. Find U with entries in K[s, t] and det U ∈ K[s] to get ¯T , B ¯ T )T , i.e., the Hermite form of (A
LD3−1 ,
Let L ← R ← D3 R. Then the j column ¯ is with the form of R ∗, . . . , ∗, ai (s, t), ri+1 (s, t), . . . , rm (s, t)
T
U
.
=
R 0
.
3. Use the LPF algorithm to R,
¯←R ¯ mod p(z). Let R If ri+1 (s, t) ≡ · · · ≡ rm (s, t) ≡ 0, then j ← j + 1, i ← i + 1, and go to sub-step (b). Otherwise, repeat the current sub-step (e).
¯ ∗. R = RR Then D = R∗ R0 is the GCRD of A and B. Remark 6 To save some unnecessary primitive factorization in Step 3, we make a little modification to the above algorithm. In the computation of the Hermite form of a ˜ and U ˜ , let bivariate polynomial matrix, after we get H qi (s, t) be the least common multiple of the denominators ˜ then qi (s, t) is a factor of pi (s, t). Let D ¯ = in row i of H, ¯ =D ¯ H, ˜ and take H ¯ in place diag(q1 (s, t), . . . , qm (s, t)) and H of H in Step 3 of the GCD extraction algorithm.
Remark 4 There is of course a similar primitive factorization algorithm for m > l, where an l × l matrix is extracted on the right. We denote these two algorithms as the LPF and RPF algorithms with respect to K[s][t], respectively. Remark 5 The LPF and RPF algorithms terminate after finitely many steps. In fact, the complexity is predictable after given the degree of polynomials in F . For a given F , the factorization is unique up to a unimodular matrix.
4. COMPUTING µ-BASES OF A RATIONAL CURVE
3.2 Hermite form
Suppose we are given a planar rational curve
Given a univariate or bivariate m × l full-rank polynomial matrix F , m > l, we are interested in finding its Hermite form with respect to K[t] or K[s][t]. The Hermite form [10, 1
¯ A ¯ B
P(t) := (a(t), b(t), c(t)), where a, b, c ∈ K[t] and gcd(a, b, c) = 1. Computing a µ-basis of P(t) is equivalent to computing a basis of the
Note that a∗i (s, t) is monic and degt a∗i (s, t) = degt ai (s, t).
135
syzygy module Syz(a, b, c) with lowest possible degree. We first compute a basis of Syz(a, b, c) based on the proof of Corollary 3.2. ˜ = (c), N ˜ = (−a, −b). Construct a matrix P and Set D compute its MFD:
and
The column-reduced form of M is 0
5(t + 3) t M′ = −13t − 25
where A = (−a, −b) and B = diag(c, c). To compute a generating matrix of Syz(a, b, c), we need to find the GCRD of A and B. The GCRD extraction algorithm in Section 3.3 is described for bivariate polynomial matrices, but it works also for univariate case with minor modifications. In the univariate case, we do not need to do the primitive factorization in Steps 1 and 3 of Algorithm 2. The key step is to find the Hermite form of matrix A B
0
1 −k 0 1 0 0
1
−a −b 0 A. = c 0 c
10
10
0 1 0 0 A 0 µ1 1 0 c/e
Then
0 λ1 A c/d µ2 bc/(de) 0
0
1
0
λ2 a/d 0
5. COMPUTING µ-BASES OF A RATIONAL SURFACE To compute a µ-basis of a rational surface, we just follow what we did to compute a µ-basis of a rational curve. Here the main computational complexity comes from the GCRD extraction algorithm. However, since Steps 1 and 3 of Algorithm 2 can’t be omitted, it is difficult to write down the generating matrix of the syzygy module explicitly. Given a rational parametric surface in homogeneous form
1
0 0 A 1
P(s, t) := (a(s, t), b(s, t), c(s, t), d(s, t)),
1
where a, b, c, d ∈ K[s, t] and gcd(a, b, c, d) = 1. A µ-basis of rational surface P(s, t) is a basis of the syzygy module Syz(a, b, c, d). ˜ = (d), N ˜ = (−a, −b, −c). Construct a matrix P Set D and compute its MFD:
−a −b d r 0 A= 0 e A D c 0 c 0 0 is the Hermite form. Hence the GCRD of A and B is
R=
d 0
r e
˜ −1 N ˜ = (−a/d, −b/d, −c/d) = AB −1 , P =D
,
and the generating matrix of Syz(a, b, c) is
M=
B A
0
R
−1
e = 0 −a/d
where A = (−a, −b, −c) and B = diag(d, d, d). Suppose the ¯ B ¯ ∈ K[s, t] such GCRD of A and B is G, i.e., there exist A, that ¯ ¯ A = AG, B = BG,
1
−r A, d (ar − bd)/c
(1)
¯ and B ¯ are right coprime. Then (B ¯T , A ¯T )T is the where A generating matrix of Syz(a, b, c, d), and the three columns p, q, r of the generating matrix are a µ-basis of the rational surface P(s, t). Similar to curve case, the µ-basis obtained may not be a minimal µ-basis. To lower the degree, we proceed as follows. Rewrite the µ-basis as
˜ (t), q ˜ (t) of matrix M are a since c = de. The two columns p basis of the syzygy module Syz(a, b, c). Note that the basis obtained so far is possibly of higher degree than a µ-basis. To get the µ-basis, we need re˜ (t), q ˜ (t). Suppose n2 := deg(˜ duce the degree of p q(t)) ≥ n1 := deg(˜ p(t)) and n1 + n2 > n. Then LCV(˜ p(t)) and ˜ (t) × LCV(˜ q(t)) must be linearly dependent (otherwise p ˜ (t) 6= k(a, b, c)), that is, there exists some constant α such q ˜ (t) by q ˜ (t) := that LCV(˜ q(t)) = α LCV(˜ p(t)). Update q ˜ (t) − αtn2 −n1 p ˜ (t). This process can be continued until q ˜ q(t) ˜ are a µ-basis. Let deg(˜ p(t)) + deg(˜ q(t)) = n, i.e., p(t), us use an example to illustrate the process.
p=
i=0
P(t) = (2t2 + 4t + 5, 3t2 + t + 4, t2 + 2t + 3). Then 2
e(t) = t + 2t + 3,
λ2 (t) = 2,
dp X
pi (s)ti , q =
dq X i=0
qi (s)ti , r =
dr X
ri (s)ti ,
i=0
where pi (s), qi (s) and ri (s) in K 4 [s]. Without loss of generality, we assume dp > dq > dr . Let mp , mq , mr ∈ K 4 [s] be the leading coefficient vectors of p, q and r with respect to t, respectively, i.e., mp (s) = pdp (s), etc. From the recovery equation of Remark 3 in Section 2, it is easy to see that dp + dq + dr = degt (P(s, t)) if and only if mp , mq and mr are K[s]-linearly independent. Now if mp , mq and mr are K[s]-linearly independent, then dp + dq + dr reaches minimum and the process is terminated. Otherwise, consider the syzygy module Syz(mp , mq ,
Example 1 Suppose a rational curve is parametrized by
λ1 (t) = 1,
1
−5(t + 1) A. 1 10t + 7
The two columns of M ′ are a µ-basis of the rational curve (a(t), b(t), c(t)). The main computational costs of the µ-basis algorithm lie in computing GCDs of univariate polynomials using Euclidean algorithm and column-reduction of matrix M . One can easily prove that the computational complexity is less than 33 2 n + O(n) multiplications, which is faster than fastest 4 known algorithm [3].
Suppose gcd(−a, c) = d, then there exist λ1 , λ2 ∈ K[t] such that λ1 (−a) + λ2 c = d. Assume gcd(−bc/d, c) = e, then there exist µ1 , µ2 ∈ K[t] such that µ1 (−bc/d) + µ2 c = e. Finally suppose the quotient and remainder of λ1 b divided by e are k and r. Denote the following matrix as D, 0
1
t2 + 2t + 3 −5(t + 1) A. 0 1 M = −2t2 − 4t − 5 10t + 7
˜ −1 N ˜ = AB −1 . P =D
0
d = 1,
r(t) = 5(t + 1),
136
mr ), the basis of which can be found based on the results in [12]. Find a vector α := (αp , αq , αr ) in the basis of the syzygy module Syz(mp , mq , mr ) such that one of αp , αq , αr is a non-zero constant, if possible. If not, terminate the process. Set βp = αp , βq = αq sdp −dq , βr = αr sdp −dr and u = βp p + βq q + βr r. If αp is a non-zero constant and degs (u) < degs (p), update p by u. Otherwise if αq is a non-zero constant, and dp = dq and degs (u) < degs (q), update q by u. Otherwise if αr is a non-zero constant, and dp = dq = dr and degs (u) < degs (r), update r by u. This process can be continued until dp + dq + dr = degt (P(s, t)) or one of the above conditions fails to hold. The next step is to reduce the degree of p, q, and r with respect to s while keeping the degree of p, q, and r with respect to t unchanged. This can be done by applying the vector elimination technique in [3] to p, q, and r. We should note that, while the above process generally reduces the degree of a µ-basis, it doesn’t necessarily produce a minimal µ-basis. Now we present some examples to demonstrate the detailed process of the algorithm.
Example 3 Given a bi-quadratic surface defined by a(s, t) = t2 + st + 2s2 − 2s2 t, b(s, t) = t2 + 2st + st2 + 2s2 − s2 t + 2s2 t2 , c(s, t) = −t2 + st + 2st2 + 2s2 − s2 t − 2s2 t2 , d(s, t) = 2st − 2st2 − 2s2 t − s2 t2 . The content of the GCD of all the major minors of
A B
0
0
−2t −2s C 0 0 C. 2 2 A s +t +1 0 2 2 0 s +t +1
0
0
t/2 0 0
B U =B
0
s2 +t2 +1 2s
After Step 2, it follows that 0
2s(3s4 + 5s3 + s2 − 2s + 2) R= 0 0
1
1 1 0
0 0 C C 1 A. 1
t s
¯ −1 R
2
− t s+1 = 1 − s2st+1 0
1 s
0 0
α(s, t) = −(s + 2)(3s2 − 5s − 4)st, β(s, t) = 12s4 + 20s3 + 4s2 − 8s + 8 + 5s4 t + 4s3 t − 4s2 t + 12st − 8t, γ(s, t) = t(2s + st + 2t − 2).
1
0 0
A,
1 s2 +1
−(t2 + 1)(s2 + 1) ∗ R = (s2 + 1)s −s2 t
−st2 s2 + 1 −st
1
The results of the rest steps are omitted since they are a little clumsy to write down. Finally, we obtain a µ-basis for the biquadratic surface as follows: 1 p = 35412 ·
t(t2 + 1) A. −st t2 + 1
Then the GCRD of A and B is R∗ R0 = R∗ . Therefore the generating matrix of Syz(a, b, c, d) is
M=
B A
0
∗ −1
(R )
1
α(s, t) β(s, t) A, γ(s, t) 0 0 γ(s, t)
where
Here U ∈ K(s, t), but ∈ / K[s, t]. Since det(R) = (s2 + 1)s(s2 + t2 + 1)2 , its irreducible factor list of the content with respect to t is s2 + 1, s. Applying the LPF algorithm in Algorithm 1, we get 0
1
−s 1 −1 ¯ = t(st + 2s + 2t − 2) 0 −1 0 A . B 0 0 −1
s2 + 1 −st 2 A, s + t2 + 1 0 0 s2 + t 2 + 1 s 0 0 t
1T
2s2 t − 2s2 − st − t2 ¯ A , −t(2st + s + t + 1) A= 2st2 + 3st − 4s − 2t2 − 2t
1
(s2 + 1)s R= 0 0
1
0
The content of the GCD of all the minors of order 3 with respect to K[s][t] is 1, so we skip Step 1 in the GCRD extraction algorithm, i.e., R0 = I3 . In Step 2 of the GCRD extraction algorithm, we make use of the discussion in Remark 6 of Section 3.3, and obtain 0
1
−a −b −c B d 0 0 C B C = 0 d 0 A 0 0 d
1 1 −1 R0 = 0 s 0 A , 0 0 s
1
−2st
B s2 + t 2 + 1 =B 0
0
with respect to K[s][t] is s2 . Then the irreducible factor list is {s, s}. In Step 1 of GCRD extraction algorithm, one can compute
Example 2 The Steiner surface is defined by (a, b, c, d) = (2st, 2t, 2s, s2 + t2 + 1). The matrices A and B are then
A B
−1
B s =B 0
0
0
1
B B −35412, B B B −2 s 18264 s3 + 49451 s2 + 46965 s + 4705 , B B 12176 s4 + 18762 s3 + 30440 s2 t − 11887 s2 − 4843 st
C C C C C C C A
30440 ts4 + 36528 s4 + 56037 s3 t + 98902 s3 + 16316 s2 t B C + 93930 s2 + 52004 st + 9410 s + 35412 B C
1
0 t C t2 + 1 0 C. 2 st s +1 A −2t −2s
The three columns of M gives a µ-basis of the Steiner surface. One can show that it is a minimal µ-basis.
+ 35412 s + 26002 t + 17706
137
q= 0
1 · 256176
Example 4 Consider the surface parameterized by
−46308 st − 74337 t2 s4 − 96066 ts4 − 216726 s2 t − 209666 s3 t + 59836 s2 t2 − 40826 s3 t2 − 311720 st2 + 92216 s + 100316 t − 36272 + 21680 s2 − 54752 s3 − 22872 s4 ,
B B B B B B B B 4 −4534 + 6993 s + 9703 s2 + 2859 s3 B B (2 s + st + 2 t − 2) , B B B 32022 t 3 s4 + 5 s3 + s2 − 2 s + 2 , B B B −20586 ts4 + 22872 s4 + 55203 s3 t + 73812 s3 B − 74337 s2 t2 + 226126 s2 t + 11240 s2 + 107848 st2
1
a(s, t) = −3s2 t2 + 5s2 t − 5t2 − 4st + 5,
C C C C C C C C C C C C C C C C C A
b(s, t) = −3s2 t2 + 3s2 t + s2 + st2 − s − 2t2 − 5st + 1, c(s, t) = −5s2 t2 + 6s2 t + 2st − t2 − t − 5, d(s, t) = −4s2 t2 + 3s2 t − st + 6t2 − t + 1. If we use the computer algebra system Singular or the package CASA in Maple to compute a generator of syzygy module Syz(a, b, c, d), then we get four or five vector polynomials (depending on different orderings), and it is very difficult to find proper combinations of them to form a µ-basis. By our algorithm, we can easily compute a µ-basis. The result is omitted.
− 68094 st − 118252 s − 155860 t2 − 82180 t − 18136 0
1
B
C C C C C C C A
6. CONCLUSION AND FUTURE WORKS
6 s4 + 10 s3 + 2 s2 − 4 s + 4 + 5 ts4 + 4 ts3 B C −4 ts2 + 12 ts − 8 t s, B C
B 1 B 0, r= B 6 B −2 s 3 s4 + 5 s3 + s2 − 2 s + 2 , B B 2 s5 + s4 + 5 s3 t − 4 s3 − 6 s2 t
In this paper, we apply the theory of polynomial matrices to compute µ-bases of rational curves and surfaces. The algorithm is based on several important techniques in the theory of polynomial matrices, such as primitive factorization, Hermite form and GCD extraction. This is the only known algorithm to compute the µ-bases of general rational surfaces, and it is superior than any existing algorithms for computing the µ-bases of rational curves. In the future, µ-bases of a spatial rational parametric curve will be considered. It is expected that the implicit equation of a space curve can be computed from a µ-bases. On the other hand, finding an efficient method to compute a minimal µ-basis of a rational surface and the complexity analyzing of the algorithm are problems worthy of further research.
+ 10 s2 + 8 st − 4 t Now we apply degree reduction algorithm to reduce the degree of the µ-basis. The new µ-basis is shown below. 1 p′ = 236661622380753 · 0
−300729067167523 st + 56467802265703 s2 t + 203543640533634 s + 228386226979701 t − 196295646522670 s2 − 140869867499516,
B B B B B B −279870932485122 s + 90277583448339 st B B − 125135373778758 t + 140869867499516 B B + 48465134115412 s2 − 97142080550445 s2 t, B B B −58285248330267 s2 t + 81366332623128 st B B + 103250853200943 t + 147830512407258 s2 B B + 76327291951488 s, B B B −40462746679845 st − 77713664440356 s2 t B − 207436107658841 s − 105225741859592 t
1 C C C C C C C C C C C C C C C C C C C C C A
7. ACKNOWLEDGMENTS The authors are support by the Outstanding Youth Grant of NSF of China (No. 60225002), NSF of China (10201030 and 60473132), a National Key Basic Research Project of China (2004CB318000), the TRAPOYT in Higher Education Institute of MOE of China, and SRF for ROCS, SEM. Special thanks go to Dr. Zhiping Lin from Nanyang Technological University, Singapore, for his novel results in the paper [12] and for the helpful discussions with him.
− 811703353674 s2 − 70434933749758 ′
q =
8. REFERENCES
[1] Falai Chen, Jianmin Zheng, and T. W. Sederberg, The µ-basis of a rational ruled surface, Computer Aided Geometric Design, Vol.18, 2001, 61–72. [2] Falai Chen and Thomas W. Sederberg, A new implicit representation of a planar rational curve with high order of singularity, Comput. Aided Geom. Design, vol.19, 2002, 151–167. [3] Falai Chen and Wenping Wang, The µ-basis of a planar rational curve — properties and computation, Graphical Models, Vol.64, 2003, 368–381. [4] Falai Chen, Reparameterization of a rational ruled surface by µ-basis, Computer Aided Geometric Design, Vol.20, 2003, 11–17. [5] Falai Chen and Wenping Wang, Revisiting the µ-basis of a rational ruled surface, Journal of Symbolic Computation, Vol.36, 2003, 699–716. [6] Falai Chen, David Cox, and Yang Liu, The µ-basis of a rational parametric surface, Journal of Symbolic Computation, Vol.39, 2005, 689–706.
1 · 56167843795
0
1
B B −326160947200 − 360111336704 s2 + 487462361088 s, B B B −410518028544 s2 − 164118258432 s, B B 163080473600 − 223271993856 s2
C C C C C C C A
326160947200 + 162042688768 s2 t + 770629365248 s2 B C + 324085377536 st − 323344102656 s, B C
+ 82429766656 s + 162042688768 t r′ = 0
1 · 2039538
23600164 − 47200328 s3 + 64900451 s2 − 23600164 s,
B B −23600164 + 47200328 s + 29500205 s3 − 35400246 s2 , B B B 17700123 s3 − 29500205 s2 − 23600164 s,
1 C C C C C A
11800082 + 23600164 s3 + 5900041 s2
138
[7] Falai Chen and Wenping Wang, Computing the singular points of a planar rational curve using the µ-basis, preprint, 2004. [8] David Cox, John Little, and Donal O’Shea, Using Algebraic Geometry, New York, Springer-Verlag, 1998. [9] David Cox, T. W. Sederberg, and Falai Chen, The moving line ideal basis of planar rational curves, Computer Aided Geometric Design, Vol.15, 1998, 803–827. [10] F. R. Gantmacher, Theory of Matrices, New York: Chelsea Publishing Co., 1959. [11] John P. Guiver and N. K. Bose, Polynomial matrix primitive factorization over arbitrary coefficient filed and related results, IEEE Transactions on Circuits and Systems, Vol.CAS-29, No.10, 1982, 649–657. [12] Zhiping Lin, On syzygy modules for polynomial matrices, Linear Algebra and Its Application, Vol.298, 1999, 73–86. [13] Martin Morf, Bernard C. L´evy, and Sun-Yuan Kung, New results in 2-D system theory, Part I: 2-D polynomial matrices, factorization, and coprimeness, Proceedings of the IEEE, Vol.65, No.6, 1977, 861–872.
[14] T. W. Sederberg and Falai Chen, Implicitization using moving curves and surfaces, Computer Graphics Proceedings, Annual Conference Series, Vol.2, 1995, 301–308. [15] T. W. Sederberg, T. Saito, D. Qi, and K. Klimaszewski, Curve implicitization using moving lines, Computer Aided Geometric Design, Vol.11, 1994, 687–706. [16] D. C. Youla and G. Gnavi, Notes on n dimensional systems, IEEE Transaction on Circuits and System, Vol.26, 1979, 105–111. [17] Fangling Zeng and Falai Chen, Degree reduction of rational curves by µ-basis, Computer Mathematics, Proceedings of the Sixth Asian Symposium (ASCM’2003), Lecture Notes Series on Computing, Vol.10, ed. Ziming Li and William Sit, World Scientific, 2003, 265–275. [18] Jianmin Zheng and T. W. Sederberg, A direct approach to computing the µ-basis of planar rational curves, Journal of Symbolic Computation, Vol.31, 2001, 619–629.
139
Efficient Computation of the Characteristic Polynomial Jean-Guillaume Dumas
Clement ´ Pernet
Zhendong Wan
Universite´ Joseph Fourier, LMC-IMAG B.P. 53, 38041 Grenoble Cedex 9, France.
Universite´ Joseph Fourier, LMC-IMAG B.P. 53, 38041 Grenoble Cedex 9, France.
Department of Computer and Information Science University of Delaware, Newark, DE 19716, USA.
[email protected] [email protected] [email protected] ABSTRACT
algorithms to compute it are based on computations of characteristic polynomial (see for example [21, §9.7]). Using classical matrix multiplication, the algebraic time complexity for the computation of the characteristic polynomial is optimal : several algorithms require only O(n3 ) algebraic operations (to our knowledge the oldest one is due to Danilevski [11, §24]). Now considering that the determinant can be deduced from the characteristic polynomial, and that its computation is proven to be as hard as matrix multiplication [2] the optimality is then straightforward. But with fast matrix arithmetic (O(nω ) with 2 ≤ ω < 3), the best asymptotic time complexity is O(nω logn), given by Keller-Gehrig’s branching algorithm [15]. Now the third algorithm of Keller-Gehrig has a O(nω ) algebraic time complexity but only works for generic matrices. In this paper we focus on the practicability of such algorithms applied on matrices over a finite field. Therefore we used the techniques developped in [4, 5], for efficient basic linear algebra operations over a finite field. We propose a new O(n3 ) algorithm designed to take benefit of the block matrix operations; improve KellerGehrig’s branching algorithm and compare these two algorithms. Then we focus on Keller-Gehrig’s third algorithm and prove that its generalization is not only of theoretical interest but is also promising in practice. As an application, we show that these results directly lead to an efficient computation of the characteristic polynomial of integer matrices using chinese remaindering and an early termination criterion adapted from [6]. This basic application outperforms the best existing softwares on many cases. Now better algorithms exist for the integer case, and can be more efficients with sparse or structured matrices. Therefore, we also propose a probabilistic algorithm using a blackbox computation of the minimal polynomial and our finite field algorithm. This can be viewed as a simplified version of the algorithm described in [22] and [14, §7.2]. Its efficiency in practice is also very promising. In the following we will denote by Ai1 ...i2 ,j1 ...j2 the submatrix of A located between rows i1 and i2 and columns j1 and j2 and by Ak,1...n the kth row vector of A.
We deal with the computation of the characteristic polynomial of dense matrices over word size finite fields and over the integers. We first present two algorithms for finite fields: one is based on Krylov iterates and Gaussian elimination. We compare it to an improvement of the second algorithm of Keller-Gehrig. Then we show that a generalization of KellerGehrig’s third algorithm could improve both complexity and computational time. We use these results as a basis for the computation of the characteristic polynomial of integer matrices. We first use early termination and Chinese remaindering for dense matrices. Then a probabilistic approach, based on integer minimal polynomial and Hensel factorization, is particularly well suited to sparse and/or structured matrices.
Categories and Subject Descriptors G.4 [Mathematics and Computing]: Mathematical Software—Algorithm Design and Analysis; I.1.2 [Computing Methodologies]: Symbolic and Algebraic Manipulation
General Terms Algorithms, Experimentation
Keywords Characteristic polynomial, minimal polynomial, Keller-Gehrig, probabilistic algorithm, finite field, integer, Magma
1.
INTRODUCTION
Computing the characteristic polynomial of an integer matrix is a classical mathematical problem. It is closely related to the computation of the Frobenius normal form which can be used to test two matrices for similarity. Although the Frobenius normal form contains more information on the matrix than the characteristic polynomial, most
2. KRYLOV’S APPROACH Among the different techniques to compute the characteristic polynomial over a field, many of them rely on the Krylov approach. A description of them can be found in [11]. They are based on the following fact: the minimal linear dependance relation between the Krylov iterates of a vector v (i.e. the sequence (Ai v)i ) gives the minimal polynomial min PA,v of this sequence, and a divisor of the minimal polyno-
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ISSAC’05, July 24–27, 2005, Beijing, China. Copyright 2005 ACM 1-59593-095-7/05/0007 ...$5.00.
140
mial of A. Moreover, if X is formed by the the first independent column vectors of this sequence and CP min is the com-
to avoid the computation of the last n − k Krylov iterates with an early termination. This reduces the time complexity to O(nω logk) for fast matrix arithmetic, and O(n2 k) for classical matrix arithmetic. Note that if the finite field is reasonably large, then choosing v randomly makes the algorithm Monte-Carlo for the computation of the minimal polynomial of A.
A,v
min panion matrix associated to PA,v , we have AX = XCP min . A,v
2.1 Minimal polynomial We show here an algorithm to compute the minimal polynomial of the sequence of the Krylov iterates of a vector min v and a matrix A. This is the monic polynomial PA,v of least degree such that P (A).v = 0. We firstly presented it in [19, 18], however, we recall it here for clarity and accessibility purposes. A similar algorithm was also simultaneously published in [16, Algorithm 2.14], but it does not take advantage of fast matrix multiplication as we do here. The idea is to compute the n × n matrix KA,v (called the Krylov matrix), whose ith column is the vector Ai u, and to perform an elimination on it. More precisely, one computes the LSP min t . So [12] factorization of KA,v . Let k be the degree of PA,v the first k columns of KA,v are linearly independent, and the n − k following ones are a linear combination of the first k ones. Therefore S is triangular with its last n − k rows t equals to 0. Thus, the LSP factorization of KA,v can be viewed as in figure 1.
t
K(A,v) =
vt (Av) t (A 2v) t ..... (A k+1 v)t
=
2.2 LU-Krylov algorithm From algorithm 2.1, we then derive the computation of the characteristic polynomial since it produces the k first independent Krylov iterates of v. They form a basis of an invariant subspace under the action of A (the first invariant min subspace of A if PA,v = PAmin ). The idea is to use the elimination performed on this basis to compute a basis of its supplementary subspace. Then a recursive call on a n − k × n − k matrix provides the remaining factors. The algorithm follows, where k, P , and S are as in algorithm 2.1. Algorithm 2.2 LUK : LU-Krylov algorithm Require: A a n × n matrix over a field A Ensure: Pchar the characteristic polynomial of A 1: Pick a random vector v A,v 2: Pmin =» MinPoly(A, v) of degree k – L1 {X = [S1 |S2 ]P is computed} L2 3: if (k = n) then A,v A 4: return Pchar = Pmin 5: else » 0 – A11 A012 6: A0 = P A T P T = where A011 is k × k. A021 A022
L 1..k .
S
. P
L k+1 .....
Figure 1: LSP factorization of the Krylov matrix
−1
A0 −A021 S1
22 Pchar
7:
8: return 9: end if
Now the trick is to notice that the vector m = Lk+1 L−1 1...k min contains the opposites of the coefficients of PA,v . Indeed, T let us define X = KA,v therefore Xk+1,1...n = (Ak v)T = Pk−1 i T min = m · X1...k,1...n where PA,v (X) = X k − i=0 mi (A v) k−1 mk X − · · · − m1 X − m0 . Thus Lk+1 SP = m · L1...k SP and finally m = Lk+1 .L−1 1...k . Algorithm 2.1 is then straightforward. The dominant operation in this algorithm is the
A Pchar
S2
(X) = LUK(A022 − A021 S1−1 S2 ) −1 A0 −A021 S1 S2
A,v 22 = Pmin × Pchar
Theorem 2.1. Algorithm LU-Krylov computes the characteristic polynomial of a n × n matrix A in O(n3 ) field operations. Proof. The first k rows of X (X1...k,1...n ) form a basis of the invariant subspace generated by v. Moreover we have X1..k AT = CPT A,v X1..k Indeed ∀i < k we have min ` ´T ` ´T Xi AT = Ai−1 v AT = Ai v = Xi+1 and Xk AT = ` k−1 ´T T ` k ´T ` i ´T Pk−1 A v A = A v = The idea is i=0 mi A v now to complete this basis into a basis of the whole space. Viewed as a matrix, this basis form the n × n invertible matrix X: – » –» » – X L1 0 S1 S2 X= P = ˆ 1...k,1...n˜ 0 In−k 0 In−k 0 In−k P | {z }| {z }
Algorithm 2.1 MinPoly : Minimal Polynomial of A and v Require: A a n × n matrix and v a vector over a field A,v Ensure: Pmin (X) minimal polynomial of (Ai v)i 1: K1...n,1 = v 2: for i = 1 to log2 (n) do i−1 3: K1...n,2i ...2i+1 −1 = A2 K1...n,1...2i −1 4: end for 5: (L, S, P ) = LSP(K T ), k = rank(K) 6: m = Lk+1 .L−1 1...k P A,v i 7: return Pmin (X) = X k − k−1 i=0 mi X
L
S
Let us compute
computation of K, in log2 n matrix multiplications, i.e. in O(nω logn) algebraic operations. The LSP factorization requires O(nω ) operations and the triangular system resolution, O(n2 ). So the algebraic time complexity of this algorithm is O(nω logn). Now with classical matrix multiplications (ω = 3), one should prefer to compute the Krylov matrix K by k successive matrix vector products. The complexity is then O(n3 ). One can also merge the construction of the Krylov matrix and its LSP factorization so as
T
XA X
−1
= =
"
# CT 0 ˜ ˆ −1 −1 0 In−k P AT P T S L " # » CT 0 CT ˆ 0 ˜ −1 −1 = Y A21 A022 S L
0 X2
–
with X2 = A022 − A021 S1−1 S2 . By a similarity transformation, we thus have reduced A to a block triangular matrix.
141
Algorithm 2.3 ColReducedForm Require: A a m × n matrix of rank r Ensure: r linearly independent columns of A 1: (L, Q, U, P, r) = LQUP(AT ) (r = rank(A)) 2: return ([Ir 0](QT AT ))T
Then the characteristic polynomial of A is the product of the characteristic polynomial of these two diagonal blocks: A0 −A0 S −1 S
A,v 2 A 22 21 1 Pchar = Pmin × Pchar . Now for the time complexity, we will denote by TLUK (n) the number of field operations for this algorithm applied on a n×n matrix, by Tminpoly (n, k) the cost of the algorithm 2.1 applied on a n × n matrix having a degree k minimal polynomial, by TLSP (m, n) the cost of the LSP factorization of a m × n matrix, by Ttrsm (m, n) the cost of the simultaneous resolution of m triangular systems of dimension n, and by TMM (m, k, n) the cost of the multiplication of a m × k matrix by a k × n matrix. The values of TLSP and Ttrsm can be found in [5]. Then, using classical matrix arithmetic, we have: TLUK (n) = Tminpoly (n, k) + TLSP (k, n)+Ttrsm (n−k, k)+Tmm (n−k, k, n−k)+TLUK (n−l) = 2 O(n k2 n + k2 (n − k) + k(n − k)2 ) + TLUK (n − k) = Pk + 2 2 O( n) = O(n3 ), The latter being true since P i n ki + kiP 2 2 k = n and i i i ki ≤ n .
LQUP factorization indicates the positions of the linearly independant blocks of iterates in W . To each of these blocks, one can associate a block column in L. Now applying the triangular system resolution of algorithm 2.1 to this block column will compute the coefficients of the first linear dependency between these iterates. Since the Krylov iterates are already computed, and the last call to ColReducedForm performed the elimination on them, there only remains to solve triangular systems. We thus get the coefficients of each polynomial, for a total cost of O(n2 ). Algorithm 2.4 shows
We have thus derived a deterministic algorithm from a probabilistic one. When algorithm 2.1 fails, it still returns a factor of the true minimal polynomial and the next recursive calls then compute the forgotten factors. Note also that when using fast matrix arithmetic, it is no longer possible to sum the log(ki ) into log(n) nor the kiω−2 n2 into nω . Therefore the best known algebraic time complexity, O(nω logn), can not be reached by such an algorithm. We thus focus on the second algorithm of Keller-Gehrig achieving this best known time complexity.
Algorithm 2.4 KGB: Keller-Gehrig Branching algorithm Require: A a n × n matrix over a field A Ensure: Pchar (X) the characteristic polynomial of A 1: i = 0 2: V0 = In = (V0,1 , V0,2 , . . . , V0,n ) 3: B = A 4: while (∃k, Vk has 2i columns) do 5: for all j do 6: if ( Vi,j has strictly less than 2i columns ) then 7: Wj = Vi,j 8: else 9: Wj = [Vi,j |BVi,j ] 10: end if 11: end for 12: W = (Wj )j 13: Vi+1 = ColReducedForm(W ) (remember L and Q from LQUP) {Vi+1,j are the remaining vectors of Wj in Vi+1 } 14: B = B ×B 15: i=i+1 16: end while 17: for all j do 18: Let s, t be the indexes of the first and last column of linearly independent iterates of the vector ej in W (given by Q) 19: m = Lt+1,s...t .L−1 s...t,s...t P 20: Pj (X) = X t−s − t−s−1 mi X i i=0 21: end for 22: return Πj Pj
2.3 Improving Keller-Gehrig branching algorithm In [15], Keller-Gehrig presents a so called branching algorithm, computing the characteristic polynomial of a n × n matrix over a field K in the best known time complexity : O(nω logn) field operations. The idea is to compute the Krylov iterates of a several vectors at the same time, so as to replace several matrix vector products by a fast matrix multiplication. More precisely, the algorithm computes a sequence of n×n matrices (Vi )i whose columns are the Krylov iterates of vectors of the canonical basis. V0 is the identity matrix (every vector of the canonical basis is present). At the i-th iteration, the algorithm computes the next 2i Krylov iterates of the remaining vectors. Then a Gaussian elimination determines the linear dependencies between them so as to form Vi+1 by picking the n linearly independent vectors. The algorithm ends when no more iterate can be added (Vi+1 = Vi ). Then the matrix Vi−1 AVi is block upper triangular with companion blocks on the diagonal. The polynomials of these blocks are the minimal polynomials of each of the sequence of Krylov iterates, and their product is the characteristic polynomial of the input matrix. The removal of the linear dependencies is performed by a step-form elimination algorithm defined by Keller-Gehrig. Its formulation is rather sophisticated, and we propose to replace it by the column reduced form algorithm (algorithm 2.3) using the more standard LQUP factorization [12]. More precisely, the step form elimination of Keller-Gehrig, the LQUP factorization of Ibarra & Al. and the echelon form elimination (see e.g. [21]) are equivalent and can be used to determine the linear dependencies in a set of vectors. Our second improvement is to apply the idea of algorithm 2.1 to compute the polynomials associated to each companion block, instead of computing V −1 AV . The permutation Q of the
these modifications. The operations in the while loop have a O(nω ) algebraic time complexity. This loop is executed at most logn times and the overall algebraic time complexity is therefore O(nω logn). More precisely it is O(nω logkmax ) where kmax is the degree of the largest invariant factor.
2.4 Experimental comparisons We implemented these two algorithms, using a finite field representation on double size machine floating point numbers : modular<double> (see [5]), and the efficient routines for finite field linear algebra FFLAS-FFPACK presented in [5, 4]. We also only considered classic matrix arithmetic. We ran them on a series of matrices of order 300 whose Frobenius normal forms had different number of diagonal com-
142
Kω nω + o(nω ) algebraic operations, where » 2ω−2 1 Kω = Cω − − ω ω−2 2(2 − 1)(2ω−1 − 1)(2ω − 1) 2 −1 1 3 2 + ω−2 − ω−1 + ω−2 (2 − 1)(2ω−1 − 1) 2 −1 2 −1
panion blocks. Figure 2 shows the computational time on a Pentium IV 2.4Ghz with 512Mb of RAM. It appears that
KGB vs LU−Krylov for a 300x300 matrix over GF(101) 1.6
LU−Krylov Keller−Gehrig
1.4
1 2ω−2 + ω−2 + ω ω−2 (2 − 1)(2 − 1) 2(2 − 1)(2ω−1 − 1)2
Time (s)
1.2 1
and Cω is the constant in the algebraic time complexity of the matrix multiplication.
0.8 0.6
The proof and a description of the algorithm are given in appendix A. In particular, with classical matrix arithmetic (ω = 3, Cω = 2), we have on the one hand Kω = 176/63 ≈ 2.794. On the other hand, the algorithm 2.2 called on a generic matrix simply computes the n Krylov vectors Ai v (2n3 operations), computes the LUP factorization of these vectors (2/3n3 operations) and the coefficients of the polynomial by the resolution of a triangular system (O(n2 )). Therefore, the constant for this algorithm is 2+2/3 ≈ 2.667. These two algorithms have thus a similar algebraic complexity, LU-Krylov being slightly faster than Keller-Gehrig’s third algorithm. We now compare them in practice.
0.4 0.2 0 0
5
10
15
20 25 30 35 Number of blocks
40
45
50
Figure 2: LU-Krylov vs. KGB LUK is faster than KGB on every matrix. This is due to the extra logn factor in the time complexity of the latter. One can note that the computational time of KGB is decreasing with the number of blocks. This is due to the fact that the log(n) is in fact log(kmax where kmax is the size of the largest block. This factor is decreasing when the number of blocks increases. Conversely, LUK computational time is almost constant. It slightly increases, due to the increasing number of rectangular matrix operations: their computation are less efficient than square matrix operations, due to BLAS optimizations of memory accesses.
3.2 Experimental comparison We claim that the study of precise algebraic time complexity of these algorithms is worthwhile in practice. Indeed these estimates directly correspond to the computational time of these algorithms applied over finite fields. Therefore KG3 vs LU−Krylov over Z/65521Z 1800
LU−Krylov Keller−Gehrig Fast algorithm
1600
3.
–
1400
TOWARD AN OPTIMAL ALGORITHM
1200 Mfops
As mentioned in the introduction, the best known algebraic time complexity for the computation of the characteristic polynomial is not optimal in the sense that it is not O(nω ) but O(nω logn). However, Keller-Gehrig gives a third algorithm (let us name it KG3), having this time complexity but only working on generic matrices. In the following, we will use Keller-Gehrig’s definition of a generic matrix : each of its coefficients can be considered as an independent indeterminate. To get rid of the extra log(n) factor, it is no longer based on a Krylov approach. The algorithm is inspired by a O(n3 ) algorithm by Danilevski (described in [11]), improved into a block algorithm. The genericity assumption ensures the existence of a series of similarity transformations changing the input matrix into a companion matrix.
1000 800 600 400 200 0
500
1000
1500 2000 Matrix order
2500
3000
3500
Figure 3: LUK vs. KG3: speed comparison we ran these algorithms on a word size prime finite field with modular arithmetic. Again we used modular<double> and FFLAS-FFPACK. These routines can use fast matrix arithmetic, we, however, only used classical matrix multiplication so as to compare two O(n3 ) algorithms having similar constants (2.67 for LUK and 2.794 for KG3). We used random dense matrices over the finite field Z65521 , as generic matrices. We report the computational speed in Mfops (Millions of field operations per second) for the two algorithms on figure 3. It appears that LU-Krylov is faster than KG3 for small matrices (better algebraic time complexity), but for matrices of order larger than 1500, KG3 is faster. Indeed, the O(n3 ) operations are differently performed: LUKrylov computes the Krylov basis by n matrix-vector products, whereas KG3 only uses matrix multiplications. Now, as the order of the matrices increases, the BLAS routines
3.1 Comparing the constants The optimal “big-O” complexity often hides a large constant in the exact expression of the time complexity. This makes these algorithms impracticable since the improvement induced is only significant for huge matrices. However, we show in the following lemma that the constant of KG3 is very close to the one of LUK. Lemma 3.1. The computation of the characteristic polynomial of a n×n generic matrix using KG3 algorithm requires
143
4.
“
”
1 √ ≤ . Therefore for n > 4, log 2 1+(n) 12n+1 2π n −1.296. Then i(n−i) is decreasing in i for i < bn/2c so that n . its maximum is 2(n−2) ` ´i “ n ”n−i p (n−i) Consider now K(n, i) = ni (n − i)B 2 . n−i 2 n−i n n We have log 2 (K(n, i)) = 2 log2 (B )+ 2 log2 (n)+ 2 T (n, i), n where T (n, i) = log 2 ( n−i ). Well T (n, i) is ) + ni log2 ( n−i i2 √ −1+ 1+4en maximal for i = as announced in the lemma. 2e n We end with the fact that T (n, i) − n2 1.296 + n1 log2 ( 2(n−2) ) 1 12n
provide better efficiency for matrix multiplications than for matrix vector products. Once again, algorithms exclusively based on matrix multiplications are preferable: from the complexity point of view, they make it possible to achieve O(nω ) time complexity. In practice, they promise the best efficiency thanks to the BLAS better memory management.
OVER THE INTEGERS
There exist several algorithms to compute the characteristic polynomial of an integer matrix. A first idea is to perform the algebraic operations over the ring of integers, using exact divisions [1] or by avoiding divisions [3, 13, 14]. We focus here on field approaches. Concerning the bit complexity of this computation, a first approach, using Chinese remaindering gives O˜ (nω+1 logkAk) bit operations (O˜ is the “soft-O” notation, hiding logarithmic and polylogarithmic factors in n and kAk). Baby-step Giant-step techniques applied by Kaltofen [13] improves this complexity to O˜ (n3.5 logkAk) (using classical matrix arithmetic). Lastly, the recent improvement of [14, §7.2], combining Coppersmith’s block-Wiedemann techniques set the best known exponent for this computation to 2.697263 using fast matrix arithmetic. Our goal here is not to give an exhaustive comparison of these methods, but to show that a straightforward application of our finite field algorithm LU-Krylov is already very efficient and can outperform the best existing softwares. A first dense deterministic algorithm, using Chinese remaindering is given in section 4.1. Then we propose in section 4.2 a probabilistic algorithm that can be adapted for dense or for sparse and structured matrices. It combines the early termination technique of [6, §3.3], and a recent alternative to chinese remaindering in [22], also developed in [14, §7.2]. Lastly we compare implementations of these algorithms in practice.
< (n)
n
1+
ak
k=1
(1 − ak ) > 1 −
ak k=1
k=1
then (n ≥ 3, x ∈
ak ,
!
!
Fn (x) ≤ (x − 1) (x + 2)
n , n−1
n
"
>
1
F1 (x) = 1, F2 (x) = x,
n−3
xi
k=2
n
1
• 3.3.38 in [16]: If Fn (x) denotes the nth Fibonacci polynomial, defined by
2
!
i=1
n
2
>
k−1
• The Weierstraß inequalities (3.2.37 in [16]):
(0 < x ≤ 1, n ≥ 0)
2
xk
where xk > 0, x2 > 4x1 , and n ≥ 2
1 1 + nxn+1 ≤ 1 + n(1 − x)2 x−n (1 + n)xn 2
Fn+2 (x) = xFn+1 (x) + Fn (x),
n
n2 , n−1
i=1
• Levin’s inequality (3.2.12 in [16]):
2
(n ≥ 1)
• A variation of 4.15 in [15]:
A suitable choice for Reduce is Mathematica’s command CylindricalDecomposition [24], which returns
1≤
1 4
n+
(n ≥ 1) for 0 < ak < 1 with
)
n k=1 !
ak < 1
• An inequality of Beesack [3],
There is no need to restrict the defining recurrences to rational functions. Inequalities involving algebraic functions can be treated as well, provided that the algebraic functions are simple enough so that the CAD implementation can deal with them. This makes it possible to automatize the proofs for the following inequalities:
n
k
xi k=1
β
i=1
n
xα i ≤
α+β
xk k=1
(n ≥ 1)
for xk > 0 and α ≥ 1, α + β ≥ 1, can be done for specific values α ∈ , β ∈ , e.g., for α = 2, β = −1. #
$
We are also able to prove inequality 5.16 from [15], • [17] If (Rn ) is defined by n Rn+1 = 1 + (n ≥ 1), Rn
n
R1 = 1,
k=1
159
sin(kx) ≥
1 sin(nx) 2
(0 ≤ x ≤ π, n ≥ 1),
4.2 Difficult Examples
using
There are some prominent examples of quite difficult inequalities that also fit well into the class of inequalities we consider. One example is the Askey-Gasper inequality mentioned in the introduction. This inequality reduces for α = 0 to Fej´er’s inequality [1]. Another example is the inequality of Vietoris [15, § 0.8]: if (an )n≥1 is positive, decreasing, and satisfies
sin((n + 2)x) = 2 cos x sin((n + 1)x) − sin(nx) as defining recurrence and the identity (cos2 x − 1) − sin2 (nx) − 2 cos x sin(nx) sin((n + 1)x) − sin2 ((n + 1)x) = 1
(n ∈
)
%
as well as the facts −1 ≤ cos x ≤ 1 and 0 ≤ sin(nx) ≤ 1 as additional knowledge. The figure below shows the graph of n &
fn (x) := k=1
sin(kx) −
Choosing ak = 1/k gives the Fej´er-Jackson inequality [15] n
3
k=1
2.5
1.5 1 0.5 2.5
3
Inequalities which are not amenable to our proving procedure can sometimes be rewritten in a way that makes the proving procedure applicable. As an example, consider inequality (3) of Knopp and Schur. We have the relation ∞
1
&
k2 k+n n k=1 '
The procedure of Section 3 can be modified slightly in order to analyze the sign patterns of oscillating sequences. Consider C-finite sequences (fn )n≥0 defined by linear homogeneous recurrences with constant coefficients, for example √ √ √ f0 = 2 + 2, f1 = 2 + 10, f2 = −2 + 5 2 √ √ √ fn = (4 + 5)fn−1 − (5 + 4 5)fn−2 + 5 5fn−3 (n > 3).
(
where the first identity was provided by Mathematica, and (2) 2 the second follows from Thm. 1.2.7 in [1]. Hn = n k=1 1/k denotes the harmonic number of second order and ψ the digamma function [1]. Zeilberger’s algorithm [20, 21] delivers )
n−1
n−1
1
&
k2 k+n n k=1
&
=
'
k=1 (
3k2 + 3k + 1 2(k + 1)(2k + 1)k 2 '
2k−1 k−1
−
The initial values and recurrence coefficients are chosen such that (fn )n≥1 has the closed form √ (n ≥ 0) fn = 2 5n/2 (1 − 2 sin(nθ − π4 ))
1 (k + 1)2
with θ = arctan 12 . It is well known [4] that the numbers (nθ − π4 ) mod 2π lie dense in the interval [0, 2π], hence fn clearly has infinitely many positive and infinitely many negative values. Our goal is to obtain finer information on the sign of fn . As additional knowledge, we use the identity √ √ 10 25fn2 − 11 (14 + 13 5)fn fn+1 − 20 (2 − 6 5)fn fn+2 11 √ √ 2 2 2 (14 − 13 5)fn+1 fn+2 = 0, + (6 + 4 5)fn+1 − 11 + fn+2
(
by which (3) simplifies to π2 −1− 6
n−1 &
k=1
3k2 + 3k + 1 k2 (k + 1)(2k + 1) '
2k k (