Cauchy’s Cours d’analyse: An Annotated Translation
For other titles published in this series, go to http://www.springer.com/series/4142
Sources and Studies in the History of Mathematics and Physical Sciences
Editorial Board: L. Berggren, J.Z. Buchwald, J. Lützen
Robert E. Bradley, C. Edward Sandifer
Cauchy’s Cours d’analyse: An Annotated Translation
Robert E. Bradley, Department of Mathematics and Computer Science, Adelphi University, Garden City, NY 11530, USA, [email protected]
C. Edward Sandifer, Department of Mathematics, Western Connecticut State University, Danbury, CT 06810, USA, [email protected]
Series Editor: J.Z. Buchwald, Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA 91125, USA, [email protected]
ISBN 978-1-4419-0548-2    e-ISBN 978-1-4419-0549-9    DOI 10.1007/978-1-4419-0549-9
Springer Dordrecht Heidelberg London New York
Library of Congress Control Number: 2009932254
Mathematics Subject Classification (2000): 01A55, 01A75, 00B50, 26-03, 30-03
© Springer Science+Business Media, LLC 2009
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
We dedicate this volume to Ronald Calinger, Victor Katz and Frederick Rickey, who taught us the importance and satisfaction of reading original sources, and to our friends in ARITHMOS, with whom we enjoy putting those lessons into practice.
Translators’ Preface
Modern mathematics strives to be rigorous. Ancient Greek geometers had similar goals, to prove absolute truths by using perfect deductive logic starting from incontrovertible premises. Often in the history of mathematics, we see a pattern where the ideas and applications come first and the rigor comes later. This happened in ancient times, when the practical geometry of the Mesopotamians and Egyptians evolved into the rigorous efforts of the Greeks. It happened again with calculus. Calculus was discovered, some say invented, almost independently by Isaac Newton (1642–1727) about 1666 and by Gottfried Wilhelm von Leibniz (1646–1716) about 10 years later, but its rigorous foundations were not established, despite several attempts, for more than 150 years. In 1821, Augustin-Louis Cauchy (1789–1857) published a textbook, the Cours d’analyse, to accompany his course in analysis at the École Polytechnique. It is one of the most influential mathematics books ever written. Not only did Cauchy provide a workable definition of limits and a means to make them the basis of a rigorous theory of calculus, but also he revitalized the idea that all mathematics could be set on such rigorous foundations. Today, the quality of a work of mathematics is judged in part on the quality of its rigor; this standard is largely due to the transformation brought about by Cauchy and the Cours d’analyse.

The 17th century brought the new calculus. Scientists of the age were convinced of the truth of this calculus by its impressive applications in describing and predicting the workings of the natural world, especially in mechanics and the motions of the planets. The foundations of calculus, what Colin Maclaurin (1698–1746) and Jean le Rond d’Alembert (1717–1783) later called its metaphysics, were based on the intuitive geometric ideas of Leibniz and Newton. Some of their contemporaries, especially Bishop George Berkeley (1685–1753) in England and Michel Rolle (1652–1719) in France, recognized the problems in the foundations of calculus. Rolle, for example, said that calculus was “a collection of ingenious fallacies,” and Berkeley ridiculed infinitely small quantities, one of the basic notions of early calculus, as “the ghosts of departed quantities.” Both Berkeley and Rolle freely admitted the practicality of calculus, but they challenged its lack of rigorous foundations. We
should note that Rolle’s colleagues at the Paris Academy eventually convinced him to change his mind, but Berkeley remained skeptical for his entire life. Later in the 18th century, only a few mathematicians tried to address the questions of foundations that had been raised by Berkeley and Rolle. Over the years, three main schools of thought developed: infinitesimals, limits, and formal algebra of series. We could consider the British ideas of fluxions and evanescent quantities either to be a fourth school or to be an ancestor of these others. Leonhard Euler (1707–1783) [Euler 1755] was the most prominent exponent of infinitesimals, though he devoted only a tiny part of his immense scientific corpus to issues of foundations. Colin Maclaurin [Maclaurin 1742] and Jean le Rond d’Alembert [D’Alembert 1754] favored limits. Maclaurin’s ideas on limits were buried deep in his Treatise of Fluxions, and they were overshadowed by the rest of the opus. D’Alembert’s works were very widely read, but even though they were published at almost the same time as Euler’s contrary views, they did not stimulate much of a dialog. We suspect that the largest school of thought on the foundations of calculus was in fact a pragmatic school – calculus worked so well that there was no real incentive to worry much about its foundations.

In An V of the French Revolutionary calendar, 1797 to the rest of Europe, Joseph-Louis Lagrange (1736–1813) [Lagrange 1797] returned to foundations with his book, the full title of which was Théorie des fonctions analytiques, contenant les principes du calcul différentiel, dégagés de toute considération d’infiniment petits ou d’évanouissans, de limites ou de fluxions, et réduits à l’analyse algébrique des quantités finies (Theory of analytic functions containing the principles of differential calculus, without any consideration of infinitesimal or vanishing quantities, of limits or of fluxions, and reduced to the algebraic analysis of finite quantities). The book was based on his analysis lectures at the École Polytechnique. Lagrange used power series expansions to define derivatives, rather than the other way around. Lagrange kept revising the book and publishing new editions. Its fourth edition appeared in 1813, the year Lagrange died. It is interesting to note that, like the Cours d’analyse, Lagrange’s Théorie des fonctions analytiques contains no illustrations whatsoever.

Just two years after Lagrange died, Cauchy joined the faculty of the École Polytechnique as professor of analysis and started to teach the same course that Lagrange had taught. He inherited Lagrange’s commitment to establish foundations of calculus, but he followed Maclaurin and d’Alembert rather than Lagrange and sought those foundations in the formality of limits. A few years later, he published his lecture notes as the Cours d’analyse de l’École Royale Polytechnique; I.re Partie. Analyse algébrique. The book is usually called the Cours d’analyse, but some catalogs and secondary sources call it the Analyse algébrique. Evidently, Cauchy had intended to write a second part, but he did not have the opportunity. The year after its publication, the École Polytechnique changed the curriculum to reduce its emphasis on foundations [Lützen 2003, p. 160].
Cauchy wrote new texts, Résumé des leçons données à l’École Polytechnique sur le calcul infinitésimal, tome premier in 1823 and Leçons sur le calcul différentiel in 1829, in which he reduced the material in the Cours d’analyse about foundations to just a few dozen pages.
Because it became obsolete as a textbook just a year after it was published, the Cours d’analyse saw only one French edition in the 19th century. That first edition, published in 1821, was 568 pages long. The second edition, published as Volume 15 (also identified as Series 2, Volume III) of Cauchy’s Oeuvres complètes, appeared in 1897. Its content is almost identical to the 1821 edition, but its pagination is quite different, there are some different typesetting conventions, and it is only 468 pages long. The Errata noted in the first edition are corrected in the second, and a number of new typographical errors are introduced. At least two facsimiles of the first edition were published during the second half of the 20th century, and digital versions of both editions are available on line, for example, through the Bibliothèque Nationale de France. There were German editions published in 1828 and 1885, and a Russian edition published in Leipzig in 1864. A Spanish translation appeared in 1994, published in Mexico by UNAM. The present edition is apparently the first edition in any other language.

The Cours d’analyse begins with a short Introduction, in which Cauchy acknowledges the inspiration of his teachers, particularly Pierre Simon Laplace (1749–1827) and Siméon Denis Poisson (1781–1840), but most especially his colleague and former tutor André Marie Ampère (1775–1836). It is here that he gives his oft-cited intent in writing the volume, “As for the methods, I have sought to give them all the rigor which one demands from geometry, so that one need never rely on arguments drawn from the generality of algebra.” The Introduction is followed by 16 pages of “Preliminaries,” what today might be called “Chapter Zero.” Here, Cauchy takes pains to define his terms, carefully distinguishing, for example, between number and quantity. To Cauchy, numbers had to be positive and real, but a quantity could be positive, negative or zero, real or imaginary, finite, infinite or infinitesimal.

Beyond the Preliminaries, the book naturally divides into three major parts and a couple of short topics. The first six chapters deal with real functions of one and several variables, continuity, and the convergence and divergence of series. In the second part, Chapters 7 to 10, Cauchy turns to complex variables, what he calls imaginary quantities. Much of this parallels what he did with real numbers, but it also includes a very detailed study of roots of imaginary equations. We find here the first use of the words modulus and conjugate in their modern mathematical senses. Chapter 10 gives Cauchy’s proof of the fundamental theorem of algebra, that a polynomial of degree n has n real or complex roots. Chapters 11 and 12 are each short topics, partial fraction decomposition of rational expressions and recurrent series, respectively. In this, Cauchy’s structure reminds us of Leonhard Euler’s 1748 text, the Introductio in analysin infinitorum [Euler 1748], another classic in the history of analysis. In Euler, we find 11 chapters on real functions, followed by Chapters 12 and 13, “On the expansion of real functions into fractions,” i.e., partial fractions, and “On recurrent series,” respectively.

The third major part of the Cours d’analyse consists of nine “Notes,” 140 pages in the 1897 edition. Cauchy describes them in his Introduction as “. . . several notes placed at the end of the volume [where] I have presented the derivations which may
be useful both to professors and students of the Royal Colleges, as well as to those who wish to make a special study of analysis.”

Though Cauchy was only 32 years old when he published the Cours d’analyse, and had been only 27 when he began teaching the analysis course on which it was based, he was already an accomplished mathematician. This should not be surprising, as it was not easy to earn an appointment as a professor at the École Polytechnique. Indeed, by 1821, Cauchy had published 28 memoirs, but the Cours d’analyse was his first full-length book.

Cauchy’s first original mathematics concerned the geometry of polyhedra and was done in 1811 and 1812. Louis Poinsot (1777–1859) had just established the existence of three new nonconvex regular polyhedra. Cauchy, encouraged to study the problem by Lagrange, Adrien-Marie Legendre (1752–1833) and Étienne Louis Malus (1775–1812), [Belhoste 1991, pp. 25–26] extended Poinsot’s results, discovered a generalization of Euler’s polyhedral formula, V − E + F = 2, and proved that a convex polyhedron with rigid faces must be rigid. These results became his earliest papers, the two-part memoir “Recherches sur les polyèdres” and “Sur les polygones et les polyèdres.” [Cauchy 1813] Despite his early success, Cauchy seldom returned to geometry, and these are his only significant results in the field.

After Cauchy’s success with the problems of polyhedra, his father encouraged him to work on one of Fermat’s (1601–1665) problems, to show that every integer is the sum of at most three triangular numbers, at most four squares, at most five pentagonal numbers, and, in general, at most n n-gonal numbers. He presented his solution to the Institut de France on November 13, 1815 and published it under the title “Démonstration générale du théorème de Fermat sur les nombres polygones” [Cauchy 1815]. Belhoste [Belhoste 1991, p. 46] tells us that this was the article “that made him famous,” and suggests that “[t]he announcement of his proof may have supported his appointment to the École Polytechnique a few days later.” Just a month later, on December 26, 1815, the Academy’s judgment was confirmed when Cauchy won the Grand Prix de Mathématiques of the Institut de France, and its prize of 3000 francs, for an essay on the theory of waves.

With his career established, Cauchy married Aloïse de Bure (1795?–1863) in 1818. They had two daughters. It is a measure of Cauchy’s later fame and success that one of his daughters married a count, the other a viscount. Indeed, Freudenthal [DSB Cauchy, p. 135] says that Cauchy “was one of the best known people of his time.” The de Bure family were printers and booksellers. The title page of the Cours d’analyse, published by de Bure frères, describes them as “Libraires du Roi et de la bibliothèque du Roi.”

It seems that Cauchy was an innovative but unpopular teacher at the École Polytechnique. He, along with Ampère and Jacques Binet (1786–1856), proposed substantial revisions in the analysis, calculus and mechanics curricula. Cauchy wrote the Cours d’analyse to support the new curriculum. In 1820, though, before the Cours d’analyse was published, but apparently after it had been written and the publisher had committed to printing it, the Conseil
Fig. 1 Cauchy, by Susan Petry, 18 × 28 cm, bas relief in tulip wood, 2008. An interpretation of portraits by Boilly (1821) and Roller (∼ 1840). Photograph by Eliz Alahverdian, 2008. Reprinted with permission of Susan Petry and Aliz Alahverdian. All rights reserved.
d’Instruction, more or less a Curriculum Committee, largely influenced by de Prony (1755–1839) and Navier (1785–1836), ordered that Cauchy and Ampère change the curriculum again. As a consequence, the Cours d’analyse was never used as a textbook. A more complete account of this episode is found in [Belhoste 1991, pp. 61–66].

Lectures at the École Polytechnique were scheduled to be 50 lectures per term, each consisting of 30 minutes “revision” then 60 minutes of lecture. On April 12, 1821, Cauchy was delivering the 65th lecture of the term. When the lecture neared the end of its second hour, students began to jeer, and some walked out. A formal investigation followed, and eventually both the students and Cauchy were found responsible, but nobody was punished. Fuller accounts are found in [Belhoste 1991, pp. 71–74] and [Grattan-Guinness 1990, pp. 709–712].

From 1824 to 1830, Cauchy also taught part-time at the Collège de France, where he presented, among other techniques, methods of differential equations, and gave lectures on the theory of light. At the same time he worked also as a substitute professor on the Faculté des Sciences de Paris, where he replaced Poisson, and lectured on the mechanics of solids, fluid mechanics and on his general theory of elasticity.

By 1826, Cauchy had grown impatient with the time it took for the Academy to publish his articles and memoirs. That year they published only 11 of his memoirs, up from six in 1825, so he founded a private journal, the Exercises de mathématiques, published by his in-laws, Debure frères. By 1830, he had published five volumes of the Exercises, containing 51 of his articles. These comprise volumes 18 to 21 of the Oeuvres complètes.

The July Revolution of 1830 deposed the Bourbon monarch, Charles X. Cauchy refused to take a loyalty oath to his Orleans successor, Louis-Philippe, and went into 8 years of voluntary exile. He taught at the University of Turin from 1831 to 1833, where he continued his journal under the new name, Résumés analytiques (Oeuvres complètes, volume 22), and then spent the rest of his exile tutoring in Prague in the exile court of Charles X. While in Prague, his king awarded Cauchy the title “Baron.”

In 1838, Cauchy returned to Paris, but because he had not taken the loyalty oath, he was not allowed to teach, either at the École Polytechnique or at his part-time jobs. He was still an active member of the Académie des Sciences, though, and over the next 10 years he submitted over 400 items to the Comptes rendus, the published notes and articles presented at the weekly meetings of the Academy. Because the Academy took breaks and vacations, “weekly” meetings did not actually take place every week. Over these 10 years, Cauchy averaged an article for each week the Academy was in session. These articles occupy most of volumes 4 to 10 of Cauchy’s 27-volume Oeuvres complètes. At the same time, he continued his private journal under yet another title, the Exercises d’analyse et de physique mathématique. These 47 articles fill volumes 23 to 26 of the Oeuvres complètes. During his decade away from the classroom, 1838 to 1848, Cauchy produced about half of his published works by item count, about a third of them by page count. It was a remarkable decade.
The February Revolution of 1848 ended the reign of Louis-Philippe and established the Second Republic. Loyalty oaths were not required, so Cauchy returned to the Faculté des Sciences as professor of mathematical astronomy. When loyalty oaths were reestablished in 1852, Napoleon III made an exception for Cauchy.

Cauchy’s last 9 years were active. In 1853, he published one last volume of the Exercises d’analyse et de physique mathématique. He did a good deal of research on the theory of light and bickered with his colleagues. He made another 159 contributions to the Comptes rendus. The last of his 589 contributions to that journal came on May 4, 1857. [Oeuvres 12, p. 435] It was a short note on mathematical astronomy, and he closed it with the words C’est ce que j’expliquerais plus au long dans un prochain Mémoire, “I will explain this at greater length in a future Memoir.” Clearly, he was not expecting to die just 18 days later.

Many studies give more detailed accounts of Cauchy’s life, works and times than we give here. For a full biography of Cauchy, we refer our readers to [Belhoste 1991]. The entry in the Dictionary of Scientific Biography [DSB Cauchy] is much briefer; it contains many inaccurate citations to Cauchy’s work and in general seems to suffer from “hero worship.” For example, we find no other source that describes Cauchy as “one of the best-known people of his time, and must have been often mentioned in newspapers, letters and memoirs.” Still, its basic facts are correct. For accounts of Cauchy’s work and its importance, we recommend [Grabiner 2005] and [Grattan-Guinness 2005] as good places to begin. See also [Grattan-Guinness 1990] for a comprehensive account of the French mathematical community in the time of Cauchy. Grattan-Guinness first presents his case that Cauchy “plagiarized” Bolzano in [Grattan-Guinness 1970a]. This assertion precipitated a controversy that raged through [Grattan-Guinness 1970b], [Freudenthal 1971b], and still echoed in [Grabiner 2005]. Other modern contributions to Cauchy scholarship are more numerous than we wish to describe, but we will mention in particular [Jahnke 2003], [Lützen 2003], [Ferraro 2008] and [Bottazzini 1990]. Starting with these references, the interested reader can find a great many more.

As we translated the Cours d’analyse, we laid out the text and formulas, used italics, bold face and punctuation, and, as much as possible, adopted the styles of the 1897 edition of the text. We have also added an index (neither the 1821 nor the 1897 editions have indices), and we have used our footnotes to note passages that are quoted, cited or translated in certain important secondary sources. We have not made note of errors cited in the Errata of the 1821 edition, all of which were corrected in the 1897 edition, but we have noted errors not mentioned in the Errata, as well as new errors introduced in the second edition. We distinguish such footnotes with the signature “(tr.).” Expository footnotes are unsigned.

We believe that the primary purpose of a translation such as this one is to make the work available in English, and not to provide a platform for our opinions on how this work should be interpreted. Towards this end, we have generally limited
our commentary to expository remarks rather than interpretative ones. For those passages that are controversial and subject to a variety of interpretations, we try to refer the interested reader to appropriate entry-point sources and do not try to be comprehensive.

For a variety of reasons, we decided to follow Grabiner [Grabiner 2005], Freudenthal [DSB Cauchy] and others, rather than Kline [Kline 1990], and to make our translation, as well as to cite page numbers, from the second edition. Although electronic copies of both editions are freely available on the World Wide Web, bound copies of the 1821 edition are rather hard to find, while the second edition is found in many university libraries. The on-line library catalog WorldCat reports 57 copies of the 1821 edition in North America, and only seven copies of the facsimiles. Yet they report at least 117 copies of the 1897 edition in North America. We say “at least” because there are several different kinds of catalog entries, and it is difficult to tell how much duplication there is. We would estimate at least 200 copies. The two editions are identical in content, notation and format, but differ in pagination, page layout and some punctuation. In general, we found the typography and page layout of the 1821 edition somewhat cluttered, even quirky, particularly in the ways that formulas were cut into many lines to be arranged on the page. Weighing all these circumstances, it seemed more reasonable to follow the more accessible version.

In general, we resisted the temptation to modernize Cauchy’s notation and terminology. When he uses the word limites to mean both what we call “limits” and what we call “bounds,” we translate it as “limits” in both cases. In the index, citations of the word “limit” direct the reader to instances in which the limit process is being used, and not to instances meaning “bounds.” Moreover, when he fails to distinguish between open intervals and closed intervals, or between “less than” and “less than or equal to,” we translate it as Cauchy wrote it, and do not attempt to force upon Cauchy distinctions he himself did not make. There are two conspicuous exceptions. Cauchy wrote lx, or sometimes Lx, to denote the logarithm of x to a given base A. We modernize this to log x or Log x to avoid unnecessary confusion. Likewise, we write ln x to denote the natural logarithm of x, rather than using Cauchy’s lx. Also, Cauchy used periods at the end of the abbreviated names of trigonometric functions (such as cos. x) and denoted the tangent and arctangent functions tang. x and arc tang. x. Following modern usage, we omit the periods and use tan x and arctan x. Cauchy did not adopt Euler’s innovation of the 1770s, to write i for √−1, so we write √−1 as well.¹

Within our translation of the text, numbers in square brackets, like [116], mark where new pages begin in the 1897 edition. Thus, for example, when we find the notation [116] in the midst of the statement of the Cauchy Convergence Criterion, we know that Cauchy’s statement of that criterion appeared on pages 115 and 116

1 Many people attribute Euler’s first use of the symbol i to denote √−1 to his 1748 text, the Introductio in analysin infinitorum [Euler 1748], but readers who check Volume II, Chapter 21, § 515 will see that the quantity Euler denotes there as i is actually ln(−n), for some positive value of n, and not the imaginary unit √−1.
of the 1897 edition. We give a page concordance of the two French editions in an appendix.

Cauchy seemed to enjoy choosing his words carefully and precisely, and then once the correct words were chosen, using those very words over and over again. For example, in Chapter VII, § III, he studies the n-th roots of unity, or, as he calls them, 1 to the fractional power 1/n. He states his theorems and gives his proofs about these objects. Later in that same section, when he studies other fractional powers of 1, −1/n, m/n, and −m/n, the words in his theorems and proofs are almost identical, changing only what must be changed. We have taken care to do the same in our translation. Our ambition is, as much as the very idea of translation allows, to let Cauchy speak for himself.

We are grateful to Emili Bifet, David Burns, Larry D’Antonio, Ross Gingrich, Andy Perry, Kim Plofker, Fred Rickey, Chuck Rocca and Jeff Suzuki who, as participants in the ARITHMOS reading group, read early drafts of portions of this translation. Likewise, we are grateful to our students Shannon Abernathy, Erik Gundel, Amanda Peterson and Joseph Piraneo, who read parts of this manuscript in a history of mathematics seminar at Western Connecticut State University in the Spring of 2008. Careful proofreading and helpful suggestions by both groups have greatly improved this translation. We also acknowledge the assistance of the editorial staff at Springer, particularly Ann Kostant and Charlene Cruz Cerdas. Most importantly, we thank our wives Susan Petry and Terry Sandifer for supporting and encouraging our efforts, and for being understanding about the many long days that this project occupied.

Garden City, New York, Danbury, Connecticut, March 2009
Robert E. Bradley C. Edward Sandifer
Contents
Translators’ Preface . . . vii
Introduction . . . 1
Preliminaries . . . 5
1 On real functions. . . . 17
1.1 General considerations on functions. . . . 17
1.2 On simple functions. . . . 18
1.3 On composite functions. . . . 19
2 On infinitely small and infinitely large quantities, and on the continuity of functions. Singular values of functions in various particular cases. . . . 21
2.1 On infinitely small and infinitely large quantities. . . . 21
2.2 On the continuity of functions. . . . 26
2.3 On singular values of functions in various particular cases. . . . 32
3 On symmetric functions and alternating functions. The use of these functions for the solution of equations of the first degree in any number of unknowns. On homogeneous functions. . . . 49
3.1 On symmetric functions. . . . 49
3.2 On alternating functions. . . . 51
3.3 On homogeneous functions. . . . 56
4 Determination of integer functions, when a certain number of particular values are known. Applications. . . . 59
4.1 Research on integer functions of a single variable for which a certain number of particular values are known. . . . 59
4.2 Determination of integer functions of several variables, when a certain number of particular values are assumed to be known. . . . 64
4.3 Applications. . . . 67
5 Determination of continuous functions of a single variable that satisfy certain conditions. . . . 71
5.1 Research on a continuous function formed so that if two such functions are added or multiplied together, their sum or product is the same function of the sum or product of the same variables. . . . 71
5.2 Research on a continuous function formed so that if we multiply two such functions together and then double the product, the result equals that function of the sum of the variables added to the same function of the difference of the variables. . . . 77
6 On convergent and divergent series. Rules for the convergence of series. The summation of several convergent series. . . . 85
6.1 General considerations on series. . . . 85
6.2 On series for which all the terms are positive. . . . 90
6.3 On series which contain positive terms and negative terms. . . . 96
6.4 On series ordered according to the ascending integer powers of a single variable. . . . 102
7 On imaginary expressions and their moduli. . . . 117
7.1 General considerations on imaginary expressions. . . . 117
7.2 On the moduli of imaginary expressions and on reduced expressions. . . . 122
7.3 On the real and imaginary roots of the two quantities +1 and −1 and on their fractional powers. . . . 132
7.4 On the roots of imaginary expressions, and on their fractional and irrational powers. . . . 143
7.5 Applications of the principles established in the preceding sections. . . . 152
8 On imaginary functions and variables. . . . 159
8.1 General considerations on imaginary functions and variables. . . . 159
8.2 On infinitely small imaginary expressions and on the continuity of imaginary functions. . . . 165
8.3 On imaginary functions that are symmetric, alternating or homogeneous. . . . 167
8.4 On imaginary integer functions of one or several variables. . . . 167
8.5 Determination of continuous imaginary functions of a single variable that satisfy certain conditions. . . . 172
9 On convergent and divergent imaginary series. Summation of some convergent imaginary series. Notations used to represent imaginary functions that we find by evaluating the sum of such series. . . . 181
9.1 General considerations on imaginary series. . . . 181
9.2 On imaginary series ordered according to the ascending integer powers of a single variable. . . . 188
9.3 Notations used to represent various imaginary functions which arise from the summation of convergent series. Properties of these same functions. . . . 202
10 On real or imaginary roots of algebraic equations for which the left-hand side is a rational and integer function of one variable. The solution of equations of this kind by algebra or trigonometry. . . . 217
10.1 We can satisfy any equation for which the left-hand side is a rational and integer function of the variable x by real or imaginary values of that variable. Decomposition of polynomials into factors of the first and second degree. Geometric representation of real factors of the second degree. . . . 217
10.2 Algebraic or trigonometric solution of binomial equations and of some trinomial equations. The theorems of de Moivre and of Cotes. . . . 229
10.3 Algebraic or trigonometric solution of equations of the third and fourth degree. . . . 233
11 Decomposition of rational fractions. . . . 241
11.1 Decomposition of a rational fraction into two other fractions of the same kind. . . . 241
11.2 Decomposition of a rational fraction for which the denominator is the product of several unequal factors into simple fractions which have for their respective denominators these same linear factors and have constant numerators. . . . 245
11.3 Decomposition of a given rational fraction into other simpler ones which have for their respective denominators the linear factors of the first rational fraction, or of the powers of these same factors, and constants as their numerators. . . . 251
12 On recurrent series. . . . 257
12.1 General considerations on recurrent series. . . . 257
12.2 Expansion of rational fractions into recurrent series. . . . 258
12.3 Summation of recurrent series and the determination of their general terms. . . . 264
Note I – On the theory of positive and negative quantities . . . 267
Note II – On formulas that result from the use of the signs > or < . . .

Preliminaries

. . . a > b or b < a. As usual, we represent the product of two quantities +a and +b by¹³

+a × +b, or simply a.b or ab,

and their quotient by

a/b or a : b.

[23] Now let m and n be two whole numbers, A an arbitrary number and a and b two arbitrary quantities, positive or negative. Then

A^m, A^(1/n) = ⁿ√A, A^(±m/n) and A^b

represent the positive quantities which we obtain by raising the number A to the powers denoted respectively by the exponents

m, 1/n, ±m/n and b,

and a^(±m) denotes the quantity, positive or negative, that arises from taking the quantity a to the power ±m. We use the notation

((a))^(1/n) = ⁿ√((a)) and ((a))^(±m/n)

to denote not only the positive and negative values, when they exist, of the powers of the quantity a raised to the exponents

1/n and ±m/n,

but also the imaginary values¹⁴ of these same powers (see Chap. VII for the meaning of imaginary expressions). It is helpful to observe that if we let A be the numerical value of a, and if we assume that the fraction m/n is in lowest terms, then the power

((a))^(m/n)

has a single positive or negative real value, namely

+A^(m/n) or −A^(m/n),

as long as m/n is a fraction with an odd denominator, but if the denominator is even, then it has [24] either the two real values just mentioned, or no real values. We could make a similar remark about the expression ((a))^(−m/n). In the particular case where the quantity a is positive and we let m/n = 1/2, the expression ((a))^(m/n) has two real values, given by formula (2) or, what amounts to the same thing, by formula (1).

The notations¹⁵

l(B), L(B), L′(B), . . .

denote the real logarithms of the number B to different bases, whereas each of the following,

l((b)), L((b)), L′((b)), . . . ,

denote, in addition to the real logarithm of the quantity b, when it exists, any of the imaginary logarithms of this same quantity (see Chap. IX for the meaning of imaginary logarithms). In trigonometry,

sin a, cos a, tan a, cot a, sec a, csc a, siv a and cosiv a

denote, respectively, the sine, cosine, tangent, cotangent, secant, cosecant, versine and vercosine of the arc a.¹⁶

13 In [Cauchy 1821, p. 9, Cauchy 1897, p. 22] Cauchy used a period in a.b rather than a centered dot, as we would today.
14 Cauchy does not actually define an “imaginary value,” but it is clear that it is what we get when we assign particular real values to the real quantities in an imaginary expression.
15 Here we have reproduced Cauchy’s notation for logarithm. Subsequently, we will always use more modern notation, like ln(B), log(B), Log(B).
16 We note that Cauchy uses “tang. a” for the trigonometric function as well as the inverse trigonometric function. His notations for secant and cosecant are “séc. a” and “coséc. a.” Note also his use of the obsolete trigonometric functions versed sine and versed cosine. He will later also use the obsolete function chord on p. 45; [Cauchy 1821, p. 63, Cauchy 1897, p. 66].

The notations
arcsin((a)), arccos((a)), arctan((a)), arccot((a)), arcsec((a)), arccsc((a))

indicate some one of the arcs which have the quantity a as their sine, cosine, tangent, cotangent, secant or cosecant. We use the simple notations

arcsin(a), arccos(a), arctan(a), arccot(a), arcsec(a), arccsc(a),

[25] or we may suppress the parentheses and write

arcsin a, arccos a, arctan a, arccot a, arcsec a, arccsc a

when, from among the arcs for which a trigonometric function is equal to a,¹⁷ we wish to designate the one with smallest numerical value, or, if there are two such arcs with opposite signs, the one with the positive value. Consequently,

arcsin a, arctan a, arccot a, arccsc a

denote positive or negative arcs between the limits −π/2 and +π/2, where π denotes the semiperimeter of the unit circle, whereas

arccos a and arcsec a

denote positive arcs between 0 and π. By virtue of the conventions that we have just established, if we denote by k an arbitrary positive integer, we obviously have, for arbitrary positive or negative values of the quantity a,

(3)   arcsin((a)) = π/2 ± (π/2 − arcsin a) ± 2kπ,
      arccos((a)) = ± arccos a ± 2kπ,
      arctan((a)) = arctan a ± kπ,
      arccos a + arcsin a = π/2  and  arccsc a + arcsec a = π/2.

Furthermore, we find that, for positive values of a,

(4)   arccot a + arctan a = π/2,

[26] and for negative values of a,

(5)   arccot a + arctan a = −π/2.

When a variable quantity converges towards a fixed limit, it is often useful to indicate this limit with particular notation. We do this by placing the abbreviation¹⁸ lim in front of the variable quantity in question. Sometimes, when one or several variables converge towards fixed limits, an expression containing these variables converges towards several different limits at the same time. We therefore denote an arbitrary one of these limits using the doubled parentheses following the abbreviation lim, so as to enclose the expression under consideration. Specifically, suppose that a positive or negative variable denoted by x converges towards the limit 0, and denote by A a constant number. It is easy to see that each of the expressions

lim A^x and lim sin x

has a unique value determined by the equation

lim A^x = 1 or lim sin x = 0,

whereas the expression

lim 1/x

takes two values, +∞ and −∞, and

lim sin(1/x)

admits an infinity of values between the limits −1 and +1.

We will finish these preliminaries by presenting several theorems on average quantities, the knowledge of which will [27] be extremely useful in the remainder of this work. We call an average among several given quantities a new quantity between the smallest and the largest of those under consideration. From this definition it is clear that there are an infinity of averages among several unequal quantities, and that the average among several equal quantities is equal to their common value. Given this, we will easily establish, as one can see in Note II, the following propositions:

17 Here, Cauchy writes “. . . parmi les arcs dont un ligne trigonométrique est égale à a.” This translates literally “. . . among the arcs for which a trigonometric line is equal to a.” Cauchy is treating trigonometric functions as giving lines, which have signed lengths, rather than in the modern view of giving real numbers.
18 The notation “Lim.” for limit was first used by Simon Antoine Jean L’Huilier (1750–1840) in [L’Huilier 1787, p. 31]. Cauchy wrote this as “lim.” in [Cauchy 1821, p. 13]. The period had disappeared by [Cauchy 1897, p. 26].
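A modern reader may find it helpful to see the distinction between single-valued and multiple-valued limits numerically. The short sketch below is ours, not Cauchy's: it samples a variable x = ±1/k tending to zero, with the sample sequence and the base A = 3 being our own choices for illustration.

    import math

    # x runs through +1/k and -1/k, so it converges to 0 through both signs.
    xs = [(-1) ** k / k for k in range(1, 2001)]
    A = 3.0  # an arbitrary constant number A

    # Single-valued limits: A**x approaches 1 and sin x approaches 0.
    print(A ** xs[-1], math.sin(xs[-1]))

    # Multiple values: 1/x becomes arbitrarily large of both signs,
    # and sin(1/x) keeps sweeping through the whole interval [-1, +1].
    print(min(1 / x for x in xs), max(1 / x for x in xs))
    oscillations = [math.sin(1 / x) for x in xs]
    print(min(oscillations), max(oscillations))

No finite sample proves a limiting value, of course; the experiment only illustrates why Cauchy treats lim 1/x and lim sin(1/x) as standing for whole collections of values rather than a single number.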
Theorem I.¹⁹ — Let b, b′, b″, . . . denote n quantities of the same sign, and a, a′, a″, . . . be the same number of arbitrary quantities. The fraction

(a + a′ + a″ + · · ·)/(b + b′ + b″ + · · ·)

is an average of the following quantities,

a/b, a′/b′, a″/b″, . . . .

Corollary. — If we let b = b′ = b″ = . . . = 1, it follows from the preceding theorem that the quantity

(a + a′ + a″ + · · ·)/n

is an average of the quantities a, a′, a″, . . . . This particular kind of average is called the arithmetic mean.

Theorem II. — Let A, A′, A″, . . . ; B, B′, B″, . . . be two sequences of numbers taken at will, which we suppose contain n terms each. Form from these two sequences the roots

A^(1/B), A′^(1/B′), A″^(1/B″), . . . .

[28] Then

(AA′A″ · · ·)^(1/(B+B′+B″+···))

is a new root which is an average of the other roots.

Corollary. — If we let B = B′ = B″ = . . . = 1, we find that the positive quantity

(AA′A″ · · ·)^(1/n)

is an average of A, A′, A″, . . . . This particular average is called the geometric mean.

Theorem III. — With the same hypotheses as in theorem I, and if α, α′, α″, . . . again denote quantities of the same sign, the fraction

(αa + α′a′ + α″a″ + · · ·)/(αb + α′b′ + α″b″ + · · ·)

is an average of

a/b, a′/b′, a″/b″, . . . .

Corollary. — If we suppose that b = b′ = b″ = . . . = 1, we conclude from the previous theorem that the sum aα + a′α′ + a″α″ + · · · is equivalent to the product of α + α′ + α″ + · · · with an average of the quantities a, a′, a″, . . . .

For brevity, when we wish to denote an average of [29] several quantities a, a′, a″, . . . , we use the notation

M(a, a′, a″, . . .).

Given this, the preceding theorems and their corollaries are included in the following formulas:

(6)    (a + a′ + a″ + · · ·)/(b + b′ + b″ + · · ·) = M(a/b, a′/b′, a″/b″, . . .),
(7)    (a + a′ + a″ + · · ·)/n = M(a, a′, a″, . . .),
(8)    (AA′A″ · · ·)^(1/(B+B′+B″+···)) = M(A^(1/B), A′^(1/B′), A″^(1/B″), . . .),
(9)    (AA′A″ · · ·)^(1/n) = M(A, A′, A″, . . .),
(10)   (aα + a′α′ + a″α″ + · · ·)/(bα + b′α′ + b″α″ + · · ·) = M(a/b, a′/b′, a″/b″, . . .),
(11)   aα + a′α′ + a″α″ + · · · = (α + α′ + α″ + · · ·) M(a, a′, a″, . . .).

In these formulas,

a, a′, a″, . . . ; b, b′, b″, . . . ; α, α′, α″, . . .

denote three sequences of quantities, and

A, A′, A″, . . . ; B, B′, B″, . . .

two sequences of numbers, all five sequences consisting of n terms. The second and third sequences consist of quantities of the same sign.

The notation that we have just adopted gives a way to express that a quantity is included between two given limits. In fact, any quantity between the limits a and b is an average of those two limits, and may be denoted by M(a, b). And so, for example, any positive quantity can be represented by M(0, ∞), and any negative quantity by M(−∞, 0), and any real quantity by M(−∞, +∞). When we do not want to indicate [30] any particular one of the quantities contained between the limits a and b, we double the parentheses and write M((a, b)). For example, if we suppose that the variable x converges to zero, we have²⁰

lim sin(1/x) = M((−1, +1)),

given that the expression lim sin(1/x) admits an infinity of values between the extreme values −1 and +1.

19 Cauchy gives a proof of this theorem in Note II, Theorem XII [Cauchy 1821, p. 447, Cauchy 1897, p. 368].
20 Note that here M((−1, +1)) is meant to include both endpoints, but that above, M(0, ∞) was meant to exclude the endpoint 0. Apparently, Cauchy does not see a need to distinguish between open and closed intervals.
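Formula (6), and Theorem I behind it, asserts that the combined fraction (a + a′ + a″ + · · ·)/(b + b′ + b″ + · · ·) always falls between the smallest and the largest of the ratios a/b, a′/b′, a″/b″ when the denominators share a sign. The sketch below is our own illustration, with a helper name and sample data that are not Cauchy's; it simply exercises that claim on concrete numbers.

    def lies_between_ratios(nums, dens):
        # Requires the denominators b, b', b'', ... to share a sign, as in Theorem I.
        assert all(d > 0 for d in dens) or all(d < 0 for d in dens)
        ratios = [a / b for a, b in zip(nums, dens)]
        combined = sum(nums) / sum(dens)
        return min(ratios) <= combined <= max(ratios)

    # a, a', a'' = 1, -3, 5 and b, b', b'' = 2, 4, 8: formula (6).
    print(lies_between_ratios([1, -3, 5], [2, 4, 8]))   # True
    # With b = b' = b'' = 1 this is formula (7), the arithmetic mean.
    print(lies_between_ratios([1, -3, 5], [1, 1, 1]))   # True

Cauchy proves the general statement in Note II (Theorem XII); a numerical check of this kind only tests it on sample data.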
Chapter 1
On real functions.
First Part¹
ALGEBRAIC ANALYSIS
Chapter I. ON REAL FUNCTIONS.

1 Cauchy had originally planned for the Cours d’analyse to consist of two volumes. He never wrote the second one; see p. viii.
1.1 General considerations on functions.

[31] When variable quantities are related to each other such that the value of one of the variables being given one can find the values of all the other variables, we normally consider these various quantities to be expressed by means of the one among them, which therefore takes the name the independent variable. The other quantities expressed by means of the independent variable are called functions of that variable.

When variable quantities are related to each other such that the values of some of them being given one can find all of the others, we consider these various quantities to be expressed by means of several among them, which therefore take the name independent variables. The other quantities expressed by means of the independent variables are called functions of those same variables.

The various expressions that are used in algebra and trigonometry, when they involve variables that are considered to be independent, are also functions of these same variables. And so, for example,

log(x), sin x, . . .

[32] are functions of the variable x, while
x + y, x^y, xyz, . . .

are functions of the variables x and y, or of x, y and z, . . . .

When the functions of one or several variables are directly expressed, as in the preceding examples, by means of those same variables, they are called explicit functions. But when they are given only as relations among the functions and the variables, that is to say the equations that the quantities must satisfy, as long as the equations are not solved algebraically, the functions are not expressed directly by means of the variables, then they are called implicit functions. To make them explicit, it suffices to solve, when it is possible, the equations that determine them. For example, when y is given by the equation

log(y) = x,

then it is an implicit function of x. If we let A be the base of the system of logarithms being considered, the same function made explicit by solving the given equation will be

y = A^x.

When we want to denote an explicit function of a single variable x or of several variables x, y, z, . . . , without specifying the nature of that function, we use one of the notations

f(x), F(x), φ(x), χ(x), ψ(x), ϖ(x), . . . ,
f(x, y, z, . . .), F(x, y, z, . . .), φ(x, y, z, . . .), . . . .

For a function of one variable to be completely determined, it is necessary and sufficient that from every particular value assumed by the variable, one can deduce the corresponding value of the function. Sometimes, for each value of the variable, the given function [33] takes on several values different from one another. Conforming to the conventions adopted in the preliminaries, we will usually designate these multiple values of a function by the notations in which the variables are written with doubled symbols or with doubled parentheses. Thus, for example,

arcsin((x))

indicates one of the arcs that have x as their sines, and

√((x)) = ±√x

indicates one of the two square roots of the variable x, assuming that x is positive.
1.2 On simple functions.

Among the functions of a single variable x, the ones we call simple are those that result from just one operation on that variable. There are only a few simple functions that we ordinarily consider in analysis, some of which arise from algebra and the others from trigonometry. Addition and subtraction, multiplication and division, raising to powers and extracting roots, and finally the formation of exponentials and logarithms, produce the simple functions that arise in algebra. Thus, if A denotes a constant number and if a = ±A is a constant quantity, the simple algebraic functions of the variable x are

a + x, a − x, ax, a/x, x^a, A^x and log(x).

Here we need not account for roots because we can always write them as powers. There are a great number of simple functions that arise in trigonometry. They include the simple functions of all the trigonometric lines as well as the arcs that correspond to these same lines, but [34] they all reduce to the four following functions:

sin x, cos x, arcsin x and arccos x,

and we will put the other trigonometric lines, tan x, sec x, . . ., along with the corresponding arcs, arctan x, arcsec x, . . ., among the composite functions, because these lines can always be expressed by means of the sine and the cosine. We could even, for the sake of rigor, reduce the two simple functions, sin x and cos x, to a single one because they are related to each other by the equation

sin² x + cos² x = 1,

but we use these two functions so frequently that it is useful to keep both of them among the simple functions.
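Since everything in the next section is built by combining these operations, it may help to have them in front of us as concrete objects. The sketch below is ours, not Cauchy's: it packages the simple algebraic functions as Python callables for sample constants A = 3 and a = −3 (so that a = ±A, as required).

    import math

    A = 3.0        # a constant number
    a = -A         # a constant quantity a = ±A; here we take the negative sign

    simple_functions = {
        "a + x":   lambda x: a + x,
        "a - x":   lambda x: a - x,
        "ax":      lambda x: a * x,
        "a / x":   lambda x: a / x,
        "x ** a":  lambda x: x ** a,
        "A ** x":  lambda x: A ** x,
        "log x":   lambda x: math.log(x, A),   # logarithm of x to the base A
    }

    for name, f in simple_functions.items():
        print(name, "->", f(2.0))

Composite functions then arise by chaining several of these operations together, as in the section that follows.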
1.3 On composite functions.

The functions that are given by a single variable by means of several operations are called composite functions. We will distinguish among these the functions of functions that result from several successive operations, the first operation acting upon the variable and each of the others acting on the result of the preceding operation. By virtue of these definitions,

x^x, √x, (log x)/x, . . .

are composite functions of the variable x, and

log(sin x) and log(cos x), . . .

are functions of functions, of which each is the result of two successive operations. Composite functions are distinguished from each other by the nature of the operations that produce them. It seems that we ought [35] to call by the name algebraic functions all those functions that are formed by the operations of algebra, but instead we will reserve that name particularly for those functions formed using only the first
algebraic operations, namely addition and subtraction, multiplication and division, and finally the raising to fixed powers. Those functions that involve variable exponents or logarithms we will call exponential or logarithmic functions. The algebraic functions are divided into rational functions and irrational functions. The rational functions are those in which the variable is raised only to integer powers.² In particular, an integer function³ is any polynomial function⁴ that involves only integer powers⁵ of the variable, for example

a + bx + cx² + . . . ,

and a fractional function or rational fraction is the quotient of two such polynomials. The degree of an integer function of x is the exponent of the highest power of x in that function. A function of the first degree, namely a + bx, is also called a linear function because in Geometry we use it to represent the ordinates⁶ of a straight line. Every integer or fractional function is at the same time rational, and every other kind of algebraic function is irrational. The functions produced by the operations of trigonometry are called trigonometric or circular functions.

The various names that we have just given to composite functions of just one variable apply as well to functions of several variables when these functions enjoy, with respect to each of the variables that they involve, the properties corresponding to the various names. Thus, for example, any polynomial [36] that contains nothing but integer powers of the variables x, y, z, . . . is an integer function of these variables. We call the degree of this integer function the sum of the exponents of the variables in the term where that sum is the largest. An integer function of the first degree, like

a + bx + cy + dz + . . . ,

is called a linear function.
2 Cauchy uses the term puissances entières here. This is the second appearance of the adjective entier, the first being on [Cauchy 1821, p. 9, Cauchy 1897, p. 23], where he spoke of nombres entiers. The word translates literally as “entire.” Because to Cauchy, “numbers” are positive, there was no question in the first case that he meant to exclude negative values. Here, though, when he writes of puissances entières, and because he has never actually defined the word entier, it is not clear whether these powers are allowed to be negative, or if they are strictly positive.
3 Cauchy calls this a fonction entière, which translates literally as an “entire function.” We translate it as an “integer function” to avoid confusion with the modern definition of “entire” from complex variables. Cauchy’s “integer” functions form a subset of the modern set of “entire” functions.
4 Note that Cauchy does not define “polynomial.”
5 Again, Cauchy writes puissances entières, but here it is clear from his example that he means positive integers only.
6 I.e., y-coordinates.
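The notions of integer function and degree defined above are easy to make mechanical. The following sketch is our own illustration, with helper names that are not Cauchy's: it represents an integer function by its list of coefficients, evaluates it, and reads off its degree as the exponent of the highest power actually present.

    def integer_function(coeffs):
        # a + b*x + c*x**2 + ... , with coeffs = [a, b, c, ...]
        return lambda x: sum(c * x ** k for k, c in enumerate(coeffs))

    def degree(coeffs):
        return max(k for k, c in enumerate(coeffs) if c != 0)

    f = integer_function([5, -1, 0, 2])          # the integer function 5 - x + 2x^3
    print(f(2), degree([5, -1, 0, 2]))           # 19 and 3

    # A rational fraction is a quotient of two integer functions:
    g = lambda x: integer_function([1, 1])(x) / integer_function([0, 0, 1])(x)
    print(g(2.0))                                # (1 + x)/x^2 evaluated at x = 2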
Chapter 2
On infinitely small and infinitely large quantities, and on the continuity of functions. Singular values of functions in various particular cases.
2.1 On infinitely small and infinitely large quantities.

[37] We say that a variable quantity becomes infinitely small when its numerical value decreases indefinitely in such a way as to converge towards the limit zero. It is worth remarking on this point that one ought not confuse a constant decrease with an indefinite decrease. The area of a regular polygon circumscribed about a given circle decreases constantly as the number of sides increases, but not indefinitely, because it has as its limit the area of the circle.¹ Similarly, a variable which takes as successive values only the different terms of the sequence²

2/1, 3/2, 4/3, 5/4, 6/5, . . . ,

taken to infinity, would decrease constantly, but not indefinitely because its successive values converge towards the limit 1. On the other hand, a variable which takes as successive values only the different terms of the sequence

1/4, 1/3, 1/6, 1/5, 1/8, 1/7, . . . ,

taken to infinity, does not decrease constantly, since the difference between two consecutive terms of this sequence is alternately [38] positive and negative. Nevertheless, it decreases indefinitely because its value ultimately becomes smaller than any given number.

1 The example of the polygon circumscribed about or inscribed in a circle had been the standard example illustrating the informal definition of the limit for many years. See, for example, [Chapelle 1765].
2 When Cauchy uses suite, he almost always means “sequence.” When he uses series, he may mean either “sequence” or “series.” We translate suite as “sequence,” série as “series” and progression as “progression,” because we do not want to leave the reader with the impression that Cauchy made modern distinctions that he did not actually make.
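Cauchy's point is that decreasing constantly and decreasing indefinitely are independent properties. The sketch below is our own illustration, not part of the translation; it generates finite samples of the two sequences just described and tests each property.

    # 2/1, 3/2, 4/3, ... : decreases constantly but converges to 1, not to 0.
    first = [(k + 1) / k for k in range(1, 1001)]

    # 1/4, 1/3, 1/6, 1/5, 1/8, 1/7, ... : not monotone, yet it converges to 0.
    second = []
    for m in range(2, 502):
        second.extend([1 / (2 * m), 1 / (2 * m - 1)])

    def decreases_constantly(seq):
        return all(x > y for x, y in zip(seq, seq[1:]))

    print(decreases_constantly(first), first[-1])    # True, last term near 1
    print(decreases_constantly(second), second[-1])  # False, last term near 0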
We say that a variable quantity becomes infinitely large when its numerical value increases indefinitely in such a way as to converge towards the limit ∞. It is again essential to observe here that one ought not confuse a variable that increases indefinitely with a variable that increases constantly. The area of a regular polygon inscribed in a given circle increases constantly as the number of sides increases, but not indefinitely. The terms of the natural sequence of integer numbers

1, 2, 3, 4, 5, . . .

increase constantly and indefinitely.

Infinitely small and infinitely large quantities enjoy several properties that lead to the solution of important questions, which I will explain in a few words.

Let α be an infinitely small quantity, that is a variable whose numerical value decreases indefinitely. When the various integer powers of α, namely

α, α^2, α^3, . . . ,

enter into the same calculation, these various powers are called, respectively, infinitely small of the first, the second, the third order, etc. In general, we call any variable quantity infinitely small of the first order if its ratio with α converges to a finite limit different from zero as the numerical value of α diminishes.3 We call a variable quantity involving α infinitely small of the second order if its ratio with α^2 converges towards a finite limit different from zero, and so forth for higher orders. Given this, if k denotes a finite quantity different from zero and ε denotes a variable number that decreases indefinitely with the numerical value of α, the general form of infinitely small quantities of the first order is

kα or at least kα (1 ± ε).

[39] The general form of infinitely small quantities of the second order will be

kα^2 or at least kα^2 (1 ± ε),

. . . . . . . . . . . . . . . . . . . .

Finally, the general form of infinitely small quantities of order n (where n represents an integer number) will be

kα^n or at least kα^n (1 ± ε).
We may easily establish the following theorems concerning these various orders of infinitely small quantities.
3 Here, Cauchy is making the implicit assumption that α is never zero. Contemporaries, such as L'Huilier and Lacroix, were unclear on whether a variable quantity could ever exceed its limit. By explicitly using the numerical value, Cauchy evidently avoided such problems. Nevertheless, he resists allowing the variable quantity to attain its limiting value during the limiting process. See [Grabiner 2005, pp. 84–85] for further discussion.
Theorem I. — If we compare two infinitely small quantities of different orders with each other, while both converge towards the limit zero, then eventually the one of the higher order will constantly have the smaller numerical value.

Proof. — Indeed, let

kα^n (1 ± ε) and k′α^{n′} (1 ± ε′)

be two infinitely small quantities, one of order n, the other of order n′, and suppose that n′ > n. The ratio between the second of these infinitely small quantities and the first, namely

(k′/k) α^{n′−n} (1 ± ε′)/(1 ± ε),

converges indefinitely with α towards the limit zero, which cannot occur unless the numerical value of the second eventually becomes constantly less than that of the first.

Theorem II. — An infinitely small quantity of order n, that is to say of the form kα^n (1 ± ε), changes sign with α whenever n is an odd number, and for very small numerical values of α takes the same sign as the quantity k whenever n is an even number.

Proof. — Indeed, under the first hypothesis, α^n changes [40] sign with α, and under the second hypothesis, α^n is always positive. Furthermore, the sign of the product k (1 ± ε) is the same as that of k, when ε is very small.

Theorem III. — The sum of several infinitely small quantities of orders n, n′, n″, . . . (where n′, n″, . . . denote numbers larger than n) is a new infinitely small quantity of order n.

Proof. — Indeed,

kα^n (1 ± ε) + k′α^{n′} (1 ± ε′) + k″α^{n″} (1 ± ε″) + . . .
 = kα^n [ (1 ± ε) + (k′/k) α^{n′−n} (1 ± ε′) + (k″/k) α^{n″−n} (1 ± ε″) + . . . ]
 = kα^n (1 ± ε1),

where ε1 is a number which converges with α towards the limit zero.

From the principles which we have just stated, we easily deduce, as we will see, several remarkable propositions concerning polynomials ordered according to increasing powers of an infinitely small quantity α.
Theorem IV. — Any polynomial ordered according to increasing powers of α, for example

a + bα + cα^2 + . . . ,

or more generally

aα^n + bα^{n′} + cα^{n″} + . . .

(where the numbers n, n′, n″, . . . form an increasing sequence), will eventually be constantly of the same sign as its first term a or aα^n for very small numerical values of α.

Proof. — Indeed, the sum formed by the second term and those that follow is, in the first case, an infinitely small quantity of the first order, whose numerical value will eventually be smaller than the finite quantity a,4 and, in the second case, an infinitely small quantity [41] of order n′, which eventually takes a numerical value that is constantly smaller than that of the infinitely small quantity of order n.

Theorem V. — When, in the polynomial

aα^n + bα^{n′} + cα^{n″} + . . . ,

ordered according to increasing powers of α, the degree n′ of the second term is an odd number, then for very small numerical values of α, this polynomial is either greater than or less than its first term aα^n, depending on whether the variable α and the coefficient b have the same or opposite signs.

Proof. — Indeed, under the given hypothesis, the sum of the terms that follow the first, namely

bα^{n′} + cα^{n″} + . . . ,

has the same sign as each of the two products bα^{n′} and bα, for very small numerical values of α.

Theorem VI. — When, in the polynomial

aα^n + bα^{n′} + cα^{n″} + . . . ,

ordered according to increasing powers of α, the degree n′ of the second term is an even number, then for very small numerical values of α, this polynomial will eventually become constantly greater than its first term whenever b is positive, and constantly less whenever b is negative.
4 Here, Cauchy means the sum to be smaller than the numerical value of a, as a may be negative.
Proof. — Indeed, under the given hypothesis, the sum of the terms that follow the first has, for very small numerical values of α, the sign of the product bα^{n′}, and consequently the sign of b.

Corollary. — Supposing in the preceding theorem that n = 0, we get the following proposition:

Theorem VII. — If, in the polynomial5

a + bα^{n′} + cα^{n″} + . . . ,

ordered according to increasing powers of α, n′ denotes an even number, [42] then among the values of this polynomial corresponding to infinitely small values of α, the one that corresponds to α = 0, that is a, will always be the smallest whenever b is positive, and the greatest when b is negative.

This particular value of the polynomial, either larger or smaller than all of its neighboring values, is what we call a maximum or a minimum.

The properties of infinitely small quantities having been established, we deduce from them the analogous properties of infinitely large quantities by observing that any variable quantity of this last kind may be represented as 1/α, where α denotes an infinitely small quantity. Thus, for example, when, in the polynomial

ax^m + bx^{m−1} + cx^{m−2} + . . . + hx + k,

ordered according to decreasing powers of x, this variable becomes infinitely large, then substituting 1/α for x we reduce the given polynomial to

(a/α^m) [ 1 + (b/a)α + (c/a)α^2 + . . . + (h/a)α^{m−1} + (k/a)α^m ].

Thus we see immediately that for very small numerical values of α, or what amounts to the same thing, for very large numerical values of x, this polynomial has the same sign as its first term,

a/α^m = ax^m.

As this remark applies even in the case where some of the quantities b, c, . . ., h, k reduce to zero, we can state the following theorem:

Theorem VIII. — When, in a polynomial ordered according to decreasing powers of the variable x, we let the numerical value of this variable increase indefinitely, then the polynomial will eventually have the same sign as its first term.
5 In [Cauchy 1897, p. 41], there is a typographical error here, writing n where we write n″. This error was not in [Cauchy 1821, p. 32]. (tr.)
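As a modern illustration of Theorem VIII (ours, with an arbitrary example polynomial), the sketch below compares the sign of a polynomial ordered by decreasing powers of x with the sign of its first term: the two agree once the numerical value of x is large enough, though not necessarily before.

```python
# Illustration only; the coefficients are an arbitrary degree-4 example.
coeffs = [2.0, -50.0, 0.0, 300.0, -7.0]          # a, b, c, h, k for a*x^4 + ... + k
def poly(x):
    m = len(coeffs) - 1
    return sum(c * x**(m - i) for i, c in enumerate(coeffs))

for x in [5.0, 50.0, -50.0, 500.0, -500.0]:
    leading = coeffs[0] * x**(len(coeffs) - 1)
    print(x, (poly(x) > 0) == (leading > 0))     # False at x = 5, True for large |x|
```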
2.2 On the continuity of functions.

[43] Among the objects related to the study of infinitely small quantities, we ought to include ideas about the continuity and the discontinuity of functions. In view of this, let us first consider functions of a single variable.

Let f(x) be a function of the variable x, and suppose that for each value of x between two given limits, the function always takes a unique finite value. If, beginning with a value of x contained between these limits, we add to the variable x an infinitely small increment α, the function itself is incremented by the difference6

f(x + α) − f(x),

which depends both on the new variable α and on the value of x. Given this, the function f(x) is a continuous function of x between the assigned limits if, for each value of x between these limits, the numerical value of the difference f(x + α) − f(x) decreases indefinitely with the numerical value of α. In other words, the function f(x) is continuous with respect to x between the given limits if, between these limits, an infinitely small increment in the variable always produces an infinitely small increment in the function itself.7

We also say that the function f(x) is a continuous function of the variable x in a neighborhood of a particular value of the variable x whenever it is continuous between two limits of x that enclose that particular value, even if they are very close together.

Finally, whenever the function f(x) ceases to be continuous in the neighborhood of a particular value of x, we say that it becomes discontinuous, and that there is solution of continuity8 for this particular value.

[44] Having said this, it is easy to recognize the limits between which a given function of a variable x is continuous with respect to that variable. So, for example, the function sin x, which takes a unique finite value for each particular value of the variable x, is continuous between any two limits of this variable, given that the numerical value of sin(α/2), and consequently that of the difference9

sin(x + α) − sin x = 2 sin(α/2) cos(x + α/2),
6 Cauchy defines continuity only on the interior of a bounded interval, and for the whole interval, not just at a single point. See [Grabiner 2005, p. 87] for more on this point. This passage is also cited in [DSB Cauchy, p. 136].
7 [Grattan-Guinness 1970b] has suggested that Cauchy "stole" this and other ideas from Bolzano's paper of 1817. See also [Freudenthal 1971b, Jahnke 2003, p. 161, Grabiner 2005, pp. 9–12].
8 This word "solution" takes an old meaning here; it means that continuity dissolves or disappears.
9 To verify this formula, let u = x + α/2 and v = α/2, then apply the usual formula for sin(a + b) to the expression sin(u + v) − sin(u − v).
decreases indefinitely with the numerical value of α, whatever finite value is given to the variable x.10

In general, with respect to the 11 simple functions which we have considered above (Chap. I, § II), namely

a + x, a − x, ax, a/x, x^a, A^x, log(x), sin x, cos x, arcsin x, arccos x,

if we consider the question of the continuity, we find that each of these functions remains continuous between two finite limits of the variable x whenever they are always real11 between these two limits and they are never infinite on the interval. It follows that each of these functions is continuous in the neighborhood of any finite value given to the variable x if that finite value is contained:12

For the functions a + x, a − x, ax, A^x, sin x and cos x, between the limits x = −∞ and x = +∞;

For the function a/x, first, between the limits x = −∞ and x = 0, and second, between the limits x = 0 and x = ∞;

[45] For the functions x^a and log(x), between the limits x = 0 and x = ∞; and finally
10 This proof is somewhat unsatisfying because it relies on the unspoken assumptions that sin x is continuous at zero and that cos x is bounded.
11 This is meant to rule out imaginary quantities that might arise from roots of negative numbers or complex values of logarithm functions. In modern terms, the content of this passage is that the 11 simple functions are continuous on their domains of definition.
12 Recall from the Preliminaries [Cauchy 1821, p. 9, Cauchy 1897, p. 23] that A is a number, hence positive, so there is no ambiguity about whether A^x is well defined. Because a may be negative, Cauchy avoids problems with a/x by restricting his interval of definition. His treatment of a/x makes it clear that here Cauchy considers an interval not to contain its endpoints.
For the functions arcsin x and arccos x, between the limits x = −1 and x = +1.

It is worth observing that in the case where a = ±m (where m denotes an integer number), the simple function x^a is always continuous in the neighborhood of a finite value of the variable x, as long as this value is contained:

between the limits x = −∞ and x = +∞, if a = +m;

between the limits x = −∞ and x = 0, as well as between the limits x = 0 and x = ∞, if a = −m.

Among the 11 functions that we have just cited, only two become discontinuous for a value of x contained in the interval between whose limits these functions remain real.13 The two functions in question are

a/x and x^a (when a = −m).

Both become infinite and, as a consequence, discontinuous when x = 0.

Now let f(x, y, z, . . .) be a function of several variables, x, y, z, . . ., and suppose that in the neighborhood of particular values X, Y, Z, . . . of these [46] variables, f(x, y, z, . . .) is simultaneously a continuous function of x, a continuous function of y, a continuous function of z, . . .. We prove easily that if we let α, β, γ, . . . denote infinitely small quantities, and that if we give x, y, z, . . . the values X, Y, Z, . . . or values very near to these, the difference14

f(x + α, y + β, z + γ, . . .) − f(x, y, z, . . .)

is itself infinitely small. Indeed, it is clear from the previous hypothesis that the numerical values of the differences

f(x + α, y, z, . . .) − f(x, y, z, . . .),
f(x + α, y + β, z, . . .) − f(x + α, y, z, . . .),
f(x + α, y + β, z + γ, . . .) − f(x + α, y + β, z, . . .),
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

13 Cauchy does not have the notion of the "domain" of a function. For him, even functions like 1/x or √x are always defined, but sometimes the values of those functions are infinite or complex.
14 In [Cauchy 1821, p. 38, Cauchy 1897, p. 46], this was written f(x + α, y + β, z + γ) − f(x, y, z, . . .), with no ellipses in the first term. (tr.)
decrease indefinitely with those of the quantities α, β, γ, . . ., namely the numerical value of the first difference decreases with the numerical value of α, that of the second difference with the numerical value of β, that of the third with the numerical value of γ, and so on. We must conclude that the sum of all these differences, namely

f(x + α, y + β, z + γ, . . .) − f(x, y, z, . . .),

converges towards the limit zero if α, β, γ, . . . converge to the same limit. In other words, f(x + α, y + β, z + γ, . . .) has as its limit f(x, y, z, . . .).

The proposition that we have just proven evidently remains true in the case where we have established certain relations among the variables α, β, γ, . . .. It is sufficient that these relations permit the new variables to converge all at the same time towards the limit zero.

When, in the same proposition, we replace x, y, z, . . . by [47] X, Y, Z, . . ., and x + α, y + β, z + γ, . . . by x, y, z, . . ., we obtain the following statement:

Theorem I.15 — If the variables x, y, z, . . . have for their respective limits the fixed and determined quantities X, Y, Z, . . ., and the function f(x, y, z, . . .) is continuous with respect to each of the variables x, y, z, . . . in the neighborhood of the system of particular values

x = X, y = Y, z = Z, . . . ,

then f(x, y, z, . . .) has f(X, Y, Z, . . .) as its limit.

Because in the second statement, the variables α, β, γ, . . . are replaced by x − X, y − Y, z − Z, . . ., the relations that we were able to establish in the first statement among α, β, γ, . . . may be established in the second statement among the quantities x − X, y − Y, z − Z, . . ..16 As a result, the function f(x, y, z, . . .) has f(X, Y, Z, . . .) as its limit in the case where the variables x, y, z, . . . are subject to certain relations, as long as these relations permit them to approach indefinitely the limits X, Y, Z, . . ..

To clarify these ideas, suppose that x, y, z, . . . are functions of the same variable t, considered to be independent and continuous with respect to this variable in the neighborhood of the particular value t = T. If for convenience we let

f(x, y, z, . . .) = u,

15 As stated, this theorem is not true. See [Gelbaum 2003, p. 115 ff] for counterexamples.
16 [Cauchy 1821, p. 39, Cauchy 1897, p. 47] omitted ellipses here, writing x − X, y − Y, z − Z. (tr.)
then u is a composite function of the variable t. If X, Y, Z, . . . , U, respectively, denote the values of x, y, z, . . . , u in the case where t = T , it is clear, on the one hand, that a [48] value of t very close to T gives for u a unique and finite value. On the other hand, it is sufficient to let t converge towards the limit T for the variables x, y, z, . . . to converge towards the limits X, Y , Z, . . ., and consequently, the function u = f (x, y, z, . . .) towards the limit U = f (X,Y, Z, . . .). We prove in absolutely the same way that if we give t a value very close to T , the corresponding value of the function u is the limit towards which this function approaches indefinitely as t converges towards the given value. We must conclude that u is a continuous function of t in the neighborhood of t = T . We may therefore state the following theorem: Theorem II. — Let x, y, z, . . . denote several functions of the variable t, which are continuous with respect to this variable in the neighborhood of the particular value t = T . Furthermore, let X, Y, Z, . . . be the particular values of x, y, z, . . . corresponding to t = T . Suppose that in the neighborhood of these particular values, the function u = f (x, y, z, . . .) is simultaneously continuous with respect to x, continuous with respect to y, continuous with respect to z, . . .. Then u, considered as a function of t, is also continuous with respect to t in the neighborhood of the particular value t = T . If in the previous theorem we reduce the variable quantities x, y, z, . . . to a single variable x, we get a new theorem, which can be stated as follows: Theorem III. — Suppose that in the equation u = f (x), the variable x is a function of another variable t. Imagine further that [49] the variable x is a continuous function of t in the neighborhood of the particular value t = T , and that u is a continuous function of x in the neighborhood of the particular value x = X corresponding to t = T . The quantity u, considered as a function of t, is also continuous with respect to this variable in the neighborhood of the particular value t = T .
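A small numerical check of Theorem III may help fix the idea (it is ours, not Cauchy's; the functions chosen are arbitrary examples): when x is a continuous function of t near t = T and u = f(x) is continuous near the corresponding value of x, an infinitely small increment given to t produces an infinitely small increment in u.

```python
import math

# Illustration only: f and x_of_t are arbitrary continuous example functions.
f = lambda x: math.sqrt(1.0 + x * x)
x_of_t = lambda t: math.sin(t)
T = 0.75
for a in [0.1, 0.01, 0.001, 0.0001]:          # the increment alpha decreases indefinitely
    du = f(x_of_t(T + a)) - f(x_of_t(T))
    print(a, du)                               # the increment of u decreases with alpha
```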
Suppose, for example,

u = ax and x = t^n,

where a denotes a constant quantity and n an integer number. We conclude from theorem III that between any arbitrary limits of the variable t, u = at^n is a continuous function of this variable. Similarly, if we let

u = x/y, x = sin t and y = cos t,

we conclude from theorem II that the function u = tan t is continuous with respect to t in the neighborhood of any finite value of this variable any time the value in question does not have the form

t = ±2kπ ± π/2,

where k denotes an integer number, that is to say any time that this value of t corresponds to a finite value of tan t. On the contrary, the function tan t admits solution of continuity, by becoming infinite, for each of the values of t given by the preceding formula.

Now let us suppose

u = a + x + y + z + . . . , x = bt, y = ct^2, . . . ,

[50] where a, b, c, . . . denote constant quantities. Because u is a continuous function of x, y, z, . . . between any limits of these variables, and because x, y, z, . . . are continuous functions of the variable t between arbitrary limits of t, we conclude from theorem III that the function

u = a + bt + ct^2 + . . .

is itself continuous with respect to t between arbitrary limits. As a consequence, because t = 0 gives u = a, if we make t converge towards the limit zero, then the function u converges towards the limit a and eventually takes the same sign as this limit, and this agrees with theorem IV of § I.

A remarkable property of continuous functions of a single variable is that they may be used in Geometry to represent the ordinates of straight or curved continuous lines. From this remark we easily deduce the following proposition:
Theorem IV.17 — If the function f (x) is continuous with respect to the variable x between the limits x = x0 and x = X, and if b denotes a quantity between f (x0 ) and f (X), we may always satisfy the equation f (x) = b by one or more real values of x contained between x0 and X. Proof. — To establish the preceding proposition, it suffices to show that the curve that has as its equation y = f (x) meets the straight line that has for its equation y=b one or more times in the interval contained between the ordinates that correspond to the abscissas x0 and X. Now it is evident under the given hypothesis that this is what happens. Indeed, because the function f (x) is continuous between the limits x = x0 and x = X, the curve which has y = f (x) as its equation and which passes [51] 1◦ through the point corresponding to the coordinates x0 , f (x0 ), and 2◦ through the point corresponding to the coordinates X and f (X), is continuous between these two points. Because the constant ordinate b of the straight line which has y = b as its equation is found between the ordinates f (x0 ) and f (X) of the two points being considered, the straight line necessarily will pass between these two points, which it could not do without meeting the above-mentioned curve in the interval. Furthermore, as we will do in Note III, we can prove theorem IV by a direct and purely analytic method, which also has the advantage of providing the numerical solution to the equation f (x) = b.
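The "direct and purely analytic method" of Note III proceeds by repeatedly subdividing the interval, and it is this feature that yields a numerical solution of f(x) = b. The sketch below (ours, not Cauchy's) applies the same idea in its simplest form, halving the interval at each step; the function and the target value are arbitrary examples.

```python
import math

def solve_by_bisection(f, x0, X, b, steps=60):
    # Assumes f is continuous on [x0, X] and b lies between f(x0) and f(X).
    lo, hi = (x0, X) if f(x0) <= b else (X, x0)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) <= b:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(solve_by_bisection(math.sin, 0.0, math.pi / 2, 0.5))   # approximately pi/6 = 0.5235...
```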
2.3 On singular values of functions in various particular cases.

When a function of one or several variables admits but a single value for a system of values attributed to the variables which it contains, this unique value is ordinarily deduced from the definition itself of the function. If a particular case arises in which the given definition cannot immediately give the value of the function under consideration, we seek the limit or limits towards which this function converges as the variables approach indefinitely the particular values assigned to them. If there exist one or more limits of this kind, they are regarded as the values of the function under

17 This is the Intermediate Value Theorem. Cauchy gives a rigorous proof of this theorem in Note III [Cauchy 1821, pp. 460–462, Cauchy 1897, pp. 378–380].
the given hypothesis, however many there may be. We call singular values of the proposed function those values determined as we have just described. For example, such values are those which we obtain by attributing infinite values to the variables, and also those values which correspond to the solutions of continuity.18 Research on singular values of functions is one of the most important and most delicate questions of Analysis: it offers more or less [52] difficulty depending on the nature of the functions and the number of variables which they contain.

If we first consider simple functions of a single variable, we find that it is easy to determine their singular values. These values always correspond to one of the three cases x = −∞, x = 0 or x = ∞, and are, respectively, for the functions

a + x (a arbitrary): a + (−∞) = −∞, . . . . . . . . . , a + ∞ = ∞;
a − x (a arbitrary): a − (−∞) = ∞, . . . . . . . . . , a − ∞ = −∞;
ax (a positive): a × (−∞) = −∞, . . . . . . . . . , a × ∞ = ∞;
   (a negative): a × (−∞) = ∞, . . . . . . . . . , a × ∞ = −∞;
a/x (a positive): . . . . . . . . . , a/0 = ±∞, a/∞ = 0;
    (a negative): . . . . . . . . . , a/0 = ∓∞, a/∞ = 0;
x^a (a positive): . . . . . . . . . , 0^a = 0, ∞^a = ∞;
    (a negative): . . . . . . . . . , 0^a = ∞, ∞^a = 0;
A^x (A > 1): . . . . . . . . . , A^0 = 1, A^∞ = ∞;
    (A < 1): . . . . . . . . . , A^0 = 1, A^∞ = 0;
log(x) (base > 1): . . . . . . . . . , log(0) = −∞, log(∞) = ∞;
       (base < 1): . . . . . . . . . , log(0) = ∞, log(∞) = −∞;
sin x: sin(−∞) = M((−1, +1)), . . . . . . . . . , sin(∞) = M((−1, +1));
cos x: cos(−∞) = M((−1, +1)), . . . . . . . . . , cos(∞) = M((−1, +1)).

Here, as in the preliminaries, the notation M((−1, +1)) denotes one of the average quantities between the two limits −1 and +1.

18 Recall [Cauchy 1821, p. 35, Cauchy 1897, p. 43] that a "solution of continuity" is a point where continuity dissolves, what we would call a point of discontinuity.
It is worth observing that, in the case where we suppose that a = ±m, [53] where m is an integer number, the simple function x^a always admits three singular values, namely:

when a = +m: m being even, (−∞)^m = ∞, 0^m = 0, ∞^m = ∞;
             m being odd, (−∞)^m = −∞, 0^m = 0, ∞^m = ∞;
when a = −m: m being even, (−∞)^{−m} = 0, 0^{−m} = ∞, ∞^{−m} = 0;
             m being odd, (−∞)^{−m} = 0, ((0))^{−m} = ±∞, ∞^{−m} = 0.
Now let us consider functions composed of a single variable x. Sometimes it is easy to find their singular values. Thus, for example, if we denote by k any integer number, we recognize without trouble that the composite function

tan x = sin x / cos x

has its singular values contained in the three formulas

tan((∞)) = M((−∞, ∞)), tan(2kπ ± π/2) = ±∞ and tan((−∞)) = M((−∞, ∞)),

while the singular values of the inverse function

arctan x = arcsin( x / √(1 + x^2) )

are, respectively,

arctan(−∞) = −π/2 and arctan(∞) = π/2.

Often such questions also present true difficulties. For example, we do not immediately see how to determine the singular value of the function x^x, [54] when we suppose that x = 0, or that of the function x^{1/x}, when we take x = ∞. To give an idea of the methods which lead to the solution of questions of this kind, I am going to establish here two theorems by the aid of which we can, in a great number of cases, determine the singular values which the two functions

f(x)/x and [f(x)]^{1/x}

take when we suppose that x = ∞.
Theorem I. — If the difference f(x + 1) − f(x) converges towards a certain limit k, for increasing values of x, then the fraction f(x)/x converges at the same time towards the same limit.

Proof. — First suppose that the quantity k has a finite value, and denote by ε a number as small as we wish. Because the increasing values of x make the difference f(x + 1) − f(x) converge towards the limit k, we can give the number h a value large enough that, when x is equal to or greater than h, the difference in question is always contained between the limits k − ε and k + ε. Given this, if we denote by n any integer number, each [55] of the quantities

f(h + 1) − f(h),
f(h + 2) − f(h + 1),
. . . . . . . . . . . . . . . . . . . . .,
f(h + n) − f(h + n − 1),

and consequently their arithmetic mean, namely

[f(h + n) − f(h)] / n,

is contained between the limits k − ε and k + ε. Thus we have

[f(h + n) − f(h)] / n = k + α,

where α is a quantity contained between the limits −ε and +ε. Now let

h + n = x.

The preceding equation becomes
(1) [f(x) − f(h)] / (x − h) = k + α,

and we thus conclude

f(x) = f(h) + (x − h)(k + α),

(2) f(x)/x = f(h)/x + (1 − h/x)(k + α).

Moreover, to make the value of x increase indefinitely, it suffices to make the integer number n increase indefinitely without changing the value of h. Consequently, let us suppose that in equation (2) we consider h as a constant quantity and x as a variable quantity which converges towards the limit ∞. The quantities

f(h)/x and h/x,

contained in the right-hand side, converge towards the limit zero, [56] and the right-hand side itself converges towards a limit of the form k + α, where α is always contained between −ε and +ε. Thus the ratio f(x)/x has for its limit a quantity contained between k − ε and k + ε. This conclusion remains true however small the number ε may be, and as a result the limit in question is precisely the quantity k. In other words, we have

(3) lim f(x)/x = k = lim [f(x + 1) − f(x)].

Second, let us suppose that k = ∞. Denoting by H a number however large we may wish, we can always find a number h so large that, for x equal to or greater than h, the difference f(x + 1) − f(x), which converges towards the limit ∞, becomes always greater than H. Reasoning as above, we establish the formula

[f(h + n) − f(h)] / n > H.

Now, if we set h + n = x, we find the following formula instead of equation (2),
f(x)/x > f(h)/x + H(1 − h/x),

from which we conclude that

lim f(x)/x > H

by making x converge towards the limit ∞. The limit of the ratio f(x)/x [57] is thus greater than the number H, however great it may be. This limit, larger than any assignable number, cannot be anything but positive infinity.

Finally, let us suppose that k = −∞. To reduce this last case to the preceding one, it suffices to observe that because the difference f(x + 1) − f(x) has as its limit −∞, the following

[−f(x + 1)] − [−f(x)]

has for its limit +∞. We then conclude that the limit of −f(x)/x is equal to +∞, and consequently the limit of f(x)/x equals −∞.

Corollary I. — To give an application of the preceding theorem, let us suppose that f(x) = log(x), where log is the characteristic of logarithms in a system for which the base is greater than 1. We find that

f(x + 1) − f(x) = log(x + 1) − log(x) = log(1 + 1/x),

and consequently

k = log(1 + 1/∞) = log(1) = 0.

We can thus affirm that as x grows indefinitely, the ratio log(x)/x converges towards the limit zero, and it follows that in a system for which the base is greater than 1, the logarithms of numbers grow much less rapidly than the numbers themselves.
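A quick numerical check of Theorem I and of Corollary I (ours, not part of the original) compares the difference f(x + 1) − f(x) with the ratio f(x)/x for a large value of x; both are small together when f(x) = log(x), and both grow together for a function such as x log(x).

```python
import math

# Illustration only; the functions and the sample value of x are arbitrary.
x = 1.0e6
for f, label in [(math.log, "log(x)"), (lambda t: t * math.log(t), "x log(x)")]:
    print(label, f(x + 1) - f(x), f(x) / x)
```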
Corollary II. — Suppose, on the other hand, that f(x) = A^x, [58] where A denotes a number greater than 1. We find that

f(x + 1) − f(x) = A^{x+1} − A^x = A^x (A − 1),

and consequently

k = A^∞ (A − 1) = ∞.

We can thus affirm that when x grows indefinitely, the ratio A^x/x converges towards the limit ∞, and it follows that the exponential A^x, when the number A is greater than 1, eventually grows more rapidly than the variable x.

Corollary III. — We ought to observe, moreover, that it is not necessary to use theorem I to find the value of the ratio f(x)/x corresponding to x = ∞ except in the case where the function f(x) becomes infinite along with the variable x. If this function remains finite for x = ∞, the ratio f(x)/x evidently has zero as its limit.

I pass to a theorem which serves to determine in many cases the value of [f(x)]^{1/x} for x = ∞. It consists of this:
Theorem II. — If the function f(x) is positive for very large values of x and the ratio f(x + 1)/f(x) converges towards the limit k when x grows indefinitely, then the expression [f(x)]^{1/x} converges at the same time to the same limit.

[59] Proof. — First suppose that the quantity k, necessarily positive, has a finite value, and denote by ε a number as small as we wish. Because increasing values of x make the ratio f(x + 1)/f(x) converge towards the limit k, we can give the number h a value large enough that when x is equal to or greater than h, the ratio in question is always contained between the limits k − ε and k + ε. Given this, if we denote by n any integer number, each of the quantities

f(h + 1)/f(h), f(h + 2)/f(h + 1), . . . , f(h + n)/f(h + n − 1),

and consequently their geometric mean, namely

[f(h + n)/f(h)]^{1/n},

is contained between the limits k − ε and k + ε. Thus we have

[f(h + n)/f(h)]^{1/n} = k + α,

where α is a quantity contained between the limits −ε and +ε. Now let

h + n = x.

The preceding equation becomes

(4) [f(x)/f(h)]^{1/(x−h)} = k + α,

and we thus conclude

f(x) = f(h) (k + α)^{x−h},

(5) [f(x)]^{1/x} = [f(h)]^{1/x} (k + α)^{1 − h/x}.
[60] Moreover, to make the value of x increase indefinitely, it suffices to make the integer number n increase indefinitely without changing the value of h. Consequently, let us suppose that in equation (5) we consider h as a constant quantity and x as a variable quantity which converges towards the limit ∞. The quantities

[f(h)]^{1/x} and 1 − h/x,

contained in the right-hand side, converge towards the limit 1, and the right-hand side itself converges towards a limit of the form k + α, where α is always contained between −ε and +ε. Thus the expression [f(x)]^{1/x} has for its limit a quantity contained between k − ε and k + ε. This conclusion remains true however small the number ε, and as a result the limit in question is precisely the quantity k. In other words, we have

(6) lim [f(x)]^{1/x} = k = lim f(x + 1)/f(x).

On the other hand, let us suppose that the quantity k is infinite, that is to say, because this quantity is positive, that k = ∞. Then, denoting by H a number as large as we wish, we can always find a number h so large that when x is equal to or greater than h, the ratio f(x + 1)/f(x), which converges towards the limit ∞, becomes always greater than H. Reasoning as above, we establish the formula

[f(h + n)/f(h)]^{1/n} > H.

[61] Now, if we set h + n = x, we find the following formula instead of formula (5)

[f(x)]^{1/x} > [f(h)]^{1/x} H^{1 − h/x},

from which we conclude that

lim [f(x)]^{1/x} > H,

by making x converge towards the limit ∞. The limit of the expression [f(x)]^{1/x} is thus greater than the number H, however great it may be. This limit, larger than any assignable number, cannot be anything but positive infinity.

Note. — We can easily prove equation (6) by using theorem I to find the limit towards which the logarithm

log [f(x)]^{1/x} = log[f(x)] / x

converges and then returning from logarithms to numbers.
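A similar numerical check of Theorem II (ours, with arbitrary sample functions and an arbitrary value of x) compares the ratio f(x + 1)/f(x) with the expression [f(x)]^{1/x}: the two are close to the same limit k, namely 1 when f(x) = x and 2 when f(x) = 2^x.

```python
import math

# Illustration only; x = 500 is an arbitrary large sample value.
x = 500.0
for f, label in [(lambda t: t, "f(x) = x"), (lambda t: 2.0 ** t, "f(x) = 2^x")]:
    print(label, f(x + 1) / f(x), f(x) ** (1.0 / x))
```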
Corollary I. — To give an application of theorem II, let us suppose that f(x) = x. We have

f(x + 1)/f(x) = (x + 1)/x = 1 + 1/x,

and consequently, by passing to the limits, k = 1. Then, if we make the variable x grow indefinitely, the function x^{1/x} converges towards the limit 1.

[62] Corollary II. — On the other hand, let

f(x) = ax^n + bx^{n−1} + cx^{n−2} + . . . = P,

so that P denotes a polynomial in x of degree n. We find that

f(x + 1)/f(x) = [a(1 + 1/x)^n + (b/x)(1 + 1/x)^{n−1} + (c/x^2)(1 + 1/x)^{n−2} + . . .] / [a + b/x + c/x^2 + . . .],

and, by passing to the limits,

k = a/a = 1.

Thus, if P represents any integer polynomial, then P^{1/x} has 1 as its limit.

Corollary III. — Finally let f(x) = log(x). We find that

f(x + 1)/f(x) = log(x + 1)/log(x) = [log(x) + log(1 + 1/x)] / log(x) = 1 + log(1 + 1/x)/log(x),

and passing to the limits, k = 1. Consequently, [log(x)]^{1/x} also has 1 as its limit.

Theorems I and II evidently remain true in the case where the variable x takes only integer values. Indeed, to make the proofs that we have given to these two
theorems apply in this particular case, it suffices to suppose that the quantity denoted by h in each of these proofs is a very large integer number. If in the same case we represent the successive values of the function f(x) corresponding to the various integer values of x, namely

f(1), f(2), f(3), . . . , f(n),

by

A_1, A_2, A_3, . . . , A_n,

[63] we obtain the following propositions instead of theorems I and II:

Theorem III. — If the sequence of quantities

A_1, A_2, A_3, . . . , A_n, . . .

is such that the difference between two consecutive terms of this sequence, namely A_{n+1} − A_n, converges constantly towards a fixed limit A for increasing values of n, then the ratio A_n/n converges at the same time towards the same limit.

Theorem IV. — If the sequence of numbers

A_1, A_2, A_3, . . . , A_n,

is such that the ratio between two consecutive terms, namely A_{n+1}/A_n, converges constantly towards a fixed limit A for increasing values of n, then the expression (A_n)^{1/n} converges at the same time towards the same limit.

To give an application of this last theorem, let us suppose that

A_n = 1 · 2 · 3 . . . n.

The sequence A_1, A_2, . . . becomes

1, 1 · 2, 1 · 2 · 3, . . . , 1 · 2 · 3 . . . (n − 1) n, . . . ,
and the ratio between two consecutive terms of this same series, namely

A_{n+1}/A_n = [1 · 2 · 3 . . . n (n + 1)] / [1 · 2 · 3 . . . n] = n + 1,

[64] evidently converges towards the limit ∞ for increasing values of n. Consequently, the expression

(A_n)^{1/n} = (1 · 2 · 3 . . . n)^{1/n}

converges towards the same limit. On the other hand, we find that the expression

(1 / (1 · 2 · 3 . . . n))^{1/n}

converges, for increasing values of n, towards the limit zero.

Often, with the aid of theorems I and II, we can determine the singular value of a composite function of the variable x when this variable vanishes. Thus, for example, if we wish to obtain the singular value of x^x corresponding to x = 0, it suffices to look for the limit towards which the expression (1/x)^{1/x} = 1/x^{1/x} converges for increasing values of x. This limit, by virtue of theorem II (corollary I), is equal to 1. Likewise, we conclude from theorem I (corollary I) that the function x log(x) vanishes with the variable x.

When the two terms of a fraction are infinitely small quantities, the numerical values of which decrease indefinitely with that of the variable α, the singular value of this fraction for α = 0 is sometimes finite, sometimes zero or infinite. Indeed, let us denote by k and k′ two finite constants that are not zero, and by ε and ε′ two variable numbers which converge with α towards the limit zero. Two infinitely small quantities, one of order n, the other of order n′, can be represented, respectively, by

kα^n (1 ± ε) and k′α^{n′} (1 ± ε′),

[65] and their ratio, namely

k′α^{n′} (1 ± ε′) / [kα^n (1 ± ε)] = (k′/k) α^{n′−n} (1 ± ε′)/(1 ± ε) = (k′/k) [(1 ± ε′)/(1 ± ε)] (1/α^{n−n′}),

evidently has as its limit

k′/k, if we suppose that n′ = n,
0, if we suppose that n′ > n, and
±∞, if we suppose that n′ < n.
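The factorial example of theorem IV is easy to confirm numerically; the sketch below (ours) works with logarithms to avoid overflow and shows (1 · 2 · 3 . . . n)^{1/n} growing without bound while its reciprocal tends to zero.

```python
import math

for n in [10, 100, 1000]:
    log_An = sum(math.log(i) for i in range(1, n + 1))        # log of A_n = n!
    print(n, math.exp(log_An / n), math.exp(-log_An / n))     # (n!)^(1/n) grows; (1/n!)^(1/n) -> 0
```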
Likewise, we can prove that the limit towards which the ratio of two infinitely large quantities converges when their numerical values increase indefinitely with that of a variable x can be zero, finite or infinite. But this limit has a determined sign, constantly equal to the product of the signs of the two quantities being considered.

Among the fractions for which the two terms converge with the variable α towards the limit zero, we ought to include the following19

[f(x + α) − f(x)] / α,

always attributing to the variable x a value in the neighborhood of which the function f(x) remains continuous.20 Indeed, under this hypothesis, the difference f(x + α) − f(x) is an infinitely small quantity. We might also remark that in general it is an infinitely small quantity of the first order, so that the ratio

[f(x + α) − f(x)] / α

ordinarily converges towards a finite limit different from zero as the numerical value of α diminishes. This limit is, for example,

2x, if we take f(x) = x^2, and −a/x^2, if we take f(x) = a/x.

[66] In the particular case where we suppose that x = 0, the ratio [f(x + α) − f(x)] / α reduces to

[f(α) − f(0)] / α.

Among the ratios of this last kind, we will restrict ourselves to considering the following

sin α / α.

Because it can be put into the form

19 This, and what follows over the next few pages, are as close as Cauchy gets to using the derivative in the Cours d'analyse. It highlights the fact that the book is about the foundations of calculus, and not about calculus itself. It is not until the third lesson of his Résumé [Cauchy 1823] that he takes the next step and defines the derivative as the limit of the difference quotient.
20 Note that Cauchy does not seem to consider the necessary and sufficient conditions for a function f(x) to be differentiable at a point x.
sin(−α) / (−α),

its limit will remain the same, whatever the sign of α may be. Given this, suppose that the arc α takes a very small positive value. Because the chord21 of the double arc 2α is represented by 2 sin α, we evidently have

2α > 2 sin α, and as a consequence, α > sin α.

Moreover, the sum of the tangents taken at the endpoints of the arc 2α is represented by 2 tan α, and, by forming a portion of a polygon which encloses this arc, we now have

2 tan α > 2α,22 and consequently tan α > α.

By combining the two formulas which we have just established, we find that23

sin α < α < tan α,

then by replacing tan α with its value,

sin α < α and α < sin α / cos α.

[67] Now, when α decreases, cos α converges towards the limit 1. Thus, a fortiori the ratio sin α / α is always contained between 1 and cos α, and consequently we have24

(7) lim (sin α / α) = 1.

Because the study of the limits towards which the ratios [f(x + α) − f(x)] / α and [f(α) − f(0)] / α converge is one of the principal objects of the infinitesimal Calculus, there is no need to dwell any further on this.

21 The chord is an obsolete trigonometric function; see p. 10 or [Cauchy 1821, p. 11, Cauchy 1897, p. 24] for others. The chord of x is 2 sin(x/2).
22 Following Lagrange, Cauchy does not supply diagrams in his text. Presumably, he expected the reader to supply any diagrams necessary for following the argument.
23 [Cauchy 1897, p. 66] has "sin a" instead of "sin α." This error is not in [Cauchy 1821, p. 63]. (tr.)
24 Cauchy is using what we call the Squeeze Theorem here. He considers it evident and sees no need either to state the theorem explicitly or to prove it.
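The squeeze that establishes equation (7) is also easy to observe numerically; the following sketch (ours) verifies that sin α / α is trapped between cos α and 1 for a few small positive values of α and approaches 1.

```python
import math

for a in [0.5, 0.1, 0.01, 0.001]:
    ratio = math.sin(a) / a
    print(a, math.cos(a) < ratio < 1.0, ratio)   # the ratio is squeezed towards 1
```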
It remains for us to examine the singular values of functions of several variables. Sometimes these values are completely determined and independent of the relations which we may establish among the variables. Thus, for example, if we denote by α, β, x and y four positive variables of which the first two converge towards the limit zero and the last two towards the limit ∞, we recognize without trouble that the expressions

αβ, xy, y/β, α/x, α^y and x^y

have for their respective limits

0, ∞, ∞, 0, 0 and ∞.
But more often the singular value of a function of several variables cannot be entirely determined except in the particular case where, in making these variables converge towards their respective limits, we establish certain relations among them, and when these relations are not fixed, the singular value in question is a quantity either totally indeterminate, or only required to remain contained between known limits. Thus, as we have remarked above, the singular value to which the ratio of two infinitely small variables is reduced in the case where each of its variables vanishes can be any quantity, either finite, zero or infinite. [68] In other words, this singular value is completely indeterminate. If instead of two infinitely small variables we consider two infinitely large variables, we find that the ratio of these last ones, when their numerical values increase indefinitely, converge again towards an arbitrary limit, which may be positive or negative according to whether the two variables are of the same sign or of opposite signs. It is equally easy to assure ourselves that the product of an infinitely small variable by an infinitely large one has for its limit a quantity that is completely indeterminate.

In order to present a final application of the principles which we have just established, let us look for the values that must be attributed to variables x and y in order that the value of the function y^{1/x} become indeterminate. If A denotes a number greater than 1 and if log is the characteristic of logarithms in the system for which the base is A, we evidently have

y = A^{log(y)},

and consequently

y^{1/x} = A^{log(y)/x}.

Now, it is clear that the expression A^{log(y)/x} converges towards an indeterminate limit whenever the ratio log(y)/x itself converges towards such a limit. This may arise in two different cases, namely: 1◦ when log(y) and x are two infinitely small quantities, that is to say when x and y have for their respective limits 0 and 1; and 2◦ when log(y) and x are two infinitely large quantities, that is to say when x has an infinite limit and y has [69] 0 or ∞ as its limit. In either case, it is worth observing that the indeterminate limit of the expression

A^{log(y)/x} = y^{1/x}

is necessarily positive. It may even happen that this limit must remain contained between the extreme values of 0 and 1, or else between 1 and ∞. Suppose, for example, that each of the variables x and y converges towards the limit ∞. In this case, because the limit of the ratio log(y)/x can be any positive quantity, the limit of y^{1/x} = A^{log(y)/x} must be an average quantity between 1 and ∞. Moreover, this average is indeterminate as long as we do not establish a particular relation between the infinitely large variables x and y. But if we suppose that

y = f(x),

where f(x)25 denotes a function which increases indefinitely with the variable x, then the average value in question, which is none other than the limit of [f(x)]^{1/x}, takes a determinate value, which we can always calculate with the aid of theorem II.

If, in place of the function y^{1/x}, we consider the following

y^x,

we find that this last one becomes indeterminate: 1◦ when the variable y converges towards the limit 1 and the variable x towards −∞ or +∞, and 2◦ when the variable x has zero for its limit and y converges towards zero or positive infinity.

In calculation, we sometimes encounter singular expressions which cannot be considered except as limits towards which functions of several variables converge, as these same [70] variables become infinitely small or infinitely large, or even more generally, converge towards fixed limits. Examples of such expressions are
This was incorrectly written as f (y) in [Cauchy 1897, p. 69], but was correctly given as f (x) in [Cauchy 1821, p. 67]. (tr.)
48
2 On infinitely small and infinitely large quantities and on continuity.
0 , 0
0 × 0,
∞ × ∞,
∞ , ∞
0 × ∞,
00 ,
1∞ ,
...,
among which we ought to consider the first two as the limits towards which the product and the ratios of two infinitely small variables converge, the next two as the limits of the product and of the ratio of two infinitely large positive variables, etc. In particular, if we consider the singular expressions which the functions x + y,
xy,
x , y
yx
1
and y x
produce, we find that when the variables remain independent, the values of these same expressions can be easily determined by that which precedes. The equations which serve to determine these values are, respectively, For the functions
x+y
∞ + ∞ = ∞,
xy x y yx
y
1 x
∞ − ∞ = M((−∞, +∞));
0 × 0 = 0, 0 × ∞ = 0 × −∞ = M((−∞, +∞)), ∞ × ∞ = −∞ × −∞ = ∞, ∞ × −∞ = −∞; 0 = M((−∞, +∞)), ∞ −∞ ∞ = −∞ = M((0, ∞)),
0 0 ∞ −∞ ∞ = −∞ = 0, 0 = 0 = ±∞, −∞ ∞ −∞ = ∞ = M((−∞, 0));
00 = ∞0 = M((0, ∞)), 0−∞ = ∞∞ = ∞,
0∞ = ∞−∞ = 0, 1∞ = 1−∞ = M((0, ∞));
(
0 0 = ∞ 0 = 0 or ∞, 0 ∞ = ∞ −∞ = M((0, 1)), 1 1 1 0 −∞ = ∞ ∞ = M((1, ∞)), 1 0 = M((0, ∞)).
0
1
1
1
1
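As a modern illustration of why such expressions are indeterminate until a relation is fixed among the variables (ours, not Cauchy's), the sketch below takes the 1^∞ case with the particular relation y = 1 + c/x: as x grows, y^x approaches a different limit e^c for each constant c.

```python
import math

# Illustration only; x is an arbitrary large sample value and c ranges over sample constants.
x = 1.0e6
for c in [0.0, 1.0, 2.0, -1.0]:
    print(c, (1.0 + c / x) ** x, math.exp(c))    # the limit depends on the chosen relation
```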
Chapter 3
On symmetric functions and alternating functions. The use of these functions for the solution of equations of the first degree in any number of unknowns. On homogeneous functions.
3.1 On symmetric functions.

[71] A symmetric function of several quantities is one which conserves the same value and the same sign after any exchange made among its quantities. Thus, for example, each of the functions

x + y, x^y + y^x, xyz, sin x + sin y + sin z, . . .

is symmetric with respect to the variables which it contains, while

x − y, x^y, . . .

are not symmetric functions of the variables x and y. Likewise,

b + c, b^2 + c^2, bc, . . .

are symmetric functions of the two quantities b and c, and

b + c + d, b^2 + c^2 + d^2, bc + bd + cd and bcd

are symmetric functions of the three quantities b, c and d, etc.

Among the symmetric functions of several quantities b, c, . . ., g and h, we ought to distinguish those which serve as the coefficients of the various powers of a in the expansion of the product

(a − b)(a − c) . . . (a − g)(a − h),

and whose properties lead to a very elegant solution to several [72] equations of the first degree among n variables x, y, z, . . ., u, v, when the equations are of the form
(1)
x + y + z + . . . + u + v = k_0,
ax + by + cz + . . . + gu + hv = k_1,
a^2 x + b^2 y + c^2 z + . . . + g^2 u + h^2 v = k_2,
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ,
a^{n−1} x + b^{n−1} y + c^{n−1} z + . . . + g^{n−1} u + h^{n−1} v = k_{n−1}.

Indeed, let

A_{n−2} = − (b + c + . . . + g + h),
A_{n−3} = bc + . . . + bg + bh + . . . + cg + ch + . . . + gh,
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ,
A_0 = ± bc . . . gh

be the symmetric functions in question, so that we have

a^{n−1} + A_{n−2} a^{n−2} + . . . + A_1 a + A_0 = (a − b)(a − c)(a − d) . . . .

If, in this last formula, we replace a successively by b, by c, . . . , by g, and by h, we have

b^{n−1} + A_{n−2} b^{n−2} + . . . + A_1 b + A_0 = 0,
c^{n−1} + A_{n−2} c^{n−2} + . . . + A_1 c + A_0 = 0,
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ,
g^{n−1} + A_{n−2} g^{n−2} + . . . + A_1 g + A_0 = 0,
h^{n−1} + A_{n−2} h^{n−2} + . . . + A_1 h + A_0 = 0.

Then, if we add equations (1) term by term, after multiplying the first one by A_0, the second by A_1, . . ., the next-to-last by A_{n−2}, and the last by one, we obtain the following,

(a^{n−1} + A_{n−2} a^{n−2} + . . . + A_1 a + A_0) x = k_{n−1} + A_{n−2} k_{n−2} + . . . + A_1 k_1 + A_0 k_0,

and we conclude that

(2) x = [k_{n−1} − (b + c + . . . + g + h) k_{n−2} + (bc + . . . + bg + bh + . . . + cg + ch + . . . + gh) k_{n−3} − . . . ± bc . . . gh · k_0] / [(a − b)(a − c) . . . (a − g)(a − h)].
[73] By an analogous process, we can determine the values of the other unknowns y, z, . . ., u, v.

When we substitute for the constants

k_0, k_1, k_2, . . . , k_{n−1}

in equations (1), the successive integer powers of a particular quantity k, namely

k^0 = 1, k, k^2, . . . , k^{n−1},
the value found for x reduces to

(3) x = [(k − b)(k − c) . . . (k − g)(k − h)] / [(a − b)(a − c) . . . (a − g)(a − h)].
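A small numerical check of formula (3) (ours, with arbitrary sample quantities a, b, c, d and k, and n = 4): computing each unknown by the analogous quotient of products of differences and substituting into the first two of equations (1) recovers k^0 = 1 and k, as it should.

```python
# Illustration only; a, b, c, d and k are arbitrary sample values.
a, b, c, d = 2.0, -1.0, 3.0, 5.0
k = 0.7

def unknown(p, others):
    # formula (3) with the roles of a, b, c, d exchanged as needed
    num, den = 1.0, 1.0
    for q in others:
        num *= (k - q)
        den *= (p - q)
    return num / den

x = unknown(a, [b, c, d]); y = unknown(b, [a, c, d])
z = unknown(c, [a, b, d]); u = unknown(d, [a, b, c])
print(x, x + y + z + u)            # the sum equals k^0 = 1, the first of equations (1)
print(a*x + b*y + c*z + d*u, k)    # and the second of equations (1) gives k
```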
3.2 On alternating functions.

An alternating function of several quantities is one which changes sign, but keeps the same value next to the sign, when we interchange two of these quantities. Consequently, by a series of such exchanges, the function becomes alternatingly positive and negative. According to this definition,

x − y, xy^2 − x^2 y, log(x/y), sin x − sin y, . . .

are alternating functions of the two variables x and y,

(x − y)(x − z)(y − z)

is an alternating function of the three variables x, y and z, and so forth.

Among the alternating functions of several variables

x, y, z, . . . , u, v,

we ought to distinguish those which are rational and integer with respect to each of these same variables. Suppose that such a function [74] is expanded and put into the form of a polynomial. One of its terms, taken at random, has the form

k x^p y^q z^r . . . u^s v^t,

where p, q, r, . . ., s, t denote integer numbers and k denotes any coefficient whatsoever. Moreover, because the function ought to change sign, but keep the same value next to the sign after interchanging the variables x and y, it is necessary that there correspond to the term in question another term of contrary sign,

−k x^q y^p z^r . . . u^s v^t,

derived from the first by virtue of this exchange. Thus the function is composed of terms, alternately positive and negative, which, combined two by two, produce binomials of the form

k x^p y^q z^r . . . u^s v^t − k x^q y^p z^r . . . u^s v^t = k (x^p y^q − x^q y^p) z^r . . . u^s v^t.

In each binomial of this kind, p and q will necessarily be two integer numbers, distinct from each other. Because the difference
x^p y^q − x^q y^p

is evidently divisible by y − x, or what amounts to the same thing, by x − y, it follows that each binomial, and consequently the sum of the binomials, or the given function, is divisible by

± (y − x).

Moreover, by the reasoning above, we can substitute any two other variables x and z, or y and z, . . ., for the two variables x and y. Consequently, we definitively obtain the following conclusions:

1◦ An alternating but integer function of several variables x, y, z, . . ., u, v, is composed of terms alternately positive and negative, in each of which the various variables all have different exponents;

[75] 2◦ Such a function is divisible by the product of the differences

(1)
± (y − x), ± (z − x), . . . , ± (u − x), ± (v − x),
± (z − y), . . . , ± (u − y), ± (v − y),
. . . , ± (u − z), ± (v − z),
. . . . . . . . . ,
± (v − u),

each taken with whichever sign we please.

The product in question here, as we can easily recognize, is itself an alternating function of the variables which we are considering. To prove this, it suffices to observe that this product changes sign, but keeps the same value next to the sign, after interchanging two variables, x and y for example. But indeed, according to whether we adopt for each difference the sign + or the sign −, this product is found to be equal either to +ϕ or to −ϕ, the value of ϕ being determined by the equation

(2)
ϕ = (y − x) (z − x) . . . (u − x) (v − x) × (z − y) . . . (u − y) (v − y) × . . . × (v − u) .
Because it is evident that this value of ϕ changes only its sign by virtue of interchanging the variables x and y, we can conclude that it will be the same for a function equivalent either to +ϕ or to −ϕ. In order to fix these ideas, imagine that we take each of the differences (1) with the sign +. The product of all these differences will be the function ϕ determined by equation (2), or what amounts to the same thing, by the following (3)
ϕ = (y − x) × (z − x) (z − y) × . . . × (v − x) (v − y) (v − z) . . . (v − u) .
If additionally we let n be the number of variables x, y, z, . . ., u, v, then n − 1 is evidently the number of differences which contain a particular variable. Consequently, in each term of the function ϕ expanded and put into the form of a polynomial, the exponent of any variable [76] cannot surpass n−1. Finally, because in any particular
term, the different variables ought to have different exponents, it is clear that these exponents will be respectively equal to the numbers

0, 1, 2, 3, . . . , n − 1.
Each term, disregarding the sign and the numerical coefficient, is thus equivalent to the product of the various variables arranged in some order, and respectively raised to powers 0, 1, 2, 3, . . ., n − 1. We ought to add that each product of this kind is found only once, sometimes with the sign +, sometimes with the sign −, in the expansion of the function ϕ. For example, the product

x^0 y^1 z^2 . . . u^{n−2} v^{n−1}

cannot be formed except by the multiplication of the first letters of the binomial factors which compose the right-hand side of equation (3). With the aid of the principles that we have just established, it is easy to construct in its entirety the expansion of the function ϕ and to demonstrate its various properties (on this subject see Note IV). We are now going to show how one is led, by the consideration of such an expansion, to the solution of general equations of the first degree of several variables. Let

(4)
a_0 x + b_0 y + c_0 z + . . . + g_0 u + h_0 v = k_0,
a_1 x + b_1 y + c_1 z + . . . + g_1 u + h_1 v = k_1,
a_2 x + b_2 y + c_2 z + . . . + g_2 u + h_2 v = k_2,
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .,
a_{n−1} x + b_{n−1} y + c_{n−1} z + . . . + g_{n−1} u + h_{n−1} v = k_{n−1}

be n linear equations among the n variables or unknowns

x, y, z, . . . , u, v,

[77] and the constants

a_0, b_0, c_0, . . . , g_0, h_0, k_0,
a_1, b_1, c_1, . . . , g_1, h_1, k_1,
a_2, b_2, c_2, . . . , g_2, h_2, k_2,
. . . , . . . , . . . , . . . , . . . , . . . , . . . ,
a_{n−1}, b_{n−1}, c_{n−1}, . . . , g_{n−1}, h_{n−1}, k_{n−1},

chosen arbitrarily. Moreover, let P represent the result of replacing the variables

x, y, z, . . . , u, v

in the function ϕ by the letters

a, b, c, . . . , g, h,

considered as new quantities. Consequently we have
(5) P = (b − a) × (c − a)(c − b) × . . . × (h − a)(h − b)(h − c) . . . (h − g).
The product P is the simplest alternating function of the quantities a, b, c, . . ., g, h, and if we expand this function by algebraic multiplication of these binomial factors, each term of the expansion will be equivalent, except for the sign, to the product of these same quantites arranged in a certain order, and respectively raised to the powers 0, 1, 2, 3, . . ., n − 1. Given this, imagine that in each term we replace the exponents with letters for their indices, by writing, for example, a0 b1 c2 . . . gn−2 hn−1 in place of the term a0 b1 c2 . . . gn−2 hn−1 , and denote by D the expansion of the product P. The quantity D, just like the product P, evidently has the property of changing its sign whenever we interchange two [78] of the given letters, for example, the letters a and b. From this, it is easy to conclude that the value of D is reduced to zero if in all of its terms we write the letter b in place of the letter a without writing at the same time a in place of b. It is the same if everywhere we write one of the letters c, . . ., g, h in place of the letter a. Consequently, suppose that in the polynomial D we denote the sum of all the terms that have a0 as their common factor by A0 a0 , the sum of the terms which contain the factor a1 by A1 a1 , . . ., and finally the sum of the terms that have the factor an−1 by An−1 an−1 , so that the value of D is given by the equation (6)
D = A0 a0 + A1 a1 + A2 a2 + . . . + An−1 an−1 .
Then we find, by writing successively in the right-hand side of this equation the letters b, c, . . ., g, h in place of the letter a, 0 = A0 b0 + A1 b1 + A2 b2 + . . . + An−1 bn−1 , 0 = A0 c0 + A1 c1 + A2 c2 + . . . + An−1 cn−1 , .........................................., (7) 0 = A0 g0 + A1 g1 + A2 g2 + . . . + An−1 gn−1 , 0 = A0 h0 + A1 h1 + A2 h2 + . . . + An−1 hn−1 . Now suppose that we add equations (4) together term by term, after multiplying the first by A0 , the second by A1 , the third by A2 , . . ., the last by An−1 . In this sum, we see that the coefficients of the unknowns y, z, . . ., u, v disappear by virtue of formulas (7), and we obtain definitively the equation Dx = A0 k0 + A1 k1 + A2 k2 + . . . + An−1 kn−1 , from which we conclude (8)
x = (A0 k0 + A1 k1 + A2 k2 + . . . + An−1 kn−1) / D .
Moreover, of the two quantities D and A0 k0 + A1 k1 + A2 k2 + . . . + An−1 kn−1 , [79] the first is what arises from the expansion of the product (b − a) × (c − a) (c − b) × . . . × (h − a) (h − b) (h − c) . . . (h − g) , when we replace the exponents of the letters in this expansion with the indices, and the second is what becomes of the quantity D, equivalent to the right-hand side of formula (6), when we substitute the letter k for the letter a. Consequently, we can consider the value of x to be determined by the equation (9)
x = [(b − k) × (c − k) (c − b) × . . . × (h − k) (h − b) (h − c) . . . (h − g)] / [(b − a) × (c − a) (c − b) × . . . × (h − a) (h − b) (h − c) . . . (h − g)] ,
provided that we agree to expand the two terms of the fraction that forms the righthand side and to replace in each expansion the exponents of the letters by their indices. Taken literally, the value which equation (9) seems to give to the unknown x is not exact and not capable of being made exact without the stated modifications. This is what we call a symbolic value of this unknown. The method which has led us to the symbolic value of x furnishes equally the symbolic values of the other unknowns. To give an application of this method, suppose that we wish to solve the linear equations a0 x + b0 y + c0 z = k0 , a1 x + b1 y + c1 z = k1 , (10) a2 x + b2 y + c2 z = k2 . Under this hypothesis, we find the symbolic value of the unknown x to be,1 (b − k) (c − k) (c − b) x = (b − a) (c − a) (c − b) (11) 0 1 2 0 2 1 1 2 0 1 0 2 2 0 1 2 1 0 = k b c −k b c +k b c −k b c +k b c −k b c , a0 b1 c2 − a0 b2 c1 + a1 b2 c0 − a1 b0 c2 + a2 b0 c1 − a2 b1 c0 [80] and consequently, the true value of the unknown is2 (12)
x = (k0 b1 c2 − k0 b2 c1 + k1 b2 c0 − k1 b0 c2 + k2 b0 c1 − k2 b1 c0) / (a0 b1 c2 − a0 b2 c1 + a1 b2 c0 − a1 b0 c2 + a2 b0 c1 − a2 b1 c0) .
Note. — When, in equations (4), we replace the indices of the letters a, b, c, . . ., g, h, k by the exponents, the symbolic value of x given by equation (9) evidently 1 In [Cauchy 1897, p. 79], there are typographical errors in the second line of (11), with c written 0 in place of c0 in two instances. These errors were not present in [Cauchy 1821, p. 81]. (tr.) 2 We recognize this as Cramer’s Rule, named for Gabriel Cramer (1704–1752); see [Cramer 1750].
becomes the true value, and coincides, as we ought to expect, with that furnished by formula (3) of § 1.
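In modern terms, formula (12) is Cramer's Rule for three unknowns (see the footnote above). The following sketch, a modern illustration that is not part of Cauchy's text and uses arbitrarily chosen coefficients, checks the formula numerically in Python.

    # A minimal check of formula (12) -- Cramer's Rule for three unknowns --
    # on illustrative coefficients (not taken from the text).
    def det3(p, q, r):
        # the alternating expansion p0 q1 r2 - p0 q2 r1 + p1 q2 r0
        #                          - p1 q0 r2 + p2 q0 r1 - p2 q1 r0
        return (p[0]*q[1]*r[2] - p[0]*q[2]*r[1] + p[1]*q[2]*r[0]
                - p[1]*q[0]*r[2] + p[2]*q[0]*r[1] - p[2]*q[1]*r[0])

    a = [2.0, 1.0, -1.0]   # a0, a1, a2
    b = [1.0, 3.0,  2.0]   # b0, b1, b2
    c = [1.0, -1.0, 4.0]   # c0, c1, c2
    k = [5.0, 4.0, 11.0]   # k0, k1, k2

    D = det3(a, b, c)
    x = det3(k, b, c) / D   # formula (12)
    y = det3(a, k, c) / D   # analogous formula for y
    z = det3(a, b, k) / D   # analogous formula for z

    # each equation a_i x + b_i y + c_i z = k_i should be satisfied
    for i in range(3):
        assert abs(a[i]*x + b[i]*y + c[i]*z - k[i]) < 1e-9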
3.3 On homogeneous functions. A function of several variables x, y, z, . . . is homogeneous when changing x to tx, y to ty, z to tz, . . ., where t is a new variable independent of the others, makes this function vary in the ratio of 1 to some fixed power of t. The exponent of this power is called the degree of the homogeneous function. In other words, f (x, y, z, . . .) is a homogeneous function of degree a with respect to the variables x, y, z, . . ., if for any t, we have (1) f (tx,ty,tz, . . .) = t a f (x, y, z, . . .) . Thus, for example, x2 + xy + y2 ,
√ xy and
ln x − ln y
are three homogeneous functions of the variables x and y, the first of the second degree, the second of the first degree and the third of degree zero. An integer function of the variables x, y, z, . . . composed of terms chosen so that the sum of the exponents of the various [81] variables is the same in all the terms is evidently homogeneous. If we let t = 1x in formula (1), we conclude that y z f (x, y, z, . . .) = xa f 1, , , . . . . (2) x x This last equation establishes a property of homogeneous functions that we can state in the following manner: Whenever a function of several variables x, y, z, . . . is homogeneous, it is equivalent to a product of any one of the variables raised to a certain power by a function of the ratios of these same variables combined in pairs. We can add that this property applies exclusively to homogeneous functions. And, indeed, suppose that f (x, y, z, . . .) is equivalent to the product of xa by a function of the ratios among the variables x, y, z, . . . combined in pairs. Because we can express each of these ratios by means of those which have x for their denominators by writing, for example, in place of yz , z xy , x
it follows that the value of f (x, y, z, . . .) is given by an equation of the form f (x, y, z, . . .) = x^a ϕ(y/x, z/x, . . .). This equation remains true, whatever the values of x, y, z, . . . may be, and if we replace x by tx, y by ty, z by tz, . . ., it becomes
f (tx, ty, tz, . . .) = t^a x^a ϕ(y/x, z/x, . . .).
[82] Consequently, under the given hypothesis, we have
f (tx,ty,tz, . . .) = t a f (x, y, z, . . .) , whatever t may be. In other words, f (x, y, z, . . .) will be a homogeneous function of degree a with respect to the variables x, y, z, . . ..
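As a modern aside (not part of Cauchy's text), the defining property (1) of this section is easy to verify numerically for the three examples above; the test values below are arbitrary.

    # Numerical check of f(tx, ty) = t^a f(x, y) for the three examples.
    import math

    examples = [
        (lambda x, y: x**2 + x*y + y**2, 2),          # degree 2
        (lambda x, y: math.sqrt(x*y), 1),             # degree 1
        (lambda x, y: math.log(x) - math.log(y), 0),  # degree 0
    ]

    x, y, t = 3.0, 5.0, 1.7   # arbitrary positive test values
    for f, a in examples:
        assert abs(f(t*x, t*y) - t**a * f(x, y)) < 1e-9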
Chapter 4
Determination of integer functions, when a certain number of particular values are known. Applications.
4.1 Research on integer functions of a single variable for which a certain number of particular values are known. [83] To determine a function when a certain number of particular values are taken to be known is what we call to interpolate. When it is a matter of a function of one or two variables, this function can be considered as the ordinates of a curve or of a surface, and the problem of interpolation consists of fixing the general value of this ordinate given a certain number of particular values, that is to say, to make the curve or the surface pass through a certain number of points. This question can be solved in an infinity of ways, and in general the problem of interpolation is indeterminate. However, the indeterminacy will cease if, to the knowledge of the particular values of the desired function, we add the expressed condition that this function be integer, and of a degree such that the number of its terms becomes precisely equal to the number of particular values given. To fix these ideas, suppose that we consider first the integer functions of a single variable x. We establish easily in this regard the following propositions: Theorem I. — If an integer function of the variable x vanishes for [84] a particular value of this variable, for example for x = x0 , it is algebraically divisible by x − x0 . Theorem II. — If an integer function of the variable x vanishes for each of the values of x contained in the series x0 ,
x1 , x2 , . . ., xn−1 ,
where n denotes any integer, it will necessarily be divisible by the product (x − x0 ) (x − x1 ) (x − x2 ) . . . (x − xn−1 ) .
Now let ϕ (x) and ψ (x) be two integer functions of the variable x, both of degree n − 1, and which become equal to each other for each of the n particular values of x contained in the series x0 , x1 , x2 , . . ., xn−1 . I say that these two functions are identically equal, that is to say that we have, ψ (x) = ϕ (x) , whatever x may be. Indeed, if this equality did not occur, we would find in the difference ψ (x) = ϕ (x) , an integer polynomial for which the degree does not surpass n − 1 but which vanishes for each of the values of x mentioned above, and is still divisible by the product (x − x0 ) (x − x1 ) (x − x2 ) . . . (x − xn−1 ) , that is to say by a polynomial of degree n, which is absurd. We are assured a fortiori of the absolute equality of the two functions ϕ (x) and ψ (x) if we know that they become equal to each other for a number of values of x greater than n. We can thus state the following theorem: Theorem III. — If two integer functions of the variable x become [85] equal for a number of values of this variable greater than the degree of each of these two functions, they are identically equal, whatever x may be. We thereby deduce as a corollary this other theorem: Theorem IV. — Two integer functions of the variable x are identically equal whenever they become equal for all integer values of that variable, or even for all integer values which surpass a given limit. Indeed, in this case the number of values of x for which the two functions become equal is indefinite. It follows from theorem III that an integer function u of degree n−1 is completely determined if we know its particular values u0 ,
u1 , u2 , . . ., un−1
corresponding to the values
x0 , x1 , x2 , . . ., xn−1
of the variable x. Under this hypothesis, we look for the general value of the function u.1 If we suppose first that the particular values u0 , u1 , . . ., un−1 all reduce to zero 1
The interpolation technique that Cauchy is about to describe is known as Lagrange interpolation. See, for example, [Burden and Faires 2001, pp. 107–118].
with the exception of the first one, u0 , then the function u ought to vanish for x = x1 , for x = x2 , . . ., and finally for x = xn−1 , and it is divisible by the product (x − x1 ) (x − x2 ) . . . (x − xn−1 ) , and consequently it is of the form u = k (x − x1 ) (x − x2 ) . . . (x − xn−1 ) , where k must be a constant quantity. Moreover, because u must reduce to u0 for x = x0 , we conclude that u0 = k (x0 − x1 ) (x0 − x2 ) . . . (x0 − xn−1 ) [86] and consequently u = u0
(x − x1 ) (x − x2 ) . . . (x − xn−1 ) . (x0 − x1 ) (x0 − x2 ) . . . (x0 − xn−1 )
Likewise, if the particular values u0 , u1 , u2 , . . ., un−1 all reduce to zero with the exception of the second one, u1 , we find that (x − x0 )(x − x2 ) . . . (x − xn−1 ) u = u1 , (x1 − x0 )(x1 − x2 ) . . . (x1 − xn−1 ) .......................................... Finally, if they all reduce to zero with the exception of the last one, un−1 , we find u = un−1
(x − x0 ) (x − x1 ) . . . (x − xn−2 ) . (xn−1 − x0 ) (xn−1 − x1 ) . . . (xn−1 − xn−2 )
In adding together these various values of u corresponding to the various hypotheses that we have just made, we obtain for the sum a polynomial in x of degree n − 1 which evidently has the property that it reduces to u0 when x = x0 , to u1 when x = x1 , . . ., and to un−1 when x = xn−1 . Thus this polynomial is the general value of u which solves the given question, so that this value is found to be determined by the formula (x − x1 )(x − x2 ) . . . (x − xn−1 ) u = u0 (x 0 − x1 )(x0 − x2 ) . . . (x0 − xn−1 ) (x − x0 )(x − x2 ) . . . (x − xn−1 ) + u1 (x (1) 1 − x0 )(x1 − x2 ) . . . (x1 − xn−1 ) + .................................... (x − x0 )(x − x1 ) . . . (x − xn−2 ) + un−1 . (xn−1 − x0 )(xn−1 − x1 ) . . . (xn−1 − xn−2 )
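Formula (1) is the interpolation formula identified in the translators' footnote as Lagrange interpolation. A minimal modern sketch in Python, with arbitrary sample data and not part of Cauchy's text, is the following.

    # Lagrange interpolation, formula (1).
    def lagrange(xs, us, x):
        """Value at x of the polynomial of degree n - 1 taking the
        values us[i] at the distinct points xs[i]."""
        n = len(xs)
        total = 0.0
        for i in range(n):
            term = us[i]
            for j in range(n):
                if j != i:
                    term *= (x - xs[j]) / (xs[i] - xs[j])
            total += term
        return total

    xs = [0.0, 1.0, 2.0, 4.0]   # x0, ..., x3 (arbitrary distinct values)
    us = [1.0, 3.0, 2.0, 5.0]   # u0, ..., u3 (arbitrary values)
    for xi, ui in zip(xs, us):
        assert abs(lagrange(xs, us, xi) - ui) < 1e-9   # reproduces the data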
We could have deduced the same formula directly from the method which we employed above (Chap. III, § I) to solve linear equations of several variables in a particular case (on this subject, see Note V). Denoting by a a constant quantity, if we replace in formula (1) the function u by the function u − a, which evidently is [87] of the same degree, and the particular values of u by the particular values of u − a, we obtain the equation2
(2)
(x − x1 ) (x − x2 ) . . . (x − xn−1 ) u − a = (u0 − a) (x0 − x1 ) (x0 − x2 ) . . . (x0 − xn−1 ) (x − x0 ) (x − x2 ) . . . (x − xn−1 ) + (u1 − a) (x1 − x0 ) (x1 − x2 ) . . . (x1 − xn−1 ) + . . . . . . . . . ................................. (x − x0 ) (x − x1 ) . . . (x − xn−2 ) + (un−1 − a) , (xn−1 − x0 ) (xn−1 − x1 ) . . . (xn−1 − xn−2 )
and by comparing this equation to formula (1), we find the following (x − x1 ) (x − x2 ) . . . (x − xn−1 ) 1 = (x 0 − x1 ) (x0 − x2 ) . . . (x0 − xn−1 ) (x − x0 ) (x − x2 ) . . . (x − xn−1 ) + (3) (x1 − x0 ) (x1 − x2 ) . . . (x1 − xn−1 ) + ....................................... (x − x0 ) (x − x1 ) . . . (x − xn−2 ) + . (xn−1 − x0 ) (xn−1 − x1 ) . . . (xn−1 − xn−2 ) This last equation is an identity and remains true whatever x may be. Equations (1) and (2) can both serve to solve the problem of interpolation for integer functions, but in general it is advisable to prefer equation (2), considering that we can make one of the terms of the right-hand side disappear by taking the constant a to be equal to one of the quantities u0 ,
u1 , u2 , . . ., un−1 .
Suppose, for example, that we are trying to make a straight line pass through two given points. Denote by x0 and y0 the rectangular coordinates of the first point, by x1 and y1 the those of the second, and by y the ordinate variable of the straight line. By replacing the letter u in formula (2) by the letter y, then making n = 1 and a = y0 , we find the equation of the line to be (4)
y − y0 = (y1 − y0) (x − x0)/(x1 − x0) .
In both [Cauchy 1821, p. 91] and [Cauchy 1897, p. 87], there is a typographical error in the last line of formula (2), in which the denominator contains an x2 where it should be x1 . (tr.)
[88] On the other hand, suppose that we are trying to make a parabola whose axis is parallel to the y axis pass through three given points. Let
x0 and y0 , x1 and y1 , and x2 and y2
be the rectangular coordinates of the three points. Also, let y be the ordinate variable of the parabola. By replacing the letter u in formula (2) by the letter y, then making n = 2 and a = y0 , we find the equation of the parabola to be (x − x1 ) (x − x2 ) y − y1 = (y0 − y1 ) (x − x ) (x − x ) 0 1 0 2 (5) (x − x ) (x − x 0 1) + (y2 − y1 ) , (x2 − x0 ) (x2 − x1 ) or what amounts to the same thing, x − x1 x − x2 x − x0 (6) y − y1 = (y0 − y1 ) + (y2 − y1 ) . x2 − x0 x1 − x0 x2 − x1 When in equation (1) we take u = xm (m denoting an integer number less than n), the particular values of u represented by u0 ,
u1 , u2 , . . ., un−1
evidently reduce to
x0^m , x1^m , x2^m , . . ., xn−1^m .
Thus we have, for integer values of m which do not surpass n − 1,3
(7)
(x − x1 ) (x − x2 ) . . . (x − xn−1 ) xm = x0m (x 0 − x1 ) (x0 − x2 ) . . . (x0 − xn−1 ) (x − x0 ) (x − x2 ) . . . (x − xn−1 ) + x1m (x1 − x0 ) (x1 − x2 ) . . . (x1 − xn−1 ) + .................................... (x − x0 ) (x − x1 ) . . . (x − xn−2 ) m + xn−1 . (xn−1 − x0 ) (xn−1 − x1 ) . . . (xn−1 − xn−2 )
This last formula contains equation (3) as a particular case. Moreover, if we observe that each power of x, and in particular [89] the power xn−1 , ought necessarily to have the same coefficient on both sides of formula (7), we find: 1◦ by supposing that m < n − 1,
3
[Cauchy 1897, p. 88] has an unbalanced parenthesis in the last denominator of this formula. This typographical error is not in [Cauchy 1821, p. 92]. (tr.)
(8)
x0m 0= (x0 − x1 ) (x0 − x2 ) . . . (x0 − xn−1 ) x1m + (x1 − x0 ) (x1 − x2 ) . . . (x1 − xn−1 ) + . ................................ m xn−1 + ; (xn−1 − x0 ) (xn−1 − x1 ) . . . (xn−1 − xn−2 ) 2◦ By supposing that m = n − 1,
(9)
x0n−1 1 = (x0 − x1 ) (x0 − x2 ) . . . (x0 − xn−1 ) x1n−1 + (x1 − x0 ) (x1 − x2 ) . . . (x1 − xn−1 ) + ................................. (xn−1 )n−1 + . (xn−1 − x0 ) (xn−1 − x1 ) . . . (xn−1 − xn−2 )
It is worth remarking that formula (8) remains true in the case where we suppose that m = 0 and then it becomes4 1 0= (x − x ) (x − x 0 1 0 2 ) . . . (x0 − xn−1 ) 1 + (x1 − x0 ) (x1 − x2 ) . . . (x1 − xn−1 ) (10) + ................................. 1 + . (xn−1 − x0 ) (xn−1 − x1 ) . . . (xn−1 − xn−2 )
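As a modern check (not part of the text), formula (10) is easy to verify numerically for arbitrarily chosen distinct values x0 , . . ., xn−1 .

    # The sum of 1 / [(x_i - x_0)...(x_i - x_{n-1})], omitting the factor
    # (x_i - x_i), vanishes whenever n >= 2; illustrative values only.
    xs = [0.3, 1.1, 2.5, 4.0, 5.7]   # arbitrary distinct values, n = 5

    total = 0.0
    for i, xi in enumerate(xs):
        prod = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                prod *= (xi - xj)
        total += 1.0 / prod

    assert abs(total) < 1e-12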
4.2 Determination of integer functions of several variables, when a certain number of particular values are assumed to be known. The methods by which we determine functions of one variable when a certain number of particular values are assumed to be [90] known can be easily extended, as we are going to see, to functions of several variables. To fix these ideas, let us first consider functions of two variables, x and y. Let ϕ (x, y) and ψ (x, y) be two such functions, both of degree n − 1 with respect to each of the variables, and which become equal to each other whenever, by attributing to 4
This result is due to Euler [Euler 1769, vol. 2, § 1169]. See also [Sandifer 2007, pp. 133–137].
the variable x one of the particular values x0 ,
x1 , x2 , . . ., xn−1
at the same time we attribute to the variable y one of the following
y0 , y1 , y2 , . . ., yn−1 .
Then ϕ (x0 , y) and ψ (x0 , y) are two functions of the single variable y, which ought to be equal to each other for n particular values of this variable. Consequently (by virtue of theorem III, § I), these two functions are constantly equal, whatever y may be. Then we have identically ϕ (x0 , y) = ψ (x0 , y) . Likewise we find ϕ (x1 , y) = ψ (x1 , y) , ϕ (x2 , y) = ψ (x2 , y) , ....................., ϕ (xn−1 , y) = ψ (xn−1 , y) . Moreover, the left-hand sides of the preceding n equations are particular values of the function ϕ (x, y) in the case where we consider just x as the variable, and the right-hand sides represent the corresponding particular values of the function ψ (x, y). The two functions ϕ (x, y)
and ψ (x, y) ,
when we attribute to y a constant value chosen arbitrarily, thus become equal for n particular values of x, and because they are both of degree n − 1 with respect to x, it follows [91] that they remain equal, not only for any value attributed to the variable y but also for any value of x. We are assured, a fortiori, of the absolute equality of the two functions ϕ (x, y) and ψ (x, y) if we know that they become equal whenever the values of x and y are respectively taken in two series each composed of more than n different terms. Thus we can state the following proposition: Theorem I. — If two integer functions of the variables x and y become equal whenever the values of these two variables are respectively taken from two series both of which contain a number of terms greater than the highest exponents of x and y in these same functions, then they are identically equal. We thereby deduce as a corollary this other theorem: Theorem II. — Two integer functions of the variables x and y are identically equal whenever they become equal for all integer values of these variables, or even for all integer values which surpass a given limit.
Indeed, in this case the number of values of x and y for which the two functions become equal is indefinite. It follows from theorem I that, if we suppose that the function ϕ (x, y) is integer and of degree n − 1 with respect to each of the variables x and y, this function is completely determined when we know the particular values which it receives when, in taking for the values of x one of the quantities x0 ,
x1 , x2 , . . ., xn−1 ,
we take at the same time for the value of y one of the following
y0 , y1 , y2 , . . ., yn−1 .
Under the same hypothesis, the general value of the function can [92] be easily deduced from formula (1) of the preceding section.5 Indeed, if we replace u by ϕ (x, y) in this formula, we get
(1)
(x − x1 ) (x − x2 ) . . . (x − xn−1 ) ϕ (x0 , y) ϕ (x, y) = (x0 − x1 ) (x0 − x2 ) . . . (x0 − xn−1 ) (x − x0 ) (x − x2 ) . . . (x − xn−1 ) + ϕ (x1 , y) (x1 − x0 ) (x1 − x2 ) . . . (x1 − xn−1 ) + .......................................... (x − x0 ) (x − x1 ) . . . (x − xn−2 ) + ϕ (xn−1 , y) , (xn−1 − x0 ) (xn−1 − x1 ) . . . (xn−1 − xn−2 )
and we have, moreover, denoting by m one of the integer numbers 1, 2, 3, . . ., n − 1,
(2)
(y − y1 ) (y − y2 ) . . . (y − yn−1 ) ϕ (xm , y0 ) ϕ (xm , y) = (y 0 − y1 ) (y0 − y2 ) . . . (y0 − yn−1 ) (y − y0 ) (y − y2 ) . . . (y − yn−1 ) + ϕ (xm , y1 ) (y1 − y0 ) (y1 − y2 ) . . . (y1 − yn−1 ) + .......................................... (y − y0 ) (y − y1 ) . . . (y − yn−2 ) + ϕ (xm , yn−1 ) . (yn−1 − y0 ) (yn−1 − y1 ) . . . (yn−1 − yn−2 )
We draw the general value of ϕ (x, y) immediately from the two preceding equations. For example, by supposing that n = 2, we find
5
Cauchy used the word paragraphe, which we will consistently translate as “section.” (tr.)
(3)
ϕ (x, y) = ϕ (x0 , y0) · [(x − x1)/(x0 − x1)] · [(y − y1)/(y0 − y1)]
+ ϕ (x1 , y0) · [(x − x0)/(x1 − x0)] · [(y − y1)/(y0 − y1)]
+ ϕ (x0 , y1) · [(x − x1)/(x0 − x1)] · [(y − y0)/(y1 − y0)]
+ ϕ (x1 , y1) · [(x − x0)/(x1 − x0)] · [(y − y0)/(y1 − y0)] .
If we consider functions of three or more variables, we obtain results entirely similar to those which we have just found for functions of [93] only two variables. We find, for example, in place of theorem II the following proposition: Theorem III. — Two integer functions of several variables x, y, z, . . . are identically equal to each other whenever they become equal for all integer values of these variables, or even for all integer variables which surpass a given limit.
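In modern language, formula (3) above is bilinear interpolation on the four corner values. A brief sketch, a modern illustration assuming arbitrary sample points and values, is the following.

    # Formula (3) for n = 2: the function of degree 1 in each variable
    # determined by its values at (x0,y0), (x1,y0), (x0,y1), (x1,y1).
    def phi(x, y, x0, x1, y0, y1, f00, f10, f01, f11):
        lx0 = (x - x1) / (x0 - x1)   # Lagrange factors in x
        lx1 = (x - x0) / (x1 - x0)
        ly0 = (y - y1) / (y0 - y1)   # Lagrange factors in y
        ly1 = (y - y0) / (y1 - y0)
        return f00*lx0*ly0 + f10*lx1*ly0 + f01*lx0*ly1 + f11*lx1*ly1

    # reproduces the prescribed corner values (arbitrary numbers chosen here)
    assert abs(phi(0, 0, 0, 1, 0, 1, 2.0, 3.0, 5.0, 7.0) - 2.0) < 1e-12
    assert abs(phi(1, 1, 0, 1, 0, 1, 2.0, 3.0, 5.0, 7.0) - 7.0) < 1e-12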
4.3 Applications. To apply the principles established in the preceding sections, let us consider in particular products formed by the multiplication of successive factors for which each surpasses the following one by one, the first factor being one of the variables x, y, z, . . .. By means of these kinds of products, we seek to express the very similar product that we would obtain by taking for the first factor to be the sum of the given variables, namely x+y+z+.... If we reduce the number of variables to two, the problem at hand can be stated as follows: Problem I. — To express the product (1)
(x + y) (x + y − 1) (x + y − 2) . . . (x + y − n + 1) ,
in which n denotes any integer number, by means of the following products x (x − 1) (x − 2) . . . (x − n + 1) y (y − 1) (y − 2) . . . (y − n + 1)
and
and all such products which arise by changing the value of n. Solution. — To solve the preceding question more easily, let us first suppose that x and y are integer numbers greater than or equal to n. Then the product (1) is nothing other than the numerator [94] of the fraction that expresses the number of possible combinations of x + y letters taken n at a time. This number is precisely
(x + y) (x + y − 1) (x + y − 2) . . . (x + y − n + 1) / (1 · 2 · 3 . . . n) .
Given this, imagine that
a, b, c, . . ., p, q, r, . . .
are x + y letters, and that we divide them into two groups so that there are x letters, a, b, c, . . ., in the first group and y letters, p, q, r, . . ., in the second group. Among the combinations formed with these different letters, some contain only letters taken from the first group. The number of combinations of this kind is x (x − 1) (x − 2) . . . (x − n + 1) . 1·2·3...n Others contain n − 1 letters taken from the first group and one letter taken from the second. We easily determine the number of combinations of this second kind and we see that it is equal to x (x − 1) (x − 2) . . . (x − n + 2) y . 1 · 2 · 3 . . . (n − 1) 1 Likewise, we find that the number of combinations which contain n − 2 letters taken from the first group and two letters from the second group is x (x − 1) (x − 2) . . . (x − n + 3) y (y − 1) , 1 · 2 · 3 . . . (n − 2) 1·2 etc. Finally, the number of combinations which contain only letters taken from the second group is y (y − 1) (y − 2) . . . (y − n + 1) . 1·2·3...n The sum of the numbers of combinations of each kind ought [95] to produce the total number of combinations of x + y given letters taken n at a time. We conclude that6 (x + y) (x + y − 1) . . . (x + y − n + 1) 1·2·3...n x (x − 1) . . . (x − n + 1) x (x − 1) . . . (x − n + 2) y = + 1·2·3...n 1 · 2 · 3 . . . (n − 1) 1 (2) x (x − 1) . . . (x − n + 3) y (y − 1) + +... 1 · 2 · 3 . . . (n − 2) 1·2 x y (y − 1) . . . (y − n + 2) y (y − 1) . . . (y − n + 1) + + . 1 1 · 2 · 3 . . . (n − 1) 1·2·3...n 6
The numeral 3 was missing from the denominator in the third line of equation (2) in [Cauchy 1897, p. 95], but present in [Cauchy 1821, p. 100]. (tr.)
The preceding equation, being thus proved in the case where the variables x and y take integer values greater than n,7 remains true, by virtue of theorem II (§ II), for all values of these variables, and the value of product (1) derived from the same equation is (x + y) (x + y − 1) . . . (x + y − n + 1) = x (x − 1) . . . (x − n + 1) n + x (x − 1) . . . (x − n + 2) y 1 (3) n (n − 1) x (x − 1) . . . (x − n + 3) y (y − 1) + . . . + 1·2 n + xy (y − 1) . . . (y − n + 2) 1 +y (y − 1) . . . (y − n + 1) . Corollary I. — If we replace x by −x and y by −y, in equation (2) we obtain the following:8 (x + y) (x + y + 1) . . . (x + y + n − 1) 1·2·3...n x (x + 1) . . . (x + n − 1) x (x + 1) . . . (x + n − 2) y = + 1·2·3...n 1 · 2 · 3 . . . (n − 1) 1 (4) x (x + 1) . . . (x + n − 3) y (y + 1) + +... 1 · 2 · 3 . . . (n − 2) 1·2 x y (y + 1) . . . (y + n − 2) y (y + 1) . . . (y + n − 1) + . + 1 1 · 2 · 3 . . . (n − 1) 1·2·3...n Corollary II. — If we replace x by
x/2 and y in equation (2) [96] by y/2 , we find
(5)
(x + y) (x + y − 2) . . . (x + y − 2n + 2) 2 · 4 · 6 . . . (2n) x (x − 2) . . . (x − 2n + 2) x (x − 2) . . . (x − 2n + 4) y = + 2 · 4 · 6 . . . (2n) 2 · 4 · 6 . . . (2n − 2) 2 +............................................. x y (y − 2) . . . (y − 2n + 4) y (y − 2) . . . (y − 2n + 2) + + . 2 2 · 4 · 6 . . . (2n − 2) 2 · 4 · 6 . . . (2n)
Cauchy has modified, perhaps inadvertently, the condition “greater than or equal to” stated at the beginning of this solution. 8 We have restored parentheses to the second line of equation (4) that were missing in [Cauchy 1897, p. 95]. They had been present in [Cauchy 1821, p. 100]. (tr.)
Corollary III. — By expanding both sides of equation (2) and keeping on each side only the terms in which the sum of the exponents of the variables is equal to n, we obtain the formula xn xn−1 y (x + y)n = + 1·2·3...n 1 · 2 · 3 . . . n 1 · 2 · 3 . . . (n − 1) 1 xn−2 y2 (6) + +... 1 · 2 · 3 . . . (n − 2) 1 · 2 yn−1 yn x + . + 1 1 · 2 · 3 . . . (n − 1) 1 · 2 · 3 . . . n The value of (x + y)n taken from this last formula is precisely that given by the Newton binomial. The formulas that we have just derived can easily be extended to the case where we consider more than two variables, and the method which has brought us to the solution of problem I is equally applicable to the following question: Problem II. — With x, y, z, . . . denoting any number of variables, to express the product (x + y + z + . . .) (x + y + z + . . . − 1) (x + y + z + . . . − 2) . . . (x + y + z + . . . − n + 1) as a function of the following ones x (x − 1) (x − 2) . . . (x − n + 1) , y (y − 1) (y − 2) . . . (y − n + 1) , z (z − 1) (z − 2) . . . (z − n + 1) , .............................., and all such products which arise by changing the value of n. [97] We begin by solving the problem in the case where x, y, z, . . . denote integer numbers greater than n, and on the basis of this principle, the fraction (x + y + z + . . .) (x + y + z + . . . − 1) (x + y + z + . . . − 2) . . . (x + y + z + . . . − n + 1) 1·2·3...n is equal to the number of combinations that we can form with x + y + z + . . . letters taken n at a time. Then we pass to the case where the variables x, y, z, . . . become any quantities based on theorem III of § II. When we have thus proved the formula which solves the given question, we deduce without trouble the value of the power (x + y + z + . . .)n . We then solve the problem, indeed, by expanding both sides of the formula we found, and keeping on each side only the terms in which the combined exponents of the variables x, y, z, . . . form a sum equal to n.
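Equation (2) of this section is what is now called Vandermonde's identity for binomial coefficients. A short numerical check, a modern illustration with arbitrarily chosen integer values and not part of Cauchy's text, is the following.

    # In modern notation equation (2) reads
    #   C(x + y, n) = sum over i of C(x, n - i) * C(y, i).
    from math import comb

    x, y, n = 7, 5, 4   # illustrative integer values only
    assert comb(x + y, n) == sum(comb(x, n - i) * comb(y, i) for i in range(n + 1))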
Chapter 5
Determination of continuous functions of a single variable that satisfy certain conditions.
5.1 Research on a continuous function formed so that if two such functions are added or multiplied together, their sum or product is the same function of the sum or product of the same variables. [98] When, instead of integer functions we imagine any functions, so that we leave the form entirely arbitrary, we can no longer successfully determine them given a certain number of particular values, however large that number might be, but we can sometimes do so in the case where we assume certain general properties of these functions. For example, a continuous function of x, represented by ϕ (x), can be completely determined when it is required to satisfy, for all possible values of the variables x and y, one of the equations (1) (2)
ϕ (x + y) = ϕ (x) + ϕ (y) or ϕ (x + y) = ϕ (x) × ϕ (y) ,
as well as when, for all positive real values of the same variables, one of the following equations: (3) (4)
ϕ (xy) = ϕ (x) + ϕ (y) or ϕ (xy) = ϕ (x) × ϕ (y) .
The solution of these four equations presents four different problems, which we will treat one after another. [99] Problem I. — To determine the function ϕ (x) in such a manner that it remains continuous between any two real limits of the variable x and so that for all real values of the variables x and y, we have (1)
ϕ (x + y) = ϕ (x) + ϕ (y) .
Solution. — If in equation (1) we successively replace y by y + z, z by z + u, . . ., we get ϕ (x + y + z + u + . . .) = ϕ (x) + ϕ (y) + ϕ (z) + ϕ (u) + . . . , however many variables x, y, z, u, . . . there may be. Also, if we denote this number of variables by m and a positive constant by α, and then we make x = y = z = u = . . . = α, then the formula which we have just found becomes ϕ (mα) = mϕ (α) . To extend this last equation to the case where the integer number m is replaced by a fractional number mn , or even by an arbitrary number µ, we set, in the first case, β=
(m/n) α ,
where m and n denote integer numbers, and we conclude that nβ = mα, nϕ (β ) = mϕ (α) and ϕ (β ) = ϕ mn α = mn ϕ (α) . Then, by supposing that the fraction mn varies in such a way as to converge towards any number µ, and passing to the limit, we find that ϕ (µα) = µϕ (α) . [100] If we now take α = 1, then we have, for all positive values of µ, (5)
ϕ (µ) = µϕ (1) ,
and consequently, by making µ converge towards the limit zero, ϕ (0) = 0. Moreover, if in equation (1) we set x = µ and y = −µ, we conclude that ϕ (−µ) = ϕ (0) − ϕ (µ) = −µϕ (1) . Thus, equation (5) remains true when we change µ to −µ. In other words, we have, for any values, positive or negative, of the variable x, (6)
ϕ (x) = xϕ (1) .
It follows from formula (6) that any function ϕ (x) which remains continuous between any limits of the variable and satisfies equation (1) is necessarily of the form (7) ϕ (x) = ax, where a denotes a constant quantity. I add that the function ax enjoys the stated properties whatever the value of the constant a may be. Indeed, between any limits of the variable x, the product ax is a continuous function of that variable, and what’s more, the assumption that ϕ (x) = ax changes equation (1) into this other equation, a (x + y) = ax + ay, which is evidently always an identity. Thus formula (7) gives a solution to the proposed question, whatever value is attributed to the constant a. Because we have the ability to choose this constant arbitrarily, we call it an arbitrary constant. Problem II. — To determine the function ϕ (x) in such a manner that it remains continuous between any two real limits of the variable x and so that [101] for all real values of the variables x and y, we have (2)
ϕ (x + y) = ϕ (x) ϕ (y) .
Solution. — First, it is easy to assure ourselves that the function ϕ (x) required to satisfy equation (2) will admit only positive values. Indeed, if we make y = x in equation (2), we find that ϕ (2x) = [ϕ (x)]2 , and then, writing 12 x in place of x, we conclude that ϕ (x) = [ϕ ( 21 x)]2 . Thus the function ϕ (x) is always equal to a square, and consequently it is always positive. Given this, suppose that in equation (2) we successively replace y by y + z, z by z + u, . . .. We then get ϕ (x + y + z + u + . . .) = ϕ (x) ϕ (y) ϕ (z) ϕ (u) . . . , however many variables x, y, z, u, . . . there may be. Also, if we denote this number of variables by m, and a positive constant by α, and then we make x = y = z = u = . . . = α, then the formula we have just found becomes ϕ (mα) = [ϕ (α)]m .
To extend this last formula to the case where the integer number m is replaced by a fractional number mn , or even by an arbitrary number µ, we set, in the first case, β=
(m/n) α ,
where m and n denote two integer numbers, and we conclude that nβ = mα, n
[ϕ (β )] = [ϕ (α)]m and m ϕ (β ) = ϕ mn α = [ϕ (α)] n . [102] Then, by supposing that the fraction mn varies in such a way as to converge towards any number µ and passing to the limit, we find that ϕ (µα) = [ϕ (α)]µ . Now if we take α = 1, we have for all positive values of µ (8)
ϕ (µ) = [ϕ (1)]µ ,
and consequently, by making µ converge towards the limit zero, ϕ (0) = 1. Moreover, if in equation (2) we set x = µ and y = −µ, we conclude that ϕ (−µ) =
ϕ (0) / ϕ (µ) = [ϕ (1)]^(−µ) .
Thus, equation (8) remains true when we change µ to −µ. In other words, we have, for any values, positive or negative, of the variable x, (9)
ϕ (x) = [ϕ (1)]x .
It follows from equation (9) that any function ϕ (x) that solves the second problem is necessarily of the form (10) ϕ (x) = Ax , where A denotes a positive constant. I add that we can attribute to this constant any value between the limits 0 and ∞. Indeed, for any positive value of A, the function Ax remains continuous from x = −∞ to x = +∞, and the equation Ax+y = Ax Ay is an identity. The quantity A is thus an arbitrary constant that admits only positive values.
[103] Note. — We can get equation (9) very simply in the following manner. If we take logarithms of both sides of equation (2) in any system, we find that log ϕ (x + y) = log ϕ (x) + log ϕ (y) , and we conclude (see problem I) that log ϕ (x) = x log ϕ (1) , then, by passing again from logarithms to numbers, ϕ (x) = [ϕ (1)]x . Problem III. — To determine the function ϕ (x) in such a manner that it remains continuous between any two positive limits of the variable x and so that for all positive values of the variables x and y we have (3)
ϕ (xy) = ϕ (x) + ϕ (y) .
Solution. — It would be easy to apply a method similar to the one we used to solve the first problem to the solution of problem III. However, we will arrive more promptly at the solution we seek by putting equation (3) into a form analogous to that of equation (1), as we are going to do. If A denotes any number and log denotes the characteristic of logarithms in the system for which the base is A, then for all positive values of the variables x and y we have x = Alog x and y = Alog y , so that equation (3) becomes ϕ Alog x+log y = ϕ Alog x + ϕ Alog y . Because in this last formula the variable quantities log x and log y admit any values, positive or negative, it follows [104] that we have, for all possible real values of x and y, ϕ Ax+y = ϕ (Ax ) + ϕ (Ay ) . We conclude that [see problem I, eqn. (6)] ϕ (Ax ) = xϕ A1 = xϕ (A) , and consequently ϕ Alog x = ϕ (A) log x, or what amounts to the same thing
(11)
ϕ (x) = ϕ (A) log x.
It follows from formula (11) that every function ϕ (x) that solves problem III is necessarily of the form (12) ϕ (x) = a log(x), where a denotes a constant. Moreover, it is easy to assure ourselves: 1◦ that the constant a remains entirely arbitrary; and 2◦ that by choosing the number A suitably, which is itself arbitrary, we can reduce the constant a to one. Problem IV. — To determine the function ϕ (x) in such a manner that it remains continuous between any two positive limits of the variable x and so that for all positive values of the variables x and y we have (4)
ϕ (xy) = ϕ (x) ϕ (y) .
Solution. — It would be easy to apply a method similar to that which we used to solve the second problem to the solution of problem IV. However, we will arrive more promptly at the solution we seek if we observe that, by denoting by log the characteristic of logarithms in the system for which the base is A, we can put equation (4) into the form ϕ Alog x+log y = ϕ Alog x ϕ Alog y . Because in this last equation the variable quantities log x [105] and log y admit any values, positive or negative, it follows that we have, for all possible real values of the variables x and y, ϕ Ax+y = ϕ (Ax ) ϕ (Ay ) . We conclude that [see problem II, eqn. (9)] ϕ (Ax ) = [ϕ (A)]x and consequently ϕ Alog x = [ϕ (A)]log x = xlog ϕ(A) , or what amounts to the same thing, (13)
ϕ (x) = xlog ϕ(A) .
It follows from equation (13) that any function ϕ (x) that solves problem IV is necessarily of the form (14) ϕ (x) = xa , where a denotes a constant. Moreover it is easy to assure ourselves that this constant ought to remain entirely arbitrary.
The four values of ϕ (x) which respectively satisfy equations (1), (2), (3) and (4), namely ax, Ax , a log x and xa , have this much in common, that each of them contains an arbitrary constant, a or A. Thus we ought to conclude that there is a great difference between the questions where it is a matter of calculating the unknown values of certain quantities and the questions in which we propose to discover the unknown nature of certain functions that have given properties. Indeed, in the first case, the values of unknown quantities are ultimately expressed by means of other known and determined quantities, while in the second case the unknown functions can, as we have seen here, admit arbitrary constants into their expression.
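As a modern illustration (not part of Cauchy's text), one may verify numerically that the four forms ax, A^x , a log x and x^a do satisfy equations (1)–(4); the constants and test values below are arbitrary.

    import math

    a, A = 2.5, 3.0
    x, y = 1.3, 0.7          # positive values, so all four equations apply
    tol = 1e-9

    assert abs(a*(x + y) - (a*x + a*y)) < tol                            # (1)
    assert abs(A**(x + y) - A**x * A**y) < tol                           # (2)
    assert abs(a*math.log(x*y) - (a*math.log(x) + a*math.log(y))) < tol  # (3)
    assert abs((x*y)**a - x**a * y**a) < tol                             # (4)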
5.2 Research on a continuous function formed so that if we multiply two such functions together and then double the product, the result equals that function of the sum of the variables added to the same function of the difference of the variables. [106] In each of the problems of the preceding section, the equation to be solved contained, along with the unknown function ϕ (x), two other similar functions, namely ϕ (y) and ϕ (x + y) or ϕ (xy). Now we are going to propose a new problem of the same kind, but in which the equation of the condition that the function ϕ (x) must satisfy contains four such functions in place of three. It consists of the following: Problem. — To determine the function ϕ (x) in such a manner that it remains continuous between any two real limits of the variable x and so that for all real values of the variables x and y we have (1)
ϕ (y + x) + ϕ (y − x) = 2ϕ (x) ϕ (y) . Solution. — If we make x = 0 in equation (1), we get1 ϕ (0) = 1.
The function ϕ (x) thus reduces to 1 for the particular value x = 0, and because we suppose that it is continuous between any limits, it is clear that, in the neighborhood of this particular value, it is only very slightly different from 1, and consequently is positive. Thus, by denoting a very small number by α, we can choose this number in such a way that the function ϕ (x) remains constantly positive between the limits
1
Cauchy does not mention the trivial solutions ϕ(x) ≡ 0 and ϕ(x) ≡ 1.
x=0
and x = α.
Given this, two things could happen: either the positive value of ϕ(α) will be contained between the limits 0 and 1, or this value will be [107] greater than 1. We will examine successively these two hypotheses. Now suppose that ϕ (α) has a value contained between the limits 0 and 1. We can represent this value by the cosine of a certain arc θ contained between the limits 0 and π2 , and as a consequence we can set ϕ (α) = cos θ . Moreover, if equation (1) is put into the form ϕ (y + x) = 2ϕ (x) ϕ (y) − ϕ (y − x) , and we successively make x = α and y = α, x = α and y = 2α, x = α and y = 3α, ...... ......, then we deduce the formulas ϕ (2α) = 2 cos2 θ − 1 = cos 2θ , ϕ (3α) = 2 cos θ cos 2θ − cos θ = cos 3θ , ϕ (4α) = 2 cos θ cos 3θ − cos 2θ = cos 4θ , one after another and in general, ϕ (mα) = 2 cos θ cos (m − 1) θ − cos (m − 2) θ = cos mθ , where m denotes any integer number. I add that the formula ϕ (mα) = cos mθ remains true even if we replace the integer number m by a fraction or even by any number µ. We will prove this easily as follows. If we make x = 21 α and y = 12 α in equation (1), then we get 2 1 ϕ (0) + ϕ (α) 1 + cos θ 1 2 ϕ α = = = cos θ . 2 2 2 2 Then, by taking the positive roots of both sides and [108] observing that the two functions ϕ (x) and cos x remain positive, the first between the limits x = 0 and x = α and the second between the limits x = 0 and x = θ , we find
ϕ (α/2) = cos (θ/2) .
Likewise, if we make x = α/4 and y = α/4 in equation (1), then we get2
[ϕ (α/4)]^2 = (ϕ (0) + ϕ (α/2)) / 2 = (1 + cos (θ/2)) / 2 = cos^2 (θ/4) .
Then, by extracting the positive roots of the first and last parts, we get 1 1 ϕ α = cos θ . 4 4 By similar reasoning, we successively obtain the formulas 1 1 α = cos θ , ϕ 8 8 1 1 ϕ α = cos θ , 16 16 ......... ............, and in general ϕ
(α/2^n) = cos (θ/2^n) ,
where n denotes any integer number. If we operate on the preceding expression for ϕ 21n α to deduce that for ϕ 2mn α as we operated on the expression for ϕ (α) to deduce that for ϕ (mα), then we find m m ϕ n α = cos n θ . 2 2 Then, by supposing that the fraction 2mn varies in such a way as to approach [109] indefinitely the number µ, and passing to the limit, we obtain the equation (2)
ϕ (µα) = cos µθ .
Moreover, if we make3 x = µα
and y = 0
in formula (1), then we conclude that In [Cauchy 1897, p. 108], the numerator of the second part contains the expression ϕ 12 α in place of ϕ 12 α . This error did not appear in [Cauchy 1821, p. 116]. (tr.) 3 In [Cauchy 1897, p. 109], this reads x = µa. It is correctly written x = µα in [Cauchy 1821, p. 117]. (tr.) 2
ϕ (−µα) = [2ϕ (0) − 1] ϕ (µα) = cos µθ = cos (−µθ ) . Thus, equation (2) remains true when we replace µ by −µ. In other words, we have, for any values, positive or negative, of the variable x, ϕ (αx) = cos θ x.
(3)
If we change x to x/α in this last formula, we get4
(4)
ϕ (x) = cos ((θ/α) x) = cos (−(θ/α) x) .
The preceding value of ϕ (x) corresponds to the case where the positive quantity ϕ(α) remains contained between the limits 0 and 1. Now let us suppose that this same quantity is greater than 1. It is easy to see that under this second hypothesis we can find a positive value of r that satisfies the equation 1 1 ϕ (α) = r+ . 2 r Indeed, it suffices to take n o1 2 r = ϕ (α) + [ϕ (α)]2 − 1 . Given this, if we successively make x = α and y = α, x = α and y = 2α, x = α and y = 3α, ...... ......, in equation (1), [110] then we deduce, one after another, the formulas 2 ϕ (2α) = 21 r + 1r − 1 = 12 r2 + r12 , ϕ (3α) = 12 r + 1r r2 + r12 − 12 r + 1r = 12 r3 + r13 , ϕ (4α) = 12 r + 1r r3 + r13 − 12 r2 + r12 = 12 r4 + r14 , ......................................................, In general, 1 1 1 m−2 1 1 m−1 ϕ (mα) = r+ r + m−1 − r + m−2 2 r r 2 r
4
The trivial solution ϕ(x) ≡ 1 is included in equation (4) as the case θ = 0.
= (1/2)(r^m + 1/r^m) ,
where m denotes any integer number. I add that the formula 1 m 1 ϕ (mα) = r + m 2 r remains true even if we replace the integer number m by a fraction or even by any number µ. We will prove this easily as follows. If we make x = 21 α and y = 12 α in equation (1), we get 2 1 2 1 ϕ (0) + ϕ (α) 1 + 21 r + 1r 1 1 ϕ α = = = r 2 + r− 2 . 2 2 2 4 Then, by taking the positive roots of both sides and observing that the function ϕ (x) remains positive between the limits x = 0 and x = α, we find 1 1 1 1 ϕ α = r 2 + r− 2 . 2 2 Likewise, if we make 1 x= α 4
1 and y = α 4
in equation (1), then we get5 2 ϕ (0) + ϕ 12 α 1 ϕ α = 4 2 1 1 1 1 + 2 r 2 + r− 2 1 2 1 1 r 4 + r− 4 . = = 2 4 [111] Then, by taking the positive roots of the first and the last parts, we get 1 1 1 1 ϕ α = r 4 + r− 4 . 4 2 By similar reasoning, we successively obtain the formulas 1 1 1 1 ϕ α = r 8 + r− 8 , 8 2 1 1 1 1 ϕ α = r 16 + r− 16 , 16 2 ......... ....................., 5 The negative signs were missing from the exponents − 1 and − 1 in [Cauchy 1897, p. 110]. They 2 4 were present in [Cauchy 1821, p. 119]. (tr.)
and in general ϕ
(α/2^n) = (1/2)(r^(1/2^n) + r^(−1/2^n)) ,
where n denotes any integer number. If we operate on the preceding expression for ϕ 21n α to deduce that for ϕ 2mn α as we operated on the expression for ϕ (α) to deduce that for6 ϕ (mα), then we find ϕ
((m/2^n) α) = (1/2)(r^(m/2^n) + r^(−m/2^n)) .
Then, by supposing that the fraction 2mn varies in such a way as to approach indefinitely the number µ, and passing to the limit, we obtain the equation (5)
ϕ (µα) = (1/2)(r^µ + r^(−µ)) .
Moreover, if we make x = µα and y = 0 in formula (1), then we conclude that
ϕ (−µα) = [2ϕ (0) − 1] ϕ (µα) = (1/2)(r^(−µ) + r^µ) .
[112] Thus, equation (5) remains true when we replace µ by −µ. In other words, we have, for all values, positive or negative, of the variable x,
(6)
ϕ (αx) = (1/2)(r^x + r^(−x)) .
If we change x to x/α in this last formula, we get
(7)
ϕ (x) = (1/2)(r^(x/α) + r^(−x/α)) .
When we make ±θ/α = a in equation (4) and r^(±1/α) = A in equation (7), these equations give, respectively, the following forms:
(8)
ϕ (x) = cos ax
and
(9)
ϕ (x) = (1/2)(A^x + A^(−x)) .
Thus, if we denote a constant quantity by a and a constant number by A, then any function ϕ (x) that remains continuous between any limits of the variable and that satisfies equation (1) is necessarily contained in one of the two forms that we have just described. Moreover, it is easy to assure ourselves that the values of ϕ (x) given 6
This word “for” is the translation of the word “de,” which was present in [Cauchy 1821, p. 120], but absent from [Cauchy 1897, p. 111]. (tr.)
by equations (8) and (9) solve the proposed question, whatever values may be attributed to the quantity a and the number A. This number and this quantity are two arbitrary constants, of which one admits only positive quantities. From what we have just said, the two functions7 cos ax
and
(1/2)(A^x + A^(−x))
have the common property of satisfying equation (1), and this establishes a remarkable analogy between them. Both of these two [113] functions still reduce to one for x = 0. But one essential difference between the first and the second is that the numerical value of the first is constantly less than the limit 1, whenever it does not reach this limit, while, under the same hypothesis, the numerical value of the second is constantly above the limit 1.
7
Using modern notation, we observe that the second solution may be written as cosh ax, where a = ln A. Lambert (1728–1777) was the first to note such parallels between the trigonometric and hyperbolic functions in [Lambert 1768].
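As a modern illustration (not part of Cauchy's text), one may check numerically that both forms (8) and (9) satisfy equation (1) of this section; the constants below are arbitrary.

    # phi(y + x) + phi(y - x) = 2 phi(x) phi(y) for cos(ax) and (A^x + A^-x)/2.
    import math

    a, A = 1.7, 2.2          # arbitrary constants
    x, y = 0.6, 1.9
    tol = 1e-9

    for phi in (lambda t: math.cos(a*t),
                lambda t: 0.5*(A**t + A**(-t))):   # i.e. cosh(t * ln A)
        assert abs(phi(y + x) + phi(y - x) - 2*phi(x)*phi(y)) < tol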
Chapter 6
On convergent and divergent series. Rules for the convergence of series. The summation of several convergent series.
6.1 General considerations on series. [114]1 We call a series an indefinite sequence of quantities, u0 , u1 , u2 , u3 , . . . , which follow from one to another according to a determined law. These quantities themselves are the various terms of the series under consideration. Let sn = u0 + u1 + u2 + . . . + un−1 be the sum of the first n terms, where n denotes any integer number. If, for ever increasing values of n, the sum sn indefinitely approaches a certain limit s, the series is said to be convergent, and the limit in question is called the sum of the series. On the contrary, if the sum sn does not approach any fixed limit as n increases indefinitely, the series is divergent, and does not have a sum. In either case, the term which corresponds to the index n, that is un , is what we call the general term. For the series to be completely determined, it is enough that we give this general term as a function of the index n. One of the simplest series is the geometric progression, 1, x, x2 , x3 , . . . , which has xn for its general term, that is to say the nth power of the quantity [115] x. If we form the sum of the first n terms of this series, then we find 1 + x + x2 + . . . + xn−1 =
1/(1 − x) − x^n/(1 − x) .
1
Both [Cauchy 1821, p. ix] and [Cauchy 1897, p. 473] use the title "On convergent and divergent (real) series. . . ." in the table of contents. (tr.)
As the values of n increase, the numerical value of the fraction x^n/(1 − x) converges towards the limit zero, or increases beyond all limits, according to whether we suppose that the numerical value of x is less than or greater than 1. Under the first hypothesis, we ought to conclude that the progression
1, x, x2 , x3 , . . . is a convergent series which has 1/(1 − x) as its sum, whereas, under the second hypothesis, the same progression is a divergent series which does not have a sum. Following the principles established above, in order that the series
(1)
u0 , u1 , u2 , . . . , un , un+1 , . . .
be convergent, it is necessary and it suffices that increasing values of n make the sum sn = u0 + u1 + u2 + . . . + un−1 converge indefinitely towards a fixed limit s. In other words, it is necessary and it suffices that, for infinitely large values of the number n, the sums sn , sn+1 , sn+2 , . . . differ from the limit s, and consequently from one another, by infinitely small quantities. Moreover, the successive differences between the first sum sn and each of the following sums are determined, respectively, by the equations sn+1 − sn = un , sn+2 − sn = un + un+1 , sn+3 − sn = un + un+1 + un+2 , ......... ... ................... Hence, in order for series (1) to be convergent, it is first of all necessary [116] that the general term un decrease indefinitely as n increases. But this condition does not suffice, and it is also necessary that, for increasing values of n, the different sums, un + un+1 , un + un+1 + un+2 , .................., that is to say, the sums of as many of the quantities un , un+1 , un+2 , . . . ,
as we may wish, beginning with the first one, eventually constantly assume numerical values less than any assignable limit. Conversely, whenever these various conditions are fulfilled, the convergence of the series is guaranteed.2 Let us take, for example, the geometric progression 1, x, x2 , x3 , . . . .
(2)
If the numerical value of x is greater than 1, that of the general term xn increases indefinitely with n, and this remark alone suffices to establish the divergence of the series. The series is still divergent if we let x = ±1, because the numerical value of the general term xn , which is 1, does not decrease indefinitely for increasing values of n. However, if the numerical value of x is less than 1, then the sums of any number of terms of the series, beginning with xn , namely: xn , 1 − x2 , 1−x 1 − x3 xn + xn+1 + xn+2 = xn , 1−x .................. ... ........., xn + xn+1 = xn
are all contained between the limits x^n and x^n/(1 − x) ,
[117] each of which becomes infinitely small for infinitely large values of n. Consequently, the series is convergent, as we already knew. As a second example, let us take the numerical series (3)
1,
1/2 , 1/3 , 1/4 , . . ., 1/n , 1/(n + 1) , . . . .
1 The general term of this series, namely n+1 , decreases indefinitely as n increases. 1 Nevertheless, the series is not convergent, because the sum of the terms from n+1 1 up to 2n inclusive, namely
1/(n + 1) + 1/(n + 2) + . . . + 1/(2n − 1) + 1/(2n) ,
is always greater than the product
n · 1/(2n) = 1/2 ,
This is the Cauchy Convergence Criterion. It is still one of the few necessary and sufficient conditions for convergence of series.
whatever the value of n. As a consequence, this sum does not decrease indefinitely with increasing values of n, as would be the case if the series were convergent. Let us add that, if we denote the sum of the first n terms of series (3) by sn and the highest power of 2 bounded by n + 1 by 2m , then we have 1 1 1 + +...+ 2 3 n+1 1 1 1 1 1 1 1 > 1+ + + + + + + +... 2 3 4 5 6 7 8 1 1 1 + m−1 + m−1 +...+ m , 2 +1 2 +2 2
sn = 1 +
and, a fortiori, 1 m 1 1 1 + + +...+ = 1+ . 2 2 2 2 2 We conclude from this that the sum sn increases indefinitely with the integer number m, and consequently with n, which is a new proof of the divergence of the series.3 [118] Let us further consider the numerical series sn > 1 +
(4)
1,
1/1 , 1/(1 · 2) , 1/(1 · 2 · 3) , . . ., 1/(1 · 2 · 3 . . . n) , . . . .
The terms of this series with index greater than n, namely 1 1 1 , , , ..., 1 · 2 · 3 . . . n 1 · 2 · 3 . . . n(n + 1) 1 · 2 · 3 . . . n(n + 1)(n + 2) are, respectively, less than the corresponding terms of the geometric progression 1 1 1 1 1 , , , .... 1 · 2 · 3 . . . n 1 · 2 · 3 . . . n n 1 · 2 · 3 . . . n n2 As a consequence, the sum of however many of the initial terms as we may wish is always less than the sum of the corresponding terms of the geometric progression, which is a convergent series, and so a fortiori,4 it is less than the sum of this series, which is to say 1 1 1 1 = . 1 1·2·3...n 1− n 1 · 2 · 3 . . . (n − 1) n − 1 Because this last sum decreases indefinitely as n increases, it follows that series (4) is itself convergent. It is conventional to denote the sum of this series by the letter e. By adding together the first n terms, we obtain an approximate value of the number e, 3
Cauchy may not be claiming originality for this “new” proof. It was first given by Oresme (see, for example, [Dunham 1990, pp. 202–203]), but Cauchy was probably not aware of it. 4 This is an implicit use of the Comparison Test. Cauchy never states this test explicitly.
1 + 1/1 + 1/(1 · 2) + 1/(1 · 2 · 3) + . . . + 1/(1 · 2 · 3 . . . (n − 1)) .
According to what we have just said, the error made will be smaller than the product 1 . Therefore, for example, if we let n = 11, we find as the of the nth term by n−1 approximate value of e (5) e = 2.7182818 . . . , and the error made in this case is less than the product [119] of the fraction 1 1 1 1·2·3·4·5·6·7·8·9·10 by 10 , that is 36,288,000 , so that it does not affect the seventh decimal place. The number e, determined as we have just said, is often used in the summation of series and in the infinitesimal Calculus. Logarithms taken in the system with this number as its base are called Napierian, for Napier, the inventor of logarithms, or hyperbolic, because they measure the various parts of the area between the equilateral hyperbola and its asymptotes.5 In general, we denote the sum of a convergent series by the sum of the first terms, followed by an ellipsis. Thus, when the series u0 , u1 , u2 , u3 , . . . is convergent, the sum of this series is denoted u0 + u1 + u2 + u3 + . . . . By virtue of this convention, the value of the number e is determined by the equation (6)
e = 1 + 1/1 + 1/(1 · 2) + 1/(1 · 2 · 3) + 1/(1 · 2 · 3 · 4) + . . . ,
and, if one considers the geometric progression 1, x, x2 , x3 , . . . , we have, for numerical values of x less than 1, (7)
1 + x + x2 + x3 + . . . = 1/(1 − x) .
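A modern numerical illustration (not part of Cauchy's text) of equations (6) and (7): the partial sums approach the stated limits; the cutoffs used below are arbitrary.

    import math

    # equation (6): e = 1 + 1/1 + 1/(1*2) + 1/(1*2*3) + ...
    s = sum(1/math.factorial(k) for k in range(20))
    assert abs(s - math.e) < 1e-12

    # equation (7): 1 + x + x^2 + ... = 1/(1 - x) for |x| < 1
    x, n = 0.4, 60
    s = sum(x**k for k in range(n))
    assert abs(s - 1/(1 - x)) < 1e-12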
Denoting the sum of the convergent series u0 , u1 , u2 , u3 , . . . by s and the sum of the first n terms by sn , we have s = u0 + u1 + u2 + . . . + un−1 + un + un+1 + . . . = sn + un + un+1 + . . . , 5
I.e., the area under the curve y = 1/x .
and, as a consequence, s − sn = un + un+1 + . . . . [120] From this last equation, it follows that the quantities un , un+1 , un+2 , . . . form a new convergent series, the sum of which is equal to s − sn . If we represent this sum by rn , we have s = sn + rn , and rn is called the remainder of series (1) beginning from the nth term. Suppose the terms of series (1) involve some variable x. If the series is convergent and its various terms are continuous functions of x in a neighborhood of some particular value of this variable, then sn , rn and s are also three functions of the variable x, the first of which is obviously continuous with repect to x in a neighborhood of the particular value in question. Given this, let us consider the increments in these three functions when we increase x by an infinitely small quantity α. For all possible values of n, the increment in sn is an infinitely small quantity. The increment of rn , as well as rn itself, becomes infinitely small for very large values of n. Consequently, the increment in the function s must be infinitely small.6 From this remark, we immediately deduce the following proposition: Theorem I. — When the various terms of series (1) are functions of the same variable x, continuous with respect to this variable in the neighborhood of a particular value for which the series converges, the sum s of the series is also a continuous function of x in the neighborhood of this particular value.7 By virtue of this theorem, the sum of series (2) must be a continuous function of the variable x between the limits x = −1 and x = 1, [121] as we may verify by considering the values of s given by the equation s=
1/(1 − x).
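The translators' footnote 7 below observes that Theorem I fails without the additional hypothesis of uniform convergence. As a modern illustration (not Cauchy's, and not part of the translators' apparatus), the following Python sketch evaluates partial sums of Abel's 1826 counterexample, sin x − (sin 2x)/2 + (sin 3x)/3 − ..., whose terms are continuous and which converges for every real x, yet whose sum equals x/2 for |x| < π and jumps to 0 at x = π; the truncation at 100,000 terms is an arbitrary choice.

import math

def partial_sum(x, terms):
    # Partial sum of sin x - (sin 2x)/2 + (sin 3x)/3 - ...
    return sum((-1) ** (n + 1) * math.sin(n * x) / n for n in range(1, terms + 1))

print(partial_sum(math.pi - 0.01, 100000))  # close to (pi - 0.01)/2, about 1.5658
print(partial_sum(math.pi, 100000))         # essentially 0 (every term sin(n*pi)/n vanishes)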
6.2 On series for which all the terms are positive.
Whenever all the terms of the series
6 This passage is quoted in [Lützen 2003, p. 168].
7 This theorem as stated is incorrect. If we impose the additional condition of uniform convergence on the functions s_n, then it does hold. This theorem is controversial. Some have argued that Cauchy really had uniform convergence in mind. See [Lützen 2003, pp. 168–169] for further discussion.
(1) u_0, u_1, u_2, ..., u_n, ...
are positive, we may usually decide whether it is convergent or divergent by using the following theorem: 1
Theorem I.8 — Consider the limit or limits towards which the expression (un ) n converges as n increases indefinitely, and let k denote the largest of these limits, or in other words, the limit of the largest values of the expression in question. Series (1) converges whenever k < 1 and diverges whenever k > 1. Proof. — First of all, suppose that k < 1 and choose an arbitrary third number U between the two numbers 1 and k, so that we have k < U < 1. 1
As n increases beyond assignable limit, the largest values of (un ) n cannot approach indefinitely the limit k without eventually being constantly less than U. Consequently, it is possible to assign an integer value to n large enough so that when n is greater than or equal to this value, we constantly have9 1
(u_n)^(1/n) < U, or u_n < U^n.
It follows that the terms of the series
u0 , u1 , u2 , . . . , un+1 , un+2 , . . . [122] are eventually always smaller than the corresponding terms of the geometric progression 1, U, U 2 , . . . , U n , U n+1 , U n+2 , . . . . As this progression is convergent (because U < 1) we may, by the previous remark, conclude a fortiori the convergence of series (1). On the other hand, suppose that k > 1 and again pick a third number U between the two numbers 1 and k, so that we have k > U > 1. 1
As n increases without limit, the largest values of (un ) n in approaching k indefinitely eventually become greater than U. We may therefore satisfy the condition 1
(u_n)^(1/n) > U,
or, what amounts to the same thing, the following condition
8 This theorem is now known as the Root Test. It is cited as the definition of upper and lower limits in [DSB Cauchy, p. 136].
9 In [Cauchy 1897, p. 121], the subscript is missing in the term (u_n)^(1/n). It is present in [Cauchy 1821, p. 133]. (tr.)
un > U n , for values of n as large as we might wish. As a consequence, we find in the series u0 , u1 , u2 , . . . , un , un+1 , un+2 , . . . an indefinite number of terms greater than the corresponding terms of the geometric progression 1, U, U 2 , . . . , U n , U n+1 , U n+2 , . . . . As this progression is divergent (because U > 1) and, as a consequence its various terms increase to infinity, the remark that we have just made suffices to establish the divergence of series (1). In a great number of cases we may determine the values of the quantity k with the assistance of theorem IV (Chap. II, § III). Indeed, [123] by virtue of this theorem, u any time the ratio n+1 un converges towards a fixed limit, that limit is precisely the value of k. We may therefore state the following proposition: Theorem II.10 — If, for increasing values of n, the ratio un+1 un converges towards a fixed limit k, series (1) converges whenever k < 1 and diverges whenever k > 1. For example, if we consider the series 1,
1/1, 1/(1·2), 1/(1·2·3), ..., 1/(1·2·3···n), ...,
then we find
u_{n+1}/u_n = (1·2·3···n)/(1·2·3···n·(n+1)) = 1/(n+1), so k = 1/∞ = 0,
and consequently the series is convergent, as we already knew. The first of the two theorems that we have just established leaves no doubt about the convergence or divergence of a series whose terms are positive, except in the particular case where the quantity k becomes equal to one. In this particular case, it is not always easy to answer the question of convergence. However, we will now prove two new propositions, which frequently help us to decide the issue. Theorem III.11 — Whenever each term of series (1) is smaller than the one preceding it, that series and the following one 10 11
10 This is the Ratio Test; see [DSB Cauchy, p. 136].
11 This is known as the Cauchy Condensation Test.
(2) u_0, 2u_1, 4u_3, 8u_7, 16u_{15}, ...
are either both convergent or both divergent. Proof. — First of all, suppose that series (1) is convergent and [124] let s denote its sum. Then u0 = u0 , 2u1 = 2u1 , 4u3 < 2u2 + 2u3 , 8u7 < 2u4 + 2u5 + 2u6 + 2u7 , ... ... ....................., and consequently, the sum of as many of the terms of series (2) as we may wish is smaller than u0 + 2u1 + 2u2 + 2u3 + 2u4 + . . . = 2s − u0 . It follows that series (2) converges. On the other hand, suppose that series (1) diverges. The sum of its terms, taken in great number, eventually surpasses any assignable limit. Because we have u0 = u0 , 2u1 > u1 + u2 , 4u3 > u3 + u4 + u5 + u6 , 8u7 > u7 + u8 + u9 + u10 + u11 + u12 + u13 + u14 , ... ... .........................................., we must conclude that the sum of the quantities u0 , 2u1 , 4u3 , 8u7 , . . . , taken in great number, is itself eventually greater than any given quantity. Series (2) is therefore divergent, conforming to the stated theorem. Corollary. — Let µ be any quantity. If series (1) is (3)
1, 1/2^µ, 1/3^µ, 1/4^µ, ...,
then series (2) becomes
1, 2^(1−µ), 4^(1−µ), 8^(1−µ), ....
[125] This last series is a geometric progression, convergent whenever we have µ > 1 and divergent in the opposite case. As a consequence, series (3) is itself convergent if µ is a number greater than 1, and divergent if µ = 1 or µ < 1. For example, of the three series
(4) 1, 1/2^2, 1/3^2, 1/4^2, ...,
(5) 1, 1/2, 1/3, 1/4, ...,
(6) 1, 1/2^(1/2), 1/3^(1/2), 1/4^(1/2), ...,
the first is convergent and the other two divergent. Theorem IV.12 — Suppose that log denotes the characteristic of the logarithm in any system and that the ratio log(un ) log 1n converges towards a finite limit h for increasing values of n. Series (1) is convergent if h > 1 and divergent if h < 1. Proof. — First of all, suppose h > 1 and choose any third quantity a between the two quantities 1 and h, so that we have h > a > 1. The ratio
log(u_n)/log(1/n),
or its equivalent
log(1/u_n)/log(n),
eventually, for very large values of n, is constantly greater than a. In other words, if n increases beyond [126] a certain limit, we always have
log(1/u_n)/log(n) > a,
or what amounts to the same thing,
log(1/u_n) > a log(n),
and, as a consequence,
1/u_n > n^a,
so
u_n < 1/n^a.
It follows that the terms of series (1) are eventually constantly smaller than the corresponding terms of the series
1, 1/2^a, 1/3^a, 1/4^a, ..., 1/n^a, 1/(n+1)^a, ....
As this last series is convergent (because a > 1), we may, by the previous remark, conclude a fortiori the convergence of series (1). On the other hand, suppose that h < 1, and again pick a third quantity a between 1 and h, so that we have
h < a < 1.
Eventually, for very large values of n, we constantly have
log(1/u_n)/log(n) < a,
or what amounts to the same thing,
log(1/u_n) < a log(n),
and, as a consequence,
1/u_n < n^a,
so
u_n > 1/n^a.
It follows that the terms of series (1) eventually are constantly [127] greater than the corresponding terms of the following series
1, 1/2^a, 1/3^a, 1/4^a, ..., 1/n^a, 1/(n+1)^a, ....
As this last series is divergent (because a < 1), we may, by the remark we have just made, conclude a fortiori the divergence of series (1).
Given two convergent series, the terms of which are positive, we may, by adding or multiplying these same terms, form a new series, the sum of which results from the addition or the multiplication of the sums of the first two. On this subject, we establish the two following theorems:
Theorem V. — Let
(7) u_0, u_1, u_2, ..., u_n, ...,
    v_0, v_1, v_2, ..., v_n, ...
be two convergent series composed only of positive terms, having s and s′, respectively, as sums. Then
(8) u_0 + v_0, u_1 + v_1, u_2 + v_2, ..., u_n + v_n, ...
is a new convergent series, which has s + s′ as its sum.
Proof. — If we let
sn = u0 + u1 + u2 + . . . + un−1 and s0n = v0 + v1 + v2 + . . . + vn−1 , then sn and s0n converge, for increasing values of n, towards the limits s and s0 , respectively. As a consequence, sn + s0n , that is the sum of the first n terms of series (8), converges towards the limit s + s0 , which suffices to establish the stated theorem. Theorem VI. — Under the same hypotheses as the previous theorem, u0 v0 , u0 v1 + u1 v0 , u0 v2 + u1 v1 + u2 v0 , . . . (9) . . . , u0 vn + u1 vn−1 + . . . + un−1 v1 + un v0 , . . . is a new convergent series, which has ss0 as its sum. [128] Proof. — Once again, let sn and s0n be the sums of the first n terms of the two series (7), and additionally denote the sum of the first n terms of series (9) by n−1 s00n . If we denote by m the greatest integer included in n−1 2 , that is to say 2 when n n−2 13 is odd and 2 otherwise, we clearly have u0 v0 + (u0 v1 + u1 v0 ) + . . . + (u0 vn−1 + u1 vn−2 + . . . + un−2 v1 + un−1 v0 ) < (u0 + u1 + . . . + un−1 )(v0 + v1 + . . . + vn−1 ) and > (u0 + u1 + . . . + um )(v0 + v1 + . . . + vm ). In other words, s00n < sn s0n
and
s″_n > s_{m+1} s′_{m+1}.
Now suppose that we make n increase beyond all limit. The number
m = (n − 3/2 ± 1/2)/2
itself increases indefinitely, and the two sums sn and sm+1 converge towards the limit s, while s0n and s0m+1 converge towards the limit s0 . As a consequence, the two products sn s0n and sm+1 s0m+1 , as well as the sum s00n contained between these two products, converge towards the limit ss0 , which suffices to establish theorem VI.14
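As a modern numerical illustration of theorem VI (not part of the original text), the following Python sketch forms the Cauchy product, as in series (9), of two convergent geometric series of positive terms and checks that its partial sums approach the product ss′; the ratios 1/2 and 1/3 and the truncation length are arbitrary choices.

u = [(1 / 2) ** n for n in range(60)]   # u_n = 1/2^n, sum s  = 2
v = [(1 / 3) ** n for n in range(60)]   # v_n = 1/3^n, sum s' = 3/2
# nth term of series (9): u_0 v_n + u_1 v_{n-1} + ... + u_n v_0
w = [sum(u[i] * v[n - i] for i in range(n + 1)) for n in range(60)]
print(sum(u), sum(v), sum(w))           # about 2.0, 1.5 and 3.0 = s * s'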
6.3 On series which contain positive terms and negative terms.
Suppose that the series
13 The left-hand side of this inequality contained some subscripting errors in [Cauchy 1821, p. 141], which were not included in the Errata of that edition. These were corrected in [Cauchy 1897, p. 128].
14 This is another implicit application of the Squeeze Theorem.
(1) u_0, u_1, u_2, ..., u_n, ...
is composed of terms that are sometimes positive and sometimes negative, and let
(2) ρ_0, ρ_1, ρ_2, ..., ρ_n, ...
[129] be, respectively, the numerical values of these same terms, so that we have
u_0 = ±ρ_0, u_1 = ±ρ_1, u_2 = ±ρ_2, ..., u_n = ±ρ_n, ....
The numerical value of the sum
u_0 + u_1 + u_2 + ... + u_{n−1}
will never surpass15
ρ_0 + ρ_1 + ρ_2 + ... + ρ_{n−1},
so it follows that the convergence of series (2) always entails that of series (1).16 We ought to add that series (1) is divergent if some terms of series (2) eventually increase beyond all assignable limit. This latter case occurs whenever the greatest values of (ρ_n)^(1/n) converge towards a limit greater than 1, for increasing values of n. On the other hand, whenever this limit is less than 1, series (2) is always convergent. As a consequence, we may state the following theorem:
Theorem I.17 — Let ρ_n be the numerical value of the general term u_n of series (1), and let k denote the limit towards which the largest values of the expression (ρ_n)^(1/n) converge as n increases indefinitely. Series (1) is convergent if we have k < 1 and divergent if we have k > 1.
Whenever the fraction ρ_{n+1}/ρ_n, that is, the numerical value of the ratio u_{n+1}/u_n, converges towards a fixed limit, then by virtue of theorem IV (Chap. II, § III), this limit is the desired value of k. This remark brings us to the proposition which I will now state:
Theorem II.18 — If the numerical value of the ratio u_{n+1}/u_n converges towards a fixed limit k for increasing values of n, then series (1) is convergent whenever we have k < 1 and divergent whenever we have k > 1.
[130] For example, if we consider the series
15 Here Cauchy makes an implicit use of the generalized triangle inequality.
16 Cauchy does not define absolute convergence, but has essentially shown here that absolute convergence implies convergence.
17 This is another application of the Root Test.
18 This is another application of the Ratio Test.
1, −1/1, +1/(1·2), −1/(1·2·3), +...,
we find that
u_{n+1}/u_n = −1/(n+1), so that k = 1/∞ = 0,
from which it follows that the series is convergent. The first of the two theorems we have just established leaves no doubt about the convergence or divergence of a particular series, except in the particular case where the quantity denoted by k becomes equal to one. In this particular case, we may often establish the convergence of the given series either by verifying that the numerical values of the various terms form a convergent series or by means of the following theorem: Theorem III.19 — If the numerical value of the general term un in series (1) decreases constantly and indefinitely for increasing values of n, and if further the different terms are alternately positive and negative, then the series converges. For example, consider the series (3)
1, −1/2, +1/3, −1/4, +..., ±1/n, ∓1/(n+1), ....
The sum of the terms whose index is greater than n, if we suppose them to be m in number, is
±[1/(n+1) − 1/(n+2) + 1/(n+3) − 1/(n+4) + ... ± 1/(n+m)].
Now the numerical value of this sum, namely
1/(n+1) − 1/(n+2) + 1/(n+3) − 1/(n+4) + ... ± 1/(n+m)
  = 1/(n+1) − [1/(n+2) − 1/(n+3)] − [1/(n+4) − 1/(n+5)] − ...
  = [1/(n+1) − 1/(n+2)] + [1/(n+3) − 1/(n+4)] + [1/(n+5) − 1/(n+6)] + ...,
[131] because it is obviously contained between
1/(n+1) and 1/(n+1) − 1/(n+2),
decreases indefinitely for increasing values of n, whatever the value of m, which suffices to establish the convergence of the given series. The same arguments may
19 This is the Alternating Series Test.
obviously be applied to any series of this kind. I will cite, for example, the following
(4) 1, −1/2^µ, +1/3^µ, −1/4^µ, ...,
which remains convergent for all positive values of µ, by virtue of theorem III. If we suppress the − sign preceding each term of even index in series (4), we obtain series (3) of section II, which is divergent whenever we have µ = 1 or µ < 1. As a consequence, to transform a convergent series into a divergent series, or vice versa, it sometimes suffices to change the sign of certain terms. Moreover, this remark applies exclusively to series for which the quantity denoted by k in theorem II reduces to 1. Given a convergent series, the terms of which are positive, we can only augment the convergence by diminishing the numerical values of the same terms and changing the signs of some of them. It is worth noting that we produce this double effect if we multiply each term by a sine or by a cosine, and this observation suffices to establish the following proposition: Theorem IV.20 — When the series ρ0 , ρ1 , ρ2 , . . . , ρn , . . . ,
(2)
made up entirely of positive terms, is convergent, then each of the following ρ0 cos θ0 , ρ1 cos θ1 , ρ2 cos θ2 , . . . , ρn cos θn , . . . , (5) ρ0 sin θ0 , ρ1 sin θ1 , ρ2 sin θ2 , . . . , ρn sin θn , . . . [132] is also convergent, whatever the values of the arcs θ0 , θ1 , θ2 , . . ., θn , . . .. Corollary. — If we suppose in general that θn = nθ , where θ denotes an arbitrary arc, then the two series in (5) become, respectively, ρ0 , ρ1 cos θ , ρ2 cos 2θ , . . . , ρn cos nθ , . . . , (6) ρ1 sin θ , ρ2 sin 2θ , . . . , ρn sin nθ , . . . . These last two series will therefore always be convergent whenever series (2) is convergent. If we consider two series at the same time, both of which include positive terms and negative terms, we easily prove theorems V and VI of §II about them, as we will now see. Theorem V. — Let 20
This is another implicit application of the Comparison Test and Cauchy’s notion of absolute convergence.
(7) u_0, u_1, u_2, ..., u_n, ...,
    v_0, v_1, v_2, ..., v_n, ...
be two convergent series having s and s′, respectively, as sums. Then
(8) u_0 + v_0, u_1 + v_1, u_2 + v_2, ..., u_n + v_n, ...
is a new convergent series, having s + s0 as its sum. Proof. — If we let sn = u0 + u1 + u2 + . . . + un−1 and s0n = v0 + v1 + v2 + . . . + vn−1 , then, for increasing values of n, sn and s0n converge towards the limits s and s0 , respectively. As a consequence, sn + s0n , that is the sum of the first n terms of series (8), converges towards the limit s + s0 , which suffices to establish the stated theorem. Theorem VI.21 — Under the same hypotheses as the previous [133] theorem, if each of series (7) remains convergent when we replace its various terms with their numerical values, then u0 v0 , u0 v1 + u1 v0 , u0 v2 + u1 v1 + u2 v0 , (9) .................., u0 vn + u1 vn−1 + . . . + un−1 v1 + un v0 , .................................... is a new convergent series having ss0 as its sum. Proof. — Once again, let sn and s0n be the sums of the first n terms of the two series (7), and additionally denote the sum of the first n terms of series (9) by s00n . Then we have sn s0n − s00n = un−1 vn−1 + (un−1 vn−2 + un−2 vn−1 ) + . . . + (un−1 v1 + un−2 v2 + . . . + u2 vn−2 + u1 vn−1 ) . Furthermore, theorem VI was proved in the second section in the case where series (7) consists only of positive terms. It is a consequence of this hypothesis that each the quantities sn s0n and s00n converges towards the limit ss0 , for increasing values of n. Consequently, the difference sn s0n − s00n , or what amounts to the same thing, the sum un−1 vn−1 + (un−1 vn−2 + un−2 vn−1 ) + . . . + (un−1 v1 + un−2 v2 + . . . + u2 vn−2 + u1 vn−1 ) , 21
This is sometimes known as Mertens’ Theorem.
converges towards the limit zero. Now, if some of the terms of series (7) are positive and the others are negative, suppose that we denote the numerical values of the various terms by
(10) ρ_0, ρ_1, ρ_2, ..., ρ_n, ...,
     ρ′_0, ρ′_1, ρ′_2, ..., ρ′_n, ...,
respectively. Suppose further, as in the statement of the theorem, that series (10), composed [134] of these same numerical values, are both convergent. By virtue of the remark we have just made, the sum
ρ_{n−1} ρ′_{n−1} + (ρ_{n−1} ρ′_{n−2} + ρ_{n−2} ρ′_{n−1}) + ... + (ρ_{n−1} ρ′_1 + ρ_{n−2} ρ′_2 + ... + ρ_2 ρ′_{n−2} + ρ_1 ρ′_{n−1})
converges towards the limit zero for increasing values of n. Because the numerical value of that sum is evidently greater than that of the following
u_{n−1} v_{n−1} + (u_{n−1} v_{n−2} + u_{n−2} v_{n−1}) + ... + (u_{n−1} v_1 + u_{n−2} v_2 + ... + u_2 v_{n−2} + u_1 v_{n−1}),
it follows that this latter, or what amounts to the same thing, the difference s_n s′_n − s″_n itself converges towards the limit zero. Consequently, ss′, which is the limit of the product s_n s′_n, is also that of s″_n. In other words, series (9) is convergent and has as its sum the product ss′.
Scholium. — The previous theorem could not remain true if series (7), assumed to be convergent, ceased to be so after the reduction of each term to its numerical value. Suppose, for example, that we take both of series (7) to be
(11) 1, −1/√2, +1/√3, −1/√4, +1/√5, −....
Series (9) becomes
(12) 1, −(1/√2 + 1/√2), +(1/√3 + 1/√(2·2) + 1/√3), −(1/√4 + 1/√(3·2) + 1/√(2·3) + 1/√4), +....
[135] This last series is divergent because its general term, namely
±[1/√n + 1/√((n−1)·2) + 1/√((n−2)·3) + ... + 1/√(2·(n−1)) + 1/√n],
has a numerical value clearly greater than
n/[(n/2)(n/2 + 1)]^(1/2) = [4n/(n+2)]^(1/2)
when n is even, and greater than
n/[((n+1)/2)^2]^(1/2) = 2n/(n+1)
when n is odd. That is, in every possible case, it has a numerical value greater than 1. Nevertheless, series (11) is convergent. However, we ought to observe that it ceases to be convergent when we replace each term with its numerical value, because it then changes to series (6) of § II.
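A quick numerical check of the scholium (a modern addition, not Cauchy's computation): the terms of series (12) indeed have numerical value greater than 1 and so cannot tend to zero, even though the alternating series (11) converges. The function name and the sample indices below are our own.

from math import sqrt

def term12(n):
    # numerical value of the term of index n in series (12):
    # 1/sqrt(1*(n+1)) + 1/sqrt(2*n) + ... + 1/sqrt((n+1)*1)
    return sum(1 / sqrt((i + 1) * (n - i + 1)) for i in range(n + 1))

for n in (1, 10, 100, 1000):
    print(n, term12(n))   # always greater than 1, and in fact increasing with n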
6.4 On series ordered according to the ascending integer powers of a single variable.
Let
(1) a_0, a_1 x, a_2 x^2, ..., a_n x^n, ...
be a series ordered according to the ascending integer powers of the variable x,22 where
(2) a_0, a_1, a_2, ..., a_n, ...
denote constant coefficients, positive or negative. Furthermore, let A be the quantity that corresponds to the quantity k of the previous section (see § III, theorem II), with respect to series (2).23 The same quantity, when calculated for series (1), is the numerical value of the product Ax. [136] As a consequence, series (1) is convergent if this numerical value is less than 1, which is to say in other words, if the numerical value of the variable x is less than 1/A.24 On the other hand, series (1) is divergent if the numerical value of x is greater than 1/A. We may therefore state the following proposition:
Theorem I. — Let A be the limit towards which the nth root of the largest numerical values of a_n converge, for increasing values of n. Series (1) is convergent for all values of x contained between the limits
22 Such series had not yet been given the modern name power series.
23 Theorem II of § III is the Ratio Test, so Cauchy is saying that A = lim |a_{n+1}/a_n|, when this limit exists. However, his statements of theorems I and II below and the discussion in between suggest that he means A = lim sup |a_n|^(1/n).
24 This number 1/A had not yet been given the modern name radius of convergence.
x = −1/A and x = +1/A,
and divergent for all values of x situated outside of these same limits.
Whenever the numerical value of the ratio a_{n+1}/a_n converges towards a fixed limit, this limit is the desired value of A (by virtue of theorem IV, Chap. II, § III). This remark brings us to a new proposition that I will write:
Theorem II. — If the numerical value of the ratio a_{n+1}/a_n converges towards the limit A for increasing values of n, series (1) is convergent for all values of x contained between the limits
x = −1/A and x = +1/A,
and divergent for all values of x situated outside of these same limits.
Corollary I. — For an example, take the series
(3) 1, 2x, 3x^2, 4x^3, ..., (n+1)x^n, ....
Because under this hypothesis we find that
a_{n+1}/a_n = (n+2)/(n+1) = 1 + 1/(n+1),
[137] and as a consequence, A = 1, we thereby conclude that series (3) is convergent for all values of x contained between the limits x = −1 and x = +1, and divergent for all values of x situated outside of these limits.
Corollary II. — For a second example, take the series
(4) x/1, x^2/2, x^3/3, ..., x^n/n, ...,
in which the constant term is understood to be zero. Under this hypothesis, we find that
a_{n+1}/a_n = n/(n+1) = 1/(1 + 1/n),
and as a consequence, A = 1. Series (4) is therefore again convergent or divergent according to whether the numerical value of x is less or greater than 1.
Corollary III. — If we take the following for series (1)
(5) 1, (µ/1)x, (µ(µ−1)/(1·2))x^2, ..., (µ(µ−1)(µ−2)···(µ−n+1)/(1·2·3···n))x^n, ...,
where µ denotes any quantity, then we find that
a_{n+1}/a_n = (µ − n)/(n + 1) = −(1 − µ/n)/(1 + 1/n),
and as a consequence,
A = lim (1 − µ/n)/(1 + 1/n) = (1 − µ/∞)/(1 + 1/∞) = 1.
We thereby conclude that series (5) is, like series (3) and (4), convergent [138] or divergent, according to whether we assign a numerical value less or greater than 1 to the variable x.
Corollary IV. — Now consider the series
(6) 1, x/1, x^2/(1·2), x^3/(1·2·3), ..., x^n/(1·2·3···n), ....
Because in this case we have
a_{n+1}/a_n = 1/(n+1)
and, as a consequence,
A = 1/∞ = 0,
we thereby conclude that the series is convergent between the limits
x = −1/0 = −∞ and x = +1/0 = +∞,
that is, for all possible real values of the variable x.
Corollary V. — Finally, consider the series
(7) 1, 1·x, 1·2·x^2, 1·2·3·x^3, ..., 1·2·3···n·x^n, ....
In applying theorem II to this series, we find
a_{n+1}/a_n = n + 1 and A = ∞,
and consequently, we have
1/A = 0.
We thereby conclude that series (7) is always divergent, except when we suppose that x = 0, in which case it reduces to its first term 1.
By examining the results which we have just obtained, we recognize immediately that, among series ordered according to increasing integer powers of the variable x, some are either [139] convergent or divergent according to the value assigned to this variable, while others are always convergent, no matter what x might be, and others are always divergent, except for x = 0. We may add that theorem I leaves no uncertainty about the convergence of such a series, except in the case where the numerical value of x becomes equal to the positive constant given by 1/A, that is, when we suppose
x = ±1/A.
In this particular case, the series is sometimes convergent, sometimes divergent, and the convergence sometimes depends on the sign of the variable x. For example, if in series (4), for which A = 1, we successively let
x = 1 and x = −1,
we obtain the following
(8) 1, 1/2, 1/3, 1/4, ..., 1/n, ...,
(9) −1, +1/2, −1/3, +1/4, ..., ±1/n, ...,
of which the first is divergent (see the corollary to theorem III in § II) and the second is convergent, as follows from theorem III (§ III). It is also essential to remark that, as follows from theorem I, whenever a series ordered according to the ascending integer powers of a variable x is convergent for a numerical value of x different from zero, it remains convergent if we diminish that numerical value, or even let it decrease indefinitely. Whenever two series ordered according to the ascending integer powers of the variable x are convergent for the same value of the variable, we may apply theorems V and VI of § III to them. [140] This remark suffices to establish two propositions, which I will state: Theorem III. — Suppose that the two series a0 , a1 x, a2 x2 , . . . , an xn , . . . , (10) b0 , b1 x, b2 x2 , . . . , bn xn , . . . are both convergent when we assign a particular value to the variable x, and have s and s0 , repectively, as their sums. Then (11)
a0 + b0 , (a1 + b1 ) x, (a2 + b2 ) x2 , . . . , (an + bn ) xn , . . .
is a new convergent series, having s + s0 as its sum. Corollary. — We easily extend this theorem to as many series as we might wish. For example, if the three series a0 , a1 x, a2 x2 , . . . , b0 , b1 x, b2 x2 , . . . , c0 , c1 x, c2 x2 , . . . are convergent for the same value assigned to the variable x, and if we denote their respective sums by s, s0 and s00 , then a0 + b0 + c0 , (a1 + b1 + c1 ) x, (a2 + b2 + c2 ) x2 , . . . is a new convergent series, which has s + s0 + s00 as its sum. Theorem IV. — Under the same hypotheses as the previous theorem, if each of series (10) remains convergent when we replace its various terms with their numerical values, then a0 b0 , (a0 b1 + a1 b0 ) x, (a0 b2 + a1 b1 + a2 b0 ) x2 , . . . , (12) . . . , (a0 bn + a1 bn−1 + . . . + an−1 b1 + an b0 ) xn , . . . is a new convergent series, having ss0 as its sum. [141] Corollary I. — The previous theorem is found contained in the formula (a0 + a1 x + a2 x2 + . . . (b0 + b1 x + b2 x2 + . . . (13) = a0 b0 + (a0 b1 + a1 b0 )x + (a0 b2 + a1 b1 + a2 b0 )x2 + . . . , which remains true in the case where each of series (10) remains convergent when we replace its various terms with their numerical values. Under this hypothesis, formula (13) may be used to expand the product of the sums of the two series into a new series of the same form. Corollary II. — We may multiply together three or more series similar to (10), each of which remains convergent when we replace its various terms with their numerical values, by repeating the operation indicated in equation (13) several times. The product thus obtained is the sum of a new convergent series, ordered according to the increasing integer powers of the variable x. Corollary III. — In the two preceding corollaries, suppose that all the series whose sums we multiply are equal. Then the product we obtain is the integer power of the sum of each of these, and this last sum is also represented by the sum of a series of the same kind. For example, if we let a0 = b0 , a1 = b1 , a2 = b2 , . . ., in equation (13) we get
(14) (a_0 + a_1 x + a_2 x^2 + ...)^2 = a_0^2 + 2a_0 a_1 x + (2a_0 a_2 + a_1^2)x^2 + ....
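Formula (13) is simply the convolution of the two coefficient sequences. As a modern sketch (the function name and the truncation length are our own, not Cauchy's), here it is in Python, checked against equation (14) by squaring 1 + x + x^2 + ... = 1/(1 − x), whose square has coefficients 1, 2, 3, ....

def cauchy_product(a, b):
    # Coefficients of the product series, as in formula (13):
    # c_n = a_0 b_n + a_1 b_{n-1} + ... + a_n b_0
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]

a = [1] * 10                   # coefficients of 1 + x + x^2 + ...
print(cauchy_product(a, a))    # [1, 2, 3, 4, ...], the expansion of 1/(1 - x)^2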
Corollary IV. — If we take
(µ(µ−1)(µ−2)···(µ−n+1)/(1·2·3···n)) x^n and (µ′(µ′−1)(µ′−2)···(µ′−n+1)/(1·2·3···n)) x^n
as the general terms of series (10), where µ and µ′ denote any two quantities, and if the variable x is contained between the limits x = −1 and x = +1, then each of series (10) [142] is convergent, even if we replace the various terms with their numerical values, and the general term of series (12) is
[µ(µ−1)···(µ−n+1)/(1·2·3···n) + (µ(µ−1)···(µ−n+2)/(1·2·3···(n−1)))·(µ′/1) + ... + (µ/1)·(µ′(µ′−1)···(µ′−n+2)/(1·2·3···(n−1))) + µ′(µ′−1)···(µ′−n+1)/(1·2·3···n)] x^n
  = [(µ+µ′)(µ+µ′−1)(µ+µ′−2)···(µ+µ′−n+1)/(1·2·3···n)] x^n.
Given this, if we let ϕ(µ) denote the sum of the first of series (10) under the hypothesis that we have just made, that is if we suppose
(15) ϕ(µ) = 1 + (µ/1)x + (µ(µ−1)/(1·2))x^2 + ...,
then under the same hypothesis the sums of series (10) and (12) are denoted ϕ(µ), ϕ(µ′) and ϕ(µ + µ′), respectively, so that equation (13) becomes
(16) ϕ(µ)ϕ(µ′) = ϕ(µ + µ′).
Whenever we replace the series b0 , b1 x, b2 x2 , . . . in equation (13) with a polynomial composed of a finite number of terms, we obtain a formula that never fails to be exact, as long as the series a0 , a1 x, a2 x2 , . . . remains convergent. We will prove this directly by establishing the following theorem: Theorem V. — If series (1) is convergent and if we multiply the sum of this series by the polynomial kxm + lxm−1 + . . . + px + q, (17)
in which m denotes an integer number, the product we obtain is the [143] sum of a new convergent series of the same form, the general term of which is (qan + pan−1 + . . . + lan−m+1 + kan−m ) xm , as long as, among the first terms, those quantities an−1 , an−2 , . . . , an−m+1 , an−m that have negative indices are considered to be zero. In other words, we have25 kxm + lxm−1 + . . . + px + q a0 + a1 x + a2 x2 + . . . = qa0 + (qa1 + pa0 ) x + . . . + (qam + pam−1 + . . . + la1 + ka0 ) xm (18) +.................................... + (qan + pan−1 + . . . + lan−m+1 + kan−m ) xn + . . . . Proof. — To multiply the sum of series (1) by the polynomial (17), it suffices to multiply it successively by the different terms of the polynomial. Thus, we have kxm + lxm−1 + . . . + px + q a0 + a1 x + a2 x2 + . . . = q a0 + a1 x + a2 x2 + . . . + px a0 + a1 x + a2 x2 + . . . +...................................................... +lxm−1 a0 + a1 x + a2 x2 + . . . + kxm a0 + a1 x + a2 x2 + . . . . Because for any integer value of n we also have q a0 + a1 x + a2 x2 + . . . + an−1 xn−1
= qa0 + qa1 x + qa2 x2 + . . . + qan−1 xn−1 , we conclude that, by making n increase indefinitely and passing to the limit, q a0 + a1 x + a2 x2 + . . . = qa0 + qa1 x + qa2 x2 + . . . . Similarly, we find px a0 + a1 x + a2 x2 + . . . = pa0 x + pa1 x2 + pa2 x3 + . . . , . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . , a0 + a1 x + a2 x2 + . . . = la0 xm−1 + la1 xm + la2 xm+1 + . . . , kxm a0 + a1 x + a2 x2 + . . . = ka0 xm + ka1 xm+1 + ka2 xm+2 + . . . .
lxm−1
25 The factor x^n on the last line of (18) is written incorrectly in [Cauchy 1897, p. 143]. It is correct in [Cauchy 1821, p. 160]. (tr.)
[144] If we add these last equations and form the sum of the right-hand sides, then by gathering together the coefficients of the same powers of x, we obtain precisely formula (18). Imagine now that we vary the value of x in series (1) by insensible degrees. As long as the series remains convergent, that is as long as the value of x remains contained between the limits −
1/A and +1/A,
the sum of the series is (by virtue of theorem I, § I) a continuous function of the variable x. Let ϕ(x) be this continuous function. The equation ϕ(x) = a0 + a1 x + a2 x2 + . . . remains true for all values of x contained between the limits − A1 and + A1 , which we indicate by writing these limits beside the series, as we see here: 1 1 ϕ(x) = a0 + a1 x + a2 x2 + . . . x=− , x=+ (19) . A A When the series is assumed to be known, we may sometimes deduce from it the value of the function ϕ(x) in a finite form, and it is this that we call summing the series. However, more often the function ϕ(x) is given, and we propose return from this function to the series, or in other words, to expand26 the function into a convergent series ordered according to increasing integer powers of x. On this matter, it is easy to establish the proposition that I will state: Theorem VI. — A continuous function of the variable x can be expanded in only one way as a convergent series ordered according to the increasing integer powers of this variable. Proof. — Indeed, suppose that we have expanded the function ϕ(x) by two [145] different methods, and let a0 , a1 x, a2 x2 , . . . , an xn , . . . , b0 , b1 x, b2 x2 , . . . , bn xn , . . . be the two expansions, that is two series, each convergent for values of x other than zero, and each having the function ϕ(x) as its sum, as long as it remains convergent. Because these two series are constantly convergent for very small numerical values of x, for such values they have a0 + a1 x + a2 x2 + . . . = b0 + b1 x + b2 x2 + . . . . By making x vanish in the previous equation, we get 26
26 Literally to “develop” (développer) (tr.).
a0 = b0 . In general, it follows that we may reduce that equation to a1 x + a2 x2 + . . . = b1 x + b2 x2 + . . . , or what amounts to the same thing, to x (a1 + a2 x + . . .) = x (b1 + b2 x + . . .) . If we multiply both sides of this last equation by 1x , we obtain the following a1 + a2 x + . . . = b1 + b2 x + . . . , which must also remain true for very small numerical values of the variable x. By letting x = 0, we may conclude from this that a1 = b1 . By continuing in the same way, we may show that the constants a0 , a1 , a2 , . . ., are equal to the constants b0 , b1 , b2 , . . . , respectively. From this it follows that the two expansions of the function ϕ(x) are identical. The differential Calculus gives very expeditious methods for expanding functions into series. We will describe these methods later on. [146] For now we will limit ourselves to expanding the function (1 + x)µ , in which µ denotes any quantity, and using this to derive expansions of two other functions, which follow easily from the first, namely: Ax and log(1 + x), where A denotes a positive constant and log denotes the characteristic of the logarithm in a system chosen at will. As a consequence, we will solve the following three problems, one after another: Problem I. — When possible, to expand the function (1 + x)µ into a convergent series ordered according to increasing integer powers of the variable x. Solution. — First of all, suppose that µ = m, where m denotes any integer number. By the formula of Newton, we have (1 + x)m = 1 +
(m/1)x + (m(m−1)/(1·2))x^2 + ....
The series whose sum constitutes the right-hand side of this formula is always composed of a finite number of terms. However, if we replace the integer number m by any quantity µ, the new series that we obtain, namely
(5) 1, (µ/1)x, (µ(µ−1)/(1·2))x^2, ...,
is generally composed of an indefinite number of terms and is convergent only for numerical values of x less than 1. Under this hypothesis, let ϕ(µ) be the sum of the new series, so that we have
(15) ϕ(µ) = 1 + (µ/1)x + (µ(µ−1)/(1·2))x^2 + ...   (x = −1, x = +1).
By virtue of theorem I (§ I), ϕ(µ) is a continuous function of the [147] variable µ, between arbitrary limits of this variable, and we have (see theorem III, Corollary IV)
(16) ϕ(µ)ϕ(µ′) = ϕ(µ + µ′).
Because this last equation is entirely similar to equation (2) of chapter V (§ I), it is solved in the same manner, and we conclude thereby that
ϕ(µ) = [ϕ(1)]^µ = (1 + x)^µ.
If we substitute the value of ϕ(µ) determined in this way into formula (15), we find that for all values of x contained between the limits x = −1 and x = +1,
(20) (1 + x)^µ = 1 + (µ/1)x + (µ(µ−1)/(1·2))x^2 + ...   (x = −1, x = +1).
Whenever the numerical value of x is greater than 1, series (5) is no longer convergent and ceases to have a sum, so that equation (20) no longer remains true. Under the same hypothesis, as we shall prove later with the aid of the infinitesimal Calculus, it is impossible to expand the function (1 + x)^µ into a convergent series ordered according to the ascending powers of the variable x.
Corollary I. — If we replace µ by 1/α and x by αx in equation (20), where α denotes an infinitely small quantity, then for all values of αx contained between the limits −1 and +1, or what amounts to the same thing, for all values of x contained between the limits −1/α and +1/α, we have
(1 + αx)^(1/α) = 1 + x/1 + (x^2/(1·2))(1 − α) + (x^3/(1·2·3))(1 − α)(1 − 2α) + ...   (x = −1/α, x = +1/α).
This last equation ought to remain true, no matter how small the numerical value of α may be. If we denote as usual by the abbreviation lim placed in front of an expression that includes the variable α the [148] limit towards which this expression converges as the numerical value of α decreases indefinitely, then in passing to the limit, we find
(21) lim (1 + αx)^(1/α) = 1 + x/1 + x^2/(1·2) + x^3/(1·2·3) + ...   (x = −∞, x = +∞).
It remains to seek the limit of (1 + αx)^(1/α). First, from the previous formula we get
lim (1 + α)^(1/α) = 1 + 1/1 + 1/(1·2) + 1/(1·2·3) + ...,
or in other words,
(22) lim (1 + α)^(1/α) = e,
where e denotes the base of Napierian logarithms [see § I, eq. (6)]. We conclude immediately that
lim (1 + αx)^(1/(αx)) = e,
and as a consequence,
lim (1 + αx)^(1/α) = lim [(1 + αx)^(1/(αx))]^x = e^x.
Now if we substitute the value of lim (1 + αx)^(1/α) into equation (21), then we obtain the following:
(23) e^x = 1 + x/1 + x^2/(1·2) + x^3/(1·2·3) + ...   (x = −∞, x = +∞).
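As a modern numerical check of equations (21)–(23) (not part of the original text), the partial sums of the exponential series and the quantity (1 + αx)^(1/α) for a small α both approach e^x; the values x = 2 and α = 10^(−6) are arbitrary choices.

import math

x = 2.0
partial = sum(x ** n / math.factorial(n) for n in range(30))
alpha = 1e-6
print(partial)                          # about 7.389056...
print((1 + alpha * x) ** (1 / alpha))   # about 7.38904..., close to e^2
print(math.exp(x))                      # 7.389056...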
We may derive equation (23) directly by observing that the series
(6) 1, x/1, x^2/(1·2), x^3/(1·2·3), ...
is convergent for all possible values of the variable x and then seeking that function of x which represents the sum of this same [149] series. Indeed, let ϕ(x) be the sum of series (6), which has
x^n/(1·2·3···n)
as its general term. Then ϕ(y) is the sum of the series whose general term is
y^n/(1·2·3···n).
By virtue of theorem VI, § III, the product of these two sums is the sum of a new series that has
x^n/(1·2·3···n) + (x^{n−1}/(1·2·3···(n−1)))·(y/1) + ... + (x/1)·(y^{n−1}/(1·2·3···(n−1))) + y^n/(1·2·3···n) = (x + y)^n/(1·2·3···n)
as its general term. This product is therefore equal to ϕ(x + y), and consequently, if we let
ϕ(x) = 1 + x/1 + x^2/(1·2) + x^3/(1·2·3) + ...,
then the function ϕ(x) satisfies the equation
ϕ(x)ϕ(y) = ϕ(x + y).
Solving this equation, we find
ϕ(x) = [ϕ(1)]^x = (1 + 1/1 + 1/(1·2) + 1/(1·2·3) + ...)^x.
That is
ϕ(x) = e^x.
Corollary II. — If we divide both sides of (20) by µ after subtracting 1 from both sides, the equation that we obtain may be written as follows:
((1 + x)^µ − 1)/µ = x − (1 − µ)x^2/2 + (1 − µ)(1 − µ/2)x^3/3 − ...   (x = −1, x = +1).
[150] If we let µ converge towards the limit zero in this last equation we find, by passing to the limit, that27
(24) lim ((1 + x)^µ − 1)/µ = x − x^2/2 + x^3/3 + ....
Furthermore, when ln denotes the characteristic of the Napierian logarithm, taken in the system whose base is e, then we evidently have
1 + x = e^(ln(1+x))
and
(1 + x)^µ = e^(µ ln(1+x)) = 1 + µ ln(1+x)/1 + µ^2 [ln(1+x)]^2/(1·2) + ....
We conclude that
((1 + x)^µ − 1)/µ = ln(1+x) + (µ/2)[ln(1+x)]^2 + ....
Consequently
(25) lim ((1 + x)^µ − 1)/µ = ln(1 + x).
Given this, formula (24) becomes
27 Cauchy indeed uses a + sign following x^3/3 in [Cauchy 1821, p. 169, Cauchy 1897, p. 150]. Note the contrast with equation (26). (tr.)
(26) ln(1 + x) = x − x^2/2 + x^3/3 − ...   (x = −1, x = +1).
The preceding equation remains true as long as the numerical value of x remains smaller than 1. In this case, the series
(27) x, −x^2/2, +x^3/3, ..., ±x^n/n, ...
is convergent, as is series (4), which differs from it only in the signs of the terms of odd order.28 Because these same series are divergent when we suppose that the numerical value of x is greater than 1, equation (26) ceases to hold under this hypothesis. In the particular case where we take x = 1, series (27) reduces to series (3) of the third section, which is convergent, [151] as we have shown. Thus, equation (26) ought to remain true, so that we have (28)
ln(2) = 1 − 1/2 + 1/3 − 1/4 + ....
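As a modern check of equation (28) (not in the original), the alternating harmonic series does converge to ln 2, although slowly; the truncation points below are arbitrary.

import math

for n in (10, 1000, 100000):
    s = sum((-1) ** (k + 1) / k for k in range(1, n + 1))
    print(n, s, math.log(2) - s)   # the error is roughly 1/(2n), consistent with theorem III of section 6.3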
On the other hand, if we let x = −1, then series (27) is divergent and has no sum. We may further note that, if after substituting −x for x in formula (26), we change the signs on both sides of the equation, we obtain the following
(29) ln(1/(1 − x)) = x + x^2/2 + x^3/3 + ...   (x = −1, x = +1).
Problem II. — To expand the function A^x, where A denotes an arbitrary number, into a convergent series ordered according to increasing integer powers of the variable x.
Solution. — We continue to let the characteristic ln denote the Napierian logarithm taken in the system whose base is e. From the definition of this logarithm, we have
A = e^(ln(A)),
and we thereby conclude that
(30)
A^x = e^(x ln(A)).
Consequently, using equation (23), we have 28
This use of “odd order” (rang impair) in both [Cauchy 1821, p. 170] and [Cauchy 1897, p. 150] may be an error. On the other hand, for Cauchy, the rang may mean the position of a term in a series, starting with order zero. So in series (27), terms of odd order may be the ones of even degree. However, this does not seem to have been the case in series (4) of this section [Cauchy 1821, p. 153, Cauchy 1897, p. 137].
(31) A^x = 1 + x ln(A)/1 + x^2 [ln(A)]^2/(1·2) + x^3 [ln(A)]^3/(1·2·3) + ...   (x = −∞, x = +∞).
This last formula remains true for all possible real values of the variable x.
Problem III. — Letting the characteristic log denote the logarithm taken [152] in the system whose base is A, to expand the function log(1 + x), where possible, into a convergent series ordered according to increasing integer powers of the variable x.
Solution. — Still denoting the characteristic of the Napierian logarithm by ln, by virtue of well-known properties of the logarithm, we have
log(1 + x) = log(1 + x)/log(A) = ln(1 + x)/ln(A).
Consequently, making use of equation (26), we find that for all values of x contained between the limits −1 and +1,
(32) log(1 + x) = (1/ln(A)) (x − x^2/2 + x^3/3 − ...)   (x = −1, x = +1).
This last formula remains true even in the case where we take x = 1, but it ceases to hold whenever we have x = −1 or x^2 > 1.
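To close the chapter, a small numerical illustration of equations (31) and (32) (a modern addition, not part of the text); the base A = 10 and the value x = 0.5 are arbitrary choices.

import math

A, x = 10.0, 0.5
lnA = math.log(A)
# equation (31): A^x = 1 + x ln A / 1 + x^2 (ln A)^2 / (1*2) + ...
s31 = sum((x * lnA) ** n / math.factorial(n) for n in range(40))
print(s31, A ** x)               # both about 3.16227766
# equation (32): log_A(1 + x) = (1/ln A)(x - x^2/2 + x^3/3 - ...)
s32 = sum((-1) ** (n + 1) * x ** n / n for n in range(1, 200)) / lnA
print(s32, math.log10(1 + x))    # both about 0.1760913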
Chapter 7
On imaginary expressions and their moduli.
7.1 General considerations on imaginary expressions. [153] In analysis, we call a symbolic expression or symbol any combination of algebraic signs that do not mean anything by themselves or to which we attribute a value different from that which they ought naturally to have. Likewise, we call symbolic equations all those that, taking the letters and the interpretations according to the generally established conventions, are inexact or do not make sense, but from which we can deduce exact results by modifying and altering either the equations themselves or the symbols which comprise them, according to fixed rules. The use of symbolic expressions or equations is often a means of simplifying calculations and of writing in a short form results that appear quite complicated. We have already seen this in the second section of the third chapter where formula (9) gives a very simple symbolic value to the unknown x satisfying equations (4).1 Among those symbolic expressions or equations which are of some importance in analysis, we should distinguish above all those which we call imaginary. We are going to show how we can put them to good use. We know that the sine and the cosine of the arc a + b are given as functions of the sines and cosines of the arcs a and b by the formulas ( cos (a + b) = cos a cos b − sin a sin b, (1) sin (a + b) = sin a cos b + sin b cos a. [154] Now, without taking the trouble to remember these formulas, we have a very simple means of recovering them at will. Indeed, it suffices to consider the following remark. Suppose that we multiply together the two symbolic expressions
1 Formula (9) [Cauchy 1821, p. 80, Cauchy 1897, p. 79] is Cramer's rule. Equations (4) [Cauchy 1821, p. 77, Cauchy 1897, p. 76] comprise a system of n linear equations in n unknowns.
√ cos a + −1 sin a and √ cos b + −1 sin b, √ by applying the known rules of algebraic multiplication as if −1 were a real quantity the square of which is equal to −1. The resulting √ product is composed of two parts, one entirely real and the other having a√factor of −1. The real part gives the value of cos(a + b) while the coefficient of −1 gives the value of sin(a + b). To establish this remark, we write the formula ( √ cos(a + b) + −1 sin(a + b) (2) √ √ = cos a + −1 sin a cos b + −1 sin b . The three expressions that make up the preceding equation, namely √ cos a + −1 sin a, √ cos b + −1 sin b and √ cos(a + b) + −1 sin(a + b), are three symbolic expressions that cannot be interpreted according to the generally established conventions, and they do not represent anything real. For this reason, they are called imaginary expressions. Equation (2) itself, taken literally, is inexact and it does not make sense. To get exact results, first we must expand its right-hand side by algebraic multiplication, and this reduces the expression to ( √ cos(a + b) + −1 sin(a + b) (3) √ = cos a cos b − sin a sin b + −1 (sin a cos b + sin b cos a) . Secondly, we must equate the real part [155] of the left-hand side of √ equation (3) with the real part of the right-hand √ side, and then the coefficient of −1 on the left-hand side with the coefficient of −1 on the right. Thus we are brought back to equations (1), both of which we ought to consider as implicitly contained in formula (2). In general, we call an imaginary expression any symbolic expression of the form √ α + β −1, where α and β denote real quantities. We say that two imaginary expressions √ √ α + β −1 and γ + δ −1 ◦ are equal to each other when there is equality between corresponding √ parts: 1 be2 ◦ tween the real parts α and γ, and 2 between the coefficients of −1, namely β and δ . We indicate equality between two imaginary expressions in the same way 2 This is incorrectly written as “real parts, α and β ” in [Cauchy 1897, p. 155]. It was correct in [Cauchy 1821, p. 176]. (tr.)
that we indicate it between two real quantities, by the symbol =, and this results in what we call an imaginary equation. Given this, any imaginary equation is just the symbolic representation of two equations involving real quantities. For example, the symbolic equation √ √ α + β −1 = γ + δ −1 is just equivalent to the two real equations α =γ
and β = δ .
In the imaginary expression √ α + β −1, √ √ when the coefficient β of −1 vanishes, the term β −1 is understood to be zero, and the expression itself reduces to the real quantity α. By virtue of this convention, imaginary expressions include the real quantities as special cases. Imaginary expressions may be subjected to the same operations of Algebra as real quantities. In particular, if we perform addition, subtraction or multiplication [156] of two imaginary expressions, they operate according to the established rules for real quantities, and we obtain as a result a new imaginary expression that we call the sum, the difference or the product of the given expressions, and the ordinary notations are used to indicate that sum, difference or product. For example, if we are given two imaginary expressions, √ √ α + β −1 and γ + δ −1, we find (4) (5) (6)
(4) (α + β √−1) + (γ + δ √−1) = α + γ + (β + δ)√−1,
(5) (α + β √−1) − (γ + δ √−1) = α − γ + (β − δ)√−1,
(6) (α + β √−1) × (γ + δ √−1) = αγ − βδ + (αδ + βγ)√−1.
It is worth remarking that the product of two or more imaginary expressions, like that of two or more real binomials, remains the same regardless of the order in which we multiply the different factors.3 To divide a first imaginary expression by a second is to find an imaginary expression which, when multiplied by the second, reproduces the first. The result of this operation is the quotient of the two given expressions. To indicate this, we use the ordinary symbol for division. So, for example, √ α + β −1 √ γ + δ −1 3
Although [Servois 1814] introduced the word “commutative” to describe this property, Cauchy has not adopted it here.
represents the quotient of the two imaginary expressions √ √ α + β −1 and γ + δ −1. To raise an imaginary expression to the power m (where m denotes an integer number) is to form the √ product of m factors equal to that expression. We write the mth power of α + β −1 with the notation √ m α + β −1 . √ [157] To extract the nth root of the imaginary expression α + β −1, or in other words to raise this expression to the power n1 (where n denotes any integer number), √ is to form a new imaginary expression whose nth power reproduces α +β −1. This problem √ has several solutions (see § IV), and as a result, the imaginary expression α + β −1 has several roots of degree n. When we do not wish to distinguish any one of these roots, we use the notation q q √ n α + β −1, or the following, √ n1 . α + β −1 √ In the particular case where β vanishes, α + β −1 reduces to a real quantity α, and among the values of the expression p p 1 α = ((a)) n we may find one or two real roots, as we will see below. In addition to the integer powers and the corresponding roots of imaginary expressions, we must often consider what we call their fractional and negative powers. On this subject, we ought to make the following √ remarks. To raise the imaginary expression α + β −1 to a fractional power mn , supposing that the fraction mn is reduced to its lowest terms, we must: 1◦ extract the nth root of the given expression; and 2◦ raise this root to the integer power m. As this problem can be solved in several ways (see below, § IV), we denote indistinctly any one of the powers mn by the notation √ mn α + β −1 . [158] In the particular case where β is zero,√one or two of these powers can be real. To raise the imaginary expression α + β −1 to a negative power, −m or − 1n or m − n is to divide 1 by the power m or 1n or mn of the same expression. The problem has a unique solution in the first case, and several solutions in the two others. We denote the power −m with the simple notation
(α + β √−1)^(−m),
while the two notations
((α + β √−1))^(−1/n) and ((α + β √−1))^(−m/n)
represent, in the first case, any of the powers − n1 , and in the second case, any of the powers − mn . We say that two imaginary expressions are conjugate4√to each other when the two expressions differ only in the signs of the coefficient of −1. The sum of two such expressions is always real, as is their product. Indeed, the two conjugate imaginary expressions √ √ α + β −1 and α − β −1 have as their sum 2α and as their product α 2 + β 2 . The last part of this observation brings us to a theorem about numbers, which is stated here: Theorem I.5 — If we multiply together two integer numbers that are each the sum of two squares, then the product is always a sum of two squares. Proof. — Let the integer numbers be α2 + β 2
and α′^2 + β′^2,
[159] where α^2, β^2, α′^2 and β′^2 denote perfect squares. We evidently have the two equations
(α + β √−1)(α′ + β′ √−1) = αα′ − ββ′ + (αβ′ + α′β)√−1
and
(α − β √−1)(α′ − β′ √−1) = αα′ − ββ′ − (αβ′ + α′β)√−1,
and, by multiplying these term by term, we obtain the following
(7) (α^2 + β^2)(α′^2 + β′^2) = (αα′ − ββ′)^2 + (αβ′ + α′β)^2.
If we interchange the letters α′ and β′ in this last expression, we get
(8) (α^2 + β^2)(α′^2 + β′^2) = (αβ′ − α′β)^2 + (αα′ + ββ′)^2.
4 According to [Smith 1958, vol. 2, p. 267], this is the first use of the term “conjugate” in this sense.
5 This fact of number theory is sometimes called Lagrange's Theorem, though it is not originally due to Lagrange. See, for example, [Euler 1758].
Thus, in general we have two ways to decompose into two squares the product of two integer numbers each of which is the sum of two squares. Thus, for example, one draws from equations (7) and (8)
(2^2 + 1^2)(3^2 + 2^2) = 4^2 + 7^2 = 1^2 + 8^2.
We see from these examples that the use of imaginary expressions can be of great use, not only in ordinary Algebra but also in the Theory of numbers. Sometimes we represent an imaginary expression by a single letter. It is an artifice which augments the resources of Analysis and we will make use of it in what follows.
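Theorem I is easy to test with present-day complex arithmetic (a modern illustration, not Cauchy's computation): the two decompositions (7) and (8) come from the products (α + β√−1)(α′ + β′√−1) and (α + β√−1)(α′ − β′√−1), here with α = 2, β = 1, α′ = 3, β′ = 2 as in the text.

z, w = complex(2, 1), complex(3, 2)
p, q = z * w, z * w.conjugate()
print((2**2 + 1**2) * (3**2 + 2**2))   # 65
print(p.real**2 + p.imag**2)           # 4^2 + 7^2 = 65 (up to rounding)
print(q.real**2 + q.imag**2)           # 8^2 + 1^2 = 65 (up to rounding)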
7.2 On the moduli of imaginary expressions and on reduced expressions.
A remarkable property of any imaginary expression α + β √−1 is that it can be put into the form
ρ (cos θ + √−1 sin θ),
[160] where ρ denotes a positive quantity and θ a real arc. Indeed, if we write the symbolic equation
(1) α + β √−1 = ρ (cos θ + √−1 sin θ),
or what amounts to the same thing, the two real equations
(2) α = ρ cos θ and β = ρ sin θ,
then we get
(3) α^2 + β^2 = ρ^2 (cos^2 θ + sin^2 θ) = ρ^2 and ρ = √(α^2 + β^2).
Having thus determined the value of the number ρ, all that remains to verify completely equations (2) is to find an arc θ such that its cosine and sine are, respectively,
(4) cos θ = α/√(α^2 + β^2) and sin θ = β/√(α^2 + β^2).
This last problem is always solvable because each of the quantities
α/√(α^2 + β^2) and β/√(α^2 + β^2)
has a numerical value less than 1 and the sum of their squares is equal to 1.
Moreover, it has infinitely many different solutions because, having calculated one suitable value of the arc θ , we can increase or decrease this arc by any number of circumferences without changing the value √ of the sine or the cosine. When the imaginary expression α + β −1 is put into the form √ ρ cos θ + −1 sin θ , the positive quantity ρ is called the modulus6 of this imaginary expression. What remains after the suppression of the modulus, that is [161] to say the factor √ cos θ + −1 sin θ , is called the reduced expression. Because we take the quantities α and β to be known, we get only one unique value for the modulus ρ as determined by equation (3), and as a result, the modulus remains the same for any two imaginary expressions that are equal. Thus we can state the following theorem: Theorem I. — The equality of two imaginary expressions always entails the equality of their moduli, and as a consequence, the equality of their reduced expressions. If we compare two conjugate imaginary expressions to each other, we find that their moduli are equal. The square of their common modulus is simply their √product. When the second term β vanishes in the imaginary expression α + β −1, this expression reduces to a real quantity α. Under the same hypothesis, we get from equations (3) and (4): 1◦ when α is positive, that √ ρ = α 2, cos θ = 1
and
sin θ = 0,
and so θ = ±2kπ, where k denotes any integer number; and 2◦ when α is negative, that √ ρ = α 2, cos θ = −1
and
sin θ = 0,
and so θ = ±(2k + 1)π. 6
According to [Smith 1958, vol. 2, p. 267], this is the first use of the term “modulus” in this sense. However, [Grattan-Guinness 1990, vol. 2, p. 170] cites an earlier use in [Cauchy 1817].
124
7 On imaginary expressions and their moduli.
√ Thus the modulus of a real quantity α is simply its numerical value α 2 and the reduced expression that corresponds to such a quantity is always +1 or −1, namely √ +1 = cos (±2kπ) + −1 sin (±2kπ) , [162] whenever α is a positive quantity, and √ −1 = cos ± 2k + 1 π + −1 sin ± 2k + 1 π , whenever α is a negative quantity. Any imaginary expression that has modulus zero itself reduces to zero because its two terms vanish. Conversely, because the cosine and the sine of an arc are never zero at the same time, it follows that an imaginary expression cannot be reduced to zero unless its modulus vanishes. Any imaginary expression which has 1 as its modulus is necessarily a reduced expression. Thus, for example, √ √ cos a + −1 sin a, cos a − −1 sin a, √ √ − cos a − −1 sin a and − cos a + −1 sin a are four reduced expressions forming two conjugate pairs. In fact, to get these four expressions from the formula √ cos θ + −1 sin θ , it is enough to take successively θ = ±2kπ + a,
θ = ±2kπ − a,
θ = ± (2k + 1) π + a and θ = ± (2k + 1) π − a, where k denotes any integer number. Calculations involving imaginary expressions can be simplified by considering reduced expressions. It is important to take note of their properties. These properties are contained in the theorems that I am about to state. Theorem II. — To multiply together two reduced expressions √ √ cos θ + −1 sin θ and cos θ 0 + −1 sin θ 0 , it suffices to add the corresponding arcs θ and θ 0 . [163] Proof. — Indeed, we have √ √ cos θ + −1 sin θ √ cos θ 0 + −1 sin θ 0 (5) = cos (θ + θ 0 ) + −1 sin (θ + θ 0 ) .
Corollary. — If we make θ 0 = −θ in the previous theorem, we find, as we might expect, √ √ (6) cos θ + −1 sin θ cos θ − −1 sin θ = 1. Theorem III. — To multiply together several reduced expressions, √ √ √ cos θ + −1 sin θ , cos θ 0 + −1 sin θ 0 , cos θ 00 + −1 sin θ 00 , . . . , it suffices to add the corresponding arcs, θ , θ 0 , θ 00 , . . .. Proof. — Indeed, we have successively,7 √ √ θ 0 + −1 sin θ 0 ) (cos θ + −1 sin θ )(cos √ = cos(θ + θ 0 ) + −1 sin(θ + θ 0 ), √ √ √ (cos θ + −1 sin θ )(cos θ 0 + −1 sinθ 0 )(cos θ 00√ + −1 sin θ 00 ) √ = cos(θ + θ 0 ) + −1√sin(θ + θ 0 ) (cos θ 00 + −1 sin θ 00 ) = cos(θ + θ 0 + θ 00 ) + −1 sin(θ + θ 0 + θ 00 ), ............................................., and, continuing in the same way, we find in general that whatever the number of arcs, θ , θ 0 , θ 00 , . . . may be, √ √ √ cos θ + −1 sin θ cos θ 0 + √ −1 sin θ 0 cos θ 00 + −1 sin θ 00 . . . (7) = cos (θ + θ 0 + θ 00 + . . .) + −1 sin (θ + θ 0 + θ 00 + . . .) . Corollary. — If we expand the left-hand side of equation (7) by ordinary multi8 plication, √ the expansion will consist of two parts, one real and the other having a factor −1. Given this, the real part will take on the value cos θ + θ 0 + θ 00 + . . . , [164] and the coefficient of
√ −1 in the second part will have the value sin θ + θ 0 + θ 00 + . . . .
For example, suppose that we are considering only three arcs, θ , θ 0 and θ 00 . Then equation (7) becomes √ √ √ θ 0 + −1 sin θ 0 cos θ 00 + −1 sin θ 00 cos θ + −1 sin θ cos√ = cos (θ + θ 0 + θ 00 ) + −1 sin (θ + θ 0 + θ 00 )
7 In the fourth line of this calculation, the term cos(θ + θ′) was missing the right parenthesis in [Cauchy 1897, p. 163]. [Cauchy 1821, p. 187] was parenthesized properly. (tr.)
8 Cauchy calls this multiplication immédiate. It is not clear how this is different from what he calls “algebraic multiplication.”
and, after expanding the left-hand side of this last equation by algebraic multiplication, we conclude that cos θ + θ 0 + θ 00 = cos θ cos θ 0 cos θ 00 − cos θ sin θ 0 sin θ 00 − sin θ cos θ 0 sin θ 00 − sin θ sin θ 0 cos θ 00 and9 sin θ + θ 0 + θ 00 =
sin θ cos θ 0 cos θ 00 + cos θ sin θ 0 cos θ 00 + cos θ cos θ 0 sin θ 00 − sin θ sin θ 0 sin θ 00 .
Theorem IV. — To divide the reduced expression √ cos θ + −1 sin θ by the following
√ cos θ 0 + −1 sin θ 0 ,
it suffices to subtract the arc θ 0 that corresponds to the second expression from the arc θ corresponding to the first. Proof. — Let x be the quotient we are seeking, so that √ cos θ + −1 sin θ √ x= . cos θ 0 + −1 sin θ 0 This quotient ought to√be a new imaginary expression√chosen so that when it is multiplied by cos θ 0 + −1 sin θ 0 it reproduces cos θ + −1 sin θ . In other words, x ought to satisfy the equation √ √ cos θ 0 + −1 sin θ 0 x = cos θ + −1 sin θ . To solve this equation for x, it suffices to multiply both sides by √ cos θ 0 − −1 sin θ 0 . [165] In this way we reduce the coefficient of x to 1 (see theorem II, corollary I), and we find that √ √ x = cos θ + −1 sin θ cos θ 0 − −1 sin θ 0 h √ √ i = cos θ + −1 sin θ cos −θ 0 + −1 sin −θ 0 √ = cos θ − θ 0 + −1 sin θ − θ 0 .
9
The minus sign preceding the final term in the next equation is a plus sign in [Cauchy 1897, p. 164]. This error was not present in [Cauchy 1821, p. 188]. (tr.)
Thus, we definitely have √ √ cos θ + −1 sin θ 0 0 √ (8) = cos θ − θ + −1 sin θ − θ . cos θ 0 + −1 sin θ 0 Corollary. — If we take θ = 0 in equation (8), we have (9)
1/(cos θ′ + √−1 sin θ′) = cos θ′ − √−1 sin θ′.
Theorem V.10 — To raise the imaginary expression cos θ + √−1 sin θ to the power m (where m denotes any integer number), it suffices to multiply the arc θ in this expression by the number m.

Proof. — Indeed, because the arcs θ, θ′, θ″, . . . could be arbitrary in formula (7), if we suppose that they are all equal to θ, and that there are m of them, we find

(10) (cos θ + √−1 sin θ)^m = cos mθ + √−1 sin mθ.

Corollary. — If in equation (10) we take successively θ = z and then θ = −z, we get the following two equations:

(11) (cos z + √−1 sin z)^m = cos mz + √−1 sin mz and (cos z − √−1 sin z)^m = cos mz − √−1 sin mz.

Because they are always the product of m equal factors, the left-hand sides of each of these last equations can be expanded by ordinary multiplication of these factors, or what amounts to the same thing, by the [166] formula of Newton.11 After expanding, if we equate corresponding parts in each equation: 1° the real parts and 2° the coefficients of √−1, we conclude

(12) cos mz = cos^m z − (m(m−1)/(1·2)) cos^(m−2) z sin² z + (m(m−1)(m−2)(m−3)/(1·2·3·4)) cos^(m−4) z sin⁴ z − . . . and
sin mz = (m/1) cos^(m−1) z sin z − (m(m−1)(m−2)/(1·2·3)) cos^(m−3) z sin³ z + . . . .
10
This theorem is known as de Moivre’s Theorem, although it seems de Moivre did not give the result in this form; see [Kline 1990, vol. 2, p. 409]. Euler treated this material in more detail in [Euler 1748, vol. 1, ch. VIII, § 132 ff]. 11 That is, Newton’s binomial formula.
Supposing m = 2, for example, we find cos 2z = cos2 z − sin2 z and sin 2z = 2 sin z cos z. Supposing m = 3, cos 3z = cos3 z − 3 cos z sin2 z and sin 3z = 3 cos2 z sin z − sin3 z, and so on. Theorem VI. — To raise the imaginary expression √ cos θ + −1 sin θ to the power −m, (where m denotes any integer number), it suffices to multiply the arc θ in this expression by the degree −m. Proof. — Indeed, from the definition we have given of negative powers (see § I), we get −m √ = cos θ + −1 sin θ
1/(cos θ + √−1 sin θ)^m = 1/(cos mθ + √−1 sin mθ).

Consequently, using formula (9) we get

(13) (cos θ + √−1 sin θ)^(−m) = cos mθ − √−1 sin mθ,

[167] or what amounts to the same thing,

(14) (cos θ + √−1 sin θ)^(−m) = cos(−mθ) + √−1 sin(−mθ).
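Theorems V and VI together are what is now called de Moivre's formula. The short Python check below is an editorial illustration, not part of the original text; the particular values of θ and m are arbitrary.

import math

theta, m = 0.7, 5
u = complex(math.cos(theta), math.sin(theta))   # cos(theta) + sqrt(-1) sin(theta)

# Theorem V, equation (10): u**m = cos(m*theta) + sqrt(-1) sin(m*theta)
assert abs(u**m - complex(math.cos(m*theta), math.sin(m*theta))) < 1e-12

# Theorem VI, equation (13): u**(-m) = cos(m*theta) - sqrt(-1) sin(m*theta)
assert abs(u**(-m) - complex(math.cos(m*theta), -math.sin(m*theta))) < 1e-12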
After establishing the principal properties of reduced expressions, as we have just done, it becomes easy to multiply or divide two or more imaginary expressions if we know their moduli, as well as to raise any imaginary expression to a power m or −m, (where m denotes an integer number). Indeed, we can easily perform these different operations with the aid of the following theorems: Theorem VII. — To obtain the product of two or more imaginary expressions, it suffices to multiply the product of the reduced expressions to which they correspond by the product of the moduli.
Proof. — The stated theorem follows immediately from the principle that the product of several factors, real or imaginary, remains the same regardless of the order in which one multiplies them. Indeed, let √ √ ρ cos θ + −1 sin θ , ρ0 cos θ 0 + −1 sin θ 0 , √ ρ 00 cos θ 00 + −1 sin θ 00 , . . . be several imaginary expressions, where ρ, ρ 0 , ρ 00 , . . . denote their moduli. When we want to multiply these expressions together, where each expression is the product of a modulus and a reduced expression, we can, by virtue of the principle just mentioned, form one part as the product of the moduli, and the other as the product of all the reduced expressions, then multiply together these two products. In this way, we find that the final result is h i √ (15) ρρ 0 ρ 00 . . . cos θ + θ 0 + θ 00 + . . . + −1 sin θ + θ 0 + θ 00 + . . . . Corollary I. — The product of several imaginary expressions is a new imaginary expression which has as its modulus the product of the moduli of all the others. Corollary II. — Because an imaginary expression can never vanish [168] unless its modulus vanishes, and because in order to make the product of several moduli vanish, it is necessary that one of them reduces to zero, it is clear that one may draw from theorem VII the following conclusion: The product of two or more imaginary expressions cannot vanish except when one of the factors reduces to zero. Theorem VIII. — To obtain the quotient of two imaginary expressions, it suffices to multiply the quotient of their corresponding reduced expressions by the quotient of their moduli. Proof. — Suppose that it is a matter of dividing the imaginary expression √ ρ cos θ + −1 sin θ , where the modulus is ρ, by the following √ ρ 0 cos θ 0 + −1 sin θ 0 , where the modulus is ρ 0 . If we denote by x the desired quotient, then x must be a new imaginary expression satisfying the equation √ √ ρ 0 cos θ 0 + −1 sin θ 0 x = ρ cos θ + −1 sin θ . To solve this equation for the value of x, we multiply both sides by the product of the two factors
1/ρ′ and cos θ′ + √−1 sin θ′.

In this way we find, writing ρ/ρ′ in place of ρ · (1/ρ′), that

x = (ρ/ρ′) [cos(θ − θ′) + √−1 sin(θ − θ′)].
Thus, in the final analysis we have √ i √ ρ cos θ + −1 sin θ ρ h = 0 cos θ − θ 0 + −1 sin θ − θ 0 . √ (16) ρ ρ 0 cos θ 0 + −1 sin θ 0 [169] Because, by virtue of theorem IV, √ cos θ − θ 0 + −1 sin θ − θ 0 is precisely the quotient of the two reduced expressions √ √ cos θ + −1 sin θ and cos θ 0 + −1 sin θ 0 , it is clear that, having established formula (16), we ought to consider theorem VIII as being proved. Corollary. — If we take θ = 0 in equation (16),12 we have (17)
1/(ρ′ (cos θ′ + √−1 sin θ′)) = (1/ρ′) (cos θ′ − √−1 sin θ′).
Theorem IX. — To obtain the mth power of an imaginary expression (where m denotes any integer number), it suffices to multiply the mth power of the corresponding reduced expression by the mth power of the modulus. Proof. — Indeed, if in theorem VII we take the imaginary expressions √ ρ cos θ + −1 sin θ , √ ρ 0 cos θ 0 + −1 sin θ 0 , √ ρ 00 cos θ 00 + −1 sin θ 00 , ........................... all to be equal to each other and to be m in number, their product will be equivalent to the mth power of the first one, that is to say, equal to im h √ ρ cos θ + −1 sin θ . Under this hypothesis expression (15) becomes 12
Apparently Cauchy means to take ρ = 1 as well as θ = 0.
ρ^m (cos mθ + √−1 sin mθ).
Ultimately, we have h im √ √ (18) ρ cos θ + −1 sin θ = ρ m cos mθ + −1 sin mθ . [170] The reduced expression √ cos mθ + −1 sin mθ is equal (by virtue of theorem V) to m √ cos θ + −1 sin θ . Thus, having established formula (18), it follows that we ought to consider theorem IX to be proved. Theorem X. — To raise an imaginary expression to the power −m (where m denotes an integer number), it suffices to form the same powers of the modulus and of the reduced expression, then to multiply the two parts together. Proof. — Suppose that it is a matter of raising the following imaginary expression to the power −m √ ρ cos θ + −1 sin θ , where the modulus is ρ. By virtue of the definition of negative powers, we have h i−m √ 1 m ρ cos θ + −1 sin θ = √ ρ cos θ + −1 sin θ 1 . √ = ρ cos mθ + −1 sin mθ Consequently, making use of formula (17), we find h i−m √ √ 1 ρ cos θ + −1 sin θ = m cos mθ − −1 sin mθ ρ or what amounts to the same thing, (19)
h i−m √ √ ρ cos θ + −1 sin θ = ρ −m cos mθ − −1 sin mθ .
This last formula, together with equation (13), gives the complete proof of theorem X.
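In modern language, Theorems VII–X say that the modulus of a product, quotient or integer power is the product, quotient or power of the moduli. The following Python lines are an editorial numerical check with test values of our own choosing, not part of the translated text.

z1 = complex(-3.0, 4.0)
z2 = complex(1.0, 2.0)

assert abs(abs(z1 * z2) - abs(z1) * abs(z2)) < 1e-12    # Theorem VII, Corollary I
assert abs(abs(z1 / z2) - abs(z1) / abs(z2)) < 1e-12    # Theorem VIII
assert abs(abs(z1**3) - abs(z1)**3) < 1e-12             # Theorem IX
assert abs(abs(z1**-3) - abs(z1)**-3) < 1e-12           # Theorem X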
7.3 On the real and imaginary roots of the two quantities +1 and −1 and on their fractional powers. [171] Suppose that m and n denote two relatively prime integer numbers. If we use the notations adopted in § I, the nth roots of unity, or what amounts to the same thing, the powers of degree 1n , are the various values of the expression p p n
ⁿ√1 = ((1))^(1/n),

and likewise, the fractional positive or negative powers of unity of degree m/n or −m/n are the various values of

((1))^(m/n) or ((1))^(−m/n).

Thus we conclude that to determine these roots and powers it suffices to solve the following three problems, one after another.

Problem I. — To find the various real and imaginary values of the expression ((1))^(1/n).
Solution. — Let x be one of these values, and in order to present it in a general form that includes the real quantities and the imaginary quantities at the same time, suppose that

x = r (cos t + √−1 sin t),

where r denotes a positive quantity and t denotes a real arc. Because of the definition of the expression ((1))^(1/n), we have that

(1) x^n = 1,

or what amounts to the same thing,

r^n (cos nt + √−1 sin nt) = 1.

[172] We can draw from this last equation (with the aid of theorem I, § II)

r^n = 1 and cos nt + √−1 sin nt = 1,

and so,

r = 1, cos nt = 1, sin nt = 0, nt = ±2kπ and t = ±2kπ/n,
where k represents any integer number. The quantities r and t being thereby determined, the various values that satisfy equation (1) are evidently contained in the formula

(2) x = cos(2kπ/n) ± √−1 sin(2kπ/n).

In other words, the various values of ((1))^(1/n) are given by the equation

(3) ((1))^(1/n) = cos(2kπ/n) ± √−1 sin(2kπ/n).
Now let h be the integer number closest to the ratio k/n. The difference between the numbers h and k/n will be at most equal to 1/2, so that we have

k/n = h ± k′/n,

where k′/n denotes a fraction equal to or less than 1/2, and as a consequence k′ is an integer number less than, or at most equal to, n/2. From this we conclude

2kπ/n = 2hπ ± 2k′π/n

and

cos(2kπ/n) ± √−1 sin(2kπ/n) = cos(2k′π/n) ± √−1 sin(2k′π/n).

Consequently, all the values of ((1))^(1/n) are contained in the formula

cos(2k′π/n) ± √−1 sin(2k′π/n),
[173] if we suppose that k′ is contained between the limits 0 and n/2, or what amounts to the same thing, if we suppose that k is contained between the same limits in formula (3).

Corollary I. — When n is even, the various values that the integer number k can assume without going outside the limits 0 and n/2 are, respectively,

0, 1, 2, . . ., (n − 2)/2 and n/2.
In general, for each of these values of k, formula (3) gives two conjugate imaginary 1 values of the expression ((1)) n , that is to say, two conjugate imaginary roots of unity of degree n. However, for k = 0 we find but a single real root, +1, and for k = n2 another real root, −1. In summary, when n is even, the expression 1
((1)) n admits two real values, namely
+1 and −1,
along with n − 2 imaginary values, conjugate in pairs, namely √ √ 2π 2π cos 2π cos 2π n + −1 sin n , n − −1 sin n , √ √ 4π 4π cos 4π cos 4π (4) n + −1 sin n , n − −1 sin n , ....................., ....................., √ √ (n−2)π (n−2)π − −1 sin (n−2)π . cos n + −1 sin n , cos (n−2)π n n The total number of these values, real and imaginary, is equal to n. Suppose, for example, that n = 2. We find that there exist two values of the expression 1 ((1)) 2 , [174] or what amounts to the same thing, two values of x that satisfy the equation x2 = 1, and that these values, both real, are, respectively, +1
and −1.
Now suppose that n = 4. We find that there are four values of the expression 1
((1)) 4 , or what amounts to the same thing, four values of x that satisfy the equation x4 = 1. Among these four values, two of them are real, namely +1
and −1. The other two are imaginary and are, respectively, equal to

cos(π/2) + √−1 sin(π/2) = +√−1

and to

cos(π/2) − √−1 sin(π/2) = −√−1.

Corollary II. — When n is odd, the various values that the integer number k can assume without going outside the limits 0 and n/2 are, respectively,

0, 1, 2, . . ., (n − 1)/2.
In general, for each of these values of k, formula (3) gives two conjugate imagi1 nary values of the expression ((1)) n , that is to say two conjugate imaginary roots of unity of degree n. However, for k = 0 we find but a single real root, namely +1. In summary, when n is odd, the expression 1
((1)) n [175] admits the single real value +1,
along with n − 1 imaginary values, conjugate in pairs, namely √ √ 2π 2π 2π cos 2π n − −1 sin n , cos n + −1 sin n , √ √ 4π 4π cos 4π cos 4π (5) n + −1 sin n , n − −1 sin n , ....................., ....................., √ √ (n−1)π cos (n−1)π + −1 sin , cos (n−1)π − −1 sin (n−1)π . n n n n The total number of these values, real and imaginary, is equal to n. Suppose, for example, that n = 3. We find that there exist three values of the expression 1 ((1)) 3 , or what amounts to the same thing, three values of x that satisfy the equation x3 = 1, and these values, of which one is real, are, respectively,
+1, cos(2π/3) + √−1 sin(2π/3) and cos(2π/3) − √−1 sin(2π/3).

Moreover, as we know, the side of the hexagon is equal to its radius and the supplement of the arc subtended by this side has as its measure 2π/3, so we can easily obtain the equations

cos(2π/3) = −1/2 and sin(2π/3) = +3^(1/2)/2.

By virtue of these equations, the imaginary values of the expression ((1))^(1/3) reduce to

−1/2 + (3^(1/2)/2)√−1 and −1/2 − (3^(1/2)/2)√−1.

[176] Corollary III. — If n is any integer number, the number of values, real and imaginary, of the expression ((1))^(1/n), or what amounts to the same thing, the number of values of x that satisfy the equation x^n = 1, is always equal to n.
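Read in modern terms, formula (3) enumerates the n complex nth roots of unity. The Python sketch below is an editorial illustration of Corollary III for one value of n (our choice), not part of the translated text; the enumeration lets k run from 0 to n − 1, which yields the same n values as Cauchy's ± form.

import math

n = 6
roots = {complex(round(math.cos(2 * k * math.pi / n), 12),
                 round(math.sin(2 * k * math.pi / n), 12)) for k in range(n)}
assert len(roots) == n          # exactly n distinct values of ((1))^(1/n)
for x in roots:
    assert abs(x**n - 1) < 1e-9  # each one satisfies x^n = 1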
Problem II. — To find the various values, real and imaginary, of the expression ((1))^(m/n).
Solution. — The numbers m and n are assumed to be relatively prime. Because m of the definition of the expression ((1)) n , we have that h i m 1 m ((1)) n = ((1)) n . 1
Substituting the general value for ((1))^(1/n) found in equation (3), we get

((1))^(m/n) = (cos(2kπ/n) ± √−1 sin(2kπ/n))^m,

and so

(6) ((1))^(m/n) = cos(m·2kπ/n) ± √−1 sin(m·2kπ/n).
To deduce all of the values of ((1))^(m/n) from this last formula, one needs only to give k the integer values between 0 and n/2 successively. Let k′ and k″ be two such values, assumed to be unequal. I say that the cosines

cos(m·2k′π/n) and cos(m·2k″π/n)

are necessarily different from each other. Indeed, these cosines cannot be equal except in the case where the arcs to which they correspond are related to each other by an equation of the form

m·2k′π/n = ±2hπ ± m·2k″π/n,

[177] where h denotes an integer number. Now from this equation we get

h = m(±k′ ± k″)/n.
Thus, because m is relatively prime to n, it is necessary that ±k0 ± k00 be divisible by n, which cannot happen because the numbers k0 and k00 are unequal and each of them cannot exceed 12 n, so their sum and their differences are necessarily less than n. Thus, two different values of k contained between the limits 0 and 21 n give two different values of m · 2kπ cos . n From this remark, we easily conclude that the values, real or imaginary, of the exm pression ((1)) n given by equation (6) are the same in number as the real and imag-
inary roots of ((1))^(1/n) determined by equation (3). Moreover, because we evidently have

(cos(m·2kπ/n) ± √−1 sin(m·2kπ/n))^n = cos(m·2kπ) ± √−1 sin(m·2kπ) = 1,

it follows that every value of ((1))^(m/n) is a real or imaginary expression, the nth power of which equals 1, and thus is a value of ((1))^(1/n). These observations lead to the formula

(7) ((1))^(m/n) = ((1))^(1/n),

in which the sign = indicates only that each of the values on the left-hand side is always equal to one of the values on the right-hand side.13

13 This is a remarkable proof. The modern reader might have trouble even stating the conclusion without recourse to such notions as “set,” “subset,” “one-to-one” and “cardinality,” none of which were available to Cauchy.

Problem III. — To find the various values, real and imaginary, of the expression ((1))^(−m/n).

Solution. — From the definition of negative powers, we have that

((1))^(−m/n) = 1/((1))^(m/n).

[178] Substituting the general value for ((1))^(m/n) found in equation (6), and considering formula (9) of the preceding section, we get

(8) ((1))^(−m/n) = cos(m·2kπ/n) ∓ √−1 sin(m·2kπ/n).

It follows from this last equation that the various values of ((1))^(−m/n) are the same as those of ((1))^(m/n) and consequently are equal to those of ((1))^(1/n). Thus we have

(9) ((1))^(−m/n) = ((1))^(1/n),

where the sign = ought to be interpreted as in equation (7).

Corollary. — If we make m = 1 in formula (9), it gives

(10) ((1))^(−1/n) = ((1))^(1/n).

Now suppose that we seek roots and fractional powers, not of unity, but of the quantity −1. The nth roots of this quantity, or what amounts to the same thing, its powers of degree 1/n, are the various values of the expression

ⁿ√−1 = ((−1))^(1/n),

and likewise the fractional powers of −1, positive or negative, of degree m/n or −m/n, are the various values of

((−1))^(m/n) or ((−1))^(−m/n).

As a consequence, to determine these roots and powers, it suffices to solve, one after another, the three problems that I will pose.

Problem IV. — To find the various real and imaginary powers of the expression ((−1))^(1/n).
[179] Solution. — Let

x = r (cos t + √−1 sin t)

be one of these values, where r denotes a positive quantity and t denotes a real arc. From the definition itself of the expression ((−1))^(1/n) we have

(11) x^n = −1,

or what amounts to the same thing,

r^n (cos nt + √−1 sin nt) = −1.

We conclude from this last equation (with the aid of theorem I, § II),

r^n = 1 and cos nt + √−1 sin nt = −1.

It follows that

r = 1, cos nt = −1, sin nt = 0, nt = ±(2k + 1)π and t = ±(2k + 1)π/n,

where k represents any integer number. The quantities r and t being thereby determined, the various values of x that satisfy equation (11) are evidently contained in the formula

(12) x = cos((2k + 1)π/n) ± √−1 sin((2k + 1)π/n).

In other words, the various values of ((−1))^(1/n) are given by the equation

(13) ((−1))^(1/n) = cos((2k + 1)π/n) ± √−1 sin((2k + 1)π/n).
Now let h be the integer number closest to the ratio (2k + 1)/(2n). The difference between the two numbers h and (2k + 1)/(2n) is obviously a fraction with an odd numerator, less than or at most [180] equal to 1/2. Thus it follows that we have

(2k + 1)/(2n) = h ± (2k′ + 1)/(2n),

where 2k′ + 1 denotes an odd number less than or equal to n. We then conclude that

(2k + 1)π/n = 2hπ ± (2k′ + 1)π/n and

cos((2k + 1)π/n) ± √−1 sin((2k + 1)π/n) = cos((2k′ + 1)π/n) ± √−1 sin((2k′ + 1)π/n).

Consequently, if we suppose that 2k′ + 1 is contained between the limits 0 and n, then all the values of ((−1))^(1/n) are contained in the formula

cos((2k′ + 1)π/n) ± √−1 sin((2k′ + 1)π/n),
or what amounts to the same thing, the values are contained in formula (13), if we suppose that 2k + 1 is contained between the same limits. Corollary I. — When n is even, the various values that 2k + 1 can assume without going outside the limits 0 and n are, respectively, 1,
3, 5, . . ., n − 1.
For each of these values of 2k + 1, formula (13) always gives two conjugate imagi1 nary values of the expression ((−1)) n . Consequently, in the case we are considering here, this expression does not admit any real values, but only n imaginary values, conjugate in pairs, namely √ π √ cos πn − −1 sin πn , cos n + −1 sin πn , √ √ 3π cos 3π + −1 sin 3π , cos 3π n n n − −1 sin n , (14) .................., .................., √ (n−1)π √ (n−1)π cos n + −1 sin n , cos (n−1)π − −1 sin (n−1)π . n n [181] Suppose for example that n = 2. We find that there are two values of the 1 expression ((−1)) 2 , or what amounts to the same thing, two values of x that satisfy the equation x2 = −1, and that these values, both imaginary, are, respectively,
cos(π/2) + √−1 sin(π/2) = +√−1

and

cos(π/2) − √−1 sin(π/2) = −√−1.

Now suppose that n = 4. We see that there are four values of the expression ((−1))^(1/4), or in other words, four values of x that satisfy the equation x^4 = −1, and that these four values are contained in the two formulas

cos(π/4) ± √−1 sin(π/4) and cos(3π/4) ± √−1 sin(3π/4),

or what amounts to the same thing, in the single formula

±cos(π/4) ± √−1 sin(π/4).

Moreover, because we have

cos(π/4) = sin(π/4) = 1/√2,

we finally find that

((−1))^(1/4) = ±1/2^(1/2) ± (1/2^(1/2))√−1.

Corollary II. — When n is odd, the various values that 2k + 1 can assume without going outside the limits 0 and n are, respectively,

1, 3, 5, . . ., n − 2 and n.
[182] In general, for each of these values of 2k +1, formula (13) gives two conjugate 1 imaginary values of the expression ((−1)) n , that is to say two conjugate imaginary roots of −1 of degree n. However, for 2k + 1 = n, we find but a single real root, 1 namely −1. In summary, when n is odd, the expression ((−1)) n admits only the one real value, −1, along with n − 1 imaginary roots, conjugate in pairs, namely √ π √ cos n + −1 sin πn , cos πn − −1 sin πn , √ √ 3π cos 3π + −1 sin 3π , cos 3π n n n − −1 sin n , (15) .................., .................., √ (n−2)π √ (n−2)π cos n + −1 sin n , cos (n−2)π − −1 sin (n−2)π . n n
The total number of these values, real and imaginary, is equal to n.

Suppose, for example, that n = 3. We find that there exist three values of the expression ((−1))^(1/3), or what amounts to the same thing, values of x that satisfy the equation x^3 = −1, and these values, of which one is real, are, respectively,

−1,
cos(π/3) + √−1 sin(π/3) = 1/2 + (3^(1/2)/2)√−1 and
cos(π/3) − √−1 sin(π/3) = 1/2 − (3^(1/2)/2)√−1.

Corollary III. — If n is any integer number, the number of values, real and imaginary, of the expression ((−1))^(1/n), or what amounts to the same thing, the number of values of x that satisfy the equation x^n = −1, is always equal to n.

[183] Problem V. — To find the various values, real and imaginary, of the expression ((−1))^(m/n).

Solution. — The numbers m and n are assumed to be relatively prime. Because of the definition of the expression ((−1))^(m/n), we have that

((−1))^(m/n) = [((−1))^(1/n)]^m.
Substituting the general value for ((−1))^(1/n) found in equation (13), we get

(16) ((−1))^(m/n) = cos(m(2k + 1)π/n) ± √−1 sin(m(2k + 1)π/n).

To deduce all of the values of ((−1))^(m/n) from this last formula, one needs only successively to give to 2k + 1 all the odd, integer values between 0 and n. Let 2k′ + 1 and 2k″ + 1 be two such values, assumed to be unequal. I say that the cosines

cos(m(2k′ + 1)π/n) and cos(m(2k″ + 1)π/n)

are necessarily different from each other. Indeed, these cosines cannot be equal except in the case where the arcs to which they correspond are related to each other by an equation of the form

m(2k′ + 1)π/n = ±2hπ ± m(2k″ + 1)π/n,

where h denotes an integer number. Now from this equation we get
h = m[±(2k′ + 1) ± (2k″ + 1)]/(2n).

Thus, because m is relatively prime to n, it is necessary that the integer number

[±(2k′ + 1) ± (2k″ + 1)]/2

[184] be divisible by n, which cannot happen because the numbers 2k′ + 1 and 2k″ + 1 are unequal and each of them cannot exceed n, so their half-sum, and a fortiori their half-difference, is necessarily less than n. Thus, two different values of 2k + 1 between 0 and n give two different values of

cos(m(2k + 1)π/n).

From this remark we easily conclude that the values, real or imaginary, of the expression ((−1))^(m/n) given by equation (16) are n in number, like those of ((1))^(1/n) and of ((−1))^(1/n). Moreover, because we evidently have

[cos(m(2k + 1)π/n) ± √−1 sin(m(2k + 1)π/n)]^n = cos(m(2k + 1)π) ± √−1 sin(m(2k + 1)π) = (−1)^m = ±1,

it follows that every value of ((−1))^(m/n) is a real or imaginary expression, the nth power of which equals ±1, and thus is a value of ((1))^(1/n) or of ((−1))^(1/n). This remark leads to the equation

(17) ((−1))^(m/n) = ((1))^(1/n)

every time that (−1)^m = 1, that is to say, whenever m is an even number, and it leads to

(18) ((−1))^(m/n) = ((−1))^(1/n)

whenever (−1)^m = −1, that is to say, whenever m is an odd number. Let us add that we could combine equations (17) and (18) into a single formula by writing

(19) ((−1))^(m/n) = (((−1)^m))^(1/n).
[185] Problem VI. — To find the various values, real and imaginary, of the expression ((−1))^(−m/n).

Solution. — From the definition of negative powers, we have that

((−1))^(−m/n) = 1/((−1))^(m/n).

Substituting the general value for ((−1))^(m/n) found in equation (16), and considering formula (9) of the preceding section, we get

(20) ((−1))^(−m/n) = cos(m(2k + 1)π/n) ∓ √−1 sin(m(2k + 1)π/n).

It follows from this last equation that the various values of ((−1))^(−m/n) are the same as those of ((−1))^(m/n). As a consequence we get

(21) ((−1))^(−m/n) = ((1))^(1/n) if m is even

and

(22) ((−1))^(−m/n) = ((−1))^(1/n) if m is odd.

In place of the two preceding formulas, we could content ourselves by writing instead

(23) ((−1))^(−m/n) = (((−1)^m))^(1/n).

Corollary. — If we make m = 1 in formula (23), it gives

(24) ((−1))^(−1/n) = ((−1))^(1/n).
To complete this section, we remark that equations (3), (6), (8), (13), (16) and (20), with the aid of which we have determined the values of the expressions

((1))^(1/n), ((1))^(m/n), ((1))^(−m/n), ((−1))^(1/n), ((−1))^(m/n), ((−1))^(−m/n),

[186] can be replaced by two formulas. Indeed, if we denote by a a quantity, positive or negative, but with a fractional numerical value, the value of ((1))^a determined by equation (3), (6) or (8) is evidently

(25) ((1))^a = cos 2kaπ ± √−1 sin 2kaπ,

while the value of ((−1))^a determined by equation (13), (16) or (20) is

(26) ((−1))^a = cos(2k + 1)aπ ± √−1 sin(2k + 1)aπ.

In the two preceding formulas, we may take any integer number whatsoever for k.
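In modern terms, formulas (25) and (26) say that for a fractional exponent a = m/n in lowest terms, each of ((1))^a and ((−1))^a takes exactly n distinct values as k runs over the integers. The following Python sketch is an editorial illustration of that count, not part of Cauchy's text; the function name and the test exponent 3/4 are our own choices.

import math

def values(sign, num, den):
    # sign = +1 gives ((1))^a by formula (25); sign = -1 gives ((-1))^a by formula (26),
    # with a = num/den assumed to be in lowest terms and k running over the integers.
    a = num / den
    vals = set()
    for k in range(2 * den):   # larger k only repeat values already obtained
        arc = (2 * k if sign == 1 else 2 * k + 1) * a * math.pi
        vals.add((round(math.cos(arc), 9), round(math.sin(arc), 9)))
    return vals

assert len(values(+1, 3, 4)) == 4   # ((1))^(3/4) has 4 distinct values
assert len(values(-1, 3, 4)) == 4   # ((-1))^(3/4) has 4 distinct values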
7.4 On the roots of imaginary expressions, and on their fractional and irrational powers. Let
√ α + β −1 be any imaginary expression. We can always find (see § II) a positive value of ρ and infinitely many real values of θ that satisfy the equation √ √ α + β −1 = ρ cos θ + −1 sin θ . (1) Given this, imagine that we denote two relatively prime integer numbers by m √ and n. If we use the notations adopted in § I, the nth roots of the expression α + β −1, or what amounts to the same thing, its powers of degree 1n , are the various values of 14 q q √ √ 1n n α + β −1 = α + β −1 √ and likewise, the fractional powers of α + β −1, positive or negative, of degree mn or − mn , are the various values of √ mn α + β −1
or
√ − mn α + β −1 .
[187] As a consequence, to determine these roots and these powers, it suffices to solve, one after another, the following three problems: Problem I. — To find the various values of the expression √ n1 α + β −1 . Solution. — Let
√ x = r cost + −1 sint
be one of these values, where r denotes a positive quantity and t a real arc. From the √ 1 definition itself of the expression α + β −1 n , we have √ √ (2) xn = α + β −1 = ρ cos θ + −1 sin θ , or what amounts to the same thing, √ √ rn cos nt + −1 sin nt = ρ cos θ + −1 sin θ . With the aid of theorem I, § II, we conclude from this last equation rn = ρ and √ √ cos nt + −1 sin nt = cos θ + −1 sin θ 14 The n was omitted from the radical sign in [Cauchy 1897, p. 186]. This error was not present in [Cauchy 1821, p. 218]. (tr.)
and it follows that

r = ρ^(1/n), cos nt = cos θ, sin nt = sin θ, nt = θ ± 2kπ and t = (θ ± 2kπ)/n,
where k represents any integer number. The quantities r and t are thus determined, and so the various values of x that satisfy equation (1) are evidently given by the formula 1 θ ± 2kπ θ ± 2kπ √ n x=ρ cos + −1 sin n n √ 1 θ θ 2kπ √ 2kπ =ρ n cos + −1 sin cos ± −1 sin , n n n n or what amounts to the same thing, by the following: 1 1 θ √ θ n cos ± −1 sin (3) x=ρ ((1)) n . n n √ 1 1 [188] In other words, the expression α + β −1 n , as well as ((1)) n , gives n different values, determined by the equation √ 1n 1 1 θ √ θ (4) = ρ n cos ± −1 sin ((1)) n . α + β −1 n n Corollary I. — Suppose that n = 2. We find that there exist two values of the expression √ 21 , α + β −1 or what amounts to the same thing, two values of x that satisfy the equation √ √ x2 = α + β −1 = ρ cos θ + −1 sin θ , and that these two values are contained in the formula 1 θ √ θ ±ρ 2 cos + −1 sin . 2 2 Corollary II. — Suppose now that n = 3. We find that there exist three values of the expression √ 31 α + β −1 , or what amounts to the same thing, three values of x that satisfy the equation
√ √ x3 = α + β −1 = ρ cos θ + −1 sin θ ,
and that these three15 values are, respectively, √ 1 ρ 3 cos θ3 + −1 sin θ3 , √ √ 1 2π ρ 3 cos θ3 + −1 sin θ3 cos 2π 3 + −1 sin 3 √ 1 θ +2π = ρ 2 cos θ +2π and 3 + −1 sin 3 √ √ 1 2π ρ 3 cos θ3 + −1 sin θ3 cos 2π 3 − −1 sin 3 √ 1 θ −2π = ρ 3 cos θ −2π . 3 + −1 sin 3 [189] Corollary III. — Suppose finally that n = 4. We find that there are four values of the expression √ 41 α + β −1 , or what amounts to the same thing, four values of x that satisfy the equation √ √ x4 = α + β −1 = ρ cos θ + −1 sin θ , and that these four values are contained in the two formulas 1 ±ρ 4 cos θ4 + sin θ4 and 1 ±ρ 4 sin θ4 − cos θ4 . Problem II. — To find the various values of the expression √ mn α + β −1 . Solution. — The numbers m and n are assumed to be relatively prime. Because √ m of the definition itself of the expression α + β −1 n , we have √ mn √ 1n m = α + β −1 . α + β −1 √ 1 α + β −1 n found in equation (4), we get √ mn m m mθ √ mθ α + β −1 = ρ n cos + −1 sin ((1)) n . n n
Substituting the general value for (5)
15
[Cauchy 1897, p. 188] read “two values” instead of “three values” here. Also √ in [Cauchy 1897, p. 188], the first of these three values, displayed on the following line, had −1 sin θ3 instead of √ −1 sin θ3 . Neither error was present in [Cauchy 1821, p. 220]. (tr.)
Corollary I. — If in equation (5) we substitute for ((1)) formula (6) (§ III), we obtain the following: (6)
m n
its value given by
h i √ mn √ m m(θ ±2kπ) α + β −1 −1 sin = ρ n cos m(θ ±2kπ) + . n n Problem III. — To find the various values of the expression √ − mn . α + β −1 [190] Solution. — From the definition itself of negative powers, we have √ − mn α + β −1 =
1 √ m . α + β −1 n
√ m Substituting the value for α + β −1 n found in equation (6), and considering formula (17) of § II, we get √ − mn α + β −1 h i √ m m(θ ±2kπ) − −1 sin = ρ − n cos m(θ ±2kπ) n n √ √ m mθ m·2kπ mθ − n , =ρ cos n − −1 sin n cos n ∓ −1 sin m·2kπ n or in other words, (7)
√ − mn m mθ √ mθ − mn α + β −1 =ρ cos − −1 sin ((1))− n . n n
Corollary I. — If we make m = 1, then equation (7) gives √ − n1 1 θ √ θ − 1n ((1))− n . (8) α + β −1 =ρ cos − −1 sin n n Having determined, as we have just done, the various values of the four expressions √ m √ 1 α + β −1 n , α + β −1 n , √ − 1 √ − mn α + β −1 n and α + β −1 , we see without trouble that equations (4), (5), (8) and (7), with the aid of which these values are determined, can be replaced by a single formula. If we let a be a quantity, positive or negative, with a numerical value that is fractional,16 the formula in question is
16
Here, as in Chapter V, Cauchy carefully treats the case where a is rational before extending his results to irrational values of a.
(9)
√ a √ α + β −1 = ρ a cos aθ + −1 sin aθ ((1))a .
[191] In the above of the imaginary p √ calculations, ρ always denotes the modulus expression α + β −1, that is to say the positive quantity α 2 + β 2 , and θ denotes one of the arcs that satisfy equation (1), or what amounts to the same thing, equations (4) of § II, namely α cos θ = √α 2 +β 2 and (10) sin θ = √ β . 2 2 α +β
By dividing these two formulas, we conclude (11)
tan θ =
β . α
Consequently, if we let ζ be the smallest arc, ignoring the sign, which has tangent, in other words, if we make
β α
as its
β , α
(12)
ζ = arctan
then we find (13)
tan θ = tan ζ .
Given this, it becomes easy to introduce the arc ζ , whose value is completely determined, in place of the arc θ in the various formulas given above. Indeed, we arrive at this through the following considerations. Because the arcs θ and ζ have the same tangent, they also have the same sine and the same cosine, ignoring the sign. Furthermore, because equation (13) can be put into the form sin ζ sin θ = , cos θ cos ζ it is clear that, in order to satisfy that equation, we must either have both (14) or else both (15)
cos θ = cos ζ
and
sin θ = sin ζ
cos θ = − cos ζ
and
sin θ = − sin ζ .
[192] Moreover, because the value of cos θ determined by the first of equations (10) is evidently of the same sign as α, whereas the arc ζ , being contained between the limits − π2 and + π2 , always has a positive cosine, it follows that equations (14) hold if α is positive, and equations (15) hold if α is negative. Now let us see how formulas (1) and (9) reduce under these two hypotheses. First, if we suppose that α is positive, equations (10) can be replaced by equations (14), and we derive infinitely many values of θ , among which we ought to distinguish the following one:
θ = ζ.
(16)
When we use this value, formulas (1) and (9) become, respectively, √ √ (17) and α + β −1 = ρ cos ζ + −1 sin ζ √ a √ α + β −1 (18) = ρ a cos aζ + −1 sin aζ ((1))a . Second, if we suppose that α is negative, then equations (10) can be replaced by equations (15), from which we derive, among other values of θ , θ = ζ + π.
(19)
Consequently, under this hypothesis, we can substitute the following for formulas (1) and (9): √ √ (20) and α + β −1 = −ρ cos ζ + −1 sin ζ √ a α + β −1 √ (21) = ρ a cos (aζ + aπ) + −1 sin (aζ + aπ) ((1))a √ √ = ρ a cos aζ + −1 sin aζ cos aπ + −1 sin aπ ((1))a . √ In particular, if we make α + β −1 = −1, that is to say α = −1 and [193] β = 0, then we find that 0 ζ = arctan = 0, −1 and formula (21) becomes √ (22) ((−1))a = cos aπ + −1 sin aπ ((1))a . As a result, under the given hypothesis, we have in general √ √ a (23) α + β −1 = ρ a cos aζ + −1 sin aζ ((−1))a . By combining formulas (17), (18), (20) and (23) with equations (25) and (26) of § III, we finally √ obtain the following conclusions. Let α + β −1 be any imaginary expression, let a be a quantity, positive or negative, with a numerical value that is fractional, and let k be an integer number chosen arbitrarily. Moreover, if we make (24)
ρ=
p
α2 + β 2
then for positive values of α we have
and ζ = arctan
β , α
(25)
√ √ −1 = ρ cos ζ + −1 sin ζ , α +β √ a √ α + β −1 = ρ a cos aζ + −1 sin aζ ((1))a , √ ((1))a = cos 2kaπ ± −1 sin 2kaπ,
and for negative values of α, √ √ α + β −1 = −ρ cos ζ + −1 sin ζ , √ a √ (26) α + β −1 = ρ a cos aζ + −1 sin aζ ((−1))a , √ ((−1))a = cos 2k + 1aπ ± −1 sin 2k + 1aπ . We ought to add that if we denote the denominator of the simplest fraction that represents the numerical value of a by n, then n is precisely the number of distinct values of each of the expressions √ a ((1))a , ((−1))a and α + β −1 , and that to deduce these same values from formulas (25) and (26), it [194] suffices to substitute successively in place of 2k and 2k + 1 all the integer numbers within the limits 0 and n. If the numerical value of a becomes irrational, then each of the reduced expressions √ cos 2kaπ ± −1 sin 2kaπ and √ cos 2k + 1aπ ± −1 sin 2k + 1aπ has an indefinite number of values corresponding to the various integer values of k. Consequently, in the calculations we could no longer use the notations √ a ((1))a , ((−1))a and α + β −1 unless we consider each of them as representing an infinity of imaginary expressions, each different from the others. To avoid this inconvenience, we will never use these notations except in the case where the numerical value of a is fractional. Among the various values of ((1))a , there is always one that is real and positive, namely +1, which we indicate by (1)a , if we are using the single parentheses, or by 1a if we leave them out entirely. If we substitute this particular value of ((1))a into the second of equations (25), we get one corresponding value of √ a , α + β −1 which analogy leads us to indicate, with the aid of simple parentheses, by the notation √ a α + β −1 .
This is what we will do from now on. As a consequence, by supposing that α is positive and that the quantities ρ and ζ are determined by equations (24), we have √ √ a (27) α + β −1 = ρ a cos aζ + −1 sin aζ . Because this last equation holds whenever the numerical value of a is integer or fractional, analogy again leads us to consider it to be true in the case where this numerical value [195] √ becomes irrational. Consequently, we agree to denote the product ρ a cos aζ + −1 sin aζ by
√ a α + β −1
in the case where α is positive, whatever real value is given to the quantity a. In other words, if we denote by ζ an arc contained between the limits − π2 and + π2 , whatever a is, we have, h ia √ √ ρ cos ζ + −1 sin ζ = ρ a cos aζ + −1 sin aζ . If we take ρ = 1 in the preceding equation, it becomes17 a √ √ (28) cos ζ + −1 sin ζ = cos aζ + −1 sin aζ . This last formula is entirely similar to equations (10) and (14) of § II, with the only difference being that it applies only for values of ζ between the limits − π2 and + π2 , while the other equations apply for any values of θ . When the quantity α becomes negative, even if we suppose that suppose that the numerical value of a√is fractional, then it is no longer clear that it is the value of the a expression α + β −1 that we may distinguish from the others by using the notation √ a α + β −1 . However, because −α is a positive quantity, it is easy to establish the formula √ √ a (29) −α − β −1 = ρ a cos aζ + −1 sin aζ for any value of a. We finish this section by making the observation that, in the case where the numerical value of a is fractional, formulas (27) and (29) reduce equations (18) and (23) to those that follow: √ a √ (30) α + β −1 = α + β −1 ((1))a and 17
In his first published proof that eiθ = cos θ + i sin θ [Euler 1748, §132–134 and §138], Euler made this leap from the rational to the irrational without a second thought. Compare this to Cauchy’s remarks on page iii of his Introduction.
(31)
√ a √ α + β −1 = −α − β −1 ((1))a ,
[196] where equation (30) holds only for positive values of the quantity α, and equation (31) holds for negative values of the same quantity.
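For positive α, formula (27) singles out what is now called the principal value of the fractional power, and it agrees with the branch used by modern software. The Python sketch below is an editorial check under that assumption (α > 0); the particular numerical values of α, β and a are our own, not part of the translated text.

import math

alpha, beta, a = 2.0, -5.0, 0.3
rho = math.hypot(alpha, beta)           # rho = sqrt(alpha^2 + beta^2), equations (24)
zeta = math.atan(beta / alpha)          # zeta lies between -pi/2 and +pi/2 since alpha > 0
principal = rho**a * complex(math.cos(a * zeta), math.sin(a * zeta))   # formula (27)
assert abs(principal - complex(alpha, beta)**a) < 1e-12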
7.5 Applications of the principles established in the preceding sections. We are going to apply the principles established in the preceding sections to the solution of three problems about sines and cosines. Problem I. — To transform sin mz and cos mz (where m denotes any integer number) into a polynomial ordered according to the ascending integer powers of sin z, or at least into a product formed by the multiplication of such a polynomial and cos z. Solution. — When we substitute the even powers of cos z for the integer powers of 1 − sin2 z in equations (12) of § II, these equations become, for even values of m, m−2 m 1 − sin2 z 2 sin2 z cos mz = 1 − sin2 z 2 − m(m−1) 1·2 m−4 2 2 sin4 z − . . . + m(m−1)(m−2)(m−3) 1 − sin z 1·2·3·4 sin mz = cos z
m 1
1 − sin2 z
m−2 2
and
sin z
− m(m−1)(m−2) 1 − sin2 z 1·2·3
m−4 2
sin3 z + . . . ,
and, for odd values of m, m−3 m−1 2 2 cos mz = cos z 1 − sin2 z 2 − m(m−1) 1 − sin z sin2 z 1·2 m−5 m(m−1)(m−2)(m−3) 4 2 2 + 1 − sin z sin z − . . . and 1·2·3·4 sin mz =
m 1
m−1 1 − sin2 z 2 sin z
m−3 − m(m−1)(m−2) 1 − sin2 z 2 sin3 z + . . . . 1·2·3 [197] If we expand the right-hand sides of the four preceding formulas, or at least the coefficients of cos z on the right-hand sides, into polynomials ordered according to the ascending integer powers of sin z, we find that for even values of m,
+ 21 sin2 z cos mz = 1 − m1 m−1 2 h i (m−1)(m−3) 4 m−1 3 3·1 + m(m−2) + + and 1·3 2·4 2 2 2·4 sin z − . . . n m(m−2) m−1 3 m 3 sin mz = cos z 1 sin z − 1·3 2 + 2 sin z h i o (m−1)(m−3) 5 m−1 5 5·3 + + sin z − . . . , + m(m−2)(m−4) 1·3·5 2·4 2 2 2·4
(1)
and for odd values of m, 2 m 1 cos mz = cos z 1 − m−1 1 2 + 2 sin z h i o m(m−2) 4 m3 3·1 + + sin z . . . and + (m−1)(m−3) 1·3 2·4 2 2 2·4 (2) 3 m−2 3 sin mz = m1 sin z − m(m−1) 1·3 2 + 2 sin z h i + m(m−1)(m−3) (m−2)(m−4) + m−2 5 + 5·3 sin5 z − . . . . 1·3·5
2·4
2
2
2·4
Equations (1) and (2) evidently contain the solution to the given question. It only remains to present them in a simpler form. To do that, it suffices to observe that the coefficient of each integer power of sin z generally contains a sum of fractions into which equation (5) of Chapter IV (§ III) permits us to substitute a unique fraction. As a consequence of this reduction, the expansions of cos mz and sin mz become, for even values of m, ( (m+2)m·m(m−2) 2 cos mz = 1 − m·m sin4 z 1·2 sin z + 1·2·3·4 (3) sin6 z + . . . − (m+4)(m+2)m·m(m−2)(m−4) 1·2·3·4·5·6 [198] and (4)
h sin mz = cos z m1 sin z − (m+2)m(m−2) sin3 z 1·2·3 i + (m+4)(m+2)m(m−2)(m−4) sin5 z − . . . , 1·2·3·4·5
and for odd values of m, h cos mz = cos z 1 − (m+1)(m−1) sin2 z 1·2 i (5) + (m+3)(m+1)(m−1)(m−3) sin4 z − . . . 1·2·3·4 and ( (6)
sin mz =
m 1
sin z − (m+1)m(m−1) sin3 z 1·2·3
+ (m+3)(m+1)m(m−1)(m−3) sin5 z − . . . . 1·2·3·4·5 Corollary I. — If in equation (3) we successively make m = 2,
m = 4,
m = 6,
...,
we obtain the following: cos 2z = 1 − 2 sin2 z, cos 4z = 1 − 8 sin2 z + 8 sin4 z, (7) cos 6z = 1 − 18 sin2 z + 48 sin4 z − 32 sin6 z, ........................................... Corollary II. — If in equation (6) we successively make m = 1,
m = 3,
m = 5,
...,
we get: sin z = sin z, sin 3z = 3 sin z − 4 sin3 z, sin 5z = 5 sin z − 20 sin3 z + 16 sin5 z, .....................................
(8)
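The entries of lists (7) and (8) can be confirmed numerically. The following short Python check is an editorial addition, not part of the original text; the value of z is arbitrary.

import math

z = 0.37
assert abs(math.cos(6*z) - (1 - 18*math.sin(z)**2 + 48*math.sin(z)**4 - 32*math.sin(z)**6)) < 1e-12
assert abs(math.sin(5*z) - (5*math.sin(z) - 20*math.sin(z)**3 + 16*math.sin(z)**5)) < 1e-12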
Problem II. — To transform sin mz and cos mz (where m denotes any integer number) into a polynomial ordered according to the ascending integer powers [199] of cos z, or at least into a product formed by the multiplication of such a polynomial and sin z. Solution. — To obtain the formulas that solve the given problem, it suffices to replace z with π2 − z in equations (3), (4), (5) and (6), and to observe moreover that for even values of m, we have mπ m cos − mz = (−1) 2 cos mz and 2 mπ m − mz = (−1) 2 +1 sin mz, sin 2 and for odd values of m, mπ
m−1 − mz = (−1) 2 sin mz and 2 mπ m−1 sin − mz = (−1) 2 sin mz. 2
cos
In this way we find that if m is an even number ( m (m+2)m·m(m−2) 2 (−1) 2 cos mz =1 − m·m cos4 z 2 cos z + 1·2·3·4 (9) cos6 z + . . . , − (m+4)(m+2)m·m(m−2)(m−4) 1·2·3·4·5·6 and (10)
h m (−1) 2 +1 sin mz = sin z m1 cos z − (m+2)m(m−2) cos3 z 1·2·3
i 5 z−... , cos + (m+4)(m+2)m(m−2)(m−4) 1·2·3·4·5
and if m is an odd number, h m−1 (−1) 2 sin mz = sin z 1 − (m+1)(m−1) cos2 z 1·2 i (11) 4 z−... , cos + (m+3)(m+1)(m−1)(m−3) 1·2·3·4 and ( (12)
(−1)
m−1 2
cos mz =
m 1
cos z − (m+1)m(m−1) cos3 z 1·2·3
+ (m+3)(m+1)m(m−1)(m−3) cos5 z − . . . . 1·2·3·4·5
[200] Corollary I. — If in formula (9) we successively make m = 2,
m = 4,
m = 6,
...,
we obtain the following: − cos 2z = 1 − 2 cos2 z, cos 4z = 1 − 8 cos2 z + 8 cos4 z, (13) − cos 6z = 1 − 18 cos2 z + 48 cos4 z − 32 cos6 z, ................................................ Corollary II. — If in equation (12) we successively make m = 1,
m = 3,
m = 5,
...,
we conclude that
(14)
cos z = cos z, − cos 3z = 3 cos z − 4 cos3 z,
cos 5z = 5 cos z − 20 cos3 z + 16 cos5 z, ..........................................
Problem III. — To express the integer powers of sin z and of cos z as a linear function of the sines and cosines of the arcs z, 2z, 3z, . . .. Solution. — We solve this problem easily by considering the properties of two conjugate imaginary expressions, √ √ cos z + −1 sin z and cos z − −1 sin z. If we denote the first of these by u and the second by v, we have √ 2 cos z = u + v and 2 −1 sin z = u − v.
By raising both sides of each of the √ two preceding equations to the integer power m, then dividing each by 2 or by 2 −1, then making the reductions indicated by the formulas uv = 1, un − vn un + vn √ = cos nz and = sin nz, 2 2 −1 where the last two equations apply for any integer value [201] of n, we find that if m represents an even number, then 2m+1 cosm z = cos mz + m1 cos m − 2 · z + m(m−1) (15) 1·2 cos m − 4 · z + . . . m(m−1)...( m2 +1) + 21 , 1·2·3... m 2
and (16)
m (−1) 2 2m−1 sinm z = cos mz − m1 cos m − 2 · z + m(m−1) 1·2 cos m − 4 · z − . . . m(m−1)...( m2 +1) , ± 21 1·2·3... m 2
and if m represents an odd number, then m−1 m 2 cos z = cos mz − m1 cos m − 2 · z + m(m−1) (17) 1·2 cos m − 4 · z + . . . m(m−1)... m+3 2 + cos z, 1·2·3... m−1 2
and (18)
m−1 (−1) 2 2m−1 sinm z = sin mz − m1 sin m − 2 · z + m(m−1) 1·2 sin m − 4 · z − . . . m(m−1)... m+3 2 ± m−1 sin z. 1·2·3... 2
Corollary I. — If in formula (15) we successively make m = 2,
m = 4,
m = 6,
...,
[202] we conclude 2 cos2 z = cos 2z + 1, 8 cos4 z = cos 4z + 4 cos 2z + 3, (19) 32 cos6 z = cos 6z + 6 cos 4z + 15 cos 2z + 10, ............................................
We would arrive at the same equations if we sought to deduce the successive values of cos2 z, cos4 z, cos6 z, . . . from formulas (13) as linear functions of cos 2z,
cos 4z,
cos 6z,
....
Corollary II. — If in formula (16) we successively make m = 2,
m = 4,
m = 6,
...,
we obtain the equations −2 sin2 z = cos 2z − 1, 8 sin4 z = cos 4z − 4 cos 2z + 3, (20) −32 sin6 z = cos 6z − 6 cos 4z + 15 cos 2z − 10, ............................................, which we could equally well deduce from formulas (7) by eliminating the quantities sin2 z,
sin4 z,
sin6 z,
....
Corollary III. — If in formula (17) we successively make m = 1, we conclude (21)
m = 3,
m = 5,
...,
cos z = cos z, 4 cos3 z = cos 3z + 3 cos z, 16 cos5 z = cos 5z + 5 cos 3z + 10 cos z, ......................................
[203] We would arrive at the same equations if we sought to deduce the successive values of cos z, cos3 z, cos5 z, . . . from formulas (14) as linear functions of cos z,
cos 3z,
cos 5z,
....
Corollary IV. — If in formula (18) we successively take m = 1, we obtain the equations
m = 3,
m = 5,
...,
(22)
sin z = sin z, −4 sin3 z = sin 3z − 3 sin z, 16 sin5 z = sin 5z − 5 sin 3z + 10 sin z, ......................................,
which we could equally well deduce from formulas (8) by eliminating the quantities sin z,
sin3 z,
sin5 z,
....
Chapter 8
On imaginary functions and variables.
8.1 General considerations on imaginary functions and variables. [204] When we suppose that one or both of the two real quantities u and v are variables, then the expression √ u + v −1 is called an imaginary variable. If also the variable u converges towards the limit U and the variable v towards the limit V , then √ U +V −1 is the limit towards which the imaginary expression √ u + v −1 converges. When the constants or variables contained in a given function, having been considered real are later supposed to be imaginary, the notation that was used to express the function cannot be retained in the calculation except by virtue of new conventions able to determine the sense of this notation under the new hypotheses. Thus, for example, by virtue of the conventions established in the preceding Chapter, the values of the notations a + x,
a − x, ax and a/x

are completely determined in the case where the constant a and [205] the variable x become imaginary. Suppose, in order to clarify these ideas, that the constant a remains real, and the variable x has the imaginary value

α + β√−1 = ρ (cos θ + √−1 sin θ),
where α and β denote two real quantities which can be replaced by the modulus ρ and the real arc θ . We conclude from Chapter VII (§§ I and II) that the four notations a + x,
a − x,
ax
and
a x
denote, respectively, the four imaginary expressions √ a + ρ cos θ + ρ sin θ −1, √ a − ρ cos θ − ρ sin θ −1, √ aρ cos θ + aρ sin θ −1 and √ a a cos θ − sin θ −1, ρ ρ or in other words, the following quantities: √ √ a + α + β −1, a − α − β −1, and
√ aα + aβ −1 aβ √ aα − 2 −1. 2 2 α +β α +β2
In general, by means of the principles established in Chapter VII, we can clarify without difficulty the values of algebraic expressions in which several imaginary variables or constants are related to each other by the signs of addition, subtraction, multiplication or division. We see without trouble that these expressions retain all the properties as imaginary variables and constants that they have when they are real. For example, if we denote by x, y, z, . . . , u, v, w, . . . several variables, either real or imaginary, we have, in every [206] possible case x + y + z + . . . − (u + v + w + . . .) = x+y+z+...−u−v−w−..., xy = yx, u (x + y + z + . . .) = ux + uy + uz + . . . , x+y+z+... = x + y + z +..., u u u u (1) x y z xyz . . . × × ×... = , u v w uvw ... x vx v u = = × x, u u v ..................... Now consider the notation xa ,
8.1 General considerations on imaginary functions and variables.
161
in the case where the constant a remains real and the variable x takes the imaginary value √ √ α + β −1 = ρ cos θ + −1 sin θ . If we take for the value of a a quantity for which the numerical value is an integer number m, in this same notation, namely xa = x±m , we have, for any real values of α and β , a precise meaning. It is given by the imaginary expression √ ρ m cos mθ + ρ m sin mθ −1, if a = +m, and the following √ ρ −m cos mθ − ρ −m sin mθ −1, if a = −m (see Chapter VII, § II, equations (18) and (19)). But any time that the constant a takes a fractional [207] or irrational numerical value, the notation xa no longer has a precise and determined value, at least when the real part α of the imaginary expression x is not positive. If in this particular case we make ζ = arctan
β , α
then the arc ζ is contained between the limits − π2 and + π2 , and writing x in place of √ α + β −1 in § IV of Chapter VII (equations (17) and (27)), we find √ x = ρ cos ζ + −1 sin ζ and √ a a x = ρ cos aζ + −1 sin aζ , so that the notation xa denotes the imaginary expression √ ρ a cos aζ + ρ a sin aζ −1. It follows also from the conventions and the principles established above (Chap. VII, §§ III and IV) that, for a fractional value of the constant a, the notation ((x))a represents all at once many imaginary expressions, the values of which are given by the two formulas √ ((x))a = xa ((1))a and ((1))a = cos 2kaπ ± −1 sin 2kaπ,
when the real part α of the imaginary expression x is positive, and by the two formulas ((x))a = (−x)a ((−1))a and √ ((−1))a = cos (2k + 1) aπ ± −1 sin (2k + 1) aπ, when the quantity α becomes negative (on this subject, see § IV of Chapter VII, equations (25) and (26)). The same notation can no longer be employed in the case where the numerical value of a becomes irrational. [208] Expressions of the form xa retain the same properties for real and for imaginary values of the variable as long as the numerical value of the exponent is an integer number. But otherwise these properties do not hold except under certain conditions. For example, let √ √ √ x = α + β −1, y = α 0 + β 0 −1, z = α 00 + β 00 −1, . . . be several imaginary expressions, which would be reduced to real quantities if β , β 0 , β 00 vanish. Moreover, denote by a, b, c, . . . any real quantities with numerical values that are fractional or irrational, and by m, m0 , m00 , . . . several integer numbers. By virtue of the principles established in Chapter VII, we always have 0 00 0 00 xm xm xm . . . = xm+m +m +... , 0 00 0 00 (2) x−m x−m x−m . . . = x−m−m −m −... , 0 00 ±m ±m0 ±m00 x x x . . . = x±m±m ±m ±... , where each of the numbers m, m0 , m00 , . . . is given the same sign on both sides of the equation. Also ( xm ym zm . . . = (xyz . . .)m , (3) x−m y−m z−m . . . = (xyz . . .)−m and (4)
(x^m)^(m′) = (x^(−m))^(−m′) = x^(mm′),     (x^m)^(−m′) = (x^(−m))^(m′) = x^(−mm′).
On the other hand, we find that of the three formulas
(5) x^a x^b x^c . . . = x^(a+b+c+ . . .),
(6) x^a y^a z^a . . . = (xyz . . .)^a and
(7) (x^a)^b = x^(ab),
the first remains always true only if the real part α [209] of the imaginary expression x is positive. The second remains true when α, α 0 , α 00 , . . . are positive and the sum arctan
(β/α) + arctan (β′/α′) + arctan (β″/α″) + . . .
remains contained between the limits − π2 and + π2 . The last remains true when α is positive and the product β a arctan α is contained between the same limits. The conventions adopted in Chapter VII do not yet suffice to determine precisely the meanings of the notations Ax , log x, sin x, cos x, arcsin x and arccos x, in the case where the variable x becomes imaginary. The simplest means of arriving at such precise meanings is by considering imaginary series. We will revisit this subject in Chapter IX. From what has been said above, any algebraic notation that includes imaginary constants along with the variables x, y, z, . . ., assumed to be real, cannot be used in calculation except in the case where, by virtue of established conventions, it has a determined imaginary expression as its value. Such an expression, in which the real √ part and the coefficient of −1 are necessarily real functions of the variables x, y, z, . . ., is called an imaginary function of these same variables. Thus, for example, if we denote by ϕ (x) and χ (x) two real functions of x, an imaginary function of this variable is √ ϕ (x) + χ (x) −1. Sometimes we indicate such a function with the aid of a single symbol ϖ, and we write √ ϖ (x) = ϕ (x) + χ (x) −1. [210] Similarly, if we denote two real functions of the variables x, y, z, . . . by ϕ (x, y, z, . . .) and χ (x, y, z, . . .), then √ ϖ (x, y, z, . . .) = ϕ (x, y, z, . . .) + χ (x, y, z, . . .) −1 is an imaginary function of these several variables. The imaginary function √ ϕ (x, y, z, . . .) + χ (x, y, z, . . .) −1 is called algebraic or exponential or logarithmic or circular, etc., and, in the first case, is called rational or irrational, integer or fractional, etc., whenever both of the real functions ϕ (x, y, z, . . .) and χ (x, y, z, . . .) enjoy the properties associated with the name in question. Thus, in particular, the general form of a linear imaginary function of the variables x, y, z, . . . is √ (a + bx + cy + dz + . . .) + a0 + b0 x + c0 y + d 0 z + . . . −1 or what amounts to the same thing,
(a + a′√−1) + (b + b′√−1) x + (c + c′√−1) y + (d + d′√−1) z + . . . ,
where a, b, c, d, . . ., a0 , b0 , c0 , d 0 , . . . denote real constants. Again, we ought to distinguish among imaginary functions, as we do among real functions, the ones we call explicit, and which are immediately expressed by means of the variables, as opposed to those which we call implicit, for which the values are determined by certain equations but which cannot be known explicitly until the equations have been solved. Let ϖ (x)
or ϖ (x, y, z, . . .)
be an implicit imaginary function determined by a single equation. We can repre√ sent this function by u + v −1, where u and v denote two real quantities. If in the imaginary equation which must [211] be satisfied, we write √ u + v −1 instead of ϖ (x) or ϖ (x, y, z, . . .), then after expanding √ both sides of the equation and equating both the real parts and the coefficients of −1, we get two real equations between the unknown functions u and v. When we can solve these last equations, the solutions determine the explicit values of u and v, and consequently we get the explicit value of the imaginary expression √ u + v −1. For an imaginary function of a single variable to be completely determined, it is necessary and it suffices that for each particular value attributed to the variable, we can deduce the corresponding value of the function.1 Sometimes, for each value of the variable, the given function obtains several values, different from one another. Conforming to the conventions that we have already established, we ordinarily denote these multiple values of an imaginary function with the notation of doubled signs or doubled parentheses. Thus, for example, q q √ n cos z + −1 sin z or
(cos z + √−1 sin z)^(1/n)
indicates any one of the roots of degree n of the imaginary expression √ cos z + −1 sin z. 1
This is tantalizingly close to the modern definition of function, but deceptively so. Cauchy is still thinking of functions given by a formula, either implicitly or explicitly, and here he is merely distinguishing explicit functions and those implicit functions that are single-valued for a particular value of x.
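An editorial aside, not part of the original text: the following minimal Python sketch, with names and values of our own choosing, simply lists the several values that the doubled-parentheses notation collects, by computing the n roots of degree n of cos z + √−1 sin z and checking that each of them reproduces that expression when raised to the power n.

import cmath, math

def roots_of_degree_n(w, n):
    # The n distinct values gathered under ((w))^(1/n):
    # r^(1/n) (cos((phi + 2k*pi)/n) + sqrt(-1) sin((phi + 2k*pi)/n)), k = 0, ..., n-1.
    r, phi = cmath.polar(w)
    return [r ** (1.0 / n) * cmath.exp(1j * (phi + 2 * math.pi * k) / n)
            for k in range(n)]

z = 0.7                                   # any real arc
w = complex(math.cos(z), math.sin(z))     # cos z + sqrt(-1) sin z
for root in roots_of_degree_n(w, 3):
    print(root, root ** 3)                # each cube returns w, up to rounding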
8.2 On infinitely small imaginary expressions and on the continuity of imaginary functions. An imaginary expression is called infinitely small when it converges to the limit zero, √ which implies that in the given expression, the real part and the coefficient of −1 converge at the same [212] time towards this limit. Given this, represent a variable imaginary expression by √ √ α + β −1 = ρ cos θ + −1 sin θ , where α and β denote two real quantities for which we can substitute the modulus ρ and the real arc θ . Because this expression is infinitely small, it is evidently necessary and sufficient2 that its modulus p ρ = α2 + β 2 itself be infinitely small. An imaginary function of the real variable x is called continuous between two given limits of this variable when between these limits, an infinitely small increase in the variable always produces an infinitely small increase in the function itself. As a result, the imaginary function √ ϕ (x) + χ (x) −1 is continuous between two limits of x if the real functions ϕ (x) and χ (x) are continuous between these limits. We say that an imaginary function of the variable x is a continuous function of that variable in the neighborhood of a particular value of x whenever it remains continuous between two limits which contain that value, even if they are very close to each other. Finally, when an imaginary function of the variable x ceases to be continuous in the neighborhood of a particular value of this variable, we say that it then becomes discontinuous, and that there is a solution of continuity for this particular value. On the basis of the concepts which we have just established relative to the continuity of imaginary functions, we easily recognize that theorems I, II and III of Chapter II (§ II) remain true even in the case where we replace the real functions f (x) and
f (x, y, z, . . .)
[213] with the imaginary functions √ √ ϕ (x) + χ (x) −1 and ϕ (x, y, z, . . .) + χ (x, y, z, . . .) −1.
2 Up to here, Cauchy has used the expression “it is necessary and it suffices.” Here for the first time he writes the more familiar “necessary and sufficient” (nécessaire et suffisant).
As a consequence, we can state the following propositions. Theorem I. — If the real variables x, y, z, . . . have for limits the fixed and determined quantities X, Y , Z, . . ., and if the imaginary function √ ϕ (x, y, z, . . .) + χ (x, y, z, . . .) −1 is continuous with respect to each of the variables x, y, z, . . . in the neighborhood of the system of particular values x = X, y = Y, z = Z, . . . , √ then ϕ(x, y, z, . . .) + χ(x, y, z, . . .) −1 has as its limit √ ϕ (X,Y, Z, . . .) + χ (X,Y, Z, . . .) −1, or more briefly, if we write √ ϕ (x, y, z, . . .) + χ (x, y, z, . . .) −1 = ϖ (x, y, z, . . .) , then ϖ (x, y, z, . . .) has as its limit ϖ (X,Y, Z, . . .) . Theorem II. — Let x, y, z, . . . denote several real functions of the variable t which are continuous with respect to this variable in the neighborhood of the real value t = T . Furthermore, let X, Y , Z, . . . be the particular values of x, y, z, . . . corresponding to t = T . Suppose that in the neighborhood of these particular values, the imaginary function √ ϖ (x, y, z, . . .) = ϕ (x, y, z, . . .) + χ (x, y, z, . . .) −1 is simultaneously continuous with respect to x, with respect to y, with respect to z, etc. Then ϖ (x, y, z, . . .), considered as an imaginary function of t, is also continuous with respect to t in the neighborhood of the particular value t = T . In the preceding theorem, if we reduce the variables x, y, z, . . . to a single variable, we get the following statement: [214] Theorem III. — Suppose that in the expression √ ϖ(x) = ϕ(x) + χ(x) −1 the variable x is a real function of another variable t. Imagine further that the variable x is a continuous function of t in the neighborhood of the particular value t = T and that ϖ (x) is a continuous function of x in the neighborhood of the particular value x = X corresponding to t = T . The imaginary expression ϖ (x), considered as
a function of t, is also continuous with respect to this variable in the neighborhood of the particular value t = T .
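An editorial aside, not part of the original: a minimal numerical sketch of the definitions above, using sample functions of our own choosing. It checks that the increment of ϕ(x) + χ(x)√−1 has an infinitely small modulus exactly when the increments of ϕ and χ are themselves infinitely small.

import math

phi, chi = math.cos, math.exp        # any two continuous real functions will do

def varpi(x):
    return complex(phi(x), chi(x))   # phi(x) + chi(x) * sqrt(-1)

x = 1.2
for h in (0.1, 0.01, 0.001, 0.0001):
    delta = varpi(x + h) - varpi(x)
    # abs(delta) is the modulus sqrt(alpha^2 + beta^2) of the increment.
    print(h, abs(delta), abs(delta.real), abs(delta.imag))
# All three columns shrink together as h does, as the text asserts.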
8.3 On imaginary functions that are symmetric, alternating or homogeneous. In extending the definitions that we gave (Chapter III) of symmetric, alternating or homogeneous functions of several variables x, y, z, . . . to imaginary functions, we recognize immediately that √ ϕ (x, y, z, . . .) + χ (x, y, z, . . .) −1 is symmetric, alternating or homogeneous of degree a with respect to the variables x, y, z, . . . when the real functions ϕ (x, y, z, . . .)
and
χ (x, y, z, . . .)
are both symmetric, homogeneous or alternating of degree a with respect to these same variables.
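An editorial aside, not in the original: a one-line check, with an example of our own, that exchanging x and y leaves such a function unchanged precisely because its real part and the coefficient of √−1 are both symmetric.

def varpi(x, y):
    phi = x * y + x + y          # symmetric real part
    chi = x ** 2 + y ** 2        # symmetric coefficient of sqrt(-1)
    return complex(phi, chi)

print(varpi(2.0, 5.0) == varpi(5.0, 2.0))   # True: the exchange of x and y changes nothing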
8.4 On imaginary integer functions of one or several variables. By virtue of what has been said above (§ I), √ ϕ (x) + χ (x) −1 and
√ ϕ (x, y, z, . . .) + χ (x, y, z, . . .) −1
[215] are two imaginary integer functions, the first of the variable x and the second of the variables x, y, z, . . ., when ϕ (x)
and
χ (x) ,
ϕ (x, y, z, . . .)
and
χ (x, y, z, . . .)
are real integer functions of the same variables. Consequently, if ϖ (x) represents an imaginary integer function of the variable x, then the value of ϖ (x) is determined by an equation of the form √ ϖ (x) = ϕ (x) + χ (x) −1 √ = a0 + a1 x + a2 x2 + . . . + b0 + b1 x + b2 x2 + . . . −1, where a0 , a1 , a2 , . . ., b0 , b1 , b2 , . . . denote real constants. We conclude from this equation, by combining the coefficients of similar powers of x, that
(1) ϖ(x) = (a0 + b0√−1) + (a1 + b1√−1) x + (a2 + b2√−1) x^2 + . . . .
For the function ϖ (x) determined by the previous formula to vanish with x, it is necessary that we have √ a0 + b0 −1 = 0, that is to say a0 = 0 and b0 = 0, in which case the value of ϖ (x) reduces to √ √ ϖ (x) = a1 + b1 −1 x + a2 + b2 −1 x2 + . . . i h √ √ = x a1 + b1 −1 + a2 + b2 −1 x + . . . . Thus, any imaginary integer function of the variable x that vanishes with that variable is the product of the factor x by a second function of the same kind, or in other words, is divisible by x. On the basis of this remark, we easily extend theorems I and II of Chapter IV (§ I) to the case where the integer functions which are mentioned there are also imaginary. I will add that these two theorems remain true even if we replace the particular real values given to the variable x, such as x0 , x1 , x2 , . . . [216] with the imaginary values3 √ √ aα0 + β0 −1, α1 + β1 −1,
√ α2 + β2 −1,
....
To prove this assertion, it suffices to establish the two following propositions: Theorem I. — If an imaginary integer function of the variable x vanishes for a particular value of that variable, for example, for √ x = α0 + β0 −1, then this function is algebraically divisible by √ x − α0 − β0 −1. Proof. — Indeed, let √ ϖ (x) = ϕ (x) + χ (x) −1 be the imaginary function under consideration. If we let √ x = α0 + β0 −1 + z, 3 This word was changed to variables in [Cauchy 1897, p. 216] from the correct word valeurs in [Cauchy 1821, p. 255]. (tr.)
where z denotes a new variable, then by substituting this, we evidently obtain as a result an imaginary integer function of z, namely √ ϖ α0 + β0 −1 + z . Because this function of z ought to vanish for z = 0, we conclude that √ ϖ (x) = ϖ α0 + β0 −1 + z is divisible by
√ z = x − α0 − β0 −1.
Corollary I. — The preceding proposition remains true even in the case where the function χ (x) vanishes, that is to say in the case where ϖ (x) reduces to a real function ϕ (x). Corollary II. — The preceding theorem also remains true when we [217] suppose that β = 0, and consequently when the particular value assigned to the variable x is real. Theorem II. — If an imaginary integer function of the variable x vanishes for each of the particular values of x contained in the sequence √ √ √ √ α0 + β0 −1, α1 + β1 −1, α2 + β2 −1, . . . , αn−1 + βn−1 −1, where n denotes any integer number, this function is equivalent to the product of the factors √ √ √ x − α0 − β0 −1, x − α1 − β1 −1, x − α2 − β2 −1, . . . , √ . . . , x − αn−1 − βn−1 −1 by a new imaginary integer function of the variable x. Proof. — Let
√ ϖ (x) = ϕ (x) + χ (x) −1
be the given function. Because it should vanish for √ x = α0 + β0 −1, by virtue of theorem I it is algebraically divisible by √ x − α0 − β0 −1. As a consequence, we have (2)
√ ϖ (x) = x − α0 − β0 −1 Q0 ,
where Q0 denotes a new imaginary integer function of the variable x. The function ϖ (x) also ought to vanish when we suppose that √ x = α1 + β1 −1, so this assumption necessarily reduces the right-hand side of equation (2) to zero, and consequently it reduces to zero one of the two factors which compose it (see Chapter VII, § II, theorem VII, corollary II). [218] Furthermore, because the first factor √ x − α0 − β0 −1 cannot become zero for
√ x = α1 + β1 −1
as long as the particular values √ α0 + β0 −1
√ and α1 + β1 −1
are distinct from each other, it is clear that by assigning the second of these two values of x, we ought to reduce to zero the integer function Q0 , and consequently this integer function is algebraically divisible by √ x − α1 − β1 −1. Thus we have
√ Q0 = x − α1 − β1 −1 Q1 ,
where Q1 denotes a new imaginary integer function of the variable x. Consequently equation (2) can be put into the form √ √ ϖ (x) = x − α0 − β0 −1 x − α1 − β1 −1 Q1 . (3) By reasoning again in this way we find: 1◦ that the function ϖ (x) ought to vanish by virtue of the assumption that √ x = α2 + β2 −1, so this assumption necessarily reduces the right-hand side of equation (3) to zero, and consequently it reduces to zero one of its three factors; 2◦ that the factor which reduces to zero cannot be any other than the integer function Q1 , as long as the three particular values of x, √ √ √ α0 + β0 −1, α1 + β1 −1, α2 + β2 −1, are distinct from one another; and 3◦ that because the integer function Q1 ought to vanish for √ x = α2 + β2 −1,
[219] it is algebraically divisible by √ x − α2 − β2 −1. Consequently, we have Q1 = (x − α2 − β2 ) Q2 and it follows that (4)
ϖ (x) = (x−α0 −β0
√
√ √ −1)(x−α1 −β1 −1)(x−α2 −β2 −1)Q2 ,
where Q2 again denotes an integer imaginary function of the variable x. Continuing in the same way, we eventually recognize that, in the case where the integer function ϖ (x) vanishes for n different values of x, respectively denoted by √ √ √ √ α0 + β0 −1, α1 + β1 −1, α2 + β2 −1, . . . , αn−1 + βn−1 −1, then we necessarily have ( √ √ ϖ (x) = x − α0 − β0 −1 x − α1 − β1 −1 (5) √ √ × x − α2 − β2 −1 . . . x − αn−1 − βn−1 −1 Q, where Q denotes a new integer function of the variable x. It is almost unnecessary to observe that the preceding theorem remains true when we suppose that χ (x) = 0, or else β0 = 0,
β1 = 0,
β2 = 0,
...,
βn−1 = 0,
that is to say when the function ϖ (x) or the particular values assigned to the variable x become real. With the aid of the principles established in this section, we can prove without difficulty that in Chapter IV (§ I), theorems III and IV, along with formula (1), can be extended to the case where the functions and the variables, at the same time as the particular values attributed to those functions and variables, become imaginary. We can prove as well that propositions I, II and III, along with formulas (1) and (2) in § II of Chapter IV, and formulas (2), (3), (4), (5) and (6) of § III in the same Chapter, remain true whatever the real or imaginary values [220] of the variables, functions and constants may be. Thus, for example, we see, in particular, that equation (6) of § III, namely xn xn−1 y (x + y)n = + +... 1 · 2 · 3 . . . n 1 · 2 · 3 . . . n 1 · 2 · 3 . . . (n − 1) 1 (6) n−1 n x y y + + , 1 1 · 2 · 3 . . . (n − 1) 1 · 2 · 3 . . . n holds for any imaginary values of the variables x and y.
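An editorial aside, not in the original: the following sketch (names and sample roots are ours) illustrates theorems I and II of this section by building an imaginary integer function that vanishes at prescribed imaginary values and then checking, by synthetic division, that it is divisible by the corresponding linear factor.

def multiply_by_linear_factor(coeffs, root):
    # Multiply a polynomial, written with its highest power first, by (x - root).
    result = list(coeffs) + [0j]
    for i, c in enumerate(coeffs):
        result[i + 1] -= c * root
    return result

def divide_by_linear_factor(coeffs, root):
    # Synthetic (Horner) division by (x - root); returns (quotient, remainder).
    carry, quotient = 0j, []
    for c in coeffs:
        carry = carry * root + c
        quotient.append(carry)
    return quotient[:-1], quotient[-1]

roots = [1 + 2j, -0.5 + 1j, 3j]          # sample values alpha_k + beta_k * sqrt(-1)
coeffs = [2 + 1j]                        # an imaginary leading coefficient
for r in roots:
    coeffs = multiply_by_linear_factor(coeffs, r)

quotient, remainder = divide_by_linear_factor(coeffs, roots[0])
print(remainder)                         # essentially 0: divisible by x - (1 + 2j)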
8.5 Determination of continuous imaginary functions of a single variable that satisfy certain conditions. Let
√ ϖ (x) = ϕ (x) + −1 χ (x)
be a continuous imaginary function of the variable x, where ϕ(x) and χ(x) are two real continuous functions. The imaginary function ϖ(x) is completely determined if for all the possible real values of the variables x and y, it is required to satisfy one of the equations (1) (2)
ϖ (x + y) = ϖ (x) + ϖ (y) or ϖ (x + y) = ϖ (x) × ϖ (y) ,
or else, for all real positive values of the same variables, one of the following equations: (3) (4)
ϖ (xy) = ϖ (x) + ϖ (y) or ϖ (xy) = ϖ (x) × ϖ (y) .
We will solve these four equations successively, which will provide us with four problems analogous to those we have already treated in § I of Chapter V. Problem I. — To determine the imaginary function ϖ (x) in such a manner that it remains continuous between any two real limits of the variable [221] x and so that for all real values of the variables x and y, we have (1)
ϖ (x + y) = ϖ (x) + ϖ (y) . Solution. — If, with the aid of the formula √ ϖ (x) = ϕ (x) + χ (x) −1,
we replace the imaginary function ϖ in equation (1) with the real functions ϕ and χ, this equation becomes √ √ √ ϕ (x + y) + χ (x + y) −1 = ϕ (x) + χ (x) −1 + ϕ (y) + χ (y) −1, √ then by equating the real parts and the coefficients of −1 on both sides, we conclude ϕ (x + y) = ϕ (x) + ϕ (y) and χ (x + y) = χ (x) + χ (y) . From these last formulas (see Chapter V, § I, problem I), we get
ϕ (x) = xϕ (1) and χ (x) = xχ (1) . Consequently (5)
h √ i ϖ (x) = x ϕ (1) + χ (1) −1 ,
or what amounts to the same thing, ϖ (x) = xϖ (1) .
(6)
It follows from equation (5) that any value of ϖ (x) that satisfies the given question is necessarily of the form √ (7) ϖ (x) = a + b −1 x, where a and b denote two constant quantities. Moreover, it is easy to assure ourselves that any such value of ϖ(x) satisfies equation (1), whatever the values of a and b. These quantities are thus two arbitrary constants. [222] We could remark that to obtain the preceding value of ϖ (x), it suffices to replace the arbitrary real constant a in the value of ϕ (x) given by equation (7) of Chapter V (§ I) by the arbitrary but imaginary constant √ a + b −1. Problem II. — To determine the imaginary function ϖ (x) in such a manner that it remains continuous between any two real limits of the variable x and so that for all real values of the variables x and y, we have ϖ (x + y) = ϖ (x) ϖ (y) .
(2)
Solution.4 — If we make x = 0 in equation (2), we get ϖ (0) = 1, or, because of the formula √ ϖ (x) = ϕ (x) + χ (x) −1, we get what amounts to the same thing, √ ϕ (0) + χ (0) −1 = 1. Consequently, 4
Note that this solution is very different from Cauchy’s solution to the corresponding problem II in Chapter V, § I. By contrast, problem I in this section followed as an easy corollary of problem I of Chapter V, § I.
ϕ (0) = 1
and
χ (0) = 0.
The function ϕ (x) reduces to 1 for the particular value 0 assigned to the variable x, and because it is assumed to be continuous between any limits, it is clear that in the neighborhood of this particular value, it is only very slightly different from 1, and consequently it is positive. Thus, if α denotes a very small number, we can choose this number in such a way that the function ϕ (x) remains constantly positive between the limits x = 0 and x = α. With this condition satisfied, because the quantity ϕ(α) is itself positive, if we take q χ (α) , ρ = ϕ (α)2 + χ (α)2 and ζ = arctan ϕ (α) we conclude that √ √ ϖ (α) = ϕ (α) + χ (α) −1 = ρ cos ζ + −1 sin ζ . [223] Now imagine that in equation (2) we successively replace y by y + z, then z by z + u, . . .. We conclude that ϖ (x + y + z + . . .) = ϖ (x) ϖ (y) ϖ (z) . . . , however many variables, x, y, z, . . ., there may be. If we also denote by m the number of variables, and if we make x = y = z = . . . = α, then the equation we have just found becomes √ ϖ (mα) = [ϖ (α)]m = ρ m cos mζ + −1 sin mζ . I add that the formula √ ϖ (mα) = ρ m cos mζ + −1 sin mζ remains true if we replace the integer number m by a fraction, or even by an arbitrary number µ. We will prove this easily in what follows. If in equation (2) we make 1 x= α 2
1 and y = α, 2
then we conclude 2 h i √ 1 ϖ α = ϖ (α) = ρ cos ζ + −1 sin ζ . 2
Then, by taking square roots of both sides in such a way that the real parts are positive, and by observing that the two functions ϕ (x) and cos x remain positive, the first between the limits x = 0 and x = α, and the second between the limits x = 0 and x = ζ , we find that √ 1 ζ √ ζ 1 1 1 −1 = ρ 2 cos + −1 sin α =ϕ α +χ α . ϖ 2 2 2 2 2 Likewise, if in equation (2) we make 1 x= α 4
1 and y = α, 4
[224] then we conclude 2 1 1 ζ √ 1 ζ ϖ α =ϖ α = ρ 2 cos + −1 sin . 4 2 2 2 Then, by taking square roots of both sides so as to obtain positive real parts, we find 1 ζ √ ζ 1 4 α =ρ . cos + −1 sin ϖ 4 4 4 By similar reasoning, we can establish successively the formulas 1 1 ζ ζ √ ϖ α = ρ 8 cos + −1 sin , 8 8 8 1 1 ζ ζ √ ϖ α = ρ 16 cos + −1 sin , 16 16 16 . . . . . . . . . . . .. . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . , and in general, where n denotes any integer number, √ 1 1 1 1 2n cos α = ρ ζ + −1 sin ζ . ϖ 2n 2n 2n If we operate on the preceding value of ϖ 21n α to derive the value of ϖ 2mn α the same way we operate on the value of ϖ (α) to derive that of ϖ (mα), we find that h m √ m i m m ϖ n α = ρ 2n cos n ζ + −1 sin n ζ , 2 2 2 or what amounts to the same thing, m m √ h m √ m i m ϕ n +χ n −1 = ρ 2n cos n ζ + −1 sin n ζ . 2 α 2 α 2 2 Consequently,
m m m and ϕ n α = ρ 2n cos n ζ 2 2m m m χ n α = ρ 2n sin n ζ . 2 2 [225] Then, by supposing that the fraction 2mn varies in such a way as to approach indefinitely the number µ and passing to the limit, we get the equations ϕ (µα) = ρ µ cos µζ
and
χ (µα) = ρ µ sin µζ ,
from which we conclude that (8)
√ ϖ (µα) = ρ µ cos µζ + −1 sin µζ .
Moreover, if in equation (2) we set x = µα
and y = −µα,
we get ϖ (−µα) =
i h √ ϖ (0) = ρ −µ cos (−µζ ) + −1 sin (−µζ ) . ϖ (µα)
Thus, formula (8) remains true when we replace µ by −µ. In other words, for all real values of the variable x, both positive and negative, we have h i √ ϖ (αx) = ρ x cos ζ x + −1 sin ζ x = [ϖ (α)]x . (9) In this last formula, if we write αx instead of x, it becomes √ x x ζ ζ ϖ (x) = ρ α cos x + −1 sin x = [ϖ (α)] α . (10) α α If, for brevity, we make (11) we find (12)
1
ρ α = A and
ζ = b, α
√ ϖ(x) = Ax cos bx + −1 sin bx .
Thus any value of ϖ (x) that satisfies the given question is necessarily of the form √ Ax cos bx + −1 sin bx , where A and b denote two real quantities, of which the first must be [226] positive. Moreover, it is easy to assure ourselves that such a value of ϖ (x) satisfies equation (2), whatever the values of the number A and the quantity b may be. This number and this quantity are thus arbitrary constants.
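An editorial aside, not in the original: a quick numerical check, with arbitrary constants of our own, that the form just obtained does satisfy equation (2).

import math

A, b = 1.7, 0.9                          # arbitrary: A a positive number, b any quantity

def varpi(x):
    return (A ** x) * complex(math.cos(b * x), math.sin(b * x))

for x, y in [(0.3, 1.1), (-2.0, 0.7), (5.0, -3.2)]:
    print(abs(varpi(x + y) - varpi(x) * varpi(y)))    # ~0 up to rounding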
Corollary. — In the particular case where the function ϕ (x) remains positive between the limits x = 0 and x = 1, we can, instead of supposing that α is very small, set α = 1. Then we conclude immediately from equations (9) and (10) that (13)
ϖ (x) = [ϖ (1)]x .
Problem III. — To determine the imaginary function ϖ (x) in such a manner that it remains continuous between any two positive limits of the variable x and so that for all positive values of the variables x and y, ϖ (xy) = ϖ (x) + ϖ (y) .
(3)
Solution. — If, with the aid of the formula √ ϖ (x) = ϕ (x) + χ (x) −1, we replace the imaginary function ϖ in equation (3)√by the real functions ϕ and χ, then we equate the real parts and the coefficients of −1 on both sides, we find ϕ (xy) = ϕ (x) + ϕ (y) and χ (xy) = χ (x) + χ (y) . Moreover, if A denotes any number and log denotes the characteristic of logarithms in the system for which the base is A, we get from the preceding equations (see Chapter V, § I, problem III) ϕ (x) = ϕ (A) log (x) and χ (x) = χ (A) log (x) . We conclude that (14)
h √ i ϖ (x) = ϕ (A) + χ (A) −1 log (x) ,
[227] or what amounts to the same thing, (15)
ϖ (x) = ϖ (A) log (x) .
It follows from formula (14) that any value of ϖ (x) that satisfies the given question is necessarily of the form √ (16) ϖ (x) = a + b −1 log (x) , where a and b denote constant quantities. Moreover, it is easy to assure ourselves that such a value of ϖ (x) satisfies equation (3), whatever the quantities a and b may be. Thus these quantities are two arbitrary constants. We could remark that to obtain the preceding value of ϖ (x), it suffices to replace the arbitrary real constant a in the value of ϕ (x) given by equation (12) of Chapter V (§ I) by the arbitrary but imaginary constant
√ a + b −1. Note. — We could arrive very simply at equation (15) in the following manner. By virtue of the identities x = Alog x
and y = Alog y ,
equation (3) becomes ϖ Alog x+log y = ϖ Alog x + ϖ Alog y . Because in this last formula the variable quantities log x and log y take on all real values, both positive and negative, as a result we have, for all possible real values of the variables x and y, ϖ Ax+y = ϖ (Ax ) + ϖ (Ay ) . We conclude [see problem I, equation (6)] that ϖ (Ax ) = xϖ A1 = xϖ (A) , and consequently ϖ Alog x = ϖ (A) log x, [228] or what amounts to the same thing, ϖ(x) = ϖ (A) log x. Problem IV. — To determine the imaginary function ϖ(x) in such a manner that it remains continuous between any two positive limits of the variable x and so that for all positive values of the variables x and y, we have (4)
ϖ (xy) = ϖ(x)ϖ (y) .
Solution. — It would be easy to apply a method similar to that which we used to solve the second problem to the solution of this problem. However, we will arrive more promptly at the solution we seek if we observe that, by denoting by log the characteristic of logarithms in the system for which the base is A, we can put equation (4) into the form ϖ Alog x+log y = ϖ Alog x ϖ Alog y . Because in this last equation the variable quantities log x and log y admit any real values, positive and negative, it follows that we have, for all possible real values of the variables x and y, ϖ Ax+y = ϖ (Ax ) ϖ (Ay ) .
If α represents a very small number and if we replace ϖ(x) with ϖ (Ax ) in equation (10) of the second problem, we conclude that x
ϖ (Ax ) = [ϖ (Aα )] α . Consequently, we find that log x ϖ Alog x = [ϖ (Aα )] α , or what amounts to the same thing, (17)
ϖ(x) = [ϖ (Aα )]
log x α
.
It is essential to observe that the imaginary function ϖ (Ax ), and consequently its real part ϕ (Ax ), reduce to 1 for x = 0, [229] or in other words, that the imaginary function ϖ(x) and its real part ϕ(x) reduce to 1 for x = 1. We can prove this directly by taking x = A0 = 1 in equation (4). As for the number α, it need only be small enough that the real part of the imaginary function ϖ (Ax ) remain constantly positive between the limits x = 0 and x = α. When this condition is satisfied, the real part of the imaginary expression √ ϖ (Aα ) = ϕ (Aα ) + χ (Aα ) −1 is itself positive. Consequently, if we make q χ (Aα ) ρ = [ϕ (Aα )]2 + [χ (Aα )]2 and ζ = arctan , ϕ (Aα ) we have
√ ϖ (Aα ) = ρ cos ζ + −1 sin ζ .
Given this, equation (17) becomes h √ i log x ϖ(x) = ρ α cos αζ log x + −1 sin αζ log x √ i h (18) log ρ = x α cos αζ log x + −1 sin αζ log x . By virtue of this last equation, any value of ϖ(x) that satisfies the given question is necessarily of the form h i √ (19) ϖ(x) = xa cos (b log x) + −1 sin (b log x) , where a and b denote two constant quantities. Moreover, it is easy to assure ourselves that these two constant quantities ought to remain entirely arbitrary.
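An editorial aside, not in the original: the same kind of check for Problem IV, with constants of our own choosing, confirming that the form (19) satisfies equation (4) for positive values of the variables.

import math

a, b = -0.4, 2.3                         # arbitrary constant quantities

def varpi(x):
    angle = b * math.log(x)
    return (x ** a) * complex(math.cos(angle), math.sin(angle))

for x, y in [(0.5, 3.0), (2.0, 7.5), (10.0, 0.25)]:
    print(abs(varpi(x * y) - varpi(x) * varpi(y)))    # ~0 up to rounding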
Chapter 9
On convergent and divergent imaginary series. Summation of some convergent imaginary series. Notations used to represent imaginary functions that we find by evaluating the sum of such series.
9.1 General considerations on imaginary series. [230] Let (1) (2)
p0 , q0 ,
p1 , q1 ,
p2 , . . . , pn , q2 , . . . , qn ,
... ...
and
be two real series. The sequence of imaginary expressions √ √ √ √ p0 + q0 −1, p1 + q1 −1, p2 + q2 −1, . . . , pn + qn −1, . . . (3) forms what we call an imaginary series. Moreover, let √ √ p1 + q1 −1 + . . . sn = p0 + q0 −1 + √ (4) + pn−1 + qn−1 −1 √ = (p0 + p1 + . . . + pn−1 ) + (q0 + q1 + . . . + qn−1 ) −1 be the sum of the first n terms of this series. Depending on whether or not sn converges towards a fixed limit for increasing values of n, we say that series (3) is convergent and that it has this limit as its sum, or else that it is divergent and it does not have a sum. The first case evidently occurs if the two sums p0 + p1 + . . . + pn−1 q0 + q1 + . . . + qn−1
and
themselves converge towards [231] fixed limits, for increasing values of n, and the second in the opposite case. In other words, series (3) is always convergent at the same time as the real series (1) and (2) are convergent. If even one of these series is divergent, then series (3) is divergent as well. In every possible case, the term of series (3) that corresponds to the index n, namely
√ pn + qn −1, is called its general term. The simplest of these imaginary series is the one we get by attributing an imaginary value to the variable x in the geometric progression 1,
x,
x2 ,
...,
xn ,
....
Imagine, to clarify these ideas, that we make √ x = z cos θ + −1 sin θ , where z denotes a new real variable and θ a real arc. The geometric progression in question becomes √ √ 1, z cos θ + −1 sin θ √, z2 cos 2θ + −1 sin 2θ , . . . (5) . . . , zn cos nθ + −1 sin nθ , . . . . To obtain the equation that determines the √ sum of thefirst n terms of the preceding series, it suffices to replace x by z cos θ + −1 sin θ in the formula 1 + x + x2 + . . . + xn−1 =
1/(1 − x) − x^n/(1 − x).
In this way, we find that √ √ 1 + z cos θ + −1 sin θ + z2 cos 2θ + −1 sin 2θ + . . . √ +zn−1 cos (n − 1) θ + −1 sin (n − 1) θ (6) 1 zn−1 − . √ √ = 1 − z cos θ + −1 sin θ 1 − z cos θ + −1 sin θ For increasing values of n, the modulus of the imaginary [232] expression √ zn cos nθ + −1 sin nθ √ , 1 − z cos θ − z sin θ −1 namely
±z^n / (1 − 2z cos θ + z^2)^(1/2),
converges towards the limit zero or grows beyond all limits, depending on whether we suppose that the numerical value of z is less than or greater than 1. Thus we ought to conclude from equation (6) that under the first hypothesis, series (5) is a convergent series that has 1 √ 1 − z cos θ − z sin θ −1
for its sum, and under the second hypothesis, it is a divergent series that does not have a sum. We indicate the sum of an imaginary series the same way we do for a real series, by the sum of its first terms followed by an ellipsis . . .. Given this, if we denote the sum of series (3) by s, assuming it is convergent, and if we make n grow indefinitely in formula (4), we find by passing to the limit that ( √ √ √ s = p0 + q0 −1 + p1 + q1 −1 + p2 + q2 −1 + . . . (7) √ = (p0 + p1 + p2 + . . .) + (q0 + q1 + q2 + . . .) −1. In the same way, when we suppose that the numerical value of z is less than 1 and we make n grow beyond any assignable limit, we get from equation (6), 2 √ √ 1 + z cos θ + −1 sin θ + z cos 2θ + −1 sin 2θ + . . . √ (8) 1 − z cos θ + z sin θ −1 1 √ = . = 1 − 2z cos θ + z2 1 − z cos θ − z sin θ −1 By virtue of formula (7), the first part of equation (8) can [233] be written in the following form: √ 1 + z cos θ + z2 cos 2θ + . . . + z sin θ + z2 sin 2θ + . . . −1. Thus, for numerical values of z less than 1, we have √ 2 2 1 + z cos θ + z cos 2θ + . . . + z sin θ + z sin 2θ + . . . −1 √ (9) z sin θ −1 √ 1 − z cos θ + = −1. 1 − 2z cos θ + z2 1 − 2z cos θ + z2 Thus we conclude that1 1 − z cos θ , 1 + z cos θ + z2 cos 2θ + z3 cos 3θ + . . . = 1 − 2z cos θ + z2 (10) z sin θ z sin θ + z2 sin 2θ + z3 sin 3θ + . . . = 1 − 2z cos θ + z2 (z = −1, z = +1) . Thus the substitution of an imaginary value for x in the geometric progression 1,
x,
x2 ,
...,
xn ,
...
is enough to lead to the summation of the two series ( 1, z cos θ , z2 cos 2θ , . . . , zn cos nθ , . . . (11) z sin θ , z2 sin 2θ , . . . , zn sin nθ , . . . 1
Euler summed a similar series by similar means in [Euler 1774].
whenever the variable z remains contained between the limits z = −1
and z = +1,
that is to say, whenever the two series are convergent. The left-hand sides of equations (10) are (by virtue of theorem I, Chapter VI, § I) continuous functions of the variable z in the neighborhood of any particular value contained between the limits z = −1
and z = +1,
and so the left-hand side of equation (9) is itself a continuous function of z in the neighborhood of the same value. Now, this left-hand side is nothing but the sum of series (5), of which the different [234] terms remain continuous functions of z between any limits whatsoever. By generalizing the remark that we have just made, we obtain the following proposition: Theorem I. — When the different terms of series (3) are functions of the same variable z and are continuous with respect to this variable in the neighborhood of a particular value for which this series is convergent, the sum s of this series is also a continuous function of z in the neighborhood of this particular value.2 Proof. — Indeed, in the neighborhood of the particular value attributed to the variable z, series (3) cannot be convergent and have continuous functions of z for its different terms unless the real series (1) and (2) both enjoy the same properties. Now under this hypothesis, because both of the sums p0 + p1 + p2 + . . . q0 + q1 + q2 + . . .
and
are continuous functions of the variable z (by virtue of theorem I, Chapter VI, § I) it follows that the sum of series (3), namely √ s = (p0 + p1 + p2 + . . .) + (q0 + q1 + q2 + . . .) −1 is also a continuous function of this variable. Now suppose that we denote by ρ0 ,
ρ1 ,
ρ2 ,
...
the moduli of the various terms of series (3), and by √ √ √ cos θ0 + −1 sin θ0 , cos θ1 + −1 sin θ1 , cos θ2 + −1 sin θ2 , . . . the corresponding reduced expressions, so that in general we have 2
This theorem as stated is incorrect. Cauchy’s proof depends on his incorrect theorem I of Chapter VI, § I. See the footnote on p. 90.
1 ρn = p2n + q2n 2 and √ √ pn + qn −1 = ρn cos θn + −1 sin θn . [235] Series (3) becomes
(12)
√ ρ0 cos θ0 + −1 sin θ0 , √ ρ1 cos θ1 + −1 sin θ1 , √ ρ2 cos θ2 + −1 sin θ2 , ........................, √ ρ cos θn + −1 sin θn , n ........................,
and we can ordinarily decide if this series is convergent or divergent with the aid of the theorem I am about to state. 1
Theorem II.3 — To find the limit or limits towards which the expression (ρn ) n converges as n grows indefinitely. Series (3) converges or diverges according to whether the largest of these limits is less than or greater than 1. 1
Proof. — First consider the case where the largest values of the expression (ρn ) n converge towards a limit less than 1 as n grows indefinitely. In this case, because the series (13) ρ0 , ρ1 , ρ2 , . . . , ρn , . . . is convergent (Chapter VI, § II, theorem I), the two series ρ0 cos θ0 , ρ1 cos θ1 , ρ2 cos θ2 , . . . , ρn cos θn , (14) ρ0 sin θ0 , ρ1 sin θ1 , ρ2 sin θ2 , . . . , ρn sin θn ,
..., ...
are convergent as well (Chapter VI, § III, theorem IV), and the convergence of these last series entails that of series (12), which is nothing but series (3) presented in a different form. In the second place, suppose that for increasing values of n, the largest values 1 of (ρn ) n converge towards a limit greater than 1. Under this hypothesis and using reasoning similar to that which we used in Chapter VI (§ II, theorem I), we prove that the largest values of the modulus ρn = p2n + q2n
12
[236] increase with n beyond all limits, which cannot be true unless the largest values of the two quantities pn and qn , or at least one of them, likewise grows indefinitely. Now as these two quantities are the general terms of series (1) and (2), we 3
This is the Root Test adapted to complex numbers.
must conclude that at least one of these two series must be divergent, which suffices to assure the divergence of series (3). Scholium I. — The theorem that we have just established leaves no doubt about the convergence or divergence of an imaginary series except in the particular case 1 where the limit of the largest values of (ρn ) n becomes equal to 1. In this particular case, it is not always easy to decide the issue. Nevertheless, we can affirm that if series (13) is convergent, then series (14) and consequently series (12) are as well. The converse is not true, and it can turn out that although series (12) remains convergent, series (13) is divergent. Thus, for example, if we take 1 1 and θn = n + π, ρn = n+1 2 then we get for series (12) and (13) the two following ones: √ √ √ √ −1, − 12 −1, + 31 −1, − 14 −1, . . . , 1,
1 2,
1 3,
1 4,
...,
where the second is divergent, while the first remains convergent and has for its sum √ −1 ln 2, where ln denotes the characteristic of Napierian logarithms. Scholium II. — Whenever the ratio ρn+1 ρn indefinitely approaches a fixed limit for increasing values of n, this limit is the same 1 [237] as the limit towards which the largest values of the expression (ρn ) n converge. Theorem V of § III (Chapter VI) is evidently applicable to imaginary series as well as to real series. As for theorem VI of the same section, when it is a question of imaginary series, we ought to replace it with the following: Theorem III. — Let ( (15)
u0 ,
u1 ,
u2 ,
...,
un ,
...,
v0 ,
v1 ,
v2 ,
...,
vn ,
...
be two convergent imaginary series that have s and s0 , respectively, as their sums. If each of these series remains convergent when we reduce its terms to their respective moduli, then ( u0 v0 , u0 v1 + u1 v0 , u0 v2 + u1 v1 + u2 v0 , . . . , (16) . . . , u0 vn + u1 vn−1 + . . . + un−1 v1 + un v0 , . . .
9.1 General considerations on imaginary series.
187
is a new convergent imaginary series that has ss0 as its sum. Proof. – Denote, respectively, by sn and s0n the sums of the first n terms of the two series (15), and by s00n the sum of the first n terms of series (16). We find that sn s0n − s00n = un−1 vn−1 + (un−1 vn−2 + un−2 vn−1 ) + . . . + (un−1 v1 + un−2 v2 + . . . + un vn−2 + u1 vn−1 ) . Again denote by ρn and ρn0 the moduli of the imaginary expressions un and vn , so that these expressions are determined by equations of the form √ and un = ρn cos θn + −1 sin θn √ 0 0 0 vn = ρn cos θn + −1 sin θn . Because the real series ρ0 , ρ1 , ρ2 , . . . , ρn , . . . ρ00 ,
ρ10 ,
ρ20 ,
...,
ρn0 ,
and
...
[238] are convergent by hypothesis, we conclude, as in Chapter VI (§ III, theorem VI) that the sum 0 0 0 ρn−1 ρn−1 + ρn−2 ρn−1 + ρn−1 ρn−2 +... 0 0 0 0 + ρ1 ρn−1 + ρn−1 ρ1 + ρn−2 ρ2 + . . . + ρ2 ρn−2 converges towards the limit zero for increasing values of n. It is the same, a fortiori, for the two sums 0 0 ρn−1 ρn−1 cos θn−1 + θn−1 0 0 0 0 cos θn−1 + θn−2 + ρn−1 ρn−2 + ρn−2 ρn−1 cos θn−2 + θn−1 +............................................................... + [ρn−1 ρ10 cos (θn−1 + θ10 ) + ρn−2 ρ20 cos (θn−2 + θ20 ) + . . . 0 0 0 0 +ρ2 ρn−2 cos θ2 + θn−2 + ρ1 ρn−1 cos θ1 + θn−1 and
0 0 ρn−1 ρn−1 sin θn−1 + θn−1 0 0 0 0 + ρn−1 ρn−2 sin θn−1 + θn−2 + ρn−2 ρn−1 sin θn−2 + θn−1 +. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . + [ρn−1 ρ10 sin (θn−1 + θ10 ) + ρn−2 ρ20 sin (θn−2 + θ20 ) + . . . 0 0 0 0 +ρ2 ρn−2 sin θ2 + θn−2 + ρ1 ρn−1 sin θ1 + θn−1 ,
where the first series evidently represents the real part of the imaginary expression sn s0n − s00n ,
188
9 On convergent and divergent imaginary series.
√ while the second series represents the coefficient of −1 in this expression. Consequently, sn s0n − s00n also converges towards the limit zero for increasing values of n. Because sn s0n indefinitely approaches the limit ss0 , it is certainly necessary that the expression s00n , that is to say the sum of the first n terms of series (16), itself indefinitely approaches this last limit. It follows that: 1◦ series (16) is convergent; and 2◦ that this convergent series has as its sum ss0 .
9.2 On imaginary series ordered according to the ascending integer powers of a single variable. [239] Let x be an imaginary variable. Any imaginary series ordered according to the ascending integer powers of the variable x is of the form √ √ √ a1 + b1 −1 x, a2 + b2 −1 x2 , . . . , a0 + b0 −1, √ ..., an + bn −1 xn , . . . , where a0 , a1 , a2 , . . ., an , . . . and b0 , b1 , b2 , . . ., bn , . . . denote two sequences of constant quantities. In the case where the constants in the second sequence vanish, the preceding series reduces to (1)
a0 ,
a1 x,
a2 x2 ,
...,
an x n , . . . .
In this section, we consider in particular series of this last kind. If, for simplicity, we put √ x = z cos θ + −1 sin θ , (2) where z denotes a real variable and θ denotes a real arc, then series (1) becomes ( √ √ a0 , a1 z(cos θ + −1 sin θ ), a2 z2 (cos 2θ + −1 sin 2θ ), . . . , (3) √ . . . , an zn (cos nθ + −1 sin nθ ), . . . . Now, as in Chapter VI (§ IV), let A be the largest of the limits towards which the nth root of the numerical value of an converges as n increases indefinitely. Under the same hypothesis, the largest of the limits towards which the nth root of the modulus of the imaginary expression √ an xn = an zn cos nθ + −1 sin nθ converges is equivalent to the numerical value of the product Az.
9.2 On imaginary series of integer powers of a single variable.
189
Consequently (see above, § I, theorem II), series (3) is [240] convergent or divergent according to whether the product Az has a value less than or greater than 1. We deduce the following proposition immediately from this remark: Theorem I. — Series (3) is convergent for all values of z contained between the limits 1 1 and z = + , z=− A A and divergent for all values of z located outside these same limits. In other words, series (1) is convergent or divergent depending on whether the modulus of the imaginary expression x is less than or greater than A1 . a
Scholium. — When the numerical value of the ratio n+1 an converges to a fixed limit for increasing values of n, this limit is precisely the value of the positive quantity denoted by A. Corollary I. — In comparing the preceding theorem to theorem I of Chapter VI (§ IV), we recognize that if series (1) is convergent for a certain real value of the variable x, it remains convergent for every imaginary value that has this real value as its modulus, up to sign. Consequently, if series (1) is convergent for all real values of the variable x, it remains convergent for whatever imaginary value we may attribute to this variable. Corollary II. — To apply theorem I and the preceding corollary, consider the following four series: x2 ,
...,
xn ,
...,
...,
µ(µ−1)...(µ−n+1) n x , 1·2·3...n
...,
x2 , 1·2
...,
xn , 1·2·3...n
...,
x2 , 2
...,
(4)
1, x,
(5)
1,
µ(µ−1) 2 µ 1 x, 1·2 x ,
(6)
1,
x , 1 x,
−
(7)
±
xn , n
...,
[241] where in the second series µ denotes any quantity whatsoever. Of these four series, the first two, as well as the last, remain convergent for all real values of x contained between the limits x = −1
and x = +1,
and the third remains convergent for all real values of the variable x. However, instead of giving x a real value, if we suppose that √ x = z cos θ + −1 sin θ ,
then instead of these four series, we get the following ones: ( √ √ 1, z(cos θ + −1 sin θ ), z2 (cos 2θ + −1 sin 2θ ), . . . , (8) √ . . . , zn (cos nθ + −1 sin nθ ), . . . ; µ √ 1, 1 z(cos θ + −1 sin θ ), √ µ(µ−1) 2 (9) ..., 1·2 z (cos 2θ + −1 sin 2θ ), √ µ(µ−1)...(µ−n+1) n z (cos nθ + −1 sin nθ ), . . . ; 1·2·3...n √ √ z(cos θ + −1 sin θ ) z2 (cos 2θ + −1 sin 2θ ) 1, , , ..., 1 1·2 (10) √ zn (cos nθ + −1 sin nθ ) , . . . ; and ..., 1·2·3...n √ √ z(cos θ + −1 sin θ ) z2 (cos 2θ + −1 sin 2θ ) ,− , ..., 1 2 (11) √ zn (cos nθ + −1 sin nθ ) ..., ± , ..., n where the first two and the last one remain convergent for all values of z contained between the limits z = −1 and z = +1, while the remaining one is always convergent, whatever the real value of z may be. Having fixed the limits between which z must be contained in order to render series (3) convergent, we make the remark that, by virtue [242] of the principles established in the preceding section, theorems III, IV and V of Chapter VI (§ IV), with their corollaries, can be extended to the case where the variable x becomes imaginary. We need only assume that in the statement of theorem IV, each of the series a0 , a1 x, a2 x2 , . . . and b0 ,
b1 x,
b2 x2 ,
...
remains convergent when we reduce the terms not just to their numerical values but to their respective moduli. Given this, if we denote by ϖ (µ) what the right-hand side of equation (15) (Chapter VI § IV) becomes when we give to x the imaginary value √ z cos θ + −1 sin θ , or in other words, if we make (12)
√ µ z cos θ + −1 sin θ 1 √ µ(µ − 1) 2 + z cos 2θ + −1 sin 2θ + . . . , 1·2
ϖ(µ) = 1 +
we find, in place of formula (16) (Chapter VI, § IV), the following:
(13) ϖ(µ) ϖ(µ′) = ϖ(µ + µ′).
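An editorial aside, not in the original: a numerical check of formula (13) with values of our own choosing, multiplying truncated sums of series (9) for two exponents and comparing with the sum for their total.

import cmath

z, theta = 0.4, 0.8                       # a value of z between -1 and +1, as required
x = z * cmath.exp(1j * theta)             # z(cos theta + sqrt(-1) sin theta)

def varpi(mu, terms=80):
    total, coeff = 0j, 1.0
    for n in range(terms):
        total += coeff * x ** n
        coeff *= (mu - n) / (n + 1)       # next coefficient mu(mu-1)...(mu-n)/(n+1)!
    return total

mu, mu_prime = 0.7, -1.9
print(varpi(mu) * varpi(mu_prime), varpi(mu + mu_prime))   # agree up to rounding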
It is essential to remark that this last formula remains true only for values of z contained between the limits z = −1 and z = +1, and that between these limits, the imaginary function ϖ (µ), that is to say the sum of series (9), is at the same time continuous with respect to z and with respect to µ (see above, § I, theorem I). Imagine for the time being that instead of series (9) we consider more generally series (3), and that in this last series we make the value of z vary by insensible degrees. As long as series (3) is convergent, that is to say as long as the value of z remains contained between the limits −
1 A
and
1 + , A
the sum of the series is a continuous imaginary function of the [243] variable z. Let ϖ (z) be this continuous function. The equation √ √ ϖ (z) = a0 + a1 z cos θ + −1 sin θ + a2 z2 cos 2θ + −1 sin 2θ + . . . remains true for all values of z contained between the limits − A1 and + A1 , which we indicate by writing these limits beside the series,4 as we see here: √ ( ϖ (z) = a0 +a1 z cos θ + −1 sin θ √ (14) +a2 z2 cos 2θ + −1 sin 2θ + . . . z = − A1 , z = + A1 . We ought to observe that the preceding equation is always equivalent to two real equations. Indeed, if we set √ ϖ (z) = ϕ (z) + χ (z) −1, (15) where ϕ (z) and χ (z) denote two real functions, we get from equation (14) that ( ϕ (z) = a0 + a1 z cos θ + a2 z2 cos 2θ + . . . , (16)
χ (z) =
a1 z sin θ + a2 z2 sin 2θ + . . . z = − A1 , z = + A1 .
When series (3) is given, we can sometimes deduce the value of the function ϖ (x) in a finite form, and to do this is called summing the series. In § I, we have already resolved this question for series (8). We will now try to resolve it for series (9), (10) and (11), and as a consequence, we will treat the three problems that follow, one after another. 4
In [Cauchy 1821, p. 290 ff], the limits really are written beside the series. However, in [Cauchy 1897, p. 243 ff], they are written below. (tr.)
Problem I. — To find the sum of the series (9)
1,
µ 1 z(cos θ
√ + −1 sin θ ),
µ(µ−1) 2 1·2 z (cos 2θ
√ + −1 sin 2θ ), . . . ,
in the case where we attribute to the variable z a value contained between the limits z = −1 and
z = +1.
[244] Solution. — Let ϖ (µ) be the sum being sought. Let µ 0 denote a real quantity different from µ. We find (13) ϖ (µ) ϖ µ 0 = ϖ µ + µ 0 . The preceding equation, being similar to equation (2) of Chapter VIII (§ V), is solved in the same manner, and we thus conclude that √ ϖ (µ) = r µ cos µt + −1 sin µt , where the modulus r and the angle t are two quantities constant with respect to µ, but which necessarily depend on z and θ . Thus, between the limits z = −1 and z = +1, we have ( √ √ 2 1 + µ1 z cos θ + −1 sin θ + µ(µ−1) 1·2 z cos 2θ + −1 sin 2θ + . . . (17) √ = r µ cos µt + −1 sin µt . To determine the unknown values of r and t, we set µ = 1 in equation (17), and then we get √ √ 1 + z cos θ + z sin θ −1 = r cost + r sint −1, or what amounts to the same thing, 1 + z cos θ = r cost z sin θ = r sint. Consequently we find r = 1 + 2z cos θ + z2
21
.
Then, by observing that cost = 1+z rcos θ remains postive for every numerical value of z less than 1, and denoting by k any integer number, we also find t = arctan
z sin θ ± 2kπ. 1 + z cos θ
Given this, if for brevity we make (18)
s = arctan
z sin θ , 1 + z cos θ
[245] then equation (17) becomes √ √ 1 + µ z cos θ + −1 sin θ + µ(µ−1) z2 cos 2θ + −1 sin 2θ + . . . 1 1·2 1µ √ (19) = 1 + 2z cos θ + z2 2 cos µt + −1 sin µt (z = −1,
z = +1) ,
where the value of t is determined by the formula t = s ± 2kπ,
(20)
in which the integer number k depends only on the quantities z and θ . We remark now that, between the limits z = −1 and z = +1, the left-hand side of equation (19) is a continuous function of z that varies with z by insensible degrees, whatever the value of µ. The right-hand side of the equation thus ought to enjoy the same property. In other words, the quantities 1 + 2z cos θ + z2
µ2
cos µt, µ 1 + 2z cos θ + z2 2 sin µt,
and
and consequently cos µt
and
sin µt
ought to vary with z by insensible degrees for all possible values of µ. Now, this condition cannot be satisfied except in the case where t itself varies with z by insensible degrees. Indeed, if an infinitely small increase in z produces a finite increase in t in such a way as to change t into t + a, where a denotes a finite quantity, the sines and cosines of the two arcs µt
and
µ (t + a)
could not remain sensibly equal, except when the numerical value of the product µa is very close to a multiple of the [246] circumference, which cannot be true except for particular values of the coefficient µ, and not generally for any finite values of this coefficient. Thus we must conclude that the arc t = s ± 2kπ is a continuous function of z. Because the first of the two quantities s and k, determined by equation (18), varies with z in a continuous way between the limits z = −1 and z = +1, while the second, which must always be an integer, admits only finite variations that are multiples of 1, it is clear that to satisfy the stated condition, the quantity s must be the only one to vary and the quantity k must remain constant. Thus this last quantity is independent of z, and to know its value in all possible cases, it suffices to find it by supposing that z = 0. Because under this hypothesis, we have s = 0 and t = s ± 2kπ, we get √ 1 = cos (2kµπ) ± −1 sin (2kµπ) from equation (19), whatever the value of µ may be. Consequently
k = 0. Given this, in general, formula (20) gives t = s, and equation (19) is found to reduce to √ √ 1 + µ z cos θ + −1 sin θ + µ(µ−1) z2 cos 2θ + −1 sin 2θ + . . . 1 1·2 1µ √ (21) = 1 + 2z cos θ + z2 2 cos µs + −1 sin µs (z = −1,
z = +1) .
Moreover, if we consider formula (27) of Chapter VII (§ IV), we easily recognize that the right-hand side of equation (21) can be represented by the notation h iµ √ 1 + z cos θ + −1 sin θ . Thus, always supposing that the value of z is contained between the [247] limits +1 and −1, we have ( √ √ 2 1 + µ1 z cos θ + −1 sin θ + µ(µ−1) 1·2 z cos 2θ + −1 sin 2θ + . . . µ √ (22) = 1 + z cos θ + −1 sin θ (z = −1,
z = +1) .
In other words, equation (20) of Chapter VI (§ IV), namely 1+
µ (µ − 1) 2 µ x+ x + . . . = (1 + x)µ , 1 1·2
remains true not only if we attribute to the variable x real values contained between the limits −1 and +1 but also if we let √ x = z cos θ + −1 sin θ , the numerical value of z being less than 1. Corollary I. — Formula (21), as with all imaginary equations, is equivalent to two real equations, √ which we obtain by equating on both sides the real parts and the coefficients of −1. In this way we find 1 1+ µ z cos θ + µ(µ−1) z2 cos 2θ + . . . = 1 + 2z cos θ + z2 2 µ cos µs, 1 1·2 1 µ(µ−1) 2 µ 2 2 µ sin µs (23) 1 z sin θ + 1·2 z sin 2θ + . . . = 1 + 2z cos θ + z (z = −1,
z = +1) ,
9.2 On imaginary series of integer powers of a single variable.
195
where the value of s is still determined by equation (18). Corollary II. — If in formulas (22) and (23) we set µ = −1, and if we then replace z by −z, we get equations (8) and (10) of § I. Corollary III. — If we set θ = π2 , or what amounts to the same thing, cos θ = 0
and
sin θ = 1,
[248] the value of s given by formula (18) becomes s = arctan z, and remains contained between the limits − π4 and + π4 for any numerical value of z less than 1. Under the same hypothesis, we evidently have sin s z = tan s = cos and s µ 1 + 2z cos θ + z2 2 = (sec s)µ =
1 , (cos s)µ
and we get from equations (23), but only for values between the given limits, that µ(µ−1) µ µ−2 s sin2 s cos µs = cos s − 1·2 cos + µ(µ−1)(µ−2)(µ−3) cosµ−4 sin4 s − . . . , 1·2·3·4 sin µs = µ1 cosµ−1 s sin s (24) cosµ−3 s sin3 s + . . . − µ(µ−1)(µ−2) 1·2·3 s = − π4 , s = + π4 . Consequently, if in formulas (12) of Chapter VII (§ II), we replace the integer number m by any quantity µ, these formulas, which held for all possible real values of the arc z, are not generally true except for numerical values of this arc less than π4 . Problem II. — To find the sum of the series (10)
1,
z2 √ √ z cos θ + −1 sin θ , cos 2θ + −1 sin 2θ , . . . , 1 1·2
whatever the numerical value of z might be. Solution. — If in equations (18) and (21), we replace z by αz and µ by α1 , where α denotes an infinitely small quantity, we find that [249] for all values of αz contained between the limits −1 and +1, or what amounts to the same thing, for all values of z contained between the limits − α1 and + α1 ,
196
(25)
9 On convergent and divergent imaginary series.
√ 1 + 1z cos θ + −1 sin θ √ z2 + 1·2 cos 2θ + −1 sin 2θ (1 − α) √ z3 + 1·2·3 cos 3θ + −1 sin 3θ (1 − α) (1 − 2α) + . . . 1 √ = 1 + 2αz cos θ + α 2 z2 2α cos αs + −1 sin αs z = − α1 , z = + α1 ,
where the arc s is determined by the formula s = arctan
(26)
αz sin θ . 1 + αz cos θ
Now if we let the numerical value of α in equation (25) decrease indefinitely, then by passing to the limit we find z2 √ √ 1 + 1z cos θ + −1 sin θ + 1·2 cos 2θ + −1 sin 2θ √ z3 cos 3θ + −1 sin 3θ + . . . + 1·2·3 (27) 1 √ = lim 1 + 2αz cos θ + α 2 z2 2α cos αs + −1 sin αs (z = −∞,
z = +∞) .
It remains to find the limit of the product 1 + 2αz cos θ + α 2 z2
2α1
cos
s √ s + −1 sin , α α
and consequently, the limits of each of the quantities 1 + 2αz cos θ + α 2 z2
2α1
and
s . α
Now, in the first place, if we make 2αz cos θ + α 2 z2 = β , [250] we conclude that5 2
1 + 2αz cos θ
1
+ α 2 z2 2α
z cos θ + αz2 β = (1 + β )
and consequently
5
2
In [Cauchy 1897, p. 250] the numerator of the exponent on the right-hand side reads s cos θ + αs2 . In [Cauchy 1821, p. 299] the numerator is given correctly, with z in place of s. (tr.)
9.2 On imaginary series of integer powers of a single variable.
lim 1 + 2αz cos θ + α 2 z2
197 αz2 1 ilim z cos θ + 2
2α1
h = lim (1 + β ) β = ez cos θ .
Moreover, because the value of s given by equation (26) is infinitely small, the ratio
$\frac{s}{\tan s} = \frac{s}{\sin s}\,\cos s$
has 1 as its limit. Also, we get from equation (26) that
$\frac{\tan s}{\alpha} = \frac{z\sin\theta}{1+\alpha z\cos\theta}$ and $\frac{s}{\alpha} = \frac{s}{\tan s}\cdot\frac{z\sin\theta}{1+\alpha z\cos\theta}.$
Thus we find, by passing to the limit, that
$\lim\frac{s}{\alpha} = z\sin\theta.$
Given this, it is clear that the right-hand side of equation (25) has as its limit the imaginary expression
$e^{z\cos\theta}\left[\cos\left(z\sin\theta\right)+\sqrt{-1}\sin\left(z\sin\theta\right)\right],$
so that formula (27) becomes
(28) $1 + \frac{z}{1}\left(\cos\theta+\sqrt{-1}\sin\theta\right) + \frac{z^2}{1\cdot2}\left(\cos2\theta+\sqrt{-1}\sin2\theta\right) + \ldots = e^{z\cos\theta}\left[\cos\left(z\sin\theta\right)+\sqrt{-1}\sin\left(z\sin\theta\right)\right]$
(z = −∞, z = +∞).
The value of the real variable z is completely arbitrary because it can be chosen at will between the extreme values z = −∞ and z = +∞.
[251] Corollary I. — If, in comparing the two sides of equation (28), we equate: 1◦ the real parts; and 2◦ the coefficients of $\sqrt{-1}$, we obtain the two real equations
(29) $1 + \frac{z}{1}\cos\theta + \frac{z^2}{1\cdot2}\cos2\theta + \ldots = e^{z\cos\theta}\cos\left(z\sin\theta\right)$ and $\frac{z}{1}\sin\theta + \frac{z^2}{1\cdot2}\sin2\theta + \ldots = e^{z\cos\theta}\sin\left(z\sin\theta\right)$
(z = −∞, z = +∞).
Corollary II. — If we suppose that θ = π/2, or what amounts to the same thing, cos θ = 0 and sin θ = 1,
then equations (29) become
(30) $1 - \frac{z^2}{1\cdot2} + \frac{z^4}{1\cdot2\cdot3\cdot4} - \ldots = \cos z$ and $\frac{z}{1} - \frac{z^3}{1\cdot2\cdot3} + \ldots = \sin z$
(z = −∞, z = +∞).
These last equations, as well as equations (29), remain true for any real values of z, and it follows that the functions sin z and cos z can always be expanded into series ordered by the ascending powers of the variables they contain. As this proposition is noteworthy, I will prove it here directly.
Because the series
$1,\quad \frac{x}{1},\quad \frac{x^2}{1\cdot2},\quad\ldots$
is convergent for all possible real values of the variable x, it remains convergent (by virtue of theorem I, corollary I) for all imaginary values of the same variable. If we multiply the sum of this series by the sum of [252] the similar series
$1,\quad \frac{y}{1},\quad \frac{y^2}{1\cdot2},\quad\ldots,$
and we take into consideration both theorem II of § I and formula (6) of Chapter VIII (§ IV), we find that for all possible values, real and imaginary, attributed to x and y,
(31) $\left(1+\frac{x}{1}+\frac{x^2}{1\cdot2}+\ldots\right)\left(1+\frac{y}{1}+\frac{y^2}{1\cdot2}+\ldots\right) = 1+\frac{x+y}{1}+\frac{(x+y)^2}{1\cdot2}+\ldots.$
In the preceding equation, when we replace x by $x\sqrt{-1}$ and y by $y\sqrt{-1}$, we obtain the following
(32) $\left(1+\frac{x\sqrt{-1}}{1}-\frac{x^2}{1\cdot2}-\frac{x^3\sqrt{-1}}{1\cdot2\cdot3}+\ldots\right)\times\left(1+\frac{y\sqrt{-1}}{1}-\frac{y^2}{1\cdot2}-\frac{y^3\sqrt{-1}}{1\cdot2\cdot3}+\ldots\right) = 1+\frac{(x+y)\sqrt{-1}}{1}-\frac{(x+y)^2}{1\cdot2}-\ldots,$
in which we may, if we wish, assume that the variables x and y are real. Under this hypothesis, take
$\varpi(x) = 1+\frac{x\sqrt{-1}}{1}-\frac{x^2}{1\cdot2}-\frac{x^3\sqrt{-1}}{1\cdot2\cdot3}+\ldots.$
Equation (32) becomes
$\varpi(x)\,\varpi(y) = \varpi(x+y),$
and we conclude that (see Chapter VIII, § V, equation (12))
$\varpi(x) = A^x\left(\cos bx + \sqrt{-1}\sin bx\right),$
or what amounts to the same thing,
(33) $1+\frac{x\sqrt{-1}}{1}-\frac{x^2}{1\cdot2}-\frac{x^3\sqrt{-1}}{1\cdot2\cdot3}+\frac{x^4}{1\cdot2\cdot3\cdot4}+\ldots = A^x\left(\cos bx + \sqrt{-1}\sin bx\right)$
(x = −∞, x = +∞),
[253] where the letters A and b denote two unknown constants, the first one of which is necessarily positive. Consequently, we have
(34) $1-\frac{x^2}{1\cdot2}+\frac{x^4}{1\cdot2\cdot3\cdot4}-\ldots = A^x\cos bx$ and $\frac{x}{1}-\frac{x^3}{1\cdot2\cdot3}+\ldots = A^x\sin bx$
(x = −∞, x = +∞).
To determine the unknown constants A and b, it suffices to observe: 1◦ that formulas (34) must remain true when we change x to −x, and that to fulfill this condition it is necessary to suppose that $A^x = A^{-x}$, and consequently A = 1; and 2◦ that if we divide both sides of the second of formulas (34) by x, and then we let the variable x converge towards the limit zero, then the left-hand side converges towards the limit 1, and the right-hand side, namely
$A^x\,\frac{\sin bx}{x} = A^x\,\frac{\sin bx}{bx}\times b,$
converges towards the limit b. From this it follows that b = 1. Given this, formulas (33) and (34) become, respectively,
(35) $1+\frac{x\sqrt{-1}}{1}-\frac{x^2}{1\cdot2}-\frac{x^3\sqrt{-1}}{1\cdot2\cdot3}+\frac{x^4}{1\cdot2\cdot3\cdot4}+\ldots = \cos x + \sqrt{-1}\sin x$
(x = −∞, x = +∞)
and⁶
6 The denominator of the x² term is incorrectly given as 1 instead of 1·2 in [Cauchy 1821, p. 304; Cauchy 1897, p. 253]. (tr.)
(36) $1-\frac{x^2}{1\cdot2}+\frac{x^4}{1\cdot2\cdot3\cdot4}-\ldots = \cos x$ and $\frac{x}{1}-\frac{x^3}{1\cdot2\cdot3}+\ldots = \sin x$
(x = −∞, x = +∞).
[254] If in these last two formulas we replace the variable x with the variable z, we rediscover formulas (30). It is essential to observe that when we suppose that x = z sin θ, equation (35) gives the expansion of
$\cos\left(z\sin\theta\right)+\sqrt{-1}\sin\left(z\sin\theta\right)$
according to the ascending powers of z. If we multiply this expansion by that of $e^{z\cos\theta}$, and take into consideration formula (31), which remains true for all values, real and imaginary, of the variables it contains, then we get precisely equation (28).
Problem III. — To find the sum of the series
(11) $\frac{z}{1}\left(\cos\theta+\sqrt{-1}\sin\theta\right)-\frac{z^2}{2}\left(\cos2\theta+\sqrt{-1}\sin2\theta\right)+\frac{z^3}{3}\left(\cos3\theta+\sqrt{-1}\sin3\theta\right)-\ldots$
in the case where we attribute to the variable z a value contained between the limits z = −1 and z = +1.
Solution. — If we use the notation ln for the characteristic of the Napierian logarithms, then we have
$\left(1+2z\cos\theta+z^2\right)^{\frac{\mu}{2}} = e^{\frac{1}{2}\mu\ln\left(1+2z\cos\theta+z^2\right)},$
and consequently equation (21) can be put into the form
$1+\frac{\mu}{1}\,z\left(\cos\theta+\sqrt{-1}\sin\theta\right)+\frac{\mu(\mu-1)}{1\cdot2}\,z^2\left(\cos2\theta+\sqrt{-1}\sin2\theta\right)+\ldots = e^{\frac{1}{2}\mu\ln\left(1+2z\cos\theta+z^2\right)}\left(\cos\mu s+\sqrt{-1}\sin\mu s\right)$
(z = −1, z = +1),
[255] where the value of s is still given by formula (18). If we expand the two factors of the right-hand side of the preceding equation into convergent series ordered according to the ascending powers of µ, then, if we form the product of these two expansions with the aid of formula (31), we find
$1+\frac{\mu}{1}\,z\left(\cos\theta+\sqrt{-1}\sin\theta\right)+\frac{\mu(\mu-1)}{1\cdot2}\,z^2\left(\cos2\theta+\sqrt{-1}\sin2\theta\right)+\ldots$
$= 1+\frac{\mu}{1}\left[\frac{1}{2}\ln\left(1+2z\cos\theta+z^2\right)+s\sqrt{-1}\right]+\frac{\mu^2}{1\cdot2}\left[\frac{1}{2}\ln\left(1+2z\cos\theta+z^2\right)+s\sqrt{-1}\right]^2+\ldots$
(z = −1, z = +1).
Finally, after subtracting 1 from both sides, then dividing them by µ, if we let µ converge towards the limit zero, we obtain the equation
(37) $\frac{z}{1}\left(\cos\theta+\sqrt{-1}\sin\theta\right)-\frac{z^2}{2}\left(\cos2\theta+\sqrt{-1}\sin2\theta\right)+\ldots = \frac{1}{2}\ln\left(1+2z\cos\theta+z^2\right)+s\sqrt{-1}$
(z = −1, z = +1).
Corollary I. — If in the two sides of equation (37), we equate: 1◦ the real parts; and 2◦ the coefficients of $\sqrt{-1}$, then if we substitute for s its value determined by formula (18), we obtain the two real equations
(38) $\frac{z}{1}\cos\theta-\frac{z^2}{2}\cos2\theta+\frac{z^3}{3}\cos3\theta-\ldots = \frac{1}{2}\ln\left(1+2z\cos\theta+z^2\right)$ and $\frac{z}{1}\sin\theta-\frac{z^2}{2}\sin2\theta+\frac{z^3}{3}\sin3\theta-\ldots = \arctan\frac{z\sin\theta}{1+z\cos\theta}$
(z = −1, z = +1).
[256] Corollary II. — If we suppose that θ = π/2, or what amounts to the same thing, cos θ = 0 and sin θ = 1, the second of equations (38) becomes
(39) $z-\frac{z^3}{3}+\frac{z^5}{5}-\ldots = \arctan z$
(z = −1, z = +1).
The series that forms the left-hand side of this last equation is convergent, not only for any numerical value of z less than 1, but also when we suppose that z = 1 (see Chapter VI, § III, theorem III), and as a result the equation remains true in this last hypothesis. Moreover, because we have
$\arctan(1) = \frac{\pi}{4},$
we conclude that
(40) $1-\frac{1}{3}+\frac{1}{5}-\ldots = \frac{\pi}{4}.$
Formula (40)⁷ can be used to calculate an approximation of the value of π, that is to say the ratio of the circumference to the diameter.
7 This series was originally discovered independently by James Gregory (1638–1675) and Gottfried Wilhelm von Leibniz (1646–1716).
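The slow convergence of formula (40) is easy to see numerically. The following Python snippet (a modern aside for illustration, not part of Cauchy's text) sums the first N terms and compares them with π/4; by the alternating-series bound, the error after N terms is at most 1/(2N + 1), which is why the series, though historically important, is a poor practical route to π.

```python
from math import pi

def leibniz_partial_sum(n_terms):
    """Partial sum of the series 1 - 1/3 + 1/5 - ... of formula (40)."""
    return sum((-1) ** k / (2 * k + 1) for k in range(n_terms))

for n in (10, 1_000, 100_000):
    s = leibniz_partial_sum(n)
    print(f"{n:>7} terms: {s:.10f}   error = {abs(s - pi / 4):.2e}")
```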
9.3 Notations used to represent various imaginary functions which arise from the summation of convergent series. Properties of these same functions.
Consider the six notations
$A^x$, sin x, cos x, log x, arcsin x and arccos x.
As we know, if we give the variable x a real value, these six notations represent as many real functions of x, which, taken two by two, are inverses of each other, that is to say [257] given by inverse operations, provided, however, that, where A denotes a number, log expresses the characteristic of the logarithms in the system for which the base is A. It remains to clarify the sense of these same notations in the case where the variable x becomes imaginary. We will do this here, starting with the first three. We have proved that in the case where the variable x is taken to be real, the three functions represented by $A^x$, sin x and cos x can always be expanded into series ordered according to the ascending integer powers of this variable. Indeed, under this hypothesis we have
(1) $A^x = 1+\frac{x\ln A}{1}+\frac{x^2(\ln A)^2}{1\cdot2}+\frac{x^3(\ln A)^3}{1\cdot2\cdot3}+\ldots,$
$\cos x = 1-\frac{x^2}{1\cdot2}+\frac{x^4}{1\cdot2\cdot3\cdot4}-\ldots,$
$\sin x = \frac{x}{1}-\frac{x^3}{1\cdot2\cdot3}+\ldots,$
where the characteristic ln denotes a Napierian logarithm. Moreover (by virtue of theorem I, corollary I, § II), the above series remain convergent for all values, real and imaginary, of the variable x, so we agree to extend equations (1) to all possible cases and to consider them as clarifying the meanings of the three notations $A^x$, sin x and cos x, even when the variable becomes imaginary. Now we observe that if we make A = e in the first of equations (1), where e denotes the base of the Napierian logarithms, then we find that
(2) $e^x = 1+\frac{x}{1}+\frac{x^2}{1\cdot2}+\ldots.$
[258] Then in place of x, we can successively write $x\ln A$, $x\sqrt{-1}$ and $-x\sqrt{-1}$ to get
(3) $e^{x\ln A} = 1+\frac{x\ln A}{1}+\frac{x^2(\ln A)^2}{1\cdot2}+\frac{x^3(\ln A)^3}{1\cdot2\cdot3}+\ldots,$
$e^{x\sqrt{-1}} = 1+\frac{x}{1}\sqrt{-1}-\frac{x^2}{1\cdot2}-\frac{x^3}{1\cdot2\cdot3}\sqrt{-1}+\ldots,$
$e^{-x\sqrt{-1}} = 1-\frac{x}{1}\sqrt{-1}-\frac{x^2}{1\cdot2}+\frac{x^3}{1\cdot2\cdot3}\sqrt{-1}+\ldots.$
As a consequence we have⁸
(4) $e^{x\ln A} = A^x,$ $e^{x\sqrt{-1}} = \cos x+\sqrt{-1}\sin x,$ $e^{-x\sqrt{-1}} = \cos x-\sqrt{-1}\sin x,$
where the variable x may be either real or imaginary. Moreover, whatever x and y may be, equation (31) (§ II) gives
(5) $e^x e^y = e^{x+y}.$
Given this, it becomes easy to find in finite form the values of $A^x$, sin x and cos x corresponding to the imaginary values of the variable x. Indeed, if we suppose that
(6) $x = \alpha+\beta\sqrt{-1},$
where α and β represent real quantities, then we conclude from the first two of equations (4) together with equation (5) that
(7) $A^x = e^{x\ln A} = e^{(\alpha+\beta\sqrt{-1})\ln A} = e^{\alpha\ln A}\,e^{\beta\ln(A)\sqrt{-1}} = A^{\alpha}\left[\cos\left(\beta\ln A\right)+\sqrt{-1}\sin\left(\beta\ln A\right)\right].$
From the last two of equations (4), we conclude that
(8) $\cos x = \frac{e^{x\sqrt{-1}}+e^{-x\sqrt{-1}}}{2},$ $\sin x = \frac{e^{x\sqrt{-1}}-e^{-x\sqrt{-1}}}{2\sqrt{-1}}.$
Then, by substituting the value $\alpha+\beta\sqrt{-1}$ for x and expanding the [259] right-hand sides, we get
(9) $\cos x = \frac{e^{\beta}+e^{-\beta}}{2}\cos\alpha-\frac{e^{\beta}-e^{-\beta}}{2}\sin\alpha\,\sqrt{-1},$
$\sin x = \frac{e^{\beta}+e^{-\beta}}{2}\sin\alpha+\frac{e^{\beta}-e^{-\beta}}{2}\cos\alpha\,\sqrt{-1} = \cos\left(\frac{\pi}{2}-\alpha-\beta\sqrt{-1}\right).$
And so, under the given hypotheses, the three notations
8
The second of formulas (4) is Euler’s Identity, extended to complex numbers.
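Formulas (4), (7) and (9) lend themselves to a quick numerical check with Python's cmath module. The snippet below is only a modern illustration (it writes 1j for √−1, and the sample values of x and A are arbitrary choices of mine); it is not part of the translated text.

```python
import cmath, math

x = 0.7 + 1.3j          # an arbitrary imaginary (complex) value of the variable x
A = 2.5                 # an arbitrary number A

# Second of formulas (4): e^{x sqrt(-1)} = cos x + sqrt(-1) sin x
print(abs(cmath.exp(1j * x) - (cmath.cos(x) + 1j * cmath.sin(x))))   # ~ 0

# Formula (7): A^x = A^alpha [cos(beta ln A) + sqrt(-1) sin(beta ln A)], x = alpha + beta sqrt(-1)
alpha, beta = x.real, x.imag
rhs = A ** alpha * (math.cos(beta * math.log(A)) + 1j * math.sin(beta * math.log(A)))
print(abs(A ** x - rhs))                                             # ~ 0

# First of formulas (9): cos(alpha + beta sqrt(-1)) expressed through real functions
rhs = ((math.exp(beta) + math.exp(-beta)) / 2) * math.cos(alpha) \
      - 1j * ((math.exp(beta) - math.exp(-beta)) / 2) * math.sin(alpha)
print(abs(cmath.cos(x) - rhs))                                       # ~ 0
```

Each printed difference should be of the order of machine precision.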
$A^x$, sin x and cos x,
respectively, denote the three imaginary expressions
$A^{\alpha}\left[\cos\left(\beta\ln A\right)+\sqrt{-1}\sin\left(\beta\ln A\right)\right],$
$\frac{e^{\beta}+e^{-\beta}}{2}\sin\alpha+\frac{e^{\beta}-e^{-\beta}}{2}\cos\alpha\,\sqrt{-1}$ and
$\frac{e^{\beta}+e^{-\beta}}{2}\cos\alpha-\frac{e^{\beta}-e^{-\beta}}{2}\sin\alpha\,\sqrt{-1}.$
Under the same hypothesis, if we make A = e, then equation (7) gives the following value
$e^{\alpha}\left(\cos\beta+\sqrt{-1}\sin\beta\right)$
for the notation $e^x$. Now that we have determined the values of the three functions $A^x$,
sin x
and
cos x
in the case where the variable x becomes imaginary, we still have to look for which definitions to give in the same case for the inverse functions log x,
arcsin x
and
arccos x,
[260] or more generally, what meaning to give to the notations log ((x)) ,
arcsin ((x))
and
arccos ((x)) .
We continue to suppose that
$x = \alpha+\beta\sqrt{-1} = \rho\left(\cos\theta+\sqrt{-1}\sin\theta\right),$
where α and β denote two real quantities which can be replaced by the modulus ρ and the real arc θ. Every imaginary expression $u+v\sqrt{-1}$ that satisfies the equation
(10) $A^{u+v\sqrt{-1}} = \alpha+\beta\sqrt{-1} = x$
is what we call an imaginary logarithm of x taken in the system where √ the base is A. As we will see below, equation (10) gives several values of u + v −1, even in the case where β is zero. It follows that any expression, imaginary or real, has several imaginary logarithms. Whenever we wish to designate indistinctly any one of these logarithms (among which we ought to include the real one, if there is one),
we use the characteristic log or ln followed by double parentheses, taking care to state the base of the system in the narrative. We prefer to use the characteristic ln when it is a question of Napierian logarithms taken in the system for which the base is e. By virtue of these conventions, the various logarithms of the real and imaginary quantities
$1,\quad -1,\quad \alpha+\beta\sqrt{-1}\quad\text{and}\quad x$
are respectively denoted, in the system for which the base is A, by
$\log((1)),\quad \log((-1)),\quad \log\left(\left(\alpha+\beta\sqrt{-1}\right)\right)\quad\text{and}\quad \log((x)),$
and in the Napierian system for which the base is e by
$\ln((1)),\quad \ln((-1)),\quad \ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right)\quad\text{and}\quad \ln((x)).$
Given this, to determine these various logarithms it suffices to solve the following problems.⁹
[261] Problem I. — To find the various values, real and imaginary, of the expression $\ln((1))$.
Solution. — Let $u+v\sqrt{-1}$ be one of these values, where u and v denote two real quantities. From the definition itself of the expression $\ln((1))$, we have
(11) $e^{u+v\sqrt{-1}} = 1,$
or what amounts to the same thing,
$e^u\left(\cos v+\sqrt{-1}\sin v\right) = 1.$
From this last equation we get $e^u = 1$ and $\cos v+\sqrt{-1}\sin v = 1$, and consequently
$u = 0,\quad \cos v = 1,\quad \sin v = 0\quad\text{and}\quad v = \pm 2k\pi,$
where k represents any integer number. With the quantities u and v determined in this way, the various values of $u+v\sqrt{-1}$ satisfying equation (11) are evidently contained in the formula
9 Euler was the first to resolve these problems; see [Euler 1751].
$u+v\sqrt{-1} = \pm 2k\pi\sqrt{-1}.$
In other words, the various values of $\ln((1))$ are given by the equation
(12) $\ln((1)) = \pm 2k\pi\sqrt{-1}.$
Only one of these values is real, namely the one that we obtain by setting k = 0, which reduces the value itself to zero. To represent this real value, we commonly use the simple notation ln (1) or ln 1. There are evidently an infinite number of imaginary values of $\ln((1))$.
[262] Problem II. — To find the various values of the expression $\ln((-1))$.
Solution. — Let $u+v\sqrt{-1}$ be one of these values, where u and v denote two real quantities. From the definition itself of the expression $\ln((-1))$, we have
(13) $e^{u+v\sqrt{-1}} = -1,$
or what amounts to the same thing,
$e^u\left(\cos v+\sqrt{-1}\sin v\right) = -1.$
From this last equation we get $e^u = 1$ and $\cos v+\sqrt{-1}\sin v = -1$, and consequently
$u = 0,\quad \cos v = -1,\quad \sin v = 0\quad\text{and}\quad v = \pm(2k+1)\pi,$
where k represents any integer number. With the quantities u and v determined in this way, the various values of $u+v\sqrt{-1}$ satisfying equation (13) are evidently contained in the formula
$u+v\sqrt{-1} = \pm(2k+1)\pi\sqrt{-1}.$
In other words, the various values of $\ln((-1))$ are given by the equation
(14) $\ln((-1)) = \pm(2k+1)\pi\sqrt{-1}.$
Consequently, there are infinitely many such values and they are all imaginary.
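Problems I and II have an immediate numerical restatement, offered here as a modern gloss rather than as part of Cauchy's argument: every value ±2kπ√−1 exponentiates back to 1 and every value ±(2k+1)π√−1 exponentiates back to −1, while Python's cmath.log returns only one value of each family (the principal one).

```python
import cmath
from math import pi

# Formula (12): every value +/- 2k*pi*sqrt(-1) of ln((1)) satisfies e^value = 1.
for k in range(4):
    for sign in (+1, -1):
        assert abs(cmath.exp(sign * 2 * k * pi * 1j) - 1) < 1e-9

# Formula (14): every value +/- (2k+1)*pi*sqrt(-1) of ln((-1)) satisfies e^value = -1.
for k in range(4):
    for sign in (+1, -1):
        assert abs(cmath.exp(sign * (2 * k + 1) * pi * 1j) + 1) < 1e-9

print(cmath.log(1))    # 0j: the single real value, written ln 1
print(cmath.log(-1))   # 3.141592653589793j: one value of ln((-1)); the others differ by 2k*pi*1j
```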
Problem III. — To find the various values of the expression $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right)$.
Solution. — Let $u+v\sqrt{-1}$ be one of these values. From the [263] definition itself of the expression $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right)$ we have
(15) $e^{u+v\sqrt{-1}} = \alpha+\beta\sqrt{-1} = \rho\left(\cos\theta+\sqrt{-1}\sin\theta\right),$
or what amounts to the same thing,
$e^u\left(\cos v+\sqrt{-1}\sin v\right) = \rho\left(\cos\theta+\sqrt{-1}\sin\theta\right),$
where ρ denotes the modulus of $\alpha+\beta\sqrt{-1}$. From the preceding equation, we get $e^u = \rho$ and $\cos v+\sqrt{-1}\sin v = \cos\theta+\sqrt{-1}\sin\theta$, and consequently,
$u = \ln(\rho),\quad \cos v = \cos\theta,\quad \sin v = \sin\theta\quad\text{and}\quad v = \theta\pm 2k\pi,$
where k represents any integer number. With the quantities u and v determined in this way, the various values of $u+v\sqrt{-1}$ are contained in the formula
$u+v\sqrt{-1} = \ln(\rho)+\theta\sqrt{-1}\pm 2k\pi\sqrt{-1}.$
In other words, the various values of $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right)$ are given by the equation
(16) $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right) = \ln(\rho)+\theta\sqrt{-1}+\ln((1)).$
It is worth observing that in this last equation, the value of ρ is completely determined and is equal to $\sqrt{\alpha^2+\beta^2}$, while θ can be any arc which has $\frac{\alpha}{\sqrt{\alpha^2+\beta^2}}$ as its cosine and $\frac{\beta}{\sqrt{\alpha^2+\beta^2}}$ as its sine.
[264] Corollary I. — If we make
(17) $\zeta = \arctan\frac{\beta}{\alpha}$
for greater convenience, then it is easy to substitute the arc ζ in place of the arc θ in formula (16). Indeed, we may suppose that θ = ζ if α is positive, and θ = ζ + π if α is negative. In the first case, we find that
(18) $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right) = \ln(\rho)+\zeta\sqrt{-1}+\ln((1)),$
and in the second case that
(19) $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right) = \ln(\rho)+\zeta\sqrt{-1}+\pi\sqrt{-1}+\ln((1)).$
In particular, if in this last equation we make $\alpha+\beta\sqrt{-1} = -1$, that is to say α = −1 and β = 0, and consequently ρ = 1 and ζ = 0, we obtain
(20) $\ln((-1)) = \pi\sqrt{-1}+\ln((1)).$
In general, it follows that for negative values of α we have
(21) $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right) = \ln(\rho)+\zeta\sqrt{-1}+\ln((-1)).$
Now suppose that we substitute the values $\left(\alpha^2+\beta^2\right)^{\frac{1}{2}}$ and $\arctan\frac{\beta}{\alpha}$ for ρ and ζ in formulas (18) and (21). We find for the various values of $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right)$: [265] 1◦ if α is positive, that
(22) $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right) = \frac{1}{2}\ln\left(\alpha^2+\beta^2\right)+\arctan\frac{\beta}{\alpha}\,\sqrt{-1}+\ln((1));$
and 2◦ if α is negative, that
(23) $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right) = \frac{1}{2}\ln\left(\alpha^2+\beta^2\right)+\arctan\frac{\beta}{\alpha}\,\sqrt{-1}+\ln((-1)).$
Corollary II. — If we suppose that β = 0 in equations (22) and (23), then for positive values of α we have
(24) $\ln((\alpha)) = \ln(\alpha)+\ln((1)) = \ln(\alpha)\pm 2k\pi\sqrt{-1},$
and for negative values of α we have
(25) $\ln((\alpha)) = \ln(-\alpha)+\ln((-1)) = \ln(-\alpha)\pm(2k+1)\pi\sqrt{-1},$
where k as always is an integer number. It follows from these last formulas that a real quantity α has an infinity of imaginary logarithms, among which, in the case where α is positive, we find just one real logarithm. We obtain this real logarithm, denoted by the simple notation ln (α) or ln α, by setting k = 0 in equation (24).
Scholium I. — Among the various values of $\ln((1))$, as we have just remarked, there is one that is equal to zero, and which we indicate by the notation ln (1) or ln 1, making use of the simple parentheses or suppressing them altogether. If we substitute this particular value in equation (22), we obtain a corresponding value of $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right)$, which analogy leads us to indicate, with the aid of simple parentheses, by the notation $\ln\left(\alpha+\beta\sqrt{-1}\right)$. We will do so from now on. Consequently, supposing that α is positive, we have
(26) $\ln\left(\alpha+\beta\sqrt{-1}\right) = \frac{1}{2}\ln\left(\alpha^2+\beta^2\right)+\arctan\frac{\beta}{\alpha}\,\sqrt{-1}.$
[266] On the other hand, if α becomes negative, then −α is positive and we find that
$\ln\left(-\alpha-\beta\sqrt{-1}\right) = \frac{1}{2}\ln\left(\alpha^2+\beta^2\right)+\arctan\frac{-\beta}{-\alpha}\,\sqrt{-1},$
or what amounts to the same thing,
(27) $\ln\left(-\alpha-\beta\sqrt{-1}\right) = \frac{1}{2}\ln\left(\alpha^2+\beta^2\right)+\arctan\frac{\beta}{\alpha}\,\sqrt{-1}.$
By making use of the preceding notations, we can reduce equations (22) and (23) to the following
(28) $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right) = \ln\left(\alpha+\beta\sqrt{-1}\right)+\ln((1))$ and
(29) $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right) = \ln\left(-\alpha-\beta\sqrt{-1}\right)+\ln((-1)),$
where the first equation applies for positive values of α, while the second applies for negative values of that quantity. In other words, depending on whether the real part of an imaginary expression x is positive or negative, we have
(30) $\ln((x)) = \ln(x)+\ln((1))$
or else
(31) $\ln((x)) = \ln(-x)+\ln((-1)).$
To summarize what we have just said, we see that the notation ln(x) has a precise meaning determined by equation (26) only in the first case, where the real part of the imaginary expression x is positive, while in all possible cases the notation $\ln((x))$ has infinitely many values determined by one of equations (28) or (29).
[267] Problem IV. — To find the various values of the expression $\log\left(\left(\alpha+\beta\sqrt{-1}\right)\right)$, where the characteristic log indicates a logarithm taken in the system where the base is A.
Solution. — Let $u+v\sqrt{-1}$ still denote one of the values of the expression we are considering. From the definition itself of this expression, we have
(32) $A^{u+v\sqrt{-1}} = \alpha+\beta\sqrt{-1},$
or what amounts to the same thing,
$e^{\left(u+v\sqrt{-1}\right)\ln A} = \alpha+\beta\sqrt{-1},$
where ln is the characteristic of the Napierian logarithms. Then we conclude that
$\left(u+v\sqrt{-1}\right)\ln A = \ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right),$
and consequently
$u+v\sqrt{-1} = \frac{\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right)}{\ln A},$
or in other words,
(33) $\log\left(\left(\alpha+\beta\sqrt{-1}\right)\right) = \frac{\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right)}{\ln A}.$
This last equation remains true in the case where β vanishes, that is to say when the imaginary expression $\alpha+\beta\sqrt{-1}$ reduces to a real quantity.
Scholium. — If we suppose that the quantity α is positive, then the particular value of $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right)$ represented by $\ln\left(\alpha+\beta\sqrt{-1}\right)$ corresponds to a particular value of $\log\left(\left(\alpha+\beta\sqrt{-1}\right)\right)$, which analogy leads us to indicate with the aid of simple parentheses by the notation $\log\left(\alpha+\beta\sqrt{-1}\right)$. [268] Given this, for positive values of α we have
(34) $\log\left(\alpha+\beta\sqrt{-1}\right) = \frac{\ln\left(\alpha+\beta\sqrt{-1}\right)}{\ln A} = \frac{1}{2}\log\left(\alpha^2+\beta^2\right)+\frac{\arctan\frac{\beta}{\alpha}}{\ln A}\,\sqrt{-1}.$
Moreover, if in equation (33) we substitute for $\ln\left(\left(\alpha+\beta\sqrt{-1}\right)\right)$ its value given successively in formulas (28) and (29), we find that for positive values of the quantity α,
(35) $\log\left(\left(\alpha+\beta\sqrt{-1}\right)\right) = \frac{\ln\left(\alpha+\beta\sqrt{-1}\right)}{\ln A}+\frac{\ln((1))}{\ln A} = \log\left(\alpha+\beta\sqrt{-1}\right)+\log((1)),$
and for negative values of the same quantity,¹⁰
(36) $\log\left(\left(\alpha+\beta\sqrt{-1}\right)\right) = \frac{\ln\left(-\alpha-\beta\sqrt{-1}\right)}{\ln A}+\frac{\ln((-1))}{\ln A} = \log\left(-\alpha-\beta\sqrt{-1}\right)+\log((-1)).$
In other words, according to whether the real part of an imaginary expression x is positive or negative, we have, respectively,
(37) $\log((x)) = \log(x)+\log((1)) = \log(x)\pm\frac{2k\pi\sqrt{-1}}{\ln A}$
or else
(38) $\log((x)) = \log(-x)+\log((-1)) = \log(-x)\pm\frac{(2k+1)\pi\sqrt{-1}}{\ln A},$
10 In [Cauchy 1821, p. 323, Cauchy 1897, p. 268], single parentheses were used on the left-hand side of equation (36). (tr.)
where k denotes any integer number. We can add that of the two preceding formulas, the first remains true for all positive real values of x and the second for all negative real values of the same variable.
After having calculated the various logarithms of the imaginary expression $x = \alpha+\beta\sqrt{-1}$, we propose to find the imaginary arcs for which the cosine is equal to x. If we denote any one of these arcs by
$\arccos((x)) = u+v\sqrt{-1},$
[269] then to determine $u+v\sqrt{-1}$, we have the equation
$\cos\left(u+v\sqrt{-1}\right) = \alpha+\beta\sqrt{-1},$
or what amounts to the same thing,
(39) $\frac{e^{v}+e^{-v}}{2}\cos u-\frac{e^{v}-e^{-v}}{2}\sin u\,\sqrt{-1} = \alpha+\beta\sqrt{-1}.$
This separates into two other equations, namely
(40) $\frac{e^{v}+e^{-v}}{2}\cos u = \alpha$ and $\frac{e^{v}-e^{-v}}{2}\sin u = -\beta.$
For these last two equations, we can substitute the equivalent system of two formulas
(41) $e^{v} = \frac{\alpha}{\cos u}-\frac{\beta}{\sin u}$ and $e^{-v} = \frac{\alpha}{\cos u}+\frac{\beta}{\sin u}.$
Moreover, if we eliminate v from formulas (41), it follows that
$\frac{\alpha^2}{\cos^2 u}-\frac{\beta^2}{\sin^2 u} = 1$ and $\sin^4 u-\left(1-\alpha^2-\beta^2\right)\sin^2 u-\beta^2 = 0.$
Then, by observing that $\sin^2 u$ is necessarily a positive quantity, we have
$\sin^2 u = \frac{1-\alpha^2-\beta^2}{2}+\sqrt{\left(\frac{1-\alpha^2-\beta^2}{2}\right)^2+\beta^2}.$
Consequently we have
$\cos^2 u = \frac{1+\alpha^2+\beta^2}{2}-\sqrt{\left(\frac{1+\alpha^2+\beta^2}{2}\right)^2-\alpha^2} = \frac{\alpha^2}{\frac{1+\alpha^2+\beta^2}{2}+\sqrt{\left(\frac{1+\alpha^2+\beta^2}{2}\right)^2-\alpha^2}},$
and because (by virtue of the first of equations (40)) cos u and α [270] must have the same sign, we have, by extracting square roots,
(42) $\cos u = \frac{\alpha}{\left[\frac{1+\alpha^2+\beta^2}{2}+\sqrt{\left(\frac{1+\alpha^2+\beta^2}{2}\right)^2-\alpha^2}\,\right]^{\frac{1}{2}}}.$
Given this, if for convenience we make
(43) $U = \arccos\frac{\alpha}{\left[\frac{1+\alpha^2+\beta^2}{2}+\sqrt{\left(\frac{1+\alpha^2+\beta^2}{2}\right)^2-\alpha^2}\,\right]^{\frac{1}{2}}}$ and $V = \ln\left(\frac{\alpha}{\cos U}-\frac{\beta}{\sin U}\right),$
we conclude from equations (41) and (42) that
(44) $u = \pm U\pm 2k\pi$ and $v = \pm V,$
where k denotes any integer number and the two letters U and V must have the same sign. Thus, we finally have
(45) $\arccos((x)) = \pm 2k\pi\pm\left(U+V\sqrt{-1}\right).$
Among the various values of $\arccos((x))$ given by the preceding equation, the simplest is the one obtained by setting k = 0 in the first term of the right-hand side, and giving a + sign to the other term. We denote this particular value with the aid of simple parentheses, and consequently we write
$\arccos(x) = U+V\sqrt{-1},$
or even, by suppressing the parentheses entirely,
(46) $\arccos x = U+V\sqrt{-1}.$
In the particular case where β is zero, the quantity α remains contained between the limits −1 and +1, and formula (46) reduces, as we [271] should expect, to the identity arccos α = arccos α. On the other hand, if we note that ±2kπ represents any of the arcs that have 1 for their cosines, we recognize that equation (45) can be put into the form
(47) $\arccos((x)) = \pm\arccos x+\arccos((1)).$
Yet, it is essential to remark that in the case where we suppose that β = 0 and the numerical value of α is greater than 1, the expression arccos α always takes an imaginary value. This imaginary value is given by the equation
(48) $\arccos\alpha = \ln(\alpha)\sqrt{-1}$
if α is positive, and by
(49) $\arccos\alpha = \pi+\ln(-\alpha)\sqrt{-1} = \left[\ln(-\alpha)-\pi\sqrt{-1}\right]\sqrt{-1}$
if α is negative.
Now consider the imaginary arcs for which the sine is $x = \alpha+\beta\sqrt{-1}$. If we denote any one of these arcs by
$\arcsin((x)) = u+v\sqrt{-1},$
then by taking into consideration the second of equations (9), we find
$x = \sin\left(u+v\sqrt{-1}\right) = \cos\left(\frac{\pi}{2}-u-v\sqrt{-1}\right),$
and we conclude
(50) $\arcsin((x)) = u+v\sqrt{-1} = \frac{\pi}{2}-\arccos((x)).$
In the previous formula, if we substitute the various values of $\arccos((x))$, one of which is distinguished by the notation arccos (x) or arccos x, we obtain the various values of $\arcsin((x))$, one of which [272] is distinguished by the notation arcsin (x) or arcsin x, and determined by the equation
(51) $\arcsin x = \frac{\pi}{2}-\arccos x.$
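Equation (51) also holds for the principal branches implemented in Python's cmath, which gives a quick modern spot check (the sample points below are arbitrary choices of mine, and the check does not reproduce Cauchy's determination of U and V):

```python
import cmath
from math import pi

# Equation (51): arcsin x = pi/2 - arccos x, checked for cmath's principal values.
for x in (0.3, 2.0, -1.7, 0.5 + 0.8j, -2.0 + 0.1j):
    x = complex(x)
    assert abs(cmath.asin(x) - (pi / 2 - cmath.acos(x))) < 1e-12
    # In each case cos(arccos x) and sin(arcsin x) reproduce x:
    assert abs(cmath.cos(cmath.acos(x)) - x) < 1e-12
    assert abs(cmath.sin(cmath.asin(x)) - x) < 1e-12
print("all identities verified")
```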
With the aid of the principles that we have just established, it is easy to recognize the most essential properties that are enjoyed by those functions of the imaginary variable x represented by the notations Ax ,
cos x,
sin x,
log x, arccos x and arcsin x. To obtain these properties, it suffices to extend the formulas that these functions satisfy in the case where the variable x is real to the case where the variable becomes imaginary. This extension is ordinarily carried out without difficulty for each of the
three functions Ax ,
cos x
and
sin x.
Thus, for example, if A, B, C, . . . denote several numbers, we can easily prove that the equations
(52) $A^xA^yA^z\ldots = A^{x+y+z+\ldots}$ and $A^xB^xC^x\ldots = (ABC\ldots)^x$
and
(53) $\cos(x+y) = \cos x\cos y-\sin x\sin y$ and $\sin(x+y) = \sin x\cos y+\sin y\cos x$
remain equally true for any values, real or imaginary, of the variables x, y, z, . . .. But if we consider formulas that involve the inverse functions log x,
arccos x
or
arcsin x,
we usually find that these formulas, extended to the case where the variables become imaginary, remain true only with considerable restrictions, and only for certain values of the variables that they involve. For example, if we make
$x = \alpha+\beta\sqrt{-1},\quad y = \alpha'+\beta'\sqrt{-1},\quad z = \alpha''+\beta''\sqrt{-1},\quad\ldots,$
[273] and if we denote by µ any real quantity, we recognize that the formula
(54) $\log(x)+\log(y)+\log(z)+\ldots = \log(xyz\ldots)$
remains true only in the case where α, α′, α″, . . . are positive and the sum
$\arctan\frac{\beta}{\alpha}+\arctan\frac{\beta'}{\alpha'}+\arctan\frac{\beta''}{\alpha''}+\ldots$
remains contained between the limits −π/2 and +π/2. The formula
(55) $\log\left(x^{\mu}\right) = \mu\log(x)$
remains true only in the case where α is positive and the product $\mu\arctan\frac{\beta}{\alpha}$ remains contained between the same limits.
Chapter 10
On real or imaginary roots of algebraic equations for which the left-hand side is a rational and integer function of one variable. The solution of equations of this kind by algebra or trigonometry.
10.1 We can satisfy any equation for which the left-hand side is a rational and integer function of the variable x by real or imaginary values of that variable. Decomposition of polynomials into factors of the first and second degree. Geometric representation of real factors of the second degree. [274] Consider an algebraic equation for which the left-hand side is a rational and integer function of the variable x. Such an equation can be put into the form (1)
a0 xn + a1 xn−1 + a2 xn−2 + . . . + an−1 x + an = 0,
where n represents the degree of this equation and a0 , a1 , a2 , . . ., an−1 , an , are constant coefficients, real or imaginary. A root of this equation is any expression, real or imaginary, that when substituted in place of the unknown value x, makes the left-hand side equal to zero. First, to clarify the ideas, suppose that the constants a0 , a1 , a2 , . . ., an , reduce to real quantities. Then if two real values of x substituted into the left-hand side of equation (1) give two results containing zero between them, that is to say, results with opposite signs, we conclude from Chapter II (§ II, theorem IV)1 that equation (1) admits one or more real roots contained between these two values. It follows that any equation of odd degree has at least one real root. Indeed, if n is an odd number, the left-hand side [275] of equation (1) changes signs, with its first term a0 xn , whenever, by giving the variable x very large numerical values, we make this variable pass from positive to negative (see theorem VIII of Chapter II, § I). When n is an even number, the quantity xn remains positive as long as the variable x is real. Thus, for very large numerical values of x, the left-hand side of equation 1
1 This is the Intermediate Value Theorem, which was proven intuitively in Chapter II, and will be proven rigorously in Note III.
(1) will eventually always be the same sign as a0 . If, under the same hypothesis, an and a0 are of opposite signs, the left-hand side evidently changes signs as it passes from a very large numerical value of x to a very small one, while remaining either always positive or always negative. Then equation (1) has at least two real roots, one positive and the other negative. When n is an even number and a0 and an have the same sign, it can happen that the left-hand side of equation (1) remains of the same sign as a0 for all real values of x, without ever vanishing. This is what happens, for example, for each of the binomial equations x2 + 1 = 0,
x4 + 1 = 0,
x6 + 1 = 0,
....
In such a case, equation (1) no longer has real roots, but we satisfy the equation by taking for x an imaginary expression √ u + v −1, where u and v denote two finite real quantities. This proposition and the ones that we have just established are found contained in the following theorem: Theorem I. — Whatever the values, real or imaginary, of the constants a0 , a1 , . . ., an−1 , an may be, the equation (1)
a0 xn + a1 xn−1 + a2 xn−2 + . . . + an−1 x + an = 0,
in which n denotes an integer number greater than or equal to 1, always has real or imaginary roots. [276] Proof. — For brevity, denote the left-hand side of equation (1) by f (x). Then f (x) is a function, real or imaginary, but always integer, of the variable x. Because any real√expression u is contained as a particular case of some imaginary expression u + v −1, to establish the stated theorem it suffices to prove in general that we can satisfy the equation (1) f (x) = 0 by taking
√ x = u + v −1,
then giving the new variables u and v real values. Now, if we substitute the preceding value of x in the function f (x), the result is of the form √ ϕ (u, v) + −1χ (u, v) , where ϕ(u, v) and χ(u, v) denote two real integer functions of the variables u and v. Given this, equation (1) becomes √ ϕ (u, v) + −1χ (u, v) = 0.
To satisfy this equation, it suffices to satisfy the two real equations ( ϕ (u, v) = 0 and (2) χ (u, v) = 0, or what amounts to the same thing, the single equation [ϕ (u, v)]2 + [χ (u, v)]2 = 0.
(3)
Thus, if for convenience we set (4)
F (u, v) = [ϕ (u, v)]2 + [χ (u, v)]2 ,
it remains only to show that we can find real values of u and v that make the function F (u, v) vanish. We can easily do this with the aid of the following considerations. [277] First, to determine the general value of the function F (u, v), we represent each of the real or imaginary constants a0 , a1 , . . ., an−1 , an , as well as the imag√ inary variable u + v −1, by the product of a modulus and a reduced expression. Consequently, we write √ a0 = ρ0 cos θ0 + −1 sin θ0 , √ a1 = ρ1 cos θ1 + −1 sin θ1 , .................................., (5) √ an−1 = ρn−1 cos θn−1 + −1 sin θn−1 , √ an = ρn cos θn + −1 sin θn and (6)
√ √ u + v −1 = r cost + −1 sint .
Consequently we have √ f u + v −1 √ n (nt + θ0 ) = ρ0 r cos(nt + θ0 ) + −1 sin √ (7) +ρ1 rn−1 cos n − 1 · t + θ1 + −1 sin n − 1 · t + θ1 √ + . . . + ρn−1 r cos (t + θn−1 ) + −1 sin (t + θn−1 ) √ +ρn cos θ0 + −1 sin θ0 . From this we deduce that
(8) $\varphi(u,v) = \rho_0 r^n\cos(nt+\theta_0)+\rho_1 r^{n-1}\cos\left((n-1)t+\theta_1\right)+\ldots+\rho_{n-1}r\cos(t+\theta_{n-1})+\rho_n\cos\theta_n,$
$\chi(u,v) = \rho_0 r^n\sin(nt+\theta_0)+\rho_1 r^{n-1}\sin\left((n-1)t+\theta_1\right)+\ldots+\rho_{n-1}r\sin(t+\theta_{n-1})+\rho_n\sin\theta_n$
and
(9) $F(u,v) = \left[\rho_0 r^n\cos(nt+\theta_0)+\rho_1 r^{n-1}\cos\left((n-1)t+\theta_1\right)+\ldots+\rho_{n-1}r\cos(t+\theta_{n-1})+\rho_n\cos\theta_n\right]^2$
$+\left[\rho_0 r^n\sin(nt+\theta_0)+\rho_1 r^{n-1}\sin\left((n-1)t+\theta_1\right)+\ldots+\rho_{n-1}r\sin(t+\theta_{n-1})+\rho_n\sin\theta_n\right]^2$
$= r^{2n}\left[\rho_0^2+\frac{2\rho_0\rho_1\cos(t+\theta_0-\theta_1)}{r}+\frac{\rho_1^2+2\rho_0\rho_2\cos(2t+\theta_0-\theta_2)}{r^2}+\ldots\right].$
[278] It follows from this last formula that the function F (u, v), which is evidently always positive, is the product of two factors, of which one, namely n r2n = u2 + v2 , grows indefinitely if we give one or both of the variables u and v larger and larger numerical values, while under the same hypothesis, the other factor converges towards the limit ρ02 , that is to say towards a finite limit different from zero. Thus we conclude that the function F (u, v) cannot retain a finite value except when both of the two quantities u and v receive values of this kind, and it becomes infinitely large when either of the two quantities grows indefinitely. Moreover, equation (4) gives an integer function for F (u, v), and consequently a continuous function of the variables u and v. Thus, it is clear that F (u, v) varies with u and v by insensible degrees and cannot drop below zero, and so it attains, one or several times, a certain lower limit below which it never descends. Denote this limit by A, and by u0 and v0 one of the systems of finite values of u and v for which F (u, v) reduces to A. Consequently, we have identically (10) F (u0 , v0 ) = A. The difference F (u, v) − F (u0 , v0 ) can never fall below zero. As a consequence, if we make (11) u = u0 + αh and v = v0 + αk
where α denotes an infinitely small quantity and h and k denote two finite quantities, then the expression F (u0 + αh, v0 + αk) − F (u0 , v0 ) is never negative. On the basis of this principle, it is easy to determine the value of the constant A, as we shall see. √ In the imaginary expression f u + v −1 , if we substitute for u and v their values given in formulas (11), this expression becomes [279] an imaginary and integer function of the product √ α h + k −1 , and it can be expanded according to the ascending integer powers of this product. If we denote the imaginary coefficients of these powers by √ R cos T + −1 sin T , √ R1 cos T1 + −1 sin T1 , ........................., √ Rn cos Tn + −1 sin Tn , some of which may be reduced to zero, and if we make, for convenience √ √ h + k −1 = ρ cos θ + −1 sin θ , (12) we obtain the equation √ √ f u0 + v0 −1 + α h + k −1 √ = R cos T + −1 sin T (13) √ +αR1 ρ cos (T1 + θ ) + −1 sin (T1 + θ ) + . . . √ . . . + α n Rn ρ n cos (Tn + nθ ) + −1 sin (Tn + nθ ) , in which the terms on the right-hand side, and thus the moduli R1 ,
R2 ,
...,
Rn ,
do not all vanish at the same time. Moreover, because we have √ ( f u0 + αh + (v0 + αk) −1 (14) √ = ϕ (u0 + αh, v0 + αk) + −1χ (u0 + αh, v0 + αk) , we conclude from equation (13) that
(15) $\varphi(u_0+\alpha h,\,v_0+\alpha k) = R\cos T+\alpha R_1\rho\cos(T_1+\theta)+\ldots+\alpha^n R_n\rho^n\cos(T_n+n\theta),$
$\chi(u_0+\alpha h,\,v_0+\alpha k) = R\sin T+\alpha R_1\rho\sin(T_1+\theta)+\ldots+\alpha^n R_n\rho^n\sin(T_n+n\theta),$
[280] and as a consequence,
(16) $F(u_0+\alpha h,\,v_0+\alpha k) = \left[R\cos T+\alpha R_1\rho\cos(T_1+\theta)+\ldots+\alpha^n R_n\rho^n\cos(T_n+n\theta)\right]^2+\left[R\sin T+\alpha R_1\rho\sin(T_1+\theta)+\ldots+\alpha^n R_n\rho^n\sin(T_n+n\theta)\right]^2.$
If we set α = 0 in this last formula, we get $F(u_0,v_0) = R^2$. Moreover, $R^2 = A$ and so $R = A^{\frac{1}{2}}$. If we now expand the right-hand side of equation (16) according to the descending powers of R and then replace R by $A^{\frac{1}{2}}$, this equation becomes
(17) $F(u_0+\alpha h,\,v_0+\alpha k) = A+2A^{\frac{1}{2}}\alpha\rho\left[R_1\cos(T_1-T+\theta)+\ldots+\alpha^{n-1}\rho^{n-1}R_n\cos(T_n-T+n\theta)\right]$
$+\alpha^2\rho^2\left\{\left[R_1\cos(T_1+\theta)+\ldots+\alpha^{n-1}\rho^{n-1}R_n\cos(T_n+n\theta)\right]^2+\left[R_1\sin(T_1+\theta)+\ldots+\alpha^{n-1}\rho^{n-1}R_n\sin(T_n+n\theta)\right]^2\right\}.$
If we move the quantity $A = F(u_0,v_0)$ to the left-hand side, we finally find that
(18) $F(u_0+\alpha h,\,v_0+\alpha k)-F(u_0,v_0) = 2A^{\frac{1}{2}}\alpha\rho\left[R_1\cos(T_1-T+\theta)+\ldots+\alpha^{n-1}\rho^{n-1}R_n\cos(T_n-T+n\theta)\right]$
$+\alpha^2\rho^2\left\{\left[R_1\cos(T_1+\theta)+\ldots+\alpha^{n-1}\rho^{n-1}R_n\cos(T_n+n\theta)\right]^2+\left[R_1\sin(T_1+\theta)+\ldots+\alpha^{n-1}\rho^{n-1}R_n\sin(T_n+n\theta)\right]^2\right\}.$
Given this, because the difference $F(u_0+\alpha h,\,v_0+\alpha k)-F(u_0,v_0)$ ought never fall below the limit zero, it is absolutely necessary that, for very small numerical values of α, the right-hand side of the preceding equation, and hence the first term of the right-hand side, that is to say, the term which contains the small-
est power of α, can never become negative. Now, denoting by $R_m$ the first of the quantities $R_1, R_2, \ldots, R_n$ [281] which has a value different from zero, we find that the term in question is
$2A^{\frac{1}{2}}\alpha^m\rho^m R_m\cos(T_m-T+m\theta)$
if A is not zero, and $\alpha^{2m}\rho^{2m}R_m^2$ otherwise. Moreover, the value of the arc θ is entirely indeterminate, so we can choose it in such a way as to give the factor $\cos(T_m-T+m\theta)$, and hence the product $2A^{\frac{1}{2}}\alpha^m\rho^m R_m\cos(T_m-T+m\theta)$, whichever sign we wish. Thus it is clear that only the second hypothesis remains admissible. Thus we necessarily have
(19) $A = 0,$
which reduces equation (10) to
(20)
F (u0 , v0 ) = 0.
It follows that the function F (u, v) vanishes if we attribute to the variables u and v the real values u0 and v0 , and consequently that the equation f (x) = 0
(1) is satisfied by taking
√ x = u0 + v0 −1.
√ In other words, u0 + v0 −1 is a root of the equation a0 xn + a1 xn−1 + . . . + an−1 x + an = 0.
(1)
The preceding proof of theorem I, while different in several points from that given by M. Legendre (Th´eorie des Nombres, 1st Part, § XIV),2 is based on the same principles. [282] Corollary. — The polynomial
2
See [Legendre 1808].
f (x) = a0 xn + a1 xn−1 + . . . + an−1 x + an , which vanishes, as we have just said, for √ x = u0 + v0 −1, is algebraically divisible by the factor √ x − u0 − v0 −1, by virtue of theorem I (Chapter VII, § IV). Because the quotient is just a new polynomial of degree n − 1 with respect to x, it is again necessarily divisible by a new factor of the same form as the previous one, that is to say, of first degree with respect to x. Denote this new factor by √ x − u1 − v1 −1. The polynomial f (x) is equivalent to the product of the two factors √ √ x − u0 − v0 −1 and x − u1 − v1 −1 and a third polynomial of degree n − 2. We can prove that this third polynomial is divisible by a third factor similar to the two others, and by continuing to operate in the same manner, we eventually obtain n linear factors of the polynomial f (x). Let these factors be √ √ √ x − u0 − v0 −1, x − u1 − v1 −1, . . . , x − un−1 − vn−1 −1, respectively. By dividing the polynomial f (x) by their product, we find the quotient to be a constant, evidently equal to the coefficient a0 , of the greatest power of x in f (x). Consequently we have √ √ f (x) = a0 x − u0 − v0 −1 x − u1 − v1 −1 . . . (21) √ . . . x − un−1 − vn−1 −1 . This last equation contains a theorem that we may state as follows: [283] Theorem II.3 — Whatever the values, real or imaginary, of the constants a0 , a1 , . . ., an−1 , an may be, the polynomial a0 xn + a1 xn−1 + . . . + an−1 x + an = f (x) is equivalent to the product of the constant a0 by n linear factors of the form √ x − α − β −1. 3
This is the Fundamental Theorem of Algebra.
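As a purely modern numerical illustration of Theorems II and III (no such computation appears in the original, and numpy is of course an anachronism here), one can ask numpy for the n roots of a polynomial with arbitrary real or imaginary coefficients and then check that multiplying the corresponding linear factors by a0 recovers the original coefficients up to rounding error; the sample coefficients below are arbitrary.

```python
import numpy as np

# a0 x^4 + a1 x^3 + ... + a4, with imaginary (complex) coefficients chosen at will
a = np.array([2.0, 1.0 - 3.0j, -4.0j, 5.0, -1.0 + 2.0j])

roots = np.roots(a)                    # the n = 4 roots promised by Theorem III
reconstructed = a[0] * np.poly(roots)  # a0 (x - x0)(x - x1)(x - x2)(x - x3), expanded

print(roots)
print(np.max(np.abs(reconstructed - a)))   # ~ 1e-14: the factorization of Theorem II
```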
Determining the factors in question here is what is called decomposing the polynomial f(x) into its linear factors. There is only one way to carry out this decomposition. To demonstrate this, suppose that there were two different ways of forming the two equations⁴
(22) $f(x) = a_0\left(x-u_0-v_0\sqrt{-1}\right)\left(x-u_1-v_1\sqrt{-1}\right)\ldots\left(x-u_{n-1}-v_{n-1}\sqrt{-1}\right)$ and
$f(x) = a_0\left(x-\alpha_0-\beta_0\sqrt{-1}\right)\left(x-\alpha_1-\beta_1\sqrt{-1}\right)\ldots\left(x-\alpha_{n-1}-\beta_{n-1}\sqrt{-1}\right).$
We get that
(23)
√ √ x − α0 − β0 −1 x − α1 − β1 −1 . . . √ . . . x − αn−1 − βn−1 −1 √ √ = x − u0 − v0 −1 x − u1 − v1 −1 . . . √ . . . x − un−1 − vn−1 −1 .
Because the right-hand side of the preceding formula vanishes when we give the √ variable x the particular value u0 + v0 −1, it is necessary that, for this value of x, the left-hand side, and hence one of its factors (see Chapter VII, § II, theorem VII, corollary II), reduces to zero. Let √ x − α0 − β0 −1 be that factor. We have identically √ √ α0 + β0 −1 = u0 + v0 −1, and consequently,
√ √ x − α0 − β0 −1 = x − u0 − v0 −1.
[284] Given this, formula (23) can be replaced by the following: √ √ x − α1 − β1 −1 . . . x − αn−1 − βn−1 −1 √ √ = x − u1 − v1 −1 . . . x − un−1 − vn−1 −1 . Because the right-hand side of this vanishes when we suppose that √ x = u1 + v1 −1, one of the factors of the left-hand side, for example, √ x − α1 − β1 −1,
4
In [Cauchy 1897, p. 283], the last term of the second line of (22) has an−1 in place of αn−1 . The equation is given correctly in [Cauchy 1821, p. 341]. (tr.)
must vanish under the same hypotheses, and this entails two new identity equations of the form
$\alpha_1+\beta_1\sqrt{-1} = u_1+v_1\sqrt{-1}$ and $x-\alpha_1-\beta_1\sqrt{-1} = x-u_1-v_1\sqrt{-1}.$
By repeating the same reasoning several times, we prove that the different linear factors that comprise the right-hand sides of equations (22) are absolutely the same as each other. It is essential to add that each imaginary factor of the form $x-\alpha-\beta\sqrt{-1}$ is changed into a real factor x − α any time that the quantity β is reduced to zero.
Because, as we have just said, the left-hand side of equation (1) is decomposable into linear factors in just one way, it cannot vanish except when one of these factors vanishes. Thus if we successively make them equal to zero, we obtain all the possible values of x that satisfy equation (1), that is to say, all the roots of this equation. The number of these roots, like the number of linear factors, is equal to n. Moreover, each real factor of the form x − α corresponds to one real root α, and each imaginary factor of the form $x-\alpha-\beta\sqrt{-1}$
a0 xn + a1 xn−1 + . . . + an−1 x + an = 0
always has n roots, real or imaginary, and it will never have a greater number. It can happen that several of the roots of equation (1) are equal to each other. In this case, the number of different values of the variable that satisfy this equation necessarily becomes less than n. Thus, for example, because the second-degree equation x2 − 2ax + a2 = 0, has two equal roots, it cannot be satisfied except by a single value of x, namely x = a. Whenever the constants a0 , a1 , . . ., an−1 , an are all real, the imaginary expression √ α + β −1
10.1 Decomposition of polynomials into factors.
227
evidently cannot be a root of equation (1) except when the conjugate expression √ α − β −1 is also a root of the same equation. Consequently, under this hypothesis, the imaginary linear factors of the polynomial that form the left-hand side of equation (1) are pairwise conjugate and of the form5 √ √ x − α − β −1 and x − α + β −1. The product of two such factors is always a real polynomial of the second degree, namely (x − α)2 + β 2 , [286] and so we deduce the following theorem immediately from the observation that we have just made: Theorem IV. — When a0 , a1 , . . ., an−1 , an denote real constants, the polynomial (24)
a0 xn + a1 xn−1 + . . . + an−1 x + an
is decomposable into real factors of the first and second degree. In the preceding, we have presented the imaginary roots of equation (1) in the form √ α ± β −1. Then for polynomial (24), a real factor of the second degree corresponding to two conjugate imaginary roots √ √ α + β −1 and α − β −1 is of the form (x − α)2 + β 2 . For convenience, if we make √ √ α ± β −1 = ρ cos θ ± −1 sin θ , (where ρ denotes a positive quantity and θ denotes an angle that we can assume is contained between the limits 0 and π), then the same real factor of the second degree becomes (x − ρ cos θ )2 + (ρ sin θ )2 = x2 − 2ρ cos θ + ρ 2 .
√ In [Cauchy 1897, p. 285], the second of the factors below is given as x − α + −1. The coeffeicient β is correctly included in [Cauchy 1821, p. 344]. (tr.) 5
228
10 On real or imaginary roots of algebraic equations.
It is easy to construct this last expression geometrically in the case where we give the variable x a real value. Indeed, if we trace a triangle in which one angle is equal to θ and the two adjacent sides are first the numerical value of x and second the modulus ρ, then the square of the third side is (from a well-known theorem of Trigonometry)6 the value of the trinomial x2 − 2ρx cos θ + ρ 2 , [287] whenever the value of the variable x is positive. If the value of x becomes negative, it suffices to replace the given angle θ in the construction by its supplement. The third side of the triangle in question cannot vanish unless the two other sides fall on the same straight line and their extremities coincide, and this requires: 1◦ that the angle θ reduces to zero or to π; and 2◦ that the numerical value of x is equal to ρ. Consequently, the factor x2 − 2ρ cos θ + ρ 2 cannot become zero for a real value of x, at least when we do not suppose that cos θ = 1
or
cos θ = −1,
and the only value of x that makes this factor vanish is, in the first case, x = ρ, and in the second, x = −ρ. We arrive directly at the same conclusion by observing that the equation x2 − 2ρ cos θ + ρ 2 has two roots, √ and ρ cos θ − −1 sin θ ,
√ ρ cos θ + −1 sin θ
which cannot cease to be imaginary without becoming equal, and that the only values of θ capable of producing this effect are those which satisfy the formula sin θ = 0. From this we get cos θ = ±1, and consequently x = ±ρ for the common value of the two roots.
6
This is the Law of Cosines.
10.2 Solution of binomial and trinomial equations.
229
Up to now, we have been limited to determining the number [288] of roots of equation (1), along with the form of these roots and of their corresponding factors. In the following sections, we will review some particular cases in which we are able to solve similar equations without being required to imagine their coefficients converted into numbers, and to express the roots of these coefficients as algebraic or trigonometric functions of the coefficients. On this matter, we observe here that in every algebraic equation for which the left-hand side is a rational and integer function of the variable x, we can reduce the coefficient of the highest power of x to 1 by division, and the coefficient of the next-highest power of x to zero by a change of variable. Indeed, if a0 is not equal to 1 in the equation a0 xn + a1 xn−1 + . . . + an−1 x + an = 0, it suffices to divide the equation by a0 to reduce the coefficient of xn to 1. If an equation has been put into the form xn + a1 xn−1 + . . . + an−1 x + an = 0 and a1 is not zero, then it suffices to set x = z−
a1 n
to obtain a transformation into z of degree n which no longer has the second term, that is to say, a transformation in which the coefficient of zn−1 vanishes.
10.2 Algebraic or trigonometric solution of binomial equations and of some trinomial equations. The theorems of de Moivre and of Cotes. Consider the binomial equation (1)
xn + p = 0,
where p denotes a constant quantity. We get that xn = −p [289] or, if ρ denotes the numerical value of p, then xn = ±ρ. Thus we have to solve the equation (2)
xn = ρ,
230
10 On real or imaginary roots of algebraic equations.
if −p is positive, and the following, xn = −ρ,
(3)
if −p is negative. We satisfy the first one by taking 1
1
1
1
1
x = ((ρ)) n = ρ n ((1)) n ,
(4) and the second one by taking
1
x = ((−ρ)) n = ρ n ((−1)) n .
(5)
1
1
As for the various values of each of the two expressions ((1)) n and ((−1)) n , there are always n of them (see Chapter VII, § III), and they are deduced from these two formulas: ((1)) 1n = cos 2kπ ± √−1 sin 2kπ and n n (6) ((−1)) 1n = cos (2k+1)π ± √−1 sin (2k+1)π , n
n
in which it suffices to give k successively all the integer values which do not surpass n 2 . When n is an even number, the first of equations (6) gives two real values of 1
((1)) n , namely +1 and −1, the first of which corresponds to k = 0 and the second to 1
k = n2 . Under the same hypothesis, all of the values of ((−1)) n are imaginary. When 1
n is an odd number, the expression ((1)) n has a single real value, +1, corresponding 1
to k = 0, and the expression [290] ((−1)) n has a single real value, −1, corresponding to k = n−1 2 . Consequently, when n is an even number equation (1) either admits two real roots or it admits none at all, and in the contrary case the same equation admits a single real root. Moreover, we recognize immediately by inspection of formulas (6) that the imaginary roots form conjugate pairs, as we ought to expect. Now consider the trinomial equation x2n + pxn + q = 0,
(7)
where p and q denote two constant quantities chosen at will. We get x2n + pxn = −q, and consequently
(8) 2
xn +
p 2 p 2 = − q. 2 4
If p4 − q is positive, the preceding equation will lead to one of the two following ones: q 2 xn + 2p = + p4 − q or q 2 xn + 2p = − p4 − q,
10.2 Solution of binomial and trinomial equations.
231
so that xn admits two real values contained in the formula7 r p p2 n (9) x =− ± − q. 2 4 When the number n reduces to 1, formula (9) immediately gives the two real roots of the trinomial equation of the second degree x2 + px + q.
(10)
When n is not equal to 1, then by substituting the formula under consideration into equation [291] (7), we have only to solve two binomial equations similar to those we have treated above. 2 Now suppose that the quantity p4 − q is negative. Then equation (8) leads to one of the two following ones: q 2√ xn + 2p = + q − p4 −1 or q 2√ p n x + 2 = − q − p4 −1. Consequently, xn admits two imaginary values contained in the formula r p p2 √ n (11) x = − ± q− −1. 2 4 If the number n reduces to 1, these values will be the imaginary roots of equation (10). However, if we suppose that n > 1, it still remains to deduce the values of x from the known values of xn . Under this hypothesis, denote by ρ the modulus of the imaginary expression that serves as the right-hand side of formula (11). We evidently have 1 ρ = q2 . (12) Moreover, for convenience make
(13)
ζ = arctan
q q− − 2p
p2 4
.
When p is negative, the two values of xn given by formula (11) become √ (14) xn = ρ cos ζ ± −1 sin ζ , and thus we conclude that 7
Readers in North America may not be aware that in Europe the most commonly taught q version
of the quadratic formula gives the roots of a monic quadratic x2 + px + q as − 2p ± Cauchy’s readers, the version in formula (9) would have been very familiar.
p2 4
− q. To
232
(15)
10 On real or imaginary roots of algebraic equations.
1 1 ζ √ ζ x = ρ n cos ± −1 sin ((1)) n . n n
[292] On the other hand, if p is positive we find that √ xn = −ρ cos ζ ± −1 sin ζ , (16) and consequently (17)
1
x=ρn
cos
ζ √ ζ ± −1 sin n n
1
((−1)) n .
In the particular case where we have p2 − q = 0, 4 ζ becomes zero, so that equations (15) and (17) take the form of equations (4) and (5). 1 If for brevity we denote ρ n by r, then by supposing that the quantity p is negative, we get from equations (12) and (13) that p = −2rn cos ζ , x2n + pxn + q
q = r2n
and
= x2n − 2rn xn cos ζ
+ r2n .
Under the same hypothesis, formula (15) gives √ √ 2kπ x = r cos ζn ± −1 sin ζn cos 2kπ n ± −1 sin n ζ ±2kπ = r cos ζ ±2kπ ± sin , n n where k represents a whole number. Thus we conclude that the trinomial x2n − 2rn xn cos ζ + r2n is decomposable into real factors of the second degree of the form x2 − 2rx cos
ζ ± 2kπ + r2 . n
On the other hand, if we suppose that the quantity p is positive, the trinomial x2n + pxn + q becomes x2n + 2rn xn cos ζ + r2n , [293] and its real factors of the second degree are of the form
10.3 Solution of equations of the third and fourth degree.
x2 − 2rx cos
233
ζ ± (2k + 1) π + r2 . n
Under both hypotheses, whenever we give real values to the variable x, we can construct the real factors of the second degree geometrically by the method indicated above (see § I). If we take the numerical value of the variable x as the common base of all the triangles that correspond to the different factors, and in each triangle we always join to the same end of this base the known side represented by r, we find that the vertices of these various triangles coincide with points that divide the circumference of a circle of radius of r into equal parts. Consequently, if we multiply together the squares of the lines taken from the second extremity of the base to the points in question, the product of these squares will be the value of the trinomial x2n + pxn + q = x2n ± 2rn xn cos ζ + r2n . In the particular case where ζ = 0, the product of the lines themselves represents the numerical value of the binomial xn ± r n , which corresponds to the positive square root of the trinomial x2n ± 2rn xn + r2n . Of the two propositions that we have just stated, the first is the theorem of de Moivre and the second that of Cotes.
10.3 Algebraic or trigonometric solution of equations of the third and fourth degree. Consider the general equation of the third degree. By making the second term of this equation vanish, we can always [294] reduce it to the form (1)
x3 + px + q = 0,
where p and q denote two constant quantities. Moreover, if we set x = u + v, where u and v are two new variables, we conclude that x3 = (u + v)3 = u3 + v3 + 3uvx, or (2)
x3 − 3uvx − u3 + v3 = 0.
234
10 On real or imaginary roots of algebraic equations.
To make equation (2) identical to the given equation, it suffices to subject the unknowns u and v to the two conditions u3 + v3 = −q
(3) and (4)
p uv = − . 3 Thus we find that the solution of equation (1) is reduced to the simultaneous solution of equations (3) and (4). First, let us seek the values of u3 and v3 . If we make (5)
u3 = z1
and v3 = z2 ,
then we have, by virtue of equations (3) and (4), that z1 + z2 = −q and z1 z2 = −
p3 , 27
and consequently, by naming a new variable z, (z − z1 ) (z − z2 ) = z2 + qz −
p3 . 27
As a result, z1 and z2 are the two roots of the equation (6)
z2 + qz −
p3 = 0. 27
Knowing these two roots, we deduce from formulas (5) the three values of u and of v that correspond, two by two, [295] in a way that satisfies formula (4). Let U be any one of the three values of u, and let V be the corresponding value of v, so that we have p UV = − . 3 Moreover, denote the imaginary expression cos
2π √ 2π + −1 sin 3 3 1
by α. Then the three values of the expression ((1)) 3 are, respectively, α 0 = 1, 1 √ √ 2π 1 32 α = cos 2π −1 and 3 + −1 sin 3 = − 2 + 2 1 √ √ 2π 1 32 α 2 = cos 2π −1, 3 − −1 sin 3 = − 2 − 2
10.3 Solution of equations of the third and fourth degree.
235 1
and the three values of u, evidently contained in the general formula ((1)) 3 U, must be U, αU and α 2U. We find that the corresponding values of v are V,
V α
and
V , α2
or what amounts to the same thing, V,
α 2V
and αV.
Consequently, if we name the three roots of equation (1) x0 , x1 and x2 , we have x0 = U +V, x1 = αU + α 2V and (7) x2 = α 2U + αV. It is essential to observe that because U, αU and α 2U are the three values of [296] 1 u = ((z1 )) 3 , and that because V , α 2V and αV are the corresponding values of v = p − 1 , the roots x0 , x1 and x2 determined by equations (7) are, respectively, equal 3((z1 )) 3
to the three values of x given by the formula8 (8)
1
x = ((z1 )) 3 −
p 1
.
3 ((z1 )) 3 Whenever equation (6) has all real roots, formulas (5) give a system of real values of u and v that correspond in a way that satisfies equation (4). If we take these same values for U and V , we recognize immediately that of the three roots x0 , x1 and x2 , the first is necessarily real and the two others may be real or imaginary, according to whether the quantity q2 p3 + 4 27 is zero or positive, that is to say according to whether equation (6) has roots that are equal or unequal. In the first case, we find that x0 = 2U
and x1 = x2 = −U.
Whenever the roots of equation (6) become imaginary, we can present them in the form √ √ z1 = ρ cos θ + −1 sin θ and z2 = ρ cos θ − −1 sin θ , 8
1
Cauchy neglects to remind us here that it is necessary to use the same particular value of ((z1 )) 3 in each term of the right-hand side.
236
10 On real or imaginary roots of algebraic equations.
where the modulus ρ is determined by the equation ρ2 = −
p3 . 27
Because under this hypothesis we have 1 1 1 θ θ √ ((z1 )) 3 = ρ 3 cos + −1 sin ((1)) 3 , 3 3 we find that formula (8) reduces to " √ 1 1 x = ρ 3 cos θ3 + −1 sin θ3 ((1)) 3 # (9) 1 √ θ θ . + cos 3 − −1 sin 3 1 ((1)) 3 [297] Moreover, by taking for U the imaginary expression 1 θ θ √ 3 ρ cos + −1 sin , 3 3 we conclude from equations (7) that 1 x = 2ρ 3 cos θ3 , 0 1 (10) and x1 = 2ρ 3 cos θ +2π 3 1 x2 = 2ρ 3 cos θ −2π 3 . These last three values of x are all real and coincide with those which are given by formula (9). In the preceding calculations, equation (6), the solution of which leads to that of equation (1), is what we call the reduced equation. Its roots z1 and z2 are necessarily equivalent to certain functions of the required roots x0 , x1 and x2 . To determine these functions, it suffices to observe that, by virtue of formulas (5), we have z1 = U 3
and z2 = V 3 ,
where U and V denote particular values of u and v. Moreover, from equations (7) we get that
10.3 Solution of equations of the third and fourth degree.
237
3U = x0 + αx2 + α 2 x1 = α x2 + αx1 + α 2 x0
= α 2 x1 + αx0 + α 2 x2 3V = x0 + αx1
+ α 2x
and
2
= α x1 + αx2 + α 2 x0
= α 2 x2 + αx0 + α 2 x1 . Consequently we find that 27z1 (11) 27z2
= x0 + αx2 + α 2 x1
3
= x2 + αx1 + α 2 x0
3
= x1 + αx0 + α 2 x2
3
= x0 + αx1 + α 2 x2
3
= x1 + αx2 + α 2 x0
3
= x2 + αx0 + α 2 x1
3
and
.
It follows that z1 and z2 are, respectively, equal (except for a numerical coefficient) to the only two distinct values which arise as the cube of the linear function x0 + αx1 + α 2 x2 , [298] when we interchange the roots, x0 , x1 and x2 of this function in every manner 1 possible. The numerical coefficient is evidently 27 , or the cube of the fraction 31 .9 Now consider the general equation of the fourth degree. By making the second term disappear, we can reduce it to the form x4 + px2 + qx + r = 0,
(12)
where p, q and r denote constant quantities. Moreover, if we set x = u + v + w, where u, v and w are three new variables, we then conclude that x2 = u2 + v2 + w2 + 2 (uv + uw + vw) , and consequently, 2 2 x − u2 + v2 + w2 = 4 u2 v2 + u2 w2 + v2 w2 + 8uvw · x, or what amounts to the same thing, 9
What Cauchy has derived in this first part of § III is sometimes called the Cardano Formula for the cubic.
238
10 On real or imaginary roots of algebraic equations.
( (13)
x4 − 2 u2 + v2 + w2 x2 − 8uvw · x 2 + u2 + v2 + w2 − 4 u2 v2 + u2 w2 + v2 w2 = 0.
To make this last equation identical to the given one, it suffices to subject the unknowns u, v and w to the conditions 4 u2 + v2 + w2 = −2p, 8uvw = −q and (14) 2 2 2 2 16 u v + u w + v2 w2 = p2 − 4r. Thus we find that the solution of equation (12) reduces to the simultaneous solution of equations (14). First, we seek the values of 4u2 , 4v2 and 4w2 . If we make 4u2 = z1 ,
(15)
4v2 = z2
and
4w2 = z3 ,
[299] we have, by virtue of formulas (14), z1 + z2 + z3 = −2p,
z1 z2 + z1 z3 + z2 z3 = p2 − 4r
and z1 z2 z3 = q2 .
Consequently, letting z be a new variable, we have (z − z1 ) (z − z2 ) (z − z3 ) = z3 + 2pz2 + p2 − 4r z − q2 . It follows that z1 , z2 and z3 are the three roots of the equation (16) z3 + 2pz2 + p2 − 4r z − q2 = 0, and because these three roots must satisfy the formula z1 z2 z3 = q2 , we can be sure that at least one of the roots will be positive and that the other two will be either both positive, both negative or both imaginary. When we have determined these roots, the first two of equations (15) give two equal values for each of the variables u and v, up to sign. Let u = ±U and v = ±V be the values, real or imaginary, in question, and let W be a real quantity or an imaginary expression determined by the equation 8UVW = −q. If we suppose that in the second of formulas (14) u = +U
and v = +V
u = −U
and v = −V,
or else
10.3 Solution of equations of the third and fourth degree.
239
we get w = +W. On the other hand, if we make u = +U
and v = −V
u = −U
and v = +V,
or else10 we
find11 w = −W.
In this way, we obtain for the variables u, v and w four systems [300] of values that satisfy equations (14). If we represent by x0 , x1 , x2 and x3 the four values corresponding to the unknown x = u + v + w, then we have
x0 = U +V +W, x1 = −U −V +W, x2 = U −V −W and x3 = −U +V −W.
(17)
It is easy to recognize that if equation (16) has three positive roots, then these four values of x are all real; if equation (16) has two distinct negative roots, then they are all imaginary; while if equation (16) has two equal negative roots or two imaginary roots, then two values will be real and two will be imaginary. By the method that we have just described, the solution of equation (12) is reduced to that of equation (16). This last equation, which we call the reduced equation, necessarily has for its roots certain functions of the roots of the given equation. If we wish to determine these functions, that is to say, to express z1 , z2 and z3 in terms of x0 , x1 , x2 and x3 , it suffices to observe that because U, V and W are particular values of u, v and w, we have, by virtue of formulas (15), that z1 = 4U 2 ,
z2 = 4V 2
and z3 = 4W 2 .
Moreover, we get from equations (17) that 4U = x0 − x1 + x2 − x3 , 4V = x0 − x1 + x3 − x2
and
4W = x0 − x2 + x1 − x3 . As a consequence, we find 10
In [Cauchy 1897, p. 299], this is written u = −U and w = +V . It is v = +V in [Cauchy 1821, p. 362]. (tr.) 11 In [Cauchy 1897, p. 299], this is written u = −W . It is w = −W in [Cauchy 1821, p. 362]. (tr.)
240
(18)
10 On real or imaginary roots of algebraic equations.
4z = (x0 − x1 + x2 − x3 )2 = (x1 − x0 + x3 − x2 )2 , 1 4z2 = (x0 − x1 + x3 − x2 )2 = (x1 − x0 + x2 − x3 )2 and 4z3 = (x0 − x2 + x1 − x3 )2 = (x2 − x0 + x3 − x1 )2 .
[301] It follows that z1 , z2 and z3 are, if we ignore the numerical coefficient 41 = 1 2 2 , respectively equal to the three distinct values that are given by the square of the linear function x0 − x1 + x2 − x3 , when we interchange the roots x0 , x1 , x2 and x3 in this function in all possible ways. This same linear function can thus be written as follows: x0 + (−1) x1 + (−1)2 x2 + (−1)3 x3 , which is evidently a particular case of the general formula x0 + αx1 + α 2 x2 + α 3 x3 , 1
when we denote by α one of the values of the expression ((1)) 4 .
Chapter 11
Decomposition of rational fractions.
11.1 Decomposition of a rational fraction into two other fractions of the same kind. [302] Let f (x) and F(x) be two integer functions of the variable x. Then f (x) F (x) is what we call a rational function. If we denote the degree of the denominator F(x) by m, then the equation (1) F (x) = 0 admits m roots, real or imaginary, equal or not equal to each other. Supposing them to be distinct, if we represent them by x0 ,
x1 ,
x2 ,
...,
xm−1 ,
then the linear factors of the polynomial F(x) are, respectively, x − x0 , Given this, make (2) and (3)
x − x1 ,
x − x2 ,
...,
x − xm−1 .
F (x) = (x − x0 ) ϕ (x) f (x0 ) = A. ϕ (x0 )
[303] Because ϕ (x0 ) is not zero, the constant A is finite and the difference f (x) f (x) − Aϕ (x) −A = ϕ (x) ϕ (x)
R.E. Bradley, C.E. Sandifer, Cauchy’s Cours d’analyse, Sources and Studies in the History of Mathematics and Physical Sciences, DOI 10.1007/978-1-4419-0549-9 11, c Springer Science+Business Media, LLC 2009
241
242
11 Decomposition of rational fractions.
vanishes for x = x0 . Consequently, the same is true of the polynomial f (x) − Aϕ (x) and this polynomial is algebraically divisible by x − x0 . Thus we have f (x) − Aϕ (x) = (x − x0 ) χ (x) or (4)
f (x) = Aϕ (x) + (x − x0 ) χ (x) ,
where χ (x) denotes a new integer function of the variable x. If we divide the two sides of this last equation by F(x) and take into account formula (2), we conclude that A χ (x) f (x) (5) = + . F (x) x − x0 ϕ (x) Thus, if we separate the polynomial F(x) into two factors, one of which is linear, f (x) into two others which have as their we can decompose the rational fraction F(x) respective denominators the two factors in question, and for which the simpler one has a constant numerator. Imagine now that we separate the function F(x) into two factors where the first, instead of being linear, corresponds to several roots of the equation F(x) = 0. For example, take for the first factor the factor of second degree (x − x0 ) (x − x1 ) . As a consequence, we have F (x) = (x − x0 ) (x − x1 ) ϕ (x) .
(6) The fraction
f (x) ϕ(x)
still has a finite value, not only for x = x0 , but also for x = x1 . If
we denote by u a polynomial [304] which, under both hypotheses is equal to we find (Chapter IV, § I) (7)
u=
f (x1 ) x − x0 f (x0 ) x − x1 + . ϕ (x0 ) x0 − x1 ϕ (x1 ) x1 − x0
Because the polynomial u is determined, as we have just said, the equation f (x) −u = 0 ϕ (x) or f (x) − uϕ (x) = 0 includes x0 and x1 among its roots and consequently the polynomial
f (x) ϕ(x) ,
11.1 Decomposition of a rational fraction into two other fractions of the same kind.
243
f (x) − uϕ (x) is divisible by the product (x − x0 ) (x − x1 ) . Thus we have f (x) − uϕ (x) = (x − x0 ) (x − x1 ) χ (x) , or (8)
f (x) = uϕ (x) + (x − x0 ) (x − x1 ) χ (x) ,
where χ(x) denotes a new integer function of the variable x. If we divide the last equation by F(x) and take into account formula (6), we conclude (9)
u χ (x) f (x) = + . F (x) (x − x0 ) (x − x1 ) ϕ (x)
Likewise, we could prove that it suffices to set (10) and
(11)
F (x) = (x − x0 ) (x − x1 ) (x − x2 ) ϕ(x) u=
f (x0 ) (x − x1 ) (x − x2 ) ϕ (x0 ) (x0 − x1 ) (x0 − x2 ) f (x1 ) (x − x0 ) (x − x2 ) + ϕ (x1 ) (x1 − x0 ) (x1 − x2 ) f (x2 ) (x − x0 ) (x − x1 ) + ϕ (x2 ) (x2 − x0 ) (x2 − x1 )
[305] to obtain an equation of the form (12)
u χ (x) f (x) = + , F (x) (x − x0 ) (x − x1 ) (x − x2 ) ϕ (x)
etc. Thus, in general, whenever the equation F(x) = 0 does not have equal roots, if we separate the polynomial F(x) into two factors of which the first is the product f (x) of several linear factors, then the rational fraction F(x) is decomposable into two other fractions of the same kind which have as their respective denominators the two factors mentioned above, and of which the first has a numerator of a degree less than that of its denominator. I move on to the case where we suppose that the equation F(x) = 0 has equal roots. Under this second hypothesis, let a,
b,
c,
...
be the various roots of this same equation, and denote by m0 the number of roots equal to a, by m00 the number of roots equal to b, by m000 the number of roots equal
244
11 Decomposition of rational fractions.
to c, etc. The function F(x) is equal to the product 0
00
000
(x − a)m (x − b)m (x − c)m . . . or to this product multiplied by a constant coefficient, and we have m0 + m00 + m000 + . . . = m. Given this, make (13)
0
F (x) = (x − a)m ϕ (x)
and
f (a) = A. ϕ (a)
(14)
Because ϕ(a) is not zero, the constant A remains finite and the difference f (x) −A ϕ (x) [306] vanishes for x = a. Thus we conclude that the polynomial f (x) − Aϕ (x) is divisible by x − a, and consequently we have (15)
f (x) = Aϕ (x) + (x − a) χ (x) ,
where χ(x) denotes a new integer function of the variable x. Finally, if we divide both sides of equation (15) by F(x) and take into consideration formula (13), we find f (x) A χ (x) = (16) + . F (x) (x − a)m0 (x − a)m0 −1 ϕ (x) By reasoning in the same way, we could prove that it suffices to take (17)
0
and (18)
00
F (x) = (x − a)m (x − b)m ϕ (x)
u=
f (b) x − a f (a) x − b + ϕ (a) a − b ϕ (b) b − a
to obtain an equation of the form (19) etc.
u f (x) χ (x) = + , F (x) (x − a)m0 (x − b)m00 (x − a)m0 −1 (x − b)m00 −1 ϕ (x)
11.2 Decomposition when the denominator has unequal linear factors.
245
11.2 Decomposition of a rational fraction for which the denominator is the product of several unequal factors into simple fractions which have for their respective denominators these same linear factors and have constant numerators. Let
f (x) F (x)
be the rational fraction under consideration, m be the degree of the function F(x) and x0 , x1 , x2 , . . . , xm−1 [307] the roots, assumed to be unequal, of the equation (1)
F (x) = 0.
If k denotes a constant coefficient, we have (2)
F (x) = k (x − x0 ) (x − x1 ) . . . (x − xm−1 ) ,
and by virtue of the principles established in the preceding section, the rational f (x) can be decomposed into two others, of which the first is of the form fraction F(x) A0 , x − x0 where A0 represents a constant, while the second has as its denominator F (x) = k (x − x1 ) (x − x2 ) . . . (x − xm−1 ) . x − x0 By decomposing this second rational fraction by the same method, we obtain: 1◦ A new simple fraction of the form A1 ; x − x1
and
2◦ A fraction which has as its denominator k (x − x2 ) . . . (x − xm−1 ) . By continuing in this way, we make all the linear factors contained in the polynomial F (x) = k (x − x0 ) (x − x1 ) . . . (x − xm−1 )
246
11 Decomposition of rational fractions.
successively disappear. Consequently, we finally reduce the polynomial to the constant k. Thus, when by a series of such partial decompositions like those we have f (x) a series of simple fractions just indicated, we have extracted from the fraction F(x) [308] of the form A1 , x − x1
A0 , x − x0
A2 , x − x2
...,
Am−1 , x − xm−1
where the remainder is just a rational fraction with a constant denominator, that is to say an integer function of the variable x. Denoting this integer function by R, we find f (x) A0 A1 A2 Am−1 (3) = R+ + + +...+ . F (x) x − x0 x − x1 x − x2 x − xm−1 Now it remains to find the values of the constants A0 ,
A1 ,
A2 ,
...,
Am−1 .
These values are deduced without difficulty by the method of decomposition indicated in § I. However, we arrive more directly at their determination with the aid of the following considerations: If we multiply the two sides of equation (3) by F(x), we get F (x) F (x) + A1 f (x) = RF(x) +A0 x − x0 x − x1 (4) F (x) F (x) +A2 + . . . + Am−1 . x − x2 x − xm−1 If we make x = x0 + z in both sides of this last formula, then the sum RF (x) + A1
F (x) F (x) F (x) + A2 + . . . + Am−1 , x − x1 x − x2 x − xm−1
which is evidently a polynomial in x divisible by x − x0 , takes the form zZ, where Z denotes an integer function of z. It follows that we have (5)
f (x0 + z) = A0
F (x0 + z) + zZ. z
[309] Now suppose that the substitution of x + z in place of x in the function F(x) gives generally (6) F (x + z) = F (x) + zF1 (x) + z2 F2 (x) + . . . .
11.2 Decomposition when the denominator has unequal linear factors.
247
We then deduce that F (x0 + z) = zF1 (x0 ) + z2 F2 (x0 ) + . . . , and equation (5) becomes (x0 + z) = A0 [F1 (x0 ) + zF2 (x0 ) + . . .] + zZ. When we make z = 0 in this last equation, it reduces to f (x0 ) = A0 F1 (x0 ) , and we conclude that A0 =
(7)
f (x0 ) . F1 (x0 )
By an entirely similar calculation, we find that f (x1 ) A1 = , F 1 (x1 ) f (x2 ) A2 = , (8) F1 (x2 ) .................., f (xm−1 ) Am−1 = . F1 (xm−1 ) The values that we have just obtained for A0 ,
A1 ,
A2 ,
...,
Am−1
are evidently independent of the method used for the decomposition of the rational f (x) . From this it follows that this fraction can be decomposed in only one fraction F(x) way into simple fractions which have as denominators linear factors of the polynomial F(x) with constant numerators. It is easy to see how equation (7) and formula (3) of the preceding section [310] agree with each other. Indeed, F1 (x0 ) is what the polynomial F1 (x0 ) + zF2 (x0 ) + . . . =
F (x) F (x0 + z) = z x − x0
becomes when we make z = 0 or x = x0 . Consequently, if we set (9)
F (x) = (x − x0 ) ϕ (x) ,
we have F1 (x0 ) = ϕ (x0 ) and
248
11 Decomposition of rational fractions.
A0 =
(10)
f (x0 ) . ϕ (x0 )
To show an application of the formulas established above, suppose that it is a question of decomposing the rational fraction xn xm − 1 into simple fractions, where n denotes an integer number less than m. In this particular case, we have f (x) = xn ,
F (x) = xm − 1
and k = 1.
If we represent an integer number which does not surpass m2 by h, then the various roots of the equation F(x) = 0, all unequal to each other, are contained in the formula cos
2hπ 2hπ √ ± −1 sin . m m
Let a be one of these roots. We seek the numerator A of the simple fraction that has x − a as its denominator. This numerator is A=
f (a) an = , F1 (a) F1 (a)
where the value of F1 (a) is determined by the equation F (a) + zF1 (a) + . . . = F (a + z) = (a + z)m − 1 = am − 1 + mam−1 z + . . . , [311] and as a consequence is equal to mam−1 . Thus we find that A=
an 1 = an+1−m . mam−1 m
Moreover, because we have 2hπ √ 2hπ n+1−m 2h (n + 1) π √ 2h (n + 1) π cos ± −1 sin = cos ± −1 sin , m m m m and taking (n + 1) π =θ m for brevity, we conclude from the preceding, that (11)
11.2 Decomposition when the denominator has unequal linear factors.
(12)
1 xn = m x −1 m
249
√ cos 2θ + −1 sin 2θ 1 √ + 2π x − 1 x − cos 2π m − −1 sin m √ cos 2θ − −1 sin 2θ √ + 2π x − cos 2π m + −1 sin m √ cos 4θ + −1 sin 4θ √ + 4π x − cos 4π m − −1 sin m ! √ cos 4θ − −1 sin 4θ √ + +... . 4π x − cos 4π m + −1 sin m
By reasoning in the same manner, we find that √ n 1 1 cos θ + −1 sin θ x √ = − + π π xm + 1 m x − 1 x − cos m − −1 sin m √ cos θ − −1 sin θ √ + π π + −1 sin m x − cos m (13) √ cos 3θ + −1 sin 3θ √ + 3π x − cos 3π m − −1 sin m ! √ cos 3θ − −1 sin 3θ √ + +... . x − cos 3π + −1 sin 3π m
m
It is essential to observe that, in equation (12) for even values of m and in [312] equation (13) for odd values of m, the last of the simple fractions contained in the right-hand side of the equation is cos mθ cos (n + 1) π (−1)n+1 = = . x+1 x+1 x+1 Thus, for example, we have (14) (15) (16)
1 1 1 1 = − , x2 − 1 2 x−1 x+1 1 1 1 x = + , x2 − 1 2 x−1 x+1 √ cos π3 + −1 sin π3 1 1 √ =− x3 + 1 3 x − cos π3 − −1 sin π3 ! √ cos π3 − −1 sin π3 1 √ + − , x − cos π3 + −1 sin π3 x + 1 . . . . . .. . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
250
11 Decomposition of rational fractions.
We could also remark that if in the right-hand sides of equations of (12) and (13) we combine by addition two simple fractions corresponding to conjugate linear factors of the binomial xm ± 1, then the sum is a new fraction which has as denominator a real factor of the second degree and for its numerator a real linear function of the variable x. For example, by taking n = 0 and m = 3, we find 2x cos π3 − 2 1 1 1 = − − x3 + 1 3 x2 − 2x cos π3 + 1 x + 1 (17) 1 2−x 1 = + . 3 x2 − x + 1 x + 1 It is easy to generalize this remark as follows. Because the integer functions f (x) and F(x) are real, suppose that we denote two conjugate imaginary roots of equation (1) by √ √ α + β −1 and α − β −1 and take [313] A and B to be two real quantities that satisfy the formula √ √ f α + β −1 √ = A − B −1, (18) F1 α + β −1 where F1 (x) still represents the coefficient of z in the expansion of F(x + z). We necessarily have √ √ f α − β −1 √ = A + B −1. (19) F1 α − β −1 f (x) As a consequence, if we decompose the rational fraction F(x) , then the two simple fractions corresponding to the conjugate linear factors √ √ x − α − β −1 and x − α + β −1
are, respectively, (20)
√ A − B −1 √ x − α − β −1
and
√ A + B −1 √ . x − α + β −1
By adding these two fractions we obtain the following: (21)
2A (x − α) + 2Bβ (x − α)2 + β 2
.
This last formula, which has as its numerator a real linear function of the variable x and as its denominator a real factor of the second degree of the polynomial F(x), does not differ from the fraction
11.3 Decomposition into fractions with denominators that are powers of linear factors.
251
u , (x − x0 ) (x − x1 ) which in formula (9) of section I contains in the case where we suppose √ √ x0 = α + β −1 and x1 = α − β −1.
11.3 Decomposition of a given rational fraction into other simpler ones which have for their respective denominators the linear factors of the first rational fraction, or of the powers of these same factors, and constants as their numerators. [314] Let f (x) F (x) be the rational fraction under consideration, m be the degree of the polynomial F(x), and a, b, c, . . . the various roots of the equation (1)
F (x) = 0.
Denote by k a constant coefficient and by m0 , m00 , m000 , . . . several integer numbers for which the sum is equal to m. Then we have (2)
0
00
000
F (x) = k (x − a)m (x − b)m (x − c)m . . . .
Given this, if we make use of the method explained in section I, we decompose the f (x) into two others for which the first one is of the form rational fraction F(x) A (x − a)m
0
,
while the second has as its denominator 0 00 000 F (x) = k (x − a)m −1 (x − b)m (x − c)m . . . . x−a
By decomposing this second rational fraction by the same method, we obtain: 1◦ a new simple fraction A1 , 0 (x − a)m −1
252
11 Decomposition of rational fractions.
[315] in which A1 represents a constant; and 2◦ a fraction which has as its denominator 0 00 000 k (x − a)m −2 (x − b)m (x − c)m . . . . By continuing like this, we successively make the different linear factors composing 0 the power (x − a)m of the polynomial F(x) disappear. When we have extracted from f (x) F(x) a sequence of simple fractions of the form A m0
(x − a)
,
A1 m0 −1
(x − a)
A2
,
m0 −2
(x − a)
,
...,
Am0 −1 , x−a
what remains is a new rational fraction for which the denominator is reduced to 00
000
k (x − b)m (x − c)m . . . . If we extract a second sequence of simple fractions of the form B m00
(x − b)
,
B1 m00 −1
(x − b)
,
B2 m00 −2
(x − b)
,
...,
Bm00 −1 x−b
from what remains, we obtain a second remainder for which the denominator is 000
k (x − c)m . . . . Finally, if we extend these operations until the polynomial F(x) is reduced to the constant k, the last of all the remainders is a rational function with a constant denominator, that is to say an integer function of the variable x. Call this integer function f (x) R. Finally, we have as the value of F(x) decomposed into simple fractions
(3)
f (x) A A1 A 0 =R + + + . . . + m −1 m0 m0 −1 F (x) x−a (x − a) (x − a) B B1 B 00 + + + . . . + m −1 m00 m00 −1 x−b (x − b) (x − b) C C1 Cm000 −1 + 000 + 000 −1 + . . . + m m x−c (x − c) (x − c) +..........................................,
[316] where A, A1 , . . ., Am0 −1 ; B, B1 , . . ., Bm00 −1 ; C, C1 , . . ., Cm000 −1 ; . . . denote constants which we can easily deduce from the principles described in section I, or calculate directly with the aid of the following considerations. For convenience, make
11.3 Decomposition into fractions with denominators that are powers of linear factors.
(4)
253
B1 Bm00 −1 B R+ 00 + 00 −1 + . . . + m m x−b (x − b) (x − b) C C1 C 000 + + + . . . + m −1 m000 m000 −1 x−c (x − c) (x − c) + ........................ Q , = 000 m00 (x − b) (x − c)m . . .
where Q is a new integer function of the variable x. Equation (3) then becomes f (x) A A1 A 0 = + + . . . + m −1 m0 m0 −1 F (x) x−a (x − a) (x − a) Q + . 000 m00 (x − b) (x − c)m . . . If we multiply both sides of this last formula by 0
00
000
F (x) = k (x − a)m (x − b)m (x − c)m . . . we then conclude that f (x) = [A + A1 (x − a) + . . . i (5) m0 −1 0 . . . +A (x − a) m −1
F (x) m0
(x − a)
0
+ kQ (x − a)m .
Consequently, by making x = a + z, we find that (6)
F (a + z) 0 0 + Zzm , f (a + z) = A + A1 z + . . . + Am0 −1 zm +1 0 m z
where Z denotes the value of the polynomial kQ expressed as a function of z. Now suppose that the substitution of x + z in place of x in the functions f (x) and F(x) gives in general f (x + z) = f (x) +z f1 (x) + z2 f2 (x) + . . . , F (x + z) = F (x) +zF1 (x) + z2 F2 (x) + . . . (7) 0 0 +zm Fm0 (x) + zm +1 Fm0 +1 (x) + . . . . [317] By taking x = a + z and observing that the expansion of the function F (x) = F (a + z)
254
11 Decomposition of rational fractions. 0
0
ought to be divisible by (x − a)m = zm , we have that ( f (a + z) = f (a) + z f1 (x) + z2 f2 (x) + . . . , (8) 0 F (a + z) = Fm0 (a) + zFm0 +1 (a) + z2 Fm0 +2 (a) + . . . zm and (9)
F (a) = 0,
F1 (a) = 0,
...,
Fm0 −1 (a) = 0.
Given this, formula (6) is found to reduce to f (a) + z f1 (a) + z2 f2 (a) + . . . = A + A1 z + A2 z2 + . . . (10) 00 × Fm0 (a) + zFm0 +1 (a) + z2 Fm0 +2 (a) + . . . + zm Z. By equating the coefficients of similar powers of z on the two sides of the equation, we derive from this that f (a) = AFm0 (a) , f1 (a) = A1 F 0 (a) + AF 0 (a) , m m +1 (11) f2 (a) = A2 Fm0 (a) + A1 Fm0 +1 (a) + AFm0 +2 (a) , ................................................ By an entirely similar calculation we find f (b) = BFm00 (b) , f1 (b) = B1 Fm00 (b) + BFm00 +1 (b) , f2 (b) = . . . , (12) f (c) = CFm000 (c) , f1 (c) = C1 Fm000 (c) +CFm000 +1 (c) , f2 (c) = . . . , .................., ................................., ............. These various equations suffice to determine completely the values of the constants A, A1 , A2 , . . ., B, B1 , B2 , . . ., C, C1 , C2 , . . .. They give, for example,
(13)
f (a) A = , 0 (a) F m A1 = f1 (a) − AFm0 +1 (a) , Fm0 (a) f (a) − A1 Fm0 +1 (a) − AFm0 +2 (a) 2 , A2 = Fm0 (a) ... .....................................
[318] Because the constants thus determined are evidently independent of the f (x) , it follows that this method used for the decomposition of the rational fraction F(x) fraction is decomposable into simple fractions of the form of those on the right-hand side of equation (3) in only one way.
11.3 Decomposition into fractions with denominators that are powers of linear factors.
255
It is easy to see that the first of equations (13) agrees with formula (14) of section I. Indeed, the quantity Fm0 (a) is what becomes of the polynomial Fm0 (a) + zFm0 +1 (a) + z2 Fm0 +2 (a) + . . . =
F (x) F (a + z) = 0 0 m z (x − a)m
when we make z = 0 or x = a. Consequently, if we set (14)
0
F (x) = (x − a)m ϕ (x) ,
we have Fm0 (a) = ϕ (a) and (15)
A=
f (a) . ϕ (a)
In the case where the functions f (x) and √ F(x) are both real and the equation F(x) = 0 admits m0 roots equal to α + β −1, the same equation also admits m0 roots equal and conjugate to the first ones, and consequently represented by √ α − β −1. Under this hypothesis, if after the decomposition of the rational fraction f (x) , F (x) we combine in pairs the simple fractions which have as their denominators √ m0 √ m0 and x − α + β −1 , x − α − β −1 √ m0 −1 √ m0 −1 x − α − β −1 and x − α + β −1 , ....................., ....................., [319] and finally √ x − α − β −1
and
√ x − α + β −1,
the different sums obtained are the real and rational fractions which have as their respective denominators h im0 (x − α)2 + β 2 , h im0 −1 (x − α)2 + β 2 , .................., (x − α)2 + β 2 ,
256
11 Decomposition of rational fractions.
and by which the system can be replaced by a sequence of other fractions which, with the same denominators, have as their numerators real linear functions of the variable x. Finally, it is easy to calculate directly this new sequence of fractions by beginning with those which correspond to the highest powers of (x − α)2 + β 2 . For example, let us seek the one which has as its denominator h im0 √ m0 √ m 0 x − α + β −1 . (x − α)2 + β 2 = x − α − β −1 From the principles established in section I, the fraction is (16)
u h
provided that we make " 1 u = √ 2β −1 (17) − and (18)
(x − α)2 + β 2
im0 ,
√ √ f α + β −1 √ x − α + β −1 ϕ α + β −1 # √ √ f α − β −1 √ x − α − β −1 ϕ α − β −1
F (x) ϕ (x) = h im0 . (x − α)2 + β 2
We add that if we successively set √ √ x = α + β −1 + z and x = α − β −1 + z in the preceding formula, [320] we conclude, taking into account the second of equations (8), that √ √ Fm0 α + β −1 + zFm0 +1 α + β −1 + . . . √ , ϕ α + β −1 + z = m0 √ 2β −1 + z √ √ Fm0 α − β −1 + zFm0 +1 α − β −1 + . . . √ ϕ α − β −1 + z = , m0 √ −2β −1 + z and consequently
(19)
√ √ Fm0 α + β −1 and √ m0 ϕ α + β −1 = 2β −1 √ √ m0 Fm0 α − β −1 √ m0 . ϕ α − β −1 = (−1) 2β −1
Chapter 12
On recurrent series.
12.1 General considerations on recurrent series. [321] A series (1)
a0 ,
a1 x,
a2 x2 ,
...,
an xn ,
...,
ordered according to the ascending integer powers of the variable x, is called recurrent when in this series, starting after a given term, the coefficient of any power of the variable is expressed as a linear function of a fixed number of the coefficients of lesser powers, and consequently it suffices to run back1 to the values of these last coefficients to deduce the one we are seeking. Thus, for example, the series (2)
1,
2x,
3x2 ,
...,
(n + 1) xn ,
...
is recurrent, considering that if we make an = n + 1, we always have, for values of n greater than 1, (3)
an = 2an−1 − an−2 .
In general, series (1) is recurrent if, for all values of n greater than a certain limit, the coefficients an , an−1 , an−2 , . . . , an−m of several consecutive powers of x are found related to each other [322] by an equation of the first degree. Let (4)
kan−m + lan−m+1 + . . . + pan−1 + qan = 0
1
Cauchy uses the French verb recourir here. He seems to be commenting on the etymology of “recurrent” (r´ecurrent in French), which has its origins in the Latin verb currere, “to run.”
R.E. Bradley, C.E. Sandifer, Cauchy’s Cours d’analyse, Sources and Studies in the History of Mathematics and Physical Sciences, DOI 10.1007/978-1-4419-0549-9 12, c Springer Science+Business Media, LLC 2009
257
258
12 On recurrent series.
be the equation in question, where k, l, . . ., p and q denote determined constants. The sequence of these constants forms what we call the recurrence relation2 of the series, the recurrence for which the constants themselves are the different terms. In series (1), assumed to be recurrent, the variable x and its coefficients a0 , a1 , a2 , . . ., an , can be either real quantities or imaginary expressions. Given this, represent the modulus of the expression an by ρn , and consequently the numerical value of this expression whenever it is real. We conclude immediately from the principles established in Chapters VI and IX that series (1) is either convergent or divergent depending on whether the modulus or the numerical value of x is less than or greater 1 than the smallest of the limits towards which the expression (ρn )− n converges, when n grows indefinitely.
12.2 Expansion of rational fractions into recurrent series. Any time that a rational fraction can be expanded into a convergent series ordered according to ascending integer powers of the variable, that series is recurrent, as we will see. First consider the rational fraction A , (x − a)m
(1)
in which a and A denote two constants, real or imaginary, and m an integer number. It can be put into the form (−1)m
A x −m 1 − , am a
[323] and it is expandable, as well as the expression
1−
x −m , a
into a convergent series ordered according to the ascending integer powers of the variable x if the numerical value of the ratio ax in the real case, or the modulus of the same ratio in the imaginary case, is a quantity contained between the limits 0 and 1. This condition is satisfied if the modulus of the variable x, a modulus which reduces to the numerical value of the same variable when it becomes real,3 is less than the modulus of the constant a, and we have, under this hypothesis,
2 3
Cauchy uses the term e´ chelle de relation, literally “scale [or ladder] of relation.” (tr.) Cauchy writes imaginaire here in [Cauchy 1821, p. 391, Cauchy 1897, p. 323].
12.2 Expansion of rational fractions into recurrent series.
(2)
259
−m m x m (m + 1) x2 + +... = 1+ 1 − ax 1a 1 · 2 a2 1 · 2 · 3 . . . (m − 1) 2·3·4...m x = + 1 · 2 · 3 . . . (m − 1) 1 · 2 · 3 . . . (m − 1) a 3 · 4 · 5 . . . (m + 1) x2 + +.... 1 · 2 · 3 . . . (m − 1) a2
Consequently, we find (3)
A = (−1)m (x − a)m
A m Ax m (m + 1) Ax2 + + +... . am 1 am+1 1 · 2 am+2
If for brevity we make A (−1)m m = a0 , a m A (−1)m = a1 , 1 am+1 (−1)m m (m + 1) A = a , 2 1 · 2 am+2 ..............................,
(4)
we obtain the equation (5)
A = a0 + a1 x + a2 x 2 + . . . + an x n + . . . . (x − a)m
[324] Now imagine that we multiply both sides of the preceding equation by (a − x)m . We find that4 h i m m − m am−1 x + m(m−1) am−2 x2 − . . . ± xm (−1) A = a 1 1·2 2 +... × a + a x + a x 0 1 2 = am (a0 +a1 x+a2 x2 +...+am xm +am+1 xm+1 +...) (6) − m1 am−1 (a0 x+a1 x2 +...+am−1 xm +am xm+1 +...) m−2 a x2 +...+a m m+1 +... + m(m−1) (0 ) m−2 x +am−1 x 1·2 a −.................................... ±(a0 xm +a1 xm+1 +···), or what amounts to the same thing,
4
Cauchy writes a + before the ellipses in the first line of this equation in [Cauchy 1821, p. 392, Cauchy 1897, p. 324].
260
(7)
12 On recurrent series.
(−1)m A = am a0 + am a1 − m1 am−1 a0 x h i m a − m am−1 a + m(m−1) am−2 a x2 + a 2 1 0 1 1·2 + . . . . . . . . . . . . . . . . . . . . . . . . ............... h m−2 a + am an − m1 am−1 an−1 + m(m−1) n−2 1·2 a − . . . ± an−m xn +.......................................
This last formula ought to remain true any time that the modulus of the variable x is less than the modulus of the constant a, and consequently any time that we attribute to x a real value slightly different from zero. We conclude, by reasoning similar to that which we have used for the proof of theorem VI of Chapter VI (§ IV), that (−1)m A = am a0 , am a1 − m am−1 a0 = 0, 1 (8) m m−1 m−2 a = 0, m a a2 − 1 a a1 + m(m−1) 0 1·2 a ........................................, [325] and in general, (9)
am an −
m m−1 m (m − 1) m−2 a an−1 + a an−2 − . . . ± an−m = 0. 1 1·2
It is essential to remark that equation (9) applies only for real integer values of n greater than or equal to m, and that whenever we suppose that n < m, it ought to be replaced by one of the formulas (8). Moreover, because equation (9) is linear with respect to the constants an ,
an−1 ,
an−2 ,
...,
an−m ,
it gives the first of these constants as a linear function of all the other ones. It follows that in the series (10) an , a1 x, a2 x2 , . . . , an xn , . . . starting from the term am xm ,5 the coefficient of any power of x is expressed as a linear function of the m coefficients of lesser powers taken consecutively. This series is thus one of those that we have named recurrent. Among the various particular formulas which we can deduce from equation (3), it is good to mention those which correspond to the two suppositions m = 1 and m = 2. We find, under the first hypothesis, that
5
In [Cauchy 1897, p. 325], this is written as am xm . It is given correctly in [Cauchy 1821, p. 394].
12.2 Expansion of rational fractions into recurrent series.
261
A A A A =− + 2 x + 3 x2 + . . . , x−a a a a
(11)
and under the second hypothesis, that (12)
A 2
(x − a)
=
A A A A + 2 3 x + 3 4 x2 + 4 5 x3 + . . . . 2 a a a a
The two preceding formulas, where the first determines the sum of a geometric progression, remain true, and thus equation (3) as well, as long as the modulus of x is less than the modulus of a. [326] When in equation (12) we make both A=1
and a = 1,
we obtain the following (13)
1 (x − 1)2
= 1 + 2x + 3x2 + 4x3 + . . . ,
which has for its right-hand side the sum of series (2) (§ I), and supposes that the modulus of x is less than 1. Now consider any rational fraction (14)
f (x) , F (x)
where f (x) and F(x) are two integer functions of the variable x. Represent by a, b, c, . . . the various roots of the equation (15)
F (x) = 0,
by m0 the number of roots equal to a, by m00 the number of roots equal to b, by m000 the number of roots equal to c, . . ., and by k the coefficient of the highest power of x in the polynomial F(x), so that we have (16)
0
00
000
F (x) = k (x − a)m (x − b)m (x − c)m . . . .
f (x) into simple fractions, the method For the decomposition of the rational fraction F(x) explained in the preceding chapter gives an equation of the form
262
(17)
12 On recurrent series.
A A1 A 0 f (x) =R + + + . . . + m −1 m0 m0 −1 F (x) x−a (x − a) (x − a) B B1 B 00 + + + . . . + m −1 m00 m00 −1 x−b (x − b) (x − b) C C1 Cm000 −1 + 000 + 000 −1 + . . . + m m x−c (x − c) (x − c) +. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ,
where A, A1 , . . ., B, B1 , . . ., C, C1 , . . ., etc. denote determined constants [327] and R is an integer function of x which vanishes when the degree of the polynomial f (x) is less than that of the polynomial F(x). Given this, imagine that the modulus of the variable x is less than the moduli of the various roots a, b, c, . . ., and consequently less than the smallest of these moduli. We can expand each of the simple fractions that make up the right-hand side of equation (17) into a convergent series ordered according to the ascending powers of the variable x. Then, by adding the expansions formed like this to the polynomial R, we obtain a new convergent series, still ordered according to the ascending powers of x and where the sum is equal to the rational f (x) fraction F(x) . Let (18)
a0 ,
a1 x,
a2 x2 ,
...,
an xn ,
...
be the new series in question here. The formula (19)
f (x) = a0 + a1 x + a2 x2 + . . . F (x)
remains true any time this new series is convergent, that is to say any time the modulus of the variable x is less than the smallest of the numbers that serve as the moduli of the roots of equation (15). I add that series (18) is still a recurrent series. We will easily prove this as follows. Denote by m the sum of the integer numbers m0 , m00 , m000 , . . ., or what amounts to the same thing, the degree of the polynomial F(x), and consequently make (20)
F (x) = kxm + lxm−1 + . . . + px + q,
where k, l, . . ., p and q represent constants, real or imaginary. Equation (19) becomes (21)
f (x) kxm + lxm−1 + . . . + px + q
= a0 + a1 x + a2 x2 + . . . .
After putting it into the form (22)
f (x) = q + px + . . . + lxm−1 + kxm
a0 + a1 x + a2 x2 + . . . ,
[328] we get, by expanding the right-hand side as we did for equation (6),
12.2 Expansion of rational fractions into recurrent series.
(23)
263
f (x) = qa0 + (qa1 + pa0 ) x + . . . + (qam + pam−1 + . . . + la1 + ka0 ) xm + . . . + (qan + pan−1 + . . . + lan−m+1 + kan−m ) xn +..........................................
Because this last formula ought to remain true as long as the modulus of the variable x is less than the moduli of the constants a, b, c, . . ., we can prove, by reasoning similar to that which we have used to establish theorem VI of Chapter VI (§ IV), that the coefficients of like powers of x in the two sides are necessarily equal to each other. It follows: 1◦ that the coefficients of the various powers of x in the different terms of the polynomial f (x) are respectively equal to the coefficients of the same powers of the series, the sum of which constitutes the right-hand side of equation (23); and 2◦ that in this series the coefficients of the powers where the exponent surpasses the degree of the polynomial f (x) reduce to zero. Moreover, if we consider a term of the series in which the exponent n of the variable x surpasses the degree of the polynomial f (x), and is at the same time equal to or greater than m, the term is of the form (qan + pan−1 + . . . + lan−m+1 + kan−m ) xn . Thus, any time the value of n is greater than the degree of the polynomial f (x) and is also equal to or greater than the degree m of the polynomial F(x), the coefficients an ,
an−1 ,
...,
an−m+1 ,
an−m
are found to satisfy the linear equation (24)
qan + pan−1 + . . . + lan−m+1 + kan−m = 0.
Consequently, for such a value of n, the coefficent an of the power xn is expressed as a linear function of those coefficients of m lesser powers taken consecutively. Series (18) [329] is thus one of those that we call recurrent. Its recurrence relation is composed of the constants k,
l,
...,
p,
q,
respectively equal to the coefficients of the various powers of x in the polynomial F(x). Among the series which represent the expansions of the fractions contained in the right-hand side of formula (17) and which are all convergent in the case where the modulus of the variable x remains less than the moduli of the various roots of equation (15), at least one would become divergent if the modulus of the variable came to surpass that of some root. Consequently, series (18), still convergent in the first case, is divergent in the second. On the other hand, if we make the integer number n increase indefinitely, and if we denote by ρn the modulus of the coefficient an in series (18), this series is convergent or divergent (see § I) depending on whether 1 the modulus of x is less than or greater than the smallest of the limits of (ρn )− n .
264
12 On recurrent series.
Because the two rules of convergence that we have just stated must necessarily agree with each other, we can conclude that the smallest of the moduli which correspond to roots of equation (15) is precisely equal to the smallest of the limits of the expression 1 (ρn )− n . When the two functions f (x) and F(x) are real, the coefficient an is real as well, and its modulus ρn is no different from its numerical value. If under the same hypothesis, the equation F(x) = 0 has no real roots, the root that has the smallest numerical value is, from what we have just said, equal (up the sign) to the small1 n converges to a fixed limit, we est of the limits of (ρn )− n . Finally, if the ratio ρρn+1 can substitute it (Chap. II, § III, theorem II) for the desired limit of the expression 1 (ρn )− n .6 This remark leads to the rule that Daniel Bernoulli has given for determining numerically [330] the smallest (ignoring the sign) of all the quantities which represent the roots, supposed to be real, of an algebraic equation.
12.3 Summation of recurrent series and the determination of their general terms. When a series ordered according to the ascending powers of the variable x is at the same time convergent and recurrent, it always has a rational fraction as its sum. Indeed, let (1) a0 , a1 x, a2 x2 , . . . , an xn , . . . be such a series. Suppose that for values of n above a certain limit, the coefficient an of the power xn is determined as a linear function of the n coefficients of the lesser powers by an equation of the form (2)
kam−n + lan−m+1 + . . . + pan−1 + qan = 0,
such that the constants k,
l,
...,
p and q
form the recurrence relation of the series. If we multiply the sum of the series, namely a0 + a1 x + a2 x2 + . . . by the polynomial kxm + lxm−1 + . . . + px + q, the product obtained is the sum of a new series in which the coefficient of xn , calculated as in Chapter VI (§ IV, theorem V), vanishes for values of n greater than the assigned limit. In other words, the product in question is a new polynomial of a degree indicated by this limit. If we denote this new polynomial by f (x), we have 6
1
This was given as (ρ)− n in [Cauchy 1821, p. 400, Cauchy 1897, p. 329].
12.3 Summation of recurrent series.
(3)
265
f (x) = kxm + lxm−1 + . . . + px + q a0 + a1 x + a2 x2 + . . .
[331] and consequently, (4)
a0 + a1 x + a2 x2 + . . . =
f (x) . kxm + lxm−1 + . . . + px + q
Thus, any series which is ordered according to the ascending and integer powers of the variable x and which is both convergent and recurrent has as its sum a rational fraction for which the denominator is a polynomial in which the successive powers of x have for coefficients the different terms of the recurrence relation of the series. When we describe a recurrent series by giving only its first terms and the recurrence relation which serves to determine from the first terms all those which follow, then with the aid of the method which we have just indicated, we easily determine the rational fraction which represents the sum of the series in the case where it remains convergent. Once this rational fraction is calculated, we can substitute a sum of simple fractions for it, possibly augmented by an integer function of the variable x. If we then seek the recurrent series which expresses the expansions of the simple fractions in question for conveniently chosen values of x, and we add the general terms of these same series, we obtain the general term of the proposed series.
NOTES.1 NOTE I. O N THE THEORY OF POSITIVE AND NEGATIVE QUANTITIES .
[333] There has been much dispute about the nature of positive and negative quantities, and various theories have been given on this subject. The one we have adopted (see the Preliminaries, pages 2 and 3)2 appears to us to be the best for clarifying all the difficulties. First we will state it in a few words. Then we will show how we deduce the rule of signs. Just as we see the idea of number born from the measurement of magnitudes, so also we acquire the idea of quantity (positive or negative) when we consider each magnitude of a given kind as being able to serve as the increase or diminution of another fixed size of the same kind. To indicate this intention, we represent the sizes that ought to serve as increases by numbers preceded by the sign +, and the sizes that ought to serve as diminutions by the numbers preceded by the sign −. Given this, the signs + or − placed in front of numbers can be compared, following the remark that has been made,3 with adjectives placed near their nouns. We designate 1
The last portion of the Cours d’analyse is a collection of 9 appendices, which Cauchy calls “Notes.” On p. 1 of the introduction [Cauchy 1821, p. ii, Cauchy 1897, p. ii], he describes them as “the derivations which may be useful both to professors and students of the Royal Colleges, as well as to those who wish to make a special study of analysis.” 2 See [Cauchy 1821, pp. 2–3, Cauchy 1897, pp. 18–19]. Curiously, this reference in [Cauchy 1897] was still to pages 2 and 3, even though that edition has no pages numbered 2 or 3. (tr.) 3 Cauchy’s footnote (1) reads “Transactions philosophiques, ann´ ee 1806,” a reference is to [Bu´ee 1806]. Abb´e Bu´ee (1748–1826) in turn cited Carnot, Frend and Euler, so it seems that Bu´ee, hence R.E. Bradley, C.E. Sandifer, Cauchy’s Cours d’analyse, Sources and Studies 267 in the History of Mathematics and Physical Sciences, DOI 10.1007/978-1-4419-0549-9 BM2, c Springer Science+Business Media, LLC 2009
268
Note I – On the theory of positive and negative quantities.
numbers preceded by the sign + as positive quantities, and numbers preceded by the sign − as negative quantities. Finally, we agree to include the absolute numbers which are not preceded by any sign among the class of positive quantities, and it is for this reason that we sometimes dispense with writing the sign + before the numbers which ought to represent quantities of this kind. In Arithmetic we always operate on numbers for which the particular value is known, and which are consequently given as figures, while in Algebra, where we consider the general properties of numbers, [334] we ordinarily represent these same numbers by letters. There, a quantity is expressed by a letter preceded by the sign + or −. Moreover, nothing prevents representing the quantities by simple letters as well as by numbers. It is an artifice which augments the resources of Analysis, but when we wish to use it, it is necessary to take account of the following conventions. Following what we have said above, in the case where the letter A represents a number, we can denote the positive quantity for which the numerical value is equal to A either by +A or by A alone, while −A denotes the opposite quantity, that is to say the negative quantity for which A is the numerical value. Thus, in the case where a represents a quantity, we regard the two expressions a and +a as synonyms, and we denote by −a the opposite quantity. Following these conventions, if we represent either a number or any quantity by A, and if we make a = +A and b = −A, then we have +a = +A,
+b = −A,
−a = −A and −b = +A. In the last four equations, if we replace a and b with their values between parentheses, we get the formulas ( + (+A) = +A, + (−A) = −A, (1) − (+A) = −A and − (−A) = −A. In each of these formulas, the sign of the right-hand side is what we call the product of the two signs of the left-hand side. To multiply two signs by each other is to form their product. Inspection alone of equations (1) suffices to establish the rule of signs, contained in the theorem which I am going to state. Theorem I. — The product of two signs that are the same is always + and the product of two opposite signs is always −. It also follows from the same equations that when one of the signs is +, the product of two signs is equal to the other one. Thus, if we have several signs to Cauchy, was fully aware of the controversy raging in England at the time, regarding the nature of negative numbers. This paper by Bu´ee is described as “perhaps the very first purely mathematical theory of time” [Windred 1933].
Note I – On the theory of positive and negative quantities.
269
multiply together, we can ignore all the + signs. From this remark, we easily deduce the following propositions: Theorem II. — If we multiply together several signs [335] in any order, the product is always + whenever the number of − signs is even, and the product is − in the opposite case. Theorem III. — The product of as many signs as we like remains the same, in any order in which we multiply them. An immediate consequence of the above definitions is that the multiplication of signs has no relation to the multiplication of numbers. However, we need not be surprised if we note that the idea of the product of two signs arises as one of the first steps that we make in Analysis, because in addition or subtraction of a monomial, we really multiply the sign of this monomial by the sign + or −. Starting from the principles which we have just established, we easily clear up all difficulties which the use of the signs + and − can present in the operations of Algebra and of Trigonometry. We need only distinguish carefully the operations relative to numbers from those which apply to positive or negative quantities. We especially ought to clarify precisely the goals of each kind of operation, to define their results and to describe their principal properties. This is what we are going to try to do in a few words, for the various operations which we commonly use.
Addition and subtraction. Sums and differences of numbers. — To add the number B to the number A, or in other words, to subject the number A to an increase +B is what we call an arithmetic addition. The result of this operation is called the sum. We indicate this by placing the increase +B next to the number A, as follows: A + B. We will not prove it, but we admit as evident that the sum of several numbers remains the same in whatever order we add them. This is a fundamental axiom on which rest Arithmetic, Algebra and all the sciences of calculation. Arithmetic subtraction is the inverse of addition. It consists of taking away from a first number A some second number B, that is to say of finding a third number C which added to the second number reproduces the first number. This is also what we call subjecting to the number A the diminution −B. The result of this operation is called the difference. We indicate it by placing [336] the diminution −B following the number A, as follows: A − B. Sometimes we indicate the difference A − B with the name the excess or the remainder or the arithmetic relation between the two numbers A and B.
270
Note I – On the theory of positive and negative quantities.
Sums and differences of quantities. — We have explained in the preliminaries what it is to add two quantities together. In adding several quantities to each other, we obtain what we call their sum. Based on the axiom about the addition of numbers, it is easy to prove the following proposition: Theorem IV. — The sum of several quantities remains the same in whatever order we add them. We indicate the unique sum of several quantities by the simple juxtaposition either of the letters which represent their numerical values or of the quantities themselves, with each letter preceded by the sign which it must have to express the corresponding quantity. Moreover, the different letters can be arranged in any order, and it is permitted to suppress the + sign before the first letter. Let us consider, for example, the quantities a,
b,
c,
...,
−f,
−g,
−h,
....
Their sum could be represented by the expression a− f −g+b−h+c+.... In such an expression, each of the quantities a,
b,
c,
...,
−f,
−g,
−h,
...
is what we call a monomial. The expression itself is a polynomial, for which the monomials in question are the different terms. When a polynomial contains only two, three, four, . . ., terms, it takes the name binomial, trinomial, quadrinomial, . . .. We easily prove that two polynomials for which the terms are equal and of contrary signs represent two opposite quantities. The difference between a first quantity and a second is a third quantity which, added to the second, reproduces the first. On the basis of this definition, we prove that to subtract a second quantity b from a first quantity [337] a, it suffices to add the quantity opposite to b, that is to say −b, to the first quantity. We thus conclude that the difference of two quantities a and b ought to be represented by a − b. Note. — Subtraction, being the inverse of addition, can always be indicated in two ways. Thus, for example, to express that the quantity c is the difference of two quantities a and b, we can write either a − b = c or a = b + c.
Note I – On the theory of positive and negative quantities.
271
Multiplication and division. Products and quotients of numbers. — To multiply the number A by the number B is to operate on the number A precisely as we operate on one to obtain B. The result of this operation is what we call the product of A by B. To better understand the preceding definition of multiplication, it is necessary to distinguish different cases, depending on the kind of number B is. This number may be rational, that is to say integer or fractional, or it may be irrational, that is to say not rational. To obtain B when B is an integer number, it suffices to add one to itself several times consecutively. Thus, to form the product of A by B, we must add the number A to itself the same number of times, that is to say the sum of as many numbers equal to A as there are ones in B. When B is a fraction which has numerator m and denominator n, the operation by which we arrive at the number B consists of separating the number one into n equal parts and then repeating the result m times. Thus, we obtain the product of A by B by separating the number A into n equal parts and then repeating one of these parts m times. When B is an irrational number, we can obtain rational numbers that approach it more and more closely. We can easily see that under the same hypothesis the product of A by the rational numbers in question approach a certain limit more and more closely. This limit is the product of A by B. If we suppose, for example, that B = 0, we find a zero limit, and we conclude that the product of any number by zero vanishes. In the multiplication of A by B, we call the number A the multiplicand and [338] the number B the multiplier. These numbers are also designated together under the name the factors of the product. To indicate the product of A by B, we use any one of the following three notations: B × A,
B · A or BA.
The product of several numbers remains the same in whatever order we multiply them. This proposition, when it concerns just two or three integer factors, is derived from the axiom about the addition of numbers. We can then prove it successively: 1◦ for two or three rational factors; 2◦ for two or three irrational factors; and finally 3◦ for any number of factors, rational or irrational. To divide the number A by the number B is to find a third number for which its product by B is equal to A. The operation by which we arrive at this is called division and the result of this operation is the quotient. Moreover, the number A takes the name of dividend and the number B that of divisor. To indicate the quotient of A by B, we use at will one of the two following notations: A or A : B. B Sometimes we indicate the quotient A : B by the name ratio or geometric relation of the two numbers A and B.
272
Note I – On the theory of positive and negative quantities.
The equality of two geometric ratios A : B and C : D, or in other words the equation
A : B = C : D,
is what we call a geometric proportion. Ordinarily, instead of the sign = we use the following ::, which has the same meaning, and we write
A : B :: C : D.
Note. — From the definition, when B is an integer number, to divide A by B is to find a number which, repeated B times, reproduces A. Thus, it is to separate the number A into as many equal parts as there are ones in B. We conclude easily from this remark that if m and n denote two integer numbers, the nth part of one ought to be represented by
1/n,
[339] and the fraction which has numerator m and denominator n by
m × 1/n.
Indeed, this is the notion by which we naturally denote the fraction in question. However, because we easily prove that the product
m × 1/n
is equivalent to the quotient of m by n, that is to say to m/n, it follows that the same fraction can be represented more simply by the notation
m/n.
Products and quotients of quantities. — The product of a first quantity by a second is a third quantity which has for its numerical value the product of the numerical values of the two others, and for its sign the product of their signs. To multiply two quantities by each other is to form their product. The first of the two quantities is called the multiplier and the other the multiplicand, and the two of them are both factors of the product. Using these definitions, we easily establish the following proposition:
Theorem V. — The product of several quantities remains the same in whatever order we multiply them.
To prove this proposition, it suffices to combine the similar proposition about numbers with theorem III about signs (see above).4
4
[Cauchy 1821, p. 405, Cauchy 1897, p. 335].
To divide a first quantity by a second is to find a third quantity which, multiplied by the second, reproduces the first. The operation by which we arrive at this is called division. The first quantity is the dividend, the second is the divisor and the result of the operation is the quotient. Sometimes we indicate the quotient by the name of ratio or geometric relation of the two given quantities. On the basis of the preceding definitions, we easily prove that the quotient of two quantities has as its numerical value the quotient of their numerical values, and as its sign the product of their signs. [340] Multiplication and division of quantities are indicated just like the multiplication and division of numbers. We say that two quantities are inverses of each other when the product of these two quantities is one. From this definition, the quantity a has 1/a as its inverse, and reciprocally. We have remarked above that what we call a fraction in Arithmetic is equal to the ratio or quotient of two integer numbers. In Algebra we also denote the ratio or quotient of any two quantities by the name fraction. Thus if a and b represent two quantities, their ratio a/b is an algebraic fraction. Again we observe that division, being an inverse operation of multiplication, can always be indicated in two ways. Thus, for example, to express that the quantity c is the quotient of two quantities a and b, we can write either
a/b = c  or  a = bc.
Products and quotients of numbers enjoy general properties to which we often have recourse. We have already spoken of the one whereby the product remains the same in whatever order we may multiply its factors. Other properties, no less remarkable, are found in the formulas which I am about to write. Let
a, b, c, . . . , k,
a′, b′, . . . ,
a″, b″, . . . ,
. . . . . . . . .
be several sequences of quantities, positive or negative. For all possible values of these quantities, we have
(2)  k(a + b + c + . . .) = ka + kb + kc + . . . ,
     (a + b + c + . . .)/k = a/k + b/k + c/k + . . . ,
     (a/b) × (a′/b′) × (a″/b″) × . . . = (a a′ a″ . . .)/(b b′ b″ . . .),
     k/(a/b) = bk/a = (b/a) × k.
The four preceding formulas give rise to a multitude of consequences [341] which would be too long to list here in detail. We conclude from the third formula, for example: 1° that the fractions
a/b and ka/kb
are equal to each other, where a, b and k denote any quantities; 2° that the fraction a/b has b/a as its inverse; and 3° that to divide one quantity k by another quantity a, it suffices to multiply k by the inverse of a, that is to say by 1/a.
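The four formulas labelled (2) and the consequences just listed can be checked with exact rational arithmetic. The following sketch (not part of the original text; sample values are arbitrary) does so in Python.

```python
# A quick check of the four formulas labelled (2), using exact rational arithmetic.
from fractions import Fraction as F

k, a, b, c = F(3, 2), F(1, 3), F(-2, 5), F(7, 4)
a1, b1 = F(5, 6), F(-3, 7)

assert k * (a + b + c) == k*a + k*b + k*c              # k(a+b+c+...) = ka + kb + kc + ...
assert (a + b + c) / k == a/k + b/k + c/k              # (a+b+c+...)/k = a/k + b/k + c/k + ...
assert (a/b) * (a1/b1) == (a * a1) / (b * b1)          # (a/b)(a'/b') = a a'/(b b')
assert k / (a/b) == (b*k)/a == (b/a) * k               # k/(a/b) = bk/a = (b/a) k
print("all four identities hold for these sample values")
```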
Elevation of powers. Extraction of roots. Powers and roots of numbers. Positive exponents. — To raise the number A to the power indicated by the number B is to look for a third number which is formed from A by multiplication as B is formed from one by addition. The result of this operation made on the number A is what we call its power of degree B. To understand the preceding definition of the elevation to powers well, it is necessary to distinguish three cases, depending on whether the number B is integer, fractional or irrational. When B denotes an integer number, this number is the sum of several ones. The power of A of degree B thus ought to be the product of as many factors equal to A as there are ones in B. When B represents a fraction m/n (m and n being two integer numbers), to represent this fraction it is necessary: 1° to find a number which, repeated n times, produces one; and 2° to repeat the number in question m times. Thus, to obtain the power of A of degree m/n, it is necessary: 1° to find a number such that the multiplication of n factors equal to this number reproduces A; and 2° to form a product of m factors equal to this same number. When we suppose in particular that m = 1, the power of A under consideration reduces to that of degree 1/n, and it is found to be determined by the single condition that the number A be equivalent to the product of n factors equal to this same power. When B is an irrational number, we can then obtain rational numbers with values approaching it more and more closely. We easily prove that under the same hypothesis, powers of A indicated by the rational [342] numbers in question approach more and more closely towards a certain limit. This limit is the power of A of degree B. In the elevation of the number A to the power of degree B, the number A is called the root and the number B, which indicates the degree of the power, is the exponent. To represent the power of A of degree B, we use the following notation A^B. From the preceding definitions, the first power of a number is nothing but the number itself. Its second power is the product of two factors equal to this number, its third power the product of three such factors, and so on. Geometric considerations have led us to indicate the second power by the name square and the third power by the name cube. As for the power of degree zero, it is the limit towards which the power of degree B converges when the number B decreases indefinitely. It is easy to show that this limit reduces to one, from which it follows that we have, in general,
A^0 = 1.
We always suppose that the value of the number A remains finite and different from zero. To extract the root indicated by the number B of a number A is to find a third number which, raised to the power of degree B, produces A. The operation by which we accomplish this is called extraction and the result of the operation is the root of A of degree B. The number B which indicates the degree of the root is called the index. To represent it, we use the following notation:
\sqrt[B]{A}.
The roots of second and third degree are ordinarily indicated by the names square roots and cube roots. When it is a matter of a square root, we almost always dispense with writing the index 2 along with the sign √. Thus the two notations
\sqrt[2]{A}  and  \sqrt{A}
ought to be considered as equivalent.
Note. — The extraction of roots of numbers, being the inverse of their elevation to powers, can always be indicated in two ways. Thus, for example, to express that the number C is equal to the root of A of [343] degree B, we can write either
A = C^B  or  C = \sqrt[B]{A}.
We remark again that, by virtue of these definitions, if we denote any integer number by n, then A^{1/n} is a number such that the multiplication of n factors equal to this number produces A. In other words, we have
(A^{1/n})^n = A,
from which we conclude that
A^{1/n} = \sqrt[n]{A}.
Thus, when n is an integer number, the power of A of degree 1/n and the nth root of A are equivalent expressions. We prove easily that it is the same in the case where we replace the integer number n by any number.
Powers of numbers. Negative exponents. — To raise the number A to the power indicated by the negative exponent −B is to divide one by A^B. The value of the expression A^{−B} is thus found to be determined by the equation
A^{−B} = 1/A^B,
which we can also put into the form
A^B · A^{−B} = 1.
Consequently, if we raise the same number to two powers indicated by two opposite quantities, we obtain as results two positive quantities that are inverses to each other.
Powers and real roots of quantities. — In the definitions which we have given of powers and roots of numbers corresponding to exponents, either integer or fractional, if we substitute the word quantities in place of numbers, we obtain the following definitions for powers and real roots of quantities. To raise the quantity a to the real power of degree m, where m is an integer number, [344] is to form the product of as many factors equal to a as there are ones in m. To raise the quantity a to the real power of degree m/n, where m and n are two integer numbers and, to avoid all uncertainty, where the fraction m/n is reduced to its simplest expression, is to form a product of m factors chosen so that the nth power of each of them is equal to the quantity a. To extract the real root of degree m or m/n of the quantity a is to find a new quantity which, raised to the real power of degree m or m/n, produces a. From this definition, the nth real root of a quantity is evidently the same thing as its real power of degree 1/n. Moreover, we easily prove that the root of degree m/n equals the power of degree n/m. Finally, to raise the quantity a to the real power of degree −m or −m/n is to divide one by the same quantity a raised to the real power of degree m or m/n. In these operations of which we have just spoken, the number or the quantity which marks the degree of a real power of a is called the exponent of this power, while the number which marks the degree of a real root is named the index of this root. Every power of a which corresponds to an exponent for which the numerical value is an integer, that is to say to an exponent of the form +m or −m, where m represents an integer number, admits a unique real value which we denote by the notation a^m or a^{−m}. As for the roots and powers for which the numerical value of the exponent is fractional, they can admit either two real values, or but one real value, or admit none at all. The real values in question here are necessarily either positive quantities or negative quantities. However, in Algebra, in addition to these quantities we also use symbols which have no meaning by themselves, but nevertheless receive the names powers and roots because of their properties. These symbols [345] are among the algebraic expressions to which we have given the name imaginary, as opposed to the name real expressions, which only applies to numbers or quantities. Given this, it follows from the principles established in Chapter VII that the nth root of any quantity a and its powers of degree m/n and −m/n, where n is an integer number and m/n is an irreducible fraction, each admit n distinct values, real
or imaginary. Conforming to the notations adopted in the same chapter, we denote any one of these values, if it is a question of the nth root, by the notation
\sqrt[n]{((a))} = ((a))^{1/n},
and if it is a question of the power which has for its exponent m/n or −m/n, by the notation
((a))^{m/n}  or  ((a))^{−m/n}.
We add that the expression ((a))^{1/n} is contained as a particular case of the more general expression ((a))^{m/n}. By calling A the numerical value of a, we find that the real values of the two expressions
((a))^{m/n}  and  ((a))^{−m/n}
are:
1° If n denotes an odd number and
a is +A . . . . . . . . . . . . +A^{m/n} and +A^{−m/n},
a is −A . . . . . . . . . . . . −A^{m/n} and −A^{−m/n};
2° If n denotes an even number and
a is +A . . . . . . . . . . . . ±A^{m/n} and ±A^{−m/n}.
In the last case, when we suppose that a is negative, all the values of each of the expressions ((a))^{m/n} and ((a))^{−m/n} become imaginary. If we make the fraction m/n vary in such a way that it approaches indefinitely an irrational number B, the denominator n then grows beyond any assignable limit, and likewise the number of imaginary values [346] which each of the expressions
((a))^{m/n}  and  ((a))^{−m/n}
take on. Consequently, we cannot admit into calculation the notations
((a))^B  and  ((a))^{−B},
or the notation ((a))^b when we make b = ±B, unless we consider such a notation itself as representing an infinity of imaginary expressions. To avoid this inconvenience, we never employ the algebraic expression ((a))^b in the case where the numerical value of b is irrational. Under this hypothesis, only when a takes a positive value +A can we make use of the notation
a^b  or  (a)^b,
which we ought to consider as equivalent to +A^b (see Chapter VII, § IV).
Powers of numbers and quantities enjoy several remarkable properties which are easy to prove. Among others, we note those contained in the formulas which I am going to write. Let a, a′, a″, . . ., b, b′, b″, . . . be any quantities, positive or negative. Let A, A′, A″, . . . be any numbers and let m, m′, m″, . . . be integer numbers. We have
(3)  A^b A^{b′} A^{b″} . . . = A^{b+b′+b″+ ...},
     A^b A′^b A″^b . . . = (A A′ A″ . . .)^b,
     (A^b)^{b′} = A^{b b′},
and
(4)  a^{±m} a^{±m′} a^{±m″} . . . = a^{±m±m′±m″± ...},
     a^m a′^m a″^m . . . = (a a′ a″ . . .)^m,
     a^{−m} a′^{−m} a″^{−m} . . . = (a a′ a″ . . .)^{−m},
     (a^m)^{m′} = (a^{−m})^{−m′} = a^{m m′}  and  (a^m)^{−m′} = (a^{−m})^{m′} = a^{−m m′},
where each of the numbers m, m′, m″, . . . in the first equation (4) must be affected with the same sign on both sides of the equation.
[347] Formulas (3) and (4) give rise to a multitude of consequences, among which we will content ourselves to indicate the following. We get from the second formula (3) that
A^b (1/A)^b = 1^b = 1,
and we then conclude that
(1/A)^b = 1/A^b.
Thus, if we raise two positive quantities that are inverses to each other to the same power, the results are always two inverse quantities.
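The remark made earlier, that the expression ((a))^{1/n} admits n distinct values, real or imaginary, can be made concrete numerically. The sketch below is only an illustration and not Cauchy's procedure; the helper name nth_roots and the sample values are arbitrary choices.

```python
# List the n complex numbers z with z**n == a, i.e. the n values of ((a))^(1/n).
import cmath

def nth_roots(a, n):
    """Return the n complex numbers z satisfying z**n == a, for a nonzero real a."""
    r = abs(a) ** (1.0 / n)
    theta = cmath.phase(complex(a))            # 0 for a > 0, pi for a < 0
    return [r * cmath.exp(1j * (theta + 2 * cmath.pi * k) / n) for k in range(n)]

for z in nth_roots(-8, 3):                     # the three cube roots of -8
    print(z, "-> z**3 =", z**3)
```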
Formation of exponentials and logarithms. When we regard the number A as fixed and the quantity x as a variable in the expression Ax , the power Ax takes the name exponential. Under the same hypothesis, if for a particular value of x we have Ax = B, then this particular value is what we call the logarithm of the number B in the system for which the base is A. We indicate this logarithm by placing before the number the initial letters ln or log, like5 ln B or
log B.
However, as such a notation does not tell the base of the system of logarithms to which it refers, it is important to state in the discussion the value of this base. Given this, if we use the characteristic log to denote logarithms taken in the system for which the base is A, the equation Ax = B implies the following one x = log B. Sometimes, when we must treat logarithms taken in different systems at the same time, we distinguish among them with the aid of one of several accents placed to the right of the letters log, and as a consequence we denote by these letters without accents the logarithms of a first system, by the same letters followed by a single accent logarithms of a second system, etc. Based on the preceding definitions and on the general properties of powers of numbers, we easily recognize: 1◦ that one [348] has zero for its logarithm in all systems; 2◦ that in any system of logarithms for which the base exceeds one, every number greater than one has a positive logarithm, and every number less than one has a negative logarithm; 3◦ that in any system of logarithms for which the base is less than one, every number less than one has a positive logarithm and every number greater than one has a negative logarithm; and finally 4◦ that in two systems for which the bases are inverses to one another, the logarithms of the same number are equal and of contrary signs. Moreover, we easily prove the formulas which establish the principal properties of logarithms, among which we ought to note these which I am going to write. If we denote by B, B0 , B00 , . . ., C any numbers, by the characteristics log and log0 the logarithms taken in two different systems for which the bases are A and A0 , and by k any quantity, positive or negative, we have
5
As mentioned in the Preface and in a footnote in the Preliminaries, we use the more modern notations “ln” and “log” to avoid confusion, whereas Cauchy used “l” and “L”, respectively. If Cauchy means the natural logarithm, we always use “ln.” (tr.)
(5)  log(B B′ B″ . . .) = log B + log B′ + log B″ + . . . ,
     log B^k = k log B,
     B^{log C} = A^{log B · log C} = C^{log B},
     log C / log B = log′ C / log′ B.
From the first of these formulas, we get
log B + log(1/B) = log 1 = 0,
and consequently
log(1/B) = − log B.
From this it follows that two positive quantities that are inverse to each other have equal logarithms of contrary signs. We add that the fourth formula can be deduced easily from the second. Indeed, suppose that the quantity k represents the logarithm of the number C in the system for which the base is B. We have
C = B^k,
and consequently
log C = k log B  and  log′ C = k log′ B,
from which we conclude immediately that
log C / log B = log′ C / log′ B = k.
[349] We can also remark that if we take B = A, then because log A = 1, we get from the fourth formula that
log′ C = log′ A · log C,
or, taking for brevity log′ A = µ,
log′ C = µ log C.
Thus, to pass from a system of logarithms for which the base is A to one for which the base is A′, it suffices to multiply the logarithms taken in the first system by a certain coefficient µ equal to the logarithm of A taken in the second system. The logarithms of which we have just spoken are those which we call real logarithms because they always reduce to positive or negative quantities. However, other than these quantities, there exist imaginary expressions which, because of their properties, also bear the name of logarithms. We return to this subject in Chapter IX, in which we reveal the theory of imaginary logarithms.
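The coefficient µ described above is easy to exhibit numerically. The following sketch (bases and argument are arbitrary sample values, not taken from the text) checks that multiplying base-A logarithms by µ = log′ A reproduces the base-A′ logarithms.

```python
# Passing from base-A logarithms to base-A' logarithms multiplies them by mu = log_{A'}(A).
import math

A, A_prime, C = 10.0, 2.0, 37.5
log_C = math.log(C, A)            # logarithm of C in the system of base A
mu = math.log(A, A_prime)         # logarithm of A taken in the second system
print(math.log(C, A_prime))       # log' C computed directly
print(mu * log_C)                 # mu * log C : the same number
```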
Formation of trigonometric lines and arcs of a circle. We have remarked in the Preliminaries that a length measured on a curved or straight line can sometimes be represented by a number, sometimes by a quantity, depending on whether we simply regard it as the measure of this length, or if we consider it as being moved along the given line in one sense or another, relative to a fixed point which we call the origin, to serve as the growth or diminution of another constant length ending at this point. We have added that in a circle for which the plane is taken to be vertical, we ordinarily fix the origin of the arcs as the endpoint of the radius taken horizontally from left to right, and that, with respect to this origin, the arcs are counted as positive or negative depending on whether, to describe them, we begin by going up from there or by going down. Finally, we have indicated the origins of several trigonometric lines which correspond to these same arcs in the case where the radius of the circle is reduced to one. We will return to this topic shortly and complete the ideas which pertain to it. First, we easily establish with regard to lengths measured on the same line or curve relative to a given origin the following propositions: [350] Theorem VI. — Let a, b, c, . . . be any quantities, positive or negative. To obtain on a line, straight or curved, the extremity of the length a+b+c+... measured with respect to a given origin and in the direction determined by the sign of the quantity a+b+c+..., it suffices to move along this line: 1◦ the length a starting from the origin in the direction determined by the sign of a; 2◦ the length b starting from the extremity of a in the direction determined by the sign of b; and 3◦ the length c starting from the extremity of b in the direction determined by the sign of c, and so on. Theorem VII. — Let a and b be any two quantities. Suppose also that we move along a straight line or curve starting from a given origin: 1◦ a length equal to the numerical value of a in the direction determined by the sign of a; and 2◦ a length equal to the numerical value of b in the direction determined by the sign of b. To pass from the extremity of the first length to that of the second, or reciprocally, along the line under consideration, it suffices to move a third length equal to the numerical value of the difference a − b. Theorem VIII. — Supposing the same things as in the preceding theorem, the extremity of the length represented by a+b 2
is situated on the given line at a point at equal distances from the extremities of the lengths a and b (where the distances are measured along the line itself). Now we apply these theorems to arcs measured on the circumference of a circle for which the plane is vertical and for which the radius equals one, the origin of the arcs being fixed at the extremity of the radius drawn horizontally from left to right. If we denote the ratio of the circumference to its diameter by π, following common usage, because the diameter is equal to 2, the entire circumference is found to be expressed by the number 2π, half of the circumference by the number π, and the quarter by π2 . Moreover, if we denote by a any arc, [351] positive or negative, we conclude from theorem VI that, to obtain the extremity of the arc a + 2mπ
or a − 2mπ,
(where m is an integer number), it is necessary to move along the circumference, starting with the extremity of the arc a, either in the direction of the positive arcs or in the direction of the negative arcs, a length equal to 2mπ, that is to say to travel m times about the entire circumference in one direction or the other, which necessarily returns to the point from which we started. It follows that the extremities of the arcs a and a ± 2mπ coincide. Likewise we conclude from theorems VI or VII: 1° that the extremities of the arcs
a  and  a ± π
contain between themselves an arc equal to π, and as a consequence they consist of the extremities of the same diameter; and 2° that the extremities of the arcs
a  and  a ± π/2
contain between themselves a quarter of the circumference, and so they coincide with the endpoints of two radii perpendicular to each other. Finally, we conclude from theorem VIII: 1° that the extremities of the arcs
a  and  π − a
are located at equal distances from the extremity of the arc π/2, and as a consequence are placed symmetrically about the vertical diameter; and 2° that the extremities of the arcs
a  and  π/2 − a
are situated at equal distances from the extremity of the arc π/4.
[352] The arcs π − a and π/2 − a in question here are, respectively, called the supplement and the complement of the arc a. In other words, two arcs represented by two quantities a and b are supplements or complements of each other depending on whether we have
a + b = π  or  a + b = π/2.
Because angles at the center which have for a common side the radius taken as the origin of the arcs grow or diminish proportionally with the arcs which they serve to measure, and because these angles themselves can be considered as the increases or decreases of one of these taken at will, nothing prevents us from denoting angles by the same quantities as arcs. This is a convention which has been effectively adopted. We also say that two angles are complements or supplements of each other when the corresponding arcs are themselves complements or supplements of each other. Now we move on to the study of trigonometric lines, and towards this end we consider a single arc represented by the quantity a. If we project it successively: 1◦ on the vertical diameter; and 2◦ on the horizontal diameter, the two projections are what we call the sine and the versed sine of the arc a.6 We can observe that the first of these is at the same time the projection on the vertical diameter of the radius which passes through the extremity of the arc. If we prolong this same radius until it intersects the tangent of the circle taken from the origin of the arcs, the part of this tangent contained between the origin and the point of intersection is what we call the trigonometric tangent of the arc a. Finally, the length measured on the radius extended between the center and the point of intersection is the secant of this same arc. The cosine and versed cosine of an arc, its cotangent and its cosecant are nothing but the sine, versed sine, tangent and secant of its complement, and they constitute, along with the sine, the versed sine, the tangent and the secant of the same arc, the complete system of trigonometric lines. From what has been said above, the sine of an arc is measured on the vertical diameter, the versed sine on the horizontal diameter, the tangent on the line which touches the circle at the origin of the arcs, and the secant on the moving diameter which passes through the extremity of the given arc. Moreover, the sine and the secant have for their common origin the center of the circle, while the origin [353] of the tangents and versed sines correspond to that of the arcs. Finally, we generally agree to represent by positive quantities the trigonometric lines of the arc a in the 6
Some readers may have expected the projection onto the horizontal axis to be the cosine, but the cosine is the projection of the radius, not the arc. For a more complete account of the versed sine and other topics in the history of trigonometry, see [Van Brummelen 2009, Ch. 3–4].
case where the arc is positive and less than a quarter of the circumference, from which it follows that we ought to measure the sine and the tangent positively from the base upwards, the versed sine from right to left, and the secant in the direction of the radius towards the extremity of the arc a. On the basis of the principles which we have just adopted, we immediately recognize that the versed sine, and consequently the versed cosine, are always positive, and moreover, we determine without trouble the signs which ought to affect the other trigonometric lines of an arc for which the endpoint is given. To make this determination easier, we imagine the circle divided into four equal parts by two diameters perpendicular to each other, one horizontal and the other vertical, and these four parts of the circle are, respectively, designated as the first, second, third and fourth quarters of the circle. The first two quarters of the circle are situated above the horizontal diameter, namely the first on the right and the second on the left. The last two are situated below the same diameter, namely the third on the left and the fourth on the right. Given this, because the extremities of two arcs that are complements of each other are equally distant from the extremity of the arc π/4, we conclude that they are placed symmetrically on either side of the diameter which divides the first and the third quarters of the circle into two equal parts. If we then look for what signs ought to be attributed to the various trigonometric lines of an arc other than the versed sine and the versed cosine, according to whether the extremity of this arc falls in one quarter of the circle or in another, we find that the signs are, respectively,

                              In the 1st   In the 2nd   In the 3rd   In the 4th
                              quarter of   quarter of   quarter of   quarter of
                              the circle   the circle   the circle   the circle
For sine and cosecant             +            +            −            −
For cosine and secant             +            −            −            +
For tangent and cotangent         +            −            +            −
On this subject, we can remark that the sign of the tangent is always the product of the sign of the sine by the sign of the cosine. The preceding considerations now lead us to recognize that the cosine of an arc corresponds with the projection of the radius which passes through the extremity of this arc onto the horizontal diameter, and that on this diameter it ought to be measured positively from left to right, starting from the center taken as the origin. The versed cosine can be measured on the vertical diameter [354] from the highest point on the circumference taken as the origin to the endpoint of the sine. The cotangent, measured positively from left to right along the horizontal tangent to the circle at the origin of the versed cosines, reduces to the length contained between this origin and the extension of the moving diameter the half of which is the radius taken to the extremity of the arc. Finally the cosecant, measured along the moving diameter, is measured positively in the direction of the radius in question and starting from the center taken as the origin to the extremity of the cotangent.
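The table of signs above can be confirmed by sampling one arc in each quarter of the circle; the short sketch below (sample arcs arbitrary, not from the text) does so and also exhibits the remark that the sign of the tangent is the product of the signs of the sine and the cosine.

```python
# Signs of sine, cosine and tangent for one sample arc in each quarter of the circle.
import math

samples = {"1st": math.pi/6, "2nd": 2*math.pi/3, "3rd": 7*math.pi/6, "4th": 5*math.pi/3}
sign = lambda t: "+" if t > 0 else "-"
for quarter, a in samples.items():
    s, c = math.sin(a), math.cos(a)
    print(quarter, "quarter:  sin", sign(s), "  cos", sign(c), "  tan", sign(s / c))
```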
In the preliminaries we have sufficiently developed the system of notations used to represent the various trigonometric lines and the arcs to which they correspond. We shall not return to this subject, and we will content ourselves to observe that the trigonometric lines of an arc are at the same time supposed to belong to the angle at the center of the circle which it measures and which we designate by the same quantity. Thus, for example, if a, b, . . . represent any quantities, we can say that the notations sin a, cos b, . . . express equally the sine of the arc or of the angle a, the cosine of the arc or of the angle b, . . .. We end this note by recalling some remarkable properties of trigonometric lines. First, if we denote by a any quantity, we find that the sine and the cosine of the angle a are always related to each other by the equation
(6)  sin² a + cos² a = 1,
and that the other trigonometric lines can be expressed by means of these first two as follows:
(7)  siv a = 1 − cos a,   tan a = sin a / cos a,   sec a = 1 / cos a,
     cosiv a = 1 − sin a,  cot a = cos a / sin a,  csc a = 1 / sin a.
From formulas (6) and (7) we easily deduce several other equations, for example
(8)  cot a = 1 / tan a,   sec² a = 1 + tan² a,   csc² a = 1 + cot² a,   . . . .
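Formulas (6), (7) and (8) can be spot-checked at an arbitrary angle; the sketch below (not part of the original text) does so, using siv and cosiv as ad hoc names for the versed sine and versed cosine.

```python
# Numerical verification of formulas (6), (7) and (8) at an arbitrary angle a.
import math

a = 0.83
siv, cosiv = 1 - math.cos(a), 1 - math.sin(a)                     # versed sine, versed cosine
assert abs(math.sin(a)**2 + math.cos(a)**2 - 1) < 1e-12           # (6)
assert abs(math.tan(a) - math.sin(a)/math.cos(a)) < 1e-12         # (7)
assert abs(1/math.cos(a)**2 - (1 + math.tan(a)**2)) < 1e-12       # (8)  sec^2 = 1 + tan^2
assert abs(1/math.sin(a)**2 - (1 + 1/math.tan(a)**2)) < 1e-12     # (8)  csc^2 = 1 + cot^2
print("versed sine:", siv, "  versed cosine:", cosiv)
```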
It is also easy to see that if the positive quantity R represents the length [355] of a straight line between two points and α represents the angle, acute or obtuse, formed by this straight line with a fixed axis, the projection of the given length on the fixed axis is measured by the numerical value of the product R cos α, and the projection of the same length on a perpendicular axis is measured by the numerical value of the product R sin α. Finally, we recognize without trouble that if by starting from a point taken at random on the circumference of a circle of radius one, we move along this circumference in one direction or the other a length equal to the numerical value of any quantity c, the smallest arc contained between the endpoints of this length is less than or greater than π/2, depending on whether cos c is positive or negative. Admitting these principles, imagine that on the circumference of which we speak we determine: 1° the extremities A and B of the arcs represented by any two quantities7
a and b; and 2° the extremity N of a third arc represented by (a + b)/2. In addition, let M be the midpoint of the chord which joins the points A and B, and suppose that the point M projects onto the horizontal diameter of the circle to a certain point P. If the lengths measured on the diameter starting from the center taken for the origin are counted positively from left to right, like cosines, the distance from the center to the point P ought to be represented (by virtue of theorem VIII) by the quantity
(cos a + cos b)/2.
Moreover, because (by virtue of the same theorem) the point N is situated at equal distances from the points A and B, the diameter which passes through the point N contains the midpoint M of the chord AB, and the distance from this midpoint M to the center of the circle is equal (ignoring the sign) to the cosine of each of the arcs NA and NB, or what amounts to the same thing, to
cos((a + b)/2 − a) = cos((a + b)/2 − b) = cos((a − b)/2).
To obtain the horizontal projection of this distance, it suffices to multiply it by the cosine of the acute angle contained between the radius taken horizontally [356] from left to right and the diameter which contains the point N, that is to say by a factor equal (up to sign) to cos((a + b)/2). In other words, the distance from the center to the point P has for its measure the numerical value of the product
cos((a − b)/2) cos((a + b)/2).
I add that this product is positive or negative according to whether the point M is situated to the right or to the left of the vertical diameter. Indeed, cos((a + b)/2) is positive or negative according to whether the point N is situated to the right side or the left side with respect to this diameter. Also, cos((a − b)/2) is positive or negative – and consequently the product
cos((a + b)/2) cos((a − b)/2)
is of the same sign as cos((a + b)/2) or of the opposite sign – according to whether each of the arcs NA and NB is less than or greater than π/2, which in turn depends on whether the point M is situated on the same side as the point N or on the opposite side. Moreover, because the vertical line which passes through the point M also contains the point P, it follows from the preceding remark that the distance from the center to the point P, even in the case where we pay attention to the signs, can be represented by the
7
At this point Cauchy is embarking on a delicate argument which, following the example of Lagrange, he is determined to carry out without the aid of diagrams. The reader who wishes to follow this argument carefully should note that the arcs a and b uniquely determine the point N, although their extremities A and B do not. However, Cauchy will soon show that the additional information of the sign of cos( a−b 2 ) suffices to determine N.
product
cos((a − b)/2) cos((a + b)/2).
Thus, this product and the quantity (cos a + cos b)/2 have the same sign as well as the same numerical value, and we have, as a consequence, for all possible values of the quantities a and b,
(9)  cos a + cos b = 2 cos((a − b)/2) cos((a + b)/2).
If we replace b by b + π in equation (9), we get
(10)  cos a − cos b = 2 sin((b − a)/2) sin((a + b)/2).
Moreover, if in equations (9) and (10) we substitute for the angles a and b their [357] complements π/2 − a and π/2 − b, we obtain the following:
(11)  sin a + sin b = 2 cos((a − b)/2) sin((a + b)/2),
      sin a − sin b = 2 sin((a − b)/2) cos((a + b)/2).
Once formulas (9), (10) and (11) are established, we then easily deduce a great number of others. We find, for example,
(12)  (sin a − sin b)/(sin a + sin b) = tan ½(a − b) / tan ½(a + b),
      (cos b − cos a)/(cos b + cos a) = tan ½(a − b) tan ½(a + b),
(13)  cos(a − b) + cos(a + b) = 2 cos a cos b,
      cos(a − b) − cos(a + b) = 2 sin a sin b,
(14)  sin(a + b) + sin(a − b) = 2 sin a cos b,
      sin(a + b) − sin(a − b) = 2 sin b cos a,
(15)  cos(a ± b) = cos a cos b ∓ sin a sin b,
      sin(a ± b) = sin a cos b ± sin b cos a,
(16)  tan(a ± b) = (tan a ± tan b)/(1 ∓ tan a tan b),
(17)  cos 2a = cos² a − sin² a = 2 cos² a − 1 = 1 − 2 sin² a,
      sin 2a = 2 sin a cos a.
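A few of formulas (9) through (17) are verified numerically below at arbitrary angles a and b; this is only a sanity check added for illustration, not part of the original argument.

```python
# Spot checks of formulas (9), (11), (15) and (17) at arbitrary angles a, b.
import math

a, b = 1.1, 0.4
ok = lambda x, y: abs(x - y) < 1e-12
assert ok(math.cos(a) + math.cos(b), 2*math.cos((a-b)/2)*math.cos((a+b)/2))    # (9)
assert ok(math.sin(a) + math.sin(b), 2*math.cos((a-b)/2)*math.sin((a+b)/2))    # (11)
assert ok(math.cos(a + b), math.cos(a)*math.cos(b) - math.sin(a)*math.sin(b))  # (15)
assert ok(math.sin(2*a), 2*math.sin(a)*math.cos(a))                            # (17)
print("formulas (9), (11), (15), (17) verified at a =", a, ", b =", b)
```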
Now let a, b and c be any three angles. From the first formula (13), we get
(18)  cos(a + b + c) + cos(b + c − a) + cos(c + a − b) + cos(a + b − c) = 4 cos a cos b cos c.
In the preceding formula, if we write ½a, ½b and ½c instead of a, b and c, and then we suppose that
(19)  a + b + c = π,
we find
(20)  sin a + sin b + sin c = 4 cos(a/2) cos(b/2) cos(c/2).
[358] Under the same hypothesis, formula (16) gives
(21)  tan a + tan b + tan c = tan a tan b tan c.
Equation (20) ought to remain true, along with equation (19), when we replace two of the angles a, b and c with their supplements, and then change the sign of the third one. Then we conclude
(22)  sin b + sin c − sin a = 4 cos(a/2) sin(b/2) sin(c/2),
      sin c + sin a − sin b = 4 sin(a/2) cos(b/2) sin(c/2),
      sin a + sin b − sin c = 4 sin(a/2) sin(b/2) cos(c/2).
Combining these last formulas with equation (20), we deduce the following:
(23)  cos² ½a = (sin a + sin b + sin c)(sin b + sin c − sin a) / (4 sin b sin c),
      sin² ½a = (sin c + sin a − sin b)(sin a + sin b − sin c) / (4 sin b sin c).
Finally, if we imagine that a, b and c denote the three angles of a triangle and that their opposite sides are, respectively, A, B and C, the six products, equal in pairs, namely
B sin c = C sin b,  C sin a = A sin c  and  A sin b = B sin a,
represent the perpendiculars dropped from the vertices to the three sides. It follows that we have
(24)  sin a / A = sin b / B = sin c / C,
and equations (23) become
(25)  cos² ½a = (A + B + C)(B + C − A) / (4BC),
      sin² ½a = (C + A − B)(A + B − C) / (4BC).
Moreover, by taking into consideration formulas (19) and (24), we get from the first [359] equation (12)
(26)  tan ½(a − b) = ((A − B)/(A + B)) cot ½c.
Formulas (19), (24), (25) and (26) suffice to determine three of the six elements of a rectilinear triangle when the other three elements are known and when this determination is possible. We can also remark that the values of cos a and sin a, deduced from equations (25) with the aid of formulas (17), are, respectively,8
(27)  cos a = (B² + C² − A²)/(2BC)  and
      sin a = √((A + B + C)(B + C − A)(C + A − B)(A + B − C)) / (2BC).
The first of these values can be drawn directly from a known theorem of Geometry. As for the second, it gives a means of expressing the area of a triangle as a function of its three sides. Indeed, this area, equal to the product of the base C by half of the corresponding height B sin a, is9
(28)  ½ BC sin a = ¼ √((A + B + C)(B + C − A)(C + A − B)(A + B − C)).
8 The first formula in (27) is the Law of Cosines.
9 Formula (28) is known as Heron's formula. The area is often given as √(S(S − A)(S − B)(S − C)) where S = (A + B + C)/2 is the semiperimeter of the triangle.
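Formulas (24), (27) and (28) together solve a triangle from its three sides. The sketch below (not from the text; side lengths are arbitrary sample values) recovers two angles by the Law of Cosines, checks the proportion (24), and computes the area by Heron's formula.

```python
# Solve a triangle from its sides A, B, C using formulas (24), (27) and (28).
import math

A, B, C = 7.0, 8.0, 5.0                                       # the three sides
a = math.acos((B**2 + C**2 - A**2) / (2*B*C))                 # first formula (27): Law of Cosines
b = math.acos((C**2 + A**2 - B**2) / (2*C*A))
print("sin a / A =", math.sin(a)/A, "  sin b / B =", math.sin(b)/B)          # formula (24)
area = 0.25 * math.sqrt((A+B+C)*(B+C-A)*(C+A-B)*(A+B-C))      # formula (28): Heron's formula
print("area:", area, "=", 0.5*B*C*math.sin(a))                # also equals (1/2) B C sin a
```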
Note II – On formulas that result from the use of the signs > or < and on averages.
The two notations a > b and b < a serve equally to express that the first quantity, a, surpasses the second, b, that is to say that the difference a − b is positive. On the basis of this principle, we easily establish the propositions that I am going to state:
Theorem I. — If a, a′, a″, . . ., b, b′, b″, . . . represent quantities subject to the conditions
a > b,  a′ > b′,  a″ > b″,  . . . ,
then we also have
a + a′ + a″ + . . . > b + b′ + b″ + . . . .
Proof. — Indeed, when the quantities
a − b,  a′ − b′,  a″ − b″,  . . .
are positive, we can be sure that their sum
(a + a′ + a″ + . . .) − (b + b′ + b″ + . . .)
is positive as well.
Theorem II. — If A, A′, A″, . . ., B, B′, B″, . . . represent numbers [361] subject to the conditions
A > B,  A′ > B′,  A″ > B″,  . . . ,
then we also have
A A′ A″ . . . > B B′ B″ . . . .
Proof. — Indeed, because each of the differences
A − B,  A′ − B′,  A″ − B″,  . . .
is positive by hypothesis, each of the products
(A − B) A′ A″ . . . = A A′ A″ . . . − B A′ A″ . . . ,
B (A′ − B′) A″ . . . = B A′ A″ . . . − B B′ A″ . . . ,
B B′ (A″ − B″) . . . = B B′ A″ . . . − B B′ B″ . . . ,
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ,
is positive as well, and consequently, so is their sum
A A′ A″ . . . − B B′ B″ . . . .
Theorem III. — Let a, b and r be any three quantities and suppose that a > b. We then conclude that if r is positive, then ra > rb, and if r is negative, then ra < rb.
Proof. — Indeed, the product
r(a − b) = ra − rb
is positive in the first case and negative in the second.
Corollary. — Suppose that a and b are positive. If we successively take
r = 1/a  and  r = 1/b,
we then conclude that
1 > b/a  and  a/b > 1.
We are thus brought back to the proposition, obvious by itself, [362] that a fraction is less than or greater than 1 according to whether the larger of its two terms is its denominator or its numerator.
Theorem IV. — Let A and A′ be two numbers that satisfy the condition A > A′, and let b be any quantity. If b is positive we have
A^b > A′^b,
and if b is negative,
A^b < A′^b.
Proof. — Indeed, because the quotient A/A′ is > 1, the fraction
A^b / A′^b = (A/A′)^b
is evidently greater than or less than 1 according to whether the quantity b is positive or negative.
Theorem V. — Denote any number by A and let b and b′ be two quantities subject to the condition b > b′. We then conclude that if A is greater than 1, then
A^b > A^{b′},
and if A is less than 1, then
A^b < A^{b′}.
Proof. — Indeed, because the quantity b − b′ is positive by hypothesis, the fraction
A^b / A^{b′} = A^{b−b′}
is evidently greater than or less than 1 according to whether A > 1 or A < 1.
Theorem VI. — Let log be the characteristic of logarithms taken in the system for which the base is A, and denote by B and B′ two numbers subject to the condition B > B′. If A is greater than 1, we have
log B > log B′,
[363] and if A is less than 1, we have
log B < log B′.
Proof. — Indeed, the logarithm
log(B/B′) = log B − log B′
is positive in the first case and negative in the second.
Corollary. — If we use the symbol ln to indicate the Napierian logarithms taken in the system for which the base is
(1)  e = 2.7182818 . . .
[Chapter VI, § I, equation (5)], then the condition B > B′ always entails the formula
ln B > ln B′.
To the preceding theorems we add the following, from which we can deduce several important consequences.
Theorem VII. — Let x be any quantity. We have
(2)  1 + x < e^x,
where the letter e denotes, as usual, the base of the Napierian logarithms.
Proof. — Because the right-hand side of formula (2) still remains positive, the stated theorem is evident by itself if the quantity 1 + x is negative. Thus it suffices to examine the case where we suppose that
(3)  1 + x > 0.
Now, for all possible real values of x, equation (23) of Chapter VI (§ IV) gives
(4)  e^x = 1 + x/1 + x²/(1·2) + x³/(1·2·3) + x⁴/(1·2·3·4) + x⁵/(1·2·3·4·5) + . . .
         = 1 + x + (x²/2)(1 + x/3) + (x⁴/(2·3·4))(1 + x/5) + . . . .
[364] Because the products1
1 The first term is given as (x²/3)(1 + x/3) in [Cauchy 1897, p. 364]. It is given correctly in [Cauchy 1821, p. 442]. (tr.)
(x²/2)(1 + x/3),  (x⁴/(2·3·4))(1 + x/5),  . . .
are positive not only when the quantity x is positive but also when x is negative with a numerical value less than 1, we get from equation (4) that whenever condition (3) is satisfied,
e^x > 1 + x.
Corollary I. — In the case where 1 + x is positive, if we take the Napierian logarithms of both sides of formula (2), we obtain the following:
(5)  ln(1 + x) < x
(see the corollary of theorem VI). This last formula remains true whenever its left-hand side is real.
Corollary II. — Let x, y, z, . . . be several quantities subject to the conditions
(6)  1 + x > 0,  1 + y > 0,  1 + z > 0,  . . . .
By virtue of formula (2), we have
1 + x < e^x,  1 + y < e^y,  1 + z < e^z,  . . . ,
and so we conclude (theorem II) that
(7)  (1 + x)(1 + y)(1 + z) . . . < e^{x+y+z+ ...}.
This last formula remains true whenever its left-hand side contains only positive factors.
Corollary III. — In the preceding corollary, if we suppose that
x = aα,  y = a′α′,  z = a″α″,  . . . ,
where α, α′, α″, . . . denote positive quantities and a, a′, a″, . . . denote other quantities, respectively, greater than
−1/α,  −1/α′,  −1/α″,  . . . ,
[365] then formula (7) becomes
(1 + aα)(1 + a′α′)(1 + a″α″) . . . < e^{aα + a′α′ + a″α″ + ...}.
Moreover, if the quantities a, a′, a″, . . . are all less than a certain limit A, then we have (by virtue of theorems I and III) that
aα + a′α′ + a″α″ + . . . < A(α + α′ + α″ + . . .),
and consequently we finally have
(8)  (1 + aα)(1 + a′α′)(1 + a″α″) . . . < e^{A(α + α′ + α″ + ...)}.
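Inequalities (2), (5) and (7) just established lend themselves to quick numerical checks; the hedged sketch below (sample values are arbitrary and satisfy the stated conditions, with each factor 1 + x, 1 + y, . . . positive) is an illustration only.

```python
# Numerical illustration of inequalities (2), (5) and (7).
import math

for x in (-0.9, -0.3, 0.0, 0.5, 2.0):
    assert 1 + x <= math.exp(x)                      # (2): 1 + x < e^x (equality only at x = 0)
    if 1 + x > 0:
        assert math.log(1 + x) <= x                  # (5): ln(1 + x) < x

xs = [0.2, -0.4, 1.3]                                # each factor 1 + x positive
prod = math.prod(1 + x for x in xs)
assert prod < math.exp(sum(xs))                      # (7)
print("inequalities (2), (5), (7) hold for the sample values")
```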
Formula (8) can be used to good advantage in the approximate solution of differential equations. Now we move on to theorems on averages. As we have already said (Preliminaries),2 we call an average among several given quantities a new quantity contained between the smallest and the largest of those under consideration. From this definition, the quantity h is an average between two quantities g and k, or among several quantities among which one of these values is the largest and the other is the smallest, if the two differences g − h and h − k are of the same sign. Given this, if we use the notation M a, a0 , a00 , . . . for denoting an average among the quantities a, a0 , a00 , . . ., as we did in the Preliminaries, we establish the following propositions without trouble: Theorem VIII. — Let a, a0 , a00 , . . . and h be several quantities subject to the condition (9) h = M a, a0 , a00 , . . . , and let r be an entirely arbitrary quantity. Then we always have (10) rh = M ra, ra0 , ra00 , . . . . Proof. — Indeed, let g denote by the largest and k denote the smallest of the quantities a, a0 , a00 , . . .. The two differences g − h and h − k [366] are positive, and consequently the products r (g − h)
and r (h − k)
or in other words, the two differences rg − rh and rh − rk are of the same sign. Thus we have rh = M (rg, rk) and a fortiori, 2
See [Cauchy 1821, p. 14, Cauchy 1897, p. 27].
rh = M ra, ra0 , ra00 , . . . , given that rg and rk are necessarily two of the products ra,
ra0 ,
ra00 ,
....
Theorem IX. — Let A, A′, A″, . . . and H be several numbers which satisfy the condition
(11)  H = M(A, A′, A″, . . .),
and let b be any quantity. Then we have
(12)  H^b = M(A^b, A′^b, A″^b, . . .).
Proof. — Indeed, let G and K be the largest and the smallest of the numbers A, A′, A″, . . .. Because the differences
G − H  and  H − K
are positive, we conclude from theorem IV that the following
G^b − H^b  and  H^b − K^b
are of the same sign. Thus we have H^b = M(G^b, K^b), and a fortiori,
H^b = M(A^b, A′^b, A″^b, . . .).
Corollary. — In particular, if we make b = 1/2, we find
√H = M(√A, √A′, √A″, . . .).
[367] Theorem X. — Let A denote any number and let b, b′, b″, . . . and h be several quantities subject to the condition
(13)  h = M(b, b′, b″, . . .).
Then we have
(14)  A^h = M(A^b, A^{b′}, A^{b″}, . . .).
Proof. — Denote by g the greatest and k the smallest of the quantities b, b0 , b00 , . . .. Because the two differences g − h and h − k
are positive, we conclude from theorem V that the quantities
A^g − A^h  and  A^h − A^k
are of the same sign. Thus we have
A^h = M(A^g, A^k) = M(A^b, A^{b′}, A^{b″}, . . .).
Theorem XI. — Let log be the characteristic of logarithms in the system for which the base is A and denote by B, B′, B″, . . . and H several numbers subject to the condition
(15)  H = M(B, B′, B″, . . .).
Whatever A may be, we have
(16)  log H = M(log B, log B′, log B″, . . .).
Proof. — Indeed, suppose that we represent by G the largest and by K the smallest of the numbers B, B′, B″, . . .. Then because the two fractions
G/H  and  H/K
are greater than 1, the logarithms
log(G/H)  and  log(H/K),
or in other words, the differences
log G − log H  and  log H − log K,
are of the same sign. Thus we have
log H = M(log G, log K) = M(log B, log B′, log B″, . . .).
[368] Theorem XII. — Let b, b′, b″, . . . be several quantities of the same sign, n in number, and let a, a′, a″, . . . be any quantities, also n in number. Then we have
(17)  (a + a′ + a″ + . . .)/(b + b′ + b″ + . . .) = M(a/b, a′/b′, a″/b″, . . .).
Proof. — Let g be the largest and k the smallest of the quantities
a/b,  a′/b′,  a″/b″,  . . . .
Then the differences
g − a/b  and  a/b − k,
g − a′/b′  and  a′/b′ − k,
g − a″/b″  and  a″/b″ − k,
. . . . . . . . . . . . . . . . . . . . . . .
are all positive. By multiplying the first two by b, the following two by b′, etc., we obtain the products
gb − a  and  a − kb,
gb′ − a′  and  a′ − kb′,
gb″ − a″  and  a″ − kb″,
. . . . . . . . . . . . . . . . . . . . . . . ,
which are all of the same sign as the quantities b, b′, b″, . . .. Consequently, the sums of these two kinds of products, namely
g(b + b′ + b″ + . . .) − (a + a′ + a″ + . . .)  and  (a + a′ + a″ + . . .) − k(b + b′ + b″ + . . .),
and the quotients of these sums by b + b′ + b″ + . . ., namely
g − (a + a′ + a″ + . . .)/(b + b′ + b″ + . . .)  and  (a + a′ + a″ + . . .)/(b + b′ + b″ + . . .) − k,
are also quantities of the same sign. From this we conclude that
(a + a′ + a″ + . . .)/(b + b′ + b″ + . . .) = M(g, k) = M(a/b, a′/b′, a″/b″, . . .)
[see in the Preliminaries theorem I and formula (6)].
[369] Corollary I. — Suppose that the quantities b, b′, b″, . . . reduce to 1. We find that
(18)  (a + a′ + a″ + . . .)/n = M(a, a′, a″, . . .).
The left-hand side of the preceding formula is what we call the arithmetic mean of the quantities a, a′, a″, . . ..
Corollary II. — Because the average among several equal quantities is equal to each of them, if the fractions a/b, a′/b′, a″/b″, . . . are equal, we have
(19)  (a + a′ + a″ + . . .)/(b + b′ + b″ + . . .) = a/b = a′/b′ = a″/b″ = . . . ,
and this is easy to prove directly.
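Theorem XII says that the quotient of the sums lies between the smallest and the largest of the fractions a/b, a′/b′, a″/b″, . . . when the denominators share a sign. The sketch below (not from the text; sample values arbitrary) checks this numerically.

```python
# Check of theorem XII: the quotient of the sums is an average of the individual ratios.
a_list = [3.0, -1.0, 5.0]
b_list = [2.0, 4.0, 8.0]                       # denominators all of the same sign

ratios = [a / b for a, b in zip(a_list, b_list)]
mediant = sum(a_list) / sum(b_list)
print("ratios:", ratios)
print("quotient of sums:", mediant)
assert min(ratios) <= mediant <= max(ratios)   # lies between the extreme ratios
```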
Corollary III. — If we denote by α, α′, α″, . . . new quantities which have the same sign, then by virtue of equation (17) we have
(20)  (αa + α′a′ + α″a″ + . . .)/(αb + α′b′ + α″b″ + . . .) = M(αa/αb, α′a′/α′b′, α″a″/α″b″, . . .)
                                                          = M(a/b, a′/b′, a″/b″, . . .).
This last formula suffices to establish theorem III of the Preliminaries.
Theorem XIII. — Let A, A′, A″, . . ., B, B′, B″, . . . be two sequences of numbers taken at will, each of which we suppose has the same number of terms, n. With these two sequences form the roots
\sqrt[B]{A},  \sqrt[B′]{A′},  \sqrt[B″]{A″},  . . . .
Then we have
(21)  \sqrt[B+B′+B″+ ...]{A A′ A″ . . .} = M(\sqrt[B]{A}, \sqrt[B′]{A′}, \sqrt[B″]{A″}, . . .).
Proof. — The logarithms of the quantities
\sqrt[B+B′+B″+ ...]{A A′ A″ . . .},  \sqrt[B]{A},  \sqrt[B′]{A′},  \sqrt[B″]{A″},  . . . ,
indicated by the characteristic ln are, respectively,
(ln A + ln A′ + ln A″ + . . .)/(B + B′ + B″ + . . .),  ln A / B,  ln A′ / B′,  ln A″ / B″,  . . . ,
[370] and equation (17) gives the following relation among these logarithms:
(ln A + ln A′ + ln A″ + . . .)/(B + B′ + B″ + . . .) = M(ln A / B, ln A′ / B′, ln A″ / B″, . . .).
Now if we return from logarithms to numbers, as is permitted by virtue of theorem X, we again find formula (21).
Corollary I. — By supposing that the numbers B, B′, B″, . . . reduce to 1, we have simply
(22)  \sqrt[n]{A A′ A″ . . .} = M(A, A′, A″, . . .).
The left-hand side of the preceding formula is what we call the geometric mean of the numbers A, A′, A″, . . ..
Corollary II. — If all the roots
\sqrt[B]{A},  \sqrt[B′]{A′},  \sqrt[B″]{A″},  . . .
are equal, then their average is equal to each of them. Thus we have
(23)  \sqrt[B+B′+B″+ ...]{A A′ A″ . . .} = \sqrt[B]{A} = \sqrt[B′]{A′} = \sqrt[B″]{A″} = . . . ,
which would be easy to prove directly. The numerical value of an average among several given quantities is not always an average among their numerical values. Thus, for example, −1 is an average between −2 and +3; however, 1 is not an average value between 2 and 3. Among the various ways of obtaining an average among numerical values of n quantities a,
a0 ,
a00 ,
...,
one of the simplest consists of first forming the arithmetic mean among the squares,
a²,  a′²,  a″²,  . . . ,
and then extracting the square root of the result. In operating in this way, we first find
(a² + a′² + a″² + . . .)/n = M(a², a′², a″², . . .),
[371] and then, taking into account the corollary of theorem IX, we find
(24)  √(a² + a′² + a″² + . . .)/√n = M(√(a²), √(a′²), √(a″²), . . .).
Now because the positive quantities
√(a²),  √(a′²),  √(a″²),  . . .
represent precisely the numerical values of the given quantities a,
a0 ,
a00 ,
...,
it follows from formula (24) that we obtain an average among the values if we divide the very simple expression p a2 + a02 + a002 + . . . √ by n. This expression, which is greater than the largest of the numerical values in question, is what we could call the modulus of the system of quantities a, a0 , a00 , . . .. The modulus of a system of two quantities √ a and b is nothing other than the modulus itself of the imaginary expression a + b −1 (see Chapter VII, § II). In any case, real expressions of the form p a2 + a02 + a002 + . . . enjoy some very remarkable properties. In Geometry, they serve to determine the measured lengths of a straight line and the areas of plane surfaces by means of
their orthogonal projections. In Algebra, they are the subject of several important theorems, among which I will content myself to state those which follow.
Theorem XIV. — If the fractions
a/b,  a′/b′,  a″/b″,  . . .
are equal, then the numerical value of each of them is expressed by the ratio
√(a² + a′² + a″² + . . .) / √(b² + b′² + b″² + . . .),
so that we have
(25)  a/b = a′/b′ = a″/b″ = . . . = ± √(a² + a′² + a″² + . . .) / √(b² + b′² + b″² + . . .),
[372] where the sign + or the sign − is adopted according to whether the given fractions are positive or negative.
Proof. — Indeed, under the given hypothesis, the fractions
a²/b²,  a′²/b′²,  a″²/b″²,  . . .
are equal, and as a consequence we have
a²/b² = a′²/b′² = a″²/b″² = . . . = (a² + a′² + a″² + . . .)/(b² + b′² + b″² + . . .).
By extracting the square roots, we recover formula (25).
Theorem XV. — Let a, a′, a″, . . . be any n real quantities. If these quantities are not equal to each other, then the numerical value of the sum
a + a′ + a″ + . . .
is less than the product
√n · √(a² + a′² + a″² + . . .),
so that we have
(26)  val.num. (a + a′ + a″ + . . .) < √n · √(a² + a′² + a″² + . . .).
Proof. — Indeed, if to the square of the sum a + a0 + a00 + . . .
we add the squares of the differences among the quantities a, a′, a″, . . . combined in pairs in every possible manner, namely
(a − a′)²,  (a − a″)²,  . . . ,  (a′ − a″)²,  . . . ,
we find
(27)  (a + a′ + a″ + . . .)² + (a − a′)² + (a − a″)² + . . . + (a′ − a″)² + . . . = n(a² + a′² + a″² + . . .),
and we conclude that
(a + a′ + a″ + . . .)² < n(a² + a′² + a″² + . . .).
By taking the positive square roots of both sides of this last formula, we obtain precisely formula (26).
[373] Corollary. — If we divide both sides of formula (26) by n, we find
(28)  val.num. (a + a′ + a″ + . . .)/n < √(a² + a′² + a″² + . . .)/√n.
Thus the numerical value of the arithmetic mean among several quantities a, a′, a″, . . . is less than the ratio
√(a² + a′² + a″² + . . .)/√n,
which represents an average among the numerical values of these same quantities, as we have remarked above.
Scholium I. — When the quantities a, a′, a″, . . . become equal, we evidently have
val.num. (a + a′ + a″ + . . .) = √n · √(a² + a′² + a″² + . . .) = na.
Scholium II. — If we successively set n = 2, n = 3, . . . in equation (27), we conclude that
(29)  (a + a′)² + (a − a′)² = 2(a² + a′²),
      (a + a′ + a″)² + (a − a′)² + (a − a″)² + (a′ − a″)² = 3(a² + a′² + a″²),
      . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Theorem XVI.3 — Let a, a′, a″, . . ., α, α′, α″, . . . be two sequences of quantities and suppose that each of these sequences contains n terms. If the ratios
3
This is now known as the Cauchy–Schwarz Inequality.
a/α,  a′/α′,  a″/α″,  . . .
are not all equal to each other, then the sum
aα + a′α′ + a″α″ + . . .
is less than the product
√(a² + a′² + a″² + . . .) · √(α² + α′² + α″² + . . .),
so that we have
(30)  val.num. (aα + a′α′ + a″α″ + . . .) < √(a² + a′² + a″² + . . .) · √(α² + α′² + α″² + . . .).
[374] Proof. — Indeed, if to the square of the sum
aα + a′α′ + a″α″ + . . .
we add the numerators of the fractions which represent the squares of the differences between the ratios
a/α,  a′/α′,  a″/α″,  . . . ,
combined with each other in every possible way, namely
(aα′ − a′α)²,  (aα″ − a″α)²,  . . . ,  (a′α″ − a″α′)²,  . . . ,
we find
(31)  (aα + a′α′ + a″α″ + . . .)² + (aα′ − a′α)² + (aα″ − a″α)² + . . . + (a′α″ − a″α′)² + . . .
      = (a² + a′² + a″² + . . .)(α² + α′² + α″² + . . .),
and we conclude that
(aα + a′α′ + a″α″ + . . .)² < (a² + a′² + a″² + . . .)(α² + α′² + α″² + . . .).
By extracting the square roots of both sides of this last formula, we obtain precisely formula (30).
Corollary. — If we divide both sides of formula (30) by n, we find
(32)  val.num. (aα + a′α′ + a″α″ + . . .)/n < (√(a² + a′² + a″² + . . .)/√n) · (√(α² + α′² + α″² + . . .)/√n).
Thus the arithmetic mean among the products $a\alpha, a'\alpha', a''\alpha'', \ldots$ has a numerical value less than the product of the two ratios that represent the averages among the numerical values of the two kinds of quantities contained in the two sequences $a, a', a'', \ldots$ and $\alpha, \alpha', \alpha'', \ldots$.

[375] Scholium I. — When the ratios $\frac{a}{\alpha}, \frac{a'}{\alpha'}, \frac{a''}{\alpha''}, \ldots$ become equal, we get from formula (31) that
$(a\alpha + a'\alpha' + a''\alpha'' + \ldots)^2 = (a^2 + a'^2 + a''^2 + \ldots)(\alpha^2 + \alpha'^2 + \alpha''^2 + \ldots),$
and consequently
val.num.$(a\alpha + a'\alpha' + a''\alpha'' + \ldots) = \sqrt{a^2 + a'^2 + a''^2 + \ldots}\,\sqrt{\alpha^2 + \alpha'^2 + \alpha''^2 + \ldots}.$
It is easy to arrive directly at the same result.

Scholium II. — If we successively set n = 2, n = 3, ..., in formula (31), then we conclude that
(33)    $(a\alpha + a'\alpha')^2 + (a\alpha' - a'\alpha)^2 = (a^2 + a'^2)(\alpha^2 + \alpha'^2),$
        $(a\alpha + a'\alpha' + a''\alpha'')^2 + (a\alpha' - a'\alpha)^2 + (a\alpha'' - a''\alpha)^2 + (a'\alpha'' - a''\alpha')^2 = (a^2 + a'^2 + a''^2)(\alpha^2 + \alpha'^2 + \alpha''^2),$
        .......................................
The first of the preceding equations agrees with equation (8) of Chapter VII (§ I). The second can be written as follows
(34)    $(a\alpha' - a'\alpha)^2 + (a\alpha'' - a''\alpha)^2 + (a'\alpha'' - a''\alpha')^2 = (a^2 + a'^2 + a''^2)(\alpha^2 + \alpha'^2 + \alpha''^2) - (a\alpha + a'\alpha' + a''\alpha'')^2,$
and in this form it can be used with good advantage in the theory of radii of curvature of curves traced on any surfaces, thus in several questions of Mechanics.
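Identity (31) is what is now called Lagrange's identity, and inequality (30) the Cauchy–Schwarz inequality (see the footnote to theorem XVI). As a modern illustration, the following short Python sketch, which is our addition and not part of Cauchy's text, checks the identity and the inequality on a pair of sample sequences; the function name and the sample data are arbitrary.

```python
from itertools import combinations

def lagrange_identity_check(a, alpha):
    """Compare the two sides of identity (31):
    (sum a_i*alpha_i)^2 + sum_{i<j} (a_i*alpha_j - a_j*alpha_i)^2
    against (sum a_i^2)(sum alpha_i^2)."""
    square_of_sum = sum(x * y for x, y in zip(a, alpha)) ** 2
    cross_terms = sum((a[i] * alpha[j] - a[j] * alpha[i]) ** 2
                      for i, j in combinations(range(len(a)), 2))
    right_side = sum(x * x for x in a) * sum(y * y for y in alpha)
    return square_of_sum + cross_terms, right_side

a = [3.0, -1.0, 4.0]
alpha = [2.0, 5.0, -2.0]
lhs, rhs = lagrange_identity_check(a, alpha)
print(lhs, rhs)   # the two sides of identity (31) agree
# inequality (30): |sum a_i*alpha_i| is at most sqrt of the right-hand side
print(abs(sum(x * y for x, y in zip(a, alpha))) <= rhs ** 0.5)
```

Equality in (30) occurs exactly when the ratios a/α, a'/α', ... are all equal, as in Scholium I.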
We end this note with the proof of a theorem worthy of remark, which leads to comparing the geometric mean of several numbers with their arithmetic mean. It consists of the following:

Theorem XVII.4 — The geometric mean of several numbers A, B, C, D, ... is always less than their arithmetic mean.

[376] Proof. — Let n be the number of the letters A, B, C, D, .... It suffices to prove in general that
(35)    $\sqrt[n]{ABCD\ldots} < \frac{A + B + C + D + \ldots}{n},$
or what amounts to the same thing,
(36)    $ABCD\ldots < \left(\frac{A + B + C + D + \ldots}{n}\right)^n.$
Now in the first place, it is evident, for n = 2, that
$AB = \left(\frac{A+B}{2}\right)^2 - \left(\frac{A-B}{2}\right)^2,$
and hence that $A + B > 2\sqrt{AB}$;
(39)    $A + B + C > 3\sqrt[3]{ABC},$
...........................
Note III – On the numerical solution of equations.
[378] To solve numerically one or several equations is to find the values in numbers of the unknowns which they contain. This evidently requires that the constants contained in the equations be themselves constrained to numbers. We will concern ourselves here only with equations that contain one unknown, and we will begin by establishing, in this connection, the following theorems. Theorem I.1 — Let f (x) be a real function of the variable x, which remains continuous with respect to this variable between the limits x = x0 and x = X. If the two quantities f (x0 ) and f (X) have opposite signs, we can satisfy the equation (1)
f (x) = 0
with one or several real values of x contained between $x_0$ and X.

Proof. — Let $x_0$ be the smaller of the two quantities $x_0$ and X. Let $X - x_0 = h$, and denote by m any integer number larger than 1. Because one of the two quantities $f(x_0)$ and $f(X)$ is positive and the other negative, if we form the sequence
$f(x_0), \quad f\!\left(x_0 + \frac{h}{m}\right), \quad f\!\left(x_0 + 2\frac{h}{m}\right), \quad \ldots, \quad f\!\left(X - \frac{h}{m}\right), \quad f(X),$
and if we suppose that, in this sequence, we successively compare the first term with the second, the second with the third, the third with the fourth, etc., eventually we must find one or more times that two consecutive terms have opposite signs. Let
This is a special case of the Intermediate Value Theorem. See Chapter II, § II, theorem IV, p. 32 [Cauchy 1821, p. 43, Cauchy 1897, p. 50]. Cauchy’s proof there was quite intuitive. The version of the theorem in Chapter II is given here as corollary II [Cauchy 1821 p. 463, Cauchy 1897, p. 381]. See also [Grabiner 2005, pp. 69–75].
$f(x_1)$ and $f(X')$
[379] be two such terms, $x_1$ being the smaller of the two corresponding values of x. We evidently have2
$x_0 < x_1 < X' < X$ and $X' - x_1 = \frac{h}{m} = \frac{1}{m}(X - x_0).$
Having determined $x_1$ and $X'$ as we have just said, we can likewise locate two other values $x_2$ and $X''$ between $x_1$ and $X'$, which give results of opposite signs when substituted into f(x), and which satisfy the conditions
$x_1 < x_2 < X'' < X'$ and $X'' - x_2 = \frac{1}{m}(X' - x_1) = \frac{1}{m^2}(X - x_0).$
In continuing like this, we obtain: 1° an increasing series of values of x, namely
(2)    $x_0, \quad x_1, \quad x_2, \quad \ldots;$
and 2° a series of decreasing values
(3)    $X, \quad X', \quad X'', \quad \ldots,$
which exceed the corresponding values of the first series by quantities, respectively, equal to the products
$1 \times (X - x_0), \quad \frac{1}{m} \times (X - x_0), \quad \frac{1}{m^2} \times (X - x_0), \quad \ldots,$
and they eventually differ from the terms of the first series by as little as we might wish. We must conclude that the general terms of series (2) and (3) converge towards a common limit. Let a be that limit. Because the function f(x) is continuous from $x = x_0$ to $x = X$, the general terms of the following series
$f(x_0), \quad f(x_1), \quad f(x_2), \quad \ldots, \qquad f(X), \quad f(X'), \quad f(X''), \quad \ldots$
converge likewise towards the common limit f(a). As they approach that limit they always have opposite signs, so it is clear [380] that the quantity f(a), being necessarily finite, cannot differ from zero. As a consequence, it satisfies the equation
(1)    f(x) = 0,
Here, Cauchy does not make the distinction between “less than” and “less than or equal to.”
by assigning to the variable x the particular value a, contained between $x_0$ and X. In other words,
(4)    x = a
is a root of equation (1).

Scholium I. — Suppose we have extended series (2) and (3) to the terms $x_n$ and $X^{(n)}$ (where n denotes any integer number). If we take the half-sum of these terms as the value approximating the root a, the error made is less than their half-difference, namely
$\frac{1}{2}\,\frac{X - x_0}{m^n}.$
Because this last expression decreases indefinitely as n increases, it follows that, by calculating a sufficient number of terms of the two series, we eventually obtain values as close to the root a as we wish.

Scholium II. — If there exist several real roots of equation (1) between the limits $x_0$ and X, the preceding method locates some, and sometimes all of them. Then, we find for $x_1$ and $X'$, or for $x_2$ and $X''$, ..., several systems of values which enjoy the same properties.3

Scholium III. — If the function f(x) is constantly increasing or constantly decreasing from $x = x_0$ to $x = X$, then between these limits there exists but a single value of x that satisfies equation (1).

Corollary I. — If equation (1) has no real roots between the limits $x_0$ and X, then the two quantities $f(x_0)$ and $f(X)$ have the same sign.

[381] Corollary II. — If in the statement of theorem I, we replace the function f(x) by f(x) − b (where b denotes a constant quantity), then we obtain precisely theorem IV of Chapter II (§ II).4 Under the same hypothesis, and following the method indicated above, we can determine numerically the roots of the equation
(5)    f(x) = b
contained between $x_0$ and X.
Cauchy is saying in other words that if m > 2, then we may see more than one sign change in f (x) in the interval at any given stage. 4 See p. 32 [Cauchy 1821, p. 43, Cauchy 1897, p. 50].
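For a modern reader, the proof of theorem I together with the error estimate of scholium I describes the familiar bisection scheme, here with an arbitrary number m of subdivisions at each stage. The Python sketch below is our illustration, not Cauchy's; the function name, the choice m = 2, and the sample cubic are ours.

```python
def msection_root(f, x0, X, m=2, n=20):
    """Follow the proof of theorem I: repeatedly cut [x0, X] into m parts,
    keep a subinterval on which f changes sign, and return the half-sum of
    the endpoints; by scholium I the error is at most (X - x0) / (2 * m**n)."""
    lo, hi = (x0, X) if x0 < X else (X, x0)
    for _ in range(n):
        h = (hi - lo) / m
        for i in range(m):
            a, b = lo + i * h, lo + (i + 1) * h
            if f(a) == 0:
                return a
            if f(a) * f(b) < 0:      # opposite signs: a root lies in [a, b]
                lo, hi = a, b
                break
        else:                        # no sign change found; should not occur
            break                    # under the hypotheses of theorem I
    return (lo + hi) / 2

# Example: x^3 - 2x - 5 changes sign on [2, 3]; its root is near 2.0946.
print(msection_root(lambda x: x**3 - 2*x - 5, 2.0, 3.0))
```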
Note. — When equation (1) has several roots contained between $x_0$ and X, in calculating series (2) and (3), we are not always assured of finding the smallest or the largest of the roots in the interval. However, we could do this following another method that Mr. Legendre has used in his Supplément à la Théorie des nombres.5 This second method follows immediately from the two theorems I am about to state.

Theorem II. — As in theorem I, suppose that the function f(x) remains continuous from $x = x_0$ to $x = X$ (where X is greater than $x_0$), and denote by ϕ(x) and χ(x) two auxiliary functions also continuous on the given interval, but also subject to: 1° that they both increase constantly6 with x on this interval; and 2° that they give for the difference ϕ(x) − χ(x) an expression which initially is negative when we give x the particular value $x_0$, and which always remains equal (up to sign) to f(x). If the equation
(1)
f (x) = 0
has one or several real roots between x0 and X, then the values of x given by (6)
x0 , x1 , x2 , x3 , . . . ,
and derived from one another by means of the formulas (7)
ϕ(x1 ) = χ(x0 ), ϕ(x2 ) = χ(x1 ), ϕ(x3 ) = χ(x2 ), . . .
make an increasing series of quantities, for which the general term converges towards the smallest of these roots. On the other hand, if equation (1) does not have [382] real roots contained between x0 and X, then the general term of series (6) eventually exceeds X.7 Proof. — Let us suppose in the first place that the equation f (x) = 0 has one or several real roots between x0 and X, and denote by a the smallest of these roots. It satisfies the equation in question, or what amounts to the same thing, the following: (1)
ϕ(x) − χ(x) = 0.
Taking x = a, we have as a consequence
(8)
ϕ(a) = χ(a).
5 See [Legendre 1816, p. 43].
6 Legendre used the word omale to describe these functions that we now call “monotonic.” Galois also used the term. It seems that Cauchy did not adopt the word, though he must have known of Legendre’s use of it, and its use seems to have died out.
7 See [Galuzzi 2001] for more on Legendre’s method.
Moreover, because the function χ(x) is constantly increasing with x from x = x0 to x = X and a is greater than x0 , we have χ(a) > χ(x0 ). By combining these last two formulas with the first of equations (7), namely χ(x0 ) = ϕ(x1 ), we conclude that ϕ(a) > ϕ(x1 ) and consequently (9)
a > x1 .
In the same way, by combining the three formulas ϕ(a) = χ(a),
χ(a) > χ(x1 ) and
χ(x1 ) = ϕ(x2 ),
where the second follows immediately from formula (9), we find ϕ(a) > ϕ(x2 ), and consequently (10)
a > x2 .
By continuing like this, we are assured that all the terms of series (6) are less than the root a. I will add that these various terms form an increasing sequence of quantities, and indeed, because the difference ϕ(x) − χ(x) [383] is negative by hypothesis for x = x0 , we have ϕ(x0 ) < χ(x0 ). But χ(x0 ) = ϕ(x1 ), so ϕ(x0 ) < ϕ(x1 ) and (11)
x0 < x1 .
Moreover, because x1 is contained between x0 and a, no real root of the equation ϕ(x) − χ(x) = 0 is found contained between the limits x0 and x1 , and consequently (see theorem I, corollary I) ϕ(x0 ) − χ(x0 ) and ϕ(x1 ) − χ(x1 )
are quantities of the same sign, that is to say, both of them are negative. So we have ϕ(x1 ) < χ(x1 ), and consequently, because χ (x1 ) = ϕ (x2 ), ϕ (x1 ) < ϕ (x2 ) , and so (12)
x1 < x2 ,
etc. Thus, the quantities x0 , x1 , x2 , . . . form a series for which the general term xn increases constantly with n without ever surpassing the root a, and it necessarily converges to a root equal to or less than this root. Let us call this limit l. Because, by virtue of equations (7), we have, for every n, ϕ (xn+1 ) = χ (xn ) , we conclude, by letting n increase indefinitely and passing to the limits, (13)
ϕ (l) = χ (l) .
Thus, the quantity l is itself a root of equation (1), and because this quantity is greater than x0 without being greater than the root a, we evidently have (14)
l = a.
[384] In the second place, let us suppose that equation (1) has no real roots between x0 and X. We will now prove under this hypothesis that the general term xn of series (6) grows constantly with x, at least as long as this term remains less than X. Indeed, as long as this condition is satisfied, the difference ϕ(xn ) − χ(xn ) has (theorem I, corollary I) the same sign as ϕ(x0 ) − χ(x0 ), that is to say negative, and consequently, we establish formulas (11), (12), . . ., as above. Moreover, xn cannot converge towards a fixed limit l less than X, because the existence of this limit would evidently involve equation (13), and consequently the existence of a real root contained between x0 and X. Thus necessarily, under the given hypothesis, the value of xn eventually surpasses the limit X. Corollary I. — The conditions to which the auxiliary functions ϕ (x) and χ (x) are subject in the statement of theorem II can be satisfied in infinitely many ways. However, among the infinitely many values we could give to the function ϕ (x), it
is important to choose one that permits the easy solution of equations (7), that is to say in general, any equation of the form ϕ(x) = const. After choosing the value of ϕ(x) as we just said, we calculate without difficulty the various terms of series (6), and it suffices to find the limit towards which they converge to obtain the smallest of the roots of equation (1) contained between $x_0$ and X. If these same terms eventually surpass X, then equation (1) does not have a real root in the interval from $x_0$ to X.

Corollary II. — If we take $x_0 = 0$, and if also equation (1) has positive roots, then the quantities $x_1, x_2, \ldots$ are all less than the smallest root of this kind, and they give its value more and more closely.

Theorem III. — As in theorem I, suppose that the function f(x) remains continuous from $x = x_0$ to $x = X$ (where X is greater than $x_0$), and [385] denote by ϕ(x) and χ(x) two auxiliary functions also continuous on the given interval, but also subject to: 1° that they both increase constantly with x on this interval; and 2° that they give for the difference ϕ(x) − χ(x) an expression which becomes positive when we give x the particular value X, and which always remains equal (up to sign) to f(x). If the equation
(1)    f(x) = 0
has one or several real roots between $x_0$ and X, then the values of x given by
(15)    $X, \quad X', \quad X'', \quad X''', \quad \ldots$
and derived from one another by means of the formulas
(16)    $\varphi(X') = \chi(X), \quad \varphi(X'') = \chi(X'), \quad \varphi(X''') = \chi(X''), \quad \ldots$
make a decreasing series of quantities for which the general term converges towards the largest of these roots. On the other hand, if equation (1) does not have real roots contained between $x_0$ and X, then the general term of series (15) eventually descends below $x_0$.

The proof of this third theorem is so similar to that of the second that for brevity we will dispense with recounting it here.

Corollary I. — Among the infinitely many values we could give to the function ϕ(x) in a way that satisfies the given conditions, it is important to choose one that
permits the easy solution of equations (16), that is to say in general, any equation of the form ϕ(x) = const. After choosing the value of ϕ(x) as we just said, we calculate without difficulty the various terms of series (15), and it suffices to find the limit towards which they converge to obtain the largest of the roots of equation (1) contained between $x_0$ and X. If these same terms eventually fall below $x_0$, then equation (1) does not have a real root in the interval from $x_0$ to X.

[386] Corollary II. — If equation (1) has positive roots and if X surpasses the largest root of this kind, the quantities $X', X'', \ldots$ always remain larger than this root and they give its value more and more closely.

Scholium I. — If equation (1) has but one real root a, contained between $x_0$ and X, the general terms of series (6) and (15), where the first is increasing and the second is decreasing, converge towards a common limit equal to this root. Then, if we extend these series up to the terms $x_n$ and $X^{(n)}$, and then if we take the half-sum of these two terms as a value close to the root a, the resulting error is less than
$\frac{X^{(n)} - x_n}{2}.$

Scholium II. — To show an application of the principles that we have just established, consider in particular the equation
(17)    $x^m - A_1x^{m-1} - A_2x^{m-2} - \ldots - A_{m-1}x - A_m = 0,$
where m denotes any integer number and where $A_1, A_2, \ldots, A_{m-1}, A_m$ denote quantities, positive or zero. Because the left-hand side of this equation is negative for x = 0 and positive for very large values of x, it follows that it has at least one root that is positive and finite. Moreover, this same equation is not different from the following
$\frac{A_1}{x} + \frac{A_2}{x^2} + \ldots + \frac{A_{m-1}}{x^{m-1}} + \frac{A_m}{x^m} = 1,$
where the right-hand side remains invariable, while the left-hand side decreases constantly for positive and increasing values of x, so it evidently admits but a single real positive root. Let a be that root, and let A be the largest of the numbers $A_1, A_2, \ldots, A_{m-1}, A_m$.
Finally, denote as usual an average of these numbers by the notation
$M(A_1, A_2, \ldots, A_{m-1}, A_m).$
[387] By making x = a and taking into account formula (11) of the Preliminaries, we get from equation (17)
$a^m = A_1a^{m-1} + A_2a^{m-2} + \ldots + A_{m-1}a + A_m = (a^{m-1} + a^{m-2} + \ldots + a + 1)\,M(A_1, A_2, \ldots, A_{m-1}, A_m) = \frac{a^m - 1}{a - 1}\,M(A_1, A_2, \ldots, A_{m-1}, A_m) < A\,\frac{a^m - 1}{a - 1},$
and consequently a − 1 < A and
(18)    a < A + 1.
We also have $a^m > nA_ra^{m-r}$ and $a^m < nA_sa^{m-s}$, and consequently
$a > (nA_r)^{\frac{1}{r}}$ and $a < (nA_s)^{\frac{1}{s}}.$
It is clear that the root a is contained between the smallest and the largest of the numbers
(19)    $nA_1, \quad (nA_2)^{\frac12}, \quad (nA_3)^{\frac13}, \quad \ldots, \quad (nA_m)^{\frac1m}.$
Finally, because by virtue of theorem I (corollary I), the left-hand side of equation (17) remains negative from x = 0 to x = a and positive from x = a to x = ∞, it
8 Cauchy is simultaneously defining n and noting that it has the property that it is less than or equal to m. He used the notation “n = or < m” in [Cauchy 1821, p. 472, 474]. In [Cauchy 1897, p. 387, 389], the editors used a symbol resembling n<m. There is a similar situation in Note VI with the symbol ≥ in [Cauchy 1821, pp. 533–534, Cauchy 1897, pp. 437–438].
follows that we can also choose as a lower bound9 of the root a the largest of the integer numbers which make negative the expression
(20)    $x^m - A_1x^{m-1} - A_2x^{m-2} - \ldots - A_{m-1}x - A_m,$
[388] and as an upper bound the smallest of these that make it positive.

Now let $x_0$ and X be the lower and upper bounds found following the rules that we have just given. Moreover, if we make
(21)    $\varphi(x) = x^m$ and $\chi(x) = A_1x^{m-1} + A_2x^{m-2} + \ldots + A_{m-1}x + A_m,$
then theorems II and III are applicable to equation (17), and because under this hypothesis, each of equations (7) and (16) can be reduced to the form $x^m = \text{const.}$, it becomes easy to calculate the quantities contained in the two series
$X, X', X'', X''', \ldots$ and $x_0, x_1, x_2, x_3, \ldots,$
where the general terms are values approaching the root a from above and from below.
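Scholium II thus yields a completely explicit iteration: with ϕ(x) = x^m and χ(x) as in (21), each of equations (7) and (16) amounts to extracting an m-th root. The Python sketch below is our illustration of this scheme, not Cauchy's; the function name and the sample coefficients are ours.

```python
def approach_root_17(A, x_start, n_steps=50):
    """Iterate formulas (7) (or (16)) for equation (17):
       x^m = A[0]*x^(m-1) + ... + A[m-1],
    using phi(x) = x^m and chi(x) from (21).  Starting below the positive
    root gives an increasing sequence (theorem II); starting above it gives
    a decreasing one (theorem III)."""
    m = len(A)
    x = x_start
    for _ in range(n_steps):
        chi = sum(c * x ** (m - 1 - j) for j, c in enumerate(A))
        x = chi ** (1.0 / m)     # solve phi(x_next) = chi(x), i.e. x_next^m = chi(x)
    return x

# Example: x^3 = 2x^2 + 3x + 4, so A1, A2, A3 = 2, 3, 4; the positive root is near 3.284.
A = [2.0, 3.0, 4.0]
print(approach_root_17(A, x_start=1.0))    # increasing approximations from below
print(approach_root_17(A, x_start=10.0))   # decreasing approximations from above
```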
Scholium III. — Consider again the equation
(22)    $x^m + A_1x^{m-1} + A_2x^{m-2} + \ldots + A_{m-1}x - A_m = 0,$
where m still denotes an integer number and where $A_1, A_2, \ldots, A_{m-1}, A_m$ denote quantities, positive or zero, the largest of which is equal to A. By taking $\frac{1}{x}$ as the unknown,10 we can rewrite this equation in the following form:
(23)    $\left(\frac{1}{x}\right)^m - \frac{A_{m-1}}{A_m}\left(\frac{1}{x}\right)^{m-1} - \frac{A_{m-2}}{A_m}\left(\frac{1}{x}\right)^{m-2} - \ldots - \frac{A_1}{A_m}\,\frac{1}{x} - \frac{1}{A_m} = 0,$
9 Here we translate Cauchy’s words limite inférieure as “lower bound.” He does not seem to mean “limit inferior” in its modern sense. Likewise we translate limite supérieure as “upper bound” in the next phrase. (tr.)
10 Note that equation (23) is algebraically equivalent to equation (22). Cauchy has not substituted 1/x in place of x in equation (22).
which is similar to that of equation (17). We thus conclude that equation (22) admits but one positive root less than the quotient
(24)    $\frac{1}{\frac{A}{A_m} + 1},$
[389] and this root is contained, not only between the smallest and the largest of the quantities
(25)    $\frac{A_m}{nA_{m-1}}, \quad \left(\frac{A_m}{nA_{m-2}}\right)^{\frac12}, \quad \left(\frac{A_m}{nA_{m-3}}\right)^{\frac13}, \quad \ldots, \quad \left(\frac{A_m}{nA_1}\right)^{\frac{1}{m-1}}, \quad \left(\frac{A_m}{n}\right)^{\frac1m},$
where n ≤ m represents the number of variable terms contained in the left-hand side of equation (22), but the root a is also contained between the largest of the integer numbers that make the following expression negative
(26)    $x^m + A_1x^{m-1} + A_2x^{m-2} + \ldots + A_{m-1}x - A_m$
and the smallest of those that make it positive. Following these remarks, after having determined two limits, one greater than and one less than the root in question, in order to approach it more closely, it suffices to apply theorems II and III to equation (23) and to consider 1/x as the unknown that we are trying to find.

Scholium IV. — If equation (1) had two real roots contained between $x_0$ and X, but extremely close to each other, the general terms of series (6) and (15) would appear at first to converge towards the same limit, and we would not be able to perceive the difference between the limits towards which they are effectively converging without extending the series very far. The same remark applies to series (2) and (3). Consequently, the solution methods based only on theorem I or else on theorems II and III are not always useful in all cases to find the number of real roots of a numerical equation. However, they always give the value of any single root which is found contained between any two given limits as accurately as we might wish.

In the particular case where the numerical equation that we are considering has for its left-hand side a real integer function of the variable x, we can determine the number of real roots all at once, as M. Lagrange has shown, and calculate their approximate values. To do this easily, it is best to start by reducing the given equation so that it has only unequal roots, and then proceeding as follows. Let
(27)    F(x) = 0
be the given equation. Denote by a, b, c, ... the various roots, real or [390] imaginary, and let m be the degree of the left-hand side, for which we suppose that the coefficient of the highest power of x is reduced to 1. Finally, let $m'$ be the number of these roots equal to a, $m''$ the number equal to b, $m'''$ the number equal to c, .... Then we have
(28)    $m' + m'' + m''' + \ldots = m$
and
(29)    $F(x) = (x - a)^{m'}(x - b)^{m''}(x - c)^{m'''}\ldots.$
Let z be a new variable. We conclude that
(30)    $\frac{F(x+z)}{F(x)} = \left(1 + \frac{z}{x-a}\right)^{m'}\left(1 + \frac{z}{x-b}\right)^{m''}\left(1 + \frac{z}{x-c}\right)^{m'''}\ldots.$
Now if we make
(31)    $F(x+z) = F(x) + zF_1(x) + z^2F_2(x) + \ldots,$
and if we expand the expressions
$\left(1 + \frac{z}{x-a}\right)^{m'}, \quad \left(1 + \frac{z}{x-b}\right)^{m''}, \quad \left(1 + \frac{z}{x-c}\right)^{m'''}, \quad \ldots$
according to increasing powers of z, equation (30) becomes
$1 + z\,\frac{F_1(x)}{F(x)} + z^2\,\frac{F_2(x)}{F(x)} + \ldots = \left(1 + \frac{m'}{x-a}z + \ldots\right)\left(1 + \frac{m''}{x-b}z + \ldots\right)\left(1 + \frac{m'''}{x-c}z + \ldots\right)\ldots = 1 + \left(\frac{m'}{x-a} + \frac{m''}{x-b} + \frac{m'''}{x-c} + \ldots\right)z + \ldots.$
Then, by equating corresponding coefficients of the first power of z on the two sides, we find
(32)    $\frac{F_1(x)}{F(x)} = \frac{m'}{x-a} + \frac{m''}{x-b} + \frac{m'''}{x-c} + \ldots = \frac{m'(x-b)(x-c)\ldots + m''(x-a)(x-c)\ldots + m'''(x-a)(x-b)\ldots + \ldots}{(x-a)(x-b)(x-c)\ldots}.$
Because the preceding formula has for its right-hand side an algebraic fraction that is evidently irreducible, it follows that it is enough to divide the [391] left-hand side F(x) of equation (27) by the greatest common divisor of the two polynomials F(x) and $F_1(x)$ to reduce this equation to the following
(33)
(x − a) (x − b) (x − c) . . . = 0,
which has only unequal roots. We will not stop here to show how we could use these same principles to deduce various equations whose distinct roots would be equal, either to the simple roots, or to the double roots, or to the triple roots, etc., of the given equation. Here we will only add some remarks relative to the case where we begin by supposing that all the roots of equation (27) are distinct from one another. Each of these numbers m0 , m00 , m000 , . . . then reduces to 1, and we get from formula (32)
(34)    $F_1(x) = (x-b)(x-c)\ldots + (x-a)(x-c)\ldots + (x-a)(x-b)\ldots + \ldots,$
and consequently
(35)    $F_1(a) = (a-b)(a-c)\ldots, \quad F_1(b) = (b-a)(b-c)\ldots, \quad F_1(c) = (c-a)(c-b)\ldots, \quad \ldots,$
and
(36)    $F_1(a)\,F_1(b)\,F_1(c)\ldots = (-1)^{\frac{m(m-1)}{2}}\,(a-b)^2(a-c)^2\ldots(b-c)^2\ldots.$
Thus, under the given hypothesis, the product of squares of the differences between the roots of equation (27) is equivalent, ignoring the sign, to the product F1 (a) F1 (b) F1 (c) . . . , and consequently to the last term of the equation in z given by the elimination of x between the two following (37)
F (x) = 0
and z − F1 (x) = 0.
Then, by calling the numerical value of the last term H, we have (38)
(a − b)2 (a − c)2 . . . (b − c)2 . . . = ±H.
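In modern notation $F_1(x)$ is simply the derivative $F'(x)$, so the reduction described after formula (32) amounts to dividing F(x) by the greatest common divisor of F and $F'$, which leaves equation (33) with only simple roots. The following Python sketch, using exact rational arithmetic, is our illustration of that reduction and not part of Cauchy's text; the helper names and the sample polynomial are ours, and rational coefficients are assumed.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Divide polynomials given as coefficient lists, highest degree first."""
    num, quot = list(num), []
    while len(num) >= len(den):
        c = num[0] / den[0]
        quot.append(c)
        for i, d in enumerate(den):
            num[i] -= c * d
        num.pop(0)
    return quot, num

def poly_gcd(p, q):
    """Euclidean algorithm on coefficient lists; the result is made monic."""
    while any(q):
        r = poly_divmod(p, q)[1]
        while r and r[0] == 0:       # strip leading zeros of the remainder
            r.pop(0)
        p, q = q, r
    return [c / p[0] for c in p]

def squarefree_part(coeffs):
    """Divide F by gcd(F, F') so that only unequal (simple) roots remain, as in (33)."""
    F = [Fraction(c) for c in coeffs]
    F1 = [c * (len(F) - 1 - i) for i, c in enumerate(F[:-1])]   # derivative F'
    return poly_divmod(F, poly_gcd(F, F1))[0]

# Example: F(x) = (x - 1)^2 (x + 2) = x^3 - 3x + 2; its square-free part is x^2 + x - 2.
print(squarefree_part([1, 0, -3, 2]))
```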
Under the same hypothesis, because the values of $F_1(a), F_1(b), \ldots$ given by formulas (35) are never zero, if we denote a real root of equation (27) by a, it suffices to give very small values to the number α for [392] the two quantities
$F(a+\alpha) = \alpha F_1(a) + \alpha^2F_2(a) + \ldots$ and $F(a-\alpha) = -\alpha F_1(a) + \alpha^2F_2(a) - \ldots$
to have opposite signs. Moreover, if we represent by $x_0$ and X a lower bound and an upper bound where a is the only real root contained between them, then by virtue of theorem I (corollary I), F(X) has the same sign as F(a+α) and $F(x_0)$ the same sign as F(a−α), and consequently the two quantities $F(x_0)$ and F(X)
have opposite signs. When equation (27) does not have equal roots, or it has been disencumbered of those that it did have, then for this equation it becomes easy to determine not only two limits between which all the real roots are contained, but also a sequence of quantities which, taken two by two, serve, respectively, as limits of the various roots of this kind, and finally values as close to these same roots as we might want. We will establish this by solving, one after another, the three following problems.
Problem I. — To determine two limits between which all of the real roots of the equation (27), F(x) = 0, are contained.

Solution. — By hypothesis, F(x) is a real polynomial of degree m with respect to x, and in which the highest power of x has 1 for its coefficient. If we denote the successive coefficients of the lesser powers by $a_1, a_2, \ldots, a_{m-1}, a_m$, and the numerical values of these same coefficients by $A_1, A_2, \ldots, A_{m-1}, A_m$, then we have identically
(39)    $F(x) = x^m + a_1x^{m-1} + a_2x^{m-2} + \ldots + a_{m-1}x + a_m = x^m \pm A_1x^{m-1} \pm A_2x^{m-2} \pm \ldots \pm A_{m-1}x \pm A_m.$
[393] Now let k be a number greater than the unique positive root of equation (17) (theorem III, scholium II). Polynomial (20) is positive whenever we suppose that x ≥ k. Consequently, it suffices to give x a numerical value greater than the number k for the sum of the numerical values of the terms $A_1x^{m-1}, A_2x^{m-2}, \ldots, A_{m-1}x, A_m$ to become less than the numerical value of $x^m$. As a result, the left-hand side of equation (27) can never vanish unless the value of x is located inside the limits −k and +k. Thus all the roots, positive and negative, of equation (27) are contained between these same limits.

Scholium I. — If the number k is subject to the sole condition of surpassing the positive root of equation (17), then we can suppose it to be equal either to the largest of the expressions (19), or to the smallest of the integer numbers which, when substituted in place of x in polynomial (20), give a positive result.

Scholium II. — We could easily assure ourselves that the number k, determined as we have just said, is greater not only than all the numerical values of the real roots of equation (27), but also than the moduli of all the imaginary roots. Indeed, let
$x = r\,(\cos t + \sqrt{-1}\,\sin t)$
be such a root. At the same time we have two real equations
(40)    $r^m\cos mt \pm A_1r^{m-1}\cos(m-1)t \pm A_2r^{m-2}\cos(m-2)t \pm \ldots \pm A_{m-1}r\cos t \pm A_m = 0$
and
(41)    $r^m\sin mt \pm A_1r^{m-1}\sin(m-1)t \pm A_2r^{m-2}\sin(m-2)t \pm \ldots \pm A_{m-1}r\sin t = 0.$
By adding the first equation multiplied by cos mt to the second one multiplied by sin mt, we conclude
(42)    $r^m \pm A_1r^{m-1}\cos t \pm A_2r^{m-2}\cos 2t \pm \ldots \pm A_{m-1}r\cos(m-1)t \pm A_m\cos mt = 0.$
Now it is clear that we would not satisfy this last equation by supposing [394] r > k, because under this hypothesis the numerical value of $r^m$ exceeds the sum of the numerical values of the terms $A_1r^{m-1}, A_2r^{m-2}, \ldots, A_{m-1}r, A_m,$
and a fortiori it exceeds the sum of the numerical values that these same terms acquire when they are multiplied by the cosines.

Scholium III. — By comparing the polynomial (26) with the left-hand sides of equations (27) and (40), we easily prove that if g denotes a number less than the unique positive root of equation (22), then g is a lower limit not only for the numerical values of all the real roots of equation (27), but also of the moduli of all the imaginary roots. This is what happens, for example, if we take g to be the smallest of expressions (25), or the largest of the integer numbers which, substituted in place of x in polynomial (26), give a negative result. The number g being determined as we have just said, all the positive roots of equation (27) are contained between the limits +g and +k, and the negative roots of the same equation are between the limits −k and −g.
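Scholia I and III of this problem thus give computable bounds: k may be taken as the largest of the expressions (19) and g as the smallest of the expressions (25), after which every root, real or imaginary, has modulus between g and k. The short Python sketch below is our illustration, not part of Cauchy's text; the function names and the sample coefficients are ours, and the list A holds the numerical values $A_1, \ldots, A_m$ with $A_m \neq 0$.

```python
def upper_bound_k(A):
    """k from scholium I: the largest of the expressions (19), (n*A_r)^(1/r),
    where n counts the nonzero coefficients A_1, ..., A_m."""
    n = sum(1 for c in A if c != 0)
    return max((n * c) ** (1.0 / (r + 1)) for r, c in enumerate(A) if c != 0)

def lower_bound_g(A):
    """g from scholium III: the smallest of the expressions (25),
    (A_m / (n*A_{m-r}))^(1/r), together with (A_m / n)^(1/m)."""
    m, Am = len(A), A[-1]
    n = sum(1 for c in A if c != 0)
    vals = [(Am / (n * A[m - 1 - r])) ** (1.0 / r)
            for r in range(1, m) if A[m - 1 - r] != 0]
    vals.append((Am / n) ** (1.0 / m))
    return min(vals)

# Example: F(x) = x^3 - 2x^2 - 5, so A_1, A_2, A_3 = 2, 0, 5; every root, real or
# imaginary, then has modulus between g and k.
A = [2.0, 0.0, 5.0]
print(lower_bound_g(A), upper_bound_k(A))
```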
Scholium IV. — When we only propose to obtain a lower limit on the smallest of the positive roots or an upper limit on the largest of them, we can sometimes do this by using the corollary to theorem XVII (preceding Note). Indeed, suppose that all the terms of the polynomial F(x) except one have the same sign. Then equation (27) takes the following form:
(43)    $x^m + A_1x^{m-1} + \ldots + A_{s-1}x^{m-s+1} + A_{s+1}x^{m-s-1} + \ldots + A_{m-1}x + A_m = A_sx^{m-s}.$
Now let n be the number of terms in the left-hand side of equation (43) which do not reduce to zero, and $Bx^{\mu}$ the geometric mean of these terms, where B denotes the geometric mean of their coefficients. By virtue of the corollary to theorem XVII (Note II), every real and positive value of x that satisfies the given equation, [395] or what amounts to the same thing, serves as its root, necessarily satisfies the condition $A_sx^{m-s} > nBx^{\mu}$ and consequently, one of the two following
(44)    $x > \left(\frac{nB}{A_s}\right)^{\frac{1}{m-s-\mu}}$ or $x < \left(\frac{A_s}{nB}\right)^{\frac{1}{\mu-m+s}},$
$\ldots H^2,$
or what amounts to the same thing,
(48)    $\mathrm{mod.}\,(a - b) > \frac{H^{\frac12}}{(2k)^{\frac{m(m-1)}{2}-1}}.$
When the roots a and b are real, the modulus of the difference a − b reduces to its numerical value. Consequently, if we set
(49)    $h = \frac{H^{\frac12}}{(2k)^{\frac{m(m-1)}{2}-1}},$
then we obtain a number h less than the smallest difference between the real roots of equation (27).

Scholium I. — It would be easy to prove that if each of the numbers $A_1, A_2, \ldots, A_m$ (problem I) is an integer, then the number H is an integer as well. Consequently, under this hypothesis, the number H, which cannot vanish as long as the roots of equation (27) are unequal to each other, has a value equal to or greater than 1. Given this, formula (48) gives
(50)    $\mathrm{mod.}\,(a - b) > \frac{1}{(2k)^{\frac{m(m-1)}{2}-1}}.$
[399] We conclude that, to obtain a number h less than the smallest difference between the roots, it suffices to take
(51)    $h = \frac{1}{(2k)^{\frac{m(m-1)}{2}-1}}.$

Scholium II. — Let
(52)    Z = 0
be the equation in z given by eliminating x from formulas (37). If, by the method indicated above (problem I, scholium III), we determine a limit G less than the moduli of all of the roots, real or imaginary, of equation (52), and if we still denote
the roots of equation (27) a, b, c, ..., then we have
$\mathrm{mod.}\,F_1(a) > G,$
or what amounts to the same thing [see equations (35)],
$\mathrm{mod.}\,(a-b)(a-c)\ldots > G.$
We conclude that
$\mathrm{mod.}\,(a-b) > \frac{G}{\mathrm{mod.}\,(a-c)\ldots},$
and consequently,
(53)    $\mathrm{mod.}\,(a-b) > \frac{G}{(2k)^{m-2}},$
because the differences a − b, a − c, ..., which involve the root a combined successively with each of the others, are m − 1 in number, or if we set aside the difference a − b, they are m − 2 in number. Given this, it is clear that the number h still satisfies the required conditions if we take
(54)    $h = \frac{G}{(2k)^{m-2}}.$
Scholium III. — Having determined h by one of the preceding methods, we are able to choose for the sequence of numbers $k_1, k_2, \ldots, k_n$ a decreasing arithmetic progression for which the difference is equal to or less than h, in which the terms must always [400] lie between the limits 0 and k. Moreover, if we denote by g (see problem I, scholium III) a limit less than the numerical values of all of the real roots of equation (27), we are evidently able to remove from series (46) all the terms, positive and negative, whose numerical values are smaller than g and write in their place just the two terms −g and +g.
If after modifying sequence (46) as we have just said, we substitute successively into the polynomial F(x): 1◦ the negative terms of this series from −k to −g; and 2◦ the positive terms from +g to +k, and every time that two consecutive terms of the first or the second kind give results of contrary signs, we are certain that a real root, negative in the first case, positive in the second, is contained between these two terms. Scholium IV. — When, by whatever means, we have determined for equation (27) a value close to the real root a, either above or below, then in a great number of cases
we can obtain a value close to this same root in the contrary direction, and thus fix two limits, one greater than the real roots less than a, and the other less than the real roots greater than a, by applying the proposition I am about to state. As usual, let $F_1(x), F_2(x), F_3(x), \ldots$ represent the coefficients of the first, second, third, ... powers of z in the expansion of F(x+z), a, b, c, ... the various roots of equation (27), and k a number greater than their moduli. In addition, suppose that the quantity ξ has a value close to the real root a, where the difference a − ξ and the quantity α determined by the equation
(55)    $\alpha = -\frac{F(\xi)}{F_1(\xi)}$
are so small, ignoring the signs, that in the polynomial
(56)    $F_1(\xi) + 2(2\alpha)F_2(\xi) + 3(2\alpha)^2F_3(\xi) + 4(2\alpha)^3F_4(\xi) + \ldots,$
the numerical value of the first term surpasses the sum of the numerical values of all the others. Finally, denote by G a number less than [401] the excess of the first numerical value over the given sum. We are certain: 1° that the real root a is contained between the limits ξ and ξ + 2α; and 2° that the difference a − b or b − a between the root a and a new real root b is not less than
(57)    $\frac{G}{(2k)^{m-2}}.$
To prove the preceding proposition, we first observe that under the given hypothesis, because polynomial (56) has the same sign as its first term, we can say as much a fortiori about the two polynomials
(58)    $-3F_1(\xi) + 2(2\alpha)F_2(\xi) - 2(2\alpha)^2F_3(\xi) + 2(2\alpha)^3F_4(\xi) - \ldots$ and $F_1(\xi) + 2(2\alpha)F_2(\xi) + 2(2\alpha)^2F_3(\xi) + 2(2\alpha)^3F_4(\xi) + \ldots,$
which we obtain by expanding the fractions
$\frac{F(\xi - 2\alpha)}{\alpha}$ and $\frac{F(\xi + 2\alpha)}{\alpha}$
according to the ascending powers of α and using equation (55). Consequently, because the first terms of the two polynomials are of opposite signs, the same is true for the two fractions and for their numerators F(ξ − 2α) and F(ξ + 2α). So there is at least one real root of equation (27) between the limits
ξ − 2α
and ξ + 2α.
I add that there is only one of them, and indeed, it is easy to see that if several real roots were contained between these limits and if a and b denote two such roots taken one after the other from the sequence, we would find for the values of the expressions F1 (a) = (a − b) (a − c) . . . , and F1 (b) = (b − c) (b − a) . . . , two quantities with contrary signs. Consequently the equation F1 (x) = 0
(59)
[402] would have a real root contained between a and b, of the form ξ + z, where the quantity z is contained between the limits −2α and +2α. Now this cannot be accepted, because if we replace z in formula (31) by y + z and if we expand the left-hand side of this formula, modified according to the ascending powers of y, we get F (x + z) + yF1 (x + z) + . . . = F (x) + (y + z) F1 (x) + (y + z)2 F2 (x) + . . . , then, in equating the coefficients of the first power of y on both sides, (60)
F1 (x + z) = F1 (x) + 2zF2 (x) + 3z2 F3 (x) + 4z3 F4 (x) + . . . .
Consequently, the expansion of
(61)    $F_1(\xi + z)$
becomes
(62)    $F_1(\xi) + 2zF_2(\xi) + 3z^2F_3(\xi) + 4z^3F_4(\xi) + \ldots.$
Because in polynomial (56) the numerical value of the first term exceeds the sum of the numerical values of all the others, the same is true a fortiori for polynomial (62), as long as the numerical value of z is supposed to be less than that of 2α. It follows that under this hypothesis expression (61) does not vanish. Thus equation (59) does not have real roots contained between the limits ξ − 2α and ξ + 2α, and equation (27) has only one root between these limits. The root in question is necessarily the one closest to the quantity ξ , and which we have denoted by a. On the other hand, because the fraction F (ξ + 2α) α is equivalent to the second of the two polynomials (58) and has the same sign as the first term of this polynomial, namely
$F_1(\xi) = -\frac{F(\xi)}{\alpha},$
we ought to conclude that F (ξ )
and F (ξ + 2α)
[403] are two quantities of contrary signs, and that the root a is found contained between the two limits ξ and ξ + 2α. As for the second part of the proposition stated above, it is an immediate consequence of scholium II, because the quantity G evidently lies below, ignoring the sign, polynomial (62), that is to say the expansion of $F_1(\xi + z)$, as long as the numerical value of z does not exceed that of 2α, and consequently G is less than the quantity $F_1(a)$, which we deduce from $F_1(\xi + z)$ by setting z = a − ξ. Thus it follows from this second part that the real roots greater than a are all greater than the limit
(63)    $a + \frac{G}{(2k)^{m-2}}$
and the roots smaller than a are less than the limit
(64)    $a - \frac{G}{(2k)^{m-2}}.$
Problem III. — To find values as close as we might wish to the real roots of equation (27).

Solution. — We begin by determining, with the aid of the preceding problem, two limits, one greater than and one less than each real positive root. Suppose in particular that the root a is of this kind, and denote by $x_0$ and X the two limits, below and above this root. If we form two different sums, the first with the positive terms of the polynomial F(x), the second with the negative terms taken with the contrary sign, then the one which is smaller for $x = x_0$ becomes the larger for x = X. Represent this sum by ϕ(x) and the other by χ(x). The two integer functions ϕ(x) and χ(x) enjoy the properties stated in theorems II and III, and consequently, if the function ϕ(x) is such that we can easily solve equations of the form ϕ(x) = const., then formulas (7) and (16) immediately give values closer and [404] closer to the root a. This is what happens, for example, whenever the function is given in the form
$B\,(x + C)^n + D,$
where B, C, D are any three integer numbers and n an integer number equal to or less than m, because then we obtain the successive terms of series (6) and (15) by the extraction of roots of degree n. If the function ϕ(x) is not of the form that we have just indicated, we can easily put it into that form by adding an integer polynomial ψ(x), in which all the terms are positive, to both sides of the equation ϕ(x) = χ(x). Indeed, it is clear that the values of ϕ(x) and of χ(x), modified by the addition of such a polynomial, preserve all of the same properties. Moreover, we can assign an infinity of different values to the polynomial ψ(x). Suppose, for example, that $\varphi(x) = x^3 + 3x^2 + 8$. The value of ϕ(x) modified by the addition of the polynomial ψ(x) becomes $(x+1)^3 + 7$ if we suppose that $\psi(x) = 3x$, but it becomes $(x+2)^3$ if we suppose that $\psi(x) = 3x^2 + 12x$, etc. On this matter, it is worth remarking: 1° that we can always choose the integer function ψ(x) so that the number B is 1; and 2° that in many cases, one of the numbers C or D can be reduced to zero.

After using the preceding method to determine the real positive roots of equation (27), it evidently suffices to obtain the negative roots as well by using the same method to seek the positive roots of the equation
(65)
F (−x) = 0.
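The solution of problem III is again algorithmic: split F(x) into its positive part ϕ(x) and the negated negative part χ(x), then iterate formulas (7). The Python sketch below is our illustration, not Cauchy's; rather than recasting ϕ in the form B(x+C)^n + D, it solves each equation ϕ(t) = χ(x) by bisection, which is legitimate here because ϕ increases on the bracketing interval. The function names and the sample polynomial are ours.

```python
def split_by_sign(coeffs):
    """Form phi(x) from the positive terms of F(x) and chi(x) from the
    negative terms taken with the contrary sign, so that F = phi - chi."""
    phi = [c if c > 0 else 0.0 for c in coeffs]
    chi = [-c if c < 0 else 0.0 for c in coeffs]
    return phi, chi

def poly_val(coeffs, x):
    """Horner evaluation; coefficients are listed highest degree first."""
    v = 0.0
    for c in coeffs:
        v = v * x + c
    return v

def refine_root(coeffs, x0, X, n_steps=60):
    """Iterate phi(x_{k+1}) = chi(x_k), as in formulas (7); x0 and X must
    bracket a single positive root, with F(x0) < 0 < F(X)."""
    phi, chi = split_by_sign(coeffs)
    x = x0
    for _ in range(n_steps):
        target = poly_val(chi, x)
        lo, hi = x0, X
        for _ in range(60):              # solve phi(t) = target on [x0, X],
            mid = (lo + hi) / 2          # using that phi is increasing there
            if poly_val(phi, mid) < target:
                lo = mid
            else:
                hi = mid
        x = (lo + hi) / 2
    return x

# Example: F(x) = x^3 - 2x - 5, bracketed between 2 and 3; the root is near 2.0946.
print(refine_root([1.0, 0.0, -2.0, -5.0], 2.0, 3.0))
```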
Scholium. — There exist several methods of approximation other than the one we have just described, among which we must mention that of Newton. It supposes that we already know a value ξ close to the [405] root that we are seeking, and it consists of taking as a correction to this value the quantity α determined by the equation
(55)    $\alpha = -\frac{F(\xi)}{F_1(\xi)}.$
However, because this last method is not always applicable, it is important to examine the cases in which it can be used. On this subject, we are going to establish the following propositions:
Theorem IV. — Suppose that a denotes any one of the real roots, positive or negative, of equation (27) and that ξ is a value close to this root. Suppose that we determine α by means of equation (55). If α is small enough, ignoring the sign, that the numerical value of the first term of polynomial (56) exceeds the sum of the numerical values of all the others, then of the two quantities
ξ and ξ + α,
the second is closer to a than the first.

Proof. — We have already seen (problem II, scholium IV) that under the given hypotheses, the root a is the only root between the limits
ξ and ξ + 2α.
Given this, if we take
(66)    a = ξ + z,
then z is a quantity contained between the limits 0 and 2α and satisfies the equation F(ξ + z) = 0, or what amounts to the same thing,
(67)    $F(\xi) + zF_1(\xi) + z^2F_2(\xi) + \ldots = 0.$
If, for convenience, we make
(68)    $q = -\frac{F_2(\xi) + zF_3(\xi) + \ldots}{F_1(\xi)}$
and consider formula (55), then equation (67) becomes
(69)    $z = \alpha + qz^2.$
[406] Consequently, we have
(70)    $a = \xi + z = \xi + \alpha + qz^2,$
from which it follows that by taking ξ + α in place of ξ for the value close to a, we commit an error that is equal, no longer to the numerical value of z, but rather to that of $qz^2$. Moreover, because polynomial (56) has the same sign as its first term $F_1(\xi)$, the two polynomials
(71)    $F_1(\xi) + 2(2\alpha)F_2(\xi) + 2(2\alpha)zF_3(\xi) + \ldots = (1 - 4\alpha q)F_1(\xi)$ and $F_1(\xi) - 2(2\alpha)F_2(\xi) - 2(2\alpha)zF_3(\xi) - \ldots = (1 + 4\alpha q)F_1(\xi)$
evidently enjoy the same property, which requires that the numerical value of 2αq, and a fortiori that of qz, be less than $\frac12$. We conclude immediately that the numerical value of $qz^2$ is less than that of $\frac12 z$. Thus, of the two errors which we make by taking ξ and ξ + α as values close to a, the second error is smaller than half of the first.

Scholium I. — Because we get $z = \frac{\alpha}{1 - qz}$ from equation (69) and because the numerical value of qz is less than $\frac12$, we are certain that the value of z always remains between the limits $\frac23\alpha$ and 2α.
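Theorem IV and its scholia describe what is now called Newton's method, with the guarantee that each corrected value ξ + α at least halves the error and, by scholium III below, roughly doubles the number of exact decimals once the hypotheses are met. The Python sketch that follows is our illustration, not Cauchy's; here the function F1 is simply the derivative of F, and the sample equation x³ − 2x − 5 = 0 is the classical example often associated with Newton, not one taken from Cauchy's text.

```python
def newton_corrections(F, F1, xi, steps=5):
    """Apply the correction (55), alpha = -F(xi)/F1(xi), repeatedly and
    return the successive approximations xi, xi + alpha, ..."""
    approximations = [xi]
    for _ in range(steps):
        xi = xi - F(xi) / F1(xi)
        approximations.append(xi)
    return approximations

# Example: F(x) = x^3 - 2x - 5 with F1 = F' = 3x^2 - 2, starting from xi = 2.
for x in newton_corrections(lambda x: x**3 - 2*x - 5,
                            lambda x: 3*x**2 - 2, 2.0):
    print(x)
```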
Scholium II. — By solving equation (69) as if the value of q were known, we find
$z = \frac{1 \pm \sqrt{1 - 4\alpha q}}{2q} = \frac{2\alpha}{1 \mp \sqrt{1 - 4\alpha q}}.$
Here the radical $\sqrt{1 - 4\alpha q}$ is given a double sign. However, because the value of z ought to be smaller than that of 2α, it is clear that we ought to prefer the inferior sign. Thus we have
(72)    $z = \frac{2\alpha}{1 + \sqrt{1 - 4\alpha q}}.$
[407] Given this, if we call $q_0$ and Q two limits, the first less than and the second greater than the quantity q determined by formula (68), we conclude from equation (72) that the exact value of z is contained between the two expressions
(73)    $\frac{2\alpha}{1 + \sqrt{1 - 4\alpha q_0}}$ and $\frac{2\alpha}{1 + \sqrt{1 - 4\alpha Q}}.$
Consequently, this value contains all the decimal digits common to the two expressions expressed as numbers.

Scholium III. — Suppose that of the two quantities $q_0$ and Q, the second has the larger numerical value and that this numerical value is less than 1. Then, if the difference a − ξ = z is, ignoring the sign, smaller than the unit decimal of order n, that is to say if we have
(74)    val.num.$\,z < \left(\frac{1}{10}\right)^{n},$
then the difference $a - (\xi + \alpha) = qz^2$
is smaller, ignoring the sign, than a unit decimal of order 2n. Indeed, we find that
(75)    val.num.$\,qz^2 < \left(\frac{1}{10}\right)^{2n}.$