International Series in Pure and Applied Mathematics
WILLIAM TED MARTIN, Consulting Editor

PRINCIPLES OF NUMERICAL ANALYSIS

AHLFORS · Complex Analysis
BELLMAN · Stability Theory of Differential Equations
GOLOMB & SHANKS · Elements of Ordinary Differential Equations
HOUSEHOLDER · Principles of Numerical Analysis
LASS · Vector and Tensor Analysis
LEIGHTON · An Introduction to the Theory of Differential Equations
NEHARI · Conformal Mapping
ROSSER · Logic for Mathematicians
RUDIN · Principles of Mathematical Analysis
SNEDDON · Fourier Transforms
STOLL · Linear Algebra
WEINSTOCK · Calculus of Variations
PRINCIPLES OF NUMERICAL ANALYSIS
ALSTON S. HOUSEHOLDER
Oak Ridge National Laboratory

McGRAW-HILL BOOK COMPANY, INC.
New York  Toronto  London
1953
PRINCIPLES OF NUMERICAL ANALYSIS
Copyright, 1953, by the McGraw-Hill Book Company, Inc. Printed in the United States of America. All rights reserved. This book, or parts thereof, may not be reproduced in any form without permission of the publishers.
Library of Congress Catalog Card Number: 53-6043
THE MAPLE PRESS COMPANY, YORK, PA.

TO B, J, AND J
PREFACE

This is a mathematical textbook rather than a compendium of computational rules. It is hoped that the material included will provide a useful background for those seeking to devise and evaluate routines for numerical computation.

The general topics considered are the solution of finite systems of equations, linear and nonlinear, and the approximate representation of functions. Conspicuously omitted are functional equations of all types. The justification for this omission lies first in the background presupposed on the part of the reader. Second, there are good books, in print and in preparation, on differential and on integral equations. But ultimately, the numerical "solution" of a functional equation consists of a finite table of numbers, whether these be a set of functional values, or the first n coefficients of an expansion in terms of known functions. Hence, eventually the problem must be reduced to that of determining a finite set of numbers and of representing functions thereby, and at this stage the topics in this book become relevant.

The endeavor has been to keep the discussion within the reach of one who has had a course in calculus, though some elementary notions of the probability theory are utilized in the allusions to statistical assessments of errors, and in the brief outline of the Monte Carlo method. The book is an expansion of lecture notes for a course given in Oak Ridge for the University of Tennessee during the spring and summer quarters of 1950. The material was assembled with high-speed digital computation always in mind, though many techniques appropriate only to "hand" computation are discussed. By a curious and amusing paradox, the advent of high-speed machinery has lent popularity to the two innovations from the field of statistics referred to above. How otherwise the continued use of these machines will transform the computer's art remains to be seen. But this much can surely be said, that their effective use demands a more profound understanding of the mathematics of the problem, and a more detailed acquaintance with the potential sources of error, than is ever required by a computation whose development can be watched, step by step, as it proceeds. It is for this reason that a textbook on the mathematics of computation seems in order.

Help and encouragement have come from too many to permit listing all by name. But it is a pleasure to thank, in particular, J. A. Cooley, C. C. Hurd, D. A. Flanders, J. W. Givens, A. de la Garza, and members of the Mathematics Panel of Oak Ridge National Laboratory. And for the painstaking preparation of the copy, thanks go to Iris Tropp, Gwen Wicker, and above all, to Mae Gill.

A. S. Householder
CONTENTS

Preface vii

1. The Art of Computation 1
1.1 Errors and Blunders 1
1.2 Composition of Error 4
1.3 Propagated Error and Significant Figures 6
1.4 Generated Error 7
1.5 Complete Error Analyses 10
1.6 Statistical Estimates of Error 14
1.7 Bibliographic Notes 15

2. Matrices and Linear Equations 17
2.1 Iterative Methods 44
2.2 Direct Methods 65
2.3 Some Comparative Evaluations 81
2.4 Bibliographic Notes 83

3. Nonlinear Equations and Systems 86
3.1 The Graeffe Process 106
3.2 Bernoulli's Method 114
3.3 Functional Iteration 118
3.4 Systems of Equations 132
3.5 Complex Roots and Methods of Factorization 138
3.6 Bibliographic Notes 141

4. The Proper Values and Vectors of a Matrix 143
4.1 Iterative Methods 150
4.2 Direct Methods 166
4.3 Bibliographic Notes 184

5. Interpolation 185
5.1 Polynomial Interpolation 193
5.2 Trigonometric and Exponential Interpolation 211
5.3 Bibliographic Notes 213

6. More General Methods of Approximation 215
6.1 Finite Linear Methods 215
6.2 Chebyshev Expansions 223
6.3 Bibliographic Notes 225

7. Numerical Integration and Differentiation 226
7.1 The Quadrature Problem 226
7.2 Numerical Differentiation 231
7.3 Operational Methods 232
7.4 Bibliographic Notes 241

8. The Monte Carlo Method 242
8.1 Numerical Integration 243
8.2 Random Sequences 245
8.3 Bibliographic Notes 246

Bibliography 247
Problems 263
Index 269
CHAPTER 1

THE ART OF COMPUTATION

We are concerned here with mathematical principles that are sometimes of assistance in the design of computational routines. It is hardly necessary to remark that the advent of high-speed sequenced computing machinery is revolutionizing the art, and that it is much more difficult to explain to a machine how a problem is to be done than to explain to most human beings. Or that the process that is easiest for the human being to carry out is not necessarily the one that is easiest or quickest for the machine. Not only that, but a process may be admirably well adapted to one machine and very poorly adapted to another. Consequently, the robot master has very few tried and true rules at his disposal, and is forced to go back to first principles to construct such rules as seem to conform best to the idiosyncrasy of his particular robot.

If a computation requires more than a very few operations, there are usually many different possible routines for achieving the same end result. Even so simple a computation as ab/c can be done (ab)/c, (a/c)b, or a(b/c), not to mention the possibility of reversing the order of the factors in the multiplication. Mathematically these are all equivalent; computationally they are not (cf. 1.2 and 1.4). Various, and sometimes conflicting, criteria must be applied in the final selection of a particular routine. If the routine must be given to someone else, or to a computing machine, it is desirable to have a routine in which the steps are easily laid out, and this is a serious and important consideration in the use of sequenced computing machines. Naturally one would like the routine to be as short as possible, to be self-checking as far as possible, and to give results that are at least as accurate as may be required. And with reference to the last point, one would like the routine to be such that it is possible to assert with confidence (better yet, with certainty) and in advance that the results will be as accurate as may be desired; or, if an advance assessment is out of the question, as it often is, one would hope that it can be made at least upon completion of the computation.

1.1. Errors and Blunders. The number 0.33, when expressing the result of the division 1 ÷ 3, is correctly obtained even though it deviates by 1 per cent from the true quotient. The number 0.334, when expressing the result of the same division, deviates by only 0.2 per cent from the true quotient, and yet is incorrectly obtained. The deviation of 0.33 from the true quotient will be called an error. If the division is to be carried out to three places but not more, then 0.333 is the best representation possible, and the replacement of the final "3" by a final "4" will be called a blunder.
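The computational inequivalence of the orderings of ab/c is easy to exhibit. The following sketch (not from the text) mimics three-significant-figure decimal arithmetic with Python's decimal module; the operands 2, 7, 3 are an illustrative choice:

```python
from decimal import Decimal, getcontext

getcontext().prec = 3  # work to three significant figures, as in the text's 0.333

a, b, c = Decimal(2), Decimal(7), Decimal(3)

r1 = (a * b) / c   # (ab)/c
r2 = (a / c) * b   # (a/c)b
r3 = a * (b / c)   # a(b/c)

print(r1, r2, r3)  # → 4.67 4.67 4.66
```

Two of the orderings give 4.67 and the third gives 4.66, although all three are mathematically equal to 14/3.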
Blunders result from fallibility, errors from finitude. Blunders will not be considered here to any extent. There are fairly obvious ways to guard against them, and their effect, when they occur, can be gross, insignificant, or anywhere in between. Generally the sources of error other than blunders will leave a limited range of uncertainty, and generally this can be reduced, if necessary, by additional labor. It is important to be able to estimate the extent of the range of uncertainty.

Four sources of error are distinguished by von Neumann and Goldstine, and while occasionally the errors of one type or another may be negligible or absent, generally they are present. These sources are the following:

1. Mathematical formulations are seldom exactly descriptive of any real situation, but only of more or less idealized models. Perfect gases and material points do not exist.

2. Most mathematical formulations contain parameters, such as lengths, times, masses, temperatures, etc., whose values can be had only from measurement. Such measurements may be accurate to within 1, 0.1, or 0.01 per cent, or better, but however small the limit of error, it is not zero.
3. Many mathematical equations have solutions that can be constructed only in the sense that an infinite process can be described whose limit is the solution in question. By definition the infinite process cannot be completed, so one must stop with some term in the sequence, accepting this as the adequate approximation to the required solution. This results in a type of error called the truncation error.

4. The decimal representation of a number is made by writing a sequence of digits to the left, and one to the right, of an origin which is marked by the decimal point. The digits to the left of the decimal point are finite in number and are understood to represent coefficients of increasing powers of 10 beginning with the zeroth; those to the right are possibly infinite in number, and represent coefficients of decreasing powers of 10. In digital computation only a finite number of these digits can be taken account of. The error due to dropping the others is called the round-off error.

In decimal representation 10 is called the base of the representation. Many modern computing machines operate in the binary system, using the base 2 instead of the base 10. Every digit in the two sequences is either 0 or 1, and the point which marks the origin is called the binary point, rather than the decimal point. Desk computing machines which use the base 8 are on the market, since conversion between the bases 2 and 8 is very simple. Colloquial languages carry the vestiges of the use of other bases, e.g., 12, 20, 60, and in principle any base could be used.

Clearly one does not evaluate the error arising from any one of these sources, for if he did, it would no longer be a source of error. Generally it cannot be evaluated. In particular cases it can be evaluated but not represented (e.g., in the division 1 ÷ 3 carried out to a preassigned number of places). But one does hope to set bounds for the errors and to ascertain that the errors will not exceed these bounds.
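Source 4 is easily exhibited on a modern binary machine (an anachronism relative to the text, but the principle is unchanged); everything below is an illustrative sketch:

```python
# 1/10 has no finite binary representation, just as 1/3 has no finite
# decimal one; the machine stores a rounded binary approximation.
x = 0.1
print(f"{x:.20f}")        # shows the stored value is not exactly 1/10

# Accumulated round-off: ten copies of 0.1 do not sum to 1 exactly.
s = sum([0.1] * 10)
print(s == 1.0, abs(s - 1.0))
```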
The computer is not responsible for sources 1 and 2. He is not concerned with formulating or assessing a physical law nor with making physical measurements. Nevertheless, the range of uncertainty to which they give rise will, on the one hand, create a limit below which the range of uncertainty of the results of a computation cannot come, and on the other hand, provide a range of tolerance below which it does not need to come.

With the above classification of sources, we present a classification of errors as such. This is to some extent artificial, since errors arising from the various sources interact in a complex fashion and result in a single error which is no simple sum of elementary errors. Nevertheless, thanks to a most fortunate circumstance, it is generally possible to estimate an over-all range of uncertainty as though it were such a simple sum (1.2). Hence we will distinguish propagated error, generated error, and residual error.
At the outset of any computation the data may contain errors of measurement, round-off errors due to the finite representation in some base of numbers like 1/3, or even numbers requiring a finite but large number of places for exact representation. These initial errors carry through the computation and lead to an uncertainty at every step. It is important to know how these initial errors are propagated through the computation and to what extent they render the results uncertain.

In addition to this, at every step, or nearly every step, new errors may arise as a result of round-off; these combine with the errors already propagated, and the total is propagated through what computations remain. Finally, when the computation is terminated, a truncation error may remain and further enlarge the region of uncertainty. Roughly, the extent to which errors are propagated and the uncertainty due to residual error depend upon the mathematical formulation of the computational procedure, while the generation of error is more dependent upon the detailed ordering of the computational steps.

Any computation, however elaborate, consists of a finite number of elementary operations carried out in some sequence. The elementary operations are usually additions and subtractions, multiplications and divisions, comparisons, possibly table look-ups, and the like. An unambiguous description of the sequence in which the operations are performed or to be performed, with a specification of the data upon which each is to operate, constitutes a routine. If multiplication and division are elementary operations, there are six possible routines for computing ab/c. Hence a routine is by no means defined when a mathematical formula, or sequence of them, is written down.

A routine of any complexity breaks up naturally into parts or subroutines. A subroutine may have for its purpose the computation of an intermediate quantity, of no interest in itself but serving as a datum or operand for one or more subsequent subroutines. Thus, in order to calculate ab/c, one must first calculate ab, or a/c, or b/c. Or a subroutine may operate upon intermediate results to produce a final result.

Suppose a subroutine is intended to compute a function f(x, y, . . .) for given values of its arguments. If f is a rational function or a polynomial, there need be no residual error in the computation, but only propagated and generated errors. If f is not a rational function, some rational approximation must be devised. One type of rational approximation is a Taylor series. If a Taylor series is used, only a finite number of terms will be computed. The residual error in the computation is the sum of all neglected terms. Hence the residual error is fixed by the mathematical formulation of the problem, together with the specification of the number of terms to be used in the computation. But an error may be generated and propagated in each computed term.

Another type of approximation, e.g., in solving an equation by Newton's method, is the following: From an initial approximation x_0 one defines a sequence of approximations by a relation x_{i+1} = φ(x_i).
The error generated by such a routine can be assessed completely in the case of a digital square-root routine. Let ÷ designate digital division carried to λ binary places, so that each digital operation introduces an error of at most 2^(−λ), and let the routine start from an initial approximation x_0 ≥ 1. The function f(z) = z − a/z is properly monotonically increasing; hence f(z) > f(y) implies z > y, and in particular, if x_i > √a, then a/x_i < √a < x_i. If it should happen that

(1.5.6)  (x_i − a ÷ x_i) ÷ 2 > 0,

then clearly we should take

(1.5.7)  x_{i+1} = x_i − (x_i − a ÷ x_i) ÷ 2,

and not x_{i+1} = (x_i + a ÷ x_i) ÷ 2, since the difference x_i − a ÷ x_i is small and its digital halving generates the lesser round-off; and we shall take at least one more step in the iteration.

Now for any computed x with x − a ÷ x ≥ 0, the digital quotient satisfies a ÷ x ≥ a/x − 2^(−λ), whence x + 2^(−λ) ≥ a/x, and therefore

x² + 2^(−λ)x ≥ a,  (x + 2^(−λ−1))² ≥ a + 2^(−2λ−2).

Hence in all cases

(1.5.8)  x ≥ (a + 2^(−2λ−2))^(1/2) − 2^(−λ−1) > √a − 2^(−λ−1).

This gives a lower bound for the computed value. Next suppose (x_i − a ÷ x_i) ÷ 2 < 0. Then a parallel argument yields an upper bound for the computed value, and the two bounds together delimit the uncertainty of the digitally computed square root.
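The digital square-root routine analyzed above can be imitated in fixed-point arithmetic. In this sketch λ, the stopping rule, and the subtractive form of the iteration follow the discussion; the function names and the choice λ = 20 are illustrative assumptions:

```python
LAM = 20                      # lambda: number of binary places carried
UNIT = 1 << LAM               # 2**lambda; numbers are stored as integer multiples of 2**-lambda

def ddiv(p, q):
    """Digital division: p/q chopped to lambda binary places,
    so the error is at most one unit in the last place."""
    return (p * UNIT) // q

def dsqrt(a_fixed):
    """Digital square root: iterate x_{i+1} = x_i - (x_i - a ÷ x_i) ÷ 2
    until the correction is no more than one unit in the last place."""
    x = a_fixed if a_fixed > UNIT else UNIT   # start with x0 >= 1
    while True:
        corr = (x - ddiv(a_fixed, x)) // 2
        x -= corr
        if corr <= 1:                         # one unit = 2**-lambda
            return x

a = 2 * UNIT                                  # a = 2 in fixed point
x = dsqrt(a)
print(x / UNIT)                               # close to 1.41421356...
```

The returned value differs from √2 by only a few units in the last binary place, in line with the bound (1.5.8).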
The characteristic polynomial φ(λ) has the remarkable property that

(2.06.10)  φ(T) = 0.

This is the Cayley-Hamilton theorem, which can be stated otherwise by saying that the null space of φ(T) is the entire space. This might be expected from the fact, shown above, that the null spaces of two polynomials in T have a non-null vector in common only if they have a common divisor. A proof of the Cayley-Hamilton theorem is as follows: Since φ(λ) − φ(μ) is divisible by λ − μ, the difference φ(λ)I − φ(T) is equal to a polynomial in T multiplied by λI − T, and hence is said to be divisible by T − λI. Also, by Eq. (2.05.6), (T − λ_iI)F_i = 0, and the columns of F_i are therefore null vectors of T − λ_iI.
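The theorem is easily verified numerically. For a 2 × 2 matrix T the characteristic polynomial is φ(λ) = λ² − (tr T)λ + det T, and φ(T) is the zero matrix; a sketch (a numerical check, not the text's proof):

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def cayley_hamilton_residual(T):
    """Return phi(T) = T**2 - (tr T) T + (det T) I for a 2x2 matrix T."""
    tr = T[0][0] + T[1][1]
    det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
    T2 = mat_mul(T, T)
    return [[T2[i][j] - tr * T[i][j] + (det if i == j else 0)
             for j in range(2)] for i in range(2)]

print(cayley_hamilton_residual([[2, 1], [3, 4]]))  # → [[0, 0], [0, 0]]
```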
and we have again the recursion (2.121.2). When A_1 is a diagonal matrix whose diagonal is the same as that of γA, we have a method discussed by von Mises and Geiringer. Since for such a choice of A_1 the diagonal elements of A_2 are all zeros, and since A_1 is a diagonal matrix, the diagonal elements of A_1^(−1)A_2 are also all zeros. It is no restriction to suppose that A_1 = I and that γ = 1, for we may replace the original system by the equivalent one obtained on multiplying through by A_1^(−1). Then (2.121.5) yields one of the criteria given by von Mises and Geiringer.
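The iteration with A_1 equal to the diagonal of A is essentially what later writers call the Jacobi iteration; a minimal sketch under the assumption of diagonal dominance (the matrix, right side, and sweep count are illustrative):

```python
def jacobi(A, y, sweeps=50):
    """Iterate x_{k+1} = D**-1 (y - (A - D) x_k), D the diagonal of A.
    Converges when A is suitably diagonally dominant."""
    n = len(A)
    x = [0.0] * n
    for _ in range(sweeps):
        x = [(y[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x

A = [[4.0, 1.0], [2.0, 5.0]]
y = [6.0, 9.0]
x = jacobi(A, y)
print(x)   # close to the true solution (7/6, 4/3)
```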
If the two quantities are equal, the two approximations are equally good. However, for making the test in a particular instance, there will be available only the digitally computed residuals

r_p* = y − (Ax_p)*,  p = 0, 1.

By (2.132.2),

N(r_p − r_p*) = N[(Ax_p)* − Ax_p],

and therefore

|x_0*(r_p − r_p*)| ≤ N(x_0)N(r_p − r_p*).

Also

x_0*(y + r_0) − x_1*(y + r_1) = x_0*(y + r_0*) − x_1*(y + r_1*) + x_0*(r_0 − r_0*) − x_1*(r_1 − r_1*),

so that (2.1321.1) will certainly be satisfied if

(2.1321.2)  x_0*(y + r_0*) − x_1*(y + r_1*) > N(x_0)N(r_0 − r_0*) + N(x_1)N(r_1 − r_1*).

Since the quantities on the right can in turn be bounded, we can also say that (2.1321.1) is also implied if

(2.1321.3)  x_0*(y + r_0*) − x_1*(y + r_1*) > n[b(x_0)b(r_0 − r_0*) + b(x_1)b(r_1 − r_1*)].

This requirement is somewhat more stringent.

Now consider a particular approximation x_0 and the digital approximation that would be obtained from x_0 following a single projection. Can we be assured that the digital result of making the projection will be a better approximation than x_0?

If x = Rw, then w = D^(−2)R*y and x = RD^(−2)R*y. Indeed, R = V^(−1), and it is therefore unit upper triangular. Hence the relation R*AR = D² is equivalent to A = V*D²V, which is the triangular resolution already obtained but arrived at in a different way.

An orthogonalization process of somewhat different type has been
devised by Stiefel and Hestenes, independently. The process leads to a fairly simple iteration, which, however, terminates in n steps to yield the exact solution apart from round-off. Since the n steps yield progressively better approximations to the true solution, the process can be continued beyond n steps for reduction of the round-off error. The first step, as applied to a positive definite matrix A, is the same as in the method of steepest descent, in that one starts with an arbitrary initial approximation x_0 and improves it by adding a multiple of the residual r_0. Thereafter, however, instead of adding to each x_i a multiple of r_i, one adds a vector so chosen that r_{i+1} is orthogonal, with respect to the metric I, to all preceding r_j. If this can be accomplished, then for some m ≤ n, r_m = 0, and hence Ax_m = y. For if the vectors r_0, r_1, . . . , r_{n−1} are all non-null, then, being mutually orthogonal, they are linearly independent, and only the null vector is orthogonal to all of them; hence r_n = 0.

Geometrically the method has other points of interest. We have already noted that the solution x of the equations Ax = y minimizes the function

(2.22.5)  f(x) = x*Ax − 2x*y.

In fact it represents the common center of the hyperdimensional ellipsoids

(2.22.6)  f(x) = const.

This fact provides the usual approach to the method of steepest descent. Also, at x_0 the function f(x) is varying most rapidly in the direction of r_0, which is, apart from a constant factor, the gradient at x_0 of the function f(x). Hence one takes

x_1 = x_0 + α_0r_0,

where α_0 minimizes the function f(x_0 + λr_0).
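The method of steepest descent as just described, namely, move along r with the minimizing step length and repeat, can be sketched as follows (the matrix, right side, and step count are illustrative):

```python
def steepest_descent(A, y, steps=200):
    """Minimize f(x) = x*Ax - 2x*y for positive definite A by moving
    along the residual r = y - Ax with step alpha = (r, r)/(r, Ar)."""
    n = len(A)
    x = [0.0] * n
    for _ in range(steps):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        r = [y[i] - Ax[i] for i in range(n)]
        rr = sum(ri * ri for ri in r)
        if rr == 0.0:
            break
        Ar = [sum(A[i][j] * r[j] for j in range(n)) for i in range(n)]
        alpha = rr / sum(r[i] * Ar[i] for i in range(n))
        x = [x[i] + alpha * r[i] for i in range(n)]
    return x

A = [[3.0, 1.0], [1.0, 2.0]]   # positive definite
y = [5.0, 5.0]
sol = steepest_descent(A, y)
print(sol)                     # approaches the solution (1, 2) of Ax = y
```

The Stiefel-Hestenes process replaces the repeated gradient steps by directions chosen for mutual orthogonality, which is what forces termination in n steps.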
At each division we cut off the final remainder and repeat the synthetic division with the preceding coefficients. This is sometimes called reducing the roots of the equation, since every root z_i of the equation P(z + r) = 0 is r less than a corresponding root x_i of P(x) = 0.

In solving equations by Newton's or Horner's method, it is first necessary to localize the roots roughly, and the first step in this is to obtain upper and lower bounds for all the real roots. If in the process of reducing the roots by a positive r the b's and the c's of any line are all positive, as well as all the c's previously calculated, then necessarily all succeeding b's and c's will be positive. Hence the transformed equation will have only positive coefficients and hence can have no positive real roots. Hence the original equation can have no real roots exceeding r. Hence any positive number r is an upper bound to the real roots of an algebraic equation if in any line of the scheme for reducing the roots by r all numbers are positive along with the c's already calculated.

3.05. Sturm Functions; Isolation of Roots. The condition just given is sufficient for assuring us that r is an upper bound to the roots of P = 0, but it is not necessary. In particular, if all coefficients of P are positive, the equation can have no positive roots. This again is a sufficient but not a necessary condition. A condition that is both necessary and sufficient will be derived in this section. In fact, we shall be able to tell exactly how many real roots lie in any given interval. However, since it is somewhat laborious, some other weaker, but simpler, criteria will be given first.
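Returning to the reduction scheme of 3.04: repeated synthetic division yields the coefficients of P(z + r), and a line of all-positive numbers certifies r as an upper bound for the real roots. A sketch (function name and examples are illustrative):

```python
def reduce_roots(coeffs, r):
    """Given P(x) = c0 x^n + ... + cn (descending powers), return the
    coefficients of P(z + r).  Each synthetic division by (x - r) is
    stopped by cutting off the final remainder; the successive
    remainders are the new coefficients from the constant term up."""
    a = list(coeffs)
    out = []
    while a:
        b = [a[0]]
        for c in a[1:]:
            b.append(c + r * b[-1])
        out.append(b.pop())   # cut off the final remainder
        a = b
    return list(reversed(out))

# P(x) = x^2 - 2x - 3 = (x - 3)(x + 1); reducing by 3 gives z^2 + 4z.
print(reduce_roots([1, -2, -3], 3))   # → [1, 4, 0]
```

Reducing the same polynomial by 4 gives [1, 6, 5]; since every number is positive, 4 is an upper bound for the real roots (which are 3 and −1), illustrating the criterion above.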
Suppose r is an m-fold root, so that

P(x) = (x − r)^m p(x),  p(r) ≠ 0.

Then P(r) = P'(r) = · · · = P^(m−1)(r) = 0, while P^(m)(r) ≠ 0. Since P^(m)(r) ≠ 0, there is some interval (r − ε, r + ε), with ε sufficiently small, so that P^(m)(x) is non-null throughout the interval, and P^(m−1)(x), . . . , P'(x), P(x) are non-null except at r. Suppose P^(m)(r) > 0. Then P^(m−1)(x) is increasing throughout the interval, and so it must be negative at r − ε and positive at r + ε. Hence P^(m−2)(x) is decreasing to the left of r and increasing to the right, and hence positive, at r − ε and at r + ε. By extending the argument, it appears that the signs can be tabulated: at r + ε every derivative P^(m−k) has the sign of P^(m)(r), while at r − ε the sign of P^(m−k) is that of (−1)^k P^(m)(r).
If P^(m)(r) < 0, all the signs are reversed.

3.1. The Graeffe Process. If the equation has a single root x_1 of largest modulus, and if s_p designates the sum of the pth powers of all the roots, then s_p = x_1^p[1 + (x_2/x_1)^p + · · ·], and in particular

lim_{p→∞} |s_p|^(1/p) = |x_1|.

Hence if a feasible method could be found for computing s_p for sufficiently large p, we could take the pth root of this and obtain thereby an evaluation of the largest root (in case there is such) of the equation. The Graeffe process does this, and somewhat more. If we write the equation in the form

(3.1.1)  a_0x^n + a_2x^(n−2) + a_4x^(n−4) + · · · = −(a_1x^(n−1) + a_3x^(n−3) + · · ·)

and square both sides, we obtain

a_0²x^(2n) + (2a_0a_2 − a_1²)x^(2n−2) + (2a_0a_4 − 2a_1a_3 + a_2²)x^(2n−4) + · · · = 0.

Since only even powers of x occur here, this can be written

(3.1.2)  a_0²y^n + (2a_0a_2 − a_1²)y^(n−1) + (2a_0a_4 − 2a_1a_3 + a_2²)y^(n−2) + · · · = 0,

where y = x².
Hence we obtain a new equation whose roots are the squares of the roots of the original equation. If we repeat, we obtain an equation whose roots are the fourth powers, another repetition gives one with the eighth powers, etc. After p such operations we obtain an equation whose roots are the 2^p-th powers of the roots of the original:

(3.1.3)  a_0^(p)x^n + a_1^(p)x^(n−1) + · · · + a_n^(p) = 0.

At any stage, if we write the coefficients in sequence a_0^(p), a_1^(p), a_2^(p), . . . , then to get the new coefficient a_k^(p+1) we take the square of a_k^(p), subtract the double product of the adjacent pair of coefficients symmetrically placed with respect to a_k^(p), add the double product of the next symmetric pair, and so on, the whole being given the sign (−1)^k:

a_k^(p+1) = (−1)^k[(a_k^(p))² − 2a_{k−1}^(p)a_{k+1}^(p) + 2a_{k−2}^(p)a_{k+2}^(p) − · · ·].

Now if the roots are x_i, then

(3.1.4)  a_1^(p)/a_0^(p) = −Σ x_i^(2^p),  a_2^(p)/a_0^(p) = Σ_{i<j} x_i^(2^p)x_j^(2^p),  a_3^(p)/a_0^(p) = −Σ_{i<j<k} x_i^(2^p)x_j^(2^p)x_k^(2^p),  . . . .

If the roots are all distinct, and x_1 has a modulus larger than that of any other root, then eventually

(3.1.5)  −a_1^(p)/a_0^(p) ≈ x_1^(2^p).
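One root-squaring step, and the resulting estimate of the dominant root via (3.1.5), can be sketched as follows (the polynomial and the number of steps are illustrative):

```python
def graeffe_step(a):
    """One root-squaring step: from P with roots x_i (coefficients in
    descending powers), produce the polynomial whose roots are x_i**2,
    following the rule a_k' = (-1)**k (a_k**2 - 2 a_{k-1} a_{k+1} + ...)."""
    n = len(a) - 1
    b = []
    for k in range(n + 1):
        s = a[k] * a[k]
        for j in range(1, min(k, n - k) + 1):
            s += 2 * (-1) ** j * a[k - j] * a[k + j]
        b.append((-1) ** k * s)
    return b

a = [1.0, -5.0, 6.0]          # (x - 2)(x - 3); dominant root 3
p = 5
for _ in range(p):
    a = graeffe_step(a)
est = (-a[1] / a[0]) ** (1 / 2 ** p)
print(est)                    # close to 3
```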
Now it is clear that if the roots satisfy |x_1| > |x_j|, j > 1, then for h sufficiently small also |x_1 − h| > |x_j − h|. Brodetsky and Smeal therefore make the natural proposal that h should be "infinitesimal," and Lehmer has developed an effective algorithm. The original Graeffe process can be described in slightly different terms by saying that we start with a polynomial P(x) and obtain from it a polynomial P_1(x) whose zeros are the squares of those of P; from P_1 we obtain P_2(x), whose zeros are the squares of those of P_1, and hence the fourth powers of those of P, . . . . On setting P = P_0 for uniformity, one verifies that

(3.11.1)  P_{p+1}(x) = P_p(√x)P_p(−√x),  p = 0, 1, 2, . . . .

In fact

P_0(x) = a_0Π(x − x_i),  P_0(√x)P_0(−√x) = a_0²Π(−x + x_i²),

and the general statement follows by a simple induction.
we write

Q_0(x) = a_0Π(x − x_i − h),
Q_1(x) = Q_0(√x)Q_0(−√x) = a_0²Π(√x − x_i − h)(−√x − x_i − h),

and continue the same procedure with the Q's, then we find inductively that Q_p(x) is a polynomial whose zeros are the n quantities

(x_i + h)^m = x_i^m + mhx_i^(m−1) + · · · ,  m = 2^p,

where the terms omitted contain h² and higher powers of h. Lehmer's algorithm is obtained by setting

(3.11.2)  φ_0(x, h) = P_0(x − h)

and defining recursively

(3.11.3)  φ_{p+1}(x, h) = φ_p(√x, h)φ_p(−√x, h).

Then

(3.11.4)  Q_p(x) = φ_p(x, h).

Also

φ_0(x, h) = φ_0(x, 0) − hφ'_0(x, 0) + · · · ,

where φ' = ∂φ/∂x = −∂φ/∂h, and by direct calculation from (3.11.3) one obtains a recursion by which the coefficients of the several powers of h in φ_{p+1} are computed from those in φ_p. Since h is treated as infinitesimal, only the terms in h^0 and h^1 need be retained at each stage.
so that φ(x) is a second-order iteration. More generally, the sequence (3.3.4) is an iteration of order m at least if

(3.3.11)  φ(a) = a,  φ'(a) = φ''(a) = · · · = φ^(m−1)(a) = 0,

and of order m exactly if, in addition, φ^(m)(a) ≠ 0. While one can write in general the expansion

φ(x) − a = (x − a)φ'(a) + (x − a)²φ''(a)/2! + · · ·

(since φ is supposed analytic), when the iteration is of order m one can write

(3.3.13)  φ(x) − a = (x − a)^m φ^(m)(ξ)/m!

for some ξ between x and a. In this case

x_{i+1} − a = (x_i − a)^m φ^(m)(ξ_i)/m!.
3.31. Some Special Cases. Some classical methods of successive approximation are methods of functional iteration as here understood. Horner's method is not, and since it has little to recommend it in any case, it will not be described here.

3.311. First-order iterations. Best known of these is the regula falsi. This applies to real roots of real equations, algebraic or not. If f(x') and f(x'') have opposite signs, x* lies between x' and x''. The chord from the point x', f(x') to the point x'', f(x'') intersects the x axis at

x_2 = [x'f(x'') − x''f(x')]/[f(x'') − f(x')],

as one verifies easily. It is no restriction to suppose that f(x_2)f(x'') > 0, since otherwise we can reverse the designations of x' and x''. We now let x_2 play the role of x'' and repeat, or we let x'' = x_1 and regard this as the first step in the iteration. Hence we are taking

(3.311.1)  φ(x) = [x'f(x) − xf(x')]/[f(x) − f(x')].

The derivative φ'(a) is seen to be

φ'(a) = 1 − (a − x')f'(a)/[f(a) − f(x')].

In case f has continuous first and second derivatives near a, the Taylor expansion gives

f(x') = f(a) + (x' − a)f'(a) + ½(x' − a)²f''(x''),

where now x'' is some point on the interval (a, x'). Hence, since f(a) = 0,

φ'(a) = ½(x' − a)²f''(x'')/f(x').

Hence for x' sufficiently close to a, |φ'(a)| will be small, and there will exist an interval about a over which the iteration converges. A related first-order iteration is obtained by taking

φ(x) = x − mf(x),  φ'(x) = 1 − mf'(x).

If f'(x) has fixed sign throughout some neighborhood of the solution, we choose m to have the same sign and, in fact, such that throughout the interval

2 > mf'(x) > 0.

3.312. Second-order iterations. If we take

(3.312.1)  φ(x) = x − f(x)/f'(x),

which is to say

(3.312.2)  m(x) = 1/f'(x),

we obtain the well-known Newton's method. The derivative is

(3.312.3)  φ'(x) = 1 − [f'²(x) − f(x)f''(x)]/f'²(x) = f(x)f''(x)/f'²(x),

whence if f'(a) ≠ 0, and a is not a multiple root, φ'(a) = 0. Hence for any positive k < 1 there is a neighborhood of a throughout which |φ'(x)| ≤ k. The requirement often made, that at the initial approximation x_0 we should have f(x_0)f''(x_0) > 0, is not strictly necessary.

Newton's method applies to transcendental as well as to algebraic equations, and to complex roots as well as to real. However, if the equation is real, then the complex roots occur in conjugate pairs, and the iteration cannot converge to a complex root unless x_0 is itself complex. But if x_0 is complex, and sufficiently close to a complex root, the iteration will converge to that root.

For algebraic equations, as each of the first two or three x_i is obtained, it is customary to diminish the roots by x_i by the process described in 3.04. Or rather, one first diminishes by x_0; then one obtains x_1 − x_0 and diminishes the roots of the last equation by this amount; then obtains x_2 − x_1 and diminishes by this amount, etc., since

f(x_i + u) = f(x_i) + uf'(x_i) + · · · .

Thus, applied to f(x) = x² − N, Newton's method gives the familiar square-root iteration φ(x) = (x + N/x)/2.
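The regula falsi described in 3.311 can be sketched as follows; the stopping rule and the test function are illustrative assumptions:

```python
import math

def regula_falsi(f, x0, x1, tol=1e-12, max_iter=100):
    """Root of f between x0 and x1, where f(x0) and f(x1) have opposite
    signs: repeatedly replace one endpoint by the x-axis intersection
    of the chord, keeping endpoints that bracket the root."""
    f0, f1 = f(x0), f(x1)
    assert f0 * f1 < 0, "endpoints must bracket a root"
    for _ in range(max_iter):
        x2 = (x0 * f1 - x1 * f0) / (f1 - f0)   # chord intersection
        f2 = f(x2)
        if abs(f2) < tol:
            return x2
        if f2 * f1 > 0:          # x2 on the same side as x1
            x1, f1 = x2, f2
        else:
            x0, f0 = x2, f2
    return x2

root = regula_falsi(math.cos, 1.0, 2.0)
print(root)   # near pi/2
```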
3.32. Iterations of Higher Order: Konig's Theorem. If one applies Newton's method to any product q(x)f(x) to obtain a particular zero a of f(x), one always obtains an iteration of second order at least, provided only q(a) is neither zero nor infinite. Hence one might expect that by proper choice of q it should be possible to obtain an iteration of third order or even higher. This is true; one can in fact obtain an iteration of any desired order, and various schemes have been devised for the purpose. Some of these will now be described. However, one must expect that an iteration of higher order is apt to require more laborious computations. The optimal compromise between simplicity of algorithm and rapidity of convergence will depend in large measure upon the nature of the available computing facilities.

Consider first Konig's theorem. In the expansion

g/f = Σ h_p x^p

we have a Taylor's expansion about the origin. If we move the origin to some point x, we can restate the theorem in an apparently more general form, as follows: If in some circle about x the equation

(3.32.1)  f(z) = 0

has only a single root a, which is simple; if f(z) and g(z) are analytic throughout this circle, and g(a) ≠ 0; and if we define

(3.32.2)  P_r(x) = h_r(x)/h_{r+1}(x),

then

(3.32.3)  lim_{r→∞} P_r(x) = a − x.

For P_r(x) is simply the ratio of two successive coefficients of the Taylor expansion of h(z) = g(z)/f(z) about the point z = x. This being true, then at least for r sufficiently large it is to be expected that |a − x − P_r(x)| will be small, and hence that x + P_r(x) will be an improved approximation to the root a.

If one defines ψ as in (3.33.5), with ψ^(1), ψ^(2), and w replacing φ^(1), φ^(2), and x, then ψ also defines an iteration.