The equation to be solved is
where is a real or complex variable and the function is nonlinear. Solutions are called roots of the equation, or zeros of . If and , then is a simple zero of . If and , then is a zero of of multiplicity ; compare §1.10(i).
Sometimes the equation takes the form
and the solutions are called fixed points of .
Equations (3.8.1) and (3.8.2) are usually solved by iterative methods. Let be a sequence of approximations to a root, or fixed point, . If
for all sufficiently large, where and are independent of , then the sequence is said to have convergence of the th order. (More precisely, is the largest of the possible set of indices for (3.8.3).) If and , then the convergence is said to be linear or geometric. If , then the convergence is quadratic; if , then the convergence is cubic, and so on.
An iterative method converges locally to a solution if there exists a neighborhood of such that whenever the initial approximation lies within .
This is an iterative method for real twice-continuously differentiable, or complex analytic, functions:
If is a simple zero, then the iteration converges locally and quadratically. For multiple zeros the convergence is linear, but if the multiplicity is known then quadratic convergence can be restored by multiplying the ratio in (3.8.4) by .
For real functions the sequence of approximations to a real zero will always converge (and converge quadratically) if either:
and , do not change sign between and (monotonic convergence).
, , do not change sign in the interval , and (monotonic convergence after the first iteration).
. The first positive zero of lies in the interval ; see Figure 4.15.3. From this graph we estimate an initial value . Newton’s rule is given by
Results appear in Table 3.8.1. The choice of here is critical. When or , Newton’s rule does not converge to the required zero. The convergence is faster when we use instead the function ; in addition, the successful interval for the starting value is larger.
|0||4.65000 00000 000|
|1||4.60567 66065 900|
|2||4.55140 53475 751|
|3||4.50903 76975 617|
|4||4.49455 61600 185|
|5||4.49341 56569 391|
|6||4.49340 94580 903|
|7||4.49340 94579 091|
|8||4.49340 94579 091|
If with , then the interval contains one or more zeros of . Bisection of this interval is used to decide where at least one zero is located. All zeros of in the original interval can be computed to any predetermined accuracy. Convergence is slow however; see Kaufman and Lenker (1986) and Nievergelt (1995).
Let and be such that and have opposite signs. Inverse linear interpolation (§3.3(v)) is used to obtain the first approximation:
We continue with and either or , depending which of and is of opposite sign to , and so on. The convergence is linear, and again more than one zero may occur in the original interval .
Whether or not and have opposite signs, is computed as in (3.8.6). If the wanted zero is simple, then the method converges locally with order of convergence . Because the method requires only one function evaluation per iteration, its numerical efficiency is ultimately higher than that of Newton’s method. There is no guaranteed convergence: the first approximation may be outside .
This iterative method for solving is given by
It converges locally and quadratically for both and .
For other efficient derivative-free methods, see Le (1985).
For the computation of zeros of orthogonal polynomials as eigenvalues of finite tridiagonal matrices (§3.5(vi)), see Gil et al. (2007a, pp. 205–207). For the computation of zeros of Bessel functions, Coulomb functions, and conical functions as eigenvalues of finite parts of infinite tridiagonal matrices, see Grad and Zakrajšek (1973), Ikebe (1975), Ikebe et al. (1991), Ball (2000), and Gil et al. (2007a, pp. 205–213).
has zeros in , counting each zero according to its multiplicity. Explicit formulas for the zeros are available if ; see §§1.11(iii) and 4.43. No explicit general formulas exist when .
After a zero has been computed, the factor is factored out of as a by-product of Horner’s scheme (§1.11(i)) for the computation of . In this way polynomials of successively lower degree can be used to find the remaining zeros. (This process is called deflation.) However, to guard against the accumulation of rounding errors, a final iteration for each zero should also be performed on the original polynomial .
. The zeros are and . Newton’s method is given by
The results for are given in Table 3.8.2.
|0||1.50000 00000 000|
|1||1.19907 40740 741|
|2||1.04431 68969 414|
|3||1.00274 20038 676|
|4||1.00001 12265 490|
|5||1.00000 00001 891|
|6||1.00000 00000 000|
As in the case of Table 3.8.1 the quadratic nature of convergence is clearly evident: as the zero is approached, the number of correct decimal places doubles at each iteration.
Newton’s rule can also be used for complex zeros of . However, when the coefficients are all real, complex arithmetic can be avoided by the following iterative process.
Let be an approximation to the real quadratic factor of that corresponds to a pair of conjugate complex zeros or to a pair of real zeros. We construct sequences and , , from , , and for ,
Then the next approximation to the quadratic factor is , where
The method converges locally and quadratically, except when the wanted quadratic factor is a multiple factor of . On the last iteration is the quotient on dividing by .
. With the starting values , , an approximation to the quadratic factor is computed (, ). Table 3.8.3 gives the successive values of and . The quadratic nature of the convergence is evident.
|0||1.75000 00000 000||0.50000 00000 000|
|1||2.13527 29454 109||1.21235 75284 943|
|2||2.01786 10488 956||1.02528 61401 539|
|3||2.00036 06329 466||1.00047 63067 522|
|4||2.00000 01474 803||1.00000 01858 298|
|5||2.00000 00000 000||1.00000 00000 000|
This example illustrates the fact that the method succeeds even if the two zeros of the wanted quadratic factor are real and the same.
For further information on the computation of zeros of polynomials see McNamee (2007).
Newton’s rule is the most frequently used iterative process for accurate computation of real or complex zeros of analytic functions . Another iterative method is Halley’s rule:
This is useful when satisfies a second-order linear differential equation because of the ease of computing . The rule converges locally and is cubically convergent.
Initial approximations to the zeros can often be found from asymptotic or other approximations to , or by application of the phase principle or Rouché’s theorem; see §1.10(iv). These results are also useful in ensuring that no zeros are overlooked when the complex plane is being searched.
For an example involving the Airy functions, see Fabijonas and Olver (1999).
For fixed-point methods for computing zeros of special functions, see Segura (2002), Gil and Segura (2003), and Gil et al. (2007a, Chapter 7). For describing the distribution of complex zeros of solutions of linear homogeneous second-order differential equations by methods based on the Liouville–Green (WKB) approximation, see Segura (2013).
Suppose also depends on a parameter , denoted by . Then the sensitivity of a simple zero to changes in is given by
Thus if is the polynomial (3.8.8) and is the coefficient , say, then
For moderate or large values of it is not uncommon for the magnitude of the right-hand side of (3.8.14) to be very large compared with unity, signifying that the computation of zeros of polynomials is often an ill-posed problem.
The zeros of
are well separated but extremely ill-conditioned. Consider and . We have and . The perturbation factor (3.8.14) is given by
Corresponding numerical factors in this example for other zeros and other values of are obtained in Gautschi (1984, §4).
The convergence of iterative methods
for solving fixed-point problems (3.8.2) cannot always be predicted, especially in the complex plane.
Consider, for example, (3.8.9). Starting this iteration in the neighborhood of one of the four zeros , sequences are generated that converge to these zeros. For an arbitrary starting point , convergence cannot be predicted, and the boundary of the set of points that generate a sequence converging to a particular zero has a very complicated structure. It is called a Julia set. In general the Julia set of an analytic function is a fractal, that is, a set that is self-similar. See Julia (1918) and Devaney (1986).