
§3.11 Approximation Techniques

  1. §3.11(i) Minimax Polynomial Approximations
  2. §3.11(ii) Chebyshev-Series Expansions
  3. §3.11(iii) Minimax Rational Approximations
  4. §3.11(iv) Padé Approximations
  5. §3.11(v) Least Squares Approximations
  6. §3.11(vi) Splines

§3.11(i) Minimax Polynomial Approximations

Let f(x) be continuous on a closed interval [a,b]. Then there exists a unique nth degree polynomial p_n(x), called the minimax (or best uniform) polynomial approximation to f(x) on [a,b], that minimizes max_{a≤x≤b} |ε_n(x)|, where ε_n(x) = f(x) - p_n(x).

A sufficient condition for p_n(x) to be the minimax polynomial is that |ε_n(x)| attains its maximum at n+2 distinct points in [a,b] and that ε_n(x) changes sign at these consecutive maxima.

If we have a sufficiently close approximation

3.11.1 p_n(x) = a_n x^n + a_{n-1} x^{n-1} + ⋯ + a_0

to f(x), then the coefficients a_k can be computed iteratively. Assume that f′(x) is continuous on [a,b], and let x_0 = a, x_{n+1} = b, and x_1, x_2, …, x_n be the zeros of ε_n′(x) in (a,b), arranged so that

3.11.2 x_0 < x_1 < x_2 < ⋯ < x_n < x_{n+1}.

Also, let

3.11.3 m_j = (-1)^j ε_n(x_j), j = 0, 1, …, n+1.

(Thus the m_j are approximations to m, where ±m is the maximum value of |ε_n(x)| on [a,b].)

Then (in general) a better approximation to p_n(x) is given by

3.11.4 ∑_{k=0}^{n} (a_k + δa_k) x^k,

where

3.11.5 ∑_{k=0}^{n} x_j^k δa_k = (-1)^j (m_j - m), j = 0, 1, …, n+1.

This is a set of n+2 equations for the n+2 unknowns δa_0, δa_1, …, δa_n and m.

The iterative process converges locally and quadratically (§3.8(i)).
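As a concrete illustration, the linear system (3.11.5) for the corrections δa_k and the levelled error m can be set up and solved directly. The following Python sketch (the function names and the small Gaussian-elimination solver are ours, not part of the text) performs one correction step; when f is itself a polynomial of degree n, the step recovers it exactly with m = 0.

```python
def solve_linear(A, b):
    """Solve the small dense system A x = b by Gaussian elimination
    with partial pivoting."""
    n = len(A)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def minimax_step(f, a, pts):
    """One correction step: a[k] multiplies x^k (n+1 coefficients);
    pts holds the n+2 reference points x_0 < ... < x_{n+1}.
    Row j is (3.11.5) rearranged with the unknown m moved to the left:
    sum_k x_j^k * da_k + (-1)^j * m = eps_n(x_j)."""
    n = len(a) - 1
    A = [[x ** k for k in range(n + 1)] + [(-1) ** j]
         for j, x in enumerate(pts)]
    eps = [f(x) - sum(a[k] * x ** k for k in range(n + 1)) for x in pts]
    sol = solve_linear(A, eps)
    return [a[k] + sol[k] for k in range(n + 1)], sol[n + 1]
```

For example, starting from a perturbed version of f(x) = x^2 with reference points -1, -0.5, 0.5, 1, one step returns the coefficients of x^2 itself and m = 0.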

A method for obtaining a sufficiently accurate first approximation is described in the next subsection.

For the theory of minimax approximations see Meinardus (1967). For examples of minimax polynomial approximations to elementary and special functions see Hart et al. (1968). See also Cody (1970) and Ralston (1965).

§3.11(ii) Chebyshev-Series Expansions

The Chebyshev polynomials T_n are given by

3.11.6 T_n(x) = cos(n arccos x), -1 ≤ x ≤ 1.

They satisfy the recurrence relation

3.11.7 T_{n+1}(x) - 2x T_n(x) + T_{n-1}(x) = 0, n ≥ 1,

with initial values T_0(x) = 1, T_1(x) = x. They enjoy an orthogonal property with respect to integrals:

3.11.8 ∫_{-1}^{1} T_j(x) T_k(x) / √(1-x^2) dx = π if j = k = 0; π/2 if j = k ≠ 0; 0 if j ≠ k,

as well as an orthogonal property with respect to sums, as follows. When n > 0 and 0 ≤ j ≤ n, 0 ≤ k ≤ n,

3.11.9 ∑″_{ℓ=0}^{n} T_j(x_ℓ) T_k(x_ℓ) = n if j = k = 0 or n; n/2 if j = k ≠ 0 or n; 0 if j ≠ k,

where x_ℓ = cos(ℓπ/n) and the double prime means that the first and last terms are to be halved.

For these and further properties of Chebyshev polynomials, see Chapter 18, Gil et al. (2007a, Chapter 3), and Mason and Handscomb (2003).
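Numerical values of T_n(x) are conveniently generated from the recurrence (3.11.7); a minimal Python sketch (the function name is ours):

```python
def chebyshev_T(n, x):
    """Evaluate T_n(x) by the three-term recurrence (3.11.7),
    starting from T_0(x) = 1 and T_1(x) = x."""
    if n == 0:
        return 1.0
    t_prev, t_cur = 1.0, x            # T_0(x), T_1(x)
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
    return t_cur
```

As a check, chebyshev_T(3, x) agrees with the closed form T_3(x) = 4x^3 - 3x.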

Chebyshev Expansions

If f is continuously differentiable on [-1,1], then with

3.11.10 c_n = (2/π) ∫_0^π f(cos θ) cos(nθ) dθ, n = 0, 1, 2, …,

the expansion

3.11.11 f(x) = ∑′_{n=0}^{∞} c_n T_n(x), -1 ≤ x ≤ 1,

converges uniformly. Here the single prime on the summation symbol means that the first term is to be halved. In fact, (3.11.11) is the Fourier-series expansion of f(cosθ); compare (3.11.6) and §1.8(i).

Furthermore, if f ∈ C^∞[-1,1], then the convergence of (3.11.11) is usually very rapid; compare (1.8.7) with k arbitrary.

For general intervals [a,b] we rescale:

3.11.12 f(x) = ∑′_{n=0}^{∞} d_n T_n((2x - a - b)/(b - a)), a ≤ x ≤ b.

Because the series (3.11.12) converges rapidly we obtain a very good first approximation to the minimax polynomial p_n(x) for [a,b] if we truncate (3.11.12) at its (n+1)th term. This is because in the notation of §3.11(i)

3.11.13 ε_n(x) = d_{n+1} T_{n+1}((2x - a - b)/(b - a)),

approximately, and the right-hand side enjoys exactly those properties concerning its maxima and minima that are required for the minimax approximation; compare Figure 18.4.3.

More precisely, it is known that for the interval [a,b], the ratio of the maximum value of the remainder

3.11.14 |∑_{k=n+1}^{∞} d_k T_k((2x - a - b)/(b - a))|

to the maximum error of the minimax polynomial p_n(x) is bounded by 1 + L_n, where L_n is the nth Lebesgue constant for Fourier series; see §1.8(i). Since L_0 = 1, L_n is a monotonically increasing function of n, and (for example) L_1000 = 4.07, this means that in practice the gain in replacing a truncated Chebyshev-series expansion by the corresponding minimax polynomial approximation is hardly worthwhile. Moreover, the set of minimax approximations p_0(x), p_1(x), p_2(x), …, p_n(x) requires the calculation and storage of (n+1)(n+2)/2 coefficients, whereas the corresponding set of Chebyshev-series approximations requires only n+1 coefficients.

Calculation of Chebyshev Coefficients

The c_n in (3.11.11) can be calculated from (3.11.10), but in general it is more efficient to make use of the orthogonal property (3.11.9). Also, in cases where f(x) satisfies a linear ordinary differential equation with polynomial coefficients, the expansion (3.11.11) can be substituted in the differential equation to yield a recurrence relation satisfied by the c_n.
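As a sketch of the orthogonality-based computation, the discrete relation (3.11.9) gives the approximation c_k ≈ (2/n) ∑″_{ℓ=0}^{n} f(cos(ℓπ/n)) cos(kℓπ/n). A minimal Python version (the function name and the default n = 64 are ours):

```python
from math import cos, pi

def chebyshev_coeffs(f, num_coeffs, n=64):
    """Approximate the c_k of (3.11.11) via the discrete orthogonality
    (3.11.9): c_k ~ (2/n) * sum''_{l=0..n} f(cos(l*pi/n)) * cos(k*l*pi/n),
    the double prime meaning the first and last terms are halved."""
    coeffs = []
    for k in range(num_coeffs):
        total = 0.0
        for l in range(n + 1):
            weight = 0.5 if l in (0, n) else 1.0
            total += weight * f(cos(l * pi / n)) * cos(k * l * pi / n)
        coeffs.append(2.0 * total / n)
    return coeffs
```

For f(x) = x^2 = (1/2)T_0(x) + (1/2)T_2(x) the result is c_0 = 1, c_1 = 0, c_2 = 1/2 (recall the primed sum in (3.11.11) halves c_0).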

For details and examples of these methods, see Clenshaw (1957, 1962) and Miller (1966). See also Mason and Handscomb (2003, Chapter 10) and Fox and Parker (1968, Chapter 5).

Summation of Chebyshev Series: Clenshaw’s Algorithm

For the expansion (3.11.11), numerical values of the Chebyshev polynomials T_n(x) can be generated by application of the recurrence relation (3.11.7). A more efficient procedure is as follows. Let c_n T_n(x) be the last term retained in the truncated series. Beginning with u_{n+1} = 0, u_n = c_n, we apply

3.11.15 u_k = 2x u_{k+1} - u_{k+2} + c_k, k = n-1, n-2, …, 0.

Then the sum of the truncated expansion equals (u_0 - u_2)/2. For error analysis and modifications of Clenshaw’s algorithm, see Oliver (1977).
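A minimal Python sketch of Clenshaw’s algorithm as stated in (3.11.15) (the function name is ours):

```python
def clenshaw(c, x):
    """Sum the truncated expansion sum'_{k=0}^{n} c_k T_k(x) (first term
    halved) by the backward recurrence (3.11.15)."""
    n = len(c) - 1
    # u_kp1 starts as u_{n+1} = 0, u_k as u_n = c_n; u_kp2 is a dummy
    u_kp2, u_kp1, u_k = 0.0, 0.0, c[n]
    for k in range(n - 1, -1, -1):
        u_kp2, u_kp1, u_k = u_kp1, u_k, 2.0 * x * u_k - u_kp1 + c[k]
    return 0.5 * (u_k - u_kp2)
```

As a check, for c = [1, 1/2, 1/4] the sum is (1/2)T_0(x) + (1/2)T_1(x) + (1/4)T_2(x), which at x = 0.3 equals 0.445.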

Complex Variables

If x is replaced by a complex variable z and f(z) is analytic, then the expansion (3.11.11) converges within an ellipse. However, in general (3.11.11) affords no advantage for numerical purposes compared with the Maclaurin expansion of f(z).

For further details on Chebyshev-series expansions in the complex plane, see Mason and Handscomb (2003, §5.10).

§3.11(iii) Minimax Rational Approximations

Let f be continuous on a closed interval [a,b] and w be a continuous nonvanishing function on [a,b]: w is called a weight function. Then the minimax (or best uniform) rational approximation

3.11.16 R_{k,ℓ}(x) = (p_0 + p_1 x + ⋯ + p_k x^k) / (1 + q_1 x + ⋯ + q_ℓ x^ℓ)

of type [k, ℓ] to f on [a,b] minimizes the maximum value of |ε_{k,ℓ}(x)| on [a,b], where

3.11.17 ε_{k,ℓ}(x) = (R_{k,ℓ}(x) - f(x)) / w(x).

The theory of polynomial minimax approximation given in §3.11(i) can be extended to the case when p_n(x) is replaced by a rational function R_{k,ℓ}(x). There exists a unique solution of this minimax problem and there are at least k+ℓ+2 values x_j, a ≤ x_0 < x_1 < ⋯ < x_{k+ℓ+1} ≤ b, such that m_j = m, where

3.11.18 m_j = (-1)^j ε_{k,ℓ}(x_j), j = 0, 1, …, k+ℓ+1,

and ±m is the maximum of |ε_{k,ℓ}(x)| on [a,b].

A collection of minimax rational approximations to elementary and special functions can be found in Hart et al. (1968).

A widely implemented and used algorithm for calculating the coefficients pj and qj in (3.11.16) is Remez’s second algorithm. See Remez (1957), Werner et al. (1967), and Johnson and Blair (1973).


With w(x) = 1 and 14-digit computation, we obtain the following rational approximation of type [3, 3] to the Bessel function J_0(x) (§10.2(ii)) on the interval 0 ≤ x ≤ j_{0,1}, where j_{0,1} is the first positive zero of J_0(x):

3.11.19 R_{3,3}(x) = (p_0 + p_1 x + p_2 x^2 + p_3 x^3) / (1 + q_1 x + q_2 x^2 + q_3 x^3),

with coefficients given in Table 3.11.1.

Table 3.11.1: Coefficients p_j, q_j for the minimax rational approximation R_{3,3}(x).

j    p_j                   q_j
0    0.99999 99891 7854
1    0.34038 93820 9347    0.34039 05233 8838
2    0.18915 48376 3222    0.06086 50162 9812
3    0.06658 31942 0166    0.01864 47680 9090

The error curve is shown in Figure 3.11.1.

Figure 3.11.1: Error R_{3,3}(x) - J_0(x) of the minimax rational approximation R_{3,3}(x) to the Bessel function J_0(x) for 0 ≤ x ≤ j_{0,1} (= 0.89357).

§3.11(iv) Padé Approximations


Let

3.11.20 f(z) = c_0 + c_1 z + c_2 z^2 + ⋯

be a formal power series. The rational function

3.11.21 N_{p,q}(z) / D_{p,q}(z) = (a_0 + a_1 z + ⋯ + a_p z^p) / (b_0 + b_1 z + ⋯ + b_q z^q)

is called a Padé approximant at zero of f if

3.11.22 N_{p,q}(z) - f(z) D_{p,q}(z) = O(z^{p+q+1}), z → 0.

It is denoted by [p/q]_f(z). Thus if b_0 ≠ 0, then the Maclaurin expansion of (3.11.21) agrees with (3.11.20) up to, and including, the term in z^{p+q}.

The requirement (3.11.22) implies

3.11.23
a_0 = c_0 b_0,
a_1 = c_1 b_0 + c_0 b_1,
⋮
a_p = c_p b_0 + c_{p-1} b_1 + ⋯ + c_{p-q} b_q,
0 = c_{p+1} b_0 + c_p b_1 + ⋯ + c_{p-q+1} b_q,
⋮
0 = c_{p+q} b_0 + c_{p+q-1} b_1 + ⋯ + c_p b_q,

where c_j = 0 if j < 0. With b_0 = 1, the last q equations give b_1, …, b_q as the solution of a system of linear equations. The first p+1 equations then yield a_0, …, a_p.
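This construction can be sketched in Python, using exact rational arithmetic for clarity (the function names and the small exact solver are ours):

```python
from fractions import Fraction

def gauss_solve(A, b):
    """Exact Gaussian elimination for a small square system of Fractions."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [Fraction(0)] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def pade(c, p, q):
    """[p/q] Pade coefficients from (3.11.23) with b_0 = 1: the last q
    equations give b_1..b_q, the first p+1 then yield a_0..a_p.
    c must supply the series coefficients c_0..c_{p+q} as Fractions."""
    cc = lambda j: c[j] if j >= 0 else Fraction(0)
    A = [[cc(p + j - i) for i in range(1, q + 1)] for j in range(1, q + 1)]
    rhs = [-cc(p + j) for j in range(1, q + 1)]
    b = [Fraction(1)] + gauss_solve(A, rhs)
    a = [sum(cc(k - i) * b[i] for i in range(q + 1)) for k in range(p + 1)]
    return a, b
```

For example, with c_k = 1/k! and p = q = 2 this reproduces the classical [2/2] approximant of e^z, namely (1 + z/2 + z^2/12)/(1 - z/2 + z^2/12).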

The array of Padé approximants

3.11.24
[0/0]_f   [0/1]_f   [0/2]_f   ⋯
[1/0]_f   [1/1]_f   [1/2]_f   ⋯
[2/0]_f   [2/1]_f   [2/2]_f   ⋯
⋮         ⋮         ⋮

is called a Padé table. Approximants with the same denominator degree are located in the same column of the table.

For convergence results for Padé approximants, and the connection with continued fractions and Gaussian quadrature, see Baker and Graves-Morris (1996, §4.7).

The Padé approximants can be computed by Wynn’s cross rule. Any five approximants arranged in the Padé table as

          N
     W    C    E
          S

satisfy

3.11.25 (N - C)^{-1} + (S - C)^{-1} = (W - C)^{-1} + (E - C)^{-1}.

Starting with the first column [n/0]_f, n = 0, 1, 2, …, and initializing the preceding column by [n/-1]_f = ∞, n = 1, 2, …, we can compute the lower triangular part of the table via (3.11.25). Similarly, the upper triangular part follows from the first row [0/n]_f, n = 0, 1, 2, …, by initializing [-1/n]_f = 0, n = 1, 2, ….

For the recursive computation of [n+k/k]_f by Wynn’s epsilon algorithm, see (3.9.11) and the subsequent text.

Laplace Transform Inversion

Numerical inversion of the Laplace transform (§1.14(iii))

3.11.26 F(s) = (ℒf)(s) = ∫_0^∞ e^{-st} f(t) dt

requires f = ℒ^{-1}F to be obtained from numerical values of F. A general procedure is to approximate F by a rational function R (vanishing at infinity) and then approximate f by r = ℒ^{-1}R. When F has an explicit power-series expansion a possible choice of R is a Padé approximation to F. See Luke (1969b, §16.4) for several examples involving special functions.

For further information on Padé approximations, see Baker and Graves-Morris (1996, §4.7), Brezinski (1980, pp. 9–39 and 126–177), and Lorentzen and Waadeland (1992, pp. 367–395).

§3.11(v) Least Squares Approximations

Suppose a function f(x) is approximated by the polynomial

3.11.27 p_n(x) = a_n x^n + a_{n-1} x^{n-1} + ⋯ + a_0

that minimizes

3.11.28 S = ∑_{j=1}^{J} (f(x_j) - p_n(x_j))^2.

Here x_j, j = 1, 2, …, J, is a given set of distinct real points and J ≥ n+1. From the equations ∂S/∂a_k = 0, k = 0, 1, …, n, we derive the normal equations

3.11.29
[ X_0    X_1      ⋯  X_n     ] [ a_0 ]   [ F_0 ]
[ X_1    X_2      ⋯  X_{n+1} ] [ a_1 ] = [ F_1 ]
[ ⋮      ⋮           ⋮       ] [ ⋮   ]   [ ⋮   ]
[ X_n    X_{n+1}  ⋯  X_{2n}  ] [ a_n ]   [ F_n ]

where

3.11.30 X_k = ∑_{j=1}^{J} x_j^k,  F_k = ∑_{j=1}^{J} f(x_j) x_j^k.

(3.11.29) is a system of n+1 linear equations for the coefficients a_0, a_1, …, a_n. The matrix is symmetric and positive definite, but the system is ill-conditioned when n is large because the lower rows of the matrix are approximately proportional to one another. If J = n+1, then p_n(x) is the Lagrange interpolation polynomial for the set x_1, x_2, …, x_J (§3.3(i)).
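A minimal Python sketch of forming and solving the normal equations (3.11.29)-(3.11.30) for small n (the function names and the Gaussian-elimination helper are ours):

```python
def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for small systems."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def least_squares_poly(xs, ys, n):
    """Build and solve the normal equations (3.11.29)-(3.11.30) for the
    degree-n least squares polynomial; returns [a_0, ..., a_n].
    Fine for small n, but ill-conditioned for large n, as noted above."""
    X = [sum(x ** k for x in xs) for k in range(2 * n + 1)]
    F = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(n + 1)]
    A = [[X[row + col] for col in range(n + 1)] for row in range(n + 1)]
    return solve_linear(A, F)
```

When the data lie exactly on a polynomial of degree n, the least squares fit recovers its coefficients.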

More generally, let f(x) be approximated by a linear combination

3.11.31 Φ_n(x) = a_n ϕ_n(x) + a_{n-1} ϕ_{n-1}(x) + ⋯ + a_0 ϕ_0(x)

of given functions ϕ_k(x), k = 0, 1, …, n, that minimizes

3.11.32 ∑_{j=1}^{J} w(x_j) (f(x_j) - Φ_n(x_j))^2,

w(x) being a given positive weight function, and again J ≥ n+1. Then (3.11.29) is replaced by

3.11.33
[ X_{00}  X_{01}  ⋯  X_{0n} ] [ a_0 ]   [ F_0 ]
[ X_{10}  X_{11}  ⋯  X_{1n} ] [ a_1 ] = [ F_1 ]
[ ⋮       ⋮          ⋮      ] [ ⋮   ]   [ ⋮   ]
[ X_{n0}  X_{n1}  ⋯  X_{nn} ] [ a_n ]   [ F_n ]

where

3.11.34 X_{kℓ} = ∑_{j=1}^{J} w(x_j) ϕ_k(x_j) ϕ_ℓ(x_j),

and

3.11.35 F_k = ∑_{j=1}^{J} w(x_j) f(x_j) ϕ_k(x_j).

Since X_{kℓ} = X_{ℓk}, the matrix is again symmetric.

If the functions ϕ_k(x) are linearly independent on the set x_1, x_2, …, x_J, that is, the only solution of the system of equations

3.11.36 ∑_{k=0}^{n} c_k ϕ_k(x_j) = 0, j = 1, 2, …, J,

is c_0 = c_1 = ⋯ = c_n = 0, then the approximation Φ_n(x) is determined uniquely.

Now suppose that X_{kℓ} = 0 when k ≠ ℓ, that is, the functions ϕ_k(x) are orthogonal with respect to weighted summation on the discrete set x_1, x_2, …, x_J. Then the system (3.11.33) is diagonal and hence well-conditioned.

A set of functions ϕ_0(x), ϕ_1(x), …, ϕ_n(x) that is linearly independent on the set x_1, x_2, …, x_J (compare (3.11.36)) can always be orthogonalized in the sense given in the preceding paragraph by the Gram–Schmidt procedure; see Gautschi (1997a).

Example. The Discrete Fourier Transform

We take n complex exponentials ϕ_k(x) = e^{ikx}, k = 0, 1, …, n-1, and approximate f(x) by the linear combination (3.11.31). The functions ϕ_k(x) are orthogonal on the set x_0, x_1, …, x_{n-1}, x_j = 2πj/n, with respect to the weight function w(x) = 1, in the sense that

3.11.37 ∑_{j=0}^{n-1} ϕ_k(x_j) \overline{ϕ_ℓ(x_j)} = n δ_{k,ℓ},

δ_{k,ℓ} being Kronecker’s symbol and the bar denoting complex conjugate. In consequence we can solve the system

3.11.38 f_j = ∑_{k=0}^{n-1} a_k ϕ_k(x_j), j = 0, 1, …, n-1,

and obtain

3.11.39 a_k = (1/n) ∑_{j=0}^{n-1} f_j \overline{ϕ_k(x_j)}, k = 0, 1, …, n-1.

With this choice of a_k and f_j = f(x_j), the corresponding sum (3.11.32) vanishes.

The pair of vectors {𝐟,𝐚}

3.11.40 𝐟 = [f_0, f_1, …, f_{n-1}]^T,  𝐚 = [a_0, a_1, …, a_{n-1}]^T,

is called a discrete Fourier transform pair.
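The pair (3.11.38)-(3.11.39) can be sketched directly in Python (the function names are ours):

```python
import cmath

def dft_coefficients(f_vals):
    """a_k = (1/n) * sum_j f_j * conj(phi_k(x_j)) with phi_k(x) = e^{ikx}
    and x_j = 2*pi*j/n; this is (3.11.39)."""
    n = len(f_vals)
    return [sum(f_vals[j] * cmath.exp(-2j * cmath.pi * k * j / n)
                for j in range(n)) / n
            for k in range(n)]

def dft_reconstruct(a_vals):
    """f_j = sum_k a_k * phi_k(x_j); this is (3.11.38)."""
    n = len(a_vals)
    return [sum(a_vals[k] * cmath.exp(2j * cmath.pi * k * j / n)
                for k in range(n))
            for j in range(n)]
```

Applying (3.11.39) and then (3.11.38) reproduces the original values f_j up to rounding.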

The Fast Fourier Transform

The direct computation of the discrete Fourier transform (3.11.38), that is, of

3.11.41 f_j = ∑_{k=0}^{n-1} a_k ω_n^{jk},  ω_n = e^{2πi/n},

requires approximately n^2 multiplications. The method of the fast Fourier transform (FFT) exploits the structure of the matrix Ω with elements ω_n^{jk}, j, k = 0, 1, …, n-1. If n = 2^m, then Ω can be factored into m matrices, the rows of which contain only a few nonzero entries, and the nonzero entries are equal apart from signs. In consequence of this structure the number of operations can be reduced to nm = n log_2 n operations.

The property

3.11.42 ω_n^{2(k-(n/2))} = ω_{n/2}^{k}

is of fundamental importance in the FFT algorithm. If n is not a power of 2, then modifications are possible. For the original reference see Cooley and Tukey (1965). For further details and algorithms, see Van Loan (1992).
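A minimal recursive radix-2 sketch of the FFT in Python (the function name is ours), splitting (3.11.41) into even- and odd-indexed parts; the split relies on the periodicity of the roots of unity (compare (3.11.42)):

```python
import cmath

def fft(a):
    """Evaluate (3.11.41), f_j = sum_k a_k * w_n^{jk} with w_n = e^{2*pi*i/n},
    by the radix-2 Cooley-Tukey splitting; len(a) must be a power of 2.
    Squaring w_n gives w_{n/2}, so each half is itself a DFT of size n/2."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    even = fft(a[0::2])
    odd = fft(a[1::2])
    out = [0j] * n
    for j in range(n // 2):
        t = cmath.exp(2j * cmath.pi * j / n) * odd[j]
        out[j] = even[j] + t
        out[j + n // 2] = even[j] - t   # w_n^{j + n/2} = -w_n^{j}
    return out
```

The result agrees with the direct O(n^2) evaluation of (3.11.41).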

For further information on least squares approximations, including examples, see Gautschi (1997a, Chapter 2) and Björck (1996, Chapters 1 and 2).

§3.11(vi) Splines

Splines are defined piecewise and usually by low-degree polynomials. Given n+1 distinct points x_k in the real interval [a,b], with (a =) x_0 < x_1 < ⋯ < x_{n-1} < x_n (= b), on each subinterval [x_k, x_{k+1}], k = 0, 1, …, n-1, a low-degree polynomial is defined with coefficients determined by, for example, the values f_k and f′_k of a function f and its derivative at the nodes x_k and x_{k+1}. The set of all these polynomials defines a function, the spline, on [a,b]. By taking more derivatives into account, the smoothness of the spline increases.
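As an illustration of this construction, a piecewise cubic determined by the values f_k and derivatives f′_k at consecutive nodes can be sketched in Python (the function name and the Hermite-basis formulation are ours):

```python
def hermite_spline(nodes, f_vals, df_vals):
    """Return an evaluator for the piecewise cubic matching the values
    f_k and derivatives f'_k at consecutive nodes, one cubic per
    subinterval [x_k, x_{k+1}], as in the construction described above."""
    def evaluate(x):
        # locate the subinterval [x_k, x_{k+1}] containing x
        k = 0
        while k < len(nodes) - 2 and x > nodes[k + 1]:
            k += 1
        h = nodes[k + 1] - nodes[k]
        t = (x - nodes[k]) / h
        # standard cubic Hermite basis functions on [0, 1]
        h00 = (1.0 + 2.0 * t) * (1.0 - t) ** 2
        h10 = t * (1.0 - t) ** 2
        h01 = t * t * (3.0 - 2.0 * t)
        h11 = t * t * (t - 1.0)
        return (h00 * f_vals[k] + h * h10 * df_vals[k]
                + h01 * f_vals[k + 1] + h * h11 * df_vals[k + 1])
    return evaluate
```

Since each piece is a cubic matched to values and first derivatives, the spline reproduces any cubic exactly.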

For splines based on Bernoulli and Euler polynomials, see §24.17(ii).

For many applications a spline function is a more adaptable approximating tool than the Lagrange interpolation polynomial involving a comparable number of parameters; see §3.3(i), where a single polynomial is used for interpolating f(x) on the complete interval [a,b]. Multivariate functions can also be approximated in terms of multivariate polynomial splines. See de Boor (2001), Chui (1988), and Schumaker (1981) for further information.

In computer graphics a special type of spline, the Bézier curve, is used. A cubic Bézier curve is defined by four points. Two are endpoints, (x_0, y_0) and (x_3, y_3); the other points, (x_1, y_1) and (x_2, y_2), are control points. The tangent to the curve at (x_0, y_0) lies along the line through (x_0, y_0) and (x_1, y_1); similarly, the tangent at (x_3, y_3) lies along the line through (x_2, y_2) and (x_3, y_3). The curve is described by x(t) and y(t), which are cubic polynomials in t ∈ [0,1]. A complete spline results by composing several Bézier curves. A special applications area of Bézier curves is mathematical typography and the design of type fonts. See Knuth (1986, pp. 116–136).
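A cubic Bézier curve as described above can be evaluated from its four defining points; a minimal Python sketch (the function name is ours) using the Bernstein form of the cubic:

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Point (x(t), y(t)) on the cubic Bezier curve with endpoints p0, p3
    and control points p1, p2, each an (x, y) pair, for t in [0, 1]."""
    s = 1.0 - t
    def coord(c0, c1, c2, c3):
        # Bernstein form of the cubic polynomial
        return s ** 3 * c0 + 3.0 * s ** 2 * t * c1 + 3.0 * s * t ** 2 * c2 + t ** 3 * c3
    return (coord(p0[0], p1[0], p2[0], p3[0]),
            coord(p0[1], p1[1], p2[1], p3[1]))
```

At t = 0 and t = 1 the curve passes through the endpoints p0 and p3, as the definition requires.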